fr.paris.lutece.util.mail
Class HtmlDocument

java.lang.Object
  extended by fr.paris.lutece.util.mail.HtmlDocument

public class HtmlDocument
extends java.lang.Object

This class provides an implementation to retrieve urls from specified tags in an HTML page.


Field Summary
static fr.paris.lutece.util.mail.HtmlDocument.ElementUrl ELEMENT_CSS
          To define a CSS, the html element must have: a "link" tag name and a "rel" attribute equal to "stylesheet". The url is contained in the attribute named "href".
static fr.paris.lutece.util.mail.HtmlDocument.ElementUrl ELEMENT_IMG
          To define an image, the html element must have: an "img" tag name. The url is contained in the attribute named "src".
static fr.paris.lutece.util.mail.HtmlDocument.ElementUrl ELEMENT_JAVASCRIPT
          To define a javascript, the html element must have: a "script" tag name and a "type" attribute equal to "text/javascript". The url is contained in the attribute named "src".
 
Constructor Summary
HtmlDocument(java.lang.String strHtml, java.lang.String strBaseUrl, boolean useAbsoluteUrl)
          Instantiates an HtmlDocument after having built the DOM tree.
 
Method Summary
 java.util.Map<java.lang.String,java.net.URL> getAllUrls(fr.paris.lutece.util.mail.HtmlDocument.ElementUrl elementType)
          Get the urls of all html elements specified by elementType
 java.util.List<UrlAttachment> getAllUrlsAttachement(fr.paris.lutece.util.mail.HtmlDocument.ElementUrl elementType)
          Get the urls of all html elements specified by elementType
protected  java.lang.Object getUrlContent(java.net.URL url)
          Loads the url in a DataHandler
protected  java.lang.String getUrlName(java.net.URL url)
          Returns the absolute or relative url, depending on _useAbsoluteUrl
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

ELEMENT_CSS

public static final fr.paris.lutece.util.mail.HtmlDocument.ElementUrl ELEMENT_CSS
To define a CSS, the html element must have: a "link" tag name and a "rel" attribute equal to "stylesheet". The url is contained in the attribute named "href".


ELEMENT_JAVASCRIPT

public static final fr.paris.lutece.util.mail.HtmlDocument.ElementUrl ELEMENT_JAVASCRIPT
To define a javascript, the html element must have: a "script" tag name and a "type" attribute equal to "text/javascript". The url is contained in the attribute named "src".


ELEMENT_IMG

public static final fr.paris.lutece.util.mail.HtmlDocument.ElementUrl ELEMENT_IMG
To define an image, the html element must have: an "img" tag name. The url is contained in the attribute named "src".
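
The three element definitions above each pair a tag name with an optional attribute test and the attribute that carries the url. The following is a minimal sketch of that idea, not the actual HtmlDocument implementation: ElementRule, its field names and rawUrls are hypothetical stand-ins for the undocumented ElementUrl type, and the JDK XML parser used here only accepts well-formed XHTML (the parser used by HtmlDocument is not stated on this page).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ElementRuleSketch {

    /** Hypothetical stand-in for HtmlDocument.ElementUrl (its members are not documented). */
    static final class ElementRule {
        final String tagName;          // tag to look for, e.g. "link"
        final String urlAttribute;     // attribute holding the url, e.g. "href"
        final String testedAttribute;  // attribute that must match, or null for no test
        final String testedValue;      // required value of testedAttribute

        ElementRule(String tagName, String urlAttribute, String testedAttribute, String testedValue) {
            this.tagName = tagName;
            this.urlAttribute = urlAttribute;
            this.testedAttribute = testedAttribute;
            this.testedValue = testedValue;
        }

        boolean matches(Element element) {
            return testedAttribute == null
                    || testedValue.equals(element.getAttribute(testedAttribute));
        }
    }

    // Rules mirroring the three field descriptions on this page.
    static final ElementRule CSS = new ElementRule("link", "href", "rel", "stylesheet");
    static final ElementRule JAVASCRIPT = new ElementRule("script", "src", "type", "text/javascript");
    static final ElementRule IMG = new ElementRule("img", "src", null, null);

    /** Returns the raw (unresolved) url attribute values of all elements matching the rule. */
    static List<String> rawUrls(String xhtml, ElementRule rule) {
        Document dom;
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            dom = builder.parse(new InputSource(new StringReader(xhtml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XHTML", e);
        }
        NodeList nodes = dom.getElementsByTagName(rule.tagName);
        List<String> urls = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element element = (Element) nodes.item(i);
            // Skip elements that fail the attribute test or carry no url attribute.
            if (rule.matches(element) && element.hasAttribute(rule.urlAttribute)) {
                urls.add(element.getAttribute(rule.urlAttribute));
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        String page = "<html><head>"
                + "<link rel='stylesheet' href='css/site.css'/>"
                + "<link rel='icon' href='favicon.ico'/>"
                + "</head><body><img src='images/logo.png'/></body></html>";
        System.out.println(rawUrls(page, CSS));
        System.out.println(rawUrls(page, IMG));
    }
}
```

In this sketch the "icon" link is ignored because its "rel" attribute is not "stylesheet", which is the same filtering the ELEMENT_CSS description implies.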

Constructor Detail

HtmlDocument

public HtmlDocument(java.lang.String strHtml,
                    java.lang.String strBaseUrl,
                    boolean useAbsoluteUrl)
Instantiates an HtmlDocument after having built the DOM tree.

Parameters:
strHtml - The HTML code to be parsed.
strBaseUrl - The base url used to retrieve urls.
useAbsoluteUrl - Determines whether absolute or relative urls are used for the names of HTML elements
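
How HtmlDocument applies strBaseUrl is not described beyond "used to retrieve urls". As a hedged illustration of what resolving a page's relative references against a base url typically involves, the sketch below uses the standard java.net.URI.resolve method; the class name BaseUrlSketch and the example.com addresses are assumptions made for the example.

```java
import java.net.URI;

final class BaseUrlSketch {

    /** Resolves a reference found in the page against the base url (RFC 2396 rules). */
    static String resolve(String baseUrl, String reference) {
        return URI.create(baseUrl).resolve(reference).toString();
    }

    public static void main(String[] args) {
        // Relative reference: appended to the base "directory".
        System.out.println(resolve("http://example.com/app/", "images/logo.png"));
        // prints http://example.com/app/images/logo.png

        // Reference starting with "/": replaces the whole base path.
        System.out.println(resolve("http://example.com/app/", "/css/site.css"));
        // prints http://example.com/css/site.css

        // Already absolute reference: returned unchanged.
        System.out.println(resolve("http://example.com/app/", "http://cdn.example.org/lib.js"));
        // prints http://cdn.example.org/lib.js
    }
}
```

Note that with this resolution rule a base url without a trailing slash, such as "http://example.com/app", loses its last path segment, so "images/logo.png" would resolve to "http://example.com/images/logo.png".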

Method Detail

getAllUrls

public java.util.Map<java.lang.String,java.net.URL> getAllUrls(fr.paris.lutece.util.mail.HtmlDocument.ElementUrl elementType)
Get the urls of all html elements specified by elementType

Parameters:
elementType - the type of element to get
Returns:
a Map containing the urls, as java.net.URL values keyed by String.

getAllUrlsAttachement

public java.util.List<UrlAttachment> getAllUrlsAttachement(fr.paris.lutece.util.mail.HtmlDocument.ElementUrl elementType)
Get the urls of all html elements specified by elementType

Parameters:
elementType - the type of element to get
Returns:
a List of UrlAttachment objects, one for each url found.

getUrlContent

protected java.lang.Object getUrlContent(java.net.URL url)
Loads the url in a DataHandler

Parameters:
url - an absolute url
Returns:
an Object containing the DataHandler

getUrlName

protected java.lang.String getUrlName(java.net.URL url)
Returns the absolute or relative url, depending on _useAbsoluteUrl

Parameters:
url - an absolute url
Returns:
a String representing the url
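
The page does not say how the relative form is computed. The sketch below shows one plausible reading, under the assumption that "relative" means relative to the base url passed to the constructor; UrlNameSketch, urlName and toUrl are hypothetical names, and java.net.URI.relativize is used as a stand-in for whatever HtmlDocument does internally.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

final class UrlNameSketch {

    private final URI baseUri;
    private final boolean useAbsoluteUrl;

    UrlNameSketch(String baseUrl, boolean useAbsoluteUrl) {
        this.baseUri = URI.create(baseUrl);
        this.useAbsoluteUrl = useAbsoluteUrl;
    }

    /** Returns the absolute form of the url, or its form relative to the base url. */
    String urlName(URL url) {
        if (useAbsoluteUrl) {
            return url.toExternalForm();
        }
        // relativize returns the url unchanged when it is not located under the base.
        return baseUri.relativize(URI.create(url.toExternalForm())).toString();
    }

    /** Small helper that builds a URL without a checked exception. */
    static URL toUrl(String spec) {
        try {
            return URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a valid url: " + spec, e);
        }
    }

    public static void main(String[] args) {
        URL logo = toUrl("http://example.com/app/images/logo.png");
        System.out.println(new UrlNameSketch("http://example.com/app/", true).urlName(logo));
        System.out.println(new UrlNameSketch("http://example.com/app/", false).urlName(logo));
    }
}
```

With the base url "http://example.com/app/", the first call keeps the full address and the second yields "images/logo.png"; a url on another host is returned unchanged in both modes.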


Copyright © 2012 Mairie de Paris. All Rights Reserved.