Class CmsHtmlImportConverter


  • public class CmsHtmlImportConverter
    extends java.lang.Object
    This class implements HTML conversion routines, based on Tidy, that modify the HTML code of imported HTML pages.

    Since:
    6.0.0
    • Constructor Summary

      Constructors 
      Constructor Description
      CmsHtmlImportConverter(CmsHtmlImport htmlImport, boolean xmlMode)
      Creates a new CmsHtmlImportConverter.
    • Method Summary

      Modifier and Type Method Description
      void convertHTML(java.io.Reader input, java.io.Writer output, java.lang.String startPattern, java.lang.String endPattern, java.util.Hashtable properties)
      Transforms HTML code into user-defined output.
      java.lang.String convertHTML(java.lang.String filename, java.lang.String inString, java.lang.String startPattern, java.lang.String endPattern, java.util.Hashtable properties)
      Transforms HTML code into user-defined output.
      static java.lang.String extractHtml(java.lang.String content, java.lang.String startpoint, java.lang.String endpoint)
      Extracts the content of an HTML page.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • CmsHtmlImportConverter

        public CmsHtmlImportConverter(CmsHtmlImport htmlImport,
                                      boolean xmlMode)
        Creates a new CmsHtmlImportConverter.

        Parameters:
        htmlImport - reference to the CmsHtmlImport instance
        xmlMode - switch that sets the import to HTML or XML mode
    • Method Detail

      • extractHtml

        public static java.lang.String extractHtml(java.lang.String content,
                                                   java.lang.String startpoint,
                                                   java.lang.String endpoint)
        Extracts the content of an HTML page.

        This method is intended to be robust and to work even if the input HTML does not contain the specified matchers.

        Parameters:
        content - the content to extract the body from
        startpoint - the point where matching starts
        endpoint - the point where matching ends
        Returns:
        the extracted body tag content
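
        The robustness described above can be illustrated with a minimal, self-contained sketch. It is not the OpenCms implementation: the class name ExtractSketch and the method extractBetween are hypothetical, and the sketch assumes that the start and end points are regular expressions matched case-insensitively, and that a missing matcher falls back to the beginning or end of the content.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; ExtractSketch and extractBetween are illustrative names,
// not part of the OpenCms API.
public class ExtractSketch {

    // Returns the text between the first match of startRegex and the next
    // match of endRegex. Assumption: if a pattern does not match, the
    // corresponding boundary defaults to the start or end of the content.
    public static String extractBetween(String content, String startRegex, String endRegex) {
        int start = 0;
        int end = content.length();

        Matcher startMatcher = Pattern.compile(startRegex, Pattern.CASE_INSENSITIVE).matcher(content);
        if (startMatcher.find()) {
            // Extraction begins right after the start marker.
            start = startMatcher.end();
        }

        Matcher endMatcher = Pattern.compile(endRegex, Pattern.CASE_INSENSITIVE).matcher(content);
        if (endMatcher.find(start)) {
            // Extraction stops right before the end marker.
            end = endMatcher.start();
        }

        return content.substring(start, end);
    }

    public static void main(String[] args) {
        String page = "<html><BODY class=\"x\"><p>Hello</p></body></html>";
        System.out.println(extractBetween(page, "<body[^>]*>", "</body>"));
        // Input without the markers is returned unchanged.
        System.out.println(extractBetween("<p>no body tags</p>", "<body[^>]*>", "</body>"));
    }
}
```

        Searching for the end marker only from the start offset onward keeps the substring bounds ordered, so the call cannot fail with an index error when the markers are absent or appear in an unexpected order.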
      • convertHTML

        public void convertHTML(java.io.Reader input,
                                java.io.Writer output,
                                java.lang.String startPattern,
                                java.lang.String endPattern,
                                java.util.Hashtable properties)
        Transforms HTML code into user-defined output.

        Parameters:
        input - Reader that supplies the HTML code
        output - Writer that receives the transformed code
        startPattern - the start pattern definition for content extraction
        endPattern - the end pattern definition for content extraction
        properties - the file properties
      • convertHTML

        public java.lang.String convertHTML(java.lang.String filename,
                                            java.lang.String inString,
                                            java.lang.String startPattern,
                                            java.lang.String endPattern,
                                            java.util.Hashtable properties)
        Transforms HTML code into user-defined output.

        Parameters:
        filename - the absolute path in the real filesystem of the file to convert
        inString - String with HTML code
        startPattern - the start pattern definition for content extraction
        endPattern - the end pattern definition for content extraction
        properties - the file properties
        Returns:
        String with transformed code