Interface I_CmsHtmlNodeVisitor

    • Method Summary

      Modifier and Type Method Description
      java.lang.String getConfiguration()
      Returns the configuration String of this visitor, or the empty String if none was provided.
      java.lang.String getResult()
      Returns the text extraction result.
      java.lang.String process​(java.lang.String html, java.lang.String encoding)
      Extracts the text from the given html content, assuming the given html encoding.
      void setConfiguration​(java.lang.String configuration)
      Sets a configuration String for this visitor.
      void setNoAutoCloseTags​(java.util.List<java.lang.String> noAutoCloseTags)
      Sets a list of upper case tag names for which parsing / visiting should not correct missing closing tags.
      void visitEndTag​(org.htmlparser.Tag tag)
      Visitor method (callback) invoked when a closing Tag is encountered.
      void visitRemarkNode​(org.htmlparser.Remark remark)
      Visitor method (callback) invoked when a remark Tag (HTML comment) is encountered.
      void visitStringNode​(org.htmlparser.Text text)
      Visitor method (callback) invoked when a text node is encountered.
      void visitTag​(org.htmlparser.Tag tag)
      Visitor method (callback) invoked when a starting Tag is encountered.
    • Method Detail

      • getConfiguration

        java.lang.String getConfiguration()
        Returns the configuration String of this visitor, or the empty String if none was provided.

        Returns:
        the configuration String of this visitor; by contract never null, but the empty String if none was provided.
        See Also:
        setConfiguration(String)
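
        The never-null contract described above could be honored along the following lines. This is a minimal sketch only: ConfigurationHolder is a hypothetical helper that is not part of OpenCms, and normalizing a null argument to the empty String is an assumption, not documented behavior.

```java
/** Hypothetical helper (not an OpenCms class) that keeps a visitor configuration String. */
class ConfigurationHolder {

    // Starts as the empty String so that the getter never returns null.
    private String configuration = "";

    /** Stores the configuration; a null argument is normalized to "" (an assumption). */
    void setConfiguration(String configuration) {
        this.configuration = (configuration == null) ? "" : configuration;
    }

    /** Returns the stored configuration, or the empty String if none was provided. */
    String getConfiguration() {
        return configuration;
    }

    public static void main(String[] args) {
        ConfigurationHolder holder = new ConfigurationHolder();
        // Nothing was set yet, so the empty String is returned.
        System.out.println("[" + holder.getConfiguration() + "]");
        // "mode=strict" is a made-up value used only for illustration.
        holder.setConfiguration("mode=strict");
        System.out.println("[" + holder.getConfiguration() + "]");
    }
}
```

        An implementation of this interface could delegate its two configuration methods to such a field, so callers never need a null check on the returned value.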
      • getResult

        java.lang.String getResult()
        Returns the text extraction result.

        Returns:
        the text extraction result
      • process

        java.lang.String process​(java.lang.String html,
                                 java.lang.String encoding)
                          throws org.htmlparser.util.ParserException
        Extracts the text from the given html content, assuming the given html encoding.

        Parameters:
        html - the content to extract the plain text from
        encoding - the encoding to use
        Returns:
        the text extracted from the given html content
        Throws:
        org.htmlparser.util.ParserException - if something goes wrong
      • setConfiguration

        void setConfiguration​(java.lang.String configuration)
        Sets a configuration String for this visitor.

        The configuration will most likely come from an XSD, a custom JSP tag, ...

        Parameters:
        configuration - the configuration of this visitor to set.
      • setNoAutoCloseTags

        void setNoAutoCloseTags​(java.util.List<java.lang.String> noAutoCloseTags)
        Sets a list of upper case tag names for which parsing / visiting should not correct missing closing tags.

        This method must be called before process(String, String) is invoked in order to take effect.

        Parameters:
        noAutoCloseTags - the list of upper case tag names for which parsing / visiting should not correct missing closing tags.
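
        Because the list is expected to hold upper case tag names, a caller may want to normalize its input before passing it on. The following is a sketch under that assumption; NoAutoCloseTagNames and its toUpperCase method are hypothetical names and not part of OpenCms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical caller-side helper (not an OpenCms class). */
class NoAutoCloseTagNames {

    /** Returns a new list with every tag name trimmed and converted to upper case. */
    static List<String> toUpperCase(List<String> tagNames) {
        List<String> result = new ArrayList<>(tagNames.size());
        for (String name : tagNames) {
            // Locale.ROOT avoids locale-specific case mappings (for example the Turkish dotless i).
            result.add(name.trim().toUpperCase(Locale.ROOT));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [BR, IMG, HR]
        System.out.println(toUpperCase(Arrays.asList("br", "Img", " hr ")));
    }
}
```

        The returned list would then be handed to setNoAutoCloseTags before process(String, String) is called.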
      • visitEndTag

        void visitEndTag​(org.htmlparser.Tag tag)
        Visitor method (callback) invoked when a closing Tag is encountered.

        Parameters:
        tag - the tag that is ended.
        See Also:
        NodeVisitor.visitEndTag(org.htmlparser.Tag)
      • visitRemarkNode

        void visitRemarkNode​(org.htmlparser.Remark remark)
        Visitor method (callback) invoked when a remark Tag (HTML comment) is encountered.

        Parameters:
        remark - the remark Tag to visit.
        See Also:
        NodeVisitor.visitRemarkNode(org.htmlparser.Remark)
      • visitStringNode

        void visitStringNode​(org.htmlparser.Text text)
        Visitor method (callback) invoked when a text node is encountered.

        Parameters:
        text - the text that is visited.
        See Also:
        NodeVisitor.visitStringNode(org.htmlparser.Text)
      • visitTag

        void visitTag​(org.htmlparser.Tag tag)
        Visitor method (callback) invoked when a starting Tag is encountered.

        Parameters:
        tag - the tag that is visited.
        See Also:
        NodeVisitor.visitTag(org.htmlparser.Tag)
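
        How the four visitor callbacks can cooperate to produce a text extraction result is sketched below. This is a simplified stand-in, not the OpenCms implementation: TextCollectingVisitor is a hypothetical class, and its callbacks take plain Strings instead of the org.htmlparser Tag, Text and Remark types so that the example runs without the HTML Parser library.

```java
/** Hypothetical, simplified stand-in for a text-extracting visitor (not an OpenCms class). */
class TextCollectingVisitor {

    // Accumulates the text seen so far.
    private final StringBuilder result = new StringBuilder();

    /** Called for a starting tag; this sketch ignores markup. */
    void visitTag(String tagName) {
        // intentionally empty
    }

    /** Called for a closing tag; this sketch ignores markup. */
    void visitEndTag(String tagName) {
        // intentionally empty
    }

    /** Called for a text node; the text is appended to the result. */
    void visitStringNode(String text) {
        result.append(text);
    }

    /** Called for an HTML comment; comments are skipped in this sketch. */
    void visitRemarkNode(String remark) {
        // intentionally empty
    }

    /** Returns the text collected so far. */
    String getResult() {
        return result.toString();
    }

    public static void main(String[] args) {
        // Callback sequence a parser might emit for: <P>Hello <!-- note --><B>world</B></P>
        TextCollectingVisitor visitor = new TextCollectingVisitor();
        visitor.visitTag("P");
        visitor.visitStringNode("Hello ");
        visitor.visitRemarkNode(" note ");
        visitor.visitTag("B");
        visitor.visitStringNode("world");
        visitor.visitEndTag("B");
        visitor.visitEndTag("P");
        // prints Hello world
        System.out.println(visitor.getResult());
    }
}
```

        In a real implementation the parser drives these callbacks during process(String, String), and getResult() returns what was collected.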