Package org.opencms.util
Class CmsHtml2TextConverter
- java.lang.Object
- org.htmlparser.visitors.NodeVisitor
- org.opencms.util.CmsHtmlParser
- org.opencms.util.CmsHtml2TextConverter
- All Implemented Interfaces:
I_CmsHtmlNodeVisitor
public class CmsHtml2TextConverter extends CmsHtmlParser
Extracts the plain text content of an HTML page.
Field Summary
Fields inherited from class org.opencms.util.CmsHtmlParser
m_echo, m_noAutoCloseTags, m_result, TAG_ARRAY, TAG_LIST
Constructor Summary
CmsHtml2TextConverter()
Creates a new instance of the html converter.
Method Summary
static java.lang.String html2text(java.lang.String html, java.lang.String encoding)
Extracts the text from the given HTML content, assuming the given HTML encoding.
void visitEndTag(org.htmlparser.Tag tag)
Visitor method (callback) invoked when a closing Tag is encountered.
void visitStringNode(org.htmlparser.Text text)
Visitor method (callback) invoked when a text node is encountered.
void visitTag(org.htmlparser.Tag tag)
Visitor method (callback) invoked when a starting Tag is encountered.
Methods inherited from class org.opencms.util.CmsHtmlParser
collapse, configureNoAutoCorrectionTags, getConfiguration, getNoAutoCloseTags, getResult, getTagHtml, process, setConfiguration, setNoAutoCloseTags, visitRemarkNode
Constructor Detail
CmsHtml2TextConverter
public CmsHtml2TextConverter()
Creates a new instance of the html converter.
Method Detail
html2text
public static java.lang.String html2text(java.lang.String html, java.lang.String encoding) throws java.lang.Exception
Extracts the text from the given HTML content, assuming the given HTML encoding.
- Parameters:
html - the content to extract the plain text from
encoding - the encoding to use
- Returns:
the text extracted from the given HTML content
- Throws:
java.lang.Exception - if something goes wrong
visitEndTag
public void visitEndTag(org.htmlparser.Tag tag)
Description copied from interface: I_CmsHtmlNodeVisitor
Visitor method (callback) invoked when a closing Tag is encountered.
- Specified by:
visitEndTag in interface I_CmsHtmlNodeVisitor
- Overrides:
visitEndTag in class CmsHtmlParser
- Parameters:
tag - the tag that is ended.
- See Also:
NodeVisitor.visitEndTag(org.htmlparser.Tag)
visitStringNode
public void visitStringNode(org.htmlparser.Text text)
Description copied from interface: I_CmsHtmlNodeVisitor
Visitor method (callback) invoked when a text node is encountered.
- Specified by:
visitStringNode in interface I_CmsHtmlNodeVisitor
- Overrides:
visitStringNode in class CmsHtmlParser
- Parameters:
text - the text that is visited.
- See Also:
NodeVisitor.visitStringNode(org.htmlparser.Text)
visitTag
public void visitTag(org.htmlparser.Tag tag)
Description copied from interface: I_CmsHtmlNodeVisitor
Visitor method (callback) invoked when a starting Tag is encountered.
- Specified by:
visitTag in interface I_CmsHtmlNodeVisitor
- Overrides:
visitTag in class CmsHtmlParser
- Parameters:
tag - the tag that is visited.
- See Also:
NodeVisitor.visitTag(org.htmlparser.Tag)
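
Example (illustrative only)
The three callbacks above work together: a parser walks the document and calls visitTag, visitStringNode and visitEndTag in document order, and the visitor decides what ends up in the result. The sketch below imitates that flow using only the Java standard library. It is not the OpenCms or htmlparser implementation: the class VisitorSketch, the interface NodeCallbacks, the class TextCollector and the regex-based tokenizer are all hypothetical, and the choice of which tags produce line breaks is an assumption made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for the visitor callbacks; not the OpenCms or htmlparser API. */
public class VisitorSketch {

    /** Simplified callbacks: tag names and text are passed as plain strings. */
    interface NodeCallbacks {
        void visitTag(String name);
        void visitStringNode(String text);
        void visitEndTag(String name);
    }

    /** Collects text nodes; the line-break rules are an assumption for this sketch. */
    static class TextCollector implements NodeCallbacks {
        private final StringBuilder result = new StringBuilder();

        public void visitTag(String name) {
            // A <br> start tag becomes a line break.
            if (name.equals("br")) {
                result.append('\n');
            }
        }

        public void visitStringNode(String text) {
            // Text nodes are copied to the result unchanged.
            result.append(text);
        }

        public void visitEndTag(String name) {
            // Closing a block-level tag ends the current line.
            if (name.equals("p") || name.equals("div")) {
                result.append('\n');
            }
        }

        String getResult() {
            return result.toString().trim();
        }
    }

    // Either a tag (group 1 = optional slash, group 2 = name) or a run of text (group 3).
    // Deliberately naive: comments, scripts and entities are not handled.
    private static final Pattern TOKEN =
        Pattern.compile("<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>|([^<]+)");

    /** Walks the input and invokes the matching callback for each token, in order. */
    static void walk(String html, NodeCallbacks visitor) {
        Matcher m = TOKEN.matcher(html);
        while (m.find()) {
            if (m.group(3) != null) {
                visitor.visitStringNode(m.group(3));
            } else if (m.group(1).isEmpty()) {
                visitor.visitTag(m.group(2).toLowerCase());
            } else {
                visitor.visitEndTag(m.group(2).toLowerCase());
            }
        }
    }

    /** Convenience entry point, loosely analogous in role to a static conversion method. */
    static String extractText(String html) {
        TextCollector collector = new TextCollector();
        walk(html, collector);
        return collector.getResult();
    }

    public static void main(String[] args) {
        System.out.println(extractText("<p>Hello <b>world</b></p><p>Second line</p>"));
    }
}
```

In real code the static html2text(java.lang.String html, java.lang.String encoding) method documented above would be the entry point; because it declares throws java.lang.Exception, callers need to catch or propagate that exception.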