Package org.opencms.util
Interface I_CmsHtmlNodeVisitor
- All Known Implementing Classes:
CmsHtml2TextConverter, CmsHtmlDecorator, CmsHtmlParser, CmsLinkProcessor
public interface I_CmsHtmlNodeVisitor
Interface that combines a visitor of HTML documents with the hook that starts the
parser / lexer which triggers the visit.
- Since:
- 6.1.3
-
Method Summary
Modifier and Type, Method, Description
String getConfiguration()
    Returns the configuration String of this visitor, or the empty String if none was provided before.
String getResult()
    Returns the text extraction result.
String process(String html, String encoding)
    Extracts the text from the given html content, assuming the given html encoding.
void setConfiguration(String configuration)
    Sets a configuration String for this visitor.
void setNoAutoCloseTags(List<String> noAutoCloseTags)
    Sets a list of upper case tag names for which parsing / visiting should not correct missing closing tags.
void visitEndTag(org.htmlparser.Tag tag)
    Visitor method (callback) invoked when a closing Tag is encountered.
void visitRemarkNode(org.htmlparser.Remark remark)
    Visitor method (callback) invoked when a remark Tag (HTML comment) is encountered.
void visitStringNode(org.htmlparser.Text text)
    Visitor method (callback) invoked when a text node is encountered.
void visitTag(org.htmlparser.Tag tag)
    Visitor method (callback) invoked when a starting Tag is encountered.
-
Method Details
-
getConfiguration
Returns the configuration String of this visitor, or the empty String if none was provided before.
- Returns:
- the configuration String of this visitor; by this contract never null, but an empty String if not provided.
- See Also:
-
getResult
Returns the text extraction result.
- Returns:
- the text extraction result
-
process
Extracts the text from the given html content, assuming the given html encoding.
- Parameters:
html
- the content to extract the plain text from
encoding
- the encoding to use
- Returns:
- the text extracted from the given html content
- Throws:
org.htmlparser.util.ParserException
- if something goes wrong
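The visitor-plus-hook idea behind process can be sketched without the OpenCms or htmlparser libraries. Everything in the block below is a hypothetical stand-in: SketchVisitor, TextCollector and the regular-expression tokenizer are illustration names, the callbacks take plain Strings instead of org.htmlparser node types, and the encoding parameter is left out because the input is already a String. It only shows the shape of the contract: process starts the scan, the scan calls back into the visit methods, and getResult exposes what was collected.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VisitorSketch {
    public static void main(String[] args) {
        TextCollector visitor = new TextCollector();
        // prints "Hello world"
        System.out.println(visitor.process("<p>Hello <!-- note --><b>world</b></p>"));
    }
}

/** Hypothetical stand-in for the visitor contract; NOT the OpenCms interface. */
interface SketchVisitor {
    void visitTag(String name);
    void visitEndTag(String name);
    void visitRemarkNode(String remark);
    void visitStringNode(String text);
    String getResult();
    String process(String html);
}

/** Collects text nodes and ignores tags and comments. */
class TextCollector implements SketchVisitor {
    // Toy tokenizer (an assumption, far simpler than a real HTML lexer):
    // a comment, an end tag, a start tag, or a run of text.
    private static final Pattern TOKEN = Pattern.compile(
        "<!--(.*?)-->|</([A-Za-z][A-Za-z0-9]*)\\s*>|<([A-Za-z][A-Za-z0-9]*)[^>]*>|([^<]+)",
        Pattern.DOTALL);

    private final StringBuilder result = new StringBuilder();

    public void visitTag(String name) { }
    public void visitEndTag(String name) { }
    public void visitRemarkNode(String remark) { }
    public void visitStringNode(String text) { result.append(text); }
    public String getResult() { return result.toString(); }

    /** The hook: runs the tokenizer, which calls back into this visitor. */
    public String process(String html) {
        result.setLength(0); // start fresh on every run
        Matcher m = TOKEN.matcher(html);
        while (m.find()) {
            if (m.group(1) != null) {
                visitRemarkNode(m.group(1));
            } else if (m.group(2) != null) {
                visitEndTag(m.group(2));
            } else if (m.group(3) != null) {
                visitTag(m.group(3));
            } else if (m.group(4) != null) {
                visitStringNode(m.group(4));
            }
        }
        return getResult();
    }
}
```

A real implementation would delegate the scanning to the htmlparser library and could throw ParserException; the sketch swaps that for a regular expression purely to stay self-contained.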
-
setConfiguration
Sets a configuration String for this visitor.
This will most likely be done with data from an xsd, custom jsp tag, ...
- Parameters:
configuration
- the configuration of this visitor to set.
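The "never null" contract of getConfiguration can be honored with a small amount of bookkeeping in an implementing class. The sketch below is a hypothetical holder, not code from any OpenCms class; in particular, treating a null argument as "not provided" is an assumption, since the documentation does not say how setConfiguration handles null.

```java
class ConfigurationSketch {
    // Starts as the empty String so the getter never returns null.
    private String configuration = "";

    String getConfiguration() {
        return configuration;
    }

    void setConfiguration(String configuration) {
        // Assumption: null is normalized to "" to keep the getter contract.
        this.configuration = (configuration == null) ? "" : configuration;
    }

    public static void main(String[] args) {
        ConfigurationSketch visitor = new ConfigurationSketch();
        System.out.println(visitor.getConfiguration().isEmpty()); // true
        // Hypothetical value; the configuration format is up to the implementation.
        visitor.setConfiguration("mode=plain");
        System.out.println(visitor.getConfiguration()); // mode=plain
    }
}
```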
-
setNoAutoCloseTags
Sets a list of upper case tag names for which parsing / visiting should not correct missing closing tags.
This has to be called before process(String, String) is invoked to take effect.
- Parameters:
noAutoCloseTags
- the list of upper case tag names for which parsing / visiting should not correct missing closing tags.
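Because the list is expected to hold upper case tag names, callers that collect names from user input or configuration may want to normalize them first. The helper below is a hypothetical convenience, not part of the interface; it uses Locale.ROOT so the result does not depend on the JVM's default locale.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

class NoAutoCloseTagsSketch {
    /** Trims and upper-cases tag names so they match the documented form. */
    static List<String> toUpperCaseTagNames(List<String> tagNames) {
        return tagNames.stream()
            .map(name -> name.trim().toUpperCase(Locale.ROOT))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> noAutoCloseTags = toUpperCaseTagNames(Arrays.asList("br", "Img", " hr "));
        System.out.println(noAutoCloseTags); // [BR, IMG, HR]
        // With a real visitor (hypothetical variable `visitor`), the order would be:
        //   visitor.setNoAutoCloseTags(noAutoCloseTags);
        //   String text = visitor.process(html, encoding);
    }
}
```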
-
visitEndTag
Visitor method (callback) invoked when a closing Tag is encountered.
- Parameters:
tag
- the tag that is ended.
- See Also:
-
NodeVisitor.visitEndTag(org.htmlparser.Tag)
-
visitRemarkNode
Visitor method (callback) invoked when a remark Tag (HTML comment) is encountered.
- Parameters:
remark
- the remark Tag to visit.
- See Also:
-
NodeVisitor.visitRemarkNode(org.htmlparser.Remark)
-
visitStringNode
Visitor method (callback) invoked when a text node is encountered.
- Parameters:
text
- the text that is visited.
- See Also:
-
NodeVisitor.visitStringNode(org.htmlparser.Text)
-
visitTag
Visitor method (callback) invoked when a starting Tag is encountered.
- Parameters:
tag
- the tag that is visited.
- See Also:
-
NodeVisitor.visitTag(org.htmlparser.Tag)
-