Class CmsHtmlStripper
- java.lang.Object
-
- org.opencms.util.CmsHtmlStripper
-
public final class CmsHtmlStripper extends java.lang.Object
Simple HTML tag stripper that allows configuration of the HTML tag names that are allowed.
All tags that are not explicitly allowed via invocation of one of the addPreserve... methods will be missing in the result of the method stripHtml(String).
Instances are reusable but not shareable (multithreading). If the configuration should be changed between subsequent invocations of stripHtml(String), the method reset() has to be called.
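The preserve-list contract described above can be illustrated with a small self-contained sketch. The class `PreserveTagSketch` below is a hypothetical stand-in written for illustration only: it borrows the method names `addPreserveTag`, `reset` and `stripHtml` from this page, but it uses a simple regular expression rather than the parser behind `CmsHtmlStripper`, and its `addPreserveTag` return value only reports whether the name was new to the set, which is an assumption of the sketch.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for illustration; NOT the OpenCms implementation. */
final class PreserveTagSketch {

    // Matches an opening or closing tag and captures its name in group 1.
    private static final Pattern TAG = Pattern.compile("</?([A-Za-z][A-Za-z0-9]*)[^>]*>");

    // Lower-cased names of the tags that survive stripping.
    private final Set<String> preserved = new HashSet<>();

    /** Registers a tag name to keep; names are compared case-insensitively. */
    boolean addPreserveTag(String tagName) {
        return preserved.add(tagName.toLowerCase(Locale.ROOT));
    }

    /** Clears the preserve list so the instance can be reused with a new configuration. */
    void reset() {
        preserved.clear();
    }

    /** Removes every tag whose name is not in the preserve list; text content is kept. */
    String stripHtml(String html) {
        Matcher m = TAG.matcher(html);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(html, last, m.start());
            if (preserved.contains(m.group(1).toLowerCase(Locale.ROOT))) {
                out.append(m.group()); // preserved tag is copied through unchanged
            }
            last = m.end();
        }
        out.append(html, last, html.length());
        return out.toString();
    }

    public static void main(String[] args) {
        PreserveTagSketch stripper = new PreserveTagSketch();
        stripper.addPreserveTag("B");
        System.out.println(stripper.stripHtml("<p>Hello <b>world</b></p>")); // Hello <b>world</b>
        stripper.reset(); // new configuration: nothing is preserved
        System.out.println(stripper.stripHtml("<p>Hello <b>world</b></p>")); // Hello world
    }
}
```

Because the preserve list is mutable instance state, a sketch like this (and, per the note above, the real class) should be confined to one thread at a time.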
- Since:
- 6.9.2
-
-
Constructor Summary
Constructors
CmsHtmlStripper()
Default constructor that turns echo on and uses the settings for replacing tags.
CmsHtmlStripper(boolean useTidy)
Creates an instance with control over whether tidy is used.
-
Method Summary
boolean addPreserveTag(java.lang.String tagName)
Adds a tag that will be preserved by stripHtml(String).
void addPreserveTagList(java.util.List<java.lang.String> preserveTags)
Convenience method for adding several tags to preserve.
void addPreserveTags(java.lang.String tagList, char separator)
Convenience method for adding several tags to preserve in the form of a delimiter-separated String.
void reset()
Resets the configuration of the tags to preserve.
java.lang.String stripHtml(java.lang.String html)
Extracts the text from the given html content, assuming the given html encoding.
-
-
-
Constructor Detail
-
CmsHtmlStripper
public CmsHtmlStripper()
Default constructor that turns echo on and uses the settings for replacing tags.
-
CmsHtmlStripper
public CmsHtmlStripper(boolean useTidy)
Creates an instance with control over whether tidy is used.
- Parameters:
useTidy - if true, tidy will be used
-
-
Method Detail
-
addPreserveTag
public boolean addPreserveTag(java.lang.String tagName)
Adds a tag that will be preserved by stripHtml(String).
- Parameters:
tagName - the name of the tag to keep (case insensitive)
- Returns:
- true if the tagName was added correctly to the internal engine
-
addPreserveTagList
public void addPreserveTagList(java.util.List<java.lang.String> preserveTags)
Convenience method for adding several tags to preserve.
- Parameters:
preserveTags - a List<String> with the case-insensitive tag names of the tags to preserve
- See Also:
addPreserveTag(String)
-
addPreserveTags
public void addPreserveTags(java.lang.String tagList, char separator)
Convenience method for adding several tags to preserve in the form of a delimiter-separated String.
The String will be split with CmsStringUtil.splitAsList(String, char, boolean), with tagList as the first argument, separator as the second argument and the third argument set to true (trimming support).
- Parameters:
tagList - a delimiter-separated String with case-insensitive tag names to preserve by stripHtml(String)
separator - the delimiter that separates tag names in the tagList argument
- See Also:
addPreserveTag(String)
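The effect of splitting with trimming can be sketched in plain Java. The helper `splitAndTrim` below is a hypothetical illustration and does not call `CmsStringUtil.splitAsList`. It assumes that each token is trimmed of surrounding whitespace and that empty tokens are skipped; the second point is an assumption of this sketch, not something this page states.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of delimiter splitting with trimming; not OpenCms code. */
final class TagListSplitSketch {

    /** Splits tagList at each separator character and trims every token. */
    static List<String> splitAndTrim(String tagList, char separator) {
        List<String> result = new ArrayList<>();
        int start = 0;
        // Iterate one position past the end so the last token is flushed too.
        for (int i = 0; i <= tagList.length(); i++) {
            if (i == tagList.length() || tagList.charAt(i) == separator) {
                String token = tagList.substring(start, i).trim();
                if (!token.isEmpty()) { // assumption: empty tokens are dropped
                    result.add(token);
                }
                start = i + 1;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Whitespace around the names does not end up in the tag names.
        System.out.println(splitAndTrim("b, i ,strong", ',')); // [b, i, strong]
    }
}
```

With trimming in place, a configuration value such as `"b, i ,strong"` yields the same tag names as `"b,i,strong"`.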
-
reset
public void reset()
Resets the configuration of the tags to preserve.
This is called from the constructor and only has to be called if this instance is reused with a different configuration (of tags to keep).
-
stripHtml
public java.lang.String stripHtml(java.lang.String html) throws org.htmlparser.util.ParserException
Extracts the text from the given html content, assuming the given html encoding.
Additionally, tags are replaced / removed according to the configuration of this instance.
Please note: There are static process methods in the superclass that will not do the replacements / removals. Don't mix them up with this method.
- Parameters:
html - the content to extract the plain text from.
- Returns:
- the text extracted from the given html content.
- Throws:
org.htmlparser.util.ParserException - if something goes wrong.
-
-