Package org.opencms.search.documents
Handles indexing of the different document and resource types from the OpenCms VFS for the full-text search.
Since: 6.0.0
Interface Summary
- I_CmsDocumentFactory: Used to create index Lucene Documents for OpenCms resources, controls the text extraction algorithm used for a specific OpenCms resource type / MIME type combination.
- I_CmsSearchExtractor: Defines a text extractor for the integrated search engine.
- I_CmsTermHighlighter: Highlights arbitrary terms, used for generation of search excerpts.
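As an illustration of what a term highlighter for search excerpts does, here is a minimal, self-contained sketch. It does not use the OpenCms API: the class `ExcerptHighlightSketch`, its `highlight` method, and the choice of `<b>` tags are all assumptions made for this example, and the real `I_CmsTermHighlighter` contract may differ.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical example class, not part of org.opencms.search.documents.
public class ExcerptHighlightSketch {

    /**
     * Wraps every whole-word, case-insensitive occurrence of any term
     * in <b>...</b>. All terms are matched in a single pass so that
     * markup inserted for one term is never re-scanned for another.
     */
    public static String highlight(String excerpt, List<String> terms) {
        if (terms.isEmpty()) {
            return excerpt; // nothing to highlight
        }
        // Quote each term so regex metacharacters are treated literally.
        String alternation = terms.stream()
            .map(Pattern::quote)
            .collect(Collectors.joining("|"));
        Pattern pattern = Pattern.compile(
            "\\b(?:" + alternation + ")\\b", Pattern.CASE_INSENSITIVE);
        // $0 is the matched text, so the original casing is preserved.
        return pattern.matcher(excerpt).replaceAll("<b>$0</b>");
    }

    public static void main(String[] args) {
        String excerpt = "OpenCms indexes PDF and plain text files.";
        System.out.println(highlight(excerpt, Arrays.asList("pdf", "text")));
    }
}
```

A production highlighter would typically also escape the excerpt for HTML output and rely on the analyzer used at index time rather than on a plain regular expression.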
Class Summary
- A_CmsVfsDocument: Base document factory class for a VFS CmsResource, just requires a specialized implementation of I_CmsSearchExtractor.extractContent(CmsObject, CmsResource, I_CmsSearchIndex) for text extraction from the binary document content.
- CmsDocumentContainerPage: Lucene document factory class to extract index data from a resource of type CmsResourceTypeContainerPage.
- CmsDocumentDependency: Provides the dependency information about one search result document, used to generate the list of document search results.
- CmsDocumentGeneric: Lucene document factory class for indexing data from a generic CmsResource.
- CmsDocumentHtml: Lucene document factory class to extract index data from a cms resource containing plain html data.
- CmsDocumentMsOfficeOLE2: Lucene document factory class to extract text data from a VFS resource that is an OLE 2 MS Office document.
- CmsDocumentMsOfficeOOXML: Lucene document factory class to extract text data from a VFS resource that is an OOXML MS Office document.
- CmsDocumentOpenOffice: Lucene document factory class to extract index data from a cms resource containing Open Document Format data.
- CmsDocumentPdf: Lucene document factory class to extract index data from a cms resource containing Adobe pdf data.
- CmsDocumentPlainText: Lucene document factory class to extract index data from a cms resource containing plain text data.
- CmsDocumentRtf: Lucene document factory class to extract index data from a cms resource containing RTF data.
- CmsDocumentXmlContent: Lucene document factory class to extract index data from an OpenCms VFS resource of type CmsResourceTypeXmlContent.
- CmsDocumentXmlPage: Lucene document factory class to extract index data from a cms resource of type CmsResourceTypeXmlPage.
- CmsExtractionResultCache: Implements a disk cache that stores text extraction results in the RFS.
- CmsTermHighlighterHtml: Default highlighter implementation used for generation of search excerpts.
- Messages: Convenience class to access the localized messages of this OpenCms package.
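The idea of caching text extraction results on disk can be sketched as follows. This is a simplified illustration under stated assumptions, not the behaviour of `CmsExtractionResultCache`: the class name `ExtractionCacheSketch`, the key scheme (hash of the resource path plus last-modified time), and the `Supplier`-based extractor are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Supplier;

// Hypothetical example class, not part of org.opencms.search.documents.
public class ExtractionCacheSketch {

    private final Path cacheDir;

    public ExtractionCacheSketch(Path cacheDir) {
        this.cacheDir = cacheDir;
    }

    /** Creates a cache backed by a fresh temporary directory. */
    public static ExtractionCacheSketch inTempDir() {
        try {
            return new ExtractionCacheSketch(
                Files.createTempDirectory("extraction-cache"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Assumed key scheme: a changed modification time yields a new file
     * name, so stale results are simply never read again. A real cache
     * would use a collision-resistant key instead of String.hashCode().
     */
    private Path fileFor(String resourcePath, long lastModified) {
        String name = Integer.toHexString(resourcePath.hashCode())
            + "_" + lastModified + ".txt";
        return cacheDir.resolve(name);
    }

    /** Returns the cached text, or runs the extractor once and stores its result. */
    public String getOrExtract(
        String resourcePath, long lastModified, Supplier<String> extractor) {
        Path file = fileFor(resourcePath, lastModified);
        try {
            if (Files.exists(file)) {
                return new String(
                    Files.readAllBytes(file), StandardCharsets.UTF_8);
            }
            String text = extractor.get(); // the expensive step
            Files.write(file, text.getBytes(StandardCharsets.UTF_8));
            return text;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ExtractionCacheSketch cache = inTempDir();
        // The second call is served from disk; its extractor is not invoked.
        System.out.println(cache.getOrExtract("/docs/a.pdf", 1L, () -> "extracted text"));
        System.out.println(cache.getOrExtract("/docs/a.pdf", 1L, () -> "not used"));
    }
}
```

Keying on the modification time keeps the sketch free of explicit invalidation logic, at the cost of leaving outdated files behind until they are cleaned up separately.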
Enum Summary
- CmsDocumentDependency.DependencyType: Defines the possible dependency types.
Exception Summary
- CmsIndexNoContentException: Signals an error during content extraction of an empty document.