Package org.opencms.search.documents


Handles indexing of the different document and resource types from the OpenCms VFS for the full-text search.

Since:
6.0.0
  • Class
    Description
    Base document factory class for a VFS CmsResource; it only requires a specialized implementation of I_CmsSearchExtractor.extractContent(CmsObject, CmsResource, I_CmsSearchIndex) for text extraction from the binary document content.
    Lucene document factory class to extract index data from a resource of type CmsResourceTypeContainerPage.
    Provides the dependency information about one search result document, used to generate the list of document search results.
    Defines the possible dependency types.
    Lucene document factory class for indexing data from a generic CmsResource.
    Lucene document factory class to extract index data from a cms resource containing plain HTML data.
    Lucene document factory class to extract text data from a VFS resource that is an OLE 2 MS Office document.
    Lucene document factory class to extract text data from a VFS resource that is an OOXML MS Office document.
    Lucene document factory class to extract index data from a cms resource containing Open Document Format data.
    Lucene document factory class to extract index data from a cms resource containing Adobe PDF data.
    Lucene document factory class to extract index data from a cms resource containing plain text data.
    Lucene document factory class to extract index data from a cms resource containing RTF data.
    Lucene document factory class to extract index data from an OpenCms VFS resource of type CmsResourceTypeXmlContent.
    Lucene document factory class to extract index data from a cms resource of type CmsResourceTypeXmlPage.
    Implements a disk cache that stores text extraction results in the RFS.
    Signals an error during content extraction of an empty document.
    Default highlighter implementation used for generation of search excerpts.
    Creates Lucene index Documents for OpenCms resources and controls the text extraction algorithm used for a specific OpenCms resource type / MIME type combination.
    Defines a text extractor for the integrated search engine.
    Highlights arbitrary terms, used for generation of search excerpts.
    Convenience class to access the localized messages of this OpenCms package.
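
The document factory description above mentions choosing a text extraction algorithm per resource type / MIME type combination. The following is a minimal sketch of how such a lookup could work. It is not the OpenCms API: the class ExtractorRegistrySketch, its methods, and the "resourceType:mimeType" key format are hypothetical names assumed for illustration only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical registry (not part of OpenCms): maps a resource type / MIME type pair to a text extractor. */
final class ExtractorRegistrySketch {

    /** Extractors keyed by "resourceType:mimeType"; a key without a MIME type part is the default for that type. */
    private final Map<String, Function<byte[], String>> extractors = new HashMap<>();

    /** Registers an extractor; pass null as mimeType to register the default for a resource type. */
    void register(String resourceType, String mimeType, Function<byte[], String> extractor) {
        extractors.put(key(resourceType, mimeType), extractor);
    }

    /** Returns the extractor for the exact pair, else the default for the resource type, else null. */
    Function<byte[], String> lookup(String resourceType, String mimeType) {
        Function<byte[], String> exact = extractors.get(key(resourceType, mimeType));
        return exact != null ? exact : extractors.get(key(resourceType, null));
    }

    /** Builds the map key (assumed format, chosen for this sketch). */
    private static String key(String resourceType, String mimeType) {
        return mimeType == null ? resourceType : resourceType + ":" + mimeType;
    }
}
```

With this layout, a PDF-specific extractor registered for ("binary", "application/pdf") would take precedence over a generic extractor registered for ("binary", null), and a lookup for an unregistered resource type returns null so the caller can skip text extraction.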