Package org.opencms.search.documents
Interface I_CmsSearchExtractor
- All Known Subinterfaces:
I_CmsDocumentFactory
- All Known Implementing Classes:
A_CmsVfsDocument, CmsDocumentContainerPage, CmsDocumentGeneric, CmsDocumentHtml, CmsDocumentMsOfficeOLE2, CmsDocumentMsOfficeOOXML, CmsDocumentOpenOffice, CmsDocumentPdf, CmsDocumentPlainText, CmsDocumentRtf, CmsDocumentXmlContent, CmsDocumentXmlPage, CmsSolrDocumentContainerPage, CmsSolrDocumentXmlContent
public interface I_CmsSearchExtractor
Defines a text extractor for the integrated search engine.
The job of a search extractor is to extract indexable plain text from a resource in the OpenCms VFS. This may be from the resource content, for example from a PDF file, or from the resource properties, for example the Title, Keywords and Description properties.
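As a rough illustration of the idea described above, the following sketch combines the Title, Keywords and Description properties with the resource content into one indexable string. It does not use the OpenCms API: `PlainTextExtractionSketch`, `SketchResource` and `extractText` are hypothetical stand-ins introduced only for this example, and the content is assumed to be UTF-8 plain text already.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch only: none of these types are OpenCms classes. */
final class PlainTextExtractionSketch {

    /** Hypothetical stand-in for a VFS resource: raw content plus string properties. */
    static final class SketchResource {
        final byte[] content;
        final Map<String, String> properties;

        SketchResource(byte[] content, Map<String, String> properties) {
            this.content = content;
            this.properties = properties;
        }
    }

    /** Property names taken from the interface description. */
    static final String[] INDEXED_PROPERTIES = {"Title", "Keywords", "Description"};

    /** Builds one indexable string: property values first, then the content. */
    static String extractText(SketchResource resource) {
        StringBuilder text = new StringBuilder();
        for (String name : INDEXED_PROPERTIES) {
            String value = resource.properties.get(name);
            if (value != null && !value.trim().isEmpty()) {
                text.append(value.trim()).append('\n');
            }
        }
        // Assumption: the content is UTF-8 plain text. A PDF or office
        // document would need a format-specific parser at this point.
        text.append(new String(resource.content, StandardCharsets.UTF_8).trim());
        return text.toString();
    }

    public static void main(String[] args) {
        Map<String, String> properties = new LinkedHashMap<>();
        properties.put("Title", "Release notes");
        properties.put("Keywords", "search, index");
        SketchResource resource = new SketchResource(
            "Full-text body.".getBytes(StandardCharsets.UTF_8), properties);
        System.out.println(extractText(resource));
    }
}
```

A real implementation would return an I_CmsExtractionResult rather than a plain string; the string is used here only to keep the sketch self-contained.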
- Since:
- 6.0.0
Method Summary
Modifier and Type: I_CmsExtractionResult
Method: extractContent(CmsObject cms, CmsResource resource, I_CmsSearchIndex index)
Description: Extracts the content of a given index resource according to the resource file type and the configuration of the given index.
Method Details
extractContent
I_CmsExtractionResult extractContent(CmsObject cms, CmsResource resource, I_CmsSearchIndex index) throws CmsException
Extracts the content of a given index resource according to the resource file type and the configuration of the given index.
- Parameters:
cms - the cms object
resource - the resource to extract the content from
index - the index to extract the content for
- Returns:
- the extracted content of the resource
- Throws:
CmsException - if something goes wrong
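The contract above (extraction depends on both the file type and the index configuration, and failure is reported through a checked exception) might look roughly like the following. This is a sketch under stated assumptions, not the OpenCms implementation: `SketchIndex`, its `extractContent` flag, `SketchException` and the suffix-based type check are all hypothetical stand-ins for I_CmsSearchIndex, CmsException and the real resource type handling.

```java
import java.util.Locale;

/** Sketch only: stand-in types, not the OpenCms API. */
final class ExtractContentSketch {

    /** Hypothetical checked exception standing in for CmsException. */
    static class SketchException extends Exception {
        SketchException(String message) {
            super(message);
        }
    }

    /** Hypothetical index setting: whether file content is extracted at all. */
    static final class SketchIndex {
        final boolean extractContent;

        SketchIndex(boolean extractContent) {
            this.extractContent = extractContent;
        }
    }

    /** Returns extracted text depending on file type and index configuration. */
    static String extractContent(String fileName, String rawText, SketchIndex index)
            throws SketchException {
        if (!index.extractContent) {
            return ""; // this index is configured to skip content extraction
        }
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".txt")) {
            return rawText.trim();
        }
        if (lower.endsWith(".html")) {
            // Naive tag stripping for illustration only; a real extractor
            // would use an HTML parser.
            return rawText.replaceAll("<[^>]*>", " ").replaceAll("\\s+", " ").trim();
        }
        throw new SketchException("No extractor for file type: " + fileName);
    }

    public static void main(String[] args) {
        try {
            System.out.println(
                extractContent("page.html", "<p>Hi <b>there</b></p>", new SketchIndex(true)));
        } catch (SketchException e) {
            // Callers must handle the checked exception, as with CmsException.
            System.err.println("Extraction failed: " + e.getMessage());
        }
    }
}
```

In this sketch an unsupported file type raises the exception instead of returning an empty result, so the caller can tell "nothing to index" apart from "could not extract".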