Interface I_CmsDocumentFactory
- All Superinterfaces:
I_CmsSearchExtractor
- All Known Implementing Classes:
A_CmsVfsDocument, CmsDocumentContainerPage, CmsDocumentGeneric, CmsDocumentHtml, CmsDocumentMsOfficeOLE2, CmsDocumentMsOfficeOOXML, CmsDocumentOpenOffice, CmsDocumentPdf, CmsDocumentPlainText, CmsDocumentRtf, CmsDocumentXmlContent, CmsDocumentXmlPage, CmsSolrDocumentContainerPage, CmsSolrDocumentXmlContent
The configuration of the search index is defined in opencms-search.xml.
There you can associate a combination of OpenCms resource types and MIME types with an instance of this factory. This rather complex configuration is required because only the combination of OpenCms resource type and MIME type determines which factory to use for search indexing.
For example, if the OpenCms resource type is plain, the extraction algorithms for the MIME types .html and .txt must be different. On the other hand, a resource with the MIME type .html can be of almost any OpenCms resource type, such as xmlpage, xmlcontent or even jsp.
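The lookup described above can be sketched as a map keyed by the combination of resource type and MIME type. This is a minimal illustration only: the class FactoryRegistrySketch, the "|" key format, the MIME strings and the factory names are all hypothetical and not part of the OpenCms API. The real key format is defined by A_CmsVfsDocument.getDocumentKey(String, String).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in registry; not part of the OpenCms API. */
class FactoryRegistrySketch {

    // Maps a combined (resource type, MIME type) key to a factory name.
    private final Map<String, String> factoriesByKey = new HashMap<>();

    // Assumed key format, chosen for illustration only.
    static String key(String resourceType, String mimeType) {
        return resourceType + "|" + mimeType;
    }

    void register(String resourceType, String mimeType, String factoryName) {
        factoriesByKey.put(key(resourceType, mimeType), factoryName);
    }

    // Returns the factory name for the combination, or empty if none is configured.
    Optional<String> lookup(String resourceType, String mimeType) {
        return Optional.ofNullable(factoriesByKey.get(key(resourceType, mimeType)));
    }

    // Builds a small example configuration with made-up factory names.
    static FactoryRegistrySketch example() {
        FactoryRegistrySketch registry = new FactoryRegistrySketch();
        registry.register("plain", "text/html", "html");
        registry.register("plain", "text/plain", "plaintext");
        registry.register("xmlpage", "text/html", "xmlpage");
        return registry;
    }

    public static void main(String[] args) {
        FactoryRegistrySketch registry = example();
        // Same resource type, different MIME types: different factories.
        System.out.println(registry.lookup("plain", "text/html").orElse("none"));
        System.out.println(registry.lookup("plain", "text/plain").orElse("none"));
        // Same MIME type, different resource type: yet another factory.
        System.out.println(registry.lookup("xmlpage", "text/html").orElse("none"));
    }
}
```

Neither the resource type nor the MIME type alone selects a factory in this sketch, which mirrors why the configuration needs both.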
- Since:
- 6.0.0
-
Method Summary
Modifier and Type, Method, Description
- I_CmsSearchDocument createDocument(CmsObject cms, CmsResource resource, I_CmsSearchIndex index): Creates the Lucene Document for the given VFS resource and the given search index.
- CmsExtractionResultCache getCache(): Returns the disk-based cache used to store the raw extraction results.
- List<String> getDocumentKeys(List<String> resourceTypes, List<String> mimeTypes): Returns the list of accepted keys for the resource types that can be indexed using this document factory.
- String getName(): Returns the name of this document type factory.
- boolean isLocaleDependend(): Returns true if this document factory is locale dependent.
- default boolean isOnlyDependentOnContent(): Returns true if the extraction result depends only on the resource's content itself, i.e., it does not have to be re-extracted if the content date is unchanged.
- boolean isUsingCache(): Returns true if result caching is supported for this factory.
- void setCache(CmsExtractionResultCache cache): Sets the disk-based cache used to store the raw extraction results.
Methods inherited from interface org.opencms.search.documents.I_CmsSearchExtractor
extractContent
-
Method Details
-
createDocument
I_CmsSearchDocument createDocument(CmsObject cms, CmsResource resource, I_CmsSearchIndex index) throws CmsException
Creates the Lucene Document for the given VFS resource and the given search index.
This triggers the indexing process for the given VFS resource according to the configuration of the provided index.
The provided index resource contains the basic contents to index. The provided search index contains the configuration of what to index, such as the locale and possible special field mappings.
- Parameters:
cms - the OpenCms user context used to access the OpenCms VFS
resource - the search index resource to create the Lucene document from
index - the search index to create the Document for
- Returns:
- the Search Document for the given index resource and the given search index
- Throws:
CmsException - if something goes wrong
-
getCache
CmsExtractionResultCache getCache()
Returns the disk-based cache used to store the raw extraction results.
If null is returned, result caching is not supported for this factory.
- Returns:
- the disk-based cache used to store the raw extraction results
-
getDocumentKeys
List<String> getDocumentKeys(List<String> resourceTypes, List<String> mimeTypes) throws CmsException
Returns the list of accepted keys for the resource types that can be indexed using this document factory.
The result List contains String objects. Each String is later matched against A_CmsVfsDocument.getDocumentKey(String, String) to find the corresponding I_CmsDocumentFactory for a resource to index.
The list of accepted resource types may contain a catch-all entry "*"; in this case, a list for all possible resource types is returned, calculated by logic that depends on the document handler class.
- Parameters:
resourceTypes - list of accepted resource types
mimeTypes - list of accepted MIME types
- Returns:
- the list of accepted keys for the resource types that can be indexed using this document factory (String objects)
- Throws:
CmsException - if something goes wrong
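As a rough illustration of how such a key list could be built, the sketch below combines every accepted resource type with every accepted MIME type and expands the "*" catch-all from a supplied list of known types. It is an assumption-laden sketch, not the OpenCms implementation: the class DocumentKeysSketch, the "|" key format and the allKnownTypes parameter are hypothetical, and the real expansion logic depends on the document handler class.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of key generation; not the OpenCms implementation. */
class DocumentKeysSketch {

    // Assumed key format; the real one comes from A_CmsVfsDocument.getDocumentKey(String, String).
    static String key(String resourceType, String mimeType) {
        return resourceType + "|" + mimeType;
    }

    /**
     * Builds one key per (resource type, MIME type) pair.
     * If resourceTypes contains "*", allKnownTypes (a hypothetical parameter)
     * is used in place of the configured list.
     */
    static List<String> documentKeys(
        List<String> resourceTypes,
        List<String> mimeTypes,
        List<String> allKnownTypes) {

        List<String> types = resourceTypes.contains("*") ? allKnownTypes : resourceTypes;
        List<String> keys = new ArrayList<>();
        for (String type : types) {
            for (String mime : mimeTypes) {
                keys.add(key(type, mime));
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        List<String> known = List.of("plain", "binary");
        System.out.println(documentKeys(List.of("plain"), List.of("text/html", "text/plain"), known));
        System.out.println(documentKeys(List.of("*"), List.of("text/html"), known));
    }
}
```

Note that this sketch yields no keys when the MIME type list is empty; how a real factory treats that case is not covered here.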
-
getName
String getName()
Returns the name of this document type factory.
- Returns:
- the name of this document type factory
-
isLocaleDependend
boolean isLocaleDependend()
Returns true if this document factory is locale dependent.
- Returns:
- true if this document factory is locale dependent
-
isOnlyDependentOnContent
default boolean isOnlyDependentOnContent()
Returns true if the extraction result depends only on the resource's content itself, i.e., it does not have to be re-extracted if the content date is unchanged.
- Returns:
- true if the extraction result depends only on the resource's content itself, i.e., it does not have to be re-extracted if the content date is unchanged
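One way a caller might use this flag is sketched below: a cached extraction result is reused only when the factory reports content-only dependence and the content date has not changed. The class ReextractionSketch and its method are hypothetical, and the dates are plain long timestamps chosen for illustration.

```java
/** Hypothetical helper; not part of the OpenCms API. */
class ReextractionSketch {

    /**
     * Returns true if a previously stored extraction result may be reused.
     *
     * @param onlyDependentOnContent the value reported by isOnlyDependentOnContent()
     * @param cachedContentDate content date recorded when the result was extracted
     * @param currentContentDate content date of the resource now
     */
    static boolean canReuseExtraction(
        boolean onlyDependentOnContent,
        long cachedContentDate,
        long currentContentDate) {

        // Reuse requires both conditions: content-only dependence and an unchanged date.
        return onlyDependentOnContent && (cachedContentDate == currentContentDate);
    }

    public static void main(String[] args) {
        System.out.println(canReuseExtraction(true, 1000L, 1000L));
        System.out.println(canReuseExtraction(true, 1000L, 2000L));
        System.out.println(canReuseExtraction(false, 1000L, 1000L));
    }
}
```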
-
isUsingCache
boolean isUsingCache()
Returns true if result caching is supported for this factory.
- Returns:
- true if result caching is supported for this factory
-
setCache
void setCache(CmsExtractionResultCache cache)
Sets the disk-based cache used to store the raw extraction results.
This should only be used for factories where isUsingCache() returns true.
- Parameters:
cache - the disk-based cache used to store the raw extraction results
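The relationship between isUsingCache(), setCache and getCache() can be sketched with reduced stand-in types. Everything in the block is hypothetical: ResultCache stands in for CmsExtractionResultCache, CachingFactory models only the three caching methods of this interface, and assignCache is an illustrative helper rather than an OpenCms method.

```java
/** Hypothetical sketch of the caching contract; not the OpenCms implementation. */
class CacheContractSketch {

    /** Stand-in for CmsExtractionResultCache. */
    static class ResultCache {
    }

    /** Reduced stand-in for the caching part of I_CmsDocumentFactory. */
    interface CachingFactory {
        ResultCache getCache();
        boolean isUsingCache();
        void setCache(ResultCache cache);
    }

    /** Minimal factory whose caching support is fixed at construction time. */
    static class SimpleFactory implements CachingFactory {
        private final boolean usingCache;
        private ResultCache cache;

        SimpleFactory(boolean usingCache) {
            this.usingCache = usingCache;
        }

        public ResultCache getCache() {
            return cache; // stays null when no cache was ever set
        }

        public boolean isUsingCache() {
            return usingCache;
        }

        public void setCache(ResultCache cache) {
            this.cache = cache;
        }
    }

    /** Sets the cache only if the factory supports caching; reports whether it was set. */
    static boolean assignCache(CachingFactory factory, ResultCache cache) {
        if (!factory.isUsingCache()) {
            return false;
        }
        factory.setCache(cache);
        return true;
    }

    public static void main(String[] args) {
        ResultCache cache = new ResultCache();
        System.out.println(assignCache(new SimpleFactory(true), cache));
        System.out.println(assignCache(new SimpleFactory(false), cache));
    }
}
```

Guarding the call this way keeps getCache() returning null for factories that do not support caching, which matches the documented meaning of a null result.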
-