Package org.opencms.search.documents
Class A_CmsVfsDocument
- java.lang.Object
  - org.opencms.search.documents.A_CmsVfsDocument
- All Implemented Interfaces:
I_CmsDocumentFactory, I_CmsSearchExtractor
- Direct Known Subclasses:
CmsDocumentContainerPage, CmsDocumentGeneric, CmsDocumentHtml, CmsDocumentMsOfficeOLE2, CmsDocumentMsOfficeOOXML, CmsDocumentOpenOffice, CmsDocumentPdf, CmsDocumentPlainText, CmsDocumentRtf, CmsDocumentXmlContent, CmsDocumentXmlPage, CmsSolrDocumentXmlContent
public abstract class A_CmsVfsDocument extends java.lang.Object implements I_CmsDocumentFactory
Base document factory class for a VFS CmsResource. Subclasses just need a specialized implementation of I_CmsSearchExtractor.extractContent(CmsObject, CmsResource, I_CmsSearchIndex) for text extraction from the binary document content.
- Since:
- 6.0.0
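The division of labour described above (a base class that handles the shared work, and subclasses that supply only the extraction step) can be illustrated with a stand-alone sketch. It does not use the OpenCms API: VfsDocumentSketch, PlainText, and the byte[]-based extractContent signature are hypothetical stand-ins, and the real method works on CmsObject, CmsResource and I_CmsSearchIndex.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the abstract base class; not OpenCms code.
abstract class VfsDocumentSketch {

    // Name of the document type, mirroring the m_name field.
    protected final String m_name;

    protected VfsDocumentSketch(String name) {
        m_name = name;
    }

    public String getName() {
        return m_name;
    }

    // The only step a concrete factory has to provide.
    protected abstract String extractContent(byte[] rawContent);

    // Shared step: delegate extraction to the subclass, then build the result.
    public String createDocument(byte[] rawContent) {
        return m_name + ": " + extractContent(rawContent);
    }

    // A concrete factory for plain text (assumed to be UTF-8 here).
    static final class PlainText extends VfsDocumentSketch {

        PlainText(String name) {
            super(name);
        }

        @Override
        protected String extractContent(byte[] rawContent) {
            return new String(rawContent, StandardCharsets.UTF_8).trim();
        }
    }

    public static void main(String[] args) {
        VfsDocumentSketch factory = new PlainText("plain");
        byte[] raw = "  hello  ".getBytes(StandardCharsets.UTF_8);
        System.out.println(factory.createDocument(raw));
    }
}
```

In this sketch, adding support for another format means writing one more subclass with its own extractContent; the base class stays untouched.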
-
-
Field Summary
Fields
static java.lang.String DEFAULT_ALL_TYPES
Generic type name used as default for all types.
static java.lang.String DEFAULT_ALL_UNCONFIGURED_TYPES
Generic type name used as default for all types that are globally unconfigured.
protected java.lang.String m_name
Name of the document type.
-
Constructor Summary
Constructors
A_CmsVfsDocument(java.lang.String name)
Creates a new instance of this Lucene document factory.
-
Method Summary
Methods
I_CmsSearchDocument createDocument(CmsObject cms, CmsResource resource, I_CmsSearchIndex index)
Generates a new Lucene document instance from the contents of the given resource for the provided index.
CmsExtractionResultCache getCache()
Returns the disk based cache used to store the raw extraction results.
static java.lang.String getDocumentKey(java.lang.String type, java.lang.String mimeType)
Creates a document factory lookup key for the given resource type name / MIME type configuration.
java.util.List<java.lang.String> getDocumentKeys(java.util.List<java.lang.String> resourceTypes, java.util.List<java.lang.String> mimeTypes)
Returns the list of accepted keys for the resource types that can be indexed using this document factory.
java.lang.String getName()
Returns the name of this document type factory.
protected void logContentExtraction(CmsResource resource, I_CmsSearchIndex index)
Logs content extraction for the specified resource and index.
protected CmsFile readFile(CmsObject cms, CmsResource resource)
Upgrades the given resource to a CmsFile with content.
void setCache(CmsExtractionResultCache cache)
Sets the disk based cache used to store the raw extraction results.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.opencms.search.documents.I_CmsDocumentFactory
isLocaleDependend, isUsingCache
-
Methods inherited from interface org.opencms.search.documents.I_CmsSearchExtractor
extractContent
-
-
-
-
Field Detail
-
DEFAULT_ALL_UNCONFIGURED_TYPES
public static final java.lang.String DEFAULT_ALL_UNCONFIGURED_TYPES
Generic type name used as default for all types that are globally unconfigured. Note that any special XML content is already configured if xmlcontent is configured.
- See Also:
- Constant Field Values
-
DEFAULT_ALL_TYPES
public static final java.lang.String DEFAULT_ALL_TYPES
Generic type name used as default for all types.
- See Also:
- Constant Field Values
-
m_name
protected java.lang.String m_name
Name of the document type.
-
-
Constructor Detail
-
A_CmsVfsDocument
public A_CmsVfsDocument(java.lang.String name)
Creates a new instance of this Lucene document factory.
- Parameters:
name - name of the document type
-
-
Method Detail
-
getDocumentKey
public static java.lang.String getDocumentKey(java.lang.String type, java.lang.String mimeType)
Creates a document factory lookup key for the given resource type name / MIME type configuration.
If the given mimeType is null, this indicates that the key should match all VFS resources of the given resource type regardless of the MIME type.
- Parameters:
type - the resource type name to use
mimeType - the MIME type to use
- Returns:
- a document factory lookup key for the given resource type name / MIME type configuration
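The null handling described for mimeType can be pictured with a small stand-alone sketch. The key layout used here (a "VFS" prefix, an underscore, then the type name, with ":" and the MIME type appended only when one is given) is an assumption made for illustration, as are the class and method names; the page does not document the actual key format.

```java
// Illustrative sketch only; the key format is an assumption, not taken from this page.
final class DocumentKeySketch {

    // Assumed prefix for keys that refer to VFS documents.
    static final String PREFIX = "VFS";

    static String documentKey(String type, String mimeType) {
        StringBuilder key = new StringBuilder(PREFIX).append('_').append(type);
        if (mimeType != null) {
            // A MIME type narrows the key to one content type of the resource type.
            key.append(':').append(mimeType);
        }
        // Without a MIME type the key stands for every resource of that type.
        return key.toString();
    }

    public static void main(String[] args) {
        System.out.println(documentKey("binary", "application/pdf"));
        System.out.println(documentKey("plain", null));
    }
}
```

With this layout, a lookup for a resource would typically try the type-plus-MIME key first and fall back to the type-only key, which is what makes the null case a catch-all for that type.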
-
createDocument
public I_CmsSearchDocument createDocument(CmsObject cms, CmsResource resource, I_CmsSearchIndex index) throws CmsException
Generates a new Lucene document instance from the contents of the given resource for the provided index.
- Specified by:
createDocument in interface I_CmsDocumentFactory
- Parameters:
cms - the OpenCms user context used to access the OpenCms VFS
resource - the search index resource to create the Lucene document from
index - the search index to create the Document for
- Returns:
- the Search Document for the given index resource and the given search index
- Throws:
CmsException - if something goes wrong
- See Also:
I_CmsDocumentFactory.createDocument(CmsObject, CmsResource, I_CmsSearchIndex)
-
getCache
public CmsExtractionResultCache getCache()
Description copied from interface: I_CmsDocumentFactory
Returns the disk based cache used to store the raw extraction results.
If null is returned, result caching is not supported for this factory.
- Specified by:
getCache in interface I_CmsDocumentFactory
- Returns:
- the disk based cache used to store the raw extraction results
- See Also:
I_CmsDocumentFactory.getCache()
-
getDocumentKeys
public java.util.List<java.lang.String> getDocumentKeys(java.util.List<java.lang.String> resourceTypes, java.util.List<java.lang.String> mimeTypes) throws CmsException
Description copied from interface: I_CmsDocumentFactory
Returns the list of accepted keys for the resource types that can be indexed using this document factory.
The result List contains String objects. Each String is later matched against getDocumentKey(String, String) to find the corresponding I_CmsDocumentFactory for a resource to index.
The list of accepted resource types may contain a catch-all entry "*"; in this case, a list for all possible resource types is returned, calculated by a logic depending on the document handler class.
- Specified by:
getDocumentKeys in interface I_CmsDocumentFactory
- Parameters:
resourceTypes - list of accepted resource types
mimeTypes - list of accepted MIME types
- Returns:
- the list of accepted keys for the resource types that can be indexed using this document factory (String objects)
- Throws:
CmsException - if something goes wrong
- See Also:
I_CmsDocumentFactory.getDocumentKeys(java.util.List, java.util.List)
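One way to picture the key list and the "*" catch-all is the stand-alone sketch below. It is not the OpenCms implementation: the allKnownTypes parameter is a hypothetical stand-in for whatever registry supplies the configured resource types, the key format is assumed, and emitting a type-only key when no MIME types are given is likewise an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative sketch only; names, key format and fallback rule are assumptions.
final class DocumentKeysSketch {

    // Assumed key layout: "VFS_<type>" or "VFS_<type>:<mimeType>".
    static String documentKey(String type, String mimeType) {
        return mimeType == null ? "VFS_" + type : "VFS_" + type + ":" + mimeType;
    }

    static List<String> documentKeys(
        List<String> resourceTypes,
        List<String> mimeTypes,
        List<String> allKnownTypes) {

        // The catch-all entry expands to every known resource type.
        List<String> types = resourceTypes.contains("*") ? allKnownTypes : resourceTypes;
        List<String> keys = new ArrayList<>();
        for (String type : types) {
            if (mimeTypes.isEmpty()) {
                // No MIME restriction: one key per type that matches any MIME type.
                keys.add(documentKey(type, null));
            } else {
                for (String mimeType : mimeTypes) {
                    keys.add(documentKey(type, mimeType));
                }
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        List<String> known = Arrays.asList("plain", "binary");
        System.out.println(documentKeys(Arrays.asList("*"), Collections.<String>emptyList(), known));
        System.out.println(documentKeys(Arrays.asList("binary"), Arrays.asList("application/pdf"), known));
    }
}
```

The result is the cross product of types and MIME types, so a factory configured with two types and three MIME types would register six keys under these assumptions.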
-
getName
public java.lang.String getName()
Description copied from interface: I_CmsDocumentFactory
Returns the name of this document type factory.
- Specified by:
getName in interface I_CmsDocumentFactory
- Returns:
- the name of this document type factory
- See Also:
I_CmsDocumentFactory.getName()
-
setCache
public void setCache(CmsExtractionResultCache cache)
Description copied from interface: I_CmsDocumentFactory
Sets the disk based cache used to store the raw extraction results.
This should only be used for factories where I_CmsDocumentFactory.isUsingCache() returns true.
- Specified by:
setCache in interface I_CmsDocumentFactory
- Parameters:
cache - the disk based cache used to store the raw extraction results
- See Also:
I_CmsDocumentFactory.setCache(org.opencms.search.documents.CmsExtractionResultCache)
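The contract between isUsingCache(), setCache() and a null return from getCache() can be sketched in isolation as follows. This is a hypothetical model, not OpenCms code: a plain Map stands in for CmsExtractionResultCache, and the extract method and extractions counter exist only to make the effect visible.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; a Map stands in for the disk based cache.
final class CachedFactorySketch {

    private final boolean m_usingCache;
    private Map<String, String> m_cache; // null means caching is not supported
    int extractions = 0;                 // counts how often real extraction ran

    CachedFactorySketch(boolean usingCache) {
        m_usingCache = usingCache;
    }

    boolean isUsingCache() {
        return m_usingCache;
    }

    Map<String, String> getCache() {
        return m_cache;
    }

    void setCache(Map<String, String> cache) {
        m_cache = cache;
    }

    String extract(String path, String rawContent) {
        Map<String, String> cache = getCache();
        if (cache != null && cache.containsKey(path)) {
            return cache.get(path); // cache hit, no extraction needed
        }
        extractions++;
        String text = rawContent.trim();
        if (cache != null) {
            cache.put(path, text);
        }
        return text;
    }

    public static void main(String[] args) {
        CachedFactorySketch factory = new CachedFactorySketch(true);
        // Only hand a cache to factories that declare they use one.
        if (factory.isUsingCache()) {
            factory.setCache(new HashMap<>());
        }
        factory.extract("/a.txt", " hi ");
        factory.extract("/a.txt", " hi ");
        System.out.println(factory.extractions);
    }
}
```

Callers in this model never assume a cache exists; they check for null on every access, which is the behaviour the getCache() description implies.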
-
logContentExtraction
protected void logContentExtraction(CmsResource resource, I_CmsSearchIndex index)
Logs content extraction for the specified resource and index.
- Parameters:
resource - the resource to log content extraction for
index - the search index to log content extraction for
-
readFile
protected CmsFile readFile(CmsObject cms, CmsResource resource) throws CmsException, CmsIndexNoContentException
Upgrades the given resource to a CmsFile with content.
- Parameters:
cms - the current user's OpenCms context
resource - the resource to upgrade
- Returns:
- the given resource upgraded to a CmsFile with content
- Throws:
CmsException - if the resource could not be read
CmsIndexNoContentException - if the resource has no content
-
-