Package org.opencms.search.documents
Class CmsDocumentMsOfficeOLE2
java.lang.Object
org.opencms.search.documents.A_CmsVfsDocument
org.opencms.search.documents.CmsDocumentMsOfficeOLE2
- All Implemented Interfaces:
I_CmsDocumentFactory, I_CmsSearchExtractor
Lucene document factory class to extract text data from a VFS resource that is an OLE 2 MS Office document.
Supported formats are MS Word (.doc), MS PowerPoint (.ppt) and MS Excel (.xls).
The OLE 2 format was introduced in Microsoft Office 97 and remained the default format until Office 2007 replaced it with the new XML-based OOXML format.
- Since:
- 8.0.1
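The difference between the OLE 2 and OOXML containers mentioned above can be seen in the first bytes of a file. The following is a minimal sketch, not part of OpenCms: the class name `OfficeContainerSniffer` and its methods are hypothetical and exist only for illustration. It relies on two well-known signatures: OLE 2 compound files start with the bytes `D0 CF 11 E0 A1 B1 1A E1`, while OOXML files are ZIP archives and start with `PK\003\004`.

```java
import java.util.Arrays;

/** Hypothetical helper (not an OpenCms class): classifies file content by its leading bytes. */
final class OfficeContainerSniffer {

    /** 8-byte signature found at the start of an OLE 2 compound file (.doc, .ppt, .xls). */
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    /** ZIP local file header signature; OOXML files (.docx, .pptx, .xlsx) are ZIP archives. */
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    private OfficeContainerSniffer() {
        // static helpers only
    }

    /** Returns true if the content starts with the OLE 2 signature. */
    static boolean isOle2(byte[] content) {
        return startsWith(content, OLE2_MAGIC);
    }

    /** Returns true if the content starts with a ZIP signature (a necessary condition for OOXML). */
    static boolean isZipBased(byte[] content) {
        return startsWith(content, ZIP_MAGIC);
    }

    /** Compares the first prefix.length bytes of content against prefix. */
    private static boolean startsWith(byte[] content, byte[] prefix) {
        if (content == null || content.length < prefix.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(content, prefix.length), prefix);
    }
}
```

A ZIP signature alone does not prove that a file is OOXML, since any ZIP archive matches; the check only separates the two container families. Whether this document factory performs a comparable check internally is not stated here.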
Field Summary
Fields inherited from class org.opencms.search.documents.A_CmsVfsDocument
DEFAULT_ALL_TYPES, DEFAULT_ALL_UNCONFIGURED_TYPES, m_name
Constructor Summary
Constructors
Constructor | Description
CmsDocumentMsOfficeOLE2(String name) | Creates a new instance of this Lucene document factory.
Method Summary
Modifier and Type | Method | Description
I_CmsExtractionResult | extractContent(CmsObject cms, CmsResource resource, I_CmsSearchIndex index) | Returns the raw text content of a given VFS resource containing OLE 2 MS Office data.
boolean | isLocaleDependend() | Returns true if this document factory is locale dependent.
boolean | isUsingCache() | Returns true if result caching is supported for this factory.
Methods inherited from class org.opencms.search.documents.A_CmsVfsDocument
createDocument, getCache, getDocumentKey, getDocumentKeys, getName, logContentExtraction, readFile, setCache
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.opencms.search.documents.I_CmsDocumentFactory
isOnlyDependentOnContent
Constructor Details
CmsDocumentMsOfficeOLE2
Creates a new instance of this Lucene document factory.
- Parameters:
name - name of the document type
Method Details
extractContent
public I_CmsExtractionResult extractContent(CmsObject cms, CmsResource resource, I_CmsSearchIndex index) throws CmsIndexException, CmsException
Returns the raw text content of a given VFS resource containing OLE 2 MS Office data.
- Parameters:
cms - the cms object
resource - the resource to extract the content from
index - the index to extract the content for
- Returns:
- the extracted content of the resource
- Throws:
CmsException - if something goes wrong
CmsIndexException
isLocaleDependend
Description copied from interface: I_CmsDocumentFactory
Returns true if this document factory is locale dependent.
- Returns:
true if this document factory is locale dependent
isUsingCache
Description copied from interface: I_CmsDocumentFactory
Returns true if result caching is supported for this factory.
- Returns:
true if result caching is supported for this factory