Class CmsSearchManager
- java.lang.Object
-
- org.opencms.search.CmsSearchManager
-
- All Implemented Interfaces:
I_CmsEventListener, I_CmsScheduledJob
public class CmsSearchManager extends java.lang.Object implements I_CmsScheduledJob, I_CmsEventListener
Implements the general management and configuration of the search and indexing facilities in OpenCms.
- Since:
- 6.0.0
-
-
Nested Class Summary
| Modifier and Type | Class | Description |
| --- | --- | --- |
| static class | CmsSearchManager.CmsSearchForceUnlockMode | Enumeration class for force unlock types. |
| protected class | CmsSearchManager.CmsSearchOfflineHandler | Handles offline index generation. |
| protected class | CmsSearchManager.CmsSearchOfflineIndexThread | The offline indexer thread runs periodically and indexes all resources added by the event handler. |
| protected class | CmsSearchManager.CmsSearchOfflineIndexWorkThread | An offline index worker thread runs once for every offline index update action. |
-
Field Summary
| Modifier and Type | Field | Description |
| --- | --- | --- |
| static int | DEFAULT_EXCERPT_LENGTH | The default value used for generating search result excerpts (1024 chars). |
| static float | DEFAULT_EXTRACTION_CACHE_MAX_AGE | The default value used for keeping the extraction results in the cache (672 hours = 4 weeks). |
| static int | DEFAULT_MAX_INDEX_WAITTIME | The default maximal wait time for re-indexing after editing a content. |
| static int | DEFAULT_MAX_MODIFICATIONS_BEFORE_COMMIT | Default for the maximum number of modifications before a commit in the search index is triggered (500). |
| static int | DEFAULT_OFFLINE_UPDATE_FREQNENCY | The default update frequency for offline indexes (15000 msec = 15 sec). |
| static int | DEFAULT_TIMEOUT | The default timeout value used for generating a document for the search index (60000 msec = 1 min). |
| static java.lang.String | JOB_PARAM_INDEXLIST | Scheduler parameter: Update only a specified list of indexes. |
| static java.lang.String | JOB_PARAM_WRITELOG | Scheduler parameter: Write the output of the update to the logfile. |
| protected static org.apache.commons.logging.Log | LOG | The log object for this class. |
| static java.lang.String | LUCENE_ANALYZER | Prefix for Lucene default analyzers package (org.apache.lucene.analysis.). |
| protected CmsObject | m_adminCms | The administrator OpenCms user context to access OpenCms VFS resources. |
| protected java.util.List<I_CmsSearchIndex> | m_offlineIndexes | The list of indexes that are configured for offline index mode. |
| protected CmsSearchManager.CmsSearchOfflineIndexThread | m_offlineIndexThread | The thread used for offline indexing. |
-
Fields inherited from interface org.opencms.main.I_CmsEventListener
EVENT_BEFORE_PUBLISH_PROJECT, EVENT_CLEAR_CACHES, EVENT_CLEAR_OFFLINE_CACHES, EVENT_CLEAR_ONLINE_CACHES, EVENT_CLEAR_PRINCIPAL_CACHES, EVENT_FLEX_CACHE_CLEAR, EVENT_FLEX_PURGE_JSP_REPOSITORY, EVENT_FULLSTATIC_EXPORT, EVENT_GROUP_MODIFIED, EVENT_LOGIN_USER, EVENT_OU_MODIFIED, EVENT_PROJECT_MODIFIED, EVENT_PROPERTY_DEFINITION_CREATED, EVENT_PROPERTY_DEFINITION_MODIFIED, EVENT_PROPERTY_MODIFIED, EVENT_PUBLISH_PROJECT, EVENT_REBUILD_SEARCHINDEXES, EVENT_REINDEX_OFFLINE, EVENT_REINDEX_ONLINE, EVENT_RESOURCE_AND_PROPERTIES_MODIFIED, EVENT_RESOURCE_COPIED, EVENT_RESOURCE_CREATED, EVENT_RESOURCE_DELETED, EVENT_RESOURCE_MODIFIED, EVENT_RESOURCE_MOVED, EVENT_RESOURCES_AND_PROPERTIES_MODIFIED, EVENT_RESOURCES_MODIFIED, EVENT_SITEMAP_CHANGED, EVENT_UPDATE_EXPORTS, EVENT_USER_MODIFIED, KEY_CHANGE, KEY_DBCONTEXT, KEY_GROUP_ID, KEY_GROUP_NAME, KEY_INDEX_NAMES, KEY_IS_ONLINE, KEY_OU_ID, KEY_OU_NAME, KEY_PROJECTID, KEY_PUBLISHID, KEY_PUBLISHLIST, KEY_REINDEX_RELATED, KEY_REPORT, KEY_RESOURCE, KEY_RESOURCES, KEY_SKIPINDEX, KEY_USER_ACTION, KEY_USER_CHANGES, KEY_USER_ID, KEY_USER_NAME, LISTENERS_FOR_ALL_EVENTS, VALUE_CREATE_SIBLING, VALUE_GROUP_MODIFIED_ACTION_CREATE, VALUE_GROUP_MODIFIED_ACTION_DELETE, VALUE_GROUP_MODIFIED_ACTION_WRITE, VALUE_OU_MODIFIED_ACTION_CREATE, VALUE_OU_MODIFIED_ACTION_DELETE, VALUE_USER_MODIFIED_ACTION_ADD_USER_TO_GROUP, VALUE_USER_MODIFIED_ACTION_CREATE_USER, VALUE_USER_MODIFIED_ACTION_DELETE_USER, VALUE_USER_MODIFIED_ACTION_REMOVE_USER_FROM_GROUP, VALUE_USER_MODIFIED_ACTION_RESET_PASSWORD, VALUE_USER_MODIFIED_ACTION_SET_OU, VALUE_USER_MODIFIED_ACTION_WRITE_USER
-
-
Constructor Summary
| Constructor | Description |
| --- | --- |
| CmsSearchManager() | Default constructor when called as cron job. |
-
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| protected java.util.List<CmsPublishedResource> | addAdditionallyAffectedResources(CmsObject adminCms, java.util.List<CmsPublishedResource> updateResources) | Collects the resources whose indexed document depends on one of the updated resources. |
| void | addAnalyzer(CmsSearchAnalyzer analyzer) | Adds an analyzer. |
| void | addDocumentTypeConfig(CmsSearchDocumentType documentType) | Adds a document type. |
| void | addFieldConfiguration(I_CmsSearchFieldConfiguration fieldConfiguration) | Adds a search field configuration to the search manager. |
| protected java.util.Collection<CmsPublishedResource> | addIndexContentRelatedResources(CmsObject adminCms, java.util.Collection<CmsPublishedResource> updateResources, java.util.Collection<CmsPublishedResource> updateResourcesToCheck) | Collects the resources whose indexed document depends on one of the updated resources. |
| void | addSearchIndex(I_CmsSearchIndex searchIndex) | Adds a search index to the configuration. |
| void | addSearchIndexSource(CmsSearchIndexSource searchIndexSource) | Adds a search index source configuration. |
| protected void | cleanExtractionCache() | Cleans up the extraction result cache. |
| void | cmsEvent(CmsEvent event) | Implements the event listener of this class. |
| protected java.util.Collection<CmsPublishedResource> | findRelatedContainerPages(CmsObject adminCms, java.util.Collection<CmsPublishedResource> updateResources, java.util.Collection<CmsPublishedResource> updateResourcesToCheck) | Collects the container pages related to the resources that have been published. |
| java.util.List<CmsSolrIndex> | getAllSolrIndexes() | Returns all Solr indexes. |
| static org.apache.lucene.analysis.Analyzer | getAnalyzer(java.lang.String className) | Returns an analyzer for the given class name. |
| org.apache.lucene.analysis.Analyzer | getAnalyzer(java.util.Locale locale) | Returns an analyzer for the given language. |
| java.util.Map<java.util.Locale,CmsSearchAnalyzer> | getAnalyzers() | Returns an unmodifiable view of the map that contains the CmsSearchAnalyzer list. |
| CmsSearchAnalyzer | getCmsSearchAnalyzer(java.util.Locale locale) | Returns the search analyzer for the given locale. |
| java.lang.String | getDirectory() | Returns the name of the directory below WEB-INF/ where the search indexes are stored. |
| java.lang.String | getDirectorySolr() | Returns the configured Solr home directory, or null if not set. |
| I_CmsDocumentFactory | getDocumentFactoryForName(java.lang.String docTypeName) | Returns the document factory configured under the provided name. |
| CmsSearchDocumentType | getDocumentTypeConfig(java.lang.String name) | Returns a document type config. |
| java.util.List<CmsSearchDocumentType> | getDocumentTypeConfigs() | Returns an unmodifiable view (read-only) of the DocumentTypeConfigs list. |
| java.util.List<java.lang.String> | getDocumentTypeKeys(java.lang.String resourceType, java.lang.String mimeType) | Returns the document type keys used to specify the correct document factory. |
| java.util.List<java.lang.String> | getDocumentTypeKeys(CmsResource resource) | Returns the document type keys used to specify the correct document factory. |
| java.util.Map<java.lang.String,I_CmsDocumentFactory> | getDocumentTypeMapForTypeNames(java.util.List<java.lang.String> documentTypeNames) | Returns the map from document type keys to document factories with all entries for the provided document type names. |
| protected java.util.List<java.lang.String> | getDocumentTypes() | Returns the set of names of all configured document types. |
| float | getExtractionCacheMaxAge() | Returns the maximum age a text extraction result is kept in the cache (in hours). |
| I_CmsSearchFieldConfiguration | getFieldConfiguration(java.lang.String name) | Returns the search field configuration with the given name. |
| java.util.List<I_CmsSearchFieldConfiguration> | getFieldConfigurations() | Returns the unmodifiable List of configured I_CmsSearchFieldConfiguration entries. |
| java.util.List<CmsLuceneFieldConfiguration> | getFieldConfigurationsLucene() | Returns the Lucene search field configurations only. |
| java.util.List<CmsSolrFieldConfiguration> | getFieldConfigurationsSolr() | Returns the Solr search field configurations only. |
| CmsSearchManager.CmsSearchForceUnlockMode | getForceunlock() | Returns the force unlock mode during indexing. |
| I_CmsTermHighlighter | getHighlighter() | Returns the highlighter. |
| I_CmsSearchIndex | getIndex(java.lang.String indexName) | Returns the Lucene search index configured with the given name. |
| int | getIndexLockMaxWaitSeconds() | Returns the seconds to wait for an index lock during an update operation. |
| java.util.List<java.lang.String> | getIndexNames() | Returns the names of all configured indexes. |
| CmsSolrIndex | getIndexSolr(java.lang.String indexName) | Returns the Solr index configured with the given name. |
| static CmsSolrIndex | getIndexSolr(CmsObject cms, java.util.Map<java.lang.String,java.lang.String[]> params) | Returns the Solr index configured with the name given in the parameters. |
| CmsSearchIndexSource | getIndexSource(java.lang.String sourceName) | Returns a search index source for a specified source name. |
| int | getMaxExcerptLength() | Returns the max. excerpt length. |
| long | getMaxIndexWaitTime() | Returns the maximal time to wait for re-indexing after a content is edited (in milliseconds). |
| int | getMaxModificationsBeforeCommit() | Returns the maximum number of modifications before a commit in the search index is triggered. |
| protected CmsProject | getOfflineIndexProject() | Returns the offline project used for offline indexing. |
| long | getOfflineUpdateFrequency() | Returns the update frequency of the offline indexer in milliseconds. |
| java.util.List<I_CmsSearchIndex> | getSearchIndexes() | Returns an unmodifiable list of all configured I_CmsSearchIndex instances. |
| java.util.List<I_CmsSearchIndex> | getSearchIndexesAll() | Returns an unmodifiable list of all configured I_CmsSearchIndex instances. |
| java.util.List<CmsSolrIndex> | getSearchIndexesSolr() | Returns an unmodifiable list of all configured I_CmsSearchIndex instances. |
| java.util.Map<java.lang.String,CmsSearchIndexSource> | getSearchIndexSources() | Returns an unmodifiable view (read-only) of the SearchIndexSources Map. |
| CmsSolrSpellchecker | getSolrDictionary() | Returns the singleton instance of the OpenCms spellchecker. |
| CmsSolrConfiguration | getSolrServerConfiguration() | Returns the Solr configuration. |
| protected CmsIndexingThreadManager | getThreadManager() | Returns a new thread manager for the indexing threads. |
| long | getTimeout() | Returns the timeout to abandon threads indexing a resource. |
| protected void | initAvailableDocumentTypes() | Initializes the available Cms resource types to be indexed. |
| void | initialize(CmsObject cms) | Initializes the search manager. |
| void | initializeFieldConfigurations() | Calls I_CmsSearchFieldConfiguration.init() for all registered field configurations. |
| void | initializeIndexes() | Initializes all configured document types, index sources and search indexes. |
| protected void | initIndexSources() | Initializes the index sources. |
| void | initOfflineIndexes() | Initializes the offline index handler; required after an offline index has been added. |
| protected void | initSearchIndexes() | Initializes the configured search indexes. |
| void | initSpellcheckIndex(CmsObject adminCms) | Initializes the spell check index. |
| static boolean | isLuceneIndex(java.lang.String indexName) | Returns true if the index for the given name is a Lucene index, false otherwise. |
| boolean | isOfflineIndexingPaused() | Returns whether the offline indexing is paused. |
| java.lang.String | launch(CmsObject cms, java.util.Map<java.lang.String,java.lang.String> parameters) | Updates the indexes as a scheduled job. |
| void | pauseOfflineIndexing() | Pauses the offline indexing. |
| void | rebuildAllIndexes(I_CmsReport report) | Rebuilds (or, if required, creates) all configured indexes. |
| void | rebuildIndex(java.lang.String indexName, I_CmsReport report) | Rebuilds (or, if required, creates) the index with the given name. |
| void | rebuildIndexes(java.util.List<java.lang.String> indexNames, I_CmsReport report) | Rebuilds (or, if required, creates) the indexes with the given names. |
| void | registerSolrIndex(CmsSolrIndex index) | Registers a new Solr core for the given index. |
| boolean | removeSearchFieldConfiguration(I_CmsSearchFieldConfiguration fieldConfiguration) | Removes this field configuration from the OpenCms configuration (if it is not used any more). |
| boolean | removeSearchFieldConfigurationField(I_CmsSearchFieldConfiguration fieldConfiguration, CmsSearchField field) | Removes a search field from the field configuration. |
| boolean | removeSearchFieldMapping(CmsLuceneField field, CmsSearchFieldMapping mapping) | Removes a search field mapping from the given field. |
| void | removeSearchIndex(I_CmsSearchIndex searchIndex) | Removes a search index from the configuration. |
| void | removeSearchIndexes(java.util.List<java.lang.String> indexNames) | Removes all indexes included in the given list (which must contain the names of the indexes to remove). |
| boolean | removeSearchIndexSource(CmsSearchIndexSource indexsource) | Removes this index source from the OpenCms configuration (if it is not used any more). |
| void | resumeOfflineIndexing() | Resumes offline indexing if it was paused. |
| void | setDirectory(java.lang.String value) | Sets the name of the directory below WEB-INF/ where the search indexes are stored. |
| void | setExtractionCacheMaxAge(float extractionCacheMaxAge) | Sets the maximum age a text extraction result is kept in the cache (in hours). |
| void | setExtractionCacheMaxAge(java.lang.String extractionCacheMaxAge) | Sets the maximum age a text extraction result is kept in the cache (in hours) as a String. |
| void | setForceunlock(java.lang.String value) | Sets the unlock mode during indexing. |
| void | setHighlighter(java.lang.String highlighter) | Sets the highlighter. |
| void | setIndexLockMaxWaitSeconds(int value) | Sets the seconds to wait for an index lock during an update operation. |
| void | setMaxExcerptLength(int maxExcerptLength) | Sets the max. excerpt length. |
| void | setMaxExcerptLength(java.lang.String maxExcerptLength) | Sets the max. excerpt length. |
| void | setMaxIndexWaitTime(long maxIndexWaitTime) | Sets the maximal wait time for offline index updates after edit operations. |
| void | setMaxIndexWaitTime(java.lang.String maxIndexWaitTime) | Sets the maximal wait time for offline index updates after edit operations. |
| void | setMaxModificationsBeforeCommit(int maxModificationsBeforeCommit) | Sets the maximum number of modifications before a commit in the search index is triggered. |
| void | setMaxModificationsBeforeCommit(java.lang.String value) | Sets the maximum number of modifications before a commit in the search index is triggered as a string. |
| void | setOfflineUpdateFrequency(long offlineUpdateFrequency) | Sets the update frequency of the offline indexer in milliseconds. |
| void | setOfflineUpdateFrequency(java.lang.String offlineUpdateFrequency) | Sets the update frequency of the offline indexer in milliseconds. |
| void | setSolrServerConfiguration(CmsSolrConfiguration config) | Sets the Solr configuration. |
| void | setTimeout(long value) | Sets the timeout to abandon threads indexing a resource. |
| void | setTimeout(java.lang.String value) | Sets the timeout to abandon threads indexing a resource as a String. |
| protected boolean | shouldUpdateAtAll(I_CmsSearchIndex index) | Checks whether the index should be rebuilt/updated at all by the search manager. |
| void | shutDown() | Shuts down the search manager. |
| protected void | updateAllIndexes(CmsObject adminCms, java.util.List<CmsPublishedResource> updateResources, I_CmsReport report) | Incrementally updates all indexes that have their rebuild mode set to "auto". |
| protected void | updateAllIndexes(CmsObject adminCms, CmsUUID publishHistoryId, I_CmsReport report) | Incrementally updates all indexes that have their rebuild mode set to "auto" after resources have been published. |
| protected void | updateIndex(I_CmsSearchIndex index, I_CmsReport report, java.util.List<CmsPublishedResource> resourcesToIndex) | Updates (or, if required, creates) the index with the given name. |
| protected void | updateIndexCompletely(CmsObject cms, I_CmsSearchIndex index, I_CmsReport report) | The method updates all OpenCms documents that are indexed. |
| protected void | updateIndexIncremental(CmsObject cms, I_CmsSearchIndex index, I_CmsReport report, java.util.List<CmsPublishedResource> resourcesToIndex) | Incrementally updates the given index. |
| protected void | updateIndexOffline(I_CmsReport report, java.util.List<CmsPublishedResource> resourcesToIndex) | Updates the offline search indexes for the given list of resources. |
| void | updateOfflineIndexes() | Updates all offline indexes. |
| void | updateOfflineIndexes(long waitTime) | Updates all offline indexes. |
-
-
-
Field Detail
-
DEFAULT_EXCERPT_LENGTH
public static final int DEFAULT_EXCERPT_LENGTH
The default value used for generating search result excerpts (1024 chars).
- See Also:
- Constant Field Values
-
DEFAULT_EXTRACTION_CACHE_MAX_AGE
public static final float DEFAULT_EXTRACTION_CACHE_MAX_AGE
The default value used for keeping the extraction results in the cache (672 hours = 4 weeks).
- See Also:
- Constant Field Values
-
DEFAULT_MAX_MODIFICATIONS_BEFORE_COMMIT
public static final int DEFAULT_MAX_MODIFICATIONS_BEFORE_COMMIT
Default for the maximum number of modifications before a commit in the search index is triggered (500).
- See Also:
- Constant Field Values
-
DEFAULT_OFFLINE_UPDATE_FREQNENCY
public static final int DEFAULT_OFFLINE_UPDATE_FREQNENCY
The default update frequency for offline indexes (15000 msec = 15 sec).
- See Also:
- Constant Field Values
-
DEFAULT_MAX_INDEX_WAITTIME
public static final int DEFAULT_MAX_INDEX_WAITTIME
The default maximal wait time for re-indexing after editing a content.
- See Also:
- Constant Field Values
-
DEFAULT_TIMEOUT
public static final int DEFAULT_TIMEOUT
The default timeout value used for generating a document for the search index (60000 msec = 1 min).
- See Also:
- Constant Field Values
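The unit conversions quoted in the defaults above can be checked with a small stand-alone sketch. The constants below are local copies of the values given in the descriptions (672 hours, 15000 msec, 60000 msec), not the real CmsSearchManager fields, and the class name SearchDefaultsSketch is made up for illustration.

```java
import java.time.Duration;

public class SearchDefaultsSketch {

    // Local copies of the documented values; assumptions, not the real fields.
    static final float EXTRACTION_CACHE_MAX_AGE_HOURS = 672.0f;
    static final int OFFLINE_UPDATE_FREQUENCY_MSEC = 15000;
    static final int TIMEOUT_MSEC = 60000;

    /** Converts the cache max age from (possibly fractional) hours to a Duration. */
    static Duration cacheMaxAge() {
        return Duration.ofMinutes(Math.round(EXTRACTION_CACHE_MAX_AGE_HOURS * 60.0f));
    }

    /** Wraps the offline update frequency (milliseconds) in a Duration. */
    static Duration offlineUpdateFrequency() {
        return Duration.ofMillis(OFFLINE_UPDATE_FREQUENCY_MSEC);
    }

    /** Wraps the document generation timeout (milliseconds) in a Duration. */
    static Duration timeout() {
        return Duration.ofMillis(TIMEOUT_MSEC);
    }

    public static void main(String[] args) {
        System.out.println("cache max age (days): " + cacheMaxAge().toDays());
        System.out.println("offline update frequency (sec): " + offlineUpdateFrequency().getSeconds());
        System.out.println("timeout (min): " + timeout().toMinutes());
    }
}
```

Working with Duration rather than raw numbers makes the differing units (hours for the cache age, milliseconds for the other two) explicit at the call site.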
-
JOB_PARAM_INDEXLIST
public static final java.lang.String JOB_PARAM_INDEXLIST
Scheduler parameter: Update only a specified list of indexes.
- See Also:
- Constant Field Values
-
JOB_PARAM_WRITELOG
public static final java.lang.String JOB_PARAM_WRITELOG
Scheduler parameter: Write the output of the update to the logfile.
- See Also:
- Constant Field Values
-
LUCENE_ANALYZER
public static final java.lang.String LUCENE_ANALYZER
Prefix for Lucene default analyzers package (org.apache.lucene.analysis.).
- See Also:
- Constant Field Values
-
LOG
protected static final org.apache.commons.logging.Log LOG
The log object for this class.
-
m_adminCms
protected CmsObject m_adminCms
The administrator OpenCms user context to access OpenCms VFS resources.
-
m_offlineIndexes
protected java.util.List<I_CmsSearchIndex> m_offlineIndexes
The list of indexes that are configured for offline index mode.
-
m_offlineIndexThread
protected CmsSearchManager.CmsSearchOfflineIndexThread m_offlineIndexThread
The thread used for offline indexing.
-
-
Constructor Detail
-
CmsSearchManager
public CmsSearchManager()
Default constructor when called as cron job.
-
-
Method Detail
-
getAnalyzer
public static org.apache.lucene.analysis.Analyzer getAnalyzer(java.lang.String className) throws java.lang.Exception
Returns an analyzer for the given class name.
- Parameters:
className
- the class name of the analyzer
- Returns:
- the appropriate Lucene analyzer
- Throws:
java.lang.Exception
- if something goes wrong
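The LUCENE_ANALYZER prefix suggests that analyzer class names may also be given in a short form. A minimal sketch of one way such a lookup could work, assuming a fallback that retries the name with a package prefix. This is an illustration, not the actual OpenCms implementation; the class and method names are hypothetical, and the usage in main uses java.util. as the prefix so that it runs without Lucene on the classpath.

```java
public class AnalyzerNameSketch {

    /**
     * Tries the class name as given, then retries it with the package prefix.
     * Hypothetical helper, not part of OpenCms.
     */
    static Class<?> resolve(String className, String packagePrefix) {
        try {
            return Class.forName(className);
        } catch (ClassNotFoundException first) {
            try {
                // Second attempt: treat the name as a short form below the prefix.
                return Class.forName(packagePrefix + className);
            } catch (ClassNotFoundException second) {
                throw new IllegalArgumentException("No class found for: " + className, second);
            }
        }
    }

    public static void main(String[] args) {
        // Short form, resolved through the prefix.
        System.out.println(resolve("ArrayList", "java.util.").getName());
        // Fully qualified name, resolved directly.
        System.out.println(resolve("java.lang.String", "java.util.").getName());
    }
}
```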
-
getIndexSolr
public static final CmsSolrIndex getIndexSolr(CmsObject cms, java.util.Map<java.lang.String,java.lang.String[]> params)
Returns the Solr index configured with the name given in the parameters. The parameters must contain a key/value pair with an existing Solr index, otherwise null is returned.
- Parameters:
cms
- the current context
params
- the parameter map
- Returns:
- the best matching Solr index
-
isLuceneIndex
public static boolean isLuceneIndex(java.lang.String indexName)
Returns true if the index for the given name is a Lucene index, false otherwise.
- Parameters:
indexName
- the name of the index to check
- Returns:
- true if the index for the given name is a Lucene index
-
addAnalyzer
public void addAnalyzer(CmsSearchAnalyzer analyzer)
Adds an analyzer.
- Parameters:
analyzer
- an analyzer
-
addDocumentTypeConfig
public void addDocumentTypeConfig(CmsSearchDocumentType documentType)
Adds a document type.
- Parameters:
documentType
- a document type
-
addFieldConfiguration
public void addFieldConfiguration(I_CmsSearchFieldConfiguration fieldConfiguration)
Adds a search field configuration to the search manager.
- Parameters:
fieldConfiguration
- the search field configuration to add
-
addSearchIndex
public void addSearchIndex(I_CmsSearchIndex searchIndex)
Adds a search index to the configuration.
- Parameters:
searchIndex
- the search index to add
-
addSearchIndexSource
public void addSearchIndexSource(CmsSearchIndexSource searchIndexSource)
Adds a search index source configuration.
- Parameters:
searchIndexSource
- a search index source configuration
-
cmsEvent
public void cmsEvent(CmsEvent event)
Implements the event listener of this class.
- Specified by:
cmsEvent in interface I_CmsEventListener
- Parameters:
event
- CmsEvent that has occurred
- See Also:
I_CmsEventListener.cmsEvent(org.opencms.main.CmsEvent)
-
getAllSolrIndexes
public java.util.List<CmsSolrIndex> getAllSolrIndexes()
Returns all Solr indexes.
- Returns:
- all Solr indexes
-
getAnalyzer
public org.apache.lucene.analysis.Analyzer getAnalyzer(java.util.Locale locale) throws CmsSearchException
Returns an analyzer for the given language. The analyzer is selected according to the analyzer configuration.
- Parameters:
locale
- the locale to get the analyzer for
- Returns:
- the appropriate Lucene analyzer
- Throws:
CmsSearchException
- if something goes wrong
-
getAnalyzers
public java.util.Map<java.util.Locale,CmsSearchAnalyzer> getAnalyzers()
Returns an unmodifiable view of the map that contains the CmsSearchAnalyzer list. The keys in the map are Locale objects, and the values are CmsSearchAnalyzer objects.
- Returns:
- an unmodifiable view of the Analyzers Map
-
getCmsSearchAnalyzer
public CmsSearchAnalyzer getCmsSearchAnalyzer(java.util.Locale locale)
Returns the search analyzer for the given locale.
- Parameters:
locale
- the locale to get the analyzer for
- Returns:
- the search analyzer for the given locale
-
getDirectory
public java.lang.String getDirectory()
Returns the name of the directory below WEB-INF/ where the search indexes are stored.
- Returns:
- the name of the directory below WEB-INF/ where the search indexes are stored
-
getDirectorySolr
public java.lang.String getDirectorySolr()
Returns the configured Solr home directory, or null if not set.
- Returns:
- the Solr home directory
-
getDocumentFactoryForName
public I_CmsDocumentFactory getDocumentFactoryForName(java.lang.String docTypeName)
Returns the document factory configured under the provided name.
- Parameters:
docTypeName
- the name of the document type.
- Returns:
- the factory for the provided name.
-
getDocumentTypeConfig
public CmsSearchDocumentType getDocumentTypeConfig(java.lang.String name)
Returns a document type config.
- Parameters:
name
- the name of the document type config
- Returns:
- the document type config.
-
getDocumentTypeConfigs
public java.util.List<CmsSearchDocumentType> getDocumentTypeConfigs()
Returns an unmodifiable view (read-only) of the DocumentTypeConfigs list.
- Returns:
- an unmodifiable view (read-only) of the DocumentTypeConfigs list
-
getDocumentTypeKeys
public java.util.List<java.lang.String> getDocumentTypeKeys(CmsResource resource)
Returns the document type keys used to specify the correct document factory.
- Parameters:
resource
- the resource to generate the list of document type keys for.
- Returns:
- the document type keys.
- See Also:
for detailed information on the returned keys.
-
getDocumentTypeKeys
public java.util.List<java.lang.String> getDocumentTypeKeys(java.lang.String resourceType, java.lang.String mimeType)
Returns the document type keys used to specify the correct document factory. One resource typically has more than one key. The document factories are matched in the provided order and the first matching factory is used. The keys for type name "typename" and mimetype "mimetype" would be a subset of:
- typename_mimetype
- typename
- if typename is a sub-type of containerpage:
  - containerpage_mimetype
  - containerpage
- if typename is a sub-type of xmlcontent:
  - xmlcontent_mimetype
  - xmlcontent
- __unconfigured___mimetype
- __unconfigured__
- __all___mimetype
- __all__

Note that all keys except the "__all__" keys are only added as long as there is globally no matching factory for the key. In particular, this means that a factory matching "typename" will never be used if you have a factory for "typename_mimetype", even if the latter is not configured for the index source in use. As a result, the content may not be indexed at all in such cases.
- Parameters:
resourceType
- the resource type to generate the list of document type keys for.
mimeType
- the mime type to generate the list of document type keys for.
- Returns:
- the document type keys.
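The candidate order listed above can be sketched as follows. This is a stand-alone illustration, not the OpenCms implementation: the class and method names are hypothetical, the "_" separator is taken from the listing, the two sub-type checks are passed in as booleans, and the filtering against globally configured factories described in the note is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

public class DocumentTypeKeySketch {

    /**
     * Builds the full candidate key list in the documented order.
     * Hypothetical helper; the sub-type flags stand in for real type lookups.
     */
    static List<String> candidateKeys(
        String typeName,
        String mimeType,
        boolean isContainerPageSubType,
        boolean isXmlContentSubType) {

        List<String> keys = new ArrayList<>();
        addPair(keys, typeName, mimeType);
        if (isContainerPageSubType) {
            addPair(keys, "containerpage", mimeType);
        }
        if (isXmlContentSubType) {
            addPair(keys, "xmlcontent", mimeType);
        }
        addPair(keys, "__unconfigured__", mimeType);
        addPair(keys, "__all__", mimeType);
        return keys;
    }

    /** Adds the mime-type-specific key first, then the generic key. */
    private static void addPair(List<String> keys, String base, String mimeType) {
        keys.add(base + "_" + mimeType);
        keys.add(base);
    }

    public static void main(String[] args) {
        // "article" is an invented type name that is assumed to be an xmlcontent sub-type.
        for (String key : candidateKeys("article", "text/html", false, true)) {
            System.out.println(key);
        }
    }
}
```

Because the more specific key of each pair comes first, a factory registered for a type and mime type combination is considered before one registered for the type alone.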
-
getDocumentTypeMapForTypeNames
public java.util.Map<java.lang.String,I_CmsDocumentFactory> getDocumentTypeMapForTypeNames(java.util.List<java.lang.String> documentTypeNames)
Returns the map from document type keys to document factories with all entries for the provided document type names.
- Parameters:
documentTypeNames
- list of document type names to generate the map for.
- Returns:
- the map from document type keys to document factories.
-
getExtractionCacheMaxAge
public float getExtractionCacheMaxAge()
Returns the maximum age a text extraction result is kept in the cache (in hours).
- Returns:
- the maximum age a text extraction result is kept in the cache (in hours)
-
getFieldConfiguration
public I_CmsSearchFieldConfiguration getFieldConfiguration(java.lang.String name)
Returns the search field configuration with the given name. In case no configuration is available with the given name, null is returned.
- Parameters:
name
- the name to get the search field configuration for
- Returns:
- the search field configuration with the given name
-
getFieldConfigurations
public java.util.List<I_CmsSearchFieldConfiguration> getFieldConfigurations()
Returns the unmodifiable List of configured I_CmsSearchFieldConfiguration entries.
- Returns:
- the unmodifiable List of configured I_CmsSearchFieldConfiguration entries
-
getFieldConfigurationsLucene
public java.util.List<CmsLuceneFieldConfiguration> getFieldConfigurationsLucene()
Returns the Lucene search field configurations only.
- Returns:
- the Lucene search field configurations
-
getFieldConfigurationsSolr
public java.util.List<CmsSolrFieldConfiguration> getFieldConfigurationsSolr()
Returns the Solr search field configurations only.
- Returns:
- the Solr search field configurations
-
getForceunlock
public CmsSearchManager.CmsSearchForceUnlockMode getForceunlock()
Returns the force unlock mode during indexing.
- Returns:
- the force unlock mode during indexing
-
getHighlighter
public I_CmsTermHighlighter getHighlighter()
Returns the highlighter.
- Returns:
- the highlighter
-
getIndex
public I_CmsSearchIndex getIndex(java.lang.String indexName)
Returns the Lucene search index configured with the given name. The index must exist, otherwise null is returned.
- Parameters:
indexName
- the name of the requested search index
- Returns:
- the Lucene search index configured with the given name
-
getIndexLockMaxWaitSeconds
public int getIndexLockMaxWaitSeconds()
Returns the seconds to wait for an index lock during an update operation.
- Returns:
- the seconds to wait for an index lock during an update operation
-
getIndexNames
public java.util.List<java.lang.String> getIndexNames()
Returns the names of all configured indexes.
- Returns:
- list of names
-
getIndexSolr
public CmsSolrIndex getIndexSolr(java.lang.String indexName)
Returns the Solr index configured with the given name. The index must exist, otherwise null is returned.
- Parameters:
indexName
- the name of the requested Solr index
- Returns:
- the Solr index configured with the given name
-
getIndexSource
public CmsSearchIndexSource getIndexSource(java.lang.String sourceName)
Returns a search index source for a specified source name.
- Parameters:
sourceName
- the name of the index source
- Returns:
- a search index source
-
getMaxExcerptLength
public int getMaxExcerptLength()
Returns the max. excerpt length.
- Returns:
- the max excerpt length
-
getMaxIndexWaitTime
public long getMaxIndexWaitTime()
Returns the maximal time to wait for re-indexing after a content is edited (in milliseconds).
- Returns:
- the maximal time to wait for re-indexing after a content is edited (in milliseconds)
-
getMaxModificationsBeforeCommit
public int getMaxModificationsBeforeCommit()
Returns the maximum number of modifications before a commit in the search index is triggered.
- Returns:
- the maximum number of modifications before a commit in the search index is triggered
-
getOfflineUpdateFrequency
public long getOfflineUpdateFrequency()
Returns the update frequency of the offline indexer in milliseconds.
- Returns:
- the update frequency of the offline indexer in milliseconds
-
getSearchIndexes
public java.util.List<I_CmsSearchIndex> getSearchIndexes()
Returns an unmodifiable list of all configured I_CmsSearchIndex instances.
- Returns:
- an unmodifiable list of all configured I_CmsSearchIndex instances
-
getSearchIndexesAll
public java.util.List<I_CmsSearchIndex> getSearchIndexesAll()
Returns an unmodifiable list of all configured I_CmsSearchIndex instances.
- Returns:
- an unmodifiable list of all configured I_CmsSearchIndex instances
-
getSearchIndexesSolr
public java.util.List<CmsSolrIndex> getSearchIndexesSolr()
Returns an unmodifiable list of all configured I_CmsSearchIndex instances.
- Returns:
- an unmodifiable list of all configured I_CmsSearchIndex instances
-
getSearchIndexSources
public java.util.Map<java.lang.String,CmsSearchIndexSource> getSearchIndexSources()
Returns an unmodifiable view (read-only) of the SearchIndexSources Map.
- Returns:
- an unmodifiable view (read-only) of the SearchIndexSources Map
-
getSolrDictionary
public CmsSolrSpellchecker getSolrDictionary()
Returns the singleton instance of the OpenCms spellchecker.
- Returns:
- instance of CmsSolrSpellchecker.
-
getSolrServerConfiguration
public CmsSolrConfiguration getSolrServerConfiguration()
Returns the Solr configuration.
- Returns:
- the Solr configuration
-
getTimeout
public long getTimeout()
Returns the timeout to abandon threads indexing a resource.
- Returns:
- the timeout to abandon threads indexing a resource
-
initialize
public void initialize(CmsObject cms) throws CmsRoleViolationException
Initializes the search manager.
- Parameters:
cms
- the cms object
- Throws:
CmsRoleViolationException
- in case the given OpenCms object does not have CmsRole.WORKPLACE_MANAGER permissions
-
initializeFieldConfigurations
public void initializeFieldConfigurations()
Calls I_CmsSearchFieldConfiguration.init() for all registered field configurations.
-
initializeIndexes
public void initializeIndexes()
Initializes all configured document types, index sources and search indexes. This method needs to be called after a change in the index configuration has been made.
-
initOfflineIndexes
public void initOfflineIndexes()
Initializes the offline index handler; required after an offline index has been added.
-
initSpellcheckIndex
public void initSpellcheckIndex(CmsObject adminCms)
Initializes the spell check index.
- Parameters:
adminCms
- the ROOT_ADMIN cms context
-
isOfflineIndexingPaused
public boolean isOfflineIndexingPaused()
Returns whether the offline indexing is paused.
- Returns:
- true if the offline indexing is paused
-
launch
public java.lang.String launch(CmsObject cms, java.util.Map<java.lang.String,java.lang.String> parameters) throws java.lang.Exception
Updates the indexes as a scheduled job.
- Specified by:
launch in interface I_CmsScheduledJob
- Parameters:
cms
- the OpenCms user context to use when reading resources from the VFS
parameters
- the parameters for the scheduled job
- Returns:
- the String to write in the scheduler log
- Throws:
java.lang.Exception
- if something goes wrong
- See Also:
I_CmsScheduledJob.launch(CmsObject, Map)
-
pauseOfflineIndexing
public void pauseOfflineIndexing()
Pauses the offline indexing. May take some time, because the indexes are updated first.
-
rebuildAllIndexes
public void rebuildAllIndexes(I_CmsReport report) throws CmsException
Rebuilds (if required creates) all configured indexes.- Parameters:
report
- the report object to write messages (or null
)- Throws:
CmsException
- if something goes wrong
-
rebuildIndex
public void rebuildIndex(java.lang.String indexName, I_CmsReport report) throws CmsException
Rebuilds (if required creates) the index with the given name.- Parameters:
indexName
- the name of the index to rebuild
report
- the report object to write messages (or null
)- Throws:
CmsException
- if something goes wrong
-
rebuildIndexes
public void rebuildIndexes(java.util.List<java.lang.String> indexNames, I_CmsReport report) throws CmsException
Rebuilds (if required creates) the indexes with the given names.- Parameters:
indexNames
- the names (String) of the indexes to rebuild
report
- the report object to write messages (or null
)- Throws:
CmsException
- if something goes wrong
-
registerSolrIndex
public void registerSolrIndex(CmsSolrIndex index) throws CmsConfigurationException
Registers a new Solr core for the given index.- Parameters:
index
- the index to register a new Solr core for- Throws:
CmsConfigurationException
- if no Solr server is configured
-
removeSearchFieldConfiguration
public boolean removeSearchFieldConfiguration(I_CmsSearchFieldConfiguration fieldConfiguration) throws CmsIllegalStateException
Removes this field configuration from the OpenCms configuration (if it is not used any more).- Parameters:
fieldConfiguration
- the field configuration to remove from the configuration- Returns:
- true if remove was successful, false if preconditions for removal are ok but the given field configuration was unknown to the manager.
- Throws:
CmsIllegalStateException
- if the given field configuration is still used by at least one I_CmsSearchIndex.
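The three documented outcomes (removed, unknown, still in use) can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: `FieldConfigRegistrySketch` and its methods are hypothetical names, configurations and indexes are reduced to plain strings, and `java.lang.IllegalStateException` stands in for `CmsIllegalStateException`. It does not reproduce the OpenCms implementation.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model, not an OpenCms class.
final class FieldConfigRegistrySketch {

    // Names of all registered field configurations.
    private final Set<String> configurations = new HashSet<>();
    // Maps an index name to the name of the field configuration it uses.
    private final Map<String, String> configurationByIndex = new HashMap<>();

    void addConfiguration(String name) {
        configurations.add(name);
    }

    void addIndex(String indexName, String configurationName) {
        configurationByIndex.put(indexName, configurationName);
    }

    /**
     * Returns true if the configuration was removed, false if it was unknown.
     * Throws IllegalStateException if an index still uses it.
     */
    boolean removeConfiguration(String name) {
        if (configurationByIndex.containsValue(name)) {
            throw new IllegalStateException("Configuration '" + name + "' is still used by an index");
        }
        return configurations.remove(name);
    }

    public static void main(String[] args) {
        FieldConfigRegistrySketch registry = new FieldConfigRegistrySketch();
        registry.addConfiguration("standard");
        registry.addConfiguration("unused");
        registry.addIndex("demo-index", "standard");

        System.out.println(registry.removeConfiguration("unused"));
        System.out.println(registry.removeConfiguration("unknown"));
        try {
            registry.removeConfiguration("standard");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The in-use check runs before the removal attempt, so a configuration that is still referenced is never dropped by accident.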
-
removeSearchFieldConfigurationField
public boolean removeSearchFieldConfigurationField(I_CmsSearchFieldConfiguration fieldConfiguration, CmsSearchField field)
Removes a search field from the field configuration.- Parameters:
fieldConfiguration
- the field configuration
field
- field to remove from the field configuration- Returns:
- true if remove was successful, false if preconditions for removal are ok but the given field was unknown.
-
removeSearchFieldMapping
public boolean removeSearchFieldMapping(CmsLuceneField field, CmsSearchFieldMapping mapping) throws CmsIllegalStateException
Removes a search field mapping from the given field.- Parameters:
field
- the field
mapping
- mapping to remove from the field- Returns:
- true if remove was successful, false if preconditions for removal are ok but the given mapping was unknown.
- Throws:
CmsIllegalStateException
- if the given mapping is the last mapping inside the given field.
-
removeSearchIndex
public void removeSearchIndex(I_CmsSearchIndex searchIndex)
Removes a search index from the configuration.- Parameters:
searchIndex
- the search index to remove
-
removeSearchIndexes
public void removeSearchIndexes(java.util.List<java.lang.String> indexNames)
Removes all indexes included in the given list (which must contain the names of the indexes to remove).- Parameters:
indexNames
- the names of the indexes to remove
-
removeSearchIndexSource
public boolean removeSearchIndexSource(CmsSearchIndexSource indexsource) throws CmsIllegalStateException
Removes this indexsource from the OpenCms configuration (if it is not used any more).- Parameters:
indexsource
- the indexsource to remove from the configuration- Returns:
- true if remove was successful, false if preconditions for removal are ok but the given indexsource was unknown to the manager.
- Throws:
CmsIllegalStateException
- if the given indexsource is still used by at least one I_CmsSearchIndex.
-
resumeOfflineIndexing
public void resumeOfflineIndexing()
Resumes offline indexing if it was paused.
-
setDirectory
public void setDirectory(java.lang.String value)
Sets the name of the directory below WEB-INF/ where the search indexes are stored.- Parameters:
value
- the name of the directory below WEB-INF/ where the search indexes are stored
-
setExtractionCacheMaxAge
public void setExtractionCacheMaxAge(float extractionCacheMaxAge)
Sets the maximum age a text extraction result is kept in the cache (in hours).- Parameters:
extractionCacheMaxAge
- the maximum age for a text extraction result to set
-
setExtractionCacheMaxAge
public void setExtractionCacheMaxAge(java.lang.String extractionCacheMaxAge)
Sets the maximum age a text extraction result is kept in the cache (in hours) as a String.- Parameters:
extractionCacheMaxAge
- the maximum age for a text extraction result to set
-
setForceunlock
public void setForceunlock(java.lang.String value)
Sets the unlock mode during indexing.- Parameters:
value
- the value
-
setHighlighter
public void setHighlighter(java.lang.String highlighter)
Sets the highlighter. A highlighter is a class implementing org.opencms.search.documents.I_TermHighlighter.
- Parameters:
highlighter
- the package/class name of the highlighter
-
setIndexLockMaxWaitSeconds
public void setIndexLockMaxWaitSeconds(int value)
Sets the seconds to wait for an index lock during an update operation.- Parameters:
value
- the seconds to wait for an index lock during an update operation
-
setMaxExcerptLength
public void setMaxExcerptLength(int maxExcerptLength)
Sets the max. excerpt length.- Parameters:
maxExcerptLength
- the max. excerpt length to set
-
setMaxExcerptLength
public void setMaxExcerptLength(java.lang.String maxExcerptLength)
Sets the max. excerpt length as a String.- Parameters:
maxExcerptLength
- the max. excerpt length to set
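Several setters on this class come in pairs, one taking a number and one taking a String. A minimal sketch of how such a pair can fit together is shown below. It assumes that the String variant parses the value and falls back to the documented default (`DEFAULT_EXCERPT_LENGTH`, 1024 chars) when the value cannot be parsed; that fallback is an assumption, not something this page states. `ExcerptLengthSketch` is a hypothetical class.

```java
// Hypothetical model, not an OpenCms class.
final class ExcerptLengthSketch {

    // Default documented in the field summary (1024 chars).
    static final int DEFAULT_EXCERPT_LENGTH = 1024;

    private int maxExcerptLength = DEFAULT_EXCERPT_LENGTH;

    void setMaxExcerptLength(int maxExcerptLength) {
        this.maxExcerptLength = maxExcerptLength;
    }

    // Assumption: an unparsable or missing value falls back to the default.
    void setMaxExcerptLength(String maxExcerptLength) {
        try {
            setMaxExcerptLength(Integer.parseInt(maxExcerptLength.trim()));
        } catch (NumberFormatException | NullPointerException e) {
            setMaxExcerptLength(DEFAULT_EXCERPT_LENGTH);
        }
    }

    int getMaxExcerptLength() {
        return maxExcerptLength;
    }

    public static void main(String[] args) {
        ExcerptLengthSketch sketch = new ExcerptLengthSketch();
        sketch.setMaxExcerptLength(" 256 ");
        System.out.println(sketch.getMaxExcerptLength());
        sketch.setMaxExcerptLength("not a number");
        System.out.println(sketch.getMaxExcerptLength());
    }
}
```

A String overload of this kind is convenient when values arrive as text from an XML configuration file.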
-
setMaxIndexWaitTime
public void setMaxIndexWaitTime(long maxIndexWaitTime)
Sets the maximal wait time for offline index updates after edit operations.- Parameters:
maxIndexWaitTime
- the maximal wait time to set in milliseconds
-
setMaxIndexWaitTime
public void setMaxIndexWaitTime(java.lang.String maxIndexWaitTime)
Sets the maximal wait time for offline index updates after edit operations.- Parameters:
maxIndexWaitTime
- the maximal wait time to set in milliseconds
-
setMaxModificationsBeforeCommit
public void setMaxModificationsBeforeCommit(int maxModificationsBeforeCommit)
Sets the maximum number of modifications before a commit in the search index is triggered.- Parameters:
maxModificationsBeforeCommit
- the maximum number of modifications to set
-
setMaxModificationsBeforeCommit
public void setMaxModificationsBeforeCommit(java.lang.String value)
Sets the maximum number of modifications before a commit in the search index is triggered as a string.- Parameters:
value
- the maximum number of modifications to set
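To illustrate what a modification threshold of this kind means in practice, the following sketch counts pending modifications and commits once the limit is reached. `CommitCounterSketch` is a hypothetical class and the commit itself is only simulated by a counter; how OpenCms actually batches its index writes is not described on this page.

```java
// Hypothetical model, not an OpenCms class.
final class CommitCounterSketch {

    private final int maxModificationsBeforeCommit;
    private int pendingModifications;
    private int commitCount;

    CommitCounterSketch(int maxModificationsBeforeCommit) {
        if (maxModificationsBeforeCommit < 1) {
            throw new IllegalArgumentException("threshold must be at least 1");
        }
        this.maxModificationsBeforeCommit = maxModificationsBeforeCommit;
    }

    // Records one modification and commits when the threshold is reached.
    void recordModification() {
        pendingModifications++;
        if (pendingModifications >= maxModificationsBeforeCommit) {
            commit();
        }
    }

    // Simulated commit: counts the commit and clears the pending modifications.
    void commit() {
        if (pendingModifications > 0) {
            commitCount++;
            pendingModifications = 0;
        }
    }

    int getCommitCount() {
        return commitCount;
    }

    int getPendingModifications() {
        return pendingModifications;
    }

    public static void main(String[] args) {
        // 500 is the documented DEFAULT_MAX_MODIFICATIONS_BEFORE_COMMIT.
        CommitCounterSketch counter = new CommitCounterSketch(500);
        for (int i = 0; i < 1200; i++) {
            counter.recordModification();
        }
        System.out.println(counter.getCommitCount() + " commits, "
            + counter.getPendingModifications() + " pending");
    }
}
```

A lower threshold makes changes visible sooner at the cost of more frequent commits; a final explicit `commit()` would flush whatever is still pending.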
-
setOfflineUpdateFrequency
public void setOfflineUpdateFrequency(long offlineUpdateFrequency)
Sets the update frequency of the offline indexer in milliseconds.- Parameters:
offlineUpdateFrequency
- the update frequency in milliseconds to set
-
setOfflineUpdateFrequency
public void setOfflineUpdateFrequency(java.lang.String offlineUpdateFrequency)
Sets the update frequency of the offline indexer in milliseconds.- Parameters:
offlineUpdateFrequency
- the update frequency in milliseconds to set
-
setSolrServerConfiguration
public void setSolrServerConfiguration(CmsSolrConfiguration config)
Sets the Solr configuration.- Parameters:
config
- the Solr configuration
-
setTimeout
public void setTimeout(long value)
Sets the timeout to abandon threads indexing a resource.- Parameters:
value
- the timeout in milliseconds
-
setTimeout
public void setTimeout(java.lang.String value)
Sets the timeout to abandon threads indexing a resource as a String.- Parameters:
value
- the timeout in milliseconds
-
shutDown
public void shutDown()
Shuts down the search manager. This will cause all search indices to be shut down.
-
updateOfflineIndexes
public void updateOfflineIndexes()
Updates all offline indexes. Can be used to force an index update when it's not convenient to wait until the offline update interval has elapsed.
Since the offline indexes still need some time to update the new resources, the method waits for at most the configurable maxIndexWaitTime to ensure that updating is finished.- See Also:
updateOfflineIndexes(long)
-
updateOfflineIndexes
public void updateOfflineIndexes(long waitTime)
Updates all offline indexes. Can be used to force an index update when it's not convenient to wait until the offline update interval has elapsed.
Since the offline index will still need some time to update the new resources even if it runs directly, a wait time of about 2500 milliseconds should be given to make sure the index has finished updating.
- Parameters:
waitTime
- milliseconds to wait after the offline update index was notified of the changes
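The "trigger the update, then wait for a bounded time" idea can be sketched with standard Java concurrency utilities. This is not the OpenCms implementation: `BoundedWaitSketch` and `updateAndWait` are hypothetical names, and a `CountDownLatch` is used here as a stand-in for whatever signalling the offline indexer uses internally.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical model, not an OpenCms class.
final class BoundedWaitSketch {

    /**
     * Runs the update on a worker thread and waits at most waitTimeMillis for it.
     * Returns true if the update finished within the wait time.
     */
    static boolean updateAndWait(Runnable update, long waitTimeMillis) {
        CountDownLatch done = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                update.run();
            } finally {
                done.countDown(); // signal completion even if the update fails
            }
        }, "offline-index-sketch");
        worker.setDaemon(true); // do not keep the JVM alive for a slow update
        worker.start();
        try {
            return done.await(waitTimeMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    public static void main(String[] args) {
        // 2500 ms is the wait time suggested in the description above.
        boolean finished = updateAndWait(() -> System.out.println("updating offline indexes"), 2500L);
        System.out.println("finished in time: " + finished);
    }
}
```

The caller is blocked for at most the given wait time; if the update takes longer, it continues in the background and the caller simply proceeds.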
-
addAdditionallyAffectedResources
protected java.util.List<CmsPublishedResource> addAdditionallyAffectedResources(CmsObject adminCms, java.util.List<CmsPublishedResource> updateResources)
Collects the resources whose indexed document depends on one of the updated resources. We take transitive dependencies into account and handle cyclic dependencies correctly as well.
- Parameters:
adminCms
- an OpenCms user context with Admin permissions
updateResources
- the resources to be re-indexed- Returns:
- the updated list of resources to re-index
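Collecting transitive dependencies while tolerating cycles is commonly done with a worklist and a "seen" set. The sketch below shows that general technique on plain path strings. `DependencyClosureSketch`, `collectAffected` and the `dependents` map are hypothetical; the real method works on `CmsPublishedResource` objects and reads the relations from OpenCms, which is not modelled here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model, not an OpenCms class.
final class DependencyClosureSketch {

    /**
     * Returns the updated resources plus everything that transitively depends on them.
     * The dependents map goes from a resource path to the paths whose indexed
     * document depends on that resource.
     */
    static List<String> collectAffected(List<String> updated, Map<String, List<String>> dependents) {
        Set<String> seen = new LinkedHashSet<>(updated); // keeps order, blocks revisits
        Deque<String> toCheck = new ArrayDeque<>(updated);
        while (!toCheck.isEmpty()) {
            String current = toCheck.poll();
            for (String dependent : dependents.getOrDefault(current, List.of())) {
                if (seen.add(dependent)) {
                    // only newly discovered resources are checked again,
                    // so a cycle cannot cause an endless loop
                    toCheck.add(dependent);
                }
            }
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        Map<String, List<String>> dependents = Map.of(
            "/a.xml", List.of("/b.xml"),
            "/b.xml", List.of("/c.xml"),
            "/c.xml", List.of("/a.xml")); // cycle back to the start
        System.out.println(collectAffected(List.of("/a.xml"), dependents));
        // prints [/a.xml, /b.xml, /c.xml]
    }
}
```

Each resource enters the worklist at most once, so the loop terminates even when resources depend on each other in a cycle.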
-
addIndexContentRelatedResources
protected java.util.Collection<CmsPublishedResource> addIndexContentRelatedResources(CmsObject adminCms, java.util.Collection<CmsPublishedResource> updateResources, java.util.Collection<CmsPublishedResource> updateResourcesToCheck)
Collects the resources whose indexed document depends on one of the updated resources.- Parameters:
adminCms
- an OpenCms user context with Admin permissions
updateResources
- the resources to be re-indexed
updateResourcesToCheck
- the resources to check additionally affected resources for, subset of updateResources- Returns:
- the list of resources that need to be additionally re-indexed
-
cleanExtractionCache
protected void cleanExtractionCache()
Cleans up the extraction result cache.
-
findRelatedContainerPages
protected java.util.Collection<CmsPublishedResource> findRelatedContainerPages(CmsObject adminCms, java.util.Collection<CmsPublishedResource> updateResources, java.util.Collection<CmsPublishedResource> updateResourcesToCheck)
Collects the container pages related to the resources that have been published.- Parameters:
adminCms
- an OpenCms user context with Admin permissions
updateResources
- the resources to be re-indexed
updateResourcesToCheck
- the resources to check additionally affected resources for, subset of updateResources- Returns:
- the list of resources that need to be additionally re-indexed
-
getDocumentTypes
protected java.util.List<java.lang.String> getDocumentTypes()
Returns the names of all configured document types.- Returns:
- the names of all configured document types
-
getOfflineIndexProject
protected CmsProject getOfflineIndexProject()
Returns the offline project used for offline indexing.- Returns:
- the offline project if available
-
getThreadManager
protected CmsIndexingThreadManager getThreadManager()
Returns a new thread manager for the indexing threads.- Returns:
- a new thread manager for the indexing threads
-
initAvailableDocumentTypes
protected void initAvailableDocumentTypes()
Initializes the available Cms resource types to be indexed. A map stores document factories keyed by a string representing a colon separated list of Cms resource types and/or mimetypes.
The keys of this map are used to trigger a document factory to convert a Cms resource into a Lucene index document.
A document factory is a class implementing the interface
I_CmsDocumentFactory
.
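A lookup over such a keyed map might look like the sketch below. The exact key layout is an assumption: here a key is either `type:mimetype` or just `type`, and the lookup tries the more specific key first. `DocumentFactoryLookupSketch` is hypothetical, and a plain `Function<String, String>` stands in for `I_CmsDocumentFactory`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical model, not an OpenCms class.
final class DocumentFactoryLookupSketch {

    // Stand-in for document factories: each turns resource content into an index document.
    private final Map<String, Function<String, String>> factories = new HashMap<>();

    void register(String key, Function<String, String> factory) {
        factories.put(key, factory);
    }

    /**
     * Looks up "type:mimetype" first, then falls back to the bare resource type.
     * Returns null if no factory is registered for either key.
     */
    Function<String, String> find(String resourceType, String mimeType) {
        Function<String, String> factory = null;
        if (mimeType != null) {
            factory = factories.get(resourceType + ":" + mimeType);
        }
        if (factory == null) {
            factory = factories.get(resourceType);
        }
        return factory;
    }

    public static void main(String[] args) {
        DocumentFactoryLookupSketch lookup = new DocumentFactoryLookupSketch();
        lookup.register("binary:application/pdf", content -> "pdf document: " + content);
        lookup.register("plain", content -> "text document: " + content);

        System.out.println(lookup.find("binary", "application/pdf").apply("report"));
        System.out.println(lookup.find("plain", "text/html").apply("notes"));
        System.out.println(lookup.find("image", "image/png"));
    }
}
```

Resources for which no factory is found would simply not be converted into an index document in this model.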
-
initIndexSources
protected void initIndexSources()
Initializes the index sources.
-
initSearchIndexes
protected void initSearchIndexes()
Initializes the configured search indexes. This also initializes the list of Cms resource types to be indexed by an index source.
-
shouldUpdateAtAll
protected boolean shouldUpdateAtAll(I_CmsSearchIndex index)
Checks whether the index should be rebuilt/updated at all by the search manager.- Parameters:
index
- the index to check.- Returns:
- a flag, indicating if the index should be rebuilt/updated at all.
-
updateAllIndexes
protected void updateAllIndexes(CmsObject adminCms, CmsUUID publishHistoryId, I_CmsReport report)
Incrementally updates all indexes that have their rebuild mode set to "auto" after resources have been published.- Parameters:
adminCms
- an OpenCms user context with Admin permissions
publishHistoryId
- the history ID of the published project
report
- the report to write the output to
-
updateAllIndexes
protected void updateAllIndexes(CmsObject adminCms, java.util.List<CmsPublishedResource> updateResources, I_CmsReport report)
Incrementally updates all indexes that have their rebuild mode set to "auto".- Parameters:
adminCms
- an OpenCms user context with Admin permissions
updateResources
- the resources to update
report
- the report to write the output to
-
updateIndex
protected void updateIndex(I_CmsSearchIndex index, I_CmsReport report, java.util.List<CmsPublishedResource> resourcesToIndex) throws CmsException
Updates (if required creates) the given index. If the optional List of CmsPublishedResource instances is provided, the index will be incrementally updated for these resources only. If this List is null or empty, the index will be fully rebuilt.- Parameters:
index
- the index to update or rebuild
report
- the report to write output messages to
resourcesToIndex
- an (optional) list of CmsPublishedResource objects to update in the index
- Throws:
CmsException
- if something goes wrong
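The rule for choosing between a full rebuild and an incremental update can be written down in a few lines. The sketch below only models the decision described above; `UpdateModeSketch`, `Mode` and `chooseMode` are hypothetical names, and resources are represented as plain path strings.

```java
import java.util.List;

// Hypothetical model, not an OpenCms class.
final class UpdateModeSketch {

    enum Mode { FULL_REBUILD, INCREMENTAL }

    /** A null or empty list means a full rebuild, anything else an incremental update. */
    static Mode chooseMode(List<String> resourcesToIndex) {
        if (resourcesToIndex == null || resourcesToIndex.isEmpty()) {
            return Mode.FULL_REBUILD;
        }
        return Mode.INCREMENTAL;
    }

    public static void main(String[] args) {
        System.out.println(chooseMode(null));
        System.out.println(chooseMode(List.of()));
        System.out.println(chooseMode(List.of("/sites/default/index.html")));
    }
}
```

In the class documented here, the two branches appear to correspond to updateIndexCompletely and updateIndexIncremental, although this page does not spell out that call relationship.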
-
updateIndexCompletely
protected void updateIndexCompletely(CmsObject cms, I_CmsSearchIndex index, I_CmsReport report) throws CmsIndexException
The method updates all OpenCms documents that are indexed.- Parameters:
cms
- the OpenCms user context to use for accessing the VFS
index
- the index to update
report
- the report to write output messages to- Throws:
CmsIndexException
- thrown if indexing fails for some reason
-
updateIndexIncremental
protected void updateIndexIncremental(CmsObject cms, I_CmsSearchIndex index, I_CmsReport report, java.util.List<CmsPublishedResource> resourcesToIndex) throws CmsException
Incrementally updates the given index.- Parameters:
cms
- the OpenCms user context to use for accessing the VFS
index
- the index to update
report
- the report to write output messages to
resourcesToIndex
- a list of CmsPublishedResource objects to update in the index
- Throws:
CmsException
- if something goes wrong
-
updateIndexOffline
protected void updateIndexOffline(I_CmsReport report, java.util.List<CmsPublishedResource> resourcesToIndex)
Updates the offline search indexes for the given list of resources.- Parameters:
report
- the report to write the index information to
resourcesToIndex
- the list of CmsPublishedResource objects to index
-
-