Here you will find frequently asked questions about searching OpenCms with Solr.

General questions

Where to find general information about Solr?

If you are interested in Solr in general, the Solr wiki is a good starting point: http://wiki.apache.org/solr/. OpenCms-specific topics are covered by this documentation.

How is Solr integrated in general?

Independently of OpenCms, a standard Solr server offers an HTTP interface that is reachable at http://localhost:8983/solr/select in a default Apache Solr installation.

You can append any valid Solr query to this URL. The HTTP response can be either JSON or XML. For example, the response to the query http://localhost:8983/solr/select?q=*:*&rows=2 could look like this:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">32</int>
    <lst name="params">
      <str name="q">*:*</str>
      <int name="rows">2</int>
      <long name="start">0</long>
    </lst>
  </lst>
  <result name="response" numFound="139" start="0">
    <doc>...</doc>
    <doc>...</doc>
  </result>
</response>
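
As a rough illustration of how such a request URL is put together, the following Java sketch appends URL-encoded query parameters to the select endpoint. The class name SolrSelectUrl and its build method are made up for this example and are not part of Solr or OpenCms. The endpoint and the parameters q and rows come from the text above; wt is the standard Solr parameter for choosing the response format.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper for illustration; not part of Solr or OpenCms.
final class SolrSelectUrl {

    // Appends the given parameters, URL-encoded, to the select endpoint.
    static String build(String endpoint, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return endpoint + "?" + query;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "*:*");   // match all documents
        params.put("rows", "2");  // return at most two documents
        params.put("wt", "json"); // ask for JSON instead of XML
        System.out.println(build("http://localhost:8983/solr/select", params));
    }
}
```

Note that the colon in the query is percent-encoded in the resulting URL; Solr decodes it again on the server side.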

How to sort text for specific languages?

In the following example, text is sorted according to the default German rules provided by Java. Which rules apply is determined by a Java Locale.
Locales are typically defined as a combination of language and country, but you can specify just the language if you want. For example, if you specify "de" as the language, you will get sorting that works well for the German language. If you specify "de" as the language and "CH" as the country, you will get German sorting specifically tailored for Switzerland. You can see a list of supported Locales here. For more general information about how text analysis works in Solr, have a look at the Language Analysis page.

<!-- define a field type for German collation -->
<fieldType name="collatedGERMAN" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.CollationKeyFilterFactory"
        language="de"
        strength="primary"
    />
  </analyzer>
</fieldType>
...
<!-- define a field to store the German collated manufacturer names -->
<field name="manuGERMAN" type="collatedGERMAN" indexed="true" stored="false" />
...
<!-- copy the text to this field. We could create French, English, Spanish versions too,
     and sort differently for different users!
  -->
<copyField source="manu" dest="manuGERMAN"/>
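
To get a feeling for what locale-aware sorting changes, the following sketch uses java.text.Collator from the JDK directly, outside of Solr. It only illustrates the idea behind the German locale and the "primary" strength used in the field type above; the class name GermanSortDemo and the sample words are assumptions made for this example.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Illustration only: shows locale-aware ordering with the JDK, outside of Solr.
final class GermanSortDemo {

    // Sorts a copy of the input using German collation rules at primary strength.
    static List<String> sortGerman(List<String> words) {
        Collator collator = Collator.getInstance(Locale.GERMAN);
        collator.setStrength(Collator.PRIMARY); // compare base letters only
        List<String> sorted = new ArrayList<>(words);
        sorted.sort(collator);
        return sorted;
    }

    public static void main(String[] args) {
        // \u00C4 is the capital A with umlaut.
        List<String> words = List.of("Zebra", "\u00C4pfel", "Apfel");
        List<String> plain = new ArrayList<>(words);
        Collections.sort(plain); // code point order puts the umlaut word after "Zebra"
        System.out.println("plain:  " + plain);
        System.out.println("german: " + sortGerman(words));
    }
}
```

With plain string ordering the umlaut word ends up after "Zebra", whereas the German collator places it next to "Apfel".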

The Solr result size is limited to 50 by default. How can I get more than 50 results?

Returning only permission-checked resources is an expensive task, so only this limited number of results is returned by default. To page over results, have a look at the Solr parameters rows and start, e.g., at http://wiki.apache.org/solr/CommonQueryParameters. Since OpenCms version 8.5.x you can increase the number of returned documents to a size of your choice.
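
Paging with rows and start can be sketched as follows: rows is the page size and start is the offset of the first document on the requested page. The helper class SolrPaging and the page size of ten are assumptions for this example; only the two parameter names come from Solr.

```java
// Hypothetical helper for illustration; not part of Solr or OpenCms.
final class SolrPaging {

    // Returns the rows/start parameters for a zero-based page index.
    static String pageParams(int page, int rowsPerPage) {
        if (page < 0 || rowsPerPage <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and rowsPerPage > 0");
        }
        long start = (long) page * rowsPerPage; // offset of the first document on this page
        return "rows=" + rowsPerPage + "&start=" + start;
    }

    public static void main(String[] args) {
        // Walk over the first three pages of ten results each.
        for (int page = 0; page < 3; page++) {
            System.out.println("q=*:*&" + pageParams(page, 10));
        }
    }
}
```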

Questions about highlighting of search results

Does OpenCms support result highlighting?

Yes. Use the OpenCms Solr select handler at http://localhost:8080/opencms/opencms/handleSolrSelect; the highlighting section appears below the list of documents in the returned XML/JSON:

<lst name="highlighting">
  <lst name="a710bb16-1e04-11e2-b767-6805ca037347">
    <arr name="content_en">
      <str><em>YIPI</em> <em>YOHO</em> text text text</str>
    </arr>
  </lst>
  [...]
</lst>
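
If you consume the XML response yourself, the highlighting section can be read with the XML parser that ships with the JDK. The following is a minimal sketch under the assumption that the response has the shape shown above; the class name HighlightParser is made up for this example and is not part of OpenCms.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of Solr or OpenCms.
final class HighlightParser {

    // Maps each document id to the text of its highlighted snippets.
    static Map<String, List<String>> parse(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Response is not well-formed XML", e);
        }
        Map<String, List<String>> result = new LinkedHashMap<>();
        NodeList lists = doc.getElementsByTagName("lst");
        for (int i = 0; i < lists.getLength(); i++) {
            Element lst = (Element) lists.item(i);
            if (!"highlighting".equals(lst.getAttribute("name"))) {
                continue;
            }
            // Each direct <lst> child holds the snippets of one document.
            for (Node child = lst.getFirstChild(); child != null; child = child.getNextSibling()) {
                if (child instanceof Element && "lst".equals(child.getNodeName())) {
                    Element perDoc = (Element) child;
                    List<String> snippets = new ArrayList<>();
                    NodeList strs = perDoc.getElementsByTagName("str");
                    for (int j = 0; j < strs.getLength(); j++) {
                        snippets.add(strs.item(j).getTextContent());
                    }
                    result.put(perDoc.getAttribute("name"), snippets);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<response><lst name=\"highlighting\">"
            + "<lst name=\"a710bb16-1e04-11e2-b767-6805ca037347\">"
            + "<arr name=\"content_en\"><str><em>YIPI</em> <em>YOHO</em> text text text</str></arr>"
            + "</lst></lst></response>";
        System.out.println(parse(xml));
    }
}
```

If the em tags arrive as nested elements, as printed above, getTextContent() keeps only their text; if they arrive escaped as plain text, they are kept as part of the snippet.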

Does the Java API of OpenCms support highlighting?

Currently the OpenCms search API does not support full-featured Solr highlighting. You can, of course, still make use of the default Solr highlighting mechanism in one of two ways:

  • Call org.opencms.search.solr.CmsSolrResultList#getSolrQueryResponse(), which returns a SolrQueryResponse as documented in the Solr API documentation.
  • Or use the OpenCms Solr select handler mentioned above at http://localhost:8080/opencms/opencms/handleSolrSelect

Is highlighting a performance killer?

Yes. For this reason, highlighting is turned off before the first search is executed. Once all resources that the user is not permitted to see have been filtered out of the result list, highlighting is performed for the remaining results.

Questions about indexing

What are the differences between the "Solr Online" and "Solr Offline" indexes?

As the index names suggest, Offline indexes also contain changes that have not yet been published, whereas Online indexes contain only those resources that have already been published. The "Online EN VFS" index is Lucene-based and likewise contains only resources that have been published.

When executing a Solr query, does only the Solr index get used?

No. Permissions are checked by the OpenCms API afterwards.

Is there a way to create a full backup of the complete index?

You can copy the index folder WEB-INF/index/${INDEX_NAME} by hand.
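
If you prefer to script the copy, a recursive directory copy with the JDK file API could look like the sketch below. The class name IndexBackup and the command line arguments are assumptions for this example. It is safest to run such a copy while no indexing job is writing to the folder.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not part of OpenCms.
final class IndexBackup {

    // Recursively copies the source folder, including all files, to the target folder.
    static void copyDirectory(Path source, Path target) {
        try (Stream<Path> paths = Files.walk(source)) {
            // Files.walk visits a directory before its entries, so parents exist first.
            paths.forEach(path -> {
                Path dest = target.resolve(source.relativize(path).toString());
                try {
                    if (Files.isDirectory(path)) {
                        Files.createDirectories(dest);
                    } else {
                        Files.copy(path, dest, StandardCopyOption.REPLACE_EXISTING);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: IndexBackup <index folder> <backup folder>");
            return;
        }
        copyDirectory(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```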

How to rebuild indexes with a fail-safe?

Edit the opencms-search.xml within your WEB-INF/config directory and add the following node to your index:

<param name="org.opencms.search.CmsSearchIndex.useBackupReindexing">true</param>

This will create a snapshot as explained here.

Solr mailing list questions

A class cast exception is thrown, what can I do?

You have to set the correct classes for the index and the field configuration; otherwise the Lucene search index implementation is used.

<index class="org.opencms.search.solr.CmsSolrIndex">[...]</index>
<fieldconfiguration class="org.opencms.search.solr.CmsSolrFieldConfiguration">
  [...]
</fieldconfiguration>

Is it possible to map elements with maxOccurs > 1?

Yes. Since OpenCms version 8.5.1 they are mapped to a multivalued field.

How to index OpenCmsDateTime elements?

Add a search setting for the element that maps it to a Solr date field, here through the dynamic field pattern *_dt:

<searchsetting element="Release" searchcontent="false">
    <solrfield targetfield="arelease" sourcefield="*_dt" />
</searchsetting>
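
Once the element is indexed as a date, it can be used in range queries. Solr expects dates in UTC in the form 2013-01-01T00:00:00Z. The following sketch builds such a range query with java.time. The class name SolrDateRange is made up for this example, and the field name arelease_dt is an assumption about how the mapping above is named in the index, so check the name in your own index before relying on it.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical helper for illustration; not part of Solr or OpenCms.
final class SolrDateRange {

    // Formats an instant as a UTC ISO 8601 timestamp with second precision.
    static String solrDate(Instant instant) {
        return instant.truncatedTo(ChronoUnit.SECONDS).toString();
    }

    // Builds an inclusive range query on the given date field.
    static String rangeQuery(String field, Instant from, Instant to) {
        return field + ":[" + solrDate(from) + " TO " + solrDate(to) + "]";
    }

    public static void main(String[] args) {
        // "arelease_dt" is an assumed field name, see the note above.
        System.out.println(rangeQuery("arelease_dt",
            Instant.parse("2013-01-01T00:00:00Z"),
            Instant.parse("2013-12-31T23:59:59Z")));
    }
}
```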
