XML contents are multilingual by default. Each XML file consists of a series of language-specific copies of the content, from zero to an unbounded number of versions. In short: as many languages as you need.
An important aspect of designing a website is localization. Whether the web presence should simply be translated into each client's language or fully adapted culturally to the respective target market's locale, OpenCms provides the means necessary to do the job.
This topic provides an overview of the different mechanisms for localization. In particular, it answers the questions:
- How to build a multi-lingual website?
- How to support multiple languages in editors?
- How to configure the languages available in OpenCms (in the ADE interfaces)?
Whenever you open an XML content in the content editor, you can select the language version you want to edit. By default, only the languages "German" and "English" can be chosen. The version that is displayed first depends on the language of the page you are editing: the language of the request context is chosen, or a default if this language is not available for the content.
- See here for more information on the page editor.
- See here for details about editing content with the form-based content editor.
A special hint may be necessary for the situation where you want to add a content that exists, for example, only in German to an English page. In that case you will not find the content unless you change the language for which you search content to German. This can be done as shown in the following screenshot from the page editor's "Add wizard".
To adjust the system-wide available languages, or more precisely locales, you have to change the system configuration. To support a locale, add the respective entries in opencms-system.xml, found at {webapp home}/WEB-INF/config/ in the RFS. Locales are specified using the two-letter lowercase code standardized by ISO 639-1. For instance, if Spanish should be available as a content language, add the lines marked below:
<opencms>
  <system>
    <internationalization>
      <localehandler class="org.opencms.i18n.CmsDefaultLocaleHandler"/>
      <localesconfigured>
        <locale>en</locale>
        <locale>de</locale>
        <!-- The line below has to be added.. -->
        <locale>es</locale>
      </localesconfigured>
      <localesdefault>
        <locale>en</locale>
        <locale>de</locale>
        <!-- ..and here as well -->
        <locale>es</locale>
      </localesdefault>
      <timezone>GMT+01:00</timezone>
    </internationalization>
    ...
  </system>
</opencms>
The most important line is the one adding the locale under <localesconfigured>. That makes the locale available. Adding it under <localesdefault> adds it as a fallback if the content is not available in the requested locale.
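The fallback behavior described above can be illustrated with a small, JDK-only Java sketch. It is not OpenCms code: the class and method names are hypothetical, and it assumes the default locales are tried in the order in which they are listed.

```java
import java.util.List;
import java.util.Set;

/** Illustration only (hypothetical names): picks the locale in which a content is shown. */
class LocaleFallbackSketch {

    /**
     * Returns the requested locale if the content has a version for it.
     * Otherwise returns the first default locale (in listed order, an assumption)
     * for which the content has a version, or null if none matches.
     */
    static String resolve(String requested, List<String> defaults, Set<String> contentLocales) {
        if (contentLocales.contains(requested)) {
            return requested;
        }
        for (String candidate : defaults) {
            if (contentLocales.contains(candidate)) {
                return candidate;
            }
        }
        return null;
    }
}
```

With the configuration shown above, a request for Spanish on a content that only exists in German would fall back to de in this sketch.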
In addition to the system-wide configuration of the available locales as described in section "Configuring the supported locales", you can restrict the locales that are available for a specific content (type). If the property locale-available is set at the XML file containing the content, or at one of its parent folders, then only the locales given there are selectable in the content editor. E.g., if you set locale-available to en,es, only English and Spanish will be available, but not German, even if it is configured. This is particularly helpful if you have contents used for configuration. They usually need only one language version, so you set locale-available to that one language at the content's XML files. This is best done by configuring default properties for newly generated contents.
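As a rough illustration of how a comma-separated locale-available value narrows the system-wide list, consider the following JDK-only Java sketch. The names are hypothetical, and dropping entries that are not configured system-wide is an assumption of this sketch, not documented OpenCms behavior.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only (hypothetical names): restricts the locales offered in the editor. */
class LocaleAvailableSketch {

    /**
     * Returns the configured locales if the property is unset or empty.
     * Otherwise returns the locales listed in the property value, in that order,
     * skipping entries that are not configured (an assumption of this sketch).
     */
    static List<String> selectableLocales(List<String> configured, String localeAvailable) {
        if (localeAvailable == null || localeAvailable.trim().isEmpty()) {
            return new ArrayList<>(configured);
        }
        List<String> result = new ArrayList<>();
        for (String entry : localeAvailable.split(",")) {
            String locale = entry.trim();
            if (configured.contains(locale) && !result.contains(locale)) {
                result.add(locale);
            }
        }
        return result;
    }
}
```

For the example above, the configured locales en, de, es combined with the property value en,es leave only en and es selectable.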
When rendering content using <cms:formatter> or <cms:contentload>, you typically use the properties value, valueList, etc. to access a content's values. When doing so, the content's values are automatically accessed in the language of the current request context, i.e., the language of the requested page, or, if this language does not exist, in the configured default(s). This is typically the intended behavior. But the CmsJspContentAccessBean (the type of the variable that <cms:formatter> and <cms:contentload> expose) used to access the content also allows access to the values in a specific locale via different properties. This is shown in the JSP snippet below.
<cms:formatter var="content">
  <!-- Get a value from the XML content in the selected locale -->
  ${content.localeValue['de']['Title']}
  <!-- Get a list of values from the XML content in the selected locale -->
  ${content.localeValueList['de']['Teaser']}
  <!-- Check if a value exists in the specified locale -->
  ${content.hasLocaleValue['de']['Title']}
</cms:formatter>
There are different ways to build a multilingual website. All have their pros and cons, and in the end your use case will determine the best solution. The next subsections provide an overview of possible solutions.
To understand the different approaches, you have to understand how OpenCms chooses the locale used to render a page (i.e., the locale for choosing the content's language and typically also for reading message bundles). The locale is determined by the locale property. If you request a page and the property is set at it, or at a parent folder, this locale is chosen for rendering. Thus, if the locale property is set to de, the page is rendered in German. If it is set to en, the page is rendered in English. This mechanism can be overridden by the request parameter __locale. If you provide that parameter with a configured locale as its value, e.g., __locale=en, then this locale will be used.
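The lookup order just described (request parameter first, then the locale property on the page or the nearest parent folder) can be sketched in plain Java as follows. This is a simplified model rather than the OpenCms implementation: properties are represented by a map from path to value, all names are hypothetical, and ignoring a __locale value that is not configured is an assumption.

```java
import java.util.List;
import java.util.Map;

/** Illustration only (hypothetical names): models how the rendering locale is chosen. */
class RequestLocaleSketch {

    /** Returns the parent folder of a path such as "/de/about/index.html", or null at the root. */
    static String parentOf(String path) {
        if (path.equals("/")) {
            return null;
        }
        String trimmed = path.endsWith("/") ? path.substring(0, path.length() - 1) : path;
        int idx = trimmed.lastIndexOf('/');
        if (idx < 0) {
            return null;
        }
        return trimmed.substring(0, idx + 1);
    }

    /** Walks from the resource up to the root and returns the first locale property found. */
    static String propertyLocale(String path, Map<String, String> localeProperties) {
        String current = path;
        while (current != null) {
            String value = localeProperties.get(current);
            if (value != null) {
                return value;
            }
            current = parentOf(current);
        }
        return null;
    }

    /** A configured __locale parameter wins; otherwise the locale property decides. */
    static String resolve(String path, String localeParam,
                          Map<String, String> localeProperties, List<String> configured) {
        if (localeParam != null && configured.contains(localeParam)) {
            return localeParam;
        }
        return propertyLocale(path, localeProperties);
    }
}
```

In this model, a page below a folder /de/ whose locale property is de resolves to de, unless the request carries __locale=en.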
In this approach you add language-specific sub-sites or sub-folders right below the root folder of your site. It's good practice to name these folders after the ISO 639-1 language codes, e.g., en or de. For each of these "language folders" you set the property locale to the locale of the sub-site it contains, so typically to the name of the folder. E.g., for the folder en you set the property locale to en. For example, if you want an English and a German version of your website, the main folder structure would look as shown in the figure below.
The index.html in the root folder is typically a JSP that redirects you to one of the language-specific versions according to some user information (e.g., the browser locale). In each of the sub-sites you then build a separate, language-specific version of your website. You can either completely separate the two sites by using sub-sitemaps as "language folders" and keeping content only locally in each of the language-specific sub-sites, or you can share content between the two sub-sites by storing it in the .content/ folder directly below the root folder of the complete site.
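The selection logic of such a redirecting index.html could look roughly like the following Java sketch. It only shows how a language folder might be picked from the browser's Accept-Language header using the JDK's Locale.LanguageRange and Locale.lookupTag. The class name, the method, and the fallback handling are hypothetical, and the actual redirect would be issued by the JSP.

```java
import java.util.List;
import java.util.Locale;

/** Illustration only (hypothetical names): chooses the language folder to redirect to. */
class LanguageRedirectSketch {

    /**
     * Returns the path of the language folder that best matches the Accept-Language
     * header, or the fallback folder if the header is missing, malformed, or unmatched.
     */
    static String targetFolder(String acceptLanguage, List<String> languageFolders, String fallback) {
        String match = null;
        if (acceptLanguage != null && !acceptLanguage.trim().isEmpty()) {
            try {
                List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(acceptLanguage);
                // RFC 4647 lookup: "de-DE" is truncated to "de" if only "de" is offered
                match = Locale.lookupTag(ranges, languageFolders);
            } catch (IllegalArgumentException e) {
                // Malformed header: keep match == null and use the fallback
            }
        }
        return "/" + (match != null ? match : fallback) + "/";
    }
}
```

A browser sending a header that prefers German would be sent to /de/ here, while a browser preferring an unsupported language would end up in the fallback folder.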
Advantages:
- Complete freedom to adjust each language version independently of the other versions
- Content can be published for single languages without affecting other languages (if content is not shared between the language-specific sites)
- Language-specific URLs are possible
Drawbacks:
- Each site must be maintained separately
- Sites can easily get out of sync
This approach is similar to the one from section "Independent subsites" concerning the folder structure, but you necessarily share content, i.e., put it under the root folder. For a bilingual site with German and English, the root folder will look exactly as shown in figure "top level locale folders", except that you may use normal folders instead of sub-sitemaps. The folder en will have the locale property set to en and the folder de will have the locale property set to de.
The main difference to the approach from section "Independent subsites" is that you do not add extra container pages as index.html files in each language-specific subsite. You create one language version, and for the other languages you add siblings of the "master" version's index.html files. For example, if you have a container page /en/about-us/index.html and you want to have the same page in the German subsite, you create /de/ueber-uns/index.html as a sibling of the file /en/about-us/index.html. The consequence is that all content in one language version is also directly available in all other language versions. If you remove a content, it is removed in all language versions.
Advantages:
- Freedom to define different page structures for each locale
- Language-specific URLs are possible
- Slightly better maintainability than the approach from section "Independent subsites".
Drawbacks:
- Siblings must be created manually for each locale version
- Contents can only be published for all languages at once
- The language-specific versions of each page automatically have the same contents
In this approach you have just one folder tree where each page is itself multilingual. To obtain the multilingual behavior, you have to use the __locale parameter to set the locale, as an alternative to using the locale property. At the moment the approach is possible, but not yet completely supported. You have to take care of creating correct links yourself: either by attaching the __locale parameter manually (in the formatters, etc.) to each link, or by replacing the default link substitution handler with a version that adds the parameter automatically.
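Attaching the parameter to a link amounts to a small string operation, sketched below in plain Java. The helper is hypothetical and not part of OpenCms. It assumes the locale is a simple code such as de that needs no URL encoding, and it leaves links untouched that already carry a __locale parameter.

```java
/** Illustration only (hypothetical names): appends the __locale parameter to a link. */
class LocaleLinkSketch {

    /**
     * Adds "__locale=<locale>" to the query string of the link, keeping an
     * existing query and fragment intact. Links that already contain the
     * parameter are returned unchanged.
     */
    static String withLocale(String link, String locale) {
        String fragment = "";
        int hash = link.indexOf('#');
        if (hash >= 0) {
            fragment = link.substring(hash);
            link = link.substring(0, hash);
        }
        if (link.contains("?__locale=") || link.contains("&__locale=")) {
            return link + fragment;
        }
        String separator = link.contains("?") ? "&" : "?";
        return link + separator + "__locale=" + locale + fragment;
    }
}
```

A custom link substitution handler could apply the same idea to every generated link, so that formatters do not have to do it individually.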
Advantages:
- You have to maintain only one site
- New pages are directly available in all languages when added
Drawbacks:
- You have to take care of attaching the __locale parameter to links
- Your URLs are the same for each language version
- Properties are unilingual, which may cause problems, in particular with navigations
It's also possible to have a different site for each language. To share content between sites, it must be placed in the /shared/ folder under the root site. With this option, approaches similar to those described in sections "Independent subsites" and "Sibling subsites" are possible. Just set the locale property accordingly at each site's root folder.