Package org.opencms.site.xmlsitemap
Class CmsXmlSitemapGenerator
java.lang.Object
    org.opencms.site.xmlsitemap.CmsXmlSitemapGenerator
Direct Known Subclasses:
CmsDetailPageDuplicateEliminatingSitemapGenerator
public class CmsXmlSitemapGenerator extends java.lang.Object
Class for generating XML sitemaps for SEO purposes, as described in http://www.sitemaps.org/protocol.html.
-
-
Nested Class Summary
Nested Classes
protected class CmsXmlSitemapGenerator.ResultEntry
A bean that consists of a sitemap URL bean and a priority score, to determine which of multiple entries with the same URL are to be preferred.
-
Field Summary
Fields
static java.lang.String DEFAULT_CHANGE_FREQUENCY
The default change frequency.
static double DEFAULT_PRIORITY
The default priority.
protected java.lang.String m_baseFolderRootPath
The root path for the sitemap root folder.
protected java.lang.String m_baseFolderSitePath
The site path of the base folder.
protected boolean m_computeContainerPageDates
Flag to control whether container page dates should be computed.
protected java.util.List<CmsDetailPageInfo> m_detailPageInfos
The list of detail page info beans.
protected java.util.Map<java.lang.String,java.util.List<CmsResource>> m_detailResources
A map from type names to lists of potential detail resources of that type.
protected com.google.common.collect.Multimap<java.lang.String,java.lang.String> m_detailTypesByPage
A multimap from detail page root paths to corresponding types.
protected CmsObject m_guestCms
A CMS context with guest privileges.
protected CmsPathIncludeExcludeSet m_includeExcludeSet
The include/exclude configuration used for choosing pages for the XML sitemap.
protected com.google.common.collect.Multimap<CmsUUID,CmsAlias> m_pageAliasesBelowBaseFolderByStructureId
A map from structure ids to page aliases below the base folder which point to the given structure id.
protected java.util.Map<java.lang.String,CmsXmlSitemapGenerator.ResultEntry> m_resultMap
The map used for storing the results, with URLs as keys.
protected CmsObject m_siteGuestCms
A guest user CMS object with the site root of the base folder.
protected java.lang.String m_siteRoot
The site root of the base folder.
protected java.lang.String m_siteRootLink
A link to the site root.
-
Constructor Summary
Constructors
CmsXmlSitemapGenerator(java.lang.String folderRootPath)
Creates a new sitemap generator instance.
-
Method Summary
Methods
protected void addDetailLinks(CmsResource containerPage, java.util.Locale locale)
Adds the detail page links for a given page to the results.
protected void addResult(CmsXmlSitemapUrlBean result, int resultPriority)
Adds a URL bean to the internal map of results, but only if there is no existing entry with higher internal priority than the priority given as an argument.
protected long computeContainerPageModificationDate(CmsResource containerPage)
Computes the container page modification date from its referenced contents.
java.util.List<CmsXmlSitemapUrlBean> generateSitemapBeans()
Generates a list of XML sitemap entry beans for the root folder which has been set in the constructor.
protected static java.lang.String getChangeFrequency(java.util.List<CmsProperty> properties)
Gets the change frequency for a sitemap entry from a list of properties.
protected java.lang.String getDetailLink(CmsResource pageRes, CmsResource detailRes, java.util.Locale locale)
Gets the detail link for a given container page and detail content.
protected java.util.List<I_CmsResourceType> getDetailTypesForPage(CmsResource resource)
Gets the types for which a given resource is configured as a detail page.
protected java.util.List<CmsResource> getDirectPages()
Gets the list of pages which should be directly added to the XML sitemap.
CmsPathIncludeExcludeSet getIncludeExcludeSet()
Gets the include/exclude configuration of this XML sitemap generator.
protected java.lang.String getInnerXmlForEntry(CmsXmlSitemapUrlBean entry)
Writes the inner node content for a url element to a buffer.
protected java.util.List<CmsResource> getNavigationPages()
Gets the list of pages from the navigation which should be directly added to the XML sitemap.
protected static double getPriority(java.util.List<CmsProperty> properties)
Gets the page priority from a list of properties.
protected java.lang.String getUrlSetOpenTag()
Gets the opening tag for the urlset element (can be overridden to add e.g. more namespaces).
protected java.lang.String getXmlForEntry(CmsXmlSitemapUrlBean entry)
Writes the XML for a URL entry to a buffer.
protected boolean isAliasBelowBaseFolder(CmsAlias alias)
Checks whether the given alias is below the base folder.
protected boolean isValidDetailPageCombination(CmsResource page, java.util.Locale locale, CmsResource detailRes)
Checks whether the page/detail content combination is a valid detail page.
protected static void removeInternalFiles(java.util.List<CmsResource> resources)
Removes files marked as internal from a resource list.
java.lang.String renderSitemap()
Generates a sitemap and formats it as a string.
protected java.lang.String replaceServerUri(java.lang.String link)
Replaces the protocol/host/port of a link with the ones from the configured server URI, if it's not empty.
static java.lang.String replaceServerUri(java.lang.String link, java.lang.String server)
Replaces the protocol/host/port of a link with the ones from the given server URI, if it's not empty.
void setComputeContainerPageDates(boolean computeContainerPageDates)
Enables or disables computation of container page dates.
void setServerUrl(java.lang.String serverUrl)
Sets the replacement server URL.
-
-
-
Field Detail
-
DEFAULT_CHANGE_FREQUENCY
public static final java.lang.String DEFAULT_CHANGE_FREQUENCY
The default change frequency.
See Also:
Constant Field Values
-
DEFAULT_PRIORITY
public static final double DEFAULT_PRIORITY
The default priority.
See Also:
Constant Field Values
-
m_baseFolderRootPath
protected java.lang.String m_baseFolderRootPath
The root path for the sitemap root folder.
-
m_baseFolderSitePath
protected java.lang.String m_baseFolderSitePath
The site path of the base folder.
-
m_computeContainerPageDates
protected boolean m_computeContainerPageDates
Flag to control whether container page dates should be computed.
-
m_detailPageInfos
protected java.util.List<CmsDetailPageInfo> m_detailPageInfos
The list of detail page info beans.
-
m_detailResources
protected java.util.Map<java.lang.String,java.util.List<CmsResource>> m_detailResources
A map from type names to lists of potential detail resources of that type.
-
m_detailTypesByPage
protected com.google.common.collect.Multimap<java.lang.String,java.lang.String> m_detailTypesByPage
A multimap from detail page root paths to corresponding types.
-
m_guestCms
protected CmsObject m_guestCms
A CMS context with guest privileges.
-
m_includeExcludeSet
protected CmsPathIncludeExcludeSet m_includeExcludeSet
The include/exclude configuration used for choosing pages for the XML sitemap.
-
m_pageAliasesBelowBaseFolderByStructureId
protected com.google.common.collect.Multimap<CmsUUID,CmsAlias> m_pageAliasesBelowBaseFolderByStructureId
A map from structure ids to page aliases below the base folder which point to the given structure id.
-
m_resultMap
protected java.util.Map<java.lang.String,CmsXmlSitemapGenerator.ResultEntry> m_resultMap
The map used for storing the results, with URLs as keys.
-
m_siteGuestCms
protected CmsObject m_siteGuestCms
A guest user CMS object with the site root of the base folder.
-
m_siteRoot
protected java.lang.String m_siteRoot
The site root of the base folder.
-
m_siteRootLink
protected java.lang.String m_siteRootLink
A link to the site root.
-
-
Constructor Detail
-
CmsXmlSitemapGenerator
public CmsXmlSitemapGenerator(java.lang.String folderRootPath) throws CmsException
Creates a new sitemap generator instance.
Parameters:
folderRootPath - the root folder for the XML sitemap to generate
Throws:
CmsException - if something goes wrong
-
-
Method Detail
-
replaceServerUri
public static java.lang.String replaceServerUri(java.lang.String link, java.lang.String server)
Replaces the protocol/host/port of a link with the ones from the given server URI, if it's not empty.
Parameters:
link - the link to change
server - the server URI string
Returns:
the changed link
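The replacement described above can be illustrated with a minimal, self-contained sketch built on java.net.URI. It is not the OpenCms implementation: the class name ServerUriSketch is hypothetical, the server string is assumed to be an absolute URI, and returning the link unchanged when parsing fails is an assumption made for this example.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class ServerUriSketch {

    /**
     * Swaps scheme, host and port of the link for those of the server URI.
     * Returns the link unchanged if the server string is empty (as the
     * description states) or, as an assumption of this sketch, if parsing fails.
     */
    static String replaceServerUri(String link, String server) {
        if (server == null || server.trim().isEmpty()) {
            return link;
        }
        try {
            URI linkUri = new URI(link);
            URI serverUri = new URI(server.trim());
            URI result = new URI(
                serverUri.getScheme(),
                null, // no user info
                serverUri.getHost(),
                serverUri.getPort(), // -1 means "no explicit port"
                linkUri.getPath(),
                linkUri.getQuery(),
                linkUri.getFragment());
            return result.toString();
        } catch (URISyntaxException e) {
            return link;
        }
    }

    public static void main(String[] args) {
        System.out.println(
            replaceServerUri("http://localhost:8080/en/index.html?a=1", "https://www.example.com"));
    }
}
```

Path, query and fragment are taken from the original link, so only the part of the URL that identifies the server changes.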
-
getChangeFrequency
protected static java.lang.String getChangeFrequency(java.util.List<CmsProperty> properties)
Gets the change frequency for a sitemap entry from a list of properties. If the change frequency is not defined in the properties, this method will return null.
Parameters:
properties - the properties from which the change frequency should be obtained
Returns:
the change frequency string
-
getPriority
protected static double getPriority(java.util.List<CmsProperty> properties)
Gets the page priority from a list of properties. If the page priority can't be found among the properties, -1 will be returned.
Parameters:
properties - the properties of a resource
Returns:
the page priority read from the properties, or -1
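The fallback contracts of getChangeFrequency (null when undefined) and getPriority (-1 when not found) can be sketched as follows. This is an illustration only: a plain Map stands in for the list of CmsProperty objects, the property keys are assumptions, and treating an unparsable priority value as "not found" is a choice made for this sketch.

```java
import java.util.Map;

final class SitemapPropertySketch {

    // Assumed property names, used here for illustration only.
    static final String PRIORITY_KEY = "xmlsitemap.priority";
    static final String CHANGE_FREQUENCY_KEY = "xmlsitemap.changefrequency";

    /** Returns the priority from the properties, or -1 if it is missing or not a number. */
    static double getPriority(Map<String, String> properties) {
        String value = properties.get(PRIORITY_KEY);
        if (value == null) {
            return -1;
        }
        try {
            return Double.parseDouble(value.trim());
        } catch (NumberFormatException e) {
            return -1; // assumption: unparsable values count as "not found"
        }
    }

    /** Returns the change frequency from the properties, or null if it is not defined. */
    static String getChangeFrequency(Map<String, String> properties) {
        String value = properties.get(CHANGE_FREQUENCY_KEY);
        if (value == null || value.trim().isEmpty()) {
            return null;
        }
        return value.trim();
    }

    public static void main(String[] args) {
        Map<String, String> properties = Map.of(PRIORITY_KEY, "0.8");
        System.out.println(getPriority(properties));
        System.out.println(getChangeFrequency(properties));
    }
}
```

A caller can check for -1 or null and then fall back to DEFAULT_PRIORITY or DEFAULT_CHANGE_FREQUENCY.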
-
removeInternalFiles
protected static void removeInternalFiles(java.util.List<CmsResource> resources)
Removes files marked as internal from a resource list.
Parameters:
resources - the list from which internal files should be removed
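Since the method returns void, the list passed in is modified in place. A minimal sketch of that pattern follows; the Resource class is a hypothetical stand-in for CmsResource, reduced to a path and an internal flag.

```java
import java.util.ArrayList;
import java.util.List;

final class InternalFilterSketch {

    /** Hypothetical stand-in for CmsResource with only the attributes used here. */
    static final class Resource {
        final String rootPath;
        final boolean internal;

        Resource(String rootPath, boolean internal) {
            this.rootPath = rootPath;
            this.internal = internal;
        }
    }

    /** Removes all resources flagged as internal from the given (mutable) list. */
    static void removeInternalFiles(List<Resource> resources) {
        resources.removeIf(resource -> resource.internal);
    }

    public static void main(String[] args) {
        List<Resource> resources = new ArrayList<>();
        resources.add(new Resource("/sites/default/index.html", false));
        resources.add(new Resource("/sites/default/.config", true));
        removeInternalFiles(resources);
        for (Resource resource : resources) {
            System.out.println(resource.rootPath);
        }
    }
}
```

Because the list is changed in place, callers of this sketch need to pass a mutable list such as an ArrayList.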
-
generateSitemapBeans
public java.util.List<CmsXmlSitemapUrlBean> generateSitemapBeans() throws CmsException
Generates a list of XML sitemap entry beans for the root folder which has been set in the constructor.
Returns:
the list of XML sitemap entries
Throws:
CmsException - if something goes wrong
-
getIncludeExcludeSet
public CmsPathIncludeExcludeSet getIncludeExcludeSet()
Gets the include/exclude configuration of this XML sitemap generator.
Returns:
the include/exclude configuration
-
renderSitemap
public java.lang.String renderSitemap() throws CmsException
Generates a sitemap and formats it as a string.
Returns:
the sitemap XML data
Throws:
CmsException - if something goes wrong
-
setComputeContainerPageDates
public void setComputeContainerPageDates(boolean computeContainerPageDates)
Enables or disables computation of container page dates.
Parameters:
computeContainerPageDates - the new value
-
setServerUrl
public void setServerUrl(java.lang.String serverUrl)
Sets the replacement server URL. The replacement server URL will replace the scheme/host/port from the URLs returned by getOnlineLink.
Parameters:
serverUrl - the server URL
-
addDetailLinks
protected void addDetailLinks(CmsResource containerPage, java.util.Locale locale) throws CmsException
Adds the detail page links for a given page to the results.
Parameters:
containerPage - the container page resource
locale - the locale of the container page
Throws:
CmsException - if something goes wrong
-
addResult
protected void addResult(CmsXmlSitemapUrlBean result, int resultPriority)
Adds a URL bean to the internal map of results, but only if there is no existing entry with higher internal priority than the priority given as an argument.
Parameters:
result - the result URL bean to add
resultPriority - the internal priority to use for updating the map of results
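The rule described above (an entry is stored unless an existing entry for the same URL has a higher internal priority) can be sketched with a plain map keyed by URL. The names ResultMapSketch and Entry are hypothetical stand-ins for the generator's m_resultMap and CmsXmlSitemapGenerator.ResultEntry. The sketch lets a new entry replace one of equal priority, which is how the wording above reads.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ResultMapSketch {

    /** Hypothetical stand-in for ResultEntry: URL data plus an internal priority. */
    static final class Entry {
        final String url;
        final long lastModified;
        final int priority;

        Entry(String url, long lastModified, int priority) {
            this.url = url;
            this.lastModified = lastModified;
            this.priority = priority;
        }
    }

    private final Map<String, Entry> resultMap = new LinkedHashMap<>();

    /** Stores the entry unless an entry with strictly higher priority already exists for the URL. */
    void addResult(String url, long lastModified, int resultPriority) {
        Entry existing = resultMap.get(url);
        if (existing != null && existing.priority > resultPriority) {
            return; // keep the preferred existing entry
        }
        resultMap.put(url, new Entry(url, lastModified, resultPriority));
    }

    Entry get(String url) {
        return resultMap.get(url);
    }

    public static void main(String[] args) {
        ResultMapSketch results = new ResultMapSketch();
        results.addResult("https://www.example.com/news/", 1000L, 5);
        results.addResult("https://www.example.com/news/", 2000L, 3); // ignored, lower priority
        System.out.println(results.get("https://www.example.com/news/").lastModified);
    }
}
```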
-
computeContainerPageModificationDate
protected long computeContainerPageModificationDate(CmsResource containerPage) throws CmsException
Computes the container page modification date from its referenced contents.
Parameters:
containerPage - the container page
Returns:
the computed modification date
Throws:
CmsException - if something goes wrong
-
getDetailLink
protected java.lang.String getDetailLink(CmsResource pageRes, CmsResource detailRes, java.util.Locale locale)
Gets the detail link for a given container page and detail content.
Parameters:
pageRes - the container page
detailRes - the detail content
locale - the locale for which we want the link
Returns:
the detail page link
-
getDetailTypesForPage
protected java.util.List<I_CmsResourceType> getDetailTypesForPage(CmsResource resource)
Gets the types for which a given resource is configured as a detail page.
Parameters:
resource - a resource for which we want to find the detail page types
Returns:
the list of resource types for which the given page is configured as a detail page
-
getDirectPages
protected java.util.List<CmsResource> getDirectPages() throws CmsException
Gets the list of pages which should be directly added to the XML sitemap.
Returns:
the list of resources which should be directly added to the XML sitemap
Throws:
CmsException - if something goes wrong
-
getInnerXmlForEntry
protected java.lang.String getInnerXmlForEntry(CmsXmlSitemapUrlBean entry)
Writes the inner node content for a url element to a buffer.
Parameters:
entry - the entry for which the content should be written
Returns:
the inner XML
-
getNavigationPages
protected java.util.List<CmsResource> getNavigationPages()
Gets the list of pages from the navigation which should be directly added to the XML sitemap.
Returns:
the list of pages to add to the XML sitemap
-
getUrlSetOpenTag
protected java.lang.String getUrlSetOpenTag()
Gets the opening tag for the urlset element (can be overridden to add e.g. more namespaces).
Returns:
the opening tag
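The override pattern mentioned above might look like the following sketch. The two classes are hypothetical stand-ins, not CmsXmlSitemapGenerator or a real subclass. The tag returned by the base class is an assumption based on the sitemaps.org namespace, and the extra image namespace is only one example of something a subclass could add.

```java
import java.util.List;

class UrlSetBaseSketch {

    /** Opening tag of the root element; subclasses may override it to declare more namespaces. */
    protected String getUrlSetOpenTag() {
        return "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">";
    }

    /** Wraps already rendered url elements in the urlset element. */
    final String render(List<String> urlElements) {
        StringBuilder buffer = new StringBuilder();
        buffer.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        buffer.append(getUrlSetOpenTag()).append("\n");
        for (String urlElement : urlElements) {
            buffer.append(urlElement).append("\n");
        }
        buffer.append("</urlset>");
        return buffer.toString();
    }
}

class ImageUrlSetSketch extends UrlSetBaseSketch {

    @Override
    protected String getUrlSetOpenTag() {
        // Adds an image namespace next to the standard sitemap namespace.
        return "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\""
            + " xmlns:image=\"http://www.google.com/schemas/sitemap-image/1.1\">";
    }

    public static void main(String[] args) {
        System.out.println(
            new ImageUrlSetSketch().render(List.of("<url><loc>https://www.example.com/</loc></url>")));
    }
}
```

Only the opening tag changes in the subclass, so the rest of the rendering logic is reused as is.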
-
getXmlForEntry
protected java.lang.String getXmlForEntry(CmsXmlSitemapUrlBean entry)
Writes the XML for a URL entry to a buffer.
Parameters:
entry - the XML sitemap entry bean
Returns:
an XML representation of this bean
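As a rough illustration of what getXmlForEntry and getInnerXmlForEntry produce, the sketch below renders a url element in the format defined by the sitemaps.org protocol (loc, lastmod, changefreq, priority). It is not the OpenCms code. The method parameters replace the CmsXmlSitemapUrlBean argument, and the rules for omitting optional elements (non-positive date, null change frequency, negative priority) are assumptions of this sketch.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.util.Locale;

final class UrlEntryXmlSketch {

    /** Escapes the five characters that the sitemap protocol requires to be entity-escaped. */
    static String escapeXml(String text) {
        return text.replace("&", "&amp;")
            .replace("<", "&lt;")
            .replace(">", "&gt;")
            .replace("\"", "&quot;")
            .replace("'", "&apos;");
    }

    /** Builds the child elements of a url element; optional elements are skipped if unset. */
    static String getInnerXmlForEntry(String url, long lastModified, String changeFrequency, double priority) {
        StringBuilder buffer = new StringBuilder();
        buffer.append("<loc>").append(escapeXml(url)).append("</loc>");
        if (lastModified > 0) {
            // W3C date format (YYYY-MM-DD), computed in UTC
            String date = Instant.ofEpochMilli(lastModified).atZone(ZoneOffset.UTC).toLocalDate().toString();
            buffer.append("<lastmod>").append(date).append("</lastmod>");
        }
        if (changeFrequency != null) {
            buffer.append("<changefreq>").append(escapeXml(changeFrequency)).append("</changefreq>");
        }
        if (priority >= 0) {
            buffer.append("<priority>").append(String.format(Locale.ROOT, "%.1f", priority)).append("</priority>");
        }
        return buffer.toString();
    }

    /** Wraps the inner XML in a url element. */
    static String getXmlForEntry(String url, long lastModified, String changeFrequency, double priority) {
        return "<url>" + getInnerXmlForEntry(url, lastModified, changeFrequency, priority) + "</url>";
    }

    public static void main(String[] args) {
        System.out.println(getXmlForEntry("https://www.example.com/?a=1&b=2", 86400000L, "daily", 0.5));
    }
}
```

Splitting the inner content from the wrapping element mirrors the two methods in this class, so a subclass can extend the inner content without rebuilding the whole element.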
-
isAliasBelowBaseFolder
protected boolean isAliasBelowBaseFolder(CmsAlias alias)
Checks whether the given alias is below the base folder.
Parameters:
alias - the alias to check
Returns:
true if the alias is below the base folder
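One plausible way to perform such a check is a prefix comparison on root paths, sketched below. This is an assumption about the approach, not the actual implementation: the class and method names are hypothetical, and the alias is represented by its site root and alias path rather than a CmsAlias object.

```java
final class AliasPathSketch {

    /** Collapses repeated slashes and guarantees a trailing slash, so prefixes match whole folders. */
    static String toFolderPath(String path) {
        String normalized = path.replaceAll("/+", "/");
        return normalized.endsWith("/") ? normalized : normalized + "/";
    }

    /** Returns true if siteRoot + aliasPath lies in or below the base folder root path. */
    static boolean isBelowBaseFolder(String siteRoot, String aliasPath, String baseFolderRootPath) {
        String aliasRootPath = toFolderPath(siteRoot + "/" + aliasPath);
        String baseFolder = toFolderPath(baseFolderRootPath);
        return aliasRootPath.startsWith(baseFolder);
    }

    public static void main(String[] args) {
        System.out.println(isBelowBaseFolder("/sites/default", "/news/summer", "/sites/default/news/"));
        System.out.println(isBelowBaseFolder("/sites/default", "/newsletter", "/sites/default/news/"));
    }
}
```

The trailing slash matters: without it, /sites/default/newsletter would wrongly count as lying below /sites/default/news.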
-
isValidDetailPageCombination
protected boolean isValidDetailPageCombination(CmsResource page, java.util.Locale locale, CmsResource detailRes)
Checks whether the page/detail content combination is a valid detail page.
Parameters:
page - the container page
locale - the locale
detailRes - the detail content resource
Returns:
true if this is a valid detail page combination
-
replaceServerUri
protected java.lang.String replaceServerUri(java.lang.String link)
Replaces the protocol/host/port of a link with the ones from the configured server URI, if it's not empty.
Parameters:
link - the link to change
Returns:
the changed link
-
-