Mavention.nl v3: How we did it? - The Landscape
Planning for the landscape
Over the last few years we noticed quite a few visits from outside the Netherlands. Although it’s not fool-proof, we assumed that those visitors would not speak Dutch. After analyzing the web analytics data for our website we found that more than 60% of our visitors came from outside the Netherlands and therefore very likely did not speak Dutch, while all that time the content presenting our company and services was available in Dutch only! While planning our new website, one thing we knew for sure was that it was high time for an English website targeted at our foreign visitors.
One of the most important investments in the SharePoint 2013 platform related to Web Content Management is the Cross-Site Publishing (XSP) capability that allows you to publish content across Site Collections but also across Farms (!) by using search. While planning for our new website we decided we had to have a look at this new capability just to be able to experience what it means to design and build for XSP.
If you’re interested in other investments in the Web Content Management area of SharePoint 2013 I suggest you watch the following video http://www.microsoft.com/resources/technet/en-us/office/media/video/video.html?cid=stc&from=mscomstc&VideoID=7923b059-abb9-430d-85d9-3abd86a7b40c by Vesa Juvonen where he walks you through all new and improved WCM areas in SharePoint 2013.
Planning for the landscape wasn’t that difficult; it was the design process that turned out to be quite challenging.
Designing for the landscape
One of the things that we started designing for was the support for multilingual content. We started with it, because we suspected it to have a major impact on how our sites would be laid out. As it turned out we weren’t that far from the truth.
To help you understand the landscape of our website, the following schema illustrates what we ended up with.
On the left-hand side is the authoring Site Collection where we create and manage our content. On the right-hand side are publishing Site Collections where the content is published and which are publicly available. At the bottom of the stack is the assets site where we store documents, images, etc. that we refer to from the content on both the .nl and .com sites. All publishing sites are Host-Named Site Collections.
Note that although the schema shows the mavention.com site, it’s not yet available. We’re working hard to get the content translated and we will be launching it soon so stay tuned.
Designing for the multilingual
If you’ve worked with multilingual solutions on the SharePoint platform you probably know that the standard approach to building multilingual websites in SharePoint is to use Variations. In such an approach one of the variation languages becomes the source, and the content is published to all other languages either automatically or manually, depending on the configuration.
SharePoint 2013 is no different in this area. However, if you are using the Cross-Site Publishing capability, Variations is how you manage your content and not necessarily how you publish it.
If you look at the schema of our website, the authoring site is a Publishing Site using Variations, just as you would build it in SharePoint 2007 or 2010. The only difference from previous versions of SharePoint is that we use the authoring site solely to manage the content and that it is not publicly available.
Mavention.nl authoring site
In order to get the content published to our .nl and .com sites we have published Pages Libraries as Catalogs with anonymous access enabled so that all of the content is available to our visitors. The .nl and .com sites connect respectively to the Dutch and English Catalog from the authoring site to retrieve their content.
Although we are translating content to English, both the Dutch and the English sites use the same images. As you might know, images are binaries and as such are not included in the search index (the metadata is, but the binary itself isn’t). To avoid storing the same assets twice we decided to store them in a separate Site Collection. We could have chosen to store them in the authoring Site Collection, but that would mean enabling anonymous access on it, which would not only lower security but also limit the possibilities for optimizing the site for content management.
Designing for Cross-Site Publishing
Cross-Site Publishing is a new concept in SharePoint 2013 that is based on different paradigms than you might be used to from previous versions of SharePoint. The basic idea behind Cross-Site Publishing is that the content is created and remains in one place, and is made available for publishing in one or more places via the SharePoint Search index. This is the opposite of the publishing model based on Content Deployment that we know from previous versions of SharePoint, where content is copied to the locations where it will be published.
One of the prerequisites for using Cross-Site Publishing is tagging the content using taxonomies defined in the Managed Metadata Service. It’s not the physical structure or location in the authoring site that defines the content hierarchy but the Taxonomy that is used for tagging content.
In the local Term Group of the authoring site we created a Term Set called Authoring – Mavention.nl Navigation, which we use for tagging all the pages within our website.
Within this Term Set we have created a few Terms that describe all the different types of content that we have on our site such as News, Products or Blogs. In our scenario for the most part these Terms correspond to Content Types that we use for creating content in the authoring site.
You might have noticed that all of these Terms are nested under the Pages Term. This is done on purpose: when you connect to a Catalog and decide to pin the tagging hierarchy to the navigation, you can choose only one Term and you cannot select the whole Term Set. Because our main navigation corresponds to the different types of information that we have on our site, we decided to pin the whole structure to the Global Navigation.
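The hierarchy described above could be scripted roughly as follows. This is only a sketch under stated assumptions, not the way the site was actually provisioned: the authoring URL is a placeholder, the farm is assumed to have a single Managed Metadata Service connection, and only the Term names mentioned in this article are created.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Placeholder URL of the authoring Site Collection (assumption)
$site = Get-SPSite "http://authoring.example.local"
$session = Get-SPTaxonomySession -Site $site
$termStore = $session.TermStores[0]   # assumes one Managed Metadata Service connection

# Local Term Group that belongs to the authoring Site Collection
$group = $termStore.GetSiteCollectionGroup($site)

$lcid = 1043   # Dutch (Netherlands)
$termSet = $group.CreateTermSet("Authoring – Mavention.nl Navigation")

# A single parent Term, so the whole structure can be pinned to the navigation
$pages = $termSet.CreateTerm("Pages", $lcid)
foreach ($name in "News", "Products", "Blogs") {
    $pages.CreateTerm($name, $lcid) | Out-Null
}

$termStore.CommitAll()   # persist the changes to the Term Store
$site.Dispose()
```

Nesting everything under one Term is what makes it possible to select the complete structure when connecting to the Catalog.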
All Term Sets that we use on the authoring site are located in a Local Term Group, which in theory is accessible to the authoring site only. However, as soon as you try to connect to a Catalog from a different Site Collection, that Site Collection needs access to those Terms. One of the changes to the Managed Metadata Service in SharePoint 2013 is the ability to grant other Site Collections access to a Local Term Group. For our website we have granted all publishing sites access to the Local Term Group of the authoring site.
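Besides the Term Store Management Tool, this access can also be granted from PowerShell through the taxonomy object model. The following is a minimal sketch: all URLs are placeholders rather than our real addresses, and it assumes a single Managed Metadata Service connection.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Placeholder URLs (assumptions)
$authoring = Get-SPSite "http://authoring.example.local"
$publishingUrls = "http://www.example.nl", "http://www.example.com"

$session = Get-SPTaxonomySession -Site $authoring
$termStore = $session.TermStores[0]
$group = $termStore.GetSiteCollectionGroup($authoring)

foreach ($url in $publishingUrls) {
    $publishing = Get-SPSite $url
    # Grant the publishing Site Collection access to the Local Term Group
    $group.AddSiteCollectionAccess($publishing.ID)
    $publishing.Dispose()
}

$termStore.CommitAll()
$authoring.Dispose()
```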
Using the Managed Metadata Service for tagging and describing the hierarchy is great, especially in the context of multilingual websites. When working with the Managed Metadata Service in SharePoint 2013 you can define additional languages for your Terms, and the great thing is that you don’t necessarily need to have the Language Packs installed for those languages! When defining additional languages, all you have to do is choose Other locales in the drop-down box and pick any language that you need. For our website we have set Dutch as the Default Language and added English as an additional Working Language.
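The same language configuration can be approximated in PowerShell. Treat the following as a sketch only: the URL is a placeholder, a single Managed Metadata Service connection is assumed, and note that these settings apply to the whole Term Store rather than to a single Term Set.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$site = Get-SPSite "http://authoring.example.local"   # placeholder URL (assumption)
$termStore = (Get-SPTaxonomySession -Site $site).TermStores[0]

$dutch = 1043     # Dutch (Netherlands)
$english = 1033   # English (United States)

# Add each language only if it is not yet a working language of the Term Store
foreach ($lcid in $dutch, $english) {
    if ($termStore.Languages -notcontains $lcid) {
        $termStore.AddLanguage($lcid)
    }
}

$termStore.DefaultLanguage = $dutch
$termStore.CommitAll()
$site.Dispose()
```

Once English is available as a working language, each Term can be given an English label next to its Dutch default label.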
Designing for publishing
Once the process of configuring and publishing a Catalog is completed the last thing that needs to be done is to connect to a Catalog to retrieve the data. On our authoring site we have two Variations Sites (Dutch and English) each with its own Pages Library. Although it’s not necessary for you to use Publishing Pages on the authoring site, we decided to do so to be able to preview the content before it’s published.
I will share more about our authoring site in future articles of the Mavention.nl v3: How we did it? series.
In SharePoint 2007 or 2010, if you used Variations, your language sites would be published as subsites of the root website. One consequence of this approach is that, from the Search Engine Optimization perspective, you cannot benefit from the additional page rank points that are added to local search results (i.e. results for which the domain name suffix matches the language in which the user searches). In SharePoint 2013, however, if you’re using the new Cross-Site Publishing capability it’s fairly easy to map Variations sites to Country Code Top Level Domains (ccTLDs), which is exactly what we have done on our website.
If you recall the schema of our landscape, we have a single authoring Site Collection which contains both Dutch and English content, and two separate publishing Site Collections that display the content. To map the English content from the English Variation site to the .com publishing site and the Dutch content from the Dutch Variation site to the .nl publishing site, all we had to do was connect the right site to the right Catalog.
Because the only difference between the two sites is the catalog to which they are connected we were able to reuse all of the configuration and had very little work (besides getting the content translated) to do to get the whole configuration to work.
Designing for content management
The majority of our content is created and maintained in the authoring Site Collection which is accessible only to us (authenticated users). Publishing Site Collections on the other hand are used to display the content published using the Cross-Site Publishing capability. Although most of the traffic to the publishing Site Collections is anonymous over HTTP, once in a while we have to authenticate on the publishing Site Collections for maintenance purposes. Whenever we authenticate we want to be able to do it in a secure manner. Following is the schema of our publishing sites.
First we have the Publishing Web Application. This Web Application consists of two zones, Default and Intranet, each mapped to its own IIS site and both using the same Application Pool. The Default Zone allows anonymous access only; if you want to authenticate, you can do so only through the Intranet Zone. Should you try to authenticate using the Default Zone, you will immediately get a 403 Access Denied response.
As mentioned before all the publishing Site Collections are Host-Named Site Collections. In our configuration we make use of the new SPSiteUrl capability that allows us not only to assign multiple URLs to a single Host-Named Site Collection but also to have them mapped to Zones defined on the Web Application level and with that use different authentication mechanisms for different URLs.
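As an illustration, the mapping described above can be set up with the `New-SPSite` and `Set-SPSiteUrl` cmdlets. This is a minimal sketch, not our actual provisioning script: every URL, the owner account and the site template are placeholders chosen for the example.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Placeholder values (assumptions)
$webApp = Get-SPWebApplication "http://publishing.example.local"
$publicUrl = "http://www.example.nl"        # anonymous URL, Default Zone
$editUrl = "https://edit.example.nl"        # authenticated URL, Intranet Zone

# Create the Host-Named Site Collection in the Publishing Web Application
New-SPSite -Url $publicUrl -HostHeaderWebApplication $webApp `
    -OwnerAlias "EXAMPLE\spadmin" -Template "CMSPUBLISHING#0"

# Assign a second URL to the same Site Collection and map it to the Intranet Zone
Set-SPSiteUrl (Get-SPSite $publicUrl) -Url $editUrl -Zone Intranet

# List all URLs and zones assigned to the Site Collection
Get-SPSiteUrl -Identity $publicUrl
```

Because the second URL is bound to the Intranet Zone, requests to it use the authentication settings of that zone, while the public URL keeps serving anonymous traffic through the Default Zone.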
Content By Search
While reading about the new capabilities of SharePoint 2013 you might have stumbled upon the term Content By Search. Some use it to denote the Cross-Site Publishing concept, but it’s also used within SharePoint itself, for example in the Content By Search Web Part, which is used to display content published through search.
When thinking about this new concept, many people are concerned about the delay involved in getting content published: after all, it first has to be indexed by SharePoint Search. To minimize both the time it takes to index the content and the impact of crawling SharePoint content so frequently, SharePoint 2013 introduces the concept of Continuous Crawling, which is more efficient than Incremental Crawling. By default the Continuous Crawl process executes every 15 minutes, but you can easily make it run more often using the following PowerShell snippet:
$ssa = Get-SPEnterpriseSearchServiceApplication
$ssa.SetProperty("ContinuousCrawlInterval", 1)
The code snippet above sets the crawl interval to 1 minute, which is exactly what we are using on our website. This allows us to have content published with almost no noticeable delay.
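The interval only has an effect on content sources that have Continuous Crawl enabled. The sketch below shows one way to enable it and to read the interval back. It assumes a farm with a single Search Service Application and a content source that still carries the default name Local SharePoint sites; adjust both to your environment.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Assumes exactly one Search Service Application in the farm
$ssa = Get-SPEnterpriseSearchServiceApplication

# Assumes the default content source name
$source = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa `
    -Identity "Local SharePoint sites"

# Switch the content source to Continuous Crawl
Set-SPEnterpriseSearchCrawlContentSource -Identity $source -EnableContinuousCrawls $true

# Read back the interval (in minutes) to verify the setting
$ssa.GetProperty("ContinuousCrawlInterval")
```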
Building for the landscape
Although our landscape applies to one website (from the business perspective) all the separate sites have their own characteristics. When building our website we decided to take all the different sites and their purposes into account and separate out all the different components.
The authoring site is a regular Publishing Site with Site Columns, Content Types, Page Layouts and Publishing Pages. However, because it’s used for content management purposes only, it doesn’t need the branding that we use on the publishing sites. Additionally, as all of the structure is described using Taxonomy, we use a single web (one for each language) to store all of the content, which is just fine for our website.
Another type of site in our landscape is the publishing site, where nearly all of the content comes from search. Because of this we have no use for the Site Columns, Content Types or Page Layouts that we use on the authoring site. Instead we need a separate set of assets which are branded the way our visitors will see them and which display all of the content based on the data retrieved from search.
Finally we have the assets site which is nothing more than storage and which doesn’t need any branding or specific functionality at all.
From the solution perspective we ended up with three projects:
- Mavention.nl Publishing – which contains assets and configuration for publishing sites only (e.g. Search-based Page Layouts)
- Mavention.nl Authoring – which contains assets and configuration for the authoring site (e.g. Site Columns, Content Types, Page Layouts)
Each of those projects produces its own WSP which we can then deploy to the specific Web Application (i.e. authoring or publishing). Such isolation offers us flexibility and helps us ensure that we can get the most out of every site in our landscape.
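Deploying a WSP to one specific Web Application could look roughly like this. The file names, paths and URLs are hypothetical and only illustrate the idea of scoping each package to its own Web Application; the `-GACDeployment` switch is an assumption that only applies if the package contains assemblies.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Hypothetical package names, paths and Web Application URLs (assumptions)
$packages = @(
    @{ Path = "C:\Deploy\Mavention.Authoring.wsp";  WebApp = "http://authoring.example.local" },
    @{ Path = "C:\Deploy\Mavention.Publishing.wsp"; WebApp = "http://publishing.example.local" }
)

foreach ($package in $packages) {
    # Upload the package to the farm solution store
    $solution = Add-SPSolution -LiteralPath $package.Path

    # Deploy it only to the Web Application it is meant for.
    # Deployment runs as a timer job, so it completes asynchronously.
    Install-SPSolution -Identity $solution.Name -WebApplication $package.WebApp -GACDeployment
}
```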
In this article I have shown you the landscape of our website. We discussed which sites we are using and what purpose they serve. In the upcoming articles we will zoom in on the specific functionality of those websites and I will show you how we optimized our website for content management as well as for anonymous visitors. Stay tuned!