Inconvenient backup and restore of Sites with Catalog Connections to different URLs


Using Catalog Connections in SharePoint 2013 offers great flexibility with regard to content reuse within as well as across Farms. Unfortunately, as Catalog Connections are based on URLs, they break if you restore your Site Collections to different URLs. So is rebuilding Catalog Connections the only option or is there a better way to work with Catalog Connections and backup and restore of Site Collections to different URLs?

Cross-Site Publishing in SharePoint 2013

SharePoint 2013 contains many new and improved capabilities for building public-facing websites. One of the new features is Cross-Site Publishing (XSP), which we can use to publish content from one Site Collection to another. In XSP scenarios the content is stored in Catalogs: Lists configured to contain content that can be reused in every Site in, or even outside, your Farm. Once the contents of a Catalog are included in the Search Index, a Site can connect to that Catalog and publish its contents.

Backup and restore is what we do

When building public-facing websites, it’s common practice to back up and restore sites to different environments during the lifecycle of a website. Making the same site available in other environments allows us to perform tests on that site or add new functionality without impacting the availability of the production site.

Although backing up and restoring a public-facing website built on the SharePoint 2013 platform is rather straightforward, it becomes quite inconvenient when the site to restore is connected to a Catalog.

Inconvenient backup and restore of Sites with Catalog Connections to different URLs

When connecting to a Catalog, SharePoint 2013 uses a wizard to guide you through the process of configuring that connection.

SharePoint 2013 Catalog Connection Wizard

While connecting to a Catalog, you can set a number of configuration options. Together, these settings make up a Catalog Connection Settings entry in the Property Bag of your Site. One of the settings stored in that entry is the URL of the Site that published the Catalog. And this is exactly where the problems begin when restoring sites to different URLs: the Catalog Connection breaks.
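To see which URLs a Site Collection's Catalog Connections currently store, something like the following sketch could be run in the SharePoint 2013 Management Shell. It is only an illustration: the $SiteUrl parameter is a placeholder, and it assumes that the ConnectedPublishingCatalogs property of CatalogConnectionManager and the listed settings properties are available in your environment.

```powershell
param(
    $SiteUrl   # placeholder: URL of the Site Collection that connects to the Catalog
)

$site = Get-SPSite $SiteUrl
$catalogConfig = New-Object Microsoft.SharePoint.Publishing.CatalogConnectionManager($site)

# Assumption: ConnectedPublishingCatalogs returns the stored Catalog Connection Settings
$catalogConfig.ConnectedPublishingCatalogs |
    Select-Object CatalogName, CatalogUrl, CatalogSiteUrl, ResultSourceId |
    Format-List

$site.Dispose()
```

If the CatalogSiteUrl and CatalogUrl values still point to the environment the backup came from, the connection is a candidate for the fix described later in this article.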

Although SharePoint 2013 won’t report any errors or warnings when you restore a Site Collection with Catalog Connections to another URL, things will break behind the scenes. The symptom that is easiest to notice is that URLs of Catalog Items are no longer rewritten and point to the site that published the Catalog instead.

URL of a Catalog Item pointing to the authoring site instead of being rewritten

Another thing you will notice is that, when you go to the Manage catalog connections page in Site Settings and click your catalog connection, the Catalog Item URL Format section shows an error message stating that the properties specified by the shared catalog could not be found in the search schema. You will see this error even though the properties mentioned in the message do exist!

The ‘Properties specified by the shared catalog could not be found in search schema’ error message

Inconvenient fixing Catalog Connections after restoring Sites to different URLs

The first thing you might think of when facing a broken Catalog Connection is to change the Catalog URL and make it point to the new URL. Unfortunately, as SharePoint uses the Catalog URL to identify Catalog Connections, the URL cannot be changed once the connection has been configured.

Another approach you might consider is disconnecting from the Catalog that no longer exists and connecting to the same Catalog using the new URL. Unfortunately, if you decided to reuse the content from the Catalog in your Site, at this stage your site contains the Result Source, Category and Catalog Item Pages, and Terms, all of them configured to work with the content from that Catalog. If you were to recreate the Catalog Connection, you would have to remove all those items and start from scratch, which is tedious and rather inconvenient. Luckily, there is a better way.

Fixing Catalog Connections after restoring Sites to different URLs

To fix a Catalog Connection after restoring a Site Collection to a different URL, the first thing you need to do is remove the Search Result Source that was created when you previously connected to the Catalog (if you want to keep it for reference, you can rename it instead). If you proceed without doing this, you will get an exception later on, stating that you are trying to create a Search Object (a Search Result Source in this case) that already exists. This error occurs because, although Search Result Sources have unique IDs, they also need to have unique names, and when connecting to a Catalog, SharePoint uses the name of the Catalog to name the newly created Result Source.

Note: If you have multiple Result Sources pointing to a Catalog that were created by copying the Result Source SharePoint created automatically when connecting to the Catalog, you don’t need to remove all of them. In that case you only need to remove the one Result Source that was created by SharePoint. You will, however, still need to edit all other Result Sources to change the URL of the Catalog’s Site Collection.
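Removing the Result Source can be done in Site Settings, but it could also be scripted. The following is a minimal sketch and not part of the original fix: the parameter names are placeholders, and it assumes that the farm has a single Search Service Application and that the Result Source is owned at the Site Collection level.

```powershell
param(
    $SiteUrl,            # placeholder: URL of the Site Collection that connects to the Catalog
    $ResultSourceName    # placeholder: name of the Result Source created by the old connection
)

$site = Get-SPSite $SiteUrl

# Assumption: exactly one Search Service Application exists in the farm
$ssa = Get-SPEnterpriseSearchServiceApplication
$fedManager = New-Object Microsoft.Office.Server.Search.Administration.Query.FederationManager($ssa)

# Assumption: the Result Source lives at the Site Collection level
$level = [Microsoft.Office.Server.Search.Administration.SearchObjectLevel]::SPSite
$owner = New-Object Microsoft.Office.Server.Search.Administration.SearchObjectOwner($level, $site.RootWeb)

$source = $fedManager.GetSourceByName($ResultSourceName, $owner)
if ($source -ne $null) {
    # Remove the Result Source so that reconnecting does not hit a name conflict
    $fedManager.RemoveSource($source)
}

$site.Dispose()
```

If nothing is found, the Result Source may be owned at a different level (for example the Site or the Search Service Application), in which case the owner in the sketch would need to be adjusted.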

The next step is to reconnect to the Catalog. This however needs to be done programmatically. When using the UI, you would get an error trying to repin Terms that are already a part of the Managed Navigation of your Site Collection from the previous Catalog Connection. Following is the PowerShell script that you need to run to reconnect to the Catalog using a new URL:

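param(
    $SiteUrl,              # URL of the Site Collection that connects to the Catalog
    $OldCatalogUrl,        # URL of the Catalog stored in the old (broken) connection
    $NewCatalogSiteUrl,    # new URL of the Site Collection that publishes the Catalog
    $NewCatalogUrl         # new URL of the Catalog List
)

$site = Get-SPSite $SiteUrl
$catalogConfig = New-Object Microsoft.SharePoint.Publishing.CatalogConnectionManager($site)
$catalog = $catalogConfig.GetCatalogConnectionSettings($OldCatalogUrl)
$catalog.CatalogSiteUrl = $NewCatalogSiteUrl
$catalog.CatalogUrl = $NewCatalogUrl
$catalog.ResultSourceId = [Guid]::NewGuid()
$catalogConfig.AddCatalogConnection($catalog)
$catalogConfig.Update()
$site.Dispose()            # release the SPSite object once the connection has been saved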

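This script takes the following parameters:

SiteUrl: the URL of the Site Collection that connects to the Catalog

OldCatalogUrl: the URL of the Catalog as stored in the old (broken) Catalog Connection

NewCatalogSiteUrl: the new URL of the Site Collection that publishes the Catalog

NewCatalogUrl: the new URL of the Catalog List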

If we were restoring two Site Collections, Authoring (which publishes the Catalog; located at https://authoring.mavention.nl, with the Catalog located at https://authoring.mavention.nl/nl-nl/Pages) and Publishing (which connects to the Catalog; located at http://www.mavention.nl), to a local development environment, we would call the above script as follows:

.\Add-CatalogConnection.ps1 "http://www.mavention.nl.local" "https://authoring.mavention.nl/nl-nl/Pages" "https://authoring.mavention.nl.local" "https://authoring.mavention.nl.local/nl-nl/Pages"

The script begins by retrieving the information about the old (broken) Catalog Connection (lines 8-10). Next, the script sets the URL of the Site Collection publishing the Catalog and the URL of the Catalog List itself to the new URLs (lines 11-12). To avoid conflicts (especially should you choose to keep the old Result Source), a new GUID is assigned to the Catalog Connection's Result Source (line 13). With all the configuration done, the script uses the information to add a new Catalog Connection (lines 14-15). By reusing the information from the old Catalog Connection and only changing the Catalog’s URLs, all of the previously configured settings are preserved. After calling this script you will see the old as well as the new Catalog Connection on the Manage catalog connections page.

Two Catalog Connections on the ‘Manage catalog connections’ page

Without changing anything else, if you now check the URLs of your Catalog Items, you should see them being rewritten as expected.

URL of a Catalog Item rewritten and relative to the publishing Site Collection

After verifying that everything works as expected you are safe to remove the old Catalog Connection.
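The old connection can be removed from the Manage catalog connections page. If you prefer to script this step as well, a sketch along the following lines might work. It is an assumption rather than part of the original fix: it relies on the Contains and DeleteCatalogConnection methods of CatalogConnectionManager, and the parameter names are placeholders.

```powershell
param(
    $SiteUrl,        # placeholder: URL of the Site Collection that connects to the Catalog
    $OldCatalogUrl   # placeholder: URL of the Catalog in the old (broken) connection
)

$site = Get-SPSite $SiteUrl
$catalogConfig = New-Object Microsoft.SharePoint.Publishing.CatalogConnectionManager($site)

# Only delete when a connection for the old Catalog URL is actually registered
if ($catalogConfig.Contains($OldCatalogUrl)) {
    $catalogConfig.DeleteCatalogConnection($OldCatalogUrl)
    $catalogConfig.Update()   # persist the change, as the reconnect script does
}

$site.Dispose()
```

It is worth trying this on a restored copy first and checking that the Catalog Item URLs are still rewritten afterwards.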

Summary

Using Catalog Connections in SharePoint 2013 offers great flexibility with regard to content reuse within as well as across Farms. Because Catalog Connections are based on URLs they might get broken when restoring Site Collections to other URLs. Using a PowerShell script those Catalog Connections can be fixed without much impact on the contents of the Site Collection.
