Performance of content aggregation queries on multiple lists revisited

, , ,

Recently I have done some research on performance of various approaches to querying information from multiple lists. While reading Andrew Connell's book on WCM development I have discovered that I have missed one important approach. It turns out that it's probably the best bet in the most Web Content Management (WCM) solutions being often under heavy load. Curious about the results I have added it to my test project.

It turned out that in spite of providing the context information you cannot use the PortalSiteMapProvider in a Console Application – at least I didn't succeed. Instead I have created a custom Web Part which would support all the methods I have previously tested as well as the PortalSiteMapProvider.

The PortalSiteMapProvider has been developed for the purpose of providing data to navigation controls. It allows you to query the data using one of the three methods:

  • GetCachedList – returns all items from a single list as a PortalListSiteMapNode object
  • GetCachedListItemsByQuery – similar to the previous method but allows you to select items by using an SPQuery. Returns an instance of the SiteMapNodeCollection object
  • GetCachedSiteDataQuery – just like the GetCachedListItemsByQuery method but using the SPSiteDataQuery instead. Returns a DataTable containing found items

Looking at the methods above I chose immediately for the GetCachedSiteDataQuery method. The most important benefits are: it allows you querying the whole Site Collection and it returns the data as a DataTable what allows you further processing for example by using XSL.

Because the GetCachedSiteDataQuery method uses the SPSiteDataQuery as the input query, I have decided to run it using the same three configurations as the method using the SPSiteDataQuery.

Because I have changed the test setup (Web Part instead of a Console Application), I had to run all the tests once again. In the test I skipped the first run and just focused on a normal usage scenario like requesting the test page for the n-th time.

The Results

Here are the results of my test:

Performance of various content aggregation methods

Comparing to the previous test, once again the CrossListQueryInfo based methods either querying all Pages Libraries or only two particular lists, were very good performing. They have proven that they are great choice either for Web Parts as well as custom applications querying SharePoint data.

But having tested the new guy: PortalSiteMapProvider changes everything. The best CrossListQueryInfo methods were at least 2,5 times better than other methods. But querying using the best PortalSiteMapProvider is even 20 times better than the best CrossListQueryInfo!

Below the average results to give you even better impression of various methods:

Average performance of various content aggregation methods

In spite of it's superb performance, the PortalSiteMapProvider doesn't suit every situation. Andrew Connell covers it in detail in his book Professional SharePoint 2007 Web Content Management Development. To keep it short: PortalSiteMapProvider is the best to use if you expect many requests within a period of time less or equal the caching period. In other words: if you expect one visitor each 5 minutes and the contents of the cache are being flushed each 3 minutes, the PortalSiteMapProvider won't be the best choice for you. Even worse: as it's actually querying more data than requested, it will perform even worse than other methods I have tested. In order to choose the most optimal approach you should carefully study your business case and make an educated choice of which method to use.


Possibly related posts

5 Responses to “Performance of content aggregation queries on multiple lists revisited”

  1. Andrew Lansdowne Says:

    I have found that the PortalSiteMapProvider is ideal if you just need to get site map nodes (ie. URL and Title) but if you need to get the value of in-page fields of the results then it sometimes has the values in the indexer and sometimes doesn't!

    It seems to loose the values of the other pages' fields if you create a new page in the same list. I have reverted to either using SPList.GetItems(SPQuery) instead if all the pages I need to find are in the same list, and now i'm looking at CrossListQueryInfo and SPSiteDataQuery for querying across sub sites which is where your post comes in handy – Thanks!

  2. Andrew Lansdowne Says:

    Also, I haven't even heard of CrossListQueryInfo or SPSiteDataQuery until now and i've read the whole of Andrew Connell's WCM book – I can't believe it isn't in there. He says that the CQWP uses the PortalSiteMapProvider but i'm sure it must use CrossListQueryInfo..?

  3. Waldek Mastykarz Says:

    @Andrew Lansdowne: Thank you for sharing your insights, Andrew :D

  4. Waldek Mastykarz Says:

    @Andrew Lansdowne: You should check out the WSS SDK for more information on SPSiteDataQuery (http://msdn.microsoft.com/en-us/library/microsoft.sharepoint.spsitedataquery.aspx) and MOSS SDK on more insights about CrossListQueryInfo (http://msdn.microsoft.com/en-us/library/microsoft.sharepoint.publishing.crosslistqueryinfo.aspx).
    I'm not sure whether Andrew Connell has covered both classes in his book or not. What I do know is that almost everything in SharePoint can be done in many different ways and it's up to the developer/architect to choose the right approach depending on the requirements.

  5. Jomit Says:

    Great Post !!!

    Just added this to my link of the month list

    Cheers,
    Jomit

Leave a Reply

Security Code:

WP Theme & Icons by N.Design Studio
Entries RSS Comments RSS
Copyright © 2007 - 2012 Waldek Mastykarz

Creative Commons License