Performance of content aggregation queries on multiple lists revisited
Recently I have done some research on performance of various approaches to querying information from multiple lists. While reading Andrew Connell's book on WCM development I have discovered that I have missed one important approach. It turns out that it's probably the best bet in the most Web Content Management (WCM) solutions being often under heavy load. Curious about the results I have added it to my test project.
It turned out that in spite of providing the context information you cannot use the PortalSiteMapProvider in a Console Application - at least I didn't succeed. Instead I have created a custom Web Part which would support all the methods I have previously tested as well as the PortalSiteMapProvider.
The PortalSiteMapProvider has been developed for the purpose of providing data to navigation controls. It allows you to query the data using one of the three methods:
- GetCachedList - returns all items from a single list as a PortalListSiteMapNode object
- GetCachedListItemsByQuery - similar to the previous method but allows you to select items by using an SPQuery. Returns an instance of the SiteMapNodeCollection object
- GetCachedSiteDataQuery - just like the GetCachedListItemsByQuery method but using the SPSiteDataQuery instead. Returns a DataTable containing found items
Looking at the methods above I chose immediately for the GetCachedSiteDataQuery method. The most important benefits are: it allows you querying the whole Site Collection and it returns the data as a DataTable what allows you further processing for example by using XSL.
Because the GetCachedSiteDataQuery method uses the SPSiteDataQuery as the input query, I have decided to run it using the same three configurations as the method using the SPSiteDataQuery.
Because I have changed the test setup (Web Part instead of a Console Application), I had to run all the tests once again. In the test I skipped the first run and just focused on a normal usage scenario like requesting the test page for the n-th time.
Here are the results of my test:
Comparing to the previous test, once again the CrossListQueryInfo based methods either querying all Pages Libraries or only two particular lists, were very good performing. They have proven that they are great choice either for Web Parts as well as custom applications querying SharePoint data.
But having tested the new guy: PortalSiteMapProvider changes everything. The best CrossListQueryInfo methods were at least 2,5 times better than other methods. But querying using the best PortalSiteMapProvider is even 20 times better than the best CrossListQueryInfo!
Below the average results to give you even better impression of various methods:
In spite of it's superb performance, the PortalSiteMapProvider doesn't suit every situation. Andrew Connell covers it in detail in his book Professional SharePoint 2007 Web Content Management Development. To keep it short: PortalSiteMapProvider is the best to use if you expect many requests within a period of time less or equal the caching period. In other words: if you expect one visitor each 5 minutes and the contents of the cache are being flushed each 3 minutes, the PortalSiteMapProvider won't be the best choice for you. Even worse: as it's actually querying more data than requested, it will perform even worse than other methods I have tested. In order to choose the most optimal approach you should carefully study your business case and make an educated choice of which method to use.