A while ago I did some research on performance of various approaches for aggregating data in Microsoft Office SharePoint Server 2007. Back then I have found some interesting results. In the project I’m working on at the moment, I got a slightly different challenge: what is the best way to retrieve one particular list item?
What I already knew
From the previous trials that I did I knew, that using PortalSiteMapProvider is the fastest way to perform data aggregation on MOSS 2007. Whether you will get the full benefit of it depends on your content and cache settings. If you optimize the cache settings considering the number and frequency of visitors the PortalSiteMapProvider can help you build a great performing site.
In my previous researches I focused on querying content stored in multiple lists/sites. The challenge was to minimize the penalty of retrieving content from multiple locations and merging it into one result. This time the challenge looked a little bit different: there was only one location I had to query, I knew exactly which list it was. All I needed to do was to retrieve a particular list item using its ID (SPListItem.ID).
Although I knew all the different approaches from the previous trials I noticed that the SPList class had one method designed to retrieve items using ID’s: SPList.GetItemById(int). And although I knew how fast aggregations based on the CrossListQueryCache and PortalSiteMapProvider classes were, I really hoped that this particular method would turn out to be a real gem. After all it was made to do nothing else than retrieve one list item using its ID – exactly what I needed.
To test the various approaches I have created a small Console Application on my development VPC.
I have tested the following approaches:
- using SPQuery on the list instance
- using the GetItemById(int) method
- using SPSiteDataQuery on the parent Web
- using SPSiteDataQuery on the parent Web passing the List ID (using the SPSiteDataQuery.Lists Property)
- using CrossListQueryCache
- using PortalSiteMapProvider
These are the results I got:
The vertical axis represents the duration in milliseconds. The horizontal axis represents the run number. I have executed each query 10 times. Because initial runs are quite heavy due to all the objects that have to be instantiated I have removed them from the chart above.
As you can see PortalSiteMapProvider and CrossListQueryCache are the absolute winners. The PortalSiteMapProvider is almost 10 times faster than the SPSiteDataQuery approach and almost 100 times faster than using the dedicated GetItemById method!
I was really disappointed with the poor performance of the GetItemById method which was the worst of all approaches I have tested. It was almost 3 times worse than running an SPQuery query on the list.
What surprised me is the difference between the initial and all subsequent queries using the regular SPSiteDataQuery object and an SPSiteDataQuery which uses the List ID to narrow the query scope. By initially retrieving the ID of the list and storing it in a static variable you can win 2,5 times performance on the initial call and more than 15 times on the subsequent calls!
Surprisingly or not once again the PortalSiteMapProvider turned out to be the fastest way of retrieving data stored in SharePoint. Once again the second place was taken by the CrossListQueryCache object. Please note, that both of them require passing the List ID to the Lists query property. Because both approaches use cross-site querying that’s the only way to narrow the scope to retrieve the item with a specific ID from a particular list.
One word on PortalSiteMapProvider
PortalSiteMapProvider is undoubtedly the winner when it comes to effectively retrieving content from a SharePoint site. There are two things however that you have to keep in mind when choosing for the PortalSiteMapProvider class.
First of all it’s available in MOSS 2007 only. If you’re developing for WSSv3 you won’t be able to use it. You would have to use the SPSiteDataQuery class instead.
Second of all, and probably the most important, calling the PortalSiteMapProvider.GetCachedSiteDataQuery method won’t give you the great results you would expect of it straight away. The PortalSiteMapProvider strongly relies on cached objects and it is therefore important that you optimize cache for your site, your content, your update frequency and your visitors. There are no magic numbers to this one. If you want to get the most of the PortalSiteMapProvider, you will have to measure the traffic on your website and tweak your settings. Not doing that will most likely work counterproductive: the PortalSiteMapProvider will query the content on every call instead of using the cached results.
What I learned
The first thing I learned from this research is that specific methods are not always better than the multi-purpose methods. GetItemById proves that: it’s almost 70 times slower than the PortalSiteMapProvider which allows you to do much more than only retrieving a single item using a CAML query.
If you really want to be sure if the method you’re using is the best for your scenario, the only thing you can do, is to create some kind of test and compare the results. It might be not perfect and you might miss some things, but at least it will give you a comparison which is better than a wild guess.
During this test I also found out how to use a PortalSiteMapProvider out of context of a Web Application. It turns out that you not only need the context information, but you have to include the siteMap element in App.config of your application.
SharePoint provides you with many different ways to retrieve a specific list item. Although it offers you some specific methods you should be careful while choosing them for a real-life solution and test them to see if they are really the best match for your scenario.