.NET 3x ships with a number of new features among which lambda expressions: an easy way to write code for querying collections. Because of its ease of use many developers use it for retrieving data from different kind of collections. Just recently I’ve been asked to have a look how well lambda expressions perform while retrieving data from SharePoint comparing to SharePoint native mechanisms.
Before you read further…
Keep in mind that lambda expressions are nothing else than a fancy way of writing code… They don’t contain any magical mechanism for iterating through collections. Instead they allow you to write the code more intuitively.
Considering that, I was quite skeptic about the performance of lambda expressions. At the end of the day it’s almost the same as retrieving all the items from SharePoint and then checking each one of them whether it should be returned or not. Not the most efficient way of data querying if you ask me…
The test case
In order to test how well lambda expressions really perform I have setup a simple test environment. I have created a custom list with three columns: Title, Number (integer between 0 and 100) and Lipsum (Multiple lines of Rich HTML). I have filled this list with 1000 items. Each item contained the same dummy text and a random number between 0 and 100.
Using different querying mechanisms I wanted to retrieve all items with the number greater or equal than 50. 492 items in total.
I run each query method 100 times in order to minimize the influence of other processes which might accidentally run in the background.
Additionally I have performed a test to check whether there is a difference between querying list items and webs. I have created a site with 100 team sites. Each site had a custom property IsProjectSite set randomly to either true or false. In my test I wanted to retrieve ID’s of all project sites – 48 sites in total.
The querying methods
To provide enough comparison data I have tested the following methods:
Query: <Where><Geq><FieldRef Name='Number' /><Value Type='Number'>50</Value></Geq></Where> ViewFields: <FieldRef Name='Title'/>
Lists: <Lists><List ID='849f5f81-7850-461c-8057-cff281335b0f'/></Lists> Query: <Where><Geq><FieldRef Name='Number' /><Value Type='Number'>50</Value></Geq></Where> ViewFields: <FieldRef Name='Title'/>
Lambda expression for querying lists
ForEach loop for querying lists
Lambda expression for querying webs
ForEach loop for querying webs
I have run the test on my SharePoint development VPC. Because it doesn’t come even close to a real-life production environment, don’t look at the numbers. Compare the relative values instead.
Query the content if possible
Just as I suspected querying SharePoint lists by iterating all list items has really poor performance. Comparing to SPQuery it’s 300 times (!) slower. Don’t use it unless you have a very good reason. And even then ask yourself twice whether it’s really necessary.
SharePoint gives you a couple of mechanisms for querying data. Unfortunately there is no best-for-all approach. My advice: research various approaches to find the one that would fit your scenario the best. Depending on your experience with SharePoint it may take less or more time to find the best match.
In the test I’ve performed I have used two different query mechanisms: using SPQuery and SPSiteDataQuery. The major different between these two is that SPQuery can query data from single list only while SPSiteDataQuery can be used for querying data from within the whole Site Collection.
Using different settings of the SPSiteDataQuery you can optimize the query for example by narrowing it down to one particular list. Still the SPQuery performs almost 20 times faster. So if you want to get data from one list only, you will be better off with SPQuery.
Lambda expressions perform better than ForEach loops
The outcome of this one really surprised me. As I mentioned before, lambda expressions are a fancy way of performing operations on collections. It turns out that using lambda expressions for traversing lists is almost twice as good as the ForEach loop! If you look at the code examples you notice, that the foreach loop is still there! I run this test for a couple of times just to be sure that nothing wrong was going on. Each single time the results were almost the same. 1-0 for lambda expressions.
Comparing to SPQuery lamda expressions still perform very poor but when you really have to loop through a collection of list items, you should consider using lambda expressions instead of foreach loops.
Surprisingly it doesn’t matter which approach you choose for traversing SPWebCollection in order to retrieve some particular Webs based on a property (SPPropertyBag). The results are similar.
Everyone would agree though: the code is much cleaner when using lambda expressions.
Lambda expressions are a very interesting feature of .NET 3x. They allow you to write cleaner and more intuitive code. While they improve readability, lambda expressions are not always the best choice in combination with SharePoint. You should never choose blindly for lambda expressions and always compare them to other mechanisms especially those native to SharePoint.
Keith Dahlby has provided some feedback on my implementation of the Where lambda expressions. You should definitely check out his article.
Technorati Tags: SharePoint, SharePoint 2007, MOSS 2007, WSS 3.0, lambda expressions