Is it a good idea to use lambda expressions for querying SharePoint data?

, , , ,

.NET 3x ships with a number of new features among which lambda expressions: an easy way to write code for querying collections. Because of its ease of use many developers use it for retrieving data from different kind of collections. Just recently I’ve been asked to have a look how well lambda expressions perform while retrieving data from SharePoint comparing to SharePoint native mechanisms.

Before you read further…

Keep in mind that lambda expressions are nothing else than a fancy way of writing code… They don’t contain any magical mechanism for iterating through collections. Instead they allow you to write the code more intuitively.

Considering that, I was quite skeptic about the performance of lambda expressions. At the end of the day it’s almost the same as retrieving all the items from SharePoint and then checking each one of them whether it should be returned or not. Not the most efficient way of data querying if you ask me…

The test case

In order to test how well lambda expressions really perform I have setup a simple test environment. I have created a custom list with three columns: Title, Number (integer between 0 and 100) and Lipsum (Multiple lines of Rich HTML). I have filled this list with 1000 items. Each item contained the same dummy text and a random number between 0 and 100.

Using different querying mechanisms I wanted to retrieve all items with the number greater or equal than 50. 492 items in total.

I run each query method 100 times in order to minimize the influence of other processes which might accidentally run in the background.

Additionally I have performed a test to check whether there is a difference between querying list items and webs. I have created a site with 100 team sites. Each site had a custom property IsProjectSite set randomly to either true or false. In my test I wanted to retrieve ID’s of all project sites – 48 sites in total.

The querying methods

To provide enough comparison data I have tested the following methods:

SPQuery

Query: <Where><Geq><FieldRef Name='Number' /><Value Type='Number'>50</Value></Geq></Where>
ViewFields: <FieldRef Name='Title'/>

SPSiteDataQuery

Lists: <Lists><List ID='849f5f81-7850-461c-8057-cff281335b0f'/></Lists>
Query: <Where><Geq><FieldRef Name='Number' /><Value Type='Number'>50</Value></Geq></Where>
ViewFields: <FieldRef Name='Title'/>

Lambda expression for querying lists

Definition of a Where lambda expression which can be attached to SPListItemCollection

Example of using Where lambda expression for querying lists

ForEach loop for querying lists

Using a foreach loop for quering lists 

Lambda expression for querying webs

Definition of a Where lambda expression which can be attached to SPWebCollection

Example of using Where lambda expression for querying webs

ForEach loop for querying webs

Using a foreach loop for quering webs

The results

I have run the test on my SharePoint development VPC. Because it doesn’t come even close to a real-life production environment, don’t look at the numbers. Compare the relative values instead.

Query the content if possible

Just as I suspected querying SharePoint lists by iterating all list items has really poor performance. Comparing to SPQuery it’s 300 times (!) slower. Don’t use it unless you have a very good reason. And even then ask yourself twice whether it’s really necessary.

SPQuery versus lambda expressions performance comparison. SPQuery is 300 times faster

SharePoint gives you a couple of mechanisms for querying data. Unfortunately there is no best-for-all approach. My advice: research various approaches to find the one that would fit your scenario the best. Depending on your experience with SharePoint it may take less or more time to find the best match.

In the test I’ve performed I have used two different query mechanisms: using SPQuery and SPSiteDataQuery. The major different between these two is that SPQuery can query data from single list only while SPSiteDataQuery can be used for querying data from within the whole Site Collection.

Using different settings of the SPSiteDataQuery you can optimize the query for example by narrowing it down to one particular list. Still the SPQuery performs almost 20 times faster. So if you want to get data from one list only, you will be better off with SPQuery.

SPSiteDataQuery versus SPQuery performance comparison. SPQuery is 20 times faster

Lambda expressions perform better than ForEach loops

The outcome of this one really surprised me. As I mentioned before, lambda expressions are a fancy way of performing operations on collections. It turns out that using lambda expressions for traversing lists is almost twice as good as the ForEach loop! If you look at the code examples you notice, that the foreach loop is still there! I run this test for a couple of times just to be sure that nothing wrong was going on. Each single time the results were almost the same. 1-0 for lambda expressions.

Comparing to SPQuery lamda expressions still perform very poor but when you really have to loop through a collection of list items, you should consider using lambda expressions instead of foreach loops.

Lambda versus foreach loop performance comparison for querying list items. Lambda expressions are almost as twice as fast as foreach loops

Surprisingly it doesn’t matter which approach you choose for traversing SPWebCollection in order to retrieve some particular Webs based on a property (SPPropertyBag). The results are similar.

Lambda versus foreach loop performance comparison for querying webs. Both approaches perform almost the same

Everyone would agree though: the code is much cleaner when using lambda expressions.

Summary

Lambda expressions are a very interesting feature of .NET 3x. They allow you to write cleaner and more intuitive code. While they improve readability, lambda expressions are not always the best choice in combination with SharePoint. You should never choose blindly for lambda expressions and always compare them to other mechanisms especially those native to SharePoint.

Update 12/10/2008

Keith Dahlby has provided some feedback on my implementation of the Where lambda expressions. You should definitely check out his article.

Possibly related posts

12 Responses to “Is it a good idea to use lambda expressions for querying SharePoint data?”

  1. TJ Says:

    Great article.

  2. Waldek Mastykarz Says:

    Thanks TJ :)

  3. Marius C. Says:

    How about using CrossListQueryInfo in combination with its caching counterpart (CrossListQueryCache)? When runing queries for longer periods,eg 30min. it should make a count for!

    Regards, Marius C.

  4. Marius C. Says:

    Also, sorry for not mentioning it beforre, did u tried instead of looping, in your lamba expression to just perform the query like:
    Eg: SPContext.Current.Web.Lists["My List"].Cast().Where(item=>item["Title"].Contains("My Filter Key"));

  5. Waldek Mastykarz Says:

    @Marius C.: CrossListQueryInfo is a little bit better than any other querying method. The downside is though, that it's only available in MOSS, so if you're using WSS you should have to fall back to one of the 'basic' classes like SPSiteDataQuery or SPQuery.
    As for the lambda expression: I haven't tried running the query in such manner. What I miss though is: how the lambda expression would know that you actually want to access the Items property of SPList?

  6. Marius C. Says:

    Yes, indeed you are right about CrossListQueryInfo! To answer your question i need to complete my piece of code as in hurry i forgot one important aspect (the generic part ), So here the correction:
    SPContext.Current.Web.Lists["My List"].Cast().Where(item=>item["Title"].Contains("My Filter Key"));

    Also, as a slight variation, i've been using the following to retrieve all the lists that contain a specific ContentType (you can imagine complex queries with multiple where clauses, etc…limited just by the need, imagination..and not ultimatelly the performance degradation in some cases!!):

    //see here that we use SPList, instead of SPListItem
    SPContext.Current.Web.Lists.Cast().Where(item=>item["ContentType"].Contains("My Content TypeName "));

  7. Marius C. Says:

    Correcting the correction….is not me…is the publishing interface that strips out the generic part (it looks like HTML tag :D ), so I add it in old plain HTML <SPListItem> and for the 2nd example <SPList>

  8. Simon Ovens Says:

    Great Post!

    Another comparison which would be useful and good to know is which of the following 2 is faster – SiteDataQuery with the PortalSiteMapProvider or CrossListQueryInfo with CrossListQueryCache

    Of course it would have to be over a period of time so that the caching is utilised.

  9. Simon Ovens Says:

    Great Post!

    Another useful comparison and it would be good to know is:
    -SiteDataQuery and PortalSiteMapProvider
    vs
    - CrossListQueryInfo and CrossListQueryCache

    hint hint :)

  10. Waldek Mastykarz Says:

    @Simon: Thanks for the tip. Have you seen this one: Performance of content aggregation queries on multiple lists revisited?

  11. Simon Says:

    ah nice one, always one step ahead ;)

    The tough part is working out if the PortalSiteMapProvider is the right choice before you know the volumes.
    But thats why I like to use the SiteDataQuery as its used in both the SPWeb.GetSiteData and the PortalSiteMapProvider.GetCachedSiteDataQuery so you can easy switch between the 2 without too much work, also as I can see it performs well :)

  12. Waldek Mastykarz Says:

    @Simon: I would definitely agree that it wouldn't be a good idea to use PortalSiteMapProvider for querying 10 items. What you always could do is to program both routines and then abstract them to a single method which would allow you to switch with a single button click/property change.

Leave a Reply

Security Code:

WP Theme & Icons by N.Design Studio
Entries RSS Comments RSS
Copyright © 2007 - 2010 Waldek Mastykarz

Creative Commons License