Jul 5, 2011

The Perfect Loop - Looping Through All Webs in a Site Collection

Proper memory management and object disposal are still hot topics in SharePoint 2010. That’s why I would like to share some insights about iterating through an SPWebCollection.
Take a look at the following snippets:
using (SPSite site = new SPSite("http://myserver/mysitecol"))
{
    foreach (SPWeb web in site.AllWebs)
    {
        Console.WriteLine(web.ServerRelativeUrl + ": " + web.Title);
        web.Dispose();
    }
}
- or in an SPContext -
foreach (SPWeb web in SPContext.Current.Site.AllWebs)
{
    writer.Write(web.ServerRelativeUrl + ": " + web.Title + ",");
    web.Dispose();
}

What happens in the background

The call to the AllWebs property creates a new instance of SPWebCollection (if there is not already one).
When the iterator moves to the first item and the collection has not yet been initialized, it executes a single unmanaged API call to fetch the most important metadata for all webs in the site collection. You could say it loads the “header” data of the SPWeb objects. The unmanaged COM API is wrapped in an internal class called SPRequest, which acts as a kind of data access layer for SharePoint. SPRequest then calls the stored procedure “proc_ListAllWebsOfSite” to load the required data from the content database.
After the above operation has finished you can access the following properties of each web without any further database round-trip: 
    • ID
    • Title
    • Name
    • ServerRelativeUrl
    • Description
    • Language
    • Created
    • Modified
    • Template
    • Configuration
    • UserIsWebAdmin
    • UIVersion
    • MasterUrl
    • CustomMasterUrl
    • MeetingCount (in case of an SPMeeting web)
As long as you only access these properties, the loop will be fast.
Output from the developer dashboard when I run the above snippet in a web part. My test site collection contains 1000 websites.
[Screenshot: Developer Dashboard output for the snippet above]

But what happens when I access a property that has not been loaded yet?

When you access a property of the SPWeb that was not loaded by “proc_ListAllWebsOfSite” (maybe by accident or through lack of knowledge), SharePoint loads the remaining properties on demand (lazy loading). You could say the “body” of the SPWeb gets loaded.
using (SPSite site = new SPSite("http://myserver/mysitecol"))
{
    foreach (SPWeb web in site.AllWebs)
    {
        Console.WriteLine(web.AlternateCssUrl + ": " + web.Title);
        web.Dispose();
    }
}
Output from the developer dashboard when I run the above snippet in a web part.
[Screenshot: Developer Dashboard output showing the database queries]
The first thing we notice is that SharePoint now issues a database request for every SPWeb in the loop! That is very expensive in terms of resource usage and performance!
[Screenshot: Developer Dashboard output showing the SPRequest allocations]
Secondly, we notice that there are now many SPRequest allocations. Internally, each SPWeb gets its own SPRequest instance assigned when its “body” is loaded. Calling web.Dispose() helps to release them as quickly as possible.
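If the loop body throws, a web.Dispose() call placed at the end of the body is skipped. A common defensive variant wraps the body in try/finally. The following is only a minimal sketch, assuming a console application running on the SharePoint server and the same placeholder URL as in the snippets above; the class name is made up for illustration.

```csharp
using System;
using Microsoft.SharePoint;

// Hypothetical console program; only the loop pattern matters here.
class DisposeEachWeb
{
    static void Main()
    {
        // Placeholder URL, same as in the snippets above.
        using (SPSite site = new SPSite("http://myserver/mysitecol"))
        {
            foreach (SPWeb web in site.AllWebs)
            {
                try
                {
                    // AlternateCssUrl is not part of the preloaded header data,
                    // so this access loads the "body" of the web.
                    Console.WriteLine(web.AlternateCssUrl + ": " + web.Title);
                }
                finally
                {
                    // Runs even if the body throws, so the web is always released.
                    web.Dispose();
                }
            }
        }
    }
}
```

This does not reduce the number of database round-trips; it only makes sure that each web is disposed even when an error occurs.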

New in SharePoint 2010 – SPWebInfo

As you can see, accessing an unloaded (body) property of SPWeb is really bad for performance. That’s one of the reasons why the SharePoint team added a new property called WebsInfo to SPWebCollection. WebsInfo is a collection of SPWebInfo objects, and each SPWebInfo is a wrapper for the SPWeb header data that is loaded in the background when you access the AllWebs collection of the SPSite. Internally, SPWebInfo does not allocate an SPRequest and does not need to be disposed.
The perfect loop:
using (SPSite site = new SPSite("http://myserver/mysitecol"))
{
    foreach (SPWebInfo webInfo in site.AllWebs.WebsInfo)
    {
        Console.WriteLine(webInfo.ServerRelativeUrl + ": " + webInfo.Title);
    }
}

- or with a lambda expression -
using (SPSite site = new SPSite("http://myserver/mysitecol"))
{
    site.AllWebs.WebsInfo.ForEach(wi => Console.WriteLine(wi.ServerRelativeUrl));
}
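Sometimes you need a “body” property after all, but only for a few webs. One possible approach is to filter on the SPWebInfo header data first and open a full SPWeb only for the matches. The sketch below assumes that SPWebInfo exposes the web's GUID as Id and that SPSite.OpenWeb accepts that GUID; the URL prefix used for filtering and the class name are hypothetical.

```csharp
using System;
using Microsoft.SharePoint;

// Hypothetical console program; only the loop pattern matters here.
class OpenOnlyMatchingWebs
{
    static void Main()
    {
        // Placeholder URL, same as in the snippets above.
        using (SPSite site = new SPSite("http://myserver/mysitecol"))
        {
            foreach (SPWebInfo webInfo in site.AllWebs.WebsInfo)
            {
                // Filter on header data only; "/mysitecol/projects" is a made-up prefix.
                bool isMatch = webInfo.ServerRelativeUrl.StartsWith(
                    "/mysitecol/projects", StringComparison.OrdinalIgnoreCase);
                if (!isMatch)
                {
                    continue;
                }

                // Open the full SPWeb only for matching webs and dispose it right away.
                using (SPWeb web = site.OpenWeb(webInfo.Id))
                {
                    Console.WriteLine(web.ServerRelativeUrl + ": " + web.AlternateCssUrl);
                }
            }
        }
    }
}
```

With this pattern, the extra database request and SPRequest allocation should occur only for the webs that pass the filter instead of for every web in the site collection.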

