Allowing Google to sort your web pages by date

This article refers to the Google Search Appliance (GSA). But since the GSA and www.google.com are based on the same algorithms, it is very likely that it applies to the general Google search as well.

The Google Search Appliance allows you to get search results sorted not by relevance but by date.
By default, every GSA is configured to look for the document date in the Last-Modified field of the response header:

Document Dates of the GSA

So all you need to do is add the last modified date of the web page to the response header. Google accepts plenty of date formats, but the documentation states that the best format is YYYY-MM-DD.
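
Here is a minimal sketch of how that value could be produced in C#, assuming the timestamp is a DateTime. LastModifiedFormatter and its two methods are hypothetical names made up for this example; they are not part of .NET, Sitecore or the GSA. Passing CultureInfo.InvariantCulture keeps the output stable on servers whose culture uses a non-Gregorian calendar. The second method shows the RFC 1123 form that the HTTP specification defines for Last-Modified, in case other clients than the GSA read the header:

```csharp
using System;
using System.Globalization;

// Hypothetical helper, for illustration only.
public static class LastModifiedFormatter
{
    // YYYY-MM-DD, the format the GSA documentation is said to prefer.
    public static string ToGsaDate(DateTime updated)
    {
        return updated.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
    }

    // RFC 1123 date, e.g. "Fri, 05 Mar 2010 14:30:00 GMT".
    // The "R" format does not convert time zones itself, so the value is
    // converted to UTC first (a DateTime of Kind Unspecified is treated as local time).
    public static string ToHttpDate(DateTime updated)
    {
        return updated.ToUniversalTime().ToString("R", CultureInfo.InvariantCulture);
    }
}
```

Which of the two forms your own GSA setup handles best is something to verify against its Document Dates configuration.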

Modifying the response header can be done in several ways. For .NET people (which is you, I guess), the approach below uses the Response.Headers collection, which needs .NET 3.5 and a website running in Integrated Pipeline Mode. In Classic mode (and in older versions of .NET) that collection cannot be read or modified.

Sitecore developers have several options. Here is a method where I modify the httpRequestBegin pipeline, by adding a new processor to the end of the pipeline. Every httpRequest in Sitecore goes through this pipeline:

<httpRequestBegin>
  <processor type="Sitecore.Pipelines.PreprocessRequest.CheckIgnoreFlag, Sitecore.Kernel" />
  <processor type="Sitecore.Pipelines.HttpRequest.StartMeasurements, Sitecore.Kernel" />
  <processor type="Sitecore.Pipelines.HttpRequest.IgnoreList, Sitecore.Kernel" />
  ...
  ...
  <processor type="Sitecore.Pipelines.HttpRequest.ExecuteRequest, Sitecore.Kernel" />
  <processor type="PT.HttpRequestPipeline.SetDate, PT.HttpRequestPipeline" />
</httpRequestBegin>

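If you would rather not edit web.config directly, the same processor can probably be registered from an include file. This is only a sketch, assuming your Sitecore version supports patch files in App_Config/Include; the file name SetDate.config is a hypothetical choice:

```xml
<!-- App_Config/Include/SetDate.config (hypothetical file name) -->
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <pipelines>
      <httpRequestBegin>
        <!-- Place the processor right after ExecuteRequest, i.e. at the end of the pipeline -->
        <processor type="PT.HttpRequestPipeline.SetDate, PT.HttpRequestPipeline"
                   patch:after="processor[@type='Sitecore.Pipelines.HttpRequest.ExecuteRequest, Sitecore.Kernel']" />
      </httpRequestBegin>
    </pipelines>
  </sitecore>
</configuration>
```
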
The code is very simple. I ensure that the pipeline is processing an item from the “web” database before I modify the header:

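using System;
using System.Globalization;
using System.Web;
using Sitecore;
using Sitecore.Pipelines.HttpRequest;

namespace PT.HttpRequestPipeline
{
  public class SetDate : HttpRequestProcessor
  {
    public override void Process(HttpRequestArgs args)
    {
      // Nothing to do if the request did not resolve to an item
      if (Context.Item == null)
        return;
      if (Context.Database == null)
        return;
      // Only published content (the "web" database) gets the header
      if (Context.Database.Name != "web")
        return;

      // Invariant culture keeps the date format identical on every server
      string updated = Context.Item.Statistics.Updated.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
      // Response.Headers requires Integrated Pipeline Mode
      args.Context.Response.Headers.Add("Last-Modified", updated);
    }
  }
}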

The result is a Last-Modified field in the response header of every page served from the “web” database:

Added Last-Modified header to the response

About Brian Pedersen

Developer at Pentia A/S since 2003. Has developed web applications using Sitecore since Sitecore 4.1.
This entry was posted in .net, General .NET, Sitecore 6.

2 Responses to Allowing Google to sort your web pages by date

  1. Mortaza says:

    Hej Briano, long time no talk…

    Is it possible to add Context.Response.Headers.Add("Last-Modified", Context.Item.Statistics.Updated.ToString("yyyy-MM-dd")); to the layout aspx or, if you are using master pages, to the master page?

    Mortaza

  2. Pingback: Creating fallback values using the RenderField pipeline « Brian Pedersen’s Sitecore and .NET Blog
