Allowing Google to sort your web pages by date

This article refers to the Google Search Appliance (GSA). But since the GSA and www.google.com are based on the same algorithms, it very likely applies to the general Google search as well.

The Google Search Appliance allows you to get search results sorted not by relevance but by date.
By default, every GSA is configured to look for the document date in the Last-Modified field of the response header:

Document Dates of the GSA

So all you need is to add the last modified date of the web page to the response header. Google allows plenty of date formats, but the documentation states that the best format is YYYY-MM-DD.
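
To make the date format concrete, here is a minimal sketch of the formatting step on its own. The names LastModifiedFormats, ToGsaDate and ToHttpDate are made up for this illustration. The first method produces the YYYY-MM-DD form used in this article. The second produces the RFC 1123 form that the HTTP specification itself uses for Last-Modified, which may be the safer choice if browsers or proxies read the header as well as the GSA.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of Sitecore or the GSA tooling.
public static class LastModifiedFormats
{
    // Short form (YYYY-MM-DD). The invariant culture keeps the
    // Gregorian calendar regardless of the server's regional settings.
    public static string ToGsaDate(DateTime updated)
    {
        return updated.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
    }

    // RFC 1123 form. The "R" format always prints "GMT" but does not
    // convert the value itself, so convert to UTC first.
    public static string ToHttpDate(DateTime updated)
    {
        return updated.ToUniversalTime().ToString("R", CultureInfo.InvariantCulture);
    }
}
```

For a DateTime of 15 June 2009 13:45:30 UTC, ToGsaDate returns 2009-06-15 and ToHttpDate returns Mon, 15 Jun 2009 13:45:30 GMT.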

Modifying the response header can be done in several ways. For .NET people (which is you, I guess), you need to run .NET 3.5 and run your website in Integrated Pipeline Mode. Classic mode (and older versions of .NET) does not give you access to the Response.Headers collection used below.

Sitecore developers have several options. Here is a method where I modify the httpRequestBegin pipeline, by adding a new processor to the end of the pipeline. Every httpRequest in Sitecore goes through this pipeline:

<httpRequestBegin>
  <processor type="Sitecore.Pipelines.PreprocessRequest.CheckIgnoreFlag, Sitecore.Kernel" />
  <processor type="Sitecore.Pipelines.HttpRequest.StartMeasurements, Sitecore.Kernel" />
  <processor type="Sitecore.Pipelines.HttpRequest.IgnoreList, Sitecore.Kernel" />
  ...
  ...
  <processor type="Sitecore.Pipelines.HttpRequest.ExecuteRequest, Sitecore.Kernel" />
  <processor type="PT.HttpRequestPipeline.SetDate, PT.HttpRequestPipeline" />
</httpRequestBegin>

The code is very simple. I ensure that the pipeline is processing an item from the “web” database before I modify the header:

using System;
using System.Web;
using Sitecore;
using Sitecore.Pipelines.HttpRequest;

namespace PT.HttpRequestPipeline
{
  public class SetDate : HttpRequestProcessor
  {
    public override void Process(HttpRequestArgs args)
    {
      if (Context.Item == null)
        return;
      if (Context.Database == null)
        return;
      if (Context.Database.Name != "web")
        return;
     
      args.Context.Response.Headers.Add("Last-Modified", Context.Item.Statistics.Updated.ToString("yyyy-MM-dd"));
    }
  }
}

The result is a Last-Modified in the response header of all web pages:

Added Last-Modified header to the response

GSA (Google Search Appliance) Suggest using C# and jQuery

The Google Search Appliance (GSA) is a box (a server) you buy that contains essentially the complete search engine from Google. It lets you apply the magical Google search results to your intranet, extranet or internet site.

The GSA comes with a fully customizable frontend so it will act as a stand-alone machine. But in one of my recent projects we integrated the search results into the customer internet site using a ListView that reads search results in XML format from the GSA and formats it to nice HTML.

Furthermore we implemented the Google Query Suggestion Service on the search box, allowing us to display search suggestions:

GSA Suggestions

Here is what you need. First of all you need to use jQuery. Then you need to download the Ajax Autocomplete for jQuery that implements the autocomplete feature for any input box. The Autocomplete Javascript reads JSON from an .ashx page. So our task is to create an .ashx page that reads suggestions from the GSA and returns them as JSON.

The HttpHandler looks like this (this is sample code; you should add your own error handling and comments):

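using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;
using System.Web;

namespace GoogleSearchAppliance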
{
  public class Suggest : IHttpHandler
  {
    public bool IsReusable
    {
      get { return true; }
    }

    public void ProcessRequest(HttpContext context)
    {
      if (string.IsNullOrEmpty(context.Request.QueryString[_QUERY_PARAM]))
        throw new Exception(string.Format("Could not find parameter '{0}'", _QUERY_PARAM));
     
      // Get the suggestion word from the parameter
      string suggestiveWord = context.Request.QueryString[_QUERY_PARAM];
      // Create a URL to the GSA
      string suggestionUrl = SuggestionUrl(suggestiveWord);
      // Call the GSA and get the GSA result as a string
      string page = GetPageAsString(suggestionUrl);
      // Convert the GSA result to Json
      string jSonResult = ConvertToJson(page);
      // Return the JSON
      context.Response.Write(jSonResult);
      context.Response.End();
    }

    private string SuggestionUrl(string suggestiveWord)
    {
       // You should modify this line to connect to your
       // own GSA, using the correct collection and frontend
       return "http://myGSAurl/suggest?site=default_collection&client=default_frontend&access=p&format=rich&q=" + HttpUtility.UrlEncode(suggestiveWord);
    }

    private string GetPageAsString(string address)
    {
      // Add your own error handling here
      HttpWebRequest request = WebRequest.Create(address) as HttpWebRequest;
      using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
      {
        StreamReader reader = new StreamReader(response.GetResponseStream());
        return reader.ReadToEnd();
      }
    }

    private string ConvertToJson(string gsaSuggestResult)
    {
      bool isFirst = true;
      StringBuilder sb = new StringBuilder();
      sb.Append("{ query:");
      foreach (string token in ParseGsaInput(gsaSuggestResult))
      {
        if (isFirst)
        {
          sb.AppendFormat("'{0}', suggestions:[", token.Trim());
          isFirst = false;
        }
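        else if (token.Trim().Length > 0)
        {
          sb.AppendFormat("'{0}',", token.Trim());
        }
      }
      // Strip the trailing comma, but only if a suggestion was actually written
      if (sb[sb.Length - 1] == ',')
        sb.Remove(sb.Length - 1, 1);
      sb.Append("]}");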
      return sb.ToString();
    }
   
    private IEnumerable<string> ParseGsaInput(string gsaSuggestResult)
    {
      gsaSuggestResult = gsaSuggestResult.Replace("[", "").Replace("]", "").Replace("\"", "");
      return gsaSuggestResult.Split(',');
    }

    private const string _QUERY_PARAM = "query";
  }
}
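
One thing the sample above does not handle is a suggestion that contains an apostrophe, which would end the single-quoted string early. Here is a minimal sketch of one way to escape the values, producing the same { query:'...', suggestions:[...] } shape as ConvertToJson. SuggestJson, Escape and Build are hypothetical names, not part of the handler above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only.
public static class SuggestJson
{
    // Escapes backslashes first, then single quotes, so the value
    // can safely sit inside a '...' literal.
    public static string Escape(string value)
    {
        return value.Replace("\\", "\\\\").Replace("'", "\\'");
    }

    // Builds { query:'...', suggestions:['...','...'] }.
    // An empty suggestion list yields suggestions:[].
    public static string Build(string query, IEnumerable<string> suggestions)
    {
        List<string> quoted = new List<string>();
        foreach (string suggestion in suggestions)
        {
            quoted.Add("'" + Escape(suggestion.Trim()) + "'");
        }
        return "{ query:'" + Escape(query.Trim()) + "', suggestions:["
               + string.Join(",", quoted.ToArray()) + "] }";
    }
}
```

Because the list is joined with string.Join, there is no trailing comma to remove, and the empty case needs no special handling.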

With the HttpHandler in place it is really (really) easy to hook up the code on the input box. Remember to include jQuery and the jquery.autocomplete.js on your page. This is an example code where I assume that the HttpHandler I created is called Suggest.ashx and is placed in the root of my project:

<script language="javascript" type="text/javascript">
  var options, a;
  $(function() {
    options = { serviceUrl: '/Suggest.ashx'};
    a = $('.metaInput').autocomplete(options);
  });
</script>
<asp:TextBox ID="gssQuery" runat="server" size="31" CssClass="metaInput" Text="Google Site Search"/>

The Autocomplete has some CSS that is applied:
.autocomplete-w1 { position:absolute; top:0px; left:0px; margin:6px 0 0 6px; /* IE6 fix: */ _background:none; _margin:1px 0 0 0; }
.autocomplete { border:1px solid #999; background:#FFF; cursor:default; text-align:left; max-height:350px; overflow:auto; margin:-8px 6px 6px -9px; /* IE6 specific: */ _height:350px;  _margin:0; _overflow-x:hidden; }
.autocomplete .selected { background:#F0F0F0; }
.autocomplete div { padding:2px 5px; white-space:nowrap; overflow:hidden; }
.autocomplete strong { font-weight:normal; color:#CC3333; }

Working with the System.Web.UI.WebControls.ListView in Sitecore

The System.Web.UI.WebControls.ListView control is one of the new templated controls available in ASP.NET 3.5. But the control does not work in Sitecore out of the box. First you must add it to the typesThatShouldNotBeExpanded section of the web.config (yes, that’s really its name):

<rendering>
  <typesThatShouldNotBeExpanded>
    <type>System.Web.UI.WebControls.Repeater</type>
    <type>System.Web.UI.WebControls.DataList</type>
    <type>System.Web.UI.WebControls.ListView</type>
  </typesThatShouldNotBeExpanded>
</rendering>

Now you can use the ListView control. The control is really cool since it lets you do a lot without much code. This is an example of presenting the XML from a Google Search Appliance result. I won’t give you the codebehind, but all you have to do is feed the XML from the GSA request directly to the XmlDataSource in Page_Load. And that’s it:

<asp:XmlDataSource ID="xmlDataSource" EnableCaching="false" EnableViewState="false" runat="server" XPath="GSP/RES/R" >
</asp:XmlDataSource>

<asp:ListView DataSourceID="xmlDataSource" EnableViewState="false" id="SearchResultListView" runat="server">
  <LayoutTemplate>
      <ul>
        <asp:PlaceHolder runat="server" id="itemPlaceHolder" />
      </ul>
  </LayoutTemplate>
  <ItemTemplate>
    <li>
      <div>
        <a href="<%# XPath("U") %>">
          <%# XPath("T") %>
        </a>
      </div>
      <div>
        <%# XPath("S") %>
      </div>
      <div>
        <a href="<%# XPath("U") %>">
          <%# XPath("U") %>
        </a>
      </div>
    </li>
  </ItemTemplate>
  <EmptyDataTemplate>
    <ul>
      <li>
        <div>
          ... no match found ...
        </div>
      </li>
    </ul>
  </EmptyDataTemplate>
</asp:ListView>
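
To see what the XPath expressions in the markup select, here is a small stand-alone sketch that reads the same nodes with XmlDocument. It is not the codebehind of the page. GsaResultReader and ReadResults are names made up for this illustration, and it assumes the GSA result XML has the GSP/RES/R structure with U (URL), T (title) and S (snippet) children that the ListView above binds to.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper for illustration only.
public static class GsaResultReader
{
    // Returns one "title | url | snippet" line per <R> element.
    public static List<string> ReadResults(string gsaXml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(gsaXml);

        List<string> lines = new List<string>();
        // Same path as the XPath attribute on the XmlDataSource
        foreach (XmlNode result in doc.SelectNodes("GSP/RES/R"))
        {
            lines.Add(Text(result, "T") + " | " + Text(result, "U") + " | " + Text(result, "S"));
        }
        return lines;
    }

    // Reads the text of a child element, or an empty string if it is missing.
    private static string Text(XmlNode parent, string childName)
    {
        XmlNode child = parent.SelectSingleNode(childName);
        return child == null ? string.Empty : child.InnerText;
    }
}
```

Feeding it a trimmed-down, made-up result such as <GSP><RES><R><U>http://example.com/a</U><T>Page A</T><S>Snippet A</S></R></RES></GSP> gives a single line: Page A | http://example.com/a | Snippet A. A document without any R elements gives an empty list, which is the case the EmptyDataTemplate covers in the markup.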
