From Sitecore 6.5, Sitecore is deprecating the old Lucene search method. This simply means that you can no longer use the old built-in Lucene search, but have to use the new built-in Lucene search.
This is article 1 of 3 articles:
Part 1 – Configuring the index: Using the Sitecore open source AdvancedDatabaseCrawler Lucene indexer
Part 2 – Simple search: Get latest news using Sitecore AdvancedDatabaseCrawler Lucene index
Part 3 – Multivalue search: Get items based on Metadata using Sitecore AdvancedDatabaseCrawler Lucene index
There are many good reasons to use the new way of indexing, but it also requires you to redo some of your previous work. The old way of indexing was easy to set up and easy to use. The new way is more complex because it lets you do more advanced stuff.
That’s why Alex Shyba wrote an open source module that makes the searching and indexing easier: The Advanced Database Crawler. You can find the module here:
http://trac.sitecore.net/AdvancedDatabaseCrawler/browser/Branches/v2/
I have always felt that open source modules are the best way of implementing other developers’ unsolvable bugs. But I have tried this module, and it works.
I will now show you how to set up the module, and in 2 later posts, I will show you how to perform the 2 basic tasks that you do with an index: How to get the latest news, and how to get all items with a certain set of metadata categories.
The new indexing is available in newer versions of Sitecore. Sitecore 6.2 revision 5 should do, but from 6.4 onward you can be certain that it will work.
First you need to compile the AdvancedDatabaseCrawler. When compiled you get some dlls:
- Sitecore.SharedSource.SearchCrawler.dll
- Sitecore.SharedSource.SearchCrawler.DynamicFields.dll
- Sitecore.SharedSource.SearchDemo.dll
- Sitecore.SharedSource.Searcher.dll
The Sitecore.SharedSource.SearchDemo.dll is not needed in production, but it implements some test pages (found in /sitecore modules/Web/searchdemo) that you can use to test your index. Copy the DLLs to your Sitecore /bin/ folder and you are ready to go. Copy the /sitecore modules/Web/searchdemo items as well if you want to test your index with Alex’s samples.
Now you need to set up an index. I will show you how to set up an index that crawls the WEB database, as I’m going to use the index for frontend searching.
Create an ???.config file and put it in the /App_Config/Include folder. Add the following
<configuration xmlns:x="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <databases>
      <database id="web" singleInstance="true" type="Sitecore.Data.Database, Sitecore.Kernel">
        <Engines.HistoryEngine.Storage>
          <obj type="Sitecore.Data.$(database).$(database)HistoryStorage, Sitecore.Kernel">
            <param connectionStringName="$(id)" />
            <EntryLifeTime>30.00:00:00</EntryLifeTime>
          </obj>
        </Engines.HistoryEngine.Storage>
        <Engines.HistoryEngine.SaveDotNetCallStack>false</Engines.HistoryEngine.SaveDotNetCallStack>
      </database>
    </databases>
    <search>
      <configuration>
        <indexes>
          <index id="web" type="Sitecore.Search.Index, Sitecore.Kernel">
            <param desc="name">$(id)</param>
            <param desc="folder">web</param>
            <Analyzer ref="search/analyzer" />
            <locations hint="list:AddCrawler">
              <master type="Sitecore.SharedSource.SearchCrawler.Crawlers.AdvancedDatabaseCrawler,Sitecore.SharedSource.SearchCrawler">
                <Database>web</Database>
                <Root>/sitecore/content</Root>
                <IndexAllFields>true</IndexAllFields>
                <fieldCrawlers hint="raw:AddFieldCrawlers">
                  <fieldCrawler type="Sitecore.SharedSource.SearchCrawler.FieldCrawlers.LookupFieldCrawler,Sitecore.SharedSource.SearchCrawler" fieldType="Droplink" />
                  <fieldCrawler type="Sitecore.SharedSource.SearchCrawler.FieldCrawlers.DateFieldCrawler,Sitecore.SharedSource.SearchCrawler" fieldType="Datetime" />
                  <fieldCrawler type="Sitecore.SharedSource.SearchCrawler.FieldCrawlers.DateFieldCrawler,Sitecore.SharedSource.SearchCrawler" fieldType="Date" />
                  <fieldCrawler type="Sitecore.SharedSource.SearchCrawler.FieldCrawlers.NumberFieldCrawler,Sitecore.SharedSource.SearchCrawler" fieldType="Number" />
                </fieldCrawlers>
                <!-- If a field type is not defined, defaults of storageType="NO", indexType="UN_TOKENIZED" vectorType="NO" boost="1f" are applied -->
                <fieldTypes hint="raw:AddFieldTypes">
                  <!-- Text fields need to be tokenized -->
                  <fieldType name="single-line text" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                  <fieldType name="multi-line text" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                  <fieldType name="word document" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                  <fieldType name="html" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                  <fieldType name="rich text" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                  <fieldType name="memo" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                  <fieldType name="text" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                  <!-- Multilist based fields need to be tokenized to support search of multiple values -->
                  <fieldType name="multilist" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                  <fieldType name="treelist" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                  <fieldType name="treelistex" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                  <fieldType name="checklist" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                  <!-- Legacy tree list field from ver. 5.3 -->
                  <fieldType name="tree list" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
                </fieldTypes>
              </master>
            </locations>
          </index>
        </indexes>
      </configuration>
    </search>
  </sitecore>
</configuration>
A short explanation:
The /sitecore/databases/database element creates a HistoryEngine on the WEB database. This is needed for indexing at all. No HistoryEngine, no index.
The /sitecore/search/configuration/indexes/index is the actual index. This is taken straight from Alex Shyba’s own examples and defines an index called “web” that contains everything (all items, all fields) from the WEB database.
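If indexing everything is more than you need, the same structure can be narrowed down. The following is only a sketch under my own assumptions: the index id "news", the folder name and the Root path /sitecore/content/Home/News are hypothetical and not taken from Alex Shyba’s examples. It assumes that the HistoryEngine on the WEB database is already configured as shown above.

```xml
<configuration xmlns:x="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <search>
      <configuration>
        <indexes>
          <!-- Hypothetical index: the id "news" and the folder name are illustrative only -->
          <index id="news" type="Sitecore.Search.Index, Sitecore.Kernel">
            <param desc="name">$(id)</param>
            <param desc="folder">news</param>
            <Analyzer ref="search/analyzer" />
            <locations hint="list:AddCrawler">
              <master type="Sitecore.SharedSource.SearchCrawler.Crawlers.AdvancedDatabaseCrawler,Sitecore.SharedSource.SearchCrawler">
                <Database>web</Database>
                <!-- Hypothetical path: only items below this root are crawled -->
                <Root>/sitecore/content/Home/News</Root>
                <IndexAllFields>true</IndexAllFields>
                <!-- The fieldCrawlers and fieldTypes sections from the "web" index would be copied in here -->
              </master>
            </locations>
          </index>
        </indexes>
      </configuration>
    </search>
  </sitecore>
</configuration>
```

A smaller root should give a smaller index that is faster to rebuild, at the cost of maintaining one more index definition.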
Read more about setting up indexes here.
This is it. You cannot use Sitecore to rebuild the index anymore. You need to either use the /sitecore modules/Web/searchdemo/RebuildDatabaseCrawlers.aspx page or write your own simple code:
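// Requires: using System; using Sitecore.Jobs; using Sitecore.Search;
// The fourth argument is the object whose "Rebuild" method the job invokes, so pass the index itself.
JobOptions options = new JobOptions("RebuildSearchIndex", "index", Sitecore.Client.Site.Name, SearchManager.GetIndex("web"), "Rebuild");
options.AfterLife = TimeSpan.FromMinutes(1.0);
Job job = JobManager.Start(options);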
In the following posts I will demonstrate how to get the latest news, and how to get all items with a certain set of metadata categories.
October 12, 2011 at 11:29 am
You can also rebuild the index using the IndexViewer :)
Cheers
Jens