Solr, DataImportHandler, UUID and SQL Server

I’ve recently been setting up Apache Lucene/Solr to index static PDF files and also import data to the collection from MS SQL Server.

After successfully indexing PDF files and giving each one a unique id via UUID, I wanted to import several SQL tables that each had an ID column called 'id'. These tables would obviously have overlapping IDs at some stage, so I wanted to use UUID on these documents as well.

I struggled to find much documentation on Solr, SQL Server and UUID. After successfully setting up UUID via http://wiki.apache.org/solr/UniqueKey, you also need to reference the UUID update chain on the dataimport request handler. Therefore the following code:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">db-data-config.xml</str>
  </lst>
</requestHandler>

needs to be changed to this:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">db-data-config.xml</str>
    <str name="update.chain">uuid</str>
  </lst>
</requestHandler>

With this change, Solr automatically creates unique IDs when importing en masse from SQL Server tables.
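For reference, the value "uuid" in update.chain has to match the name of an update request processor chain defined in solrconfig.xml. The following is only a rough sketch of what such a chain might look like, along the lines of the UniqueKey wiki page; it assumes the unique key field is called id and is declared with a UUID field type in the schema, so adjust the names to your own setup.

```xml
<!-- Sketch only: a chain named "uuid", matching the update.chain value above. -->
<updateRequestProcessorChain name="uuid">
  <!-- Generates a UUID for the named field when the document has no value for it.
       The field name "id" is an assumption; use your schema's unique key field. -->
  <processor class="solr.UUIDUpdateProcessorFactory">
    <str name="fieldName">id</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <!-- Must come last so the document is actually written to the index. -->
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

One thing to watch: the UUID processor only fills in the field when the incoming document does not already have a value for it. If the SQL 'id' column is mapped straight onto the Solr unique key field, it would probably be kept as-is, so it may be necessary to map that column to a differently named field in db-data-config.xml.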
