Solr, DataImportHandler, UUID and SQL Server

I’ve recently been setting up Apache Lucene/Solr to index static PDF files and also import data to the collection from MS SQL Server.

After successfully indexing PDF files and giving each one a unique id via UUID, I wanted to import several SQL tables that each had an ID column called ‘id’. These tables would obviously have overlapping IDs at some stage, so I wanted to use UUID on these documents as well.

I struggled to find much documentation on Solr, SQL Server and UUID. It turns out that after successfully setting up UUID via http://wiki.apache.org/solr/UniqueKey, you also need to point the dataimport request handler at the UUID update chain. Therefore the following code:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">db-data-config.xml</str>
  </lst>
</requestHandler>

changes to this:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">db-data-config.xml</str>
    <str name="update.chain">uuid</str>
  </lst>
</requestHandler>

With that in place, Solr automatically creates unique ids when importing en masse from SQL Server tables.
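
For reference, the value of update.chain has to match the name of an update processor chain defined in solrconfig.xml. The following is only a rough sketch of what such a chain might look like, assuming the chain is named uuid and the schema's unique key field is called id (both are assumptions here; adjust them to your own setup):

```xml
<!-- Sketch only: the chain name "uuid" and the field name "id" are assumptions -->
<updateRequestProcessorChain name="uuid">
  <!-- Generates a UUID for the named field when a document has no value for it -->
  <processor class="solr.UUIDUpdateProcessorFactory">
    <str name="fieldName">id</str>
  </processor>
  <!-- Hands the document on to the index; normally the last processor in a chain -->
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

One thing to watch: the UUID processor only fills the field when the incoming document does not already have a value for it. If a SQL table's id column is mapped straight onto the Solr id field, the database value will probably be kept instead, so it may be worth aliasing that column to a different field name in db-data-config.xml.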
