I’ve recently been setting up Apache Lucene/Solr to index static PDF files and also import data to the collection from MS SQL Server.
After successfully indexing PDF files and giving them a unique id via UUID, I wanted to import several SQL tables that each had an ID column called 'id'. These tables would obviously have overlapping IDs at some stage, so I wanted to use UUID on these documents as well.
I struggled to find much documentation on Solr, SQL Server and UUID. After successfully setting up UUID via http://wiki.apache.org/solr/UniqueKey, you also need to set the update chain (update.chain) on the dataimport handler as well. Therefore the following code:
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">db-data-config.xml</str>
  </lst>
</requestHandler>
changed to this:
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">db-data-config.xml</str>
    <str name="update.chain">uuid</str>
  </lst>
</requestHandler>
then automatically creates unique IDs when importing en masse from SQL Server tables.
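For reference, the "uuid" value in update.chain has to match the name of an update request processor chain defined in solrconfig.xml. The fragment below is a minimal sketch of what that chain and the matching schema field might look like. It follows the approach described on the UniqueKey wiki page, but the field name "id" and the field type name "uuid" are assumptions here, so adjust them to whatever your own setup uses.

```xml
<!-- solrconfig.xml: the chain referenced by update.chain=uuid (chain name assumed to be "uuid") -->
<updateRequestProcessorChain name="uuid">
  <!-- Generates a UUID for the named field when the incoming document has no value for it -->
  <processor class="solr.UUIDUpdateProcessorFactory">
    <str name="fieldName">id</str>
  </processor>
  <!-- Hands the document on to the index; without this processor nothing gets written -->
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

<!-- schema.xml: a UUID field type and the unique key field (names assumed) -->
<fieldType name="uuid" class="solr.UUIDField" indexed="true" />
<field name="id" type="uuid" indexed="true" stored="true" required="true" />
```

One thing to watch: as far as I can tell, the UUID processor only fills in the field when the document does not already carry a value for it, so the SQL 'id' column probably needs to be mapped to a different Solr field name in db-data-config.xml.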