1. Configuring solrconfig.xml
(File path: D:\solr5\solr-5.0.0\server\solr\xuq\conf\solrconfig.xml)
The Data Import Handler (DIH) must first be registered in solrconfig.xml.
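Registration only takes effect if the DIH classes are on the core's classpath. The following is a minimal sketch, assuming the stock Solr 5.0.0 layout in which the DIH jar ships in the dist directory of the installation. The fallback path ../../.. is an assumption derived from the core location given above (server\solr\xuq) and may need adjusting for other layouts.

```xml
<!-- Assumption: the DIH jars live in <install dir>/dist, as in a stock Solr 5.0.0 download. -->
<!-- solr.install.dir is normally set by the start script; ../../.. is only a fallback guess. -->
<lib dir="${solr.install.dir:../../..}/dist/" regex="solr-dataimporthandler-.*\.jar" />
```

With the jar loadable, the handler itself is registered as follows.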
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">DIHconfigfile.xml</str>
  </lst>
</requestHandler>

2. Configuring the DIH Configuration File (DIHconfigfile.xml)
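(File path: D:\solr5\solr-5.0.0\server\solr\xuq\conf\DIHconfigfile.xml)
The DIH configuration file is an annotated version of db-data-config.xml from D:\solr5\solr-5.0.0\example\example-DIH\solr\db\conf. It extracts data from the four database tables defined below (item, feature, item_category, and category).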
<dataConfig>
  <!-- The first element is the dataSource, in this case an HSQLDB database.
       The path to the JDBC driver and the JDBC URL and login credentials are
       all specified here. Other permissible attributes include whether or not
       to autocommit to Solr, the batchsize used in the JDBC connection, and a
       'readOnly' flag. -->
  <dataSource driver="org.hsqldb.jdbcDriver"
              url="jdbc:hsqldb:./example-DIH/hsqldb/ex"
              user="sa" />

  <!-- A 'document' element follows, containing multiple 'entity' elements.
       Note that 'entity' elements can be nested, and this allows the entity
       relationships in the sample database to be mirrored here, so that we can
       generate a denormalized Solr record which may include multiple features
       for one item, for instance. -->
  <document>
    <!-- The possible attributes for the entity element are described below.
         Entity elements may contain one or more 'field' elements, which map
         the data source field names to Solr fields, and optionally specify
         per-field transformations. -->
    <!-- This entity is the 'root' entity. -->
    <entity name="item"
            query="select * from item"
            deltaQuery="select id from item where last_modified > '${dataimporter.last_index_time}'">
      <field column="NAME" name="name" />

      <!-- This entity is nested and reflects the one-to-many relationship
           between an item and its multiple features. Note the use of
           variables; ${item.ID} is the value of the column 'ID' for the
           current item ('item' referring to the entity name). -->
      <entity name="feature"
              query="select DESCRIPTION from FEATURE where ITEM_ID='${item.ID}'"
              deltaQuery="select ITEM_ID from FEATURE where last_modified > '${dataimporter.last_index_time}'"
              parentDeltaQuery="select ID from item where ID=${feature.ITEM_ID}">
        <field name="features" column="DESCRIPTION" />
      </entity>

      <entity name="item_category"
              query="select CATEGORY_ID from item_category where ITEM_ID='${item.ID}'"
              deltaQuery="select ITEM_ID, CATEGORY_ID from item_category where last_modified > '${dataimporter.last_index_time}'"
              parentDeltaQuery="select ID from item where ID=${item_category.ITEM_ID}">
        <entity name="category"
                query="select DESCRIPTION from category where ID = '${item_category.CATEGORY_ID}'"
                deltaQuery="select ID from category where last_modified > '${dataimporter.last_index_time}'"
                parentDeltaQuery="select ITEM_ID, CATEGORY_ID from item_category where CATEGORY_ID=${category.ID}">
          <field column="description" name="cat" />
        </entity>
      </entity>
    </entity>
  </document>
</dataConfig>

Configure the data source to match your own database. Place DIHconfigfile.xml in the same directory as solrconfig.xml, then restart the Solr server.
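As an illustration of adapting the data source, here is a sketch of a dataSource element for a MySQL database in place of the HSQLDB sample. The host, database name (mydb), user, and password are hypothetical placeholders, and the driver class assumes the MySQL Connector/J jar has been made available to Solr (for example in the core's lib directory).

```xml
<!-- Hypothetical MySQL data source; replace host, database, and credentials with your own. -->
<dataSource type="JdbcDataSource"
            driver="com.mysql.jdbc.Driver"
            url="jdbc:mysql://localhost:3306/mydb"
            user="solr_reader"
            password="changeme"
            batchSize="-1"
            readOnly="true" />
<!-- batchSize="-1" is commonly suggested for MySQL so that rows are streamed
     rather than loaded into memory at once; treat it as optional. -->
```

The entity queries in the document element would also need to be rewritten against the tables and columns of the actual database.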
Reference: https://cwiki.apache.org/confluence/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler