Installing and Configuring Solr 4.5.1 + Tomcat 8.0

发牌SO · posted 2015-11-12 09:21:15

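Download Tomcat and Solr: wget http://mirror.esocc.com/apache/tomcat/tomcat-8/v8.0.0-RC5/bin/apache-tomcat-8.0.0-RC5.tar.gz -O tomcat.tgz; wget http://mirror.bit.edu.cn/apache/lucene/solr/4.5.1/solr-4.5.1.tgz -O solr.tgz
Extract both archives and rename the extracted directories to tomcat and solr, the names used in the steps below: tar xzvf tomcat.tgz; tar xzvf solr.tgz; mv apache-tomcat-8.0.0-RC5 tomcat; mv solr-4.5.1 solr
Copy the Solr WAR into Tomcat's webapps directory: cp solr/example/webapps/solr.war tomcat/webapps
Start Tomcat so that it unpacks solr.war: tomcat/bin/startup.sh
Copy solr/example/multicore into the tomcat/webapps/conf directory.
Create a classes directory under tomcat/webapps/solr/WEB-INF/ and copy the files from solr/example/resources into it.
Copy all JARs from solr/example/lib/ext/ into tomcat/webapps/solr/WEB-INF/lib.
Edit the Solr webapp's web.xml (tomcat/webapps/solr/WEB-INF/web.xml) so that the solr/home entry reads: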
<env-entry>
<env-entry-name>solr/home</env-entry-name>
<env-entry-value>${TOMCAT_HOME}/webapps/conf/multicore</env-entry-value>
<env-entry-type>java.lang.String</env-entry-type>
</env-entry>
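Whether `${TOMCAT_HOME}` is expanded inside web.xml depends on the environment, so writing an absolute path is a cautious alternative. The following is a minimal sketch, not part of the original steps: `solr_home_entry` is a hypothetical helper that prints the same entry with an absolute Solr home, assuming the multicore directory was copied to webapps/conf as described above.

```shell
# Print a solr/home <env-entry> for web.xml using an absolute path.
# $1: Tomcat installation directory (hypothetical argument, e.g. /opt/tomcat)
solr_home_entry() {
    tomcat_dir=$1
    cat <<EOF
<env-entry>
<env-entry-name>solr/home</env-entry-name>
<env-entry-value>${tomcat_dir}/webapps/conf/multicore</env-entry-value>
<env-entry-type>java.lang.String</env-entry-type>
</env-entry>
EOF
}

# Example: assumes Tomcat was extracted to ./tomcat
solr_home_entry "$(pwd)/tomcat"
```

The printed block can be pasted into web.xml in place of the entry above.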
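Set up Chinese word segmentation with mmseg4j and the Sogou dictionary: wget http://mmseg4j.googlecode.com/files/mmseg4j-1.9.1.zip -O mmseg4j.zip; unzip mmseg4j.zip -d mmseg4j; cp mmseg4j/dist/*.jar tomcat/webapps/solr/WEB-INF/lib
Add the following field types to the types node of core0's schema.xml: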
<fieldtype name="textComplex" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="complex" dicPath="dic"></tokenizer>
</analyzer>
</fieldtype>
<fieldtype name="textMaxWord" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="max-word" dicPath="dic"></tokenizer>
</analyzer>
</fieldtype>
<fieldtype name="textSimple" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="simple" dicPath="dic"></tokenizer>
</analyzer>
</fieldtype>
Add the following nodes inside the fields node of tomcat\webapps\conf\multicore\core0\conf\schema.xml:
<field name="simple" type="textSimple" indexed="true" stored="true" multiValued="true" />
<field name="complex" type="textComplex" indexed="true" stored="true" multiValued="true" />
<field name="max" type="textMaxWord" indexed="true" stored="true" multiValued="true" />
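The Solr 4.5 multicore example has two cores, so repeat the previous two steps (the field types and the fields) for core1.
Test the segmentation at http://localhost:8080/solr/#/core0/analysis
Field[Name]: enter complex
Field Value (Index): enter 中国银行第一分行, and tick Verbose Output below the Field Value (Index) box
Click the Analyze button and check the segmentation result: 中国银行 | 第一 | 分行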
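The same check can be run from the command line. This is a minimal sketch, assuming the core's solrconfig.xml registers the field analysis handler at /analysis/field (if it does not, the request returns a 404); `analysis_url` and `analyze_field` are hypothetical helper names, and the host and port are taken from the URL above.

```shell
# Build the field-analysis URL for a core.
# $1: base URL, $2: core name
analysis_url() {
    printf '%s/%s/analysis/field' "$1" "$2"
}

# Ask Solr how a field analyzes a value. -G sends the parameters as a
# query string; --data-urlencode percent-encodes the Chinese text.
# $1: field name, $2: text to analyze
analyze_field() {
    curl -s -G "$(analysis_url http://localhost:8080/solr core0)" \
        --data-urlencode "analysis.fieldname=$1" \
        --data-urlencode "analysis.fieldvalue=$2" \
        --data-urlencode "wt=json"
}

# Usage (needs a running Solr): analyze_field complex '中国银行第一分行'
```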
Solr can now segment Chinese text. Next, configure Solr to connect to a MySQL database, build an index from it, and search it with segmentation.
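Each core contains two directories, conf and data.

conf: holds the core's configuration files.

(1) schema.xml defines the index fields, the analyzers and so on. It is the central configuration file.

(2) solrconfig.xml defines the core's runtime settings, for example: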

<autoCommit>
<maxTime>15000</maxTime>
<openSearcher>false</openSearcher>
</autoCommit>

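This controls when an automatic commit happens and whether a new searcher is opened afterwards, among other settings. With the values above, Solr commits automatically 15000 ms (15 seconds) after an update and does not open a new searcher.

data: holds the core's data, namely index (the index files) and log (the log records).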
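A quick way to confirm that a core directory has this layout is sketched below. `check_core` is a hypothetical helper, not part of Solr; it only tests for the two configuration files named above and treats a missing data directory as informational, since Solr creates it when the core first loads.

```shell
# Report whether a core directory has the expected configuration files.
# $1: path to the core directory, e.g. tomcat/webapps/conf/multicore/core0
check_core() {
    core_dir=$1
    status=0
    for f in conf/schema.xml conf/solrconfig.xml; do
        if [ -f "$core_dir/$f" ]; then
            echo "ok      $f"
        else
            echo "missing $f"
            status=1
        fi
    done
    # data/ appears once Solr has loaded the core, so its absence is not an error.
    [ -d "$core_dir/data" ] || echo "note    data/ not created yet"
    return $status
}
```

For example, `check_core tomcat/webapps/conf/multicore/core0` returns a non-zero status if either configuration file is missing.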
14.1 Download the MySQL JDBC driver for Java, unpack it locally, and copy mysql-connector-java-5.1.18-bin.jar into tomcat\webapps\solr\WEB-INF\lib
14.2 Create a db folder under \Tomcat 6.0\webapps\solr
14.3 In \Tomcat 6.0\webapps\solr\db create a file named db-data-config.xml with the following content:

<dataConfig>

<dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/test" user="root" password="123" />

<document name="messages">

      <entity name="message" transformer="ClobTransformer" query="select * from test1">

         <field column="ID" name="id" />

         <field column="Val" name="text" />

      </entity>

</document>

</dataConfig>

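url="jdbc:mysql://localhost:3306/test" user="root" password="123" sets the MySQL connection URL, user name and password.

<field column="ID" name="id" /><field column="Val" name="text" /> lists the database columns to index. Note that each name must be a field defined in the fields node of schema.xml, the node edited in the earlier step that added the simple, complex and max fields.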
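Because every name in the data import configuration has to exist in schema.xml, a rough text-based check can catch mismatches before the first import. This is a sketch under stated assumptions: `missing_fields` is a hypothetical helper, it relies on `grep -o`, it expects the column attribute to come before name as in the configuration above, and it ignores dynamic fields. It is not a real XML parser.

```shell
# List the name="..." targets of <field column=...> entries in a data import
# configuration that have no matching <field name="..."> in schema.xml.
# $1: data import configuration file, $2: schema.xml
missing_fields() {
    grep -o '<field column="[^"]*" name="[^"]*"' "$1" |
        sed 's/.* name="\([^"]*\)"/\1/' |
        while read -r name; do
            grep -q "<field name=\"$name\"" "$2" || echo "$name"
        done
}
```

Running `missing_fields db-data-config.xml schema.xml` prints one field name per line for every mapping that schema.xml does not define, and prints nothing when all of them are defined.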

14.4 Add the following to the solrconfig.xml file in Tomcat 6.0\webapps\solr\conf\multicore\core0\conf:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">

<lst name="defaults">

   <str name="config">E:/Program Files/Apache Software Foundation/Tomcat 6.0/webapps/solr/db/db-data-config.xml</str>

</lst>

  </requestHandler>

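“E:/Program Files/Apache Software Foundation/Tomcat 6.0/webapps/solr/db/db-data-config.xml” is the absolute path of the configuration file created in 14.3.

14.5 Repeat 14.4 for Tomcat 6.0\webapps\solr\conf\multicore\core1\conf\solrconfig.xml

14.6 From the dist directory of the locally unpacked Solr 3.5 download, copy apache-solr-dataimporthandler-3.5.0.jar and apache-solr-dataimporthandler-extras-3.5.0.jar into Tomcat 6.0\webapps\solr\WEB-INF\lib (in the Solr 4.5.1 distribution the corresponding files are named solr-dataimporthandler-4.5.1.jar and solr-dataimporthandler-extras-4.5.1.jar)

14.7 The connection from Solr to MySQL is now configured. To test reading from MySQL and building the index, visit: http://localhost:8180/solr/core0/dataimport?command=full-import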
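The import can also be started and monitored with curl. A minimal sketch, assuming the /dataimport handler from 14.4 is registered on core0 and Solr listens on port 8180 as in the URL above; `dih_url` and `run_full_import` are hypothetical helper names. A full-import request returns immediately while the import continues in the background, which is why the status command is useful afterwards.

```shell
# Build a DataImportHandler URL.
# $1: base URL, $2: core name, $3: command (full-import, status, ...)
dih_url() {
    printf '%s/%s/dataimport?command=%s' "$1" "$2" "$3"
}

base=http://localhost:8180/solr

# Start the import, then ask for its status (both need a running Solr).
run_full_import() {
    curl -s "$(dih_url "$base" core0 full-import)"
    curl -s "$(dih_url "$base" core0 status)"
}
```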

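14.8 Test a segmented query: visit http://localhost:8180/solr/core0/admin/ and search for a word that appears in the indexed database column.

Note that this only configures Solr to connect to MySQL and build the index. Queries for ordinary words work, but a search phrase is not itself segmented at query time.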

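The multicore directory holds several core folders. Each one is a separate endpoint with its own configuration files and handles one kind of data.

The schema.xml file under multicore/core0/conf/ is the equivalent of a table definition: it defines the data types of the data added to the index. It contains a <uniqueKey>id</uniqueKey> setting, which makes the id field the unique identifier of each indexed document. This setting is very important.

FieldType: name is the name of the field type, and class names the class that implements it. The solr. prefix is shorthand for Solr's own packages, so solr.TextField refers to org.apache.solr.schema.TextField. The most important part of a FieldType definition is the analyzer applied to this type of data at index time and at query time, including tokenization and filtering.

Fields: this node defines the concrete fields (similar to columns in a database). Each field has a name, a type (one of the FieldTypes defined earlier), indexed (whether it is indexed), stored (whether it is stored) and multiValued (whether it can hold several values).

copyField (copy field): creates a copy field so that all full-text fields can be copied into a single field and searched together.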

Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission.
