
[Experience Sharing] Solr: integrate carrot2 with solr-5.1.0


Posted on 2016-12-15 09:17:19
  I had already integrated carrot2 with solr-4.x successfully, using my customized Chinese tokenizer.
  But when I followed my earlier series of blog posts (http://ylzhj02.iyunv.com/blog/2152348) to adapt carrot2 to solr-5.1.0, I ran into errors.
  The error is:

org.carrot2.util.factory.FallbackFactory; Tokenizer for Chinese Simplified (zh_cn) is not available. This may degrade clustering quality of Chinese Simplified content. Cause: java.lang.NoSuchMethodError: org.apache.lucene.analysis.Tokenizer.<init>(Ljava/io/Reader;)V

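  The reason is that solr-5.1.0 uses lucene 5.1.0, whereas carrot2-3.10.0 was built against lucene 4.6.0. The Tokenizer(Reader) constructor that the older carrot2 code calls no longer exists in lucene 5.x, so the jars are incompatible.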
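The descriptor `<init>(Ljava/io/Reader;)V` in the error names a constructor that takes a single `java.io.Reader`. One way to check whether a given lucene-core jar still declares that constructor is to inspect the class with the JDK's `javap`. The following is only a sketch: the function names are hypothetical, and it assumes `javap` is on the PATH and that you pass the path to a real lucene-core jar.

```shell
#!/bin/sh
# Reads `javap` output on stdin and succeeds only if it declares the
# single-argument Tokenizer(java.io.Reader) constructor.
has_reader_constructor() {
    grep -q 'Tokenizer(java\.io\.Reader)'
}

# Hypothetical wrapper: $1 is the path to a lucene-core jar.
# Requires the JDK's javap on the PATH.
tokenizer_has_reader_constructor() {
    javap -classpath "$1" org.apache.lucene.analysis.Tokenizer | has_reader_constructor
}
```

Calling `tokenizer_has_reader_constructor` on the lucene-core jar shipped with Solr and on the one bundled with carrot2 should show which side still expects the old constructor. The check would be expected to fail for a lucene 5.x jar.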

So the solution is to download the latest version of carrot2 (3.11.0), which now uses lucene 5.1.0:

#git clone git://github.com/carrot2/carrot2.git

#cd carrot2

 

step 1:

#vi core/carrot2-util-text/src/org/carrot2/text/linguistic/DefaultTokenizerFactory.java

Add the import:

import org.carrot2.text.linguistic.lucene.InokChineseTokenizerAdapter;

Then change

map.put(LanguageCode.CHINESE_SIMPLIFIED,
    new NewClassInstanceFactory<ITokenizer>(ChineseTokenizerAdapter.class));

to

map.put(LanguageCode.CHINESE_SIMPLIFIED,
    new NewClassInstanceFactory<ITokenizer>(InokChineseTokenizerAdapter.class));
 

step 2:

Write the custom adapter class, then copy it into the carrot2 source tree:

#vi InokChineseTokenizerAdapter.java

#cp chineseTokenizer/InokChineseTokenizerAdapter.java ./core/carrot2-util-text/src/org/carrot2/text/linguistic/lucene/

step 3:

#mkdir lib/org.lionsoul.jcseg

The directory should contain the following files:

├── build.properties
├── jcseg-core-1.9.6.jar
├── jcseg.LICENSE
└── META-INF
    └── MANIFEST.MF

The contents of the two text files are as follows.

build.properties:

bin.includes = META-INF/,\
jcseg-core-1.9.6.jar,\
jcseg.LICENSE

META-INF/MANIFEST.MF:

Manifest-Version: 1.0
Bundle-ManifestVersion: 2
Bundle-Name: Jcseg Tokenizer
Bundle-SymbolicName: org.lionsoul.jcseg
Bundle-Version: 1.9.6
Bundle-ClassPath: jcseg-core-1.9.6.jar
Bundle-Vendor: INokNok Inc.
Bundle-RequiredExecutionEnvironment: JavaSE-1.6
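
The layout of step 3 can also be scripted. This is a minimal sketch, not part of the carrot2 build: the function name `make_jcseg_bundle` and its three arguments are hypothetical, and it assumes you have already obtained the jcseg jar and its license file separately. The file contents are copied from the listing above.

```shell
#!/bin/sh
# Hypothetical helper: lay out the jcseg bundle directory described in step 3.
# $1 = destination directory (e.g. lib/org.lionsoul.jcseg)
# $2 = path to the jcseg core jar, $3 = path to the jcseg license file
make_jcseg_bundle() {
    dest=$1
    jar=$2
    license=$3
    mkdir -p "$dest/META-INF" || return 1
    cp "$jar" "$dest/jcseg-core-1.9.6.jar" || return 1
    cp "$license" "$dest/jcseg.LICENSE" || return 1

    # Quoted heredoc delimiter keeps the trailing backslashes literal.
    cat > "$dest/build.properties" <<'EOF'
bin.includes = META-INF/,\
jcseg-core-1.9.6.jar,\
jcseg.LICENSE
EOF

    cat > "$dest/META-INF/MANIFEST.MF" <<'EOF'
Manifest-Version: 1.0
Bundle-ManifestVersion: 2
Bundle-Name: Jcseg Tokenizer
Bundle-SymbolicName: org.lionsoul.jcseg
Bundle-Version: 1.9.6
Bundle-ClassPath: jcseg-core-1.9.6.jar
Bundle-Vendor: INokNok Inc.
Bundle-RequiredExecutionEnvironment: JavaSE-1.6
EOF
}
```

From the carrot2 checkout it would be called as `make_jcseg_bundle lib/org.lionsoul.jcseg <jar> <license>`, with the last two arguments pointing at wherever the jcseg files were downloaded.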
 

step 4:

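Modify build.xml so that the jcseg jar is picked up by the relevant patternsets (the jcseg entries below are the additions):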

  <patternset id="lib.test">
    <include name="core/**/*.jar" />
    <include name="lib/**/*.jar" />
    <include name="lib/org.lionsoul.jcseg/*.jar" />
    <exclude name="lib/org.slf4j/slf4j-nop*" />
    <include name="applications/carrot2-dcs/**/*.jar" />
    <include name="applications/carrot2-webapp/lib/*.jar" />
    <include name="applications/carrot2-benchmarks/lib/*.jar" />
  </patternset>

 
  <patternset id="lib.core">
    <include name="lib/**/*.jar" />
    <include name="lib/org.lionsoul.jcseg/*.jar" />
    <include name="core/carrot2-util-matrix/lib/*.jar" />
    <patternset refid="lib.core.excludes" />
  </patternset>

 

  <patternset id="lib.core.mini">
    <include name="lib/**/mahout-*.jar" />
    <include name="lib/**/jcseg*.jar" />
    <include name="lib/**/mahout.LICENSE" />
    <include name="lib/**/colt.LICENSE" />
    <include name="lib/**/commons-lang*" />
    <include name="lib/**/guava*" />
    <include name="lib/**/jackson*" />
    <include name="lib/**/lucene-snowball*" />
    <include name="lib/**/lucene.LICENSE" />
    <include name="lib/**/hppc-*.jar" />
    <include name="lib/**/hppc*.LICENSE" />

    <include name="lib/**/slf4j-api*.jar" />
    <include name="lib/**/slf4j-nop*.jar" />
    <include name="lib/**/slf4j.LICENSE" />

    <include name="lib/**/attributes-binder-*.jar" />
  </patternset>

 

  <target name="core" depends="jar, jar.src, lib-no-jar.flattened" description="Builds Carrot2 Java API JAR with dependencies">
    <delete dir="${api.dir}" failonerror="false" />
    <mkdir dir="${api.dir}" />
    <mkdir dir="${api.dir}/lib" />
    <mkdir dir="${api.dir}/examples" />
    <mkdir dir="${api.dir}/resources" />

    <patternset id="carrot2.required">
      <include name="**/jcseg*" />
      <include name="**/commons-lang*" />

 

step 6:

Build the jar and copy it to the Solr server:

#ant jar

#scp tmp/jar/carrot2-core-3.11.0-SNAPSHOT.jar root@192.168.0.135:/opt/solr/contrib/clustering/lib

Restart the Solr server to test clustering.

 -----------------------------

An error occurs:

org.apache.solr.common.SolrException; null:java.lang.RuntimeException: java.lang.NoClassDefFoundError: com/carrotsearch/hppc/ObjectHashSet

Solution: copy the newer hppc jar to the Solr server and remove the old one there:

#scp lib/com.carrotsearch.hppc/hppc-0.7.1.jar root@192.168.0.135:/opt/solr/contrib/clustering/lib/

#rm -f /opt/solr/contrib/clustering/lib/hppc-0.5.2.jar

------

Another error then appears:

java.lang.RuntimeException: java.lang.IllegalAccessError: class com.carrotsearch.hppc.ObjectHashSet cannot access its superclass com.carrotsearch.hppc.AbstractObjectCollection

The reason is that an old hppc-0.5.2.jar is still packaged inside /opt/solr/server/webapps/solr.war and conflicts with the new one.

So the solution is to replace the jar inside the webapp and repackage the war:

#cd /opt/solr/server/solr-webapp/webapp

#rm -f WEB-INF/lib/hppc-0.5.2.jar

#cp hppc-0.7.1.jar   WEB-INF/lib


#jar cf solr.war  ./

#mv solr.war  /opt/solr/server/webapps

Restart Solr.

The error disappears.
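
Both hppc errors came from two versions of the same jar being visible to Solr. A quick way to spot this kind of conflict is to list artifacts that occur in more than one version across the directories Solr loads from. The sketch below is an illustration only: the function name `find_duplicate_jars` is hypothetical, and it assumes jars follow the usual name-version.jar naming.

```shell
#!/bin/sh
# Hypothetical helper: report artifacts that appear in more than one version
# across the given directories (jars are assumed to be named name-version.jar).
find_duplicate_jars() {
    find "$@" -type f -name '*.jar' | while IFS= read -r path; do
        file=$(basename "$path" .jar)
        # The version is taken to start at the first "-<digit>" in the name.
        artifact=$(printf '%s\n' "$file" | sed 's/-[0-9].*$//')
        version=${file#"$artifact"-}
        printf '%s %s\n' "$artifact" "$version"
    done | sort -u | awk '
        { count[$1]++; versions[$1] = versions[$1] " " $2 }
        END { for (a in count) if (count[a] > 1) print a ":" versions[a] }
    ' | sort
}
```

For the setup in this post it could be run as `find_duplicate_jars /opt/solr/contrib/clustering/lib /opt/solr/server/solr-webapp/webapp/WEB-INF/lib`. A line such as `hppc: 0.5.2 0.7.1` would point at the conflict described above.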

 

 

 

 
