[Experience Share] The Solr Full-Text Search Server: Client Side

Posted 2015-11-11 15:57:10

Solrj is already a very capable Solr client. It wraps HttpClient itself and lets you talk to Solr in a fully object-oriented way. Small, neat, and powerful.

In practice, though, when building a SolrQuery with several search conditions, sort rules, and other parameters, we often end up concatenating strings. That is ugly, contrary to object-oriented thinking, and leaves almost no room for extension. So I wrote a small utility: we only need to populate a search object and hand it to the backend.

For example, if the Solr service we set up supports searching on some ten fields and we want to search a subset of them, we just pass in a POJO with the search values set on the corresponding fields.


For example, a class like the following:
package org.uppower.tnt.biz.core.manager.blog.dataobject;

/**
 * @author yingmu
 * @version 2010-7-20 01:00:55 PM
 */
public class SolrPropertyDO {
    private String auction_id;
    private String opt_tag;
    private String exp_tag;
    private String title;
    private String desc;
    private String brand;
    private String category;
    private String price;
    private String add_prov;
    private String add_city;
    private String quality;
    private String flag;
    private String sales;
    private String sellerrate;
    private String selleruid;
    private String ipv15;

    public String getAuction_id() {
        return auction_id;
    }
    public void setAuction_id(String auctionId) {
        auction_id = auctionId;
    }
    // …… (remaining getters and setters)
    public String getExp_tag() {
        return exp_tag;
    }
    public void setExp_tag(String expTag) {
        exp_tag = expTag;
    }
}

Defining the search object then looks like this:
SolrPropertyDO propertyDO = new SolrPropertyDO();
propertyDO.setAdd_city("(杭州AND成都)OR北京");
propertyDO.setTitle("丝绸OR剪刀");
……

Setting the sort conditions works the same way:

SolrPropertyDO compositorDO = new SolrPropertyDO();
compositorDO.setPrice("desc");
compositorDO.setQuality("asc");
……

Then just hand the two objects to the interface behind it.




The interface method querySolrResult takes four parameters: the search-field object, the sort-condition object, and, to support limit-style paging, startIndex and pageSize.

The method querySolrResultCount simply returns the number of hits, for use with paging.

The interface is defined as follows:

  



package org.uppower.tnt.biz.core.manager.blog;

import java.util.List;

import org.uppower.tnt.biz.core.manager.isearch.dataobject.SolrPropertyDO;

/**
 * @author yingmu
 * @version 2010-7-20 03:51:15 PM
 */
public interface SolrjOperator {
    /**
     * Fetch search results.
     *
     * @param propertyDO
     * @param compositorDO
     * @param startIndex
     * @param pageSize
     * @return
     * @throws Exception
     */
    public List<Object> querySolrResult(Object propertyDO,
            Object compositorDO, Long startIndex, Long pageSize)
            throws Exception;

    /**
     * Fetch the number of search hits.
     *
     * @param propertyDO
     * @param compositorDO
     * @return
     * @throws Exception
     */
    public Long querySolrResultCount(SolrPropertyDO propertyDO,
            Object compositorDO) throws Exception;
}
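The startIndex/pageSize pair works like SQL's LIMIT. As a rough sketch of the paging arithmetic a caller would use (the class and method names below are illustration-only, not part of the original interface):

```java
// Hypothetical helper showing how a 1-based page number maps onto the
// startIndex/pageSize parameters of querySolrResult (limit-style paging).
public class PagingSketch {
    // 1-based page number -> zero-based start offset
    public static long toStartIndex(long pageNo, long pageSize) {
        if (pageNo < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNo and pageSize must be >= 1");
        }
        return (pageNo - 1) * pageSize;
    }

    // total hit count (from querySolrResultCount) -> number of pages needed
    public static long toPageCount(long totalHits, long pageSize) {
        return (totalHits + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        System.out.println(toStartIndex(3, 20)); // page 3, 20 rows per page -> offset 40
        System.out.println(toPageCount(101, 20)); // 101 hits at 20 per page -> 6 pages
    }
}
```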

The implementation first parses the two entity objects into <K,V> Maps, then feeds those Maps into the actual Solrj search object. The returned objects are the SolrDocument instances provided by the Solrj API, and the hit count comes straight from getNumFound() on the SolrDocumentList.

The concrete implementation class:

  



package org.uppower.tnt.biz.core.manager.blog;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

import org.apache.solr.common.SolrDocumentList;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.uppower.tnt.biz.core.manager.isearch.common.SolrjCommonUtil;
import org.uppower.tnt.biz.core.manager.isearch.dataobject.SolrPropertyDO;
import org.uppower.tnt.biz.core.manager.isearch.solrj.SolrjQuery;

/**
 * @author yingmu
 * @version 2010-7-20 03:51:15 PM
 */
public class DefaultSolrOperator implements SolrjOperator {
    private Logger logger = LoggerFactory.getLogger(this.getClass());

    private SolrjQuery solrjQuery;

    public void setSolrjQuery(SolrjQuery solrjQuery) {
        this.solrjQuery = solrjQuery;
    }

    @Override
    public List<Object> querySolrResult(Object propertyDO,
            Object compositorDO, Long startIndex, Long pageSize)
            throws Exception {
        Map<String, String> propertyMap = new TreeMap<String, String>();
        // sort order matters, so use a TreeMap
        Map<String, String> compositorMap = new TreeMap<String, String>();
        try {
            propertyMap = SolrjCommonUtil.getSearchProperty(propertyDO);
            compositorMap = SolrjCommonUtil.getSearchProperty(compositorDO);
        } catch (Exception e) {
            logger.error("SolrjCommonUtil.getSearchProperty() is error !" + e);
        }
        SolrDocumentList solrDocumentList = solrjQuery.query(propertyMap,
                compositorMap, startIndex, pageSize);
        List<Object> resultList = new ArrayList<Object>();
        for (int i = 0; i < solrDocumentList.size(); i++) {
            resultList.add(solrDocumentList.get(i));
        }
        return resultList;
    }

    @Override
    public Long querySolrResultCount(SolrPropertyDO propertyDO,
            Object compositorDO) throws Exception {
        Map<String, String> propertyMap = new TreeMap<String, String>();
        Map<String, String> compositorMap = new TreeMap<String, String>();
        try {
            propertyMap = SolrjCommonUtil.getSearchProperty(propertyDO);
            compositorMap = SolrjCommonUtil.getSearchProperty(compositorDO);
        } catch (Exception e) {
            logger.error("SolrjCommonUtil.getSearchProperty() is error !" + e);
        }
        SolrDocumentList solrDocument = solrjQuery.query(propertyMap,
                compositorMap, null, null);
        return solrDocument.getNumFound();
    }
}

The object parsing uses reflection: every non-null value on the entity object is mapped into a Map. For the sort object, a TreeMap is used during conversion to preserve ordering.

The shared parsing utility is implemented as follows:

  



package org.uppower.tnt.biz.core.manager.blog.common;

import java.lang.reflect.Field;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

/**
 * @author yingmu
 * @version 2010-7-20 01:07:15 PM
 */
public class SolrjCommonUtil {

    public static Map<String, String> getSearchProperty(Object model)
            throws NoSuchMethodException, IllegalAccessException,
            IllegalArgumentException, InvocationTargetException {
        Map<String, String> resultMap = new TreeMap<String, String>();
        // get all declared fields of the entity class as a Field array
        Field[] field = model.getClass().getDeclaredFields();
        for (int i = 0; i < field.length; i++) { // iterate over the fields
            String name = field[i].getName(); // the field's name
            // the field's type
            String type = field[i].getGenericType().toString();
            if (type.equals("class java.lang.String")) { // for a class type, toString() is "class " followed by the class name
                Method m = model.getClass().getMethod(
                        "get" + UpperCaseField(name));
                String value = (String) m.invoke(model); // call the getter to read the value
                if (value != null) {
                    resultMap.put(name, value);
                }
            }
        }
        return resultMap;
    }

    // upper-case the first letter of a field name
    private static String UpperCaseField(String fieldName) {
        return fieldName.substring(0, 1).toUpperCase() + fieldName.substring(1);
    }
}
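To make the reflection step concrete, here is a stand-alone re-creation of the getSearchProperty idea that you can run without the rest of the project. It reads the fields directly via Field.get instead of invoking getters, which is a simplification of the original; FieldMapSketch and DemoBean are illustration-only names:

```java
import java.lang.reflect.Field;
import java.util.Map;
import java.util.TreeMap;

// Collect every non-null String field of a POJO into a TreeMap keyed by
// field name -- the same idea as SolrjCommonUtil.getSearchProperty, but
// reading fields directly rather than through getters.
public class FieldMapSketch {
    public static Map<String, String> toMap(Object model) {
        Map<String, String> result = new TreeMap<String, String>();
        for (Field f : model.getClass().getDeclaredFields()) {
            if (f.getType() == String.class) {
                f.setAccessible(true); // allow reading the private field
                try {
                    String value = (String) f.get(model);
                    if (value != null) {
                        result.put(f.getName(), value); // null fields are skipped
                    }
                } catch (IllegalAccessException e) {
                    // unreachable after setAccessible(true)
                }
            }
        }
        return result;
    }

    // Hypothetical mini-POJO standing in for SolrPropertyDO.
    static class DemoBean {
        private String title = "丝绸 OR 剪刀";
        private String price;            // stays null -> skipped
        private String add_city = "杭州";
    }

    public static void main(String[] args) {
        System.out.println(toMap(new DemoBean()));
        // {add_city=杭州, title=丝绸 OR 剪刀}
    }
}
```

The TreeMap keeps the keys sorted, which is what the original relies on for stable sort-condition ordering.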

The search itself calls the Solr client Solrj directly. The basic logic is to loop over the two parsed TreeMaps, set their entries on a SolrQuery, and finally call the Solrj API to get the results, which are returned as a List<Object>.

The concrete implementation:

  



package org.uppower.tnt.biz.core.manager.blog.solrj;

import java.net.MalformedURLException;
import java.util.Map;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocumentList;

/**
 * @author yingmu
 * @version 2010-7-20 02:57:04 PM
 */
public class SolrjQuery {
    private String url;
    private Integer soTimeOut;
    private Integer connectionTimeOut;
    private Integer maxConnectionsPerHost;
    private Integer maxTotalConnections;
    private Integer maxRetries;
    private CommonsHttpSolrServer solrServer = null;
    private final static String ASC = "asc";

    public void init() throws MalformedURLException {
        solrServer = new CommonsHttpSolrServer(url);
        solrServer.setSoTimeout(soTimeOut);
        solrServer.setConnectionTimeout(connectionTimeOut);
        solrServer.setDefaultMaxConnectionsPerHost(maxConnectionsPerHost);
        solrServer.setMaxTotalConnections(maxTotalConnections);
        solrServer.setFollowRedirects(false);
        solrServer.setAllowCompression(true);
        solrServer.setMaxRetries(maxRetries);
    }

    public SolrDocumentList query(Map<String, String> propertyMap,
            Map<String, String> compositorMap, Long startIndex, Long pageSize)
            throws Exception {
        SolrQuery query = new SolrQuery();
        // set the search fields
        if (null == propertyMap) {
            throw new Exception("The search fields must not be empty!");
        } else {
            for (Object o : propertyMap.keySet()) {
                StringBuffer sb = new StringBuffer();
                sb.append(o.toString()).append(":");
                sb.append(propertyMap.get(o));
                String queryString = addBlank2Expression(sb.toString());
                // note: setQuery() replaces any previous value, so when the map
                // has several entries only the last clause survives
                query.setQuery(queryString);
            }
        }
        // set the sort conditions
        if (null != compositorMap) {
            for (Object co : compositorMap.keySet()) {
                if (ASC.equals(compositorMap.get(co))) {
                    query.addSortField(co.toString(), SolrQuery.ORDER.asc);
                } else {
                    query.addSortField(co.toString(), SolrQuery.ORDER.desc);
                }
            }
        }
        if (null != startIndex) {
            query.setStart(Integer.parseInt(String.valueOf(startIndex)));
        }
        if (null != pageSize && 0L != pageSize.longValue()) {
            query.setRows(Integer.parseInt(String.valueOf(pageSize)));
        }
        try {
            QueryResponse qrsp = solrServer.query(query);
            SolrDocumentList docs = qrsp.getResults();
            return docs;
        } catch (Exception e) {
            throw new Exception(e);
        }
    }

    private String addBlank2Expression(String oldExpression) {
        String lastExpression;
        lastExpression = oldExpression.replace("AND", " AND ").replace("NOT",
                " NOT ").replace("OR", " OR ");
        return lastExpression;
    }

    public Integer getMaxRetries() {
        return maxRetries;
    }

    // …… (remaining getters and setters)

    public void setMaxTotalConnections(Integer maxTotalConnections) {
        this.maxTotalConnections = maxTotalConnections;
    }
}
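The string handling inside query() is easy to check in isolation. The sketch below reproduces the clause assembly and the operator padding done by addBlank2Expression (QueryStringSketch is an illustration-only name). Note that the naive replace would also pad an "OR" or "AND" occurring inside an ordinary word; the original code shares that limitation:

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch of the query-string assembly done inside SolrjQuery.query():
// each map entry becomes a "field:value" clause, and the boolean
// operators are padded with spaces so Solr's parser accepts them.
public class QueryStringSketch {
    public static String addBlanks(String expr) {
        return expr.replace("AND", " AND ").replace("NOT", " NOT ").replace("OR", " OR ");
    }

    public static String toClause(String field, String value) {
        return addBlanks(field + ":" + value);
    }

    public static void main(String[] args) {
        Map<String, String> property = new TreeMap<String, String>();
        property.put("add_city", "(杭州AND成都)OR北京");
        for (Map.Entry<String, String> e : property.entrySet()) {
            System.out.println(toClause(e.getKey(), e.getValue()));
        }
        // add_city:(杭州 AND 成都) OR 北京
    }
}
```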

The whole thing is built on Spring: SolrjQuery's init() method runs when the Spring container starts, the properties used inside init() are injected, and the layers are wired together by injection as well. I won't paste the configuration; everyone knows how to write it.

The code is crude, but it supports nearly any search condition you might want to set, without exposing anything Solr-specific to the calling layer, so the whole search can be set up almost with the mindset of SQL.

  



  
http://www.iyunv.com/topic/315330


  


  

package org.nstcrm.person.util;

import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.net.MalformedURLException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocumentList;

public class SolrHttpServer {
    //private Logger logger = LoggerFactory.getLogger(this.getClass());
    private final static String URL = "http://localhost:8080/solr";
    private final static Integer SOCKET_TIMEOUT = 1000; // socket read timeout
    private final static Integer CONN_TIMEOUT = 100;
    private final static Integer MAXCONN_DEFAULT = 100;
    private final static Integer MAXCONN_TOTAL = 100;
    private final static Integer MAXRETRIES = 1;
    private static CommonsHttpSolrServer server = null;
    private final static String ASC = "asc";

    public void init() throws MalformedURLException {
        server = new CommonsHttpSolrServer(URL);
        //server.setParser(new XMLResponseParser());
        server.setSoTimeout(SOCKET_TIMEOUT);
        server.setConnectionTimeout(CONN_TIMEOUT);
        server.setDefaultMaxConnectionsPerHost(MAXCONN_DEFAULT);
        server.setMaxTotalConnections(MAXCONN_TOTAL);
        server.setFollowRedirects(false);
        server.setAllowCompression(true);
        server.setMaxRetries(MAXRETRIES);
    }

    public static SolrDocumentList query(Map<String, String> property, Map<String, String> compositor, Integer pageSize) throws Exception {
        SolrQuery query = new SolrQuery();
        // set the search fields
        if (null == property) {
            throw new Exception("The search fields must not be empty!");
        } else {
            for (Object obj : property.keySet()) {
                StringBuffer sb = new StringBuffer();
                sb.append(obj.toString()).append(":");
                sb.append(property.get(obj));
                String sql = (sb.toString()).replace("AND", " AND ").replace("OR", " OR ").replace("NOT", " NOT ");
                query.setQuery(sql);
            }
        }
        // set the result ordering
        if (null != compositor) {
            for (Object obj : compositor.keySet()) {
                if (ASC.equals(compositor.get(obj))) {
                    query.addSortField(obj.toString(), SolrQuery.ORDER.asc);
                } else {
                    query.addSortField(obj.toString(), SolrQuery.ORDER.desc);
                }
            }
        }
        if (null != pageSize && 0 < pageSize) {
            query.setRows(pageSize);
        }
        QueryResponse qr = server.query(query);
        SolrDocumentList docList = qr.getResults();
        return docList;
    }

    public static Map<String, String> getQueryProperty(Object obj) throws Exception {
        Map<String, String> result = new TreeMap<String, String>();
        // get all declared fields of the entity class as a Field array
        Field[] fields = obj.getClass().getDeclaredFields();
        for (Field f : fields) {
            String name = f.getName(); // the field's name
            String type = f.getGenericType().toString();
            if ("class java.lang.String".equals(type)) { // for a class type, toString() is "class " followed by the class name
                Method me = obj.getClass().getMethod("get" + UpperCaseField(name));
                String tem = (String) me.invoke(obj);
                if (null != tem) {
                    result.put(name, tem);
                }
            }
        }
        return result;
    }

    public static List<Object> querySolrResult(Object propertyObj, Object compositorObj, Integer pageSize) throws Exception {
        Map<String, String> propertyMap = getQueryProperty(propertyObj);
        Map<String, String> compositorMap = getQueryProperty(compositorObj);
        SolrDocumentList docList = query(propertyMap, compositorMap, pageSize);
        List<Object> list = new ArrayList<Object>();
        for (Object obj : docList) {
            list.add(obj);
        }
        return list;
    }

    private static String UpperCaseField(String name) {
        return name.substring(0, 1).toUpperCase() + name.substring(1);
    }

    public CommonsHttpSolrServer getServer() {
        return server;
    }

    public void setServer(CommonsHttpSolrServer server) {
        SolrHttpServer.server = server;
    }
}



  

Solr 1.4.1 Setup and Using SolrJ
I. Basic Solr Installation and Configuration
1. Download the latest apache-solr-1.4.1 from an official mirror (http://apache.etoak.com//lucene/solr/) and unpack it.
2. Create a SolrHome folder on drive D to hold Solr's configuration files, for example under D:\WORK: D:\WORK\SolrHome.
3. In the unpacked apache-solr-1.4.1, find the solr folder under apache-solr-1.4.1\example and copy it into SolrHome.
4. Copy apache-solr-1.4.1\dist\apache-solr-1.4.1.war into Tomcat's \webapps, rename it to solr, start Tomcat so the war is unpacked, then stop Tomcat.
5. In the unpacked solr webapp, open web.xml and set <env-entry-value> to the SolrHome directory:
<env-entry>
<env-entry-name>solr/home</env-entry-name>
<env-entry-value>D:\WORK\SolrHome\solr</env-entry-value>
<env-entry-type>java.lang.String</env-entry-type>
</env-entry>
6. In D:\WORK\SolrHome\solr\conf, open solrconfig.xml and edit
<dataDir>${solr.data.dir:./solr/data}</dataDir>
where solr.data.dir holds the index directory.
7. To add Chinese support, edit Tomcat's server.xml as follows:
<Connector port="80" protocol="HTTP/1.1"
maxThreads="150" connectionTimeout="20000"
redirectPort="8443" URIEncoding="UTF-8"/>
8. Start Tomcat and browse the Solr server at http://localhost:80/solr.
II. Configuring Solr Replication
1. For a local test, start three Tomcat servers on this machine: one on port 80, the others on 9888 and 9008.
2. Configure the second and third Tomcat as in section I; the only thing to watch is that each uses its own SolrHome directory. Everything else stays the same. For example, on this machine:

Name              URL                          SolrHome directory      web.xml setting
tomcat0 (master)  http://localhost:80/solr     D:\WORK\SolrHome\solr   <env-entry-value>D:\WORK\SolrHome\solr</env-entry-value>
tomcat1 (slave)   http://localhost:9888/solr   E:\WORK\SolrHome\solr   <env-entry-value>E:\WORK\SolrHome\solr</env-entry-value>
tomcat2 (slave)   http://localhost:9008/solr   F:\WORK\SolrHome\solr   <env-entry-value>F:\WORK\SolrHome\solr</env-entry-value>

3. With the above in place, find solrconfig.xml in the SolrHome of the master server tomcat0 and add:
<requestHandler name="/replication" class="solr.ReplicationHandler" >
<lst name="master">
<str name="replicateAfter">commit</str>
<str name="replicateAfter">startup</str>
<str name="confFiles">schema.xml,stopwords.txt</str>
</lst>
</requestHandler>
In the SolrHome of the slave servers tomcat1 and tomcat2, find solrconfig.xml and add:
<requestHandler name="/replication" class="solr.ReplicationHandler" >
<lst name="slave">
<str name="masterUrl">http://localhost/solr/replication</str>
<str name="pollInterval">00:00:60</str>
</lst>
</requestHandler>
4. Create the index on tomcat0 using the solrj client; the jar is in the apache-solr-1.4.1 package. Code:
public class SlorTest3 {
    private static CommonsHttpSolrServer server = null;
    //private static SolrServer server = null;

    public SlorTest3() {
        try {
            server = new CommonsHttpSolrServer("http://localhost/solr");
            server.setConnectionTimeout(100);
            server.setDefaultMaxConnectionsPerHost(100);
            server.setMaxTotalConnections(100);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    @Test
    public void testIndexCreate() {
        List<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
        for (int i = 300; i < 500; i++) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("zjid", i); // the fields must be declared in schema.xml
            doc.addField("title", "云状空化多个气泡的生长和溃灭");
            doc.addField("ssid", "ss" + i);
            doc.addField("dxid", "dx" + i);
            docs.add(doc);
        }
        try {
            server.add(docs);
            server.commit();
            System.out.println("---- index creation finished! ----");
        } catch (SolrServerException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
5. Start all three Tomcat servers, then open a browser (here localhost = 192.168.169.121) at http://localhost:9888/solr; the index is replicated from the master Solr server.

III. Configuring Solr Distributed Search (shards)
1. Start four Tomcat servers, three on this machine and one remote. Note: all four must be configured the same; in particular the schema.xml field definitions must be identical.

Name             URL                               SolrHome directory
tomcatQuery      http://localhost:80/solr          D:\WORK\SolrHome\solr
tomcat0 (shard)  http://localhost:9888/solr        E:\WORK\SolrHome\solr
tomcat1 (shard)  http://localhost:9008/solr        F:\WORK\SolrHome\solr
tomcat2 (shard)  http://192.168.169.48:9888/solr   D:\WORK\SolrHome\solr

2. The configuration is simple: only tomcatQuery's solrconfig.xml in its SolrHome needs the change below; the other Solr servers need no configuration.
<requestHandler name="standard" class="solr.SearchHandler" default="true">
<!-- default values for query parameters -->
<lst name="defaults">
<str name="echoParams">explicit</str>
<str name="shards">localhost:9088/solr,localhost:9888/solr,192.168.169.48:9888/solr</str>
<!--
<int name="rows">10</int>
<str name="fl">*</str>
<str name="version">2.1</str>
-->
</lst>
</requestHandler>
3. Clear any existing index with solrj, or delete it by hand.
4. The code below reads an existing Lucene index (about 1 GB, 874,400 records) and distributes it proportionally via solrj onto the three Solr shard servers:
public class IndexCreate {
    private static CommonsHttpSolrServer server;

    public CommonsHttpSolrServer getServer(String hostUrl) {
        CommonsHttpSolrServer server = null;
        try {
            server = new CommonsHttpSolrServer(hostUrl);
            server.setConnectionTimeout(100);
            server.setDefaultMaxConnectionsPerHost(100);
            server.setMaxTotalConnections(100);
        } catch (IOException e) {
            System.out.println("Check whether the Tomcat server and port are up!");
        }
        return server;
    }

    @SuppressWarnings("deprecation")
    public void readerHostCreate(String[] hosts) throws CorruptIndexException, IOException {
        IndexReader reader = IndexReader.open("c:\\index");
        System.out.println("Total records: " + reader.numDocs());
        int hostNum = hosts.length;
        int lengh = reader.numDocs() / hostNum; // split the index evenly across the hosts
        int j = reader.numDocs() % hostNum;     // remainder
        for (int i = 0; i < hosts.length; i++) {
            long startTime = new Date().getTime();
            String url = hosts[i].substring(hosts[i].indexOf("//") + 2, hosts[i].lastIndexOf("/"));
            System.out.println("Pass " + (i + 1) + ": creating the index on host " + url + ", started at " + new Date());
            if (i == (hosts.length - 1)) {
                hostlist(reader, lengh * i, lengh * (i + 1) + j, hosts[i]);
            } else {
                hostlist(reader, lengh * i, lengh * (i + 1), hosts[i]);
            }
            System.out.println("Finished at " + new Date());
            long endTime = new Date().getTime();
            long ms = (endTime - startTime) % 60000 - (((endTime - startTime) % 60000) / 1000) * 1000;
            System.out.println("This pass took " + (endTime - startTime) / 60000 + "m "
                    + ((endTime - startTime) % 60000) / 1000 + "s " + ms + "ms");
            System.out.println("****************************");
        }
        reader.close();
    }

    @SuppressWarnings("static-access")
    public void hostlist(IndexReader reader, int startLengh, int endLengh, String hostUrl) throws CorruptIndexException, IOException {
        List<BookIndex> beans = new LinkedList<BookIndex>();
        int count = 0;
        this.server = getServer(hostUrl);
        for (int i = startLengh; i < endLengh; i++) {
            Document doc = reader.document(i);
            BookIndex book = new BookIndex();
            book.setZjid(doc.getField("zjid").stringValue());
            book.setTitle(doc.getField("title").stringValue());
            book.setSsid(doc.getField("ssid").stringValue());
            book.setDxid(doc.getField("dxid").stringValue());
            book.setBookname(doc.getField("bookname").stringValue());
            book.setAuthor(doc.getField("author").stringValue());
            book.setPublisher(doc.getField("publisher").stringValue());
            book.setPubdate(doc.getField("pubdate").stringValue());
            book.setYear(doc.getField("year").stringValue());
            book.setFenlei(doc.getField("fenlei").stringValue());
            book.setScore1(doc.getField("score").stringValue());
            book.setIsbn(doc.getField("isbn").stringValue());
            book.setFenleiurl(doc.getField("fenleiurl").stringValue());
            book.setMulu(doc.getField("mulu").stringValue());
            book.setIsp(doc.getField("isp").stringValue());
            book.setIep(doc.getField("iep").stringValue());
            beans.add(book);
            if (beans.size() % 3000 == 0) {
                createIndex(beans, hostUrl, server);
                beans.clear();
                System.out.println("--- batch number: " + (count + 1) + " ---");
                count++;
            }
        }
        System.out.println("beans size: " + beans.size());
        if (beans.size() > 0) {
            createIndex(beans, hostUrl, server);
            beans.clear();
        }
    }

    public void createIndex(List<BookIndex> beans, String hostUrl, CommonsHttpSolrServer server) {
        try {
            server.addBeans(beans);
            server.commit();
        } catch (SolrServerException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws CorruptIndexException, IOException {
        IndexCreate as = new IndexCreate();
        String[] hosts = new String[] {"http://192.168.169.121:9888/solr", "http://192.168.169.121:9088/solr", "http://192.168.169.48:9888/solr"};
        long startTime = new Date().getTime();
        as.readerHostCreate(hosts);
        long endTime = new Date().getTime();
        System.out.println("-------------------");
        long ms = (endTime - startTime) % 60000 - (((endTime - startTime) % 60000) / 1000) * 1000;
        System.out.println("All indexing finished; total time: " + (endTime - startTime) / 60000 + "m "
                + ((endTime - startTime) % 60000) / 1000 + "s " + ms + "ms");
    }
}
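The partitioning arithmetic in readerHostCreate is worth isolating: every host gets numDocs / hostCount documents and the last host also takes the remainder. A stand-alone sketch of that arithmetic (ShardSplitSketch is an illustration-only name):

```java
// Sketch of how readerHostCreate splits numDocs documents across hosts:
// each host gets numDocs / hostCount documents, and the last host also
// takes the numDocs % hostCount remainder.
public class ShardSplitSketch {
    // returns {start, end} (end exclusive) for host i of hostCount
    public static int[] range(int numDocs, int hostCount, int i) {
        int len = numDocs / hostCount;
        int rem = numDocs % hostCount;
        int start = len * i;
        int end = (i == hostCount - 1) ? len * (i + 1) + rem : len * (i + 1);
        return new int[] { start, end };
    }

    public static void main(String[] args) {
        // 874400 docs over 3 hosts: 291466 + 291466 + 291468
        for (int i = 0; i < 3; i++) {
            int[] r = range(874400, 3, i);
            System.out.println("host " + i + ": [" + r[0] + ", " + r[1] + ")");
        }
    }
}
```

The ranges are contiguous and cover all documents, so nothing is indexed twice or skipped.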
The JavaBean class BookIndex.java is as follows:
Note that the variable names must match the schema.xml configuration. Caution: do not use score as a variable or field name; it clashes with Solr's configuration and throws an exception.
import org.apache.solr.client.solrj.beans.Field;

public class BookIndex {
    @Field
    private String zjid;
    @Field
    private String title;
    @Field
    private String ssid;
    @Field
    private String dxid;
    @Field
    private String bookname;
    @Field
    private String author;
    @Field
    private String publisher;
    @Field
    private String pubdate;
    @Field
    private String year;
    @Field
    private String fenlei;
    @Field
    private String score1;
    @Field
    private String isbn;
    @Field
    private String fenleiurl;
    @Field
    private String mulu;
    @Field
    private String isp;
    @Field
    private String iep;

    // getters and setters
}

5. Start all four servers and run the code above.
6. Check in the browser:
http://localhost/solr
http://localhost:9888/solr
http://localhost:9008/solr
http://192.168.168.48:9888/solr
IV. Configuring Solr Multicore
Step 1: copy the multicore folder from apache-solr-1.4.1\example into the solr/home configured earlier.
The solr.xml under multicore looks like this:
<solr persistent="false">
<cores adminPath="/admin/cores">
<core name="core0" instanceDir="core0">
<property name="dataDir" value="/data/core0" />
</core>
<core name="core1" instanceDir="core1">
<property name="dataDir" value="/data/core1" />
</core>
<core name="core2" instanceDir="core2">
<property name="dataDir" value="/data/core2" />
</core>
</cores>
</solr>
Step 2: in Tomcat 6.0\webapps\solr\WEB-INF, change web.xml to
<env-entry-value>D:\WORK\SolrHome\multicore</env-entry-value>
Step 3: start the server and open http://localhost/solr in the browser.
V. Using SolrJ
1. What SolrJ is
SolrJ is the client-side tool for talking to a Solr server. With SolrJ you need not worry about the server's output format or about parsing documents: SolrJ sends the user's request and returns the result set (collection).
2. Building an index with SolrJ
Sample code:
// get a CommonsHttpSolrServer connected to the Solr server; user requests are issued through this object:
CommonsHttpSolrServer server = new CommonsHttpSolrServer("http://localhost:9888/solr");

public void testIndexCreate() {
    List<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
    // SolrInputDocument is similar to Lucene's Document: it creates an index document and adds fields to it
    for (int i = 300; i < 500; i++) {
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("zjid", i + "_id");
        doc.addField("title", i + "_title");
        doc.addField("ssid", "ss_" + i);
        doc.addField("dxid", "dx_" + i);
        docs.add(doc);
    }
    try {
        server.add(docs);
        server.commit(); // commits as an update
        System.out.println("---- index creation finished! ----");
    } catch (SolrServerException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
Creating the index from a JavaBean:
public class BookIndex {
    @Field
    private String zjid;
    @Field
    private String zhangjie;
    @Field
    private String ssid;
    @Field
    private String qwpos;
    @Field
    private String publishDate;
    @Field
    private String mulu;
    @Field
    private String fenleiurl;
    @Field
    private String fenlei;
    @Field
    private String dxid;
    @Field
    private String author;
    @Field
    private String address;
    @Field
    private String bookname;
    // ………………… (remaining fields and accessors)
}

public void testBean() {
    List<BookIndex> beans = new ArrayList<BookIndex>();
    for (int i = 0; i < 10; i++) {
        BookIndex book = new BookIndex();
        book.setZjid(i + "id");
        book.setTitle(i + "title");
        // remaining setters
        beans.add(book);
    }
    try {
        server.addBeans(beans);
        server.commit();
        System.out.println("---- index creation finished! ----");
    } catch (SolrServerException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
3. Common SolrJ queries
a. Query every field in the index: SolrQuery query = new SolrQuery("*:*");
Sample code:
public void testQuery1() {
    SolrQuery query = new SolrQuery("*:*");
    query.setStart(20); // start offset
    query.setRows(10);  // number of rows
    QueryResponse response = null;
    try {
        response = server.query(query); // send the query to the server
        System.out.println(response);
    } catch (SolrServerException e) {
        e.printStackTrace();
    }
    List<SolrDocument> docs = response.getResults(); // the result set
    for (SolrDocument doc : docs) { // iterate over the results
        for (Iterator iter = doc.iterator(); iter.hasNext();) {
            Map.Entry<String, Object> entry = (Entry<String, Object>) iter.next();
            System.out.print("Key :" + entry.getKey() + "  ");
            System.out.println("Value :" + entry.getValue());
        }
        System.out.println("------------");
    }
}
b. Query a single field:
String queryString = "zjid:5_id"; // the form is fieldName:queryText
SolrQuery query = new SolrQuery(queryString);
c. Query the copyField:
The copyField is the default search field: when no query field is specified, the request is matched against the copyField. See the schema.xml configuration for details.
String queryString = "XXX";
SolrQuery query = new SolrQuery(queryString);
4. Deleting the index
Sample code:
public void testClear() {
    server.setRequestWriter(new BinaryRequestWriter()); // stream output for better performance
    try {
        server.deleteByQuery("*:*");
        server.commit();
        System.out.println("---- index cleared ----");
    } catch (SolrServerException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
5. Using highlighting (Highlight)
public List<Book> getQueryString(String queryString, int start, int pageSize) {
    SolrQuery query = new SolrQuery(queryString);
    query.setHighlight(true); // enable the highlight component
    query.addHighlightField("mulu"); // field to highlight
    query.setHighlightSimplePre("<font color=\"red\">"); // markup
    query.setHighlightSimplePost("</font>");
    query.set("hl.usePhraseHighlighter", true);
    query.set("hl.highlightMultiTerm", true);
    query.set("hl.snippets", 3); // three snippets, default is 1
    query.set("hl.fragsize", 50); // 50 characters per snippet, default is 100
    query.setStart(start);   // start offset ... paging
    query.setRows(pageSize); // number of documents
    try {
        response = server.query(query);
    } catch (SolrServerException e) {
        e.printStackTrace();
    }
    List<BookIndex> bookLists = response.getBeans(BookIndex.class);
    Map<String, Map<String, List<String>>> h1 = response.getHighlighting();
    List<Book> books = new ArrayList<Book>();
    for (BookIndex bookIndex : bookLists) {
        Map<String, List<String>> map = h1.get(bookIndex.getZjid());
        // the document's unique id is the key of Map<String, Map<String, List<String>>>
        Book book = new Book();
        // copy the fields
        book.setBookname(bookIndex.getBookname());
        book.setZjid(bookIndex.getZjid());
        if (map.get("mulu") != null) {
            List<String> strMulu = map.get("mulu");
            StringBuffer buf = new StringBuffer();
            for (int i = 0; i < strMulu.size(); i++) {
                buf.append(strMulu.get(i));
                buf.append("...");
                if (i > 3) {
                    break;
                }
            }
            book.setSummary(buf.toString());
        } else {
            if (bookIndex.getMulu().length() > 100) {
                book.setSummary(bookIndex.getMulu().substring(0, 100) + "...");
            } else {
                book.setSummary(bookIndex.getMulu() + "...");
            }
        }
        books.add(book);
    }
    return books;
}
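The summary-assembly part of getQueryString is plain string logic and can be tested on its own: join up to the first few highlight fragments with "..." separators, falling back to a truncated field value when no fragment matched. A stand-alone sketch (SnippetSketch is an illustration-only name):

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the summary assembly in getQueryString(): join highlight
// fragments with "..." separators (same cut-off as the original loop),
// or fall back to a truncated field value when nothing was highlighted.
public class SnippetSketch {
    public static String joinFragments(List<String> fragments) {
        StringBuffer buf = new StringBuffer();
        for (int i = 0; i < fragments.size(); i++) {
            buf.append(fragments.get(i));
            buf.append("...");
            if (i > 3) { // stop after a handful of fragments
                break;
            }
        }
        return buf.toString();
    }

    public static String fallback(String mulu) {
        return mulu.length() > 100 ? mulu.substring(0, 100) + "..." : mulu + "...";
    }

    public static void main(String[] args) {
        System.out.println(joinFragments(Arrays.asList("<font color=\"red\">云状</font>空化", "气泡的生长")));
    }
}
```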
6. Faceting
// note that faceted fields should not be tokenized, e.g. a product category such as kind
public void testFacet() {
    String queryString = "kind:儿童图书";
    SolrQuery query = new SolrQuery().setQuery(queryString);
    query.setFacet(true); // enable faceting
    query.addFacetField("bookname"); // facet fields
    query.addFacetField("title");
    query.setFacetMinCount(1);
    query.addSortField("zjid", SolrQuery.ORDER.asc); // sort field
    query.setRows(10);
    QueryResponse response = null;
    try {
        response = server.query(query);
        System.out.println(response);
    } catch (SolrServerException e) {
        e.printStackTrace();
    }
    List<FacetField> facets = response.getFacetFields();
    for (FacetField facet : facets) {
        System.out.println("Facet:" + facet);
    }
}
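To make the semantics of faceting concrete: for each distinct value of a field, Solr returns how many matching documents carry that value. The stand-alone sketch below computes the same kind of count map over an in-memory list; real facet counts come back from getFacetFields() (FacetCountSketch is an illustration-only name):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration of what a facet count is: the number of
// documents per distinct value of a field.
public class FacetCountSketch {
    public static Map<String, Integer> facet(List<String> fieldValues) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (String v : fieldValues) {
            Integer c = counts.get(v);
            counts.put(v, c == null ? 1 : c + 1); // increment the value's count
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(facet(Arrays.asList("儿童图书", "科技", "儿童图书")));
        // {儿童图书=2, 科技=1}
    }
}
```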
VI. A simple web application:
First, a note on where the index came from: it was built on top of an existing Lucene index, and using that Lucene index directly through solrJ worked poorly, with many problems and no obvious fixes, e.g. all-field queries, highlighting, faceting, and multi-level index directories. An index created with solrJ has none of those problems.
The rough approach: read the existing Lucene index, rebuild and distribute it with solrJ across several machines, then develop against that.
Step 1: read the multi-level Lucene index directories and create/distribute the index with solrJ.
Note: increase the JVM heap, because a map is used as the cache; in theory, the more heap, the more index documents the map can hold. This is mainly to avoid running out of memory.
Code:
package org.readerIndex;

import org.apache.solr.client.solrj.beans.Field;

public class BookIndex2 {
    @Field
    private String zjid;
    @Field
    private String zhangjie;
    @Field
    private String ssid;
    @Field
    private String qwpos;
    @Field
    private String publishDate;
    @Field
    private String mulu;
    @Field
    private String fenleiurl;
    @Field
    private String fenlei;
    @Field
    private String dxid;
    @Field
    private String author;
    @Field
    private String address;
    @Field
    private String bookname;

    public String getZjid() {
        return zjid;
    }
    // …………………………………… (remaining getters and setters)
}
package org.readerIndex;

import java.io.File;
import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

import org.apache.lucene.document.Document;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.store.LockObtainFailedException;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class ReaderIndex {

    public CommonsHttpSolrServer getServer(String hostUrl) {
        CommonsHttpSolrServer server = null;
        try {
            server = new CommonsHttpSolrServer(hostUrl);
            server.setConnectionTimeout(100);
            server.setDefaultMaxConnectionsPerHost(100);
            server.setMaxTotalConnections(100);
        } catch (IOException e) {
            System.out.println("Check whether the Tomcat server and port are up!");
        }
        return server;
    }

    public void indexDocuements(String path, String[] hostUrls) throws CorruptIndexException, LockObtainFailedException, IOException {
        File pareFile = new File(path);
        List<String> list = new ArrayList<String>();
        getFile(pareFile, list); // recursively collect the index directory paths into list
        System.out.println("*** recursion found " + list.size() + " index directories ***");
        int arevageSize = list.size() / hostUrls.length; // split the directories evenly across the hosts
        int remainSize = list.size() % hostUrls.length;  // remainder
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        for (int i = 0; i < hostUrls.length; i++) {
            Date startDate = new Date();
            String url = hostUrls[i].substring(hostUrls[i].indexOf("//") + 2, hostUrls[i].lastIndexOf("/"));
            System.out.println("Pass " + (i + 1) + ": creating the index on host " + url + ", started at " + sdf.format(startDate));
            if (i == (hostUrls.length - 1)) {
                list(list, arevageSize * i, arevageSize * (i + 1) + remainSize, hostUrls[i]);
            } else {
                list(list, arevageSize * i, arevageSize * (i + 1), hostUrls[i]);
            }
            Date endDate = new Date();
            System.out.println("This pass finished at: " + sdf.format(endDate));
        }
    }

    public void list(List<String> list, int start, int end, String url) {
        CommonsHttpSolrServer server = getServer(url);
        for (int j = start; j < end; j++) {
            try {
                long startMs = System.currentTimeMillis();
                hostCreate(list.get(j), server);
                long endMs = System.currentTimeMillis();
                System.out.println("Directory " + (j + 1) + " done, path: " + list.get(j) + ", took " + (endMs - startMs) + "ms");
            } catch (CorruptIndexException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    public void getFile(File fileDirectory, List<String> list) {
        if (fileDirectory.isDirectory()) {
            File[] files = fileDirectory.listFiles();
            for (File file : files) {
                getFile(file, list);
            }
        } else if (fileDirectory.isFile()) {
            String filePath = fileDirectory.getPath();
            String path = filePath.replace('\\', '/');
            if (path.endsWith(".cfs")) {
                int lastIndex = path.lastIndexOf("/");
                String directory = path.substring(0, lastIndex);
                list.add(directory);
            }
        }
    }

    @SuppressWarnings("deprecation")
    public void hostCreate(String directory, CommonsHttpSolrServer server) throws CorruptIndexException, IOException {
        IndexReader reader = IndexReader.open(directory);
        List<BookIndex2> beans = new ArrayList<BookIndex2>();
        for (int i = 0; i < reader.numDocs(); i++) {
            Document doc = reader.document(i);
            BookIndex2 book = new BookIndex2();
            book.setZjid(doc.getField("zjid").stringValue());
            book.setAddress(doc.getField("address").stringValue());
            book.setAuthor(doc.getField("author").stringValue());
            book.setBookname(doc.getField("bookname").stringValue());
            book.setDxid(doc.getField("dxid").stringValue());
            book.setFenlei(doc.getField("fenlei").stringValue());
            book.setFenleiurl(doc.getField("fenleiurl").stringValue());
            book.setMulu(doc.getField("mulu").stringValue());
            book.setPublishDate(doc.getField("publishDate").stringValue());
            book.setQwpos(doc.getField("qwpos").stringValue());
            book.setSsid(doc.getField("ssid").stringValue());
            book.setZhangjie(doc.getField("zhangjie").stringValue());
            beans.add(book);
        }
        createIndex(beans, server);
        beans.clear();
        reader.close();
    }

    public void createIndex(List<BookIndex2> beans, CommonsHttpSolrServer server) {
        try {
            server.addBeans(beans);
            server.commit();
            //server.optimize();
        } catch (SolrServerException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws CorruptIndexException, IOException {
        ReaderIndex reader = new ReaderIndex();
        String path = "D:\\91";
        String[] hosts = new String[] {"http://192.168.169.121:9888/solr", "http://192.168.169.121:9088/solr", "http://192.168.169.48:9888/solr"};
        long startTime = new Date().getTime();
        reader.indexDocuements(path, hosts);
        long endTime = new Date().getTime();
        System.out.println("-------------------");
        long ms = (endTime - startTime) % 60000 - (((endTime - startTime) % 60000) / 1000) * 1000;
        System.out.println("All documents indexed; total time: " + (endTime - startTime) / 60000 + "m "
                + ((endTime - startTime) % 60000) / 1000 + "s " + ms + "ms");
    }
}
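The hand-written minutes/seconds/milliseconds conversion at the end of main() is easy to get wrong, so here is the same arithmetic pulled into a small testable helper (FormatElapsedSketch is an illustration-only name; the remainder chain is equivalent to the original expression):

```java
// Convert an elapsed duration in milliseconds into minutes, seconds and
// leftover milliseconds -- the same arithmetic as the timing printouts above.
public class FormatElapsedSketch {
    public static String format(long elapsedMs) {
        long minutes = elapsedMs / 60000;          // whole minutes
        long seconds = (elapsedMs % 60000) / 1000; // whole seconds left over
        long millis = elapsedMs % 1000;            // milliseconds left over
        return minutes + "m " + seconds + "s " + millis + "ms";
    }

    public static void main(String[] args) {
        System.out.println(format(125250)); // 2m 5s 250ms
    }
}
```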





Step 2: build a simple web application on top of this.

  
  
