I've only recently started playing with es in my spare time, and I've run into problems with bulk-adding data. I set up a three-node cluster on my local machine and have about 52,000 source documents in total. Bulk indexing keeps failing with IO exceptions; small amounts, say 500 or 1,000 documents, go through fine. I already send the updates in batches and have tried batch sizes of 1,000, 500 and 200, but I still hit the IO errors. If the argument is that my machine isn't powerful enough, 2 GB should really be sufficient, and a side-by-side comparison backs that up: on the same machine I set up a Solr cluster with numShards=3, and indexing 258,000 documents there was no problem at all.
Below is the code that creates the es index and the code that bulk-indexes the data; if anyone can see what's going wrong, please help me out. PS: there really isn't much es material around, and even working through the examples is a slog, those Java call chains go on forever.
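Both snippets assume a client field that is already connected to the cluster. For context, here is a minimal sketch of how that client might be built for my local setup; the cluster name and address are placeholders, adjust them to your own environment:

import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

// Sketch only: build a TransportClient pointing at the local cluster.
// "elasticsearch" and 127.0.0.1:9300 are placeholder values.
Settings settings = ImmutableSettings.settingsBuilder()
        .put("cluster.name", "elasticsearch")
        .build();
Client client = new TransportClient(settings)
        .addTransportAddress(new InetSocketTransportAddress("127.0.0.1", 9300));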
public void createIndex() throws IOException {
    // Mapping for type "vType": the free-text fields use the ik analyzer for
    // both indexing and searching, the rest are plain string fields.
    XContentBuilder content = XContentFactory.jsonBuilder()
        .startObject()
            .startObject("vType")
                .startObject("properties")
                    .startObject("title").field("type", "string").field("indexAnalyzer", "ik").field("searchAnalyzer", "ik").endObject()
                    .startObject("author").field("type", "string").field("indexAnalyzer", "ik").field("searchAnalyzer", "ik").endObject()
                    .startObject("keyword").field("type", "string").field("indexAnalyzer", "ik").field("searchAnalyzer", "ik").endObject()
                    .startObject("fenlei").field("type", "string").endObject()
                    .startObject("other").field("type", "string").endObject()
                    .startObject("subname").field("type", "string").endObject()
                    .startObject("subtitles").field("type", "string").endObject()
                    .startObject("summary").field("type", "string").field("indexAnalyzer", "ik").field("searchAnalyzer", "ik").endObject()
                    .startObject("videolink").field("type", "string").endObject()
                .endObject()
            .endObject()
        .endObject();
    // Create the "video" index with 8 shards and register the mapping above.
    client.admin().indices().prepareCreate("video")
        .addMapping("vType", content)
        .setSettings(ImmutableSettings.settingsBuilder().put("number_of_shards", 8))
        .execute().actionGet();
}
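prepareCreate() fails if the index already exists, so an existence check can be wrapped around the call. This is just a sketch using the same client and the "video" index name from above:

// Only create the "video" index when it does not exist yet (sketch).
boolean exists = client.admin().indices()
        .prepareExists("video")
        .execute().actionGet()
        .isExists();
if (!exists) {
    createIndex();
}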
The code that bulk-indexes the data:
public void bulkData() {
    // Note: this builder is created once and reused for every batch below;
    // requests added to it accumulate across execute() calls.
    BulkRequestBuilder bulkRequest = client.prepareBulk();
    try {
        // Read the source documents from an existing Lucene index on disk.
        IndexReader reader = IndexReader.open(FSDirectory.open(new File(store_path)));
        int maxDocs = reader.maxDoc();
        List<Document> list = new ArrayList<Document>();
        int count = 0;
        for (int i = 0; i < maxDocs; i++) {
            list.add(reader.document(i));
            // Flush a batch every 100 documents.
            if (list.size() == 100) {
                for (int j = 0; j < list.size(); j++) {
                    Document doc = list.get(j);
                    bulkRequest.add(client.prepareIndex("video", "vType", String.valueOf(j + count))
                        .setSource(XContentFactory.jsonBuilder()
                            .startObject()
                                .field("title", doc.get("title"))
                                .field("author", doc.get("author"))
                                .field("keyword", doc.get("keyword"))
                                .field("fenlei", doc.get("fenlei"))
                                .field("other", doc.get("other"))
                                .field("subname", doc.get("subname"))
                                .field("subtitles", doc.get("subtitles"))
                                .field("summary", doc.get("summary"))
                                .field("videolink", doc.get("videolink"))
                            .endObject()));
                }
                count += list.size();
                BulkResponse bulkResponse = bulkRequest.execute().actionGet();
                // Refresh after every batch so the documents become searchable right away.
                client.admin().indices().prepareRefresh().execute().actionGet();
                if (!bulkResponse.hasFailures())
                    System.out.println("------success!");
                else
                    System.out.println("------failure!");
                list.clear();
            }
        }
        // Send whatever is left over (fewer than 100 documents).
        if (list.size() > 0) {
            for (int i = 0; i < list.size(); i++) {
                Document doc = list.get(i);
                bulkRequest.add(client.prepareIndex("video", "vType", String.valueOf(i + count))
                    .setSource(XContentFactory.jsonBuilder()
                        .startObject()
                            .field("title", doc.get("title"))
                            .field("author", doc.get("author"))
                            .field("keyword", doc.get("keyword"))
                            .field("fenlei", doc.get("fenlei"))
                            .field("other", doc.get("other"))
                            .field("subname", doc.get("subname"))
                            .field("subtitles", doc.get("subtitles"))
                            .field("summary", doc.get("summary"))
                            .field("videolink", doc.get("videolink"))
                        .endObject()));
            }
            count += list.size();
            BulkResponse bulkResponse = bulkRequest.execute().actionGet();
            //client.admin().indices().prepareRefresh().execute().actionGet();
            if (!bulkResponse.hasFailures())
                System.out.println("------success!");
            else
                System.out.println("------failure!");
            list.clear();
        }
        reader.close();
    } catch (CorruptIndexException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
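For comparison, here is a stripped-down sketch of sending a single batch with a BulkRequestBuilder created per batch, so each execute() only carries the current 100 documents. sendBatch and offset are just illustrative names; the index, type and field names are the ones from bulkData() above:

// Sketch: send one batch of documents with its own BulkRequestBuilder.
private void sendBatch(List<Document> batch, int offset) throws IOException {
    BulkRequestBuilder bulk = client.prepareBulk();   // fresh builder for this batch only
    for (int i = 0; i < batch.size(); i++) {
        Document doc = batch.get(i);
        bulk.add(client.prepareIndex("video", "vType", String.valueOf(offset + i))
            .setSource(XContentFactory.jsonBuilder()
                .startObject()
                    .field("title", doc.get("title"))
                    .field("summary", doc.get("summary"))
                    // ... remaining fields exactly as in bulkData() ...
                .endObject()));
    }
    BulkResponse response = bulk.execute().actionGet();
    if (response.hasFailures()) {
        // buildFailureMessage() lists the items that failed in this batch.
        System.out.println(response.buildFailureMessage());
    }
}

Sent this way, each request stays at a fixed size and a failure report only covers the documents of that one batch.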
For now I'm still sitting here waiting for someone more experienced to point out what I'm missing...
Coming up next: an introduction to Solr 4.0 Alpha, for anyone who likes Solr...
Since space is limited here, see my Baidu Space for the Solr 4.0 + Tomcat installation: http://hi.baidu.com/620734263/item/372ea2c7955fbc7088ad9eaa