Modifying Solr
Solr itself supports distributing an index across servers, which it does with rsync. See the documentation:
http://wiki.apache.org/solr/CollectionDistribution
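However, I want to use HDFS for this. After reading the Solr source code, I found that it cannot be configured to support HDFS directly: by default it reads from and writes to the local file system, as this snippet shows: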
File dirFile = new File(getIndexDir());
Directory dir = FSDirectory.getDirectory(dirFile, !indexExists);
For now I have hacked SolrCore.java by hand so that it supports HDFS:
// conf is declared here so the snippet stands on its own
Configuration conf = new Configuration();
InetSocketAddress addr = DataNode.createSocketAddr("10.88.15.59:9000");
FileSystem fs = new DistributedFileSystem(addr, conf);
FsDirectory dir = new FsDirectory(fs, new Path(getIndexDir()), false, conf);
boolean indexExists = fs.exists(new Path(getIndexDir()));
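The hack above switches to HDFS unconditionally. As a rough sketch only, the choice could instead be driven by the form of the index path, so that a plain path keeps the local behaviour and an hdfs:// path selects HDFS. The class name IndexLocation, the Backend enum and the example paths are all hypothetical and are not part of Solr. The sketch also assumes Unix-style paths.

```java
import java.net.URI;

public class IndexLocation {
    // Hypothetical enum: the two storage backends discussed in this post.
    enum Backend { LOCAL, HDFS }

    /** Chooses a backend from the scheme of the index directory string. */
    static Backend backendFor(String indexDir) {
        String scheme = URI.create(indexDir).getScheme();
        if (scheme == null || scheme.equals("file")) {
            return Backend.LOCAL;   // plain path: keep the default local behaviour
        }
        if (scheme.equals("hdfs")) {
            return Backend.HDFS;    // hdfs://host:port/path: use the HDFS code path
        }
        throw new IllegalArgumentException("unsupported scheme: " + scheme);
    }

    public static void main(String[] args) {
        // Example paths are made up for illustration.
        System.out.println(backendFor("/var/solr/data/index"));
        System.out.println(backendFor("hdfs://10.88.15.59:9000/solr/index"));
    }
}
```

SolrCore could then branch on the returned value and build either the FSDirectory or the FsDirectory, which would leave the local setup usable without editing the source again.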
Switching the search side:
Configuration conf = new Configuration();
InetSocketAddress addr = DataNode.createSocketAddr("10.88.15.59:9000");
FileSystem fs = new DistributedFileSystem(addr, conf);
FsDirectory dir = new FsDirectory(fs, new Path(index_path), false, conf);
IndexReader reader = IndexReader.open(dir);
tmp = new SolrIndexSearcher(schema, "main", reader, true);
At this point the index files can be read from HDFS and queries complete successfully.
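Both snippets hard-code the NameNode address "10.88.15.59:9000". A minimal sketch of reading it from a JVM system property instead might look like the following. The property name solr.hdfs.namenode and the class HdfsAddress are my own assumptions, not something Solr or Hadoop defines, and the code uses only the JDK.

```java
import java.net.InetSocketAddress;

public class HdfsAddress {
    // Hypothetical property name; neither Solr nor Hadoop defines it.
    static final String PROPERTY = "solr.hdfs.namenode";

    /** Parses "host:port" into a socket address without a DNS lookup. */
    static InetSocketAddress parse(String hostPort) {
        int colon = hostPort.lastIndexOf(':');
        if (colon <= 0 || colon == hostPort.length() - 1) {
            throw new IllegalArgumentException("expected host:port, got: " + hostPort);
        }
        String host = hostPort.substring(0, colon);
        int port = Integer.parseInt(hostPort.substring(colon + 1));
        return InetSocketAddress.createUnresolved(host, port);
    }

    /** Reads the address from the system property, or uses the given default. */
    static InetSocketAddress fromProperty(String defaultHostPort) {
        return parse(System.getProperty(PROPERTY, defaultHostPort));
    }

    public static void main(String[] args) {
        InetSocketAddress addr = fromProperty("10.88.15.59:9000");
        System.out.println(addr.getHostString() + ":" + addr.getPort());
    }
}
```

Starting the JVM with -Dsolr.hdfs.namenode=host:port would then override the default, and the resulting host:port string could be passed to DataNode.createSocketAddr as in the snippets above.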