Directory dir = FSDirectory.open(new File(indexDir));
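which returns the directory instance. org.apache.lucene.store.Directory has two commonly used direct subclasses, FSDirectory and RAMDirectory, which represent a directory on disk and a directory in memory respectively. Their details are covered later; because the index files we need are stored on disk, we use FSDirectory to get the instance that represents the directory.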
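2) Instantiate a Document object, which represents a collection of fields (Field). Each field corresponds to a piece of data that may be queried during a search. Lucene can only handle java.lang.String, java.io.Reader, and some primitive numeric types (such as int or float). To handle other file types such as Microsoft Word, Excel, HTML, or PDF, you first need to extract their text with a separate parser before indexing; swapping out the basic StandardAnalyzer mentioned above is not enough on its own.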
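To make the "extract the text first" step concrete, here is a minimal sketch in plain Java. The class PlainTextSketch and its extractText helper are hypothetical names invented for this illustration and are not part of Lucene. The regex approach is deliberately naive (it ignores entities, scripts, and malformed markup); real HTML, Word, Excel, or PDF files call for a dedicated parsing library, for example Apache Tika.

```java
import java.util.regex.Pattern;

// Naive illustration only: real documents need a proper parser.
class PlainTextSketch {
    private static final Pattern TAGS = Pattern.compile("<[^>]*>");
    private static final Pattern SPACES = Pattern.compile("\\s+");

    // Hypothetical helper (not a Lucene API): reduce simple markup to plain text.
    static String extractText(String html) {
        // Replace every tag with a space so neighbouring words stay separated.
        String noTags = TAGS.matcher(html).replaceAll(" ");
        // Collapse runs of whitespace and trim the ends.
        return SPACES.matcher(noTags).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        String html = "<html><body><h1>Lucene</h1><p>Index <b>plain</b> text.</p></body></html>";
        // The resulting String is the kind of value Lucene can accept for a field.
        System.out.println(extractText(html));
    }
}
```

The String produced this way (or a java.io.Reader over the extracted text) is what would then be handed to a field of the Document.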
Searching the index. Sample code follows:
public static void search(String indexDir, String q)
        throws IOException, ParseException {
    Directory dir = FSDirectory.open(new File(indexDir)); // Open index
    IndexSearcher is = new IndexSearcher(dir);
    QueryParser parser = new QueryParser(Version.LUCENE_30,
            "contents",
            new StandardAnalyzer(Version.LUCENE_30));
    Query query = parser.parse(q); // Parse query
    long start = System.currentTimeMillis();
    TopDocs hits = is.search(query, 10); // Search index
    long end = System.currentTimeMillis();
    System.err.println("Found " + hits.totalHits + // Write search stats
            " document(s) (in " + (end - start) +
            " milliseconds) that matched query '" +
            q + "':");
    // Note that the TopDocs object contains only references to the underlying documents.
    // In other words, matches are not loaded immediately upon search; they are loaded
    // from the index lazily, only when requested with the IndexSearcher.doc(int) call.
    // That call returns a Document object from which we can then
    // retrieve individual field values.
    for (ScoreDoc scoreDoc : hits.scoreDocs) {
        Document doc = is.doc(scoreDoc.doc); // Retrieve matching document
        System.out.println(doc.get("fullpath")); // Display full path
        System.out.println(doc.get("filename")); // Display filename
        System.out.println(doc.get("contents")); // null if "contents" was not stored
    }
    is.close(); // Close IndexSearcher
}
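The comment in the loop above says that search results hold only references and that documents are loaded on demand. The following toy model, written in plain Java, sketches that idea. Everything in it (the LazyHitsSketch class, its add, search, and doc methods, and the loads counter) is hypothetical and is not Lucene's implementation; it only mimics the division of labour between search(...) and IndexSearcher.doc(int).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical classes, NOT Lucene's implementation.
class LazyHitsSketch {
    private final Map<String, List<Integer>> postings = new HashMap<>(); // term -> doc ids
    private final Map<Integer, String> stored = new HashMap<>();         // doc id -> stored path
    int loads = 0; // how many stored documents were actually fetched

    void add(int docId, String path, String text) {
        stored.put(docId, path);
        for (String token : text.toLowerCase().split("\\s+")) {
            List<Integer> ids = postings.computeIfAbsent(token, k -> new ArrayList<>());
            if (!ids.contains(docId)) {
                ids.add(docId);
            }
        }
    }

    // Returns at most n document ids (references); no stored content is touched.
    List<Integer> search(String term, int n) {
        List<Integer> ids = postings.getOrDefault(term.toLowerCase(), new ArrayList<Integer>());
        return new ArrayList<>(ids.subList(0, Math.min(n, ids.size())));
    }

    // Stored content is fetched only here, playing the role of IndexSearcher.doc(int).
    String doc(int docId) {
        loads++;
        return stored.get(docId);
    }

    public static void main(String[] args) {
        LazyHitsSketch index = new LazyHitsSketch();
        index.add(0, "/docs/a.txt", "apache lucene");
        index.add(1, "/docs/b.txt", "apache tika");
        List<Integer> hits = index.search("apache", 10);
        System.out.println("hits=" + hits.size() + ", loads=" + index.loads);
        System.out.println(index.doc(hits.get(0)) + ", loads=" + index.loads);
    }
}
```

In this model the search step costs nothing in document loads; only the explicit doc(...) call fetches stored data, which is why iterating over just the top few hits stays cheap.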
Before we can search the index, we need to know where it lives, so we obtain it via
Directory dir = FSDirectory.open(new File(indexDir));//Open index