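Reading and Writing Nested Documents in Solr (Nested Documents, Block Join)
This post covers three tasks: finding parent documents by a child document's field, finding child documents by a parent document's field, and returning parents together with their children in a single query (a joined lookup). Even on Google a working solution was hard to find quickly. The examples are made up, and the code is Groovy. The Solr reference guide covers part of this here: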
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-BlockJoinQueryParsers
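1. Saving nested documents
Solr stores documents in a flat structure (a Lucene limitation), so nesting is only logical. This shows in schema.xml: a field cannot contain sub-fields (?), and, much like SQL tables, the nesting has to be expressed through a parent id.
First, add this line to schema.xml:
My index holds several kinds of documents, so I also added a field that identifies the document type: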
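For reference, the two schema entries could look roughly like the fragment below. Both lines are assumptions rather than the author's exact text: `_root_` is the field that stock Solr schemas declare for block indexing, and `docType` is inferred from the `docType:Student` and `docType:Book` queries used later in this post.

```xml
<!-- Assumed: the block-join root field, as declared in stock Solr schemas -->
<field name="_root_" type="string" indexed="true" stored="false"/>
<!-- Assumed: a discriminator field matching the docType:... queries used later -->
<field name="docType" type="string" indexed="true" stored="true"/>
```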
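Then save the nested documents in code like this:
import org.apache.solr.common.SolrInputDocument
SolrInputDocument doc = new SolrInputDocument()     // parent document
SolrInputDocument subDoc = new SolrInputDocument()  // child document
doc.addChildDocument(subDoc)                        // the child is indexed together with its parent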
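Because the storage is flat, a parent and its children end up as one contiguous block, with the children first and the parent last. The sketch below is a toy in-memory model of that layout in plain Java, not Lucene's implementation; the ids (`s1`, `b1`, ...) and field values are hypothetical. It shows why a block-join query needs a filter that recognizes every parent: the parents are what mark where each block ends.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Toy model of a block: children are stored first, their parent last. */
final class BlockLayoutSketch {

    /** Flattens one parent and its children into index order (children, then parent). */
    static List<Map<String, String>> toBlock(Map<String, String> parent,
                                             List<Map<String, String>> children) {
        List<Map<String, String>> block = new ArrayList<>(children);
        block.add(parent);
        return block;
    }

    /**
     * Mimics a parent block join: for every child that matches childQuery,
     * return the document that closes its block (the next one where isParent is true).
     */
    static List<Map<String, String>> parentsOf(List<Map<String, String>> index,
                                               Predicate<Map<String, String>> isParent,
                                               Predicate<Map<String, String>> childQuery) {
        List<Map<String, String>> hits = new ArrayList<>();
        boolean blockMatched = false;
        for (Map<String, String> doc : index) {
            if (isParent.test(doc)) {
                if (blockMatched) {
                    hits.add(doc);
                }
                blockMatched = false; // the parent closes the current block
            } else if (childQuery.test(doc)) {
                blockMatched = true;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Map<String, String>> index = new ArrayList<>();
        index.addAll(toBlock(Map.of("id", "s1", "docType", "Student"),
                List.of(Map.of("id", "b1", "docType", "Book", "bookProp", "x"))));
        index.addAll(toBlock(Map.of("id", "s2", "docType", "Student"),
                List.of(Map.of("id", "b2", "docType", "Book", "bookProp", "y"))));

        List<Map<String, String>> hits = parentsOf(index,
                d -> "Student".equals(d.get("docType")),
                d -> "x".equals(d.get("bookProp")));
        System.out.println(hits.size() + " parent(s) matched");
    }
}
```

If the `isParent` predicate missed some parents, their blocks would merge with the following one in this model, which is the kind of mismatch a too-narrow parent filter can cause.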
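2. Finding parent documents by a child document's field
import org.apache.solr.client.solrj.SolrClient
import org.apache.solr.client.solrj.SolrQuery
import org.apache.solr.client.solrj.impl.HttpSolrClient
// SOLR_CORE_URL is the core's base URL; newer SolrJ versions build the client with HttpSolrClient.Builder
SolrClient sc = new HttpSolrClient(SOLR_CORE_URL)
// which=... identifies the parent documents; the part after } is the query on the children
def fq = '{!parent which="docType:Student"}bookProp:bookPropValue'
def q = 'docType:Student'
SolrQuery sq = new SolrQuery(q)
sq.addFilterQuery(fq)
def rsp = sc.query(sq)
def docs = rsp.getResults()   // the matching Student (parent) documents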
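3. Finding child documents by a parent document's field (for example prop)
SolrClient sc = new HttpSolrClient(SOLR_CORE_URL)
// of=... identifies the parent documents; the part after } is the query on the parents
def fq = '{!child of="docType:Student"}studentProp:studentPropValue'
def q = 'docType:Book'
SolrQuery sq = new SolrQuery(q)
sq.addFilterQuery(fq)
def rsp = sc.query(sq)
def docs = rsp.getResults()   // the matching Book (child) documents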
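The filters in sections 2 and 3 mirror each other: `{!parent which=<allParents>}<childQuery>` and `{!child of=<allParents>}<parentQuery>`. In both, the quoted filter is meant to identify all parent documents, not only the ones being searched for. The sketch below builds both strings in plain Java and renders them as `/select` request parameters. The class and method names are hypothetical, and the quoting assumes that a backslash-escaped double quote is accepted inside a quoted local-param value.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** String helpers for the two block-join filters; no SolrJ needed. */
final class BlockJoinFilters {

    // Escape backslashes and double quotes so the value stays inside the quoted local param.
    private static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Filter selecting parents whose children match childQuery. */
    static String parentFilter(String allParents, String childQuery) {
        return "{!parent which=" + quote(allParents) + "}" + childQuery;
    }

    /** Filter selecting children whose parent matches parentQuery. */
    static String childFilter(String allParents, String parentQuery) {
        return "{!child of=" + quote(allParents) + "}" + parentQuery;
    }

    /** Renders q and fq as the URL-encoded query string of a /select request. */
    static String selectParams(String q, String fq) {
        return "q=" + URLEncoder.encode(q, StandardCharsets.UTF_8)
                + "&fq=" + URLEncoder.encode(fq, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String toParents = parentFilter("docType:Student", "bookProp:bookPropValue");
        String toChildren = childFilter("docType:Student", "studentProp:studentPropValue");
        System.out.println(selectParams("docType:Student", toParents));
        System.out.println(selectParams("docType:Book", toChildren));
    }
}
```

Printing the encoded parameters is mainly useful for pasting a query into a browser or curl when the SolrJ call returns something unexpected.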
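4. Joined query (parents returned together with their children)
import org.apache.solr.common.SolrDocumentList
import org.apache.solr.common.params.CommonParams
SolrClient sc = new HttpSolrClient(SOLR_CORE_URL)
// With no child query after }, this filter just restricts the results to the parent documents
def fq = '{!parent which="docType:Student"}'
def q = '张三'   // a student's name (Zhang San), matched against the default field set below
// The tail of this value is missing; presumably a child transformer such as
// [child parentFilter=docType:Student] followed, since getChildDocuments() relies on one
def fl = '*, '
SolrQuery sq = new SolrQuery(q)
sq.setParam(CommonParams.DF, 'name')   // default search field
sq.addFilterQuery(fq)
// sq.addField(fl)
sq.setParam(CommonParams.FL, fl)
def rsp = sc.query(sq)
SolrDocumentList docs = rsp.getResults()
docs?.each { it ->
    println(it)
    def children = it.getChildDocuments()   // null when no children were attached
    // ...
}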
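For `getChildDocuments()` to return anything, the field list has to ask Solr to attach the children, which is what the `[child]` document transformer does. The sketch below assembles such an `fl` value in plain Java. The helper name is hypothetical, the filter values are assumed to contain no spaces, and `docType:Student` / `docType:Book` are taken from the earlier examples rather than from the author's original field list.

```java
/** Builds an fl value that asks Solr to attach child documents to each parent. */
final class ChildTransformerSketch {

    /**
     * parentFilter identifies the parent documents; childFilter (optional, may be null)
     * narrows which children are attached; limit caps the children per parent.
     */
    static String fieldList(String parentFilter, String childFilter, int limit) {
        StringBuilder fl = new StringBuilder("*,[child parentFilter=").append(parentFilter);
        if (childFilter != null && !childFilter.isEmpty()) {
            fl.append(" childFilter=").append(childFilter);
        }
        return fl.append(" limit=").append(limit).append(']').toString();
    }

    public static void main(String[] args) {
        System.out.println(fieldList("docType:Student", null, 10));
        System.out.println(fieldList("docType:Student", "docType:Book", 5));
    }
}
```

The resulting string would be passed as the `fl` parameter, for example through `sq.setParam(CommonParams.FL, ...)` in the Groovy code above.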