Solr schema.xml field reference

winson · posted 2016-12-15 08:48:27

Original article: http://quentinxxz.iyunv.com/blog/2100628

fieldType

<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>

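sortMissingLast and sortMissingFirst are optional attributes that apply to types which are sorted internally as strings (including string, boolean, sint, slong, sfloat, sdouble, pdate).
sortMissingLast="true": documents that lack the field sort after documents that have it, regardless of the sort order requested.
sortMissingFirst="true": the reverse; documents that lack the field sort before documents that have it.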
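As a small illustration of the reverse setting, the fragment below declares a string type that uses sortMissingFirst. The type name string_missing_first and the field name category are made up for this example and do not come from the original schema.

```xml
<!-- Hypothetical type: documents without a value sort first, whatever the requested order -->
<fieldType name="string_missing_first" class="solr.StrField"
           sortMissingFirst="true" omitNorms="true"/>

<!-- Hypothetical field using that type -->
<field name="category" type="string_missing_first" indexed="true" stored="true"/>
```

With this in place, a request sorted by category (ascending or descending) would be expected to list the documents that have no category value at the top.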

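TrieField types are intended for range queries, where they are said to be about 10 times faster than the plain numeric types.
precisionStep: the smaller the value, the more terms each value is split into and the more information the index has to store, but the faster range queries become.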
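A minimal sketch of how this trade-off might be expressed in a schema of that era. The type names int and tint follow the convention of the Solr example schema; the field name quantity is an assumption for illustration.

```xml
<!-- precisionStep="0": no extra precision levels are indexed; smallest index -->
<fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/>

<!-- precisionStep="8": extra terms are indexed per value; larger index, faster range queries -->
<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" positionIncrementGap="0"/>

<!-- Hypothetical field that is often queried by range, e.g. quantity:[10 TO 100] -->
<field name="quantity" type="tint" indexed="true" stored="true"/>
```

A field that is only sorted or matched exactly could use the int type, while one that is frequently queried by range would likely benefit from tint.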

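positionIncrementGap: an optional attribute that defines the position gap inserted between values of this type within the same document, which prevents false phrase matches. It is used together with multiValued and sets the number of virtual positions placed between consecutive values.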

For a detailed explanation of positionIncrementGap, see http://rockiee281.blog.163.com/blog/static/19385222920127225619919/

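These numeric types exist only for compatibility with existing indexes (created by Lucene or by earlier versions of Solr), and they do not currently support range queries.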

<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

The optional positionIncrementGap puts space between multiple fields of this type on the same document, with the purpose of preventing false phrase matching across fields.
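To make the effect concrete, here is a hedged sketch of a multiValued field that uses the text_ws type, followed by a document with two values. The field name authors and the sample values are assumptions for illustration.

```xml
<!-- schema.xml: hypothetical multiValued field using the text_ws type -->
<field name="authors" type="text_ws" indexed="true" stored="true" multiValued="true"/>

<!-- Update message: one document carrying two values for that field -->
<add>
  <doc>
    <field name="id">1</field>
    <field name="authors">John Smith</field>
    <field name="authors">Mary Jones</field>
  </doc>
</add>
```

With positionIncrementGap="100", the tokens of the second value are indexed roughly 100 positions after those of the first, so the phrase query authors:"Smith Mary" should not match this document. With a gap of 0 the two values would be adjacent and the phrase could match across the value boundary.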

<fieldType name="random" class="solr.RandomSortField" indexed="true" />

A RandomSortField is not stored and is not used to search any data; it is used to return documents in a pseudo-random order.
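One common way to use this type, sketched here on the assumption that the schema follows the Solr example configuration, is to pair it with a dynamic field so that any field name starting with random_ can be used as a sort key.

```xml
<!-- Dynamic field backed by the "random" type declared above -->
<dynamicField name="random_*" type="random"/>

<!-- A request can then sort on any matching name, for example:
       sort=random_1234 desc
     The suffix acts as a seed: the same name gives the same ordering
     (for the same index version), and a different name gives a different ordering. -->
```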

Field

The official recommendation is not to modify the id and _version_ fields.

Using any of the following optional attributes triggers storage of Lucene term vectors:

termVectors=true|false
termPositions=true|false
termOffsets=true|false

These options speed up highlighting and other related features, at the cost of a larger index.
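As an illustration, a field that is expected to be highlighted often could enable all three options. The field name content and the type text_ws are assumptions for this sketch.

```xml
<!-- Hypothetical field storing term vectors with positions and offsets -->
<field name="content" type="text_ws" indexed="true" stored="true"
       termVectors="true" termPositions="true" termOffsets="true"/>
```

Because each option adds data to the index, it is usually worth enabling them only on the fields that actually need them.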

Miscellaneous

<uniqueKey>

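Solr does not strictly require a schema to have a unique key field, but almost every schema defines one. The official recommendation is not to change this field.
If you enable QueryElevationComponent in solrconfig.xml, the schema is required to have a unique key field of type StrField.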
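A minimal sketch of a unique key declaration that satisfies the StrField requirement, assuming the string type defined at the top of this post and the conventional field name id:

```xml
<!-- The key field uses the StrField-based "string" type -->
<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false"/>

<!-- Declares "id" as the unique key of the schema -->
<uniqueKey>id</uniqueKey>
```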

When a query does not explicitly specify a search field, Solr uses this field as the default search field.

The default query operator is OR.
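In schemas from Solr versions of that period, these two settings are typically written as the elements below. This is a sketch rather than the original author's snippet, and the field name text is an assumption. Later Solr releases deprecated both elements in favour of the df and q.op request parameters.

```xml
<!-- Field searched when the query does not name one ("text" is a hypothetical field) -->
<defaultSearchField>text</defaultSearchField>

<!-- Operator applied between query terms when none is given -->
<solrQueryParser defaultOperator="OR"/>
```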

You must make sure the data types are compatible.

Specifies the scoring implementation (similarity).
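For example, a schema could select BM25 scoring with a similarity element like the one below. The choice of BM25 and the parameter values are assumptions for illustration; the original post does not say which implementation it used.

```xml
<!-- Global similarity; k1 and b are the usual BM25 tuning parameters -->
<similarity class="solr.BM25SimilarityFactory">
  <float name="k1">1.2</float>
  <float name="b">0.75</float>
</similarity>
```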
