设为首页 收藏本站
查看: 498|回复: 0

[经验分享] [ solr备忘录 ]

[复制链接]
累计签到:1 天
连续签到:1 天
发表于 2015-7-17 09:53:28 | 显示全部楼层 |阅读模式
  1.对于“关注度排序问题”的记录
  在查阅资料是发现:ExternalFileField is handy for cases where you want to update a particular field in many documents more often than you want to update the rest of the documents. For example, suppose you have some kind of document rank based on number of views . You might want to update the rank of all the documents daily or hourly , while the rest of the contents of the documents might be update much less frequently .
  Without ExternalFileField , you would need to update each document just to change the rank. Using ExternalFileField is much more efficient because all document values for a particular field are stored in an external file that can be updated as frequently as you wish .
  An attribute in the field type declaration, valType, specifies the actual type of the values that will be found in the file. Note that only pfloat fields are currently supported.





  The file itself is located in Solr's index directory, which by default is data/index in the Solr home directory.
  The file contains entries that map a key field, on the left of the equals sign, to a value, on the right.  
  (20120215)
  
  2.field type属性的用例
  Field Type Properties by Use Case

  Use Case  indexed    stored    multivalued    omitNorms    termVectors    termPositions  
search withintrue
retrieve contents true
use as unique keytrue false
sort on fieldtrue falsetrue
use field boosts false
document boosts affect searches false
highlightingtruetrue truetrue
facetingtrue
add multiple values,maintaining true
field length affects doc score false
MoreLikeThis true
  (20120215)
  
  3.Lucene's near-real-time search is fast !(NRT)
  Near Realtime!
  Near realtime search means thats documents are available for search almost immediately after being indexed - additions and updates to documents are seen in 'near' realtime .
  [lucene wiki http://wiki.apache.org/lucene-java/NearRealtimeSearch]
  ...One goal of the near realtime search design is to make NRT as transparent as possible to the user. Another is minimize the latency after an update is made to perform a search that includes the update...Index Writer manages the subreaders internally so there is no need to call reopen, instead getReader may be used. NRT adds an internal ram directory(Lucene-1313) to index writer where documents are flushed to before being merged to disk. This technique decreases the turnaround time required for updating the index when calling getReader...



IndexWriter writer; // create an IndexWriter here
Document doc = null; // create a document here
writer.addDocument(doc); // update a document
IndexReader reader = writer.getReader(); // get a reader with the new doc
Document addedDoc = reader.document(0);

  [solr wiki http://wiki.apache.org/solr/NearRealtimeSearch]
Near realtime search means thats documents are available for search almost immediately after being indexed - additions and updates to documents are seen in 'near' realtime.
Near realtime search will be added to Solr in version 4.0 and is currently available on trunk.
You can now modify a commit command to be a 'soft' commit. A soft commit will avoid parts of the standard commit that can be costly. You still will want to do normal commits to ensure that documents are on stable storage, but soft commits allow users to see a very near realtime view of the index in the meantime. Be sure to pay special attention to cache and autowarm settings as they can have a significant impact on NRT performance.



  • You can read about soft commits here: http://wiki.apache.org/solr/UpdateXmlMessages#A.22commit.22_and_.22optimize.22


  • You can see how to auto soft commit here: http://wiki.apache.org/solr/SolrConfigXml?#Update_Handler_Section


A common configuration might be to 'hard' auto commit every 1-10 minutes and 'soft' auto commit every second. With this configuration, new documents will show up within about a second of being added, and if the power goes out, you will be certain to have a consistent index up to the last 'hard' commit.
There is also a blog post detailing some of the current improvements in this area on trunk located here: http://www.lucidimagination.com/blog/2011/07/11/benchmarking-the-new-solr-%E2%80%98near-realtime%E2%80%99-improvements/
其他参考:
  http://java.dzone.com/news/lucenes-near-real-time-search
  (20120220)
  
  
  
  
  
  
  
  

运维网声明 1、欢迎大家加入本站运维交流群:群②:261659950 群⑤:202807635 群⑦870801961 群⑧679858003
2、本站所有主题由该帖子作者发表,该帖子作者与运维网享有帖子相关版权
3、所有作品的著作权均归原作者享有,请您和我们一样尊重他人的著作权等合法权益。如果您对作品感到满意,请购买正版
4、禁止制作、复制、发布和传播具有反动、淫秽、色情、暴力、凶杀等内容的信息,一经发现立即删除。若您因此触犯法律,一切后果自负,我们对此不承担任何责任
5、所有资源均系网友上传或者通过网络收集,我们仅提供一个展示、介绍、观摩学习的平台,我们不对其内容的准确性、可靠性、正当性、安全性、合法性等负责,亦不承担任何法律责任
6、所有作品仅供您个人学习、研究或欣赏,不得用于商业或者其他用途,否则,一切后果均由您自己承担,我们对此不承担任何法律责任
7、如涉及侵犯版权等问题,请您及时通知我们,我们将立即采取措施予以解决
8、联系人Email:admin@iyunv.com 网址:www.yunweiku.com

所有资源均系网友上传或者通过网络收集,我们仅提供一个展示、介绍、观摩学习的平台,我们不对其承担任何法律责任,如涉及侵犯版权等问题,请您及时通知我们,我们将立即处理,联系人Email:kefu@iyunv.com,QQ:1061981298 本贴地址:https://www.yunweiku.com/thread-87547-1-1.html 上篇帖子: 让Solr返回JSON数据 下篇帖子: Solr环境配置
您需要登录后才可以回帖 登录 | 立即注册

本版积分规则

扫码加入运维网微信交流群X

扫码加入运维网微信交流群

扫描二维码加入运维网微信交流群,最新一手资源尽在官方微信交流群!快快加入我们吧...

扫描微信二维码查看详情

客服E-mail:kefu@iyunv.com 客服QQ:1061981298


QQ群⑦:运维网交流群⑦ QQ群⑧:运维网交流群⑧ k8s群:运维网kubernetes交流群


提醒:禁止发布任何违反国家法律、法规的言论与图片等内容;本站内容均来自个人观点与网络等信息,非本站认同之观点.


本站大部分资源是网友从网上搜集分享而来,其版权均归原作者及其网站所有,我们尊重他人的合法权益,如有内容侵犯您的合法权益,请及时与我们联系进行核实删除!



合作伙伴: 青云cloud

快速回复 返回顶部 返回列表