Apache Solr 4.0: First Impressions and an Introduction to LucidWorks

ftsr · Posted on 2015-7-17 08:39:07

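　　Apache Solr 4.0 has been out for a while. The changes in this release are substantial, and the admin interface in particular makes day-to-day use and administration more convenient. Multi-core mode is now the default, and multiple collections can be managed, monitored, and optimized. Under the hood, Solr 4 also brings many new features, such as Solr Cloud, Realtime GET, NRT (Near-Real-Time Search), Master/Slave extensions with ZooKeeper integration, and Join queries.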

Installing SOLR 4
　　1. Download Solr 4 from http://lucene.apache.org/solr/
　　2. Unpack the archive and change into the example folder
　　3. Start Solr

java -jar start.jar
　　If startup reports no errors, Solr is installed and ready to use. Open a browser and go to http://localhost:8983/solr/ to see the Solr admin interface.
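　　The browser check can also be scripted. The following is a minimal sketch, assuming the default example setup with the `collection1` core and assuming that its `/admin/ping` handler returns JSON with a top-level `status` field when `wt=json` is passed. The function names are made up for illustration.

```python
import json
import urllib.request

SOLR_BASE = "http://localhost:8983/solr"  # address used in this article


def ping_url(core="collection1", base=SOLR_BASE):
    """Build the ping URL for one core (core name assumed to be the default)."""
    return f"{base}/{core}/admin/ping?wt=json"


def is_ok(body):
    """Return True if a ping response body reports status OK."""
    return json.loads(body).get("status") == "OK"


def ping(core="collection1", timeout=5):
    """Send the ping request; return False if Solr cannot be reached."""
    try:
        with urllib.request.urlopen(ping_url(core), timeout=timeout) as resp:
            return is_ok(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # Connection refused, HTTP error, timeout, or a non-JSON reply.
        return False


if __name__ == "__main__":
    print("Solr is up" if ping() else "Solr is not reachable")
```

　　If the core has a different name, pass it as the `core` argument.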

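　　The new Solr admin interface mainly offers a Dashboard, logging, collection management, thread management, and system information. Each collection also has its own pages for querying, searching, and related management tasks.
　　Opening collection1 (the default collection) shows the menu for managing that collection.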

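　　The earlier Solr interface was not AJAX-based, and there was no unified UI across multiple collections. To see what Solr looked like before 4.0, refer to the screenshots in these earlier articles in the SOLR series:

Drupal integration with Apache Solr 3.x and Chinese word segmentation
Apache Solr quick-start package with Chinese word segmentation integration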

　　SOLR 4 APIs
Although the admin interface changed considerably in Solr 4, the API URLs changed very little. The Solr 4 API URLs are listed below for reference.

/admin/file
/admin/logging
/admin/luke
/admin/mbeans
/admin/ping
/admin/plugins
/admin/properties
/admin/system
/admin/threads
/analysis/document
/analysis/field
/browse
/debug/dump
/elevate
/get
/query
/replication
/select
/spell
/terms
/tvrh
/update
/update/csv
/update/extract
/update/json
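
　　To illustrate how these paths are used, here is a rough sketch that builds a search URL for `/select` and an indexing request for `/update/json`. It assumes the handlers live under the default `collection1` core at http://localhost:8983/solr. The field names `id` and `title`, the helper names, and the sample document are made up for illustration and depend on your schema.

```python
import json
import urllib.parse
import urllib.request

SOLR_CORE = "http://localhost:8983/solr/collection1"  # assumed default core


def select_url(query, rows=10, core=SOLR_CORE):
    """Build a /select URL that asks for JSON results."""
    params = urllib.parse.urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{core}/select?{params}"


def update_request(docs, core=SOLR_CORE):
    """Build (but do not send) a POST to /update/json that adds docs and commits."""
    return urllib.request.Request(
        f"{core}/update/json?commit=true&wt=json",
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request_or_url, timeout=10):
    """Send a request to a running Solr instance and decode the JSON reply."""
    with urllib.request.urlopen(request_or_url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical document; the fields must exist in your schema.
    req = update_request([{"id": "doc1", "title": "Hello Solr 4"}])
    print(req.get_method(), req.full_url)
    print(select_url("title:hello"))
```

　　With Solr running, `send(update_request(...))` followed by `send(select_url(...))` would index a document and then search for it. The script itself only prints the requests, so it can be tried without a server.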
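　　There is currently no ready-made module for integrating Solr 4 with Drupal. However, since the request URLs have changed so little, a few modifications to the existing API should be enough to integrate Drupal with Solr 4.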
　　LucidWorks
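　　Finally, a few words about LucidWorks. LucidWorks is an enterprise-grade application built on Solr. It covers Solr integration, indexing of many kinds of data sources (files, FTP, databases, web/HTTP, Hadoop, Amazon cloud, and so on), index management, server monitoring, and more. It was formerly called LucidImagination and was later renamed LucidWorks.
　　The following two screenshots give a rough idea of the LucidWorks workflow.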
　　LucidWorks Dashboard

　　
　　LucidWorks index data source management

　　
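　　One small detail: when the author previously tested LucidImagination (the predecessor of LucidWorks), it used SOLR 4 with the new admin UI. In a test of the latest version just now, however, it used the old admin UI with a beta version of Solr 4, which is rather odd.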
　　LucidWorks Big Data
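　　LucidWorks Big Data is an integrated search service that provides management, search, and query services for very large data sets. It mainly consists of the following components: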

Product	Brief Description	Version
LucidWorks	Provides search and discovery capabilities, plus connectors to common data sources	2.1 plus plugins – Solr 4.0-SNAPSHOT
Apache Hadoop	Provides Distributed storage and general purpose distributed computation	1.0.2
Apache Mahout	Scalable Machine Learning	0.6
Apache HBase	Provides distributed storage for fast lookups based on Hadoop. Used to store metrics, user info and history, time series data	0.92
Apache ZooKeeper	Provides distributed synchronization, configuration, etc.	3.4.3
Apache Pig	Provides high-level language for manipulating large data sets for analytics and ETL	0.9.2
Apache Kafka	Provides distributed pub-sub mechanism for real time distributed data sharing and for aggregating logs into HDFS	0.7.0 (incubating)
Apache Oozie	Distributed Workflow coordination	3.2.0-SNAPSHOT for compatibility with Hadoop 1.0.2
Restlet	REST API capabilities	2.1-rc3
Behemoth	Hadoop based document processing workflow	Trunk

　　LucidWorks Product Suite

　　
　　References

http://www.lucidworks.com
http://lucene.apache.org/solr/

　　
　　Reposted from: http://www.drupal001.com/2012/10/solr-4-0-lucidworks/
