xywuyiba8 posted on 2019-1-28 14:32:54

Elasticsearch cluster

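  A passage quoted from the original text:
Elasticsearch is an open-source search engine built on Apache Lucene(TM). Lucene is arguably the most advanced, best-performing, and most fully featured search engine library available today, open source or proprietary. But Lucene is only a library. To use it you have to work in Java and integrate it directly into your application; worse, Lucene is very complex, and you need a solid understanding of information retrieval to see how it works. Elasticsearch is also written in Java and uses Lucene at its core for all indexing and searching, but its goal is to hide Lucene's complexity behind a simple RESTful API and so make full-text search easy.
However, Elasticsearch is more than Lucene and full-text search. It can also be described as:

[*]  A distributed real-time document store in which every field is indexed and searchable
[*]  A distributed search engine with real-time analytics
[*]  Able to scale to hundreds of servers and handle petabytes of structured or unstructured data
And all of this is packaged into a single service that your application can talk to through a simple RESTful API, clients for various languages, or even the command line.
Getting started with Elasticsearch is easy. It ships with sensible defaults and hides complicated search theory from beginners. It works out of the box (install it and it runs), and it takes little learning before it can be used in production. Elasticsearch is licensed under the Apache 2 license and can be downloaded, used, and modified free of charge. As your understanding grows, you can customize its advanced features for different problem domains; all of this is configurable, and the configuration is very flexible.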
  
  1. Installing elasticsearch
  # vim elasticsearch.repo
  # cat elasticsearch.repo
  [elasticsearch-2.x]
  name=Elasticsearch repository for 2.x packages
  baseurl=https://packages.elastic.co/elasticsearch/2.x/centos
  gpgcheck=1
  gpgkey=https://packages.elastic.co/GPG-KEY-elasticsearch
  enabled=1
  # cd
  # yum -y install elasticsearch
  
  # cd /usr/share/elasticsearch/
  # bin/plugin install mobz/elasticsearch-head
  -> Installing mobz/elasticsearch-head...
  Trying https://github.com/mobz/elasticsearch-head/archive/master.zip...
  Downloading....................DONE
  Verifying https://github.com/mobz/elasticsearch-head/archive/master.zip checksums if available ...
  NOTE: Unable to verify checksum for downloaded plugin (unable to find .sha1 or .md5 file to verify)
  Installed head into /usr/share/elasticsearch/plugins/head
  # bin/plugin install lmenezes/elasticsearch-kopf
  -> Installing lmenezes/elasticsearch-kopf...
  Trying https://github.com/lmenezes/elasticsearch-kopf/archive/master.zip...
  Downloading....................DONE
  Verifying https://github.com/lmenezes/elasticsearch-kopf/archive/master.zip checksums if available ...
  NOTE: Unable to verify checksum for downloaded plugin (unable to find .sha1 or .md5 file to verify)
  Installed kopf into /usr/share/elasticsearch/plugins/kopf
  #
  # ls plugins/head/
  elasticsearch-head.sublime-project  grunt_fileSets.js  LICENCE       plugin-descriptor.properties  _site  test
  Gruntfile.js                        index.html         package.json  README.textile                src
  #
  # /etc/init.d/elasticsearch start
  Starting elasticsearch:                                 [  OK  ]
  # lsof -i:9200
  COMMAND  PID          USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
  java    2621 elasticsearch   89u  IPv6  23297      0t0  TCP localhost:wap-wsp (LISTEN)
  java    2621 elasticsearch   90u  IPv6  23298      0t0  TCP localhost:wap-wsp (LISTEN)
  #
  
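Clusters and nodes (a passage quoted from the original text)
        A node is a running instance of Elasticsearch. A cluster is a group of nodes that share the same cluster.name; they work together, share data, and provide failover and scaling, although a single node can also form a cluster on its own. It is best to replace the default cluster.name with something more specific, such as your own name, so that a newly started node does not join another cluster of the same name on the same network. To do this, edit elasticsearch.yml in the config/ directory and restart Elasticsearch. When Elasticsearch runs in the foreground it can be stopped with Ctrl-C, or by calling the shutdown API: curl -XPOST 'http://localhost:9200/_shutdown' (note that the _shutdown API was removed in Elasticsearch 2.0, so with the 2.x packages installed here the init script is the way to stop a node).
  2. Configuring and starting the elasticsearch cluster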
  # vim /etc/elasticsearch/elasticsearch.yml
  # egrep "^[^#$]" /etc/elasticsearch/elasticsearch.yml
  cluster.name: mycluster
  node.name: pc2
  node.rack: pc2-attr
  path.data: /data/elasticsearch/data
  path.logs: /data/elasticsearch/logs
  path.work: /data/elasticsearch/tmp
  network.host: 0.0.0.0
  http.port: 9200
  discovery.zen.ping.unicast.hosts: ["pc2", "pc1", "pc3"]
  discovery.zen.minimum_master_nodes: 2
  script.engine.groovy.inline.update: on
  indices.fielddata.cache.size: 3g
  #
  # mkdir -p /data/elasticsearch/{data,logs,tmp,run}
  # vim /etc/init.d/elasticsearch (change the locations of the log, pid, and data directories)
  #
  # chown -R elasticsearch. /data/elasticsearch/
  # /etc/init.d/elasticsearch restart
  Stopping elasticsearch:                                 [  OK  ]
  Starting elasticsearch:                                 [  OK  ]
  # lsof -i:9300
  COMMAND  PID          USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
  java    2896 elasticsearch   76u  IPv6  52593      0t0  TCP *:vrace (LISTEN)
  java    2896 elasticsearch   90u  IPv6  55774      0t0  TCP pc2:43166->pc1:vrace (SYN_SENT)
  java    2896 elasticsearch   91u  IPv6  55775      0t0  TCP pc2:55578->pc3:vrace (SYN_SENT)
  #
  # lsof -i:9200
  COMMAND  PID          USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
  java    2896 elasticsearch   89u  IPv6  53327      0t0  TCP *:wap-wsp (LISTEN)
  #
  # curl 'http://localhost:9200/_search?pretty'
  {
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
      "total" : 0,
      "successful" : 0,
      "failed" : 0
  },
  "hits" : {
      "total" : 0,
      "max_score" : 0.0,
      "hits" : [ ]
  }
  }
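
  The value discovery.zen.minimum_master_nodes: 2 in the configuration above matches the usual majority rule for three master-eligible hosts, (master-eligible nodes / 2) + 1. A minimal Python sketch of that arithmetic follows; the function name is made up for illustration, and it assumes that pc1, pc2 and pc3 are all master-eligible, which this post does not state explicitly.

```python
def minimum_master_nodes(master_eligible_nodes: int) -> int:
    """Return the majority quorum: floor(n / 2) + 1."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


# Hosts listed in discovery.zen.ping.unicast.hosts (assumed master-eligible).
hosts = ["pc2", "pc1", "pc3"]
quorum = minimum_master_nodes(len(hosts))
print(quorum)
```

  With a quorum of 2, a single isolated node cannot elect itself master, which is the point of the setting.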
  
  

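  3. Basic use of Elasticsearch
  Elasticsearch ships with two built-in clients for Java users:
      Node client: also known as a non-data node. It stores no data of its own in the cluster, but it knows where the data lives and can forward requests directly to the right node.
      Transport client: this lighter-weight client can send requests to a remote cluster. It does not join the cluster itself; it simply forwards requests to nodes in the cluster.
         Both Java clients talk to the cluster over port 9300 using the Elasticsearch Transport Protocol. The nodes in the cluster also communicate with each other over port 9300. If this port is not open, the nodes cannot form a cluster.
  
A RESTful API over HTTP, with JSON as the data exchange format
         All other languages can talk to Elasticsearch over port 9200 through the RESTful API, using whatever web client you like; in fact, you can even talk to Elasticsearch with the curl command.
      A request sent to Elasticsearch has the same parts as an ordinary HTTP request, in the following form:
           curl -X<VERB> '<PROTOCOL>://<HOST>:<PORT>/<PATH>?<QUERY_STRING>' -d '<BODY>'
           The parts are:
           VERB The HTTP method: GET, POST, PUT, HEAD, DELETE
           PROTOCOL http or https (https is only available when an https proxy sits in front of Elasticsearch)
           HOST The hostname of any node in the Elasticsearch cluster, or localhost for a node on the local machine
           PORT The port of the Elasticsearch HTTP service, 9200 by default
           PATH The API path (for example, _count returns the number of documents in the cluster); PATH can contain several components, such as _cluster/stats or _nodes/stats/jvm
           QUERY_STRING Optional query parameters; for example, ?pretty makes the request return more readable JSON
           BODY A JSON request body (if the request needs one)
For example, to count the documents in the cluster: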
curl -XGET 'http://localhost:9200/_count?pretty' -d '
{
    "query": {
      "match_all": {}
    }
}
'
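
The request anatomy described above can also be sketched in Python with only the standard library. This is an illustration rather than an official client: build_request is a made-up helper, and the request is only constructed, not sent, so the sketch runs without a cluster.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_request(verb, host, path, protocol="http", port=9200,
                  query=None, body=None):
    """Assemble VERB, PROTOCOL, HOST, PORT, PATH, QUERY_STRING and BODY."""
    url = f"{protocol}://{host}:{port}/{path.lstrip('/')}"
    if query:
        url += "?" + urlencode(query)
    # The body, when present, is serialized to JSON bytes.
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return Request(url, data=data, method=verb)


# The same _count request as the curl example, built but not sent.
req = build_request("GET", "localhost", "_count",
                    query={"pretty": "true"},
                    body={"query": {"match_all": {}}})
print(req.get_method(), req.full_url)
```

Passing req to urllib.request.urlopen would send it to a running node.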
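Elasticsearch is document oriented, which means it can store entire objects or documents. It does not only store them, though: it also indexes the contents of each document to make them searchable. In Elasticsearch you index, search, sort, and filter documents, not rows of columnar data. This is a fundamentally different way of thinking about data, and it is one of the reasons Elasticsearch can perform complex full-text search.
Elasticsearch uses JavaScript Object Notation, or JSON, as the serialization format for documents. JSON is now supported by most languages and has become the standard format in the NoSQL world. It is concise, simple, and easy to read, for example: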
{
      "email":      "john@example.com",
      "first_name": "John",
      "last_name":"Smith",
      "info": {
        "bio":   "Eco-warrior and defender of the weak",
        "age":25,
        "interests": ["dolphins", "whales" ]
      },
      "join_date":"2016/07/05"
}
Putting this together, a document can be inserted into elasticsearch as follows:
  
  # curl -XPUT 'http://localhost:9200/example/employee/01' -d '{
      "first_name" : "John",
      "last_name" :"Smith",
      "age" :      25,
      "about" :      "I love to go rock climbing",
      "interests": [ "sports", "music" ]
  }'
  {"_index":"example","_type":"employee","_id":"01","_version":1,"_shards":{"total":2,"successful":2,"failed":0},"created":true}#
  #
  
  # curl -XGET 'http://localhost:9200/_count?pretty' -d '{
     "query": {
           "match_all": {}
     }
   } '
  {
  "count" : 1,
  "_shards" : {
  "total" : 15,
  "successful" : 15,
  "failed" : 0
  }
  }
  #
  Insert more data in the same way:
  # curl -XPUT 'http://localhost:9200/example/employee/02' -d '{
  >   "first_name" :"Jane",
  >   "last_name" :   "Smith",
  >   "age" :         32,
  >   "about" :       "I like to collect rock albums",
  >   "interests":[ "music" ]
  > }'
  {"_index":"example","_type":"employee","_id":"02","_version":1,"_shards":{"total":2,"successful":2,"failed":0},"created":true}#
  #
  # curl -XPUT 'http://localhost:9200/example/employee/03' -d '{
  >   "first_name" :"Douglas",
  >   "last_name" :   "Fir",
  >   "age" :         35,
  >   "about":      "I like to build cabinets",
  >   "interests":[ "forestry" ]
  > }'
  {"_index":"example","_type":"employee","_id":"03","_version":1,"_shards":{"total":2,"successful":2,"failed":0},"created":true}#
  # curl -XGET 'http://localhost:9200/_count?pretty' -d '{
  "query": {
  "match_all": {}
  }
  }'
  {
  "count" : 3,
  "_shards" : {
  "total" : 15,
  "successful" : 15,
  "failed" : 0
  }
  }
  #
  

  4. Searching documents in Elasticsearch
  Example 1: fetch a document by id
  # curl -XGET 'http://localhost:9200/example/employee/02'
  {"_index":"example","_type":"employee","_id":"02","_version":1,"found":true,"_source":{
      "first_name" :"Jane",
      "last_name" :   "Smith",
      "age" :         32,
      "about" :       "I like to collect rock albums",
      "interests":[ "music" ]
  }}#
  
  Example 2: search all documents
  # curl -XGET 'http://localhost:9200/example/_search'
  {"took":6,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":3,"max_score":1.0,"hits":[{"_index":"example","_type":"employee","_id":"01","_score":1.0,"_source":{
      "first_name" : "John",
      "last_name" :"Smith",
      "age" :      25,
      "about" :      "I love to go rock climbing",
      "interests": ["sports", "music" ]
  }},{"_index":"example","_type":"employee","_id":"03","_score":1.0,"_source":{
      "first_name" :"Douglas",
      "last_name" :   "Fir",
      "age" :         35,
      "about":      "I like to build cabinets",
      "interests":[ "forestry" ]
  }},{"_index":"example","_type":"employee","_id":"02","_score":1.0,"_source":{
      "first_name" :"Jane",
      "last_name" :   "Smith",
      "age" :         32,
      "about" :       "I like to collect rock albums",
      "interests":[ "music" ]
  }}]}}#
  
  Example 3: search with a query string
  # curl -XGET 'http://localhost:9200/example/_search?q=last_name:Smith'
  {"took":17,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":2,"max_score":0.30685282,"hits":[{"_index":"example","_type":"employee","_id":"01","_score":0.30685282,"_source":{
      "first_name" : "John",
      "last_name" :"Smith",
      "age" :      25,
      "about" :      "I love to go rock climbing",
      "interests": ["sports", "music" ]
  }},{"_index":"example","_type":"employee","_id":"02","_score":0.30685282,"_source":{
      "first_name" :"Jane",
      "last_name" :   "Smith",
      "age" :         32,
      "about" :       "I like to collect rock albums",
      "interests":[ "music" ]
  }}]}}#
  
  Example 4: search with the Query DSL
  # curl -XGET 'http://localhost:9200/example/_search' -d '{
  "query": {
           "match" : {
               "last_name" :"Smith"
           }
}
  }'
  {"took":14,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":2,"max_score":0.30685282,"hits":[{"_index":"example","_type":"employee","_id":"01","_score":0.30685282,"_source":{
      "first_name" : "John",
      "last_name" :"Smith",
      "age" :      25,
      "about" :      "I love to go rock climbing",
      "interests": ["sports", "music" ]
  }},{"_index":"example","_type":"employee","_id":"02","_score":0.30685282,"_source":{
      "first_name" :"Jane",
      "last_name" :   "Smith",
      "age" :         32,
      "about" :       "I like to collect rock albums",
      "interests":[ "music" ]
  }}]}}#
  
  Example 5: search with a filter
  # curl -XGET 'http://localhost:9200/example/_search' -d '{
  "query": {
        "filtered" : {
              "filter" : {
                  "range" : {
                    "age" : {"gt" : 30 }
                  }
              },
              "query" : {
                  "match" : {
                    "last_name" :"smith"
                  }
              }
        }
      }
  }'
  {"took":27,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":0.30685282,"hits":[{"_index":"example","_type":"employee","_id":"02","_score":0.30685282,"_source":{
      "first_name" :"Jane",
      "last_name" :   "Smith",
      "age" :         32,
      "about" :       "I like to collect rock albums",
      "interests":[ "music" ]
  }}]}}#
  #
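
  To make the logic of the filtered query above concrete, here is a small Python sketch that applies the same two conditions to the three employee documents in memory. It is only a simulation: the lowercase comparison stands in for the analyzer that lets the query term smith match the stored value Smith, no scoring is modeled, and filtered_search is a hypothetical helper, not an Elasticsearch API.

```python
# The three documents indexed earlier (only the fields the query touches).
EMPLOYEES = {
    "01": {"last_name": "Smith", "age": 25},
    "02": {"last_name": "Smith", "age": 32},
    "03": {"last_name": "Fir", "age": 35},
}


def filtered_search(docs, last_name, age_gt):
    """Keep documents with age > age_gt whose last_name contains the term."""
    term = last_name.lower()
    return sorted(
        doc_id
        for doc_id, doc in docs.items()
        if doc["age"] > age_gt                         # the range filter
        and term in doc["last_name"].lower().split()   # the match query
    )


print(filtered_search(EMPLOYEES, "smith", 30))
```

  Only document 02 passes both conditions, which agrees with the single hit returned above.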
  
  Example 6: full-text search
  # curl -XGET 'http://localhost:9200/example/_search' -d '{
  "query": {
        "match" : {
              "about" : "rock climbing"
        }
}
  }'
  {"took":12,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":2,"max_score":0.16273327,"hits":[{"_index":"example","_type":"employee","_id":"01","_score":0.16273327,"_source":{
      "first_name" : "John",
      "last_name" :"Smith",
      "age" :      25,
      "about" :      "I love to go rock climbing",
      "interests": ["sports", "music" ]
  }},{"_index":"example","_type":"employee","_id":"02","_score":0.016878016,"_source":{
      "first_name" :"Jane",
      "last_name" :   "Smith",
      "age" :         32,
      "about" :       "I like to collect rock albums",
      "interests":[ "music" ]
  }}]}}#
  #
  
  Example 7: phrase search
  # curl -XGET 'http://localhost:9200/example/_search' -d '{
  "query": {
      "match_phrase" : {
            "about" : "rock climbing"
      }
    }
  }'
  {"took":15,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":0.23013961,"hits":[{"_index":"example","_type":"employee","_id":"01","_score":0.23013961,"_source":{
      "first_name" : "John",
      "last_name" :"Smith",
      "age" :      25,
      "about" :      "I love to go rock climbing",
      "interests": ["sports", "music" ]
  }}]}}#
  #
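
  The difference between Example 6 and Example 7 is that match accepts a document containing any of the query terms, while match_phrase requires the terms to appear next to each other and in order. The Python sketch below imitates that behaviour on the about field of the three documents. It is a simplification under stated assumptions: the tokenizer is a plain lowercase word split, relevance scores are not computed, and the function names are invented for the illustration.

```python
import re

ABOUT = {
    "01": "I love to go rock climbing",
    "02": "I like to collect rock albums",
    "03": "I like to build cabinets",
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def match(docs, query):
    """Return ids of documents that contain at least one query term."""
    terms = set(tokenize(query))
    return sorted(i for i, text in docs.items() if terms & set(tokenize(text)))


def match_phrase(docs, query):
    """Return ids of documents that contain the query terms consecutively."""
    wanted = tokenize(query)
    hits = []
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        windows = range(len(tokens) - len(wanted) + 1)
        if any(tokens[k:k + len(wanted)] == wanted for k in windows):
            hits.append(doc_id)
    return sorted(hits)


print(match(ABOUT, "rock climbing"))
print(match_phrase(ABOUT, "rock climbing"))
```

  The first call finds documents 01 and 02, the second only 01, mirroring the two result sets above.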
  

  5. Aggregations
  # curl -XGET 'http://localhost:9200/example/_search' -d '{
  "aggs": {
   "all_interests": {
  "terms": { "field":"interests" }
  }
   }
  }'
  {"took":86,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":3,"max_score":1.0,"hits":[{"_index":"example","_type":"employee","_id":"01","_score":1.0,"_source":{
     "first_name" : "John",
     "last_name" : "Smith",
      "age":      25,
     "about" :      "I love to go rock climbing",
     "interests": [ "sports", "music" ]
  }},{"_index":"example","_type":"employee","_id":"03","_score":1.0,"_source":{
     "first_name" : "Douglas",
     "last_name" :"Fir",
      "age":         35,
     "about":      "I like to build cabinets",
     "interests":["forestry" ]
  }},{"_index":"example","_type":"employee","_id":"02","_score":1.0,"_source":{
     "first_name" : "Jane",
     "last_name" :"Smith",
      "age":         32,
     "about" :       "I like to collect rock albums",
     "interests":["music" ]
  }}]},"aggregations":{"all_interests":{"doc_count_error_upper_bound":0,"sum_other_doc_count":0,"buckets":[{"key":"music","doc_count":2},{"key":"forestry","doc_count":1},{"key":"sports","doc_count":1}]}}}#
  #
  
  # curl -XGET 'http://localhost:9200/example/_search' -d '{
   "query": {
     "match": {
     "last_name": "smith"
      }
  },
  "aggs":{
     "all_interests": {
     "terms": {
         "field": "interests"
        }
      }
  }
  }'
  {"took":7,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":2,"max_score":0.30685282,"hits":[{"_index":"example","_type":"employee","_id":"01","_score":0.30685282,"_source":{
     "first_name" : "John",
     "last_name" : "Smith",
      "age":      25,
     "about" :      "I love to go rock climbing",
     "interests": [ "sports", "music" ]
  }},{"_index":"example","_type":"employee","_id":"02","_score":0.30685282,"_source":{
     "first_name" : "Jane",
     "last_name" :"Smith",
      "age":         32,
     "about" :       "I like to collect rock albums",
     "interests":["music" ]
  }}]},"aggregations":{"all_interests":{"doc_count_error_upper_bound":0,"sum_other_doc_count":0,"buckets":[{"key":"music","doc_count":2},{"key":"sports","doc_count":1}]}}}#
  #
  
  # curl -XGET 'http://localhost:9200/example/_search' -d '{
     "aggs" : {
         "all_interests" : {
           "terms" : { "field" : "interests" },
           "aggs" : {
                 "avg_age" : {
                     "avg" : { "field" : "age" }
                  }
              }
        }
      }
  }'
  {"took":29,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":3,"max_score":1.0,"hits":[{"_index":"example","_type":"employee","_id":"01","_score":1.0,"_source":{
      "first_name": "John",
     "last_name" : "Smith",
      "age":      25,
     "about" :      "I love to go rock climbing",
     "interests": [ "sports", "music" ]
  }},{"_index":"example","_type":"employee","_id":"03","_score":1.0,"_source":{
     "first_name" :"Douglas",
     "last_name" :"Fir",
      "age":         35,
     "about":      "I like to build cabinets",
     "interests":["forestry" ]
  }},{"_index":"example","_type":"employee","_id":"02","_score":1.0,"_source":{
     "first_name" : "Jane",
      "last_name":   "Smith",
      "age":         32,
     "about" :       "I like to collect rock albums",
     "interests":["music" ]
  }}]},"aggregations":{"all_interests":{"doc_count_error_upper_bound":0,"sum_other_doc_count":0,"buckets":[{"key":"music","doc_count":2,"avg_age":{"value":28.5}},{"key":"forestry","doc_count":1,"avg_age":{"value":35.0}},{"key":"sports","doc_count":1,"avg_age":{"value":25.0}}]}}}#
  #
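
  The last aggregation above groups documents by each value of interests and computes the average age inside every bucket. The same computation can be reproduced in plain Python on the three documents, which is a handy way to check the buckets by hand. This is only a sketch of the arithmetic, not how Elasticsearch implements aggregations; terms_with_avg_age is a made-up name, and the sort (document count descending, then key) is chosen to match the order of the output above.

```python
from collections import defaultdict

# Only the fields the aggregation reads.
EMPLOYEES = [
    {"age": 25, "interests": ["sports", "music"]},
    {"age": 32, "interests": ["music"]},
    {"age": 35, "interests": ["forestry"]},
]


def terms_with_avg_age(docs):
    """Bucket documents by interest and average the ages in each bucket."""
    ages = defaultdict(list)
    for doc in docs:
        for interest in doc["interests"]:
            ages[interest].append(doc["age"])
    buckets = [
        {"key": key, "doc_count": len(vals), "avg_age": sum(vals) / len(vals)}
        for key, vals in ages.items()
    ]
    # Largest buckets first; ties broken alphabetically by key.
    buckets.sort(key=lambda b: (-b["doc_count"], b["key"]))
    return buckets


buckets = terms_with_avg_age(EMPLOYEES)
for bucket in buckets:
    print(bucket)
```

  The music bucket holds John (25) and Jane (32), so its average is 28.5, as in the response above.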
  
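Elasticsearch tries hard to hide the complexity of distributed systems. The following operations all happen automatically under the hood:

[*]  Partitioning your documents into different containers, or shards, which can live on one or more nodes.
[*]  Spreading the shards evenly across the nodes to balance the indexing and search load.
[*]  Duplicating each shard to prevent data loss from hardware failure.
[*]  Routing a request from any node in the cluster to the node that holds the relevant data.
[*]  Seamlessly expanding and relocating shards as nodes are added or removed.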
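
The routing step in the list above is commonly described by the formula shard = hash(routing) % number_of_primary_shards, where the routing value defaults to the document id. The Python sketch below illustrates the idea for the five primary shards reported for the example index. Note the assumption: zlib.crc32 is only a stand-in hash, so the shard numbers printed here will not match the ones a real cluster picks, and pick_shard is a hypothetical name.

```python
import zlib


def pick_shard(routing: str, number_of_primary_shards: int) -> int:
    """Map a routing value (by default the document id) to a shard number."""
    # Stand-in hash for illustration; Elasticsearch uses its own function.
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


for doc_id in ("01", "02", "03"):
    print(doc_id, "-> shard", pick_shard(doc_id, 5))
```

Because the result depends on the number of primary shards, that number cannot change after an index is created without the same id mapping to a different shard.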


