uytre posted on 2016-7-1 09:36:36

Optimizing ELK

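After installing ELK and getting it running, I was close to a breakdown: even with 16 GB of RAM and a 16-core CPU, it kept throwing errors.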

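1. logstash and elasticsearch report errors at the same time
logstash logs a large number of errors, possibly because es is using too much heap and has not been tuned: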
retrying failed action with response code: 503 {:level=>:warn}
too many attempts at sending event. dropping: 2016-06-16T05:44:54.464Z %{host} %{message} {:level=>:error}

elasticsearch logs a large number of errors:
too many open files

The cause is that this value is too small: "max_file_descriptors" : 2048,

# curl http://localhost:9200/_nodes/process\?pretty
{
"cluster_name" : "elasticsearch",
"nodes" : {
    "ZLgPzMqBRoyDFvxoy27Lfg" : {
      "name" : "Mass Master",
      "transport_address" : "inet",
      "host" : "localhost",
      "ip" : "127.0.0.1",
      "version" : "1.6.0",
      "build" : "cdd3ac4",
      "http_address" : "inet",
      "process" : {
      "refresh_interval_in_millis" : 1000,
      "id" : 943,
      "max_file_descriptors" : 2048,
      "mlockall" : true



Solution:
Raise the open-file limit
# ulimit -n 65535

Make the setting apply at boot

# vi /etc/profile

Add the line below to the es startup script, then restart elasticsearch
# vi /home/elk/elasticsearch-1.6.0/bin/elasticsearch
ulimit -n 65535


# curl http://localhost:9200/_nodes/process\?pretty
{
"cluster_name" : "elasticsearch",
"nodes" : {
    "_QXVsjL9QOGMD13Eb6t7Ag" : {
      "name" : "Ocean",
      "transport_address" : "inet",
      "host" : "localhost",
      "ip" : "127.0.0.1",
      "version" : "1.6.0",
      "build" : "cdd3ac4",
      "http_address" : "inet",
      "process" : {
      "refresh_interval_in_millis" : 1000,
      "id" : 1693,
      "max_file_descriptors" : 65535,
      "mlockall" : true
      }
    }
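The before/after comparison above can also be checked in a script. The following is only a sketch: the function name `nodes_with_low_fd_limit` and the 65535 threshold are my own choices, and the sample data is trimmed from the two responses shown above rather than fetched from a live node.

```python
import json


def nodes_with_low_fd_limit(nodes_process, minimum=65535):
    """Return {node_name: limit} for nodes whose max_file_descriptors is below minimum.

    nodes_process is the parsed JSON body of GET /_nodes/process.
    """
    low = {}
    for node_id, node in nodes_process.get("nodes", {}).items():
        limit = node.get("process", {}).get("max_file_descriptors")
        if limit is not None and limit < minimum:
            # Fall back to the node id when the node has no name.
            low[node.get("name", node_id)] = limit
    return low


# Trimmed samples shaped like the two responses above (before and after the fix).
before = json.loads(
    '{"nodes": {"ZLgPzMqBRoyDFvxoy27Lfg": {"name": "Mass Master",'
    ' "process": {"max_file_descriptors": 2048}}}}'
)
after = json.loads(
    '{"nodes": {"_QXVsjL9QOGMD13Eb6t7Ag": {"name": "Ocean",'
    ' "process": {"max_file_descriptors": 65535}}}}'
)

print(nodes_with_low_fd_limit(before))  # {'Mass Master': 2048}
print(nodes_with_low_fd_limit(after))   # {}
```

An empty result means every node in the response is at or above the threshold.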


2. out of memory

The es config file after tuning:
# egrep -v '^$|^#' /home/elk/elasticsearch-1.6.0/config/elasticsearch.yml
bootstrap.mlockall: true
http.max_content_length: 2000mb
http.compression: true
index.cache.field.type: soft
index.cache.field.max_size: 50000
index.cache.field.expire: 10m


bootstrap.mlockall: true also requires the following settings
# ulimit -l unlimited

# vi /etc/sysctl.conf
vm.max_map_count=262144
vm.swappiness = 1
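Edits to /etc/sysctl.conf only take effect after `sysctl -p` or a reboot, so it is worth confirming that the file really contains the two kernel settings. A rough sketch follows; `parse_sysctl` is a hypothetical helper of mine, and the sample text stands in for the real file.

```python
def parse_sysctl(text):
    """Parse sysctl.conf-style text into a {key: value} dict of strings."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments ('#' or ';').
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


# Sample content; on a real host, read /etc/sysctl.conf instead.
conf = """
# tuning for elasticsearch
vm.max_map_count=262144
vm.swappiness = 1
"""

wanted = {"vm.max_map_count": "262144", "vm.swappiness": "1"}
parsed = parse_sysctl(conf)
missing = {k: v for k, v in wanted.items() if parsed.get(k) != v}
print(missing)  # {}
```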
# ulimit -a
core file size          (blocks, -c) 0
data seg size         (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals               (-i) 127447
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65535
pipe size            (512 bytes, -p) 8
POSIX message queues   (bytes, -q) 819200
real-time priority            (-r) 0
stack size            (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes            (-u) 127447
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
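The two entries that matter here, open files and max locked memory, can also be read from inside a process with Python's standard `resource` module (Unix only). This is just a sketch: it reports the limits of the Python process itself, which match the es process only if both are started from the same shell and user.

```python
import resource


def describe(limit):
    """Render a raw rlimit value the way ulimit does for 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


def current_limits():
    """Return {label: (soft, hard)} for open files and locked memory."""
    names = {
        "open files": resource.RLIMIT_NOFILE,
        # getrlimit reports bytes here, while `ulimit -l` prints kbytes.
        "max locked memory": resource.RLIMIT_MEMLOCK,
    }
    return {
        label: tuple(describe(v) for v in resource.getrlimit(rl))
        for label, rl in names.items()
    }


for label, (soft, hard) in current_limits().items():
    print(f"{label}: soft={soft} hard={hard}")
```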


# vi /etc/security/limits.d/90-nproc.conf
*          soft    nproc   320000
root       soft    nproc   unlimited


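3. es status is yellow
es uses three colors to indicate status: green, yellow, red.
green: all primary shards and all replica shards are available
yellow: all primary shards are available, but not all replica shards are
red: not all primary shards are available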
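The three rules can be written as a small function. This is a simplified restatement of the definitions above, not how es computes the status internally; the function and parameter names are made up for illustration.

```python
def cluster_status(primaries_total, primaries_active, replicas_total, replicas_active):
    """Map shard availability to green / yellow / red per the rules above."""
    if primaries_active < primaries_total:
        return "red"      # at least one primary shard is unavailable
    if replicas_active < replicas_total:
        return "yellow"   # primaries are fine, some replicas are missing
    return "green"        # everything is available


print(cluster_status(5, 5, 5, 5))  # green
print(cluster_status(5, 5, 5, 0))  # yellow
print(cluster_status(5, 4, 5, 5))  # red
```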

# curl -XGET http://localhost:9200/_cluster/health\?pretty
{
"cluster_name" : "elasticsearch",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 2,
"number_of_data_nodes" : 1,
"active_primary_shards" : 161,
"active_shards" : 161,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 161,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0
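The numbers in this response explain the yellow status: there are 161 active primaries and 161 unassigned shards, but only one data node, and es never places a replica on the same node as its primary. A sketch that reads those fields follows; `explain_yellow` is a hypothetical helper, and the dict is copied from the response above.

```python
def explain_yellow(health):
    """Give a one-line reading of a parsed GET /_cluster/health response."""
    status = health["status"]
    unassigned = health["unassigned_shards"]
    data_nodes = health["number_of_data_nodes"]
    if status != "yellow":
        return f"status is {status}, nothing to explain"
    if data_nodes == 1 and unassigned > 0:
        return (f"{unassigned} replica shards cannot be assigned: "
                "only 1 data node, and a replica never shares a node with its primary")
    return f"{unassigned} shards unassigned across {data_nodes} data nodes"


# Fields taken from the health response above.
health = {
    "status": "yellow",
    "number_of_nodes": 2,
    "number_of_data_nodes": 1,
    "active_primary_shards": 161,
    "active_shards": 161,
    "unassigned_shards": 161,
}

print(explain_yellow(health))
```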

Solution: set up an elasticsearch cluster (to be covered in the next blog post)


4. kibana "not indexed" error
https://rafaelmt.net/en/2015/09/01/kibana-tutorial/#refresh-fields
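The fields in the kibana index pattern change frequently as new events arrive, so kibana visualizations sometimes report a "not indexed" error:

Solution:
Open kibana, choose Settings, click Indices, then click logstash-*. Click the refresh icon and the error goes away.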
