Optimizing ELK

Posted by uytre on 2016-7-1 09:36:36

After getting ELK installed and running, I was close to despair: even with 16 GB of RAM and a 16-core CPU, it kept throwing errors.

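1. logstash and elasticsearch report errors at the same time
logstash logs a large number of errors, possibly because es has not been tuned and is using too much heap: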
retrying failed action with response code: 503 {:level=>:warn}
too many attempts at sending event. dropping: 2016-06-16T05:44:54.464Z %{host} %{message} {:level=>:error}

elasticsearch logs a large number of errors:
too many open files

The cause is that this value is too small: "max_file_descriptors" : 2048,

# curl http://localhost:9200/_nodes/process\?pretty
{
"cluster_name" : "elasticsearch",
"nodes" : {
"ZLgPzMqBRoyDFvxoy27Lfg" : {
   "name" : "Mass Master",
   "transport_address" : "inet",
   "host" : "localhost",
   "ip" : "127.0.0.1",
   "version" : "1.6.0",
   "build" : "cdd3ac4",
   "http_address" : "inet",
   "process" : {
   "refresh_interval_in_millis" : 1000,
   "id" : 943,
   "max_file_descriptors" : 2048,
   "mlockall" : true
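
Instead of reading the JSON by eye, the check can be scripted. This is only a rough sketch: max_fds and fds_at_least are helper names I made up, and it assumes es answers on localhost:9200 as in the command above.

```shell
# Print every max_file_descriptors value found in _nodes/process JSON read from stdin.
max_fds() {
    sed -n 's/.*"max_file_descriptors"[ ]*:[ ]*\([0-9][0-9]*\).*/\1/p'
}

# Succeed only if at least one value was found and all of them are >= the given minimum.
fds_at_least() {
    max_fds | awk -v min="$1" '
        { n++ }
        $1 + 0 < min + 0 { bad = 1 }
        END { if (bad || n == 0) exit 1 }
    '
}

# Usage (hypothetical, against the node from this post):
#   curl -s 'http://localhost:9200/_nodes/process?pretty' | fds_at_least 65535 && echo ok
```

With the output shown above (2048) the check fails, which matches the "too many open files" errors.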

Solution:
Raise the open file limit:
# ulimit -n 65535

Make the setting apply at boot:

# vi /etc/profile

Add the line to the es startup script, then restart elasticsearch:
# vi /home/elk/elasticsearch-1.6.0/bin/elasticsearch
ulimit -n 65535
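
Note that ulimit -n only affects the current shell and the processes started from it, so it is worth checking what the running es process actually got. A minimal sketch, assuming a Linux /proc filesystem; soft_nofile and ES_PID are names I picked for illustration (the pid is the "id" field reported by _nodes/process).

```shell
# Print the soft "Max open files" limit from a /proc/<pid>/limits style file.
# Columns on that line: Max open files <soft> <hard> <units>, so the soft limit is field 4.
soft_nofile() {
    awk '/^Max open files/ { print $4 }' "$1"
}

# Usage (ES_PID is a placeholder for the elasticsearch process id):
#   soft_nofile "/proc/$ES_PID/limits"
```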

# curl http://localhost:9200/_nodes/process\?pretty
{
"cluster_name" : "elasticsearch",
"nodes" : {
"_QXVsjL9QOGMD13Eb6t7Ag" : {
   "name" : "Ocean",
   "transport_address" : "inet",
   "host" : "localhost",
   "ip" : "127.0.0.1",
   "version" : "1.6.0",
   "build" : "cdd3ac4",
   "http_address" : "inet",
   "process" : {
   "refresh_interval_in_millis" : 1000,
   "id" : 1693,
   "max_file_descriptors" : 65535,
   "mlockall" : true
   }
}

2. Out of memory errors

The es configuration file after tuning:
# egrep -v '^$|^#' /home/elk/elasticsearch-1.6.0/config/elasticsearch.yml
bootstrap.mlockall: true
http.max_content_length: 2000mb
http.compression: true
index.cache.field.type: soft
index.cache.field.max_size: 50000
index.cache.field.expire: 10m

bootstrap.mlockall: true additionally requires:
# ulimit -l unlimited

# vi /etc/sysctl.conf
vm.max_map_count=262144
vm.swappiness = 1
# ulimit -a
core file size       (blocks, -c) 0
data seg size       (kbytes, -d) unlimited
scheduling priority          (-e) 0
file size             (blocks, -f) unlimited
pending signals             (-i) 127447
max locked memory    (kbytes, -l) unlimited
max memory size       (kbytes, -m) unlimited
open files                   (-n) 65535
pipe size          (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority          (-r) 0
stack size          (kbytes, -s) 10240
cpu time             (seconds, -t) unlimited
max user processes          (-u) 127447
virtual memory       (kbytes, -v) unlimited
file locks                   (-x) unlimited
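
Editing /etc/sysctl.conf does not change the running kernel by itself; the values load at boot or with sysctl -p. The sketch below compares what is written in the file with what the kernel reports. conf_value is a helper name of my own and the parsing is deliberately simple (key=value lines, comments skipped).

```shell
# Print the value assigned to a key in a sysctl.conf style file.
# Comment lines are skipped; if the key appears more than once, the last one wins.
conf_value() {
    awk -F= -v key="$1" '
        /^[ \t]*[#;]/ { next }
        {
            k = $1; gsub(/[ \t]/, "", k)
            if (k == key) { v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v); val = v }
        }
        END { if (val != "") print val }
    ' "$2"
}

# Usage (as root; not run here):
#   sysctl -p
#   for k in vm.max_map_count vm.swappiness; do
#       [ "$(conf_value "$k" /etc/sysctl.conf)" = "$(sysctl -n "$k")" ] || echo "$k differs"
#   done
```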

# vi /etc/security/limits.d/90-nproc.conf
*       soft nproc 320000
root    soft nproc unlimited
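
The post does not show how the heap was sized. Since mlockall pins the whole JVM heap in RAM, the heap size matters for the out of memory errors too. A common rule of thumb (my assumption, not something from the original post) is to give es about half of physical memory through the ES_HEAP_SIZE variable that the 1.x startup script reads. half_ram_heap is a made-up helper name.

```shell
# Suggest a heap size of half the physical RAM, in megabytes,
# from a meminfo style file (defaults to /proc/meminfo, where MemTotal is in kB).
half_ram_heap() {
    awk '/^MemTotal:/ { printf "%dm\n", $2 / 1024 / 2 }' "${1:-/proc/meminfo}"
}

# Usage (set before starting elasticsearch; not run here):
#   export ES_HEAP_SIZE=$(half_ram_heap)
```

On the 16G machine from this post that works out to roughly 8g, leaving the rest for the filesystem cache and logstash.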

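3. es status is yellow
es reports cluster health with three colors: green, yellow, red.
green: all primary and replica shards are available
yellow: all primary shards are available, but not all replica shards are
red: not all primary shards are available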

# curl -XGET http://localhost:9200/_cluster/health\?pretty
{
"cluster_name" : "elasticsearch",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 2,
"number_of_data_nodes" : 1,
"active_primary_shards" : 161,
"active_shards" : 161,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 161,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0
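
For monitoring, the two interesting fields can be pulled out of the health output in one line. A small sketch; health_summary is a name I made up, and it relies on the ?pretty format (one key per line) shown above.

```shell
# Print "<status> <unassigned_shards>" from pretty-printed _cluster/health JSON on stdin.
health_summary() {
    awk -F'"' '
        $2 == "status"            { status = $4 }
        $2 == "unassigned_shards" { n = $3; gsub(/[^0-9]/, "", n) }
        END { print status, n }
    '
}

# Usage (not run here):
#   curl -s 'http://localhost:9200/_cluster/health?pretty' | health_summary
```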

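Solution: build an elasticsearch cluster (to be covered in the next blog post). The output above shows only one data node, so the replica of each of the 161 primary shards cannot be allocated (es never places a replica on the same node as its primary), which is why unassigned_shards is 161.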
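
Until a second data node exists, another way to get to green is to set the replica count to 0 through the index settings API. This is not what the original post does and it removes redundancy, so take it as an assumption-laden sketch. replica_settings is a hypothetical helper that only builds the request body.

```shell
# Build the index settings body that sets the replica count.
# The argument must be a non-negative integer.
replica_settings() {
    case $1 in
        ''|*[!0-9]*) echo "usage: replica_settings <count>" >&2; return 1 ;;
    esac
    printf '{"index":{"number_of_replicas":%s}}\n' "$1"
}

# Usage (applies to all existing indices; not run here):
#   curl -XPUT 'http://localhost:9200/_settings' -d "$(replica_settings 0)"
```

This only changes indices that already exist. New logstash indices still get the default replica count unless the index template is changed as well.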

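4. kibana "not indexed" error
https://rafaelmt.net/en/2015/09/01/kibana-tutorial/#refresh-fields
The fields in the logstash indices change as new events arrive, while kibana keeps its own field list for the index pattern, so kibana charts sometimes show a "not indexed" error:

Solution:
Open kibana, choose settings, click indices, then click logstash-*. Click the refresh icon and the error goes away.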
