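I recently deployed Zabbix monitoring on a web server and added monitoring of TCP connection states, but Zabbix kept reporting that keys in the TCP template were not supported.
For example:
item "xxxxxx:tcp.status[listen]" became not supported: Received value [39 39] is not suitable for value type [Numeric (unsigned)] and data type [Decimal]
The TCP connection state monitoring uses the template from "Monitoring TCP connection states with Zabbix", which relies on netstat to collect the TCP connection information: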
/bin/netstat -an|awk '/^tcp/{++S[$NF]}END{for(a in S) print a,S[a]}'
Running it by hand on this host:
$ time /bin/netstat -an|awk '/^tcp/{++S[$NF]}END{for(a in S) print a,S[a]}'
TIME_WAIT 17164
ESTABLISHED 4330
SYN_RECV 2
LAST_ACK 1
LISTEN 39
real    0m8.944s
user    0m0.495s
sys     0m8.508s
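The run took 8.944 seconds with 17164 connections in TIME_WAIT. My guess is that each invocation by Zabbix takes so long that runs overlap while netstat's output is being written to /tmp/tcp_status.txt, so the file sometimes ends up with two LISTEN lines. The script then returns two values ("39 39" in the error above), which Zabbix rejects because the item expects a single unsigned number.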
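To make the failure mode concrete, here is a minimal sketch that reproduces the rejected value. The sample file contents and the `read_metric` helper are my own illustration, not part of the template, and the sketch assumes the value is extracted with awk and then echoed without quotes.

```shell
#!/bin/bash
# read_metric FILE PATTERN: print the count column of every line matching PATTERN.
# Hypothetical helper, assumed to mirror an awk lookup followed by an unquoted echo.
read_metric() {
    local output
    output=$(awk -v pat="$2" '$0 ~ pat {print $2}' "$1")
    # Unquoted on purpose: word splitting joins multiple matches with spaces.
    echo $output
}

# Made-up status file in which LISTEN was written twice by overlapping runs.
sample=$(mktemp)
printf '%s\n' 'TIME_WAIT 17164' 'LISTEN 39' 'LISTEN 39' > "$sample"

read_metric "$sample" LISTEN   # prints: 39 39
rm -f "$sample"
```

With a single LISTEN line in the file the same call prints one number, which is what a Numeric (unsigned) item accepts.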
So I switched to the ss command and modified the script.
$ whereis ss
ss: /usr/sbin/ss /usr/share/man/man8/ss.8.gz
$ rpm -qf /usr/sbin/ss
iproute-2.6.32-23.el6.x86_64
ss is another command that does much the same job as netstat.
ss -t -a shows all TCP sockets
ss -u -a shows all UDP sockets
Here is the same count done with ss:
$ time ss -tan|awk 'NR>1{++S[$1]}END{for (a in S) print a,S[a]}'
SYN-RECV 1
ESTAB 4687
TIME-WAIT 20250
LISTEN 39
real    0m0.238s
user    0m0.192s
sys     0m0.083s
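As the timings show, ss has a clear advantage when the server holds a very large number of connections: 0.238 seconds versus 8.944 seconds for netstat.
So the monitoring script needs to be changed to collect the states with ss.
The ss man page does not document the values of the State column. From misc/ss.c in the iproute source, the State field takes mainly the following values: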
static const char *sstate_name[] = {
    "UNKNOWN",
    [TCP_ESTABLISHED] = "ESTAB",
    [TCP_SYN_SENT] = "SYN-SENT",
    [TCP_SYN_RECV] = "SYN-RECV",
    [TCP_FIN_WAIT1] = "FIN-WAIT-1",
    [TCP_FIN_WAIT2] = "FIN-WAIT-2",
    [TCP_TIME_WAIT] = "TIME-WAIT",
    [TCP_CLOSE] = "UNCONN",
    [TCP_CLOSE_WAIT] = "CLOSE-WAIT",
    [TCP_LAST_ACK] = "LAST-ACK",
    [TCP_LISTEN] = "LISTEN",
    [TCP_CLOSING] = "CLOSING",
};
ESTAB
SYN-SENT
SYN-RECV
FIN-WAIT-1
FIN-WAIT-2
TIME-WAIT
UNCONN
CLOSE-WAIT
LAST-ACK
LISTEN
CLOSING
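Since these names differ from the ones netstat prints (ESTAB instead of ESTABLISHED, UNCONN instead of CLOSE, hyphens instead of underscores), it may help to keep the translation in one place. The following is only a sketch: the function name `ss_state_for_metric` is made up for illustration, and the metric names on the left are the item parameters the monitoring script accepts (as in tcp.status[listen]).

```shell
#!/bin/bash
# Map an item parameter such as "listen" to the State string printed by ss.
# Hypothetical helper; returns non-zero for an unknown metric name.
ss_state_for_metric() {
    case "$1" in
        closed)      echo "UNCONN" ;;
        listen)      echo "LISTEN" ;;
        synrecv)     echo "SYN-RECV" ;;
        synsent)     echo "SYN-SENT" ;;
        established) echo "ESTAB" ;;
        timewait)    echo "TIME-WAIT" ;;
        closing)     echo "CLOSING" ;;
        closewait)   echo "CLOSE-WAIT" ;;
        lastack)     echo "LAST-ACK" ;;
        finwait1)    echo "FIN-WAIT-1" ;;
        finwait2)    echo "FIN-WAIT-2" ;;
        *)           return 1 ;;
    esac
}

ss_state_for_metric established   # prints: ESTAB
```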
So I modified the script as follows:
#!/bin/bash
# This script is used to get TCP connection status counts for Zabbix.
# Usage: pass one state name (for example "listen") as the first argument.
metric=$1
tmp_file=/tmp/tcp_status.txt
# Old netstat-based collection, kept for reference:
#/bin/netstat -an|awk '/^tcp/{++S[$NF]}END{for(a in S) print a,S[a]}' > $tmp_file
# Count sockets per state (first column of ss output, header line skipped).
/usr/sbin/ss -tan|awk 'NR>1{++S[$1]}END{for (a in S) print a,S[a]}' > "$tmp_file"
# State names printed by ss:
#ESTAB
#SYN-SENT
#SYN-RECV
#FIN-WAIT-1
#FIN-WAIT-2
#TIME-WAIT
#UNCONN
#CLOSE-WAIT
#LAST-ACK
#LISTEN
#CLOSING
# Each branch prints the count for one state, or 0 if the state is absent.
case "$metric" in
    closed)
        output=$(awk '/UNCONN/{print $2}' "$tmp_file")
        if [ "$output" == "" ];then
            echo 0
        else
            echo "$output"
        fi
        ;;
    listen)
        output=$(awk '/LISTEN/{print $2}' "$tmp_file")
        if [ "$output" == "" ];then
            echo 0
        else
            echo "$output"
        fi
        ;;
    synrecv)
        output=$(awk '/SYN-RECV/{print $2}' "$tmp_file")
        if [ "$output" == "" ];then
            echo 0
        else
            echo "$output"
        fi
        ;;
    synsent)
        output=$(awk '/SYN-SENT/{print $2}' "$tmp_file")
        if [ "$output" == "" ];then
            echo 0
        else
            echo "$output"
        fi
        ;;
    established)
        output=$(awk '/ESTAB/{print $2}' "$tmp_file")
        if [ "$output" == "" ];then
            echo 0
        else
            echo "$output"
        fi
        ;;
    timewait)
        output=$(awk '/TIME-WAIT/{print $2}' "$tmp_file")
        if [ "$output" == "" ];then
            echo 0
        else
            echo "$output"
        fi
        ;;
    closing)
        output=$(awk '/CLOSING/{print $2}' "$tmp_file")
        if [ "$output" == "" ];then
            echo 0
        else
            echo "$output"
        fi
        ;;
    closewait)
        output=$(awk '/CLOSE-WAIT/{print $2}' "$tmp_file")
        if [ "$output" == "" ];then
            echo 0
        else
            echo "$output"
        fi
        ;;
    lastack)
        output=$(awk '/LAST-ACK/{print $2}' "$tmp_file")
        if [ "$output" == "" ];then
            echo 0
        else
            echo "$output"
        fi
        ;;
    finwait1)
        output=$(awk '/FIN-WAIT-1/{print $2}' "$tmp_file")
        if [ "$output" == "" ];then
            echo 0
        else
            echo "$output"
        fi
        ;;
    finwait2)
        output=$(awk '/FIN-WAIT-2/{print $2}' "$tmp_file")
        if [ "$output" == "" ];then
            echo 0
        else
            echo "$output"
        fi
        ;;
    *)
        echo -e "\e[033mUsage: sh $0 [closed|closing|closewait|synrecv|synsent|finwait1|finwait2|listen|established|lastack|timewait]\e[0m"
esac
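If the overlapping writes to the shared temporary file are indeed what produced the duplicate lines, one option is to skip the file and count directly from the ss output. This is a sketch of that idea, not the script used above: the `count_state` function is a name I made up, it takes the ss state name itself (for example TIME-WAIT) rather than the item parameter, and it matches the first column exactly.

```shell
#!/bin/bash
# count_state STATE
# Reads `ss -tan` style output on stdin and prints how many sockets are in
# STATE (exact match on the first column); prints 0 when there are none.
count_state() {
    awk -v want="$1" 'NR > 1 && $1 == want { n++ } END { print n + 0 }'
}

# Only query the live system when a state name is given on the command line,
# e.g. ./tcp_count.sh TIME-WAIT (the script name is hypothetical).
if [ "$#" -eq 1 ]; then
    /usr/sbin/ss -tan | count_state "$1"
fi
```

Because each invocation reads its own pipe, two overlapping runs cannot see each other's partial output. The trade-off is that ss runs once per item instead of once per collection cycle, which looks acceptable here given the 0.238 second run time measured above.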
After switching from netstat to ss for collecting the connection statistics, I have not received the alert again.