yadianna 发表于 2019-1-18 09:26:34

zabbix监控业务进程变动

  Zabbix 监控进程宕机
  业务需求后端进程宕机以后能在短时间内迅速拉起,业务影响不大,但是开发需要查看coredump,要求能监控到pid变化;在现有构架下zabbix能监控并报警;
  当然zabbix设置报警设置就不再一一
  在每台服务器/etc/zabbix/zabbix_agentd.conf设置路径:此例只需要piddiff.sh
  UserParameter=checkpid,sh /usr/local/script/piddiff.sh
  UserParameter=test,sh /usr/local/script/test.sh
  UserParameter=discovery.process,/usr/local/script/disprocess.sh
  UserParameter=process.check
[*],/usr/local/script/proc_check.sh $1 $2 $3
  /usr/local/script下面存放脚本
  Vim piddiff.sh
  aapid为业务监控id 取值根据业务需求;
  #/bin/sh
  onl_ok=1
  onl_cored=3
  dir=/usr/local/script
  if [[ ! -f "$dir/old.txt" ]];then
   ps aux|grep aapid |grep -v grep|grep -v /bin/bash|awk'{print $2,$11}' > $dir/old.txt
        else
           sleep 1s            
  fi
   ps aux|grep aapid |grep -v grep|grep -v /bin/bash|awk'{print $2,$11}' > $dir/now.txt
  if ! diff -q $dir/old.txt$dir/now.txt > /dev/null; then
            echo $onl_cored
            diff -c $dir/old.txt$dir/now.txt > $dir/`date "+%Y%m%d%H%M"`_diff.txt
            cat $dir/now.txt >$dir/old.txt
     else
            echo $onl_ok   
  fi
  一个简单的判断脚本;

  Zabbix30秒会抓取一次,正常没变化为1,有变化为3,那么zabbix抓取数值为3则表示pid有变化,会发出警报;
  Zabbix设置:
  
  监控项模板添加如下:
  http://s1.运维网.com/images/20180223/1519371651993152.png
  触发器:{Template OS Linux:checkpid.last()}=3
http://s1.运维网.com/images/20180223/1519371669385184.png



页: [1]
查看完整版本: zabbix监控业务进程变动