
[Experience sharing] Problems encountered when integrating nagios+cacti, and their solutions


Posted on 2019-1-11 07:50:09
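  7.      Nagios daily health-check SMS alert
  If you have no carrier SMS gateway channel, have the monitoring platform send one SMS every day at 4:00 PM whether or not anything has failed. This lets the administrator confirm that both the SMS alerting and the nagios service are working.
  The check is set up as follows: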
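  7.1.      Write the check script
  # cat /root/sh/nagios_check.sh
  #!/bin/bash
  # author: Kevin@cmcc.com.cn
  # Daily check of the nagios service; an SMS is sent in either case so the SMS channel is exercised too.
  nid=/usr/local/nagios/var/nagios.lock    # lock file created by the running nagios daemon
  if [ -f "$nid" ]
  then
      # lock file present: report that nagios is healthy
      /usr/local/nagios/libexec/sms/sendsms.sh 13800000000 "Nagios service is OK, don't worry!"
      echo "nagios service is ok"
  else
      # lock file missing: start nagios, then report the restart
      /etc/init.d/nagios start
      /usr/local/nagios/libexec/sms/sendsms.sh 13800000000 "Nagios service was restarted, it is OK now"
  fi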
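The script above treats the existence of nagios.lock as proof that Nagios is running, but a stale lock file can survive a crash. Below is a minimal sketch of a stricter check. It assumes the lock file holds the daemon's PID and that the check runs as root (as the crontab entry in /root suggests). The function name nagios_alive is hypothetical.

```shell
#!/bin/bash
# nagios_alive LOCKFILE
# Succeeds only if LOCKFILE exists, holds a numeric PID, and that PID is alive.
nagios_alive() {
    local lockfile="$1" pid
    [ -f "$lockfile" ] || return 1
    # Read the first line of the lock file; assumed to be the daemon PID.
    read -r pid < "$lockfile" || [ -n "$pid" ] || return 1
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric content
    esac
    # kill -0 sends no signal; it only tests whether the process can be signalled.
    kill -0 "$pid" 2>/dev/null
}
```

In nagios_check.sh the test `[ -f "$nid" ]` could then be replaced by `nagios_alive "$nid"`.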
  7.2.      Add the crond schedule
  # crontab -e
  Add the following line:
  00 16 * * *      /root/sh/nagios_check.sh > /root/sh/nagios_check.log 2>&1
  7.3.      Configure Fetion robot alerts
  7.3.1.      Add the following to the commands.cfg configuration file:
  #host-notify-by-sms
  define command {
  command_name host-notify-by-sms
  command_line /usr/local/nagios/libexec/sms/sendsms.sh 13800000000 " ** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is AT: $DATE$ $HOSTSTATE$ ** "
  }
  #service-notify-by-sms
  define command {
  command_name service-notify-by-sms
  command_line /usr/local/nagios/libexec/sms/sendsms.sh 13800000000 " *** $NOTIFICATIONTYPE$ $HOSTNAME$   $DATE$ $TIME$ $SERVICEDESC$ is $SERVICESTATE$ info:$SERVICEOUTPUT$ *** "
  }
  7.3.2.      Add to contacts.cfg:
  define contact{
  contact_name sms-members
  use sms-contact
  alias Nagios Admin SMS
  email admin@139.com
  pager   13800000000
  }
  define contactgroup{
  contactgroup_name admins
  alias Nagios Administrators
  members sms-members
  }
  7.3.3.      Templates.cfg
  define contact{
  name                            sms-contact
  service_notification_period     24x7
  host_notification_period        24x7
  service_notification_options    w,u,c,r,f,s
  host_notification_options       d,u,r,f,s
  service_notification_commands   service-notify-by-sms
  host_notification_commands      host-notify-by-sms
  register                        0
  }
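After editing commands.cfg, contacts.cfg and templates.cfg, it helps to validate the configuration before reloading, because a command or template name that does not match its definition stops Nagios from starting. Below is a minimal sketch using the `-v` option of the nagios binary. The default paths follow the /usr/local/nagios layout used in this post, and the function name verify_nagios_config is hypothetical.

```shell
#!/bin/bash
# verify_nagios_config [BINARY] [CONFIG]
# Runs the Nagios pre-flight check and reports the result.
verify_nagios_config() {
    local bin="${1:-/usr/local/nagios/bin/nagios}"
    local cfg="${2:-/usr/local/nagios/etc/nagios.cfg}"
    if "$bin" -v "$cfg"; then
        echo "configuration OK"
    else
        echo "configuration check failed; not reloading" >&2
        return 1
    fi
}
```

A possible use on the monitoring host is `verify_nagios_config && /etc/init.d/nagios reload`.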
  7.3.4.      Change the size of the monitoring graphs on the display page:        /usr/local/nagios/etc/pnp/config.php
  # vim /usr/local/nagios/etc/pnp/config.php
  $conf['graph_width'] = "500";
  $conf['graph_height'] = "100";

  These two lines define the width and height of the graphs on the monitoring page.
  8.      Troubleshooting
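  8.1.      Error when modifying a service from the web interface
  For example, the web pages let you temporarily reschedule a service check or disable its notifications, but these actions often fail with the following error: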
  Could not open command file '/usr/local/nagios/var/rw/nagios.cmd' for update!
  The permissions on the external command file and/or directory may be incorrect. Read the FAQs on how to setup proper permissions.
  An error occurred while attempting to commit your command for processing.
  nagios.cfg contains the following on this subject:
  # EXTERNAL COMMAND FILE
  # This is the file that Nagios checks for external command requests.
  # It is also where the command CGI will write commands that are submitted
  # by users, so it must be writeable by the user that the web server
  # is running as (usually 'nobody'). Permissions should be set at the
  # directory level instead of on the file, as the file is deleted every
  # time its contents are processed.
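  The key point of this passage is that the user apache runs as must have write permission on the file. The permission should be set on the directory, because the file is deleted each time its contents have been processed.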
  command_file=/usr/local/nagios/var/rw/nagios.cmd
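  Adding apache, the user that apache2 runs as, to the nagios group should have been enough.
  In this case it was not, so the permissions on the rw directory and the files in it were changed to 777, after which it worked.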
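Mode 777 lets every local user write to the command file directory. Below is a narrower sketch that grants group write access and sets the setgid bit, so that files created in the directory inherit the group. It assumes a dedicated group containing both the nagios user and the web server user. The official quickstart calls that group nagcmd; the group name, the apache user name and the function name share_cmd_dir are assumptions for this install.

```shell
#!/bin/bash
# share_cmd_dir DIR GROUP
# Gives GROUP write access to DIR and sets setgid so new files inherit GROUP.
share_cmd_dir() {
    local dir="$1" group="$2"
    chgrp "$group" "$dir" || return 1
    # 2775: rwx for owner and group, r-x for others, plus setgid on the directory
    chmod 2775 "$dir"
}
# Hypothetical usage on the Nagios host (as root; restart httpd afterwards so
# the new group membership takes effect):
#   usermod -a -G nagcmd apache
#   share_cmd_dir /usr/local/nagios/var/rw nagcmd
```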
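  8.2.      Nothing is displayed when clicking the host or service options
  After installing nagios, the main page loads, but clicking the host or service options displays nothing. The Apache error log
  reports: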
  [Wed Sep 01 17:31:32 2010] [error] [client 222.128.103.52] Premature end of script headers: status.cgi, referer: http://public.ipaddr/nagios/side.php
  [Wed Sep 01 17:31:33 2010] [error] [client 222.128.103.52] (13)Permission denied: exec of '/usr/local/nagios/sbin/status.cgi' failed, referer: http://public.ipaddr/nagios/side.php
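  Solution: the cause is that SELinux is enabled; check its state with getenforce.
  Put SELinux into permissive mode:
  setenforce 0
  To make the change permanent, edit the setting in /etc/selinux/config and reboot the system.
  Instead of disabling SELinux or changing its mode permanently, you can let the CGIs run under SELinux enforcing/targeted mode: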
  chcon -R -t httpd_sys_content_t /usr/local/nagios/sbin/
  chcon -R -t httpd_sys_content_t /usr/local/nagios/share/
  Turning off enforcement is enough to fix the problem.
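The permanent change to /etc/selinux/config mentioned above amounts to rewriting the SELINUX= line. Below is a minimal sketch, assuming GNU sed and the standard layout of that file. The function name set_selinux_mode is hypothetical.

```shell
#!/bin/bash
# set_selinux_mode MODE [CONFIG]
# Rewrites the SELINUX= line of CONFIG; takes effect at the next reboot.
set_selinux_mode() {
    local mode="$1" cfg="${2:-/etc/selinux/config}"
    case "$mode" in
        enforcing|permissive|disabled) ;;
        *) echo "invalid mode: $mode" >&2; return 1 ;;
    esac
    # Only the SELINUX= line is touched; SELINUXTYPE= and comments are left alone.
    sed -i "s/^SELINUX=.*/SELINUX=$mode/" "$cfg"
}
# Hypothetical usage (as root): set_selinux_mode permissive
```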
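  8.3.   With nagios 3.2.0 and later, an error page appears when visiting http://ip/nagios after installation.
  Solution: nagios 3.2.0 switched its pages from html to php, so php is a prerequisite for a first-time installation.
  Running yum install php is enough.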
  8.4.      The pnp sun icon appears, but clicking it reports the following error:
  Initalising
  Using /usr/local/nagios/share/perfdata/
  RRDTool /usr/bin/rrdtool found.
  RRDTool /usr/bin/rrdtool is executable
  PHP Function proc_open is enabled
  PHP Function fpassthru is enabled
  PHP Function xml_parser_create is enabled
  PHP zlib Support found.
  PHP GD Support can’t found.
  Solution:
  # yum -y install php-gd
  # service httpd restart
  Click the sun icon again; if the graph page is displayed, everything is working.
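Whether the GD extension is loaded can also be checked from the command line before returning to the browser. Below is a minimal sketch that filters the module list printed by `php -m`. It assumes the php command-line binary is installed and reads the same extension configuration as the Apache module. The function name has_php_module is hypothetical.

```shell
#!/bin/bash
# has_php_module NAME
# Reads a module list on stdin (one name per line, as printed by `php -m`)
# and succeeds if NAME is present, ignoring case.
has_php_module() {
    grep -qix "$1"
}
# Usage on the web server:
#   php -m | has_php_module gd && echo "GD is loaded"
```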
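  8.5.      After installing Nagios, the Status Map and Alert Histogram links do not open, and the error says that statusmap.cgi and histogram.cgi cannot be found.
  Cause and solution:
  Cause 1: gd-devel is not installed, so the Nagios build does not generate statusmap.cgi and histogram.cgi.
  Cause 2: Nagios was compiled before gd-devel was installed, so the CGIs were likewise not generated.
  In both cases, install gd-devel, then recompile and reinstall Nagios.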
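A quick way to confirm this diagnosis, before and after rebuilding, is to list which of the two CGIs are absent from the CGI directory. Below is a minimal sketch. The default directory follows the /usr/local/nagios/sbin path seen elsewhere in this post, and the function name missing_gd_cgis is hypothetical.

```shell
#!/bin/bash
# missing_gd_cgis [SBIN_DIR]
# Prints the name of each GD-dependent CGI that is not an executable file in SBIN_DIR.
missing_gd_cgis() {
    local dir="${1:-/usr/local/nagios/sbin}" cgi
    for cgi in statusmap.cgi histogram.cgi; do
        [ -x "$dir/$cgi" ] || echo "$cgi"
    done
}
```

An empty result means both CGIs were built and installed.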
  8.6.      The apache log reports the following errors:
  # tail -f /etc/httpd/logs/error_log
  [Fri Feb 18 19:07:18 2011] [notice] suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
  [Fri Feb 18 19:07:18 2011] [notice] Digest: generating secret for digest authentication ...
  [Fri Feb 18 19:07:18 2011] [notice] Digest: done
  [Fri Feb 18 19:07:18 2011] [notice] Apache/2.2.3 (CentOS) configured -- resuming normal operations
  [Fri Feb 18 19:07:20 2011] [error] [client 127.0.0.1] Directory index forbidden by Options directive: /var/www/html/
  [Fri Feb 18 19:07:42 2011] [error] [client 127.0.0.1] Directory index forbidden by Options directive: /var/www/html/
  [Fri Feb 18 19:07:55 2011] [error] [client 127.0.0.1] Directory index forbidden by Options directive: /var/www/html/
  The check of the http service returns a WARNING, as shown below:
  # /usr/local/nagios/libexec/check_http -I localhost -w 15 -c 20 -t 30

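  HTTP WARNING: HTTP/1.1 403 Forbidden - 5240 bytes in 0.003 second response time |time=0.002991s;15.000000;20.000000;0.000000
  Solution: /var/www/html/ has no index page and directory listing is forbidden, so Apache answers 403. Create an index page: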
  # echo -n none > /var/www/html/index.html
  8.7.      While compiling and installing ndoutils-1.4b7, the following error is reported:
  # ./db/installdb -ucacti -pcacti -d cacti
  DBD::mysql::db do failed: Table 'cacti.nagios_dbversion' doesn't exist at ./db/installdb line 51.
  The command was invoked incorrectly. The solution is as follows:
  # ./installdb -ucacti -pcacti -h localhost -d cacti    (add the -h localhost parameter)
  DBD::mysql::db do failed: Table 'cacti.nagios_dbversion' doesn't exist at ./installdb line 51.
  ** Creating tables for version 1.4b7
  Using mysql.sql for installation...
  ** Updating table nagios_dbversion
  Done!
  8.8.      After installation, /usr/local/nagios/var/nagios.log shows the following error:
  # tail -f /usr/local/nagios/var/nagios.log
  [1298198680] Error: Could not safely copy module '/usr/local/nagios/bin/ndomod.o'. The module will not be loaded: No such file or directory
  [1298202280] Auto-save of retention data completed successfully.
  Cause: one step was skipped during the earlier ndoutils-1.4b7 installation. The fix is as follows:
  # mv /usr/local/nagios/bin/ndomod-3x.o /usr/local/nagios/bin/ndomod.o    (newly added step)
  The log should then look like this:
  # tail -f /usr/local/nagios/var/nagios.log
  [1298346735] Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
  [1298346735] Nagios 3.2.1 starting... (PID=13489)
  [1298346735] Local time is Tue Feb 22 11:52:15 CST 2011
  [1298346735] LOG VERSION: 2.0
  [1298346735] ndomod: NDOMOD 1.4b9 (10-27-2009) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
  [1298346735] ndomod: Successfully connected to data sink. 0 queued items to flush.
  [1298346735] Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
  [1298350335] Auto-save of retention data completed successfully.
  [1298353935] Auto-save of retention data completed successfully.
  [1298357535] Auto-save of retention data completed successfully.
  8.9.      Sometimes, after a reboot, the log reports the following errors:
  # tail -f /usr/local/nagios/var/nagios.log
  [1298439477] ndomod: Still unable to connect to data sink. 23512 items lost, 5000 queued items to flush.
  [1298439493] ndomod: Still unable to connect to data sink. 23590 items lost, 5000 queued items to flush.
  These errors usually mean that ndo2db has not been started. Start it manually:
  # /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg    (starts ndo2db)
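Because the problem shows up after a reboot, one option is to have the start command run at boot. Below is a minimal sketch that appends the command to a boot script only if the exact line is not already there. It assumes /etc/rc.d/rc.local is executed at boot on this system. The function name add_boot_command is hypothetical.

```shell
#!/bin/bash
# add_boot_command CMD [RC_FILE]
# Appends CMD to RC_FILE unless an identical line already exists (idempotent).
add_boot_command() {
    local cmd="$1" rc="${2:-/etc/rc.d/rc.local}"
    grep -qxF -- "$cmd" "$rc" 2>/dev/null || printf '%s\n' "$cmd" >> "$rc"
}
# Hypothetical usage (as root):
#   add_boot_command "/usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg"
```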
  8.10. When visiting the npc plugin page, the host icon is shown as a red cross:
  The solution is as follows:
  # cp -r /usr/local/nagios/share/images/logos/logo.gif /var/www/html/cacti/plugins/npc/logo.gif
  Refresh the page and the problem is solved; the host icon is then displayed normally.
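  8.11. Clicking the sun icon produces the following error:
  "Hostname is not set" is a message from pnp. pnp must be accessed as index.php?host=$HOSTNAME$&srv=$SERVICEDESC$ or index.php?host=$HOSTNAME$.
  However, when the configuration was pushed out by a script, the macros were altered, and the generated file looked like this: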
  #define_host
  define host {
  name       host-pnp
  register   0
  process_perf_data 1
  # incorrect: action_url /nagios/pnp/index.php?host=nagios.com.cn$
  action_url /nagios/pnp/index.php?host=$HOSTNAME$    ; correct format
  }
  #define_service
  define service {
  name       srv-pnp
  register   0
  process_perf_data 1
  # incorrect: action_url /nagios/pnp/index.php?host=nagios.com.cn$&srv=$
  action_url /nagios/pnp/index.php?host=$HOSTNAME$&srv=$SERVICEDESC$    ; correct format
  }
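The broken URLs look like what a shell produces when it expands $HOSTNAME (to the local host name) and the unset $SERVICEDESC (to nothing) inside an unquoted here-document or a double-quoted string. The push script is not shown, so that it worked this way is an assumption. Below is a minimal sketch of writing the template with the macros kept literal, using a here-document with a quoted delimiter. The function name write_pnp_template is hypothetical.

```shell
#!/bin/bash
# write_pnp_template FILE
# Writes the host template to FILE with the Nagios macros left untouched.
write_pnp_template() {
    # Quoting the delimiter ('EOF') disables shell expansion inside the
    # here-document, so $HOSTNAME$ reaches the file exactly as typed.
    cat > "$1" <<'EOF'
define host {
name       host-pnp
register   0
process_perf_data 1
action_url /nagios/pnp/index.php?host=$HOSTNAME$
}
EOF
}
```

The same applies to a command run over ssh: single quotes around the remote command keep the local shell from expanding the macros.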



Original post: https://www.yunweiku.com/thread-661797-1-1.html