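Based on netseek's PDF; environment: CentOS 6, 64-bit
Nagios installation steps
1 Before installing, confirm that you have root privileges on the machine.
Confirm that the following packages are already installed on your Linux system before continuing.
Apache
GCC compiler
GD library and its development library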
yum -y install httpd gcc glibc glibc-common gd gd-devel
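Before running yum it can help to see which of these packages are already present. A minimal sketch, assuming an RPM-based system such as the CentOS 6 host used here; missing_packages is a hypothetical helper name, and the PKG_QUERY variable is an assumption added only so the query command can be swapped out.

```shell
# Print every package from the argument list that the query command
# reports as not installed. PKG_QUERY (hypothetical) defaults to "rpm -q".
missing_packages() {
    local query="${PKG_QUERY:-rpm -q}" pkg
    for pkg in "$@"; do
        # $query is left unquoted on purpose so "rpm -q" splits into two words
        $query "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}
```

Usage: missing_packages httpd gcc glibc glibc-common gd gd-devel prints nothing when everything is installed.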
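2
Create the nagios account
/usr/sbin/useradd nagios && passwd nagios
Create a group named nagcmd, used to run external commands from the web interface,
and add both the nagios user and the apache user to this group
/usr/sbin/groupadd nagcmd
/usr/sbin/usermod -a -G nagcmd nagios
/usr/sbin/usermod -a -G nagcmd apache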
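To confirm that both accounts really ended up in nagcmd, something like the following can be used. It is only a sketch; has_word and in_group are hypothetical helper names, and the check relies on the standard id -nG output (a space-separated list of group names).

```shell
# Succeed if word $2 appears in the space-separated list $1.
has_word() {
    local w
    for w in $1; do
        [ "$w" = "$2" ] && return 0
    done
    return 1
}

# Succeed if user $1 belongs to group $2 (primary or supplementary).
# An unknown user yields an empty list, so the check fails.
in_group() {
    has_word "$(id -nG "$1" 2>/dev/null)" "$2"
}
```

Usage: in_group nagios nagcmd && in_group apache nagcmd && echo "group membership ok"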
3
Download Nagios and the plugin packages
Download the Nagios and Nagios plugins packages (visit http://www.nagios.org/download/ for the latest versions)
cd /usr/local/src
wget http://nchc.dl.sourceforge.net/sourceforge/nagios/nagios-3.0.6.tar.gz
wget http://nchc.dl.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.13.tar.gz
4
Compile and install Nagios
cd /usr/local/src
tar zxvf nagios-3.0.6.tar.gz
cd nagios-3.0.6
./configure --with-command-group=nagcmd --prefix=/usr/local/nagios
make all
make install
make install-init
make install-config
make install-commandmode
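Verify that the program was installed correctly. Change to the installation path (here /usr/local/nagios) and check that the five directories etc, bin, sbin, share and var exist; if they do, the program was installed correctly. A brief description of the five directories:
bin    Nagios executables
etc    configuration files
sbin   CGI programs
share  web interface files
var    log files, lock file and other runtime data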
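The directory check can be scripted. A minimal sketch, assuming the /usr/local/nagios prefix used in this walkthrough; check_nagios_layout is a hypothetical helper name.

```shell
# Report which of the five expected directories are missing under the
# install prefix. Returns 0 when all exist, 1 otherwise.
check_nagios_layout() {
    local prefix="${1:-/usr/local/nagios}" d rc=0
    for d in etc bin sbin share var; do
        if [ ! -d "$prefix/$d" ]; then
            echo "missing: $prefix/$d"
            rc=1
        fi
    done
    return "$rc"
}
```

Usage: check_nagios_layout && echo "layout looks complete"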
5
Compile and install the Nagios plugins (nagios-plugins)
cd /usr/local/src
tar zxvf nagios-plugins-1.4.13.tar.gz
cd nagios-plugins-1.4.13
./configure --with-nagios-user=nagios --with-nagios-group=nagios --prefix=/usr/local/nagios
make && make install
Verify:
ls /usr/local/nagios/libexec
This lists the installed plugin files; all plugins are installed under the libexec directory
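Each plugin in libexec reports its result through its exit status, following the usual plugin convention: 0 is OK, 1 is WARNING, 2 is CRITICAL and anything else is treated as UNKNOWN. The sketch below makes that visible when running a plugin by hand; plugin_state and run_plugin are hypothetical helper names.

```shell
# Map a plugin exit status to the state name Nagios would show.
plugin_state() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Run a plugin (or any command), print the resulting state, and
# return the plugin's own exit status unchanged.
run_plugin() {
    local rc=0
    "$@" || rc=$?
    echo "state: $(plugin_state "$rc")"
    return "$rc"
}
```

Usage: run_plugin /usr/local/nagios/libexec/check_users -w 5 -c 10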
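6 Configure the web interface
Method 1: while installing Nagios, run make install-webconf
Create a nagiosadmin user for logging in to the Nagios web interface. Note down the password you set; you will need it shortly.
htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
Restart the Apache service for the settings to take effect.
service httpd restart
Method 2: append the following to the end of httpd.conf: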
#for nagios
ScriptAlias /nagios/cgi-bin /usr/local/nagios/sbin
<Directory "/usr/local/nagios/sbin">
Options ExecCGI
AllowOverride None
Order allow,deny
Allow from all
AuthName "Nagios Access"
AuthType Basic
AuthUserFile /usr/local/nagios/etc/htpasswd
Require valid-user
</Directory>
Alias /nagios /usr/local/nagios/share
<Directory "/usr/local/nagios/share">
Options None
AllowOverride None
Order allow,deny
Allow from all
AuthName "Nagios Access"
AuthType Basic
AuthUserFile /usr/local/nagios/etc/htpasswd
Require valid-user
</Directory>
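htpasswd -c /usr/local/nagios/etc/htpasswd test
New password: (enter 123456)
Re-type new password: (enter the password again)
Adding password for user test
View the contents of the authentication file
less /usr/local/nagios/etc/htpasswd
test:OmWGEsBnoGpIc (the first part is the user name test; the rest is the encrypted password)
This example adds the user test, so the cgi.cfg configuration file must be changed to authorize the test user
vi /usr/local/nagios/etc/cgi.cfg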
authorized_for_system_information=test
authorized_for_configuration_information=test
authorized_for_system_commands=test
authorized_for_all_services=test
authorized_for_all_hosts=nagiosadmin,test
authorized_for_all_service_commands=test
authorized_for_all_host_commands=test
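Editing seven directives by hand is easy to get wrong, so the change can be scripted. This is only a sketch: grant_cgi_user is a hypothetical helper, and it assumes every authorized_for_* directive takes a comma-separated user list, as the authorized_for_all_hosts line above does. Try it on a copy of cgi.cfg first.

```shell
# Append user $2 to every authorized_for_* directive in file $1,
# skipping directives that already list that user.
grant_cgi_user() {
    local cfg=$1 user=$2 tmp rc
    tmp=$(mktemp) || return 1
    awk -v u="$user" '
        /^authorized_for_[a-z_]+=/ {
            value = substr($0, index($0, "=") + 1)
            n = split(value, names, ",")
            found = 0
            for (i = 1; i <= n; i++) if (names[i] == u) found = 1
            if (!found) {
                if (value == "") $0 = $0 u
                else $0 = $0 "," u
            }
        }
        { print }
    ' "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}
```

Usage: grant_cgi_user /usr/local/nagios/etc/cgi.cfg test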
7
Start Nagios
Add Nagios to the service list so that it starts automatically at boot
chkconfig --add nagios
chkconfig nagios on
Verify the Nagios sample configuration files
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
You may see the following:
Nagios 3.0.6
Copyright (c) 1999-2008 Ethan Galstad (http://www.nagios.org)
Last Modified: 12-01-2008
License: GPL
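Error: Cannot open main configuration file '/usr/local/‐' for reading!
Granting permissions did not help either. The '‐' in the message suggests that a typographic hyphen copied from the PDF was passed in place of the ASCII hyphen in -v. Simply restarting the nagios service got it started: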
Nagios 3.0.6 starting... (PID=2821)
Local time is Thu Feb 16 14:24:25 CST 2012
Bailing out due to one or more errors encountered in the configuration files. Run Nagios from the command line with the -v option to verify your config before restarting. (PID=2821)
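The '‐' shown in the "Cannot open main configuration file" error looks like a typographic hyphen that came along when the command was copied from the PDF. A quick way to catch this before running a pasted command is to test it for bytes outside printable ASCII. A minimal sketch; has_non_ascii is a hypothetical helper name, and it will also flag tabs.

```shell
# Succeed if the string contains any byte outside printable ASCII
# (space through ~), e.g. a UTF-8 typographic hyphen.
has_non_ascii() {
    printf '%s' "$1" | LC_ALL=C grep -q '[^ -~]'
}
```

Usage: has_non_ascii "$cmd" && echo "retype the options by hand"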
If there are no errors, start the Nagios service
service nagios start
service httpd start
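8 Put SELinux into permissive mode (running this command is enough)
setenforce 0
To make the change permanent, edit the setting in /etc/selinux/config and reboot.
Instead of disabling SELinux or changing it permanently, you can label the CGI and web directories so that they work with SELinux in enforcing/targeted mode:
chcon -R -t httpd_sys_content_t /usr/local/nagios/sbin/
chcon -R -t httpd_sys_content_t /usr/local/nagios/share/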
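Since setenforce 0 only lasts until the next reboot, it is worth checking what mode /etc/selinux/config will apply afterwards. A small sketch that reads the SELINUX= line; configured_selinux_mode is a hypothetical helper name, and the file path can be overridden for testing.

```shell
# Print the mode set by the last SELINUX= line of the config file
# (enforcing, permissive or disabled). SELINUXTYPE= lines are ignored.
configured_selinux_mode() {
    sed -n 's/^SELINUX=\([a-z]*\).*$/\1/p' "${1:-/etc/selinux/config}" | tail -n 1
}
```

Usage: compare the output of configured_selinux_mode with the running mode reported by getenforce.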
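9
Test
Open http://localhost/nagios/ and log in with user name test and password 123456
10 How to configure monitoring of a remote host
1 On the monitored host
Add the user
useradd nagios
Set its password
passwd nagios
Install the Nagios plugins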
wget http://nchc.dl.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.13.tar.gz
tar zxvf nagios-plugins-1.4.13.tar.gz
cd nagios-plugins-1.4.13
./configure
make
make install
chown nagios.nagios /usr/local/nagios/
chown -R nagios.nagios /usr/local/nagios/libexec/
2 Steps for installing NRPE (install it on both the monitoring host and the monitored host)
tar -zxvf nrpe-2.8.1.tar.gz
cd nrpe-2.8.1
./configure
make all
make install-plugin
make install-daemon
make install-daemon-config
3 vim /usr/local/nagios/etc/nrpe.cfg
#allowed_hosts=127.0.0.1
# 192.168.1.130 is the address of the monitoring host
allowed_hosts=127.0.0.1,192.168.1.130
Edit /etc/hosts.allow to add the monitoring host's IP
echo 'nrpe:192.168.1.130' >> /etc/hosts.allow
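A wrong or missing allowed_hosts entry is a common reason for check_nrpe being refused, so it can be checked before starting the daemon. A sketch only; nrpe_allows is a hypothetical helper that does exact address matching against the last uncommented allowed_hosts line and does not understand network ranges.

```shell
# Succeed if address $2 is listed in the allowed_hosts line of nrpe.cfg $1.
nrpe_allows() {
    local hosts
    hosts=$(sed -n 's/^allowed_hosts=//p' "$1" | tail -n 1 | tr -d ' ')
    case ",$hosts," in
        *",$2,"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Usage: nrpe_allows /usr/local/nagios/etc/nrpe.cfg 192.168.1.130 || echo "monitoring host not allowed"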
4 Start the service
/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
Test that the NRPE service is working (test with 127.0.0.1, not localhost)
/usr/local/nagios/libexec/check_nrpe -H 127.0.0.1
NRPE v2.8.1
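5 Test from the monitoring host (192.168.1.130); output like the following means success
/etc/init.d/iptables stop (or add a rule that allows collecting data from the monitored host)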
/usr/local/nagios/libexec/check_nrpe -H 192.168.1.129
NRPE v2.8.1
Then, on the monitoring host
1 vim /usr/local/nagios/etc/objects/129.cfg with the following content
define host{
use linux-server
host_name 129
alias 129
address 192.168.1.129
}
define service{
use generic-service
host_name 129
service_description load
check_command check_nrpe!check_load
# use custom arguments
#check_command check_nrpe!check_load!6.0,5.0,4.0!15.0,8.0,6.0
}
vim /usr/local/nagios/etc/nagios.cfg and add the following
# Definitions for monitoring 192.168.1.129
cfg_file=/usr/local/nagios/etc/objects/129.cfg
vim /usr/local/nagios/etc/objects/commands.cfg
# 'check_nrpe ' command definition
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
Reload Nagios on the monitoring host
service nagios reload
Open http://192.168.1.130/nagios and you can see that 129 has been added
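Every NRPE-based check in these notes uses the same service block with only the description and the remote command name changed, so the block can be generated. A minimal sketch; nrpe_service is a hypothetical helper, and it assumes the generic-service template and the check_nrpe command defined above.

```shell
# Print a service definition that runs NRPE command $3 on host $1,
# shown in the web interface under description $2.
nrpe_service() {
    local host=$1 desc=$2 cmd=$3
    cat <<EOF
define service{
        use                     generic-service
        host_name               $host
        service_description     $desc
        check_command           check_nrpe!$cmd
        }
EOF
}
```

Usage: nrpe_service 129 load check_load >> /usr/local/nagios/etc/objects/129.cfg, then verify with nagios -v before reloading.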
Monitoring swap with Nagios
On the monitored host, edit /usr/local/nagios/etc/nrpe.cfg
vim /usr/local/nagios/etc/nrpe.cfg and add
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%
Restart the NRPE service
[root@localhost libexec]# ps -ef | grep nrpe
nagios 2332 1 0 14:24 ? 00:00:00 /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
root 2373 28887 0 14:25 pts/0 00:00:00 grep nrpe
kill -9 2332
/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
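Picking the PID out of the ps listing by eye gets tedious when this restart is repeated for every new command. The sketch below extracts the PID from ps -ef output; nrpe_pids is a hypothetical helper, and it assumes the standard ps -ef column layout where the command is the eighth field. A plain kill (SIGTERM) is usually enough and gives the daemon a chance to exit cleanly, which kill -9 does not.

```shell
# Read `ps -ef` output on stdin and print the PID of every process
# whose command is the nrpe binary; the "grep nrpe" line is ignored.
nrpe_pids() {
    awk '$8 ~ /(^|\/)nrpe$/ { print $2 }'
}
```

Usage: kill $(ps -ef | nrpe_pids), then start nrpe again with the same command line as above.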
On the monitoring host
add to /usr/local/nagios/etc/objects/commands.cfg
# check_swap command definition
define command{
command_name check_swap
command_line $USER1$/check_swap -w $ARG1$ -c $ARG2$
}
In the following file
vim /usr/local/nagios/etc/objects/129.cfg and add
define service{
use generic-service
host_name 129
service_description swap
check_command check_nrpe!check_swap
}
Restart the nagios and httpd services
service nagios restart
service httpd restart
Monitoring disk usage with Nagios
On the monitored host, edit /usr/local/nagios/etc/nrpe.cfg
vim /usr/local/nagios/etc/nrpe.cfg and add
command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20 -c 10 -p /
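As far as I know, check_disk reads a bare number for -w/-c as an amount of free space (MB by default) and a value ending in % as a percentage of free space, so it is worth knowing the current free percentage before choosing thresholds. A rough sketch that derives it from df -P output; free_pct_from_df is a hypothetical helper name.

```shell
# Read `df -P <mountpoint>` output on stdin and print the free space
# as a whole percentage (100 minus the Capacity column).
free_pct_from_df() {
    awk 'NR == 2 { sub(/%/, "", $5); print 100 - $5 }'
}
```

Usage: df -P / | free_pct_from_df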
Restart the NRPE service
[root@localhost libexec]# ps -ef | grep nrpe
nagios 2332 1 0 14:24 ? 00:00:00 /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
root 2373 28887 0 14:25 pts/0 00:00:00 grep nrpe
kill -9 2332
/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
On the monitoring host
add to /usr/local/nagios/etc/objects/commands.cfg
define command{
command_name check_disk
command_line $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
}
In the following file
vim /usr/local/nagios/etc/objects/129.cfg and add
define service{
use generic-service
host_name 129
service_description disk
check_command check_nrpe!check_disk
}
Restart the nagios and httpd services
service nagios restart
service httpd restart
Monitoring memory with Nagios
The memory check script is as follows
######################################
#!/bin/bash
# check memory script
TOTAL=`free -m | head -2 |tail -1 |gawk '{print $2}'`
USED=`free -m | head -2 |tail -1 |gawk '{print $3}'`
FREE=`free -m | head -2 |tail -1 |gawk '{print $4}'`
# to calculate free percent
# use the expression free * 100 / total
FREETMP=`expr $FREE \* 100`
PERCENT=`expr $FREETMP / $TOTAL`
echo "$TOTAL MB Total Memory"
echo "$USED MB Used Memory"
echo "$FREE MB ($PERCENT%) Free Memory"
exit 0
######################################
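Note that the script above never reads its arguments and always ends with exit 0, so Nagios will show OK whatever -w and -c values are passed. The sketch below shows the kind of threshold logic that could be added. It rests on an assumption of mine: the thresholds are treated as free-memory floors in MB, so the warning value has to be the larger of the two. mem_state is a hypothetical function name.

```shell
# Classify free memory against warning/critical floors (all in MB).
# Prints a one-line status and returns the matching plugin exit code.
mem_state() {
    local free=$1 warn=$2 crit=$3
    if [ "$free" -le "$crit" ]; then
        echo "CRITICAL - $free MB free"
        return 2
    elif [ "$free" -le "$warn" ]; then
        echo "WARNING - $free MB free"
        return 1
    fi
    echo "OK - $free MB free"
    return 0
}
```

Usage inside the script: mem_state "$FREE" 200 150; exit $? in place of the final exit 0.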
On the monitored host, edit /usr/local/nagios/etc/nrpe.cfg
vim /usr/local/nagios/etc/nrpe.cfg and add
command[check_mem]=/usr/local/nagios/libexec/check_mem -w 150 -c 200
Put the check_mem script in /usr/local/nagios/libexec/ and make it executable
chmod +x /usr/local/nagios/libexec/check_mem
chown nagios.nagios /usr/local/nagios/libexec/check_mem
Restart the NRPE service
[root@localhost libexec]# ps -ef | grep nrpe
nagios 2332 1 0 14:24 ? 00:00:00 /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
root 2373 28887 0 14:25 pts/0 00:00:00 grep nrpe
kill -9 2332
/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
On the monitoring host
add to /usr/local/nagios/etc/objects/commands.cfg
define command{
command_name check_mem
command_line $USER1$/check_mem -w $ARG1$ -c $ARG2$
}
In the following file
vim /usr/local/nagios/etc/objects/129.cfg and add
define service{
use generic-service
host_name 129
service_description memory
check_command check_nrpe!check_mem
}
Restart the nagios and httpd services
service nagios restart
service httpd restart
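Monitoring HTTP availability with Nagios
Nothing needs to be done on the monitored host (check_http does not go through NRPE)
On the monitoring host
/usr/local/nagios/etc/objects/commands.cfg already contains the check_http command, so nothing needs to be done there either
In the following file
vim /usr/local/nagios/etc/objects/129.cfg and add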
define service{
use generic-service
host_name 129
service_description http
check_command check_http ; note: plain check_http here, not the check_nrpe!check_http form
}
Restart the nagios and httpd services
service nagios restart
service httpd restart
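Troubleshooting: httpd was installed with yum, so the default document root is /var/www/html
Running the following check
/usr/local/nagios/libexec/check_http -I 192.168.1.129
reports
HTTP WARNING: HTTP/1.1 403 Forbidden
This happens because there is no file under /var/www/html
cd /var/www/html
echo 123 >index.html
After a short while the Nagios check turns OK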
Monitoring MySQL availability with Nagios
On the monitored host, log in to the database and grant access
mysql> grant all privileges on *.* to xxxxx@192.168.1.130 identified by '123456';
Query OK, 0 rows affected (0.09 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.08 sec)
On the monitoring host
add the following to /usr/local/nagios/etc/objects/commands.cfg
# check_mysql command definition
define command{
command_name check_mysql
command_line $USER1$/check_mysql -H $ARG1$ -P $ARG2$ -u $ARG3$ -p $ARG4$ ; liuyu's PDF has an error on this line; host, port, user and password are taken in that order from the service definition
}
In the following file
vim /usr/local/nagios/etc/objects/129.cfg and add
define service{
use generic-service
host_name 129
service_description mysql
check_command check_mysql!192.168.1.129!3306!xxxx!123456 ; this line is correct in liuyu's document; note it is not the check_nrpe!check_http form
notifications_enabled 0
}
Restart the nagios and httpd services
service nagios restart
service httpd restart
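Monitoring Tomcat availability with Nagios
Nothing needs to be done on the monitored host (check_tcp!8080 does not go through NRPE)
On the monitoring host
/usr/local/nagios/etc/objects/commands.cfg already contains the check_tcp command, so nothing needs to be done there either
In the following file
vim /usr/local/nagios/etc/objects/hong221.cfg and add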
define service{
use generic-service
host_name hong221
service_description tomcat
check_command check_tcp!8080!xxxxx
}
To check manually, run the following command
[root@nagios objects]# /usr/local/nagios/libexec/check_tcp -H xxxxx -p 8080
TCP OK - 0.141 second response time on port 8080|time=0.141140s;;;0.000000;10.000000
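In the output above, everything after the | is performance data in label=value;... form, which is handy for graphing or quick scripting. A small sketch that pulls one value out of a plugin output line; perf_value is a hypothetical helper and only handles simple labels without spaces or quotes.

```shell
# Read one line of plugin output on stdin and print the value
# (up to the first ';') of the perfdata item whose label is $1.
perf_value() {
    sed -n 's/^[^|]*|//p' | tr ' ' '\n' | sed -n "s/^$1=\([^;]*\).*/\1/p"
}
```

Usage: /usr/local/nagios/libexec/check_tcp -H xxxxx -p 8080 | perf_value time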
Restart the nagios and httpd services
service nagios restart
service httpd restart
The monitoring page is then visible on the monitoring host
Configuring Nagios alerts to a 139 mailbox
Fix for mail sent with the mail command not arriving at a 139 mailbox
tail -f /var/log/maillog shows the following errors
Feb 21 17:20:49 localhost postfix/qmgr[2072]: A296612227F: from=, size=700, nrcpt=1 (queue active)
Feb 21 17:20:49 localhost sendmail[2275]: q1L9KmDa002275: to=xxxxx@139.com, ctladdr=root (0/0), delay=00:00:01, xdelay=00:00:00, mailer=relay, pri=30221, relay=[127.0.0.1] [127.0.0.1], dsn=2.0.0, stat=Sent (Ok: queued as A296612227F)
Feb 21 17:20:49 localhost postfix/smtpd[2276]: disconnect from localhost.localdomain[127.0.0.1]
Feb 21 17:20:50 localhost postfix/smtp[2280]: A296612227F: to=, relay=mx1.mail.139.com[221.176.9.178]:25, delay=0.53, delays=0.05/0.01/0.24/0.23, dsn=5.0.0, status=bounced (host mx1.mail.139.com[221.176.9.178] said: 550 985a4f43618db72-3c5de Mail rejected (in reply to end of DATA command))
Feb 21 17:20:50 localhost postfix/cleanup[2279]: 43FB812227E: message-id=
Feb 21 17:20:50 localhost postfix/qmgr[2072]: 43FB812227E: from=, size=2697, nrcpt=1 (queue active)
Feb 21 17:20:50 localhost postfix/bounce[2281]: A296612227F: sender non-delivery notification: 43FB812227E
Feb 21 17:20:50 localhost postfix/qmgr[2072]: A296612227F: removed
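The line that matters in this log is the postfix/smtp entry with status=bounced; the sendmail "Sent" line only means the message was handed to the local queue. A sketch for pulling bounces out of a busy maillog; bounced_mail is a hypothetical helper, and it assumes each log entry sits on a single line, as it does in the real file.

```shell
# Read maillog lines on stdin and print "<queue id> <dsn code>" for
# every postfix/smtp delivery attempt that bounced.
bounced_mail() {
    sed -n 's/.* postfix\/smtp\[[0-9]*\]: \([0-9A-F]*\): .*dsn=\([0-9.]*\), status=bounced.*/\1 \2/p'
}
```

Usage: bounced_mail < /var/log/maillog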
I was told the cause is the hostname (localhost.localdomain): mail sent under that name may be treated as spam by 139 mail
[root@nagios objects]# cat /etc/sysconfig/network
NETWORKING=yes
#HOSTNAME=localhost.localdomain
HOSTNAME=nagios.localdomain
[root@nagios objects]# cat /etc/hosts
192.168.1.130 nagios.localdomain nagios # Added by NetworkManager
127.0.0.1 localhost.localdomain localhost
::1 nagios.localdomain nagios localhost6.localdomain6 localhost6
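So I changed the hostname to an arbitrary other name and rebooted the server; after that mail worked and the 139 mailbox received the messages
Nagios configuration for service alerts
On the monitoring host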
vim /usr/local/nagios/etc/objects/contacts.cfg
define contact{
contact_name nagiosadmin ; Short name of user
use generic-contact ; Inherit default values from generic-contact template (defined above)
alias Nagios Admin ; Full name of user
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,u,c,r
host_notification_options d,u,r
service_notification_commands notify-service-by-email
host_notification_commands notify-host-by-email
email xxxxx@139.com ; the mailbox alerts are sent to (a 139 mailbox is an ops essential)
}