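This walkthrough combines two monitoring tools, Nagios and Cacti.
First, how the two differ.
Cacti: strong at graphing, with an advantage in presenting traffic data and charts.
Nagios: strong at fault analysis, with an alerting mechanism (e-mail, SMS, and so on) that is more capable and more flexible than Cacti's. It suits monitoring large numbers of servers and checking whether the many services running on them are healthy; its focus is status and fault monitoring rather than graphing.
1. Install the required packages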
# yum -y install httpd php gcc glibc glibc-common gd gd-devel libpng libjpeg zlib
2. Create the required users and groups
# useradd -s /sbin/nologin nagios
# groupadd nagcmd
# usermod -aG nagcmd nagios
# usermod -aG nagcmd apache
3. Build and install Nagios
# tar zxvf nagios-3.2.1.tar.gz -C /usr/src/
# cd /usr/src/nagios-3.2.1/
# ./configure --with-command-group=nagcmd
# make all
# make install
# make install-init
# make install-config
# make install-commandmode
# make install-webconf
Notes:
make install-init installs the startup script into /etc/rc.d/init.d
make install-commandmode sets the proper permissions for the external command file
make install-config writes sample configuration files into /usr/local/nagios/etc
# cd /usr/local/nagios/
# ls
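bin      executables (including the command used to test the configuration)
etc      configuration files
libexec  plugins
sbin     CGI scripts
share    Nagios web pages
var      data generated while Nagios runs
# ls etc/
cgi.cfg       CGI program configuration file
nagios.cfg    main configuration file of the Nagios service
resource.cfg  file that defines Nagios variables
# ls etc/objects/
commands.cfg     defines the monitoring commands
localhost.cfg    defines the objects monitored on the local host
timeperiods.cfg  monitoring time period templates
contacts.cfg     mailboxes that receive alert e-mail
templates.cfg    monitoring templates
4. Install the plugins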
# tar -zxvf nagios-plugins-1.4.14.tar.gz
# cd nagios-plugins-1.4.14
# ./configure --with-nagios-user=nagios --with-nagios-group=nagcmd
# make && make install
Using the plugins:
# cd /usr/local/nagios/libexec
# ./check_http --help
# ./check_http -H localhost -p 80
# ./check_ftp -H localhost -p 21
# ./check_ping -H 127.0.0.1 -w 5,10% -c 10,20% -p 10 -t 20
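The -w and -c values given to check_ping each pair a round-trip average in milliseconds with a packet-loss percentage. As a rough illustration only, the helper below (split_ping_threshold is a made-up name, not part of nagios-plugins) splits such a value so the two parts can be read separately.

```shell
#!/bin/sh
# split_ping_threshold is a hypothetical helper, not part of nagios-plugins.
# It splits a check_ping threshold such as "5,10%" into its two parts.
split_ping_threshold() {
    threshold=$1
    rta=${threshold%%,*}      # text before the comma: round-trip average (ms)
    loss=${threshold#*,}      # text after the comma: packet loss with "%"
    loss=${loss%\%}           # drop the trailing percent sign
    printf 'rta_ms=%s loss_pct=%s\n' "$rta" "$loss"
}

split_ping_threshold "5,10%"    # the warning threshold used above
split_ping_threshold "10,20%"   # the critical threshold used above
```

Read this way, the check_ping call above warns at 5 ms or 10% loss and goes critical at 10 ms or 20% loss.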
# htpasswd -cm /usr/local/nagios/etc/htpasswd.users nagiosadmin
5. Access Nagios from Firefox
# firefox &
http://192.168.2.3/nagios
user:nagiosadmin
password:123456
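6. Monitor the Nagios host itself
Configuration steps
Define the monitoring command -> commands.cfg
Define the monitored object -> localhost.cfg
Load the object configuration file -> nagios.cfg
Configure the user who authenticates on the login page -> nagios.conf
Start the nagios service
Start the HTTP service
Log in to the monitoring page
(1) Define the monitoring command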
# vim etc/objects/commands.cfg
# 'check_nfs' command definition
define command{
command_name check_nfs
command_line $USER1$/check_tcp -H $HOSTADDRESS$ -p $ARG1$
}
# vim etc/objects/localhost.cfg
define service{
use local-service ; Name of service template to use
host_name localhost
service_description NFS
check_command check_nfs!2049
notifications_enabled 0
}
# ./bin/nagios -v /usr/local/nagios/etc/nagios.cfg    # verify the configuration
# service nagios restart
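To see how a check_command such as check_nfs!2049 becomes a real command line, the sketch below imitates macro expansion with sed. It is an illustration, not how Nagios is implemented: expand_command is a hypothetical helper, and the value used for $USER1$ assumes the default plugin directory of a /usr/local/nagios install.

```shell
#!/bin/sh
# expand_command is a hypothetical helper that mimics, very roughly, how
# Nagios fills in macros; the real substitution happens inside Nagios.
# Assumption: $USER1$ points at /usr/local/nagios/libexec (see resource.cfg).
expand_command() {
    template=$1
    host_address=$2
    arg1=$3
    printf '%s\n' "$template" | sed \
        -e 's|\$USER1\$|/usr/local/nagios/libexec|g' \
        -e "s|\\\$HOSTADDRESS\\\$|$host_address|g" \
        -e "s|\\\$ARG1\\\$|$arg1|g"
}

# For "check_nfs!2049" the text after "!" is what ends up in $ARG1$.
expand_command '$USER1$/check_tcp -H $HOSTADDRESS$ -p $ARG1$' 127.0.0.1 2049
```

Running the expanded line by hand from a shell is a quick way to debug a command definition before restarting Nagios.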
7. Configure monitoring of a remote host
Monitored host: (1) install the monitoring plugins
# useradd nagios
# groupadd nagcmd
# usermod -aG nagcmd nagios
Install the plugins
# tar zxvf nagios-plugins-1.4.14.tar.gz -C /usr/src/
# cd /usr/src/nagios-plugins-1.4.14/
# yum -y install gcc gcc-c++
# ./configure --with-nagios-user=nagios --with-nagios-group=nagcmd
# make && make install
# cd /usr/local/nagios/
Install NRPE
# tar zxvf nrpe-2.12.tar.gz
# cd nrpe-2.12
# yum -y install openssl-devel
# ./configure && make && make install
# make install-plugin
# make install-daemon
# make install-daemon-config
# make install-xinetd
# vim /usr/local/nagios/etc/nrpe.cfg
command[check_root]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /root
command[check_boot]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /boot
# vim /etc/xinetd.d/nrpe
only_from = 127.0.0.1 192.168.2.3
# vim /etc/services
nrpe 5666/tcp #nrpe
# yum -y install xinetd    # has to be installed on release 6.5
# netstat -anptul | grep :5666
tcp 0 0 :::5666 :::* LISTEN 51819/xinetd
# cd /usr/local/nagios/libexec/
# ./check_nrpe -H localhost
NRPE v2.12
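The netstat check above can be scripted so that it yields a yes/no answer. The sketch below is one possible way, assuming netstat-style output on stdin; port_is_listening is a hypothetical helper name.

```shell
#!/bin/sh
# port_is_listening is a hypothetical helper: it reads netstat-style lines on
# stdin and succeeds if any of them shows the given TCP port in LISTEN state.
port_is_listening() {
    port=$1
    awk -v port="$port" '
        # $4 is the local address column, e.g. ":::5666" or "0.0.0.0:5666"
        $1 ~ /^tcp/ && $4 ~ (":" port "$") && /LISTEN/ { found = 1 }
        END { exit !found }
    '
}

# Sample line copied from the netstat output shown above.
sample='tcp 0 0 :::5666 :::* LISTEN 51819/xinetd'
if printf '%s\n' "$sample" | port_is_listening 5666; then
    echo "NRPE port 5666 is listening"
else
    echo "NRPE port 5666 is not listening"
fi
```

On a live host the function would be fed from the real command, for example: netstat -antp | port_is_listening 5666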
Monitoring server (1)
Install NRPE
# tar zxvf nrpe-2.12.tar.gz
# cd nrpe-2.12
# yum -y install openssl-devel
# ./configure && make && make install
# make install-plugin
# ./check_nrpe -H 192.168.2.4
NRPE v2.12
(2) Define the command
# vim etc/objects/commands.cfg
# 'check_nrpe' command definition
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
# vim objects/ser2.cfg
define service{
use local-service ; Name of service template to use
host_name Wangqi.tarena.com
service_description Boot Partition
check_command check_nrpe!check_boot
}
define service{
use local-service ; Name of service template to use
host_name Wangqi.tarena.com
service_description Root Partition
check_command check_nrpe!check_root
}
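When a remote host exposes several NRPE checks, the service blocks differ only in their description and command name. The sketch below prints such blocks from arguments; nrpe_service is a hypothetical generator and the output still has to be pasted or redirected into an object file by hand.

```shell
#!/bin/sh
# nrpe_service is a hypothetical generator; it only prints text and does not
# touch any Nagios file. Arguments: host name, description, NRPE command name.
nrpe_service() {
    cat <<EOF
define service{
    use                 local-service
    host_name           $1
    service_description $2
    check_command       check_nrpe!$3
}
EOF
}

nrpe_service Wangqi.tarena.com "Boot Partition" check_boot
nrpe_service Wangqi.tarena.com "Root Partition" check_root
```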
# /usr/local/nagios/bin/nagios -v nagios.cfg
# service nagios restart
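A new object file such as ser2.cfg is only read if nagios.cfg names it in a cfg_file= line (or its directory in a cfg_dir= line). The sketch below checks for the cfg_file case only; is_cfg_loaded is a hypothetical helper, and the paths in the demonstration assume the /usr/local/nagios prefix used in this walkthrough.

```shell
#!/bin/sh
# is_cfg_loaded is a hypothetical helper: it succeeds if the given object file
# is named by an uncommented cfg_file= line in the main configuration file.
is_cfg_loaded() {
    main_cfg=$1
    object_cfg=$2
    grep -v '^[[:space:]]*#' "$main_cfg" | grep -qxF "cfg_file=$object_cfg"
}

# Demonstration against a throwaway file instead of the real nagios.cfg.
demo=$(mktemp)
cat > "$demo" <<'EOF'
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
#cfg_file=/usr/local/nagios/etc/objects/ser2.cfg
EOF

for f in commands.cfg ser2.cfg; do
    if is_cfg_loaded "$demo" "/usr/local/nagios/etc/objects/$f"; then
        echo "$f: loaded"
    else
        echo "$f: not loaded"
    fi
done
rm -f "$demo"
```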
# firefox &
http://192.168.2.3/nagios
VI. Install Cacti
1. Install the required packages
# yum -y install php-mysql php-ldap php-xml net-snmp-utils mysql mysql-server net-snmp crond rrdtool lm_sensors
rrdtool is not shipped on the RHEL5 installation media, so it has to be built from source
# yum install -y gcc gcc-c++ libart_lgpl-devel zlib-devel libpng-devel freetype-devel
# service httpd start
# service mysqld start
2. Install Cacti
# tar zxvf cacti-0.8.7g.tar.gz
# cp -rp cacti-0.8.7g /var/www/html/cacti
# useradd cactiuser
# cd /var/www/html/cacti
# chown -R cactiuser:cactiuser rra/ log/
# mysql -uroot -p
mysql> create database cactidb default character set utf8;
mysql> show databases;
mysql> grant all on cactidb.* to 'cactiuser'@'localhost' identified by 'cacti';
# mysql -ucactiuser -pcacti cactidb < cacti.sql
# vim include/config.php
3. Log in to the web interface
http://192.168.2.3/cacti
4. Monitored host
# vim /etc/snmp/snmpd.conf
com2sec notConfigUser 192.168.20.1 public
On the access line, change systemview to all
Uncomment the line "view all included .1"
# service snmpd restart
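The two snmpd.conf edits above can also be expressed as a sed filter, which helps when several hosts need the same change. This is a sketch under the assumption that the file follows the stock layout; open_snmp_view is a hypothetical name, and the sample lines are modelled on a default file rather than copied from this setup.

```shell
#!/bin/sh
# open_snmp_view is a hypothetical filter: it reads snmpd.conf text on stdin
# and prints it with the two edits described above applied.
open_snmp_view() {
    sed -e '/^access[[:space:]]/s/systemview/all/' \
        -e 's/^#[[:space:]]*\(view[[:space:]][[:space:]]*all[[:space:]]\)/\1/'
}

# Sample lines modelled on a stock snmpd.conf (an assumption; check your file).
open_snmp_view <<'EOF'
access  notConfigGroup ""  any  noauth  exact  systemview none none
#view all    included  .1   80
EOF
```

To apply it for real, write the result to a new file, compare it with the original, and only then replace /etc/snmp/snmpd.conf.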
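5. Monitoring server
On the web page, click Devices, delete the existing localhost entry, then click Add on the right
to add the new host. For Host Template choose ucd/net SNMP Host
Under Associated Data Queries add the following:
SNMP - Get Mounted Partitions
SNMP - Get Processor Information
SNMP - Interface Statistics
After saving, find Create Graphs for this Host at the top of the page
Add the host to the graph tree
Click Graph Trees on the left -> Default Tree -> click Add on the right
For Tree Item Type choose Host, then click Create
Generate data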
# su - cactiuser
$ php /var/www/html/cacti/poller.php
$ crontab -e
*/1 * * * * /usr/bin/php /var/www/html/cacti/poller.php > /dev/null 2>&1
$ exit
# service crond start
After a few minutes, click the Graphs button on the web page again and the graphs should appear
# service snmpd start
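Adding the poller entry through crontab -e is a manual step. If the setup is scripted, the entry should be added only once, however often the script runs. The sketch below shows that idea on a temporary file; ensure_line is a hypothetical helper.

```shell
#!/bin/sh
# ensure_line is a hypothetical helper: it appends a line to a file only if
# that exact line is not already present, so it can be run repeatedly.
ensure_line() {
    file=$1
    line=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstration on a temporary file standing in for a crontab dump.
poller='*/1 * * * * /usr/bin/php /var/www/html/cacti/poller.php > /dev/null 2>&1'
tab=$(mktemp)
ensure_line "$tab" "$poller"
ensure_line "$tab" "$poller"    # second call changes nothing
wc -l < "$tab"                  # the file still holds a single line
rm -f "$tab"
```

With a real crontab the same helper could sit between crontab -l > file and crontab file, run as cactiuser.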
6. Install the plugin architecture
# tar zxvf cacti-plugin-0.8.7g-PA-v2.9.tar.gz
# mv cacti-plugin-arch /var/www/html/cacti/
# cd /var/www/html/cacti/
# patch -p1 -N < cacti-plugin-arch/cacti-plugin-0.8.7g-PA-v2.9.diff
# mysql -ucactiuser -pcacti cactidb < cacti-plugin-arch/pa.sql
# vim include/global.php
$database_type = "mysql";
$database_default = "cactidb";
$database_hostname = "localhost";
$database_username = "cactiuser";
$database_password = "cacti";
$database_port = "3306";
# vim include/config.php
$url_path = "/cacti/";
In the web interface, open User Management on the left -> the admin user's permissions (Plugin Management, further down) and enable PA
7. Install the plugins
# tar zxvf settings-v0.71-1.tgz
# tar zxvf monitor-v1.3-1.tgz
# tar zxvf thold-v0.4.9-3.tgz
# mv settings monitor thold /var/www/html/cacti/plugins
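VII. Integrate Cacti with Nagios
1. Cacti is better than Nagios at monitoring server resources and drawing graphs, while Nagios's service monitoring and alerting cannot be replaced by Cacti. Neither tool alone meets every requirement. Cacti accepts many plugins, and one of them, Nagios Plugin for Cacti (NPC), displays Nagios functionality inside Cacti as a plugin.
2. Nagios's own plugins are binary executables. Operations engineers may not be fluent in a compiled programming language, so custom plugins are written as shell scripts instead.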
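A shell plugin only has to print one status line and finish with the exit code Nagios expects. The sketch below shows the shape of such a script; check_usage, its arguments, and its messages are hypothetical, and the 80/90 thresholds are arbitrary example values.

```shell
#!/bin/sh
# Sketch of a shell plugin. check_usage, its arguments, and its messages are
# hypothetical; only the exit-code convention comes from the Nagios plugin
# interface: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
check_usage() {
    pct=$1
    warn=$2
    crit=$3
    for v in "$pct" "$warn" "$crit"; do
        case $v in
            ''|*[!0-9]*) echo "UNKNOWN - expected three whole numbers"; return 3 ;;
        esac
    done
    if [ "$pct" -ge "$crit" ]; then
        echo "CRITICAL - ${pct}% used"; return 2
    elif [ "$pct" -ge "$warn" ]; then
        echo "WARNING - ${pct}% used"; return 1
    fi
    echo "OK - ${pct}% used"; return 0
}

# Example input: percentage used on /, from the fifth column of "df -P".
root_pct=$(df -P / | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')
status=0
check_usage "$root_pct" 80 90 || status=$?
echo "exit status: $status"
```

In a plugin file placed under libexec, the last lines would instead call check_usage and then exit with its return value, since Nagios reads the process exit code.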
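Integrating Cacti with Nagios relies on a Cacti plugin, Nagios Plugin for Cacti. It works by having ndo2db import the Nagios data into a MySQL database (Cacti's own database); Cacti then reads that data and displays the Nagios results.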
1. Install ndoutils
ndoutils has to be installed first so that Nagios data can be imported into the MySQL database
# yum -y install mysql-devel
# tar zxvf ndoutils-1.4b9.tar.gz -C /usr/src/
# cd /usr/src/ndoutils-1.4b9/
# ./configure --prefix=/usr/local/nagios LDFLAGS=-L/usr/lib --with-mysql-inc=/usr/include/mysql --with-mysql-lib=/usr/lib/mysql --enable-mysql --disable-pgsql --with-ndo2db-user=nagios --with-ndo2db-group=nagios
# make && make install
2. Import the database schema
# cd db/
# ./installdb -u cactiuser -p cacti -h localhost -d cactidb
3. Configuration files
# cd ..
# cp config/ndomod.cfg-sample /usr/local/nagios/etc/ndomod.cfg
# vim /usr/local/nagios/etc/nagios.cfg
broker_module=/usr/local/nagios/bin/ndomod.o config_file=/usr/local/nagios/etc/ndomod.cfg
process_performance_data=1
# cp config/ndo2db.cfg-sample /usr/local/nagios/etc/ndo2db.cfg
# grep -v -E '(^$| *#)' /usr/local/nagios/etc/ndomod.cfg
instance_name=default
output_type=tcpsocket
output=127.0.0.1
tcp_port=5668
use_ssl=0
output_buffer_items=5000
buffer_file=/usr/local/nagios/var/ndomod.tmp
file_rotation_interval=14400
file_rotation_timeout=60
reconnect_interval=15
reconnect_warning_interval=15
data_processing_options=-1
config_output_options=2
# vim /usr/local/nagios/etc/ndomod.cfg
# grep -v -E '(^$| *#)' /usr/local/nagios/etc/ndo2db.cfg
lock_file=/usr/local/nagios/var/ndo2db.lock
ndo2db_user=nagios
ndo2db_group=nagios
socket_type=tcp
socket_name=/usr/local/nagios/var/ndo.sock
tcp_port=5668
use_ssl=0
db_servertype=mysql
db_host=localhost
db_port=3306
db_name=cactidb
db_prefix=npc_
db_user=cactiuser
db_pass=cacti
max_timedevents_age=1440
max_systemcommands_age=10080
max_servicechecks_age=10080
max_hostchecks_age=10080
max_eventhandlers_age=44640
max_externalcommands_age=44640
debug_level=1
debug_verbosity=1
debug_file=/usr/local/nagios/var/ndo2db.debug
max_debug_file_size=1000000
# vim /usr/local/nagios/etc/ndo2db.cfg
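ndomod (inside Nagios) and ndo2db talk over the TCP port set in both files, so the two tcp_port values have to agree. The sketch below compares one key across two config files; cfg_value and same_setting are hypothetical helpers, and the demonstration uses temporary files in place of the real ones.

```shell
#!/bin/sh
# cfg_value is a hypothetical helper: it prints the value of key=value from a
# config file, skipping commented lines and using the last match if repeated.
cfg_value() {
    file=$1
    key=$2
    sed -n "s/^[[:space:]]*$key[[:space:]]*=[[:space:]]*//p" "$file" | tail -n 1
}

# same_setting succeeds when both files carry the same non-empty value.
same_setting() {
    left=$(cfg_value "$2" "$1")
    right=$(cfg_value "$3" "$1")
    [ -n "$left" ] && [ "$left" = "$right" ]
}

# Demonstration with throwaway files holding the values shown above.
mod=$(mktemp)
db=$(mktemp)
printf 'output=127.0.0.1\ntcp_port=5668\n' > "$mod"
printf 'socket_type=tcp\ntcp_port=5668\n' > "$db"
if same_setting tcp_port "$mod" "$db"; then
    echo "tcp_port matches: $(cfg_value "$mod" tcp_port)"
else
    echo "tcp_port differs between the two files"
fi
rm -f "$mod" "$db"
```

Against the real installation the call would be: same_setting tcp_port /usr/local/nagios/etc/ndomod.cfg /usr/local/nagios/etc/ndo2db.cfg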
VIII. Install JSON
The display layer of NPC uses JSON, so PHP needs the php-json extension installed
1. Prepare the PHP build environment
# yum install -y php-devel
2. Install json
# tar xvjf php-json-ext-1.2.1.tar.bz2 -C /usr/src/
# phpize
# ./configure && make && make install
3. Enable the json extension
# vim /etc/php.d/json.ini
extension=php_json.so
# ln -s /usr/lib64/php/modules/json.so /usr/lib64/php/modules/php_json.so
IX. Install the NPC plugin
1. Install
# tar xvzf npc-2.0.4.tar.gz
# mv npc /var/www/html/cacti/plugins
2. Enable the plugin in the web interface
http://s3.运维网.com/wyfs02/M01/59/28/wKioL1TJAhahOMlxAAGIebSeLHM025.jpg
3. Configure the plugin
http://s3.运维网.com/wyfs02/M02/59/28/wKioL1TJAnyRd1zsAAKmALLr24w665.jpg
4. Remember to adjust the permissions
# cd /usr/local/nagios/etc
# chmod 644 ndo2db.cfg
# chmod 644 ndomod.cfg
# chown nagios:nagios ndomod.cfg ndo2db.cfg
5. Start the services
# service mysqld restart
# service httpd restart
# /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
# netstat -tlnp | grep :5668
tcp 0 0 0.0.0.0:5668 0.0.0.0:* LISTEN 19273/ndo2db
# service nagios restart
6. The result looks like this
http://s3.运维网.com/wyfs02/M02/59/2A/wKiom1TJAeeRMJbBAAGSaKLCIWg372.jpg
7. Troubleshooting
No Nagios data shows up. The errors in /var/log/messages point to a MySQL problem:
Aug 14 16:01:18 localhost ndo2db: mysql_error: 'Unknown column 'long_output' in 'field list''
So run the following:
# mysql -uroot -p
mysql> use cactidb;
mysql> ALTER TABLE npc_eventhandlers ADD long_output TEXT NOT NULL DEFAULT '' AFTER output;
mysql> ALTER TABLE npc_hostchecks ADD long_output TEXT NOT NULL DEFAULT '' AFTER output;
mysql> ALTER TABLE npc_hoststatus ADD long_output TEXT NOT NULL DEFAULT '' AFTER output;
mysql> ALTER TABLE npc_notifications ADD long_output TEXT NOT NULL DEFAULT '' AFTER output;
mysql> ALTER TABLE npc_servicechecks ADD long_output TEXT NOT NULL DEFAULT '' AFTER output;
mysql> ALTER TABLE npc_servicestatus ADD long_output TEXT NOT NULL DEFAULT '' AFTER output;
mysql> ALTER TABLE npc_statehistory ADD long_output TEXT NOT NULL DEFAULT '' AFTER output;
mysql> ALTER TABLE npc_systemcommands ADD long_output TEXT NOT NULL DEFAULT '' AFTER output;
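Typing eight nearly identical statements invites typos in the column name. As an alternative sketch, the loop below prints the same statement for each table so that the column is spelled once; print_long_output_fix is a hypothetical helper and it only prints SQL, it does not connect to MySQL.

```shell
#!/bin/sh
# print_long_output_fix is a hypothetical generator: it only prints SQL.
# The table list is the one used in the statements above.
print_long_output_fix() {
    for t in eventhandlers hostchecks hoststatus notifications \
             servicechecks servicestatus statehistory systemcommands; do
        printf "ALTER TABLE npc_%s ADD long_output TEXT NOT NULL DEFAULT '' AFTER output;\n" "$t"
    done
}

print_long_output_fix
```

After reviewing the output, it could be piped into the client, for example: print_long_output_fix | mysql -uroot -p cactidb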
Screenshot after the fix:
http://s3.运维网.com/wyfs02/M00/59/28/wKioL1TJA6nCKL4RAAGulwrBpmE216.jpg