
Monitoring Linux/UNIX Hosts with Nagios NRPE


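Posted on 2019-1-12 13:56:19
1. Introduction to NRPE


        NRPE is a Nagios add-on that runs plugins on remote Linux/UNIX hosts. With the NRPE daemon and the Nagios plugins installed on a remote server, the Nagios monitoring platform can obtain that server's local metrics, such as CPU load, memory usage, and disk usage. In this article the monitoring side is called the Nagios server, and the remote monitored host is called the Nagios client.
       Note: Nagios plugins can also be run on remote Linux/UNIX hosts over SSH, for example with the check_by_ssh plugin. SSH is more secure than NRPE, but it puts more CPU load on both the monitoring host and the monitored host. With hundreds or thousands of monitored hosts this becomes a problem, which is the main reason many Nagios administrators choose NRPE.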


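1.1 How NRPE works
  NRPE consists of two parts:

  • the check_nrpe plugin, which lives on the local monitoring host;

  • the NRPE daemon, which runs on the remote Linux/UNIX host, that is, the monitored host.
When Nagios needs to monitor a service on a remote Linux/UNIX host, NRPE works as follows:

  • Nagios executes the check_nrpe plugin and tells it which service to check;
  • check_nrpe connects to the NRPE daemon on the monitored host over SSL;
  • the NRPE daemon runs the corresponding Nagios plugin to check the service or resource;
  • the NRPE daemon returns the check result to check_nrpe, which passes it on to the Nagios process for further handling.
Note: the NRPE daemon can only check services and resources if the Nagios plugins are installed on the remote Linux/UNIX host.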
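The result handed back in the last step consists of the plugin's text output and its exit status. Under the standard Nagios plugin convention, exit statuses 0, 1, 2, and 3 mean OK, WARNING, CRITICAL, and UNKNOWN. The helper below is a minimal sketch of that mapping; the function name nagios_state_name and the OUT_OF_RANGE label are made up for illustration and are not part of Nagios or NRPE.

```shell
#!/bin/sh
# Map a plugin exit status to the Nagios state name.
# nagios_state_name is a hypothetical helper for illustration only.
nagios_state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "OUT_OF_RANGE" ;;   # label chosen by this sketch for any other status
    esac
}

# Demonstrate with a stand-in command that exits with status 2.
status=0
sh -c 'exit 2' || status=$?
nagios_state_name "$status"
```

In real use the stand-in command would be a check_nrpe call, and the status passed to the helper would be whatever that call exited with.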
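1.2 NRPE use cases
   1. Direct checks

  The most direct use of NRPE is to monitor "local" or "private" resources on the remote host, such as CPU load, memory usage, swap usage, the number of logged-in users, disk usage, and process states.
   2. Indirect checks

  When the monitoring host cannot reach a remote server directly, NRPE can also be used to check "public" services and resources on remote hosts indirectly. For example, if a remote host with the NRPE daemon and plugins installed can reach a remote web server but the monitoring host cannot, the NRPE daemon can be configured to check that web server on the monitoring host's behalf. In this case the NRPE daemon acts as a monitoring proxy.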
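As a rough sketch of the indirect case, the host running the NRPE daemon needs an nrpe.cfg command that points a network plugin such as check_http at the third machine. The helper name nrpe_command_entry, the command name check_remote_http, and the hostname web.example.internal are all hypothetical, and the sketch assumes the plugins live under /usr/local/nagios/libexec.

```shell
#!/bin/sh
# Print one nrpe.cfg command definition line.
# nrpe_command_entry is a hypothetical helper; the libexec path is an
# assumption about where the Nagios plugins are installed.
nrpe_command_entry() {
    name=$1
    plugin=$2
    shift 2
    printf 'command[%s]=/usr/local/nagios/libexec/%s %s\n' "$name" "$plugin" "$*"
}

# Hypothetical indirect check: the NRPE host probes a web server that
# the monitoring host cannot reach itself.
nrpe_command_entry check_remote_http check_http -H web.example.internal
```

The printed line would go into nrpe.cfg on the proxy host, and the monitoring host would then request the command by name through check_nrpe.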

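  2. Installing and configuring NRPE

  Test servers used in this article:
  Monitoring host IP: 172.16.56.131, hostname: monitors
  Monitored host IP: 192.183.3.145, hostname: kk
  2.1 Installing and configuring NRPE on the remote (monitored) host

  Starting with version 3.0, NRPE has become easier to install on many operating systems. If you run into problems, see https://community.nagios.org/
  1. Add the nagios user
[root@kk ~]#useradd nagios
  2. Download and install the Nagios plugins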
[root@kk ~]#cd /home/softwares/  
[root@kk softwares]#wget http://nagios-plugins.org/download/nagios-plugins-2.1.2.tar.gz  
[root@kk softwares]#tar -xzf nagios-plugins-2.1.2.tar.gz   
[root@kk softwares]#cd nagios-plugins-2.1.2  
[root@kk nagios-plugins-2.1.2]#./configure  --with-nagios-user=nagios --with-nagios-group=nagios
Note: to monitor MySQL, also pass --with-mysql

[root@kk nagios-plugins-2.1.2]#make  
[root@kk nagios-plugins-2.1.2]#make install
Change the ownership of the Nagios plugin installation directory:

[root@kk nagios-plugins-2.1.2]# chown nagios:nagios /usr/local/nagios
[root@kk nagios-plugins-2.1.2]# chown -R nagios:nagios /usr/local/nagios/libexec
3. Install NRPE

NRPE can be downloaded from https://sourceforge.net/projects/nagios/files/nrpe-3.x/; this article uses nrpe-3.0.1.tar.gz.
[root@kk nagios-plugins-2.1.2]#cd ..  
[root@kk softwares]#tar zxf nrpe-3.0.1.tar.gz   
[root@kk softwares]#cd nrpe-3.0.1  
[root@kk nrpe-3.0.1]#yum -y install openssl openssl-devel  
[root@kk nrpe-3.0.1]#./configure --with-nagios-user=nagios --with-nagios-group=nagios

[root@kk nrpe-3.0.1]#make all

Install the NRPE plugin, daemon, and related files:

[root@kk nrpe-3.0.1]#make install-plugin

[root@kk nrpe-3.0.1]#make install-daemon

[root@kk nrpe-3.0.1]#make install-daemon-config

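The command above fails in this release. This is a bug in this version of NRPE; see https://github.com/NagiosEnterprises/nrpe/issues/50.
Workaround:

[root@kk nrpe-3.0.1]#make install-config
If port 5666 needs to be opened, run the following commands (in this example the firewall is disabled by default):

# iptables -I RH-Firewall-1-INPUT -p tcp -m tcp --dport 5666 -j ACCEPT
# service iptables save
4. Configure the NRPE commands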

[root@kk nrpe-3.0.1]#vim /usr/local/nagios/etc/nrpe.cfg

  Set allowed_hosts=192.183.3.145,172.16.56.131 so that the Nagios server is allowed to connect.

  Test the following check commands on the command line, adapt them to your own monitoring needs, and write the corresponding definitions into nrpe.cfg:
/usr/local/nagios/libexec/check_nrpe -H localhost -c check_users
/usr/local/nagios/libexec/check_nrpe -H localhost -c check_load
/usr/local/nagios/libexec/check_nrpe -H localhost -c check_sda1
/usr/local/nagios/libexec/check_nrpe -H localhost -c check_total_procs
/usr/local/nagios/libexec/check_nrpe -H localhost -c check_zombie_procs
  Review the resulting configuration:
[root@kk libexec]#grep -v '^#' /usr/local/nagios/etc/nrpe.cfg |sed '/^$/d'
log_facility=daemon
debug=0
pid_file=/usr/local/nagios/var/nrpe.pid
server_port=5666
nrpe_user=nagios
nrpe_group=nagios
allowed_hosts=192.183.3.145,172.16.56.131
dont_blame_nrpe=0
allow_bash_command_substitution=0
command_timeout=60
connection_timeout=300
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_sda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda1
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
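Only names that appear in a command[...] line can be requested through check_nrpe -c, so it can help to list them. The following is a minimal sketch; list_nrpe_commands is a hypothetical helper, not an NRPE tool, and it only reads the file it is given.

```shell
#!/bin/sh
# Print the command names defined in an nrpe.cfg-style file, one per line.
# list_nrpe_commands is a hypothetical helper for illustration.
list_nrpe_commands() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Example call, assuming the config path used in this article:
#   list_nrpe_commands /usr/local/nagios/etc/nrpe.cfg
```

Each printed name is a value that a service definition on the monitoring host can pass after the "!" in check_nrpe!<name>.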
5. Start NRPE

[root@kk nrpe-3.0.1]# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
[root@kk nrpe-3.0.1]#netstat -tulpn | grep nrpe

NRPE can run in one of two modes:
-i        # Run as a service under inetd or xinetd
-d        # Run as a standalone daemon
You can write an init script so that NRPE runs in standalone mode:

[root@kk nrpe-3.0.1]#vi /etc/init.d/nrped  
#!/bin/bash  
# chkconfig: 2345 88 12     
# description: NRPE DAEMON     
NRPE=/usr/local/nagios/bin/nrpe   
NRPECONF=/usr/local/nagios/etc/nrpe.cfg     
case "$1" in   
    start)     
        echo -n "Starting NRPE daemon..."   
        $NRPE -c $NRPECONF -d     
        echo " done."   
        ;;     
    stop)     
        echo -n "Stopping NRPE daemon..."   
        pkill -u nagios nrpe     
        echo " done."   
        ;;
    restart)     
        $0 stop     
        sleep 2     
        $0 start     
        ;;     
    *)     
        echo "Usage: $0 start|stop|restart"   
        ;;     
    esac   
exit 0
[root@kk nrpe-3.0.1]#chmod +x /etc/init.d/nrped
[root@kk nrpe-3.0.1]#chkconfig --add nrped
[root@kk nrpe-3.0.1]#chkconfig nrped on
[root@kk nrpe-3.0.1]#service nrped start
Starting NRPE daemon... done.
2.2 Installing and configuring NRPE on the monitoring host
1. Install the dependencies
[root@monitors ~]# yum -y install openssl openssl-devel
Without the openssl-devel package, compiling NRPE fails.

2. Download and install NRPE

[root@monitors ~]# cd /home/nagios/
[root@monitors nagios]# wget http://prdownloads.sourceforge.net/sourceforge/nagios/nrpe-3.0.1.tar.gz
--2017-01-17 23:36:36--  http://prdownloads.sourceforge.net/sourceforge/nagios/nrpe-3.0.1.tar.gz
[root@monitors nagios]# tar xzvf nrpe-3.0.1.tar.gz   
[root@monitors nagios]# cd nrpe-3.0.1  
[root@monitors nrpe-3.0.1]# ./configure --with-nagios-user=nagios --with-nagios-group=nagios

[root@monitors nrpe-3.0.1]# make all

[root@monitors nrpe-3.0.1]# make install-plugin

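After installation, the check_nrpe plugin is present in the libexec directory under the Nagios installation directory:

[root@monitors nrpe-3.0.1]# ll /usr/local/nagios/libexec/check_nrpe
-rwxrwxr-x 1 nagios nagios 125293 Jan 17 23:47 /usr/local/nagios/libexec/check_nrpe

3. Test NRPE
  The check_nrpe command-line options are listed in its help output: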
[root@monitors libexec]# ./check_nrpe -h  
NRPE Plugin for Nagios  
Copyright (c) 1999-2008 Ethan Galstad (nagios@nagios.org)  
Version: 3.0.1  
Last Modified: 09-08-2016  
License: GPL v2 with exemptions (-l for more info)  
SSL/TLS Available: OpenSSL 0.9.6 or higher required  
Usage: check_nrpe -H <host> [-2] [-4] [-6] [-n] [-u] [-V] [-l] [-d <dhopt>]
       [-P <size>] [-S <ssl version>]  [-L <cipherlist>] [-C <clientcert>]
       [-K <key>] [-A <cacert>] [-s <logopts>] [-b <bindaddr>]
       [-f <cfg-file>] [-p <port>] [-t <interval>:<state>]
       [-c <command>] [-a <arglist...>]
Options:
 <host>       = The address of the host running the NRPE daemon
 -2           = Only use Version 2 packets, not Version 3
 -4           = bind to ipv4 only
 -6           = bind to ipv6 only
 -n           = Do no use SSL
 -u           = (DEPRECATED) Make timeouts return UNKNOWN instead of CRITICAL
 -V           = Show version
 -l           = Show license
 <dhopt>      = Anonymous Diffie Hellman use:
                0 = Don't use Anonymous Diffie Hellman
                    (This will be the default in a future release.)
                1 = Allow Anonymous Diffie Hellman (default)
                2 = Force Anonymous Diffie Hellman
 <size>       = Specify non-default payload size for NSClient++
 <ssl ver>    = The SSL/TLS version to use. Can be any one of: SSLv2 (only),
                SSLv2+ (or above), SSLv3 (only), SSLv3+ (or above),
                TLSv1 (only), TLSv1+ (or above DEFAULT), TLSv1.1 (only),
                TLSv1.1+ (or above), TLSv1.2 (only), TLSv1.2+ (or above)
 <cipherlist> = The list of SSL ciphers to use (currently defaults
                to "ALL:!MD5:@STRENGTH". WILL change in a future release.)
 <clientcert> = The client certificate to use for PKI
 <key>        = The private key to use with the client certificate
 <cacert>     = The CA certificate to use for PKI
 <logopts>    = SSL Logging Options
 <bindaddr>   = bind to local address
 <cfg-file>   = configuration file to use
 [port]       = The port on which the daemon is running (default=5666)
 [command]    = The name of the command that the remote daemon should run
 [arglist]    = Optional arguments that should be passed to the command,
                separated by a space.  If provided, this must be the last
                option supplied on the command line.
NEW TIMEOUT SYNTAX
 -t <interval>:<state>
    <interval> = Number of seconds before connection times out (default=10)
    <state> = Check state to exit with in the event of a timeout (default=CRITICAL)
    Timeout state must be a valid state name (case-insensitive) or integer:  
    (OK, WARNING, CRITICAL, UNKNOWN) or integer (0-3)  
Note:  
This plugin requires that you have the NRPE daemon running on the remote host.  
You must also have configured the daemon to associate a specific plugin command  
with the [command] option you are specifying here.  Upon receipt of the  
[command] argument, the NRPE daemon will run the appropriate plugin command and  
send the plugin output and return code back to *this* plugin.  This allows you  
to execute plugins on remote hosts and 'fake' the results to make Nagios think  
the plugin is being run locally.
Remote Linux hosts are monitored through NRPE with the check_nrpe plugin, whose basic syntax is:
check_nrpe -H <host> [-n] [-u] [-p <port>] [-t <timeout>] [-c <command>] [-a <arglist...>]

[root@monitors libexec]# ./check_nrpe -H 192.183.3.145 -p 5666  
NRPE v3.0.1
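When check_nrpe is run without -c, as in the test above, a reachable daemon answers with its version string. A script can use that reply as a simple connectivity probe. The sketch below only parses the reply text; nrpe_reply_version is a hypothetical helper, and the sample string is the reply shown above.

```shell
#!/bin/sh
# Extract the version number from a check_nrpe reply such as "NRPE v3.0.1".
# nrpe_reply_version is a hypothetical helper; it prints nothing for any
# reply that does not have this shape (for example an error message).
nrpe_reply_version() {
    printf '%s\n' "$1" | sed -n 's/^NRPE v\([0-9][0-9.]*\)$/\1/p'
}

nrpe_reply_version "NRPE v3.0.1"    # sample reply copied from the test above
```

In practice the argument would be the captured output of the check_nrpe call, and an empty result would indicate that the daemon did not answer as expected.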
1. Create the command definition

[root@monitors libexec]# cd /usr/local/nagios/etc/objects/  
[root@monitors objects]# vim commands.cfg  
define command{  
        command_name    check_nrpe  
        command_line    $USER1$/check_nrpe -H "$HOSTADDRESS$"  -c "$ARG1$"  
}
2. Create the host and service definitions

[root@monitors objects]# vim linuxserver.cfg   
#############################################################  
#create a new template for linux boxes  
#############################################  
define host{  
name linux-box ; Name of this template  
use generic-host ; Inherit default values  
check_period 24x7  
check_interval 5  
retry_interval 1  
max_check_attempts 10  
check_command check-host-alive  
notification_period 24x7  
notification_interval 30  
notification_options d,r  
contact_groups admins  
register 0 ; DONT REGISTER THIS - ITS A TEMPLATE  
}  
########################################################
#define a new host for the remote Linux/Unix box
#that references the newly created linux-box host template
########################################################
define host{
use linux-box ; Inherit default values from a template
host_name remotehost ; The name we're giving to this server
alias centos6_kk ; A longer name for the server
address 192.183.3.145 ; IP address of the server
}
######################################################################################
#The following service will monitor the CPU load on the remote host.
# The "check_load" argument that is passed to the check_nrpe command
# definition tells the NRPE daemon to run the "check_load" command
# as defined in the nrpe.cfg file
######################################################################################
define service{  
use generic-service  
host_name remotehost  
service_description CPU Load  
check_command check_nrpe!check_load  
}  
##############################################################################################  
#The following service will monitor the number of currently logged in users on the remote host  
############################################################################################  
define service{  
use generic-service  
host_name remotehost  
service_description Current Users  
check_command check_nrpe!check_users  
}  
#############################################################################################  
#The following service will monitor the free drive space on /dev/sda1 on the remote host.
#
#Note: /dev/sda1 here was taken from the df output on the monitored host; do not blindly copy /dev/hda1 from the official documentation.
############################################################################################
define service{  
use generic-service  
host_name remotehost  
service_description /dev/sda1 Free Space  
check_command check_nrpe!check_sda1  
}  
##############################################################################################  
#The following service will monitor the total number of processes on the remote host.  
##############################################################################################  
define service{  
use generic-service  
host_name remotehost  
service_description Total Processes  
check_command check_nrpe!check_total_procs  
}  
###########################################################################################  
#The following service will monitor the number of zombie processes on the remote host.  
###########################################################################################  
define service{  
use generic-service  
host_name remotehost  
service_description Zombie Processes  
check_command check_nrpe!check_zombie_procs  
}
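Note: the command name that each service definition on the monitoring host (the Nagios server) passes to check_nrpe must match a command defined in nrpe.cfg on the monitored host.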
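To make the link between the two sides concrete: in a check_command such as check_nrpe!check_load, the text after "!" becomes $ARG1$, and $HOSTADDRESS$ is taken from the host's address directive. The sketch below imitates that substitution for the command_line defined earlier. The function expand_check_nrpe is hypothetical, and the value used for $USER1$ is an assumption (it is normally set in resource.cfg and usually points at the plugin directory).

```shell
#!/bin/sh
# Imitate how Nagios fills in the check_nrpe command_line for one service.
# expand_check_nrpe is a hypothetical helper; the default for $USER1$ is an
# assumption, since the real value comes from resource.cfg.
expand_check_nrpe() {
    check_command=$1                        # e.g. check_nrpe!check_load
    host_address=$2                         # value of the host's "address"
    user1=${3:-/usr/local/nagios/libexec}   # assumed value of $USER1$
    arg1=${check_command#*!}                # text after the first "!"
    printf '%s/check_nrpe -H "%s" -c "%s"\n' "$user1" "$host_address" "$arg1"
}

expand_check_nrpe 'check_nrpe!check_load' 192.183.3.145
```

The name that ends up after -c is what the NRPE daemon looks up among its command[...] entries, which is why the two sides have to agree on it.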

3. Enable the new command and service definitions

[root@monitors objects]# vim /usr/local/nagios/etc/nagios.cfg
Add the line:
cfg_file=/usr/local/nagios/etc/objects/linuxserver.cfg
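If this step is scripted rather than done in an editor, the line should be added only once. A minimal sketch, assuming the target file ends with a newline; ensure_cfg_file is a hypothetical helper name.

```shell
#!/bin/sh
# Append "cfg_file=<path>" to a Nagios main config file unless that exact
# line is already present. ensure_cfg_file is a hypothetical helper.
ensure_cfg_file() {
    cfg=$1
    line="cfg_file=$2"
    # -x: match the whole line, -F: fixed string, -q: quiet
    grep -qxF "$line" "$cfg" || printf '%s\n' "$line" >> "$cfg"
}

# Example call, using the paths from this article:
#   ensure_cfg_file /usr/local/nagios/etc/nagios.cfg \
#       /usr/local/nagios/etc/objects/linuxserver.cfg
```

Running the helper a second time leaves the file unchanged, so it is safe to call from a provisioning script.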
Check the configuration syntax:

[root@monitors objects]# service nagios configtest
or
[root@monitors objects]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg  
Nagios Core 4.2.0  
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors  
Copyright (c) 1999-2009 Ethan Galstad  
Last Modified: 08-01-2016  
License: GPL  
Website: https://www.nagios.org  
Reading configuration data...  
   Read main config file okay...  
   Read object config files okay...  
Running pre-flight check on configuration data...  
Checking objects...  
    Checked 15 services.  
Warning: Host 'kk' has no default contacts or contactgroups defined!  
    Checked 2 hosts.  
    Checked 1 host groups.  
    Checked 1 service groups.  
    Checked 1 contacts.  
    Checked 1 contact groups.  
    Checked 26 commands.  
    Checked 5 time periods.  
    Checked 0 host escalations.  
    Checked 0 service escalations.  
Checking for circular paths...  
    Checked 2 hosts  
    Checked 0 service dependencies  
    Checked 0 host dependencies  
    Checked 5 timeperiods  
Checking global event handlers...  
Checking obsessive compulsive processor commands...  
Checking misc settings...  
Total Warnings: 1  
Total Errors:   0  
Things look okay - No serious problems were detected during the pre-flight check
Restart Nagios:

[root@monitors objects]#  service nagios restart  
Running configuration check...  
Stopping nagios: done.  
Starting nagios: done.
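Log in to the Nagios web interface and check that the new checks are active.

This completes the basic installation and configuration of NRPE.

4. Customizing NRPE

  To monitor additional services on a remote Linux/UNIX host:

  •   add a new command definition to nrpe.cfg on the remote host;
  •   add a new service definition to the Nagios configuration on the monitoring host.
  For example, to add monitoring of swap usage:
  1. Configuration on the monitored remote host
  In this example, the goal is a CRITICAL alert when free swap falls below 10% and a WARNING alert when it falls below 20%.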
[root@kk libexec]#/usr/local/nagios/libexec/check_swap -w 20% -c 10%
SWAP OK - 59% free (2251 MB out of 3823 MB) |swap=2251MB;764;382;0;3823
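The part of the output after "|" is performance data in the usual Nagios plugin layout label=value[UOM];warn;crit;min;max. The sketch below splits the sample item shown above into its fields. It also shows that the warn and crit fields are consistent with 20% and 10% of the 3823 MB total, truncated to whole megabytes. Both parse_perfdata and threshold_mb are hypothetical helpers, and the truncation is inferred from this one sample rather than from the plugin's documentation.

```shell
#!/bin/sh
# Split one performance-data item of the form
#   label=value[UOM];warn;crit;min;max
# parse_perfdata and threshold_mb are hypothetical helpers for illustration.
# The function body runs in a subshell so the IFS change does not leak out.
parse_perfdata() (
    perf=$1
    label=${perf%%=*}
    set -f                  # no filename expansion while splitting
    IFS=';'
    set -- ${perf#*=}       # fields after "label=" become $1..$5
    printf 'label=%s value=%s warn=%s crit=%s min=%s max=%s\n' \
        "$label" "$1" "$2" "$3" "$4" "$5"
)

# Percentage of a total, truncated to a whole number.
threshold_mb() {
    echo $(( $1 * $2 / 100 ))
}

parse_perfdata 'swap=2251MB;764;382;0;3823'
threshold_mb 3823 20    # prints 764, the warn field above
threshold_mb 3823 10    # prints 382, the crit field above
```

Because the value field is free swap in MB, the check moves to WARNING once it drops below the warn field and to CRITICAL once it drops below the crit field.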

  Add this command to nrpe.cfg:
[root@kk libexec]#vi /usr/local/nagios/etc/nrpe.cfg
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%
  Restart the NRPE daemon:
[root@kk libexec]#service nrped  restart
Stopping NRPE daemon... done.
Starting NRPE daemon... done.
  2. Configuration on the monitoring host
[root@monitors ~]# vim /usr/local/nagios/etc/objects/linuxserver.cfg
define service{
use generic-service
host_name remotehost
service_description Swap Usage
check_command check_nrpe!check_swap
}
Verify the configuration:
[root@monitors ~]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Nagios Core 4.2.0
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-01-2016
License: GPL
Website: https://www.nagios.org
Reading configuration data...
   Read main config file okay...
   Read object config files okay...
Running pre-flight check on configuration data...
Checking objects...
Checked 16 services.
Warning: Host 'kk' has no default contacts or contactgroups defined!
Checked 2 hosts.
Checked 1 host groups.
Checked 1 service groups.
Checked 1 contacts.
Checked 1 contact groups.
Checked 26 commands.
Checked 5 time periods.
Checked 0 host escalations.
Checked 0 service escalations.
Checking for circular paths...
Checked 2 hosts
Checked 0 service dependencies
Checked 0 host dependencies
Checked 5 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 1
Total Errors:   0
Things look okay - No serious problems were detected during the pre-flight check

  Restart Nagios:
[root@monitors ~]# service nagios restart
Running configuration check...
Stopping nagios: done.
Starting nagios: done.
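  Refresh the Nagios monitoring page; the new Swap Usage service should now be listed.
Note: the conceptual parts of this article draw on the official NRPE 3.0 documentation, and the hands-on parts also refer to http://467754239.blog.运维网.com/4878013/1558897/. Corrections and feedback are welcome!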



