# Generic service definition template - This is NOT a real service, just a template!
By default, Nagios already defines a generic service template named generic-service. It holds the attributes that every service needs, whatever the service is.
define service{
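The define keyword followed by service marks this block as a service definition, which can be either a real service or a service template. Enclose the body of the block in a pair of curly braces and put one directive per line. To define several services, write a separate define service{} block for each one.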
name generic-service ; The 'name' of this service template
The name directive sets the template name; here the template is called generic-service.
active_checks_enabled 1 ; Active service checks are enabled
Enables active checks for the service.
passive_checks_enabled 1 ; Passive service checks are enabled/accepted
Enables passive checks, so check results submitted from outside are accepted.
parallelize_check 1 ; Active service checks should be parallelized (disabling this can lead to major performance problems)
Enables parallel execution of active service checks.
obsess_over_service 1 ; We should obsess over this service (if necessary)
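Enables obsessing over the service: after each check of this service, Nagios runs the command set by ocsp_command in the main configuration file. This is typically used in distributed monitoring setups.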
check_freshness 0 ; Default is to NOT check service 'freshness'
Disables freshness checking of the service's check results.
notifications_enabled 1 ; Service notifications are enabled
Enables notifications for the service.
event_handler_enabled 1 ; Service event handler is enabled
Enables the event handler for the service.
flap_detection_enabled 1 ; Flap detection is enabled
Enables flap detection.
failure_prediction_enabled 1 ; Failure prediction is enabled
Enables failure prediction.
process_perf_data 1 ; Process performance data
Enables processing of the performance data returned by the service check.
retain_status_information 1 ; Retain status information across program restarts
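Retains status information across restarts. When Nagios restarts, it first shows the status data saved at the last shutdown instead of starting with empty data.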
retain_nonstatus_information 1 ; Retain non-status information across program restarts
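Retains non-status information across restarts. When Nagios restarts, it first shows the non-status data saved at the last shutdown instead of starting with empty data.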
is_volatile 0 ; The service is not volatile
Marks the service as non-volatile.
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
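Prevents registration. When register is 0, Nagios treats the block as a service template rather than a real service. When you define your own service templates, remember to add this directive so that Nagios knows the block is a template; the same applies to host templates.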
}
# Local service definition template - This is NOT a real service, just a template!
Nagios also defines, by default, a template for services that monitor the local system.
define service{
name local-service ; The name of this service template
use generic-service ; Inherit default values from the generic-service definition
The use directive inherits from the generic-service template. Templates can inherit from other templates.
check_period all_days ; The service can be checked at any time of the day
max_check_attempts 2 ; Re-check the service up to 2 times in order to determine its final (hard) state
Sets the maximum number of check attempts made before a failed check is treated as a final (hard) state.
normal_check_interval 5 ; Check the service every 5 minutes under normal conditions
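Sets the interval between regular checks of the service. The unit is the interval_length time unit, which defaults to 60 seconds, so 5 means five minutes.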
retry_check_interval 1 ; Re-check the service every minute until a hard state can be determined
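Sets the interval between re-checks after a failed check. The unit is the interval_length time unit, which defaults to 60 seconds, so 1 means one minute.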
contact_groups admins ; Notifications get sent out to everyone in the 'admins' group
Sets the contact group that receives notifications.
notification_options w,u,c,r ; Send notifications about warning, unknown, critical, and recovery events
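Sets which service states trigger notifications. The value is a comma-separated list of the following options:
w means warning;
u means unknown;
c means critical;
r means recovery.
The d (down) option applies to host definitions rather than services.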
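notification_interval 15 ; Re-notify about service problems every 15 minutes
Sets how long Nagios waits before re-sending a notification while the service is still in a problem state, in the same time unit as the check intervals.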
notification_period all_days ; Notifications can be sent out at any time
Sets the time period during which notifications may be sent.
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
Setting register to 0 marks this block as a service template rather than a concrete service.
}
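Because templates can inherit from other templates, a more specialized template only needs to state what differs from its parent. The sketch below uses a hypothetical template name, quiet-local-service, which is not part of the original configuration. It inherits from local-service and overrides two notification settings; register 0 is repeated, just as local-service repeats it.

```
# Hypothetical template, for illustration only (not in the original configuration)
define service{
        name                    quiet-local-service
        use                     local-service
        # Override: notify only on critical and recovery events
        notification_options    c,r
        # Override: re-notify every 60 time units (minutes by default)
        notification_interval   60
        # Still a template, not a real service
        register                0
        }
```

Directives set in the child template take precedence over the values inherited from local-service, so everything not listed here keeps the parent's setting.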
# Remote service definition template - This is NOT a real service, just a template!
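Here I define a template for services that monitor remote systems. NRPE has not been added yet, so remote checks cannot run for now, but the template is worth planning in advance.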
define service{
name remote-service ; The name of this service template
use generic-service ; Inherit default values from the generic-service definition
check_period all_days ; The service can be checked at any time of the day
max_check_attempts 2 ; Re-check the service up to 2 times in order to determine its final (hard) state
normal_check_interval 5 ; Check the service every 5 minutes under normal conditions
retry_check_interval 1 ; Re-check the service every minute until a hard state can be determined
contact_groups admins ; Notifications get sent out to everyone in the 'admins' group
notification_options w,u,c,r ; Send notifications about warning, unknown, critical, and recovery events
notification_interval 20 ; Re-notify about service problems every 20 minutes
notification_period all_days ; Notifications can be sent out at any time
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
}
# Connection service definition template - This is NOT a real service, just a template!
I then define one more template, this time for connectivity check services.
define service{
name connection-service ; The name of this service template
use generic-service ; Inherit default values from the generic-service definition
check_period all_days ; The service can be checked at any time of the day
max_check_attempts 2 ; Re-check the service up to 2 times in order to determine its final (hard) state
normal_check_interval 5 ; Check the service every 5 minutes under normal conditions
retry_check_interval 1 ; Re-check the service every minute until a hard state can be determined
contact_groups admins ; Notifications get sent out to everyone in the 'admins' group
notification_options w,u,c,r ; Send notifications about warning, unknown, critical, and recovery events
notification_interval 20 ; Re-notify about service problems every 20 minutes
notification_period all_days ; Notifications can be sent out at any time
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
}
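The check and notification intervals in these templates are counted in time units rather than seconds. A minimal sketch of the relevant line in the main configuration file (commonly nagios.cfg; its location depends on your installation), assuming the default value is kept:

```
# Main configuration file, not an object definition
# Length of one time unit in seconds; 60 is the default
interval_length=60
```

With this setting, normal_check_interval 5 means one check every 300 seconds, and notification_interval 20 means one reminder every 20 minutes.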
# Define a service to "ping" the local and remote machine
First, define a service that pings the local host and the remote hosts. It belongs to the connectivity check services.
define service{
use connection-service ; Name of service template to use
host_name localhost,KCentOS5A,KCWIN2K3A,KCXP1
The host_name directive lists the hosts this service monitors. Every name that follows it must already be defined as a host.
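A complete connectivity check also needs a service_description and a check_command. The following is a sketch, not necessarily the author's exact definition. It assumes the check_ping command from the stock sample configuration, with a warning at 100 ms round-trip time or 20% packet loss and a critical alert at 500 ms or 60% loss. The PING description matches the service name shown in the status output later in this article.

```
# Sketch only: the description, command and thresholds are assumptions
define service{
        use                     connection-service
        host_name               localhost,KCentOS5A,KCWIN2K3A,KCXP1
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }
```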
Next, define some services that monitor the local system.
# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
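Define a service that monitors disk usage on the local Nagios system: a warning is raised when free space drops below 20%, and a critical alert when it drops below 10%.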
define service{
use local-service ; Name of service template to use
host_name localhost
service_description Root Partition
check_command check_local_disk!20%!10%!/
}
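The exclamation marks in check_command separate the command name from its arguments, which Nagios passes to the command definition as $ARG1$, $ARG2$ and so on. As a sketch, assuming the check_local_disk definition shipped with the sample configuration has not been changed, the command looks like this:

```
# Assumed to match the stock sample command definition
# For the service above: $ARG1$ is 20%, $ARG2$ is 10%, $ARG3$ is /
define command{
        command_name    check_local_disk
        command_line    $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
        }
```

Here -w is the warning threshold, -c is the critical threshold and -p is the partition to check.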
# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
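Define a service that monitors the number of users currently logged in to the local Nagios system: a warning is raised when there are more than 20 users, and a critical alert when there are more than 50.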
define service{
use local-service ; Name of service template to use
host_name localhost
service_description Current Users
check_command check_local_users!20!50
}
# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
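# > 400 processes.
Define a service that monitors the number of processes currently running on the local Nagios system: a warning is raised when there are more than 250 processes, and a critical alert when there are more than 400.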
define service{
use local-service ; Name of service template to use
host_name localhost
service_description Total Processes
check_command check_local_procs!250!400!RSZDT
}
# Define a service to check the load on the local machine.
Define a service that monitors the current system load on the local Nagios system.
define service{
use local-service ; Name of service template to use
host_name localhost
service_description Current Load
check_command check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
}
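Each argument to check_local_load is a comma-separated triple of thresholds for the 1-minute, 5-minute and 15-minute load averages. A sketch of the underlying command, assuming the stock sample definition is in use:

```
# Assumed to match the stock sample command definition
# $ARG1$ holds the warning thresholds, $ARG2$ the critical thresholds
define command{
        command_name    check_local_load
        command_line    $USER1$/check_load -w $ARG1$ -c $ARG2$
        }
```

With the service above, a warning is raised when a load average exceeds 5.0, 4.0 or 3.0 respectively, and a critical alert when one exceeds 10.0, 6.0 or 4.0.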
----------------------------------------------------------------------------------
II. Test startup:
1. After verifying that the configuration is correct, start Nagios
[iyunv@KCentOS5C ~]# service nagios start
Starting nagios: done.
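4. Access Nagios from a browser
A dialog box first pops up asking for the username and password for Nagios. Here I enter the kanecruise user registered earlier, with the password 123456.
The Nagios main page then appears. I cannot show screenshots in plain text, so I have pasted some of the monitoring results below instead.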
----------------------------------------------------------------------------------
Host       Service          Status  Last Check           Duration        Attempt  Status Information
localhost  Current Load     OK      10-05-2007 21:29:21  0d 1h 55m 39s   1/2      OK - load average: 0.08, 0.02, 0.00
           Current Users    OK      10-05-2007 21:33:46  0d 1h 57m 31s   1/2      USERS OK - 2 users currently logged in
           PING             OK      10-05-2007 21:24:59  0d 1h 55m 1s    1/2      PING OK - Packet loss = 0%, RTA = 0.06 ms
           Root Partition   OK      10-05-2007 21:28:06  0d 1h 56m 54s   1/2      DISK OK - free space: / 4872 MB (75% inode=96%):
           Total Processes  OK      10-05-2007 21:25:36  0d 1h 54m 24s   1/2      PROCS OK: 20 processes with STATE = RSZDT
----------------------------------------------------------------------------------