While tuning our production Nagios monitoring recently, I noticed several warnings like the following:
Warning: Service 'usr Partition' on host 'luckcart_ss02' has a notification interval less than its check interval!
Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
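The rule in this warning can be written as a small check. This is only a rough sketch in Python, not Nagios's own code: the function names are hypothetical, both values are assumed to be in the same time units, and a notification_interval of 0 (which disables re-notification) is assumed not to trigger the warning.

```python
def triggers_interval_warning(check_interval, notification_interval):
    """Return True when the settings match the condition described in the warning."""
    # Assumption: 0 means "never re-notify", so it is not treated as too short.
    return 0 < notification_interval < check_interval


def effective_notification_interval(check_interval, notification_interval):
    """Approximate the interval at which re-notifications can actually go out."""
    if triggers_interval_warning(check_interval, notification_interval):
        # Notifications are only re-sent after a check runs, so the
        # check interval becomes the effective notification interval.
        return check_interval
    # Otherwise the configured value applies (in practice the notification
    # goes out on the first check after this interval has elapsed).
    return notification_interval


if __name__ == "__main__":
    # Hypothetical values: check every 10 units, re-notify every 5 units.
    print(triggers_interval_warning(10, 5))
    print(effective_notification_interval(10, 5))
```

With a check interval of 10 and a notification interval of 5, the check reports the warning condition and the effective interval is the check interval.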
Judging from the warning, my first guess was that the notification interval is set too short.
Looking at the Nagios sample configuration:
max_check_attempts 3 ; Re-check the service up to 3 times in order to determine its final (hard) state
normal_check_interval 3 ; Check the service every 3 minutes under normal conditions
retry_check_interval 2 ; Re-check the service every two minutes until a hard state can be determined
contact_groups admins ; Notifications get sent out to everyone in the 'admins' group
notification_options w,u,c,r ; Send notifications about warning, unknown, critical, and recovery events
notification_interval 10 ; Re-notify about service problems every 10 minutes