Log monitoring with Nagios and check_logfiles

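Log checking is one of the monitoring methods we use most often. Checking logs requires a Nagios plugin. The check_log plugin bundled with the Nagios plugins can do this, but its features are fairly limited, so we use check_logfiles from ConSol Labs instead. It handles rotated logs and supports macros and regular expressions, among other features, which makes our monitoring more flexible.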

I. Installation

1. Install check_logfiles

tar -zxvf check_logfiles-3.6.3.tar.gz 
cd /usr/local/src/check_logfiles-3.6.3
./configure --prefix=/usr/local/nagios/ --with-nagios-user=nagios --with-nagios-group=nagios --with-seekfiles-dir=/usr/local/nagios/var/tmp --with-protocols-dir=/usr/local/nagios/var/tmp --with-perl=/usr/bin/perl --with-gzip=/bin/gzip
make
At this point, make may fail with an error like the following:

CDPATH="${ZSH_VERSION+.}:" && cd . && /bin/sh /usr/local/src/check_logfiles-3.6.3/missing autoconf
aclocal.m4:21: warning: this file was generated for autoconf 2.69.
You have another version of autoconf.  It may work, but is not guaranteed to.
If you have problems, you may need to regenerate the build system entirely.
To do so, use the procedure documented by the package, typically 'autoreconf'.
configure.ac:4: error: Autoconf version 2.65 or higher is required
aclocal.m4:278: AM_INIT_AUTOMAKE is expanded from...
configure.ac:4: the top level
autom4te: /usr/bin/m4 failed with exit status: 63
WARNING: 'autoconf' is probably too old.
         You should only need it if you modified 'configure.ac',
         or m4 files included by it.
         The 'autoconf' program is part of the GNU Autoconf package:
         <http://www.gnu.org/software/autoconf/>
         It also requires GNU m4 and Perl in order to run:
         <http://www.gnu.org/software/m4/>
         <http://www.perl.org/>
make: *** [configure] Error 63
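This is caused by the autoconf version on the server. As the message "aclocal.m4:21: warning: this file was generated for autoconf 2.69." indicates, the build files were generated with autoconf 2.69, and configure.ac requires version 2.65 or higher, whereas our version is: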

[root@nagios monitors]# /usr/bin/autoconf -V
autoconf (GNU Autoconf) 2.63
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv2+: GNU GPL version 2 or later
<http://gnu.org/licenses/old-licenses/gpl-2.0.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by David J. MacKenzie and Akim Demaille.
So we need to upgrade autoconf to 2.69.
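
Before upgrading, it can help to confirm that the installed version really is below the minimum named in the error. The following is a minimal sketch: version_ge is a hypothetical helper name, it only handles dotted numeric versions, and the two version strings are copied from the output above.

```shell
#!/bin/sh
# version_ge A B: succeeds when dotted numeric version A >= version B.
# Fields are compared left to right as numbers; missing fields count as 0.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        len = (n > m) ? n : m
        for (i = 1; i <= len; i++) {
            if (x[i] + 0 > y[i] + 0) exit 0
            if (x[i] + 0 < y[i] + 0) exit 1
        }
        exit 0
    }'
}

required=2.65     # minimum demanded by configure.ac in the error above
installed=2.63    # as reported by "autoconf -V" on this server

if version_ge "$installed" "$required"; then
    echo "autoconf $installed is new enough"
else
    echo "autoconf $installed is older than $required, upgrade needed"
fi
```

On a live system, installed could be filled in with something like `autoconf --version | awk 'NR==1 {print $NF}'` instead of being hard-coded.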

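2. Install autoconf

[root@test src]# wget http://ftp.gnu.org/gnu/autoconf/autoconf-2.69.tar.gz
[root@test src]# tar -zxvf autoconf-2.69.tar.gz
[root@test src]# cd autoconf-2.69
[root@test autoconf-2.69]# ./configure --prefix=/usr
[root@test autoconf-2.69]# make && make install

Note: autoconf must be installed under /usr; otherwise the check_logfiles build will not use the new version.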

3. Compile and install check_logfiles

make && make install

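Once installation finishes, the check_logfiles plugin is located in /usr/local/nagios/libexec. We need to set its ownership:

chown nagios:nagios /usr/local/nagios/libexec/check_logfiles

Also check whether the directory /usr/local/nagios/var/tmp exists, and create it if it does not, because that is where we told configure to put the seek files and protocol files.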
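
The directory check can be scripted. This is a small sketch rather than part of the official install procedure: ensure_dir is a hypothetical helper, and the owner argument is optional so the function can also be tried without root privileges.

```shell
#!/bin/sh
# ensure_dir DIR [OWNER]: create DIR if it is missing; if OWNER is given
# (for example nagios:nagios), hand the directory over to that owner.
ensure_dir() {
    dir=$1
    owner=${2:-}
    [ -d "$dir" ] || mkdir -p "$dir" || return 1
    if [ -n "$owner" ]; then
        chown "$owner" "$dir" || return 1
    fi
}

# Typical call on the monitored host, run as root:
#   ensure_dir /usr/local/nagios/var/tmp nagios:nagios
```

Giving the directory to the nagios user matters because the plugin writes its seek files there when it runs under NRPE.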

That completes the installation.

II. Configuration

First, let's look at the help text that ships with check_logfiles:

[root@nagios src]# /usr/local/nagios/libexec/check_logfiles -h
This Nagios Plugin comes with absolutely NO WARRANTY. You may use
it on your own risk!
Copyright by ConSol Software GmbH, Gerhard Lausser.

This plugin looks for patterns in logfiles, even in those who were rotated
since the last run of this plugin.

You can find the complete documentation at 
http://labs.consol.de/nagios/check_logfiles/

Usage: check_logfiles [-t timeout] -f <configfile>

The configfile looks like this:

$seekfilesdir = '/opt/nagios/var/tmp';    # directory for state information; it records what has already been checked, like a history
# where the state information will be saved.

$protocolsdir = '/opt/nagios/var/tmp';    # directory for protocol files, which record the lines that matched
# where protocols with found patterns will be stored.

$scriptpath = '/opt/nagios/var/tmp';    # directory searched for scripts or programs that can be called
# where scripts will be searched for.

$MACROS = { CL_DISK01 => "/dev/dsk/c0d1", CL_DISK02 => "/dev/dsk/c0d2" };    # macro definitions: variables we can reference in patterns

@searches = (    # the searches themselves; they can be defined in a config file or directly on the command line, and the config file is more convenient
  {
    tag => 'temperature',    # a label of our choosing; it becomes part of the state and protocol file names and has no other meaning
    logfile => '/var/adm/syslog/syslog.log',    # the log file to monitor
    rotation => 'bmwhpux',    # if the log is rotated, defines how the rotated files are matched
    criticalpatterns => ['OVERTEMP_EMERG', 'Power supply failed'],    # critical errors; one or more regular expressions
    warningpatterns => ['OVERTEMP_CRIT', 'Corrected ECC Error'],    # warnings; one or more regular expressions
    options => 'script,protocol,nocount',    # option list: run a script, write a protocol, do not count matches, and so on
    script => 'sendnsca_cmd'    # name of the script
  },
  {
    tag => 'scsi',
    logfile => '/var/adm/messages',
    rotation => 'solaris',
    criticalpatterns => 'Sense Key: Not Ready',
    criticalexceptions => 'Sense Key: Not Ready /dev/testdisk',
    options => 'noprotocol'
  },
  {
    tag => 'logins',
    logfile => '/var/adm/messages',
    rotation => 'solaris',
    criticalpatterns => ['illegal key', 'read error.*$CL_DISK01$'],
    criticalthreshold => 4,
    warningpatterns => ['read error.*$CL_DISK02$'],
  }
);

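The example above puts all the settings into a configuration file. They can also be passed on the command line. The two ways of calling the plugin are: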

[root@nagios src]# /usr/local/nagios/libexec/check_logfiles
Usage: check_logfiles [-t timeout] -f <configfile> [--searches=tag1,tag2,...]
       check_logfiles [-t timeout] --logfile=<logfile> --tag=<tag> --rotation=<rotation>
                      --criticalpattern=<regexp> --warningpattern=<regexp>
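
The second form is handy for a quick one-off test without writing a configuration file. A hedged sketch follows: it uses only the options listed in the usage text above, while the run_logcheck wrapper, the PLUGIN variable, and the sample tag, log path and patterns are illustrative assumptions. Returning 3 when the plugin is missing follows the Nagios plugin convention for UNKNOWN.

```shell
#!/bin/sh
# Path from the installation step; override PLUGIN if yours differs.
PLUGIN=${PLUGIN:-/usr/local/nagios/libexec/check_logfiles}

# run_logcheck TAG LOGFILE CRITICAL_REGEX WARNING_REGEX
# Runs check_logfiles in command-line mode (no config file).
run_logcheck() {
    if [ ! -x "$PLUGIN" ]; then
        echo "UNKNOWN: $PLUGIN not found or not executable"
        return 3
    fi
    "$PLUGIN" --tag="$1" --logfile="$2" \
        --criticalpattern="$3" --warningpattern="$4"
}

# Example (hypothetical values):
#   run_logcheck web_monitor /var/log/web_monitor.log 'nginx is down' '502'
```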

III. Usage

1. On the monitored host, create a configuration file, for example:

[root@usvr-218 var]# vim /usr/local/nagios/var/log.cfg
@searches = (
	{
		tag => 'web_monitor',
		logfile => '/var/log/web_monitor.log',
		criticalpatterns => ['nginx has restart','nginx is down'],
		warningpatterns => ['500','302','502']
		#options => 'noprotocol'
	}
);
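Here we define the tag web_monitor and check the log file /var/log/web_monitor.log. A line matching one of the criticalpatterns raises a critical error, and a line matching one of the warningpatterns raises a warning. State and protocol information is written to /usr/local/nagios/var/tmp. The state file, for example, is named

log._var_log_web_monitor.log.web_monitor, where the trailing web_monitor is the tag from our configuration.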
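
To get a feel for how these patterns classify lines, the matching can be approximated with grep. This is only an illustration, not how check_logfiles is implemented: the classify_line helper and the sample log lines are made up, and for simplicity it reports only the most severe match per line. It also shows that a bare pattern such as '500' matches any line containing those digits, so tighter patterns may be advisable.

```shell
#!/bin/sh
# Patterns copied from log.cfg above, joined into extended regexes.
CRITICAL='nginx has restart|nginx is down'
WARNING='500|302|502'

# classify_line LINE: print CRITICAL, WARNING or OK for one log line.
classify_line() {
    if printf '%s\n' "$1" | grep -Eq "$CRITICAL"; then
        echo CRITICAL
    elif printf '%s\n' "$1" | grep -Eq "$WARNING"; then
        echo WARNING
    else
        echo OK
    fi
}

# Made-up sample lines:
classify_line 'check failed: nginx is down'
classify_line 'GET /index.html returned 502'
classify_line 'GET /index.html returned 200'
# The next line is flagged as a warning only because 15002 contains "500":
classify_line 'GET /big.iso returned 200, 15002 bytes sent'
```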

[root@usvr-218 tmp]# cat log._var_log_web_monitor.log.web_monitor 
$state = {
           'runcount' => 17,
           'serviceoutput' => '',
           'logoffset' => 642985,
           'runtime' => 1431504819,
           'devino' => '64768:1178440',
           'privatestate' => {
                               'runcount' => 17,
                               'lastruntime' => 1431504220,
                               'logfile' => '/var/log/web_monitor.log'
                             },
           'logtime' => 1431504602,
           'servicestateid' => 0,
           'tag' => 'web_monitor'
         };


1;
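
The state file is a small Perl data structure, so individual fields are easy to pull out when debugging. A minimal sketch, assuming the layout shown above (one 'key' => value pair per line, values without spaces); seek_field is a hypothetical helper. Note that runcount appears twice in the file, and the helper returns the first occurrence.

```shell
#!/bin/sh
# seek_field FILE KEY: print the first value stored under 'KEY' in a
# check_logfiles state file, with quotes and the trailing comma removed.
seek_field() {
    awk -v key="'$2'" -v q="'" '
        $1 == key && $2 == "=>" {
            v = $3
            gsub(q, "", v)     # strip surrounding single quotes
            sub(/,$/, "", v)   # strip the trailing comma
            print v
            exit
        }' "$1"
}

# Example:
#   f=/usr/local/nagios/var/tmp/log._var_log_web_monitor.log.web_monitor
#   seek_field "$f" logoffset
#   date -d "@$(seek_field "$f" runtime)"   # GNU date: epoch to readable time
```
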
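With check_logfiles configured on the monitored host, we still need to add the command to nrpe.cfg and then reload xinetd, which runs NRPE on this host: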

command[check_logfile]=/usr/local/nagios/libexec/check_logfiles -f /usr/local/nagios/var/log.cfg

service xinetd reload
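
Before relying on NRPE, it is worth running the same command by hand on the monitored host and looking at its exit status, since Nagios derives the service state from it (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN, per the standard plugin convention). A small sketch; nagios_state is a hypothetical helper name.

```shell
#!/bin/sh
# nagios_state CODE: translate a plugin exit code into its state name.
nagios_state() {
    case $1 in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Example, run as the nagios user on the monitored host:
#   /usr/local/nagios/libexec/check_logfiles -f /usr/local/nagios/var/log.cfg
#   nagios_state $?
```

Running the test as the nagios user avoids leaving state files in /usr/local/nagios/var/tmp that NRPE cannot update later.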

2. With the monitored host done, let's look at the monitoring server, where we add a service definition:

define service{
    use                     nrpe-service         ; Name of service template to use
    host_name               test
    service_description     web_monitor
    check_command           check_nrpe!check_logfile
    check_interval          10  
    notifications_enabled   1   
    service_groups          logfile_check
    contact_groups          test
    }  

After restarting Nagios, our new monitoring item is visible.



That concludes our look at log monitoring. It only covers the basics, but hopefully it is helpful.

