Zabbix in the Enterprise: Solving Floods of nodata Alert Notifications

2017-11-08

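Introduction:

I have been studying and using Zabbix for almost a year. It has plenty of features, but my biggest headache is this: whenever the data center network fluctuates or a proxy server has problems, a large number of servers raise nodata alerts at once. I send alerts by email and have SMS forwarding enabled on the mailbox, so a burst of nodata alerts basically freezes my phone (a Xiaomi Mi 3). To solve this I tested several approaches:

1. Set up trigger dependencies. With multiple Zabbix servers paired with multiple proxies, this is tedious to configure and hard to change later, so I gave up on it;

2. Send alerts through a custom script and do the analysis and handling inside the script. This is the approach I currently use.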

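[Screenshot: alert received when a nodata problem occurs, with method 2 in place]

[Screenshot: alert received after the nodata problem recovers, with method 2 in place]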

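How it is implemented:

I. Server side (Zabbix web interface)

1. Send alerts with a custom script

Go to "Administration" ==> "Media types" ==> "Create media type".

The "Script name" field holds the name of the sending script. The directory that contains the script is defined in zabbix_server.conf; see the client-side steps below for how to set it.

2. Configure the action

Go to "Configuration" ==> "Actions" ==> "Create action" and set "Event source" to "Triggers".

Then open "Operations" and set "Send only to" to "E-mail". This "E-mail" is the name given to the media type defined a moment ago.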
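For orientation, here is a minimal sketch of what a script media type receives. It assumes the Zabbix 2.x convention of three positional arguments (recipient, subject, message body), which is how the dispatch script in this article reads `$1`, `$2` and `$3`. On Zabbix 3.0 and later the parameters are listed explicitly on the media type, typically as `{ALERT.SENDTO}`, `{ALERT.SUBJECT}` and `{ALERT.MESSAGE}`. The function name `show_alert_args` and the sample values are made up for illustration.

```shell
#!/bin/bash
# Hypothetical stand-in for an alert script: it only prints the three
# positional arguments, in the order the real script reads them.
show_alert_args() {
    printf 'to=%s\n' "$1"
    printf 'subject=%s\n' "$2"
    printf 'body=%s\n' "$3"
}

# Example call with made-up values.
show_alert_args 'ops@example.com' 'PROBLEM: web01 system time out' 'No data received for 5 minutes'
```

Running the real script by hand with three such arguments, as the zabbix user, is a quick way to check the mail path before wiring it into an action.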

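II. Client-side operations (shell on the Zabbix server host)

1. Edit zabbix_server.conf

        AlertScriptsPath=/usr/local/zabbix/bin

This sets the directory in which Zabbix looks for alert scripts.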
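A small sketch for checking which directory is actually configured. The helper name `alert_scripts_path` is hypothetical, and the location of zabbix_server.conf on your system is not given here, so the demo runs against a throwaway file instead.

```shell
#!/bin/bash
# Print the value of every uncommented AlertScriptsPath line in a
# zabbix_server.conf-style file. Lines starting with "#" do not match.
# Usage: alert_scripts_path /path/to/zabbix_server.conf
alert_scripts_path() {
    sed -n 's/^[[:space:]]*AlertScriptsPath[[:space:]]*=[[:space:]]*//p' "$1"
}

# Demo against a temporary file so the sketch runs anywhere.
conf=$(mktemp)
printf '%s\n' '# AlertScriptsPath=/old/path' 'AlertScriptsPath=/usr/local/zabbix/bin' > "$conf"
alert_scripts_path "$conf"    # prints: /usr/local/zabbix/bin
rm -f "$conf"
```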

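2. Put the script in /usr/local/zabbix/bin, name it zabbix_send_mail.sh, set its permissions to 755, and give ownership to the zabbix user and group.

        [root@ip-10-10-13-8 bin]# cat /usr/local/zabbix/bin/zabbix_send_mail.sh
        #!/bin/bash
        # Called by Zabbix with: $1 = recipient, $2 = subject, $3 = message body.
        . /etc/profile

        # Subject patterns that mark a nodata ("system time out") problem or recovery.
        problem_cmd="^PROBLEM.*system time out"
        recovery_cmd="^RECOVERY.*system time out"

        #echo "$3"|/bin/mail -s "$2" $1
        if echo "$2" | grep -Eq "$problem_cmd"; then
            # Queue the mail command instead of sending it right away.
            # Note: a subject or body containing double quotes breaks the queued line.
            echo "echo \"$3\"|/bin/mail -s \"$2\" $1" >> /tmp/zabbix_problem_mail.sh
        elif echo "$2" | grep -Eq "$recovery_cmd"; then
            echo "echo \"$3\"|/bin/mail -s \"$2\" $1" >> /tmp/zabbix_recovery_mail.sh
        else
            # Every other alert is mailed immediately.
            echo "$3" | /bin/mail -s "$2" "$1"
        fi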

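With this in place, any alert whose subject contains "system time out" is not mailed directly; the mail command is appended to a file under /tmp instead. (In my setup "time out" actually means nodata: I define the text of the nodata alert as "system time out".)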
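The routing decision can be tried in isolation. The sketch below reuses the two regular expressions from the dispatch script; the function name `classify_subject` and the sample subjects are made up for illustration.

```shell
#!/bin/bash
# Classify an alert subject with the same two patterns the dispatch
# script uses. Prints "problem", "recovery", or "other".
classify_subject() {
    local subject=$1
    if printf '%s\n' "$subject" | grep -Eq '^PROBLEM.*system time out'; then
        echo problem
    elif printf '%s\n' "$subject" | grep -Eq '^RECOVERY.*system time out'; then
        echo recovery
    else
        echo other
    fi
}

classify_subject 'PROBLEM: db01 system time out'     # prints: problem
classify_subject 'RECOVERY: db01 system time out'    # prints: recovery
classify_subject 'PROBLEM: db01 disk almost full'    # prints: other
```

Because both patterns are anchored with `^`, the subject has to start with PROBLEM or RECOVERY; a subject template that puts the host name first would fall through to "other" and be mailed immediately.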

3. Set up the nodata alert sender

        [root@ip-10-10-13-8 bin]# cat /usr/local/zabbix/bin/cront_send_mail.sh
        #!/bin/bash
        . /etc/profile

        problem_file='/tmp/zabbix_problem_mail.sh'
        recovery_file='/tmp/zabbix_recovery_mail.sh'

        # Make sure both queue files exist and belong to the zabbix user.
        for f in "$problem_file" "$recovery_file"; do
            if [ ! -e "$f" ]; then
                touch "$f"
                chown zabbix:zabbix "$f"
            fi
        done

        alert_value=15                                    # flood threshold
        problem_value=$(grep -c echo "$problem_file")     # queued problem mails
        recovery_value=$(grep -c echo "$recovery_file")   # queued recovery mails
        now=$(date +%Y-%m-%d_%T)
        contact='244979152@qq.com'

        # -le (not -lt) so a count exactly at the threshold is still flushed.
        if [ "$problem_value" -le "$alert_value" ]; then
            # Few alerts: send every queued mail, then reset queue and marker.
            /bin/bash "$problem_file"
            rm -f "$problem_file" "$problem_file-$alert_value"
        elif [ ! -e "$problem_file-$alert_value" ]; then
            # Flood not yet reported: send one summary mail and leave a marker.
            echo "Time:$now timeout count:$problem_value!" | /bin/mail -s "PROBLEM: disaster alert! Mass timeout alerts in the data center!!!" "$contact"
            rm -f "$problem_file"
            touch "$problem_file-$alert_value"
        else
            # Flood already reported: discard the queue and clear the marker.
            rm -f "$problem_file" "$problem_file-$alert_value"
        fi

        if [ "$recovery_value" -le "$alert_value" ]; then
            /bin/bash "$recovery_file"
            rm -f "$recovery_file" "$recovery_file-$alert_value" "$problem_file-$alert_value"
        elif [ ! -e "$recovery_file-$alert_value" ]; then
            echo "Time:$now timeout count:$recovery_value!" | /bin/mail -s "RECOVERY: disaster alert! Mass timeout alerts in the data center!!!" "$contact"
            rm -f "$recovery_file" "$problem_file-$alert_value"
            touch "$recovery_file-$alert_value"
        else
            rm -f "$recovery_file" "$recovery_file-$alert_value" "$problem_file-$alert_value"
        fi

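Here I define the rule as follows: if more than 15 "system time out" mails are queued, a notification goes only to the address I configured, 244979152@qq.com, and only a single mail is sent.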
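The threshold rule can be written down as a small pure function, which makes it easy to test without sending mail. This is a sketch only: the name `decide_action` and the action labels are hypothetical, and treating a count exactly equal to the threshold as "not a flood" is an assumption of this sketch, chosen to match the "more than 15" wording.

```shell
#!/bin/bash
# decide_action COUNT THRESHOLD MARKER_EXISTS
#   COUNT          number of queued mails
#   THRESHOLD      flood threshold (15 in this article)
#   MARKER_EXISTS  "yes" if a summary was already sent, otherwise "no"
# Prints one of:
#   flush    send every queued mail individually
#   summary  send a single summary mail and set the marker
#   drop     discard the queue without sending anything
decide_action() {
    local count=$1 threshold=$2 marker=$3
    if [ "$count" -le "$threshold" ]; then
        echo flush
    elif [ "$marker" = no ]; then
        echo summary
    else
        echo drop
    fi
}

decide_action 3 15 no      # prints: flush
decide_action 40 15 no     # prints: summary
decide_action 40 15 yes    # prints: drop
```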

4. crontab setup

        */2 * * * * /bin/bash /usr/local/zabbix/bin/cront_send_mail.sh

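This meets the following requirements:

1. When a flood of nodata alerts occurs, only one email is sent;

2. When the nodata alerts recover, again only one email is sent;

3. Setup is simple: no triggers, actions, or templates need to be modified.

In my tests so far, when a proxy goes down and a large number of hosts behind it raise nodata alerts, only one email is sent, to the alert mailbox I configured; the other contacts set in the actions do not receive anything.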
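A dry run of such a test can be sketched without taking a proxy down. The snippet below builds a throwaway queue of 20 fake entries and counts them the same way the cron script does, with `grep -c echo`. The helper name `count_queued`, the host names and the address are made up, and nothing is mailed because the queue file is never executed.

```shell
#!/bin/bash
# Count queued mail commands: one per line containing the word "echo".
count_queued() {
    grep -c echo "$1"
}

# Build a temporary queue with 20 fake entries.
queue=$(mktemp)
for i in $(seq 1 20); do
    printf 'echo "no data from host%02d"|/bin/mail -s "PROBLEM: host%02d system time out" ops@example.com\n' "$i" "$i" >> "$queue"
done

count_queued "$queue"    # prints: 20
rm -f "$queue"
```

One caveat of counting this way: a multi-line message body that happens to contain the word "echo" on a later line is counted again, so the number is an upper bound on queued mails rather than an exact figure.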

Reposted from reinxu's 51CTO blog. Original link: http://blog.51cto.com/dl528888/1400554. To repost, please contact the original author.
