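mycheckpoint: Features and Usage
This article covers how to use mycheckpoint: defining custom monitoring items, setting alert thresholds, monitoring remote hosts, collecting OS information, and a convenience script for managing the web interface.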
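Reposting is welcome; please credit the author and the source.
Author: Zhang Zheng (张正)
blog:http://space.itpub.net/26355921
QQ:176036317
Feel free to get in touch with any questions.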
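1. Defining a custom monitoring item:
query_eval holds the custom query. Its result must be a single row containing a single integer value.
As an example, we register a custom item that tracks the result of the query
SELECT COUNT(*) FROM store.shopping_cart WHERE is_pending=1: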
INSERT INTO
custom_query (custom_query_id, enabled, query_eval, description, chart_type, chart_order)
VALUES (0, 1, 'SELECT COUNT(*) FROM store.shopping_cart WHERE is_pending=1',
'Number of pending carts', 'value', 0);
Viewing the custom values:
custom_0 is the first custom item (custom_query_id 0),
custom_1 is the second custom item,
and so on.
SELECT id, ts, created_tmp_tables_psec, custom_0, custom_0_time, custom_1_psec FROM sv_sample
WHERE ts >= NOW() - INTERVAL 1 HOUR;
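The column naming shown above can be wrapped in a small helper. The following is a minimal sketch, assuming that the sv_sample columns for custom query N are named custom_N, custom_N_psec and custom_N_time, as the sample query above suggests. The function name custom_sample_sql is hypothetical and not part of mycheckpoint.

```shell
#!/bin/bash
# Hypothetical helper (not part of mycheckpoint): print a SELECT statement that
# reads the last hour of samples for one custom query from sv_sample.
# Assumption: columns follow the custom_<id>, custom_<id>_psec,
# custom_<id>_time pattern seen in the sample query.
custom_sample_sql() {
    # Accept only a non-negative integer id so nothing else ends up in the SQL.
    case "$1" in
        ''|*[!0-9]*)
            echo "custom_query_id must be a non-negative integer" >&2
            return 1
            ;;
    esac
    printf 'SELECT id, ts, custom_%s, custom_%s_psec, custom_%s_time FROM sv_sample WHERE ts >= NOW() - INTERVAL 1 HOUR;\n' "$1" "$1" "$1"
}

# Print the statement for the first custom item.
custom_sample_sql 0
```

The printed statement could then be fed to the mysql client connected to the monitoring schema.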
The following is excerpted from the official documentation:
mysql> desc custom_query;
+-----------------+------------------------------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+------------------------------------------+------+-----+---------+-------+
| custom_query_id | int(10) unsigned | NO | PRI | 0 | |
| enabled | tinyint(1) | NO | | 1 | |
| query_eval | varchar(4095) | NO | | NULL | |
| description | varchar(255) | YES | | NULL | |
| chart_type | enum('value','value_psec','time','none') | NO | | value | |
| chart_order | tinyint(4) | NO | | 0 | |
+-----------------+------------------------------------------+------+-----+---------+-------+
The columns of the custom_query table are:
custom_query_id: Unique identifier. This is not an AUTO_INCREMENT: you choose the id; you’ll reference it later on.
enabled: 1 for enabled, 0 for disabled (the query will not get executed, and the value stored is NULL).
query_eval: The query to be executed. All tables must be fully qualified with database (Schema) scope. The query must return exactly one row, with exactly one column, which is a type of INTEGER.
description: A human readable explanation of the nature of the query. Used as title for custom charts.
chart_type: How to graphically represent the custom results. This is a type of enum('value','value_psec','time','none').
‘value’ means charting the query result’s value;
‘value_psec’ charts the change per second of the value;
‘time’ charts the time it took to execute the query (regardless of the result).
chart_order: chart position within the HTML reports (works as of revision 160).
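Because query_eval must return exactly one row with one integer column, it can help to check a candidate query's output before registering it. The following is a minimal sketch of such a check; the function name is_single_integer is hypothetical, and the check works on text that has already been captured from a client.

```shell
#!/bin/bash
# Hypothetical pre-check (not part of mycheckpoint): succeed only if the given
# text is a single optionally signed integer. Newlines (extra rows) and tabs
# (extra columns) are non-digit characters, so they are rejected as well.
is_single_integer() {
    result="$1"
    # Strip one leading minus sign, if present.
    case "$result" in
        -*) digits="${result#-}" ;;
        *)  digits="$result" ;;
    esac
    # What remains must be a non-empty run of digits.
    case "$digits" in
        ''|*[!0-9]*) return 1 ;;
    esac
    return 0
}

# Example with a literal value standing in for real query output:
if is_single_integer "42"; then
    echo "42 is a valid query_eval result"
fi
```

In practice the argument might come from something like mysql -N -B -e "SELECT COUNT(*) ...", where -N suppresses column names and -B selects tab-separated batch output.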
2. Custom alerts:
seconds_behind_master is a monitored item that reports the replication lag in a master-slave setup. The following sets an alert condition on it:
INSERT INTO alert_condition (condition_eval, description, alert_delay_minutes)
VALUES ('seconds_behind_master IS NULL', 'Slave not replicating', 0);
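seconds_behind_master is built into mycheckpoint. To set a threshold on a custom monitoring item instead,
set condition_eval to an expression such as custom_0>1000, which raises an alert when the value of the first custom item exceeds 1000.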
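A threshold alert on a custom item can be generated in the same way as the INSERT above. The following is a minimal sketch; the function name custom_alert_sql and the sample description are hypothetical, and only the three alert_condition columns used earlier are filled in.

```shell
#!/bin/bash
# Hypothetical helper (not part of mycheckpoint): print an INSERT statement that
# raises an alert when custom item <id> exceeds <threshold>.
# Arguments: <id> <threshold> <description> [alert_delay_minutes, default 0]
custom_alert_sql() {
    id="$1"; threshold="$2"; description="$3"; delay="${4:-0}"
    for n in "$id" "$threshold" "$delay"; do
        case "$n" in
            ''|*[!0-9]*)
                echo "id, threshold and delay must be non-negative integers" >&2
                return 1
                ;;
        esac
    done
    # Double backslashes and single quotes so the description stays a valid
    # SQL string literal.
    escaped=$(printf '%s' "$description" | sed -e 's/\\/\\\\/g' -e "s/'/''/g")
    printf "INSERT INTO alert_condition (condition_eval, description, alert_delay_minutes) VALUES ('custom_%s > %s', '%s', %s);\n" "$id" "$threshold" "$escaped" "$delay"
}

# Alert when the first custom item stays above 1000 for 5 minutes.
custom_alert_sql 0 1000 'Too many pending carts' 5
```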
The following is excerpted from the official documentation:
mysql> DESC alert_condition;
+-------------------------------+---------------------------------------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------------------+---------------------------------------------------+------+-----+---------+----------------+
| alert_condition_id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| enabled | tinyint(1) | NO | | 1 | |
| condition_eval | varchar(4095) | NO | | NULL | |
| monitored_host_condition_eval | varchar(4095) | NO | | NULL | |
| description | varchar(255) | YES | | NULL | |
| error_level | enum('debug','info','warning','error','critical') | NO | | error | |
| alert_delay_minutes | smallint(5) unsigned | NO | | 0 | |
| repetitive_alert | tinyint(1) | NO | | 0 | |
+-------------------------------+---------------------------------------------------+------+-----+---------+----------------+
Description:
enabled: a boolean value. 0 disables the condition: it will not be checked. 1 (default) enables it.
condition_eval: an SQL condition to evaluate (e.g. ‘seconds_behind_master IS NULL’. See following examples). For the condition to be raised, it must evaluate to a nonzero, not NULL integer value. A result of 0 or NULL means the condition is not met and that there is no problem.
monitored_host_condition_eval: an SQL condition to evaluate on the monitored server; typically this would be a call to a stored function on the monitored server. Same rules apply as for condition_eval.
description: a human readable explanation of the condition. This will be presented in various views and in email notifications.
error_level: condition severity. Defaults to ‘Error’.
alert_delay_minutes: how much time must pass with continuous met condition, before it is considered as a problem and becomes candidate for notification.
repetitive_alert: a boolean value. 1 means an email notification will include this condition whenever the condition is met. 0 (default) means an email notification is only set once in any continuous meeting of this condition.
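3. Monitoring remote MySQL servers
Monitoring works by having crontab run mycheckpoint with the appropriate options: it collects status information from the database and writes it into a monitoring schema, and the charts are then rendered from that stored data.
a. Monitoring the local host and writing to a remote monitoring server: crontab and mycheckpoint run on the local host, collect the status of the local database, and write the samples into the monitoring schema of a remote MySQL server. mycheckpoint must be installed on the local host.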
mycheckpoint --host=remote_monitor_server --port=3306 --user=moni_user --password=moni_passwd --database=check_host1 --monitored-host=localhost --monitored-socket=/tmp/mysql.sock
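b. Monitoring a remote host: crontab and mycheckpoint run on the monitoring machine, collect status information from the remote database, and write it into the monitoring schema of the local MySQL server. mycheckpoint does not need to be installed on the remote database host; a user with the required privileges is enough.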
mycheckpoint --host=localhost --port=3306 --user=moni_user --password=moni_passwd --database=check_host1 --monitored-host=host1 --monitored-port=3306
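Note: OS information can only be collected when mycheckpoint is installed on the monitored host itself and run there from crontab. In mode a, mycheckpoint is installed on every server; each one collects its own data locally (including OS information) and writes it to the remote monitoring server. In mode b, the monitoring machine gathers data from the remote server through a database user, so it cannot collect OS information.
Sometimes OS information is missing even when monitoring the local host. This happens when the mycheckpoint command is written in the remote-monitoring form, with the local machine's own IP address as the monitored host: the access is still treated as remote. Appending the -o option to the mycheckpoint command forces OS information to be collected.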
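The difference between a plain remote collection and a forced-OS collection comes down to one trailing option. The following is a minimal sketch of a helper that prints the command to place in a crontab entry. The function name collect_cmd and the address 192.168.1.10 are hypothetical; the user, password, port and option names are the ones used in the commands above.

```shell
#!/bin/bash
# Hypothetical wrapper (not part of mycheckpoint): print the collection command
# for one monitored host, appending -o when OS monitoring should be forced.
# Arguments: <monitor_host> <schema> <monitored_host> <force_os: yes|no>
collect_cmd() {
    cmd="mycheckpoint --host=$1 --port=3306 --user=moni_user --password=moni_passwd --database=$2 --monitored-host=$3 --monitored-port=3306"
    if [ "$4" = "yes" ]; then
        cmd="$cmd -o"
    fi
    printf '%s\n' "$cmd"
}

# The local machine is addressed by its IP (placeholder value), so force OS
# monitoring.
collect_cmd localhost check_host1 192.168.1.10 yes
```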
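4. The monitoring web pages are only available while mycheckpoint http is running. Running it in the foreground ties up a terminal window, so it should be run in the background. A background process, however, usually has to be found and killed by hand, so here is a script that makes the mycheckpoint http process easier to manage. Assuming the script is named mysql_monitor.sh,
run mysql_monitor.sh start/stop/status/restart
The script is meant to run on the monitoring machine. Fill in the parameters at the top first, such as HOST, USER and PASSWD.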
#!/bin/bash
#Edit by zheng.zhang@enmotech.com 2013-08-09
#This script helps manage the mycheckpoint http process used to monitor MySQL servers.
#Here are some parameters:
HOST=localhost
USER=moni_user
PASSWD=moni_passwd
DATABASE=check_master
SOCKET="-S xxxxxx "
PORT=
LOGFILE_PATH="/root/mycheckpoint_http.log"
pid=`ps -ef|grep mycheckpoint |grep -v grep|grep python|grep mycheckpoint|awk '{print $2}'`
case "$1" in
start)
if [ -n "$pid" ]
then
echo "-------------------------------------------------------"
echo "..........mycheckpoint_http is running(pid $pid) ......"
echo "-------------------------------------------------------"
else
nohup mycheckpoint --host=$HOST -u$USER -p$PASSWD --database=$DATABASE http >> $LOGFILE_PATH &
v_time1=`date "+%Y-%m-%d %H:%M:%S"`
echo "mycheckpoint started at : $v_time1">>$LOGFILE_PATH
sleep 2
new_pid=`ps -ef|grep mycheckpoint |grep -v grep|grep python|grep mycheckpoint|awk '{print $2}'`
echo "---------------------------------------------------------"
echo "..........mycheckpoint_http started,pid is $new_pid......"
echo "---------------------------------------------------------"
fi
;;
stop)
if [ -n "$pid" ]
then
kill -9 $pid
v_time=`date "+%Y-%m-%d %H:%M:%S"`
echo "mycheckpoint stopped at : $v_time">>$LOGFILE_PATH
echo "-----------------------------------------"
echo "..........mycheckpoint_http stopped......"
echo "-----------------------------------------"
else
echo "------------------------------------------------"
echo "..........mycheckpoint_http is not running......"
echo "------------------------------------------------"
fi
;;
status)
if [ -n "$pid" ]
then
echo "------------------------------------------------------"
echo "..........mycheckpoint_http is running.....(PID $pid)"
echo "------------------------------------------------------"
else
echo "------------------------------------------------"
echo "..........mycheckpoint_http is not running......"
echo "------------------------------------------------"
fi
;;
restart)
######stop######
echo -n " stopping mysql_monitor......"
sleep 1
if [ -n "$pid" ]
then
kill -9 $pid
v_time=`date "+%Y-%m-%d %H:%M:%S"`
echo "mycheckpoint stopped at : $v_time">>$LOGFILE_PATH
echo ".[OK] "
else
echo ".[FAILED] (mysql_monitor is not running)"
fi
sleep 1
######start######
pid=`ps -ef|grep mycheckpoint |grep -v grep|grep python|grep mycheckpoint|awk '{print $2}'`
echo -n " starting mysql_monitor......"
sleep 1
if [ -n "$pid" ]
then
echo ".[FAILED] "
echo " mycheckpoint_http is running(pid $pid) .........."
else
nohup mycheckpoint --host=$HOST -u$USER -p$PASSWD --database=$DATABASE http >> $LOGFILE_PATH &
v_time1=`date "+%Y-%m-%d %H:%M:%S"`
echo "mycheckpoint started at : $v_time1">>$LOGFILE_PATH
sleep 2
new_pid=`ps -ef|grep mycheckpoint |grep -v grep|grep python|grep mycheckpoint|awk '{print $2}'`
echo ".[OK] (pid is $new_pid)"
fi
;;
*)
echo "----------------------------------------------------"
echo " please input: start | status | stop | restart"
echo "----------------------------------------------------"
;;
esac
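The script above finds the process by grepping the ps output, which also matches any other mycheckpoint process running at that moment (for example a collection started by crontab), and stops it with kill -9. One possible alternative, shown here only as a sketch and not as the author's method, is to record the PID in a file at start time. PIDFILE and the function names are hypothetical, and sleep 30 merely stands in for the mycheckpoint ... http command.

```shell
#!/bin/bash
# Sketch of pidfile-based process tracking. PIDFILE and the function names are
# hypothetical; "sleep 30" below only stands in for "mycheckpoint ... http".
PIDFILE="${PIDFILE:-/tmp/mycheckpoint_http.pid}"

# Start the given command in the background and remember its PID.
start_tracked() {
    nohup "$@" >/dev/null 2>&1 &
    echo $! > "$PIDFILE"
}

# Succeed if the recorded PID still belongs to a live process.
is_running() {
    [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
}

# Send SIGTERM to the recorded process (if alive), then drop the pidfile.
stop_tracked() {
    if is_running; then
        kill "$(cat "$PIDFILE")"
    fi
    rm -f "$PIDFILE"
}

start_tracked sleep 30
is_running && echo "running (pid $(cat "$PIDFILE"))"
stop_tracked
```

With this approach only the process that was started by the script is ever signalled, and SIGTERM gives it a chance to shut down cleanly.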
This article is reposted from the ITPUB blog of user 84223932 (original link title: 功能及使用mycheckpoint). To repost it, please contact the original author.