High Availability for MariaDB with MMM

余二五 2017-11-14 21:37:00

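I. MMM

1. Introduction

MMM (Master-Master Replication Manager for MySQL) is a flexible suite of scripts for monitoring, failover, and management of MySQL master-master replication setups, in which only one node is writable at any given time. The suite can also load-balance reads across any number of slaves in a standard master-slave setup, so you can use it to bring up virtual IPs on a group of replicating servers. It also ships scripts for data backup and for resynchronizing nodes.
MySQL itself does not provide a replication failover solution. MMM adds server failover and in that way makes MySQL highly available.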

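2. MMM components

MMM's main functionality is provided by the following three scripts:
mmm_mond    : the monitoring daemon; it does all the monitoring work and decides, among other things, when a node is removed
mmm_agentd  : the agent daemon that runs on each MySQL server and exposes a simple set of remote services to the monitoring node
mmm_control : a command-line tool for managing the mmm_mond process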

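3. Pros, cons, and use cases of MMM

Pros: safe, stable, and easy to scale out. When the active master fails, the other master takes over immediately and the slaves switch over automatically, with no manual intervention.

Cons: it needs at least three nodes, so there is a minimum host count, and it requires read/write splitting, which can be hard to add to an existing application. It is also sensitive to replication lag between the two masters and between master and slave, so it is not a good fit where data safety requirements are very strict.

Use cases: high traffic, fast business growth, and a requirement for read/write splitting.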

II. MMM architecture diagram

[Figure: MMM architecture diagram (wKioL1Ngw7KzkqVEAAA4RTFe4wQ449.png)]

III. Resource layout

1. Server list

Server            IP             Hostname   Server ID
monitoring host   172.16.7.100   monitor    -
master 1          172.16.7.200   db1        1
master 2          172.16.7.201   db2        2
slave             172.16.7.202   db3        3

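2. Virtual IP list

IP           Role    Description
172.16.7.1   write   VIP that applications connect to for write operations
172.16.7.2   read    VIP that applications connect to for read operations
172.16.7.3   read    VIP that applications connect to for read operations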
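
With this layout the application has to pick the right VIP itself: writes go to 172.16.7.1, reads to 172.16.7.2 or 172.16.7.3. The following Python sketch shows one way such routing could look. The VipRouter class and its keyword-based classification are illustrative assumptions and not part of MMM; real code would also need to deal with transactions, SELECT ... FOR UPDATE, and replication lag.

```python
import itertools


class VipRouter:
    """Pick a virtual IP per statement: reads round-robin, everything else to the writer."""

    def __init__(self, write_vip, read_vips):
        self.write_vip = write_vip
        self._reads = itertools.cycle(read_vips)

    def pick(self, sql):
        words = sql.split()
        # Naive rule (an assumption of this sketch): only plain SELECTs count as reads.
        if words and words[0].upper() == "SELECT":
            return next(self._reads)
        return self.write_vip


router = VipRouter("172.16.7.1", ["172.16.7.2", "172.16.7.3"])
for sql in ("INSERT INTO t VALUES (1)", "SELECT * FROM t", "SELECT 1"):
    print(router.pick(sql), "<-", sql)
```

Keeping the round-robin state inside the router object means each connection pool can own its own rotation.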

IV. Implementing MMM

1. Configure master 1

(1) Edit the /etc/my.cnf configuration file

server-id       = 1
datadir = /mydata/data
log-bin = /mydata/binglogs/master-bin
relay_log = /mydata/relaylogs/relay
binlog_format=mixed
thread_concurrency = 4
log-slave-updates
sync_binlog=1
auto_increment_increment=2
auto_increment-offset=1

(2) Grant the replication user to master 2 and the slave

MariaDB [(none)]> grant replication slave,replication client on *.* to 'repluser'@'172.16.7.201' identified by 'repluser';
MariaDB [(none)]> grant replication slave,replication client on *.* to 'repluser'@'172.16.7.202' identified by 'repluser';

(3) Check the master status; the replicas need these values when they connect to this master

MariaDB [(none)]> show master status;
+-------------------+----------+--------------+------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+-------------------+----------+--------------+------------------+
| master-bin.000001 |      755 |              |                  |
+-------------------+----------+--------------+------------------+
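
The File and Position values shown here are what a replica later passes to CHANGE MASTER TO. As a minimal sketch, the Python below parses the tabular output and assembles that statement. The helper names parse_master_status and build_change_master are hypothetical and belong to neither MariaDB nor MMM, and the string building does no quoting or escaping.

```python
def parse_master_status(table_text):
    """Extract (file, position) from the tabular output of SHOW MASTER STATUS."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in table_text.splitlines()
        if line.strip().startswith("|")
    ]
    header, values = rows[0], rows[1]
    record = dict(zip(header, values))
    return record["File"], int(record["Position"])


def build_change_master(host, user, password, log_file, log_pos):
    """Assemble a CHANGE MASTER TO statement for binlog-position replication."""
    return (
        "CHANGE MASTER TO "
        f"MASTER_HOST='{host}', MASTER_USER='{user}', MASTER_PASSWORD='{password}', "
        f"MASTER_LOG_FILE='{log_file}', MASTER_LOG_POS={log_pos};"
    )


SAMPLE = """
+-------------------+----------+--------------+------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+-------------------+----------+--------------+------------------+
| master-bin.000001 |      755 |              |                  |
+-------------------+----------+--------------+------------------+
"""

log_file, log_pos = parse_master_status(SAMPLE)
statement = build_change_master("172.16.7.200", "repluser", "repluser", log_file, log_pos)
print(statement)
```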

2. Configure master 2

(1) Edit the /etc/my.cnf configuration file

server-id       = 2
datadir = /mydata/data
log-bin = /mydata/binglogs/master-bin
relay_log = /mydata/relaylogs/relay
binlog_format=mixed
thread_concurrency = 4
log-slave-updates
sync_binlog=1
auto_increment_increment=2
auto_increment-offset=2
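
Both masters use auto_increment_increment=2 but different offsets (1 on master 1, 2 on master 2), so they never hand out the same AUTO_INCREMENT value even if both accept writes. The Python below is a toy model of that arithmetic for an empty table, not MariaDB's actual allocator; the function name is made up for illustration.

```python
def auto_increment_values(offset, increment, count):
    """First `count` AUTO_INCREMENT values a server would hand out on an empty table."""
    return [offset + i * increment for i in range(count)]


master1_ids = auto_increment_values(offset=1, increment=2, count=5)
master2_ids = auto_increment_values(offset=2, increment=2, count=5)

print("master 1:", master1_ids)
print("master 2:", master2_ids)
print("overlap:", set(master1_ids) & set(master2_ids))
```

Master 1 ends up with the odd values and master 2 with the even ones, so the two sequences cannot collide.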

(2) Grant the replication user to master 1 and the slave

MariaDB [(none)]> grant replication slave,replication client on *.* to 'repluser'@'172.16.7.200' identified by 'repluser';
MariaDB [(none)]> grant replication slave,replication client on *.* to 'repluser'@'172.16.7.202' identified by 'repluser';

(3) Check the master status; the replicas need these values when they connect to this master

MariaDB [(none)]> show master status;
+-------------------+----------+--------------+------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+-------------------+----------+--------------+------------------+
| master-bin.000001 |      755 |              |                  |
+-------------------+----------+--------------+------------------+

(4) Connect master 2 to master 1

MariaDB [(none)]> change master to master_host='172.16.7.200',master_user='repluser',master_password='repluser',master_log_file='master-bin.000001',master_log_pos=755;
MariaDB [(none)]>
MariaDB [(none)]> start slave;
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 172.16.7.200
                  Master_User: repluser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: master-bin.000001
          Read_Master_Log_Pos: 755
               Relay_Log_File: relay.000003
                Relay_Log_Pos: 536
        Relay_Master_Log_File: master-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 755
              Relay_Log_Space: 823
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
               Master_SSL_Crl:
           Master_SSL_Crlpath:
                   Using_Gtid: No
                  Gtid_IO_Pos:
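
The fields that matter most in this output are Slave_IO_Running and Slave_SQL_Running, which must both be Yes, plus the error fields. The sketch below shows how such a check could be scripted in Python against the captured text. The function names are hypothetical, and the sample is a shortened copy of the output above.

```python
def parse_slave_status(text):
    """Turn the vertical output of SHOW SLAVE STATUS into a dict of field -> value."""
    status = {}
    for line in text.splitlines():
        if ":" not in line or line.lstrip().startswith("*"):
            continue  # skip the row separator and anything that is not "key: value"
        key, _, value = line.partition(":")
        status[key.strip()] = value.strip()
    return status


def replication_healthy(status):
    """Both replication threads must be running and no error may be recorded."""
    return (
        status.get("Slave_IO_Running") == "Yes"
        and status.get("Slave_SQL_Running") == "Yes"
        and status.get("Last_Errno") == "0"
    )


SAMPLE = """\
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 172.16.7.200
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
                   Last_Errno: 0
        Seconds_Behind_Master: 0
"""

status = parse_slave_status(SAMPLE)
print("healthy:", replication_healthy(status))
```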

(5) Connect master 1 to master 2

MariaDB [(none)]> change master to master_host='172.16.7.201',master_user='repluser',master_password='repluser',master_log_file='master-bin.000001',master_log_pos=755;
Query OK, 0 rows affected (0.02 sec)
MariaDB [(none)]> start slave;
Query OK, 0 rows affected (0.02 sec)
MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 172.16.7.201
                  Master_User: repluser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: master-bin.000002
          Read_Master_Log_Pos: 327
               Relay_Log_File: db1-relay-bin.000003
                Relay_Log_Pos: 615
        Relay_Master_Log_File: master-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 327
              Relay_Log_Space: 954
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 2
               Master_SSL_Crl:
           Master_SSL_Crlpath:
                   Using_Gtid: No
                  Gtid_IO_Pos:

(6) Test that changes on master 1 replicate to master 2

=========== master 1 ================
MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+
4 rows in set (0.00 sec)
MariaDB [(none)]> create database hlbr;
Query OK, 1 row affected (0.00 sec)
MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| hlbr               |
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+
5 rows in set (0.00 sec)
=========== master 2 ================
MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| hlbr               |    # replicated successfully
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+
5 rows in set (0.02 sec)

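(7) Test that changes on master 2 replicate to master 1

   I created a database named bynr on master 2 and it replicated to master 1. To save space, the output is not shown here.

   The tests in (6) and (7) show that master-master replication is working.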

3. Configure the slave

(1) Edit the /etc/my.cnf configuration file

server-id       = 3
datadir = /mydata/data
log-bin = /mydata/binglogs/master-bin
relay_log = /mydata/relaylogs/relay
binlog_format=mixed
thread_concurrency = 4
log-slave-updates
sync_binlog=1

(2) Connect the slave to master 1

MariaDB [(none)]> change master to master_host='172.16.7.200',master_user='repluser',master_password='repluser',master_log_file='master-bin.000001',master_log_pos=755;
Query OK, 0 rows affected (0.05 sec)
MariaDB [(none)]> start slave;
Query OK, 0 rows affected (0.01 sec)
MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 172.16.7.200
                  Master_User: repluser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: master-bin.000001
          Read_Master_Log_Pos: 1261
               Relay_Log_File: relay.000002
                Relay_Log_Pos: 1042
        Relay_Master_Log_File: master-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 1261
              Relay_Log_Space: 1329
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
               Master_SSL_Crl:
           Master_SSL_Crlpath:
                   Using_Gtid: No
                  Gtid_IO_Pos:

(3) Test again: create a database on master 1 and check for it on the slave

====================== master 1 =====================
MariaDB [(none)]> create database hhht;
Query OK, 1 row affected (0.02 sec)
======================== slave ======================
MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| bynr               |
| hhht               |
| hlbr               |    # every database replicated successfully
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+

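4. Install and configure mysql-mmm

   The monitor host does all the monitoring work and decides, among other things, when a node is removed, so it must be granted access on every master and slave. Every database server also needs mysql-mmm-agent installed, which exposes a simple set of remote services to the monitoring node.

(1) Install mysql-mmm-agent on the three MySQL servers

The package has many dependencies, so install it with yum.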

[root@db1 ~]# yum -y install mysql-mmm-agent

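(2) Grant privileges to the monitor on the three MySQL servers

User               Description                                                            Privileges
monitor user       used by the MMM monitor to check the health of the MySQL servers       replication client
agent user         used by the MMM agent to switch read-only mode, replication, etc.      super, replication client, process
replication user   used for replication                                                   replication slave

The next step is to grant the users in the table above to the monitoring host. Master-master and master-slave replication are already set up and the replication user already has its privileges, so the third user needs no further work. The remaining two only have to be granted on the active master; the grants replicate automatically to the other two MySQL servers.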

MariaDB [(none)]> grant replication client on *.* to 'monitor'@'172.16.7.100' identified by 'monitor';
MariaDB [(none)]> grant super,replication client,process on *.* to 'agent'@'172.16.7.100' identified by 'agent';
MariaDB [(none)]> flush privileges;
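
To keep the privilege table and the GRANT statements from drifting apart, the statements can be generated from one mapping. The Python below is a small sketch of that idea. The MMM_PRIVILEGES mapping and the grant_statement helper are hypothetical names, and the host part is left as a parameter because the right value depends on where each daemon connects from (mmm_agentd runs on the database hosts themselves, so if the agent cannot log in, its grant may need to cover those addresses as well).

```python
# Privileges per MMM account, taken from the table in this section.
MMM_PRIVILEGES = {
    "monitor": ["REPLICATION CLIENT"],
    "agent": ["SUPER", "REPLICATION CLIENT", "PROCESS"],
}


def grant_statement(user, host, password):
    """Build a GRANT ... IDENTIFIED BY statement for one MMM account (no escaping)."""
    privileges = ", ".join(MMM_PRIVILEGES[user])
    return f"GRANT {privileges} ON *.* TO '{user}'@'{host}' IDENTIFIED BY '{password}';"


for user in MMM_PRIVILEGES:
    # The article uses the account name as its password; do not do this in production.
    print(grant_statement(user, "172.16.7.100", user))
```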

(3) Install mmm on the monitor node

[root@monitor ~]# yum -y install mysql-mmm*

  ① Configure /etc/mysql-mmm/mmm_common.conf

[root@monitor ~]# vim /etc/mysql-mmm/mmm_common.conf
active_master_role      writer
<host default>
    cluster_interface       eth0
    pid_path                /var/run/mysql-mmm/mmm_agentd.pid
    bin_path                /usr/libexec/mysql-mmm/
    replication_user        repluser    # replication user
    replication_password    repluser    # its password
    agent_user              agent       # agent user
    agent_password          agent       # agent user's password
</host>
<host db1>
    ip      172.16.7.200
    mode    master            # a master
    peer    db2
</host>
<host db2>
    ip      172.16.7.201
    mode    master
    peer    db1
</host>
<host db3>
    ip      172.16.7.202
    mode    slave              # a slave
</host>
<role writer>
    hosts   db1, db2
    ips     172.16.7.1
    mode    exclusive           # exclusive: only one host holds the role at a time
</role>
<role reader>
    hosts   db2, db3
    ips     172.16.7.2, 172.16.7.3
    mode    balanced            # balanced: the IPs are spread across the hosts
</role>
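
In mmm_common.conf each master's peer setting should name the other master of the pair, and this is easy to get wrong when copying host blocks. The Python below sketches a sanity check for that. It uses a simplified parser written for this example (parse_hosts and peer_problems are hypothetical helpers, not MMM tools) that understands only the host blocks shown here.

```python
import re


def parse_hosts(conf_text):
    """Collect the key/value settings of every named <host ...> block."""
    hosts, current = {}, None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        opened = re.fullmatch(r"<host\s+(\S+)>", line)
        if opened:
            current = hosts.setdefault(opened.group(1), {})
        elif line.startswith("<"):
            current = None  # a closing tag or some other block such as <role ...>
        elif current is not None:
            parts = line.split(None, 1)
            current[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return hosts


def peer_problems(hosts):
    """Report masters whose peer is missing, is themselves, or does not point back."""
    problems = []
    for name, settings in hosts.items():
        if settings.get("mode") != "master":
            continue
        peer = settings.get("peer")
        if peer is None or peer == name or peer not in hosts:
            problems.append(f"{name}: invalid peer {peer!r}")
        elif hosts[peer].get("peer") != name:
            problems.append(f"{name}: peer {peer} does not point back")
    return problems


SAMPLE = """
<host db1>
    ip      172.16.7.200
    mode    master
    peer    db2
</host>
<host db2>
    ip      172.16.7.201
    mode    master
    peer    db1
</host>
<host db3>
    ip      172.16.7.202
    mode    slave
</host>
"""

hosts = parse_hosts(SAMPLE)
print(peer_problems(hosts) or "peer settings look consistent")
```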

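  ② Copy this file to the three MySQL servers

   This file is not database data, so it will not replicate; it has to be copied by hand.

[root@monitor ~]# scp /etc/mysql-mmm/mmm_common.conf root@172.16.7.202:/etc/mysql-mmm/   # repeat for all three nodes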

  ③ Edit mmm_mon.conf on the monitor

[root@monitor ~]# vim /etc/mysql-mmm/mmm_mon.conf
include mmm_common.conf
<monitor>
    ip                  172.16.7.100    # IP of the monitoring host
    pid_path            /var/run/mysql-mmm/mmm_mond.pid
    bin_path            /usr/libexec/mysql-mmm
    status_path         /var/lib/mysql-mmm/mmm_mond.status
    ping_ips            172.16.7.200, 172.16.7.201, 172.16.7.202 # IPs of the database servers
    auto_set_online     60
    # The kill_host_bin does not exist by default, though the monitor will
    # throw a warning about it missing.  See the section 5.10 "Kill Host
    # Functionality" in the PDF documentation.
    #
    # kill_host_bin     /usr/libexec/mysql-mmm/monitor/kill_host
    #
</monitor>
<host default>
    monitor_user        monitor    # monitoring user name
    monitor_password    monitor    # its password
</host>
debug 0     # if monitoring does not work properly, set debug 1 to troubleshoot

(4) Edit mmm_agent.conf on each DB server

[root@db1 ~]# vim /etc/mysql-mmm/mmm_agent.conf
include mmm_common.conf    # pull in the common configuration
# The 'this' variable refers to this server.  Proper operation requires
# that 'this' server (db1 by default), as well as all other servers, have the
# proper IP addresses set in mmm_common.conf.
this db1    # identifies this host (a host section name in mmm_common.conf); on dbN, change it to "this dbN"

5. Start MMM and test

(1) Start mysql-mmm-agent

[root@db1 ~]# service mysql-mmm-agent start
Starting MMM Agent Daemon:                                 [  OK  ]
# start mysql-mmm-agent on every DB server

(2) Start the monitor daemon

[root@monitor ~]# chkconfig mysql-mmm-monitor on
[root@monitor ~]# service mysql-mmm-monitor start
Starting MMM Monitor Daemon:                               [  OK  ]

(3) Check the monitoring status on the monitoring host

[root@monitor ~]# service mysql-mmm-monitor status
mmm_mond (pid  1586) is running...
[root@monitor ~]# mmm_control show
  db1(172.16.7.200) master/ONLINE. Roles: writer(172.16.7.1)
  db2(172.16.7.201) master/ONLINE. Roles: reader(172.16.7.3)
  db3(172.16.7.202) slave/ONLINE. Roles: reader(172.16.7.2)

(4) Set db1 offline and check the status again

[root@monitor ~]# mmm_control set_offline db1    # take db1 offline
OK: State of 'db1' changed to ADMIN_OFFLINE. Now you can wait some time and check all roles!
[root@monitor ~]# mmm_control show
  db1(172.16.7.200) master/ADMIN_OFFLINE. Roles:     # db1 is offline; the writer VIP has moved to db2
  db2(172.16.7.201) master/ONLINE. Roles: reader(172.16.7.3), writer(172.16.7.1)
  db3(172.16.7.202) slave/ONLINE. Roles: reader(172.16.7.2)

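At the start we defined one write IP, 172.16.7.1, and two read IPs, 172.16.7.2 and 172.16.7.3. Steps (3) and (4) show that while all three nodes are healthy the write IP sits on master 1, and that when master 1 goes down the write IP floats to master 2. This removes the single point of failure for the MySQL database: MMM makes it highly available.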
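
The failover check in steps (3) and (4) can also be scripted by parsing the text that mmm_control show prints. The Python below is a minimal sketch that works on the two outputs captured above. The regular expression is inferred from those samples only, and the function names are hypothetical, so other MMM versions or states may need adjustments.

```python
import re

# One host per line, e.g. "db1(172.16.7.200) master/ONLINE. Roles: writer(172.16.7.1)"
HOST_LINE = re.compile(r"^\s*(\S+?)\(([\d.]+)\)\s+(\w+)/(\w+)\.\s+Roles:\s*(.*)$")


def parse_mmm_show(text):
    """Parse `mmm_control show` output into {host: {ip, mode, state, roles}}."""
    hosts = {}
    for line in text.splitlines():
        match = HOST_LINE.match(line)
        if not match:
            continue
        name, ip, mode, state, roles_text = match.groups()
        roles = re.findall(r"(\w+)\(([\d.]+)\)", roles_text)  # [(role, vip), ...]
        hosts[name] = {"ip": ip, "mode": mode, "state": state, "roles": roles}
    return hosts


def writer_host(hosts):
    """Return the name of the host that currently holds the writer role, if any."""
    for name, info in hosts.items():
        if any(role == "writer" for role, _ in info["roles"]):
            return name
    return None


BEFORE = """\
  db1(172.16.7.200) master/ONLINE. Roles: writer(172.16.7.1)
  db2(172.16.7.201) master/ONLINE. Roles: reader(172.16.7.3)
  db3(172.16.7.202) slave/ONLINE. Roles: reader(172.16.7.2)
"""

AFTER = """\
  db1(172.16.7.200) master/ADMIN_OFFLINE. Roles:
  db2(172.16.7.201) master/ONLINE. Roles: reader(172.16.7.3), writer(172.16.7.1)
  db3(172.16.7.202) slave/ONLINE. Roles: reader(172.16.7.2)
"""

print("writer before:", writer_host(parse_mmm_show(BEFORE)))
print("writer after: ", writer_host(parse_mmm_show(AFTER)))
```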











Reposted from the 51CTO blog of nmshuishui. Original link: http://blog.51cto.com/nmshuishui/1405197. To republish, please contact the original author.
