Redis Cluster Guide

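I. Overview

The cluster feature in Redis 3.0 has been out for a while, and the latest stable release at the time of writing is 3.0.5. As far as I know, quite a few internet companies already run it in production, such as Vipshop and Meituan. Our company happens to have a new project whose estimated load a single Redis instance cannot handle, and the developers do not want to shard in application code, so I recommended that they try Redis Cluster. Below are my notes, kept for future reference.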

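II. Environment

1. Redis nodes

2. Redis version

III. Installation and configuration

1. Install Redis

2. Install Ruby and the Ruby redis module

3. Kernel tuning

4. Create directories

5. Write the Redis configuration file (when copying the configuration file for another instance, remember to change the port)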
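
The configuration file itself is not reproduced in these notes. As a rough illustration of the "copy and change the port" step, the sketch below renders one minimal cluster-mode configuration per port. The directive names are standard redis.conf directives, but every value (the `/data/redis` directory layout, the timeout, the barrier of 1) and the helper name `render_config` are assumptions for illustration, not the settings used on these hosts.

```python
# Hypothetical helper: renders one minimal cluster-mode config per port.
# All values below are illustrative assumptions, not the author's settings.
BASE_DIR = "/data/redis"  # assumed directory layout

TEMPLATE = """\
port {port}
daemonize yes
dir {base}/{port}
pidfile {base}/{port}/redis.pid
logfile {base}/{port}/redis.log
appendonly yes
cluster-enabled yes
cluster-config-file nodes-{port}.conf
cluster-node-timeout 15000
cluster-migration-barrier 1
"""


def render_config(port: int, base: str = BASE_DIR) -> str:
    """Return the config text for one instance; only port-derived values differ."""
    return TEMPLATE.format(port=port, base=base)


if __name__ == "__main__":
    # The two instances per host used in this article.
    for port in (6300, 6301):
        print(f"# ---- redis-{port}.conf ----")
        print(render_config(port))
```

Generating the files from one template avoids the usual copy mistake of leaving the old port in `cluster-config-file` or in a path, which would make two instances on the same host share a file.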

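6. Start the services

7. Initialize the cluster

Node roles are determined by the order of the arguments: masters first, then slaves. In this article the 6300 instances are masters and the 6301 instances are slaves.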

redis-trib.rb create --replicas 1 10.10.2.70:6300 10.10.2.71:6300 10.10.2.85:6300 10.10.2.70:6301 10.10.2.71:6301 10.10.2.85:6301
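
With `--replicas 1` and six addresses, three nodes become masters and the 16384 hash slots are divided among them. The sketch below shows one even split that reproduces the ranges visible in the cluster check output later in this article (0-5460, 5461-10922, 10923-16383). The function name `split_slots` and the exact rounding rule are assumptions chosen to match that output, not a statement of how the tool is implemented.

```python
import math

TOTAL_SLOTS = 16384


def split_slots(master_count: int) -> list[tuple[int, int]]:
    """Split slots 0..16383 into contiguous (first, last) ranges, one per master."""
    per_master = TOTAL_SLOTS / master_count
    ranges = []
    first = 0
    cursor = 0.0
    for index in range(master_count):
        # Round half up explicitly so boundaries do not depend on banker's rounding.
        last = math.floor(cursor + per_master - 1 + 0.5)
        # The last master always absorbs whatever remains.
        if index == master_count - 1 or last > TOTAL_SLOTS - 1:
            last = TOTAL_SLOTS - 1
        ranges.append((first, last))
        first = last + 1
        cursor += per_master
    return ranges


if __name__ == "__main__":
    for first, last in split_slots(3):
        print(f"slots:{first}-{last} ({last - first + 1} slots)")
```

Because 16384 is not divisible by 3, one master ends up with 5462 slots and the other two with 5461.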

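8. Check the cluster status

PS:
redis-trib.rb is a Ruby tool that wraps a number of Redis Cluster commands and makes operating the cluster very convenient. It covers the steps above (initializing the cluster and checking its status) as well as adding and removing nodes, migrating slots, and more.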

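IV. Redis Cluster maintenance

A. Scenario 1
The production cluster has hit a bottleneck and needs to be scaled out. Suppose we have already prepared one master and one slave (10.10.2.85:6302 and 10.10.2.85:6303), as follows:

1. Add a master node

10.10.2.85:6302 is the new node to add; 10.10.2.70:6300 is any node already in the cluster.

2. Add a slave to the new master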
[root@yw_0_0 ~]# redis-trib.rb add-node --slave --master-id 5ef18f95f75756891aa948ea1f200044f1d3947c 10.10.2.85:6303 10.10.2.70:6300

Adding node 10.10.2.85:6303 to cluster 10.10.2.70:6300

Connecting to node 10.10.2.70:6300: OK
Connecting to node 10.10.2.85:6300: OK
Connecting to node 10.10.2.85:6302: OK
Connecting to node 10.10.2.85:6301: OK
Connecting to node 10.10.2.71:6300: OK
Connecting to node 10.10.2.70:6301: OK
Connecting to node 10.10.2.71:6301: OK

Performing Cluster Check (using node 10.10.2.70:6300)

S: cd1f2c1f348bb4359337e7462c1e21dc82f1551b 10.10.2.70:6300
slots: (0 slots) slave
replicates 85412cf3d8e69354115fc0991f470b32b9213cd7
M: 6bea6afa2ee8dfb0cc3c96f804eb3fa77ce98013 10.10.2.85:6300
slots:0-5460 (5461 slots) master
1 additional replica(s)
M: 5ef18f95f75756891aa948ea1f200044f1d3947c 10.10.2.85:6302
slots: (0 slots) master
0 additional replica(s)
S: a74642c0fbc98f921be477eabcdd22eccd89891f 10.10.2.85:6301
slots: (0 slots) slave
replicates 2568dbd91fffa16ff93ea8db19275fd7ec8af41a
M: 2568dbd91fffa16ff93ea8db19275fd7ec8af41a 10.10.2.71:6300
slots:5461-10922 (5462 slots) master
1 additional replica(s)
M: 85412cf3d8e69354115fc0991f470b32b9213cd7 10.10.2.70:6301
slots:10923-16383 (5461 slots) master
1 additional replica(s)
S: 22d2dec483824b84571a60e8c037fff957615552 10.10.2.71:6301
slots: (0 slots) slave
replicates 6bea6afa2ee8dfb0cc3c96f804eb3fa77ce98013
[OK] All nodes agree about slots configuration.

Check for open slots...
Check slots coverage...

[OK] All 16384 slots covered.
Connecting to node 10.10.2.85:6303: OK

Send CLUSTER MEET to node 10.10.2.85:6303 to make it join the cluster.

Waiting for the cluster to join.

Configure node as replica of 10.10.2.85:6302.

[OK] New node added correctly.

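--slave specifies that the node being added is a slave, and --master-id specifies the ID of its master. 10.10.2.85:6303 is the new slave to add; 10.10.2.70:6300 is any node already in the cluster.

3. Migrate some slots to the new node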
[root@yw_0_0 ~]# redis-trib.rb reshard 10.10.2.70:6300
Connecting to node 10.10.2.70:6300: OK
Connecting to node 10.10.2.85:6300: OK
Connecting to node 10.10.2.85:6303: OK
Connecting to node 10.10.2.85:6302: OK
Connecting to node 10.10.2.85:6301: OK
Connecting to node 10.10.2.71:6300: OK
Connecting to node 10.10.2.70:6301: OK
Connecting to node 10.10.2.71:6301: OK

Performing Cluster Check (using node 10.10.2.70:6300)

S: cd1f2c1f348bb4359337e7462c1e21dc82f1551b 10.10.2.70:6300
slots: (0 slots) slave
replicates 85412cf3d8e69354115fc0991f470b32b9213cd7
M: 6bea6afa2ee8dfb0cc3c96f804eb3fa77ce98013 10.10.2.85:6300
slots:0-5460 (5461 slots) master
1 additional replica(s)
S: fc90d090fae909fd4f962752941c039d081d3854 10.10.2.85:6303
slots: (0 slots) slave
replicates 5ef18f95f75756891aa948ea1f200044f1d3947c
M: 5ef18f95f75756891aa948ea1f200044f1d3947c 10.10.2.85:6302
slots: (0 slots) master
1 additional replica(s)
S: a74642c0fbc98f921be477eabcdd22eccd89891f 10.10.2.85:6301
slots: (0 slots) slave
replicates 2568dbd91fffa16ff93ea8db19275fd7ec8af41a
M: 2568dbd91fffa16ff93ea8db19275fd7ec8af41a 10.10.2.71:6300
slots:5461-10922 (5462 slots) master
1 additional replica(s)
M: 85412cf3d8e69354115fc0991f470b32b9213cd7 10.10.2.70:6301
slots:10923-16383 (5461 slots) master
1 additional replica(s)
S: 22d2dec483824b84571a60e8c037fff957615552 10.10.2.71:6301
slots: (0 slots) slave
replicates 6bea6afa2ee8dfb0cc3c96f804eb3fa77ce98013
[OK] All nodes agree about slots configuration.

Check for open slots...
Check slots coverage...

[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)? 3000 # move 3000 slots
What is the receiving node ID? 5ef18f95f75756891aa948ea1f200044f1d3947c # ID of the node that receives the 3000 slots, i.e. the newly added 10.10.2.85:6302
Please enter all the source node IDs.
Type 'all' to use all the nodes as source nodes for the hash slots.
Type 'done' once you entered all the source nodes IDs.
Source node #1:85412cf3d8e69354115fc0991f470b32b9213cd7 # source node ID; here I take a share of the slots from each of the three original masters
Source node #2:6bea6afa2ee8dfb0cc3c96f804eb3fa77ce98013 # second source master
Source node #3:2568dbd91fffa16ff93ea8db19275fd7ec8af41a # third source master
Source node #4:done # enter done to start the preparation steps
(output omitted)
Do you want to proceed with the proposed reshard plan (yes)? yes # enter yes to confirm and start migrating the slots
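
Although 3000 slots were requested, the cluster check in Scenario 2 shows the new node holding 2999 slots (999 + 1001 + 999). One way to account for this is a proportional split in which the largest source is rounded up and the others are rounded down. The sketch below reproduces those numbers; the function name `reshard_plan` and the rounding rule are assumptions chosen to match the output, not a confirmed description of the tool's internals.

```python
def reshard_plan(num_slots: int, source_slot_counts: dict[str, int]) -> dict[str, int]:
    """Split num_slots across sources in proportion to the slots each one holds."""
    total = sum(source_slot_counts.values())
    # Largest source first; it is the only share that is rounded up (assumption).
    ordered = sorted(source_slot_counts.items(), key=lambda kv: kv[1], reverse=True)
    plan = {}
    for index, (node, held) in enumerate(ordered):
        numerator = num_slots * held
        if index == 0:
            share = -(-numerator // total)  # ceiling division
        else:
            share = numerator // total      # floor division
        plan[node] = share
    return plan


if __name__ == "__main__":
    # Slot counts of the three original masters before the reshard.
    sources = {
        "10.10.2.70:6301": 5461,
        "10.10.2.85:6300": 5461,
        "10.10.2.71:6300": 5462,
    }
    plan = reshard_plan(3000, sources)
    for node, share in plan.items():
        print(node, share)
    print("total moved:", sum(plan.values()))
```

Integer arithmetic is used for the rounding so the result does not depend on floating-point error.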

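B. Scenario 2

The example above scaled the cluster out. Conversely, a cluster may also need to be scaled in for various reasons. The example below takes the nodes added above offline again, in the following steps:

1. Migrate this node's slots to other nodes (a node that still holds slots cannot be taken offline directly)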
[root@yw_0_0 ~]# redis-trib.rb reshard 10.10.2.70:6300
Connecting to node 10.10.2.70:6300: OK
Connecting to node 10.10.2.85:6300: OK
Connecting to node 10.10.2.85:6303: OK
Connecting to node 10.10.2.85:6302: OK
Connecting to node 10.10.2.85:6301: OK
Connecting to node 10.10.2.71:6300: OK
Connecting to node 10.10.2.70:6301: OK
Connecting to node 10.10.2.71:6301: OK

Performing Cluster Check (using node 10.10.2.70:6300)

S: cd1f2c1f348bb4359337e7462c1e21dc82f1551b 10.10.2.70:6300
slots: (0 slots) slave
replicates 85412cf3d8e69354115fc0991f470b32b9213cd7
M: 6bea6afa2ee8dfb0cc3c96f804eb3fa77ce98013 10.10.2.85:6300
slots:999-5460 (4462 slots) master
1 additional replica(s)
S: fc90d090fae909fd4f962752941c039d081d3854 10.10.2.85:6303
slots: (0 slots) slave
replicates 5ef18f95f75756891aa948ea1f200044f1d3947c
M: 5ef18f95f75756891aa948ea1f200044f1d3947c 10.10.2.85:6302
slots:0-998,5461-6461,10923-11921 (2999 slots) master
1 additional replica(s)
S: a74642c0fbc98f921be477eabcdd22eccd89891f 10.10.2.85:6301
slots: (0 slots) slave
replicates 2568dbd91fffa16ff93ea8db19275fd7ec8af41a
M: 2568dbd91fffa16ff93ea8db19275fd7ec8af41a 10.10.2.71:6300
slots:6462-10922 (4461 slots) master
1 additional replica(s)
M: 85412cf3d8e69354115fc0991f470b32b9213cd7 10.10.2.70:6301
slots:11922-16383 (4462 slots) master
1 additional replica(s)
S: 22d2dec483824b84571a60e8c037fff957615552 10.10.2.71:6301
slots: (0 slots) slave
replicates 6bea6afa2ee8dfb0cc3c96f804eb3fa77ce98013
[OK] All nodes agree about slots configuration.

Check for open slots...
Check slots coverage...

[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)? 3000 # 3000 slots were requested for this node above, so 3000 are moved out again here
What is the receiving node ID? 85412cf3d8e69354115fc0991f470b32b9213cd7 # ID of the master that receives these slots
Please enter all the source node IDs.
Type 'all' to use all the nodes as source nodes for the hash slots.
Type 'done' once you entered all the source nodes IDs.
Source node #1:5ef18f95f75756891aa948ea1f200044f1d3947c # ID of the master being taken offline
Source node #2:done
(output omitted)
Do you want to proceed with the proposed reshard plan (yes)? yes

2. Then verify that the master 10.10.2.85:6302 no longer holds any slots
10.10.2.71:6300> cluster nodes
85412cf3d8e69354115fc0991f470b32b9213cd7 10.10.2.70:6301 master - 0 1445853133399 12 connected 0-999 6462-7460 10923-16383
22d2dec483824b84571a60e8c037fff957615552 10.10.2.71:6301 slave 6bea6afa2ee8dfb0cc3c96f804eb3fa77ce98013 0 1445853132898 10 connected
6bea6afa2ee8dfb0cc3c96f804eb3fa77ce98013 10.10.2.85:6300 master - 0 1445853134400 10 connected 1000-5461
2568dbd91fffa16ff93ea8db19275fd7ec8af41a 10.10.2.71:6300 myself,master - 0 0 11 connected 5462-6461 7461-10922
cd1f2c1f348bb4359337e7462c1e21dc82f1551b 10.10.2.70:6300 slave 85412cf3d8e69354115fc0991f470b32b9213cd7 0 1445853131395 12 connected
fc90d090fae909fd4f962752941c039d081d3854 10.10.2.85:6303 slave 5ef18f95f75756891aa948ea1f200044f1d3947c 0 1445853133899 8 connected
a74642c0fbc98f921be477eabcdd22eccd89891f 10.10.2.85:6301 slave 2568dbd91fffa16ff93ea8db19275fd7ec8af41a 0 1445853129394 11 connected
5ef18f95f75756891aa948ea1f200044f1d3947c 10.10.2.85:6302 master - 0 1445853132397 8 connected
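
Checking this by eye is error-prone on larger clusters. The sketch below parses `CLUSTER NODES` text in the field layout shown above (id, address, flags, master id, ping-sent, pong-recv, config-epoch, link-state, then slot ranges) and reports masters with no slots. The helper names are hypothetical, and the sample is a three-line excerpt of the output above.

```python
def parse_cluster_nodes(text: str) -> list[dict]:
    """Parse CLUSTER NODES output into one dict per node."""
    nodes = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or malformed lines
        nodes.append({
            "id": fields[0],
            "addr": fields[1],
            "flags": fields[2].split(","),
            "master_id": None if fields[3] == "-" else fields[3],
            "link_state": fields[7],
            # Everything after link-state is a slot range (or a migration marker).
            "slots": fields[8:],
        })
    return nodes


def empty_masters(nodes: list[dict]) -> list[str]:
    """Return addresses of masters that currently own no slots."""
    return [n["addr"] for n in nodes if "master" in n["flags"] and not n["slots"]]


SAMPLE = """
85412cf3d8e69354115fc0991f470b32b9213cd7 10.10.2.70:6301 master - 0 1445853133399 12 connected 0-999 6462-7460 10923-16383
fc90d090fae909fd4f962752941c039d081d3854 10.10.2.85:6303 slave 5ef18f95f75756891aa948ea1f200044f1d3947c 0 1445853133899 8 connected
5ef18f95f75756891aa948ea1f200044f1d3947c 10.10.2.85:6302 master - 0 1445853132397 8 connected
"""

if __name__ == "__main__":
    print(empty_masters(parse_cluster_nodes(SAMPLE)))
```

A master that appears in this list is safe to remove with del-node, once its slaves have been removed or reassigned.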

3. Take the slave node offline
[root@yw_0_0 ~]# redis-trib.rb del-node 10.10.2.85:6303 fc90d090fae909fd4f962752941c039d081d3854

Removing node fc90d090fae909fd4f962752941c039d081d3854 from cluster 10.10.2.85:6303

Connecting to node 10.10.2.85:6303: OK
Connecting to node 10.10.2.85:6301: OK
Connecting to node 10.10.2.85:6302: OK
Connecting to node 10.10.2.85:6300: OK
Connecting to node 10.10.2.70:6300: OK
Connecting to node 10.10.2.71:6301: OK
Connecting to node 10.10.2.70:6301: OK
Connecting to node 10.10.2.71:6300: OK

Sending CLUSTER FORGET messages to the cluster...
SHUTDOWN the node.

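4. Take the master node offline

C. Scenario 3
Suppose the master of one master/slave group in the cluster fails and its slave is promoted to master. If this new master also fails before we have had a chance to attach a new slave to it, that group becomes completely unavailable. To guard against this we could give every master at least two slaves, but that doubles the memory or server resources required. Is there a compromise? Yes. Remember the cluster-migration-barrier parameter in the configuration file above? We only need to attach extra slaves to one master in the cluster. When another master is left without a working slave, the master with spare slaves hands one of its slaves over to it, which keeps the whole cluster available.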
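
The rule described above can be sketched as a small simulation: a master may give up a slave only if it still keeps at least `barrier` working slaves afterwards. This is a simplified model under that single assumption; the function names are hypothetical, and real Redis applies additional conditions when choosing which slave moves.

```python
def pick_donor(replica_counts: dict[str, int], barrier: int = 1):
    """Return the master that can give up one slave, or None if none can."""
    # A donor must keep at least `barrier` slaves after giving one away.
    candidates = {m: c for m, c in replica_counts.items() if c - 1 >= barrier}
    if not candidates:
        return None
    # Prefer the master with the most slaves.
    return max(candidates, key=candidates.get)


def migrate_replica(replica_counts: dict[str, int], orphan: str, barrier: int = 1) -> dict[str, int]:
    """Return new slave counts after trying to cover an orphaned master."""
    result = dict(replica_counts)
    if result.get(orphan, 0) > 0:
        return result  # not orphaned, nothing to do
    donor = pick_donor(result, barrier)
    if donor is not None:
        result[donor] -= 1
        result[orphan] = result.get(orphan, 0) + 1
    return result


if __name__ == "__main__":
    # State after the extra slave is added and the slave of 10.10.2.71:6300 is stopped.
    counts = {"10.10.2.70:6300": 2, "10.10.2.85:6300": 1, "10.10.2.71:6300": 0}
    print(migrate_replica(counts, "10.10.2.71:6300", barrier=1))
    print(migrate_replica(counts, "10.10.2.71:6300", barrier=2))
```

With a barrier of 1 the spare slave moves and every master ends up with one slave; with a barrier of 2 no master can afford to donate, so the orphaned master stays uncovered.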

1. Add a slave, 10.10.2.85:6302, to the group formed by 10.10.2.70:6300 and 10.10.2.70:6301
[root@yw_0_0 ~]# redis-trib.rb add-node --slave --master-id cd1f2c1f348bb4359337e7462c1e21dc82f1551b 10.10.2.85:6302 10.10.2.70:6300

Adding node 10.10.2.85:6302 to cluster 10.10.2.70:6300

Connecting to node 10.10.2.70:6300: OK
Connecting to node 10.10.2.85:6300: OK
Connecting to node 10.10.2.71:6300: OK
Connecting to node 10.10.2.70:6301: OK
Connecting to node 10.10.2.85:6301: OK
Connecting to node 10.10.2.71:6301: OK

Performing Cluster Check (using node 10.10.2.70:6300)

M: cd1f2c1f348bb4359337e7462c1e21dc82f1551b 10.10.2.70:6300
slots:3000-5461,6462-7460,10923-16383 (8922 slots) master
1 additional replica(s)
M: e36cdef7a26ed59e8d9db2cf1dbc1997bfc9dfde 10.10.2.85:6300
slots:0-2999 (3000 slots) master
1 additional replica(s)
M: 2568dbd91fffa16ff93ea8db19275fd7ec8af41a 10.10.2.71:6300
slots:5462-6461,7461-10922 (4462 slots) master
1 additional replica(s)
S: 85412cf3d8e69354115fc0991f470b32b9213cd7 10.10.2.70:6301
slots: (0 slots) slave
replicates cd1f2c1f348bb4359337e7462c1e21dc82f1551b
S: 89fcc4994a99ed2fe9bbb908c58dfda2cf31e7d2 10.10.2.85:6301
slots: (0 slots) slave
replicates e36cdef7a26ed59e8d9db2cf1dbc1997bfc9dfde
S: 1f3ea36eacbe005a4b9ac52aeef6d83337dac051 10.10.2.71:6301
slots: (0 slots) slave
replicates 2568dbd91fffa16ff93ea8db19275fd7ec8af41a
[OK] All nodes agree about slots configuration.

Check for open slots...
Check slots coverage...

[OK] All 16384 slots covered.
Connecting to node 10.10.2.85:6302: OK

Send CLUSTER MEET to node 10.10.2.85:6302 to make it join the cluster.

Waiting for the cluster to join.

Configure node as replica of 10.10.2.70:6300.

[OK] New node added correctly.

2. Shut down the slave in the group formed by 10.10.2.71:6300 and 10.10.2.71:6301
redis-cli -h 10.10.2.71 -p 6301 shutdown

3. Check whether 10.10.2.85:6302 has become a slave of 10.10.2.71:6300
10.10.2.71:6300> CLUSTER nodes
85412cf3d8e69354115fc0991f470b32b9213cd7 10.10.2.70:6301 slave cd1f2c1f348bb4359337e7462c1e21dc82f1551b 0 1445911596844 17 connected
89fcc4994a99ed2fe9bbb908c58dfda2cf31e7d2 10.10.2.85:6301 slave e36cdef7a26ed59e8d9db2cf1dbc1997bfc9dfde 0 1445911594841 20 connected
2568dbd91fffa16ff93ea8db19275fd7ec8af41a 10.10.2.71:6300 myself,master - 0 0 11 connected 5462-6461 7461-10922
cd1f2c1f348bb4359337e7462c1e21dc82f1551b 10.10.2.70:6300 master - 0 1445911593839 17 connected 3000-5461 6462-7460 10923-16383
2b34532cd6937063d1da26cd4652881b73d97a06 10.10.2.85:6302 slave 2568dbd91fffa16ff93ea8db19275fd7ec8af41a 0 1445911592838 17 connected # now attached to 10.10.2.71:6300
1f3ea36eacbe005a4b9ac52aeef6d83337dac051 10.10.2.71:6301 slave,fail 2568dbd91fffa16ff93ea8db19275fd7ec8af41a 1445911561982 1445911559778 11 disconnected
e36cdef7a26ed59e8d9db2cf1dbc1997bfc9dfde 10.10.2.85:6300 master - 0 1445911595843 20 connected 0-2999

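V. Cluster-related commands

Cluster
CLUSTER INFO  Print information about the cluster.
CLUSTER NODES  List all nodes currently known to the cluster, along with information about each of them.
Nodes
CLUSTER MEET <ip> <port>  Add the node at the given ip and port to the cluster, making it a member.
CLUSTER FORGET <node_id>  Remove the node identified by node_id from the cluster.
CLUSTER REPLICATE <node_id>  Make the current node a slave of the node identified by node_id.
CLUSTER SAVECONFIG  Save the node's configuration file to disk.
Slots
CLUSTER ADDSLOTS <slot> [slot ...]  Assign one or more slots to the current node.
CLUSTER DELSLOTS <slot> [slot ...]  Remove the assignment of one or more slots from the current node.
CLUSTER FLUSHSLOTS  Remove all slots assigned to the current node, leaving it with no slots assigned.
CLUSTER SETSLOT <slot> NODE <node_id>  Assign slot to the node identified by node_id. If the slot is already assigned to another node, that node first drops the slot and the assignment is then made.
CLUSTER SETSLOT <slot> MIGRATING <node_id>  Migrate slot from this node to the node identified by node_id.
CLUSTER SETSLOT <slot> IMPORTING <node_id>  Import slot into this node from the node identified by node_id.
CLUSTER SETSLOT <slot> STABLE  Cancel an import or a migration of slot.
Keys
CLUSTER KEYSLOT <key>  Compute which slot the key should be placed in.
CLUSTER COUNTKEYSINSLOT <slot>  Return the number of key-value pairs slot currently holds.
CLUSTER GETKEYSINSLOT <slot> <count>  Return count keys from slot.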
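
To make CLUSTER KEYSLOT less of a black box, here is a minimal sketch of the key-to-slot mapping as described in the Redis Cluster specification: CRC16 (XMODEM variant, polynomial 0x1021) of the key modulo 16384, where only the substring between the first `{` and the next `}` is hashed if that substring is non-empty. The function names are my own; the result can be checked against CLUSTER KEYSLOT on a live node.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Return the hash slot (0..16383) for a key, honoring hash tags."""
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Only a non-empty tag between the braces replaces the full key.
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


if __name__ == "__main__":
    for key in ("foo", "user1000", "{user1000}.following", "{user1000}.followers"):
        print(key, key_slot(key))
```

Keys that share a hash tag land in the same slot, which is what allows multi-key commands on them in a cluster.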
