Speeding up Migration on ApsaraDB for Redis


Abstract: Redis supports the MIGRATE command to transfer a key from a source instance to a destination instance. During migration, the source generates a serialized version of the key's value with the DUMP command, and the target node then executes the RESTORE command to load the data into memory. In this article, we migrated an 800 MB key of type ZSET and compared migration performance on a native Redis environment with that on an optimized environment.

The test environment consists of two Redis instances on a local development machine, so the impact of the network is ignored. Under these conditions, executing the RESTORE command takes 163 seconds on native Redis and only 27 seconds on the optimized Redis. This analysis was performed using Alibaba Cloud ApsaraDB for Redis.

1. Native Redis RESTORE performance bottleneck

Our profiling shows the following CPU usage:

[Figure: CPU usage profile]

The source code shows that MIGRATE traverses the ZSET, serializes each member and its score, and sends the packaged payload to the target node.

The target node then deserializes the data and rebuilds the ZSET structure, which involves running the zslInsert and dictAdd operations for every member. This process is time-consuming, and the rebuilding cost grows with the data size.
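The flow described above can be modeled with a small Python sketch. It is not Redis code: `dump_zset` and `restore_zset` are hypothetical names, a plain dict stands in for the hash table, and a sorted list stands in for the skip list. The point is only that the payload carries bare member and score pairs, so the target pays one lookup-and-insert per member for each index.

```python
import bisect

def dump_zset(scores):
    """Flatten a member -> score mapping into bare pairs; index structure is dropped."""
    return list(scores.items())

def restore_zset(payload):
    """Rebuild both indexes one element at a time, as the native path does."""
    table = {}     # stand-in for the ZSET hash table
    ordered = []   # stand-in for the skip list: (score, member) kept sorted
    for member, score in payload:
        table[member] = score                    # analogous to dictAdd
        bisect.insort(ordered, (score, member))  # analogous to zslInsert
    return table, ordered

table, ordered = restore_zset(dump_zset({"b": 2.0, "a": 1.0, "c": 3.0}))
```

Every insert has to search for its position again, which is why the cost grows with the size of the key.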

2. Method of optimization

From our analysis, we can see that the bottleneck is rebuilding the data model. To optimize the process, we can serialize the source node's data model together with the data and send both to the target node. The target node parses the payload, pre-allocates the memory, and then fills in the parsed members.

Because ZSET is a fairly complicated data structure in Redis, we first briefly introduce the structures it uses.

2.1 ZSET data structure

ZSET consists of two data structures: a hash table, which stores each member and its corresponding score, and a skip list, in which all members are kept in sorted order, as shown below:

[Figures: ZSET hash table and skip list structures]
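As a rough illustration of the two structures, the following Python sketch pairs a dict with a minimal skip list. It is a simplified model rather than the Redis implementation: spans and backward pointers are omitted, score updates are not supported, and the class names, `MAX_LEVEL`, and `P` are values chosen for this example.

```python
import random

MAX_LEVEL = 32   # assumed cap on node height for this sketch
P = 0.25         # assumed probability of promoting a node one level

class Node:
    def __init__(self, member, score, level):
        self.member = member
        self.score = score
        self.forward = [None] * level   # one forward pointer per level

class ZSet:
    def __init__(self, seed=0):
        self.dict = {}                              # member -> score, O(1) lookup
        self.header = Node(None, None, MAX_LEVEL)   # sentinel head of the skip list
        self.level = 1
        self.rng = random.Random(seed)

    def _random_level(self):
        level = 1
        while level < MAX_LEVEL and self.rng.random() < P:
            level += 1
        return level

    def add(self, member, score):
        if member in self.dict:
            raise ValueError("score updates are omitted in this sketch")
        update = [self.header] * MAX_LEVEL   # last node before the insert point, per level
        x = self.header
        for i in range(self.level - 1, -1, -1):
            while (x.forward[i] is not None and
                   (x.forward[i].score, x.forward[i].member) < (score, member)):
                x = x.forward[i]
            update[i] = x
        level = self._random_level()
        self.level = max(self.level, level)
        node = Node(member, score, level)
        for i in range(level):
            node.forward[i] = update[i].forward[i]
            update[i].forward[i] = node
        self.dict[member] = score

    def score(self, member):
        return self.dict.get(member)

    def items(self):
        """Walk level 0 to yield (member, score) in sorted order."""
        x = self.header.forward[0]
        while x is not None:
            yield x.member, x.score
            x = x.forward[0]
```

The dict answers "what is this member's score" directly, while the level 0 chain of the skip list yields members in score order.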

2.2 Serialize the ZSET structure model

In Redis, the ZSET dict and the zsl share the memory that holds the members and scores, so each member and score is stored only once. Describing the same copy of data twice in the serialization, once for each index, would increase the cost.

2.2.1 Serialize the dict model

The CPU profile shows that, in the hash table part, most CPU time is spent calculating the index, rehashing, and comparing keys. (Rehashing occurs when the pre-allocated hash table is too small and the entries of the old table must be transferred to a larger new table. Key comparison occurs while traversing a bucket's list to determine whether a key already exists.)

Based on this, the largest hash table size is specified during serialization, which removes the need for rehashing when the dict table is generated during RESTORE.

During restore, we still need to deserialize each member and score, recalculate the member's index, and insert the entry into the table at that index. Because the members obtained by traversing the zsl contain no key conflicts, members with the same index can be appended to the list directly, which eliminates the key comparison.
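The two savings can be illustrated with a toy chained hash table in Python. This is a sketch under simplifying assumptions, not the Redis dict: `ChainedTable`, `add`, and `add_unique` are hypothetical names, and the counters exist only to make the avoided work visible.

```python
class ChainedTable:
    def __init__(self, size):
        self.buckets = [[] for _ in range(size)]
        self.used = 0
        self.rehashes = 0   # how many times the table had to grow
        self.compares = 0   # how many key comparisons were performed

    def _index(self, key):
        return hash(key) % len(self.buckets)

    def _grow(self):
        old = self.buckets
        self.buckets = [[] for _ in range(len(old) * 2)]
        self.rehashes += 1
        for chain in old:
            for key, value in chain:
                self.buckets[self._index(key)].append((key, value))

    def add(self, key, value):
        """General insert: grow when full, scan the chain for an existing key."""
        if self.used >= len(self.buckets):
            self._grow()
        chain = self.buckets[self._index(key)]
        for existing, _ in chain:
            self.compares += 1
            if existing == key:
                return False
        chain.append((key, value))
        self.used += 1
        return True

    def add_unique(self, key, value):
        """Restore-time insert: the caller guarantees capacity and uniqueness."""
        self.buckets[self._index(key)].append((key, value))
        self.used += 1

    def get(self, key):
        for existing, value in self.buckets[self._index(key)]:
            if existing == key:
                return value
        return None

members = [("m%d" % i, float(i)) for i in range(1000)]

native = ChainedTable(4)          # starts small and grows repeatedly
for member, score in members:
    native.add(member, score)

presized = ChainedTable(1024)     # final size known from the serialized payload
for member, score in members:
    presized.add_unique(member, score)
```

In this run the small table grows eight times on its way from 4 to 1024 buckets, while the pre-sized table never rehashes and never compares keys.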

2.2.2 Serialize zsl model

The zsl has a multi-layer structure, as shown in the figure below.

[Figure: multi-layer structure of the zsl]

The difficulty in describing it is that the number of levels of each zskiplistNode is not known in advance. We also need to describe each node's context on every layer while taking compatibility into account.

Based on the above considerations, we decided to traverse from the highest level of zsl, and the serialized format is:
level | header span | level_len | [ span ( | member | score ... ) ]

The items in the format are described below.
level: Level of the data.
header span: The span value of the header node on this layer.
level_len: Total number of nodes on this layer.
span: The span value of the node on this layer.
member | score: Because redundant nodes may exist on levels above Level 0, we add up the span values to determine whether a node is redundant, that is, whether it has already been serialized on a higher layer. For a redundant node, member | score is not serialized again. For a non-redundant node, member | score is included. The deserialization algorithm follows the same principle.
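The format above can be sketched in Python as follows. This is an illustrative model, not the actual wire format: the payload is a flat Python list rather than a binary encoding, the function names are hypothetical, the skip list is represented as a sorted list of pairs plus a per-node height, and the last node's span on a layer is assumed to be the number of nodes after it. A node's rank is obtained by adding up the spans on its layer, and a rank that was already written on a higher layer is treated as redundant.

```python
def serialize_levels(items, heights):
    """items: (member, score) pairs in sorted order; heights[i]: level count of node i."""
    if not items:
        return []
    n = len(items)
    tokens = []
    written = set()  # 1-based ranks whose member | score were already emitted
    for level in range(max(heights) - 1, -1, -1):        # highest level first
        ranks = [r + 1 for r in range(n) if heights[r] > level]
        tokens += [level, ranks[0], len(ranks)]           # level | header span | level_len
        for i, rank in enumerate(ranks):
            nxt = ranks[i + 1] if i + 1 < len(ranks) else n
            tokens.append(nxt - rank)                     # span on this layer
            if rank not in written:                       # non-redundant node
                member, score = items[rank - 1]
                tokens += [member, score]
                written.add(rank)
    return tokens

def deserialize_levels(tokens):
    """Inverse of serialize_levels: returns (items, heights)."""
    it = iter(tokens)
    nodes = {}     # rank -> (member, score)
    heights = {}   # rank -> level count, fixed by the first layer the node appears on
    for level in it:
        rank = next(it)          # header span gives the rank of the first node
        level_len = next(it)
        for _ in range(level_len):
            span = next(it)
            if rank not in nodes:                # first sighting, payload follows
                member = next(it)
                score = next(it)
                nodes[rank] = (member, score)
                heights[rank] = level + 1
            rank += span                         # accumulated spans give the next rank
    order = sorted(nodes)
    return [nodes[r] for r in order], [heights[r] for r in order]

items = [("a", 1.0), ("b", 2.0), ("c", 3.0), ("d", 4.0)]
heights = [1, 3, 1, 2]
payload = serialize_levels(items, heights)
```

Because each layer arrives with its length and spans, the receiver in this model can link nodes in place instead of searching for every insert position, at the cost of the extra span fields.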

Conclusion

The description of the ZSET data model is now complete, and RESTORE runs faster. However, this optimization introduces a tradeoff: it consumes more bandwidth. The extra bandwidth comes from the fields that describe the nodes. After optimization, the serialized data is 20 MB larger than the original 800 MB.
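As a quick check on the tradeoff, the figures reported in this article can be put side by side. The variable names are only for this example; the numbers are the ones stated above.

```python
native_seconds = 163      # RESTORE time on native Redis
optimized_seconds = 27    # RESTORE time on the optimized Redis
base_mb = 800             # payload size before optimization
extra_mb = 20             # additional payload after optimization

speedup = native_seconds / optimized_seconds
overhead = extra_mb / base_mb

print(f"RESTORE speedup: {speedup:.1f}x")    # prints "RESTORE speedup: 6.0x"
print(f"Payload overhead: {overhead:.1%}")   # prints "Payload overhead: 2.5%"
```

In this test, roughly 2.5% more data on the wire corresponds to RESTORE finishing about six times faster.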

ApsaraDB for Redis is a stable, reliable, and scalable database service with superb performance. It is structured on the Apsara Distributed File System and full SSD high-performance storage, and supports master-slave and cluster-based high-availability architectures. ApsaraDB for Redis offers a full range of database solutions including disaster switchover, failover, online expansion, and performance optimization. Try ApsaraDB for Redis today!
