Implementing a Highly-Compressed Data Storage

本文涉及的产品
云数据库 RDS MySQL Serverless,0.5-2RCU 50GB
简介: Alibaba Cloud ApsaraDB for RDS for MySQL supports the TokuDB engine to store data that is compressed to 5 to 10 times smaller than its original size.

ST_007

1. MySQL Big Data Storage - Compression up to 10 times smaller

Alibaba Cloud ApsaraDB for RDS for MySQL supports the TokuDB engine to store data that is compressed to 5 to 10 times smaller than its original size. It also supports highly concurrent writes through the caching of intermediate nodes.

TokuDB is an optional storage engine for RDS for MySQL. With a disk-optimized index structure Fractal Tree, TokuDB's intermediate nodes can cache data processing requests (insert/update/delete/on-line add index/on-line add column), improving the performance of high-concurrency writes by three to nine times. The node size is 4 MB (configurable), and data can be compressed by 5 to 10 times through a variety of compression algorithms such as zlib/quicklz/lzma/zstd/snappy. TokuDB also supports multiple versions of MVCC and the four isolation levels UR, RC, RR, and Serializable.

In addition to features found in the Community Edition, the source code team for RDS for MySQL also implemented a number of customized optimizations to respond to common use cases:

• Hot backup
TokuDB Community Edition does not provide hot backup. RDS for MySQL enables a hot backup solution based on the TokuDB internal checkpoint mechanism by copying the binlog, redo log, and data files. RDS for MySQL also obtains the TokuDB checkpoint lock to prevent the sharp checkpoint issue during the hot backup process.

• Improve query response on the client
The source code version provided by RDS limits TokuDB to performing sharp checkpoint once every second to avoid using too much disk bandwidth, which can result in query response failures on the client.

• Set buffer pool ratio
RDS provides a parameter tokudb_buffer_pool_ratio for you to set the percentage of memory occupied by the TokuDB engine buffer pool within the range [0,100]. The minimum TokuDB buffer pool size is 64 MB, and the minimum InnoDB buffer pool size is 64 MB (V5.6) and 128 MB (V5.7) to meet the InnoDB/TokuDB initialization requirements.

Switch RDS for MySQL to the TokuDB engine in three steps

1.Set the "loose_tokudb_buffer_pool_ratio", that is, the proportion of the TokuDB and InnoDB shared cache occupied by TokuDB.

select sum(data_length) into @all_size from information_schema.tables where engine='innodb';

select sum(data_length) into @change_size from information_schema.tables where engine='innodb' and concat(table_schema, '.', table_name) in ('XX.XXXX', 'XX.XXXX', 'XX.XXXX');

select round(@change_size/@all_size*100);

2.Restart the instance.

3.Alter the storage engine.

ALTER TABLE XX.XXXX ENGINE=TokuDB

Specifically, "XXX.XXXX" refers to the name of the database and table to be altered to the TokuDB storage engine.

2. MySQL MaxCompute

Alibaba Cloud provides the MaxCompute service for storage and calculation of batch structured data. The service offers mass data warehouse solutions and analytical modeling services for big data. You can import RDS data into MaxCompute through the Data Integration service and simple settings on the interface to achieve large-scale data computing.

29_1

相关实践学习
基于CentOS快速搭建LAMP环境
本教程介绍如何搭建LAMP环境,其中LAMP分别代表Linux、Apache、MySQL和PHP。
全面了解阿里云能为你做什么
阿里云在全球各地部署高效节能的绿色数据中心,利用清洁计算为万物互联的新世界提供源源不断的能源动力,目前开服的区域包括中国(华北、华东、华南、香港)、新加坡、美国(美东、美西)、欧洲、中东、澳大利亚、日本。目前阿里云的产品涵盖弹性计算、数据库、存储与CDN、分析与搜索、云通信、网络、管理与监控、应用服务、互联网中间件、移动服务、视频服务等。通过本课程,来了解阿里云能够为你的业务带来哪些帮助     相关的阿里云产品:云服务器ECS 云服务器 ECS(Elastic Compute Service)是一种弹性可伸缩的计算服务,助您降低 IT 成本,提升运维效率,使您更专注于核心业务创新。产品详情: https://www.aliyun.com/product/ecs
目录
相关文章
|
算法 生物认证 开发工具
Guideline 5.1.2 - Legal - Privacy - Data Use and Sharing
你的应用程序收集的设备信息可能包括以下一些:attributesOfItemAtPath:error:, NSLocaleCountryCode, NSFileSystemSize, NSHomeDirectory,和serviceSubscriberCellularProviders。
234 0
|
数据可视化 数据挖掘 开发者
Data-Basic Statistical Descriptions of Data| 学习笔记
快速学习 Data-Basic Statistical Descriptions of Data。
103 0
Data-Basic Statistical Descriptions of Data| 学习笔记
Performance problem during saving - activating big form
Performance problem during saving - activating big form
80 0
automatic asynchronous creation if no note exists
Created by Wang, Jerry, last modified on May 12, 2015
102 0
automatic asynchronous creation if no note exists
The Rising Smart Logistics Industry: How to Use Big Data to Improve Efficiency and Save Costs
This whitepaper will examine Alibaba Cloud’s Cainiao smart logistics cloud and Big Data powered platform and the underlying strategies used to optimiz.
1500 0
The Rising Smart Logistics Industry: How to Use Big Data to Improve Efficiency and Save Costs
|
安全 Go
Improving your Organizations Data Governance Scorecard
This whitepaper looks at how businesses can improve their scores on the tests of three fundamental data governance areas.
1319 0
|
JavaScript 前端开发