如何获取InnoDB树的高度

2018-02-01 8559

版权

本文内容由阿里云实名注册用户自发贡献，版权归原作者所有，阿里云开发者社区不拥有其著作权，亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容，填写侵权投诉表单进行举报，一经查实，本社区将立刻删除涉嫌侵权内容。

本文涉及的产品

云数据库 RDS MySQL，集群系列 2核4GB

RDS MySQL Serverless 基础系列，0.5-2RCU 50GB

云数据库 RDS PostgreSQL，集群系列 2核4GB

简介： 前言作为DBA了解InnoDB的页组织方式是最基础的，在实际工作中，免不了会评估SQL会消耗多少IO，怎么评估呢？作为InnoDB表和树的高度或者深度有关系。查看树的高度? 之前研究了半天：https://www.

前言

作为DBA了解InnoDB的页组织方式是最基础的，在实际工作中，免不了会评估SQL会消耗多少IO，怎么评估呢？
作为InnoDB表和树的高度或者深度有关系。

查看树的高度?

之前研究了半天：
https://www.percona.com/blog/2009/04/28/the_depth_of_a_b_tree/
http://code.openark.org/blog/mysql/the-depth-of-an-index-primer
根据

Scholmi notes that there are two main features determining the depth of a B-tree (or B+-tree):

The number of rows in the database. We’ll call that N.
The size of the indexed key. Let’s call B the number of key that fit in a B-tree node. (Sometimes B is used to refer to the node size itself, rather than the number of keys it holds, but I hope my choice will make sense directly.)
Given these quantities, the depth of a B-tree is logB N, give or take a little. That’s just (log N)/log B. Now we can rephrase Scholmi’s point as noting that small keys means a bigger B, which reduces (log N)/log B. If we cut the key size in half, then the depth of the B-tree goes from (log N)/log B to (log N)/log 2B (twice as many keys fit in the tree nodes), and that’s just (log N)/(1+log B).

Let’s put some numbers in there. Say you have a billion rows, and you can currently fit 64 keys in a node. Then the depth of the tree is (log 109)/ log 64 ≈ 30/6 = 5. Now you rebuild the tree with keys half the size and you get log 109 / log 128 ≈ 30/7 = 4.3. Assuming the top 3 levels of the tree are in memory, then you go from 2 disk seeks on average to 1.3 disk seeks on average, for a 35% speedup.
        
          
        
        
        
          
          AI 代码解读

里面算的不对，人肉算也没达到这个高度。也可能是我没有理解作者的意思，没有用对公式。那么根据结合前面的Innodb页结构，如何正确的获取数的高度呢？继续拿这个表举例子：

mysql> show create table sbtest1\G
*************************** 1. row ***************************
       Table: sbtest1
Create Table: CREATE TABLE `sbtest1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `gmt_create` datetime NOT NULL,
  `gmt_modified` datetime NOT NULL,
  `k` int(11) NOT NULL DEFAULT '0',
  `c` varchar(500) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  `is_used` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`),
  KEY `idx_is_used` (`is_used`),
  KEY `idx_gmt_create` (`gmt_create`)
) ENGINE=InnoDB AUTO_INCREMENT=69313841 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
        
          
        
        
        
          
          AI 代码解读

方法一

[root@localhost mysql]# innodb_space -f test/sbtest1.ibd space-index-pages-summary | head -n 10
page        index   level   data    free    records
3           74      2       5166    10904   369
4           75      2       408     15834   24
5           76      2       486     15756   27
6           77      2       486     15756   27
7           74      0       15028   1192    68
8           74      0       15028   1192    68
9           74      1       14700   1030    1050
10          74      0       15028   1192    68
11          74      0       15028   1192    68
        
          
        
        
        
          
          AI 代码解读

page_level是2，所以这个树高度是page_level+1=3;

方法二

mysql> show global variables like "%innodb_page%";
+----------------------+-------+
| Variable_name        | Value |
+----------------------+-------+
| innodb_page_cleaners | 1     |
| innodb_page_size     | 16384 |
+----------------------+-------+
2 rows in set (0.00 sec)

mysql> show table status like 'sbtest1'\G
*************************** 1. row ***************************
           Name: sbtest1
         Engine: InnoDB
        Version: 10
     Row_format: Dynamic
           Rows: 25926320
 Avg_row_length: 279
    Data_length: 7254032384
Max_data_length: 0
   Index_length: 1293697024
      Data_free: 3145728
 Auto_increment: 69313841
    Create_time: 2018-01-19 14:53:11
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.00 sec)
mysql> desc sbtest1;
+--------------+--------------+------+-----+---------+----------------+
| Field        | Type         | Null | Key | Default | Extra          |
+--------------+--------------+------+-----+---------+----------------+
| id           | int(11)      | NO   | PRI | NULL    | auto_increment |
| gmt_create   | datetime     | NO   | MUL | NULL    |                |
| gmt_modified | datetime     | NO   |     | NULL    |                |
| k            | int(11)      | NO   | MUL | 0       |                |
| c            | varchar(500) | NO   |     |         |                |
| pad          | char(60)     | NO   |     |         |                |
| is_used      | int(11)      | YES  | MUL | NULL    |                |
+--------------+--------------+------+-----+---------+----------------+
        
          
        
        
        
          
          AI 代码解读

通常一棵B+树可以存放多少行数据？

这里我们先假设B+树高为2，即存在一个根节点和若干个叶子节点，那么这棵B+树的存放总记录数为：根节点指针数*单个叶子节点记录行数。

上文我们已经说明单个叶子节点（页）中的记录数=16K/279=58。（我们从上面可以看到每行记录的数据平均大小为279个字节）。

那么现在我们需要计算出非叶子节点能存放多少指针，其实这也很好算，表中的主键ID为int类型，长度为4字节，而指针大小在InnoDB源码中设置为6字节，这样一共10字节，我们一个页中能存放多少这样的单元，其实就代表有多少指针，即16384/10=1638。那么可以算出一棵高度为2的B+树，能存放1638*58=95004条这样的数据记录。

根据同样的原理我们可以算出一个高度为3的B+树可以存放：1638*1638*58=155616552条这样的记录。
高度为4的B+树可以存放：
1638*1638*1638*58=254899912176条这样的记录。
        
          
        
        
        
          
          AI 代码解读

而在实际应用中，大部分是以bigint作为主键的，主键ID为bigint类型，长度为8字节，而指针大小在InnoDB源码中设置为6字节，这样一共14字节，我们一个页中能存放多少这样的单元，其实就代表有多少指针，即16384/14=1170。
根据同样的原理我们可以算出一个高度为2，3，4的B+树能够存放的记录数。
        
          
        
        
        
          
          AI 代码解读

mysql> select count(*) from sbtest1;
+----------+
| count(*) |
+----------+
| 26313272 |
+----------+
1 row in set (4.23 sec)
        
          
        
        
        
          
          AI 代码解读

我们的表一共是3层。

方法三

mysql> SELECT b.name, a.name, index_id, type, a.space, a.PAGE_NO FROM information_schema.INNODB_SYS_INDEXES a, information_schema.INNODB_SYS_TABLES b WHERE a.table_id = b.table_id AND a.space <> 0 and b.name='test/sbtest1';
+--------------+----------------+----------+------+-------+---------+
| name         | name           | index_id | type | space | PAGE_NO |
+--------------+----------------+----------+------+-------+---------+
| test/sbtest1 | PRIMARY        |       74 |    3 |    45 |       3 |
| test/sbtest1 | k_1            |       75 |    0 |    45 |       4 |
| test/sbtest1 | idx_is_used    |       76 |    0 |    45 |       5 |
| test/sbtest1 | idx_gmt_create |       77 |    0 |    45 |       6 |
+--------------+----------------+----------+------+-------+---------+
4 rows in set (0.00 sec)
        
          
        
        
        
          
          AI 代码解读

primary key的高度是3，其他索引的可以看上表。

方法四

因为主键索引B+树的根页在整个表空间文件中的第3个页开始，所以可以算出它在文件中的偏移量：16384*3=49152（16384为页大小）。

另外根据《InnoDB存储引擎》中描述在根页的64偏移量位置前2个字节，保存了page level的值，因此我们想要的page level的值在整个文件中的偏移量为：16384*3+64=49152+64=49216，前2个字节中。

接下来我们用hexdump工具，查看表空间文件指定偏移量上的数据：

[root@localhost test]# hexdump -s 49216 -n 10 sbtest1.ibd
000c040 0200 0000 0000 0000 4a00
000c04a
        
          
        
        
        
          
          AI 代码解读