找到 mysql 数据库中的不良索引-阿里云开发者社区

找到 mysql 数据库中的不良索引

2017-05-02 1295

版权

本文内容由阿里云实名注册用户自发贡献，版权归原作者所有，阿里云开发者社区不拥有其著作权，亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容，填写侵权投诉表单进行举报，一经查实，本社区将立刻删除涉嫌侵权内容。

本文涉及的产品

RDS MySQL Serverless 基础系列，0.5-2RCU 50GB

云数据库 RDS MySQL，集群系列 2核4GB

云数据库 RDS PostgreSQL，集群系列 2核4GB

简介：

为了演示，首先建两个包含不良索引的表，并弄点数据。

 
 

  
  mysql> show create table test1\G

  
  *************************** 1. row ***************************

  
         Table: test1

  
  Create Table: CREATE TABLE `test1` (

  
    `id` int(11) NOT NULL,

  
    `f1` int(11) DEFAULT NULL,

  
    `f2` int(11) DEFAULT NULL,

  
    `f3` int(11) DEFAULT NULL,

  
    PRIMARY KEY (`id`),

  
    KEY `k1` (`f1`,`id`),

  
    KEY `k2` (`id`,`f1`),

  
    KEY `k3` (`f1`),

  
    KEY `k4` (`f1`,`f3`),

  
    KEY `k5` (`f1`,`f3`,`f2`)

  
  ) ENGINE=InnoDB DEFAULT CHARSET=latin1

  
  1 row in set (0.00 sec)

  
  

  
  mysql> show create table test2\G

  
  *************************** 1. row ***************************

  
         Table: test2

  
  Create Table: CREATE TABLE `test2` (

  
    `id1` int(11) NOT NULL DEFAULT '0',

  
    `id2` int(11) NOT NULL DEFAULT '0',

  
    `b` int(11) DEFAULT NULL,

  
    PRIMARY KEY (`id1`,`id2`),

  
    KEY `k1` (`b`)

  
  ) ENGINE=InnoDB DEFAULT CHARSET=latin1

  
  1 row in set (0.00 sec)

  
  

  
  mysql> select count(*) from test2 group by b;                                                                                                        

  
  +----------+

  
  | count(*) |

  
  +----------+

  
  |       32 |

  
  |       17 |

  
  +----------+

  
  2 rows in set (0.00 sec)

  
  
 
 

        
          
        
        
        
          
          AI 代码解读

1. 包含主键的索引

innodb 本身是聚簇表，每个二级索引本身就包含主键，类似 f1, id 的索引实际虽然没有害处，但反映了使用者对 mysql 索引不了解。而类似 id, f1 的是多余索引，会浪费存储空间，并影响数据更新性能。包含主键的索引用这样一句 sql 就能全部找出来。

 
 

  
  mysql> select c.*, pk from 

  
      ->   (select table_schema, table_name, index_name, concat('|', group_concat(column_name order by seq_in_index separator '|'), '|') cols 

  
      ->     from INFORMATION_SCHEMA.STATISTICS 

  
      ->     where index_name != 'PRIMARY' and table_schema != 'mysql'

  
      -> group by table_schema, table_name, index_name) c,

  
      ->   (select table_schema, table_name, concat('|', group_concat(column_name order by seq_in_index separator '|'), '|') pk 

  
      ->     from INFORMATION_SCHEMA.STATISTICS 

  
      ->     where index_name = 'PRIMARY' and table_schema != 'mysql'

  
      -> group by table_schema, table_name) p  

  
      -> where c.table_name = p.table_name and c.table_schema = p.table_schema and c.cols like concat('%', pk, '%');

  
  +--------------+------------+------------+---------+------+

  
  | table_schema | table_name | index_name | cols    | pk   |

  
  +--------------+------------+------------+---------+------+

  
  | test         | test1      | k1         | |f1|id| | |id| |

  
  | test         | test1      | k2         | |id|f1| | |id| |

  
  +--------------+------------+------------+---------+------+

  
  2 rows in set (0.04 sec)

  
  
 
 

        
          
        
        
        
          
          AI 代码解读

2. 重复索引前缀

包含重复前缀的索引，索引能由另一个包含该前缀的索引完全代替，是多余索引。多余的索引会浪费存储空间，并影响数据更新性能。这样的索引同样用一句 sql 可以找出来。

 
 

  
  mysql> select c1.table_schema, c1.table_name, c1.index_name,c1.cols,c2.index_name, c2.cols from

  
      ->   (select table_schema, table_name, index_name, concat('|', group_concat(column_name order by seq_in_index separator '|'), '|') cols 

  
      ->     from INFORMATION_SCHEMA.STATISTICS 

  
      ->     where table_schema != 'mysql' and index_name!='PRIMARY'

  
      -> group by table_schema,table_name,index_name) c1,   

  
      ->   (select table_schema, table_name,index_name, concat('|', group_concat(column_name order by seq_in_index separator '|'), '|') cols 

  
      ->     from INFORMATION_SCHEMA.STATISTICS 

  
      ->     where table_schema != 'mysql' and index_name != 'PRIMARY'

  
      -> group by table_schema, table_name, index_name) c2 

  
      -> where c1.table_name = c2.table_name and c1.table_schema = c2.table_schema and c1.cols like concat(c2.cols, '%') and c1.index_name != c2.index_name;

  
  +--------------+------------+------------+------------+------------+---------+

  
  | table_schema | table_name | index_name | cols       | index_name | cols    |

  
  +--------------+------------+------------+------------+------------+---------+

  
  | test         | test1      | k1         | |f1|id|    | k3         | |f1|    |

  
  | test         | test1      | k4         | |f1|f3|    | k3         | |f1|    |

  
  | test         | test1      | k5         | |f1|f3|f2| | k3         | |f1|    |

  
  | test         | test1      | k5         | |f1|f3|f2| | k4         | |f1|f3| |

  
  +--------------+------------+------------+------------+------------+---------+

  
  4 rows in set (0.02 sec)

  
  
 
 

        
          
        
        
        
          
          AI 代码解读

3. 低区分度索引

这样的索引由于仍然会扫描大量记录，在实际查询时通常会被忽略。但是在某些情况下仍然是有用的。因此需要根据实际情况进一步分析。这里是区分度小于 10% 的索引，可以根据需要调整参数。

 
 

  
  mysql> select p.table_schema, p.table_name, c.index_name, c.car, p.car total from

  
      ->   (select table_schema, table_name, index_name, max(cardinality) car

  
      ->     from INFORMATION_SCHEMA.STATISTICS

  
      -> where index_name != 'PRIMARY'

  
      -> group by table_schema, table_name,index_name) c,

  
      ->   (select table_schema, table_name, max(cardinality) car

  
      ->     from INFORMATION_SCHEMA.STATISTICS

  
      -> where index_name = 'PRIMARY' and table_schema != 'mysql'

  
      -> group by table_schema,table_name) p

  
      -> where c.table_name = p.table_name and c.table_schema = p.table_schema and p.car > 0 and c.car / p.car < 0.1;

  
  +--------------+------------+------------+------+-------+

  
  | table_schema | table_name | index_name | car  | total |

  
  +--------------+------------+------------+------+-------+

  
  | test         | test2      | k1         |    4 |    49 |

  
  +--------------+------------+------------+------+-------+

  
  1 row in set (0.04 sec)

  
  
 
 

        
          
        
        
        
          
          AI 代码解读

4. 复合主键

由于 innodb 是聚簇表，每个二级索引都会包含主键值。复合主键会造成二级索引庞大，而影响二级索引查询性能，并影响更新性能。同样需要根据实际情况进一步分析。

 
 

  
  mysql> select table_schema, table_name, group_concat(column_name order by seq_in_index separator ',') cols, max(seq_in_index) len

  
      ->    from INFORMATION_SCHEMA.STATISTICS

  
      ->    where index_name = 'PRIMARY' and table_schema != 'mysql'

  
      ->    group by table_schema, table_name having len>1;

  
  +--------------+------------+-----------------------------------+------+

  
  | table_schema | table_name | cols                              | len  |

  
  +--------------+------------+-----------------------------------+------+

  
  | test         | test2      | id1,id2                           |    2 |

  
  +--------------+------------+-----------------------------------+------+

  
  1 rows in set (0.01 sec)



  
  本文来自云栖社区合作伙伴“Linux中国”，原文发布日期：2015-08-18
 
 

        
          
        
        
        
          
          AI 代码解读

找到 mysql 数据库中的不良索引

1. 包含主键的索引

2. 重复索引前缀

3. 低区分度索引

4. 复合主键

热门文章

最新文章

相关课程

相关电子书

相关实验场景

推荐镜像

探索云世界

热门

云计算

大数据

云原生

人工智能

数据库

开发与运维

活动广场

任务中心

训练营

直播

乘风者计划

下载

镜像站

技术资料

找到 mysql 数据库中的不良索引

1. 包含主键的索引

2. 重复索引前缀

3. 低区分度索引

4. 复合主键

热门文章

最新文章

相关课程

相关电子书

相关实验场景

推荐镜像