库表字符集不一致导致的全表扫描问题

简介:
背景:
当数据库的建库字符集和表不一样时,在库下针对表创建存储过程可能导致全表扫描

如下例:
drop database if exists xx1;
drop database if exists xx2;
create database xx1 character set utf8;    
create database xx2 character set gbk;

 

然后分别在xx1 和 xx2下执行:
CREATE TABLE t1 ( `col1` varchar(10) NOT NULL, `col2` bigint(11) NOT NULL DEFAULT '0', PRIMARY KEY (`col1`)) ENGINE=InnoDB DEFAULT CHARSET=gbk;
insert into t1 values ('ac',2),('del',3),('sp',5),('tr',12);
DELIMITER //
CREATE PROCEDURE proc1(IN isn CHAR(20))  
BEGIN  
     START TRANSACTION;   
     SELECT col2 FROM t1 WHERE col1= isn FOR UPDATE;  
     COMMIT;
END //
DELIMITER ;

 

在5.5版本中通过show processlist可以看到SQL进行了转换(简单的锁住行记录,执行存储过程,再show processlist):
use xx1;
call proc1(‘tr’); 参数isn会被展开成:NAME_CONST(‘isn’,_utf8 0x7472 COLLATE ‘utf8_general_ci’)
use xx2;
call proc1(‘tr’); 参数isn会被展开成:NAME_CONST(‘isn’,_gbk 0x7472 COLLATE ‘gbk_chinese_ci’)

在5.6版本里使用ROW模式时,不会展开sql,但内部采取的是同样的策略。
由于name_const中引用的的字符集和表定义不同,因此xx1.proc1()每次调用都会产生两次全表扫描。但奇怪的是,如果单独执行上述SQL(而不是调用存储过程),就不会全表扫描。

还好MySQL5.6开始有了optimizer_trace这个利器,打开optimizer trace就一目了然了,对比单独执行SQL和执行存储过程:
call xx1.proc1(‘tr’)
| SELECT col2 FROM t1 WHERE col1=  NAME_CONST(‘isn’,_utf8’tr’ COLLATE ‘utf8_general_ci’) FOR UPDATE | {
“steps”: [
{
“join_preparation”: {
“select#”: 1,
“steps”: [
{
           “expanded_query”: “/* select#1 */ select `t1`.`col2` AS `col2` from `t1` where (convert(`t1`.`col1` using utf8) = isn@0)”
}
]
}
},
{
“join_optimization”: {
“select#”: 1,
“steps”: [
{
“condition_processing”: {
“condition”: “WHERE”,
“original_condition”: “(convert(`t1`.`col1` using utf8) = isn@0)”,
“steps”: [
{
“transformation”: “equality_propagation”,
“resulting_condition”: “(convert(`t1`.`col1` using utf8) = isn@0)”
},
{
“transformation”: “constant_propagation”,
“resulting_condition”: “(convert(`t1`.`col1` using utf8) = isn@0)”
},
{
“transformation”: “trivial_condition_removal”,
“resulting_condition”: “(convert(`t1`.`col1` using utf8) = isn@0)”
}
]
}
},
{
“table_dependencies”: [
{
“table”: “`t1`”,
“row_may_be_null”: false,
“map_bit”: 0,
“depends_on_map_bits”: [
]
}
]
},
{
“ref_optimizer_key_uses”: [
]
},
{
“rows_estimation”: [
{
“table”: “`t1`”,
“table_scan”: {
“rows”: 4,
“cost”: 1
}
}
]
},
{
“considered_execution_plans”: [
{
“plan_prefix”: [
],
“table”: “`t1`”,
“best_access_path”: {
“considered_access_paths”: [
{
“access_type”: “scan”,
“rows”: 4,
“cost”: 1.8,
“chosen”: true
}
]
},
“cost_for_plan”: 1.8,
“rows_for_plan”: 4,
“chosen”: true
}
]
},
{
“attaching_conditions_to_tables”: {
“original_condition”: “(convert(`t1`.`col1` using utf8) = isn@0)”,
“attached_conditions_computation”: [
],
“attached_conditions_summary”: [
{
“table”: “`t1`”,
“attached”: “(convert(`t1`.`col1` using utf8) = isn@0)”
}
]
}
},
{
“refine_plan”: [
{
“table”: “`t1`”,
“access_type”: “table_scan”
}
]
}
]
}
},
{
“join_execution”: {
“select#”: 1,
“steps”: [
]
}
}
]
}
call xx2.proc1(‘tr’)
SELECT col2 FROM t1 WHERE col1=  NAME_CONST(‘isn’,_gbk’tr’ COLLATE ‘gbk_chinese_ci’) FOR UPDATE | {
“steps”: [
{
“join_preparation”: {
“select#”: 1,
“steps”: [
{
“expanded_query”: “/* select#1 */ select `t1`.`col2` AS `col2` from `t1` where (`t1`.`col1` = isn@0)”
}
]
}
},
{
“join_optimization”: {
“select#”: 1,
“steps”: [
{
“condition_processing”: {
“condition”: “WHERE”,
“original_condition”: “(`t1`.`col1` = isn@0)”,
“steps”: [
{
“transformation”: “equality_propagation”,
“resulting_condition”: “multiple equal(isn@0, `t1`.`col1`)”
},
{
“transformation”: “constant_propagation”,
“resulting_condition”: “multiple equal(isn@0, `t1`.`col1`)”
},
{
“transformation”: “trivial_condition_removal”,
“resulting_condition”: “multiple equal(isn@0, `t1`.`col1`)”
}
]
}
},
{
“table_dependencies”: [
{
“table”: “`t1`”,
“row_may_be_null”: false,
“map_bit”: 0,
“depends_on_map_bits”: [
]
}
]
},
{
“ref_optimizer_key_uses”: [
{
“table”: “`t1`”,
“field”: “col1″,
“equals”: “isn@0″,
“null_rejecting”: false
}
]
},
{
“rows_estimation”: [
{
“table”: “`t1`”,
“rows”: 1,
“cost”: 1,
“table_type”: “const”,
“empty”: false
}
]
},
{
“condition_on_constant_tables”: “1”,
“condition_value”: true
},
{
“attaching_conditions_to_tables”: {
“original_condition”: “1”,
“attached_conditions_computation”: [
],
“attached_conditions_summary”: [
]
}
},
{
“refine_plan”: [
]
}
]
}
},
{
“join_execution”: {
“select#”: 1,
“steps”: [
]
}
}
]
}
use xx1;

SELECT col2 FROM t1 WHERE col1=
NAME_CONST
(‘isn’,_utf8’tr’ COLLATE ‘utf8_general_ci’)
 FOR UPDATE
 SELECT col2 FROM t1 WHERE col1=  NAME_CONST(‘isn’,_utf8’tr’ COLLATE ‘utf8_general_ci’) FOR UPDATE | {

  “steps”: [

    {

      “join_preparation”: {

        “select#”: 1,

        “steps”: [

          {

            “expanded_query”: “/* select#1 */ select `t1`.`col2` AS `col2` from `t1` where (`t1`.`col1` = convert(NAME_CONST(‘isn’,(_utf8’tr’ collate utf8_general_ci)) using gbk))”

          }

        ]

      }

    },

    {

      “join_optimization”: {

        “select#”: 1,

        “steps”: [

          {

            “condition_processing”: {

              “condition”: “WHERE”,

              “original_condition”: “(`t1`.`col1` = convert(NAME_CONST(‘isn’,(_utf8’tr’ collate utf8_general_ci)) using gbk))”,

              “steps”: [

                {

                  “transformation”: “equality_propagation”,

                  “resulting_condition”: “multiple equal(convert(NAME_CONST(‘isn’,(_utf8’tr’ collate utf8_general_ci)) using gbk), `t1`.`col1`)”

                },

                {

                  “transformation”: “constant_propagation”,

                  “resulting_condition”: “multiple equal(convert(NAME_CONST(‘isn’,(_utf8’tr’ collate utf8_general_ci)) using gbk), `t1`.`col1`)”

                },

                {

                  “transformation”: “trivial_condition_removal”,

                  “resulting_condition”: “multiple equal(convert(NAME_CONST(‘isn’,(_utf8’tr’ collate utf8_general_ci)) using gbk), `t1`.`col1`)”

                }

              ]

            }

          },

          {

            “table_dependencies”: [

              {

                “table”: “`t1`”,

                “row_may_be_null”: false,

                “map_bit”: 0,

                “depends_on_map_bits”: [

                ]

              }

            ]

          },

          {

            “ref_optimizer_key_uses”: [

              {

                “table”: “`t1`”,

                “field”: “col1″,

                “equals”: “convert(NAME_CONST(‘isn’,(_utf8’tr’ collate utf8_general_ci)) using gbk)”,

                “null_rejecting”: false

              }

            ]

          },

          {

            “rows_estimation”: [

              {

                “table”: “`t1`”,

                “rows”: 1,

                “cost”: 1,

                “table_type”: “const”,

                “empty”: false

              }

            ]

          },

          {

            “condition_on_constant_tables”: “1”,

            “condition_value”: true

          },

          {

            “attaching_conditions_to_tables”: {

              “original_condition”: “1”,

              “attached_conditions_computation”: [

              ],

              “attached_conditions_summary”: [

              ]

            }

          },

          {

            “refine_plan”: [

            ]

          }

        ]

      }

    },

    {

      “join_execution”: {

        “select#”: 1,

        “steps”: [

        ]

      }

    }

  ]
}

 

 


注意这expanded_query一项,对于这种方式的查询,需要读取每一条记录,进行字符集转换再比较:
expanded_query”: “/* select#1 */ select `t1`.`col2` AS `col2` from `t1` where (convert(`t1`.`col1` using utf8) = isn@0)

由于本人对优化器模块了解不深,也不确定是否是Bug,先Report到了一个小bug到Bug list,ref:http://bugs.mysql.com/bug.php?id=76002
Updated @2015/2/25, 存储过程的输入列需要显式的定义输入字符集,否则其输入列的字符集将继承自库的字符集。参考bug中的回复。

 


相关文章
|
7月前
|
SQL 监控 关系型数据库
PostgreSQL普通表转换成分区表
如何使用pg_rewrite扩展将普遍表转换成分区表
388 0
|
5月前
|
SQL 关系型数据库 MySQL
MySQL库表操作以及简单查询语句
MySQL库表操作以及简单查询语句
48 0
MySQL库表操作以及简单查询语句
|
11月前
|
SQL 存储 自然语言处理
表索引——全文索引
表索引——全文索引
|
SQL 算法 关系型数据库
sql调优之:字符集不一致导致的索引失效案例
生产上有一段sql,大概几百行,做的是inser into …select ;的操作,但是执行了一个多小时都出不来,正常情况下不会这么久,通过一步步定位,最后发现是其中某一段的join 特别慢导致的,后面对这段sql重点分析发现,这里的表竟然没有走索引,对于几十万以上的表之间的连接,若是没有索引那就是灾难性的,加不加索引就好比一个人力三轮与超跑的区别。
446 0
sql调优之:字符集不一致导致的索引失效案例
|
存储 缓存 算法
7.Mysql数据库表引擎与字符集
7.Mysql数据库表引擎与字符集
7.Mysql数据库表引擎与字符集
|
数据库 索引
数据库中的主键、外键、索引的区别
数据库中的主键、外键、索引的区别
392 1
|
SQL Web App开发 数据库
SQL查询某个数据库所有表名、表中字段名
一、查询某个数据库中所有数据表的名称 Select TABLE_NAME FROM DBTest02.INFORMATION_SCHEMA.TABLES Where TABLE_TYPE='BASE TABLE'   二、查询某个数据表中所有字段的信息 2.
1011 0
|
存储 SQL 关系型数据库
mysql批量修改存储引擎和字符集
mysql批量修改存储引擎和字符集
123 0
|
SQL 关系型数据库 MySQL
字符编码,竟然也是 SQL 不走索引的元凶之一。。
引言 某个忙(mo)碌(yu)的下午,旁边的刘哥(老江湖,从业5年+)突然发出了一声叹息:“哎,mysql 出bug了,有索引不走”。 作为一个热心的人,我立即说到:“是不是,对索引字段做了函数操作”。
60 0
字符编码,竟然也是 SQL 不走索引的元凶之一。。
|
关系型数据库 MySQL 数据库