一、概述
这是我们SQL优化班的一个学员,据说该SQL在生产环境中已经运行了20个小时,快把服务器的磁盘资源耗尽了。这20个小时,我们可爱的学员就是靠着删除一些不重要的文件才能够勉强度过。
据了解,该SQL为一个月运行一次的跑报表的SQL,主要问题是随着SQL的运行时间越来越长,所需的临时表空间也越来越大,导致磁盘资源用尽。
二、先看慢SQL
● SQL 执行时长 ,目前已经20小时,还在继续执行-
SELECT
-
COUNT( * ) AS totalNum,
-
sum( CASE WHEN F.ALARM_LEVEL = 1 THEN 1 ELSE 0 END ) AS LEVELS1,
-
sum( CASE WHEN F.ALARM_LEVEL = 2 THEN 1 ELSE 0 END ) AS LEVELS2,
-
sum( CASE WHEN F.ALARM_LEVEL = 3 THEN 1 ELSE 0 END ) AS LEVELS3,
-
sum( CASE WHEN F.DEAL_STATE = 0 THEN 1 ELSE 0 END ) AS DESTS
-
FROM
-
F
-
LEFT JOIN DC ON DC.ID = F.CONST_ID
-
LEFT JOIN V ON V.ID = F.VEHICLE_ID
-
LEFT JOIN AREA ON AREA.ID = V.SYS_DIVISION_ID
-
WHERE
-
DC.ID IS NOT NULL
-
AND V.ID IS NOT NULL
-
AND F.DEAL_STATE = 0
-
AND ALARM_LEVEL IN ( 1, 2, 3 )
-
AND F.VEHICLE_ID IN (
-
SELECT
-
VEHICLE_ID
-
FROM
-
GVLK
-
WHERE
-
GROUP_ID IN ( SELECT GROUP_ID FROM GULK WHERE USER_ID = 'ff8080816091b09c0161f9b825750a9a' )
-
UNION
-
SELECT
-
VEHICLE_ID
-
FROM
-
UVLK
-
WHERE
-
USER_ID = 'ff8080816091b09c0161f9b825750a9a'
-
)
-
AND date( F.ALARM_TIME ) BETWEEN '2000-01-01'
-
AND '2018-08-14'
-
AND AREA.PATH LIKE CONCAT( ( SELECT ARE.PATH FROM ARE WHERE ARE.ID = '0' ), '%' )
上述SQL的含义:将F左连DC ,V,AREA表的结果进行where过滤,where中存在子查询,并且还有like函数
● 查看执行计划
-
SELECT
-
COUNT( * ) AS totalNum
-
FROM
-
F
-
WHERE F.DEAL_STATE = 0
-
AND ALARM_LEVEL IN ( 1, 2, 3 )
-
AND F.ALARM_TIME BETWEEN '2000-01-01' AND '2018-08-14'
上面的sql 运行10s 结果集为393653条数据,说明where 条件中的过滤条件的选择率不是很好 不适合创建索引。
● 验证子查询的过滤性-
SELECT
-
VEHICLE_ID
-
FROM
-
GVLK
-
WHERE GROUP_ID IN ( SELECT GROUP_ID FROM GULK WHERE USER_ID = 'ff8080816091b09c0161f9b825750a9a' )
-
UNION
-
SELECT
-
VEHICLE_ID
-
FROM
-
UVLK
-
WHERE USER_ID = 'ff8080816091b09c0161f9b825750a9a'
上面的SQL 运行了0.5s 结果为1200条 。in里的结果速度快 结果集很小 ,F表 就该结果进行in操作,也会有大幅度的过滤。
● DEPEND SUBQUERY 改写-
SELECT
-
COUNT( * ) AS totalNum
-
FROM
-
F
-
join (
-
SELECT
-
VEHICLE_ID
-
FROM
-
GVLK
-
WHERE
-
GROUP_ID IN ( SELECT GROUP_ID FROM GULK WHERE USER_ID = 'ff8080816091b09c0161f9b825750a9a' )
-
UNION
-
SELECT
-
VEHICLE_ID
-
FROM
-
UVLK
-
WHERE
-
USER_ID = 'ff8080816091b09c0161f9b825750a9a'
-
) s
-
WHERE F.DEAL_STATE = 0
-
AND ALARM_LEVEL IN ( 1, 2, 3 )
-
AND F.VEHICLE_ID = s.VEHICLE_ID
-
AND F.ALARM_TIME BETWEEN '2000-01-01' AND '2018-08-14'
最后运行了如下sql 结果集为 88696 速度为0.5s。采用join的方式替代in的方式,因为 DEPEND SUBQUERY是依赖于SQL的主体部分,执行的次数与被依赖表结果集一致。
四、SQL优化结果
-
explain extended
-
SELECT
-
COUNT( * ) AS totalNum,
-
sum( CASE WHEN F.ALARM_LEVEL = 1 THEN 1 ELSE 0 END ) AS LEVELS1,
-
sum( CASE WHEN F.ALARM_LEVEL = 2 THEN 1 ELSE 0 END ) AS LEVELS2,
-
sum( CASE WHEN F.ALARM_LEVEL = 3 THEN 1 ELSE 0 END ) AS LEVELS3,
-
sum( CASE WHEN F.DEAL_STATE = 0 THEN 1 ELSE 0 END ) AS DESTS
-
FROM
-
F
-
straight_join (
-
SELECT
-
VEHICLE_ID
-
FROM
-
GVLK
-
WHERE
-
GROUP_ID IN ( SELECT GROUP_ID FROM GULK WHERE USER_ID = 'ff8080816091b09c0161f9b825750a9a' )
-
UNION
-
SELECT
-
VEHICLE_ID
-
FROM
-
UVLK
-
WHERE
-
USER_ID = 'ff8080816091b09c0161f9b825750a9a'
-
) s on F.VEHICLE_ID = s.VEHICLE_ID
-
straight_join DC ON DC.ID = F.CONST_ID
-
straight_join V ON V.ID = F.VEHICLE_ID
-
straight_join AREA ON AREA.ID = V.SYS_DIVISION_ID
-
WHERE
-
DC.ID IS NOT NULL
-
AND V.ID IS NOT NULL
-
AND F.DEAL_STATE = 0
-
AND ALARM_LEVEL IN ( 1, 2, 3 )
-
AND date( F.ALARM_TIME ) BETWEEN '2000-01-01'
-
AND '2018-08-14'
-
AND AREA.PATH LIKE CONCAT( ( SELECT ARE.PATH FROM ARE WHERE ARE.ID = '0' ), '%' )
1、使用join 代替in的方式;
2、修改后的语句可以使用到索引,索引为F.const_id,table为。这里值得一提的是,采用straight_join 代替 了 join,保证了SQL执行顺序一定是按照我们SQL书写的顺序。
四、后记
本文主要在于优化DEPEND SUBQUERY,另外让SQL能够用得上索引,让SQL的速度有着显著的提升。
原文发布时间为:2018-10-31
本文作者:郑松华老师&小鹿