060611G _optimizer_null_aware_antijoin

简介: [20170606]11G _optimizer_null_aware_antijoin.txt --//上午测试_optimizer_null_aware_antijoin,发现自己不经意间又犯了一个低级错误,做1个记录.

[20170606]11G _optimizer_null_aware_antijoin.txt

--//上午测试_optimizer_null_aware_antijoin,发现自己不经意间又犯了一个低级错误,做1个记录.

1.环境:
SYS@book> @ &r/ver1
PORT_STRING                    VERSION        BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx            11.2.0.4.0     Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

SYS@book> @ &r/hide _optimizer_null_aware_antijoin
NAME                           DESCRIPTION                   DEFAULT_VALUE SESSION_VALUE SYSTEM_VALUE
------------------------------ ----------------------------- ------------- ------------- ------------
_optimizer_null_aware_antijoin null-aware antijoin parameter TRUE          TRUE          TRUE

2.做一个测试例子:
SCOTT@book> create table t1 as select * from dba_objects;
Table created.

SCOTT@book> create table t2 as select * from dba_objects;
Table created.

SCOTT@book> select count(*) from t2;
  COUNT(*)
----------
     86996

SCOTT@book> select count(*) from t1;
  COUNT(*)
----------
     86995
--//分析表略.
--//因为T1,T2建立有先后,这样T2的记录数量比T1记录的数量多1.注意1个特点允许NULL(这样建立的表),如果你使用all_objects来建立就情况不同了:
--//你可以查看定义desc all_objects, desc dba_objects.
SCOTT@book> @ &r/desc t1
   Name            Null? Type
   --------------- ----- --------------
1 OWNER                 VARCHAR2(30)
2 OBJECT_NAME           VARCHAR2(128)
3 SUBOBJECT_NAME        VARCHAR2(30)
4 OBJECT_ID             NUMBER
5 DATA_OBJECT_ID        NUMBER
6 OBJECT_TYPE           VARCHAR2(19)
7 CREATED               DATE
8 LAST_DDL_TIME         DATE
9 TIMESTAMP             VARCHAR2(19)
10 STATUS                VARCHAR2(7)
11 TEMPORARY             VARCHAR2(1)
12 GENERATED             VARCHAR2(1)
13 SECONDARY             VARCHAR2(1)
14 NAMESPACE             NUMBER
15 EDITION_NAME          VARCHAR2(30)

SCOTT@book> select * from t1 where object_name='T2';
no rows selected

SCOTT@book> select * from t2 where object_name='T2';
OWNER  OBJECT_NAME SUBOBJECT_  OBJECT_ID DATA_OBJECT_ID OBJECT_TYPE CREATED             LAST_DDL_TIME       TIMESTAMP           STATUS  T G S  NAMESPACE EDITION_NAME
------ ----------- ---------- ---------- -------------- ----------- ------------------- ------------------- ------------------- ------- - - - ---------- ------------
SCOTT  T2                          90863          90863 TABLE       2017-06-06 16:12:27 2017-06-06 16:12:27 2017-06-06:16:12:27 VALID   N N N          1

SCOTT@book> select  * from t1 where OBJECT_ID=90863;
no rows selected

3.测试:
SCOTT@book> alter session set statistics_level=all;
Session altered.

SCOTT@book> select count(*) from t2 where object_id not in(select object_id from t1);
  COUNT(*)
----------
         0

SCOTT@book> select * from t2 where object_id not in(select object_id from t1);
no rows selected
--//why? 理论应该返回1行啊.

SCOTT@book> @ &r/dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  45ppus55apd1z, child number 1
-------------------------------------
select * from t2 where object_id not in(select object_id from t1)
Plan hash value: 35395643
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name | Starts | E-Rows |E-Bytes|E-Temp | Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |      |      1 |        |       |       |  1218 (100)|          |      0 |00:00:00.06 |    1244 |   1242 |       |       |          |
|*  1 |  HASH JOIN RIGHT ANTI NA|      |      1 |    870 | 89610 |  1448K|  1218   (1)| 00:00:15 |      0 |00:00:00.06 |    1244 |   1242 |  1696K|  1696K| 2503K (0)|
|   2 |   TABLE ACCESS FULL     | T1   |      1 |  86995 |   424K|       |   347   (1)| 00:00:05 |  86994 |00:00:00.02 |    1244 |   1242 |       |       |          |
|   3 |   TABLE ACCESS FULL     | T2   |      0 |  86996 |  8325K|       |   347   (1)| 00:00:05 |      0 |00:00:00.01 |       0 |      0 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$5DA710D3
   2 - SEL$5DA710D3 / T1@SEL$2
   3 - SEL$5DA710D3 / T2@SEL$1
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("OBJECT_ID"="OBJECT_ID")
27 rows selected.

--//这才想起来以前的一个错误,不能这样查询,因为IN中的记录存在NULL,这样NULL<>NULL.应该改写如下:

SCOTT@book> select * from t2 where object_id not in(select object_id from t1 where t1.object_id is not null );
OWNER  OBJECT_NAME          SUBOBJECT_  OBJECT_ID DATA_OBJECT_ID OBJECT_TYPE         CREATED             LAST_DDL_TIME       TIMESTAMP           STATUS  T G S  NAMESPACE EDITION_NAME
------ -------------------- ---------- ---------- -------------- ------------------- ------------------- ------------------- ------------------- ------- - - - ---------- ------------
SCOTT  T2                                   90863          90863 TABLE               2017-06-06 16:12:27 2017-06-06 16:12:27 2017-06-06:16:12:27 VALID   N N N          1

SCOTT@book> @ &r/dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  9cn6amuwk12r4, child number 0
-------------------------------------
select * from t2 where object_id not in(select object_id from t1 where
t1.object_id is not null )
Plan hash value: 2323664790
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name | Starts | E-Rows |E-Bytes|E-Temp | Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |      1 |        |       |       |  1218 (100)|          |      1 |00:00:00.15 |    2489 |   2484 |       |       |          |
|*  1 |  HASH JOIN RIGHT ANTI SNA|      |      1 |    870 | 89610 |  1448K|  1218   (1)| 00:00:15 |      1 |00:00:00.15 |    2489 |   2484 |  5416K|  2890K| 5574K (0)|
|*  2 |   TABLE ACCESS FULL      | T1   |      1 |  86993 |   424K|       |   347   (1)| 00:00:05 |  86993 |00:00:00.02 |    1244 |   1242 |       |       |          |
|   3 |   TABLE ACCESS FULL      | T2   |      1 |  86996 |  8325K|       |   347   (1)| 00:00:05 |  86996 |00:00:00.03 |    1245 |   1242 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$5DA710D3
   2 - SEL$5DA710D3 / T1@SEL$2
   3 - SEL$5DA710D3 / T2@SEL$1
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("OBJECT_ID"="OBJECT_ID")
   2 - filter("T1"."OBJECT_ID" IS NOT NULL)
--//注意执行计划中的HASH JOIN RIGHT ANTI SNA.

4.测试"_optimizer_null_aware_antijoin"=false看看.

SCOTT@book> alter session set "_optimizer_null_aware_antijoin"=false;
Session altered.

SCOTT@book> Select * from t2 where object_id not in(select object_id from t1 where t1.object_id is not null );
OWNER  OBJECT_NAME          SUBOBJECT_  OBJECT_ID DATA_OBJECT_ID OBJECT_TYPE         CREATED             LAST_DDL_TIME       TIMESTAMP           STATUS  T G S  NAMESPACE EDITION_NAME
------ -------------------- ---------- ---------- -------------- ------------------- ------------------- ------------------- ------------------- ------- - - - ---------- ------------
SCOTT  T2                                   90863          90863 TABLE               2017-06-06 16:12:27 2017-06-06 16:12:27 2017-06-06:16:12:27 VALID   N N N          1
--//昏,那个慢受不了.......

SCOTT@book> @ &r/dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  4ps64yjz4g7pu, child number 0
-------------------------------------
Select * from t2 where object_id not in(select object_id from t1 where
t1.object_id is not null )
Plan hash value: 2132554994
------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers | Reads  |
------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |       |  8847K(100)|          |      1 |00:04:06.56 |      53M|     67M|
|*  1 |  FILTER            |      |      1 |        |       |            |          |      1 |00:04:06.56 |      53M|     67M|
|   2 |   TABLE ACCESS FULL| T2   |      1 |  86996 |  8325K|   347   (1)| 00:00:05 |  86996 |00:00:00.05 |    1245 |   1242 |
|*  3 |   TABLE ACCESS FULL| T1   |  86995 |      2 |    10 |   116   (0)| 00:00:02 |  86994 |00:04:05.95 |      53M|     67M|
------------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$1
   2 - SEL$1 / T2@SEL$1
   3 - SEL$2 / T1@SEL$2
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter( IS NULL)
   3 - filter(("T1"."OBJECT_ID" IS NOT NULL AND LNNVL("OBJECT_ID"<>:B1)))
29 rows selected.

3.换1个方式:

SCOTT@book> create table t11 as select * from all_objects;
Table created.

SCOTT@book> create table t22 as select * from all_objects;
Table created.
--//分析表略.

SCOTT@book> @ &r/desc t11
Name           Null?    Type
-------------- -------- ------------
OWNER          NOT NULL VARCHAR2(30)
OBJECT_NAME    NOT NULL VARCHAR2(30)
SUBOBJECT_NAME          VARCHAR2(30)
OBJECT_ID      NOT NULL NUMBER
DATA_OBJECT_ID          NUMBER
OBJECT_TYPE             VARCHAR2(19)
CREATED        NOT NULL DATE
LAST_DDL_TIME  NOT NULL DATE
TIMESTAMP               VARCHAR2(19)
STATUS                  VARCHAR2(7)
TEMPORARY               VARCHAR2(1)
GENERATED               VARCHAR2(1)
SECONDARY               VARCHAR2(1)
NAMESPACE      NOT NULL NUMBER
EDITION_NAME            VARCHAR2(30)

--// 注意这样定义的表OBJECT_ID是非空.

SCOTT@book> select * from t22 where object_id not in(select object_id from t11);
OWNER  OBJECT_NAME          SUBOBJECT_  OBJECT_ID DATA_OBJECT_ID OBJECT_TYPE         CREATED             LAST_DDL_TIME       TIMESTAMP           STATUS  T G S  NAMESPACE EDITION_NAME
------ -------------------- ---------- ---------- -------------- ------------------- ------------------- ------------------- ------------------- ------- - - - ---------- ------------
SCOTT  T22                                  90869          90869 TABLE               2017-06-06 16:33:29 2017-06-06 16:33:29 2017-06-06:16:33:29 VALID   N N N          1

SCOTT@book> @ &r/dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  31yq53mnd3g7k, child number 1
-------------------------------------
select * from t22 where object_id not in(select object_id from t11)
Plan hash value: 3941821364
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name | Starts | E-Rows |E-Bytes|E-Temp | Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |      1 |        |       |       |  1187 (100)|          |      1 |00:00:00.15 |    2427 |   2422 |       |       |          |
|*  1 |  HASH JOIN RIGHT ANTI|      |      1 |    848 | 87344 |  1408K|  1187   (1)| 00:00:15 |      1 |00:00:00.15 |    2427 |   2422 |  5236K|  2890K| 5538K (0)|
|   2 |   TABLE ACCESS FULL  | T11  |      1 |  84763 |   413K|       |   338   (1)| 00:00:05 |  84763 |00:00:00.02 |    1213 |   1211 |       |       |          |
|   3 |   TABLE ACCESS FULL  | T22  |      1 |  84764 |  8112K|       |   338   (1)| 00:00:05 |  84764 |00:00:00.03 |    1214 |   1211 |       |       |          |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$5DA710D3
   2 - SEL$5DA710D3 / T11@SEL$2
   3 - SEL$5DA710D3 / T22@SEL$1
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("OBJECT_ID"="OBJECT_ID")

--//注意执行计划HASH JOIN RIGHT ANTI.我想起来以前10g我也遇到类型问题,将数据类型修改为NULL.

4.退出测试使用not exists的情况:

SCOTT@book> select * from t2 where not exists  (select 1 from t1  where t1.OBJECT_ID=t2.object_id);
OWNER  OBJECT_NAME SUBOBJECT_  OBJECT_ID DATA_OBJECT_ID OBJECT_TYPE         CREATED             LAST_DDL_TIME       TIMESTAMP           STATUS  T G S  NAMESPACE EDITION_NAME
------ ----------- ---------- ---------- -------------- ------------------- ------------------- ------------------- ------------------- ------- - - - ---------- ------------
SCOTT  T2                          90863          90863 TABLE               2017-06-06 16:12:27 2017-06-06 16:12:27 2017-06-06:16:12:27 VALID   N N N          1
PUBLIC LOOPBACK                                         DATABASE LINK       2016-12-14 15:17:00                                         VALID   N N N
PUBLIC TEST040                                          DATABASE LINK       2017-01-09 09:14:26                                         VALID   N N N
--//这样返回3条.也就是这样查询注意空值的情况.也就是要加入t2.object_id is not null 条件,才会返回1条.

SCOTT@book> @ &r/dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  4jx0g5ndct9vk, child number 0
-------------------------------------
select * from t2 where not exists  (select 1 from t1  where
t1.OBJECT_ID=t2.object_id)
Plan hash value: 1142061071
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name | Starts | E-Rows |E-Bytes|E-Temp | Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |      1 |        |       |       |  1218 (100)|          |      3 |00:00:00.14 |    2489 |   2484 |       |       |          |
|*  1 |  HASH JOIN RIGHT ANTI|      |      1 |    870 | 89610 |  1448K|  1218   (1)| 00:00:15 |      3 |00:00:00.14 |    2489 |   2484 |  5416K|  2890K| 5759K (0)|
|   2 |   TABLE ACCESS FULL  | T1   |      1 |  86995 |   424K|       |   347   (1)| 00:00:05 |  86995 |00:00:00.02 |    1244 |   1242 |       |       |          |
|   3 |   TABLE ACCESS FULL  | T2   |      1 |  86996 |  8325K|       |   347   (1)| 00:00:05 |  86996 |00:00:00.02 |    1245 |   1242 |       |       |          |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$5DA710D3
   2 - SEL$5DA710D3 / T1@SEL$2
   3 - SEL$5DA710D3 / T2@SEL$1
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")

SCOTT@book> select * from t1 where object_id is null ;
OWNER  OBJECT_NAME          SUBOBJECT_  OBJECT_ID DATA_OBJECT_ID OBJECT_TYPE         CREATED             LAST_DDL_TIME       TIMESTAMP           STATUS  T G S  NAMESPACE EDITION_NAME
------ -------------------- ---------- ---------- -------------- ------------------- ------------------- ------------------- ------------------- ------- - - - ---------- ------------
PUBLIC LOOPBACK                                                  DATABASE LINK       2016-12-14 15:17:00                                         VALID   N N N
PUBLIC TEST040                                                   DATABASE LINK       2017-01-09 09:14:26                                         VALID   N N N
--//正好2条object_id is null.

5.总结:
1.再次注意not in 与NULL的查询.
2.建立良好的约束,规避一些这样的问题.

目录
相关文章
|
1月前
|
Oracle 关系型数据库
Adaptive Query Optimization
Adaptive Query Optimization
17 4
|
2月前
|
自然语言处理
【论文代码】① STIOCS: Active learning-based semi-supervised training framework for IOC extraction
【论文代码】① STIOCS: Active learning-based semi-supervised training framework for IOC extraction
21 0
|
机器学习/深度学习 算法 数据挖掘
【多标签文本分类】Improved Neural Network-based Multi-label Classification with Better Initialization ……
【多标签文本分类】Improved Neural Network-based Multi-label Classification with Better Initialization ……
【多标签文本分类】Improved Neural Network-based Multi-label Classification with Better Initialization ……
sbs
|
存储 SQL 人工智能
The Volcano Optimizer Generator: Extensibility and Efficient Search 论文翻译
原文:The Volcano Optimizer Generator: Extensibility and Efficient SearchThe Volcano Optimizer Generator: Extensibility and Efficient Search 论文翻译。2023.01.25 —— by zz【中括号内为译者注】对原文部分关键术语,或重点句有加粗。便于定位。为了避免英
sbs
181 0
The Volcano Optimizer Generator: Extensibility and Efficient Search 论文翻译
|
数据可视化 算法 数据挖掘
Evaluation of Clustering|学习笔记
快速学习 Evaluation of Clustering
120 0
Evaluation of Clustering|学习笔记
|
算法 大数据 数据库
The Volcano Optimizer Generator : Extensibility and Efficient Search
数据库查询优化器的搜索技术,基本上分为了基于动态规划的bottom-up search(详见文章System-R论文解读)和基于Cascades/Volcano的top-down search两个流派,这篇文章我们来看一下Cascades的前身也就是Volcano优化器框架的paper。
412 0
The Volcano Optimizer Generator : Extensibility and Efficient Search
|
存储 负载均衡 算法
Morsel-Driven Parallelism: A NUMA-Aware Query Evaluation Framework for the Many-Core Age 论文解读
这篇paper介绍了TUM的内存数据库系统HyPer中使用的,基于小块数据(morsel)来驱动的并行查询执行框架。用来解决many-cores(NUMA)环境下,分析型查询的scalability问题。
1186 0
Morsel-Driven Parallelism: A NUMA-Aware Query Evaluation Framework for the Many-Core Age 论文解读
|
SQL 监控 算法
Adaptive Execution of Compiled Queries 论文解读
本篇是TUM的内存数据库HyPer针对compile-based执行框架的改进。其中涉及到HyPer的动态编译和并行执行框架 动态编译文章的结尾提到了编译执行系统存在的2个问题,其中之一就是:不可控的编译时间。
422 0
Adaptive Execution of Compiled Queries 论文解读
|
机器学习/深度学习 人工智能 搜索推荐
Sequential Recommendation with Self-Attentive Multi-Adversarial Network
回顾推荐系统的发展,离不开模型、数据、训练方式三个维度的创新。模型层面,基于内容的推荐系统到协同过滤的矩阵分解,以及神经网络带来的序列化建模,使得用户表征和商品表征刻画越来越精细;数据层面,长短期行为的切分,基于session的推荐[1],跨场景行为的引入,数据的丰富和建模为用户兴趣的挖掘提供更多的可能;训练方式上,分布式训练框架,在线学习,高维稀疏特征处理,优化器设计,从而支撑商业化推荐系统。然而,性能提升的同时,我们也看到推荐系统的可解释性逐渐变成黑盒。用户的兴趣偏好如何表征,兴趣如何演变,不同时间点的历史行为由哪种因素主导,这一系列的问题都对现代化推荐系统的解释提出了挑战。
Sequential Recommendation with Self-Attentive  Multi-Adversarial Network