PostgreSQL 执行计划缓存和失效讲解

本文涉及的产品
云原生数据库 PolarDB MySQL 版,Serverless 5000PCU 100GB
云原生数据库 PolarDB 分布式版,标准版 2核8GB
云数据库 RDS MySQL Serverless,0.5-2RCU 50GB
简介:
prepare statement的目的是让同样的SQL(无参数)或者类似的SQL(有参数)仅需要一次parse analyse, 因此减少了parse analyse的次数. 
同时使用 prepare statement 还有一个功效是cache plan, 也就是说plan的次数也可以减少 . 
本文介绍一下PostgreSQL的几种prepare statement的方法, 仅供参考.
首先是SPI, 如SPI_prepare, 语法如下 : 
 
 

SPIPlanPtr SPI_prepare(const char * command, int nargs, Oid * argtypes)
const char * command
command string

int nargs
number of input parameters ($1, $2, etc.)

Oid * argtypes
pointer to an array containing the OIDs of the data types of the parameters

注意SPI_prepare返回结果类型为SPIPlanPtr(一个指针), 用于存储plan的数据.
作为SPI_execute_plan或者SPI_execute_plan_with_paramlist 的参数传入 : 
因为SPI_prepare可以带参数(nargs,argtypes), 也可以不带参数. SPI_execute_plan建立generic plan 对带参数和不带参数的处理方式略有不同, 见下面的解释.
1, 无参数的情况.
 
  

If no parameters are defined, a generic plan will be created at the first use of SPI_execute_plan, and used for all subsequent executions as well. 

2, 有参数的情况.
 
  

If there are parameters, the first few uses of SPI_execute_plan will generate custom plans that are specific to the supplied parameter values. 
After enough uses of the same prepared statement, SPI_execute_plan will build a generic plan, and if that is not too much more expensive than the custom plans, it will start using the generic plan instead of re-planning each time.

另外就是使用如果使用prepared 游标, SPI_prepare_cursor, 可以指定flag. 如下 : 
 
  

If this default behavior is unsuitable, you can alter it by passing the CURSOR_OPT_GENERIC_PLAN or CURSOR_OPT_CUSTOM_PLAN flag to SPI_prepare_cursor, to force use of generic or custom plans respectively.

这些flag来自src/include/nodes/parsenodes.h 
 
 
/* ----------------------
 *              Declare Cursor Statement
 *
 * Note: the "query" field of DeclareCursorStmt is only used in the raw grammar
 * output.      After parse analysis it's set to null, and the Query points to the
 * DeclareCursorStmt, not vice versa.
 * ----------------------
 */
#define CURSOR_OPT_BINARY               0x0001  /* BINARY */
#define CURSOR_OPT_SCROLL               0x0002  /* SCROLL explicitly given */
#define CURSOR_OPT_NO_SCROLL    0x0004  /* NO SCROLL explicitly given */
#define CURSOR_OPT_INSENSITIVE  0x0008  /* INSENSITIVE */
#define CURSOR_OPT_HOLD                 0x0010  /* WITH HOLD */
/* these planner-control flags do not correspond to any SQL grammar: */
#define CURSOR_OPT_FAST_PLAN    0x0020  /* prefer fast-start plan */
#define CURSOR_OPT_GENERIC_PLAN 0x0040  /* force use of generic plan */
#define CURSOR_OPT_CUSTOM_PLAN  0x0080  /* force use of custom plan */

SPI_prepare_cursor 用法 如下 : 
 
 

SPIPlanPtr SPI_prepare_cursor(const char * command, int nargs,
                              Oid * argtypes, int cursorOptions)

PL/pgsql cache plan 实际上都是调用的SPI. 所以处理cache plan也分两种情况 (带参数和不带参数).
例如如果以下SQL在PL/pgSQL中执行 : 
1. select id,info from test where id=$1; 这属于带参数的SQL语句.
所以SPI_prepare这条SQL, 当第一次用SPI_execute_plan执行后不会马上建立generic plan. 而需要有足够多次使用到同一个prepare statement后才会建立generic plan. 
2. select id,info from test where id=1; 这属于不带参数的SQL语句. 
所以SPI_prepare这条SQL, 当第一次用SPI_execute_plan执行后将会建立generic plan.

选择使用generic plan还是重新plan(custom plan), 用到这个函数 : 
src/backend/utils/cache/plancache.c
 
 

00840 /*
00841  * choose_custom_plan: choose whether to use custom or generic plan
00842  *
00843  * This defines the policy followed by GetCachedPlan.
00844  */
00845 static bool
00846 choose_custom_plan(CachedPlanSource *plansource, ParamListInfo boundParams)
00847 {
00848     double      avg_custom_cost;
00849 
00850     /* Never any point in a custom plan if there's no parameters */
00851     if (boundParams == NULL)
00852         return false;
00853 
00854     /* See if caller wants to force the decision */
00855     if (plansource->cursor_options & CURSOR_OPT_GENERIC_PLAN)
00856         return false;
00857     if (plansource->cursor_options & CURSOR_OPT_CUSTOM_PLAN)
00858         return true;
00859 
00860     /* Generate custom plans until we have done at least 5 (arbitrary) */
00861     if (plansource->num_custom_plans < 5)
00862         return true;
00863 
00864     avg_custom_cost = plansource->total_custom_cost / plansource->num_custom_plans;
00865 
00866     /*
00867      * Prefer generic plan if it's less than 10% more expensive than average
00868      * custom plan.  This threshold is a bit arbitrary; it'd be better if we
00869      * had some means of comparing planning time to the estimated runtime cost
00870      * differential.
00871      *
00872      * Note that if generic_cost is -1 (indicating we've not yet determined
00873      * the generic plan cost), we'll always prefer generic at this point.
00874      */
00875     if (plansource->generic_cost < avg_custom_cost * 1.1)
00876         return false;
00877 
00878     return true;
00879 }

从函数内容来看, 
首先无绑定参数的情况使用generic plan, 因此函数返回false.
然后判断cursor_option是否包含CURSOR_OPT_GENERIC_PLAN和CURSOR_OPT_CUSTOM_PLAN. 直接返回true或false
然后判断num_custom_plans, 如果大于等于5则使用generic plan. 小于5则返回true.
最后如果上面都没有返回, 那么进入cost的判断. 如果plansource->generic_cost < avg_custom_cost * 1.1 则选择generic plan.
generic_cost , total_custom_cost 和 num_custom_plans都记录在CachedPlanSource这个数据结构里面. 因此custom plan不是每次都产生, 而是通过以上函数来选择的. 每使用一次custom plan都会更新 total_custom_cost 和 num_custom_plans的数据, 用作下次判断的依据.

下面举例说明一下第二种情况, 也就是有参数的情况 : 
首先生成测试数据 : 
测试表 : 
 
 

digoal=> create table test(id int, info text);
CREATE TABLE

测试数据 : 
 
 

digoal=> insert into test select 1,repeat(random()::text,10) from generate_series(1,500000);
INSERT 0 500000
digoal=> insert into test values (2,'test');
INSERT 0 1
digoal=> select id,count(*) from test group by id;
 id | count  
----+--------
  1 | 500000
  2 |      1
(2 rows)

索引 : 
 
 

digoal=> create index idx_test_1 on test(id);
CREATE INDEX


接下来使用PL/pgsql 的cache plan情况测试 : 
创建函数 : 
 
 

CREATE OR REPLACE FUNCTION digoal.f_immutable(i_id integer)
 RETURNS bigint
 LANGUAGE plpgsql
 STRICT IMMUTABLE
AS $function$
declare
  result bigint;
begin
  select count(*) into result from digoal.test where id=i_id;
  return result;
end;
$function$;


测试 : 
 
  

digoal=> \timing
Timing is on.

注意这些SQL的执行耗时. 很明显的区分索引扫描和全表扫描.
使用PL/pgsql 封装的 prepared statement 在第一次执行后SPI_execute_plan并没有马上产生 generic plan.
 
  

digoal=> select * from f_immutable(1);
 f_immutable 
-------------
      500000
(1 row)
Time: 148.080 ms

因此第二次执行 f_immutable(2) 依然进行了plan ,  也就是用了custom plan. 
因为1走了全表扫描, 2走了索引扫描.
 
  

digoal=> select * from f_immutable(2);
 f_immutable 
-------------
           1
(1 row)
Time: 0.736 ms

执行计划如下 : 
 
 

digoal=> explain analyze select count(*) from test where id=1;
                                                     QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=20637.01..20637.02 rows=1 width=0) (actual time=245.336..245.336 rows=1 loops=1)
   ->  Seq Scan on test  (cost=0.00..19387.01 rows=500001 width=0) (actual time=0.012..166.642 rows=500000 loops=1)
         Filter: (id = 1)
         Rows Removed by Filter: 1
 Total runtime: 245.384 ms
(5 rows)
Time: 246.075 ms

digoal=> explain analyze select count(*) from test where id=2;
                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1.41..1.42 rows=1 width=0) (actual time=0.031..0.031 rows=1 loops=1)
   ->  Index Only Scan using idx_test_1 on test  (cost=0.00..1.40 rows=1 width=0) (actual time=0.023..0.024 rows=1 loops=1)
         Index Cond: (id = 2)
         Heap Fetches: 1
 Total runtime: 0.083 ms
(5 rows)
Time: 0.772 ms


那么如SPI_prepare所述, 同一个prepare statement被SPI_execute_plan多次使用后, 将会生成generic plan.
所以接下来我多次(这里选择5次) 调用 select * from f_immutable(1);
根据前面提到的choose_custom_plan函数, 5次后将会进入 plansource->generic_cost < avg_custom_cost * 1.1  的判断.
因为前面5次的cost都一样, 所以存储在 plansource里的custom cost相关数值得出的custom cost平均值与generic_cost没有相差10%. 所以会走generic plan. 那就可以解释为什么第六次的SQL:  select * from f_immutable(2); 未走索引了.
 
  

digoal=> \timing
Timing is on.
digoal=> select * from f_immutable(1);
 f_immutable 
-------------
      500000
(1 row)
Time: 242.742 ms
digoal=> select * from f_immutable(1);
 f_immutable 
-------------
      500000
(1 row)
Time: 179.910 ms
digoal=> select * from f_immutable(1);
 f_immutable 
-------------
      500000
(1 row)
Time: 180.052 ms
digoal=> select * from f_immutable(1);
 f_immutable 
-------------
      500000
(1 row)
Time: 180.027 ms
digoal=> select * from f_immutable(1);
 f_immutable 
-------------
      500000
(1 row)
Time: 179.758 ms

经过以上5次调用 select * from f_immutable(1);后, 
所以下面把参数改成2, 也不走索引扫描了. 执行时间139毫秒.
 
  

digoal=> select * from f_immutable(2);
 f_immutable 
-------------
           1
(1 row)
Time: 139.941 ms
digoal=> select * from f_immutable(2);
 f_immutable 
-------------
           1
(1 row)
Time: 139.994 ms

将函数的strict改成stable和volatile测试的结果与上面一致, 因为和函数的volatile无关.
 
 

digoal=> alter function f_immutable(int) strict volatile;
ALTER FUNCTION
Time: 0.490 ms
digoal=> alter function f_immutable(int) strict stable;
ALTER FUNCTION
Time: 0.451 ms

测试结果略.

接下来使用prepare SQL COMMAND进行测试 : 
结果和PL/pgsql一致.
 
  

digoal=> \timing
Timing is on.
digoal=> prepare p_test(int) as select count(*) from test where id=$1;
PREPARE
Time: 1.154 ms
digoal=> explain analyze execute p_test(1);
                                                     QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=20637.01..20637.02 rows=1 width=0) (actual time=298.463..298.463 rows=1 loops=1)
   ->  Seq Scan on test  (cost=0.00..19387.01 rows=500001 width=0) (actual time=0.034..220.116 rows=500000 loops=1)
         Filter: (id = 1)
         Rows Removed by Filter: 1
 Total runtime: 298.580 ms
(5 rows)
Time: 299.951 ms
digoal=> explain analyze execute p_test(1);
                                                     QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=20637.01..20637.02 rows=1 width=0) (actual time=243.990..243.990 rows=1 loops=1)
   ->  Seq Scan on test  (cost=0.00..19387.01 rows=500001 width=0) (actual time=0.012..166.291 rows=500000 loops=1)
         Filter: (id = 1)
         Rows Removed by Filter: 1
 Total runtime: 244.040 ms
(5 rows)
Time: 244.800 ms
digoal=> explain analyze execute p_test(1);
                                                     QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=20637.01..20637.02 rows=1 width=0) (actual time=244.184..244.184 rows=1 loops=1)
   ->  Seq Scan on test  (cost=0.00..19387.01 rows=500001 width=0) (actual time=0.012..166.416 rows=500000 loops=1)
         Filter: (id = 1)
         Rows Removed by Filter: 1
 Total runtime: 244.235 ms
(5 rows)
Time: 244.817 ms
digoal=> explain analyze execute p_test(1);
                                                     QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=20637.01..20637.02 rows=1 width=0) (actual time=244.380..244.380 rows=1 loops=1)
   ->  Seq Scan on test  (cost=0.00..19387.01 rows=500001 width=0) (actual time=0.012..166.055 rows=500000 loops=1)
         Filter: (id = 1)
         Rows Removed by Filter: 1
 Total runtime: 244.432 ms
(5 rows)
Time: 245.028 ms
digoal=> explain analyze execute p_test(1);
                                                     QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=20637.01..20637.02 rows=1 width=0) (actual time=244.029..244.029 rows=1 loops=1)
   ->  Seq Scan on test  (cost=0.00..19387.01 rows=500001 width=0) (actual time=0.011..166.128 rows=500000 loops=1)
         Filter: (id = 1)
         Rows Removed by Filter: 1
 Total runtime: 244.081 ms
(5 rows)
Time: 244.701 ms

传入参数2, 使用了generic plan, 而没有执行custom plan.( 本来参数2应该走索引.)
 
  

digoal=> explain analyze execute p_test(2);
                                                   QUERY PLAN                                                    
-----------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=20637.01..20637.02 rows=1 width=0) (actual time=115.265..115.265 rows=1 loops=1)
   ->  Seq Scan on test  (cost=0.00..19387.01 rows=500001 width=0) (actual time=115.257..115.257 rows=1 loops=1)
         Filter: (id = $1)
         Rows Removed by Filter: 500000
 Total runtime: 115.317 ms
(5 rows)
Time: 116.046 ms


下面把前5次的参数改成2, 主要说明choose_custom_plan的  plansource->generic_cost < avg_custom_cost * 1.1 比较过程.
如下 : 
 
  

ocz@db-172-16-3-150-> psql digoal digoal
psql (9.2.1)
Type "help" for help.
digoal=> prepare p_test(int) as select count(*) from test where id=$1;
PREPARE
digoal=> explain analyze execute p_test(2);
                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1.41..1.42 rows=1 width=0) (actual time=0.083..0.084 rows=1 loops=1)
   ->  Index Only Scan using idx_test_1 on test  (cost=0.00..1.40 rows=1 width=0) (actual time=0.074..0.075 rows=1 loops=1)
         Index Cond: (id = 2)
         Heap Fetches: 1
 Total runtime: 0.200 ms
(5 rows)
digoal=> explain analyze execute p_test(2);
                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1.41..1.42 rows=1 width=0) (actual time=0.026..0.026 rows=1 loops=1)
   ->  Index Only Scan using idx_test_1 on test  (cost=0.00..1.40 rows=1 width=0) (actual time=0.021..0.022 rows=1 loops=1)
         Index Cond: (id = 2)
         Heap Fetches: 1
 Total runtime: 0.074 ms
(5 rows)
digoal=> explain analyze execute p_test(2);
                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1.41..1.42 rows=1 width=0) (actual time=0.021..0.021 rows=1 loops=1)
   ->  Index Only Scan using idx_test_1 on test  (cost=0.00..1.40 rows=1 width=0) (actual time=0.017..0.018 rows=1 loops=1)
         Index Cond: (id = 2)
         Heap Fetches: 1
 Total runtime: 0.065 ms
(5 rows)
digoal=> explain analyze execute p_test(2);
                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1.41..1.42 rows=1 width=0) (actual time=0.020..0.020 rows=1 loops=1)
   ->  Index Only Scan using idx_test_1 on test  (cost=0.00..1.40 rows=1 width=0) (actual time=0.016..0.016 rows=1 loops=1)
         Index Cond: (id = 2)
         Heap Fetches: 1
 Total runtime: 0.063 ms
(5 rows)
digoal=> explain analyze execute p_test(2);
                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1.41..1.42 rows=1 width=0) (actual time=0.020..0.020 rows=1 loops=1)
   ->  Index Only Scan using idx_test_1 on test  (cost=0.00..1.40 rows=1 width=0) (actual time=0.016..0.017 rows=1 loops=1)
         Index Cond: (id = 2)
         Heap Fetches: 1
 Total runtime: 0.061 ms
(5 rows)

第六次传入参数1, 走的是custom plan. 
 
  

digoal=> explain analyze execute p_test(1);
                                                     QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=20637.01..20637.02 rows=1 width=0) (actual time=343.385..343.385 rows=1 loops=1)
   ->  Seq Scan on test  (cost=0.00..19387.01 rows=500001 width=0) (actual time=0.053..253.714 rows=500000 loops=1)
         Filter: (id = 1)
         Rows Removed by Filter: 1
 Total runtime: 343.458 ms
(5 rows)

下面来解释一下原因 : 
1. 第6次执行时, 
 
 

num_custom_plans = 5
total_custom_cost = 7.1      /* (1.42*5) */

generic_cost等于多少呢? 还是-1(初始值) , 从后面使用gdb跟踪的结果可以看到. 
因此choose_custom_plan的执行结果为true. 也就是选择custom plan, 因此需要额外的plan 优化选择过程.
到第7次执行时会变成20637. 

GDB跟踪举例 : 
首先确保编译PostgreSQL使用了enable-debug : 
 
 

ocz@db-172-16-3-150-> pg_config --configure
'--prefix=/home/ocz/pgsql9.2.1' '--with-pgport=9201' '--with-perl' '--with-python' '--with-tcl' '--with-openssl' '--with-pam' '--without-ldap' '--with-libxml' '--with-libxslt' '--enable-thread-safety' '--with-wal-blocksize=16' '--enable-debug' '--with-ossp-uuid' '--with-libs=/opt/uuid-1.6.2/lib'

开始测试 : 
终端1 : 
 
  

digoal=> prepare p_test(int) as select count(*) from test where id=$1;
PREPARE
digoal=> explain analyze execute p_test(2);
                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1.41..1.42 rows=1 width=0) (actual time=0.113..0.113 rows=1 loops=1)
   ->  Index Only Scan using idx_test_1 on test  (cost=0.00..1.40 rows=1 width=0) (actual time=0.102..0.103 rows=1 loops=1)
         Index Cond: (id = 2)
         Heap Fetches: 1
 Total runtime: 0.269 ms
(5 rows)
digoal=> select pg_backend_pid();
 pg_backend_pid 
----------------
          10921
(1 row)


终端2 : 
 
 

ocz@db-172-16-3-150-> gdb
GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-37.el5)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
绑定进程号.
(gdb) attach 10921
Attaching to process 10921
设置断点
(gdb) break choose_custom_plan
Breakpoint 1 at 0x6ee730: file plancache.c, line 850.
执行到断点位置
(gdb) continue
Continuing.


终端1 : 
 
  

digoal=> explain analyze execute p_test(2);


终端2 : 
 
  

到达断点,
Breakpoint 1, choose_custom_plan (plansource=0x14477350, boundParams=0x144b1678) at plancache.c:850
850             if (boundParams == NULL)
进入单步模式
(gdb) step
854             if (plansource->cursor_options & CURSOR_OPT_GENERIC_PLAN)
(gdb) 
856             if (plansource->cursor_options & CURSOR_OPT_CUSTOM_PLAN)
(gdb) 
860             if (plansource->num_custom_plans < 5)
(gdb) 
choose_custom_plan (plansource=0x14477350, boundParams=0x144b1678) at plancache.c:878
878     }
到了choose_custom_plan 后, 打印一下plansource->total_custom_cost
(gdb) print plansource->total_custom_cost
$1 = 1.4174
继续下一轮
(gdb) continue
Continuing.


终端1 : 
 
  

digoal=> explain analyze execute p_test(2);

终端2 : 
 
 

Breakpoint 1, choose_custom_plan (plansource=0x14477350, boundParams=0x144b1678) at plancache.c:850
850             if (boundParams == NULL)
(gdb) step
854             if (plansource->cursor_options & CURSOR_OPT_GENERIC_PLAN)
(gdb) 
856             if (plansource->cursor_options & CURSOR_OPT_CUSTOM_PLAN)
(gdb) 
860             if (plansource->num_custom_plans < 5)
(gdb) 
choose_custom_plan (plansource=0x14477350, boundParams=0x144b1678) at plancache.c:878
878     }
注意此时total_custom_cost增长了.
(gdb) print plansource->total_custom_cost
$2 = 2.8348
此时还未生成generic plan, 所以generic_cost=-1
(gdb) print plansource->generic_cost
$3 = -1
(gdb) continue
Continuing.


终端1 : 
 
   

digoal=> explain analyze execute p_test(2);

略去中间几步, 下面是第6次和第7次执行SQL时跟踪到的结果, 说明第6次执行完后生成了generic plan. 
终端2 : 
 
 

Breakpoint 1, choose_custom_plan (plansource=0x143998c0, boundParams=0x1439aff8) at plancache.c:850
850             if (boundParams == NULL)
(gdb) step
854             if (plansource->cursor_options & CURSOR_OPT_GENERIC_PLAN)
(gdb) 
856             if (plansource->cursor_options & CURSOR_OPT_CUSTOM_PLAN)
(gdb) 
860             if (plansource->num_custom_plans < 5)
(gdb) 
GetCachedPlan (plansource=0x143998c0, boundParams=0x1439aff8, useResOwner=1 '\001') at plancache.c:941
941             if (!customplan)
(gdb) print plansource->generic_cost
$4 = -1
(gdb) print plansource->total_custom_cost
$5 = 7.0869999999999997
(gdb) print plansource->num_custom_plans
$6 = 5
(gdb) continue
Continuing.

Breakpoint 1, choose_custom_plan (plansource=0x143998c0, boundParams=0x1439aff8) at plancache.c:850
850             if (boundParams == NULL)
(gdb) continue
Continuing.

Breakpoint 1, choose_custom_plan (plansource=0x143998c0, boundParams=0x1439abe8) at plancache.c:850
850             if (boundParams == NULL)
(gdb) step
854             if (plansource->cursor_options & CURSOR_OPT_GENERIC_PLAN)
(gdb) 
856             if (plansource->cursor_options & CURSOR_OPT_CUSTOM_PLAN)
(gdb) 
860             if (plansource->num_custom_plans < 5)
(gdb) 
GetCachedPlan (plansource=0x143998c0, boundParams=0x1439abe8, useResOwner=1 '\001') at plancache.c:941
941             if (!customplan)
(gdb) print plansource->generic_cost
$7 = 20637.024999999998
(gdb) print plansource->total_custom_cost
$8 = 8.5044000000000004
(gdb) print plansource->num_custom_plans
$9 = 6

generic_cost = 20637.024999999998 对应id=1的执行计划得到的cost, 如下 :
digoal=> explain analyze execute p_test(1);
                                                     QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=20637.01..20637.02 rows=1 width=0) (actual time=433.100..433.100 rows=1 loops=1)
   ->  Seq Scan on test  (cost=0.00..19387.01 rows=500001 width=0) (actual time=0.133..344.716 rows=500000 loops=1)
         Filter: (id = 1)
         Rows Removed by Filter: 1
为什么id=1的执行计划作为generic plan了. 因为id=1的值在test表的出现频率最高. 
通过pg_stats视图可以看到. 如下 : 
 
 

digoal=> \d pg_stats
          View "pg_catalog.pg_stats"
         Column         |   Type   | Modifiers 
------------------------+----------+-----------
 schemaname             | name     | 
 tablename              | name     | 
 attname                | name     | 
 inherited              | boolean  | 
 null_frac              | real     | 
 avg_width              | integer  | 
 n_distinct             | real     | 
 most_common_vals       | anyarray | 
 most_common_freqs      | real[]   | 
 histogram_bounds       | anyarray | 
 correlation            | real     | 
 most_common_elems      | anyarray | 
 most_common_elem_freqs | real[]   | 
 elem_count_histogram   | real[]   | 
digoal=> \x
Expanded display is on.
digoal=> select * from pg_stats where schemaname='digoal' and tablename='test' and attname='id';
-[ RECORD 1 ]----------+-------
schemaname             | digoal
tablename              | test
attname                | id
inherited              | f
null_frac              | 0
avg_width              | 4
n_distinct             | 1
most_common_vals       | {1}
most_common_freqs      | {1}
histogram_bounds       | 
correlation            | 1
most_common_elems      | 
most_common_elem_freqs | 
elem_count_histogram   | 

注意这两项 : 
 
  

most_common_vals       | {1}
most_common_freqs      | {1}

采样出来它的出现频率是100%

【其他】
1. 使用SPI_prepare_cursor, 通过设置flag [CURSOR_OPT_GENERIC_PLAN or CURSOR_OPT_CUSTOM_PLAN] 可以强制指定使用custom plan还是generic plan. 
2. generic plan的生命周期 : 
The statement returned by SPI_prepare can be used only in the current invocation of the procedure, since SPI_finish frees memory allocated for such a statement. 
But the statement can be saved for longer using the functions SPI_keepplan or SPI_saveplan. 一般不推荐使用SPI_saveplan, 因为数据会重复.
SPI_finish释放gplan : 
 
  

/*
 * ReleaseGenericPlan: release a CachedPlanSource's generic plan, if any.
 */
static void
ReleaseGenericPlan(CachedPlanSource *plansource)
{
        /* Be paranoid about the possibility that ReleaseCachedPlan fails */
        if (plansource->gplan)
        {
                CachedPlan *plan = plansource->gplan;

                Assert(plan->magic == CACHEDPLAN_MAGIC);
                plansource->gplan = NULL;
                ReleaseCachedPlan(plan, false);
        }
}
3. 最后补充以下, 数据分布非常倾斜, 就像本例id=1的有50000条, id=2的只有1条的情况. 当传入id=2的值时可以考虑强制custom plan.
当然如果经常要这样做, 不推荐使用prepare statement. 因为generic plan是通过出现最频繁的值产生的执行计划, 至少从gdb跟踪出来的是这样的.

【参考】
1.  http://www.postgresql.org/docs/9.2/static/spi-spi-prepare.html
2.  http://www.postgresql.org/docs/9.2/static/spi-spi-execute-plan.html
14. src/backend/executor/spi.c
15. src/include/executor/spi_priv.h
16. src/backend/utils/cache/plancache.c
17. src/backend/commands/prepare.c
18. src/include/utils/plancache.h
 
  

00024 /*
00025  * CachedPlanSource (which might better have been called CachedQuery)
00026  * represents a SQL query that we expect to use multiple times.  It stores
00027  * the query source text, the raw parse tree, and the analyzed-and-rewritten
00028  * query tree, as well as adjunct data.  Cache invalidation can happen as a
00029  * result of DDL affecting objects used by the query.  In that case we discard
00030  * the analyzed-and-rewritten query tree, and rebuild it when next needed.
00031  *
00032  * An actual execution plan, represented by CachedPlan, is derived from the
00033  * CachedPlanSource when we need to execute the query.  The plan could be
00034  * either generic (usable with any set of plan parameters) or custom (for a
00035  * specific set of parameters).  plancache.c contains the logic that decides
00036  * which way to do it for any particular execution.  If we are using a generic
00037  * cached plan then it is meant to be re-used across multiple executions, so
00038  * callers must always treat CachedPlans as read-only.
00039  *
00040  * Once successfully built and "saved", CachedPlanSources typically live
00041  * for the life of the backend, although they can be dropped explicitly.
00042  * CachedPlans are reference-counted and go away automatically when the last
00043  * reference is dropped.  A CachedPlan can outlive the CachedPlanSource it
00044  * was created from.
00045  *
00046  * An "unsaved" CachedPlanSource can be used for generating plans, but it
00047  * lives in transient storage and will not be updated in response to sinval
00048  * events.
00049  *
00050  * CachedPlans made from saved CachedPlanSources are likewise in permanent
00051  * storage, so to avoid memory leaks, the reference-counted references to them
00052  * must be held in permanent data structures or ResourceOwners.  CachedPlans
00053  * made from unsaved CachedPlanSources are in children of the caller's
00054  * memory context, so references to them should not be longer-lived than
00055  * that context.  (Reference counting is somewhat pro forma in that case,
00056  * though it may be useful if the CachedPlan can be discarded early.)
00057  *
00058  * A CachedPlanSource has two associated memory contexts: one that holds the
00059  * struct itself, the query source text and the raw parse tree, and another
00060  * context that holds the rewritten query tree and associated data.  This
00061  * allows the query tree to be discarded easily when it is invalidated.
00062  *
00063  * Note: the string referenced by commandTag is not subsidiary storage;
00064  * it is assumed to be a compile-time-constant string.  As with portals,
00065  * commandTag shall be NULL if and only if the original query string (before
00066  * rewriting) was an empty string.
00067  */
00068 typedef struct CachedPlanSource
00069 {
00070     int         magic;          /* should equal CACHEDPLANSOURCE_MAGIC */
00071     Node       *raw_parse_tree; /* output of raw_parser() */
00072     char       *query_string;   /* source text of query */
00073     const char *commandTag;     /* command tag (a constant!), or NULL */
00074     Oid        *param_types;    /* array of parameter type OIDs, or NULL */
00075     int         num_params;     /* length of param_types array */
00076     ParserSetupHook parserSetup;    /* alternative parameter spec method */
00077     void       *parserSetupArg;
00078     int         cursor_options; /* cursor options used for planning */
00079     bool        fixed_result;   /* disallow change in result tupdesc? */
00080     TupleDesc   resultDesc;     /* result type; NULL = doesn't return tuples */
00081     struct OverrideSearchPath *search_path;     /* saved search_path */
00082     MemoryContext context;      /* memory context holding all above */
00083     /* These fields describe the current analyzed-and-rewritten query tree: */
00084     List       *query_list;     /* list of Query nodes, or NIL if not valid */
00085     List       *relationOids;   /* OIDs of relations the queries depend on */
00086     List       *invalItems;     /* other dependencies, as PlanInvalItems */
00087     MemoryContext query_context;    /* context holding the above, or NULL */
00088     /* If we have a generic plan, this is a reference-counted link to it: */
00089     struct CachedPlan *gplan;   /* generic plan, or NULL if not valid */
00090     /* Some state flags: */
00091     bool        is_complete;    /* has CompleteCachedPlan been done? */
00092     bool        is_saved;       /* has CachedPlanSource been "saved"? */
00093     bool        is_valid;       /* is the query_list currently valid? */
00094     int         generation;     /* increments each time we create a plan */
00095     /* If CachedPlanSource has been saved, it is a member of a global list */
00096     struct CachedPlanSource *next_saved;        /* list link, if so */
00097     /* State kept to help decide whether to use custom or generic plans: */
00098     double      generic_cost;   /* cost of generic plan, or -1 if not known */
00099     double      total_custom_cost;      /* total cost of custom plans so far */
00100     int         num_custom_plans;       /* number of plans included in total */
00101 } CachedPlanSource;
00102 generic plan 存储在以下数据结构中:
00103 /*
00104  * CachedPlan represents an execution plan derived from a CachedPlanSource.
00105  * The reference count includes both the link from the parent CachedPlanSource
00106  * (if any), and any active plan executions, so the plan can be discarded
00107  * exactly when refcount goes to zero.  Both the struct itself and the
00108  * subsidiary data live in the context denoted by the context field.
00109  * This makes it easy to free a no-longer-needed cached plan.
00110  */
00111 typedef struct CachedPlan
00112 {
00113     int         magic;          /* should equal CACHEDPLAN_MAGIC */
00114     List       *stmt_list;      /* list of statement nodes (PlannedStmts and
00115                                  * bare utility statements) */
00116     bool        is_saved;       /* is CachedPlan in a long-lived context? */
00117     bool        is_valid;       /* is the stmt_list currently valid? */
00118     TransactionId saved_xmin;   /* if valid, replan when TransactionXmin
00119                                  * changes from this value */
00120     int         generation;     /* parent's generation number for this plan */
00121     int         refcount;       /* count of live references to this struct */
00122     MemoryContext context;      /* context containing this CachedPlan */
00123 } CachedPlan;
相关实践学习
使用PolarDB和ECS搭建门户网站
本场景主要介绍基于PolarDB和ECS实现搭建门户网站。
阿里云数据库产品家族及特性
阿里云智能数据库产品团队一直致力于不断健全产品体系,提升产品性能,打磨产品功能,从而帮助客户实现更加极致的弹性能力、具备更强的扩展能力、并利用云设施进一步降低企业成本。以云原生+分布式为核心技术抓手,打造以自研的在线事务型(OLTP)数据库Polar DB和在线分析型(OLAP)数据库Analytic DB为代表的新一代企业级云原生数据库产品体系, 结合NoSQL数据库、数据库生态工具、云原生智能化数据库管控平台,为阿里巴巴经济体以及各个行业的企业客户和开发者提供从公共云到混合云再到私有云的完整解决方案,提供基于云基础设施进行数据从处理、到存储、再到计算与分析的一体化解决方案。本节课带你了解阿里云数据库产品家族及特性。
相关文章
|
7月前
|
SQL 缓存 OLTP
OBCP第三章 SQL引擎技术-执行计划缓存
OBCP第三章 SQL引擎技术-执行计划缓存
90 0
|
9月前
|
关系型数据库 PostgreSQL
PostgreSQL执行计划explain
PostgreSQL执行计划explain
82 0
|
9月前
|
关系型数据库 PostgreSQL 索引
PostgreSQL执行计划数据结构
PostgreSQL执行计划数据结构
55 0
|
11月前
|
SQL Cloud Native 关系型数据库
《阿里云认证的解析与实战-数据仓库ACP认证》——云原生数据仓库AnalyticDB PostgreSQL版功能演示(上)——六、查看分析执行计划
《阿里云认证的解析与实战-数据仓库ACP认证》——云原生数据仓库AnalyticDB PostgreSQL版功能演示(上)——六、查看分析执行计划
|
11月前
|
关系型数据库 PostgreSQL 索引
PostgreSQL 性能优化: 执行计划
PostgreSQL为每个收到的查询产生一个查询计划。查询计划的结构是一个计划结点的树。最底层的结点是扫描结点:它们从表中返回未经处理的行。不同的表访问模式有不同的扫描结点类型:顺序扫描、索引扫描、位图索引扫描。也还有不是表的行来源,例如VALUES子句和FROM中返回集合的函数,它们有自己的结点类型。如果查询需要连接、聚集、排序、或者在未经处理的行上的其它操作,那么就会在扫描结点之上有其它额外的结点来执行这些操作。并且,做这些操作通常都有多种方法,因此在这些位置也有可能出现不同的结点类型。
84 0
|
SQL 移动开发 关系型数据库
PostgreSQL 执行计划,成本公式解说,代价因子校准,自动跟踪SQL执行计划(三)|学习笔记
快速学习PostgreSQL 执行计划,成本公式解说,代价因子校准,自动跟踪SQL执行计划(三)
546 0
PostgreSQL 执行计划,成本公式解说,代价因子校准,自动跟踪SQL执行计划(三)|学习笔记
|
关系型数据库 SQL MySQL
如何在线应对SQL语句执行计划?解读RDS MySQL 8.0之SQL Outline功能
这里边有几个因素和场景,比如:随着表数据量的变化,以及统计信息的自动收集,CBO optimizer 计算得到了一个cost 更低的 plan, 又或者 表结构发生了变化,增加和删减了某些索引,或者在实例升级迁移等过程中,MySQL 自身优化器行为和算法发生了变化等。
1480 0
|
SQL 关系型数据库
PostgreSQL citus, Greenplum 分布式执行计划 DEBUG
标签 PostgreSQL , citus , sharding , Greenplum , explain , debug 背景 开启DEBUG,可以观察citus, Greenplum的SQL分布式执行计划,下发情况,主节点,数据节点交互情况。
1580 0

相关产品

  • 云原生数据库 PolarDB