PostgreSQL 11 内核优化 - 降低vacuum cleanup阶段index scan概率 ( vacuum_cleanup_index_scale_factor , skip index vacuum cleanup stage)

本文涉及的产品
云原生数据库 PolarDB MySQL 版,Serverless 5000PCU 100GB
云原生数据库 PolarDB 分布式版,标准版 2核8GB
云数据库 RDS MySQL Serverless,0.5-2RCU 50GB
简介: PostgreSQL 11 内核优化 - 降低vacuum cleanup阶段index scan概率 ( vacuum_cleanup_index_scale_factor , skip index vacuum cleanup stage)

背景

在执行vacuum时,有一个cleanup阶段,以往,不管这个阶段是否需要清理PAGE,只要表上面有索引,就需要对这个表的索引全部扫描一遍。

今天,PG 11版本,增加了一个GUC参数vacuum_cleanup_index_scale_factor,以及btree索引参数vacuum_cleanup_index_scale_factor。在btree index meta page上存储了当前表有多少条记录(仅仅当vacuum时发现整个表没有dead tuples(取自pg_stat_all_tables这种统计计数器)时更新meta page),当

1、((表上的 pg_stat_all_tables insert 计数器 - meta page)/meta page) 大于 vacuum_cleanup_index_scale_factor 时,vacuum cleanup阶段才需要SCAN INDEX,更新INDEX stats信息(包括meta page计数器信息)。

2、有deleted pages that can be recycled during cleanup需要清理时,必定要scan index pages.

因此,对于大量INSERT,没有UPDATE,DELETE操作的表的VACUUM,或者常规静态表的VACUUM会快很多,因为不需要SCAN INDEX了。

背景技术

vacuum_cleanup_index_scale_factor (floating point)

Specifies the fraction of the total number of heap tuples counted in the previous statistics collection     
that can be inserted     
without incurring an index scan at the VACUUM cleanup stage.     
This setting currently applies to B-tree indexes only.    
    
If no tuples were deleted from the heap,     
B-tree indexes are still scanned at the VACUUM cleanup stage     
when at least one of the following conditions is met:     
1、the index statistics are stale,     
2、or the index contains deleted pages that can be recycled during cleanup.     
Index statistics are considered to be stale if     
the number of newly inserted tuples     
exceeds the vacuum_cleanup_index_scale_factor fraction of the total number of heap tuples detected by the previous statistics collection.     
    
The total number of heap tuples is stored in the index meta-page.     
Note that the meta-page does not include this data until VACUUM finds no dead tuples,     
so B-tree index scan at the cleanup stage can only be skipped     
if the second and subsequent VACUUM cycles detect no dead tuples.  (典型的insert only场景,或者vacuum干掉所有dead tuple后)    
    
    
The value can range from 0 to 10000000000.     
When vacuum_cleanup_index_scale_factor is set to 0,     
index scans are never skipped during VACUUM cleanup.     
The default value is 0.1.    

B-tree indexes additionally accept this parameter:

vacuum_cleanup_index_scale_factor

Per-index value for vacuum_cleanup_index_scale_factor.

相关代码

src/backend/access/nbtree/nbtree.c

/*    
 * _bt_vacuum_needs_cleanup() -- Checks if index needs cleanup assuming that    
 *                      btbulkdelete() wasn't called.    
 */    
static bool    
_bt_vacuum_needs_cleanup(IndexVacuumInfo *info)    
{    
.... ....    
    
        {    
                StdRdOptions *relopts;    
                float8          cleanup_scale_factor;    
                float8          prev_num_heap_tuples;    
    
                /*    
                 * If table receives enough insertions and no cleanup was performed,    
                 * then index would appear have stale statistics.  If scale factor is    
                 * set, we avoid that by performing cleanup if the number of inserted    
                 * tuples exceeds vacuum_cleanup_index_scale_factor fraction of    
                 * original tuples count.    
                 */    
                relopts = (StdRdOptions *) info->index->rd_options;    
                cleanup_scale_factor = (relopts &&    
                                                                relopts->vacuum_cleanup_index_scale_factor >= 0)    
                        ? relopts->vacuum_cleanup_index_scale_factor    
                        : vacuum_cleanup_index_scale_factor;    
                prev_num_heap_tuples = metad->btm_last_cleanup_num_heap_tuples;    
    
                if (cleanup_scale_factor <= 0 ||    
                        prev_num_heap_tuples < 0 ||    
                        (info->num_heap_tuples - prev_num_heap_tuples) /    
                        prev_num_heap_tuples >= cleanup_scale_factor)     // 是否需要scan index,当判定index为stale状态时,由计数器与vacuum_cleanup_index_scale_factor参数控制。    
                        result = true;    
        }    
/*    
 * Post-VACUUM cleanup.    
 *    
 * Result: a palloc'd struct containing statistical info for VACUUM displays.    
 */    
IndexBulkDeleteResult *    
btvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)    
{    
        /* No-op in ANALYZE ONLY mode */    
        if (info->analyze_only)    
                return stats;    
    
        /*    
         * If btbulkdelete was called, we need not do anything, just return the    
         * stats from the latest btbulkdelete call.  If it wasn't called, we might    
         * still need to do a pass over the index, to recycle any newly-recyclable    
         * pages or to obtain index statistics.  _bt_vacuum_needs_cleanup    
         * determines if either are needed.    
         *    
         * Since we aren't going to actually delete any leaf items, there's no    
         * need to go through all the vacuum-cycle-ID pushups.    
         */    
        if (stats == NULL)    
        {    
                TransactionId oldestBtpoXact;    
    
                /* Check if we need a cleanup */    
                if (!_bt_vacuum_needs_cleanup(info))  // 不需要scan index    
                        return NULL;    
    
                stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));    
                btvacuumscan(info, stats, NULL, NULL, 0, &oldestBtpoXact);  // 需要SCAN index    
    
                /* Update cleanup-related information in the metapage */    
                _bt_update_meta_cleanup_info(info->index, oldestBtpoXact,    
                                                                         info->num_heap_tuples);    
        }    
    
        /*    
         * It's quite possible for us to be fooled by concurrent page splits into    
         * double-counting some index tuples, so disbelieve any total that exceeds    
         * the underlying heap's count ... if we know that accurately.  Otherwise    
         * this might just make matters worse.    
         */    
        if (!info->estimated_count)    
        {    
                if (stats->num_index_tuples > info->num_heap_tuples)    
                        stats->num_index_tuples = info->num_heap_tuples;    
        }    
    
        return stats;    
}    

例子

tbl1,每次vacuum都要scan index

create table tbl1 (c1 int, c2 int, c3 int, c4 int, c5 int, c6 int);    
create index idx_tbl1_1 on tbl1 (c1) with (vacuum_cleanup_index_scale_factor=0);    
create index idx_tbl1_2 on tbl1 (c2) with (vacuum_cleanup_index_scale_factor=0);    
create index idx_tbl1_3 on tbl1 (c3) with (vacuum_cleanup_index_scale_factor=0);    
create index idx_tbl1_4 on tbl1 (c4) with (vacuum_cleanup_index_scale_factor=0);    
create index idx_tbl1_5 on tbl1 (c5) with (vacuum_cleanup_index_scale_factor=0);    
create index idx_tbl1_6 on tbl1 (c6) with (vacuum_cleanup_index_scale_factor=0);    

tbl2,当有deleted page需要recycle使用时,或者当((pg_stat_all_tables.inserted-metapage(上一次vacuum有多少条记录))/上一次vacuum有多少条记录) > 100000000 时,才需要scan index。

create table tbl2 (c1 int, c2 int, c3 int, c4 int, c5 int, c6 int);    
create index idx_tbl2_1 on tbl2 (c1) with (vacuum_cleanup_index_scale_factor=100000000);    
create index idx_tbl2_2 on tbl2 (c2) with (vacuum_cleanup_index_scale_factor=100000000);    
create index idx_tbl2_3 on tbl2 (c3) with (vacuum_cleanup_index_scale_factor=100000000);    
create index idx_tbl2_4 on tbl2 (c4) with (vacuum_cleanup_index_scale_factor=100000000);    
create index idx_tbl2_5 on tbl2 (c5) with (vacuum_cleanup_index_scale_factor=100000000);    
create index idx_tbl2_6 on tbl2 (c6) with (vacuum_cleanup_index_scale_factor=100000000);    

分别写入1000万记录

insert into tbl1 select id,id,id,id,id,id from generate_series(1,10000000) t(id);    
insert into tbl2 select id,id,id,id,id,id from generate_series(1,10000000) t(id);    

观察两个表的二次VACUUM耗时

\timing    
    
vacuum verbose tbl1;    
vacuum verbose tbl1;    
    
vacuum verbose tbl2;    
vacuum verbose tbl2;    
postgres=# vacuum verbose tbl1;    
INFO:  vacuuming "public.tbl1"    
INFO:  index "idx_tbl1_1" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_2" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_3" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_4" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_5" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_6" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  "tbl1": found 0 removable, 10000000 nonremovable row versions in 63695 out of 63695 pages    
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 1137047265    
There were 0 unused item pointers.    
Skipped 0 pages due to buffer pins, 0 frozen pages.    
0 pages are entirely empty.    
CPU: user: 0.75 s, system: 0.00 s, elapsed: 0.76 s.    
VACUUM    
Time: 771.943 ms    
    
    
    
postgres=# vacuum verbose tbl1;    
INFO:  vacuuming "public.tbl1"    
INFO:  index "idx_tbl1_1" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_2" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_3" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_4" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_5" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_6" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  "tbl1": found 0 removable, 42 nonremovable row versions in 1 out of 63695 pages    
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 1137047265    
There were 0 unused item pointers.    
Skipped 0 pages due to buffer pins, 0 frozen pages.    
0 pages are entirely empty.    
CPU: user: 0.13 s, system: 0.00 s, elapsed: 0.13 s.    
VACUUM    
Time: 141.759 ms    
    
    
    
postgres=# vacuum verbose tbl1;    
INFO:  vacuuming "public.tbl1"    
INFO:  index "idx_tbl1_1" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_2" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_3" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_4" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_5" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_6" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  "tbl1": found 0 removable, 42 nonremovable row versions in 1 out of 63695 pages    
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 1137047265    
There were 0 unused item pointers.    
Skipped 0 pages due to buffer pins, 0 frozen pages.    
0 pages are entirely empty.    
CPU: user: 0.12 s, system: 0.00 s, elapsed: 0.13 s.    
VACUUM    
Time: 140.984 ms    
    
    
    
postgres=# vacuum verbose tbl2;    
INFO:  vacuuming "public.tbl2"    
INFO:  index "idx_tbl2_1" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl2_2" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl2_3" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl2_4" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl2_5" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl2_6" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  "tbl2": found 0 removable, 10000000 nonremovable row versions in 63695 out of 63695 pages    
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 1137047265    
There were 0 unused item pointers.    
Skipped 0 pages due to buffer pins, 0 frozen pages.    
0 pages are entirely empty.    
CPU: user: 0.84 s, system: 0.00 s, elapsed: 0.85 s.    
VACUUM    
Time: 860.749 ms    
    
    
    
postgres=# vacuum verbose tbl2;    
INFO:  vacuuming "public.tbl2"    
INFO:  "tbl2": found 0 removable, 42 nonremovable row versions in 1 out of 63695 pages    
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 1137047265    
There were 0 unused item pointers.    
Skipped 0 pages due to buffer pins, 0 frozen pages.    
0 pages are entirely empty.    
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.    
VACUUM    
Time: 11.895 ms    
    
    
    
postgres=# vacuum verbose tbl2;    
INFO:  vacuuming "public.tbl2"    
INFO:  "tbl2": found 0 removable, 42 nonremovable row versions in 1 out of 63695 pages    
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 1137047265    
There were 0 unused item pointers.    
Skipped 0 pages due to buffer pins, 0 frozen pages.    
0 pages are entirely empty.    
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.    
VACUUM    
Time: 11.944 ms    

分析

tbl1,由于每次vacuum都需要scan index,所以更加耗时。

tbl2,由于设置了vacuum_cleanup_index_scale_factor=100000000,下一次vacuum时insert减去上一次的总记录数再除以上一次总记录数,小于100000000,所以不需要scan index。

耗时来看,相当明显。

通过pageinspect,可以观察到btree索引metapage上的table tuples总数。

postgres=# create extension pageinspect;  
CREATE EXTENSION  
  
postgres=# select * from bt_metap('idx_tbl1_1');  
 magic  | version | root | level | fastroot | fastlevel | oldest_xact | last_cleanup_num_tuples   
--------+---------+------+-------+----------+-----------+-------------+-------------------------  
 340322 |       3 |  412 |     2 |      412 |         2 |           0 |             9.99977e+06  
(1 row)  
  
Time: 0.345 ms  
  
postgres=# select * from bt_metap('idx_tbl2_1');  
 magic  | version | root | level | fastroot | fastlevel | oldest_xact | last_cleanup_num_tuples   
--------+---------+------+-------+----------+-----------+-------------+-------------------------  
 340322 |       3 |  412 |     2 |      412 |         2 |           0 |                   1e+07  
(1 row)  
  
Time: 0.429 ms  

参考

src/backend/access/nbtree/nbtree.c

https://www.postgresql.org/docs/11/sql-createindex.html

https://www.postgresql.org/docs/11/runtime-config-client.html#GUC-VACUUM-CLEANUP-INDEX-SCALE-FACTOR

《PostgreSQL pg_stat_ pg_statio_ 统计信息(scan,read,fetch,hit)源码解读》

PostgreSQL 许愿链接

您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.

9.9元购买3个月阿里云RDS PostgreSQL实例

PostgreSQL 解决方案集合

相关实践学习
使用PolarDB和ECS搭建门户网站
本场景主要介绍基于PolarDB和ECS实现搭建门户网站。
阿里云数据库产品家族及特性
阿里云智能数据库产品团队一直致力于不断健全产品体系,提升产品性能,打磨产品功能,从而帮助客户实现更加极致的弹性能力、具备更强的扩展能力、并利用云设施进一步降低企业成本。以云原生+分布式为核心技术抓手,打造以自研的在线事务型(OLTP)数据库Polar DB和在线分析型(OLAP)数据库Analytic DB为代表的新一代企业级云原生数据库产品体系, 结合NoSQL数据库、数据库生态工具、云原生智能化数据库管控平台,为阿里巴巴经济体以及各个行业的企业客户和开发者提供从公共云到混合云再到私有云的完整解决方案,提供基于云基础设施进行数据从处理、到存储、再到计算与分析的一体化解决方案。本节课带你了解阿里云数据库产品家族及特性。
相关文章
|
6月前
|
SQL 关系型数据库 测试技术
沉浸式学习PostgreSQL|PolarDB 20: 学习成为数据库大师级别的优化技能
在上一个实验《沉浸式学习PostgreSQL|PolarDB 19: 体验最流行的开源企业ERP软件 odoo》 中, 学习了如何部署odoo和polardb|pg. 由于ODOO是非常复杂的ERP软件, 对于关系数据库的挑战也非常大, 所以通过odoo业务可以更快速提升同学的数据库优化能力, 发现业务对数据库的使用问题(如索引、事务对锁的运用逻辑问题), 数据库的代码缺陷, 参数或环境配置问题, 系统瓶颈等.
762 1
|
7月前
|
SQL 关系型数据库 分布式数据库
数据库内核那些事|细说PolarDB优化器查询变换:IN-List变换
本篇文章将对PolarDB的IN-List变换进行深入阐述,从而让我们对PolarDB的查询改写能力有更感性的认知。
|
7月前
|
关系型数据库 MySQL 分布式数据库
|
5月前
|
SQL 关系型数据库 分布式数据库
数据库内核那些事|细说PolarDB优化器查询变换:IN-List变换
数据库内核那些事|细说PolarDB优化器查询变换:IN-List变换
104 0
|
18天前
|
存储 JSON 关系型数据库
PostgreSQL Json应用场景介绍和Shared Detoast优化
PostgreSQL Json应用场景介绍和Shared Detoast优化
|
2月前
|
SQL 关系型数据库 分布式数据库
|
3月前
|
弹性计算 关系型数据库 数据库
开源PostgreSQL在倚天ECS上的最佳优化实践
本文基于倚天ECS硬件平台,以自顶向下的方式从上层应用、到基础软件,再到底层芯片硬件,通过应用与芯片的硬件特性的亲和性分析,实现PostgreSQL与倚天芯片软硬协同的深度优化,充分使能倚天硬件性能,帮助开源PostgreSQL应用实现性能提升。
|
9月前
|
关系型数据库 PostgreSQL
PostgreSQL vacuum可见性
PostgreSQL vacuum可见性
61 0
|
5月前
|
缓存 关系型数据库 Serverless
数据库内核那些事,PolarDB HTAP Serverless,打造经济易用的实时分析系统
下本从IMCI Serverless核心优势角度的介绍各优化工作内容。
数据库内核那些事,PolarDB HTAP Serverless,打造经济易用的实时分析系统
|
5月前
|
关系型数据库 MySQL 数据库
Mysql中key与index区别
Mysql中key与index区别

相关产品

  • 云原生数据库 PolarDB