PostgreSQL attention : max_standby_archive_delay and max_standby_streaming_delay

本文涉及的产品
云原生数据库 PolarDB MySQL 版,Serverless 5000PCU 100GB
简介:
今天群里的兄弟提到的一个疑问, 
max_standby_archive_delay = -1
max_standby_streaming_delay = -1
不知道有什么影响?
这两个参数控制了recovery时是否要kill掉hot_standby上与recovery冲突的sql.
如果配置为-1, 那么在hot_standby上可以无限时常的执行sql. 后果是standby只接收但是不recovery xlog. 造成standby xlog文件越来越多, 甚至撑爆磁盘分区. 还有就是造成了standby的延迟. 
所以在设置是需要小心.

为了便于理解, 下面将测试过程展示一下 : 
(primary and standby)安装PostgreSQL
tar -zxvf postgresql-1b1d3d9.tar.gz

cd postgresql-1b1d3d9
./configure --prefix=/home/pg94/pgsql9.4devel --with-pgport=2999 --with-perl --with-tcl --with-python --with-openssl --with-pam --without-ldap --with-libxml --with-libxslt --enable-thread-safety --with-wal-blocksize=16 --enable-dtrace && gmake && gmake install
cd contrib/
gmake && gmake install

(primary)初始化数据库
su - pg94
initdb -D $PGDATA -E UTF8 --locale=C -W -U postgres


(primary)配置
pg94@db-172-16-3-33-> grep "^[a-z]" postgresql.conf
listen_addresses = '0.0.0.0'            # what IP address(es) to listen on;
port = 2999                             # (change requires restart)
max_connections = 100                   # (change requires restart)
superuser_reserved_connections = 13     # (change requires restart)
unix_socket_directories = '.'   # comma-separated list of directories
unix_socket_permissions = 0700          # begin with 0 to use octal notation
shared_buffers = 1024MB                 # min 128kB
maintenance_work_mem = 512MB            # min 1MB
shared_preload_libraries = 'pg_stat_statements'         # (change requires restart)
wal_level = hot_standby                 # minimal, archive, or hot_standby
synchronous_commit = off                # synchronization level;
wal_sync_method = fdatasync             # the default is the first option
wal_buffers = 16384kB                   # min 32kB, -1 sets based on shared_buffers
wal_writer_delay = 20ms         # 1-10000 milliseconds
checkpoint_segments = 32                # in logfile segments, min 1, 16MB each
max_wal_senders = 32            # max number of walsender processes
wal_keep_segments = 128                # in logfile segments, 16MB each; 0 disables
hot_standby = on                        # "on" allows queries during recovery
max_standby_archive_delay = -1  # max delay before canceling queries
max_standby_streaming_delay = -1        # max delay before canceling queries
wal_receiver_status_interval = 1s       # send replies at least this often
hot_standby_feedback = on               # send info from standby to prevent
random_page_cost = 1.0                  # same scale as above
effective_cache_size = 10240MB
log_destination = 'csvlog'              # Valid values are combinations of
logging_collector = on          # Enable capturing of stderr and csvlog
log_directory = 'pg_log'                # directory where log files are written,
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name pattern,
log_file_mode = 0600                    # creation mode for log files,
log_truncate_on_rotation = on           # If on, an existing log file with the
log_rotation_age = 1d                   # Automatic rotation of logfiles will
log_rotation_size = 10MB                # Automatic rotation of logfiles will
log_checkpoints = on
log_connections = on
log_disconnections = on
log_error_verbosity = verbose           # terse, default, or verbose messages
log_lock_waits = on                     # log lock waits >= deadlock_timeout
log_statement = 'ddl'                   # none, ddl, mod, all
log_timezone = 'PRC'
track_activity_query_size = 2048        # (change requires restart)
autovacuum = on                 # Enable autovacuum subprocess?  'on'
log_autovacuum_min_duration = 0 # -1 disables, 0 logs all actions and
datestyle = 'iso, mdy'
timezone = 'PRC'
lc_messages = 'C'                       # locale for system error message
lc_monetary = 'C'                       # locale for monetary formatting
lc_numeric = 'C'                        # locale for number formatting
lc_time = 'C'                           # locale for time formatting
default_text_search_config = 'pg_catalog.english'
pg_stat_statements.max = 1000
pg_stat_statements.track = all


pg94@db-172-16-3-33-> grep "^[a-z]" pg_hba.conf 
local   all             all                                     trust
host    all             all             127.0.0.1/32            trust
host    all             all             ::1/128                 trust
host all all 0.0.0.0/0 md5
host replication postgres 172.16.3.0/24 md5


pg94@db-172-16-3-33-> grep "^[a-z]" recovery.done 
recovery_target_timeline = 'latest'
standby_mode = on
primary_conninfo = 'host=172.16.3.39 port=2999 user=postgres keepalives_idle=60'                # e.g. 'host=localhost port=5432'


standby配置 : 
pg94@db-172-16-3-39-> vi ~/.pgpass 
172.16.3.33:2999:replication:postgres:postgres
pg94@db-172-16-3-39-> chmod 400 ~/.pgpass
-- 复制节点
pg94@db-172-16-3-39-> pg_basebackup -D $PGDATA -F p -P -v -U postgres -h 172.16.3.33
pg94@db-172-16-3-39-> cd $PGDATA
pg94@db-172-16-3-39-> mv recovery.done recovery.conf
pg94@db-172-16-3-39-> vi recovery.conf
primary_conninfo = 'host=172.16.3.33 port=2999 user=postgres keepalives_idle=60'


启动standby : 
pg94@db-172-16-3-39-> pg_ctl start


(primary)创建测试表 : 
pg94@db-172-16-3-33-> psql postgres postgres
psql (9.4devel)
Type "help" for help.
postgres=# create database digoal;
CREATE DATABASE
\c digoal 
digoal=# create table t1 (id int);
CREATE TABLE
digoal=# insert into t1 values (1);
INSERT 0 1


(standby)开启repeatable read查询
digoal=# begin transaction isolation level repeatable read;
BEGIN
digoal=# select * from t1;
 id 
----
  1
(1 row)
-- 不要退出事务.


(primary)删除t1表.
删除t1表后, 这部分xlog信息在standby上面做恢复时将和standby上面的事务冲突.
digoal=# drop table t1;
DROP TABLE

查看standby的replay_location, 这个指的是恢复点.
digoal=# select * from pg_stat_replication ;
-[ RECORD 1 ]----+------------------------------
pid              | 5436
usesysid         | 10
usename          | postgres
application_name | walreceiver
client_addr      | 172.16.3.39
client_hostname  | 
client_port      | 22422
backend_start    | 2013-08-15 12:36:18.844357+08
state            | streaming
sent_location    | 0/3012680
write_location   | 0/3012680
flush_location   | 0/3012680
replay_location  | 0/3012140
sync_priority    | 0
sync_state       | async


(primary)执行大量的写操作, 意为产生大量的pg_xlog.
digoal=# create table test(id int, info text, crt_time timestamp);
CREATE TABLE
digoal=# insert into test select generate_series(1,1000000),'test',now();
INSERT 0 1000000
digoal=# checkpoint;
CHECKPOINT

查询 standby的replay_location, 这个指的是恢复点. 
从write_location看出, 数据在发给standby, 但是standby的replay_location不变. 也就是说现在standby只接收xlog, 但是没有将接收到的xlog做recovery处理.
因为在standby上设置了如下参数 :
max_standby_archive_delay = -1  # max delay before canceling queries
max_standby_streaming_delay = -1        # max delay before canceling queries

这个查询和recovery冲突时, 不会被kill掉, recovery将持续等待.
digoal=# select * from pg_stat_replication ;
-[ RECORD 1 ]----+------------------------------
pid              | 5436
usesysid         | 10
usename          | postgres
application_name | walreceiver
client_addr      | 172.16.3.39
client_hostname  | 
client_port      | 22422
backend_start    | 2013-08-15 12:36:18.844357+08
state            | streaming
sent_location    | 0/8CA2BA0
write_location   | 0/8CA2BA0
flush_location   | 0/8CA2BA0
replay_location  | 0/3012140
sync_priority    | 0
sync_state       | async


(standby)开启另外一个会话, 查询是否存在test表.
显然不可能存在, 因为pg_xlog只接收, 却为recovery. 所以standby和primary延迟也随着时间越来越大.
pg94@db-172-16-3-39-> psql
psql (9.4devel)
Type "help" for help.
digoal=# \dt
        List of relations
 Schema | Name | Type  |  Owner   
--------+------+-------+----------
 public | t1   | table | postgres
(1 row)

代码如下 : 
1. src/backend/storage/ipc/standby.c
/*
 * Determine the cutoff time at which we want to start canceling conflicting
 * transactions.  Returns zero (a time safely in the past) if we are willing
 * to wait forever.
 */
static TimestampTz
GetStandbyLimitTime(void)
{
        TimestampTz rtime;
        bool            fromStream;

        /*
         * The cutoff time is the last WAL data receipt time plus the appropriate
         * delay variable.  Delay of -1 means wait forever.
         */
        GetXLogReceiptTime(&rtime, &fromStream);
        if (fromStream)
        {
                if (max_standby_streaming_delay < 0)
                        return 0;                       /* wait forever */
                return TimestampTzPlusMilliseconds(rtime, max_standby_streaming_delay);
        }
        else
        {
                if (max_standby_archive_delay < 0)
                        return 0;                       /* wait forever */  // 小于0永久等待.
                return TimestampTzPlusMilliseconds(rtime, max_standby_archive_delay);
        }
}

相关实践学习
使用PolarDB和ECS搭建门户网站
本场景主要介绍基于PolarDB和ECS实现搭建门户网站。
阿里云数据库产品家族及特性
阿里云智能数据库产品团队一直致力于不断健全产品体系,提升产品性能,打磨产品功能,从而帮助客户实现更加极致的弹性能力、具备更强的扩展能力、并利用云设施进一步降低企业成本。以云原生+分布式为核心技术抓手,打造以自研的在线事务型(OLTP)数据库Polar DB和在线分析型(OLAP)数据库Analytic DB为代表的新一代企业级云原生数据库产品体系, 结合NoSQL数据库、数据库生态工具、云原生智能化数据库管控平台,为阿里巴巴经济体以及各个行业的企业客户和开发者提供从公共云到混合云再到私有云的完整解决方案,提供基于云基础设施进行数据从处理、到存储、再到计算与分析的一体化解决方案。本节课带你了解阿里云数据库产品家族及特性。
相关文章
|
弹性计算 关系型数据库 数据库连接
PostgreSQL 12 preview - Move max_wal_senders out of max_connections for connection slot handling
标签 PostgreSQL , max_wal_senders , max_connections , sorry, too many clients already 背景 如果你需要使用PG的流复制,上游节点的max_wal_senders参数,用来限制这个节点同时最多可以有多少个wal sender进程。 包括逻辑复制、物理复制、pg_basebackup备份等,只要是使用stre
337 0
|
SQL 弹性计算 关系型数据库
为什么pg_basebackup或pg_start_backup好像hang住确没有开始拷贝文件 - checkpoint 的几种调度(checkpoint_completion_target)
标签 PostgreSQL , checkpoint , 调度 , lazy , immediate , pg_start_backup , pg_basebackup 背景 PostgreSQL支持在线全量备份与增量归档备份。在线全量备份实际上就是拷贝文件,增量备份则分为两种,一种是基于BLOCK lsn变化的BLOCK即增量备份,另一种是基于WAL的持续归档文件备份。 全量备份通常
1523 0
|
SQL 关系型数据库 PostgreSQL
Enable point-in-time-recovery in PostgreSQL
转载地址:https://dev.to/pythonmeister/enable-point-in-time-recovery-in-postgresql-4d19 I really love PostgreSQL.
1128 0
|
关系型数据库 数据库 Oracle
|
关系型数据库 MySQL
MySQL5.7 & AliSQL5.6 半同步复制 rpl_semi_sync_master_wait_point参数
MySQL5.5 开始支持半同步复制,但会有数据丢失的风险。MySQL5.7 中进行了优化,通过 rpl_semi_sync_master_wait_point 参数来实现。
5149 0
|
SQL
[20160501]dg参数STANDBY_MAX_DATA_DELAY
[20160501]dg参数STANDBY_MAX_DATA_DELAY.txt --11G dg 支持Active Data Guard应用,就是在dg上只读模式下应用日志.
990 0