Enable point-in-time-recovery in PostgreSQL-阿里云开发者社区

Enable point-in-time-recovery in PostgreSQL

2019-07-18 1129

版权

本文内容由阿里云实名注册用户自发贡献，版权归原作者所有，阿里云开发者社区不拥有其著作权，亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容，填写侵权投诉表单进行举报，一经查实，本社区将立刻删除涉嫌侵权内容。

本文涉及的产品

云原生数据库 PolarDB MySQL 版，Serverless 5000PCU 100GB

简介： 转载地址:https://dev.to/pythonmeister/enable-point-in-time-recovery-in-postgresql-4d19I really love PostgreSQL.

转载地址:https://dev.to/pythonmeister/enable-point-in-time-recovery-in-postgresql-4d19

I really love PostgreSQL.

It has a really good performance, has a large eco-system, has a maltitude of extensions for every special case and - what I love the most is rock-solid and offers an almost complete SQL standard implementation.

When you start using (and developing with) it, you most likely use
pg_dump or pg_dumpall for backups.

Once started they create a single SQL file containing both DDL and DML statements to restore or copy your database or a complete cluster (PostgreSQL terminology:the sum of all databases under control of a single PostgreSQL service, e.g. all databases on one machine).

Sooner or later you go into production and you want to minimize downtime - and thus dataloss - to the absolute minimum possible.

With pg_dump you can only call it more often (from once a day to every hour) but sooner or later your database size will prevent the backup to be taken before the next run is started.

One escape from this is to enable point-in-time-recovery.

After a default installation PostgreSQL is run in a non-archive mode.
That means that all changes are written to the WAL, Write Ahead Log, before they finally make it to the database files.

This ensures entegrity in the event of an outage or filesystem error.When the root cause is cleared and you restart the PostgreSQL server it reads the WAL and applies all changes or discards unfinished transactions.

So the first step to enable PITR is to enable WAL-archiving. That basically means the finished WAL files are copied to a archive destination.

To get an idea please assume to have an archive directory structure as this,to which the postgres user has full access:

mkdir /archive/base
mkdir /archive/wal
chown postgres:postgres /archive/base /archive/wal
/archive/
|-- base
|-- wal

Enable the following settings in postgresql.conf:

wal_level = replica
archive_mode = on
archive_command = 'cp %p /archive/wal/%f'
max_wal_senders = 1

Add the following line to pg_hba.conf:

local   replication     postgres                                peer

After doing so, PostgreSQL will copy all finished WAL files to the archive directory.

Now you can take a full database backup any time by issuing commands like these:

mkdir /archive/base/$(date +%Y-%m-%d)
pg_basebackup -D /archive/base/$(date +%Y-%m-%d)

Once pg_basebackup completes you have a complete backup of the PostgreSQL cluster data:

tree /archive/base/ -L 2
/archive/base/`-- 2019-03-02    
|-- backup_label    
|-- base    
|-- global    
|-- pg_clog    
|-- pg_commit_ts    
|-- pg_dynshmem    
|-- pg_logical    
|-- pg_multixact    
|-- pg_notify    
|-- pg_replslot    
|-- pg_serial    
|-- pg_snapshots    
|-- pg_stat    
|-- pg_stat_tmp    
|-- pg_subtrans    
|-- pg_tblspc    
|-- pg_twophase    
|-- PG_VERSION    
|-- pg_xlog    
`-- postgresql.auto.conf18 directories, 3 files

Also, you can see that there are WAL archive files, with one of it marked to be a special file for restores:

tree /archive/wal/ -L 2
/archive/wal/
|-- 000000010000000000000001
|-- 000000010000000000000002
|-- 000000010000000000000003
|-- 000000010000000000000004
|-- 000000010000000000000005
|-- 000000010000000000000006`
-- 000000010000000000000006.00000028.backup

The one which ends on .backup contains information about when the backup was started and when it ended, so that a restore does not need to read and apply all WAL archive files in the archive, but only those written during and after it.

Now, lets create some random data to see that we have successfully enabled PITR and are able to use it.

CREATE DATABASE SAMPLE4 TEMPLATE template0;
\C SAMPLE4; -- connect to database SAMPLE4
CREATE TABLE INFO (ID INTEGER, TITLE VARCHAR(20));
INSERT INTO INFO (ID,TITLE) VALUES (1,'23:10');
SELECT pg_switch_wal(); -- force switching WAL

After that you should see a new WAL in the archive directory:

tree /archive/wal/ -L 2
/archive/wal/
|-- 000000010000000000000001
|-- 000000010000000000000002
|-- 000000010000000000000003
|-- 000000010000000000000004
|-- 000000010000000000000005
|-- 000000010000000000000006
|-- 000000010000000000000006.00000028.backup`
-- 000000010000000000000007

P.S.: You should not use pg_switch_wal without a reason. Consult the PostgreSQL manual for information about WAL configuration and switch timings!

Next, we gonna kill the database:

[postgres@localhost data]$ pg_ctl -D /usr/local/pgsql/data/ stop
waiting for server to shut down.... done
server stopped
[postgres@localhost data]$

At this point the database cluster is gone and you will see error messages,

when you try to start it.

So, time to recover!

Copy the base backup back to the clusters data directory:

cp-ar /archive/base/2019-03-02/* /usr/local/pgsql/data/

Create a file /usr/local/pgsql/recovery.conf with this content:

restore_command = 'cp /archive/wal/%f "%p"'
recovery_target_timeline = 'latest'

P.S.: The recovery.conf file will be renamed to recovery.done by the PostgreSQL process once the restore is finished.

Then start the server and once the recovery process is done query some data:

\C SAMPLE4;
select * from info;  
SAMPLE4 id | 
title
----+-------  
1 | 23:10  
1 | 23:11(2 rows)

If you take a look at the clusters log file you will notice, that it automatically recovered the WAL archives and the most recent data is present.

This was a very simple setup; trust me, productive environments usually have more complexity (archiving WAL files to remote servers, restore from tape, make full backups by taking filesystem snapshots etc. etc.).

To make this example complete: add a cronjob entry which creates a full-backup every day, twice a day or even every hour and you should be able to recover in no time.

Enable point-in-time-recovery in PostgreSQL

热门文章

最新文章

相关电子书