苹果推出最受欢迎的iOS 到 民用与商用数据库备份的差异与源码浅析

本文涉及的产品
RDS MySQL Serverless 基础系列,0.5-2RCU 50GB
云数据库 RDS MySQL,集群系列 2核4GB
推荐场景:
搭建个人博客
云数据库 Tair(兼容Redis),内存型 2GB
简介: 背景 对于商业数据库来说,备份的功能一般都非常的全面。 比如Oracle,它的备份工具rman是非常强大的,很多年前就已经支持全量、增量、归档的备份模式,支持压缩等。 还支持元数据存储到数据库中,管理也非常的方便,例如保留多少归档,备份集的管理也很方便,例如要恢复到什么时间点,将此前的备份清除等等。

背景

苹果推出了有史以来最受欢迎的一版iOS,为什么这么受欢迎?

最主要的还是使用了最新的APFS文件系统,这个文件系统几乎集成了ZFS,Btrfs的所有优良特性,比如最为好用的快照(块级增量)、压缩。使得苹果的操作系统一下子瘦了,而且备份占用空间也非常小。

对于数据库来说,备份也不是小事,如何实现高效的备份、节省空间的备份以及具备可以定义SLA的恢复(不会随着数据库的大小、REDO的多少而变化)。


对于商业数据库来说,备份的功能一般都非常的全面。

比如Oracle,它的备份工具rman是非常强大的,很多年前就已经支持全量、增量、归档的备份模式,支持压缩等。

还支持元数据存储到数据库中,管理也非常的方便,例如保留多少归档,备份集的管理也很方便,例如要恢复到什么时间点,将此前的备份清除等等。

对于开源数据库来说,支持向商业版本这么丰富更能的比较少,PostgreSQL算是非常完善的一个。

PostgreSQL作为最高级的开源数据库,备份方面已经向商业数据库看齐。

目前PostgreSQL已经支持类似Oracle的rman备份工具的功能,支持全量、增量、归档三种备份模式,支持压缩,支持备份集的管理等。

有了块级增量备份,对于那种非常庞大的数据库,备份起来就不像只支持全量和归档的模式那么吃力了。

PostgreSQL增量备份是怎么做到的呢?

一个数据页的框架如下

 * +----------------+---------------------------------+  
 * | PageHeaderData | linp1 linp2 linp3 ...           |  
 * +-----------+----+---------------------------------+  
 * | ... linpN |                                      |  
 * +-----------+--------------------------------------+  
 * |               ^ pd_lower                         |  
 * |                                                  |  
 * |                     v pd_upper                   |  
 * +-------------+------------------------------------+  
 * |                     | tupleN ...                 |  
 * +-------------+------------------+-----------------+  
 * |       ... tuple3 tuple2 tuple1 | "special space" |  
 * +--------------------------------+-----------------+  
AI 代码解读

数据页头部的数据结构

typedef struct PageHeaderData  
{  
        /* XXX LSN is member of *any* block, not only page-organized ones */  
        PageXLogRecPtr pd_lsn;          /* LSN: next byte after last byte of xlog  
                                         * record for last change to this page */  
        uint16          pd_checksum;    /* checksum */   
        uint16          pd_flags;               /* flag bits, see below */  
        LocationIndex pd_lower;         /* offset to start of free space */  
        LocationIndex pd_upper;         /* offset to end of free space */  
        LocationIndex pd_special;       /* offset to start of special space */  
        uint16          pd_pagesize_version;  
        TransactionId pd_prune_xid; /* oldest prunable XID, or zero if none */  
        ItemIdData      pd_linp[FLEXIBLE_ARRAY_MEMBER]; /* line pointer array */  
} PageHeaderData;  
AI 代码解读

因为如果对象是持久化的,那么它的所有变更都会记录REDO,数据页头部的pd_lsn表示该数据页最后一次变化时,变化产生的REDO在xlog file中的结束位置.

即如果xlog flush的xlog地址位 大于或等于 此页pd_lsn,那么这个页的更改就可以认为是可靠的。

 *              pd_lsn          - identifies xlog record for last change to this page.  
 *              pd_checksum - page checksum, if set.  
 *              pd_flags        - flag bits.  
 *              pd_lower        - offset to start of free space.  
 *              pd_upper        - offset to end of free space.  
 *              pd_special      - offset to start of special space.  
 *              pd_pagesize_version - size in bytes and page layout version number.  
 *              pd_prune_xid - oldest XID among potentially prunable tuples on page.  
AI 代码解读

好了,既然每次块的变化都包含了LSN的修改,那么也即是说,我们可以通过第一次备份开始时的全局LSN,以及当前需要备份的数据的page LSN来判断此页是否发生过修改。

如果修改了就备份,没修改,就不需要备份, 从而实现数据库的块级增量备份。

pg_rman 介绍

pg_rman是一个开源的PostgreSQL备份管理软件,类似Oracle的RMAN。

https://github.com/ossc-db/pg_rman

http://ossc-db.github.io/pg_rman/index.html

pg_rman使用的是pg_start_backup(), copy, pg_stop_backup()的备份模式。

pg_rman跑的不是流复制协议,而是文件拷贝,所以pg_rman必须和数据库节点跑在一起。

如果在standby节点跑pg_rman,pg_rman则需要通过网络连接到主节点执行pg_start_backup和pg_stop_backup。

pg_rman的用法非常简单,支持以下几种运行模式。

init  
  Initialize a backup catalog.  
backup  
  Take an online backup.  
restore  
  Do restore.  
show  
  Show backup history. The detail option shows with additional information of each backups.  
validate  
  Validate backup files. Backups without validation cannot be used for restore and incremental backup.  
delete  
  Delete backup files.  
purge  
  Remove deleted backups from backup catalog.  
AI 代码解读

使用pg_rman的前提

开启归档

配置csvlog

建议的配置

postgres=# show log_destination ;  
 log_destination   
-----------------  
 csvlog  
(1 row)  

postgres=# SHOW log_directory ;  
 log_directory  
---------------  
 pg_log  
(1 row)  

postgres=# SHOW archive_command ;  
              archive_command  
--------------------------------------------  
 cp %p /data04/digoal/arc_log/%f  
(1 row)  
AI 代码解读

初始化pg_rman backup catalog

首先需要初始化一个backup catalog,实际上就是需要一个目录,这个目录将用于存放备份的文件。

同时这个目录也会存放一些元数据,例如备份的配置文件,数据库的systemid,时间线文件历史等等。

初始化命令需要两个参数,分别为备份目标目录,以及数据库的$PGDATA

$ mkdir /data05/digoal/pgbbk  

$ /home/digoal/pgsql9.5/bin/pg_rman init -B /data05/digoal/pgbbk -D /data04/digoal/pg_root  
INFO: ARCLOG_PATH is set to '/data04/digoal/arc_log'  
INFO: SRVLOG_PATH is set to '/data04/digoal/pg_root/pg_log'  
AI 代码解读

生成备份元数据如下

[digoal@iZ28tqoemgtZ ~]$ cd /data05/digoal/pgbbk/  
[digoal@iZ28tqoemgtZ pgbbk]$ ll  
total 16  
drwx------ 4 digoal digoal 4096 Aug 26 19:29 backup  
-rw-rw-r-- 1 digoal digoal   82 Aug 26 19:29 pg_rman.ini  
-rw-rw-r-- 1 digoal digoal   40 Aug 26 19:29 system_identifier  
drwx------ 2 digoal digoal 4096 Aug 26 19:29 timeline_history  
AI 代码解读

生成的配置文件

$ cat pg_rman.ini   
ARCLOG_PATH='/data04/digoal/arc_log'  
SRVLOG_PATH='/data04/digoal/pg_root/pg_log'  
AI 代码解读

你可以把将来要使用的配置写在这个配置文件中,或者写在pg_rman的命令行中。

我后面的测试会直接使用命令行参数。

生成的数据库system id,用于区分备份的数据库是不是一个数据库,防止被冲。

$ cat system_identifier   
SYSTEM_IDENTIFIER='6318621837015461309'  
AI 代码解读

与控制文件中存储的system id一致。

注意
pg_rman只从postgresql.conf取log_directory和archive_command参数的值。

如果你的PostgreSQL的配置文件是include的或者配置在postgresql.auto.conf中。

这两个值将不准确。

所以建议你仅仅把参数配置在postgresql.conf中,而不要使用其他配置文件。

pg_rman 命令行用法

pg_rman manage backup/recovery of PostgreSQL database.  

Usage:  
  pg_rman OPTION init  
  pg_rman OPTION backup  
  pg_rman OPTION restore  
  pg_rman OPTION show [DATE]  
  pg_rman OPTION show detail [DATE]  
  pg_rman OPTION validate [DATE]  
  pg_rman OPTION delete DATE  
  pg_rman OPTION purge  

Common Options:  
  -D, --pgdata=PATH         location of the database storage area  
  -A, --arclog-path=PATH    location of archive WAL storage area  
  -S, --srvlog-path=PATH    location of server log storage area  
  -B, --backup-path=PATH    location of the backup storage area  
  -c, --check               show what would have been done  
  -v, --verbose             show what detail messages  
  -P, --progress            show progress of processed files  

Backup options:  
  -b, --backup-mode=MODE    full, incremental, or archive  
  -s, --with-serverlog      also backup server log files  
  -Z, --compress-data       compress data backup with zlib  
  -C, --smooth-checkpoint   do smooth checkpoint before backup  
  -F, --full-backup-on-error   switch to full backup mode  
                               if pg_rman cannot find validate full backup  
                               on current timeline  
      NOTE: this option is only used in --backup-mode=incremental or archive.  
  --keep-data-generations=NUM keep NUM generations of full data backup  
  --keep-data-days=NUM        keep enough data backup to recover to N days ago  
  --keep-arclog-files=NUM   keep NUM of archived WAL  
  --keep-arclog-days=DAY    keep archived WAL modified in DAY days  
  --keep-srvlog-files=NUM   keep NUM of serverlogs  
  --keep-srvlog-days=DAY    keep serverlog modified in DAY days  
  --standby-host=HOSTNAME   standby host when taking backup from standby  
  --standby-port=PORT       standby port when taking backup from standby  

Restore options:  
  --recovery-target-time    time stamp up to which recovery will proceed  
  --recovery-target-xid     transaction ID up to which recovery will proceed  
  --recovery-target-inclusive whether we stop just after the recovery target  
  --recovery-target-timeline  recovering into a particular timeline  
  --hard-copy                 copying archivelog not symbolic link  

Catalog options:  
  -a, --show-all            show deleted backup too  

Delete options:  
  -f, --force               forcibly delete backup older than given DATE  

Connection options:  
  -d, --dbname=DBNAME       database to connect  
  -h, --host=HOSTNAME       database server host or socket directory  
  -p, --port=PORT           database server port  
  -U, --username=USERNAME   user name to connect as  
  -w, --no-password         never prompt for password  
  -W, --password            force password prompt  

Generic options:  
  -q, --quiet               don't show any INFO or DEBUG messages  
  --debug                   show DEBUG messages  
  --help                    show this help, then exit  
  --version                 output version information, then exit  

Read the website for details. <http://github.com/ossc-db/pg_rman>  
Report bugs to <http://github.com/ossc-db/pg_rman/issues>.  
AI 代码解读

全量备份

输入必要的参数或option

$ export PGPASSWORD=postgres  

$ /home/digoal/pgsql9.5/bin/pg_rman backup \  
-B /data05/digoal/pgbbk \  
-D /data04/digoal/pg_root \  
-b full \  
-s \  
-Z \  
-C \  
--keep-data-days=10 \  
--keep-arclog-files=15 \  
--keep-arclog-days=10 \  
--keep-srvlog-files=10 \  
--keep-srvlog-days=15 \  
-h 127.0.0.1 -p 1921 -U postgres -d postgres  
AI 代码解读

结果

INFO: copying database files  
NOTICE:  pg_stop_backup complete, all required WAL segments have been archived  
INFO: copying archived WAL files  
INFO: copying server log files  
INFO: backup complete  
HINT: Please execute 'pg_rman validate' to verify the files are correctly copied.  
INFO: start deleting old archived WAL files from ARCLOG_PATH (keep files = 15, keep days = 10)  
INFO: the threshold timestamp calculated by keep days is "2016-08-16 00:00:00"  
INFO: start deleting old server files from SRVLOG_PATH (keep files = 10, keep days = 15)  
INFO: the threshold timestamp calculated by keep days is "2016-08-11 00:00:00"  
INFO: start deleting old backup (keep after = 2016-08-16 00:00:00)  
INFO: does not include the backup just taken  
WARNING: backup "2016-08-26 19:39:32" is not taken int account  
DETAIL: This is not valid backup.  
AI 代码解读

校验备份集

备份时pg_rman会记录每个备份文件的crc,以便validate进行校验。

例如某个备份集

$ less /data05/digoal/pgbbk/20160826/195809/file_database.txt  

.s.PGSQL.1921 ? 0 0 0777 2016-08-26 19:27:05  
.s.PGSQL.1921.lock f 55 590164837 0600 2016-08-26 19:27:05  
PG_VERSION f 12 3872055064 0600 2016-07-28 10:03:42  
backup_label f 167 2985542389 0600 2016-08-26 19:58:42  
backup_label.old f 155 4273989468 0600 2016-08-23 19:43:32  
base d 0 0 0700 2016-08-23 10:28:32  
base/1 d 0 0 0700 2016-08-24 16:17:02  
base/1/112 f 57 1147028285 0600 2016-07-28 10:03:42  
base/1/113 f 57 1147028285 0600 2016-07-28 10:03:42  
base/1/1247 F 8178 1875285513 0600 2016-07-29 13:51:29  
base/1/1247_fsm f 139 3668812536 0600 2016-07-28 10:03:43  
AI 代码解读

解释

路径,文件类型,大小,CRC校验值,权限,时间

第四列即crc校验值

每次备份完,必须要做一次校验,否则备份集不可用用来恢复,增量备份时也不会用它来做增量比较。

$ /home/digoal/pgsql9.5/bin/pg_rman validate -B /data05/digoal/pgbbk  
INFO: validate: "2016-08-26 19:39:50" backup, archive log files and server log files by CRC  
INFO: backup "2016-08-26 19:39:50" is valid  
AI 代码解读

每个备份集都包含了一个备份状态文件,如下

cat /data05/digoal/pgbbk/20160826/201955/backup.ini

# configuration
BACKUP_MODE=INCREMENTAL
FULL_BACKUP_ON_ERROR=false
WITH_SERVERLOG=true
COMPRESS_DATA=true
# result
TIMELINEID=1
START_LSN=46/df000108
STOP_LSN=46/df000210
START_TIME='2016-08-26 20:19:55'
END_TIME='2016-08-26 20:20:48'
RECOVERY_XID=3896508593
RECOVERY_TIME='2016-08-26 20:20:47'
TOTAL_DATA_BYTES=6196524307
READ_DATA_BYTES=3199287520
READ_ARCLOG_BYTES=33554754
READ_SRVLOG_BYTES=0
WRITE_BYTES=125955
BLOCK_SIZE=8192
XLOG_BLOCK_SIZE=8192
STATUS=OK
AI 代码解读

这个文件中包含了很重要的信息,比如LSN,后面LSN将用于比对增量备份时对比数据块的LSN是否发生了变化,是否需要备份。

增量备份

$ export PGPASSWORD=postgres  

$ /home/digoal/pgsql9.5/bin/pg_rman backup \  
-B /data05/digoal/pgbbk \  
-D /data04/digoal/pg_root \  
-b incremental \  
-s \  
-Z \  
-C \  
--keep-data-days=10 \  
--keep-arclog-files=15 \  
--keep-arclog-days=10 \  
--keep-srvlog-files=10 \  
--keep-srvlog-days=15 \  
-h 127.0.0.1 -p 1921 -U postgres -d postgres  
AI 代码解读

增量备份输出

INFO: copying database files  
NOTICE:  pg_stop_backup complete, all required WAL segments have been archived  
INFO: copying archived WAL files  
INFO: copying server log files  
INFO: backup complete  
HINT: Please execute 'pg_rman validate' to verify the files are correctly copied.  
INFO: start deleting old archived WAL files from ARCLOG_PATH (keep files = 15, keep days = 10)  
INFO: the threshold timestamp calculated by keep days is "2016-08-16 00:00:00"  
INFO: start deleting old server files from SRVLOG_PATH (keep files = 10, keep days = 15)  
INFO: the threshold timestamp calculated by keep days is "2016-08-11 00:00:00"  
INFO: start deleting old backup (keep after = 2016-08-16 00:00:00)  
INFO: does not include the backup just taken  
INFO: backup "2016-08-26 19:39:50" should be kept  
DETAIL: This is taken after "2016-08-16 00:00:00".  
WARNING: backup "2016-08-26 19:39:32" is not taken int account  
DETAIL: This is not valid backup.  
AI 代码解读

校验备份集

$ /home/digoal/pgsql9.5/bin/pg_rman validate -B /data05/digoal/pgbbk  
INFO: validate: "2016-08-26 19:43:20" backup, archive log files and server log files by CRC  
INFO: backup "2016-08-26 19:43:20" is valid  
AI 代码解读

列出备份集

$ /home/digoal/pgsql9.5/bin/pg_rman show -B /data05/digoal/pgbbk  
==========================================================  
 StartTime           Mode  Duration    Size   TLI  Status   
==========================================================  
2016-08-26 19:43:20  INCR        0m    54kB     1  OK  
2016-08-26 19:39:50  FULL        1m   245MB     1  OK  
AI 代码解读

可以看到增量非常小,因为很少变化的块。

接下来更新一张大表的某一条记录,再看看。

postgres=# \dt+  
                      List of relations  
 Schema |   Name   | Type  |  Owner   |  Size   | Description   
--------+----------+-------+----------+---------+-------------  
 public | hll_test | table | postgres | 208 kB  |   
 public | t        | table | postgres | 3050 MB |   
 public | tbl1     | table | postgres | 226 MB  |   
 public | tbl2     | table | postgres | 63 MB   |   
 public | test     | table | postgres | 120 MB  |   
(5 rows)  

postgres=# with t1 as (select id from t where id between 1 and 1000 limit 10) update t set info='new' where id in (select * from t1);  
UPDATE 10  
AI 代码解读

更新后做一个增量备份

$ /home/digoal/pgsql9.5/bin/pg_rman backup -B /data05/digoal/pgbbk -D /data04/digoal/pg_root -b incremental -s -Z -C --keep-data-days=10 --keep-arclog-files=15 --keep-arclog-days=10 --keep-srvlog-files=10 --keep-srvlog-days=15 -h 127.0.0.1 -p 1921 -U postgres -d postgres  

INFO: copying database files  
NOTICE:  pg_stop_backup complete, all required WAL segments have been archived  
INFO: copying archived WAL files  
INFO: copying server log files  
INFO: backup complete  
HINT: Please execute 'pg_rman validate' to verify the files are correctly copied.  
INFO: start deleting old archived WAL files from ARCLOG_PATH (keep files = 15, keep days = 10)  
INFO: the threshold timestamp calculated by keep days is "2016-08-16 00:00:00"  
INFO: start deleting old server files from SRVLOG_PATH (keep files = 10, keep days = 15)  
INFO: the threshold timestamp calculated by keep days is "2016-08-11 00:00:00"  
INFO: start deleting old backup (keep after = 2016-08-16 00:00:00)  
INFO: does not include the backup just taken  
INFO: backup "2016-08-26 19:58:09" should be kept  
DETAIL: This is taken after "2016-08-16 00:00:00".  
WARNING: backup "2016-08-26 19:56:54" is not taken int account  
DETAIL: This is not valid backup.  
INFO: backup "2016-08-26 19:43:20" should be kept  
DETAIL: This is taken after "2016-08-16 00:00:00".  
INFO: backup "2016-08-26 19:39:50" should be kept  
DETAIL: This is taken after "2016-08-16 00:00:00".  
WARNING: backup "2016-08-26 19:39:32" is not taken int account  
DETAIL: This is not valid backup.  
AI 代码解读

校验备份集

[digoal@iZ28tqoemgtZ pg_rman]$ /home/digoal/pgsql9.5/bin/pg_rman validate -B /data05/digoal/pgbbk  
INFO: validate: "2016-08-26 20:19:55" backup, archive log files and server log files by CRC  
INFO: backup "2016-08-26 20:19:55" is valid  
AI 代码解读

输出当前备份

[digoal@iZ28tqoemgtZ pg_rman]$ /home/digoal/pgsql9.5/bin/pg_rman show -B /data05/digoal/pgbbk  
==========================================================  
 StartTime           Mode  Duration    Size   TLI  Status   
==========================================================  
2016-08-26 20:19:55  INCR        0m   125kB     1  OK  
2016-08-26 19:58:09  FULL       11m  3094MB     1  OK  
2016-08-26 19:56:54  FULL        1m      0B     0  ERROR  
2016-08-26 19:43:20  INCR        0m    54kB     1  OK  
2016-08-26 19:39:50  FULL        1m   245MB     1  OK  
2016-08-26 19:39:32  FULL        0m      0B     0  ERROR  
AI 代码解读

增量备份的文件非常小,因为变化的数据块很少。

按指定时间从catalog删除备份集

例如我只需要我的备份集能恢复到2016-08-26 19:59:00,在这个时间点以前,不需要用来恢复到这个时间点的备份全删掉。

$ /home/digoal/pgsql9.5/bin/pg_rman delete "2016-08-26 19:59:00" -B /data05/digoal/pgbbk  
WARNING: cannot delete backup with start time "2016-08-26 19:58:09"  
DETAIL: This is the latest full backup necessary for successful recovery.  
INFO: delete the backup with start time: "2016-08-26 19:56:54"  
INFO: delete the backup with start time: "2016-08-26 19:43:20"  
INFO: delete the backup with start time: "2016-08-26 19:39:50"  
INFO: delete the backup with start time: "2016-08-26 19:39:32"  
AI 代码解读

保留的备份集合可以将数据库恢复到2016-08-26 19:59:00

$ /home/digoal/pgsql9.5/bin/pg_rman show -B /data05/digoal/pgbbk  
==========================================================  
 StartTime           Mode  Duration    Size   TLI  Status   
==========================================================  
2016-08-26 20:19:55  INCR        0m   125kB     1  OK  
2016-08-26 19:58:09  FULL       11m  3094MB     1  OK  
AI 代码解读

物理删除已从catalog删除的备份集

$ /home/digoal/pgsql9.5/bin/pg_rman purge -B /data05/digoal/pgbbk  
INFO: DELETED backup "2016-08-26 19:56:54" is purged  
INFO: DELETED backup "2016-08-26 19:43:20" is purged  
INFO: DELETED backup "2016-08-26 19:39:50" is purged  
INFO: DELETED backup "2016-08-26 19:39:32" is purged  
AI 代码解读

恢复

后续在写

pg_rman 源码浅析

1. 增量备份代码

上次备份以来,数据块的LSN是否发生了变化,如果自从上次备份的start_lsn以来没有发生变化,则不备份。

代码举例

                else  
                {  
                        pgBackupGetPath(prev_backup, prev_file_txt, lengthof(prev_file_txt),  
                                DATABASE_FILE_LIST);  
                        prev_files = dir_read_file_list(pgdata, prev_file_txt);  

                        /*  
                         * Do backup only pages having larger LSN than previous backup.  
                         */  
                        lsn = &prev_backup->start_lsn;  
                        xlogid = (uint32) (*lsn >> 32);  
                        xrecoff = (uint32) *lsn;  
                        elog(DEBUG, _("backup only the page updated after LSN(%X/%08X)"),  
                                                        xlogid, xrecoff);  
                }  

...

                /* backup files from non-snapshot */
                pgBackupGetPath(&current, path, lengthof(path), DATABASE_DIR);
                backup_files(pgdata, path, files, prev_files, lsn, current.compress_data, NULL);
AI 代码解读

2. 备份结果backup.ini相关代码

# configuration  
BACKUP_MODE=FULL  
FULL_BACKUP_ON_ERROR=false  
WITH_SERVERLOG=true  
COMPRESS_DATA=true  
# result  
TIMELINEID=1  
START_LSN=43/d5000028  
STOP_LSN=43/d5000168  
START_TIME='2016-08-26 15:43:39'  
END_TIME='2016-08-26 15:44:27'  
RECOVERY_XID=3896508572  
RECOVERY_TIME='2016-08-26 15:44:18'  
TOTAL_DATA_BYTES=823571731  
READ_DATA_BYTES=823571731  
READ_ARCLOG_BYTES=234881668  
READ_SRVLOG_BYTES=218248  
WRITE_BYTES=206009921  
BLOCK_SIZE=8192  
XLOG_BLOCK_SIZE=8192  
STATUS=OK  
AI 代码解读

对应的数据结构

/*  
 * pg_rman takes backup into the directory $BACKUP_PATH/<date>/<time>.  
 *  
 * status == -1 indicates the pgBackup is invalid.  
 */  
typedef struct pgBackup  
{  
        /* Backup Level */  
        BackupMode      backup_mode;  
        bool            with_serverlog;  
        bool            compress_data;  
        bool            full_backup_on_error;  

        /* Status - one of BACKUP_STATUS_xxx */  
        BackupStatus    status;  

        /* Timestamp, etc. */  
        TimeLineID      tli;  

        XLogRecPtr      start_lsn;  
        XLogRecPtr      stop_lsn;  

        time_t          start_time;  
        time_t          end_time;  
        time_t          recovery_time;  
        uint32          recovery_xid;  

        /* Size (-1 means not-backup'ed) */  
        int64           total_data_bytes;  
        int64           read_data_bytes;  
        int64           read_arclog_bytes;  
        int64           read_srvlog_bytes;  
        int64           write_bytes;  

        /* data/wal block size for compatibility check */  
        uint32          block_size;  
        uint32          wal_block_size;  

        /* if backup from standby or not */  
        bool            is_from_standby;  

} pgBackup;  
AI 代码解读

备份开始时记录pg_start_backup调用返回的lsn,写入backup->start_lsn

/*  
 * Notify start of backup to PostgreSQL server.  
 */  
static void  
pg_start_backup(const char *label, bool smooth, pgBackup *backup)  
{  
        PGresult           *res;  
        const char         *params[2];  
        params[0] = label;  

        elog(DEBUG, "executing pg_start_backup()");  
        reconnect();  

        /* Assumes PG version >= 8.4 */  

        /* 2nd argument is 'fast' (IOW, !smooth) */  
        params[1] = smooth ? "false" : "true";  
        res = execute("SELECT * from pg_xlogfile_name_offset(pg_start_backup($1, $2))", 2, params);  

        if (backup != NULL)  
                get_lsn(res, &backup->tli, &backup->start_lsn);  

        elog(DEBUG, "backup start point is (WAL file: %s, xrecoff: %s)",  
                        PQgetvalue(res, 0, 0), PQgetvalue(res, 0, 1));  

        PQclear(res);  
        disconnect();  
}  
AI 代码解读

备份停止,调用pg_stop_backup,从返回结果中取出LSN,写入backup->stop_lsn

/*  
 * Notify end of backup to PostgreSQL server.  
 */  
static void  
pg_stop_backup(pgBackup *backup)  
{  
        elog(DEBUG, "executing pg_stop_backup()");  
        wait_for_archive(backup,  
                "SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup())");  
}  


static void  
wait_for_archive(pgBackup *backup, const char *sql)  
{  
        PGresult           *res;  
        char                    ready_path[MAXPGPATH];  
        int                             try_count;  

        reconnect();  
        res = execute(sql, 0, NULL);  
        if (backup != NULL)  
        {  
                get_lsn(res, &backup->tli, &backup->stop_lsn);  
                elog(DEBUG, "backup end point is (WAL file: %s, xrecoff: %s)",  
                                PQgetvalue(res, 0, 0), PQgetvalue(res, 0, 1));  
        }  

        /* get filename from the result of pg_xlogfile_name_offset() */  
        elog(DEBUG, "waiting for %s is archived", PQgetvalue(res, 0, 0));  
        snprintf(ready_path, lengthof(ready_path),  
                "%s/pg_xlog/archive_status/%s.ready", pgdata, PQgetvalue(res, 0, 0));  

        PQclear(res);  

        res = execute(TXID_CURRENT_SQL, 0, NULL);  
        if(backup != NULL)  
        {  
                get_xid(res, &backup->recovery_xid);  
                backup->recovery_time = time(NULL);  
        }  
        disconnect();  

        /* wait until switched WAL is archived */  
        try_count = 0;  
        while (fileExists(ready_path))  
        {  
                sleep(1);  
                if (interrupted)  
                        ereport(FATAL,  
                                (errcode(ERROR_INTERRUPTED),  
                                 errmsg("interrupted during waiting for WAL archiving")));  
                try_count++;  
                if (try_count > TIMEOUT_ARCHIVE)  
                        ereport(ERROR,  
                                (errcode(ERROR_ARCHIVE_FAILED),  
                                 errmsg("switched WAL could not be archived in %d seconds",  
                                        TIMEOUT_ARCHIVE)));  
        }  
        elog(DEBUG, "WAL file contains backup end point is archived after %d seconds waiting",  
                        try_count);  
}  
AI 代码解读

validate 时,改backup.ini的STATUS字段

validate.c  

        for (i = 0; i < parray_num(backup_list); i++)  
        {  
                pgBackup *backup = (pgBackup *)parray_get(backup_list, i);  

                /* clean extra backups (switch STATUS to ERROR) */  
                if(!another_pg_rman &&  
                   (backup->status == BACKUP_STATUS_RUNNING ||  
                        backup->status == BACKUP_STATUS_DELETING))  
                {  
                        backup->status = BACKUP_STATUS_ERROR;  
                        pgBackupWriteIni(backup);  
                }  

                /* Validate completed backups only. */  
                if (backup->status != BACKUP_STATUS_DONE)  
                        continue;  

                /* validate with CRC value and update status to OK */  
                pgBackupValidate(backup, false, false, (HAVE_DATABASE(backup)));  
        }  

...  


                /* update status to OK */  
                if (corrupted)  
                        backup->status = BACKUP_STATUS_CORRUPT;  
                else  
                        backup->status = BACKUP_STATUS_OK;  
                pgBackupWriteIni(backup);  
AI 代码解读

注意

1.
备份参数 -C 表示无缝checkpoint, 所以可能很慢,视checkpoint_completion_target和segment_size的配置。

如果你发现pg_rman开始很慢,可以把-C去掉,速度就快了,但是可能在高峰时,造成冲击。

建议高峰是不要备份。

2.
BUG

unix socket 是$PGDATA时, validate会报错

pg_rman validate  

INFO: validate: "2016-08-26 16:19:25" backup, archive log files and server log files by CRC  
ERROR: invalid type '?' found in "/data05/digoal/pgbak/20160826/161925/file_database.txt"  

vi /data05/digoal/pgbak/20160826/161925/file_database.txt  

.s.PGSQL.1921 ? 0 0 0777 2016-08-26 15:35:05  
AI 代码解读

修改一下dir.c的代码即可修复这个问题,修改如下

                if (strncmp(path, ".s.PGSQL", 7) != 0 && type != 'f' && type != 'F' && type != 'd' && type != 'l')
                        ereport(ERROR,
                                (errcode(ERROR_CORRUPTED),
                                 errmsg("invalid type '%c' found in \"%s\"", type, file_txt)));
AI 代码解读
目录
打赏
0
0
0
4
20694
分享
相关文章
【03】仿站技术之python技术,看完学会再也不用去购买收费工具了-修改整体页面做好安卓下载发给客户-并且开始提交网站公安备案-作为APP下载落地页文娱产品一定要备案-包括安卓android下载(简单)-ios苹果plist下载(稍微麻烦一丢丢)-优雅草卓伊凡
【03】仿站技术之python技术,看完学会再也不用去购买收费工具了-修改整体页面做好安卓下载发给客户-并且开始提交网站公安备案-作为APP下载落地页文娱产品一定要备案-包括安卓android下载(简单)-ios苹果plist下载(稍微麻烦一丢丢)-优雅草卓伊凡
95 13
【03】仿站技术之python技术,看完学会再也不用去购买收费工具了-修改整体页面做好安卓下载发给客户-并且开始提交网站公安备案-作为APP下载落地页文娱产品一定要备案-包括安卓android下载(简单)-ios苹果plist下载(稍微麻烦一丢丢)-优雅草卓伊凡
苹果app上架-ios上架苹果商店app store 之苹果支付In - App Purchase内购配置-优雅草卓伊凡
苹果app上架-ios上架苹果商店app store 之苹果支付In - App Purchase内购配置-优雅草卓伊凡
99 13
苹果app上架-ios上架苹果商店app store 之苹果支付In - App Purchase内购配置-优雅草卓伊凡
苹果app上架app store 之苹果开发者账户在mac电脑上如何使用钥匙串访问-发行-APP发布证书ios_distribution.cer-优雅草卓伊凡
苹果app上架app store 之苹果开发者账户在mac电脑上如何使用钥匙串访问-发行-APP发布证书ios_distribution.cer-优雅草卓伊凡
58 8
苹果app上架app store 之苹果开发者账户在mac电脑上如何使用钥匙串访问-发行-APP发布证书ios_distribution.cer-优雅草卓伊凡
基于ssm的考研图书电子商务平台,附源码+数据库+论文
考研图书电子商务平台是一个基于Java的B/S架构系统,适用于Windows环境。该平台设有管理员和用户权限,管理员可管理商品、用户、留言板及订单,用户可管理收货地址、订单、收藏及购买商品。技术框架包括前端Vue+HTML+JavaScript+CSS+LayUI,后端SSM,数据库为MySQL。项目包含17个数据库表,支持Maven构建。提供演示视频和详细文档,支持免费远程调试安装,确保顺利运行。
41 13
基于ssm的考研图书电子商务平台,附源码+数据库+论文
基于ssm的社区物业管理系统,附源码+数据库+论文+任务书
社区物业管理系统采用B/S架构,基于Java语言开发,使用MySQL数据库。系统涵盖个人中心、用户管理、楼盘管理、收费管理、停车登记、报修与投诉管理等功能模块,方便管理员及用户操作。前端采用Vue、HTML、JavaScript等技术,后端使用SSM框架。系统支持远程安装调试,确保顺利运行。提供演示视频和详细文档截图,帮助用户快速上手。
67 17
基于ssm的超市会员(积分)管理系统,附源码+数据库+论文,包安装调试
本项目为简单内容浏览和信息处理系统,具备管理员和员工权限。管理员可管理会员、员工、商品及积分记录,员工则负责积分、商品信息和兑换管理。技术框架采用Java编程语言,B/S架构,前端使用Vue+JSP+JavaScript+Css+LayUI,后端为SSM框架,数据库为MySQL。运行环境为Windows,JDK8+Tomcat8.5,非前后端分离的Maven项目。提供演示视频和详细文档,购买后支持免费远程安装调试。
86 19
基于ssm的培训学校教学管理平台,附源码+数据库+论文
金旗帜文化培训学校网站项目包含管理员、教师和用户三种角色,各角色功能通过用例图展示。技术框架采用Java语言,B/S架构,前端为Vue+HTML+CSS+LayUI,后端为SSM,数据库为MySQL,运行环境为JDK8+Tomcat8.5。项目含12张数据库表,非前后端分离,支持演示视频与截图查看。购买后提供免费安装调试服务,确保顺利运行。
48 14
[Java计算机毕设]基于ssm的OA办公管理系统的设计与实现,附源码+数据库+论文+开题,包安装调试
OA办公管理系统是一款基于Java和SSM框架开发的B/S架构应用,适用于Windows系统。项目包含管理员、项目管理人员和普通用户三种角色,分别负责系统管理、请假审批、图书借阅等日常办公事务。系统使用Vue、HTML、JavaScript、CSS和LayUI构建前端,后端采用SSM框架,数据库为MySQL,共24张表。提供完整演示视频和详细文档截图,支持远程安装调试,确保顺利运行。
84 17
基于ssm的网络直播带货管理系统,附源码+数据库+论文
该项目为网络直播带货网站,包含管理员和用户两个角色。管理员可进行主页、个人中心、用户管理、商品分类与信息管理、系统及订单管理;用户可浏览主页、管理个人中心、收藏和订单。系统基于Java开发,采用B/S架构,前端使用Vue、JSP等技术,后端为SSM框架,数据库为MySQL。项目运行环境为Windows,支持JDK8、Tomcat8.5。提供演示视频和详细文档截图。
58 10
基于ssm的培训学校教学管理平台,附源码+数据库+论文
该项目为一培训学校教学管理平台,涵盖管理员、教师和学生三大功能模块。管理员可进行系统全面管理,包括学生、教师、课程等信息的增删改查;教师能管理个人中心、课程及选课信息;学生则可管理个人中心及选课信息。技术框架采用Java编程语言,基于B/S架构,前端使用Vue+HTML+JavaScript+CSS+LayUI,后端采用SSM框架,数据库为MySQL。项目运行环境为JDK8+MySQL5.7+Tomcat8.5,支持远程调试安装。演示视频与详细文档截图均提供下载链接。

数据库

+关注