My favorites | Sign in
Project Home Downloads Wiki Issues Source
Search
for
readme  
User manual
Updated May 18, 2012 by T.Katsum...@gmail.com

pg_rman -- manages backup and recovery of PostgreSQL.

Synopsis

pg_rman [ OPTIONS ] { init | backup | restore | show [ DATE | timeline ] | validate [ DATE ] | delete DATE }

pg_rman has the features below:

  • Takes a backup while database including tablespaces with just one command.
  • Can recovery from backup with just one command.
  • Supports incremental backup and compression of backup files so that it takes less disk spaces.
  • Manages backup generations and shows a catalog of the backups.
  • Supports storage snapshot.

DATE is the start time of the target backup in ISO-format (YYYY-MM-DD HH:MI:SS). Prefix match is used to compare DATE and backup files.

$ pg_rman show 2009-12 # show backups in a month of December 2009
$ pg_rman validate     # validate all unvalidated backups

pg_rman supports the following commands. See also Options for details of OPTIONS.

init

Initialize a backup catalog.
backup
Take an online backup.
restore
Do restore.
show
Show backup history.

The timeline option shows timeline of the backup and the parent's timeline for each backup.
validate
Validate backup files.
delete
Delete backup files.

Description

pg_rman is a utility program to backup and restore PostgreSQL database. It takes a physical online backup of whole database cluster, archive WALs, and server logs.

pg_rman supports getting backup from standby-site with PostgreSQL 9.0 later.

And pg_rman supports storage snapshot backup.

Initialize a backup catalog

First, you need to create "a backup catalog" to store backup files and their metadata.

It is recommended to setup archive_mode and archive_command in postgresql.conf before initialize the backup catalog. If the variables are initialized, pg_rman can adjust the config file to the setting. In this case, you have to specify the database cluster path for PostgreSQL. Please specify it in PGDATA environmental variable or -D/--pgdata option.

$ pg_rman init -B <a backup catalog path>

Backup

Backup target can be one of the following types. Also serverlogs can be added.

Full backup

Backup a whole database cluster.
Incremental backup
Backup only files or pages modified after the last verified backup.
Archive WAL backup
Backup only archive WAL files.

It is recommended to verify backup files as soon as possible after backup. Unverified backup cannot be used in restore nor in incremental backup.

Restore

PostgreSQL server should be stopped before restore. If database cluster still exists, restore command will save unarchived transaction log and delete all database files. You can retry recovery until a new backup is taken. After restoring files, pg_rman create recovery.conf in $PGDATA. The conf file contains parameters to recovery, and you can also modify the file if needed.

It is recommended to take a full backup as soon as possible after recovery is succeeded.

If "--recovery-target-timeline" is not specifed, the last checkpoint's TimeLineID in control file ($PGDATA/global/pg_control) will be a restore target. If pg_control is not present, TimeLineID in the full backup used by the restore will be a restore target.

Standby-site Backup

If you use replication feature on PostgreSQL 9.0 later, you can get backup from standby-site.

You shold specify different options from usual use for getting backup from standby-site. In detail, you shold specify the database cluster on standby-site with -D/--pgdata option. And you shold specify information on master-site with connection options(-d/--dbname, -h/--host, -p/--port).

Please see Examples on Standby-site for more detail information.

Examples

To reduce the number of command line arguments, you can set BACKUP_PATH, an environment variablle, to the absolute path of the backup catalog and write default configuration into ${BACKUP_PATH}/pg_rman.ini.

$ cat $BACKUP_PATH/pg_rman.ini
ARCLOG_PATH = /home/postgres/arclog
SRVLOG_PATH = /home/postgres/pgdata/pg_log

BACKUP_MODE = F
COMPRESS_DATA = YES
KEEP_ARCLOG_FILES = 10
KEEP_ARCLOG_DAYS = 10
KEEP_DATA_GENERATIONS = 3
KEEP_DATA_DAYS = 120
KEEP_SRVLOG_FILES = 10
KEEP_SRVLOG_DAYS = 10

Take a backup. This example takes a full backup of the whole database with serverlogs. Then, it validates all unvalidated backups.

$ pg_rman backup --backup-mode=full --with-serverlog
$ pg_rman validate

Discatd the all database and restore from a backup.

$ pg_ctl stop -m immediate
$ pg_rman restore
$ pg_ctl start 

You can retrieve a backup catalog with show command.

$ pg_rman show
============================================================================
Start                Time   Total    Data     WAL     Log  Backup   Status  
============================================================================
2011-11-27 19:16:37  ----    ----    ----    ----    ----    ----   RUNNING
2011-11-27 19:16:20    3m    ----  9223kB    16MB    ----    17MB   OK
2011-11-27 19:15:45    3m  1242MB    ----    32MB    ----   242MB   OK

The fields are:

  • Start : start time of backup
  • Time : total time of backup
  • Total : size of whole database cluster
  • Data : size of read data files
  • WAL : size of read WAL archive files
  • Log : size of read serverlog files
  • Backup: size of backup (= written size)
  • Status: status of backup. Possible values are:
    • OK : backup is done and validated.
    • DONE : backup is done, but not validated yet.
    • RUNNING : backup is running now.
    • DELETING : backup is being deleted now.
    • DELETED : backup has been deleted.
    • ERROR : backup is unavailable because some errors occur during backup.
    • CORRUPT : backup is unavailable because it is broken.
with "timeline", you can see the timeline with the backup catalog.

$ pg_rman show timeline
============================================================
Start                Mode  Current TLI  Parent TLI  Status  
============================================================
2011-11-27 19:16:37  INCR            1           0  RUNNING
2011-11-27 19:16:20  INCR            1           0  OK
2011-11-27 19:15:45  FULL            1           0  OK

And more, when you specify the date in "Start" field, you can see the detail information of the backup.

$ pg_rman show '2011-11-27 19:15:45'
# configuration
BACKUP_MODE=FULL
WITH_SERVERLOG=false
COMPRESS_DATA=false
# result
TIMELINEID=1
START_LSN=0/08000020
STOP_LSN=0/080000a0
START_TIME='2011-11-27 19:15:45'
END_TIME='2011-11-27 19:19:02'
RECOVERY_XID=1759
RECOVERY_TIME='2011-11-27 19:15:53'
TOTAL_DATA_BYTES=1242420184
READ_DATA_BYTES=25420184
READ_ARCLOG_BYTES=32218912
WRITE_BYTES=242919520
BLOCK_SIZE=8192
XLOG_BLOCK_SIZE=8192
STATUS=OK

You can check the "RECOVERY_XID" and "RECOVERY_TIME" which are used for restore option "--recovery-target-xid", "--recovery-target-time".

The delete command deletes backup files not required by recovery after the specified date. The following example deletes unneeded backup files to recovery at 12:00 11, September 2009.

$ pg_rman show
============================================================================
Start                Time   Total    Data     WAL     Log  Backup   Status
============================================================================
2009-09-11 20:00:01    0m    ----    ----      0B    ----      0B   OK
2009-09-11 15:00:53    0m    ----   8363B    16MB    ----  2346kB   OK
2009-09-11 10:00:48    0m    ----    ----      0B    ----      0B   OK
2009-09-11 05:00:06    0m    40MB    ----    16MB    ----  5277kB   OK
2009-09-11 00:00:02    0m    ----   8363B    16MB    ----   464kB   OK
2009-09-10 20:30:12    0m    ----    ----      0B    ----      0B   OK
2009-09-10 15:00:06    0m    ----    ----    16MB    ----    16kB   OK
2009-09-10 10:00:02    0m    ----   8363B    16MB    ----    16kB   OK
2009-09-10 05:00:08    0m    40MB    ----   150MB    ----    13MB   OK
$ pg_rman delete 2009-09-11 12:00:00
$ pg_rman show
============================================================================
Start                Time   Total    Data     WAL     Log  Backup   Status
============================================================================
2009-09-11 20:00:01    0m    ----    ----      0B    ----      0B   OK
2009-09-11 15:00:53    0m    ----   8363B    16MB    ----  2346kB   OK
2009-09-11 10:00:48    0m    ----    ----      0B    ----      0B   OK
2009-09-11 05:00:06    0m    40MB    ----    16MB    ----  5277kB   OK

Examples on Standby-site

This is examples to get backup with pg_rman from standby-site. Basical use is same with how to use on master-site, so the difference points are shown on bellow.

At first, about initializing backup catalog. You shold specify the database cluster path on standby-site with -D/--pgdata.

$ pg_rman init -B <a backup catalog path> -D <(the database cluster path(on standby-site)>

When you get backup from standby-site, you shold specify the database cluster on standby-site with -D/--pgdata option. And you shold specify information on master-site with connection options(-d/--dbname, -h/--host, -p/--port etc.). This is an example on the condition that the database cluster path on standby-site is /home/postgres/pgdata_sby, and master is working same host with 5433 port.

$ pg_rman backup --pgdata=/home/postgres/pgdata_sby --backup-mode=full --port=5433

Options

pg_rman accepts the following command line parameters. Some of them can be also sepcified as environment variables. See also Parameters for the details.

Common options

As a general rule, paths for data location need to be specified as absolute paths; relative paths are not allowed. -D PATH / --pgdata=PATH

The absolute path of database cluster. Required on backup and restore.

-A PATH / --arclog-path=PATH

The absolute path of archive WAL directory. Required on backup and restore.

-S PATH / --srvlog-path=PATH

The absolute path of server log directory. Required on backup with server logs and restore.

-B PATH / --backup-path=PATH

The absolute path of backup catalog. Always required.

-c / --check

If specifed, pg_rman doesn't perform actual jobs but only checks parameters and required resources. The option is typically used with --verbose option to verify the operation.

-v / --verbose

If specified, pg_rman works in verbose mode.

Backup options

-b { full | incremental | archive } / --backup-mode={ full | incremental | archive }

Specify backup target files. Available options are: "full" backup, "incremental" backup, and "archive" backup. Abbreviated forms (prefix match) are also available. For example, -b f means "full" backup.
  • full : Whole database backup and archive backup
  • incremental : Incremental backup and archive backup
  • archive : Only archive backup

-s / --with-serverlog

Backup server log files if specified.

-Z / --compress-data

Comress backup files with zlib if specified.

-C / --smooth-checkpoint

Checkpoint is performed on every backups. If the option is specified, do smooth checkpoint then. See also the second argument for pg_start_backup().

--keep-data-generations / --keep-data-days

Specify how long backuped data files will be kept. --keep-data-generations means number of backup generations. --keep-data-days means days to be kept. Only files exceeded both settings are deleted.

--keep-arclog-files / --keep-arclog-days

Specify how long backuped archive WAL files will be kept. --keep-arclog-files means number of backup files. --keep-arclog-days means days to be kept. When you do backup, only files exceeded both settings are deleted from archive storage. If you want to do with these options, you have to specify both options --keep-arclog-files and --keep-arclog-days.

--keep-srvlog-files / --keep-srvlog-days

Specify how long backuped serverlog files will be kept. --keep-srvlog-files means number of backup files. --keep-srvlog-days means days to be kept. When you do backup, only files exceeded both settings are deleted from server log directory (log_directory). This option works when you specify --with-serverlog and --srvlog-path options in backup command. And If you want to do with these options, you have to specify both options --keep-srvlog-files and --keep-srvlog-days.

Restore options

Those parameters are same as parameters in recovery.conf. See also "Continuous Archiving and Point-In-Time Recovery (PITR)#Recovery Settings" for details.

--recovery-target-timeline TIMELINE

Specifies recovering into a particular timeline.
If not specified, the current timeline Use ($PGDATA/global/pg_control) is used.

--recovery-target-time TIMESTAMP

This parameter specifies the time stamp up to which recovery will proceed.
If not specified, continue recovery to the latest time.

--recovery-target-xid XID

This parameter specifies the transaction ID up to which recovery will proceed.
If not specified, continue recovery to the latest xid.

--recovery-target-inclusive

Specifies whether we stop just after the specified recovery target (true), or just before the recovery target (false). Default is true.

Catalog options

-a / --show-all

Show also deleted backups.

Connection options

Parameters to connect PostgreSQL server.

-d DBNAME / --dbname=DBNAME

The database name to execute pg_start_backup() and pg_stop_backup().

-h HOSTNAME / --host=HOSTNAME

Specifies the host name of the machine on which the server is running. If the value begins with a slash, it is used as the directory for the Unix domain socket.

-p PORT / --port=PORT

Specifies the TCP port or local Unix domain socket file extension on which the server is listening for connections.

-U USERNAME / --username=USERNAME

User name to connect as.

-w / --no-password

Never issue a password prompt. If the server requires password authentication and a password is not available by other means such as a .pgpass file, the connection attempt will fail. This option can be useful in batch jobs and scripts where no user is present to enter a password.

-W / --password

Force pg_rman to prompt for a password before connecting to a database.
This option is never essential, since pg_rman will automatically prompt for a password if the server demands password authentication. However, pg_rman will waste a connection attempt finding out that the server wants a password. In some cases it is worth typing -W to avoid the extra connection attempt.

Generic options

--help

Print help, then exit.

-V / --version

Print version information, then exit.

-! / --debug

Show debug information.

Parameters

Some of parameters can be specified in commandline arguments, environment variables or configuration file as follows:

Short Long Environment variable Conf file Description Remarks
-h --host PGHOST database server host or socket directory
-p --port PGPORT database server port
-d --dbname PGDATABASE database to connect
-U --username PGUSER user name to connect as
PGPASSWORD password used to connect
-w --password force password prompt
-W --no-password never prompt for password
-D --pgdata PGDATA Yes location of the database storage area
-B --backup-path BACKUP_PATH Yes location of the backup storage area
-A --arclog-path ARCLOG_PATH Yes location of archive WAL storage area
-S --srvlog-path SRVLOG_PATH Yes location of server log storage area
-b --backup-mode BACKUP_MODE Yes backup mode (full, incremental, or archive)
-s --with-serverlog WITH_SERVERLOG Yes also backup server log files specify boolean type in environmental variable or configuration file
-Z --compress-data COMPRESS_DATA Yes compress data backup with zlib specify boolean type in environmental variable or configuration file
-C --smooth-checkpoint SMOOTH_CHECKPOINT Yes do smooth checkpoint before backup specify boolean type in environmental variable or configuration file
--keep-data-generations KEEP_DATA_GENERATIONS Yes keep GENERATION of full data backup
--keep-data-days KEEP_DATA_DAYS Yes keep enough data backup to recover to DAY days age
--keep-srvlog-files KEEP_SRVLOG_FILES Yes keep NUM of serverlogs
--keep-srvlog-days KEEP_SRVLOG_DAYS Yes keep serverlog modified in DAY days
--keep-arclog-files KEEP_ARCLOG_FILES Yes keep NUM of archived WAL
--keep-arclog-days KEEP_ARCLOG_DAYS Yes keep archived WAL modified in DAY days
--recovery-target-timeline RECOVERY_TARGET_TIMELINE Yes recovering into a particular timeline
--recovery-target-xid RECOVERY_TARGET_XID Yes transaction ID up to which recovery will proceed
--recovery-target-time RECOVERY_TARGET_TIME Yes time stamp up to which recovery will proceed
--recovery-target-inclusive RECOVERY_TARGET_INCLUSIVE Yes whether we stop just after the recovery target

  • The names of variable in configuration file are the same as long names or names of environment variables.
  • The password can not be specified in command line and configuration file for security reason.

This utility, like most other PostgreSQL utilities, also uses the environment variables supported by libpq (see Environment Variables)

Restrictions

pg_rman has the following restrictions.

Requires to read database cluster directory and write backup catalog directory.

For example, you need to mount the disk where backup catalog is placed with NFS from database server.

Major versions of pg_rman and server should be matched.

pg_rman cannot backup servers with different major versions.

Block sizes of pg_rman and server should be matched.

BLCKSZ and XLOG_BLCKSZ also should be matched.

If there are some unreadable files/directories in database cluster directory, WAL directory or archived WAL directory, the backup or restore would be failed.

Getting backup from standby-site, pg_rman has the follow restrictions too.

  • The environment of replication should be built right, or the backup will not finish.
  • You can't get backups on master and standby at the same time.
  • You can't get backups on multi standbys at the same time too.
  • Basically, the backup from standby-site is used for restoring on MASTER. pg_rman doesn't treat the backup as restoring on standby automatically.
  • If you want to restore the backup on STANDBY, you have to manage archive logs with your self.

When using storage snapshot, pg_rman has the following restrictions too.

  • If your snapshot does not have any file update time, incremental backup is same with full backup.
  • Because pg_rman judges performing full backup or incremental backup by update time for files. If files don't have update time because of storage snapshot specification, pg_rman performs full backup every time.
  • You can't backup for one side works storage with split mirror snapshot.
  • Before you execute pg_rman, you should perform storage "RESYNC".
  • After pg_rman performs backup with split mirror snapshot, storeage will be "SPLITTED"(works on one side).
  • pg_rman perform SPLIT command for getting snapshot, but doesn't perform RESYNC command.
  • You cant't get snapshot from different vendor storages in a time.
  • You cant't use some vendor storages which have different commands for getting snapshot.
  • The script and commands for getting storage snapshot shold be executable.
  • It's expected to have authority of root for getting snapshot or mounting volumes. So a user, performs pg_rman, is granted to execute any commands in the script.
  • If you use LVM(Logical Volume Manager), it's needed root authority for mount, umount, lvcreate, lvremove, lvscan commands. You shold granted to these commands with sudo command to non-password executable.

Details

Recovery to Point-in-Time

pg_rman can recover to point-in-time if timeline, transaction ID, or timestamp are specified in recovery. xlogdump is an useful tool to check the contents of WAL files and determine when to recover. See Continuous Archiving and Point-In-Time Recovery (PITR) for the details.

Configuration file

Setting parameters can be specified with form of "name=value" in the configuration file. Quotes are required if the value contains whitespaces. Comments starts with "#". Whitespaces and tabs are ignored excluding values.

Exit codes

pg_rman returns exit codes for each error status.

Code Name Description
0 SUCCESS Succeeded.
1 HELP Print a help, then exit.
2 ERROR Generic error.
3 FATAL Exit because of repeated errors
4 PANIC Unknown critical condition.
10 ERROR_SYSTEM I/O or system error.
11 ERROR_NOMEM Out of memory.
12 ERROR_ARGS Invalid input parameters.
13 ERROR_INTERRUPTED Interrupted by user. (Ctrl+C etc.)
14 ERROR_PG_COMMAND SQL error.
15 ERROR_PG_CONNECT Cannot connect to PostgreSQL server.
20 ERROR_ARCHIVE_FAILED Cannot archive WAL files.
21 ERROR_NO_BACKUP Backup file not found.
22 ERROR_CORRUPTED Backup file is broken.
23 ERROR_ALREADY_RUNNING Cannot start because another pg_rman is running.
24 ERROR_PG_INCOMPATIBLE Version conflicted with PostgreSQL server.
25 ERROR_PG_RUNNING Cannot restore because PostgreSQL server is running.
26 ERROR_PID_BROKEN postmaster.pid file is broken.

Outer Script

  • This is the script to getting snapshot and mounting file systems.
  • If you want to add outer scripts, you shold make your script corresponding outer script interface according to referring manuals of the storage.
  • Please refer "Interface Specification" about what you shold make.
  • Outer script performs some operation for getting several snapshots in a time execution.
  • If you want to use outer script, you shold set the script in backup catalog directory and rename it to "snapshot_script".
  • A sample outer script is released for LVM(Logical Volume Manager).

Commands Specification

$ ${BACKUP_PATH}/snapshot_script { split | resync | mount | umount | freeze | unfreeze } [cleanup]
  • Input
    • first argument { split | resync | mount | umount | freeze | unfreeze }
      • identifier for performing
    • second argument [cleanup]
      • If you specified "cleanup", error occuring doesn't stop the process. just output warning messages.
      • this argument is for resync, umount, unfreeze only.
  • Output
    • Making snapshot volume by performing "split" operation.
    • Freezing filesystem I/O by performing "freeze" operation.
    • Mounting to snapshot volume by performing "mount" operation.
    • standard output
      • output tablespace name in the snapshot by performing "split" operation.
      • output tablespace name and its directory by performing "mount" operation. The template is &lt;tablespace name&gt;=&lt;path to directory for the tablespace&gt;
      • The command performs without any errors, output "SUCCESS", otherwise output nothing. If the command is "split" or "mount", output in last line.
    • standard error output
      • output log messages.
    • return value
      • Not Specified.

Interface Specification

  • split
    • Making procedure for getting snapshots.
    • Output tablespace name in the snapshots to standard output.
      • If database cluster is included in a snapshot, the tablespace name should be "PG-DATA".
      • If some tablespaces are included in a snapshot, output them with multi lines.
  • resync [cleanup]
    • Making procedure for removing snapshot made by split operation.
    • If you don't need to remove snapshot, like split mirror snapshot, you don't make any procedure.
    • If cleanup is specified and occuring errors, output warning messages and continue to get rest snapshots.
  • mount
    • Making procedure for mounting the snapshot made by split operation to the filesystem.
    • output tablespace name and its direcotry(absolute path) to standard output. The template is [<tablespace name>=<directory path>].
      • If database cluster is included in a snapshot, the tablespace name should be "PG-DATA".
      • If some tablespaces are included in a snapshot, output them with multi lines.
  • umount [cleanup]
    • Making procedure for unmounting the snapshot made by mount operation.
    • If cleanup is specified and occuring errors, output warning messages and continue to unmount rest snapshots.
  • freeze
    • Making procedure for freezing filesystem IO.
    • If your storage doesn't need to freeze filesystem IO, you don't make any procedure.
  • unfreeze [cleanup]
    • Making procedure for unfreezing filesystem IO by performing freeze operation.
    • If your storage doesn't need to freeze filesystem IO, you don't make any procedure.
    • If cleanup is specified and occuring errors, output warning messages and continue to unfreeze rest filesystems.

Explanation for sample script for LVM(Logical Volume Manager)

  • split
    • perform lvcreate command as root authority against a volume for getting snapshot.
    • $ sudo /usr/sbin/lvcreate --snapshot --size=2G --name snap00 /dev/VolGroup00/LogVolume00
      • Above example is getting snapshot for logical volume "LogVolume00".
      • Snapshot name, snapshot size and logical volume for getting snapshot are parameterable.
    • Output tablespace names for backup to standard output.
  • resync
    • perform lvremove command as root authority against a volume for getting snapshot.
    • $ sudo /usr/sbin/lvremove -f /dev/VolGroup00/snap00
  • mount
    • perform mount command as root authority against a volume for getting snapshot.
    • $ sudo /bin/mount /dev/VolGroup00/snap00 /mnt/snapshot_lvm/pgdata
      • Above example is mounting snapshot volume made by split operation to "/mnt/snapshot_lvm/pgdata".
      • Mounting directory name is parametarable.
    • Output tablespace names and corresponding mounting direcotry name to standard output.
  • umount
    • perform umount command as root authority against a volume for getting snapshot.
    • $ sudo /bin/umount /mnt/snapshot_lvm/pgdata
  • freeze
    • noting to do. (not need file system freeze)
  • unfreeze
    • noting to do. (not need file system freeze)

Installations

pg_rman can be installed as same as standard contrib modules.

Build from source

The module can be built with pgxs.

$ cd pg_rman
$ make USE_PGXS=1
$ make USE_PGXS=1 install

No need to register to databases.

Requirements

  • PostgreSQL 9.0, 8.4-8.2
  • OS: RHEL 5.4
Comment by jota.c...@gmail.com, Feb 29, 2012

Is possible to work with PostgreSQL 9.0.4 and RHEL 5.6?


Sign in to add a comment
Powered by Google Project Hosting