CBDR
CBDR is a backup and recovery tool for SynxDB and Apache Cloudberry, built on top of WAL-G. It provides a simple command-line interface for performing backup and recovery operations, helping ensure data safety and enabling disaster recovery.
CBDR offers the following features:
Full backup: Supports full backup of the entire database cluster.
Incremental backup: Backs up only the changes made since the last backup.
Backup listing: Displays all available backups.
Data recovery: Restores data from a specified backup.
Storage support: Supports S3-compatible object storage.
Configuration management: Generates and manages the configuration files needed for backup and restore.
Tip
Compared to peer tools like gpbackup
and gprestore
, CBDR also supports storing backups to S3, multiple compression algorithms (lz4, lzma, zstd, brotli), and backup encryption.
Full backup and restore procedure
Before using CBDR to back up or restore a SynxDB cluster, make sure the following requirements are met:
SynxDB is properly installed and running.
The
wal-g
binary is installed under/usr/local/bin/
or/usr/bin/
.If using S3 storage, the appropriate credentials have been configured.
The general procedure for performing a backup using CBDR is as follows:
Backup process
Create a backup configuration file named
config.yaml
. Assume the file is located at/path/to/config.yaml
. For the configuration file template, see Configuration file reference.Distribute the configuration file to all Segment nodes and update the archive command in
postgresql.conf
:cbdr configure backup --config=/path/to/config.yaml
Restart the SynxDB cluster:
gpstop -ari
Perform the backup:
cbdr backup --config=/path/to/config.yaml
View the list of available backups:
cbdr backup-list --config=/path/to/config.yaml
Restore process
Prepare a new SynxDB cluster as the target for restoration, and create the required configuration file
config.yaml
. Assume the file is located at/path/to/config.yaml
.Generate the restore configuration file
restore_cfg.json
. Before running this command, make sure the new cluster is reachable:cbdr configure restore --config=/path/to/config.yaml --restore-config=/path/to/restore_cfg.json
Delete all existing data directories on the new cluster, including both Coordinator and Segment nodes. For example:
rm -rf /data202502111728221784/coordinator/gpseg-1 rm -rf /data202502111728221784/segment/gpseg-0 rm -rf /data202502111728221784/segment/gpseg-1 rm -rf /data202502111728221784/segment/gpseg-2
Perform the restore:
cbdr restore --config=/path/to/config.yaml --restore-config=/path/to/restore_cfg.json
Start the Coordinator node in admin mode and update the
gp_segment_configuration
system table to set the correcthostname
,address
,mirror
, and other fields:gpstart -c -a
If the following error occurs during startup, use
ps -ef | grep postgres
to check if the Coordinator process is running. If it is, you can safely ignore the error:gpstart failed. (Reason='connection to server at "localhost" (127.0.0.1), port 7000 failed: FATAL: the database system is not accepting connections DETAIL: Hot standby mode is disabled.')
Once the Coordinator starts successfully, you can query the segment configuration:
select * from gp_segment_configuration;
If you have exited admin mode, restart the cluster in admin mode:
gpstop -c -a
Start the full cluster:
gpstart -a
You might encounter the following error during startup:
invalid IP mask "trust": Name or service not known
This happens because the
pg_hba.conf
file generated by WAL-G is missing CIDR masks (for example,/32
). You need to manually fix the configuration files for the Coordinator, Segment, and Mirror nodes.Example of incorrect configuration:
host all all 192.168.199.42 trust host all gpadmin 192.168.192.159 trust host all gpadmin 192.168.197.5 trust
Corrected configuration:
host all all 192.168.199.42/32 trust host all gpadmin 192.168.192.159/32 trust host all gpadmin 192.168.197.5/32 trust
Attention
Before backing up, make sure the database is running, configuration is correct, and there is enough available disk space.
Prepare the restore environment in advance. Do not interrupt the restore process. After restoring, always verify data integrity.
For storage management, regularly clean up invalid backups and monitor storage usage. If using S3, ensure a stable network connection.
Incremental backup and restore procedure
Before performing an incremental backup, make sure that you have completed at least one full backup.
On the source cluster, run the following command to start an incremental backup based on a specific full backup. Example:
cbdr backup --config=/path/to/config.yaml --delta-from-name=backup_20250409T153036Z
Attention
If you have not run the
cbdr configure backup
command on the current machine, or if you have run it before but the configuration file has changed, runcbdr configure backup --config=/path/to/config.yaml
first. This command distributes the/path/to/config.yaml
file from the coordinator node to the same file path on all segment nodes.If you have run the
cbdr configure backup
command on the current machine, and the configuration file has not changed since then, you can simply runcbdr backup
.View all available backups (including full and incremental):
cbdr backup-list --config=/path/to/config.yaml
Sample output:
backup_name modified wal_file_name storage_name backup_20250409T153036Z 2025-04-09T15:31:36+08:00 ZZZZZZZZZZZZZZZZZZZZZZZZ default backup_20250409T153136Z_D_20250409T153036Z 2025-04-09T15:32:36+08:00 ZZZZZZZZZZZZZZZZZZZZZZZZ default
Prepare a new SynxDB cluster as the target for restore. The preparation process is the same as for full backup restore.
Run the restore on the new cluster:
cbdr restore backup_20250409T153136Z_D_20250409T153036Z \ --config=/path/to/config.yaml \ --restore-config=/path/to/restore_cfg.json
Restore procedure using a restore point
CBDR supports creating logical restore points based on full or incremental backups. A restore point is essentially a named timestamp marker that allows a new cluster to be restored to a specific point in time.
Currently, CBDR only supports restoring to a restore point on a newly prepared cluster. It does not yet support incremental continuation from a restore point on an already restored cluster.
Below is an example of restoring a new cluster to a specified restore point:
Create a restore point on the source cluster:
cbdr create-restore-point "restore-point-1" --config=/path/to/config.yaml
View all backups and restore points:
cbdr backup-list --config=/path/to/config.yaml
Sample output:
backup_name modified wal_file_name storage_name backup_20250409T155552Z 2025-04-09T15:56:52+08:00 ZZZZZZZZZZZZZZZZZZZZZZZZ default restore-point-1 2025-04-09T15:56:52+08:00 ZZZZZZZZZZZZZZZZZZZZZZZZ default
Prepare a new SynxDB cluster as the restore target. The preparation steps are the same as in the full backup restore scenario and are not repeated here.
Run the restore on the new cluster, specifying the restore point name:
cbdr restore \ --config=/path/to/config.yaml \ --restore-config=/path/to/config.yaml \ --restore-point="restore-point-1"
Configuration file reference
CBDR uses YAML configuration files. Currently, it only supports configuring S3 storage parameters and does not support storing backups on the local file system. The following is a sample configuration.
# Database connection settings
PGHOST: "localhost"
PGPORT: 7000
PGUSER: "gpadmin"
PGDATABASE: "postgres"
# Concurrency settings
GOMAXPROCS: 6
# Relative path to the recovery configuration file
WALG_GP_RELATIVE_RECOVERY_CONF_PATH: "conf.d/recovery.conf"
# Polling interval for segment status
WALG_GP_SEG_POLL_INTERVAL: "1m"
# Compression method: supports lz4, lzma, zstd, brotli
WALG_COMPRESSION_METHOD: "lz4"
# Upload and download concurrency
WALG_UPLOAD_CONCURRENCY: 5
WALG_DOWNLOAD_CONCURRENCY: 5
# Retry attempts for file download
WALG_DOWNLOAD_FILE_RETRIES: 5
# Required settings for using S3 storage
WALE_S3_PREFIX: "xxxxxxxxxxxxx"
AWS_ENDPOINT: "xxxxxxxxxxxxx"
AWS_SECRET_ACCESS_KEY: "xxxxxxxxxxxxx"
AWS_ACCESS_KEY_ID: "xxxxxxxxxxxxx"
# Directory for log files
WALG_GP_LOGS_DIR: "/var/log/cbdr"
# Incremental backup limit: maximum number of incremental backups allowed
# after a full backup. For example, if set to 6, a new full backup will be
# forced after 6 consecutive incremental backups to prevent long backup chains.
WALG_DELTA_MAX_STEPS: 10
Command usage
The basic syntax for running CBDR commands is:
cbdr <command> [options] --config=<config_file>
The following sections describe the main usage of each CBDR command.
Configure commands
The cbdr configure
command is used to distribute the backup configuration to all Segment nodes or to generate a restore configuration file.
# Distribute the backup configuration and update the archive command in postgresql.conf
cbdr configure backup --config=<config_file>
# Generate the restore configuration file
cbdr configure restore --config=<config_file> --restore-config=<restore_config_file>
The restore configuration file can be automatically generated using cbdr configure restore
(requires the cluster to be reachable), or it can be written manually. A sample JSON format is shown below:
{
"segments": {
"-1": {
"hostname": "localhost",
"port": 7000,
"data_dir": "/tmp/tests/gpdemo/datadirs1/qddir/demoDataDir-1"
},
"0": {
"hostname": "localhost",
"port": 7002,
"data_dir": "/tmp/tests/gpdemo/datadirs1/dbfast1/demoDataDir0"
},
"1": {
"hostname": "localhost",
"port": 7003,
"data_dir": "/tmp/tests/gpdemo/datadirs1/dbfast2/demoDataDir1"
},
"2": {
"hostname": "localhost",
"port": 7004,
"data_dir": "/tmp/tests/gpdemo/datadirs1/dbfast3/demoDataDir2"
}
}
}
Cluster backup
The cbdr backup
command is used to back up the database cluster. Syntax:
cbdr backup [options] --config=<config_file>
Optional parameters:
--permanent
: Marks the backup as permanent. It cannot be deleted unless forced.--full
: Performs a full backup.--add-user-data=<json>
: Attaches custom metadata to the backup in JSON format.--delta-from-user-data=<json>
: Specifies the base backup for incremental backup using metadata.--delta-from-name=<backup_name>
: Specifies the base backup for incremental backup by name.
View backup list
The cbdr backup-list
command displays the list of available backups.
cbdr backup-list --config=<config_file> [options]
Optional parameters:
--pretty
: Outputs the list in a more readable format.--json
: Outputs the list in JSON format.--detail
: Displays detailed information.
Restore command
The cbdr restore
command restores a backup to the target cluster. Usage:
cbdr restore <backup_name> --config=<config_file> [--restore-config=<restore_config_file>] [--target-user-data=<json>]
Parameter descriptions:
--restore-config
: Path to the restore configuration file.--target-user-data
: Restores a backup that matches the specified user-defined metadata.
Delete command
The cbdr delete
command removes an existing backup.
cbdr delete --config=<config_file> [--confirm] [--force-delete]
Parameter descriptions:
--confirm
: Must be explicitly set to execute the deletion.--force-delete
: Forces deletion even for permanent backups.
Restore point commands
Restore points are used to support incremental recovery.
To create a restore point:
cbdr create-restore-point <restore-point-name> --config=<config_file>
To view existing restore points:
cbdr restore-point-list --config=<config_file> [--pretty] [--json] [--detail]