Optimize DB Recovery: Multi-Thread PostgreSQL Restores on RDS


Unlocking restore speed and efficiency in cloud database management, and minimizing RTO and RPO by restoring a single database.

Published Feb 25, 2024

Overview

Restoring a database to an RDS instance ranks among the most frequently encountered operations. In this concise tutorial, I've developed a straightforward bash script to facilitate database restoration on an RDS Aurora Serverless cluster; the approach applies to any RDS Postgres database. A noteworthy aspect of the pg_restore utility is its capability to run the restoration across concurrent jobs via the -j N flag, where N represents the total number of jobs. For further details on pg_restore, refer to the official documentation available online.
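As a minimal sketch of such an invocation (the host, user, and database names below are placeholders), a parallel restore with eight jobs looks like this:

# Restore a custom-format dump with 8 parallel jobs (placeholder connection values)
pg_restore --verbose --no-owner \
  -h demo-cluster.cluster-xxxx.us-east-1.rds.amazonaws.com \
  -U postgres -d Demo -j 8 latest.dump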
I leveraged Serverless RDS instances and adjusted the ACU settings for optimization. This process is straightforward: you can increase the instance's capacity for the restoration and then reduce it afterwards to minimize expenses.
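For example, assuming an Aurora Serverless v2 cluster named demo-cluster (a placeholder), the scaling range can be widened before the restore and narrowed again afterwards with the AWS CLI:

# Scale up before the restore (placeholder cluster identifier)
aws rds modify-db-cluster \
  --db-cluster-identifier demo-cluster \
  --serverless-v2-scaling-configuration MinCapacity=2,MaxCapacity=32 \
  --apply-immediately

# Scale back down once the restore completes
aws rds modify-db-cluster \
  --db-cluster-identifier demo-cluster \
  --serverless-v2-scaling-configuration MinCapacity=0.5,MaxCapacity=2 \
  --apply-immediately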
The duration of the restoration is influenced by the size of the backup and the capacity of the RDS instance. In my experiment, involving a database of approximately 50 GB and an RDS Aurora Serverless instance configured with a minimum of 2 ACUs and a maximum of 32 ACUs, the restoration completed in 53 minutes.
For the initial comparison, with multi-threading turned off and ACU sizes set to the default (minimum of 0.5 and maximum of 2), the restoration process lasted 3 hours. This demonstrates the significant efficiency gains achievable through proper sizing and multi-threading, which together made the restoration roughly 3 times faster than the single-threaded run on the default database sizes. Furthermore, enlarging the EC2 instance where the restoration script operates could further boost network and CPU resources, removing client-side bottlenecks and reducing the restore time even more.
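If the client machine does become the bottleneck, its instance type can be changed with the AWS CLI; a sketch with placeholder IDs and types (note that an EC2 instance must be stopped before it can be resized):

# Stop, resize, and restart the EC2 instance (placeholder instance ID and type)
aws ec2 stop-instances --instance-ids i-0123456789abcdef0
aws ec2 wait instance-stopped --instance-ids i-0123456789abcdef0
aws ec2 modify-instance-attribute --instance-id i-0123456789abcdef0 --instance-type "{\"Value\": \"m5.2xlarge\"}"
aws ec2 start-instances --instance-ids i-0123456789abcdef0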
Upon observation, it became clear that multi-threading optimizes the restoration process by allocating tasks based on table size and other criteria, while ensuring that each database object is restored by the same thread from start to finish. This intelligent scheduling of restoration tasks maintains consistency while improving speed.
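One caveat: pg_restore can only parallelize archives produced in pg_dump's custom (-Fc) or directory (-Fd) format; a plain SQL dump cannot be restored with -j. A sketch of producing a suitable dump (placeholder connection values):

# Create a custom-format archive that pg_restore can process in parallel
pg_dump -h source-host.example.com -U postgres -Fc -f latest.dump source_db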

Prerequisites

  • Postgres client tools must be installed on an EC2 instance or a workstation with access to the RDS instance. For this demonstration, an Amazon Linux instance was used to install Postgres and execute the subsequent commands.
sudo dnf install postgresql15
  • The instance profile must have access to Secrets Manager, which stores the credentials and connection details for the Aurora RDS Postgres cluster; a minimal policy sketch is shown below.
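As an illustrative sketch (the role name and secret ARN are placeholders), a minimal inline policy granting the instance role read access to the secret could be attached like this:

# Grant the instance role read access to the secret (placeholder names and ARN)
cat > secret-read-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "secretsmanager:GetSecretValue",
      "Resource": "arn:aws:secretsmanager:us-east-1:123456789012:secret:postgres/credentials-*"
    }
  ]
}
EOF
aws iam put-role-policy \
  --role-name my-ec2-role \
  --policy-name ReadPostgresSecret \
  --policy-document file://secret-read-policy.json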
  1. Copy the script below to the instance that has access to the RDS instance and the secret.
  2. Currently, the script doesn't automatically create the target database, although this feature could easily be incorporated if required. For the time being, manually connect to the RDS instance and create the target database with the command: CREATE DATABASE Demo; (see the sketch after this list).
  3. Modify the secret_name on line 4 to match your secret's name. Additionally, the script logs the restoration duration by writing to an output file named pg_restore_log.txt.
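For step 2, a sketch of creating the target database manually (placeholder endpoint and user; note that unquoted identifiers fold to lowercase in Postgres, so quoting "Demo" keeps the name matching the -d Demo used by the script):

# Connect to the default database and create the restore target
psql -h demo-cluster.cluster-xxxx.us-east-1.rds.amazonaws.com \
     -U postgres -d postgres \
     -c 'CREATE DATABASE "Demo";'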
#!/bin/bash

# Name of the secret in AWS Secrets Manager
secret_name="postgres/credentials"

# Path to the log file where start and finish times will be written
LOG_FILE="./pg_restore_log.txt"

# Capture and write the start time to the log file
echo "Start time: $(date)" > $LOG_FILE

# Retrieve the secret value from AWS Secrets Manager (not echoed, to keep credentials out of logs)
secret_value=$(aws secretsmanager get-secret-value --secret-id "$secret_name" --query 'SecretString' --output text)

# Check if the secret was successfully retrieved
if [ -z "$secret_value" ]; then
    echo "Failed to retrieve secret from AWS Secrets Manager."
    echo "Failed to retrieve secret from AWS Secrets Manager." >> "$LOG_FILE"
    exit 1
fi

# Extract the connection details from the secret JSON
PASSWORD=$(echo "$secret_value" | jq -r '.password')
HOST=$(echo "$secret_value" | jq -r '.host')
USERNAME=$(echo "$secret_value" | jq -r '.username')
ENGINE=$(echo "$secret_value" | jq -r '.engine')

DUMP_FILE_PATH="./latest.dump"
JOB_NUMBER=8
DB_NAME_TO_RESTORE="Demo"

echo "PASSWORD: $PASSWORD"
echo "HOST: $HOST"
echo "USERNAME: $USERNAME"
echo "ENGINE: $ENGINE"
echo "DUMP_FILE_PATH: $DUMP_FILE_PATH"
echo "JOBS_NUMBER: $JOBS_NUMBER"
echo "DB_NAME_TO_RESTORE: $DB_NAME_TO_RESTORE"

# Check if the values were successfully extracted
if [ -z "$PASSWORD" ] || [ -z "$HOST" ] || [ -z "$USERNAME" ] || [ -z "$ENGINE" ]; then
    echo "Failed to extract database credentials from the secret JSON."
    exit 1
fi

# Export PGPASSWORD for PostgreSQL authentication
export PGPASSWORD="$PASSWORD"

# Build the pg_restore command with all options and arguments
PG_COMMAND="pg_restore --verbose --clean --no-acl --no-owner -h $HOST -U $USERNAME -d $DB_NAME_TO_RESTORE -j $JOB_NUMBER $DUMP_FILE_PATH"
echo "PG_COMMAND:$PG_COMMAND"

# Execute the pg_restore command using eval to interpret the string as a command
eval "$PG_COMMAND"

# Unset the PGPASSWORD to avoid leaving sensitive information in the environment
unset PGPASSWORD

# Capture and write the finish time to the log file
echo "Finish time: $(date)" >> $LOG_FILE
To keep track of the pg_restore processes while the multi-threaded restore runs, execute the following command in a separate terminal, as illustrated in the screenshot below.
watch "ps aux | grep pg_restore | grep -v grep"
[Screenshot: pg_restore worker processes]
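The worker sessions can also be observed server-side from any psql connection; a sketch assuming the same placeholder connection values and user:

# Each pg_restore job appears as a separate backend session on the server
psql -h demo-cluster.cluster-xxxx.us-east-1.rds.amazonaws.com \
     -U postgres -d Demo \
     -c "SELECT pid, state, left(query, 60) AS query FROM pg_stat_activity WHERE usename = 'postgres';"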

Conclusion

In summary, we've seen how multi-threading and ACU adjustments can significantly speed up database restoration on RDS, particularly with Postgres. Further speed improvements could come from the I/O-Optimized configuration in Aurora or the storage settings in regular RDS, though these have cost implications and restrictions. As we navigate these options, balancing speed with cost efficiency is key. We're eager to hear your feedback and thoughts on these strategies.
 
