Optimize DB Recovery: Multi-Threaded PostgreSQL Restores on Amazon Aurora Serverless

Unlocking restore speed and efficiency in cloud database management while minimizing RTO and RPO when restoring a single database.

Published Feb 25, 2024
Last Modified Oct 31, 2024

Overview

Restoring a database to an Amazon RDS instance is one of the most frequently encountered operations. In this concise tutorial, I've put together a straightforward bash script to facilitate database restoration on an Amazon Aurora Serverless cluster; the same approach applies to any Amazon RDS for PostgreSQL database. A noteworthy capability of the pg_restore utility is that it can run the restoration with several concurrent jobs via the -j N flag, where N is the number of jobs. For further details on pg_restore, refer to the official PostgreSQL documentation available online.
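As a quick, hedged illustration of the flag (the hostname, database name, and dump path below are placeholders, and the dump must be in pg_dump's custom or directory format for parallel restore to work):

```bash
# Restore a custom-format dump using 8 parallel jobs (placeholder values)
pg_restore -h demo-cluster.cluster-xxxxxxxx.us-east-1.rds.amazonaws.com \
  -U postgres -d demo -j 8 /backups/demo.dump
```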
I leveraged Amazon Aurora Serverless and adjusted the ACU settings for optimization. This is straightforward: increase the instance's capacity for the restoration, then reduce it afterwards to minimize expenses.
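If you want to script the capacity change as well, the ACU range of an Aurora Serverless v2 cluster can be adjusted with the AWS CLI; the cluster identifier below is a placeholder for your own:

```bash
# Scale the cluster up before the restore, then back down afterwards to save cost
aws rds modify-db-cluster \
  --db-cluster-identifier demo-aurora-cluster \
  --serverless-v2-scaling-configuration MinCapacity=2,MaxCapacity=32 \
  --apply-immediately
```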
The duration of the restoration is influenced by the size of the backup and the capacity of the Amazon RDS instance. In my experiment, involving an approximately 50 GB database and an Amazon Aurora Serverless instance configured with a minimum of 2 ACUs and a maximum of 32 ACUs, the restoration completed in 53 minutes.
For the initial comparison, with multi-threading turned off and the ACU sizes left at their defaults (minimum of 0.5 and maximum of 2), the restoration took 3 hours. This demonstrates the significant efficiency gains achievable through proper sizing and multi-threading, which made the restoration roughly three times faster than a single-threaded restore against the default database capacity. Furthermore, increasing the size of the EC2 instance running the restoration script would add network and CPU resources, removing remaining bottlenecks and reducing the restore time even further.
Upon observation, it became clear that the parallel restore distributes work across jobs based on table size and other criteria, with each database object restored from start to finish by the same job. This scheduling keeps each object consistent while still restoring independent objects concurrently, which is where the speed-up comes from.

Prerequisites

  • PostgreSQL client tools installed on an EC2 instance or workstation with network access to the Amazon RDS instance. For this demonstration, an Amazon Linux instance was used to install the PostgreSQL client and run the subsequent commands (an install sketch follows these prerequisites).
  • The instance profile has access to AWS Secrets Manager, which stores the credentials and connection details for the Amazon Aurora Serverless cluster.
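As a rough sketch of the tooling setup (package names vary by distribution and PostgreSQL version; this assumes Amazon Linux 2023):

```bash
# Install the PostgreSQL client tools (psql, pg_dump, pg_restore) and jq,
# which the restore script below uses to parse the secret's JSON payload
sudo dnf install -y postgresql15 jq
```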
  1. Copy the script below to the instance with access to the Amazon Aurora Serverless cluster and the secret.
  2. Currently, the script doesn't automatically create the target database, although this feature could easily be added if required. For now, connect to the RDS instance manually and create the target database with the command: CREATE DATABASE Demo;
  3. Modify the secret_name on line 4 to match your secret's name. Additionally, the script logs the restoration duration by writing to an output file named pg_restore_log.txt; a sketch of such a script follows this list.
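The original script is not reproduced here, but a minimal sketch of such a restore script could look like the following. It assumes the secret is a JSON document containing host, port, username, and password keys, that the AWS CLI and jq are available, and that the dump file is in pg_dump's custom format; adjust the placeholder values to your environment:

```bash
#!/usr/bin/env bash
# Minimal sketch of a multi-threaded pg_restore wrapper (illustrative only).
set -euo pipefail

secret_name="my-aurora-secret"      # replace with your secret's name
target_db="demo"                    # created beforehand with CREATE DATABASE Demo;
dump_file="/backups/demo.dump"      # custom-format dump produced by pg_dump -Fc
jobs=8                              # number of parallel pg_restore jobs
log_file="pg_restore_log.txt"

# Fetch the connection details from AWS Secrets Manager.
secret_json=$(aws secretsmanager get-secret-value \
  --secret-id "$secret_name" \
  --query SecretString --output text)

export PGHOST=$(echo "$secret_json" | jq -r '.host')
export PGPORT=$(echo "$secret_json" | jq -r '.port')
export PGUSER=$(echo "$secret_json" | jq -r '.username')
export PGPASSWORD=$(echo "$secret_json" | jq -r '.password')

# Run the restore with multiple parallel jobs and record the duration.
start=$(date +%s)
pg_restore --no-owner --verbose -j "$jobs" -d "$target_db" "$dump_file" >> "$log_file" 2>&1
end=$(date +%s)
echo "Restore finished in $(( end - start )) seconds" | tee -a "$log_file"
```

Note that parallel restore only works with custom-format or directory-format dumps, and -j cannot be combined with --single-transaction.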
To keep track of the pg_restore process, as illustrated in the screenshot below, run the multi-threaded restore command in one terminal and monitor the worker processes from a separate terminal (a monitoring sketch follows the screenshot).
[Screenshot: pg_restore process]
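For example, the worker processes and active database sessions can be observed from a second terminal with something along these lines (connection options are placeholders):

```bash
# Watch the parallel pg_restore worker processes
watch -n 5 "ps -o pid,etime,args -C pg_restore"

# Or check the active sessions on the database side
psql -h "$PGHOST" -U "$PGUSER" -d demo \
  -c "SELECT pid, state, left(query, 60) AS query FROM pg_stat_activity WHERE state = 'active';"
```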

Conclusion

In summary, we've seen how multi-threading and ACU adjustments can significantly speed up database restoration on an Amazon Aurora Serverless cluster, particularly with PostgreSQL. Further speed improvements could come from adjusting the I/O optimization settings in Amazon Aurora Serverless or standard Amazon RDS for PostgreSQL, though this has cost implications and restrictions. As we navigate these options, balancing speed with cost efficiency is key. We're eager to hear your feedback and thoughts on these strategies.
 
