🚨 Never Miss a Snapshot Failure! 📸 Monitoring Amazon RDS & Aurora Snapshots Globally 🌍

"Everything fails, all the time." - Amazon's Chief Technology Officer, Werner Vogels ☁️ 💥

Josh
Amazon Employee
Published Jun 4, 2025

🤔 What's the Problem?

If you're relying on Amazon RDS or Aurora snapshots for your backups and disaster recovery (DR) strategy, you probably think you’re covered. Automated snapshots? ✅ Manual snapshots? ✅ Lifecycle policies? ✅
But here’s the gotcha: not all snapshot status changes emit RDS events. That’s right — if a snapshot fails silently 🫥, your backup strategy could be compromised, and you won't know about it unless you go looking.
Let’s pause on that for a moment.
A failed snapshot may not emit an RDS event the way a successful one does (RDS-EVENT-0042, for example, reports that a manual DB snapshot was created). If you’re monitoring RDS events alone, you’re flying blind when it matters most.
That’s where the Snapshot Monitor for Amazon RDS comes in 🛡️.

⚙️ Introducing Snapshot Monitor for Amazon RDS

This solution continuously watches snapshot states — across all your AWS regions, all RDS engine types (Aurora included), and all snapshot types (automated, manual, cluster, instance). It gives you:
  • ✅ Near-real-time monitoring of snapshot state (polled every 10 minutes by default)
  • ✅ Cross-region and cross-engine compatibility
  • ✅ Notifications on failure or stuck states via SNS
  • ✅ Easy deployment via AWS Cloud Development Kit (CDK)
Think of it like a health-checker for your database backups — watching silently in the background, alerting you only when something goes wrong. 🩺

💡 Why This Matters

Let’s unpack a few assumptions that many teams make:
❌ Assumption 1: "If a snapshot fails, I’ll get notified."
Not always. RDS emits events for many things, like instance status changes, maintenance windows, or successful snapshot creations. A failed snapshot, however, may not emit any event at all.
❌ Assumption 2: "Snapshots always work."
They usually do — until they don’t.
A storage-level blip?
A permission misconfiguration on cross-account snapshot sharing?
You won’t know unless you’re checking. And if the failure happens during your only backup window? Ouch. 🫣
It's worth noting that RDS snapshots are robust and failures are extremely rare. This solution is designed to give you peace of mind that you will receive a notification in the unlikely event something does go wrong.

🌍 Multi-Region & Multi-Engine

Global RDS snapshot observability, from one place with no per-region deployments. The tool doesn’t care which AWS region you’re in, or whether you’re running:
  • PostgreSQL
  • MySQL
  • Oracle
  • SQL Server
  • Aurora (PostgreSQL or MySQL-compatible)
Whether your org runs hundreds of instances in us-east-1 or just a few critical workloads in eu-west-1, this solution will monitor them all from a centralized Lambda function.
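To make the centralized fan-out concrete, here is a minimal sketch of how a single function could decide which regions to scan. It is not taken from the project: `resolve_regions` and its `selected` parameter are hypothetical names, and the only real API assumed is the EC2 `describe_regions()` call as exposed by boto3.

```python
from typing import Iterable, List, Optional


def resolve_regions(ec2_client, selected: Optional[Iterable[str]] = None) -> List[str]:
    """Return the regions to scan, sorted for stable output.

    ec2_client is any object exposing describe_regions(),
    for example boto3.client("ec2"). (Hypothetical helper.)
    """
    # Without extra arguments, describe_regions() lists the regions
    # that are enabled for the calling account.
    response = ec2_client.describe_regions()
    enabled = {region["RegionName"] for region in response["Regions"]}

    if selected is None:
        # No explicit selection: scan every enabled region.
        return sorted(enabled)

    # Explicit selection: keep only regions that are actually enabled.
    return sorted(enabled.intersection(selected))
```

With boto3 installed, each returned name could then be passed as `region_name` to `boto3.client("rds", region_name=...)`, so one Lambda function covers every region without a per-region deployment.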

🏗️ How It Works

This GitHub project deploys a lightweight, cost-efficient monitoring solution using:
  • AWS Lambda for polling snapshots
  • Amazon EventBridge Scheduler for scheduled Lambda invocation (every 10 mins by default)
  • Amazon SNS for notifications
  • CDK to deploy it
It uses describe_db_snapshots() and describe_db_cluster_snapshots() API calls across all supported (or selected) AWS regions, checking each snapshot’s status. If any snapshot is in a monitored status (e.g. failed), a notification is fired.
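The polling step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: `find_snapshots_in_status` is a hypothetical helper, and it assumes a boto3-style RDS client whose paginators for the two calls return `DBSnapshots` and `DBClusterSnapshots` lists with a `Status` field on each entry.

```python
from typing import Dict, Iterable, Iterator

# (paginator name, response list key, identifier key) for the two snapshot APIs.
_SNAPSHOT_APIS = (
    ("describe_db_snapshots", "DBSnapshots", "DBSnapshotIdentifier"),
    ("describe_db_cluster_snapshots", "DBClusterSnapshots", "DBClusterSnapshotIdentifier"),
)


def find_snapshots_in_status(
    rds_client, region: str, statuses: Iterable[str]
) -> Iterator[Dict[str, str]]:
    """Yield one record per snapshot whose status is in `statuses`.

    rds_client is any object exposing get_paginator(), for example
    boto3.client("rds", region_name=region). (Hypothetical helper.)
    """
    # Normalize once so "Failed" and " failed " both match "failed".
    wanted = {status.strip().lower() for status in statuses}

    for api_name, list_key, id_key in _SNAPSHOT_APIS:
        paginator = rds_client.get_paginator(api_name)
        # Paginate so accounts with many snapshots are fully covered.
        for page in paginator.paginate():
            for snapshot in page.get(list_key, []):
                status = snapshot.get("Status", "").lower()
                if status in wanted:
                    yield {
                        "region": region,
                        "id": snapshot[id_key],
                        "status": status,
                        "api": api_name,
                    }
```

Calling this once per region and concatenating the results gives a single list of findings covering both instance and cluster (Aurora) snapshots.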

🔔 Notifications That Actually Matter

Tired of noisy monitoring that sends alerts for things you don’t care about? This solution notifies you only when something is in one of the statuses you have selected to monitor.
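A quiet-by-default notifier might look something like the sketch below: it publishes to SNS only when at least one snapshot matched a monitored status, and stays silent otherwise. `build_alert` and `notify` are hypothetical names and the message layout is invented for illustration; the only real API assumed is the boto3-style SNS `publish(TopicArn=..., Subject=..., Message=...)` call.

```python
from typing import Dict, List, Optional


def build_alert(findings: List[Dict[str, str]]) -> Optional[Dict[str, str]]:
    """Turn findings into SNS publish arguments, or None if there is nothing to report."""
    if not findings:
        return None
    lines = [f"{f['region']}  {f['id']}  status={f['status']}" for f in findings]
    subject = f"RDS snapshot alert: {len(findings)} snapshot(s) need attention"
    return {
        # SNS email subjects are limited to 100 characters.
        "Subject": subject[:100],
        "Message": "\n".join(lines),
    }


def notify(sns_client, topic_arn: str, findings: List[Dict[str, str]]) -> bool:
    """Publish an alert only when there are findings; return True if one was sent.

    sns_client is any object exposing publish(), for example boto3.client("sns").
    """
    alert = build_alert(findings)
    if alert is None:
        # Nothing in a monitored status: send nothing, keep the inbox quiet.
        return False
    sns_client.publish(TopicArn=topic_arn, **alert)
    return True
```

Batching all findings into one message per run, rather than one message per snapshot, is a design choice made here to keep alert volume low; the project may format its report differently.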

🛠️ Simple Deployment

Just clone the repo, install dependencies and run CDK deploy:

Requirements

Usage

  1. Install dependencies:
go mod download
  2. Deploy the stack:
cdk deploy -c notification_email=<email address to receive snapshot summary report>
Or to customize the deployment, use CDK context parameters:
cdk deploy -c notification_email=your-email@example.com -c schedule_expression="rate(30 minutes)" -c status_to_monitor=failed,available
Or set the same context values in the cdk.json file.
For full instructions, check it out here 👉 https://github.com/awslabs/snapshot-monitor-for-amazon-rds

💬 Final Thoughts

Your snapshots are your last line of defense. They’re your parachute.
But even the best parachute is useless if it doesn’t open — and worse if you don’t know it didn’t open until it’s too late.💀
This solution is ready to plug into your backup strategy today.

🌟 Let’s Collaborate!

Got ideas for improvements? The solution is open-source — your contributions are welcome! 🧑‍💻
Fork it. Star it 👉 https://github.com/awslabs/snapshot-monitor-for-amazon-rds
 

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.
