
Failover Routing for Disaster Recovery - Ensuring Your Customers Get to The Good Place
Discover effective failover routing strategies for directing users to your recovery region using AWS Route 53. Learn about the options to ensure seamless disaster recovery.
- Control plane operations are the CRUD type operations. These include updating a Route 53 DNS record, creating a new Amazon EC2 instance, or changing the throughput on an Amazon DynamoDB table.
- Data plane operations are what provide the primary function of the service, such as running EC2 instance itself, getting and putting objects in an S3 bucket, or resolving a DNS query.
Several AWS services experienced impact to the control planes that are used for creating and managing AWS resources…. Route 53 APIs were impaired from 7:30 AM PST until 2:30 PM PST preventing customers from making changes to their DNS entries, but existing DNS entries and answers to DNS queries were not impacted during this event.
https://<FAILOVER-BUCKET>.s3.<STANDBY-REGION>.amazonaws.com/initiate-failover.html
- The “Jason”: A direct approach — manually update Route 53 DNS records. It's effective but carries risks due to reliance on AWS control plane operations.
- The “Chidi”: Automatic failover with Route 53 health checks — like Chidi, it’s thorough but potentially indecisive. It removes manual intervention but requires precise health checks to avoid unnecessary failovers.
- The “Tahani”: Use Amazon Application Recovery Controller (ARC) — a sophisticated and resilient solution with a cost. ARC offers high availability and direct control over failover via data plane operations.
- The “Eleanor”: Standby takes over primary (STOP) — pragmatic, cost-effective, and human-initiated without the risks of control plane reliance, making it a DIY version of ARC.