Scale across borders: build a multi-region architecture while maintaining data residency

In this post, we cover a high-level reference architecture to illustrate how you can deploy a multi-region architecture while maintaining data residency. This architecture is suitable for scaling startups, businesses operating in regulated industries, and those who are building the foundation for a global business.

Daniel Wirjo
Amazon Employee
Published Sep 8, 2023
Last Modified Mar 14, 2024

Overview

In a world where data security and privacy requirements are becoming increasingly stringent, businesses face the challenge of expanding globally while maintaining compliance with data residency requirements. In this post, we cover a high-level reference architecture to illustrate how you can deploy a multi-region architecture while maintaining data residency. We provide an accompanying code sample using the AWS Cloud Development Kit (CDK), as well as considerations and best practices to assist with your implementation.
This architecture is suitable for scaling startups and businesses operating in regulated industries such as Healthcare and Life Sciences (HCLS) and Financial Services (FinTech). It also suits those who are building the foundation for a global business, or seeking to scale from a single-region architecture.

AWS reference architecture for multi-region with data residency

The example high-level architecture covers a full-stack web application. It uses a silo model with isolated infrastructure stacks for each region. With this architecture, businesses can securely handle sensitive data like Personally Identifiable Information (PII) or Protected Health Information (PHI) while adhering to regional compliance standards.
[Figure: Multi-Region Reference Architecture]
Let’s walk through the components of the high-level architecture:
  1. The user connects to the application hosted on AWS Amplify. Amazon CloudFront provides global edge caching to minimize end-user latency.
  2. Amazon Cognito is used for authentication, including login and sign-up. It is a regional service and can be deployed to each region. The application can integrate with Cognito using the Amplify UI Authenticator.
  3. Amazon DynamoDB Global Tables store each user’s data residency (the region the user is allocated to) and replicate it across regions. Amazon Cognito Lambda triggers (pre-authentication and pre-sign-up) use this data to ensure that the user is allocated to the appropriate region.
  4. Amazon Route 53 Geolocation Routing (alternatively, Latency Routing) provides a global API endpoint that routes based on the user’s geolocation, with failover capability.
  5. An Amazon API Gateway Regional Endpoint (alternatively, an Application Load Balancer) provides an endpoint for each region. See #8 for more details.
  6. AWS Lambda (or an alternative compute service such as Amazon ECS with Fargate) provides the backend for the API.
  7. Storage and databases (such as Amazon Simple Storage Service (S3), Amazon Relational Database Service (RDS), and Amazon DynamoDB) are used to store sensitive data. These are isolated to each region.
  8. Optionally, users can access the regional API endpoint directly to reach their desired region, bypassing the default. For additional security, consider Amazon CloudFront and AWS WAF.
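Steps 2 and 3 can be illustrated with a minimal sketch of the residency check that a pre-authentication or pre-sign-up trigger might perform. This is not the code from the accompanying sample: the `RESIDENCY_TABLE` dictionary is a hypothetical stand-in for the DynamoDB Global Table, the function and exception names are made up, and the choice to let unknown users through is an assumption. The event fields used (`region` and `request.userAttributes`) follow the general shape of a Cognito Lambda trigger event.

```python
# Hypothetical stand-in for the DynamoDB Global Table that records each
# user's data residency. A real trigger would query the table instead.
RESIDENCY_TABLE = {
    "alice@example.com": "eu-west-1",
    "bob@example.com": "ap-southeast-2",
}


class WrongRegionError(Exception):
    """Raised to reject a sign-in or sign-up attempted in the wrong region."""


def check_residency(event, lookup=RESIDENCY_TABLE.get):
    """Allow the request only if the user belongs to the current region.

    Returning the event lets the Cognito flow continue; raising an
    exception rejects the attempt.
    """
    email = event["request"]["userAttributes"]["email"]
    current_region = event["region"]
    home_region = lookup(email)

    # Assumption: an unknown user is allowed through so that sign-up can
    # allocate them to the region they are currently using.
    if home_region is None:
        return event

    if home_region != current_region:
        raise WrongRegionError(
            f"User is allocated to {home_region}, not {current_region}."
        )
    return event
```

Because the residency record is replicated to every region, each regional user pool can run the same check locally without calling another region.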
For more details on the implementation, see the code sample for the demo application.
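To make the geolocation routing in step 4 more concrete, the sketch below mimics how the most specific matching location record wins, with a default record as the fallback. It is a simplified model for illustration, not Route 53 itself: the record keys, endpoint hostnames, and function name are hypothetical, and US state-level records and health-check failover are left out.

```python
# Hypothetical geolocation records: (record type, location code) -> endpoint.
RECORDS = {
    ("country", "AU"): "api.ap-southeast-2.example.com",
    ("continent", "EU"): "api.eu-west-1.example.com",
    ("default", "*"): "api.us-east-1.example.com",
}


def resolve_endpoint(country, continent, records=RECORDS):
    """Return the endpoint for the most specific matching location.

    Preference order in this sketch: country, then continent, then default.
    """
    candidates = (
        ("country", country),
        ("continent", continent),
        ("default", "*"),
    )
    for key in candidates:
        if key in records:
            return records[key]
    raise LookupError("No matching record and no default record configured.")
```

Routing only selects a default region for the request. In this architecture, the allocation of a user to a region is still decided by the residency data checked at sign-in.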

Preparing for a multi-region architecture

Adopting multi-region is a significant undertaking. Consider the following before taking the plunge:

Be wary of added cost and complexity

Adopting a multi-region architecture can bring additional cost and complexity across your application design and operations. As such, we typically advise startups to challenge and dive deep on the necessity for a multi-region architecture, including understanding the specific compliance requirements and key drivers. Expanding your business globally does not necessarily require a multi-region architecture.

Deep dive into regulatory and compliance requirements

Compliance is a shared responsibility between AWS and the customer (you). On the AWS side, we publish our compliance reports on AWS Artifact, and AWS Compliance Center lets you research cloud-related regulatory requirements and how they impact your industry. On the customer side, ensure that you are aware of your responsibilities. To assist with this, we publish guidance such as navigating GDPR compliance. At the time of writing, compliance with GDPR does not necessarily mandate data residency. In fact, many regulations are principles-based and do not mandate specific requirements. If data residency is not strictly required, then implementing general security controls may be a higher priority, as they mitigate more important risks. Here, consider working towards compliance with a recognised international security standard (such as ISO/IEC 27001, SOC 2, or NIST SP 800-53), which provides guidance on security for your overall organization. For your compliance journey, we provide AWS-native tools such as AWS Config and AWS Audit Manager, as well as partner solutions such as Drata. In addition, AWS Marketplace has a wealth of security and data protection solutions, such as DataMasque and Skyflow.

Consider a simpler architecture

To achieve the best performance and user experience for customers in the new region, you may think that a multi-region architecture is required. However, let’s challenge this assumption. Consider starting with a simplified architecture, such as introducing Amazon CloudFront to a single-region architecture. CloudFront has global points of presence that reduce latency for your global users. Similarly, for availability and Disaster Recovery (DR), we recommend first considering a multi-AZ architecture. At AWS, an Availability Zone (AZ) is an isolated location within a region, built with redundancy. The risk of an outage affecting multiple Availability Zones at the same time is very low.

Automate, automate, automate

If you have not adopted infrastructure as code and automated your deployment process, consider implementing this first. In the reference architecture, we have used the AWS Cloud Development Kit (CDK), which allows you to define your infrastructure with familiar programming languages such as TypeScript or Python.

Considerations and best-practices when adopting a multi-region architecture

If a multi-region architecture is required, consider the following factors:

Think big, but start small

The fastest path to get your application into a new region is to replicate your infrastructure stack to the new region. Starting with this approach allows you to get your product to market and into the hands of users in your target region. You can then iterate on your product, obtain feedback from customers, and localize your product offering for the new region.
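As a rough illustration of replicating one stack definition across regions, the sketch below derives a set of isolated resource names for each region from a single template. It uses plain Python rather than the CDK so that it stays self-contained. The `RegionalStack` type, `build_stacks` function, and naming scheme are all hypothetical. With the CDK, the equivalent idea would be to instantiate the same stack class once per region with a different target environment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionalStack:
    """Hypothetical description of one region's isolated (silo) stack."""
    region: str
    stack_name: str
    bucket_name: str
    table_name: str


def build_stacks(app_name, regions):
    """Create one stack description per region from the same definition."""
    return [
        RegionalStack(
            region=region,
            stack_name=f"{app_name}-{region}",
            bucket_name=f"{app_name}-data-{region}",
            table_name=f"{app_name}-records-{region}",
        )
        for region in regions
    ]


if __name__ == "__main__":
    # Adding a region is a one-line change to this list.
    for stack in build_stacks("demo", ["us-east-1", "eu-west-1"]):
        print(stack)
```

Keeping the region list as the only thing that varies means a new region does not require a new definition of the infrastructure.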

Support local requirements through global customization

If you are required to release updates specific to a local region, consider introducing each update as a configurable feature. This keeps application source code and infrastructure consistent across regions. To implement this, consider feature flags with AWS AppConfig, which also facilitates fast iteration through trunk-based development.
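The idea of a single codebase with region-specific behavior can be sketched as a flag lookup with per-region overrides. The flag names and the document layout below are hypothetical and are not the AWS AppConfig feature flag schema. In practice, the configuration would be retrieved from AppConfig rather than hard-coded.

```python
# Hypothetical flag configuration: a global default plus regional overrides.
FLAGS = {
    "cookie_consent_banner": {
        "default": False,
        "regions": {"eu-west-1": True},
    },
    "new_checkout": {
        "default": True,
        "regions": {},
    },
}


def is_enabled(flag, region, flags=FLAGS):
    """Return the flag value for a region, falling back to the default.

    Unknown flags are treated as disabled.
    """
    config = flags.get(flag)
    if config is None:
        return False
    return config["regions"].get(region, config["default"])
```

The same application build can then run in every region, with a call such as `is_enabled("cookie_consent_banner", region)` deciding whether the local behavior is switched on.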

Build foundation for efficiency using SaaS design principles

As you grow, it is important to review software-as-a-service (SaaS) design principles. The example reference architecture draws upon some of these concepts, such as the tenant. In a SaaS application, a tenant typically corresponds to a customer, and data is partitioned accordingly. For example, Customer 1 cannot see data for Customer 2. The same principle can be applied to this architecture, where each region is treated as a tenant and Region 1 is isolated from Region 2. As each tenant uses its own separate infrastructure, the architecture uses a silo isolation model. For cost efficiency, infrastructure resources can be pooled over time.

Bind user identity to tenant identity

It is highly likely that every layer of the application will need to be aware of this tenant context. The most efficient approach is to introduce the context through the identity layer. In the reference architecture, we use a custom:region user attribute, which is passed to the application as a custom claim in JSON Web Tokens (JWTs). As the application is expanded to multiple services, each service can simply use the token to gain tenant awareness. Without relying on another service, each service can verify and decode the token to determine the context, apply the appropriate isolation logic, connect to the relevant data source, and pass data to monitoring and logging tools. This logic can be abstracted away from developers for development efficiency and simplicity.
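A minimal sketch of how a service might turn token claims into tenant context is shown below. It assumes the JWT signature has already been verified by a JWT library and that the decoded claims are available as a dictionary. That verification step is deliberately not shown and must not be skipped. The `DATA_SOURCES` mapping, the `TenantContext` type, and the table names are hypothetical. Only the `custom:region` claim comes from the reference architecture.

```python
from dataclasses import dataclass

# Hypothetical mapping from region to that region's isolated data source.
DATA_SOURCES = {
    "eu-west-1": "records-eu-west-1",
    "ap-southeast-2": "records-ap-southeast-2",
}


@dataclass(frozen=True)
class TenantContext:
    user_id: str
    region: str
    table_name: str

    def log_fields(self):
        """Fields to attach to log lines and metrics for this request."""
        return {"user_id": self.user_id, "region": self.region}


def tenant_context_from_claims(claims, data_sources=DATA_SOURCES):
    """Build tenant context from claims that were ALREADY verified.

    Raises PermissionError if the region claim is missing or unknown.
    """
    region = claims.get("custom:region")
    if region not in data_sources:
        raise PermissionError(f"No data source for region {region!r}.")
    return TenantContext(
        user_id=claims["sub"],
        region=region,
        table_name=data_sources[region],
    )
```

Wrapping this in shared middleware is one way to keep the isolation logic out of individual request handlers.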

Implement additional security controls as you grow

As your team grows, there can be more room for mistakes. Consider layering security controls over time to improve protection of sensitive data. For example, you can consider data residency controls using AWS Control Tower, performing a Well-Architected: Security review for a holistic assessment, and evolving your security capabilities at key growth stages.
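As one example of a data residency control, access can be denied to any region outside an approved list. The sketch below builds a simplified policy document of that kind using the `aws:RequestedRegion` condition key. It is an illustration of the general pattern only, not the exact policy that AWS Control Tower applies. The function name, statement ID, and the list of exempted global services are assumptions that you would need to review for your own workloads.

```python
import json


def region_deny_policy(
    allowed_regions,
    exempt_actions=("iam:*", "sts:*", "route53:*", "cloudfront:*", "support:*"),
):
    """Build a policy that denies requests outside the allowed regions.

    Actions listed in exempt_actions are excluded from the deny, because
    some services operate globally rather than in a single region.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideAllowedRegions",
                "Effect": "Deny",
                "NotAction": list(exempt_actions),
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": list(allowed_regions)
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(region_deny_policy(["eu-west-1"]), indent=2))
```

A guardrail like this complements the application-level isolation: even if a team member makes a mistake, resources cannot be created in a region that has not been approved.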

Conclusion

In this post, we explored the significance of data residency, particularly in regulated industries such as healthcare, life sciences and financial services, where protecting sensitive customer data is paramount. We covered a high-level reference architecture which allows you to establish a solid foundation for your global business, enabling expansion to new regions while maintaining compliance and safeguarding sensitive data. If you would like to learn more, we encourage you to explore the accompanying code sample and demo application, and to dive deeper into the resources linked throughout the post.
If you are plotting world domination or looking to expand to a new region, feel free to watch our AWS On-Air episode on the topic or contact your AWS account team to learn more about the programs we offer and alternative multi-region architecture patterns.
 

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.
