Deep dive into Amazon Bedrock security architecture

Behind the scenes of Amazon Bedrock

Published Jul 9, 2024
When you hear GenAI on AWS with foundation models (FMs), which service comes to mind?
Amazon Bedrock, of course!
Amazon Bedrock is the easiest way on AWS to build and scale GenAI applications with foundation models (FMs).
You can choose FMs from Amazon, AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, and Stability AI to find the right one for your use case. You can also privately customize FMs with your organization's data without managing any infrastructure.

But how does Amazon Bedrock manage security?

At re:Inforce 2024, Raj Pathak presented a deep dive into security in Amazon Bedrock. In this blog post, let's walk through how Amazon Bedrock manages security. (slides)
Amazon Bedrock is an AWS service, so it comes with all the AWS security "standards": IAM, monitoring (Amazon CloudWatch) and logging (AWS CloudTrail), AWS PrivateLink, compliance, and AWS data privacy and encryption.
None of the customer's data is used to train the underlying foundation models (source). Customized models are accessible only to the customer who created them, and the data stays private.
All data is encrypted at rest using AWS KMS and encrypted in transit with TLS 1.2 at a minimum.
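To make the "TLS 1.2 minimum" point concrete, here is a minimal sketch of a client that refuses anything older than TLS 1.2 on its own side. It uses only the Python standard library; the function name make_tls12_context is ours for illustration, and the AWS SDKs already negotiate TLS for you, so this is an illustration of the idea rather than something Bedrock requires you to write.

```python
import ssl


def make_tls12_context() -> ssl.SSLContext:
    """Build a client-side TLS context that rejects protocol versions below TLS 1.2."""
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Explicitly pin the floor to TLS 1.2 (newer versions remain allowed).
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_tls12_context()
```

A context like this can be handed to any standard-library HTTPS client; the handshake then fails if the other side only offers an older protocol.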

Amazon Bedrock Compute Capacity

There are two compute capacity models on Bedrock:
Amazon Bedrock compute capacity
On-demand: available to all customers, with pricing better suited to small workloads
Provisioned capacity: compute available to a single customer (for very large consumption)
Both models share the same guarantees: no inference request is used to train any model, model vendors have no access to any customer data, and deployments live inside an AWS account owned and operated by the Amazon Bedrock service team.

A request to Amazon Bedrock, under the hood

Let's look in more detail at what happens during a request to Amazon Bedrock.
Connectivity to Amazon Bedrock
The path depends on where the request originates, but the destination is always the Amazon Bedrock API endpoint.
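As a rough illustration of that single destination, the sketch below builds the regional runtime endpoint and the URL of the InvokeModel operation. The helper names are ours, and the region and model ID are only examples. A real call must also be signed with SigV4, which the SDKs do for you, so treat this as a way to see what the endpoint looks like, not as a way to call it.

```python
from urllib.parse import quote


def bedrock_runtime_endpoint(region: str) -> str:
    """Regional public endpoint of the Bedrock runtime API."""
    return f"https://bedrock-runtime.{region}.amazonaws.com"


def invoke_model_url(region: str, model_id: str) -> str:
    """URL of the InvokeModel operation for a given model identifier."""
    # Percent-encode the identifier: model IDs and ARNs contain ':' and '/'.
    return f"{bedrock_runtime_endpoint(region)}/model/{quote(model_id, safe='')}/invoke"


url = invoke_model_url("us-east-1", "anthropic.claude-3-sonnet-20240229-v1:0")
```

Whether the request starts in the console, in an SDK, or behind a VPC endpoint through AWS PrivateLink, it ends up on this API.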
The on-demand compute architecture breaks down as follows:
Amazon Bedrock on-demand compute architecture overview
Incoming requests from the console, SDKs, and APIs go to the API endpoint of the Amazon Bedrock service, which integrates with AWS CloudTrail, IAM, and Amazon CloudWatch. The runtime that executes the inference (custom? SageMaker?) fetches the base model from an S3 bucket in a separate, dedicated AWS account.
In Bedrock's provisioned capacity mode, it's a little different:
Amazon Bedrock Provisioned capacity architecture overview
In provisioned capacity mode, the inference runtime now passes a specific model ID or ARN to provisioned capacity compute, which is backed by two S3 buckets (base model and fine-tuned model, depending on the case).
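From the caller's side, the difference mostly comes down to which identifier is sent as the model ID. Here is a minimal sketch, assuming a hypothetical helper build_invoke_kwargs, a placeholder payload (the real body schema depends on the model), and a made-up provisioned model ARN.

```python
import json
from typing import Optional


def build_invoke_kwargs(payload: dict, base_model_id: str,
                        provisioned_model_arn: Optional[str] = None) -> dict:
    """Build keyword arguments for an InvokeModel call.

    If a provisioned model ARN is given, it is used as the model identifier;
    otherwise the on-demand base model ID is used.
    """
    model_id = provisioned_model_arn or base_model_id
    return {
        "modelId": model_id,
        "contentType": "application/json",
        "accept": "application/json",
        "body": json.dumps(payload),  # payload shape is model-specific
    }


# Placeholder values for illustration only.
on_demand = build_invoke_kwargs({"prompt": "Hello"}, "amazon.titan-text-express-v1")
provisioned = build_invoke_kwargs(
    {"prompt": "Hello"},
    "amazon.titan-text-express-v1",
    provisioned_model_arn="arn:aws:bedrock:us-east-1:123456789012:provisioned-model/example",
)
```

With boto3 installed and credentials configured, these arguments could be passed as boto3.client("bedrock-runtime").invoke_model(**provisioned).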
The architecture for model fine-tuning is very specific:
Amazon Bedrock model fine-tuning architecture overview
A training orchestration component in the Amazon Bedrock service account manages a SageMaker training job in the model deployment account. That job reaches the customer account through a dedicated ENI and reads the training data from S3, which stays in the customer account. Access to the customer account is read-only.
Once the model is trained, it is made available in the model deployment account through a dedicated S3 bucket (the fine-tuned model S3 bucket). Bedrock is fully managed, so we don't specify the number of servers or the compute capacity we need.
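The read-only access to the training data can be pictured as an IAM policy that allows listing and reading the training bucket and nothing else. The sketch below builds such a policy document. The function name, bucket name, and prefix are hypothetical, and the role that Bedrock documents for model customization may need more than this (for example, for an output location), so check the official documentation before reusing it.

```python
import json


def read_only_training_data_policy(bucket: str, prefix: str = "") -> dict:
    """Build an IAM policy document granting read-only access to training data in S3."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Only list and read actions: no Put, no Delete.
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/{prefix}*",
                ],
            }
        ],
    }


# Hypothetical bucket and prefix, for illustration only.
policy = read_only_training_data_policy("my-training-data", "fine-tuning/")
policy_json = json.dumps(policy, indent=2)
```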

Securing model prompts and responses

When you "talk" to a GenAI application, the conversation can drift into undesirable or irrelevant topics, and you can get harmful or offensive responses. You also need to protect user information and sensitive data such as personally identifiable information (PII).
On Amazon Bedrock, you can implement Guardrails.
These are safeguards that you can customize to your requirements and responsible AI policies.
Amazon Bedrock Guardrails
In detail, the guardrail steps in right after the user input from the prompt and, if no problematic content is found there, again on the output of the foundation model, so that only a proper answer is returned.
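The two checkpoints can be sketched in a few lines of plain Python. This is a toy model of the flow, not the Guardrails API: the denied-terms check, the canned blocked message, and the function names are all assumptions made for illustration.

```python
from typing import Callable, Iterable

BLOCKED_MESSAGE = "Sorry, I can't help with that."  # hypothetical canned reply


def violates(text: str, denied_terms: Iterable[str]) -> bool:
    """Toy policy check: flag text containing any denied term (case-insensitive)."""
    lowered = text.lower()
    return any(term.lower() in lowered for term in denied_terms)


def guarded_call(prompt: str, model: Callable[[str], str],
                 denied_terms: Iterable[str]) -> str:
    """Check the prompt, call the model only if it passes, then check the answer."""
    terms = list(denied_terms)
    # Checkpoint 1: the user input. The model is never called if this fails.
    if violates(prompt, terms):
        return BLOCKED_MESSAGE
    answer = model(prompt)
    # Checkpoint 2: the model output.
    if violates(answer, terms):
        return BLOCKED_MESSAGE
    return answer
```

Note that a blocked prompt never reaches the model at all, which is the property described above: the output check only runs when the input check found nothing problematic.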