Deep dive into the Amazon Bedrock security architecture
Behind the scenes of Amazon Bedrock
Published Jul 9, 2024
You can choose FMs from Amazon, AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, and Stability AI to find the right FM for your use case. You also have the option to privately customize FMs with your organization's data, without managing any infrastructure.
At re:Inforce 2024, Raj Pathak presented a deep dive into security in Amazon Bedrock. In this blog post, let's see how Amazon Bedrock manages security. (slides)
Amazon Bedrock is an AWS service that comes with all the "standard" AWS security controls: IAM, monitoring (Amazon CloudWatch) and logging (AWS CloudTrail), AWS PrivateLink, compliance, and AWS data privacy and encryption.
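As a minimal sketch of that monitoring and logging integration, here's how model invocation logging could be enabled with boto3 (the region, log group name, and IAM role ARN are placeholders I made up for illustration):

```python
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

# Enable model invocation logging to CloudWatch Logs.
# Log group name and role ARN below are illustrative placeholders.
bedrock.put_model_invocation_logging_configuration(
    loggingConfig={
        "cloudWatchConfig": {
            "logGroupName": "/bedrock/invocation-logs",
            "roleArn": "arn:aws:iam::123456789012:role/BedrockLoggingRole",
        },
        # Deliver the text of prompts and completions to the logs
        "textDataDeliveryEnabled": True,
    }
)
```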
None of the customer's data is used to train the underlying foundation models (source); customized models are accessible only to the customer, and the data stays private.
All data is encrypted at rest using AWS KMS and in transit with TLS 1.2 at a minimum.
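For the data you bring yourself, such as training data in S3, you can enforce that encryption at rest with a customer-managed KMS key. A minimal sketch, assuming a hypothetical bucket and key:

```python
import boto3

s3 = boto3.client("s3")

# Enforce SSE-KMS default encryption on the bucket holding training data.
# Bucket name and KMS key ARN are placeholders.
s3.put_bucket_encryption(
    Bucket="my-bedrock-training-data",
    ServerSideEncryptionConfiguration={
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": "arn:aws:kms:us-east-1:123456789012:key/your-key-id",
                }
            }
        ]
    },
)
```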
There are two compute capacity models on Bedrock:
On-demand: available to all customers, with pricing more accessible for small usage
Provisioned capacity: dedicated to a single customer (for very large consumption)
Both compute models share the same guarantees: no inference request is used to train any model, model vendors have no access to any customer data, and deployments run inside an AWS account owned and operated by the Amazon Bedrock service team.
Let's go into more detail about what happens during a request to Amazon Bedrock.
The path depends on the origin, but the final target is always the Amazon Bedrock API endpoint.
The on-demand compute architecture breaks down as follows:
Incoming requests from the console, SDKs, and APIs go to the Amazon Bedrock service API endpoint (which integrates with AWS CloudTrail, IAM, and Amazon CloudWatch); the runtime that executes the inference (custom? SageMaker?) then fetches the base model from S3 in a separate, dedicated AWS account.
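From the caller's side, this whole path is a single API call against the bedrock-runtime endpoint. A minimal sketch with boto3; the model ID is just one example of an on-demand model:

```python
import json
import boto3

# bedrock-runtime is the data-plane endpoint that receives inference requests;
# each call is authorized by IAM and recorded in CloudTrail.
runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

response = runtime.invoke_model(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 256,
        "messages": [{"role": "user", "content": "Hello"}],
    }),
)
print(json.loads(response["body"].read()))
```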
In Bedrock's provisioned capacity mode, it's a little bit different.
With provisioned capacity, the inference runtime now "passes a specific model ID or ARN" to the provisioned compute, which is backed by two S3 buckets (base model and fine-tuned model, depending on the case).
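A sketch of what that looks like from the API side; this is illustrative only (names are placeholders, and depending on the model a commitment duration may be required):

```python
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

# Purchase dedicated model units; the returned ARN identifies the
# provisioned compute. Name and model ID are placeholders.
pt = bedrock.create_provisioned_model_throughput(
    provisionedModelName="my-provisioned-model",
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    modelUnits=1,
)

# At inference time, the provisioned model ARN is passed as the modelId,
# which routes the request to the dedicated capacity.
runtime = boto3.client("bedrock-runtime", region_name="us-east-1")
# runtime.invoke_model(modelId=pt["provisionedModelArn"], body=...)
```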
The fine-tuned model architecture is quite specific.
A training orchestration layer in the Amazon Bedrock service account manages a SageMaker training job in the model deployment account, which accesses the data from the customer account (via a dedicated ENI); the S3 training data stays in the customer account, and access to that account is read-only.
Once trained, the model is published to the model deployment account via a dedicated S3 bucket (the fine-tuned model S3 bucket). Bedrock is fully managed, so we don't specify the number of servers or the compute capacity we need.
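A hedged sketch of how such a fine-tuning job can be started with boto3; all names, ARNs, the base model ID, and the hyperparameters below are assumptions for illustration. The vpcConfig is what gives the training job the dedicated ENI into your network, and customModelKmsKeyId encrypts the resulting custom model:

```python
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

bedrock.create_model_customization_job(
    jobName="my-fine-tuning-job",                      # placeholder
    customModelName="my-fine-tuned-model",             # placeholder
    roleArn="arn:aws:iam::123456789012:role/BedrockCustomizationRole",
    baseModelIdentifier="amazon.titan-text-lite-v1",   # example base model
    trainingDataConfig={"s3Uri": "s3://my-bedrock-training-data/train.jsonl"},
    outputDataConfig={"s3Uri": "s3://my-bedrock-training-data/output/"},
    # Route training traffic through your VPC (the dedicated ENI above)
    vpcConfig={
        "subnetIds": ["subnet-0123456789abcdef0"],
        "securityGroupIds": ["sg-0123456789abcdef0"],
    },
    # Encrypt the resulting custom model with a customer-managed KMS key
    customModelKmsKeyId="arn:aws:kms:us-east-1:123456789012:key/your-key-id",
    hyperParameters={"epochCount": "1"},               # illustrative value
)
```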
When you "speak" with a GenAI you can have undesirable, irrelevant topics also harmful or offensive responses, you need also to protect user information or sensitive data such as personally identifiable information (PII).
On Amazon Bedrock you can implement Guardrails: safeguards that you can customize to your requirements and responsible AI policies.
In detail, the guardrail steps in after the user's prompt input and, if no problematic content is found there, again on the foundation model's output, to produce a proper answer.
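As a minimal sketch of both steps, here's how a guardrail could be created and then attached at inference time so it checks the prompt and the model output (names, messages, and policy choices are placeholders):

```python
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

# Create a guardrail that filters harmful content and anonymizes PII.
guardrail = bedrock.create_guardrail(
    name="my-guardrail",
    blockedInputMessaging="Sorry, I can't help with that.",
    blockedOutputsMessaging="Sorry, I can't answer that.",
    contentPolicyConfig={
        "filtersConfig": [
            {"type": "HATE", "inputStrength": "HIGH", "outputStrength": "HIGH"},
        ]
    },
    sensitiveInformationPolicyConfig={
        "piiEntitiesConfig": [
            {"type": "EMAIL", "action": "ANONYMIZE"},
        ]
    },
)

# Attach the guardrail at inference time; it is applied to the user input
# and, if that passes, to the foundation model's output.
runtime = boto3.client("bedrock-runtime", region_name="us-east-1")
# runtime.invoke_model(
#     modelId="anthropic.claude-3-sonnet-20240229-v1:0",
#     guardrailIdentifier=guardrail["guardrailId"],
#     guardrailVersion="DRAFT",
#     body=...,
# )
```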