
Deploy DeepSeek R1 on AWS Bedrock
Deploy DeepSeek R1 to Amazon Bedrock: A step-by-step guide to running this powerful language model on AWS.
Published Jan 29, 2025
Last Modified Jan 30, 2025
You've probably heard the buzz about the DeepSeek R1 model – and for good reason! It's been making waves in the AI community for its impressive performance, particularly considering its relatively compact size. This distilled variant, built by distilling DeepSeek R1's capabilities into a LLaMA 8B base model, packs a serious punch, making it a prime candidate for a variety of applications.
But how do you actually get your hands on this powerful tool? Today, we'll walk you through the process of deploying the DeepSeek R1 Distilled LLaMA 8B model to Amazon Bedrock, from local setup to testing. We’ll be using a macOS environment, but the steps are easily adaptable to other operating systems. This is your complete guide to getting up and running with DeepSeek R1 on AWS.
Why is DeepSeek R1 in the spotlight?
DeepSeek R1 isn't just another language model; it's designed to be efficient. By distilling the much larger DeepSeek R1 model into a LLaMA 8B base, this variant retains much of R1's performance while significantly reducing size. This means faster inference times and lower resource requirements, making it ideal for deployment on platforms like Amazon Bedrock. Its balance of size and capability makes it a hot topic right now, as businesses and developers seek powerful AI without breaking the bank.
Before we dive in, make sure you have these prerequisites covered:
- An AWS Account: You'll need an active AWS account with appropriate permissions to access Amazon Bedrock.
- AWS CLI: The AWS Command Line Interface (CLI) should be installed and configured with a profile that has the necessary permissions.
- Homebrew (macOS): If you’re on macOS, we’ll use Homebrew for package management. If you’re on Linux, adapt with your distro’s package manager (e.g., apt, yum).
- Storage Space: Make sure you have approximately 30GB of free storage for the model files.
Let's break down the process into easy-to-follow steps.
Large models, large files. Git LFS helps us manage these efficiently. Let’s install it:
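```bash
# macOS via Homebrew; on Linux, use your distro's package manager instead
brew install git-lfs

# Set up the Git LFS hooks for your user account
git lfs install
```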
Time to get the model files. We'll clone them from Hugging Face:
git clone https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B
Set up your AWS environment variables to interact with your account:
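```bash
export AWS_DEFAULT_REGION=us-east-1
export AWS_PROFILE=your-profile-name
```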
Replace us-east-1 with your desired region and your-profile-name with your AWS profile.
We need a place to store our model files on AWS. Let’s create an S3 bucket:
aws s3 mb s3://bedrock-deepseek-models-{account-id}-us-east-1
Remember to replace {account-id} with your actual AWS account ID and match the region you set earlier!
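If you don't have your account ID handy, you can look it up with the AWS CLI:
```bash
aws sts get-caller-identity --query Account --output text
```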
Now, let's move the model files into your newly created S3 bucket:
aws s3 sync --exclude '.git*' DeepSeek-R1-Distill-Llama-8B s3://bedrock-deepseek-models-{account-id}-us-east-1/DeepSeek-R1-Distill-Llama-8B/
This command copies all the model files while skipping Git metadata (anything matching .git*).
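Before moving on, it's worth confirming the upload completed; listing the prefix should show the model files and a total size of roughly 30GB:
```bash
aws s3 ls s3://bedrock-deepseek-models-{account-id}-us-east-1/DeepSeek-R1-Distill-Llama-8B/ --human-readable --summarize
```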
Now the exciting part – deploying it in Bedrock!
- Head to the Console: Navigate to the Amazon Bedrock console within the AWS Management Console and select "Imported models".
- Start Import: Click on "Import model" to begin the import process.
- Model Details: Configure as follows:
- Model name: deepseek-r1-8B (you can choose a different name).
- Tags: Optional.
- VPC settings: Optional.
- Import Job Configuration: This is where we connect to S3:
- Import job name: import_deepseek-r1-8B (or any name that helps you).
- Model import source: Select "Amazon S3 bucket."
- S3 location: Enter the S3 path:
s3://bedrock-deepseek-models-{account-id}-us-east-1/DeepSeek-R1-Distill-Llama-8B/
- Service Access Configuration:
- Choose "Create and use a new service role".
- Accept the auto-generated service role name.
- Use default encryption settings.
- Import: Finally, click "Import model" to start the import job.
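The import can take a while for a ~30GB model. If you'd rather watch it from the terminal than refresh the console, recent AWS CLI versions expose the model import APIs; a quick sketch:
```bash
# Check the status of recent import jobs (InProgress, Completed, or Failed)
aws bedrock list-model-import-jobs --region us-east-1

# Once the job completes, the model (and its ARN) appears here
aws bedrock list-imported-models --region us-east-1
```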


Let's make sure everything's working. Save the following code as test_model.py and customize it with your values.
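Here's a minimal sketch using boto3 (the prompt and the max_gen_len value below are just illustrative defaults – swap in your own):
```python
import json

import boto3

# Must match the region where you imported the model
REGION_NAME = "us-east-1"

# Replace the placeholders with your account ID and the Model ID from the console
MODEL_ARN = "arn:aws:bedrock:us-east-1:{account-id}:imported-model/{model-id}"

# Imported models are invoked through the Bedrock runtime API
client = boto3.client("bedrock-runtime", region_name=REGION_NAME)

response = client.invoke_model(
    modelId=MODEL_ARN,
    body=json.dumps(
        {
            "prompt": "Why is the sky blue? What would happen without an atmosphere?",
            "max_gen_len": 512,  # cap on the number of generated tokens
        }
    ),
    contentType="application/json",
    accept="application/json",
)

# The body comes back as a stream; read and decode it before printing
body = json.loads(response["body"].read())
print(json.dumps({"ResponseMetadata": response["ResponseMetadata"], "body": body}, indent=2))
```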
Important:
- Replace {account-id} with your AWS account ID.
- Replace {model-id} with the Model ID provided in the Imported models section of the Amazon Bedrock console. The MODEL_ARN will look similar to arn:aws:bedrock:us-east-1:529088295222:imported-model/h8t50Wwcd0ux.
- Make sure REGION_NAME matches your chosen region.
- max_gen_len specifies the maximum number of tokens to use in the generated response. The model truncates the response once the generated text exceeds max_gen_len.
Run the script:
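```bash
python test_model.py
```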
If all goes well, you should see a JSON response in your console, similar to the one below. This means your model is deployed and responding to requests!
Here's an example of what you might see after running the Python script successfully:
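(Abridged, with illustrative values – your latency, token counts, and generated text will differ.)
```json
{
  "ResponseMetadata": {
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "x-amzn-bedrock-invocation-latency": "9412",
      "x-amzn-bedrock-output-token-count": "512",
      "x-amzn-bedrock-input-token-count": "17"
    }
  },
  "body": {
    "generation": "The sky appears blue because of Rayleigh scattering: shorter (blue) wavelengths of sunlight scatter more strongly off air molecules than longer (red) ones. Without an atmosphere, there would be nothing to scatter the light, and the sky would look black even in daytime...",
    "stop_reason": "length"
  }
}
```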
What to look for in the response:
- HTTPStatusCode: 200: This indicates that your request was successful.
- x-amzn-bedrock-invocation-latency: This shows how long it took for the model to generate a response (in milliseconds).
- x-amzn-bedrock-output-token-count: The number of tokens in the generated output. The model stops generating once it reaches the maximum output token limit.
- x-amzn-bedrock-input-token-count: The number of tokens in the input prompt.
- body.generation: This field contains the actual text generated by the model, answering your question. In this case, the DeepSeek R1 model has provided a thorough explanation of why the sky appears blue and what might happen without the atmosphere. The stop reason "length" indicates that the output was truncated because it reached the max_gen_len limit.
Here's a breakdown of the estimated costs. Keep in mind these are rough estimates based on the us-east-1 region, and your actual costs can vary; a quick sanity check of the math follows the lists below.
- S3 Storage: Roughly ~$0.69 per month at $0.023/GB.
- Bedrock Model Storage: Approximately ~$2.19/month at $0.0001 per GB-hour.
- Base Rate: Around $0.25 per hour for a ml.g4dn.xlarge instance.
- Per-Request:
- Estimated cost per second: ~$0.000069
- Example 2-second inference: ~$0.00014 per request
- Example: 10,000 requests/month with 2-second inference: ~$1.40/month
- One-time import process charge.
- Data transfer fees (if applicable).
- AWS CloudWatch monitoring, if used.
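The arithmetic behind these estimates is easy to reproduce. A quick sketch using the rates quoted above (these rates are this post's assumptions, not live AWS pricing):
```python
# Rough cost sanity check for a ~30GB model in us-east-1, using the rates above
model_gb = 30
s3_rate = 0.023           # $ per GB-month (S3 Standard)
bedrock_rate = 0.0001     # $ per GB-hour (Bedrock model storage)
hours_per_month = 730
instance_rate = 0.25      # $ per hour (ml.g4dn.xlarge base rate)

print(f"S3 storage:      ${model_gb * s3_rate:.2f}/month")                         # ~$0.69
print(f"Bedrock storage: ${model_gb * bedrock_rate * hours_per_month:.2f}/month")  # ~$2.19

per_request = (instance_rate / 3600) * 2  # 2-second inference
print(f"Per request:     ${per_request:.5f}")                # ~$0.00014
print(f"10k requests:    ${per_request * 10_000:.2f}/month") # ~$1.39, i.e. roughly $1.40
```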
For the most up-to-date pricing, see the official AWS Bedrock pricing page.
- Security: Always ensure you are using proper AWS credentials and permissions to avoid any unauthorized access.
- Model Size: Keep in mind the model is around 30GB in size, which has implications for storage and bandwidth.
- Region: Bedrock availability may vary. Check that it's available in your chosen region.
- Monitoring: Use AWS CloudWatch to track your model's performance and costs, and to identify potential issues.
- Permissions: Double-check your AWS credentials and ensure they have the necessary Bedrock permissions.
- Git LFS: Make sure Git LFS is initialized and configured correctly.
- S3 Naming: Verify that your S3 bucket naming adheres to AWS conventions.
- CloudWatch Logs: If you encounter errors, examine CloudWatch logs for more detailed information.
Now that you've deployed your DeepSeek R1 model, what's next?
- Bedrock Playground: Test the model further through the Bedrock console's playground.
- Permissions: Set up fine-grained model access permissions as needed.
- Application Integration: Integrate this model into your applications to leverage its power.
- Monitoring: Continually monitor performance and resource usage to optimize.
We hope this guide was useful in deploying DeepSeek R1 on Amazon Bedrock. Happy building!