
Deploying DeepSeek-R1 Distill Model on AWS using Amazon SageMaker JumpStart

Step-by-step guide: Deploying DeepSeek-R1-Distill-Qwen-14B on Amazon SageMaker JumpStart

Jarrett
Amazon Employee
Published Jan 28, 2025
Last Modified Feb 13, 2025
This is Part 2 of our series on how to deploy DeepSeek on AWS. This post will focus on deploying on Amazon SageMaker JumpStart. View Part 1 of our series on deploying to Amazon EC2 here.

Introduction

DeepSeek, a Chinese artificial intelligence (AI) company, has recently garnered significant attention for its innovative AI models, which rival leading Western counterparts in performance while being more cost-effective. The company's latest release, DeepSeek-R1, launched on 20 January 2025, matches the capabilities of OpenAI's o1 reasoning model across math, code, and reasoning tasks at less than 10% of the cost. Furthermore, DeepSeek-R1 is completely open source, enabling developers worldwide to access and run the model on their own systems and disrupting the LLM landscape.
Hosting DeepSeek-R1 on AWS offers scalability and flexibility, ensuring you can seamlessly leverage its powerful AI capabilities for your specific use case - whether for research, business intelligence, or development projects.
This blog post will guide you through a step-by-step process for hosting DeepSeek-R1, specifically the DeepSeek-R1-Distill-Qwen-14B model, on AWS infrastructure. You will deploy the model through Amazon SageMaker JumpStart, enabling you to harness DeepSeek-R1's AI capabilities in the cloud.

Instructions

Section 1. Raising endpoint usage service quota

In this tutorial, you will deploy the DeepSeek-R1-Distill-Qwen-14B model on the ml.g6.12xlarge instance type, which is the default instance type for this model's inference endpoint.
Because the default service quota for ml.g6.12xlarge for endpoint usage is 0, you will need to raise it. You can do this through the console steps below, or programmatically with the boto3 sketch that follows them.
In the AWS Console, search for Service Quotas
Next, search for SageMaker
Search for ml.g6.12xlarge for endpoint usage, then click on Request increase at account level
Enter 1 under Increase quota value
Finally, wait until your request is processed; you should see a success message within a few minutes.
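If you prefer to handle the quota from code, here is a minimal boto3 sketch. It assumes the quota is named "ml.g6.12xlarge for endpoint usage" under the sagemaker service code, matching what the console shows; adjust the quota name and Region if yours differ.

```python
# Minimal sketch: look up the ml.g6.12xlarge endpoint-usage quota by name and
# request an increase to 1 if it is still 0. Region and quota name are assumptions.
import boto3

quotas = boto3.client("service-quotas", region_name="us-east-1")  # use your Region

target = None
paginator = quotas.get_paginator("list_service_quotas")
for page in paginator.paginate(ServiceCode="sagemaker"):
    for quota in page["Quotas"]:
        if quota["QuotaName"] == "ml.g6.12xlarge for endpoint usage":
            target = quota
            break
    if target:
        break

if target is None:
    raise RuntimeError("Quota not found - verify the name in the Service Quotas console")

print("Current value:", target["Value"])

if target["Value"] < 1:
    response = quotas.request_service_quota_increase(
        ServiceCode="sagemaker",
        QuotaCode=target["QuotaCode"],
        DesiredValue=1.0,
    )
    print("Request status:", response["RequestedQuota"]["Status"])
```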

Section 2. Setting up SageMaker AI Domain

In this section, you will set up your SageMaker AI domain in the Region of your choice. Follow this guide for a fuss-free creation of your domain. This can take a while, so be patient!
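While you wait, you can optionally poll the domain's creation status with boto3 instead of refreshing the console. A minimal sketch, assuming you have a single domain in the Region:

```python
# Minimal sketch: poll the SageMaker domain status until it leaves the
# "Pending" state. Assumes a single domain in the Region.
import time
import boto3

sm = boto3.client("sagemaker", region_name="us-east-1")  # use your Region

domain_id = sm.list_domains()["Domains"][0]["DomainId"]

while True:
    status = sm.describe_domain(DomainId=domain_id)["Status"]
    print("Domain status:", status)
    if status != "Pending":
        break
    time.sleep(30)
```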
Click on the newly created domain
Once the domain has been successfully created, browse to your user profile: click on User Profiles, and under Launch, select Studio.
You will then be brought to the SageMaker AI Studio console.

Section 3. Deploying DeepSeek

We are finally ready to deploy DeepSeek on Amazon SageMaker JumpStart!

Setting up JupyterLab Space

Amazon SageMaker JumpStart is a machine learning (ML) hub with foundation models, built-in algorithms, and prebuilt ML solutions that you can deploy with just a few clicks.
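The console flow below deploys the model with a few clicks, but the same deployment can also be done programmatically with the SageMaker Python SDK. A minimal sketch follows, where the model_id is illustrative - copy the exact ID shown on the model card in JumpStart.

```python
# Minimal sketch of a programmatic JumpStart deployment. The model_id below is
# illustrative - use the ID shown on the JumpStart model card; some JumpStart
# models also require accept_eula=True when deploying.
from sagemaker.jumpstart.model import JumpStartModel

model = JumpStartModel(model_id="deepseek-llm-r1-distill-qwen-14b")
predictor = model.deploy()  # uses the model's default instance type (ml.g6.12xlarge)
```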
On the left side menu, select JumpStart. DeepSeek should be listed as one of the Providers.
In this guide, we will deploy the 14B model, so click on DeepSeek-R1-Distill-Qwen-14B.
Click on Open in JupyterLab
Click on Create new space, key in any name, and then click on Create space and open notebook.
Click on Open in JupyterLab again, and wait until the Space is ready.
Select your Space, then click on Open notebook once App Status says "Running".

Editing the Notebook

The default notebook requires some modifications before it can successfully deploy the model.
In the first cell, add a code block to upgrade the sagemaker library, similar to the one below.
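A typical upgrade cell looks like the following; restart the kernel if the notebook prompts you to after the install completes.

```python
# Upgrade the SageMaker Python SDK inside the notebook so it can resolve the
# latest JumpStart model artifacts.
%pip install --upgrade sagemaker
```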
Next, scroll to the bottom of the notebook and add "#" in front of "predictor.delete_predictor()" to comment it out, so that the endpoint is not deleted immediately after deployment.
Finally, under Run, click on Run All Cells. Once all the cells have finished running, the endpoint is deployed and ready to serve requests; a test invocation sketch follows.
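From a new cell, you can then send the endpoint your own prompt. A minimal sketch, assuming the TGI-style request schema that JumpStart text-generation endpoints commonly use (the example payload cells in your notebook are the authoritative reference):

```python
# Send a test prompt to the deployed endpoint. The `predictor` object comes from
# the notebook's deployment cell; the payload schema below is an assumption based
# on the TGI-style format JumpStart text-generation endpoints commonly accept.
payload = {
    "inputs": "What is 7 * 8? Think step by step.",
    "parameters": {"max_new_tokens": 256, "temperature": 0.6},
}
response = predictor.predict(payload)
print(response)
```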

Cleaning Up

After you've finished using the endpoint, it's important to delete it to avoid incurring unnecessary costs. Uncomment the last code block that we commented out earlier by removing the "#", and then run the cell.
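The uncommented cleanup cell then looks like this:

```python
# Delete the endpoint so the ml.g6.12xlarge instance behind it stops accruing charges.
predictor.delete_predictor()
```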

FAQ

When should I consider using Amazon SageMaker over Amazon Bedrock Custom Model Import?

One reason you would consider using Amazon SageMaker over importing a customized model into Amazon Bedrock is if your target region does not support said feature yet. View the available regions for Amazon Bedrock Custom Model Import here.

About the Authors

Jarrett Yeo - Associate Cloud Architect, AWS
Jarrett Yeo Shan Wei is a Delivery Consultant in the AWS Professional Services team covering the Public Sector across ASEAN and is an advocate for helping customers modernize and migrate into the cloud. He has attained five AWS certifications and has published a research paper on gradient boosting machine ensembles at the 8th International Conference on AI. In his free time, Jarrett focuses on and contributes to the generative AI scene at AWS.
Germaine Ong - Startup Solutions Architect, AWS
Germaine is a Startup Solutions Architect in the AWS ASEAN Startup team covering Singapore Startup customers. She is an advocate for helping customers modernise their cloud workloads and improve their security posture through architecture reviews.
 

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.
