Building Scalable Generative AI Endpoints with AWS CDK and Amazon Bedrock
Here's a compact CDK project that lets you stand up a Gen AI endpoint in minutes, leveraging API Gateway and Lambda.
Published Dec 7, 2024
Generative AI is the new driver of technological advancement in this era. Whether we like it or not, Gen AI has already impacted our lives, directly or indirectly. Yet many people are hesitant to incorporate it into their software workflows and businesses. The start of the AI revolution is similar to the internet revolution of the early 2000s: it is bound to change our lives sooner or later.
Now, the main AWS service that has been such a game changer in AI and ML is Amazon Bedrock.
Amazon Bedrock gives you access to various foundation models from third parties like Meta, Anthropic, and Cohere. It is a managed service, with new capabilities being added all the time.
What we are doing is calling the Bedrock API through API Gateway. API Gateway will invoke a Lambda function, which in turn calls the Bedrock API. AWS CDK makes it really easy to set this up, and for developers it is a perfect way to apply infrastructure as code.
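As a rough sketch of what the CDK side of this can look like (the construct names `GenAiStack`, `BedrockFn`, `GenAiApi`, and the `lambda` asset directory are illustrative here, not taken from the actual repository):

```python
from aws_cdk import Stack, aws_apigateway as apigw, aws_iam as iam, aws_lambda as _lambda
from constructs import Construct


class GenAiStack(Stack):
    """Minimal stack: API Gateway -> Lambda -> Bedrock."""

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Lambda function that will call the Bedrock runtime API.
        fn = _lambda.Function(
            self, "BedrockFn",
            runtime=_lambda.Runtime.PYTHON_3_12,
            handler="handler.handler",
            code=_lambda.Code.from_asset("lambda"),
        )

        # Allow the function to invoke Bedrock models.
        fn.add_to_role_policy(iam.PolicyStatement(
            actions=["bedrock:InvokeModel"],
            resources=["*"],
        ))

        # REST API that proxies requests to the Lambda function.
        apigw.LambdaRestApi(self, "GenAiApi", handler=fn)
```

In a real project you would likely scope the IAM policy down to the specific model ARN rather than `*`.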
Why Use an API Gateway?
Using API Gateway has many advantages, especially when it comes to securing our endpoints. It comes in really handy with microservices: we can leverage it to add authentication and authorization, protect against a range of security threats, and apply rate limiting and encryption.
Why Lambda?
Using Lambda to call the Bedrock API adds significant advantages: it scales automatically, is cost effective based on use, adds a layer of security, and integrates with other AWS services for complex workflows and error handling. I find it really handy to debug errors with Lambda logs streamed to CloudWatch.
When it comes to calling the Bedrock API, it is easiest to invoke a model based on its API reference, available in the AWS console.
For example, here is what the reference looks like for the Amazon Titan Text Express model.
And here is how we have called it, separating the request body into its own variable.
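A minimal sketch of a Lambda handler along those lines is below. The request shape follows the Titan Text Express reference (`inputText` plus a `textGenerationConfig` block); the `prompt` field and the generation parameter values are illustrative defaults, not taken from the repository:

```python
import json

# Model ID for Amazon Titan Text Express (mirrors the BEDROCK_MODEL_ID constant).
MODEL_ID = "amazon.titan-text-express-v1"


def build_request_body(prompt: str) -> str:
    # Titan Text Express request shape: inputText plus a textGenerationConfig block.
    body = {
        "inputText": prompt,
        "textGenerationConfig": {
            "maxTokenCount": 512,
            "temperature": 0.7,
            "topP": 0.9,
        },
    }
    return json.dumps(body)


def handler(event, context):
    # boto3 is imported lazily so build_request_body stays testable without AWS.
    import boto3

    client = boto3.client("bedrock-runtime")
    prompt = json.loads(event.get("body") or "{}").get("prompt", "")

    response = client.invoke_model(
        modelId=MODEL_ID,
        body=build_request_body(prompt),
        contentType="application/json",
        accept="application/json",
    )
    result = json.loads(response["body"].read())
    return {
        "statusCode": 200,
        "body": json.dumps({"completion": result["results"][0]["outputText"]}),
    }
```

Keeping the body in its own variable (here, behind `build_request_body`) makes it easy to swap in a different model's request shape later.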
You can check out the full code in the repository.
In summary, using Lambda provides a flexible, scalable, and secure way to interact with the Bedrock API, enhancing your AI applications.
Here's a guide on how you can try it out yourself in minutes.
Make sure you have the AWS CLI configured. Check out this documentation for AWS CLI configuration.
We also need the AWS CDK installed; for this, you will need Node.js installed too.
Once Node is installed, you can install the AWS CDK with npm.
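For example (the global install is one common approach):

```shell
# install the AWS CDK CLI globally with npm
npm install -g aws-cdk

# verify the installation
cdk --version
```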
To replicate my code, you can clone the repository here,
then go to the root directory of the project and run the following CDK commands.
# make sure your account id is correct
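The commands look roughly like this (replace the account ID and region placeholders with your own):

```shell
# one-time bootstrap for this account/region pair
cdk bootstrap aws://123456789012/us-east-1
```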
Here's what the bootstrap command will do:
- Deploy the CDKToolkit CloudFormation stack
- Create S3 bucket for CDK assets
- Set up ECR repository for Docker images
- Create IAM roles for deployment permissions
- Provision bootstrap template resources
- Prepare environment for CDK deployments
This is a one-time setup per AWS account and region.
Now go ahead and deploy. If you want to view the CloudFormation template before deploying, you can use the cdk synth command, and you can redirect its output to save the template to a file.
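For example (`template.yaml` is just an example filename):

```shell
# print the synthesized CloudFormation template
cdk synth

# or save it to a file
cdk synth > template.yaml

# then deploy the stack
cdk deploy
```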
This will deploy the Lambda function, the API Gateway, IAM roles and permissions, and everything else as a CloudFormation stack, which makes it super easy to keep track of your resources.
Now we are almost all set, but first we need to make sure the model we are calling in our code is accessible. If you have not used Bedrock before, you need to request access to models before using them. Access is usually granted within a few minutes, sometimes seconds, depending on the model.
In our source code, if you go to constants.py, there is a variable defined there, BEDROCK_MODEL_ID. In this case I have used Amazon Titan Text Express and its corresponding model ID, so go to Bedrock -> Model access -> Request model access.
Now, when you deploy with CDK, you will receive the API Gateway endpoint as an output. This is usually the base endpoint; you may have to append the stage and resource name to form the full URL.
Then you can use either Postman or the terminal to test it.
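From the terminal, a test call might look like this (the stage name `prod`, the resource path `/generate`, and the `prompt` field are assumptions; substitute your actual endpoint and request shape):

```shell
curl -X POST \
  "https://abc123.execute-api.us-east-1.amazonaws.com/prod/generate" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Write a haiku about serverless computing"}'
```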
With that, we have deployed a serverless app that calls an LLM!
Thank you for reading to the end!