
Deploying AI Agent APIs on AWS (the easiest way)
This step-by-step guide will not only get you quickly acquainted with how to build and deploy an AI Agent API using Amazon Bedrock on AWS, but will also provide CloudFormation templates that automate most of the steps needed.
Basil Fateen
Amazon Employee
Published Oct 10, 2024
I know…
It’s hard not to feel like we’re all riding the tsunami of Generative AI innovations (and hype) in a tiny canoe, frantically paddling to not only ride the wave into prosperity but also not to be consumed by it.
Recently I’m sure you’ve been hearing about ‘AI Agents’ everywhere. For some, it’s not clear what the value proposition is for builders, or whether agents should be seen as an ally or a potential threat.
Let’s first clarify what an agent is. In the simplest terms, an agent uses a generative AI Large Language Model (LLM) to orchestrate actions beyond providing a text result to a prompt. It can break down complex objectives into smaller tasks, connect with internal or external systems to pull in more contextual and specific data and call APIs to take actions, like sending emails, scheduling meetings, etc.
To provide more contextually relevant, domain-specific information for the agent, such as product information or company policy guides, we connect the agent to a Knowledge Base, which uses Retrieval Augmented Generation (RAG) to let this specific data be used in combination with the LLM. Keep in mind that this does not actually ‘make the LLM smarter’ or teach it anything; for that, we would need to fine-tune an LLM with our data in order to bake the information into the model’s weights. However, layering specific information on top of the public data an LLM was already trained on can make the model dramatically more relevant and helpful.
On AWS, we can create Agents on Amazon Bedrock. The process involves choosing a Large Language Model, setting up a Knowledge Base and any additional Action Groups which can call custom functions to handle certain tasks.
Embedding an agent within a web app, either as part of the customer-facing value proposition or to streamline internal operations, opens up endless possibilities to enhance value for your customers and grow fast.
However, I’m aware that for some developers who don’t have much experience on AWS, the objective of launching an AI agent to connect to their web app may seem daunting.
Well, that’s what this guide (and accompanying video on the AWS Developers YouTube channel) is here for.
If you’re new to AWS and you want to launch an AI agent in the simplest way possible, you’re in the right place.
It doesn’t matter if you’re front-end, back-end, full-stack, no-stack (is that a thing?). This guide will generate the agent and give you the URL to plug it into any web or mobile app. Do with it what you will.
Maybe you previously tried diving head-first into some technical documentation and went cross-eyed.
Then you called one of your friends who is an expert at AWS to guide you, and that phone call resembled the scene in ‘Poltergeist’ where the psychic was communicating with the little girl in the spirit realm (“Carol Anne, don’t go into the light!”).
So, if you feel like a cross-eyed lost soul from the shadow realm, don’t worry. I got you.
Let’s get started.
Here’s the high-level plan:
1. Sign up to AWS
2. Create an IAM User
3. Request access to Gen AI models on Bedrock
4. Upload the templates using CloudFormation
5. Copy the URL for your Agent’s API
6. ?
7. Profit
The first thing we need to do is sign up to AWS.
You will be asked to add a credit card to continue the signup, but if you’re signing up for the first time, you are eligible for the AWS ‘Free Tier’, which provides a certain amount of credits and free usage for certain services. To learn more about the usage limits for different services, have a look here: https://aws.amazon.com/free/. After your account is created, you can track your consumption within the Free Tier from the ‘Billing and Cost Management’ section.
This is a view of the ‘Console’ on AWS, your headquarters of innovation and infrastructure. From here you can access the over 200 services that can help you build literally anything, from the leanest prototype to a global enterprise infrastructure. To access a service, just start typing the name in the search box and click on the relevant service from the drop-down options.

The first service we will access is “IAM”, which stands for ‘Identity and Access Management’. This service handles the relationships and security between services, roles, and users. What makes AWS the most secure cloud platform is the granularity and focus on maintaining security best practices when it comes to infrastructure. So when we create services that interact with each other, we need IAM to set the permissions that govern how they interact.

So in order to move forward with automating the creation of services using our CloudFormation template, we need to create our first administrator user.
We will assign it the ‘AdministratorAccess’ privilege and enable ‘Console access’. Once it’s created, we will copy the ‘ARN’ of that user which we will need as a parameter for our script. The ARN is a unique identifier code assigned to each resource in AWS.
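If you’d rather script this step (and you already have CLI credentials configured), here’s a rough boto3 sketch of the same thing. The user name is just a placeholder, and enabling console access would additionally require a login profile, which I’m skipping here:

```python
import boto3

iam = boto3.client("iam")

# Placeholder user name -- the console flow described above works just as well.
user_name = "agent-admin"

# Create the user and give it administrator access.
user = iam.create_user(UserName=user_name)["User"]
iam.attach_user_policy(
    UserName=user_name,
    PolicyArn="arn:aws:iam::aws:policy/AdministratorAccess",
)

# This ARN is the value the CloudFormation template will ask for as a parameter.
print(user["Arn"])
```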
Next, we will make sure that we have access to the models on Amazon Bedrock by accessing the ‘Model Access’ subsection. For this particular example we are using the Amazon Titan Embeddings model for RAG and Amazon Titan Text Premier for the LLM, so we need to request access to those models. Once access has been granted, we can continue to the CloudFormation template upload.
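If you want a quick sanity check that access actually went through, a tiny test call like this works; the region is an assumption, so use whichever one you requested access in:

```python
import boto3
import json

# Region is an assumption -- match the region where you requested model access.
runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# A tiny test call against the embeddings model: if access hasn't been granted yet,
# this raises an AccessDeniedException instead of returning a vector.
response = runtime.invoke_model(
    modelId="amazon.titan-embed-text-v1",
    body=json.dumps({"inputText": "ping"}),
    contentType="application/json",
    accept="application/json",
)
embedding = json.loads(response["body"].read())["embedding"]
print(f"Got an embedding with {len(embedding)} dimensions")
```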


This is a tale of a template. Specifically, a CloudFormation template that will help automate MOST of the steps required to deploy an Agent with a Knowledge Base on Amazon Bedrock and generate a public URL to access that Agent in your web app.
CloudFormation is a service that automates the deployment of infrastructure on AWS. It can create services and the necessary roles and relationships between them using a template.
Click ‘Create Stack’ and choose ‘With new resources’. Then select ‘Choose an existing template’, ‘Upload a template file’, locate the first CloudFormation template file in the repo linked below, and click ‘Next’.

Give the stack a name like ‘MyAgent’ and then, below, paste the ARN that you copied from your new user. Leave all the defaults and click continue, then check the boxes on the last step and click ‘Create Stack’.
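If you prefer doing this from code instead of clicking through the console, here’s a rough boto3 sketch of the same stack creation. The template file name and parameter key are placeholders, so check the repo and the template’s Parameters section for the real ones:

```python
import boto3

cfn = boto3.client("cloudformation", region_name="us-east-1")

# File name is a placeholder -- use the first template file from the repo.
with open("template-1.yaml") as f:
    template_body = f.read()

cfn.create_stack(
    StackName="MyAgent",
    TemplateBody=template_body,
    Parameters=[
        # Parameter key is a placeholder -- check the template's Parameters section.
        {"ParameterKey": "UserArn",
         "ParameterValue": "arn:aws:iam::123456789012:user/agent-admin"},
    ],
    # Equivalent of the checkboxes on the console's last step: the template creates
    # IAM resources (and may use transforms), so these must be acknowledged.
    Capabilities=["CAPABILITY_IAM", "CAPABILITY_NAMED_IAM", "CAPABILITY_AUTO_EXPAND"],
)

# Block until the stack reaches CREATE_COMPLETE.
cfn.get_waiter("stack_create_complete").wait(StackName="MyAgent")
print("Stack created")
```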
Now it’s time to go make a cup of coffee and then come back so I can tell you all about what’s going on in the background as the template runs.
A Tale of a Template:
Script 1:
1. S3 Bucket Creation
2. IAM Roles and Policies Setup
3. OpenSearch Serverless Collection and Policies Configuration
4. Custom Lambda Functions for OpenSearch Management
5. Vector Index Creation in OpenSearch
Script 2:
1. Bedrock Knowledge Base Setup
2. Bedrock Agent Creation
Script 3:
1. Boto3 Layer
Script 4:
1. Lambda Function and API
Script 5:
1. Add Billing Alert
Here’s a more detailed breakdown of each step:
1. S3 Bucket Creation
We kick things off by summoning an S3 bucket from the digital ether. This humble bucket will serve as the treasure chest for our knowledge base's source data. It's like creating a library, but instead of dusty shelves, we have AES256 encryption. This is where we will upload the files that we want ingested by the Knowledge Base.
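If you want a head start on feeding it, uploading a document is a one-liner with boto3. The bucket name below is a placeholder, so grab the real one from the stack’s Resources tab:

```python
import boto3

s3 = boto3.client("s3")

# Placeholder bucket name -- use the one the stack actually created.
bucket = "myagent-knowledge-base-source"

# Upload a document (PDF, TXT, MD, etc.) so the Knowledge Base has something
# to ingest on its next sync.
s3.upload_file("product-faq.pdf", bucket, "product-faq.pdf")
```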
2. IAM Roles and Policies Setup
Next, we set up the IAM roles and policies. We're essentially handing out VIP passes to our AWS resources, making sure everyone (and everything) has the right backstage access.
3. OpenSearch Serverless Collection and Policies Configuration
Now we set up an OpenSearch Serverless collection with the correct security policies because, let's face it, nobody likes an uninvited guest at their data party. This is the mechanism we are using for RAG. With OpenSearch Serverless, you can easily search and analyze a large volume of data without having to worry about the underlying infrastructure and data management.
4. Custom Lambda Functions for OpenSearch Management
Then we add some Lambda functions to facilitate the OpenSearch setup.
5. Vector Index Creation in OpenSearch
Vector index creation time! This is where we give our OpenSearch collection the superpower of understanding high-dimensional vector space. It's like teaching a librarian to organize books in 1536 dimensions.
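The template’s Lambda handles this for you, but for the curious, the index creation looks roughly like this sketch using the opensearch-py client. The collection endpoint and index name are placeholders:

```python
import boto3
from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth

# "aoss" is the service name used when signing requests to OpenSearch Serverless.
region = "us-east-1"
auth = AWSV4SignerAuth(boto3.Session().get_credentials(), region, "aoss")

# Placeholder host -- use your collection's endpoint from the OpenSearch console.
client = OpenSearch(
    hosts=[{"host": "abc123xyz.us-east-1.aoss.amazonaws.com", "port": 443}],
    http_auth=auth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
)

# A k-NN index whose vector field matches Titan Embeddings' 1536 dimensions,
# plus text and metadata fields for the stored chunks.
client.indices.create(
    index="my-agent-index",
    body={
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "vector": {"type": "knn_vector", "dimension": 1536},
                "text": {"type": "text"},
                "metadata": {"type": "text"},
            }
        },
    },
)
```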
6. Bedrock Knowledge Base Setup
With OpenSearch ready to roll, we set up our Bedrock Knowledge Base. This is where the magic happens.
7. Bedrock Agent Creation
Last but not least, we create a Bedrock Agent. This digital workforce is ready to dive into our newly created knowledge base and start answering questions. It's like hiring a team of experts who never sleep, never take coffee breaks, and never complain about the office temperature.
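Once the agent exists, you can talk to it directly with boto3, even before the API layer is deployed. The agent ID and alias ID below are placeholders you’d replace with the values from the stack outputs or the Bedrock console:

```python
import boto3
import uuid

runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

# Placeholder IDs -- "TSTALIASID" is the built-in test alias for a draft agent.
response = runtime.invoke_agent(
    agentId="ABCD1234EF",
    agentAliasId="TSTALIASID",
    sessionId=str(uuid.uuid4()),
    inputText="What can you help me with?",
)

# The answer streams back in chunks; stitch them together into one string.
answer = "".join(
    event["chunk"]["bytes"].decode("utf-8")
    for event in response["completion"]
    if "chunk" in event
)
print(answer)
```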
Now it’s time to deploy the additional templates, which will make our new AI agent accessible so we can plug it into our app to start testing.
Now we should have a URL in the output section that looks like this:

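Here’s a rough sketch of calling that URL from Python. The exact request and response shapes depend on the Lambda the template deploys, so treat the ‘prompt’ field below as an assumption and adjust it to match your deployed function:

```python
import requests

# Placeholder URL -- copy the real one from the stack's Outputs tab.
api_url = "https://abc123.execute-api.us-east-1.amazonaws.com/prod/agent"

# Assumed payload shape: a "prompt" field in, JSON out.
resp = requests.post(api_url, json={"prompt": "Hi there! What do you know?"}, timeout=60)
resp.raise_for_status()
print(resp.json())
```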
A few things are very important to highlight at this point. Firstly, the services launched by these scripts will incur costs once the Free Tier credits are finished, so keep a close eye on that and delete the stack as soon as you are done testing. The Knowledge Base uses an OpenSearch Serverless collection, which is very powerful and, as such, carries a substantial cost if left running. A more cost-effective option would be to use something like Pinecone for the RAG portion, which would reduce the costs substantially. This template and guide are only provided for quick testing, not for any deployment environment; for that, you would need a lot more security mechanisms, such as Amazon Cognito.
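When you’re done testing, tearing everything down is one call per stack (repeat for each stack you created):

```python
import boto3

cfn = boto3.client("cloudformation", region_name="us-east-1")

# Delete the stack so the OpenSearch Serverless collection (and everything else)
# doesn't keep running -- and billing -- in the background.
cfn.delete_stack(StackName="MyAgent")
cfn.get_waiter("stack_delete_complete").wait(StackName="MyAgent")
print("Stack deleted")
```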
Stay tuned for part 2, where we will:
1. Build a small frontend page to communicate with the Agent
2. Add files to the S3 bucket for the knowledge base
3. Add action groups to the Agent
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.