
Model Context Protocol (MCP) and Amazon Bedrock
Getting started with Anthropic's LLM protocol on AWS
Giuseppe Battista
Amazon Employee
Published Mar 19, 2025
This tutorial is based on a two-part quickstart published by Anthropic (part one, part two). In this post we're going to highlight how to make the Model Context Protocol work with Amazon Bedrock as the model provider.
We'll then extend our toolbox to solve a real-world business problem: we want to build an agentic system that summarizes blog posts and checks that all included links are available for our readers.
Quoting from the official documentation:
MCP is an open protocol released by Anthropic that standardizes how applications provide context to LLMs. Think of MCP like a USB-C port for AI applications. Just as USB-C provides a standardized way to connect your devices to various peripherals and accessories, MCP provides a standardized way to connect AI models to different data sources and tools.
If you want to learn more about it, have a look at this introduction on its official documentation.
At a high level, MCP implements a client/server architecture with bi-directional communication. Its communication protocol can be served via two transport mechanisms:
- Stdio transport
- Uses standard input/output for communication
- Ideal for local processes
- HTTP with SSE transport
- Uses Server-Sent Events for server-to-client messages
- HTTP POST for client-to-server messages
All transports use JSON-RPC 2.0 to exchange messages. See the specification for detailed information about the Model Context Protocol message format.
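To make the transport concrete, here's a minimal sketch in Python of what a JSON-RPC 2.0 exchange for a tool call might look like on the wire. The payload values are made up for illustration; the method names and result shapes are governed by the MCP specification, so treat this as a shape sketch rather than a reference.

```python
import json

# A JSON-RPC 2.0 request as an MCP client might send it over stdio to
# invoke a tool ("tools/call" is the MCP method for tool invocation).
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "visit_webpage",                      # illustrative tool
        "arguments": {"url": "https://example.com"},  # illustrative input
    },
}

# The matching response carries the same id, so the client can
# correlate it with the request it sent.
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"content": [{"type": "text", "text": "# Example Domain"}]},
}

# Over stdio, each message travels as a single serialized JSON line.
wire_message = json.dumps(request)
print(wire_message)
```

The `id` correlation is what lets the client fire several tool calls and match each result to its request.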
In this sample we're going to explore stdio. This means that we'll host both the MCP client and server on the same machine.
Let us know in the comments if you'd like to see how to implement and deploy an MCP server that communicates over HTTP. We're thinking Fargate or maybe even Lambda...!
A key benefit of MCP is how it makes it easier for different teams to work together on AI-powered applications. Similar to how microservice architecture allows different teams to work independently, MCP creates clear boundaries between different organizational functions.
Tool developers can freely create and update capabilities without disrupting the core AI system, while AI teams can concentrate on improving conversation quality without getting entangled in tool implementation details. The clear interfaces between components mean each team can work at their own pace, following their own development cycles.
Perhaps most importantly, organizations can incrementally add new capabilities without requiring system-wide changes, making it easier to evolve the AI system over time. This modular approach to AI development mirrors the successful patterns seen in modern software architecture, where loose coupling between components leads to more maintainable and scalable systems.
Additionally, thinking about SaaS providers for example, customers can focus on building features for their own products, rather than worrying about interfaces and undifferentiated heavy-lifting. With this standard in place, they can easily plug in their offering without time-consuming integration work.
Finally, from a technical perspective, you get all the benefits (and the burdens!) of a fully distributed architecture. This means you'll be able to scale and manage the lifecycle of components independently, you'll be able to A/B test tools and models for performance improvement, cost optimisation, and easy experimentation, you'll be able to allocate resources in a granular way and respond quickly to demand, fully benefitting from the elasticity of building in the cloud.
Here's a breakdown of the architecture and conversation flow we'll implement.
We'll follow the journey of the simple prompt
get me a summary of the blog post at this $URL
depicted in the two pictures below step by step, so it's easy to follow along.
- For a human, the task of reading a web page and providing a summary is trivial, but LLMs are not normally able to visit web pages and fetch context outside of their parametric memory. This is why we need a tool.
- We provide the user prompt and a list of available tools brokered by our MCP server to Amazon Bedrock via Converse API. In this case, Amazon Bedrock is acting as our unified interface to many models.
- Based on the user prompt and the tool inventory the chosen model plans a proper response.
- In this case the model correctly plans to use the `visit_webpage` tool to download the content at the URL provided. Bedrock returns a `toolUse` message to the client, including the `name` of the selected tool, the `input` for the tool request, and a unique `toolUseId` which can be used in subsequent messages. Read these docs for more information about the syntax and usage of `toolUse` in the Bedrock Converse API.
- The client is programmed to forward any `toolUse` message to the MCP server. In our implementation, communication happens via `JSON-RPC` over `stdio` on the same machine.
- The MCP server dispatches the `toolUse` request to the appropriate tool.
- The `visit_webpage` tool is invoked and an HTTP request is made to the provided URL.
- The tool is programmed to download the content located at the provided URL and return its content in markdown format.
- The content is then forwarded to the MCP server.
- Flow control is returned to the MCP client. We complete the journey with steps 11-14 depicted in the following picture.
- The MCP client adds a `toolResult` message to the conversation history, including the `toolUseId` provided at step 4, and forwards it to Bedrock. Read these docs for more information about `toolResult` syntax.
- Bedrock now plans to use the result of the tool to compose its final response.
- The response is sent back to the client, which is programmed to yield control of the conversation flow back to the user.
- The user receives the response from the MCP client and is free to initiate a new flow.
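As a rough illustration of steps 4 and 11 above, here are the shapes of the `toolUse` and `toolResult` content blocks as they appear in Converse API messages. The `toolUseId` value is invented for the example; check the Converse API documentation for the authoritative schema.

```python
# Step 4: the model's turn contains a toolUse block naming the tool,
# its input, and a unique id.
tool_use_message = {
    "role": "assistant",
    "content": [{
        "toolUse": {
            "toolUseId": "tooluse_abc123",            # illustrative id
            "name": "visit_webpage",
            "input": {"url": "https://example.com/post"},
        }
    }],
}

# Step 11: the client answers with a toolResult block that echoes the
# same toolUseId, so Bedrock can tie the result back to its request.
tool_result_message = {
    "role": "user",
    "content": [{
        "toolResult": {
            "toolUseId": "tooluse_abc123",
            "content": [{"text": "# Example Domain\n..."}],
        }
    }],
}
```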
Before you can get started, you'll need to complete the following tasks:
- Complete this tutorial showing how to create an MCP server in Python. At the end of the tutorial you should have a working MCP server providing two tools: `get_alerts` and `get_weather`. This also includes installing `uv`, a fast Python package and project manager.
- Make sure you've exported your AWS credentials in your environment, so that they're available to `boto3`.
💡 For more information on how to do this, please refer to the AWS Boto3 documentation (Developer Guide > Credentials).
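A few lines like these make a quick sanity check before running the client. This is only a sketch: boto3 also resolves credentials from profiles, config files, and instance roles, so missing environment variables aren't necessarily an error.

```python
import os

# boto3 looks for credentials in (among other places) these environment
# variables. AWS_SESSION_TOKEN is only needed for temporary credentials.
REQUIRED = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")

def missing_credentials(env=os.environ):
    """Return the names of required credential variables not set in env."""
    return [name for name in REQUIRED if not env.get(name)]

if missing_credentials():
    print("Export your AWS credentials before running the client.")
```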
Ok, time to get our hands dirty with MCP and Amazon Bedrock. Make sure you have completed the tutorial linked in the Prerequisites section above before building your new client!
Your project tree should look something like this (minus the `mcp-client` folder we're going to create now).
Jump one level above if you're currently in `weather/` and create a new Python project.
You can get rid of `main.py` and create a new file called `client.py`.
You'll need to install the following packages via `uv`.
We're making use of the `mcp` package to manage access to the MCP server session and, of course, a sprinkle of `boto3` to add the Bedrock goodness.
With this helper class we're mapping messages coming from the Bedrock Converse API to objects we can use in our business logic. We're also defining a utility method to map the definition of a tool in the MCP server to Bedrock Converse API syntax. Just massaging the data a bit, nothing too fancy.
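The "massaging" might look roughly like this. It's a sketch: the MCP tool fields (`name`, `description`, `inputSchema`) and the Bedrock `toolSpec` wrapper follow the shapes used in this post, but verify the field names against both specifications before relying on them.

```python
def map_tool(mcp_tool: dict) -> dict:
    """Map an MCP tool definition to the Bedrock Converse toolSpec format."""
    return {
        "toolSpec": {
            "name": mcp_tool["name"],
            "description": mcp_tool.get("description", ""),
            # Bedrock wraps the JSON Schema in a {"json": ...} envelope.
            "inputSchema": {"json": mcp_tool["inputSchema"]},
        }
    }

# Example: the weather-alerts tool from the prerequisite tutorial.
get_alerts = {
    "name": "get_alerts",
    "description": "Get weather alerts for a US state.",
    "inputSchema": {
        "type": "object",
        "properties": {"state": {"type": "string"}},
        "required": ["state"],
    },
}

# The Converse API takes the full inventory under toolConfig.tools.
tool_config = {"tools": [map_tool(get_alerts)]}
```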
For simplicity, we're packaging all the business logic of our client in one class, `MCPClient`.
- `self.session` is the object mapping to the MCP session we're establishing. In this case we'll be using `stdio`, as we'll be using tools hosted on the same machine.
- `self.bedrock` creates an AWS SDK client that provides methods to interact with Amazon Bedrock's runtime APIs, allowing you to make API calls like `converse` to communicate with foundation models.
- `self.exit_stack = AsyncExitStack()` creates a context manager that helps manage multiple async resources (like network connections and file handles) by automatically cleaning them up in reverse order when the program exits, similar to a stack of nested `async with` statements but more flexible and programmatic. We're making use of `self.exit_stack` in the public `cleanup` method to cut loose ends.
- The `connect_to_server` method establishes a bidirectional communication channel with a Python or Node.js script that implements MCP tools, using standard input/output (stdio) for message passing, and initializes a session that allows the client to discover and call the tools exposed by the server script.
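Here's a condensed sketch of that skeleton, assuming the stdio client API of the `mcp` Python package (`StdioServerParameters`, `stdio_client`, `ClientSession`). The `mcp` imports are kept inside the method so the pure parts of the sketch stand on their own; the real client also constructs the `boto3` Bedrock runtime client here.

```python
from contextlib import AsyncExitStack

def server_command(server_script: str) -> str:
    """Pick the interpreter for a server script based on its extension."""
    if server_script.endswith(".py"):
        return "python"
    if server_script.endswith(".js"):
        return "node"
    raise ValueError("Server script must be a .py or .js file")

class MCPClient:
    def __init__(self):
        self.session = None                 # set once we connect
        self.exit_stack = AsyncExitStack()  # cleans up transports in reverse order

    async def connect_to_server(self, server_script: str):
        from mcp import ClientSession, StdioServerParameters
        from mcp.client.stdio import stdio_client

        params = StdioServerParameters(
            command=server_command(server_script), args=[server_script])
        # The stdio transport yields a (read, write) stream pair.
        stdio, write = await self.exit_stack.enter_async_context(
            stdio_client(params))
        self.session = await self.exit_stack.enter_async_context(
            ClientSession(stdio, write))
        await self.session.initialize()

    async def cleanup(self):
        await self.exit_stack.aclose()
```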
Getting closer to the core of our business logic.
- The `_make_bedrock_request` method is a private helper that sends a request to Amazon Bedrock's Converse API, passing in the conversation history (`messages`), available tools, and model configuration parameters (like token limit and temperature), to get a response from the foundation model for the next turn of conversation. We'll use this in a couple of different methods.
- The `process_query` method orchestrates the entire query processing flow:
- Creates a message from the user's query
- Fetches available tools from the connected server
- Formats the tools into Bedrock's expected structure
- Makes a request to Bedrock with the query and tools
- Processes the response through potentially multiple turns of conversation (if tool use is needed)
- Returns the final response
This is the main entry point for handling user queries and managing the conversation flow between the user, tools, and the foundation model. Let's double-click on how the conversation is actually handled.
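For reference, the request that `_make_bedrock_request` assembles might look roughly like this. The model ID and the default inference parameters are illustrative, not taken from the original code.

```python
# Assumed model id for this sketch; any Converse-capable model works.
MODEL_ID = "anthropic.claude-3-5-sonnet-20240620-v1:0"

def build_converse_request(messages, tools,
                           model_id=MODEL_ID,
                           max_tokens=1000, temperature=0.7):
    """Assemble the keyword arguments for bedrock_runtime.converse()."""
    return {
        "modelId": model_id,
        "messages": messages,                  # full conversation history
        "inferenceConfig": {"maxTokens": max_tokens,
                            "temperature": temperature},
        "toolConfig": {"tools": tools},        # toolSpec entries from the server
    }

# Inside the client this is used roughly as:
#   response = self.bedrock.converse(**build_converse_request(messages, tools))
```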
This is a conversation loop in which we handle all sorts of requests from both user and Bedrock. Let's dive in!
- The `_process_response` method initializes a conversation loop with a maximum of 10 turns (`MAX_TURNS`), tracking responses in `final_text`.
- When the model requests to use a tool, it processes the request by handling both thinking steps (text) and tool execution steps (`toolUse`).
- For tool usage, it calls the tool handler and makes a new request to Bedrock with the tool's results. Remember, we're hosting the tools locally in our MCP server.
- We also handle various stop conditions (max tokens, content filtering, stop sequence, end turn) by appending appropriate messages and breaking the loop.
- Finally, it joins all accumulated text with newlines and returns the complete conversation history.
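The control flow of that loop can be sketched as follows. This is a simplified, dependency-injected version so it can run without AWS access; the real method lives on the client class and calls `_make_bedrock_request` and `_handle_tool_call` directly.

```python
MAX_TURNS = 10  # hard cap so a confused model can't loop forever

def process_response(response, messages, make_request, handle_tool):
    """Drive the tool-use loop: keep answering toolUse requests until the
    model stops for another reason or we hit MAX_TURNS."""
    final_text = []
    for _ in range(MAX_TURNS):
        stop_reason = response["stopReason"]
        if stop_reason == "tool_use":
            for item in response["output"]["message"]["content"]:
                if "text" in item:                    # a "thinking" step
                    final_text.append(item["text"])
                elif "toolUse" in item:               # a tool execution step
                    note = handle_tool(item["toolUse"], messages)
                    final_text.append(note)
                    # Ask Bedrock for the next turn, now that the
                    # toolResult is in the conversation history.
                    response = make_request(messages)
        elif stop_reason == "max_tokens":
            final_text.append("[Response truncated: max tokens reached]")
            break
        else:  # end_turn, stop_sequence, content_filtered, ...
            final_text.append(response["output"]["message"]["content"][0]["text"])
            break
    return "\n".join(final_text)
```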
- The `_handle_tool_call` method executes a tool request by extracting the tool's name, arguments, and ID from the provided info.
- It calls the tool through the `session` interface and awaits its result.
- The method records both the tool request and its result in the conversation history. This is to let Bedrock know that we have had a conversation with somebody else, somewhere else (the tool running on your machine, I mean!).
- Finally, it returns a formatted message indicating which tool was called with what arguments.
This method essentially serves as the bridge between the model's tool use requests (coming from Bedrock) and the actual tool execution system (running on your machine).
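A sketch of that bridge, assuming the MCP session exposes an async `call_tool(name, arguments)` method, as in the `mcp` package's client API:

```python
import asyncio

async def handle_tool_call(session, tool_use, messages):
    """Execute one toolUse request via the MCP session and record both
    sides of the exchange in the Converse conversation history."""
    name, args = tool_use["name"], tool_use["input"]
    tool_use_id = tool_use["toolUseId"]

    result = await session.call_tool(name, args)

    # Record the model's request...
    messages.append({"role": "assistant",
                     "content": [{"toolUse": tool_use}]})
    # ...and the tool's answer, echoing the same toolUseId so Bedrock
    # can match result to request.
    messages.append({"role": "user",
                     "content": [{"toolResult": {
                         "toolUseId": tool_use_id,
                         "content": [{"text": str(result)}]}}]})
    return f"[Called tool {name} with args {args}]"
```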
The `chat_loop` method implements a simple interactive command-line interface that continuously accepts user input, processes queries through the system, and displays responses until the user types 'quit' or an error occurs.
And here's the main entry point. Finally!
Here we validate command-line arguments and initialize an MCP client to connect to a specified server script. We then run the chat loop in an async context with proper cleanup handling, using Python's asyncio to manage the asynchronous execution flow.
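Boiled down, the entry point looks something like this. The client factory is injected here purely so the flow is easy to exercise; the real script constructs `MCPClient` directly and passes `sys.argv`.

```python
import asyncio
import sys

def parse_args(argv):
    """Validate command-line arguments: we expect exactly one server script."""
    if len(argv) != 2:
        raise SystemExit("Usage: python client.py <path_to_server_script>")
    return argv[1]

async def main(server_script, client_factory):
    client = client_factory()
    try:
        await client.connect_to_server(server_script)
        await client.chat_loop()
    finally:
        # Always release the stdio transport and MCP session, even if
        # the chat loop raised.
        await client.cleanup()

# In the real client:
#   asyncio.run(main(parse_args(sys.argv), MCPClient))
```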
Here's a video demo of what we've built so far: we're going to test this client with the weather tools we built as part of this tutorial released by Anthropic.
We'll demo the two tools in isolation:
- The weather alert tool will help us fetch alerts for a state in the US
- prompt: "get me a summary of the weather alerts in California"
- The weather forecast tool will help us get the forecast for a city in the US
- prompt: "get me a summary of the weather forecast for Buffalo, NY"
Use `uv` to run `client.py` and don't forget to pass the path to where you've stored your tool.
Now that we have a client and a server set up, we're ready to start solving real-world use cases. In this sample, we're going to build custom tools to provide web browsing capabilities for our LLMs.
Extending your MCP server is as easy as adding new functions to your server file or creating a new server file. For the scope of this demo, we're going to add a couple of very powerful functions to our existing server file and we'll show how Claude is smart enough to use them together in a plan and properly distinguish them.
Before we write more code, make sure you switch to the MCP server folder and install some extra dependencies
We're providing our LLMs with an HTTP client to visit web pages and extract markdown from it.
The `visit_webpage` function is a tool that fetches content from a given URL using HTTP GET requests. It converts the retrieved HTML content into a cleaner Markdown format, removing excessive line breaks and handling edge cases. The function includes comprehensive error handling for both network-related issues and unexpected errors, returning appropriate error messages when something goes wrong.
The `validate_links` function takes a list of URLs and checks each one to verify if it's a valid, accessible webpage. It attempts to make HTTP GET requests to each URL, considering a link valid if the request succeeds and returns non-empty content. The function returns a list of URL-validity pairs, where each pair contains the URL and a boolean indicating whether the link is valid, with error handling for both network and general exceptions.
(Also make sure you check the article we're using in this demo, so you know we're not cheating - Serverless Retrieval Augmented Generation (RAG) on AWS.)
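Here's a self-contained sketch of the two tools using only the standard library. The actual implementation converts HTML to Markdown with a dedicated library, which is omitted here; only the fetch/validate/error-handling structure described above is shown.

```python
import re
import urllib.error
import urllib.request

def clean_markdown(text: str) -> str:
    """Collapse runs of three or more newlines, as visit_webpage does
    after converting HTML to Markdown."""
    return re.sub(r"\n{3,}", "\n\n", text).strip()

def visit_webpage(url: str) -> str:
    """Fetch a page and return its content; errors become readable strings."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            html = resp.read().decode("utf-8", errors="replace")
        # The real tool converts HTML to Markdown here before cleaning.
        return clean_markdown(html)
    except urllib.error.URLError as exc:
        return f"Error fetching the webpage: {exc}"
    except Exception as exc:
        return f"An unexpected error occurred: {exc}"

def validate_links(urls: list[str]) -> list[tuple[str, bool]]:
    """Check each URL and pair it with a boolean validity flag:
    valid means the request succeeded and returned non-empty content."""
    results = []
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                results.append((url, bool(resp.read())))
        except Exception:
            results.append((url, False))
    return results
```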
It's impressive how, even without explicit orchestration, Claude is able to plan and combine the usage of tools to achieve a complex goal. In this demo we've seen how our system is able to first download the content of a web page, then extract all the links, validate all of them, and finally return a summary to the end user.
You can now extend your tool library and explore new ways to manage the conversation independently.
We're eager to explore new use cases and to take this architecture to the next level by going fully distributed. Please let us know in the comments what features or tools you'd like to see next.
Are you building tools or agentic systems? Are you a startup founder and want to discuss your startup with AWS startup experts and the authors of this article? Book your 1:1 meeting here!
Giuseppe Battista is a Senior Solutions Architect at Amazon Web Services. He leads solutions architecture for Early Stage Startups in the UK and Ireland. He hosts the Twitch Show "Let's Build a Startup" on twitch.tv/aws.
Kevin Shaffer-Morrison is a Senior Solutions Architect at Amazon Web Services. He's helped hundreds of startups get off the ground quickly and up into the cloud. Kevin focuses on helping the earliest stage of founders with code samples and Twitch live streams.
Jamila Jamilova - Solutions Architect, AWS Startups UKIR and Anthropic champion.
Book a meeting with Kevin, Jamila, and Giuseppe here.
Giuse and Kevin would like to thank the following colleagues and friends for their contribution to this article:
- Dheeraj Mudgil - Sr. Solutions Architect, AWS Startups UKIR. Thank you for continuously providing kind and patient guidance on security and inspiring the architectural and flow diagrams in this article
- Heikki Tunkelo - Manager, Solutions Architecture, AWS Startups Nordics. Thanks for your bright remarks on organisational and business benefits in adopting MCP, and overall peer review of the article.
- João Galego - Head of AI at Critical Software and many other superimpositions. Thanks for your kind peer review. You keep insisting on the highest standards, even outside of Amazon.
Security Awareness Disclaimer For any considerations of adopting these services in a production environment, it is imperative to consult with your company-specific security policies and requirements. Each production environment demands a uniquely tailored security assessment that comprehensively addresses its particular risks and regulatory standards. If in doubt, reach out to your AWS Account team.
Open-source disclaimer - This blog post and related code samples make use of third party open-source technologies. If you want to make use of these code samples, make sure you check the licensing implications of all the packages involved. All code in this sample is released under Apache 2.0 licensing terms.
Housekeeping Note After completing the experiment, it’s crucial to promptly remove or disable any keys or credentials generated for the PoC. Additionally, it’s advisable to remove the associated services to avoid incurring unnecessary costs. Make sure your MCP client is not stuck in an infinite loop. We made sure every conversation could only have 10 turns, but make sure you've killed all processes.
Generative AI Disclaimer. The cover image for this article was generated with Amazon Nova Canvas.
Relevant Security Resources
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.