Getting started with different LLMs on Amazon Bedrock


From Anthropic's Claude 3 family to Mistral AI's various models, there are plenty of models to start experimenting with on Amazon Bedrock. Here are the basics.

Brandon Carroll
Amazon Employee
Published Apr 23, 2024
You've probably found your way here because you're interested in getting started with Large Language Models (LLMs) on Amazon Bedrock. Or maybe you heard Bedrock supports an assortment of different models, like Anthropic's Claude 3, Mistral AI's models, and Meta's Llama 3 models, that can be used side-by-side (and even play each other in video games). That's awesome! Before we get started, let's define what an LLM is, and see briefly how they work. Then we'll talk about how to access a ton of LLMs using Amazon Bedrock, and I'll show you how to get started, how to switch between LLMs, and we will explore some of the features of Amazon Bedrock.

Introduction to Large Language Models (LLMs)

LLMs are a type of Foundation Model (FM) trained on vast amounts of text data, which allows them to understand language and perform complex tasks. These tasks include generating stories, summarizing text, writing code for you, and much, much more. Amazon Bedrock is a managed service that gives you access to several high-performing FMs from well-known AI companies. At a high level, you provide an LLM with some initial text or instructions. We call that a prompt. The model then uses this as a starting point to generate a response or continue the text, depending on the scenario. What's cool is that the LLM has the ability to recognize patterns in the data it was trained on, and it uses that knowledge to produce a relevant, coherent output. It's not always perfect. Sometimes an LLM will hallucinate and make things up that are not entirely accurate. This is why you should pay close attention to the responses and compare them against what you know to be true.
Now, one of the key aspects of working with LLMs is something called prompt engineering. This is the crafting of effective prompts that guide the LLM toward the type of output you're looking for. There are a few techniques, like zero-shot learning, where the LLM tackles a task without any examples, and few-shot learning, where you provide a small number of examples to guide the LLM. These methods can help you get more accurate responses from the LLM.
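To make that concrete, here's a quick sketch of what zero-shot and few-shot prompts might look like for a simple sentiment task. The helper names and example reviews are my own, purely for illustration:

```python
# A minimal sketch of zero-shot vs. few-shot prompting for sentiment
# classification. The example reviews below are made up for illustration.

def zero_shot_prompt(review: str) -> str:
    # Zero-shot: we describe the task with no examples at all.
    return (
        "Classify the sentiment of this review as Positive or Negative.\n\n"
        f"Review: {review}\nSentiment:"
    )

def few_shot_prompt(review: str) -> str:
    # Few-shot: a handful of labeled examples guide the model toward the
    # task and the output format we want.
    examples = (
        "Review: The setup took five minutes and it just worked.\nSentiment: Positive\n\n"
        "Review: It broke after two days and support never replied.\nSentiment: Negative\n\n"
    )
    return (
        "Classify the sentiment of each review as Positive or Negative.\n\n"
        + examples
        + f"Review: {review}\nSentiment:"
    )

print(zero_shot_prompt("Battery life is fantastic."))
print(few_shot_prompt("Battery life is fantastic."))
```

Either string could be sent as the `prompt` in the Bedrock examples later in this article; the few-shot version simply carries its examples along inside the prompt text.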
With Amazon Bedrock, you get access to several LLMs from leading AI companies like Anthropic, Cohere, and Amazon's own models, all through a single API. Switching between different LLMs is super easy since you're really just changing a line of code. But we'll get into using Bedrock a bit later. For now, just know that Bedrock gives you the tools to experiment with various LLMs, customize them with your own data, and build secure, responsible AI applications without having to worry about managing any infrastructure yourself.

What is Amazon Bedrock

But what exactly is Amazon Bedrock? Amazon Bedrock is a managed service that gives you access to a bunch of high-performing FMs from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and even Amazon itself. This access is gained through a single API. What this means is that as a developer, you can easily change a couple lines of code and quickly gain access to a totally different LLM. We're going to get into that a little later in this article. Bedrock also offers a set of capabilities that you can use to build generative AI applications with security, privacy, and responsible AI in mind. With Bedrock you can experiment with and evaluate different LLMs. This gives you the opportunity to find the perfect fit for your use case. Additionally, you can customize them with your own data using techniques like fine-tuning and Retrieval Augmented Generation (RAG). So how do you get started?

Getting Started with Amazon Bedrock

I'm going to assume that you already have an AWS account and jump right into getting access to Amazon Bedrock. First, let's look at the AWS Console.
In the AWS console navigate to Amazon Bedrock. From there click "Get Started" and you will be taken to an overview page. You can see that in Figure 1.
Figure 1
In Figure 1, you'll note several areas called out with the numbers one through six. Area one is the Getting started menu, where you can gain an overview of Amazon Bedrock, see some examples, and view details about the model providers: specifics on the LLMs each provider offers, what the API request looks like, and other resources for each provider. Area two is where you can see your base FMs and any custom models you have trained. Area three is the playground area. This is where you can select and test different LLMs inside the AWS console interface. Area four is your safeguards, which currently does watermark detection. This feature is currently in preview, but it will detect whether an image was generated by a Titan Image Generator model on Bedrock. Area five is Orchestration. This is where you can gain access to knowledge bases and Agents. Area six is Assessment and Deployment, which lets you do model evaluation and control provisioned throughput. The entire menu on the left is subject to change, but at least you are familiar with what's there today.
One of the most important menu items is not seen in Figure 1. It's the Model access menu item, and it's at the bottom of the menu on the left. I say this is the most important because access to each model must be requested before you can use it; access is not granted by default. An administrator will need to go into Model access and manage model access, requesting it for the models you wish to use. You can see this in Figure 2.
Figure 2
At the time of writing this article there are four pricing models:
  • On-Demand
  • Batch
  • Provisioned Throughput
  • Model Customization
Everything we do in this article will fall under the On-Demand model, so you will only pay for what you use. You can find more about the pricing models on the Bedrock pricing page.
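To see how On-Demand pricing adds up, here's a rough back-of-the-envelope sketch. The per-1,000-token prices below are placeholders I made up for illustration, not real Bedrock rates, so check the pricing page for current numbers:

```python
# Back-of-the-envelope On-Demand cost estimate. The per-1,000-token prices
# used in the example call are hypothetical, NOT real Bedrock pricing.

def estimate_cost(input_tokens, output_tokens, price_in_per_1k, price_out_per_1k):
    # On-Demand billing charges input and output tokens separately.
    return (input_tokens / 1000) * price_in_per_1k + (output_tokens / 1000) * price_out_per_1k

# e.g. 2,000 input tokens and 500 output tokens at hypothetical rates:
cost = estimate_cost(2000, 500, price_in_per_1k=0.003, price_out_per_1k=0.015)
print(f"${cost:.4f}")  # → $0.0135
```

The shape of the math is the same for any On-Demand model; only the per-token rates differ between providers and model sizes.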

Interacting with Playgrounds

Since you're here to get started with Bedrock and we are already in the AWS Console, I would feel remiss if we didn't talk about playground access. Without coding anything, you can test the LLMs you have model access to through chat, text, or images. For example, to chat with Claude 3 Sonnet you would click on Chat and then Select model, as seen in Figure 3.
Figure 3
Then you would select the model provider Anthropic from the Category column, Claude 3 Sonnet from the Model column, and then On-demand will be selected automatically from the Throughput column, followed by clicking Apply as seen in Figure 4.
Figure 4
After applying the model you can chat with it as seen in Figure 5.
Figure 5
This is certainly an easy way to get started with the LLM, but why not take it a step further? Let's look at how to interact with the LLM programmatically. For our next example I'll use the AWS SDK, specifically Python and Boto3, but you could use other languages if you prefer. You can find out more in our code examples.

Interacting through Code

Another way to interact with our LLM is through code. For this article I will give you a few examples in Python to illustrate how to get started. When interacting with Amazon Bedrock through code, you will use the Bedrock API. In our examples we will use Amazon Bedrock Runtime.
The following code interacts with Claude 3 Sonnet. I've added comments into the code to explain what each section of code does.
```python
# Import the standard json module and Boto3.
import json
import boto3

# Initialize the Bedrock Runtime client.
bedrock = boto3.client(service_name="bedrock-runtime", region_name="us-west-2")

# The `send_prompt_to_claude` function takes a single argument, `prompt`.
# The `prompt_config` dictionary holds the configuration of the prompt we send
# to Claude. Notice that we define the version, the max_tokens, and the message.
def send_prompt_to_claude(prompt):
    try:
        prompt_config = {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 1000,
            "messages": [
                {
                    "role": "user",
                    "content": prompt
                }
            ]
        }

        # Convert the `prompt_config` dictionary into a JSON string.
        body = json.dumps(prompt_config)

        # `modelId`, `contentType`, and `accept` set the values required by the
        # `bedrock.invoke_model` method. Have a look at the Boto3 documentation
        # (https://brandonjcarroll.com/links/9lmrs) and you will see the request
        # syntax (https://brandonjcarroll.com/links/0zfs2) that's required. In
        # this case the request and response should be in JSON.
        modelId = "anthropic.claude-3-sonnet-20240229-v1:0"
        contentType = "application/json"
        accept = "application/json"

        # Invoke the model, then extract and return the response text.
        response = bedrock.invoke_model(
            modelId=modelId,
            contentType=contentType,
            accept=accept,
            body=body
        )
        response_body = json.loads(response.get("body").read())
        claude_response = response_body.get("content")[0]["text"].strip()
        return claude_response

    # The exception handling block catches any exception and gives us details.
    except Exception as e:
        print(f"An error occurred while invoking Claude: {e}")
        return "An error occurred."

# An infinite loop (`while True`) prompts the user to enter a question with
# the `input` function, displaying "Enter your question (or 'q' to quit): ".
# If the user enters 'q' (case-insensitive), `break` terminates the loop.
if __name__ == "__main__":
    while True:
        user_input = input("Enter your question (or 'q' to quit): ")
        if user_input.lower() == 'q':
            break

        # The question entered by the user is passed to `send_prompt_to_claude`,
        # which sends the prompt to Claude and returns the response. We store it
        # in `response` and print it, and the loop then starts over.
        response = send_prompt_to_claude(user_input)
        print(f"\n\n-----------\nClaude's response:\n\n {response}\n -----------\n")
```
So when we run this code we end up with something like this:
```
[~/Documents/repos/beginning-bedrock]$ python claude.py
Enter your question (or 'q' to quit): Does an amazon VPC support a stateful firewall?

-----------
Claude's response:

Yes, an Amazon Virtual Private Cloud (VPC) supports stateful firewalls. Amazon VPC provides several options for deploying stateful firewalls:

1. AWS Network Firewall: AWS Network Firewall is a managed service that provides a stateful firewall with integrated intrusion prevention system (IPS) capabilities. It allows you to define firewall rules and policies to filter and inspect traffic across your VPCs.

2. Security Groups: Security groups in Amazon VPC act as stateful firewalls at the instance level. They control inbound and outbound traffic to and from your EC2 instances based on rules that you define.

3. Network ACLs (NACLs): Network ACLs are stateless firewalls that operate at the subnet level in a VPC. They provide an additional layer of security by allowing or denying traffic based on rules that you define.

4. Third-Party Virtual Firewalls: You can also deploy third-party virtual firewall appliances, such as solutions from vendors like Palo Alto Networks, Fortinet, and Check Point, within your VPC. These virtual appliances operate as stateful firewalls and can be integrated with your VPC's routing configuration.

While Security Groups and Network ACLs provide basic stateful and stateless firewall capabilities, respectively, you can deploy AWS Network Firewall or third-party virtual firewall appliances for more advanced stateful firewall features. These solutions offer capabilities like deep packet inspection, intrusion prevention, application-level filtering, and advanced threat protection.
-----------

Enter your question (or 'q' to quit):
```
Sweet! We are now interacting with Claude 3 Sonnet. Let's now consider how we can switch between different LLMs.

Using Different LLMs with Bedrock

I mentioned earlier that we are using the Bedrock API to work with our LLMs. Before moving forward we need to make sure that we have model access. For this article I will show you examples for Claude 3 Sonnet and Haiku, Mistral AI, and Amazon Titan. Navigate to the Model access page in the AWS console and ensure that these LLMs show "Access granted" before you try to test the code I am showing. You should see an output similar to Figure 6 if you have access to these LLMs.
Figure 6
With access enabled, let's take that code we used above for Claude 3 Sonnet and change it to use Claude 3 Haiku.
To do this all we will change in our code is the Model ID portion of the code as seen below:
```python
modelId = "anthropic.claude-3-haiku-20240307-v1:0"
```
We can then run the code again. Our response looks like the following:
```
[~/Documents/repos/beginning-bedrock]$ python claude.py
Enter your question (or 'q' to quit): Does an amazon VPC support a stateful firewall?

-----------
Claude's response:

Yes, Amazon Virtual Private Cloud (VPC) supports a stateful firewall, which is known as a Network ACL (NACL) and a Security Group.

1. Network ACL (NACL):
- NACL is a stateful firewall that operates at the subnet level in a VPC.
- It can be used to create inbound and outbound rules to control the traffic entering and leaving the subnet.
- NACL rules are evaluated in order, and the first rule that matches the traffic is applied.
- NACL keeps track of the state of the connection (i.e., whether it's an incoming or outgoing packet) and applies the appropriate rules accordingly.

2. Security Group:
- Security Groups are also a type of stateful firewall in Amazon VPC, operating at the instance level.
- Security Groups control inbound and outbound traffic to the instances they are associated with.
- Security Group rules are evaluated before the NACL rules, and they are stateful, meaning they remember the state of the connection.
- Security Groups allow you to specify specific ports, protocols, and IP ranges for both inbound and outbound traffic.

Both NACL and Security Groups provide stateful firewall functionality in Amazon VPC, allowing you to control and secure the network traffic in your virtual private environment.
-----------

Enter your question (or 'q' to quit):
```
That worked, and it was easy to switch to a new LLM. This is because Claude 3 Sonnet and Claude 3 Haiku use the same API request format. You can see them on the provider's page if you scroll to the bottom. Note in Figure 7 and Figure 8 that the format is the same; the only difference is the model ID.
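Since the request body is identical, one way to take advantage of this is a small helper that takes the model ID as a parameter. This is a sketch of my own, not part of the article's code; the helper name is made up:

```python
import json

# Both Claude 3 Sonnet and Haiku accept the same Anthropic Messages request
# body, so we can build it once and only swap the model ID.

SONNET = "anthropic.claude-3-sonnet-20240229-v1:0"
HAIKU = "anthropic.claude-3-haiku-20240307-v1:0"

def build_claude_request(model_id: str, prompt: str, max_tokens: int = 1000):
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    })
    # The body is identical for both models; only modelId changes.
    return {
        "modelId": model_id,
        "contentType": "application/json",
        "accept": "application/json",
        "body": body,
    }

# These kwargs could be passed straight to bedrock.invoke_model(**kwargs).
sonnet_kwargs = build_claude_request(SONNET, "Hello!")
haiku_kwargs = build_claude_request(HAIKU, "Hello!")
assert sonnet_kwargs["body"] == haiku_kwargs["body"]  # same request body
```

With a helper like this, switching between the two Claude 3 models is a one-argument change instead of an edit to the function itself.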
Figure 7
Figure 8
So now we can work with two different LLMs. Let's add another. Let's now work with Mistral AI. We need to have a look first at the API request format, and then we can modify our code a bit. We can see the format of the API request in Figure 9. Note that the body format is a bit different: where Claude 3 uses a messages array, Mistral AI does not.
Figure 9
Armed with this information we can change the code to reflect the correct request format. I also recommend going to the AWS documentation and looking at the example for request and response. In my testing I've found that the response may vary as well. With the correct request and response formats, here is our new code:
```python
import json
import boto3

# Initialize the Bedrock Runtime client.
bedrock = boto3.client(service_name="bedrock-runtime", region_name="us-west-2")

def send_prompt_to_mistral(prompt):
    try:
        model_id = 'mistral.mistral-7b-instruct-v0:2'
        # Mistral instruct models take a single prompt string wrapped in
        # [INST] tags rather than a messages array.
        prompt_config = f"""<s>[INST] {prompt} [/INST]"""

        body = json.dumps({
            "prompt": prompt_config,
            "max_tokens": 2000,
            "temperature": 0.7,
            "top_p": 0.7,
            "top_k": 50
        })
        response = bedrock.invoke_model(
            body=body,
            modelId=model_id
        )
        response_body = json.loads(response.get('body').read())
        mistral_response = response_body.get('outputs')[0]['text'].strip()
        return mistral_response
    except Exception as e:
        print(f"An error occurred while invoking Mistral: {e}")
        return "An error occurred."

if __name__ == "__main__":
    while True:
        user_input = input("Enter your question (or 'q' to quit): ")
        if user_input.lower() == 'q':
            break
        response = send_prompt_to_mistral(user_input)
        print(f"\n\n-----------\nMistral's response:\n\n{response}\n\n -----------\n")
```
The output follows the same interactive format as the earlier examples.
Now we have working code to test Claude 3 Sonnet, Claude 3 Haiku, and Mistral 7B Instruct. In fact, you can change the model ID between the different Mistral AI models and they should work as well. Now let's work with Amazon's Titan model. Again, we need to check the API request, as seen in Figure 10.
Figure 10
We should also look at the Titan response format in the AWS Documentation.
Here is our code to work with the Amazon Titan models:
```python
import json
import boto3

# Initialize the Bedrock Runtime client.
bedrock = boto3.client(service_name="bedrock-runtime", region_name="us-west-2")

def send_prompt_to_titan(prompt):
    try:
        model_id = 'amazon.titan-text-lite-v1'
        contentType = 'application/json'
        accept = 'application/json'

        prompt_config = f"{prompt}"

        body = json.dumps({
            "inputText": prompt_config,
            # Optional inference parameters go in a textGenerationConfig
            # block, e.g.:
            # "textGenerationConfig": {
            #     "maxTokenCount": 2000,
            #     "temperature": 0.7,
            #     "topP": 0.7
            # }
        })

        print(body)
        response = bedrock.invoke_model(
            body=body,
            modelId=model_id,
            contentType=contentType,
            accept=accept
        )

        # Print the keys of the response dictionary.
        print("Keys in the response:", response.keys())

        response_body = json.loads(response.get('body').read())

        titan_response = response_body.get('results')[0]['outputText'].strip()
        return titan_response
    except Exception as e:
        print(f"An error occurred while invoking titan: {e}")
        return "An error occurred."

if __name__ == "__main__":
    while True:
        user_input = input("Enter your question (or 'q' to quit): ")
        if user_input.lower() == 'q':
            break
        response = send_prompt_to_titan(user_input)
        print(f"\n\n-----------\nTitan's response:\n\n{response}\n\n -----------\n")
```
Running our Titan code produces the following output:
```
[~/Documents/repos/beginning-bedrock]$ python titan.py
Enter your question (or 'q' to quit): Does an amazon VPC support a stateful firewall?
{"inputText": "Does an amazon VPC support a stateful firewall?"}
Keys in the response: dict_keys(['ResponseMetadata', 'contentType', 'body'])

-----------
Titan's response:

Yes, Amazon VPC supports stateful firewall rules. Amazon VPC provides a logically isolated AWS environment in which you can launch AWS resources and create private networks. To enable stateful firewall rules in an Amazon VPC, you need to:
1. Create a VPC client in the required region.
2. Define the request parameter params with the VPC ID.
3. Send the request parameter params to the create firewall rule API with the help of the VPC client to enable stateful firewall rules.
4. Retrieve the response of the create firewall rule API.
Here is an example.
```
Again, you can switch the Titan models around to test different ones.
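To recap the three request formats we've seen, here's a small dispatcher that builds the provider-specific body for a given prompt. This is a sketch of my own, not code from the article; double-check the formats and any optional parameters against each provider's page in the console before relying on it:

```python
import json

# Each provider family on Bedrock expects a different request body shape.
# This dispatcher makes the differences from the three examples explicit.

def build_body(provider: str, prompt: str) -> str:
    if provider == "anthropic":
        # Claude 3 models use the Anthropic Messages format.
        return json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 1000,
            "messages": [{"role": "user", "content": prompt}],
        })
    if provider == "mistral":
        # Mistral instruct models take a single [INST]-tagged prompt string.
        return json.dumps({
            "prompt": f"<s>[INST] {prompt} [/INST]",
            "max_tokens": 2000,
        })
    if provider == "amazon":
        # Titan text models take inputText.
        return json.dumps({"inputText": prompt})
    raise ValueError(f"Unknown provider: {provider}")

for p in ("anthropic", "mistral", "amazon"):
    print(p, "->", build_body(p, "Hi"))
```

The body string returned here is what you would pass as `body` to `bedrock.invoke_model`, along with the matching model ID; the parsing of the response differs per provider as well, as the three examples above showed.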

Conclusion

We've covered a lot of ground in this article. We introduced you to LLMs and Amazon Bedrock, and we showed you how to get started with different LLMs in the AWS console as well as programmatically using Python. Amazon Bedrock provides a unified method of accessing several popular LLMs and makes it easy to integrate them into your code projects. As you experiment with the various LLMs, you'll be in a better position to select the best LLM for your specific use case. For guidance in this area I recommend reading the article "Choose the best foundational model for your AI applications." There is much more you can do with Amazon Bedrock, and I encourage you to spend some time in the Generative AI space on Community.AWS. Keep learning, building, and testing, and as always, "Happy Labbing!"
 

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.
