Monitoring foundation models with Amazon CloudWatch
Use Amazon CloudWatch to understand your generative AI chatbot's performance and usage by monitoring Amazon Bedrock foundation model logs and metrics
Bennett Borofka
Amazon Employee
Published Aug 20, 2024
Last Modified Aug 22, 2024
As businesses deploy generative AI applications using Amazon Bedrock, it becomes crucial to monitor foundation model performance and user behavior to understand the health and adoption of the application. Amazon Bedrock provides built-in publishing of metrics and logs to Amazon CloudWatch. If you operate a chatbot that uses Amazon Bedrock's Converse API, Amazon CloudWatch provides an easy method for viewing data about your chatbot's usage in a consolidated dashboard of metrics and logs. In this post, I'll walk through how to get started using Amazon CloudWatch dashboards to gain live observability into all of the Amazon Bedrock foundation models used by a generative AI chatbot.
By default, there are nine runtime metrics published to Amazon CloudWatch that provide performance details about individual foundation models used by Amazon Bedrock. These metrics provide insights about how your generative AI chatbot is performing and being used:
| CloudWatch runtime metric | Monitoring insight |
|---|---|
| `Invocations` | Understand high and low chatbot usage over time; understand overall chatbot adoption. |
| `InvocationLatency`, `InvocationClientErrors`, `InvocationServerErrors`, `InvocationThrottles` | Identify issues with your chatbot that are affecting user experience. |
| `InputTokenCount`, `OutputTokenCount` | Validate the average or trending size of user input prompts over time; verify that response sizes are expected for the selected foundation model configuration. |
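These runtime metrics are published under the `AWS/Bedrock` namespace with a `ModelId` dimension, so you can pull them programmatically as well as view them in the console. Here's a minimal boto3 sketch that retrieves hourly invocation counts for the last day; the model ID is a placeholder, so swap in whichever foundation model your chatbot uses:

```python
import boto3
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client("cloudwatch")

# Sum of Bedrock invocations per hour for one model over the last 24 hours.
# The ModelId dimension value below is a placeholder.
response = cloudwatch.get_metric_statistics(
    Namespace="AWS/Bedrock",
    MetricName="Invocations",
    Dimensions=[{"Name": "ModelId", "Value": "anthropic.claude-3-sonnet-20240229-v1:0"}],
    StartTime=datetime.now(timezone.utc) - timedelta(days=1),
    EndTime=datetime.now(timezone.utc),
    Period=3600,  # one datapoint per hour
    Statistics=["Sum"],
)

for point in sorted(response["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], int(point["Sum"]))
```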
Amazon Bedrock also supports model invocation logging, which is disabled by default. While logs can be sent to either Amazon S3 or Amazon CloudWatch Logs, this post will focus on Amazon CloudWatch Logs.
Foundation model logs provide detailed information about each invocation by your chatbot's users. The logs keep a record of all input and output text:
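A model invocation log entry is a JSON document. The abridged sample below is illustrative only (exact fields and values will differ in your account); it shows where the user's input text and the model's output text are recorded:

```json
{
  "schemaType": "ModelInvocationLog",
  "timestamp": "2024-08-20T15:04:05Z",
  "operation": "Converse",
  "modelId": "anthropic.claude-3-sonnet-20240229-v1:0",
  "input": {
    "inputContentType": "application/json",
    "inputBodyJson": {
      "messages": [
        { "role": "user", "content": [ { "text": "What is AWS?" } ] }
      ]
    },
    "inputTokenCount": 4
  },
  "output": {
    "outputContentType": "application/json",
    "outputBodyJson": {
      "output": {
        "message": {
          "role": "assistant",
          "content": [ { "text": "AWS (Amazon Web Services) is a cloud computing platform..." } ]
        }
      }
    },
    "outputTokenCount": 42
  }
}
```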
I'll use these logs to parse the initial prompts users enter when using an Amazon Bedrock chatbot. In the above example, the input prompt is `"What is AWS?"`.

To create a single dashboard showing consolidated foundation model metrics and logs, I'll use Amazon CloudWatch. Automatic Dashboards provide a starting-point dashboard of metrics that we'll modify to include logs.
Before enabling model invocation logging, you'll need to create an Amazon CloudWatch log group. In this example (figure 1), I create a log group named `/aws/bedrock` and set the retention setting to 1 month (30 days). Leave the log class as Standard. Click Create and a new, empty log group will be created.
Note: The retention setting is a balance between how much log history you want to retain vs. how much you're willing to pay for storage. For more information about CloudWatch Logs costs, visit the pricing page.
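If you prefer to script this step, the same log group can be created with boto3. A minimal sketch, assuming the `/aws/bedrock` name and 30-day retention from above:

```python
import boto3

logs = boto3.client("logs")

# Create the empty log group that Bedrock will publish invocation logs to.
logs.create_log_group(logGroupName="/aws/bedrock")

# Match the console example: retain log events for 30 days.
logs.put_retention_policy(logGroupName="/aws/bedrock", retentionInDays=30)
```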
In the Amazon Bedrock console, you'll need to turn on model invocation logging. In figure 2, I select CloudWatch Logs only, and fill in the `/aws/bedrock` Log group name I created in step 1. I also select Create and use a new role in IAM, naming it `bedrock-cloudwatch-logs`.

Click Save settings and Amazon Bedrock will begin publishing logs to a new log stream under the Amazon CloudWatch log group.
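You can also enable invocation logging programmatically. Below is a minimal boto3 sketch that points Bedrock at the log group from step 1; the role ARN is a placeholder for the IAM role created above, which must allow Amazon Bedrock to write to the log group:

```python
import boto3

bedrock = boto3.client("bedrock")

# Placeholder ARN for the bedrock-cloudwatch-logs role created in the console.
role_arn = "arn:aws:iam::123456789012:role/bedrock-cloudwatch-logs"

bedrock.put_model_invocation_logging_configuration(
    loggingConfig={
        "cloudWatchConfig": {
            "logGroupName": "/aws/bedrock",
            "roleArn": role_arn,
        },
        # Record prompt and response text, which the dashboard query relies on.
        "textDataDeliveryEnabled": True,
        "imageDataDeliveryEnabled": False,
        "embeddingDataDeliveryEnabled": False,
    }
)
```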
To monitor input prompts from your chatbot's users invoking foundation models, I'll create and save a query in Amazon CloudWatch Logs Insights. This example focuses on the newer Converse API, which is the recommended API for foundation models that support messages. For a Converse API overview, review this post.
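For context, a Converse API call from a chatbot looks roughly like the following boto3 sketch; the model ID is a placeholder, and this is the kind of invocation that produces the log entries queried below:

```python
import boto3

# The Converse API lives in the bedrock-runtime service.
bedrock_runtime = boto3.client("bedrock-runtime")

# Placeholder model ID; use whichever foundation model your chatbot calls.
response = bedrock_runtime.converse(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[{"role": "user", "content": [{"text": "What is AWS?"}]}],
)

print(response["output"]["message"]["content"][0]["text"])
```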
The query below will gather log messages from the foundation model logs published to Amazon CloudWatch, with the following conditions:

- Select logs where the `Converse` or `ConverseStream` APIs are used.
- Ignore common error logs: `ThrottlingException` and `ValidationException`.
- Parse the first user input message in the log, and remove duplicates.
- Limit the result to the most recent ten logs.
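Here is a minimal sketch of such a query in Logs Insights syntax, assuming the log structure from the sample entry shown earlier; it matches exception names against the raw `@message`, which is one simple way to drop throttling and validation errors:

```
fields @timestamp, input.inputBodyJson.messages.0.content.0.text as firstUserMessage
| filter operation in ["Converse", "ConverseStream"]
| filter @message not like /ThrottlingException/ and @message not like /ValidationException/
| filter ispresent(input.inputBodyJson.messages.0.content.0.text)
| sort @timestamp desc
| dedup firstUserMessage
| limit 10
```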
This simple query will give insight into some of the initial prompts used by the chatbot's users, which I'll display on the Amazon CloudWatch dashboard.
Note: due to the structure of the log, a single log entry may contain multiple `messages` in an array. In this example, I use the `input.inputBodyJson.messages.0.content.0.text` field to display the first instance of text in a message array.
To create the query, navigate to the Amazon CloudWatch console and open Logs Insights. Select the `/aws/bedrock` log group created in step 1, then paste in the above query and click Run query. As shown in figure 3, you will see results of the query in the bottom table, so long as there are log messages matching the query within the specified time period. After you run your query successfully, click Save and give it a Query name of `ModelInput`.

In the Amazon CloudWatch console, navigate to the Dashboards page and click the Automatic dashboards tab. Bedrock will appear as an available dashboard since the service is publishing metrics to Amazon CloudWatch. Click Bedrock and we'll use the example dashboard as a starting point.
As you'll see in figure 4, the automatic dashboard has 6 metrics displayed on line widgets, showing multiple foundation model datapoints in each chart. Click Add to dashboard and create a new dashboard using this one as the starting point.
After you create your new Amazon CloudWatch dashboard, click the + in the top right to add a new widget. In this example, I'll create a new Logs table widget under the Logs tab as shown in figure 5.
To configure your Logs table widget, select the `ModelInput` query you created earlier in step 3. Click Create widget at the top, as shown in figure 6.

After you save your widget, you'll have a logs table displayed underneath the metrics in the live dashboard, showing a recent list of input prompts for your foundation models. The combined dashboard is shown in figure 7.
Some potential enhancements you can make in Amazon CloudWatch, not covered in this post, are:
- Add additional widgets to the top of the dashboard highlighting key metrics, such as a number widget showing current invocation count and invocation latency.
- Add a static threshold alarm, such as when high throttling or errors occur (see the sketch after this list).
- Add an anomaly detection alarm, such as when any metric goes above or below a range of normal values.
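As an example of the static threshold idea, here is a minimal boto3 sketch of an alarm that fires when a model is heavily throttled; the model ID, threshold, and alarm name are placeholder choices, not values from this post:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Hypothetical alarm: fire when a single model records more than 10 throttled
# invocations in a 5-minute period. Model ID and threshold are placeholders.
cloudwatch.put_metric_alarm(
    AlarmName="bedrock-chatbot-throttling",
    Namespace="AWS/Bedrock",
    MetricName="InvocationThrottles",
    Dimensions=[{"Name": "ModelId", "Value": "anthropic.claude-3-sonnet-20240229-v1:0"}],
    Statistic="Sum",
    Period=300,
    EvaluationPeriods=1,
    Threshold=10,
    ComparisonOperator="GreaterThanThreshold",
    TreatMissingData="notBreaching",  # no invocations means no throttling
)
```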
In this post, I walk through creating an Amazon CloudWatch dashboard to view live metrics and logs for an Amazon Bedrock chatbot that uses the Converse API. Using Automatic Dashboards and Logs Insights, you can easily set up a view of your foundation models' usage to understand live chatbot performance. The monitoring capabilities in Amazon CloudWatch help businesses that deploy and iterate on new generative AI chatbot applications to quickly understand user adoption and diagnose issues.
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.