Amazon Bedrock Guardrails API: Part 1
Amazon Bedrock's ApplyGuardrail API unifies protection measures, ensuring consistent safety across AI applications regardless of infrastructure or foundation model.
Nitin Eusebius
Amazon Employee
Published Jul 17, 2024
Last Modified Oct 1, 2024
Amazon Bedrock has introduced a new `ApplyGuardrail` API within its Guardrails feature. This API allows for comprehensive evaluation of user inputs and model outputs against predefined safety measures. The key advantage is its versatility: it provides a unified approach to implementing safeguards across a diverse range of generative AI applications. Whether you're using Amazon Bedrock's native foundation models (FMs), other AWS services like SageMaker, cloud infrastructure like Amazon EC2, on-premises systems, or even third-party FMs, you can now apply consistent protective measures. This standardization ensures that the same robust safeguards are in place for both input prompts and model responses, regardless of the underlying technology or deployment location.

Let me show you how to use the `ApplyGuardrail` API in a sample customer service application use case. In the following example, I have used the AWS SDK for Python (Boto3). In this example, I will set the source to `INPUT`, which means that the content to be evaluated comes from a user (typically the LLM prompt). In a future post, I will show how to evaluate the model output, with the source set to `OUTPUT`.
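For orientation, here is a minimal sketch of an `ApplyGuardrail` call with the source set to `INPUT`. The guardrail ID, version, and region are placeholders; the full customer service example follows below.

```python
import boto3

# The ApplyGuardrail API is exposed by the Bedrock Runtime client
bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock_runtime.apply_guardrail(
    guardrailIdentifier="your-guardrail-id",  # placeholder: your guardrail ID
    guardrailVersion="1",                     # placeholder: your guardrail version
    source="INPUT",                           # evaluate user input (the LLM prompt)
    content=[{"text": {"text": "What's your return policy for electronics?"}}],
)

# "NONE" means the content passed; "GUARDRAIL_INTERVENED" means it was blocked
print(response["action"])
```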
The sample customer service guardrail is configured to address two main concerns:
- Sensitive Customer Information: Prevents handling or requesting personal identifying information (PII), financial details, or health information.
- Inappropriate Language: Blocks rude, offensive, or unprofessional language in customer interactions.
Here is how the guardrail handles typical customer messages:
- Input Screening:
  - Example: Customer says, "My social security number is 123-45-6789, can you update my account?"
  - Action: Guardrail blocks the request, protecting sensitive information.
- Language Monitoring:
  - Example: Customer says, "This is ridiculous! Your service is terrible!"
  - Action: Guardrail intervenes, maintaining professional interaction.
- Safe Interactions:
  - Example: Customer asks, "What's your return policy for electronics?"
  - Action: Guardrail allows the message, proceeding with normal processing.

Based on the guardrail's decision, the application responds as follows:
- Blocked Requests: System responds: "I cannot process requests containing sensitive personal information or inappropriate language."
- Passed Requests: System proceeds with: "How may I assist you with your inquiry?"
- Unexpected Situations: System provides a fallback response, guiding the customer to additional support.
Note: This is demo code for illustrative purposes only. Not intended for production use.
The function `create_cs_guardrail()` creates a customer service guardrail that prevents handling of sensitive information and inappropriate language, defines specific topics to deny, and sets up blocked messaging for both inputs and outputs. It then creates and returns a version of this guardrail.
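One way `create_cs_guardrail()` could be implemented with Boto3 is sketched below. The guardrail name, topic definition, filter strengths, PII entity types, and blocked messages are illustrative assumptions; adjust them to your own requirements.

```python
import boto3

# Guardrails are created with the Bedrock control-plane client
bedrock = boto3.client("bedrock", region_name="us-east-1")

def create_cs_guardrail():
    """Create the customer service guardrail and publish a version of it."""
    guardrail = bedrock.create_guardrail(
        name="customer-service-guardrail",
        description="Blocks sensitive customer information and inappropriate language",
        # Deny topics that involve sensitive customer data
        topicPolicyConfig={
            "topicsConfig": [
                {
                    "name": "Sensitive Customer Information",
                    "definition": "Requests to share, update, or discuss PII, financial details, or health information.",
                    "examples": [
                        "My social security number is 123-45-6789, can you update my account?"
                    ],
                    "type": "DENY",
                }
            ]
        },
        # Filter rude, offensive, or unprofessional language
        contentPolicyConfig={
            "filtersConfig": [
                {"type": "INSULTS", "inputStrength": "HIGH", "outputStrength": "HIGH"},
            ]
        },
        # Block common PII entities such as SSNs outright
        sensitiveInformationPolicyConfig={
            "piiEntitiesConfig": [
                {"type": "US_SOCIAL_SECURITY_NUMBER", "action": "BLOCK"},
            ]
        },
        blockedInputMessaging=(
            "I cannot process requests containing sensitive personal information "
            "or inappropriate language."
        ),
        blockedOutputsMessaging="I cannot provide that response.",
    )

    # Publish an immutable version of the guardrail to evaluate against
    version = bedrock.create_guardrail_version(
        guardrailIdentifier=guardrail["guardrailId"],
        description="Initial version",
    )
    return guardrail["guardrailId"], version["version"]
```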
The next function, `apply_cs_guardrail(guardrail_id, guardrail_version, customer_message)`, applies the specified guardrail to a customer message, then interprets the response to determine whether the message was blocked, passed, or resulted in an unexpected action. It returns a status and message based on the guardrail's intervention. Based on the blocked and passed outcomes, you can build your own custom handling of the user experience.
Note: For passed messages, the code demonstrates a custom implementation where a standard assistance offer is given: "How may I assist you with your inquiry?". For blocked messages, the implementation provides the specific blocked message returned by the guardrail. Additionally, the code includes custom action recommendations, such as "Escalate to human agent or provide safe response" for blocked messages, and "Proceed with normal processing" for passed messages. This showcases how developers can create tailored responses and actions based on the guardrail's output.
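A sketch of `apply_cs_guardrail()` along those lines might look like this. The status values, messages, and recommended actions follow the behavior described above but are otherwise illustrative choices.

```python
import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

def apply_cs_guardrail(guardrail_id, guardrail_version, customer_message):
    """Evaluate a customer message against the guardrail and interpret the result."""
    response = bedrock_runtime.apply_guardrail(
        guardrailIdentifier=guardrail_id,
        guardrailVersion=guardrail_version,
        source="INPUT",  # the content being evaluated comes from the user
        content=[{"text": {"text": customer_message}}],
    )

    if response["action"] == "GUARDRAIL_INTERVENED":
        # Prefer the blocked message configured on the guardrail, if present
        outputs = response.get("outputs") or []
        blocked_message = (
            outputs[0]["text"]
            if outputs
            else "I cannot process requests containing sensitive personal "
                 "information or inappropriate language."
        )
        return {
            "status": "blocked",
            "message": blocked_message,
            "recommended_action": "Escalate to human agent or provide safe response",
        }

    if response["action"] == "NONE":
        return {
            "status": "passed",
            "message": "How may I assist you with your inquiry?",
            "recommended_action": "Proceed with normal processing",
        }

    # Fallback for any unexpected action value
    return {
        "status": "unknown",
        "message": "Please contact our support team for further assistance.",
        "recommended_action": "Review the guardrail response manually",
    }
```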
The code below creates the customer service guardrail, applies it to a set of test messages, and demonstrates how to handle the different guardrail outcomes (blocked, passed, unknown) with appropriate actions and user responses for each scenario.
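A possible driver for the demo, assuming the two functions sketched above, might look like the following. The test messages mirror the scenarios described earlier.

```python
def main():
    guardrail_id, guardrail_version = create_cs_guardrail()

    test_messages = [
        "What is my account balance?",
        "Can you track my recent order?",
        "What's your return policy for electronics?",
        "This is ridiculous! Your service is terrible!",
        "My social security number is 123-45-6789, can you update my account?",
    ]

    for message in test_messages:
        result = apply_cs_guardrail(guardrail_id, guardrail_version, message)
        print(f"Customer: {message}")
        print(f"Status:   {result['status']}")
        print(f"Response: {result['message']}")
        print(f"Action:   {result['recommended_action']}\n")

if __name__ == "__main__":
    main()
```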
As we can see, the guardrail passed three non-sensitive inquiries about account balance, order tracking, and return policy, allowing normal processing. It blocked two messages: one containing inappropriate language and another with sensitive personal information (an SSN). For passed messages, a standard assistance offer was given. For blocked messages, a response indicating the inability to process the request due to sensitive information or inappropriate language was provided, along with a recommendation to escalate to a human agent.
Below is a sample API response when the guardrail intervened:
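The following is an abridged, illustrative sketch of such a response as returned by Boto3; the exact values shown here are placeholders, and additional usage and assessment details are trimmed for brevity.

```python
{
    "action": "GUARDRAIL_INTERVENED",
    "outputs": [
        {
            "text": "I cannot process requests containing sensitive personal "
                    "information or inappropriate language."
        }
    ],
    "assessments": [
        {
            "sensitiveInformationPolicy": {
                "piiEntities": [
                    {
                        "type": "US_SOCIAL_SECURITY_NUMBER",
                        "match": "123-45-6789",
                        "action": "BLOCKED",
                    }
                ]
            }
        }
    ],
    "usage": {
        "topicPolicyUnits": 1,
        "contentPolicyUnits": 1,
        "sensitiveInformationPolicyUnits": 1,
    },
}
```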
And below is a sample API response when the guardrail did not intervene:
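Again as an abridged, illustrative sketch (values are placeholders):

```python
{
    "action": "NONE",
    "outputs": [],        # no replacement text when nothing is blocked
    "assessments": [{}],  # no policies were triggered
    "usage": {
        "topicPolicyUnits": 1,
        "contentPolicyUnits": 1,
        "sensitiveInformationPolicyUnits": 1,
    },
}
```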
In summary, this customer service guardrail:
- Protects customer privacy by filtering sensitive information
- Maintains professional tone in all interactions
- Allows safe processing of legitimate inquiries
- Provides clear guidance when guardrails are triggered
- Escalates complex or sensitive issues to human agents when necessary
By implementing these guardrails, customer service teams can confidently use AI to handle a wide range of inquiries while maintaining high standards of security and professionalism.
For further details, refer to this excellent blog.
Happy Building!
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.