Use Guardrails for safeguarding generative AI applications built using custom or third-party models
Learn how the ApplyGuardrail API can provide a flexible way to integrate Guardrails with your generative AI applications
- Using different models outside of Bedrock (e.g., Amazon SageMaker).
- Enforcing Guardrails at different stages of a generative AI application.
- Testing Guardrails without invoking the model.
The ApplyGuardrail API makes it possible to evaluate user inputs and model responses independently at different stages of your generative AI applications. For example, in a RAG application, you can use Guardrails to filter potentially harmful user inputs before performing a search on your knowledge base. Then, you can also evaluate the final model response (after completing the search and the generation step).

To demonstrate the ApplyGuardrail API, let's consider a generative AI application that acts as a virtual assistant to manage doctor appointments. Users invoke it using natural language, for example, "I want an appointment for Dr. Smith". Note that this is an over-simplified version for demonstration purposes.

The Guardrail in this example is configured with a regex-based sensitive information filter to detect health insurance IDs:

\b(?:Health\s*Insurance\s*ID|HIID|Insurance\s*ID)\s*[:=]?\s*([a-zA-Z0-9]+)\b

Health Insurance ID is just an example; this could be any sensitive data that needs to be blocked, masked, or filtered.

Here is a simple example of invoking the ApplyGuardrail API. I have used the AWS SDK for Python (boto3), but it will work with any of the SDKs.
import boto3

bedrockRuntimeClient = boto3.client('bedrock-runtime', region_name="us-east-1")

guardrail_id = 'ENTER_GUARDRAIL_ID'
guardrail_version = 'ENTER_GUARDRAIL_VERSION'

input_text = "I have mild fever. Can Tylenol help?"

def main():
    response = bedrockRuntimeClient.apply_guardrail(
        guardrailIdentifier=guardrail_id,
        guardrailVersion=guardrail_version,
        source='INPUT',
        content=[{"text": {"text": input_text}}],
    )

    guardrailResult = response["action"]
    print(f'Guardrail action: {guardrailResult}')

    output = response["outputs"][0]["text"]
    print(f'Final response: {output}')

if __name__ == "__main__":
    main()
pip install boto3
python apply_guardrail_1.py
Guardrail action: GUARDRAIL_INTERVENED
Final response: I apologize, but I am not able to provide medical advice. Please get in touch with your healthcare professional.
Note that source is set to INPUT, which means that the content to be evaluated is from a user (typically the LLM prompt). To evaluate the model output, source should be set to OUTPUT. You will see it in action in the next section.

The next example integrates Guardrails with a model deployed on an Amazon SageMaker endpoint. Start by configuring the Guardrail and endpoint details:
#...
guardrail_id = 'ENTER_GUARDRAIL_ID'
guardrail_version = 'ENTER_GUARDRAIL_VERSION'
endpoint_name = "ENTER_SAGEMAKER_ENDPOINT"
#...
#...
def main():
    prompt = "Can you help me with medicine suggestions for mild fever?"
    #prompt = "I need an appointment with Dr. Smith for 4 PM tomorrow."

    safe, output = safeguard_check(prompt, 'INPUT')
    if not safe:
        print("Final response:", output)
        return
#...
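The snippets in this walkthrough call a safeguard_check helper whose full body isn't shown. Below is a minimal sketch of what it might look like, reconstructed from the log output later in the post; the exact implementation in apply_guardrail_2.py may differ, and the optional client parameter is an addition here so the function can be exercised without a live AWS connection.

```python
guardrail_id = 'ENTER_GUARDRAIL_ID'
guardrail_version = 'ENTER_GUARDRAIL_VERSION'

def safeguard_check(text, source, client=None):
    """Evaluate text with the Guardrail; return (is_safe, final_text).

    source is 'INPUT' for user prompts and 'OUTPUT' for model responses.
    Sketch only: reconstructed from the post's log output, not the
    original helper. The client parameter is an addition for testability.
    """
    if client is None:
        import boto3  # AWS SDK for Python; only needed for real calls
        client = boto3.client('bedrock-runtime', region_name="us-east-1")

    print(f"Checking {source} - {text}")
    response = client.apply_guardrail(
        guardrailIdentifier=guardrail_id,
        guardrailVersion=guardrail_version,
        source=source,
        content=[{"text": {"text": text}}],
    )

    if response["action"] == "GUARDRAIL_INTERVENED":
        print("Guardrail intervention due to:", response["assessments"])
        # outputs[0] carries the blocked message or the masked text
        return False, response["outputs"][0]["text"]

    print("Result: No Guardrail intervention")
    return True, text
```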
pip install boto3
python apply_guardrail_2.py
Checking INPUT - Can you help me with medicine suggestions for mild fever?
Guardrail intervention due to: [{'topicPolicy': {'topics': [{'name': 'Medical advice', 'type': 'DENY', 'action': 'BLOCKED'}]}}]
Final response: I apologize, but I am not able to provide medical advice. Please get in touch with your healthcare professional.
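The "Guardrail intervention due to" line above prints the response's assessments list. A small helper (hypothetical, not part of the post's code) shows how that structure can be unpacked to see which policies fired:

```python
def summarize_assessments(assessments):
    """Return (policy, detail) pairs from an ApplyGuardrail assessments list."""
    fired = []
    for assessment in assessments:
        for policy, details in assessment.items():
            if policy == "topicPolicy":
                for topic in details.get("topics", []):
                    fired.append((policy, f"{topic['name']} -> {topic['action']}"))
            elif policy == "sensitiveInformationPolicy":
                for regex in details.get("regexes", []):
                    fired.append((policy, f"{regex['name']} -> {regex['action']}"))
            else:
                fired.append((policy, "intervened"))
    return fired

# Using the assessment structure from the log output above:
sample = [{"topicPolicy": {"topics": [{"name": "Medical advice", "type": "DENY", "action": "BLOCKED"}]}}]
print(summarize_assessments(sample))  # → [('topicPolicy', 'Medical advice -> BLOCKED')]
```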
#...
messages = [
    {"role": "system", "content": "When requested for a doctor appointment, reply with a confirmation of the appointment along with a random appointment ID. Don't ask additional questions."}
]

def main():
    #prompt = "Can you help me with medicine suggestions for mild fever?"
    prompt = "I need an appointment with Dr. Smith for 4 PM tomorrow."

    safe, output = safeguard_check(prompt, 'INPUT')
    if not safe:
        print("Final response:", output)
        return
#...
pip install boto3
python apply_guardrail_2.py
Checking INPUT - I need an appointment with Dr. Smith for 4 PM tomorrow.
Result: No Guardrail intervention
Invoking Sagemaker endpoint
Checking OUTPUT - Of course! Your appointment with Dr. Smith is confirmed for 4 PM tomorrow. Appointment ID: 987654321. See you then!
Result: No Guardrail intervention
Final response:
Of course! Your appointment with Dr. Smith is confirmed for 4 PM tomorrow. Appointment ID: 987654321. See you then!
- Guardrails did not block the input.
- SageMaker endpoint was invoked and returned a response.
- Guardrails did not block the output either, and it was returned to the caller.
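The SageMaker invocation itself isn't shown in the post. Here is a sketch of what that step might look like, assuming a chat-style serving container that accepts and returns an OpenAI-compatible messages schema; the function name, payload shape, and response shape are all assumptions, as they depend entirely on how the model is deployed.

```python
import json

def invoke_sagemaker_chat(client, endpoint_name, messages, prompt):
    """Send a chat payload to a SageMaker endpoint and return the reply text.

    Sketch only: assumes the container accepts an OpenAI-style "messages"
    list and returns a "choices" list, which is deployment-specific.
    """
    payload = {"messages": messages + [{"role": "user", "content": prompt}]}
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(payload),
    )
    body = json.loads(response["Body"].read())
    return body["choices"][0]["message"]["content"]
```

In the application, client would be boto3.client('sagemaker-runtime'), and the returned text would then be passed through safeguard_check with source set to OUTPUT before being shown to the user.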
#...
messages = [
    {"role": "system", "content": "When requested for a doctor appointment, reply with a confirmation of the appointment along with a random appointment ID and a random patient health insurance ID. Don't ask additional questions."}
]
# messages = [
#     {"role": "system", "content": "When requested for a doctor appointment, reply with a confirmation of the appointment along with a random appointment ID. Don't ask additional questions."}
# ]

def main():
    #prompt = "Can you help me with medicine suggestions for mild fever?"
    prompt = "I need an appointment with Dr. Smith for 4 PM tomorrow."

    safe, output = safeguard_check(prompt, 'INPUT')
#...
pip install boto3
python apply_guardrail_2.py
Checking INPUT - I need an appointment with Dr. Smith for 4 PM tomorrow.
Result: No Guardrail intervention
Invoking Sagemaker endpoint
Checking OUTPUT - Of course! Here is your confirmation of the appointment:
Appointment ID: 7892345
Patient Health Insurance ID: 98765432
We look forward to seeing you at Dr. Smith's office tomorrow at 4 PM. Please don't hesitate to reach out if you have any questions or concerns.
Guardrail intervention due to: [{'sensitiveInformationPolicy': {'regexes': [{'name': 'Health Insurance ID', 'match': 'Health Insurance ID: 98765432', 'regex': '\\b(?:Health\\s*Insurance\\s*ID|HIID|Insurance\\s*ID)\\s*[:=]?\\s*([a-zA-Z0-9]+)\\b', 'action': 'ANONYMIZED'}]}}]
Final response:
Of course! Here is your confirmation of the appointment:
Appointment ID: 7892345
Patient {Health Insurance ID}
We look forward to seeing you at Dr. Smith's office tomorrow at 4 PM. Please don't hesitate to reach out if you have any questions or concerns
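The masking is driven by the regex shown at the start of the post. A quick local check with Python's re module (not part of the application, just a way to verify what the pattern matches) reproduces the match reported in the logs:

```python
import re

# The regex configured on the Guardrail's sensitive information filter
pattern = re.compile(
    r"\b(?:Health\s*Insurance\s*ID|HIID|Insurance\s*ID)\s*[:=]?\s*([a-zA-Z0-9]+)\b"
)

text = "Patient Health Insurance ID: 98765432"
match = pattern.search(text)
print(match.group(0))  # full match: Health Insurance ID: 98765432
print(match.group(1))  # captured ID: 98765432
```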
- Guardrails did not block the input - it was valid.
- SageMaker endpoint was invoked and returned the response.
- Guardrails masked (rather than completely blocked) the part of the output that contained the health insurance ID. You can see the details in the logs, in the part that says 'action': 'ANONYMIZED', and Patient {Health Insurance ID} in the final response. Having the option to partially mask the output is quite flexible in situations where the rest of the response is valid and you don't want to block it entirely.

ApplyGuardrail is a really flexible API that lets you evaluate input prompts and model responses for foundation models on Amazon Bedrock, as well as custom and third-party models, irrespective of where they are hosted. This allows you to use Guardrails for centralized governance across all your generative AI applications.

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.