
Generative AI security readiness checklist: What to consider before you productionize your generative AI workload

7 items to consider to securely bring your GenAI applications to production

Ying Ting Ng
Amazon Employee
Published Feb 19, 2025
Co-authored by Solutions Architects at Amazon Web Services (AWS)
  • Riza Saputra, Senior Startups Solutions Architect
  • Glendon Thaiw, Senior Startups Solutions Architect
  • Ying Ting Ng, Associate Security Solutions Architect

Overview

Generative AI has seen explosive growth in recent years, with applications that can transform how organizations create content, analyze data, and make critical decisions. As companies increasingly build custom applications on generative AI models, securing those models and ensuring their responsible use has become a top priority. The security controls you need vary with the type of model behind your application: pre-trained, fine-tuned, or custom. Our focus here is on applications built with pre-trained models, which currently address the majority of customer use cases.
In this post, we’ve developed the following 7-item checklist that outlines the essential security and compliance measures you should consider when moving your generative AI-powered applications from experimentation to production.
  1. Establish governance framework and compliance process
  2. Review and comply with the LLM provider’s EULA and data usage policies
  3. Implement comprehensive access controls
  4. Mitigate input and output risks
  5. Protect your data
  6. Secure your perimeter
  7. Implement comprehensive monitoring and incident response
This checklist covers governance, legal compliance, access controls, risk management, input/output validation, infrastructure protection, and monitoring. Implementing it helps you mitigate risks, protect data, and maintain user trust. Checking off more items improves your defense in depth, but not every checkpoint is mandatory; what you need depends on your application.

1. Establish governance framework and compliance process

As with most compliance frameworks, people and process are key. Before deploying your generative AI application, establish a comprehensive governance and compliance framework that will serve as the foundation for responsible AI deployment. Start by forming a cross-functional AI governance committee with subject matter experts from legal, IT security, and relevant business units. This committee should create and enforce specific policies for your generative AI application, covering data handling, model selection, and usage guidelines.
Develop a compliance checklist tailored to your industry and applicable regulations (such as GDPR or PCI DSS). This checklist should cover data privacy measures, consent management, and transparency requirements. Implement a regular compliance review schedule, such as quarterly audits, to maintain ongoing adherence to these standards. Refer to these blogs for guidance: Scaling a governance, risk, and compliance program for the cloud, emerging technologies, and innovation and Securing generative AI: data, compliance, and privacy considerations.
Finally, set up a documentation system to track decisions, changes, and compliance status of your generative AI application. This could be a dedicated section in your project management tool or a specialized compliance management software. Include features like version control for policies, audit logs for model changes, and a dashboard for compliance status. This system will not only help in maintaining compliance but also provide necessary evidence during external audits.

2. Review and comply with the LLM provider’s EULA and data usage policies

Before integrating a pre-trained model into your application, thoroughly review the End User License Agreement (EULA) and data usage policies of your chosen large language model (LLM) provider. Pay close attention to clauses regarding data handling, model outputs, and any restrictions on commercial use. For Amazon Bedrock users, refer to Access Amazon Bedrock foundation models. If you’re self-deploying on Amazon SageMaker, review the model sources on the model details page. It’s crucial to understand specific limitations and requirements to maintain compliance and avoid potential legal issues.
Keep an eye out for new LLM releases and license updates, because these can bring new opportunities for your application. For instance, the Meta Llama 3.1 license is more permissive than its predecessors, potentially opening up new use cases. However, it still carries restrictions, such as requiring a separate license if your monthly active users exceed 700 million. Whichever model you choose, regularly reviewing these terms helps keep your application compliant and positioned to take advantage of new opportunities.

3. Implement comprehensive access controls

When developing and deploying your generative AI application, you will need robust access controls to protect your system and data. This includes setting up user authentication, authorization, and data access policies, all while adhering to the principle of least privilege. Modern generative AI applications typically interact with external functions and data sources, such as for Retrieval Augmented Generation (RAG) patterns. It's essential to make sure end users can only invoke functions or access data sources that they are explicitly allowed to use. Implement proper user authentication using services like Amazon Cognito or Amazon Verified Permissions, and establish fine-grained authorization controls to limit specific actions and access to data based on user roles, attributes, and groups. Below is an example of how you can use Amazon Cognito's JWT tokens to perform identity propagation and fine-grained authorization throughout the various steps of the application.
[Architecture diagram: a sample generative AI application with identity propagation and fine-grained authorization]
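As a concrete starting point, here is a minimal sketch of the first step in that flow: verifying a Cognito-issued JWT on the server side before the identity is propagated to downstream calls. It uses the open-source PyJWT library; the region, user pool ID, and claim names are placeholders for your own configuration.

```python
import jwt  # PyJWT
from jwt import PyJWKClient

REGION = "us-east-1"                 # placeholder
USER_POOL_ID = "us-east-1_EXAMPLE"   # placeholder
ISSUER = f"https://cognito-idp.{REGION}.amazonaws.com/{USER_POOL_ID}"

# Cognito publishes its token-signing keys as a JWKS document
jwks_client = PyJWKClient(f"{ISSUER}/.well-known/jwks.json")

def verify_token(token: str) -> dict:
    """Verify signature, expiry, and issuer; return the token's claims."""
    signing_key = jwks_client.get_signing_key_from_jwt(token)
    return jwt.decode(token, signing_key.key, algorithms=["RS256"], issuer=ISSUER)

def user_groups(claims: dict) -> list[str]:
    # Cognito records group membership in the 'cognito:groups' claim,
    # which downstream authorization checks can use
    return claims.get("cognito:groups", [])
```

Every request handler would call `verify_token` first and make authorization decisions from the returned claims, rather than trusting anything the client sends alongside the token.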
Consider the various components of your generative AI application when setting up access controls. This includes the LLM itself, databases, storage systems, and any additional services or APIs your application interacts with. To enhance security, consider using LLMs that can be accessed through short-lived, temporary credentials. Examples include models available through Amazon Bedrock or those deployed on SageMaker, as opposed to LLMs that require long-lived API keys for access. This approach helps reduce the risk of credential compromise and simplifies secure access management. Furthermore, some services, like Amazon Bedrock, allow you to implement granular control over model usage, such as denying access for inference on specific models using identity-based policies.
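To make that model-level control concrete, here is a hedged sketch of an identity-based policy, expressed as a Python dict and attached with boto3, that denies inference on one specific foundation model. The role name and model ARN are illustrative.

```python
import json
import boto3

# Deny inference on a single foundation model; the ARN is illustrative
deny_model_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream",
            ],
            "Resource": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0",
        }
    ],
}

iam = boto3.client("iam")
iam.put_role_policy(
    RoleName="my-genai-app-role",        # placeholder application role
    PolicyName="DenySpecificModel",
    PolicyDocument=json.dumps(deny_model_policy),
)
```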
Make sure user sessions and conversation contexts are properly isolated. Implement mechanisms to prevent users from accessing other users’ content, session histories, or potentially sensitive conversational information. For example, use unique session identifiers for each user interaction and make sure these are validated on every request. Additionally, implement server-side session management, storing conversation histories and context in isolated, user-specific data stores.
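A minimal sketch of that isolation pattern follows. It keys every conversation history on both the authenticated user ID and the session ID, so a valid session ID alone is never enough to read someone else's history. The in-memory store stands in for a user-scoped database such as a DynamoDB table.

```python
import uuid

# In-memory store for illustration only; production code would use a
# durable, user-scoped data store (for example, DynamoDB)
_sessions: dict[tuple[str, str], list[dict]] = {}

def create_session(user_id: str) -> str:
    session_id = str(uuid.uuid4())          # unique, unguessable identifier
    _sessions[(user_id, session_id)] = []   # history keyed by BOTH identifiers
    return session_id

def get_history(user_id: str, session_id: str) -> list[dict]:
    # The composite key means a caller can never read another user's
    # conversation, even if they somehow obtain a valid session ID
    try:
        return _sessions[(user_id, session_id)]
    except KeyError:
        raise PermissionError("Session does not belong to this user")
```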
For RAG implementations, it's crucial to manage access to the knowledge bases that augment your LLM responses. Amazon Bedrock Knowledge Bases includes metadata filtering, which provides built-in access controls to make sure users only retrieve information they're authorized to access. If you're managing your own RAG implementation, we recommend Amazon Kendra, which provides capabilities to filter responses based on user permissions.
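For illustration, the following sketch calls the Knowledge Bases Retrieve API with a metadata filter so results are restricted to documents tagged with the caller's department. The knowledge base ID and the `department` metadata key are assumptions for this example.

```python
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime")

def retrieve_for_user(query: str, user_department: str) -> list:
    # Only return chunks whose 'department' metadata matches the caller's
    # department; the knowledge base ID is a placeholder
    response = agent_runtime.retrieve(
        knowledgeBaseId="KB12345678",
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {
                "numberOfResults": 5,
                "filter": {
                    "equals": {"key": "department", "value": user_department}
                },
            }
        },
    )
    return response["retrievalResults"]
```

The department value should come from the verified identity (for example, a JWT claim), never from the prompt itself, so users cannot talk their way into another group's documents.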

4. Mitigate input and output risks

Implement robust evaluation mechanisms to assess and mitigate risks associated with user inputs and model outputs in your generative AI application. This helps protect against vulnerabilities such as prompt injection attacks, inappropriate content generation, and hallucinations.
Amazon Bedrock Guardrails provides a suite of configurable defenses for prompt inputs and model outputs, which can be applied across LLMs on Amazon Bedrock, including fine-tuned models, and even to generative AI applications outside of Amazon Bedrock. It offers the following key capabilities (a usage sketch follows the list):
  • Configure thresholds to filter harmful content, jailbreaks, and prompt injection attacks
  • Define and disallow denied topics with short natural language descriptions
  • Block or mask sensitive information including personally identifiable information (PII)
  • Reduce hallucinations using contextual grounding and relevance checks
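As referenced above, here is a minimal sketch of screening user input with the ApplyGuardrail API, which evaluates text against a guardrail without invoking a model. The guardrail ID and version are placeholders for a guardrail you've already created.

```python
import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

def check_user_input(prompt: str) -> str:
    # Guardrail ID and version are placeholders
    result = bedrock_runtime.apply_guardrail(
        guardrailIdentifier="gr-example123",
        guardrailVersion="1",
        source="INPUT",                        # use "OUTPUT" for model responses
        content=[{"text": {"text": prompt}}],
    )
    if result["action"] == "GUARDRAIL_INTERVENED":
        # Return the guardrail's configured blocked message instead
        return result["outputs"][0]["text"]
    return prompt
```

If you invoke models through the Bedrock Converse API, you can alternatively attach the guardrail to the call itself so input and output checks happen in line with inference.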
Another precaution to take as part of input sanitization is to implement a verified prompt catalog—a pre-approved set of prompts for common tasks. We recommend using Amazon Bedrock Prompt Management to organize and manage these prompts effectively. This can help mitigate risks by limiting the LLM’s exposure to potentially malicious instructions.
Output validation is equally important. Treat LLM responses with caution, especially generated database queries or code that will be passed downstream to other components. When handling such outputs, treat the LLM as you would any other untrusted user: implement proper access control and authorization checks before executing actions on downstream services. Use safeguards against remote code execution, SQL injection, and cross-site scripting (XSS). Use parameterized queries for database interactions and validate the structure and intent of generated SQL before execution, as shown in the sketch below. Additionally, define prompt templates within system prompts to restrict output format and reduce potential risks.
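To illustrate the parameterized-query guidance, here is a small sketch that validates an LLM-chosen table name against an allow-list and binds the value as a parameter. SQLite stands in for whatever database your application uses; the table names are hypothetical.

```python
import sqlite3

ALLOWED_TABLES = {"orders", "products"}  # allow-list of queryable tables

def run_generated_query(table: str, customer_id: str) -> list:
    # Table names cannot be bound as parameters, so the LLM-chosen table
    # must be checked explicitly against an allow-list
    if table not in ALLOWED_TABLES:
        raise ValueError(f"Table {table!r} is not permitted")
    conn = sqlite3.connect("app.db")
    try:
        # The value is bound as a parameter, never string-interpolated,
        # which neutralizes SQL injection via the generated input
        cur = conn.execute(
            f"SELECT * FROM {table} WHERE customer_id = ?", (customer_id,)
        )
        return cur.fetchall()
    finally:
        conn.close()
```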
When dealing with LLMs that generate system commands or code, it's crucial to implement rigorous security measures. Start by employing strict validation checks through a combination of parameterized queries and input validation. This approach should incorporate allow-lists to restrict permitted commands, syntax checking to verify proper structure, and semantic analysis to understand the intent of the generated code. When handling potentially risky content like JavaScript or Markdown, always encode the model's output before returning it to users or rendering it in a browser. This encoding step acts as an additional layer of protection against potential vulnerabilities. In cases where command or code execution is necessary, run these commands or code snippets in an isolated environment. This sandboxing technique provides an extra safeguard, containing potential negative impacts and protecting your main system from unintended consequences.
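The following sketch shows the process-isolation idea in its simplest form: model-generated code runs in a separate interpreter process with an empty environment and a hard timeout. A production sandbox would add much stronger isolation, such as containers, seccomp profiles, or micro-VMs; this only illustrates the process boundary and resource limit.

```python
import subprocess

def run_in_sandbox(snippet: str, timeout_s: int = 5) -> str:
    # Run model-generated Python in a separate process with a hard timeout.
    # This is a sketch of the boundary, not a complete sandbox.
    result = subprocess.run(
        ["python3", "-I", "-c", snippet],  # -I: isolated mode
        capture_output=True,
        text=True,
        timeout=timeout_s,                 # raises TimeoutExpired on overrun
        env={},                            # start from an empty environment
    )
    if result.returncode != 0:
        raise RuntimeError(result.stderr)
    return result.stdout
```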

5. Protect your data

Data protection when using pre-trained models in generative AI applications focuses primarily on safeguarding inference data and supplementary data sources. This includes user queries, additional contexts, and the knowledge bases used in patterns like RAG.
To protect against data loss or leakage, encrypt all data sources, including RAG knowledge bases. Use AWS Key Management Service (AWS KMS) to securely manage, store, and rotate your encryption keys. Implement access controls using AWS Identity and Access Management (IAM) policies to make sure only authorized users and services can access the data. Additionally, enable versioning on your knowledge base storage (such as S3 versioning) to track changes and maintain data integrity.
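As a quick illustration, the following sketch applies default KMS encryption and enables versioning on an S3 bucket backing a knowledge base. The bucket name and key ARN are placeholders.

```python
import boto3

s3 = boto3.client("s3")
BUCKET = "my-rag-knowledge-base"                                  # placeholder
KMS_KEY_ARN = "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE"    # placeholder

# Default-encrypt every object with a customer-managed KMS key
s3.put_bucket_encryption(
    Bucket=BUCKET,
    ServerSideEncryptionConfiguration={
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": KMS_KEY_ARN,
                }
            }
        ]
    },
)

# Enable versioning so knowledge-base changes are tracked and recoverable
s3.put_bucket_versioning(
    Bucket=BUCKET,
    VersioningConfiguration={"Status": "Enabled"},
)
```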
If your application deals with sensitive data, consider implementing data masking or blocking before it reaches the model, which can be achieved using the sensitive information filters in Amazon Bedrock Guardrails.
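For example, a guardrail can be configured to anonymize common PII types rather than block the whole request. The sketch below creates such a guardrail with boto3; the name and messaging strings are illustrative.

```python
import boto3

bedrock = boto3.client("bedrock")

# Mask emails and phone numbers in prompts and responses instead of
# blocking them outright; names and messages are illustrative
response = bedrock.create_guardrail(
    name="pii-masking-guardrail",
    blockedInputMessaging="Sorry, I can't process that request.",
    blockedOutputsMessaging="Sorry, I can't return that response.",
    sensitiveInformationPolicyConfig={
        "piiEntitiesConfig": [
            {"type": "EMAIL", "action": "ANONYMIZE"},
            {"type": "PHONE", "action": "ANONYMIZE"},
        ]
    },
)
print(response["guardrailId"])
```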

6. Secure your perimeter

Implement strong network security measures to protect your generative AI infrastructure. When using proprietary data for fine-tuning or RAG, it's essential to establish a secure perimeter that prevents internet exposure. Use Amazon Bedrock VPC endpoints, powered by AWS PrivateLink, to create a private connection between your VPC and the Amazon Bedrock service account. This approach significantly enhances the security of your data and model interactions.
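A minimal sketch of creating that interface endpoint with boto3 follows; the VPC, subnet, and security group IDs are placeholders for resources in your account.

```python
import boto3

ec2 = boto3.client("ec2")

# All resource IDs below are placeholders
ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    VpcId="vpc-0123456789abcdef0",
    ServiceName="com.amazonaws.us-east-1.bedrock-runtime",  # Bedrock runtime
    SubnetIds=["subnet-0123456789abcdef0"],
    SecurityGroupIds=["sg-0123456789abcdef0"],
    PrivateDnsEnabled=True,  # SDK calls resolve to the private endpoint
)
```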
LLMs often require substantial computational resources, making them potential targets for resource exploitation. To mitigate this risk, implement appropriate consumption limits before deploying your generative AI application as a service. Use AWS WAF to set up rate limiting, which can help prevent abuse and facilitate fair resource allocation. Alternatively, enable throttling through Amazon API Gateway to control the rate of requests to your application. These measures not only protect your infrastructure from potential threats but also help maintain consistent performance and availability of your generative AI services.
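For example, with a REST API fronting your application, an API Gateway usage plan can cap request rates. The sketch below assumes a deployed API ID and stage; the limits are illustrative and should be tuned to your workload.

```python
import boto3

apigw = boto3.client("apigateway")

# API ID and stage name are placeholders for your deployed REST API
apigw.create_usage_plan(
    name="genai-app-throttle",
    throttle={
        "rateLimit": 10.0,   # steady-state requests per second
        "burstLimit": 20,    # maximum request burst
    },
    apiStages=[{"apiId": "a1b2c3d4e5", "stage": "prod"}],
)
```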

7. Implement comprehensive monitoring and incident response

Proper monitoring and response mechanisms are essential for detecting and addressing issues promptly and maintaining system reliability. As part of your observability strategy, monitor LLM usage metrics such as request volume, latency, and error rates to understand system performance and detect anomalies quickly. Use Amazon CloudWatch to create alerts for key metrics that exceed predefined thresholds, such as the number of interventions from Amazon Bedrock Guardrails (indicating denied inputs or outputs) or an unusual number of invocations against your Amazon Bedrock models.
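As one concrete example, the following sketch creates a CloudWatch alarm on the Bedrock invocations metric; the threshold and SNS topic ARN are placeholders you would tune for your workload.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm when invocation volume spikes; threshold and topic are placeholders
cloudwatch.put_metric_alarm(
    AlarmName="bedrock-invocation-spike",
    Namespace="AWS/Bedrock",
    MetricName="Invocations",
    Statistic="Sum",
    Period=300,                 # evaluate in 5-minute windows
    EvaluationPeriods=1,
    Threshold=1000,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:111122223333:security-alerts"],
)
```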
Equally important is the development of a practical incident response plan for generative AI-related issues. This plan should address scenarios such as prompt injections, unexpected model outputs, or data leaks. Establish a clear escalation process with well-defined roles and responsibilities. Implement an Andon cord mechanism that allows for quick model deactivation, system shutdown, rollback to a previous version, or activation of a restricted safe mode with predefined procedures for common scenarios. This proactive approach to incident management significantly enhances your ability to maintain system integrity and swiftly address issues that may arise in your generative AI application.

Conclusion

As you prepare to move your generative AI application from prototype to production, make sure you’ve implemented these critical security measures. They’ll help you build and deploy responsibly, protecting both your organization and your users. Stay informed about the latest developments in AI security to keep your application at the forefront of innovation and trust.
This post covers the security controls for applications using pre-trained models. If you’re considering using fine-tuned or custom models, refer to the Generative AI Security Scoping Matrix to understand the different risks and mitigations to look out for based on the model type. You can also refer to Secure approach to generative AI or contact your account team for additional support. Now, go build innovative and secure generative AI applications!
 

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.
