5 Ways for Chatting with Your Data on AWS
Learn the trade-offs of different methods for chatting with your data
Banjo Obayomi
Amazon Employee
Published Mar 12, 2024
Chatting with data using large language models (LLMs) is a game-changer for builders, but the abundance of options on AWS can be overwhelming. Should you go with a no-code solution like Amazon Q, or a highly customizable approach like Bedrock + LangChain? What about managed services like Knowledge Bases? Each path has its own pros and cons, and choosing the right one can be daunting.
In this post, we'll break down five different methods for integrating generative AI capabilities with your data to perform retrieval-augmented generation (RAG). We'll dive into the advantages and drawbacks of each approach, so you can make an informed decision based on your specific needs. Whether you prioritize ease of use, customization, or enterprise-grade solutions, you'll find a path that aligns with your requirements.
Amazon Q is a no-code solution that allows you to create a chatbot capable of connecting with over 30 types of data stores, from S3 buckets to Slack messages. This service comes with a pre-built user interface and security features, making it ideal for organizations looking for a quick and secure deployment.
Pros:
- No coding required
- Deployable within an organization with security features
- Comes with a user interface
- Integrates with over 30 data sources
Cons:
- Limited customization options for prompts and UI
- Can only be deployed within an organization
You can call an Amazon Q application using the ChatSync API as follows:
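A minimal sketch with boto3's qbusiness client (the application ID, user ID, and question below are placeholders):

```python
import boto3

client = boto3.client("qbusiness")

# Placeholder IDs -- replace with your Amazon Q application and user
response = client.chat_sync(
    applicationId="your-q-application-id",
    userId="your-user-id",
    userMessage="What were our top support issues last quarter?",
)

# The assistant's reply, grounded in your connected data sources
print(response["systemMessage"])
```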
Amazon Q's strengths lie in its no-code approach, pre-built UI, secure deployment within organizations, and integration with over 30 data sources. However, it lacks customization options for prompts and UI. This workshop module provides step-by-step instructions on how to set up an Amazon Q chatbot.
For those seeking a highly customizable solution, combining Embeddings, Foundation Models, and a local vector store with LangChain can provide powerful APIs for building RAG workflows tailored to your needs.
Pros:
- Highly customizable
- Can be integrated into existing workflows
- Leverages the power of LangChain
Cons:
- Code-heavy approach
- Requires maintenance and updates
- Learning curve for using LangChain
Here's a code snippet of chatting with a PDF; the full code is here:
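A minimal sketch, assuming a local FAISS vector store and illustrative Titan/Claude model IDs (requires the langchain, langchain-community, faiss-cpu, and pypdf packages):

```python
import boto3
from langchain.chains import RetrievalQA
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.embeddings import BedrockEmbeddings
from langchain_community.llms import Bedrock
from langchain_community.vectorstores import FAISS

bedrock = boto3.client("bedrock-runtime")

# Load the PDF and split it into pages
docs = PyPDFLoader("example.pdf").load_and_split()

# Embed the pages and index them in a local FAISS vector store
embeddings = BedrockEmbeddings(client=bedrock, model_id="amazon.titan-embed-text-v1")
vectorstore = FAISS.from_documents(docs, embeddings)

# Wire a Bedrock foundation model to the retriever for RAG
llm = Bedrock(client=bedrock, model_id="anthropic.claude-v2")
qa = RetrievalQA.from_chain_type(llm=llm, retriever=vectorstore.as_retriever())

print(qa.invoke("Summarize the key points of this document.")["result"])
```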
This approach offers high customizability, the ability to integrate into existing workflows, and leverages the power of LangChain. Its downsides include being code-heavy, requiring maintenance, and having a learning curve for LangChain. This repo provides guidance on building further solutions.
Knowledge Bases for Amazon Bedrock is a managed service that allows you to store and query your data using a vector database, enabling RAG workflows and semantic search capabilities. By leveraging large language models and vector embeddings, Knowledge Bases can understand the meaning and context of your data, making it easier to retrieve relevant information through natural language queries.
Knowledge Bases provide both API access and a console interface, catering to developers and non-technical users alike. Additionally, it offers seamless integration with various database solutions, including Amazon OpenSearch and Amazon RDS/Aurora.
Amazon OpenSearch
- Use Case: Ideal for AI-powered search applications that require semantic and personalized search.
- Advantages:
  - Optimal price-performance for search workloads
  - Offers both serverless and managed cluster options

Amazon RDS/Aurora
- Use Case: When you need to co-locate vector search capabilities with relational data.
- Advantages:
  - Integrates vector search with relational databases
  - Supports serverless operation via Aurora
  - Keeps traditional application data and vector embeddings in the same database, enabling better governance and faster deployment with a minimal learning curve
No matter which database solution you choose, Knowledge Bases provides a robust set of advantages and capabilities:
Pros:
- Accessible via console or API
- Fast retrieval of relevant documents
- Manages the vector database for you
Cons:
- Incurs a minimum cost, since a database instance must be kept running
Here's an example of how you would create a request to a Knowledge Base:
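A minimal sketch using the RetrieveAndGenerate API from boto3's bedrock-agent-runtime client; the knowledge base ID and model ARN are placeholders:

```python
import boto3

client = boto3.client("bedrock-agent-runtime")

# Placeholder knowledge base ID and model ARN -- replace with your own
response = client.retrieve_and_generate(
    input={"text": "What is our refund policy?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "YOUR_KB_ID",
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v2",
        },
    },
)

# Generated answer, grounded in documents retrieved from the knowledge base
print(response["output"]["text"])
```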
By leveraging Knowledge Bases with your choice of Amazon OpenSearch or Amazon RDS/Aurora, you can seamlessly integrate vector search capabilities into your applications, enabling AI-powered search and retrieval functionalities tailored to your specific needs. This workshop module provides guidance on how to set one up.
For those seeking a highly customizable AI assistant that can leverage tools and a knowledge base, Agents offer a compelling solution. With the ability to connect to a knowledge base and customize the assistant's behavior via an OpenAPI specification and Lambda actions, you can tailor the experience to your specific needs.
Pros:
- Highly customizable
- Can leverage a knowledge base
- Accessible via console or API
Cons:
- Learning curve for setting up OpenAPI spec and Lambda actions
- Slower response times, as the agent orchestrates multiple steps
Here's a code snippet of chatting with your data through an agent; the full code is here:
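A minimal sketch using the InvokeAgent API; the agent ID, alias ID, and session ID are placeholders:

```python
import boto3

client = boto3.client("bedrock-agent-runtime")

# Placeholder agent, alias, and session identifiers -- replace with your own
response = client.invoke_agent(
    agentId="YOUR_AGENT_ID",
    agentAliasId="YOUR_AGENT_ALIAS_ID",
    sessionId="demo-session-1",
    inputText="What does the uploaded PDF say about pricing?",
)

# The completion arrives as an event stream of chunks
answer = ""
for event in response["completion"]:
    if "chunk" in event:
        answer += event["chunk"]["bytes"].decode("utf-8")

print(answer)
```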
With Bedrock Agents, you gain highly customizable AI assistants that can connect to a knowledge base and are accessible via the console or API. However, there is a learning curve for setting up the OpenAPI spec and Lambda actions. Here is a full end-to-end repo to learn more. You can also leverage Powertools for AWS Lambda to aid in the creation of Agents.
For enterprise use cases that already rely on Kendra to ingest and index content, combining Bedrock with Kendra can provide a powerful RAG solution for enterprise content. This approach allows you to take advantage of Kendra's data ingestion capabilities while leveraging LangChain and Bedrock's language models for querying.
Pros:
- Can use LangChain for advanced querying
- Kendra handles data ingestion and storage
Cons:
- High setup cost due to Kendra's enterprise focus
- Best suited for existing Kendra users
Here is an example code snippet using LangChain and Kendra:
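A minimal sketch, assuming an existing, populated Kendra index; the index ID and model ID are illustrative:

```python
import boto3
from langchain.chains import RetrievalQA
from langchain_community.llms import Bedrock
from langchain_community.retrievers import AmazonKendraRetriever

# Placeholder Kendra index ID -- replace with your own
retriever = AmazonKendraRetriever(index_id="YOUR_KENDRA_INDEX_ID")

# Kendra handles ingestion and retrieval; a Bedrock model generates the answer
llm = Bedrock(
    client=boto3.client("bedrock-runtime"),
    model_id="anthropic.claude-v2",
)
qa = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)

print(qa.invoke("What is our parental leave policy?")["result"])
```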
The combination of Bedrock and Kendra enables the use of LangChain while leveraging Kendra's data ingestion capabilities. The drawbacks are the high setup cost and suitability primarily for existing Kendra users. Explore this full end-to-end example to get started.
With these five approaches, you have a range of options for chatting with your data on AWS. Whether you prioritize ease of use, customization, or enterprise-grade solutions, there's a path that aligns with your needs.
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.