
5 Ways for Chatting with Your Data on AWS

Learn the trade-offs of different methods for chatting with your data

Banjo Obayomi
Amazon Employee
Published Mar 12, 2024
Chatting with data using large language models (LLMs) is a game-changer for builders, but the abundance of options on AWS can be overwhelming. Should you go with a no-code solution like Amazon Q, or a highly customizable approach like Bedrock + LangChain? What about managed services like Knowledge Bases? Each path has its own pros and cons, and choosing the right one can be daunting.
RAG Pipeline
In this post, we'll break down five different methods for integrating generative AI capabilities with your data to perform retrieval-augmented generation (RAG). We'll dive into the advantages and drawbacks of each approach, so you can make an informed decision based on your specific needs. Whether you prioritize ease of use, customization, or enterprise-grade solutions, you'll find a path that aligns with your requirements.
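
All five approaches implement the same underlying pattern: retrieve the most relevant pieces of your data, then hand them to an LLM as context. In rough pseudocode (embed(), vector_store, and llm() are hypothetical stand-ins for whichever embedding model, vector store, and LLM you choose):

# The RAG pattern in rough pseudocode; embed(), vector_store, and llm()
# are hypothetical stand-ins for your embedding model, vector store, and LLM
def answer_with_rag(question):
    query_vector = embed(question)                 # 1. embed the question
    docs = vector_store.search(query_vector, k=4)  # 2. retrieve relevant chunks
    context = "\n".join(d.text for d in docs)      # 3. ground a prompt in them
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return llm(prompt)                             # 4. generate the answer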

1. Amazon Q

Amazon Q is a no-code solution that allows you to create a chatbot capable of connecting with over 30 types of data stores, from S3 buckets to Slack messages. This service comes with a pre-built user interface and security features, making it ideal for organizations looking for a quick and secure deployment.
Amazon Q Chatbot
Pros:
  • No coding required
  • Deployable within an organization with security features
  • Comes with a user interface
  • Integrates with over 30 data sources
Cons:
  • Limited customization options for prompts and UI
  • Can only be deployed within an organization
You can call an Amazon Q application using the ChatSync API as follows:
import boto3

amazon_q = boto3.client("qbusiness")  # Amazon Q Business runtime client

response = amazon_q.chat_sync(
    applicationId="xxx",  # your Amazon Q application ID
    userId="xxx",         # the user asking the question
    userMessage="xxx",    # the question to ask
)
Amazon Q's strengths lie in its no-code approach, pre-built UI, secure deployment within organizations, and integration with over 30 data sources. However, it lacks customization options for prompts and UI. This workshop module provides step-by-step instructions on how to set up an Amazon Q chatbot.

2. Bedrock + LangChain

For those seeking a highly customizable solution, combining embeddings, foundation models, and a local vector store with LangChain can provide powerful APIs for building RAG workflows tailored to your needs.
Pros:
  • Highly customizable
  • Can be integrated into existing workflows
  • Leverages the power of LangChain
Cons:
  • Code-heavy approach
  • Requires maintenance and updates
  • Learning curve for using LangChain
Here's a code snippet of chatting with a PDF; the full code is here.
import boto3
from langchain_community.embeddings import BedrockEmbeddings
from langchain_community.vectorstores import FAISS

bedrock_runtime = boto3.client("bedrock-runtime")

def rag_with_bedrock(query):
    embeddings = BedrockEmbeddings(
        client=bedrock_runtime,
        model_id="amazon.titan-embed-text-v1",
    )

    # Load the prebuilt FAISS index of the PDF (well_arch.pdf) from disk
    local_vector_store = FAISS.load_local("local_index", embeddings)

    # Retrieve the chunks most similar to the query and concatenate them
    docs = local_vector_store.similarity_search(query)
    context = ""
    for doc in docs:
        context += doc.page_content

    prompt = f"""Use the following pieces of context to answer the question at the end.

{context}

Question: {query}
Answer:"""

    # call_claude (defined in the full code) sends the prompt to Claude on Bedrock
    return call_claude(prompt)

query = "What can you tell me about Amazon RDS?"
print(query)
print(rag_with_bedrock(query))
This approach is highly customizable, integrates into existing workflows, and leverages the power of LangChain. Its downsides are a code-heavy setup, ongoing maintenance, and a learning curve for LangChain. This repo provides guidance on building further solutions.
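
The snippet above loads a prebuilt FAISS index from local_index. As a rough sketch of how that index could be built from the PDF in the first place (assuming LangChain's PyPDFLoader and recursive text splitter, and the same embeddings object as above; the chunk sizes are arbitrary choices):

from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS

# Split the PDF into overlapping chunks, embed them, and save the index
pages = PyPDFLoader("well_arch.pdf").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(pages)
FAISS.from_documents(chunks, embeddings).save_local("local_index")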

3. Knowledge Bases for Amazon Bedrock

Knowledge Bases is a managed service that allows you to store and query your data using a vector database, enabling RAG workflows and semantic search capabilities. By leveraging large language models and vector embeddings, Knowledge Bases can understand the meaning and context of your data, making it easier to retrieve relevant information through natural language queries.
Knowledge Bases for Amazon Bedrock
Knowledge Bases provides both API access and a console interface, catering to developers and non-technical users alike. Additionally, it offers seamless integration with various database solutions, including Amazon OpenSearch and Amazon RDS/Aurora.
Amazon OpenSearch
  • Use Case: Ideal for AI-powered search applications that require semantic and personalized searches.
  • Advantages:
    • Optimal price-performance for search workloads
    • Offers both serverless and managed cluster options
Amazon RDS/Aurora
  • Use Case: When you need to co-locate vector search capabilities with relational data.
  • Advantages:
    • Integrates vector search with relational databases
    • Supports serverless operation via Aurora
    • Keeps traditional application data and vector embeddings in the same database, enabling better governance and faster deployment with a minimal learning curve.
No matter which database solution you choose, Knowledge Bases provides a robust set of advantages and capabilities:
Pros:
  • Accessible via console or API
  • Fast document retrieval
  • Manages the vector database for you
Cons:
  • Incurs a minimum cost, since a database instance must be kept running
Here's an example of how you would create a request to a Knowledge Base:
import boto3

KB_ID = "TODO"
QUERY = "What can you tell me about Amazon EC2?"
REGION = "us-west-2"
MODEL = "anthropic.claude-v2:1"

# Set up the Bedrock Agent Runtime client
bedrock_agent_runtime = boto3.client(
    service_name="bedrock-agent-runtime",
    region_name=REGION,
)

# Retrieve relevant chunks from the knowledge base and generate an answer
text_response = bedrock_agent_runtime.retrieve_and_generate(
    input={"text": QUERY},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": KB_ID,
            "modelArn": MODEL,
        },
    },
)

print(f"Output:\n{text_response['output']['text']}\n")
for citation in text_response["citations"]:
    print(f"Citation:\n{citation}\n")
By leveraging Knowledge Bases with your choice of Amazon OpenSearch or Amazon RDS/Aurora, you can seamlessly integrate vector search capabilities into your applications, enabling AI-powered search and retrieval functionalities tailored to your specific needs. This workshop module provides guidance on how to set one up.
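
If you'd rather control the generation step yourself, the same client also exposes a retrieval-only Retrieve API. A minimal sketch, reusing the KB_ID, QUERY, and client from the snippet above:

# Fetch the top-matching chunks without generating an answer
retrieval = bedrock_agent_runtime.retrieve(
    knowledgeBaseId=KB_ID,
    retrievalQuery={"text": QUERY},
)

for result in retrieval["retrievalResults"]:
    print(result["content"]["text"])  # raw chunk text for building your own prompt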

4. Agents for Amazon Bedrock

For those seeking a highly customizable AI assistant that can leverage tools and a knowledge base, Agents offer a compelling solution. With the ability to connect to a knowledge base and customize the assistant's behavior via an OpenAPI specification and Lambda actions, you can tailor the experience to your specific needs.
Agents for Amazon Bedrock
Agents for Amazon Bedrock
Pros:
  • Highly customizable
  • Can leverage a knowledge base
  • Accessible via console or API
Cons:
  • Learning curve for setting up OpenAPI spec and Lambda actions
  • Slower response time
Here's a code snippet of invoking an agent; the full code is here.
import boto3

AGENT_ID = "TODO"
QUERY = "What can you tell me about Amazon EC2?"

bedrock_agent_runtime = boto3.client("bedrock-agent-runtime")

def run_agent():
    response = bedrock_agent_runtime.invoke_agent(
        sessionState={
            "sessionAttributes": {},
            "promptSessionAttributes": {},
        },
        agentId=AGENT_ID,
        agentAliasId="TSTALIASID",
        sessionId=str(generate_random_15digit()),  # helper from the full code
        endSession=False,
        enableTrace=True,
        inputText=QUERY,
    )
    print(response)

    # The agent's answer comes back as an event stream
    results = response.get("completion")
    for stream in results:
        process_stream(stream)  # helper from the full code
With Bedrock Agents, you gain highly customizable AI assistants that can connect to a knowledge base and are accessible via the console or API. However, there is a learning curve for setting up the OpenAPI spec and Lambda actions. Here is a full end-to-end repo to learn more. You can also leverage Powertools for AWS Lambda to aid in the creation of Agents, as sketched below.
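
As a taste of that approach, here's a minimal sketch of a Lambda handler built with Powertools' BedrockAgentResolver; the /current_time action is a made-up example, and Powertools can generate the matching OpenAPI spec for you:

from time import gmtime, strftime

from aws_lambda_powertools.event_handler import BedrockAgentResolver

app = BedrockAgentResolver()

# Each decorated route becomes an action the agent can invoke
@app.get("/current_time", description="Returns the current UTC time")
def current_time() -> str:
    return strftime("%Y-%m-%dT%H:%M:%SZ", gmtime())

def lambda_handler(event, context):
    return app.resolve(event, context)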

5. Bedrock + Kendra

For enterprise use cases that already leverage Kendra for data storage, combining Bedrock with Kendra can provide a powerful RAG solution for enterprise content. This approach allows you to take advantage of Kendra's data ingestion capabilities while leveraging LangChain and Bedrock's language models for querying.
Pros:
  • Can use LangChain for advanced querying
  • Kendra handles data ingestion and storage
Cons:
  • High setup cost due to Kendra's enterprise focus
  • Best suited for existing Kendra users
Here is an example code snippet using LangChain with Kendra.
from langchain_community.retrievers import AmazonKendraRetriever

# Query an existing Kendra index for passages relevant to the question
retriever = AmazonKendraRetriever(index_id="c0806df7-e76b-4bce-9b5c-d5582f6b1a03")
retriever.get_relevant_documents("what is langchain")
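
To close the loop with generation, one option (a sketch, assuming LangChain's Bedrock LLM wrapper and RetrievalQA chain) is to wire that retriever to a Bedrock model:

from langchain.chains import RetrievalQA
from langchain_community.llms import Bedrock

# Answer questions over the Kendra-indexed content with a Bedrock model
llm = Bedrock(model_id="anthropic.claude-v2:1")
qa = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)
print(qa.invoke("what is langchain")["result"])
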
The combination of Bedrock and Kendra enables the use of LangChain while leveraging Kendra's data ingestion capabilities. The drawbacks are the high setup cost and suitability primarily for existing Kendra users. Explore this full end-to-end example to get started.

Conclusion

With these five approaches, you have a range of options for chatting with your data on AWS. Whether you prioritize ease of use, customization, or enterprise-grade solutions, there's a path that aligns with your needs.

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.
