
Using Amazon Bedrock to compare a Retrieval Augmented Generation (RAG) based Generative AI (GenAI) application across Amazon Nova Pro and Anthropic Claude 3.5 Sonnet
A GenAI chatbot with RAG and rerank using different Foundation Models (FMs) on Amazon Bedrock endpoints.
- Create a Knowledge Base from the documents to be used as context for FM queries.
- Select a secure, reliable, accurate, efficient, and cost-effective FM.
- Based on the user queries and document embeddings (`Cohere Embed English`), retrieve similar document chunks from the vector store with the `FAISS` engine (see the retrieval sketch after this list).
- For improved relevancy and accuracy, rerank the retrieved document chunks.
- Augment the user query with the reranked document chunks.
- Rewrite the user query and construct the prompt for the selected FM (Amazon Nova Pro / Claude 3.5 Sonnet).
- Use Amazon Bedrock Guardrails to filter harmful content and topics in both user inputs and FM responses.
- Stream FM responses for responsiveness.
- Format the FM responses according to the use case.
- Collect user feedback on the responses for potential model improvements.
- Save the queries and responses for model evaluations, model fine-tuning, and/or continued pre-training.
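
The retrieval step can be sketched with the boto3 `bedrock-agent-runtime` client; the region, Knowledge Base ID, and `numberOfResults` values below are illustrative assumptions, not values from this project:

```python
import boto3

# Agent runtime client for Knowledge Base retrieval (region is an assumption).
bedrock_agent_runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

def retrieve_chunks(query: str, kb_id: str, top_k: int = 5) -> list[dict]:
    """Retrieve the most similar document chunks from the Knowledge Base
    (OpenSearch Serverless vector store) using semantic search."""
    response = bedrock_agent_runtime.retrieve(
        knowledgeBaseId=kb_id,
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {
                "numberOfResults": top_k,          # how many chunks to return
                "overrideSearchType": "SEMANTIC",  # vector similarity search
            }
        },
    )
    # Each result carries the chunk text, its source location, and a relevance score.
    return response["retrievalResults"]

chunks = retrieve_chunks("What is Amazon Bedrock?", kb_id="KBEXAMPLE01")  # hypothetical ID
```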
- Create a boto3 agent runtime client to programmatically retrieve from the Amazon Bedrock Knowledge Base (OpenSearch Serverless).
- Create a boto3 agent runtime client to programmatically make inference requests to Large Language Models (LLMs) hosted on Amazon Bedrock (e.g., Amazon Nova Pro).
- Retrieve embedded chunks (`Cohere Embed English`) from the OpenSearch Serverless (OSS) vector store with semantic search.
- Rerank document chunks with the `Cohere Rerank 3.5` model (see the rerank sketch after this list).
- Stop harmful content in models using Amazon Bedrock Guardrails.
- Generate LLM responses in streams.
- Perform multi-turn conversations with `sessionId`.
- Obtain streamed LLM responses after knowledge retrieval, reranking, and applying Amazon Bedrock Guardrails (see the streaming sketch after this list).
- Display external information from RAG with citations and the retrieved document chunks, based on the `numberOfResults` parameter.
- Collect user feedback on the LLM responses to user queries.
- Save this user feedback to a JSON-formatted output file and a DynamoDB table (see the feedback sketch after this list).
- The collected user data can be used for model evaluations with Amazon Bedrock evaluations.
- This data can also serve as a source for Reinforcement Learning from Human Feedback (RLHF), which can be used to fine-tune the Foundation Models.
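
A sketch of the rerank step with the Bedrock Rerank API and Cohere Rerank 3.5, reusing chunks in the shape returned by the retrieval sketch above; the model ARN and region are assumptions, so substitute the Rerank model ARN available in your region:

```python
import boto3

bedrock_agent_runtime = boto3.client("bedrock-agent-runtime", region_name="us-west-2")

def rerank_chunks(query: str, chunks: list[dict], top_k: int = 3) -> list[dict]:
    """Rerank retrieved chunks with Cohere Rerank 3.5 for improved relevancy."""
    response = bedrock_agent_runtime.rerank(
        queries=[{"type": "TEXT", "textQuery": {"text": query}}],
        sources=[
            {
                "type": "INLINE",
                "inlineDocumentSource": {
                    "type": "TEXT",
                    "textDocument": {"text": chunk["content"]["text"]},
                },
            }
            for chunk in chunks
        ],
        rerankingConfiguration={
            "type": "BEDROCK_RERANKING_MODEL",
            "bedrockRerankingConfiguration": {
                # Assumed ARN; check the Cohere Rerank 3.5 ARN for your region.
                "modelConfiguration": {
                    "modelArn": "arn:aws:bedrock:us-west-2::foundation-model/cohere.rerank-v3-5:0"
                },
                "numberOfResults": top_k,
            },
        },
    )
    # The API returns indexes into the input sources, ordered by relevance score.
    return [chunks[result["index"]] for result in response["results"]]
```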
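A sketch of streamed, multi-turn generation with `retrieve_and_generate_stream`, which performs retrieval, generation, and Guardrails filtering in one call; the Knowledge Base ID, model ARN, and Guardrail identifiers are placeholders:

```python
import boto3

bedrock_agent_runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

def ask(query: str, kb_id: str, session_id: str | None = None) -> str:
    """Stream an answer grounded in the Knowledge Base, with Guardrails applied.
    Returns the sessionId so the next turn can continue the conversation."""
    params = {
        "input": {"text": query},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,
                # Placeholder model identifier for Amazon Nova Pro.
                "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/amazon.nova-pro-v1:0",
                "generationConfiguration": {
                    "guardrailConfiguration": {
                        "guardrailId": "gr0placeholder",  # hypothetical Guardrail ID
                        "guardrailVersion": "1",
                    }
                },
                "retrievalConfiguration": {
                    "vectorSearchConfiguration": {"numberOfResults": 5}
                },
            },
        },
    }
    if session_id:
        params["sessionId"] = session_id  # reuse for multi-turn conversations

    response = bedrock_agent_runtime.retrieve_and_generate_stream(**params)
    for event in response["stream"]:
        if "output" in event:
            print(event["output"]["text"], end="", flush=True)  # streamed text
        # "citation" events reference the retrieved chunks backing the answer.
    return response["sessionId"]

# First turn creates a session; later turns pass the sessionId back in.
session = ask("What is RAG?", kb_id="KBEXAMPLE01")
ask("How does reranking improve it?", kb_id="KBEXAMPLE01", session_id=session)
```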
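Feedback persistence can be sketched as below, appending each record to a JSON Lines file and writing the same item to DynamoDB; the table name and item schema are assumptions:

```python
import json
import uuid
from datetime import datetime, timezone

import boto3

# Hypothetical DynamoDB table with "id" as its partition key.
feedback_table = boto3.resource("dynamodb").Table("chatbot-feedback")

def save_feedback(query: str, response: str, thumbs_up: bool) -> None:
    """Persist a query/response/feedback record for model evaluations,
    fine-tuning, or RLHF datasets."""
    record = {
        "id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "query": query,
        "response": response,
        "thumbs_up": thumbs_up,
    }
    # Append to a local JSON Lines output file ...
    with open("feedback.jsonl", "a") as f:
        f.write(json.dumps(record) + "\n")
    # ... and write the same record to the DynamoDB table.
    feedback_table.put_item(Item=record)
```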

Streamlit is used to build the user interface for the LLM chatbot with RAG.
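
A minimal Streamlit sketch of the chat loop; `stream_answer` is a hypothetical generator wrapping the streamed Bedrock response, and `st.feedback` (available in recent Streamlit releases) collects the thumbs up/down signal:

```python
import streamlit as st

def stream_answer(prompt: str, session_id: str | None):
    """Hypothetical generator yielding streamed tokens from the Bedrock helpers above."""
    yield from ["(streamed ", "response ", "tokens)"]  # replace with the real stream

st.title("RAG Chatbot on Amazon Bedrock")

# Keep the Bedrock sessionId across Streamlit reruns for multi-turn chat.
if "session_id" not in st.session_state:
    st.session_state.session_id = None

if prompt := st.chat_input("Ask a question about the documents"):
    with st.chat_message("user"):
        st.markdown(prompt)
    with st.chat_message("assistant"):
        # write_stream renders tokens as they arrive and returns the full text,
        # which can later be passed to a save_feedback-style helper.
        answer = st.write_stream(stream_answer(prompt, st.session_state.session_id))
    # Thumbs up/down widget feeding the feedback store.
    st.feedback("thumbs", key=f"fb-{prompt}")
```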
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.