RAG Search with Amazon Bedrock & OpenSearch Integration

Published Nov 12, 2024
Last Modified Dec 16, 2024
Benefits of this solution:
This final part completes the process of building a Retrieval-Augmented Generation (RAG) application that solves business problems by combining GenAI with private data sources. The solution converts audio files to text and enables semantic search over the transcripts using RAG, providing valuable insights and business benefits.
Introduction
Overview: This blog is Part 3/3 of my series on processing and utilizing audio transcriptions. Once Amazon Transcribe completes the transcription, we use Amazon Bedrock together with the LangChain libraries to embed the generated text for efficient retrieval and analysis. The LangChain libraries call Amazon Bedrock for embedding and Amazon OpenSearch for retrieval.
In this section, I will also demonstrate how to implement search capabilities using a Retrieval-Augmented Generation (RAG) approach with Amazon Bedrock. To run this example, you will need an Amazon OpenSearch instance with an index pre-populated with embedded content, as outlined in Part 2/3 of this series.
In the architecture diagram, both AWS Lambda and Amazon OpenSearch are deployed within VPC private subnets, ensuring security and alignment with best practices. If you want to test it locally (which is not recommended for production use), you could make Amazon OpenSearch publicly accessible and secure it using the network and data access policies of Amazon OpenSearch Serverless.
Objective: To demonstrate how to enable semantic search within a RAG solution.
Purpose: This solution aims to integrate audio transcription files into a Retrieval-Augmented Generation (RAG) system, enabling semantic search capabilities. By embedding the transcribed audio content and storing it in a vector database, users can perform advanced searches that extend beyond basic keyword matching. This approach allows for context-aware and meaningful retrieval of information, enhancing the efficiency and relevance of search results in applications that require in-depth analysis and querying of text data retrieved from the audio transcription.
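To make "beyond keyword matching" concrete, here is a small illustrative sketch (mine, not part of the original series) that embeds two sentences with Amazon Bedrock and compares them with cosine similarity. The Titan model ID is an assumption; use whichever embedding model your index was built with.

# Illustrative sketch: two sentences with no keywords in common can still
# score as semantically close once embedded. The model ID is an assumption.
import math

import boto3
from langchain_aws import BedrockEmbeddings

embeddings = BedrockEmbeddings(
    client=boto3.client("bedrock-runtime"),
    model_id="amazon.titan-embed-text-v1",  # assumption
)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

v1 = embeddings.embed_query("The customer wants to cancel the subscription.")
v2 = embeddings.embed_query("Caller asked how to end their monthly plan.")

# A high score here is what lets the vector index surface relevant
# transcript chunks that share no literal keywords with the query.
print(f"similarity: {cosine(v1, v2):.3f}")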

Architecture: the full RAG solution that combines the flows

  1. Show how to create an Amazon Transcribe job flow
  2. Show how to ingest audio transcription text into Amazon OpenSearch
  3. Show RAG search using Amazon Bedrock
We can further enhance the RAG solution by introducing DynamoDB to maintain search history, creating a complete search solution. However, I’ll leave this challenge for you, my dear readers.
Architectural note:
The flows listed below can be orchestrated with Step Functions; however, I've chosen to use the architecture shown below.
Full RAG Solution (architecture diagram)
Workflow Overview for the final RAG search:
Services Used
  • Amazon Bedrock: Used for embedding and LLM retrieval from the knowledge base.
  • AWS Lambda: The function is deployed within a VPC to ensure secure and private access to the Amazon OpenSearch Serverless database. Written in Python using the AWS SDK Boto3, this function is designed to accept a JSON payload from the API or user input, embed the data using Amazon Bedrock, and perform a search in the Amazon OpenSearch index. The response is then returned to the user, enabling efficient data processing, semantic querying, and seamless content updates.
  • Amazon OpenSearch: A serverless vector database that stores vector embeddings, which are numerical representations of data (e.g., words or images) capturing their semantics. The serverless model scales automatically based on usage. It integrates with Retrieval-Augmented Generation (RAG) by enabling efficient similarity searches, such as calculating the similarity between vectors. This allows the system to quickly identify and retrieve the most contextually relevant information, making it highly effective for generating responses based on related content. When a query is made, it is converted into a vector, and the database finds the closest matches (see the sketch after this list). This allows RAG systems to retrieve relevant context efficiently and use it to enhance their generated responses. Benefits include scalability, cost-effectiveness, and high performance.
  • IAM Roles: The IAM role for the AWS Lambda function includes the necessary policies to access both Amazon Bedrock and the Amazon OpenSearch Serverless vector database.
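To illustrate the closest-match lookup described above, here is a minimal sketch of the raw k-NN query that a vector search runs against OpenSearch Serverless (LangChain issues an equivalent query under the hood). The region, endpoint, index name, vector field name, and embedding model ID are my assumptions for illustration, not values from this series; substitute your own.

# Illustrative only: a raw k-NN query against OpenSearch Serverless.
# Endpoint, index, field names, and model ID below are assumptions.
import boto3
from langchain_aws import BedrockEmbeddings
from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth

REGION = "us-east-1"                                      # assumption
HOST = "your-collection-id.us-east-1.aoss.amazonaws.com"  # assumption

# Sign requests with the caller's IAM credentials ("aoss" = OpenSearch Serverless).
credentials = boto3.Session().get_credentials()
auth = AWSV4SignerAuth(credentials, REGION, "aoss")

client = OpenSearch(
    hosts=[{"host": HOST, "port": 443}],
    http_auth=auth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
)

# Convert the user's question into a vector with Amazon Bedrock.
embeddings = BedrockEmbeddings(
    client=boto3.client("bedrock-runtime", region_name=REGION),
    model_id="amazon.titan-embed-text-v1",                # assumption
)
query_vector = embeddings.embed_query("What did the caller ask about billing?")

# Ask the index for the 4 nearest neighbours of the query vector.
response = client.search(
    index="audio-transcriptions",                         # assumption
    body={
        "size": 4,
        "query": {"knn": {"vector_field": {"vector": query_vector, "k": 4}}},
    },
)
for hit in response["hits"]["hits"]:
    print(hit["_score"], hit["_source"].get("text"))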
Here is the requirements.txt:
langchain-aws
langchain-community
opensearch-py
Here is a code snippet of the AWS Lambda function (for demonstration only):
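The snippet below is a minimal, illustrative sketch of such a handler rather than a production function. The environment variable names, index name, and Titan embedding model ID are assumptions; the handler embeds the incoming query with Amazon Bedrock and returns the closest transcript chunks from the OpenSearch Serverless index.

# Demonstration only -- a minimal Lambda handler sketch. Environment
# variable names, index name, and model ID are assumptions.
import json
import os

import boto3
from langchain_aws import BedrockEmbeddings
from langchain_community.vectorstores import OpenSearchVectorSearch
from opensearchpy import AWSV4SignerAuth, RequestsHttpConnection

REGION = os.environ.get("AWS_REGION", "us-east-1")
OPENSEARCH_URL = os.environ["OPENSEARCH_URL"]  # e.g. https://<id>.<region>.aoss.amazonaws.com
INDEX_NAME = os.environ.get("INDEX_NAME", "audio-transcriptions")

# Sign OpenSearch requests with the Lambda execution role.
credentials = boto3.Session().get_credentials()
auth = AWSV4SignerAuth(credentials, REGION, "aoss")

# Use the same embedding model that populated the index in Part 2.
embeddings = BedrockEmbeddings(
    client=boto3.client("bedrock-runtime", region_name=REGION),
    model_id="amazon.titan-embed-text-v1",
)

# LangChain wrapper around the OpenSearch Serverless vector index.
vector_store = OpenSearchVectorSearch(
    opensearch_url=OPENSEARCH_URL,
    index_name=INDEX_NAME,
    embedding_function=embeddings,
    http_auth=auth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
)

def lambda_handler(event, context):
    """Accept {"query": "..."} (directly or via API Gateway) and return
    the most semantically relevant transcript chunks."""
    body = json.loads(event["body"]) if "body" in event else event
    query = body["query"]

    # Embed the query and retrieve the 4 nearest chunks from the index.
    docs = vector_store.similarity_search(query, k=4)

    return {
        "statusCode": 200,
        "body": json.dumps({
            "query": query,
            "results": [d.page_content for d in docs],
        }),
    }

If you also want the model to synthesize an answer from the retrieved chunks rather than return them verbatim, you could pass the results to a Bedrock chat model (for example via ChatBedrock in langchain-aws), which completes the "generation" half of RAG.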

Conclusion

  • Summary: This article explores the implementation of Retrieval-Augmented Generation (RAG) search capabilities using a combination of Amazon Bedrock for embedding and Amazon OpenSearch for efficient retrieval. I demonstrated how transcribed audio content from Amazon Transcribe can be embedded and stored in an Amazon OpenSearch Serverless vector database, enabling advanced semantic search. This design pattern can be applied to any data: by leveraging RAG, the solution provides context-aware answers, going beyond keyword-based searches to deliver more relevant, nuanced responses. This setup offers a scalable and secure approach for real-time applications, enhancing user experiences through improved accuracy and faster retrieval of information. Always keep in mind the security of your GenAI solutions to keep your data secure; AWS GenAI services such as Amazon Bedrock and Amazon OpenSearch, with their full integration with many other services, give confidence that your solution is secure.
     
