How to get started with Knowledge Bases for Amazon Bedrock?
Quick and Seamless way to deploy vector stores for your RAG applications with LLMs
Published Dec 14, 2023
Many customers ask for guidance on selecting a vector store that scales well and is secure, so I would like to share my opinion on this topic today. Vector data stores are typically needed in the Retrieval Augmented Generation (RAG) architecture pattern for generative AI. Because customers need to provide additional context from domain-specific information, which is often proprietary in nature, RAG tends to be the apt pattern for achieving a reasonable level of accuracy and relevance in the generated content. There are several options available for vector data stores, including open-source options such as Weaviate, FAISS, and Milvus, as well as Amazon OpenSearch (Provisioned and Serverless), Amazon Aurora PostgreSQL with the pgvector extension, Pinecone, Redis Enterprise Cloud, MongoDB, and so on.
With so many options, the question is which one to choose. In my opinion, based on several interactions around related use cases, it is worth considering the following:
- Latency of response
- The level of product quantization needed
- The scale of vectors to be stored (tens of billions versus single-digit billions)
- The number of dimensions in the generated vectors
- The indexing algorithms the vector store supports, such as Hierarchical Navigable Small World (HNSW) and Inverted File (IVF)
- The similarity metrics the vector store supports
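The last criterion can be made concrete with a small example. The sketch below (my own illustrative helper functions, not part of any vector store's API) computes cosine similarity and Euclidean (L2) distance, two similarity metrics commonly supported by the stores listed above:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity compares direction only: 1.0 means the vectors
    # point the same way, 0.0 means they are orthogonal.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    # Euclidean (L2) distance: smaller means closer.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))   # 1.0
print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))  # 5.0
```

Which metric is appropriate depends on how your embedding model was trained; many text-embedding models are tuned for cosine similarity.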
In addition, it is essential to plan how you will build the embeddings ingestion pipeline for the one-time conversion of your knowledge corpus, along with regular delta ingestion for new data as it accumulates.
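If you build that pipeline yourself, the core step is invoking an embeddings model for each document. A minimal sketch with boto3, assuming the Titan Embeddings G1 - Text model (model ID "amazon.titan-embed-text-v1") and that your AWS credentials and region are configured; the function names are my own:

```python
import json

def titan_request_body(text):
    # Request payload expected by the Titan Embeddings G1 - Text model.
    return json.dumps({"inputText": text})

def embed(text, region="us-east-1"):
    import boto3  # imported here so the payload helper above has no AWS dependency
    client = boto3.client("bedrock-runtime", region_name=region)
    resp = client.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        body=titan_request_body(text),
        accept="application/json",
        contentType="application/json",
    )
    # The response body is a stream containing a JSON document
    # with the embedding vector.
    return json.loads(resp["body"].read())["embedding"]
```

For delta ingestion, you would run the same call over only the newly accumulated documents and upsert the resulting vectors into your store.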
Once you have settled the above, you also need to develop the code for the prompt and context engineering aspects, where you determine the chunking and overlap strategies and codify them to ensure you get accurate information from the LLM.
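A fixed-size chunking strategy with overlap can be sketched as follows; the chunk_text helper and its default sizes are illustrative, not a prescription:

```python
def chunk_text(text, chunk_size=300, overlap=50):
    # Split text into fixed-size character chunks. The overlap repeats the
    # tail of one chunk at the head of the next, so sentences that straddle
    # a boundary remain retrievable as a whole.
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

# Toy example with tiny sizes to make the overlap visible:
print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij']
```

In practice you would tune chunk size and overlap against your corpus and measure retrieval quality, since both directly affect what context the LLM receives.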
At the recently concluded re:Invent 2023, AWS announced the general availability of Knowledge Bases (KB) for Amazon Bedrock. The KB feature of Amazon Bedrock takes care of quite a bit of the heavy lifting listed above, and it makes evaluating multiple vector stores seamless.
You start by providing a name for the KB in Step 1.
In Step 2, you create a data source; currently, Amazon S3 is the supported location from which you can ingest data, as shown below.
In Step 3, you select the vector data store. The workflow currently offers three types of vector stores. If you are starting fresh, KB can provision an OpenSearch Serverless (OSS) vector store as part of the creation workflow; if you already have an existing OSS, Pinecone, or Redis Enterprise Cloud deployment, you can easily point to that deployment at the same stage of the creation workflow, as shown below.
In Step 4, you review and create the Knowledge Base. It takes a few minutes for the vector data store to be provisioned, after which you are prompted to initiate the data sync from your S3 location. This is when the Titan Embeddings G1 model, currently the default, starts converting the knowledge corpus into embeddings and storing them in the vector data store.
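The same sync can also be triggered programmatically. A sketch using the boto3 "bedrock-agent" client, assuming you already have the Knowledge Base ID and data source ID; the helper names are mine:

```python
# Terminal states of a KB ingestion job.
TERMINAL_STATES = {"COMPLETE", "FAILED"}

def is_done(status):
    return status in TERMINAL_STATES

def start_sync(kb_id, data_source_id, region="us-east-1"):
    import boto3
    client = boto3.client("bedrock-agent", region_name=region)
    # Kicks off the same job as the console's "Sync" button: documents in
    # the S3 data source are embedded and written to the vector store.
    resp = client.start_ingestion_job(
        knowledgeBaseId=kb_id,
        dataSourceId=data_source_id,
    )
    return resp["ingestionJob"]["status"]
```

In practice you would keep the ingestion job ID from the response and poll get_ingestion_job until is_done(status) is true.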
Once the data sync is completed, the status is displayed, and you can begin initial testing of the Knowledge Base using the built-in user interface and its integration with the Claude LLMs (v1 and v2), as shown below.
And there you are: the vector data store is set up in a few minutes. It is now available to integrate with Agents and any of your existing or new applications using SDKs such as boto3 with the "bedrock-agent-runtime" client and APIs like Retrieve and RetrieveAndGenerate.
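For example, a RetrieveAndGenerate call with boto3 can be sketched as below; the helper names are mine, and the Knowledge Base ID and model ARN are placeholders you would replace with your own:

```python
def retrieve_and_generate_request(kb_id, model_arn, question):
    # Request shape for the RetrieveAndGenerate API: the KB retrieves
    # relevant chunks, and the referenced model generates the answer.
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,
                "modelArn": model_arn,
            },
        },
    }

def ask(kb_id, model_arn, question, region="us-east-1"):
    import boto3
    client = boto3.client("bedrock-agent-runtime", region_name=region)
    resp = client.retrieve_and_generate(
        **retrieve_and_generate_request(kb_id, model_arn, question)
    )
    return resp["output"]["text"]
```

The Retrieve API follows the same client but returns only the matched chunks, which is useful when you want to run your own prompt assembly around the retrieved context.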
I hope this walkthrough helps readers get started with KB so you can begin using vector data stores on AWS much faster for your RAG implementations. Good luck, and go build on AWS!