
Using MongoDB Atlas as a Vector Store for Bedrock
Learn how to build an AWS Bedrock Knowledge Base using MongoDB Atlas as a vector store for semantic search over S3 documents.
Published Apr 12, 2025
AWS Bedrock enables the development of AI-driven applications by providing foundational models and integration options that enhance knowledge retrieval. One of its key features is the ability to create Knowledge Bases, which support Retrieval-Augmented Generation (RAG). RAG combines large language models (LLMs) with a document retrieval system, allowing models to generate responses based on relevant content retrieved from a knowledge base.
A Knowledge Base in AWS Bedrock lets businesses query unstructured documents efficiently. By leveraging semantic search over vector embeddings, Bedrock finds the most relevant content dynamically, which is particularly useful for applications that need contextualized responses based on proprietary or domain-specific information. Additionally, Knowledge Bases created in AWS Bedrock can be integrated with other Bedrock components, such as Bedrock Agents and Bedrock Flows.
To illustrate the capabilities of AWS Bedrock, this tutorial walks through the creation of a Knowledge Base designed for a demo company, a vegan bakery. The goal is to build a didactic example that demonstrates how AWS Bedrock retrieves relevant answers from textual data stored in Amazon S3.
- Amazon S3 as a Document Store: Text-based documents containing recipes, ingredient substitutions, and common customer questions will be stored in an S3 bucket.
- MongoDB Atlas as a Vector Database: Embedded representations of document contents will be stored in MongoDB Atlas, enabling efficient similarity searches.
- AWS Bedrock for Knowledge Retrieval: AWS Bedrock will power semantic search and generate AI-driven responses based on stored bakery-related information.
The following diagram illustrates the end-to-end architecture of the Knowledge Base solution using AWS Bedrock and MongoDB Atlas:

This architecture highlights the main components involved in the document ingestion, embedding, and retrieval processes. Users upload documents to an S3 bucket, which are then parsed, chunked, and embedded using a Bedrock model. The resulting vector representations are stored in MongoDB Atlas, enabling efficient semantic search during inference through AWS Bedrock.
By following this tutorial, users will gain hands-on experience with integrating AWS Bedrock and MongoDB Atlas to build a functional Knowledge Base. This project serves as an educational example, demonstrating how AI-powered retrieval systems can enhance customer interactions in a specialized domain, such as a vegan bakery.
The following steps will walk you through setting up a Knowledge Base in Bedrock using Amazon S3 as your document repository and MongoDB Atlas as the vector database.
Select or create an Amazon S3 bucket to serve as the document repository for your Knowledge Base. This bucket will store the text files that the Bedrock Knowledge Base will index and query.
In a collaborative environment with multiple teams, you can structure the S3 storage based on your operational needs. For example, you might create a dedicated bucket for each team—such as sales, technical, or customer support. Alternatively, you can use a single bucket and organize documents by team using separate folders.
This organizational approach will help maintain clarity and ease of access, especially as the volume of documents grows over time.
For this tutorial, we will organize all documents into two subfolders (recipes and company-info) under the S3 bucket. The folder structure will resemble the following:
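A layout along these lines works well (the bucket name matches the one used in Step 7.3; file names are purely illustrative):
```text
s3://sample-s3-bedrock-knowledge-bases/
├── recipes/
│   ├── vegan-chocolate-chip-cookies.txt
│   └── ingredient-substitutions.txt
└── company-info/
    ├── about-the-company.txt
    └── customer-faq.txt
```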
This separation simulates how businesses manage document access based on team responsibilities. For example, only employees responsible for product formulation can upload files to the `recipes` folder, while only administrative team members can update the `company-info` folder.
You can upload documents to Amazon S3 using the AWS Management Console or the AWS CLI, which is more efficient for handling a large number of files.
To upload via the AWS Management Console:
- Navigate to the bucket.
- Select the appropriate folder (e.g., `recipes` or `company-info`).
- Click Upload.
- Drag and drop your files or select them manually to upload.
For bulk uploads, the AWS CLI offers a more streamlined approach. Use the following commands for the `recipes` and `company-info` folders, respectively:
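(The local source paths below are placeholders; adjust them to your own directories.)
```bash
# Upload local recipe documents to the recipes/ prefix
aws s3 cp ./recipes s3://sample-s3-bedrock-knowledge-bases/recipes/ --recursive

# Upload local company documents to the company-info/ prefix
aws s3 cp ./company-info s3://sample-s3-bedrock-knowledge-bases/company-info/ --recursive
```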
For more details, refer to the official AWS documentation: AWS CLI S3 Reference
Below are example documents to demonstrate the type of files that can be stored and retrieved in this Knowledge Base setup.
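The sample files are simple .txt documents; a recipe file, for instance, could contain something like this purely illustrative snippet:
```text
# recipes/vegan-chocolate-chip-cookies.txt (illustrative content)
Our vegan chocolate chip cookies are made with oat flour, coconut oil,
maple syrup, ground flaxseed (as an egg substitute), and dairy-free
chocolate chips. For a gluten-free version, substitute certified
gluten-free oat flour.
```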
Before integrating MongoDB Atlas with AWS Bedrock, ensure you have an active MongoDB Atlas account and a configured cluster.
Step 3.1. Log in or Sign Up
Access your MongoDB Atlas account at https://cloud.mongodb.com. Sign up if you don't already have an account.
Step 3.2. Create a Project
Click Create Project and define the following:
- Name: `Bedrock`
- Add Members and Set Permissions: Optionally invite collaborators and configure access levels.


Step 3.3. Create a Cluster
Follow the on-screen instructions to create a new cluster. Choose a region close to your AWS services for reduced latency.
Cluster configuration:
- Cluster Type: Choose based on your requirements. This tutorial uses the Free Tier.
- Name: `knowledgebase`
- Provider: AWS
- Region: `us-east-1`
Performance Consideration:
While an M10 or higher cluster is recommended for production, both M10 and Free Tier clusters were tested for this tutorial:
- M10 Cluster: Delivered fast, consistent responses without issues.
- Free Tier Cluster: Functional for proof of concept. Some latency and occasional network errors were observed. In one case, a retry resolved the issue without changes.
Recommendation: Use the Free Tier for development or learning purposes. Opt for M10 or higher for production deployments.


Step 3.4. Create a Database User
Create a user with the necessary credentials:
- Username: `<define-user-name>`
- Password: `<use-a-strong-password>`

Step 3.5. Configure Network Access
For this tutorial, allow access from all IPs (`0.0.0.0/0`) to simplify connectivity. For production, configure VPC peering or restricted IP access to enhance security.
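Before moving on, you can optionally verify that the new user and network rules work by connecting with mongosh (assuming it is installed locally); the hostname below uses the placeholder format shown later in this tutorial:
```bash
# Connect with the database user created in Step 3.4 (hostname and username are placeholders)
mongosh "mongodb+srv://knowledgebase.xxxx.mongodb.net/" --username <define-user-name>
```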
Step 4.1. Access Your Cluster Collections
From the MongoDB Atlas dashboard, navigate to your cluster. Click Browse Collections to view and manage your database collections.

Step 4.2. Create a New Database and Collection
Click Add My Own Data. In the dialog that appears, provide the following:
- Database Name: `bedrock`
- Collection Name: `knowledge`
This will create the initial database and collection structure required for storing the vector embeddings.
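As an alternative to the UI, the same database and collection can be created from the shell; a quick sketch with mongosh (hostname and username are placeholders) might be:
```bash
# Create the bedrock database and knowledge collection via mongosh
mongosh "mongodb+srv://knowledgebase.xxxx.mongodb.net/" --username <define-user-name> \
  --eval 'db.getSiblingDB("bedrock").createCollection("knowledge")'
```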


To enable vector search functionality, you need to define a vector search index in MongoDB Atlas.
Step 5.1. Navigate to the Atlas Search Tab
Access your cluster in the MongoDB Atlas dashboard and open the Atlas Search tab. Click the Create Search Index button to begin.

Step 5.2. Configure the Index Settings
- Search Type: Select Vector Search.
- Database and Collection: Choose the database and collection created in Step 4 (`bedrock.knowledge`).
- Configuration Method: Select JSON Editor.
Proceed to the next step.

Step 5.3. Define the Index Schema
In the JSON Editor, replace the default configuration with the following definition:
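A minimal definition that matches the field names and dimensions used later in this tutorial (vector field `embedding`, 1024 dimensions for Titan Text Embeddings V2, plus an optional filter on `metadata`) looks like the sketch below; when prompted for an index name, use `vector_index`, which Step 7.4 references:
```json
{
  "fields": [
    {
      "type": "vector",
      "path": "embedding",
      "numDimensions": 1024,
      "similarity": "cosine"
    },
    {
      "type": "filter",
      "path": "metadata"
    }
  ]
}
```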
Click Next, review the configuration, and click Create Search Index to finalize the setup.

Note on numDimensions:
Adjust the `numDimensions` value to match the embedding model you intend to use in AWS Bedrock. This tutorial uses Titan Text Embeddings V2, which supports 1024 dimensions. The supported dimension values were obtained directly from the AWS Bedrock console. Below are common configurations:
- Titan Text Embeddings V2: 1024, 512, or 256
- Titan Embeddings G1 - Text V1.2: 1536
- Embed English V3: 1024
- Embed Multilingual V3: 1024
For more details, refer to the AWS documentation: Titan Embedding Models
To securely store MongoDB credentials for use with AWS Bedrock, create a secret in AWS Secrets Manager.
Step 6.1. Access AWS Secrets Manager
In the AWS Management Console, navigate to Secrets Manager and select Secrets from the sidebar.
Step 6.2. Choose Secret Type
Select Other type of secret. Then, define the key-value pairs as follows:
- Key: `username`, Value: the database username created in Step 3.4
- Key: `password`, Value: the corresponding password
Note: The Key field is case-sensitive. Make sure to use lowercase for the keys: `username` and `password`.
Step 6.3. Configure the Secret Details
Provide the following configuration:
- Secret Name: `dev/mongodb/knowledgebase`
- Rotation: Select Do not enable automatic rotation
This secret will later be referenced in the Bedrock Knowledge Base configuration.
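If you prefer the command line, the same secret can be created with the AWS CLI; the credential values below are placeholders:
```bash
# Create the secret with lowercase keys "username" and "password" (values are placeholders)
aws secretsmanager create-secret \
  --name dev/mongodb/knowledgebase \
  --secret-string '{"username":"<define-user-name>","password":"<use-a-strong-password>"}'
```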

With MongoDB Atlas configured, the next step is to create a Knowledge Base in AWS Bedrock.
Step 7.1. Start the Knowledge Base Creation Process
In the AWS Bedrock Console, navigate to Builder tools → Knowledge bases and click Create knowledge base. Choose Knowledge Base with vector store as the setup option.

Step 7.2. Provide Basic Configuration Details
- Name: `bakery-knowledge-base`
- Description: Centralized knowledge base for bakery documents.
- IAM Permissions: Choose Create a new service role
- Data Source: Select Amazon S3

Step 7.3. Configure the Data Source
- Data Source Name: `bakery-data-source`
- S3 URI: `s3://sample-s3-bedrock-knowledge-bases/`
Note: Using the entire bucket as a single data source allows centralized updates across all team folders. Alternatively, you may create separate data sources per folder or team-specific bucket to allow independent updates. For simplicity, this tutorial uses one data source for the full bucket.
- Parsing Strategy: Choose Amazon Bedrock default parser
  - Suitable for plain text documents such as `.txt`
  - Other options include Amazon Bedrock Data Automation and Foundation models, which support visually rich documents but incur additional costs. See Bedrock Pricing for details.
- Chunking Strategy: Select how the documents should be split before embedding. Available options:
  - Default chunking
  - Fixed-size chunking
  - Hierarchical chunking
  - Semantic chunking
  - No chunking
For this tutorial, use the Default chunking strategy.

Step 7.4. Configure Storage and Processing Details
- Embeddings Model: Choose Titan Text Embeddings V2, ensuring alignment with the `numDimensions` value set in Step 5.3.
  Click Additional configurations under the selected model to confirm the number of vector dimensions supported or required. Some models allow editing this value; others have it fixed.

- Vector Database: Choose Use an existing vector store, then select MongoDB Atlas.
Provide the MongoDB Atlas connection details:
- Hostname: `<clusterName>.<shardIdentifier>.mongodb.net` (e.g., `knowledgebase.xxxx.mongodb.net`)
  - You can find this value by clicking Connect in your MongoDB Atlas cluster and copying the connection string.


- Database Name: `bedrock`
- Collection Name: `knowledge`
- Credentials Secret ARN: Provide the ARN of the secret created in Step 6

Metadata Field Mapping:
- Vector Search Index name: `vector_index`
- Vector Embedding Field path: `embedding`
- Text Field Path: `text_chunk`
- Metadata Field Path: `metadata`
Note: Indexing duration may vary. For the sample dataset, expect approximately 4 minutes per Knowledge Base.

Once the Knowledge Base is created, the next step is to synchronize and test its functionality.
Step 8.1. Access the Knowledge Base
In the AWS Bedrock Console, go to Builder tools → Knowledge bases and open your Knowledge Base.
Step 8.2. Synchronize the Knowledge Base
- Select the associated data source.
- Click Sync to begin the synchronization process.
The sync operation typically completes in seconds, depending on the number and size of the documents. Once the status indicates success, the data is indexed and ready for retrieval.
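Syncing can also be triggered from the AWS CLI, which is handy for automation; both IDs below are placeholders you can copy from the Knowledge Base details page:
```bash
# Start an ingestion job (sync) for the data source
aws bedrock-agent start-ingestion-job \
  --knowledge-base-id <knowledge-base-id> \
  --data-source-id <data-source-id>
```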

Step 8.3. Test the Knowledge Base
Scroll to the Test Knowledge Base section within the same page. Select a supported LLM (e.g., Amazon Nova Lite) and run sample queries such as:
- "When was Kind Bites founded?"
- "What ingredients are used in your vegan chocolate chip cookies?"
The model will return answers based on the indexed content. This provides a convenient way to validate whether the synchronization and document parsing were successful.
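The same validation can be done programmatically. A minimal sketch with the AWS CLI (the Knowledge Base ID is a placeholder) retrieves the most relevant chunks for a question:
```bash
# Retrieve relevant chunks from the Knowledge Base for a sample question
aws bedrock-agent-runtime retrieve \
  --knowledge-base-id <knowledge-base-id> \
  --retrieval-query '{"text": "When was Kind Bites founded?"}'
```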
For users who wish to inspect the underlying data structure or confirm the vector storage, MongoDB Atlas offers visibility into the indexed documents.
Step 9.1. Open the Collection
From the MongoDB Atlas dashboard, navigate to your cluster and click Browse Collections. Locate the `bedrock.knowledge` collection.
Step 9.2. Review Document Structure
Each document stored by the Knowledge Base typically contains:
- `embedding`: The vector representation of a text chunk (e.g., a 1024-dimensional array)
- `text_chunk`: The portion of content that was embedded
- `metadata`: Information such as the S3 file path
Example:
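The values below are abbreviated and illustrative; the exact metadata keys written by Bedrock may differ:
```json
{
  "_id": { "$oid": "6616a1f2e4b0c5a9d8f3e2b1" },
  "embedding": [0.0123, -0.0456, 0.0789, "..."],
  "text_chunk": "Our vegan chocolate chip cookies are made with oat flour, coconut oil, ...",
  "metadata": {
    "source": "s3://sample-s3-bedrock-knowledge-bases/recipes/vegan-chocolate-chip-cookies.txt"
  }
}
```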
This step is optional but recommended for troubleshooting and understanding the internal indexing mechanism of the system.

This tutorial provided a step-by-step guide to building a Knowledge Base using AWS Bedrock, Amazon S3, and MongoDB Atlas as a vector store. By using MongoDB Atlas, you gain flexible indexing, semantic search, and scalable storage — ideal for RAG-based AI applications.
For further guidance, refer to the documentation or support teams of AWS and MongoDB.