Select your cookie preferences

We use essential cookies and similar tools that are necessary to provide our site and services. We use performance cookies to collect anonymous statistics, so we can understand how customers use our site and make improvements. Essential cookies cannot be deactivated, but you can choose “Customize” or “Decline” to decline performance cookies.

If you agree, AWS and approved third parties will also use cookies to provide useful site features, remember your preferences, and display relevant content, including relevant advertising. To accept or decline all non-essential cookies, choose “Accept” or “Decline.” To make more detailed choices, choose “Customize.”

AWS Logo
Menu
Processing WhatsApp Multimedia with Amazon Bedrock Agents: Images, Video, and Documents

Processing WhatsApp Multimedia with Amazon Bedrock Agents: Images, Video, and Documents

Build a WhatsApp AI assistant using Amazon Bedrock and Amazon Nova models to processes multimedia content such as images, videos, documents, and audio. This serverless solution uses AWS End User Messaging for direct integration.

Elizabeth Fuentes
Amazon Employee
Published Feb 21, 2025
Amazon Bedrock can now process various content types through the Amazon Nova Model, enabling you to create AI assistants that understand context across different media formats. This post will demonstrate how to build a WhatsApp assistant that analyzes images, processes videos, extracts information from documents, and transcribes audio messages, all while maintaining context throughout the conversation with Amazon Bedrock Agents.
You'll learn how to combine Amazon Bedrock with AWS End User Messaging for direct WhatsApp integration, creating a serverless solution that eliminates the need for additional API layers.
Your data will be securely stored in your AWS account and will not be shared or used for model training. It is not recommended to share private information because the security of data with WhatsApp is not guaranteed.
Image not found
Image not found
Video
Image not found
Image not found
image
You can see the animated demo in the original repository: private-assistant-v2/README.md
AWS Level: Advanced - 300
Prerequisites:

🤔 How The App Works

Image not found

Infrastructure

The project uses AWS AWS Cloud Development Kit (CDK) to define and deploy the following resources:
The infrastructure is defined in the PrivateAssistantV2Stack class within the private_assistant_v2_stack.py file.

Data Flow

  1. User sends a WhatsApp message.
  2. Message is published to the SNS Topic.
  3. whatsapp_in AWS Lambda function is triggered.
  4. Message is processed based on its type:
    • Text: Sent directly to Amazon Bedrock Agent.
    • Audio: Transcribed using Amazon Transcribe, once the transcribe job is done.transcriber_done Lambda function is triggered and then sent the text to Amazon Bedrock Agent.
    • Image/Video/Document: Stored in S3, then analyzed by Amazon Bedrock Agent converse API, save the input and response as ConversationHistory Contents in an AgentHistory Amazon DynamoDB table.
  5. bedrock_agent Lambda function processes the message and generates a response
  6. Response is sent back to the user via WhatsApp.

💰 For pricing details, see:

Key Files:

  • app.py: Entry point for the CDK application.
  • private_assistant_v2_stack.py: Main stack definition for the AI assistant.
  • lambdas/code/: Contains Lambda functions for processing WhatsApp messages, invoking Bedrock Agent, and handling transcriptions.
  • layers/: Contains shared code and dependencies for AWS Lambda functions.
  • agent_bedrock/create_agent.py: Defines the Bedrock Agent configuration.

🧰 Usage Instructions

Installation

⏱️ Estimated time to complete: 10-15 minutes
Clone the repository:
git clone https://github.com/build-on-aws/building-gen-ai-whatsapp-assistant-with-amazon-bedrock-and-python cd private_assistant_v2
Create and activate a virtual environment:
python3 -m venv .venv source .venv/bin/activate # On Windows, use `.venv\Scripts\activate`
Install dependencies:
pip install -r requirements.txt
Synthesize The Cloudformation Template With The Following Command:
cdk synth
The Deployment🚀:
cdk deploy
Note the output values, especially the SNS Topic ARN, which will be used for configuring the WhatsApp integration.

🧰 Configuration

Step 0: Activate WhatsApp account Facebook Developers

Step 1: APP Set Up

Set up a WhatsApp Business account by follow the [Getting started with AWS End User Messaging Social steps](https://docs.aws.amazon.com/social-messaging/latest/userguide/getting-started-whatsapp.html and configure it to send messages to the SNS Topic created by this stack.

Step 2: Customize the Bedrock Agent's behavior (optional).

Update the agent_data.json file in the private_assistant_v2/ directory to customize the Bedrock Agent's behavior.

Step 3: Adjust environment variables (optional).

  1. Adjust environment variables in private_assistant_v2_stack.py if needed, such as S3 bucket prefixes or DynamoDB table names.

Testing

To test the AI assistant:
  1. Send a WhatsApp message to the configured phone number.
  2. The message will be processed by the whatsapp_in Lambda function.
  3. For text messages, the Bedrock Agent will be invoked directly.
  4. For audio messages, they will be transcribed using Amazon Transcribe before being sent to the Bedrock Agent through the bedrock_agent Lambda function.
  5. For images, videos, and documents, they will be stored in S3 and analyzed by the Bedrock converse API through the bedrock_agent Lambda function.
  6. The assistant's response will be sent back to the user via WhatsApp.

🧹 Clean up:

If you finish testing and want to clean the application, you just have to follow these two steps:
  1. Delete the files from the Amazon S3 bucket created in the deployment.
  2. Run this command in your terminal:
cdk destroy

📚 Some links for more information:

🇻🇪🇨🇱 ¡Gracias!

Eli :)
 

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.

Comments

Log in to comment