
Create Multi-Host Podcasts Using Amazon Bedrock
Perfect for content creators looking to produce engaging, conversational podcasts without the need for multiple human hosts or complex recording sessions.
Shanks Subramaniam
Amazon Employee
Published Mar 14, 2025
This solution generates a multi-host podcast by combining large language models for script creation (based on a user-provided text corpus), Amazon Polly for realistic voice synthesis, and AWS Elemental MediaConvert for audio production.
To implement this solution, the following components are required:
- Access to Amazon S3, Amazon Polly, and AWS Elemental MediaConvert
- Appropriate IAM roles for MediaConvert
- Development environment (VS Code used for this example)
- Python 3.9.19 or later
The process follows these main steps:
- Prepare the prompt for the large language model (LLM) based on the text corpus
- Generate the podcast script using an LLM
- Convert text to speech using Amazon Polly
- Combine audio files using AWS Elemental MediaConvert
- Download the final podcast file
Script Generation: In this example, Amazon Bedrock with the Claude 3.5 model (customizable) is accessed via the Amazon Bedrock Converse API. The output is a structured JSON object containing speaker-specific dialogue, and multiple speakers are supported (this example uses four). Note that the speaker names used in the prompt correspond to Amazon Polly voice IDs (e.g., Danielle, Stephen, Ruth). You can change the speakers to suit your requirements, but each one must map to a standard Amazon Polly voice in the code.
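To illustrate the call pattern, here is a minimal sketch of this step using the Converse API in boto3. The model ID, the helper name, and the expectation that the model replies with JSON only are assumptions for this sketch, not taken from generate-podcast.py:

```python
import json
import boto3

# Bedrock Runtime client; the region and model ID below are assumptions for this sketch.
bedrock = boto3.client("bedrock-runtime", region_name="us-west-2")
MODEL_ID = "anthropic.claude-3-5-sonnet-20240620-v1:0"

def generate_script(prompt: str) -> list:
    """Send the prompt to the model via the Converse API and parse its JSON reply."""
    response = bedrock.converse(
        modelId=MODEL_ID,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"maxTokens": 4096, "temperature": 0.7},
    )
    # The Converse API returns the assistant message under output.message.content.
    text = response["output"]["message"]["content"][0]["text"]
    return json.loads(text)  # assumes the prompt instructs the model to emit JSON only

with open("llm_prompt.txt") as f:
    dialogue = generate_script(f.read())
```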
Text-to-Speech Processing: Uses Amazon Polly's generative engine, processes each dialogue segment individually, creates an MP3 file for each conversation turn, and stores the audio files in the designated S3 bucket.
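A minimal sketch of this step, assuming the script is a list of {speaker, text} turns; the bucket name and key layout are placeholders:

```python
import boto3

polly = boto3.client("polly", region_name="us-west-2")
s3 = boto3.client("s3")
S3_BUCKET = "your-podcast-bucket"  # placeholder; use your bucket name

def synthesize_turns(dialogue: list) -> list:
    """Synthesize each conversation turn with Polly's generative engine and upload it to S3."""
    keys = []
    for i, turn in enumerate(dialogue):
        resp = polly.synthesize_speech(
            Engine="generative",
            VoiceId=turn["speaker"],  # e.g., Danielle, Stephen, Ruth
            OutputFormat="mp3",
            Text=turn["text"],
        )
        key = f"segments/turn_{i:03d}.mp3"
        s3.put_object(Bucket=S3_BUCKET, Key=key, Body=resp["AudioStream"].read())
        keys.append(key)
    return keys
```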
Audio Assembly: Generates a MediaConvert job specification dynamically, combines the individual MP3 files into a single podcast, and automatically handles variable conversation lengths.
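Conceptually, the assembly step builds one MediaConvert input per audio segment and a single MP3 output; because the inputs are generated from the segment list, any number of turns is handled. A hedged sketch follows (the ARNs and bucket are placeholders, and the exact job settings in generate-podcast.py may differ):

```python
import boto3

S3_BUCKET = "your-podcast-bucket"  # placeholder
QUEUE_ARN = "arn:aws:mediaconvert:us-west-2:111122223333:queues/Default"  # placeholder
ROLE_ARN = "arn:aws:iam::111122223333:role/service-role/MediaConvert_Default_Role"  # placeholder

# MediaConvert uses an account-specific endpoint, discovered via describe_endpoints.
mc = boto3.client("mediaconvert", region_name="us-west-2")
endpoint = mc.describe_endpoints()["Endpoints"][0]["Url"]
mc = boto3.client("mediaconvert", region_name="us-west-2", endpoint_url=endpoint)

def assemble_podcast(segment_keys: list) -> str:
    """Stitch the per-turn MP3s into one file; MediaConvert concatenates multiple inputs in order."""
    inputs = [
        {
            "FileInput": f"s3://{S3_BUCKET}/{key}",
            "AudioSelectors": {"Audio Selector 1": {"DefaultSelection": "DEFAULT"}},
            "TimecodeSource": "ZEROBASED",
        }
        for key in segment_keys
    ]
    job = mc.create_job(
        Queue=QUEUE_ARN,
        Role=ROLE_ARN,
        Settings={
            "Inputs": inputs,
            "OutputGroups": [{
                "Name": "File Group",
                "OutputGroupSettings": {
                    "Type": "FILE_GROUP_SETTINGS",
                    "FileGroupSettings": {"Destination": f"s3://{S3_BUCKET}/output/podcast"},
                },
                "Outputs": [{
                    "ContainerSettings": {"Container": "RAW"},  # audio-only MP3 output
                    "AudioDescriptions": [{
                        "AudioSourceName": "Audio Selector 1",
                        "CodecSettings": {
                            "Codec": "MP3",
                            "Mp3Settings": {"Bitrate": 192000, "Channels": 2,
                                            "RateControlMode": "CBR", "SampleRate": 48000},
                        },
                    }],
                }],
            }],
        },
    )
    return job["Job"]["Id"]
```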
- This example was tested in the us-west-2 region.
- Create an S3 bucket for storage. Note the bucket name; this will be needed later.
- Set up a MediaConvert queue (the default queue works as well). Note the queue ARN; this will be needed later.
- Set up an IAM role that grants access to the AWS Elemental MediaConvert service (see the documentation). Note the role ARN; this will be needed later.
- Ensure the IAM role you use to run this example has the necessary permissions for S3, MediaConvert, and Amazon Bedrock.
- Ensure the Claude 3.5 model is available in the region where you are running this example and that you have been granted access to it through Amazon Bedrock.
- Configure the AWS CLI on your workstation (for example, by running aws configure).
Launch Visual Studio Code
Create a new folder and add these four files.
llm_prompt.txt
Replace "INSERT YOUR TEXT HERE" with the text corpus based on which you would like to create the podcast. LLM will use this text corpus and create a conversational output in json format.
podcast-text-sample.json
Example JSON output from the LLM (a hedged sketch of this structure appears after this file list)
requirements.txt
generate-podcast.py
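For orientation, the LLM output captured in podcast-text-sample.json might look something like the sketch below; the exact field names are an assumption based on the script-generation step described above, so check the sample file for the real structure:

```json
[
  { "speaker": "Danielle", "text": "Welcome to the show! Today we're digging into..." },
  { "speaker": "Stephen", "text": "Thanks, Danielle. To set the stage for our listeners..." },
  { "speaker": "Ruth", "text": "One detail worth highlighting right away is..." }
]
```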
Note: Review the main() method and update the following:

```python
s3_bucket_name = "<Name of your bucket>"
media_convert_q = "arn:aws:mediaconvert:<your AWS region>:<your AWS account number>:queues/Default"
iam_role_name = "arn:aws:iam::<your AWS account number>:role/service-role/MediaConvert_Default_Role"
# (Optional) Set max_turns > 0 to limit the number of conversation turns in the podcast; this is useful during testing.
max_turns = 0
```
In the VS Code terminal, set up a virtual environment and install the dependencies by running the commands below.
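The original commands are not reproduced here; on macOS or Linux, a typical setup looks like this (adjust for your platform):

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```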
Run the script
python generate-podcast.py
During execution, the output of each step is stored locally in a file so you can review it independently; it is also printed to the terminal. Once execution completes, the final MP3 audio file is downloaded to your workstation.
When you are finished, run this to exit the virtual environment and return to your shell.
deactivate
This solution demonstrates a simple yet robust implementation of creating multi-host podcasts using AWS services. Potential enhancements include:
- Experimenting with different voice engines and modulations
- Implementing long-form speech capabilities
- Adding multi-language support through translation
- Customizing voice characteristics
I hope you found this post useful for your own projects. I would love to hear your thoughts, feature ideas, and suggestions for improvement, so feel free to leave a comment below or reach out to me on LinkedIn.
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.