How I built my own custom AI-generated podcast using AWS Bedrock, Polly, and Python

Building your own automated podcasts is now easy with AWS generative AI

Published Feb 25, 2024
Last Modified Mar 11, 2024
If you're looking for a quirky, short podcast that presents the latest news about new AWS serverless features, here's how to build one yourself using Amazon Bedrock (with a Claude 3 model) to write the script and Amazon Polly for text-to-speech!
If you want to hear an example of the finished product, here's an mp3. The example was created using this recent AWS news post.
The podcast has two AI-generated hosts, "Quinn" and "Ashcroft", voiced by Amazon Polly text-to-speech voices that read a Bedrock-generated script. The script is generated from stories on the official AWS News site, in an irreverent style that I find fun to listen to.
Here's how I did it!
I first hooked up the AWS news RSS feed to a custom Python parser to extract the links to only the serverless news posts. For this I used the Python library "Beautiful Soup 4", instantiating the object with the XML parser (this requires the Python lxml package to be installed as well):
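from bs4 import BeautifulSoup

# aws_feed is the HTTP response for the AWS news RSS feed
soup = BeautifulSoup(aws_feed.text, features='xml')
rss_items = soup.find_all('item')
for item in rss_items:
    # Keep only the items whose category mentions serverless
    category = item.find('category').text
    if 'serverless' in category:
        post_link = item.find('link').text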
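The filtering step above can also be sketched as a small, dependency-free function. This is only an illustration, not the code the podcast runs: it swaps Beautiful Soup for the standard library's `xml.etree.ElementTree`, the function name `serverless_links` is hypothetical, and the sample feed (including its category strings) is made up for the example. It also tolerates items with several `category` elements or none at all.

```python
import xml.etree.ElementTree as ET


def serverless_links(feed_xml: str, keyword: str = "serverless") -> list[str]:
    """Return the <link> of every RSS <item> with a category containing keyword."""
    root = ET.fromstring(feed_xml)
    links = []
    for item in root.iter("item"):
        # An item may carry zero, one, or several <category> elements.
        categories = [c.text or "" for c in item.findall("category")]
        link = item.findtext("link")
        if link and any(keyword in c.lower() for c in categories):
            links.append(link.strip())
    return links


# Hypothetical three-item feed, invented for this example.
SAMPLE_FEED = """<rss version="2.0"><channel>
<item><title>A</title><link>https://example.com/a</link>
<category>general:products/aws-lambda,marketing:marchitecture/serverless</category></item>
<item><title>B</title><link>https://example.com/b</link>
<category>general:products/amazon-ec2</category></item>
<item><title>C</title><link>https://example.com/c</link></item>
</channel></rss>"""

print(serverless_links(SAMPLE_FEED))  # ['https://example.com/a']
```

Returning a list of links, rather than overwriting a single `post_link` variable, makes it easier to handle days with more than one serverless post.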
Once I had the links to the news posts, I used the Python requests library to grab the content of the news posts:
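import requests
from bs4 import BeautifulSoup

# Fetch the news post and parse its HTML
post_page = requests.get(post_link)
post_soup = BeautifulSoup(post_page.text, "html.parser")
# The post body lives in "aws-text-box" divs inside the main content element
text_boxes = post_soup.find("main", {"id": "aws-page-content-main"}).find_all("div", {"class": "aws-text-box"})

# prompt is the string that will be sent to the model
for box in text_boxes:
    prompt += box.text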
After getting the post text for the LLM prompt, I appended the beginning of a script that alternates between the two 'hosts'. I then used the boto3 Bedrock runtime client to generate a script based on the prompt (I used Claude 3 Sonnet as the model, but you can use any model you want, as long as you adjust the request body to that model's format):
import json
import boto3

bedrock_client = boto3.client('bedrock-runtime', region_name="us-east-1")

# Request body in the Anthropic Messages format used by Claude 3 on Bedrock
body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 8000,
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": prompt
                }
            ]
        }
    ],
    "system": system_message
})

response = bedrock_client.invoke_model(
    body=body,
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    accept="application/json",
    contentType="application/json"
)

# The generated script is the text of the first content block in the response
script = json.loads(response['body'].read().decode('utf-8'))["content"][0]["text"]

# intro_spiel is a list of [voice, text] pairs; copy it so the original is not modified
full_podcast = list(intro_spiel)
# Map each host's lines to a Polly voice: Quinn -> Ruth, Ashcroft -> Gregory
for line in script.splitlines():
    if "Quinn: " in line:
        full_podcast.append(["Ruth", line.split("Quinn: ")[1].strip()])
    if "Ashcroft: " in line:
        full_podcast.append(["Gregory", line.split("Ashcroft: ")[1].strip()])
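The host-to-voice mapping at the end of that block can be sketched as a standalone function, which makes it easy to try out on sample text. This is a hedged variation rather than the exact logic above: the name `parse_script` and the `VOICES` dictionary are hypothetical, and the function only treats the text before the first colon as the speaker, so a line where one host mentions the other by name followed by a colon is attributed once instead of twice.

```python
def parse_script(script: str, voices: dict[str, str]) -> list[list[str]]:
    """Turn 'Host: text' lines into [voice, text] pairs, skipping other lines."""
    segments = []
    for line in script.splitlines():
        # Only the text before the first colon is treated as the speaker name.
        speaker, sep, text = line.strip().partition(":")
        voice = voices.get(speaker.strip())
        if sep and voice and text.strip():
            segments.append([voice, text.strip()])
    return segments


# Host names and Polly voices as used in the post.
VOICES = {"Quinn": "Ruth", "Ashcroft": "Gregory"}

# Made-up script fragment, including a blank line and a stage direction.
sample = "Quinn: Welcome back!\n\n(laughs)\nAshcroft: Thanks, Quinn: glad to be here.\n"
print(parse_script(sample, VOICES))
```

Lines that do not start with a known host name, such as blank lines or stage directions, are dropped, so they never reach the text-to-speech step.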
With the output parsed into line-by-line chunks, I sent each chunk through one of two Polly voices ("Ruth" and "Gregory", with the "long-form" engine), and then concatenated the resulting audio with the Python library pydub:
import os
from pathlib import Path

import boto3
from pydub import AudioSegment

polly_client = boto3.client('polly', region_name="us-east-1")

# Synthesize each line to its own numbered mp3 file: 1.mp3, 2.mp3, ...
i = 1
for voice, text in full_podcast:
    print(voice, text)
    speech_file_path = Path(__file__).parent / f"{i}.mp3"
    print(speech_file_path)
    print("sending request")
    response = polly_client.synthesize_speech(
        Engine="long-form",
        OutputFormat='mp3',
        Text=text,
        TextType='text',
        VoiceId=voice,
    )

    with open(speech_file_path, 'wb') as file:
        file.write(response['AudioStream'].read())

    i += 1

# Concatenate the clips in order, deleting each temporary file once it is loaded
full_pod = AudioSegment.from_mp3(Path(__file__).parent / "1.mp3")
os.remove(Path(__file__).parent / "1.mp3")
for pos in range(2, i):
    full_pod += AudioSegment.from_mp3(Path(__file__).parent / f"{pos}.mp3")
    os.remove(Path(__file__).parent / f"{pos}.mp3")

# pod_file_name is the output path of the finished episode
full_pod.export(pod_file_name, format="mp3")
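One thing to watch in the synthesis loop above is the length of each line, since Polly caps how much text a single `synthesize_speech` request accepts. A minimal sketch of a splitter follows. The helper name `split_for_tts` is hypothetical, and the default of 3,000 characters is an assumption based on Polly's documented quota for billed characters, so check the current limits before relying on it.

```python
def split_for_tts(text: str, limit: int = 3000) -> list[str]:
    """Split text into pieces of at most `limit` characters, breaking on spaces."""
    chunks, current = [], ""
    for word in text.split():
        # A single word longer than the limit has to be cut mid-word.
        while len(word) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:limit])
            word = word[limit:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


# Tiny limit so the behavior is visible on a short string.
print(split_for_tts("one two three four", limit=9))  # ['one two', 'three', 'four']
```

Each chunk could then be sent to Polly with the same voice and written to its own numbered file, so the concatenation step would not need to change.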
Finally, I put all this code into a scheduled AWS Lambda function, which runs daily and checks the RSS feed to see if there are any serverless news updates. If you're curious about how I deploy serverless Lambda stacks, check out this article on using AWS CDK to easily build and maintain serverless applications.
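The daily check could be sketched as a date filter on each feed item. This is an assumption about how "new" might be decided, not a description of the deployed function: it assumes every item carries a standard RSS `pubDate` string, and the helper name `is_recent` is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_recent(pub_date: str, now: datetime, window: timedelta = timedelta(days=1)) -> bool:
    """Return True if an RSS pubDate string falls within `window` before `now`."""
    published = parsedate_to_datetime(pub_date)
    if published.tzinfo is None:
        # Treat dates without a usable zone as UTC (an assumption).
        published = published.replace(tzinfo=timezone.utc)
    return timedelta(0) <= now - published <= window


# Fixed "now" so the example is reproducible.
now = datetime(2024, 3, 12, 6, 0, tzinfo=timezone.utc)
print(is_recent("Mon, 11 Mar 2024 18:00:00 +0000", now))  # True
print(is_recent("Sat, 09 Mar 2024 18:00:00 +0000", now))  # False
```

In a scheduled run, `now` would be `datetime.now(timezone.utc)`, and only the items that pass this check would go on to script generation.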
You now know how to build your own goofy podcast for all the latest AWS serverless news. I know the style I used isn't for everyone, so experiment with your own style and have fun!
 
