How I built my own custom AI-generated podcast using AWS Bedrock, Polly, and Python

If you're looking for a quirky, short podcast that presents all the latest news about new AWS serverless features, here's how to build one yourself using AWS Bedrock (with a Claude 3 model) and AWS Polly for text-to-speech!

If you want to hear an example of the finished product, here's an mp3. The example was created using this recent AWS news post.

The podcast has two AI-generated hosts, "Quinn" and "Ashcroft", which are AWS Polly TTS models that read a Bedrock-generated script. The script is generated from stories on the official AWS News site, in an irreverent style that I find fun to listen to.

Here's how I did it!

I first fetched the AWS news RSS feed and ran it through a custom Python parser to extract the links to only the serverless news posts. For the parsing I used the Python library "Beautiful Soup 4", instantiating the object with the xml parser (this requires the Python lxml package to be installed as well):

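import requests
from bs4 import BeautifulSoup

# aws_feed is the requests response for the AWS news RSS feed
soup = BeautifulSoup(aws_feed.text, features='xml')
rss_items = soup.find_all('item')
for item in rss_items:
  category = item.find('category')
  # Skip items that have no category element at all
  if category is not None and 'serverless' in category.text:
    post_link = item.find('link').text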
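
As a rough, dependency-free illustration of the same filtering idea, the sketch below swaps Beautiful Soup for the standard library's `xml.etree.ElementTree` and checks every `category` element of an item rather than only the first. The function name `serverless_links`, the sample feed, and its category strings are made up for this example and are not taken from the real AWS feed.

```python
import xml.etree.ElementTree as ET


def serverless_links(feed_xml, keyword="serverless"):
    """Return the <link> of every RSS <item> whose categories mention keyword."""
    root = ET.fromstring(feed_xml)
    links = []
    for item in root.iter("item"):
        # An item may carry zero, one, or several <category> elements
        categories = [c.text or "" for c in item.findall("category")]
        link = item.findtext("link")
        if link and any(keyword in c.lower() for c in categories):
            links.append(link.strip())
    return links


# Made-up sample feed, only for demonstration
SAMPLE_FEED = """<rss version="2.0"><channel>
  <item><title>A</title><link>https://example.com/a</link>
    <category>marketing:marchitecture/serverless</category></item>
  <item><title>B</title><link>https://example.com/b</link>
    <category>general:products/amazon-ec2</category></item>
  <item><title>C</title><link>https://example.com/c</link></item>
</channel></rss>"""

print(serverless_links(SAMPLE_FEED))
```

Collecting the links into a list, instead of acting on each one inside the loop, also makes the filter easy to test without any network access.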

Once I had the links to the news posts, I used the Python requests library to grab the content of the news posts:

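    # Continuing inside the serverless `if` block of the loop above
    # (needs `import requests`; prompt is a string defined before the loop)
    post_page = requests.get(post_link)
    post_soup = BeautifulSoup(post_page.text, "html.parser")
    text_boxes = post_soup.find("main", {"id": "aws-page-content-main"}).find_all("div", {"class": "aws-text-box"})

    for box in text_boxes:
      prompt += box.text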
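
To make the extraction step easier to picture, here is a minimal sketch that pulls the text out of `aws-text-box` divs using only the standard library's `html.parser` instead of Beautiful Soup. It is a simplified stand-in, not the code from this post: it does not check for the surrounding `main` element, it joins chunks with newlines, and the class name `TextBoxExtractor` and function `extract_post_text` are hypothetical.

```python
from html.parser import HTMLParser


class TextBoxExtractor(HTMLParser):
    """Collect text found inside <div class="aws-text-box"> elements."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # > 0 while inside a matching div (counts nested divs)
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        classes = (dict(attrs).get("class") or "").split()
        # Enter a text box, or go one level deeper inside one
        if self.depth or "aws-text-box" in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())


def extract_post_text(html):
    parser = TextBoxExtractor()
    parser.feed(html)
    return "\n".join(parser.chunks)


sample = (
    '<main id="aws-page-content-main">'
    '<div class="aws-text-box"><p>Lambda adds X.</p><div><p>More.</p></div></div>'
    '<div class="nav">skip</div>'
    "</main>"
)
print(extract_post_text(sample))
```

Tracking the nesting depth is what lets the parser keep reading through inner divs and stop only when the outer text box closes.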

After I got the post text for the LLM prompt, I appended the beginning of a script, alternating between the two 'hosts'. I then used the boto3 Bedrock runtime client to generate a script based on the prompt (I used Claude 3 Sonnet as the model, but you can use any model you want!).
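
The exact prompt and system message are not shown in this post, so the following is only a sketch of how the pieces might fit together. The wording of `system_message`, the opening lines, and the helper `build_prompt` are all hypothetical; only the host names and the `Name: text` line format come from the post.

```python
# Hypothetical wording throughout: the original prompt text is not shown.
HOSTS = ("Quinn", "Ashcroft")

system_message = (
    "You write short, irreverent podcast scripts about AWS serverless news. "
    f"The hosts are {HOSTS[0]} and {HOSTS[1]}. "
    "Write every line of dialogue as 'Name: text' on its own line."
)


def build_prompt(post_text):
    """Combine the news text with the first lines of a script for the model to continue."""
    script_opening = "\n".join([
        f"{HOSTS[0]}: Welcome back to the show!",
        f"{HOSTS[1]}: We have some serverless news to get through today.",
    ])
    return f"News post:\n{post_text.strip()}\n\nScript:\n{script_opening}\n"


prompt = build_prompt("  Lambda now supports X.  ")
print(prompt)
```

The `prompt` and `system_message` names match the variables that the Bedrock request body expects.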

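import json

import boto3

# prompt (news text plus script opening) and system_message are defined earlier
bedrock_client = boto3.client('bedrock-runtime', region_name="us-east-1")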

body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 4096,  # Claude 3 Sonnet's maximum output length
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": prompt
                }
            ]
        }
    ],
    "system": system_message
})

response = bedrock_client.invoke_model(
    body=body,
    modelId="anthropic.claude-3-sonnet-20240229-v1:0", 
    accept="application/json", 
    contentType="application/json"
)

script = json.loads(response['body'].read().decode('utf-8'))["content"][0]["text"]

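# Copy the intro so appending doesn't modify intro_spiel itself
full_podcast = list(intro_spiel)
for line in script.splitlines():
  # Map each host's lines to a Polly voice; other lines are ignored
  if "Quinn: " in line:
    full_podcast.append(["Ruth", line.split("Quinn: ", 1)[1].strip()])
  elif "Ashcroft: " in line:
    full_podcast.append(["Gregory", line.split("Ashcroft: ", 1)[1].strip()])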
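
The script-parsing step can also be pulled out into a small pure function, which makes it testable without calling Bedrock. This is a sketch rather than the code used for the podcast: `parse_script` and `VOICES` are hypothetical names, and the tolerance for markdown bold around speaker names is an assumption about what a model might emit.

```python
# Host name -> Polly voice, matching the pairing used in this post
VOICES = {"Quinn": "Ruth", "Ashcroft": "Gregory"}


def parse_script(script, voices=VOICES):
    """Turn 'Host: text' lines into [voice, text] pairs, skipping anything else."""
    segments = []
    for line in script.splitlines():
        # Split on the first colon only, so colons inside the dialogue survive
        speaker, sep, text = line.partition(":")
        speaker = speaker.strip(" *")  # assumption: tolerate bold such as **Quinn:**
        text = text.strip(" *")
        if sep and speaker in voices and text:
            segments.append([voices[speaker], text])
    return segments


demo = "Quinn: Hi there!\n\n(laughs)\nAshcroft: News time: Lambda is faster.\n**Quinn:** Wow."
print(parse_script(demo))
```

Requiring the speaker name to be the whole prefix before the colon avoids false matches on lines that merely mention a host mid-sentence.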

I then sent the parsed, line-by-line chunks through two different Polly voices ("Ruth" and "Gregory", with the "long-form" engine), and concatenated the audio outputs with the Python library pydub (which needs ffmpeg available to read and write mp3 files):

import os
from pathlib import Path

import boto3
from pydub import AudioSegment

polly_client = boto3.client('polly', region_name="us-east-1")
out_dir = Path(__file__).parent

# Synthesize each line of dialogue into its own numbered mp3 file
clip_paths = []
for i, (voice, text) in enumerate(full_podcast, start=1):
  print(voice, text)
  speech_file_path = out_dir / f"{i}.mp3"
  response = polly_client.synthesize_speech(
      Engine="long-form",
      OutputFormat='mp3',
      Text=text,
      TextType='text',
      VoiceId=voice,
  )

  with open(speech_file_path, 'wb') as file:
    file.write(response['AudioStream'].read())
  clip_paths.append(speech_file_path)

# Stitch the clips together in order, deleting each one once it is loaded
full_pod = AudioSegment.empty()
for clip_path in clip_paths:
  full_pod += AudioSegment.from_mp3(clip_path)
  os.remove(clip_path)

full_pod.export(pod_file_name, format="mp3")
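
One thing to watch for: Polly limits how much text a single `synthesize_speech` request may contain (documented as 3,000 billed characters at the time of writing, but check the current quota). Individual script lines are usually far shorter, but a long intro could exceed it. The helper below is a hypothetical sketch, not part of the original code, that splits long text at sentence boundaries so each chunk can be sent as its own request.

```python
import re

MAX_CHARS = 3000  # assumed per-request budget; verify against the current Polly quota


def split_for_tts(text, max_chars=MAX_CHARS):
    """Split text into chunks of at most max_chars, breaking after sentences where possible."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split any single sentence that is itself too long
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


print(split_for_tts("One. Two! Three?", max_chars=10))
```

Each returned chunk could then be appended to the list of lines with the same voice, so the audio is still concatenated in the right order.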

Finally, I put all this code into a scheduled AWS Lambda, which runs and checks the RSS feed daily to see if there are any serverless news updates. If you're curious about how I deploy serverless Lambda stacks, check out this article on using AWS CDK to easily build and maintain serverless applications.
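
How the daily run decides whether an item counts as a new update is not spelled out here. One possible approach, sketched below under that assumption, is to compare each RSS item's `pubDate` with the time of the run and keep only items from the last day. The function name `is_recent` and the one-day window are hypothetical choices.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_recent(pub_date, now=None, window=timedelta(days=1)):
    """True if an RSS pubDate string falls within `window` before `now`."""
    now = now or datetime.now(timezone.utc)
    published = parsedate_to_datetime(pub_date)
    if published.tzinfo is None:
        # Treat dates without a usable offset as UTC
        published = published.replace(tzinfo=timezone.utc)
    return timedelta(0) <= now - published <= window


run_time = datetime(2024, 4, 3, 12, 0, tzinfo=timezone.utc)
print(is_recent("Tue, 02 Apr 2024 17:00:00 +0000", now=run_time))
print(is_recent("Mon, 01 Apr 2024 09:00:00 +0000", now=run_time))
```

Passing `now` explicitly keeps the check deterministic in tests, while the scheduled function can rely on the default of the current UTC time.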

You now know how to build your own goofy podcast for all the latest AWS serverless news. I know the style I used isn't for everyone, so experiment with your own style and have fun!
