Build Serverless, Scalable Generative AI Apps with This GitHub Repo

Build Serverless, Scalable Generative AI Apps with This GitHub Repo

Create scalable generative AI apps using AWS Step Functions and Amazon Bedrock. Apply your serverless skills to build with large language models (LLMs) on AWS.

Brooke Jamieson
Amazon Employee
Published Jun 3, 2024
Last Modified Jun 4, 2024


While flashy generative AI demos wow audiences, many developers I've spoken to feel a bit disconnected - excited by the potential, but unsure how to apply their expertise to actually build this future in production. That changed when my AWS colleague Clare Liguori (Senior Principal Engineer, AWS) shared her GitHub repo called “Amazon Bedrock Serverless Prompt Chaining”.
Clare's repo focuses on prompt chaining - breaking complex AI tasks into sequential prompts for large language models. Using AWS Step Functions and Amazon Bedrock, she shows you how you can create serverless generative AI workflows without managing infrastructure.
The examples range from simple prompt chains to complex ones with parallelism, conditional logic, human input integration and more. The "Plan a Meal" workflow, where AI agents debate recipe ideas, stands out.
Whether for analysis, writing, planning or new use cases, this repo makes prompt chaining accessible. Keep reading to dive into the powerful prompt chaining techniques and practical examples for your next generative AI project.

Prompt Chaining Techniques

Prompt chaining involves breaking down a complex task into smaller, individual prompts that are fed to a large language model in a specific order or based on defined rules. Clare's repo demonstrates numerous approaches for orchestrating these prompt chains using AWS Step Functions:

Model invocation:

Step Functions can invoke models in Bedrock using the optimized integration, with the prompt and inference properties defined in the body parameter.
Diagram shows model invocation in AWS Step Functions to generate book summary.
Model invocation

Prompt templating:

Prompts can be dynamically generated by injecting values from the execution input or previous step outputs using Step Functions' intrinsic functions.
Prompt templating diagram: injects book title to generate summary in Step Functions.
Prompt Templating

Sequential chains:

The output from one prompt can be included as context for the next prompt in the chain, enabling multi-step workflows.
Diagram shows sequential prompt chaining in Step Functions to generate book summary then advertiseme
Sequential Chains

Parallel chains:

Multiple prompts can be executed in parallel, with their outputs merged as input for a subsequent prompt in the chain.
Parallel prompts generate book summary and audience, then merged for advertisement.
Parallel Chains


Choice states in Step Functions can branch the workflow based on evaluating the model's response against specified conditions.
Diagram shows conditional logic in Step Functions to generate book summary and ad if input is a book


Map states allow iterating over a collection of inputs, processing each one with a prompt to generate an array of outputs.
Map state iterates over book list to generate summaries, create bookstore ad.

Chain prompts & other AWS services:

Prompts can be chained with interactions with other AWS services, such as retrieving data from S3 or invoking Lambda functions.
Image shows chaining language model prompt with AWS SNS to send book summary notification.
Chain prompts and other AWS services

Validate output & re-prompt:

If a model's response fails validation, it can be re-prompted to fix its output, with error handling capabilities in Step Functions.
Validates model output, allows re-prompting if invalid, has error handling for failed attempts.
Validate output and re-prompt

Wait for human input:

State machines can pause execution and wait for human input using task tokens, enabling interactive workflows.
Workflow pauses for human input after generating book advertisement draft, sent via SNS for approval
Wait for human input

Practical Examples

Clare's repo includes comprehensive examples showing the various prompt chaining techniques for building generative AI applications in action.
Architecture shows frontend app executing AWS Step Functions state machines to invoke Amazon Bedrock
Architecture diagram
Let's walk through each example:

Write a Blog Post

This workflow generates an in-depth literature analysis for a blog by breaking down the task into sequential prompts.
Here's how it works:
  1. The first prompt generates a 1-2 sentence overall summary of the novel.
  2. The next prompt uses that summary to write a paragraph describing the plot.
  3. Another prompt then takes the summary and plot as context to generate a paragraph analyzing the novel's key themes.
  4. Using the accumulating context, the next prompt writes a paragraph examining the novel's writing style and tone.
  5. The final prompt combines the summary, plot description, themes analysis, and writing style analysis into a complete, multi-paragraph blog post with an intro and conclusion.
Sequential prompts to generate literature analysis blog post for provided novel.
Write a blog post

Write a Story

Produce a short story on a given topic by generating a character list, looping through character arcs, and synthesizing the outputs into a narrative.
Here’s how it works:
  1. The first prompt generates a list of 5 characters as a JSON array, including their names and descriptions.
  2. A Map state is used to process the character list:
    1. For each character, a new prompt generates that character's story arc using their name and description.
    2. This "Generate Character Story Arc" step runs in parallel for all characters.
  3. The outputs from all the character arc prompts are merged together.
  4. The final prompt takes the merged character arcs as context to generate the full short story narrative.

Generates characters and story arcs, combines to produce short story.
Write a story

Plan a Trip

Create a weekend vacation itinerary by parallelizing prompts for hotel, activity, and restaurant recommendations, then compiling them into a daily schedule and PDF.
Here’s how it works:
  1. Three prompts run in parallel to generate suggestions for hotels, restaurants, and activities at the given location.
  2. The outputs from those parallel prompts are passed into the next prompt, which combines them to create a structured daily itinerary in Markdown format.
  3. The Markdown itinerary is passed to a Lambda function (not involving any language model).
  4. The Lambda function renders the Markdown into a PDF file and uploads it to an S3 bucket.
Parallel prompts for hotels, restaurants, activities combined into trip itinerary.
Plan a Trip

Pitch a Movie Idea

An interactive workflow that pitches movie ideas, parallelizing idea generation, enabling human approval input, generating longer pitches, and looping for new ideas if rejected.
Here’s how it works:
  1. Three prompts run in parallel to generate one-paragraph movie pitch ideas, each with a different temperature setting (low, medium, high).
  2. A follow-up prompt evaluates the three pitch ideas and chooses the best one.
  3. The chosen one-paragraph pitch is presented to the human user (acting as a movie producer) for approval.
  4. The user provides input on whether to "greenlight" the pitch idea or not.
  5. If greenlighted, a final prompt takes the one-paragraph pitch as context to generate a longer, one-page movie pitch document.
  6. If not greenlighted, the workflow loops back to step 1 to generate new pitch ideas in parallel.
Interactive workflow generates parallel movie pitch ideas, lets user approve or reject for longer pi
Pitch a Movie

Plan a Meal

Generate a recipe based on given ingredients by having two "AI chef" personas iteratively debate and improve parallel meal suggestions until reaching consensus, with code selecting the highest-scored meal.
Here’s how it works:
  1. Two "AI chef" agents generate initial meal ideas in parallel based on provided ingredients
  2. A "judge" agent scores both meal ideas for tastiness on a 0-100 scale
  3. Iterative "debate" process:
    1. Chefs attempt to improve their meal idea to outscore the other
    2. "Referee" agent determines if chefs reached consensus on the same/similar meal
  4. A Lambda function (no language model) selects the highest scoring meal idea
  5. Final step generates a complete recipe for the winning meal using the original ingredients
AI chef personas debate meal ideas in parallel until consensus, generating final recipe.
Plan a Meal

Describe the most popular open source repo today

Summarize today's top trending GitHub repository by chaining React prompts that identify the trending repo and retrieve its README content via GitHub APIs.
Here’s how it works:
  1. The first prompt/agent looks up the current highest trending open source repository on GitHub by scraping the GitHub Trending page.
  2. The second prompt/agent takes the identified repo name as input and retrieves the README content for that repo by calling the GitHub API.
  3. The second prompt/agent then summarizes the key details from the retrieved README.
This example is provided in two versions - one using the Bedrock Agents framework and one using the Langchain agents library. The core architecture of chaining the two agents is the same between versions.
Identifies top trending open source GitHub repo, retrieves README, summarizes key details.
Describe the most popular Open Source repo today


Look, I get it - generative AI can seem out of reach for lots of developers. But you don't need to start over from scratch. Clare's repo proves you can apply your existing Step Functions and workflow orchestration skills to build cool (and functional!) AI apps.
The examples show how to integrate large language models into familiar serverless practices using Amazon Bedrock. There's no need to ditch the AWS services and patterns you already know and love. This repo lets you gradually blend generative AI into your serverless skillset at your own pace. You get to leverage your hard-earned and valuable experience with modern AI capabilities.

About the Author: Brooke Jamieson is a Senior Developer Advocate at AWS. You can follow Brooke on LinkedIn, Twitter, Instagram & TikTok.Read other articles from Brooke:

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.