AWS Logo
Menu
Create Custom Visual Novels in 5 Minutes with Generative AI

Create Custom Visual Novels in 5 Minutes with Generative AI

Learn how to transform simple text prompts into fully illustrated visual novels using Claude 3.5 and Stable Diffusion—no artistic skills required.

Banjo Obayomi
Amazon Employee
Published Nov 7, 2024
Have you ever wanted to create your own visual novel but felt overwhelmed by the need to write engaging stories, create compelling artwork, and manage complex game development? As a builder who loves both gaming and AI, I faced this challenge head-on. While tools like Ren'Py make visual novel development more accessible, creating original content still requires significant time and artistic skill.
That's why I built VisualNovelLM - a tool that leverages the latest in generative AI to transform simple prompts and background lore into fully realized visual novels. In this post, I'll show you how I combined AWS services, Claude 3.5, Stable Diffusion 3, and Ren'Py to create an end-to-end story generation pipeline that anyone can use.
Try VisualNovelLM here.
a cat journeys through a magical forest
a cat journeys through a magical forest

The Architecture Behind the Magic

While generating a visual novel in 5 minutes might seem like magic, the real innovation lies in how multiple AI services work together seamlessly. Let's peek behind the curtain to see how VisualNovelLM transforms your ideas into playable stories:
VisualNovelLM architecture diagram
VisualNovelLM architecture diagram
  1. Users provide a story prompt and optional lore file through a Gradio interface hosted on Hugging Face
  2. AWS Lambda processes the input and orchestrates the generation pipeline
  3. Claude 3.5 (via Amazon Bedrock) transforms the input into a structured story
  4. Stable Diffusion 3 generates character artwork and backgrounds
  5. A custom Docker container compiles everything into a Ren'Py game
  6. The final visual novel is served through the web interface

The Generation Pipeline: From Prompt to Playable Game

Let's dive into how each piece of this pipeline works together to create a seamless experience.

Story Generation with Claude 3.5

The heart of VisualNovelLM is its ability to generate compelling narratives. Here's how we use Claude 3.5 Haiku through Amazon Bedrock to create stories that feel hand-crafted:

1. Story Structure Generation

First, we use Claude to break down the user's prompt and lore into a structured story outline. This requires careful prompt engineering to ensure Claude:
  • Creates a coherent narrative arc
  • Maintains consistent world rules from the provided lore
  • Generates an appropriate number of scenes and branches
  • Outputs in a parseable JSON format for the next steps

2. Character Development Chain

With our story structure in place, we generate detailed character profiles that will remain consistent throughout the narrative:
  • Physical descriptions (used later for Stable Diffusion)
  • Personality traits and speech patterns
  • Character relationships and motivations
  • Background stories that align with the lore
These profiles are then used to inform both the dialogue generation and image generation steps.

3. Scene and Background Planning

Before generating any visuals or dialogue, we create a detailed scene-by-scene breakdown:
  • Location descriptions for background generation
  • Emotional tone and atmosphere
  • Time of day and lighting conditions
  • Character positions and interactions

4. Dialogue Generation

Using the character profiles and scene breakdown, we prompt Claude to generate natural-sounding dialogue that:
  • Matches each character's established personality
  • Advances the story meaningfully
  • Creates appropriate emotional responses
  • Maintains consistent character voices
  • Includes stage directions and expressions

5. Asset Compilation Pipeline

This is where the real magic happens - turning all these generated elements into a playable visual novel:
Visual Creation with Stable Diffusion 3
With our story structure and dialogue in place, we need to bring our characters and world to life. Stable Diffusion 3 has proven remarkably capable at maintaining visual consistency across scenes:
Building the Compilation Pipeline
The most challenging aspect of VisualNovelLM was creating a reliable pipeline to turn our generated content into playable games. This involved several key components:
1. Custom Docker Container for Ren'Py
To compile games in a serverless environment, we needed a specialized container. I created a custom Docker container that had the Ren'py binary and lambda code to handle the user input and story generation.
2. Script Validation with Amazon Q Developer
With our compilation environment ready, we needed to ensure our generated scripts wouldn't crash during gameplay. One of the issues I faced that i had make sure all character names in the scripts were valid for the Ren'py engine. Rather than writing validation logic from scratch, I leveraged Amazon Q Developer (which you can use for free with BuilderID) to create validation functions. You can ask the prompt right in the IDE, and add in the code. Here is the prompt i used:
Can you help me write a python function that can sanitizes a string to be a valid Ren'Py identifier. it should
  1. Use the first name and initial of the last name
  2. Remove apostrophes and hyphens, convert to lowercase
  3. Replace any character that is not alphanumeric or underscore with an underscore
  4. 4.Ensure it doesn't start with a number
Amazon Q Developer
Amazon Q Developer

Deploying VisualNovelLM

With all our components working individually, the final challenge was orchestrating everything into a seamless, serverless experience. Using Hugging Face to host a Gradio app is free, and I'm able to provide access to my Lambda function to invoke the pipeline.
Here's what the interface looks like:
VisualNovelLM Gradio App
VisualNovelLM Gradio App
The Gradio app provides a clean, intuitive way for users to:
  • Enter their story prompt
  • Upload optional lore files
  • Generate their visual novel
  • Play the finished game
What I love about this setup is its simplicity - there's no need to manage servers or worry about scaling. The Lambda function handles all the heavy lifting, while Hugging Face takes care of serving the interface to users.

Try It Yourself!

You can experiment with VisualNovelLM here
Want to build something similar? There's an ongoing game development hackathon where you can showcase your own games: Learn more here.
Have you built something interesting with generative AI? I'd love to hear about it in the comments below!
 

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.

Comments