Feature Flags in Generative AI Prompts

This post will demonstrate a way to create "in-prompt" feature flags in order to control the output of a Large Language Model.

Randall Potter
Amazon Employee
Published Jan 11, 2025

First, an important note!

This approach is meant for super simple use cases, and I'm in no way advocating for large, complex conditionals. If things get more complex than what's covered in this post, please see my post here:
“Avoiding the Mega Prompt”
https://community.aws/content/2p1h1wCLtao68k0uBXCI6UBDBOA/prompt-decomposition-for-complex-scenarios

Feature Flags in a Prompt?

The concept of "feature flags" has been around in system design for a very long time. As I've developed applications, I've used them to toggle the display of new features in a user interface or to enable back-end system capabilities.
What if we could extend this concept to prompt engineering for Large Language Models (LLMs)?
I know that in production we can set feature flags or even environment variables to change things such as logging levels. How many times have you deployed a generative AI application and wished you could enable some sort of thought-process tracing in the output so you could see the "reasoning" behind the LLM's response? I know I have.
Oftentimes we can get around this by requesting specifically formatted output, often in XML or JSON, such as the example below.
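As a hedged illustration (the node names mirror the ones referenced in this post; the exact structure is up to you), such output might look like this:

    <output>
      <thought_process>
        The step-by-step reasoning the model worked through on the way to its answer...
      </thought_process>
      <response_text>
        The concise answer the user actually needs.
      </response_text>
    </output>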
With the above structured output, you could just select the node you wanted and leave the rest. This would handle all of the logic in the application layer versus the LLM output.
Sound good? No! It's more expensive!
In the scenario above, the "thought process" could add, say, 500 tokens. That's 500 tokens more than the actual response you needed (<response_text/>), so every request carries that extra cost.
Using the feature flag approach within the actual LLM prompt, we can remove that extra cost and only get that "thought" output when we actually want it!
Let's dive in...

Some Important Notes

Here are a few key considerations.
To use this concept correctly you will need to ensure...
  1. You place the feature flags at the top of the prompt.
  2. When you create a feature flag, you use that exact name consistently throughout the prompt.
  3. When using the feature flag in a conditional statement, you are descriptive about what it controls.
  4. LLMs are probabilistic in nature, but inherent to that is the model's ability to create semantic relationships. The "feature flags" act as semantic "anchor points" on ingestion that the LLM will create relationships to.
Ok number 4 was a mouthful... let's elaborate on that a bit more.
Consider this example:
  • Content: The basketball is orange.
  • User Question: What color is the basketball?
  • LLM Answer: The basketball is orange.
In most of our work with LLMs, it makes perfect sense to us that the model would be able to determine the relationship between the basketball and the color orange based on standard sentence structure.
We can create those relationships for the model even if they are out of the "norm."
Let's look at another example exchange.
  • Information: The user's favorite type of vacation involves the beach. Keep that in mind while processing the content provided to you.
  • Content: Vacations of various sorts often...
  • <LLM associates "favorite vacation involves beach" to the content provided>
  • User Question: Out of the content, help me understand vacations I would like...
  • LLM Answer: Since I know your favorite vacation involves the beach, the following from the content, would be of interest to you.
In both of these cases we provided information up front that the LLM built varying levels of relationships to. This is not a new concept; it's usually known as Retrieval Augmented Generation, or RAG.
Our feature flag is "augmented" content for the model's work. If that's the case and we word its usage well, we should be able to consistently use the feature flag for our purposes.

Let's Take a Look

For this next part we will use the Amazon Bedrock IDE capability within SageMaker Unified Studio. You can see more about this offering by taking a look at the video and resource links at the bottom of this post.
I've created a project, created an app using the generative AI app template, and then used the prompt playground.

Starting Simple

Let's create a simple "prompt template" to set a "feature flag."
Screenshot of Amazon Bedrock IDE depicting prompt template testing.
In the image above, I'm using this prompt:
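A minimal sketch of that kind of template ({{debug_mode}} simply marks wherever your playground injects the variable; the wording is illustrative):

    <feature_flags>
      debug_mode: {{debug_mode}}
    </feature_flags>

    What is the current value of the debug_mode feature flag? Reply with only that value.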
In the test area to the right, I've created a variable and set it to "TRUE." When I click "Run," the value I put into the form field is injected into the prompt template.
As you can see, the LLM easily picks up this value like we would expect.

Let's take it a step further

Now let's create a template where the feature flag changes whether the model replies with "Hello!" or "Howdy!"
Screenshot of Amazon Bedrock IDE depicting prompt feature flag or conditional output, saying Hello!
Screenshot of Amazon Bedrock IDE depicting prompt feature flag or conditional output, saying Howdy!
Important Note
The Claude model family by Anthropic works well with a variety of prompt styles; personally, I've seen the best results using XML. Since the model was trained on formats like XML, it also understands XML-style comments. I take advantage of this to place "IF...END IF" logic into the prompt as comments, effectively annotating the prompt so the model knows how to interpret it. Prompt engineering inception? Ha!
With that said, let's keep our same variable for now, "debug_mode," and use it as a switch to determine what the model will say.
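Here's a hedged sketch of that kind of conditional prompt (the comment wording and structure are illustrative, not the exact prompt from the screenshots):

    <feature_flags>
      debug_mode: {{debug_mode}}
    </feature_flags>

    <instructions>
      <!-- IF debug_mode is TRUE -->
      Respond with exactly: Hello!
      <!-- END IF -->
      <!-- IF debug_mode is FALSE -->
      Respond with exactly: Howdy!
      <!-- END IF -->
    </instructions>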
The model responds with, "Hello!" because our variable was set to "TRUE."
To be thorough, let's now set the variable to "FALSE."
And we see that it replies with, "Howdy!"

Practical Application # 1

Now that we've established that this can work, let's turn to a more practical application of this methodology. Oftentimes we want to know "why" or "how" an LLM arrived at its answer. We can often give it prompt entries like the one below.
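For example (my wording, not the exact text from any original screenshot):

    Before you answer, explain your reasoning step by step inside <thought_process> tags,
    then give your final answer inside <response_text> tags.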
However, we established earlier that including those types of explanations costs money and isn't always needed unless you want to check or log something. I've seen a lot of customers modify their prompts as they move through the development cycle: often there will be one prompt for dev that includes thoughts and one prompt for production that doesn't. I do not recommend this pattern. It can lead to different behaviors between environments and doesn't maintain a stable CI/CD workflow. Instead, use this approach or the pattern I mention in my article above, "Avoiding the Mega Prompt."
We can make a runtime decision about whether to include thoughts using our feature flag. You could take an environment variable, load it in the back-end service, and then inject that value for, say, "debug_mode" as we've been discussing. Since we structure our response, the back end can select the proper text to send to the front end, while also logging those thoughts in a custom log.
Let's take a look at the setup for this idea below:
Screenshot of Amazon Bedrock IDE depicting prompt feature flag to toggle showing the LLM's thoughts.
I've made the prompt more robust to give the LLM something to "think about."
Here's the prompt:
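A hedged sketch of its shape (the {{user_question}} variable and the exact wording are my own illustration; the original appears in the screenshot):

    <feature_flags>
      debug_mode: {{debug_mode}}
    </feature_flags>

    <content>
      {{user_question}}
    </content>

    <instructions>
      Answer the question in the content above.
      <!-- IF debug_mode is TRUE -->
      Include your step-by-step reasoning inside <thought_process> tags.
      <!-- END IF -->
      Always place your final answer inside <response_text> tags.
    </instructions>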
Let's take a look at the responses when debug_mode is true or false.

Output when debug_mode = TRUE:

Plenty of thoughts there!

Output when debug_mode = FALSE:

And as you might expect, we don't see thoughts present.

Practical Application # 2

So that's great for back-end systems, but what if you needed a way to offer a basic plan for your customers that used the same prompt, while "premium" users received a little "extra" in their response?
Let's introduce another variable called "enhanced_mode." Here's what that prompt could look like:
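A hedged sketch (the <alternative_response> tag name is my own illustration):

    <feature_flags>
      enhanced_mode: {{enhanced_mode}}
    </feature_flags>

    <instructions>
      Answer the user's question and place the answer inside <response_text> tags.
      <!-- IF enhanced_mode is TRUE -->
      Also include a more detailed alternative response inside <alternative_response> tags.
      <!-- END IF -->
    </instructions>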
So in this case, if the user is in enhanced mode, they get an "alternative response." This approach can simplify prompt management: you could compose prompt snippets together so that "enhanced mode" means whichever features the user signed up for and picked. This is similar to the concept of content personalization based on a user's persona.
Here's what this looks like:
Screenshot of Amazon Bedrock IDE depicting prompt feature flag to toggle enhanced user features.
The output when "enhanced_mode" equals "TRUE"

Addressing Security Concerns

Right away, security concerns around prompt attacks and the like come to mind.
Here are some steps to mitigate them.
  1. All variables injected into the prompt template should come from the back-end service.
  2. API calls to generate content should also come from the back-end service.
  3. Ensure that your output is properly formatted, then select (in this case) only the "XML node" content you want and return just that to the user.
  4. Feature flag values should come from something like AWS Parameter Store.
  5. Common best practices for front-end security, and security in general, still apply.

Conclusion

In this post we demonstrated that, for straightforward use cases, we can use feature flags to control or scope the output of a Large Language Model.
I hope this helps you on your generative AI journey!
Randall Potter
Senior Solutions Architect and Generative AI Subject Matter Expert at Amazon Web Services

Resources

To learn more about Amazon Bedrock IDE in the Amazon SageMaker Unified Studio, check out the following videos and links:

Videos

Amazon SageMaker YouTube Channel (Features, Demos, etc.)

Links

  • https://aws.amazon.com/bedrock/ide/
  • https://aws.amazon.com/blogs/machine-learning/build-generative-ai-applications-quickly-with-amazon-bedrock-ide-in-amazon-sagemaker-unified-studio/
  • https://docs.aws.amazon.com/sagemaker-unified-studio/latest/adminguide/amazon-bedrock-ide.html
  • https://aws.amazon.com/sagemaker/unified-studio/
  • https://docs.aws.amazon.com/sagemaker-unified-studio/latest/userguide/what-is-sagemaker-unified-studio.html
     

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.
