Using function calling in private aviation

Reaping the benefits of LLMs' reasoning and tool use in a dynamic use case to book a private jet

Dhiraj Mahapatro
Amazon Employee
Published Jul 9, 2024
Last Modified Jul 10, 2024

Overview

Commercial aviation operates at a large scale, flying numerous routes on a largely static, predictable schedule. In contrast, private aviation caters to a smaller fleet size but offers a highly dynamic and personalized experience. Customers can book flights on notice as short as 4-5 hours, akin to rideshare services like Uber or Lyft but with private jets.

This dynamic nature requires extensive behind-the-scenes coordination to ensure a seamless journey for the customer: securing an available jet at the preferred airport, arranging crew and ground transportation, planning in-flight catering, and addressing weather or mechanical concerns, all within a short timeframe. To facilitate this on-demand service, private aviation companies often employ a pricing model where customers purchase pre-filled flight hour cards or fractional jet ownership plans. This approach is analogous to AWS' Serverless model, where customers leverage the infrastructure without managing it directly, allowing them to focus on their core business needs.

In essence, private aviation prioritizes personalized, flexible travel experiences by dynamically orchestrating various operational elements, offering a premium service that comes with a higher price tag than commercial aviation's standardized offerings.

Why are we talking about private aviation?

Coming from an aviation background, I find it very interesting to think about the challenges private aviation companies face in efficiently booking personalized travel for high-profile customers. When a CEO or executive needs to plan a trip, they typically delegate the task to an assistant or trip planner. The assistant then contacts the private jet company's service representative to arrange the travel over a phone call. During the call, the service representative must capture all the relevant details and convert the conversation into a reservation in the booking system. Translating a verbal conversation into a formal reservation can be tedious and time-consuming. While Natural Language Processing (NLP) can convert the phone conversation to text, the transcript alone may not provide sufficient context for the service representative to successfully book the reservation. For example, a conversation from a personal assistant might start as:
"Hi.. Mr. John Doe is planning for a surprise trip to Disney World in 5 hours with his family. I need to book a reservation for them."
It lacks critical details such as the arrival airport, estimated time of departure, flight preferences, and other personalized requirements. To streamline the booking process and maintain a personalized experience, private aviation companies need a solution that can extract the necessary context from such conversations, provided they have access to the customer's information. This would enable the service representative to quickly and accurately create reservations, reducing the time spent on the phone and enhancing the overall customer experience.

How can GenAI help?

Large Language Models (LLMs) like Anthropic's Claude 3 Haiku, Claude 3.5 Sonnet, and Claude 3 Opus can enhance business logic with their reasoning capabilities and their ability to work with tools. NVIDIA's developer blog describes agents as "a system that can use an LLM to reason through a problem, create a plan to solve the problem, and execute the plan with the help of a set of tools". While LLM agents can invoke tools autonomously, you might prefer orchestrating around LLMs yourself for visibility, control, error handling, and token management; we will touch on these advantages in detail later. This approach, known as tool use or function calling, allows you to specify which tools the LLM can use and to control how the responses are handled.
This post focuses on Anthropic's Claude 3 Sonnet for tool use. Building on Tanner McRae's Rethinking AI Agents blog, it explores using Serverless to solve a private jet reservation use case, extending the idea of tool use with LLMs.

Problem statement

Before we talk about solutions, we need to revisit the problem statement briefly. The personal assistant of a company's CEO calls the jet company and says:
Mr. John Doe (OwnerId: 9612f6c4-b7ff-4d82-b113-7b605e188ed9) is planning for a surprise trip to Disney World in 5 hours with his family. I will need to book a reservation for them.
Can we use generative AI to book this reservation from just the above detail? Customer information will be part of the payload.

Solution

The sample application shows how function calling can help LLMs work with your existing business logic. This sample application is also powered by Amazon Q Developer. At the core of the solution is an AWS Step Functions state machine that recursively interacts with the LLM, with additional context provided by each tool's response. The end goal of the state machine is to book a reservation for an owner by extracting the necessary information from the input prompt.
Book Reservation Step Functions workflow using Amazon Bedrock and tool use
Let’s walk through the AWS Step Functions workflow:
  1. Capture the current date and time using a Lambda function and provide that to the system prompt.
    1. LLMs have no awareness of the current time. Any future time, which is necessary for booking a reservation, has to be fed to the LLM in a prompt. So, make sure the current date and time is part of the user or system prompt.
  2. Invoke the Bedrock model. The Step Functions ASL has an invokeModel task where you first define the tools available to the LLM (see the sketch after this list). The tools available are:
    1. get_owner_info
    2. get_passengers
    3. book_reservation
  3. Based on the invokeModel response, check which tool to use.
  4. Gather response from the tool.
  5. Prepare and reconcile messages to be fed back to the LLM.
  6. Subsequent responses from the LLM indicate further tools to use, and so forth, until the stop_reason in the LLM response becomes end_turn instead of tool_use.
  7. Finally, gather the payload or the verbiage of the response.
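The sketch below illustrates steps 1 and 2 in Python. The tool specifications follow Anthropic's tool-use schema, and the current timestamp is injected into the system prompt. The descriptions and input schemas here are illustrative assumptions; the actual definitions live in the ASL inside the SAM template.yaml.

```python
import json
from datetime import datetime, timezone

import boto3

bedrock = boto3.client("bedrock-runtime")

# Tool specs in Anthropic's tool-use schema. Descriptions and input
# schemas are illustrative assumptions, not the repository's exact ones.
TOOLS = [
    {
        "name": "get_owner_info",
        "description": "Look up an owner's profile, including the home airport.",
        "input_schema": {
            "type": "object",
            "properties": {"ownerId": {"type": "string"}},
            "required": ["ownerId"],
        },
    },
    {
        "name": "get_passengers",
        "description": "List the owner's family members, with names and ages.",
        "input_schema": {
            "type": "object",
            "properties": {"ownerId": {"type": "string"}},
            "required": ["ownerId"],
        },
    },
    {
        "name": "book_reservation",
        "description": "Book the flight once airports, departure time, and passengers are known.",
        "input_schema": {
            "type": "object",
            "properties": {
                "from": {"type": "string"},
                "to": {"type": "string"},
                "departureTime": {"type": "string"},
                "passengers": {"type": "array", "items": {"type": "string"}},
            },
            "required": ["from", "to", "departureTime", "passengers"],
        },
    },
]

def invoke(messages):
    # Step 1 of the workflow: feed the current timestamp to the model,
    # since it cannot work out "5 hours from now" on its own.
    system = f"Current date and time: {datetime.now(timezone.utc).isoformat()}"
    response = bedrock.invoke_model(
        modelId="anthropic.claude-3-sonnet-20240229-v1:0",
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 1024,
            "system": system,
            "messages": messages,
            "tools": TOOLS,
        }),
    )
    return json.loads(response["body"].read())
```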

State machine

Before diving deeper, take a look at the state machine ASL, mainly the Bedrock InvokeModel state where the tools are specified. Prompts and reasoning are the main part of this solution, so pay close attention to the descriptions and system prompts shown in the SAM template.yaml file.
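Although the actual orchestration is declared in the ASL, the state machine's recursive loop is conceptually equivalent to this Python sketch (reusing the invoke helper and json import from the sketch above; the handler names are assumptions standing in for the three Lambda functions):

```python
def run_agent(user_prompt, tool_handlers):
    """Drive the model until it stops asking for tools.

    tool_handlers maps a tool name to a callable, e.g. the Lambda
    logic behind get_owner_info, get_passengers, and book_reservation.
    """
    messages = [{"role": "user", "content": user_prompt}]
    while True:
        result = invoke(messages)
        # Echo the assistant turn back into the conversation history.
        messages.append({"role": "assistant", "content": result["content"]})
        if result["stop_reason"] != "tool_use":
            return result  # end_turn: the final summary response
        # Run every requested tool and reconcile the results into the
        # next user message, as the state machine does on each cycle.
        tool_results = []
        for block in result["content"]:
            if block["type"] == "tool_use":
                output = tool_handlers[block["name"]](block["input"])
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block["id"],
                    "content": json.dumps(output),
                })
        messages.append({"role": "user", "content": tool_results})
```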

State machine execution flow

Let’s go through a prompt and see how the LLM responds at each stage. You can check that the Lambda functions backing each tool are trivial and hardcoded. This is for brevity; you can incorporate your business logic however you want.
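For instance, a hardcoded get_owner_info handler could be as small as the following sketch (the returned fields are illustrative assumptions; the sample repository is the source of truth):

```python
def lambda_handler(event, context):
    # Hardcoded stand-in for a real owner-profile lookup. In practice
    # this would query the booking system with event["ownerId"].
    return {
        "ownerId": event["ownerId"],
        "name": "John Doe",
        "homeAirport": "JFK",
    }
```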
Input prompt:
Mr. John Doe (OwnerId: 9612f6c4-b7ff-4d82-b113-7b605e188ed9) is planning for a surprise trip to Disney World in 5 hours with his family. I will need to book a reservation for them.
Initial response from Amazon Bedrock:
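The payload has roughly the following shape; the values and the tool_use id below are illustrative placeholders, not the exact response:

```python
initial_response = {
    "role": "assistant",
    "stop_reason": "tool_use",  # the model pauses and asks for a tool
    "content": [
        {"type": "text", "text": "I need the owner's profile first."},
        {
            "type": "tool_use",
            "id": "toolu_abc123",  # placeholder; referenced later as tool_use_id
            "name": "get_owner_info",
            "input": {"ownerId": "9612f6c4-b7ff-4d82-b113-7b605e188ed9"},
        },
    ],
}
```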
The above response indicates that the LLM has stopped to wait for the get_owner_info tool to respond. The ownerId has been extracted from the prompt and provided as an input to the tool.
After reconciliation of the tool response, the recursive input to the LLM gains additional context, as below. The tool_result and tool_use_id attributes are required to be provided back to the LLM.
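A reconciled message would look roughly like this sketch (values illustrative, matching the placeholder id above):

```python
tool_result_message = {
    "role": "user",
    "content": [
        {
            "type": "tool_result",
            "tool_use_id": "toolu_abc123",  # must match the tool_use block's id
            "content": "{\"name\": \"John Doe\", \"homeAirport\": \"JFK\"}",
        }
    ],
}
```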
Now that the LLM has the owner info, it switches to the next tool, get_passengers. To understand the true power of using GenAI, look at the text portion of the LLM's response:
"So the departure airport (from) will be JFK. For the arrival airport (to), Disney World is located near Orlando, FL. The major airport code for Orlando is MCO. The travel date and time is 5 hours from the current time, which is: Tuesday, July 9th 2024, 1:05:35 AM UTC"
The LLM was able to deduce that the nearest major airport to Disney World is Orlando, without even being asked about it in the prompt. It also figured out the departure time, which is 5 hours from the current time.
Also note that the output token size of the response is just 14% of the input token size.
Get Passengers
The Lambda function for the get_passengers tool is again hardcoded to return an array of John Doe's family members. This tool response is fed back to the LLM as additional context.
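A minimal sketch of such a handler follows; the ages are assumptions added for illustration (the repository defines the actual attributes):

```python
def lambda_handler(event, context):
    # Hardcoded family manifest. The age attribute is what later lets
    # the model reason about "younger daughter" on its own.
    return {
        "passengers": [
            {"name": "John Doe", "relation": "owner", "age": 48},
            {"name": "Jill Doe", "relation": "wife", "age": 45},
            {"name": "Jane Doe", "relation": "daughter", "age": 16},
            {"name": "Jenny Doe", "relation": "daughter", "age": 12},
        ]
    }
```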
The important part to note here is that the LLM understood who needs to be present in the passenger manifest:
"Based on the passenger info, the owner John Doe is traveling with his wife Jill Doe and two daughters Jane and Jenny Doe. Let me invoke the book_reservation tool now:"
Now, let's look at the powerful part of prompts and LLM reasoning. If I tweak the user prompt so that John Doe wants to fly only with his younger daughter instead of the entire family, as below:
Mr. John Doe (OwnerId: 9612f6c4-b7ff-4d82-b113-7b605e188ed9) will be traveling to Disney World in 5 hours with his younger daughter. I will need to book a reservation for them.
Then the LLM responds with just the younger daughter in the passenger manifest. This is possible because age is an attribute of the get_passengers response, and the LLM reasoned over that available data. The LLM's response filters the other family members out, along the lines of the illustration below.
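A sketch of what the resulting book_reservation tool_use block could look like (names, ages, and field names are assumptions consistent with the earlier sketches; the departure time echoes the summary shown later):

```python
tool_use_block = {
    "type": "tool_use",
    "id": "toolu_def456",  # placeholder id
    "name": "book_reservation",
    "input": {
        "from": "JFK",
        "to": "MCO",
        "departureTime": "2024-07-10T03:22:58Z",
        # Only the younger daughter made the manifest; the model
        # filtered on the age attribute returned by get_passengers.
        "passengers": ["John Doe", "Jenny Doe"],
    },
}
```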
Book Reservation
Finally, the book_reservation tool is used to actually book the reservation. Input payload validation can be done in this Lambda function before the reservation is actually made. Once the book_reservation tool's response is returned, the LLM indicates that there are no more tools to use: the stop_reason is set to end_turn. The summary response would be:
Reservation booked successfully! A private jet has been scheduled to fly Mr. John Doe and his family from JFK to Orlando MCO airport on 2024-07-10 at 03:22:58 UTC. Enjoy your trip to Disney World!
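As a sketch of the validation idea mentioned above (assumptions, not the repository's code), the handler can reject anything the model got wrong before committing the booking:

```python
def lambda_handler(event, context):
    # Validate the model-supplied payload before touching the booking
    # system; the LLM's output should never be trusted blindly.
    required = ("from", "to", "departureTime", "passengers")
    missing = [field for field in required if not event.get(field)]
    if missing:
        raise ValueError(f"Missing reservation fields: {missing}")
    # Hardcoded confirmation for the sample; a real implementation
    # would call the reservation service here.
    return {"status": "CONFIRMED", "reservationId": "demo-123"}
```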

Advantages

We touched on each of the advantages that were mentioned earlier. To summarize:
  • Visibility and control:
    • AWS Step Functions gives you visibility in the console into what is failing and where in the workflow, instead of acting as a black box.
  • Error handling:
    • You can set up a timeout at the root level of the state machine or at each task level so that the LLM does not run in an endless loop, failing gracefully instead of incurring unnecessary AWS Step Functions and Amazon Bedrock cost.
  • Debugging:
    • The Step Functions console lets you troubleshoot your orchestrator visually and easily pinpoint an error at each task level.
  • Token management:
    • Since you specify which tool the LLM should use and keep the tool responses very specific, the token size does not immediately reach its limits. You can control how tokens are passed between calls; note that the output_token size is relatively low in each response.

Considerations

There are a few considerations to keep in mind with this approach. With tool use you can generate deterministic responses to a greater degree; however, LLMs themselves are nondeterministic in nature, unlike APIs. Therefore, extra care needs to be taken to test this implementation thoroughly and to build prompts and knowledge bases that reduce hallucination and the risk of creating incorrect reservations. Extensive integration tests would help in this scenario, before even thinking about running something like this in production.

Conclusion

In addition to the financial services industry, the aviation industry is close to my heart. I have worked on building enterprise trip management systems like the one above in previous jobs; even so, I can't stop thinking about how the new capabilities provided by GenAI would ease the process of booking a reservation, which would eventually provide a better customer experience. This is a prime example of using generative AI to be "customer obsessed".
That being said, the mechanism above is not limited to the aviation industry. It is applicable to any industry use case you can think of where an LLM can do the heavy lifting for you while operating in a dynamic and rapidly evolving environment.
You can learn more from the implementation of the sample application. Clone the app, play around with it, and modify it to fit your use case. You will see the full power of generative AI surface when you build solutions using Serverless services like AWS Step Functions and AWS Lambda. Think of a serverless-first approach when you are building GenAI solutions similar to the above, and then evolve to whatever suits your use case best. I will end with my recurring statement:
  1. Serverless-first does not mean serverless-all

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.
