The Age of Agents: How AI agentic workflows are reshaping our tomorrow
Learn how agents and multi-agentic workflows can help build powerful GenAI applications.
Veda Raman
Amazon Employee
Published Oct 30, 2024
With GenAI now widespread, everyone understands the capabilities of RAG (Retrieval Augmented Generation) and how it can leverage your organization's proprietary information to answer questions about your data. RAG empowers your chatbots to respond to queries like "What is the policy for a client who is insured as a premium subscriber?" or "What are the specifications of the X part?" RAG works by transforming the input query with an embedding model, which produces a vector. This vector is then used to find similar chunks of data stored in a vector index (a vector database), also called a knowledge base. The retrieved chunks are passed to the LLM as context for generating the answer.
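The retrieval step described above can be sketched in a few lines. This is a toy illustration, not a production setup: `embed` is a hypothetical bag-of-words stand-in for a real embedding model, `KNOWLEDGE_BASE` is a made-up in-memory list standing in for a vector database, and the chunk texts are invented for the example.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k chunks most similar to the query vector."""
    query_vec = embed(query)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vec, embed(chunk)),
        reverse=True,
    )
    return ranked[:top_k]

# Invented example chunks; a real knowledge base holds your own documents.
KNOWLEDGE_BASE = [
    "Premium subscribers are covered for dental and vision claims.",
    "The X part weighs 2 kg and is rated for 500 hours of operation.",
]

best_chunk = retrieve("What are the specifications of the X part", KNOWLEDGE_BASE)[0]
```

In a real RAG pipeline, `best_chunk` would be placed into the prompt sent to the LLM, and both the embedding model and the similarity search would be provided by managed services rather than hand-written code.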
But what if a user poses a question like "What is the company's most recent quarterly revenue?" or "How many PTOs do I have currently?" or "I want to take the next week off"? These queries require real-time, up-to-date information that goes beyond the static knowledge base (which can be out of date until its next update).
Enter the realm of intelligent bots, where cutting-edge AI technologies converge to create truly dynamic and responsive assistants. The need for intelligent bots arises from the need to take actions on the user's behalf, in addition to answering questions from static content.
At the core of an intelligent bot lies a powerful Foundation Model (FM), specifically a Large Language Model (LLM). These language models excel at reasoning and "thinking," but their knowledge is limited to the events and information present in their training dataset. Training a model is a function of time: models are trained on a large corpus of data accumulated up to the date the training process started, so their knowledge is limited to this cut-off date.
However, by harnessing the LLM's cognitive capabilities, we can transcend these limitations and tap into real-time data sources. The LLM can plan, break down complex questions into smaller tasks, construct queries to retrieve live information from existing APIs and data sources, make sense of the responses, and finally craft a comprehensive answer to the original query.
There are a few strategies for building intelligent bots that can access real-time data. The following section discusses these.
The concept of "tools" or "function calling" empowers the LLM to decide which functions or tools it should invoke from a given list of functions to gather the necessary information. Your intelligent bot application code then uses the returned function signature to call the appropriate functions. Your application code then passes the results of the function call to the LLM. The LLM then formulates a final response.
Consider an example where we are building a vacation planner AI assistant/bot. The user wants to plan a vacation to Miami. In order to plan the vacation, we want to know the weather predictions for Miami. This requires a function call to a weather API to get weather predictions for the specified time range, because the LLM doesn't know the current weather predictions. Our application constructs a prompt that asks the LLM to pick, from a list of functions it can use, the function/tool that gets the weather predictions. The application then makes the function call and constructs another prompt to send the results of the function call to the LLM. The LLM then generates a vacation plan. The sequence of actions is depicted below.
As you can see, the orchestration is done by the application using the intelligence of the LLM.
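The application-side orchestration described above could look roughly like the following sketch. Everything here is an assumption made for illustration: `fake_llm` is a canned stand-in for a real model call so the example runs offline, `get_weather` is a hypothetical tool that returns made-up data, and the JSON shapes are invented rather than taken from any particular model API.

```python
import json

def get_weather(city: str, start_date: str, end_date: str) -> dict:
    """Hypothetical tool: a real application would call a weather API here."""
    return {"city": city, "forecast": "sunny", "high_f": 84}

# The list of functions the LLM is allowed to choose from.
TOOLS = {"get_weather": get_weather}

def fake_llm(prompt: str) -> str:
    """Stand-in for an LLM call; returns canned JSON so the sketch runs offline."""
    if "TOOL_RESULT" not in prompt:
        # First turn: the model picks a tool and its arguments.
        return json.dumps({
            "tool": "get_weather",
            "arguments": {
                "city": "Miami",
                "start_date": "2024-12-01",
                "end_date": "2024-12-07",
            },
        })
    # Second turn: the model writes the final answer from the tool result.
    return json.dumps({"answer": "Plan beach days: the Miami forecast is sunny."})

def plan_vacation(user_request: str, llm=fake_llm) -> str:
    # Step 1: ask the LLM which tool to call.
    first_prompt = (
        f"Tools: {list(TOOLS)}\nUser: {user_request}\n"
        "Reply with JSON naming a tool and its arguments."
    )
    choice = json.loads(llm(first_prompt))
    # Step 2: the application, not the LLM, executes the chosen function.
    result = TOOLS[choice["tool"]](**choice["arguments"])
    # Step 3: send the result back to the LLM for the final response.
    second_prompt = (
        f"User: {user_request}\nTOOL_RESULT: {json.dumps(result)}\n"
        "Write the vacation plan."
    )
    return json.loads(llm(second_prompt))["answer"]

plan = plan_vacation("Plan a vacation to Miami")
```

Note that every step of the flow (building both prompts, dispatching the function, parsing the replies) lives in application code; the LLM only decides which tool to use and writes the final text.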
Agents take the concept of function calling to a whole new level. An agent is an autonomous entity that interacts with the LLM, performs tasks, and engages with users. Think of an agent as a virtual assistant with a specific role, capable of thinking (courtesy of the LLM) and acting (through function calls and tools). The orchestration of the flow is taken care of by the agent.
Amazon Bedrock provides an agent feature that lets you create agents. An agent in Bedrock is an entity that contains user-written instructions, an LLM, optional Knowledge Bases, and action groups (the API calls it is allowed to make). These are the inputs provided to the agent during creation. The agent constructs the prompt required to perform the orchestration from these inputs.
Imagine the possibilities with agents! You can create specialized agents with distinct roles, working independently or collaborating seamlessly with other agents to accomplish complex tasks.
A single-agent architecture comprises a single agent with an assigned role, access to an LLM, and a set of tools available to accomplish its tasks. There are many open source libraries that can help you create agent-based architectures, such as LangChain and AutoGPT. Amazon Bedrock makes it easy for you to create and operate agents. Bedrock takes away the undifferentiated heavy lifting so that you can concentrate on building your application. Agents in Bedrock automatically create the agent prompt, integrate RAG for retrieving information, break down the task into sub-tasks, and perform function calling for real-time information.
Going back to our vacation planner example - we can create a vacation planner agent that can use the get-weather function/API and orchestrate the entire flow as shown below. As you can see, your application only makes the InvokeAgent API call and sends in the task/question. The orchestration is handled by the agent with the help of the LLM. This application is now leaner with the orchestration shifted to the agent.
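As a rough sketch of the application side, the helper below wraps a single InvokeAgent call and collects the streamed reply. It assumes `client` is a boto3 `bedrock-agent-runtime` client created by the caller; the function name `ask_agent` and the choice of a fresh session per question are assumptions made for this example.

```python
import uuid

def ask_agent(client, agent_id: str, agent_alias_id: str, question: str) -> str:
    """Send one question to a Bedrock agent and return its full text reply.

    `client` is expected to be boto3.client("bedrock-agent-runtime").
    """
    response = client.invoke_agent(
        agentId=agent_id,
        agentAliasId=agent_alias_id,
        sessionId=str(uuid.uuid4()),  # assumption: a new session per question
        inputText=question,
    )
    # The reply is streamed as events; the answer text arrives in "chunk" events.
    parts = []
    for event in response["completion"]:
        chunk = event.get("chunk")
        if chunk:
            parts.append(chunk["bytes"].decode("utf-8"))
    return "".join(parts)
```

With boto3 installed and AWS credentials configured, a call might look like `ask_agent(boto3.client("bedrock-agent-runtime"), "<agent-id>", "<agent-alias-id>", "Plan a vacation to Miami")`, where the two IDs are placeholders for your own agent. Reusing the same `sessionId` across calls would let the agent keep conversational context.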
Here’s an architecture that you can use to build an agent-based solution. AWS Lambda can be triggered via an API call when the user interacts with our intelligent bot. Lambda can then make the Bedrock InvokeAgent API call.
In a multi-agent architecture, multiple agents collaborate. This is analogous to humans collaborating to accomplish a bigger task: each person plays an assigned role and together they get the job done. Similarly, agents play their roles and collaborate to get the job done. Agents can work together synchronously or asynchronously, and in parallel or in sequence. An example of multi-agent collaboration is a software architect, a programmer, and a tester working together to create a software product. Another example is different doctors collaborating to diagnose a patient’s symptoms.
In synchronous collaboration, agents interact with each other synchronously. Often, there is a supervisor agent that co-ordinates multiple sub-agents.
Supervisor agent co-ordinating sub-agents synchronously.
Let’s re-design our vacation planner now with multiple agents. We can create a weather-agent that can help us get weather predictions. Let us also create a hotel-booking agent, a flight booking agent and an attractions-booking agent. Our vacation planner is the supervisor agent and it co-ordinates the vacation planning using the sub-agents.
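To make the supervisor pattern concrete, here is a small independent illustration; it is not the author's sample code. The sub-agents are hypothetical plain functions returning canned strings, whereas in a real system each would wrap its own LLM and tools, and the supervisor would typically let an LLM decide which sub-agents to call rather than invoking all of them in a fixed order.

```python
from typing import Callable

# Hypothetical sub-agents: each would wrap its own LLM and tools in practice.
def weather_agent(city: str) -> str:
    return f"Weather in {city}: sunny"

def hotel_booking_agent(city: str) -> str:
    return f"Hotel in {city}: reserved"

def flight_booking_agent(city: str) -> str:
    return f"Flight to {city}: booked"

def attractions_booking_agent(city: str) -> str:
    return f"Attractions in {city}: tickets held"

class VacationPlannerSupervisor:
    """Supervisor that calls each sub-agent in turn and waits for its reply."""

    def __init__(self, sub_agents: dict[str, Callable[[str], str]]):
        # The supervisor must know every sub-agent it can call.
        self.sub_agents = sub_agents

    def plan(self, city: str) -> list[str]:
        # Synchronous: each call blocks until the sub-agent returns.
        return [agent(city) for agent in self.sub_agents.values()]

supervisor = VacationPlannerSupervisor({
    "weather": weather_agent,
    "hotel": hotel_booking_agent,
    "flight": flight_booking_agent,
    "attractions": attractions_booking_agent,
})
itinerary = supervisor.plan("Miami")
```

Because the supervisor holds an explicit registry of sub-agents and waits on each call, the total planning time is the sum of the sub-agents' response times.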
Here's the sample code for multi-agent implementation of the vacation planner.
In synchronous collaboration, an agent that invokes another agent must know which agents it has access to. This has several disadvantages, just like microservices calling each other synchronously. Adding a new agent to the system requires an update to every agent that needs to be aware of it. Another drawback is that the whole system operates at the pace of the slowest agent. We can solve these issues by designing the system to be asynchronous, with agents collaborating through a queue or an event bus. An agent that has an update for other agents pushes an event to the event bus, and any interested agents listen on the bus and act on the events they care about.
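The event-driven style can be sketched with a tiny in-memory bus. This is an illustration under stated assumptions: the `EventBus` class, the event names (`trip.requested`, `weather.ready`), and the two agents are all hypothetical, and a real deployment would more likely use a managed queue or event bus than an in-process one.

```python
from collections import defaultdict, deque
from typing import Callable

class EventBus:
    """Minimal in-memory event bus used only for illustration."""

    def __init__(self):
        self._subscribers = defaultdict(list)
        self._queue = deque()

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        # Publishing only enqueues; the publisher does not wait for consumers.
        self._queue.append((event_type, payload))

    def run(self) -> None:
        # Deliver queued events, including ones published by handlers.
        while self._queue:
            event_type, payload = self._queue.popleft()
            for handler in self._subscribers[event_type]:
                handler(payload)

bus = EventBus()
log = []

def weather_agent(event: dict) -> None:
    log.append(f"weather checked for {event['city']}")
    bus.publish("weather.ready", {"city": event["city"], "forecast": "sunny"})

def hotel_agent(event: dict) -> None:
    log.append(f"hotel search started for {event['city']} ({event['forecast']})")

# Agents register interest in event types; they never reference each other.
bus.subscribe("trip.requested", weather_agent)
bus.subscribe("weather.ready", hotel_agent)

bus.publish("trip.requested", {"city": "Miami"})
bus.run()
```

Adding a new agent here means adding one more `subscribe` call; neither `weather_agent` nor `hotel_agent` needs to change, which is the decoupling the paragraph above describes.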
As AI technologies continue to evolve, the potential of intelligent bots is boundless. From personal assistants that anticipate your needs to dynamic chatbots that offer tailored solutions, the era of truly intelligent, context-aware bots is upon us.
Embrace the power of intelligent bots, and prepare to revolutionize the way we interact with technology, unlocking new realms of efficiency, productivity, and personalized experiences.
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.