Knowledge Graph And Generative AI applications (GraphRAG) with Amazon Neptune and LlamaIndex (Part 2) - Knowledge Graph Retrieval
How to use LlamaIndex and Amazon Bedrock to translate a natural language question into templated graph queries.
Dave Bechberger
Amazon Employee
Published Aug 12, 2024
Last Modified Aug 13, 2024
This is the second post in this blog series, where we explore various ways to use LlamaIndex with Amazon Neptune to build applications for common knowledge graph and generative AI workflows. In each post, we will cover one aspect of how to use LlamaIndex with Amazon Neptune. We will not go into detail on the architecture of LlamaIndex, so if you are not familiar with its concepts and terminology, I suggest you check out the documentation here.
In this post, we will be using LlamaIndex and Amazon Bedrock to translate a natural language question into a templated graph query, using a technique called Knowledge Graph Retrieval.
LlamaIndex consists of tooling designed to create and interact with large language model indexes. It facilitates the storage, searching, and querying of textual data using advanced vector database techniques in conjunction with large language models like GPT. This enables efficient and effective retrieval of relevant information from extensive text corpora.
Natural language querying allows users to interact with computer systems using everyday human language, rather than having to learn and use structured query languages or complex programming commands. This capability enables users to ask questions or provide instructions in their native tongue, and the system can then process this input to understand the user's intent and provide relevant information or perform the requested action.
However, in many real-world applications, customers may ask completely open-ended questions of a data set, but as application developers we need to ensure that only appropriate and safe questions are answered. For example, in a banking application, you may want to give users the ability to ask about their own transactions and account balance, but you certainly wouldn't want to allow them to inquire about other users' financial information.
To address these types of use cases, we can leverage a technique called Knowledge Graph Retrieval. Here, a large language model (LLM) is used to extract key entities from the user's natural language question. These extracted entities are then used as the parameters for a pre-defined, templated query. This approach gives the application developer the freedom to create optimized, secure queries with appropriate data access controls, while still providing users with a natural language interface on top of the underlying data.
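To make the banking example concrete, here is a minimal sketch in plain Python, with no LlamaIndex involved. The query text, the `Session` class, and the `build_query_params` helper are all hypothetical names invented for illustration. The point it tries to show is that the entities extracted from the question only ever fill whitelisted parameters, while the account identifier always comes from the authenticated session.

```python
from dataclasses import dataclass

# Hypothetical templated query: written once by the developer. The only things
# that vary at run time are the bound parameters, never the query text.
TRANSACTIONS_TEMPLATE = (
    "MATCH (a:Account {account_id: $account_id})-[:MADE]->(t:Transaction) "
    "WHERE t.merchant = $merchant "
    "RETURN t.amount, t.date"
)


@dataclass(frozen=True)
class Session:
    account_id: str  # set by the application at login, never by the question


def build_query_params(session: Session, extracted: dict) -> dict:
    """Merge LLM-extracted entities with server-side values.

    Only whitelisted keys are accepted from the extraction step, and the
    account identifier always comes from the authenticated session.
    """
    allowed = {"merchant"}
    params = {k: v for k, v in extracted.items() if k in allowed}
    params["account_id"] = session.account_id
    return params


# Even if the extraction step returns someone else's account, it is ignored.
params = build_query_params(
    Session(account_id="acct-123"),
    {"merchant": "Coffee Shop", "account_id": "acct-999"},
)
print(params)
```

Because the question can only influence the value of `$merchant`, a user cannot talk their way into another account, no matter how the question is phrased.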
In LlamaIndex we will be using the CypherTemplateRetriever class of the PropertyGraphIndex to extract parameters from the question, insert them into a templated openCypher query, and then execute that query.
The data we will use in this post is based on the data in Graph Databases in Action by Manning Publications. The book uses the most common graph access patterns to build a fictitious application, DiningByFriends, that uses friends and ratings to provide personalized restaurant recommendations. Below is the schema of our application.
Note: To try this out for yourself as you go through this post, you can download a notebook from our Amazon Neptune Generative AI Samples repository on GitHub, here.
The next few steps are the same as those we covered in Part 1 of this blog series, so if you have already gone through this process you can jump ahead to the section on Setting up our Retriever.
To get started building our application, the first step is to set up all the required dependencies. In this example, we'll need to install the following components:
- The core package for LlamaIndex
- Packages for Amazon Bedrock, which we'll be using as our large language model (LLM)
- Packages for Amazon Neptune, which will serve as our data store
By installing these key dependencies, we'll have the necessary tools and infrastructure in place to translate natural language questions into structured graph queries, and then execute those queries against the data stored in our Amazon Neptune database.
For this post we will be using Amazon Neptune Database as our data store, so you must have a Neptune Database configured. The methodology presented here will also work with Neptune Analytics, and we will call out where the code differs. Running the code in this post also requires permissions to run Amazon Bedrock models, specifically Claude v3 Sonnet and Titan Embedding v1.
With our dependencies installed, let’s start by connecting our application to the hosted LLMs in Amazon Bedrock. For our application, we are going to primarily use Bedrock to provide the natural language interactions to the user.
Note: When we create the PropertyGraphIndex later we must provide it an embedding model, so we create one here even though it is not used.
We connect to Bedrock by instantiating the appropriate classes and passing in the model names we want to use. For generating the document embeddings, we are using Titan Embeddings and for the natural language interactions we chose Anthropic Claude v3 Sonnet hosted in Amazon Bedrock.
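As a rough sketch of that step, the code below instantiates both models. The Bedrock model ID strings, the `create_models` helper name, and the `model_name` keyword on the embedding class are assumptions on my part, so check them against the package versions and region you are using. The imports sit inside the function so the file can be loaded even where the packages are not installed.

```python
# Bedrock model identifiers for the two models named in this post.
# Assumption: these exact ID strings; verify them in the Bedrock console.
LLM_MODEL_ID = "anthropic.claude-3-sonnet-20240229-v1:0"
EMBED_MODEL_ID = "amazon.titan-embed-text-v1"


def create_models():
    """Return (llm, embed_model) backed by Amazon Bedrock.

    Needs the LlamaIndex Bedrock integration packages and AWS credentials
    with permission to invoke both models.
    """
    # Imported inside the function so this module loads without the packages.
    from llama_index.embeddings.bedrock import BedrockEmbedding
    from llama_index.llms.bedrock import Bedrock

    llm = Bedrock(model=LLM_MODEL_ID)
    # Assumption: the keyword is `model_name`; some releases used `model`.
    embed_model = BedrockEmbedding(model_name=EMBED_MODEL_ID)
    return llm, embed_model
```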
Now that we have defined our models, let’s configure our application to use them. While you can set these individually, LlamaIndex provides a global Settings object which sets the settings for all modules in the application. In this example, we’ll set the LLM and embedding model to the values we defined above.
That’s all we have to do to set our LLMs globally. How easy! With all our setup out of the way, let’s set up our PropertyGraphIndex.
Our next step is to create a PropertyGraphStore for our Amazon Neptune Database using the NeptuneDatabasePropertyGraphStore, specifying the cluster endpoint.
If we wanted to use Amazon Neptune Analytics, we would instead create a PropertyGraphStore for our Neptune Analytics graph using the NeptuneAnalyticsPropertyGraphStore, specifying the graph identifier.
Next we have to define our PropertyGraphIndex, which is a feature in LlamaIndex. To read more about its features, check out this blog post; it is a great read.
Now that we have our LLMs and graph stores configured, it’s time to set up our index. In this case we are going to use the from_existing method since we already have data loaded into the graph.
With all this ceremony finally completed, we have one more step: setting up our CypherTemplateRetriever.
The CypherTemplateRetriever is the core component powering the knowledge graph retrieval capability in this system. This is an area where LlamaIndex does significant heavy lifting for us.
Here's how the retriever works:
- When given a natural language question, the retriever combines the question with predefined template parameters.
- It then provides this combined input to the language model (LLM), which extracts the relevant parameters from the question.
- Once the parameters are extracted, the retriever incorporates them into a parameterized openCypher query template.
- Finally, the retriever executes this query against the graph store and returns the results.
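The four steps above can be sketched in plain Python. This is not LlamaIndex's implementation: the LLM call and the graph store are replaced by small stand-in functions (`fake_llm_extract` and `fake_execute`, both hypothetical), and the query text is made up, so that the flow can be run and inspected without any external services.

```python
from typing import Callable, Dict, List

# Defined once by the developer: what to extract, and the fixed query to run.
PARAM_DESCRIPTIONS = {"names": "First names of the people mentioned in the question"}
QUERY_TEMPLATE = "MATCH (p:person) WHERE p.first_name IN $names RETURN p"


def build_prompt(question: str, descriptions: Dict[str, str]) -> str:
    """Step 1: combine the question with the template parameter descriptions."""
    wanted = "\n".join(f"- {name}: {desc}" for name, desc in descriptions.items())
    return f"Extract these parameters:\n{wanted}\nQuestion: {question}"


def fake_llm_extract(prompt: str) -> Dict[str, List[str]]:
    """Step 2 stand-in: a real system would send the prompt to an LLM.

    This stub keeps capitalized words after the first word of the question.
    """
    question = prompt.rsplit("Question: ", 1)[1]
    words = [w.strip("?.,!") for w in question.split()]
    return {"names": [w for w in words[1:] if w[:1].isupper()]}


def fake_execute(query: str, params: dict) -> dict:
    """Step 4 stand-in for the graph store: echoes what would be sent."""
    return {"query": query, "params": params}


def run_template(question: str, extract: Callable, execute: Callable) -> dict:
    prompt = build_prompt(question, PARAM_DESCRIPTIONS)  # step 1
    params = extract(prompt)                             # step 2
    # Step 3: the values are bound to the fixed template as parameters;
    # they are never spliced into the query text itself.
    return execute(QUERY_TEMPLATE, params)               # step 4


result = run_template("Who are the friends of Dave and Josh?", fake_llm_extract, fake_execute)
print(result["params"])
```

Note that the query text sent to the store is identical for every question. Only the bound parameter values change.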
To set up the CypherTemplateRetriever, a few additional pieces need to be configured:
- The TemplateParams class: This is a Pydantic BaseModel class that defines the expected parameters, along with a description for each. The LLM uses these descriptions to understand what values it needs to extract from the question.
- The parameterized openCypher query: This is the template query that will be executed, with the extracted parameters inserted as necessary.
In this example, we extract the names mentioned in the question and pass them as the $names parameter to the Cypher query during execution.
This is the most basic form of the retriever. Let’s see how we can use it to query our graph.
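Here is a minimal sketch of how these pieces might be wired together, covering the Settings object, the graph store, the index, and the retriever. The query text (labels, relationship, and property names), the parameter description, and the `build_retriever` helper are assumptions for illustration rather than the exact code from the notebook, and the constructor arguments reflect my understanding of the LlamaIndex API, so verify them against the version you have installed.

```python
from typing import List

# Parameterized openCypher template. The labels, relationship type, and
# property names are hypothetical; adjust them to your own graph schema.
CYPHER_TEMPLATE = """
MATCH (p:person)-[:friends]-(f:person)
WHERE p.first_name IN $names
RETURN p.first_name AS person, collect(f.first_name) AS friends
"""


def build_retriever(neptune_host: str, llm, embed_model):
    """Wire up a CypherTemplateRetriever over an existing Neptune graph.

    Requires the LlamaIndex core and Neptune packages plus network access
    to the cluster; imports are local so this module loads without them.
    """
    from llama_index.core import PropertyGraphIndex, Settings
    from llama_index.core.bridge.pydantic import BaseModel, Field
    from llama_index.core.indices.property_graph import CypherTemplateRetriever
    from llama_index.graph_stores.neptune import NeptuneDatabasePropertyGraphStore

    # Global defaults used by every LlamaIndex module in the application.
    Settings.llm = llm
    Settings.embed_model = embed_model

    # For Neptune Analytics, NeptuneAnalyticsPropertyGraphStore with a graph
    # identifier would be used here instead.
    graph_store = NeptuneDatabasePropertyGraphStore(host=neptune_host, port=8182)

    # The data is already loaded, so we build the index from the existing graph.
    index = PropertyGraphIndex.from_existing(property_graph_store=graph_store)

    class TemplateParams(BaseModel):
        """The LLM reads the field descriptions to decide what to extract."""

        names: List[str] = Field(
            description="First names of the people mentioned in the question"
        )

    return CypherTemplateRetriever(
        index.property_graph_store, TemplateParams, CYPHER_TEMPLATE
    )
```

With a reachable cluster and valid credentials, calling `build_retriever(host, llm, embed_model)` should hand back a retriever whose `retrieve` method accepts a natural language question.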
One key advantage of using a knowledge graph to execute queries, rather than arbitrary natural language queries on a database, is the ability to control the security of the data. With a knowledge graph, we can carefully define the surface area of data that we expose to our applications, ensuring that users only have access to the information they need.
Now that we've set up the necessary components, we can start asking questions of our knowledge graph. To do this, we'll use the retrieve method on our Retriever object, passing in the natural language question we want to have answered.
This provides us with the results:
As we've seen, the results returned from our graph queries not only include the requested data values, but also additional metadata about the graph structure.
While this comprehensive response is a great way to provide secure data access from natural language questions, it does have a distinct downside. Specifically, each unique query requires a separate retriever to be configured. For example, if we wanted to find out how two users are connected, we would need to recreate the entire query, parameter template, and retriever setup for that specific use case.
While this approach can quickly become a management headache, it often provides a reasonable compromise between offering users a simple user interface (UI) and maintaining application and data security.
Another key advantage of this approach is the ability to create optimized queries in ways that a generated query may not be able to achieve. As we'll see in the example below, we can provide additional functionality, such as filtering and ordering, that may be outside the scope of what a user explicitly requests, but aligns with their implicit expectations.
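As an illustration of that idea, the template below adds a rating filter, an ordering, and a limit that the user never has to ask for. The schema it assumes (person, review, and restaurant nodes joined by friends, wrote, and about relationships, with a rating property) is my guess at the DiningByFriends model, so treat the names as hypothetical. The small `template_parameters` helper is also hypothetical and simply confirms that the question can still influence only one value.

```python
import re

# Hypothetical template: node labels, relationships, and properties are
# assumptions about the schema, not taken from the original notebook.
RECOMMENDATION_TEMPLATE = """
MATCH (p:person)-[:friends]-(:person)-[:wrote]->(r:review)-[:about]->(rest:restaurant)
WHERE p.first_name IN $names AND r.rating >= 4
RETURN rest.name AS restaurant, avg(r.rating) AS avg_rating
ORDER BY avg_rating DESC
LIMIT 5
"""


def template_parameters(template: str) -> list:
    """List the distinct $parameters a template expects, in order of appearance."""
    seen = []
    for name in re.findall(r"\$(\w+)", template):
        if name not in seen:
            seen.append(name)
    return seen


print(template_parameters(RECOMMENDATION_TEMPLATE))
```

The filter threshold, sort order, and result limit are fixed by the developer, which keeps result sizes predictable regardless of how the question is worded.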
Knowledge graph retrieval can be a compelling application pattern in many scenarios. While it may introduce some additional complexity and trade off some flexibility, I believe that most question-answering (Q&A) systems will need to prioritize data and application security over complete flexibility.
In certain internal applications, users may be granted unfettered access to the data. However, for most real-world deployments, organizations will likely want to implement some form of guardrails within the application. These guardrails help prevent unintended consequences, such as:
- Noisy neighbor problems: Where high-demand queries from some users impact the performance for others.
- Resource over-utilization: Ensuring individual users don't monopolize shared computing resources.
- Data manipulation: Protecting the integrity of the underlying knowledge graph data.
- Inadvertent data exposure: Restricting access to sensitive or confidential information.
By carefully balancing flexibility and security, knowledge graph-powered Q&A systems can provide a robust and reliable user experience while maintaining appropriate controls over the application and data. The specific trade-offs and implementation details will depend on the context and requirements of each individual use case.
In this post, we examined the basic steps required to perform knowledge graph retrieval using LlamaIndex with Amazon Neptune. In future posts, we will examine how to use LlamaIndex to perform some of the other common Knowledge Graph and Generative AI workflows. Next up is Knowledge Graph Generation.
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.