
Knowledge Graphs and Generative AI (GraphRAG) with Amazon Neptune and LlamaIndex (Part 1) - Natural Language Querying
How to use LlamaIndex and Amazon Bedrock to translate a natural language question into an openCypher graph query.

This post uses the TextToCypherRetriever class of the PropertyGraphIndex to take the schema of the graph and the question, generate an openCypher query, and then execute that query.

Note: To try this out for yourself as you go through this post, you can download a notebook from our Amazon Neptune Generative AI Samples repository on GitHub, here.
The setup requires the following packages:
- The core package for LlamaIndex
- Packages for Amazon Bedrock, which we'll be using as our large language model (LLM)
- Packages for Amazon Neptune, which will serve as our data store
For our models, we use Claude v3 Sonnet as the LLM and Titan Embedding v1 as the embedding model.

When we create the PropertyGraphIndex later, we must provide it with an embedding model, so we create one here even though it is not used.

LlamaIndex provides a Settings object, which sets the settings for all modules in the application. In this example, we'll set the LLM and embedding model to the values we defined above.

    Settings.llm = llm
    Settings.embed_model = embed_model

That's all we have to do to set our LLM and embedding model globally. With all our setup out of the way, let's set up our PropertyGraphIndex.

If you are using Amazon Neptune Database, create a PropertyGraphStore using the NeptuneDatabasePropertyGraphStore, specifying the cluster endpoint.

If you are using Amazon Neptune Analytics, create a PropertyGraphStore using the NeptuneAnalyticsPropertyGraphStore, specifying the graph identifier.

Next, we create the PropertyGraphIndex, which is a feature in LlamaIndex. To read more about its features, check out this blog post; it is a great read. We use the from_existing method since we already have data loaded into the graph.

The last piece is the TextToCypherRetriever.

The TextToCypherRetriever is the core component that powers the natural language querying feature in our system. This is an area where LlamaIndex does a significant amount of the heavy lifting for us.

Here's how the TextToCypherRetriever works:
- When given a natural language question, the retriever combines the schema information of the graph database with the user's question.
- It then provides this combined input to the large language model (LLM), which generates an equivalent openCypher graph query.
- Once the LLM returns the generated query, the TextToCypherRetriever executes that query against the graph store and returns the results to the user.
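
To make these steps concrete, here is a minimal sketch of the same flow in plain Python. It is not LlamaIndex's implementation: the function names, the prompt wording, the toy schema, and the stubbed LLM and graph store are all hypothetical stand-ins, used only to show how the schema and the question are combined, turned into a query, and executed.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def build_prompt(schema: str, question: str) -> str:
    """Step 1: combine the graph schema with the user's question."""
    return (
        "Generate an openCypher query that answers the question.\n"
        f"Schema:\n{schema}\n"
        f"Question: {question}\n"
        "Query:"
    )


def text_to_cypher(
    question: str,
    schema: str,
    llm: Callable[[str], str],
    run_query: Callable[[str], List[Row]],
) -> Dict[str, object]:
    prompt = build_prompt(schema, question)
    query = llm(prompt).strip()   # Step 2: the LLM generates the query
    rows = run_query(query)       # Step 3: run it against the graph store
    return {"query": query, "results": rows}


# Hypothetical stand-ins so the sketch runs without an LLM or a database.
def fake_llm(prompt: str) -> str:
    return "MATCH (p:Person) RETURN count(p) AS people\n"


def fake_graph_store(query: str) -> List[Row]:
    return [{"people": 3}]


result = text_to_cypher(
    question="How many people are in the graph?",
    schema="(:Person)-[:KNOWS]->(:Person)",
    llm=fake_llm,
    run_query=fake_graph_store,
)
print(result["query"])    # MATCH (p:Person) RETURN count(p) AS people
print(result["results"])  # [{'people': 3}]
```

Returning the generated query alongside the rows mirrors what the retriever gives us: we can see both the answer and the openCypher that produced it.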
With this approach, the TextToCypherRetriever is able to translate natural language questions into the appropriate graph query language. This allows users to interact with the graph data in a more intuitive way, without needing to be experts in the underlying query language.

The retriever also supports several more advanced options:
- Customize the prompts used to generate the responses
- Inject functions to validate the returned query
- Limit the available fields that can be included in the results
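
As a small taste of the second option, here is a sketch of what a validation function might look like. The function name and the list of blocked clauses are our own choices for illustration, and we are assuming the validator receives the generated query as a string and returns the query that should be executed. Check the LlamaIndex documentation for the exact contract before relying on it.

```python
import re

# Clauses that modify data in openCypher. A keyword scan like this is
# deliberately crude: it can also reject a harmless query that merely
# mentions one of these words inside a string literal.
_WRITE_CLAUSES = re.compile(
    r"\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE)\b", re.IGNORECASE
)


def reject_write_queries(cypher_query: str) -> str:
    """Return the trimmed query if it looks read-only, else raise ValueError."""
    match = _WRITE_CLAUSES.search(cypher_query)
    if match:
        clause = match.group(1).upper()
        raise ValueError(f"Refusing to run a query containing {clause}")
    return cypher_query.strip()


print(reject_write_queries("  MATCH (n) RETURN count(n)  "))
# MATCH (n) RETURN count(n)
```

A function like this could then be handed to the retriever when it is created (we believe the parameter is called cypher_validator), so that a generated query is checked before it ever reaches the graph.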
To maintain focus in this post, we will exclude a detailed discussion of these advanced retriever options and best practices. Instead, we will cover the core functionality demonstrated in the earlier example. A future blog post will dive deeper into the nuances of prompt engineering, result validation, and field limiting to further optimize the retriever for real-world applications.
To ask a question, we call the retrieve method on our retriever and pass it our question.
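
Putting the pieces from this post together, the whole flow might look roughly like the sketch below. Treat it as an outline under assumptions rather than the notebook's exact code: the helper name ask_graph, the Bedrock model IDs, and the default port are our own choices, and constructor parameter names can differ between package versions. It targets Amazon Neptune Database; for Neptune Analytics you would swap in the NeptuneAnalyticsPropertyGraphStore and a graph identifier.

```python
def ask_graph(question: str, host: str, port: int = 8182):
    """Translate a question into openCypher with LlamaIndex and run it on Neptune."""
    # Imports live inside the function so this file can be loaded without the
    # LlamaIndex and AWS packages installed; move them to the top if you prefer.
    from llama_index.core import PropertyGraphIndex, Settings
    from llama_index.core.indices.property_graph import TextToCypherRetriever
    from llama_index.embeddings.bedrock import BedrockEmbedding
    from llama_index.graph_stores.neptune import NeptuneDatabasePropertyGraphStore
    from llama_index.llms.bedrock import Bedrock

    # Assumed model IDs for Claude v3 Sonnet and Titan Embedding v1; use the
    # IDs that are enabled for your account and region.
    llm = Bedrock(model="anthropic.claude-3-sonnet-20240229-v1:0")
    embed_model = BedrockEmbedding(model_name="amazon.titan-embed-text-v1")

    # Set the models globally for every LlamaIndex module.
    Settings.llm = llm
    Settings.embed_model = embed_model

    # Connect to the cluster endpoint and wrap the existing graph in an index.
    graph_store = NeptuneDatabasePropertyGraphStore(host=host, port=port)
    index = PropertyGraphIndex.from_existing(property_graph_store=graph_store)

    # The retriever reads the schema, asks the LLM for a query, and runs it.
    retriever = TextToCypherRetriever(index.property_graph_store)
    return retriever.retrieve(question)
```

Calling ask_graph with a question and your own cluster endpoint should return a list of result nodes. Printing the text of each node is a quick way to inspect what came back.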
The results returned from our initial query not only include the relevant values from the graph, but also display the generated openCypher query itself. This is an added benefit of working with natural language querying - it provides a valuable learning path for understanding the underlying graph query languages.

While the previous query provided some interesting insights, it doesn't really showcase the true power of using a graph database. One of the most powerful capabilities of graph databases is the ability to perform recursive traversals over the data to uncover unknown connections.
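
If recursive traversal is a new idea, the toy sketch below shows what such a query computes. It is plain Python over a made-up graph of people, not the data used in this post: starting from one person, it keeps following connections until it reaches the other. In openCypher the same idea is typically expressed with a variable-length path pattern such as -[:KNOWS*]- (the KNOWS label here is hypothetical).

```python
from collections import deque
from typing import Dict, List, Optional


def find_connection(
    graph: Dict[str, List[str]], start: str, goal: str
) -> Optional[List[str]]:
    """Breadth-first search: return the shortest chain from start to goal, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in graph.get(path[-1], []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


# A made-up "who knows whom" graph, for illustration only.
knows = {
    "Alice": ["Bob"],
    "Bob": ["Alice", "Carol"],
    "Carol": ["Bob", "Dan"],
    "Dan": ["Carol"],
    "Erin": [],
}

print(find_connection(knows, "Alice", "Dan"))   # ['Alice', 'Bob', 'Carol', 'Dan']
print(find_connection(knows, "Alice", "Erin"))  # None
```

A graph database does this kind of search for us, and with natural language querying we do not even have to write the path pattern ourselves.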

The results show that Dave and Denise are directly connected. So far we have shown only relatively easy graph queries. What if we try something more difficult, one that requires using much more of the graph to answer the question?
One of the impressive aspects of this approach is that we're able to get the desired results without having to deeply learn the intricacies of complex query languages like openCypher. Up to this point, we've been progressing smoothly, with our natural language queries successfully translating into the appropriate structured graph queries.
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.