Build a GraphRAG proof of concept

A GraphRAG proof of concept built using LlamaIndex, Amazon Bedrock, and Amazon Neptune

Anand Komandooru
Amazon Employee
Published Jul 9, 2024
Retrieval-Augmented Generation (RAG) optimizes the output of a large language model by having it reference an authoritative knowledge base outside of its training data before generating a response. See this AWS post for more information about RAG.
The RAG process described above struggles to answer questions that depend on relationships between entities in the knowledge base documents. See this Microsoft blog for a description of these challenges and the proposed Graph Retrieval-Augmented Generation (GraphRAG) approach to addressing them.
A GraphRAG process takes the unstructured data in the knowledge base and organizes it into a structured knowledge graph. The Large Language Model (LLM) will then use the relationship information from the knowledge graph to generate an answer.
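To make the idea concrete, here is a minimal, library-free sketch of the retrieval half of that process. It assumes the knowledge graph is already available as (subject, relation, object) triplets; in a real GraphRAG pipeline an LLM extracts them from the documents. The triplets, the function names `neighbors`, `expand`, and `to_context`, and the hop-based traversal are all illustrative assumptions, not the POC's actual implementation.

```python
from typing import List, Tuple

Triplet = Tuple[str, str, str]  # (subject, relation, object)

# Hand-written, illustrative triplets standing in for what an LLM
# would extract from unstructured documents.
KNOWLEDGE_GRAPH: List[Triplet] = [
    ("Newton", "formulated", "laws of motion"),
    ("laws of motion", "underpin", "classical mechanics"),
    ("theory of relativity", "extends", "classical mechanics"),
    ("Einstein", "developed", "theory of relativity"),
]


def neighbors(graph: List[Triplet], entity: str) -> List[Triplet]:
    """Return every triplet in which the entity is the subject or the object."""
    return [t for t in graph if entity in (t[0], t[2])]


def expand(graph: List[Triplet], seeds: List[str], hops: int) -> List[Triplet]:
    """Collect the triplets reachable from the seed entities within `hops` steps."""
    seen = set(seeds)
    frontier = list(seeds)
    collected: List[Triplet] = []
    for _ in range(hops):
        next_frontier: List[str] = []
        for entity in frontier:
            for triplet in neighbors(graph, entity):
                if triplet not in collected:
                    collected.append(triplet)
                # Queue any entity we have not visited yet for the next hop.
                for node in (triplet[0], triplet[2]):
                    if node not in seen:
                        seen.add(node)
                        next_frontier.append(node)
        frontier = next_frontier
    return collected


def to_context(triplets: List[Triplet]) -> str:
    """Render triplets as plain-text lines that could be placed in an LLM prompt."""
    return "\n".join(f"{s} {r} {o}" for s, r, o in triplets)


# Starting from "Newton", four hops are enough to reach Einstein's work.
print(to_context(expand(KNOWLEDGE_GRAPH, ["Newton"], hops=4)))
```

The point of the sketch is that a chain of relationships links two entities that never appear in the same sentence, which is the kind of connection plain chunk-based retrieval tends to miss.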
This post walks you through a GraphRAG proof of concept (POC) built using the LlamaIndex framework, Amazon Bedrock, and Amazon Neptune. The POC implements and validates the idea proposed in the post titled "An Easy Way to Comprehend How GraphRAG Works".

Prerequisites

1. Sign in to your AWS account.
2. Create an Amazon Neptune Serverless database from the Amazon Neptune console as shown below.
Create database
3. Pick “Serverless” for the instance type, the “Development and testing” option for the template, and leave everything else with the default option.
Database settings
4. Make note of the cluster endpoint for the Neptune database cluster.
Cluster endpoint
5. If you skipped creating a Jupyter notebook during database creation, create one now from the Neptune Notebooks feature. Pick the database cluster created in the previous steps, and provide a name suffix for the notebook and the IAM role. These Jupyter notebooks are fully managed, and are hosted and billed through Amazon SageMaker’s notebook service.

POC steps

1. In the Amazon Neptune console, select the Jupyter notebook and choose the “Open JupyterLab” action. Then launch a “Python 3” notebook from the “Launcher” screen.
2. Install the required packages
3. Add the required imports
4. Update the variables for your deployment
5. Configure the models to use: Claude 3 Sonnet as the LLM and Titan Text Embeddings as the embedding model, both through Amazon Bedrock
6. Load the test data, using sample text about Newton and Einstein
7. Create a knowledge graph automatically using the unstructured documents
If you get an AccessDeniedException from the Bedrock LLM, grant the Jupyter notebook’s IAM role the Bedrock “InvokeModel” permission on the two models you selected in step 4. You can do this by adding an inline policy to that role.
8. Query the knowledge graph
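Steps 2 through 8 above can be sketched roughly as follows. This is a hedged outline rather than the POC's original notebook code: the package names in the install comment, the model IDs, the region, the sample text, the question, and the `run_poc` helper are all assumptions, and the endpoint is a placeholder for the cluster endpoint you noted earlier. The LlamaIndex imports sit inside the function so the configuration can be read and checked without the packages installed. Constructor keywords have changed between LlamaIndex releases (for example, `BedrockEmbedding` has used `model` in some versions and `model_name` in others), so check them against your installed version.

```python
# Step 2, run in a notebook cell (assumed package names):
# %pip install boto3 llama-index llama-index-llms-bedrock \
#     llama-index-embeddings-bedrock llama-index-graph-stores-neptune

# Step 4: deployment variables. Every value here is a placeholder or an assumption.
REGION = "us-east-1"
NEPTUNE_ENDPOINT = "<your-cluster-endpoint>"  # cluster endpoint from the prerequisites
NEPTUNE_PORT = 8182  # default Neptune port
LLM_MODEL_ID = "anthropic.claude-3-sonnet-20240229-v1:0"
EMBED_MODEL_ID = "amazon.titan-embed-text-v1"

# Step 6: placeholder sample text, not the text used in the original POC.
SAMPLE_TEXTS = [
    "Isaac Newton formulated the laws of motion and universal gravitation "
    "in the 17th century.",
    "Albert Einstein developed the theory of relativity in the early "
    "20th century, building on and revising classical mechanics.",
]

# Step 8: assumed question, inferred from the response shown in this post.
QUESTION = (
    "How did the scientific contributions of the 17th century "
    "influence early 20th-century physics?"
)


def run_poc() -> str:
    """Build the knowledge graph in Neptune and answer QUESTION (steps 3, 5-8)."""
    # Step 3: imports (kept local so the module loads without LlamaIndex).
    from llama_index.core import (
        Document,
        KnowledgeGraphIndex,
        Settings,
        StorageContext,
    )
    from llama_index.embeddings.bedrock import BedrockEmbedding
    from llama_index.graph_stores.neptune import NeptuneDatabaseGraphStore
    from llama_index.llms.bedrock import Bedrock

    # Step 5: configure the LLM and the embedding model.
    Settings.llm = Bedrock(model=LLM_MODEL_ID, region_name=REGION)
    Settings.embed_model = BedrockEmbedding(
        model_name=EMBED_MODEL_ID, region_name=REGION
    )

    # Step 6: wrap the sample text as documents.
    documents = [Document(text=text) for text in SAMPLE_TEXTS]

    # Step 7: let the LLM extract triplets and store them in Neptune.
    graph_store = NeptuneDatabaseGraphStore(host=NEPTUNE_ENDPOINT, port=NEPTUNE_PORT)
    storage_context = StorageContext.from_defaults(graph_store=graph_store)
    index = KnowledgeGraphIndex.from_documents(
        documents,
        storage_context=storage_context,
        max_triplets_per_chunk=2,
    )

    # Step 8: answer the question from the graph relationships only.
    query_engine = index.as_query_engine(
        include_text=False, response_mode="tree_summarize"
    )
    return str(query_engine.query(QUESTION))
```

In the notebook, calling `print(run_poc())` after filling in the endpoint would run the whole flow. It needs network access to the Neptune cluster and Bedrock permissions on both models.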
The response from the query is:
The scientific contributions of the 17th century, particularly the work of Isaac Newton, laid the foundation for classical mechanics and provided the groundwork for much of modern physics. Newton's laws of motion and his theory of universal gravitation established a framework for understanding the behavior of objects and the forces acting upon them. This classical understanding of mechanics and gravitation remained influential and formed the basis for early 20th-century physics, even as new discoveries and theories emerged to challenge and expand upon Newton's work. The advancements made during the 17th century set the stage for the revolutionary developments in physics that occurred in the early 20th century, such as the theories of relativity and quantum mechanics.
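For the AccessDeniedException mentioned in step 7, an inline policy granting “InvokeModel” on the two models might look like the output of the sketch below. The region and the two model IDs are assumptions (substitute the ones you actually configured), and `build_invoke_model_policy` is a hypothetical helper used only to generate the JSON.

```python
import json
from typing import Dict, List

# Assumed values; replace with your region and the model IDs you selected.
REGION = "us-east-1"
MODEL_IDS = [
    "anthropic.claude-3-sonnet-20240229-v1:0",
    "amazon.titan-embed-text-v1",
]


def build_invoke_model_policy(region: str, model_ids: List[str]) -> Dict:
    """Build an IAM policy document allowing bedrock:InvokeModel on the given models."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "bedrock:InvokeModel",
                # Foundation-model ARNs have an empty account ID field.
                "Resource": [
                    f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
                    for model_id in model_ids
                ],
            }
        ],
    }


# Print the JSON to paste into the IAM console as an inline policy.
print(json.dumps(build_invoke_model_policy(REGION, MODEL_IDS), indent=2))
```

Scoping the policy to the two model ARNs, rather than using a wildcard resource, keeps the notebook role limited to the models the POC needs.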

Cleanup

  1. From the Amazon Neptune console, go to the Notebooks menu to stop and delete the Neptune Python notebook.
  2. From the Amazon Neptune console, go to the Clusters menu to delete the Neptune database cluster.

Conclusion

By augmenting the LLM with a knowledge graph, the LLM was able to see the progression from Newton’s work to Einstein’s contribution. You can use the Neptune notebook’s graph explorer feature to see the knowledge graph created by this POC.
Try the POC on your GenAI Q&A use case, help it see the forest for the trees, and share your thoughts 😊.
 

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.
