GraphRAG
Learn how to build and use GraphRAG with R2R
Introduction
GraphRAG is a powerful feature of R2R that allows you to perform graph-based search and retrieval. This guide will walk you through the process of setting it up and running your first queries.
Note that graph construction can be slow with local LLMs; we recommend using cloud LLMs for faster results.
Start server
We provide three configurations for R2R: Light, Light with Local LLMs, and Full with Docker+Hatchet. If you want to get started quickly, we recommend using R2R Light. If you want to run large graph workloads, we recommend using R2R Full with Docker+Hatchet.
R2R Light
R2R Light with Local LLMs
R2R Full with Docker+Hatchet
Ingesting files
We begin the cookbook by ingesting aristotle.txt, the default sample file used across R2R tutorials and cookbooks:
CLI
SDK
The initial ingestion step parses the given documents and inserts them into R2R’s relational and vector databases, enabling document management and semantic search over them. The aristotle.txt example file is typically ingested in under 10s. You can confirm that ingestion is complete by querying the documents overview table.
When ingestion completes successfully for a given file, ingestion_status reads success in the corresponding output. You can also confirm in R2R’s dashboard at http://localhost:7273 that the file has been ingested.
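As a rough illustration of the ingestion_status check, the sketch below filters a documents overview for files that have not yet reached success. The row shape (a list of dicts with title and ingestion_status keys), the helper name pending_documents, and the non-success status value are assumptions made for this example, not R2R's documented response format.

```python
from typing import Iterable, Mapping


def pending_documents(overview: Iterable[Mapping[str, str]]) -> list[str]:
    """Return titles of documents whose ingestion_status is not 'success'."""
    return [
        doc.get("title", "<untitled>")
        for doc in overview
        if doc.get("ingestion_status") != "success"
    ]


# Hypothetical rows, shaped like a documents overview response.
# "parsing" is a made-up intermediate status used only for illustration.
sample_overview = [
    {"title": "aristotle.txt", "ingestion_status": "success"},
    {"title": "notes.txt", "ingestion_status": "parsing"},
]

print(pending_documents(sample_overview))
```

An empty result means every listed document has finished ingesting and graph creation can start.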
Create Knowledge Graph
Knowledge graph creation is done in two steps:
- create-graph: Extracts nodes and relationships from your input document collection.
- enrich-graph: Enhances the graph structure through clustering and explaining entities (commonly referred to as GraphRAG).
CLI
SDK
If you are using R2R Full, you can log into the Hatchet dashboard at http://localhost:7274 ([email protected] / Admin123!!) to check the status of the graph creation process. Make sure all the kg-extract-* tasks are completed before running the enrich-graph step.
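If you export the task list from the dashboard, the "all kg-extract-* tasks are completed" condition can be expressed as a small gate. This is only a sketch: the task record shape (name and status keys), the helper name all_tasks_done, and the status strings are hypothetical and do not reflect Hatchet's actual API.

```python
from fnmatch import fnmatch


def all_tasks_done(tasks, pattern, done_status="SUCCEEDED"):
    """True if at least one task matches `pattern` and all matches are done."""
    matching = [t for t in tasks if fnmatch(t["name"], pattern)]
    return bool(matching) and all(t["status"] == done_status for t in matching)


# Hypothetical task records; names and statuses are illustrative only.
tasks = [
    {"name": "kg-extract-0", "status": "SUCCEEDED"},
    {"name": "kg-extract-1", "status": "RUNNING"},
    {"name": "ingest-file-0", "status": "SUCCEEDED"},
]

print(all_tasks_done(tasks, "kg-extract-*"))
```

Requiring at least one match avoids treating an empty task list as "done" when extraction has not started yet.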
This step creates a knowledge graph with nodes and relationships. You can inspect the entities and relationships in the graph using the dashboard at http://localhost:7273 or by calling the /v2/entities and /v2/triples API endpoints, respectively. By default, these use the entity_level=document query parameter to return entities and triples at the document level. We will set the default collection id to 122fdf6a-e116-546b-a8f6-e4cb2e2c0a09 when submitting requests to these endpoints.
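For readers who want to call these endpoints directly, the following sketch builds the two request URLs using only the standard library. The base URL http://localhost:7272 and the query parameter name collection_id are assumptions for this example; the paths, the entity_level=document parameter, and the collection id come from the text above. Adjust the assumed parts to match your deployment.

```python
from urllib.parse import urlencode, urljoin

# Assumption: the R2R API is served at this address in your setup.
BASE_URL = "http://localhost:7272"
COLLECTION_ID = "122fdf6a-e116-546b-a8f6-e4cb2e2c0a09"


def graph_url(path, entity_level="document", collection_id=COLLECTION_ID):
    """Build a URL for a graph endpoint with its query parameters."""
    # Assumption: the collection is passed as a `collection_id` query parameter.
    query = urlencode({"collection_id": collection_id, "entity_level": entity_level})
    return f"{urljoin(BASE_URL, path)}?{query}"


entities_url = graph_url("/v2/entities")
triples_url = graph_url("/v2/triples")

print(entities_url)
print(triples_url)
```

The resulting URLs can be passed to any HTTP client, with whatever authentication your server requires.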
Graph Enrichment
We now have a searchable graph, but it is not enriched yet: it does not have any community-level information. We will now run the enrichment step.
The graph enrichment step performs hierarchical Leiden clustering to create communities, and embeds their descriptions. These embeddings are used later in the local search stage of the pipeline. If you are interested in the details of the algorithm, please refer to the blog post here.
CLI
SDK
If you’re using R2R Full, you can similarly check that all community-summary-* tasks are completed before proceeding.
The graph is now enriched: descriptions and embeddings have been added to the nodes and relationships, and each node is mapped to a community. The following is a visualization of the enriched graph (deprecated for now; we are working on a new visualization tool):
You can see the list of communities in the graph using the following API endpoint:
- Communities: Communities
Search
A knowledge graph search performs similarity search on the entity and community description embeddings.
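To make the idea concrete, here is a toy sketch of similarity search over description embeddings using cosine similarity. It is not R2R's implementation: the helper names, the three-dimensional vectors, and the entity and community labels are all made up for illustration, and a real system would use model-generated embeddings and a vector index.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_embedding, described_items, k=2):
    """Return the k item names whose embeddings are closest to the query."""
    scored = [
        (cosine_similarity(query_embedding, embedding), name)
        for name, embedding in described_items.items()
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


# Made-up embeddings for entity and community descriptions.
descriptions = {
    "entity: Aristotle": [0.9, 0.1, 0.0],
    "entity: Lyceum": [0.6, 0.4, 0.1],
    "community: Greek philosophy": [0.8, 0.3, 0.1],
    "community: Biology": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]  # stand-in for an embedded user query

print(top_k(query, descriptions, k=2))
```

Entities and communities are ranked in the same pool here purely for brevity; they could equally be searched separately and merged.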
CLI
SDK
Conclusion
Integrating R2R with GraphRAG significantly enhances the capabilities of your RAG applications. By leveraging graph-based knowledge representations, GraphRAG allows for more nuanced and context-aware information retrieval. This is evident in the example query we ran using R2R, which not only retrieved relevant information but also provided a structured analysis of Aristotle’s key contributions to modern society.
Combining R2R with GraphRAG lets your RAG applications deliver more context-aware and insightful responses, making it a useful tool for advanced information retrieval and analysis tasks.
Feel free to reach out to us at [email protected] if you have any questions or need further assistance.
Advanced GraphRAG Techniques
If you want to learn more about the advanced techniques that we use in GraphRAG, please refer to the Advanced GraphRAG Techniques page.