Knowledge Graphs
Building and managing graphs through collections
Overview
R2R allows you to build and analyze knowledge graphs from your documents through a collection-based architecture. The system extracts entities and relationships from documents, enabling richer search capabilities that understand connections between information.
The process works in several key stages:
- Documents are first ingested and entities/relationships are extracted
- Collections serve as containers for documents and their corresponding graphs
- Extracted information is pulled into the collection’s graph
- Communities can be built to identify higher-level concepts
- The resulting graph enhances search with relationship-aware queries
Collections in R2R are flexible containers that support multiple documents and provide features for access control and graph management. A document can belong to multiple collections, allowing for different organizational schemes and sharing patterns.
The resulting knowledge graphs improve search accuracy by understanding relationships between concepts rather than just performing traditional document search.
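The many-to-many relationship between documents and collections described above can be sketched with a small in-memory model. This is only an illustration of the idea: `CollectionRegistry` and the identifiers used below are hypothetical names and are not part of the R2R API.

```python
from collections import defaultdict


class CollectionRegistry:
    """Toy model of document/collection membership (hypothetical, not R2R code)."""

    def __init__(self):
        # Two indexes so membership can be looked up in either direction.
        self._docs_by_collection = defaultdict(set)
        self._collections_by_doc = defaultdict(set)

    def add(self, collection_id, document_id):
        """Place a document in a collection; a document may be in many."""
        self._docs_by_collection[collection_id].add(document_id)
        self._collections_by_doc[document_id].add(collection_id)

    def documents_in(self, collection_id):
        """Return the sorted document ids held by one collection."""
        return sorted(self._docs_by_collection.get(collection_id, set()))

    def collections_of(self, document_id):
        """Return the sorted collection ids that contain one document."""
        return sorted(self._collections_by_doc.get(document_id, set()))


registry = CollectionRegistry()
registry.add("default", "gift-of-the-magi")
registry.add("short-stories", "gift-of-the-magi")
registry.add("short-stories", "the-last-leaf")
```

Here the same document sits in both the `default` and `short-stories` collections, so each collection could carry its own graph over that document.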
Ingestion and Extraction
Before we can extract entities and relationships from a document, we must ingest a file. After we’ve successfully ingested a file, we can extract the entities and relationships from the document.
In the following script, we fetch The Gift of the Magi by O. Henry and ingest it into our R2R server. We then begin the extraction process, which may take a few minutes to run.
Python
As this script runs, we see indications of successful ingestion and extraction.
Ingestion
Entities
Managing Collections
Graphs are built within a collection, which allows us to add many documents to a graph and to share our graphs with other users. When we ingested the file above, it was added to our default collection.
Each collection has a description, which is used in the graph creation process. This can be set by the user or generated using an LLM.
Python
Pulling Extractions into the Graph
Our graph will not contain the extractions from our documents until we pull them into the graph. This gives developers more granular control over the creation and management of graphs.
Recall that we already extracted the entities and relationships for the graph; this means that we can pull a document into many graphs without having to rerun the extraction process.
Python
As soon as we pull the extractions into the graph, we can begin using the graph in our searches. We can also confirm that the entities and relationships were pulled into the collection's graph.
Entities
Entity Visualization
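To make the pull semantics concrete, here is a minimal sketch of "extract once, pull into many graphs". It is a toy in-memory model, not the R2R implementation: the `Extraction` and `Graph` classes, the `stored` dictionary, and the hand-written entities are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Extraction:
    """Entities and relationships extracted once from a single document."""
    entities: frozenset
    relationships: frozenset  # (subject, predicate, object) triples


@dataclass
class Graph:
    """A collection's graph; it stays empty until extractions are pulled in."""
    entities: set = field(default_factory=set)
    relationships: set = field(default_factory=set)

    def pull(self, extractions):
        """Copy already-extracted data into this graph without re-extracting."""
        for extraction in extractions:
            self.entities |= extraction.entities
            self.relationships |= extraction.relationships


# Hand-written stand-in for the output of a single extraction run.
stored = {
    "gift-of-the-magi": Extraction(
        entities=frozenset({"Della", "Jim"}),
        relationships=frozenset({("Della", "married_to", "Jim")}),
    )
}

default_graph = Graph()
stories_graph = Graph()

# The same stored extraction feeds two graphs; nothing is extracted twice.
default_graph.pull(stored.values())
stories_graph.pull(stored.values())
```

Keeping extraction and pulling as separate steps means the expensive step runs once per document, while each collection decides independently when its graph should include that document.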
Building Communities
To further enhance our graph, we can build communities, which are clusters of the entities and relationships inside our graph. This allows us to capture higher-level concepts that exist within our data.
Python
We can see that the resulting communities capture overall themes and concepts within the story.
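As a rough illustration of what clustering entities into communities means, the sketch below groups entities by connected components. This is a deliberately simple stand-in and is not R2R's community-building method, which is not described here; the function name and the hand-picked edges are assumptions for the example.

```python
from collections import defaultdict


def connected_components(edges):
    """Group entities into communities: one community per connected component."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    communities = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first traversal collects everything reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(adjacency[node] - seen)
        communities.append(sorted(component))
    return communities


# Hand-picked toy edges, not real extraction output.
edges = [
    ("Della", "Jim"),
    ("Della", "hair"),
    ("Jim", "watch"),
    ("combs", "watch chain"),
]
communities = connected_components(edges)
```

With these edges the entities fall into two groups, one around the characters and their possessions and one around the gifts. A real community-detection algorithm would also split densely connected graphs into smaller clusters, which connected components cannot do.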