Knowledge Graphs in R2R

Building and managing knowledge graphs through collections

R2R’s knowledge graph system automatically extracts entities and relationships from documents, organizing them into rich semantic networks for improved search, analysis and knowledge discovery. The system integrates tightly with collections to enable flexible organization and access control.

For an end-to-end example of building a graph, see the graph cookbook.

Refer to the graphs API and SDK reference for detailed examples of interacting with graphs.

Core Concepts

Graphs in R2R operate at two levels:

  1. Document Level: Individual documents undergo entity and relationship extraction using advanced language models. This captures key concepts, people, organizations, and connections within each document.

  2. Collection Level: Collections act as containers for documents and maintain unified graphs. Collection graphs combine and deduplicate entities across documents while preserving source information.

Building Graphs

Element Extraction

When you extract the entities and relationships from a document, R2R:

  1. Analyzes document content using language models to identify entities
  2. Extracts relationships between entities
  3. Generates rich metadata and descriptions
  4. Creates embeddings for semantic search

These are then used to populate a graph.

For example, after extraction from a research paper:

# Assumes `client` is an initialized R2R client and `document_id`
# is the ID of a document whose extraction has already run.

# View extracted entities
entities = client.documents.list_entities(document_id)
print(entities)
# -> [
#   {"name": "DEEP_LEARNING",
#    "description": "A subset of machine learning using neural networks",
#    "category": "CONCEPT"},
#   {"name": "TRANSFORMERS",
#    "description": "Neural network architecture using self-attention",
#    "category": "CONCEPT"}
# ]

# View relationships between entities
relationships = client.documents.list_relationships(document_id)
print(relationships)
# -> [
#   {"subject": "DEEP_LEARNING",
#    "predicate": "IS_SUBSET_OF",
#    "object": "MACHINE_LEARNING"}
# ]
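
To make "populate a graph" concrete, here is a minimal sketch in plain Python that turns records shaped like the example output into a node map and an adjacency map. The `build_graph` helper and the sample records are hypothetical illustrations, not part of the R2R SDK, and R2R's internal storage may look quite different.

```python
from collections import defaultdict

# Hypothetical records shaped like the example output above.
entities = [
    {"name": "DEEP_LEARNING", "category": "CONCEPT"},
    {"name": "TRANSFORMERS", "category": "CONCEPT"},
]
relationships = [
    {"subject": "DEEP_LEARNING", "predicate": "IS_SUBSET_OF", "object": "MACHINE_LEARNING"},
]


def build_graph(entities, relationships):
    """Return (nodes, edges): nodes keyed by name, edges as an adjacency map."""
    nodes = {entity["name"]: entity for entity in entities}
    edges = defaultdict(list)
    for rel in relationships:
        # A relationship may mention an entity that was not listed; add a stub node.
        for endpoint in (rel["subject"], rel["object"]):
            nodes.setdefault(endpoint, {"name": endpoint})
        edges[rel["subject"]].append((rel["predicate"], rel["object"]))
    return nodes, dict(edges)


nodes, edges = build_graph(entities, relationships)
print(sorted(nodes))
print(edges["DEEP_LEARNING"])
```

Keeping nodes and edges in separate maps makes it cheap to look up an entity by name and then follow its outgoing relationships.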

Collection Graphs

Collections maintain unified knowledge graphs that combine entities and relationships across documents. The system:

  1. Deduplicates entities and relationships
  2. Preserves document source information
  3. Updates automatically as documents are added
  4. Enables graph-wide analysis
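
The deduplication idea can be sketched in a few lines of Python. This is an illustrative assumption about how merging might work (matching on a normalized name), not R2R's actual deduplication logic; `merge_document_entities` and the document IDs are hypothetical.

```python
def merge_document_entities(per_document):
    """Merge entities from several documents, keyed by a normalized name.

    per_document maps a document ID to a list of entity dicts.
    Each merged record keeps every distinct description and source document.
    """
    merged = {}
    for doc_id, entities in per_document.items():
        for entity in entities:
            # Assumption: two entities are duplicates if their normalized names match.
            key = entity["name"].strip().upper().replace(" ", "_")
            record = merged.setdefault(
                key, {"name": key, "descriptions": [], "source_documents": []}
            )
            description = entity.get("description")
            if description and description not in record["descriptions"]:
                record["descriptions"].append(description)
            if doc_id not in record["source_documents"]:
                record["source_documents"].append(doc_id)
    return merged


per_document = {
    "doc-1": [{"name": "Deep Learning", "description": "A subset of machine learning"}],
    "doc-2": [
        {"name": "DEEP_LEARNING", "description": "Neural-network-based learning"},
        {"name": "TRANSFORMERS", "description": "Self-attention architecture"},
    ],
}
collection_entities = merge_document_entities(per_document)
print(collection_entities["DEEP_LEARNING"]["source_documents"])
```

Because each merged record carries its `source_documents`, a collection-level entity can always be traced back to the documents that mentioned it.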

Knowledge Graph Communities

R2R automatically analyzes graph structure to identify logical groupings of related entities called communities. This enables:

  1. Higher-level understanding of themes across many documents
  2. Discovery of hidden connections
  3. Improved knowledge navigation
  4. Semantic topic clustering

Figure: Analyzing knowledge graph communities in Hatchet
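
As a rough intuition for what "communities" means, the sketch below groups entities into connected components of an undirected graph. Connected components are a deliberately simple stand-in: this page does not say which community detection algorithm R2R uses, and a production system would likely use a more refined clustering method. The `find_communities` helper and the sample edges are hypothetical.

```python
from collections import defaultdict


def find_communities(edges):
    """Group entities into connected components of an undirected graph."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen, communities = set(), []
    for start in sorted(neighbors):
        if start in seen:
            continue
        # Depth-first traversal collects everything reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        communities.append(sorted(component))
    return communities


edges = [
    ("DEEP_LEARNING", "MACHINE_LEARNING"),
    ("TRANSFORMERS", "DEEP_LEARNING"),
    ("POSTGRES", "SQL"),
]
communities = find_communities(edges)
for community in communities:
    print(community)
```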

Using Knowledge Graphs

Knowledge graphs automatically improve search by:

  1. Providing rich entity and relationship context
  2. Enabling semantic similarity matching
  3. Supporting concept-based navigation
  4. Surfacing related content through graph connections

# Search with knowledge graph context
results = client.retrieval.search(
    "What is deep learning?",
    search_settings={
        "graph_settings": {"enabled": True}
    }
)
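
One way to picture "surfacing related content through graph connections" is a hop-limited expansion from the entities a query matched. The sketch below is an assumption about the general technique (breadth-first search with a hop limit), not a description of R2R's retrieval internals; `expand_matches` and the adjacency data are hypothetical.

```python
from collections import deque


def expand_matches(adjacency, matched, max_hops=1):
    """Return entities within max_hops of the matched ones, with hop distance."""
    distance = {name: 0 for name in matched}
    queue = deque(matched)
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance


adjacency = {
    "DEEP_LEARNING": ["MACHINE_LEARNING", "TRANSFORMERS"],
    "TRANSFORMERS": ["DEEP_LEARNING", "SELF_ATTENTION"],
    "MACHINE_LEARNING": ["DEEP_LEARNING"],
    "SELF_ATTENTION": ["TRANSFORMERS"],
}
print(expand_matches(adjacency, ["DEEP_LEARNING"]))
print(expand_matches(adjacency, ["DEEP_LEARNING"], max_hops=2))
```

A small hop limit keeps the added context close to the query; larger limits pull in more distant, and usually less relevant, entities.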

RAG Integration

Knowledge graphs enhance RAG responses by providing:

  • Structured entity information
  • Relationship context
  • Community-level insights
  • Cross-document connections

# RAG with knowledge graph context
response = client.retrieval.rag(
    "Explain deep learning's relationship to ML",
    graph_settings={"enabled": True}
)
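
To illustrate how structured entity and relationship information could reach a language model, the following sketch renders graph records as a plain-text context block. The format and the `format_graph_context` helper are hypothetical; how R2R actually assembles RAG context is not specified on this page.

```python
def format_graph_context(entities, relationships):
    """Render entities and relationships as a text block for a prompt."""
    lines = ["Entities:"]
    for entity in entities:
        lines.append(f"- {entity['name']}: {entity['description']}")
    lines.append("Relationships:")
    for rel in relationships:
        lines.append(f"- {rel['subject']} {rel['predicate']} {rel['object']}")
    return "\n".join(lines)


# Hypothetical records shaped like the extraction example earlier on this page.
entities = [
    {"name": "DEEP_LEARNING",
     "description": "A subset of machine learning using neural networks"},
]
relationships = [
    {"subject": "DEEP_LEARNING", "predicate": "IS_SUBSET_OF", "object": "MACHINE_LEARNING"},
]
context = format_graph_context(entities, relationships)
print(context)
```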

Enterprise Features

The following features are restricted to:

  • Self-deployed instances
  • Enterprise tier cloud accounts

Contact our sales team for pricing and availability.

Advanced knowledge graph capabilities include:

  • Custom entity extraction rules
  • Manual graph curation tools
  • Graph export and import
  • Advanced graph analytics
  • Custom visualization tools

Conclusion

R2R’s knowledge graphs provide powerful document analysis and knowledge discovery capabilities through automatic entity extraction and graph construction. Deep integration with collections enables flexible organization, while community detection uncovers hidden patterns and relationships in your content.
