Knowledge Graphs in R2R
Building and managing knowledge graphs through collections
Overview
Knowledge graphs in R2R enhance search accuracy and context understanding by extracting and connecting information from your documents. The system uses a two-level architecture:
- Document Level: Entities and relationships are first extracted and stored with their source documents
- Collection Level: Collections act as soft containers that group documents and maintain a graph built from those documents' extractions
System Architecture
Collections provide:
- Flexible document organization (documents can belong to multiple collections)
- Access control and sharing
- Graph synchronization and updates
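As a rough mental model of the two levels, the sketch below uses plain Python dataclasses. `Document`, `Collection`, and every field name are hypothetical stand-ins chosen for illustration, not R2R SDK types.

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins for illustration only; not R2R SDK classes.
Triple = tuple[str, str, str]  # (subject, predicate, object)


@dataclass
class Document:
    doc_id: str
    # Document level: extractions are stored with their source document.
    entities: set[str] = field(default_factory=set)
    relationships: set[Triple] = field(default_factory=set)


@dataclass
class Collection:
    name: str
    # Soft container: references documents by ID instead of owning them.
    document_ids: set[str] = field(default_factory=set)
    # Collection level: the graph this collection maintains.
    graph_entities: set[str] = field(default_factory=set)
    graph_relationships: set[Triple] = field(default_factory=set)


doc = Document(
    "doc-1",
    entities={"Ada Lovelace", "Analytical Engine"},
    relationships={("Ada Lovelace", "WROTE_ABOUT", "Analytical Engine")},
)

# The same document can be referenced by more than one collection.
research = Collection("research", document_ids={"doc-1"})
history = Collection("history", document_ids={"doc-1"})

member_of = [c.name for c in (research, history) if doc.doc_id in c.document_ids]
print(member_of)  # ['research', 'history']
```

In this model a collection starts with an empty graph; it only gains entities once extractions are copied in from its member documents.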
Getting Started
1. Document-Level Extraction
First, extract entities and relationships from your underlying documents:
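To make the idea of document-level extraction concrete, here is a toy sketch in plain Python. It does not call R2R; the `Document` class and the regex-based `extract` function are hypothetical stand-ins for the real extraction step, which is far more capable than a pattern match. The point is only that the results are stored on the document itself.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    text: str
    entities: set[str] = field(default_factory=set)
    relationships: set[tuple[str, str, str]] = field(default_factory=set)


def extract(doc: Document) -> None:
    """Toy extractor (an assumption, not R2R's method): capitalized phrases
    become entities, and neighboring entities in a sentence are linked."""
    doc.entities.clear()
    doc.relationships.clear()
    for sentence in re.split(r"(?<=[.!?])\s+", doc.text):
        found = re.findall(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*", sentence)
        doc.entities.update(found)
        for left, right in zip(found, found[1:]):
            doc.relationships.add((left, "MENTIONED_WITH", right))


doc = Document(
    "doc-1",
    "Ada Lovelace worked with Charles Babbage. "
    "Charles Babbage designed the Analytical Engine.",
)
extract(doc)
print(sorted(doc.entities))
# ['Ada Lovelace', 'Analytical Engine', 'Charles Babbage']
```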
2. Creating Collection Graphs
Each collection maintains its own graph. Create and populate a collection:
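The flow of creating a collection, adding documents, and pulling their extractions into the collection graph might be modeled as follows. This is an in-memory sketch: `Collection.add_document` and `Collection.pull` are hypothetical method names used for illustration and should not be read as the R2R SDK's signatures.

```python
from dataclasses import dataclass, field

Triple = tuple[str, str, str]


@dataclass
class Document:
    doc_id: str
    entities: set[str]
    relationships: set[Triple]


@dataclass
class Collection:
    name: str
    description: str = ""
    document_ids: list[str] = field(default_factory=list)
    graph_entities: set[str] = field(default_factory=set)
    graph_relationships: set[Triple] = field(default_factory=set)

    def add_document(self, doc_id: str) -> None:
        if doc_id not in self.document_ids:
            self.document_ids.append(doc_id)

    def pull(self, store: dict[str, Document]) -> None:
        """Copy member documents' extractions into this collection's graph."""
        for doc_id in self.document_ids:
            doc = store[doc_id]
            self.graph_entities |= doc.entities
            self.graph_relationships |= doc.relationships


store = {
    "doc-1": Document("doc-1", {"Ada Lovelace", "Charles Babbage"},
                      {("Ada Lovelace", "WORKED_WITH", "Charles Babbage")}),
    "doc-2": Document("doc-2", {"Charles Babbage", "Analytical Engine"},
                      {("Charles Babbage", "DESIGNED", "Analytical Engine")}),
}

collection = Collection("computing-history", "Early computing pioneers")
for doc_id in store:
    collection.add_document(doc_id)

collection.pull(store)
print(sorted(collection.graph_entities))
# ['Ada Lovelace', 'Analytical Engine', 'Charles Babbage']
```

Because the graph is stored as sets, an entity that appears in several documents ("Charles Babbage" here) shows up once in the collection graph.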
3. Managing Collection Graphs
View and manage the collection’s knowledge graph:
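The kinds of questions you typically ask of a collection graph (which entities exist, what is connected to what, what happens when an entity is removed) can be sketched over a plain set of triples. The helper names below are hypothetical and the data is made up for illustration; nothing here is an R2R call.

```python
from collections import Counter

# Hypothetical collection graph: a set of (subject, predicate, object) triples.
relationships = {
    ("Ada Lovelace", "WORKED_WITH", "Charles Babbage"),
    ("Charles Babbage", "DESIGNED", "Analytical Engine"),
    ("Ada Lovelace", "WROTE_ABOUT", "Analytical Engine"),
}


def entities(triples):
    """Every entity that appears as a subject or an object."""
    return sorted({t[0] for t in triples} | {t[2] for t in triples})


def neighbors(triples, entity):
    """Entities directly connected to `entity`, in either direction."""
    outgoing = {o for s, _, o in triples if s == entity}
    incoming = {s for s, _, o in triples if o == entity}
    return sorted(outgoing | incoming)


def degree(triples):
    """Number of relationships each entity takes part in."""
    counts = Counter()
    for s, _, o in triples:
        counts[s] += 1
        counts[o] += 1
    return counts


all_entities = entities(relationships)
engine_neighbors = neighbors(relationships, "Analytical Engine")
ada_degree = degree(relationships)["Ada Lovelace"]

# Managing: drop one entity together with every relationship that touches it.
pruned = {t for t in relationships if "Analytical Engine" not in (t[0], t[2])}
```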
Graph Synchronization
How to keep collection graphs up to date as documents and their extractions change:
Document Updates
When documents change:
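One way to reason about this is to track which extraction a collection last pulled from each document. The sketch below is an assumption-laden model: the `extraction_version` counter, `stale_documents`, and the rebuild-on-pull behavior are all hypothetical devices for illustration, not documented R2R behavior.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    entities: set[str] = field(default_factory=set)
    extraction_version: int = 0  # hypothetical counter, bumped per extraction

    def extract(self, entities: set[str]) -> None:
        """Stand-in for re-running extraction after the document changed."""
        self.entities = set(entities)
        self.extraction_version += 1


@dataclass
class Collection:
    document_ids: list[str]
    graph_entities: set[str] = field(default_factory=set)
    pulled_versions: dict[str, int] = field(default_factory=dict)

    def stale_documents(self, store: dict[str, Document]) -> list[str]:
        """Documents whose latest extraction has not been pulled yet."""
        return [d for d in self.document_ids
                if self.pulled_versions.get(d) != store[d].extraction_version]

    def pull(self, store: dict[str, Document]) -> None:
        """Rebuild the graph from the current extractions of all members."""
        self.graph_entities = set()
        for d in self.document_ids:
            self.graph_entities |= store[d].entities
            self.pulled_versions[d] = store[d].extraction_version


store = {"doc-1": Document("doc-1")}
store["doc-1"].extract({"Ada Lovelace"})
collection = Collection(["doc-1"])
collection.pull(store)

# The document changes and is re-extracted; the collection graph lags behind.
store["doc-1"].extract({"Ada Lovelace", "Charles Babbage"})
stale_before = collection.stale_documents(store)
collection.pull(store)
stale_after = collection.stale_documents(store)
```

In this model the collection graph only reflects the new extraction after the second `pull`, which mirrors the extract-then-pull order recommended under Best Practices.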
Cross-Collection Updates
Documents can belong to multiple collections:
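Because a document's extractions live with the document, one re-extraction can affect several collection graphs at once. A minimal sketch, using plain dictionaries and hypothetical helper names (`pull`, `collections_containing`), of finding and refreshing every affected collection:

```python
# Hypothetical in-memory state; names are illustrative, not R2R SDK calls.
doc_entities = {
    "doc-1": {"Ada Lovelace"},
    "doc-2": {"Alan Turing"},
}
membership = {
    "research": ["doc-1", "doc-2"],
    "history": ["doc-1"],
    "archive": ["doc-2"],
}
graphs = {name: set() for name in membership}


def pull(name: str) -> None:
    """Rebuild one collection graph from its member documents."""
    graphs[name] = set().union(*(doc_entities[d] for d in membership[name]))


def collections_containing(doc_id: str) -> list[str]:
    return sorted(n for n, ids in membership.items() if doc_id in ids)


for name in membership:
    pull(name)

# doc-1 is re-extracted; every collection that contains it needs its own pull.
doc_entities["doc-1"] = {"Ada Lovelace", "Charles Babbage"}
affected = collections_containing("doc-1")
for name in affected:
    pull(name)

print(affected)  # ['history', 'research']
```

The `archive` collection does not contain `doc-1`, so its graph is left alone.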
Access Control
Manage access to graphs through collection permissions:
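The relationship between collection membership and graph access can be illustrated with a small model. The `owner` and `members` fields and the `add_user`, `remove_user`, and `read_graph` methods are hypothetical names for this sketch; R2R's actual permission model may differ in detail.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    name: str
    owner: str
    members: set[str] = field(default_factory=set)
    graph_entities: set[str] = field(default_factory=set)

    def add_user(self, user: str) -> None:
        self.members.add(user)

    def remove_user(self, user: str) -> None:
        self.members.discard(user)

    def can_read(self, user: str) -> bool:
        return user == self.owner or user in self.members

    def read_graph(self, user: str) -> list[str]:
        """Graph access follows collection access in this model."""
        if not self.can_read(user):
            raise PermissionError(
                f"{user} has no access to collection {self.name!r}"
            )
        return sorted(self.graph_entities)


team = Collection("research", owner="alice", graph_entities={"Ada Lovelace"})

team.add_user("bob")
bob_view = team.read_graph("bob")  # allowed while bob is a member

team.remove_user("bob")
try:
    team.read_graph("bob")
except PermissionError as exc:
    denied = str(exc)
```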
Using Knowledge Graphs
Search Integration
Graphs automatically enhance search for collection members:
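The sketch below reads "collection members" as the users who belong to a collection, which is an assumption about the sentence above. Under that reading, a graph contributes to a user's search only when the user is a member of its collection. The data and the `graph_hits` function are hypothetical and do not represent R2R's retrieval API.

```python
# Hypothetical collections, each with its own members and graph.
collections = {
    "research": {
        "members": {"alice", "bob"},
        "relationships": {
            ("Ada Lovelace", "WORKED_WITH", "Charles Babbage"),
            ("Charles Babbage", "DESIGNED", "Analytical Engine"),
        },
    },
    "family-history": {
        "members": {"bob"},
        "relationships": {("Ada Lovelace", "DAUGHTER_OF", "Lord Byron")},
    },
}


def graph_hits(query: str, user: str) -> list[tuple[str, str, str]]:
    """Relationships that mention an entity named in the query, drawn only
    from collections the user belongs to."""
    q = query.lower()
    hits = []
    for name in sorted(collections):
        collection = collections[name]
        if user not in collection["members"]:
            continue  # graphs of other collections never contribute
        for s, p, o in sorted(collection["relationships"]):
            if s.lower() in q or o.lower() in q:
                hits.append((s, p, o))
    return hits


alice_hits = graph_hits("What did Ada Lovelace work on?", "alice")
bob_hits = graph_hits("What did Ada Lovelace work on?", "bob")
```

Here `alice` sees one relationship from `research`, while `bob`, who belongs to both collections, also sees the relationship from `family-history`.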
RAG Integration
Knowledge graphs enhance RAG responses:
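Conceptually, graph-enhanced RAG means the model sees explicit relationships next to the retrieved passages. The following sketch shows one possible way to assemble such a prompt; `build_prompt` and its layout are assumptions made for illustration, not how R2R formats its context internally.

```python
def build_prompt(question, passages, triples):
    """Combine retrieved passages and graph relationships into one prompt."""
    lines = ["Answer using only the context below.", "", "Passages:"]
    lines += [f"- {p}" for p in passages]
    lines += ["", "Graph facts:"]
    lines += [
        f"- {s} {pred.replace('_', ' ').lower()} {o}" for s, pred, o in triples
    ]
    lines += ["", f"Question: {question}"]
    return "\n".join(lines)


prompt = build_prompt(
    question="Who designed the machine Ada Lovelace wrote about?",
    passages=["Ada Lovelace published notes on the Analytical Engine."],
    triples=[
        ("Ada Lovelace", "WROTE_ABOUT", "Analytical Engine"),
        ("Charles Babbage", "DESIGNED", "Analytical Engine"),
    ],
)
print(prompt)
```

The passage alone does not name the designer; the second graph fact supplies the missing link, which is the kind of multi-hop context a graph can add.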
Best Practices
Document Management
- Extract knowledge after document updates
- Monitor extraction quality at document level
- Remember extractions stay with source documents
- Consider document size and complexity when extracting
Collection Management
- Keep collections focused on related documents
- Use meaningful collection names and descriptions
- Remember documents can belong to multiple collections
- Pull changes into the collection graph whenever document extractions are updated
Performance Optimization
- Start with small document sets to test extraction
- Use collection-level operations for bulk processing
- Monitor graph size and complexity
- Consider using orchestration for large collections
Access Control
- Plan collection structure around sharing needs
- Review access permissions regularly
- Document collection purposes and access patterns
- Use collection metadata to track graph usage
Troubleshooting
Common issues and solutions:
Missing Extractions
- Verify document extraction completed successfully
- Check document format and content
- Ensure collection graph was pulled after extraction
Graph Sync Issues
- Confirm all documents are properly extracted
- Check collection membership
- Try resetting and re-pulling collection graph
Performance Problems
- Monitor collection size
- Check extraction batch sizes
- Consider splitting large collections
- Use pagination for large result sets
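The pagination advice above can be sketched as a small generator that walks a listing one page at a time. It assumes a listing function that accepts an offset and a limit; `fake_fetch` below is a hypothetical stand-in for whichever listing call you use, so check the real parameter names before adapting it.

```python
from typing import Callable, Iterator


def iter_all(fetch_page: Callable[[int, int], list], limit: int = 100) -> Iterator:
    """Yield every item from a paginated listing, one page at a time."""
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        yield from page
        if len(page) < limit:
            return  # a short (or empty) page means there is nothing left
        offset += limit


# Hypothetical backend holding 250 entities.
ALL_ENTITIES = [f"entity-{i}" for i in range(250)]
calls = []


def fake_fetch(offset: int, limit: int) -> list:
    calls.append((offset, limit))
    return ALL_ENTITIES[offset:offset + limit]


entities = list(iter_all(fake_fetch, limit=100))
print(len(entities), len(calls))  # 250 3
```

Only one page is held in memory at a time, so the consumer can stop early without fetching the rest of a large graph.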
Next Steps
- Explore GraphRAG for advanced features
- Learn about hybrid search integration
- Discover more about collections
- Set up orchestration for large-scale processing