How R2R works
Core Architecture
R2R operates as a distributed system with several key components:
API Layer
- RESTful API for all operations
- Authentication and access control
- Request routing and validation
Storage Layer
- Document storage
- Vector embeddings
- User and permission data
- Knowledge graphs
Processing Pipeline
- Document parsing
- Chunking and embedding
- Relationship extraction
- Task orchestration
Document Processing Pipeline
When you ingest a document into R2R:
Document Parsing
- Files are processed based on type (PDF, text, images, etc.)
- Text is extracted and cleaned
- Metadata is preserved
Chunking
- Documents are split into semantic units
- Chunk size and overlap are configurable
- Headers and structure are maintained
Embedding Generation
- Each chunk is converted to a vector embedding
- Multiple embedding models are supported
- Embeddings are optimized for search
Knowledge Graph Creation
- Relationships between chunks are identified
- Entities are extracted and linked
- Graph structure is built and maintained
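The chunking step above can be illustrated with a minimal sketch. It uses fixed-size character windows rather than the semantic units R2R describes, and the function name `chunk_text` and its parameters are hypothetical, not part of R2R's API. It only shows how a configurable chunk size and overlap interact.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into character windows where consecutive windows share `overlap` characters.

    Hypothetical helper for illustration; it does not preserve headers or semantic boundaries.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("abcdefghij", chunk_size=4, overlap=2):
        print(chunk)
```

A larger overlap repeats more text across chunks, which costs storage and embedding calls but reduces the chance that a sentence relevant to a query is cut in half at a chunk boundary.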
Search and Retrieval System
R2R's search system combines vector search, hybrid search, and a ranking stage:
Vector Search
- High-dimensional vector similarity search
- Optimized indices for fast retrieval
- Configurable distance metrics
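As a rough illustration of vector similarity search, the sketch below scores every stored vector against a query by cosine similarity and returns the best matches. It is a brute-force scan over an in-memory dictionary, not the indexed search R2R performs, and the names `cosine_similarity` and `top_k` are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], vectors: dict[str, list[float]], k: int = 3) -> list[tuple[str, float]]:
    """Return the k (id, score) pairs with the highest cosine similarity to the query."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    store = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    print(top_k([1.0, 0.0], store, k=2))
```

Swapping `cosine_similarity` for another scoring function (for example negative Euclidean distance) is one way a configurable distance metric could be modeled in a sketch like this.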
Hybrid Search
Ranking
- Reciprocal rank fusion
- Configurable weights
- Result deduplication
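The ranking steps above can be sketched as a weighted reciprocal rank fusion. Each result list contributes `weight / (k + rank)` to a document's score, and a document that appears in several lists is merged into a single entry. The function name, the `weights` parameter, and the default `k = 60` (a conventional choice for this technique) are assumptions for illustration, not values taken from R2R.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, weights=None, k=60):
    """Fuse several ranked lists of document ids into one list of (id, score) pairs.

    rankings: list of lists, each ordered from best to worst.
    weights:  optional per-list weights (defaults to 1.0 for every list).
    """
    if weights is None:
        weights = [1.0] * len(rankings)
    if len(weights) != len(rankings):
        raise ValueError("weights must have one entry per ranking")

    scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        seen = set()
        rank = 0
        for doc_id in ranking:
            if doc_id in seen:
                continue  # ignore repeats within a single list
            seen.add(doc_id)
            rank += 1
            scores[doc_id] += weight / (k + rank)

    # Highest score first; ties broken by id so the output is deterministic.
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    vector_hits = ["a", "b", "c"]
    keyword_hits = ["b", "d"]
    for doc_id, score in reciprocal_rank_fusion([vector_hits, keyword_hits]):
        print(doc_id, round(score, 5))
```

Because the fusion uses only ranks, it can combine result lists whose raw scores are on different scales, such as vector similarities and keyword match scores.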
Response Generation
When generating responses:
Context Building
- Relevant chunks are retrieved
- Context is formatted for the LLM
- Citations are prepared
LLM Integration
- Context is combined with the query
- System prompts guide response format
- Streaming support for real-time responses
Post-processing
- Response validation
- Citation linking
- Format cleaning
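A minimal sketch of context building and citation linking is shown below. It assumes each retrieved chunk is a dictionary with `text` and `document_id` keys and that the model cites sources with bracketed numbers such as `[1]`. Both conventions, along with the function names and the default system prompt, are hypothetical and chosen only to illustrate the flow.

```python
import re


def build_prompt(query, chunks, system_prompt="Answer using only the numbered context."):
    """Format retrieved chunks as numbered context and return (prompt, citation map)."""
    lines = []
    citations = {}
    for number, chunk in enumerate(chunks, start=1):
        lines.append(f"[{number}] {chunk['text']}")
        citations[number] = chunk["document_id"]
    context = "\n".join(lines)
    prompt = f"{system_prompt}\n\nContext:\n{context}\n\nQuestion: {query}"
    return prompt, citations


def link_citations(response, citations):
    """Map [n] markers in a response back to document ids, skipping unknown numbers."""
    cited = []
    for match in re.findall(r"\[(\d+)\]", response):
        number = int(match)
        if number in citations and citations[number] not in cited:
            cited.append(citations[number])
    return cited


if __name__ == "__main__":
    chunks = [
        {"text": "Paris is the capital of France.", "document_id": "doc-1"},
        {"text": "Berlin is the capital of Germany.", "document_id": "doc-2"},
    ]
    prompt, citations = build_prompt("What is the capital of France?", chunks)
    print(prompt)
    print(link_citations("The capital is Paris [1].", citations))
```

Dropping citation numbers that do not correspond to a retrieved chunk is one simple form of response validation: it keeps the answer from pointing at sources that were never supplied.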
System Components
R2R consists of several integrated services:
Core Services
Database Layer
- PostgreSQL for structured data
- pgvector for vector storage
- Graph data for relationships
External Integrations
- LLM providers (OpenAI, Anthropic, etc.)
- Authentication providers
- Storage systems
Performance Considerations
R2R optimizes for several key metrics:
Latency
- Cached embeddings
- Optimized vector indices
- Request batching
Scalability
- Horizontal scaling support
- Distributed processing
- Load balancing
Reliability
- Task queuing
- Error handling
- Automatic retries
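Automatic retries are commonly implemented with exponential backoff, as in the sketch below. The `retry` helper, its parameters, and the delay schedule are assumptions for illustration and do not describe R2R's internal retry policy.

```python
import time


def retry(func, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func until it succeeds, doubling the wait after each failure.

    Re-raises the last exception once max_attempts calls have failed.
    The sleep argument is injectable so the helper can be tested without waiting.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))  # 0.5s, 1s, 2s, ...


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky():
        calls["count"] += 1
        if calls["count"] < 3:
            raise RuntimeError("temporary failure")
        return "ok"

    print(retry(flaky, base_delay=0.01))
```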
Resource Management
R2R manages system resources in three areas:
Memory Usage
- Vector index optimization
- Chunk size management
- Cache control
Processing Power
- Parallel processing
- Batch operations
- Priority queuing
Storage
- Efficient vector storage
- Document versioning
- Metadata indexing
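Batch operations can be pictured as grouping work items, such as chunks waiting to be embedded, into fixed-size groups so that each downstream call handles several items at once. The `batched` helper below is a generic sketch and a hypothetical name, not an R2R function.

```python
from itertools import islice


def batched(items, batch_size):
    """Yield successive lists of at most batch_size items from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while batch := list(islice(iterator, batch_size)):
        yield batch


if __name__ == "__main__":
    chunks = [f"chunk-{i}" for i in range(5)]
    for batch in batched(chunks, batch_size=2):
        print(batch)
```

Because the helper is a generator, only one batch is held in memory at a time, which keeps memory use bounded when the input is large.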
For detailed deployment configurations and optimization strategies, refer to our Configuration Guide.