Maintenance & Scaling
Learn how to maintain and scale your R2R system
This guide covers essential maintenance tasks for R2R deployments, with a focus on vector index management and system updates. Understanding when and how to build vector indices, as well as keeping your R2R installation current, is crucial for maintaining optimal performance at scale.
Vector Indices
Do You Need Vector Indices?
Vector indices are not necessary for all deployments, especially in multi-user applications where each user typically queries their own subset of documents. Consider that:
- In multi-user applications, queries are usually filtered by user_id, drastically reducing the actual number of vectors being searched
- A system with 1 million total vectors spread evenly across 1,000 users searches only about 1,000 vectors per query
- Performance impact of not having indices is minimal when searching small per-user document sets
Only consider implementing vector indices when:
- Individual users are searching across hundreds of thousands of documents
- Query latency becomes a bottleneck even with user-specific filtering
- You need to support cross-user search functionality at scale
For development environments or smaller deployments, the overhead of maintaining vector indices often outweighs their benefits.
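To make the per-user argument concrete, here is a minimal sketch of exact (index-free) search restricted to one user's vectors. The row layout and the function names are hypothetical and are not part of R2R. The point is only that the filter shrinks the candidate set before any distance is computed.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search_for_user(rows, user_id, query, top_k=3):
    """Exact search over a single user's vectors.

    rows: iterable of (user_id, doc_id, vector) tuples (hypothetical layout).
    Returns (top doc_ids, number of vectors actually compared).
    """
    # Filter first: only this user's vectors are ever scored.
    candidates = [(doc_id, vec) for uid, doc_id, vec in rows if uid == user_id]
    ranked = sorted(
        candidates,
        key=lambda c: cosine_similarity(query, c[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:top_k]], len(candidates)


rows = [
    ("alice", "a1", [1.0, 0.0]),
    ("alice", "a2", [0.0, 1.0]),
    ("bob", "b1", [1.0, 0.0]),
    ("bob", "b2", [0.7, 0.7]),
]
hits, scanned = search_for_user(rows, "alice", query=[0.9, 0.1], top_k=1)
print(hits, scanned)
```

With a few thousand vectors per user, a linear scan like this is usually cheap enough that an approximate index adds little.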
Vector Index Management
R2R supports multiple indexing methods. HNSW (Hierarchical Navigable Small World) is recommended for most use cases.
Important Considerations
- Pre-warming Requirement
  - New indices start “cold” and require warming for optimal performance
  - Initial queries will be slower until the index is loaded into memory
  - Consider implementing explicit pre-warming in production
  - Warming must be repeated after system restarts
- Resource Usage
  - Index creation is CPU and memory intensive
  - Memory usage scales with both dataset size and the m parameter
  - Consider creating indices during off-peak hours
- Performance Tuning
  - HNSW Parameters:
    - m: 16-64 (higher = better quality, more memory)
    - ef_construction: 64-100 (higher = better quality, longer build time)
  - Distance Measures:
    - cosine_distance: Best for normalized vectors (most common)
    - l2_distance: Better for absolute distances
    - max_inner_product: Optimized for dot product similarity
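As an illustration of how the parameters above fit together, the sketch below builds a `CREATE INDEX` statement in pgvector's syntax. It assumes the vectors live in a Postgres table with the pgvector extension. The table and column names are hypothetical, and the mapping from this guide's distance-measure names to pgvector operator classes is an assumption. This is not R2R's own index-creation API.

```python
# pgvector operator classes; the mapping from the distance-measure names
# used in this guide is an assumption made for illustration.
OPCLASS = {
    "cosine_distance": "vector_cosine_ops",
    "l2_distance": "vector_l2_ops",
    "max_inner_product": "vector_ip_ops",
}


def hnsw_index_sql(table, column, measure="cosine_distance", m=16, ef_construction=64):
    """Build a pgvector CREATE INDEX statement for an HNSW index."""
    if measure not in OPCLASS:
        raise ValueError(f"unknown distance measure: {measure}")
    # Only allow plain identifiers so the names can be interpolated safely.
    for name in (table, column):
        if not name.isidentifier():
            raise ValueError(f"not a plain identifier: {name!r}")
    if m <= 0 or ef_construction <= 0:
        raise ValueError("m and ef_construction must be positive")
    return (
        f"CREATE INDEX ON {table} USING hnsw ({column} {OPCLASS[measure]}) "
        f"WITH (m = {m}, ef_construction = {ef_construction});"
    )


# "chunks" and "embedding" are hypothetical names.
print(hnsw_index_sql("chunks", "embedding", measure="cosine_distance", m=16, ef_construction=64))
```

Raising `m` or `ef_construction` changes only the `WITH (...)` clause, so the same helper can be used to compare build time and recall across settings.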
System Updates and Maintenance
Version Management
Check your current R2R version:
Update Process
- Prepare for Update
- Stop Running Services
- Update R2R
- Update Database
- Restart Services
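The order of these steps matters: a failed update should not be followed by a database upgrade or a restart. One way to enforce that is a small fail-fast runner, sketched below. The commands here are no-op placeholders and are not R2R commands. Substitute the real command for each step from your own deployment.

```python
import subprocess
import sys


def run_steps(steps):
    """Run (name, argv) steps in order and stop at the first non-zero exit.

    Returns (names of completed steps, name of the failed step or None).
    """
    completed = []
    for name, argv in steps:
        result = subprocess.run(argv)
        if result.returncode != 0:
            return completed, name
        completed.append(name)
    return completed, None


# Placeholder: each step launches a Python process that does nothing.
noop = [sys.executable, "-c", "pass"]
steps = [
    ("prepare for update", noop),
    ("stop running services", noop),
    ("update R2R", noop),
    ("update database", noop),
    ("restart services", noop),
]
completed, failed = run_steps(steps)
print(completed, failed)
```

If `failed` is not `None`, the list in `completed` shows exactly how far the update got, which is useful when deciding whether a rollback is needed.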
Database Migration Management
R2R uses database migrations to manage schema changes. Always check and update your database schema after updates:
Managing Multiple Environments
Use different project names and schemas for different environments:
Troubleshooting
If issues occur:
- Generate a system report
- Check container health
- Review database state
- Roll back if needed
Scaling Strategies
Horizontal Scaling
For applications serving many users:
- Load Balancing
  - Deploy multiple R2R instances behind a load balancer
  - Each instance can handle a subset of users
  - Particularly effective since most queries are user-specific
- Sharding
  - Consider sharding by user_id for large multi-user deployments
  - Each shard handles a subset of users
  - Maintains performance even with millions of total documents
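Sharding by user_id needs a mapping from user to shard that is the same on every instance and across restarts. A minimal sketch, assuming a fixed number of shards and string user IDs (the function name is hypothetical):

```python
import hashlib


def shard_for_user(user_id: str, num_shards: int) -> int:
    """Map a user_id to a shard index in [0, num_shards) using a stable hash."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # hashlib is used instead of the built-in hash(), which is randomized
    # per process for strings and would route users inconsistently.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


for uid in ("user-1", "user-2", "user-3"):
    print(uid, "->", shard_for_user(uid, num_shards=4))
```

Plain modulo routing remaps most users when `num_shards` changes. If shards will be added often, a consistent-hashing scheme is the usual alternative.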
Vertical Scaling
For applications requiring large single-user searches:
- Cloud Provider Solutions
  - AWS RDS supports up to 1 billion vectors per instance
  - Scale up compute and memory resources as needed
  - Example instance types:
    - db.r6g.16xlarge: Suitable for up to 100M vectors
    - db.r6g.metal: Can handle 1B+ vectors
- Memory Optimization
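Before choosing an instance size, it can help to estimate how much memory the vectors and an HNSW graph would need. The sketch below is a back-of-envelope lower bound and is not a measurement of R2R or of any particular database. It assumes float32 vectors, up to 2*m neighbor links per vector on the bottom graph layer, and 4 bytes per link. Upper layers and storage overhead are ignored.

```python
def estimate_hnsw_bytes(num_vectors, dim, m=16, bytes_per_float=4, bytes_per_link=4):
    """Rough lower bound on memory for raw vectors plus bottom-layer HNSW links.

    bytes_per_link is an assumption; real implementations may use larger
    pointers and add per-page or per-tuple overhead.
    """
    vector_bytes = num_vectors * dim * bytes_per_float
    link_bytes = num_vectors * 2 * m * bytes_per_link
    return vector_bytes + link_bytes


for m in (16, 64):
    gb = estimate_hnsw_bytes(1_000_000, 1024, m=m) / 1e9
    print(f"m={m}: about {gb:.2f} GB for 1M vectors of dimension 1024")
```

The vector term usually dominates, so embedding dimension matters as much as vector count when comparing instance types.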
Multi-User Considerations
- Filtering Optimization
- Collection Management
  - Group related documents into collections
  - Enable efficient access control
  - Optimize search scope
- Resource Allocation
  - Monitor per-user resource usage
  - Implement usage quotas if needed
  - Consider dedicated instances for power users
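A usage quota can be as simple as a per-user counter checked before each query. The class below is a minimal in-memory sketch with hypothetical names. A real deployment would likely keep the counts in shared storage so that all instances see the same totals.

```python
from collections import Counter


class QueryQuota:
    """Per-user query counter with a fixed limit, reset externally (e.g. daily)."""

    def __init__(self, limit):
        self.limit = limit
        self.used = Counter()

    def try_consume(self, user_id):
        """Record one query if the user is under the limit; return True if allowed."""
        if self.used[user_id] >= self.limit:
            return False
        self.used[user_id] += 1
        return True

    def reset(self):
        """Clear all counts, for example at the start of a new billing window."""
        self.used.clear()


quota = QueryQuota(limit=2)
print([quota.try_consume("alice") for _ in range(3)])
```

Users who hit the limit repeatedly are natural candidates for the dedicated instances mentioned above.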
Performance Monitoring
Monitor these metrics to inform scaling decisions:
- Query Performance
  - Average query latency per user
  - Number of vectors searched per query
  - Cache hit rates
- System Resources
  - Memory usage per instance
  - CPU utilization
  - Storage growth rate
- User Patterns
  - Number of active users
  - Query patterns and peak usage times
  - Document count per user
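As one example of turning these metrics into a scaling signal, the sketch below aggregates query latency per user and lists users whose average exceeds a budget. The record format and function names are hypothetical. In practice the input would come from whatever logging or metrics system the deployment already uses.

```python
from collections import defaultdict
from statistics import mean


def per_user_latency(records):
    """Average query latency (ms) and query count per user.

    records: iterable of (user_id, latency_ms) pairs (hypothetical log format).
    """
    by_user = defaultdict(list)
    for user_id, latency_ms in records:
        by_user[user_id].append(latency_ms)
    return {u: {"avg_ms": mean(v), "queries": len(v)} for u, v in by_user.items()}


def users_over_budget(stats, budget_ms):
    """Users whose average latency exceeds the budget, slowest first."""
    slow = [u for u, s in stats.items() if s["avg_ms"] > budget_ms]
    return sorted(slow, key=lambda u: stats[u]["avg_ms"], reverse=True)


records = [("alice", 20), ("alice", 40), ("bob", 300), ("bob", 500), ("carol", 90)]
stats = per_user_latency(records)
print(users_over_budget(stats, budget_ms=100))
```

A user who stays over budget despite user_id filtering matches the case described earlier in which a vector index or a dedicated instance starts to pay off.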