Analytics & Observability
Learn how to use analytics and logging with R2R
Introduction
This guide demonstrates how to leverage R2R’s powerful analytics and logging features. These capabilities allow you to monitor system performance, track usage patterns, and gain valuable insights into your RAG application’s behavior.
Setup
Ensure you have R2R installed and configured as described in the installation guide. For this cookbook, we’ll use the default configuration.
Basic Usage
Logging
R2R automatically logs various events and metrics during its operation. To access these logs:
from r2r import R2R

app = R2R()
# Perform some searches / RAG completions
logs = app.logs()
print(logs)
Expected Output:
[
{
'run_id': UUID('27f124ad-6f70-4641-89ab-f346dc9d1c2f'),
'run_type': 'rag',
'entries': [
{'key': 'search_query', 'value': 'Who is Aristotle?'},
{'key': 'search_latency', 'value': '0.39'},
{'key': 'search_results', 'value': '["{\\"id\\":\\"7ed3a01c-88dc-5a58-a68b-6e5d9f292df2\\",...}"]'},
{'key': 'rag_generation_latency', 'value': '3.79'},
{'key': 'llm_response', 'value': 'Aristotle (Greek: Ἀριστοτέλης Aristotélēs; 384–322 BC) was...'}
]
},
# More log entries...
]
These logs provide detailed information about each operation, including search results, queries, latencies, and LLM responses.
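Because each log entry is a flat list of key/value pairs like the output above, you can also post-process logs locally. The sketch below is a minimal example, assuming entries follow the shape shown in the sample output (the field names mirror that example and are not a guaranteed schema):

```python
from statistics import mean

# Sample entries shaped like the `entries` list in the output above.
entries = [
    {"key": "search_query", "value": "Who is Aristotle?"},
    {"key": "search_latency", "value": "0.39"},
    {"key": "rag_generation_latency", "value": "3.79"},
]

def values_for(entries, key):
    """Collect the float values logged under a given key."""
    return [float(e["value"]) for e in entries if e["key"] == key]

print(mean(values_for(entries, "search_latency")))  # 0.39 for the sample above
```

This kind of local aggregation is useful for quick checks before reaching for the full analytics API.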
To run this from within the quickstart, execute the following:
python -m r2r.examples.quickstart logs
Analytics
R2R offers an analytics feature that allows you to aggregate and analyze log data:
from r2r import FilterCriteria, AnalysisTypes
filter_criteria = FilterCriteria(filters={"search_latencies": "search_latency"})
analysis_types = AnalysisTypes(analysis_types={"search_latencies": ["basic_statistics", "search_latency"]})
analytics_results = app.analytics(filter_criteria, analysis_types)
print(analytics_results)
Expected Output:
{
'results': {
'filtered_logs': {
'search_latencies': [
{
'timestamp': '2024-06-20 21:29:06',
'log_id': UUID('0f28063c-8b87-4934-90dc-4cd84dda5f5c'),
'key': 'search_latency',
'value': '0.66',
'rn': 3
},
...
]
},
'search_latencies': {
'Mean': 0.734,
'Median': 0.523,
'Mode': 0.495,
'Standard Deviation': 0.213,
'Variance': 0.0453
}
}
}
The built-in analytics implementation allows you to:
- Filter logs based on specific criteria
- Perform statistical analysis on various metrics (e.g., search latencies)
- Track performance trends over time
- Identify potential bottlenecks or areas for optimization
To run the same analysis from within the quickstart, execute:
python -m r2r.examples.quickstart analytics --filters '{"search_latencies": "search_latency"}' --analysis_types '{"search_latencies": ["basic_statistics", "search_latency"]}'
Experimental Features
Advanced analytics features are still in an experimental state - please reach out to the R2R team if you are interested in using this feature.
Custom Analytics
R2R’s analytics system is flexible and allows for custom analysis. You can specify different filters and analysis types to focus on specific aspects of your application’s performance.
# Analyze RAG latencies
rag_filter = FilterCriteria(filters={"rag_latencies": "rag_generation_latency"})
rag_analysis = AnalysisTypes(analysis_types={"rag_latencies": ["basic_statistics", "rag_generation_latency"]})
rag_analytics = app.analytics(rag_filter, rag_analysis)
print(rag_analytics)
# Track usage patterns by user
user_filter = FilterCriteria(filters={"user_patterns": "user_id"})
user_analysis = AnalysisTypes(analysis_types={"user_patterns": ["bar_chart", "user_id"]})
user_analytics = app.analytics(user_filter, user_analysis)
print(user_analytics)
# Monitor error rates
error_filter = FilterCriteria(filters={"error_rates": "error"})
error_analysis = AnalysisTypes(analysis_types={"error_rates": ["basic_statistics", "error"]})
error_analytics = app.analytics(error_filter, error_analysis)
print(error_analytics)
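If you prefer to monitor error rates client-side, a simple ratio over the log entries is often enough. The helper below is a hypothetical sketch, assuming errors are logged as entries with key `"error"` (mirroring the filter above; this is not a guaranteed schema):

```python
def error_rate(entries):
    """Fraction of log entries that are errors."""
    total = len(entries)
    errors = sum(1 for e in entries if e["key"] == "error")
    return errors / total if total else 0.0

sample = [
    {"key": "search_latency", "value": "0.41"},
    {"key": "error", "value": "timeout"},
    {"key": "search_latency", "value": "0.37"},
    {"key": "search_latency", "value": "0.52"},
]
print(error_rate(sample))  # 0.25 for the sample above
```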
Preloading Data for Analysis
To get meaningful analytics, you need a substantial amount of data. Here’s a script to preload your database with random searches:
import random
from r2r import R2R, GenerationConfig
app = R2R()
# List of sample queries
queries = [
"What is artificial intelligence?",
"Explain machine learning.",
"How does natural language processing work?",
"What are neural networks?",
"Describe deep learning.",
# Add more queries as needed
]
# Perform random searches
for _ in range(1000):
    query = random.choice(queries)
    app.rag(query, GenerationConfig(model="gpt-3.5-turbo"))
print("Preloading complete. You can now run analytics on this data.")
After running this script, you’ll have a rich dataset to analyze using the analytics features described above.
User-Level Analytics
To get analytics for a specific user:
user_id = "your_user_id_here"
user_filter = FilterCriteria(filters={"user_analytics": "user_id"})
user_analysis = AnalysisTypes(analysis_types={
"user_analytics": ["basic_statistics", "user_id"],
"user_search_latencies": ["basic_statistics", "search_latency"]
})
user_analytics = app.analytics(user_filter, user_analysis)
print(f"Analytics for user {user_id}:")
print(user_analytics)
This will give you insights into the behavior and performance of specific users in your system.
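To see what per-user aggregation looks like conceptually, here is a minimal local sketch that groups latency samples by user and computes a per-user mean. The row shape and values are hypothetical, chosen to mirror the `user_id` and `search_latency` keys used in the filters above:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical flattened log rows; not a guaranteed R2R schema.
rows = [
    {"user_id": "alice", "search_latency": 0.41},
    {"user_id": "bob", "search_latency": 0.88},
    {"user_id": "alice", "search_latency": 0.35},
]

by_user = defaultdict(list)
for row in rows:
    by_user[row["user_id"]].append(row["search_latency"])

per_user_mean = {user: round(mean(vals), 3) for user, vals in by_user.items()}
print(per_user_mean)  # {'alice': 0.38, 'bob': 0.88}
```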
Summary
R2R’s logging and analytics features provide powerful tools for understanding and optimizing your RAG application. By leveraging these capabilities, you can:
- Monitor system performance in real-time
- Analyze trends in search and RAG operations
- Identify potential bottlenecks or areas for improvement
- Track user behavior and usage patterns
- Make data-driven decisions to enhance your application’s performance and user experience
For detailed setup and basic functionality, refer back to the R2R Quickstart. For more advanced usage and customization options, join the R2R Discord community.