Walkthrough
A detailed step-by-step cookbook of the core features provided by R2R.
This guide shows how to use R2R to:
- Ingest files into R2R
- Search over ingested files
- Use your data as input to RAG (Retrieval-Augmented Generation)
- Extract entities and relationships from your data to create a graph
- Perform basic user auth
- Observe and analyze an R2R deployment
Introduction
R2R is an engine for building user-facing Retrieval-Augmented Generation (RAG) applications. At its core, R2R provides this service through an architecture of providers, services, and an integrated RESTful API. This cookbook provides a detailed walkthrough of how to interact with R2R. For a deeper dive, refer to the R2R system architecture documentation.
Hello R2R
R2R gives developers configurable vector search and RAG right out of the box, as well as direct method calls instead of the client-server architecture seen throughout the docs:
Document Ingestion and Management
R2R efficiently handles diverse document types using Postgres with pgvector, combining relational data management with vector search capabilities. This approach enables seamless ingestion, storage, and retrieval of multimodal data, while supporting flexible document management and user permissions.
Key features include:
- A unique Document, with a corresponding id, created for each ingested file or context, which contains the downstream Chunks and Entities & Relationships.
- User and Collection objects for comprehensive document permissions.
- Graph construction and maintenance.
- Flexible document deletion and update mechanisms at global document and chunk levels.
Create Documents
R2R offers a powerful data ingestion process that handles various file types including html, pdf, png, mp3, and txt.
The ingestion process parses, chunks, embeds, and stores documents efficiently. A durable orchestration workflow coordinates the entire process.
This command initiates the ingestion process, producing output similar to:
Key features of the ingestion process:
- Unique document_id generation for each file
- Metadata association, including user_id and collection_ids for document management
- Efficient parsing, chunking, and embedding of diverse file types
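To make the parse, chunk, embed, and store sequence concrete, here is a minimal in-memory sketch. It is not R2R's implementation: the function names (chunk_text, toy_embed, ingest), the fixed-size chunking with overlap, and the hash-based stand-in for an embedding model are all assumptions chosen so the example runs with only the standard library.

```python
import hashlib
import uuid


def chunk_text(text, size=40, overlap=10):
    """Split text into fixed-size character windows that overlap."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    if not text:
        return []
    step = size - overlap
    # Stop once the remaining tail is already covered by the previous window.
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def toy_embed(chunk, dims=8):
    """Deterministic placeholder for an embedding model (NOT semantic)."""
    digest = hashlib.sha256(chunk.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dims]]


def ingest(text, user_id, collection_ids, store):
    """Chunk and 'embed' a document, then store it under a fresh document id."""
    document_id = str(uuid.uuid4())
    store[document_id] = {
        "metadata": {"user_id": user_id, "collection_ids": list(collection_ids)},
        "chunks": [{"text": c, "vector": toy_embed(c)} for c in chunk_text(text)],
    }
    return document_id


store = {}
doc_id = ingest("R2R ingests files. " * 10, "user-1", ["collection-1"], store)
print(doc_id, len(store[doc_id]["chunks"]))
```

In a real deployment the store would be Postgres with pgvector and the vectors would come from an embedding provider; the dictionary here only mirrors the shape of the data.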
Retrieving Documents
R2R allows retrieval of high-level document information stored in a relational table within the Postgres database. To fetch this information:
This command returns document metadata, including:
This overview provides quick access to document versions, sizes, and associated metadata, facilitating efficient document management.
Retrieving Document Chunks
R2R enables retrieval of specific document chunks and associated metadata. To fetch chunks for a particular document by id:
This command returns detailed chunk information:
These features allow for granular access to document content.
Deleting Documents
R2R supports flexible document deletion through a method that can run arbitrary deletion filters. To delete a document by its ID:
This command produces output similar to:
Key features of the deletion process:
- Deletion by document ID
- Cascading deletion of associated chunks and metadata
- Deletion by filter (for example, by text match or user id match) with documents/by-filter
This flexible deletion mechanism ensures precise control over document management within the R2R system.
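The idea of filter-based deletion with cascading cleanup can be sketched in a few lines. This is an illustration only: the filter shape ({"field": {"$eq": value}}), the $contains operator, and the helper names are assumptions made for the example, not a description of how R2R evaluates filters.

```python
def matches(record, filters):
    """Return True when every filter clause holds for the record."""
    for field, condition in filters.items():
        actual = record.get(field)
        for op, expected in condition.items():
            if op == "$eq":
                ok = actual == expected
            elif op == "$contains":
                ok = actual is not None and expected in actual
            else:
                raise ValueError(f"unsupported operator: {op}")
            if not ok:
                return False
    return True


def delete_by_filter(documents, chunks, filters):
    """Delete matching documents and cascade to their chunks (in place)."""
    doomed = {doc_id for doc_id, doc in documents.items() if matches(doc, filters)}
    for doc_id in doomed:
        del documents[doc_id]
    chunks[:] = [c for c in chunks if c["document_id"] not in doomed]
    return doomed


documents = {
    "doc-1": {"user_id": "alice", "title": "Quarterly report"},
    "doc-2": {"user_id": "bob", "title": "Meeting notes"},
}
chunks = [
    {"document_id": "doc-1", "text": "Revenue grew."},
    {"document_id": "doc-2", "text": "Action items."},
]
deleted = delete_by_filter(documents, chunks, {"user_id": {"$eq": "alice"}})
print(deleted, list(documents), chunks)
```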
For more advanced document management techniques and user authentication details, refer to the user documentation.
AI Powered Search
R2R offers powerful and highly configurable search capabilities, including vector search, hybrid search, and knowledge graph-enhanced search. These features allow for more accurate and contextually relevant information retrieval.
Vector Search
Vector search parameters in R2R can be tuned at runtime for optimal results. Here’s how to perform a basic vector search:
Expected Output
The key configurable parameters for vector search are described in the retrieval API reference.
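For readers new to vector search, the core ranking step can be sketched as a cosine-similarity top-k over stored vectors. This is a conceptual toy, not R2R's search path (which runs inside Postgres with pgvector); the function names and the limit parameter are assumptions for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_search(query_vector, index, limit=3):
    """Return the `limit` most similar (chunk_id, score) pairs, best first."""
    scored = [(cosine_similarity(query_vector, vec), chunk_id)
              for chunk_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(chunk_id, score) for score, chunk_id in scored[:limit]]


index = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
print(vector_search([1.0, 0.0], index, limit=2))
```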
Hybrid Search
R2R supports hybrid search, which combines traditional keyword-based search with vector search for improved results. Here’s how to perform a hybrid search:
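One common way to combine a keyword ranking with a vector ranking is reciprocal rank fusion (RRF). The sketch below shows that technique in isolation; whether R2R uses exactly this formula, and with which constant, is not stated in this walkthrough, so treat the function name and k=60 as assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list, best first.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1.
    """
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda item: scores[item], reverse=True)


keyword_hits = ["d1", "d2", "d3"]
vector_hits = ["d2", "d4", "d1"]
print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

Documents that appear near the top of both lists (here d2 and d1) end up ahead of documents that only one method found.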
Retrieval-Augmented Generation (RAG)
R2R is built around a comprehensive Retrieval-Augmented Generation (RAG) engine, allowing you to generate contextually relevant responses based on your ingested documents. The RAG process combines all the search functionality shown above with Large Language Models to produce more accurate and informative answers.
Basic RAG
To generate a response using RAG, use the following command:
Example Output:
This command performs a search on the ingested documents and uses the retrieved information to generate a response.
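The "retrieve, then generate" step can be pictured as assembling a prompt from the top search results before calling a language model. The sketch below covers only that assembly; the prompt wording, the result shape ({"text": ...}), and the function name are assumptions, and no LLM is called.

```python
def build_rag_prompt(query, results, max_chunks=3):
    """Format the top search results as numbered context for an LLM prompt."""
    context = "\n".join(
        f"[{i}] {result['text']}"
        for i, result in enumerate(results[:max_chunks], start=1)
    )
    return (
        "Answer the question using only the numbered context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
    )


results = [
    {"text": "R2R stores chunks in Postgres with pgvector."},
    {"text": "Hybrid search combines keyword and vector search."},
]
print(build_rag_prompt("Where does R2R store chunks?", results))
```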
RAG w/ Hybrid Search
R2R also supports hybrid search in RAG, combining the power of vector search and keyword-based search. To use hybrid search in RAG, simply add the use_hybrid_search flag to your search settings input:
Example Output:
This example demonstrates how hybrid search can enhance the RAG process by combining semantic understanding with keyword matching, potentially providing more accurate and comprehensive results.
Streaming RAG
R2R also supports streaming RAG responses, which can be useful for real-time applications. To use streaming RAG:
Example Output:
Streaming allows the response to be generated and sent in real time, chunk by chunk.
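Chunk-by-chunk delivery maps naturally onto a Python generator. The sketch below simulates a streamed response with plain strings so it runs offline; stream_response and consume are hypothetical names, and a real client would read the pieces from the server rather than slice a finished string.

```python
def stream_response(text, chunk_size=8):
    """Yield a response a few characters at a time, like a streamed reply."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]


def consume(stream):
    """Collect streamed pieces; a UI would render each piece as it arrives."""
    parts = []
    for piece in stream:
        parts.append(piece)
    return "".join(parts)


for piece in stream_response("Streaming sends partial output early.", chunk_size=10):
    print(piece, end="", flush=True)
print()
```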
Customizing RAG
R2R offers extensive customization options for its Retrieval-Augmented Generation (RAG) functionality:
- Search Settings: Customize vector and knowledge graph search parameters using VectorSearchSettings and KGSearchSettings.
- Generation Config: Fine-tune the language model’s behavior with GenerationConfig, including:
  - Temperature, top_p, top_k for controlling randomness
  - Max tokens, model selection, and streaming options
  - Advanced settings like beam search and sampling strategies
- Multiple LLM Support: Easily switch between different language models and providers:
  - OpenAI models (default)
  - Anthropic’s Claude models
  - Local models via Ollama
  - Any provider supported by LiteLLM
Example of customizing the model:
This flexibility allows you to optimize RAG performance for your specific use case and leverage the strengths of various LLM providers.
Behind the scenes, R2R’s RetrievalService handles RAG requests, combining the power of vector search, optional knowledge graph integration, and language model generation. The flexible architecture allows for easy customization and extension of the RAG pipeline to meet diverse requirements.
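As a rough picture of what a generation configuration carries, the sketch below models a few of the options listed above as a validated, immutable settings object. GenerationSettings, its field names, its defaults, and the placeholder model strings are all hypothetical; consult the retrieval reference for the fields R2R's GenerationConfig actually accepts.

```python
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class GenerationSettings:
    """Hypothetical stand-in for a generation config (not R2R's class)."""
    model: str = "provider-a/default-model"  # placeholder model identifier
    temperature: float = 0.1
    top_p: float = 1.0
    max_tokens: int = 1024
    stream: bool = False

    def __post_init__(self):
        if not 0.0 <= self.temperature <= 2.0:
            raise ValueError("temperature must be between 0 and 2")
        if not 0.0 < self.top_p <= 1.0:
            raise ValueError("top_p must be in (0, 1]")
        if self.max_tokens <= 0:
            raise ValueError("max_tokens must be positive")


def request_payload(query, settings):
    """Bundle a query with its generation settings as a plain dict."""
    return {"query": query, "generation": asdict(settings)}


defaults = GenerationSettings()
# Switch model and temperature for one request without mutating the defaults.
custom = replace(defaults, model="provider-b/other-model", temperature=0.7)
print(request_payload("Who is Jon Snow?", custom))
```

Keeping the settings immutable and deriving per-request variants with replace makes it easy to compare providers without changing shared defaults.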
Graphs in R2R
R2R implements a Git-like model for knowledge graphs, where each collection has a corresponding graph that can diverge and be independently managed. This approach allows for flexible knowledge management while maintaining data consistency.
Graph-Collection Relationship
- Each collection has an associated graph that acts similar to a Git branch
- Graphs can diverge from their underlying collections through independent updates
- The pull operation syncs the graph with its collection, similar to a Git pull
- This model enables experimental graph modifications without affecting the base collection
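The branch-like relationship described above can be modelled with two small classes. This is a toy model under stated assumptions: the class names, the extracted mapping, and the add_entity method are invented for the example, and only pull and reset correspond to operations named in this guide.

```python
class Collection:
    """Holds knowledge extracted per document (document_id -> entity names)."""

    def __init__(self):
        self.extracted = {}


class Graph:
    """A graph tied to one collection that may diverge from it."""

    def __init__(self, collection):
        self.collection = collection
        self.entities = set()

    def pull(self):
        """Sync: bring in every entity extracted in the collection."""
        for names in self.collection.extracted.values():
            self.entities |= set(names)

    def add_entity(self, name):
        """Local edit: changes the graph only, never the collection."""
        self.entities.add(name)

    def reset(self):
        """Return the graph to a clean, empty state."""
        self.entities.clear()


collection = Collection()
collection.extracted["doc-1"] = {"Aristotle", "Plato"}
graph = Graph(collection)
graph.pull()
graph.add_entity("Socrates")  # the graph now diverges from the collection
print(sorted(graph.entities), collection.extracted)
```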
Knowledge Graph Workflow
Extract Document Knowledge
Extract entities and relationships from the previously ingested document:
This step processes the document to identify entities and their relationships.
Initialize and Populate Graph
Sync the graph with the collection and view extracted knowledge:
Build Graph Communities
Build and list graph communities:
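To give a feel for what a "community" is, the sketch below groups entities that are linked by relationships. It uses connected components as a deliberately simple stand-in; this walkthrough does not say which community algorithm R2R uses, and production systems typically use more refined clustering. The function name and the (source, target) tuple shape are assumptions.

```python
from collections import defaultdict


def build_communities(relationships):
    """Group entities into connected components of the relationship graph."""
    adjacency = defaultdict(set)
    for source, target in relationships:
        adjacency[source].add(target)
        adjacency[target].add(source)

    seen = set()
    communities = []
    for node in sorted(adjacency):
        if node in seen:
            continue
        # Depth-first walk collecting everything reachable from this node.
        component, stack = set(), [node]
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(adjacency[current] - component)
        seen |= component
        communities.append(sorted(component))
    return communities


relationships = [("Aristotle", "Plato"), ("Plato", "Socrates"), ("Kant", "Hume")]
print(build_communities(relationships))
```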
Knowledge Graph Search
Perform knowledge graph-enhanced search (enabled by default):
Cleanup
Reset the graph to a clean state:
Best Practices
- Graph Synchronization
  - Always pull before attempting to list or work with entities
  - Keep track of which documents have been added to the graph
- Community Management
  - Build communities after significant changes to the graph
  - Use community information to enhance search results
- Version Control
  - Treat graphs like Git branches: experiment freely
  - Use reset to start fresh if needed
  - Maintain documentation of graph modifications
This Git-like model provides a flexible framework for knowledge management while maintaining data consistency and enabling experimental modifications.
User Management
R2R provides robust user auth and management capabilities. This section briefly covers user authentication features and how they relate to document management.
User Registration
To register a new user:
Example output:
Email Verification
After registration, users need to verify their email:
User Login
To log in and obtain access tokens:
User-Specific Search
Once authenticated, search results are automatically filtered to include only documents associated with the current user:
Refresh Access Token
To refresh an expired access token:
User Logout
To log out and invalidate the current access token:
These authentication features ensure that users can only access and manage their own documents. When performing operations like search, RAG, or document management, the results are automatically filtered based on the authenticated user’s permissions.
Remember to replace YOUR_ACCESS_TOKEN and YOUR_REFRESH_TOKEN with actual tokens obtained during the login process.
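The login, refresh, and logout lifecycle can be sketched with an in-memory token store. This only illustrates the general access-token and refresh-token pattern; it says nothing about how R2R issues or validates tokens. TokenStore, its methods, and the injected clock are hypothetical.

```python
import secrets
import time


class TokenStore:
    """Toy access/refresh token lifecycle (illustrative only)."""

    def __init__(self, access_ttl=60, clock=None):
        self.access_ttl = access_ttl
        self.clock = clock or time.monotonic
        self.access = {}   # access token -> (user, expires_at)
        self.refresh = {}  # refresh token -> user

    def _issue_access(self, user):
        token = secrets.token_hex(16)
        self.access[token] = (user, self.clock() + self.access_ttl)
        return token

    def login(self, user):
        """Return a short-lived access token and a refresh token."""
        refresh_token = secrets.token_hex(16)
        self.refresh[refresh_token] = user
        return self._issue_access(user), refresh_token

    def current_user(self, access_token):
        """Resolve a token to its user, or None if unknown or expired."""
        entry = self.access.get(access_token)
        if entry is None or entry[1] <= self.clock():
            return None
        return entry[0]

    def refresh_access(self, refresh_token):
        """Exchange a valid refresh token for a new access token."""
        user = self.refresh.get(refresh_token)
        if user is None:
            raise PermissionError("unknown refresh token")
        return self._issue_access(user)

    def logout(self, access_token):
        """Invalidate the given access token."""
        self.access.pop(access_token, None)


store = TokenStore(access_ttl=60)
access_token, refresh_token = store.login("alice")
print(store.current_user(access_token))
```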
Observability and Analytics
R2R provides robust observability and analytics features, allowing superusers to monitor system performance, track usage patterns, and gain insights into the RAG application’s behavior. These advanced features are crucial for maintaining and optimizing your R2R deployment.
Observability and analytics features are restricted to superusers only. By default, R2R is configured to treat unauthenticated users as superusers for quick testing and development. In a production environment, you should disable this setting and properly manage superuser access.
Users Overview
R2R offers high-level user observability for superusers:
This command returns detailed user information. Here’s some example output:
For each user, this summary reports the number of files ingested, the total size of those files, and the corresponding document ids.
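The per-user summary is essentially a group-by over document records. The sketch below shows that aggregation on plain dictionaries; the record fields (id, user_id, size_in_bytes) and the output keys are assumptions for illustration and may not match the names R2R returns.

```python
def users_overview(documents):
    """Aggregate file count, total size, and document ids per user."""
    summary = {}
    for doc in documents:
        entry = summary.setdefault(
            doc["user_id"],
            {"num_files": 0, "total_size_in_bytes": 0, "document_ids": []},
        )
        entry["num_files"] += 1
        entry["total_size_in_bytes"] += doc["size_in_bytes"]
        entry["document_ids"].append(doc["id"])
    return summary


documents = [
    {"id": "doc-1", "user_id": "alice", "size_in_bytes": 1200},
    {"id": "doc-2", "user_id": "alice", "size_in_bytes": 800},
    {"id": "doc-3", "user_id": "bob", "size_in_bytes": 500},
]
print(users_overview(documents))
```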
Logging
R2R automatically logs various events and metrics during its operation. You can access these logs using the logs command:
This command returns detailed log entries for various operations, including search and RAG requests. Here’s an example of a log entry:
These logs provide detailed information about each operation, including search results, queries, latencies, and LLM responses.
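Once log entries are retrieved, a simple first analysis is the average latency per operation type. The sketch below assumes each entry is a dictionary with operation and latency_seconds keys; those field names are hypothetical and should be adapted to the actual log format.

```python
from collections import defaultdict


def latency_by_operation(entries):
    """Return the mean latency in seconds for each operation type."""
    totals = defaultdict(lambda: [0.0, 0])  # operation -> [sum, count]
    for entry in entries:
        bucket = totals[entry["operation"]]
        bucket[0] += entry["latency_seconds"]
        bucket[1] += 1
    return {op: total / count for op, (total, count) in totals.items()}


entries = [
    {"operation": "search", "latency_seconds": 0.2},
    {"operation": "search", "latency_seconds": 0.4},
    {"operation": "rag", "latency_seconds": 1.5},
]
print(latency_by_operation(entries))
```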
These observability and analytics features provide valuable insights into your R2R application’s performance and usage, enabling data-driven optimization and decision-making.
Next Steps
Now that you have a basic understanding of R2R’s core features, you can explore more advanced topics:
- Dive into document ingestion and the document reference.
- Learn about search and RAG and the retrieval reference.
- Try advanced techniques like knowledge-graphs and refer to the graph reference.
- Learn about user authentication to secure your application permissions and the users API reference.
- Organize your documents using collections for granular access control.