Walkthrough

A detailed step-by-step cookbook of the core features provided by R2R.

This guide shows how to use R2R to:

  1. Ingest files into R2R
  2. Search over ingested files
  3. Use your data as input to RAG (Retrieval-Augmented Generation)
  4. Extract entities and relationships from your data to create a graph
  5. Perform basic user authentication
  6. Observe and analyze an R2R deployment

Introduction

R2R is an engine for building user-facing Retrieval-Augmented Generation (RAG) applications. At its core, R2R provides this service through an architecture of providers, services, and an integrated RESTful API. This cookbook provides a detailed walkthrough of how to interact with R2R. Refer here for a deeper dive on the R2R system architecture.

Hello R2R

R2R gives developers configurable vector search and RAG right out of the box. The minimal example below shows the end-to-end flow through the Python client used throughout the docs:

core/examples/hello_r2r.py

from r2r import R2RClient

client = R2RClient()  # optional, pass in "http://localhost:7272" or "https://api.cloud.sciphi.ai"

# Create a small sample file to ingest
with open("test.txt", "w") as file:
    file.write("John is a person that works at Google.")

client.documents.create(file_path="test.txt")

# Call RAG directly
rag_response = client.retrieval.rag(
    query="Who is John?",
    rag_generation_config={"model": "openai/gpt-4o-mini", "temperature": 0.0},
)
results = rag_response.results

print(f"Search Results:\n{results.search_results}")
# AggregateSearchResult(chunk_search_results=[ChunkSearchResult(score=0.685, text=John is a person that works at Google.)], graph_search_results=[], web_search_results=[], context_document_results=[])

print(f"Completion:\n{results.completion}")
# John is a person that works at Google [1].

Document Ingestion and Management

R2R efficiently handles diverse document types using Postgres with pgvector, combining relational data management with vector search capabilities. This approach enables seamless ingestion, storage, and retrieval of multimodal data, while supporting flexible document management and user permissions.

Key features include:

  • A unique Document, with a corresponding id, is created for each ingested file or context; it contains the downstream Chunks and Entities & Relationships.
  • User and Collection objects for comprehensive document permissions.
  • Graph construction and maintenance.
  • Flexible document deletion and update mechanisms at both the document and chunk levels.

Note: all document-related commands are gated to documents the user has uploaded or can access through shared collections, with the exception of superusers.

R2R offers a powerful data ingestion process that handles various file types including html, pdf, png, mp3, and txt.

The ingestion process parses, chunks, embeds, and stores documents efficiently. A durable orchestration workflow coordinates the entire process.

# export R2R_API_KEY=...
from r2r import R2RClient

client = R2RClient()  # or set base_url=...
# when using auth, log in first with client.users.login(...)

client.documents.create_sample(hi_res=True)
# to ingest your own document: client.documents.create(file_path="/path/to/file")

This command initiates the ingestion process, producing output similar to:

IngestionResponse(message='Document created and ingested successfully.', task_id=None, document_id=UUID('e43864f5-a36f-548e-aacd-6f8d48b30c7f'))

Key features of the ingestion process:

  1. Unique document_id generation for each file
  2. Metadata association, including user_id and collection_ids for document management
  3. Efficient parsing, chunking, and embedding of diverse file types
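
Custom metadata can be attached at ingestion time. A minimal sketch (the metadata keys shown here are illustrative):

# attach illustrative metadata to a document at ingestion time
client.documents.create(
    file_path="DeepSeek_R1.pdf",
    metadata={"title": "DeepSeek_R1.pdf", "version": "v0"},
)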

R2R allows retrieval of high-level document information stored in a relational table within the Postgres database. To fetch this information:

result = client.documents.list(
    limit=10,
    offset=0,
)

This command returns document metadata, including:

results=[
  DocumentResponse(
    id=UUID('e43864f5-a36f-548e-aacd-6f8d48b30c7f'),
    collection_ids=[UUID('122fdf6a-e116-546b-a8f6-e4cb2e2c0a09')],
    owner_id=UUID('2acb499e-8428-543b-bd85-0d9098718220'),
    document_type=<DocumentType.PDF: 'pdf'>,
    metadata={'title': 'DeepSeek_R1.pdf', 'version': 'v0'},
    version='v0',
    size_in_bytes=1768572,
    ingestion_status=<IngestionStatus.SUCCESS: 'success'>,
    extraction_status=<GraphExtractionStatus.PENDING: 'pending'>,
    created_at=datetime.datetime(2025, 2, 8, 3, 31, 39, 126759, tzinfo=TzInfo(UTC)),
    updated_at=datetime.datetime(2025, 2, 8, 3, 31, 39, 160114, tzinfo=TzInfo(UTC)),
    ingestion_attempt_number=None,
    summary="The document contains a comprehensive overview of DeepSeek-R1, a series of reasoning models developed by DeepSeek-AI, which includes DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero utilizes large-scale reinforcement learning (RL) without supervised fine-tuning, showcasing impressive reasoning capabilities but facing challenges like readability and language mixing. To enhance performance, DeepSeek-R1 incorporates multi-stage training and cold-start data, achieving results comparable to OpenAI's models on various reasoning tasks. The document details the models' training processes, evaluation results across multiple benchmarks, and the introduction of distilled models that maintain reasoning capabilities while being smaller and more efficient. It also discusses the limitations of current models, such as language mixing and sensitivity to prompts, and outlines future research directions to improve general capabilities and efficiency in software engineering tasks. The findings emphasize the potential of RL in developing reasoning abilities in large language models and the effectiveness of distillation techniques for smaller models.",
    summary_embedding=None,
    total_tokens=29673
  ), ...
] total_entries=1

This overview provides quick access to document versions, sizes, and associated metadata, facilitating efficient document management.
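
The limit and offset parameters make it straightforward to page through a larger corpus. A minimal sketch, assuming the paginated response exposes results and total_entries as shown above:

# page through all documents ten at a time
offset, page_size = 0, 10
while True:
    page = client.documents.list(limit=page_size, offset=offset)
    for doc in page.results:
        print(doc.id, doc.ingestion_status)
    if offset + page_size >= page.total_entries:
        break
    offset += page_size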

R2R enables retrieval of specific document chunks and associated metadata. To fetch chunks for a particular document by id:

client.documents.list_chunks(id="9fbe403b-c11c-5aae-8ade-ef22980c3ad1")

This command returns detailed chunk information:

results=[
  ChunkResponse(
    id=UUID('27a2e605-2916-59fe-a4da-b19853713298'),
    document_id=UUID('30f950f0-c692-57c5-b6ec-ff78ccf5ccdc'),
    owner_id=UUID('2acb499e-8428-543b-bd85-0d9098718220'),
    collection_ids=[UUID('122fdf6a-e116-546b-a8f6-e4cb2e2c0a09')],
    text='John is a person that works at Google.',
    metadata={'version': 'v0', 'chunk_order': 0, 'document_type': 'txt'},
    vector=None
  )
] total_entries=1

These features allow for granular access to document content.

R2R supports flexible document deletion through a method that can run arbitrary deletion filters. To delete a document by its ID:

client.documents.delete(id="9fbe403b-c11c-5aae-8ade-ef22980c3ad1")

This command produces output similar to:

GenericBooleanResponse(success=True)

Key features of the deletion process:

  1. Deletion by document ID
  2. Cascading deletion of associated chunks and metadata
  3. Deletion by filter, e.g. by text match or user id match, via the documents/by-filter endpoint (see the sketch below)

This flexible deletion mechanism ensures precise control over document management within the R2R system.
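
A hedged sketch of filter-based deletion: the method name delete_by_filter is assumed here as the SDK wrapper for the documents/by-filter endpoint, and the filter syntax follows the $-operator style used in search filters:

# hypothetical SDK wrapper for the documents/by-filter endpoint;
# verify the exact method name against your SDK version
client.documents.delete_by_filter(
    filters={"title": {"$eq": "DeepSeek_R1.pdf"}}
)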

For more advanced document management techniques and user authentication details, refer to the user documentation.

R2R offers powerful and highly configurable search capabilities, including vector search, hybrid search, and knowledge graph-enhanced search. These features allow for more accurate and contextually relevant information retrieval.

Vector search parameters inside of R2R can be fine-tuned at runtime for optimal results. Here’s how to perform a basic vector search:

client.retrieval.search(
    query="What is DeepSeek R1?",
)

AggregateSearchResult(
  chunk_search_results=[
    ChunkSearchResult(
      score=0.643,
      text="Document Title: DeepSeek_R1.pdf
        Text: could achieve an accuracy of over 70%.
        DeepSeek-R1 also delivers impressive results on IF-Eval, a benchmark designed to assess a
        models ability to follow format instructions. These improvements can be linked to the inclusion
        of instruction-following data during the final stages of supervised fine-tuning (SFT) and RL
        training. Furthermore, remarkable performance is observed on AlpacaEval2.0 and ArenaHard,
        indicating DeepSeek-R1s strengths in writing tasks and open-domain question answering. Its
        significant outperformance of DeepSeek-V3 underscores the generalization benefits of large-scale
        RL, which not only boosts reasoning capabilities but also improves performance across diverse
        domains. Moreover, the summary lengths generated by DeepSeek-R1 are concise, with an
        average of 689 tokens on ArenaHard and 2,218 characters on AlpacaEval 2.0. This indicates that
        DeepSeek-R1 avoids introducing length bias during GPT-based evaluations, further solidifying
        its robustness across multiple tasks."
    ), ...
  ],
  graph_search_results=[],
  web_search_results=[],
  context_document_results=[]
)

Key configurable parameters for vector search can be inferred from the retrieval API reference.
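
For example, the number of results and metadata filters can be set per request. A minimal sketch; see the API reference for the full list of supported settings:

client.retrieval.search(
    query="What is DeepSeek R1?",
    search_settings={
        "limit": 25,  # number of chunks to return
        "filters": {"title": {"$in": ["DeepSeek_R1.pdf"]}},
    },
)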

R2R supports hybrid search, which combines traditional keyword-based search with vector search for improved results. Here’s how to perform a hybrid search:

client.retrieval.search(
    "What was Uber's profit in 2020?",
    search_settings={
        "index_measure": "l2_distance",
        "use_hybrid_search": True,
        "hybrid_settings": {
            "full_text_weight": 1.0,
            "semantic_weight": 5.0,
            "full_text_limit": 200,
            "rrf_k": 50,
        },
        "filters": {"title": {"$in": ["DeepSeek_R1.pdf"]}},
    },
)
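
The hybrid settings above suggest a weighted reciprocal rank fusion (RRF) of the full-text and semantic rankings. A minimal sketch of that scoring, assuming the common weighted-RRF formulation implied by full_text_weight, semantic_weight, and rrf_k:

def weighted_rrf_score(
    semantic_rank: int,
    full_text_rank: int,
    semantic_weight: float = 5.0,
    full_text_weight: float = 1.0,
    rrf_k: int = 50,
) -> float:
    """Each ranking contributes weight / (rrf_k + rank), so documents that
    rank highly in either list rise to the top of the fused results."""
    return (semantic_weight / (rrf_k + semantic_rank)
            + full_text_weight / (rrf_k + full_text_rank))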

AI Retrieval (RAG)

R2R is built around a comprehensive Retrieval-Augmented Generation (RAG) engine, allowing you to generate contextually relevant responses based on your ingested documents. The RAG process combines all the search functionality shown above with Large Language Models to produce more accurate and informative answers.

To generate a response using RAG, use the following command:

client.retrieval.rag(query="What is DeepSeek R1?")

Example Output:

RAGResponse(
  generated_answer='DeepSeek-R1 is a model that demonstrates impressive performance across various tasks, leveraging reinforcement learning (RL) and supervised fine-tuning (SFT) to enhance its capabilities. It excels in writing tasks, open-domain question answering, and benchmarks like IF-Eval, AlpacaEval2.0, and ArenaHard [1], [2]. DeepSeek-R1 outperforms its predecessor, DeepSeek-V3, in several areas, showcasing its strengths in reasoning and generalization across diverse domains [1]. It also achieves competitive results on factual benchmarks like SimpleQA, although it performs worse on the Chinese SimpleQA benchmark due to safety RL constraints [2]. Additionally, DeepSeek-R1 is involved in distillation processes to transfer its reasoning capabilities to smaller models, which perform exceptionally well on benchmarks [4], [6]. The model is optimized for English and Chinese, with plans to address language mixing issues in future updates [8].',
  search_results=AggregateSearchResult(
    chunk_search_results=[ChunkSearchResult(score=0.643, text=Document Title: DeepSeek_R1.pdf ...)]
  ),
  citations=[Citation(index=1, rawIndex=1, startIndex=305, endIndex=308, snippetStartIndex=288, snippetEndIndex=315, sourceType='chunk', id='e760bb76-1c6e-52eb-910d-0ce5b567011b', document_id='e43864f5-a36f-548e-aacd-6f8d48b30c7f', owner_id='2acb499e-8428-543b-bd85-0d9098718220', collection_ids=['122fdf6a-e116-546b-a8f6-e4cb2e2c0a09'], score=0.6433466439465674, text='Document Title: DeepSeek_R1.pdf\n\nText: could achieve an accuracy of over 70%.\nDeepSeek-R1 also delivers impressive results on IF-Eval, a benchmark designed to assess a\nmodels ability to follow format instructions. These improvements can be linked to the inclusion\nof instruction-following...],
  metadata={'id': 'chatcmpl-B0BaZ0vwIa58deI0k8NIuH6pBhngw', 'choices': [{'finish_reason': 'stop', 'index': 0, 'logprobs': None, 'message': {'refusal': None, 'role': 'assistant', 'audio': None, 'function_call': None, 'tool_calls': None}}], 'created': 1739384247, 'model': 'gpt-4o-2024-08-06', 'object': 'chat.completion', 'service_tier': 'default', 'system_fingerprint': 'fp_4691090a87', ...}
)

This command performs a search on the ingested documents and uses the retrieved information to generate a response.
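
Because the response is structured, citations can be mapped back to their source chunks programmatically. A sketch against the fields shown in the output above:

response = client.retrieval.rag(query="What is DeepSeek R1?")
results = response.results
print(results.generated_answer)
for citation in results.citations:
    # each citation carries the chunk id and its parent document id
    print(citation.index, citation.id, citation.document_id)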

R2R also supports streaming RAG responses, which can be useful for real-time applications. To use streaming RAG:

response = client.retrieval.rag(
    "who was aristotle",
    rag_generation_config={"stream": True},
    search_settings={"use_hybrid_search": True},
)
for chunk in response:
    print(chunk, end='', flush=True)

Example Output:

<search>["{\"id\":\"808c47c5-ebef-504a-a230-aa9ddcfbd87 .... </search>
<completion>Aristotle was an Ancient Greek philosopher and polymath born in 384 BC in Stagira, Chalcidice [1], [4]. He was a student of Plato and later became the tutor of Alexander the Great [2]. Aristotle founded the Peripatetic school of philosophy in the Lyceum in Athens and made significant contributions across a broad range of subjects, including natural sciences, philosophy, linguistics, economics, politics, psychology, and the arts [4]. His work laid the groundwork for the development of modern science [4]. Aristotle's influence extended well beyond his time, impacting medieval Islamic and Christian scholars, and his contributions to logic, ethics, and biology were particularly notable [8], [9], [10].</completion>

Streaming allows the response to be generated and sent in real-time, chunk by chunk.

R2R offers extensive customization options for its Retrieval-Augmented Generation (RAG) functionality:

  1. Search Settings: Customize vector and knowledge graph search parameters using VectorSearchSettings and KGSearchSettings.

  2. Generation Config: Fine-tune the language model’s behavior with GenerationConfig, including:

    • Temperature, top_p, top_k for controlling randomness
    • Max tokens, model selection, and streaming options
    • Advanced settings like beam search and sampling strategies
  3. Multiple LLM Support: Easily switch between different language models and providers:

    • OpenAI models (default)
    • Anthropic’s Claude models
    • Local models via Ollama
    • Any provider supported by LiteLLM

Example of customizing the model:

# requires ANTHROPIC_API_KEY to be set
response = client.retrieval.rag(
    "Who was Aristotle?",
    rag_generation_config={"model": "anthropic/claude-3-haiku-20240307", "stream": True},
)
for chunk in response:
    print(chunk, end='', flush=True)

This flexibility allows you to optimize RAG performance for your specific use case and leverage the strengths of various LLM providers.
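
For example, several GenerationConfig options can be combined in a single request. A minimal sketch (exact field names are listed in the GenerationConfig reference):

response = client.retrieval.rag(
    "Who was Aristotle?",
    rag_generation_config={
        "model": "openai/gpt-4o-mini",
        "temperature": 0.2,  # lower temperature for more deterministic answers
        "top_p": 0.95,
    },
)
print(response.results.generated_answer)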

The R2R Reasoning Agent combines retrieval-augmented generation with step-by-step reasoning to produce higher quality responses from your documents.

client.retrieval.rag(
    query="What does deepseek r1 imply?",
    rag_generation_config={
        "stream": True,
    },
)

Example output:

<Thought>Calling function: local_search, with payload {"query":"DeepSeek R1"}</Thought>
<Thought>The search results provide a comprehensive overview of DeepSeek-R1, highlighting its capabilities and performance across various benchmarks and tasks. DeepSeek-R1 is a reasoning model developed by DeepSeek-AI, which leverages reinforcement learning (RL) and instruction-following data to enhance its performance. It excels in tasks such as writing, open-domain question answering, and handling fact-based queries. The model outperforms its predecessor, DeepSeek-V3, in several areas, although it falls short in some complex tasks like function calling and multi-turn interactions. DeepSeek-R1 also demonstrates strong performance in educational tasks and creative writing, showcasing its versatility and robustness.

Key points about DeepSeek-R1 include:
- It achieves impressive results on benchmarks like IF-Eval, AlpacaEval2.0, and ArenaHard, indicating strengths in writing and question answering [Source 1].
- The model is used as a teacher to distill reasoning capabilities into smaller models, which also perform well on benchmarks [Source 2].
- It outperforms DeepSeek-V3 on factual benchmarks like SimpleQA but has limitations in language mixing and certain complex tasks [Sources 3, 5].
- DeepSeek-R1 demonstrates expert-level performance in coding tasks and strong results in educational benchmarks like MMLU and GPQA Diamond [Sources 6, 9].

Overall, DeepSeek-R1 is a powerful model with a focus on reasoning and instruction-following, achieving competitive performance across a wide range of tasks.</Thought>
<Response>DeepSeek-R1 is a reasoning model developed by DeepSeek-AI, known for its strong performance in writing tasks, open-domain question answering, and handling fact-based queries. It leverages reinforcement learning and instruction-following data to enhance its capabilities. The model outperforms its predecessor, DeepSeek-V3, in several areas and is used to distill reasoning capabilities into smaller models. Despite its strengths, it has limitations in complex tasks like function calling and language mixing. Overall, DeepSeek-R1 is a versatile and robust model with competitive performance across various benchmarks.</Response>

Behind the scenes, R2R’s RetrievalService handles RAG requests, combining the power of vector search, optional knowledge graph integration, and language model generation.

Graphs in R2R

R2R implements a Git-like model for knowledge graphs, where each collection has a corresponding graph that can diverge and be independently managed. This approach allows for flexible knowledge management while maintaining data consistency.

Graph-Collection Relationship

  • Each collection has an associated graph that acts similar to a Git branch
  • Graphs can diverge from their underlying collections through independent updates
  • The pull operation syncs the graph with its collection, similar to a Git pull
  • This model enables experimental graph modifications without affecting the base collection

Knowledge Graph Workflow

Extract entities and relationships from the previously ingested document:

# document_id from the earlier ingestion step
client.documents.extract(document_id)

This step processes the document to identify entities and their relationships.
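
Extraction runs asynchronously. One way to check progress is to re-fetch the document and inspect its extraction_status, which moves from 'pending' to 'success' as seen in the DocumentResponse output earlier (a sketch, assuming a documents.retrieve method mirroring the REST document-retrieval endpoint):

# re-fetch the document and check extraction progress
doc = client.documents.retrieve(id=document_id)
print(doc.results.extraction_status)  # e.g. 'pending' -> 'success'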

Sync the graph with the collection and view extracted knowledge:

collection_id = "122fdf6a-e116-546b-a8f6-e4cb2e2c0a09"  # default collection_id for admin

# Sync graph with collection
pull_response = client.graphs.pull(collection_id)

# View extracted knowledge
entities = client.graphs.list_entities(collection_id)
relationships = client.graphs.list_relationships(collection_id)
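
The returned lists can be iterated like any other paginated response. A sketch; the attribute names follow the Entity and Relationship models and may vary by version:

# print extracted entities and relationships
for entity in entities.results:
    print(entity.name, entity.category)
for rel in relationships.results:
    print(rel.subject, rel.predicate, rel.object)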

Build and list graph communities:

# Build communities
build_response = client.graphs.build(collection_id, settings={})

# List communities
communities = client.graphs.list_communities(collection_id)

[
  Community(
    name='Large Language Models and AGI Community',
    summary='The Large Language Models and AGI Community focuses on the development and implications of advanced AI technologies, particularly in the pursuit of Artificial General Intelligence.',
    level=None,
    findings=[
      'Large Language Models (LLMs) are rapidly evolving towards capabilities akin to Artificial General Intelligence (AGI) [Data: Descriptions (1579a46f-be12-4e60-a96b-e5b5afe026d9)].',
      'The primary aim of LLMs is to achieve functionalities that closely resemble AGI [Data: Relationships (22bb116d-ab0b-4390-a68f-6ef1a1c99999)].',
      'AGI systems are designed to outperform humans in most economically valuable tasks, indicating their potential impact on various industries [Data: Descriptions (80a34efa-d569-488f-91fd-db08fd93667b)].',
      'The development of LLMs is a critical step towards realizing the goals of AGI, highlighting the interconnectedness of these technologies [Data: Relationships (22bb116d-ab0b-4390-a68f-6ef1a1c99999)].',
      'Research in LLMs is essential for understanding the ethical implications of AGI deployment in society [Data: Descriptions (1579a46f-be12-4e60-a96b-e5b5afe026d9)].'
    ],
    id=UUID('62fd3478-f303-47ba-941a-fcf41576615d'),
    community_id=None,
    collection_id=UUID('122fdf6a-e116-546b-a8f6-e4cb2e2c0a09'),
    rating=9.0,
    rating_explanation='This community has a significant impact on the future of AI, as it drives research towards achieving AGI capabilities.',
    ...
  ), ...
]

Reset the graph to a clean state:

client.graphs.reset(collection_id)

Best Practices

  1. Graph Synchronization

    • Always pull before attempting to list or work with entities
    • Keep track of which documents have been added to the graph
  2. Community Management

    • Build communities after significant changes to the graph
    • Use community information to enhance search results
  3. Version Control

    • Treat graphs like Git branches - experiment freely
    • Use reset to start fresh if needed
    • Maintain documentation of graph modifications

This Git-like model provides a flexible framework for knowledge management while maintaining data consistency and enabling experimental modifications.

User Management

R2R provides robust user auth and management capabilities. This section briefly covers user authentication features and how they relate to document management.

To register a new user:

from r2r import R2RClient

client = R2RClient()
client.users.create("[email protected]", "password123")

Example output:

User(
  id=UUID('fcbcbc64-f85c-5025-877c-37f4c7a12d6e'),
  email='[email protected]',
  is_active=True,
  is_superuser=False,
  created_at=datetime.datetime(2025, 2, 8, 5, 8, 17, 376293, tzinfo=TzInfo(UTC)),
  updated_at=datetime.datetime(2025, 2, 8, 5, 8, 17, 376293, tzinfo=TzInfo(UTC)),
  is_verified=False,
  collection_ids=[UUID('d3ef9c77-cb13-59a9-be70-0db46de619db')],
  graph_ids=[],
  document_ids=[],
  limits_overrides={},
  metadata={},
  verification_code_expiry=None,
  name=None,
  bio=None,
  profile_picture=None,
  total_size_in_bytes=None,
  num_files=None,
  account_type='password',
  hashed_password='JDJiJDEyJDE4UFdOTWZTSHNxdzRRMDdKZXU2Nk9qMFNNbXFxVFZldmpHaGhjdTcwdk5hNDZubEMxblVD',
  google_id=None,
  github_id=None
)

After registration, users need to verify their email:

client.users.verify_email("123456")  # Verification code sent to email

To log in and obtain access tokens:

client.users.login("[email protected]", "password123")

LoginResponse(access_token=Token(token='eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJ0ZXN0QGV4YW1wbGUuY29tIiwidG9rZW5fdHlwZSI6ImFjY2VzcyIsImV4cCI6MTc0MjU5MTQ0Ni43MTY2MzcsImlhdCI6MTczODk5MTQ0Ni43MTY3MDUsIm5iZiI6MTczODk5MTQ0Ni43MTY3MDUsImp0aSI6IkhkWWVfeWxOSm9Yc2tvaU5ZVkdoNHc9PSIsIm5vbmNlIjoiMkhOOUs3bU40QVNfVnkzOTdXR2Vpdz09In0.gG_9oa-7_ZHqfHHo-bE1ooynCm7YCQFCYbJoiEgGmTg', token_type='access'), refresh_token=Token(token='eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJ0ZXN0QGV4YW1wbGUuY29tIiwidG9rZW5fdHlwZSI6InJlZnJlc2giLCJleHAiOjE3Mzk1OTYyNDYuNzE3MzQxLCJpYXQiOjE3Mzg5OTE0NDYuNzE3MzQ5LCJuYmYiOjE3Mzg5OTE0NDYuNzE3MzQ5LCJqdGkiOiJybXltZTk5bGNtZklOWDZLQWNaTmpBPT0iLCJub25jZSI6InExRGdqZm96YkpjYXpDbzdTcE5XcWc9PSJ9.Zn-2pncsEdvyuig36N4APO_U9AWDQcJi6E5EjglN16U', token_type='refresh'))

To refresh an expired access token:

# requires client.users.login(...)
client.users.refresh_access_token()["results"]

To log out and invalidate the current access token:

# requires client.users.login(...)
client.users.logout()

These authentication features ensure that users can only access and manage their own documents. When performing operations like search, RAG, or document management, the results are automatically filtered based on the authenticated user’s permissions.
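
In practice this means the same call returns different results for different users. A minimal sketch:

# after logging in, listings are scoped to this user's documents and
# shared collections
client.users.login("[email protected]", "password123")
docs = client.documents.list()
print(docs.total_entries)  # counts only documents visible to this user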



Next Steps

Now that you have a basic understanding of R2R’s core features, you can explore more advanced topics in the documentation.
