Quickstart

Getting started with R2R is easy.

1. Create an Account

Create an account with SciPhi Cloud. It’s free!

For those interested in deploying R2R locally, please refer to the self-hosting guide.

2. Install the SDK

R2R offers Python and JavaScript SDKs for interacting with the server.

pip install r2r
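If you prefer the JavaScript SDK, it is installed from npm; the package name below (r2r-js) is an assumption worth confirming against the JavaScript SDK reference.

npm install r2r-js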
3. Environment

After signing into SciPhi Cloud, navigate to the homepage and click Create New Key (for the self-hosted quickstart, refer to the self-hosting guide).

Next, set your local environment variable R2R_API_KEY. Be sure to include the entire API key `pk_...sk_...`.

# export R2R_API_KEY=...
from r2r import R2RClient

client = R2RClient()  # or set base_url=...

# or, alternatively: client.users.login("[email protected]", "my_strong_password")
4. Client

Instantiate the client, pointing it at your R2R deployment (the self-hosted default is http://localhost:7272):

from r2r import R2RClient

client = R2RClient("http://localhost:7272")  # or export R2R_API_KEY=...
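To confirm that the client can reach your deployment, you can ping the health endpoint. A minimal check, assuming a system.health() helper is available in your SDK version:

# Connectivity check; `client.system.health()` is an assumption about the SDK
# surface -- consult the API reference if the call differs in your version.
print(client.system.health())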
5. Ingesting files

When you ingest files into R2R, the server accepts the task, processes and chunks the file, and generates a summary of the document.

client.documents.create_sample(hi_res=True)
# to ingest your own document: client.documents.create(file_path="/path/to/file")

Example output:

IngestionResponse(message='Document created and ingested successfully.', task_id=None, document_id=UUID('e43864f5-a36f-548e-aacd-6f8d48b30c7f'))
6. Getting file status

After file ingestion is complete, you can check the status of your documents by listing them.

client.documents.list()

Example output:

[
  DocumentResponse(
    id=UUID('e43864f5-a36f-548e-aacd-6f8d48b30c7f'),
    collection_ids=[UUID('122fdf6a-e116-546b-a8f6-e4cb2e2c0a09')],
    owner_id=UUID('2acb499e-8428-543b-bd85-0d9098718220'),
    document_type=<DocumentType.PDF: 'pdf'>,
    metadata={'title': 'DeepSeek_R1.pdf', 'version': 'v0'},
    version='v0',
    size_in_bytes=1768572,
    ingestion_status=<IngestionStatus.SUCCESS: 'success'>,
    extraction_status=<GraphExtractionStatus.PENDING: 'pending'>,
    created_at=datetime.datetime(2025, 2, 8, 3, 31, 39, 126759, tzinfo=TzInfo(UTC)),
    updated_at=datetime.datetime(2025, 2, 8, 3, 31, 39, 160114, tzinfo=TzInfo(UTC)),
    ingestion_attempt_number=None,
    summary="The document contains a comprehensive overview of DeepSeek-R1, a series of reasoning models developed by DeepSeek-AI, which includes DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero utilizes large-scale reinforcement learning (RL) without supervised fine-tuning, showcasing impressive reasoning capabilities but facing challenges like readability and language mixing. To enhance performance, DeepSeek-R1 incorporates multi-stage training and cold-start data, achieving results comparable to OpenAI's models on various reasoning tasks. The document details the models' training processes, evaluation results across multiple benchmarks, and the introduction of distilled models that maintain reasoning capabilities while being smaller and more efficient. It also discusses the limitations of current models, such as language mixing and sensitivity to prompts, and outlines future research directions to improve general capabilities and efficiency in software engineering tasks. The findings emphasize the potential of RL in developing reasoning abilities in large language models and the effectiveness of distillation techniques for smaller models.",
    summary_embedding=None,
    total_tokens=29673
  ), ...
]
total_entries=1
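Ingestion runs asynchronously, so a newly created document may still report a pending status. Below is a minimal polling sketch that reuses client.documents.list() from above; the attribute access is an assumption, since the response nesting can differ between SDK versions.

import time

# Document ID from the ingestion response above.
target_id = "e43864f5-a36f-548e-aacd-6f8d48b30c7f"

while True:
    docs = client.documents.list()
    # Depending on the SDK version, the entries may be nested under `.results`.
    entries = getattr(docs, "results", docs)
    doc = next(d for d in entries if str(d.id) == target_id)
    # Works whether the enum prints as 'success' or 'IngestionStatus.SUCCESS'.
    if str(doc.ingestion_status).lower().endswith("success"):
        break
    time.sleep(2)

print("Ingestion finished for", doc.id)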
8. RAG

Generate a RAG response:

client.retrieval.rag(
    query="What is DeepSeek R1?",
)

Example output:

RAGResponse(
  generated_answer='DeepSeek-R1 is a model that demonstrates impressive performance across various tasks, leveraging reinforcement learning (RL) and supervised fine-tuning (SFT) to enhance its capabilities. It excels in writing tasks, open-domain question answering, and benchmarks like IF-Eval, AlpacaEval2.0, and ArenaHard [1], [2]. DeepSeek-R1 outperforms its predecessor, DeepSeek-V3, in several areas, showcasing its strengths in reasoning and generalization across diverse domains [1]. It also achieves competitive results on factual benchmarks like SimpleQA, although it performs worse on the Chinese SimpleQA benchmark due to safety RL constraints [2]. Additionally, DeepSeek-R1 is involved in distillation processes to transfer its reasoning capabilities to smaller models, which perform exceptionally well on benchmarks [4], [6]. The model is optimized for English and Chinese, with plans to address language mixing issues in future updates [8].',
  search_results=AggregateSearchResult(
    chunk_search_results=[ChunkSearchResult(score=0.643, text=Document Title: DeepSeek_R1.pdf ...)]
  ),
  citations=[Citation(index=1, rawIndex=1, startIndex=305, endIndex=308, snippetStartIndex=288, snippetEndIndex=315, sourceType='chunk', id='e760bb76-1c6e-52eb-910d-0ce5b567011b', document_id='e43864f5-a36f-548e-aacd-6f8d48b30c7f', owner_id='2acb499e-8428-543b-bd85-0d9098718220', collection_ids=['122fdf6a-e116-546b-a8f6-e4cb2e2c0a09'], score=0.6433466439465674, text='Document Title: DeepSeek_R1.pdf\n\nText: could achieve an accuracy of over 70%.\nDeepSeek-R1 also delivers impressive results on IF-Eval, a benchmark designed to assess a\nmodels ability to follow format instructions. These improvements can be linked to the inclusion\nof instruction-following...')],
  metadata={'id': 'chatcmpl-B0BaZ0vwIa58deI0k8NIuH6pBhngw', 'choices': [{'finish_reason': 'stop', 'index': 0, 'logprobs': None, 'message': {'refusal': None, 'role': 'assistant', 'audio': None, 'function_call': None, 'tool_calls': None}}], 'created': 1739384247, 'model': 'gpt-4o-2024-08-06', 'object': 'chat.completion', 'service_tier': 'default', 'system_fingerprint': 'fp_4691090a87', ...}
)
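Both retrieval and generation can be tuned through optional settings. The keys below (a search_settings limit plus model and temperature in rag_generation_config) are assumptions drawn from common R2R options, so check the retrieval reference for the authoritative list.

# Optional retrieval/generation settings -- key names are assumptions.
response = client.retrieval.rag(
    query="What is DeepSeek R1?",
    search_settings={"limit": 5},          # number of chunks to retrieve
    rag_generation_config={
        "model": "openai/gpt-4o-mini",     # example model identifier
        "temperature": 0.0,
    },
)

# Depending on the SDK version, the RAGResponse shown above may be nested
# under a `.results` attribute.
print(getattr(response, "results", response).generated_answer)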
9. Reasoning Agents with RAG (reasoning_agent)

The R2R Reasoning Agent combines retrieval-augmented generation with step-by-step reasoning to produce higher-quality responses from your documents.

client.retrieval.reasoning_agent(
    query="What does deepseek r1 imply?",
    rag_generation_config={
        "stream": True
    }
)

Example output:

<Thought>Calling function: local_search, with payload {"query":"DeepSeek R1"}</Thought>
<Thought>The search results provide a comprehensive overview of DeepSeek-R1, highlighting its capabilities and performance across various benchmarks and tasks. DeepSeek-R1 is a reasoning model developed by DeepSeek-AI, which leverages reinforcement learning (RL) and instruction-following data to enhance its performance. It excels in tasks such as writing, open-domain question answering, and handling fact-based queries. The model outperforms its predecessor, DeepSeek-V3, in several areas, although it falls short in some complex tasks like function calling and multi-turn interactions. DeepSeek-R1 also demonstrates strong performance in educational tasks and creative writing, showcasing its versatility and robustness.

Key points about DeepSeek-R1 include:
- It achieves impressive results on benchmarks like IF-Eval, AlpacaEval2.0, and ArenaHard, indicating strengths in writing and question answering [Source 1].
- The model is used as a teacher to distill reasoning capabilities into smaller models, which also perform well on benchmarks [Source 2].
- It outperforms DeepSeek-V3 on factual benchmarks like SimpleQA but has limitations in language mixing and certain complex tasks [Sources 3, 5].
- DeepSeek-R1 demonstrates expert-level performance in coding tasks and strong results in educational benchmarks like MMLU and GPQA Diamond [Sources 6, 9].

Overall, DeepSeek-R1 is a powerful model with a focus on reasoning and instruction-following, achieving competitive performance across a wide range of tasks.</Thought>
<Response>DeepSeek-R1 is a reasoning model developed by DeepSeek-AI, known for its strong performance in writing tasks, open-domain question answering, and handling fact-based queries. It leverages reinforcement learning and instruction-following data to enhance its capabilities. The model outperforms its predecessor, DeepSeek-V3, in several areas and is used to distill reasoning capabilities into smaller models. Despite its strengths, it has limitations in complex tasks like function calling and language mixing. Overall, DeepSeek-R1 is a versatile and robust model with competitive performance across various benchmarks.</Response>
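With "stream": True the answer arrives incrementally rather than as a single response object. A minimal consumption sketch, assuming the call returns an iterable of printable chunks (the exact event objects vary by SDK version):

# Stream the reasoning agent's output as it is generated.
stream = client.retrieval.reasoning_agent(
    query="What does deepseek r1 imply?",
    rag_generation_config={"stream": True},
)
for chunk in stream:
    # Assumption: each chunk is printable text or has a sensible repr.
    print(chunk, end="", flush=True)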

Additional Features

R2R offers the additional features below to enhance your document management and user experience.

Graphs

R2R provides powerful entity and relationship extraction capabilities that enhance document understanding and retrieval. These can be leveraged to construct knowledge graphs inside R2R. The system can automatically identify entities, build relationships between them, and create enriched knowledge graphs from your document collection.
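The exact graph endpoints depend on your R2R version. As a rough sketch of the flow (triggering extraction for an ingested document and then listing what was found), the method names below are assumptions and should be checked against the graphs reference.

# Hypothetical sketch -- method names are assumptions, not confirmed API.
document_id = "e43864f5-a36f-548e-aacd-6f8d48b30c7f"  # document ingested above

# Trigger entity/relationship extraction for the document.
client.documents.extract(id=document_id)

# Inspect the extracted entities and relationships.
print(client.documents.list_entities(id=document_id))
print(client.documents.list_relationships(id=document_id))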

Users and Collections

R2R provides a complete set of user authentication and management features, allowing you to implement secure, feature-rich authentication or integrate with your preferred authentication provider. In addition, collections enable efficient access control and organization of users and documents.
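As a rough sketch of how users and collections fit together (the method names and arguments below are assumptions; see the users and collections references for the exact API):

# Hypothetical sketch -- names and signatures are assumptions.
# Register a user with hypothetical credentials.
client.users.create("[email protected]", "a_strong_password")

# Create a collection and add the previously ingested document to it.
collection = client.collections.create(name="research-papers")
collection_id = getattr(collection, "results", collection).id
client.collections.add_document(
    id=collection_id,
    document_id="e43864f5-a36f-548e-aacd-6f8d48b30c7f",
)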

Next Steps

Now that you have a basic understanding of R2R’s core features, you can explore more advanced topics in the documentation.
