Overview

Configure your R2R deployment

R2R was built with configuration in mind and uses TOML files to define server-side variables.

Two levels of configuration are supported:

  1. Server-side Configuration: Define default configuration for your R2R deployment.
  2. Runtime Settings: Dynamically override configuration settings when making API calls.

Server-side Configuration

R2R’s configuration works by override: default values are defined in the r2r.toml file, and custom configuration files override only the settings they specify.

A number of pre-defined configuration files ship with R2R, detailed below. For a complete list of configurable parameters and their defaults, refer to our all_possible_config.toml file.

Editing the pre-defined configuration files will not take effect while running R2R with Docker; refer to the installation guide for instructions on how to use custom configs with Docker (a minimal sketch follows the table below).
Configuration File      Usage
r2r.toml                The default R2R configuration file.
full.toml               Includes orchestration with Hatchet.
full_azure.toml         Includes orchestration with Hatchet and Azure OpenAI models.
full_lm_studio.toml     Includes orchestration with Hatchet and LM Studio models.
full_ollama.toml        Includes orchestration with Hatchet and Ollama models.
r2r_azure.toml          Configured to run Azure OpenAI models.
gemini.toml             Configured to run Gemini models.
lm_studio.toml          Configured to run LM Studio models.
ollama.toml             Configured to run Ollama models.
r2r_with_auth.toml      Configured to require user verification.
tavily.toml             Configured to use the Tavily tool.
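
As noted above, Docker deployments do not pick up edits to these pre-defined files; a custom config must instead be made visible inside the container. The following is a minimal sketch of one way to do this, assuming the server container honors R2R_CONFIG_PATH; the image name and container path are illustrative, and the installation guide remains the authoritative reference:

# A minimal sketch of supplying a custom config to a Dockerized R2R server.
# The image name (<your-r2r-image>) and container path are illustrative;
# see the installation guide for the authoritative setup.
docker run \
  -v "$(pwd)/my_r2r.toml:/app/my_r2r.toml" \
  -e R2R_CONFIG_PATH=/app/my_r2r.toml \
  -p 7272:7272 \
  <your-r2r-image>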

Custom Configuration Files

To create your own custom configuration:

  1. Create a new file named my_r2r.toml in your project directory.
  2. Add only the settings you wish to customize. For example:
my_r2r.toml

[app]
# LLM used for user-facing responses (high-quality outputs)
quality_llm = "openai/gpt-4o"
# LLM used for internal summarizations and similar tasks (fast responses)
fast_llm = "openai/gpt-4o-mini"

[completion]
  [completion.generation_config]
  temperature = 0.7
  top_p = 0.9
  max_tokens_to_sample = 1024
  stream = false
  add_generation_kwargs = {}
  3. Launch the R2R server with your custom configuration:
export R2R_CONFIG_PATH=path_to_your_config
python -m r2r.serve

R2R will use your specified settings, falling back to the defaults defined in the main configuration files for any unspecified options.

Runtime Settings

When calling endpoints such as retrieval/search or retrieval/rag, you can override server-side configurations on the fly. This allows dynamic control over search settings, model selection, prompt customization, and more.

For example, using the Python SDK:

from r2r import R2RClient

client = R2RClient("http://localhost:7272")

response = client.retrieval.rag(
    "Who was Aristotle?",
    rag_generation_config={
        "model": "anthropic/claude-3-haiku-20240307",  # Overrides the default quality_llm
        "temperature": 0.7,
    },
    search_settings={
        "limit": 100,               # Number of search results to return
        "use_hybrid_search": True,  # Enable semantic + full-text search
    },
)
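
The same runtime overrides apply to standalone search calls. Below is a minimal sketch, assuming client.retrieval.search mirrors the signature of the rag call above:

from r2r import R2RClient

client = R2RClient("http://localhost:7272")

# Override search behavior for a single call; any unspecified settings
# fall back to the server-side configuration.
results = client.retrieval.search(
    "Who was Aristotle?",
    search_settings={
        "limit": 25,                # Return fewer results than the default
        "use_hybrid_search": True,  # Enable semantic + full-text search
    },
)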

Refer to the retrieval documentation to learn more about configuring and dynamically setting your retrieval system.