Overview

Configure your R2R deployment

R2R was built with configuration in mind and uses TOML files to define server-side variables.

Two levels of configuration are supported:

  1. Server-side Configuration: Define default configuration for your R2R deployment.
  2. Runtime Settings: Dynamically override configuration settings when making API calls.

Server-side Configuration

R2R’s configuration works by override: default values are defined in the r2r.toml file, and custom configuration files override only the settings they specify.

A number of pre-defined configuration files ship with R2R, detailed below. For a complete list of configurable parameters and their defaults, refer to our all_possible_config.toml file.

Editing the pre-defined configuration files will not take effect while running R2R with Docker; refer to the installation guide for instructions on how to use custom configs with Docker (a minimal sketch follows the table below).
Configuration File      Usage
r2r.toml                The default R2R configuration file.
full.toml               Includes orchestration with Hatchet.
full_azure.toml         Includes orchestration with Hatchet and Azure OpenAI models.
full_lm_studio.toml     Includes orchestration with Hatchet and LM Studio models.
full_ollama.toml        Includes orchestration with Hatchet and Ollama models.
r2r_azure.toml          Configured to run Azure OpenAI models.
gemini.toml             Configured to run Gemini models.
lm_studio.toml          Configured to run LM Studio models.
ollama.toml             Configured to run Ollama models.
r2r_with_auth.toml      Configured to require user verification.
tavily.toml             Configured to use the Tavily tool.
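
As noted above, Docker deployments do not pick up edits to these pre-defined files; a custom config must instead be made visible inside the container. The following is a minimal sketch of one way to do this, assuming the server container honors R2R_CONFIG_PATH; the image name and container path are illustrative, and the installation guide remains the authoritative reference:

# A minimal sketch of supplying a custom config to a Dockerized R2R server.
# The image name (<your-r2r-image>) and container path are illustrative;
# see the installation guide for the authoritative setup.
docker run \
  -v "$(pwd)/my_r2r.toml:/app/my_r2r.toml" \
  -e R2R_CONFIG_PATH=/app/my_r2r.toml \
  -p 7272:7272 \
  <your-r2r-image>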

Custom Configuration Files

To create your own custom configuration:

  1. Create a new file named my_r2r.toml in your project directory.
  2. Add only the settings you wish to customize. For example:
my_r2r.toml

[app]
# LLM used for user-facing responses (high-quality outputs)
quality_llm = "openai/gpt-4o"
# LLM used for internal summarizations and similar tasks (fast responses)
fast_llm = "openai/gpt-4o-mini"

[completion]
  [completion.generation_config]
  temperature = 0.7
  top_p = 0.9
  max_tokens_to_sample = 1024
  stream = false
  add_generation_kwargs = {}
  3. Launch the R2R server with your custom configuration:
export R2R_CONFIG_PATH=path_to_your_config
python -m r2r.serve

R2R will use your specified settings, falling back to the defaults defined in the main configuration files for any unspecified options.

Runtime Settings

When calling endpoints such as retrieval/search or retrieval/rag, you can override server-side configurations on the fly. This allows dynamic control over search settings, model selection, prompt customization, and more.

For example, using the Python SDK:

from r2r import R2RClient

client = R2RClient("http://localhost:7272")

response = client.retrieval.rag(
    "Who was Aristotle?",
    rag_generation_config={
        "model": "anthropic/claude-3-haiku-20240307",  # Overrides the default quality_llm
        "temperature": 0.7,
    },
    search_settings={
        "limit": 100,               # Number of search results to return
        "use_hybrid_search": True,  # Enable semantic + full-text search
    },
)
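
The same runtime overrides apply to standalone search calls. Below is a minimal sketch, assuming client.retrieval.search mirrors the signature of the rag call above:

from r2r import R2RClient

client = R2RClient("http://localhost:7272")

# Override search behavior for a single call; any unspecified settings
# fall back to the server-side configuration.
results = client.retrieval.search(
    "Who was Aristotle?",
    search_settings={
        "limit": 25,                # Return fewer results than the default
        "use_hybrid_search": True,  # Enable semantic + full-text search
    },
)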

Refer to the retrieval documentation to learn more about configuring and dynamically setting your retrieval system.