Overview: R2R is a containerized Retrieval-Augmented Generation (RAG) system with a RESTful API.

R2R offers a flexible configuration system that allows you to customize your Retrieval-Augmented Generation (RAG) system. This guide introduces the key concepts and methods for configuring R2R.

Configuration Levels

R2R supports two main levels of configuration:

Server-side Configuration: Define default configuration for your R2R deployment.
Runtime Settings: Dynamically override configuration settings when making API calls.

Server-side Configuration

The default settings for the R2R light installation are specified in the r2r.toml file.

To create your own custom configuration:

Create a new file named my_r2r.toml in your project directory.
Add only the settings you wish to customize. For example:

my_r2r.toml

[embedding]
provider = "litellm"
base_model = "text-embedding-3-small" # defaults to `text-embedding-3-large`
base_dimension = 512 # defaults to `3072`

[completion]
    [completion.generation_config]
    model = "anthropic/claude-3-opus-20240229" # defaults to `openai/gpt-4o`

Launch R2R with the CLI using your custom configuration:

$ r2r serve --config-path=my_r2r.toml

R2R uses the settings you specify and falls back to the defaults defined in r2r.toml for any option you leave out. With the R2R full installation, the R2R CLI uses full.toml to configure the relevant provider settings.
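The fallback behavior described above can be pictured as a recursive merge of two nested tables. The following is a minimal sketch of that idea only, not R2R's actual loading code: the `merge_config` helper is hypothetical, the configs are written as plain dictionaries mirroring the TOML tables, and the default `temperature` value is an invented placeholder used to show a key falling through.

```python
from copy import deepcopy


def merge_config(defaults: dict, overrides: dict) -> dict:
    """Return a copy of `defaults` with `overrides` applied, table by table."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are tables: merge them instead of replacing wholesale.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalar value, or a key missing from the defaults: take the override.
            merged[key] = deepcopy(value)
    return merged


# Defaults as stated in the comments of the my_r2r.toml example.
# "temperature" is a hypothetical default, included only to show fallback.
defaults = {
    "embedding": {
        "base_model": "text-embedding-3-large",
        "base_dimension": 3072,
    },
    "completion": {
        "generation_config": {"model": "openai/gpt-4o", "temperature": 0.1},
    },
}

# The contents of my_r2r.toml, expressed as a dictionary.
overrides = {
    "embedding": {
        "provider": "litellm",
        "base_model": "text-embedding-3-small",
        "base_dimension": 512,
    },
    "completion": {
        "generation_config": {"model": "anthropic/claude-3-opus-20240229"},
    },
}

merged = merge_config(defaults, overrides)
```

In this sketch the overridden keys take the values from my_r2r.toml, while the placeholder `temperature` key, which the custom file never mentions, keeps its default.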

Runtime Settings

When calling endpoints such as retrieval/search or retrieval/rag, you can override server-side configuration on the fly. This gives you per-request control over search settings, model selection, prompt customization, and more.

For example, using the Python SDK:

from r2r import R2RClient

client = R2RClient("http://localhost:7272")

response = client.retrieval.rag(
    "Who was Aristotle?",
    rag_generation_config={
        "model": "anthropic/claude-3-haiku-20240307", # overrides `claude-3-opus` specified above
        "temperature": 0.7
    },
    search_settings={
        "limit": 100, # number of search results to return
        "use_hybrid_search": True # enable semantic + full-text search
    }
)
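The same per-request override pattern should apply to the retrieval/search endpoint mentioned above. The sketch below wraps it in a small helper. It assumes that `client.retrieval.search` accepts a `search_settings` dictionary in the same way `client.retrieval.rag` does in the example above. The helper name `search_with_overrides` and its default values are hypothetical.

```python
from typing import Any


def search_with_overrides(
    client: Any, query: str, limit: int = 10, hybrid: bool = False
) -> Any:
    """Run a search whose settings override the server-side defaults.

    `client` is expected to expose `retrieval.search(query, search_settings=...)`,
    which is an assumption based on the `rag` call shown in this guide.
    """
    search_settings = {
        "limit": limit,  # number of search results to return
        "use_hybrid_search": hybrid,  # enable semantic + full-text search
    }
    return client.retrieval.search(query, search_settings=search_settings)
```

With a server running, this might be called as `search_with_overrides(R2RClient("http://localhost:7272"), "Who was Aristotle?", limit=5, hybrid=True)`. Keeping the settings in one helper makes it easy to see which values a given request changes relative to the server-side configuration.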

Refer to the R2R configuration documentation to learn more about configuring and dynamically setting your retrieval system.
