Introduction

R2R offers a flexible configuration system that allows you to customize your Retrieval-Augmented Generation (RAG) applications. This guide introduces the key concepts and methods for configuring R2R.

Configuration Levels

R2R supports two main levels of configuration:

Server Configuration: Define default server-side settings.
Runtime Configuration: Dynamically override settings when making API calls.

Server Configuration

The default settings for a light R2R installation are specified in the r2r.toml file.

For a full installation, the R2R CLI uses full.toml to override some of the light defaults with settings for the additional providers.

To create your own custom configuration:

Create a new file named my_r2r.toml in your project directory.
Add only the settings you wish to customize. For example:

my_r2r.toml

[embedding]
provider = "litellm"
base_model = "text-embedding-3-small"
base_dimension = 1536

[completion]
    [completion.generation_config]
    model = "anthropic/claude-3-opus-20240229"

Launch R2R with the CLI using your custom configuration:

$ r2r serve --config-path=my_r2r.toml

R2R will use your specified settings, falling back to defaults for any unspecified options.
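The fallback behavior described above can be pictured as a recursive merge of the custom file over the defaults. The following is a minimal sketch of that idea, not R2R's actual loading code: `merge_config` is a hypothetical helper, the values in `defaults` are placeholders rather than R2R's real defaults, and the `batch_size` and `temperature` keys are assumed purely to show settings that the custom file leaves untouched.

```python
from copy import deepcopy


def merge_config(defaults: dict, overrides: dict) -> dict:
    """Return a copy of defaults with overrides layered on top, table by table."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested table: merge key by key instead of replacing the whole table.
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Placeholder defaults for illustration only; these are NOT R2R's real values.
defaults = {
    "embedding": {
        "provider": "placeholder-provider",
        "base_model": "placeholder-embedding-model",
        "base_dimension": 512,
        "batch_size": 128,  # assumed key, not set in my_r2r.toml
    },
    "completion": {
        "generation_config": {
            "model": "placeholder/completion-model",
            "temperature": 0.1,  # assumed key, not set in my_r2r.toml
        },
    },
}

# The same tables and keys as my_r2r.toml, written as a Python dict.
custom = {
    "embedding": {
        "provider": "litellm",
        "base_model": "text-embedding-3-small",
        "base_dimension": 1536,
    },
    "completion": {
        "generation_config": {"model": "anthropic/claude-3-opus-20240229"},
    },
}

effective = merge_config(defaults, custom)
```

In this sketch, `effective` carries the customized embedding and completion model settings, while the assumed `batch_size` and `temperature` keys keep their placeholder default values.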

Runtime Configuration

When calling endpoints, you can override server configuration on the fly, on a per-request basis. This allows for dynamic control over search settings, model selection, prompt customization, and more.

For example, using the Python SDK:

from r2r import R2RClient

# Connect to a locally running R2R server
client = R2RClient("http://localhost:7272")

# Override the generation and search settings for this request only
response = client.rag(
    "Who was Aristotle?",
    rag_generation_config={
        "model": "anthropic/claude-3-haiku-20240307",
        "temperature": 0.7
    },
    vector_search_settings={
        "search_limit": 100,
        "use_hybrid_search": True
    }
)
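When only some settings vary between requests, it can help to build the override dictionaries in one place and pass along only the keys that were explicitly set. The sketch below uses a hypothetical helper, `build_overrides`, which is not part of the R2R SDK. The keyword names `rag_generation_config` and `vector_search_settings` come from the example above. The sketch assumes that any setting left out of a request falls back to the server configuration.

```python
def build_overrides(model=None, temperature=None,
                    search_limit=None, use_hybrid_search=None):
    """Build keyword arguments for client.rag, keeping only explicit overrides."""
    generation = {"model": model, "temperature": temperature}
    search = {"search_limit": search_limit, "use_hybrid_search": use_hybrid_search}

    candidates = {
        "rag_generation_config": generation,
        "vector_search_settings": search,
    }

    kwargs = {}
    for name, settings in candidates.items():
        # Drop unset values so they are not sent as overrides.
        explicit = {k: v for k, v in settings.items() if v is not None}
        if explicit:
            kwargs[name] = explicit
    return kwargs


overrides = build_overrides(
    model="anthropic/claude-3-haiku-20240307",
    search_limit=100,
)

# With a running server and a client as in the example above:
# response = client.rag("Who was Aristotle?", **overrides)
```

Because unset values are dropped rather than sent as `None`, a call with no arguments produces an empty dictionary and the request carries no overrides at all.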

Next Steps

For more detailed information on configuring specific components of R2R, please refer to the following pages:
