Introduction

R2R offers a flexible configuration system that allows you to customize your Retrieval-Augmented Generation (RAG) applications. This guide introduces the key concepts and methods for configuring R2R.

Configuration Levels

R2R supports two main levels of configuration:

  1. Server Configuration: Define default server-side settings.
  2. Runtime Configuration: Dynamically override settings when making API calls.

Server Configuration

The default settings for R2R are specified in the r2r.toml file. To create a custom configuration:

  1. Create a new file named r2r.toml in your project directory.
  2. Add only the settings you wish to customize. For example:
r2r.toml
[embedding]
provider = "litellm"
base_model = "text-embedding-3-large"
base_dimension = 1536

[completion]
    [completion.generation_config]
    model = "anthropic/claude-3-opus-20240229"
  1. Launch R2R with the CLI using your custom configuration:
r2r serve --config-path=r2r.toml

R2R will use your specified settings, falling back to defaults for any unspecified options.

Runtime Configuration

When calling endpoints, you can override server configurations on-the-fly. This allows for dynamic control over model selection, prompt customization, and more.

For example, using the Python SDK:

client = R2RClient("http://localhost:7272")

response = client.rag(
    "Who was Aristotle?",
    rag_generation_config={
        "model": "anthropic/claude-3-haiku-20240307",
        "temperature": 0.7
    },
    vector_search_settings={
        "search_limit": 100,
        "use_hybrid_search": True
    }
)

Next Steps

For more detailed information on configuring specific components of R2R, please refer to the following pages: