Embedding Provider

By default, R2R uses the LiteLLM framework to communicate with various cloud embedding providers. To customize the embedding settings:

r2r.toml

```toml
[embedding]
provider = "litellm"
base_model = "openai/text-embedding-3-small"
base_dimension = 512
batch_size = 128
add_title_as_prefix = false
rerank_model = "None"
concurrent_request_limit = 256
```

Let’s break down the embedding configuration options:

  • provider: Choose from ollama, litellm, and openai. R2R defaults to the LiteLLM framework for maximum embedding provider flexibility.
  • base_model: Specifies the embedding model to use. Format is typically “provider/model-name” (e.g., "openai/text-embedding-3-small").
  • base_dimension: Sets the dimension of the embedding vectors. Should match the output dimension of the chosen model.
  • batch_size: Determines the number of texts to embed in a single API call. Larger values can improve throughput but may increase latency.
  • add_title_as_prefix: When true, prepends the document title to the text before embedding, providing additional context.
  • rerank_model: Specifies a model for reranking results. Set to “None” to disable reranking (note: not supported by LiteLLMEmbeddingProvider).
  • concurrent_request_limit: Sets the maximum number of concurrent embedding requests to manage load and avoid rate limiting.
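To make the interplay of batch_size, add_title_as_prefix, and concurrent_request_limit concrete, here is a minimal client-side sketch. The helper names (make_batches, embed_all) are hypothetical illustrations, not R2R internals; the embedding call itself is abstracted behind an embed_fn you would supply.

```python
import asyncio


def make_batches(texts, batch_size=128, title=None, add_title_as_prefix=False):
    """Optionally prepend the document title to each text, then split
    the list into chunks of at most batch_size items (one API call each)."""
    if add_title_as_prefix and title:
        texts = [f"{title}\n{t}" for t in texts]
    return [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]


async def embed_all(batches, embed_fn, concurrent_request_limit=256):
    """Issue one embedding request per batch, with a semaphore capping
    how many requests are in flight at once (rate-limit protection)."""
    sem = asyncio.Semaphore(concurrent_request_limit)

    async def one(batch):
        async with sem:
            return await embed_fn(batch)

    return await asyncio.gather(*(one(b) for b in batches))
```

With the defaults above, 300 chunks would be embedded in three calls (128 + 128 + 44 texts), and at most 256 such calls would ever run concurrently.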

Embedding providers for an R2R system cannot be configured at runtime; they are configured server-side.

Supported LiteLLM Providers

The embedding providers listed below are supported through LiteLLM.

Example configuration:

example r2r.toml
```toml
[embedding]
provider = "litellm"
base_model = "openai/text-embedding-3-small"
base_dimension = 512
```

```bash
export OPENAI_API_KEY=your_openai_key
# .. set other environment variables

r2r serve --config-path=r2r.toml
```

Supported models include:

  • text-embedding-3-small
  • text-embedding-3-large
  • text-embedding-ada-002
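When picking base_dimension for these models, note (per OpenAI's published model specifications, not R2R itself) that the text-embedding-3-* models natively output 1536 and 3072 dimensions respectively but accept a `dimensions` parameter to shorten the vector (which is why base_dimension = 512 works above), while text-embedding-ada-002 is fixed at 1536. A small validation sketch, with the table and helper as illustrative assumptions:

```python
# Native output dimensions per OpenAI's documentation. The
# text-embedding-3-* models can be shortened via the API's
# `dimensions` parameter; ada-002 cannot.
NATIVE_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}
SHORTENABLE = {"text-embedding-3-small", "text-embedding-3-large"}


def validate_base_dimension(model: str, base_dimension: int) -> bool:
    """Return True if base_dimension is usable with the given model."""
    native = NATIVE_DIMS[model]
    if base_dimension == native:
        return True
    return model in SHORTENABLE and 0 < base_dimension < native
```

For example, a base_dimension of 512 passes for text-embedding-3-small but would be rejected for text-embedding-ada-002.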

For detailed usage instructions, refer to the LiteLLM OpenAI Embedding documentation.