Embedding Provider

By default, R2R uses the LiteLLM framework to communicate with various cloud embedding providers. To customize the embedding settings:

r2r.toml
[embedding]
provider = "litellm"
base_model = "openai/text-embedding-3-small"
base_dimension = 512
batch_size = 128
add_title_as_prefix = false
rerank_model = "None"
concurrent_request_limit = 256

Let’s break down the embedding configuration options:

  • provider: Choose from ollama, litellm, and openai. R2R defaults to the LiteLLM framework for maximum embedding provider flexibility.
  • base_model: Specifies the embedding model to use. The format is typically "provider/model-name" (e.g., "openai/text-embedding-3-small").
  • base_dimension: Sets the dimension of the embedding vectors. Must match the output dimension of the chosen model.
  • batch_size: Determines the number of texts to embed in a single API call. Larger values can improve throughput but may increase latency.
  • add_title_as_prefix: When true, prepends the document title to the text before embedding, providing additional context.
  • rerank_model: Specifies a model for reranking results. Set to "None" to disable reranking (note: not supported by LiteLLMEmbeddingProvider).
  • concurrent_request_limit: Sets the maximum number of concurrent embedding requests, to manage load and avoid rate limiting.

Embedding providers for an R2R system cannot be configured at runtime; they are instead configured server-side in r2r.toml. A configuration combining these options is sketched below.
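
As an illustration, a local-first setup backed by Ollama might look like the following. The model name and throughput values here are assumptions for the sketch: mxbai-embed-large is one Ollama embedding model, and it produces 1024-dimensional vectors, but any embedding model you have pulled will work so long as base_dimension matches its output size.

example r2r.toml
[embedding]
provider = "ollama"
base_model = "mxbai-embed-large"  # illustrative; use any pulled Ollama embedding model
base_dimension = 1024             # must equal the model's output dimension
batch_size = 32                   # illustrative; tune to local hardware
add_title_as_prefix = true
concurrent_request_limit = 32     # illustrative; local servers tolerate less concurrency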

Supported LiteLLM Providers

Support for the embedding providers listed below is provided through LiteLLM.

Example configuration:

example r2r.toml
[embedding]
provider = "litellm"
base_model = "openai/text-embedding-3-small"
base_dimension = 512

Set the provider's API key in the environment, then start the server:

export OPENAI_API_KEY=your_openai_key
# .. set other environment variables

r2r serve --config-path=r2r.toml

Supported models include:

  • text-embedding-3-small
  • text-embedding-3-large
  • text-embedding-ada-002
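
Switching models is a one-line change plus a matching dimension. As a sketch, a text-embedding-3-large configuration might look like this; 3072 is that model's default output dimension, and the embedding-3 family also accepts reduced dimensions (e.g., 1024) if you want smaller vectors:

example r2r.toml
[embedding]
provider = "litellm"
base_model = "openai/text-embedding-3-large"
base_dimension = 3072  # default output size for text-embedding-3-large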

For detailed usage instructions, refer to the LiteLLM OpenAI Embedding documentation.