Embedding
Embedding Provider
By default, R2R uses the LiteLLM framework to communicate with various cloud embedding providers. To customize the embedding settings:
r2r.toml
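The configuration snippet itself is not reproduced in this excerpt; a minimal sketch of an `[embedding]` section, using the option names described below (the exact defaults may differ in your R2R version), might look like:

```toml
# Sketch only — key names follow the options documented below;
# values shown are illustrative, not authoritative defaults.
[embedding]
provider = "litellm"
base_model = "openai/text-embedding-3-small"
base_dimension = 512   # must match (or be supported by) the chosen model's output dimension
batch_size = 128
add_title_as_prefix = false
rerank_model = "None"
concurrent_request_limit = 256
```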
Let’s break down the embedding configuration options:
- provider: Choose from ollama, litellm, and openai. R2R defaults to the LiteLLM framework for maximum embedding provider flexibility.
- base_model: Specifies the embedding model to use. The format is typically "provider/model-name" (e.g., "openai/text-embedding-3-small").
- base_dimension: Sets the dimension of the embedding vectors. This should match the output dimension of the chosen model.
- batch_size: Determines the number of texts to embed in a single API call. Larger values can improve throughput but may increase latency.
- add_title_as_prefix: When true, prepends the document title to the text before embedding, providing additional context.
- rerank_model: Specifies a model for reranking results. Set to "None" to disable reranking (note: not supported by LiteLLMEmbeddingProvider).
- concurrent_request_limit: Sets the maximum number of concurrent embedding requests, to manage load and avoid rate limiting.
Embedding providers for an R2R system cannot be configured at runtime; they are configured server-side.
Supported LiteLLM Providers
Support for each of the embedding providers listed below is provided through LiteLLM:
OpenAI
Azure
Anthropic
Cohere
Ollama
HuggingFace
Bedrock
Vertex AI
Voyage AI
Example configuration:
example r2r.toml
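The example file referenced above is not included in this excerpt; a hedged sketch of an OpenAI-backed configuration, using one of the models from the list below, might be:

```toml
# Sketch only — assumes OpenAI credentials are available to the server
# (e.g., via the OPENAI_API_KEY environment variable).
[embedding]
provider = "litellm"
base_model = "openai/text-embedding-3-small"
base_dimension = 1536  # text-embedding-3-small produces 1536-dimensional vectors by default
batch_size = 128
add_title_as_prefix = false
```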
Supported models include:
- text-embedding-3-small
- text-embedding-3-large
- text-embedding-ada-002
For detailed usage instructions, refer to the LiteLLM OpenAI Embedding documentation.