Embeddings in R2R
Introduction
R2R supports multiple Embedding providers, offering flexibility in choosing and switching between different models based on your specific requirements. This guide provides an in-depth look at configuring and using various Embedding providers within the R2R framework.
For a quick start on basic configuration, including embedding setup, please refer to our configuration guide.
Providers
R2R currently supports the following cloud embedding providers:
- OpenAI
- Azure
- Cohere
- HuggingFace
- Bedrock (Amazon)
- Vertex AI (Google)
- Voyage AI
And for local inference:
- Ollama
- SentenceTransformers
Configuration
Update the `embedding` section in your `r2r.toml` file to configure your embedding provider. Here are some example configurations:
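As a starting point, a minimal `[embedding]` section might look like the following. The `base_dimension` key is referenced later in this guide; the other key names are a sketch and may differ between R2R versions, so check your version's configuration reference.

```toml
[embedding]
provider = "litellm"                        # routes requests through LiteLLM
base_model = "openai/text-embedding-3-small"
base_dimension = 1536                        # must match the model's output size
batch_size = 128                             # documents embedded per request
```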
Selecting Different Embedding Providers
R2R supports a wide range of embedding providers through LiteLLM. Configuration follows the same pattern for each provider; set the appropriate model identifier in the `embedding` section for:
- OpenAI
- Azure
- Cohere
- Ollama
- HuggingFace
- Bedrock
- Vertex AI
- Voyage AI
Supported OpenAI models include:
- text-embedding-3-small
- text-embedding-3-large
- text-embedding-ada-002
Embedding Service Endpoints
The EmbeddingProvider is responsible for core functionality in these R2R endpoints:
- `update_files`: when updating existing files in the system
- `ingest_files`: during the ingestion of new files
- `search`: for embedding search queries
- `rag`: as part of the Retrieval-Augmented Generation process
Here’s how you can use these endpoints with embeddings:
File Ingestion
Search
RAG (Retrieval-Augmented Generation)
Updating Files
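The request shapes below are an illustrative sketch of how embedding-backed endpoints are typically driven; the exact field names (`file_paths`, `search_settings`, `rag_generation_config`) are assumptions, so consult your R2R version's API reference before relying on them.

```python
import json

# Hypothetical payload builders for R2R's embedding-backed endpoints.
# Field names here are assumptions, not the definitive R2R API.

def build_ingest_payload(file_paths):
    """ingest_files: R2R embeds each document during ingestion."""
    return {"file_paths": file_paths}

def build_search_payload(query, limit=10):
    """search: the query string is embedded before vector search."""
    return {"query": query, "search_settings": {"limit": limit}}

def build_rag_payload(query):
    """rag: retrieved chunks are passed to the LLM for generation."""
    return {"query": query, "rag_generation_config": {"stream": False}}

print(json.dumps(build_search_payload("What is R2R?")))
```

In all four cases the embedding step itself happens server-side, which is why none of these payloads mention an embedding model directly.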
Remember that you don’t directly call the embedding methods in your application code. R2R handles the embedding process internally based on your configuration.
Security Best Practices
- API Key Management: Use environment variables or secure key management solutions for API keys.
- Input Validation: Sanitize and validate all inputs before generating embeddings.
- Rate Limiting: Implement rate limiting to prevent abuse of embedding endpoints.
- Monitoring: Regularly monitor embedding usage for anomalies or misuse.
Custom Embedding Providers in R2R
You can create custom embedding providers by inheriting from the `EmbeddingProvider` class and implementing the required methods. This allows you to integrate any embedding model or service into R2R.
Embedding Provider Structure
The Embedding system in R2R is built on two main components:
- `EmbeddingConfig`: a configuration class for Embedding providers.
- `EmbeddingProvider`: an abstract base class that defines the interface for all Embedding providers.
EmbeddingConfig
The `EmbeddingConfig` class is used to configure Embedding providers:
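A minimal sketch of what such a configuration class might hold. Only `base_dimension` is named elsewhere in this guide; the remaining fields are plausible assumptions, not the definitive class definition.

```python
from dataclasses import dataclass
from typing import Optional

# Sketch of an EmbeddingConfig; field names beyond base_dimension
# (mentioned in this guide) are assumptions.

@dataclass
class EmbeddingConfig:
    provider: str                    # e.g. "openai", "ollama"
    base_model: str                  # model identifier for the provider
    base_dimension: int              # must match the model's output size
    batch_size: int = 32             # texts embedded per request
    prefixes: Optional[dict] = None  # optional index/query prefixes

config = EmbeddingConfig(provider="openai",
                         base_model="text-embedding-3-small",
                         base_dimension=1536)
```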
EmbeddingProvider
The `EmbeddingProvider` is an abstract base class that defines the common interface for all Embedding providers:
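The interface can be sketched as follows. The method names come from the steps listed later in this guide; the signatures are assumptions and may differ from the actual class.

```python
from abc import ABC, abstractmethod

# Sketch of the abstract interface; method names are taken from this
# guide, signatures are assumptions.

class EmbeddingProvider(ABC):
    @abstractmethod
    def get_embedding(self, text: str) -> list[float]:
        """Embed a single string."""

    @abstractmethod
    def get_embeddings(self, texts: list[str]) -> list[list[float]]:
        """Embed a batch of strings."""

    @abstractmethod
    def rerank(self, query: str, results: list[str], limit: int = 10) -> list[str]:
        """Reorder search results by relevance to the query."""

    @abstractmethod
    def tokenize_string(self, text: str) -> list[int]:
        """Tokenize text, e.g. for length checks before embedding."""
```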
Creating a Custom Embedding Provider
To create a custom Embedding provider, follow these steps:
- Create a new class that inherits from `EmbeddingProvider`.
- Implement the required methods: `get_embedding`, `get_embeddings`, `rerank`, and `tokenize_string`.
- (Optional) Implement async versions of methods if needed.
- (Optional) Add any additional methods or attributes specific to your provider.
Here’s an example of a custom Embedding provider:
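The following self-contained sketch shows the shape of such a provider. It is illustrative only: the embeddings are derived from a hash so the example runs without any model, and the minimal base class here stands in for R2R's actual `EmbeddingProvider`. A real provider would call an embedding model or API inside these methods.

```python
import hashlib
import math
from abc import ABC, abstractmethod

# Stand-in for R2R's actual EmbeddingProvider base class.
class EmbeddingProvider(ABC):
    @abstractmethod
    def get_embedding(self, text: str) -> list[float]: ...
    @abstractmethod
    def get_embeddings(self, texts: list[str]) -> list[list[float]]: ...
    @abstractmethod
    def rerank(self, query: str, results: list[str], limit: int = 10) -> list[str]: ...
    @abstractmethod
    def tokenize_string(self, text: str) -> list[int]: ...

class HashEmbeddingProvider(EmbeddingProvider):
    """Toy provider: deterministic pseudo-embeddings from a SHA-256 digest."""

    def __init__(self, dimension: int = 8):
        self.dimension = dimension

    def get_embedding(self, text: str) -> list[float]:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        vec = [b / 255.0 for b in digest[: self.dimension]]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        return [v / norm for v in vec]  # unit-normalised vector

    def get_embeddings(self, texts: list[str]) -> list[list[float]]:
        return [self.get_embedding(t) for t in texts]

    def rerank(self, query: str, results: list[str], limit: int = 10) -> list[str]:
        return results[:limit]  # no-op rerank, for the sketch only

    def tokenize_string(self, text: str) -> list[int]:
        return list(text.encode("utf-8"))

provider = HashEmbeddingProvider(dimension=8)
print(len(provider.get_embedding("hello")))  # 8
```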
Registering and Using the Custom Provider
To use your custom Embedding provider in R2R:
- Update the `EmbeddingConfig` class to include your custom provider.
- Update your R2R configuration to use the custom provider.
- In your R2R application, register the custom provider.
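One way these steps could fit together is sketched below. The registration API varies by R2R version, so the registry and function names here are placeholders, not the actual R2R interface.

```python
# Hypothetical registration flow; names are placeholders, not R2R's API.

PROVIDER_REGISTRY: dict[str, type] = {}

def register_embedding_provider(name: str, provider_cls: type) -> None:
    """Map a config `provider` string to a provider class."""
    PROVIDER_REGISTRY[name] = provider_cls

class CustomEmbeddingProvider:
    """Your implementation from the previous example."""

register_embedding_provider("custom", CustomEmbeddingProvider)

# In r2r.toml you would then select it, e.g.:
# [embedding]
# provider = "custom"
```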
Now you can use your custom Embedding provider seamlessly within your R2R application:
By following this structure, you can integrate any embedding model or service into R2R, maintaining consistency with the existing system while adding custom functionality as needed. This approach allows for great flexibility in choosing or implementing embedding solutions that best fit your specific use case.
Embedding Prefixes
R2R supports embedding prefixes to enhance embedding quality for different purposes:
- Index Prefixes: Applied to documents during indexing.
- Query Prefixes: Applied to search queries.
Configure prefixes in your `r2r.toml` or when initializing the `EmbeddingConfig`.
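A hypothetical `r2r.toml` fragment illustrates the idea; the key names are assumptions, so verify them against your R2R version. Prefixes like these are common with models (e.g. some E5-style models) that were trained to distinguish passages from queries.

```toml
[embedding.prefixes]
# Hypothetical keys -- check your R2R version for the exact names.
index = "passage: "   # prepended to documents at indexing time
query = "query: "     # prepended to search queries
```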
Troubleshooting
Common issues and solutions:
- API Key Errors: Ensure your API keys are correctly set and have the necessary permissions.
- Dimension Mismatch: Verify that the `base_dimension` in your config matches the actual output of the chosen model.
- Out of Memory Errors: Adjust the batch size or choose a smaller model if encountering memory issues with local models.
Performance Considerations
- Batching: Use batching for multiple, similar requests to improve throughput.
- Model Selection: Balance between model capability and inference speed based on your use case.
- Caching: Implement caching strategies to avoid re-embedding identical text.
Conclusion
R2R’s Embedding system provides a flexible and powerful foundation for integrating various embedding models into your applications. By understanding the available providers, configuration options, and best practices, you can effectively leverage embeddings to enhance your R2R-based projects.
For an advanced example, see the guide on implementing reranking in R2R.