R2R supports multiple embedding providers, offering flexibility in choosing the best model for your specific use case.

Supported Providers

R2R's embedding providers include OpenAI for hosted models and SentenceTransformers for local models from HuggingFace.

The R2R team is actively working to expand the range of supported providers based on user requests and priorities.

Available Models

OpenAI Models

OpenAI offers several embedding models, including text-embedding-3-small, text-embedding-3-large, and text-embedding-ada-002, each striking a different balance between embedding quality, speed, and cost.
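Before committing to a model, it can help to confirm what it actually returns, since the base_dimension value in the configuration below must match the vector size. The following is a minimal sketch using the official openai Python package; it assumes OPENAI_API_KEY is exported in your environment, and the sample input string is just a placeholder.

import os

from openai import OpenAI

# Assumes OPENAI_API_KEY is set in the environment.
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# text-embedding-3-small returns 1536-dimensional vectors by default;
# the 3-series models also accept an optional `dimensions` argument
# if you want shorter vectors.
response = client.embeddings.create(
    model="text-embedding-3-small",
    input="A short sample sentence to embed.",
)

vector = response.data[0].embedding
print(f"Model returned a {len(vector)}-dimensional embedding.")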

SentenceTransformers Models

R2R also supports local models from the sentence_transformers package on HuggingFace, such as mixedbread-ai/mxbai-embed-large-v1.
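If you plan to use a local model, you can check its output dimensionality before filling in base_dimension. This is a minimal sketch assuming the sentence-transformers package is installed; the model name matches the configuration example further below.

from sentence_transformers import SentenceTransformer

# Downloads the model from HuggingFace on first use.
model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")

# Report the model's native output dimension, which should agree with
# the base_dimension value in your R2R config.
print(model.get_sentence_embedding_dimension())

# Sanity-check by embedding a sample sentence.
embedding = model.encode("A short sample sentence to embed.")
print(embedding.shape)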

Configuring Embedding Models

To configure a specific embedding model, update the embedding section of your config.json file:

"embedding": {
    "provider": "openai",
    "base_model": "text-embedding-3-small",
    "base_dimension": 1536,
    "batch_size": 32
}

For SentenceTransformers:

"embedding": {
    "provider": "sentence-transformers",
    "base_model": "mixedbread-ai/mxbai-embed-large-v1",
    "base_dimension": 768,
    "batch_size": 32
}

Make sure to set the appropriate environment variables (e.g., OPENAI_API_KEY) for the chosen provider.
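As a small sanity check before starting R2R, you can confirm that the configured provider and the required environment variable line up. The sketch below uses only the standard library and assumes your configuration lives in a local config.json; the file name and the provider-to-variable mapping shown here are illustrative.

import json
import os

# Illustrative mapping; only providers that require API keys are listed.
REQUIRED_ENV_VARS = {"openai": "OPENAI_API_KEY"}

with open("config.json") as f:
    config = json.load(f)

embedding = config["embedding"]
provider = embedding["provider"]
print(f"Provider: {provider}, model: {embedding['base_model']}")

env_var = REQUIRED_ENV_VARS.get(provider)
if env_var and not os.environ.get(env_var):
    raise SystemExit(f"{env_var} must be set for the '{provider}' provider.")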

Choosing the Right Model

When selecting an embedding model, consider the following factors:

  1. Task requirements: Choose a model that aligns with your specific use case (e.g., semantic search, text similarity).
  2. Performance needs: Balance embedding quality against processing speed; the sketch after this list shows one quick way to compare candidates on your own data.
  3. Cost considerations: Weigh per-token pricing for hosted models against the compute required to run local ones.
  4. Integration complexity: Local models can be simpler to run without external API dependencies, but may trail cloud-hosted models in embedding quality.
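To ground these trade-offs, the sketch below embeds a toy query and a handful of passages with a candidate local model, then reports the similarity ranking and rough encode time. The model name, texts, and timing approach are all placeholders; swap in your own data and candidate models to compare quality and speed before committing to a configuration.

import time

from sentence_transformers import SentenceTransformer, util

# Placeholder candidate; repeat this block for each model you are comparing.
model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")

query = "How do I configure embedding models in R2R?"
passages = [
    "Update the embedding section of your config.json file.",
    "R2R supports multiple embedding providers.",
    "Set OPENAI_API_KEY before using the OpenAI provider.",
]

start = time.perf_counter()
query_vec = model.encode(query, convert_to_tensor=True)
passage_vecs = model.encode(passages, convert_to_tensor=True)
elapsed = time.perf_counter() - start

# Cosine similarity between the query and each passage, highest first.
scores = util.cos_sim(query_vec, passage_vecs)[0]
for passage, score in sorted(zip(passages, scores.tolist()), key=lambda p: -p[1]):
    print(f"{score:.3f}  {passage}")
print(f"Encoded {len(passages) + 1} texts in {elapsed:.2f}s")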

Conclusion

R2R provides flexibility in choosing and configuring embedding models to suit your specific needs. By understanding the characteristics of different models and providers, you can optimize your application’s performance and cost-effectiveness.

For more information on customizing R2R, refer to the Customizing R2R documentation.