Configuring Your RAG Pipeline
The R2R library provides flexibility in customizing various aspects of the RAG pipeline to suit your specific needs.
Your RAG pipeline providers can be configured through the config.json file. Below are some of the supported options.
Vector Database Provider
R2R supports multiple vector database providers, including:
- local: A local vector database implementation backed by SQLite.
- qdrant: Integration with Qdrant, a high-performance vector similarity search engine.
- pgvector: Integration with PGVector, a vector similarity search extension for PostgreSQL.
- sciphi: Managed PGVector database from SciPhi.
To specify the vector database provider, set the provider field under vector_database in the config.json file. Make sure to provide the necessary connection details and credentials for your chosen provider.
For more information, refer to vector database providers.
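As a minimal sketch of the step above, the snippet below sets the provider field under vector_database in a config.json-style dictionary and writes the result back to disk. The helper names set_vector_db_provider and update_config_file are illustrative assumptions, and provider-specific connection details are left out because their key names depend on the provider.

```python
import json
from pathlib import Path

# Provider names taken from the list in this section.
SUPPORTED_VECTOR_DBS = {"local", "qdrant", "pgvector", "sciphi"}


def set_vector_db_provider(config: dict, provider: str) -> dict:
    """Return a copy of config with vector_database.provider set (hypothetical helper)."""
    if provider not in SUPPORTED_VECTOR_DBS:
        raise ValueError(f"unsupported vector database provider: {provider!r}")
    updated = dict(config)
    # Keep any other keys already present under vector_database.
    updated["vector_database"] = {**config.get("vector_database", {}), "provider": provider}
    return updated


def update_config_file(path: str, provider: str) -> None:
    """Read a config file, set the provider, and write the file back."""
    config_path = Path(path)
    config = json.loads(config_path.read_text())
    config_path.write_text(json.dumps(set_vector_db_provider(config, provider), indent=2))
```

Calling update_config_file("config.json", "qdrant") would then switch an existing configuration to Qdrant while leaving every other setting untouched.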
Embedding Provider
R2R supports OpenAI and local inference as embedding providers. To configure the embedding settings, update the embedding section in the config.json file. Specify the desired embedding model, dimension, and batch size according to your requirements. Support for additional providers can be added on request.
- openai: Integration with OpenAI, supporting models like text-embedding-3-small and text-embedding-3-large.
- sentence-transformers: Integration with the sentence transformers library, providing support for models available on HuggingFace, like mixedbread-ai/mxbai-embed-large-v1.
For more information, refer to embedding providers.
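To make the three settings concrete, here is a sketch of an embedding section together with a helper showing what a batch size means in practice: texts are split into groups of at most that many items before being sent to the provider. The key names provider, model, dimension, and batch_size and the helper batched_texts are assumptions for illustration, so check the names your R2R version expects.

```python
from typing import Iterator

# Hypothetical embedding section; key names are assumptions, not a confirmed R2R schema.
embedding_config = {
    "provider": "openai",
    "model": "text-embedding-3-small",
    "dimension": 1536,  # default output size of text-embedding-3-small
    "batch_size": 32,
}


def batched_texts(texts: list[str], batch_size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of texts, each holding at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


chunks = [f"chunk {i}" for i in range(70)]
batch_lengths = [len(b) for b in batched_texts(chunks, embedding_config["batch_size"])]
print(batch_lengths)  # [32, 32, 6]
```

A larger batch size means fewer requests to the provider, at the cost of larger individual requests.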
Language Model Provider
- openai: Integration with OpenAI, supporting models like gpt-3.5-turbo.
- litellm (default): Integrates with many LLM providers, such as those listed below:
  - OpenAI
  - ollama
  - Anthropic
  - Vertex AI
  - HuggingFace
  - ...
- llama-cpp: Integrates with the llama-cpp library for local inference.
For more information, refer to llm providers.
Evaluation Provider
R2R supports DeepEval and PareaAI as evaluation providers. These providers allow you to evaluate the performance and quality of your RAG pipeline at regular intervals.
- provider: Specifies the evaluation provider to use (deepeval or pareaai).
- sampling_fraction: Determines how often the pipeline should be evaluated. It represents the fraction of queries that should trigger an evaluation. For example, a sampling_fraction of 0.1 means that approximately 10% of the queries will be evaluated.
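The effect of sampling_fraction can be illustrated with a short simulation. This is a sketch of fraction-based sampling in general, not R2R's actual implementation, and the function should_evaluate is hypothetical.

```python
import random


def should_evaluate(sampling_fraction: float, rng: random.Random) -> bool:
    """Return True with probability sampling_fraction (hypothetical helper)."""
    if not 0.0 <= sampling_fraction <= 1.0:
        raise ValueError("sampling_fraction must be between 0 and 1")
    # rng.random() is uniform on [0, 1), so this is True about that fraction of the time.
    return rng.random() < sampling_fraction


rng = random.Random(0)  # fixed seed so the simulation is repeatable
total_queries = 10_000
evaluated = sum(should_evaluate(0.1, rng) for _ in range(total_queries))
print(f"evaluated {evaluated} of {total_queries} queries")
```

Because each query is sampled independently, the evaluated share lands close to 10% over many queries but is not exactly 10% for any given run.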
Logging Provider
R2R can store execution logs of the RAG pipeline using any of the following logging providers:
- postgres: Logs pipeline execution information to a PostgreSQL database.
- local: Logs pipeline execution information to a local SQLite database.
- redis: Logs pipeline execution information to a Redis database.
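As a rough illustration of the local option, the sketch below writes one log entry per pipeline run to a SQLite database using Python's standard sqlite3 module. The table name pipeline_logs, its columns, and the functions open_log_db and log_run are hypothetical and do not reflect R2R's actual schema.

```python
import json
import sqlite3
import time


def open_log_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a SQLite database with a hypothetical log table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pipeline_logs ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "timestamp REAL NOT NULL, "
        "pipeline_run_id TEXT NOT NULL, "
        "payload TEXT NOT NULL)"
    )
    return conn


def log_run(conn: sqlite3.Connection, run_id: str, payload: dict) -> None:
    """Store one execution record, with its details serialized as JSON."""
    conn.execute(
        "INSERT INTO pipeline_logs (timestamp, pipeline_run_id, payload) VALUES (?, ?, ?)",
        (time.time(), run_id, json.dumps(payload)),
    )
    conn.commit()


conn = open_log_db()  # in-memory database; pass a file path to persist logs
log_run(conn, "run-1", {"query": "example question", "num_results": 3})
rows = conn.execute("SELECT pipeline_run_id, payload FROM pipeline_logs").fetchall()
```

The same record shape could be written to PostgreSQL or Redis instead; only the storage call would change.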