R2RConfig
Introduction
R2RConfig uses a TOML-based configuration system to customize various aspects of R2R’s functionality. This guide provides a detailed overview of how to configure R2R, including the available options and their meanings.
Configuration File Structure
The R2R configuration is stored in a TOML file, which defaults to r2r.toml. The file is divided into several sections, each corresponding to a different aspect of the R2R system:
- Authentication
- Completion (LLM)
- Cryptography
- Database
- Embedding
- Evaluation
- Ingestion
- Knowledge Graph
- Logging
- Prompt Management
Loading a Configuration
To use a custom configuration, load it when initializing R2R.
Configuration Sections
Authentication
Refer to the AuthProvider to learn more about how R2R supports auth providers.
- provider: Authentication provider. Currently, only “r2r” is supported.
- access_token_lifetime_in_minutes: Lifespan of access tokens in minutes.
- refresh_token_lifetime_in_days: Lifespan of refresh tokens in days.
- require_authentication: If true, all secure routes require authentication. Otherwise, non-authenticated requests mock superuser access.
- require_email_verification: If true, email verification is required for new accounts.
- default_admin_email and default_admin_password: Credentials for the default admin account.
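To make the units of the two lifetime options concrete, the following sketch converts them into expiry timestamps. It is not how R2R computes expiry internally; the values (60 minutes, 7 days) and the helper token_expiries are placeholders chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# Placeholder values for the two lifetime options described above.
auth = {
    "access_token_lifetime_in_minutes": 60,
    "refresh_token_lifetime_in_days": 7,
}


def token_expiries(auth_cfg, issued_at):
    """Return (access_expiry, refresh_expiry) for a token issued at issued_at."""
    access = issued_at + timedelta(minutes=auth_cfg["access_token_lifetime_in_minutes"])
    refresh = issued_at + timedelta(days=auth_cfg["refresh_token_lifetime_in_days"])
    return access, refresh


issued = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
access_exp, refresh_exp = token_expiries(auth, issued)
print(access_exp.isoformat())   # 2024-01-01T13:00:00+00:00
print(refresh_exp.isoformat())  # 2024-01-08T12:00:00+00:00
```

Note that one option is expressed in minutes and the other in days, so swapping the two values by mistake changes token lifetimes by orders of magnitude.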
Completion (LLM)
Refer to the LLMProvider to learn more about how R2R supports LLM providers.
- provider: LLM provider. Options include “litellm” and “openai”.
- concurrent_request_limit: Maximum number of concurrent requests allowed.
- generation_config: Detailed configuration for text generation.
  - model: The specific LLM model to use.
  - temperature: Controls randomness in generation (0.0 to 1.0).
  - top_p: Parameter for nucleus sampling.
  - max_tokens_to_sample: Maximum number of tokens to generate.
  - Other parameters control various aspects of text generation.
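A setting like concurrent_request_limit is commonly enforced with a semaphore. The sketch below shows that general technique with asyncio; it is an assumption about how such a limit can work, not a description of R2R's implementation. fake_request stands in for a real LLM call, and the limit of 2 is a placeholder.

```python
import asyncio

CONCURRENT_REQUEST_LIMIT = 2  # placeholder value


async def fake_request(i, semaphore, state):
    """Stand-in for an LLM call; records how many calls run at once."""
    async with semaphore:  # waits here once the limit is reached
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.01)  # simulated network latency
        state["active"] -= 1
    return i


async def run_all(n, limit):
    semaphore = asyncio.Semaphore(limit)
    state = {"active": 0, "peak": 0}
    results = await asyncio.gather(
        *(fake_request(i, semaphore, state) for i in range(n))
    )
    return results, state["peak"]


results, peak = asyncio.run(run_all(6, CONCURRENT_REQUEST_LIMIT))
print(results, peak)
```

All six requests complete, but the recorded peak never exceeds the limit, which is the behavior to expect from a concurrency cap.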
Cryptography
Refer to the CryptoProvider to learn more about how R2R supports cryptography.
- provider: Cryptography provider for password hashing. Currently, only “bcrypt” is supported.
Database
Refer to the DatabaseProvider to learn more about how R2R supports databases.
- provider: Database provider. Only “postgres” is supported.
- user: Default username for accessing the database.
- password: Default password for accessing the database.
- host: Default host of the database server.
- port: Default port of the database server.
- db_name: Default name of the database to connect to.
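These fields map naturally onto a PostgreSQL connection URI. The sketch below assembles one from a dictionary shaped like the options above. Whether R2R builds such a URI internally is not stated in this guide, so treat postgres_url as a hypothetical helper and the values as placeholders.

```python
from urllib.parse import quote

# Placeholder values for the database options described above.
database = {
    "provider": "postgres",
    "user": "r2r_user",
    "password": "p@ss/word",
    "host": "localhost",
    "port": 5432,
    "db_name": "r2r",
}


def postgres_url(db):
    """Build a postgresql:// URI, percent-encoding the credentials."""
    # Characters such as "@" or "/" in a password would otherwise be
    # misread as URI delimiters.
    user = quote(db["user"], safe="")
    password = quote(db["password"], safe="")
    return f"postgresql://{user}:{password}@{db['host']}:{db['port']}/{db['db_name']}"


print(postgres_url(database))
# postgresql://r2r_user:p%40ss%2Fword@localhost:5432/r2r
```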
Embedding
Refer to the EmbeddingProvider to learn more about how R2R supports embeddings.
- provider: Embedding provider. Options include “ollama”, “openai” and “sentence-transformers”.
- base_model: The specific embedding model to use.
- base_dimension: Dimension of the embedding vectors.
- batch_size: Number of items to process in a single batch.
- add_title_as_prefix: Whether to add the title as a prefix to the embedded text.
- rerank_model: Model used for reranking, if any.
- concurrent_request_limit: Maximum number of concurrent embedding requests.
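To illustrate what batch_size controls, the sketch below splits a list of text chunks into batches of at most that size. The batched helper is hypothetical and only demonstrates the general idea; R2R's own batching logic may differ.

```python
def batched(items, batch_size):
    """Yield consecutive slices of items, each with at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


texts = [f"chunk {i}" for i in range(5)]
batches = list(batched(texts, batch_size=2))
print([len(b) for b in batches])  # [2, 2, 1]
```

A larger batch_size means fewer requests to the embedding provider but more data per request, so the right value depends on the provider's limits.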
Evaluation
- provider: Evaluation provider. Set to “None” to disable evaluation functionality.
Knowledge Graph
Refer to the KGProvider to learn more about how R2R supports knowledge graphs.
- provider: Specifies the backend used for storing and querying the knowledge graph. Options include “postgres” and “None”.
- batch_size: Determines how many text chunks are processed at once for knowledge extraction.
- kg_extraction_config: Configures the language model used for extracting knowledge from text chunks.
Logging
- provider: Logging provider. Currently set to “local”.
- log_table: Name of the table where logs are stored.
- log_info_table: Name of the table where log information is stored.
Prompt Management
- provider: Prompt management provider. Currently set to “r2r”.
Advanced Configuration
Environment Variables
For sensitive information like API keys, it’s recommended to use environment variables instead of hardcoding them in the configuration file. R2R will automatically look for environment variables for certain settings.
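The pattern can be sketched as follows. This guide does not list which variables R2R reads or which source takes precedence, so the variable name EXAMPLE_API_KEY, the helper resolve_secret, and the rule that the environment overrides the file are all assumptions made for illustration.

```python
import os


def resolve_secret(config_value, env_var, environ=os.environ):
    """Return the value from the environment if set, else the config value.

    Assumed precedence: the environment variable wins over the file value.
    """
    value = environ.get(env_var, config_value)
    if value is None:
        raise KeyError(f"{env_var} is not set and no fallback was configured")
    return value


# Passing a plain dict as environ keeps the example independent of the
# real process environment.
print(resolve_secret(None, "EXAMPLE_API_KEY", environ={"EXAMPLE_API_KEY": "from-env"}))
```

With this approach the configuration file can omit the secret entirely, and a missing variable fails loudly at startup instead of producing an empty credential.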
Custom Providers
R2R supports custom providers for various components. To use a custom provider, you’ll need to implement the appropriate interface and register it with R2R. Refer to the developer documentation for more details on creating custom providers.
Configuration Validation
R2R performs validation on the configuration when it’s loaded. If there are any missing required fields or invalid values, an error will be raised. Always test your configuration in a non-production environment before deploying.
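A validation step of this kind might look like the sketch below, which checks a dictionary shaped like the Completion (LLM) section. Which fields R2R actually treats as required is not stated here, so the REQUIRED table and the validate_completion helper are assumptions; the temperature range is the one given in this guide.

```python
# Assumed required fields and their expected types (illustrative only).
REQUIRED = {
    "provider": str,
    "concurrent_request_limit": int,
}


def validate_completion(section):
    """Raise ValueError listing every problem found in the section."""
    errors = []
    for key, expected in REQUIRED.items():
        if key not in section:
            errors.append(f"missing required field: {key}")
        elif not isinstance(section[key], expected):
            errors.append(f"{key} must be of type {expected.__name__}")

    temperature = section.get("generation_config", {}).get("temperature")
    if temperature is not None:
        # Check the type first so a string never reaches the comparison.
        if not isinstance(temperature, (int, float)) or not 0.0 <= temperature <= 1.0:
            errors.append("generation_config.temperature must be between 0.0 and 1.0")

    if errors:
        raise ValueError("; ".join(errors))


validate_completion({
    "provider": "litellm",
    "concurrent_request_limit": 16,
    "generation_config": {"temperature": 0.1},
})
```

Collecting all problems before raising means a single run reports every invalid field, rather than stopping at the first one.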
Best Practices
- Security: Never commit sensitive information like API keys or passwords to version control. Use environment variables instead.
- Modularity: Create separate configuration files for different environments (development, staging, production).
- Documentation: Keep your configuration files well-commented, especially when using custom or non-standard settings.
- Version Control: Track your configuration files in version control, but use .gitignore to exclude files with sensitive information.
- Regular Review: Periodically review and update your configuration to ensure it aligns with your current needs and best practices.
Troubleshooting
If you encounter issues with your configuration:
- Check the R2R logs for any error messages related to configuration.
- Verify that all required fields are present in your configuration file.
- Ensure that the values in your configuration are of the correct type (string, number, boolean, etc.).
- If using custom providers or non-standard settings, double-check the documentation or consult with the R2R community.
By following this guide, you should be able to configure R2R to suit your specific needs. Remember that R2R is highly customizable, so don’t hesitate to explore different configuration options to optimize your setup.