Customizing Your RAG Pipeline
The R2R library provides flexibility in customizing various aspects of the RAG pipeline to suit your specific needs. You can create custom implementations of the ingestion pipeline, embedding pipeline, RAG pipeline, and evaluation pipeline by subclassing the respective base classes.
Custom Ingestion Pipeline
To create a custom ingestion pipeline, subclass the IngestionPipeline
abstract base class and override the necessary methods. For example:
from r2r.pipelines import IngestionPipeline
class CustomIngestionPipeline(IngestionPipeline):
def process_data(self, entry_type, entry_data):
# Custom processing logic
...
Pass your custom ingestion pipeline to the E2EPipelineFactory.create_pipeline()
method using the ingestion_pipeline_impl
parameter:
app = E2EPipelineFactory.create_pipeline(
config=R2RConfig.load_config(),
ingestion_pipeline_impl=CustomIngestionPipeline,
)
Custom Embedding Pipeline
To create a custom embedding pipeline, subclass the EmbeddingPipeline
abstract base class and override the necessary methods. For example:
from r2r.pipelines import EmbeddingPipeline
class CustomEmbeddingPipeline(EmbeddingPipeline):
def transform_chunks(self, chunks, metadatas):
# Custom chunk transformation logic
...
Pass your custom embedding pipeline to the E2EPipelineFactory.create_pipeline()
method using the embedding_pipeline_impl
parameter:
app = E2EPipelineFactory.create_pipeline(
config=R2RConfig.load_config(),
embedding_pipeline_impl=CustomEmbeddingPipeline,
)
Custom RAG Pipeline
To create a custom RAG pipeline, subclass the RAGPipeline
abstract base class and override the necessary methods. For example:
from r2r.pipelines import RAGPipeline
class CustomRAGPipeline(RAGPipeline):
def transform_query(self, query):
# Custom query transformation logic
...
def search(self, transformed_query, filters, limit, *args, **kwargs):
# Custom document retrieval logic
...
Pass your custom RAG pipeline to the E2EPipelineFactory.create_pipeline()
method using the rag_pipeline_impl
parameter:
app = E2EPipelineFactory.create_pipeline(
config=R2RConfig.load_config(),
rag_pipeline_impl=CustomRAGPipeline,
)
Custom Evaluation Pipeline
To create a custom evaluation pipeline, subclass the EvalPipeline
abstract base class and implement the necessary methods. For example:
from r2r.pipelines import EvalPipeline
class CustomEvalPipeline(EvalPipeline):
def evaluate(self, query, context, completion):
# Custom evaluation logic
...
Pass your custom evaluation pipeline to the E2EPipelineFactory.create_pipeline()
method using the eval_pipeline_impl
parameter:
app = E2EPipelineFactory.create_pipeline(
config=R2RConfig.load_config(),
eval_pipeline_impl=CustomEvalPipeline,
)
Summary
The document serves as a guide for developers to customize the RAG pipeline in the R2R framework by subclassing four key abstract base classes: IngestionPipeline
, EmbeddingPipeline
, RAGPipeline
, and EvalPipeline
. For each pipeline, the document provides an example of how to override the necessary methods to implement custom logic. It also demonstrates how to integrate these custom pipelines into the R2R system using the E2EPipelineFactory.create_pipeline()
method, ensuring that developers can tailor the behavior of ingestion, embedding, retrieval, and evaluation processes to fit their application's requirements.