Customizing Your RAG Pipeline

R2R lets you customize each stage of the RAG pipeline. You can supply your own ingestion, embedding, RAG, and evaluation pipelines by subclassing the corresponding base class and passing the subclass to E2EPipelineFactory.create_pipeline().

Custom Ingestion Pipeline

To create a custom ingestion pipeline, subclass the IngestionPipeline abstract base class and override the methods you want to change, such as process_data:

from r2r.pipelines import IngestionPipeline

class CustomIngestionPipeline(IngestionPipeline):
    def process_data(self, entry_type, entry_data):
        # Custom processing logic
        ...
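
As an illustration of what the override might contain, here is a minimal, self-contained sketch. It does not import R2R: the IngestionPipeline class below is a stand-in for the real base class, and the meaning of entry_type (a string tag such as "txt" or "json") and the string return value are assumptions made for this example, not R2R's documented contract.

```python
import json
from abc import ABC, abstractmethod


class IngestionPipeline(ABC):
    """Stand-in for r2r.pipelines.IngestionPipeline (assumption, not the real class)."""

    @abstractmethod
    def process_data(self, entry_type, entry_data):
        ...


class CustomIngestionPipeline(IngestionPipeline):
    def process_data(self, entry_type, entry_data):
        # Assumption: entry_type is a string tag and the method returns plain text.
        if entry_type == "txt":
            # Collapse runs of whitespace so later chunking sees clean text.
            return " ".join(entry_data.split())
        if entry_type == "json":
            # Flatten a JSON object into one "key: value" line per field.
            record = json.loads(entry_data)
            return "\n".join(f"{key}: {value}" for key, value in record.items())
        raise ValueError(f"Unsupported entry type: {entry_type}")
```

Check the real base class for the expected argument and return types before adapting this sketch.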

Pass your custom ingestion pipeline to the E2EPipelineFactory.create_pipeline() method using the ingestion_pipeline_impl parameter:

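# Import path is an assumption; adjust it to match your installed R2R version.
from r2r.main import E2EPipelineFactory, R2RConfig

app = E2EPipelineFactory.create_pipeline(
    config=R2RConfig.load_config(),
    ingestion_pipeline_impl=CustomIngestionPipeline,
)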

Custom Embedding Pipeline

To create a custom embedding pipeline, subclass the EmbeddingPipeline abstract base class and override the methods you want to change, such as transform_chunks:

from r2r.pipelines import EmbeddingPipeline

class CustomEmbeddingPipeline(EmbeddingPipeline):
    def transform_chunks(self, chunks, metadatas):
        # Custom chunk transformation logic
        ...
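
One possible chunk transformation is to prepend each chunk's document title so the embedding carries some document-level context. The sketch below is self-contained and uses a stand-in EmbeddingPipeline class instead of importing R2R. It assumes that chunks is a list of strings, that metadatas is a parallel list of dicts, and that a "title" key may be present; none of these are confirmed by R2R's API.

```python
from abc import ABC, abstractmethod


class EmbeddingPipeline(ABC):
    """Stand-in for r2r.pipelines.EmbeddingPipeline (assumption, not the real class)."""

    @abstractmethod
    def transform_chunks(self, chunks, metadatas):
        ...


class CustomEmbeddingPipeline(EmbeddingPipeline):
    def transform_chunks(self, chunks, metadatas):
        # Assumption: chunks is a list of str and metadatas a parallel list of dicts.
        transformed = []
        for chunk, metadata in zip(chunks, metadatas):
            # "title" is a hypothetical metadata key used only for illustration.
            title = metadata.get("title")
            # Prepend the title when one exists; otherwise keep the chunk as-is.
            transformed.append(f"{title}\n{chunk}" if title else chunk)
        return transformed
```

For example, transform_chunks(["alpha", "beta"], [{"title": "Doc A"}, {}]) returns ["Doc A\nalpha", "beta"].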

Pass your custom embedding pipeline to the E2EPipelineFactory.create_pipeline() method using the embedding_pipeline_impl parameter:

app = E2EPipelineFactory.create_pipeline(
    config=R2RConfig.load_config(),
    embedding_pipeline_impl=CustomEmbeddingPipeline,
)

Custom RAG Pipeline

To create a custom RAG pipeline, subclass the RAGPipeline abstract base class and override the methods you want to change, such as transform_query and search:

from r2r.pipelines import RAGPipeline

class CustomRAGPipeline(RAGPipeline):
    def transform_query(self, query):
        # Custom query transformation logic
        ...

    def search(self, transformed_query, filters, limit, *args, **kwargs):
        # Custom document retrieval logic
        ...
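
To make the two overrides concrete, the following sketch normalizes the query and then runs a simple keyword-overlap search over an in-memory list. It is self-contained and uses a stand-in RAGPipeline class. The constructor argument, the document shape ({"text": ..., "metadata": ...}), the treatment of filters as exact-match metadata constraints, and the return value are all hypothetical; a real R2R pipeline would typically query a vector database instead.

```python
from abc import ABC, abstractmethod


class RAGPipeline(ABC):
    """Stand-in for r2r.pipelines.RAGPipeline (assumption, not the real class)."""

    @abstractmethod
    def transform_query(self, query):
        ...

    @abstractmethod
    def search(self, transformed_query, filters, limit, *args, **kwargs):
        ...


class CustomRAGPipeline(RAGPipeline):
    def __init__(self, documents):
        # Hypothetical in-memory corpus: a list of {"text": str, "metadata": dict}.
        self.documents = documents

    def transform_query(self, query):
        # Lowercase the query and collapse repeated whitespace.
        return " ".join(query.lower().split())

    def search(self, transformed_query, filters, limit, *args, **kwargs):
        terms = set(transformed_query.split())
        scored = []
        for doc in self.documents:
            # Skip documents whose metadata fails any exact-match filter.
            if any(doc["metadata"].get(k) != v for k, v in filters.items()):
                continue
            # Score by the number of query terms that appear in the text.
            overlap = len(terms & set(doc["text"].lower().split()))
            if overlap:
                scored.append((overlap, doc))
        # Highest overlap first; ties keep their original corpus order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in scored[:limit]]
```

Keyword overlap is used here only because it needs no external services; it stands in for whatever retrieval logic your application requires.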

Pass your custom RAG pipeline to the E2EPipelineFactory.create_pipeline() method using the rag_pipeline_impl parameter:

app = E2EPipelineFactory.create_pipeline(
    config=R2RConfig.load_config(),
    rag_pipeline_impl=CustomRAGPipeline,
)

Custom Evaluation Pipeline

To create a custom evaluation pipeline, subclass the EvalPipeline abstract base class and override the methods you want to change, such as evaluate:

from r2r.pipelines import EvalPipeline

class CustomEvalPipeline(EvalPipeline):
    def evaluate(self, query, context, completion):
        # Custom evaluation logic
        ...
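
As one example of custom evaluation logic, the sketch below scores a completion by the fraction of its tokens that also appear in the retrieved context, a rough proxy for how well the answer is grounded. It is self-contained and uses a stand-in EvalPipeline class. The string inputs, the "context_overlap" key, and the dict return value are assumptions for illustration, not R2R's documented interface.

```python
from abc import ABC, abstractmethod


class EvalPipeline(ABC):
    """Stand-in for r2r.pipelines.EvalPipeline (assumption, not the real class)."""

    @abstractmethod
    def evaluate(self, query, context, completion):
        ...


class CustomEvalPipeline(EvalPipeline):
    def evaluate(self, query, context, completion):
        # Assumption: query, context, and completion are plain strings.
        # The query is accepted for interface compatibility but not used here.
        context_tokens = set(context.lower().split())
        completion_tokens = completion.lower().split()
        if not completion_tokens:
            # An empty completion has no tokens to support.
            return {"context_overlap": 0.0}
        # Count completion tokens that also occur in the context.
        supported = sum(token in context_tokens for token in completion_tokens)
        return {"context_overlap": supported / len(completion_tokens)}
```

Token overlap is a crude metric; it is shown only because it runs without any external model or service.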

Pass your custom evaluation pipeline to the E2EPipelineFactory.create_pipeline() method using the eval_pipeline_impl parameter:

app = E2EPipelineFactory.create_pipeline(
    config=R2RConfig.load_config(),
    eval_pipeline_impl=CustomEvalPipeline,
)

Summary

To customize R2R, subclass one of four abstract base classes (IngestionPipeline, EmbeddingPipeline, RAGPipeline, or EvalPipeline), override the methods you need, and pass the subclass to E2EPipelineFactory.create_pipeline() through the matching parameter: ingestion_pipeline_impl, embedding_pipeline_impl, rag_pipeline_impl, or eval_pipeline_impl. This lets you tailor ingestion, embedding, retrieval, and evaluation to your application's requirements.