Contextual Enrichment

Contextual enrichment is currently restricted to:

Self-hosted instances
Enterprise tier cloud accounts

Contact our sales team for Enterprise pricing and features.

When processing documents into chunks, individual segments can sometimes lack necessary context from surrounding content. Chunk enrichment addresses this by incorporating contextual information from neighboring chunks to create more meaningful and comprehensive text segments.

Overview

Chunk enrichment is the process of enhancing individual document chunks by considering their surrounding context.

How Enrichment Works

The enrichment process runs after initial document chunking and:

Retrieves a configurable number of preceding and succeeding chunks
Sends the chunks, along with the document summary if available, to an LLM
Generates an enriched version that maintains the original meaning while incorporating relevant context
Creates new embeddings for the enriched chunks
Replaces the original chunks in the vector database
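The first step above, collecting neighboring chunks, can be sketched as follows. This is a minimal illustration rather than R2R's actual implementation: the function name neighbor_window and the list-of-strings chunk representation are assumptions made for the example, and only the n_chunks parameter name comes from the configuration described on this page.

```python
from typing import List, Tuple


def neighbor_window(
    chunks: List[str], index: int, n_chunks: int
) -> Tuple[List[str], List[str]]:
    """Return up to n_chunks chunks before and after chunks[index].

    Hypothetical helper: the window is clipped at the document
    boundaries, so the first and last chunks get fewer neighbors.
    """
    if n_chunks < 0:
        raise ValueError("n_chunks must be non-negative")
    start = max(0, index - n_chunks)
    preceding = chunks[start:index]
    succeeding = chunks[index + 1 : index + 1 + n_chunks]
    return preceding, succeeding


chunks = ["c0", "c1", "c2", "c3", "c4"]

# A chunk in the middle gets a full window on both sides.
print(neighbor_window(chunks, 2, 2))

# The first chunk has no preceding neighbors.
print(neighbor_window(chunks, 0, 2))
```

Clipping at the boundaries, instead of padding, keeps the context sent to the LLM limited to text that actually exists in the document.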

Example Enrichment

Consider this example from a technical document about spacecraft:

Chunk Enrichment Example

Stage	Content
Original Chunk	“The heat shield underwent significant stress during this phase, reaching temperatures of 1500°C.”
Preceding Chunk	“As the spacecraft began its descent through the Martian atmosphere, the entry sequence was initiated.”
Succeeding Chunk	“These extreme temperatures were within expected parameters, thanks to the carbon-based ablative material.”
Enriched Result	“During the spacecraft’s descent through the Martian atmosphere, the heat shield underwent significant stress during the entry phase, reaching temperatures of 1500°C. These temperatures were successfully managed by the shield’s design.”

The enriched version incorporates crucial context about the Martian descent while maintaining the core information about temperature and stress levels. This improved chunk will likely perform better in searches related to Mars missions, atmospheric entry, or heat shield performance.
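To make the example concrete, the sketch below shows one way the three chunks from the table could be assembled into a single LLM prompt. The prompt wording, the section labels, and the build_enrichment_prompt function are all hypothetical; the prompt R2R actually sends is not documented on this page.

```python
from typing import List, Optional


def build_enrichment_prompt(
    chunk: str,
    preceding: List[str],
    succeeding: List[str],
    summary: Optional[str] = None,
) -> str:
    """Assemble a plain-text enrichment prompt (illustrative layout only)."""
    parts = []
    if summary:
        # The document summary is optional, mirroring "if available" above.
        parts.append("Document summary:\n" + summary)
    if preceding:
        parts.append("Preceding chunks:\n" + "\n".join(preceding))
    parts.append("Chunk to enrich:\n" + chunk)
    if succeeding:
        parts.append("Succeeding chunks:\n" + "\n".join(succeeding))
    parts.append(
        "Rewrite the chunk so that it stands alone, keeping its original meaning."
    )
    return "\n\n".join(parts)


prompt = build_enrichment_prompt(
    chunk=(
        "The heat shield underwent significant stress during this phase, "
        "reaching temperatures of 1500°C."
    ),
    preceding=[
        "As the spacecraft began its descent through the Martian atmosphere, "
        "the entry sequence was initiated."
    ],
    succeeding=[
        "These extreme temperatures were within expected parameters, "
        "thanks to the carbon-based ablative material."
    ],
)
print(prompt)
```

Keeping the target chunk in its own labeled section helps the model distinguish the text to rewrite from the text that only supplies context.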

Configuration Settings

Chunk enrichment can be enabled through a custom configuration file. To learn more about managing your R2R configuration settings, read our self-hosting documentation.

my_r2r.toml

[ingestion]
  [ingestion.chunk_enrichment_settings]
    enable_chunk_enrichment = true
    n_chunks = 2 # number of preceding/succeeding chunks to use
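Before starting the server, it can be useful to confirm that the file parses and that the settings sit under the expected table. The check below is a standalone sketch using Python's standard TOML parser; it does not use any R2R API, and the inline CONFIG_TEXT string simply stands in for the contents of my_r2r.toml.

```python
try:
    import tomllib  # standard library in Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport for older Pythons

# Stand-in for the contents of my_r2r.toml shown above.
CONFIG_TEXT = """
[ingestion]
  [ingestion.chunk_enrichment_settings]
    enable_chunk_enrichment = true
    n_chunks = 2 # number of preceding/succeeding chunks to use
"""

config = tomllib.loads(CONFIG_TEXT)
settings = config["ingestion"]["chunk_enrichment_settings"]

# Basic sanity checks on the two documented keys.
if not isinstance(settings["enable_chunk_enrichment"], bool):
    raise TypeError("enable_chunk_enrichment must be true or false")
if not isinstance(settings["n_chunks"], int) or settings["n_chunks"] < 0:
    raise ValueError("n_chunks must be a non-negative integer")

print(settings)
```

To validate a real file, open it in binary mode and pass the handle to tomllib.load instead of calling tomllib.loads on a string.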

Chunk enrichment rewrites the original chunk text. This generally improves search quality, but the process mutates the underlying chunks: the enriched versions replace the originals in the vector database.

Enrichment Process Details

The enrichment process handles chunks in batches for efficiency:

Context Collection: Gathers preceding and succeeding chunks based on the n_chunks setting
LLM Enhancement: Processes chunks through the configured LLM to incorporate context
Fallback Handling: Maintains original chunk text if enrichment fails
Batch Processing: Processes chunks in groups of 128 for optimal performance
Vector Updates: Replaces original chunks with enriched versions in the vector database
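The batching and fallback behavior described above can be sketched as follows. This is a simplified, sequential illustration under stated assumptions: enrich_all, batched, and flaky_enricher are hypothetical names, the enrich_one callable stands in for the LLM call, and only the batch size of 128 comes from the text. The real pipeline may process each batch concurrently.

```python
from typing import Callable, Iterator, List

BATCH_SIZE = 128  # group size stated in the list above


def batched(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start : start + size]


def enrich_all(
    chunks: List[str],
    enrich_one: Callable[[str], str],
    batch_size: int = BATCH_SIZE,
) -> List[str]:
    """Enrich every chunk, keeping the original text when enrichment fails."""
    enriched: List[str] = []
    for batch in batched(chunks, batch_size):
        for chunk in batch:
            try:
                enriched.append(enrich_one(chunk))
            except Exception:
                # Fallback handling: retain the original chunk text.
                enriched.append(chunk)
    return enriched


def flaky_enricher(chunk: str) -> str:
    """Stand-in for an LLM call that sometimes fails."""
    if "fail" in chunk:
        raise RuntimeError("simulated LLM error")
    return chunk.upper()


result = enrich_all(["alpha", "fail here", "beta"], flaky_enricher)
print(result)  # ['ALPHA', 'fail here', 'BETA']
```

Because the fallback returns the original text, the output list always has the same length and order as the input, so each result can be written back to the chunk it came from.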