Orchestration
Learn how orchestration is handled inside R2R
Introduction to orchestration
R2R uses Hatchet for orchestrating complex workflows, particularly for ingestion and knowledge graph construction processes.
Hatchet is a distributed, fault-tolerant task queue that solves scaling problems like concurrency, fairness, and rate limiting. It allows R2R to distribute functions between workers with minimal configuration.
Key Concepts
- Workflows: Sets of functions executed in response to external triggers.
- Workers: Long-running processes that execute workflow functions.
- Managed Queue: Low-latency queue for handling real-time tasks.
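To make these three concepts concrete, here is a minimal toy model using only the Python standard library. It is not Hatchet's API and not R2R's code: `Workflow`, `worker`, and `demo` are hypothetical names chosen for illustration, and an in-process `queue.Queue` stands in for the managed queue.

```python
import queue
import threading
from dataclasses import dataclass, field
from typing import Any, Callable

# Toy model only: none of these names come from Hatchet or R2R.


@dataclass
class Workflow:
    """A named, ordered set of functions run in response to a trigger."""

    name: str
    steps: list[Callable[[Any], Any]] = field(default_factory=list)

    def run(self, payload: Any) -> Any:
        # Each step receives the previous step's output.
        for step in self.steps:
            payload = step(payload)
        return payload


def worker(tasks: queue.Queue, results: dict[str, Any]) -> None:
    """Long-running loop that pulls queued runs and executes them."""
    while True:
        item = tasks.get()
        if item is None:  # sentinel: shut this worker down
            tasks.task_done()
            break
        run_id, workflow, payload = item
        results[run_id] = workflow.run(payload)
        tasks.task_done()


def demo() -> dict[str, Any]:
    tasks: queue.Queue = queue.Queue()
    results: dict[str, Any] = {}
    shout = Workflow("shout", [str.strip, str.upper])

    # Two workers share one queue, so runs are distributed between them.
    threads = [
        threading.Thread(target=worker, args=(tasks, results)) for _ in range(2)
    ]
    for t in threads:
        t.start()

    # The "external trigger" here is simply enqueueing a run.
    tasks.put(("run-1", shout, "  hello "))
    tasks.put(("run-2", shout, " world  "))
    for _ in threads:
        tasks.put(None)  # one shutdown sentinel per worker
    for t in threads:
        t.join()
    return results
```

Calling `demo()` enqueues two runs of the same workflow and lets whichever worker is free pick each one up. A real orchestrator adds persistence, scheduling, and networking on top of this basic shape.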
Orchestration in R2R
Benefits of orchestration
- Scalability: Efficiently handles large-scale tasks.
- Fault Tolerance: Built-in retry mechanisms and error handling.
- Flexibility: Easy to add or modify workflows as R2R’s capabilities expand.
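As a rough illustration of what a retry mechanism does, the sketch below retries a failing function with exponential backoff. It is a generic pattern written with the standard library, not Hatchet's implementation. `run_with_retries` and `make_flaky` are hypothetical helpers, and the attempt count and delays are arbitrary assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(
    task: Callable[[], T], max_attempts: int = 3, base_delay: float = 0.1
) -> T:
    """Call task until it succeeds or max_attempts is exhausted."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff: base_delay, 2x, 4x, ...
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")


def make_flaky(failures: int):
    """Build a step that fails `failures` times, then succeeds (demo only)."""
    state = {"calls": 0}

    def step() -> str:
        state["calls"] += 1
        if state["calls"] <= failures:
            raise RuntimeError("transient failure")
        return "ok"

    return step, state
```

For example, `step, state = make_flaky(2)` followed by `run_with_retries(step)` succeeds on the third call. If the step fails more times than `max_attempts` allows, the last exception propagates to the caller.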
Workflows in R2R
- IngestFilesWorkflow: Handles file ingestion, parsing, chunking, and embedding.
- UpdateFilesWorkflow: Manages the process of updating existing files.
- KgExtractAndStoreWorkflow: Extracts and stores knowledge graph information.
- CreateGraphWorkflow: Orchestrates the creation of knowledge graphs.
- EnrichGraphWorkflow: Handles graph enrichment processes like node creation and clustering.
Orchestration GUI
By default, R2R’s Docker setup ships with Hatchet’s front-end application on port 7274. It can be accessed by navigating to http://localhost:7274.
You may log in with the following credentials:
Email: [email protected]
Password: Admin123!!
Running Tasks
The panel below shows the state of the Hatchet workflow panel at http://localhost:7274/workflow-runs immediately after calling r2r ingest-sample-files:
Inspecting a workflow
You can inspect a workflow within Hatchet and, if a job fails, retry it directly from the GUI:
Long-running tasks
Hatchet supports long-running tasks, which is particularly useful during knowledge graph construction:
Coming Soon
In the coming days and weeks, we will further document the available feature set and best practices for orchestrating your ingestion workflows inside R2R.