Workflows
Troubleshooting Guide: Workflow Orchestration Failures in R2R
Workflow orchestration is a critical component of R2R, managed by Hatchet. When orchestration failures occur, they can disrupt the entire data processing pipeline. This guide will help you identify and resolve common issues.
1. Check Hatchet Service Status
First, ensure that the Hatchet service is running properly:
Look for containers with names like hatchet-engine
, hatchet-api
, and hatchet-rabbitmq
.
2. Examine Hatchet Logs
View the logs for the Hatchet engine:
Look for error messages or warnings that might indicate the source of the problem.
3. Common Issues and Solutions
3.1 Connection Issues with RabbitMQ
Symptom: Hatchet logs show connection errors to RabbitMQ.
Solution:
- Check if RabbitMQ is running:
- Verify RabbitMQ credentials in the Hatchet configuration.
- Ensure the RabbitMQ container is in the same Docker network as Hatchet.
3.2 Database Connection Problems
Symptom: Errors in Hatchet logs related to database connections.
Solution:
- Verify Postgres container is running and healthy:
- Check database connection settings in Hatchet configuration.
- Ensure the Postgres container is in the same Docker network as Hatchet.
3.3 Workflow Definition Errors
Symptom: Specific workflows fail to start or execute properly.
Solution:
- Review the workflow definition in your R2R configuration.
- Check for syntax errors or invalid step definitions.
- Verify that all required environment variables for the workflow are set.
3.4 Resource Constraints
Symptom: Workflows start but fail due to timeout or resource exhaustion.
Solution:
- Check system resources (CPU, memory) on the host machine.
- Adjust resource limits for Docker containers if necessary.
- Consider scaling up your infrastructure or optimizing workflow resource usage.
3.5 Version Incompatibility
Symptom: Unexpected errors after updating R2R or Hatchet.
Solution:
- Ensure all components (R2R, Hatchet, RabbitMQ) are compatible versions.
- Check the R2R documentation for any breaking changes in recent versions.
- Consider rolling back to a known working version if issues persist.
4. Advanced Debugging
4.1 Inspect Hatchet API
Use the Hatchet API to get more details about workflow executions:
- Find the Hatchet API container:
- Use
curl
to query the API (replace<container-id>
with the actual ID):
4.2 Check Hatchet Dashboard
If you have the Hatchet dashboard set up:
- Access the dashboard (typically at
http://localhost:8002
if using default ports). - Navigate to the Workflows section to view detailed execution status and logs.
4.3 Analyze RabbitMQ Queues
Inspect RabbitMQ queues to check for message backlogs or routing issues:
- Access the RabbitMQ management interface (typically at
http://localhost:15672
). - Check queue lengths, message rates, and any dead-letter queues.
5. Common Workflow-Specific Issues
5.1 Document Ingestion Failures
Symptom: Documents fail to process through the ingestion workflow.
Solution:
- Check the Unstructured API configuration and connectivity.
- Verify file permissions and formats of ingested documents.
- Examine R2R logs for specific ingestion errors.
5.2 Vector Store Update Issues
Symptom: Vector store (Postgres with pgvector) not updating correctly.
Solution:
- Check Postgres logs for any errors related to vector operations.
- Verify the pgvector extension is properly installed and enabled.
- Ensure the R2R configuration correctly specifies the vector store settings.
6. Seeking Further Help
If you’re still experiencing issues after trying these solutions:
- Gather all relevant logs (R2R, Hatchet, RabbitMQ, Postgres).
- Document the steps to reproduce the issue.
- Check the R2R GitHub repository for similar reported issues.
- Consider opening a new issue on the R2R GitHub repository with your findings.
Remember to provide:
- R2R version (
r2r version
) - Docker Compose configuration
- Relevant parts of your R2R configuration
- Detailed error messages and logs
By following this guide, you should be able to diagnose and resolve most workflow orchestration issues in R2R. If problems persist, don’t hesitate to seek help from the R2R community or support channels.