RAG Query

Execute a RAG (Retrieval-Augmented Generation) query. This endpoint combines search results with language model generation to produce accurate, contextually-relevant responses based on your document corpus.

**Features:**

- Combines vector search, optional knowledge graph integration, and LLM generation
- Automatically cites sources with unique citation identifiers
- Supports both streaming and non-streaming responses
- Compatible with various LLM providers (OpenAI, Anthropic, etc.)
- Web search integration for up-to-date information

**Search Configuration:**

All search parameters from the search endpoint apply here, including filters, hybrid search, and graph-enhanced search.

**Generation Configuration:**

Fine-tune the language model's behavior with `rag_generation_config`:

```json
{
  "model": "openai/gpt-4o-mini",  // Model to use
  "temperature": 0.7,             // Control randomness (0-1)
  "max_tokens": 1500,             // Maximum output length
  "stream": true                  // Enable token streaming
}
```

**Model Support:**

- OpenAI models (default)
- Anthropic Claude models (requires ANTHROPIC_API_KEY)
- Local models via Ollama
- Any provider supported by LiteLLM

**Streaming Responses:**

When `stream: true` is set, the endpoint returns Server-Sent Events with the following types:

- `search_results`: Initial search results from your documents
- `message`: Partial tokens as they're generated
- `citation`: Citation metadata when sources are referenced
- `final_answer`: Complete answer with structured citations

**Example Response:**

```json
{
  "generated_answer": "DeepSeek-R1 is a model that demonstrates impressive performance...[1]",
  "search_results": { ... },
  "citations": [
    { "id": "cit.123456", "object": "citation", "payload": { ... } }
  ]
}
```
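The streaming event types listed above can be consumed with a small Server-Sent Events parser. The following is a minimal sketch, not the official client: it assumes each event arrives as standard `event:` / `data:` fields terminated by a blank line and that the `data` field usually carries JSON. The helper name `iter_sse_events` and the payload keys in the usage example are hypothetical.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, object]]:
    """Yield (event_type, payload) pairs from decoded SSE text lines.

    Assumption: payloads are JSON; anything that fails to parse is
    returned as the raw string instead.
    """
    event_type = "message"  # SSE default when no `event:` field is sent
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                data = "\n".join(data_parts)
                try:
                    payload = json.loads(data)
                except json.JSONDecodeError:
                    payload = data
                yield event_type, payload
            event_type, data_parts = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("event:"):
            event_type = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Per the SSE format, drop one leading space after the colon.
            data_parts.append(value[1:] if value.startswith(" ") else value)
    # An event not terminated by a blank line is dropped, as in the SSE format.


# Usage with hypothetical payloads; a real stream would come from the HTTP response.
sample_stream = [
    "event: search_results\n",
    'data: {"hits": 2}\n',
    "\n",
    "event: message\n",
    'data: {"delta": "Deep"}\n',
    "\n",
    ": keep-alive\n",
    "event: final_answer\n",
    'data: {"generated_answer": "DeepSeek-R1 ... [1]"}\n',
    "\n",
]
for kind, payload in iter_sse_events(sample_stream):
    print(kind, payload)
```

Dispatching on the event type keeps token rendering (`message`) separate from source display (`search_results`, `citation`), and `final_answer` can be treated as the authoritative result once it arrives.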

Authentication

Authorization: Bearer

Bearer authentication of the form `Bearer <token>`, where `token` is your auth token.

Headers

`X-API-Key` (string, required)
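As a minimal sketch of the authentication scheme described above, the helper below assembles the two headers this page names. The function name `build_auth_headers` is hypothetical, and whether a given deployment needs both credentials or only one is an assumption left to the caller.

```python
from typing import Dict, Optional


def build_auth_headers(token: Optional[str] = None,
                       api_key: Optional[str] = None) -> Dict[str, str]:
    """Build request headers from whichever credentials are supplied."""
    if not token and not api_key:
        raise ValueError("provide a bearer token, an API key, or both")
    headers = {"Content-Type": "application/json"}
    if token:
        # Bearer authentication of the form "Bearer <token>".
        headers["Authorization"] = f"Bearer {token}"
    if api_key:
        headers["X-API-Key"] = api_key
    return headers


# Placeholder credentials for illustration only.
print(build_auth_headers(token="my-token", api_key="my-key"))
```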

Request

This endpoint expects an object.
`query` (string, required)
`search_mode` (enum, optional)
Default value of `custom` allows full control over search settings. Pre-configured search modes:

- `basic`: A simple semantic-based search.
- `advanced`: A more powerful hybrid search combining semantic and full-text.
- `custom`: Full control via `search_settings`.

If `filters` or `limit` are provided alongside `basic` or `advanced`, they will override the default settings for that mode.
Allowed values: `basic`, `advanced`, `custom`
`search_settings` (object, optional)

The search configuration object. If `search_mode` is `custom`, these settings are used as-is. For `basic` or `advanced`, these settings will override the default mode configuration.

Common overrides include `filters` to narrow results and `limit` to control how many results are returned.
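The override rule described above can be pictured as a dictionary merge. This is only an illustrative sketch of the documented behavior, not the service's implementation: the per-mode defaults in `HYPOTHETICAL_MODE_DEFAULTS` are invented placeholders (the real defaults are not stated here), and the shallow merge is an assumption.

```python
from typing import Any, Dict, Optional

# Invented placeholder defaults, for illustration only.
HYPOTHETICAL_MODE_DEFAULTS: Dict[str, Dict[str, Any]] = {
    "basic": {"use_hybrid_search": False, "limit": 10},
    "advanced": {"use_hybrid_search": True, "limit": 10},
}


def resolve_search_settings(
    search_mode: str = "custom",
    search_settings: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Return the effective settings for a given mode and set of overrides."""
    overrides = dict(search_settings or {})
    if search_mode == "custom":
        # Custom mode: caller-supplied settings are used as-is.
        return overrides
    if search_mode not in HYPOTHETICAL_MODE_DEFAULTS:
        raise ValueError(f"unknown search_mode: {search_mode!r}")
    # Pre-configured mode: caller-supplied keys win over the mode defaults.
    return {**HYPOTHETICAL_MODE_DEFAULTS[search_mode], **overrides}


print(resolve_search_settings("advanced", {"limit": 5}))
```

With these placeholder defaults, passing `{"limit": 5}` alongside `advanced` keeps hybrid search enabled while reducing the result count.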

`rag_generation_config` (object, optional)
Configuration for RAG generation
`task_prompt` (string, optional)
Optional custom prompt to override default
`include_title_if_available` (boolean, optional, defaults to `false`)
Include document titles in responses when available
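Putting the request fields together, the sketch below assembles a JSON body from the parameters listed above and omits any optional field the caller did not set. The helper name `build_rag_request` is hypothetical, and the example values are illustrative; sending the body (URL, HTTP client) is left out because the endpoint path is not given here.

```python
import json
from typing import Any, Dict, Optional


def build_rag_request(
    query: str,
    search_mode: Optional[str] = None,
    search_settings: Optional[Dict[str, Any]] = None,
    rag_generation_config: Optional[Dict[str, Any]] = None,
    task_prompt: Optional[str] = None,
    include_title_if_available: Optional[bool] = None,
) -> Dict[str, Any]:
    """Build the request object; only `query` is required."""
    if not query:
        raise ValueError("query is required")
    body: Dict[str, Any] = {"query": query}
    optional_fields = {
        "search_mode": search_mode,
        "search_settings": search_settings,
        "rag_generation_config": rag_generation_config,
        "task_prompt": task_prompt,
        "include_title_if_available": include_title_if_available,
    }
    # Leave unset fields out so the server-side defaults apply.
    body.update({k: v for k, v in optional_fields.items() if v is not None})
    return body


request_body = build_rag_request(
    "What is DeepSeek-R1?",
    search_mode="advanced",
    search_settings={"limit": 5},
    rag_generation_config={"model": "openai/gpt-4o-mini", "stream": False},
)
print(json.dumps(request_body, indent=2))
```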

Response

Successful Response

Errors