Creating a knowledge graph and running GraphRAG with R2R.

Create a graph

Create a graph from the documents in a collection.

client.create_graph(
    collection_id='122fdf6a-e116-546b-a8f6-e4cb2e2c0a09',  # optional
    run_type="run",  # "estimate" or "run"
    kg_creation_settings={
        "force_kg_creation": True,
        "kg_triples_extraction_prompt": "graphrag_triples_extraction_few_shot",
        "entity_types": [],
        "relation_types": [],
        "extraction_merge_count": 4,
        "max_knowledge_triples": 100,
        "max_description_input_length": 65536,
        "generation_config": {
            "model": "openai/gpt-4o-mini",
            # other generation config params
        }
    }
)
collection_id
Optional[Union[UUID, str]]

The ID of the collection to create the graph for. If not provided, the graph will be created for the default collection.

run_type
Optional[Union[str, KGRunType]]

The type of run to perform. Options are “estimate” or “run”. “estimate” returns an estimate of the creation cost; “run” creates the graph.

kg_creation_settings
Optional[Union[dict, KGCreationSettings]]

The settings for the graph creation process.
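
The two run types can be chained so that a graph is built only after its estimate has been reviewed. The following is a minimal sketch, not part of the R2R SDK: `create_graph_if_approved` and the `approve` callback are hypothetical names, and because the structure of the estimate response is not described here, the response is passed to the callback unchanged.

```python
from typing import Any, Callable, Optional


def create_graph_if_approved(
    client: Any,
    collection_id: str,
    kg_creation_settings: dict,
    approve: Callable[[Any], bool],
) -> Optional[Any]:
    """Request an estimate first; build the graph only if `approve` accepts it."""
    estimate = client.create_graph(
        collection_id=collection_id,
        run_type="estimate",
        kg_creation_settings=kg_creation_settings,
    )
    # Assumption: the estimate's structure is unknown here, so it is handed
    # to the caller's callback as-is instead of being parsed.
    if not approve(estimate):
        return None
    return client.create_graph(
        collection_id=collection_id,
        run_type="run",
        kg_creation_settings=kg_creation_settings,
    )
```

A callback such as `lambda estimate: input(f"{estimate}\nProceed? [y/N] ") == "y"` would turn this into an interactive confirmation step.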

Enrich a graph

client.enrich_graph(
    collection_id='122fdf6a-e116-546b-a8f6-e4cb2e2c0a09',
    run_type="run",
    kg_enrichment_settings={
        "community_reports_prompt": "graphrag_community_reports",
        "max_summary_input_length": 65536,
        "generation_config": {
            "model": "openai/gpt-4o-mini",
            "temperature": 0.12,
            # other generation config params
        },
        "leiden_params": {
            # Leiden algorithm params; all are optional, default values are shown
            "max_cluster_size": 1000,
            "starting_communities": None,
            "extra_forced_iterations": 0,
            "resolution": 1.0,
            "randomness": 0.001,
            "use_modularity": True,
            "random_seed": 7272,  # if not set, defaults to 7272
            "weight_attribute": "weight",
            "is_weighted": None,
            "weight_default": 1.0,
            "check_directed": True,
        }
    }
)
collection_id
Optional[Union[UUID, str]]

The ID of the collection to enrich the graph for. If not provided, the graph will be enriched for the default collection.

run_type
Optional[Union[str, KGRunType]]

The type of run to perform. Options are “estimate” or “run”. “estimate” returns an estimate of the enrichment cost; “run” enriches the graph.

kg_enrichment_settings
Optional[Union[dict, KGEnrichmentSettings]]

The settings for the graph enrichment process.
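
Since every Leiden parameter is optional, it can be convenient to start from the defaults listed in the example and override only what changes. The helper below is a sketch under that assumption; `build_enrichment_settings` and `DEFAULT_LEIDEN_PARAMS` are illustrative names, not part of R2R, and the default values are copied from the example call.

```python
from typing import Any

# Default Leiden parameters, copied from the enrich_graph example.
DEFAULT_LEIDEN_PARAMS: dict[str, Any] = {
    "max_cluster_size": 1000,
    "starting_communities": None,
    "extra_forced_iterations": 0,
    "resolution": 1.0,
    "randomness": 0.001,
    "use_modularity": True,
    "random_seed": 7272,
    "weight_attribute": "weight",
    "is_weighted": None,
    "weight_default": 1.0,
    "check_directed": True,
}


def build_enrichment_settings(model: str, **leiden_overrides: Any) -> dict:
    """Return a kg_enrichment_settings dict with selected Leiden overrides."""
    unknown = set(leiden_overrides) - set(DEFAULT_LEIDEN_PARAMS)
    if unknown:
        # Catch typos locally instead of sending an unrecognized key.
        raise ValueError(f"Unknown Leiden parameters: {sorted(unknown)}")
    return {
        "community_reports_prompt": "graphrag_community_reports",
        "max_summary_input_length": 65536,
        "generation_config": {"model": model},
        "leiden_params": {**DEFAULT_LEIDEN_PARAMS, **leiden_overrides},
    }
```

For example, `build_enrichment_settings("openai/gpt-4o-mini", resolution=2.0)` changes only the resolution and leaves the other Leiden parameters at the values shown in the example.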

Get entities

client.get_entities(
    collection_id='122fdf6a-e116-546b-a8f6-e4cb2e2c0a09',
    offset=0,
    limit=1000,
    entity_ids=None
)
collection_id
Optional[Union[UUID, str]]

The ID of the collection to get the entities from. If not provided, the entities will be retrieved from the default collection.

offset
int

The offset for pagination.

limit
int

The limit for pagination.

entity_ids
Optional[list[str]]

The list of entity IDs to filter by.
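
The offset and limit parameters can be combined to walk through a large result set one page at a time. The generator below is a generic sketch: `iter_pages` is a hypothetical helper, and `fetch_page` is a caller-supplied function that returns one page as a list. How the list is pulled out of the R2R response is left to the caller, because the response shape is not described here.

```python
from typing import Any, Callable, Iterator


def iter_pages(
    fetch_page: Callable[[int, int], list],
    limit: int = 1000,
) -> Iterator[Any]:
    """Yield items page by page until a page comes back short or empty."""
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        yield from page
        if len(page) < limit:
            # A short page means there is nothing left to fetch.
            break
        offset += limit


# Hypothetical usage; `extract_entities` stands for whatever code pulls the
# list of entities out of the response in your R2R version:
#
# entities = list(iter_pages(
#     lambda offset, limit: extract_entities(
#         client.get_entities(collection_id=cid, offset=offset, limit=limit)
#     )
# ))
```

The same helper works for the other paginated calls on this page, as long as the supplied function returns a plain list.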

Get triples

client.get_triples(
    collection_id='122fdf6a-e116-546b-a8f6-e4cb2e2c0a09',
    offset=0,
    limit=100,
    entity_names=[],
    triple_ids=None
)
collection_id
Optional[Union[UUID, str]]

The ID of the collection to get the triples from. If not provided, the triples will be retrieved from the default collection.

offset
Optional[int]

The offset for pagination. Defaults to 0.

limit
Optional[int]

The limit for pagination. Defaults to 100.

entity_names
Optional[list[str]]

The list of entity names to filter by. Entity names are in all caps, e.g. [‘ARISTOTLE’, ‘PLATO’].

triple_ids
Optional[list[str]]

The list of triple IDs to filter by.
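
Because entity names are in all caps, names taken from user input may need normalizing before they are passed as `entity_names`. The sketch below assumes the filter matches names exactly, which is not stated here; `normalize_entity_names` is a hypothetical helper.

```python
def normalize_entity_names(names: list[str]) -> list[str]:
    """Trim, upper-case, and de-duplicate names while keeping their order."""
    seen: set[str] = set()
    normalized: list[str] = []
    for name in names:
        key = name.strip().upper()
        # Skip blanks and names already seen in another casing.
        if key and key not in seen:
            seen.add(key)
            normalized.append(key)
    return normalized
```

The result can be passed directly, for instance `client.get_triples(collection_id=cid, entity_names=normalize_entity_names(["Aristotle", "plato"]))`, where `cid` is your collection ID.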

Get Communities

client.get_communities(
    collection_id='122fdf6a-e116-546b-a8f6-e4cb2e2c0a09',
    offset=0,
    limit=100,
    levels=[],
    community_numbers=[],
)
collection_id
Optional[Union[UUID, str]]

The ID of the collection to get the communities from. If not provided, the communities will be retrieved from the default collection.

offset
Optional[int]

The offset for pagination. Defaults to 0.

limit
Optional[int]

The limit for pagination. Defaults to 100.

levels
Optional[list[int]]

The list of levels to filter by. Hierarchical clustering assigns each community a level.

community_numbers
Optional[list[int]]

The list of community numbers to filter by.
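
Once communities have been retrieved, grouping them by level gives a view of the hierarchy. The sketch below assumes each community is available as a dict with a `level` key; that record shape is an assumption for illustration and may differ from what your R2R version returns.

```python
from collections import defaultdict


def group_by_level(communities: list[dict]) -> dict[int, list[dict]]:
    """Group community records by their hierarchy level, lowest level first."""
    grouped: dict[int, list[dict]] = defaultdict(list)
    for community in communities:
        # Assumption: every record carries an integer "level" field.
        grouped[community["level"]].append(community)
    return dict(sorted(grouped.items()))
```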

Delete Graph

Delete the graph for a collection using the delete_graph_for_collection method.

client.delete_graph_for_collection(
    collection_id='122fdf6a-e116-546b-a8f6-e4cb2e2c0a09',
    cascade=False
)
collection_id
Union[UUID, str]

The ID of the collection to delete the graph for.

cascade
bool

Whether to cascade the deletion.

NOTE: Setting this flag to True also deletes entities and triples for documents that are shared across multiple collections. Set it only if you are certain that you want to delete the entities and triples for every document in the collection.
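
Given how destructive a cascading delete is, one option is to wrap the call so that it requires a second, explicit confirmation. This is a sketch of that idea; `delete_graph` and `confirm_cascade` are hypothetical names, not part of the R2R client.

```python
from typing import Any


def delete_graph(
    client: Any,
    collection_id: str,
    cascade: bool = False,
    confirm_cascade: bool = False,
) -> Any:
    """Delete a collection's graph, refusing to cascade without confirmation."""
    if cascade and not confirm_cascade:
        raise ValueError(
            "cascade=True also deletes entities and triples of documents "
            "shared with other collections; pass confirm_cascade=True to proceed."
        )
    return client.delete_graph_for_collection(
        collection_id=collection_id,
        cascade=cascade,
    )
```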

Get Tuned Prompt

client.get_tuned_prompt(
    prompt_name="graphrag_entity_description",
    collection_id='122fdf6a-e116-546b-a8f6-e4cb2e2c0a09',
    documents_offset=0,
    documents_limit=100,
    chunk_offset=0,
    chunk_limit=100
)
prompt_name
str

The name of the prompt to tune. Valid values include “graphrag_entity_description”, “graphrag_triples_extraction_few_shot”, and “graphrag_community_reports”.

collection_id
Optional[Union[UUID, str]]

The ID of the collection to tune the prompt for. If not provided, the default collection will be used.

documents_offset
Optional[int]

The offset for pagination of documents. Defaults to 0.

documents_limit
Optional[int]

The limit for pagination of documents. Defaults to 100. Controls how many documents are used for tuning.

chunk_offset
Optional[int]

The offset for pagination of chunks within each document. Defaults to 0.

chunk_limit
Optional[int]

The limit for pagination of chunks within each document. Defaults to 100. Controls how many chunks per document are used for tuning.

The tuning process provides an LLM with chunks from each document in the collection. The relative sample size can therefore be controlled by adjusting the document and chunk limits.
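
To get a feel for how the limits bound the sample, the number of chunks selected can be estimated locally. The function below is a back-of-the-envelope sketch that assumes the offsets and limits behave like ordinary slicing, first over documents and then over the chunks of each selected document; `chunks_sampled` is a hypothetical helper and does not call R2R.

```python
def chunks_sampled(
    chunks_per_document: list[int],
    documents_offset: int = 0,
    documents_limit: int = 100,
    chunk_offset: int = 0,
    chunk_limit: int = 100,
) -> int:
    """Estimate how many chunks the given offsets and limits would select."""
    selected = chunks_per_document[documents_offset:documents_offset + documents_limit]
    # Each selected document contributes at most `chunk_limit` chunks,
    # counted from `chunk_offset` onward.
    return sum(min(max(n - chunk_offset, 0), chunk_limit) for n in selected)
```

With the defaults, the sample never exceeds 100 × 100 = 10,000 chunks, whatever the size of the collection.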

Deduplicate Entities

client.deduplicate_entities(
    collection_id='122fdf6a-e116-546b-a8f6-e4cb2e2c0a09',
    entity_deduplication_settings=entity_deduplication_settings
)
collection_id
Union[UUID, str]

The ID of the collection to deduplicate entities for.

entity_deduplication_settings
EntityDeduplicationSettings

The settings for the entity deduplication process.

Search and RAG

Please see the Search and RAG documentation for more information on how to perform search and RAG using Knowledge Graphs.

API Reference

Please see the API documentation for more information on the capabilities of the R2R Graph creation and enrichment API.