Configuration for Restructuring Data After Ingestion Using Knowledge Graphs

It is often effective to restructure data after ingestion to improve retrieval performance and accuracy. R2R supports knowledge graphs for data restructuring. You can find out more about creating knowledge graphs in the Knowledge Graphs Guide.

You can configure knowledge graph enrichment in the R2R configuration file by setting the database.kg_enrichment_settings section. The following sample shows the format used in the example configuration file r2r.toml.

[database]
provider = "postgres"
batch_size = 256

  [database.kg_creation_settings]
  kg_triples_extraction_prompt = "graphrag_triples_extraction_few_shot"
  entity_types = [] # if empty, all entities are extracted
  relation_types = [] # if empty, all relations are extracted
  fragment_merge_count = 4 # number of fragments to merge into a single extraction
  max_knowledge_triples = 100 # max number of triples to extract for each document chunk
  generation_config = { model = "openai/gpt-4o-mini" } # and other generation params

  [database.kg_enrichment_settings]
  max_description_input_length = 65536 # increase if you want more comprehensive descriptions
  max_summary_input_length = 65536
  generation_config = { model = "openai/gpt-4o-mini" } # and other generation params
  leiden_params = {} # more params in graspologic/partition/leiden.py

  [database.kg_search_settings]
  generation_config = { model = "openai/gpt-4o-mini" }

Next, you can run GraphRAG over the knowledge graph. Find out more about GraphRAG in the GraphRAG Guide.