Hello, thank you very much for your work on Medical Graph RAG; the paper is excellent. I am reading the source code and trying to reproduce the experimental results in the paper, but I have found some inconsistencies when comparing the paper's description and the code implementation, and I would like to ask you about them.
1. Regarding hierarchical clustering: I noticed that the repository contains code for nano_graphrag (which implements Leiden clustering), but in the main logic of run.py (Standard Mode / else branch), these clustering algorithms do not appear to be called to build the semantic tree.
2. Regarding the retrieval logic (U-Retrieval): The paper describes the core of U-Retrieval as top-down navigation over a tree structure. However, in the seq_ret function of retrieve.py, the logic I see retrieves all Summary nodes from the database, then loops through them and calls the LLM to score each one. This looks more like a full LLM-based linear scan than the tree-based retrieval algorithm described in the paper.
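To make the difference in point 2 concrete, here is a minimal sketch of the two retrieval patterns as I understand them. This is not the repository's code or the paper's algorithm: `Node`, `linear_scan`, `top_down`, `build_demo_tree`, and the word-overlap scorer are all hypothetical names I made up for illustration, and the scorer only stands in for the LLM scoring call so that the number of calls can be counted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Node:
    """Hypothetical tree node holding a text summary (not the repo's schema)."""
    summary: str
    children: List["Node"] = field(default_factory=list)


def leaves(node: Node) -> List[Node]:
    """Collect all leaf nodes under `node`, left to right."""
    if not node.children:
        return [node]
    found: List[Node] = []
    for child in node.children:
        found.extend(leaves(child))
    return found


def linear_scan(root: Node, query: str, score: Callable[[str, str], float]) -> Node:
    """Score every leaf summary against the query: one scoring call per summary."""
    return max(leaves(root), key=lambda n: score(query, n.summary))


def top_down(root: Node, query: str, score: Callable[[str, str], float]) -> Node:
    """Greedy descent: at each level, score only the children of the current node."""
    node = root
    while node.children:
        node = max(node.children, key=lambda c: score(query, c.summary))
    return node


def make_counting_scorer() -> Tuple[Callable[[str, str], float], Dict[str, int]]:
    """Word-overlap scorer standing in for an LLM call; counts how often it runs."""
    calls = {"n": 0}

    def score(query: str, summary: str) -> float:
        calls["n"] += 1
        return len(set(query.lower().split()) & set(summary.lower().split()))

    return score, calls


def build_demo_tree() -> Node:
    """Toy two-level tree: 3 topic nodes, each with 3 leaf summaries."""
    topics = {
        "cardiology heart": ["heart failure treatment", "heart arrhythmia diagnosis", "heart valve surgery"],
        "neurology brain": ["brain stroke rehabilitation", "brain tumor imaging", "brain epilepsy medication"],
        "oncology cancer": ["cancer chemotherapy dosing", "cancer screening guidelines", "cancer immunotherapy trials"],
    }
    return Node("medical knowledge", [Node(t, [Node(s) for s in subs]) for t, subs in topics.items()])


if __name__ == "__main__":
    query = "brain stroke rehabilitation options"
    for retrieve in (linear_scan, top_down):
        score, calls = make_counting_scorer()
        hit = retrieve(build_demo_tree(), query, score)
        print(retrieve.__name__, "->", hit.summary, "| scoring calls:", calls["n"])
```

In this toy tree both functions return the same leaf, but the scan makes 9 scoring calls and the descent makes 6; the gap grows with the number of summaries. A greedy descent can also miss the best leaf when an upper-level summary is misleading, which is part of why I am asking how the paper's navigation is actually implemented.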
I would like to ask: Is the currently open-sourced code a simplified demo? Are there plans to release the "12-layer dynamic clustering tree" construction and the actual U-Retrieval navigation code described in the paper? If not, how can I reproduce the efficiency reported in the paper with the current code?
I look forward to your reply. Thank you!