Shashank Shekhar Singh edited this page Sep 23, 2025 · 1 revision

✨ Welcome to the CodeGraphContext Wiki! ✨

This is the central hub for understanding the architecture, concepts, and capabilities of CodeGraphContext. Whether you're looking to use it, contribute to it, or just learn how it works, you're in the right place.


🚀 What is CodeGraphContext?

At its heart, CodeGraphContext is an MCP (Model Context Protocol) server that transforms your local source code into a rich, queryable knowledge graph. It acts as a bridge between your AI assistant and your codebase, giving the AI a deep, structural understanding of your project.

Think of it as giving your AI a superpower: instead of just reading files as plain text, it can see the web of connections between them (function calls, class inheritance, imports, and more). This enables context-aware assistance for debugging, refactoring, and understanding complex systems.

Core Features:

  • 🧠 Deep Code Indexing: Uses tree-sitter for fast, accurate parsing of Python code.
  • 🕸️ Graph-Based Analysis: Leverages a Neo4j database to map out and query code relationships.
  • Live Updates: A watchdog-based file watcher keeps the graph synchronized with your code as you edit.
  • 🗣️ Natural Language Interaction: A suite of powerful tools allows you to ask complex questions about your code in plain English.
  • 🛠️ Interactive CLI: A user-friendly command-line interface (cgc) for easy setup and management.

🏗️ Architecture Deep Dive

CodeGraphContext operates on a simple yet powerful principle: code is a graph. Here's a look at how the different components work together.

The Flow: From Code to Query

  1. Indexing (add_code_to_graph or watch_directory):

    • The GraphBuilder is invoked.
    • It uses the TreeSitterParser to parse every Python file in your project. This is more robust than regex matching, and because tree-sitter tolerates syntax errors, it can still extract detailed structural information from files that a strict AST parser would reject.
    • The parsed data (functions, classes, calls, imports) is translated into nodes and relationships.
    • These nodes and relationships are committed to the Neo4j database, forming the code graph. This entire process runs in a background job managed by the JobManager, so the server remains responsive.
  2. Querying (e.g., "who calls this function?"):

    • Your AI assistant sends a tools/call request to the MCPServer.
    • The server routes the request to the appropriate tool, usually the CodeFinder.
    • The CodeFinder constructs a Cypher query based on the request.
    • The query is executed against the Neo4j database.
    • The results are formatted and sent back to the AI, providing a fact-based, contextually accurate answer.
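The two steps above can be sketched end to end in a few lines. This is a minimal illustration, not the project's implementation: it swaps in Python's built-in `ast` module for tree-sitter and an in-memory set of edges for Neo4j so that it runs without dependencies. The names `build_graph` and `find_callers` are hypothetical.

```python
import ast

SOURCE = '''
def helper():
    return 1

def wrapper():
    return helper() + 1

def main():
    wrapper()
    helper()
'''

def build_graph(source):
    """Return (nodes, calls): function names and (caller, callee) edges.

    Stand-in for the indexing step: the real project parses with tree-sitter
    and writes to Neo4j; here we use ast and plain Python sets.
    """
    tree = ast.parse(source)
    nodes = set()
    calls = set()
    for func in ast.walk(tree):
        if not isinstance(func, ast.FunctionDef):
            continue
        nodes.add(func.name)
        for node in ast.walk(func):
            # Only calls by bare name (e.g. helper()); method calls are ignored.
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                calls.add((func.name, node.func.id))
    return nodes, calls

def find_callers(calls, target):
    """Stand-in for the query step: who calls `target`?"""
    return sorted(caller for caller, callee in calls if callee == target)

nodes, calls = build_graph(SOURCE)
print(find_callers(calls, "helper"))  # ['main', 'wrapper']
```

The point of the sketch is the shape of the data: once calls are stored as edges, "who calls this function?" becomes a lookup over relationships rather than a text search.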

Key Components

  • MCPServer: The main JSON-RPC server that listens for requests from the AI assistant and orchestrates all operations.
  • GraphBuilder: The engine responsible for parsing code (TreeSitterParser) and building the Neo4j graph.
  • CodeFinder: The query engine. It contains the logic for translating tool requests (which the AI assistant derives from natural language questions) into specific Cypher queries to find code and analyze relationships.
  • CodeWatcher: The live-update mechanism. It runs a watchdog observer in a separate thread to detect file changes and trigger incremental updates to the graph.
  • DatabaseManager: A thread-safe singleton that manages the connection pool to the Neo4j database.
  • JobManager: Manages long-running background tasks, primarily code indexing, allowing you to check their status.
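To make "thread-safe singleton" concrete, here is a minimal sketch of the pattern using double-checked locking. It is an assumption about the general technique, not a copy of the project's DatabaseManager: the class name `DatabaseManagerSketch`, the `get_driver` method, and the `fake_factory` placeholder are all hypothetical. In a real setup the factory would presumably create a Neo4j driver.

```python
import threading

class DatabaseManagerSketch:
    """Hypothetical thread-safe singleton that lazily creates one shared driver."""

    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        # Double-checked locking: the lock is only taken on first construction.
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    instance = super().__new__(cls)
                    instance._driver = None
                    instance._driver_lock = threading.Lock()
                    cls._instance = instance
        return cls._instance

    def get_driver(self, factory):
        """Create the driver on first use; later calls reuse the same object."""
        with self._driver_lock:
            if self._driver is None:
                self._driver = factory()
            return self._driver

created = []

def fake_factory():
    # Placeholder for a real connection factory; records how often it runs.
    created.append(object())
    return created[-1]

def worker(out):
    out.append(DatabaseManagerSketch().get_driver(fake_factory))

results = []
threads = [threading.Thread(target=worker, args=(results,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(created))  # 1: every thread shared the same driver
```

Sharing one manager matters here because the file watcher, background indexing jobs, and query tools may all touch the database from different threads.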

The Neo4j Schema

The power of CodeGraphContext comes from its graph model. Here are the core elements:

  • Nodes:
    • Repository: The root of your project.
    • Directory: A folder in your project.
    • File: A source code file.
    • Class: A class definition.
    • Function: A function or method definition.
    • Module: An imported module.
    • Variable: A variable assignment.
  • Relationships:
    • CONTAINS: Shows how elements are nested (e.g., File -[:CONTAINS]-> Function).
    • CALLS: Connects a function to the functions it calls.
    • IMPORTS: Connects a file to the modules it imports.
    • INHERITS: Connects a child class to its parent class.
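As an illustration of how this schema might be queried, the sketch below builds two parameterized Cypher queries as Python strings. The node labels and relationship types come from the list above; the `name` property and the helper names `find_callers_query` and `ancestors_query` are assumptions made for the example and may not match the actual schema or CodeFinder code.

```python
def find_callers_query(function_name):
    """Cypher for: which functions call `function_name`?

    Assumes Function nodes carry a `name` property (hypothetical).
    """
    cypher = (
        "MATCH (caller:Function)-[:CALLS]->(callee:Function {name: $name}) "
        "RETURN caller.name AS caller"
    )
    return cypher, {"name": function_name}

def ancestors_query(class_name):
    """Cypher for: all direct and indirect parents of `class_name`.

    The `*` makes INHERITS a variable-length pattern, so the whole
    inheritance chain is followed, not just one hop.
    """
    cypher = (
        "MATCH (child:Class {name: $name})-[:INHERITS*]->(parent:Class) "
        "RETURN parent.name AS ancestor"
    )
    return cypher, {"name": class_name}

query, params = find_callers_query("helper")
print(query)
print(params)
```

Passing the target as a `$name` parameter rather than formatting it into the string keeps user-supplied identifiers out of the query text.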

📖 Cookbook Highlights

You can interact with the code graph using natural language. Here are a few examples of what's possible (see cookbook.md for a full list).

| Natural Language Query | Tool Call |
|---|---|
| "Find all calls to the helper function." | analyze_code_relationships(query_type="find_callers", target="helper") |
| "Show me the class hierarchy for BaseController." | analyze_code_relationships(query_type="class_hierarchy", target="BaseController") |
| "Find the 5 most complex functions." | find_most_complex_functions(limit=5) |
| "What is the call chain from wrapper to helper?" | analyze_code_relationships(query_type="call_chain", target="wrapper->helper") |
| "Find all dead code." | find_dead_code() |

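On the wire, a tool call such as those in the cookbook examples would typically travel as an MCP `tools/call` JSON-RPC request. The sketch below shows what such a payload could look like; the envelope follows the MCP convention of `params.name` and `params.arguments`, while the helper `tools_call_request` is hypothetical and the argument names are taken from the cookbook examples.

```python
import json
from itertools import count

_ids = count(1)  # simple incrementing JSON-RPC request ids

def tools_call_request(tool_name, **arguments):
    """Build a JSON-RPC 2.0 request that invokes an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

request = tools_call_request(
    "analyze_code_relationships",
    query_type="find_callers",
    target="helper",
)
print(json.dumps(request, indent=2))
```

In practice the AI assistant's MCP client builds this request for you; the sketch is only meant to show how a natural language question ends up as a structured call.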
🤝 Contributing

We welcome contributions! The project is built with modern Python tools and is designed to be extensible.

  • Setup: pip install -e ".[dev]"
  • Testing: Tests are written with pytest. Simply run pytest in the root directory. See CONTRIBUTING.md for more details, including how to skip re-indexing for faster test runs.
  • Debugging: To see detailed logs, set debug_mode = 1 in src/codegraphcontext/tools/graph_builder.py.

Check out the CONTRIBUTING.md file for full guidelines.