depgraph

Design an implementation plan for adding realtime graph updates to the depgraph project via external CSV files.

Current Architecture

The depgraph project is a dependency graph visualizer. Here’s the current data flow:

Source Code (HTML) → extractJS() → analyzeCode() via Acorn AST → {functions: Map, globals: Map}
  → clusterFunctions() + computeAffinities() → buildGraph() → computeLayout() → renderGraph() (D3 SVG)

Key files:

What the user wants

  1. External CSV files: Move node and edge data to runtime/nodes.csv and runtime/edges.csv (no CSV headers)
  2. File watchers in Node.js server: Watch for changes in src and codemap files. When those change, recompute runtime/nodes.csv and runtime/edges.csv.
  3. Realtime push to frontend: The server should push updates to the frontend when CSV files change (SSE already exists for focus events).
  4. Design document: Create documentation alongside codegen.md.
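Because the CSV files carry no header row, the column order has to be agreed on by every reader and writer. A minimal sketch of reading such files follows. The column layouts (id,kind,label for nodes and source,target,kind for edges) and the function names are assumptions for illustration only; the plan has not yet fixed a format.

```javascript
// Hypothetical column layouts (NOT specified in the plan):
//   nodes.csv: id,kind,label
//   edges.csv: source,target,kind
const NODE_COLUMNS = ["id", "kind", "label"];
const EDGE_COLUMNS = ["source", "target", "kind"];

// Split one CSV line into fields, honoring double-quoted fields
// and "" as an escaped quote inside a quoted field.
function splitCsvLine(line) {
  const fields = [];
  let field = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        field += '"';
        i++; // skip the second quote of the escaped pair
      } else if (ch === '"') {
        inQuotes = false;
      } else {
        field += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      fields.push(field);
      field = "";
    } else {
      field += ch;
    }
  }
  fields.push(field);
  return fields;
}

// Turn headerless CSV text into row objects using a fixed column list.
// Blank lines are skipped; missing trailing fields become "".
function parseHeaderlessCsv(text, columns) {
  return text
    .split(/\r?\n/)
    .filter((line) => line.trim() !== "")
    .map((line) => {
      const values = splitCsvLine(line);
      const row = {};
      columns.forEach((name, i) => {
        row[name] = values[i] ?? "";
      });
      return row;
    });
}

const nodes = parseHeaderlessCsv(
  'renderGraph,function,"renderGraph()"\nstate,global,state\n',
  NODE_COLUMNS
);
const edges = parseHeaderlessCsv("renderGraph,state,reads\n", EDGE_COLUMNS);
```

This sketch does not handle quoted fields that span multiple lines. If node labels can contain newlines, a full CSV parser would be the safer choice.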

Coupling Analysis: How tangled is the code to “functions”/“vars”/“globals”?

HEAVILY tangled. Here are the coupling points:

AST Analysis (index.html lines 750-792)

Edge Layers (index.html lines 1091-1211)

Six hardcoded edge layers, all assuming function properties:

buildGraph() (index.html lines 1387-1507)

Clustering (index.html lines 928-980)

Affinities (index.html lines 1020-1081)

Main Loop (index.html lines 4662-4690)

Server (depgraph-server.mjs)

Design Requirements

  1. CSV format for nodes.csv and edges.csv (no headers)
  2. Server watches src and codemap files, regenerates CSVs when they change
  3. Server watches CSVs and pushes updates to frontend via SSE
  4. Frontend can consume CSV data as an alternative to (or replacement for) AST analysis
  5. The user is curious about how much refactoring is needed given the tight coupling
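Requirements 2 and 3 could be wired together roughly as sketched below: file-change events are debounced (editors and watchers often fire several events per save), and each update is serialized as one Server-Sent Events message. The event name "graph-update", the payload shape, and all function names here are hypothetical; the existing SSE code in depgraph-server.mjs may already provide equivalents.

```javascript
// Format one Server-Sent Events message. On the wire, a message is a set of
// "field: value" lines terminated by a blank line.
function formatSseEvent(eventName, payload) {
  // JSON.stringify never emits raw newlines, so one data line is enough.
  const data = JSON.stringify(payload);
  return `event: ${eventName}\ndata: ${data}\n\n`;
}

// Collapse a burst of calls into a single call after `delayMs` of quiet.
function debounce(fn, delayMs) {
  let timer = null;
  return (...args) => {
    if (timer !== null) clearTimeout(timer);
    timer = setTimeout(() => {
      timer = null;
      fn(...args);
    }, delayMs);
  };
}

// Send one event to every connected client. `clients` is assumed to be a
// Set of objects with a write(string) method, such as open HTTP responses.
function broadcast(clients, eventName, payload) {
  const message = formatSseEvent(eventName, payload);
  for (const client of clients) {
    client.write(message);
  }
}
```

On the server, a debounced callback can be passed as the listener to Node's `fs.watch`, for example `fs.watch("runtime/nodes.csv", debounce(onCsvChanged, 100))`, where `onCsvChanged` is a hypothetical handler that rereads the file and calls `broadcast`.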

Key Questions to Address in the Plan

  1. CSV format: What columns for nodes.csv and edges.csv?
  2. Who computes edges? AST analysis currently happens in the browser (Acorn is loaded client-side). It must move to Node.js and generate the nodes and edges there. The CSVs can be populated by many different sources.
    • Many tools should be able to push nodes and edges, similar to how codegen.py does it.
    • Browser no longer does AST analysis but only combines and reads the nodes/edges.
  3. Decoupling strategy: How should buildGraph() work with CSV data instead of AST data? Should buildGraph() become its own process?
  4. What gets into the CSVs vs stays computed client-side: Clustering and affinities are currently computed from the codemap + AST together. Where does that boundary move?
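One possible shape for the "combine and read" step, where several tools contribute nodes and edges, is sketched below. It assumes each source yields row objects with the hypothetical fields id, source, target, and kind. Merging by id, dropping duplicate edges, and dropping edges that point at unknown nodes are illustrative policy choices, not decisions made in this plan.

```javascript
// Merge node and edge rows from several sources into one graph.
// Each source is assumed to look like { nodes: [...], edges: [...] }.
function mergeGraphSources(sources) {
  // Nodes with the same id are merged; fields from later sources win.
  const nodesById = new Map();
  for (const { nodes } of sources) {
    for (const node of nodes) {
      nodesById.set(node.id, { ...nodesById.get(node.id), ...node });
    }
  }

  const seen = new Set();
  const edges = [];
  for (const { edges: sourceEdges } of sources) {
    for (const edge of sourceEdges) {
      const key = JSON.stringify([edge.source, edge.target, edge.kind]);
      if (seen.has(key)) continue; // duplicate edge from another source
      // Skip edges whose endpoints no source has declared as a node.
      if (!nodesById.has(edge.source) || !nodesById.has(edge.target)) continue;
      seen.add(key);
      edges.push(edge);
    }
  }
  return { nodes: [...nodesById.values()], edges };
}

// Example: an AST-derived source plus a second, hypothetical tool.
const astSource = {
  nodes: [
    { id: "a", kind: "function" },
    { id: "b", kind: "function" },
  ],
  edges: [{ source: "a", target: "b", kind: "calls" }],
};
const toolSource = {
  nodes: [{ id: "b", kind: "function", label: "b()" }],
  edges: [
    { source: "a", target: "b", kind: "calls" },
    { source: "a", target: "missing", kind: "calls" },
  ],
};
const merged = mergeGraphSources([astSource, toolSource]);
```

With a merged structure like this, the layout and rendering steps would depend only on generic nodes and edges, which is one way to approach the decoupling question above.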

Please design a detailed implementation plan covering: