npm - @optave/codegraph - Versions diffs - 2.5.1 → 3.0.0 - Mend

@optave/codegraph 2.5.1 → 3.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (41)

package/README.md +216 -89
package/package.json +8 -7
package/src/ast.js +392 -0
package/src/audit.js +423 -0
package/src/batch.js +180 -0
package/src/boundaries.js +346 -0
package/src/builder.js +375 -92
package/src/cfg.js +1451 -0
package/src/change-journal.js +130 -0
package/src/check.js +432 -0
package/src/cli.js +734 -107
package/src/cochange.js +5 -2
package/src/communities.js +7 -1
package/src/complexity.js +124 -17
package/src/config.js +10 -0
package/src/dataflow.js +1187 -0
package/src/db.js +96 -0
package/src/embedder.js +359 -47
package/src/export.js +305 -0
package/src/extractors/csharp.js +64 -1
package/src/extractors/go.js +66 -1
package/src/extractors/hcl.js +22 -0
package/src/extractors/java.js +61 -1
package/src/extractors/javascript.js +142 -0
package/src/extractors/php.js +79 -0
package/src/extractors/python.js +134 -0
package/src/extractors/ruby.js +89 -0
package/src/extractors/rust.js +71 -1
package/src/flow.js +4 -4
package/src/index.js +78 -3
package/src/manifesto.js +69 -1
package/src/mcp.js +702 -193
package/src/owners.js +359 -0
package/src/paginate.js +37 -2
package/src/parser.js +8 -0
package/src/queries.js +590 -50
package/src/snapshot.js +149 -0
package/src/structure.js +9 -3
package/src/triage.js +273 -0
package/src/viewer.js +948 -0
package/src/watcher.js +36 -1

package/README.md CHANGED

@@ -31,19 +31,24 @@
 ## The Problem
-AI coding assistants are incredible — until your codebase gets big enough. Then they get lost.
+Large codebases are opaque. The structure lives in people's heads, not in tools.
-On a large codebase, a great portion of your AI budget isn't going toward solving tasks. It's going toward the AI re-orienting itself in your code. Every session. Over and over. It burns tokens on tool calls — `grep`, `find`, `cat` — just to figure out what calls what. It loses context. It hallucinates dependencies. It modifies a function without realizing 14 callers across 9 files depend on it.
+A developer inherits a project and spends days grepping to understand what calls what. An AI agent burns half its token budget on `grep`, `find`, `cat` — re-discovering the same structure every session. An architect draws boundary rules on a whiteboard that erode within weeks because nothing enforces them. A CI pipeline catches test failures but can't tell you _"this change silently affects 14 callers across 9 files."_
-When the AI catches these mistakes, you waste time and tokens on corrections. When it doesn't catch them, your codebase starts degrading with silent bugs until things stop working.
-And when you hit `/clear` or run out of context? It starts from scratch.
+The information exists — it's in the code itself. But without a structured map, everyone is navigating blind: developers guess, AI agents hallucinate, and architecture degrades one unreviewed change at a time.
 ## What Codegraph Does
-Codegraph gives your AI a pre-built, always-current map of your entire codebase — every function, every caller, every dependency — so it stops guessing and starts knowing.
+Codegraph builds a function-level dependency graph of your entire codebase — every function, every caller, every dependency — and keeps it current with sub-second incremental rebuilds.
+It parses your code with [tree-sitter](https://tree-sitter.github.io/) (native Rust or WASM), stores the graph in SQLite, and gives you multiple ways to consume it:
+- **CLI** — developers explore, query, and audit their code from the terminal
+- **MCP server** — AI agents query the graph directly through 30 tools
+- **CI gates** — `check` and `manifesto` commands enforce quality thresholds with exit codes
+- **Programmatic API** — embed codegraph in your own tools via `npm install`
-It parses your code with [tree-sitter](https://tree-sitter.github.io/) (native Rust or WASM), builds a function-level dependency graph in SQLite, and keeps it current with sub-second incremental rebuilds. Your AI gets answers like _"this function has 14 callers across 9 files"_ instantly, instead of spending 30 tool calls to maybe discover half of them.
+Instead of 30 tool calls to maybe discover half your dependencies, you get _"this function has 14 callers across 9 files"_ instantly. Instead of hoping architecture rules are followed, you enforce them. Instead of finding breakage in production, `diff-impact --staged` catches it before you commit.
 **Free. Open source. Fully local.** Zero network calls, zero telemetry. Your code stays on your machine. When you want deeper intelligence, bring your own LLM provider — your code only goes where you choose to send it.
@@ -55,39 +60,54 @@ cd your-project
 codegraph build
 ```
-That's it. No config files, no Docker, no JVM, no API keys, no accounts. The graph is ready to query. Add `codegraph mcp` to your AI agent's config and it has full access to your dependency graph through 24 MCP tools (25 in multi-repo mode).
+That's it. No config files, no Docker, no JVM, no API keys, no accounts. The graph is ready to query.
 ### Why it matters
-| Without codegraph | With codegraph |
-|---|---|
-| AI spends 20+ tool calls per session re-discovering your code structure | AI gets full dependency context in one call |
-| Modifies `parseConfig()` without knowing 9 files import it | `fn-impact parseConfig` shows every caller before the edit |
-| Hallucinates that `auth.js` imports from `db.js` | `deps src/auth.js` shows the real import graph |
-| After `/clear`, starts from scratch | Graph persists — next session picks up where this one left off |
-| Suggests renaming a function, breaks 14 call sites silently | `diff-impact --staged` catches the breakage before you commit |
+| | Without codegraph | With codegraph |
+|---|---|---|
+| **AI agents** | Spend 20+ tool calls per session re-discovering code structure | Get full dependency context in one MCP call |
+| **AI agents** | Modify `parseConfig()` without knowing 9 files import it | `fn-impact parseConfig` shows every caller before the edit |
+| **Developers** | Inherit a codebase and grep for hours to understand what calls what | `context handleAuth -T` gives source, deps, callers, and tests in one command |
+| **Developers** | Rename a function, break 14 call sites silently | `diff-impact --staged` catches breakage before you commit |
+| **CI pipelines** | Catch test failures but miss structural degradation | `check --staged` fails the build when blast radius or complexity thresholds are exceeded |
+| **Architects** | Draw boundary rules that erode within weeks | `manifesto` and `boundaries` enforce architecture rules on every commit |
 ### Feature comparison
-<sub>Comparison last verified: February 2026</sub>
+<sub>Comparison last verified: March 2026. Full analysis: <a href="generated/competitive/COMPETITIVE_ANALYSIS.md">COMPETITIVE_ANALYSIS.md</a></sub>
 | Capability | codegraph | [joern](https://github.com/joernio/joern) | [narsil-mcp](https://github.com/postrv/narsil-mcp) | [code-graph-rag](https://github.com/vitali87/code-graph-rag) | [cpg](https://github.com/Fraunhofer-AISEC/cpg) | [GitNexus](https://github.com/abhigyanpatwari/GitNexus) | [CodeMCP](https://github.com/SimplyLiz/CodeMCP) | [axon](https://github.com/harshkedia177/axon) |
 |---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
 | Function-level analysis | **Yes** | **Yes** | **Yes** | **Yes** | **Yes** | **Yes** | **Yes** | **Yes** |
-| Multi-language | **11** | **14** | **32** | Multi | **~10** | **9** | SCIP langs | Few |
-| Semantic search | **Yes** | — | **Yes** | **Yes** | — | **Yes** | — | — |
-| MCP / AI agent support | **Yes** | — | **Yes** | **Yes** | **Yes** | **Yes** | **Yes** | — |
-| Git diff impact | **Yes** | — | — | — | — | **Yes** | — | **Yes** |
-| Git co-change analysis | **Yes** | — | — | — | — | — | **Yes** | **Yes** |
-| Watch mode | **Yes** | — | **Yes** | — | — | — | — | — |
-| Dead code / role classification | **Yes** | — | **Yes** | — | — | — | — | **Yes** |
-| Cycle detection | **Yes** | — | **Yes** | — | — | — | — | **Yes** |
-| Incremental rebuilds | **O(changed)** | — | O(n) Merkle | — | — | — | — | — |
-| Zero config | **Yes** | — | **Yes** | — | — | — | — | — |
+| Multi-language | **11** | **14** | **32** | **11** | **~10** | **12** | **12** | **3** |
+| Semantic search | **Yes** | — | **Yes** | **Yes** | — | **Yes** | — | **Yes** |
+| Hybrid BM25 + semantic | **Yes** | — | — | — | — | **Yes** | — | **Yes** |
+| CODEOWNERS integration | **Yes** | — | — | — | — | — | — | — |
+| Architecture boundary rules | **Yes** | — | — | — | — | — | — | — |
+| CI validation predicates | **Yes** | — | — | — | — | — | — | — |
+| Composite audit command | **Yes** | — | — | — | — | — | — | — |
+| Batch querying | **Yes** | — | — | — | — | — | — | — |
+| Graph snapshots | **Yes** | — | — | — | — | — | — | — |
+| MCP / AI agent support | **Yes** | — | **Yes** | **Yes** | **Yes** | **Yes** | **Yes** | **Yes** |
+| Git diff impact | **Yes** | — | — | — | — | **Yes** | **Yes** | **Yes** |
+| Branch structural diff | **Yes** | — | — | — | — | — | — | **Yes** |
+| Git co-change analysis | **Yes** | — | — | — | — | — | — | **Yes** |
+| Watch mode | **Yes** | — | **Yes** | **Yes** | — | — | **Yes** | **Yes** |
+| Dead code / role classification | **Yes** | — | **Yes** | — | — | — | **Yes** | **Yes** |
+| Cycle detection | **Yes** | — | — | — | — | — | — | — |
+| Incremental rebuilds | **O(changed)** | — | O(n) Merkle | — | — | — | Go only | **Yes** |
+| Zero config | **Yes** | — | **Yes** | — | — | **Yes** | — | **Yes** |
 | Embeddable JS library (`npm install`) | **Yes** | — | — | — | — | — | — | — |
-| LLM-optional (works without API keys) | **Yes** | **Yes** | **Yes** | — | **Yes** | **Yes** | **Yes** | **Yes** |
-| Commercial use allowed | **Yes** | **Yes** | **Yes** | **Yes** | **Yes** | — | — | — |
-| Open source | **Yes** | Yes | Yes | Yes | Yes | Yes | Custom | — |
+| LLM-optional (works without API keys) | **Yes** | **Yes** | **Yes** | **Yes** | **Yes** | **Yes** | **Yes** | **Yes** |
+| Dataflow analysis | **Yes** | **Yes** | — | — | **Yes** | — | — | — |
+| Control flow graph (CFG) | **Yes** | **Yes** | — | — | **Yes** | — | — | — |
+| AST node querying | **Yes** | **Yes** | — | — | **Yes** | — | — | — |
+| Expanded node/edge types | **Yes** | **Yes** | — | — | **Yes** | — | — | — |
+| GraphML / Neo4j export | **Yes** | **Yes** | — | — | — | — | — | — |
+| Interactive graph viewer | **Yes** | — | — | — | — | — | — | — |
+| Commercial use allowed | **Yes** | **Yes** | **Yes** | **Yes** | **Yes** | No | Paid | **Yes** |
+| Open source | **Yes** | Yes | Yes | Yes | Yes | No | No | Yes |
 ### What makes codegraph different
@@ -97,10 +117,11 @@ That's it. No config files, no Docker, no JVM, no API keys, no accounts. The gra
 | **🔓** | **Zero-cost core, LLM-enhanced when you want** | Full graph analysis with no API keys, no accounts, no cost. Optionally bring your own LLM provider — your code only goes where you choose |
 | **🔬** | **Function-level, not just files** | Traces `handleAuth()` → `validateToken()` → `decryptJWT()` and shows 14 callers across 9 files break if `decryptJWT` changes |
 | **🏷️** | **Role classification** | Every symbol auto-tagged as `entry`/`core`/`utility`/`adapter`/`dead`/`leaf` — agents instantly know what they're looking at |
-| **🤖** | **Built for AI agents** | 24-tool [MCP server](https://modelcontextprotocol.io/) — AI assistants query your graph directly. Single-repo by default |
+| **🤖** | **Built for AI agents** | 30-tool [MCP server](https://modelcontextprotocol.io/) — AI assistants query your graph directly. Single-repo by default |
 | **🌐** | **Multi-language, one CLI** | JS/TS + Python + Go + Rust + Java + C# + PHP + Ruby + HCL in a single graph |
 | **💥** | **Git diff impact** | `codegraph diff-impact` shows changed functions, their callers, and full blast radius — enriched with historically coupled files from git co-change analysis. Ships with a GitHub Actions workflow |
-| **🧠** | **Semantic search** | Local embeddings by default, LLM-powered when opted in — multi-query with RRF ranking via `"auth; token; JWT"` |
+| **🧠** | **Hybrid search** | BM25 keyword + semantic embeddings fused via RRF — `hybrid` (default), `semantic`, or `keyword` mode; multi-query via `"auth; token; JWT"` |
+| **🔬** | **Dataflow + CFG** | Track how data flows through functions (`flows_to`, `returns`, `mutates`) and visualize intraprocedural control flow graphs for all 11 languages |
 ---
@@ -127,6 +148,8 @@ git clone https://github.com/optave/codegraph.git
 cd codegraph && npm install && npm link
 ```
+> **Dev builds:** Pre-release tarballs are attached to [GitHub Releases](https://github.com/optave/codegraph/releases). Install with `npm install -g <path-to-tarball>`. Note that `npm install -g <tarball-url>` does not work because npm cannot resolve optional platform-specific dependencies from a URL — download the `.tgz` first, then install from the local file.
 ### For AI agents
 Add codegraph to your agent's instructions (e.g. `CLAUDE.md`):
@@ -144,7 +167,7 @@ After modifying code:
 Or connect directly via MCP:
 ```bash
-codegraph mcp          # 24-tool MCP server — AI queries the graph directly
+codegraph mcp          # 30-tool MCP server — AI queries the graph directly
 ```
 Full agent setup: [AI Agent Guide](docs/guides/ai-agent-guide.md) &middot; [CLAUDE.md template](docs/guides/ai-agent-guide.md#claudemd-template)
@@ -159,7 +182,7 @@ Full agent setup: [AI Agent Guide](docs/guides/ai-agent-guide.md) &middot; [CLAU
 | 📁 | **File dependencies** | See what a file imports and what imports it |
 | 💥 | **Impact analysis** | Trace every file affected by a change (transitive) |
 | 🧬 | **Function-level tracing** | Call chains, caller trees, function-level impact, and A→B pathfinding with qualified call resolution |
-| 🎯 | **Deep context** | `context` gives AI agents source, deps, callers, signature, and tests for a function in one call; `explain` gives structural summaries of files or functions |
+| 🎯 | **Deep context** | `context` gives AI agents source, deps, callers, signature, and tests for a function in one call; `audit --quick` gives structural summaries of files or functions |
 | 📍 | **Fast lookup** | `where` shows exactly where a symbol is defined and used — minimal, fast |
 | 📊 | **Diff impact** | Parse `git diff`, find overlapping functions, trace their callers |
 | 🔗 | **Co-change analysis** | Analyze git history for files that always change together — surfaces hidden coupling the static graph can't see; enriches `diff-impact` with historically coupled files |
@@ -167,14 +190,31 @@ Full agent setup: [AI Agent Guide](docs/guides/ai-agent-guide.md) &middot; [CLAU
 | 🏗️ | **Structure & hotspots** | Directory cohesion scores, fan-in/fan-out hotspot detection, module boundaries |
 | 🏷️ | **Node role classification** | Every symbol auto-tagged as `entry`/`core`/`utility`/`adapter`/`dead`/`leaf` based on connectivity patterns — agents instantly know architectural role |
 | 🔄 | **Cycle detection** | Find circular dependencies at file or function level |
-| 📤 | **Export** | DOT (Graphviz), Mermaid, and JSON graph export |
+| 📤 | **Export** | DOT, Mermaid, JSON, GraphML, GraphSON, and Neo4j CSV graph export |
 | 🧠 | **Semantic search** | Embeddings-powered natural language search with multi-query RRF ranking |
 | 👀 | **Watch mode** | Incrementally update the graph as files change |
-| 🤖 | **MCP server** | 24-tool MCP server for AI assistants; single-repo by default, opt-in multi-repo |
+| 🤖 | **MCP server** | 30-tool MCP server for AI assistants; single-repo by default, opt-in multi-repo |
 | ⚡ | **Always fresh** | Three-tier incremental detection — sub-second rebuilds even on large codebases |
 | 🧮 | **Complexity metrics** | Cognitive, cyclomatic, nesting depth, Halstead, and Maintainability Index per function |
 | 🏘️ | **Community detection** | Louvain clustering to discover natural module boundaries and architectural drift |
-| 📜 | **Manifesto rule engine** | Configurable pass/fail rules with warn/fail thresholds for CI gates (exit code 1 on fail) |
+| 📜 | **Manifesto rule engine** | Configurable pass/fail rules with warn/fail thresholds for CI gates via `check` (exit code 1 on fail) |
+| 👥 | **CODEOWNERS integration** | Map graph nodes to CODEOWNERS entries — see who owns each function, ownership boundaries in `diff-impact` |
+| 💾 | **Graph snapshots** | `snapshot save`/`restore` for instant DB backup and rollback — checkpoint before refactoring, restore without rebuilding |
+| 🔎 | **Hybrid BM25 + semantic search** | FTS5 keyword search + embedding-based semantic search fused via Reciprocal Rank Fusion — `hybrid`, `semantic`, or `keyword` modes |
+| 📄 | **Pagination & NDJSON streaming** | Universal `--limit`/`--offset` pagination on all MCP tools and CLI commands; `--ndjson` for newline-delimited JSON streaming |
+| 🔀 | **Branch structural diff** | Compare code structure between two git refs — added/removed/changed symbols with transitive caller impact |
+| 🛡️ | **Architecture boundaries** | User-defined dependency rules between modules with onion architecture preset — violations flagged in manifesto and CI |
+| ✅ | **CI validation predicates** | `check` command with configurable gates: complexity, blast radius, cycles, boundary violations — exit code 0/1 for CI |
+| 📋 | **Composite audit** | Single `audit` command combining explain + impact + health metrics per function — one call instead of 3-4 |
+| 🚦 | **Triage queue** | `triage` merges connectivity, hotspots, roles, and complexity into a ranked audit priority queue |
+| 📦 | **Batch querying** | Accept a list of targets and return all results in one JSON payload — enables multi-agent parallel dispatch |
+| 🔬 | **Dataflow analysis** | Track how data moves through functions with `flows_to`, `returns`, and `mutates` edges — opt-in via `build --dataflow` (JS/TS) |
+| 🧩 | **Control flow graph** | Intraprocedural CFG construction for all 11 languages — `cfg` command with text/DOT/Mermaid output, opt-in via `build --cfg` |
+| 🔎 | **AST node querying** | Stored queryable AST nodes (calls, `new`, string, regex, throw, await) — `ast` command with SQL GLOB pattern matching |
+| 🧬 | **Expanded node/edge types** | `parameter`, `property`, `constant` node kinds with `parent_id` for sub-declaration queries; `contains`, `parameter_of`, `receiver` edge kinds |
+| 📊 | **Exports analysis** | `exports <file>` shows all exported symbols with per-symbol consumers, re-export detection, and counts |
+| 📈 | **Interactive viewer** | `codegraph plot` generates an interactive HTML graph viewer with hierarchical/force/radial layouts, complexity overlays, and drill-down |
+| 🏷️ | **Stable JSON schema** | `normalizeSymbol` utility ensures consistent 7-field output (name, kind, file, line, endLine, role, fileHash) across all commands |
 See [docs/examples](docs/examples) for real-world CLI and MCP usage examples.
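The function-level impact rows in the table above (for example `fn-impact`) describe a reverse traversal over call edges. The following is a minimal sketch of that idea, not codegraph's implementation: the edge list, `buildCallerIndex`, and `transitiveCallers` are hypothetical names chosen for illustration, and the real tool reads its edges from the SQLite graph rather than an in-memory array.

```javascript
// Hypothetical call edges as [caller, callee] pairs (illustrative data only).
const callEdges = [
  ["handleAuth", "validateToken"],
  ["validateToken", "decryptJWT"],
  ["refreshSession", "validateToken"],
  ["logRequest", "formatLine"],
];

// Build a reverse index: callee -> set of direct callers.
function buildCallerIndex(edges) {
  const callers = new Map();
  for (const [caller, callee] of edges) {
    if (!callers.has(callee)) callers.set(callee, new Set());
    callers.get(callee).add(caller);
  }
  return callers;
}

// Breadth-first walk over reverse edges collects every function that
// reaches `target` through a chain of calls (its "blast radius").
function transitiveCallers(edges, target) {
  const callers = buildCallerIndex(edges);
  const seen = new Set([target]);
  const queue = [target];
  while (queue.length > 0) {
    const current = queue.shift();
    for (const caller of callers.get(current) ?? []) {
      if (!seen.has(caller)) {
        seen.add(caller);
        queue.push(caller);
      }
    }
  }
  seen.delete(target); // report callers only, not the target itself
  return [...seen].sort();
}

console.log(transitiveCallers(callEdges, "decryptJWT"));
// prints [ 'handleAuth', 'refreshSession', 'validateToken' ]
```

The `seen` set keeps the walk finite even when the call graph contains cycles.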
@@ -202,6 +242,8 @@ codegraph stats                # Graph health: nodes, edges, languages, quality
 codegraph roles                # Node role classification (entry, core, utility, adapter, dead, leaf)
 codegraph roles --role dead -T # Find dead code (unreferenced, non-exported symbols)
 codegraph roles --role core --file src/  # Core symbols in src/
+codegraph exports src/queries.js  # Per-symbol consumer analysis (who calls each export)
+codegraph children <name>         # List parameters, properties, constants of a symbol
 ```
 ### Deep Context (AI-Optimized)
@@ -209,24 +251,28 @@ codegraph roles --role core --file src/  # Core symbols in src/
 ```bash
 codegraph context <name>       # Full context: source, deps, callers, signature, tests
 codegraph context <name> --depth 2 --no-tests  # Include callee source 2 levels deep
-codegraph explain <file>       # Structural summary: public API, internals, data flow
-codegraph explain <function>   # Function summary: signature, calls, callers, tests
+codegraph audit <file> --quick    # Structural summary: public API, internals, data flow
+codegraph audit <function> --quick  # Function summary: signature, calls, callers, tests
 ```
 ### Impact Analysis
 ```bash
 codegraph impact <file>        # Transitive reverse dependency trace
-codegraph fn <name>            # Function-level: callers, callees, call chain
-codegraph fn <name> --no-tests --depth 5
+codegraph query <name>         # Function-level: callers, callees, call chain
+codegraph query <name> --no-tests --depth 5
 codegraph fn-impact <name>     # What functions break if this one changes
-codegraph path <from> <to>     # Shortest path between two symbols (A calls...calls B)
+codegraph path <from> <to>            # Shortest path between two symbols (A calls...calls B)
 codegraph path <from> <to> --reverse  # Follow edges backward
-codegraph path <from> <to> --max-depth 5 --kinds calls,imports
+codegraph path <from> <to> --depth 5 --kinds calls,imports
 codegraph diff-impact          # Impact of unstaged git changes
 codegraph diff-impact --staged # Impact of staged changes
 codegraph diff-impact HEAD~3   # Impact vs a specific ref
 codegraph diff-impact main --format mermaid -T  # Mermaid flowchart of blast radius
+codegraph branch-compare main feature-branch    # Structural diff between two refs
+codegraph branch-compare main HEAD --no-tests   # Symbols added/removed/changed vs main
+codegraph branch-compare v2.4.0 v2.5.0 --json   # JSON output for programmatic use
+codegraph branch-compare main HEAD --format mermaid  # Mermaid diagram of structural changes
 ```
 ### Co-Change Analysis
@@ -249,8 +295,8 @@ Co-change data also enriches `diff-impact` — historically coupled files appear
 ```bash
 codegraph structure            # Directory overview with cohesion scores
-codegraph hotspots             # Files with extreme fan-in, fan-out, or density
-codegraph hotspots --metric coupling --level directory --no-tests
+codegraph triage --level file  # Files with extreme fan-in, fan-out, or density
+codegraph triage --level directory --sort coupling --no-tests
 ```
 ### Code Health & Architecture
@@ -263,8 +309,79 @@ codegraph complexity --above-threshold -T  # Only functions exceeding warn thres
 codegraph communities             # Louvain community detection — natural module boundaries
 codegraph communities --drift -T  # Drift analysis only — split/merge candidates
 codegraph communities --functions # Function-level community detection
-codegraph manifesto               # Pass/fail rule engine (exit code 1 on fail)
-codegraph manifesto -T            # Exclude test files from rule evaluation
+codegraph check                   # Pass/fail rule engine (exit code 1 on fail)
+codegraph check -T                # Exclude test files from rule evaluation
+```
+### Dataflow, CFG & AST
+```bash
+codegraph dataflow <name>             # Data flow edges for a function (flows_to, returns, mutates)
+codegraph dataflow <name> --impact    # Transitive data-dependent blast radius
+codegraph cfg <name>                  # Control flow graph (text format)
+codegraph cfg <name> --format dot     # CFG as Graphviz DOT
+codegraph cfg <name> --format mermaid # CFG as Mermaid diagram
+codegraph ast                         # List all stored AST nodes
+codegraph ast "handleAuth"            # Search AST nodes by pattern (GLOB)
+codegraph ast -k call                 # Filter by kind: call, new, string, regex, throw, await
+codegraph ast -k throw --file src/    # Combine kind and file filters
+```
+> **Note:** Dataflow requires `codegraph build --dataflow` (JS/TS only). CFG requires `codegraph build --cfg`. Both are opt-in to keep default builds fast.
+### Audit, Triage & Batch
+Composite commands for risk-driven workflows and multi-agent dispatch.
+```bash
+codegraph audit <file-or-function>    # Combined structural summary + impact + health in one report
+codegraph audit <target> --quick      # Structural summary only (skip impact and health)
+codegraph audit src/queries.js -T     # Audit all functions in a file
+codegraph triage                      # Ranked audit priority queue (connectivity + hotspots + roles)
+codegraph triage -T --limit 20        # Top 20 riskiest functions, excluding tests
+codegraph triage --level file -T      # File-level hotspot analysis
+codegraph triage --level directory -T # Directory-level hotspot analysis
+codegraph batch target1 target2 ...   # Batch query multiple targets in one call
+codegraph batch --json targets.json   # Batch from a JSON file
+```
+### CI Validation
+`codegraph check` provides configurable pass/fail predicates for CI gates and state machines. Exit code 0 = pass, 1 = fail.
+```bash
+codegraph check                             # Run manifesto rules on whole codebase
+codegraph check --staged                    # Check staged changes (diff predicates)
+codegraph check --staged --rules            # Run both diff predicates AND manifesto rules
+codegraph check --no-new-cycles             # Fail if staged changes introduce cycles
+codegraph check --max-complexity 30         # Fail if any function exceeds complexity threshold
+codegraph check --max-blast-radius 50       # Fail if blast radius exceeds limit
+codegraph check --no-boundary-violations    # Fail on architecture boundary violations
+codegraph check main                        # Check current branch vs main
+```
+### CODEOWNERS
+Map graph symbols to CODEOWNERS entries. Shows who owns each function and surfaces ownership boundaries.
+```bash
+codegraph owners                   # Show ownership for all symbols
+codegraph owners src/queries.js    # Ownership for symbols in a specific file
+codegraph owners --boundary        # Show ownership boundaries between modules
+codegraph owners --owner @backend  # Filter by owner
+```
+Ownership data also enriches `diff-impact` — affected owners and suggested reviewers appear alongside the static dependency analysis.
+### Snapshots
+Lightweight SQLite DB backup and restore — checkpoint before refactoring, instantly rollback without rebuilding.
+```bash
+codegraph snapshot save before-refactor   # Save a named snapshot
+codegraph snapshot list                   # List all snapshots
+codegraph snapshot restore before-refactor  # Restore a snapshot
+codegraph snapshot delete before-refactor   # Delete a snapshot
 ```
 ### Export & Visualization
@@ -273,7 +390,11 @@ codegraph manifesto -T            # Exclude test files from rule evaluation
 codegraph export -f dot        # Graphviz DOT format
 codegraph export -f mermaid    # Mermaid diagram
 codegraph export -f json       # JSON graph
+codegraph export -f graphml    # GraphML (XML standard)
+codegraph export -f graphson   # GraphSON (TinkerPop v3 / Gremlin)
+codegraph export -f neo4j      # Neo4j CSV (bulk import, separate nodes/relationships files)
 codegraph export --functions -o graph.dot  # Function-level, write to file
+codegraph plot                 # Interactive HTML viewer with force/hierarchical/radial layouts
 codegraph cycles               # Detect circular dependencies
 codegraph cycles --functions   # Function-level cycles
 ```
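As a rough illustration of what detecting circular dependencies involves, the sketch below runs a depth-first search over a small import graph and returns the first cycle it finds. It is a generic textbook approach under assumed data, not the algorithm `codegraph cycles` actually uses; the `imports` object and `findCycle` are hypothetical.

```javascript
// Hypothetical file-level import graph: file -> files it imports.
const imports = {
  "a.js": ["b.js"],
  "b.js": ["c.js"],
  "c.js": ["a.js"],
  "d.js": ["a.js"],
};

// Depth-first search that tracks the current path. Reaching a node that is
// still on the path means the path from that node back to itself is a cycle.
function findCycle(graph) {
  const ON_PATH = 1;
  const DONE = 2;
  const state = new Map(); // missing key = not visited yet
  const path = [];

  function visit(node) {
    state.set(node, ON_PATH);
    path.push(node);
    for (const next of graph[node] ?? []) {
      if (state.get(next) === ON_PATH) {
        return [...path.slice(path.indexOf(next)), next];
      }
      if (!state.has(next)) {
        const cycle = visit(next);
        if (cycle) return cycle;
      }
    }
    path.pop();
    state.set(node, DONE);
    return null;
  }

  for (const node of Object.keys(graph)) {
    if (!state.has(node)) {
      const cycle = visit(node);
      if (cycle) return cycle;
    }
  }
  return null; // graph is acyclic
}

console.log(findCycle(imports));
// prints [ 'a.js', 'b.js', 'c.js', 'a.js' ]
```

The same traversal works at function level if the graph maps each function to the functions it calls.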
@@ -287,6 +408,9 @@ codegraph embed                # Build embeddings (default: nomic-v1.5)
 codegraph embed --model nomic  # Use a different model
 codegraph search "handle authentication"
 codegraph search "parse config" --min-score 0.4 -n 10
+codegraph search "parseConfig" --mode keyword   # BM25 keyword-only (exact names)
+codegraph search "auth flow" --mode semantic    # Embedding-only (conceptual)
+codegraph search "auth flow" --mode hybrid      # BM25 + semantic RRF fusion (default)
 codegraph models               # List available models
 ```
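The `hybrid` mode is described as fusing BM25 keyword results with embedding results via Reciprocal Rank Fusion (RRF). In the standard formulation, each ranked list contributes `1 / (k + rank)` to an item's score, which is presumably what the `--rrf-k` constant (default 60) controls. The sketch below shows that formula on made-up result lists; `rrfFuse` and the sample hits are hypothetical and do not reflect codegraph's internal code.

```javascript
// Reciprocal Rank Fusion: sum 1 / (k + rank) over every list an item appears in.
// `k` dampens the advantage of top ranks; 60 matches the documented default.
function rrfFuse(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical per-mode results, best match first.
const keywordHits = ["parseConfig", "loadConfig", "readFile"];
const semanticHits = ["loadConfig", "mergeDefaults", "parseConfig"];

const fused = rrfFuse([keywordHits, semanticHits]);
console.log(fused.map((r) => r.id));
// prints [ 'loadConfig', 'parseConfig', 'mergeDefaults', 'readFile' ]
```

Because RRF uses only ranks, it can combine BM25 scores and cosine similarities without normalizing them to a common scale.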
@@ -336,13 +460,17 @@ codegraph registry remove <name>  # Unregister
 | Flag | Description |
 |---|---|
 | `-d, --db <path>` | Custom path to `graph.db` |
-| `-T, --no-tests` | Exclude `.test.`, `.spec.`, `__test__` files (available on `fn`, `fn-impact`, `path`, `context`, `explain`, `where`, `diff-impact`, `search`, `map`, `hotspots`, `roles`, `co-change`, `deps`, `impact`, `complexity`, `communities`, `manifesto`) |
+| `-T, --no-tests` | Exclude `.test.`, `.spec.`, `__test__` files (available on most query commands including `query`, `fn-impact`, `path`, `context`, `where`, `diff-impact`, `search`, `map`, `roles`, `co-change`, `deps`, `impact`, `complexity`, `communities`, `branch-compare`, `audit`, `triage`, `check`, `dataflow`, `cfg`, `ast`, `exports`, `children`) |
 | `--depth <n>` | Transitive trace depth (default varies by command) |
 | `-j, --json` | Output as JSON |
 | `-v, --verbose` | Enable debug output |
 | `--engine <engine>` | Parser engine: `native`, `wasm`, or `auto` (default: `auto`) |
-| `-k, --kind <kind>` | Filter by kind: `function`, `method`, `class`, `struct`, `enum`, `trait`, `record`, `module` (`fn`, `context`, `search`) |
+| `-k, --kind <kind>` | Filter by kind: `function`, `method`, `class`, `interface`, `type`, `struct`, `enum`, `trait`, `record`, `module`, `parameter`, `property`, `constant` |
 | `-f, --file <path>` | Scope to a specific file (`fn`, `context`, `where`) |
+| `--mode <mode>` | Search mode: `hybrid` (default), `semantic`, or `keyword` (`search`) |
+| `--ndjson` | Output as newline-delimited JSON (one object per line) |
+| `--limit <n>` | Limit number of results |
+| `--offset <n>` | Skip first N results (pagination) |
 | `--rrf-k <n>` | RRF smoothing constant for multi-query search (default 60) |
 ## 🌐 Language Support
@@ -375,10 +503,11 @@ codegraph registry remove <name>  # Unregister
 ```
 1. **Parse** — tree-sitter parses every source file into an AST (native Rust engine or WASM fallback)
-2. **Extract** — Functions, classes, methods, interfaces, imports, exports, and call sites are extracted
+2. **Extract** — Functions, classes, methods, interfaces, imports, exports, call sites, parameters, properties, and constants are extracted
 3. **Resolve** — Imports are resolved to actual files (handles ESM conventions, `tsconfig.json` path aliases, `baseUrl`)
-4. **Store** — Everything goes into SQLite as nodes + edges with tree-sitter node boundaries
-5. **Query** — All queries run locally against the SQLite DB — typically under 100ms
+4. **Store** — Everything goes into SQLite as nodes + edges with tree-sitter node boundaries, plus structural edges (`contains`, `parameter_of`, `receiver`)
+5. **Analyze** (opt-in) — Complexity metrics, control flow graphs (`--cfg`), dataflow edges (`--dataflow`), and AST node storage
+6. **Query** — All queries run locally against the SQLite DB — typically under 100ms
 ### Incremental Rebuilds
@@ -419,18 +548,18 @@ Codegraph also extracts symbols from common callback patterns: Commander `.comma
 ## 📊 Performance
-Self-measured on every release via CI ([build benchmarks](generated/BUILD-BENCHMARKS.md) | [embedding benchmarks](generated/EMBEDDING-BENCHMARKS.md)):
+Self-measured on every release via CI ([build benchmarks](generated/benchmarks/BUILD-BENCHMARKS.md) | [embedding benchmarks](generated/benchmarks/EMBEDDING-BENCHMARKS.md)):
 | Metric | Latest |
 |---|---|
-| Build speed (native) | **2 ms/file** |
-| Build speed (WASM) | **8.4 ms/file** |
-| Query time | **2ms** |
+| Build speed (native) | **1.9 ms/file** |
+| Build speed (WASM) | **8.3 ms/file** |
+| Query time | **3ms** |
 | No-op rebuild (native) | **4ms** |
-| 1-file rebuild (native) | **97ms** |
-| Query: fn-deps | **2.1ms** |
-| Query: path | **1.2ms** |
-| ~50,000 files (est.) | **~100.0s build** |
+| 1-file rebuild (native) | **124ms** |
+| Query: fn-deps | **1.4ms** |
+| Query: path | **1.4ms** |
+| ~50,000 files (est.) | **~95.0s build** |
 Metrics are normalized per file for cross-version comparability. Times above are for a full initial build — incremental rebuilds only re-parse changed files.
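Reviewer note: the "~50,000 files (est.)" row appears to be a plain linear extrapolation of the native per-file build speed, and the numbers on both sides of the diff are consistent with that reading. The sketch below only redoes that arithmetic; the function name `estimateBuildSeconds` is hypothetical, and linear scaling is an assumption rather than something the README states.

```javascript
// Native build speeds quoted in the table, in milliseconds per file.
const msPerFileV251 = 2;   // 2.5.1 side of the diff
const msPerFileV300 = 1.9; // 3.0.0 side of the diff
const fileCount = 50000;

// Assumes build time scales linearly with the number of files.
function estimateBuildSeconds(msPerFile, files) {
  return ((msPerFile * files) / 1000).toFixed(1);
}

console.log(estimateBuildSeconds(msPerFileV251, fileCount)); // "100.0"
console.log(estimateBuildSeconds(msPerFileV300, fileCount)); // "95.0"
```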
@@ -452,7 +581,7 @@ Optional: `@huggingface/transformers` (semantic search), `@modelcontextprotocol/
 ### MCP Server
-Codegraph includes a built-in [Model Context Protocol](https://modelcontextprotocol.io/) server with 24 tools (25 in multi-repo mode), so AI assistants can query your dependency graph directly:
+Codegraph includes a built-in [Model Context Protocol](https://modelcontextprotocol.io/) server with 30 tools (31 in multi-repo mode), so AI assistants can query your dependency graph directly:
 ```bash
 codegraph mcp                  # Single-repo mode (default) — only local project
@@ -475,7 +604,7 @@ This project uses codegraph. The database is at `.codegraph/graph.db`.
 ### Before modifying code, always:
 1. `codegraph where <name>` — find where the symbol lives
-2. `codegraph explain <file-or-function>` — understand the structure
+2. `codegraph audit <file-or-function> --quick` — understand the structure
 3. `codegraph context <name> -T` — get full context (source, deps, callers)
 4. `codegraph fn-impact <name> -T` — check blast radius before editing
@@ -485,7 +614,7 @@ This project uses codegraph. The database is at `.codegraph/graph.db`.
 ### Other useful commands
 - `codegraph build .` — rebuild the graph (incremental by default)
 - `codegraph map` — module overview
-- `codegraph fn <name> -T` — function call chain
+- `codegraph query <name> -T` — function call chain (callers + callees)
 - `codegraph path <from> <to> -T` — shortest call path between two symbols
 - `codegraph deps <file>` — file-level dependencies
 - `codegraph roles --role dead -T` — find dead code (unreferenced symbols)
@@ -493,8 +622,23 @@ This project uses codegraph. The database is at `.codegraph/graph.db`.
 - `codegraph co-change <file>` — files that historically change together
 - `codegraph complexity -T` — per-function complexity metrics (cognitive, cyclomatic, MI)
 - `codegraph communities --drift -T` — module boundary drift analysis
-- `codegraph manifesto -T` — pass/fail rule check (CI gate, exit code 1 on fail)
-- `codegraph search "<query>"` — semantic search (requires `codegraph embed`)
+- `codegraph check -T` — pass/fail rule check (CI gate, exit code 1 on fail)
+- `codegraph audit <target> -T` — combined structural summary + impact + health in one report
+- `codegraph triage -T` — ranked audit priority queue
+- `codegraph triage --level file -T` — file-level hotspot analysis
+- `codegraph check --staged` — CI validation predicates (exit code 0/1)
+- `codegraph batch target1 target2` — batch query multiple targets at once
+- `codegraph owners [target]` — CODEOWNERS mapping for symbols
+- `codegraph snapshot save <name>` — checkpoint the graph DB before refactoring
+- `codegraph branch-compare main HEAD -T` — structural diff between two refs (added/removed/changed symbols)
+- `codegraph exports <file>` — per-symbol consumer analysis (who calls each export)
+- `codegraph children <name>` — list parameters, properties, constants of a symbol
+- `codegraph dataflow <name>` — data flow edges (flows_to, returns, mutates)
+- `codegraph cfg <name>` — intraprocedural control flow graph
+- `codegraph ast <pattern>` — search stored AST nodes (calls, new, string, regex, throw, await)
+- `codegraph plot` — interactive HTML dependency graph viewer
+- `codegraph search "<query>"` — hybrid search (requires `codegraph embed`)
+- `codegraph search "<query>" --mode keyword` — BM25 keyword search
 - `codegraph cycles` — check for circular dependencies
 ### Flags
@@ -576,7 +720,7 @@ Create a `.codegraphrc.json` in your project root to customize behavior:
 ### Manifesto rules
-Configure pass/fail thresholds for `codegraph manifesto`:
+Configure pass/fail thresholds for `codegraph check` (manifesto mode):
 ```json
 {
@@ -592,7 +736,7 @@ Configure pass/fail thresholds for `codegraph manifesto`:
 }
 ```
-When any function exceeds a `fail` threshold, `codegraph manifesto` exits with code 1 — perfect for CI gates.
+When any function exceeds a `fail` threshold, `codegraph check` exits with code 1 — perfect for CI gates.
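Reviewer note: since `codegraph manifesto` is replaced by `codegraph check` in this release, CI scripts that gate on the exit code need the new command name. The following is a minimal sketch of such a gate in Node.js. It assumes `codegraph` is on the `PATH`; the function name `checkGate` and the injectable `spawn` parameter are hypothetical and exist only to make the sketch testable. The command, the `-T` flag, and the exit-code behaviour are taken from the README text above.

```javascript
// Run `codegraph check -T` and report whether the gate passed.
// `spawn` is an optional stand-in for child_process.spawnSync (illustrative only).
async function checkGate(args = ['check', '-T'], spawn) {
  const run = spawn ?? (await import('node:child_process')).spawnSync;
  const result = run('codegraph', args, { stdio: 'inherit' });
  // status is 0 on pass, 1 when a `fail` threshold is exceeded,
  // and null if the process could not be started at all.
  return result.status === 0;
}

// Example with a stubbed runner that simulates a failing check.
checkGate(['check', '-T'], () => ({ status: 1 })).then((passed) => {
  console.log(passed ? 'gate passed' : 'gate failed'); // gate failed
});
```

In a real pipeline the caller would omit the stub and set `process.exitCode = 1` when the function resolves to `false`, so the CI job fails along with the check.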
 ### LLM credentials
@@ -616,13 +760,14 @@ Works with any secret manager: 1Password CLI (`op`), Bitwarden (`bw`), `pass`, H
 Codegraph also exports a full API for use in your own tools:
 ```js
-import { buildGraph, queryNameData, findCycles, exportDOT } from '@optave/codegraph';
+import { buildGraph, queryNameData, findCycles, exportDOT, normalizeSymbol } from '@optave/codegraph';
 // Build the graph
 buildGraph('/path/to/project');
 // Query programmatically
 const results = queryNameData('myFunction', '/path/to/.codegraph/graph.db');
+// All query results use normalizeSymbol for a stable 7-field schema
 ```
 ```js
@@ -659,25 +804,7 @@ const { results: fused } = await multiSearchData(
 - **No full type inference** — parses `.d.ts` interfaces but doesn't use TypeScript's type checker for overload resolution
 - **Dynamic calls are best-effort** — complex computed property access and `eval` patterns are not resolved
 - **Python imports** — resolves relative imports but doesn't follow `sys.path` or virtual environment packages
-## 🔍 How Codegraph Compares
-<sub>Last verified: February 2026. Full analysis: <a href="generated/COMPETITIVE_ANALYSIS.md">COMPETITIVE_ANALYSIS.md</a></sub>
-| Capability | codegraph | [joern](https://github.com/joernio/joern) | [narsil-mcp](https://github.com/postrv/narsil-mcp) | [code-graph-rag](https://github.com/vitali87/code-graph-rag) | [cpg](https://github.com/Fraunhofer-AISEC/cpg) | [GitNexus](https://github.com/abhigyanpatwari/GitNexus) |
-|---|:---:|:---:|:---:|:---:|:---:|:---:|
-| Function-level analysis | **Yes** | **Yes** | **Yes** | **Yes** | **Yes** | **Yes** |
-| Multi-language | **11** | **14** | **32** | Multi | **~10** | **9** |
-| Incremental rebuilds | **O(changed)** | — | O(n) Merkle | — | — | — |
-| MCP / AI agent support | **Yes** | — | **Yes** | **Yes** | **Yes** | **Yes** |
-| Git diff impact | **Yes** | — | — | — | — | **Yes** |
-| Git co-change analysis | **Yes** | — | — | — | — | — |
-| Dead code / role classification | **Yes** | — | **Yes** | — | — | — |
-| Semantic search | **Yes** | — | **Yes** | **Yes** | — | **Yes** |
-| Watch mode | **Yes** | — | **Yes** | — | — | — |
-| Zero config, no Docker/JVM | **Yes** | — | **Yes** | — | — | — |
-| Works without API keys | **Yes** | **Yes** | **Yes** | — | **Yes** | **Yes** |
-| Commercial use (Apache/MIT) | **Yes** | **Yes** | **Yes** | **Yes** | **Yes** | — |
+- **Dataflow analysis** — currently JS/TS only; intraprocedural (single-function scope), not interprocedural
 ## 🗺️ Roadmap
@@ -685,12 +812,12 @@ See **[ROADMAP.md](docs/roadmap/ROADMAP.md)** for the full development roadmap a
 1. ~~**Rust Core**~~ — **Complete** (v1.3.0) — native tree-sitter parsing via napi-rs, parallel multi-core parsing, incremental re-parsing, import resolution & cycle detection in Rust
 2. ~~**Foundation Hardening**~~ — **Complete** (v1.4.0) — parser registry, 12-tool MCP server with multi-repo support, test coverage 62%→75%, `apiKeyCommand` secret resolution, global repo registry
-3. **Architectural Refactoring** — parser plugin system, repository pattern, pipeline builder, engine strategy, domain errors, curated API
-4. **Intelligent Embeddings** — LLM-generated descriptions, hybrid search
+3. ~~**Deep Analysis**~~ — **Complete** (v3.0.0) — dataflow analysis (flows_to, returns, mutates), intraprocedural CFG for all 11 languages, stored AST nodes, expanded node/edge types (parameter, property, constant, contains, parameter_of, receiver), GraphML/GraphSON/Neo4j CSV export, interactive HTML viewer, CLI consolidation, stable JSON schema
+4. **Architectural Refactoring** — parser plugin system, repository pattern, pipeline builder, engine strategy, domain errors, curated API
 5. **Natural Language Queries** — `codegraph ask` command, conversational sessions
 6. **Expanded Language Support** — 8 new languages (12 → 20)
 7. **GitHub Integration & CI** — reusable GitHub Action, PR review, SARIF output
-8. **Visualization & Advanced** — web UI, monorepo support, agentic search
+8. **TypeScript Migration** — gradual migration from JS to TypeScript
 ## 🤝 Contributing

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@optave/codegraph",
-  "version": "2.5.1",
+  "version": "3.0.0",
   "description": "Local code graph CLI — parse codebases with tree-sitter, build dependency graphs, query them",
   "type": "module",
   "main": "src/index.js",
@@ -71,15 +71,16 @@
   },
   "optionalDependencies": {
     "@modelcontextprotocol/sdk": "^1.0.0",
-    "@optave/codegraph-darwin-arm64": "2.5.1",
-    "@optave/codegraph-darwin-x64": "2.5.1",
-    "@optave/codegraph-linux-x64-gnu": "2.5.1",
-    "@optave/codegraph-win32-x64-msvc": "2.5.1"
+    "@optave/codegraph-darwin-arm64": "3.0.0",
+    "@optave/codegraph-darwin-x64": "3.0.0",
+    "@optave/codegraph-linux-x64-gnu": "3.0.0",
+    "@optave/codegraph-win32-x64-msvc": "3.0.0"
   },
   "devDependencies": {
     "@biomejs/biome": "^2.4.4",
-    "@commitlint/cli": "^19.8",
-    "@commitlint/config-conventional": "^19.8",
+    "@commitlint/cli": "^20.4",
+    "@commitlint/config-conventional": "^20.0",
+    "@huggingface/transformers": "^3.8.1",
     "@tree-sitter-grammars/tree-sitter-hcl": "^1.2.0",
     "@vitest/coverage-v8": "^4.0.18",
     "commit-and-tag-version": "^12.5",