npm - @optave/codegraph - Versions diffs - 2.2.3-dev.44e8146 → 2.3.1-dev.1aeea34 - Mend

@optave/codegraph 2.2.3-dev.44e8146 → 2.3.1-dev.1aeea34

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (17)

package/README.md +55 -15
package/package.json +7 -6
package/src/builder.js +66 -0
package/src/cli.js +116 -13
package/src/cochange.js +498 -0
package/src/config.js +6 -0
package/src/db.js +40 -0
package/src/embedder.js +61 -2
package/src/export.js +158 -13
package/src/extractors/helpers.js +2 -1
package/src/extractors/javascript.js +294 -78
package/src/index.js +13 -0
package/src/mcp.js +62 -1
package/src/parser.js +39 -2
package/src/queries.js +158 -9
package/src/registry.js +9 -1
package/src/structure.js +94 -0

package/README.md CHANGED

@@ -55,7 +55,7 @@ cd your-project
 codegraph build
 ```
-That's it. No config files, no Docker, no JVM, no API keys, no accounts. The graph is ready to query. Add `codegraph mcp` to your AI agent's config and it has full access to your dependency graph through 17 MCP tools.
+That's it. No config files, no Docker, no JVM, no API keys, no accounts. The graph is ready to query. Add `codegraph mcp` to your AI agent's config and it has full access to your dependency graph through 19 MCP tools.
 ### Why it matters
@@ -78,7 +78,9 @@ That's it. No config files, no Docker, no JVM, no API keys, no accounts. The gra
 | Semantic search | **Yes** | — | **Yes** | **Yes** | — | **Yes** | — | — |
 | MCP / AI agent support | **Yes** | — | **Yes** | **Yes** | **Yes** | **Yes** | **Yes** | — |
 | Git diff impact | **Yes** | — | — | — | — | **Yes** | — | **Yes** |
+| Git co-change analysis | **Yes** | — | — | — | — | — | **Yes** | **Yes** |
 | Watch mode | **Yes** | — | **Yes** | — | — | — | — | — |
+| Dead code / role classification | **Yes** | — | **Yes** | — | — | — | — | **Yes** |
 | Cycle detection | **Yes** | — | **Yes** | — | — | — | — | **Yes** |
 | Incremental rebuilds | **O(changed)** | — | O(n) Merkle | — | — | — | — | — |
 | Zero config | **Yes** | — | **Yes** | — | — | — | — | — |
@@ -94,9 +96,10 @@ That's it. No config files, no Docker, no JVM, no API keys, no accounts. The gra
 | **⚡** | **Always-fresh graph** | Three-tier change detection: journal (O(changed)) → mtime+size (O(n) stats) → hash (O(changed) reads). Sub-second rebuilds even on large codebases |
 | **🔓** | **Zero-cost core, LLM-enhanced when you want** | Full graph analysis with no API keys, no accounts, no cost. Optionally bring your own LLM provider — your code only goes where you choose |
 | **🔬** | **Function-level, not just files** | Traces `handleAuth()` → `validateToken()` → `decryptJWT()` and shows 14 callers across 9 files break if `decryptJWT` changes |
-| **🤖** | **Built for AI agents** | 17-tool [MCP server](https://modelcontextprotocol.io/) — AI assistants query your graph directly. Single-repo by default |
+| **🏷️** | **Role classification** | Every symbol auto-tagged as `entry`/`core`/`utility`/`adapter`/`dead`/`leaf` — agents instantly know what they're looking at |
+| **🤖** | **Built for AI agents** | 19-tool [MCP server](https://modelcontextprotocol.io/) — AI assistants query your graph directly. Single-repo by default |
 | **🌐** | **Multi-language, one CLI** | JS/TS + Python + Go + Rust + Java + C# + PHP + Ruby + HCL in a single graph |
-| **💥** | **Git diff impact** | `codegraph diff-impact` shows changed functions, their callers, and full blast radius — ships with a GitHub Actions workflow |
+| **💥** | **Git diff impact** | `codegraph diff-impact` shows changed functions, their callers, and full blast radius — enriched with historically coupled files from git co-change analysis. Ships with a GitHub Actions workflow |
 | **🧠** | **Semantic search** | Local embeddings by default, LLM-powered when opted in — multi-query with RRF ranking via `"auth; token; JWT"` |
 ---
@@ -141,10 +144,10 @@ After modifying code:
 Or connect directly via MCP:
 ```bash
-codegraph mcp          # 17-tool MCP server — AI queries the graph directly
+codegraph mcp          # 19-tool MCP server — AI queries the graph directly
 ```
-Full agent setup: [AI Agent Guide](docs/ai-agent-guide.md) &middot; [CLAUDE.md template](docs/ai-agent-guide.md#claudemd-template)
+Full agent setup: [AI Agent Guide](docs/guides/ai-agent-guide.md) &middot; [CLAUDE.md template](docs/guides/ai-agent-guide.md#claudemd-template)
 ---
@@ -159,13 +162,15 @@ Full agent setup: [AI Agent Guide](docs/ai-agent-guide.md) &middot; [CLAUDE.md t
 | 🎯 | **Deep context** | `context` gives AI agents source, deps, callers, signature, and tests for a function in one call; `explain` gives structural summaries of files or functions |
 | 📍 | **Fast lookup** | `where` shows exactly where a symbol is defined and used — minimal, fast |
 | 📊 | **Diff impact** | Parse `git diff`, find overlapping functions, trace their callers |
+| 🔗 | **Co-change analysis** | Analyze git history for files that always change together — surfaces hidden coupling the static graph can't see; enriches `diff-impact` with historically coupled files |
 | 🗺️ | **Module map** | Bird's-eye view of your most-connected files |
 | 🏗️ | **Structure & hotspots** | Directory cohesion scores, fan-in/fan-out hotspot detection, module boundaries |
+| 🏷️ | **Node role classification** | Every symbol auto-tagged as `entry`/`core`/`utility`/`adapter`/`dead`/`leaf` based on connectivity patterns — agents instantly know architectural role |
 | 🔄 | **Cycle detection** | Find circular dependencies at file or function level |
 | 📤 | **Export** | DOT (Graphviz), Mermaid, and JSON graph export |
 | 🧠 | **Semantic search** | Embeddings-powered natural language search with multi-query RRF ranking |
 | 👀 | **Watch mode** | Incrementally update the graph as files change |
-| 🤖 | **MCP server** | 17-tool MCP server for AI assistants; single-repo by default, opt-in multi-repo |
+| 🤖 | **MCP server** | 19-tool MCP server for AI assistants; single-repo by default, opt-in multi-repo |
 | ⚡ | **Always fresh** | Three-tier incremental detection — sub-second rebuilds even on large codebases |
 ## 📦 Commands
@@ -189,6 +194,9 @@ codegraph map -n 50 --no-tests # Top 50, excluding test files
 codegraph where <name>         # Where is a symbol defined and used?
 codegraph where --file src/db.js  # List symbols, imports, exports for a file
 codegraph stats                # Graph health: nodes, edges, languages, quality score
+codegraph roles                # Node role classification (entry, core, utility, adapter, dead, leaf)
+codegraph roles --role dead -T # Find dead code (unreferenced, non-exported symbols)
+codegraph roles --role core --file src/  # Core symbols in src/
 ```
 ### Deep Context (AI-Optimized)
@@ -213,6 +221,22 @@ codegraph diff-impact HEAD~3   # Impact vs a specific ref
 codegraph diff-impact main --format mermaid -T  # Mermaid flowchart of blast radius
 ```
+### Co-Change Analysis
+Analyze git history to find files that always change together — surfaces hidden coupling the static graph can't see. Requires a git repository.
+```bash
+codegraph co-change --analyze          # Scan git history and populate co-change data
+codegraph co-change src/queries.js     # Show co-change partners for a file
+codegraph co-change                    # Show top co-changing file pairs globally
+codegraph co-change --since 6m         # Limit to last 6 months of history
+codegraph co-change --min-jaccard 0.5  # Only show strong coupling (Jaccard >= 0.5)
+codegraph co-change --min-support 5    # Minimum co-commit count
+codegraph co-change --full             # Include all details
+```
+Co-change data also enriches `diff-impact` — historically coupled files appear in a `historicallyCoupled` section alongside the static dependency analysis.
 ### Structure & Hotspots
 ```bash
@@ -373,22 +397,36 @@ Codegraph also extracts symbols from common callback patterns: Commander `.comma
 ## 📊 Performance
-Self-measured on every release via CI ([full history](generated/BENCHMARKS.md)):
+Self-measured on every release via CI ([build benchmarks](generated/BUILD-BENCHMARKS.md) | [embedding benchmarks](generated/EMBEDDING-BENCHMARKS.md)):
 | Metric | Latest |
 |---|---|
 | Build speed (native) | **1.9 ms/file** |
 | Build speed (WASM) | **6.6 ms/file** |
-| Query time | **1ms** |
+| Query time | **2ms** |
 | ~50,000 files (est.) | **~95.0s build** |
 Metrics are normalized per file for cross-version comparability. Times above are for a full initial build — incremental rebuilds only re-parse changed files.
+### Lightweight Footprint
+<a href="https://www.npmjs.com/package/@optave/codegraph"><img src="https://img.shields.io/npm/unpacked-size/@optave/codegraph?style=flat-square&label=unpacked%20size" alt="npm unpacked size" /></a>
+Only **3 runtime dependencies** — everything else is optional or a devDependency:
+| Dependency | What it does | | |
+|---|---|---|---|
+| [better-sqlite3](https://github.com/WiseLibs/better-sqlite3) | Fast, synchronous SQLite driver | ![GitHub stars](https://img.shields.io/github/stars/WiseLibs/better-sqlite3?style=flat-square&label=%E2%AD%90) | ![npm downloads](https://img.shields.io/npm/dw/better-sqlite3?style=flat-square&label=%F0%9F%93%A5%2Fwk) |
+| [commander](https://github.com/tj/commander.js) | CLI argument parsing | ![GitHub stars](https://img.shields.io/github/stars/tj/commander.js?style=flat-square&label=%E2%AD%90) | ![npm downloads](https://img.shields.io/npm/dw/commander?style=flat-square&label=%F0%9F%93%A5%2Fwk) |
+| [web-tree-sitter](https://github.com/tree-sitter/tree-sitter) | WASM tree-sitter bindings | ![GitHub stars](https://img.shields.io/github/stars/tree-sitter/tree-sitter?style=flat-square&label=%E2%AD%90) | ![npm downloads](https://img.shields.io/npm/dw/web-tree-sitter?style=flat-square&label=%F0%9F%93%A5%2Fwk) |
+Optional: `@huggingface/transformers` (semantic search), `@modelcontextprotocol/sdk` (MCP server) — lazy-loaded only when needed.
 ## 🤖 AI Agent Integration
 ### MCP Server
-Codegraph includes a built-in [Model Context Protocol](https://modelcontextprotocol.io/) server with 17 tools, so AI assistants can query your dependency graph directly:
+Codegraph includes a built-in [Model Context Protocol](https://modelcontextprotocol.io/) server with 19 tools, so AI assistants can query your dependency graph directly:
 ```bash
 codegraph mcp                  # Single-repo mode (default) — only local project
@@ -402,7 +440,7 @@ codegraph mcp --repos a,b      # Restrict to specific repos (implies --multi-rep
 ### CLAUDE.md / Agent Instructions
-Add this to your project's `CLAUDE.md` to help AI agents use codegraph (full template in the [AI Agent Guide](docs/ai-agent-guide.md#claudemd-template)):
+Add this to your project's `CLAUDE.md` to help AI agents use codegraph (full template in the [AI Agent Guide](docs/guides/ai-agent-guide.md#claudemd-template)):
 ```markdown
 ## Code Navigation
@@ -460,7 +498,7 @@ Use `--kind function` to cut noise. Use `--file <pattern>` to scope.
 ## 📋 Recommended Practices
-See **[docs/recommended-practices.md](docs/recommended-practices.md)** for integration guides:
+See **[docs/guides/recommended-practices.md](docs/guides/recommended-practices.md)** for integration guides:
 - **Git hooks** — auto-rebuild on commit, impact checks on push, commit message enrichment
 - **CI/CD** — PR impact comments, threshold gates, graph caching
@@ -468,7 +506,7 @@ See **[docs/recommended-practices.md](docs/recommended-practices.md)** for integ
 - **Developer workflow** — watch mode, explore-before-you-edit, semantic search
 - **Secure credentials** — `apiKeyCommand` with 1Password, Bitwarden, Vault, macOS Keychain, `pass`
-For AI-specific integration, see the **[AI Agent Guide](docs/ai-agent-guide.md)** — a comprehensive reference covering the 6-step agent workflow, complete command-to-MCP mapping, Claude Code hooks, and token-saving patterns.
+For AI-specific integration, see the **[AI Agent Guide](docs/guides/ai-agent-guide.md)** — a comprehensive reference covering the 6-step agent workflow, complete command-to-MCP mapping, Claude Code hooks, and token-saving patterns.
 ## 🔁 CI / GitHub Actions
@@ -575,6 +613,8 @@ const { results: fused } = await multiSearchData(
 | Incremental rebuilds | **O(changed)** | — | O(n) Merkle | — | — | — |
 | MCP / AI agent support | **Yes** | — | **Yes** | **Yes** | **Yes** | **Yes** |
 | Git diff impact | **Yes** | — | — | — | — | **Yes** |
+| Git co-change analysis | **Yes** | — | — | — | — | — |
+| Dead code / role classification | **Yes** | — | **Yes** | — | — | — |
 | Semantic search | **Yes** | — | **Yes** | **Yes** | — | **Yes** |
 | Watch mode | **Yes** | — | **Yes** | — | — | — |
 | Zero config, no Docker/JVM | **Yes** | — | **Yes** | — | — | — |
@@ -583,7 +623,7 @@ const { results: fused } = await multiSearchData(
 ## 🗺️ Roadmap
-See **[ROADMAP.md](ROADMAP.md)** for the full development roadmap and **[STABILITY.md](STABILITY.md)** for the stability policy and versioning guarantees. Current plan:
+See **[ROADMAP.md](docs/roadmap/ROADMAP.md)** for the full development roadmap and **[STABILITY.md](STABILITY.md)** for the stability policy and versioning guarantees. Current plan:
 1. ~~**Rust Core**~~ — **Complete** (v1.3.0) — native tree-sitter parsing via napi-rs, parallel multi-core parsing, incremental re-parsing, import resolution & cycle detection in Rust
 2. ~~**Foundation Hardening**~~ — **Complete** (v1.4.0) — parser registry, 12-tool MCP server with multi-repo support, test coverage 62%→75%, `apiKeyCommand` secret resolution, global repo registry
@@ -592,7 +632,7 @@ See **[ROADMAP.md](ROADMAP.md)** for the full development roadmap and **[STABILI
 5. **Natural Language Queries** — `codegraph ask` command, conversational sessions
 6. **Expanded Language Support** — 8 new languages (12 → 20)
 7. **GitHub Integration & CI** — reusable GitHub Action, PR review, SARIF output
-8. **Visualization & Advanced** — web UI, dead code detection, monorepo support, agentic search
+8. **Visualization & Advanced** — web UI, monorepo support, agentic search
 ## 🤝 Contributing
@@ -605,7 +645,7 @@ npm install
 npm test
 ```
-Looking to add a new language? Check out **[Adding a New Language](docs/adding-a-language.md)**.
+Looking to add a new language? Check out **[Adding a New Language](docs/guides/adding-a-language.md)**.
 ## 📄 License
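
The README hunks above describe co-change analysis in terms of a co-commit count (`--min-support`) and a Jaccard score (`--min-jaccard`). The sketch below shows one way those two numbers can be computed from a list of commits. It is an illustration only, not the contents of `package/src/cochange.js` (which this diff lists but does not show here). The helper name `coChangePairs` and its input shape are hypothetical, and the default thresholds are taken from the CLI help text in `package/src/cli.js`.

```javascript
// Hypothetical helper, not the package's implementation.
// `commits` is an array of commits, each an array of file paths touched by that commit.
function coChangePairs(commits, { minSupport = 3, minJaccard = 0.3 } = {}) {
  const fileCounts = new Map(); // file -> number of commits touching it
  const pairCounts = new Map(); // "a\u0000b" -> number of commits touching both

  for (const files of commits) {
    // De-duplicate and sort so each pair gets a single canonical key.
    const unique = [...new Set(files)].sort();
    for (const f of unique) {
      fileCounts.set(f, (fileCounts.get(f) || 0) + 1);
    }
    for (let i = 0; i < unique.length; i++) {
      for (let j = i + 1; j < unique.length; j++) {
        const key = `${unique[i]}\u0000${unique[j]}`;
        pairCounts.set(key, (pairCounts.get(key) || 0) + 1);
      }
    }
  }

  const results = [];
  for (const [key, support] of pairCounts) {
    const [a, b] = key.split('\u0000');
    // Jaccard = |commits with both| / |commits with either|
    const union = fileCounts.get(a) + fileCounts.get(b) - support;
    const jaccard = support / union;
    if (support >= minSupport && jaccard >= minJaccard) {
      results.push({ a, b, support, jaccard });
    }
  }
  // Strongest coupling first.
  return results.sort((x, y) => y.jaccard - x.jaccard);
}

const demoCommits = [
  ['a.js', 'b.js'],
  ['a.js', 'b.js'],
  ['a.js', 'b.js'],
  ['a.js', 'c.js'],
  ['c.js'],
];
console.log(coChangePairs(demoCommits));
// → [ { a: 'a.js', b: 'b.js', support: 3, jaccard: 0.75 } ]
```

In the demo, `a.js` appears in 4 commits and `b.js` in 3, with 3 shared, so the score is 3 / (4 + 3 - 3) = 0.75. The `a.js`/`c.js` pair has a support of 1 and is filtered out by the support threshold.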

package/package.json CHANGED

@@ -1,13 +1,14 @@
 {
   "name": "@optave/codegraph",
-  "version": "2.2.3-dev.44e8146",
+  "version": "2.3.1-dev.1aeea34",
   "description": "Local code graph CLI — parse codebases with tree-sitter, build dependency graphs, query them",
   "type": "module",
   "main": "src/index.js",
   "exports": {
     ".": {
       "import": "./src/index.js"
-    }
+    },
+    "./package.json": "./package.json"
   },
   "bin": {
     "codegraph": "./src/cli.js"
@@ -61,10 +62,10 @@
   "optionalDependencies": {
     "@huggingface/transformers": "^3.8.1",
     "@modelcontextprotocol/sdk": "^1.0.0",
-    "@optave/codegraph-darwin-arm64": "2.2.3-dev.44e8146",
-    "@optave/codegraph-darwin-x64": "2.2.3-dev.44e8146",
-    "@optave/codegraph-linux-x64-gnu": "2.2.3-dev.44e8146",
-    "@optave/codegraph-win32-x64-msvc": "2.2.3-dev.44e8146"
+    "@optave/codegraph-darwin-arm64": "2.3.1-dev.1aeea34",
+    "@optave/codegraph-darwin-x64": "2.3.1-dev.1aeea34",
+    "@optave/codegraph-linux-x64-gnu": "2.3.1-dev.1aeea34",
+    "@optave/codegraph-win32-x64-msvc": "2.3.1-dev.1aeea34"
   },
   "devDependencies": {
     "@biomejs/biome": "^2.4.4",

package/src/builder.js CHANGED

@@ -827,6 +827,59 @@ export async function buildGraph(rootDir, opts = {}) {
     }
   }
+  // For incremental builds, buildStructure needs ALL files (not just changed ones)
+  // because it clears and rebuilds all contains edges and directory metrics.
+  // Load unchanged files from the DB so structure data stays complete.
+  if (!isFullBuild) {
+    const existingFiles = db.prepare("SELECT DISTINCT file FROM nodes WHERE kind = 'file'").all();
+    const defsByFile = db.prepare(
+      "SELECT name, kind, line FROM nodes WHERE file = ? AND kind != 'file' AND kind != 'directory'",
+    );
+    // Count imports per file — buildStructure only uses imports.length for metrics
+    const importCountByFile = db.prepare(
+      `SELECT COUNT(DISTINCT n2.file) AS cnt FROM edges e
+       JOIN nodes n1 ON e.source_id = n1.id
+       JOIN nodes n2 ON e.target_id = n2.id
+       WHERE n1.file = ? AND e.kind = 'imports'`,
+    );
+    const lineCountByFile = db.prepare(
+      `SELECT n.name AS file, m.line_count
+       FROM node_metrics m JOIN nodes n ON m.node_id = n.id
+       WHERE n.kind = 'file'`,
+    );
+    const cachedLineCounts = new Map();
+    for (const row of lineCountByFile.all()) {
+      cachedLineCounts.set(row.file, row.line_count);
+    }
+    let loadedFromDb = 0;
+    for (const { file: relPath } of existingFiles) {
+      if (!fileSymbols.has(relPath)) {
+        const importCount = importCountByFile.get(relPath)?.cnt || 0;
+        fileSymbols.set(relPath, {
+          definitions: defsByFile.all(relPath),
+          imports: new Array(importCount),
+          exports: [],
+        });
+        loadedFromDb++;
+      }
+      if (!lineCountMap.has(relPath)) {
+        const cached = cachedLineCounts.get(relPath);
+        if (cached != null) {
+          lineCountMap.set(relPath, cached);
+        } else {
+          const absPath = path.join(rootDir, relPath);
+          try {
+            const content = fs.readFileSync(absPath, 'utf-8');
+            lineCountMap.set(relPath, content.split('\n').length);
+          } catch {
+            lineCountMap.set(relPath, 0);
+          }
+        }
+      }
+    }
+    debug(`Structure: ${fileSymbols.size} files (${loadedFromDb} loaded from DB)`);
+  }
   // Build directory structure, containment edges, and metrics
   const relDirs = new Set();
   for (const absDir of discoveredDirs) {
@@ -839,6 +892,19 @@ export async function buildGraph(rootDir, opts = {}) {
     debug(`Structure analysis failed: ${err.message}`);
   }
+  // Classify node roles (entry, core, utility, adapter, dead, leaf)
+  try {
+    const { classifyNodeRoles } = await import('./structure.js');
+    const roleSummary = classifyNodeRoles(db);
+    debug(
+      `Roles: ${Object.entries(roleSummary)
+        .map(([r, c]) => `${r}=${c}`)
+        .join(', ')}`,
+    );
+  } catch (err) {
+    debug(`Role classification failed: ${err.message}`);
+  }
   const nodeCount = db.prepare('SELECT COUNT(*) as c FROM nodes').get().c;
   info(`Graph built: ${nodeCount} nodes, ${edgeCount} edges`);
   info(`Stored in ${dbPath}`);
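
The incremental path in the first hunk fills `imports` with `new Array(importCount)`, and its comment notes that `buildStructure` only reads `imports.length`. The sketch below shows what such a placeholder does and does not support in plain JavaScript. The function name `inspectPlaceholder` is hypothetical and exists only for this illustration.

```javascript
// Hypothetical helper for illustration; it is not part of builder.js.
function inspectPlaceholder(count) {
  // new Array(n) creates a sparse array: length is n, but no element exists.
  const placeholder = new Array(count);

  let visited = 0;
  // forEach skips holes, so the callback never runs on a fresh sparse array.
  placeholder.forEach(() => {
    visited += 1;
  });

  return {
    length: placeholder.length,
    visited,
    ownKeys: Object.keys(placeholder).length,
  };
}

console.log(inspectPlaceholder(3));
// → { length: 3, visited: 0, ownKeys: 0 }
```

This is why the placeholder is safe only for code that reads `.length`. Anything that iterates over the entries would see no imports for files loaded from the database.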

package/src/cli.js CHANGED

@@ -2,13 +2,18 @@
 import fs from 'node:fs';
 import path from 'node:path';
-import Database from 'better-sqlite3';
 import { Command } from 'commander';
 import { buildGraph } from './builder.js';
 import { loadConfig } from './config.js';
 import { findCycles, formatCycles } from './cycles.js';
-import { findDbPath } from './db.js';
-import { buildEmbeddings, EMBEDDING_STRATEGIES, MODELS, search } from './embedder.js';
+import { openReadonlyOrFail } from './db.js';
+import {
+  buildEmbeddings,
+  DEFAULT_MODEL,
+  EMBEDDING_STRATEGIES,
+  MODELS,
+  search,
+} from './embedder.js';
 import { exportDOT, exportJSON, exportMermaid } from './export.js';
 import { setVerbose } from './logger.js';
 import {
@@ -22,7 +27,9 @@ import {
   impactAnalysis,
   moduleMap,
   queryName,
+  roles,
   stats,
+  VALID_ROLES,
   where,
 } from './queries.js';
 import {
@@ -273,13 +280,15 @@ program
   .option('-T, --no-tests', 'Exclude test/spec files')
   .option('--include-tests', 'Include test/spec files (overrides excludeTests config)')
   .option('--min-confidence <score>', 'Minimum edge confidence threshold (default: 0.5)', '0.5')
+  .option('--direction <dir>', 'Flowchart direction for Mermaid: TB, LR, RL, BT', 'LR')
   .option('-o, --output <file>', 'Write to file instead of stdout')
   .action((opts) => {
-    const db = new Database(findDbPath(opts.db), { readonly: true });
+    const db = openReadonlyOrFail(opts.db);
     const exportOpts = {
       fileLevel: !opts.functions,
       noTests: resolveNoTests(opts),
       minConfidence: parseFloat(opts.minConfidence),
+      direction: opts.direction,
     };
     let output;
@@ -314,7 +323,7 @@ program
   .option('--include-tests', 'Include test/spec files (overrides excludeTests config)')
   .option('-j, --json', 'Output as JSON')
   .action((opts) => {
-    const db = new Database(findDbPath(opts.db), { readonly: true });
+    const db = openReadonlyOrFail(opts.db);
     const cycles = findCycles(db, { fileLevel: !opts.functions, noTests: resolveNoTests(opts) });
     db.close();
@@ -396,8 +405,15 @@ registry
   .command('prune')
   .description('Remove stale registry entries (missing directories or idle beyond TTL)')
   .option('--ttl <days>', 'Days of inactivity before pruning (default: 30)', '30')
+  .option('--exclude <names>', 'Comma-separated repo names to preserve from pruning')
   .action((opts) => {
-    const pruned = pruneRegistry(undefined, parseInt(opts.ttl, 10));
+    const excludeNames = opts.exclude
+      ? opts.exclude
+          .split(',')
+          .map((s) => s.trim())
+          .filter((s) => s.length > 0)
+      : [];
+    const pruned = pruneRegistry(undefined, parseInt(opts.ttl, 10), excludeNames);
     if (pruned.length === 0) {
       console.log('No stale entries found.');
     } else {
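
The `--exclude` handling in the hunk above splits on commas, trims each entry, and drops empty ones. A standalone sketch of the same steps follows. The function name `parseExcludeList` and the repo names in the demo are hypothetical; in the diff the logic is inlined in the `prune` action.

```javascript
// Hypothetical wrapper around the split/trim/filter chain shown in the hunk above.
function parseExcludeList(raw) {
  if (!raw) return []; // option not passed (or empty string)
  return raw
    .split(',')
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

console.log(parseExcludeList('api, web,,  ')); // → [ 'api', 'web' ]
console.log(parseExcludeList(undefined)); // → []
```

Stray commas and surrounding whitespace therefore do not produce empty names in the list passed to `pruneRegistry`.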
@@ -415,12 +431,13 @@ program
   .command('models')
   .description('List available embedding models')
   .action(() => {
+    const defaultModel = config.embeddings?.model || DEFAULT_MODEL;
     console.log('\nAvailable embedding models:\n');
-    for (const [key, config] of Object.entries(MODELS)) {
-      const def = key === 'minilm' ? ' (default)' : '';
-      const ctx = config.contextWindow ? `${config.contextWindow} ctx` : '';
+    for (const [key, cfg] of Object.entries(MODELS)) {
+      const def = key === defaultModel ? ' (default)' : '';
+      const ctx = cfg.contextWindow ? `${cfg.contextWindow} ctx` : '';
       console.log(
-        `  ${key.padEnd(12)} ${String(config.dim).padStart(4)}d  ${ctx.padEnd(9)} ${config.desc}${def}`,
+        `  ${key.padEnd(12)} ${String(cfg.dim).padStart(4)}d  ${ctx.padEnd(9)} ${cfg.desc}${def}`,
       );
     }
     console.log('\nUsage: codegraph embed --model <name> --strategy <structured|source>');
@@ -434,8 +451,7 @@ program
   )
   .option(
     '-m, --model <name>',
-    'Embedding model: minilm (default), jina-small, jina-base, jina-code, nomic, nomic-v1.5, bge-large. Run `codegraph models` for details',
-    'minilm',
+    'Embedding model (default from config or minilm). Run `codegraph models` for details',
   )
   .option(
     '-s, --strategy <name>',
@@ -450,7 +466,8 @@ program
       process.exit(1);
     }
     const root = path.resolve(dir || '.');
-    await buildEmbeddings(root, opts.model, undefined, { strategy: opts.strategy });
+    const model = opts.model || config.embeddings?.model || DEFAULT_MODEL;
+    await buildEmbeddings(root, model, undefined, { strategy: opts.strategy });
   });
 program
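
The two `embed` hunks above work together: the hard-coded `'minilm'` default is removed from the `--model` option, and the action then resolves the model as CLI flag, then config, then `DEFAULT_MODEL`. With a default on the option, `opts.model` would always be set and the config value could never be reached. The sketch below mirrors that fallback chain. The function name `resolveModel` is hypothetical, and the value of `DEFAULT_MODEL` is assumed to be `'minilm'` based on the help text.

```javascript
// Assumed value; the real constant is exported from embedder.js and not shown in this diff.
const DEFAULT_MODEL = 'minilm';

// Hypothetical helper mirroring: opts.model || config.embeddings?.model || DEFAULT_MODEL
function resolveModel(cliModel, config = {}) {
  return cliModel || config.embeddings?.model || DEFAULT_MODEL;
}

console.log(resolveModel('jina-code', { embeddings: { model: 'nomic' } })); // → jina-code
console.log(resolveModel(undefined, { embeddings: { model: 'nomic' } })); // → nomic
console.log(resolveModel(undefined, {})); // → minilm
```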
@@ -465,6 +482,7 @@ program
   .option('-k, --kind <kind>', 'Filter by kind: function, method, class')
   .option('--file <pattern>', 'Filter by file path pattern')
   .option('--rrf-k <number>', 'RRF k parameter for multi-query ranking', '60')
+  .option('-j, --json', 'Output as JSON')
   .action(async (query, opts) => {
     await search(query, opts.db, {
       limit: parseInt(opts.limit, 10),
@@ -474,6 +492,7 @@ program
       kind: opts.kind,
       filePattern: opts.file,
       rrfK: parseInt(opts.rrfK, 10),
+      json: opts.json,
     });
   });
@@ -530,6 +549,90 @@ program
     }
   });
+program
+  .command('roles')
+  .description('Show node role classification: entry, core, utility, adapter, dead, leaf')
+  .option('-d, --db <path>', 'Path to graph.db')
+  .option('--role <role>', `Filter by role (${VALID_ROLES.join(', ')})`)
+  .option('-f, --file <path>', 'Scope to a specific file (partial match)')
+  .option('-T, --no-tests', 'Exclude test/spec files')
+  .option('--include-tests', 'Include test/spec files (overrides excludeTests config)')
+  .option('-j, --json', 'Output as JSON')
+  .action((opts) => {
+    if (opts.role && !VALID_ROLES.includes(opts.role)) {
+      console.error(`Invalid role "${opts.role}". Valid roles: ${VALID_ROLES.join(', ')}`);
+      process.exit(1);
+    }
+    roles(opts.db, {
+      role: opts.role,
+      file: opts.file,
+      noTests: resolveNoTests(opts),
+      json: opts.json,
+    });
+  });
+program
+  .command('co-change [file]')
+  .description(
+    'Analyze git history for files that change together. Use --analyze to scan, or query existing data.',
+  )
+  .option('--analyze', 'Scan git history and populate co-change data')
+  .option('--since <date>', 'Git date for history window (default: "1 year ago")')
+  .option('--min-support <n>', 'Minimum co-occurrence count (default: 3)')
+  .option('--min-jaccard <n>', 'Minimum Jaccard similarity 0-1 (default: 0.3)')
+  .option('--full', 'Force full re-scan (ignore incremental state)')
+  .option('-n, --limit <n>', 'Max results', '20')
+  .option('-d, --db <path>', 'Path to graph.db')
+  .option('-T, --no-tests', 'Exclude test/spec files')
+  .option('--include-tests', 'Include test/spec files (overrides excludeTests config)')
+  .option('-j, --json', 'Output as JSON')
+  .action(async (file, opts) => {
+    const { analyzeCoChanges, coChangeData, coChangeTopData, formatCoChange, formatCoChangeTop } =
+      await import('./cochange.js');
+    if (opts.analyze) {
+      const result = analyzeCoChanges(opts.db, {
+        since: opts.since || config.coChange?.since,
+        minSupport: opts.minSupport ? parseInt(opts.minSupport, 10) : config.coChange?.minSupport,
+        maxFilesPerCommit: config.coChange?.maxFilesPerCommit,
+        full: opts.full,
+      });
+      if (opts.json) {
+        console.log(JSON.stringify(result, null, 2));
+      } else if (result.error) {
+        console.error(result.error);
+        process.exit(1);
+      } else {
+        console.log(
+          `\nCo-change analysis complete: ${result.pairsFound} pairs from ${result.commitsScanned} commits (since: ${result.since})\n`,
+        );
+      }
+      return;
+    }
+    const queryOpts = {
+      limit: parseInt(opts.limit, 10),
+      minJaccard: opts.minJaccard ? parseFloat(opts.minJaccard) : config.coChange?.minJaccard,
+      noTests: resolveNoTests(opts),
+    };
+    if (file) {
+      const data = coChangeData(file, opts.db, queryOpts);
+      if (opts.json) {
+        console.log(JSON.stringify(data, null, 2));
+      } else {
+        console.log(formatCoChange(data));
+      }
+    } else {
+      const data = coChangeTopData(opts.db, queryOpts);
+      if (opts.json) {
+        console.log(JSON.stringify(data, null, 2));
+      } else {
+        console.log(formatCoChangeTop(data));
+      }
+    }
+  });
 program
   .command('watch [dir]')
   .description('Watch project for file changes and incrementally update the graph')