@optave/codegraph 2.3.0 → 2.4.0

package/README.md CHANGED
@@ -55,7 +55,7 @@ cd your-project
55
55
  codegraph build
56
56
  ```
57
57
 
58
- That's it. No config files, no Docker, no JVM, no API keys, no accounts. The graph is ready to query. Add `codegraph mcp` to your AI agent's config and it has full access to your dependency graph through 17 MCP tools.
58
+ That's it. No config files, no Docker, no JVM, no API keys, no accounts. The graph is ready to query. Add `codegraph mcp` to your AI agent's config and it has full access to your dependency graph through 19 MCP tools.
59
59
 
60
60
  ### Why it matters
61
61
 
@@ -78,7 +78,9 @@ That's it. No config files, no Docker, no JVM, no API keys, no accounts. The gra
78
78
  | Semantic search | **Yes** | — | **Yes** | **Yes** | — | **Yes** | — | — |
79
79
  | MCP / AI agent support | **Yes** | — | **Yes** | **Yes** | **Yes** | **Yes** | **Yes** | — |
80
80
  | Git diff impact | **Yes** | — | — | — | — | **Yes** | — | **Yes** |
81
+ | Git co-change analysis | **Yes** | — | — | — | — | — | **Yes** | **Yes** |
81
82
  | Watch mode | **Yes** | — | **Yes** | — | — | — | — | — |
83
+ | Dead code / role classification | **Yes** | — | **Yes** | — | — | — | — | **Yes** |
82
84
  | Cycle detection | **Yes** | — | **Yes** | — | — | — | — | **Yes** |
83
85
  | Incremental rebuilds | **O(changed)** | — | O(n) Merkle | — | — | — | — | — |
84
86
  | Zero config | **Yes** | — | **Yes** | — | — | — | — | — |
@@ -94,9 +96,10 @@ That's it. No config files, no Docker, no JVM, no API keys, no accounts. The gra
94
96
  | **⚡** | **Always-fresh graph** | Three-tier change detection: journal (O(changed)) → mtime+size (O(n) stats) → hash (O(changed) reads). Sub-second rebuilds even on large codebases |
95
97
  | **🔓** | **Zero-cost core, LLM-enhanced when you want** | Full graph analysis with no API keys, no accounts, no cost. Optionally bring your own LLM provider — your code only goes where you choose |
96
98
  | **🔬** | **Function-level, not just files** | Traces `handleAuth()` → `validateToken()` → `decryptJWT()` and shows 14 callers across 9 files break if `decryptJWT` changes |
97
- | **🤖** | **Built for AI agents** | 17-tool [MCP server](https://modelcontextprotocol.io/) — AI assistants query your graph directly. Single-repo by default |
99
+ | **🏷️** | **Role classification** | Every symbol auto-tagged as `entry`/`core`/`utility`/`adapter`/`dead`/`leaf` — agents instantly know what they're looking at |
100
+ | **🤖** | **Built for AI agents** | 19-tool [MCP server](https://modelcontextprotocol.io/) — AI assistants query your graph directly. Single-repo by default |
98
101
  | **🌐** | **Multi-language, one CLI** | JS/TS + Python + Go + Rust + Java + C# + PHP + Ruby + HCL in a single graph |
99
- | **💥** | **Git diff impact** | `codegraph diff-impact` shows changed functions, their callers, and full blast radius — ships with a GitHub Actions workflow |
102
+ | **💥** | **Git diff impact** | `codegraph diff-impact` shows changed functions, their callers, and full blast radius — enriched with historically coupled files from git co-change analysis. Ships with a GitHub Actions workflow |
100
103
  | **🧠** | **Semantic search** | Local embeddings by default, LLM-powered when opted in — multi-query with RRF ranking via `"auth; token; JWT"` |
101
104
 
102
105
  ---
@@ -141,10 +144,10 @@ After modifying code:
141
144
  Or connect directly via MCP:
142
145
 
143
146
  ```bash
144
- codegraph mcp # 17-tool MCP server — AI queries the graph directly
147
+ codegraph mcp # 19-tool MCP server — AI queries the graph directly
145
148
  ```
146
149
 
147
- Full agent setup: [AI Agent Guide](docs/ai-agent-guide.md) · [CLAUDE.md template](docs/ai-agent-guide.md#claudemd-template)
150
+ Full agent setup: [AI Agent Guide](docs/guides/ai-agent-guide.md) · [CLAUDE.md template](docs/guides/ai-agent-guide.md#claudemd-template)
148
151
 
149
152
  ---
150
153
 
@@ -159,15 +162,19 @@ Full agent setup: [AI Agent Guide](docs/ai-agent-guide.md) · [CLAUDE.md t
159
162
  | 🎯 | **Deep context** | `context` gives AI agents source, deps, callers, signature, and tests for a function in one call; `explain` gives structural summaries of files or functions |
160
163
  | 📍 | **Fast lookup** | `where` shows exactly where a symbol is defined and used — minimal, fast |
161
164
  | 📊 | **Diff impact** | Parse `git diff`, find overlapping functions, trace their callers |
165
+ | 🔗 | **Co-change analysis** | Analyze git history for files that always change together — surfaces hidden coupling the static graph can't see; enriches `diff-impact` with historically coupled files |
162
166
  | 🗺️ | **Module map** | Bird's-eye view of your most-connected files |
163
167
  | 🏗️ | **Structure & hotspots** | Directory cohesion scores, fan-in/fan-out hotspot detection, module boundaries |
168
+ | 🏷️ | **Node role classification** | Every symbol auto-tagged as `entry`/`core`/`utility`/`adapter`/`dead`/`leaf` based on connectivity patterns — agents instantly know architectural role |
164
169
  | 🔄 | **Cycle detection** | Find circular dependencies at file or function level |
165
170
  | 📤 | **Export** | DOT (Graphviz), Mermaid, and JSON graph export |
166
171
  | 🧠 | **Semantic search** | Embeddings-powered natural language search with multi-query RRF ranking |
167
172
  | 👀 | **Watch mode** | Incrementally update the graph as files change |
168
- | 🤖 | **MCP server** | 17-tool MCP server for AI assistants; single-repo by default, opt-in multi-repo |
173
+ | 🤖 | **MCP server** | 19-tool MCP server for AI assistants; single-repo by default, opt-in multi-repo |
169
174
  | ⚡ | **Always fresh** | Three-tier incremental detection — sub-second rebuilds even on large codebases |
170
175
 
176
+ See [docs/examples](docs/examples) for real-world CLI and MCP usage examples.
177
+
171
178
  ## 📦 Commands
172
179
 
173
180
  ### Build & Watch
@@ -189,6 +196,9 @@ codegraph map -n 50 --no-tests # Top 50, excluding test files
189
196
  codegraph where <name> # Where is a symbol defined and used?
190
197
  codegraph where --file src/db.js # List symbols, imports, exports for a file
191
198
  codegraph stats # Graph health: nodes, edges, languages, quality score
199
+ codegraph roles # Node role classification (entry, core, utility, adapter, dead, leaf)
200
+ codegraph roles --role dead -T # Find dead code (unreferenced, non-exported symbols)
201
+ codegraph roles --role core --file src/ # Core symbols in src/
192
202
  ```
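The README describes roles as derived from connectivity patterns, and dead code as unreferenced, non-exported symbols. The sketch below illustrates one way such a classifier could look. Only the `dead` rule is taken from the README; the other rules, the `hubThreshold` parameter, and the function names `classifyRole` and `summarizeRoles` are assumptions for illustration, since `structure.js` is not part of this diff.

```javascript
// Hypothetical heuristic: the real rules live in structure.js, which this diff does not show.
// fanIn = number of callers, fanOut = number of callees.
function classifyRole({ fanIn, fanOut, exported = false }, hubThreshold = 5) {
  if (fanIn === 0 && !exported) return 'dead'; // unreferenced and non-exported (per the README)
  if (fanIn === 0) return 'entry'; // assumed: exported but never called inside the graph
  if (fanOut === 0) return 'leaf'; // assumed: called, but calls nothing else
  if (fanIn >= hubThreshold && fanOut >= hubThreshold) return 'core'; // assumed threshold
  if (fanIn >= hubThreshold) return 'utility'; // assumed: widely used, few dependencies
  return 'adapter'; // assumed fallback for thin connecting symbols
}

// Produces a summary string in the same "role=count" shape that builder.js logs.
function summarizeRoles(symbols) {
  const counts = {};
  for (const symbol of symbols) {
    const role = classifyRole(symbol);
    counts[role] = (counts[role] || 0) + 1;
  }
  return Object.entries(counts)
    .map(([role, count]) => `${role}=${count}`)
    .join(', ');
}
```

A call such as `classifyRole({ fanIn: 0, fanOut: 2, exported: false })` falls into the `dead` branch, which is the case `codegraph roles --role dead` is meant to surface.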
193
203
 
194
204
  ### Deep Context (AI-Optimized)
@@ -213,6 +223,22 @@ codegraph diff-impact HEAD~3 # Impact vs a specific ref
213
223
  codegraph diff-impact main --format mermaid -T # Mermaid flowchart of blast radius
214
224
  ```
215
225
 
226
+ ### Co-Change Analysis
227
+
228
+ Analyze git history to find files that always change together — surfaces hidden coupling the static graph can't see. Requires a git repository.
229
+
230
+ ```bash
231
+ codegraph co-change --analyze # Scan git history and populate co-change data
232
+ codegraph co-change src/queries.js # Show co-change partners for a file
233
+ codegraph co-change # Show top co-changing file pairs globally
234
+ codegraph co-change --since 6m # Limit to last 6 months of history
235
+ codegraph co-change --min-jaccard 0.5 # Only show strong coupling (Jaccard >= 0.5)
236
+ codegraph co-change --min-support 5 # Minimum co-commit count
237
+ codegraph co-change --full # Include all details
238
+ ```
239
+
240
+ Co-change data also enriches `diff-impact` — historically coupled files appear in a `historicallyCoupled` section alongside the static dependency analysis.
241
+
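The `--min-support` and `--min-jaccard` thresholds above can be illustrated with a minimal sketch. It assumes support is the number of commits touching both files and Jaccard similarity is that count divided by the number of commits touching either file. The function name `coChangePairs` and its input shape are hypothetical; `cochange.js` is not shown in this diff, and the defaults (3 and 0.3) are taken from the CLI option help text.

```javascript
// Hypothetical sketch: not the package's cochange.js.
// commits: array of arrays, each listing the file paths changed in one commit.
function coChangePairs(commits, { minSupport = 3, minJaccard = 0.3 } = {}) {
  const fileCounts = new Map(); // file -> number of commits touching it
  const pairCounts = new Map(); // "a\nb" -> number of commits touching both

  for (const files of commits) {
    const unique = [...new Set(files)].sort();
    for (const file of unique) fileCounts.set(file, (fileCounts.get(file) || 0) + 1);
    for (let i = 0; i < unique.length; i++) {
      for (let j = i + 1; j < unique.length; j++) {
        const key = `${unique[i]}\n${unique[j]}`;
        pairCounts.set(key, (pairCounts.get(key) || 0) + 1);
      }
    }
  }

  const pairs = [];
  for (const [key, support] of pairCounts) {
    const [a, b] = key.split('\n');
    // Jaccard = |A ∩ B| / |A ∪ B|, where A and B are the commit sets of each file.
    const jaccard = support / (fileCounts.get(a) + fileCounts.get(b) - support);
    if (support >= minSupport && jaccard >= minJaccard) {
      pairs.push({ a, b, support, jaccard });
    }
  }
  // Strongest coupling first.
  return pairs.sort((x, y) => y.jaccard - x.jaccard);
}
```

Under these assumptions, two files that each appear in four commits and share three of them have a Jaccard score of 3 / (4 + 4 - 3) = 0.6, so they would pass `--min-jaccard 0.5`.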
216
242
  ### Structure & Hotspots
217
243
 
218
244
  ```bash
@@ -379,7 +405,7 @@ Self-measured on every release via CI ([build benchmarks](generated/BUILD-BENCHM
379
405
  |---|---|
380
406
  | Build speed (native) | **1.9 ms/file** |
381
407
  | Build speed (WASM) | **6.6 ms/file** |
382
- | Query time | **1ms** |
408
+ | Query time | **2ms** |
383
409
  | ~50,000 files (est.) | **~95.0s build** |
384
410
 
385
411
  Metrics are normalized per file for cross-version comparability. Times above are for a full initial build — incremental rebuilds only re-parse changed files.
@@ -402,7 +428,7 @@ Optional: `@huggingface/transformers` (semantic search), `@modelcontextprotocol/
402
428
 
403
429
  ### MCP Server
404
430
 
405
- Codegraph includes a built-in [Model Context Protocol](https://modelcontextprotocol.io/) server with 17 tools, so AI assistants can query your dependency graph directly:
431
+ Codegraph includes a built-in [Model Context Protocol](https://modelcontextprotocol.io/) server with 19 tools, so AI assistants can query your dependency graph directly:
406
432
 
407
433
  ```bash
408
434
  codegraph mcp # Single-repo mode (default) — only local project
@@ -416,7 +442,7 @@ codegraph mcp --repos a,b # Restrict to specific repos (implies --multi-rep
416
442
 
417
443
  ### CLAUDE.md / Agent Instructions
418
444
 
419
- Add this to your project's `CLAUDE.md` to help AI agents use codegraph (full template in the [AI Agent Guide](docs/ai-agent-guide.md#claudemd-template)):
445
+ Add this to your project's `CLAUDE.md` to help AI agents use codegraph (full template in the [AI Agent Guide](docs/guides/ai-agent-guide.md#claudemd-template)):
420
446
 
421
447
  ```markdown
422
448
  ## Code Navigation
@@ -474,7 +500,7 @@ Use `--kind function` to cut noise. Use `--file <pattern>` to scope.
474
500
 
475
501
  ## 📋 Recommended Practices
476
502
 
477
- See **[docs/recommended-practices.md](docs/recommended-practices.md)** for integration guides:
503
+ See **[docs/guides/recommended-practices.md](docs/guides/recommended-practices.md)** for integration guides:
478
504
 
479
505
  - **Git hooks** — auto-rebuild on commit, impact checks on push, commit message enrichment
480
506
  - **CI/CD** — PR impact comments, threshold gates, graph caching
@@ -482,7 +508,7 @@ See **[docs/recommended-practices.md](docs/recommended-practices.md)** for integ
482
508
  - **Developer workflow** — watch mode, explore-before-you-edit, semantic search
483
509
  - **Secure credentials** — `apiKeyCommand` with 1Password, Bitwarden, Vault, macOS Keychain, `pass`
484
510
 
485
- For AI-specific integration, see the **[AI Agent Guide](docs/ai-agent-guide.md)** — a comprehensive reference covering the 6-step agent workflow, complete command-to-MCP mapping, Claude Code hooks, and token-saving patterns.
511
+ For AI-specific integration, see the **[AI Agent Guide](docs/guides/ai-agent-guide.md)** — a comprehensive reference covering the 6-step agent workflow, complete command-to-MCP mapping, Claude Code hooks, and token-saving patterns.
486
512
 
487
513
  ## 🔁 CI / GitHub Actions
488
514
 
@@ -589,6 +615,8 @@ const { results: fused } = await multiSearchData(
589
615
  | Incremental rebuilds | **O(changed)** | — | O(n) Merkle | — | — | — |
590
616
  | MCP / AI agent support | **Yes** | — | **Yes** | **Yes** | **Yes** | **Yes** |
591
617
  | Git diff impact | **Yes** | — | — | — | — | **Yes** |
618
+ | Git co-change analysis | **Yes** | — | — | — | — | — |
619
+ | Dead code / role classification | **Yes** | — | **Yes** | — | — | — |
592
620
  | Semantic search | **Yes** | — | **Yes** | **Yes** | — | **Yes** |
593
621
  | Watch mode | **Yes** | — | **Yes** | — | — | — |
594
622
  | Zero config, no Docker/JVM | **Yes** | — | **Yes** | — | — | — |
@@ -597,7 +625,7 @@ const { results: fused } = await multiSearchData(
597
625
 
598
626
  ## 🗺️ Roadmap
599
627
 
600
- See **[ROADMAP.md](ROADMAP.md)** for the full development roadmap and **[STABILITY.md](STABILITY.md)** for the stability policy and versioning guarantees. Current plan:
628
+ See **[ROADMAP.md](docs/roadmap/ROADMAP.md)** for the full development roadmap and **[STABILITY.md](STABILITY.md)** for the stability policy and versioning guarantees. Current plan:
601
629
 
602
630
  1. ~~**Rust Core**~~ — **Complete** (v1.3.0) — native tree-sitter parsing via napi-rs, parallel multi-core parsing, incremental re-parsing, import resolution & cycle detection in Rust
603
631
  2. ~~**Foundation Hardening**~~ — **Complete** (v1.4.0) — parser registry, 12-tool MCP server with multi-repo support, test coverage 62%→75%, `apiKeyCommand` secret resolution, global repo registry
@@ -606,7 +634,7 @@ See **[ROADMAP.md](ROADMAP.md)** for the full development roadmap and **[STABILI
606
634
  5. **Natural Language Queries** — `codegraph ask` command, conversational sessions
607
635
  6. **Expanded Language Support** — 8 new languages (12 → 20)
608
636
  7. **GitHub Integration & CI** — reusable GitHub Action, PR review, SARIF output
609
- 8. **Visualization & Advanced** — web UI, dead code detection, monorepo support, agentic search
637
+ 8. **Visualization & Advanced** — web UI, monorepo support, agentic search
610
638
 
611
639
  ## 🤝 Contributing
612
640
 
@@ -619,7 +647,7 @@ npm install
619
647
  npm test
620
648
  ```
621
649
 
622
- Looking to add a new language? Check out **[Adding a New Language](docs/adding-a-language.md)**.
650
+ Looking to add a new language? Check out **[Adding a New Language](docs/guides/adding-a-language.md)**.
623
651
 
624
652
  ## 📄 License
625
653
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@optave/codegraph",
3
- "version": "2.3.0",
3
+ "version": "2.4.0",
4
4
  "description": "Local code graph CLI — parse codebases with tree-sitter, build dependency graphs, query them",
5
5
  "type": "module",
6
6
  "main": "src/index.js",
@@ -62,10 +62,9 @@
62
62
  "optionalDependencies": {
63
63
  "@huggingface/transformers": "^3.8.1",
64
64
  "@modelcontextprotocol/sdk": "^1.0.0",
65
- "@optave/codegraph-darwin-arm64": "2.3.0",
66
- "@optave/codegraph-darwin-x64": "2.3.0",
67
- "@optave/codegraph-linux-x64-gnu": "2.3.0",
68
- "@optave/codegraph-win32-x64-msvc": "2.3.0"
65
+ "@optave/codegraph-darwin-arm64": "2.4.0",
66
+ "@optave/codegraph-darwin-x64": "2.4.0",
67
+ "@optave/codegraph-linux-x64-gnu": "2.4.0"
69
68
  },
70
69
  "devDependencies": {
71
70
  "@biomejs/biome": "^2.4.4",
package/src/builder.js CHANGED
@@ -827,6 +827,59 @@ export async function buildGraph(rootDir, opts = {}) {
827
827
  }
828
828
  }
829
829
 
830
+ // For incremental builds, buildStructure needs ALL files (not just changed ones)
831
+ // because it clears and rebuilds all contains edges and directory metrics.
832
+ // Load unchanged files from the DB so structure data stays complete.
833
+ if (!isFullBuild) {
834
+ const existingFiles = db.prepare("SELECT DISTINCT file FROM nodes WHERE kind = 'file'").all();
835
+ const defsByFile = db.prepare(
836
+ "SELECT name, kind, line FROM nodes WHERE file = ? AND kind != 'file' AND kind != 'directory'",
837
+ );
838
+ // Count imports per file — buildStructure only uses imports.length for metrics
839
+ const importCountByFile = db.prepare(
840
+ `SELECT COUNT(DISTINCT n2.file) AS cnt FROM edges e
841
+ JOIN nodes n1 ON e.source_id = n1.id
842
+ JOIN nodes n2 ON e.target_id = n2.id
843
+ WHERE n1.file = ? AND e.kind = 'imports'`,
844
+ );
845
+ const lineCountByFile = db.prepare(
846
+ `SELECT n.name AS file, m.line_count
847
+ FROM node_metrics m JOIN nodes n ON m.node_id = n.id
848
+ WHERE n.kind = 'file'`,
849
+ );
850
+ const cachedLineCounts = new Map();
851
+ for (const row of lineCountByFile.all()) {
852
+ cachedLineCounts.set(row.file, row.line_count);
853
+ }
854
+ let loadedFromDb = 0;
855
+ for (const { file: relPath } of existingFiles) {
856
+ if (!fileSymbols.has(relPath)) {
857
+ const importCount = importCountByFile.get(relPath)?.cnt || 0;
858
+ fileSymbols.set(relPath, {
859
+ definitions: defsByFile.all(relPath),
860
+ imports: new Array(importCount),
861
+ exports: [],
862
+ });
863
+ loadedFromDb++;
864
+ }
865
+ if (!lineCountMap.has(relPath)) {
866
+ const cached = cachedLineCounts.get(relPath);
867
+ if (cached != null) {
868
+ lineCountMap.set(relPath, cached);
869
+ } else {
870
+ const absPath = path.join(rootDir, relPath);
871
+ try {
872
+ const content = fs.readFileSync(absPath, 'utf-8');
873
+ lineCountMap.set(relPath, content.split('\n').length);
874
+ } catch {
875
+ lineCountMap.set(relPath, 0);
876
+ }
877
+ }
878
+ }
879
+ }
880
+ debug(`Structure: ${fileSymbols.size} files (${loadedFromDb} loaded from DB)`);
881
+ }
882
+
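The merge step in the block above can be isolated from SQLite for clarity. The sketch below assumes the cached rows have already been read into plain objects; the function name `mergeUnchangedFiles` and the row shape `{ file, definitions, importCount }` are illustrative, not part of the package.

```javascript
// Sketch of the merge step only; `cached` stands in for rows the real code reads from the DB.
// fileSymbols: Map of relPath -> { definitions, imports, exports } for files parsed in this run.
function mergeUnchangedFiles(fileSymbols, cached) {
  let loadedFromDb = 0;
  for (const { file, definitions, importCount } of cached) {
    if (fileSymbols.has(file)) continue; // freshly parsed data takes precedence
    fileSymbols.set(file, {
      definitions,
      // Placeholder array: per the comment in the diff, only imports.length feeds the metrics.
      imports: new Array(importCount),
      exports: [],
    });
    loadedFromDb++;
  }
  return loadedFromDb;
}
```

The design point is that changed files keep their newly parsed symbols while unchanged files are backfilled, so the structure pass sees the full file set even on an incremental build.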
830
883
  // Build directory structure, containment edges, and metrics
831
884
  const relDirs = new Set();
832
885
  for (const absDir of discoveredDirs) {
@@ -839,6 +892,19 @@ export async function buildGraph(rootDir, opts = {}) {
839
892
  debug(`Structure analysis failed: ${err.message}`);
840
893
  }
841
894
 
895
+ // Classify node roles (entry, core, utility, adapter, dead, leaf)
896
+ try {
897
+ const { classifyNodeRoles } = await import('./structure.js');
898
+ const roleSummary = classifyNodeRoles(db);
899
+ debug(
900
+ `Roles: ${Object.entries(roleSummary)
901
+ .map(([r, c]) => `${r}=${c}`)
902
+ .join(', ')}`,
903
+ );
904
+ } catch (err) {
905
+ debug(`Role classification failed: ${err.message}`);
906
+ }
907
+
842
908
  const nodeCount = db.prepare('SELECT COUNT(*) as c FROM nodes').get().c;
843
909
  info(`Graph built: ${nodeCount} nodes, ${edgeCount} edges`);
844
910
  info(`Stored in ${dbPath}`);
package/src/cli.js CHANGED
@@ -7,7 +7,13 @@ import { buildGraph } from './builder.js';
7
7
  import { loadConfig } from './config.js';
8
8
  import { findCycles, formatCycles } from './cycles.js';
9
9
  import { openReadonlyOrFail } from './db.js';
10
- import { buildEmbeddings, EMBEDDING_STRATEGIES, MODELS, search } from './embedder.js';
10
+ import {
11
+ buildEmbeddings,
12
+ DEFAULT_MODEL,
13
+ EMBEDDING_STRATEGIES,
14
+ MODELS,
15
+ search,
16
+ } from './embedder.js';
11
17
  import { exportDOT, exportJSON, exportMermaid } from './export.js';
12
18
  import { setVerbose } from './logger.js';
13
19
  import {
@@ -21,7 +27,9 @@ import {
21
27
  impactAnalysis,
22
28
  moduleMap,
23
29
  queryName,
30
+ roles,
24
31
  stats,
32
+ VALID_ROLES,
25
33
  where,
26
34
  } from './queries.js';
27
35
  import {
@@ -31,6 +39,7 @@ import {
31
39
  registerRepo,
32
40
  unregisterRepo,
33
41
  } from './registry.js';
42
+ import { checkForUpdates, printUpdateNotification } from './update-check.js';
34
43
  import { watchProject } from './watcher.js';
35
44
 
36
45
  const __cliDir = path.dirname(new URL(import.meta.url).pathname.replace(/^\/([A-Z]:)/i, '$1'));
@@ -48,6 +57,17 @@ program
48
57
  .hook('preAction', (thisCommand) => {
49
58
  const opts = thisCommand.opts();
50
59
  if (opts.verbose) setVerbose(true);
60
+ })
61
+ .hook('postAction', async (_thisCommand, actionCommand) => {
62
+ const name = actionCommand.name();
63
+ if (name === 'mcp' || name === 'watch') return;
64
+ if (actionCommand.opts().json) return;
65
+ try {
66
+ const result = await checkForUpdates(pkg.version);
67
+ if (result) printUpdateNotification(result.current, result.latest);
68
+ } catch {
69
+ /* never break CLI */
70
+ }
51
71
  });
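The `postAction` hook above skips long-running commands and JSON output, and swallows every error so an update check can never break the CLI. A minimal sketch of that guard follows. The contents of `update-check.js` are not shown in this diff, so `isNewerVersion`, `maybeNotify`, and the injected `fetchLatest` and `notify` callbacks are hypothetical stand-ins.

```javascript
// Hypothetical helper: compares "major.minor.patch" strings, ignoring any pre-release suffix.
function isNewerVersion(current, latest) {
  const parse = (v) => v.split('-')[0].split('.').map((n) => parseInt(n, 10) || 0);
  const a = parse(current);
  const b = parse(latest);
  for (let i = 0; i < 3; i++) {
    if ((b[i] || 0) > (a[i] || 0)) return true;
    if ((b[i] || 0) < (a[i] || 0)) return false;
  }
  return false;
}

// Mirrors the hook's guard. Returns true only when a notification was printed.
async function maybeNotify({ name, json, current }, fetchLatest, notify) {
  if (name === 'mcp' || name === 'watch' || json) return false;
  try {
    const latest = await fetchLatest();
    if (latest && isNewerVersion(current, latest)) {
      notify(current, latest);
      return true;
    }
  } catch {
    /* an update check must never break the CLI */
  }
  return false;
}
```

Skipping `mcp` and `watch` keeps stray text out of a protocol stream or a long-lived session, and skipping `--json` keeps machine-readable output parseable.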
52
72
 
53
73
  /**
@@ -272,6 +292,7 @@ program
272
292
  .option('-T, --no-tests', 'Exclude test/spec files')
273
293
  .option('--include-tests', 'Include test/spec files (overrides excludeTests config)')
274
294
  .option('--min-confidence <score>', 'Minimum edge confidence threshold (default: 0.5)', '0.5')
295
+ .option('--direction <dir>', 'Flowchart direction for Mermaid: TB, LR, RL, BT', 'LR')
275
296
  .option('-o, --output <file>', 'Write to file instead of stdout')
276
297
  .action((opts) => {
277
298
  const db = openReadonlyOrFail(opts.db);
@@ -279,6 +300,7 @@ program
279
300
  fileLevel: !opts.functions,
280
301
  noTests: resolveNoTests(opts),
281
302
  minConfidence: parseFloat(opts.minConfidence),
303
+ direction: opts.direction,
282
304
  };
283
305
 
284
306
  let output;
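The new `--direction` option is passed through to the Mermaid exporter. The sketch below shows how a direction value typically ends up in the first line of a Mermaid flowchart. It is not the package's `exportMermaid`: the function names `toMermaid` and `nodeId`, the edge shape, and the validation step are assumptions for illustration.

```javascript
// Directions accepted by the new CLI option, per its help text.
const DIRECTIONS = ['TB', 'LR', 'RL', 'BT'];

// Mermaid node ids cannot contain slashes or dots, so derive a safe id from the path.
function nodeId(name) {
  return name.replace(/[^A-Za-z0-9_]/g, '_');
}

// edges: array of { source, target } file or symbol names (hypothetical shape).
function toMermaid(edges, direction = 'LR') {
  if (!DIRECTIONS.includes(direction)) {
    throw new Error(`Invalid direction "${direction}". Use one of: ${DIRECTIONS.join(', ')}`);
  }
  const lines = [`flowchart ${direction}`];
  for (const { source, target } of edges) {
    lines.push(`  ${nodeId(source)}["${source}"] --> ${nodeId(target)}["${target}"]`);
  }
  return lines.join('\n');
}
```

With `direction` set to `TB`, the output starts with `flowchart TB`, which renders the graph top to bottom instead of the default left to right.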
@@ -395,8 +417,15 @@ registry
395
417
  .command('prune')
396
418
  .description('Remove stale registry entries (missing directories or idle beyond TTL)')
397
419
  .option('--ttl <days>', 'Days of inactivity before pruning (default: 30)', '30')
420
+ .option('--exclude <names>', 'Comma-separated repo names to preserve from pruning')
398
421
  .action((opts) => {
399
- const pruned = pruneRegistry(undefined, parseInt(opts.ttl, 10));
422
+ const excludeNames = opts.exclude
423
+ ? opts.exclude
424
+ .split(',')
425
+ .map((s) => s.trim())
426
+ .filter((s) => s.length > 0)
427
+ : [];
428
+ const pruned = pruneRegistry(undefined, parseInt(opts.ttl, 10), excludeNames);
400
429
  if (pruned.length === 0) {
401
430
  console.log('No stale entries found.');
402
431
  } else {
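The `--exclude` handling above splits a comma-separated string, trims each name, and drops empty entries before handing the list to `pruneRegistry`. The sketch below reproduces that parsing and pairs it with a simplified TTL filter. `pruneRegistry` itself is not shown in this diff, so `selectStale`, the `lastAccessed` field, and the idle-only check (the real command also prunes missing directories) are assumptions.

```javascript
// Same parsing as the CLI action: split on commas, trim, drop empties.
function parseNameList(value) {
  return value
    ? value
        .split(',')
        .map((s) => s.trim())
        .filter((s) => s.length > 0)
    : [];
}

// Hypothetical stand-in for the pruning decision: idle beyond the TTL and not excluded.
// entries: array of { name, lastAccessed } with lastAccessed in epoch milliseconds.
function selectStale(entries, ttlDays, excludeNames = [], now = Date.now()) {
  const cutoff = now - ttlDays * 24 * 60 * 60 * 1000;
  const keep = new Set(excludeNames);
  return entries
    .filter((entry) => !keep.has(entry.name) && entry.lastAccessed < cutoff)
    .map((entry) => entry.name);
}
```

Filtering out empty strings matters for inputs such as `--exclude "a, ,b,"`, which would otherwise yield blank names.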
@@ -414,12 +443,13 @@ program
414
443
  .command('models')
415
444
  .description('List available embedding models')
416
445
  .action(() => {
446
+ const defaultModel = config.embeddings?.model || DEFAULT_MODEL;
417
447
  console.log('\nAvailable embedding models:\n');
418
- for (const [key, config] of Object.entries(MODELS)) {
419
- const def = key === 'minilm' ? ' (default)' : '';
420
- const ctx = config.contextWindow ? `${config.contextWindow} ctx` : '';
448
+ for (const [key, cfg] of Object.entries(MODELS)) {
449
+ const def = key === defaultModel ? ' (default)' : '';
450
+ const ctx = cfg.contextWindow ? `${cfg.contextWindow} ctx` : '';
421
451
  console.log(
422
- ` ${key.padEnd(12)} ${String(config.dim).padStart(4)}d ${ctx.padEnd(9)} ${config.desc}${def}`,
452
+ ` ${key.padEnd(12)} ${String(cfg.dim).padStart(4)}d ${ctx.padEnd(9)} ${cfg.desc}${def}`,
423
453
  );
424
454
  }
425
455
  console.log('\nUsage: codegraph embed --model <name> --strategy <structured|source>');
@@ -433,8 +463,7 @@ program
433
463
  )
434
464
  .option(
435
465
  '-m, --model <name>',
436
- 'Embedding model: minilm (default), jina-small, jina-base, jina-code, nomic, nomic-v1.5, bge-large. Run `codegraph models` for details',
437
- 'minilm',
466
+ 'Embedding model (default from config or minilm). Run `codegraph models` for details',
438
467
  )
439
468
  .option(
440
469
  '-s, --strategy <name>',
@@ -449,7 +478,8 @@ program
449
478
  process.exit(1);
450
479
  }
451
480
  const root = path.resolve(dir || '.');
452
- await buildEmbeddings(root, opts.model, undefined, { strategy: opts.strategy });
481
+ const model = opts.model || config.embeddings?.model || DEFAULT_MODEL;
482
+ await buildEmbeddings(root, model, undefined, { strategy: opts.strategy });
453
483
  });
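The change above removes the hard-coded `'minilm'` default from the `--model` option and resolves the model at call time instead. The precedence can be sketched as follows; the `DEFAULT_MODEL` value here is an assumption based on the option help text ("default from config or minilm"), since `embedder.js` is not shown.

```javascript
// Assumed value; the real constant is exported from embedder.js.
const DEFAULT_MODEL = 'minilm';

// Precedence: explicit CLI flag, then config file, then the built-in default.
function resolveModel(opts, config) {
  return opts.model || config.embeddings?.model || DEFAULT_MODEL;
}
```

Dropping the option-level default is what makes this work: when an option declares a default, `opts.model` is always set, so a configured model would never be consulted.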
454
484
 
455
485
  program
@@ -464,6 +494,7 @@ program
464
494
  .option('-k, --kind <kind>', 'Filter by kind: function, method, class')
465
495
  .option('--file <pattern>', 'Filter by file path pattern')
466
496
  .option('--rrf-k <number>', 'RRF k parameter for multi-query ranking', '60')
497
+ .option('-j, --json', 'Output as JSON')
467
498
  .action(async (query, opts) => {
468
499
  await search(query, opts.db, {
469
500
  limit: parseInt(opts.limit, 10),
@@ -473,6 +504,7 @@ program
473
504
  kind: opts.kind,
474
505
  filePattern: opts.file,
475
506
  rrfK: parseInt(opts.rrfK, 10),
507
+ json: opts.json,
476
508
  });
477
509
  });
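The `--rrf-k` option (default 60) controls Reciprocal Rank Fusion for multi-query search. A minimal sketch of the standard RRF formula follows: each result scores the sum of 1 / (k + rank) across the ranked lists it appears in. The function name `rrfFuse`, the use of 1-based ranks, and the input shape are assumptions; the package's own fusion code in `embedder.js` is not part of this diff.

```javascript
// rankedLists: one array of result ids per query, best match first.
function rrfFuse(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1; // 1-based rank (assumed convention)
      scores.set(id, (scores.get(id) || 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

A larger `k` flattens the difference between top and lower ranks, so results that appear in many lists gain weight relative to results that rank first in only one.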
478
510
 
@@ -529,6 +561,90 @@ program
529
561
  }
530
562
  });
531
563
 
564
+ program
565
+ .command('roles')
566
+ .description('Show node role classification: entry, core, utility, adapter, dead, leaf')
567
+ .option('-d, --db <path>', 'Path to graph.db')
568
+ .option('--role <role>', `Filter by role (${VALID_ROLES.join(', ')})`)
569
+ .option('-f, --file <path>', 'Scope to a specific file (partial match)')
570
+ .option('-T, --no-tests', 'Exclude test/spec files')
571
+ .option('--include-tests', 'Include test/spec files (overrides excludeTests config)')
572
+ .option('-j, --json', 'Output as JSON')
573
+ .action((opts) => {
574
+ if (opts.role && !VALID_ROLES.includes(opts.role)) {
575
+ console.error(`Invalid role "${opts.role}". Valid roles: ${VALID_ROLES.join(', ')}`);
576
+ process.exit(1);
577
+ }
578
+ roles(opts.db, {
579
+ role: opts.role,
580
+ file: opts.file,
581
+ noTests: resolveNoTests(opts),
582
+ json: opts.json,
583
+ });
584
+ });
585
+
586
+ program
587
+ .command('co-change [file]')
588
+ .description(
589
+ 'Analyze git history for files that change together. Use --analyze to scan, or query existing data.',
590
+ )
591
+ .option('--analyze', 'Scan git history and populate co-change data')
592
+ .option('--since <date>', 'Git date for history window (default: "1 year ago")')
593
+ .option('--min-support <n>', 'Minimum co-occurrence count (default: 3)')
594
+ .option('--min-jaccard <n>', 'Minimum Jaccard similarity 0-1 (default: 0.3)')
595
+ .option('--full', 'Force full re-scan (ignore incremental state)')
596
+ .option('-n, --limit <n>', 'Max results', '20')
597
+ .option('-d, --db <path>', 'Path to graph.db')
598
+ .option('-T, --no-tests', 'Exclude test/spec files')
599
+ .option('--include-tests', 'Include test/spec files (overrides excludeTests config)')
600
+ .option('-j, --json', 'Output as JSON')
601
+ .action(async (file, opts) => {
602
+ const { analyzeCoChanges, coChangeData, coChangeTopData, formatCoChange, formatCoChangeTop } =
603
+ await import('./cochange.js');
604
+
605
+ if (opts.analyze) {
606
+ const result = analyzeCoChanges(opts.db, {
607
+ since: opts.since || config.coChange?.since,
608
+ minSupport: opts.minSupport ? parseInt(opts.minSupport, 10) : config.coChange?.minSupport,
609
+ maxFilesPerCommit: config.coChange?.maxFilesPerCommit,
610
+ full: opts.full,
611
+ });
612
+ if (opts.json) {
613
+ console.log(JSON.stringify(result, null, 2));
614
+ } else if (result.error) {
615
+ console.error(result.error);
616
+ process.exit(1);
617
+ } else {
618
+ console.log(
619
+ `\nCo-change analysis complete: ${result.pairsFound} pairs from ${result.commitsScanned} commits (since: ${result.since})\n`,
620
+ );
621
+ }
622
+ return;
623
+ }
624
+
625
+ const queryOpts = {
626
+ limit: parseInt(opts.limit, 10),
627
+ minJaccard: opts.minJaccard ? parseFloat(opts.minJaccard) : config.coChange?.minJaccard,
628
+ noTests: resolveNoTests(opts),
629
+ };
630
+
631
+ if (file) {
632
+ const data = coChangeData(file, opts.db, queryOpts);
633
+ if (opts.json) {
634
+ console.log(JSON.stringify(data, null, 2));
635
+ } else {
636
+ console.log(formatCoChange(data));
637
+ }
638
+ } else {
639
+ const data = coChangeTopData(opts.db, queryOpts);
640
+ if (opts.json) {
641
+ console.log(JSON.stringify(data, null, 2));
642
+ } else {
643
+ console.log(formatCoChangeTop(data));
644
+ }
645
+ }
646
+ });
647
+
532
648
  program
533
649
  .command('watch [dir]')
534
650
  .description('Watch project for file changes and incrementally update the graph')