engramx 0.4.3 → 0.5.0

package/README.md CHANGED
@@ -15,28 +15,28 @@
15
15
  <a href="https://github.com/NickCirv/engram/actions"><img src="https://github.com/NickCirv/engram/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
16
16
  <img src="https://img.shields.io/badge/license-Apache%202.0-blue" alt="License">
17
17
  <img src="https://img.shields.io/badge/node-%3E%3D20-brightgreen" alt="Node">
18
- <img src="https://img.shields.io/badge/tests-486%20passing-brightgreen" alt="Tests">
18
+ <img src="https://img.shields.io/badge/tests-520%20passing-brightgreen" alt="Tests">
19
19
  <img src="https://img.shields.io/badge/LLM%20cost-$0-green" alt="Zero LLM cost">
20
20
  <img src="https://img.shields.io/badge/native%20deps-zero-green" alt="Zero native deps">
21
- <img src="https://img.shields.io/badge/token%20reduction-82%25-orange" alt="82% token reduction">
21
+ <img src="https://img.shields.io/badge/token%20savings-up%20to%2090%25-orange" alt="Up to 90% token savings">
22
22
  </p>
23
23
 
24
24
  ---
25
25
 
26
- # The structural code graph your AI agent can't forget to use.
26
+ # The context spine for AI coding agents.
27
27
 
28
- **Context rot is empirically solved.** Chroma's July 2025 research proved that even Claude Opus 4.6 scores only 76% on MRCR v2 8-needle at 1M tokens. Long context windows don't save you; they drown you. engram is the answer: a **structural graph** of your codebase that replaces file reads with ~300-token summaries *before the agent sees them*.
28
+ **One call replaces five.** engram intercepts file reads and serves rich context packets: structural summaries, decisions, library docs, known issues, and git history, assembled from 6 providers in a single ~500-token response. Your AI agent gets everything it needs without making 5 separate tool calls.
29
29
 
30
- engram installs a Claude Code hook layer at the tool-call boundary. Every `Read`, `Edit`, `Write`, and `Bash cat` gets intercepted. When the graph has confident coverage of a file, the raw read never happens; the agent sees a structural summary instead.
30
+ engram installs a hook layer at the Claude Code tool-call boundary. Every `Read`, `Edit`, `Write`, and `Bash cat` gets intercepted. When the graph has confident coverage of a file, it serves a **rich context packet** combining structure, decisions, docs, and history — all pre-assembled, all within budget.
31
31
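The interception decision reduces to a confidence gate at the tool boundary. A minimal sketch, with the graph object, the 0.7 threshold, and the response shape all assumed for illustration (not engram's actual source):

```javascript
// Illustrative sketch of the tool-call-boundary decision, not engram's
// actual source: graph shape, 0.7 threshold, and response fields assumed.
function handleRead(filePath, graph) {
  const packet = graph.contextPacketFor(filePath); // null when no coverage
  if (packet && packet.confidence >= 0.7) {
    // Confident coverage: suppress the raw read, serve the packet instead.
    return { decision: "deny", reason: packet.text };
  }
  // No coverage or low confidence: the raw file read proceeds untouched.
  return { decision: "allow" };
}

// Toy graph with confident coverage for a single file:
const graph = {
  contextPacketFor: (p) =>
    p === "src/store.ts"
      ? { confidence: 0.9, text: "STRUCTURE: GraphStore (class), 12 methods" }
      : null,
};
```

Either branch returns a verdict; the agent only ever sees raw file bytes when the graph declines to answer.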
 
32
- Not a memory tool. Not a RAG layer. Not a context manager. **A structural code graph with a Claude Code hook layer that turns it into the memory your agent can't forget to exist.**
32
+ Not a memory tool. Not a RAG layer. **The context spine that connects your knowledge graph, semantic memory, library docs, and project notes into a single context layer your AI agent can't forget to use.**
33
33
 
34
34
  | What it is | What it isn't |
35
35
  |---|---|
36
- | Structural code graph (AST + git + session miners) | Prose memory like Anthropic's MEMORY.md |
36
+ | Context spine (graph + 6 providers assembled per read) | A single-purpose file summarizer |
37
37
  | Local SQLite, zero cloud, zero native deps | Vector RAG that phones home |
38
38
  | Hook-based interception at the tool boundary | A tool the agent has to remember to call |
39
- | 82% measured token reduction on real code | Another LongMemEval chatbot benchmark |
39
+ | Up to 90% session-level token savings | A theoretical benchmark |
40
40
  | Complements native Claude memory | Competes with native Claude memory |
41
41
 
42
42
  ```bash
@@ -46,27 +46,23 @@ engram init # scan codebase → .engram/graph.db (~40 ms, 0 tokens)
46
46
  engram install-hook # wire the Sentinel into Claude Code
47
47
  ```
48
48
 
49
- ```bash
50
- npm install -g engramx
51
- cd ~/my-project
52
- engram init # scan codebase → .engram/graph.db
53
- engram install-hook # wire into Claude Code (project-local)
54
- ```
55
-
56
49
  That's it. The next Claude Code session in that directory automatically:
57
50
 
58
- - **Replaces file reads with graph summaries** (Read intercept, deny+reason)
51
+ - **Serves rich context packets**: structure + git changes + decisions + library docs in one response (Context Spine)
52
+ - **Warms provider caches at session start** — MemPalace, Context7, Obsidian pre-fetched in background
59
53
  - **Warns before edits that hit known mistakes** (Edit landmine injection)
60
54
  - **Pre-loads relevant context when you ask a question** (UserPromptSubmit pre-query)
61
- - **Injects a project brief at session start** (SessionStart additionalContext)
62
- - **Logs every decision for `engram hook-stats`** (PostToolUse observer)
63
-
64
- ## Architecture Diagram
55
+ - **Injects a project brief at session start** with semantic context (SessionStart)
56
+ - **Survives context compaction**: re-injects critical nodes before compression (PreCompact)
57
+ - **Auto-switches project context** when you navigate to a different repo (CwdChanged)
58
+ - **Shows live HUD in Claude Code status bar** — auto-configured on `install-hook`
65
59
 
66
- An 11-page visual walkthrough of the full lifecycle — the four hook events, the Read handler's 9-branch decision tree (with a real JSON response), the six-layer ecosystem substrate, and measured numbers from engram's own code.
60
+ ## Docs
67
61
 
68
- - 📄 **[View the PDF](docs/engram-sentinel-ecosystem.pdf)** — A3 landscape, 11 pages, 1 MB. GitHub renders it inline when clicked.
69
- - 🌐 **[HTML source](docs/engram-sentinel-ecosystem.html)** — single self-contained file. Download raw and open in any browser for the interactive scroll-reveal version.
62
+ - **[Architecture Diagram](docs/engram-sentinel-ecosystem.pdf)** — 11-page visual walkthrough of the full lifecycle. [HTML version](docs/engram-sentinel-ecosystem.html).
63
+ - **[Installation Guide](docs/engram-user-manual.html)** — AAA-designed step-by-step setup guide with experience tiers.
64
+ - **[Integration Guide](docs/engram-integration-guide.html)** — how memory tools, compression plugins, and workflow managers integrate with engram. Real token savings numbers + code examples.
65
+ - **[INTEGRATION.md](docs/INTEGRATION.md)** — MCP server setup and programmatic API reference.
70
66
 
71
67
  ## The Problem
72
68
 
@@ -117,9 +113,10 @@ Each tier builds on the previous. You can stop at any level — each one works s
117
113
  | Tier | What you run | What you get | Token savings |
118
114
  |---|---|---|---|
119
115
  | **1. Graph only** | `engram init` | CLI queries, MCP server, `engram gen` for CLAUDE.md | ~6x per query vs reading files |
120
- | **2. + Sentinel hooks** | `engram install-hook` | Automatic Read interception, Edit landmine warnings, session-start briefs, prompt pre-query | ~82% per session (measured) |
121
- | **3. + Skills index** | `engram init --with-skills` | Graph includes your `~/.claude/skills/`; queries surface relevant skills alongside code | ~23% overhead on graph size |
122
- | **4. + Git hooks** | `engram hooks install` | Auto-rebuild graph on every `git commit`; graph never goes stale | Zero token cost |
116
+ | **2. + Sentinel hooks** | `engram install-hook` | Automatic Read interception, Edit warnings, session briefs, HUD | ~80% per session |
117
+ | **3. + Context Spine** | Configure providers.json | Rich packets: structure + decisions + docs + git in one response | Up to 90% session-level |
118
+ | **4. + Skills index** | `engram init --with-skills` | Graph includes your `~/.claude/skills/`; queries surface relevant skills | ~23% overhead on graph size |
119
+ | **5. + Git hooks** | `engram hooks install` | Auto-rebuild graph on every `git commit` — graph never goes stale | Zero token cost |
123
120
 
124
121
  **Recommended full setup** (one-time, per project):
125
122
 
@@ -161,7 +158,7 @@ engram gen --task bug-fix # Task-aware view (general|bug-fix|feature|refac
161
158
  engram hooks install # Auto-rebuild graph on git commit
162
159
  ```
163
160
 
164
- ### Sentinel (v0.3 — new)
161
+ ### Sentinel (v0.3+)
165
162
 
166
163
  ```bash
167
164
  engram intercept # Hook entry point (called by Claude Code, reads stdin)
@@ -178,9 +175,51 @@ engram hook-disable # Kill switch (touch .engram/hook-disabl
178
175
  engram hook-enable # Remove kill switch
179
176
  ```
180
177
 
178
+ ### Infrastructure (v0.4 — new)
179
+
180
+ ```bash
181
+ engram watch [path] # Live file watcher — incremental re-index on save
182
+ engram dashboard [path] # Live terminal dashboard (token savings, hit rate, top files)
183
+ engram hud [path] # Alias for dashboard
184
+ engram hud-label [path] # JSON label for Claude HUD --extra-cmd integration
185
+ ```
186
+
187
+ **Claude HUD integration** — add `--extra-cmd="engram hud-label"` to your Claude HUD statusLine command and see live savings in your Claude Code status bar:
188
+
189
+ ```
190
+ ⚡engram 48.5K saved ▰▰▰▰▰▰▰▰▱▱ 75%
191
+ ```
192
+
193
+ ## Context Spine (v0.5 — new)
194
+
195
+ The Context Spine assembles rich context from 6 providers into a single response per file read:
196
+
197
+ | Provider | Tier | What it adds | Latency |
198
+ |----------|------|-------------|---------|
199
+ | `engram:structure` | Internal | Functions, classes, imports, edges | <50ms |
200
+ | `engram:mistakes` | Internal | Known bugs and past failures | <10ms |
201
+ | `engram:git` | Internal | Last modified, author, churn rate | <100ms |
202
+ | `mempalace` | External (cached) | Decisions, learnings, project context | <5ms cached |
203
+ | `context7` | External (cached) | Library API docs for imports | <5ms cached |
204
+ | `obsidian` | External (cached) | Project notes, architecture docs | <5ms cached |
205
+
206
+ External providers cache results in SQLite at SessionStart. Per-Read resolution is a cache lookup (<5ms), not a live query. If any provider is unavailable, it's silently skipped — you always get at least the structural summary.
207
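The staleness rule is the same on every path: an entry is live while `cached_at + ttl * 1000` is still in the future (`cached_at` in milliseconds, `ttl` in seconds). A minimal sketch of that predicate, with the function and field names assumed for illustration:

```javascript
// Minimal sketch of the provider-cache staleness rule: an entry is live
// while cached_at (ms) + ttl (s) * 1000 is in the future. Names here are
// illustrative, not engram's API; the real store keeps this in SQLite.
function isLive(entry, now = Date.now()) {
  return entry.cachedAt + entry.ttl * 1000 > now;
}

function lookup(cache, provider, filePath, now = Date.now()) {
  const entry = cache.get(`${provider}|${filePath}`);
  return entry && isLive(entry, now) ? entry.content : null; // stale = miss
}

const cache = new Map([
  ["context7|src/api.ts", { cachedAt: 0, ttl: 3600, content: "fetch() docs" }],
]);
```

A stale entry simply reads as a miss, so a skipped or late provider degrades to "no extra context" rather than an error.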
+
208
+ Configure providers in `.engram/providers.json` (optional — auto-detection works for most setups):
209
+
210
+ ```json
211
+ {
212
+ "providers": {
213
+ "mempalace": { "enabled": true },
214
+ "context7": { "enabled": true },
215
+ "obsidian": { "enabled": true, "vault": "~/vault" }
216
+ }
217
+ }
218
+ ```
219
+
181
220
  ## How the Sentinel Layer Works
182
221
 
183
- Seven hook handlers compose the interception stack:
222
+ Nine hook handlers compose the interception stack:
184
223
 
185
224
  | Hook | Mechanism | What it does |
186
225
  |---|---|---|
@@ -188,9 +227,11 @@ Seven hook handlers compose the interception stack:
188
227
  | **`PreToolUse:Edit`** | `allow + additionalContext` | Never blocks writes. If the file has known past mistakes, injects them as a landmine warning alongside the edit. |
189
228
  | **`PreToolUse:Write`** | Same as Edit | Advisory landmine injection. |
190
229
  | **`PreToolUse:Bash`** | Parse + delegate | Detects `cat|head|tail|less|more <single-file>` invocations (strict parser, rejects any shell metacharacter) and delegates to the Read handler. Closes the Bash workaround loophole. |
191
- | **`SessionStart`** | `additionalContext` | Injects a compact project brief (god nodes + graph stats + top landmines + git branch) on source=startup/clear/compact. Passes through on resume. |
230
+ | **`SessionStart`** | `additionalContext` | Injects a compact project brief (god nodes + graph stats + top landmines + git branch) on source=startup/clear/compact. Bundles mempalace semantic context in parallel if available. Passes through on resume. |
192
231
  | **`UserPromptSubmit`** | `additionalContext` | Extracts keywords from the user's message, runs a ≤500-token pre-query, injects results. Skipped for short or generic prompts. Raw prompt content is never logged. |
193
- | **`PostToolUse`** | Observer | Pure logger. Writes tool/path/outputSize/success/decision to `.engram/hook-log.jsonl` for `hook-stats` and v0.3.1 self-tuning. |
232
+ | **`PostToolUse`** | Observer | Pure logger. Writes tool/path/outputSize/success/decision to `.engram/hook-log.jsonl` for `hook-stats`. |
233
+ | **`PreCompact`** | `additionalContext` | Re-injects god nodes + active landmines right before Claude compresses the conversation. First tool in the ecosystem whose context survives compaction. |
234
+ | **`CwdChanged`** | `additionalContext` | Auto-switches project context when the user navigates to a different repo mid-session. Injects a compact brief for the new project. |
194
235
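The PreCompact idea can be sketched in a few lines; the helper names and payload shape below are assumptions for illustration, not engram's actual API:

```javascript
// Illustrative PreCompact sketch: before the conversation is compressed,
// re-emit the context that must survive. graph helpers and the
// additionalContext payload shape are assumed, not engram's real API.
function preCompactContext(graph) {
  const gods = graph.godNodes(3);         // highest-degree structural nodes
  const mines = graph.activeLandmines(3); // known past mistakes
  const lines = [
    "CRITICAL CONTEXT (re-injected before compaction):",
    ...gods.map((n) => `GOD ${n.label} (${n.degree} edges)`),
    ...mines.map((m) => `LANDMINE ${m.file}: ${m.note}`),
  ];
  return { additionalContext: lines.join("\n") };
}

// Toy graph for the sketch:
const graph = {
  godNodes: () => [{ label: "GraphStore", degree: 42 }],
  activeLandmines: () => [
    { file: "src/store.ts", note: "sql.js needs save() after writes" },
  ],
};
```

Because the payload is re-emitted immediately before compression, it sits inside the window Claude keeps rather than the tail it discards.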
 
195
236
  ### Ten safety invariants, enforced at runtime
196
237
 
@@ -215,7 +256,7 @@ If anything goes wrong, `engram hook-disable` flips the kill switch without unin
215
256
 
216
257
  engram runs three miners on your codebase. None of them use an LLM.
217
258
 
218
- **AST Miner** — Extracts code structure (classes, functions, imports, exports, call patterns) using pattern matching across 10 languages: TypeScript, JavaScript, Python, Go, Rust, Java, C, C++, Ruby, PHP. Zero tokens, deterministic, cached.
259
+ **Heuristic Code Miner** — Extracts code structure (classes, functions, imports, exports, call patterns) using regex heuristics across 10 languages: TypeScript, JavaScript, Python, Go, Rust, Java, C, C++, Ruby, PHP. Confidence-scored (0.85 for regex extraction, 1.0 reserved for future tree-sitter). Zero tokens, deterministic, cached.
219
260
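A toy version of the regex extraction, with a deliberately tiny pattern set (the real miner covers classes, imports, exports, and ten languages):

```javascript
// Toy regex miner for JS/TS function declarations. Simplified from the
// approach described above; patterns and fields here are illustrative.
const FN_PATTERNS = [
  /^\s*(?:export\s+)?(?:async\s+)?function\s+(\w+)/,
  /^\s*(?:export\s+)?const\s+(\w+)\s*=\s*(?:async\s*)?\(/, // arrow fns
];

function mineFunctions(source) {
  const nodes = [];
  const lines = source.split("\n");
  for (let i = 0; i < lines.length; i++) {
    const trimmed = lines[i].trimStart();
    // Skip comment lines, mirroring the miner's false-positive guard.
    if (trimmed.startsWith("//") || trimmed.startsWith("*")) continue;
    for (const pat of FN_PATTERNS) {
      const m = lines[i].match(pat);
      if (m?.[1]) {
        // Regex extraction is heuristic, hence 0.85 rather than 1.0.
        nodes.push({
          label: `${m[1]}()`,
          kind: "function",
          line: i + 1,
          confidenceScore: 0.85,
        });
        break;
      }
    }
  }
  return nodes;
}
```

The 0.85 score flows through to summaries, so downstream consumers can tell heuristic extraction apart from a future exact parser.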
 
220
261
  **Git Miner** — Reads `git log` for co-change patterns (files that change together), hot files (most frequently modified), and authorship. Creates INFERRED edges between structurally coupled files.
221
262
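The co-change count itself is a simple pair tally over commits. A sketch assuming commits have already been parsed into file lists (the real miner reads `git log` output):

```javascript
// Sketch of co-change mining: given commits as lists of touched files,
// count how often each file pair changes together. The input shape is an
// assumption for the example; the real miner parses `git log` itself.
function coChangePairs(commits) {
  const counts = new Map();
  for (const files of commits) {
    const sorted = [...files].sort(); // canonical pair order
    for (let i = 0; i < sorted.length; i++) {
      for (let j = i + 1; j < sorted.length; j++) {
        const key = `${sorted[i]}|${sorted[j]}`;
        counts.set(key, (counts.get(key) ?? 0) + 1);
      }
    }
  }
  return counts;
}

const commits = [
  ["src/store.ts", "src/query.ts"],
  ["src/store.ts", "src/query.ts", "README.md"],
];
```

Pairs with high counts become INFERRED edges: the files are coupled by history even when no import connects them.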
 
@@ -310,7 +310,7 @@ function writeToFile(filePath, summary) {
310
310
  writeFileSync2(filePath, newContent);
311
311
  }
312
312
  async function autogen(projectRoot, target, task) {
313
- const { getStore } = await import("./core-WTKXDUDO.js");
313
+ const { getStore } = await import("./core-VUVXLXZN.js");
314
314
  const store = await getStore(projectRoot);
315
315
  try {
316
316
  let view = VIEWS.general;
@@ -59,6 +59,16 @@ var GraphStore = class _GraphStore {
59
59
  key TEXT PRIMARY KEY,
60
60
  value TEXT NOT NULL
61
61
  );
62
+
63
+ CREATE TABLE IF NOT EXISTS provider_cache (
64
+ provider TEXT NOT NULL,
65
+ file_path TEXT NOT NULL,
66
+ content TEXT NOT NULL,
67
+ query_used TEXT NOT NULL DEFAULT '',
68
+ cached_at INTEGER NOT NULL,
69
+ ttl INTEGER NOT NULL DEFAULT 3600,
70
+ PRIMARY KEY (provider, file_path)
71
+ );
62
72
  `);
63
73
  const indexes = [
64
74
  "CREATE INDEX IF NOT EXISTS idx_nodes_kind ON nodes(kind)",
@@ -66,7 +76,9 @@ var GraphStore = class _GraphStore {
66
76
  "CREATE INDEX IF NOT EXISTS idx_edges_source ON edges(source)",
67
77
  "CREATE INDEX IF NOT EXISTS idx_edges_target ON edges(target)",
68
78
  "CREATE INDEX IF NOT EXISTS idx_edges_relation ON edges(relation)",
69
- "CREATE INDEX IF NOT EXISTS idx_edges_source_file ON edges(source_file)"
79
+ "CREATE INDEX IF NOT EXISTS idx_edges_source_file ON edges(source_file)",
80
+ "CREATE INDEX IF NOT EXISTS idx_cache_file ON provider_cache(file_path)",
81
+ "CREATE INDEX IF NOT EXISTS idx_cache_stale ON provider_cache(cached_at)"
70
82
  ];
71
83
  for (const sql of indexes) {
72
84
  try {
@@ -149,10 +161,11 @@ var GraphStore = class _GraphStore {
149
161
  return null;
150
162
  }
151
163
  searchNodes(query2, limit = 20) {
152
- const pattern = `%${query2}%`;
164
+ const escaped = query2.replace(/\\/g, "\\\\").replace(/%/g, "\\%").replace(/_/g, "\\_");
165
+ const pattern = `%${escaped}%`;
153
166
  const results = [];
154
167
  const stmt = this.db.prepare(
155
- "SELECT * FROM nodes WHERE label LIKE ? OR id LIKE ? ORDER BY query_count DESC LIMIT ?"
168
+ "SELECT * FROM nodes WHERE label LIKE ? ESCAPE '\\' OR id LIKE ? ESCAPE '\\' ORDER BY query_count DESC LIMIT ?"
156
169
  );
157
170
  stmt.bind([pattern, pattern, limit]);
158
171
  while (stmt.step()) {
@@ -198,6 +211,41 @@ var GraphStore = class _GraphStore {
198
211
  stmt.free();
199
212
  return results;
200
213
  }
214
+ getNodesByFile(sourceFile, limit = 500) {
215
+ const results = [];
216
+ const stmt = this.db.prepare(
217
+ "SELECT * FROM nodes WHERE source_file = ? LIMIT ?"
218
+ );
219
+ stmt.bind([sourceFile, limit]);
220
+ while (stmt.step()) {
221
+ results.push(this.rowToNode(stmt.getAsObject()));
222
+ }
223
+ stmt.free();
224
+ return results;
225
+ }
226
+ getEdgesForNodes(nodeIds) {
227
+ if (nodeIds.length === 0) return [];
228
+ const CHUNK = 400;
229
+ const seen = /* @__PURE__ */ new Set();
230
+ const results = [];
231
+ for (let i = 0; i < nodeIds.length; i += CHUNK) {
232
+ const chunk = nodeIds.slice(i, i + CHUNK);
233
+ const placeholders = chunk.map(() => "?").join(",");
234
+ const sql = `SELECT * FROM edges WHERE source IN (${placeholders}) OR target IN (${placeholders})`;
235
+ const stmt = this.db.prepare(sql);
236
+ stmt.bind([...chunk, ...chunk]);
237
+ while (stmt.step()) {
238
+ const edge = this.rowToEdge(stmt.getAsObject());
239
+ const key = `${edge.source}|${edge.target}|${edge.relation}`;
240
+ if (!seen.has(key)) {
241
+ seen.add(key);
242
+ results.push(edge);
243
+ }
244
+ }
245
+ stmt.free();
246
+ }
247
+ return results;
248
+ }
201
249
  getAllNodes() {
202
250
  const results = [];
203
251
  const stmt = this.db.prepare("SELECT * FROM nodes");
@@ -277,7 +325,135 @@ var GraphStore = class _GraphStore {
277
325
  this.db.run("DELETE FROM nodes");
278
326
  this.db.run("DELETE FROM edges");
279
327
  this.db.run("DELETE FROM stats");
328
+ this.db.run("DELETE FROM provider_cache");
329
+ }
330
+ // ─── Provider Cache ─────────────────────────────────────────────
331
+ /**
332
+ * Get all cached provider results for a file. Returns only non-stale
333
+ * entries (cached_at + ttl > now).
334
+ */
335
+ getCachedContext(filePath) {
336
+ const now = Date.now();
337
+ const results = [];
338
+ const stmt = this.db.prepare(
339
+ `SELECT * FROM provider_cache
340
+ WHERE file_path = ? AND (cached_at + ttl * 1000) > ?`
341
+ );
342
+ stmt.bind([filePath, now]);
343
+ while (stmt.step()) {
344
+ results.push(this.rowToCachedContext(stmt.getAsObject()));
345
+ }
346
+ stmt.free();
347
+ return results;
348
+ }
349
+ /**
350
+ * Get cached context for a specific provider + file. Returns null if
351
+ * missing or stale.
352
+ */
353
+ getCachedContextForProvider(provider, filePath) {
354
+ const now = Date.now();
355
+ const stmt = this.db.prepare(
356
+ `SELECT * FROM provider_cache
357
+ WHERE provider = ? AND file_path = ? AND (cached_at + ttl * 1000) > ?`
358
+ );
359
+ stmt.bind([provider, filePath, now]);
360
+ if (stmt.step()) {
361
+ const row = stmt.getAsObject();
362
+ stmt.free();
363
+ return this.rowToCachedContext(row);
364
+ }
365
+ stmt.free();
366
+ return null;
367
+ }
368
+ /**
369
+ * Upsert a single cached provider result.
370
+ */
371
+ setCachedContext(provider, filePath, content, ttl, queryUsed = "") {
372
+ this.db.run(
373
+ `INSERT OR REPLACE INTO provider_cache
374
+ (provider, file_path, content, query_used, cached_at, ttl)
375
+ VALUES (?, ?, ?, ?, ?, ?)`,
376
+ [provider, filePath, content, queryUsed, Date.now(), ttl]
377
+ );
378
+ }
379
+ /**
380
+ * Bulk insert/replace cache entries for a provider. Uses a transaction
381
+ * for performance. Called by provider warmup at SessionStart.
382
+ */
383
+ warmCache(provider, entries, ttl, queryUsed = "") {
384
+ if (entries.length === 0) return;
385
+ this.db.run("BEGIN TRANSACTION");
386
+ try {
387
+ for (const entry of entries) {
388
+ this.db.run(
389
+ `INSERT OR REPLACE INTO provider_cache
390
+ (provider, file_path, content, query_used, cached_at, ttl)
391
+ VALUES (?, ?, ?, ?, ?, ?)`,
392
+ [provider, entry.filePath, entry.content, queryUsed, Date.now(), ttl]
393
+ );
394
+ }
395
+ this.db.run("COMMIT");
396
+ this.save();
397
+ } catch (e) {
398
+ this.db.run("ROLLBACK");
399
+ throw e;
400
+ }
401
+ }
402
+ /**
403
+ * Remove all stale cache entries. Called at SessionStart before warmup.
404
+ */
405
+ pruneStaleCache() {
406
+ const now = Date.now();
407
+ this.db.run(
408
+ "DELETE FROM provider_cache WHERE (cached_at + ttl * 1000) <= ?",
409
+ [now]
410
+ );
411
+ const changes = this.db.getRowsModified();
412
+ return changes;
413
+ }
414
+ /**
415
+ * Remove all cache entries for a provider. Used when a provider is
416
+ * disabled or its configuration changes.
417
+ */
418
+ clearProviderCache(provider) {
419
+ this.db.run("DELETE FROM provider_cache WHERE provider = ?", [provider]);
420
+ }
421
+ /**
422
+ * Get count of cached entries per provider.
423
+ */
424
+ getCacheStats() {
425
+ const now = Date.now();
426
+ const results = [];
427
+ const stmt = this.db.prepare(
428
+ `SELECT provider,
429
+ COUNT(*) as total,
430
+ SUM(CASE WHEN (cached_at + ttl * 1000) <= ? THEN 1 ELSE 0 END) as stale
431
+ FROM provider_cache
432
+ GROUP BY provider`
433
+ );
434
+ stmt.bind([now]);
435
+ while (stmt.step()) {
436
+ const row = stmt.getAsObject();
437
+ results.push({
438
+ provider: row.provider,
439
+ count: row.total,
440
+ stale: row.stale
441
+ });
442
+ }
443
+ stmt.free();
444
+ return results;
280
445
  }
446
+ rowToCachedContext(row) {
447
+ return {
448
+ provider: row.provider ?? "",
449
+ filePath: row.file_path ?? "",
450
+ content: row.content ?? "",
451
+ queryUsed: row.query_used ?? "",
452
+ cachedAt: row.cached_at ?? 0,
453
+ ttl: row.ttl ?? 3600
454
+ };
455
+ }
456
+ // ─── Lifecycle ────────────────────────────────────────────────
281
457
  close() {
282
458
  this.save();
283
459
  this.db.close();
@@ -341,9 +517,28 @@ function isHiddenKeyword(node) {
341
517
  }
342
518
  var CHARS_PER_TOKEN = 4;
343
519
  function scoreNodes(store, terms) {
344
- const allNodes = store.getAllNodes();
520
+ const seen = /* @__PURE__ */ new Set();
521
+ const seedNodes = [];
522
+ for (const t of terms) {
523
+ for (const node of store.searchNodes(t, 200)) {
524
+ if (!seen.has(node.id)) {
525
+ seen.add(node.id);
526
+ seedNodes.push(node);
527
+ }
528
+ }
529
+ }
530
+ const neighborNodes = [];
531
+ for (const node of seedNodes) {
532
+ for (const { node: neighbor } of store.getNeighbors(node.id)) {
533
+ if (!seen.has(neighbor.id)) {
534
+ seen.add(neighbor.id);
535
+ neighborNodes.push(neighbor);
536
+ }
537
+ }
538
+ }
539
+ const allCandidates = [...seedNodes, ...neighborNodes];
345
540
  const scored = [];
346
- for (const node of allNodes) {
541
+ for (const node of allCandidates) {
347
542
  const label = node.label.toLowerCase();
348
543
  const file = node.sourceFile.toLowerCase();
349
544
  let score = 0;
@@ -544,10 +739,8 @@ function renderPath(nodes, edges) {
544
739
  return `Path (${edges.length} hops): ${segments.join(" ")}`;
545
740
  }
546
741
  function renderFileStructure(store, relativeFilePath, tokenBudget = 600) {
547
- const allNodes = store.getAllNodes();
548
- const fileNodes = allNodes.filter(
549
- (n) => n.sourceFile === relativeFilePath && !isHiddenKeyword(n)
550
- );
742
+ const allFileNodes = store.getNodesByFile(relativeFilePath);
743
+ const fileNodes = allFileNodes.filter((n) => !isHiddenKeyword(n));
551
744
  if (fileNodes.length === 0) {
552
745
  return {
553
746
  text: "",
@@ -561,10 +754,10 @@ function renderFileStructure(store, relativeFilePath, tokenBudget = 600) {
561
754
  (n) => n.kind !== "file" && n.kind !== "module"
562
755
  ).length;
563
756
  const avgConfidence = fileNodes.reduce((s, n) => s + n.confidenceScore, 0) / fileNodes.length;
564
- const allEdges = store.getAllEdges();
565
757
  const fileNodeIds = new Set(fileNodes.map((n) => n.id));
758
+ const fileEdges = store.getEdgesForNodes([...fileNodeIds]);
566
759
  const degreeMap = /* @__PURE__ */ new Map();
567
- for (const e of allEdges) {
760
+ for (const e of fileEdges) {
568
761
  if (fileNodeIds.has(e.source)) {
569
762
  degreeMap.set(e.source, (degreeMap.get(e.source) ?? 0) + 1);
570
763
  }
@@ -612,15 +805,31 @@ function renderFileStructure(store, relativeFilePath, tokenBudget = 600) {
612
805
  lines.push(`NODE ${n.label} [${n.kind}] ${loc}`.trim());
613
806
  }
614
807
  }
615
- const relevantEdges = allEdges.filter(
808
+ const relevantEdges = fileEdges.filter(
616
809
  (e) => fileNodeIds.has(e.source) || fileNodeIds.has(e.target)
617
- ).slice(0, 10);
810
+ ).sort((a, b) => {
811
+ const degA = (degreeMap.get(a.source) ?? 0) + (degreeMap.get(a.target) ?? 0);
812
+ const degB = (degreeMap.get(b.source) ?? 0) + (degreeMap.get(b.target) ?? 0);
813
+ return degB - degA;
814
+ }).slice(0, 10);
815
+ const nodeById = /* @__PURE__ */ new Map();
816
+ for (const n of fileNodes) nodeById.set(n.id, n);
817
+ for (const e of relevantEdges) {
818
+ if (!nodeById.has(e.source)) {
819
+ const n = store.getNode(e.source);
820
+ if (n) nodeById.set(n.id, n);
821
+ }
822
+ if (!nodeById.has(e.target)) {
823
+ const n = store.getNode(e.target);
824
+ if (n) nodeById.set(n.id, n);
825
+ }
826
+ }
618
827
  if (relevantEdges.length > 0) {
619
828
  lines.push("");
620
829
  lines.push("Key relationships:");
621
830
  for (const e of relevantEdges) {
622
- const src = allNodes.find((n) => n.id === e.source);
623
- const tgt = allNodes.find((n) => n.id === e.target);
831
+ const src = nodeById.get(e.source);
832
+ const tgt = nodeById.get(e.target);
624
833
  if (src && tgt) {
625
834
  lines.push(`EDGE ${src.label} --${e.relation}--> ${tgt.label}`);
626
835
  }
@@ -755,7 +964,8 @@ function extractFile(filePath, rootDir) {
755
964
  sourceFile: relPath,
756
965
  sourceLocation: line ? `L${line}` : null,
757
966
  confidence: "EXTRACTED",
758
- confidenceScore: 1,
967
+ confidenceScore: 0.85,
968
+ // Regex heuristic — reserve 1.0 for tree-sitter
759
969
  lastVerified: now,
760
970
  queryCount: 0,
761
971
  metadata: { lang }
@@ -767,7 +977,8 @@ function extractFile(filePath, rootDir) {
767
977
  target,
768
978
  relation,
769
979
  confidence: "EXTRACTED",
770
- confidenceScore: 1,
980
+ confidenceScore: 0.85,
981
+ // Regex heuristic — reserve 1.0 for tree-sitter
771
982
  sourceFile: relPath,
772
983
  sourceLocation: line ? `L${line}` : null,
773
984
  lastVerified: now,
@@ -810,6 +1021,8 @@ function extractWithPatterns(content, lines, lang, fileId, stem, relPath, addNod
810
1021
  for (let i = 0; i < lines.length; i++) {
811
1022
  const line = lines[i];
812
1023
  const lineNum = i + 1;
1024
+ const trimmed = line.trimStart();
1025
+ if (trimmed.startsWith("//") || trimmed.startsWith("*")) continue;
813
1026
  for (const pat of patterns.classes) {
814
1027
  const match = line.match(pat);
815
1028
  if (match?.[1]) {
@@ -822,6 +1035,7 @@ function extractWithPatterns(content, lines, lang, fileId, stem, relPath, addNod
822
1035
  for (const pat of patterns.functions) {
823
1036
  const match = line.match(pat);
824
1037
  if (match?.[1]) {
1038
+ if (pat.source.includes("const|let") && !line.includes("=>")) continue;
825
1039
  const name = match[1];
826
1040
  const id = makeId(stem, name);
827
1041
  addNode(id, `${name}()`, "function", lineNum);
@@ -849,9 +1063,26 @@ function extractWithPatterns(content, lines, lang, fileId, stem, relPath, addNod
849
1063
  }
850
1064
  }
851
1065
  function extractGo(content, lines, fileId, stem, relPath, addNode, addEdge) {
1066
+ let inImportBlock = false;
852
1067
  for (let i = 0; i < lines.length; i++) {
853
1068
  const line = lines[i];
854
1069
  const lineNum = i + 1;
1070
+ const trimmed = line.trimStart();
1071
+ if (trimmed.startsWith("//") || trimmed.startsWith("*")) continue;
1072
+ if (/^import\s*\(/.test(line)) {
1073
+ inImportBlock = true;
1074
+ continue;
1075
+ }
1076
+ if (inImportBlock && trimmed === ")") {
1077
+ inImportBlock = false;
1078
+ continue;
1079
+ }
1080
+ const singleImport = line.match(/^import\s+"([^"]+)"/);
1081
+ if (singleImport?.[1]) {
1082
+ const module = singleImport[1].split("/").pop();
1083
+ addEdge(fileId, makeId(module), "imports", lineNum);
1084
+ continue;
1085
+ }
855
1086
  const funcMatch = line.match(
856
1087
  /^func\s+(?:\([\w\s*]+\)\s+)?(\w+)\s*\(/
857
1088
  );
@@ -869,10 +1100,12 @@ function extractGo(content, lines, fileId, stem, relPath, addNode, addEdge) {
869
1100
  addNode(id, name, kind, lineNum);
870
1101
  addEdge(fileId, id, "contains", lineNum);
871
1102
  }
872
- const importMatch = line.match(/^\s*"([^"]+)"/);
873
- if (importMatch?.[1] && i > 0 && content.includes("import")) {
874
- const module = importMatch[1].split("/").pop();
875
- addEdge(fileId, makeId(module), "imports", lineNum);
1103
+ if (inImportBlock) {
1104
+ const importMatch = line.match(/^\s*"([^"]+)"/);
1105
+ if (importMatch?.[1]) {
1106
+ const module = importMatch[1].split("/").pop();
1107
+ addEdge(fileId, makeId(module), "imports", lineNum);
1108
+ }
876
1109
  }
877
1110
  }
878
1111
  }
@@ -1824,6 +2057,7 @@ export {
1824
2057
  MAX_MISTAKE_LABEL_CHARS,
1825
2058
  queryGraph,
1826
2059
  shortestPath,
2060
+ renderFileStructure,
1827
2061
  toPosixPath,
1828
2062
  SUPPORTED_EXTENSIONS,
1829
2063
  extractFile,