claude-mem-lite 3.4.0 → 3.6.0

@@ -10,7 +10,7 @@
10
10
  "plugins": [
11
11
  {
12
12
  "name": "claude-mem-lite",
13
- "version": "3.4.0",
13
+ "version": "3.6.0",
14
14
  "source": "./",
15
15
  "description": "Persistent long-term memory for Claude Code via MCP — captures coding decisions, bugfixes, and context across sessions. Hybrid FTS5 + TF-IDF search with episode batching. Single SQLite DB, no external services. A lighter, lower-cost alternative to claude-mem (episode batching + a smaller model; cost savings are an internal estimate, not a measured benchmark)."
16
16
  }
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "claude-mem-lite",
3
- "version": "3.4.0",
3
+ "version": "3.6.0",
4
4
  "description": "Persistent long-term memory for Claude Code via MCP — captures coding decisions, bugfixes, and context across sessions. Hybrid FTS5 + TF-IDF search with episode batching. Single SQLite DB, no external services. A lighter, lower-cost alternative to claude-mem (episode batching + a smaller model; cost savings are an internal estimate, not a measured benchmark).",
5
5
  "author": {
6
6
  "name": "sdsrss"
package/README.md CHANGED
@@ -649,10 +649,15 @@ The benchmark suite runs as a CI gate (`npm run benchmark:gate`) to prevent sear
649
649
  Beyond the in-repo micro-benchmark above, claude-mem-lite is measured on
650
650
  [LongMemEval](https://github.com/xiaowu0162/LongMemEval) (Wu et al.) — a
651
651
  500-question long-term-memory benchmark — so its recall is comparable to the
652
- field, not just to itself. Metric is **recall_any@k**: is a gold evidence session
653
- in the top *k* retrieved? Corpus is user-turns-only (the standard raw-baseline
654
- rule). Runners: `benchmark/longmemeval.mjs` (lexical) and
655
- `benchmark/longmemeval-rerank.mjs` (rerank).
652
+ field, not just to itself. Metric is **recall_any@k**: does *any* gold evidence session appear in the
653
+ top *k* retrieved? This is the same session-level definition the systems we
654
+ compare against report on this split — [agentmemory](https://github.com/rohitg00/agentmemory)
655
+ (BM25 + vector + graph) and dense-embedding systems like MemPalace — so the rows
656
+ below sit on one axis, not metric-shopped. (Note: 65% of the 500 questions have
657
+ multiple gold sessions, so `recall_any@k` is looser than fractional recall there;
658
+ all systems in this comparison report the any-hit form.) Corpus is user-turns-only
659
+ (the standard raw-baseline rule). Runners: `benchmark/longmemeval.mjs` (lexical)
660
+ and `benchmark/longmemeval-rerank.mjs` (rerank).
656
661
 
657
662
  | Retriever (zero embeddings) | @1 | @5 | @10 |
658
663
  |---|---|---|---|
@@ -664,15 +669,28 @@ hands the top 20 lexical candidates to a single Haiku call (~1.4 s/query) that
664
669
  reorders them. It is **never worse than the lexical baseline by construction** —
665
670
  any LLM or parse failure falls back to the original candidate order.
666
671
 
667
- **On embeddings, honestly.** With no LLM in the loop, dense-embedding retrieval
668
- still wins on raw recall: a dense-embedding baseline reports ~96.6% @5 on this
669
- split, versus our 90.6%. The rerank row's point is that a *single cheap LLM call
670
- closes that gap*: a zero-embedding lexical stack reaches 96.8% @5, edging the
671
- embedding raw number, because the lexical candidate set is already rich enough
672
- (recall@20 = 97.8%) that ranking, not recall, is the bottleneck. An
673
- embedding-plus-rerank stack still leads when both sides spend an LLM call; the
674
- takeaway is that claude-mem-lite needs **no vector model, no Python, and no
675
- external service** to reach embedding-competitive precision.
672
+ **Stricter metric, for the record.** The rows above are `recall_any@k` (does *any*
673
+ gold session reach the top *k*?), the metric agentmemory and MemPalace publish, so the
674
+ comparison is like-for-like. Under the stricter **standard recall@k** (`|gold ∩ top-k| /
675
+ |gold|`, the *fraction* of all gold sessions retrieved), the lexical stack scores
676
+ @1 = 46.9% / @5 = 84.4% / @10 = 91.9%. The whole gap is the 65% of questions with
677
+ multiple gold sessions: any-hit needs one, fractional needs them all, and @1 is capped
678
+ at 1/|gold| there; single-gold question types score identically under both.
679
+ `benchmark/longmemeval.mjs` reports both columns (the rerank row's fractional is not yet
680
+ measured).
681
+
682
+ **On embeddings, honestly.** With no LLM in the loop, both a dense-embedding
683
+ baseline (MemPalace, ~96.6% @5) and a BM25 + vector + graph hybrid (agentmemory,
684
+ 95.2% @5) out-recall our zero-embedding lexical stack (90.6% @5) at the same
685
+ retrieval stage — dense and graph signal genuinely help raw recall, and most of
686
+ our remaining gap is paraphrase (single-session-preference is our lowest category
687
+ at 63%). The rerank row's point is that a *single cheap LLM call closes it*:
688
+ reordering the top-20 lexical candidates reaches 96.8% @5 — matching the dense raw
689
+ number and edging the hybrid's retrieval score — because the lexical candidate set
690
+ is already rich enough (recall@20 = 97.8%) that ranking, not recall, is the
691
+ bottleneck. An embedding-plus-rerank stack still leads when both sides spend an LLM
692
+ call; the takeaway is that claude-mem-lite reaches embedding-competitive precision
693
+ with **no vector model, no knowledge graph, no Python, and no external service**.
676
694
 
677
695
  Per-category @5 (lexical → +rerank): knowledge-update 98.7 → 100.0 ·
678
696
  single-session-user 91.4 → 98.6 · temporal-reasoning 89.5 → 97.7 · multi-session
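The difference between the two metrics in the README hunk above can be sketched in a few lines. This is an illustration only: `recallAnyAtK` and `recallFractionAtK` are hypothetical names, not functions from `benchmark/longmemeval.mjs`, and the session ids are made up.

```javascript
// Sketch only: illustrative helpers, not the benchmark runner's code.

// 1 if any gold session id appears in the top-k retrieved ids, else 0.
function recallAnyAtK(retrieved, gold, k) {
  const topK = new Set(retrieved.slice(0, k));
  return gold.some(id => topK.has(id)) ? 1 : 0;
}

// Fraction of gold session ids found in the top-k: |gold ∩ top-k| / |gold|.
function recallFractionAtK(retrieved, gold, k) {
  if (gold.length === 0) return 0;
  const topK = new Set(retrieved.slice(0, k));
  const hits = gold.filter(id => topK.has(id)).length;
  return hits / gold.length;
}

// Made-up question with two gold sessions; one ranks first, the other sixth.
const retrieved = ['s2', 's7', 's9', 's4', 's1', 's3'];
const gold = ['s2', 's3'];

console.log(recallAnyAtK(retrieved, gold, 1));      // 1
console.log(recallFractionAtK(retrieved, gold, 1)); // 0.5 (capped at 1/|gold|)
console.log(recallFractionAtK(retrieved, gold, 5)); // 0.5
console.log(recallFractionAtK(retrieved, gold, 6)); // 1
```

With a single gold session both functions return the same value at every k, which matches the README's remark that single-gold question types score identically under both metrics.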
package/hook.mjs CHANGED
@@ -54,6 +54,7 @@ import {
54
54
  bumpCitationAccess,
55
55
  computeCiteRecall,
56
56
  applyCitationDecay,
57
+ recordCitationFunnel,
57
58
  hasMainThreadAssistantText,
58
59
  } from './lib/citation-tracker.mjs';
59
60
  import { extractTailAssistantText, extractStructuredSummary } from './lib/summary-extractor.mjs';
@@ -572,6 +573,10 @@ async function handleStop() {
572
573
  for (const id of citeBackIds) citedMain.add(id);
573
574
  const r = applyCitationDecay(db, project, injected, citedMain, sessionId);
574
575
  debugLog('DEBUG', 'handleStop', `citation-decay: touched=${r.touched} promoted=${r.promoted} demoted=${r.demoted}`);
576
+ // R1: persist this session's invocation→cite funnel row. touched =
577
+ // obs resolved this run (denominator), promoted = obs cited this run
578
+ // (numerator). Idempotent (touched is 0 on re-fire) + best-effort.
579
+ recordCitationFunnel(db, project, sessionId, r.touched, r.promoted);
575
580
  }
576
581
  }
577
582
  } catch (e) { debugCatch(e, 'handleStop-citation-decay'); }
package/lib/citation-tracker.mjs CHANGED
@@ -544,3 +544,96 @@ export function applyCitationDecay(db, project, injectedIds, citedIds, sessionId
544
544
  try { txn(); } catch (e) { debugCatch(e, 'applyCitationDecay-txn'); return empty; }
545
545
  return { promoted, demoted, touched };
546
546
  }
547
+
548
+ /**
549
+ * R1 — persist one accumulating per-session row of the invocation→cite funnel.
550
+ * Fed by applyCitationDecay's return: `injectedDelta` = obs RESOLVED this Stop
551
+ * (touched), `citedDelta` = obs CITED this Stop (promoted). Idempotent against
552
+ * Stop multi-fire by construction — a re-fired Stop re-resolves nothing (touched
553
+ * is 0 for already-decided obs), so the no-op gate below skips it. A later turn
554
+ * that resolves NEW injections accumulates onto the same (project, session) row.
555
+ *
556
+ * Unlike the per-obs cited_count/decay_seen_count counters (lifetime-cumulative,
557
+ * session breakdown lost), this preserves the per-session series that
558
+ * computeCitationFunnelTrend reads back as a trend. Telemetry only — every write
559
+ * is wrapped so a citation_log failure can never break the Stop handler.
560
+ *
561
+ * @param {import('better-sqlite3').Database} db
562
+ * @param {string} project
563
+ * @param {string} sessionId — memory_session_id of the resolved session
564
+ * @param {number} injectedDelta — obs resolved this run (applyCitationDecay.touched)
565
+ * @param {number} citedDelta — obs cited this run (applyCitationDecay.promoted)
566
+ */
567
+ export function recordCitationFunnel(db, project, sessionId, injectedDelta, citedDelta) {
568
+ if (!db || !project || !sessionId) return;
569
+ const inj = Number(injectedDelta) || 0;
570
+ if (inj <= 0) return; // nothing resolved this run → no row noise
571
+ const cited = Math.max(0, Number(citedDelta) || 0);
572
+ try {
573
+ db.prepare(`
574
+ INSERT INTO citation_log (project, memory_session_id, resolved_at, injected_n, cited_n)
575
+ VALUES (?, ?, ?, ?, ?)
576
+ ON CONFLICT(project, memory_session_id) DO UPDATE SET
577
+ injected_n = injected_n + excluded.injected_n,
578
+ cited_n = cited_n + excluded.cited_n,
579
+ resolved_at = excluded.resolved_at
580
+ `).run(project, sessionId, Date.now(), inj, cited);
581
+ } catch (e) { debugCatch(e, 'recordCitationFunnel'); }
582
+ }
583
+
584
+ /**
585
+ * R1 — read the per-session invocation→cite funnel as a windowed trend.
586
+ * `window` aggregates [now-days, now]; `prior` aggregates [now-2*days, now-days)
587
+ * so `delta_pt` shows whether invocation effectiveness is rising or falling.
588
+ * `sessions` is the most-recent `limit` rows (per-session rate for the table view).
589
+ *
590
+ * @param {import('better-sqlite3').Database} db
591
+ * @param {{days?: number, limit?: number, project?: string|null}} [opts]
592
+ * @returns {{window_days: number, sessions: Array, window: {injected:number,cited:number,rate:number}, prior: {injected:number,cited:number,rate:number}, delta_pt: number|null}}
593
+ */
594
+ export function computeCitationFunnelTrend(db, { days = 7, limit = 10, project = null } = {}) {
595
+ const rate = (cited, inj) => (inj > 0 ? cited / inj : 0);
596
+ const empty = {
597
+ window_days: days,
598
+ sessions: [],
599
+ window: { injected: 0, cited: 0, rate: 0 },
600
+ prior: { injected: 0, cited: 0, rate: 0 },
601
+ delta_pt: null,
602
+ };
603
+ if (!db) return empty;
604
+ try {
605
+ const now = Date.now();
606
+ const windowStart = now - days * 86400000;
607
+ const priorStart = now - 2 * days * 86400000;
608
+ const projClause = project ? 'AND project = ?' : '';
609
+
610
+ const sessions = db.prepare(`
611
+ SELECT project, memory_session_id, resolved_at, injected_n, cited_n
612
+ FROM citation_log
613
+ WHERE 1=1 ${projClause}
614
+ ORDER BY resolved_at DESC
615
+ LIMIT ?
616
+ `).all(...(project ? [project, limit] : [limit]))
617
+ .map(r => ({ ...r, rate: rate(r.cited_n, r.injected_n) }));
618
+
619
+ const agg = (fromTs, toTs) => {
620
+ const params = toTs === null ? [fromTs] : [fromTs, toTs];
621
+ if (project) params.push(project);
622
+ const upper = toTs === null ? '' : 'AND resolved_at < ?';
623
+ const row = db.prepare(`
624
+ SELECT COALESCE(SUM(injected_n), 0) AS injected, COALESCE(SUM(cited_n), 0) AS cited
625
+ FROM citation_log
626
+ WHERE resolved_at >= ? ${upper} ${projClause}
627
+ `).get(...params);
628
+ return { injected: row.injected, cited: row.cited, rate: rate(row.cited, row.injected) };
629
+ };
630
+
631
+ const windowAgg = agg(windowStart, null);
632
+ const priorAgg = agg(priorStart, windowStart);
633
+ const delta_pt = priorAgg.injected > 0
634
+ ? Number(((windowAgg.rate - priorAgg.rate) * 100).toFixed(1))
635
+ : null;
636
+
637
+ return { window_days: days, sessions, window: windowAgg, prior: priorAgg, delta_pt };
638
+ } catch (e) { debugCatch(e, 'computeCitationFunnelTrend'); return empty; }
639
+ }
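The accumulate-then-trend behaviour of `recordCitationFunnel` and `computeCitationFunnelTrend` above can be traced with a small in-memory sketch. It assumes a `Map` as a stand-in for the `citation_log` table so it runs without SQLite, takes the clock as a parameter, and uses hypothetical names (`recordFunnel`, `funnelTrend`); it is not the package's code.

```javascript
// Sketch only: a Map stands in for the citation_log table.
const DAY_MS = 86400000;
const log = new Map(); // key: project + NUL + sessionId (the table's primary key)

function recordFunnel(project, sessionId, injectedDelta, citedDelta, now) {
  const inj = Number(injectedDelta) || 0;
  if (inj <= 0) return; // a re-fired Stop resolves nothing, so no row is touched
  const cited = Math.max(0, Number(citedDelta) || 0);
  const key = `${project}\u0000${sessionId}`;
  const row = log.get(key) || { project, sessionId, injected: 0, cited: 0, resolvedAt: 0 };
  row.injected += inj;   // same effect as ON CONFLICT ... injected_n + excluded.injected_n
  row.cited += cited;
  row.resolvedAt = now;
  log.set(key, row);
}

function funnelTrend(now, days) {
  const rate = (cited, inj) => (inj > 0 ? cited / inj : 0);
  const agg = (from, to) => {
    let injected = 0, cited = 0;
    for (const r of log.values()) {
      if (r.resolvedAt >= from && (to === null || r.resolvedAt < to)) {
        injected += r.injected;
        cited += r.cited;
      }
    }
    return { injected, cited, rate: rate(cited, injected) };
  };
  const windowStart = now - days * DAY_MS;
  const window = agg(windowStart, null);                    // [now-days, now]
  const prior = agg(now - 2 * days * DAY_MS, windowStart);  // [now-2*days, now-days)
  const deltaPt = prior.injected > 0
    ? Number(((window.rate - prior.rate) * 100).toFixed(1))
    : null;
  return { window, prior, deltaPt };
}

const now = 100 * DAY_MS; // fixed clock for a reproducible run
recordFunnel('demo', 'old', 10, 2, now - 10 * DAY_MS); // prior window: 2/10
recordFunnel('demo', 'new', 4, 1, now - 1 * DAY_MS);
recordFunnel('demo', 'new', 0, 0, now - 1 * DAY_MS);   // re-fire: skipped
recordFunnel('demo', 'new', 4, 3, now);                // later turn accumulates
const trend = funnelTrend(now, 7);
console.log(trend.window);  // { injected: 8, cited: 4, rate: 0.5 }
console.log(trend.deltaPt); // 30
```

The `inj <= 0` gate is what makes a repeated Stop harmless: the second call changes nothing, while a later turn that resolves new injections adds onto the same row.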
package/mem-cli.mjs CHANGED
@@ -9,7 +9,7 @@ import { resolveProject } from './project-utils.mjs';
9
9
  import { TIER_CASE_SQL, tierSqlParams } from './tier.mjs';
10
10
  import { _resetVocabCache } from './tfidf.mjs';
11
11
  import { autoBoostIfNeeded, reRankWithContext, markSuperseded } from './server-internals.mjs';
12
- import { searchObservationsHybrid, countSearchTotal } from './search-engine.mjs';
12
+ import { searchObservationsHybrid, countSearchTotal, attachBodyTokens } from './search-engine.mjs';
13
13
  import { deepSearch, resolveDeepMode, shouldEscalateToDeep, autoDeepLlmReady, hasEscalatableCorpus } from './deep-search.mjs';
14
14
  import { ensureRegistryDb, upsertResource } from './registry.mjs';
15
15
  import { searchResources } from './registry-retriever.mjs';
@@ -40,6 +40,7 @@ import { resolveAnchorToken, formatAnchorError, resolveQueryAnchor, fetchRecentT
40
40
  import { buildSearchFtsQuery, parseDateBounds, computePerSourceWindow, effectiveObsFtsQuery, searchSessionsFts, searchPromptsFts, normalizeCrossSourceScores, applyUserSort, applyTierFilter } from './lib/search-core.mjs';
41
41
  import { AUTO_MERGE_THRESHOLD } from './lib/dedup-constants.mjs';
42
42
  import { countRecentHookErrors } from './lib/hook-telemetry.mjs';
43
+ import { computeCitationFunnelTrend } from './lib/citation-tracker.mjs';
43
44
  import { aggregateMetrics } from './lib/metrics.mjs';
44
45
  import {
45
46
  insertDeferred, listOpenWithOrdinal, dropDeferred,
@@ -323,6 +324,9 @@ async function cmdSearch(db, args, { llm } = {}) {
323
324
  includeNoise,
324
325
  }), results.length);
325
326
  const paged = results.slice(offset, offset + limit);
327
+ // Enrich the final page with the ~Nt fetch-cost hint (paired with MCP mem_search; both
328
+ // source keys are handled, see #8654). Batch-fetches heavy obs fields by id; no-op on an empty page.
329
+ attachBodyTokens(db, paged);
326
330
 
327
331
  if (paged.length === 0) {
328
332
  if (jsonOutput) {
@@ -361,6 +365,7 @@ async function cmdSearch(db, args, { llm } = {}) {
361
365
  importance: r.importance ?? null,
362
366
  superseded: Boolean(r.superseded),
363
367
  files_modified: r.files_modified || null,
368
+ body_tokens: r.bodyTokens ?? null,
364
369
  };
365
370
  });
366
371
  out(JSON.stringify({
@@ -382,19 +387,22 @@ async function cmdSearch(db, args, { llm } = {}) {
382
387
  // Pluralize on total — "Found 1 of 44 result" reads wrong; the population (44) drives
383
388
  // grammatical number, not the page slice (1).
384
389
  out(`[mem] Found ${countLabel} result${total !== 1 ? 's' : ''} for "${query}"${fallbackHint}:${hasMixed ? ' (# observation, S# session, P# prompt)' : ''}`);
390
+ // `~Nt` = est. tokens to fetch this row's full body via mem_get (attachBodyTokens, paired with
391
+ // MCP). Conditional so a row that skipped enrichment renders cleanly, not "~undefinedt".
392
+ const tok = r => (r.bodyTokens ? ` ~${r.bodyTokens}t` : '');
385
393
  for (const r of paged) {
386
394
  const timeStr = showTime && r.created_at_epoch ? ` (${relativeTime(r.created_at_epoch)})` : '';
387
395
  if (r._source === 'session') {
388
396
  const date = fmtDateShort(r.created_at);
389
- out(`S#${r.id} 📋 ${date}${timeStr} ${truncate(r.request || r.completed || '(no summary)', 80)}`);
397
+ out(`S#${r.id} 📋 ${date}${timeStr} ${truncate(r.request || r.completed || '(no summary)', 80)}${tok(r)}`);
390
398
  } else if (r._source === 'prompt') {
391
399
  const date = fmtDateShort(r.created_at);
392
- out(`P#${r.id} 💬 ${date}${timeStr} ${truncate(r.prompt_text || '(empty)', 80)}`);
400
+ out(`P#${r.id} 💬 ${date}${timeStr} ${truncate(r.prompt_text || '(empty)', 80)}${tok(r)}`);
393
401
  } else {
394
402
  const date = fmtDateShort(r.created_at);
395
403
  const title = truncate(r.title || r.subtitle || '(untitled)', 80);
396
404
  const supersededTag = r.superseded ? ' [SUPERSEDED]' : '';
397
- out(`#${r.id} ${typeIcon(r.type)} ${date}${timeStr} ${title}${supersededTag}`);
405
+ out(`#${r.id} ${typeIcon(r.type)} ${date}${timeStr} ${title}${supersededTag}${tok(r)}`);
398
406
  if (r.lesson_learned) {
399
407
  out(` -> ${truncate(r.lesson_learned, 80)}`);
400
408
  }
@@ -2306,8 +2314,12 @@ function cmdCitationStats(db, args) {
2306
2314
  ? `${pollutedRows.n} obs have cited_count > decay_seen_count (pre-v34 backfill — invariant holds for new data).`
2307
2315
  : null;
2308
2316
 
2317
+ // R1: per-session invocation→cite funnel trend (citation_log). Same `days` window
2318
+ // as the per-project cite rate above; funnel.prior/delta_pt show the direction.
2319
+ const funnel = computeCitationFunnelTrend(db, { days });
2320
+
2309
2321
  if (json) {
2310
- out(JSON.stringify({ window_days: days, per_project: perProject, decay_queue: decayQueue, promoted, demoted, data_pollution_note: dataPollutionNote }, null, 2));
2322
+ out(JSON.stringify({ window_days: days, per_project: perProject, decay_queue: decayQueue, promoted, demoted, data_pollution_note: dataPollutionNote, funnel }, null, 2));
2311
2323
  return;
2312
2324
  }
2313
2325
 
@@ -2318,6 +2330,28 @@ function cmdCitationStats(db, args) {
2318
2330
  out(` ${r.project.padEnd(34)} ${String(rate).padStart(6)} cited:${r.cited}/${r.resolved} at_risk:${r.at_risk}`);
2319
2331
  }
2320
2332
  out('');
2333
+
2334
+ // R1: invocation→cite funnel — per-session trend + window-vs-prior direction.
2335
+ out(`Invocation→cite funnel (recent sessions, injected→cited; rate window ${days}d):`);
2336
+ if (funnel.sessions.length === 0) {
2337
+ out(' (no resolved sessions in window)');
2338
+ } else {
2339
+ for (const s of funnel.sessions) {
2340
+ const day = s.resolved_at ? new Date(s.resolved_at).toISOString().slice(0, 10) : '—'.repeat(10);
2341
+ const pct = (s.rate * 100).toFixed(1) + '%';
2342
+ out(` ${day} ${(s.project || '').padEnd(28)} inj ${String(s.injected_n).padStart(3)} cited ${String(s.cited_n).padStart(3)} ${pct.padStart(6)}`);
2343
+ }
2344
+ }
2345
+ let trendLine = `window rate ${(funnel.window.rate * 100).toFixed(1)}% cited ${funnel.window.cited}/${funnel.window.injected}`;
2346
+ if (funnel.delta_pt === null) {
2347
+ trendLine += ' (no prior-window data)';
2348
+ } else {
2349
+ const arrow = funnel.delta_pt > 0 ? '↑' : funnel.delta_pt < 0 ? '↓' : '→';
2350
+ const sign = funnel.delta_pt > 0 ? '+' : '';
2351
+ trendLine += ` (prior ${days}d ${(funnel.prior.rate * 100).toFixed(1)}%) ${arrow} ${sign}${funnel.delta_pt}pt`;
2352
+ }
2353
+ out(trendLine);
2354
+ out('');
2321
2355
  out('Active decay queue (uncited_streak >= 2, next miss → demote):');
2322
2356
  if (decayQueue.length === 0) out(' (none)');
2323
2357
  for (const r of decayQueue) {
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "claude-mem-lite",
3
- "version": "3.4.0",
3
+ "version": "3.6.0",
4
4
  "description": "Persistent long-term memory for Claude Code via MCP — captures coding decisions, bugfixes, and context across sessions. Hybrid FTS5 + TF-IDF search with episode batching. Single SQLite DB, no external services. A lighter, lower-cost alternative to claude-mem (episode batching + a smaller model; cost savings are an internal estimate, not a measured benchmark).",
5
5
  "type": "module",
6
6
  "packageManager": "npm@10.9.2",
package/schema.mjs CHANGED
@@ -71,7 +71,15 @@ export const CODE_DIR = join(homedir(), '.claude-mem-lite');
71
71
  // legacy trigger on existing DBs. LATEST_MIGRATION_COLUMN unchanged (no new column).
72
72
  // v37 (D#26): adds user_prompts.cc_session_id (additive, nullable). LATEST_MIGRATION_COLUMN
73
73
  // MOVES to it so the half-migrated-DB self-heal fast-path covers the new column.
74
- export const CURRENT_SCHEMA_VERSION = 37;
74
+ // v38 (R1): citation_log table — per-session invocation→cite funnel telemetry. One
75
+ // accumulating row per resolved session (injected_n / cited_n), written by
76
+ // recordCitationFunnel from applyCitationDecay's touched/promoted at Stop. Turns the
77
+ // per-obs cite counters (lifetime-cumulative) into a trendable per-session series so
78
+ // `citation-stats` can answer "is memory invocation effectiveness rising or falling".
79
+ // New TABLE (not a column) reached via CORE_SCHEMA's CREATE TABLE IF NOT EXISTS on the
80
+ // forced migration pass; LATEST_MIGRATION_COLUMN unchanged (no new column) — same
81
+ // pattern as v35/v36.
82
+ export const CURRENT_SCHEMA_VERSION = 38;
75
83
 
76
84
  // Sentinel column for the LATEST migration set. The fast-path uses this to
77
85
  // self-heal half-migrated DBs — schema_version bumped but column ALTERs rolled
@@ -168,6 +176,15 @@ const CORE_SCHEMA = `
168
176
  created_at_epoch INTEGER,
169
177
  PRIMARY KEY (project, type, session_id)
170
178
  );
179
+
180
+ CREATE TABLE IF NOT EXISTS citation_log (
181
+ project TEXT NOT NULL,
182
+ memory_session_id TEXT NOT NULL,
183
+ resolved_at INTEGER,
184
+ injected_n INTEGER NOT NULL DEFAULT 0,
185
+ cited_n INTEGER NOT NULL DEFAULT 0,
186
+ PRIMARY KEY (project, memory_session_id)
187
+ );
171
188
  `;
172
189
 
173
190
  // Column migrations (idempotent — only swallow "duplicate column" errors)
package/search-engine.mjs CHANGED
@@ -9,7 +9,7 @@ import {
9
9
  OBS_BM25, TYPE_DECAY_CASE, TYPE_QUALITY_CASE,
10
10
  DEFAULT_DECAY_HALF_LIFE_MS,
11
11
  notLowSignalTitleClause, LOW_SIGNAL_TITLE,
12
- relaxFtsQueryToOr, debugLog, debugCatch,
12
+ relaxFtsQueryToOr, debugLog, debugCatch, estimateTokens,
13
13
  } from './utils.mjs';
14
14
  import { getVocabulary, computeVector, vectorSearch, rrfMerge } from './tfidf.mjs';
15
15
  import { extractPRFTerms, expandQueryByConcepts } from './server-internals.mjs';
@@ -190,6 +190,45 @@ export function ftsRowToResult(r, { scoreMultiplier, snippet } = {}) {
190
190
  };
191
191
  }
192
192
 
193
+ // Per-result estimate of the token cost to fetch the FULL body via mem_get, surfaced as the
194
+ // `~Nt` hint in search output so the agent can budget the 3-layer protocol (search → timeline →
195
+ // get) before paying to expand any ID. Adopted from thedotmack/claude-mem's token-cost column
196
+ // (reference_claude_mem_comparison) — the one genuinely portable idea from that analysis.
197
+ //
198
+ // Layer-1 search deliberately omits narrative/facts (that's what keeps the index light), so the
199
+ // heavy obs fields are batch-fetched by id HERE rather than carried on every result. The source
200
+ // key is read as `source || _source` because the two render paths disagree (#8654): MCP sets
201
+ // `source`+`text`, CLI sets `_source`+`prompt_text`. estimateTokens floors at 1, so a missing row
202
+ // or empty body yields 1 — never 0/NaN.
203
+ export function attachBodyTokens(db, results) {
204
+ if (!Array.isArray(results) || results.length === 0) return results;
205
+ const obsIds = results
206
+ .filter(r => (r.source || r._source) === 'obs' && Number.isInteger(r.id))
207
+ .map(r => r.id);
208
+ const bodyById = new Map();
209
+ if (obsIds.length > 0) {
210
+ try {
211
+ const ph = obsIds.map(() => '?').join(',');
212
+ const rows = db.prepare(`SELECT id, narrative, facts, text FROM observations WHERE id IN (${ph})`).all(...obsIds);
213
+ for (const row of rows) bodyById.set(row.id, row);
214
+ } catch (e) { debugCatch(e, 'attachBodyTokens'); }
215
+ }
216
+ for (const r of results) {
217
+ const src = r.source || r._source;
218
+ let parts;
219
+ if (src === 'obs') {
220
+ const row = bodyById.get(r.id) || {};
221
+ parts = [r.title, r.subtitle, r.lesson_learned, row.narrative, row.facts, row.text];
222
+ } else if (src === 'session') {
223
+ parts = [r.request, r.completed, r.working_on];
224
+ } else {
225
+ parts = [r.text, r.prompt_text];
226
+ }
227
+ r.bodyTokens = estimateTokens(parts.filter(Boolean).join(' '));
228
+ }
229
+ return results;
230
+ }
231
+
193
232
  function expandObsByConceptCo(db, ctx, now, existingIds, results, includeNoise = false) {
194
233
  const { ftsQuery, args, epochFrom, epochTo, limit } = ctx;
195
234
  if (results.length >= Math.ceil(limit / 2)) return;
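The `source || _source` fallback and the conditional `~Nt` suffix added in this release can be sketched without a database. Everything here is an assumption for illustration: `estimateTokensSketch` is a hypothetical stand-in for the `estimateTokens` helper in `utils.mjs` (its real formula is not shown in this diff; roughly 4 characters per token, floored at 1, is assumed), and `bodies` plays the role of the batch-fetched observation rows.

```javascript
// Sketch only: hypothetical stand-in for utils.mjs estimateTokens.
// Assumed formula: about 4 characters per token, never below 1.
function estimateTokensSketch(text) {
  return Math.max(1, Math.ceil(String(text || '').length / 4));
}

// Mirrors the shape of attachBodyTokens without the DB lookup.
function attachBodyTokensSketch(results, bodies) {
  for (const r of results) {
    const src = r.source || r._source; // MCP sets `source`, CLI sets `_source`
    let parts;
    if (src === 'obs') {
      const row = bodies.get(r.id) || {};
      parts = [r.title, row.narrative, row.facts];
    } else if (src === 'session') {
      parts = [r.request, r.completed];
    } else {
      parts = [r.text, r.prompt_text];
    }
    r.bodyTokens = estimateTokensSketch(parts.filter(Boolean).join(' '));
  }
  return results;
}

// Same conditional formatter as in the diff: rows without an estimate render cleanly.
const tok = r => (r.bodyTokens ? ` ~${r.bodyTokens}t` : '');

const bodies = new Map([[1, { narrative: 'x'.repeat(395), facts: null }]]);
const rows = attachBodyTokensSketch([
  { id: 1, source: 'obs', title: 'demo' },       // MCP-style key
  { id: 2, _source: 'prompt', prompt_text: '' }, // CLI-style key, empty body
], bodies);

console.log(rows.map(r => `#${r.id}${tok(r)}`)); // [ '#1 ~100t', '#2 ~1t' ]
```

Because the estimate is floored at 1, an enriched row always gets a suffix; the empty string from `tok` only appears for rows that never went through enrichment.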
package/server.mjs CHANGED
@@ -9,7 +9,7 @@ import { truncate, typeIcon, inferProject, scrubSecrets, fmtDate, debugLog, debu
9
9
  import { resolveProject as _resolveProjectShared } from './project-utils.mjs';
10
10
  import { ensureDb, DB_PATH, DB_DIR, REGISTRY_DB_PATH } from './schema.mjs';
11
11
  import { reRankWithContext, markSuperseded, autoBoostIfNeeded, runIdleCleanup, buildServerInstructions } from './server-internals.mjs';
12
- import { searchObservationsHybrid, countSearchTotal } from './search-engine.mjs';
12
+ import { searchObservationsHybrid, countSearchTotal, attachBodyTokens } from './search-engine.mjs';
13
13
  import { deepSearch, resolveDeepMode, shouldEscalateToDeep, autoDeepLlmReady, hasEscalatableCorpus } from './deep-search.mjs';
14
14
  import { selectCompressionCandidates, groupByProjectWeek, compressGroup } from './lib/compress-core.mjs';
15
15
  import { resolveAnchorToken, formatAnchorError, resolveQueryAnchor, fetchRecentTimeline, fetchTimelineWindow } from './lib/timeline-core.mjs';
@@ -294,21 +294,24 @@ function formatSearchOutput(paginatedResults, args, ftsQuery, totalCount, orFall
294
294
  const fallbackHint = orFallbackFired && !args.or ? ' (relaxed AND→OR)' : '';
295
295
  lines.push(`Found ${countLabel} result(s)${qLabel}${fallbackHint}:${hasMixed ? ' (# observation, S# session, P# prompt)' : ''}\n`);
296
296
 
297
+ // `~Nt` = estimated tokens to fetch this row's full body via mem_get (attachBodyTokens).
298
+ // Conditional so a result that skipped enrichment renders cleanly, not "~undefinedt".
299
+ const tok = r => (r.bodyTokens ? ` ~${r.bodyTokens}t` : '');
297
300
  for (const r of paginatedResults) {
298
301
  if (r.source === 'obs') {
299
302
  const supersededTag = r.superseded ? ' [SUPERSEDED]' : '';
300
- lines.push(`#${r.id} ${typeIcon(r.type)} [${r.type}] ${truncate(r.title || r.subtitle || '(untitled)')} | ${r.project} | ${fmtDate(r.date)}${supersededTag}`);
303
+ lines.push(`#${r.id} ${typeIcon(r.type)} [${r.type}] ${truncate(r.title || r.subtitle || '(untitled)')} | ${r.project} | ${fmtDate(r.date)}${supersededTag}${tok(r)}`);
301
304
  if (r.snippet && r.snippet.length > 10 && r.snippet !== r.title) {
302
305
  lines.push(` ${truncate(r.snippet, 100)}`);
303
306
  }
304
307
  } else if (r.source === 'session') {
305
- lines.push(`S#${r.id} 📋 ${truncate(r.request || r.completed || '(no summary)')} | ${r.project} | ${fmtDate(r.date)}`);
308
+ lines.push(`S#${r.id} 📋 ${truncate(r.request || r.completed || '(no summary)')} | ${r.project} | ${fmtDate(r.date)}${tok(r)}`);
306
309
  } else if (r.source === 'prompt') {
307
- lines.push(`P#${r.id} 💬 ${truncate(r.text)} | ${fmtDate(r.date)}`);
310
+ lines.push(`P#${r.id} 💬 ${truncate(r.text)} | ${fmtDate(r.date)}${tok(r)}`);
308
311
  }
309
312
  }
310
313
 
311
- lines.push(`\nWorkflow: mem_timeline(anchor=ID) for context | mem_get(ids=[...]) for full details`);
314
+ lines.push(`\nWorkflow: mem_timeline(anchor=ID) for context | mem_get(ids=[...]) for full details · ~Nt = est. tokens to fetch full detail`);
312
315
  return { content: [{ type: 'text', text: lines.join('\n') }] };
313
316
  }
314
317
 
@@ -508,6 +511,9 @@ async function runSearchPipeline(db, args, { llm, rerankLlm } = {}) {
508
511
  }), results.length);
509
512
  // Always apply pagination — single-source results can exceed SQL LIMIT due to expansion (concept co-occurrence, PRF, vector search)
510
513
  const paginatedResults = (offset > 0 || results.length > limit) ? results.slice(offset, offset + limit) : results;
514
+ // Enrich the FINAL page with a fetch-cost estimate (~Nt) so the agent budgets before mem_get.
515
+ // Uses the same db threaded through the pipeline (#8743) — batch-fetches heavy obs fields by id.
516
+ attachBodyTokens(db, paginatedResults);
511
517
 
512
518
  // Observability: announce auto-escalation on stderr (parity with CLI deep note).
513
519
  if (escalated) process.stderr.write(`[mem] auto-escalated to deep search (weak results: ${escalatedObsCount} hits)\n`);