claude-mem-lite 2.87.0 → 2.89.0

@@ -10,9 +10,9 @@
  "plugins": [
  {
  "name": "claude-mem-lite",
- "version": "2.87.0",
+ "version": "2.89.0",
  "source": "./",
- "description": "Persistent long-term memory for Claude Code via MCP — captures coding decisions, bugfixes, and context across sessions. Hybrid FTS5 + TF-IDF search with episode batching. Single SQLite DB, no external services. Alternative to claude-mem with 600x lower cost."
+ "description": "Persistent long-term memory for Claude Code via MCP — captures coding decisions, bugfixes, and context across sessions. Hybrid FTS5 + TF-IDF search with episode batching. Single SQLite DB, no external services. A lighter, lower-cost alternative to claude-mem (episode batching + a smaller model; cost savings are an internal estimate, not a measured benchmark)."
  }
  ]
  }
@@ -1,7 +1,7 @@
  {
  "name": "claude-mem-lite",
- "version": "2.87.0",
- "description": "Persistent long-term memory for Claude Code via MCP — captures coding decisions, bugfixes, and context across sessions. Hybrid FTS5 + TF-IDF search with episode batching. Single SQLite DB, no external services. Alternative to claude-mem with 600x lower cost.",
+ "version": "2.89.0",
+ "description": "Persistent long-term memory for Claude Code via MCP — captures coding decisions, bugfixes, and context across sessions. Hybrid FTS5 + TF-IDF search with episode batching. Single SQLite DB, no external services. A lighter, lower-cost alternative to claude-mem (episode batching + a smaller model; cost savings are an internal estimate, not a measured benchmark).",
  "author": {
  "name": "sdsrss"
  },
package/README.md CHANGED
@@ -4,7 +4,7 @@

  `claude-mem-lite` is a **persistent memory** (also called *long-term memory* or *cross-session context*) system for **[Claude Code](https://docs.anthropic.com/en/docs/claude-code)** — Anthropic's CLI coding agent. It runs as an **[MCP](https://modelcontextprotocol.io/) server** plus a set of Claude Code hooks, automatically capturing coding observations, decisions, and bug fixes during sessions, then providing hybrid full-text + semantic search to recall them later.

- Compared to general-purpose LLM memory frameworks like [`mem0`](https://github.com/mem0ai/mem0) or the MCP reference [`memory`](https://github.com/modelcontextprotocol/servers/tree/main/src/memory) server, claude-mem-lite is purpose-built for Claude Code's hook lifecycle: episode batching cuts LLM calls 7–10× vs the original [claude-mem](https://github.com/thedotmack/claude-mem) (600× lower total cost), and the hybrid FTS5 + TF-IDF retriever benchmarks at 0.88 Recall@10 / 0.96 Precision@10.
+ Compared to general-purpose LLM memory frameworks like [`mem0`](https://github.com/mem0ai/mem0) or the MCP reference [`memory`](https://github.com/modelcontextprotocol/servers/tree/main/src/memory) server, claude-mem-lite is purpose-built for Claude Code's hook lifecycle: episode batching cuts LLM calls 7–10× vs the original [claude-mem](https://github.com/thedotmack/claude-mem) (an estimated ~600× lower total cost — see the cost model below; this is an architecture estimate, not a measured benchmark), while the hybrid FTS5 + TF-IDF retriever benchmarks at 0.88 Recall@10 / 0.96 Precision@10.

  > Chinese summary: claude-mem-lite is a lightweight **persistent memory / long-term memory / cross-session context** plugin for Claude Code. Built on the MCP protocol plus hooks, it automatically captures decisions, fixes, and context from coding sessions and recalls them through hybrid FTS5 + TF-IDF retrieval. See the [Chinese README](README.zh-CN.md) for details.

@@ -29,15 +29,17 @@ A ground-up redesign of [claude-mem](https://github.com/thedotmack/claude-mem),

  ### Token & cost efficiency

- For a typical 50-tool-call session:
+ For a typical 50-tool-call session (illustrative cost model — the ratios below are
+ architecture estimates derived from batch size, token counts, and model pricing, **not**
+ a measured end-to-end benchmark):

- | | claude-mem | claude-mem-lite | Ratio |
+ | | claude-mem | claude-mem-lite | Ratio (estimated) |
  |---|---|---|---|
- | LLM calls | ~50 (every tool use) | ~5-8 (per episode) | **7-10x fewer** |
- | Tokens per call | 1,000-5,000 (raw JSON + history) | 200-500 (summaries only) | **5-10x smaller** |
- | Total tokens | ~100K-250K | ~1K-4K | **50-100x less** |
- | Model cost | Sonnet ($3/$15 per M) | Haiku ($0.25/$1.25 per M) | **12x cheaper** |
- | Combined savings | | | **600x+ lower cost** |
+ | LLM calls | ~50 (every tool use) | ~5-8 (per episode) | **~7-10x fewer** |
+ | Tokens per call | 1,000-5,000 (raw JSON + history) | 200-500 (summaries only) | **~5-10x smaller** |
+ | Total tokens | ~100K-250K | ~1K-4K | **~50-100x less** |
+ | Model cost | Sonnet ($3/$15 per M) | Haiku ($0.25/$1.25 per M) | **~12x cheaper** |
+ | Combined savings | | | **~600x lower cost (estimated)** |
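
The combined figure in the last row follows from multiplying the table's own ratios. A minimal sketch of that arithmetic, using only numbers that appear in the table (the variable names are illustrative and not part of the package):

```javascript
// Illustrative arithmetic only; every number comes from the table above.
const tokenRatio = { low: 50, high: 100 };   // "Total tokens" row: ~50-100x less
const sonnet = { input: 3, output: 15 };     // $ per million tokens
const haiku = { input: 0.25, output: 1.25 }; // $ per million tokens

// Input and output prices differ by the same factor.
const priceRatioInput = sonnet.input / haiku.input;
const priceRatioOutput = sonnet.output / haiku.output;

// Combined estimate = token ratio x price ratio.
const combinedLow = tokenRatio.low * priceRatioInput;
const combinedHigh = tokenRatio.high * priceRatioInput;

console.log({ priceRatioInput, priceRatioOutput, combinedLow, combinedHigh });
```

Under these inputs the ~600x figure corresponds to the lower end of the token-ratio range. It remains an estimate derived from the model, not a measurement.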

  ### Quality comparison
 
@@ -681,7 +683,7 @@ No. Claude Code's `CLAUDE.md` and `MEMORY.md` files act as static instruction me

  ### Why "lite"? What did the original claude-mem do differently?

- The original called an LLM on every tool use with raw JSON inputs. claude-mem-lite batches 5–10 operations per LLM call, uses a smaller model (Haiku), and runs a deterministic code-level filter before sending anything to the model. Net result: ~600× lower cost with equivalent search quality. See the [Architecture comparison](#architecture-comparison) above.
+ The original called an LLM on every tool use with raw JSON inputs. claude-mem-lite batches 5–10 operations per LLM call, uses a smaller model (Haiku), and runs a deterministic code-level filter before sending anything to the model. Net result: an estimated ~600× lower cost (an architecture estimate from the cost model above, not a measured benchmark) with equivalent search quality. See the [Architecture comparison](#architecture-comparison) above.

  ### Does this work cross-project? Cross-machine?

package/README.zh-CN.md CHANGED
@@ -4,7 +4,7 @@
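
  `claude-mem-lite` is a **persistent memory system** (also called **long-term memory / cross-session context / Claude Code memory plugin**) for **[Claude Code](https://docs.anthropic.com/en/docs/claude-code)** (Anthropic's official CLI coding agent). It runs as an **[MCP](https://modelcontextprotocol.io/) server** plus Claude Code hooks, automatically capturing observations, decisions, and bug fixes during coding sessions, and recalls historical context through hybrid retrieval that combines FTS5 full-text search with TF-IDF vectors.

- Compared with general-purpose LLM memory frameworks such as [`mem0`](https://github.com/mem0ai/mem0) and the MCP reference [`memory`](https://github.com/modelcontextprotocol/servers/tree/main/src/memory) server, claude-mem-lite is purpose-built for Claude Code's hook lifecycle: episode batching cuts LLM calls 7-10x versus the original [claude-mem](https://github.com/thedotmack/claude-mem) (600x lower combined cost), and the hybrid FTS5 + TF-IDF retriever reaches **Recall@10 = 0.88 / Precision@10 = 0.96** on a 30-query benchmark.
+ Compared with general-purpose LLM memory frameworks such as [`mem0`](https://github.com/mem0ai/mem0) and the MCP reference [`memory`](https://github.com/modelcontextprotocol/servers/tree/main/src/memory) server, claude-mem-lite is purpose-built for Claude Code's hook lifecycle: episode batching cuts LLM calls 7-10x versus the original [claude-mem](https://github.com/thedotmack/claude-mem) (an estimated ~600x lower combined cost; see the cost model below, which is an architecture estimate rather than a measured benchmark); the hybrid FTS5 + TF-IDF retriever reaches **Recall@10 = 0.88 / Precision@10 = 0.96** on a 30-query benchmark.

  No external services. A single SQLite database. Very low overhead.
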
@@ -27,15 +27,15 @@

  ### Token and cost efficiency

- For a typical session of 50 tool calls:
+ For a typical session of 50 tool calls (illustrative cost model; the ratios below are **estimated** from batch size, token counts, and model pricing, not measured end to end):

- | | claude-mem | claude-mem-lite | Ratio |
+ | | claude-mem | claude-mem-lite | Ratio (estimated) |
  |---|---|---|---|
- | LLM calls | ~50 (every tool use) | ~5-8 (per episode) | **7-10x fewer** |
- | Tokens per call | 1,000-5,000 (raw JSON + history) | 200-500 (summaries only) | **5-10x fewer** |
- | Total tokens | ~100K-250K | ~1K-4K | **50-100x fewer** |
- | Model cost | Sonnet ($3/$15 per million) | Haiku ($0.25/$1.25 per million) | **12x cheaper** |
- | Combined savings | | | **600x+ lower cost** |
+ | LLM calls | ~50 (every tool use) | ~5-8 (per episode) | **~7-10x fewer** |
+ | Tokens per call | 1,000-5,000 (raw JSON + history) | 200-500 (summaries only) | **~5-10x fewer** |
+ | Total tokens | ~100K-250K | ~1K-4K | **~50-100x fewer** |
+ | Model cost | Sonnet ($3/$15 per million) | Haiku ($0.25/$1.25 per million) | **~12x cheaper** |
+ | Combined savings | | | **~600x lower cost (estimated)** |

  ### Quality comparison

package/haiku-client.mjs CHANGED
@@ -20,6 +20,14 @@ const MODEL_MAP = {
  sonnet: 'claude-sonnet-4-5-20250929',
  };

+ // Every background LLM call here is fixed-schema extraction / classification
+ // (episode→JSON, type/merge classification, synonym + metadata extraction) whose
+ // output is consumed deterministically (JSON.parse, MinHash dedup). Pin temperature
+ // to 0 so the provider default (~1.0) doesn't inject wording variance that breaks
+ // JSON parsing or defeats the wording-sensitive MinHash near-duplicate detector.
+ // A call that genuinely needs sampling can pass opts.temperature to override.
+ const DEFAULT_LLM_TEMPERATURE = 0;
+
  /**
  * Resolve the LLM model to use for background calls.
  * Reads CLAUDE_MEM_MODEL env var, defaults to 'haiku'.
@@ -143,7 +151,7 @@ export function flattenForCLI(input) {
  * @param {number} [opts.maxTokens=500] Max tokens in response
  * @returns {Promise<{text: string}|null>} Response or null on failure
  */
- export async function callHaiku(prompt, { timeout = 10000, maxTokens = 500 } = {}) {
+ export async function callHaiku(prompt, { timeout = 10000, maxTokens = 500, temperature = DEFAULT_LLM_TEMPERATURE } = {}) {
  if (!prompt) return null;

  const mode = detectMode();
@@ -160,8 +168,8 @@ export async function callHaiku(prompt, { timeout = 10000, maxTokens = 500 } = {
  let primary = null;
  try {
  primary = mode === 'api'
- ? await callHaikuAPI(prompt, { timeout, maxTokens })
- : await callOpenRouterAPI(prompt, resolveModel().cli, { timeout, maxTokens });
+ ? await callHaikuAPI(prompt, { timeout, maxTokens, temperature })
+ : await callOpenRouterAPI(prompt, resolveModel().cli, { timeout, maxTokens, temperature });
  } catch (e) {
  debugCatch(e, `callHaiku:${mode}`);
  }
@@ -198,7 +206,7 @@ export async function callHaikuJSON(prompt, opts) {
  * @param {number} [opts.maxTokens=1000] Max tokens in response
  * @returns {Promise<{text: string}|null>} Response or null on failure
  */
- export async function callLLMWithModel(prompt, model = 'haiku', { timeout = 15000, maxTokens = 1000 } = {}) {
+ export async function callLLMWithModel(prompt, model = 'haiku', { timeout = 15000, maxTokens = 1000, temperature = DEFAULT_LLM_TEMPERATURE } = {}) {
  if (!prompt) return null;
  const resolvedModel = MODEL_MAP[model] ? model : 'haiku';
  const mode = detectMode();
@@ -214,8 +222,8 @@ export async function callLLMWithModel(prompt, model = 'haiku', { timeout = 1500
  let primary = null;
  try {
  primary = mode === 'api'
- ? await callModelAPI(prompt, resolvedModel, { timeout, maxTokens })
- : await callOpenRouterAPI(prompt, resolvedModel, { timeout, maxTokens });
+ ? await callModelAPI(prompt, resolvedModel, { timeout, maxTokens, temperature })
+ : await callOpenRouterAPI(prompt, resolvedModel, { timeout, maxTokens, temperature });
  } catch (e) {
  debugCatch(e, `callLLMWithModel:${mode}:${resolvedModel}`);
  }
@@ -239,7 +247,7 @@ export async function callModelJSON(prompt, model = 'haiku', opts) {
  return parseJsonFromLLM(result.text);
  }

- async function callModelAPI(prompt, model, { timeout, maxTokens }) {
+ async function callModelAPI(prompt, model, { timeout, maxTokens, temperature = DEFAULT_LLM_TEMPERATURE }) {
  const apiKey = process.env.ANTHROPIC_API_KEY;
  if (!apiKey) return null;

@@ -252,6 +260,7 @@ async function callModelAPI(prompt, model, { timeout, maxTokens }) {
  const body = {
  model: modelId,
  max_tokens: maxTokens,
+ temperature,
  messages: [{ role: 'user', content: user }],
  };
  // System slot is constant per call type (instructions, schema, type taxonomy)
@@ -312,7 +321,7 @@ function callModelCLI(prompt, model, { timeout }) {

  // ─── API Mode ────────────────────────────────────────────────────────────────

- async function callHaikuAPI(prompt, { timeout, maxTokens }) {
+ async function callHaikuAPI(prompt, { timeout, maxTokens, temperature = DEFAULT_LLM_TEMPERATURE }) {
  const apiKey = process.env.ANTHROPIC_API_KEY;
  if (!apiKey) return null;

@@ -325,6 +334,7 @@ async function callHaikuAPI(prompt, { timeout, maxTokens }) {
  const body = {
  model: modelId,
  max_tokens: maxTokens,
+ temperature,
  messages: [{ role: 'user', content: user }],
  };
  // See callModelAPI: cache_control on the constant system slot.
@@ -365,7 +375,7 @@ async function callHaikuAPI(prompt, { timeout, maxTokens }) {
  // `cache_control` field has no OpenAI-format equivalent and is omitted.
  // `tier` is the resolved model tier ('haiku'|'sonnet'); OPENROUTER_MODEL can
  // override the resulting slug entirely (see resolveOpenRouterModel).
- async function callOpenRouterAPI(prompt, tier, { timeout, maxTokens }) {
+ async function callOpenRouterAPI(prompt, tier, { timeout, maxTokens, temperature = DEFAULT_LLM_TEMPERATURE }) {
  const apiKey = process.env.OPENROUTER_API_KEY;
  if (!apiKey) return null;

@@ -387,7 +397,7 @@ async function callOpenRouterAPI(prompt, tier, { timeout, maxTokens }) {
  // Optional OpenRouter attribution headers (ignored by the API if absent).
  'X-Title': 'claude-mem-lite',
  },
- body: JSON.stringify({ model, max_tokens: maxTokens, messages }),
+ body: JSON.stringify({ model, max_tokens: maxTokens, temperature, messages }),
  signal: controller.signal,
  });
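
The temperature plumbing across this file can be summarised with a small sketch. `buildRequestBody` is a hypothetical helper written for illustration (the package builds its request bodies inline), and the model id string is a placeholder. Only `DEFAULT_LLM_TEMPERATURE` and the `temperature` option come from the diff.

```javascript
// Sketch only: buildRequestBody is a hypothetical helper, not part of haiku-client.mjs.
const DEFAULT_LLM_TEMPERATURE = 0;

function buildRequestBody(modelId, user, { maxTokens = 500, temperature = DEFAULT_LLM_TEMPERATURE } = {}) {
  return {
    model: modelId,
    max_tokens: maxTokens,
    temperature, // 0 unless the caller explicitly overrides it
    messages: [{ role: 'user', content: user }],
  };
}

// Placeholder model id and prompts, for illustration only.
const extraction = buildRequestBody('example-model-id', 'Extract JSON from this episode');
const sampled = buildRequestBody('example-model-id', 'Brainstorm titles', { temperature: 0.7 });
console.log(extraction.temperature, sampled.temperature);
```

A destructuring default applies when the property is missing or `undefined`, so a caller that forwards `temperature: undefined` still ends up with 0. That is presumably why the inner functions repeat the default.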
 
package/hook-llm.mjs CHANGED
@@ -12,6 +12,7 @@ import {
  import { acquireLLMSlot, releaseLLMSlot } from './hook-semaphore.mjs';
  import { scrubRecord } from './lib/scrub-record.mjs';
  import { getVocabulary, computeVector } from './tfidf.mjs';
+ import { DEDUP_JACCARD_THRESHOLD, AUTO_MERGE_THRESHOLD } from './lib/dedup-constants.mjs';
  import {
  RUNTIME_DIR, DEDUP_WINDOW_MS, RELATED_OBS_WINDOW_MS,
  sessionFile, getSessionId, openDb, callLLM, sleep,
@@ -148,7 +149,7 @@ export function saveObservation(obs, projectOverride, sessionIdOverride, externa
  ORDER BY created_at_epoch DESC LIMIT 10
  `).all(project, fiveMinAgo);

- if (obs.title && recent.some(r => jaccardSimilarity(r.title, obs.title) > 0.7)) {
+ if (obs.title && recent.some(r => jaccardSimilarity(r.title, obs.title) > DEDUP_JACCARD_THRESHOLD)) {
  return null; // dedup: Jaccard title match
  }

@@ -173,8 +174,8 @@ export function saveObservation(obs, projectOverride, sessionIdOverride, externa
  WHERE project = ? AND created_at_epoch > ? AND created_at_epoch <= ?
  ORDER BY created_at_epoch DESC LIMIT 60
  `).all(project, threeDaysAgo, fiveMinAgo);
- if (extRecent.some(r => jaccardSimilarity(r.title, obs.title) > 0.85)) {
- return null; // dedup: low-signal Jaccard match
+ if (extRecent.some(r => jaccardSimilarity(r.title, obs.title) > AUTO_MERGE_THRESHOLD)) {
+ return null; // dedup: low-signal Jaccard match (stricter cutoff for degraded titles)
  }
  }
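
The two title checks in `saveObservation` differ only in their cutoff. The sketch below assumes `jaccardSimilarity` compares lowercase word sets (the real implementation lives in `utils.mjs` and is not shown in this diff) and that the named constants keep the literal values they replace (0.7 and 0.85). `isDuplicateTitle` is a hypothetical helper.

```javascript
// Assumed values: the diff swaps the literals 0.7 and 0.85 for these names.
const DEDUP_JACCARD_THRESHOLD = 0.7;
const AUTO_MERGE_THRESHOLD = 0.85;

// Hypothetical stand-in for utils.mjs jaccardSimilarity: intersection over union of word sets.
function jaccardSimilarity(a, b) {
  const setA = new Set(String(a).toLowerCase().split(/\s+/).filter(Boolean));
  const setB = new Set(String(b).toLowerCase().split(/\s+/).filter(Boolean));
  if (setA.size === 0 && setB.size === 0) return 0;
  let shared = 0;
  for (const word of setA) if (setB.has(word)) shared++;
  return shared / (setA.size + setB.size - shared);
}

// Hypothetical helper mirroring the `.some(...) > threshold` checks in the hunk.
function isDuplicateTitle(title, recentTitles, threshold) {
  return recentTitles.some(t => jaccardSimilarity(t, title) > threshold);
}

const recentTitles = ['Fix login redirect loop in auth middleware'];
const candidate = 'Fix login redirect loop in auth handler';
const sim = jaccardSimilarity(recentTitles[0], candidate); // 6 shared words / 8 distinct words
console.log(sim);
console.log(isDuplicateTitle(candidate, recentTitles, DEDUP_JACCARD_THRESHOLD));
console.log(isDuplicateTitle(candidate, recentTitles, AUTO_MERGE_THRESHOLD));
```

With these example titles the pair would be dropped by the short-window check but kept by the stricter three-day check, which matches the intent of the "stricter cutoff for degraded titles" comment.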
 
package/hook-optimize.mjs CHANGED
@@ -13,6 +13,7 @@ import { callModelJSON } from './haiku-client.mjs';
  import { acquireLLMSlot, releaseLLMSlot } from './hook-semaphore.mjs';
  import { scrubRecord } from './lib/scrub-record.mjs';
  import { getVocabulary, computeVector, cosineSimilarity } from './tfidf.mjs';
+ import { MERGE_JACCARD_LOW, AUTO_MERGE_THRESHOLD } from './lib/dedup-constants.mjs';
  import { DB_DIR } from './schema.mjs';

  const RUNTIME_DIR = join(DB_DIR, 'runtime');
@@ -331,8 +332,9 @@ export async function executeNormalize(db, force = false, { project } = {}) {
  // ─── Task 3: Cluster-merge ─────────────────────────────────────────────────

  const MERGE_TIME_WINDOW_MS = 30 * 86400000;
- const MERGE_JACCARD_LOW = 0.4;
- const MERGE_JACCARD_HIGH = 0.85;
+ // Merge-review band [MERGE_JACCARD_LOW, AUTO_MERGE_THRESHOLD): titles in this
+ // Jaccard range are LLM-reviewed for merge; at/above AUTO_MERGE_THRESHOLD they'd
+ // already auto-merge elsewhere, below MERGE_JACCARD_LOW they're too dissimilar.

  export function findMergeCandidates(db, maxClusters = 5, { project } = {}) {
  const cutoff = Date.now() - MERGE_TIME_WINDOW_MS;
@@ -363,12 +365,14 @@ export function findMergeCandidates(db, maxClusters = 5, { project } = {}) {
  if (Math.abs(rows[i].created_at_epoch - rows[j].created_at_epoch) > MERGE_TIME_WINDOW_MS) continue;

  if (rows[i].minhash_sig && rows[j].minhash_sig) {
+ // 0.8 slack: the MinHash estimate is noisy, so pre-filter a band below
+ // MERGE_JACCARD_LOW rather than at it, to avoid dropping true candidates.
  const est = estimateJaccardFromMinHash(rows[i].minhash_sig, rows[j].minhash_sig);
  if (est < MERGE_JACCARD_LOW * 0.8) continue;
  }

  const titleSim = jaccardSimilarity(rows[i].title, rows[j].title);
- if (titleSim >= MERGE_JACCARD_LOW && titleSim < MERGE_JACCARD_HIGH) {
+ if (titleSim >= MERGE_JACCARD_LOW && titleSim < AUTO_MERGE_THRESHOLD) {
  cluster.push(rows[j]);
  used.add(rows[j].id);
  }
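
The banding in `findMergeCandidates` can be read as a small classification. The sketch below assumes `MERGE_JACCARD_LOW` is still 0.4 and `AUTO_MERGE_THRESHOLD` is still 0.85 (the values of the constants removed from this file). `classifyPair`, `MINHASH_SLACK`, and the label strings are hypothetical names used only for illustration.

```javascript
// Assumed values, taken from the literals this diff removes.
const MERGE_JACCARD_LOW = 0.4;
const AUTO_MERGE_THRESHOLD = 0.85;
const MINHASH_SLACK = 0.8; // hypothetical name for the literal 0.8 in the hunk

function classifyPair(titleSim, minhashEstimate = null) {
  // Cheap pre-filter: skip only when the noisy estimate is well below the band.
  if (minhashEstimate !== null && minhashEstimate < MERGE_JACCARD_LOW * MINHASH_SLACK) {
    return 'skip';
  }
  if (titleSim < MERGE_JACCARD_LOW) return 'too-dissimilar';
  if (titleSim >= AUTO_MERGE_THRESHOLD) return 'auto-merge-elsewhere';
  return 'llm-review'; // half-open band [MERGE_JACCARD_LOW, AUTO_MERGE_THRESHOLD)
}

console.log(classifyPair(0.6, 0.3));
console.log(classifyPair(0.6, 0.35));
console.log(classifyPair(0.85));
```

Sharing `AUTO_MERGE_THRESHOLD` as the upper bound means the review band and the auto-dedup cutoff cannot drift apart when one of them is tuned.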
package/hook.mjs CHANGED
@@ -26,7 +26,7 @@ import {
  truncate, inferProject, detectBashSignificance,
  extractErrorKeywords, extractFilePaths, isRelatedToEpisode,
  makeEntryDesc, scrubSecrets, stripPrivate, EDIT_TOOLS, debugCatch, debugLog,
- COMPRESSED_AUTO, COMPRESSED_PENDING_PURGE, isoWeekKey, OBS_BM25,
+ COMPRESSED_AUTO, COMPRESSED_PENDING_PURGE, OBS_BM25,
  computeMinHash, estimateJaccardFromMinHash, jaccardSimilarity,
  } from './utils.mjs';
  import {
@@ -45,6 +45,8 @@ import {
  } from './hook-shared.mjs';
  import { handleLLMEpisode, handleLLMSummary, saveObservation, buildImmediateObservation } from './hook-llm.mjs';
  import { scrubRecord } from './lib/scrub-record.mjs';
+ import { selectCompressionCandidates, groupByProjectWeek, compressGroup } from './lib/compress-core.mjs';
+ import { cleanupBroken, decayAndMarkIdle, boostAccessed } from './lib/maintain-core.mjs';
  import {
  extractCitationsFromTranscript,
  extractAllInjected,
@@ -61,7 +63,8 @@ import { checkForUpdate, getCachedUpdateBanner, isUpdateCheckDue } from './hook-
  import { handleLLMOptimize } from './hook-optimize.mjs';
  import { silentAutoAdopt, hasAutoAdoptMarker } from './adopt-cli.mjs';
  import { emitV270UpgradeBanner } from './lib/upgrade-banner.mjs';
- import { loadCiteBackForEpisode, buildUnsavedBugfixHint, countUnsavedBugfixShape, buildCiteRecallNudge as libBuildCiteRecallNudge, nextCiteLowStreak } from './lib/cite-back-hint.mjs';
+ import { loadCiteBackForEpisode, extractCiteBackSignals, buildUnsavedBugfixHint, countUnsavedBugfixShape, buildCiteRecallNudge as libBuildCiteRecallNudge, nextCiteLowStreak } from './lib/cite-back-hint.mjs';
+ import { MINHASH_PREFILTER, FUZZY_DEDUP_THRESHOLD } from './lib/dedup-constants.mjs';
  // plugin-cache-guard.mjs loaded dynamically — pre-2.31.2 installs that auto-upgraded
  // from an older hook-update.mjs SOURCE_FILES (which did not list this module) would
  // crash on static import. Degrade gracefully to no-op when the module is absent.
@@ -540,6 +543,12 @@ async function handleStop() {
  // contract test in tests/citation-tracker-userprompt.test.mjs covers it.
  try {
  const injected = extractAllInjected(transcriptPath);
+ // P5 ①: cite-back signals — observations whose warned file the agent
+ // edited this session. Union into injected so they're resolved (they
+ // were injected via pre-tool-recall) and, below, into cited so the
+ // edit promotes them even without a literal #NN in text.
+ const citeBackIds = extractCiteBackSignals(transcriptPath);
+ for (const id of citeBackIds) injected.add(id);
  if (injected.size > 0) {
  // Text-floor gate: skip decay on tool-only Stops. Without this,
  // a turn that ends on tool_use locks every injected obs as
@@ -552,6 +561,7 @@ async function handleStop() {
  debugLog('DEBUG', 'handleStop', `citation-decay: skipped (no main-thread assistant text yet, injected=${injected.size})`);
  } else {
  const citedMain = extractCitationsFromTranscript(transcriptPath, { mainOnly: true });
+ for (const id of citeBackIds) citedMain.add(id);
  const r = applyCitationDecay(db, project, injected, citedMain, sessionId);
  debugLog('DEBUG', 'handleStop', `citation-decay: touched=${r.touched} promoted=${r.promoted} demoted=${r.demoted}`);
  }
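
The two loops added to `handleStop` amount to a set union on both inputs of the decay call. A minimal sketch with made-up observation ids (in the hook the sets come from `extractAllInjected`, `extractCiteBackSignals`, and `extractCitationsFromTranscript`):

```javascript
// Made-up ids for illustration only.
const injected = new Set([101, 102, 103]); // observations injected this session
const citedMain = new Set([101]);          // cited as a literal #NN in assistant text
const citeBackIds = new Set([103, 104]);   // warned file was edited this session

// Same two unions the diff adds before applyCitationDecay is called.
for (const id of citeBackIds) injected.add(id);
for (const id of citeBackIds) citedMain.add(id);

const uncited = [...injected].filter(id => !citedMain.has(id));
console.log([...injected], [...citedMain], uncited);
```

In this example only 102 is left as injected but uncited. 103 and 104 count as cited even though neither appeared as `#NN` in the transcript text.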
@@ -819,65 +829,19 @@ async function handleSessionStart() {
  `).run(Date.now() - 37 * 86400000);
  if (purged.changes > 0) debugLog('DEBUG', 'auto-maintain', `purged ${purged.changes} stale observations`);

- // Cleanup: remove broken observations (no title AND no narrative)
- const cleaned = db.prepare(`
- DELETE FROM observations WHERE id IN (
- SELECT id FROM observations
- WHERE COALESCE(compressed_into, 0) = 0
- AND (title IS NULL OR title = '') AND (narrative IS NULL OR narrative = '')
- LIMIT ${OP_CAP}
- )
- `).run();
- if (cleaned.changes > 0) debugLog('DEBUG', 'auto-maintain', `cleaned ${cleaned.changes} broken observations`);
-
- // Decay: reduce importance of old, never-accessed observations
- // v2.56.0 #4: injection_count is a separate engagement signal
- // hook-memory.mjs bumps it when the obs is auto-injected into Claude's
- // context. Pre-v2.56 only checked access_count, so an obs auto-injected
- // 8x (proven contextually relevant) still got decayed/marked. Adding
- // `injection_count = 0` treats injection as first-class engagement.
- const decayed = db.prepare(`
- UPDATE observations SET importance = MAX(1, COALESCE(importance, 1) - 1)
- WHERE id IN (
- SELECT id FROM observations
- WHERE COALESCE(compressed_into, 0) = 0
- AND COALESCE(importance, 1) > 1
- AND COALESCE(access_count, 0) = 0
- AND COALESCE(injection_count, 0) = 0
- AND created_at_epoch < ?
- LIMIT ${OP_CAP}
- )
- `).run(STALE_AGE);
- if (decayed.changes > 0) debugLog('DEBUG', 'auto-maintain', `decayed ${decayed.changes} stale observations`);
-
- // Mark idle: importance=1, never-accessed, never-injected, old → pending-purge
- // (will be purged next cycle). v2.56.0 #4: injection_count protects.
- const idleMarked = db.prepare(`
- UPDATE observations SET compressed_into = ${COMPRESSED_PENDING_PURGE}
- WHERE id IN (
- SELECT id FROM observations
- WHERE COALESCE(compressed_into, 0) = 0
- AND COALESCE(importance, 1) = 1
- AND COALESCE(access_count, 0) = 0
- AND COALESCE(injection_count, 0) = 0
- AND created_at_epoch < ?
- LIMIT ${OP_CAP}
- )
- `).run(STALE_AGE);
- if (idleMarked.changes > 0) debugLog('DEBUG', 'auto-maintain', `marked ${idleMarked.changes} idle as pending-purge`);
-
- // Boost: increase importance of frequently-accessed observations
- const boosted = db.prepare(`
- UPDATE observations SET importance = MIN(3, COALESCE(importance, 1) + 1)
- WHERE id IN (
- SELECT id FROM observations
- WHERE COALESCE(compressed_into, 0) = 0
- AND COALESCE(access_count, 0) > 3
- AND COALESCE(importance, 1) < 3
- LIMIT ${OP_CAP}
- )
- `).run();
- if (boosted.changes > 0) debugLog('DEBUG', 'auto-maintain', `boosted ${boosted.changes} frequently-accessed observations`);
+ // cleanup / decay+mark-idle / boost via maintain-core (shared with CLI + MCP).
+ // injection_count>0 protection lives in decayAndMarkIdle. Whole-DB, cap 500.
+ const mctx = { projectFilter: '', baseParams: [], staleAge: STALE_AGE, opCap: OP_CAP };
+
+ const cleaned = cleanupBroken(db, mctx);
+ if (cleaned > 0) debugLog('DEBUG', 'auto-maintain', `cleaned ${cleaned} broken observations`);
+
+ const { decayed, idleMarked } = decayAndMarkIdle(db, mctx);
+ if (decayed > 0) debugLog('DEBUG', 'auto-maintain', `decayed ${decayed} stale observations`);
+ if (idleMarked > 0) debugLog('DEBUG', 'auto-maintain', `marked ${idleMarked} idle as pending-purge`);
+
+ const boosted = boostAccessed(db, mctx);
+ if (boosted > 0) debugLog('DEBUG', 'auto-maintain', `boosted ${boosted} frequently-accessed observations`);
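
The removed SQL encodes three rules that now live in `lib/maintain-core.mjs`. That module is not shown in this diff, so the following is only an in-memory model of the removed decay, mark-idle, and boost statements, with no database and no `LIMIT` cap. `applyMaintenance`, the row shape, and the `PENDING_PURGE` marker are illustrative assumptions.

```javascript
// In-memory model of the removed SQL; not the maintain-core implementation.
const PENDING_PURGE = 'pending-purge'; // placeholder for COMPRESSED_PENDING_PURGE

function applyMaintenance(rows, staleBefore) {
  const active = r => !r.compressed_into; // mirrors COALESCE(compressed_into, 0) = 0
  const untouched = r => (r.access_count || 0) === 0 && (r.injection_count || 0) === 0;
  const stale = r => r.created_at_epoch < staleBefore;
  let decayed = 0, idleMarked = 0, boosted = 0;

  // Decay: old, never accessed, never injected, importance above 1.
  for (const r of rows) {
    if (active(r) && (r.importance ?? 1) > 1 && untouched(r) && stale(r)) {
      r.importance = Math.max(1, r.importance - 1);
      decayed++;
    }
  }
  // Mark idle: same conditions at importance 1, flagged for a later purge.
  for (const r of rows) {
    if (active(r) && (r.importance ?? 1) === 1 && untouched(r) && stale(r)) {
      r.compressed_into = PENDING_PURGE;
      idleMarked++;
    }
  }
  // Boost: frequently accessed rows gain importance, capped at 3.
  for (const r of rows) {
    if (active(r) && (r.access_count || 0) > 3 && (r.importance ?? 1) < 3) {
      r.importance = Math.min(3, (r.importance ?? 1) + 1);
      boosted++;
    }
  }
  return { decayed, idleMarked, boosted };
}

const rows = [
  { id: 1, importance: 2, access_count: 0, injection_count: 0, created_at_epoch: 0 },
  { id: 2, importance: 2, access_count: 0, injection_count: 5, created_at_epoch: 0 },
  { id: 3, importance: 1, access_count: 4, injection_count: 0, created_at_epoch: 0 },
  { id: 4, importance: 3, access_count: 0, injection_count: 0, created_at_epoch: 2000 },
];
const result = applyMaintenance(rows, 1000);
console.log(result, rows);
```

In this model a row that decays to importance 1 is marked idle in the same pass, because the removed statements ran back to back. Whether `decayAndMarkIdle` preserves that ordering is not visible in this diff.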

  // Auto-dedup (exact): merge identical-title observations within 1h.
  // Catches rapid duplicate writes (same hook firing twice, race conditions).
@@ -908,8 +872,6 @@ async function handleSessionStart() {
  if (!process.env.CLAUDE_MEM_SKIP_AUTO_DEDUP_FUZZY) {
  const SCAN_LIMIT = 500;
  const FUZZY_MAX_MERGES = 20;
- const FUZZY_THRESHOLD = 0.95;
- const MINHASH_PREFILTER = 0.7;
  const recent = db.prepare(`
  SELECT id, title, importance, created_at_epoch
  FROM observations
@@ -929,7 +891,7 @@ async function handleSessionStart() {
  for (let j = i + 1; j < recent.length; j++) {
  if (!minhashes[j] || removed.has(recent[j].id)) continue;
  if (estimateJaccardFromMinHash(minhashes[i], minhashes[j]) < MINHASH_PREFILTER) continue;
- if (jaccardSimilarity(titles[i], titles[j]) < FUZZY_THRESHOLD) continue;
+ if (jaccardSimilarity(titles[i], titles[j]) < FUZZY_DEDUP_THRESHOLD) continue;
  // Keep the higher-importance row; tiebreak by older (lower id wins access history)
  const keep = (recent[i].importance ?? 1) >= (recent[j].importance ?? 1) ? recent[i] : recent[j];
  const remove = keep === recent[i] ? recent[j] : recent[i];
@@ -1361,57 +1323,16 @@ function handleAutoCompress() {

  try {
  const compressCutoff = Date.now() - 60 * 86400000; // 60 days
- const compressCandidates = db.prepare(`
- SELECT id, project, type, title, created_at_epoch
- FROM observations
- WHERE COALESCE(importance, 1) = 1 AND COALESCE(access_count, 0) = 0
- AND created_at_epoch < ?
- AND (compressed_into IS NULL OR compressed_into = ${COMPRESSED_AUTO})
- ORDER BY project, created_at_epoch
- `).all(compressCutoff);
+ const compressCandidates = selectCompressionCandidates(db, { cutoff: compressCutoff, includeAutoMarked: true });
  if (compressCandidates.length < 3) return;

- const groups = new Map();
- for (const c of compressCandidates) {
- const key = `${c.project}::${isoWeekKey(c.created_at_epoch)}`;
- if (!groups.has(key)) groups.set(key, []);
- groups.get(key).push(c);
- }
- // Transact each group to prevent orphan summaries on crash
- const compressGroup = db.transaction((proj, obs) => {
- const types = {};
- for (const o of obs) types[o.type] = (types[o.type] || 0) + 1;
- const dominantType = Object.entries(types).sort((a, b) => b[1] - a[1])[0][0];
- const title = `Weekly summary: ${obs.length} ${dominantType} observations`;
- const narrative = obs.map(o => `- ${o.title || '(untitled)'}`).join('\n');
- const sortedEpochs = obs.map(o => o.created_at_epoch).sort((a, b) => a - b);
- const medianEpoch = sortedEpochs[Math.floor(sortedEpochs.length / 2)];
- const sessionId = `compress-${proj}`;
- const now = new Date();
- db.prepare(`INSERT OR IGNORE INTO sdk_sessions
- (content_session_id, memory_session_id, project, started_at, started_at_epoch, status)
- VALUES (?,?,?,?,?,'active')`
- ).run(sessionId, sessionId, proj, now.toISOString(), now.getTime());
- // Defense-in-depth: title/narrative are derived from already-stored
- // obs.title, but those rows pre-date the central scrub policy in some
- // cases. Re-scrub at the persistence boundary.
- const safe = scrubRecord('observations', { text: narrative, title, narrative });
- const summaryResult = db.prepare(`INSERT INTO observations
- (memory_session_id, project, text, type, title, subtitle, narrative, concepts, facts,
- files_read, files_modified, importance, created_at, created_at_epoch)
- VALUES (?,?,?,?,?,'',?,'','','[]','[]',2,?,?)`
- ).run(sessionId, proj, safe.text, dominantType, safe.title, safe.narrative, new Date(medianEpoch).toISOString(), medianEpoch);
- const summaryId = Number(summaryResult.lastInsertRowid);
- const obsIds = obs.map(o => o.id);
- db.prepare(`UPDATE observations SET compressed_into = ? WHERE id IN (${obsIds.map(() => '?').join(',')})`)
- .run(summaryId, ...obsIds);
- return obs.length;
- });
+ const groups = groupByProjectWeek(compressCandidates);
+ // Transact each group to prevent orphan summaries on crash (CLI/MCP wrap all groups in one).
+ const compressGroupTxn = db.transaction((proj, obs) => compressGroup(db, proj, obs).compressed);
  let totalCompressed = 0;
  for (const [key, obs] of groups) {
- if (obs.length < 3) continue;
  const [proj] = key.split('::');
- totalCompressed += compressGroup(proj, obs);
+ totalCompressed += compressGroupTxn(proj, obs);
  }
  if (totalCompressed > 0) {
  debugLog('DEBUG', 'auto-compress', `auto-compressed ${totalCompressed} observations into weekly summaries`);
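
The removed inline code shows what the extracted helpers are expected to do: bucket candidates by project and ISO week, then derive a summary title from the dominant type. A minimal sketch of those two pure steps follows. `isoWeekKey` here is a stand-in written for illustration (the package's own helper is not shown in this diff and may format keys differently), `summaryTitle` is a hypothetical name, and the three-row minimum mirrors the removed `obs.length < 3` check.

```javascript
// Stand-in ISO-8601 week key in UTC; the package's isoWeekKey may differ.
function isoWeekKey(epochMs) {
  const d = new Date(epochMs);
  const t = new Date(Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate()));
  const weekday = t.getUTCDay() || 7;          // Monday=1 ... Sunday=7
  t.setUTCDate(t.getUTCDate() + 4 - weekday);  // move to the Thursday of this ISO week
  const yearStart = Date.UTC(t.getUTCFullYear(), 0, 1);
  const week = Math.ceil(((t.getTime() - yearStart) / 86400000 + 1) / 7);
  return `${t.getUTCFullYear()}-W${String(week).padStart(2, '0')}`;
}

// Same grouping as the removed loop: key is "<project>::<week>".
function groupByProjectWeek(candidates) {
  const groups = new Map();
  for (const c of candidates) {
    const key = `${c.project}::${isoWeekKey(c.created_at_epoch)}`;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(c);
  }
  return groups;
}

// Title derivation copied from the removed transaction body.
function summaryTitle(obs) {
  const counts = {};
  for (const o of obs) counts[o.type] = (counts[o.type] || 0) + 1;
  const dominantType = Object.entries(counts).sort((a, b) => b[1] - a[1])[0][0];
  return `Weekly summary: ${obs.length} ${dominantType} observations`;
}

const DAY_MS = 86400000;
const monday = Date.UTC(2024, 0, 1); // 2024-01-01 is a Monday
const candidates = [
  { project: 'demo', type: 'bugfix', created_at_epoch: monday },
  { project: 'demo', type: 'bugfix', created_at_epoch: monday + 2 * DAY_MS },
  { project: 'demo', type: 'decision', created_at_epoch: monday + 6 * DAY_MS },
  { project: 'demo', type: 'bugfix', created_at_epoch: monday + 7 * DAY_MS }, // following week
];
const groups = groupByProjectWeek(candidates);
for (const [key, obs] of groups) {
  if (obs.length < 3) continue; // too small to summarise
  console.log(key, '->', summaryTitle(obs));
}
```

The new hunk no longer shows the per-group size check, so it presumably moved into `groupByProjectWeek` or `compressGroup`; the diff does not say which.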
@@ -387,6 +387,42 @@ const IMPORTANCE_CAP = 3;
  const IMPORTANCE_FLOOR = 0;
  const UNCITED_STREAK_THRESHOLD = 3;

+ // Adoption-rate gate (P5 ②). A project's cite-rate is SUM(cited_count) /
+ // SUM(decay_seen_count) over its non-superseded observations: of every decay
+ // resolution this project has ever produced, what fraction were citations.
+ // Below ADOPTION_THRESHOLD with at least ADOPTION_MIN_SEEN resolutions on record,
+ // the project has demonstrably not adopted the #NN convention, so we suppress
+ // DEMOTION (never promotion) — see the construct-validity note on
+ // applyCitationDecay. MIN_SEEN keeps the gate dormant for low-data projects so
+ // the established behavior is preserved until there's enough signal to judge.
+ const ADOPTION_THRESHOLD = 0.02;
+ const ADOPTION_MIN_SEEN = 8;
+
+ /**
+ * Compute a project's citation-adoption snapshot: total citations vs total decay
+ * resolutions on record, and their ratio. Read-only; safe to call before the
+ * decay transaction (the gate decision is made on the pre-mutation snapshot).
+ *
+ * @param {import('better-sqlite3').Database} db
+ * @param {string} project
+ * @returns {{cited: number, seen: number, rate: number}}
+ */
+ export function computeCitationAdoption(db, project) {
+ const empty = { cited: 0, seen: 0, rate: 0 };
+ if (!db || !project) return empty;
+ try {
+ const row = db.prepare(`
+ SELECT COALESCE(SUM(cited_count), 0) AS cited,
+ COALESCE(SUM(decay_seen_count), 0) AS seen
+ FROM observations
+ WHERE project = ? AND superseded_at IS NULL
+ `).get(project);
+ const cited = row?.cited || 0;
+ const seen = row?.seen || 0;
+ return { cited, seen, rate: seen > 0 ? cited / seen : 0 };
+ } catch (e) { debugCatch(e, 'computeCitationAdoption'); return empty; }
+ }
425
+
390
426
  /**
391
427
  * Apply the citation-feedback loop for one session: for each injected obs id,
392
428
  * decide cited vs uncited and mutate importance/streak/cited_count per spec.
@@ -398,6 +434,20 @@ const UNCITED_STREAK_THRESHOLD = 3;
   * - cross-project IDs are silently ignored by the WHERE clause.
   * - MEM_DISABLE_CITATION_DECAY=1 disables all writes; returns zeros.
   *
+  * CONSTRUCT-VALIDITY ASSUMPTION (P5): a "citation" is operationally two signals,
+  * neither of which is ground-truth behavioral impact:
+  * 1. the literal `#NN` token appears in main-thread assistant text (citedIds), and
+  * 2. (cite-back) the agent edited a file a prior lesson #NN had warned about —
+  *    unioned into citedIds by the Stop handler before this call.
+  * Signal 2 was added because signal 1 alone penalizes projects that act on a
+  * lesson without typing its id. Even so, both are proxies. For a project that has
+  * never cited anything (cite-rate below ADOPTION_THRESHOLD over ≥ADOPTION_MIN_SEEN
+  * resolutions), demotion is suppressed: absent any positive signal we cannot
+  * distinguish "useless lesson" from "useful lesson in a project that doesn't use
+  * the #NN convention," and a false demotion is the costlier error. The gate trades
+  * missed demotions (stale lessons linger) for avoided false demotions. Promotion
+  * is never gated — a single citation lifts the project's rate and re-enables decay.
+  *
   * @param {import('better-sqlite3').Database} db
   * @param {string} project
   * @param {Set<number>|Iterable<number>} injectedIds
@@ -413,6 +463,13 @@ export function applyCitationDecay(db, project, injectedIds, citedIds, sessionId
    if (injected.size === 0) return empty;
    const cited = citedIds instanceof Set ? citedIds : new Set(citedIds || []);

+   // Adoption gate (snapshot taken before any mutation this run). Suppress only
+   // demotion; promotion always proceeds. Threshold overridable via env.
+   const adoption = computeCitationAdoption(db, project);
+   const envThreshold = Number.parseFloat(process.env.CLAUDE_MEM_CITATION_ADOPTION_THRESHOLD);
+   const adoptionThreshold = Number.isFinite(envThreshold) && envThreshold >= 0 ? envThreshold : ADOPTION_THRESHOLD;
+   const suppressDemotion = adoption.seen >= ADOPTION_MIN_SEEN && adoption.rate < adoptionThreshold;
+
    const selectStmt = db.prepare(
      'SELECT id, importance, uncited_streak, last_decided_session_id FROM observations WHERE id = ? AND project = ?'
    );
@@ -457,7 +514,10 @@ export function applyCitationDecay(db, project, injectedIds, citedIds, sessionId
        promoted++;
      } else {
        const nextStreak = (row.uncited_streak || 0) + 1;
-       if (nextStreak >= UNCITED_STREAK_THRESHOLD) {
+       // Demote only when the streak is up AND the project has demonstrably
+       // adopted citations. A non-adopting project advances the streak (idempotent
+       // bookkeeping) but never loses importance — see construct-validity note.
+       if (nextStreak >= UNCITED_STREAK_THRESHOLD && !suppressDemotion) {
          updateDemote.run(IMPORTANCE_FLOOR, sessionId, Date.now(), id);
          demoted++;
        } else {
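The gate decision in the hunks above can be exercised in isolation. What follows is a minimal sketch under stated assumptions, not the package's code: `resolveThreshold` and `shouldSuppressDemotion` are hypothetical helper names, the snapshot objects mirror the `{cited, seen, rate}` shape returned by `computeCitationAdoption`, and the two constants are copied from the diff.

```javascript
// Constants copied from the diff above.
const ADOPTION_THRESHOLD = 0.02;
const ADOPTION_MIN_SEEN = 8;

// Hypothetical helper: parse an optional override string, falling back to the
// default when the value is missing, non-numeric, or negative.
function resolveThreshold(raw) {
  const parsed = Number.parseFloat(raw);
  return Number.isFinite(parsed) && parsed >= 0 ? parsed : ADOPTION_THRESHOLD;
}

// Hypothetical helper: true when demotion should be skipped for the project.
function shouldSuppressDemotion(snapshot, threshold = ADOPTION_THRESHOLD) {
  return snapshot.seen >= ADOPTION_MIN_SEEN && snapshot.rate < threshold;
}

// Illustrative snapshots (made-up numbers).
const lowData = { cited: 0, seen: 5, rate: 0 };      // too few resolutions to judge
const nonAdopting = { cited: 0, seen: 40, rate: 0 }; // enough data, no citations
const adopting = { cited: 4, seen: 40, rate: 0.1 };  // rate above the threshold

console.log(shouldSuppressDemotion(lowData));     // false
console.log(shouldSuppressDemotion(nonAdopting)); // true
console.log(shouldSuppressDemotion(adopting));    // false
```

One consequence of the comparison `rate < threshold`: an override of `0` is accepted by the parser and makes the condition unsatisfiable, because a rate is never negative, so demotion is never suppressed.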
@@ -17,6 +17,11 @@ import { EDIT_TOOLS } from '../utils.mjs';

  const MAX_FILES = 2;

+ // Leader literal for the cite-back hint. Shared by the builder (below) and the
+ // Stop-time signal extractor (extractCiteBackSignals) so the two can never drift
+ // — the extractor finds hint emissions by this exact prefix.
+ const CITE_BACK_HINT_LEADER = '[mem] ⚠ Cite-back:';
+
  export function buildCiteBackHint(episode, cooldown) {
    if (!episode || !cooldown) return null;
    const entries = episode.entries;
@@ -48,7 +53,7 @@ export function buildCiteBackHint(episode, cooldown) {
    // numeric framing is measurably harder to dismiss than a hedged hint.
    const totalLessons = matches.reduce((sum, m) => sum + m.ids.length, 0);
    const lines = [
-     `[mem] ⚠ Cite-back: edited ${matches.length} file(s) with ${totalLessons} prior lesson(s) this session. Save now if any was the root cause:`,
+     `${CITE_BACK_HINT_LEADER} edited ${matches.length} file(s) with ${totalLessons} prior lesson(s) this session. Save now if any was the root cause:`,
    ];
    for (const m of matches) {
      const fname = basename(m.file);
@@ -242,3 +247,36 @@ export function loadCiteBackForEpisode(episode, runtimeDir) {
    }
    return buildCiteBackHint(episode, cooldown);
  }
+
+ // ─── extractCiteBackSignals (P5 ①) ──────────────────────────────────────────
+ // Stop-time positive-citation signal. Scans the transcript for cite-back hint
+ // emissions (PostToolUse attachment.stdout carrying CITE_BACK_HINT_LEADER — the
+ // same source countUnsavedBugfixShape reads) and collects the `#NN` lesson ids
+ // they name. Each id is an observation whose warned file the agent actually
+ // EDITED this session — a behavioral citation even when the agent never typed
+ // #NN. The Stop handler unions these into the cited set passed to
+ // applyCitationDecay (lib/citation-tracker.mjs), so acting on a lesson promotes
+ // it and lifts the project's adoption rate. Returns an empty set on missing path.
+ const CITE_BACK_ID_RE = /#(\d{1,7})\b/g;
+
+ export function extractCiteBackSignals(transcriptPath) {
+   const ids = new Set();
+   if (!transcriptPath || !existsSync(transcriptPath)) return ids;
+   let raw;
+   try { raw = readFileSync(transcriptPath, 'utf8'); } catch { return ids; }
+   for (const line of raw.split('\n')) {
+     if (!line.trim()) continue;
+     let entry;
+     try { entry = JSON.parse(line); } catch { continue; }
+     if (entry.type !== 'attachment') continue;
+     const stdout = entry.attachment?.stdout || '';
+     if (!stdout.includes(CITE_BACK_HINT_LEADER)) continue;
+     CITE_BACK_ID_RE.lastIndex = 0;
+     let m;
+     while ((m = CITE_BACK_ID_RE.exec(stdout))) {
+       const id = Number(m[1]);
+       if (Number.isInteger(id) && id > 0 && id < 1e7) ids.add(id);
+     }
+   }
+   return ids;
+ }
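The scan performed by `extractCiteBackSignals` can be tried without a transcript file. The sketch below is an illustration under assumptions, not the package's implementation: `collectCiteBackIds` is a hypothetical name, it reads an in-memory JSONL string instead of a path, the sample entries and the `utils.mjs: #12, #345` hint line are invented, and the entry shape (`type: 'attachment'`, `attachment.stdout`) is taken only from what the diff's code reads. It also adds a null guard on `entry` that the diff does not have.

```javascript
// Leader literal copied from the diff above.
const CITE_BACK_HINT_LEADER = '[mem] ⚠ Cite-back:';

// Hypothetical helper: the same filtering steps as the diff, minus file I/O.
function collectCiteBackIds(jsonl) {
  const ids = new Set();
  for (const line of jsonl.split('\n')) {
    if (!line.trim()) continue;
    let entry;
    try { entry = JSON.parse(line); } catch { continue; }
    // Null guard is an addition in this sketch (JSON.parse('null') returns null).
    if (!entry || entry.type !== 'attachment') continue;
    const stdout = entry.attachment?.stdout || '';
    if (!stdout.includes(CITE_BACK_HINT_LEADER)) continue;
    // A fresh regex literal per iteration avoids shared lastIndex state.
    for (const m of stdout.matchAll(/#(\d{1,7})\b/g)) {
      const id = Number(m[1]);
      if (id > 0) ids.add(id);
    }
  }
  return ids;
}

// Invented sample transcript: one hint emission, three lines that must be skipped.
const sample = [
  JSON.stringify({
    type: 'attachment',
    attachment: { stdout: `${CITE_BACK_HINT_LEADER} edited 1 file(s) with 2 prior lesson(s) this session.\n  utils.mjs: #12, #345` },
  }),
  JSON.stringify({ type: 'assistant', text: 'mentions #99 but is not a hint' }),
  'not json',
  JSON.stringify({ type: 'attachment', attachment: { stdout: 'unrelated output #7' } }),
].join('\n');

console.log([...collectCiteBackIds(sample)]); // [ 12, 345 ]
```

Only the first entry contributes ids: the second has the wrong `type`, the third fails to parse, and the fourth lacks the leader prefix.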