claude-mem-lite 2.88.0 → 2.90.0

@@ -10,9 +10,9 @@
10
10
  "plugins": [
11
11
  {
12
12
  "name": "claude-mem-lite",
13
- "version": "2.88.0",
13
+ "version": "2.90.0",
14
14
  "source": "./",
15
- "description": "Persistent long-term memory for Claude Code via MCP — captures coding decisions, bugfixes, and context across sessions. Hybrid FTS5 + TF-IDF search with episode batching. Single SQLite DB, no external services. Alternative to claude-mem with 600x lower cost."
15
+ "description": "Persistent long-term memory for Claude Code via MCP — captures coding decisions, bugfixes, and context across sessions. Hybrid FTS5 + TF-IDF search with episode batching. Single SQLite DB, no external services. A lighter, lower-cost alternative to claude-mem (episode batching + a smaller model; cost savings are an internal estimate, not a measured benchmark)."
16
16
  }
17
17
  ]
18
18
  }
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "name": "claude-mem-lite",
3
- "version": "2.88.0",
4
- "description": "Persistent long-term memory for Claude Code via MCP — captures coding decisions, bugfixes, and context across sessions. Hybrid FTS5 + TF-IDF search with episode batching. Single SQLite DB, no external services. Alternative to claude-mem with 600x lower cost.",
3
+ "version": "2.90.0",
4
+ "description": "Persistent long-term memory for Claude Code via MCP — captures coding decisions, bugfixes, and context across sessions. Hybrid FTS5 + TF-IDF search with episode batching. Single SQLite DB, no external services. A lighter, lower-cost alternative to claude-mem (episode batching + a smaller model; cost savings are an internal estimate, not a measured benchmark).",
5
5
  "author": {
6
6
  "name": "sdsrss"
7
7
  },
package/README.md CHANGED
@@ -4,7 +4,7 @@
4
4
 
5
5
  `claude-mem-lite` is a **persistent memory** (also called *long-term memory* or *cross-session context*) system for **[Claude Code](https://docs.anthropic.com/en/docs/claude-code)** — Anthropic's CLI coding agent. It runs as an **[MCP](https://modelcontextprotocol.io/) server** plus a set of Claude Code hooks, automatically capturing coding observations, decisions, and bug fixes during sessions, then providing hybrid full-text + semantic search to recall them later.
6
6
 
7
- Compared to general-purpose LLM memory frameworks like [`mem0`](https://github.com/mem0ai/mem0) or the MCP reference [`memory`](https://github.com/modelcontextprotocol/servers/tree/main/src/memory) server, claude-mem-lite is purpose-built for Claude Code's hook lifecycle: episode batching cuts LLM calls 7–10× vs the original [claude-mem](https://github.com/thedotmack/claude-mem) (600× lower total cost), and the hybrid FTS5 + TF-IDF retriever benchmarks at 0.88 Recall@10 / 0.96 Precision@10.
7
+ Compared to general-purpose LLM memory frameworks like [`mem0`](https://github.com/mem0ai/mem0) or the MCP reference [`memory`](https://github.com/modelcontextprotocol/servers/tree/main/src/memory) server, claude-mem-lite is purpose-built for Claude Code's hook lifecycle: episode batching cuts LLM calls 7–10× vs the original [claude-mem](https://github.com/thedotmack/claude-mem) (an estimated ~600× lower total cost — see the cost model below; this is an architecture estimate, not a measured benchmark), while the hybrid FTS5 + TF-IDF retriever benchmarks at 0.88 Recall@10 / 0.96 Precision@10.
8
8
 
9
9
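  > Chinese summary: claude-mem-lite is a lightweight **persistent memory / long-term memory / cross-session context** plugin for Claude Code. Built on the MCP protocol plus hooks, it automatically captures decisions, fixes, and context from coding sessions and recalls them through hybrid FTS5 + TF-IDF retrieval. See the [Chinese README](README.zh-CN.md).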
10
10
 
@@ -29,15 +29,17 @@ A ground-up redesign of [claude-mem](https://github.com/thedotmack/claude-mem),
29
29
 
30
30
  ### Token & cost efficiency
31
31
 
32
- For a typical 50-tool-call session:
32
+ For a typical 50-tool-call session (illustrative cost model — the ratios below are
33
+ architecture estimates derived from batch size, token counts, and model pricing, **not**
34
+ a measured end-to-end benchmark):
33
35
 
34
- | | claude-mem | claude-mem-lite | Ratio |
36
+ | | claude-mem | claude-mem-lite | Ratio (estimated) |
35
37
  |---|---|---|---|
36
- | LLM calls | ~50 (every tool use) | ~5-8 (per episode) | **7-10x fewer** |
37
- | Tokens per call | 1,000-5,000 (raw JSON + history) | 200-500 (summaries only) | **5-10x smaller** |
38
- | Total tokens | ~100K-250K | ~1K-4K | **50-100x less** |
39
- | Model cost | Sonnet ($3/$15 per M) | Haiku ($0.25/$1.25 per M) | **12x cheaper** |
40
- | Combined savings | | | **600x+ lower cost** |
38
+ | LLM calls | ~50 (every tool use) | ~5-8 (per episode) | **~7-10x fewer** |
39
+ | Tokens per call | 1,000-5,000 (raw JSON + history) | 200-500 (summaries only) | **~5-10x smaller** |
40
+ | Total tokens | ~100K-250K | ~1K-4K | **~50-100x less** |
41
+ | Model cost | Sonnet ($3/$15 per M) | Haiku ($0.25/$1.25 per M) | **~12x cheaper** |
42
+ | Combined savings | | | **~600x lower cost (estimated)** |
41
43
 
42
44
  ### Quality comparison
43
45
 
@@ -681,7 +683,7 @@ No. Claude Code's `CLAUDE.md` and `MEMORY.md` files act as static instruction me
681
683
 
682
684
  ### Why "lite"? What did the original claude-mem do differently?
683
685
 
684
- The original called an LLM on every tool use with raw JSON inputs. claude-mem-lite batches 5–10 operations per LLM call, uses a smaller model (Haiku), and runs a deterministic code-level filter before sending anything to the model. Net result: ~600× lower cost with equivalent search quality. See the [Architecture comparison](#architecture-comparison) above.
686
+ The original called an LLM on every tool use with raw JSON inputs. claude-mem-lite batches 5–10 operations per LLM call, uses a smaller model (Haiku), and runs a deterministic code-level filter before sending anything to the model. Net result: an estimated ~600× lower cost (an architecture estimate from the cost model above, not a measured benchmark) with equivalent search quality. See the [Architecture comparison](#architecture-comparison) above.
685
687
 
686
688
  ### Does this work cross-project? Cross-machine?
687
689
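The "Combined savings" row in the README table above is a product of two of the other rows. A minimal sketch of that arithmetic follows, using only the figures quoted in the table; the variable names are illustrative, and the result is the same architecture estimate the README describes, not a measurement.

```javascript
// Figures copied from the README cost table; the arithmetic is illustrative only.
const inputPricePerMTok = { sonnet: 3.0, haiku: 0.25 };   // USD per million input tokens
const outputPricePerMTok = { sonnet: 15.0, haiku: 1.25 }; // USD per million output tokens

// Per-token price ratio between the two models (both work out to 12).
const inputPriceRatio = inputPricePerMTok.sonnet / inputPricePerMTok.haiku;
const outputPriceRatio = outputPricePerMTok.sonnet / outputPricePerMTok.haiku;

// Token-volume ratio range as stated in the "Total tokens" row.
const tokenRatio = { low: 50, high: 100 };

// Combined estimate = token-volume ratio x price ratio.
const combined = {
  low: tokenRatio.low * inputPriceRatio,
  high: tokenRatio.high * inputPriceRatio,
};

console.log(combined); // { low: 600, high: 1200 }
```

The low end of this range is where the "~600x" headline figure comes from.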
 
package/README.zh-CN.md CHANGED
@@ -4,7 +4,7 @@
4
4
 
5
5
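  `claude-mem-lite` is a **persistent memory system** (also called **long-term memory / cross-session context / Claude Code memory plugin**) for **[Claude Code](https://docs.anthropic.com/en/docs/claude-code)** (Anthropic's official CLI coding agent). It runs as an **[MCP](https://modelcontextprotocol.io/) server** plus Claude Code hooks, automatically capturing observations, decisions, and bug fixes during coding sessions, and recalls historical context through hybrid retrieval that combines FTS5 full-text search with TF-IDF vectors.
6
6
 
7
- Compared with general-purpose LLM memory frameworks such as [`mem0`](https://github.com/mem0ai/mem0) and the official MCP reference [`memory`](https://github.com/modelcontextprotocol/servers/tree/main/src/memory) server, claude-mem-lite is tailored to Claude Code's hook lifecycle: episode batching reduces LLM calls 7-10x relative to the original [claude-mem](https://github.com/thedotmack/claude-mem) (overall cost down 600x), and hybrid FTS5 + TF-IDF retrieval reaches **Recall@10 = 0.88 / Precision@10 = 0.96** on a 30-query benchmark.
7
+ Compared with general-purpose LLM memory frameworks such as [`mem0`](https://github.com/mem0ai/mem0) and the official MCP reference [`memory`](https://github.com/modelcontextprotocol/servers/tree/main/src/memory) server, claude-mem-lite is tailored to Claude Code's hook lifecycle: episode batching reduces LLM calls 7-10x relative to the original [claude-mem](https://github.com/thedotmack/claude-mem) (overall cost down an estimated ~600x; see the cost model below, which is an architecture estimate rather than a measured benchmark); hybrid FTS5 + TF-IDF retrieval reaches **Recall@10 = 0.88 / Precision@10 = 0.96** on a 30-query benchmark.
8
8
 
9
9
  No external services. A single SQLite database. Very low overhead.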
10
10
 
@@ -27,15 +27,15 @@
27
27
 
28
28
  ### Token and cost efficiency
29
29
 
30
- For a typical session with 50 tool calls:
30
+ For a typical session with 50 tool calls (illustrative cost model: the ratios below are **estimated** from batch size, token counts, and model pricing, not measured end to end):
31
31
 
32
- | | claude-mem | claude-mem-lite | Ratio |
32
+ | | claude-mem | claude-mem-lite | Ratio (estimated) |
33
33
  |---|---|---|---|
34
- | LLM calls | ~50 (every tool use) | ~5-8 (per episode) | **7-10x fewer** |
35
- | Tokens per call | 1,000-5,000 (raw JSON + history) | 200-500 (summaries only) | **5-10x fewer** |
36
- | Total tokens | ~100K-250K | ~1K-4K | **50-100x fewer** |
37
- | Model cost | Sonnet ($3/$15 per million) | Haiku ($0.25/$1.25 per million) | **12x cheaper** |
38
- | Combined savings | | | **600x+ lower cost** |
34
+ | LLM calls | ~50 (every tool use) | ~5-8 (per episode) | **~7-10x fewer** |
35
+ | Tokens per call | 1,000-5,000 (raw JSON + history) | 200-500 (summaries only) | **~5-10x fewer** |
36
+ | Total tokens | ~100K-250K | ~1K-4K | **~50-100x fewer** |
37
+ | Model cost | Sonnet ($3/$15 per million) | Haiku ($0.25/$1.25 per million) | **~12x cheaper** |
38
+ | Combined savings | | | **~600x lower cost (estimated)** |
39
39
 
40
40
  ### Quality comparison
41
41
 
package/cli/activity.mjs CHANGED
@@ -11,6 +11,7 @@
11
11
  import { inferProject } from '../utils.mjs';
12
12
  import { resolveProject } from '../project-utils.mjs';
13
13
  import { parseArgs, out, fail } from './common.mjs';
14
+ import { parseIntFlag } from '../lib/cli-flags.mjs';
14
15
 
15
16
  function formatActivityResults(rows) {
16
17
  if (!rows || rows.length === 0) return '(no events)';
@@ -77,17 +78,20 @@ export async function cmdActivity(db, args) {
77
78
  fail(`[mem] activity search: invalid --type "${type}". Valid: ${[...VALID_EVENT_TYPES].join(', ')}`);
78
79
  return;
79
80
  }
80
- const limit = flags.limit !== undefined ? parseInt(flags.limit, 10) : 10;
81
+ const limit = parseIntFlag(flags.limit, { name: '--limit', defaultValue: 10, max: 1000 });
81
82
  const rows = searchEvents(db, q, { project, type, limit });
82
83
  out(formatActivityResults(rows));
83
84
  return;
84
85
  }
85
86
 
86
87
  if (sub === 'recent') {
87
- // Accept either `activity recent 5` or `activity recent --limit 5`.
88
- const posLimit = positional.length > 0 ? parseInt(positional[0], 10) : NaN;
89
- const flagLimit = flags.limit !== undefined ? parseInt(flags.limit, 10) : NaN;
90
- const limit = Number.isFinite(posLimit) ? posLimit : (Number.isFinite(flagLimit) ? flagLimit : 20);
88
+ // Accept either `activity recent 5` or `activity recent --limit 5`. Both routed
89
+ // through parseIntFlag so garbage ("2abc"), negatives (SQLite LIMIT -1 = UNLIMITED
90
+ // full-table dump), and uncapped huge values warn + clamp to default/max, matching
91
+ // the search/recent/browse siblings.
92
+ const limit = positional.length > 0
93
+ ? parseIntFlag(positional[0], { name: 'count', defaultValue: 20, max: 1000 })
94
+ : parseIntFlag(flags.limit, { name: '--limit', defaultValue: 20, max: 1000 });
91
95
  const type = flags.type || null;
92
96
  if (type !== null && !VALID_EVENT_TYPES.has(type)) {
93
97
  fail(`[mem] activity recent: invalid --type "${type}". Valid: ${[...VALID_EVENT_TYPES].join(', ')}`);
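The comments in the `activity recent` hunk above describe what `parseIntFlag` is expected to do with bad input: garbage such as `"2abc"`, negatives, and oversized values warn and fall back to the default or the cap. The real `lib/cli-flags.mjs` is not part of this diff, so the following is only a hypothetical sketch of that contract. The name `parseIntFlagSketch`, the `warn` option, and the choice to treat values below 1 as invalid are all assumptions.

```javascript
// Hypothetical stand-in for parseIntFlag; the real lib/cli-flags.mjs is not shown in this diff.
function parseIntFlagSketch(raw, { name, defaultValue, max, warn = console.warn }) {
  // Flag not supplied: use the default silently.
  if (raw === undefined) return defaultValue;

  const text = String(raw).trim();
  // Require the whole token to be digits. parseInt("2abc", 10) would return 2,
  // and "-1" would reach SQLite as LIMIT -1 (no upper bound), so both are rejected here.
  if (!/^\d+$/.test(text)) {
    warn(`${name}: invalid value "${raw}", using default ${defaultValue}`);
    return defaultValue;
  }

  const n = Number(text);
  // Assumption: a limit below 1 is not useful, so fall back to the default.
  if (n < 1) {
    warn(`${name}: value must be at least 1, using default ${defaultValue}`);
    return defaultValue;
  }
  // Clamp oversized values to the cap instead of passing them to the query.
  if (n > max) {
    warn(`${name}: value ${n} exceeds max ${max}, clamping`);
    return max;
  }
  return n;
}

const opts = { name: '--limit', defaultValue: 20, max: 1000, warn: () => {} };
console.log(parseIntFlagSketch('2abc', opts)); // 20
console.log(parseIntFlagSketch('-1', opts));   // 20
console.log(parseIntFlagSketch('5000', opts)); // 1000
console.log(parseIntFlagSketch('5', opts));    // 5
```

Compared with the removed `parseInt(flags.limit, 10)` calls, a whole-token check like this is what keeps partially numeric input from being silently accepted.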
package/cli.mjs CHANGED
@@ -1,5 +1,5 @@
1
1
  #!/usr/bin/env node
2
- const CLI_COMMANDS = new Set(['search', 'recent', 'recall', 'get', 'timeline', 'save', 'stats', 'context', 'browse', 'citation-stats', 'delete', 'update', 'export', 'compress', 'maintain', 'optimize', 'fts-check', 'registry', 'import', 'import-jsonl', 'enrich', 'activity', 'adopt', 'unadopt', 'memdir-audit', 'defer', 'help']);
2
+ const CLI_COMMANDS = new Set(['search', 'recent', 'recall', 'get', 'timeline', 'save', 'stats', 'context', 'browse', 'citation-stats', 'delete', 'update', 'export', 'restore', 'compress', 'maintain', 'optimize', 'fts-check', 'registry', 'import', 'import-jsonl', 'enrich', 'activity', 'adopt', 'unadopt', 'memdir-audit', 'defer', 'help']);
3
3
  const INSTALL_COMMANDS = new Set(['install', 'uninstall', 'status', 'doctor', 'cleanup', 'cleanup-hooks', 'self-update', 'repair', 'release']);
4
4
 
5
5
  const cmd = process.argv[2];
@@ -13,14 +13,16 @@ if (cmd === '--version' || cmd === '-v') {
13
13
  } else if (cmd === '--help' || cmd === '-h') {
14
14
  const { run } = await import('./mem-cli.mjs');
15
15
  await run(['help']);
16
- } else if (cmd === 'doctor' && process.argv.slice(3).some(a => a.startsWith('--') && a.length > 2)) {
17
- // Per #8217 single-source-of-truth: any flagged `doctor --X` is a DB-layer
18
- // inspection tool (--benchmark, --metrics, --session-audit, future flags)
19
- // and routes to mem-cli. Plain `doctor` (no flags) keeps running the
20
- // install health-check below; adding a new flag in cli/doctor.mjs no
21
- // longer requires touching this enumeration. The `length > 2` guard
22
- // ignores a bare `--` (POSIX end-of-options separator) so `doctor --`
23
- // continues to route to install.mjs, not mem-cli.
16
+ } else if (cmd === 'doctor' && process.argv.slice(3).some(a => a === '--benchmark' || a === '--metrics' || a === '--session-audit')) {
17
+ // Per #8217: the DB-layer doctor modes (--benchmark / --metrics / --session-audit,
18
+ // each implemented in cli/doctor.mjs) route to mem-cli. Everything else — plain
19
+ // `doctor`, `doctor --` (POSIX end-of-options), and `doctor --json` stays with
20
+ // install.mjs's health-check, which OWNS --json (install.mjs doctor() line ~1216).
21
+ // Pre-fix the router forwarded ANY flagged `doctor --X` to mem-cli, so the documented
22
+ // `doctor --json` (install health JSON, advertised in install.mjs usage) was shadowed
23
+ // and rejected by cli/doctor.mjs. Gating on the three DB-layer flags keeps --json
24
+ // (and any future install-doctor flag) on the install path. Adding a NEW DB-layer
25
+ // mode requires extending this list — a deliberate trade for a working --json.
24
26
  const { run } = await import('./mem-cli.mjs');
25
27
  await run(process.argv.slice(2));
26
28
  } else if (CLI_COMMANDS.has(cmd)) {
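The routing change in the `doctor` branch above replaces "any `--flag`" with an explicit allow-list of three flags. A small sketch of that predicate in isolation is below; the helper name `doctorRoutesToMemCli` and the `Set` constant are illustrative and do not appear in the package.

```javascript
// The three DB-layer doctor modes named in the diff; everything else stays on the install path.
const DB_LAYER_DOCTOR_FLAGS = new Set(['--benchmark', '--metrics', '--session-audit']);

// Returns true when `doctor <args>` should be forwarded to mem-cli.
// `args` corresponds to process.argv.slice(3) in cli.mjs.
function doctorRoutesToMemCli(args) {
  return args.some(a => DB_LAYER_DOCTOR_FLAGS.has(a));
}

console.log(doctorRoutesToMemCli(['--benchmark'])); // true
console.log(doctorRoutesToMemCli(['--json']));      // false
console.log(doctorRoutesToMemCli(['--']));          // false
console.log(doctorRoutesToMemCli([]));              // false
```

The trade-off noted in the diff comment still applies: a new DB-layer mode has to be added to the list by hand.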
package/haiku-client.mjs CHANGED
@@ -20,6 +20,14 @@ const MODEL_MAP = {
20
20
  sonnet: 'claude-sonnet-4-5-20250929',
21
21
  };
22
22
 
23
+ // Every background LLM call here is fixed-schema extraction / classification
24
+ // (episode→JSON, type/merge classification, synonym + metadata extraction) whose
25
+ // output is consumed deterministically (JSON.parse, MinHash dedup). Pin temperature
26
+ // to 0 so the provider default (~1.0) doesn't inject wording variance that breaks
27
+ // JSON parsing or defeats the wording-sensitive MinHash near-duplicate detector.
28
+ // A call that genuinely needs sampling can pass opts.temperature to override.
29
+ const DEFAULT_LLM_TEMPERATURE = 0;
30
+
23
31
  /**
24
32
  * Resolve the LLM model to use for background calls.
25
33
  * Reads CLAUDE_MEM_MODEL env var, defaults to 'haiku'.
@@ -143,7 +151,7 @@ export function flattenForCLI(input) {
143
151
  * @param {number} [opts.maxTokens=500] Max tokens in response
144
152
  * @returns {Promise<{text: string}|null>} Response or null on failure
145
153
  */
146
- export async function callHaiku(prompt, { timeout = 10000, maxTokens = 500 } = {}) {
154
+ export async function callHaiku(prompt, { timeout = 10000, maxTokens = 500, temperature = DEFAULT_LLM_TEMPERATURE } = {}) {
147
155
  if (!prompt) return null;
148
156
 
149
157
  const mode = detectMode();
@@ -160,8 +168,8 @@ export async function callHaiku(prompt, { timeout = 10000, maxTokens = 500 } = {
160
168
  let primary = null;
161
169
  try {
162
170
  primary = mode === 'api'
163
- ? await callHaikuAPI(prompt, { timeout, maxTokens })
164
- : await callOpenRouterAPI(prompt, resolveModel().cli, { timeout, maxTokens });
171
+ ? await callHaikuAPI(prompt, { timeout, maxTokens, temperature })
172
+ : await callOpenRouterAPI(prompt, resolveModel().cli, { timeout, maxTokens, temperature });
165
173
  } catch (e) {
166
174
  debugCatch(e, `callHaiku:${mode}`);
167
175
  }
@@ -198,7 +206,7 @@ export async function callHaikuJSON(prompt, opts) {
198
206
  * @param {number} [opts.maxTokens=1000] Max tokens in response
199
207
  * @returns {Promise<{text: string}|null>} Response or null on failure
200
208
  */
201
- export async function callLLMWithModel(prompt, model = 'haiku', { timeout = 15000, maxTokens = 1000 } = {}) {
209
+ export async function callLLMWithModel(prompt, model = 'haiku', { timeout = 15000, maxTokens = 1000, temperature = DEFAULT_LLM_TEMPERATURE } = {}) {
202
210
  if (!prompt) return null;
203
211
  const resolvedModel = MODEL_MAP[model] ? model : 'haiku';
204
212
  const mode = detectMode();
@@ -214,8 +222,8 @@ export async function callLLMWithModel(prompt, model = 'haiku', { timeout = 1500
214
222
  let primary = null;
215
223
  try {
216
224
  primary = mode === 'api'
217
- ? await callModelAPI(prompt, resolvedModel, { timeout, maxTokens })
218
- : await callOpenRouterAPI(prompt, resolvedModel, { timeout, maxTokens });
225
+ ? await callModelAPI(prompt, resolvedModel, { timeout, maxTokens, temperature })
226
+ : await callOpenRouterAPI(prompt, resolvedModel, { timeout, maxTokens, temperature });
219
227
  } catch (e) {
220
228
  debugCatch(e, `callLLMWithModel:${mode}:${resolvedModel}`);
221
229
  }
@@ -239,7 +247,7 @@ export async function callModelJSON(prompt, model = 'haiku', opts) {
239
247
  return parseJsonFromLLM(result.text);
240
248
  }
241
249
 
242
- async function callModelAPI(prompt, model, { timeout, maxTokens }) {
250
+ async function callModelAPI(prompt, model, { timeout, maxTokens, temperature = DEFAULT_LLM_TEMPERATURE }) {
243
251
  const apiKey = process.env.ANTHROPIC_API_KEY;
244
252
  if (!apiKey) return null;
245
253
 
@@ -252,6 +260,7 @@ async function callModelAPI(prompt, model, { timeout, maxTokens }) {
252
260
  const body = {
253
261
  model: modelId,
254
262
  max_tokens: maxTokens,
263
+ temperature,
255
264
  messages: [{ role: 'user', content: user }],
256
265
  };
257
266
  // System slot is constant per call type (instructions, schema, type taxonomy)
@@ -312,7 +321,7 @@ function callModelCLI(prompt, model, { timeout }) {
312
321
 
313
322
  // ─── API Mode ────────────────────────────────────────────────────────────────
314
323
 
315
- async function callHaikuAPI(prompt, { timeout, maxTokens }) {
324
+ async function callHaikuAPI(prompt, { timeout, maxTokens, temperature = DEFAULT_LLM_TEMPERATURE }) {
316
325
  const apiKey = process.env.ANTHROPIC_API_KEY;
317
326
  if (!apiKey) return null;
318
327
 
@@ -325,6 +334,7 @@ async function callHaikuAPI(prompt, { timeout, maxTokens }) {
325
334
  const body = {
326
335
  model: modelId,
327
336
  max_tokens: maxTokens,
337
+ temperature,
328
338
  messages: [{ role: 'user', content: user }],
329
339
  };
330
340
  // See callModelAPI: cache_control on the constant system slot.
@@ -365,7 +375,7 @@ async function callHaikuAPI(prompt, { timeout, maxTokens }) {
365
375
  // `cache_control` field has no OpenAI-format equivalent and is omitted.
366
376
  // `tier` is the resolved model tier ('haiku'|'sonnet'); OPENROUTER_MODEL can
367
377
  // override the resulting slug entirely (see resolveOpenRouterModel).
368
- async function callOpenRouterAPI(prompt, tier, { timeout, maxTokens }) {
378
+ async function callOpenRouterAPI(prompt, tier, { timeout, maxTokens, temperature = DEFAULT_LLM_TEMPERATURE }) {
369
379
  const apiKey = process.env.OPENROUTER_API_KEY;
370
380
  if (!apiKey) return null;
371
381
 
@@ -387,7 +397,7 @@ async function callOpenRouterAPI(prompt, tier, { timeout, maxTokens }) {
387
397
  // Optional OpenRouter attribution headers (ignored by the API if absent).
388
398
  'X-Title': 'claude-mem-lite',
389
399
  },
390
- body: JSON.stringify({ model, max_tokens: maxTokens, messages }),
400
+ body: JSON.stringify({ model, max_tokens: maxTokens, temperature, messages }),
391
401
  signal: controller.signal,
392
402
  });
393
403
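The haiku-client.mjs changes above thread a `temperature` option through every call path using destructuring defaults. The sketch below shows how such a default behaves: it applies only when the option is omitted (or `undefined`), so an explicit override wins. `buildRequestBody` is a hypothetical helper written for illustration; only `DEFAULT_LLM_TEMPERATURE` and the body field names come from the diff.

```javascript
const DEFAULT_LLM_TEMPERATURE = 0;

// Hypothetical helper mirroring the request body shape shown in the diff.
function buildRequestBody(modelId, user, { maxTokens = 500, temperature = DEFAULT_LLM_TEMPERATURE } = {}) {
  return {
    model: modelId,
    max_tokens: maxTokens,
    temperature,
    messages: [{ role: 'user', content: user }],
  };
}

// No options: the pinned default of 0 is used.
console.log(buildRequestBody('some-model', 'hi').temperature); // 0

// Explicit override for a call that needs sampling.
console.log(buildRequestBody('some-model', 'hi', { temperature: 0.7 }).temperature); // 0.7

// Destructuring defaults do not replace null, only undefined.
console.log(buildRequestBody('some-model', 'hi', { temperature: null }).temperature); // null
```

The last case is worth keeping in mind for callers: passing `null` would send `temperature: null` rather than the default.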
 
package/hook-handoff.mjs CHANGED
@@ -32,12 +32,26 @@ import * as taskReaderModule from './lib/task-reader.mjs';
32
32
  * @param {string|null} [scopeSessionId=null] CC UUID for session_handoffs.session_id column
33
33
  */
34
34
  export function buildAndSaveHandoff(db, sessionId, project, type, episodeSnapshot, scopeSessionId = null) {
35
- // 1. Working objective — from user prompts
36
- const prompts = db.prepare(`
37
- SELECT prompt_text FROM user_prompts
38
- WHERE content_session_id = ?
39
- ORDER BY prompt_number ASC LIMIT 5
40
- `).all(sessionId);
35
+ // 1. Working objective — from user prompts.
36
+ // D#26: getSessionId() is project-scoped, so multiple CC sessions in one project
37
+ // share `content_session_id`. When a genuine CC scope is passed (scopeSessionId is
38
+ // the CC UUID, i.e. differs from the mem-internal sessionId), filter to THIS CC
39
+ // session's prompts so working_on doesn't merge concurrent/sequential sessions.
40
+ // `OR cc_session_id IS NULL` keeps legacy rows + non-CC/no-stdin invocations. When
41
+ // scopeSessionId is absent or == sessionId (legacy/test/no-stdin), fall back to the
42
+ // unfiltered query (identical to pre-D#26 behavior).
43
+ const ccScope = scopeSessionId && scopeSessionId !== sessionId ? scopeSessionId : null;
44
+ const prompts = ccScope
45
+ ? db.prepare(`
46
+ SELECT prompt_text FROM user_prompts
47
+ WHERE content_session_id = ? AND (cc_session_id = ? OR cc_session_id IS NULL)
48
+ ORDER BY prompt_number ASC LIMIT 5
49
+ `).all(sessionId, ccScope)
50
+ : db.prepare(`
51
+ SELECT prompt_text FROM user_prompts
52
+ WHERE content_session_id = ?
53
+ ORDER BY prompt_number ASC LIMIT 5
54
+ `).all(sessionId);
41
55
  if (prompts.length === 0) return; // Empty session — nothing to hand off
42
56
 
43
57
  // Filter prompts whose only content is workflow/control language ("继续",
@@ -73,12 +87,30 @@ export function buildAndSaveHandoff(db, sessionId, project, type, episodeSnapsho
73
87
  }
74
88
  }
75
89
 
90
+ // D#28 (completes D#26): observations carry the project-scoped memory_session_id, shared by
91
+ // parallel/sequential same-project CC sessions. Lower-bound the observation queries below to
92
+ // THIS CC session's start (earliest prompt epoch for ccScope) so Completed / Key Files / Key
93
+ // Decisions stop merging a prior session's work — the observation-side complement of
94
+ // working_on's cc-scoping. When ccScope is absent or its session has no prompts (MIN→null),
95
+ // ccWindowStart stays null and the queries run unscoped (pre-D#28 behavior). Residual: truly
96
+ // concurrent same-project sessions whose windows overlap can still co-attribute a few rows.
97
+ let ccWindowStart = null;
98
+ if (ccScope) {
99
+ const w = db.prepare(`
100
+ SELECT MIN(created_at_epoch) AS startEpoch FROM user_prompts
101
+ WHERE content_session_id = ? AND cc_session_id = ?
102
+ `).get(sessionId, ccScope);
103
+ if (typeof w?.startEpoch === 'number') ccWindowStart = w.startEpoch;
104
+ }
105
+ const obsWindowClause = ccWindowStart !== null ? 'AND created_at_epoch >= ?' : '';
106
+ const obsWindowParams = ccWindowStart !== null ? [ccWindowStart] : [];
107
+
76
108
  // 2. Completed — from observations (include narrative for richer handoff)
77
109
  const completed = db.prepare(`
78
110
  SELECT title, type, narrative FROM observations
79
- WHERE memory_session_id = ? AND COALESCE(compressed_into, 0) = 0
111
+ WHERE memory_session_id = ? AND COALESCE(compressed_into, 0) = 0 ${obsWindowClause}
80
112
  ORDER BY created_at_epoch DESC LIMIT 15
81
- `).all(sessionId);
113
+ `).all(sessionId, ...obsWindowParams);
82
114
 
83
115
  // 3. Recent activity — episode snapshot + full session edit history from narratives.
84
116
  // Keep only entries that represent in-flight work (file edits) or outright failures
@@ -131,9 +163,9 @@ export function buildAndSaveHandoff(db, sessionId, project, type, episodeSnapsho
131
163
  if (episodeSnapshot?.files) episodeSnapshot.files.filter(isValidFile).forEach(f => fileSet.add(f));
132
164
  const obsFiles = db.prepare(`
133
165
  SELECT files_modified FROM observations
134
- WHERE memory_session_id = ? AND files_modified IS NOT NULL
166
+ WHERE memory_session_id = ? AND files_modified IS NOT NULL ${obsWindowClause}
135
167
  ORDER BY created_at_epoch DESC LIMIT 10
136
- `).all(sessionId);
168
+ `).all(sessionId, ...obsWindowParams);
137
169
  for (const row of obsFiles) {
138
170
  try { JSON.parse(row.files_modified).filter(isValidFile).forEach(f => fileSet.add(f)); } catch {}
139
171
  }
@@ -142,9 +174,9 @@ export function buildAndSaveHandoff(db, sessionId, project, type, episodeSnapsho
142
174
  const decisions = db.prepare(`
143
175
  SELECT title FROM observations
144
176
  WHERE memory_session_id = ? AND COALESCE(importance, 1) >= 2
145
- AND COALESCE(compressed_into, 0) = 0
177
+ AND COALESCE(compressed_into, 0) = 0 ${obsWindowClause}
146
178
  ORDER BY created_at_epoch DESC LIMIT 10
147
- `).all(sessionId).filter(d => d.title && !LOW_SIGNAL_TITLE.test(d.title)).slice(0, 5);
179
+ `).all(sessionId, ...obsWindowParams).filter(d => d.title && !LOW_SIGNAL_TITLE.test(d.title)).slice(0, 5);
148
180
 
149
181
  // 6. Match keywords
150
182
  const allText = [workingOn, ...completed.map(c => c.title).filter(Boolean), unfinished].join(' ');
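The D#28 change above appends an optional `AND created_at_epoch >= ?` clause and a matching parameter to three observation queries. The sketch below isolates that pattern for the "Completed" query so the clause and its parameter can be seen staying in step. `buildCompletedQuery` is a hypothetical helper; the diff itself inlines `obsWindowClause` and `obsWindowParams` instead.

```javascript
// Hypothetical helper: builds the "Completed" query text and its positional params together.
function buildCompletedQuery(sessionId, ccWindowStart) {
  // A null window start means "no CC scope": run the query unscoped, as before D#28.
  const windowed = ccWindowStart !== null;
  const sql = [
    'SELECT title, type, narrative FROM observations',
    'WHERE memory_session_id = ? AND COALESCE(compressed_into, 0) = 0',
    windowed ? 'AND created_at_epoch >= ?' : '',
    'ORDER BY created_at_epoch DESC LIMIT 15',
  ].filter(Boolean).join('\n');
  const params = windowed ? [sessionId, ccWindowStart] : [sessionId];
  return { sql, params };
}

const scoped = buildCompletedQuery('mem-session-1', 1700000000000);
console.log(scoped.params.length); // 2

const unscoped = buildCompletedQuery('mem-session-1', null);
console.log(unscoped.params.length); // 1
```

Only a fixed SQL fragment is interpolated into the statement; the epoch value itself is still bound as a parameter, which matches the approach in the diff.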
package/hook-llm.mjs CHANGED
@@ -12,6 +12,7 @@ import {
12
12
  import { acquireLLMSlot, releaseLLMSlot } from './hook-semaphore.mjs';
13
13
  import { scrubRecord } from './lib/scrub-record.mjs';
14
14
  import { getVocabulary, computeVector } from './tfidf.mjs';
15
+ import { DEDUP_JACCARD_THRESHOLD, AUTO_MERGE_THRESHOLD } from './lib/dedup-constants.mjs';
15
16
  import {
16
17
  RUNTIME_DIR, DEDUP_WINDOW_MS, RELATED_OBS_WINDOW_MS,
17
18
  sessionFile, getSessionId, openDb, callLLM, sleep,
@@ -148,7 +149,7 @@ export function saveObservation(obs, projectOverride, sessionIdOverride, externa
148
149
  ORDER BY created_at_epoch DESC LIMIT 10
149
150
  `).all(project, fiveMinAgo);
150
151
 
151
- if (obs.title && recent.some(r => jaccardSimilarity(r.title, obs.title) > 0.7)) {
152
+ if (obs.title && recent.some(r => jaccardSimilarity(r.title, obs.title) > DEDUP_JACCARD_THRESHOLD)) {
152
153
  return null; // dedup: Jaccard title match
153
154
  }
154
155
 
@@ -173,8 +174,8 @@ export function saveObservation(obs, projectOverride, sessionIdOverride, externa
173
174
  WHERE project = ? AND created_at_epoch > ? AND created_at_epoch <= ?
174
175
  ORDER BY created_at_epoch DESC LIMIT 60
175
176
  `).all(project, threeDaysAgo, fiveMinAgo);
176
- if (extRecent.some(r => jaccardSimilarity(r.title, obs.title) > 0.85)) {
177
- return null; // dedup: low-signal Jaccard match
177
+ if (extRecent.some(r => jaccardSimilarity(r.title, obs.title) > AUTO_MERGE_THRESHOLD)) {
178
+ return null; // dedup: low-signal Jaccard match (stricter cutoff for degraded titles)
178
179
  }
179
180
  }
180
181
 
package/hook-optimize.mjs CHANGED
@@ -13,6 +13,7 @@ import { callModelJSON } from './haiku-client.mjs';
13
13
  import { acquireLLMSlot, releaseLLMSlot } from './hook-semaphore.mjs';
14
14
  import { scrubRecord } from './lib/scrub-record.mjs';
15
15
  import { getVocabulary, computeVector, cosineSimilarity } from './tfidf.mjs';
16
+ import { MERGE_JACCARD_LOW, AUTO_MERGE_THRESHOLD } from './lib/dedup-constants.mjs';
16
17
  import { DB_DIR } from './schema.mjs';
17
18
 
18
19
  const RUNTIME_DIR = join(DB_DIR, 'runtime');
@@ -331,8 +332,9 @@ export async function executeNormalize(db, force = false, { project } = {}) {
331
332
  // ─── Task 3: Cluster-merge ─────────────────────────────────────────────────
332
333
 
333
334
  const MERGE_TIME_WINDOW_MS = 30 * 86400000;
334
- const MERGE_JACCARD_LOW = 0.4;
335
- const MERGE_JACCARD_HIGH = 0.85;
335
+ // Merge-review band [MERGE_JACCARD_LOW, AUTO_MERGE_THRESHOLD): titles in this
336
+ // Jaccard range are LLM-reviewed for merge; at/above AUTO_MERGE_THRESHOLD they'd
337
+ // already auto-merge elsewhere, below MERGE_JACCARD_LOW they're too dissimilar.
336
338
 
337
339
  export function findMergeCandidates(db, maxClusters = 5, { project } = {}) {
338
340
  const cutoff = Date.now() - MERGE_TIME_WINDOW_MS;
@@ -363,12 +365,14 @@ export function findMergeCandidates(db, maxClusters = 5, { project } = {}) {
363
365
  if (Math.abs(rows[i].created_at_epoch - rows[j].created_at_epoch) > MERGE_TIME_WINDOW_MS) continue;
364
366
 
365
367
  if (rows[i].minhash_sig && rows[j].minhash_sig) {
368
+ // 0.8 slack: the MinHash estimate is noisy, so pre-filter a band below
369
+ // MERGE_JACCARD_LOW rather than at it, to avoid dropping true candidates.
366
370
  const est = estimateJaccardFromMinHash(rows[i].minhash_sig, rows[j].minhash_sig);
367
371
  if (est < MERGE_JACCARD_LOW * 0.8) continue;
368
372
  }
369
373
 
370
374
  const titleSim = jaccardSimilarity(rows[i].title, rows[j].title);
371
- if (titleSim >= MERGE_JACCARD_LOW && titleSim < MERGE_JACCARD_HIGH) {
375
+ if (titleSim >= MERGE_JACCARD_LOW && titleSim < AUTO_MERGE_THRESHOLD) {
372
376
  cluster.push(rows[j]);
373
377
  used.add(rows[j].id);
374
378
  }
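The merge-review band described in the hook-optimize.mjs comment above can be illustrated with a small classifier. This is a sketch under assumptions: the package's own `jaccardSimilarity` is not shown in this diff, so `jaccardSketch` uses a plain whitespace-token Jaccard index, and the threshold values are the literals removed by this diff (0.4 and 0.85), assumed to be unchanged in `lib/dedup-constants.mjs`.

```javascript
// Token-set Jaccard index: |A ∩ B| / |A ∪ B|. Assumed tokenization: lowercase, split on whitespace.
function jaccardSketch(a, b) {
  const tokens = s => new Set(String(s).toLowerCase().split(/\s+/).filter(Boolean));
  const A = tokens(a);
  const B = tokens(b);
  if (A.size === 0 && B.size === 0) return 0;
  let inter = 0;
  for (const t of A) if (B.has(t)) inter++;
  return inter / (A.size + B.size - inter);
}

// Values taken from the literals this diff removes; assumed unchanged in lib/dedup-constants.mjs.
const MERGE_JACCARD_LOW = 0.4;
const AUTO_MERGE_THRESHOLD = 0.85;

// Classifies a title similarity into the three ranges the comment describes.
function mergeBand(sim) {
  if (sim >= AUTO_MERGE_THRESHOLD) return 'auto-merge';
  if (sim >= MERGE_JACCARD_LOW) return 'llm-review';
  return 'distinct';
}

const sim = jaccardSketch('fix login bug', 'fix login crash');
console.log(sim);            // 0.5
console.log(mergeBand(sim)); // llm-review
```

Sharing `AUTO_MERGE_THRESHOLD` between the save-time dedup and the merge-review upper bound keeps the two ranges adjacent, so a pair is handled by exactly one of them.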
package/hook-update.mjs CHANGED
@@ -7,7 +7,7 @@ import { readFileSync, writeFileSync, copyFileSync, cpSync, readdirSync, existsS
7
7
  import { join, dirname } from 'node:path';
8
8
  import { pathToFileURL } from 'node:url';
9
9
  import { tmpdir, homedir } from 'node:os';
10
- import { DB_DIR } from './schema.mjs';
10
+ import { DB_DIR, CODE_DIR } from './schema.mjs';
11
11
  import { debugCatch, debugLog } from './utils.mjs';
12
12
  // Local manifest is fallback only — the active manifest is loaded from the
13
13
  // extracted tarball's own source-files.mjs inside installExtractedRelease.
@@ -16,8 +16,15 @@ import { SOURCE_FILES as LOCAL_SOURCE_FILES, HOOK_SCRIPT_FILES as LOCAL_HOOK_SCR
16
16
 
17
17
  // ── Configuration ──────────────────────────────────────────
18
18
  const GITHUB_REPO = 'sdsrss/claude-mem-lite';
19
- const INSTALL_DIR = DB_DIR; // ~/.claude-mem-lite/
20
- const STATE_FILE = join(INSTALL_DIR, 'runtime', 'update-state.json');
19
+ // Plugin CODE location (server.mjs / package.json / install target) — always
20
+ // homedir-rooted, NEVER follows CLAUDE_MEM_DIR (see schema.mjs CODE_DIR). Used
21
+ // for dev-mode detection, current-version read, and the install target dir.
22
+ const INSTALL_DIR = CODE_DIR; // ~/.claude-mem-lite/ (code)
23
+ // DATA/state location — runtime/update-state.json lives with the data (env-aware
24
+ // DB_DIR), matching hook-shared RUNTIME_DIR and install.mjs doctor's read path.
25
+ // Equal to INSTALL_DIR unless CLAUDE_MEM_DIR relocates the data dir.
26
+ const STATE_DIR = DB_DIR;
27
+ const STATE_FILE = join(STATE_DIR, 'runtime', 'update-state.json');
21
28
  const CHECK_INTERVAL_MS = 24 * 60 * 60 * 1000; // 24 hours
22
29
  const FETCH_TIMEOUT_MS = 3000; // 3s network timeout
23
30
  const RATE_LIMIT_INTERVAL_MS = 6 * 60 * 60 * 1000; // 6h if rate-limited
@@ -558,7 +565,7 @@ function readState() {
558
565
 
559
566
  function saveState(state) {
560
567
  try {
561
- const dir = join(INSTALL_DIR, 'runtime');
568
+ const dir = join(STATE_DIR, 'runtime');
562
569
  mkdirSync(dir, { recursive: true });
563
570
  const tmpFile = STATE_FILE + `.tmp-${process.pid}`;
564
571
  writeFileSync(tmpFile, JSON.stringify(state, null, 2));
package/hook.mjs CHANGED
@@ -63,7 +63,8 @@ import { checkForUpdate, getCachedUpdateBanner, isUpdateCheckDue } from './hook-
63
63
  import { handleLLMOptimize } from './hook-optimize.mjs';
64
64
  import { silentAutoAdopt, hasAutoAdoptMarker } from './adopt-cli.mjs';
65
65
  import { emitV270UpgradeBanner } from './lib/upgrade-banner.mjs';
66
- import { loadCiteBackForEpisode, buildUnsavedBugfixHint, countUnsavedBugfixShape, buildCiteRecallNudge as libBuildCiteRecallNudge, nextCiteLowStreak } from './lib/cite-back-hint.mjs';
66
+ import { loadCiteBackForEpisode, extractCiteBackSignals, buildUnsavedBugfixHint, countUnsavedBugfixShape, buildCiteRecallNudge as libBuildCiteRecallNudge, nextCiteLowStreak } from './lib/cite-back-hint.mjs';
67
+ import { MINHASH_PREFILTER, FUZZY_DEDUP_THRESHOLD } from './lib/dedup-constants.mjs';
67
68
  // plugin-cache-guard.mjs loaded dynamically — pre-2.31.2 installs that auto-upgraded
68
69
  // from an older hook-update.mjs SOURCE_FILES (which did not list this module) would
69
70
  // crash on static import. Degrade gracefully to no-op when the module is absent.
@@ -542,6 +543,12 @@ async function handleStop() {
  // contract test in tests/citation-tracker-userprompt.test.mjs covers it.
  try {
  const injected = extractAllInjected(transcriptPath);
+ // P5 ①: cite-back signals — observations whose warned file the agent
+ // edited this session. Union into injected so they're resolved (they
+ // were injected via pre-tool-recall) and, below, into cited so the
+ // edit promotes them even without a literal #NN in text.
+ const citeBackIds = extractCiteBackSignals(transcriptPath);
+ for (const id of citeBackIds) injected.add(id);
  if (injected.size > 0) {
  // Text-floor gate: skip decay on tool-only Stops. Without this,
  // a turn that ends on tool_use locks every injected obs as
@@ -554,6 +561,7 @@ async function handleStop() {
  debugLog('DEBUG', 'handleStop', `citation-decay: skipped (no main-thread assistant text yet, injected=${injected.size})`);
  } else {
  const citedMain = extractCitationsFromTranscript(transcriptPath, { mainOnly: true });
+ for (const id of citeBackIds) citedMain.add(id);
  const r = applyCitationDecay(db, project, injected, citedMain, sessionId);
  debugLog('DEBUG', 'handleStop', `citation-decay: touched=${r.touched} promoted=${r.promoted} demoted=${r.demoted}`);
  }
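The two `handleStop` hunks above union the cite-back IDs into both the injected set and the cited set before decay runs. The sketch below illustrates only that set logic. It assumes that "promoted" means injected and cited, and "demoted" means injected but not cited; `applyCitationDecay` itself is not shown in this diff, so `classify` is a hypothetical stand-in and the sample IDs are made up.

```javascript
// Hypothetical stand-in for the promote/demote split. The real
// applyCitationDecay(db, project, injected, cited, sessionId) is not shown
// in this diff and may weigh things differently.
function classify(injected, cited) {
  const promoted = [];
  const demoted = [];
  for (const id of injected) {
    (cited.has(id) ? promoted : demoted).push(id);
  }
  return { promoted, demoted };
}

const injected = new Set([11, 12]); // illustrative IDs injected this session
const citedMain = new Set([11]);    // IDs cited literally as #NN in main-thread text
const citeBackIds = new Set([13]);  // IDs whose warned file the agent edited

// Same two union steps as the diff.
for (const id of citeBackIds) injected.add(id);
for (const id of citeBackIds) citedMain.add(id);

const result = classify(injected, citedMain);
console.log(result);
```

Under these assumptions, observation 13 lands in the promoted group even though it was never cited as `#13` in text, which is the behavior the added comment describes.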
@@ -864,8 +872,6 @@ async function handleSessionStart() {
  if (!process.env.CLAUDE_MEM_SKIP_AUTO_DEDUP_FUZZY) {
  const SCAN_LIMIT = 500;
  const FUZZY_MAX_MERGES = 20;
- const FUZZY_THRESHOLD = 0.95;
- const MINHASH_PREFILTER = 0.7;
  const recent = db.prepare(`
  SELECT id, title, importance, created_at_epoch
  FROM observations
@@ -885,7 +891,7 @@ async function handleSessionStart() {
  for (let j = i + 1; j < recent.length; j++) {
  if (!minhashes[j] || removed.has(recent[j].id)) continue;
  if (estimateJaccardFromMinHash(minhashes[i], minhashes[j]) < MINHASH_PREFILTER) continue;
- if (jaccardSimilarity(titles[i], titles[j]) < FUZZY_THRESHOLD) continue;
+ if (jaccardSimilarity(titles[i], titles[j]) < FUZZY_DEDUP_THRESHOLD) continue;
  // Keep the higher-importance row; tiebreak by older (lower id wins access history)
  const keep = (recent[i].importance ?? 1) >= (recent[j].importance ?? 1) ? recent[i] : recent[j];
  const remove = keep === recent[i] ? recent[j] : recent[i];
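The dedup hunks above move the thresholds into a shared module but keep the same comparison and keep/remove rule. A minimal sketch of that rule follows, with several assumptions: the threshold value `0.95` is the inline constant this diff removes (whether `lib/dedup-constants.mjs` keeps the same number is not shown), `jaccard` is a hypothetical word-token implementation rather than the package's `jaccardSimilarity`, and the MinHash prefilter stage is omitted entirely.

```javascript
// Assumed value: the inline constant removed from hook.mjs in this diff.
const FUZZY_DEDUP_THRESHOLD = 0.95;

// Hypothetical word-token Jaccard; the package's jaccardSimilarity may tokenize differently.
function jaccard(a, b) {
  const ta = new Set(a.toLowerCase().split(/\s+/).filter(Boolean));
  const tb = new Set(b.toLowerCase().split(/\s+/).filter(Boolean));
  if (ta.size === 0 && tb.size === 0) return 1;
  let shared = 0;
  for (const t of ta) if (tb.has(t)) shared++;
  return shared / (ta.size + tb.size - shared);
}

// Mirrors the keep/remove rule in the hunk: higher importance wins, ties keep `a`.
function pickDuplicate(a, b) {
  if (jaccard(a.title, b.title) < FUZZY_DEDUP_THRESHOLD) return null;
  const keep = (a.importance ?? 1) >= (b.importance ?? 1) ? a : b;
  const remove = keep === a ? b : a;
  return { keep: keep.id, remove: remove.id };
}

// Illustrative rows, not taken from a real database.
const rowA = { id: 1, title: 'Fix NUL truncation in prompts', importance: 2 };
const rowB = { id: 2, title: 'fix nul truncation in prompts', importance: 1 };
const rowC = { id: 3, title: 'Fix NUL truncation in handoffs', importance: 3 };

console.log(pickDuplicate(rowA, rowB)); // titles differ only by case
console.log(pickDuplicate(rowA, rowC)); // one of five tokens differs
```

With word tokens, a single differing word in a five-word title gives a similarity of 4/6, well under 0.95, so a threshold this high only merges near-identical titles.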
@@ -1202,7 +1208,12 @@ async function handleUserPrompt() {
  // every downstream consumer (user_prompts INSERT, FTS query, continuation
  // detection, semantic-memory injection) sees the redacted text — single
  // source of truth for the privacy primitive.
- const promptText = stripPrivate(rawPrompt);
+ // Strip NUL / C0 control chars (keep \t \n \r) before any downstream use: an
+ // embedded NUL terminates SQLite's C string, silently truncating the stored
+ // prompt_text at the first NUL (and breaking FTS). Single source of truth, so the
+ // user_prompts INSERT, FTS query, and continuation detection all see clean text.
+ // eslint-disable-next-line no-control-regex -- intentional: NUL/C0 strip prevents SQLite C-string truncation
+ const promptText = stripPrivate(rawPrompt).replace(/[\x00-\x08\x0b\x0c\x0e-\x1f]/g, '');
 
  const sessionId = getSessionId();
  const db = openDb();
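The hunk above strips NUL and other C0 control characters while keeping tab, newline, and carriage return. The sketch below applies the same character class to a sample string to show what survives. `stripControlChars` is a hypothetical helper name (the hook inlines the `.replace()` call after `stripPrivate()`), and the sample input is made up.

```javascript
// Same character class as the diff: C0 controls except \t (0x09), \n (0x0a), \r (0x0d).
// eslint-disable-next-line no-control-regex
const CONTROL_CHARS = /[\x00-\x08\x0b\x0c\x0e-\x1f]/g;

// Hypothetical helper name; hook.mjs inlines this replace after stripPrivate().
function stripControlChars(text) {
  return text.replace(CONTROL_CHARS, '');
}

// Illustrative input: an embedded NUL, a tab, a newline, and an ESC (0x1b) sequence.
const raw = 'fix the bug\x00 in parser\tnow\n\x1b[0m';
const clean = stripControlChars(raw);

console.log(JSON.stringify(clean));
```

The NUL and the ESC byte are removed, while the tab and newline pass through. Note that the class stops at `\x1f`, so DEL (`\x7f`) is left untouched.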
@@ -1227,24 +1238,27 @@ async function handleUserPrompt() {
  ).get(sessionId);
  const promptNumber = bumped?.prompt_counter || 1;
 
+ // Claude Code's real session_id (CC UUID) from hook stdin. Persisted on the
+ // prompt row (cc_session_id) so buildAndSaveHandoff can scope working_on to ONE
+ // CC session — getSessionId() is project-scoped (no CC-UUID), so without this
+ // concurrent/within-TTL same-project sessions merge each other's prompts (D#26).
+ // Also scopes handoff-row injection below. Null (legacy) when stdin lacks session_id.
+ const ccSessionId = typeof hookData.session_id === 'string' && hookData.session_id.length > 0
+ ? hookData.session_id
+ : null;
+
  db.prepare(`
- INSERT INTO user_prompts (content_session_id, prompt_text, prompt_number, created_at, created_at_epoch)
- VALUES (?, ?, ?, ?, ?)
+ INSERT INTO user_prompts (content_session_id, prompt_text, prompt_number, cc_session_id, created_at, created_at_epoch)
+ VALUES (?, ?, ?, ?, ?, ?)
  `).run(
  sessionId,
  scrubSecrets(promptText.slice(0, 10000)),
  promptNumber,
+ ccSessionId,
  now.toISOString(), now.getTime()
  );
 
  // Cross-session handoff injection (first 3 prompts window, before semantic memory).
- // Use Claude Code's real session_id from hook stdin to scope handoffs to this CC
- // session — prevents cross-session bleed when running parallel sessions for the
- // same project (see docs/bug.txt). Falls back to null (legacy behavior) if the
- // hook input does not carry session_id.
- const ccSessionId = typeof hookData.session_id === 'string' && hookData.session_id.length > 0
- ? hookData.session_id
- : null;
  if (promptNumber <= 3) {
  try {
  if (detectContinuationIntent(db, promptText, project, ccSessionId)) {