npm - claude-mem-lite - Versions diffs - 2.88.0 → 2.90.0 - Mend

claude-mem-lite 2.88.0 → 2.90.0

Files changed (28)

package/.claude-plugin/marketplace.json +2 -2
package/.claude-plugin/plugin.json +2 -2
package/README.md +11 -9
package/README.zh-CN.md +8 -8
package/cli/activity.mjs +9 -5
package/cli.mjs +11 -9
package/haiku-client.mjs +20 -10
package/hook-handoff.mjs +44 -12
package/hook-llm.mjs +4 -3
package/hook-optimize.mjs +7 -3
package/hook-update.mjs +11 -4
package/hook.mjs +28 -14
package/install.mjs +46 -19
package/lib/citation-tracker.mjs +61 -1
package/lib/cite-back-hint.mjs +39 -1
package/lib/cli-flags.mjs +24 -2
package/lib/compress-core.mjs +24 -4
package/lib/dedup-constants.mjs +35 -0
package/lib/maintain-core.mjs +5 -2
package/lib/save-observation.mjs +1 -1
package/mem-cli.mjs +163 -17
package/nlp.mjs +6 -0
package/package.json +3 -2
package/schema.mjs +45 -3
package/search-engine.mjs +2 -1
package/server.mjs +8 -2
package/source-files.mjs +5 -0
package/tfidf.mjs +12 -8

package/.claude-plugin/marketplace.json CHANGED

@@ -10,9 +10,9 @@
   "plugins": [
     {
       "name": "claude-mem-lite",
-      "version": "2.88.0",
+      "version": "2.90.0",
       "source": "./",
-      "description": "Persistent long-term memory for Claude Code via MCP — captures coding decisions, bugfixes, and context across sessions. Hybrid FTS5 + TF-IDF search with episode batching. Single SQLite DB, no external services. Alternative to claude-mem with 600x lower cost."
+      "description": "Persistent long-term memory for Claude Code via MCP — captures coding decisions, bugfixes, and context across sessions. Hybrid FTS5 + TF-IDF search with episode batching. Single SQLite DB, no external services. A lighter, lower-cost alternative to claude-mem (episode batching + a smaller model; cost savings are an internal estimate, not a measured benchmark)."
     }
   ]
 }

package/.claude-plugin/plugin.json CHANGED

@@ -1,7 +1,7 @@
 {
   "name": "claude-mem-lite",
-  "version": "2.88.0",
-  "description": "Persistent long-term memory for Claude Code via MCP — captures coding decisions, bugfixes, and context across sessions. Hybrid FTS5 + TF-IDF search with episode batching. Single SQLite DB, no external services. Alternative to claude-mem with 600x lower cost.",
+  "version": "2.90.0",
+  "description": "Persistent long-term memory for Claude Code via MCP — captures coding decisions, bugfixes, and context across sessions. Hybrid FTS5 + TF-IDF search with episode batching. Single SQLite DB, no external services. A lighter, lower-cost alternative to claude-mem (episode batching + a smaller model; cost savings are an internal estimate, not a measured benchmark).",
   "author": {
     "name": "sdsrss"
   },

package/README.md CHANGED

@@ -4,7 +4,7 @@
 `claude-mem-lite` is a **persistent memory** (also called *long-term memory* or *cross-session context*) system for **[Claude Code](https://docs.anthropic.com/en/docs/claude-code)** — Anthropic's CLI coding agent. It runs as an **[MCP](https://modelcontextprotocol.io/) server** plus a set of Claude Code hooks, automatically capturing coding observations, decisions, and bug fixes during sessions, then providing hybrid full-text + semantic search to recall them later.
-Compared to general-purpose LLM memory frameworks like [`mem0`](https://github.com/mem0ai/mem0) or the MCP reference [`memory`](https://github.com/modelcontextprotocol/servers/tree/main/src/memory) server, claude-mem-lite is purpose-built for Claude Code's hook lifecycle: episode batching cuts LLM calls 7–10× vs the original [claude-mem](https://github.com/thedotmack/claude-mem) (600× lower total cost), and the hybrid FTS5 + TF-IDF retriever benchmarks at 0.88 Recall@10 / 0.96 Precision@10.
+Compared to general-purpose LLM memory frameworks like [`mem0`](https://github.com/mem0ai/mem0) or the MCP reference [`memory`](https://github.com/modelcontextprotocol/servers/tree/main/src/memory) server, claude-mem-lite is purpose-built for Claude Code's hook lifecycle: episode batching cuts LLM calls 7–10× vs the original [claude-mem](https://github.com/thedotmack/claude-mem) (an estimated ~600× lower total cost — see the cost model below; this is an architecture estimate, not a measured benchmark), while the hybrid FTS5 + TF-IDF retriever benchmarks at 0.88 Recall@10 / 0.96 Precision@10.
 > 中文简介：claude-mem-lite 是 Claude Code 的轻量级**持久化记忆 / 长期记忆 / 跨会话上下文**插件，基于 MCP 协议 + 钩子机制，自动捕获编码会话中的决策、修复和上下文，并通过 FTS5 + TF-IDF 混合检索召回。详见 [中文 README](README.zh-CN.md)。
@@ -29,15 +29,17 @@ A ground-up redesign of [claude-mem](https://github.com/thedotmack/claude-mem),
 ### Token & cost efficiency
-For a typical 50-tool-call session:
+For a typical 50-tool-call session (illustrative cost model — the ratios below are
+architecture estimates derived from batch size, token counts, and model pricing, **not**
+a measured end-to-end benchmark):
-| | claude-mem | claude-mem-lite | Ratio |
+| | claude-mem | claude-mem-lite | Ratio (estimated) |
 |---|---|---|---|
-| LLM calls | ~50 (every tool use) | ~5-8 (per episode) | **7-10x fewer** |
-| Tokens per call | 1,000-5,000 (raw JSON + history) | 200-500 (summaries only) | **5-10x smaller** |
-| Total tokens | ~100K-250K | ~1K-4K | **50-100x less** |
-| Model cost | Sonnet ($3/$15 per M) | Haiku ($0.25/$1.25 per M) | **12x cheaper** |
-| Combined savings | | | **600x+ lower cost** |
+| LLM calls | ~50 (every tool use) | ~5-8 (per episode) | **~7-10x fewer** |
+| Tokens per call | 1,000-5,000 (raw JSON + history) | 200-500 (summaries only) | **~5-10x smaller** |
+| Total tokens | ~100K-250K | ~1K-4K | **~50-100x less** |
+| Model cost | Sonnet ($3/$15 per M) | Haiku ($0.25/$1.25 per M) | **~12x cheaper** |
+| Combined savings | | | **~600x lower cost (estimated)** |
 ### Quality comparison
@@ -681,7 +683,7 @@ No. Claude Code's `CLAUDE.md` and `MEMORY.md` files act as static instruction me
 ### Why "lite"? What did the original claude-mem do differently?
-The original called an LLM on every tool use with raw JSON inputs. claude-mem-lite batches 5–10 operations per LLM call, uses a smaller model (Haiku), and runs a deterministic code-level filter before sending anything to the model. Net result: ~600× lower cost with equivalent search quality. See the [Architecture comparison](#architecture-comparison) above.
+The original called an LLM on every tool use with raw JSON inputs. claude-mem-lite batches 5–10 operations per LLM call, uses a smaller model (Haiku), and runs a deterministic code-level filter before sending anything to the model. Net result: an estimated ~600× lower cost (an architecture estimate from the cost model above, not a measured benchmark) with equivalent search quality. See the [Architecture comparison](#architecture-comparison) above.
 ### Does this work cross-project? Cross-machine?
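
The combined figure in the cost table above is the product of two of its rows. A minimal sketch of that arithmetic, assuming the low end of the token estimate (50x) and the per-million prices listed in the table; `combinedCostRatio` is a hypothetical helper for illustration, not part of the package:

```javascript
// Hypothetical helper: multiplies the token-volume ratio by the model price ratio.
function combinedCostRatio(tokenRatio, priceRatio) {
  return tokenRatio * priceRatio;
}

// Price ratio from the table: Sonnet $3 vs Haiku $0.25 per million input tokens.
const priceRatio = 3 / 0.25;

// Low end of the "~50-100x less" total-token estimate.
const estimate = combinedCostRatio(50, priceRatio);
console.log(estimate);
```

With the high end of the token range (100x) the same product would be 1200x, so the ~600x figure corresponds to the conservative end of the estimate. Both inputs are architecture estimates, as the README now states.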

package/README.zh-CN.md CHANGED

@@ -4,7 +4,7 @@
 `claude-mem-lite` 是 **[Claude Code](https://docs.anthropic.com/en/docs/claude-code)**（Anthropic 官方 CLI 编程代理）的 **持久化记忆系统**（也称 **长期记忆 / 跨会话上下文 / Claude Code 记忆插件**）。它以 **[MCP](https://modelcontextprotocol.io/) 服务器** + Claude Code 钩子（hooks）的形式运行，在编码会话中自动捕获观察记录、决策、bug 修复，并通过 FTS5 全文检索 + TF-IDF 向量的混合检索召回历史上下文。
-与 [`mem0`](https://github.com/mem0ai/mem0)、MCP 官方参考实现的 [`memory`](https://github.com/modelcontextprotocol/servers/tree/main/src/memory) 服务器等通用 LLM 记忆框架相比，claude-mem-lite 专为 Claude Code 的钩子生命周期定制：episode 批处理把 LLM 调用量相比原版 [claude-mem](https://github.com/thedotmack/claude-mem) 减少 7-10 倍（综合成本下降 600 倍），FTS5 + TF-IDF 混合检索在 30 个查询的基准上达到 **Recall@10 = 0.88 / Precision@10 = 0.96**。
+与 [`mem0`](https://github.com/mem0ai/mem0)、MCP 官方参考实现的 [`memory`](https://github.com/modelcontextprotocol/servers/tree/main/src/memory) 服务器等通用 LLM 记忆框架相比，claude-mem-lite 专为 Claude Code 的钩子生命周期定制：episode 批处理把 LLM 调用量相比原版 [claude-mem](https://github.com/thedotmack/claude-mem) 减少 7-10 倍（综合成本估算下降约 600 倍 —— 见下方成本模型，属架构估算而非实测基准）；FTS5 + TF-IDF 混合检索在 30 个查询的基准上达到 **Recall@10 = 0.88 / Precision@10 = 0.96**。
 无需外部服务。单一 SQLite 数据库。开销极低。
@@ -27,15 +27,15 @@
 ### Token 与成本效率
-以典型的 50 次工具调用的会话为例：
+以典型的 50 次工具调用的会话为例（成本模型示意 —— 下列比率由批大小、token 量与模型定价**估算**得出，并非端到端实测）：
-| | claude-mem | claude-mem-lite | 比率 |
+| | claude-mem | claude-mem-lite | 比率（估算） |
 |---|---|---|---|
-| LLM 调用次数 | ~50（每次工具使用） | ~5-8（按 episode） | **减少 7-10 倍** |
-| 每次调用 token | 1,000-5,000（原始 JSON + 历史） | 200-500（仅摘要） | **减少 5-10 倍** |
-| 总 token 量 | ~100K-250K | ~1K-4K | **减少 50-100 倍** |
-| 模型成本 | Sonnet ($3/$15 每百万) | Haiku ($0.25/$1.25 每百万) | **便宜 12 倍** |
-| 综合节省 | | | **成本降低 600 倍+** |
+| LLM 调用次数 | ~50（每次工具使用） | ~5-8（按 episode） | **约减少 7-10 倍** |
+| 每次调用 token | 1,000-5,000（原始 JSON + 历史） | 200-500（仅摘要） | **约减少 5-10 倍** |
+| 总 token 量 | ~100K-250K | ~1K-4K | **约减少 50-100 倍** |
+| 模型成本 | Sonnet ($3/$15 每百万) | Haiku ($0.25/$1.25 每百万) | **约便宜 12 倍** |
+| 综合节省 | | | **成本降低约 600 倍（估算）** |
 ### 质量对比

package/cli/activity.mjs CHANGED

@@ -11,6 +11,7 @@
 import { inferProject } from '../utils.mjs';
 import { resolveProject } from '../project-utils.mjs';
 import { parseArgs, out, fail } from './common.mjs';
+import { parseIntFlag } from '../lib/cli-flags.mjs';
 function formatActivityResults(rows) {
   if (!rows || rows.length === 0) return '(no events)';
@@ -77,17 +78,20 @@ export async function cmdActivity(db, args) {
       fail(`[mem] activity search: invalid --type "${type}". Valid: ${[...VALID_EVENT_TYPES].join(', ')}`);
       return;
     }
-    const limit = flags.limit !== undefined ? parseInt(flags.limit, 10) : 10;
+    const limit = parseIntFlag(flags.limit, { name: '--limit', defaultValue: 10, max: 1000 });
     const rows = searchEvents(db, q, { project, type, limit });
     out(formatActivityResults(rows));
     return;
   }
   if (sub === 'recent') {
-    // Accept either `activity recent 5` or `activity recent --limit 5`.
-    const posLimit = positional.length > 0 ? parseInt(positional[0], 10) : NaN;
-    const flagLimit = flags.limit !== undefined ? parseInt(flags.limit, 10) : NaN;
-    const limit = Number.isFinite(posLimit) ? posLimit : (Number.isFinite(flagLimit) ? flagLimit : 20);
+    // Accept either `activity recent 5` or `activity recent --limit 5`. Both routed
+    // through parseIntFlag so garbage ("2abc"), negatives (SQLite LIMIT -1 = UNLIMITED
+    // full-table dump), and uncapped huge values warn + clamp to default/max, matching
+    // the search/recent/browse siblings.
+    const limit = positional.length > 0
+      ? parseIntFlag(positional[0], { name: 'count', defaultValue: 20, max: 1000 })
+      : parseIntFlag(flags.limit, { name: '--limit', defaultValue: 20, max: 1000 });
     const type = flags.type || null;
     if (type !== null && !VALID_EVENT_TYPES.has(type)) {
       fail(`[mem] activity recent: invalid --type "${type}". Valid: ${[...VALID_EVENT_TYPES].join(', ')}`);

package/cli.mjs CHANGED

@@ -1,5 +1,5 @@
 #!/usr/bin/env node
-const CLI_COMMANDS = new Set(['search', 'recent', 'recall', 'get', 'timeline', 'save', 'stats', 'context', 'browse', 'citation-stats', 'delete', 'update', 'export', 'compress', 'maintain', 'optimize', 'fts-check', 'registry', 'import', 'import-jsonl', 'enrich', 'activity', 'adopt', 'unadopt', 'memdir-audit', 'defer', 'help']);
+const CLI_COMMANDS = new Set(['search', 'recent', 'recall', 'get', 'timeline', 'save', 'stats', 'context', 'browse', 'citation-stats', 'delete', 'update', 'export', 'restore', 'compress', 'maintain', 'optimize', 'fts-check', 'registry', 'import', 'import-jsonl', 'enrich', 'activity', 'adopt', 'unadopt', 'memdir-audit', 'defer', 'help']);
 const INSTALL_COMMANDS = new Set(['install', 'uninstall', 'status', 'doctor', 'cleanup', 'cleanup-hooks', 'self-update', 'repair', 'release']);
 const cmd = process.argv[2];
@@ -13,14 +13,16 @@ if (cmd === '--version' || cmd === '-v') {
 } else if (cmd === '--help' || cmd === '-h') {
   const { run } = await import('./mem-cli.mjs');
   await run(['help']);
-} else if (cmd === 'doctor' && process.argv.slice(3).some(a => a.startsWith('--') && a.length > 2)) {
-  // Per #8217 single-source-of-truth: any flagged `doctor --X` is a DB-layer
-  // inspection tool (--benchmark, --metrics, --session-audit, future flags)
-  // and routes to mem-cli. Plain `doctor` (no flags) keeps running the
-  // install health-check below — adding a new flag in cli/doctor.mjs no
-  // longer requires touching this enumeration. The `length > 2` guard
-  // ignores a bare `--` (POSIX end-of-options separator) so `doctor --`
-  // continues to route to install.mjs, not mem-cli.
+} else if (cmd === 'doctor' && process.argv.slice(3).some(a => a === '--benchmark' || a === '--metrics' || a === '--session-audit')) {
+  // Per #8217: the DB-layer doctor modes (--benchmark / --metrics / --session-audit,
+  // each implemented in cli/doctor.mjs) route to mem-cli. Everything else — plain
+  // `doctor`, `doctor --` (POSIX end-of-options), and `doctor --json` — stays with
+  // install.mjs's health-check, which OWNS --json (install.mjs doctor() line ~1216).
+  // Pre-fix the router forwarded ANY flagged `doctor --X` to mem-cli, so the documented
+  // `doctor --json` (install health JSON, advertised in install.mjs usage) was shadowed
+  // and rejected by cli/doctor.mjs. Gating on the three DB-layer flags keeps --json
+  // (and any future install-doctor flag) on the install path. Adding a NEW DB-layer
+  // mode requires extending this list — a deliberate trade for a working --json.
   const { run } = await import('./mem-cli.mjs');
   await run(process.argv.slice(2));
 } else if (CLI_COMMANDS.has(cmd)) {
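
The routing rule in the new `doctor` branch can be isolated as a small predicate. This is a sketch of the condition only; `DB_DOCTOR_FLAGS` and `doctorRoutesToMemCli` are hypothetical names, while the three flags are the ones named in the diff.

```javascript
// The three DB-layer doctor modes listed in the new condition.
const DB_DOCTOR_FLAGS = new Set(['--benchmark', '--metrics', '--session-audit']);

// Hypothetical helper: true when `doctor <args>` should be handled by mem-cli,
// false when it should stay with the install health-check.
function doctorRoutesToMemCli(args) {
  return args.some((a) => DB_DOCTOR_FLAGS.has(a));
}
```

Under this allow-list, `doctor --json`, `doctor --`, and plain `doctor` all stay on the install path, which is the behavior the comment describes.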

package/haiku-client.mjs CHANGED

@@ -20,6 +20,14 @@ const MODEL_MAP = {
   sonnet: 'claude-sonnet-4-5-20250929',
 };
+// Every background LLM call here is fixed-schema extraction / classification
+// (episode→JSON, type/merge classification, synonym + metadata extraction) whose
+// output is consumed deterministically (JSON.parse, MinHash dedup). Pin temperature
+// to 0 so the provider default (~1.0) doesn't inject wording variance that breaks
+// JSON parsing or defeats the wording-sensitive MinHash near-duplicate detector.
+// A call that genuinely needs sampling can pass opts.temperature to override.
+const DEFAULT_LLM_TEMPERATURE = 0;
 /**
  * Resolve the LLM model to use for background calls.
  * Reads CLAUDE_MEM_MODEL env var, defaults to 'haiku'.
@@ -143,7 +151,7 @@ export function flattenForCLI(input) {
  * @param {number} [opts.maxTokens=500] Max tokens in response
  * @returns {Promise<{text: string}|null>} Response or null on failure
  */
-export async function callHaiku(prompt, { timeout = 10000, maxTokens = 500 } = {}) {
+export async function callHaiku(prompt, { timeout = 10000, maxTokens = 500, temperature = DEFAULT_LLM_TEMPERATURE } = {}) {
   if (!prompt) return null;
   const mode = detectMode();
@@ -160,8 +168,8 @@ export async function callHaiku(prompt, { timeout = 10000, maxTokens = 500 } = {
   let primary = null;
   try {
     primary = mode === 'api'
-      ? await callHaikuAPI(prompt, { timeout, maxTokens })
-      : await callOpenRouterAPI(prompt, resolveModel().cli, { timeout, maxTokens });
+      ? await callHaikuAPI(prompt, { timeout, maxTokens, temperature })
+      : await callOpenRouterAPI(prompt, resolveModel().cli, { timeout, maxTokens, temperature });
   } catch (e) {
     debugCatch(e, `callHaiku:${mode}`);
   }
@@ -198,7 +206,7 @@ export async function callHaikuJSON(prompt, opts) {
  * @param {number} [opts.maxTokens=1000] Max tokens in response
  * @returns {Promise<{text: string}|null>} Response or null on failure
  */
-export async function callLLMWithModel(prompt, model = 'haiku', { timeout = 15000, maxTokens = 1000 } = {}) {
+export async function callLLMWithModel(prompt, model = 'haiku', { timeout = 15000, maxTokens = 1000, temperature = DEFAULT_LLM_TEMPERATURE } = {}) {
   if (!prompt) return null;
   const resolvedModel = MODEL_MAP[model] ? model : 'haiku';
   const mode = detectMode();
@@ -214,8 +222,8 @@ export async function callLLMWithModel(prompt, model = 'haiku', { timeout = 1500
   let primary = null;
   try {
     primary = mode === 'api'
-      ? await callModelAPI(prompt, resolvedModel, { timeout, maxTokens })
-      : await callOpenRouterAPI(prompt, resolvedModel, { timeout, maxTokens });
+      ? await callModelAPI(prompt, resolvedModel, { timeout, maxTokens, temperature })
+      : await callOpenRouterAPI(prompt, resolvedModel, { timeout, maxTokens, temperature });
   } catch (e) {
     debugCatch(e, `callLLMWithModel:${mode}:${resolvedModel}`);
   }
@@ -239,7 +247,7 @@ export async function callModelJSON(prompt, model = 'haiku', opts) {
   return parseJsonFromLLM(result.text);
 }
-async function callModelAPI(prompt, model, { timeout, maxTokens }) {
+async function callModelAPI(prompt, model, { timeout, maxTokens, temperature = DEFAULT_LLM_TEMPERATURE }) {
   const apiKey = process.env.ANTHROPIC_API_KEY;
   if (!apiKey) return null;
@@ -252,6 +260,7 @@ async function callModelAPI(prompt, model, { timeout, maxTokens }) {
     const body = {
       model: modelId,
       max_tokens: maxTokens,
+      temperature,
       messages: [{ role: 'user', content: user }],
     };
     // System slot is constant per call type (instructions, schema, type taxonomy)
@@ -312,7 +321,7 @@ function callModelCLI(prompt, model, { timeout }) {
 // ─── API Mode ────────────────────────────────────────────────────────────────
-async function callHaikuAPI(prompt, { timeout, maxTokens }) {
+async function callHaikuAPI(prompt, { timeout, maxTokens, temperature = DEFAULT_LLM_TEMPERATURE }) {
   const apiKey = process.env.ANTHROPIC_API_KEY;
   if (!apiKey) return null;
@@ -325,6 +334,7 @@ async function callHaikuAPI(prompt, { timeout, maxTokens }) {
     const body = {
       model: modelId,
       max_tokens: maxTokens,
+      temperature,
       messages: [{ role: 'user', content: user }],
     };
     // See callModelAPI: cache_control on the constant system slot.
@@ -365,7 +375,7 @@ async function callHaikuAPI(prompt, { timeout, maxTokens }) {
 // `cache_control` field has no OpenAI-format equivalent and is omitted.
 // `tier` is the resolved model tier ('haiku'|'sonnet'); OPENROUTER_MODEL can
 // override the resulting slug entirely (see resolveOpenRouterModel).
-async function callOpenRouterAPI(prompt, tier, { timeout, maxTokens }) {
+async function callOpenRouterAPI(prompt, tier, { timeout, maxTokens, temperature = DEFAULT_LLM_TEMPERATURE }) {
   const apiKey = process.env.OPENROUTER_API_KEY;
   if (!apiKey) return null;
@@ -387,7 +397,7 @@ async function callOpenRouterAPI(prompt, tier, { timeout, maxTokens }) {
         // Optional OpenRouter attribution headers (ignored by the API if absent).
         'X-Title': 'claude-mem-lite',
       },
-      body: JSON.stringify({ model, max_tokens: maxTokens, messages }),
+      body: JSON.stringify({ model, max_tokens: maxTokens, temperature, messages }),
       signal: controller.signal,
     });
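
The temperature default is threaded through each call layer via destructuring defaults. A minimal sketch of that pattern, using a hypothetical `buildBody` helper that mirrors the request-body shape in the diff (it omits the model and system fields):

```javascript
const DEFAULT_LLM_TEMPERATURE = 0;

// Hypothetical request-body builder; only the fields relevant to the default are shown.
function buildBody(prompt, { maxTokens = 500, temperature = DEFAULT_LLM_TEMPERATURE } = {}) {
  return {
    max_tokens: maxTokens,
    temperature,
    messages: [{ role: 'user', content: prompt }],
  };
}
```

A destructuring default applies only when the property is `undefined`, so an explicit `temperature: 0.7` is preserved, and so is an explicit `0`. A caller passing `null` would send `null` through unchanged, which this sketch does not guard against.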

package/hook-handoff.mjs CHANGED

@@ -32,12 +32,26 @@ import * as taskReaderModule from './lib/task-reader.mjs';
  * @param {string|null} [scopeSessionId=null] CC UUID for session_handoffs.session_id column
  */
 export function buildAndSaveHandoff(db, sessionId, project, type, episodeSnapshot, scopeSessionId = null) {
-  // 1. Working objective — from user prompts
-  const prompts = db.prepare(`
-    SELECT prompt_text FROM user_prompts
-    WHERE content_session_id = ?
-    ORDER BY prompt_number ASC LIMIT 5
-  `).all(sessionId);
+  // 1. Working objective — from user prompts.
+  // D#26: getSessionId() is project-scoped, so multiple CC sessions in one project
+  // share `content_session_id`. When a genuine CC scope is passed (scopeSessionId is
+  // the CC UUID, i.e. differs from the mem-internal sessionId), filter to THIS CC
+  // session's prompts so working_on doesn't merge concurrent/sequential sessions.
+  // `OR cc_session_id IS NULL` keeps legacy rows + non-CC/no-stdin invocations. When
+  // scopeSessionId is absent or == sessionId (legacy/test/no-stdin), fall back to the
+  // unfiltered query (identical to pre-D#26 behavior).
+  const ccScope = scopeSessionId && scopeSessionId !== sessionId ? scopeSessionId : null;
+  const prompts = ccScope
+    ? db.prepare(`
+        SELECT prompt_text FROM user_prompts
+        WHERE content_session_id = ? AND (cc_session_id = ? OR cc_session_id IS NULL)
+        ORDER BY prompt_number ASC LIMIT 5
+      `).all(sessionId, ccScope)
+    : db.prepare(`
+        SELECT prompt_text FROM user_prompts
+        WHERE content_session_id = ?
+        ORDER BY prompt_number ASC LIMIT 5
+      `).all(sessionId);
   if (prompts.length === 0) return;  // Empty session — nothing to hand off
   // Filter prompts whose only content is workflow/control language ("继续",
@@ -73,12 +87,30 @@ export function buildAndSaveHandoff(db, sessionId, project, type, episodeSnapsho
     }
   }
+  // D#28 (completes D#26): observations carry the project-scoped memory_session_id, shared by
+  // parallel/sequential same-project CC sessions. Lower-bound the observation queries below to
+  // THIS CC session's start (earliest prompt epoch for ccScope) so Completed / Key Files / Key
+  // Decisions stop merging a prior session's work — the observation-side complement of
+  // working_on's cc-scoping. When ccScope is absent or its session has no prompts (MIN→null),
+  // ccWindowStart stays null and the queries run unscoped (pre-D#28 behavior). Residual: truly
+  // concurrent same-project sessions whose windows overlap can still co-attribute a few rows.
+  let ccWindowStart = null;
+  if (ccScope) {
+    const w = db.prepare(`
+      SELECT MIN(created_at_epoch) AS startEpoch FROM user_prompts
+      WHERE content_session_id = ? AND cc_session_id = ?
+    `).get(sessionId, ccScope);
+    if (typeof w?.startEpoch === 'number') ccWindowStart = w.startEpoch;
+  }
+  const obsWindowClause = ccWindowStart !== null ? 'AND created_at_epoch >= ?' : '';
+  const obsWindowParams = ccWindowStart !== null ? [ccWindowStart] : [];
   // 2. Completed — from observations (include narrative for richer handoff)
   const completed = db.prepare(`
     SELECT title, type, narrative FROM observations
-    WHERE memory_session_id = ? AND COALESCE(compressed_into, 0) = 0
+    WHERE memory_session_id = ? AND COALESCE(compressed_into, 0) = 0 ${obsWindowClause}
     ORDER BY created_at_epoch DESC LIMIT 15
-  `).all(sessionId);
+  `).all(sessionId, ...obsWindowParams);
   // 3. Recent activity — episode snapshot + full session edit history from narratives.
   // Keep only entries that represent in-flight work (file edits) or outright failures
@@ -131,9 +163,9 @@ export function buildAndSaveHandoff(db, sessionId, project, type, episodeSnapsho
   if (episodeSnapshot?.files) episodeSnapshot.files.filter(isValidFile).forEach(f => fileSet.add(f));
   const obsFiles = db.prepare(`
     SELECT files_modified FROM observations
-    WHERE memory_session_id = ? AND files_modified IS NOT NULL
+    WHERE memory_session_id = ? AND files_modified IS NOT NULL ${obsWindowClause}
     ORDER BY created_at_epoch DESC LIMIT 10
-  `).all(sessionId);
+  `).all(sessionId, ...obsWindowParams);
   for (const row of obsFiles) {
     try { JSON.parse(row.files_modified).filter(isValidFile).forEach(f => fileSet.add(f)); } catch {}
   }
@@ -142,9 +174,9 @@ export function buildAndSaveHandoff(db, sessionId, project, type, episodeSnapsho
   const decisions = db.prepare(`
     SELECT title FROM observations
     WHERE memory_session_id = ? AND COALESCE(importance, 1) >= 2
-      AND COALESCE(compressed_into, 0) = 0
+      AND COALESCE(compressed_into, 0) = 0 ${obsWindowClause}
     ORDER BY created_at_epoch DESC LIMIT 10
-  `).all(sessionId).filter(d => d.title && !LOW_SIGNAL_TITLE.test(d.title)).slice(0, 5);
+  `).all(sessionId, ...obsWindowParams).filter(d => d.title && !LOW_SIGNAL_TITLE.test(d.title)).slice(0, 5);
   // 6. Match keywords
   const allText = [workingOn, ...completed.map(c => c.title).filter(Boolean), unfinished].join(' ');
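
The observation queries above append an optional SQL clause and a matching optional parameter. A sketch of that pairing in isolation, assuming a hypothetical `buildCompletedQuery` helper (the diff builds the clause and params once and reuses them across three queries):

```javascript
// Hypothetical helper: returns the SQL text and bound parameters together,
// so the number of "?" placeholders always matches the parameter count.
function buildCompletedQuery(sessionId, windowStart) {
  const scoped = windowStart !== null;
  const windowClause = scoped ? 'AND created_at_epoch >= ?' : '';
  const sql = `
    SELECT title, type, narrative FROM observations
    WHERE memory_session_id = ? AND COALESCE(compressed_into, 0) = 0 ${windowClause}
    ORDER BY created_at_epoch DESC LIMIT 15
  `;
  const params = scoped ? [sessionId, windowStart] : [sessionId];
  return { sql, params };
}
```

Only a fixed string is interpolated into the SQL; the epoch value itself is still bound as a parameter, as in the diff.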

package/hook-llm.mjs CHANGED

@@ -12,6 +12,7 @@ import {
 import { acquireLLMSlot, releaseLLMSlot } from './hook-semaphore.mjs';
 import { scrubRecord } from './lib/scrub-record.mjs';
 import { getVocabulary, computeVector } from './tfidf.mjs';
+import { DEDUP_JACCARD_THRESHOLD, AUTO_MERGE_THRESHOLD } from './lib/dedup-constants.mjs';
 import {
   RUNTIME_DIR, DEDUP_WINDOW_MS, RELATED_OBS_WINDOW_MS,
   sessionFile, getSessionId, openDb, callLLM, sleep,
@@ -148,7 +149,7 @@ export function saveObservation(obs, projectOverride, sessionIdOverride, externa
       ORDER BY created_at_epoch DESC LIMIT 10
     `).all(project, fiveMinAgo);
-    if (obs.title && recent.some(r => jaccardSimilarity(r.title, obs.title) > 0.7)) {
+    if (obs.title && recent.some(r => jaccardSimilarity(r.title, obs.title) > DEDUP_JACCARD_THRESHOLD)) {
       return null; // dedup: Jaccard title match
     }
@@ -173,8 +174,8 @@ export function saveObservation(obs, projectOverride, sessionIdOverride, externa
         WHERE project = ? AND created_at_epoch > ? AND created_at_epoch <= ?
         ORDER BY created_at_epoch DESC LIMIT 60
       `).all(project, threeDaysAgo, fiveMinAgo);
-      if (extRecent.some(r => jaccardSimilarity(r.title, obs.title) > 0.85)) {
-        return null; // dedup: low-signal Jaccard match
+      if (extRecent.some(r => jaccardSimilarity(r.title, obs.title) > AUTO_MERGE_THRESHOLD)) {
+        return null; // dedup: low-signal Jaccard match (stricter cutoff for degraded titles)
       }
     }
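
For readers unfamiliar with the metric behind these thresholds: the package's `jaccardSimilarity` is not shown in this diff, so the following is a generic word-set Jaccard under assumed whitespace tokenization, not the package's implementation.

```javascript
// Generic Jaccard similarity over lowercase, whitespace-separated words.
// Tokenization here is an assumption; the package may normalize differently.
function wordJaccard(a, b) {
  const setA = new Set(a.toLowerCase().split(/\s+/).filter(Boolean));
  const setB = new Set(b.toLowerCase().split(/\s+/).filter(Boolean));
  if (setA.size === 0 && setB.size === 0) return 0;
  let shared = 0;
  for (const w of setA) if (setB.has(w)) shared++;
  // |A ∩ B| / |A ∪ B|
  return shared / (setA.size + setB.size - shared);
}
```

For example, `wordJaccard('fix login bug', 'fix login crash')` shares two of four distinct words, giving 0.5. That is below both literals the new constants replace (0.7 and 0.85), so neither dedup check would drop the observation, assuming the constants keep those values.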

package/hook-optimize.mjs CHANGED

@@ -13,6 +13,7 @@ import { callModelJSON } from './haiku-client.mjs';
 import { acquireLLMSlot, releaseLLMSlot } from './hook-semaphore.mjs';
 import { scrubRecord } from './lib/scrub-record.mjs';
 import { getVocabulary, computeVector, cosineSimilarity } from './tfidf.mjs';
+import { MERGE_JACCARD_LOW, AUTO_MERGE_THRESHOLD } from './lib/dedup-constants.mjs';
 import { DB_DIR } from './schema.mjs';
 const RUNTIME_DIR = join(DB_DIR, 'runtime');
@@ -331,8 +332,9 @@ export async function executeNormalize(db, force = false, { project } = {}) {
 // ─── Task 3: Cluster-merge ─────────────────────────────────────────────────
 const MERGE_TIME_WINDOW_MS = 30 * 86400000;
-const MERGE_JACCARD_LOW = 0.4;
-const MERGE_JACCARD_HIGH = 0.85;
+// Merge-review band [MERGE_JACCARD_LOW, AUTO_MERGE_THRESHOLD): titles in this
+// Jaccard range are LLM-reviewed for merge; at/above AUTO_MERGE_THRESHOLD they'd
+// already auto-merge elsewhere, below MERGE_JACCARD_LOW they're too dissimilar.
 export function findMergeCandidates(db, maxClusters = 5, { project } = {}) {
   const cutoff = Date.now() - MERGE_TIME_WINDOW_MS;
@@ -363,12 +365,14 @@ export function findMergeCandidates(db, maxClusters = 5, { project } = {}) {
       if (Math.abs(rows[i].created_at_epoch - rows[j].created_at_epoch) > MERGE_TIME_WINDOW_MS) continue;
       if (rows[i].minhash_sig && rows[j].minhash_sig) {
+        // 0.8 slack: the MinHash estimate is noisy, so pre-filter a band below
+        // MERGE_JACCARD_LOW rather than at it, to avoid dropping true candidates.
         const est = estimateJaccardFromMinHash(rows[i].minhash_sig, rows[j].minhash_sig);
         if (est < MERGE_JACCARD_LOW * 0.8) continue;
       }
       const titleSim = jaccardSimilarity(rows[i].title, rows[j].title);
-      if (titleSim >= MERGE_JACCARD_LOW && titleSim < MERGE_JACCARD_HIGH) {
+      if (titleSim >= MERGE_JACCARD_LOW && titleSim < AUTO_MERGE_THRESHOLD) {
         cluster.push(rows[j]);
         used.add(rows[j].id);
       }
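
The merge-review band can be read as a three-way classification of title similarity. A sketch, assuming the shared constants keep the values of the literals they replace (`0.4` and `0.85`); `lib/dedup-constants.mjs` itself is not shown here, and `mergeBand` is a hypothetical helper.

```javascript
const MERGE_JACCARD_LOW = 0.4;      // literal removed from hook-optimize.mjs
const AUTO_MERGE_THRESHOLD = 0.85;  // assumed equal to the removed MERGE_JACCARD_HIGH

// Hypothetical helper: which side of the review band a title similarity falls on.
function mergeBand(titleSim) {
  if (titleSim < MERGE_JACCARD_LOW) return 'skip';
  if (titleSim < AUTO_MERGE_THRESHOLD) return 'llm-review';
  return 'above-band';
}
```

The band is half-open: 0.4 is reviewed, 0.85 is not. The MinHash pre-filter in the diff cuts at `MERGE_JACCARD_LOW * 0.8`, roughly 0.32 under these values, so it discards pairs only when the noisy estimate is clearly below the band.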

package/hook-update.mjs CHANGED

@@ -7,7 +7,7 @@ import { readFileSync, writeFileSync, copyFileSync, cpSync, readdirSync, existsS
 import { join, dirname } from 'node:path';
 import { pathToFileURL } from 'node:url';
 import { tmpdir, homedir } from 'node:os';
-import { DB_DIR } from './schema.mjs';
+import { DB_DIR, CODE_DIR } from './schema.mjs';
 import { debugCatch, debugLog } from './utils.mjs';
 // Local manifest is fallback only — the active manifest is loaded from the
 // extracted tarball's own source-files.mjs inside installExtractedRelease.
@@ -16,8 +16,15 @@ import { SOURCE_FILES as LOCAL_SOURCE_FILES, HOOK_SCRIPT_FILES as LOCAL_HOOK_SCR
 // ── Configuration ──────────────────────────────────────────
 const GITHUB_REPO = 'sdsrss/claude-mem-lite';
-const INSTALL_DIR = DB_DIR;  // ~/.claude-mem-lite/
-const STATE_FILE = join(INSTALL_DIR, 'runtime', 'update-state.json');
+// Plugin CODE location (server.mjs / package.json / install target) — always
+// homedir-rooted, NEVER follows CLAUDE_MEM_DIR (see schema.mjs CODE_DIR). Used
+// for dev-mode detection, current-version read, and the install target dir.
+const INSTALL_DIR = CODE_DIR;  // ~/.claude-mem-lite/ (code)
+// DATA/state location — runtime/update-state.json lives with the data (env-aware
+// DB_DIR), matching hook-shared RUNTIME_DIR and install.mjs doctor's read path.
+// Equal to INSTALL_DIR unless CLAUDE_MEM_DIR relocates the data dir.
+const STATE_DIR = DB_DIR;
+const STATE_FILE = join(STATE_DIR, 'runtime', 'update-state.json');
 const CHECK_INTERVAL_MS = 24 * 60 * 60 * 1000;       // 24 hours
 const FETCH_TIMEOUT_MS = 3000;                         // 3s network timeout
 const RATE_LIMIT_INTERVAL_MS = 6 * 60 * 60 * 1000;   // 6h if rate-limited
@@ -558,7 +565,7 @@ function readState() {
 function saveState(state) {
   try {
-    const dir = join(INSTALL_DIR, 'runtime');
+    const dir = join(STATE_DIR, 'runtime');
     mkdirSync(dir, { recursive: true });
     const tmpFile = STATE_FILE + `.tmp-${process.pid}`;
     writeFileSync(tmpFile, JSON.stringify(state, null, 2));
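
The code/data split described in the new comments can be sketched as follows. How `schema.mjs` actually derives `CODE_DIR` and `DB_DIR` is not shown in this hunk, so the direct use of `CLAUDE_MEM_DIR` as the data directory is an assumption, and `resolveDirs` is a hypothetical helper.

```javascript
// Hypothetical resolver: code always lives under the home directory,
// while state follows the data directory when CLAUDE_MEM_DIR is set.
function resolveDirs(home, env = {}) {
  const codeDir = `${home}/.claude-mem-lite`;
  const dataDir = env.CLAUDE_MEM_DIR || codeDir; // assumed mapping
  return {
    installDir: codeDir,
    stateFile: `${dataDir}/runtime/update-state.json`,
  };
}
```

Without the override both paths share one root, matching the comment's note that `STATE_DIR` equals `INSTALL_DIR` unless the data directory is relocated.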

package/hook.mjs CHANGED

@@ -63,7 +63,8 @@ import { checkForUpdate, getCachedUpdateBanner, isUpdateCheckDue } from './hook-
 import { handleLLMOptimize } from './hook-optimize.mjs';
 import { silentAutoAdopt, hasAutoAdoptMarker } from './adopt-cli.mjs';
 import { emitV270UpgradeBanner } from './lib/upgrade-banner.mjs';
-import { loadCiteBackForEpisode, buildUnsavedBugfixHint, countUnsavedBugfixShape, buildCiteRecallNudge as libBuildCiteRecallNudge, nextCiteLowStreak } from './lib/cite-back-hint.mjs';
+import { loadCiteBackForEpisode, extractCiteBackSignals, buildUnsavedBugfixHint, countUnsavedBugfixShape, buildCiteRecallNudge as libBuildCiteRecallNudge, nextCiteLowStreak } from './lib/cite-back-hint.mjs';
+import { MINHASH_PREFILTER, FUZZY_DEDUP_THRESHOLD } from './lib/dedup-constants.mjs';
 // plugin-cache-guard.mjs loaded dynamically — pre-2.31.2 installs that auto-upgraded
 // from an older hook-update.mjs SOURCE_FILES (which did not list this module) would
 // crash on static import. Degrade gracefully to no-op when the module is absent.
@@ -542,6 +543,12 @@ async function handleStop() {
           // contract test in tests/citation-tracker-userprompt.test.mjs covers it.
           try {
             const injected = extractAllInjected(transcriptPath);
+            // P5 ①: cite-back signals — observations whose warned file the agent
+            // edited this session. Union into injected so they're resolved (they
+            // were injected via pre-tool-recall) and, below, into cited so the
+            // edit promotes them even without a literal #NN in text.
+            const citeBackIds = extractCiteBackSignals(transcriptPath);
+            for (const id of citeBackIds) injected.add(id);
             if (injected.size > 0) {
               // Text-floor gate: skip decay on tool-only Stops. Without this,
               // a turn that ends on tool_use locks every injected obs as
@@ -554,6 +561,7 @@ async function handleStop() {
                 debugLog('DEBUG', 'handleStop', `citation-decay: skipped (no main-thread assistant text yet, injected=${injected.size})`);
               } else {
                 const citedMain = extractCitationsFromTranscript(transcriptPath, { mainOnly: true });
+                for (const id of citeBackIds) citedMain.add(id);
                 const r = applyCitationDecay(db, project, injected, citedMain, sessionId);
                 debugLog('DEBUG', 'handleStop', `citation-decay: touched=${r.touched} promoted=${r.promoted} demoted=${r.demoted}`);
               }
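
The Stop-hook hunk above unions the cite-back IDs into both `injected` and `citedMain` before `applyCitationDecay` runs. A minimal sketch of the effect follows. It assumes, since the diff does not show it, that decay promotes injected observations that were cited and demotes the rest. `classify` and the sample IDs are hypothetical.

```javascript
// Sketch only: `classify` is a hypothetical stand-in for the promote/demote
// split; the real applyCitationDecay logic is not shown in this diff.
function classify(injected, cited) {
  const promoted = [];
  const demoted = [];
  for (const id of injected) {
    (cited.has(id) ? promoted : demoted).push(id);
  }
  return { promoted, demoted };
}

const injected = new Set([11, 12]); // observations injected into the transcript
const cited = new Set([11]);        // literal #NN citations in assistant text
const citeBackIds = new Set([13]);  // warned file was edited this session

// Same two unions as the hunk: cite-back IDs count as injected and as cited.
for (const id of citeBackIds) injected.add(id);
for (const id of citeBackIds) cited.add(id);

const result = classify(injected, cited);
console.log(result.promoted, result.demoted);
```

Under that assumption, observation 13 lands in the promoted group even though no `#13` appears in the text, while 12 (injected but neither cited nor acted on) is still demoted.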
@@ -864,8 +872,6 @@ async function handleSessionStart() {
         if (!process.env.CLAUDE_MEM_SKIP_AUTO_DEDUP_FUZZY) {
           const SCAN_LIMIT = 500;
           const FUZZY_MAX_MERGES = 20;
-          const FUZZY_THRESHOLD = 0.95;
-          const MINHASH_PREFILTER = 0.7;
           const recent = db.prepare(`
             SELECT id, title, importance, created_at_epoch
             FROM observations
@@ -885,7 +891,7 @@ async function handleSessionStart() {
               for (let j = i + 1; j < recent.length; j++) {
                 if (!minhashes[j] || removed.has(recent[j].id)) continue;
                 if (estimateJaccardFromMinHash(minhashes[i], minhashes[j]) < MINHASH_PREFILTER) continue;
-                if (jaccardSimilarity(titles[i], titles[j]) < FUZZY_THRESHOLD) continue;
+                if (jaccardSimilarity(titles[i], titles[j]) < FUZZY_DEDUP_THRESHOLD) continue;
                 // Keep the higher-importance row; tiebreak by older (lower id wins access history)
                 const keep = (recent[i].importance ?? 1) >= (recent[j].importance ?? 1) ? recent[i] : recent[j];
                 const remove = keep === recent[i] ? recent[j] : recent[i];
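
The loop above is a two-stage duplicate check: a cheap similarity estimate rejects most pairs, and only survivors pay for the exact Jaccard comparison. The sketch below illustrates that shape under stated assumptions. The threshold values are the inline constants this diff removes (the values now exported from `lib/dedup-constants.mjs` are not visible in this hunk), exact Jaccard stands in for the MinHash estimate, and `tokenize`, `jaccard`, and `findMerges` are hypothetical helpers rather than the package's functions.

```javascript
// Assumed values: the inline constants removed by this diff.
const MINHASH_PREFILTER = 0.7;
const FUZZY_DEDUP_THRESHOLD = 0.95;

// Hypothetical tokenizer: lowercase word tokens as a Set.
function tokenize(title) {
  return new Set(title.toLowerCase().split(/\W+/).filter(Boolean));
}

// Exact Jaccard similarity |A ∩ B| / |A ∪ B| over two token Sets.
function jaccard(a, b) {
  if (a.size === 0 && b.size === 0) return 1;
  let shared = 0;
  for (const t of a) if (b.has(t)) shared++;
  return shared / (a.size + b.size - shared);
}

// cheapEstimate stands in for the MinHash estimate; here it defaults to the
// exact measure, so the prefilter only demonstrates the control flow.
function findMerges(rows, cheapEstimate = jaccard) {
  const sets = rows.map((r) => tokenize(r.title));
  const removed = new Set();
  const merges = [];
  for (let i = 0; i < rows.length; i++) {
    if (removed.has(rows[i].id)) continue;
    for (let j = i + 1; j < rows.length; j++) {
      if (removed.has(rows[j].id)) continue;
      if (cheapEstimate(sets[i], sets[j]) < MINHASH_PREFILTER) continue;
      if (jaccard(sets[i], sets[j]) < FUZZY_DEDUP_THRESHOLD) continue;
      // Keep the higher-importance row; on a tie the earlier row wins.
      const keep = (rows[i].importance ?? 1) >= (rows[j].importance ?? 1) ? rows[i] : rows[j];
      const remove = keep === rows[i] ? rows[j] : rows[i];
      removed.add(remove.id);
      merges.push({ keep: keep.id, remove: remove.id });
      if (remove === rows[i]) break; // sketch's choice: stop scanning for a removed row
    }
  }
  return merges;
}

const merges = findMerges([
  { id: 1, title: 'Fix NUL truncation in prompt_text', importance: 2 },
  { id: 2, title: 'fix nul truncation in prompt_text!', importance: 1 },
  { id: 3, title: 'Add cc_session_id column', importance: 1 },
]);
console.log(merges); // [ { keep: 1, remove: 2 } ]
```

Moving both thresholds into one shared module, as the diff does, means any other caller that imports them applies the same cutoffs as this loop.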
@@ -1202,7 +1208,12 @@ async function handleUserPrompt() {
   // every downstream consumer (user_prompts INSERT, FTS query, continuation
   // detection, semantic-memory injection) sees the redacted text — single
   // source of truth for the privacy primitive.
-  const promptText = stripPrivate(rawPrompt);
+  // Strip NUL / C0 control chars (keep \t \n \r) before any downstream use: an
+  // embedded NUL terminates SQLite's C string, silently truncating the stored
+  // prompt_text at the first NUL (and breaking FTS). Single source of truth, so the
+  // user_prompts INSERT, FTS query, and continuation detection all see clean text.
+  // eslint-disable-next-line no-control-regex -- intentional: NUL/C0 strip prevents SQLite C-string truncation
+  const promptText = stripPrivate(rawPrompt).replace(/[\x00-\x08\x0b\x0c\x0e-\x1f]/g, '');
   const sessionId = getSessionId();
   const db = openDb();
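
The replacement added in this hunk removes NUL and the other C0 control characters while keeping tab, line feed, and carriage return. A small sketch of just that regex in isolation; `stripControlChars` is a hypothetical wrapper name, and the sample string is made up for illustration.

```javascript
// The character class is copied from the hunk above: \x00-\x08, \x0b, \x0c,
// and \x0e-\x1f are removed; \t (\x09), \n (\x0a), and \r (\x0d) survive.
// eslint-disable-next-line no-control-regex
const C0_EXCEPT_TAB_LF_CR = /[\x00-\x08\x0b\x0c\x0e-\x1f]/g;

// Hypothetical wrapper; the package applies .replace() inline instead.
function stripControlChars(text) {
  return text.replace(C0_EXCEPT_TAB_LF_CR, '');
}

const raw = 'fix the\x00 parser\tnow\r\n\x1b[31mred';
const clean = stripControlChars(raw);
console.log(JSON.stringify(clean)); // "fix the parser\tnow\r\n[31mred"
```

Note that the class stops at `\x1f`, so DEL (`\x7f`) and the C1 range pass through unchanged, and an ANSI escape sequence loses only its leading ESC byte.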
@@ -1227,24 +1238,27 @@ async function handleUserPrompt() {
     ).get(sessionId);
     const promptNumber = bumped?.prompt_counter || 1;
+    // Claude Code's real session_id (CC UUID) from hook stdin. Persisted on the
+    // prompt row (cc_session_id) so buildAndSaveHandoff can scope working_on to ONE
+    // CC session — getSessionId() is project-scoped (no CC-UUID), so without this
+    // concurrent/within-TTL same-project sessions merge each other's prompts (D#26).
+    // Also scopes handoff-row injection below. Null (legacy) when stdin lacks session_id.
+    const ccSessionId = typeof hookData.session_id === 'string' && hookData.session_id.length > 0
+      ? hookData.session_id
+      : null;
     db.prepare(`
-      INSERT INTO user_prompts (content_session_id, prompt_text, prompt_number, created_at, created_at_epoch)
-      VALUES (?, ?, ?, ?, ?)
+      INSERT INTO user_prompts (content_session_id, prompt_text, prompt_number, cc_session_id, created_at, created_at_epoch)
+      VALUES (?, ?, ?, ?, ?, ?)
     `).run(
       sessionId,
       scrubSecrets(promptText.slice(0, 10000)),
       promptNumber,
+      ccSessionId,
       now.toISOString(), now.getTime()
     );
     // Cross-session handoff injection (first 3 prompts window, before semantic memory).
-    // Use Claude Code's real session_id from hook stdin to scope handoffs to this CC
-    // session — prevents cross-session bleed when running parallel sessions for the
-    // same project (see docs/bug.txt). Falls back to null (legacy behavior) if the
-    // hook input does not carry session_id.
-    const ccSessionId = typeof hookData.session_id === 'string' && hookData.session_id.length > 0
-      ? hookData.session_id
-      : null;
     if (promptNumber <= 3) {
       try {
         if (detectContinuationIntent(db, promptText, project, ccSessionId)) {