npm - claude-mem-lite - Versions diffs - 2.85.0 → 2.87.0 - Mend

claude-mem-lite 2.85.0 → 2.87.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (9) hide show

package/.claude-plugin/marketplace.json +1 -1
package/.claude-plugin/plugin.json +1 -1
package/README.md +4 -1
package/README.zh-CN.md +4 -1
package/haiku-client.mjs +127 -22
package/hook-llm.mjs +3 -3
package/hook-shared.mjs +18 -5
package/package.json +1 -1
package/schema.mjs +33 -2

package/.claude-plugin/marketplace.json CHANGED Viewed

@@ -10,7 +10,7 @@
   "plugins": [
     {
       "name": "claude-mem-lite",
-      "version": "2.85.0",
+      "version": "2.87.0",
       "source": "./",
       "description": "Persistent long-term memory for Claude Code via MCP — captures coding decisions, bugfixes, and context across sessions. Hybrid FTS5 + TF-IDF search with episode batching. Single SQLite DB, no external services. Alternative to claude-mem with 600x lower cost."
     }

package/.claude-plugin/plugin.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "claude-mem-lite",
-  "version": "2.85.0",
+  "version": "2.87.0",
   "description": "Persistent long-term memory for Claude Code via MCP — captures coding decisions, bugfixes, and context across sessions. Hybrid FTS5 + TF-IDF search with episode batching. Single SQLite DB, no external services. Alternative to claude-mem with 600x lower cost.",
   "author": {
     "name": "sdsrss"

package/README.md CHANGED Viewed

@@ -105,7 +105,7 @@ How claude-mem-lite differs from the major neighbors in the LLM-memory space (ve
 - **Resource registry** -- Indexes installed skills and agents with FTS5 search, composite scoring, and invocation tracking; searchable via `mem_registry` MCP tool
 - **Unified resource discovery** -- Shared filesystem traversal layer (`resource-discovery.mjs`) used by both runtime scanner and offline indexer, supporting flat directories, plugin nesting, and loose `.md` files
 - **Domain synonym expansion** -- Registry search queries expand to domain synonyms (e.g., "fix" → debug, bugfix, troubleshoot, diagnose, repair)
-- **Dual LLM mode** -- Auto-detects `ANTHROPIC_API_KEY` for direct API calls; falls back to `claude -p` CLI when no key is available
+- **Multi-provider LLM mode** -- Provider priority `ANTHROPIC_API_KEY` (direct Anthropic API) → `OPENROUTER_API_KEY` (OpenRouter, OpenAI-compatible — point it at any model via `OPENROUTER_MODEL`) → `claude -p` CLI fallback when no key is set
 - **Lesson-learned indexing** -- `lesson_learned` field indexed in FTS5 with weight 8, making past debugging insights directly searchable
 - **Cross-source normalization** -- `mem_search` normalizes scores across observations, sessions, and prompts before merging, preventing any source from dominating results
 - **Exponential recency decay** -- Type-differentiated half-lives (decisions: 90d, discoveries: 60d, bugfixes: 14d, changes: 7d) consistently applied in all ranking paths
@@ -657,6 +657,9 @@ npm run benchmark:gate    # CI gate: fails if metrics regress beyond 5% toleranc
 |----------|-------------|---------|
 | `CLAUDE_MEM_DIR` | Custom data directory. All databases, runtime files, and managed resources are stored here. | `~/.claude-mem-lite/` |
 | `CLAUDE_MEM_MODEL` | LLM model for background calls (episode extraction, session summaries). Accepts `haiku` or `sonnet`. | `haiku` |
+| `ANTHROPIC_API_KEY` | Anthropic API key. When set, all background LLM calls go directly to the Anthropic Messages API (with prompt caching). Highest priority. | _(unset → CLI)_ |
+| `OPENROUTER_API_KEY` | OpenRouter API key (OpenAI-compatible). Used for background LLM calls when `ANTHROPIC_API_KEY` is **not** set. If neither key is set, calls fall back to the `claude -p` CLI. | _(unset)_ |
+| `OPENROUTER_MODEL` | Overrides the OpenRouter model slug for **all** background calls (e.g. `openai/gpt-4o-mini`, `qwen/qwen-2.5-72b-instruct`). When unset, the `CLAUDE_MEM_MODEL` tier maps to `anthropic/claude-haiku-4.5` (haiku) or `anthropic/claude-sonnet-4.5` (sonnet). | _(tier default)_ |
 | `CLAUDE_MEM_DEBUG` | Enable debug logging (`1` to enable). | _(disabled)_ |
 | `MEM_QUIET_HOOKS` | Low-noise hooks. `1` drops the `File Lessons` / `Key Context` sections from SessionStart injection, the lesson suffix from `[mem] Related memories`, and the `WHEN TO USE` / `Decision rules` blocks from MCP server instructions. IDs and the `Recent` table still surface so `mem_get(ids=[…])` remains reachable. Intended for users running the invited-memory adopt path or who otherwise want minimal auto-injection. **Since v2.82.0 this env no longer gates auto-adopt — use `MEM_NO_AUTO_ADOPT=1` for that.** | _(disabled)_ |
 | `MEM_NO_AUTO_ADOPT` | Global opt-out for auto-adopt (v2.82.0+). `1` prevents the first-SessionStart auto-write of the invited-memory sentinel across **all** projects. For per-project opt-out use `claude-mem-lite adopt --disable` instead (writes a durable `<memdir>/.mem-no-auto-adopt` sentinel that survives marker deletion). | _(disabled)_ |

package/README.zh-CN.md CHANGED Viewed

@@ -86,7 +86,7 @@
 - **统一资源发现** -- 共享文件系统遍历层（`resource-discovery.mjs`），运行时扫描器和离线索引器共用，支持扁平目录、插件嵌套和松散 `.md` 文件
 - **领域同义词扩展** -- 注册表搜索查询自动扩展领域同义词（如 "修复" → fix, debug, bugfix, repair, error）
 - **持久化冷却机制** -- 5 分钟跨会话冷却 + 同会话去重，避免重复推荐 skill 自动加载
-- **双模式 LLM 调用** -- 自动检测 `ANTHROPIC_API_KEY` 直连 API；无 key 时回退到 `claude -p` CLI
+- **多 provider LLM 调用** -- provider 优先级 `ANTHROPIC_API_KEY`（直连 Anthropic API）→ `OPENROUTER_API_KEY`（OpenRouter，OpenAI 兼容，可用 `OPENROUTER_MODEL` 指向任意模型）→ 无 key 时回退 `claude -p` CLI
 - **Haiku 熔断器** -- 连续 3 次 LLM 失败后，禁用 Haiku 调度 5 分钟，防止级联延迟
 - **否定意图感知** -- 正确处理 "不要测试了，先修 bug" 等复杂提示，排除被否定的意图，支持中英文混合输入
 - **可配置 LLM 模型** -- 通过 `CLAUDE_MEM_MODEL` 环境变量在 Haiku（快速/低成本）和 Sonnet（深度分析）之间切换
@@ -599,6 +599,9 @@ npm run benchmark:gate    # CI 门控：指标回退超过 5% 容差时失败
 |------|------|--------|
 | `CLAUDE_MEM_DIR` | 自定义数据目录。所有数据库、运行时文件和托管资源均存储在此。 | `~/.claude-mem-lite/` |
 | `CLAUDE_MEM_MODEL` | 后台 LLM 调用模型（Episode 提取、会话总结、调度）。可选 `haiku` 或 `sonnet`。 | `haiku` |
+| `ANTHROPIC_API_KEY` | Anthropic API key。设置后所有后台 LLM 调用直连 Anthropic Messages API（带 prompt caching），优先级最高。 | _(未设 → CLI)_ |
+| `OPENROUTER_API_KEY` | OpenRouter API key（OpenAI 兼容）。当**未设** `ANTHROPIC_API_KEY` 时用于后台 LLM 调用；两者都未设则回退到 `claude -p` CLI。 | _(未设)_ |
+| `OPENROUTER_MODEL` | 覆盖**所有**后台调用的 OpenRouter 模型 slug（如 `openai/gpt-4o-mini`、`qwen/qwen-2.5-72b-instruct`）。未设时按 `CLAUDE_MEM_MODEL` 分层映射到 `anthropic/claude-haiku-4.5`（haiku）或 `anthropic/claude-sonnet-4.5`（sonnet）。 | _(分层默认)_ |
 | `CLAUDE_MEM_DEBUG` | 启用调试日志（设为 `1` 启用）。 | _(禁用)_ |
 | `MEM_QUIET_HOOKS` | 低噪声 hook。设为 `1` 时，SessionStart 注入去掉 `File Lessons` / `Key Context` 两节，`[mem] Related memories` 去掉 lesson 后缀，MCP server instructions 去掉 `WHEN TO USE` / `Decision rules` 两段。ID 与 `Recent` 表仍保留，`mem_get(ids=[…])` 可继续展开细节。适用于启用了 invited-memory adopt 流程或偏好最小化自动注入的用户。**v2.82.0 起此 env 不再阻挡 auto-adopt——如需关闭 auto-adopt 用 `MEM_NO_AUTO_ADOPT=1`。** | _(禁用)_ |
 | `MEM_NO_AUTO_ADOPT` | auto-adopt 全局关闭开关（v2.82.0+）。设为 `1` 阻止首次 SessionStart 在**所有**项目自动写入邀请式 memory 哨兵。项目级关闭走 `claude-mem-lite adopt --disable`（写 `<memdir>/.mem-no-auto-adopt` 哨兵，存活于 marker 删除）。 | _(禁用)_ |

package/haiku-client.mjs CHANGED Viewed

@@ -1,7 +1,9 @@
 // claude-mem-lite: Unified LLM call wrapper
 // Shared by memory (hook.mjs) and dispatch modules
-// Auto-detects API key for direct calls, falls back to claude CLI
-// Model configurable via CLAUDE_MEM_MODEL env var (default: haiku)
+// Provider priority: ANTHROPIC_API_KEY (direct Anthropic API) →
+// OPENROUTER_API_KEY (OpenRouter, OpenAI-compatible) → claude CLI fallback
+// Model configurable via CLAUDE_MEM_MODEL (haiku|sonnet); OpenRouter slug
+// overridable via OPENROUTER_MODEL
 import { execFileSync } from 'child_process';
 import { readFileSync } from 'fs';
@@ -30,18 +32,47 @@ export function resolveModel() {
   return { cli, api };
 }
+// OpenRouter uses its own slug namespace (OpenAI-compatible API). Map the
+// project's haiku/sonnet tiers to the matching anthropic/* slugs so the quality
+// tiering is preserved when routing through OpenRouter. Slugs verified against
+// openrouter.ai (2026-06): claude-haiku-4.5 / claude-sonnet-4.5 mirror the
+// native MODEL_MAP IDs above.
+const OPENROUTER_MODEL_MAP = {
+  haiku: 'anthropic/claude-haiku-4.5',
+  sonnet: 'anthropic/claude-sonnet-4.5',
+};
+/**
+ * Resolve the OpenRouter model slug for a given tier.
+ * OPENROUTER_MODEL (if set, non-blank) overrides every tier with an explicit
+ * slug — this is how users point claude-mem-lite at any OpenRouter model
+ * (e.g. openai/gpt-4o-mini, qwen/...). Otherwise the tier maps to its default
+ * anthropic/* slug, falling back to the haiku slug for unknown tiers.
+ * @param {string} tier 'haiku' | 'sonnet'
+ * @returns {string} OpenRouter model slug
+ */
+export function resolveOpenRouterModel(tier) {
+  const override = (process.env.OPENROUTER_MODEL || '').trim();
+  if (override) return override;
+  return OPENROUTER_MODEL_MAP[tier] || OPENROUTER_MODEL_MAP.haiku;
+}
 // ─── Mode Detection ──────────────────────────────────────────────────────────
 let _mode = null;
 /**
- * Detect whether to use direct API or CLI for LLM calls.
- * Cached after first call.
- * @returns {'api'|'cli'} The detected mode
+ * Detect which provider to use for LLM calls. Priority (per user contract):
+ * ANTHROPIC_API_KEY → direct Anthropic API ('api', native, supports prompt
+ * caching), else OPENROUTER_API_KEY → OpenRouter ('openrouter', OpenAI-compat),
+ * else fall back to the `claude` CLI ('cli'). Cached after first call.
+ * @returns {'api'|'openrouter'|'cli'} The detected mode
  */
 export function detectMode() {
   if (_mode) return _mode;
-  _mode = process.env.ANTHROPIC_API_KEY ? 'api' : 'cli';
+  if (process.env.ANTHROPIC_API_KEY) _mode = 'api';
+  else if (process.env.OPENROUTER_API_KEY) _mode = 'openrouter';
+  else _mode = 'cli';
   const { cli } = resolveModel();
   debugLog('DEBUG', 'haiku-client', `mode: ${_mode}, model: ${cli}`);
   return _mode;
@@ -102,8 +133,9 @@ export function flattenForCLI(input) {
 /**
  * Call Haiku model with a prompt. Returns parsed text or null on failure.
- * Uses direct API when ANTHROPIC_API_KEY is available, otherwise falls back to CLI.
- * Never throws — returns null on any error.
+ * Provider priority ANTHROPIC_API_KEY → OPENROUTER_API_KEY → CLI; if the keyed
+ * provider call fails (HTTP error / network throw / empty), degrades to the
+ * `claude -p` CLI. Never throws — returns null only when every path fails.
  *
  * @param {string|{system?: string, user: string}} prompt Prompt text, or split form
  * @param {object} [opts] Options
@@ -116,15 +148,28 @@ export async function callHaiku(prompt, { timeout = 10000, maxTokens = 500 } = {
   const mode = detectMode();
+  // CLI is terminal — no provider to fall back to.
+  if (mode === 'cli') {
+    try { return callHaikuCLI(prompt, { timeout }); }
+    catch (e) { debugCatch(e, 'callHaiku'); return null; }
+  }
+  // Keyed provider (api/openrouter): attempt it, then degrade to the CLI on any
+  // failure (HTTP error → null, or network/timeout throw). A region-blocked or
+  // out-of-credit key must not silently drop background summaries.
+  let primary = null;
   try {
-    if (mode === 'api') {
-      return await callHaikuAPI(prompt, { timeout, maxTokens });
-    }
-    return callHaikuCLI(prompt, { timeout });
+    primary = mode === 'api'
+      ? await callHaikuAPI(prompt, { timeout, maxTokens })
+      : await callOpenRouterAPI(prompt, resolveModel().cli, { timeout, maxTokens });
   } catch (e) {
-    debugCatch(e, 'callHaiku');
-    return null;
+    debugCatch(e, `callHaiku:${mode}`);
   }
+  if (primary) return primary;
+  debugLog('WARN', 'haiku-client', `${mode} call failed, falling back to claude CLI`);
+  try { return callHaikuCLI(prompt, { timeout }); }
+  catch (e) { debugCatch(e, 'callHaiku:cli-fallback'); return null; }
 }
 /**
@@ -143,8 +188,8 @@ export async function callHaikuJSON(prompt, opts) {
 /**
  * Call LLM with explicit model selection. Supports 'haiku' and 'sonnet'.
- * Reuses existing API/CLI dual-mode infrastructure.
- * Never throws — returns null on any error.
+ * Same provider priority + failure fallback to CLI as callHaiku.
+ * Never throws — returns null only when every path fails.
  *
  * @param {string} prompt The prompt text
  * @param {'haiku'|'sonnet'} model Model to use (default: 'haiku')
@@ -158,15 +203,27 @@ export async function callLLMWithModel(prompt, model = 'haiku', { timeout = 1500
   const resolvedModel = MODEL_MAP[model] ? model : 'haiku';
   const mode = detectMode();
+  // CLI is terminal — no provider to fall back to.
+  if (mode === 'cli') {
+    try { return callModelCLI(prompt, resolvedModel, { timeout }); }
+    catch (e) { debugCatch(e, `callLLMWithModel:${resolvedModel}`); return null; }
+  }
+  // Keyed provider (api/openrouter): attempt it, then degrade to the CLI on any
+  // failure so a region-blocked / out-of-credit key still produces output.
+  let primary = null;
   try {
-    if (mode === 'api') {
-      return await callModelAPI(prompt, resolvedModel, { timeout, maxTokens });
-    }
-    return callModelCLI(prompt, resolvedModel, { timeout });
+    primary = mode === 'api'
+      ? await callModelAPI(prompt, resolvedModel, { timeout, maxTokens })
+      : await callOpenRouterAPI(prompt, resolvedModel, { timeout, maxTokens });
   } catch (e) {
-    debugCatch(e, `callLLMWithModel:${resolvedModel}`);
-    return null;
+    debugCatch(e, `callLLMWithModel:${mode}:${resolvedModel}`);
   }
+  if (primary) return primary;
+  debugLog('WARN', 'haiku-client', `${mode} call failed, falling back to claude CLI (${resolvedModel})`);
+  try { return callModelCLI(prompt, resolvedModel, { timeout }); }
+  catch (e) { debugCatch(e, `callLLMWithModel:cli-fallback:${resolvedModel}`); return null; }
 }
 /**
@@ -299,6 +356,54 @@ async function callHaikuAPI(prompt, { timeout, maxTokens }) {
   }
 }
+// ─── OpenRouter Mode ─────────────────────────────────────────────────────────
+// OpenRouter exposes an OpenAI-compatible chat-completions API (NOT the
+// Anthropic Messages format), so the request/response shapes differ from
+// callHaikuAPI/callModelAPI: Bearer auth, `messages` with a system-role entry,
+// and the reply lives at choices[0].message.content. Anthropic's prompt-cache
+// `cache_control` field has no OpenAI-format equivalent and is omitted.
+// `tier` is the resolved model tier ('haiku'|'sonnet'); OPENROUTER_MODEL can
+// override the resulting slug entirely (see resolveOpenRouterModel).
+async function callOpenRouterAPI(prompt, tier, { timeout, maxTokens }) {
+  const apiKey = process.env.OPENROUTER_API_KEY;
+  if (!apiKey) return null;
+  const model = resolveOpenRouterModel(tier);
+  const controller = new AbortController();
+  const timer = setTimeout(() => controller.abort(), timeout);
+  try {
+    const { system, user } = splitPrompt(prompt);
+    const messages = [];
+    if (system) messages.push({ role: 'system', content: system });
+    messages.push({ role: 'user', content: user });
+    const res = await fetch('https://openrouter.ai/api/v1/chat/completions', {
+      method: 'POST',
+      headers: {
+        'Content-Type': 'application/json',
+        'Authorization': `Bearer ${apiKey}`,
+        // Optional OpenRouter attribution headers (ignored by the API if absent).
+        'X-Title': 'claude-mem-lite',
+      },
+      body: JSON.stringify({ model, max_tokens: maxTokens, messages }),
+      signal: controller.signal,
+    });
+    if (!res.ok) {
+      debugLog('WARN', `${tier}-openrouter`, `HTTP ${res.status}`);
+      return null;
+    }
+    const data = await res.json();
+    const text = data.choices?.[0]?.message?.content;
+    return text ? { text } : null;
+  } finally {
+    clearTimeout(timer);
+  }
+}
 // ─── CLI Mode ────────────────────────────────────────────────────────────────
 function callHaikuCLI(prompt, { timeout }) {

package/hook-llm.mjs CHANGED Viewed

@@ -674,7 +674,7 @@ ${actionList}`;
   if (gotSlot) {
     let raw, parsed;
     try {
-      raw = callLLM(prompt);
+      raw = await callLLM(prompt);
       parsed = parseJsonFromLLM(raw);
     } finally {
       releaseLLMSlot();
@@ -721,7 +721,7 @@ ${actionList}`;
         retryAttempted = true;
         try {
           const retryPrompt = buildLessonRetryPrompt(episode, parsed);
-          const retryRaw = callLLM(retryPrompt, 10000);
+          const retryRaw = await callLLM(retryPrompt, 10000);
           if (retryRaw) {
             const retry = parseJsonFromLLM(retryRaw);
             const retryLesson = typeof retry?.lesson === 'string' ? retry.lesson.trim() : '';
@@ -974,7 +974,7 @@ ${obsList}`;
     let raw, llmParsed;
     try {
-      raw = callLLM(prompt, 20000);
+      raw = await callLLM(prompt, 20000);
       llmParsed = parseJsonFromLLM(raw);
     } finally {
       releaseLLMSlot();

package/hook-shared.mjs CHANGED Viewed

@@ -7,7 +7,7 @@ import { join } from 'path';
 import { existsSync, readFileSync, writeFileSync, mkdirSync, renameSync, readdirSync, statSync, unlinkSync } from 'fs';
 import { inferProject, debugCatch } from './utils.mjs';
 import { ensureDb, DB_DIR } from './schema.mjs';
-import { getClaudePath as getClaudePathShared, resolveModel as resolveModelShared, flattenForCLI as _flattenForCLI } from './haiku-client.mjs';
+import { getClaudePath as getClaudePathShared, resolveModel as resolveModelShared, flattenForCLI as _flattenForCLI, detectMode as detectLLMMode, callHaiku } from './haiku-client.mjs';
 // Phase D: invited-memory sentinel detection. memdir.mjs only pulls in fs/path/os/crypto;
 // adopt-content.mjs is pure strings. No circular deps — memdir doesn't import hook-shared.
 import { memdirPath as _memdirPath, isAdopted as _isAdopted } from './memdir.mjs';
@@ -130,13 +130,26 @@ export function openDb() {
   }
 }
-// ─── LLM via claude CLI ─────────────────────────────────────────────────────
+// ─── LLM (provider-routed: Anthropic API → OpenRouter → claude CLI) ─────────
 // Accepts either a plain string (legacy) or {system, user} (defense-in-depth
 // against prompt injection from poisoned user_prompts content — cso F#4 fix).
-// CLI mode renders the {system, user} form via flattenForCLI which inserts an
-// explicit data-boundary marker; API mode uses the system role natively.
-export function callLLM(prompt, timeoutMs = 15000) {
+// Provider priority mirrors haiku-client (ANTHROPIC_API_KEY > OPENROUTER_API_KEY
+// > CLI): when a key is present, delegate to callHaiku — it owns the Anthropic
+// Messages / OpenRouter chat-completions request shapes, uses the system role
+// natively, AND degrades to the `claude -p` CLI internally if the keyed provider
+// fails (so a region-blocked / out-of-credit key still yields a summary). The
+// keyless case shells out to `claude -p` directly here, where flattenForCLI
+// renders {system, user} with an explicit data-boundary marker. Returns the raw
+// response string (callers run parseJsonFromLLM themselves) or null.
+// maxTokens is sized for session-summary / episode JSON (larger than the
+// registry/optimize callers' budgets).
+export async function callLLM(prompt, timeoutMs = 15000) {
+  if (detectLLMMode() !== 'cli') {
+    const result = await callHaiku(prompt, { timeout: timeoutMs, maxTokens: 2000 });
+    return result?.text ?? null;
+  }
   const { cli: modelName } = resolveModelShared();
   try {
     const result = execFileSync(getClaudePathShared(), ['-p', '--model', modelName], {

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "claude-mem-lite",
-  "version": "2.85.0",
+  "version": "2.87.0",
   "description": "Persistent long-term memory for Claude Code via MCP — captures coding decisions, bugfixes, and context across sessions. Hybrid FTS5 + TF-IDF search with episode batching. Single SQLite DB, no external services. Alternative to claude-mem with 600x lower cost.",
   "type": "module",
   "packageManager": "npm@10.9.2",

package/schema.mjs CHANGED Viewed

@@ -49,7 +49,12 @@ export const REGISTRY_DB_PATH = join(DB_DIR, 'resource-registry.db');
 // cited_count, last_decided_session_id. Stop hook resolves injected obs as
 // cited|uncited; 3 consecutive uncited → importance -1 (floor 0); 1 cited → +1
 // (cap 3). last_decided_session_id makes Stop idempotent across multi-fire.
-export const CURRENT_SCHEMA_VERSION = 34;
+// v35 (v2.87.0): no DDL — version bumped only to force one full migration pass on
+// existing DBs, which runs the one-shot observation_files orphan cleanup (and
+// re-runs the v28 observation_vectors cleanup) to clear the backlog leaked while
+// the warm-start fast-path left foreign_keys OFF. LATEST_MIGRATION_COLUMN is
+// unchanged (no new column) — decay_seen_count still exists at v35.
+export const CURRENT_SCHEMA_VERSION = 35;
 // Sentinel column for the LATEST migration set. The fast-path uses this to
 // self-heal half-migrated DBs — schema_version bumped but column ALTERs rolled
@@ -212,7 +217,16 @@ export function initSchema(db) {
       // Self-heal: version-row says CURRENT but latest migration column may be
       // absent (rolled back / never applied). Fall through to migration apply
       // when the sentinel is missing — duplicates are caught in the loop.
-      if (row.version === CURRENT_SCHEMA_VERSION && hasLatestMigrationColumn(db)) return db;
+      if (row.version === CURRENT_SCHEMA_VERSION && hasLatestMigrationColumn(db)) {
+        // Warm-start post-condition: ensureDb() opened this handle with
+        // foreign_keys=OFF (early migrations require cascade disabled). The full
+        // migration path re-enables it at the end; this fast-path return must
+        // match that post-condition, else every DELETE on the returned handle
+        // skips ON DELETE CASCADE and silently orphans junction rows. Safe here:
+        // no transaction is open yet (BEGIN IMMEDIATE is below).
+        db.pragma('foreign_keys = ON');
+        return db;
+      }
       if (row.version > CURRENT_SCHEMA_VERSION) {
         throw new Error(
           `DB schema is v${row.version} but this claude-mem-lite binary supports up to v${CURRENT_SCHEMA_VERSION}. ` +
@@ -237,6 +251,10 @@ export function initSchema(db) {
     const underlock = db.prepare('SELECT version FROM schema_version LIMIT 1').get();
     if (underlock && underlock.version === CURRENT_SCHEMA_VERSION && hasLatestMigrationColumn(db)) {
       db.exec('COMMIT');
+      // COMMIT closed the transaction, so this PRAGMA takes effect (no-op inside a txn).
+      // Same FK post-condition as the fast-path above: a peer completed init while we
+      // were blocked, so we skip migrations and must still restore cascade enforcement.
+      db.pragma('foreign_keys = ON');
       return db;
     }
   } catch { /* table absent — proceed */ }
@@ -472,6 +490,19 @@ export function initSchema(db) {
     }
   } catch { /* non-critical — migration can retry on next open */ }
+  // v35 (v2.87.0) P1: one-shot cleanup of orphaned observation_files. Same root
+  // cause as the v28 observation_vectors cleanup below — until the warm-start
+  // fast-path FK fix (initSchema early returns now restore foreign_keys=ON),
+  // ensureDb() handles ran with FK OFF, so ON DELETE CASCADE never fired and
+  // junction rows leaked (live DB: 6440/9569 = 67% orphans). Idempotent (NOT IN
+  // is empty on a clean DB); runs once per version bump via the fast-path gate.
+  try {
+    db.prepare(`
+      DELETE FROM observation_files
+      WHERE obs_id NOT IN (SELECT id FROM observations)
+    `).run();
+  } catch { /* non-critical — table-missing path handled by earlier CREATE */ }
   // Observation vectors table for TF-IDF vector search
   db.exec(`
     CREATE TABLE IF NOT EXISTS observation_vectors (