npm - moflo - Versions diffs - 4.10.0 → 4.10.2 - Mend

moflo 4.10.0 → 4.10.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (17)

package/.claude/skills/healer/SKILL.md +3 -1
package/bin/lib/db-repair.mjs +358 -41
package/bin/session-start-launcher.mjs +42 -6
package/dist/src/cli/commands/doctor-checks-config.js +60 -0
package/dist/src/cli/commands/doctor-checks-memory-access.js +27 -1
package/dist/src/cli/commands/doctor-embedding-hygiene.js +48 -12
package/dist/src/cli/commands/doctor-fixes.js +57 -0
package/dist/src/cli/commands/doctor-registry.js +10 -1
package/dist/src/cli/commands/doctor-render.js +118 -74
package/dist/src/cli/commands/doctor.js +70 -25
package/dist/src/cli/memory/bridge-core.js +36 -0
package/dist/src/cli/memory/bridge-embedder.js +84 -3
package/dist/src/cli/memory/memory-initializer.js +2 -2
package/dist/src/cli/services/ephemeral-namespace-purge.js +15 -5
package/dist/src/cli/services/memory-db-integrity-repair.js +119 -0
package/dist/src/cli/version.js +1 -1
package/package.json +2 -2

package/.claude/skills/healer/SKILL.md CHANGED

@@ -30,7 +30,9 @@ Thin wrapper around the `flo healer` CLI. All check + fix logic lives in the CLI
    - `✓ N passing` (count only)
    - `⚠ warnings` — list `name: message`; flag with `[auto-fixable]` when the result has a `fix` field
    - `✗ failures` — same
-   - If `--fix` mode, also list which fixes were applied vs which need manual action.
+   - If `--fix` mode, read `fixesApplied[]` from the JSON payload and list `{name, applied}` per entry — applied=true → "fixed", applied=false → "needs manual action". The `results[]` array is post-fix state (re-evaluated), so report the final status.
+   - If `--install` was passed, surface `claudeCodeInstall.installed` from the payload.
+   - If `--kill-zombies` was passed, surface `zombieScan.killed` / `zombieScan.found` from the payload.
 4. **Nudge based on what changed.** Only mention next steps for state that *actually* changed:
    - Daemon restarted → `Statusline should refresh within ~5s.`
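
The reporting rule added to SKILL.md above can be sketched as a small formatter. This is a minimal illustration, not code from the package: `summarizeHealerPayload` is a hypothetical helper, and the payload shape is assumed from the field names the skill text mentions (`fixesApplied[]` with `{name, applied}`, `claudeCodeInstall.installed`, `zombieScan.killed` / `zombieScan.found`).

```javascript
// Hypothetical helper: turn a healer JSON payload into report lines.
// Only the field names come from the skill text; the overall payload
// shape and the output wording are assumptions for illustration.
function summarizeHealerPayload(payload) {
  const lines = [];
  for (const { name, applied } of payload.fixesApplied ?? []) {
    // applied=true -> "fixed", applied=false -> "needs manual action"
    lines.push(`${name}: ${applied ? 'fixed' : 'needs manual action'}`);
  }
  if (payload.claudeCodeInstall) {
    lines.push(`claude code installed: ${payload.claudeCodeInstall.installed}`);
  }
  if (payload.zombieScan) {
    const { killed, found } = payload.zombieScan;
    lines.push(`zombies killed: ${killed}/${found}`);
  }
  return lines;
}

// Made-up sample payload.
const sample = {
  fixesApplied: [
    { name: 'Memory DB Integrity', applied: true },
    { name: 'Daemon', applied: false },
  ],
  zombieScan: { killed: 2, found: 3 },
};
console.log(summarizeHealerPayload(sample).join('\n'));
```

Sections for `--install` and `--kill-zombies` are emitted only when the corresponding object is present, which matches the "if the flag was passed" wording in the skill.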

package/bin/lib/db-repair.mjs CHANGED

@@ -1,32 +1,54 @@
 /**
- * Memory-DB integrity check + auto-REINDEX (story #743).
+ * Memory-DB integrity check + tiered repair (#743, #1090-followup).
  *
- * The `.moflo/moflo.db` SQLite file routinely accumulates index corruption of
- * the form `row N missing from index sqlite_autoindex_memory_entries_1` —
- * the row data is intact, only the unique-key index has drifted. The most
- * common trigger is sql.js's whole-file dump-on-flush behaviour racing with
- * concurrent writes (see `feedback_sqljs_writeback_clobber.md` and #714).
+ * The `.moflo/moflo.db` SQLite file picks up corruption in two distinct modes:
+ *
+ *  1. **Index drift** — `row N missing from sqlite_autoindex_memory_entries_1`.
+ *     Row data is intact; only the unique-key b-tree is wrong. Trigger: sql.js's
+ *     whole-file dump-on-flush racing with concurrent writes (#714, #743 —
+ *     fixed for new installs by Phase 5 / #1084 which removed sql.js entirely).
+ *     **REINDEX** rebuilds the index from canonical row data.
+ *
+ *  2. **Table b-tree corruption** — `Tree N page M cell K: Rowid X out of
+ *     order`, where Tree N is a TABLE root page (not just an index). Row data
+ *     is partly intact, but page ordering is broken. Triggers we've seen:
+ *      - sql.js → node:sqlite migration: an old 4.9.x sql.js daemon flushes its
+ *        full-file dump OVER a WAL frame that the new 4.10 backend has already
+ *        written, leaving WAL referencing pages that no longer exist in main.
+ *      - Concurrent multi-process writes when the daemon was disabled (#981).
+ *     **REINDEX cannot fix this** — the table itself is broken. Recovery path:
+ *      a) `VACUUM INTO` a fresh file (single-shot rebuild; fails fast if
+ *         iteration hits an unreadable page),
+ *      b) row-level salvage — chunked `SELECT rowid > ?` per table, catching
+ *         per-chunk errors and skipping past corrupt page ranges,
+ *      c) atomic swap with .corrupt.<TS> backup retained for forensics.
+ *
+ *  3. **Unrecoverable** — header damage, encrypted-by-malware, etc. We can't
+ *     fix this; surface a clear failure and let the user decide between manual
+ *     `flo memory rebuild-index` (destructive) and offline recovery tools.
  *
  * Symptoms when uncorrected:
  *  - `index-guidance.mjs` and `index-patterns.mjs` fail mid-write with
  *    `database disk image is malformed`, leaving partial state.
  *  - The ephemeral-namespace purge (#729) fails silently, so hive-mind /
  *    tasklist / epic-state / test-bridge-fix rows accumulate.
- *  - Vector counts in the statusline stay inflated (observed: 4415 with
- *    1025 unpurged ephemeral rows).
+ *  - Vector counts in the statusline stay inflated.
+ *  - Healer's deep checks throw with "database disk image is malformed",
+ *    surfacing as the synthetic 'Check' failure (doctor.ts:214).
  *
- * Fix shape: REINDEX rebuilds indexes from the canonical row data — much less
- * destructive than a full rebuild and works for the typical drift mode. If
- * REINDEX itself fails to restore integrity we leave the file alone and
- * report; manual `flo memory rebuild-index` is the fallback.
- *
- * MUST run BEFORE any long-lived sql.js consumer (MCP server, daemon) opens
- * the DB and BEFORE the embeddings migration / soft-delete purge / ephemeral
- * purge — those all swallow corruption errors and silently no-op.
+ * MUST run BEFORE any long-lived consumer (MCP server, daemon) opens the DB
+ * and BEFORE the embeddings migration / soft-delete purge / ephemeral purge —
+ * those all swallow corruption errors and silently no-op.
  */
-import { existsSync } from 'node:fs';
+import { existsSync, renameSync, unlinkSync } from 'node:fs';
 import { memoryDbPath } from './moflo-paths.mjs';
 import { openBackend } from './get-backend.mjs';
+import './suppress-sqlite-warning.mjs';
+// Resolve node:sqlite once at module load — get-backend.mjs has already
+// loaded it by this point, so the dynamic import is a cache hit. Avoids
+// three independent `await import('node:sqlite')` calls inside the repair
+// functions (style cleanup; was producing no functional difference).
+const { DatabaseSync } = await import('node:sqlite');
 function isOk(execResult) {
   const rows = execResult?.[0]?.values ?? [];
@@ -38,42 +60,337 @@ function corruptionCount(execResult) {
 }
 /**
- * Probe the memory DB for index corruption and run REINDEX in place if
- * found. Returns `{ repaired, errors, persistent }`:
- *  - `repaired: true` and `errors > 0` when REINDEX restored integrity.
- *  - `repaired: false, errors: 0` when the DB is healthy or absent.
- *  - `repaired: false, errors > 0, persistent: true` when corruption survives
- *    REINDEX (caller should surface to the user — manual rebuild needed).
+ * Open `.moflo/moflo.db` raw via node:sqlite in readonly mode and run
+ * `PRAGMA integrity_check`. Bypasses {@link openBackend} because that path
+ * sets `journal_mode=WAL`, `busy_timeout`, and `synchronous=NORMAL` on every
+ * non-readonly open — those PRAGMAs can themselves throw against a corrupt
+ * file, and the pre-#1090 code path caught those throws and reported the DB
+ * as healthy. Readonly + no PRAGMAs = the probe always reaches the
+ * `integrity_check` call regardless of file health.
+ *
+ * Exported so the TS doctor check (`checkMemoryDbIntegrity` in
+ * `src/cli/commands/doctor-checks-config.ts`) can call into the same
+ * implementation instead of re-deriving the readonly-no-PRAGMAs probe.
+ *
+ * @param {string} dbPath
+ * @returns {Promise<{ ok: boolean, errors: number, openFailed?: boolean }>}
+ */
+export async function probeIntegrityRaw(dbPath) {
+  let db;
+  try {
+    db = new DatabaseSync(dbPath, { readOnly: true });
+  } catch {
+    return { ok: false, errors: 0, openFailed: true };
+  }
+  try {
+    const rows = db.prepare('PRAGMA integrity_check').all();
+    if (rows.length === 1 && String(rows[0]?.integrity_check ?? '').toLowerCase() === 'ok') {
+      return { ok: true, errors: 0 };
+    }
+    return { ok: false, errors: rows.length };
+  } catch {
+    return { ok: false, errors: 0, openFailed: true };
+  } finally {
+    try { db.close(); } catch { /* already-dead handle */ }
+  }
+}
+/**
+ * Tier-2 recovery: `VACUUM INTO` a fresh file. Single SQLite call that
+ * iterates every row of every table and writes them to a brand-new database
+ * with rebuilt indexes. Fails fast if iteration hits an unreadable page —
+ * caller falls back to row-level salvage.
+ *
+ * @param {string} srcPath
+ * @param {string} dstPath
+ * @returns {Promise<{ ok: boolean, error?: string }>}
+ */
+async function tryVacuumInto(srcPath, dstPath) {
+  try { if (existsSync(dstPath)) unlinkSync(dstPath); } catch { /* best effort */ }
+  let db;
+  try {
+    // Open writable (not readonly) — VACUUM needs to checkpoint WAL first.
+    // Skip our standard WAL pragmas (they can throw on corrupt files); SQLite
+    // applies its defaults which are sufficient for VACUUM INTO.
+    db = new DatabaseSync(srcPath);
+  } catch (err) {
+    return { ok: false, error: err?.message ?? 'open failed' };
+  }
+  try {
+    try { db.exec('PRAGMA wal_checkpoint(TRUNCATE)'); } catch { /* corrupt WAL ok */ }
+    db.exec(`VACUUM INTO '${dstPath.replace(/'/g, "''")}'`);
+    return { ok: true };
+  } catch (err) {
+    return { ok: false, error: err?.message ?? 'vacuum failed' };
+  } finally {
+    try { db.close(); } catch { /* */ }
+  }
+}
+/**
+ * Tier-3 recovery: row-level salvage. Iterate each non-empty table in
+ * `rowid > ?` chunks; on any chunk-read failure, skip past that chunk's
+ * rowid range and continue. Per-table loss stats returned so the caller can
+ * surface what was preserved vs lost.
+ *
+ * Schema is copied verbatim from `sqlite_master.sql` so triggers/indexes/views
+ * are preserved alongside tables. `INSERT OR IGNORE` handles unique-key
+ * collisions from any duplicate-rowid corruption mode.
+ *
+ * @param {string} srcPath
+ * @param {string} dstPath
+ * @returns {Promise<{
+ *   ok: boolean,
+ *   error?: string,
+ *   lossStats?: Record<string, { read: number, written: number, errors: number }>,
+ * }>}
+ */
+async function trySalvageRowByRow(srcPath, dstPath) {
+  try { if (existsSync(dstPath)) unlinkSync(dstPath); } catch { /* */ }
+  let src;
+  try {
+    src = new DatabaseSync(srcPath, { readOnly: true });
+  } catch (err) {
+    return { ok: false, error: err?.message ?? 'src open failed' };
+  }
+  // Open dst defensively. If this throws (e.g. permissions, dst path in a
+  // dir we can't create, or a concurrent lock on dstPath), keep the
+  // "never throws" contract by returning the failure shape — otherwise the
+  // open exception would escape past `repairMemoryDbIfCorrupt` and block
+  // session start, which is the failure mode this whole module exists to
+  // prevent.
+  let dst;
+  try {
+    dst = new DatabaseSync(dstPath);
+  } catch (err) {
+    try { src.close(); } catch { /* */ }
+    return { ok: false, error: err?.message ?? 'dst open failed' };
+  }
+  const lossStats = {};
+  const CHUNK = 500;
+  try {
+    // Copy schema. Order matters: tables first (else indexes/triggers/views
+    // reference nonexistent tables), then everything else. sqlite_* objects
+    // (sqlite_sequence, sqlite_autoindex_*) are created implicitly by SQLite.
+    const schemaRows = src
+      .prepare(
+        "SELECT type, name, tbl_name, sql FROM sqlite_master " +
+          "WHERE sql IS NOT NULL ORDER BY CASE type " +
+          "WHEN 'table' THEN 1 WHEN 'index' THEN 2 WHEN 'view' THEN 3 ELSE 4 END",
+      )
+      .all();
+    for (const s of schemaRows) {
+      if (String(s.name).startsWith('sqlite_')) continue;
+      try { dst.exec(s.sql + ';'); } catch { /* malformed schema row — skip */ }
+    }
+    // Salvage rows table-by-table.
+    const tables = src
+      .prepare("SELECT name FROM sqlite_master WHERE type='table' AND name NOT LIKE 'sqlite_%'")
+      .all();
+    for (const t of tables) {
+      const name = String(t.name);
+      lossStats[name] = { read: 0, written: 0, errors: 0 };
+      const cols = src.prepare(`PRAGMA table_info('${name.replace(/'/g, "''")}')`).all();
+      if (cols.length === 0) continue;
+      const colList = cols.map((c) => '"' + String(c.name).replace(/"/g, '""') + '"').join(',');
+      const placeholders = cols.map(() => '?').join(',');
+      const insert = dst.prepare(
+        `INSERT OR IGNORE INTO "${name.replace(/"/g, '""')}" (${colList}) VALUES (${placeholders})`,
+      );
+      let lastRowid = 0;
+      let safetyCap = 0;
+      const MAX_ITERATIONS = 100_000;
+      while (safetyCap++ < MAX_ITERATIONS) {
+        let rows;
+        try {
+          rows = src
+            .prepare(
+              `SELECT rowid as __rid, * FROM "${name.replace(/"/g, '""')}" ` +
+                `WHERE rowid > ? ORDER BY rowid LIMIT ${CHUNK}`,
+            )
+            .all(lastRowid);
+        } catch {
+          lossStats[name].errors++;
+          lastRowid += CHUNK;
+          continue;
+        }
+        if (!rows || rows.length === 0) break;
+        lossStats[name].read += rows.length;
+        for (const r of rows) {
+          try {
+            insert.run(...cols.map((c) => r[c.name]));
+            lossStats[name].written++;
+          } catch {
+            lossStats[name].errors++;
+          }
+          lastRowid = Number(r.__rid);
+        }
+        if (rows.length < CHUNK) break;
+      }
+    }
+    // Verify the recovered file. If integrity_check still fails, the
+    // salvage didn't actually produce a clean file — surface as failure
+    // (caller will keep the corrupted original in place).
+    const checkRows = dst.prepare('PRAGMA integrity_check').all();
+    const recoveredOk =
+      checkRows.length === 1 &&
+      String(checkRows[0]?.integrity_check ?? '').toLowerCase() === 'ok';
+    if (!recoveredOk) {
+      return { ok: false, error: 'recovered file failed integrity_check', lossStats };
+    }
+    return { ok: true, lossStats };
+  } catch (err) {
+    return { ok: false, error: err?.message ?? 'salvage failed' };
+  } finally {
+    try { src.close(); } catch { /* */ }
+    try { dst.close(); } catch { /* */ }
+  }
+}
+/**
+ * Atomically swap a freshly recovered DB into the canonical path, keeping the
+ * corrupted original (+ its WAL/SHM sidecars if present) under `.corrupt.<TS>`
+ * suffixes for forensics. Caller must guarantee no live writer holds the
+ * canonical file open before invoking this — see `stopWritersBeforeRepair`
+ * for the daemon-coordinated entry point.
+ *
+ * @param {string} canonicalPath
+ * @param {string} recoveredPath
+ * @returns {{ ok: boolean, error?: string, corruptSuffix: string }}
+ */
+function atomicSwap(canonicalPath, recoveredPath) {
+  const ts = new Date().toISOString().replace(/[:.]/g, '-').replace(/Z$/, '');
+  const corruptSuffix = `.corrupt.${ts}`;
+  try {
+    if (existsSync(canonicalPath)) {
+      renameSync(canonicalPath, canonicalPath + corruptSuffix);
+    }
+    const walPath = canonicalPath + '-wal';
+    const shmPath = canonicalPath + '-shm';
+    if (existsSync(walPath)) {
+      try { renameSync(walPath, walPath + corruptSuffix); } catch { /* not always present */ }
+    }
+    if (existsSync(shmPath)) {
+      try { renameSync(shmPath, shmPath + corruptSuffix); } catch { /* not always present */ }
+    }
+    renameSync(recoveredPath, canonicalPath);
+    return { ok: true, corruptSuffix };
+  } catch (err) {
+    return { ok: false, error: err?.message ?? 'swap failed', corruptSuffix };
+  }
+}
+/**
+ * Probe the memory DB for corruption and run a tiered repair if found:
+ *
+ *  - Tier 1: `REINDEX` in place (index-only corruption — #743).
+ *  - Tier 2: `VACUUM INTO` fresh file + atomic swap (table b-tree corruption).
+ *  - Tier 3: row-level salvage + atomic swap (deep corruption with partial
+ *    row loss).
+ *
+ * Returns a structured result:
+ *  - `{ repaired: false, errors: 0 }` — healthy or absent.
+ *  - `{ repaired: true, errors: N, tier: 'reindex' }` — Tier 1 worked.
+ *  - `{ repaired: true, errors: N, tier: 'vacuum', corruptBackup }` — Tier 2.
+ *  - `{ repaired: true, errors: N, tier: 'salvage', corruptBackup, lossStats }`
+ *    — Tier 3 (partial row loss possible; see `lossStats`).
+ *  - `{ repaired: false, errors: N, persistent: true }` — nothing worked;
+ *    manual recovery needed.
  *
  * Never throws; any internal failure becomes `{ repaired: false, errors: 0 }`
  * so a probe failure cannot block session start.
+ *
+ * @param {string} projectRoot
+ * @returns {Promise<{
+ *   repaired: boolean,
+ *   errors: number,
+ *   tier?: 'reindex' | 'vacuum' | 'salvage',
+ *   persistent?: boolean,
+ *   corruptBackup?: string,
+ *   lossStats?: Record<string, { read: number, written: number, errors: number }>,
+ * }>}
  */
 export async function repairMemoryDbIfCorrupt(projectRoot) {
   const dbPath = memoryDbPath(projectRoot);
   if (!existsSync(dbPath)) return { repaired: false, errors: 0 };
-  let db = null;
-  try {
-    db = await openBackend(projectRoot, { create: false });
+  // Step 1 — defensive readonly probe (cannot throw on WAL-setup errors
+  // against corrupt files). If the open itself fails, fall through to the
+  // openBackend path which has retry semantics for transient lock issues;
+  // truly unopenable files surface as persistent below.
+  const probe = await probeIntegrityRaw(dbPath);
+  if (probe.ok) return { repaired: false, errors: 0 };
-    const before = db.exec('PRAGMA integrity_check');
-    if (isOk(before)) {
-      return { repaired: false, errors: 0 };
-    }
+  const errors = probe.errors;
-    const errors = corruptionCount(before);
-    db.run('REINDEX');
+  // Step 2 — Tier 1: REINDEX via the existing backend path. Fast for the
+  // common index-drift mode and preserves the file in place.
+  if (!probe.openFailed) {
+    try {
+      const db = await openBackend(projectRoot, { create: false });
+      try {
+        db.run('REINDEX');
+        const after = db.exec('PRAGMA integrity_check');
+        if (isOk(after)) {
+          db.save();
+          return { repaired: true, errors, tier: 'reindex' };
+        }
+      } finally {
+        try { db.close(); } catch { /* */ }
+      }
+    } catch {
+      // REINDEX path failed (often because openBackend's WAL pragmas throw
+      // on a corrupt file). Fall through to deeper recovery.
+    }
+  }
-    const after = db.exec('PRAGMA integrity_check');
-    if (!isOk(after)) {
-      return { repaired: false, errors, persistent: true };
+  // Step 3 — Tier 2: VACUUM INTO a fresh file.
+  const recoveredPath = dbPath + '.recovered';
+  const vacuum = await tryVacuumInto(dbPath, recoveredPath);
+  if (vacuum.ok) {
+    const recoveredProbe = await probeIntegrityRaw(recoveredPath);
+    if (recoveredProbe.ok) {
+      const swap = atomicSwap(dbPath, recoveredPath);
+      if (swap.ok) {
+        return {
+          repaired: true,
+          errors: errors || corruptionCount(recoveredProbe),
+          tier: 'vacuum',
+          corruptBackup: dbPath + swap.corruptSuffix,
+        };
+      }
     }
+    try { unlinkSync(recoveredPath); } catch { /* */ }
+  }
-    db.save();
-    return { repaired: true, errors };
-  } catch {
-    return { repaired: false, errors: 0 };
-  } finally {
-    if (db) try { db.close(); } catch { /* non-fatal */ }
+  // Step 4 — Tier 3: row-level salvage.
+  const salvage = await trySalvageRowByRow(dbPath, recoveredPath);
+  if (salvage.ok) {
+    const swap = atomicSwap(dbPath, recoveredPath);
+    if (swap.ok) {
+      return {
+        repaired: true,
+        errors,
+        tier: 'salvage',
+        corruptBackup: dbPath + swap.corruptSuffix,
+        lossStats: salvage.lossStats,
+      };
+    }
+    try { unlinkSync(recoveredPath); } catch { /* */ }
+  } else {
+    try { if (existsSync(recoveredPath)) unlinkSync(recoveredPath); } catch { /* */ }
   }
+  // Step 5 — give up.
+  return { repaired: false, errors, persistent: true };
 }
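
The control flow of the Tier-3 salvage loop in `trySalvageRowByRow` can be seen in isolation with an in-memory stand-in for the corrupt table. The sketch below assumes a hypothetical `readChunk(afterRowid, limit)` callback in place of the `SELECT ... WHERE rowid > ? ORDER BY rowid LIMIT n` query; it is not the package's code and does not touch SQLite.

```javascript
// Sketch of the chunked "rowid > ?" salvage loop. `readChunk` is a
// hypothetical callback standing in for the per-chunk SELECT.
function salvageInChunks(readChunk, chunkSize, maxIterations = 100000) {
  const recovered = [];
  let errors = 0;
  let lastRowid = 0;
  for (let i = 0; i < maxIterations; i++) {
    let rows;
    try {
      rows = readChunk(lastRowid, chunkSize);
    } catch {
      // Unreadable range: count the error and jump ahead by one chunk of rowids.
      errors++;
      lastRowid += chunkSize;
      continue;
    }
    if (rows.length === 0) break;
    for (const row of rows) {
      recovered.push(row);
      lastRowid = row.rowid;
    }
    if (rows.length < chunkSize) break; // short chunk means end of table
  }
  return { recovered, errors };
}

// Fake table with rowids 1..12. Reads that start inside [5, 10) throw,
// imitating a corrupt page range.
const table = Array.from({ length: 12 }, (_, i) => ({ rowid: i + 1 }));
function fakeReadChunk(afterRowid, limit) {
  if (afterRowid >= 5 && afterRowid < 10) {
    throw new Error('database disk image is malformed');
  }
  return table.filter((r) => r.rowid > afterRowid).slice(0, limit);
}

const result = salvageInChunks(fakeReadChunk, 5);
console.log(result.recovered.map((r) => r.rowid), result.errors);
```

With this fake table the loop recovers rowids 1 to 5 and 11 to 12 and records one error. Because the skip advances by a fixed rowid range, rows 6 to 10 are dropped even if some of them would have been readable individually; that is the trade-off of skipping by chunk rather than by row.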

package/bin/session-start-launcher.mjs CHANGED

@@ -268,15 +268,51 @@ try {
 try {
   const repair = await repairMemoryDbIfCorrupt(projectRoot);
   if (repair?.repaired) {
-    emitMutation(
-      'repaired memory db index',
-      `${plural(repair.errors, 'index error')} fixed via REINDEX`,
-    );
+    // Three recovery tiers, three messages. Tier surfaces what level of
+    // damage the DB had so the user (and any downstream telemetry) knows
+    // whether row data was lost. See bin/lib/db-repair.mjs for the cascade.
+    if (repair.tier === 'reindex') {
+      emitMutation(
+        'repaired memory db index',
+        `${plural(repair.errors, 'index error')} fixed via REINDEX`,
+      );
+    } else if (repair.tier === 'vacuum') {
+      emitMutation(
+        'rebuilt memory db',
+        `${plural(repair.errors, 'integrity violation')} fixed via VACUUM INTO; corrupt original kept at ${repair.corruptBackup ?? '.moflo/moflo.db.corrupt.*'}`,
+      );
+    } else if (repair.tier === 'salvage') {
+      // Row-level salvage may have dropped rows; summarise loss so the
+      // user sees what's gone before downstream consumers (indexer,
+      // embeddings) re-process the survivors.
+      let lossSummary = '';
+      if (repair.lossStats) {
+        const losses = Object.entries(repair.lossStats)
+          .map(([tbl, s]) => {
+            const lost = Math.max(0, s.read - s.written);
+            return lost > 0 ? `${tbl} ${s.written}/${s.read}` : null;
+          })
+          .filter(Boolean);
+        if (losses.length > 0) lossSummary = ` (rows preserved: ${losses.join(', ')})`;
+      }
+      emitMutation(
+        'salvaged memory db',
+        `${plural(repair.errors, 'integrity violation')} recovered via row-level salvage${lossSummary}; corrupt original kept at ${repair.corruptBackup ?? '.moflo/moflo.db.corrupt.*'}`,
+      );
+    } else {
+      // Older db-repair without a `tier` field — fall back to legacy text.
+      emitMutation(
+        'repaired memory db',
+        `${plural(repair.errors, 'integrity violation')} fixed`,
+      );
+    }
   } else if (repair?.persistent) {
     // Surface to stderr — Claude additionalContext + the user both see this.
-    // Manual `flo memory rebuild-index` is the next step.
+    // Every recovery tier exhausted; user options are destructive only.
     process.stderr.write(
-      `moflo: memory db has ${plural(repair.errors, 'index error')} REINDEX could not fix — run 'flo memory rebuild-index'\n`,
+      `moflo: memory db has ${plural(repair.errors, 'integrity violation')} ` +
+      `that REINDEX / VACUUM INTO / row-level salvage could not fix — ` +
+      `run 'flo memory rebuild-index' (destructive) or restore from backup\n`,
     );
   }
 } catch {
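
The loss summary built in the `salvage` branch above can be read as a standalone function. The sketch below mirrors that logic; `formatLossSummary` is a name introduced here for illustration, and the table names and counts in the sample are made up.

```javascript
// Mirrors the lossStats summary from the salvage branch: only tables that
// wrote fewer rows than they read are listed, as "<table> written/read".
function formatLossSummary(lossStats) {
  if (!lossStats) return '';
  const losses = Object.entries(lossStats)
    .map(([table, s]) => {
      const lost = Math.max(0, s.read - s.written);
      return lost > 0 ? `${table} ${s.written}/${s.read}` : null;
    })
    .filter(Boolean);
  return losses.length > 0 ? ` (rows preserved: ${losses.join(', ')})` : '';
}

// Made-up stats: one table lost 3 rows on insert, one lost nothing.
const exampleStats = {
  memory_entries: { read: 100, written: 97, errors: 3 },
  metadata: { read: 2, written: 2, errors: 0 },
};
console.log(formatLossSummary(exampleStats));
// prints " (rows preserved: memory_entries 97/100)"
```

One thing worth noting when reading the ratio: in `trySalvageRowByRow`, a chunk whose read fails increments `errors` but adds nothing to `read`, so rows inside skipped ranges do not appear in `written/read` at all. The summary therefore appears to reflect insert-time losses only, not rows lost to unreadable chunks.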

package/dist/src/cli/commands/doctor-checks-config.js CHANGED

@@ -8,6 +8,7 @@ import { join } from 'path';
 import os from 'os';
 import { getDaemonLockHolder } from '../services/daemon-lock.js';
 import { legacyMemoryDbPath, memoryDbCandidatePaths, memoryDbPath, } from '../services/moflo-paths.js';
+import { probeDbIntegrity } from '../services/memory-db-integrity-repair.js';
 import { errorDetail } from '../shared/utils/error-detail.js';
 export async function checkConfigFile() {
     // JSON configs (parse-validated). LEGACY-CONFIG: `.claude-flow.json` and
@@ -131,6 +132,65 @@ export async function checkMemoryDatabase() {
     }
     return { name: 'Memory Database', status: 'warn', message: 'Not initialized', fix: 'claude-flow memory configure --backend hybrid' };
 }
+/**
+ * Tier-1 corruption probe for `.moflo/moflo.db`. Runs `PRAGMA integrity_check`
+ * via a raw node:sqlite readonly handle — bypasses `openBackend` because that
+ * path sets WAL pragmas on open and those throw on deeply-corrupt files,
+ * masking the real failure as a generic "Check" error (doctor.ts:214).
+ *
+ * Owns the corruption signal so downstream checks (Embeddings, Semantic
+ * Quality, Memory Access Functional, etc.) don't end up doing it implicitly
+ * via their own swallow-all error paths. The companion fix in
+ * doctor-fixes.ts coordinates daemon stop + tiered repair via the JS-side
+ * `repairMemoryDbIfCorrupt` (bin/lib/db-repair.mjs).
+ *
+ * Status semantics:
+ *  - `pass` — DB absent OR `integrity_check` returns 'ok'.
+ *  - `fail` — corruption detected. `fix` field points at the healer's
+ *    auto-recovery path (which runs REINDEX → VACUUM INTO → row-level
+ *    salvage in order of escalation).
+ *  - `warn` — probe itself crashed (rare; surfaces the diagnostic rather
+ *    than masking it).
+ */
+export async function checkMemoryDbIntegrity(cwd = process.cwd()) {
+    const dbPath = memoryDbPath(cwd);
+    if (!existsSync(dbPath)) {
+        return { name: 'Memory DB Integrity', status: 'pass', message: 'DB absent (no integrity probe needed)' };
+    }
+    // Delegate to the single readonly-no-PRAGMAs probe in
+    // `bin/lib/db-repair.mjs` (via the TS service bridge). Avoids re-deriving
+    // the same DatabaseSync({ readOnly: true }) + integrity_check sequence in
+    // two places and keeps the "what counts as healthy" semantics in one file.
+    try {
+        const probe = await probeDbIntegrity(dbPath);
+        if (probe.ok) {
+            return { name: 'Memory DB Integrity', status: 'pass', message: 'PRAGMA integrity_check: ok' };
+        }
+        const message = probe.openFailed
+            ? 'Unable to probe DB (readonly open failed — likely deep corruption)'
+            : `${probe.errors} integrity violation(s) detected`;
+        return {
+            name: 'Memory DB Integrity',
+            status: 'fail',
+            message,
+            fix: 'flo healer --fix -c memory-db-integrity',
+        };
+    }
+    catch (e) {
+        // The probe itself maps "readonly open failed" to `openFailed: true`
+        // and we surface that as `fail` above. Reaching the catch means the
+        // probe *module* couldn't be loaded — `findMofloPackageRoot()` returned
+        // null (broken install / wrong cwd) or the dynamic import threw. Both
+        // are first-class diagnostic failures — a broken install must not be
+        // silently downgraded to `warn` and hidden from the healer summary.
+        return {
+            name: 'Memory DB Integrity',
+            status: 'fail',
+            message: `Integrity probe unavailable: ${errorDetail(e)}`,
+            fix: 'flo healer --fix -c memory-db-integrity',
+        };
+    }
+}
 /**
  * Standard MCP-config search paths: home (Claude Desktop on macOS/Linux),
  * XDG config dir, project-local `.mcp.json`, and APPDATA on Windows.
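
The mapping from probe result to check status in `checkMemoryDbIntegrity` can be summarised as a pure function. This is a simplified sketch: `toIntegrityCheckResult` is a hypothetical name, and the `openFailed` message is shortened from the one in the diff. As far as the shown code goes, every non-ok outcome (including the `catch` path) returns `fail`; the `warn` status listed in the JSDoc does not appear in any return statement in this hunk.

```javascript
// Simplified sketch of the probe -> check-result mapping shown in the diff.
// `probe` has the shape { ok, errors, openFailed? } returned by the probe.
function toIntegrityCheckResult(probe) {
  const name = 'Memory DB Integrity';
  if (probe.ok) {
    return { name, status: 'pass', message: 'PRAGMA integrity_check: ok' };
  }
  const message = probe.openFailed
    ? 'Unable to probe DB (readonly open failed)' // shortened wording
    : `${probe.errors} integrity violation(s) detected`;
  return {
    name,
    status: 'fail',
    message,
    fix: 'flo healer --fix -c memory-db-integrity',
  };
}

console.log(toIntegrityCheckResult({ ok: true, errors: 0 }).status);
console.log(toIntegrityCheckResult({ ok: false, errors: 3 }).message);
console.log(toIntegrityCheckResult({ ok: false, errors: 0, openFailed: true }).message);
```

Keeping the mapping free of I/O like this would make the status semantics easy to test without a corrupt database file on disk.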

package/dist/src/cli/commands/doctor-checks-memory-access.js CHANGED

@@ -144,7 +144,33 @@ async function runMemoryRoundTrip(ctx) {
     }
     else {
         const top = searchOut.results?.find(r => r.key === key);
-        pushDetail(ctx.details, { id: `${ctx.idPrefix}.search-finds-key`, mcpTool: 'memory_search', expected: `result containing key=${key}` }, top ? { topKey: top.key, similarity: top.similarity } : { allKeys: searchOut.results?.map(r => r.key) }, top ? null : `stored key ${key} not in results (got: ${searchOut?.results?.map(r => r.key).join(', ') ?? 'none'})`);
+        if (top) {
+            pushDetail(ctx.details, { id: `${ctx.idPrefix}.search-finds-key`, mcpTool: 'memory_search', expected: `result containing key=${key}` }, { topKey: top.key, similarity: top.similarity }, null);
+        }
+        else {
+            // #1120: search returned results but our just-stored key wasn't among
+            // them. Mirrors the #1111 empty-HNSW fallback for the non-zero case:
+            // if the row IS reachable by literal key, demote to warn — memory
+            // access works, the HNSW index just hasn't propagated the new write
+            // yet (stale-neighbor race when healer runs 2+ times in one session
+            // against accumulated probe rows). If literal retrieve also fails,
+            // surface the original fail unchanged.
+            const otherKeys = searchOut?.results?.map(r => r.key).join(', ') ?? 'none';
+            const retrievable = await literalKeyReachable(ctx.memoryTools, key, namespace);
+            if (retrievable) {
+                ctx.details.push({
+                    id: `${ctx.idPrefix}.search-finds-key`,
+                    mcpTool: 'memory_search',
+                    status: 'warn',
+                    observed: { topKeys: searchOut?.results?.map(r => r.key), retrievable: true },
+                    expected: `result containing key=${key}`,
+                    message: `search returned results but our key was not among them (got: ${otherKeys}); row IS reachable by literal retrieve — HNSW stale-neighbor race (newly-written row not yet propagated to the index). Memory access path works.`,
+                });
+            }
+            else {
+                pushDetail(ctx.details, { id: `${ctx.idPrefix}.search-finds-key`, mcpTool: 'memory_search', expected: `result containing key=${key}` }, { allKeys: searchOut.results?.map(r => r.key) }, `stored key ${key} not in results (got: ${otherKeys})`);
+            }
+        }
     }
     // 4. memory_retrieve returns the full value (search content is truncated
     // to a 60-char snippet). Catches write clobber and namespace bleed — we