npm - alvin-bot - Versions diffs - 4.14.0 → 4.14.2 - Mend

alvin-bot 4.14.0 → 4.14.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (7) hide show

package/CHANGELOG.md +77 -0
package/dist/handlers/commands.js +6 -3
package/dist/services/async-agent-watcher.js +35 -1
package/dist/services/subagents.js +59 -0
package/package.json +1 -1
package/test/list-subagents-merged.test.ts +172 -0
package/test/watcher-zombie-fix.test.ts +252 -0

package/CHANGELOG.md CHANGED Viewed

@@ -2,6 +2,83 @@
 All notable changes to Alvin Bot are documented here.
+## [4.14.2] — 2026-04-16
+### 🐛 Patch: watcher zombie-entry fix (missing outputFile > 10 min = failed)
+**Edge case Ali caught today:** a pending async-agent entry stuck in `/subagents list` for 3+ hours showing "running" — but the underlying `alvin_dispatch_agent` subprocess had already died (its output file was gone). The entry would have continued haunting the list until the 12-hour `giveUpAt` ceiling fired.
+**Root cause:** `async-agent-watcher`'s `pollOnce` handled four states from `parseOutputFileStatus` — `completed` / `failed` / `running` / `missing`. For `missing` (file doesn't exist or is empty), the watcher just kept polling forever, on the assumption that a slow subprocess might eventually write. If the subprocess crashed before writing ANY output, the file never appeared, and we polled for 12 hours before timing out.
+**Fix:** when `status.state === "missing"` AND `now - entry.startedAt > MISSING_FILE_FAILURE_MS` (default 10 min, configurable via `ALVIN_MISSING_FILE_FAILURE_MS` env var), deliver as failed with an explicit message:
+> *Dispatched subprocess never wrote its output file (N m after start). Likely crashed before initializing, or the file was removed externally.*
+10 minutes is well above any legitimate `claude -p` startup variance (normal first-write latency is seconds) and well below the 12-hour hard ceiling.
+### What's preserved (regression-guard tested)
+- Running agents (file has content but no `end_turn`/`result` yet) are untouched by this path — they still keep polling as before.
+- Completed agents (clean `end_turn` or `stream-json result` event) still deliver normally.
+- Explicit `failed` state from the parser (if ever used) still delivers error normally.
+- v4.12.4's "file is stale but has text → deliver partial" path takes precedence over the new zombie check (the file has content, so not "missing").
+- 12-hour `giveUpAt` hard ceiling still applies as the ultimate safety net.
+- Session's `pendingBackgroundCount` decrement fires on zombie failure, same as every other delivery path.
+### Testing
+- **Baseline**: 498 tests (v4.14.1)
+- **New**: `test/watcher-zombie-fix.test.ts` — 6 tests:
+  - Young missing file (<threshold) stays pending
+  - Old missing file (>threshold) delivers failed + removes from pending
+  - Default threshold is 10 min when env var unset
+  - Running file (has content) is unaffected by zombie check
+  - Completed file delivers as completed (regression guard)
+  - Session's `pendingBackgroundCount` decrements on zombie delivery
+- **Total**: 504 tests, all green, TSC clean
+### Files changed
+- **Modified**: `src/services/async-agent-watcher.ts` (new `getMissingFileFailureMs()` + zombie branch in `pollOnce`)
+- **NEW tests**: `test/watcher-zombie-fix.test.ts`
+- **Version**: `package.json` 4.14.1 → 4.14.2
+---
+## [4.14.1] — 2026-04-16
+### 🐛 Patch: `/subagents list` now shows v4.13+ dispatch agents too
+**Bug Ali caught:** typing `/subagents list` in Telegram while a `alvin_dispatch_agent` sub-agent was actively running returned "no agents running" — even though the user could see the agent finish and deliver a result shortly after. Cross-platform effect too: `/alvin` slash command on Slack had the same display gap.
+**Root cause:** two separate registries for sub-agents:
+- `src/services/subagents.ts` `activeAgents` Map — used since v4.0.0 for bot-level sub-agents (cron spawns, implicit Task tool children, `/sub-agents spawn` CLI)
+- `src/services/async-agent-watcher.ts` `pending` Map — used since v4.13 for detached `alvin_dispatch_agent` subprocesses
+`/subagents list` only read from the first map. The entire v4.13+ dispatch path was invisible in the listing.
+**Fix:** new `listActiveSubAgents()` helper in subagents.ts that merges both registries. Pending async-agent-watcher entries get synthesized into `SubAgentInfo` shape (status="running", source="cron", depth=0, platform preserved). The `/subagents list` handler and the default-render path both switch to the merged helper. The old `listSubAgents()` function stays pure (unchanged behavior) — cancel/result paths still use it because detached subprocess PIDs aren't tracked.
+### Technical details
+- `listActiveSubAgents()` is async (lazy dynamic import of the watcher module to keep subagents.ts load order clean) — existing `listSubAgents()` remains sync for the v4.0.0 consumers
+- Synthesis mapping: `PendingAsyncAgent.agentId → SubAgentInfo.id`, `description → name`, `startedAt → startedAt`, always `status="running"` (pending by definition), `source="cron"` (matches watcher's delivery banner), `depth=0`
+- Platform field preserved so the renderer can show cross-platform context if desired later
+### Testing
+- **Baseline**: 492 tests (v4.14.0)
+- **New**: `test/list-subagents-merged.test.ts` — 6 tests (empty state, single slack agent, multi-platform merge, timestamp preservation, source tag, listSubAgents purity guard)
+- **Total**: 498 tests, all green, TSC clean
+### Files changed
+- **Modified**: `src/services/subagents.ts` (new listActiveSubAgents helper), `src/handlers/commands.ts` (both /subagents list paths switch to merged view)
+- **NEW tests**: `test/list-subagents-merged.test.ts`
+- **Version**: `package.json` 4.14.0 → 4.14.1
+---
 ## [4.14.0] — 2026-04-16
 ### ✨ Sub-agent dispatch on Slack, Discord, WhatsApp (Telegram unchanged)

package/dist/handlers/commands.js CHANGED Viewed

@@ -1910,7 +1910,7 @@ export function registerCommands(bot) {
     // type both "/sub-agents" and "/subagents" — Telegram routes both to this.
     bot.command(["sub_agents", "subagents"], async (ctx) => {
         const lang = getSession(ctx.from.id).language;
-        const { listSubAgents, cancelSubAgent, getSubAgentResult, getMaxParallelAgents, getConfiguredMaxParallel, setMaxParallelAgents, findSubAgentByName, getVisibility, setVisibility, getQueueCap, setQueueCap, getDefaultTimeoutMs, setDefaultTimeoutMs, } = await import("../services/subagents.js");
+        const { listSubAgents, listActiveSubAgents, cancelSubAgent, getSubAgentResult, getMaxParallelAgents, getConfiguredMaxParallel, setMaxParallelAgents, findSubAgentByName, getVisibility, setVisibility, getQueueCap, setQueueCap, getDefaultTimeoutMs, setDefaultTimeoutMs, } = await import("../services/subagents.js");
         const arg = (ctx.match || "").trim();
         const tokens = arg.split(/\s+/).filter(Boolean);
         const sub = tokens[0]?.toLowerCase() || "";
@@ -2040,8 +2040,10 @@ export function registerCommands(bot) {
             return;
         }
         // /sub-agents list  — same rendering as the default, but forced
+        // v4.14.1 — uses listActiveSubAgents (merged view) so v4.13+
+        // alvin_dispatch_agent detached subprocesses also show up here.
         if (sub === "list") {
-            const agents = listSubAgents();
+            const agents = await listActiveSubAgents();
             if (agents.length === 0) {
                 await ctx.reply(t("bot.subagents.noneRunning", lang));
                 return;
@@ -2142,7 +2144,8 @@ export function registerCommands(bot) {
         const timeoutLabel = currentTimeout <= 0
             ? `⏱ Timeout: *∞ (unlimited)*`
             : `⏱ Timeout: *${Math.round(currentTimeout / 1000)}s*`;
-        const agents = listSubAgents();
+        // v4.14.1 — merged view incl. v4.13+ alvin_dispatch_agent agents.
+        const agents = await listActiveSubAgents();
         let body = "";
         if (agents.length === 0) {
             body = `\n${t("bot.subagents.noneRunning", lang)}`;

package/dist/services/async-agent-watcher.js CHANGED Viewed

@@ -33,6 +33,31 @@ const POLL_INTERVAL_MS = 15_000;
  *  a timeout banner. SEO audits historically take ~13 min, so 12h
  *  is absurdly generous and protects against state-file growth. */
 const MAX_AGENT_AGE_MS = 12 * 60 * 60 * 1000;
+/**
+ * v4.14.2 — When a dispatched subprocess never creates its outputFile
+ * (spawn failure, crash before first write, file deleted externally),
+ * `parseOutputFileStatus` returns "missing" on every poll. Pre-v4.14.2
+ * that meant waiting the full 12h MAX_AGENT_AGE_MS before delivering a
+ * timeout — a 12-hour zombie in `/subagents list`.
+ *
+ * This threshold caps how long we tolerate a missing file before
+ * declaring the agent failed. `claude -p` normally writes its first
+ * JSONL line within seconds of spawn; 10 minutes is way above any
+ * legitimate startup variance and well below the 12h ceiling.
+ *
+ * Configurable via the ALVIN_MISSING_FILE_FAILURE_MS env var. Tests
+ * use shorter values via the same hook. Only the getter is exposed
+ * so callers always see the current env value, not a stale constant.
+ */
+function getMissingFileFailureMs() {
+    const raw = process.env.ALVIN_MISSING_FILE_FAILURE_MS;
+    if (raw) {
+        const n = Number(raw);
+        if (Number.isFinite(n) && n > 0)
+            return n;
+    }
+    return 10 * 60 * 1000; // default 10 min
+}
 // ── Module state ──────────────────────────────────────────────────
 const pending = new Map();
 let pollTimer = null;
@@ -139,6 +164,7 @@ export function stopWatcher() {
 export async function pollOnce() {
     const now = Date.now();
     const toRemove = [];
+    const missingFileFailureMs = getMissingFileFailureMs();
     for (const entry of pending.values()) {
         entry.lastCheckedAt = now;
         // Timeout check first — if the agent is past its giveUpAt, give up
@@ -157,7 +183,15 @@ export async function pollOnce() {
             await deliverAsFailure(entry, "error", status.error);
             toRemove.push(entry.agentId);
         }
-        // running / missing → keep polling next cycle
+        else if (status.state === "missing" &&
+            now - entry.startedAt > missingFileFailureMs) {
+            // v4.14.2 — Zombie guard: the subprocess never created its
+            // output file within `missingFileFailureMs` (default 10 min).
+            // Declare failed instead of polling until the 12h giveUpAt.
+            await deliverAsFailure(entry, "error", `Dispatched subprocess never wrote its output file (${Math.round((now - entry.startedAt) / 60_000)}m after start). Likely crashed before initializing, or the file was removed externally.`);
+            toRemove.push(entry.agentId);
+        }
+        // running / missing-but-young → keep polling next cycle
     }
     if (toRemove.length > 0) {
         for (const id of toRemove)

package/dist/services/subagents.js CHANGED Viewed

@@ -576,10 +576,69 @@ export function spawnSubAgent(agentConfig) {
 }
 /**
  * List all agents (active + recent completed).
+ *
+ * This is the v4.0.0 API — shows only agents from the bot-level
+ * registry (activeAgents Map). Does NOT include v4.13+ detached
+ * `alvin_dispatch_agent` subprocesses which live in async-agent-
+ * watcher. For the merged view used by `/subagents list`, use
+ * `listActiveSubAgents()` instead.
  */
 export function listSubAgents() {
     return [...activeAgents.values()].map((a) => ({ ...a.info }));
 }
+/**
+ * v4.14.1 — Merged view of BOTH sub-agent registries:
+ *   1. Bot-level agents (subagents.ts activeAgents Map) — v4.0.0+
+ *      the /sub-agents spawn CLI, cron-spawned sub-agents, implicit
+ *      Task-tool children.
+ *   2. Detached `alvin_dispatch_agent` subprocesses (async-agent-
+ *      watcher pending Map) — v4.13+ the MCP-tool-dispatched
+ *      agents that survive parent aborts.
+ *
+ * The user doesn't care which registry an agent lives in — "is there
+ * anything running right now?" is the question `/subagents list`
+ * answers. This function unifies the view.
+ *
+ * Pending async agents are synthesized into SubAgentInfo shape:
+ *   - id: PendingAsyncAgent.agentId (alvin-prefixed hex)
+ *   - name: PendingAsyncAgent.description
+ *   - status: "running" (we wouldn't be pending otherwise)
+ *   - startedAt: PendingAsyncAgent.startedAt
+ *   - source: "cron" — matches the delivery banner's source tag
+ *   - depth: 0 — dispatch agents are always top-level (no nesting)
+ *   - platform: preserved from the pending entry
+ *   - parentChatId: from the pending entry
+ *
+ * Lazy import of the watcher keeps this function cheap for callers
+ * who only need the v4.0.0 view (importing the watcher pulls in its
+ * whole startup cost otherwise).
+ */
+export async function listActiveSubAgents() {
+    const botLevel = listSubAgents();
+    let pending = [];
+    try {
+        // Lazy dynamic import so this module doesn't depend on the watcher
+        // at load time (preserves test isolation + avoids a circular boot).
+        const watcher = await import("./async-agent-watcher.js");
+        if (typeof watcher.listPendingAgents === "function") {
+            const raw = watcher.listPendingAgents();
+            pending = raw.map((p) => ({
+                id: p.agentId,
+                name: p.description,
+                status: "running",
+                startedAt: p.startedAt,
+                source: "cron",
+                depth: 0,
+                platform: p.platform,
+                parentChatId: p.chatId,
+            }));
+        }
+    }
+    catch {
+        /* never break listing because of merge errors */
+    }
+    return [...botLevel, ...pending];
+}
 /**
  * Cancel a running sub-agent by ID.
  * Returns true if the agent was found and aborted.

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "alvin-bot",
-  "version": "4.14.0",
+  "version": "4.14.2",
   "description": "Alvin Bot \u2014 Your personal AI agent on Telegram, WhatsApp, Discord, Signal, and Web.",
   "type": "module",
   "main": "dist/index.js",

package/test/list-subagents-merged.test.ts ADDED Viewed

@@ -0,0 +1,172 @@
+/**
+ * v4.14.1 — `/subagents list` must show v4.13+ dispatch agents too.
+ *
+ * Root cause: `listSubAgents()` in subagents.ts only iterates the
+ * `activeAgents` Map (B1+B2 from v4.0.0). v4.13's `alvin_dispatch_agent`
+ * MCP tool writes into `async-agent-watcher.ts`'s `pending` Map instead.
+ * User-facing impact: "no subagents running" while the bot is visibly
+ * dispatching sub-agents.
+ *
+ * Fix strategy: a new `listActiveSubAgents()` helper that merges both
+ * registries into a unified SubAgentInfo-shaped list. The `/subagents
+ * list` handler uses this instead of the bare `listSubAgents()`.
+ * Cancel/result operations keep using the old registry — we can't
+ * cancel a detached `claude -p` subprocess anyway without knowing its
+ * PID, which isn't tracked.
+ */
+import { describe, it, expect, beforeEach, afterEach, vi } from "vitest";
+import fs from "fs";
+import os from "os";
+import { resolve } from "path";
+const TEST_DATA_DIR = resolve(
+  os.tmpdir(),
+  `alvin-list-merged-${process.pid}-${Date.now()}`,
+);
+beforeEach(async () => {
+  if (fs.existsSync(TEST_DATA_DIR)) {
+    fs.rmSync(TEST_DATA_DIR, { recursive: true, force: true });
+  }
+  fs.mkdirSync(TEST_DATA_DIR, { recursive: true });
+  process.env.ALVIN_DATA_DIR = TEST_DATA_DIR;
+  vi.resetModules();
+  vi.doMock("../src/services/subagent-delivery.js", () => ({
+    deliverSubAgentResult: async () => {},
+    attachBotApi: () => {},
+    __setBotApiForTest: () => {},
+  }));
+});
+afterEach(async () => {
+  try {
+    const mod = await import("../src/services/async-agent-watcher.js");
+    mod.stopWatcher();
+    mod.__resetForTest();
+  } catch {}
+});
+describe("listActiveSubAgents merged view (v4.14.1)", () => {
+  it("returns empty list when neither registry has agents", async () => {
+    const mod = await import("../src/services/subagents.js");
+    expect(await mod.listActiveSubAgents()).toEqual([]);
+  });
+  it("includes async-agent-watcher pending agents in the merged list", async () => {
+    const watcher = await import("../src/services/async-agent-watcher.js");
+    watcher.registerPendingAgent({
+      agentId: "alvin-abc123",
+      outputFile: `${TEST_DATA_DIR}/out.jsonl`,
+      description: "Research Higgsfield",
+      prompt: "...",
+      chatId: "C012SLACK",
+      userId: "U123",
+      toolUseId: null,
+      sessionKey: "slack:C012SLACK",
+      platform: "slack",
+    });
+    const mod = await import("../src/services/subagents.js");
+    const agents = await mod.listActiveSubAgents();
+    expect(agents).toHaveLength(1);
+    expect(agents[0].id).toBe("alvin-abc123");
+    expect(agents[0].name).toBe("Research Higgsfield");
+    expect(agents[0].status).toBe("running");
+    expect(agents[0].depth).toBe(0);
+    expect(agents[0].platform).toBe("slack");
+  });
+  it("merges multiple agents from both registries without dupes", async () => {
+    const watcher = await import("../src/services/async-agent-watcher.js");
+    watcher.registerPendingAgent({
+      agentId: "alvin-one",
+      outputFile: `${TEST_DATA_DIR}/a.jsonl`,
+      description: "Agent One",
+      prompt: "p",
+      chatId: 42,
+      userId: 42,
+      toolUseId: null,
+      sessionKey: "s",
+      platform: "telegram",
+    });
+    watcher.registerPendingAgent({
+      agentId: "alvin-two",
+      outputFile: `${TEST_DATA_DIR}/b.jsonl`,
+      description: "Agent Two",
+      prompt: "p",
+      chatId: 42,
+      userId: 42,
+      toolUseId: null,
+      sessionKey: "s",
+      platform: "telegram",
+    });
+    const mod = await import("../src/services/subagents.js");
+    const agents = await mod.listActiveSubAgents();
+    const ids = agents.map((a) => a.id).sort();
+    expect(ids).toEqual(["alvin-one", "alvin-two"]);
+  });
+  it("preserves startedAt timestamp for age rendering", async () => {
+    const fixedTs = Date.now() - 45_000; // 45 seconds ago
+    const watcher = await import("../src/services/async-agent-watcher.js");
+    watcher.registerPendingAgent({
+      agentId: "alvin-aged",
+      outputFile: `${TEST_DATA_DIR}/aged.jsonl`,
+      description: "Old agent",
+      prompt: "p",
+      chatId: 1,
+      userId: 1,
+      toolUseId: null,
+      sessionKey: "s",
+      platform: "slack",
+    });
+    const mod = await import("../src/services/subagents.js");
+    const agents = await mod.listActiveSubAgents();
+    expect(agents[0].startedAt).toBeGreaterThan(fixedTs - 1000);
+    expect(agents[0].startedAt).toBeLessThan(Date.now() + 1000);
+  });
+  it("tags async dispatch agents with source='cron' (matches v4.12 banner format)", async () => {
+    const watcher = await import("../src/services/async-agent-watcher.js");
+    watcher.registerPendingAgent({
+      agentId: "alvin-sourced",
+      outputFile: `${TEST_DATA_DIR}/s.jsonl`,
+      description: "sourced",
+      prompt: "p",
+      chatId: 1,
+      userId: 1,
+      toolUseId: null,
+      sessionKey: "s",
+      platform: "telegram",
+    });
+    const mod = await import("../src/services/subagents.js");
+    const agents = await mod.listActiveSubAgents();
+    // source='cron' = the ⏰ badge in /subagents list rendering. Matches
+    // the existing v4.12.x watcher delivery's SubAgentInfo.source value.
+    expect(agents[0].source).toBe("cron");
+  });
+  it("listSubAgents() (v4.0.0 API) is unchanged and doesn't include pending dispatches", async () => {
+    const watcher = await import("../src/services/async-agent-watcher.js");
+    watcher.registerPendingAgent({
+      agentId: "alvin-isolated",
+      outputFile: `${TEST_DATA_DIR}/iso.jsonl`,
+      description: "isolated",
+      prompt: "p",
+      chatId: 1,
+      userId: 1,
+      toolUseId: null,
+      sessionKey: "s",
+      platform: "telegram",
+    });
+    const mod = await import("../src/services/subagents.js");
+    // The original listSubAgents is kept pure — only the merged helper
+    // returns combined results. Cancel/result paths still use the
+    // bot-level registry.
+    expect(mod.listSubAgents()).toHaveLength(0);
+    expect(await mod.listActiveSubAgents()).toHaveLength(1);
+  });
+});

package/test/watcher-zombie-fix.test.ts ADDED Viewed

@@ -0,0 +1,252 @@
+/**
+ * v4.14.2 — zombie-entry fix for async-agent-watcher.
+ *
+ * Problem: when the dispatched `claude -p` subprocess never produces
+ * its outputFile (crashed before the first write, spawn failed, file
+ * got deleted externally), `parseOutputFileStatus` returns "missing"
+ * on every poll. The watcher keeps polling forever until `giveUpAt`
+ * (12 hours) fires, then delivers a timeout banner. Meanwhile the
+ * entry hangs in `/subagents list` as a permanent "running" zombie.
+ *
+ * Fix: when status is "missing" for longer than
+ * `MISSING_FILE_FAILURE_MS` (default 10 min, env-configurable), the
+ * watcher declares the agent failed with a clear "output file never
+ * appeared" reason, delivers the failure banner, and removes the
+ * entry. 10 minutes is well above normal startup variance (seconds)
+ * and well below the 12h hard ceiling.
+ *
+ * Invariants preserved:
+ *   - An agent whose output file DOES appear, even slowly, continues
+ *     normally (missing on first poll, running on second, completed
+ *     on third — same as v4.14.1).
+ *   - The `completed` path (end_turn or stream-json result) is
+ *     unchanged.
+ *   - The `failed` path (existing "error" state from parser) is
+ *     unchanged.
+ *   - The 12h giveUpAt ceiling still applies — it's now just less
+ *     likely to be hit because missing-file zombies resolve earlier.
+ */
+import { describe, it, expect, beforeEach, afterEach, vi } from "vitest";
+import fs from "fs";
+import os from "os";
+import { resolve } from "path";
+const TEST_DATA_DIR = resolve(
+  os.tmpdir(),
+  `alvin-zombie-${process.pid}-${Date.now()}`,
+);
+interface Delivered {
+  info: { name: string; status: string };
+  result: { status: string; output: string; error?: string };
+}
+let delivered: Delivered[] = [];
+beforeEach(async () => {
+  if (fs.existsSync(TEST_DATA_DIR)) {
+    fs.rmSync(TEST_DATA_DIR, { recursive: true, force: true });
+  }
+  fs.mkdirSync(TEST_DATA_DIR, { recursive: true });
+  process.env.ALVIN_DATA_DIR = TEST_DATA_DIR;
+  // Reset the env override between tests
+  delete process.env.ALVIN_MISSING_FILE_FAILURE_MS;
+  delivered = [];
+  vi.resetModules();
+  vi.doMock("../src/services/subagent-delivery.js", () => ({
+    deliverSubAgentResult: async (info: unknown, result: unknown) => {
+      delivered.push({
+        info: info as Delivered["info"],
+        result: result as Delivered["result"],
+      });
+    },
+    attachBotApi: () => {},
+    __setBotApiForTest: () => {},
+  }));
+});
+afterEach(async () => {
+  try {
+    const mod = await import("../src/services/async-agent-watcher.js");
+    mod.stopWatcher();
+    mod.__resetForTest();
+  } catch {}
+  delete process.env.ALVIN_MISSING_FILE_FAILURE_MS;
+});
+describe("watcher zombie fix (v4.14.2)", () => {
+  it("missing file younger than threshold stays pending (no premature fail)", async () => {
+    // Threshold = 10 min. Backdate only 2 min. Expect: still pending.
+    const mod = await import("../src/services/async-agent-watcher.js");
+    mod.registerPendingAgent({
+      agentId: "young-zombie",
+      outputFile: `${TEST_DATA_DIR}/nonexistent.jsonl`,
+      description: "young",
+      prompt: "p",
+      chatId: 1,
+      userId: 1,
+      toolUseId: null,
+    });
+    // Forcibly set startedAt to 2 min ago
+    const pending = mod.listPendingAgents();
+    expect(pending).toHaveLength(1);
+    (pending[0] as { startedAt: number }).startedAt = Date.now() - 2 * 60_000;
+    await mod.pollOnce();
+    expect(delivered).toHaveLength(0);
+    expect(mod.listPendingAgents()).toHaveLength(1);
+  });
+  it("missing file older than threshold delivers failed + removes from pending", async () => {
+    process.env.ALVIN_MISSING_FILE_FAILURE_MS = "120000"; // 2 min for test
+    const mod = await import("../src/services/async-agent-watcher.js");
+    mod.registerPendingAgent({
+      agentId: "old-zombie",
+      outputFile: `${TEST_DATA_DIR}/never-appears.jsonl`,
+      description: "stuck crash zombie",
+      prompt: "p",
+      chatId: 1,
+      userId: 1,
+      toolUseId: null,
+    });
+    // Backdate 5 min (> 2 min threshold)
+    const pending = mod.listPendingAgents();
+    (pending[0] as { startedAt: number }).startedAt = Date.now() - 5 * 60_000;
+    await mod.pollOnce();
+    expect(delivered).toHaveLength(1);
+    expect(delivered[0].result.status).toBe("error");
+    // Error message should be explicit so user understands
+    expect(delivered[0].result.error).toMatch(/output file|never appeared|never wrote/i);
+    expect(mod.listPendingAgents()).toHaveLength(0);
+  });
+  it("default threshold is 10 min when env var is not set", async () => {
+    const mod = await import("../src/services/async-agent-watcher.js");
+    mod.registerPendingAgent({
+      agentId: "at-default",
+      outputFile: `${TEST_DATA_DIR}/z.jsonl`,
+      description: "default threshold",
+      prompt: "p",
+      chatId: 1,
+      userId: 1,
+      toolUseId: null,
+    });
+    // Backdate 9 min — still under the 10-min default, should stay pending
+    let p = mod.listPendingAgents();
+    (p[0] as { startedAt: number }).startedAt = Date.now() - 9 * 60_000;
+    await mod.pollOnce();
+    expect(delivered).toHaveLength(0);
+    expect(mod.listPendingAgents()).toHaveLength(1);
+    // Backdate to 11 min — over threshold, should fire
+    p = mod.listPendingAgents();
+    (p[0] as { startedAt: number }).startedAt = Date.now() - 11 * 60_000;
+    await mod.pollOnce();
+    expect(delivered).toHaveLength(1);
+  });
+  it("running file (has content, no end_turn) is unaffected by zombie check", async () => {
+    // A file WITH content should never trigger the missing-file path
+    // regardless of age.
+    const outPath = `${TEST_DATA_DIR}/running.jsonl`;
+    fs.writeFileSync(
+      outPath,
+      JSON.stringify({
+        type: "assistant",
+        isSidechain: true,
+        agentId: "x",
+        message: {
+          role: "assistant",
+          content: [{ type: "tool_use", name: "Bash", input: {} }],
+          stop_reason: "tool_use",
+        },
+      }) + "\n",
+      "utf-8",
+    );
+    const mod = await import("../src/services/async-agent-watcher.js");
+    mod.registerPendingAgent({
+      agentId: "active-work",
+      outputFile: outPath,
+      description: "legitimately running",
+      prompt: "p",
+      chatId: 1,
+      userId: 1,
+      toolUseId: null,
+    });
+    const p = mod.listPendingAgents();
+    (p[0] as { startedAt: number }).startedAt = Date.now() - 30 * 60_000; // 30 min old
+    await mod.pollOnce();
+    // v4.12.4 staleness detection COULD fire here because the file has
+    // text content and is stale. That's a different (benign) path — the
+    // agent gets delivered as "completed with partial output". Either
+    // way, the zombie-fix error path must NOT fire.
+    const anyZombieError = delivered.some(
+      (d) => d.result.error && /output file never/i.test(d.result.error),
+    );
+    expect(anyZombieError).toBe(false);
+  });
+  it("completed file delivers as completed (unchanged)", async () => {
+    const outPath = `${TEST_DATA_DIR}/done.jsonl`;
+    fs.writeFileSync(
+      outPath,
+      JSON.stringify({
+        type: "assistant",
+        agentId: "x",
+        message: {
+          content: [{ type: "text", text: "all good" }],
+          stop_reason: "end_turn",
+        },
+      }) + "\n",
+      "utf-8",
+    );
+    const mod = await import("../src/services/async-agent-watcher.js");
+    mod.registerPendingAgent({
+      agentId: "done-agent",
+      outputFile: outPath,
+      description: "clean completion",
+      prompt: "p",
+      chatId: 1,
+      userId: 1,
+      toolUseId: null,
+    });
+    // Backdate 1h — would trigger zombie if misapplied
+    const p = mod.listPendingAgents();
+    (p[0] as { startedAt: number }).startedAt = Date.now() - 60 * 60_000;
+    await mod.pollOnce();
+    expect(delivered).toHaveLength(1);
+    expect(delivered[0].result.status).toBe("completed");
+  });
+  it("decrements session counter on zombie failure delivery", async () => {
+    process.env.ALVIN_MISSING_FILE_FAILURE_MS = "1000"; // 1 sec for fast test
+    const sessionMod = await import("../src/services/session.js");
+    const session = sessionMod.getSession("zombie-session");
+    session.pendingBackgroundCount = 1;
+    const mod = await import("../src/services/async-agent-watcher.js");
+    mod.registerPendingAgent({
+      agentId: "session-zombie",
+      outputFile: `${TEST_DATA_DIR}/gone.jsonl`,
+      description: "zombie for counter",
+      prompt: "p",
+      chatId: 1,
+      userId: 1,
+      toolUseId: null,
+      sessionKey: "zombie-session",
+    });
+    const p = mod.listPendingAgents();
+    (p[0] as { startedAt: number }).startedAt = Date.now() - 5000; // 5 sec ago, > 1sec threshold
+    await mod.pollOnce();
+    expect(delivered).toHaveLength(1);
+    expect(session.pendingBackgroundCount).toBe(0);
+  });
+});