npm - alvin-bot - Versions diffs - 4.9.4 → 4.10.0 - Mend

alvin-bot 4.9.4 → 4.10.0

Files changed (14)

package/CHANGELOG.md +65 -0
package/dist/handlers/async-agent-chunk-handler.js +33 -0
package/dist/handlers/message.js +34 -6
package/dist/index.js +7 -0
package/dist/paths.js +4 -0
package/dist/providers/claude-sdk-provider.js +43 -0
package/dist/services/async-agent-parser.js +152 -0
package/dist/services/async-agent-watcher.js +206 -0
package/dist/services/personality.js +55 -0
package/package.json +1 -1
package/test/async-agent-chunk-flow.test.ts +131 -0
package/test/async-agent-parser.test.ts +322 -0
package/test/async-agent-watcher.test.ts +229 -0
package/test/system-prompt-background-hint.test.ts +48 -0

package/CHANGELOG.md CHANGED

@@ -2,6 +2,71 @@
 All notable changes to Alvin Bot are documented here.
+## [4.10.0] — 2026-04-13
+### 🚀 Async sub-agents — main session no longer blocks during long tasks
+The big architecture upgrade: Claude can now delegate long-running work (SEO audits, multi-page research, full-repo analyses) to **background** sub-agents. The main Telegram session ends quickly, the user can keep chatting, and the sub-agent's final report arrives as a separate message when ready.
+A colleague flagged the underlying problem on 2026-04-13 via WhatsApp voice note: *"It's weird that the main routine crashes when the sub-agents are still running. It should just run in the background, and that should have zero impact on the main routine."* He was right. OpenClaw had this years ago because back then the SDK didn't support async; today's `@anthropic-ai/claude-agent-sdk@0.2.97` already ships `run_in_background: true` on the Agent tool — Alvin just wasn't using it.
+This release closes that gap in two complementary stages, both bundled into the same v4.10.0:
+#### Stage 1 — System prompt teaches Claude when to use `run_in_background`
+- New `BACKGROUND_SUBAGENT_HINT` constant in `src/services/personality.ts`, injected only into SDK sessions (non-SDK providers don't have an Agent tool).
+- The hint tells Claude: for audits / multi-page research / >2 min tasks → ALWAYS set `run_in_background: true`. After launching, end the turn promptly. The bot delivers the result automatically when done.
+- Net effect: Claude's main turn ends in ~5 s instead of 10+ minutes. `session.isProcessing` flips to `false` quickly so the user can keep chatting.
+#### Stage 2 — Async-agent watcher polls and delivers
+The hard part. Three new pure modules + one new wired-up service:
+- **`src/services/async-agent-parser.ts`** (NEW, pure) — two helpers:
+  - `parseAsyncLaunchedToolResult(text)` extracts `agentId` + `output_file` from the SDK's plain-text `Async agent launched successfully…` tool-result. **Important**: the `.d.ts` type in the SDK package claims this is a JSON object with `outputFile: string`. The runtime actually emits plain text with `output_file` (snake_case). Captured live via probe — see the parser test fixtures.
+  - `parseOutputFileStatus(path)` tail-reads (64 KB) the JSONL `output_file` and detects completion by finding the most-recent `assistant` message with `stop_reason: "end_turn"`. Concatenates `content[].text` blocks for the final answer. Token usage extracted from the `usage` field. Survives partial last lines, garbage lines, and tail-cuts on huge files. **19 unit tests** including a 200 KB tail-test.
+- **`src/services/async-agent-watcher.ts`** (NEW) — the polling service. `Map<agentId, PendingAsyncAgent>` in memory, persisted to `~/.alvin-bot/state/async-agents.json` for restart catch-up (same pattern as v4.9.0 cron scheduler). Public API: `startWatcher` / `stopWatcher` / `registerPendingAgent` / `pollOnce` / `listPendingAgents`. Polls every 15 s, gives up after 12 h per-agent (timeout banner). On completion → builds a `SubAgentInfo + SubAgentResult` and hands off to the existing `subagent-delivery.ts` from v4.9.x. **7 integration tests** including bot-restart catch-up.
+- **`src/handlers/async-agent-chunk-handler.ts`** (NEW) — bridge between provider stream chunks and the watcher. Inspects `tool_result` chunks for the async_launched payload, extracts the `description` from the immediately preceding `tool_use` chunk, registers with the watcher. **4 unit tests**.
+- **`src/providers/claude-sdk-provider.ts`** — extended to surface `tool_result` blocks from SDK `user` messages as a new `tool_result` chunk type. Previously the provider only emitted `text` and `tool_use` chunks.
+- **`src/providers/types.ts`** — `StreamChunk` gets two new optional fields: `toolUseId` and `toolResultContent`.
+- **`src/handlers/message.ts`** — captures `lastAgentToolUseInput` from each `tool_use` chunk and consumes it on the immediately-following `tool_result` chunk. Tool-name match also extended from `"Task"` → `"Task" | "Agent"` (the SDK renamed it in v2.1.63).
+- **`src/index.ts`** — `startAsyncAgentWatcher()` after the cron scheduler, `stopAsyncAgentWatcher()` in the shutdown handler.
+- **`src/paths.ts`** — new `ASYNC_AGENTS_STATE_FILE` constant under `~/.alvin-bot/state/`.
+#### Investigation artifacts (gitignored, maintainer-local)
+- `docs/superpowers/plans/2026-04-13-async-subagents.md` — full TDD plan
+- `docs/superpowers/specs/sdk-async-agent-outputfile-format.md` — live-captured SDK format spec; documents the `.d.ts` mismatch that ate ~30 minutes of debugging time
+#### Testing
+**237 tests total** (201 baseline + 36 new). All green. TSC clean.
+- 6 system-prompt-hint tests (Stage 1)
+- 19 parser tests (8 plain-text format + 11 JSONL format including 200 KB tail-test)
+- 7 watcher integration tests (register, deliver, persistence, restart catch-up, timeout, concurrent agents)
+- 4 chunk-handler unit tests
+Live-verified via isolated SDK probe (`node sdk-probe.mjs` inside the repo) which confirmed the real `output_file` path and JSONL format match the parser's expectations.
+#### What you'll see as a user
+Send: *"Make a SEO audit of gethomes.io and alev-b.com in parallel"*
+- **0 s** — Claude responds: *"Starting both audits in the background — I'll send the reports when done."* Main session **unlocks**.
+- **1–10 min later** — You can chat about anything else. The bot answers immediately.
+- **~13 min** (when each agent finishes) — Two separate banner messages arrive: *"✅ SEO audit gethomes.io completed · 13m 17s · 2.6M in / 28k out"* + the full report body, delivered via the v4.9.3 Markdown→plain-text fallback path.
+#### Non-goals
+- No session-mutex refactor (Stage 3 from the analysis, out of scope here)
+- No replacement for Alvin's existing cron `spawnSubAgent` system (different use case)
+- No SDK upgrade beyond `0.2.97`
+#### Compatibility
+- `CLAUDE_CODE_DISABLE_BACKGROUND_TASKS=1` in `.env` disables background mode at the SDK level → Stage 1 hint becomes inert, watcher idles; foreground behavior is restored
 ## [4.9.4] — 2026-04-13
 ### 🔌 Web UI fully decoupled from main bot — port conflicts no longer crash anything
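
The launch-text parsing described under Stage 2 can be illustrated with a minimal sketch. The two regular expressions are the ones used by `parseAsyncLaunchedToolResult` in this release; everything else is an assumption: `parseLaunchText` is a hypothetical name, and the sample text is a made-up fixture shaped only to satisfy the regexes, not captured SDK output.

```javascript
// Hypothetical stand-in for parseAsyncLaunchedToolResult (string input only).
function parseLaunchText(text) {
  if (typeof text !== "string") return null;
  // Cheap gate before running regexes on unrelated tool results.
  if (!text.includes("Async agent launched successfully")) return null;
  // agentId: first run of non-whitespace characters after the label.
  const agentMatch = text.match(/agentId:\s*(\S+)/);
  // output_file: rest of the line, so paths containing spaces survive.
  const fileMatch = text.match(/output_file:\s*(.+?)\s*(?:\n|$)/);
  if (!agentMatch || !fileMatch) return null;
  return { agentId: agentMatch[1], outputFile: fileMatch[1].trim() };
}

// Made-up fixture: NOT real SDK output, only shaped to match the regexes.
const sample = [
  "Async agent launched successfully.",
  "agentId: a1b2c3d4 (example annotation)",
  "output_file: /tmp/example dir/agent-a1b2c3d4.jsonl",
].join("\n");

const launched = parseLaunchText(sample);
console.log(launched);
// A JSON-shaped payload (what the .d.ts suggests) does not pass the gate.
console.log(parseLaunchText('{"outputFile": "/tmp/x.jsonl"}'));
```

Returning `null` for anything that does not look like a launch message lets the caller run this on every `tool_result` without special-casing other tools.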

package/dist/handlers/async-agent-chunk-handler.js ADDED

@@ -0,0 +1,33 @@
+import { parseAsyncLaunchedToolResult } from "../services/async-agent-parser.js";
+import { registerPendingAgent } from "../services/async-agent-watcher.js";
+/**
+ * Inspect a stream chunk; if it's an Agent async_launched tool_result,
+ * register the pending agent with the watcher.
+ *
+ * Safe to call on any chunk type — non-tool_result chunks are ignored.
+ */
+export function handleToolResultChunk(chunk, ctx) {
+    if (chunk.type !== "tool_result")
+        return;
+    if (!chunk.toolResultContent)
+        return;
+    const info = parseAsyncLaunchedToolResult(chunk.toolResultContent);
+    if (!info)
+        return;
+    // The description and prompt come from the original tool_use input,
+    // not the tool_result text. If we don't have them (e.g. test setup
+    // forgot to pass lastToolUseInput), fall back to a generic label so
+    // the user still sees something meaningful in the delivery banner.
+    const description = ctx.lastToolUseInput?.description?.trim() ||
+        `Background agent ${info.agentId.slice(0, 8)}`;
+    const prompt = ctx.lastToolUseInput?.prompt?.trim() || "";
+    registerPendingAgent({
+        agentId: info.agentId,
+        outputFile: info.outputFile,
+        description,
+        prompt,
+        chatId: ctx.chatId,
+        userId: ctx.userId,
+        toolUseId: chunk.toolUseId ?? null,
+    });
+}
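
The label fallback in `handleToolResultChunk` can be tried in isolation. The sketch below copies the `?.trim() || fallback` expression into a hypothetical helper named `resolveDescription`; the agent id is a made-up value.

```javascript
// Hypothetical helper mirroring the description fallback in the handler.
function resolveDescription(lastToolUseInput, agentId) {
  // An empty or whitespace-only description trims to "", which is falsy,
  // so `||` picks the generic label built from the first 8 id characters.
  return (
    lastToolUseInput?.description?.trim() ||
    `Background agent ${agentId.slice(0, 8)}`
  );
}

const exampleId = "abcdef1234567890"; // made-up agent id
console.log(resolveDescription({ description: "  SEO audit  " }, exampleId));
console.log(resolveDescription({ description: "   " }, exampleId));
console.log(resolveDescription(undefined, exampleId));
```

Optional chaining only guards against `null` and `undefined`. If a non-string `description` ever reached this expression, the `.trim()` call would throw, so a caller that cannot guarantee a string may want a `typeof` check first.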

package/dist/handlers/message.js CHANGED

@@ -15,6 +15,7 @@ import { trackUsage } from "../services/usage-tracker.js";
 import { emitUserMessage as broadcastUserMessage, emitResponseStart as broadcastResponseStart, emitResponseDelta as broadcastResponseDelta, emitResponseDone as broadcastResponseDone, } from "../services/broadcast.js";
 import { t } from "../i18n.js";
 import { isHarmlessTelegramError } from "../util/telegram-error-filter.js";
+import { handleToolResultChunk } from "./async-agent-chunk-handler.js";
 /**
  * Stuck-only timeout — NO absolute cap.
  *
@@ -279,6 +280,11 @@ export async function handleMessage(ctx) {
         };
         // Stream response from provider (with fallback)
         let lastBroadcastLen = 0;
+        // Captured during tool_use chunks; consumed by tool_result chunks so
+        // the async-agent watcher can label pending agents with their human-
+        // readable description (which only appears in the tool_use input,
+        // not in the tool_result text). See Fix #17 Stage 2.
+        let lastAgentToolUseInput;
         for await (const chunk of registry.queryWithFallback(queryOpts)) {
             // Any chunk is progress — reset the stuck timer.
             resetStuckTimer();
@@ -309,13 +315,14 @@ export async function handleMessage(ctx) {
                     if (chunk.toolName) {
                         session.toolUseCount++;
                         const icon = TOOL_ICONS[chunk.toolName] || "🔧";
-                        // Special treatment for Claude's SDK-internal Task tool:
+                        // Special treatment for Claude's SDK-internal Task/Agent tool:
                         // track how many sub-tasks Claude delegated and surface the
                         // task description in the status line so the user sees WHAT
-                        // is being delegated, not just "Task…".
-                        if (chunk.toolName === "Task") {
+                        // is being delegated, not just "Task…". The tool was renamed
+                        // from "Task" to "Agent" in Claude Code v2.1.63 — match both.
+                        if (chunk.toolName === "Task" || chunk.toolName === "Agent") {
                             session.sdkSubTaskCount++;
-                            let label = "Task";
+                            let label = chunk.toolName;
                             if (chunk.toolInput) {
                                 try {
                                     const parsed = JSON.parse(chunk.toolInput);
@@ -324,11 +331,18 @@ export async function handleMessage(ctx) {
                                         const desc = parsed.description.length > 80
                                             ? parsed.description.slice(0, 80) + "…"
                                             : parsed.description;
-                                        label = `Task: ${desc}`;
+                                        label = `${chunk.toolName}: ${desc}`;
                                     }
                                     else if (parsed.subagent_type) {
-                                        label = `Task (${parsed.subagent_type})`;
+                                        label = `${chunk.toolName} (${parsed.subagent_type})`;
                                     }
+                                    // Capture the description+prompt for the upcoming
+                                    // tool_result. Used by Fix #17 Stage 2 to label
+                                    // background agents in the watcher's delivery banner.
+                                    lastAgentToolUseInput = {
+                                        description: parsed.description,
+                                        prompt: parsed.prompt,
+                                    };
                                 }
                                 catch {
                                     // not JSON — keep generic label
@@ -341,6 +355,20 @@ export async function handleMessage(ctx) {
                         }
                     }
                     break;
+                case "tool_result":
+                    // Fix #17 Stage 2: detect Agent async_launched payloads and
+                    // hand them off to the async-agent watcher. The watcher will
+                    // poll the outputFile and deliver the result as a separate
+                    // Telegram message when the background agent finishes.
+                    handleToolResultChunk(chunk, {
+                        chatId: ctx.chat.id,
+                        userId,
+                        lastToolUseInput: lastAgentToolUseInput,
+                    });
+                    // Reset the captured input — only the immediately following
+                    // tool_result should consume it.
+                    lastAgentToolUseInput = undefined;
+                    break;
                 case "done":
                     if (chunk.sessionId)
                         session.sessionId = chunk.sessionId;
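
The capture-then-consume pairing in this hunk keeps a single slot, so its behavior depends on chunk order. The sketch below models it with made-up chunks and compares it to a hypothetical alternative keyed by tool-use id. Two assumptions: that two `tool_use` chunks can arrive before either `tool_result` (this diff does not show whether the SDK does that), and that `tool_use` chunks carry a `toolUseId` (only `tool_result` chunks are shown setting it).

```javascript
// Models the single-slot logic from the hunk above.
function pairSingleSlot(chunks) {
  let last; // plays the role of lastAgentToolUseInput
  const paired = [];
  for (const chunk of chunks) {
    if (chunk.type === "tool_use") {
      last = { description: chunk.description };
    } else if (chunk.type === "tool_result") {
      paired.push({ id: chunk.toolUseId, description: last?.description });
      last = undefined; // only the next tool_result may consume it
    }
  }
  return paired;
}

// Hypothetical alternative, not part of the release: key inputs by id.
// ASSUMPTION: tool_use chunks expose toolUseId.
function pairById(chunks) {
  const inputs = new Map();
  const paired = [];
  for (const chunk of chunks) {
    if (chunk.type === "tool_use") {
      inputs.set(chunk.toolUseId, { description: chunk.description });
    } else if (chunk.type === "tool_result") {
      const input = inputs.get(chunk.toolUseId);
      paired.push({ id: chunk.toolUseId, description: input?.description });
      inputs.delete(chunk.toolUseId);
    }
  }
  return paired;
}

// Made-up order: two launches before either result arrives.
const chunks = [
  { type: "tool_use", toolUseId: "t1", description: "Audit site A" },
  { type: "tool_use", toolUseId: "t2", description: "Audit site B" },
  { type: "tool_result", toolUseId: "t1" },
  { type: "tool_result", toolUseId: "t2" },
];
console.log(pairSingleSlot(chunks));
console.log(pairById(chunks));
```

Under that ordering the single slot hands the second description to the first result and leaves the second result with none, which in the release would fall back to the generic `Background agent …` label. With strictly alternating `tool_use` and `tool_result` chunks both approaches give the same pairing.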

package/dist/index.js CHANGED

@@ -78,6 +78,7 @@ import { loadPlugins, registerPluginCommands, unloadPlugins } from "./services/p
 import { initMCP, disconnectMCP, hasMCPConfig } from "./services/mcp.js";
 import { startWebServer, stopWebServer } from "./web/server.js";
 import { startScheduler, stopScheduler, setNotifyCallback } from "./services/cron.js";
+import { startWatcher as startAsyncAgentWatcher, stopWatcher as stopAsyncAgentWatcher } from "./services/async-agent-watcher.js";
 import { startSessionCleanup, stopSessionCleanup } from "./services/session.js";
 import { processQueue, cleanupQueue, setSenders, enqueue } from "./services/delivery-queue.js";
 import { discoverTools } from "./services/tool-discovery.js";
@@ -254,6 +255,7 @@ const shutdown = async () => {
     await cancelAllSubAgents(true);
     stopWatchdog();
     stopScheduler();
+    stopAsyncAgentWatcher();
     stopSessionCleanup();
     if (queueInterval)
         clearInterval(queueInterval);
@@ -420,6 +422,11 @@ setNotifyCallback(async (target, text) => {
     enqueue(target.platform, String(target.chatId), text);
 });
 startScheduler();
+// Start the async-agent watcher (Fix #17 Stage 2). Polls outputFiles
+// of background sub-agents Claude launched with run_in_background and
+// delivers their completed reports as separate Telegram messages.
+// Loads any persisted pending agents from disk on boot.
+startAsyncAgentWatcher();
 // Session memory hygiene: purge sessions idle > 7 days (configurable via
 // ALVIN_SESSION_TTL_DAYS). Never touches active sessions — see session.ts.
 startSessionCleanup();

package/dist/paths.js CHANGED

@@ -62,6 +62,10 @@ export const SUDO_ENC_FILE = resolve(DATA_DIR, "data", ".sudo-enc");
 export const SUDO_KEY_FILE = resolve(DATA_DIR, "data", ".sudo-key");
 /** backups/ — Config snapshots */
 export const BACKUP_DIR = resolve(DATA_DIR, "backups");
+/** state/async-agents.json — Pending background SDK agents (Fix #17 Stage 2).
+ *  See src/services/async-agent-watcher.ts for the watcher that polls and
+ *  delivers these. Survives bot restarts. */
+export const ASYNC_AGENTS_STATE_FILE = resolve(DATA_DIR, "state", "async-agents.json");
 /** soul.md — Bot personality */
 export const SOUL_FILE = resolve(DATA_DIR, "soul.md");
 /** tools.md — Custom tool definitions (Markdown) */

package/dist/providers/claude-sdk-provider.js CHANGED

@@ -186,6 +186,49 @@ export class ClaudeSDKProvider {
                         }
                     }
                 }
+                // User message — tool_results from the Claude API arrive as user
+                // messages in the SDK protocol. We surface tool_result blocks as
+                // chunks so the message handler can detect Agent async_launched
+                // payloads and register them with the watcher (Fix #17 Stage 2).
+                if (message.type === "user") {
+                    // eslint-disable-next-line @typescript-eslint/no-explicit-any
+                    const userMsg = message;
+                    const content = userMsg.message?.content;
+                    if (Array.isArray(content)) {
+                        for (const block of content) {
+                            if (block &&
+                                typeof block === "object" &&
+                                block.type === "tool_result" &&
+                                typeof block.tool_use_id === "string") {
+                                // The `content` field on a tool_result block can be a
+                                // plain string OR an array of content blocks. Normalize
+                                // to a single string so the chunk consumer doesn't need
+                                // to know about the SDK shape.
+                                let contentText = "";
+                                if (typeof block.content === "string") {
+                                    contentText = block.content;
+                                }
+                                else if (Array.isArray(block.content)) {
+                                    contentText = block.content
+                                        .map((c) => {
+                                        if (c && typeof c === "object" && "text" in c) {
+                                            const t = c.text;
+                                            return typeof t === "string" ? t : "";
+                                        }
+                                        return "";
+                                    })
+                                        .join("");
+                                }
+                                yield {
+                                    type: "tool_result",
+                                    toolUseId: block.tool_use_id,
+                                    toolResultContent: contentText,
+                                    sessionId: capturedSessionId,
+                                };
+                            }
+                        }
+                    }
+                }
                 // Result — done (extract full usage including cache tokens)
                 if (message.type === "result") {
                     const resultMsg = message;
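
The content normalization in this hunk (string or array of blocks in, single string out) can be sketched as a standalone function. `normalizeToolResultContent` is a hypothetical name, and the sample blocks are made up for illustration.

```javascript
// Hypothetical helper mirroring the normalization in the hunk above.
function normalizeToolResultContent(content) {
  if (typeof content === "string") return content;
  if (!Array.isArray(content)) return ""; // any other shape becomes empty
  return content
    .map((c) =>
      c && typeof c === "object" && "text" in c && typeof c.text === "string"
        ? c.text
        : ""
    )
    .join("");
}

const fromString = normalizeToolResultContent("plain");
const fromBlocks = normalizeToolResultContent([
  { type: "text", text: "part one, " },
  { type: "image" }, // no text field: contributes nothing
  { type: "text", text: "part two" },
]);
console.log(fromString);
console.log(fromBlocks);
console.log(JSON.stringify(normalizeToolResultContent(undefined)));
```

Because blocks are joined with an empty string, text blocks run together without a separator. That is harmless for the launch-message detection, which only searches for a marker and two labelled fields.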

package/dist/services/async-agent-parser.js ADDED

@@ -0,0 +1,152 @@
+/**
+ * Pure helpers for the async-agent watcher (Fix #17 Stage 2).
+ *
+ * Two responsibilities, both pure (the file read in parseOutputFileStatus
+ * is pure-by-input — same path returns the same shape at that moment in
+ * time, no mutation, no side effects):
+ *
+ *  1. Parse the SDK's plain-text "Async agent launched successfully" tool
+ *     result into a structured AsyncLaunchedInfo.
+ *  2. Read the tail of an outputFile JSONL stream and decide whether the
+ *     sub-agent is still running, completed, or failed.
+ *
+ * Format details captured live from @anthropic-ai/claude-agent-sdk@0.2.97
+ * on 2026-04-13. See docs/superpowers/specs/sdk-async-agent-outputfile-format.md
+ * for the full investigation notes — the SDK's .d.ts shape DOES NOT match
+ * what the runtime actually emits, which is why the contract is pinned by
+ * tests against real fixtures.
+ */
+import { promises as fs } from "fs";
+// ── Tool-result text parser ──────────────────────────────────────────
+/**
+ * Parse the plain-text SDK tool-result content for an `Agent` call with
+ * `run_in_background: true`. The format is documented in the spec doc
+ * — it's NOT JSON, and the field is `output_file` (snake_case).
+ *
+ * Accepts:
+ *   - the raw text string
+ *   - an Anthropic SDK content array `[{type: "text", text: "..."}]`
+ *   - null/undefined/non-string → returns null
+ */
+export function parseAsyncLaunchedToolResult(raw) {
+    // Normalize to a string
+    let text;
+    if (raw == null)
+        return null;
+    if (typeof raw === "string") {
+        text = raw;
+    }
+    else if (Array.isArray(raw)) {
+        // SDK content blocks shape
+        text = raw
+            .map((b) => (b && typeof b === "object" && "text" in b ? String(b.text) : ""))
+            .join("");
+    }
+    else {
+        return null;
+    }
+    if (!text || text.length === 0)
+        return null;
+    // Quick gate: avoid expensive matching on non-async tool results
+    if (!text.includes("Async agent launched successfully"))
+        return null;
+    // agentId line: "agentId: <id> (...)" — capture everything up to first space/paren
+    const agentMatch = text.match(/agentId:\s*(\S+)/);
+    if (!agentMatch)
+        return null;
+    const agentId = agentMatch[1].trim();
+    if (!agentId)
+        return null;
+    // output_file line: "output_file: <path>" — path may contain spaces, capture
+    // until end of line (the path is always on its own line in real output).
+    const outFileMatch = text.match(/output_file:\s*(.+?)\s*(?:\n|$)/);
+    if (!outFileMatch)
+        return null;
+    const outputFile = outFileMatch[1].trim();
+    if (!outputFile)
+        return null;
+    return { agentId, outputFile };
+}
+const DEFAULT_TAIL_BYTES = 64 * 1024;
+/**
+ * Read the tail of an SDK background-agent outputFile and decide what
+ * state the sub-agent is in. See spec doc for the JSONL format. We only
+ * read the last `maxTailBytes` of the file because long-running agents
+ * (SEO audits etc.) can produce hundreds of KB of intermediate JSONL.
+ */
+export async function parseOutputFileStatus(path, opts = {}) {
+    const maxTailBytes = opts.maxTailBytes ?? DEFAULT_TAIL_BYTES;
+    let stat;
+    try {
+        stat = await fs.stat(path);
+    }
+    catch {
+        return { state: "missing" };
+    }
+    if (stat.size === 0) {
+        // Empty file is functionally the same as missing — we keep polling.
+        return { state: "missing" };
+    }
+    // Tail-read the last maxTailBytes
+    let buf;
+    let fh;
+    try {
+        fh = await fs.open(path, "r");
+        const readSize = Math.min(stat.size, maxTailBytes);
+        buf = Buffer.alloc(readSize);
+        await fh.read(buf, 0, readSize, stat.size - readSize);
+    }
+    catch {
+        return { state: "missing" };
+    }
+    finally {
+        try {
+            await fh?.close();
+        }
+        catch { /* ignore */ }
+    }
+    const text = buf.toString("utf-8");
+    // Split into lines. If we tail-read into the middle of a line (size >
+    // maxTailBytes), drop the first line because it's almost certainly
+    // truncated. The trailing line is dropped if there's no newline — it's
+    // the line being written right now.
+    const lines = text.split("\n");
+    const tailIsMidLine = stat.size > maxTailBytes;
+    const headIncomplete = tailIsMidLine ? 1 : 0;
+    const trailIncomplete = text.endsWith("\n") ? 0 : 1;
+    const usable = lines
+        .slice(headIncomplete, lines.length - (trailIncomplete > 0 ? trailIncomplete : 0))
+        .filter((l) => l.length > 0);
+    // Walk backwards to find the most-recent assistant message with end_turn
+    for (let i = usable.length - 1; i >= 0; i--) {
+        let parsed;
+        try {
+            parsed = JSON.parse(usable[i]);
+        }
+        catch {
+            // Garbage line — skip
+            continue;
+        }
+        if (parsed.type === "assistant" &&
+            parsed.message?.stop_reason === "end_turn" &&
+            Array.isArray(parsed.message.content)) {
+            const finalText = parsed.message.content
+                .filter((c) => c?.type === "text" && typeof c.text === "string")
+                .map((c) => c.text)
+                .join("\n\n");
+            const usage = parsed.message.usage;
+            return {
+                state: "completed",
+                output: finalText,
+                tokensUsed: usage
+                    ? {
+                        input: usage.input_tokens ?? 0,
+                        output: usage.output_tokens ?? 0,
+                    }
+                    : undefined,
+            };
+        }
+    }
+    // No completion marker found — still running.
+    return { state: "running", size: stat.size };
+}
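
The line handling in `parseOutputFileStatus` is easier to follow on an in-memory string. The sketch below is a simplified model under stated assumptions: `detectCompletion` is a hypothetical name, the `truncatedHead` flag stands in for `stat.size > maxTailBytes`, and the JSONL records are made-up fixtures that contain only the fields the parser inspects.

```javascript
// In-memory model of the tail handling; no file I/O.
function detectCompletion(tailText, truncatedHead) {
  const lines = tailText.split("\n");
  const start = truncatedHead ? 1 : 0; // first line may be cut mid-record
  // Without a trailing newline the last line may still be being written.
  const end = tailText.endsWith("\n") ? lines.length : lines.length - 1;
  const usable = lines.slice(start, end).filter((l) => l.length > 0);
  // Walk backwards: the most recent end_turn assistant message wins.
  for (let i = usable.length - 1; i >= 0; i--) {
    let rec;
    try {
      rec = JSON.parse(usable[i]);
    } catch {
      continue; // garbage line
    }
    const msg = rec?.message;
    if (rec?.type === "assistant" && msg?.stop_reason === "end_turn" && Array.isArray(msg.content)) {
      const output = msg.content
        .filter((c) => c?.type === "text" && typeof c.text === "string")
        .map((c) => c.text)
        .join("\n\n");
      return { state: "completed", output };
    }
  }
  return { state: "running" };
}

// Made-up records, limited to the fields the parser reads.
const done = JSON.stringify({
  type: "assistant",
  message: {
    stop_reason: "end_turn",
    content: [{ type: "text", text: "Report A" }, { type: "tool_use" }, { type: "text", text: "Report B" }],
  },
});
const tail = ['cut-off-record"}', "not json", done, ""].join("\n");
console.log(detectCompletion(tail, true));
console.log(detectCompletion(done, false)); // no trailing newline yet
```

One small departure from the diff: the sketch reads `rec?.type` with optional chaining, so a line containing the literal `null` is skipped. In the released code `parsed.type` sits outside the `try` block, so such a line would throw instead of being skipped.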

package/dist/services/async-agent-watcher.js ADDED

@@ -0,0 +1,206 @@
+/**
+ * Async Sub-Agent Watcher (Fix #17 Stage 2)
+ *
+ * Tracks pending background sub-agents that Claude launched with
+ * `run_in_background: true`. Polls each agent's outputFile every
+ * POLL_INTERVAL_MS, detects completion (success/failure/timeout),
+ * and delivers the final result as a separate Telegram message via
+ * the existing subagent-delivery.ts pipeline.
+ *
+ * Persistence: pending agents survive bot restarts via
+ * ~/.alvin-bot/state/async-agents.json. On boot, startWatcher() loads
+ * the file and resumes polling — same catchup pattern as the v4.9.0
+ * cron scheduler.
+ *
+ * Why this exists: Claude's Agent tool defaults to synchronous, which
+ * blocks the main Telegram session for 10+ minutes during long audits.
+ * Stage 1 of the fix tells Claude to use run_in_background; Stage 2
+ * (this file) catches the resulting outputFile and delivers the result
+ * when ready, so the user can keep chatting while the agent works.
+ *
+ * See docs/superpowers/plans/2026-04-13-async-subagents.md for the
+ * full plan and docs/superpowers/specs/sdk-async-agent-outputfile-format.md
+ * for the JSONL format details.
+ */
+import fs from "fs";
+import { dirname } from "path";
+import { parseOutputFileStatus } from "./async-agent-parser.js";
+import { ASYNC_AGENTS_STATE_FILE } from "../paths.js";
+/** How often the polling loop runs against each pending agent. */
+const POLL_INTERVAL_MS = 15_000;
+/** Hard ceiling per agent — 12h. After this, give up and deliver
+ *  a timeout banner. SEO audits historically take ~13 min, so 12h
+ *  is absurdly generous and protects against state-file growth. */
+const MAX_AGENT_AGE_MS = 12 * 60 * 60 * 1000;
+// ── Module state ──────────────────────────────────────────────────
+const pending = new Map();
+let pollTimer = null;
+let started = false;
+// ── Persistence ───────────────────────────────────────────────────
+function loadFromDisk() {
+    try {
+        const raw = fs.readFileSync(ASYNC_AGENTS_STATE_FILE, "utf-8");
+        const arr = JSON.parse(raw);
+        if (!Array.isArray(arr))
+            return;
+        for (const entry of arr) {
+            if (typeof entry?.agentId === "string" && typeof entry?.outputFile === "string") {
+                pending.set(entry.agentId, entry);
+            }
+        }
+    }
+    catch {
+        // No state file yet — fresh start. Not an error.
+    }
+}
+function saveToDisk() {
+    try {
+        fs.mkdirSync(dirname(ASYNC_AGENTS_STATE_FILE), { recursive: true });
+        fs.writeFileSync(ASYNC_AGENTS_STATE_FILE, JSON.stringify([...pending.values()], null, 2), "utf-8");
+    }
+    catch (err) {
+        console.error("[async-watcher] failed to persist state:", err);
+    }
+}
+// ── Public API ────────────────────────────────────────────────────
+/**
+ * Register a new async agent that Claude just launched. Persists
+ * immediately so a crash right after registration still delivers
+ * the result on the next boot.
+ */
+export function registerPendingAgent(input) {
+    const now = Date.now();
+    const entry = {
+        agentId: input.agentId,
+        outputFile: input.outputFile,
+        description: input.description,
+        prompt: input.prompt,
+        chatId: input.chatId,
+        userId: input.userId,
+        startedAt: now,
+        lastCheckedAt: 0,
+        giveUpAt: input.giveUpAt ?? now + MAX_AGENT_AGE_MS,
+        toolUseId: input.toolUseId,
+    };
+    pending.set(input.agentId, entry);
+    saveToDisk();
+}
+/** Returns a snapshot of in-memory pending agents (for /subagents + diagnostics). */
+export function listPendingAgents() {
+    return [...pending.values()];
+}
+/** Start the polling loop. Idempotent. Loads any persisted state from disk. */
+export function startWatcher() {
+    if (started)
+        return;
+    started = true;
+    loadFromDisk();
+    pollTimer = setInterval(() => {
+        pollOnce().catch((err) => console.error("[async-watcher] poll cycle failed:", err));
+    }, POLL_INTERVAL_MS);
+    console.log(`⏳ Async-agent watcher started (${pending.size} pending, ${POLL_INTERVAL_MS / 1000}s interval)`);
+}
+/** Stop the polling loop. Idempotent. */
+export function stopWatcher() {
+    if (pollTimer)
+        clearInterval(pollTimer);
+    pollTimer = null;
+    started = false;
+}
+/**
+ * Run one poll cycle: check every pending agent, deliver the completed
+ * ones, drop them from the in-memory + on-disk state. Exported for
+ * tests; production uses the setInterval from startWatcher().
+ */
+export async function pollOnce() {
+    const now = Date.now();
+    const toRemove = [];
+    for (const entry of pending.values()) {
+        entry.lastCheckedAt = now;
+        // Timeout check first — if the agent is past its giveUpAt, give up
+        // regardless of whether the file shows progress.
+        if (now >= entry.giveUpAt) {
+            await deliverAsFailure(entry, "timeout", "Agent ran longer than 12h — giving up");
+            toRemove.push(entry.agentId);
+            continue;
+        }
+        const status = await parseOutputFileStatus(entry.outputFile);
+        if (status.state === "completed") {
+            await deliverAsCompleted(entry, status.output, status.tokensUsed);
+            toRemove.push(entry.agentId);
+        }
+        else if (status.state === "failed") {
+            await deliverAsFailure(entry, "error", status.error);
+            toRemove.push(entry.agentId);
+        }
+        // running / missing → keep polling next cycle
+    }
+    if (toRemove.length > 0) {
+        for (const id of toRemove)
+            pending.delete(id);
+        saveToDisk();
+    }
+}
+// ── Delivery helpers ──────────────────────────────────────────────
+async function deliverAsCompleted(entry, output, tokensUsed) {
+    const { deliverSubAgentResult } = await import("./subagent-delivery.js");
+    const info = {
+        id: entry.agentId,
+        name: entry.description,
+        status: "completed",
+        startedAt: entry.startedAt,
+        source: "cron", // Reuse cron banner format — fits async background agents.
+        depth: 0,
+        parentChatId: entry.chatId,
+    };
+    const result = {
+        id: entry.agentId,
+        name: entry.description,
+        status: "completed",
+        output,
+        tokensUsed: tokensUsed ?? { input: 0, output: 0 },
+        duration: Date.now() - entry.startedAt,
+    };
+    try {
+        await deliverSubAgentResult(info, result);
+    }
+    catch (err) {
+        console.error(`[async-watcher] delivery failed for ${entry.agentId}:`, err);
+    }
+}
+async function deliverAsFailure(entry, status, error) {
+    const { deliverSubAgentResult } = await import("./subagent-delivery.js");
+    const info = {
+        id: entry.agentId,
+        name: entry.description,
+        status,
+        startedAt: entry.startedAt,
+        source: "cron",
+        depth: 0,
+        parentChatId: entry.chatId,
+    };
+    const result = {
+        id: entry.agentId,
+        name: entry.description,
+        status,
+        output: "",
+        tokensUsed: { input: 0, output: 0 },
+        duration: Date.now() - entry.startedAt,
+        error,
+    };
+    try {
+        await deliverSubAgentResult(info, result);
+    }
+    catch (err) {
+        console.error(`[async-watcher] failure delivery failed for ${entry.agentId}:`, err);
+    }
+}
+// ── Test helpers ──────────────────────────────────────────────────
+/** Test-only: drop in-memory state. Doesn't touch disk. */
+export function __resetForTest() {
+    pending.clear();
+    if (pollTimer)
+        clearInterval(pollTimer);
+    pollTimer = null;
+    started = false;
+}
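
The decision order inside `pollOnce()` (deadline first, then file status, removal after the loop) can be modelled without files or Telegram. This is a simplified, synchronous sketch: `pollOnceSketch`, `fakeStatus`, and the entries are hypothetical, and status checks and delivery are injected as plain functions.

```javascript
// Simplified synchronous model of pollOnce(); not the released code.
function pollOnceSketch(pending, now, checkStatus, deliver) {
  const toRemove = [];
  for (const entry of pending.values()) {
    entry.lastCheckedAt = now;
    if (now >= entry.giveUpAt) {
      // The deadline wins, even if the output would show completion.
      deliver(entry, "timeout");
      toRemove.push(entry.agentId);
      continue;
    }
    const status = checkStatus(entry.outputFile);
    if (status.state === "completed") {
      deliver(entry, "completed");
      toRemove.push(entry.agentId);
    }
    // "running" and "missing" stay pending for the next cycle.
  }
  // Delete after the loop so the Map is not mutated while iterating.
  for (const id of toRemove) pending.delete(id);
  return toRemove.length > 0; // true means the state file needs rewriting
}

// Made-up entries: "a" completes, "b" is past its deadline, "c" keeps running.
const pendingAgents = new Map([
  ["a", { agentId: "a", outputFile: "a.jsonl", giveUpAt: 1000 }],
  ["b", { agentId: "b", outputFile: "b.jsonl", giveUpAt: 50 }],
  ["c", { agentId: "c", outputFile: "c.jsonl", giveUpAt: 1000 }],
]);
const delivered = [];
const fakeStatus = (file) => ({ state: file === "a.jsonl" ? "completed" : "running" });
const changed = pollOnceSketch(pendingAgents, 100, fakeStatus, (e, s) => delivered.push(`${e.agentId}:${s}`));
console.log(delivered, [...pendingAgents.keys()], changed);
```

In the released watcher the delivery helpers catch and log their own errors, and the entry is removed either way. A failed delivery is therefore dropped rather than retried on the next cycle.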