npm - alvin-bot - Versions diffs - 4.12.2 → 4.12.3 - Mend

alvin-bot 4.12.2 → 4.12.3

Files changed (14)

package/CHANGELOG.md +64 -0
package/dist/handlers/async-agent-chunk-handler.js +17 -0
package/dist/handlers/background-bypass.js +75 -0
package/dist/handlers/message.js +127 -16
package/dist/services/async-agent-watcher.js +25 -0
package/dist/services/session-persistence.js +5 -0
package/dist/services/session.js +2 -0
package/package.json +1 -1
package/test/async-agent-chunk-flow.test.ts +113 -0
package/test/background-bypass-integration.test.ts +443 -0
package/test/background-bypass-stress.test.ts +417 -0
package/test/background-bypass.test.ts +127 -0
package/test/session-pending-background.test.ts +59 -0
package/test/watcher-pending-count.test.ts +228 -0

package/CHANGELOG.md CHANGED

@@ -2,6 +2,70 @@
 All notable changes to Alvin Bot are documented here.
+## [4.12.3] — 2026-04-15
+### 🐛 Patch: Background sub-agent no longer blocks the main Telegram session
+**The bug Ali reported:** After launching an async sub-agent (`run_in_background: true`), sending any follow-up message to the bot silently stalled for 2+ minutes before being processed. v4.12.1/v4.12.2 attempted a prompt-hint mitigation but did NOT address the architectural root cause.
+**Root cause (re-diagnosed with live SDK event logs):** The Claude Agent SDK's CLI subprocess stays alive for the full duration of a background task so it can inject the `<task-notification>` inline into the NEXT assistant turn. While that subprocess idles, Alvin's query iterator is still being drained, `session.isProcessing` stays `true`, and every new user message gets pushed into the 3-slot queue — which doesn't auto-drain. From the user's perspective: send "A" → nothing happens for 2 minutes.
+**The fix (architectural workaround):** New session field `pendingBackgroundCount` tracks the number of background agents currently in-flight. When a new message arrives while `isProcessing=true` AND the counter is `>0`, the handler:
+1. **Aborts the blocked query** instead of queueing. The old SDK subprocess dies; the background task's own detached subprocess keeps writing to its `output_file`.
+2. **Starts a fresh SDK session** (`resume: null`) for the new message so it doesn't inherit the block. Recent conversation history is carried forward via the bridge preamble so Claude retains context.
+3. **Relies on the existing `async-agent-watcher` (v4.10.0)** to poll the background task's `output_file` and deliver the result as a separate Telegram message via `subagent-delivery.ts`. The watcher decrements the counter when it delivers, so subsequent messages go back to normal SDK-resume behavior.
+**Net effect:** Sending "A" during a 5-minute research task now gets processed in ~200ms instead of after 5 minutes. The background research still delivers its result via a separate message when ready.
+### Technical details
+**New module** `src/handlers/background-bypass.ts` — pure state-machine helpers:
+- `shouldBypassQueue(state)` — returns true when `isProcessing=true`, `pendingBackgroundCount>0`, and an unaborted `abortController` exists
+- `shouldBypassSdkResume(state)` — returns true when `pendingBackgroundCount>0`, signalling the next query should pass `sessionId=null`
+- `waitUntilProcessingFalse(session, timeoutMs, tickMs)` — poll-waits for the old handler's `finally` block to flip the flag before the new query starts
+**`src/services/session.ts`** — new field `pendingBackgroundCount: number` (default 0, reset on `/new`). Not persisted across restarts — the watcher re-hydrates its own state file and delivery still works, and starting a fresh counter after restart avoids stale drift.
+**`src/services/async-agent-watcher.ts`** — `PendingAsyncAgent` gets an optional `sessionKey` field. On every delivery path (completed/failed/timeout), a new `decrementPendingCount(sessionKey)` helper clamps the counter at 0 using `Math.max`. Missing/unknown session keys are a no-op (backwards compatible with pre-v4.12.3 persisted state files).
+**`src/handlers/async-agent-chunk-handler.ts`** — `TurnContext` gets `sessionKey`. When `registerPendingAgent` is called, the counter is incremented in the same function.
+**`src/handlers/message.ts`** (Telegram):
+- Computes `sessionKey` once at the top of the handler and passes it everywhere
+- `if (session.isProcessing)` branch now checks `shouldBypassQueue` first — if true, aborts + waits for cleanup + falls through to process the new message. If false, queues as before.
+- When queueing, the handler now sends a text reply (`"⏳ Eine Anfrage läuft gerade. Deine Nachricht ist in der Warteschlange..."`) in addition to the 📝 reaction, so the user sees what happened (reactions alone were too subtle)
+- New `bypassResume` variable controls whether `queryOpts.sessionId` is `null` (fresh session) or `session.sessionId` (normal resume)
+- Bridge preamble now has two modes: the existing "SDK recovery" mode that bridges fallback turns, plus a new "bypass" mode that bridges the last 10 turns when starting a fresh session mid-conversation
+- New `_bypassAbortFired` session flag + `bypassAborted` local flag ensure that the old handler silently absorbs the abort error instead of showing a confusing "request cancelled" reply, and the fresh handler's finalize/broadcast/👍 reaction path is skipped for the aborted turn
+### Known limitations
+- **Platform coverage**: bypass path is Telegram-only in v4.12.3. Slack/Discord/WhatsApp handlers (`src/handlers/platform-message.ts`) don't currently handle `tool_result` chunks at all, so async agents can't be registered on those platforms. That's a pre-existing limitation that will be fixed in a future release.
+- **SDK behavior dependency**: the fix assumes the background task's own subprocess is detached from the parent SDK query's `AbortController`. Empirically this holds (the watcher delivers results even after bypass-abort), but if a future SDK release changes this we'd need to either stop using `run_in_background` and rely on a pure Alvin-side background dispatch (bigger change) or add a targeted `process.kill` for the parent only, keeping the child alive.
+- **Restart mid-flight**: if the bot restarts while a background agent is pending, the session's counter starts at 0 on restart. The watcher re-hydrates its own state file and still delivers the result correctly, but the session's "is this blocked?" signal is lost, so the first post-restart message might use SDK resume on the old (possibly-blocked) session ID. Minor cosmetic issue, not a data loss.
+### Testing
+- **Baseline**: 396 tests (v4.12.2)
+- **New tests**: +40
+  - `test/session-pending-background.test.ts` — 4 tests (counter wiring, reset, clamp)
+  - `test/watcher-pending-count.test.ts` — 6 tests (decrement on delivery/timeout/failure, missing sessionKey, multi-agent)
+  - `test/async-agent-chunk-flow.test.ts` — +3 tests (sessionKey propagation, counter stacking, non-async no-op)
+  - `test/background-bypass.test.ts` — 12 tests (pure helpers: shouldBypassQueue, shouldBypassSdkResume, waitUntilProcessingFalse)
+  - `test/background-bypass-integration.test.ts` — 6 tests (full lifecycle, stress, session isolation)
+  - `test/background-bypass-stress.test.ts` — 9 tests (100 parallel sessions, 200 churn cycles, extreme drift, /new during pending, ephemeral session, mixed rollout, timing edge cases, high load 50×4 agents)
+- **Total**: 436 tests, all green, TSC clean
+### Files changed
+- **NEW**: `src/handlers/background-bypass.ts`
+- **NEW tests**: `test/session-pending-background.test.ts`, `test/watcher-pending-count.test.ts`, `test/background-bypass.test.ts`, `test/background-bypass-integration.test.ts`, `test/background-bypass-stress.test.ts`
+- **Modified**: `src/handlers/message.ts` (bypass wiring + visible queue reply), `src/handlers/async-agent-chunk-handler.ts` (sessionKey + counter increment), `src/services/async-agent-watcher.ts` (sessionKey in PendingAsyncAgent + decrement on delivery), `src/services/session.ts` (pendingBackgroundCount field + _bypassAbortFired flag), `src/services/session-persistence.ts` (counter not persisted — reset on restart), `test/async-agent-chunk-flow.test.ts` (new assertions)
+- **Version**: `package.json` 4.12.2 → 4.12.3
+---
 ## [4.12.2] — 2026-04-15
 ### 🔒 Security patch: file permissions, ALLOWED_USERS hard-fail, exec-guard hardening, CVE updates
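
Reviewer note: the routing that the 4.12.3 changelog entry describes can be sketched as a small pure function. This is a simplified model, not the package's code: `routeIncoming`, `RoutingState`, `Route`, and `MAX_QUEUE` are hypothetical names, and the abort controller is reduced to a plain object carrying only the `aborted` flag.

```typescript
// Hypothetical, simplified model of the routing described in the changelog.
// None of these names exist in alvin-bot; they are for illustration only.
type Route = "process" | "bypass" | "queue" | "reject";

interface RoutingState {
  isProcessing: boolean;
  pendingBackgroundCount: number;
  // Minimal stand-in for an AbortController; only the aborted flag matters here.
  abortController: { signal: { aborted: boolean } } | null;
  queueLength: number;
}

// Queue capacity named in the changelog ("3-slot queue").
const MAX_QUEUE = 3;

function routeIncoming(s: RoutingState): Route {
  // No query in flight: the message is handled right away.
  if (!s.isProcessing) return "process";

  // A query is running. If a background agent is pending and the running
  // query can still be aborted, abort it instead of queueing.
  const canAbort =
    s.abortController !== null && !s.abortController.signal.aborted;
  if (s.pendingBackgroundCount > 0 && canAbort) return "bypass";

  // Otherwise fall back to the pre-4.12.3 behaviour: queue up to MAX_QUEUE.
  return s.queueLength < MAX_QUEUE ? "queue" : "reject";
}
```

In this model a "bypass" result corresponds to steps 1 and 2 of the changelog (abort the blocked query, then start a fresh SDK session), while "queue" and "reject" correspond to the unchanged queue path.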

package/dist/handlers/async-agent-chunk-handler.js CHANGED

@@ -1,5 +1,6 @@
 import { parseAsyncLaunchedToolResult } from "../services/async-agent-parser.js";
 import { registerPendingAgent } from "../services/async-agent-watcher.js";
+import { getAllSessions } from "../services/session.js";
 /**
  * Inspect a stream chunk; if it's an Agent async_launched tool_result,
  * register the pending agent with the watcher.
@@ -29,5 +30,21 @@ export function handleToolResultChunk(chunk, ctx) {
         chatId: ctx.chatId,
         userId: ctx.userId,
         toolUseId: chunk.toolUseId ?? null,
+        sessionKey: ctx.sessionKey,
     });
+    // v4.12.3 — Increment the session's pendingBackgroundCount so the
+    // main handler knows a background task is tying up the SDK's CLI
+    // subprocess. The watcher decrements this when it delivers the result.
+    // Guarded: missing sessionKey or unknown session is a no-op.
+    if (ctx.sessionKey) {
+        try {
+            const s = getAllSessions().get(ctx.sessionKey);
+            if (s) {
+                s.pendingBackgroundCount = (s.pendingBackgroundCount ?? 0) + 1;
+            }
+        }
+        catch {
+            /* never let counter updates break registration */
+        }
+    }
 }

package/dist/handlers/background-bypass.js ADDED

@@ -0,0 +1,75 @@
+/**
+ * v4.12.3 — Background-agent bypass helpers.
+ *
+ * Pure state-machine helpers used by the Telegram + platform message
+ * handlers to decide whether to:
+ *   1. Abort a running query instead of queueing the next user message,
+ *      when the running query is blocked waiting for a background
+ *      task-notification (SDK's CLI subprocess stays alive for the full
+ *      duration of the background task).
+ *   2. Start the next SDK query with a fresh session (sessionId=null)
+ *      when any background agent is still pending, so the new query
+ *      doesn't inherit the old session's block.
+ *
+ * These are separated into their own module so they can be unit tested
+ * without a grammy Context mock.
+ */
+/**
+ * Decide whether to bypass the normal "queue this message" branch and
+ * interrupt the running query so the new message can proceed immediately.
+ *
+ * True when:
+ *   - A query is currently running (`isProcessing`)
+ *   - At least one background agent is pending in this session
+ *   - An unaborted abortController exists to cancel the running query
+ *
+ * Otherwise false → fall back to the normal queue/drop behavior.
+ */
+export function shouldBypassQueue(state) {
+    if (!state.isProcessing)
+        return false;
+    if (state.pendingBackgroundCount <= 0)
+        return false;
+    const ac = state.abortController;
+    if (!ac)
+        return false;
+    if (ac.signal.aborted)
+        return false;
+    return true;
+}
+/**
+ * Decide whether the next SDK query should skip `resume: sessionId`
+ * and start a fresh session instead. Needed when a background agent is
+ * still pending — resuming the original session would inherit its block
+ * (the SDK's CLI subprocess for that session is waiting to deliver the
+ * task-notification inline). A fresh session has no such block and
+ * proceeds immediately. Context is preserved via the bridge preamble
+ * (buildBridgeMessage in message.ts).
+ */
+export function shouldBypassSdkResume(state) {
+    return state.pendingBackgroundCount > 0;
+}
+/**
+ * Poll-wait until `session.isProcessing` becomes false (or the timeout
+ * elapses). Returns true if the flag flipped, false on timeout.
+ *
+ * Used by the bypass path: after calling `abort()` on the running query,
+ * we wait for its finally block to run and flip isProcessing=false
+ * before starting the new query. The handler's own message loop is the
+ * one flipping the flag, so we just have to yield the event loop and
+ * re-check.
+ *
+ * Timeouts above 0 are recommended. Default tick interval is 50ms which
+ * is short enough that the fall-through feels instant to the user.
+ */
+export async function waitUntilProcessingFalse(session, timeoutMs, tickMs = 50) {
+    if (!session.isProcessing)
+        return true;
+    const start = Date.now();
+    while (session.isProcessing) {
+        if (Date.now() - start >= timeoutMs)
+            return false;
+        await new Promise((resolve) => setTimeout(resolve, tickMs));
+    }
+    return true;
+}

package/dist/handlers/message.js CHANGED

@@ -18,6 +18,7 @@ import { t } from "../i18n.js";
 import { isHarmlessTelegramError } from "../util/telegram-error-filter.js";
 import { handleToolResultChunk } from "./async-agent-chunk-handler.js";
 import { createStuckTimer } from "./stuck-timer.js";
+import { shouldBypassQueue, shouldBypassSdkResume, waitUntilProcessingFalse, } from "./background-bypass.js";
 /**
  * Stuck-only timeout — NO absolute cap.
  *
@@ -152,7 +153,8 @@ export async function handleMessage(ctx) {
         text = `[Replying to previous message: "${quotedText}"]\n\n${text}`;
     }
     const userId = ctx.from.id;
-    const session = getSession(buildSessionKey("telegram", ctx.chat.id, userId));
+    const sessionKey = buildSessionKey("telegram", ctx.chat.id, userId);
+    const session = getSession(sessionKey);
     // Track user profile
     touchProfile(userId, ctx.from?.first_name, ctx.from?.username, "telegram", text);
     // Sync session language from persistent profile (on first message)
@@ -163,15 +165,54 @@ export async function handleMessage(ctx) {
             session.language = profile.language;
     }
     if (session.isProcessing) {
-        // Queue the message instead of rejecting it (max 3)
-        if (session.messageQueue.length < 3) {
-            session.messageQueue.push(text);
-            await react(ctx, "📝");
+        // v4.12.3 — If a background agent is pending, the running query is
+        // almost certainly just the SDK's CLI subprocess sitting idle waiting
+        // for the task-notification to be ready (can take 5+ minutes for long
+        // audits). Don't queue — abort the blocked query and fall through so
+        // the new message gets processed immediately. The background task
+        // itself continues in its detached subprocess; the async-agent watcher
+        // delivers the result via subagent-delivery.ts when ready.
+        if (shouldBypassQueue({
+            isProcessing: session.isProcessing,
+            pendingBackgroundCount: session.pendingBackgroundCount,
+            abortController: session.abortController,
+        })) {
+            console.log(`[v4.12.3 bypass] aborting blocked query for ${sessionKey} — ` +
+                `${session.pendingBackgroundCount} background agent(s) pending`);
+            // Mark the abort as a bypass so the old handler's error branch
+            // doesn't surface a "request cancelled" reply to the user.
+            session._bypassAbortFired = true;
+            try {
+                session.abortController.abort();
+            }
+            catch {
+                /* ignore */
+            }
+            // Wait briefly for the old handler's finally to run. If it hangs
+            // (>5s, shouldn't happen), we fall through anyway — worst case is
+            // a brief overlap where both handlers run.
+            await waitUntilProcessingFalse(session, 5000);
+            // Fall through to start a fresh query below.
         }
         else {
-            await ctx.reply("⏳ Warteschlange voll (3 Nachrichten). Bitte warten oder /cancel.");
+            // Normal queue behavior. v4.12.3 — emit a text reply in addition
+            // to the reaction so the user actually sees that their message
+            // was received and is waiting. Reactions alone are too subtle.
+            if (session.messageQueue.length < 3) {
+                session.messageQueue.push(text);
+                await react(ctx, "📝");
+                try {
+                    await ctx.reply("⏳ Eine Anfrage läuft gerade. Deine Nachricht ist in der Warteschlange und wird als Nächstes bearbeitet.");
+                }
+                catch {
+                    /* harmless grammy race */
+                }
+            }
+            else {
+                await ctx.reply("⏳ Warteschlange voll (3 Nachrichten). Bitte warten oder /cancel.");
+            }
+            return;
         }
-        return;
     }
     // Consume queued messages (sent while previous query was processing)
     if (session.messageQueue.length > 0) {
@@ -180,9 +221,23 @@ export async function handleMessage(ctx) {
     }
     session.isProcessing = true;
     session.abortController = new AbortController();
+    // v4.12.3 — Clear any stale bypass flag from a previous aborted turn.
+    // The flag is set by the bypass path right before it calls abort(),
+    // read by the OLD handler's error path, and cleared here by the NEW
+    // handler so it doesn't misclassify future non-bypass aborts. Use
+    // `delete` so TypeScript doesn't narrow the flag to literal `false`
+    // for the rest of this function (it's mutated from the bypass path in
+    // another handler invocation, so the type stays `boolean | undefined`).
+    delete session._bypassAbortFired;
     const streamer = new TelegramStreamer(ctx.chat.id, ctx.api, ctx.message?.message_id);
     let finalText = "";
     let timedOut = false;
+    // v4.12.3 — Tracks whether the current turn ended because the bypass
+    // path aborted us. When true, skip the finalize/broadcast/👍 reaction
+    // flow at the bottom of the handler since the user isn't waiting on
+    // this turn anymore. Explicit `boolean` type so TS doesn't narrow to
+    // the literal `false` and reject the later comparison.
+    let bypassAborted = false;
     const typingInterval = setInterval(() => {
         ctx.api.sendChatAction(ctx.chat.id, "typing").catch(() => { });
     }, 4000);
@@ -280,22 +335,49 @@ export async function handleMessage(ctx) {
                 session.checkpointHintsInjected++;
             }
         }
+        // v4.12.3 — If a background agent is still pending, skip SDK resume.
+        // The OLD SDK session is blocked waiting to deliver the
+        // task-notification inline; resuming it would inherit that block.
+        // Start a fresh SDK session and rely on the bridge preamble below
+        // to carry recent history so Claude has context.
+        const bypassResume = isSDK && shouldBypassSdkResume({
+            pendingBackgroundCount: session.pendingBackgroundCount,
+        });
+        if (bypassResume) {
+            console.log(`[v4.12.3 bypass] starting fresh SDK session for ${sessionKey} — ` +
+                `${session.pendingBackgroundCount} background agent(s) still pending`);
+        }
         // B2 Bridge-Message: if SDK is active but there are non-SDK turns since
         // the last SDK turn, prepend a catch-up preamble so the SDK sees what
         // happened during the failover. We defensively clamp the index against
         // history bounds in case compaction shrank the array under our feet.
+        //
+        // v4.12.3 — Bypass-resume path also gets a bridge: since we're starting
+        // a fresh SDK session, Claude has no prior context from this chat.
+        // Bridge the last BYPASS_BRIDGE_TURNS entries so it knows what we were
+        // just talking about.
+        const BYPASS_BRIDGE_TURNS = 10;
         let bridgedPrompt = text;
         if (isSDK) {
-            const anchor = Math.min(session.lastSdkHistoryIndex, session.history.length - 1);
-            const gapStart = Math.max(0, anchor + 1);
-            // gapEnd excludes the user message we just added (history.length - 1).
-            const gapEnd = session.history.length - 1;
+            let gapStart;
+            let gapEnd;
+            if (bypassResume) {
+                gapEnd = session.history.length - 1;
+                gapStart = Math.max(0, gapEnd - BYPASS_BRIDGE_TURNS);
+            }
+            else {
+                const anchor = Math.min(session.lastSdkHistoryIndex, session.history.length - 1);
+                gapStart = Math.max(0, anchor + 1);
+                // gapEnd excludes the user message we just added (history.length - 1).
+                gapEnd = session.history.length - 1;
+            }
             if (gapEnd > gapStart) {
                 const gapTurns = session.history.slice(gapStart, gapEnd);
                 const bridge = buildBridgeMessage(gapTurns);
                 if (bridge) {
                     bridgedPrompt = bridge + text;
-                    console.log(`[bridge] SDK recovery: injecting ${gapTurns.length} fallback turn(s) into prompt`);
+                    console.log(`[bridge] ${bypassResume ? "bypass" : "SDK recovery"}: ` +
+                        `injecting ${gapTurns.length} turn(s) into prompt`);
                 }
             }
         }
@@ -307,8 +389,8 @@ export async function handleMessage(ctx) {
             abortSignal: session.abortController.signal,
             // User's UI locale — registry uses it to localize failure messages.
             locale: session.language,
-            // SDK-specific
-            sessionId: isSDK ? session.sessionId : null,
+            // SDK-specific. v4.12.3 — bypass resume when background pending.
+            sessionId: isSDK && !bypassResume ? session.sessionId : null,
             // Unified history: SDK ignores it (uses filesystem-resume instead),
             // non-SDK providers use it for context. Keeping it populated for both
             // means a failover from SDK → Ollama keeps the conversation context.
@@ -418,9 +500,12 @@ export async function handleMessage(ctx) {
                     // hand them off to the async-agent watcher. The watcher will
                     // poll the outputFile and deliver the result as a separate
                     // Telegram message when the background agent finishes.
+                    // v4.12.3 — Forward sessionKey so the watcher can route the
+                    // delivery-complete decrement back to the right session.
                     handleToolResultChunk(chunk, {
                         chatId: ctx.chat.id,
                         userId,
+                        sessionKey,
                         lastToolUseInput: lastAgentToolUseInput,
                     });
                     // Reset the captured input — only the immediately following
@@ -447,6 +532,15 @@ export async function handleMessage(ctx) {
                     await ctx.reply(`⚡ _${chunk.failedProvider} unavailable — switching to ${chunk.providerName}_`, { parse_mode: "Markdown" });
                     break;
                 case "error":
+                    // v4.12.3 — If the bypass path aborted us, swallow the error
+                    // silently. The new handler is already preparing to process
+                    // the user's next message; showing a cancellation notice here
+                    // would be misleading.
+                    if (session._bypassAbortFired === true &&
+                        chunk.error?.toLowerCase().includes("abort")) {
+                        bypassAborted = true;
+                        break;
+                    }
                     // If our stuck-timer fired, the abort travels up as a registry
                     // mid-stream error chunk. Prefer the explicit stuck message over
                     // the generic one so the user understands this was a real hang,
@@ -460,6 +554,11 @@ export async function handleMessage(ctx) {
                     break;
             }
         }
+        if (bypassAborted) {
+            // v4.12.3 — Bypass path took over; don't finalize, don't react 👍.
+            // Just clean up and return. The finally block still fires.
+            return;
+        }
         await streamer.finalize(finalText);
         emit("message:sent", { userId, text: finalText, platform: "telegram" });
         // v4.5.0: tell observers the response is complete.
@@ -499,14 +598,26 @@ export async function handleMessage(ctx) {
     catch (err) {
         const errorMsg = err instanceof Error ? err.message : String(err);
         const lang = session.language;
-        await react(ctx, "👎");
-        if (timedOut) {
+        // v4.12.3 — If this handler was interrupted by the bypass path
+        // (another handler aborted us to process a new message while a
+        // background agent is pending), silently absorb the abort error.
+        // Showing "request cancelled" would be misleading — from the
+        // user's point of view, nothing was cancelled, their new message
+        // is just being processed.
+        const absorbBypassAbort = errorMsg.includes("abort") && session._bypassAbortFired === true;
+        if (absorbBypassAbort) {
+            // Do NOT react 👎 or reply — just clean up silently.
+        }
+        else if (timedOut) {
+            await react(ctx, "👎");
             await ctx.reply(t("bot.error.timeoutStuck", lang, { min: STUCK_TIMEOUT_MINUTES }));
         }
         else if (errorMsg.includes("abort")) {
+            await react(ctx, "👎");
             await ctx.reply(t("bot.error.requestCancelled", lang));
         }
         else if (!isHarmlessTelegramError(err)) {
+            await react(ctx, "👎");
             // Drop benign grammy races ("message is not modified", etc.)
             // instead of surfacing them as "Fehler: ..." replies.
             await ctx.reply(`${t("bot.error.prefix", lang)} ${errorMsg}`);
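
Reviewer note: the two bridge modes in the hunk above differ only in how the history window is chosen. The sketch below isolates that arithmetic so the boundaries are easier to check. `bridgeWindow`, `turnsToBridge`, and `BridgeWindow` are hypothetical helper names, not part of the package; the index math mirrors the diff.

```typescript
// Hypothetical helpers that isolate the window arithmetic from message.js.
interface BridgeWindow {
  gapStart: number; // inclusive
  gapEnd: number;   // exclusive (Array.prototype.slice semantics)
}

function bridgeWindow(
  historyLength: number,
  lastSdkHistoryIndex: number,
  bypassResume: boolean,
  bypassTurns = 10, // BYPASS_BRIDGE_TURNS in the diff
): BridgeWindow {
  // The newest entry is the user message just added, so it is excluded.
  const gapEnd = historyLength - 1;
  if (bypassResume) {
    // Bypass mode: bridge the last `bypassTurns` entries before the new message.
    return { gapStart: Math.max(0, gapEnd - bypassTurns), gapEnd };
  }
  // SDK recovery mode: bridge everything after the last SDK turn,
  // clamping the anchor in case compaction shrank the history.
  const anchor = Math.min(lastSdkHistoryIndex, historyLength - 1);
  return { gapStart: Math.max(0, anchor + 1), gapEnd };
}

function turnsToBridge<T>(turns: T[], w: BridgeWindow): T[] {
  // An empty or inverted window means there is nothing to bridge.
  return w.gapEnd > w.gapStart ? turns.slice(w.gapStart, w.gapEnd) : [];
}
```

With 25 history entries, bypass mode yields the window 14 to 24 (ten turns), whereas recovery mode with `lastSdkHistoryIndex = 20` yields 21 to 24 (three turns).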

package/dist/services/async-agent-watcher.js CHANGED

@@ -26,6 +26,7 @@ import fs from "fs";
 import { dirname } from "path";
 import { parseOutputFileStatus } from "./async-agent-parser.js";
 import { ASYNC_AGENTS_STATE_FILE } from "../paths.js";
+import { getAllSessions } from "./session.js";
 /** How often the polling loop runs against each pending agent. */
 const POLL_INTERVAL_MS = 15_000;
 /** Hard ceiling per agent — 12h. After this, give up and deliver
@@ -81,10 +82,32 @@ export function registerPendingAgent(input) {
         lastCheckedAt: 0,
         giveUpAt: input.giveUpAt ?? now + MAX_AGENT_AGE_MS,
         toolUseId: input.toolUseId,
+        sessionKey: input.sessionKey,
     };
     pending.set(input.agentId, entry);
     saveToDisk();
 }
+/**
+ * v4.12.3 — Decrement the session's pendingBackgroundCount. Called on
+ * every delivery (completed/failed/timeout). Clamped at 0 so drift
+ * scenarios (counter was already 0, or session was reset) never crash.
+ * Missing/unknown sessionKey → no-op. Never throws.
+ */
+function decrementPendingCount(sessionKey) {
+    if (!sessionKey)
+        return;
+    try {
+        const all = getAllSessions();
+        const s = all.get(sessionKey);
+        if (!s)
+            return;
+        s.pendingBackgroundCount = Math.max(0, (s.pendingBackgroundCount ?? 0) - 1);
+    }
+    catch (err) {
+        // Never let a decrement failure break delivery.
+        console.error("[async-watcher] decrement failed:", err);
+    }
+}
 /** Returns a snapshot of in-memory pending agents (for /subagents + diagnostics). */
 export function listPendingAgents() {
     return [...pending.values()];
@@ -167,6 +190,7 @@ async function deliverAsCompleted(entry, output, tokensUsed) {
     catch (err) {
         console.error(`[async-watcher] delivery failed for ${entry.agentId}:`, err);
     }
+    decrementPendingCount(entry.sessionKey);
 }
 async function deliverAsFailure(entry, status, error) {
     const { deliverSubAgentResult } = await import("./subagent-delivery.js");
@@ -194,6 +218,7 @@ async function deliverAsFailure(entry, status, error) {
     catch (err) {
         console.error(`[async-watcher] failure delivery failed for ${entry.agentId}:`, err);
     }
+    decrementPendingCount(entry.sessionKey);
 }
 // ── Test helpers ──────────────────────────────────────────────────
 /** Test-only: drop in-memory state. Doesn't touch disk. */
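
Reviewer note: the counter is incremented in the chunk handler and decremented in the watcher, so its invariants are spread across two files. The sketch below puts both sides next to each other under simplifying assumptions: a local `sessions` map stands in for `getAllSessions()`, and `incrementPending` / `decrementPending` are hypothetical names for the two inline updates in the diff.

```typescript
// Simplified stand-in for the session store; the real code uses getAllSessions().
interface SessionLike {
  pendingBackgroundCount?: number;
}
const sessions = new Map<string, SessionLike>();

// Mirrors the increment in async-agent-chunk-handler.js:
// a missing key or unknown session is a no-op.
function incrementPending(sessionKey: string | undefined): void {
  if (!sessionKey) return;
  const s = sessions.get(sessionKey);
  if (s) s.pendingBackgroundCount = (s.pendingBackgroundCount ?? 0) + 1;
}

// Mirrors decrementPendingCount in async-agent-watcher.js:
// clamped at 0 so a drifted counter never goes negative.
function decrementPending(sessionKey: string | undefined): void {
  if (!sessionKey) return;
  const s = sessions.get(sessionKey);
  if (!s) return;
  s.pendingBackgroundCount = Math.max(0, (s.pendingBackgroundCount ?? 0) - 1);
}
```

Two launches followed by three deliveries leave the counter at 0 rather than -1, which is the drift case the changelog mentions for sessions reset via `/new` or restarted mid-flight.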

package/dist/services/session-persistence.js CHANGED

@@ -191,6 +191,11 @@ export function loadPersistedSessions() {
             compactionCount: 0,
             checkpointHintsInjected: 0,
             sdkSubTaskCount: 0,
+            // v4.12.3 — Don't persist pendingBackgroundCount. On restart, the
+            // async-agent-watcher re-hydrates its own state file and polls each
+            // pending agent's outputFile, which handles delivery independently.
+            // Starting at 0 avoids stale counters surviving a crash.
+            pendingBackgroundCount: 0,
             history: Array.isArray(persisted.history) ? persisted.history : [],
             language: persisted.language ?? "en",
             messageQueue: [],

package/dist/services/session.js CHANGED

@@ -94,6 +94,7 @@ export function getSession(key) {
             compactionCount: 0,
             checkpointHintsInjected: 0,
             sdkSubTaskCount: 0,
+            pendingBackgroundCount: 0,
             history: [],
             language: "en",
             messageQueue: [],
@@ -122,6 +123,7 @@ export function resetSession(key) {
     session.compactionCount = 0;
     session.checkpointHintsInjected = 0;
     session.sdkSubTaskCount = 0;
+    session.pendingBackgroundCount = 0;
     session.history = [];
     session.lastSdkHistoryIndex = -1;
     session.startedAt = Date.now();

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "alvin-bot",
-  "version": "4.12.2",
+  "version": "4.12.3",
   "description": "Alvin Bot \u2014 Your personal AI agent on Telegram, WhatsApp, Discord, Signal, and Web.",
   "type": "module",
   "main": "dist/index.js",

package/test/async-agent-chunk-flow.test.ts CHANGED

@@ -50,6 +50,119 @@ describe("async agent chunk flow (Stage 2)", () => {
     expect(r.outputFile).toBe("/tmp/out-abc-1.jsonl");
   });
+  it("v4.12.3 — passes sessionKey to registerPendingAgent and increments session counter", async () => {
+    const registered: Array<{ sessionKey?: string }> = [];
+    vi.doMock("../src/services/async-agent-watcher.js", () => ({
+      registerPendingAgent: (input: { sessionKey?: string }) =>
+        registered.push(input),
+      startWatcher: () => {},
+      stopWatcher: () => {},
+      pollOnce: async () => {},
+      listPendingAgents: () => [],
+    }));
+    const { getSession } = await import("../src/services/session.js");
+    const session = getSession("v412-chunk-test-session");
+    session.pendingBackgroundCount = 0;
+    const { handleToolResultChunk } = await import(
+      "../src/handlers/async-agent-chunk-handler.js"
+    );
+    handleToolResultChunk(
+      {
+        type: "tool_result",
+        toolUseId: "toolu_sess",
+        toolResultContent:
+          "Async agent launched successfully.\n" +
+          "agentId: ag-sess\n" +
+          "output_file: /tmp/ag-sess.jsonl\n",
+      },
+      {
+        chatId: 10,
+        userId: 20,
+        sessionKey: "v412-chunk-test-session",
+        lastToolUseInput: { description: "SEO", prompt: "do it" },
+      },
+    );
+    expect(registered).toHaveLength(1);
+    expect(registered[0].sessionKey).toBe("v412-chunk-test-session");
+    expect(session.pendingBackgroundCount).toBe(1);
+  });
+  it("v4.12.3 — multiple async launches in same turn stack the counter", async () => {
+    vi.doMock("../src/services/async-agent-watcher.js", () => ({
+      registerPendingAgent: () => {},
+      startWatcher: () => {},
+      stopWatcher: () => {},
+      pollOnce: async () => {},
+      listPendingAgents: () => [],
+    }));
+    const { getSession } = await import("../src/services/session.js");
+    const session = getSession("v412-chunk-stack");
+    session.pendingBackgroundCount = 0;
+    const { handleToolResultChunk } = await import(
+      "../src/handlers/async-agent-chunk-handler.js"
+    );
+    for (let i = 0; i < 3; i++) {
+      handleToolResultChunk(
+        {
+          type: "tool_result",
+          toolUseId: `toolu_${i}`,
+          toolResultContent:
+            `Async agent launched successfully.\n` +
+            `agentId: ag-${i}\n` +
+            `output_file: /tmp/ag-${i}.jsonl\n`,
+        },
+        {
+          chatId: 10,
+          userId: 20,
+          sessionKey: "v412-chunk-stack",
+          lastToolUseInput: { description: `task ${i}`, prompt: "p" },
+        },
+      );
+    }
+    expect(session.pendingBackgroundCount).toBe(3);
+  });
+  it("v4.12.3 — non-async tool_result does not increment the counter", async () => {
+    vi.doMock("../src/services/async-agent-watcher.js", () => ({
+      registerPendingAgent: () => {
+        throw new Error("should not be called");
+      },
+      startWatcher: () => {},
+      stopWatcher: () => {},
+      pollOnce: async () => {},
+      listPendingAgents: () => [],
+    }));
+    const { getSession } = await import("../src/services/session.js");
+    const session = getSession("v412-chunk-nonasync");
+    session.pendingBackgroundCount = 0;
+    const { handleToolResultChunk } = await import(
+      "../src/handlers/async-agent-chunk-handler.js"
+    );
+    handleToolResultChunk(
+      {
+        type: "tool_result",
+        toolUseId: "toolu_read",
+        toolResultContent: "plain read result — no async_launched marker",
+      },
+      {
+        chatId: 1,
+        userId: 1,
+        sessionKey: "v412-chunk-nonasync",
+        lastToolUseInput: { description: "read", prompt: "p" },
+      },
+    );
+    expect(session.pendingBackgroundCount).toBe(0);
+  });
   it("falls back to a generic description when no toolUseInput is provided", async () => {
     const registered: unknown[] = [];
     vi.doMock("../src/services/async-agent-watcher.js", () => ({