@gotgenes/pi-subagents 16.0.0 → 16.1.1

Files changed (20)

package/CHANGELOG.md +14 -0
package/dist/public.d.ts +19 -22
package/docs/architecture/architecture.md +49 -17
package/docs/plans/0373-extract-subagent-state.md +250 -0
package/docs/plans/0381-replace-concurrency-queue-with-limiter.md +267 -0
package/docs/plans/0403-abort-subagents-on-interrupt.md +180 -0
package/docs/retro/0373-extract-subagent-state.md +94 -0
package/docs/retro/0381-replace-concurrency-queue-with-limiter.md +95 -0
package/docs/retro/0400-include-parent-prompt-in-replace-mode.md +40 -0
package/docs/retro/0403-abort-subagents-on-interrupt.md +49 -0
package/package.json +1 -1
package/src/handlers/index.ts +1 -0
package/src/handlers/interrupt.ts +49 -0
package/src/index.ts +13 -16
package/src/lifecycle/concurrency-limiter.ts +55 -0
package/src/lifecycle/subagent-manager.ts +57 -51
package/src/lifecycle/subagent-state.ts +156 -0
package/src/lifecycle/subagent.ts +86 -163
package/src/observation/record-observer.ts +15 -13
package/src/lifecycle/concurrency-queue.ts +0 -63

package/docs/plans/0381-replace-concurrency-queue-with-limiter.md ADDED

@@ -0,0 +1,267 @@
+---
+issue: 381
+issue_title: "Replace ConcurrencyQueue with a thunk-based ConcurrencyLimiter"
+---
+# Replace ConcurrencyQueue with a thunk-based ConcurrencyLimiter
+## Problem Statement
+The `ConcurrencyQueue` stores background-agent IDs and decides *when* to start them, but it cannot start an agent itself.
+It compensates with a `startAgent(id)` callback that reaches back into the manager (`getRecord(id)`, status check, `run()`) — a dependency back-edge that forces forward-referenced bindings in both `index.ts` and the manager test helper.
+The queue also keeps its own `running` counter, fed by `markStarted`/`markFinished` relays in the manager's observer, duplicating state the agents already carry.
+A queued agent has `promise === undefined` until the queue starts it, which is the direct cause of `waitForAll`'s `while (true)` drain loop and its `eslint-disable`.
+These are three symptoms of one root cause: the queue schedules *identifiers it cannot act on* instead of *work it can run*.
+Scheduling thunks (`() => Promise<void>`) instead of IDs dissolves all three at the source.
+This is Phase 17 Step 1 (core consolidation), recorded in `docs/architecture/architecture.md` under "Improvement roadmap (Phase 17 — core consolidation)".
+It unblocks Phase 17 Step 3 ([#374], run-start encapsulation).
+## Goals
+- Replace `ConcurrencyQueue` (ID registry + back-edge callback) with a `ConcurrencyLimiter` that schedules run closures FIFO against a dynamic limit and knows nothing about agents, IDs, or the manager.
+- Make the dependency direction strictly `SubagentManager → ConcurrencyLimiter`: no callback back-edge, no forward-referenced bindings.
+- Derive the active count from the limiter's own task lifecycle (increment on task start, decrement on settle); delete the observer's `markStarted`/`markFinished` relays.
+- Give every spawned agent a real `promise` at spawn time, collapsing `waitForAll`'s `while (true)` drain loop and its `eslint-disable`.
+- This is a non-breaking internal refactor: the FIFO admission behavior against `maxConcurrent` is preserved, and no public API, config key, or observable behavior changes.
+## Non-Goals
+- Renaming the `bypassQueue` spawn option.
+  It is part of the published `SubagentsService` type surface (`src/service/service.ts`), so renaming it would churn the type bundle and break consumers — out of scope; track in Open Questions.
+- Folding the queued-status guard into `Subagent.start()` — that is Phase 17 Step 3 ([#374]).
+  This plan keeps the guard inside the scheduled thunk.
+- Extracting `SubagentState` or making execution deps mandatory ([#373], Step 2).
+- Any change to foreground execution (`spawnAndWait`) or to `bypassQueue` runs — both continue to invoke `record.run()` directly, never touching the limiter.
+- Touching `src/service/service.ts` or `src/service/service-adapter.ts` — `bypassQueue` flows through unchanged.
+## Background
+Relevant modules:
+- `src/lifecycle/concurrency-queue.ts` — the current `ConcurrencyQueue`: `isFull`, `enqueue`, `dequeue`, `markStarted`, `markFinished`, `drain`, `clear`, `queuedIds`.
+  Stores IDs; `drain()` calls the injected `startAgent(id)` back-edge.
+- `src/lifecycle/subagent-manager.ts` — injects the queue via `SubagentManagerOptions.queue`.
+  `buildObserver` relays `markStarted`/`markFinished`; `spawn` enqueues when `isFull()`; `abort` calls `dequeue`; `abortAll` iterates `queuedIds` + `clear()`; `waitForAll` loops `drain()` + `Promise.allSettled`; `dispose` calls `clear()`.
+- `src/index.ts` — constructs the queue with a `startAgent` callback that forward-references the manager (`manager.getRecord(id)` then `agent.run()`); wires `settings.onMaxConcurrentChanged` to `queue.drain()`.
+- `src/lifecycle/subagent.ts` — `run()` sets status to `running` synchronously (`markRunning`) before its first `await`; `run()` always resolves (errors captured internally).
+  `abort()` acts only on `running` agents; its docstring references `ConcurrencyQueue.dequeue()`.
+- `test/lifecycle/subagent-manager.test.ts` — `createManager` helper replicates the `index.ts` start callback with a `prefer-const` `eslint-disable` for the forward reference.
+- `test/lifecycle/concurrency-queue.test.ts` — unit tests for the queue (drain ordering, `markStarted`/`markFinished` counting, `enqueue`/`dequeue`).
+Constraints from AGENTS.md and skills:
+- ES2024 `Promise.withResolvers` is available and preferred (`code-design` skill).
+- The `bypassQueue` field lives in the public type bundle (`exports`, `verify:public-types`); renaming public surface is breaking (`package-pi-subagents` skill).
+- `@typescript-eslint/require-await` is enabled for `src/`; a thunk with no `await` must return a `Promise` without `async`.
+- Where the old `drain()` used `while (… && !isFull())` with `this.queue.shift()!`, prefer a bounded loop without a non-null assertion (`code-design` Biome/ESLint notes).
+Inspection of the current observer-relay path (`buildObserver` → `queue.markStarted`/`markFinished`) confirmed that the queue's `running` counter mirrors the per-agent status the manager already tracks (the manager filters on `status === "running" || status === "queued"` in `cleanup`, `clearCompleted`, `hasRunning`, `waitForAll`).
+No production caller awaits a *queued* agent's promise (`get-result-tool.ts` guards on `status === "running"`; `spawnAndWait` is foreground; `waitForAll` filters by status), so giving queued agents a settled-on-completion promise is safe.
+## Design Overview
+### `ConcurrencyLimiter`
+A pure FIFO scheduler over thunks.
+It owns the active count and the pending queue; it has no knowledge of agents, IDs, or the manager.
+```typescript
+export class ConcurrencyLimiter {
+	private active = 0;
+	private readonly pending: Array<{ start: () => void; settle: () => void }> = [];
+	constructor(private readonly getLimit: () => number) {}
+	/**
+	 * Schedule a task to run FIFO once a slot is free.
+	 * The returned promise always settles: it follows the task's settlement when
+	 * the task runs, or resolves early if clear() drops it before it starts.
+	 */
+	schedule(task: () => Promise<void>): Promise<void> {
+		const { promise, resolve, reject } = Promise.withResolvers<void>();
+		this.pending.push({
+			start: () => {
+				this.active++;
+				task().then(resolve, reject).finally(() => {
+					this.active--;
+					this.recheck();
+				});
+			},
+			settle: resolve,
+		});
+		this.recheck();
+		return promise;
+	}
+	/** Start pending tasks until the limit is reached. Call when the limit may have grown. */
+	recheck(): void {
+		while (this.active < this.getLimit()) {
+			const next = this.pending.shift();
+			if (!next) break;
+			next.start();
+		}
+	}
+	/** Drop all pending tasks, resolving their promises without running them. */
+	clear(): void {
+		const dropped = this.pending.splice(0);
+		for (const task of dropped) task.settle();
+	}
+}
+```
+Design decisions:
+- **Active count derived from task lifecycle.**
+  `active++` happens synchronously inside `start()` before the task's first `await`; `active--` runs in `finally`.
+  This replaces the queue's `running` counter and the two observer relays.
+- **`recheck()` is bounded.**
+  The loop terminates when the limit is reached or the pending queue empties — no `while (true)`, no `this.pending.shift()!` non-null assertion.
+- **`clear()` settles dropped promises.**
+  Every `schedule()` promise becomes `record.promise`; the contract is that it always settles.
+  Dropping a thunk without resolving would leave a forever-pending `record.promise`.
+  `clear()` resolves dropped tasks so `dispose()`/`abortAll()` cannot strand a promise. (This is a few lines beyond the issue's "~40 lines" sketch; the extra `settle` handle is the deliberate cost of that invariant.)
+- **Synchronous start.**
+  When a slot is free, `schedule()` runs the thunk synchronously inside `recheck()`, so `record.run()` executes its synchronous prefix (`markRunning`) immediately — preserving today's behavior where `record.promise = record.run()` flips status to `running` at once.
+### Manager spawn call site
+```typescript
+// spawn(), background and not bypassQueue:
+record.promise = this.limiter.schedule(() => {
+	// Guard: an abort-while-queued task is a no-op (Step 3 folds this into Subagent.start()).
+	if (record.status !== "queued") return Promise.resolve();
+	return record.run();
+});
+// foreground or bypassQueue:
+record.promise = record.run();
+```
+This is Tell-Don't-Ask toward the limiter: the manager hands it work, the limiter decides timing.
+The status guard replaces `dequeue` — an aborted queued agent (status `stopped`) becomes a no-op when its slot finally opens.
+### Manager lifecycle methods
+- `buildObserver` — drop the `markStarted` (in `onStarted`) and `markFinished` (in `onRunFinished`) relays; `onRunFinished` keeps the background `onSubagentCompleted` dispatch.
+- `abort(id)` — for a `queued` agent, just `record.markStopped()` (no `dequeue`); otherwise `record.abort()`.
+- `abortAll()` — iterate agents: `markStopped()` each `queued` agent (count it), else `record.abort()`; then `this.limiter.clear()` to drop pending thunks (their promises resolve).
+- `waitForAll()` — every spawned agent has a `promise`, so the manual `drain()` loop collapses:
+  ```typescript
+  async waitForAll(): Promise<void> {
+   let pending = this.pendingPromises();
+   while (pending.length > 0) {
+    await Promise.allSettled(pending);
+    pending = this.pendingPromises();
+   }
+  }
+  private pendingPromises(): Promise<void>[] {
+   return [...this.agents.values()]
+    .filter(r => r.status === "running" || r.status === "queued")
+    .map(r => r.promise)
+    .filter((p): p is Promise<void> => p != null);
+  }
+  ```
+  The re-check loop is no longer `while (true)` and no longer drives scheduling — the limiter auto-starts queued agents as slots free, so a single `allSettled` covers the queued case.
+  The loop survives only to catch agents spawned *during* the wait.
+  The `eslint-disable @typescript-eslint/no-unnecessary-condition` is deleted.
+- `dispose()` — `this.limiter.clear()` (unchanged in intent).
+### `index.ts` wiring
+```typescript
+const settings = new SettingsManager({
+	// …
+	onMaxConcurrentChanged: () => limiter.recheck(), // forward-ref closure (settings → limiter); benign
+});
+settings.load();
+// …
+const limiter = new ConcurrencyLimiter(() => settings.maxConcurrent);
+const manager = new SubagentManager({ /* … */ limiter, /* … */ });
+```
+The only surviving forward reference is `settings → limiter` (a runtime-only closure, the same shape as today's `settings → queue.drain`).
+The `limiter → manager` back-edge (the `startAgent` callback and its explanatory comment) is **deleted entirely** — that is the structural win.
+### Edge cases
+- **Abort while queued** — `markStopped()` flips status; the scheduled thunk, when run, returns `Promise.resolve()` (no-op), settling `record.promise`.
+- **Limit decreased below active count** — `recheck()` simply starts nothing (`active < getLimit()` is false); in-flight tasks finish normally.
+- **Limit increased** — `onMaxConcurrentChanged → limiter.recheck()` starts newly-admissible pending tasks.
+- **`clear()` with in-flight tasks** — only *pending* tasks are dropped; running tasks complete and `active--` on settle.
+## Module-Level Changes
+| File                                         | Change                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
+| -------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `src/lifecycle/concurrency-limiter.ts`       | Add — new `ConcurrencyLimiter` (`schedule`, `recheck`, `clear`).                                                                                                                                                                                                                                                                                                                                                                                                              |
+| `src/lifecycle/concurrency-queue.ts`         | Remove — replaced by the limiter.                                                                                                                                                                                                                                                                                                                                                                                                                                             |
+| `src/lifecycle/subagent-manager.ts`          | Change — import limiter; `SubagentManagerOptions.queue` → `limiter: ConcurrencyLimiter` and the private field; drop `markStarted`/`markFinished` from `buildObserver`; `spawn` schedules a status-guarded thunk; `abort` drops `dequeue`; `abortAll` iterates agents + `limiter.clear()`; `waitForAll` simplified (add `pendingPromises` helper, delete the `while (true)` loop and its `eslint-disable`); `dispose` calls `limiter.clear()`; update the file-header comment. |
+| `src/lifecycle/subagent.ts`                  | Change — `abort()` docstring: remove the `ConcurrencyQueue.dequeue()` reference (queue removal is now a status-guard no-op).                                                                                                                                                                                                                                                                                                                                                  |
+| `src/index.ts`                               | Change — import `ConcurrencyLimiter`; construct it as `new ConcurrencyLimiter(() => settings.maxConcurrent)`; `onMaxConcurrentChanged: () => limiter.recheck()`; delete the `startAgent` callback and its forward-ref comment; inject `limiter` into the manager.                                                                                                                                                                                                             |
+| `test/lifecycle/concurrency-limiter.test.ts` | Add — limiter unit tests (no `startAgent` mock).                                                                                                                                                                                                                                                                                                                                                                                                                              |
+| `test/lifecycle/concurrency-queue.test.ts`   | Remove — the queue is gone.                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
+| `test/lifecycle/subagent-manager.test.ts`    | Change — `createManager` constructs a `ConcurrencyLimiter`; delete the forward-ref `let mgr` + `prefer-const` `eslint-disable`; drop the unused `queue` field from the returned object.                                                                                                                                                                                                                                                                                       |
+| `docs/architecture/architecture.md`          | Change — Mermaid lifecycle node (`ConcurrencyQueue<br/>(scheduling, drain)` → `ConcurrencyLimiter<br/>(thunk admission gate)`); layout listing (`concurrency-queue.ts` → `concurrency-limiter.ts`); "What the core owns" bullet; mark roadmap Step 1 done; fix the Step 7 ([#378]) target filename reference.                                                                                                                                                                 |
+| `.pi/skills/package-pi-subagents/SKILL.md`   | Change — lifecycle-domain table: `concurrency-queue.ts` → `concurrency-limiter.ts` and adjust the "scheduling" wording to "concurrency admission".                                                                                                                                                                                                                                                                                                                            |
+Verified by grep that no other `src/`, `test/`, `docs/` (excluding `docs/architecture/history/` and prior plans/retros, which are historical), or `.pi/skills/` file references `ConcurrencyQueue`, `concurrency-queue`, `enqueue`, `dequeue`, `markStarted`/`markFinished` (queue), `drain`, `isFull`, or `queuedIds` for this queue.
+`SKILL.md` line 80 (Phase 15 history) keeps `ConcurrencyQueue` — it is a historical record, not current state.
+## Test Impact Analysis
+1. **New tests the change enables.**
+   `ConcurrencyLimiter` is a pure thunk scheduler with no agent/manager knowledge, so it is unit-testable with plain `() => Promise<void>` tasks and `Promise.withResolvers` gates — no `startAgent` mock, no re-entrant `markStarted` simulation.
+   New coverage: FIFO start order; slot gating (only `limit` tasks run concurrently); `active` decrement frees a slot for the next pending task on settle; `recheck()` starts newly-admissible tasks when the limit grows; dynamic limit re-evaluation; `clear()` resolves pending promises without running their tasks; a task that rejects still frees its slot.
+2. **Tests that become redundant.**
+   The entire `test/lifecycle/concurrency-queue.test.ts` (`isFull`, `enqueue`/`dequeue`, `markStarted`/`markFinished`, `drain`, auto-drain, `clear`, `queuedIds`) — those methods no longer exist; the limiter tests replace them at a cleaner seam.
+3. **Tests that stay as-is (genuinely exercise the layer).**
+   The `SubagentManager — queueing and concurrency with injected stubs` describe block asserts manager-level behavior (queued → running transition order, abort-while-queued never runs the factory, `onSubagentStarted` fires on the queued → running transition).
+   These remain valid against the manager + limiter integration and need only the `createManager` helper change (construct a `ConcurrencyLimiter`), not a behavioral rewrite.
+   The `clearCompleted does not remove running or queued agents` test (maxConcurrent=1, blocking factory) also stays.
+## TDD Order
+Priority = preparatory addition first, then the atomic interface swap, then docs.
+1. **Add `ConcurrencyLimiter` (red → green).**
+   Surface: new `test/lifecycle/concurrency-limiter.test.ts` against new `src/lifecycle/concurrency-limiter.ts`.
+   Covers FIFO start order, slot gating, `active`-frees-slot-on-settle, `recheck()` on limit growth, dynamic limit, `clear()` resolves pending without running, reject-frees-slot.
+   Pure addition — `ConcurrencyQueue` still exists and its tests still pass; the suite stays green.
+   Commit: `feat(pi-subagents): add ConcurrencyLimiter (#381)`.
+2. **Migrate `SubagentManager`, `index.ts`, and the manager test helper to the limiter; delete the queue (red → green).**
+   Surface: `src/lifecycle/subagent-manager.ts`, `src/index.ts`, `src/lifecycle/subagent.ts` (docstring), `test/lifecycle/subagent-manager.test.ts`, and deletion of `src/lifecycle/concurrency-queue.ts` + `test/lifecycle/concurrency-queue.test.ts`.
+   This is one atomic commit: changing `SubagentManagerOptions.queue` → `limiter` breaks both call sites (`index.ts` and the test helper) at the type level simultaneously, and the old test file imports the deleted source — all must land together.
+   Drop the observer relays, the `dequeue`/`drain`/`isFull`/`queuedIds` usage, the `while (true)` loop + its `eslint-disable`, and the test helper's forward-ref `eslint-disable`.
+   Run `pnpm run check` immediately after (shared-interface change with multiple call sites), then the full `pnpm --filter @gotgenes/pi-subagents exec vitest run` (the queueing/concurrency integration tests must still pass).
+   Commit: `refactor(pi-subagents): replace ConcurrencyQueue with thunk-based ConcurrencyLimiter (#381)`.
+3. **Update architecture doc and package skill (docs).**
+   Surface: `docs/architecture/architecture.md` (Mermaid node, layout listing, "What the core owns" bullet, roadmap Step 1 marked done, Step 7 filename reference) and `.pi/skills/package-pi-subagents/SKILL.md` (lifecycle-domain table entry + wording).
+   Commit: `docs(pi-subagents): update architecture and skill for ConcurrencyLimiter (#381)`.
+## Risks and Mitigations
+| Risk                                                                   | Mitigation                                                                                                                                                                                                    |
+| ---------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| A dropped pending thunk leaves `record.promise` forever pending.       | `clear()` resolves dropped tasks' promises; the limiter's contract is that every `schedule()` promise settles.                                                                                                |
+| `waitForAll` could spin or miss queued agents.                         | Queued agents now carry real promises, so a single `Promise.allSettled` covers them; the bounded re-check loop only catches agents spawned during the wait, and terminates when `pendingPromises()` is empty. |
+| An abort-while-queued no-op thunk briefly occupies a slot.             | The thunk returns a synchronously-resolved promise; `active++`/`active--` round-trip in one microtask and `recheck()` immediately pulls the next task — negligible.                                           |
+| Renaming the file/class leaves stale references.                       | Grep-verified inventory in Module-Level Changes; the migration deletes the source and its test in the same commit; docs updated in step 3.                                                                    |
+| `bypassQueue` public-surface name now slightly misnames the mechanism. | Out of scope (breaking); recorded in Open Questions.                                                                                                                                                          |
+## Open Questions
+- Should `bypassQueue` be renamed (e.g. `bypassLimiter`) for accuracy?
+  It is public type surface, so a rename is breaking and belongs in its own change — defer.
+- Should the `code-design` "narrow interface, not concrete class" guidance be applied to the manager's `limiter` field (typed as `{ schedule; clear }` rather than the concrete `ConcurrencyLimiter`)?
+  Tests construct a real limiter (it is pure and trivially constructible), so no mock-cast pressure exists today; keep the concrete type to match the issue and existing pattern, and revisit only if a test needs to substitute it.
+[#373]: https://github.com/gotgenes/pi-packages/issues/373
+[#374]: https://github.com/gotgenes/pi-packages/issues/374
+[#378]: https://github.com/gotgenes/pi-packages/issues/378
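
The scheduling contract this plan describes (synchronous start when a slot is free, FIFO admission, `recheck()` after the limit grows, and `clear()` settling dropped promises) can be exercised with a small standalone sketch. It restates the plan's `ConcurrencyLimiter` in compact form so that it runs on its own. The `deferred` helper is a hypothetical stand-in for `Promise.withResolvers`, and `demo`, the task labels, and the log strings are illustrative assumptions, not part of the package or its test suite.

```typescript
// Sketch only: compact restatement of the plan's ConcurrencyLimiter plus a demo.
// `deferred` and `demo` are hypothetical names introduced for illustration.
type Deferred = { promise: Promise<void>; resolve: () => void; reject: (e: unknown) => void };

// Stand-in for Promise.withResolvers<void>() so the sketch runs on older runtimes.
function deferred(): Deferred {
  let resolve!: () => void;
  let reject!: (e: unknown) => void;
  const promise = new Promise<void>((res, rej) => {
    resolve = res;
    reject = rej;
  });
  return { promise, resolve, reject };
}

class ConcurrencyLimiter {
  private active = 0;
  private readonly pending: Array<{ start: () => void; settle: () => void }> = [];
  private readonly getLimit: () => number;

  constructor(getLimit: () => number) {
    this.getLimit = getLimit;
  }

  // Queue a task; the returned promise settles when the task settles or is dropped.
  schedule(task: () => Promise<void>): Promise<void> {
    const d = deferred();
    this.pending.push({
      start: () => {
        this.active++;
        void task().then(d.resolve, d.reject).finally(() => {
          this.active--;
          this.recheck();
        });
      },
      settle: d.resolve,
    });
    this.recheck();
    return d.promise;
  }

  // Start pending tasks in FIFO order until the limit is reached.
  recheck(): void {
    while (this.active < this.getLimit()) {
      const next = this.pending.shift();
      if (!next) break;
      next.start();
    }
  }

  // Drop pending tasks, resolving their promises without running them.
  clear(): void {
    for (const task of this.pending.splice(0)) task.settle();
  }
}

async function demo(): Promise<string[]> {
  const log: string[] = [];
  let limit = 1;
  const limiter = new ConcurrencyLimiter(() => limit);
  const gateA = deferred();
  const gateB = deferred();

  const a = limiter.schedule(() => { log.push("start A"); return gateA.promise; });
  const b = limiter.schedule(() => { log.push("start B"); return gateB.promise; });
  const c = limiter.schedule(() => { log.push("start C"); return Promise.resolve(); });
  log.push("scheduled"); // only A has started so far (limit is 1)

  limit = 2;
  limiter.recheck(); // the limit grew, so B is admitted; C keeps waiting
  limiter.clear();   // C is dropped and its promise resolves without running
  await c;
  log.push("C settled without running");

  gateA.resolve();
  gateB.resolve();
  await Promise.all([a, b]);
  return log;
}

void demo().then((log) => console.log(log.join(" | ")));
```

Under these assumptions the log reads `start A`, `scheduled`, `start B`, `C settled without running`: A starts synchronously inside `schedule()`, B starts only once the limit grows and `recheck()` runs, and C's promise settles through `clear()` although its thunk never executes.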

package/docs/plans/0403-abort-subagents-on-interrupt.md ADDED

@@ -0,0 +1,180 @@
+---
+issue: 403
+issue_title: "Pressing Escape does not stop subagent/background agent"
+---
+# Abort subagents on parent interrupt (ESC)
+## Problem Statement
+A user reports that pressing Escape in the Pi terminal to cancel the current work does not stop a running subagent — the agent keeps going despite the cancel request.
+The reporter is a third party (`khalid244`); the operator confirmed the direction is to implement ESC-to-abort for both foreground and background subagents, aborting all running and queued background agents on a single ESC.
+The root cause splits cleanly by execution mode:
+1. Foreground subagents already receive the parent's abort signal through the tool boundary (`tool.execute(signal)` → `Subagent.wireSignal` → `abort()` → child `session.abort()`), so they should already stop on ESC.
+2. Background subagents are detached by design: `spawnBackground()` never forwards the parent signal, and `manager.abortAll()` runs only on `session_shutdown`.
+   There is no wiring from a parent interrupt to background-agent abort, so ESC does nothing to them.
+   This is the reproducible bug.
+## Goals
+- Pressing ESC (the parent agent-loop interrupt) aborts all running and queued background subagents.
+- Add a regression guard test proving a foreground subagent's child session is aborted when the parent signal fires.
+- Reuse the existing `manager.abortAll()` semantics (abort running, mark queued stopped, clear the limiter) so ESC stops every active subagent in one action.
+This is an intentional behavior change: background subagents that previously survived ESC will now stop.
+It is a bug fix (`fix:`), not a breaking change — no config key, default value, or output shape changes, and detached-survives-ESC was a limitation rather than a contract.
+## Non-Goals
+- Selective or interactive abort (choosing which agent to stop) — out of scope.
+- A dedicated `abortBackground()` that excludes foreground agents — `abortAll()` is reused; foreground agents are already aborted by their own signal wiring, so the overlap is redundant-but-harmless.
+- Changing background-agent detachment for any path other than the ESC interrupt (e.g., the tool still returns immediately on spawn).
+- Confirmation prompts or status messaging on abort.
+## Background
+Relevant modules and the verified runtime facts behind the design:
+- `src/tools/foreground-runner.ts` — `runForeground(..., signal, ...)` forwards the parent `signal` into `manager.spawnAndWait({ signal })`.
+- `src/lifecycle/subagent.ts` — `run()` calls `this.wireSignal(this.execution.signal, () => this.abort())`; `abort()` fires `abortController.abort()` and marks the record stopped.
+- `src/lifecycle/subagent-session.ts` — `runTurnLoop` calls `forwardAbortSignal(session, opts.signal)`, which calls `session.abort()` when the signal fires.
+- `src/tools/background-spawner.ts` — `spawnBackground()` omits `signal` entirely; background agents are detached.
+- `src/lifecycle/subagent-manager.ts` — `abortAll()` aborts running, marks queued stopped, and clears the limiter; currently called only from `src/handlers/lifecycle.ts` on shutdown.
+- `src/handlers/tool-start.ts`, `src/handlers/lifecycle.ts`, `src/handlers/index.ts` — the existing `handlers/` pattern: small classes with a narrow injected interface, registered in `index.ts`.
+Verified SDK facts (from the pinned peer deps under `node_modules/@earendil-works/`):
+- The interactive ESC handler calls `agent.abort()` while streaming (`pi-coding-agent` `interactive-mode.js`, `restoreQueuedMessagesToEditor({ abort: true })`).
+- `pi-agent-core` `agent.js`: each run creates a fresh `AbortController`; `agent.abort()` calls `activeRun.abortController.abort()`; on normal completion `finishRun()` discards the controller **without** aborting it.
+  Therefore the parent signal's `abort` event fires only on a real interrupt, never on normal turn completion — latching `abortAll()` to it will not spuriously kill background agents at turn end.
+- The signal passed to `tool.execute(...)` (`agent-loop.js` line ~419) is that same per-run signal.
+- Extensions read the live per-run parent signal via `ctx.signal` (`ExtensionContext.signal: AbortSignal | undefined`, undefined when idle).
+- `pi.on("turn_start", (event, ctx) => ...)` is a registered event whose handler receives `ExtensionContext`; `turn_start` fires once at the start of every turn while streaming, so its `ctx.signal` is always the current run's signal.
+AGENTS.md constraint: pi-subagents is a minimal core with dependency arrows pointing inward.
+The new handler depends only on a narrow manager interface; no consumer knowledge leaks into the manager.
+## Design Overview
+Add a small `InterruptHandler` that latches the current parent abort signal and, on abort, tells the manager to abort all subagents.
+Drive it from `turn_start` so the latch always tracks the live per-run signal — including across runs and turns that execute no tools.
+Why `turn_start` rather than `tool_execution_start`: a background agent can outlive the run that spawned it.
+If the user later interrupts a turn that ran no subagent tool, only a turn-level latch still holds that run's signal.
+`turn_start` fires every turn with the current `ctx.signal`, so the latch is always current.
+The latch deduplicates by reference: turns within the same run reuse the same signal (no-op); a new run's signal triggers a detach-and-rewire.
+The `abort` listener is `{ once: true }`; on normal completion the run's `AbortController` is discarded and garbage-collected with its listener, and the next `turn_start` detaches the stale reference.
+### Manager interface (narrow, Tell-Don't-Ask)
+```typescript
+/** Narrow manager interface — only the method the interrupt handler calls. */
+export interface InterruptManager {
+  abortAll(): number;
+}
+/** Minimal context shape — only the field the handler reads. */
+interface InterruptCtx {
+  signal: AbortSignal | undefined;
+}
+```
+### Handler
+```typescript
+export class InterruptHandler {
+  private latched?: AbortSignal;
+  private detach?: () => void;
+  constructor(private readonly manager: InterruptManager) {}
+  handleTurnStart(ctx: InterruptCtx): void {
+    const signal = ctx.signal;
+    if (signal === this.latched) return;
+    this.detach?.();
+    this.detach = undefined;
+    this.latched = signal;
+    if (!signal) return;
+    const onAbort = (): void => {
+      this.manager.abortAll();
+    };
+    signal.addEventListener("abort", onAbort, { once: true });
+    this.detach = () => signal.removeEventListener("abort", onAbort);
+  }
+}
+```
+### Consumer call site (`index.ts`)
+```typescript
+const interrupt = new InterruptHandler(manager);
+pi.on("turn_start", (_event, ctx) => interrupt.handleTurnStart(ctx));
+```
+The handler talks to `manager` through a one-method interface, reads one field of `ctx`, and performs no chained access — no Law-of-Demeter or output-argument smells.
+The latch state (current signal, detach handle) is owned by the handler.
+### Edge cases
+- Same signal across consecutive turns → reference equality short-circuits; no listener churn.
+- `ctx.signal` undefined (idle, defensive) → detach the old listener and hold no signal.
+- Signal already aborted when latched → the `{ once: true }` listener never fires, because an `abort` event is not re-dispatched to listeners added after the signal has aborted; the prior signal's listener already ran `abortAll()`, so no agent is missed.
+- ESC during a foreground subagent → the foreground agent is aborted twice (once via its own `wireSignal`, once via `abortAll`); `abort()` is guarded by status and `markStopped` is idempotent, so this is harmless.
+## Module-Level Changes
+- `src/handlers/interrupt.ts` (new) — `InterruptHandler` class and `InterruptManager` interface.
+- `src/handlers/index.ts` — add `export { InterruptHandler } from "#src/handlers/interrupt";`.
+- `src/index.ts` — instantiate `new InterruptHandler(manager)` and register `pi.on("turn_start", (_event, ctx) => interrupt.handleTurnStart(ctx))`.
+- `src/lifecycle/subagent-manager.ts` — no code change; `abortAll()` is reused.
+  Its `// fallow-ignore-next-line unused-class-member` comment stays (it is still reached only through narrow interfaces that fallow does not trace); the pre-completion `fallow dead-code` check will confirm.
+- `docs/architecture/architecture.md` — extend the `handlers/` directory listing (around line 354) with `interrupt.ts` (turn_start handler → abort all subagents on interrupt).
+  Check the same file for any handler file-count or complexity row that names the `handlers/` domain and update if present.
+No exports are removed or renamed.
+Grep confirms `.pi/skills/package-pi-subagents/SKILL.md` does not mention `abortAll`, interrupt, or ESC, so no skill update is required.
+## Test Impact Analysis
+This is a feature/fix addition, not an extraction, so no existing tests become redundant.
+1. New unit tests enabled — `InterruptHandler`: latches the current signal, fires `abortAll()` on abort, dedups the same signal reference, re-wires on a new signal, and handles an undefined signal.
+2. New integration guard — foreground abort: aborting the parent signal passed to `runTurnLoop` invokes the child `session.abort()`.
+   This pins the currently-untested foreground link in `forwardAbortSignal`.
+3. Existing tests stay as-is — `test/lifecycle/subagent.test.ts` (`wireSignal`, `abort`), `test/lifecycle/subagent-session.test.ts` (max-turns abort path), and `test/handlers/lifecycle.test.ts` (`abortAll` on shutdown) continue to exercise their layers unchanged.
+## TDD Order
+1. Foreground guard — `test/lifecycle/subagent-session.test.ts`.
+   Add a test: when the `signal` passed to `runTurnLoop` aborts while `session.prompt` is in flight, `session.abort()` is called.
+   Expected to pass immediately (proving the foreground chain already works); if the trace is wrong and it fails, fix `forwardAbortSignal` in `src/lifecycle/subagent-session.ts`.
+   Commit `test: guard foreground subagent abort on parent signal (#403)` (or `fix:` if a code fix is needed).
+2. Interrupt handler + wiring — `test/handlers/interrupt.test.ts` (new) → `src/handlers/interrupt.ts`, `src/handlers/index.ts`, `src/index.ts`.
+   Red: write the handler unit tests (latch, abort→abortAll, dedup, re-wire, undefined signal) against the not-yet-existing class.
+   Green: implement `InterruptHandler` + `InterruptManager`, export from the barrel, and register `pi.on("turn_start", ...)` in `index.ts`.
+   The handler, its test, and the composition-root wiring land together because the handler is inert without the registration.
+   Commit `fix: abort all subagents on parent interrupt (#403)`.
+3. Architecture doc — `docs/architecture/architecture.md`.
+   Add `interrupt.ts` to the `handlers/` directory listing and update any handler-domain count/row if present.
+   Commit `docs: note interrupt handler in subagents architecture (#403)`.
+## Risks and Mitigations
+- ESC now stops background agents the user might have wanted to keep running.
+  Mitigation: this is the operator's explicit choice (abort all running + queued); the behavior is documented in the plan and reflected in the `fix:` commit body.
+- Re-latching on every `turn_start` could add overhead.
+  Mitigation: the latch is a single reference comparison and short-circuits on the common same-signal case.
+- A `{ once: true }` listener lingers on a signal that completes normally.
+  Mitigation: the run's `AbortController` is discarded and GC'd with its listener; the next `turn_start` detaches the stale handle.
+- Non-interactive modes (print/rpc) may not emit `turn_start` the same way.
+  Mitigation: ESC interrupt is an interactive concern; the handler is a no-op when no signal is present.
+## Open Questions
+- Should a dedicated `abortBackground()` (excluding foreground) replace `abortAll()` here?
+  Deferred: `abortAll()` is simpler and foreground is already signal-aborted; revisit only if the redundant double-abort proves problematic.
+- Should ESC abort surface a confirmation or status message?
+  Deferred: out of scope for this fix.
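The latch behaviour described in the plan above (dedup by reference, detach-and-rewire on a new run, no-op on an undefined or already-aborted signal) can be exercised with a small standalone sketch. The `InterruptHandler` below is adapted from the plan's listing, with the constructor parameter property expanded into an explicit field. `CountingManager` and `runScenario` are hypothetical names introduced only for illustration and are not part of the package. The sketch assumes a runtime that provides the global `AbortController`, such as a recent Node.js.

```typescript
interface InterruptManager {
  abortAll(): number;
}
interface InterruptCtx {
  signal: AbortSignal | undefined;
}

// Adapted from the plan's listing; behaviour is intended to be identical.
class InterruptHandler {
  private latched?: AbortSignal;
  private detach?: () => void;
  private readonly manager: InterruptManager;
  constructor(manager: InterruptManager) {
    this.manager = manager;
  }
  handleTurnStart(ctx: InterruptCtx): void {
    const signal = ctx.signal;
    if (signal === this.latched) return; // same run: nothing to rewire
    this.detach?.(); // drop the listener on the previous run's signal
    this.detach = undefined;
    this.latched = signal;
    if (!signal) return; // idle: hold no signal
    const onAbort = (): void => {
      this.manager.abortAll();
    };
    signal.addEventListener("abort", onAbort, { once: true });
    this.detach = () => signal.removeEventListener("abort", onAbort);
  }
}

// Hypothetical fake: counts abortAll() calls instead of stopping agents.
class CountingManager implements InterruptManager {
  calls = 0;
  abortAll(): number {
    this.calls += 1;
    return 0;
  }
}

// Hypothetical driver: returns the abortAll() call count after each step.
function runScenario(): number[] {
  const manager = new CountingManager();
  const handler = new InterruptHandler(manager);
  const counts: number[] = [];

  const runA = new AbortController();
  const runB = new AbortController();
  handler.handleTurnStart({ signal: runA.signal }); // latch run A
  handler.handleTurnStart({ signal: runB.signal }); // rewire to run B
  handler.handleTurnStart({ signal: runB.signal }); // same reference: no-op

  runA.abort(); // stale signal, its listener was detached
  counts.push(manager.calls); // 0
  runB.abort(); // live signal, exactly one listener fires
  counts.push(manager.calls); // 1

  handler.handleTurnStart({ signal: undefined }); // idle
  counts.push(manager.calls); // 1

  const runC = new AbortController();
  runC.abort();
  handler.handleTurnStart({ signal: runC.signal }); // already aborted: no fire
  counts.push(manager.calls); // 1
  return counts;
}

console.log(runScenario()); // [ 0, 1, 1, 1 ]
```

The third `handleTurnStart` call is the one that shows the dedup: each call creates a fresh `onAbort` closure, so without the reference check a second listener would be attached to run B and `abortAll()` would run twice on abort.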

package/docs/retro/0373-extract-subagent-state.md ADDED

@@ -0,0 +1,94 @@
+---
+issue: 373
+issue_title: "Extract SubagentState; make Subagent execution deps mandatory"
+---
+# Retro: #373 — Extract SubagentState; make Subagent execution deps mandatory
+## Stage: Planning (2026-06-14T03:34:51Z)
+### Session summary
+Produced the implementation plan at `packages/pi-subagents/docs/plans/0373-extract-subagent-state.md`.
+The architecture doc (Phase 17 Step 2 + "First-principles refinement") already specified the design precisely and the issue body matched it, so planning was confirmation-and-detailing rather than discovery.
+Issue is first-party (`gotgenes`) and unambiguous — skipped the `ask_user` gate.
+### Observations
+- **Not breaking** for the published surface: `src/service/service.ts` exposes `SubagentRecord`/`SubagentStatus`/spawn-config, never `SubagentInit` or the `Subagent` constructor.
+  Only the internal constructor signature changes.
+- **Single production construction site** confirmed: `SubagentManager.spawn` (~line 139) is the only `new Subagent(...)` outside tests — this is what makes mandatory execution deps viable.
+- **Observer retarget is required**, not optional: making execution mandatory would otherwise force `record-observer.test.ts` to stub execution.
+  Pointing `subscribeSubagentObserver` at `SubagentState` (and dropping the record from `onCompact`, closing over `this` in `subagent.ts`) is the move that lets observer tests target `SubagentState` directly.
+- **`resume()`'s missing-session throw stays** — it guards a genuine runtime state, not a construction concern.
+  Only the two `run()` "not configured for execution" throws are deleted.
+- **`SubagentStatus` home**: moved to `subagent-state.ts` but re-exported from `subagent.ts` to keep `service.ts`'s import path (and the public type bundle path) unchanged, and to avoid a circular import.
+- **Lift-and-shift for the large test file**: `test/lifecycle/subagent.test.ts` (~700 LOC).
+  Step 1 funnels constructions through a local helper and moves the state-machine `describe` blocks to the new `subagent-state.test.ts`, so Step 3's mandatory-execution flip is bounded to the helper + two run/resume factories.
+  Step 3 is unavoidably one atomic commit (removing optional fields breaks every construction at the type level at once).
+- **Doc updates identified**: `architecture.md` (lifecycle file listing, `Subagent` class diagram, mark Step 2 ✅ Complete, Phase 17 prose ~line 879, type-complexity table ~line 649) and `SKILL.md` (Lifecycle 10→11 modules, total 56→57 files).
+- Deferred per scope boundary: metrics-as-projection and result-delivery domain extraction (the other two of the four conflated domains).
+## Stage: Implementation — TDD (2026-06-14T09:23:00Z)
+### Session summary
+Executed all four planned steps as separate commits: (1) extract `SubagentState` value object + new `subagent-state.test.ts`, (2) retarget `subscribeSubagentObserver` at `SubagentState`, (3) the atomic flip making `SubagentExecution` a mandatory collaborator and deleting the two `run()` throws, (4) docs.
+Test count moved 966 → 967 (net): +26 new `SubagentState` tests, minus the migrated state-machine duplicates and the obsolete missing-factory test.
+Pre-completion reviewer returned **PASS**; `check`/`lint`/`test`/`fallow` all clean.
+### Observations
+- The plan held exactly — every file in Module-Level Changes was touched and nothing else.
+  The `createTestSubagent` consumers (`conversation-viewer`, `notification`, `get-result-tool`, `make-subagent.test`) stayed untouched as predicted; the helper absorbed the construction change via a `TestSubagentOptions` shape that splits passive-state shorthands from identity/execution.
+- **Explicit-`undefined` preservation** (testing-skill warning) mattered: `createTestSubagent` and the local `makeSubagent` build their `SubagentState` via spread of the rest-captured state overrides (`{ defaults, ...stateOverrides }`) so callers passing `completedAt: undefined` (running-status records in `get-result-tool.test`) still get `undefined`, not the `2000` default.
+- The lift-and-shift prep in Step 1 (local `makeSubagent` helper + perl-routing the single-line constructions) paid off: Step 3's breaking flip only had to edit the helper, `createRunnableAgent`, `createResumableAgent`, `createCompletionAgent`, and the constructor describe — not the whole file.
+- Removed the obsolete "throws when the session factory is missing" test (the guard is gone by construction); the construct-complete invariant is now type-level, not runtime-testable.
+  An initial replacement comment was dropped per reviewer/operator feedback as unhelpful.
+- `SubagentExecution` carries 12 fields (4 mandatory).
+  Reviewer flagged it as wide but accepted per the plan's recorded decision to keep it concrete rather than split further.
+- Pre-completion reviewer: **PASS** (no WARN findings).
+## Stage: Final Retrospective (2026-06-14T17:20:00Z)
+### Session summary
+Shipped #373 end-to-end across one conversation spanning Planning → TDD → Ship → Retro: four implementation commits, CI green, issue closed, no release-please PR (a `refactor:`-only change does not trigger a release).
+The plan held exactly — zero rework, and the pre-completion reviewer returned PASS with nothing to fix.
+The single user intervention was a one-line comment removal during TDD Step 3.
+### Observations
+#### What went well
+- **Plan-to-ship with zero rework.**
+  Every file in the plan's Module-Level Changes was touched and nothing else; the `createTestSubagent` consumers stayed untouched exactly as predicted.
+  The lift-and-shift prep (Step 1 funneling constructions through a local `makeSubagent` helper) bounded the breaking Step 3 flip to the helper plus three factories — the atomic-construction-change concern from the plan never materialized as churn.
+- **Clean model allocation across stages.**
+  Planning ran on `claude-opus-4-8`, TDD on `claude-sonnet-4-6`, Ship on `opencode-go/deepseek-v4-flash` (mechanical git/CI/close work), the pre-completion reviewer subagent on `claude-sonnet-4-6`, and Retro on `claude-opus-4-8`.
+  Judgment-heavy work landed on reasoning-strong models; the cheap model handled only the mechanical ship sequence.
+- **Incremental verification.** `pnpm run check` ran after every TDD step (not just at the end), catching the shared-type breakage at the right boundary; the affected test files were run per-step before the full suite.
+#### What caused friction (agent side)
+- `other` (tombstone comment) — after removing the obsolete "throws when the session factory is missing" test in TDD Step 3, left a comment narrating the *absence* of the guard (`// No "missing session factory" guard: execution is a mandatory constructor collaborator …`).
+  The user flagged it as unhelpful and asked for removal.
+  Impact: one extra `Edit` + a blank-line cleanup + a `--amend` of the Step 3 commit.
+  No behavioral rework; user-caught.
+#### What caused friction (user side)
+- None of consequence.
+  The single intervention (comment removal) was light mechanical oversight on an otherwise self-driving session; no earlier context would have changed the outcome.
+### Diagnostic details
+- **Model-performance correlation** — no mismatch.
+  The only subagent dispatch (pre-completion-reviewer) ran on `claude-sonnet-4-6`, appropriate for judgment-heavy review; it returned PASS.
+  The Ship stage on `deepseek-v4-flash` was purely mechanical (git push, `ci_find`/`ci_watch`, `issue_close`, `release_pr_find`) and the one judgment point (the batch-vs-release `ask_user`) was handled correctly.
+- **Escalation-delay / unused-tool / feedback-loop** — nothing notable: no rabbit-holes, no error-chasing sequences, and verification ran incrementally throughout.
+  Lenses skipped.
+### Changes made
+1. `.pi/skills/code-design/SKILL.md` (§ Names over comments) — added a line forbidding tombstone comments that narrate removed code or the absence of a guard/test/branch, prompted by the user-caught over-comment in TDD Step 3.