@gotgenes/pi-subagents 16.1.0 → 16.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (18)

package/CHANGELOG.md +14 -0
package/dist/public.d.ts +19 -22
package/docs/architecture/architecture.md +52 -19
package/docs/plans/0373-extract-subagent-state.md +250 -0
package/docs/plans/0374-encapsulate-subagent-start-notification.md +268 -0
package/docs/plans/0403-abort-subagents-on-interrupt.md +180 -0
package/docs/retro/0373-extract-subagent-state.md +94 -0
package/docs/retro/0374-encapsulate-subagent-start-notification.md +38 -0
package/docs/retro/0381-replace-concurrency-queue-with-limiter.md +46 -0
package/docs/retro/0403-abort-subagents-on-interrupt.md +90 -0
package/package.json +1 -1
package/src/handlers/index.ts +1 -0
package/src/handlers/interrupt.ts +49 -0
package/src/index.ts +5 -1
package/src/lifecycle/subagent-manager.ts +22 -23
package/src/lifecycle/subagent-state.ts +156 -0
package/src/lifecycle/subagent.ts +108 -166
package/src/observation/record-observer.ts +15 -13

package/docs/plans/0374-encapsulate-subagent-start-notification.md ADDED

@@ -0,0 +1,268 @@
+---
+issue: 374
+issue_title: "Encapsulate run start and notification attachment on Subagent"
+---
+# Encapsulate Subagent.start() and read-only promise/notification
+## Problem Statement
+`Subagent.promise` is assigned from outside the class at two production sites, both in `SubagentManager.spawn()` (the scheduled and immediate paths), and in three test files; `record.notification` is assigned from outside the class in seven test sites.
+Both are output-argument smells (design-review check 3): the object should own the state its own methods read.
+`Subagent.run()` already exists; the promise that tracks it lives outside the object purely so callers can `await record.promise`.
+`notification` was already moved to the constructor in Phase 17 Step 2 (wired from `execution.parentSession?.toolCallId`), but the field is still publicly writable, so tests bypass the constructor path with direct assignment.
+## Goals
+- Add `Subagent.start()` that calls `run()`, stores the resulting promise internally, and returns it.
+- Fold the abort-while-queued status guard into `start()`, removing the inline check from `SubagentManager`.
+- Make `promise` externally read-only: private `_promise` field backed by a public `get promise()` accessor.
+- Make `notification` externally read-only: private `_notification` field backed by a public `get notification()` accessor.
+- Add `toolCallId?: string` to `TestSubagentOptions` so tests wire notification state via the constructor path without external writes.
+- Achieve grep-verifiable outcome: `\.promise =` and `\.notification =` appear only inside `subagent.ts`.
+## Non-Goals
+- Extracting `RunListeners` or workspace-bracket collaborators from `Subagent` (Phase 17 Step 4, Issue [#375]).
+- Extracting the manager observer from `index.ts` (Phase 17 Step 5, Issue [#376]).
+- Any other Phase 17 step beyond Step 3.
+## Background
+Phase 17 Step 1 ([#381]) replaced `ConcurrencyQueue` with a `ConcurrencyLimiter` — the manager now calls `this.limiter.schedule(thunk)` and stores the scheduled promise on `record.promise`.
+Phase 17 Step 2 ([#373]) extracted `SubagentState`, made `SubagentExecution` mandatory, and wired `notification` in the constructor via `execution.parentSession?.toolCallId`.
+Current external write sites after Step 2:
+| Field                 | Location                  | Count                                                                                                                    |
+| --------------------- | ------------------------- | ------------------------------------------------------------------------------------------------------------------------ |
+| `record.promise`      | `SubagentManager.spawn()` | 2 (scheduled + immediate)                                                                                                |
+| `record.promise`      | Test files                | 3 (`get-result-tool.test.ts`, `service-adapter.test.ts`, `make-subagent.test.ts`)                                        |
+| `record.notification` | Test files                | 7 (`get-result-tool.test.ts` ×2, `subagent-manager.test.ts` ×2, `service-adapter.test.ts` ×1, `notification.test.ts` ×2) |
+`SubagentManager.spawnAndWait()` and `waitForAll()` read `record.promise` via the public field — these become getter reads after the change.
+`get-result-tool.ts` reads `record.promise` to `await` it when `wait=true` — unchanged (getter).
+The `AGENTS.md` constraint that applies: **output arguments** — if a function sets a field on a received object, it is doing work that belongs inside the owning object.
+## Design Overview
+### `Subagent.start()` and the status guard
+```typescript
+private _promise?: Promise<void>;
+/** Awaitable handle to the running promise. Set by start(). */
+get promise(): Promise<void> | undefined {
+  return this._promise;
+}
+/**
+ * Start execution: call run(), store the promise, and return it.
+ * Guards against non-active states (e.g. abort-while-queued): if the agent
+ * is neither queued nor running, the promise resolves immediately (no-op).
+ */
+start(): Promise<void> {
+  if (this.status !== "queued" && this.status !== "running") {
+    this._promise = Promise.resolve();
+    return this._promise;
+  }
+  this._promise = this.run();
+  return this._promise;
+}
+```
+The guard allows:
+- `"queued"` — background agent waiting in the limiter; `run()` proceeds normally.
+- `"running"` — foreground agent (status set to `"running"` at construction in the manager); `run()` proceeds normally.
+- Any terminal state (`"stopped"`, `"error"`, `"completed"`, etc.) — agent was aborted while queued; `start()` becomes a no-op returning an immediately-resolving promise.
+This moves the inline `if (record.status !== "queued") return Promise.resolve()` guard out of the `SubagentManager` limiter callback and into `start()`.
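The guard behavior described above can be exercised in isolation. The following is a minimal sketch, assuming a stand-in class (`SketchSubagent`, hypothetical) that keeps only the fields `start()` touches; the real `Subagent.run()` does far more than this stub.

```typescript
// Hypothetical stand-in for Subagent: only the fields start() touches.
type Status = "queued" | "running" | "stopped" | "completed";

class SketchSubagent {
  status: Status;
  runCalls = 0;
  private _promise?: Promise<void>;

  constructor(status: Status) {
    this.status = status;
  }

  /** Read-only view of the promise stored by start(). */
  get promise(): Promise<void> | undefined {
    return this._promise;
  }

  // Stub for the real run(): counts invocations and marks completion.
  private async run(): Promise<void> {
    this.runCalls += 1;
    this.status = "running";
    await Promise.resolve();
    this.status = "completed";
  }

  start(): Promise<void> {
    // Terminal states (e.g. aborted while queued) become a no-op.
    if (this.status !== "queued" && this.status !== "running") {
      this._promise = Promise.resolve();
      return this._promise;
    }
    this._promise = this.run();
    return this._promise;
  }
}

// Abort-while-queued: the record was stopped before the limiter ran its thunk.
const aborted = new SketchSubagent("stopped");
const abortedPromise = aborted.start();

// Normal queued path: run() is invoked and its promise is stored.
const queued = new SketchSubagent("queued");
const queuedPromise = queued.start();
```

In both cases the value returned by `start()` is the same object later read through the `promise` getter, which is what lets `spawnAndWait()` keep awaiting `record.promise` unchanged.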
+### `SubagentManager.spawn()` after the change
+```typescript
+// Queued background path
+this.limiter.schedule(() => record.start());
+// Immediate path (foreground or bypassQueue)
+record.start();
+```
+`spawnAndWait()` continues to `await record.promise` (now uses the getter, no behavior change).
+`waitForAll()`'s `pendingPromises()` continues to read `r.promise` (via the getter, no behavior change).
+### `notification` encapsulation
+The constructor already writes to `this.notification` internally.
+After the change, the constructor writes to `this._notification`:
+```typescript
+private _notification?: NotificationState;
+get notification(): NotificationState | undefined {
+  return this._notification;
+}
+// In constructor:
+const toolCallId = init.execution.parentSession?.toolCallId;
+if (toolCallId) {
+  this._notification = new NotificationState(toolCallId);
+}
+```
+There are no production writes to `notification` outside the constructor, so only test sites need updating.
+### `TestSubagentOptions` shorthand
+Add `toolCallId?: string` so tests that need a `NotificationState` use the constructor path:
+```typescript
+// Before
+const record = createTestSubagent();
+record.notification = new NotificationState("tc-1");
+// After
+const record = createTestSubagent({ toolCallId: "tc-1" });
+```
+In `createTestSubagent`, `toolCallId` routes through `makeStubExecution({ parentSession: { toolCallId } })`.
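One way the routing could look, as a rough sketch. Every shape here (`StubExecution`, `SketchSubagent`, the simplified `NotificationState`, and the bodies of `makeStubExecution` and `createTestSubagent`) is a simplified assumption for illustration, not the package's actual helper code.

```typescript
// All shapes below are simplified stand-ins, not the package's real types.
class NotificationState {
  readonly toolCallId: string;
  constructor(toolCallId: string) {
    this.toolCallId = toolCallId;
  }
}

interface StubExecution {
  parentSession?: { toolCallId?: string };
}

interface TestSubagentOptions {
  toolCallId?: string;
}

class SketchSubagent {
  private _notification?: NotificationState;

  constructor(init: { execution: StubExecution }) {
    // Constructor path: notification is derived from the execution deps.
    const toolCallId = init.execution.parentSession?.toolCallId;
    if (toolCallId) {
      this._notification = new NotificationState(toolCallId);
    }
  }

  get notification(): NotificationState | undefined {
    return this._notification;
  }
}

function makeStubExecution(overrides: StubExecution = {}): StubExecution {
  return { ...overrides };
}

function createTestSubagent(options: TestSubagentOptions = {}): SketchSubagent {
  const { toolCallId } = options;
  // Only attach a parent session when the test asked for a tool call id.
  const execution = makeStubExecution(
    toolCallId ? { parentSession: { toolCallId } } : {},
  );
  return new SketchSubagent({ execution });
}

const withNotification = createTestSubagent({ toolCallId: "tc-1" });
const withoutNotification = createTestSubagent();
```

With this shape, a test never assigns `record.notification`; it either passes `toolCallId` and gets a wired `NotificationState`, or omits it and gets `undefined`.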
+### Tests that write `record.promise`
+- **`service-adapter.test.ts`** ("strips promise from the record" tests): the test only needs `promise` to be absent from the serialized output.
+  Since `toSubagentRecord()` already builds an explicit object without `promise`, these tests pass without any promise being set on the record.
+  Remove the `record.promise = ...` setup.
+- **`make-subagent.test.ts`** ("allows setting promise directly"): the test's intent was to verify the field was settable.
+  Replace with a test that `start()` sets `promise` internally via the stub execution.
+- **`get-result-tool.test.ts`** ("waits for promise when wait=true"): the test needs a running agent whose promise resolves and updates status to completed.
+  Replace with an execution stub where `runTurnLoop` returns `{ responseText: "Finished after wait.", aborted: false, steered: false }` and call `record.start()`.
+  The `createSubagentSessionStub()` default already resolves with `{ responseText: "done", ... }` — override `runTurnLoop` to return the expected text.
+### `subagent-manager.test.ts` notification tests (lines 82, 100)
+Tests that reproduce the race-condition bug (notification set post-spawn) become:
+```typescript
+const id = manager.spawn(STUB_SNAPSHOT, "general-purpose", "test", {
+  description: "bg",
+  isBackground: true,
+  parentSession: { toolCallId: "tc-1" },
+});
+const record = manager.getRecord(id)!;
+// notification is already wired from the constructor
+await record.promise;
+record.notification?.markConsumed();
+```
+The behavior under test (race: `markConsumed()` after `await` is too late) is unchanged.
+## Module-Level Changes
+- `src/lifecycle/subagent.ts`
+  - Remove public writable `promise?: Promise<void>` field.
+  - Add `private _promise?: Promise<void>`.
+  - Add `get promise(): Promise<void> | undefined`.
+  - Add `start(): Promise<void>` with the status guard.
+  - Rename `this.notification` write in constructor to `this._notification`.
+  - Remove public writable `notification?: NotificationState` field.
+  - Add `private _notification?: NotificationState`.
+  - Add `get notification(): NotificationState | undefined`.
+- `src/lifecycle/subagent-manager.ts`
+  - Replace `record.promise = this.limiter.schedule(() => { if (...) return ...; return record.run(); })` with `this.limiter.schedule(() => record.start())`.
+  - Replace `record.promise = record.run()` with `record.start()`.
+- `test/helpers/make-subagent.ts`
+  - Add `toolCallId?: string` to `TestSubagentOptions`.
+  - In `createTestSubagent`, map `toolCallId` to `makeStubExecution({ parentSession: { toolCallId } })`.
+- `test/helpers/make-subagent.test.ts`
+  - Replace "allows setting promise directly after construction" with a test that `start()` stores promise via the execution stub.
+- `test/tools/get-result-tool.test.ts`
+  - Replace `record.promise = Promise.resolve().then(...)` setup with a stub execution + `record.start()`.
+  - Replace `record.notification = new NotificationState("tc-1")` (×2) with `createTestSubagent({ toolCallId: "tc-1" })`.
+- `test/lifecycle/subagent-manager.test.ts`
+  - Replace `record.notification = new NotificationState("tc-1")` (×2) with spawn options carrying `parentSession: { toolCallId: "tc-1" }`.
+- `test/service/service-adapter.test.ts`
+  - Remove `record.promise = Promise.resolve()` setup (×2) from tests that only need to verify `toSubagentRecord()` strips the field.
+  - Replace `record.notification = new NotificationState("tc-1")` with `createTestSubagent({ toolCallId: "tc-1" })`.
+- `test/observation/notification.test.ts`
+  - Replace `record.notification = new NotificationState("tc-123/tc-1")` (×2) with `createTestSubagent({ toolCallId: "tc-123/tc-1" })`.
+- `docs/architecture/architecture.md`
+  - Mark Step 3 `✅ Complete` and add a "Landed" note.
+## Test Impact Analysis
+1. **New unit tests enabled**: `start()` behavior (promise stored, status guard no-op) can be tested directly in `subagent.test.ts` without touching the manager.
+2. **Existing tests simplified**: The 7 test sites that do `record.notification = ...` drop an artificial mutation and instead use the natural constructor path — the tests are shorter and closer to production semantics.
+3. **Tests that must stay**: The manager's race-condition tests (lines 74–120) verify ordering of `markConsumed()` vs `await promise` — they change setup only (spawn with toolCallId), not intent.
+4. **Tests removed**: The `make-subagent.test.ts` "allows setting promise" test is replaced, since direct write is no longer possible.
+## TDD Order
+1. **Add `Subagent.start()` alongside the existing public `promise?` field**
+   In `test/lifecycle/subagent.test.ts`, add tests:
+   - `start()` on a running agent returns a defined promise.
+   - `start()` on a stopped agent returns an already-resolved promise (no-op guard).
+   - After `start()`, `record.promise` matches the returned promise.
+   In `src/lifecycle/subagent.ts`: add `private _promise`, `get promise()` (shadowing the old field — TypeScript will require removing the duplicate; advance to step 2 immediately), and `start()`.
+   Commit: `test: add Subagent.start() tests and initial implementation (#374)`
+2. **Make `promise` read-only — remove public field, update all write sites**
+   Breaking change at the type level.
+   Atomic commit must include:
+   - `src/lifecycle/subagent.ts` — remove `promise?: Promise<void>` public field (only `private _promise` + getter remain).
+   - `src/lifecycle/subagent-manager.ts` — replace both `record.promise = ...` sites with `record.start()` calls; limiter thunk becomes `() => record.start()`.
+   - `test/helpers/make-subagent.test.ts` — replace write-promise test with `start()` test.
+   - `test/tools/get-result-tool.test.ts` — replace `record.promise = ...` setup; use execution stub + `record.start()`.
+   - `test/service/service-adapter.test.ts` — remove `record.promise = Promise.resolve()` setup (×2).
+   Run `pnpm --filter @gotgenes/pi-subagents run check` to verify.
+   Commit: `feat: make Subagent.promise read-only, add start() (#374)`
+3. **Make `notification` read-only — remove public field, update all write sites**
+   Breaking change at the type level.
+   Atomic commit must include:
+   - `src/lifecycle/subagent.ts` — rename public `notification?` to `private _notification`; add `get notification()`; constructor write becomes `this._notification = ...`.
+   - `test/helpers/make-subagent.ts` — add `toolCallId?: string` to `TestSubagentOptions`; route through `makeStubExecution`.
+   - `test/tools/get-result-tool.test.ts` — replace `record.notification = new NotificationState(...)` (×2) with `createTestSubagent({ toolCallId: ... })`.
+   - `test/lifecycle/subagent-manager.test.ts` — replace `record.notification = new NotificationState(...)` (×2) with spawn options carrying `parentSession: { toolCallId: ... }`.
+   - `test/service/service-adapter.test.ts` — replace `record.notification = new NotificationState(...)` with `createTestSubagent({ toolCallId: ... })`.
+   - `test/observation/notification.test.ts` — replace `record.notification = new NotificationState(...)` (×2) with `createTestSubagent({ toolCallId: ... })`.
+   Run `pnpm --filter @gotgenes/pi-subagents exec vitest run` and `pnpm --filter @gotgenes/pi-subagents run check`.
+   Commit: `feat: make Subagent.notification read-only, update tests (#374)`
+4. **Update architecture doc**
+   In `docs/architecture/architecture.md`, mark Step 3 `✅ Complete` and add a "Landed" note summarizing the outcome.
+   Also update the note at line 943 that says "Step 3 later folds the guard into `Subagent.start()`" to reflect it is now done.
+   Commit: `docs: mark Phase 17 Step 3 complete in architecture.md (#374)`
+## Risks and Mitigations
+- **Risk**: Adding both `private _promise` and `get promise()` while the public `promise?` field still exists is a TypeScript error (duplicate identifier).
+  **Mitigation**: Steps 1 and 2 are merged into one commit: introduce `start()`, remove the public writable field, and fix all consumers atomically.
+  The TDD order describes testing `start()` first, but both the public field removal and the consumer updates land in the same `feat:` commit.
+- **Risk**: The status guard in `start()` allows `"running"` for foreground agents, which have `status = "running"` at construction.
+  If a foreground agent is stopped before `start()` is called (edge case), `run()` would call `markRunning()` on an already-stopped agent.
+  **Mitigation**: Foreground agents are started synchronously at the end of `spawn()` — there is no window between construction and `start()` during which the abort path can fire.
+  The guard is conservative and causes no regression.
+- **Risk**: The race-condition test in `subagent-manager.test.ts` (lines 74–107) verifies that `markConsumed()` called after `await record.promise` is "too late" for the observer.
+  Switching from `record.notification = new NotificationState("tc-1")` to the constructor path does not change timing semantics.
+  **Mitigation**: The test body stays structurally identical; only the setup changes.
+- **Risk**: `service-adapter.test.ts` tests that call `record.promise = Promise.resolve()` might be testing that the field exists on the Subagent type.
+  **Mitigation**: The tests are testing `toSubagentRecord()` output, not the field type.
+  Removing the setup doesn't change the assertion.
+## Open Questions
+- None.
+  The design is fully specified by the Phase 17 Step 3 architecture note and the existing class structure.
+[#373]: https://github.com/gotgenes/pi-packages/issues/373
+[#375]: https://github.com/gotgenes/pi-packages/issues/375
+[#381]: https://github.com/gotgenes/pi-packages/issues/381

package/docs/plans/0403-abort-subagents-on-interrupt.md ADDED

@@ -0,0 +1,180 @@
+---
+issue: 403
+issue_title: "Pressing Escape does not stop subagent/background agent"
+---
+# Abort subagents on parent interrupt (ESC)
+## Problem Statement
+A user reports that pressing Escape in the Pi terminal to cancel the current work does not stop a running subagent — the agent keeps going despite the cancel request.
+The reporter is a third party (`khalid244`); the operator confirmed the direction is to implement ESC-to-abort for both foreground and background subagents, aborting all running and queued background agents on a single ESC.
+The root cause splits cleanly by execution mode:
+1. Foreground subagents already receive the parent's abort signal through the tool boundary (`tool.execute(signal)` → `Subagent.wireSignal` → `abort()` → child `session.abort()`), so they should already stop on ESC.
+2. Background subagents are detached by design: `spawnBackground()` never forwards the parent signal, and `manager.abortAll()` runs only on `session_shutdown`.
+   There is no wiring from a parent interrupt to background-agent abort, so ESC does nothing to them.
+   This is the reproducible bug.
+## Goals
+- Pressing ESC (the parent agent-loop interrupt) aborts all running and queued background subagents.
+- Add a regression guard test proving a foreground subagent's child session is aborted when the parent signal fires.
+- Reuse the existing `manager.abortAll()` semantics (abort running, mark queued stopped, clear the limiter) so ESC stops every active subagent in one action.
+This is an intentional behavior change: background subagents that previously survived ESC will now stop.
+It is a bug fix (`fix:`), not a breaking change — no config key, default value, or output shape changes, and detached-survives-ESC was a limitation rather than a contract.
+## Non-Goals
+- Selective or interactive abort (choosing which agent to stop) — out of scope.
+- A dedicated `abortBackground()` that excludes foreground agents — `abortAll()` is reused; foreground agents are already aborted by their own signal wiring, so the overlap is redundant-but-harmless.
+- Changing background-agent detachment for any path other than the ESC interrupt (e.g., the tool still returns immediately on spawn).
+- Confirmation prompts or status messaging on abort.
+## Background
+Relevant modules and the verified runtime facts behind the design:
+- `src/tools/foreground-runner.ts` — `runForeground(..., signal, ...)` forwards the parent `signal` into `manager.spawnAndWait({ signal })`.
+- `src/lifecycle/subagent.ts` — `run()` calls `this.wireSignal(this.execution.signal, () => this.abort())`; `abort()` fires `abortController.abort()` and marks the record stopped.
+- `src/lifecycle/subagent-session.ts` — `runTurnLoop` calls `forwardAbortSignal(session, opts.signal)`, which calls `session.abort()` when the signal fires.
+- `src/tools/background-spawner.ts` — `spawnBackground()` omits `signal` entirely; background agents are detached.
+- `src/lifecycle/subagent-manager.ts` — `abortAll()` aborts running, marks queued stopped, and clears the limiter; currently called only from `src/handlers/lifecycle.ts` on shutdown.
+- `src/handlers/tool-start.ts`, `src/handlers/lifecycle.ts`, `src/handlers/index.ts` — the existing `handlers/` pattern: small classes with a narrow injected interface, registered in `index.ts`.
+Verified SDK facts (from the pinned peer deps under `node_modules/@earendil-works/`):
+- The interactive ESC handler calls `agent.abort()` while streaming (`pi-coding-agent` `interactive-mode.js`, `restoreQueuedMessagesToEditor({ abort: true })`).
+- `pi-agent-core` `agent.js`: each run creates a fresh `AbortController`; `agent.abort()` calls `activeRun.abortController.abort()`; on normal completion `finishRun()` discards the controller **without** aborting it.
+  Therefore the parent signal's `abort` event fires only on a real interrupt, never on normal turn completion — latching `abortAll()` to it will not spuriously kill background agents at turn end.
+- The signal passed to `tool.execute(...)` (`agent-loop.js` line ~419) is that same per-run signal.
+- Extensions read the live per-run parent signal via `ctx.signal` (`ExtensionContext.signal: AbortSignal | undefined`, undefined when idle).
+- `pi.on("turn_start", (event, ctx) => ...)` is a registered event whose handler receives `ExtensionContext`; `turn_start` fires once at the start of every turn while streaming, so its `ctx.signal` is always the current run's signal.
+AGENTS.md constraint: pi-subagents is a minimal core with dependency arrows pointing inward.
+The new handler depends only on a narrow manager interface; no consumer knowledge leaks into the manager.
+## Design Overview
+Add a small `InterruptHandler` that latches the current parent abort signal and, on abort, tells the manager to abort all subagents.
+Drive it from `turn_start` so the latch always tracks the live per-run signal — including across runs and turns that execute no tools.
+Why `turn_start` rather than `tool_execution_start`: a background agent can outlive the run that spawned it.
+If the user later interrupts a turn that ran no subagent tool, only a turn-level latch still holds that run's signal.
+`turn_start` fires every turn with the current `ctx.signal`, so the latch is always current.
+The latch dedups by reference: most turns reuse the same signal (no-op); a new run's signal triggers a detach-and-rewire.
+The `abort` listener is `{ once: true }`; on normal completion the run's `AbortController` is discarded and garbage-collected with its listener, and the next `turn_start` detaches the stale reference.
+### Manager interface (narrow, Tell-Don't-Ask)
+```typescript
+/** Narrow manager interface — only the method the interrupt handler calls. */
+export interface InterruptManager {
+  abortAll(): number;
+}
+/** Minimal context shape — only the field the handler reads. */
+interface InterruptCtx {
+  signal: AbortSignal | undefined;
+}
+```
+### Handler
+```typescript
+export class InterruptHandler {
+  private latched?: AbortSignal;
+  private detach?: () => void;
+  constructor(private readonly manager: InterruptManager) {}
+  handleTurnStart(ctx: InterruptCtx): void {
+    const signal = ctx.signal;
+    if (signal === this.latched) return;
+    this.detach?.();
+    this.detach = undefined;
+    this.latched = signal;
+    if (!signal) return;
+    const onAbort = (): void => {
+      this.manager.abortAll();
+    };
+    signal.addEventListener("abort", onAbort, { once: true });
+    this.detach = () => signal.removeEventListener("abort", onAbort);
+  }
+}
+```
+### Consumer call site (`index.ts`)
+```typescript
+const interrupt = new InterruptHandler(manager);
+pi.on("turn_start", (_event, ctx) => interrupt.handleTurnStart(ctx));
+```
+The handler talks to `manager` through a one-method interface, reads one field of `ctx`, and performs no chained access — no Law-of-Demeter or output-argument smells.
+The latch state (current signal, detach handle) is owned by the handler.
+### Edge cases
+- Same signal across consecutive turns → reference equality short-circuits; no listener churn.
+- `ctx.signal` undefined (idle, defensive) → detach the old listener and hold no signal.
+- Signal already aborted when latched → `{ once: true }` listener does not fire; the prior signal's listener already ran `abortAll()`, so no agent is missed.
+- ESC during a foreground subagent → the foreground agent is aborted twice (once via its own `wireSignal`, once via `abortAll`); `abort()` is guarded by status and `markStopped` is idempotent, so this is harmless.
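The edge cases above can be walked through with a fake manager. This sketch repeats the handler from the plan (with the constructor parameter written as an explicit field) and drives it with plain `AbortController` instances; the counting `fakeManager` and the `run1`/`run2`/`run3` controllers are hypothetical test scaffolding, standing in for the SDK's per-run signals.

```typescript
interface InterruptManager {
  abortAll(): number;
}
interface InterruptCtx {
  signal: AbortSignal | undefined;
}

class InterruptHandler {
  private latched?: AbortSignal;
  private detach?: () => void;
  private readonly manager: InterruptManager;

  constructor(manager: InterruptManager) {
    this.manager = manager;
  }

  handleTurnStart(ctx: InterruptCtx): void {
    const signal = ctx.signal;
    if (signal === this.latched) return; // same run: nothing to rewire
    this.detach?.();
    this.detach = undefined;
    this.latched = signal;
    if (!signal) return;
    const onAbort = (): void => {
      this.manager.abortAll();
    };
    signal.addEventListener("abort", onAbort, { once: true });
    this.detach = () => signal.removeEventListener("abort", onAbort);
  }
}

// Hypothetical fake manager that only counts abortAll() calls.
let abortAllCalls = 0;
const fakeManager: InterruptManager = {
  abortAll: () => {
    abortAllCalls += 1;
    return 0;
  },
};
const handler = new InterruptHandler(fakeManager);

// Run 1: two turns share one signal, so the second call is a no-op.
const run1 = new AbortController();
handler.handleTurnStart({ signal: run1.signal });
handler.handleTurnStart({ signal: run1.signal });

// Run 2: a new signal detaches the stale listener and rewires.
const run2 = new AbortController();
handler.handleTurnStart({ signal: run2.signal });
run1.abort(); // stale signal: its listener was already removed
const afterStaleAbort = abortAllCalls;

run2.abort(); // the live signal: this is the ESC interrupt
const afterInterrupt = abortAllCalls;

// Idle context, then a signal that was aborted before being latched.
handler.handleTurnStart({ signal: undefined });
const run3 = new AbortController();
run3.abort();
handler.handleTurnStart({ signal: run3.signal });
const afterPreAborted = abortAllCalls;
```

Only the abort of the currently latched signal reaches `abortAll()`. The stale and pre-aborted signals leave the counter unchanged, because the `abort` event is dispatched once, at the moment `abort()` is called.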
+## Module-Level Changes
+- `src/handlers/interrupt.ts` (new) — `InterruptHandler` class and `InterruptManager` interface.
+- `src/handlers/index.ts` — add `export { InterruptHandler } from "#src/handlers/interrupt";`.
+- `src/index.ts` — instantiate `new InterruptHandler(manager)` and register `pi.on("turn_start", (_event, ctx) => interrupt.handleTurnStart(ctx))`.
+- `src/lifecycle/subagent-manager.ts` — no code change; `abortAll()` is reused.
+  Its `// fallow-ignore-next-line unused-class-member` comment stays (it is still reached only through narrow interfaces that fallow does not trace); the pre-completion `fallow dead-code` check will confirm.
+- `docs/architecture/architecture.md` — extend the `handlers/` directory listing (around line 354) with `interrupt.ts` (turn_start handler → abort all subagents on interrupt).
+  Check the same file for any handler file-count or complexity row that names the `handlers/` domain and update if present.
+No exports are removed or renamed.
+Grep confirms `.pi/skills/package-pi-subagents/SKILL.md` does not mention `abortAll`, interrupt, or ESC, so no skill update is required.
+## Test Impact Analysis
+This is a feature/fix addition, not an extraction, so no existing tests become redundant.
+1. New unit tests enabled — `InterruptHandler`: latches the current signal, fires `abortAll()` on abort, dedups the same signal reference, re-wires on a new signal, and handles an undefined signal.
+2. New integration guard — foreground abort: aborting the parent signal passed to `runTurnLoop` invokes the child `session.abort()`.
+   This pins the currently-untested foreground link in `forwardAbortSignal`.
+3. Existing tests stay as-is — `test/lifecycle/subagent.test.ts` (`wireSignal`, `abort`), `test/lifecycle/subagent-session.test.ts` (max-turns abort path), and `test/handlers/lifecycle.test.ts` (`abortAll` on shutdown) continue to exercise their layers unchanged.
+## TDD Order
+1. Foreground guard — `test/lifecycle/subagent-session.test.ts`.
+   Add a test: when the `signal` passed to `runTurnLoop` aborts while `session.prompt` is in flight, `session.abort()` is called.
+   Expected to pass immediately (proving the foreground chain already works); if the trace is wrong and it fails, fix `forwardAbortSignal` in `src/lifecycle/subagent-session.ts`.
+   Commit `test: guard foreground subagent abort on parent signal (#403)` (or `fix:` if a code fix is needed).
+2. Interrupt handler + wiring — `test/handlers/interrupt.test.ts` (new) → `src/handlers/interrupt.ts`, `src/handlers/index.ts`, `src/index.ts`.
+   Red: write the handler unit tests (latch, abort→abortAll, dedup, re-wire, undefined signal) against the not-yet-existing class.
+   Green: implement `InterruptHandler` + `InterruptManager`, export from the barrel, and register `pi.on("turn_start", ...)` in `index.ts`.
+   The handler, its test, and the composition-root wiring land together because the handler is inert without the registration.
+   Commit `fix: abort all subagents on parent interrupt (#403)`.
+3. Architecture doc — `docs/architecture/architecture.md`.
+   Add `interrupt.ts` to the `handlers/` directory listing and update any handler-domain count/row if present.
+   Commit `docs: note interrupt handler in subagents architecture (#403)`.
+## Risks and Mitigations
+- ESC now stops background agents the user might have wanted to keep running.
+  Mitigation: this is the operator's explicit choice (abort all running + queued); the behavior is documented in the plan and reflected in the `fix:` commit body.
+- Re-latching on every `turn_start` could add overhead.
+  Mitigation: the latch is a single reference comparison and short-circuits on the common same-signal case.
+- A `{ once: true }` listener lingers on a signal that completes normally.
+  Mitigation: the run's `AbortController` is discarded and GC'd with its listener; the next `turn_start` detaches the stale handle.
+- Non-interactive modes (print/rpc) may not emit `turn_start` the same way.
+  Mitigation: ESC interrupt is an interactive concern; the handler is a no-op when no signal is present.
+## Open Questions
+- Should a dedicated `abortBackground()` (excluding foreground) replace `abortAll()` here?
+  Deferred: `abortAll()` is simpler and foreground is already signal-aborted; revisit only if the redundant double-abort proves problematic.
+- Should ESC abort surface a confirmation or status message?
+  Deferred: out of scope for this fix.

package/docs/retro/0373-extract-subagent-state.md ADDED

@@ -0,0 +1,94 @@
+---
+issue: 373
+issue_title: "Extract SubagentState; make Subagent execution deps mandatory"
+---
+# Retro: #373 — Extract SubagentState; make Subagent execution deps mandatory
+## Stage: Planning (2026-06-14T03:34:51Z)
+### Session summary
+Produced the implementation plan at `packages/pi-subagents/docs/plans/0373-extract-subagent-state.md`.
+The architecture doc (Phase 17 Step 2 + "First-principles refinement") already specified the design precisely and the issue body matched it, so planning was confirmation-and-detailing rather than discovery.
+Issue is first-party (`gotgenes`) and unambiguous — skipped the `ask_user` gate.
+### Observations
+- **Not breaking** for the published surface: `src/service/service.ts` exposes `SubagentRecord`/`SubagentStatus`/spawn-config, never `SubagentInit` or the `Subagent` constructor.
+  Only the internal constructor signature changes.
+- **Single production construction site** confirmed: `SubagentManager.spawn` (~line 139) is the only `new Subagent(...)` outside tests — this is what makes mandatory execution deps viable.
+- **Observer retarget is required**, not optional: making execution mandatory would otherwise force `record-observer.test.ts` to stub execution.
+  Pointing `subscribeSubagentObserver` at `SubagentState` (and dropping the record from `onCompact`, closing over `this` in `subagent.ts`) is the move that lets observer tests target `SubagentState` directly.
+- **`resume()`'s missing-session throw stays** — it guards a genuine runtime state, not a construction concern.
+  Only the two `run()` "not configured for execution" throws are deleted.
+- **`SubagentStatus` home**: moved to `subagent-state.ts` but re-exported from `subagent.ts` to keep `service.ts`'s import path (and the public type bundle path) unchanged, and to avoid a circular import.
+- **Lift-and-shift for the large test file**: `test/lifecycle/subagent.test.ts` (~700 LOC).
+  Step 1 funnels constructions through a local helper and moves the state-machine `describe` blocks to the new `subagent-state.test.ts`, so Step 3's mandatory-execution flip is bounded to the helper + two run/resume factories.
+  Step 3 is unavoidably one atomic commit (removing optional fields breaks every construction at the type level at once).
+- **Doc updates identified**: `architecture.md` (lifecycle file listing, `Subagent` class diagram, mark Step 2 ✅ Complete, Phase 17 prose ~line 879, type-complexity table ~line 649) and `SKILL.md` (Lifecycle 10→11 modules, total 56→57 files).
+- Deferred per scope boundary: metrics-as-projection and result-delivery domain extraction (the other two of the four conflated domains).
+## Stage: Implementation — TDD (2026-06-14T09:23:00Z)
+### Session summary
+Executed all four planned steps as separate commits: (1) extract `SubagentState` value object + new `subagent-state.test.ts`, (2) retarget `subscribeSubagentObserver` at `SubagentState`, (3) the atomic flip making `SubagentExecution` a mandatory collaborator and deleting the two `run()` throws, (4) docs.
+Test count moved 966 → 967 (net): +26 new `SubagentState` tests, minus the migrated state-machine duplicates and the obsolete missing-factory test.
+Pre-completion reviewer returned **PASS**; `check`/`lint`/`test`/`fallow` all clean.
+### Observations
+- The plan held exactly — every file in Module-Level Changes was touched and nothing else.
+  The `createTestSubagent` consumers (`conversation-viewer`, `notification`, `get-result-tool`, `make-subagent.test`) stayed untouched as predicted; the helper absorbed the construction change via a `TestSubagentOptions` shape that splits passive-state shorthands from identity/execution.
+- **Explicit-`undefined` preservation** (testing-skill warning) mattered: `createTestSubagent` and the local `makeSubagent` build their `SubagentState` via spread of the rest-captured state overrides (`{ defaults, ...stateOverrides }`) so callers passing `completedAt: undefined` (running-status records in `get-result-tool.test`) still get `undefined`, not the `2000` default.
+- The lift-and-shift prep in Step 1 (local `makeSubagent` helper + perl-routing the single-line constructions) paid off: Step 3's breaking flip only had to edit the helper, `createRunnableAgent`, `createResumableAgent`, `createCompletionAgent`, and the constructor describe — not the whole file.
+- Removed the obsolete "throws when the session factory is missing" test (the guard is gone by construction); the construct-complete invariant is now type-level, not runtime-testable.
+  An initial replacement comment was dropped per reviewer/operator feedback as unhelpful.
+- `SubagentExecution` carries 12 fields (4 mandatory).
+  Reviewer flagged it as wide but accepted per the plan's recorded decision to keep it concrete rather than split further.
+- Pre-completion reviewer: **PASS** (no WARN findings).
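The explicit-`undefined` observation above can be illustrated with a minimal sketch. It assumes the defaults live in a plain object that is spread before the overrides; `StateFields`, `defaults`, `buildWithSpread`, and `buildWithCoalesce` are hypothetical names for illustration, not the package's actual helpers.

```typescript
// Hypothetical state shape; only `completedAt` and its 2000 default come from the retro text.
interface StateFields {
  status: "queued" | "running" | "completed";
  completedAt: number | undefined;
}

const defaults: StateFields = { status: "completed", completedAt: 2000 };

// Spread-based merge: an own property whose value is `undefined`
// still overrides the default, because spread copies own properties as-is.
function buildWithSpread(overrides: Partial<StateFields>): StateFields {
  return { ...defaults, ...overrides };
}

// Nullish-coalescing merge: an explicit `undefined` is indistinguishable
// from "not provided", so the default silently wins.
function buildWithCoalesce(overrides: Partial<StateFields>): StateFields {
  return {
    status: overrides.status ?? defaults.status,
    completedAt: overrides.completedAt ?? defaults.completedAt,
  };
}
```

With this shape, a caller that models a still-running record by passing `completedAt: undefined` keeps `undefined` under the spread merge, while the coalescing merge would hand back the default timestamp.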
+## Stage: Final Retrospective (2026-06-14T17:20:00Z)
+### Session summary
+Shipped #373 end-to-end across one conversation spanning Planning → TDD → Ship → Retro: four implementation commits, CI green, issue closed, no release-please PR (a `refactor:`-only change does not trigger a release).
+The plan held exactly — zero rework, and the pre-completion reviewer returned PASS with nothing to fix.
+The single user intervention was a one-line comment removal during TDD Step 3.
+### Observations
+#### What went well
+- **Plan-to-ship with zero rework.**
+  Every file in the plan's Module-Level Changes was touched and nothing else; the `createTestSubagent` consumers stayed untouched exactly as predicted.
+  The lift-and-shift prep (Step 1 funneling constructions through a local `makeSubagent` helper) bounded the breaking Step 3 flip to the helper plus three factories — the atomic-construction-change concern from the plan never materialized as churn.
+- **Clean model allocation across stages.**
+  Planning ran on `claude-opus-4-8`, TDD on `claude-sonnet-4-6`, Ship on `opencode-go/deepseek-v4-flash` (mechanical git/CI/close work), the pre-completion reviewer subagent on `claude-sonnet-4-6`, and Retro on `claude-opus-4-8`.
+  Judgment-heavy work landed on reasoning-strong models; the cheap model handled only the mechanical ship sequence.
+- **Incremental verification.** `pnpm run check` ran after every TDD step (not just at the end), catching the shared-type breakage at the right boundary; the affected test files were run per-step before the full suite.
+#### What caused friction (agent side)
+- `other` (tombstone comment) — after removing the obsolete "throws when the session factory is missing" test in TDD Step 3, left a comment narrating the *absence* of the guard (`// No "missing session factory" guard: execution is a mandatory constructor collaborator …`).
+  The user flagged it as unhelpful and asked for removal.
+  Impact: one extra `Edit` + a blank-line cleanup + a `--amend` of the Step 3 commit.
+  No behavioral rework; user-caught.
+#### What caused friction (user side)
+- None of consequence.
+  The single intervention (comment removal) was light mechanical oversight on an otherwise self-driving session; no earlier context would have changed the outcome.
+### Diagnostic details
+- **Model-performance correlation** — no mismatch.
+  The only subagent dispatch (pre-completion-reviewer) ran on `claude-sonnet-4-6`, appropriate for judgment-heavy review; it returned PASS.
+  The Ship stage on `deepseek-v4-flash` was purely mechanical (git push, `ci_find`/`ci_watch`, `issue_close`, `release_pr_find`) and the one judgment point (the batch-vs-release `ask_user`) was handled correctly.
+- **Escalation-delay / unused-tool / feedback-loop** — nothing notable: no rabbit-holes, no error-chasing sequences, and verification ran incrementally throughout.
+  Lenses skipped.
+### Changes made
+1. `.pi/skills/code-design/SKILL.md` (§ Names over comments) — added a line forbidding tombstone comments that narrate removed code or the absence of a guard/test/branch, prompted by the user-caught over-comment in TDD Step 3.

package/docs/retro/0374-encapsulate-subagent-start-notification.md ADDED

@@ -0,0 +1,38 @@
+---
+issue: 374
+issue_title: "Encapsulate run start and notification attachment on Subagent"
+---
+# Retro: #374 — Encapsulate run start and notification attachment on Subagent
+## Stage: Planning (2026-06-14T00:00:00Z)
+### Session summary
+Read issue #374 (Phase 17 Step 3 — output-argument encapsulation), loaded skills, explored `subagent.ts`, `subagent-manager.ts`, `notification-state.ts`, and all seven test files with external writes.
+Produced a 4-step TDD plan in `packages/pi-subagents/docs/plans/0374-encapsulate-subagent-start-notification.md`.
+### Observations
+- The `notification` field was already constructor-wired in Phase 17 Step 2 (from `execution.parentSession?.toolCallId`); the remaining work is making both `promise` and `notification` externally read-only and updating the 7 + 3 test write sites.
+- Steps 1 and 2 in the TDD order are effectively merged: introducing a `promise` getter (backed by `private _promise`) alongside the existing public `promise?` field is a TypeScript duplicate-identifier error, so the public field removal and all consumer updates must land in one atomic commit (`feat: make Subagent.promise read-only, add start() (#374)`).
+- The status guard in `start()` (`if (status !== "queued" && status !== "running")`) lets foreground agents (constructed with `status: "running"`) pass through cleanly while stopping aborted-while-queued agents; this moves the inline guard out of the `SubagentManager` limiter callback and into `start()`.
+- `service-adapter.test.ts` tests that set `record.promise = Promise.resolve()` only test that `toSubagentRecord()` strips the field — the setup is vestigial once `promise` becomes a getter; simply removing it is sufficient.
+- The "waits for promise when wait=true" test in `get-result-tool.test.ts` needs a more realistic execution stub (`runTurnLoop` returning `{ responseText: "Finished after wait.", aborted: false, steered: false }`) so `record.start()` triggers the full run pipeline and calls `markCompleted()` internally.
+- `TestSubagentOptions.toolCallId?: string` is the cleanest shorthand for the 5 test files that create passive records but need a `NotificationState`; it routes through `makeStubExecution({ parentSession: { toolCallId } })`, matching the production constructor path exactly.
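The planned shape of `start()` and the read-only `promise` can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's implementation: `SketchSubagent`, `SketchStatus`, and the `runBody` constructor argument are hypothetical, and what the real `start()` returns when the guard trips is not stated in these notes (the sketch resolves immediately).

```typescript
type SketchStatus = "queued" | "running" | "completed" | "aborted";

class SketchSubagent {
  // Backing field: only the class itself ever assigns it.
  private _promise: Promise<void> | undefined;

  constructor(
    public status: SketchStatus,
    // Hypothetical stand-in for the real run pipeline.
    private readonly runBody: () => Promise<void>,
  ) {}

  // Getter with no setter: external `record.promise = ...` no longer type-checks.
  get promise(): Promise<void> | undefined {
    return this._promise;
  }

  start(): Promise<void> {
    // Status guard: anything other than queued/running (for example an
    // agent aborted while queued) does not start a run.
    if (this.status !== "queued" && this.status !== "running") {
      return Promise.resolve();
    }
    this.status = "running";
    this._promise = this.runBody().then(() => {
      this.status = "completed";
    });
    return this._promise;
  }
}
```

A getter and a field cannot share the name `promise` in one class, which is why the getter's introduction and the public field's removal have to land together.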
+## Stage: Implementation — TDD (2026-06-14T16:31:00Z)
+### Session summary
+Implemented all 4 plan steps in 2 substantive commits: one atomic `feat:` commit for `start()` + `promise`/`notification` read-only + all test site updates, and one `docs:` commit for the architecture doc.
+Test count went from 975 to 981 (+6 new `start()` unit tests).
+Pre-completion reviewer returned PASS with one WARN (stale test count in `package-pi-subagents` SKILL.md — fixed immediately).
+### Observations
+- Plan steps 1–3 landed in a single commit because making `notification` private touched the same `subagent.ts` file as making `promise` private; splitting them would have required complex partial staging.
+- The `void record.start()` and `void this.limiter.schedule(...)` patterns were needed in `subagent-manager.ts` to satisfy `@typescript-eslint/no-floating-promises` — `start()` returns a `Promise<void>` but the manager stores the state internally; callers don't need to await it.
+- The "waits for promise when wait=true" test in `get-result-tool.test.ts` required `void record.start()` (intentional fire-and-forget) for the same reason.
+- Grep-verifiable outcome confirmed: `\.promise =` and `\.notification =` appear only inside `subagent.ts` (as `this._promise =` and `this._notification =`).
+- Pre-completion reviewer: PASS (no FAIL findings; WARN on stale skill test count addressed inline).
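The `void` pattern mentioned in the observations above can be shown in isolation. This is a minimal sketch: `startStub` and `spawnSketch` are hypothetical stand-ins for `record.start()` and the manager's spawn path, and it assumes the lint rule is running with its `ignoreVoid` option enabled.

```typescript
// Hypothetical stand-in for `record.start()`: returns a promise the caller
// does not need to await.
async function startStub(log: string[]): Promise<void> {
  log.push("started");
}

// Hypothetical stand-in for the manager's spawn path.
function spawnSketch(log: string[]): void {
  // The `void` operator discards the promise explicitly, marking the
  // fire-and-forget as intentional so `no-floating-promises` accepts it.
  void startStub(log);
  log.push("spawn returned");
}
```

Note that `void` only signals intent; it does not handle rejections, so the started function still needs to deal with its own errors. An `async` function body also runs synchronously up to its first `await`, so in this sketch `"started"` is logged before `"spawn returned"`.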

package/docs/retro/0381-replace-concurrency-queue-with-limiter.md CHANGED

@@ -46,4 +46,50 @@ Test count went 975 → 966 (−22 deleted queue tests, +13 new limiter tests);
 - Pre-completion reviewer: WARN (no FAILs).
   Reviewer warnings: the single stale-comment finding at `index.ts:125` — now fixed in commit `90135005`.
+## Stage: Final Retrospective (2026-06-14T00:30:00Z)
+### Session summary
+Shipped #381 across planning, TDD, and release: `pi-subagents` `16.0.0` → `16.1.0`, tag `pi-subagents-v16.1.0`.
+Four commits landed (one `feat`, two `refactor`, one `docs`) plus two `docs(retro)` notes; CI passed first try, the issue was closed with an implemented-in summary, and the release-please PR was merged.
+The plan — written down to code sketches — held up across all three TDD cycles with no design rework.
+### Observations
+#### What went well
+- The plan's fidelity paid off: the `clear()`-settles-pending-promises decision, the atomic step-2 sequencing (migrate consumers + delete queue + delete old test in one commit), and the `void`-prefix prediction for floating promises were all made at planning time and executed without surprise.
+  The `queueing and concurrency` manager tests passed unchanged after only the `createManager` helper swap, validating the planning claim that they exercise behavior, not queue internals.
+- The pre-completion-reviewer (on `anthropic/claude-sonnet-4-6`, 161s, 21 tool uses) caught a stale comment at `src/index.ts:125` that all four deterministic gates (`check`, `lint`, `test`, `fallow dead-code`) passed over.
+  This is the backstop working exactly as intended — a judgment-model review surfacing residue that pattern-matchers cannot.
+- Verification cadence was incremental, not end-loaded: file-scoped `vitest` + `biome` + `eslint` after step 1, `pnpm run check` immediately after the shared-interface change mid-step-2 (per the plan's own instruction), then lifecycle suite → full suite → full lint, then `rumdl` for the docs step, then the full gates + `fallow` before push.
+#### What caused friction (agent side)
+- `missing-context` (self/reviewer-caught) — the stale comment `// before startAgent / queue drain` at `src/index.ts:125` referenced two deleted concepts but was not cataloged in the plan's Module-Level Changes, despite the planning grep output having surfaced that exact line.
+  The grep hit was visible but never converted into a plan action or an explicit leave-as-is.
+  Impact: one small follow-up commit (`90135005`, `refactor:`); no rework, no design impact — the reviewer backstop absorbed it before ship.
+#### What caused friction (user side)
+- None.
+  The single user touchpoint — the release-timing gate in `/ship-issue` (release now vs. batch the Phase 17 sequence) — was strategic judgment the agent correctly deferred, not mechanical oversight.
+### Diagnostic details
+- **Model-performance correlation** — one subagent dispatch (`pre-completion-reviewer`) on `anthropic/claude-sonnet-4-6`; appropriate match for judgment-heavy review, and it returned the session's only actionable finding.
+- **Escalation-delay tracking** — no rabbit-holes; the lone lint error (`@typescript-eslint/no-floating-promises`, 18 sites) was resolved in a single test-file rewrite, far under the 5-call escalation threshold.
+- **Unused-tool detection** — nothing under-tooled; `colgrep`/`grep` were used during planning exploration and the reviewer subagent was dispatched as designed.
+- **Feedback-loop gap analysis** — no gap; verification ran after every cycle, with `pnpm run check` correctly invoked right after the shared-interface change rather than at end-of-session.
+#### Process note (no inline change)
+- The release-please PR merge required the documented `UNSTABLE` → `gh pr merge` fallback (step 6.4 of `/ship-issue`) because default-`GITHUB_TOKEN` release PRs never get checks.
+  This recurs every release; the prompt already handles it, so it is recorded here only as a standing pattern, not a friction point.
+### Changes made
+1. Added this Final Retrospective stage entry to `packages/pi-subagents/docs/retro/0381-replace-concurrency-queue-with-limiter.md`.
+2. No prompt or `AGENTS.md` changes — the operator chose retro-file-only, since the single friction (the stale `src/index.ts:125` comment) was a one-off execution slip already caught by the pre-completion-reviewer backstop, and the candidate grep-hit rule was judged not worth the prompt verbosity.
 [#378]: https://github.com/gotgenes/pi-packages/issues/378