@gotgenes/pi-subagents 16.1.0 → 16.2.0

@@ -0,0 +1,90 @@
+ ---
+ issue: 403
+ issue_title: "Pressing Escape does not stop subagent/background agent"
+ ---
+
+ # Retro: #403 — Pressing Escape does not stop subagent/background agent
+
+ ## Stage: Planning (2026-06-14T00:00:00Z)
+
+ ### Session summary
+
+ Investigated the third-party bug report that ESC does not stop subagents and traced the abort path through both the package and the pinned Pi SDK peer deps.
+ Found that foreground subagents already receive the parent abort signal end-to-end, while background subagents are detached with no interrupt wiring — the reproducible bug.
+ Confirmed direction with the operator via `ask_user` (third-party gate): implement ESC-to-abort for both modes, with a foreground guard test, aborting all running and queued background agents.
+ Wrote and committed plan `0403-abort-subagents-on-interrupt.md`.
+
+ ### Observations
+
+ - Key SDK fact that de-risks the design: in `pi-agent-core` `agent.js`, each run creates a fresh `AbortController` and `finishRun()` discards it **without** aborting on normal completion.
+   So the parent signal's `abort` event fires only on a real ESC interrupt — latching `abortAll()` to it will not spuriously kill background agents at turn end.
+ - Chosen mechanism: a small `InterruptHandler` driven by `pi.on("turn_start", ...)`, re-latching `ctx.signal` each turn so the latch tracks the live per-run signal even across runs and tool-less turns.
+   `turn_start` was preferred over `tool_execution_start` because a background agent can outlive the run that spawned it; a turn-level latch still holds the current run's signal when the user interrupts a later tool-less turn.
+ - Reused the existing `manager.abortAll()` rather than adding `abortBackground()`.
+   Foreground agents are already aborted via their own `wireSignal`, so `abortAll()`'s overlap is redundant-but-harmless (status-guarded `abort()`, idempotent `markStopped`).
+   The manager does not store `isBackground` on the record, so distinguishing modes would need extra state — deferred as an Open Question.
+ - Classified as a non-breaking `fix:` (not `fix!:`): no config key, default, or output shape changes; detached-survives-ESC was a limitation, not a contract.
+   Noted the behavior change explicitly in Goals.
+ - Foreground path is believed already-correct from the code trace; the plan adds a regression guard in `subagent-session.test.ts` (`forwardAbortSignal` is currently untested for the parent-signal path) and will fix only if the guard fails.
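
The first observation above can be illustrated with the standard `AbortController` API alone. The following is a minimal sketch, not the `pi-agent-core` code: `SketchRun`, `startRun`, `interrupt`, and `finish` are hypothetical names, and `finish` is assumed to behave the way the retro describes `finishRun()` (drop the controller, never call `abort()`).

```typescript
// Hypothetical stand-in for one agent run. This is NOT the pi-agent-core code;
// it only mirrors the behavior the retro describes.
interface SketchRun {
  signal: AbortSignal;
  interrupt(): void; // what an ESC keypress is assumed to trigger
  finish(): void; // normal completion
}

function startRun(): SketchRun {
  // A fresh controller per run, as described in the observation.
  let controller: AbortController | undefined = new AbortController();
  const signal = controller.signal;
  return {
    signal,
    interrupt: () => controller?.abort(),
    // Normal completion drops the controller without calling abort().
    finish: () => {
      controller = undefined;
    },
  };
}

let abortEvents = 0;
const countAbort = (): void => {
  abortEvents++;
};

// Run 1 completes normally: the listener never fires.
const normalRun = startRun();
normalRun.signal.addEventListener("abort", countAbort, { once: true });
normalRun.finish();
const afterNormalCompletion = abortEvents;

// Run 2 is interrupted: the listener fires once.
const interruptedRun = startRun();
interruptedRun.signal.addEventListener("abort", countAbort, { once: true });
interruptedRun.interrupt();
const afterInterrupt = abortEvents;

console.log({ afterNormalCompletion, afterInterrupt }); // { afterNormalCompletion: 0, afterInterrupt: 1 }
```

Under that assumption, a listener attached to the per-run signal is a reliable "user interrupted" hook, which is why latching `abortAll()` to it does not fire at turn end.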
+
+ ## Stage: Implementation — TDD (2026-06-14T18:00:00Z)
+
+ ### Session summary
+
+ Completed all three TDD cycles against a green baseline (967 tests).
+ Added the foreground-abort guard, implemented `InterruptHandler` + `turn_start` wiring, and updated the architecture doc.
+ Test count went from 967 to 975 (+8: 6 `InterruptHandler` unit tests, 2 foreground guard tests); `check`, `lint`, `test`, and `fallow dead-code` all pass.
+
+ ### Observations
+
+ - The foreground guard (Step 1) passed on the first run, confirming the planning-stage code trace: the parent signal already reaches the child `session.abort()` via `forwardAbortSignal`.
+   No code fix was needed, so it landed as `test:` exactly as the plan anticipated.
+ - `InterruptHandler` came out clean against the `code-design` heuristics — one field read from `ctx`, one method on a one-method `InterruptManager` interface, latch state owned internally, `{ once: true }` listener.
+   The reviewer's code-design check was PASS with no structural concerns.
+ - `abortAll()` gained a second narrow-interface consumer (the new handler) on top of the shutdown path; `fallow dead-code` stayed green, so its existing `fallow-ignore-next-line unused-class-member` comment was left untouched.
+ - Pre-completion reviewer: **WARN**.
+ - Reviewer warnings: stale source-file counts in `architecture.md`.
+   Fixed the current-state prose claim (`56` → `58` source files).
+   Left the fallow health-metrics snapshot rows (line ~650, `7,778 (57 files)`) intact — those are point-in-time analysis tables where the file count was computed alongside LOC and other metrics, so bumping one cell in isolation would desync the snapshot.
+   Amended the fix into the docs commit (not yet pushed).
+
+ ## Stage: Final Retrospective (2026-06-14T20:00:00Z)
+
+ ### Session summary
+
+ Shipped issue #403 end-to-end across four stages (plan → TDD → ship → live verification): root-caused the bug, implemented the `InterruptHandler` (single `fix:` commit), guarded the already-working foreground path, and released `pi-subagents-v16.1.1`.
+ The operator then live-tested all three abort paths (background subagent, foreground subagent, main agent) and confirmed a single Escape aborts each immediately.
+ Near-zero rework: one reviewer WARN (stale doc file count) fixed by amend, no follow-up commits, no failed CI.
+
+ ### Observations
+
+ #### What went well
+
+ 1. The planning-stage SDK trace paid dividends two stages later.
+    When the operator asked during live testing "is it supposed to take two Escapes or just one?", the answer came straight from the `restoreQueuedMessagesToEditor → agent.abort()` trace captured at planning time — no re-investigation.
+    The same trace explained the main-agent and foreground-subagent abort paths immediately.
+ 2. The keystone de-risking finding (`finishRun()` discards the per-run `AbortController` without aborting it, so the `abort` event fires only on a real interrupt) held up in practice — no spurious turn-end aborts were observed in live testing.
+ 3. The foreground guard test passed on its first run, confirming the planning trace, so the plan's pre-typed `test:` commit type was correct and the whole implementation landed with zero rework.
+ 4. Verification was incremental throughout TDD: green baseline first, per-step affected-file runs, `pnpm run check` after the interface-touching step, and full `test`/`check`/`lint`/`fallow` at the end.
+
+ #### What caused friction (agent side)
+
+ 1. `missing-context` — when adding the new source file `interrupt.ts`, I updated the `handlers/` directory listing in `architecture.md` but not the prose total-file count at line 277 (which was already stale: `56` vs the pre-change actual of `57`).
+    Impact: one pre-completion reviewer WARN, fixed by amending the docs commit before push — no rework, no extra commit, no CI cost.
+
+ #### What caused friction (user side)
+
+ 1. None.
+    The operator's involvement was high-value: the third-party-issue direction gate (planning) and the live three-path abort verification (post-ship) validated behavior that unit tests cannot reach (real ESC keypress through the interactive TUI).
+
+ ### Diagnostic details
+
+ 1. Model-performance correlation — ship stage and the `pre-completion-reviewer` subagent both ran on `claude-sonnet-4-6` (mechanical orchestration and checklist review — appropriate); retro synthesis on `claude-opus-4-8` (judgment — appropriate).
+    No mismatch.
+ 2. Escalation-delay tracking — no `rabbit-hole` friction points; the planning SDK dig was productive forward exploration, not repeated calls against one error.
+ 3. Unused-tool detection — the planning SDK trace navigated minified `node_modules/.pnpm` dist files by hand; `colgrep` (project-code semantic search) and an Explore subagent (project-code understanding) were not suited to reverse-engineering pinned third-party `dist` JS, so no tool was wrongly skipped.
+ 4. Feedback-loop gap analysis — no gap; verification ran incrementally per TDD step, not only at the end.
+
+ ### Changes made
+
+ 1. Added an "Abort / interrupt signal lifecycle" section to `.pi/skills/pi-extension-lifecycle/SKILL.md` documenting the per-run `AbortController`, the ESC → `agent.abort()` path, the `finishRun()` discard-without-abort behavior, and the `ctx.signal` / `tool.execute(signal)` exposure — so future interrupt-timing work need not re-derive it from the pinned SDK `dist` files.
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@gotgenes/pi-subagents",
-   "version": "16.1.0",
+   "version": "16.2.0",
    "type": "module",
    "exports": {
      ".": {
@@ -1,2 +1,3 @@
+ export { InterruptHandler } from "#src/handlers/interrupt";
  export { SessionLifecycleHandler } from "#src/handlers/lifecycle";
  export { ToolStartHandler } from "#src/handlers/tool-start";
@@ -0,0 +1,49 @@
+ /**
+  * turn_start event handler that aborts subagents on a parent interrupt (ESC).
+  *
+  * The parent agent loop creates a fresh AbortController per run and only aborts
+  * it on an explicit interrupt — never on normal completion. So latching to the
+  * current run's signal and aborting on its `abort` event fires exactly on ESC.
+  *
+  * `turn_start` carries the live per-run `ctx.signal`, so re-latching each turn
+  * keeps the handler tracking the current signal across runs and tool-less turns.
+  */
+
+ /** Narrow manager interface — only the method the interrupt handler calls. */
+ export interface InterruptManager {
+   abortAll(): number;
+ }
+
+ /** Minimal context shape — only the field the handler reads. */
+ interface InterruptCtx {
+   signal: AbortSignal | undefined;
+ }
+
+ /**
+  * Latches the current parent abort signal and aborts all subagents when it fires.
+  *
+  * The latch dedups by reference: most turns reuse the same signal (no-op); a new
+  * run's signal triggers a detach-and-rewire. The `abort` listener is one-shot.
+  */
+ export class InterruptHandler {
+   private latched?: AbortSignal;
+   private detach?: () => void;
+
+   constructor(private readonly manager: InterruptManager) {}
+
+   handleTurnStart(ctx: InterruptCtx): void {
+     const signal = ctx.signal;
+     if (signal === this.latched) return;
+
+     this.detach?.();
+     this.detach = undefined;
+     this.latched = signal;
+     if (!signal) return;
+
+     const onAbort = (): void => {
+       this.manager.abortAll();
+     };
+     signal.addEventListener("abort", onAbort, { once: true });
+     this.detach = () => signal.removeEventListener("abort", onAbort);
+   }
+ }
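
The latch-and-rewire behavior described in the class comment can be exercised in isolation. The sketch below reproduces the handler from the hunk above (with the constructor parameter property written out as an explicit field so the block runs standalone) and drives it with a hypothetical counting manager; `countingManager`, `firstRun`, and `secondRun` are illustrative names, not part of the package.

```typescript
interface InterruptManager {
  abortAll(): number;
}

interface InterruptCtx {
  signal: AbortSignal | undefined;
}

// Same logic as the handler in the diff, restated so this block is self-contained.
class InterruptHandler {
  private latched?: AbortSignal;
  private detach?: () => void;
  private readonly manager: InterruptManager;

  constructor(manager: InterruptManager) {
    this.manager = manager;
  }

  handleTurnStart(ctx: InterruptCtx): void {
    const signal = ctx.signal;
    if (signal === this.latched) return; // same signal as last turn: nothing to do

    this.detach?.(); // drop the listener on the previous run's signal
    this.detach = undefined;
    this.latched = signal;
    if (!signal) return;

    const onAbort = (): void => {
      this.manager.abortAll();
    };
    signal.addEventListener("abort", onAbort, { once: true });
    this.detach = () => signal.removeEventListener("abort", onAbort);
  }
}

// Hypothetical manager that only counts calls.
let abortAllCalls = 0;
const countingManager: InterruptManager = {
  abortAll: () => {
    abortAllCalls++;
    return 0;
  },
};
const handler = new InterruptHandler(countingManager);

const firstRun = new AbortController();
handler.handleTurnStart({ signal: firstRun.signal });
handler.handleTurnStart({ signal: firstRun.signal }); // deduplicated by reference

const secondRun = new AbortController();
handler.handleTurnStart({ signal: secondRun.signal }); // detaches from firstRun, latches secondRun

firstRun.abort(); // stale signal: its listener was removed
const callsAfterStaleAbort = abortAllCalls;

secondRun.abort(); // live signal: abortAll() runs once
const callsAfterLiveAbort = abortAllCalls;

console.log({ callsAfterStaleAbort, callsAfterLiveAbort }); // { callsAfterStaleAbort: 0, callsAfterLiveAbort: 1 }
```

Depending on the one-method `InterruptManager` interface rather than the full manager is what makes this kind of stub-driven check possible.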
package/src/index.ts CHANGED
@@ -22,7 +22,7 @@ import {
  } from "@earendil-works/pi-coding-agent";
  import { AgentTypeRegistry } from "#src/config/agent-types";
  import { loadCustomAgents } from "#src/config/custom-agents";
- import { SessionLifecycleHandler, ToolStartHandler } from "#src/handlers/index";
+ import { InterruptHandler, SessionLifecycleHandler, ToolStartHandler } from "#src/handlers/index";
  import { createChildLifecyclePublisher } from "#src/lifecycle/child-lifecycle";
  import { ConcurrencyLimiter } from "#src/lifecycle/concurrency-limiter";
  import { createSubagentSession, type SubagentSessionDeps } from "#src/lifecycle/create-subagent-session";
@@ -185,6 +185,10 @@ export default function (pi: ExtensionAPI) {
    const toolStart = new ToolStartHandler(runtime);
    pi.on("tool_execution_start", (event, ctx) => toolStart.handleToolExecutionStart(event, ctx));

+   // Abort all subagents when the parent agent loop is interrupted (ESC).
+   const interrupt = new InterruptHandler(manager);
+   pi.on("turn_start", (_event, ctx) => interrupt.handleTurnStart(ctx));
+
    // ---- Agent tool ----

    pi.registerTool(new AgentTool(manager, runtime, settings, registry, getAgentDir()).toToolDefinition());
@@ -14,6 +14,7 @@ import type { CreateSubagentSessionParams } from "#src/lifecycle/create-subagent
  import type { ParentSnapshot } from "#src/lifecycle/parent-snapshot";
  import { Subagent, type SubagentLifecycleObserver } from "#src/lifecycle/subagent";
  import type { SubagentSession } from "#src/lifecycle/subagent-session";
+ import { SubagentState } from "#src/lifecycle/subagent-state";
  import type { WorkspaceProvider } from "#src/lifecycle/workspace";

  import type { RunConfig } from "#src/runtime";
@@ -140,23 +141,25 @@ export class SubagentManager {
        id,
        type,
        description: options.description,
-       status: options.isBackground ? "queued" : "running",
-       startedAt: Date.now(),
        invocation: options.invocation,
-       // Run config
-       snapshot,
-       prompt,
-       model: options.model,
-       maxTurns: options.maxTurns,
-       thinkingLevel: options.thinkingLevel,
-       parentSession: options.parentSession,
-       signal: options.signal,
-       // Shared deps
-       createSubagentSession: this.createSubagentSession,
-       observer: this.buildObserver(options),
-       getRunConfig: this.getRunConfig,
-       baseCwd: this.baseCwd,
-       getWorkspaceProvider: () => this._workspaceProvider,
+       state: new SubagentState({
+         status: options.isBackground ? "queued" : "running",
+         startedAt: Date.now(),
+       }),
+       execution: {
+         createSubagentSession: this.createSubagentSession,
+         snapshot,
+         prompt,
+         baseCwd: this.baseCwd,
+         observer: this.buildObserver(options),
+         getRunConfig: this.getRunConfig,
+         getWorkspaceProvider: () => this._workspaceProvider,
+         model: options.model,
+         maxTurns: options.maxTurns,
+         thinkingLevel: options.thinkingLevel,
+         parentSession: options.parentSession,
+         signal: options.signal,
+       },
      });
      this.agents.set(id, record);

@@ -165,16 +168,12 @@ export class SubagentManager {
      }

      if (options.isBackground && !options.bypassQueue) {
-       // Schedule on the limiter — started when a slot frees. The status guard
-       // makes an abort-while-queued task a no-op (Step 3 folds it into start()).
-       record.promise = this.limiter.schedule(() => {
-         if (record.status !== "queued") return Promise.resolve();
-         return record.run();
-       });
+       // Schedule on the limiter — start() guards against abort-while-queued.
+       void this.limiter.schedule(() => record.start());
        return id;
      }

-     record.promise = record.run();
+     void record.start();
      return id;
    }

@@ -0,0 +1,156 @@
+ /**
+  * subagent-state.ts — SubagentState value object: lifecycle status and metrics.
+  *
+  * Owns the passive, readable state of a subagent — status, result, error,
+  * timestamps, and stats (toolUses, lifetimeUsage, compactionCount) — together
+  * with the transition methods (markRunning, markCompleted, …) and accumulation
+  * methods (incrementToolUses, addUsage, incrementCompactions) that mutate it.
+  *
+  * State is encapsulated behind getters; external code reads through them but
+  * mutates only via the transition/accumulation methods. The value object owns
+  * all of its own mutations — no field is written from outside.
+  *
+  * Subagent holds one of these privately and delegates its getters and mutation
+  * methods to it. Extracting it lets the lifecycle state machine and the
+  * session-event observer be unit-tested without constructing an executor.
+  */
+
+ import type { LifetimeUsage } from "#src/lifecycle/usage";
+ import { addUsage } from "#src/lifecycle/usage";
+
+ export type SubagentStatus =
+   | "queued"
+   | "running"
+   | "completed"
+   | "steered"
+   | "aborted"
+   | "stopped"
+   | "error";
+
+ export interface SubagentStateInit {
+   status?: SubagentStatus;
+   result?: string;
+   error?: string;
+   startedAt?: number;
+   completedAt?: number;
+ }
+
+ export class SubagentState {
+   // Transition state — encapsulated behind getters, mutated only via transition methods
+   private _status: SubagentStatus;
+   get status(): SubagentStatus { return this._status; }
+
+   private _result?: string;
+   get result(): string | undefined { return this._result; }
+
+   private _error?: string;
+   get error(): string | undefined { return this._error; }
+
+   private _startedAt: number;
+   get startedAt(): number { return this._startedAt; }
+
+   private _completedAt?: number;
+   get completedAt(): number | undefined { return this._completedAt; }
+
+   // Stats — accumulated via mutation methods, readable via getters
+   private _toolUses = 0;
+   get toolUses(): number { return this._toolUses; }
+
+   private _lifetimeUsage: LifetimeUsage = { input: 0, output: 0, cacheWrite: 0 };
+   get lifetimeUsage(): Readonly<LifetimeUsage> { return this._lifetimeUsage; }
+
+   private _compactionCount = 0;
+   get compactionCount(): number { return this._compactionCount; }
+
+   constructor(init: SubagentStateInit = {}) {
+     this._status = init.status ?? "queued";
+     this._result = init.result;
+     this._error = init.error;
+     this._startedAt = init.startedAt ?? Date.now();
+     this._completedAt = init.completedAt;
+   }
+
+   /** Increment tool use count. Called by record-observer on tool_execution_end. */
+   incrementToolUses(): void {
+     this._toolUses++;
+   }
+
+   /** Accumulate a usage delta into lifetimeUsage. Called by record-observer on message_end. */
+   addUsage(delta: { input: number; output: number; cacheWrite: number }): void {
+     addUsage(this._lifetimeUsage, delta);
+   }
+
+   /** Increment compaction count. Called by record-observer on compaction_end. */
+   incrementCompactions(): void {
+     this._compactionCount++;
+   }
+
+   /** Transition to running state. Sets status and startedAt. */
+   markRunning(startedAt: number): void {
+     this._status = "running";
+     this._startedAt = startedAt;
+   }
+
+   /**
+    * Transition to completed state.
+    * Always sets result and completedAt (??=). Only changes status if not stopped.
+    */
+   markCompleted(result: string, completedAt?: number): void {
+     this._result = result;
+     this._completedAt ??= completedAt ?? Date.now();
+     if (this._status !== "stopped") {
+       this._status = "completed";
+     }
+   }
+
+   /**
+    * Transition to aborted state.
+    * Always sets result and completedAt (??=). Only changes status if not stopped.
+    */
+   markAborted(result: string, completedAt?: number): void {
+     this._result = result;
+     this._completedAt ??= completedAt ?? Date.now();
+     if (this._status !== "stopped") {
+       this._status = "aborted";
+     }
+   }
+
+   /**
+    * Transition to steered state.
+    * Always sets result and completedAt (??=). Only changes status if not stopped.
+    */
+   markSteered(result: string, completedAt?: number): void {
+     this._result = result;
+     this._completedAt ??= completedAt ?? Date.now();
+     if (this._status !== "stopped") {
+       this._status = "steered";
+     }
+   }
+
+   /**
+    * Transition to error state.
+    * Always sets error (formatted) and completedAt (??=). Only changes status if not stopped.
+    */
+   markError(error: unknown, completedAt?: number): void {
+     this._error = error instanceof Error ? error.message : String(error);
+     this._completedAt ??= completedAt ?? Date.now();
+     if (this._status !== "stopped") {
+       this._status = "error";
+     }
+   }
+
+   /** Transition to stopped state. Always valid — no guard. */
+   markStopped(completedAt?: number): void {
+     this._status = "stopped";
+     this._completedAt = completedAt ?? Date.now();
+   }
+
+   /** Reset for resume: running status, new startedAt, clear completedAt/result/error. */
+   resetForResume(startedAt: number): void {
+     this._status = "running";
+     this._startedAt = startedAt;
+     this._completedAt = undefined;
+     this._result = undefined;
+     this._error = undefined;
+   }
+ }
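
The "only changes status if not stopped" rule and the `??=` handling of `completedAt` are easiest to see with a late completion arriving after a stop. The sketch below uses `SketchState`, a hypothetical reduction of `SubagentState` to three transitions with the method bodies taken from the file above; the timestamps `1_000` and `2_000` are arbitrary illustrative values.

```typescript
type SketchStatus = "queued" | "running" | "completed" | "stopped";

// Hypothetical reduction of SubagentState: same guards, fewer fields and transitions.
class SketchState {
  private _status: SketchStatus = "queued";
  private _result?: string;
  private _completedAt?: number;

  get status(): SketchStatus { return this._status; }
  get result(): string | undefined { return this._result; }
  get completedAt(): number | undefined { return this._completedAt; }

  markRunning(): void {
    this._status = "running";
  }

  // Unconditional: overwrites both status and completedAt.
  markStopped(completedAt?: number): void {
    this._status = "stopped";
    this._completedAt = completedAt ?? Date.now();
  }

  // Always records the result; keeps an earlier completedAt and a "stopped" status.
  markCompleted(result: string, completedAt?: number): void {
    this._result = result;
    this._completedAt ??= completedAt ?? Date.now();
    if (this._status !== "stopped") {
      this._status = "completed";
    }
  }
}

// Case 1: stop first, then a late completion arrives.
const stoppedFirst = new SketchState();
stoppedFirst.markRunning();
stoppedFirst.markStopped(1_000);
stoppedFirst.markCompleted("late result", 2_000);
console.log(stoppedFirst.status, stoppedFirst.completedAt, stoppedFirst.result);
// stopped 1000 late result

// Case 2: ordinary completion with no stop.
const normal = new SketchState();
normal.markRunning();
normal.markCompleted("done", 2_000);
console.log(normal.status, normal.completedAt, normal.result);
// completed 2000 done
```

In the first case the result text is still captured, but the terminal status and timestamp reflect the stop, which is consistent with the retro's description of `markStopped` as idempotent when `abortAll()` overlaps with another abort path.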