@gotgenes/pi-subagents 6.7.0 → 6.8.1

Files changed (20)

package/CHANGELOG.md +24 -0
package/docs/architecture/architecture.md +30 -29
package/docs/plans/0111-split-agent-record-lifecycle.md +582 -0
package/docs/plans/0123-remove-vi-fn-cast-smell.md +179 -0
package/docs/retro/0110-agent-activity-tracker.md +44 -0
package/docs/retro/0111-split-agent-record-lifecycle.md +61 -0
package/package.json +1 -1
package/src/agent-manager.ts +41 -21
package/src/agent-record.ts +45 -34
package/src/execution-state.ts +17 -0
package/src/index.ts +2 -1
package/src/notification-state.ts +27 -0
package/src/notification.ts +9 -6
package/src/record-observer.ts +6 -7
package/src/service-adapter.ts +8 -7
package/src/tools/agent-tool.ts +6 -4
package/src/tools/get-result-tool.ts +7 -5
package/src/tools/steer-tool.ts +8 -6
package/src/ui/agent-menu.ts +2 -2
package/src/worktree-state.ts +35 -0

package/docs/plans/0123-remove-vi-fn-cast-smell.md ADDED

@@ -0,0 +1,179 @@
+---
+issue: 123
+issue_title: "refactor(pi-subagents): remove vi.fn() cast smell from test helpers"
+---
+# Remove vi.fn() cast smell from test helpers
+## Problem Statement
+Several test files construct mock objects typed to narrow interfaces (`AgentManagerLike`, `LifecycleRuntime`, `LifecycleManager`, `ToolStartRuntime`).
+Because the returned objects are typed to the interface — not to Vitest's mock types — tests that need to configure individual method stubs are forced to cast:
+```typescript
+(deps.manager.abort as ReturnType<typeof vi.fn>).mockReturnValue(false);
+(deps.manager.getRecord as ReturnType<typeof vi.fn>).mockReturnValue(record);
+```
+This silences TypeScript without constraining the call's return type — if `getRecord`'s return type changes, the cast won't catch it.
+Nine occurrences exist across three test files.
+## Goals
+- Eliminate all `as ReturnType<typeof vi.fn>` casts from the test suite.
+- Preserve type safety: mock configuration calls should be checked against the real method signatures.
+- Keep the change minimal — this is a test hygiene fix, not a structural redesign.
+## Non-Goals
+- Changing `AgentManagerLike`, `LifecycleRuntime`, `LifecycleManager`, `ToolStartRuntime`, or any production code.
+- Restructuring test layout or merging describe blocks.
+## Background
+The cast pattern was noted during #111 implementation and preserved to keep scope tight.
+Issue #111 (split `AgentRecord` lifecycle state) is now closed and implemented.
+### Affected files
+| File                               | Occurrences | Interface                              |
+| ---------------------------------- | ----------- | -------------------------------------- |
+| `test/service-adapter.test.ts`     | 5           | `AgentManagerLike`                     |
+| `test/handlers/lifecycle.test.ts`  | 2           | `LifecycleRuntime`, `LifecycleManager` |
+| `test/handlers/tool-start.test.ts` | 2           | `ToolStartRuntime`                     |
+### Cast sites by file
+`service-adapter.test.ts` — the "steer, abort, waitForAll, hasRunning" block's `createDeps` returns `AdapterDeps` directly.
+Five casts reconfigure `getRecord`, `abort`, or `queueSteer` after construction:
+1. `(deps.manager.abort as ReturnType<typeof vi.fn>).mockReturnValue(false)`
+2. `(deps.manager.getRecord as ReturnType<typeof vi.fn>).mockReturnValue({...})` (×4)
+`lifecycle.test.ts` — mock objects are assigned to `let` variables in `beforeEach`, typed to `LifecycleRuntime` and `LifecycleManager`.
+Two casts reconfigure methods to track call order:
+1. `(runtime.setSessionContext as ReturnType<typeof vi.fn>).mockImplementation(...)`
+2. `(manager.clearCompleted as ReturnType<typeof vi.fn>).mockImplementation(...)`
+Note: the same file already uses `vi.mocked()` in the shutdown-order test — both patterns coexist, which is itself a consistency smell.
+`tool-start.test.ts` — mock object assigned to a `let` variable typed to `ToolStartRuntime`.
+Two casts reconfigure methods to track call order:
+1. `(runtime.setUICtx as ReturnType<typeof vi.fn>).mockImplementation(...)`
+2. `(runtime.onTurnStart as ReturnType<typeof vi.fn>).mockImplementation(...)`
+### Approach: named-variable extraction
+Extract individual `vi.fn()` stubs into named variables.
+This is the approach the issue recommends and it aligns with the testing skill's guidance on extractable stubs.
+The alternative — `vi.mocked()` — is already used in `lifecycle.test.ts` for the shutdown-order test and works for hand-built mocks, but is semantically less clean: `vi.mocked()` asserts that a value is already a mock, which is true here but opaque to readers.
+Named variables make the mock-ness explicit at the construction site.
+For `lifecycle.test.ts`, the named-variable approach also eliminates the inconsistency between the two ordering tests — one currently uses `vi.mocked()` and the other uses casts.
+After this change both will use named stubs.
+## Design Overview
+### service-adapter.test.ts
+Refactor the "steer, abort, waitForAll, hasRunning" block's `createDeps` to return named stubs:
+```typescript
+function createDeps(overrides: Partial<AdapterDeps> = {}) {
+  const mockGetRecord = vi.fn<AgentManagerLike["getRecord"]>();
+  const mockAbort = vi.fn<AgentManagerLike["abort"]>(() => true);
+  const mockQueueSteer = vi.fn<AgentManagerLike["queueSteer"]>(() => true);
+  const deps: AdapterDeps = {
+    manager: {
+      spawn: vi.fn(() => "id"),
+      getRecord: mockGetRecord,
+      listAgents: vi.fn(() => []),
+      abort: mockAbort,
+      waitForAll: vi.fn(async () => {}),
+      hasRunning: vi.fn(() => true),
+      queueSteer: mockQueueSteer,
+    },
+    resolveModel: vi.fn(),
+    getCtx: () => ({ pi: {}, ctx: {} }),
+    getModelRegistry: () => ({ find: () => null, getAll: () => [] }),
+    ...overrides,
+  };
+  return { deps, mockGetRecord, mockAbort, mockQueueSteer };
+}
+```
+Callers destructure what they need:
+```typescript
+const { deps, mockAbort } = createDeps();
+mockAbort.mockReturnValue(false);  // ← type-checked, no cast
+```
+### lifecycle.test.ts
+Promote the `beforeEach`-scoped `runtime` and `manager` mock construction to use named stubs.
+The stubs that need reconfiguration (`setSessionContext`, `clearCompleted`) become named `let` variables alongside the existing `runtime`/`manager` lets, reset in `beforeEach`:
+```typescript
+let mockSetSessionContext: MockInstance<LifecycleRuntime["setSessionContext"]>;
+let mockClearCompleted: MockInstance<LifecycleManager["clearCompleted"]>;
+// ...assigned in beforeEach when building runtime/manager
+```
+Also convert the shutdown-order test's `vi.mocked()` calls to the same pattern for consistency — `unpublishService`, `clearSessionContext`, `abortAll`, `disposeNotifications`, `dispose` all become named stubs.
+### tool-start.test.ts
+Same pattern: promote `setUICtx` and `onTurnStart` to named `let` variables:
+```typescript
+let mockSetUICtx: MockInstance<ToolStartRuntime["setUICtx"]>;
+let mockOnTurnStart: MockInstance<ToolStartRuntime["onTurnStart"]>;
+```
+## Module-Level Changes
+| File                               | Change                                                                                                                                                                                                |
+| ---------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `test/service-adapter.test.ts`     | Refactor `createDeps` in the "steer, abort, waitForAll, hasRunning" block to return named mock stubs alongside `deps`. Update all 5 cast sites to use named stubs.                                    |
+| `test/handlers/lifecycle.test.ts`  | Extract `mockSetSessionContext`, `mockClearCompleted`, `mockAbortAll`, `mockDispose`, `mockClearSessionContext` as named `let` variables. Replace 2 casts and 5 `vi.mocked()` calls with named stubs. |
+| `test/handlers/tool-start.test.ts` | Extract `mockSetUICtx` and `mockOnTurnStart` as named `let` variables. Replace 2 casts with named stubs.                                                                                              |
+No production files are changed.
+## Test Impact Analysis
+1. No new tests are added — this is a refactoring of existing test infrastructure.
+2. No tests become redundant — every existing assertion stays.
+3. All existing tests must pass unchanged; only the mock-wiring changes.
+## TDD Order
+1. **Commit:** Refactor `createDeps` in `service-adapter.test.ts` to return named stubs; update all 5 cast sites.
+   All tests pass before and after.
+   Commit: `test: remove vi.fn() cast smell from service-adapter tests (#123)`
+2. **Commit:** Extract named stubs in `lifecycle.test.ts`; replace 2 casts and 5 `vi.mocked()` calls.
+   All tests pass.
+   Commit: `test: remove vi.fn() cast smell from lifecycle tests (#123)`
+3. **Commit:** Extract named stubs in `tool-start.test.ts`; replace 2 casts.
+   All tests pass.
+   Commit: `test: remove vi.fn() cast smell from tool-start tests (#123)`
+Each step is an independent file — order doesn't matter, but one-file-per-commit keeps diffs reviewable.
+## Risks and Mitigations
+| Risk                                                                                                                          | Mitigation                                                                                              |
+| ----------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------- |
+| Overrides via `...overrides` in `service-adapter.test.ts` could replace a manager method, leaving the named stub disconnected | Only `manager`-level overrides are spread; individual method overrides aren't used in this block.       |
+| Named stubs add return-surface to helpers                                                                                     | Each helper is test-local and the extra names are self-documenting. The alternative (casting) is worse. |
+| Converting `vi.mocked()` in `lifecycle.test.ts` shutdown test expands scope slightly beyond the cast pattern                  | Worth it for consistency — mixing `vi.mocked()` and named stubs in the same file is a different smell.  |
+## Open Questions
+None — the issue is fully scoped and the approach is established in the codebase.

package/docs/retro/0110-agent-activity-tracker.md ADDED

@@ -0,0 +1,44 @@
+---
+issue: 110
+issue_title: "refactor(pi-subagents): wrap AgentActivity in AgentActivityTracker class"
+---
+# Retro: #110 — wrap AgentActivity in AgentActivityTracker class
+## Final Retrospective (2026-05-21T23:30:00Z)
+### Session summary
+Planned and implemented `AgentActivityTracker` class across 6 TDD cycles plus doc updates, released as `pi-subagents-v6.7.0`.
+The 7-field mutable `AgentActivity` interface was replaced with a class exposing explicit transition methods (`onToolStart`, `onToolEnd`, `onMessageStart`, `onMessageUpdate`, `onTurnEnd`, `onUsageUpdate`, `setSession`) and read-only accessors.
+All 7 source files and 3 test files were migrated incrementally without any big-bang commit.
+### Observations
+#### What went well
+- **TDD Red phase caught all three implementation bugs.**
+  1. `onToolEnd` initially incremented `toolUses` unconditionally (ported from original code), but the plan specified no-op defensive behavior.
+     The Red phase test `"onToolEnd with no matching tool is a no-op"` caught it instantly.
+  2. `Date.now()` key collision in `activeTools` Map — two `onToolStart("Read")` calls in the same millisecond produced identical keys, so the second overwrote the first.
+     The Red phase test `"multiple concurrent tools with same name tracked independently"` caught it.
+  3. `describeActivity` signature needed `ReadonlyMap<string, string>` after the accessor change — caught by `pnpm run check` in step 3.
+  All three were fixed immediately with no cascading rework.
+- **Incremental migration avoided type breakage.**
+  The plan kept `AgentActivity` alive in `agent-widget.ts` until step 3, so steps 1–2 compiled without touching downstream files.
+  Each step only broke the files it was about to migrate, keeping intermediate states valid.
+- **Monotonic counter is strictly better than `Date.now()` for tool keys.**
+  The extraction enabled replacing the `toolName + "_" + Date.now()` key strategy with `toolName + "_" + (++this._toolKeySeq)`, which never collides regardless of timing.
+  This is a concrete improvement the original inline code couldn't easily adopt.
+#### What caused friction (agent side)
+- `missing-context` — The plan specified the `Date.now()` key strategy from the original code, but didn't account for same-millisecond collisions in test execution.
+  Impact: ~1 minute debugging in step 1; trivial fix to monotonic counter.
+- `premature-convergence` — Initial `onToolEnd` implementation copied the original's unconditional `toolUses++` before checking the plan's specified no-op behavior.
+  Impact: caught immediately by the Red phase test, single-line fix.
+#### What caused friction (user side)
+- No material friction observed.
+  The session ran end-to-end (plan → implement → ship → release) without user intervention.
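
The key-collision fix described in this retro can be illustrated with a minimal sketch. `ToolKeyTracker` below is a hypothetical, simplified stand-in and not the package's `AgentActivityTracker`; only the `toolName + "_" + (++this._toolKeySeq)` key strategy is taken from the retro, and the `onToolEnd(key)` signature is an assumption made for illustration.

```typescript
// Hypothetical, simplified stand-in for the tracker described in the retro.
// Only the key-generation strategy is modeled here.
class ToolKeyTracker {
  private _toolKeySeq = 0;
  private readonly _activeTools = new Map<string, string>();

  /** Register a running tool and return its unique key. */
  onToolStart(toolName: string): string {
    // A per-instance counter never repeats, unlike Date.now(), which can
    // return the same value for two calls within one millisecond.
    const key = toolName + "_" + (++this._toolKeySeq);
    this._activeTools.set(key, toolName);
    return key;
  }

  /** Remove a tool by key (assumed signature). Unknown keys are a no-op. */
  onToolEnd(key: string): boolean {
    return this._activeTools.delete(key);
  }

  /** Read-only view of the currently running tools. */
  get activeTools(): ReadonlyMap<string, string> {
    return this._activeTools;
  }
}

const tracker = new ToolKeyTracker();
const first = tracker.onToolStart("Read");
const second = tracker.onToolStart("Read");
```

With a timestamp-based key, the two `onToolStart("Read")` calls could share a key and the second entry would overwrite the first; with the counter, both entries stay in the map.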

package/docs/retro/0111-split-agent-record-lifecycle.md ADDED

@@ -0,0 +1,61 @@
+---
+issue: 111
+issue_title: "refactor(pi-subagents): split AgentRecord lifecycle state into phase-specific objects"
+---
+# Retro: #111 — split AgentRecord lifecycle state into phase-specific objects
+## Final Retrospective (2026-05-22T01:50:00Z)
+### Session summary
+Planned and implemented the `AgentRecord` lifecycle split across 12 TDD cycles plus doc updates, released as `pi-subagents-v6.8.0`.
+Three new phase-specific collaborators (`ExecutionState`, `WorktreeState`, `NotificationState`) replace 9 post-construction mutable fields.
+`pendingSteers` moved to a `Map` on `AgentManager`; stats (`toolUses`, `lifetimeUsage`, `compactionCount`) encapsulated behind mutation methods with read-only getters.
+`AgentRecordInit` trimmed from 19 optional fields to 4.
+### Observations
+#### What went well
+- **Lift-and-shift scaled from 7 files (#110) to 18 files (#111) without any intermediate test breakage.**
+  Every commit left all 41 test files passing.
+  The pattern — add new alongside old, migrate consumers with fallbacks (`record.execution?.session ?? record.session`), strip fallbacks in a final commit — is reliable for multi-step encapsulation refactors.
+- **Stats encapsulation was simpler than expected.**
+  Converting `toolUses`, `lifetimeUsage`, `compactionCount` to private fields with getters and mutation methods required zero changes to read-only consumers because the getter names match the old field names.
+  Only `record-observer.ts` (the sole writer) needed updating.
+- **The `createTestRecord` factory intersection type trick preserved backward compatibility.**
+  The factory accepts `toolUses?: number` via `Partial<AgentRecordInit> & { toolUses?: number; ... }` and internally calls `record.incrementToolUses()` in a loop.
+  This let 10+ test files continue passing `toolUses: 5` without rewriting each to call mutation methods directly.
+- **`Promise.withResolvers` timing analysis in the plan was unnecessary.**
+  The plan spent ~40 lines analyzing whether `promise` should live inside `ExecutionState` and concluded it should stay separate.
+  Implementation confirmed: `record.execution` is set in `onSessionCreated` (async callback), `record.promise` is set after `runner.run()` (synchronous return) — different moments, straightforward.
+#### What caused friction (agent side)
+- `missing-context` — In the step 7 test for `record.execution`, the initial mock runner used `mockResolvedValue(...)` which doesn't call `onSessionCreated`, so `record.execution` stayed `undefined`.
+  Had to switch to `mockImplementation(async (..., opts) => { opts.onSessionCreated?.(session); ... })`.
+  The existing tests in the same file already use this pattern for record-observer tests, but I didn't check them first.
+  Impact: one test rewrite (~2 minutes), no rework to production code.
+- `scope-drift` — Step 4 absorbed step 5 (adding collaborator fields) without noting the merge in the commit or session log.
+  Step 5 became a no-op.
+  Impact: no rework, but the session narrative skipped a plan step without explanation.
+- `wrong-abstraction` — Step 12 was planned as a simple cleanup ("remove old fields and trim `AgentRecordInit`") but required coordinated changes across 18 files: removing 9 fields from `AgentRecordInit`, updating the `createTestRecord` factory, fixing 5 test files that passed removed fields, and stripping all fallback patterns.
+  This was 2-3 steps' worth of work compressed into one.
+  Impact: step 12 took significantly longer than other steps, though it landed cleanly.
+- `missing-context` — Did not proactively flag the `as ReturnType<typeof vi.fn>` cast smell in `service-adapter.test.ts` while migrating that file.
+  The user noticed it and asked about it.
+  Filed as #123.
+  Impact: added friction but no rework; follow-up issue created.
+  User-caught.
+#### What caused friction (user side)
+- No material friction observed.
+  The user's `ask_user` decisions during planning (NotificationState collaborator, Map on AgentManager) gave clear direction.
+  Quick "follow-up" response on the cast smell kept scope tight.
+### Changes made
+1. `packages/pi-subagents/docs/retro/0111-split-agent-record-lifecycle.md` — this retro file.
+2. `.pi/skills/testing/SKILL.md` — added field-removal rule symmetric to the existing field-addition rule (esbuild silent pass-through on unknown init properties).
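
The `createTestRecord` intersection-type trick mentioned in this retro might look roughly like the sketch below. The real factory is not part of this diff, so `RecordInit`, `StatsRecord`, `TestRecordOptions`, and the default id are hypothetical stand-ins; only the idea of accepting `toolUses` through an intersection type and replaying it via `incrementToolUses()` comes from the retro.

```typescript
// Minimal stand-ins; the real AgentRecord and AgentRecordInit have more fields.
interface RecordInit {
  id: string;
  description?: string;
}

class StatsRecord {
  readonly id: string;
  private _toolUses = 0;
  get toolUses(): number { return this._toolUses; }

  constructor(init: RecordInit) {
    this.id = init.id;
  }

  incrementToolUses(): void {
    this._toolUses++;
  }
}

// The intersection type lets tests keep passing `toolUses` even though the
// init type no longer accepts it.
type TestRecordOptions = Partial<RecordInit> & { toolUses?: number };

function createTestRecord(options: TestRecordOptions = {}): StatsRecord {
  const { toolUses = 0, ...init } = options;
  const created = new StatsRecord({ id: "test-agent", ...init });
  // Replay the requested count through the public mutation method.
  for (let i = 0; i < toolUses; i++) created.incrementToolUses();
  return created;
}

const testRecord = createTestRecord({ toolUses: 5 });
```

Existing tests that pass `toolUses: 5` keep compiling, while production code can only change the count through `incrementToolUses()`.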

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@gotgenes/pi-subagents",
-  "version": "6.7.0",
+  "version": "6.8.1",
   "exports": {
     ".": "./src/service.ts"
   },

package/src/agent-manager.ts CHANGED

@@ -13,11 +13,13 @@ import { AgentRecord } from "./agent-record.js";
 import type { AgentRunner } from "./agent-runner.js";
 import { AgentTypeRegistry } from "./agent-types.js";
 import { debugLog } from "./debug.js";
+import type { ExecutionState } from "./execution-state.js";
 import { buildParentSnapshot } from "./parent-snapshot.js";
 import { subscribeRecordObserver } from "./record-observer.js";
 import type { RunConfig } from "./runtime.js";
 import type { AgentInvocation, IsolationMode, ParentSnapshot, ShellExec, SubagentType, ThinkingLevel } from "./types.js";
 import type { WorktreeManager } from "./worktree.js";
+import { WorktreeState } from "./worktree-state.js";
 export type OnAgentComplete = (record: AgentRecord) => void;
 export type OnAgentStart = (record: AgentRecord) => void;
@@ -92,6 +94,8 @@ export class AgentManager {
   private queue: { id: string; args: SpawnArgs }[] = [];
   /** Number of currently running background agents. */
   private runningBackground = 0;
+  /** Steers buffered for agents whose session hasn’t been created yet. */
+  private pendingSteers = new Map<string, string[]>();
   constructor(options: AgentManagerOptions) {
     this.runner = options.runner;
@@ -116,6 +120,19 @@ export class AgentManager {
     this.drainQueue();
   }
+  /**
+   * Buffer a steer message for an agent whose session isn’t ready yet.
+   * Returns false if the agent id is not tracked (already cleaned up or unknown).
+   * Called by steer-tool and service-adapter when record.execution is undefined.
+   */
+  queueSteer(id: string, message: string): boolean {
+    if (!this.agents.has(id)) return false;
+    const steers = this.pendingSteers.get(id) ?? [];
+    steers.push(message);
+    this.pendingSteers.set(id, steers);
+    return true;
+  }
   /**
    * Spawn an agent and return its ID immediately (for background use).
    * If the concurrency limit is reached, the agent is queued.
@@ -173,7 +190,7 @@ export class AgentManager {
           'Initialize git and commit at least once, or omit `isolation`.',
         );
       }
-      record.worktree = wt;
+      record.worktreeState = new WorktreeState(wt);
       worktreeCwd = wt.path;
     }
@@ -207,17 +224,18 @@ export class AgentManager {
       signal: record.abortController!.signal,
       registry: this.registry,
       onSessionCreated: (session) => {
-        record.session = session;
         // Capture the session file path early so it's available for display
         // before the run completes (e.g. in background agent status messages).
-        const file = session.sessionManager?.getSessionFile?.();
-        if (file) record.outputFile = file;
+        const outputFile = session.sessionManager?.getSessionFile?.() ?? undefined;
+        // Set the execution-state collaborator — born complete at session creation.
+        record.execution = { session, outputFile };
         // Flush any steers that arrived before the session was ready
-        if (record.pendingSteers?.length) {
-          for (const msg of record.pendingSteers) {
+        const buffered = this.pendingSteers.get(id);
+        if (buffered?.length) {
+          for (const msg of buffered) {
             session.steer(msg).catch(() => {});
           }
-          record.pendingSteers = undefined;
+          this.pendingSteers.delete(id);
         }
         // Subscribe record observer for stats accumulation
         unsubRecordObserver = subscribeRecordObserver(session, record, {
@@ -232,9 +250,9 @@ export class AgentManager {
         // Clean up worktree before transition so the final result includes branch text
         let finalResult = responseText;
-        if (record.worktree) {
-          const wtResult = this.worktrees.cleanup(record.worktree, options.description);
-          record.worktreeResult = wtResult;
+        if (record.worktreeState) {
+          const wtResult = this.worktrees.cleanup(record.worktreeState, options.description);
+          record.worktreeState.recordCleanup(wtResult);
           if (wtResult.hasChanges && wtResult.branch) {
             finalResult += `\n\n---\nChanges saved to branch \`${wtResult.branch}\`. Merge with: \`git merge ${wtResult.branch}\``;
           }
@@ -245,8 +263,8 @@ export class AgentManager {
         else if (steered) record.markSteered(finalResult);
         else record.markCompleted(finalResult);
-        record.session = session;
-        if (sessionFile) record.outputFile = sessionFile;
+        // Update execution collaborator with final session/outputFile from runner
+        record.execution = { session, outputFile: sessionFile ?? record.execution?.outputFile };
         if (options.isBackground) {
           this.runningBackground--;
@@ -262,10 +280,11 @@ export class AgentManager {
         detach();
         // Best-effort worktree cleanup on error
-        if (record.worktree) {
+        if (record.worktreeState) {
           try {
-            const wtResult = this.worktrees.cleanup(record.worktree, options.description);
-            record.worktreeResult = wtResult;
+            const wtResult = this.worktrees.cleanup(record.worktreeState, options.description);
+            record.worktreeState.recordCleanup(wtResult);
           } catch (err) { debugLog("cleanupWorktree on agent error", err); }
         }
@@ -322,16 +341,17 @@ export class AgentManager {
     signal?: AbortSignal,
   ): Promise<AgentRecord | undefined> {
     const record = this.agents.get(id);
-    if (!record?.session) return undefined;
+    const session = record?.execution?.session;
+    if (!session) return undefined;
     record.resetForResume(Date.now());
-    const unsubResume = subscribeRecordObserver(record.session, record, {
+    const unsubResume = subscribeRecordObserver(session, record, {
       onCompact: (r, info) => this.onCompact?.(r, info),
     });
     try {
-      const responseText = await this.runner.resume(record.session, prompt, {
+      const responseText = await this.runner.resume(session, prompt, {
         signal,
       });
       record.markCompleted(responseText);
@@ -373,9 +393,9 @@ export class AgentManager {
   /** Dispose a record's session and remove it from the map. */
   private removeRecord(id: string, record: AgentRecord): void {
-    record.session?.dispose?.();
-    record.session = undefined;
+    record.execution?.session?.dispose?.();
     this.agents.delete(id);
+    this.pendingSteers.delete(id);
   }
   private cleanup() {
@@ -448,7 +468,7 @@ export class AgentManager {
     // Clear queue
     this.queue = [];
     for (const record of this.agents.values()) {
-      record.session?.dispose();
+      record.execution?.session?.dispose();
     }
     this.agents.clear();
     // Prune any orphaned git worktrees (crash recovery)
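
The steer-buffering flow in the hunks above (queue while no session exists, flush in `onSessionCreated`, drop the buffer afterwards) can be sketched in isolation. `SteerBuffer`, `SteerableSession`, `track`, and `pendingCount` are hypothetical names introduced for this sketch; the bodies of `queueSteer` and the flush loop follow the diff.

```typescript
// Hypothetical stand-in for the session type; only `steer` is modeled.
interface SteerableSession {
  steer(message: string): Promise<void>;
}

class SteerBuffer {
  private readonly agents = new Set<string>();
  private readonly pendingSteers = new Map<string, string[]>();

  /** Hypothetical helper: mark an agent id as known. */
  track(id: string): void {
    this.agents.add(id);
  }

  /** Buffer a steer for a tracked agent; unknown ids are rejected. */
  queueSteer(id: string, message: string): boolean {
    if (!this.agents.has(id)) return false;
    const steers = this.pendingSteers.get(id) ?? [];
    steers.push(message);
    this.pendingSteers.set(id, steers);
    return true;
  }

  /** Deliver buffered steers in arrival order, then drop the buffer. */
  onSessionCreated(id: string, session: SteerableSession): void {
    const buffered = this.pendingSteers.get(id);
    if (buffered?.length) {
      for (const msg of buffered) {
        session.steer(msg).catch(() => {});
      }
      this.pendingSteers.delete(id);
    }
  }

  /** Hypothetical helper: number of steers still waiting for a session. */
  pendingCount(id: string): number {
    return this.pendingSteers.get(id)?.length ?? 0;
  }
}

const delivered: string[] = [];
const fakeSession: SteerableSession = {
  steer: async (message) => { delivered.push(message); },
};

const steerBuffer = new SteerBuffer();
steerBuffer.track("agent-1");
steerBuffer.queueSteer("agent-1", "focus on tests");
steerBuffer.queueSteer("agent-1", "skip docs");
const rejected = steerBuffer.queueSteer("unknown", "ignored");
steerBuffer.onSessionCreated("agent-1", fakeSession);
```

Keying the buffer by agent id on the manager, rather than storing it on the record, means the record no longer carries a field that is only meaningful before the session exists.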

package/src/agent-record.ts CHANGED

@@ -5,12 +5,19 @@
  * by the class and exposed via transition methods. External code reads these
  * fields through public properties but cannot write them directly.
  *
- * Non-transition state (session, toolUses, lifetimeUsage, etc.) remains public.
+ * Stats (toolUses, lifetimeUsage, compactionCount) are owned by the class and
+ * accumulated via mutation methods (incrementToolUses, addUsage, incrementCompactions).
+ *
+ * Phase-specific collaborators (execution, worktreeState, notification) are attached
+ * after construction as lifecycle information becomes available.
  */
-import type { AgentSession } from "@earendil-works/pi-coding-agent";
+import type { ExecutionState } from "./execution-state.js";
+import type { NotificationState } from "./notification-state.js";
 import type { AgentInvocation, SubagentType } from "./types.js";
 import type { LifetimeUsage } from "./usage.js";
+import { addUsage } from "./usage.js";
+import type { WorktreeState } from "./worktree-state.js";
 export type AgentRecordStatus =
 	| "queued"
@@ -30,19 +37,9 @@ export interface AgentRecordInit {
 	completedAt?: number;
 	result?: string;
 	error?: string;
-	toolUses?: number;
-	lifetimeUsage?: LifetimeUsage;
-	compactionCount?: number;
 	abortController?: AbortController;
 	invocation?: AgentInvocation;
-	session?: AgentSession;
 	promise?: Promise<string>;
-	resultConsumed?: boolean;
-	pendingSteers?: string[];
-	worktree?: { path: string; branch: string };
-	worktreeResult?: { hasChanges: boolean; branch?: string };
-	toolCallId?: string;
-	outputFile?: string;
 }
 export class AgentRecord {
@@ -68,19 +65,25 @@ export class AgentRecord {
 	private _completedAt?: number;
 	get completedAt(): number | undefined { return this._completedAt; }
-	// Non-transition mutable state
-	toolUses: number;
-	lifetimeUsage: LifetimeUsage;
-	compactionCount: number;
-	session?: AgentSession;
-	abortController?: AbortController;
+	// Stats — accumulated via mutation methods, readable via getters
+	private _toolUses: number;
+	get toolUses(): number { return this._toolUses; }
+	private _lifetimeUsage: LifetimeUsage;
+	get lifetimeUsage(): Readonly<LifetimeUsage> { return this._lifetimeUsage; }
+	private _compactionCount: number;
+	get compactionCount(): number { return this._compactionCount; }
+	/** AbortController for cancelling this agent. Set at construction; used only by AgentManager. */
+	readonly abortController?: AbortController;
+	/** Promise for the full agent run (including post-processing). Set once by AgentManager. */
 	promise?: Promise<string>;
-	resultConsumed?: boolean;
-	pendingSteers?: string[];
-	worktree?: { path: string; branch: string };
-	worktreeResult?: { hasChanges: boolean; branch?: string };
-	toolCallId?: string;
-	outputFile?: string;
+	// Phase-specific collaborators — each born complete when their info becomes available
+	execution?: ExecutionState;
+	worktreeState?: WorktreeState;
+	notification?: NotificationState;
 	constructor(init: AgentRecordInit) {
 		this.id = init.id;
@@ -94,18 +97,26 @@ export class AgentRecord {
 		this._startedAt = init.startedAt ?? Date.now();
 		this._completedAt = init.completedAt;
-		this.toolUses = init.toolUses ?? 0;
-		this.lifetimeUsage = init.lifetimeUsage ?? { input: 0, output: 0, cacheWrite: 0 };
-		this.compactionCount = init.compactionCount ?? 0;
+		this._toolUses = 0;
+		this._lifetimeUsage = { input: 0, output: 0, cacheWrite: 0 };
+		this._compactionCount = 0;
 		this.abortController = init.abortController;
-		this.session = init.session;
 		this.promise = init.promise;
-		this.resultConsumed = init.resultConsumed;
-		this.pendingSteers = init.pendingSteers;
-		this.worktree = init.worktree;
-		this.worktreeResult = init.worktreeResult;
-		this.toolCallId = init.toolCallId;
-		this.outputFile = init.outputFile;
+	}
+	/** Increment tool use count. Called by record-observer on tool_execution_end. */
+	incrementToolUses(): void {
+		this._toolUses++;
+	}
+	/** Accumulate a usage delta into lifetimeUsage. Called by record-observer on message_end. */
+	addUsage(delta: { input: number; output: number; cacheWrite: number }): void {
+		addUsage(this._lifetimeUsage, delta);
+	}
+	/** Increment compaction count. Called by record-observer on compaction_end. */
+	incrementCompactions(): void {
+		this._compactionCount++;
 	}
 	/** Transition to running state. Sets status and startedAt. */

package/src/execution-state.ts ADDED

@@ -0,0 +1,17 @@
+/**
+ * execution-state.ts — ExecutionState: execution-phase state for a running agent.
+ *
+ * Constructed and attached to AgentRecord when onSessionCreated fires inside startAgent().
+ * Contains the session and output file — the two fields that become known once the
+ * runner creates the session. promise stays as a separate AgentRecord field because
+ * it is set at a different moment (after runner.run() returns).
+ */
+import type { AgentSession } from "@earendil-works/pi-coding-agent";
+export interface ExecutionState {
+	/** The active agent session — available from the moment the session is created. */
+	readonly session: AgentSession;
+	/** Path to the agent's session JSONL file, or undefined if not yet available. */
+	readonly outputFile: string | undefined;
+}

package/src/index.ts CHANGED

@@ -88,7 +88,7 @@ export default function (pi: ExtensionAPI) {
       });
       // Skip notification if result was already consumed via get_subagent_result
-      if (record.resultConsumed) {
+      if (record.notification?.resultConsumed) {
         notifications.cleanupCompleted(record.id);
         return;
       }
@@ -215,6 +215,7 @@ export default function (pi: ExtensionAPI) {
     getRecord: (id) => manager.getRecord(id),
     emitEvent: (name, data) => pi.events.emit(name, data),
     steerAgent: (session, message) => steerAgent(session, message),
+    queueSteer: (id, message) => manager.queueSteer(id, message),
   })));
   // ---- /agents interactive menu ----

package/src/notification-state.ts ADDED

@@ -0,0 +1,27 @@
+/**
+ * notification-state.ts — NotificationState: notification-scoped tracking per background agent.
+ *
+ * Constructed once when agent-tool assigns the tool call ID (background agents only).
+ * Foreground agents never get a NotificationState — record.notification stays undefined.
+ */
+export class NotificationState {
+	/** The tool call ID that spawned this background agent. Used in task-notification XML. */
+	readonly toolCallId: string;
+	private _resultConsumed = false;
+	constructor(toolCallId: string) {
+		this.toolCallId = toolCallId;
+	}
+	/** Whether the parent agent has already consumed this result (suppresses duplicate notifications). */
+	get resultConsumed(): boolean {
+		return this._resultConsumed;
+	}
+	/** Mark the result as consumed — suppresses the completion notification. */
+	markConsumed(): void {
+		this._resultConsumed = true;
+	}
+}
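
A short usage sketch shows how the optional `notification` collaborator behaves for foreground and background agents. The class is reproduced from the hunk above so the block is self-contained; `NotifiableRecord` and `isResultConsumed` are hypothetical names that mirror the `record.notification?.resultConsumed` check in `index.ts`.

```typescript
// Reproduced from the diff above so the sketch is self-contained.
class NotificationState {
  readonly toolCallId: string;
  private _resultConsumed = false;
  constructor(toolCallId: string) { this.toolCallId = toolCallId; }
  get resultConsumed(): boolean { return this._resultConsumed; }
  markConsumed(): void { this._resultConsumed = true; }
}

// Hypothetical record shape: only the optional collaborator is modeled.
interface NotifiableRecord {
  notification?: NotificationState;
}

// Hypothetical helper: foreground agents have no NotificationState, so the
// optional chain yields undefined and the result falls back to false.
function isResultConsumed(r: NotifiableRecord): boolean {
  return r.notification?.resultConsumed ?? false;
}

const foreground: NotifiableRecord = {};
const background: NotifiableRecord = { notification: new NotificationState("call-1") };
const beforeConsume = isResultConsumed(background);
background.notification?.markConsumed();
const afterConsume = isResultConsumed(background);
```

Because `resultConsumed` has no setter, the flag can only move from false to true through `markConsumed()`.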