@gotgenes/pi-subagents 10.0.1 → 10.2.0

Files changed (24)

package/CHANGELOG.md +41 -0
package/docs/architecture/architecture.md +78 -159
package/docs/architecture/history/phase-14-strip-policy.md +49 -0
package/docs/plans/0227-evolve-agent-record-into-agent.md +322 -0
package/docs/plans/0228-async-start-agent-dissolve-run-handle.md +288 -0
package/docs/retro/0227-evolve-agent-record-into-agent.md +80 -0
package/docs/retro/0228-async-start-agent-dissolve-run-handle.md +42 -0
package/docs/retro/0239-collapse-filter-active-tools.md +33 -0
package/package.json +1 -1
package/src/lifecycle/agent-manager.ts +70 -207
package/src/lifecycle/{agent-record.ts → agent.ts} +151 -13
package/src/lifecycle/execution-state.ts +2 -2
package/src/observation/notification.ts +8 -8
package/src/observation/record-observer.ts +7 -7
package/src/service/service-adapter.ts +8 -8
package/src/tools/agent-tool.ts +4 -4
package/src/tools/background-spawner.ts +2 -2
package/src/tools/foreground-runner.ts +4 -4
package/src/tools/get-result-tool.ts +2 -2
package/src/tools/steer-tool.ts +4 -5
package/src/types.ts +1 -1
package/src/ui/agent-creation-wizard.ts +2 -2
package/src/ui/agent-menu.ts +5 -5
package/src/ui/conversation-viewer.ts +3 -3

package/docs/retro/0227-evolve-agent-record-into-agent.md ADDED

@@ -0,0 +1,80 @@
+---
+issue: 227
+issue_title: "Evolve AgentRecord into Agent with behavior (Phase 15, Step 1)"
+---
+# Retro: #227 — Evolve AgentRecord into Agent with behavior
+## Stage: Planning (2026-05-27T12:00:00Z)
+### Session summary
+Produced an 8-step TDD plan to move per-agent behavior (`abort`, `queueSteer`/`flushPendingSteers`, `setupWorktree`) from `AgentManager` into `AgentRecord`, then rename `AgentRecord` → `Agent` across the codebase.
+The plan follows an "add behavior first, rename last" strategy to keep behavior diffs small and the rename commit purely mechanical.
+### Observations
+- `AgentRecord` is internal-only (public API is `SubagentRecord` in `service.ts`), so the rename is non-breaking.
+- The `queueSteer` method can be removed from `AgentManagerLike` and `SteerToolManager` interfaces entirely — both callers (`steer-tool`, `service-adapter`) already hold the agent reference from `getRecord()`, so they can call `agent.queueSteer()` directly.
+- Queue removal in `abort()` must stay on `AgentManager` until #230 extracts `ConcurrencyQueue`.
+- `RunHandle` ownership explicitly deferred to #228 — the plan does not touch `RunHandle` at all.
+- The rename step (step 7) touches ~30 files but is purely mechanical; all behavior changes land in steps 1–6.
+## Stage: Implementation — TDD (2026-05-27T13:00:00Z)
+### Session summary
+Completed all 8 TDD steps from the plan.
+Added 9 new tests (steer buffering, `abort()`, `setupWorktree()`) and migrated 977 existing tests to the renamed `Agent` class.
+Test count went from 977 to 986 across 62 test files.
+### Observations
+- Fallow reported `AgentInit` and `AgentStatus` as unused type exports from `types.ts`; suppressed with `// fallow-ignore-next-line unused-type` (correct singular form — tool's error message hints at this).
+- `ESLint` auto-removed an `as any` cast in the `setupWorktree` test (the mock `WorktreeManager` already satisfied the interface structurally); staged and re-committed cleanly.
+- Biome auto-formatted several test files during the rename commit; re-staged and re-committed.
+- Pre-completion reviewer returned **WARN** for 4 stale diagram/table references in `architecture.md` and the `package-pi-subagents` skill table; all fixed before the final commit.
+- No deviations from the plan's behavior design; the `queueSteer` removal from manager interfaces worked exactly as anticipated in the planning-stage observations.
+## Stage: Final Retrospective (2026-05-27T17:22:00Z)
+### Session summary
+Completed all stages in a single session: planning, 8 TDD steps, pre-completion review, shipping, and release as `pi-subagents-v10.1.0`.
+Three behaviors (`abort`, steer buffering, worktree setup) moved from `AgentManager` to `Agent`, followed by a codebase-wide rename (33 files).
+### Observations
+#### What went well
+- The "add behavior first, rename last" strategy kept behavior-adding commits small (1–2 files each) and the rename commit purely mechanical.
+- Planning identified that `queueSteer` could be removed from `AgentManagerLike` and `SteerToolManager` entirely — this simplified the delegation step and eliminated an unnecessary indirection layer.
+- Pre-completion reviewer caught 4 stale Mermaid diagram references and a skill table entry that the plan's step 8 did not anticipate; all fixed before shipping.
+#### What caused friction (agent side)
+1. `scope-drift` — Added `AgentInit` and `AgentStatus` to the `types.ts` re-export barrel during the rename step without verifying any file imports them from that path.
+   Impact: fallow flagged dead code, triggering a 4-call suppression trial (`unused-export` → `unused-types` → `unused-type`), then the user identified the real fix (remove the speculative re-exports entirely), requiring a follow-up `fix:` commit after docs were already done.
+2. `missing-context` — During the mechanical rename (step 7), `sed` commands matched `#test/helpers/make-record` but missed the relative import `"./helpers/make-record"` in `conversation-viewer.test.ts`.
+   Impact: `pnpm run check` caught it in 1 tool call; minimal rework.
+3. `missing-context` — The fallow skill documents `unused-export` as a suppression kind but not `unused-type`.
+   Impact: 3 wrong guesses before the correct suppression syntax.
+   Self-identified after fallow's error message suggested the correct kind name.
+#### What caused friction (user side)
+- The user's question about whether the fallow suppressions could be removed in a future step was a valuable prompt — it surfaced that the re-exports were speculative and could be removed immediately.
+  Earlier intervention (e.g., during the TDD stage when the suppressions were added) would have avoided the `fix:` commit.
+### Diagnostic details
+- **Model-performance correlation** — Pre-completion reviewer ran as `pre-completion-reviewer` subagent (default model); appropriate for judgment-heavy work (doc staleness, code design review).
+  No model mismatches.
+- **Feedback-loop gap analysis** — `pnpm run check` was run after every delegation step (steps 2, 4, 6, 7) and after every behavior-adding step (steps 1, 3, 5).
+  Verification was incremental throughout, not deferred to the end.
+  The `conversation-viewer.test.ts` import miss in step 7 was caught immediately by the type checker.
+### Changes made
+1. `.pi/skills/fallow/SKILL.md` — Added `unused-type` suppression example alongside existing `unused-export` example.
+2. `AGENTS.md` — Added "no speculative re-exports" rule to Code Style section.
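
The steer buffering that this retro describes moving from `AgentManager` onto `Agent` can be sketched as below. This is a minimal illustration, not the package's code: `SteerableSession` and `SteerBuffer` are hypothetical names introduced here, while `queueSteer` and `flushPendingSteers` are the method names mentioned in the retro. The best-effort delivery (swallowing rejections) mirrors the manager code removed in this diff.

```typescript
// Hypothetical minimal session shape; the real AgentSession comes from the Pi SDK.
interface SteerableSession {
  steer(message: string): Promise<void>;
}

// Hypothetical stand-in for the buffering part of Agent.
class SteerBuffer {
  private pendingSteers: string[] = [];

  /** Buffer a steer message until a session exists. */
  queueSteer(message: string): void {
    this.pendingSteers.push(message);
  }

  /** Deliver buffered steers in arrival order, then clear the buffer. */
  flushPendingSteers(session: SteerableSession): void {
    for (const msg of this.pendingSteers) {
      // Best-effort delivery: a failed steer must not break session setup.
      session.steer(msg).catch(() => {});
    }
    this.pendingSteers = [];
  }
}
```

Because the buffer lives on the per-agent object, callers that already hold the agent reference no longer need an id-keyed map on the manager, which is the simplification the retro notes.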

package/docs/retro/0228-async-start-agent-dissolve-run-handle.md ADDED

@@ -0,0 +1,42 @@
+---
+issue: 228
+issue_title: "Convert startAgent to async/await, move run lifecycle to Agent (Phase 15, Step 2)"
+---
+# Retro: #228 — Convert startAgent to async/await, move run lifecycle to Agent
+## Stage: Planning (2026-05-27T20:00:00Z)
+### Session summary
+Planned the async `startAgent` conversion and decided to dissolve `RunHandle` into Agent methods rather than moving it as a separate class.
+Identified three preparatory steps (narrow promise type, add Agent methods, hoist worktree setup) that make the final async conversion a minimal diff.
+### Observations
+- The original issue proposed `Agent.createRunHandle()` as a factory, keeping RunHandle as a separate class.
+  Analysis showed 5 of 6 RunHandle concerns are Agent state mutations — RunHandle is doing work that belongs on Agent.
+  The clincher was `resume()` in `agent-manager.ts`: it duplicates RunHandle's pattern manually, and #232 wants to unify them.
+  Dissolving RunHandle gives both `startAgent` and `resume` the same primitives (`completeRun`, `failRun`, `releaseListeners`).
+- The synchronous-throw contract in `spawn()` for worktree failures requires hoisting `record.setupWorktree()` out of `startAgent` before the async conversion.
+  Without this prep step, async `startAgent` would turn the throw into a rejected promise that `spawn()` doesn't catch.
+- `promise: Promise<string>` → `Promise<void>` is safe because the resolved string is dead — every consumer reads `record.result` instead.
+  Only one test assertion reads the resolved value.
+- `completeRun`/`failRun` take `worktrees: WorktreeManager` as a parameter rather than storing it on Agent (ISP — only needed at run end, exactly two callers).
+## Stage: Implementation — TDD (2026-05-27T20:40:00Z)
+### Session summary
+Implemented all 6 TDD steps: narrowed `promise` to `Promise<void>`, added 6 run lifecycle methods to Agent (+19 tests), replaced `RunHandle` with Agent methods (-85 LOC), hoisted worktree setup to callers, converted `startAgent` to async/await, and updated architecture docs.
+Test count: 986 → 1005.
+### Observations
+- Step 1 (promise narrowing) required fixing 3 additional test files not listed in the plan: `make-agent.test.ts`, `service-adapter.test.ts`, `get-result-tool.test.ts`.
+  All were trivial `Promise.resolve("done")` → `Promise.resolve()` changes and a cast removal.
+- The lift-and-shift approach worked cleanly — each of the 5 implementation commits was small and independently green.
+  The most impactful commit was step 3 (replace RunHandle, -96/+6 lines) which was risk-free because step 2 had already introduced the Agent methods.
+- Pre-completion reviewer returned WARN for stale `AgentRecord` and `run-handle.ts` references in `architecture.md` class diagram and layout listing.
+  This staleness predated the issue: #227's rename had not been fully propagated to the Mermaid diagrams.
+  Fixed by amending the docs commit.
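
The observation about the synchronous-throw contract rests on a general property of `async` functions: an exception thrown inside one surfaces as a rejected promise, so a surrounding `try`/`catch` without `await` never sees it. The sketch below illustrates that property only; every name in it (`setupOrThrow`, `startWithSetupInside`, `startOnly`, `spawnInside`, `spawnHoisted`) is hypothetical and not taken from the package.

```typescript
// Hypothetical stand-in for a setup step that can fail synchronously.
function setupOrThrow(ok: boolean): void {
  if (!ok) throw new Error("worktree setup failed");
}

// Variant A: setup runs inside the async function, so a throw becomes a rejection.
async function startWithSetupInside(ok: boolean): Promise<void> {
  setupOrThrow(ok);
}

// Variant B: the async function no longer performs setup.
async function startOnly(): Promise<void> {}

function spawnInside(ok: boolean): "started" | "caught" {
  try {
    const p = startWithSetupInside(ok);
    p.catch(() => {}); // silence the rejection for this demo
    return "started"; // reached even when setup failed
  } catch {
    return "caught"; // never reached: async functions do not throw synchronously
  }
}

function spawnHoisted(ok: boolean): "started" | "caught" {
  try {
    setupOrThrow(ok); // hoisted: failure is thrown synchronously
    void startOnly();
    return "started";
  } catch {
    return "caught";
  }
}
```

With these definitions, `spawnInside(false)` returns `"started"` while `spawnHoisted(false)` returns `"caught"`, which is why the plan hoists worktree setup to the callers before converting `startAgent` to `async`.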

package/docs/retro/0239-collapse-filter-active-tools.md CHANGED

@@ -35,3 +35,36 @@ A follow-up skill maintenance commit updated `.pi/skills/package-pi-subagents/SK
 - Pre-completion reviewer returned **WARN** for stale `package-pi-subagents` skill content: the "Patch 2 scheduled for removal" note and the `// Patch 2 (RepOne` grep instruction were both stale after #239 completion.
   Fixed immediately as a follow-up `docs:` commit before writing retro notes.
 - `pnpm fallow dead-code` passed with 0 issues — no orphaned exports left behind.
+## Stage: Final Retrospective (2026-05-27T14:40:00Z)
+### Session summary
+Completed the full issue lifecycle (plan → TDD → ship → retro) in a single continuous session.
+Issue #239 shipped as `pi-subagents-v10.0.1` with 7 commits (2 refactor, 4 docs, 1 release).
+Phase 14 is now fully complete, unblocking Phase 15 (#227–#232).
+### Observations
+#### What went well
+- The plan's type-dependency-chain ordering (`SessionConfig` first → expected compile errors → `agent-runner.ts` resolves them) produced zero surprises during TDD.
+  Each step's red/green boundary was exactly where the plan predicted.
+- Pre-completion reviewer caught stale `package-pi-subagents` skill content ("Patch 2 scheduled for removal" and a `// Patch 2 (RepOne` grep instruction) that no longer matched source.
+  Fixed before shipping — the reviewer earned its keep.
+- Multi-model routing was well-matched: `claude-sonnet-4-6` for planning and TDD (judgment + code), `deepseek-v4-flash` for shipping (mechanical checklist), `claude-opus-4-6` for retrospective (synthesis).
+- Feedback loops were incremental: `pnpm vitest run` after each test change, `pnpm run check` after Step 1 to confirm expected errors, full suite + lint + dead-code after all steps.
+#### What caused friction (agent side)
+No friction points identified.
+This was the final step of a 3-step phase with both dependencies already closed, a well-scoped plan, and internal-only API changes — the simplest possible lifecycle.
+#### What caused friction (user side)
+None observed.
+The user's involvement was limited to issuing the four standard lifecycle commands (`/plan-issue`, `/tdd-plan`, `/ship-issue`, `/retro`) with no corrections or redirections needed.
+### Changes made
+1. Appended Final Retrospective stage entry to `packages/pi-subagents/docs/retro/0239-collapse-filter-active-tools.md`.

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@gotgenes/pi-subagents",
-  "version": "10.0.1",
+  "version": "10.2.0",
   "type": "module",
   "exports": {
     ".": "./src/service.ts"

package/src/lifecycle/agent-manager.ts CHANGED

@@ -1,5 +1,5 @@
 /**
- * agent-manager.ts — Tracks agents, background execution, resume support.
+ * agent-manager.ts - Tracks agents, background execution, resume support.
  *
  * Background agents are subject to a configurable concurrency limit (default: 4).
  * Excess agents are queued and auto-started as running agents complete.
@@ -11,114 +11,25 @@ import type { Model } from "@earendil-works/pi-ai";
 import type { AgentSession } from "@earendil-works/pi-coding-agent";
 import { AgentTypeRegistry } from "#src/config/agent-types";
 import { debugLog } from "#src/debug";
-import { AgentRecord } from "#src/lifecycle/agent-record";
-import type { AgentRunner, RunResult } from "#src/lifecycle/agent-runner";
+import { Agent } from "#src/lifecycle/agent";
+import type { AgentRunner } from "#src/lifecycle/agent-runner";
 import type { ParentSnapshot } from "#src/lifecycle/parent-snapshot";
 import type { WorktreeManager } from "#src/lifecycle/worktree";
-import { WorktreeState } from "#src/lifecycle/worktree-state";
 import { NotificationState } from "#src/observation/notification-state";
-import { subscribeRecordObserver } from "#src/observation/record-observer";
+import { subscribeAgentObserver } from "#src/observation/record-observer";
 import type { RunConfig } from "#src/runtime";
 import type { AgentInvocation, IsolationMode, ShellExec, SubagentType, ThinkingLevel } from "#src/types";
-/**
- * RunHandle — per-run lifecycle object that owns cleanup state.
- *
- * Owns the observer unsubscribe and parent-signal detach handles acquired during
- * a run. Exposes `complete()` and `fail()` as the only way to finish a run,
- * eliminating mutable closure variables from `startAgent`.
- * `fireOnFinished` is idempotent — safe to call from both success and error paths.
- */
-class RunHandle {
-  private unsub?: () => void;
-  private detachFn?: () => void;
-  private onFinished?: () => void;
-  constructor(
-    private readonly record: AgentRecord,
-    private readonly worktrees: WorktreeManager,
-    onFinished?: () => void,
-  ) {
-    this.onFinished = onFinished;
-  }
-  /** Wire a parent AbortSignal so it stops this agent when fired. */
-  wireSignal(signal: AbortSignal | undefined, onAbort: () => void): void {
-    if (!signal) return;
-    const listener = () => onAbort();
-    signal.addEventListener("abort", listener, { once: true });
-    this.detachFn = () => signal.removeEventListener("abort", listener);
-  }
-  /** Store the record-observer unsubscribe handle (called from onSessionCreated). */
-  attachObserver(unsub: () => void): void {
-    this.unsub = unsub;
-  }
-  /** Complete a run successfully — clean up, transition record, fire onFinished. */
-  complete(result: RunResult): string {
-    this.releaseListeners();
-    let finalResult = result.responseText;
-    if (this.record.worktreeState) {
-      const wtResult = this.record.worktreeState.performCleanup(this.worktrees, this.record.description);
-      if (wtResult.hasChanges && wtResult.branch) {
-        finalResult += `\n\n---\nChanges saved to branch \`${wtResult.branch}\`. Merge with: \`git merge ${wtResult.branch}\``;
-      }
-    }
-    if (result.aborted) this.record.markAborted(finalResult);
-    else if (result.steered) this.record.markSteered(finalResult);
-    else this.record.markCompleted(finalResult);
-    // Update execution with the final session/outputFile from the runner
-    this.record.execution = {
-      session: result.session,
-      outputFile: result.sessionFile ?? this.record.execution?.outputFile,
-    };
-    this.fireOnFinished();
-    return result.responseText;
-  }
-  /** Fail a run — mark error, best-effort worktree cleanup, fire onFinished. */
-  fail(err: unknown): void {
-    this.record.markError(err);
-    this.releaseListeners();
-    if (this.record.worktreeState) {
-      try {
-        this.record.worktreeState.performCleanup(this.worktrees, this.record.description);
-      } catch (cleanupErr) { debugLog("cleanupWorktree on agent error", cleanupErr); }
-    }
-    this.fireOnFinished();
-  }
-  private releaseListeners(): void {
-    this.unsub?.();
-    this.unsub = undefined;
-    this.detachFn?.();
-    this.detachFn = undefined;
-  }
-  /** Fire the onFinished callback at most once. */
-  private fireOnFinished(): void {
-    const fn = this.onFinished;
-    this.onFinished = undefined;
-    fn?.();
-  }
-}
 export type CompactionInfo = { reason: "manual" | "threshold" | "overflow"; tokensBefore: number };
 /** Observer interface for agent lifecycle notifications. */
 export interface AgentManagerObserver {
-  onAgentStarted(record: AgentRecord): void;
-  onAgentCompleted(record: AgentRecord): void;
-  onAgentCompacted(record: AgentRecord, info: CompactionInfo): void;
+  onAgentStarted(record: Agent): void;
+  onAgentCompleted(record: Agent): void;
+  onAgentCompacted(record: Agent, info: CompactionInfo): void;
   /** Fires synchronously after a background agent record is created (before startAgent). */
-  onAgentCreated(record: AgentRecord): void;
+  onAgentCreated(record: Agent): void;
 }
 /** Default max concurrent background agents. */
@@ -129,7 +40,7 @@ export interface AgentManagerOptions {
   worktrees: WorktreeManager;
   exec: ShellExec;
   registry: AgentTypeRegistry;
-  /** Injected getter for the concurrency limit — owned by SettingsManager. */
+  /** Injected getter for the concurrency limit - owned by SettingsManager. */
   getMaxConcurrent?: () => number;
   getRunConfig?: () => RunConfig;
   observer?: AgentManagerObserver;
@@ -160,25 +71,25 @@ export interface AgentSpawnConfig {
   thinkingLevel?: ThinkingLevel;
   isBackground?: boolean;
   /**
-   * Skip the maxConcurrent queue check for this spawn — start immediately even
+   * Skip the maxConcurrent queue check for this spawn - start immediately even
    * if the configured concurrency limit would otherwise queue it. Useful for
    * callers (e.g. cross-extension RPC) that must not be deferred by the queue.
    */
   bypassQueue?: boolean;
-  /** Isolation mode — "worktree" creates a temp git worktree for the agent. */
+  /** Isolation mode - "worktree" creates a temp git worktree for the agent. */
   isolation?: IsolationMode;
   /** Resolved invocation snapshot captured for UI display. */
   invocation?: AgentInvocation;
-  /** Parent abort signal — when aborted, the subagent is also stopped. */
+  /** Parent abort signal - when aborted, the subagent is also stopped. */
   signal?: AbortSignal;
-  /** Called when the agent session is created — receives the session and the agent's record. */
-  onSessionCreated?: (session: AgentSession, record: AgentRecord) => void;
-  /** Parent session identity — grouped fields that travel together from the tool boundary. */
+  /** Called when the agent session is created - receives the session and the agent's record. */
+  onSessionCreated?: (session: AgentSession, record: Agent) => void;
+  /** Parent session identity - grouped fields that travel together from the tool boundary. */
   parentSession?: ParentSessionInfo;
 }
 export class AgentManager {
-  private agents = new Map<string, AgentRecord>();
+  private agents = new Map<string, Agent>();
   private cleanupInterval: ReturnType<typeof setInterval>;
   private readonly observer?: AgentManagerObserver;
   private readonly runner: AgentRunner;
@@ -192,9 +103,6 @@ export class AgentManager {
   private queue: { id: string; args: SpawnArgs }[] = [];
   /** Number of currently running background agents. */
   private runningBackground = 0;
-  /** Steers buffered for agents whose session hasn’t been created yet. */
-  private pendingSteers = new Map<string, string[]>();
   constructor(options: AgentManagerOptions) {
     this.runner = options.runner;
     this.worktrees = options.worktrees;
@@ -216,19 +124,6 @@ export class AgentManager {
     this.drainQueue();
   }
-  /**
-   * Buffer a steer message for an agent whose session isn’t ready yet.
-   * Returns false if the agent id is not tracked (already cleaned up or unknown).
-   * Called by steer-tool and service-adapter when record.execution is undefined.
-   */
-  queueSteer(id: string, message: string): boolean {
-    if (!this.agents.has(id)) return false;
-    const steers = this.pendingSteers.get(id) ?? [];
-    steers.push(message);
-    this.pendingSteers.set(id, steers);
-    return true;
-  }
   /**
    * Spawn an agent and return its ID immediately (for background use).
    * If the concurrency limit is reached, the agent is queued.
@@ -241,7 +136,7 @@ export class AgentManager {
   ): string {
     const id = randomUUID().slice(0, 17);
     const abortController = new AbortController();
-    const record = new AgentRecord({
+    const record = new Agent({
       id,
       type,
       description: options.description,
@@ -263,15 +158,16 @@ export class AgentManager {
     const args: SpawnArgs = { snapshot, type, prompt, options };
     if (options.isBackground && !options.bypassQueue && this.runningBackground >= this._getMaxConcurrent()) {
-      // Queue it — will be started when a running agent completes
+      // Queue it - will be started when a running agent completes
       this.queue.push({ id, args });
       return id;
     }
-    // startAgent can throw (e.g. strict worktree-isolation failure) — clean
+    // setupWorktree can throw (e.g. strict worktree-isolation failure) - clean
     // up the record so callers don't see an orphan in `listAgents()`.
     try {
-      this.startAgent(id, record, args);
+      record.setupWorktree(this.worktrees, options.isolation);
+      record.promise = this.startAgent(id, record, args);
     } catch (err) {
       this.agents.delete(id);
       throw err;
@@ -280,79 +176,53 @@ export class AgentManager {
   }
   /** Actually start an agent (called immediately or from queue drain). */
-  private startAgent(id: string, record: AgentRecord, { snapshot, type, prompt, options }: SpawnArgs) {
-    const worktreeCwd = this.setupWorktree(id, record, options.isolation);
+  private async startAgent(id: string, record: Agent, { snapshot, type, prompt, options }: SpawnArgs): Promise<void> {
     record.markRunning(Date.now());
     if (options.isBackground) this.runningBackground++;
     this.observer?.onAgentStarted(record);
-    const handle = new RunHandle(
-      record, this.worktrees,
+    record.setOnRunFinished(
       options.isBackground ? () => this.finalizeBackgroundRun(record) : undefined,
     );
-    handle.wireSignal(options.signal, () => this.abort(id));
+    record.wireSignal(options.signal, () => this.abort(id));
     const runConfig = this.getRunConfig?.();
-    record.promise = this.runner.run(snapshot, type, prompt, {
-      context: {
-        exec: this.exec,
-        registry: this.registry,
-        cwd: worktreeCwd,
-        parentSession: options.parentSession,
-      },
-      model: options.model,
-      maxTurns: options.maxTurns,
-      defaultMaxTurns: runConfig?.defaultMaxTurns,
-      graceTurns: runConfig?.graceTurns,
-      isolated: options.isolated,
-      thinkingLevel: options.thinkingLevel,
-      signal: record.abortController!.signal,
-      onSessionCreated: (session) => {
-        // Capture the session file path early so it's available for display
-        // before the run completes (e.g. in background agent status messages).
-        // eslint-disable-next-line @typescript-eslint/no-unnecessary-condition -- sessionManager is typed as always present but Pi SDK may not provide it
-        const outputFile = session.sessionManager?.getSessionFile?.() ?? undefined;
-        record.execution = { session, outputFile };
-        this.flushPendingSteers(id, session);
-        handle.attachObserver(subscribeRecordObserver(session, record, {
-          onCompact: (r, info) => this.observer?.onAgentCompacted(r, info),
-        }));
-        options.onSessionCreated?.(session, record);
-      },
-    })
-      .then((result) => handle.complete(result))
-      .catch((err: unknown) => { handle.fail(err); return ""; });
-  }
-  /** Create a worktree for isolated agents. Throws (strict) if isolation is requested but impossible. */
-  private setupWorktree(
-    id: string, record: AgentRecord, isolation: IsolationMode | undefined,
-  ): string | undefined {
-    if (isolation !== "worktree") return undefined;
-    const wt = this.worktrees.create(id);
-    if (!wt) {
-      throw new Error(
-        'Cannot run with isolation: "worktree" — not a git repo, no commits yet, or `git worktree add` failed. ' +
-        'Initialize git and commit at least once, or omit `isolation`.',
-      );
-    }
-    record.worktreeState = new WorktreeState(wt);
-    return wt.path;
-  }
-  /** Flush any steers buffered before the session was ready. */
-  private flushPendingSteers(id: string, session: AgentSession): void {
-    const buffered = this.pendingSteers.get(id);
-    if (!buffered?.length) return;
-    for (const msg of buffered) {
-      session.steer(msg).catch(() => {});
+    try {
+      const result = await this.runner.run(snapshot, type, prompt, {
+        context: {
+          exec: this.exec,
+          registry: this.registry,
+          cwd: record.worktreeState?.path,
+          parentSession: options.parentSession,
+        },
+        model: options.model,
+        maxTurns: options.maxTurns,
+        defaultMaxTurns: runConfig?.defaultMaxTurns,
+        graceTurns: runConfig?.graceTurns,
+        isolated: options.isolated,
+        thinkingLevel: options.thinkingLevel,
+        signal: record.abortController!.signal,
+        onSessionCreated: (session) => {
+          // Capture the session file path early so it's available for display
+          // before the run completes (e.g. in background agent status messages).
+          // eslint-disable-next-line @typescript-eslint/no-unnecessary-condition -- sessionManager is typed as always present but Pi SDK may not provide it
+          const outputFile = session.sessionManager?.getSessionFile?.() ?? undefined;
+          record.execution = { session, outputFile };
+          record.flushPendingSteers(session);
+          record.attachObserver(subscribeAgentObserver(session, record, {
+            onCompact: (r, info) => this.observer?.onAgentCompacted(r, info),
+          }));
+          options.onSessionCreated?.(session, record);
+        },
+      });
+      record.completeRun(result, this.worktrees);
+    } catch (err) {
+      record.failRun(err, this.worktrees);
     }
-    this.pendingSteers.delete(id);
   }
   /** Decrement background counter, notify observer (crash-safe), and drain the queue. */
-  private finalizeBackgroundRun(record: AgentRecord): void {
+  private finalizeBackgroundRun(record: Agent): void {
     this.runningBackground--;
     try { this.observer?.onAgentCompleted(record); } catch (err) { debugLog("onAgentCompleted observer", err); }
     this.drainQueue();
@@ -365,9 +235,10 @@ export class AgentManager {
       const record = this.agents.get(next.id);
       if (record?.status !== "queued") continue;
       try {
-        this.startAgent(next.id, record, next.args);
+        record.setupWorktree(this.worktrees, next.args.options.isolation);
+        record.promise = this.startAgent(next.id, record, next.args);
       } catch (err) {
-        // Late failure (e.g. strict worktree-isolation) — surface on the record
+        // Late failure (e.g. strict worktree-isolation) - surface on the record
         // so the user/agent can see it via /agents, then keep draining.
         record.markError(err);
         this.observer?.onAgentCompleted(record);
@@ -384,7 +255,7 @@ export class AgentManager {
     type: SubagentType,
     prompt: string,
     options: Omit<AgentSpawnConfig, "isBackground">,
-  ): Promise<AgentRecord> {
+  ): Promise<Agent> {
     const id = this.spawn(snapshot, type, prompt, { ...options, isBackground: false });
     const record = this.agents.get(id)!;
     await record.promise;
@@ -398,14 +269,14 @@ export class AgentManager {
     id: string,
     prompt: string,
     signal?: AbortSignal,
-  ): Promise<AgentRecord | undefined> {
+  ): Promise<Agent | undefined> {
     const record = this.agents.get(id);
     const session = record?.session;
     if (!session) return undefined;
     record.resetForResume(Date.now());
-    const unsubResume = subscribeRecordObserver(session, record, {
+    const unsubResume = subscribeAgentObserver(session, record, {
       onCompact: (r, info) => this.observer?.onAgentCompacted(r, info),
     });
@@ -423,11 +294,11 @@ export class AgentManager {
     return record;
   }
-  getRecord(id: string): AgentRecord | undefined {
+  getRecord(id: string): Agent | undefined {
     return this.agents.get(id);
   }
-  listAgents(): AgentRecord[] {
+  listAgents(): Agent[] {
     return [...this.agents.values()].sort(
       (a, b) => b.startedAt - a.startedAt,
     );
@@ -444,18 +315,14 @@ export class AgentManager {
       return true;
     }
-    if (record.status !== "running") return false;
-    record.abortController?.abort();
-    record.markStopped();
-    return true;
+    return record.abort();
   }
   /** Dispose a record's session and remove it from the map. */
-  private removeRecord(id: string, record: AgentRecord): void {
+  private removeRecord(id: string, record: Agent): void {
     // eslint-disable-next-line @typescript-eslint/no-unnecessary-condition -- dispose may not exist on all session implementations
     record.session?.dispose?.();
     this.agents.delete(id);
-    this.pendingSteers.delete(id);
   }
   private cleanup() {
@@ -501,11 +368,7 @@ export class AgentManager {
     this.queue = [];
     // Abort running agents
     for (const record of this.agents.values()) {
-      if (record.status === "running") {
-        record.abortController?.abort();
-        record.markStopped();
-        count++;
-      }
+      if (record.abort()) count++;
     }
     return count;
   }
@@ -513,7 +376,7 @@ export class AgentManager {
   /** Wait for all running and queued agents to complete (including queued ones). */
   // fallow-ignore-next-line unused-class-member
   async waitForAll(): Promise<void> {
-    // Loop because drainQueue respects the concurrency limit — as running
+    // Loop because drainQueue respects the concurrency limit - as running
     // agents finish they start queued ones, which need awaiting too.
     // eslint-disable-next-line @typescript-eslint/no-unnecessary-condition -- intentional infinite loop with explicit break
     while (true) {
@@ -521,7 +384,7 @@ export class AgentManager {
       const pending = [...this.agents.values()]
         .filter(r => r.status === "running" || r.status === "queued")
         .map(r => r.promise)
-        .filter((p): p is Promise<string> => p != null);
+        .filter((p): p is Promise<void> => p != null);
       if (pending.length === 0) break;
       await Promise.allSettled(pending);
     }
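
The listener bookkeeping that the removed `RunHandle` owned, and that the new `record.wireSignal(...)` and `record.setOnRunFinished(...)` calls suggest now lives on `Agent`, might look roughly like the sketch below. It is adapted from the removed `RunHandle` code in this diff. The class name `RunLifecycle` and the combined `finish()` method are hypothetical, and the actual `agent.ts` implementation is not shown on this page, so this is an illustration under those assumptions rather than the package's code.

```typescript
// Hypothetical stand-in for the run-lifecycle part of Agent.
class RunLifecycle {
  private detachFn?: () => void;
  private onRunFinished?: () => void;

  /** Register a callback to fire once when the run ends. */
  setOnRunFinished(fn?: () => void): void {
    this.onRunFinished = fn;
  }

  /** Wire a parent AbortSignal so it stops this run when fired. */
  wireSignal(signal: AbortSignal | undefined, onAbort: () => void): void {
    if (!signal) return;
    const listener = () => onAbort();
    signal.addEventListener("abort", listener, { once: true });
    this.detachFn = () => signal.removeEventListener("abort", listener);
  }

  /** Detach the parent-signal listener; safe to call repeatedly. */
  releaseListeners(): void {
    this.detachFn?.();
    this.detachFn = undefined;
  }

  /** End the run: release listeners, then fire the callback at most once. */
  finish(): void {
    this.releaseListeners();
    const fn = this.onRunFinished;
    this.onRunFinished = undefined; // cleared first so a second call is a no-op
    fn?.();
  }
}
```

Clearing the callback before invoking it keeps `finish()` idempotent, so both a success path and an error path can call it without double-decrementing a counter such as `runningBackground`.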