@gotgenes/pi-subagents 6.0.0 → 6.1.0

Files changed (10)

package/CHANGELOG.md +23 -0
package/docs/architecture/architecture.md +302 -210
package/docs/plans/0098-extract-agent-record-state-machine.md +435 -0
package/docs/plans/0102-consolidate-test-record-factory.md +176 -0
package/docs/retro/0061-session-format-transcript.md +41 -0
package/docs/retro/0102-consolidate-test-record-factory.md +30 -0
package/package.json +1 -1
package/src/agent-manager.ts +25 -50
package/src/agent-record.ts +179 -0
package/src/types.ts +3 -39

package/docs/architecture/architecture.md

@@ -1,59 +1,79 @@
 # Architecture
-This document describes the planned decomposition of the pi-subagents fork into a focused, composable core with a stable API boundary that other extensions can build on.
+This document describes the architecture of the pi-subagents fork: a focused, composable core with a stable API boundary that other extensions can build on.
 ## Design principles
 1. **Narrow core** — the extension owns agent spawning, execution, and result retrieval.
    Everything else is a consumer.
 2. **Composable by default** — other extensions can spawn agents, observe their lifecycle, and display their state without importing this package directly.
-3. **Typed API boundary** — this package exports a `SubagentsAPI` interface and `Symbol.for()` accessors (`publishSubagentsAPI` / `getSubagentsAPI`).
+3. **Typed API boundary** — this package exports a `SubagentsService` interface and `Symbol.for()` accessors (`publishSubagentsService` / `getSubagentsService`).
    Consumers declare this package as an optional peer dependency and use dynamic import for compile-time types.
-   The runtime bridge is `Symbol.for()` on `globalThis` — no separate API package.
+   The runtime bridge is `Symbol.for("@gotgenes/pi-subagents:service")` on `globalThis` — no separate API package.
 4. **No scheduling** — in-process scheduling is removed from the core.
    Scheduling is a separate concern that any extension can implement by calling `spawn()` on the published API.
 5. **UI extraction is deferred** — the widget, conversation viewer, and `/agents` command menu stay in the core for now.
    They are the first candidate for extraction once the API boundary is proven stable.
+6. **Snapshot, don't capture** — mutable parent state (ctx, session, model) is read once at spawn time and frozen into a plain data object.
+   No live references survive past the spawn call.
+7. **Subscribe, don't thread** — observation of agent progress uses event subscription on the session, not callback parameters threaded through multiple layers.
 ## Current state
-The extension is a 6,300 LOC monolith organized into well-factored internal modules but with no public API contract.
-The subsystems are:
+The extension is ~6,100 LOC across 35 focused modules with a typed `SubagentsService` API boundary.
+The `index.ts` entry point is ~270 lines; the rest is decomposed into domain modules.
 ```text
-index.ts (1,894 LOC) — entry point, tool registration, event wiring
-agent-manager.ts      — lifecycle, concurrency, queue
-agent-runner.ts       — session creation, turn loop, tool filtering
-agent-types.ts        — type registry (defaults + custom .md files)
-types.ts              — shared type definitions
-prompts.ts            — system prompt assembly
-context.ts            — parent conversation extraction
-memory.ts             — persistent MEMORY.md per agent
-skill-loader.ts       — preload .pi/skills into prompts
-env.ts                — git/platform detection
-worktree.ts           — git worktree isolation
-usage.ts              — token usage tracking
-model-resolver.ts     — fuzzy model name resolution
-invocation-config.ts  — merge tool params with agent config
-session-dir.ts        — subagent session directory derivation
-settings.ts           — persistent operational settings
-cross-extension-rpc.ts — RPC over pi.events                  ← replacing
-group-join.ts         — batch completion notifications
+index.ts (274 LOC)       — entry point, tool registration, event wiring
+agent-manager.ts (499)   — lifecycle, concurrency, queue
+agent-runner.ts (512)    — session creation, turn loop, tool filtering
+session-config.ts (243)  — pure session-config assembler
+agent-types.ts (138)     — type registry (defaults + custom .md files)
+types.ts (126)           — shared type definitions
+runtime.ts (94)          — SubagentRuntime factory (session-scoped state)
+prompts.ts               — system prompt assembly
+context.ts               — parent conversation extraction
+memory.ts                — persistent MEMORY.md per agent
+skill-loader.ts          — preload .pi/skills into prompts
+env.ts                   — git/platform detection
+worktree.ts              — git worktree isolation
+usage.ts                 — token usage tracking
+model-resolver.ts        — fuzzy model name resolution
+invocation-config.ts     — merge tool params with agent config
+session-dir.ts           — subagent session directory derivation
+settings.ts              — persistent operational settings
+service.ts               — SubagentsService interface + Symbol.for() accessors
+service-adapter.ts       — SubagentsService implementation wrapping AgentManager
+tools/agent-tool.ts      — Agent tool definition + execute
+tools/get-result-tool.ts — get_subagent_result tool
+tools/steer-tool.ts      — steer_subagent tool
+tools/helpers.ts         — shared tool utilities
+handlers/lifecycle.ts    — session_start, session_before_switch, session_shutdown
+handlers/tool-start.ts   — tool_execution_start handler
+notification.ts          — completion nudges, custom message renderer
+renderer.ts              — notification TUI component
 ui/agent-widget.ts       — above-editor live status widget
+ui/agent-menu.ts         — /agents slash command menu
 ui/conversation-viewer.ts — scrollable session overlay
+default-agents.ts        — embedded default agent configs (general-purpose, Explore, Plan)
+custom-agents.ts         — user-defined agent .md file loader
+debug.ts                 — debug logging utility
 ```
 ### Coupling today
-The widget reads agent state by holding a direct reference to `AgentManager` and polling a shared mutable `Map<string, AgentActivity>` every 80 ms. The conversation viewer subscribes directly to `AgentSession` objects.
-Cross-extension consumers use an ad-hoc RPC layer over `pi.events` (`subagents:rpc:spawn`, `subagents:rpc:stop`, `subagents:rpc:ping`) with per-request reply channels and untyped envelopes.
+The widget reads agent state by holding a direct reference to `SubagentRuntime` and polling a shared mutable `Map<string, AgentActivity>` every 80 ms. The conversation viewer subscribes directly to `AgentSession` objects.
-There is also a `Symbol.for("pi-subagents:manager")` export on `globalThis` that exposes `{ waitForAll, hasRunning, spawn, getRecord }`, but it is undocumented and untyped.
+Cross-extension consumers use the typed `SubagentsService` API published via `Symbol.for("@gotgenes/pi-subagents:service")` on `globalThis`.
+The ad-hoc RPC layer and untyped `Symbol.for("pi-subagents:manager")` have been removed.
 ## Target state
@@ -62,20 +82,20 @@ There is also a `Symbol.for("pi-subagents:manager")` export on `globalThis` that
   │  @gotgenes/pi-subagents  (this package)                 │
   │                                                        │
   │  Exports:                                              │
-  │    SubagentsAPI interface                              │
-  │    publishSubagentsAPI() / getSubagentsAPI()           │
-  │    SubagentRecord, SubagentStatus, LifetimeUsage types │
-  │    Event channel constants                             │
+  │    SubagentsService interface                           │
+  │    publishSubagentsService() / getSubagentsService()    │
+  │    SubagentRecord, SubagentStatus, LifetimeUsage types  │
+  │    SUBAGENT_EVENTS constants                            │
   │                                                        │
   │  Core:                                                 │
   │    Agent + get_subagent_result + steer_subagent tools  │
   │    AgentManager, agent-runner, agent-types             │
-  │    publishSubagentsAPI(impl)  ← called at init         │
+  │    publishSubagentsService(impl)  ← called at init     │
   │                                                        │
   │  Internal UI (widget, viewer, /agents menu)            │
   │  ← moves to pi-subagents-ui later                     │
   └──────────────────────┬─────────────────────────────────┘
-                         │ Symbol.for("pi:service:subagents")
+                         │ Symbol.for("@gotgenes/pi-subagents:service")
                          │
        ┌─────────────────┼──────────────────┐
        │                 │                  │
@@ -87,7 +107,7 @@ There is also a `Symbol.for("pi-subagents:manager")` export on `globalThis` that
   │  ext)   │    └──────────────┘    └──────────────┘
   └─────────┘
        │
-       │  getSubagentsAPI()?.spawn(...)
+       │  getSubagentsService()?.spawn(...)
        │  (optional peer dep + dynamic import for types)
        ▼
 ```
@@ -97,10 +117,13 @@ There is also a `Symbol.for("pi-subagents:manager")` export on `globalThis` that
 - The three tools: `Agent`, `get_subagent_result`, `steer_subagent`.
 - `AgentManager` — spawn, queue, abort, resume, concurrency control.
 - `agent-runner` — session creation, turn loop, tool filtering, extension binding (Patches 2 and 3).
+- `session-config` — pure configuration assembler (extracted from `agent-runner`).
+- `SubagentRuntime` — session-scoped state bag with methods.
 - Agent type registry — default agents, custom `.md` file loading.
 - Prompt assembly, context extraction, memory, skills, environment.
 - Worktree isolation.
 - Token usage tracking.
+- Session directory derivation and persisted `SessionManager` for subagent transcripts.
 - Settings persistence.
 - Internal UI (widget, conversation viewer, `/agents` menu) — these stay until the API boundary is proven, then move to a separate extension.
@@ -108,8 +131,8 @@ There is also a `Symbol.for("pi-subagents:manager")` export on `globalThis` that
 - **Scheduling** (`schedule.ts`, `schedule-store.ts`, `ui/schedule-menu.ts`) — 612 LOC removed.
   The `schedule` parameter is removed from the `Agent` tool schema.
-  Any extension that wants scheduling can implement it by calling `getSubagentsAPI()?.spawn(...)` on a timer.
-- **Ad-hoc RPC** (`cross-extension-rpc.ts`) — replaced by the typed `SubagentsAPI` published via `Symbol.for()`.
+  Any extension that wants scheduling can implement it by calling `getSubagentsService()?.spawn(...)` on a timer.
+- **Ad-hoc RPC** (`cross-extension-rpc.ts`) — replaced by the typed `SubagentsService` published via `Symbol.for()`.
   The untyped event-bus RPC channels are removed.
 - **Group join** (`group-join.ts`) — 141 LOC removed.
   The grouped notification batching adds complexity for a marginal UX improvement.
@@ -117,21 +140,22 @@ There is also a `Symbol.for("pi-subagents:manager")` export on `globalThis` that
 - **Output file** (`output-file.ts`) — replaced by `session-dir.ts` + `SessionManager.create()` (#61).
   Subagent transcripts are now written in Pi's official JSONL session format via the SDK's `SessionManager`, nested under the parent session directory.
-### Estimated impact
+### Estimated impact (realized)
-| Subsystem removed | LOC removed   | LOC removed from index.ts |
-| ----------------- | ------------- | ------------------------- |
-| Scheduling        | 612           | ~200                      |
-| Ad-hoc RPC        | 80            | ~50                       |
-| Group join        | 141           | ~100                      |
-| Output file       | 83 (replaced) | ~50                       |
-| **Total**         | **~916**      | **~400**                  |
+| Subsystem              | Status         | LOC impact                                 |
+| ---------------------- | -------------- | ------------------------------------------ |
+| Scheduling             | Removed (#52)  | −612                                       |
+| Ad-hoc RPC             | Removed (#49)  | −80                                        |
+| Group join             | Removed (#49)  | −141                                       |
+| Output file            | Replaced (#61) | −83 (replaced by 38-line `session-dir.ts`) |
+| index.ts decomposition | Done (#54)     | 1,894 → 274                                |
-After removal and `index.ts` decomposition, the core shrinks from ~6,300 to ~5,400 LOC, and `index.ts` shrinks from ~1,894 to ~1,300 LOC.
+The codebase is now ~6,100 LOC across 35 modules.
+The `index.ts` entry point is 274 lines.
-## SubagentsAPI
+## SubagentsService (done — #48)
-The `SubagentsAPI` interface, accessor functions, and serializable types are exported directly from this package (`@gotgenes/pi-subagents`).
+The `SubagentsService` interface, accessor functions, and serializable types are exported from `@gotgenes/pi-subagents` via the `./service` export map entry.
 No separate API package is needed.
 Consumers declare this package as an optional peer dependency:
@@ -139,7 +163,7 @@ Consumers declare this package as an optional peer dependency:
 ```json
 {
   "peerDependencies": {
-    "@gotgenes/pi-subagents": ">=2.0.0"
+    "@gotgenes/pi-subagents": ">=5.0.0"
   },
   "peerDependenciesMeta": {
     "@gotgenes/pi-subagents": { "optional": true }
@@ -150,72 +174,40 @@ Consumers declare this package as an optional peer dependency:
 At runtime, consumers use dynamic import for type-safe access to the accessor functions:
 ```typescript
-const { getSubagentsAPI } = await import("@gotgenes/pi-subagents");
-const api = getSubagentsAPI();
-if (api) {
-  api.spawn("Explore", "Check for stale TODOs");
+const { getSubagentsService } = await import("@gotgenes/pi-subagents");
+const svc = getSubagentsService();
+if (svc) {
+  svc.spawn("Explore", "Check for stale TODOs");
 }
 ```
 Pi's extension loader creates a fresh `jiti` instance per extension with `moduleCache: false`, so module-scoped singletons don't survive across extensions.
-The accessor functions use `Symbol.for()` on `globalThis`, which is process-global by spec, to bridge this gap.
+The accessor functions use `Symbol.for("@gotgenes/pi-subagents:service")` on `globalThis`, which is process-global by spec, to bridge this gap.
 The dynamic import provides compile-time types; the `Symbol.for()` key is the actual runtime channel.
 ### Interface
-```typescript
-/** The public API surface published by pi-subagents. */
-export interface SubagentsAPI {
-  /**
-   * Spawn an agent. Returns the agent ID immediately.
-   * The agent runs in the background unless options.foreground is true.
-   */
-  spawn(type: string, prompt: string, options?: SpawnOptions): string;
-  /** Get a snapshot of an agent's current state. */
-  getRecord(id: string): SubagentRecord | undefined;
-  /** List all tracked agents, most recent first. */
-  listAgents(): SubagentRecord[];
-  /** Abort a running or queued agent. Returns false if not found. */
-  abort(id: string): boolean;
-  /** Send a steering message to a running agent. */
-  steer(id: string, message: string): Promise<boolean>;
+See `src/service.ts` for the canonical definition.
+Key types:
-  /** Wait for all running and queued agents to complete. */
-  waitForAll(): Promise<void>;
-  /** Whether any agents are running or queued. */
-  hasRunning(): boolean;
-}
-export interface SpawnOptions {
-  description?: string;
-  model?: string;
-  maxTurns?: number;
-  thinkingLevel?: string;
-  isolated?: boolean;
-  inheritContext?: boolean;
-  foreground?: boolean;
-  /** Skip the concurrency queue — start immediately. */
-  bypassQueue?: boolean;
-  isolation?: "worktree";
-}
-```
+- `SubagentsService` — `spawn`, `getRecord`, `listAgents`, `abort`, `steer`, `waitForAll`, `hasRunning`.
+- `SubagentRecord` — serializable agent snapshot (no live session objects).
+- `SpawnOptions` — `description`, `model`, `maxTurns`, `thinkingLevel`, `isolated`, `inheritContext`, `foreground`, `bypassQueue`, `isolation`.
+- `SUBAGENT_EVENTS` — channel constants for `pi.events` subscriptions.
 ### Accessor pattern
 ```typescript
-const KEY = Symbol.for("pi:service:subagents");
+const SERVICE_KEY = Symbol.for("@gotgenes/pi-subagents:service");
-export function publishSubagentsAPI(api: SubagentsAPI): void {
-  (globalThis as any)[KEY] = api;
+export function publishSubagentsService(service: SubagentsService): void {
+  (globalThis as Record<symbol, unknown>)[SERVICE_KEY] = service;
 }
-export function getSubagentsAPI(): SubagentsAPI | undefined {
-  return (globalThis as any)[KEY];
+export function getSubagentsService(): SubagentsService | undefined {
+  return (globalThis as Record<symbol, unknown>)[SERVICE_KEY] as
+    | SubagentsService
+    | undefined;
 }
 ```
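A minimal sketch of why the `Symbol.for()` bridge works even when each extension loads its own copy of the accessor module. The `Service` type, the `makeAccessors()` helper, and the `"example:service"` key are hypothetical stand-ins used only for illustration; they are not part of the package.

```typescript
// Hypothetical stand-in for the real SubagentsService interface.
interface Service {
  hasRunning(): boolean;
}

// Simulates one freshly loaded copy of the accessor module. Each call creates
// new closures, much like a loader that does not share a module cache.
function makeAccessors() {
  // Symbol.for() looks the key up in the global symbol registry, so every
  // copy of the module resolves to the same symbol.
  const key = Symbol.for("example:service");
  const store = globalThis as unknown as Record<symbol, unknown>;
  return {
    publish(service: Service): void {
      store[key] = service;
    },
    get(): Service | undefined {
      return store[key] as Service | undefined;
    },
  };
}

const producer = makeAccessors(); // copy loaded by the publishing extension
const consumer = makeAccessors(); // separate copy loaded by a consumer

const beforePublish = consumer.get(); // nothing published yet
producer.publish({ hasRunning: () => false });
const afterPublish = consumer.get(); // same object the producer published
```

The two accessor copies share no module-level state; the only thing they have in common is the registry key, which is why the consumer must still handle the `undefined` case when nothing has been published.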
@@ -237,23 +229,19 @@ They are fire-and-forget broadcast events — no request IDs, no reply channels.
 ### Consumer example: scheduling extension
 ```typescript
-// package.json:
-// "peerDependencies": { "@gotgenes/pi-subagents": ">=2.0.0" }
-// "peerDependenciesMeta": { "@gotgenes/pi-subagents": { "optional": true } }
 export default function (pi) {
   pi.on("session_start", async (event, ctx) => {
-    let getSubagentsAPI;
+    let getSubagentsService;
     try {
-      ({ getSubagentsAPI } = await import("@gotgenes/pi-subagents"));
+      ({ getSubagentsService } = await import("@gotgenes/pi-subagents"));
     } catch {
       return; // pi-subagents not installed
     }
-    const api = getSubagentsAPI();
-    if (!api) return;
+    const svc = getSubagentsService();
+    if (!svc) return;
     setInterval(() => {
-      api.spawn("Explore", "Check for stale TODOs", {
+      svc.spawn("Explore", "Check for stale TODOs", {
         bypassQueue: true,
       });
     }, 60 * 60 * 1000);
@@ -267,13 +255,13 @@ export default function (pi) {
 export default function (pi) {
   pi.events.on("subagents:completed", async (data) => {
     const { id } = data as { id: string };
-    let getSubagentsAPI;
+    let getSubagentsService;
     try {
-      ({ getSubagentsAPI } = await import("@gotgenes/pi-subagents"));
+      ({ getSubagentsService } = await import("@gotgenes/pi-subagents"));
     } catch {
       return;
     }
-    const record = getSubagentsAPI()?.getRecord(id);
+    const record = getSubagentsService()?.getRecord(id);
     if (record?.result) {
       fs.appendFileSync("agent-log.jsonl", JSON.stringify(record) + "\n");
     }
@@ -281,32 +269,38 @@ export default function (pi) {
 }
 ```
-## index.ts decomposition
+## index.ts decomposition (done — #54, #69, #70)
-The 1,894-line `index.ts` is decomposed into focused modules:
+The original 1,894-line `index.ts` has been decomposed into focused modules:
 ```text
 src/
-├── index.ts                  ← slimmed entry point: init, tool registration
+├── index.ts (274)            ← slimmed entry point: init, tool registration
+├── runtime.ts (94)           ← SubagentRuntime: session-scoped state + methods
 ├── tools/
-│   ├── agent-tool.ts         ← Agent tool definition + execute
-│   ├── result-tool.ts        ← get_subagent_result tool
-│   └── steer-tool.ts         ← steer_subagent tool
-├── notifications.ts          ← completion nudges, custom renderer
-├── activity-tracker.ts       ← AgentActivity map + callback factory
-├── agents-command.ts         ← /agents slash command menu
-├── api-adapter.ts            ← SubagentsAPI implementation wrapping AgentManager
-└── (existing modules unchanged)
+│   ├── agent-tool.ts (626)   ← Agent tool definition + execute
+│   ├── get-result-tool.ts    ← get_subagent_result tool
+│   ├── steer-tool.ts         ← steer_subagent tool
+│   └── helpers.ts            ← shared tool utilities
+├── handlers/
+│   ├── lifecycle.ts          ← session_start, session_before_switch, session_shutdown
+│   └── tool-start.ts         ← tool_execution_start handler
+├── notification.ts           ← completion nudges, custom renderer
+├── renderer.ts               ← notification TUI component
+├── ui/agent-menu.ts (677)    ← /agents slash command menu
+├── service-adapter.ts        ← SubagentsService implementation wrapping AgentManager
+└── (existing domain modules unchanged)
 ```
 Each extracted module receives narrow constructor-injected dependencies rather than closing over module-level state.
+Handlers call methods on narrow runtime interfaces — no raw field writes, no `widget!` reach-throughs.
-## Phase plan
+## Phase plan (Phases 1–5 complete)
-### Phase 1: Export `SubagentsAPI` from this package
+### Phase 1: Export `SubagentsService` from this package ✓ (done — #48)
-Add the `SubagentsAPI` interface, serializable types, and `Symbol.for()` accessor functions as public exports of this package.
-No behavioral changes to the core yet.
+Added the `SubagentsService` interface, serializable types, `Symbol.for()` accessor functions, and `SUBAGENT_EVENTS` constants as public exports.
+Wired `service-adapter.ts` to wrap `AgentManager` and call `publishSubagentsService()` at extension init.
 ### Phase 2: Remove scheduling ✓ (done — issue #52)
@@ -314,127 +308,225 @@ Deleted `schedule.ts`, `schedule-store.ts`, `ui/schedule-menu.ts`.
 Removed the `schedule` parameter from the `Agent` tool schema.
 Removed scheduler setup and lifecycle hooks from `index.ts`.
-### Phase 3: Remove group-join, ad-hoc RPC; replace output-file
+### Phase 3: Remove group-join, ad-hoc RPC; replace output-file ✓ (done — #49, #61)
-Delete `group-join.ts`, `cross-extension-rpc.ts`.
-Replace `output-file.ts` with `SessionManager.create()` + `session-dir.ts` (#61).
-Simplify `index.ts` to use direct individual notifications.
-Emit lifecycle events on `pi.events` for external consumers.
+Deleted `group-join.ts`, `cross-extension-rpc.ts` (#49).
+Replaced `output-file.ts` with `SessionManager.create()` + `session-dir.ts` (#61).
+Simplified `index.ts` to use direct individual notifications.
+Lifecycle events emitted on `pi.events` for external consumers.
-### Phase 4: Implement and publish `SubagentsAPI`
+### Phase 4: Implement and publish `SubagentsService` ✓ (done — #48)
-Wire `api-adapter.ts` to wrap `AgentManager` and call `publishSubagentsAPI()` at extension init.
-Resolve model strings inside the adapter (fixing upstream [tintinweb/pi-subagents#60]).
+Wired `service-adapter.ts` to wrap `AgentManager` and call `publishSubagentsService()` at extension init.
+Model strings are resolved inside the adapter.
-### Phase 5: Decompose `index.ts` ✓ (done — issue #54)
+### Phase 5: Decompose `index.ts` ✓ (done — #54, #69, #70, #87)
-Extracted tools, notifications, activity tracking, and the `/agents` command into separate modules.
-`src/index.ts` shrank from ~1,619 lines to ~265 lines.
+Extracted tools, notifications, activity tracking, event handlers, and the `/agents` command into separate modules.
+Created `SubagentRuntime` factory to hold session-scoped state.
+`src/index.ts` shrank from ~1,894 lines to ~274 lines.
 ### Phase 6 (future): Extract UI to `@gotgenes/pi-subagents-ui`
-Move `ui/agent-widget.ts`, `ui/conversation-viewer.ts`, the `/agents` command, notifications, and activity tracking to a separate extension that consumes `SubagentsAPI` + lifecycle events.
+Move `ui/agent-widget.ts`, `ui/conversation-viewer.ts`, the `/agents` command, notifications, and activity tracking to a separate extension that consumes `SubagentsService` + lifecycle events.
 This phase is deferred until the API boundary is proven stable in production.
-## Structural refactoring roadmap (post-#54)
+## Structural refactoring roadmap (post-#54) ✓ complete
-The Issue #54 decomposition created focused modules but left several structural cleanup opportunities on the table.
-The following issues track the work needed to bring `pi-subagents` to the same level of testability and composability as `pi-permission-system`.
+All structural refactoring phases are complete.
+See `git log` for the full history; issue references are preserved below for traceability.
-### Phase 1: Foundation
+| Phase              | Issue              | Summary                                                               |
+| ------------------ | ------------------ | --------------------------------------------------------------------- |
+| Foundation         | #69, #71, #76, #80 | SubagentRuntime, pure assembler, cwd injection, config consolidation  |
+| Core decomposition | #84, #72, #87, #70 | WorktreeManager, AgentManager DI, runtime methods, handler extraction |
+| Interface polish   | #66, #77           | SDK types, projectAgentsDir                                           |
+| Features           | #61                | JSONL session transcripts                                             |
-These issues are independent of each other and can land in any order.
-Together they eliminate module-scope mutable state, create a testable functional core, and simplify the agent-types API.
+The remaining open issue is #22 (parent-session resolution), a cross-extension track that does not gate the structural work.
-1. **gotgenes/pi-packages#69** ✓ — Create `SubagentRuntime`
-   - Move `defaultMaxTurns`, `graceTurns`, `agentActivity`, `currentCtx`, and widget references out of closure/module scope into a single factory-constructed object.
-   - This unblocks handler extraction (Issue #70) by giving handlers a concrete deps bag instead of closure variables.
+---
-2. **gotgenes/pi-packages#71** ✓ — Extract pure agent-session assembler from `agent-runner.ts`
-   - Split `runAgent()` into a pure configuration assembler (~200 lines) and an IO shell (~200 lines).
-   - The assembler becomes independently testable without mocking the Pi SDK.
+## Next target: AgentManager internal decomposition
-3. **gotgenes/pi-packages#76** ✓ — Inject `cwd` into `AgentManager`
-   - Replaced the `process.cwd()` call in `dispose()` with a constructor parameter.
+The structural refactoring roadmap decomposed the extension entry point and established clean module boundaries.
+AgentManager itself — the central class — was not touched structurally.
+A design review reveals three tangled responsibilities and two systemic patterns that inflate complexity.
-4. **gotgenes/pi-packages#80** ✓ — Consolidate `getConfig` / `getAgentConfig` into a single resolution path
-   - Replaced the two overlapping lookup functions with a single `resolveAgentConfig(type): AgentConfig` that handles the unknown-type fallback internally.
-   - Eliminated the duplicated fallback chain exposed by #71 and simplified test mock setup.
+### Problem statement
-### Phase 2: Core decomposition
+AgentManager is a 500-line class that serves as the single mediator between tool callers and the agent runner.
+Every concern passes through it because it owns the `AgentRecord`.
-These build on Phase 1 and should land after it.
+Three responsibilities are tangled:
-1. **gotgenes/pi-packages#84** ✓ — Extract `GitWorktreeManager` class from `worktree.ts`
-   - Added `WorktreeManager` interface and `GitWorktreeManager` class that captures `cwd` at construction.
-   - Prerequisite for #72 — separated the real-object extraction from the DI refactor.
+1. **Record registry** — create, track, query, clean up `AgentRecord` instances.
+2. **Concurrency control** — queue, running count, drain, `bypassQueue`.
+3. **Execution orchestration** — thread options to the runner, intercept callbacks to update records, wire abort signals, manage worktree lifecycle.
-2. **gotgenes/pi-packages#72** ✓ — Dependency-inject `AgentManager`'s collaborators
-   - Defined `AgentRunner` interface (execution boundary) and `ResumeOptions` type in `agent-runner.ts`.
-   - Converted `AgentManager` constructor from 6 positional parameters to an `AgentManagerOptions` bag with injected `AgentRunner` and `WorktreeManager`.
-   - Removed all runtime imports of `agent-runner.ts` and `worktree.ts` from `agent-manager.ts` (only `import type` remains).
-   - Migrated all tests from `vi.mock()` module stubs to `vi.fn()` interface stubs.
+`startAgent()` alone is ~130 lines because it handles all three.
+The `.then()` / `.catch()` blocks mix status updates (job 1), worktree cleanup (job 3), notification callbacks (job 1), and queue draining (job 2).
-3. **gotgenes/pi-packages#87** ✓ — Evolve `SubagentRuntime` from data bag to object with methods
-   - Added session-context methods (`setSessionContext`, `clearSessionContext`) and widget delegation methods (`setUICtx`, `onTurnStart`, `markFinished`, `updateWidget`, `ensureTimer`).
-   - Prerequisite for #70 — without runtime methods, extracted handlers would move LoD violations and output-argument smells into handler classes.
+Two systemic patterns compound the problem:
-4. **gotgenes/pi-packages#70** ✓ — Extract event handlers into `src/handlers/`
-   - Moved the four inline lambdas (`session_start`, `session_before_switch`, `session_shutdown`, `tool_execution_start`) into `SessionLifecycleHandler` and `ToolStartHandler` classes.
-   - Handlers call methods on narrow runtime interfaces — no raw field writes, no `widget!` reach-throughs.
+### Problem 1: Callback threading
-### Phase 3: Interface polish
+`SpawnOptions` carries 6 `on*` callback fields.
+They thread through three layers:
-Small cleanups that are safest after the structural changes settle.
+```text
+agent-tool.ts (UI tracking state)
+  → AgentManager.startAgent() wraps each to update the record, then forwards
+    → runner.run() subscribes to session events, calls callbacks
+```
-1. **gotgenes/pi-packages#66** — Replace `as any` casts with proper SDK types
-   - Type-only change in the tool/menu factory dep interfaces.
-   - Best done after Issues #69 and #70 when the interfaces are stable.
+The callbacks serve two purposes that are tangled together:
-2. **gotgenes/pi-packages#77** — Add `projectAgentsDir` to `AgentMenuDeps`
-   - Remove the inline `process.cwd()` lambda from the menu handler.
+1. **Record statistics** — `onToolActivity` increments `toolUses`, `onAssistantUsage` accumulates `lifetimeUsage`, `onCompaction` increments `compactionCount`, `onSessionCreated` captures the session and output file.
+   This is internal bookkeeping that belongs to the record.
+2. **UI streaming** — the same callbacks update the widget's active-tool display, response text preview, and turn counter.
+   This is presentation that belongs to the UI layer.
-### Phase 4: Features and cross-cutting concerns
+The session already emits all of these events via `session.subscribe()`.
+The runner subscribes to session events, translates them into callback invocations, AgentManager wraps each callback to update the record, then forwards to the caller's callback.
+Three layers reimplement what a single event subscription could provide.
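A rough sketch of the subscription alternative, assuming a session that exposes a `subscribe()` method returning an unsubscribe function. `FakeSession`, `SessionEvent`, and the event names are hypothetical and do not reflect the Pi SDK's actual event shapes.

```typescript
// Hypothetical event shape and session; not the Pi SDK's actual API.
type SessionEvent =
  | { type: "tool_start"; name: string }
  | { type: "turn_end" };

type Listener = (event: SessionEvent) => void;

class FakeSession {
  private listeners = new Set<Listener>();

  // Registers a listener and returns a function that removes it again.
  subscribe(listener: Listener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  emit(event: SessionEvent): void {
    this.listeners.forEach((listener) => listener(event));
  }
}

const session = new FakeSession();

// Record bookkeeping subscribes on its own...
const record = { toolUses: 0 };
session.subscribe((e) => {
  if (e.type === "tool_start") record.toolUses += 1;
});

// ...and the UI subscribes separately, with no callbacks threaded between them.
const ui = { activeTool: "", turns: 0 };
const stopUi = session.subscribe((e) => {
  if (e.type === "tool_start") ui.activeTool = e.name;
  if (e.type === "turn_end") ui.turns += 1;
});

session.emit({ type: "tool_start", name: "read" });
session.emit({ type: "turn_end" });
stopUi(); // the UI detaches; record bookkeeping keeps running
session.emit({ type: "tool_start", name: "write" });
```

In this arrangement the record statistics and the UI preview are independent observers of the same source, so removing the UI does not disturb the bookkeeping.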
-1. **gotgenes/pi-packages#61** ✓ — Port transcript logging to Pi's official JSONL session format
-   - Replaced `output-file.ts` with `SessionManager.create()` + `session-dir.ts`.
-   - Subagent sessions are persisted under `<parent-session-dir>/<parent-session-basename>/tasks/` with `parentSession` header linking.
+### Problem 2: Live `ctx` capture
-2. **gotgenes/pi-packages#22** — Parent-session resolution for `nicobailon/pi-subagents` children
-   - Cross-extension issue that spans `pi-permission-system` and `pi-subagents`.
-   - Requires coordination on env-var conventions.
-   - Not blocked by the structural refactor but logically separate from it.
+`ctx: ExtensionContext` is a mutable reference to the parent session.
+It is captured into `SpawnArgs` and held in the concurrency queue:
-### Dependency graph
+```typescript
+const args: SpawnArgs = { pi, ctx, type, prompt, options };
+this.queue.push({ id, args });  // ctx held until dequeue
+```
-```text
-#69 (SubagentRuntime) ✓ ──► #87 (runtime methods) ✓ ─┬─► #70 (handler extraction) ✓
-                                                   │
-#71 (pure assembler) ✓                              │
-#80 (config lookup) ✓                               │
-#76 (cwd injection) ✓                               │
-#84 (WorktreeManager) ✓                             │
-#72 (AgentManager DI) ✓ ────────────────────────────┘──(optional)──► #70
-#66 (type casts) ◄─────(after structural changes settle)
-#77 (projectAgentsDir) ◄─(after #66 or parallel)
-#61 (transcript format) ✓
-#22 (parent session) ◄──(cross-extension, independent)
+When the queued agent dequeues, `runAgent()` reads from the live `ctx`:
+- `ctx.cwd` — directory that may have changed.
+- `ctx.getSystemPrompt()` — live method call on a potentially stale session.
+- `ctx.model` — model that may have been switched.
+- `ctx.modelRegistry` — registry reference.
+If the parent session changes between enqueue and dequeue (model switch, cwd change, session restart), the agent reads stale or invalid state.
+The same live reference persists in `runtime.currentCtx` for the service-adapter.
+Additionally, `inheritContext` calls `ctx.sessionManager.getBranch()` at run time.
+The user's intent is to fork the conversation as it existed when they asked for the agent — not the conversation at some arbitrary later point when a queue slot opens.
+### Design: snapshot at spawn time
+Replace the live `ctx` capture with a plain data snapshot taken once at spawn time:
+```typescript
+interface ParentSnapshot {
+  cwd: string;
+  systemPrompt: string;
+  model: unknown;
+  modelRegistry: { find(...): unknown; getAvailable?(): ... };
+  parentContext?: string;  // pre-built text if inheritContext
+}
 ```
-### Recommended order
+This snapshot is:
-The recommended sequence is:
+- Captured once in `spawn()` (or by the tool before calling `spawn()`).
+- Stored in `SpawnArgs` instead of `ctx`.
+- Passed to `runner.run()` instead of `ctx: ExtensionContext`.
+- Immutable — no staleness risk, no session-lifetime coupling.
+`runAgent()` already reads exactly these 4 values from `ctx` and never touches it again.
+`buildParentContext()` also reads once and produces a string.
+The snapshot formalizes what is already happening, and makes the "read once" guarantee structural.
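The "read once" idea can be sketched as a small capture helper. This is a minimal illustration, not the package's implementation: `CtxLike` is a stand-in for the parts of `ExtensionContext` named above, `captureParentSnapshot` is a hypothetical helper name, and `modelRegistry` is typed as `unknown` because its method signatures are elided in the interface shown earlier.

```typescript
// Stand-in for the four values runAgent() reads from ctx (assumed shape).
interface CtxLike {
  cwd: string;
  model: unknown;
  modelRegistry: unknown;
  getSystemPrompt(): string;
}

// Simplified snapshot: readonly fields, registry left opaque.
interface ParentSnapshot {
  readonly cwd: string;
  readonly systemPrompt: string;
  readonly model: unknown;
  readonly modelRegistry: unknown;
  readonly parentContext?: string;
}

// Hypothetical helper: reads every value exactly once, at spawn time.
function captureParentSnapshot(ctx: CtxLike, parentContext?: string): ParentSnapshot {
  return Object.freeze({
    cwd: ctx.cwd,
    systemPrompt: ctx.getSystemPrompt(),
    model: ctx.model,
    modelRegistry: ctx.modelRegistry,
    // Only set the optional field when a pre-built context string exists.
    ...(parentContext !== undefined ? { parentContext } : {}),
  });
}

// Usage: the parent session changes while the agent waits in the queue.
const liveCtx = {
  cwd: "/repo",
  model: "model-a",
  modelRegistry: {},
  getSystemPrompt: () => "You are helpful.",
};
const queuedSnapshot = captureParentSnapshot(liveCtx);
liveCtx.cwd = "/elsewhere";
liveCtx.model = "model-b";
// queuedSnapshot.cwd is still "/repo" and queuedSnapshot.model is still "model-a".
```

Note that `Object.freeze` is shallow: `model` and `modelRegistry` remain shared references, so the sketch only guarantees that the snapshot's own fields cannot be reassigned.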
+### Design: session-event observation replaces callback threading
+The session emits events via `session.subscribe()`.
+Today, `runner.run()` subscribes and translates events into `RunOptions.on*()` callbacks; AgentManager wraps those to update the record, then forwards to the caller.
+The target replaces this three-layer chain with direct subscription:
 ```text
-#69 ✓ → #71 ✓ → #80 ✓ → #76 ✓ → #84 ✓ → #72 ✓ → #87 ✓ → #70 ✓ → #66 → #77 → #61 ✓
+                     session.subscribe()
+                            │
+              ┌─────────────┼─────────────┐
+              │                           │
+       Record observer              UI observer
+  (accumulates stats on record)   (updates widget state)
+  managed by AgentManager         managed by agent-tool
+  subscribes in startAgent()      subscribes after spawn
 ```
-Phase 1 is complete; Phase 2 is complete.
-Issue #61 (transcript format) is complete.
-The next issue is #66 (replace `as any` casts with proper SDK types).
-Issue #22 is a parallel cross-extension track and does not gate the structural work.
+AgentManager subscribes to the session to update the record (toolUses, lifetimeUsage, compactionCount, outputFile).
+The agent-tool subscribes to the session to stream UI state (active tools, response text, turn count).
+Neither layer wraps or forwards the other's callbacks.
+`RunOptions` drops all 6 `on*` fields and becomes pure configuration.
+`SpawnOptions` drops all 6 `on*` fields and becomes identity + dispatch mode.
+The session reference reaches callers via `record.session` (already stored) or via `onSessionCreated`, the one callback that remains: it delivers the session object, which enables the external subscription.
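The two independent observers described above can be sketched as follows. The event names, the `SessionLike` interface, and the assumption that `subscribe()` returns an unsubscribe function are all hypothetical; the real session event types are not shown in this document. `createFakeSession` exists only to make the sketch runnable.

```typescript
// Hypothetical event shapes, for illustration only.
type SessionEvent =
  | { type: "tool_start"; toolName: string }
  | { type: "tool_end"; toolName: string }
  | { type: "compaction" };

type SessionListener = (event: SessionEvent) => void;

// Assumed: subscribe() returns a function that removes the listener.
interface SessionLike {
  subscribe(listener: SessionListener): () => void;
}

// Record observer: bookkeeping only (the AgentManager side).
function observeRecord(
  session: SessionLike,
  stats: { toolUses: number; compactionCount: number },
): () => void {
  return session.subscribe((event: SessionEvent) => {
    if (event.type === "tool_end") stats.toolUses += 1;
    if (event.type === "compaction") stats.compactionCount += 1;
  });
}

// UI observer: presentation only (the agent-tool side).
function observeUi(session: SessionLike, ui: { activeTools: Set<string> }): () => void {
  return session.subscribe((event: SessionEvent) => {
    if (event.type === "tool_start") ui.activeTools.add(event.toolName);
    if (event.type === "tool_end") ui.activeTools.delete(event.toolName);
  });
}

// In-memory stand-in for a session so the sketch can run on its own.
function createFakeSession(): SessionLike & { emit(event: SessionEvent): void } {
  const listeners = new Set<SessionListener>();
  return {
    subscribe(listener: SessionListener): () => void {
      listeners.add(listener);
      return () => {
        listeners.delete(listener);
      };
    },
    emit(event: SessionEvent): void {
      listeners.forEach((listener) => listener(event));
    },
  };
}

// Usage: both observers see the same events; neither wraps the other.
const demoSession = createFakeSession();
const demoStats = { toolUses: 0, compactionCount: 0 };
const demoUi = { activeTools: new Set<string>() };
observeRecord(demoSession, demoStats);
observeUi(demoSession, demoUi);
demoSession.emit({ type: "tool_start", toolName: "read" });
demoSession.emit({ type: "tool_end", toolName: "read" });
```

Because each observer holds its own subscription, either one can be removed or replaced without touching the other, which is the property the callback-threading chain lacks.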
+### Design: record state machine
+Status transitions are scattered across 6 locations (`startAgent` `.then()`, `.catch()`, `resume()`, `abort()`, `abortAll()`, `drainQueue()`).
+Each location sets `record.status` plus associated fields (`completedAt`, `result`, `error`) in ad-hoc combinations.
+Extract a state machine on `AgentRecord` (or a thin wrapper) that owns all transitions:
+```typescript
+record.markRunning(startedAt)
+record.markCompleted(result, completedAt)
+record.markError(error)
+record.markStopped()
+record.resetForResume()
+```
+Each method sets exactly the fields that belong to that transition.
+Invalid transitions (e.g., `markCompleted` on an already-stopped record) are no-ops.
+The `if (record.status !== "stopped")` guards in `.then()` and `.catch()` become part of the transition logic rather than scattered conditionals.
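A minimal sketch of such a state machine is shown below. It is not the contents of `src/agent-record.ts`: the status names other than `"stopped"`, the field types, the use of `Date.now()` for `completedAt` in `markError` and `markStopped`, and the choice to have `resetForResume` return the record directly to `"running"` are all assumptions made for illustration.

```typescript
// Hypothetical status set; only "stopped" is named in the text above.
type AgentStatus = "queued" | "running" | "completed" | "error" | "stopped";

class AgentRecord {
  status: AgentStatus = "queued";
  startedAt: number | undefined = undefined;
  completedAt: number | undefined = undefined;
  result: string | undefined = undefined;
  error: string | undefined = undefined;

  markRunning(startedAt: number): void {
    if (this.status !== "queued") return; // invalid transition: no-op
    this.status = "running";
    this.startedAt = startedAt;
  }

  markCompleted(result: string, completedAt: number): void {
    if (this.status !== "running") return; // e.g. already stopped
    this.status = "completed";
    this.result = result;
    this.completedAt = completedAt;
  }

  markError(error: string): void {
    if (this.status !== "running") return;
    this.status = "error";
    this.error = error;
    this.completedAt = Date.now();
  }

  markStopped(): void {
    // Only a queued or running agent can be stopped.
    if (this.status !== "queued" && this.status !== "running") return;
    this.status = "stopped";
    this.completedAt = Date.now();
  }

  resetForResume(): void {
    // Only terminal records can be resumed; clear the terminal fields.
    if (this.status === "queued" || this.status === "running") return;
    this.status = "running";
    this.completedAt = undefined;
    this.result = undefined;
    this.error = undefined;
  }
}

// Usage: a late completion after abort is ignored by the record itself.
const demoRecord = new AgentRecord();
demoRecord.markRunning(Date.now());
demoRecord.markStopped();
demoRecord.markCompleted("late result", Date.now()); // no-op: status stays "stopped"
```

With the guard inside `markCompleted` and `markError`, the `.then()` and `.catch()` handlers can call the transition unconditionally.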
+### Phased implementation
+The three designs are independent and can land in any order.
+The recommended sequence minimizes intermediate churn.
+#### Step 1: Record state machine
+Extract status-transition methods onto `AgentRecord` (or a `RecordManager` wrapper).
+Purely mechanical — replace scattered field writes with method calls.
+No interface changes for callers.
+This is the lowest-risk change and immediately reduces `startAgent()` line count.
+#### Step 2: Parent snapshot
+Replace `ctx: ExtensionContext` in `SpawnArgs` with a `ParentSnapshot` data object.
+Capture the snapshot in `spawn()` or at the tool call site.
+Update `runner.run()` signature to accept `ParentSnapshot` instead of `ctx`.
+Remove `pi: ExtensionAPI` from `SpawnArgs`: it is only passed through to `runner.run()`, which uses it solely for `detectEnv()`, and that can accept a shell-exec function instead.
+This change narrows the `AgentRunner` interface and eliminates live-reference capture.
+#### Step 3: Session-event observation
+Replace the callback-threading pattern with direct session subscriptions.
+AgentManager subscribes to the session after creation to update the record.
+The agent-tool subscribes to the session after spawn to stream UI state.
+`RunOptions` and `SpawnOptions` drop all `on*` callback fields.
+This is the largest change; it lands most cleanly after Step 2 (the runner signature is already narrower) and benefits from Step 1 (the record's transition methods encapsulate the stats updates that the subscription drives).
+### Expected outcome
+| Metric                            | Before | After                    |
+| --------------------------------- | ------ | ------------------------ |
+| `SpawnOptions` fields             | 19     | ~8 (identity + dispatch) |
+| `RunOptions` fields               | 15     | ~9 (config only)         |
+| `startAgent()` lines              | ~130   | ~50                      |
+| Callback layers                   | 3      | 0 (direct subscription)  |
+| Live `ctx` references in queue    | 1      | 0 (snapshot)             |
+| Scattered status-transition sites | 6      | 1 (state machine)        |
+---
 ## Relationship with upstream