@gotgenes/pi-subagents 6.12.1 → 6.13.1

@@ -4,23 +4,23 @@ This document describes the architecture of the pi-subagents fork: a focused, co
4
4
 
5
5
  ## Design principles
6
6
 
7
- 1. **Narrow core** the extension owns agent spawning, execution, and result retrieval.
7
+ 1. **Narrow core** - the extension owns agent spawning, execution, and result retrieval.
8
8
  Everything else is a consumer.
9
- 2. **Composable by default** other extensions can spawn agents, observe their lifecycle, and display their state without importing this package directly.
10
- 3. **Typed API boundary** this package exports a `SubagentsService` interface and `Symbol.for()` accessors (`publishSubagentsService` / `getSubagentsService`).
9
+ 2. **Composable by default** - other extensions can spawn agents, observe their lifecycle, and display their state without importing this package directly.
10
+ 3. **Typed API boundary** - this package exports a `SubagentsService` interface and `Symbol.for()` accessors (`publishSubagentsService` / `getSubagentsService`).
11
11
  Consumers declare this package as an optional peer dependency and use dynamic import for compile-time types.
12
- The runtime bridge is `Symbol.for("@gotgenes/pi-subagents:service")` on `globalThis` no separate API package.
13
- 4. **No scheduling** in-process scheduling is removed from the core.
12
+ The runtime bridge is `Symbol.for("@gotgenes/pi-subagents:service")` on `globalThis` - no separate API package.
13
+ 4. **No scheduling** - in-process scheduling is removed from the core.
14
14
  Scheduling is a separate concern that any extension can implement by calling `spawn()` on the published API.
15
- 5. **UI extraction is deferred** the widget, conversation viewer, and `/agents` command menu stay in the core for now.
15
+ 5. **UI extraction is deferred** - the widget, conversation viewer, and `/agents` command menu stay in the core for now.
16
16
  They are the first candidate for extraction once the API boundary is proven stable.
17
- 6. **Snapshot, don't capture** mutable parent state (ctx, session, model) is read once at spawn time and frozen into a `ParentSnapshot` data object.
17
+ 6. **Snapshot, don't capture** - mutable parent state (ctx, session, model) is read once at spawn time and frozen into a `ParentSnapshot` data object.
18
18
  No live references survive past the spawn call.
19
- 7. **Subscribe, don't thread** observation of agent progress uses direct session-event subscription, not callback parameters threaded through multiple layers.
20
- 8. **Construct complete** objects are born with all their dependencies.
19
+ 7. **Subscribe, don't thread** - observation of agent progress uses direct session-event subscription, not callback parameters threaded through multiple layers.
20
+ 8. **Construct complete** - objects are born with all their dependencies.
21
21
  If state isn't available yet, the object that needs it doesn't exist yet.
22
- No post-construction field writes from external code if an object can't be instantiated ready-to-go, the prep work hasn't been done and the right dependencies haven't been identified.
23
- 9. **State owns its mutations** mutable state lives in a class whose methods enforce valid transitions and invariants.
22
+ No post-construction field writes from external code - if an object can't be instantiated ready-to-go, the prep work hasn't been done and the right dependencies haven't been identified.
23
+ 9. **State owns its mutations** - mutable state lives in a class whose methods enforce valid transitions and invariants.
24
24
  Free functions that mutate module-scoped variables, closure-captured bags-of-functions, and external writes to shared interfaces are replaced by classes that encapsulate the state they manage.
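Principle 9 can be illustrated with a small sketch. The status names and the transition table below are assumptions made for illustration, not the package's actual record definition. The point is only that invalid transitions are rejected inside the class instead of being policed by callers.

```typescript
// Hypothetical status set; the real agent record may use different states.
type SketchStatus = "queued" | "running" | "completed" | "failed" | "aborted";

// Which statuses each status may move to (assumed for illustration).
const SKETCH_TRANSITIONS: Record<SketchStatus, readonly SketchStatus[]> = {
  queued: ["running", "aborted"],
  running: ["completed", "failed", "aborted"],
  completed: [],
  failed: [],
  aborted: [],
};

class SketchAgentRecord {
  readonly id: string;
  private current: SketchStatus = "queued";

  constructor(id: string) {
    this.id = id;
  }

  // Read-only view of the state; there is no public setter.
  getStatus(): SketchStatus {
    return this.current;
  }

  // The only way to change status; rejects moves the table does not allow.
  transition(next: SketchStatus): void {
    if (!SKETCH_TRANSITIONS[this.current].includes(next)) {
      throw new Error(`invalid transition: ${this.current} -> ${next}`);
    }
    this.current = next;
  }
}

const sketchRecord = new SketchAgentRecord("agent-1");
sketchRecord.transition("running");
sketchRecord.transition("completed");
console.log(sketchRecord.getStatus()); // logs: completed
```

Because the field is private and the table is consulted on every change, external code cannot put the record into a state such as `completed` followed by `running`.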
25
25
 
26
26
  ## Current state
@@ -28,62 +28,62 @@ This document describes the architecture of the pi-subagents fork: a focused, co
28
28
  The extension is organized into 39 focused modules with a typed `SubagentsService` API boundary.
29
29
 
30
30
  ```text
31
- index.ts entry point, tool registration, event wiring
32
- agent-manager.ts lifecycle, concurrency, queue
33
- agent-runner.ts session creation, turn loop, tool filtering
34
- session-config.ts pure session-config assembler
35
- agent-types.ts type registry (defaults + custom .md files)
36
- agent-record.ts agent record with encapsulated status transitions
37
- types.ts shared type definitions
38
- runtime.ts SubagentRuntime factory (session-scoped state)
39
- parent-snapshot.ts immutable snapshot of parent session state
40
-
41
- prompts.ts system prompt assembly
42
- context.ts parent conversation extraction
43
- memory.ts persistent MEMORY.md per agent
44
- skill-loader.ts preload .pi/skills into prompts
45
- env.ts git/platform detection
46
-
47
- worktree.ts git worktree isolation
48
- usage.ts token usage tracking
49
- model-resolver.ts fuzzy model name resolution
50
- invocation-config.ts merge tool params with agent config
51
- session-dir.ts subagent session directory derivation
52
- settings.ts persistent operational settings; `SettingsManager` class owns all three in-memory values
53
-
54
- service.ts SubagentsService interface + Symbol.for() accessors
55
- service-adapter.ts SubagentsService implementation wrapping AgentManager
56
-
57
- tools/agent-tool.ts Agent tool definition, parameter validation, dispatch
58
- tools/foreground-runner.ts foreground execution loop (spinner, streaming, result)
59
- tools/background-spawner.ts background spawn (activity setup, notification wiring)
60
- tools/get-result-tool.ts get_subagent_result tool
61
- tools/steer-tool.ts steer_subagent tool
62
- tools/helpers.ts shared tool utilities (textResult, buildDetails, getStatusNote, …)
63
-
64
- handlers/lifecycle.ts session_start, session_before_switch, session_shutdown
65
- handlers/tool-start.ts tool_execution_start handler
66
-
67
- notification.ts completion nudges, custom message renderer
68
- renderer.ts notification TUI component
69
- record-observer.ts session-event observer for record statistics
70
-
71
- ui/display.ts pure formatters, display helpers, and shared types (Theme, AgentDetails)
72
- ui/agent-widget.ts above-editor live status widget
73
- ui/agent-menu.ts /agents slash command menu
74
- ui/conversation-viewer.ts scrollable session overlay
75
- ui/ui-observer.ts session-event observer for UI streaming
76
-
77
- default-agents.ts embedded default agent configs (general-purpose, Explore, Plan)
78
- custom-agents.ts user-defined agent .md file loader
79
- debug.ts debug logging utility
31
+ index.ts - entry point, tool registration, event wiring
32
+ agent-manager.ts - lifecycle, concurrency, queue
33
+ agent-runner.ts - session creation, turn loop, tool filtering
34
+ session-config.ts - pure session-config assembler
35
+ agent-types.ts - type registry (defaults + custom .md files)
36
+ agent-record.ts - agent record with encapsulated status transitions
37
+ types.ts - shared type definitions
38
+ runtime.ts - SubagentRuntime factory (session-scoped state)
39
+ parent-snapshot.ts - immutable snapshot of parent session state
40
+
41
+ prompts.ts - system prompt assembly
42
+ context.ts - parent conversation extraction
43
+ memory.ts - persistent MEMORY.md per agent
44
+ skill-loader.ts - preload .pi/skills into prompts
45
+ env.ts - git/platform detection
46
+
47
+ worktree.ts - git worktree isolation
48
+ usage.ts - token usage tracking
49
+ model-resolver.ts - fuzzy model name resolution
50
+ invocation-config.ts - merge tool params with agent config
51
+ session-dir.ts - subagent session directory derivation
52
+ settings.ts - persistent operational settings; `SettingsManager` class owns all three in-memory values
53
+
54
+ service.ts - SubagentsService interface + Symbol.for() accessors
55
+ service-adapter.ts - SubagentsService implementation wrapping AgentManager
56
+
57
+ tools/agent-tool.ts - Agent tool definition, parameter validation, dispatch
58
+ tools/foreground-runner.ts - foreground execution loop (spinner, streaming, result)
59
+ tools/background-spawner.ts - background spawn (activity setup, notification wiring)
60
+ tools/get-result-tool.ts - get_subagent_result tool
61
+ tools/steer-tool.ts - steer_subagent tool
62
+ tools/helpers.ts - shared tool utilities (textResult, buildDetails, getStatusNote, ...)
63
+
64
+ handlers/lifecycle.ts - session_start, session_before_switch, session_shutdown
65
+ handlers/tool-start.ts - tool_execution_start handler
66
+
67
+ notification.ts - completion nudges, custom message renderer
68
+ renderer.ts - notification TUI component
69
+ record-observer.ts - session-event observer for record statistics
70
+
71
+ ui/display.ts - pure formatters, display helpers, and shared types (Theme, AgentDetails)
72
+ ui/agent-widget.ts - above-editor live status widget
73
+ ui/agent-menu.ts - /agents slash command menu
74
+ ui/conversation-viewer.ts - scrollable session overlay
75
+ ui/ui-observer.ts - session-event observer for UI streaming
76
+
77
+ default-agents.ts - embedded default agent configs (general-purpose, Explore, Plan)
78
+ custom-agents.ts - user-defined agent .md file loader
79
+ debug.ts - debug logging utility
80
80
  ```
81
81
 
82
82
  ### Observation model
83
83
 
84
84
  Record statistics (tool uses, token usage, compaction counts) are updated by `record-observer.ts`, which subscribes directly to session events.
85
85
  UI streaming (active tools, response text, turn counts) is handled by `ui/ui-observer.ts`, which subscribes to the same session events independently.
86
- Neither observer wraps or forwards the other both subscribe directly to the session.
86
+ Neither observer wraps or forwards the other - both subscribe directly to the session.
87
87
 
88
88
  The widget reads agent state by polling a shared `Map<string, AgentActivityTracker>` on `SubagentRuntime` every 80 ms. The conversation viewer subscribes directly to `AgentSession` objects.
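The two-observer arrangement described above can be sketched as follows. `SketchSession`, the event shape, and both observer classes are hypothetical stand-ins; the real session events come from the Pi SDK and the real observers live in `record-observer.ts` and `ui/ui-observer.ts`.

```typescript
// Hypothetical event shape; real session events come from the Pi SDK.
type SketchSessionEvent =
  | { kind: "tool_start"; toolName: string }
  | { kind: "text_delta"; text: string };

type SketchListener = (ev: SketchSessionEvent) => void;

// Minimal stand-in for a session that supports direct subscription.
class SketchSession {
  private listeners = new Set<SketchListener>();

  subscribe(listener: SketchListener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  emit(ev: SketchSessionEvent): void {
    this.listeners.forEach((listener) => listener(ev));
  }
}

// Counts tool uses for record statistics.
class SketchStatsObserver {
  private toolUses = 0;

  constructor(session: SketchSession) {
    session.subscribe((ev) => {
      if (ev.kind === "tool_start") this.toolUses += 1;
    });
  }

  getToolUses(): number {
    return this.toolUses;
  }
}

// Accumulates streamed response text for the UI.
class SketchStreamObserver {
  private text = "";

  constructor(session: SketchSession) {
    session.subscribe((ev) => {
      if (ev.kind === "text_delta") this.text += ev.text;
    });
  }

  getText(): string {
    return this.text;
  }
}

const sketchSession = new SketchSession();
const statsObserver = new SketchStatsObserver(sketchSession);
const streamObserver = new SketchStreamObserver(sketchSession);

sketchSession.emit({ kind: "tool_start", toolName: "read" });
sketchSession.emit({ kind: "text_delta", text: "done" });

console.log(statsObserver.getToolUses(), streamObserver.getText()); // logs: 1 done
```

Neither observer holds a reference to the other, so either one can be removed or replaced without touching its sibling.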
89
89
 
@@ -111,31 +111,31 @@ They declare this package as an optional peer dependency and use dynamic import
111
111
  ### What the core owns
112
112
 
113
113
  - The three tools: `Agent`, `get_subagent_result`, `steer_subagent`.
114
- - `AgentManager` spawn, queue, abort, resume, concurrency control.
115
- - `agent-runner` session creation, turn loop, tool filtering, extension binding (Patches 2 and 3).
116
- - `session-config` pure configuration assembler (extracted from `agent-runner`).
117
- - `SubagentRuntime` session-scoped state bag with methods.
118
- - `ParentSnapshot` immutable snapshot of parent session state, captured once at spawn time.
119
- - `record-observer` session-event observer that updates record statistics without callback threading.
120
- - Agent type registry default agents, custom `.md` file loading.
114
+ - `AgentManager` - spawn, queue, abort, resume, concurrency control.
115
+ - `agent-runner` - session creation, turn loop, tool filtering, extension binding (Patches 2 and 3).
116
+ - `session-config` - pure configuration assembler (extracted from `agent-runner`).
117
+ - `SubagentRuntime` - session-scoped state bag with methods.
118
+ - `ParentSnapshot` - immutable snapshot of parent session state, captured once at spawn time.
119
+ - `record-observer` - session-event observer that updates record statistics without callback threading.
120
+ - Agent type registry - default agents, custom `.md` file loading.
121
121
  - Prompt assembly, context extraction, memory, skills, environment.
122
122
  - Worktree isolation.
123
123
  - Token usage tracking.
124
124
  - Session directory derivation and persisted `SessionManager` for subagent transcripts.
125
125
  - Settings persistence.
126
- - Internal UI (widget, conversation viewer, `/agents` menu) these stay until the API boundary is proven, then move to a separate extension.
126
+ - Internal UI (widget, conversation viewer, `/agents` menu) - these stay until the API boundary is proven, then move to a separate extension.
127
127
 
128
128
  ### What the core dropped
129
129
 
130
- - **Scheduling** (`schedule.ts`, `schedule-store.ts`, `ui/schedule-menu.ts`) removed (#52).
130
+ - **Scheduling** (`schedule.ts`, `schedule-store.ts`, `ui/schedule-menu.ts`) - removed (#52).
131
131
  Any extension that wants scheduling can implement it by calling `getSubagentsService()?.spawn(...)` on a timer.
132
- - **Ad-hoc RPC** (`cross-extension-rpc.ts`) replaced by the typed `SubagentsService` published via `Symbol.for()` (#49).
133
- - **Group join** (`group-join.ts`) removed (#49).
132
+ - **Ad-hoc RPC** (`cross-extension-rpc.ts`) - replaced by the typed `SubagentsService` published via `Symbol.for()` (#49).
133
+ - **Group join** (`group-join.ts`) - removed (#49).
134
134
  Individual completion notifications are sufficient.
135
- - **Output file** (`output-file.ts`) replaced by `session-dir.ts` + `SessionManager.create()` (#61).
135
+ - **Output file** (`output-file.ts`) - replaced by `session-dir.ts` + `SessionManager.create()` (#61).
136
136
  Subagent transcripts are now written in Pi's official JSONL session format.
137
- - **Callback threading** the three-layer `on*` callback chain through `SpawnOptions` → `AgentManager` → `RunOptions` was replaced by direct session-event subscriptions (#100).
138
- - **Live `ctx` capture** `SpawnArgs` previously held a mutable `ctx: ExtensionContext` reference that could go stale in the concurrency queue.
137
+ - **Callback threading** - the three-layer `on*` callback chain through `SpawnOptions` → `AgentManager` → `RunOptions` was replaced by direct session-event subscriptions (#100).
138
+ - **Live `ctx` capture** - `SpawnArgs` previously held a mutable `ctx: ExtensionContext` reference that could go stale in the concurrency queue.
139
139
  Replaced by `ParentSnapshot`, an immutable data object captured once at spawn time (#99).
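A minimal sketch of the snapshot idea, assuming a parent state with only two fields. The names `SketchParentState`, `SketchSnapshot`, and `takeSnapshot` are hypothetical; the actual `ParentSnapshot` shape is defined in `parent-snapshot.ts`.

```typescript
// Hypothetical parent state; the real snapshot carries more fields.
interface SketchParentState {
  cwd: string;
  model: string;
}

type SketchSnapshot = Readonly<SketchParentState>;

// Copy the needed fields once, then freeze the copy.
function takeSnapshot(parentState: SketchParentState): SketchSnapshot {
  return Object.freeze({ cwd: parentState.cwd, model: parentState.model });
}

const liveParent: SketchParentState = { cwd: "/repo", model: "model-a" };
const queuedSnapshot = takeSnapshot(liveParent);

// The parent changes while the agent is still waiting in the queue.
liveParent.model = "model-b";

console.log(queuedSnapshot.model); // logs: model-a
```

The queued agent sees the values that were current at spawn time, which is the behaviour the live `ctx` reference could not guarantee.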
140
140
 
141
141
  ## SubagentsService
@@ -175,10 +175,10 @@ The dynamic import provides compile-time types; the `Symbol.for()` key is the ac
175
175
  See `src/service.ts` for the canonical definition.
176
176
  Key types:
177
177
 
178
- - `SubagentsService` `spawn`, `getRecord`, `listAgents`, `abort`, `steer`, `waitForAll`, `hasRunning`.
179
- - `SubagentRecord` serializable agent snapshot (no live session objects).
180
- - `SpawnOptions` `description`, `model`, `maxTurns`, `thinkingLevel`, `isolated`, `inheritContext`, `foreground`, `bypassQueue`, `isolation`.
181
- - `SUBAGENT_EVENTS` channel constants for `pi.events` subscriptions.
178
+ - `SubagentsService` - `spawn`, `getRecord`, `listAgents`, `abort`, `steer`, `waitForAll`, `hasRunning`.
179
+ - `SubagentRecord` - serializable agent snapshot (no live session objects).
180
+ - `SpawnOptions` - `description`, `model`, `maxTurns`, `thinkingLevel`, `isolated`, `inheritContext`, `foreground`, `bypassQueue`, `isolation`.
181
+ - `SUBAGENT_EVENTS` - channel constants for `pi.events` subscriptions.
182
182
 
183
183
  ### Accessor pattern
184
184
 
@@ -208,7 +208,7 @@ The core emits events on `pi.events` that any extension can observe:
208
208
  | `subagents:completed` | `{ id, type, status, result?, error? }` | Agent finishes |
209
209
  | `subagents:activity` | `{ id, toolName?, textDelta?, turnCount? }` | Streaming progress |
210
210
 
211
- These are fire-and-forget broadcast events no request IDs, no reply channels.
211
+ These are fire-and-forget broadcast events - no request IDs, no reply channels.
212
212
 
213
213
  ### Consumer example: scheduling extension
214
214
 
@@ -259,27 +259,30 @@ The original monolithic `index.ts` has been decomposed into focused modules:
259
259
 
260
260
  ```text
261
261
  src/
262
- ├── index.ts slimmed entry point: init, tool registration
263
- ├── runtime.ts SubagentRuntime: session-scoped state + methods
262
+ ├── index.ts - slimmed entry point: init, tool registration
263
+ ├── runtime.ts - SubagentRuntime: session-scoped state + methods
264
264
  ├── tools/
265
- │ ├── agent-tool.ts Agent tool definition, parameter validation, dispatch
266
- │ ├── foreground-runner.ts foreground execution loop (spinner, streaming, result)
267
- │ ├── background-spawner.ts background spawn (activity setup, notification wiring)
268
- │ ├── get-result-tool.ts get_subagent_result tool
269
- │ ├── steer-tool.ts steer_subagent tool
270
- │ └── helpers.ts shared tool utilities (textResult, buildDetails, getStatusNote, …)
265
+ │ ├── agent-tool.ts - Agent tool definition, parameter validation, dispatch
266
+ │ ├── foreground-runner.ts - foreground execution loop (spinner, streaming, result)
267
+ │ ├── background-spawner.ts - background spawn (activity setup, notification wiring)
268
+ │ ├── get-result-tool.ts - get_subagent_result tool
269
+ │ ├── steer-tool.ts - steer_subagent tool
270
+ │ └── helpers.ts - shared tool utilities (textResult, buildDetails, getStatusNote, ...)
271
271
  ├── handlers/
272
- │ ├── lifecycle.ts session_start, session_before_switch, session_shutdown
273
- │ └── tool-start.ts tool_execution_start handler
274
- ├── notification.ts completion nudges, custom renderer
275
- ├── renderer.ts notification TUI component
276
- ├── ui/agent-menu.ts /agents slash command menu
277
- ├── service-adapter.ts SubagentsService implementation wrapping AgentManager
272
+ │ ├── lifecycle.ts - session_start, session_before_switch, session_shutdown
273
+ │ └── tool-start.ts - tool_execution_start handler
274
+ ├── notification.ts - completion nudges, custom renderer
275
+ ├── renderer.ts - notification TUI component
276
+ ├── ui/agent-menu.ts - /agents slash command menu (orchestration, listing, settings)
277
+ ├── ui/agent-config-editor.ts - agent detail view (edit/delete/eject/disable/enable)
278
+ ├── ui/agent-creation-wizard.ts - agent creation (AI-generation and manual-form)
279
+ ├── ui/agent-file-ops.ts - AgentFileOps interface + FsAgentFileOps implementation
280
+ ├── service-adapter.ts - SubagentsService implementation wrapping AgentManager
278
281
  └── (existing domain modules unchanged)
279
282
  ```
280
283
 
281
284
  Each extracted module receives narrow constructor-injected dependencies rather than closing over module-level state.
282
- Handlers call methods on narrow runtime interfaces no raw field writes, no `widget!` reach-throughs.
285
+ Handlers call methods on narrow runtime interfaces - no raw field writes, no `widget!` reach-throughs.
283
286
 
284
287
  ## Phase plan
285
288
 
@@ -311,27 +314,34 @@ Model strings are resolved inside the adapter.
311
314
  Extracted tools, notifications, activity tracking, event handlers, and the `/agents` command into separate modules.
312
315
  Created `SubagentRuntime` factory to hold session-scoped state.
313
316
 
314
- ### Phase 6 (future): Extract UI to `@gotgenes/pi-subagents-ui`
317
+ ### Phase 6 (deferred): Extract UI to `@gotgenes/pi-subagents-ui`
315
318
 
316
- Move `ui/agent-widget.ts`, `ui/conversation-viewer.ts`, the `/agents` command, notifications, and activity tracking to a separate extension that consumes `SubagentsService` + lifecycle events.
319
+ The widget, conversation viewer, `/agents` command, notifications, and activity tracking are candidates for extraction to a separate extension that consumes `SubagentsService` + lifecycle events.
317
320
  This phase is deferred until the API boundary is proven stable in production.
318
321
 
319
322
  ### Phase 7: Encapsulation and dependency narrowing
320
323
 
321
- Target: every mutable state bag becomes a class, every dependency bag narrows to what its consumer uses, every callback becomes either a method on a collaborator or an event on an observable.
324
+ Every mutable state bag became a class, every dependency bag narrowed to what its consumer uses, every callback became either a method on a collaborator or an event on an observable.
322
325
 
323
- The work is sequenced so each change makes the next change easy.
324
326
  See the [Encapsulation roadmap](#encapsulation-roadmap) section for the full breakdown.
325
327
 
326
328
  ### Phase 8: Testability, display extraction, and menu decomposition
327
329
 
328
- Target: eliminate `vi.mock()` module mocking in the two most fragile test suites by injecting IO-touching collaborators; consolidate shared test fixtures; extract display helpers into a reusable module; decompose the largest UI file.
330
+ Eliminated `vi.mock()` module mocking in the two most fragile test suites by injecting IO-touching collaborators; consolidated shared test fixtures; extracted display helpers into a reusable module; decomposed the largest UI file.
329
331
 
330
332
  See the [Phase 8 roadmap](#phase-8-roadmap) section for the full breakdown.
331
333
 
334
+ ### Phase 9: Observation consolidation, ctx elimination, and remaining mocks
335
+
336
+ Target: consolidate the dual observation model so stats live in one place; remove `ExtensionContext` from all internal APIs; eliminate remaining `vi.mock()` calls and `as any` casts; split widget rendering from lifecycle; apply dependency bag convention.
337
+
338
+ See the [Phase 9 roadmap](#phase-9-roadmap) section for the full breakdown.
339
+ Issues: #144, #145, #146, #147, #148.
340
+
332
341
  ## Structural refactoring roadmap
333
342
 
334
- Phases 1-5 and 7 are complete.
343
+ Phases 1-5, 7, and 8 are complete.
344
+ Phase 6 (UI extraction) is deferred.
335
345
  See `git log` for the full history; issue references are preserved below for traceability.
336
346
 
337
347
  | Phase | Issue | Summary |
@@ -358,7 +368,7 @@ Issue #102 consolidated test `AgentRecord` construction into a shared factory.
358
368
  Replaced live `ctx: ExtensionContext` capture in `SpawnArgs` with an immutable `ParentSnapshot` data object.
359
369
  The snapshot is taken once at spawn time; queued agents execute against frozen state rather than a potentially stale session reference.
360
370
  `runAgent()` accepts `ParentSnapshot` instead of `ctx`.
361
- `pi: ExtensionAPI` was removed from `SpawnArgs` `runAgent()` accepts a `ShellExec` function instead.
371
+ `pi: ExtensionAPI` was removed from `SpawnArgs` - `runAgent()` accepts a `ShellExec` function instead.
362
372
 
363
373
  ### Step 3: Session-event observation (#100)
364
374
 
@@ -381,9 +391,9 @@ Replaced three-layer callback threading with direct session subscriptions.
381
391
 
382
392
  ## Encapsulation roadmap
383
393
 
384
- This section describes the Phase 7 targets: encapsulating mutable state into classes, replacing callbacks with semantic components, and narrowing dependency bags.
394
+ Phase 7 encapsulated mutable state into classes, replaced callbacks with semantic components, and narrowed dependency bags.
385
395
 
386
- Each step is sequenced so it makes the next step easier.
396
+ Each step was sequenced so it made the next step easier.
387
397
 
388
398
  ### Resolved smells
389
399
 
@@ -416,7 +426,7 @@ Wrapped the module-scoped `agents` Map and free functions in `agent-types.ts` in
416
426
 
417
427
  Encapsulated settings load/save/apply cycle into `SettingsManager` (in `settings.ts`).
418
428
  Owns `defaultMaxTurns`, `graceTurns`, `maxConcurrent` with normalizing property accessors.
419
- Added `applyMaxConcurrent(n)`, `applyDefaultMaxTurns(n)`, `applyGraceTurns(n)` each owns the full consequence chain: normalize → set in memory → notify callback → persist → emit event → return toast.
429
+ Added `applyMaxConcurrent(n)`, `applyDefaultMaxTurns(n)`, `applyGraceTurns(n)` - each owns the full consequence chain: normalize → set in memory → notify callback → persist → emit event → return toast.
420
430
  The 6 settings-related fields in `AgentMenuDeps` collapsed to `settings: AgentMenuSettings`.
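The consequence chain for one `apply*` method might look roughly like this. The collaborator interface, the default value, the normalization rule, and the event channel name are all assumptions for illustration; only the order of the steps is taken from the description above.

```typescript
// Hypothetical collaborators; the real SettingsManager wiring may differ.
interface SketchSettingsDeps {
  onMaxConcurrentChanged: (value: number) => void;
  persist: (values: { maxConcurrent: number }) => void;
  emit: (channel: string, payload: { maxConcurrent: number }) => void;
}

class SketchSettingsManager {
  private readonly deps: SketchSettingsDeps;
  private maxConcurrentValue = 4; // assumed default

  constructor(deps: SketchSettingsDeps) {
    this.deps = deps;
  }

  get maxConcurrent(): number {
    return this.maxConcurrentValue;
  }

  applyMaxConcurrent(n: number): string {
    // 1. normalize (assumed rule: whole number, at least 1)
    const normalized = Number.isFinite(n) ? Math.max(1, Math.floor(n)) : 1;
    // 2. set in memory
    this.maxConcurrentValue = normalized;
    // 3. notify callback
    this.deps.onMaxConcurrentChanged(normalized);
    // 4. persist
    this.deps.persist({ maxConcurrent: normalized });
    // 5. emit event (channel name is hypothetical)
    this.deps.emit("sketch:settings_changed", { maxConcurrent: normalized });
    // 6. return toast text for the caller to display
    return `Max concurrent agents set to ${normalized}`;
  }
}

const settingsLog: string[] = [];
const sketchSettings = new SketchSettingsManager({
  onMaxConcurrentChanged: (value) => {
    settingsLog.push(`notify:${value}`);
  },
  persist: (values) => {
    settingsLog.push(`persist:${values.maxConcurrent}`);
  },
  emit: (channel, payload) => {
    settingsLog.push(`emit:${channel}:${payload.maxConcurrent}`);
  },
});

const toastText = sketchSettings.applyMaxConcurrent(2.9);
console.log(toastText); // logs: Max concurrent agents set to 2
console.log(settingsLog.join(" > "));
```

Keeping the whole chain inside one method means a menu handler only calls `applyMaxConcurrent(n)` and shows the returned text; it cannot forget to persist or to emit.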
421
431
 
422
432
  #### A3. AgentActivityTracker class (#110)
@@ -429,9 +439,9 @@ The shared map on `SubagentRuntime` is `Map<string, AgentActivityTracker>`.
429
439
 
430
440
  Split post-construction mutation into phase-specific collaborators, each born complete:
431
441
 
432
- - **`ExecutionState`** (`session`, `outputFile`) constructed in `onSessionCreated`.
433
- - **`WorktreeState`** (`path`, `branch`, `cleanupResult`) constructed at worktree setup.
434
- - **`NotificationState`** (`toolCallId`, `resultConsumed`) constructed by `AgentManager.spawn()` when `toolCallId` is provided.
442
+ - **`ExecutionState`** (`session`, `outputFile`) - constructed in `onSessionCreated`.
443
+ - **`WorktreeState`** (`path`, `branch`, `cleanupResult`) - constructed at worktree setup.
444
+ - **`NotificationState`** (`toolCallId`, `resultConsumed`) - constructed by `AgentManager.spawn()` when `toolCallId` is provided.
435
445
  - **`pendingSteers`** moved to `Map<string, string[]>` on `AgentManager`.
436
446
  - Stats encapsulated behind mutation methods with read-only getters.
437
447
  - `AgentRecordInit` trimmed from 19 optional fields to 4 construction-time fields.
@@ -487,16 +497,15 @@ Extracted `foreground-runner.ts` (~175 lines) and `background-spawner.ts` (~116
487
497
 
488
498
  ### Dependency graph
489
499
 
490
- ```text
491
- A1 (Registry) ──────────────────┐
492
- A2 (Settings) ── A2b (Apply) ──┤
493
- A3 (Activity Tracker) ───────────┤
494
- ├── D2 (Narrow deps) ── E1 (agent-tool split)
495
- B (Record lifecycle) ───────────┤
496
- └── C (Observer) ────────────┤
497
- └── D1 (SpawnOptions) ──┘
498
-
499
- E2 (Type housekeeping) ── can start after A1, runs parallel to later steps
500
+ ```mermaid
501
+ flowchart LR
502
+ A1["A1: Registry"] --> D2["D2: Narrow deps"]
503
+ A2["A2: Settings"] --> A2b["A2b: Apply"] --> D2
504
+ A3["A3: Activity Tracker"] --> D2
505
+ B["B: Record lifecycle"] --> D2
506
+ B --> C["C: Observer"] --> D1["D1: SpawnOptions"] --> D2
507
+ D2 --> E1["E1: agent-tool split"]
508
+ A1 --> E2["E2: Type housekeeping"]
500
509
  ```
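One way to read the graph is as an ordering constraint on the steps. The sketch below encodes the edges from the flowchart and derives one valid order with Kahn's algorithm; the function name `topoOrder` is hypothetical and nothing like it is claimed to exist in the package.

```typescript
// Edges copied from the flowchart: [prerequisite, dependent].
const stepEdges: ReadonlyArray<readonly [string, string]> = [
  ["A1", "D2"],
  ["A2", "A2b"],
  ["A2b", "D2"],
  ["A3", "D2"],
  ["B", "D2"],
  ["B", "C"],
  ["C", "D1"],
  ["D1", "D2"],
  ["D2", "E1"],
  ["A1", "E2"],
];

// Kahn's algorithm: repeatedly take a step with no unmet prerequisites.
function topoOrder(edges: ReadonlyArray<readonly [string, string]>): string[] {
  const indegree = new Map<string, number>();
  const successors = new Map<string, string[]>();

  edges.forEach(([from, to]) => {
    indegree.set(from, indegree.get(from) ?? 0);
    indegree.set(to, (indegree.get(to) ?? 0) + 1);
    successors.set(from, [...(successors.get(from) ?? []), to]);
  });

  const ready: string[] = [];
  indegree.forEach((degree, node) => {
    if (degree === 0) ready.push(node);
  });

  const order: string[] = [];
  while (ready.length > 0) {
    const node = ready.shift() as string;
    order.push(node);
    for (const next of successors.get(node) ?? []) {
      const remaining = (indegree.get(next) ?? 0) - 1;
      indegree.set(next, remaining);
      if (remaining === 0) ready.push(next);
    }
  }

  if (order.length !== indegree.size) throw new Error("cycle detected");
  return order;
}

const stepOrder = topoOrder(stepEdges);
console.log(stepOrder.join(" -> "));
```

Any order the function returns places D2 after A1, A2b, A3, B, and D1, which matches the fan-in shown in the diagram.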
501
510
 
502
511
  ---
@@ -504,44 +513,44 @@ E2 (Type housekeeping) ── can start after A1, runs parallel to later steps
504
513
  ## Phase 8 roadmap
505
514
 
506
515
  Phase 7 eliminated all structural smells (mutable state, closure bags, callback threading, wide dependency bags).
507
- Phase 8 targets the next layer: testability friction, display module cohesion, and menu decomposition.
516
+ Phase 8 targeted the next layer: testability friction, display module cohesion, and menu decomposition.
508
517
 
509
- The test suite (714 tests) is comprehensive but uneven in quality.
510
- Steps G and H have eliminated 11 of the original 12 `vi.mock()` calls in the runner tests, removing fragile call-sequence assertions in favour of injected stubs. (Step G resolved `session-config.test.ts`; Step H resolved both `agent-runner.test.ts` and `agent-runner-extension-tools.test.ts`.)
518
+ Steps G and H eliminated 11 of the original 12 `vi.mock()` calls in the runner tests, removing fragile call-sequence assertions in favour of injected stubs.
519
+ Step G resolved `session-config.test.ts`; Step H resolved both `agent-runner.test.ts` and `agent-runner-extension-tools.test.ts`.
511
520
 
512
- The display and menu improvements were identified during Phase 7 but deferred because they don't gate encapsulation work.
513
- They are included here because the display extraction unblocks menu decomposition.
521
+ The display and menu improvements were identified during Phase 7 but deferred because they did not gate encapsulation work.
522
+ The display extraction unblocked menu decomposition.
514
523
 
515
- ### Test pain points
524
+ ### Test pain points (resolved)
516
525
 
517
- | Symptom | Location | Root cause |
518
- | ----------------------------- | ------------------------------------------------------- | ----------------------------------------------------------------- |
519
- | ~~7 `vi.mock()` calls~~ | ~~`agent-runner.test.ts`~~ | ~~Resolved by Step H (#133)~~ |
520
- | ~~7 `vi.mock()` calls~~ | ~~`agent-runner-extension-tools.test.ts`~~ | ~~Resolved by Step H (#133)~~ |
521
- | ~~52 `as any` casts~~ | ~~Across test suite~~ | ~~Reduced to 15 by Step I (#134)~~ |
522
- | 3× duplicated `mockSession()` | agent-manager, record-observer, ui-observer tests | No shared test fixture |
523
- | 3× duplicated `makeDeps()` | agent-tool, background-spawner, foreground-runner tests | No shared tool-deps fixture |
524
- | Weak assertions | lifecycle, renderer, session-config tests | `toHaveBeenCalled()` without args, `toContain()` on large strings |
526
+ | Symptom | Resolution |
527
+ | ------------------------------------------------------------- | -------------------------------------------------------------- |
528
+ | 7 `vi.mock()` calls in `agent-runner.test.ts` | Step H (#133): injected `RunnerIO` stubs |
529
+ | 7 `vi.mock()` calls in `agent-runner-extension-tools.test.ts` | Step H (#133): same |
530
+ | 52 `as any` casts across test suite | Step I (#134): reduced to 15 |
531
+ | 3× duplicated `mockSession()` | Step F (#131): shared `createMockSession()` in `test/helpers/` |
532
+ | 3× duplicated `makeDeps()` | Step F (#131): shared `createToolDeps()` in `test/helpers/` |
525
533
 
526
- Contrast with the well-designed test suites: `agent-manager.test.ts` (1 mock, DI via `AgentRunner` interface), `notification.test.ts` (0 mocks, pure functions + DI), and `agent-tool.test.ts` (0 mocks, tests via deps bag).
527
- The pattern is clear: modules that accept collaborators through injection produce resilient tests; modules that import collaborators directly produce fragile mock-heavy tests.
534
+ The well-designed test suites - `agent-manager.test.ts` (1 mock, DI via `AgentRunner` interface), `notification.test.ts` (0 mocks, pure functions + DI), and `agent-tool.test.ts` (0 mocks, tests via deps bag) - confirmed the pattern: modules that accept collaborators through injection produce resilient tests; modules that import collaborators directly produce fragile mock-heavy tests.
528
535
 
529
536
  ### Step F: Shared test fixtures (#131)
530
537
 
531
- Consolidate duplicated mock factories into `test/helpers/`.
538
+ Consolidated duplicated mock factories into `test/helpers/`.
532
539
 
533
- 1. `createMockSession()` subscribable event bus with `emit()` helper; replaces 3 hand-rolled copies.
534
- 2. `createToolDeps()` builds `AgentToolDeps` with sensible defaults and override support; replaces 3 `makeDeps()` copies.
540
+ 1. `createMockSession()` - subscribable event bus with `emit()` helper; replaced 3 hand-rolled copies.
541
+ 2. `createToolDeps()` - builds `AgentToolDeps` with sensible defaults and override support; replaced 3 `makeDeps()` copies.
542
+ 3. `makeRecord()` - `AgentRecord` factory with sensible defaults; replaced scattered inline construction.
543
+ 4. `STUB_CTX` - shared stub `ExtensionContext` constant; centralised unavoidable bridge casts.
535
544
 
536
- Impact: reduces test boilerplate; single source of truth for mock shapes; changes to dep interfaces propagate automatically.
545
+ Impact: reduced test boilerplate; single source of truth for mock shapes; changes to dep interfaces propagate automatically.
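The fixture pattern can be sketched as below. The dependency fields and event shape are hypothetical and much smaller than the real `AgentToolDeps` and session events; the names carry a `Sketch` marker to make clear they are not the helpers in `test/helpers/`.

```typescript
// Hypothetical dependency shape; the real AgentToolDeps has different fields.
interface SketchToolDeps {
  spawn: (prompt: string) => string;
  notify: (message: string) => void;
  maxTurns: number;
}

// Defaults first, overrides last, so a test states only what it cares about.
function createSketchToolDeps(overrides: Partial<SketchToolDeps> = {}): SketchToolDeps {
  return {
    spawn: () => "agent-1",
    notify: () => {},
    maxTurns: 10,
    ...overrides,
  };
}

type SketchMockEvent = { type: string };
type SketchHandler = (ev: SketchMockEvent) => void;

// Subscribable event bus with a test-only emit() helper.
function createSketchMockSession() {
  const handlers: SketchHandler[] = [];
  return {
    subscribe(handler: SketchHandler): () => void {
      handlers.push(handler);
      return () => {
        const index = handlers.indexOf(handler);
        if (index !== -1) handlers.splice(index, 1);
      };
    },
    emit(ev: SketchMockEvent): void {
      // Iterate over a copy so handlers may unsubscribe while being called.
      handlers.slice().forEach((handler) => handler(ev));
    },
  };
}

const seenTypes: string[] = [];
const mockSession = createSketchMockSession();
const stopListening = mockSession.subscribe((ev) => {
  seenTypes.push(ev.type);
});
mockSession.emit({ type: "turn_end" });
stopListening();
mockSession.emit({ type: "turn_end" }); // not recorded: already unsubscribed

const fixtureDeps = createSketchToolDeps({ maxTurns: 3 });
console.log(seenTypes.length, fixtureDeps.maxTurns); // logs: 1 3
```

When a field is added to the dependency interface, only the factory's defaults need updating; tests that do not override the field keep compiling.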
537
546
 
538
- ### Step G: Inject IO collaborators into session-config (#132) ✓ done
547
+ ### Step G: Inject IO collaborators into session-config (#132)
539
548
 
540
549
  `assembleSessionConfig` now accepts `io: AssemblerIO` as a required parameter.
541
550
  `index.ts` constructs the real `AssemblerIO` from direct imports via the `RunnerIO.assemblerIO` field (wired in Step H).
542
- `session-config.test.ts` injects stubs all 4 `vi.mock()` calls eliminated, assertions shifted to `SessionConfig` output properties.
551
+ `session-config.test.ts` injects stubs - all 4 `vi.mock()` calls eliminated, assertions shifted to `SessionConfig` output properties.
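The shape of the change can be sketched with a much smaller assembler. `SketchAssemblerIO`, its two functions, and `assembleSketchConfig` are hypothetical; the real `AssemblerIO` and `assembleSessionConfig` live in `session-config.ts` and take different parameters.

```typescript
// Hypothetical IO surface; the real AssemblerIO fields are different.
interface SketchAssemblerIO {
  readMemory: (agentName: string) => string | undefined;
  detectGitBranch: () => string | undefined;
}

interface SketchSessionConfig {
  systemPrompt: string;
}

// Pure with respect to its inputs: every side effect goes through `io`.
function assembleSketchConfig(
  agentName: string,
  basePrompt: string,
  io: SketchAssemblerIO,
): SketchSessionConfig {
  const parts = [basePrompt];
  const memory = io.readMemory(agentName);
  if (memory !== undefined) parts.push(`Memory:\n${memory}`);
  const branch = io.detectGitBranch();
  if (branch !== undefined) parts.push(`Git branch: ${branch}`);
  return { systemPrompt: parts.join("\n\n") };
}

// In a test, plain object stubs stand in for the filesystem and git.
const stubIO: SketchAssemblerIO = {
  readMemory: () => "prefers tabs",
  detectGitBranch: () => undefined,
};

const sketchConfig = assembleSketchConfig("Explore", "You are a subagent.", stubIO);
console.log(sketchConfig.systemPrompt);
```

The test asserts on the returned configuration instead of on which mocked module function was called in which order, so it survives internal reordering of the assembler.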
543
552
 
544
- ### Step H: Inject SDK boundary into agent-runner (#133) ✓ done
553
+ ### Step H: Inject SDK boundary into agent-runner (#133)
545
554
 
546
555
  `runAgent()` now accepts `io: RunnerIO` as a required parameter bundling all IO collaborators: `detectEnv`, `getAgentDir`, `createResourceLoader`, `deriveSessionDir`, `createSessionManager`, `createSettingsManager`, `createSession`, and `assemblerIO`.
547
556
 
@@ -550,7 +559,7 @@ Impact: reduces test boilerplate; single source of truth for mock shapes; change
550
559
 
551
560
  Impact: all 7 `vi.mock()` calls eliminated from both `agent-runner.test.ts` and `agent-runner-extension-tools.test.ts`; tests verify behavior (turn limits, tool filtering, response collection) through injected stubs; SDK imports moved to the extension entry point.
552
561
 
553
- ### Step I: Reduce `as any` casts in tests (#134) ✓ done
562
+ ### Step I: Reduce `as any` casts in tests (#134)
554
563
 
555
564
  Reduced `as any` count from 93 to 15 (plus 13 explicit `as unknown as T` bridge casts).
556
565
 
@@ -563,9 +572,9 @@ Production changes:
563
572
  - `textResult()` return no longer casts `details as any`.
564
573
  - `toAgentSession()` helper and `STUB_CTX` constant centralise unavoidable bridge casts.
565
574
 
566
- Remaining 15 `as any` casts are: 8 menu-handler `ctx as any` (deferred requires `AgentManager.spawn` to accept `ParentSnapshot` directly), 2 `print-mode.test.ts` (same ExtensionContext/API pattern), 2 private-field test access, 1 `createSession` SDK bridge in `index.ts`, 1 `foreground-runner.ts` `AgentToolResult<any>` detail, 1 `stub-ctx.ts` comment.
575
+ Remaining 15 `as any` casts are: 8 menu-handler `ctx as any` (deferred - requires `AgentManager.spawn` to accept `ParentSnapshot` directly), 2 `print-mode.test.ts` (same ExtensionContext/API pattern), 2 private-field test access, 1 `createSession` SDK bridge in `index.ts`, 1 `foreground-runner.ts` `AgentToolResult<any>` detail, 1 `stub-ctx.ts` comment.
567
576
 
568
- ### Step J: Extract display helpers (#135) ✓ done
577
+ ### Step J: Extract display helpers (#135)
569
578
 
570
579
  `ui/display.ts` now contains all pure formatters, display helpers, constants, and shared types (`Theme`, `AgentDetails`).
571
580
  `agent-widget.ts` dropped from 522 → ~340 lines.
@@ -574,30 +583,160 @@ All consumer modules (menu, tools, renderer, conversation viewer) import from `u
574
583
 
575
584
  ### Step K: Decompose agent-menu.ts (#136)
576
585
 
577
- `agent-menu.ts` (650 lines) has 8 distinct responsibilities: menu FSM, agent listing, config editing, agent ejection, two creation wizards, running-agent viewer, and settings form.
578
- Filesystem operations (read/write/delete agent `.md` files) are scattered throughout.
586
+ `agent-menu.ts` (668 lines) decomposed into four modules:
579
587
 
580
- 1. Extract `AgentFileOps` interface `read`, `write`, `delete`, `findAgentFile` abstracting the fs calls.
581
- 2. Extract `ui/agent-config-editor.ts` `showAgentDetail` with enable/disable/reset/delete transitions.
582
- 3. Extract `ui/agent-creation-wizard.ts` both AI-generation and manual form paths.
583
- 4. Leave menu orchestration, settings form, and running-agent viewer in `agent-menu.ts` (~200 lines).
588
+ 1. `ui/agent-file-ops.ts` - `AgentFileOps` interface (`exists`, `read`, `write`, `remove`, `ensureDir`, `findAgentFile`) + `FsAgentFileOps` production implementation.
589
+ 2. `ui/agent-config-editor.ts` - `showAgentDetail` with edit/delete/reset/eject/disable/enable transitions (~200 lines).
590
+ 3. `ui/agent-creation-wizard.ts` - AI-generation and manual-form creation paths (~250 lines).
591
+ 4. `ui/agent-menu.ts` - menu orchestration, agent listing, running-agent viewer, settings form (~300 lines).
584
592
 
585
- Impact: `agent-menu.ts` drops from 650 → ~200 lines; extracted modules receive `AgentFileOps` via injection; wizard logic becomes independently testable.
593
+ Impact: `agent-menu.ts` dropped from 668 → 296 lines; extracted modules receive `AgentFileOps` via injection; `vi.mock("node:fs")` eliminated from `agent-menu.test.ts`.
586
594
 
587
595
  ### Step dependencies
588
596
 
589
- ```text
590
- F (Shared fixtures) ──────────────────────────────┐
591
-
592
- G (session-config IO injection) ──────────────────┤
593
- └── H (agent-runner SDK injection) ────────────┤
594
- └── I (Reduce as-any) ────────────────────┘
595
-
596
- J (Display extraction) ──────────────────────────┐
597
- └── K (Menu decomposition) ────────────────────┘
597
+ ```mermaid
598
+ flowchart LR
599
+ subgraph testability["Testability track"]
600
+ F["F: Shared fixtures"] --> G["G: session-config IO"] --> H["H: agent-runner SDK"] --> I["I: Reduce as-any"]
601
+ end
602
+ subgraph display["Display track"]
603
+ J["J: Display extraction"] --> K["K: Menu decomposition"]
604
+ end
598
605
  ```
599
606
 
600
- Steps F through I (testability) and Steps J through K (display/menu) are independent tracks that can proceed in parallel.
607
+ The two tracks are independent and can proceed in parallel.
608
+
609
+ ---
610
+
611
+ ## Phase 9 roadmap
612
+
613
+ Phases 7 and 8 addressed structural encapsulation and testability.
614
+ Phase 9 targets the next layer: observation model consolidation, `ExtensionContext` elimination from internal APIs, remaining `vi.mock()` / `as any` casts, and dependency bag cleanup.
615
+
616
+ ### Current smells
617
+
618
+ | Smell | Location | Evidence | Severity |
619
+ | ------------------------------------------------ | --------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- |
620
+ | Dual observation | `record-observer.ts`, `ui-observer.ts` | Both independently count tool uses and accumulate lifetime usage from the same session events; consumers use `activity?.toolUses ?? record.toolUses` fallbacks | High |
621
+ | `execute` does config resolution for its callees | `agent-tool.ts` (145-line `execute`) | ~60 lines unpack config, resolve model, compute metadata, repack into 16-field bags for spawners; `ctx` threaded 4 layers deep | Medium |
622
+ | Wide `ctx` in menu handlers | `agent-menu.ts`, `agent-config-editor.ts`, `agent-creation-wizard.ts` | Functions declare `ctx: ExtensionContext` but only call `ctx.ui.select/confirm/input/notify/editor`; 43 `ctx as any` casts across 3 test files | Medium |
623
+ | `record.execution?.session` traversal | 15+ callsites across tools, notification, widget, menu | Callers reach through `ExecutionState` to access session and outputFile - Law of Demeter violation | Medium |
624
+ | Direct SDK import in `conversation-viewer.ts` | `conversation-viewer.test.ts` | Hoisted `vi.mock("@earendil-works/pi-tui")` to intercept `wrapTextWithAnsi` | Low |
625
+ | Widget mixes rendering, lifecycle, and state | `agent-widget.ts` (370 lines) | `renderWidget` is ~109 lines mixing data collection, formatting, and overflow layout; constructor takes 3 concrete collaborators | Low |
626
+ | `deps.` prefix noise in function bodies | 12 modules across tools, UI, notification, service-adapter | Functions accept a `deps` bag and access every field as `deps.foo`; hides real dependencies and lengthens every call line | Low |
627
+
628
+ ### Dependency bag convention
629
+
630
+ Applied incrementally as each step touches a module:
631
+
632
+ - **≤4 fields** — accept as plain parameters; drop the interface.
633
+ - **≥5 fields** — keep a named interface but destructure in the function signature (`{ manager, widget }: ForegroundDeps`) so the function body uses bare names, not `deps.foo`.
634
+
635
+ This eliminates the `deps.` prefix noise across ~124 callsites in 12 modules.
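The convention above can be sketched as follows. This is a minimal illustration, not code from the package: `LaunchDeps`, `launch`, `notifyDone`, and the small collaborator interfaces are hypothetical names invented for the example.

```typescript
// Hypothetical collaborator shapes, for illustration only.
interface Manager { spawn(type: string): string; }
interface Widget { refresh(): void; }
interface Logger { info(msg: string): void; }
interface Clock { now(): number; }
interface Registry { has(type: string): boolean; }

// Five or more fields: keep a named interface...
interface LaunchDeps {
  manager: Manager;
  widget: Widget;
  logger: Logger;
  clock: Clock;
  registry: Registry;
}

// ...but destructure it in the signature so the body uses bare names.
function launch(
  { manager, widget, logger, clock, registry }: LaunchDeps,
  type: string,
): string | undefined {
  if (!registry.has(type)) {
    logger.info(`unknown agent type: ${type}`);
    return undefined;
  }
  const id = manager.spawn(type);
  logger.info(`spawned ${id} at ${clock.now()}`);
  widget.refresh();
  return id;
}

// Four or fewer collaborators: plain parameters, no interface at all.
function notifyDone(logger: Logger, widget: Widget, id: string): void {
  logger.info(`agent ${id} finished`);
  widget.refresh();
}
```

With either form the signature lists exactly what the function touches, so a reader does not have to scan the body for `deps.foo` accesses to find the real dependencies.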
636
+
637
+ ### Step L: Consolidate observation model (#144)
638
+
639
+ Remove `_toolUses` and `_lifetimeUsage` from `AgentActivityTracker`.
640
+ UI consumers read stats from `AgentRecord` instead of the tracker.
641
+ The UI observer retains event subscriptions for re-render triggers but no longer accumulates stats independently.
642
+
643
+ Add `session` and `outputFile` convenience getters on `AgentRecord` to hide the `execution?.` traversal.
644
+ The 15+ callsites that navigate `record.execution?.session` simplify to `record.session`.
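A minimal sketch of the convenience getters, assuming heavily simplified shapes: the real `AgentRecord` and `ExecutionState` carry more fields, and the `Session` type here is a stand-in.

```typescript
// Stand-in types; only the fields needed for the example are shown.
interface Session { id: string; }
interface ExecutionState { session: Session; outputFile?: string; }

class AgentRecord {
  id: string;
  execution?: ExecutionState;

  constructor(id: string) {
    this.id = id;
  }

  // Callers read record.session instead of record.execution?.session.
  get session(): Session | undefined {
    return this.execution?.session;
  }

  // Same idea for the output file path.
  get outputFile(): string | undefined {
    return this.execution?.outputFile;
  }
}
```

Consumers then depend only on `AgentRecord`, so the layout of `ExecutionState` can change without touching the callsites.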
645
+
646
+ Apply the dependency bag convention to touched modules: `NotificationDeps` (4 fields) becomes plain parameters on `NotificationManager` constructor.
647
+
648
+ Impact: eliminates dual counting; removes `??` fallback pattern from widget and conversation viewer; hides `ExecutionState` structure from consumers.
649
+
650
+ ### Step M: Decompose execute and push ExtensionContext to the boundary (#145)
651
+
652
+ `execute` is 145 lines with three responsibilities mixed together:
653
+
654
+ 1. **Boundary extraction** (~5 lines) - read `ctx.model`, `ctx.modelRegistry`, `ctx.ui`, `ctx.sessionManager`, call `buildParentSnapshot(ctx)`.
655
+ 2. **Config resolution** (~60 lines) - resolve agent type, merge invocation config, resolve model, compute max turns, build tags and display metadata.
656
+ 3. **Dispatch** (~80 lines) - resume / background / foreground, each passing 14-16 field parameter bags.
657
+
658
+ The config resolution section is doing work that belongs to its callees: it manually unpacks `resolvedConfig` field by field, computes derived values, then repacks everything into massive objects for `spawnBackground` and `runForeground`.
659
+ The 16-field bags are the symptom - they exist because the resolution happened in the wrong place.
660
+
661
+ The fix has two parts:
662
+
663
+ 1. **Extract config resolution** into a pure function (e.g. `resolveSpawnConfig`) that accepts the raw tool params, registry, model info, and settings, and returns a single `ResolvedSpawnConfig` object.
664
+ `execute` becomes: extract ctx → resolve config → dispatch.
665
+ `spawnBackground` and `runForeground` receive `ResolvedSpawnConfig` instead of 16 individual fields.
666
+ 2. **Push `ctx` to the boundary.**
667
+ `execute` extracts everything from `ctx` in its first few lines.
668
+ `foreground-runner.ts` and `background-spawner.ts` receive domain values (`snapshot`, `parentSessionFile`, `parentSessionId`) instead of `ctx`.
669
+ `AgentManager.spawn()` and `spawnAndWait()` accept `ParentSnapshot` instead of `ExtensionContext`.
670
+ `service-adapter.ts` calls `buildParentSnapshot(session.ctx)` at its boundary.
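The first part of the fix could look roughly like the sketch below. Only the names `resolveSpawnConfig` and `ResolvedSpawnConfig` come from the text; every field, the parameter list, and the precedence order (invocation params, then agent type config, then parent or settings defaults) are assumptions made for illustration.

```typescript
// Hypothetical input and output shapes; the real config has more fields.
interface ToolParams { agentType: string; model?: string; maxTurns?: number; }
interface AgentTypeConfig { model?: string; maxTurns?: number; }
interface SpawnSettings { defaultMaxTurns: number; }

interface ResolvedSpawnConfig {
  agentType: string;
  model: string;
  maxTurns: number;
  tags: string[];
}

// Pure function: no ctx, no IO, so it can be tested with plain objects.
function resolveSpawnConfig(
  params: ToolParams,
  registry: ReadonlyMap<string, AgentTypeConfig>,
  parentModel: string,
  settings: SpawnSettings,
): ResolvedSpawnConfig {
  const base = registry.get(params.agentType);
  if (!base) {
    throw new Error(`Unknown agent type: ${params.agentType}`);
  }
  // Assumed precedence: invocation override, then type config, then fallback.
  const model = params.model ?? base.model ?? parentModel;
  const maxTurns = params.maxTurns ?? base.maxTurns ?? settings.defaultMaxTurns;
  return {
    agentType: params.agentType,
    model,
    maxTurns,
    tags: [params.agentType, model],
  };
}
```

Under this shape, `execute` would extract values from `ctx`, call the resolver once, and hand the single result object to whichever dispatch path applies.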
671
+
672
+ After this step, `ExtensionContext` appears only in:
673
+
674
+ - `agent-tool.ts execute` (SDK callback - unavoidable)
675
+ - `service-adapter.ts` (cross-extension boundary)
676
+ - `index.ts` (extension entry point)
677
+ - Menu handlers (addressed by Step N)
678
+
679
+ Apply the dependency bag convention to touched modules: `ForegroundDeps` (3 fields) and `BackgroundDeps` (3 fields) become plain parameters; `AdapterDeps` (3 fields) becomes plain parameters; `AgentToolDeps` (6 fields) is destructured in the signature.
680
+
681
+ Impact: `execute` drops from ~145 to ~30 lines; eliminates 16-field parameter bags; eliminates 1 `vi.mock()` call in `agent-manager.test.ts`; `foreground-runner` and `background-spawner` tests no longer need `ctx` mocks; `AgentManager` operates entirely on domain types.
682
+
683
+ ### Step N: Narrow UI context for menu handlers (#146)
684
+
685
+ Define a `MenuUI` interface with `select`, `confirm`, `input`, `notify`, and `editor` methods.
686
+ Menu handler functions (`showAgentsMenu`, `showAgentDetail`, `showCreateWizard`, etc.) accept `MenuUI` instead of `ExtensionContext`.
687
+ `index.ts` passes `ctx.ui` at the call site.
688
+
689
+ The creation wizard's `spawnAndWait` call changes: the narrow `AgentMenuManager.spawnAndWait` accepts `ParentSnapshot` (enabled by Step M) instead of `ExtensionContext`.
690
+
691
+ Apply the dependency bag convention to touched modules: `AgentConfigEditorDeps` (4 fields), `SteerToolDeps` (4 fields), and `GetResultDeps` (4 fields) become plain parameters; `AgentMenuDeps` (8 fields) and `AgentCreationWizardDeps` (5 fields) are destructured in the signature.
692
+
693
+ After Steps M and N, `ExtensionContext` appears only at true boundaries: `agent-tool.ts execute` (SDK callback), `service-adapter.ts` (cross-extension bridge), and `index.ts` (extension entry point).
694
+
695
+ Impact: eliminates ~43 `ctx as any` casts across menu, editor, and wizard test files; tests construct a plain object satisfying `MenuUI` with no cast.
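A sketch of what the narrow interface and a cast-free test stub might look like. The five method names come from the text; their parameter and return types are guesses, and `confirmDelete` and `makeStubUI` are hypothetical helpers, not functions from the package.

```typescript
// Method names from the text; the signatures are assumed for illustration.
interface MenuUI {
  select(title: string, options: string[]): Promise<string | undefined>;
  confirm(title: string, message: string): Promise<boolean>;
  input(title: string, placeholder?: string): Promise<string | undefined>;
  notify(message: string, level?: "info" | "warning" | "error"): void;
  editor(title: string, initial?: string): Promise<string | undefined>;
}

// Hypothetical handler: it depends on MenuUI only, not on ExtensionContext.
async function confirmDelete(ui: MenuUI, agentName: string): Promise<boolean> {
  const ok = await ui.confirm("Delete agent", `Delete ${agentName}?`);
  ui.notify(ok ? `Deleted ${agentName}` : "Cancelled");
  return ok;
}

// A test double is a plain object literal that satisfies MenuUI; no cast needed.
function makeStubUI(answer: boolean, notices: string[]): MenuUI {
  return {
    select: async () => undefined,
    confirm: async () => answer,
    input: async () => undefined,
    notify: (message) => { notices.push(message); },
    editor: async () => undefined,
  };
}
```

Because the stub is checked structurally against `MenuUI`, adding a method to the interface makes every incomplete stub a compile error rather than a silent runtime gap.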
696
+
697
+ ### Step O: Inject text wrapping into ConversationViewer (#147)
698
+
699
+ Accept a `wrapText` function via `ConversationViewerOptions`.
700
+ `index.ts` passes the real `wrapTextWithAnsi` import.
701
+ Tests inject a stub or the real function directly - no module-level mock needed.
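The injection could be sketched like this. The `(text, width) => string[]` signature is an assumption about `wrapTextWithAnsi`, the `width` option and `renderMessage` method are invented for the example, and `chunkWrap` is a naive stand-in that ignores ANSI codes and word boundaries.

```typescript
// Assumed signature for the injected wrapper.
type WrapText = (text: string, width: number) => string[];

// Hypothetical subset of the options bag.
interface ConversationViewerOptions {
  wrapText: WrapText;
  width: number;
}

class ConversationViewer {
  private readonly wrapText: WrapText;
  private readonly width: number;

  // Options are destructured in the constructor signature.
  constructor({ wrapText, width }: ConversationViewerOptions) {
    this.wrapText = wrapText;
    this.width = width;
  }

  // Hypothetical method: wraps one message to the configured width.
  renderMessage(text: string): string[] {
    return this.wrapText(text, this.width);
  }
}

// Test stub: fixed-width chunking, enough to make output deterministic.
const chunkWrap: WrapText = (text, width) => {
  const lines: string[] = [];
  for (let i = 0; i < text.length; i += width) {
    lines.push(text.slice(i, i + width));
  }
  return lines;
};
```

A test constructs the viewer with `chunkWrap` and asserts on the returned lines, while production code passes the real wrapper at the entry point.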
702
+
703
+ Apply the dependency bag convention: `ConversationViewerOptions` is destructured in the constructor signature.
704
+
705
+ Impact: eliminates the hoisted `vi.mock("@earendil-works/pi-tui")` in `conversation-viewer.test.ts`.
706
+
707
+ ### Step P: Split AgentWidget rendering (#148)
708
+
709
+ Extract pure rendering functions from `AgentWidget` into `ui/widget-renderer.ts`.
710
+ The widget becomes a thin lifecycle/polling wrapper that calls pure render functions.
711
+ Rendering functions receive data (agent list, activity map, registry) and return formatted strings - testable without widget lifecycle.
712
+
713
+ Depends on Step L: once the tracker drops stats fields, the renderer reads from `AgentRecord` for tool uses and usage, and from `AgentActivityTracker` only for live UI state (active tools, response text, turn count).
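As a rough illustration of the split, a pure render function takes data and returns strings, with no timers or widget state involved. `AgentRow`, `LiveState`, `renderAgentLines`, and the overflow format are hypothetical; the sketch assumes `maxLines` is at least 1.

```typescript
// Hypothetical data shapes: stats come from the record, live state from the tracker.
interface AgentRow { id: string; description: string; toolUses: number; }
interface LiveState { activeTool?: string; }

// Pure function: same inputs always produce the same lines.
function renderAgentLines(
  agents: readonly AgentRow[],
  activity: ReadonlyMap<string, LiveState>,
  maxLines: number,
): string[] {
  const lines = agents.map((agent) => {
    const tool = activity.get(agent.id)?.activeTool;
    const suffix = tool ? ` [${tool}]` : "";
    return `${agent.description} (${agent.toolUses} tool uses)${suffix}`;
  });
  if (lines.length <= maxLines) {
    return lines;
  }
  // Overflow: keep the first maxLines - 1 rows and summarise the rest.
  const hidden = lines.length - (maxLines - 1);
  return [...lines.slice(0, maxLines - 1), `+${hidden} more`];
}
```

The widget would then only decide when to call the function and where to put its output.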
714
+
715
+ ### Step dependencies
716
+
717
+ ```mermaid
718
+ flowchart LR
719
+ subgraph observation["Observation track"]
720
+ L["L: Consolidate observation #144"] --> P["P: Split widget rendering #148"]
721
+ end
722
+ subgraph ctx["ctx elimination track"]
723
+ M["M: Decompose execute / push ctx #145"] --> N["N: Narrow UI context #146"]
724
+ end
725
+ O["O: Inject text wrapping #147"]
726
+ ```
727
+
728
+ The three tracks are independent of each other.
729
+
730
+ ### Projected impact
731
+
732
+ | Metric | Before | After |
733
+ | ---------------------------------- | ------------------------ | ------------------------ |
734
+ | `vi.mock()` calls remaining | 4 | 1 (`print-mode.test.ts`) |
735
+ | `as any` casts remaining | 45 | ~5 |
736
+ | Independent tool-use counters | 2 | 1 |
737
+ | `record.execution?.` traversals | 15+ | 0 |
738
+ | `ExtensionContext` in domain types | 1 (`AgentManager.spawn`) | 0 |
739
+ | `deps.` prefix accesses | ~124 | 0 |
601
740
 
602
741
  ---
603
742