npm - @gotgenes/pi-subagents - Versions diffs - 6.10.0 → 6.12.0 - Mend

@gotgenes/pi-subagents 6.10.0 → 6.12.0

Files changed (13)

package/CHANGELOG.md +34 -0
package/docs/architecture/architecture.md +21 -31
package/docs/plans/0133-inject-sdk-boundary-into-agent-runner.md +373 -0
package/docs/plans/0134-reduce-as-any-casts.md +366 -0
package/docs/retro/0132-inject-io-into-session-config.md +33 -0
package/docs/retro/0133-inject-sdk-boundary-into-agent-runner.md +45 -0
package/package.json +1 -1
package/src/agent-runner.ts +107 -35
package/src/index.ts +32 -3
package/src/runtime.ts +14 -2
package/src/tools/agent-tool.ts +1 -1
package/src/tools/helpers.ts +1 -1
package/src/ui/conversation-viewer.ts +38 -8

package/CHANGELOG.md CHANGED

@@ -5,6 +5,40 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [6.12.0](https://github.com/gotgenes/pi-packages/compare/pi-subagents-v6.11.0...pi-subagents-v6.12.0) (2026-05-22)
+### Features
+* narrow runtime widget field to WidgetLike interface ([#134](https://github.com/gotgenes/pi-packages/issues/134)) ([afa70ab](https://github.com/gotgenes/pi-packages/commit/afa70ab430109248a8f61ccd182b0f3acd1fa7e1))
+* use SDK types in CreateSessionOptions ([#134](https://github.com/gotgenes/pi-packages/issues/134)) ([c2452af](https://github.com/gotgenes/pi-packages/commit/c2452af0ee3d47d778878443a634ca787f8d0bfb))
+### Bug Fixes
+* replace message-shape as-any casts with type guards ([#134](https://github.com/gotgenes/pi-packages/issues/134)) ([d7ad65a](https://github.com/gotgenes/pi-packages/commit/d7ad65a61267790ae1ae8414b0c2aa9ebc8ad59c))
+### Documentation
+* plan as-any cast reduction in test suite ([#134](https://github.com/gotgenes/pi-packages/issues/134)) ([f7cb1aa](https://github.com/gotgenes/pi-packages/commit/f7cb1aac0963021ae0545b73c88f950a7adb5fd2))
+* **retro:** add retro notes for issue [#133](https://github.com/gotgenes/pi-packages/issues/133) ([be32640](https://github.com/gotgenes/pi-packages/commit/be32640048943059a98fc79797a35dfefd70fc34))
+* update architecture doc for Step I completion ([#134](https://github.com/gotgenes/pi-packages/issues/134)) ([fd4aca7](https://github.com/gotgenes/pi-packages/commit/fd4aca79c74da2b8c4e3c58e2376e0612941d7d9))
+## [6.11.0](https://github.com/gotgenes/pi-packages/compare/pi-subagents-v6.10.0...pi-subagents-v6.11.0) (2026-05-22)
+### Features
+* inject SDK boundary into agent-runner via RunnerIO ([#133](https://github.com/gotgenes/pi-packages/issues/133)) ([a9f6a9e](https://github.com/gotgenes/pi-packages/commit/a9f6a9e8c71e307b71600409e865fb539312f539))
+### Documentation
+* plan SDK boundary injection into agent-runner ([#133](https://github.com/gotgenes/pi-packages/issues/133)) ([1706ebc](https://github.com/gotgenes/pi-packages/commit/1706ebcc1452c6798dafb733ec8c68e6ee9e8512))
+* **retro:** add retro notes for issue [#132](https://github.com/gotgenes/pi-packages/issues/132) ([d0af140](https://github.com/gotgenes/pi-packages/commit/d0af1409ddc18099dfdda94ab37af2b99bc46c3c))
+* update architecture doc for Step H completion ([#133](https://github.com/gotgenes/pi-packages/issues/133)) ([f6b1258](https://github.com/gotgenes/pi-packages/commit/f6b1258f50a038df18ca1f33e3681c7bc258f4fc))
 ## [6.10.0](https://github.com/gotgenes/pi-packages/compare/pi-subagents-v6.9.4...pi-subagents-v6.10.0) (2026-05-22)

package/docs/architecture/architecture.md CHANGED

@@ -505,9 +505,8 @@ E2 (Type housekeeping) ── can start after A1, runs parallel to later steps
 Phase 7 eliminated all structural smells (mutable state, closure bags, callback threading, wide dependency bags).
 Phase 8 targets the next layer: testability friction, display module cohesion, and menu decomposition.
-The test suite (690 tests, 1.4:1 test-to-code ratio) is comprehensive but uneven in quality.
-`agent-runner.test.ts` accounts for 7 of 8 remaining `vi.mock()` calls and relies heavily on verifying internal call sequences rather than observable outputs.
-This fragility is a symptom of production code that imports IO-touching collaborators directly instead of receiving them through injection. (Step G resolved `session-config.test.ts`, which previously held 4 of the 12 total mocks.)
+The test suite (714 tests) is comprehensive but uneven in quality.
+Steps G and H have eliminated 11 of the original 12 `vi.mock()` calls in the runner tests, removing fragile call-sequence assertions in favour of injected stubs. (Step G resolved `session-config.test.ts`; Step H resolved both `agent-runner.test.ts` and `agent-runner-extension-tools.test.ts`.)
 The display and menu improvements were identified during Phase 7 but deferred because they don't gate encapsulation work.
 They are included here because the display extraction unblocks menu decomposition.
@@ -516,9 +515,9 @@ They are included here because the display extraction unblocks menu decompositio
 | Symptom                       | Location                                                | Root cause                                                        |
 | ----------------------------- | ------------------------------------------------------- | ----------------------------------------------------------------- |
-| 7 `vi.mock()` calls           | `agent-runner.test.ts`                                  | Runner imports prompts, memory, skills, env, session-dir directly |
-| 7 `vi.mock()` calls           | `agent-runner.test.ts`                                  | Runner imports prompts, memory, skills, env, session-dir directly |
-| 52 `as any` casts             | Across test suite                                       | SDK session/context interfaces too wide to construct in tests     |
+| ~~7 `vi.mock()` calls~~       | ~~`agent-runner.test.ts`~~                              | ~~Resolved by Step H (#133)~~                                     |
+| ~~7 `vi.mock()` calls~~       | ~~`agent-runner-extension-tools.test.ts`~~              | ~~Resolved by Step H (#133)~~                                     |
+| ~~52 `as any` casts~~         | ~~Across test suite~~                                   | ~~Reduced to 15 by Step I (#134)~~                                |
 | 3× duplicated `mockSession()` | agent-manager, record-observer, ui-observer tests       | No shared test fixture                                            |
 | 3× duplicated `makeDeps()`    | agent-tool, background-spawner, foreground-runner tests | No shared tool-deps fixture                                       |
 | Weak assertions               | lifecycle, renderer, session-config tests               | `toHaveBeenCalled()` without args, `toContain()` on large strings |
@@ -538,41 +537,32 @@ Impact: reduces test boilerplate; single source of truth for mock shapes; change
 ### Step G: Inject IO collaborators into session-config (#132) ✓ done
 `assembleSessionConfig` now accepts `io: AssemblerIO` as a required parameter.
-`agent-runner.ts` constructs the real `AssemblerIO` from direct imports and passes it through.
+`index.ts` constructs the real `AssemblerIO` from direct imports via the `RunnerIO.assemblerIO` field (wired in Step H).
 `session-config.test.ts` injects stubs — all 4 `vi.mock()` calls eliminated, assertions shifted to `SessionConfig` output properties.
-### Step H: Inject SDK boundary into agent-runner (#133)
+### Step H: Inject SDK boundary into agent-runner (#133) ✓ done
-`agent-runner.ts` has 7 module mocks because it imports `createAgentSession`, `DefaultResourceLoader`, `SessionManager`, and `SettingsManager` from the Pi SDK, plus `detectEnv`, `deriveSubagentSessionDir`, and `assembleSessionConfig` from sibling modules.
+`runAgent()` now accepts `io: RunnerIO` as a required parameter bundling all IO collaborators: `detectEnv`, `getAgentDir`, `createResourceLoader`, `deriveSessionDir`, `createSessionManager`, `createSettingsManager`, `createSession`, and `assemblerIO`.
-After Step G, `assembleSessionConfig` no longer needs mocking (its own IO is injected).
-The remaining SDK dependencies can be injected via a narrow `RunnerIO` interface:
+`createAgentRunner(io: RunnerIO): AgentRunner` factory captures the boundary at construction time so `AgentManager` and the `AgentRunner` interface remain unchanged.
+`index.ts` constructs the real `RunnerIO` from Pi SDK imports and sibling modules.
-```typescript
-export interface RunnerIO {
-  createSession: (opts: SessionOptions) => AgentSession;
-  createResourceLoader: (opts: ResourceLoaderOptions) => ResourceLoader;
-  createSessionManager: (cwd: string) => SessionManager;
-  detectEnv: (exec: ShellExec, cwd: string) => Promise<EnvInfo>;
-  deriveSessionDir: (parentFile: string) => string;
-}
-```
-The production call site in `agent-manager.ts` passes a `RunnerIO` built from the real SDK imports.
-Tests pass a stub `RunnerIO` without `vi.mock()`.
+Impact: all 7 `vi.mock()` calls eliminated from both `agent-runner.test.ts` and `agent-runner-extension-tools.test.ts`; tests verify behavior (turn limits, tool filtering, response collection) through injected stubs; SDK imports moved to the extension entry point.
-Impact: eliminates 5–7 `vi.mock()` calls in `agent-runner.test.ts`; tests verify behavior (turn limits, tool filtering, response collection) through injected fakes; refactoring internal structure no longer breaks tests.
+### Step I: Reduce `as any` casts in tests (#134) ✓ done
-### Step I: Reduce `as any` casts in tests (#134)
+Reduced `as any` count from 93 to 15 (plus 13 explicit `as unknown as T` bridge casts).
-With Steps G and H, many `as any` casts disappear because tests construct narrow injectable interfaces instead of wide SDK types.
-Remaining casts are addressed by:
+Production changes:
-1. Defining a `TestSession` type in `test/helpers/` that satisfies `SubscribableSession` + the fields tests actually read.
-2. Replacing `const mockCtx = { cwd: "/tmp" } as any` with properly typed `AssemblerContext` or `ParentSnapshot` objects.
-3. Using `satisfies` assertions where possible instead of `as any`.
+- `ResourceLoaderOptions.appendSystemPromptOverride` typed to match `DefaultResourceLoaderOptions`; `createResourceLoader` factory cast removed from `index.ts`.
+- `CreateSessionOptions.settingsManager` / `RunnerIO.createSettingsManager` typed as `SettingsManager`.
+- `WidgetLike` interface in `runtime.ts` narrows the widget field.
+- Local `ToolCallContent` / `BashExecutionMessage` type guards replace `as any` duck-typing in `conversation-viewer.ts` and `agent-runner.ts`.
+- `textResult()` return no longer casts `details as any`.
+- `toAgentSession()` helper and `STUB_CTX` constant centralise unavoidable bridge casts.
-Target: reduce `as any` count from 52 to under 10.
+Remaining 15 `as any` casts are: 8 menu-handler `ctx as any` (deferred — requires `AgentManager.spawn` to accept `ParentSnapshot` directly), 2 `print-mode.test.ts` (same ExtensionContext/API pattern), 2 private-field test access, 1 `createSession` SDK bridge in `index.ts`, 1 `foreground-runner.ts` `AgentToolResult<any>` detail, 1 `stub-ctx.ts` comment.
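The type-guard change listed under Step I can be illustrated with a minimal sketch. The shape `ToolCallContentSketch` and the helper `toolNames` are hypothetical stand-ins; the package's actual `ToolCallContent` guard may check different fields.

```typescript
// Hypothetical content shape; the real ToolCallContent in the package may differ.
interface ToolCallContentSketch {
  type: "toolCall";
  name: string;
}

// A user-defined type guard: narrows `unknown` without resorting to `as any`.
function isToolCallContent(value: unknown): value is ToolCallContentSketch {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return candidate["type"] === "toolCall" && typeof candidate["name"] === "string";
}

// Hypothetical consumer: after filtering, `name` is accessible with full typing.
function toolNames(content: unknown[]): string[] {
  return content.filter(isToolCallContent).map((c) => c.name);
}

const sampleNames = toolNames([
  { type: "text", text: "hello" },
  { type: "toolCall", name: "bash" },
  null,
  "not an object",
]);
```

The guard still uses one cast, to `Record<string, unknown>`, but every property read afterwards is checked at runtime before the narrowed type is trusted.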
 ### Step J: Extract display helpers (#135)

package/docs/plans/0133-inject-sdk-boundary-into-agent-runner.md ADDED

@@ -0,0 +1,373 @@
+---
+issue: 133
+issue_title: "Inject SDK boundary into `agent-runner`"
+---
+# Inject SDK boundary into agent-runner
+## Problem Statement
+`agent-runner.ts` directly imports five Pi SDK symbols (`createAgentSession`, `DefaultResourceLoader`, `getAgentDir`, `SessionManager`, `SettingsManager`) and two sibling modules (`detectEnv`, `deriveSubagentSessionDir`).
+It also imports four functions (`preloadSkills`, `buildMemoryBlock`, `buildReadOnlyMemoryBlock`, `buildAgentPrompt`) solely to construct the `AssemblerIO` object introduced in #132.
+This forces `agent-runner.test.ts` to use 7 `vi.mock()` calls, a `vi.hoisted()` block with 5+ mock factories, and a `beforeEach` that manually resets 6+ mocks.
+Tests verify internal call patterns ("defaultResourceLoaderCtor was called with `noContextFiles: true`") rather than behavioral outcomes, making any internal restructuring break multiple tests without changing observable behavior.
+The same 7-mock pattern is duplicated in `agent-runner-extension-tools.test.ts`.
+## Goals
+- Define a `RunnerIO` interface bundling all SDK and IO collaborators used by `runAgent()`.
+- Add `io: RunnerIO` as a parameter to `runAgent()`.
+- Provide a `createAgentRunner(io: RunnerIO): AgentRunner` factory so the `AgentRunner` interface and `AgentManager` remain unchanged.
+- Replace direct SDK and sibling-module imports in `runAgent()` with calls through `io`.
+- Update the wiring in `index.ts` to construct a real `RunnerIO` and use `createAgentRunner()`.
+- Eliminate all 7 `vi.mock()` calls in `agent-runner.test.ts`.
+- Eliminate all 7 `vi.mock()` calls in `agent-runner-extension-tools.test.ts`.
+- Shift test assertions toward behavioral outcomes (turn limits enforced, tool filtering correct, response text collected).
+## Non-Goals
+- Changing `resumeAgent` — it receives an already-created `AgentSession` and has no SDK/IO deps to inject.
+- Injecting `assembleSessionConfig` itself — the function is pure (after #132) and stays as a direct import; only its `AssemblerIO` collaborators move into `RunnerIO`.
+- Injecting `getMemoryToolNames` / `getReadOnlyMemoryToolNames` — these are pure utility functions with no IO; they remain as direct imports in `session-config.ts`.
+- Refactoring `filterActiveTools` or the turn-limit logic — out of scope.
+- Consolidating shared test fixtures (#131) — independent work.
+## Background
+### Prerequisite
+Issue #132 (inject IO into session-config) is closed.
+`assembleSessionConfig` now receives an `AssemblerIO` parameter and no longer imports IO functions directly.
+However, `agent-runner.ts` still imports those four functions to construct the `AssemblerIO` object, and the SDK factories remain as direct imports.
+### Current vi.mock inventory in agent-runner.test.ts
+| #   | Module                            | Symbols mocked                                                                                    | Why mocked                            |
+| --- | --------------------------------- | ------------------------------------------------------------------------------------------------- | ------------------------------------- |
+| 1   | `@earendil-works/pi-coding-agent` | `createAgentSession`, `DefaultResourceLoader`, `getAgentDir`, `SessionManager`, `SettingsManager` | SDK constructors and factories        |
+| 2   | `../src/agent-types.js`           | `getMemoryToolNames`, `getReadOnlyMemoryToolNames`                                                | Pure functions used by session-config |
+| 3   | `../src/env.js`                   | `detectEnv`                                                                                       | Async IO (shell exec)                 |
+| 4   | `../src/prompts.js`               | `buildAgentPrompt`                                                                                | Relayed to AssemblerIO                |
+| 5   | `../src/memory.js`                | `buildMemoryBlock`, `buildReadOnlyMemoryBlock`                                                    | Relayed to AssemblerIO                |
+| 6   | `../src/skill-loader.js`          | `preloadSkills`                                                                                   | Relayed to AssemblerIO                |
+| 7   | `../src/session-dir.js`           | `deriveSubagentSessionDir`                                                                        | Path derivation                       |
+`agent-runner-extension-tools.test.ts` has an identical set.
+### Established DI patterns
+- `AgentManager` already receives `AgentRunner` via constructor injection — the same boundary this issue pushes down one layer.
+- `AssemblerIO` (#132) bundles four IO collaborators into a single injectable interface.
+- `AgentManagerLike` in `service-adapter.ts` defines a narrow interface for the concrete `AgentManager` class, avoiding coupling to the concrete type.
+### Architecture reference
+Phase 8, Step H in `docs/architecture/architecture.md`.
+### Constraints from AGENTS.md
+- Keep scope tight; prefer small, reversible changes.
+- Prefer explicit configuration over hidden behavior.
+- Business logic should be pure functions — keep IO at the edges.
+- Keep Pi SDK imports out of business-logic modules.
+## Design Overview
+### `RunnerIO` interface
+Defined in `agent-runner.ts` alongside the existing runner types.
+Bundles all IO dependencies that `runAgent()` uses:
+```typescript
+/** Minimal resource-loader contract used by the runner. */
+export interface ResourceLoaderLike {
+  reload(): Promise<void>;
+}
+/** Minimal session-manager contract used by the runner. */
+export interface SessionManagerLike {
+  newSession(opts: { parentSession?: string }): void;
+  getSessionFile(): string | undefined;
+}
+/** Options passed to RunnerIO.createResourceLoader. */
+export interface ResourceLoaderOptions {
+  cwd: string;
+  agentDir: string;
+  noExtensions?: boolean;
+  noSkills?: boolean;
+  noPromptTemplates?: boolean;
+  noThemes?: boolean;
+  noContextFiles?: boolean;
+  systemPromptOverride?: () => string;
+  appendSystemPromptOverride?: () => unknown[];
+}
+/** Options passed to RunnerIO.createSession. */
+export interface CreateSessionOptions {
+  cwd: string;
+  agentDir: string;
+  sessionManager: SessionManagerLike;
+  settingsManager: unknown;
+  modelRegistry: unknown;
+  model?: unknown;
+  tools: string[];
+  resourceLoader: ResourceLoaderLike;
+  thinkingLevel?: ThinkingLevel;
+}
+/**
+ * IO boundary injected into runAgent().
+ *
+ * Decouples the runner from direct Pi SDK imports and sibling-module IO,
+ * making it testable via plain stub objects without vi.mock().
+ */
+export interface RunnerIO {
+  detectEnv: (exec: ShellExec, cwd: string) => Promise<EnvInfo>;
+  getAgentDir: () => string;
+  createResourceLoader: (opts: ResourceLoaderOptions) => ResourceLoaderLike;
+  deriveSessionDir: (
+    parentSessionFile: string | undefined,
+    effectiveCwd: string,
+  ) => string;
+  createSessionManager: (
+    cwd: string,
+    sessionDir: string,
+  ) => SessionManagerLike;
+  createSettingsManager: (cwd: string, agentDir: string) => unknown;
+  createSession: (
+    opts: CreateSessionOptions,
+  ) => Promise<{ session: AgentSession }>;
+  assemblerIO: AssemblerIO;
+}
+```
+The interface has 8 fields (7 functions + 1 nested `AssemblerIO`).
+All 8 are consumed by `runAgent()` — no field is relayed without use.
+### `createAgentRunner` factory
+```typescript
+export function createAgentRunner(io: RunnerIO): AgentRunner {
+  return {
+    run: (snapshot, type, prompt, options) =>
+      runAgent(snapshot, type, prompt, options, io),
+    resume: resumeAgent,
+  };
+}
+```
+This keeps the `AgentRunner` interface unchanged.
+`AgentManager` continues to receive an `AgentRunner` — it never sees `RunnerIO`.
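A self-contained sketch of the same construction-time binding, reduced to one IO field, may help show why the manager never needs to see the IO object. All names ending in `Sketch` are hypothetical simplifications, not the package's real types.

```typescript
// Simplified stand-in for RunnerIO with a single collaborator.
interface RunnerIOSketch {
  getAgentDir: () => string;
}

// Simplified stand-in for the AgentRunner interface: no IO in its signature.
interface AgentRunnerSketch {
  run: (prompt: string) => string;
}

// The runner body receives its IO explicitly as the last parameter.
function runAgentSketch(prompt: string, io: RunnerIOSketch): string {
  return `${io.getAgentDir()}: ${prompt}`;
}

// The factory captures `io` in a closure; callers only see the narrow interface.
function createAgentRunnerSketch(io: RunnerIOSketch): AgentRunnerSketch {
  return {
    run: (prompt) => runAgentSketch(prompt, io),
  };
}

// Two runners with the same interface but different IO bound at construction.
const productionLike = createAgentRunnerSketch({
  getAgentDir: () => "/real/agent-dir",
});
const testLike = createAgentRunnerSketch({
  getAgentDir: () => "/mock/agent-dir",
});
```

Swapping the IO object changes behaviour without touching the consumer of `AgentRunnerSketch`, which is the property the plan relies on to leave `AgentManager` unchanged.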
+### Call site in `index.ts`
+```typescript
+import {
+  createAgentSession,
+  DefaultResourceLoader,
+  getAgentDir,
+  SessionManager,
+  SettingsManager,
+} from "@earendil-works/pi-coding-agent";
+import { detectEnv } from "./env.js";
+import { buildMemoryBlock, buildReadOnlyMemoryBlock } from "./memory.js";
+import { buildAgentPrompt } from "./prompts.js";
+import { deriveSubagentSessionDir } from "./session-dir.js";
+import { preloadSkills } from "./skill-loader.js";
+const runnerIO: RunnerIO = {
+  detectEnv,
+  getAgentDir,
+  createResourceLoader: (opts) => new DefaultResourceLoader(opts),
+  deriveSessionDir: deriveSubagentSessionDir,
+  createSessionManager: (cwd, dir) => SessionManager.create(cwd, dir),
+  createSettingsManager: (cwd, dir) => SettingsManager.create(cwd, dir),
+  createSession: createAgentSession,
+  assemblerIO: {
+    preloadSkills,
+    buildMemoryBlock,
+    buildReadOnlyMemoryBlock,
+    buildAgentPrompt,
+  },
+};
+const manager = new AgentManager({
+  runner: createAgentRunner(runnerIO),
+  // ... rest unchanged
+});
+```
+SDK and IO imports move from `agent-runner.ts` to `index.ts` — the extension entry point, which is the natural IO edge.
+### Test-side stubs
+Tests create a plain `RunnerIO` object with `vi.fn()` stubs:
+```typescript
+function createRunnerIO(): RunnerIO {
+  return {
+    detectEnv: vi.fn(async () => ({
+      isGitRepo: false,
+      branch: "",
+      platform: "linux",
+    })),
+    getAgentDir: vi.fn(() => "/mock/agent-dir"),
+    createResourceLoader: vi.fn(() => ({ reload: vi.fn() })),
+    deriveSessionDir: vi.fn(() => "/mock/session-dir/tasks"),
+    createSessionManager: vi.fn(() => ({
+      newSession: vi.fn(),
+      getSessionFile: vi.fn(() => "/sessions/child.jsonl"),
+    })),
+    createSettingsManager: vi.fn(() => ({ kind: "settings-manager" })),
+    createSession: vi.fn(),
+    assemblerIO: {
+      preloadSkills: vi.fn(() => []),
+      buildMemoryBlock: vi.fn(() => ""),
+      buildReadOnlyMemoryBlock: vi.fn(() => ""),
+      buildAgentPrompt: vi.fn(() => "system prompt"),
+    },
+  };
+}
+```
+This replaces all 7 `vi.mock()` calls, the `vi.hoisted()` block, and most of the `beforeEach` resets.
+Each test calls `runAgent(snapshot, type, prompt, options, io)` directly with a stub `io`.
+### Interaction verification — consumer call site (Tell-Don't-Ask check)
+```typescript
+// In index.ts — the consumer constructs RunnerIO and hands it off:
+const runnerIO: RunnerIO = { detectEnv, getAgentDir, ... };
+const manager = new AgentManager({
+  runner: createAgentRunner(runnerIO),
+});
+// AgentManager calls runner.run(...) — never reaches through to runnerIO.
+// Tell-Don't-Ask: ✓  Manager tells runner to run; runner uses its own IO.
+```
+### Pure functions stay as direct imports
+`assembleSessionConfig` (pure after #132), `filterActiveTools` (module-private), `normalizeMaxTurns` (pure exported), `collectResponseText`, `getLastAssistantText`, and `forwardAbortSignal` remain as direct code — they have no IO dependencies.
+`getMemoryToolNames` / `getReadOnlyMemoryToolNames` in `session-config.ts` remain as direct imports (pure, no IO).
+The `vi.mock("../src/agent-types.js", ...)` in both test files can be removed because the mock agent config has no `memory` field, so the memory branch in `assembleSessionConfig` is never entered and those functions are never called.
+## Module-Level Changes
+### Modified files
+1. `src/agent-runner.ts`
+   - Add `RunnerIO`, `ResourceLoaderLike`, `SessionManagerLike`, `ResourceLoaderOptions`, `CreateSessionOptions` interface exports.
+   - Add `createAgentRunner(io: RunnerIO): AgentRunner` factory export.
+   - Add `io: RunnerIO` parameter to `runAgent()`.
+   - Replace `detectEnv(...)` with `io.detectEnv(...)`.
+   - Replace `getAgentDir()` with `io.getAgentDir()`.
+   - Replace `new DefaultResourceLoader(...)` with `io.createResourceLoader(...)`.
+   - Replace `deriveSubagentSessionDir(...)` with `io.deriveSessionDir(...)`.
+   - Replace `SessionManager.create(...)` with `io.createSessionManager(...)`.
+   - Replace `SettingsManager.create(...)` with `io.createSettingsManager(...)`.
+   - Replace `createAgentSession(...)` with `io.createSession(...)`.
+   - Replace inline `AssemblerIO` construction with `io.assemblerIO`.
+   - Remove imports: `createAgentSession`, `DefaultResourceLoader`, `getAgentDir`, `SessionManager`, `SettingsManager` from `@earendil-works/pi-coding-agent`; `detectEnv` from `./env.js`; `deriveSubagentSessionDir` from `./session-dir.js`; `preloadSkills` from `./skill-loader.js`; `buildMemoryBlock`, `buildReadOnlyMemoryBlock` from `./memory.js`; `buildAgentPrompt` from `./prompts.js`.
+   - Keep imports: `type AgentSession`, `type AgentSessionEvent` from SDK (used in function signatures and event handling); `type AssemblerIO` from `./session-config.js`; `assembleSessionConfig` from `./session-config.js`; `extractText` from `./context.js`.
+2. `src/index.ts`
+   - Add imports: `detectEnv` from `./env.js`; `deriveSubagentSessionDir` from `./session-dir.js`; `preloadSkills` from `./skill-loader.js`; `buildMemoryBlock`, `buildReadOnlyMemoryBlock` from `./memory.js`; `buildAgentPrompt` from `./prompts.js`.
+   - Add import: `createAgentRunner`, `type RunnerIO` from `./agent-runner.js`.
+   - Remove import: `runAgent` from `./agent-runner.js` (replaced by factory).
+   - Construct `runnerIO` object from real implementations.
+   - Replace `runner: { run: runAgent, resume: resumeAgent }` with `runner: createAgentRunner(runnerIO)`.
+3. `test/agent-runner.test.ts`
+   - Remove all 7 `vi.mock()` calls and the `vi.hoisted()` block.
+   - Add `createRunnerIO()` factory function returning a stub `RunnerIO`.
+   - Pass `io` to all `runAgent()` calls.
+   - Simplify `beforeEach` to reset `io.createSession` (the only mock that needs per-test setup).
+   - Remove `mockAgentLookup.resolveAgentConfig` and `mockAgentLookup.getToolNamesForType` resets that are now unnecessary.
+   - Update assertions that verify SDK constructor arguments (e.g., `defaultResourceLoaderCtor` calls) to verify `io.createResourceLoader` calls instead.
+   - Remove the `agent-types.js` mock — pure functions run against controlled inputs.
+4. `test/agent-runner-extension-tools.test.ts`
+   - Same structural changes as `agent-runner.test.ts`: remove all 7 `vi.mock()` calls, inject `RunnerIO` stubs.
+   - Keep the `createSessionWithExtensionToolRegistration` helper — it creates mock sessions for testing post-bind tool filtering, which is behavioral.
+   - Update assertions to use `io.createResourceLoader` / `io.createSession` stubs.
+### Unchanged files
+- `src/agent-manager.ts` — receives `AgentRunner` via injection; unaffected by `RunnerIO`.
+- `test/agent-manager.test.ts` — already injects a mock `AgentRunner`; unaffected.
+- `src/session-config.ts` — pure function, already receives `AssemblerIO`; unaffected.
+- `test/session-config.test.ts` — tests the pure assembler directly; unaffected.
+- `test/agent-runner-settings.test.ts` — tests `normalizeMaxTurns` (pure, no mocks); unaffected.
+- `test/print-mode.test.ts` — mocks `runAgent` itself at the module level; unaffected (it tests `index.ts` notification wiring, not the runner internals).
+## Test Impact Analysis
+1. The `RunnerIO` injection enables testing `runAgent` without any module mocking.
+   Tests create plain stub objects satisfying `RunnerIO` — no `vi.mock()`, no `vi.hoisted()`, no module-level mock variable management.
+   This was previously impossible because `runAgent` hard-imported SDK constructors.
+2. Several existing tests that verify mock constructor arguments become redundant or shift to verifying `io.*` stub calls:
+   - "passes effective cwd and agentDir to the loader and settings manager" → verifies `io.createResourceLoader` and `io.createSettingsManager` were called with expected args (simpler, no `defaultResourceLoaderCtor` indirection).
+   - "suppresses AGENTS.md/CLAUDE.md/APPEND_SYSTEM.md for subagents" → verifies `io.createResourceLoader` was called with `noContextFiles: true` and an `appendSystemPromptOverride` that returns `[]`.
+3. Tests for turn-limit enforcement, abort forwarding, and response-text collection stay as-is — they already test behavioral outcomes through the mock session, not through SDK mock call patterns.
+4. The extension-tools tests (Patch 2) remain behavioral — they verify `setActiveToolsByName` calls before/after `bindExtensions`.
+   The only change is how the session is created (via `io.createSession` stub instead of a module mock).
+5. The `agent-types.js` mock can be removed from both test files because the mock agent configs have no `memory` field, so the code path through `getMemoryToolNames` / `getReadOnlyMemoryToolNames` is never reached.
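The shift described in item 2 can be sketched without any module mocking. The example below uses a hand-rolled recording stub in place of `vi.fn()`, and `buildLoaderSketch` is a hypothetical stand-in for the part of `runAgent()` that constructs the resource loader; only the option names are taken from the plan.

```typescript
// Subset of ResourceLoaderOptions from the plan.
interface LoaderOptionsSketch {
  cwd: string;
  agentDir: string;
  noContextFiles?: boolean;
  appendSystemPromptOverride?: () => unknown[];
}

// Subset of RunnerIO covering only the loader path.
interface LoaderIOSketch {
  getAgentDir: () => string;
  createResourceLoader: (opts: LoaderOptionsSketch) => {
    reload(): Promise<void>;
  };
}

// Hypothetical stand-in for the loader-building step inside runAgent().
function buildLoaderSketch(cwd: string, io: LoaderIOSketch) {
  return io.createResourceLoader({
    cwd,
    agentDir: io.getAgentDir(),
    noContextFiles: true, // suppress context files for subagents
    appendSystemPromptOverride: () => [],
  });
}

// Recording stub: a plain object, no vi.mock() or vi.hoisted() required.
const recorded: LoaderOptionsSketch[] = [];
const stubIO: LoaderIOSketch = {
  getAgentDir: () => "/mock/agent-dir",
  createResourceLoader: (opts) => {
    recorded.push(opts);
    return { reload: async () => {} };
  },
};

buildLoaderSketch("/work", stubIO);
```

The assertion target becomes the options object handed to the stub, so the test keeps checking the configuration decision (`noContextFiles: true`, an empty append override) while staying indifferent to which SDK class is constructed behind it.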
+## TDD Order
+1. **Define `RunnerIO` and `createAgentRunner`; inject IO into `runAgent`.**
+   Add the `RunnerIO`, `ResourceLoaderLike`, `SessionManagerLike`, `ResourceLoaderOptions`, and `CreateSessionOptions` interfaces to `agent-runner.ts`.
+   Add `io: RunnerIO` parameter to `runAgent()`.
+   Add `createAgentRunner(io)` factory export.
+   Replace all direct SDK and IO imports with `io.*` calls inside `runAgent()`.
+   Remove the now-unused direct imports.
+   Update `index.ts` to construct `runnerIO` from real implementations and use `createAgentRunner(runnerIO)`.
+   Run `pnpm run check` to verify types compile.
+   Commit: `feat: inject SDK boundary into agent-runner via RunnerIO (#133)`
+2. **Migrate `agent-runner.test.ts` to use injected `RunnerIO` stubs.**
+   Add `createRunnerIO()` helper returning a fully-stubbed `RunnerIO`.
+   Pass `io` to all `runAgent()` calls.
+   Remove all 7 `vi.mock()` calls and the `vi.hoisted()` block.
+   Simplify `beforeEach` to reset only `io.createSession`.
+   Update assertions that referenced hoisted mocks (e.g., `defaultResourceLoaderCtor`, `sessionManagerCreate`, `settingsManagerCreate`, `getAgentDir`) to reference `io.*` stubs.
+   Remove the `mockAgentLookup` mock resets that are now unnecessary.
+   All existing tests pass with equivalent assertions.
+   Commit: `test: replace vi.mock with RunnerIO stubs in agent-runner tests (#133)`
+3. **Migrate `agent-runner-extension-tools.test.ts` to use injected `RunnerIO` stubs.**
+   Same structural changes as step 2: remove all 7 `vi.mock()` calls, inject `RunnerIO` stubs.
+   Keep `createSessionWithExtensionToolRegistration` helper (tests tool filtering behavior).
+   Simplify `beforeEach` and update stub references.
+   Commit: `test: replace vi.mock with RunnerIO stubs in extension-tools tests (#133)`
+4. **Shift constructor-argument assertions to behavioral checks.**
+   In `agent-runner.test.ts`, update tests that verify internal SDK call arguments:
+   - Replace `expect(defaultResourceLoaderCtor).toHaveBeenCalledWith(expect.objectContaining({...}))` with `expect(io.createResourceLoader).toHaveBeenCalledWith(expect.objectContaining({...}))`.
+   - Where the assertion only verified plumbing (e.g., "settings manager gets the right cwd"), simplify to a behavioral assertion or remove if covered by other tests.
+   - Keep assertions that verify meaningful configuration decisions (e.g., `noContextFiles: true`, `appendSystemPromptOverride` returns `[]`).
+   Run full test suite.
+   Commit: `test: shift agent-runner assertions toward behavioral checks (#133)`
+## Risks and Mitigations
+| Risk                                                                                                    | Mitigation                                                                                                                                                                                                            |
+| ------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `RunnerIO` at 8 fields may seem wide                                                                    | All 8 are consumed by the single consumer (`runAgent`). No field is relayed without use. The interface represents a genuine IO boundary — further narrowing would require splitting `runAgent` itself (out of scope). |
+| Removing the `agent-types.js` mock could cause failures if a test unexpectedly enters the memory branch | The mock agent config has no `memory` field (`undefined`), so the memory branch is guarded by `if (agentConfig.memory)`. Verified by reading the test's `resolveAgentConfig` mock return value.                       |
+| `index.ts` accumulates many new imports                                                                 | The imports move from `agent-runner.ts` to `index.ts` — the extension entry point is the natural IO edge. The total import count across the two files is unchanged.                                                   |
+| `createAgentRunner` factory adds indirection                                                            | The factory is a one-liner that captures `io` in a closure. The `AgentRunner` interface and `AgentManager` are completely unchanged. No new abstraction layer — just a construction-time binding.                     |
+| Steps 2–3 touch many call sites in two test files (add `, io` argument)                                 | All changes are mechanical. Each `runAgent(snapshot, type, prompt, {...})` becomes `runAgent(snapshot, type, prompt, {...}, io)`. A single find-and-replace handles it.                                               |
+| `print-mode.test.ts` mocks `runAgent` at the module level — does the new `io` parameter break it?       | `print-mode.test.ts` mocks the entire `runAgent` export with `vi.mock("../src/agent-runner.js", ...)`. The mock replaces the function entirely, so the new parameter has no effect on that test.                      |
+## Open Questions
+- Should `RunnerIO` live in `agent-runner.ts` or be extracted to a separate types file?
+  The interface is tightly coupled to `runAgent()` — co-location follows the `AssemblerIO` precedent in `session-config.ts`.
+  Extract only if a second consumer appears.