npm - @gotgenes/pi-subagents - Versions diffs - 15.0.2 → 16.1.0 - Mend

@gotgenes/pi-subagents 15.0.2 → 16.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (14)

package/CHANGELOG.md +23 -0
package/README.md +24 -24
package/docs/architecture/architecture.md +111 -18
package/docs/plans/0381-replace-concurrency-queue-with-limiter.md +267 -0
package/docs/plans/0400-include-parent-prompt-in-replace-mode.md +199 -0
package/docs/retro/0381-replace-concurrency-queue-with-limiter.md +49 -0
package/docs/retro/0400-include-parent-prompt-in-replace-mode.md +84 -0
package/package.json +1 -1
package/src/index.ts +8 -15
package/src/lifecycle/concurrency-limiter.ts +55 -0
package/src/lifecycle/subagent-manager.ts +38 -35
package/src/lifecycle/subagent.ts +2 -1
package/src/session/prompts.ts +25 -20
package/src/lifecycle/concurrency-queue.ts +0 -63

package/docs/plans/0381-replace-concurrency-queue-with-limiter.md ADDED

@@ -0,0 +1,267 @@
+---
+issue: 381
+issue_title: "Replace ConcurrencyQueue with a thunk-based ConcurrencyLimiter"
+---
+# Replace ConcurrencyQueue with a thunk-based ConcurrencyLimiter
+## Problem Statement
+The `ConcurrencyQueue` stores background-agent IDs and decides *when* to start them, but it cannot start an agent itself.
+It compensates with a `startAgent(id)` callback that reaches back into the manager (`getRecord(id)`, status check, `run()`) — a dependency back-edge that forces forward-referenced bindings in both `index.ts` and the manager test helper.
+The queue also keeps its own `running` counter, fed by `markStarted`/`markFinished` relays in the manager's observer, duplicating state the agents already carry.
+A queued agent has `promise === undefined` until the queue starts it, which is the direct cause of `waitForAll`'s `while (true)` drain loop and its `eslint-disable`.
+These are three symptoms of one root cause: the queue schedules *identifiers it cannot act on* instead of *work it can run*.
+Scheduling thunks (`() => Promise<void>`) instead of IDs dissolves all three at the source.
+This is Phase 17 Step 1 (core consolidation), recorded in `docs/architecture/architecture.md` under "Improvement roadmap (Phase 17 — core consolidation)".
+It unblocks Phase 17 Step 3 ([#374], run-start encapsulation).
+## Goals
+- Replace `ConcurrencyQueue` (ID registry + back-edge callback) with a `ConcurrencyLimiter` that schedules run closures FIFO against a dynamic limit and knows nothing about agents, IDs, or the manager.
+- Make the dependency direction strictly `SubagentManager → ConcurrencyLimiter`: no callback back-edge, no forward-referenced bindings.
+- Derive the active count from the limiter's own task lifecycle (increment on task start, decrement on settle); delete the observer's `markStarted`/`markFinished` relays.
+- Give every spawned agent a real `promise` at spawn time, collapsing `waitForAll`'s `while (true)` drain loop and its `eslint-disable`.
+- This is a non-breaking internal refactor: the FIFO admission behavior against `maxConcurrent` is preserved, and no public API, config key, or observable behavior changes.
+## Non-Goals
+- Renaming the `bypassQueue` spawn option.
+  It is part of the published `SubagentsService` type surface (`src/service/service.ts`), so renaming it would churn the type bundle and break consumers — out of scope; track in Open Questions.
+- Folding the queued-status guard into `Subagent.start()` — that is Phase 17 Step 3 ([#374]).
+  This plan keeps the guard inside the scheduled thunk.
+- Extracting `SubagentState` or making execution deps mandatory ([#373], Step 2).
+- Any change to foreground execution (`spawnAndWait`) or to `bypassQueue` runs — both continue to invoke `record.run()` directly, never touching the limiter.
+- Touching `src/service/service.ts` or `src/service/service-adapter.ts` — `bypassQueue` flows through unchanged.
+## Background
+Relevant modules:
+- `src/lifecycle/concurrency-queue.ts` — the current `ConcurrencyQueue`: `isFull`, `enqueue`, `dequeue`, `markStarted`, `markFinished`, `drain`, `clear`, `queuedIds`.
+  Stores IDs; `drain()` calls the injected `startAgent(id)` back-edge.
+- `src/lifecycle/subagent-manager.ts` — injects the queue via `SubagentManagerOptions.queue`.
+  `buildObserver` relays `markStarted`/`markFinished`; `spawn` enqueues when `isFull()`; `abort` calls `dequeue`; `abortAll` iterates `queuedIds` + `clear()`; `waitForAll` loops `drain()` + `Promise.allSettled`; `dispose` calls `clear()`.
+- `src/index.ts` — constructs the queue with a `startAgent` callback that forward-references the manager (`manager.getRecord(id)` then `agent.run()`); wires `settings.onMaxConcurrentChanged` to `queue.drain()`.
+- `src/lifecycle/subagent.ts` — `run()` sets status to `running` synchronously (`markRunning`) before its first `await`; `run()` always resolves (errors captured internally).
+  `abort()` acts only on `running` agents; its docstring references `ConcurrencyQueue.dequeue()`.
+- `test/lifecycle/subagent-manager.test.ts` — `createManager` helper replicates the `index.ts` start callback with a `prefer-const` `eslint-disable` for the forward reference.
+- `test/lifecycle/concurrency-queue.test.ts` — unit tests for the queue (drain ordering, `markStarted`/`markFinished` counting, `enqueue`/`dequeue`).
+Constraints from AGENTS.md and skills:
+- ES2024 `Promise.withResolvers` is available and preferred (`code-design` skill).
+- The `bypassQueue` field lives in the public type bundle (`exports`, `verify:public-types`); renaming public surface is breaking (`package-pi-subagents` skill).
+- `@typescript-eslint/require-await` is enabled for `src/`; a thunk with no `await` must return a `Promise` without `async`.
+- Where the old `drain()` used `while (… && !isFull())` with `this.queue.shift()!`, prefer a bounded loop without a non-null assertion (`code-design` Biome/ESLint notes).
+Reading the current observer-relay path (`buildObserver` → `queue.markStarted`/`markFinished`) confirms that the queue's `running` counter mirrors the per-agent status the manager already tracks (the manager filters on `status === "running"` or `status === "queued"` in `cleanup`, `clearCompleted`, `hasRunning`, `waitForAll`).
+No production caller awaits a *queued* agent's promise (`get-result-tool.ts` guards on `status === "running"`; `spawnAndWait` is foreground; `waitForAll` filters by status), so giving queued agents a settled-on-completion promise is safe.
+## Design Overview
+### `ConcurrencyLimiter`
+A pure FIFO scheduler over thunks.
+It owns the active count and the pending queue; it has no knowledge of agents, IDs, or the manager.
+```typescript
+export class ConcurrencyLimiter {
+	private active = 0;
+	private readonly pending: Array<{ start: () => void; settle: () => void }> = [];
+	constructor(private readonly getLimit: () => number) {}
+	/**
+	 * Schedule a task to run FIFO once a slot is free.
+	 * The returned promise always settles: it follows the task's settlement when
+	 * the task runs, or resolves early if clear() drops it before it starts.
+	 */
+	schedule(task: () => Promise<void>): Promise<void> {
+		const { promise, resolve, reject } = Promise.withResolvers<void>();
+		this.pending.push({
+			start: () => {
+				this.active++;
+				task().then(resolve, reject).finally(() => {
+					this.active--;
+					this.recheck();
+				});
+			},
+			settle: resolve,
+		});
+		this.recheck();
+		return promise;
+	}
+	/** Start pending tasks until the limit is reached. Call when the limit may have grown. */
+	recheck(): void {
+		while (this.active < this.getLimit()) {
+			const next = this.pending.shift();
+			if (!next) break;
+			next.start();
+		}
+	}
+	/** Drop all pending tasks, resolving their promises without running them. */
+	clear(): void {
+		const dropped = this.pending.splice(0);
+		for (const task of dropped) task.settle();
+	}
+}
+```
+Design decisions:
+- **Active count derived from task lifecycle.**
+  `active++` happens synchronously inside `start()` before the task's first `await`; `active--` runs in `finally`.
+  This replaces the queue's `running` counter and the two observer relays.
+- **`recheck()` is bounded.**
+  The loop terminates when the limit is reached or the pending queue empties — no `while (true)`, no `this.pending.shift()!` non-null assertion.
+- **`clear()` settles dropped promises.**
+  Every `schedule()` promise becomes `record.promise`; the contract is that it always settles.
+  Dropping a thunk without resolving would leave a forever-pending `record.promise`.
+  `clear()` resolves dropped tasks so `dispose()`/`abortAll()` cannot strand a promise. (This is a few lines beyond the issue's "~40 lines" sketch; the extra `settle` handle is the deliberate cost of that invariant.)
+- **Synchronous start.**
+  When a slot is free, `schedule()` runs the thunk synchronously inside `recheck()`, so `record.run()` executes its synchronous prefix (`markRunning`) immediately — preserving today's behavior where `record.promise = record.run()` flips status to `running` at once.
+### Manager spawn call site
+```typescript
+// spawn(), background and not bypassQueue:
+record.promise = this.limiter.schedule(() => {
+	// Guard: an abort-while-queued task is a no-op (Step 3 folds this into Subagent.start()).
+	if (record.status !== "queued") return Promise.resolve();
+	return record.run();
+});
+// foreground or bypassQueue:
+record.promise = record.run();
+```
+This is Tell-Don't-Ask toward the limiter: the manager hands it work, the limiter decides timing.
+The status guard replaces `dequeue` — an aborted queued agent (status `stopped`) becomes a no-op when its slot finally opens.
+### Manager lifecycle methods
+- `buildObserver` — drop the `markStarted` (in `onStarted`) and `markFinished` (in `onRunFinished`) relays; `onRunFinished` keeps the background `onSubagentCompleted` dispatch.
+- `abort(id)` — for a `queued` agent, just `record.markStopped()` (no `dequeue`); otherwise `record.abort()`.
+- `abortAll()` — iterate agents: `markStopped()` each `queued` agent (count it), else `record.abort()`; then `this.limiter.clear()` to drop pending thunks (their promises resolve).
+- `waitForAll()` — every spawned agent has a `promise`, so the manual `drain()` loop collapses:
+  ```typescript
+  async waitForAll(): Promise<void> {
+   let pending = this.pendingPromises();
+   while (pending.length > 0) {
+    await Promise.allSettled(pending);
+    pending = this.pendingPromises();
+   }
+  }
+  private pendingPromises(): Promise<void>[] {
+   return [...this.agents.values()]
+    .filter(r => r.status === "running" || r.status === "queued")
+    .map(r => r.promise)
+    .filter((p): p is Promise<void> => p != null);
+  }
+  ```
+  The re-check loop is no longer `while (true)` and no longer drives scheduling — the limiter auto-starts queued agents as slots free, so a single `allSettled` covers the queued case.
+  The loop survives only to catch agents spawned *during* the wait.
+  The `eslint-disable @typescript-eslint/no-unnecessary-condition` is deleted.
+- `dispose()` — `this.limiter.clear()` (unchanged in intent).
+### `index.ts` wiring
+```typescript
+const settings = new SettingsManager({
+	// …
+	onMaxConcurrentChanged: () => limiter.recheck(), // forward-ref closure (settings → limiter); benign
+});
+settings.load();
+// …
+const limiter = new ConcurrencyLimiter(() => settings.maxConcurrent);
+const manager = new SubagentManager({ /* … */ limiter, /* … */ });
+```
+The only surviving forward reference is `settings → limiter` (a runtime-only closure, the same shape as today's `settings → queue.drain`).
+The `limiter → manager` back-edge (the `startAgent` callback and its explanatory comment) is **deleted entirely** — that is the structural win.
+### Edge cases
+- **Abort while queued** — `markStopped()` flips status; the scheduled thunk, when run, returns `Promise.resolve()` (no-op), settling `record.promise`.
+- **Limit decreased below active count** — `recheck()` simply starts nothing (`active < getLimit()` is false); in-flight tasks finish normally.
+- **Limit increased** — `onMaxConcurrentChanged → limiter.recheck()` starts newly-admissible pending tasks.
+- **`clear()` with in-flight tasks** — only *pending* tasks are dropped; running tasks complete and `active--` on settle.
+## Module-Level Changes
+| File                                         | Change                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
+| -------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `src/lifecycle/concurrency-limiter.ts`       | Add — new `ConcurrencyLimiter` (`schedule`, `recheck`, `clear`).                                                                                                                                                                                                                                                                                                                                                                                                              |
+| `src/lifecycle/concurrency-queue.ts`         | Remove — replaced by the limiter.                                                                                                                                                                                                                                                                                                                                                                                                                                             |
+| `src/lifecycle/subagent-manager.ts`          | Change — import limiter; `SubagentManagerOptions.queue` → `limiter: ConcurrencyLimiter` and the private field; drop `markStarted`/`markFinished` from `buildObserver`; `spawn` schedules a status-guarded thunk; `abort` drops `dequeue`; `abortAll` iterates agents + `limiter.clear()`; `waitForAll` simplified (add `pendingPromises` helper, delete the `while (true)` loop and its `eslint-disable`); `dispose` calls `limiter.clear()`; update the file-header comment. |
+| `src/lifecycle/subagent.ts`                  | Change — `abort()` docstring: remove the `ConcurrencyQueue.dequeue()` reference (queue removal is now a status-guard no-op).                                                                                                                                                                                                                                                                                                                                                  |
+| `src/index.ts`                               | Change — import `ConcurrencyLimiter`; construct it as `new ConcurrencyLimiter(() => settings.maxConcurrent)`; `onMaxConcurrentChanged: () => limiter.recheck()`; delete the `startAgent` callback and its forward-ref comment; inject `limiter` into the manager.                                                                                                                                                                                                             |
+| `test/lifecycle/concurrency-limiter.test.ts` | Add — limiter unit tests (no `startAgent` mock).                                                                                                                                                                                                                                                                                                                                                                                                                              |
+| `test/lifecycle/concurrency-queue.test.ts`   | Remove — the queue is gone.                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
+| `test/lifecycle/subagent-manager.test.ts`    | Change — `createManager` constructs a `ConcurrencyLimiter`; delete the forward-ref `let mgr` + `prefer-const` `eslint-disable`; drop the unused `queue` field from the returned object.                                                                                                                                                                                                                                                                                       |
+| `docs/architecture/architecture.md`          | Change — Mermaid lifecycle node (`ConcurrencyQueue<br/>(scheduling, drain)` → `ConcurrencyLimiter<br/>(thunk admission gate)`); layout listing (`concurrency-queue.ts` → `concurrency-limiter.ts`); "What the core owns" bullet; mark roadmap Step 1 done; fix the Step 7 ([#378]) target filename reference.                                                                                                                                                                 |
+| `.pi/skills/package-pi-subagents/SKILL.md`   | Change — lifecycle-domain table: `concurrency-queue.ts` → `concurrency-limiter.ts` and adjust the "scheduling" wording to "concurrency admission".                                                                                                                                                                                                                                                                                                                            |
+Verified by grep that no other `src/`, `test/`, `docs/` (excluding `docs/architecture/history/` and prior plans/retros, which are historical), or `.pi/skills/` file references `ConcurrencyQueue`, `concurrency-queue`, `enqueue`, `dequeue`, `markStarted`/`markFinished` (queue), `drain`, `isFull`, or `queuedIds` for this queue.
+`SKILL.md` line 80 (Phase 15 history) keeps `ConcurrencyQueue` — it is a historical record, not current state.
+## Test Impact Analysis
+1. **New tests the change enables.**
+   `ConcurrencyLimiter` is a pure thunk scheduler with no agent/manager knowledge, so it is unit-testable with plain `() => Promise<void>` tasks and `Promise.withResolvers` gates — no `startAgent` mock, no re-entrant `markStarted` simulation.
+   New coverage: FIFO start order; slot gating (only `limit` tasks run concurrently); `active` decrement frees a slot for the next pending task on settle; `recheck()` starts newly-admissible tasks when the limit grows; dynamic limit re-evaluation; `clear()` resolves pending promises without running their tasks; a task that rejects still frees its slot.
+2. **Tests that become redundant.**
+   The entire `test/lifecycle/concurrency-queue.test.ts` (`isFull`, `enqueue`/`dequeue`, `markStarted`/`markFinished`, `drain`, auto-drain, `clear`, `queuedIds`) — those methods no longer exist; the limiter tests replace them at a cleaner seam.
+3. **Tests that stay as-is (genuinely exercise the layer).**
+   The `SubagentManager — queueing and concurrency with injected stubs` describe block asserts manager-level behavior (queued → running transition order, abort-while-queued never runs the factory, `onSubagentStarted` fires on the queued → running transition).
+   These remain valid against the manager + limiter integration and need only the `createManager` helper change (construct a `ConcurrencyLimiter`), not a behavioral rewrite.
+   The `clearCompleted does not remove running or queued agents` test (maxConcurrent=1, blocking factory) also stays.
+## TDD Order
+Priority = preparatory addition first, then the atomic interface swap, then docs.
+1. **Add `ConcurrencyLimiter` (red → green).**
+   Surface: new `test/lifecycle/concurrency-limiter.test.ts` against new `src/lifecycle/concurrency-limiter.ts`.
+   Covers FIFO start order, slot gating, `active`-frees-slot-on-settle, `recheck()` on limit growth, dynamic limit, `clear()` resolves pending without running, reject-frees-slot.
+   Pure addition — `ConcurrencyQueue` still exists and its tests still pass; the suite stays green.
+   Commit: `feat(pi-subagents): add ConcurrencyLimiter (#381)`.
+2. **Migrate `SubagentManager`, `index.ts`, and the manager test helper to the limiter; delete the queue (red → green).**
+   Surface: `src/lifecycle/subagent-manager.ts`, `src/index.ts`, `src/lifecycle/subagent.ts` (docstring), `test/lifecycle/subagent-manager.test.ts`, and deletion of `src/lifecycle/concurrency-queue.ts` + `test/lifecycle/concurrency-queue.test.ts`.
+   This is one atomic commit: changing `SubagentManagerOptions.queue` → `limiter` breaks both call sites (`index.ts` and the test helper) at the type level simultaneously, and the old test file imports the deleted source — all must land together.
+   Drop the observer relays, the `dequeue`/`drain`/`isFull`/`queuedIds` usage, the `while (true)` loop + its `eslint-disable`, and the test helper's forward-ref `eslint-disable`.
+   Run `pnpm run check` immediately after (shared-interface change with multiple call sites), then the full `pnpm --filter @gotgenes/pi-subagents exec vitest run` (the queueing/concurrency integration tests must still pass).
+   Commit: `refactor(pi-subagents): replace ConcurrencyQueue with thunk-based ConcurrencyLimiter (#381)`.
+3. **Update architecture doc and package skill (docs).**
+   Surface: `docs/architecture/architecture.md` (Mermaid node, layout listing, "What the core owns" bullet, roadmap Step 1 marked done, Step 7 filename reference) and `.pi/skills/package-pi-subagents/SKILL.md` (lifecycle-domain table entry + wording).
+   Commit: `docs(pi-subagents): update architecture and skill for ConcurrencyLimiter (#381)`.
+## Risks and Mitigations
+| Risk                                                                   | Mitigation                                                                                                                                                                                                    |
+| ---------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| A dropped pending thunk leaves `record.promise` forever pending.       | `clear()` resolves dropped tasks' promises; the limiter's contract is that every `schedule()` promise settles.                                                                                                |
+| `waitForAll` could spin or miss queued agents.                         | Queued agents now carry real promises, so a single `Promise.allSettled` covers them; the bounded re-check loop only catches agents spawned during the wait, and terminates when `pendingPromises()` is empty. |
+| An abort-while-queued no-op thunk briefly occupies a slot.             | The thunk returns a synchronously-resolved promise; `active++`/`active--` round-trip within a few microtask ticks, after which `recheck()` pulls the next task, so the cost is negligible.                    |
+| Renaming the file/class leaves stale references.                       | Grep-verified inventory in Module-Level Changes; the migration deletes the source and its test in the same commit; docs updated in step 3.                                                                    |
+| `bypassQueue` public-surface name now slightly misnames the mechanism. | Out of scope (breaking); recorded in Open Questions.                                                                                                                                                          |
+## Open Questions
+- Should `bypassQueue` be renamed (e.g. `bypassLimiter`) for accuracy?
+  It is public type surface, so a rename is breaking and belongs in its own change — defer.
+- Should the `code-design` "narrow interface, not concrete class" guidance be applied to the manager's `limiter` field (typed as `{ schedule; clear }` rather than the concrete `ConcurrencyLimiter`)?
+  Tests construct a real limiter (it is pure and trivially constructible), so no mock-cast pressure exists today; keep the concrete type to match the issue and existing pattern, and revisit only if a test needs to substitute it.
+[#373]: https://github.com/gotgenes/pi-packages/issues/373
+[#374]: https://github.com/gotgenes/pi-packages/issues/374
+[#378]: https://github.com/gotgenes/pi-packages/issues/378
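
Reviewer sketch (not part of the package diff): the plan's limiter contract can be exercised in isolation, roughly as below. The class body restates the design from the plan; the `deferred` helper and the `demo` driver are hypothetical names introduced here for illustration, and a hand-rolled deferred stands in for `Promise.withResolvers` only so the sketch runs on older runtimes. It is a minimal illustration of FIFO admission, limit growth via `recheck()`, and `clear()` settling dropped promises, not the shipped implementation.

```typescript
// Illustrative only: `deferred` and `demo` are hypothetical helpers, not package code.
type Deferred = { promise: Promise<void>; resolve: () => void; reject: (e: unknown) => void };

function deferred(): Deferred {
	let resolve: () => void = () => {};
	let reject: (e: unknown) => void = () => {};
	const promise = new Promise<void>((res, rej) => {
		resolve = res;
		reject = rej;
	});
	return { promise, resolve, reject };
}

// Restatement of the plan's design: FIFO thunks admitted against a dynamic limit.
class ConcurrencyLimiter {
	private active = 0;
	private readonly pending: Array<{ start: () => void; settle: () => void }> = [];
	private readonly getLimit: () => number;

	constructor(getLimit: () => number) {
		this.getLimit = getLimit;
	}

	schedule(task: () => Promise<void>): Promise<void> {
		const { promise, resolve, reject } = deferred();
		this.pending.push({
			start: () => {
				this.active++;
				// Free the slot on either outcome, then admit the next pending task.
				void task().then(resolve, reject).finally(() => {
					this.active--;
					this.recheck();
				});
			},
			settle: resolve,
		});
		this.recheck();
		return promise;
	}

	recheck(): void {
		while (this.active < this.getLimit()) {
			const next = this.pending.shift();
			if (!next) break;
			next.start();
		}
	}

	clear(): void {
		// Dropped tasks never run, but their promises still settle.
		for (const task of this.pending.splice(0)) task.settle();
	}
}

async function demo(): Promise<string[]> {
	const log: string[] = [];
	let limit = 1;
	const limiter = new ConcurrencyLimiter(() => limit);
	const gateA = deferred();
	const gateB = deferred();

	const a = limiter.schedule(() => { log.push("A start"); return gateA.promise; });
	const b = limiter.schedule(() => { log.push("B start"); return gateB.promise; });
	const c = limiter.schedule(() => { log.push("C start"); return Promise.resolve(); });

	limit = 2;
	limiter.recheck(); // the limit grew, so B is admitted
	limiter.clear(); // C is dropped; its promise resolves without running the task
	await c;
	log.push("C settled without running");

	gateA.resolve();
	gateB.resolve();
	await Promise.all([a, b]);
	return log;
}
```

Under these assumptions, `demo()` resolves to `["A start", "B start", "C settled without running"]`: A starts synchronously at schedule time, B starts only after the limit grows, and C never runs yet its promise still settles, which is the invariant the plan relies on for `record.promise`.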

package/docs/plans/0400-include-parent-prompt-in-replace-mode.md ADDED

@@ -0,0 +1,199 @@
+---
+issue: 400
+issue_title: "perf(pi-subagents): include parent system prompt in replace mode for KV cache reuse"
+---
+# Include parent system prompt in replace mode for KV cache reuse
+## Problem Statement
+In replace mode, `buildAgentPrompt()` discards the parent system prompt entirely and substitutes a thin two-line header (`"You are a pi coding agent sub-agent. / You have been invoked to handle a specific task autonomously."`).
+Replace-mode agents therefore lose the core identity, tool-usage guidelines, and AGENTS.md context the parent carries, and they share no prompt prefix with the parent or with each other — defeating LLM KV cache reuse.
+The `parentSystemPrompt` parameter is already passed into `buildAgentPrompt()` but the replace branch ignores it.
+## Goals
+- Place the parent system prompt (or `genericBase` when no parent is available) at the front of the replace-mode prompt as a shared, cacheable prefix.
+- Order the replace-mode prompt as: parent/`genericBase` → `<active_agent>` tag → env block → `config.systemPrompt`.
+- Preserve the distinguishing feature of replace mode: it injects neither the `<sub_agent_context>` bridge nor the `<agent_instructions>` wrapper — the custom prompt keeps full control of the agent's instructions, placed last so it has the final say.
+- Apply the change uniformly to every replace-mode agent, including the built-in `Explore` and `Plan` agents.
+- This is a **breaking change**: replace-mode agents (including `Explore`/`Plan` and any custom `prompt_mode: replace` agent) now inherit the parent system prompt on upgrade with no user edit, and the thin two-line header is removed.
+  Ship it as `perf!:` with a `BREAKING CHANGE:` footer.
+## Non-Goals
+- No change to append-mode assembly (already reordered for KV cache in [#180]).
+- No change to how `parentSystemPrompt` is sourced — `create-subagent-session.ts` already passes `snapshot.systemPrompt` through `session-config.ts`.
+- No new mode or flag to distinguish "replace with parent" from "replace without parent" — the operator confirmed the change applies uniformly, so `Explore`/`Plan` are not special-cased.
+- No change to `pi-permission-system` — its `<active_agent>` tag parsing is a full-string regex search, position-independent.
+- No change to `pi-anthropic-auth` — its OAuth shaping is unaffected (see Background).
+## Background
+`buildAgentPrompt()` in `packages/pi-subagents/src/session/prompts.ts` assembles the child system prompt.
+The append branch was reordered in [#180] (shipped in `pi-subagents-v6.18.3`) to place shared/stable content first; the parent prompt is placed verbatim (no wrapper tag) so it forms an identical byte prefix with the parent session, maximising KV cache hits.
+The replace branch was left untouched and still emits the thin header.
+Current replace branch:
+```typescript
+// "replace" mode — env header + the config's full system prompt
+const replaceHeader = `You are a pi coding agent sub-agent.
+You have been invoked to handle a specific task autonomously.
+${envBlock}`;
+return activeAgentTag + replaceHeader + "\n\n" + config.systemPrompt;
+```
+`const identity = parentSystemPrompt ?? genericBase;` currently lives inside the append branch.
+`genericBase` (a `# Role` / general-purpose coding agent blurb) is the shared fallback.
+### Cross-extension interaction — `pi-anthropic-auth` OAuth
+The operator asked how the `genericBase` fallback interacts with `@gotgenes/pi-anthropic-auth`.
+Findings from reading that package's `src/system-prompt-shaping.ts` and `src/request-shaping.ts`:
+- The OAuth de-fingerprinting (`shapeAnthropicOAuthSystemPrompt`) only activates when the system prompt contains `PI_DEFAULT_PROMPT_PREFIX` (Pi's default expert-coding-assistant preamble); otherwise it returns the prompt untouched.
+- The `x-anthropic-billing-header` system block is prepended **unconditionally** for every OAuth request (`prependBillingHeader`), independent of the base prompt content — this is the primary Claude Code billing signal.
+Implications for this change:
+- Normal case (parent present): replace mode places the parent prompt verbatim at the front, structurally identical to append mode, which already works under the OAuth transport wrapper.
+  The inherited Pi preamble is de-fingerprinted exactly as it is for append-mode subagents and the main session today.
+- `genericBase` fallback (only when the parent snapshot has no system prompt — effectively never in real sessions, since `parentSystemPrompt` is a required `string` at the `session-config` layer): `genericBase` carries no Pi fingerprint, so the OAuth shaping no-ops and the billing header is still prepended.
+  `genericBase` is already neutral, so nothing leaks.
+Conclusion: #400 introduces no new OAuth interaction. `genericBase` remains the correct fallback and stays consistent with append mode.
+### Constraints from AGENTS.md
+- This package carries a type-declaration bundle for its public API, but `buildAgentPrompt` is internal — no `dist/public.d.ts` or `exports` impact, so `verify:public-types` is not required for this change.
+- Conventional Commits; do not edit `CHANGELOG.md` (release-please owns it).
+- The `BREAKING CHANGE:` footer text is reused verbatim in the release-please CHANGELOG and the issue close comment — name only real surface (`prompt_mode: replace`).
+## Design Overview
+Hoist the `identity` resolution above the branch so both modes share it, then rewrite the replace branch.
+```typescript
+const activeAgentTag = `<active_agent name="${config.name}"/>\n\n`;
+const envBlock = `# Environment\n...`;
+const identity = parentSystemPrompt ?? genericBase;
+if (config.promptMode === "append") {
+  // ...unchanged...
+}
+// "replace" mode — shared parent prompt (or generic base) first for KV cache
+// reuse, then the active_agent tag, env block, and the config's full system
+// prompt. Unlike append mode, replace mode injects neither the
+// <sub_agent_context> bridge nor the <agent_instructions> wrapper — the custom
+// prompt keeps full control of the agent's instructions.
+return identity + "\n\n" + activeAgentTag + envBlock + "\n\n" + config.systemPrompt;
+```
+Resulting replace-mode order (`activeAgentTag` already ends with `\n\n`):
+```text
+1. parentSystemPrompt (or genericBase)    ← SHARED, cacheable prefix
+2. <active_agent name="${name}"/>         ← varies per agent
+3. # Environment ...                      ← varies per runtime
+4. config.systemPrompt                    ← custom instructions (full control)
+```
+This mirrors append mode's prefix-first ordering, minus the bridge and the `<agent_instructions>` wrapper.
+The change is a pure single-function edit — no new collaborator, no new module, no interface change — so the design-review structural checklist (dependency width, Law of Demeter, extraction seams) does not apply.
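As a rough illustration of the ordering above, the following is a minimal, self-contained sketch of the replace branch only. The `SketchAgentConfig` type, the `buildReplacePrompt` name, the placeholder `sketchGenericBase` text, and the sample environment block are all assumptions made for this example; the real `buildAgentPrompt` also handles append mode and builds the environment block itself.

```typescript
// Hypothetical, simplified stand-in for the real agent config type.
interface SketchAgentConfig {
  name: string;
  systemPrompt: string;
}

// Assumption: placeholder text. The real genericBase string is not shown in this plan.
const sketchGenericBase = "GENERIC BASE PROMPT";

// Replace-mode assembly only: identity, active_agent tag, env block, custom prompt.
function buildReplacePrompt(
  config: SketchAgentConfig,
  envBlock: string,
  parentSystemPrompt?: string,
): string {
  // The tag already carries its own trailing blank line.
  const activeAgentTag = `<active_agent name="${config.name}"/>\n\n`;
  // Shared, cacheable prefix: the parent prompt, or the generic base when absent.
  const identity = parentSystemPrompt ?? sketchGenericBase;
  return identity + "\n\n" + activeAgentTag + envBlock + "\n\n" + config.systemPrompt;
}

const exploreConfig: SketchAgentConfig = {
  name: "Explore",
  systemPrompt: "You are a READ-ONLY file search specialist.",
};
const sampleEnvBlock = "# Environment\ncwd: /tmp/project"; // assumed sample content

// With a parent prompt, the parent text leads the assembled prompt.
const examplePrompt = buildReplacePrompt(exploreConfig, sampleEnvBlock, "PARENT PROMPT");

// Without a parent prompt, the generic base takes the same leading position.
const fallbackPrompt = buildReplacePrompt(exploreConfig, sampleEnvBlock);
```

Because every replace-mode child of the same parent starts with an identical `identity` string, only the suffix after it differs per agent, which is what makes the prefix a candidate for KV cache reuse.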
+### Edge cases
+- Empty `config.systemPrompt` (e.g. a replace agent with no body): the prompt ends with a trailing `\n\n` after the env block.
+  Acceptable and consistent with current behavior; no special-casing.
+- Empty-string `parentSystemPrompt`: `genericBase` only substitutes on a nullish parent (the `??` operator), so an empty-string parent prompt is preserved as-is, matching append mode.
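The two edge cases can be checked with a small sketch. The helper names (`resolveIdentity`, `resolveIdentityLoose`), the placeholder `edgeGenericBase` text, and the sample environment block are assumptions for illustration; the `||` variant is shown only as a contrast and is not what the plan uses.

```typescript
// Assumption: placeholder text. The real genericBase string is not shown in this plan.
const edgeGenericBase = "GENERIC BASE";

// The plan's rule: fall back only when the parent prompt is null or undefined.
function resolveIdentity(parent: string | null | undefined): string {
  return parent ?? edgeGenericBase;
}

// Contrast case, not used by the plan: `||` would also replace an empty string.
function resolveIdentityLoose(parent: string | null | undefined): string {
  return parent || edgeGenericBase;
}

const keptEmpty = resolveIdentity(""); // the empty-string parent survives
const fellBack = resolveIdentity(undefined); // nullish parent gets the generic base
const looseResult = resolveIdentityLoose(""); // the loose variant discards ""

// Empty config.systemPrompt: the prompt tail is the env block plus the "\n\n" separator.
const emptyBodyTail = "# Environment\ncwd: /tmp/project" + "\n\n" + "";
const endsWithBlankLine = emptyBodyTail.slice(-2) === "\n\n";
```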
+## Module-Level Changes
+### `packages/pi-subagents/src/session/prompts.ts`
+1. Hoist `const identity = parentSystemPrompt ?? genericBase;` from the append branch to before the `if (config.promptMode === "append")` check so both branches use it.
+2. Replace the replace-branch `replaceHeader` template and return statement with the new ordering (`identity` → `activeAgentTag` → `envBlock` → `config.systemPrompt`); remove the thin two-line header.
+3. Update the JSDoc summary: replace-mode bullet becomes "parent system prompt (or generic base) + active_agent tag + env header + config.systemPrompt; no bridge, no agent_instructions wrapper," and update the trailing note about tag position (it is included, not prepended, in either mode).
+### `packages/pi-subagents/test/session/prompts.test.ts`
+See Test Impact Analysis and TDD Order for the specific test changes.
+### `packages/pi-subagents/README.md`
+1. Lines 119–120 — the `Explore` and `Plan` rows: revise the `replace` (standalone) framing, since replace mode now inherits the parent prompt as its base.
+2. Line 187 — the `prompt_mode` frontmatter table: `replace` no longer means "no AGENTS.md / CLAUDE.md inheritance."
+   Reword to describe the new semantics: replace inherits the parent prompt as the base, then the body takes full control (no `<sub_agent_context>` bridge, no `<agent_instructions>` wrapper), whereas append wraps the body and adds the bridge.
+3. Line 494 (Patch 3, `<active_agent>` tag): change "prepends ... to every assembled child system prompt (both `replace` and `append` modes)" to "includes ... in every assembled child system prompt (both modes)" — the tag follows the cacheable parent prefix in both modes now, so "prepends" is inaccurate.
+No `docs/architecture/` updates: the architecture doc references `prompts.ts` only as a one-line file listing (no prompt-assembly description, no complexity/health table entry tied to this change).
+## Test Impact Analysis
+This is a behavior change, not an extraction, so most of the extraction-specific questions do not apply.
+- New behavior to cover: replace mode now includes the parent prompt as a cacheable prefix; falls back to `genericBase` with no parent; still excludes the bridge and the `<agent_instructions>` wrapper.
+- Existing replace-mode tests that assert the old behavior must change (they pin the removed thin header and the "ignores parent prompt" premise).
+- `toContain`-based tests for cwd/git/env and the `genericBase` fallback remain valid where position-independent.
+- No existing test becomes redundant beyond the ones being rewritten; no test must stay frozen for a layer being extracted (nothing is extracted).
+Tests that change in `test/session/prompts.test.ts`:
+1. `"replace mode uses config systemPrompt directly"` — asserts `toContain("You are a pi coding agent sub-agent")`; that header is removed.
+   Rewrite to assert the config prompt is present and the thin header is gone.
+2. `"replace mode ignores parent prompt"` — asserts the parent content is absent.
+   The premise inverts: rename to `"replace mode includes parent prompt as base (no bridge/wrapper)"` and assert the parent content is present while `<sub_agent_context>` and `<agent_instructions>` are absent.
+3. `"prepends <active_agent name=...> tag in replace mode"` — asserts `prompt.startsWith('<active_agent name="Explore"/>\n\n')`.
+   The tag no longer leads (parent/`genericBase` does); rewrite to assert the tag appears after the identity prefix and before the env block.
+4. `"active_agent tag appears before envBlock in both modes"` — the replace assertions pin `tagIdx === 0`.
+   Update the replace assertions: the tag is no longer at index 0 but still precedes `# Environment`.
+   The append assertions stay as-is.
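The positional rewrites in items 3 and 4 amount to replacing "the tag is at index 0" with "the tag sits between the identity prefix and the env block". A framework-free sketch of that check follows; the real tests use vitest matchers against `buildAgentPrompt`, and the `replaceOrderHolds` helper plus both sample prompts are hand-written stand-ins assumed for this example.

```typescript
// Hypothetical helper: true when a prompt follows the new replace-mode order
// and carries neither the bridge nor the wrapper.
function replaceOrderHolds(
  prompt: string,
  identity: string,
  agentName: string,
  body: string,
): boolean {
  const identityIdx = prompt.indexOf(identity);
  const tagIdx = prompt.indexOf(`<active_agent name="${agentName}"/>`);
  const envIdx = prompt.indexOf("# Environment");
  const bodyIdx = prompt.lastIndexOf(body);
  return (
    identityIdx === 0 &&
    tagIdx > identityIdx &&
    tagIdx < envIdx &&
    envIdx < bodyIdx &&
    prompt.indexOf("<sub_agent_context>") === -1 &&
    prompt.indexOf("<agent_instructions>") === -1
  );
}

// Stand-in for the new layout: identity first, then tag, env block, custom body.
const newStylePrompt =
  'PARENT PROMPT\n\n<active_agent name="Explore"/>\n\n# Environment\ncwd: /tmp\n\nCUSTOM BODY';

// Stand-in for the old layout: the tag leads and the parent prompt is absent.
const oldStylePrompt =
  '<active_agent name="Explore"/>\n\nYou are a pi coding agent sub-agent\n\n# Environment\ncwd: /tmp\n\nCUSTOM BODY';

const newStyleOk = replaceOrderHolds(newStylePrompt, "PARENT PROMPT", "Explore", "CUSTOM BODY");
const oldStyleOk = replaceOrderHolds(oldStylePrompt, "PARENT PROMPT", "Explore", "CUSTOM BODY");
```

The old-style sample fails because the identity prefix is missing, which matches the inverted premise of the rewritten tests.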
+## TDD Order
+All test and source changes live in two files that the type checker links: `src/session/prompts.ts` (the replace branch) and `test/session/prompts.test.ts` (its tests).
+Each cycle is a single commit that leaves the suite green.
+1. **Red: rewrite replace-mode behavioral tests.**
+   Update tests 1–2 above to the new behavior (parent prompt included as base; thin header removed; no bridge/wrapper), and add a test for the `genericBase` fallback when no parent is supplied in replace mode, plus a test pinning the full order (`identity` → `<active_agent>` → `# Environment` → `config.systemPrompt`).
+   These fail against the current implementation.
+   Commit: `test: assert replace mode inherits parent prompt as cacheable prefix (#400)`
+2. **Green: rewrite the replace branch.**
+   Hoist `identity`, replace the `replaceHeader` block with the new ordering, remove the thin header, and update the JSDoc.
+   Update the positional `<active_agent>` tests (3–4 above) in the same commit — they break at runtime the moment the branch changes.
+   Commit body carries the `BREAKING CHANGE:` footer.
+   Commit: `perf!: include parent system prompt in replace mode (#400)`
+   ```text
+   BREAKING CHANGE: replace-mode subagents (built-in Explore/Plan and any
+   custom prompt_mode: replace agent) now inherit the parent system prompt as
+   their base instead of a thin standalone header. The custom prompt is
+   appended last and retains full control; the <sub_agent_context> bridge and
+   <agent_instructions> wrapper are still omitted in replace mode.
+   ```
+3. **Docs: update README replace-mode semantics.**
+   Apply the three README edits (Explore/Plan rows, `prompt_mode` table, Patch 3 `<active_agent>` wording).
+   Commit: `docs: describe replace-mode parent inheritance (#400)`
+## Risks and Mitigations
+| Risk                                                                                                                  | Mitigation                                                                                                                                                                                                    |
+| --------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `Explore`/`Plan` behavior shifts — they now carry the full parent prompt plus their read-only specialist instructions | Operator confirmed uniform application; specialist instructions are placed last so they have the final say; existing read-only assertions (`READ-ONLY`, `file search specialist`) still hold via `toContain`. |
+| `pi-permission-system` depends on `<active_agent>` tag position                                                       | Tag parsing is a full-string regex search; position-independent (same basis as [#180]).                                                                                                                       |
+| `pi-anthropic-auth` OAuth shaping breaks with the new base                                                            | No new interaction — billing header is prepended unconditionally; de-fingerprinting keys off `PI_DEFAULT_PROMPT_PREFIX` and `genericBase` is already neutral (see Background).                                |
+| A custom replace agent relied on the clean-slate (no parent) behavior                                                 | Documented as breaking in the `BREAKING CHANGE:` footer and README; this aligns with the expectation reported in the issue ([@jeffutter] expected the parent identity to be present).                         |
+| Stale README claims that replace = no inheritance                                                                     | README edits in cycle 3 correct lines 119–120, 187, and 494.                                                                                                                                                  |
+## Open Questions
+None — the three design decisions (breaking classification, `genericBase` fallback, uniform application to built-ins) were resolved with the operator before planning.
+[#180]: https://github.com/gotgenes/pi-packages/issues/180
+[@jeffutter]: https://github.com/jeffutter

package/docs/retro/0381-replace-concurrency-queue-with-limiter.md ADDED

@@ -0,0 +1,49 @@
+---
+issue: 381
+issue_title: "Replace ConcurrencyQueue with a thunk-based ConcurrencyLimiter"
+---
+# Retro: #381 — Replace ConcurrencyQueue with a thunk-based ConcurrencyLimiter
+## Stage: Planning (2026-06-13T00:00:00Z)
+### Session summary
+Produced a 3-step TDD plan to replace the ID-registry `ConcurrencyQueue` (with its `startAgent` back-edge and `markStarted`/`markFinished` relays) with a pure `ConcurrencyLimiter` that schedules thunks FIFO against a dynamic limit.
+The design follows the architecture doc's Phase 17 Step 1 entry and the issue's revised framing closely; the plan adds concrete code sketches for `schedule`/`recheck`/`clear`, the manager call site, the simplified `waitForAll`, and `index.ts` wiring.
+### Observations
+- Author is `gotgenes` (matches the gh CLI user), so the well-specified proposal was treated as the working hypothesis; the design is unambiguous (down to the architecture-doc Step 1), so the `ask_user` gate was skipped.
+- Classified non-breaking: `ConcurrencyQueue`/`ConcurrencyLimiter` are internal — no public API, config, or observable behavior change.
+  The FIFO admission gate against `maxConcurrent` is preserved.
+- Key design decision beyond the issue sketch: `clear()` must *settle* dropped pending promises (resolve them), not just drop the thunks.
+  Every `schedule()` promise becomes `record.promise`, and the post-spawn contract is that it always settles — dropping without resolving would strand a promise.
+  This costs a small `settle` handle per pending entry (a few lines beyond the issue's "~40 lines").
+- Verified no production caller awaits a *queued* agent's promise in a blocking way (`get-result-tool.ts` guards on `status === "running"`; `spawnAndWait` is foreground/direct; `waitForAll` filters by status), confirming it is safe to give queued agents a real promise.
+- Sequencing decision: the `SubagentManagerOptions.queue` → `limiter` swap breaks both call sites (`index.ts` + the manager test helper) and the old test file imports the deleted source, so step 2 is one atomic commit (migrate consumers + delete queue + delete old test).
+- `bypassQueue` is kept as-is — it is in the published `SubagentsService` type bundle, so renaming would be breaking; deferred to Open Questions.
+- Doc inventory: grep confirmed current-state references to update are the Mermaid lifecycle node, the layout listing, the "What the core owns" bullet, the Step 7 ([#378]) target filename, and the `package-pi-subagents` SKILL lifecycle-domain table.
+  `SKILL.md` line 80 (Phase 15 history) keeps `ConcurrencyQueue` as a historical record.
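The `clear()` decision above is easier to see in code. The following is a minimal sketch of a thunk-based limiter under stated assumptions: the class name `SketchLimiter`, the `getLimit` callback, the `pendingCount` accessor, and the choice to swallow thunk rejections are all illustrative and are not taken from `concurrency-limiter.ts`.

```typescript
type Thunk = () => Promise<void>;

interface PendingEntry {
  thunk: Thunk;
  settle: () => void; // resolves the promise that schedule() handed back
}

class SketchLimiter {
  private readonly pending: PendingEntry[] = [];
  private running = 0;
  // Assumption: the dynamic limit is read through a callback on every recheck.
  private readonly getLimit: () => number;

  constructor(getLimit: () => number) {
    this.getLimit = getLimit;
  }

  get pendingCount(): number {
    return this.pending.length;
  }

  // Queue a thunk FIFO. The returned promise settles when the thunk finishes
  // or when clear() drops it.
  schedule(thunk: Thunk): Promise<void> {
    const promise = new Promise<void>((resolve) => {
      this.pending.push({ thunk, settle: resolve });
    });
    this.recheck();
    return promise;
  }

  // Start pending thunks, oldest first, while capacity allows.
  recheck(): void {
    while (this.running < this.getLimit() && this.pending.length > 0) {
      const entry = this.pending.shift()!;
      this.running += 1;
      const done = (): void => {
        this.running -= 1;
        entry.settle();
        this.recheck();
      };
      // Assumption: a rejected thunk still frees its slot and settles.
      void entry.thunk().then(done, done);
    }
  }

  // Drop everything still pending, resolving each promise so no awaiter hangs.
  clear(): void {
    for (const entry of this.pending.splice(0)) {
      entry.settle();
    }
  }
}

// Usage: with a limit of 1, "a" starts at once and "b" waits until it is cleared.
const startedOrder: string[] = [];
const neverSettles = new Promise<void>(() => {});
const sketchLimiter = new SketchLimiter(() => 1);
void sketchLimiter.schedule(() => { startedOrder.push("a"); return neverSettles; });
const droppedPromise = sketchLimiter.schedule(() => { startedOrder.push("b"); return neverSettles; });
const pendingBeforeClear = sketchLimiter.pendingCount;
sketchLimiter.clear(); // "b" never runs, yet droppedPromise resolves on a later microtask
```

Keeping a `settle` handle per pending entry is the small extra cost the observation mentions: without it, `clear()` could only discard the thunk and the promise returned for it would never settle.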
+## Stage: Implementation — TDD (2026-06-13T22:15:00Z)
+### Session summary
+Executed all 3 planned TDD cycles: (1) added `ConcurrencyLimiter` + 13 unit tests, (2) migrated `SubagentManager`, `index.ts`, `subagent.ts` docstring, and the manager test helper to the limiter while deleting `concurrency-queue.ts` + its test in the same atomic commit, (3) updated `architecture.md` and the package SKILL.
+Test count went 975 → 966 (−22 deleted queue tests, +13 new limiter tests); the full suite, `check`, `lint`, and `pnpm fallow dead-code` are all green.
+### Observations
+- The plan held up cleanly — no surprises in the manager integration tests.
+  The `queueing and concurrency` describe block passed unchanged after only the `createManager` helper swap (real `ConcurrencyLimiter` instead of `ConcurrencyQueue` + forward-ref start callback), confirming those tests exercise behavior, not queue internals.
+- One deviation: a 4th commit (`90135005`, `refactor:`) fixes a stale `// before startAgent / queue drain` comment at `src/index.ts:125` that the plan's grep inventory missed (it named no removed symbol, just deleted concepts).
+  The pre-completion reviewer caught it.
+  Committed separately rather than amending the non-HEAD refactor commit, since AGENTS.md discourages interactive rebase in this environment.
+- ESLint `@typescript-eslint/no-floating-promises` fired on every bare `limiter.schedule(...)` in the limiter test (the queue's `enqueue` returned `void`; `schedule` returns a promise).
+  Resolved by prefixing unawaited calls with `void` — all such tasks either stay pending or resolve, so no unhandled rejection.
+- The `clear()`-settles-pending-promises decision (made at planning) proved correct and is covered by a dedicated test ("resolves the promises of dropped pending tasks").
+- Pre-completion reviewer: WARN (no FAILs).
+  Reviewer warnings: the single stale-comment finding at `index.ts:125` — now fixed in commit `90135005`.
+[#378]: https://github.com/gotgenes/pi-packages/issues/378

package/docs/retro/0400-include-parent-prompt-in-replace-mode.md ADDED

@@ -0,0 +1,84 @@
+---
+issue: 400
+issue_title: "perf(pi-subagents): include parent system prompt in replace mode for KV cache reuse"
+---
+# Retro: #400 — Include parent system prompt in replace mode for KV cache reuse
+## Stage: Planning (2026-06-14T00:42:49Z)
+### Session summary
+Produced a numbered plan for including the parent system prompt as a cacheable prefix in `buildAgentPrompt()`'s replace branch, mirroring the [#180] append-mode reorder.
+The change is a single-function edit plus test and README updates, planned across three TDD/docs commits.
+### Observations
+- Three design decisions were confirmed with the operator (issue author = gh user) before planning:
+  1. Ship as breaking `perf!:` with a `BREAKING CHANGE:` footer — replace-mode agents inherit the parent prompt on upgrade with no user edit, and the thin two-line header is removed.
+  2. Use `genericBase` as the no-parent fallback, consistent with append mode.
+  3. Apply uniformly to all replace agents, including built-in `Explore` and `Plan` (one code path, no special-casing).
+- The operator raised a cross-extension concern about the `genericBase` fallback interacting with `@gotgenes/pi-anthropic-auth`.
+  Investigation of that package's `system-prompt-shaping.ts` / `request-shaping.ts` showed no new interaction: the `x-anthropic-billing-header` block is prepended unconditionally for OAuth, and de-fingerprinting keys off `PI_DEFAULT_PROMPT_PREFIX` (absent from `genericBase`, which is already neutral).
+  Captured this in the plan's Background and Risks.
+- `parentSystemPrompt` is a required `string` at the `session-config` layer (sourced from `snapshot.systemPrompt`), so the `genericBase` fallback is effectively a defensive/test-only path in real sessions.
+- The thin replace header string (`You are a pi coding agent sub-agent`) appears only in `prompts.ts` and its test; no skill or live doc pins it.
+  The README needs three edits: the Explore/Plan rows, the `prompt_mode` table, and the Patch 3 `<active_agent>` wording (the last was already slightly stale after #180).
+- Notable emergent scope point: `Explore`/`Plan` are built-in replace-mode agents, so this change affects them visibly — surfaced and confirmed rather than assumed.
+## Stage: Implementation — TDD (2026-06-14T00:54:46Z)
+### Session summary
+Completed all 3 TDD cycles in `packages/pi-subagents`.
+The change is a single-function edit to `src/session/prompts.ts` (hoist `identity`, rewrite replace branch) plus test updates and README/skill-doc corrections.
+Test count went from 973 to 975 (+2 net new tests) across 59 test files.
+### Observations
+- Step 1 (Red): rewrote 2 existing replace-mode tests and added 2 new ones (4 failures confirmed against old code); the old "ignores parent prompt" test premise inverted cleanly into "includes parent prompt as base."
+- Step 2 (Green): hoisting `const identity = parentSystemPrompt ?? genericBase;` above the `if` block and replacing the `replaceHeader` template were the only `src/` changes; also updated two positional `<active_agent>` tests in the same commit since they broke the moment the branch changed (`tagIdx === 0` → `toBeGreaterThan(0)`).
+- The `BREAKING CHANGE:` footer wording was taken verbatim from the plan and landed in the `perf!:` commit.
+- Pre-completion reviewer: WARN — one finding: `.pi/skills/package-pi-subagents/SKILL.md` still said "prepends" for the `<active_agent>` tag; fixed in a follow-up `docs:` commit before shipping.
+- No deviations from the plan's Module-Level Changes list; no lockfile changes; fallow dead-code exited zero.
+## Stage: Final Retrospective (2026-06-14T01:11:10Z)
+### Session summary
+Shipped #400 across three stages (Planning on `claude-opus-4-8`, TDD + Ship on `claude-sonnet-4-6`) as a single-function edit to `buildAgentPrompt()`'s replace branch plus tests and doc updates, released as `pi-subagents` v16.0.0 (major, breaking `perf!:`).
+The run was clean end-to-end: two `ask_user` gates during planning, a 3-cycle TDD pass, one pre-completion WARN resolved before push, and a no-friction release-please merge.
+### Observations
+#### What went well
+- Cross-extension investigation on demand — when the operator asked mid-`ask_user` how the `genericBase` fallback interacts with `@gotgenes/pi-anthropic-auth`, the agent read that sibling repo's `system-prompt-shaping.ts` and `request-shaping.ts` and proved no new interaction (billing header prepended unconditionally; de-fingerprinting keys off `PI_DEFAULT_PROMPT_PREFIX`, absent from the neutral `genericBase`) before answering.
+  This converted an open worry into a documented Risk row rather than a deferred unknown.
+- Emergent-scope surfacing — planning noticed that built-in `Explore`/`Plan` are replace-mode agents and so are visibly affected, then confirmed uniform application via a second `ask_user` instead of assuming.
+- Autoformat discipline — after `pi-autoformat` touched `README.md` mid-edit, the agent re-read the region before the next edit (turns 49–50) rather than matching against stale layout, avoiding a failed `oldText`.
+#### What caused friction (agent side)
+- `missing-context` (planning) — the plan listed the README's Patch 3 `<active_agent>` "prepends" wording as a doc update but missed the identical Patch 3 description in `.pi/skills/package-pi-subagents/SKILL.md`.
+  Exact-grep during planning keyed on removed strings (`You are a pi coding agent sub-agent`, `prompt_mode`); the stale prose carried none of them, so the skill file's "prepends `<active_agent>`" line was not found.
+  Impact: the pre-completion reviewer caught it as a WARN, requiring one follow-up `docs:` commit (8e93d2a4) during TDD before push — no rework beyond that, and the safety net worked as designed.
+#### What caused friction (user side)
+- None — the operator's mid-planning OAuth question was a high-value redirect that strengthened the plan, not friction.
+### Diagnostic details
+- **Model-performance correlation** — judgment-heavy planning ran on `claude-opus-4-8`; mechanical TDD execution and the deterministic ship steps ran on `claude-sonnet-4-6`.
+  Appropriate assignment in both directions; no mismatch.
+- **Unused-tool detection** — the `colgrep` skill was loaded in planning but never used; exploration was all exact-symbol grep, which was correct for known symbols.
+  The one place it would have helped is the `missing-context` friction: a semantic search like "docs describing how the active_agent tag is added to the system prompt" would likely have surfaced both the README and the SKILL.md descriptions that symbol-grep missed.
+- **Feedback-loop gap analysis** — verification ran incrementally throughout (green baseline before cycle 1, per-file `vitest` each cycle, full suite + `check` + `lint` + `fallow` after the last step).
+  No end-loaded verification.
+- **Escalation-delay tracking** — no rabbit-holes; no error sequence exceeded one tool call.
+### Changes made
+1. `.pi/prompts/plan-issue.md` — extended the Module-Level Changes grep bullet: when a step reworks a documented mechanism's behavior (rather than removing a symbol), grep `.pi/skills/package-*/SKILL.md` for the mechanism name, since reworded prose carries no removed symbol to match.
+[#180]: https://github.com/gotgenes/pi-packages/issues/180

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@gotgenes/pi-subagents",
-  "version": "15.0.2",
+  "version": "16.1.0",
   "type": "module",
   "exports": {
     ".": {