@gotgenes/pi-subagents 15.0.1 → 16.0.0
- package/CHANGELOG.md +24 -0
- package/README.md +24 -24
- package/docs/architecture/architecture.md +263 -11
- package/docs/plans/0400-include-parent-prompt-in-replace-mode.md +199 -0
- package/docs/retro/0400-include-parent-prompt-in-replace-mode.md +44 -0
- package/package.json +5 -5
- package/src/session/prompts.ts +25 -20
package/CHANGELOG.md
CHANGED
|
@@ -5,6 +5,30 @@ All notable changes to this project will be documented in this file.
|
|
|
5
5
|
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
|
|
6
6
|
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
|
|
7
7
|
|
|
8
|
+
## [16.0.0](https://github.com/gotgenes/pi-packages/compare/pi-subagents-v15.0.2...pi-subagents-v16.0.0) (2026-06-14)
|
|
9
|
+
|
|
10
|
+
|
|
11
|
+
### ⚠ BREAKING CHANGES
|
|
12
|
+
|
|
13
|
+
* replace-mode subagents (built-in Explore/Plan and any custom prompt_mode: replace agent) now inherit the parent system prompt as their base instead of a thin standalone header. The custom prompt is appended last and retains full control; the <sub_agent_context> bridge and <agent_instructions> wrapper are still omitted in replace mode.
|
|
14
|
+
|
|
15
|
+
### Performance Improvements
|
|
16
|
+
|
|
17
|
+
* include parent system prompt in replace mode ([#400](https://github.com/gotgenes/pi-packages/issues/400)) ([1cc25cf](https://github.com/gotgenes/pi-packages/commit/1cc25cf0106cbfe3015ceb69a820c745c07038e2))
|
|
18
|
+
|
|
19
|
+
|
|
20
|
+
### Documentation
|
|
21
|
+
|
|
22
|
+
* describe replace-mode parent inheritance ([#400](https://github.com/gotgenes/pi-packages/issues/400)) ([6b6e61d](https://github.com/gotgenes/pi-packages/commit/6b6e61d649582c26d2c36edf67dfd1e35d87a802))
|
|
23
|
+
|
|
24
|
+
## [15.0.2](https://github.com/gotgenes/pi-packages/compare/pi-subagents-v15.0.1...pi-subagents-v15.0.2) (2026-06-12)
|
|
25
|
+
|
|
26
|
+
|
|
27
|
+
### Miscellaneous Chores
|
|
28
|
+
|
|
29
|
+
* **deps:** bump Pi SDK to 0.79.1 ([#370](https://github.com/gotgenes/pi-packages/issues/370)) ([704f3b3](https://github.com/gotgenes/pi-packages/commit/704f3b3457ceb12b9df9efffe7a56812a5667d5d))
|
|
30
|
+
* **deps:** bump rollup to 4.61.1 ([#370](https://github.com/gotgenes/pi-packages/issues/370)) ([250b729](https://github.com/gotgenes/pi-packages/commit/250b7296093b091297c57463693eaa2db59d5fe3))
|
|
31
|
+
|
|
8
32
|
## [15.0.1](https://github.com/gotgenes/pi-packages/compare/pi-subagents-v15.0.0...pi-subagents-v15.0.1) (2026-06-10)
|
|
9
33
|
|
|
10
34
|
|
package/README.md
CHANGED
|
@@ -113,14 +113,14 @@ The LLM receives structured `<task-notification>` XML for parsing, while the use
|
|
|
113
113
|
|
|
114
114
|
## Default Agent Types
|
|
115
115
|
|
|
116
|
-
| Type | Tools | Model | Prompt Mode | Description
|
|
117
|
-
| ----------------- | -------------------------- | ----------------------------- | ---------------------- |
|
|
118
|
-
| `general-purpose` | all 7 | inherit | `append` (parent twin) | Inherits the parent's full system prompt — same rules, CLAUDE.md, project conventions
|
|
119
|
-
| `Explore` | read, bash, grep, find, ls | haiku (falls back to inherit) | `replace`
|
|
120
|
-
| `Plan` | read, bash, grep, find, ls | inherit | `replace`
|
|
116
|
+
| Type | Tools | Model | Prompt Mode | Description |
|
|
117
|
+
| ----------------- | -------------------------- | ----------------------------- | ---------------------- | ------------------------------------------------------------------------------------------------ |
|
|
118
|
+
| `general-purpose` | all 7 | inherit | `append` (parent twin) | Inherits the parent's full system prompt — same rules, CLAUDE.md, project conventions |
|
|
119
|
+
| `Explore` | read, bash, grep, find, ls | haiku (falls back to inherit) | `replace` | Fast codebase exploration (read-only); inherits the parent prompt as a base |
|
|
120
|
+
| `Plan` | read, bash, grep, find, ls | inherit | `replace` | Software architect for implementation planning (read-only); inherits the parent prompt as a base |
|
|
121
121
|
|
|
122
122
|
The `general-purpose` agent is a **parent twin** — it receives the parent's entire system prompt plus a sub-agent context bridge, so it follows the same rules the parent does.
|
|
123
|
-
Explore and Plan use
|
|
123
|
+
Explore and Plan use `replace` mode: the parent prompt is the cacheable base and their specialist read-only instructions are appended last, giving them the final say.
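A minimal sketch of how the two prompt modes could be assembled, based only on the descriptions in this README. The function name `assemblePrompt`, the `AgentConfig` shape, and the bridge text are illustrative assumptions, not the package's actual `buildAgentPrompt` implementation.

```typescript
// Hypothetical shapes for illustration; the real config type is not shown here.
type PromptMode = "replace" | "append";

interface AgentConfig {
  name: string;
  promptMode: PromptMode;
  body: string; // the agent's own instructions
}

// Placeholder bridge text; the real <sub_agent_context> content is an assumption.
const BRIDGE =
  "<sub_agent_context>You are running as a sub-agent.</sub_agent_context>";

function assemblePrompt(parentPrompt: string, config: AgentConfig): string {
  // Both modes start with the parent prompt so the prefix stays cacheable,
  // followed by the <active_agent> tag.
  const parts: string[] = [
    parentPrompt,
    `<active_agent name="${config.name}"/>`,
  ];
  if (config.promptMode === "replace") {
    // replace: no bridge and no wrapper; the body goes last and has the final say.
    parts.push(config.body);
  } else {
    // append: inject the bridge and wrap the body in <agent_instructions>.
    parts.push(
      BRIDGE,
      `<agent_instructions>\n${config.body}\n</agent_instructions>`,
    );
  }
  return parts.join("\n\n");
}
```

In this sketch the only difference between the modes is what follows the shared prefix, which is why a `replace` agent can reuse the parent's cached prompt while still ending on its own instructions.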
|
|
124
124
|
|
|
125
125
|
Default agents can be **ejected** (`/agents` → select agent → Eject) to export them as `.md` files for customization, **overridden** by creating a `.md` file with the same name (e.g. `.pi/agents/general-purpose.md`), or **disabled** per-project with `enabled: false` frontmatter.
|
|
126
126
|
|
|
@@ -172,23 +172,23 @@ subagent({ subagent_type: "auditor", prompt: "Review the auth module", descripti
|
|
|
172
172
|
|
|
173
173
|
All fields are optional — sensible defaults for everything.
|
|
174
174
|
|
|
175
|
-
| Field | Default | Description
|
|
176
|
-
| ------------------- | -------------- |
|
|
177
|
-
| `description` | filename | Agent description shown in tool listings
|
|
178
|
-
| `display_name` | — | Display name for UI (e.g. widget, agent list)
|
|
179
|
-
| `tools` | all 7 | Comma-separated built-in tools: read, bash, edit, write, grep, find, ls. `none` for no tools
|
|
180
|
-
| `extensions` | `true` | `true` to inherit all MCP/extension tools, `false` to disable
|
|
181
|
-
| `skills` | `true` | Inherit skills from parent. Can be a comma-separated list of skill names to preload (see [Skill Preloading](#skill-preloading) for discovery locations)
|
|
182
|
-
| `memory` | — | Persistent agent memory scope: `project`, `local`, or `user`. Auto-detects read-only agents
|
|
183
|
-
| `isolation` | — | Set to `worktree` to run in an isolated git worktree
|
|
184
|
-
| `model` | inherit parent | Model — `provider/modelId` or fuzzy name (`"haiku"`, `"sonnet"`)
|
|
185
|
-
| `thinking` | inherit | off, minimal, low, medium, high, xhigh
|
|
186
|
-
| `max_turns` | unlimited | Max agentic turns before graceful shutdown. `0` or omit for unlimited
|
|
187
|
-
| `prompt_mode` | `append` | `replace`:
|
|
188
|
-
| `inherit_context` | `false` | Fork parent conversation into agent
|
|
189
|
-
| `run_in_background` | `false` | Run in background by default
|
|
190
|
-
| `isolated` | `false` | No extension/MCP tools, only built-in
|
|
191
|
-
| `enabled` | `true` | Set to `false` to disable an agent (useful for hiding a default agent per-project)
|
|
175
|
+
| Field | Default | Description |
|
|
176
|
+
| ------------------- | -------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
177
|
+
| `description` | filename | Agent description shown in tool listings |
|
|
178
|
+
| `display_name` | — | Display name for UI (e.g. widget, agent list) |
|
|
179
|
+
| `tools` | all 7 | Comma-separated built-in tools: read, bash, edit, write, grep, find, ls. `none` for no tools |
|
|
180
|
+
| `extensions` | `true` | `true` to inherit all MCP/extension tools, `false` to disable |
|
|
181
|
+
| `skills` | `true` | Inherit skills from parent. Can be a comma-separated list of skill names to preload (see [Skill Preloading](#skill-preloading) for discovery locations) |
|
|
182
|
+
| `memory` | — | Persistent agent memory scope: `project`, `local`, or `user`. Auto-detects read-only agents |
|
|
183
|
+
| `isolation` | — | Set to `worktree` to run in an isolated git worktree |
|
|
184
|
+
| `model` | inherit parent | Model — `provider/modelId` or fuzzy name (`"haiku"`, `"sonnet"`) |
|
|
185
|
+
| `thinking` | inherit | off, minimal, low, medium, high, xhigh |
|
|
186
|
+
| `max_turns` | unlimited | Max agentic turns before graceful shutdown. `0` or omit for unlimited |
|
|
187
|
+
| `prompt_mode` | `append` | `replace`: parent prompt is the cacheable base; body is appended last with full control (no `<sub_agent_context>` bridge, no `<agent_instructions>` wrapper). `append`: parent prompt is the base; body is wrapped in `<agent_instructions>` and a sub-agent context bridge is injected (agent acts as a "parent twin") |
|
|
188
|
+
| `inherit_context` | `false` | Fork parent conversation into agent |
|
|
189
|
+
| `run_in_background` | `false` | Run in background by default |
|
|
190
|
+
| `isolated` | `false` | No extension/MCP tools, only built-in |
|
|
191
|
+
| `enabled` | `true` | Set to `false` to disable an agent (useful for hiding a default agent per-project) |
|
|
192
192
|
|
|
193
193
|
Frontmatter is authoritative.
|
|
194
194
|
If an agent file sets `model`, `thinking`, `max_turns`, `inherit_context`, `run_in_background`, `isolated`, or `isolation`, those values are locked for that agent.
|
|
@@ -491,7 +491,7 @@ Each has a corresponding upstream PR:
|
|
|
491
491
|
Upstream PR: [tintinweb/pi-subagents#71](https://github.com/tintinweb/pi-subagents/pull/71).
|
|
492
492
|
2. **Post-`bindExtensions` active-tool re-filter** (`src/agent-runner.ts`) — `runAgent` re-runs its active-tool filter after `session.bindExtensions(...)` so the `EXCLUDED_TOOL_NAMES` recursion guard applies to extension-registered tools (which join the active set during `bindExtensions`).
|
|
493
493
|
Upstream PR: [tintinweb/pi-subagents#72](https://github.com/tintinweb/pi-subagents/pull/72).
|
|
494
|
-
3. **`<active_agent>` system-prompt tag** (`src/prompts.ts`) — `buildAgentPrompt`
|
|
494
|
+
3. **`<active_agent>` system-prompt tag** (`src/prompts.ts`) — `buildAgentPrompt` includes `<active_agent name="${config.name}"/>` in every assembled child system prompt (both `replace` and `append` modes); the tag follows the cacheable parent-prompt prefix in both modes.
|
|
495
495
|
Downstream extensions like [`@gotgenes/pi-permission-system`](https://github.com/gotgenes/pi-permission-system) parse this tag to resolve per-agent `permission:` frontmatter inside the child session.
|
|
496
496
|
Upstream PR: [tintinweb/pi-subagents#73](https://github.com/tintinweb/pi-subagents/pull/73).
|
|
497
497
|
4. **Child-execution lifecycle events** (`src/lifecycle/child-lifecycle.ts`) — the child-session execution lifecycle is published as ordered events on `pi.events` (`subagents:child:spawning`, `session-created`, `completed`, `disposed`).
|
|
package/docs/architecture/architecture.md
CHANGED
|
@@ -10,8 +10,9 @@ This document describes the architecture of the pi-subagents fork: a focused, co
|
|
|
10
10
|
3. **Typed API boundary** — this package exports a `SubagentsService` interface and `Symbol.for()` accessors (`publishSubagentsService` / `getSubagentsService`).
|
|
11
11
|
Consumers declare this package as an optional peer dependency and use dynamic import for compile-time types.
|
|
12
12
|
The runtime bridge is `Symbol.for("@gotgenes/pi-subagents:service")` on `globalThis` — no separate API package.
|
|
13
|
-
4. **No scheduling** —
|
|
14
|
-
|
|
13
|
+
4. **No time-based scheduling** — cron-style timed dispatch (upstream's `schedule.ts` subsystem) is removed from the core (#52).
|
|
14
|
+
Timed dispatch is a separate concern that any extension can implement by calling `spawn()` on the published API.
|
|
15
|
+
The max-concurrent admission gate is not scheduling in this sense — concurrency management stays in core.
|
|
15
16
|
5. **UI extraction is deferred** — the widget, conversation viewer, and `/agents` command menu stay in the core for now.
|
|
16
17
|
They are the first candidate for extraction once the API boundary is proven stable.
|
|
17
18
|
6. **Snapshot, don't capture** — mutable parent state (ctx, session, model) is read once at spawn time and frozen into a `ParentSnapshot` data object.
|
|
@@ -491,6 +492,10 @@ The governing rule — **no vacant hooks**: the architecture must _admit_ a seam
|
|
|
491
492
|
A provider seam with no consumer is a speculative abstraction that taxes every reader and that `fallow` flags as dead.
|
|
492
493
|
Latent extensibility is the deliverable; a vacant hook is not.
|
|
493
494
|
|
|
495
|
+
The [first-principles refinement](#first-principles-refinement-the-deeper-target) below sharpens this two-surface split.
|
|
496
|
+
The awaited, behavior-affecting lifecycle events (notably `session-created` before `bindExtensions`) are _hooks_ — the child's own extension surface applied recursively, generative because the core waits on the handler before deciding what to do next.
|
|
497
|
+
The observational surface then carries only fire-and-forget broadcasts of immutable snapshots, which no consumer can use to change the core.
|
|
498
|
+
|
|
494
499
|
### Core responsibilities (keep)
|
|
495
500
|
|
|
496
501
|
- **Agent definitions** — name, model, thinking, system prompt, tools list.
|
|
@@ -521,12 +526,90 @@ In the target state, pi-subagents publishes events and a provider seam; other pa
|
|
|
521
526
|
|
|
522
527
|
- **pi-permission-system** (observational) subscribes to child-session lifecycle events, detects subagent execution context in the child, and gates tool calls at runtime.
|
|
523
528
|
- **pi-subagents-worktrees** (generative) registers a `WorkspaceProvider` that prepares a git worktree at run-start and tears it down after, supplying the child's cwd.
|
|
524
|
-
- **pi-subagents-ui** (future) subscribes to the
|
|
529
|
+
- **pi-subagents-ui** (future, under reconsideration — see the [first-principles refinement](#first-principles-refinement-the-deeper-target)) subscribes to the broadcast and the query/behavior interfaces; whether the inherited widget, conversation viewer, and `/agents` menu survive is judged on our principles, not preserved by default.
|
|
525
530
|
- **Any future extension** (OTel, auditing, cost tracking) subscribes to the same events without pi-subagents knowing.
|
|
526
531
|
|
|
527
532
|
Composition test: install neither extension, only permissions, only workspaces, or both — the core is byte-for-byte identical in all four cases, and the two extensions never reference each other.
|
|
528
533
|
|
|
529
|
-
This is achieved across phases: Phase 14 (strip policy), Phase 16 (invert dependencies — extensions on a minimal core), and Phase
|
|
534
|
+
This is achieved across phases: Phase 14 (strip policy), Phase 16 (invert dependencies — extensions on a minimal core), and Phase 18 (reconsider UI).
|
|
535
|
+
|
|
536
|
+
### First-principles refinement (the deeper target)
|
|
537
|
+
|
|
538
|
+
The two-surface model above is correct but coarse.
|
|
539
|
+
Pushing it against our own principles — construct complete, state owns its mutations, tell-don't-ask, dependency inversion — surfaces sharper boundaries that the current code draws through the middle of classes.
|
|
540
|
+
This subsection records the deeper target; the steps that realize it are sequenced in later phases.
|
|
541
|
+
|
|
542
|
+
#### `Subagent` is four conflated domains
|
|
543
|
+
|
|
544
|
+
The construction duality that motivates Phase 17 (a class that is simultaneously a passive record and an executor) covers only the two most visible of four domains fused into one class.
|
|
545
|
+
Pulling each apart by asking "who changes this, how often, and who needs to know" surfaces:
|
|
546
|
+
|
|
547
|
+
1. **Lifecycle state** — status, result, error, timestamps.
|
|
548
|
+
Owned by the subagent; transitions are rare and meaningful; the right outward shape is an immutable snapshot announced on change.
|
|
549
|
+
2. **Metrics** — tool uses, token usage, compaction count.
|
|
550
|
+
These are not lifecycle state; they are a projection aggregated over the child session's event stream.
|
|
551
|
+
`record-observer` already computes them — its only error is writing the aggregate back onto the subagent.
|
|
552
|
+
3. **The hook surface** — the points where an extension alters or augments the child before and around its run.
|
|
553
|
+
This is the child session's own extension binding (see below), not data on the subagent.
|
|
554
|
+
4. **Result delivery** — whether the parent has consumed the result, when to nudge, how the result reaches the caller.
|
|
555
|
+
The homeless `notification.resultConsumed` field belongs to this domain, not to execution.
|
|
556
|
+
|
|
557
|
+
The ~20 optional constructor fields and the runtime `run()` throws are the pressure these four domains exert on one class.
|
|
558
|
+
Separating them is what makes the Phase 17 steps fall out rather than fight back.
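To illustrate the second domain, here is a hedged sketch of metrics computed as a projection over an event stream rather than stored on the subagent. The event shapes and the names `SessionEvent`, `Metrics`, and `projectMetrics` are assumptions for illustration; the real child-session events are not shown in this document.

```typescript
// Hypothetical event shapes; not the Pi SDK's actual event types.
type SessionEvent =
  | { type: "tool_use" }
  | { type: "usage"; inputTokens: number; outputTokens: number }
  | { type: "compaction" };

interface Metrics {
  readonly toolUses: number;
  readonly inputTokens: number;
  readonly outputTokens: number;
  readonly compactionCount: number;
}

const EMPTY_METRICS: Metrics = {
  toolUses: 0,
  inputTokens: 0,
  outputTokens: 0,
  compactionCount: 0,
};

// Pure fold step: each event yields a new immutable Metrics value.
function applyEvent(m: Metrics, e: SessionEvent): Metrics {
  switch (e.type) {
    case "tool_use":
      return { ...m, toolUses: m.toolUses + 1 };
    case "usage":
      return {
        ...m,
        inputTokens: m.inputTokens + e.inputTokens,
        outputTokens: m.outputTokens + e.outputTokens,
      };
    case "compaction":
      return { ...m, compactionCount: m.compactionCount + 1 };
  }
}

// The projection: metrics are derived from events, never written back.
function projectMetrics(events: readonly SessionEvent[]): Metrics {
  return events.reduce(applyEvent, EMPTY_METRICS);
}
```

Because the projection is a pure function of the events, an observer can keep a running value and expose it as a query without mutating the subagent.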
|
|
559
|
+
|
|
560
|
+
#### The subagent is a recursive Pi
|
|
561
|
+
|
|
562
|
+
A subagent is a child Pi session: created with `createAgentSession`, then `bindExtensions`.
|
|
563
|
+
Its extension surface is therefore Pi's extension surface applied recursively — not a bespoke event bus.
|
|
564
|
+
What the current doc calls "awaited, ordered lifecycle events" are not observations; they are **hooks**, structurally identical to Pi's own (`session_start`, `tool_execution_start`).
|
|
565
|
+
The tell is the awaiting: the core waits for the handler because the handler's completion changes what the core does next (for example, an extension registers itself before the child binds its extensions).
|
|
566
|
+
A handler that can change subsequent behavior is generative, not observational, whatever we name the channel.
|
|
567
|
+
|
|
568
|
+
This splits the current "lifecycle events" surface cleanly in two:
|
|
569
|
+
|
|
570
|
+
1. **Broadcast** (observational, fire-and-forget) — "this happened; react if you want; you cannot change anything."
|
|
571
|
+
Carries immutable snapshots for telemetry, notification, and any renderer.
|
|
572
|
+
No consumer holds a live `Subagent`.
|
|
573
|
+
2. **Hooks** (generative, awaited, ordered) — the recursive Pi extension surface where workspace, permissions, and future concerns attach to the child.
|
|
574
|
+
The `WorkspaceProvider` is one _typed_ hook; the general form is "be an extension of the child session."
|
|
575
|
+
|
|
576
|
+
The "no vacant hooks" rule still governs the generative side: admit the surface, ship a hook only when a real consumer exists.
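The split between the two surfaces can be sketched as follows. This is an illustration under stated assumptions: the class `Surfaces`, the `Snapshot` and `ChildContext` shapes, and all method names are hypothetical and do not come from the package.

```typescript
// Hypothetical types; names are illustrative, not the package's API.
interface Snapshot {
  readonly id: string;
  readonly status: string;
}

interface ChildContext {
  extensions: string[];
}

type Listener = (snapshot: Snapshot) => void;
type Hook = (ctx: ChildContext) => Promise<void>;

class Surfaces {
  private readonly listeners: Listener[] = [];
  private readonly hooks: Hook[] = [];

  subscribe(listener: Listener): void {
    this.listeners.push(listener);
  }

  registerHook(hook: Hook): void {
    this.hooks.push(hook);
  }

  // Broadcast: fire-and-forget. Listeners receive a frozen copy and their
  // failures are swallowed, so they cannot change what the core does.
  broadcast(snapshot: Snapshot): void {
    const frozen = Object.freeze({ ...snapshot });
    for (const listener of this.listeners) {
      try {
        listener(frozen);
      } catch {
        // Observers cannot affect the core.
      }
    }
  }

  // Hooks: awaited in registration order; each may alter the child context
  // before the core proceeds.
  async runHooks(ctx: ChildContext): Promise<void> {
    for (const hook of this.hooks) {
      await hook(ctx);
    }
  }
}
```

The design point is the return path: `broadcast` returns nothing the core depends on, while `runHooks` must settle before the core continues.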
|
|
577
|
+
|
|
578
|
+
#### Reactive versus discrete (not internal versus external)
|
|
579
|
+
|
|
580
|
+
The axis that decides push versus pull is whether a need is reactive or discrete — never whether the consumer is in-package or out.
|
|
581
|
+
|
|
582
|
+
- **Reactive** (ambient state that changes underneath you) → subscribe to the broadcast; be told.
|
|
583
|
+
The state-owner announces; the consumer maintains its own read-model; nobody pulls.
|
|
584
|
+
- **Discrete** (a one-shot question: current value, full transcript) → pull a query.
|
|
585
|
+
`get_subagent_result`, opening a transcript, and the external `SubagentsService.getRecord` are queries by nature and stay pull, in-package or not.
|
|
586
|
+
|
|
587
|
+
Behavior is a third interface: **tell by id, with outcomes**.
|
|
588
|
+
`steer` and `abort` own their own rules — a non-running agent rejects a steer from inside `steer`, not via a caller's status pre-check — so coordinators never ask-then-tell.
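A minimal sketch of "tell by id, with outcomes", assuming a hypothetical `Agents` registry and `SteerOutcome` type. Neither name comes from the package. The point is only that the status rule lives inside `steer`.

```typescript
// Hypothetical outcome type and registry; names are assumptions.
type SteerOutcome =
  | { ok: true }
  | { ok: false; reason: "not_found" | "not_running" };

type AgentStatus = "queued" | "running" | "completed";

interface AgentEntry {
  status: AgentStatus;
  inbox: string[];
}

class Agents {
  private readonly byId = new Map<string, AgentEntry>();

  add(id: string, status: AgentStatus): void {
    this.byId.set(id, { status, inbox: [] });
  }

  // The rule "only a running agent accepts a steer" is enforced here,
  // so callers tell and read the outcome instead of checking status first.
  steer(id: string, message: string): SteerOutcome {
    const agent = this.byId.get(id);
    if (!agent) return { ok: false, reason: "not_found" };
    if (agent.status !== "running") return { ok: false, reason: "not_running" };
    agent.inbox.push(message);
    return { ok: true };
  }
}
```

A coordinator then calls `steer` unconditionally and branches on the returned outcome, which removes the ask-then-tell window between a status check and the action.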
|
|
589
|
+
|
|
590
|
+
#### Consequences
|
|
591
|
+
|
|
592
|
+
Two consequences fall straight out, and both cut scope.
|
|
593
|
+
|
|
594
|
+
1. **The activity/metrics push tier is provisional.**
|
|
595
|
+
Its only reactive consumer is the inherited widget.
|
|
596
|
+
Treated from first principles, metrics are accumulated by an observer, exposed as a discrete query, and folded into the completion snapshot — so the high-frequency stream may not need to exist at all.
|
|
597
|
+
We do not contort the core's event design to feed an inherited consumer.
|
|
598
|
+
2. **Phase 18 is "reconsider the UI," not "extract the UI."**
|
|
599
|
+
The widget and `/agents` menu predate the fork; they are consumers to be judged on our principles, not requirements to preserve.
|
|
600
|
+
If a UI survives, it survives as a reactive consumer of the broadcast and a caller of the query/behavior interfaces — built on our terms, possibly smaller, possibly removed.
|
|
601
|
+
|
|
602
|
+
#### Sibling packages follow the same discipline
|
|
603
|
+
|
|
604
|
+
`@gotgenes/pi-permission-system` is one of these hooks, and it is subject to the same scrutiny.
|
|
605
|
+
Its boundaries deserve the same first-principles treatment: surface its conflated domains, distinguish what it observes from what it injects, and prefer being told over asking.
|
|
606
|
+
The recursion principle means a consumer's internal design is not exempt because it lives in another package — the same axes (reactive versus discrete, hook versus broadcast, construct complete) apply across the seam.
|
|
607
|
+
|
|
608
|
+
#### How we find these boundaries
|
|
609
|
+
|
|
610
|
+
The boundaries above were not deduced top-down; they were surfaced by friction.
|
|
611
|
+
Each place the target got _harder_ to test marked a domain seam drawn through the middle of a class.
|
|
612
|
+
That method — testability friction as a boundary probe, with its limits — is recorded in the `improvement-discovery` skill so it outlives this phase.
|
|
530
613
|
|
|
531
614
|
## Current structural analysis
|
|
532
615
|
|
|
@@ -765,16 +848,175 @@ All five steps are closed: [#261], [#262], [#263], [#264], [#265].
|
|
|
765
848
|
The earlier "agent collaborator architecture" framing (#256 superseded, #257 parked, #258 and #259 closed not-planned) was abandoned; its structural win was reached cleanly via the workspace seam.
|
|
766
849
|
See [phase-16-invert-dependencies.md](history/phase-16-invert-dependencies.md) for details.
|
|
767
850
|
|
|
768
|
-
## Improvement roadmap (Phase 17 —
|
|
851
|
+
## Improvement roadmap (Phase 17 — core consolidation)
|
|
852
|
+
|
|
853
|
+
Phase 17 consolidates the core's remaining structural debt before the UI reconsideration (now Phase 18).
|
|
854
|
+
The findings come from the standard discovery pass — fallow suite, entry-point trace, design-review checklist, and test-constructibility audit — run after Phase 16 landed.
|
|
855
|
+
|
|
856
|
+
Phase 17 is the consolidation slice of the [first-principles refinement](#first-principles-refinement-the-deeper-target), not the full domain split.
|
|
857
|
+
It lands the first cut of the lifecycle-state domain (Step 2's `SubagentState`) plus the wiring, queue, and duplication cleanups.
|
|
858
|
+
The fuller four-domain split — metrics as a projection, result delivery as its own domain, the hook/broadcast reclassification, and the push/pull (DIP) inversion — is recorded in the refinement and sequenced into later phases.
|
|
859
|
+
|
|
860
|
+
### Findings summary
|
|
861
|
+
|
|
862
|
+
Updated health metrics (fallow, package-wide including tests):
|
|
863
|
+
|
|
864
|
+
| Metric | Phase 16 baseline | Current |
|
|
865
|
+
| -------------------------- | ------------------------------ | --------------------------------------------- |
|
|
866
|
+
| Health score | 78/100 (B) | 78/100 (B) |
|
|
867
|
+
| Source LOC | 7,778 (57 files) | ~7,400 (56 files) |
|
|
868
|
+
| Dead code | 0 files, 0 exports | 0 files, 0 exports |
|
|
869
|
+
| Maintainability index | 90.8 (good) | 90.8 (good) |
|
|
870
|
+
| Avg / P90 cyclomatic | 1.4 / 2 | 1.4 / 2 |
|
|
871
|
+
| Production duplication | 11 lines (1 internal group) | 34 lines (1 internal + 1 cross-package group) |
|
|
872
|
+
| Test duplication | 42 groups, 661 lines | 44 groups, ~750 lines |
|
|
873
|
+
| Fallow refactoring targets | 0 | 0 |
|
|
874
|
+
| Top churn hotspot | `index.ts` 65.0 ▲ accelerating | `index.ts` 31.3 ▼ cooling |
|
|
875
|
+
|
|
876
|
+
The syntactic metrics are healthy and stable — the remaining debt is structural, mostly invisible to fallow, and concentrated in three places:
|
|
877
|
+
|
|
878
|
+
1. **`Subagent` construction duality.**
|
|
879
|
+
`SubagentInit` carries ~20 fields, nearly all optional with "required for run(), optional for tests" semantics, and `run()` compensates with runtime throws ("not configured for execution").
|
|
880
|
+
This violates principle 8 (construct complete): the class is simultaneously a passive record (tests build display-only snapshots) and an executor (production wires factory, observer, run config, workspace provider).
|
|
881
|
+
The symptoms are in the tests: external writes `record.promise = …` (manager, queue callback, four test files) and `record.notification = new NotificationState(…)` (seven test sites) are output-argument smells on fields the object should own.
|
|
882
|
+
This duality reflects the two most visible of four domains fused into `Subagent`; Phase 17 resolves it (Step 2) and defers the remaining split (metrics, result delivery) to a later phase per the [first-principles refinement](#first-principles-refinement-the-deeper-target).
|
|
883
|
+
2. **Wiring debt in `index.ts`.**
|
|
884
|
+
Two forward references (settings → queue, queue → manager) are replicated with an `eslint-disable prefer-const` dance in `test/lifecycle/subagent-manager.test.ts`; the queue's start callback (`record.promise = record.run()` after a status check) is duplicated verbatim between `index.ts` and the test helper.
|
|
885
|
+
A ~70-line inline `SubagentManagerObserver` literal mixes three concerns (event emission, `appendEntry` persistence, notification dispatch).
|
|
886
|
+
`runtime.widget` is assigned post-construction behind five relay-only delegation methods on `SubagentRuntime`.
|
|
887
|
+
3. **Duplication.**
|
|
888
|
+
A 23-line cross-package production clone (`settings.ts:198-211` ↔ `pi-subagents-worktrees/src/config.ts:51-73`: the layered global/project settings-file loader) and 44 test clone groups (~750 lines), with clone families concentrated in `test/lifecycle/` and `test/ui/`.
|
|
889
|
+
|
|
890
|
+
Deferred findings (scored below the priority cut, tracked here rather than as steps): the `resolveModel` error-as-string union return (callers branch on `typeof resolved === "string"`), the file-top SDK `eslint-disable` headers in 14 files (re-audit when the Pi SDK exports improve), missing unit tests for `observation/renderer.ts` (the top CRAP-risk file), and the 11-line internal clone in `ui/agent-config-editor.ts` (folds into the Phase 18 UI reconsideration).
|
|
891
|
+
|
|
892
|
+
### Steps
|
|
893
|
+
|
|
894
|
+
Priority = Impact × (6 − Risk).
|
|
895
|
+
|
|
896
|
+
| Step | Title | Category | Impact | Risk | Priority |
|
|
897
|
+
| ---- | ------------------------------------------------------------------------------------ | -------- | ------ | ---- | -------- |
|
|
898
|
+
| 1 | Replace ConcurrencyQueue with a thunk-based ConcurrencyLimiter | A/C | 4 | 2 | 16 |
|
|
899
|
+
| 2 | Extract `SubagentState`; make `Subagent` execution deps mandatory | B/D | 4 | 3 | 12 |
|
|
900
|
+
| 3 | Encapsulate run start and notification attachment on Subagent | C | 3 | 2 | 12 |
|
|
901
|
+
| 4 | Extract run-listener and workspace-bracket collaborators from Subagent | B/C | 3 | 2 | 12 |
|
|
902
|
+
| 5 | Extract the manager observer from index.ts into a class | B/E | 3 | 2 | 12 |
|
|
903
|
+
| 6 | Split widget delegation out of SubagentRuntime | C | 3 | 3 | 9 |
|
|
904
|
+
| 7 | Consolidate lifecycle test fixtures | D | 3 | 1 | 15 |
|
|
905
|
+
| 8 | Consolidate UI and tools test fixtures | D | 2 | 1 | 10 |
|
|
906
|
+
| 9 | Resolve the cross-package settings-loader duplication | A | 2 | 2 | 8 |
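The priority column can be reproduced with a few lines. The `Step` interface and function name are illustrative; the sample rows are taken from the table above.

```typescript
interface Step {
  step: number;
  impact: number;
  risk: number;
}

// Priority = Impact × (6 − Risk): high impact and low risk rank first.
function priority(s: Step): number {
  return s.impact * (6 - s.risk);
}

// A subset of the rows from the table above.
const steps: Step[] = [
  { step: 1, impact: 4, risk: 2 },
  { step: 6, impact: 3, risk: 3 },
  { step: 7, impact: 3, risk: 1 },
  { step: 9, impact: 2, risk: 2 },
];

// Sort a copy by descending priority.
const ranked = [...steps].sort((a, b) => priority(b) - priority(a));
// ranked step order: 1 (16), 7 (15), 6 (9), 9 (8)
```

Note that the table lists steps in execution order, not priority order, so a low-risk fixture cleanup such as Step 7 scores higher than several steps that precede it.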
|
|
907
|
+
|
|
908
|
+
#### Step 1 — Replace ConcurrencyQueue with a thunk-based ConcurrencyLimiter ([#381])
|
|
909
|
+
|
|
910
|
+
- Targets: `src/lifecycle/concurrency-queue.ts` (→ `concurrency-limiter.ts`), `src/lifecycle/subagent-manager.ts`, `src/index.ts`, `test/lifecycle/concurrency-queue.test.ts`, `test/lifecycle/subagent-manager.test.ts`.
|
|
911
|
+
- Smell: Category C (forward references: the queue's ID-registry design forces a start callback that reaches back into the manager, duplicated between `index.ts` and the test helper) and Category A (dual counting: the queue's `running` counter is fed by `markStarted`/`markFinished` relays in the manager's observer, mirroring state the agents already carry).
|
|
912
|
+
- Change: replace the ID-registry queue with a `ConcurrencyLimiter` that schedules thunks FIFO against a dynamic `getLimit()` — the injected limiter knows nothing about agents, IDs, or the manager.
|
|
913
|
+
Spawn gates background runs with `limiter.schedule(() => record.run())` (the thunk guards on `queued` status, covering abort-while-queued; Step 3 later folds the guard into `Subagent.start()`); foreground and `bypassQueue` runs invoke directly.
|
|
914
|
+
The settings `onMaxConcurrentChanged` hook wires to `limiter.recheck()` in `index.ts`; `dispose()` calls `limiter.clear()` to drop pending thunks.
|
|
915
|
+
- Outcome: dependency direction is strictly manager → limiter (no callback back-edge; the `prefer-const` eslint-disable in the test helper is deleted); the observer's two queue relays are gone; every spawned agent has a `promise` at spawn, collapsing `waitForAll`'s `while (true)` drain loop and its eslint-disable.
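A thunk-based limiter along these lines could look like the sketch below. Only `schedule`, `getLimit`, `recheck`, and `clear` are named in the step; the internals, the `void` return of `schedule`, and the error handling are assumptions of this sketch, not the package's implementation.

```typescript
type Thunk = () => Promise<unknown>;

class ConcurrencyLimiter {
  private readonly pending: Thunk[] = [];
  private readonly getLimit: () => number;
  private running = 0;

  // getLimit is read on every recheck, so the limit can change at runtime.
  constructor(getLimit: () => number) {
    this.getLimit = getLimit;
  }

  // Queue a thunk FIFO and start it as soon as a slot is free.
  schedule(thunk: Thunk): void {
    this.pending.push(thunk);
    this.recheck();
  }

  // Start pending thunks while under the current limit. Call this again
  // whenever the limit may have changed.
  recheck(): void {
    while (this.running < this.getLimit()) {
      const thunk = this.pending.shift();
      if (!thunk) break;
      this.running += 1;
      const done = (): void => {
        this.running -= 1;
        this.recheck();
      };
      // The executor runs synchronously, so the thunk starts immediately;
      // a synchronous throw becomes a rejection and still frees the slot.
      new Promise<unknown>((resolve) => resolve(thunk())).then(done, done);
    }
  }

  // Drop thunks that have not started; running ones are unaffected.
  clear(): void {
    this.pending.length = 0;
  }
}
```

The limiter holds only functions, so it needs no knowledge of agents, IDs, or the manager. A caller would gate a background run with something like `limiter.schedule(() => record.run())`, as the step describes.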
|
|
916
|
+
|
|
917
|
+
#### Step 2 — Extract `SubagentState`; make `Subagent` execution deps mandatory ([#373])
|
|
918
|
+
|
|
919
|
+
- Targets: `src/lifecycle/subagent.ts` (state fields, transition/accumulation methods, constructor, `run()` guards), `src/lifecycle/subagent-manager.ts` (`spawn`), `test/helpers/make-subagent.ts`, `test/lifecycle/subagent.test.ts`, `test/observation/record-observer.test.ts`.
|
|
920
|
+
- Smell: Category B (god interface — ~20 fields) and Category D (constructibility: "optional for tests" fields with compensating runtime throws).
|
|
921
|
+
The record/executor duality reflects the two most visible of the four conflated domains (see [First-principles refinement](#first-principles-refinement-the-deeper-target)).
|
|
922
|
+
- Change: extract the passive-record state — status, result, error, timestamps, and the stats (toolUses, lifetimeUsage, compactionCount) — into a `SubagentState` value object that owns the transition and accumulation methods.
|
|
923
|
+
`Subagent` holds one privately; its existing getters and `markX`/`incrementX`/`addUsage` methods become one-line delegations, so the ~40 read sites and the mutation callers are unchanged.
|
|
924
|
+
This is not reach-through: `SubagentState` is a private owned value, not a foreign collaborator (contrast [#277], which removed reach-through to the raw SDK session).
|
|
925
|
+
With the readable state extracted, the remaining execution inputs (snapshot, prompt, model, maxTurns, thinkingLevel, parentSession, signal, createSubagentSession, observer, getRunConfig, getWorkspaceProvider, baseCwd) collapse into a single **mandatory** `SubagentExecution` collaborator: production always supplies it (the one `spawn()` site), the passive-record construction moves entirely into `make-subagent.ts`, and `run()`'s two "not configured" throws vanish by construction.
|
|
926
|
+
- Outcome: state-machine and observer tests target `SubagentState` directly (no stub execution); `Subagent` is construct-complete with no optional execution fields and no runtime throws (grep-verifiable: no "not configured for execution" in `subagent.ts`); the record-vs-executor duality is resolved, not type-encoded.
|
|
927
|
+
- Scope boundary: stats stay on `SubagentState` for now.
|
|
928
|
+
Hoisting **metrics** into a projection over the child session's event stream and extracting **result delivery** (`notification`/`resultConsumed`) into its own domain are the remaining two of the four domains, deferred to a later phase per the refinement.
|
|
929
|
+
- The issue ([#373]) is filed under the prior "decompose `SubagentInit` into present-or-absent bags" framing; update its description to this stronger target before implementation.
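
A minimal sketch of the Step 2 shape, under stated assumptions: the status values, the string-typed result and error, and the exact method names are illustrative and are not taken from `subagent.ts`. Only `SubagentState`, `Subagent`, `toolUses`, and the `markX`/`incrementX` naming pattern come from the plan above.

```typescript
// Hypothetical sketch: status values and method names are assumptions.
type SubagentStatus = "queued" | "running" | "completed" | "failed";

/** Passive-record state: owns its transitions and its counters. */
class SubagentState {
  private _status: SubagentStatus = "queued";
  private _result: string | undefined;
  private _error: string | undefined;
  private _toolUses = 0;

  get status(): SubagentStatus { return this._status; }
  get result(): string | undefined { return this._result; }
  get error(): string | undefined { return this._error; }
  get toolUses(): number { return this._toolUses; }

  markRunning(): void { this._status = "running"; }
  markCompleted(result: string): void {
    this._status = "completed";
    this._result = result;
  }
  markFailed(error: string): void {
    this._status = "failed";
    this._error = error;
  }
  incrementToolUses(): void { this._toolUses += 1; }
}

/** The executor keeps its public surface; each member is a one-line delegation. */
class Subagent {
  private readonly state = new SubagentState();

  get status(): SubagentStatus { return this.state.status; }
  get result(): string | undefined { return this.state.result; }
  get toolUses(): number { return this.state.toolUses; }

  markRunning(): void { this.state.markRunning(); }
  markCompleted(result: string): void { this.state.markCompleted(result); }
  incrementToolUses(): void { this.state.incrementToolUses(); }
}
```

Because the state object is privately owned, callers of `Subagent` see no change, while state-machine tests can construct `SubagentState` on its own without any execution stub.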
|
|
930
|
+
|
|
931
|
+
#### Step 3 — Encapsulate run start and notification attachment on Subagent ([#374])
|
|
932
|
+
|
|
933
|
+
- Targets: `src/lifecycle/subagent.ts`, `src/lifecycle/subagent-manager.ts`, `test/tools/get-result-tool.test.ts`, `test/lifecycle/subagent-manager.test.ts`, `test/service/service-adapter.test.ts`, `test/observation/notification.test.ts`, `test/helpers/make-subagent.ts`.
|
|
934
|
+
- Smell: Category C — output arguments: external writes to `record.promise` (3 production/test sites) and `record.notification` (7 test sites).
|
|
935
|
+
- Change: add `Subagent.start()` that runs and stores its own promise (plus an awaitable accessor for `spawnAndWait`/`waitForAll`); make `promise` and `notification` externally read-only; tests attach notification state through `SubagentExecution.parentSession.toolCallId` or a dedicated options field.
|
|
936
|
+
- Outcome: zero external writes to `Subagent` fields outside its own methods (grep-verifiable: `\.promise =` and `\.notification =` appear only inside `subagent.ts`).
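
One way to picture the Step 3 encapsulation is an object that creates and stores its own promise, so no caller ever assigns to it. This is a sketch under assumptions: `StartableRun`, its `execute` callback, and the idempotent `start()` behavior are hypothetical and are not the actual `Subagent.start()` signature.

```typescript
// Hypothetical sketch: the class name and the idempotent start() are assumptions.
class StartableRun<T> {
  private readonly execute: () => Promise<T>;
  private started: Promise<T> | undefined;

  constructor(execute: () => Promise<T>) {
    this.execute = execute;
  }

  /** Runs at most once and stores the promise internally. */
  start(): Promise<T> {
    this.started ??= this.execute();
    return this.started;
  }

  /** Read-only view for callers that want to await a run started elsewhere. */
  get promise(): Promise<T> | undefined {
    return this.started;
  }
}
```

With this shape the only write to the stored promise sits inside the class, which is the property the grep check in the outcome is meant to verify.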
|
|
937
|
+
|
|
938
|
+
#### Step 4 — Extract run-listener and workspace-bracket collaborators from Subagent ([#375])
|
|
939
|
+
|
|
940
|
+
- Targets: `src/lifecycle/subagent.ts` (533 LOC — largest source file, accelerating churn).
|
|
941
|
+
- Smell: Category B (oversized class; per-run listener fields declared mid-class) and Category C (state owns its mutations: workspace dispose logic appears in `run()`'s catch, `completeRun`, and `failRun`).
|
|
942
|
+
- Change: extract a `RunListeners` object owning the observer-unsubscribe and signal-detach handles (`attach`/`release`), and a workspace-bracket collaborator owning prepare/dispose-with-addendum, so the three dispose paths collapse into one.
|
|
943
|
+
- Outcome: `subagent.ts` ≤ 450 LOC; workspace disposal logic in exactly one place; listener handles no longer raw nullable fields.
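
The `RunListeners` idea in Step 4 can be sketched as a small holder for detach callbacks. The plan names only the class and its `attach`/`release` pair; the callback-based signature below is an assumption, not the real API.

```typescript
// Hypothetical sketch: the Detach type and the variadic attach() are assumptions.
type Detach = () => void;

class RunListeners {
  private handles: Detach[] = [];

  /** Register the handles produced when subscribing (observer unsubscribe, signal detach). */
  attach(...handles: Detach[]): void {
    this.handles.push(...handles);
  }

  /** Detach everything once; a second call is a no-op. */
  release(): void {
    const pending = this.handles;
    this.handles = [];
    for (const detach of pending) {
      detach();
    }
  }
}
```

Swapping the array out before iterating keeps `release()` safe to call from several exit paths (success, failure, abort) without double-detaching.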
|
|
944
|
+
|
|
945
|
+
#### Step 5 — Extract the manager observer from index.ts into a class ([#376])
|
|
946
|
+
|
|
947
|
+
- Targets: `src/index.ts` (inline `SubagentManagerObserver` literal, ~70 lines), new module under `src/observation/`.
|
|
948
|
+
- Smell: Category B/E — `index.ts` is the dominant churn hotspot (31.3, 91 commits); the literal mixes event emission, record persistence (`appendEntry`), and notification dispatch; principle 9 (state and behavior belong in classes, not closure-captured literals).
|
|
949
|
+
- Change: extract a class (e.g. `SubagentEventsObserver`) constructed with narrow deps (`emit`, `appendEntry`, the `NotificationSystem`).
|
|
950
|
+
- Outcome: `index.ts` < 170 lines; the observer's three concerns unit-tested directly without booting the extension.
|
|
951
|
+
|
|
952
|
+
#### Step 6 — Split widget delegation out of SubagentRuntime ([#377])
|
|
953
|
+
|
|
954
|
+
- Targets: `src/runtime.ts`, `src/tools/agent-tool.ts` (`AgentToolRuntime`), `src/tools/foreground-runner.ts`, `src/tools/background-spawner.ts`, `src/observation/notification.ts` (`NotificationManager` constructor), `src/index.ts`.
|
|
955
|
+
- Smell: Category C — relay-only dependency (five delegation methods that only forward to `widget`) and a post-construction `runtime.widget =` write violating principle 8.
|
|
956
|
+
- Change: pass the existing `WidgetLike` handle directly to the consumers that need it (tool deps, `NotificationManager`) and construct the widget before them; remove the `widget` field and the five relay methods from `SubagentRuntime`.
|
|
957
|
+
- Outcome: `SubagentRuntime` has zero widget knowledge; no post-construction field writes in `index.ts`; tool fixtures stub a 5-method `WidgetLike` instead of widget methods on the runtime mock.
|
|
958
|
+
|
|
959
|
+
#### Step 7 — Consolidate lifecycle test fixtures ([#378])
|
|
960
|
+
|
|
961
|
+
- Targets: `test/lifecycle/subagent-manager.test.ts` (766 LOC), `test/lifecycle/subagent.test.ts`, `test/lifecycle/subagent-session.test.ts`, `test/lifecycle/create-subagent-session.test.ts`, `test/lifecycle/create-subagent-session-extension-tools.test.ts`, `test/lifecycle/concurrency-queue.test.ts`, `test/helpers/`.
|
|
962
|
+
- Smell: Category D — fallow reports five clone families across the lifecycle tests.
|
|
963
|
+
- Change: extract the repeated spawn/run/factory arrangements into shared helpers, migrating incrementally (lift-and-shift, never a single-step rewrite of a large test file).
|
|
964
|
+
- Outcome: lifecycle clone families 5 → ≤ 1; package test duplication below 600 lines.
|
|
965
|
+
|
|
966
|
+
#### Step 8 — Consolidate UI and tools test fixtures ([#379])
|
|
967
|
+
|
|
968
|
+
- Targets: `test/ui/agent-creation-wizard.test.ts`, `test/ui/agent-config-editor.test.ts`, `test/ui/ui-observer.test.ts`, `test/tools/foreground-runner.test.ts`, `test/tools/background-spawner.test.ts`, `test/session/session-config.test.ts`.
|
|
969
|
+
- Smell: Category D — remaining clone families outside the lifecycle tree.
|
|
970
|
+
- Change: extract per-file repeated arrangements into local helpers or `test/helpers/` where shared across files.
|
|
971
|
+
- Outcome: package clone groups 44 → ≤ 25; overall duplication ≤ 0.6%.
|
|
972
|
+
|
|
973
|
+
#### Step 9 — Resolve the cross-package settings-loader duplication ([#380])
|
|
974
|
+
|
|
975
|
+
- Targets: `src/settings.ts:198-211`, `packages/pi-subagents-worktrees/src/config.ts:51-73`.
|
|
976
|
+
- Smell: Category A — 23-line production clone: the layered global/project JSON read-sanitize-warn-merge loader.
|
|
977
|
+
- Change: decide explicitly between (a) exporting a small `loadLayeredSettings` helper from pi-subagents' public surface for worktrees to consume, and (b) documenting the duplication as intentional (separate release cadences, registry-resolved dependency) with a recorded fallow suppression.
|
|
978
|
+
The issue weighs the public-API cost (type bundle, `verify:public-types`, docs for third-party authors) against living with the flag.
|
|
979
|
+
- Outcome: `pnpm fallow:dupes` no longer reports the pair, via extraction or recorded suppression.
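
For option (a), a shared helper might look roughly like the sketch below. Only the name `loadLayeredSettings` and the read-sanitize-warn-merge behavior come from the plan; the signature, the `SettingsLayer` type, and the choice to take already-read JSON text instead of file paths are assumptions made to keep the example self-contained.

```typescript
// Hypothetical sketch: signature and types are assumptions, not the package API.
interface SettingsLayer {
  label: string;            // e.g. "global" or "project"
  raw: string | undefined;  // file contents, or undefined when the file is absent
}

function loadLayeredSettings<T extends object>(
  defaults: T,
  layers: SettingsLayer[],
  sanitize: (parsed: unknown) => Partial<T>,
  warn: (message: string) => void,
): T {
  let merged: T = { ...defaults };
  for (const layer of layers) {
    if (layer.raw === undefined) continue; // absent layer: skip silently
    try {
      // Later layers override earlier ones; sanitize drops unknown keys.
      merged = { ...merged, ...sanitize(JSON.parse(layer.raw)) };
    } catch {
      warn(`Ignoring invalid ${layer.label} settings`);
    }
  }
  return merged;
}

// Illustrative settings shape, not the real one.
interface DemoSettings {
  maxConcurrent: number;
  verbose: boolean;
}

function sanitizeDemo(parsed: unknown): Partial<DemoSettings> {
  const out: Partial<DemoSettings> = {};
  if (typeof parsed !== "object" || parsed === null) return out;
  const record = parsed as Record<string, unknown>;
  const maxConcurrent = record["maxConcurrent"];
  const verbose = record["verbose"];
  if (typeof maxConcurrent === "number") out.maxConcurrent = maxConcurrent;
  if (typeof verbose === "boolean") out.verbose = verbose;
  return out;
}
```

Passing `sanitize` and `warn` in as parameters is what would let both packages share the loop while keeping their own schemas and logging.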
|
|
980
|
+
|
|
981
|
+
### Step dependencies
|
|
982
|
+
|
|
983
|
+
```mermaid
|
|
984
|
+
flowchart TB
|
|
985
|
+
S1["Step 1 (#381)<br/>ConcurrencyLimiter replacement"]
|
|
986
|
+
S2["Step 2 (#373)<br/>SubagentState extraction"]
|
|
987
|
+
S3["Step 3 (#374)<br/>Encapsulate start + notification"]
|
|
988
|
+
S4["Step 4 (#375)<br/>Run collaborators extraction"]
|
|
989
|
+
S5["Step 5 (#376)<br/>Observer class from index.ts"]
|
|
990
|
+
S6["Step 6 (#377)<br/>Widget handle out of runtime"]
|
|
991
|
+
S7["Step 7 (#378)<br/>Lifecycle test fixtures"]
|
|
992
|
+
S8["Step 8 (#379)<br/>UI/tools test fixtures"]
|
|
993
|
+
S9["Step 9 (#380)<br/>Settings-loader duplication"]
|
|
994
|
+
|
|
995
|
+
S1 --> S3
|
|
996
|
+
S2 --> S3
|
|
997
|
+
S3 --> S4
|
|
998
|
+
S4 --> S7
|
|
999
|
+
S5 --> S6
|
|
1000
|
+
```
|
|
1001
|
+
|
|
1002
|
+
Steps 8 and 9 neither depend on nor block any other step, so they can run at any point.
|
|
1003
|
+
|
|
1004
|
+
### Tracks
|
|
1005
|
+
|
|
1006
|
+
| Track | Steps | Theme |
|
|
1007
|
+
| ----------------------------- | ------------- | ---------------------------------------------------------------------------------- |
|
|
1008
|
+
| A — Subagent constructibility | 2 → 3 → 4 → 7 | Construct complete; encapsulate run state; then consolidate the tests that churned |
|
|
1009
|
+
| B — Wiring debt | 1, 5 → 6 | Shrink index.ts; eliminate forward references and relay delegation |
|
|
1010
|
+
| C — Test hygiene | 8 | Clone families outside the lifecycle tree |
|
|
1011
|
+
| D — Duplication policy | 9 | Cross-package clone decision |
|
|
769
1012
|
|
|
770
|
-
|
|
771
|
-
|
|
772
|
-
By this point the core is minimal and stable — the API boundary has been proven across Phases 14–16.
|
|
1013
|
+
Tracks A and B intersect only at Step 3 (which needs Step 1's queue relocation); otherwise they proceed in parallel.
|
|
1014
|
+
Tracks C and D are fully independent.
|
|
773
1015
|
|
|
774
1016
|
## Refactoring history
|
|
775
1017
|
|
|
776
1018
|
Phases 1–5, 7–16 are complete.
|
|
777
|
-
Phase 6 (UI extraction to a separate package) is deferred → Phase
|
|
1019
|
+
Phase 6 (UI extraction to a separate package) is deferred → Phase 18.
|
|
778
1020
|
Detailed records are preserved in per-phase history files:
|
|
779
1021
|
|
|
780
1022
|
| Phase | Title | Status | History |
|
|
@@ -784,7 +1026,7 @@ Detailed records are preserved in per-phase history files:
|
|
|
784
1026
|
| 3 | Remove group-join, RPC; replace output-file | Complete | [phase-3-remove-rpc-groupjoin.md](history/phase-3-remove-rpc-groupjoin.md) |
|
|
785
1027
|
| 4 | Implement and publish SubagentsService | Complete | [phase-4-implement-service.md](history/phase-4-implement-service.md) |
|
|
786
1028
|
| 5 | Decompose index.ts | Complete | [phase-5-decompose-index.md](history/phase-5-decompose-index.md) |
|
|
787
|
-
| 6 | Extract UI to separate package | Deferred → Phase
|
|
1029
|
+
| 6 | Extract UI to separate package | Deferred → Phase 18 | — |
|
|
788
1030
|
| 7 | Encapsulation and dependency narrowing | Complete | [phase-7-encapsulation.md](history/phase-7-encapsulation.md) |
|
|
789
1031
|
| 8 | Testability, display extraction, menu decomposition | Complete | [phase-8-testability.md](history/phase-8-testability.md) |
|
|
790
1032
|
| 9 | Observation consolidation, ctx elimination | Complete | [phase-9-observation-ctx.md](history/phase-9-observation-ctx.md) |
|
|
@@ -795,7 +1037,8 @@ Detailed records are preserved in per-phase history files:
|
|
|
795
1037
|
| 14 | Strip policy from core | Complete | [phase-14-strip-policy.md](history/phase-14-strip-policy.md) |
|
|
796
1038
|
| 15 | Domain model evolution | Complete | [phase-15-domain-model-evolution.md](history/phase-15-domain-model-evolution.md) |
|
|
797
1039
|
| 16 | Invert dependencies (extensions on a minimal core) | Complete | [phase-16-invert-dependencies.md](history/phase-16-invert-dependencies.md) |
|
|
798
|
-
| 17 |
|
|
1040
|
+
| 17 | Core consolidation | Planned | — |
|
|
1041
|
+
| 18 | Reconsider UI (first principles) | Planned | — |
|
|
799
1042
|
|
|
800
1043
|
### Structural refactoring issues
|
|
801
1044
|
|
|
@@ -861,4 +1104,13 @@ The upstream test suite is run periodically as a regression canary for the sessi
|
|
|
861
1104
|
[#264]: https://github.com/gotgenes/pi-packages/issues/264
|
|
862
1105
|
[#265]: https://github.com/gotgenes/pi-packages/issues/265
|
|
863
1106
|
[#277]: https://github.com/gotgenes/pi-packages/issues/277
|
|
1107
|
+
[#373]: https://github.com/gotgenes/pi-packages/issues/373
|
|
1108
|
+
[#374]: https://github.com/gotgenes/pi-packages/issues/374
|
|
1109
|
+
[#375]: https://github.com/gotgenes/pi-packages/issues/375
|
|
1110
|
+
[#376]: https://github.com/gotgenes/pi-packages/issues/376
|
|
1111
|
+
[#377]: https://github.com/gotgenes/pi-packages/issues/377
|
|
1112
|
+
[#378]: https://github.com/gotgenes/pi-packages/issues/378
|
|
1113
|
+
[#379]: https://github.com/gotgenes/pi-packages/issues/379
|
|
1114
|
+
[#380]: https://github.com/gotgenes/pi-packages/issues/380
|
|
1115
|
+
[#381]: https://github.com/gotgenes/pi-packages/issues/381
|
|
864
1116
|
[ADR-0002]: ../decisions/0002-extensions-on-a-minimal-core.md
|
|
@@ -0,0 +1,199 @@
|
|
|
1
|
+
---
|
|
2
|
+
issue: 400
|
|
3
|
+
issue_title: "perf(pi-subagents): include parent system prompt in replace mode for KV cache reuse"
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# Include parent system prompt in replace mode for KV cache reuse
|
|
7
|
+
|
|
8
|
+
## Problem Statement
|
|
9
|
+
|
|
10
|
+
In replace mode, `buildAgentPrompt()` discards the parent system prompt entirely and substitutes a thin two-line header (`"You are a pi coding agent sub-agent. / You have been invoked to handle a specific task autonomously."`).
|
|
11
|
+
Replace-mode agents therefore lose the core identity, tool-usage guidelines, and AGENTS.md context the parent carries, and they share no prompt prefix with the parent or with each other — defeating LLM KV cache reuse.
|
|
12
|
+
The `parentSystemPrompt` parameter is already passed into `buildAgentPrompt()` but the replace branch ignores it.
|
|
13
|
+
|
|
14
|
+
## Goals
|
|
15
|
+
|
|
16
|
+
- Place the parent system prompt (or `genericBase` when no parent is available) at the front of the replace-mode prompt as a shared, cacheable prefix.
|
|
17
|
+
- Order the replace-mode prompt as: parent/`genericBase` → `<active_agent>` tag → env block → `config.systemPrompt`.
|
|
18
|
+
- Preserve the distinguishing feature of replace mode: it injects neither the `<sub_agent_context>` bridge nor the `<agent_instructions>` wrapper — the custom prompt keeps full control of the agent's instructions, placed last so it has the final say.
|
|
19
|
+
- Apply the change uniformly to every replace-mode agent, including the built-in `Explore` and `Plan` agents.
|
|
20
|
+
- This is a **breaking change**: replace-mode agents (including `Explore`/`Plan` and any custom `prompt_mode: replace` agent) now inherit the parent system prompt on upgrade with no user edit, and the thin two-line header is removed.
|
|
21
|
+
Ship it as `perf!:` with a `BREAKING CHANGE:` footer.
|
|
22
|
+
|
|
23
|
+
## Non-Goals
|
|
24
|
+
|
|
25
|
+
- No change to append-mode assembly (already reordered for KV cache in [#180]).
|
|
26
|
+
- No change to how `parentSystemPrompt` is sourced — `create-subagent-session.ts` already passes `snapshot.systemPrompt` through `session-config.ts`.
|
|
27
|
+
- No new mode or flag to distinguish "replace with parent" from "replace without parent" — the operator confirmed the change applies uniformly, so `Explore`/`Plan` are not special-cased.
|
|
28
|
+
- No change to `pi-permission-system` — its `<active_agent>` tag parsing is a full-string regex search, position-independent.
|
|
29
|
+
- No change to `pi-anthropic-auth` — its OAuth shaping is unaffected (see Background).
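
The position-independence claim for the `<active_agent>` tag can be illustrated with a full-string search. The pattern and the `findActiveAgent` helper below are hypothetical stand-ins; the actual regex in `pi-permission-system` is not reproduced in this plan.

```typescript
// Hypothetical pattern: an assumption for illustration, not the package's real regex.
const ACTIVE_AGENT_TAG = /<active_agent name="([^"]+)"\s*\/>/;

/** Returns the agent name wherever the tag sits in the prompt, or undefined. */
function findActiveAgent(systemPrompt: string): string | undefined {
  return ACTIVE_AGENT_TAG.exec(systemPrompt)?.[1];
}

// Old replace-mode layout: tag leads the prompt.
const tagFirst = '<active_agent name="Explore"/>\n\nYou are a sub-agent.';
// New layout: tag follows the inherited parent prefix.
const tagAfterPrefix = 'PARENT PROMPT\n\n<active_agent name="Explore"/>\n\n# Environment';
```

An unanchored search like this returns the same name for both layouts, which is why moving the tag behind the parent prefix would not require a change on the parsing side.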
|
|
30
|
+
|
|
31
|
+
## Background
|
|
32
|
+
|
|
33
|
+
`buildAgentPrompt()` in `packages/pi-subagents/src/session/prompts.ts` assembles the child system prompt.
|
|
34
|
+
The append branch was reordered in [#180] (shipped in `pi-subagents-v6.18.3`) to place shared/stable content first; the parent prompt is placed verbatim (no wrapper tag) so it forms an identical byte prefix with the parent session, maximising KV cache hits.
|
|
35
|
+
The replace branch was left untouched and still emits the thin header.
|
|
36
|
+
|
|
37
|
+
Current replace branch:
|
|
38
|
+
|
|
39
|
+
```typescript
|
|
40
|
+
// "replace" mode — env header + the config's full system prompt
|
|
41
|
+
const replaceHeader = `You are a pi coding agent sub-agent.
|
|
42
|
+
You have been invoked to handle a specific task autonomously.
|
|
43
|
+
|
|
44
|
+
${envBlock}`;
|
|
45
|
+
|
|
46
|
+
return activeAgentTag + replaceHeader + "\n\n" + config.systemPrompt;
|
|
47
|
+
```
|
|
48
|
+
|
|
49
|
+
`const identity = parentSystemPrompt ?? genericBase;` currently lives inside the append branch.
|
|
50
|
+
`genericBase` (a `# Role` / general-purpose coding agent blurb) is the shared fallback.
|
|
51
|
+
|
|
52
|
+
### Cross-extension interaction — `pi-anthropic-auth` OAuth
|
|
53
|
+
|
|
54
|
+
The operator asked how the `genericBase` fallback interacts with `@gotgenes/pi-anthropic-auth`.
|
|
55
|
+
Findings from reading that package's `src/system-prompt-shaping.ts` and `src/request-shaping.ts`:
|
|
56
|
+
|
|
57
|
+
- The OAuth de-fingerprinting (`shapeAnthropicOAuthSystemPrompt`) only activates when the system prompt contains `PI_DEFAULT_PROMPT_PREFIX` (Pi's default expert-coding-assistant preamble); otherwise it returns the prompt untouched.
|
|
58
|
+
- The `x-anthropic-billing-header` system block is prepended **unconditionally** for every OAuth request (`prependBillingHeader`), independent of the base prompt content — this is the primary Claude Code billing signal.
|
|
59
|
+
|
|
60
|
+
Implications for this change:
|
|
61
|
+
|
|
62
|
+
- Normal case (parent present): replace mode places the parent prompt verbatim at the front, structurally identical to append mode, which already works under the OAuth transport wrapper.
|
|
63
|
+
The inherited Pi preamble is de-fingerprinted exactly as it is for append-mode subagents and the main session today.
|
|
64
|
+
- `genericBase` fallback (only when the parent snapshot has no system prompt — effectively never in real sessions, since `parentSystemPrompt` is a required `string` at the `session-config` layer): `genericBase` carries no Pi fingerprint, so the OAuth shaping no-ops and the billing header is still prepended.
|
|
65
|
+
`genericBase` is already neutral, so nothing leaks.
|
|
66
|
+
|
|
67
|
+
Conclusion: #400 introduces no new OAuth interaction. `genericBase` remains the correct fallback and stays consistent with append mode.
|
|
68
|
+
|
|
69
|
+
### Constraints from AGENTS.md
|
|
70
|
+
|
|
71
|
+
- This package carries a type-declaration bundle for its public API, but `buildAgentPrompt` is internal — no `dist/public.d.ts` or `exports` impact, so `verify:public-types` is not required for this change.
|
|
72
|
+
- Conventional Commits; do not edit `CHANGELOG.md` (release-please owns it).
|
|
73
|
+
- The `BREAKING CHANGE:` footer text is reused verbatim in the release-please CHANGELOG and the issue close comment — name only real surface (`prompt_mode: replace`).
|
|
74
|
+
|
|
75
|
+
## Design Overview
|
|
76
|
+
|
|
77
|
+
Hoist the `identity` resolution above the branch so both modes share it, then rewrite the replace branch.
|
|
78
|
+
|
|
79
|
+
```typescript
|
|
80
|
+
const activeAgentTag = `<active_agent name="${config.name}"/>\n\n`;
|
|
81
|
+
const envBlock = `# Environment\n...`;
|
|
82
|
+
const identity = parentSystemPrompt ?? genericBase;
|
|
83
|
+
|
|
84
|
+
if (config.promptMode === "append") {
|
|
85
|
+
// ...unchanged...
|
|
86
|
+
}
|
|
87
|
+
|
|
88
|
+
// "replace" mode — shared parent prompt (or generic base) first for KV cache
|
|
89
|
+
// reuse, then the active_agent tag, env block, and the config's full system
|
|
90
|
+
// prompt. Unlike append mode, replace mode injects neither the
|
|
91
|
+
// <sub_agent_context> bridge nor the <agent_instructions> wrapper — the custom
|
|
92
|
+
// prompt keeps full control of the agent's instructions.
|
|
93
|
+
return identity + "\n\n" + activeAgentTag + envBlock + "\n\n" + config.systemPrompt;
|
|
94
|
+
```
|
|
95
|
+
|
|
96
|
+
Resulting replace-mode order (`activeAgentTag` already ends with `\n\n`):
|
|
97
|
+
|
|
98
|
+
```text
|
|
99
|
+
1. parentSystemPrompt (or genericBase) ← SHARED, cacheable prefix
|
|
100
|
+
2. <active_agent name="${name}"/> ← varies per agent
|
|
101
|
+
3. # Environment ... ← varies per runtime
|
|
102
|
+
4. config.systemPrompt ← custom instructions (full control)
|
|
103
|
+
```
|
|
104
|
+
|
|
105
|
+
This mirrors append mode's prefix-first ordering, minus the bridge and the `<agent_instructions>` wrapper.
|
|
106
|
+
The change is a pure single-function edit — no new collaborator, no new module, no interface change — so the design-review structural checklist (dependency width, Law of Demeter, extraction seams) does not apply.
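
To make the prefix-sharing effect concrete, here is a self-contained sketch of the replace-mode assembly only. It is not the real `buildAgentPrompt()`: the `ReplaceAgentConfig` shape, the `buildReplacePrompt` name, the `genericBase` text, and the environment block contents are all placeholder assumptions.

```typescript
// Hypothetical sketch of the replace branch; names and text are placeholders.
interface ReplaceAgentConfig {
  name: string;
  systemPrompt: string;
}

const genericBase = "# Role\nYou are a general-purpose coding agent."; // placeholder text

function buildReplacePrompt(
  config: ReplaceAgentConfig,
  envBlock: string,
  parentSystemPrompt?: string,
): string {
  // Shared, cacheable prefix first; genericBase only when the parent is nullish.
  const identity = parentSystemPrompt ?? genericBase;
  const activeAgentTag = `<active_agent name="${config.name}"/>\n\n`;
  // No <sub_agent_context> bridge and no <agent_instructions> wrapper.
  return identity + "\n\n" + activeAgentTag + envBlock + "\n\n" + config.systemPrompt;
}

const parentPrompt = "PARENT PROMPT";
const demoEnv = "# Environment\ncwd: /repo";
const explorePrompt = buildReplacePrompt(
  { name: "Explore", systemPrompt: "You are a file search specialist." },
  demoEnv,
  parentPrompt,
);
const planPrompt = buildReplacePrompt(
  { name: "Plan", systemPrompt: "You are a planning specialist." },
  demoEnv,
  parentPrompt,
);
```

Both prompts begin with the same bytes (`parentPrompt` followed by a blank line) and diverge only at the `<active_agent>` tag, so a provider-side prefix cache keyed on the parent prompt could in principle serve either agent.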
|
|
107
|
+
|
|
108
|
+
### Edge cases
|
|
109
|
+
|
|
110
|
+
- Empty `config.systemPrompt` (e.g. a replace agent with no body): the prompt ends with a trailing `\n\n` after the env block.
|
|
111
|
+
Acceptable and consistent with current behavior; no special-casing.
|
|
112
|
+
`genericBase` only substitutes on a nullish parent (the `??` operator), so an empty-string parent prompt is preserved as-is, matching append mode.
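
The empty-string case turns on the difference between `??` and `||`. The sketch below isolates that one expression; `fallbackBase` is a placeholder for `genericBase`, and `resolveWithOr` exists only as a contrast and is not something the plan proposes.

```typescript
// Placeholder standing in for genericBase.
const fallbackBase = "# Role (generic base placeholder)";

/** Plan behavior: fall back only when the parent is null or undefined. */
const resolveIdentity = (parent: string | null | undefined): string =>
  parent ?? fallbackBase;

/** Contrast only: `||` would also replace an empty-string parent. */
const resolveWithOr = (parent: string | null | undefined): string =>
  parent || fallbackBase;
```

With `??`, an empty parent prompt passes through unchanged, so the assembled prompt would start with the `\n\n` separator instead of the generic base.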
|
|
113
|
+
|
|
114
|
+
## Module-Level Changes
|
|
115
|
+
|
|
116
|
+
### `packages/pi-subagents/src/session/prompts.ts`
|
|
117
|
+
|
|
118
|
+
1. Hoist `const identity = parentSystemPrompt ?? genericBase;` from the append branch to before the `if (config.promptMode === "append")` check so both branches use it.
|
|
119
|
+
2. Replace the replace-branch `replaceHeader` template and return statement with the new ordering (`identity` → `activeAgentTag` → `envBlock` → `config.systemPrompt`); remove the thin two-line header.
|
|
120
|
+
3. Update the JSDoc summary: replace-mode bullet becomes "parent system prompt (or generic base) + active_agent tag + env header + config.systemPrompt; no bridge, no agent_instructions wrapper," and update the trailing note about tag position (it is included, not prepended, in either mode).
|
|
121
|
+
|
|
122
|
+
### `packages/pi-subagents/test/session/prompts.test.ts`
|
|
123
|
+
|
|
124
|
+
See Test Impact Analysis and TDD Order for the specific test changes.
|
|
125
|
+
|
|
126
|
+
### `packages/pi-subagents/README.md`
|
|
127
|
+
|
|
128
|
+
1. Lines 119–120 — the `Explore` and `Plan` rows: revise the `replace` (standalone) framing, since replace mode now inherits the parent prompt as its base.
|
|
129
|
+
2. Line 187 — the `prompt_mode` frontmatter table: `replace` no longer means "no AGENTS.md / CLAUDE.md inheritance."
|
|
130
|
+
Reword to describe the new semantics: replace inherits the parent prompt as the base, then the body takes full control (no `<sub_agent_context>` bridge, no `<agent_instructions>` wrapper), whereas append wraps the body and adds the bridge.
|
|
131
|
+
3. Line 494 (Patch 3, `<active_agent>` tag): change "prepends ... to every assembled child system prompt (both `replace` and `append` modes)" to "includes ... in every assembled child system prompt (both modes)" — the tag follows the cacheable parent prefix in both modes now, so "prepends" is inaccurate.
|
|
132
|
+
|
|
133
|
+
No `docs/architecture/` updates: the architecture doc references `prompts.ts` only as a one-line file listing (no prompt-assembly description, no complexity/health table entry tied to this change).
|
|
134
|
+
|
|
135
|
+
## Test Impact Analysis
|
|
136
|
+
|
|
137
|
+
This is a behavior change, not an extraction, so most of the extraction-specific questions do not apply.
|
|
138
|
+
|
|
139
|
+
- New behavior to cover: replace mode now includes the parent prompt as a cacheable prefix; falls back to `genericBase` with no parent; still excludes the bridge and the `<agent_instructions>` wrapper.
|
|
140
|
+
- Existing replace-mode tests that assert the old behavior must change (they pin the removed thin header and the "ignores parent prompt" premise).
|
|
141
|
+
- `toContain`-based tests for cwd/git/env and the `genericBase` fallback remain valid where position-independent.
|
|
142
|
+
- No existing test becomes redundant beyond the ones being rewritten; no test must stay frozen for a layer being extracted (nothing is extracted).
|
|
143
|
+
|
|
144
|
+
Tests that change in `test/session/prompts.test.ts`:
|
|
145
|
+
|
|
146
|
+
1. `"replace mode uses config systemPrompt directly"` — asserts `toContain("You are a pi coding agent sub-agent")`; that header is removed.
|
|
147
|
+
Rewrite to assert the config prompt is present and the thin header is gone.
|
|
148
|
+
2. `"replace mode ignores parent prompt"` — asserts the parent content is absent.
|
|
149
|
+
The premise inverts: rename to `"replace mode includes parent prompt as base (no bridge/wrapper)"` and assert the parent content is present while `<sub_agent_context>` and `<agent_instructions>` are absent.
|
|
150
|
+
3. `"prepends <active_agent name=...> tag in replace mode"` — asserts `prompt.startsWith('<active_agent name="Explore"/>\n\n')`.
|
|
151
|
+
The tag no longer leads (parent/`genericBase` does); rewrite to assert the tag appears after the identity prefix and before the env block.
|
|
152
|
+
4. `"active_agent tag appears before envBlock in both modes"` — the replace assertions pin `tagIdx === 0`.
|
|
153
|
+
Update the replace assertions: the tag is no longer at index 0 but still precedes `# Environment`.
|
|
154
|
+
The append assertions stay as-is.
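
The positional assertions could lean on a small ordering helper like the one sketched here. `indicesInOrder` and the sample prompt are hypothetical and do not appear in `test/session/prompts.test.ts`; the sketch uses plain exceptions so it does not assume a particular test runner.

```typescript
// Hypothetical helper: finds each marker after the previous one, or throws.
function indicesInOrder(prompt: string, markers: string[]): number[] {
  let searchFrom = 0;
  return markers.map((marker) => {
    const index = prompt.indexOf(marker, searchFrom);
    if (index === -1) {
      throw new Error(`marker missing or out of order: ${marker}`);
    }
    searchFrom = index + marker.length;
    return index;
  });
}

// Sample replace-mode prompt in the new order (placeholder content).
const samplePrompt = [
  "PARENT PROMPT",
  '<active_agent name="Explore"/>',
  "# Environment\ncwd: /repo",
  "You are a file search specialist.",
].join("\n\n");

const order = indicesInOrder(samplePrompt, [
  "PARENT PROMPT",
  '<active_agent name="Explore"/>',
  "# Environment",
  "You are a file search specialist.",
]);
```

Asserting relative order instead of `tagIdx === 0` keeps the test valid for any parent prompt length, which is the same move the plan makes for the replace-mode assertions.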
|
|
155
|
+
|
|
156
|
+
## TDD Order
|
|
157
|
+
|
|
158
|
+
All test and source changes live in two files that the type checker links (the replace branch and its tests).
|
|
159
|
+
Each cycle is a single commit that leaves the suite green.
|
|
160
|
+
|
|
161
|
+
1. **Red: rewrite replace-mode behavioral tests.**
|
|
162
|
+
Update tests 1–2 above to the new behavior (parent prompt included as base; thin header removed; no bridge/wrapper), and add a test for the `genericBase` fallback when no parent is supplied in replace mode, plus a test pinning the full order (`identity` → `<active_agent>` → `# Environment` → `config.systemPrompt`).
|
|
163
|
+
These fail against the current implementation.
|
|
164
|
+
Commit: `test: assert replace mode inherits parent prompt as cacheable prefix (#400)`
|
|
165
|
+
|
|
166
|
+
2. **Green: rewrite the replace branch.**
|
|
167
|
+
Hoist `identity`, replace the `replaceHeader` block with the new ordering, remove the thin header, and update the JSDoc.
|
|
168
|
+
Update the positional `<active_agent>` tests (3–4 above) in the same commit — they break at runtime the moment the branch changes.
|
|
169
|
+
Commit body carries the `BREAKING CHANGE:` footer.
|
|
170
|
+
Commit: `perf!: include parent system prompt in replace mode (#400)`
|
|
171
|
+
|
|
172
|
+
```text
|
|
173
|
+
BREAKING CHANGE: replace-mode subagents (built-in Explore/Plan and any
|
|
174
|
+
custom prompt_mode: replace agent) now inherit the parent system prompt as
|
|
175
|
+
their base instead of a thin standalone header. The custom prompt is
|
|
176
|
+
appended last and retains full control; the <sub_agent_context> bridge and
|
|
177
|
+
<agent_instructions> wrapper are still omitted in replace mode.
|
|
178
|
+
```
|
|
179
|
+
|
|
180
|
+
3. **Docs: update README replace-mode semantics.**
|
|
181
|
+
Apply the three README edits (Explore/Plan rows, `prompt_mode` table, Patch 3 `<active_agent>` wording).
|
|
182
|
+
Commit: `docs: describe replace-mode parent inheritance (#400)`
|
|
183
|
+
|
|
184
|
+
## Risks and Mitigations
|
|
185
|
+
|
|
186
|
+
| Risk | Mitigation |
|
|
187
|
+
| --------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
188
|
+
| `Explore`/`Plan` behavior shifts — they now carry the full parent prompt plus their read-only specialist instructions | Operator confirmed uniform application; specialist instructions are placed last so they have the final say; existing read-only assertions (`READ-ONLY`, `file search specialist`) still hold via `toContain`. |
|
|
189
|
+
| `pi-permission-system` depends on `<active_agent>` tag position | Tag parsing is a full-string regex search; position-independent (same basis as [#180]). |
|
|
190
|
+
| `pi-anthropic-auth` OAuth shaping breaks with the new base | No new interaction — billing header is prepended unconditionally; de-fingerprinting keys off `PI_DEFAULT_PROMPT_PREFIX` and `genericBase` is already neutral (see Background). |
|
|
191
|
+
| A custom replace agent relied on the clean-slate (no parent) behavior | Documented as breaking in the `BREAKING CHANGE:` footer and README; this aligns with the expectation reported in the issue ([@jeffutter] expected the parent identity to be present). |
|
|
192
|
+
| Stale README claims that replace = no inheritance | README edits in cycle 3 correct lines 119–120, 187, and 494. |
|
|
193
|
+
|
|
194
|
+
## Open Questions
|
|
195
|
+
|
|
196
|
+
None — the three design decisions (breaking classification, `genericBase` fallback, uniform application to built-ins) were resolved with the operator before planning.
|
|
197
|
+
|
|
198
|
+
[#180]: https://github.com/gotgenes/pi-packages/issues/180
|
|
199
|
+
[@jeffutter]: https://github.com/jeffutter
|
|
@@ -0,0 +1,44 @@
|
|
|
1
|
+
---
|
|
2
|
+
issue: 400
|
|
3
|
+
issue_title: "perf(pi-subagents): include parent system prompt in replace mode for KV cache reuse"
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# Retro: #400 — Include parent system prompt in replace mode for KV cache reuse
|
|
7
|
+
|
|
8
|
+
## Stage: Planning (2026-06-14T00:42:49Z)
|
|
9
|
+
|
|
10
|
+
### Session summary
|
|
11
|
+
|
|
12
|
+
Produced a numbered plan for including the parent system prompt as a cacheable prefix in `buildAgentPrompt()`'s replace branch, mirroring the [#180] append-mode reorder.
|
|
13
|
+
The change is a single-function edit plus test and README updates, planned across three TDD/docs commits.
|
|
14
|
+
|
|
15
|
+
### Observations
|
|
16
|
+
|
|
17
|
+
- Three design decisions were confirmed with the operator (issue author = gh user) before planning:
|
|
18
|
+
1. Ship as breaking `perf!:` with a `BREAKING CHANGE:` footer — replace-mode agents inherit the parent prompt on upgrade with no user edit, and the thin two-line header is removed.
|
|
19
|
+
2. Use `genericBase` as the no-parent fallback, consistent with append mode.
|
|
20
|
+
3. Apply uniformly to all replace agents, including built-in `Explore` and `Plan` (one code path, no special-casing).
|
|
21
|
+
- The operator raised a cross-extension concern about the `genericBase` fallback interacting with `@gotgenes/pi-anthropic-auth`.
|
|
22
|
+
Investigation of that package's `system-prompt-shaping.ts` / `request-shaping.ts` showed no new interaction: the `x-anthropic-billing-header` block is prepended unconditionally for OAuth, and de-fingerprinting keys off `PI_DEFAULT_PROMPT_PREFIX` (absent from `genericBase`, which is already neutral).
|
|
23
|
+
Captured this in the plan's Background and Risks.
|
|
24
|
+
- `parentSystemPrompt` is a required `string` at the `session-config` layer (sourced from `snapshot.systemPrompt`), so the `genericBase` fallback is effectively a defensive/test-only path in real sessions.
|
|
25
|
+
- The thin replace header string (`You are a pi coding agent sub-agent`) appears only in `prompts.ts` and its test — no skill or live doc pins it; README needs three edits (Explore/Plan rows, `prompt_mode` table, Patch 3 `<active_agent>` wording, the last already slightly stale post-#180).
|
|
26
|
+
- Notable emergent scope point: `Explore`/`Plan` are built-in replace-mode agents, so this change affects them visibly — surfaced and confirmed rather than assumed.
|
|
27
|
+
|
|
28
|
+
## Stage: Implementation — TDD (2026-06-14T00:54:46Z)
|
|
29
|
+
|
|
30
|
+
### Session summary
|
|
31
|
+
|
|
32
|
+
Completed all 3 TDD cycles in `packages/pi-subagents`.
|
|
33
|
+
The change is a single-function edit to `src/session/prompts.ts` (hoist `identity`, rewrite replace branch) plus test updates and README/skill-doc corrections.
|
|
34
|
+
Test count went from 973 to 975 (+2 net new tests) across 59 test files.
|
|
35
|
+
|
|
36
|
+
### Observations
|
|
37
|
+
|
|
38
|
+
- Step 1 (Red): rewrote 2 existing replace-mode tests and added 2 new ones (4 failures confirmed against old code); the old "ignores parent prompt" test premise inverted cleanly into "includes parent prompt as base."
|
|
39
|
+
- Step 2 (Green): hoisting `const identity = parentSystemPrompt ?? genericBase;` above the `if` block and replacing the `replaceHeader` template were the only `src/` changes; also updated two positional `<active_agent>` tests in the same commit since they broke the moment the branch changed (`tagIdx === 0` → `toBeGreaterThan(0)`).
|
|
40
|
+
- The `BREAKING CHANGE:` footer wording was taken verbatim from the plan and landed in the `perf!:` commit.
|
|
41
|
+
- Pre-completion reviewer: WARN — one finding: `.pi/skills/package-pi-subagents/SKILL.md` still said "prepends" for the `<active_agent>` tag; fixed in a follow-up `docs:` commit before shipping.
|
|
42
|
+
- No deviations from the plan's Module-Level Changes list; no lockfile changes; fallow dead-code exited zero.
|
|
43
|
+
|
|
44
|
+
[#180]: https://github.com/gotgenes/pi-packages/issues/180
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@gotgenes/pi-subagents",
|
|
3
|
-
"version": "
|
|
3
|
+
"version": "16.0.0",
|
|
4
4
|
"type": "module",
|
|
5
5
|
"exports": {
|
|
6
6
|
".": {
|
|
@@ -59,11 +59,11 @@
|
|
|
59
59
|
},
|
|
60
60
|
"devDependencies": {
|
|
61
61
|
"@biomejs/biome": "^2.4.16",
|
|
62
|
-
"@earendil-works/pi-ai": "0.
|
|
63
|
-
"@earendil-works/pi-coding-agent": "0.
|
|
64
|
-
"@earendil-works/pi-tui": "0.
|
|
62
|
+
"@earendil-works/pi-ai": "0.79.1",
|
|
63
|
+
"@earendil-works/pi-coding-agent": "0.79.1",
|
|
64
|
+
"@earendil-works/pi-tui": "0.79.1",
|
|
65
65
|
"@types/node": "^22.15.3",
|
|
66
|
-
"rollup": "^4.
|
|
66
|
+
"rollup": "^4.61.1",
|
|
67
67
|
"rollup-plugin-dts": "^6.4.1",
|
|
68
68
|
"rumdl": "^0.2.10",
|
|
69
69
|
"typescript": "^6.0.3",
|
package/src/session/prompts.ts
CHANGED
|
@@ -8,17 +8,25 @@ import type { AgentPromptConfig } from "#src/types";
|
|
|
8
8
|
/**
|
|
9
9
|
* Build the system prompt for an agent from its config.
|
|
10
10
|
*
|
|
11
|
-
*
|
|
12
|
-
*
|
|
13
|
-
*
|
|
11
|
+
* Both modes place the shared/stable parent prompt (or `genericBase` when no
|
|
12
|
+
* parent is available) first so the LLM's KV cache can reuse the inherited
|
|
13
|
+
* prefix across all subagent invocations.
|
|
14
14
|
*
|
|
15
|
-
*
|
|
16
|
-
*
|
|
17
|
-
*
|
|
18
|
-
*
|
|
19
|
-
*
|
|
15
|
+
* - "replace" mode: parent/genericBase + active_agent tag + env header +
|
|
16
|
+
* config.systemPrompt. No `<sub_agent_context>` bridge and no
|
|
17
|
+
* `<agent_instructions>` wrapper — the custom prompt has full control and
|
|
18
|
+
* the final say.
|
|
19
|
+
* - "append" mode: parent/genericBase + sub-agent context bridge +
|
|
20
|
+
* active_agent tag + env header + config.systemPrompt (wrapped in
|
|
21
|
+
* `<agent_instructions>` when non-empty).
|
|
22
|
+
* - "append" with empty systemPrompt: pure parent clone.
|
|
20
23
|
*
|
|
21
|
-
*
|
|
24
|
+
* Both modes include an `<active_agent name="${config.name}"/>` tag so
|
|
25
|
+
* downstream extensions (e.g. `@gotgenes/pi-permission-system`) can resolve
|
|
26
|
+
* per-agent policy inside the child session by parsing the system prompt.
|
|
27
|
+
* The tag follows the cacheable parent prefix in both modes.
|
|
28
|
+
*
|
|
29
|
+
* @param parentSystemPrompt The parent agent's effective system prompt.
|
|
22
30
|
*/
|
|
23
31
|
export function buildAgentPrompt(
|
|
24
32
|
config: AgentPromptConfig,
|
|
@@ -33,8 +41,9 @@ Working directory: ${cwd}
|
|
|
33
41
|
${env.isGitRepo ? `Git repository: yes\nBranch: ${env.branch}` : "Not a git repository"}
|
|
34
42
|
Platform: ${env.platform}`;
|
|
35
43
|
|
|
44
|
+
const identity = parentSystemPrompt ?? genericBase;
|
|
45
|
+
|
|
36
46
|
if (config.promptMode === "append") {
|
|
37
|
-
const identity = parentSystemPrompt ?? genericBase;
|
|
38
47
|
|
|
39
48
|
const bridge = `<sub_agent_context>
|
|
40
49
|
You are operating as a sub-agent invoked to handle a specific task.
|
|
@@ -69,18 +78,14 @@ You are operating as a sub-agent invoked to handle a specific task.
|
|
|
69
78
|
);
|
|
70
79
|
}
|
|
71
80
|
|
|
72
|
-
// "replace" mode —
|
|
73
|
-
|
|
74
|
-
|
|
75
|
-
|
|
76
|
-
|
|
77
|
-
|
|
78
|
-
return (
|
|
79
|
-
activeAgentTag + replaceHeader + "\n\n" + config.systemPrompt
|
|
80
|
-
);
|
|
81
|
+
// "replace" mode — parent/genericBase prefix first for KV cache reuse, then
|
|
82
|
+
// the active_agent tag, env block, and the config's full system prompt.
|
|
83
|
+
// Unlike append mode, no <sub_agent_context> bridge or <agent_instructions>
|
|
84
|
+
// wrapper is injected — the custom prompt retains full control.
|
|
85
|
+
return identity + "\n\n" + activeAgentTag + envBlock + "\n\n" + config.systemPrompt;
|
|
81
86
|
}
|
|
82
87
|
|
|
83
|
-
/** Fallback base prompt when parent system prompt is unavailable
|
|
88
|
+
/** Fallback base prompt when parent system prompt is unavailable (both modes). */
|
|
84
89
|
const genericBase = `# Role
|
|
85
90
|
You are a general-purpose coding agent for complex, multi-step tasks.
|
|
86
91
|
You have full access to read, write, edit files, and execute commands.
|