pi-subagents 0.25.0 → 0.28.0

Files changed (40)
  1. package/CHANGELOG.md +34 -0
  2. package/README.md +175 -19
  3. package/package.json +1 -1
  4. package/prompts/parallel-context-build.md +3 -1
  5. package/prompts/parallel-handoff-plan.md +3 -1
  6. package/skills/pi-subagents/SKILL.md +60 -17
  7. package/src/agents/agent-management.ts +71 -15
  8. package/src/agents/agent-serializer.ts +13 -2
  9. package/src/agents/agents.ts +88 -17
  10. package/src/agents/chain-serializer.ts +120 -0
  11. package/src/extension/fanout-child.ts +2 -0
  12. package/src/extension/index.ts +5 -2
  13. package/src/extension/schemas.ts +132 -6
  14. package/src/intercom/result-intercom.ts +5 -0
  15. package/src/runs/background/async-execution.ts +88 -6
  16. package/src/runs/background/async-status.ts +11 -1
  17. package/src/runs/background/run-status.ts +10 -1
  18. package/src/runs/background/subagent-runner.ts +665 -39
  19. package/src/runs/foreground/chain-execution.ts +369 -118
  20. package/src/runs/foreground/execution.ts +392 -19
  21. package/src/runs/foreground/subagent-executor.ts +126 -3
  22. package/src/runs/shared/acceptance-contract.ts +318 -0
  23. package/src/runs/shared/acceptance-evaluation.ts +221 -0
  24. package/src/runs/shared/acceptance-finalization.ts +173 -0
  25. package/src/runs/shared/acceptance-reports.ts +127 -0
  26. package/src/runs/shared/acceptance.ts +22 -0
  27. package/src/runs/shared/chain-outputs.ts +101 -0
  28. package/src/runs/shared/completion-guard.ts +26 -3
  29. package/src/runs/shared/dynamic-fanout.ts +293 -0
  30. package/src/runs/shared/parallel-utils.ts +33 -1
  31. package/src/runs/shared/pi-args.ts +11 -0
  32. package/src/runs/shared/structured-output.ts +77 -0
  33. package/src/runs/shared/subagent-prompt-runtime.ts +53 -3
  34. package/src/runs/shared/workflow-graph.ts +210 -0
  35. package/src/shared/formatters.ts +2 -2
  36. package/src/shared/settings.ts +53 -4
  37. package/src/shared/types.ts +265 -1
  38. package/src/shared/utils.ts +7 -0
  39. package/src/slash/slash-commands.ts +41 -3
  40. package/src/tui/render.ts +178 -45
package/CHANGELOG.md CHANGED
@@ -2,6 +2,40 @@
2
2
 
3
3
  ## [Unreleased]
4
4
 
5
+ ## [0.28.0] - 2026-06-03
6
+
7
+ ### Added
8
+ - Added foreground-only `timeoutMs`/`maxRuntimeMs` for single, parallel, and chain subagent runs. Timed-out children are soft-interrupted, keep completed sibling/prior results, and return `timedOut: true` with a stable timeout message.
9
+ - Added per-agent `maxExecutionTimeMs` and `maxTokens` resource limits. Foreground and async children stop with a clear `resourceLimitExceeded` result when the configured runtime or observed token budget is reached.
10
+
11
+ ### Changed
12
+ - Strengthened tool and skill guidance so writer subagents launched from plans, specs, issues, or broad fixes proactively use structured `acceptance` instead of burying validation requirements only in task prose.
13
+
14
+ ### Fixed
15
+ - Removed a provider-unfriendly required-only subschema from the public `acceptance` tool schema so Kimi models served through OpenCode Go can load the `subagent` tool, while keeping runtime validation for empty acceptance contracts.
16
+ - Clarified acceptance-report prompts so required evidence like `diff-summary` must be copied into structured JSON fields such as `diffSummary`, not only described in visible prose.
17
+
18
+ ## [0.27.0] - 2026-05-30
19
+
20
+ ### Changed
21
+ - Reworked public acceptance config to be object-only and evidence-driven, removing public `level`/disable shorthands. Explicit acceptance now triggers a same-session self-review/repair finalization loop, with `maxFinalizationTurns` controlling the cap.
22
+ - Documented goal-style acceptance guidance so `/goal`, “active goal”, and “work until evidence says done” requests map to run-scoped `acceptance` contracts.
23
+ - Refined acceptance finalization prompts and status output to emphasize evidence, blockers, stop rules, and finalization progress such as `completed after 1/3 turns`.
24
+
25
+ ### Fixed
26
+ - Treat explicit acceptance as the completion contract for acceptance-enabled runs, avoiding implementation completion-guard false positives when the visible output is only an `acceptance-report` or a finalization self-review turn does not need a repair edit.
27
+
28
+ ## [0.26.0] - 2026-05-29
29
+
30
+ ### Added
31
+ - Added first-wave acceptance gates with optional public `acceptance` config, inferred effective policies, structured child reports, provenance ledgers, checked evidence gates, explicit runtime verification commands, async/status persistence, and saved `.chain.json` validation.
32
+ - Added chain step metadata (`phase`, `label`), named outputs (`as` with `{outputs.name}`), workflow graph snapshots, and strict `outputSchema` structured-output contracts across foreground and async chain execution.
33
+ - Added dynamic chain fanout with `expand`/single-template `parallel`/`collect`, structured named-output sources, bounded item expansion, collected result outputs, async status graph persistence, and saved `.chain.json` support.
34
+
35
+ ### Fixed
36
+ - Fixed dynamic fanout acceptance blockers around real `structured_output` tool validation, malformed dynamic-like chain rejection, async dynamic failure status/details, dynamic child intercom target indexing, and saved `.chain.json` management diagnostics.
37
+ - Fixed acceptance-gate semantics so reviewed status requires an independent reviewer result, required criteria must be reported as satisfied, only fenced `acceptance-report` blocks satisfy attestation, malformed reports preserve parse errors, `{ level: "none", reason }` disables inferred gates, and zero-child dynamic aggregates no longer fabricate evidence.
38
+
5
39
  ## [0.25.0] - 2026-05-21
6
40
 
7
41
  ### Added
package/README.md CHANGED
@@ -145,7 +145,7 @@ Use `~/.pi/agent/settings.json` for a user override or `.pi/settings.json` for a
145
145
 
146
146
  ## Where running subagents show up
147
147
 
148
- Foreground runs stream progress in the conversation while they run.
148
+ Foreground runs stream progress in the conversation while they run. Use `timeoutMs` or its alias `maxRuntimeMs` when a foreground run must return within a wall-clock budget. When the timeout expires, running children are soft-interrupted, completed children stay in the result, and timed-out children return `timedOut: true` with a stable timeout message.
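The timeout behavior described above can be modeled as a small pure function. This is a minimal sketch, not the package's implementation: only the `timedOut` field comes from the README, while `ChildSnapshot`, `ChildResult`, `settleOnTimeout`, and the message text are hypothetical names chosen for illustration.

```typescript
// Hypothetical shapes: only `timedOut` is documented; every other name is illustrative.
interface ChildSnapshot {
  agent: string;
  state: "completed" | "running";
  output?: string;
}

interface ChildResult {
  agent: string;
  output: string;
  timedOut: boolean;
}

// Assumed wording; the README only says the timeout message is "stable".
const TIMEOUT_MESSAGE = "Subagent timed out before completing.";

// Build the result list at the moment the wall-clock budget expires:
// completed children keep their output, running children are marked timed out.
function settleOnTimeout(children: ChildSnapshot[]): ChildResult[] {
  return children.map((child) =>
    child.state === "completed"
      ? { agent: child.agent, output: child.output ?? "", timedOut: false }
      : { agent: child.agent, output: TIMEOUT_MESSAGE, timedOut: true },
  );
}

const settled = settleOnTimeout([
  { agent: "scout", state: "completed", output: "context gathered" },
  { agent: "worker", state: "running" },
]);
console.log(settled);
```

The point of the sketch is that a timeout does not discard finished work: the caller still receives the scout result alongside the timed-out worker entry.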
149
149
 
150
150
  Background runs keep working after control returns to you. Inspect active runs with `subagent({ action: "status" })`, or a specific run with `subagent({ action: "status", id: "..." })`.
151
151
 
@@ -246,7 +246,7 @@ Skip this section until you want exact syntax.
246
246
  | `/run <agent> [task]` | Run one agent; omit the task for self-contained agents |
247
247
  | `/chain agent1 "task1" -> agent2 "task2"` | Run agents in sequence |
248
248
  | `/parallel agent1 "task1" -> agent2 "task2"` | Run agents in parallel |
249
- | `/run-chain <chainName> -- <task>` | Launch a saved `.chain.md` workflow |
249
+ | `/run-chain <chainName> -- <task>` | Launch a saved `.chain.md` or `.chain.json` workflow |
250
250
  | `/subagents-doctor` | Show read-only setup diagnostics |
251
251
 
252
252
  Commands validate agent names locally, support tab completion, and send results back into the conversation.
@@ -436,6 +436,8 @@ defaultProgress: true
436
436
  completionGuard: false
437
437
  interactive: true
438
438
  maxSubagentDepth: 1
439
+ maxExecutionTimeMs: 600000
440
+ maxTokens: 50000
439
441
  ---
440
442
 
441
443
  Your system prompt goes here.
@@ -462,6 +464,8 @@ Important fields:
462
464
  | `completionGuard` | Set `false` only for non-implementation agents that may mention implementation words while using mutation-capable tools such as `bash`. |
463
465
  | `interactive` | Parsed for compatibility but not enforced in v1. |
464
466
  | `maxSubagentDepth` | Tightens nested delegation for this agent’s children. |
467
+ | `maxExecutionTimeMs` | Stops each foreground or async child run for this agent after the given number of milliseconds. |
468
+ | `maxTokens` | Stops each foreground or async child run for this agent when observed input plus output tokens reach the limit. Token enforcement is best-effort because usage is reported after model events arrive. |
465
469
 
466
470
  ### Tool and extension selection
467
471
 
@@ -492,14 +496,14 @@ When `extensions` is present, it takes precedence over extension paths implied b
492
496
 
493
497
  ## Chain files
494
498
 
495
- Chains are reusable `.chain.md` workflows stored separately from agent files.
499
+ Chains are reusable workflows stored separately from agent files. Use `.chain.md` for simple sequential saved chains. Use `.chain.json` when a chain needs dynamic fanout.
496
500
 
497
501
  | Scope | Path |
498
502
  |-------|------|
499
- | User | `~/.pi/agent/chains/**/*.chain.md` |
500
- | Project | `.pi/chains/**/*.chain.md` |
503
+ | User | `~/.pi/agent/chains/**/*.chain.md`, `~/.pi/agent/chains/**/*.chain.json` |
504
+ | Project | `.pi/chains/**/*.chain.md`, `.pi/chains/**/*.chain.json` |
501
505
 
502
- Nested subdirectories are discovered recursively. If user and project scopes define the same parsed runtime chain name, the project chain wins. Chains support the same optional `package` frontmatter as agents; `name: review-flow` plus `package: code-analysis` runs as `code-analysis.review-flow`.
506
+ Nested subdirectories are discovered recursively. If both `.chain.md` and `.chain.json` define the same parsed runtime chain name in the same scope, `.chain.json` wins. If user and project scopes define the same parsed runtime chain name, the project chain wins. Chains support the same optional `package` frontmatter as agents; `name: review-flow` plus `package: code-analysis` runs as `code-analysis.review-flow`.
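The two precedence rules above (project over user, and `.chain.json` over `.chain.md` within one scope) can be sketched as a ranking function. This is an illustration under assumptions: `ChainCandidate`, `precedenceRank`, and `resolveChains` are hypothetical names, and the real discovery code may be organized differently.

```typescript
type ChainScope = "user" | "project";
type ChainFormat = "md" | "json";

// Hypothetical record for one discovered chain file.
interface ChainCandidate {
  runtimeName: string; // parsed runtime name, e.g. "code-analysis.review-flow"
  scope: ChainScope;
  format: ChainFormat;
  path: string;
}

// Higher rank wins. Scope dominates format, so a project .chain.md
// still beats a user .chain.json with the same runtime name.
function precedenceRank(candidate: ChainCandidate): number {
  return (candidate.scope === "project" ? 2 : 0) + (candidate.format === "json" ? 1 : 0);
}

// Keep exactly one winner per parsed runtime chain name.
function resolveChains(candidates: ChainCandidate[]): Map<string, ChainCandidate> {
  const winners = new Map<string, ChainCandidate>();
  for (const candidate of candidates) {
    const current = winners.get(candidate.runtimeName);
    if (!current || precedenceRank(candidate) > precedenceRank(current)) {
      winners.set(candidate.runtimeName, candidate);
    }
  }
  return winners;
}

const resolvedChains = resolveChains([
  { runtimeName: "review-flow", scope: "user", format: "json", path: "~/.pi/agent/chains/review-flow.chain.json" },
  { runtimeName: "review-flow", scope: "project", format: "md", path: ".pi/chains/review-flow.chain.md" },
  { runtimeName: "scout-planner", scope: "user", format: "md", path: "~/.pi/agent/chains/scout-planner.chain.md" },
  { runtimeName: "scout-planner", scope: "user", format: "json", path: "~/.pi/agent/chains/scout-planner.chain.json" },
]);
console.log(resolvedChains.get("review-flow")?.path);
console.log(resolvedChains.get("scout-planner")?.path);
```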
503
507
 
504
508
  Example:
505
509
 
@@ -510,23 +514,67 @@ description: Gather context then plan implementation
510
514
  ---
511
515
 
512
516
  ## scout
517
+ phase: Context
518
+ label: Map auth flow
519
+ as: context
513
520
  output: context.md
514
521
 
515
522
  Analyze the codebase for {task}
516
523
 
517
524
  ## planner
525
+ phase: Planning
526
+ label: Implementation plan
518
527
  reads: context.md
519
528
  model: anthropic/claude-sonnet-4-5:high
520
529
  progress: true
521
530
 
522
- Create an implementation plan based on {previous}
531
+ Create an implementation plan based on {outputs.context}
523
532
  ```
524
533
 
525
- Each `## agent-name` section is a step. Config lines such as `output`, `outputMode`, `reads`, `model`, `skills`, and `progress` go immediately after the header. A blank line separates config from task text.
534
+ Each `.chain.md` `## agent-name` section is a step. Config lines such as `phase`, `label`, `as`, `outputSchema`, `output`, `outputMode`, `reads`, `model`, `skills`, and `progress` go immediately after the header. A blank line separates config from task text. In saved `.chain.md` files, `outputSchema` is a path to a JSON Schema file; direct tool calls and `.chain.json` files can pass the schema object inline.
526
535
 
527
536
  For `output`, `reads`, `skills`, and `progress`, chain behavior is three-state: omitted inherits from the agent, a value overrides, and `false` disables.
528
537
 
529
- Create chains by writing `.chain.md` files directly or with the `subagent({ action: "create", config: ... })` management action. Run them with natural language or:
538
+ Use `phase` to group related work in status output, `label` for a readable step name, and `as` to store a successful step or parallel task result for later `{outputs.name}` references. Duplicate `as` names, invalid identifiers, and unknown output references fail before child execution.
539
+
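The pre-execution checks named above can be sketched for a flat sequential chain. This is a simplified illustration, not the package's validator: `StepSketch` and `validateNamedOutputs` are hypothetical, the identifier pattern is an assumption (the README only says invalid identifiers are rejected), and parallel or dynamic steps are not modeled.

```typescript
// Hypothetical minimal step shape; real steps carry more fields.
interface StepSketch {
  agent: string;
  task: string;
  as?: string;
}

// Assumed identifier rule, chosen for illustration only.
const OUTPUT_NAME = /^[A-Za-z_][A-Za-z0-9_]*$/;

// Collect every problem before any child would run.
function validateNamedOutputs(steps: StepSketch[]): string[] {
  const errors: string[] = [];
  const known = new Set<string>();
  steps.forEach((step, index) => {
    // References may only point at names declared by earlier steps.
    const pattern = /\{outputs\.([^}]+)\}/g;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(step.task)) !== null) {
      const ref = match[1] ?? "";
      if (!known.has(ref)) errors.push(`step ${index}: unknown output reference "${ref}"`);
    }
    if (step.as === undefined) return;
    if (!OUTPUT_NAME.test(step.as)) errors.push(`step ${index}: invalid output name "${step.as}"`);
    else if (known.has(step.as)) errors.push(`step ${index}: duplicate output name "${step.as}"`);
    else known.add(step.as);
  });
  return errors;
}

const outputErrors = validateNamedOutputs([
  { agent: "scout", task: "Analyze the codebase for {task}", as: "context" },
  { agent: "planner", task: "Plan from {outputs.context} and {outputs.missing}", as: "context" },
]);
console.log(outputErrors);
```

Returning a list of errors rather than throwing on the first one is a design choice of this sketch; it lets a caller report every naming problem in one pass.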
540
+ Dynamic fanout is available only through direct `subagent({ chain: [...] })` JSON or saved `.chain.json` files. It expands an array from a prior structured named output, runs one child template per item, and stores the ordered collection under `collect.as`. The source must be structured output; prose is never parsed. `expand.maxItems` is required, over-limit arrays fail, nested fanout and arbitrary expressions are not supported, and `.chain.md` has no dynamic syntax in this release.
541
+
542
+ ```json
543
+ {
544
+ "name": "dynamic-review",
545
+ "description": "Find review targets, fan out reviewers, then synthesize.",
546
+ "chain": [
547
+ {
548
+ "agent": "scout",
549
+ "task": "Return {\"items\":[{\"path\":\"...\",\"reason\":\"...\"}]} via structured_output.",
550
+ "as": "targets",
551
+ "outputSchema": { "type": "object" }
552
+ },
553
+ {
554
+ "expand": {
555
+ "from": { "output": "targets", "path": "/items" },
556
+ "item": "target",
557
+ "key": "/path",
558
+ "maxItems": 12
559
+ },
560
+ "parallel": {
561
+ "agent": "reviewer",
562
+ "label": "Review {target.path}",
563
+ "task": "Review {target.path}. Reason: {target.reason}",
564
+ "outputSchema": { "type": "object" }
565
+ },
566
+ "collect": { "as": "reviews" },
567
+ "concurrency": 4
568
+ },
569
+ {
570
+ "agent": "worker",
571
+ "task": "Synthesize fixes from {outputs.reviews}"
572
+ }
573
+ ]
574
+ }
575
+ ```
576
+
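To make the expansion step concrete, here is a rough sketch of how one `expand` entry could turn a structured output into per-item task strings. It assumes that `path` behaves like a JSON Pointer (RFC 6901), which the README's `/items` and `/path` values suggest but do not state. `ExpandSpec`, `resolvePointer`, `renderTemplate`, and `expandTasks` are hypothetical helpers, not the package's API.

```typescript
// Hypothetical subset of the documented `expand` fields.
interface ExpandSpec {
  from: { output: string; path: string };
  item: string;
  maxItems: number;
}

// Resolve a JSON Pointer such as "/items" against a structured value (assumed semantics).
function resolvePointer(root: unknown, pointer: string): unknown {
  if (pointer === "") return root;
  let current: unknown = root;
  for (const rawToken of pointer.split("/").slice(1)) {
    const token = rawToken.replace(/~1/g, "/").replace(/~0/g, "~");
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[token];
  }
  return current;
}

// Replace {itemName.field} placeholders with values from one expanded item.
function renderTemplate(template: string, itemName: string, item: Record<string, unknown>): string {
  return template.replace(
    /\{([A-Za-z_]\w*)\.([A-Za-z_]\w*)\}/g,
    (whole: string, name: string, field: string) =>
      name === itemName && field in item ? String(item[field]) : whole,
  );
}

// Expand a structured array into one task string per item, enforcing maxItems.
function expandTasks(outputs: Record<string, unknown>, spec: ExpandSpec, taskTemplate: string): string[] {
  const items = resolvePointer(outputs[spec.from.output], spec.from.path);
  if (!Array.isArray(items)) throw new Error(`expand source ${spec.from.path} is not an array`);
  if (items.length > spec.maxItems) {
    throw new Error(`expand source has ${items.length} items, limit is ${spec.maxItems}`);
  }
  return items.map((item) => renderTemplate(taskTemplate, spec.item, item as Record<string, unknown>));
}

const structuredOutputs = {
  targets: {
    items: [
      { path: "src/auth.ts", reason: "token refresh" },
      { path: "src/session.ts", reason: "expiry handling" },
    ],
  },
};
const fanoutTasks = expandTasks(
  structuredOutputs,
  { from: { output: "targets", path: "/items" }, item: "target", maxItems: 12 },
  "Review {target.path}. Reason: {target.reason}",
);
console.log(fanoutTasks);
```

Failing on an over-limit array, rather than truncating it, mirrors the README's statement that over-limit arrays fail.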
577
+ Create simple `.chain.md` chains by writing files directly or with the `subagent({ action: "create", config: ... })` management action. Create dynamic `.chain.json` chains by writing the JSON file directly. Run saved chains with natural language or:
530
578
 
531
579
  ```text
532
580
  /run-chain scout-planner -- refactor authentication
@@ -541,6 +589,7 @@ Task templates support:
541
589
  | `{task}` | Original task from the first step. |
542
590
  | `{previous}` | Output from the prior step, or aggregated output from a parallel step. |
543
591
  | `{chain_dir}` | Path to the chain artifact directory. |
592
+ | `{outputs.name}` | Text value from a prior step or completed parallel task with `as: "name"`. |
544
593
 
545
594
  Parallel outputs are aggregated with clear separators before being passed to the next step:
546
595
 
@@ -634,14 +683,49 @@ These are the parameters the LLM passes when it calls the `subagent` tool. Most
634
683
 
635
684
  // Chain with fan-out/fan-in
636
685
  { chain: [
637
- { agent: "scout", task: "Gather context" },
686
+ { agent: "scout", task: "Gather context", phase: "Context", label: "Map code", as: "context" },
638
687
  { parallel: [
639
- { agent: "worker", task: "Implement feature A from {previous}" },
640
- { agent: "worker", task: "Implement feature B from {previous}" }
688
+ { agent: "worker", task: "Implement feature A from {outputs.context}", label: "Feature A", as: "featureA" },
689
+ { agent: "worker", task: "Implement feature B from {outputs.context}", label: "Feature B", as: "featureB" }
641
690
  ], concurrency: 2, failFast: true },
642
- { agent: "reviewer", task: "Review all changes from {previous}" }
691
+ { agent: "reviewer", task: "Review {outputs.featureA} and {outputs.featureB}" }
643
692
  ]}
644
693
 
694
+ // Dynamic fanout from structured output
695
+ { chain: [
696
+ {
697
+ agent: "scout",
698
+ task: "Return review targets as structured_output: { items: [{ path, reason }] }",
699
+ as: "targets",
700
+ outputSchema: { type: "object" }
701
+ },
702
+ {
703
+ expand: { from: { output: "targets", path: "/items" }, item: "target", key: "/path", maxItems: 12 },
704
+ parallel: { agent: "reviewer", task: "Review {target.path}. Reason: {target.reason}", outputSchema: { type: "object" } },
705
+ collect: { as: "reviews" },
706
+ concurrency: 4
707
+ },
708
+ { agent: "worker", task: "Synthesize fixes from {outputs.reviews}" }
709
+ ] }
710
+
711
+ // Strict structured output for reliable handoff data
712
+ { chain: [
713
+ {
714
+ agent: "scout",
715
+ task: "Return the key files and risks for {task}",
716
+ as: "scan",
717
+ outputSchema: {
718
+ type: "object",
719
+ required: ["files", "risks"],
720
+ properties: {
721
+ files: { type: "array", items: { type: "string" } },
722
+ risks: { type: "array", items: { type: "string" } }
723
+ }
724
+ }
725
+ },
726
+ { agent: "planner", task: "Plan from this scan: {outputs.scan}" }
727
+ ] }
728
+
645
729
  // Worktree isolation
646
730
  { tasks: [
647
731
  { agent: "worker", task: "Implement auth" },
@@ -711,10 +795,11 @@ Agent definitions are not loaded into context by default. Management actions let
711
795
  | `outputMode` | `"inline" \| "file-only"` | `inline` | Return saved output inline or as a concise saved-file reference. `file-only` requires an `output` path. |
712
796
  | `skill` | `string \| string[] \| false` | agent default | Override skills or disable all. |
713
797
  | `model` | string | agent default | Override model. |
714
- | `tasks` | array | - | Top-level parallel tasks. Supports `agent`, `task`, `cwd`, `count`, `output`, `outputMode`, `reads`, `progress`, `skill`, and `model`. |
798
+ | `tasks` | array | - | Top-level parallel tasks. Supports `agent`, `task`, `cwd`, `count`, `output`, `outputMode`, `reads`, `progress`, `skill`, `model`, and `acceptance`. |
715
799
  | `concurrency` | number | config or `4` | Top-level parallel concurrency. |
800
+ | `timeoutMs` / `maxRuntimeMs` | number | - | Foreground wall-clock timeout for single, parallel, and chain runs. Timed-out children return `timedOut: true`; async/background runs reject it. |
716
801
  | `worktree` | boolean | false | Create isolated git worktrees for parallel tasks. |
717
- | `chain` | array | - | Sequential and parallel chain steps. |
802
+ | `chain` | array | - | Sequential, static parallel, and dynamic fanout chain steps. Sequential steps and parallel child tasks support `phase`, `label`, `as`, `outputSchema`, and `acceptance` in addition to the usual execution fields. Dynamic fanout uses `expand`, one child `parallel` template, and `collect`; group-level acceptance is not supported because there is no child session to finalize. |
718
803
  | `context` | `fresh \| fork` | agent default or `fresh` | `fork` creates real branched sessions from the parent leaf. Packaged `planner`, `worker`, and `oracle` default to `fork`. |
719
804
  | `chainDir` | string | temp chain dir | Persistent directory for chain artifacts. |
720
805
  | `clarify` | boolean | true for chains | Show TUI preview/edit flow. |
@@ -726,12 +811,13 @@ Agent definitions are not loaded into context by default. Management actions let
726
811
  | `includeProgress` | boolean | false | Include full progress in result. |
727
812
  | `share` | boolean | false | Upload session export to GitHub Gist. |
728
813
  | `sessionDir` | string | derived | Override session log directory. |
814
+ | `acceptance` | object | omitted | Explicit acceptance contract. When present, the child gets a structured contract, then the runtime continues the same session for a bounded self-review/repair loop before evaluating acceptance. |
729
815
 
730
816
  `context: "fork"` fails fast when the parent session is not persisted, the current leaf is missing, or the branched child session cannot be created. It never silently downgrades to `fresh`. In multi-agent runs, if any requested agent has `defaultContext: fork` and the launch omits `context`, the whole invocation uses forked context; pass `context: "fresh"` when you intentionally want a fresh run.
731
817
 
732
818
  Use `outputMode: "file-only"` when a saved output may be large and the parent only needs a pointer. The returned text is a compact reference like `Output saved to: /abs/report.md (48.2 KB, 2847 lines). Read this file if needed.` Failed runs and save errors still return normal inline output for debugging. In chains, later `{previous}` steps receive the same compact reference when the prior step used file-only mode.
733
819
 
734
- Sequential and parallel chain tasks accept `agent`, `task`, `cwd`, `output`, `outputMode`, `reads`, `progress`, `skill`, and `model`. Parallel tasks also accept `count`. Parallel step groups accept `parallel`, `concurrency`, `failFast`, and `worktree`.
820
+ Sequential and parallel chain tasks accept `agent`, `task`, `phase`, `label`, `as`, `outputSchema`, `cwd`, `output`, `outputMode`, `reads`, `progress`, `skill`, and `model`. Parallel tasks also accept `count`. Parallel step groups accept `parallel`, `concurrency`, `failFast`, and `worktree`. If `outputSchema` is present, the child must call `structured_output` with schema-valid JSON; prose-only completion or invalid JSON fails the step. Validated structured values are preserved on the step result, and `as` also exposes a compact text representation through `{outputs.name}`.
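The `outputSchema` contract can be illustrated with a deliberately tiny checker that covers only the schema shape used in the README's strict structured-output example (`required` plus arrays of strings). This is a sketch under that assumption, not a JSON Schema implementation, and `ObjectSchemaSketch` and `validateStructuredOutput` are hypothetical names.

```typescript
// Only the schema subset from the README example is modeled here.
interface ArrayOfStringsSchema {
  type: "array";
  items: { type: "string" };
}

interface ObjectSchemaSketch {
  type: "object";
  required: string[];
  properties: Record<string, ArrayOfStringsSchema>;
}

// Return a list of problems; an empty list means the value is schema-valid for this subset.
function validateStructuredOutput(value: unknown, schema: ObjectSchemaSketch): string[] {
  if (value === null || typeof value !== "object" || Array.isArray(value)) {
    return ["value must be an object"];
  }
  const record = value as Record<string, unknown>;
  const errors: string[] = [];
  for (const key of schema.required) {
    if (!(key in record)) errors.push(`missing required property "${key}"`);
  }
  for (const key of Object.keys(schema.properties)) {
    const field = record[key];
    if (field === undefined) continue;
    if (!Array.isArray(field) || !field.every((entry) => typeof entry === "string")) {
      errors.push(`property "${key}" must be an array of strings`);
    }
  }
  return errors;
}

const scanSchema: ObjectSchemaSketch = {
  type: "object",
  required: ["files", "risks"],
  properties: {
    files: { type: "array", items: { type: "string" } },
    risks: { type: "array", items: { type: "string" } },
  },
};

const validScanErrors = validateStructuredOutput({ files: ["src/auth.ts"], risks: [] }, scanSchema);
const invalidScanErrors = validateStructuredOutput({ files: "src/auth.ts" }, scanSchema);
console.log(validScanErrors, invalidScanErrors);
```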
735
821
 
736
822
  Status and control actions:
737
823
 
@@ -830,6 +916,19 @@ Session directory precedence is: `params.sessionDir`, then `config.defaultSessio
830
916
 
831
917
  Controls nested delegation when no inherited `PI_SUBAGENT_MAX_DEPTH` is already in effect. Per-agent `maxSubagentDepth` can tighten the limit for that agent’s child runs, but cannot relax an inherited stricter limit. This applies even to children that explicitly declare `tools: subagent`; at the cap, execution fanout is blocked instead of silently hiding nested work.
832
918
 
919
+ ### Agent resource limits
920
+
921
+ Set `maxExecutionTimeMs` and `maxTokens` in agent frontmatter or through `subagent({ action: "create" | "update", config })` to bound a specific agent across foreground and async runs.
922
+
923
+ ```yaml
924
+ maxExecutionTimeMs: 600000
925
+ maxTokens: 50000
926
+ ```
927
+
928
+ When a limit is reached, the child receives a soft interrupt, the run fails with a clear `Resource limit exceeded...` error, and the result includes `resourceLimitExceeded` with the limit kind, configured limit, and observed token count when available. Resource-limit failures do not trigger fallback model retries. `maxTokens` is best-effort because providers report usage after message events; a child may exceed the exact limit before the runtime can stop it.
929
+
930
+ Spawn-count and per-agent child-concurrency quotas are not part of this release; use `maxSubagentDepth` and parallel `concurrency` for those boundaries today.
931
+
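The best-effort nature of `maxTokens` is easier to see in a small model. The sketch below assumes usage arrives as per-message deltas and that the check runs after each event; both are assumptions for illustration. `UsageEvent`, `checkTokenBudget`, and the field names inside the returned object are hypothetical, since the README lists only what the `resourceLimitExceeded` result contains, not its exact shape.

```typescript
// Hypothetical usage record, assumed to be reported once per model message.
interface UsageEvent {
  inputTokens: number;
  outputTokens: number;
}

// Illustrative shape: limit kind, configured limit, observed token count.
interface ResourceLimitExceeded {
  kind: "maxTokens";
  limit: number;
  observedTokens: number;
}

// Accumulate input plus output tokens and report the first event that reaches the limit.
function checkTokenBudget(events: UsageEvent[], maxTokens: number): ResourceLimitExceeded | undefined {
  let observed = 0;
  for (const event of events) {
    observed += event.inputTokens + event.outputTokens;
    if (observed >= maxTokens) {
      return { kind: "maxTokens", limit: maxTokens, observedTokens: observed };
    }
  }
  return undefined;
}

const budgetResult = checkTokenBudget(
  [
    { inputTokens: 20000, outputTokens: 5000 },
    { inputTokens: 24000, outputTokens: 6000 },
  ],
  50000,
);
console.log(budgetResult);
```

In this run the first event leaves the child at 25,000 tokens, and the second pushes it to 55,000 before the check can fire, so the observed count overshoots the configured 50,000. That is the overshoot the README warns about.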
833
932
  ### `intercomBridge`
834
933
 
835
934
  ```json
@@ -888,7 +987,7 @@ Debug artifacts live under `{sessionDir}/subagent-artifacts/` or a user-scoped t
888
987
  - `{runId}_{agent}.jsonl`
889
988
  - `{runId}_{agent}_meta.json`
890
989
 
891
- Metadata records timing, usage, exit code, final model, attempted models, and fallback attempt outcomes.
990
+ Metadata records timing, usage, exit code, final model, attempted models, fallback attempt outcomes, and any resource-limit termination reason.
892
991
 
893
992
  Session files are stored under a per-run session directory. With `context: "fork"`, each child starts with `--session <branched-session-file>` produced from the parent’s current leaf. That is a real session fork, not an injected summary.
894
993
 
@@ -906,13 +1005,70 @@ Async runs write:
906
1005
 
907
1006
  `status.json` powers the widget and `subagent({ action: "status" })` output. `events.jsonl` contains wrapper events plus child Pi JSON events annotated with run and step metadata. Nested fanout status is stored as compact sidecar event/registry metadata and merged into parent status views and result/intercom payloads; full recursive status snapshots are not embedded in parent result files. `output-<n>.log` is a live human-readable tail. Fallback information is persisted so background runs are debuggable after completion.
908
1007
 
1008
+ ## Acceptance Gates
1009
+
1010
+ `acceptance` is an explicit contract. Omit it for lightweight runs. Set it on single runs, top-level parallel task items, sequential chain steps, static parallel task items, and dynamic fanout child templates when the child must prove the work meets concrete criteria. Do not set it on static parallel groups or dynamic fanout aggregate groups; those groups do not own a same-session child turn.
1011
+
1012
+ If you are coming from Codex Goals, `acceptance` is the subagent equivalent for one delegated run. When a user says `/goal`, “goal”, “active goal”, “continue until evidence says done”, or “verify against a goal”, translate that into an acceptance contract: `criteria` are the target, `evidence` and `verify` are proof, `stopRules` are constraints, and `maxFinalizationTurns` is the bounded loop budget.
1013
+
1014
+ ```ts
1015
+ {
1016
+ agent: "worker",
1017
+ task: "Implement the fix",
1018
+ acceptance: {
1019
+ criteria: ["Patch the bug without widening scope"],
1020
+ evidence: ["changed-files", "tests-added", "commands-run", "residual-risks", "no-staged-files"],
1021
+ verify: [{ id: "focused", command: "npm test", timeoutMs: 120000 }],
1022
+ maxFinalizationTurns: 3
1023
+ }
1024
+ }
1025
+ ```
1026
+
1027
+ When `acceptance` is present, the initial child prompt includes a standardized acceptance section and asks for a fenced `acceptance-report` JSON block. After the child’s initial completion, the runtime continues the same persisted child session with an acceptance finalization prompt. The child can repair omissions in that same session, then must return the final `acceptance-report`. Missing or malformed finalization reports reject the run when the loop limit is reached.
1028
+
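A parser for the fenced report could look roughly like the sketch below. It assumes the block is opened by a fence line whose info string is exactly `acceptance-report` and closed by a bare fence line. `ReportParse`, `extractAcceptanceReport`, and the fields inside the sample report are hypothetical, as the README does not spell out the report schema here.

```typescript
// Either a parsed report or a preserved parse error, never silently dropped.
interface ReportParse {
  report?: unknown;
  parseError?: string;
}

// Build the fence marker programmatically so this example nests cleanly in Markdown.
const FENCE = "`".repeat(3);

// Find the first fenced acceptance-report block and parse its body as JSON.
function extractAcceptanceReport(output: string): ReportParse {
  const lines = output.split("\n");
  const start = lines.findIndex((line) => line.trim() === FENCE + "acceptance-report");
  if (start === -1) return { parseError: "no fenced acceptance-report block found" };
  const bodyLength = lines.slice(start + 1).findIndex((line) => line.trim() === FENCE);
  if (bodyLength === -1) return { parseError: "acceptance-report block is not closed" };
  const body = lines.slice(start + 1, start + 1 + bodyLength).join("\n");
  try {
    return { report: JSON.parse(body) };
  } catch (error) {
    return { parseError: error instanceof Error ? error.message : String(error) };
  }
}

const childOutput = [
  "Done. Summary of the fix follows.",
  FENCE + "acceptance-report",
  '{ "criteria": [{ "text": "Patch the bug", "satisfied": true }] }',
  FENCE,
].join("\n");

const parsedReport = extractAcceptanceReport(childOutput);
const proseOnly = extractAcceptanceReport("All criteria are satisfied, trust me.");
console.log(parsedReport, proseOnly);
```

Prose that merely claims success yields a parse error instead of a report, matching the rule that only a fenced block counts as attestation.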
1029
+ Public acceptance config is evidence-driven. There is no public `level` field and no `acceptance: "checked"` shorthand. Runtime provenance is derived from what actually happened:
1030
+
1031
+ - `attested`: the child returned a structured acceptance report.
1032
+ - `checked`: runtime structural checks passed, such as required criteria, required evidence, and no staged files.
1033
+ - `verified`: configured runtime verification commands passed. Child-reported command success does not count.
1034
+ - `reviewed`: an independent reviewer result is present.
1035
+ - `rejected`: attestation, structural checks, verification, review, or finalization failed.
1036
+
1037
+ Self-review finalization never counts as `reviewed`, and it never counts as `verified` unless configured runtime verification commands actually pass. The visible child output remains the initial answer; finalization reports and residual risks are stored in the acceptance ledger and async/status details.
1038
+
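One way to read the provenance rules above is as a function from observed facts to labels. The sketch below is an interpretation, not the runtime's evaluator: `AcceptanceFacts` and `deriveProvenance` are hypothetical, and returning a list of earned labels (rather than a single status) is a choice made here for illustration.

```typescript
type Provenance = "attested" | "checked" | "verified" | "reviewed" | "rejected";

// Hypothetical summary of what actually happened during a run.
interface AcceptanceFacts {
  reportParsed: boolean; // child returned a structured acceptance report
  structuralChecksPassed: boolean; // criteria, evidence, and staged-file checks
  runtimeVerify: { id: string; passed: boolean }[]; // commands run by the runtime, not child claims
  independentReview: boolean; // a separate reviewer result exists
  finalizationFailed: boolean;
}

function deriveProvenance(facts: AcceptanceFacts): Provenance[] {
  const failed =
    !facts.reportParsed ||
    !facts.structuralChecksPassed ||
    facts.finalizationFailed ||
    facts.runtimeVerify.some((command) => !command.passed);
  if (failed) return ["rejected"];

  const labels: Provenance[] = ["attested", "checked"];
  // Only commands executed by the runtime count; an empty list never yields "verified".
  if (facts.runtimeVerify.length > 0) labels.push("verified");
  // A self-review finalization turn is not an independent reviewer.
  if (facts.independentReview) labels.push("reviewed");
  return labels;
}

const selfReviewedOnly = deriveProvenance({
  reportParsed: true,
  structuralChecksPassed: true,
  runtimeVerify: [],
  independentReview: false,
  finalizationFailed: false,
});
const verifiedRun = deriveProvenance({
  reportParsed: true,
  structuralChecksPassed: true,
  runtimeVerify: [{ id: "focused", passed: true }],
  independentReview: false,
  finalizationFailed: false,
});
console.log(selfReviewedOnly, verifiedRun);
```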
1039
+ When delegating implementation from a plan or spec, keep the task focused on what to implement and put the definition of done in `acceptance` so the runtime can finalize and evaluate it:
1040
+
1041
+ ```ts
1042
+ subagent({
1043
+ agent: "worker",
1044
+ async: true,
1045
+ task: "Implement the plan at /Users/me/docs/mcp-alignment-plan.md. Use scout artifacts in ./handoff/ as context. Do not commit the scout artifacts.",
1046
+ acceptance: {
1047
+ criteria: [
1048
+ "Implementation follows /Users/me/docs/mcp-alignment-plan.md",
1049
+ "Plan acceptance checks are addressed",
1050
+ "Scout handoff artifacts are not committed",
1051
+ "Focused validation for changed behavior passes",
1052
+ "Residual risks or skipped checks are reported"
1053
+ ],
1054
+ evidence: ["changed-files", "commands-run", "validation-output", "residual-risks"],
1055
+ verify: [{ id: "focused", command: "npm test -- --runInBand" }],
1056
+ stopRules: [
1057
+ "Do not edit unrelated files",
1058
+ "Stop and report if the plan requires an unapproved product decision"
1059
+ ],
1060
+ maxFinalizationTurns: 3
1061
+ }
1062
+ })
1063
+ ```
1064
+
909
1065
  ## Live progress
910
1066
 
911
- Foreground runs show compact live progress for single, chain, and parallel modes: current tool, recent output, token counts, duration, activity freshness, and current-tool duration.
1067
+ Foreground runs show compact live progress for single, chain, and parallel modes: current tool, recent output, token counts, duration, activity freshness, current-tool duration, and chain graph metadata when available.
912
1068
 
913
1069
  Press `Ctrl+O` to expand the full streaming view with complete output per step.
914
1070
 
915
- Sequential chains show a flow line like `done scout → running planner`. Chains with parallel steps show per-step cards instead.
1071
+ Sequential chains show a flow line like `done scout → running planner`. Chains with parallel steps show per-step cards instead. Chain status uses `label` and `phase` metadata when present, while falling back to agent names for older chains.
916
1072
 
917
1073
  ## Session sharing
918
1074
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "pi-subagents",
3
- "version": "0.25.0",
3
+ "version": "0.28.0",
4
4
  "description": "Pi extension for delegating tasks to subagents with chains, parallel execution, and TUI clarification",
5
5
  "author": "Nico Bailon",
6
6
  "license": "MIT",
package/prompts/parallel-context-build.md CHANGED
@@ -4,12 +4,14 @@ description: Parallel context builders for planning handoff
4
4
 
5
5
  Launch fresh-context `context-builder` subagents in parallel to build grounded handoff context for planning or implementation.
6
6
 
7
- Use the `subagent` tool in chain mode with a single parallel step, not top-level parallel tasks, so relative output files live under the temporary chain directory. Use `context: "fresh"` unless I explicitly ask for forked context. Give every parallel task a distinct `output` path, for example:
7
+ Use the `subagent` tool in chain mode with a single parallel step, not top-level parallel tasks, so relative output files live under the temporary chain directory. Use `context: "fresh"` unless I explicitly ask for forked context. Give every parallel task a distinct `output` path, `label`, and `as` name, for example:
8
8
 
9
9
  - `context-build/request-and-scope.md`
10
10
  - `context-build/codebase-and-patterns.md`
11
11
  - `context-build/validation-and-risks.md`
12
12
 
13
+ Use one phase such as `phase: "Context build"` for the parallel tasks so async status is readable. A later synthesis step can reference specific outputs with `{outputs.requestScope}`, `{outputs.codebasePatterns}`, and `{outputs.validationRisks}` instead of relying only on `{previous}`.
14
+
13
15
  Do not write these context artifacts into the repository unless I explicitly ask for persistent files.
14
16
 
15
17
  Treat the slash command arguments as the primary request, target, or focus:
package/prompts/parallel-handoff-plan.md CHANGED
@@ -19,12 +19,14 @@ Use the `subagent` tool in chain mode:
19
19
 
20
20
  2. Second step: a synthesis `context-builder` that reads the parallel findings and writes the final handoff plan and meta-prompt.
21
21
 
22
- Use distinct output paths under the chain directory. Example outputs:
22
+ Use distinct output paths, `label` values, and `as` names under the chain directory. Example outputs:
23
23
  - `handoff/external-reference.md`
24
24
  - `handoff/local-context.md`
25
25
  - `handoff/implementation-strategy.md`
26
26
  - `handoff/final-handoff-plan.md`
27
27
 
28
+ Use phases such as `Research`, `Local context`, and `Synthesis` so async status is readable. Prefer `{outputs.externalReference}`, `{outputs.localContext}`, and `{outputs.implementationStrategy}` in the synthesis task when those specific inputs are available; keep `{previous}` only when the whole parallel fan-in summary is the desired input.
29
+
28
30
  Do not write these artifacts into the repository unless I explicitly ask for persistent files.
29
31
 
30
32
  Role guidance:
package/skills/pi-subagents/SKILL.md CHANGED
@@ -32,7 +32,7 @@ Humans often use the slash-command layer instead:
32
32
  - `/run` — launch a single agent
33
33
  - `/chain` — launch a chain of steps
34
34
  - `/parallel` — launch top-level parallel tasks
35
- - `/run-chain` — launch a saved `.chain.md` workflow
35
+ - `/run-chain` — launch a saved `.chain.md` or `.chain.json` workflow
36
36
  - `/subagents-doctor` — diagnose setup, discovery, async paths, and intercom bridge state
37
37
 
38
38
  Prefer the tool when you are writing agent logic. Prefer the slash commands when
@@ -118,7 +118,9 @@ Use this when a broad diff has known reviewer findings across several items and
118
118
  2. One writer worker. It receives the planner summaries through `{previous}`, the parent’s accepted scope, stop rules, and verification contract. It is the only child allowed to edit the active worktree.
119
119
  3. A parallel read-only validation fanout. Validators inspect the worker diff from fresh context with distinct angles, report pass/fail, remaining blockers, and missing verification.
120
120
 
121
- Prefer `async: true`, `context: "fresh"` for planners/validators, `outputMode: "file-only"` for large summaries, and per-stage output names that will not collide. Use this pattern instead of launching several writer workers into a dirty worktree. Include non-blocking suggestions in the writer prompt only when they are small, safe, and do not expand product scope; otherwise record them as deferred.
121
+ Prefer `async: true`, `context: "fresh"` for planners/validators, `outputMode: "file-only"` for large summaries, and per-stage output names that will not collide. Add `phase` and `label` to make async status readable, and use `as` plus `{outputs.name}` when a later step needs a specific earlier result instead of the whole `{previous}` blob. Use this pattern instead of launching several writer workers into a dirty worktree. Include non-blocking suggestions in the writer prompt only when they are small, safe, and do not expand product scope; otherwise record them as deferred.
122
+
123
+ When the first step can return a structured target list, prefer dynamic fanout over hand-authoring a static parallel group. Use `outputSchema` and `as` on the producer, then an `expand` step with `from: { output, path }`, an explicit `maxItems`, one `parallel` child template, and `collect.as`. Item templates may use `{item}` or a named item such as `{target.path}`. Do not use dynamic fanout for prose outputs, nested fanout, dynamic agent selection, reducers, `when` conditions, or arbitrary expressions. `.chain.md` does not support dynamic fanout syntax, so use direct JSON or a saved `.chain.json`.
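The dynamic fanout shape described above could look roughly like the following sketch, written as a plain TypeScript object so it can be serialized to JSON. Only the field names (`outputSchema`, `as`, `expand`, `from: { output, path }`, `maxItems`, `parallel`, `collect.as`, `{item}`) come from the text. The nesting, the `path` syntax, the agent names, and the schema contents are assumptions for illustration and should be checked against the package's own schemas.

```typescript
// Hypothetical shape: the nesting and the `path` syntax are assumptions
// inferred from the prose, not a confirmed schema.
const fanoutChain = [
  {
    // Producer: returns a structured target list under a named output.
    agent: "scout",
    as: "targets",
    task: "List the files that need review as structured output.",
    outputSchema: {
      type: "object",
      properties: { files: { type: "array", items: { type: "string" } } },
      required: ["files"],
    },
  },
  {
    // Expand step: one child per item found in the producer's `files` list.
    expand: {
      from: { output: "targets", path: "files" },
      maxItems: 8, // explicit upper bound on spawned children
      parallel: { agent: "reviewer", task: "Review {item} and report findings." },
      collect: { as: "reviews" },
    },
  },
  {
    // Consumer: reads the collected results by name.
    agent: "worker",
    task: "Summarize the review findings:\n{outputs.reviews}",
  },
];

// Serialize for a direct JSON call or a saved .chain.json file.
const fanoutJson = JSON.stringify(fanoutChain, null, 2);
```

Keeping the producer's list as plain strings lets the child template use `{item}` directly. A named item such as `{target.path}` would need object items and whatever naming mechanism the package provides.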
122
124
 
123
125
  Example shape:
124
126
 
@@ -128,14 +130,14 @@ subagent({
128
130
  context: "fresh",
129
131
  chain: [
130
132
  { parallel: [
131
- { agent: "reviewer", task: "Plan fixes for deploy docs/workflow. Inspect the current diff. Do not modify project/source files; returning findings via the configured output artifact is allowed.", output: "plans/deploy.md", outputMode: "file-only" },
132
- { agent: "reviewer", task: "Plan fixes for scheduler contract. Inspect the current diff. Do not modify project/source files; returning findings via the configured output artifact is allowed.", output: "plans/scheduler.md", outputMode: "file-only" },
133
- { agent: "reviewer", task: "Plan fixes for sandbox/security. Inspect the current diff. Do not modify project/source files; returning findings via the configured output artifact is allowed.", output: "plans/sandbox.md", outputMode: "file-only" }
133
+ { agent: "reviewer", phase: "Planning", label: "Deploy docs", as: "deployPlan", task: "Plan fixes for deploy docs/workflow. Inspect the current diff. Do not modify project/source files; returning findings via the configured output artifact is allowed.", output: "plans/deploy.md", outputMode: "file-only" },
134
+ { agent: "reviewer", phase: "Planning", label: "Scheduler contract", as: "schedulerPlan", task: "Plan fixes for scheduler contract. Inspect the current diff. Do not modify project/source files; returning findings via the configured output artifact is allowed.", output: "plans/scheduler.md", outputMode: "file-only" },
135
+ { agent: "reviewer", phase: "Planning", label: "Sandbox/security", as: "sandboxPlan", task: "Plan fixes for sandbox/security. Inspect the current diff. Do not modify project/source files; returning findings via the configured output artifact is allowed.", output: "plans/sandbox.md", outputMode: "file-only" }
134
136
  ], concurrency: 3 },
135
- { agent: "worker", task: "Apply only the accepted fixes from these planning summaries. You are the sole writer for the active worktree. Run focused validation and report changed files, commands, failures, and remaining issues.\n\nPlanning summaries:\n{previous}", output: "worker/fixes.md", outputMode: "file-only", progress: true },
137
+ { agent: "worker", phase: "Implementation", label: "Apply accepted fixes", as: "workerResult", task: "Apply only the accepted fixes from these planning summaries. You are the sole writer for the active worktree.\n\nDeploy plan:\n{outputs.deployPlan}\n\nScheduler plan:\n{outputs.schedulerPlan}\n\nSandbox plan:\n{outputs.sandboxPlan}", acceptance: { criteria: ["Accepted fixes from each planning summary are applied", "Focused validation for changed behavior passes", "Changed files, validation commands, failures, and residual risks are reported"], evidence: ["changed-files", "commands-run", "validation-output", "residual-risks"], stopRules: ["Do not expand product scope beyond accepted fixes", "Stop and report if a fix requires an unapproved decision"], maxFinalizationTurns: 3 }, output: "worker/fixes.md", outputMode: "file-only", progress: true },
136
138
  { parallel: [
137
- { agent: "reviewer", task: "Validate the post-worker diff for deploy and scheduler fixes. Do not modify project/source files; returning findings via the configured output artifact is allowed.", output: "validation/deploy-scheduler.md", outputMode: "file-only" },
138
- { agent: "reviewer", task: "Validate the post-worker diff for sandbox/security fixes. Do not modify project/source files; returning findings via the configured output artifact is allowed.", output: "validation/sandbox.md", outputMode: "file-only" }
139
+ { agent: "reviewer", phase: "Validation", label: "Deploy/scheduler validation", task: "Validate the post-worker diff for deploy and scheduler fixes. Start from the worker result: {outputs.workerResult}. Do not modify project/source files; returning findings via the configured output artifact is allowed.", output: "validation/deploy-scheduler.md", outputMode: "file-only" },
140
+ { agent: "reviewer", phase: "Validation", label: "Sandbox validation", task: "Validate the post-worker diff for sandbox/security fixes. Start from the worker result: {outputs.workerResult}. Do not modify project/source files; returning findings via the configured output artifact is allowed.", output: "validation/sandbox.md", outputMode: "file-only" }
139
141
  ], concurrency: 2 }
140
142
  ]
141
143
  })
@@ -217,10 +219,10 @@ Agent files can live in:
217
219
  - legacy `.agents/**/*.md` — still read for compatibility, but `.pi/agents/` wins on conflicts
218
220
 
219
221
  Chains live in:
220
- - `~/.pi/agent/chains/**/*.chain.md` — user scope
221
- - `.pi/chains/**/*.chain.md` — project scope
222
+ - `~/.pi/agent/chains/**/*.chain.md` and `~/.pi/agent/chains/**/*.chain.json` — user scope
223
+ - `.pi/chains/**/*.chain.md` and `.pi/chains/**/*.chain.json` — project scope
222
224
 
223
- Discovery is recursive. `.chain.md` files do not define agents. Agents and chains can set optional frontmatter `package: code-analysis`; `name: scout` plus `package: code-analysis` registers as runtime name `code-analysis.scout` while serialization keeps `name` and `package` separate.
225
+ Discovery is recursive. `.chain.md` files do not define agents. Use `.chain.md` for simple saved chains and `.chain.json` for dynamic fanout or inline schema objects. Agents and chains can set optional frontmatter/package metadata; `name: scout` plus `package: code-analysis` registers as runtime name `code-analysis.scout` while serialization keeps `name` and `package` separate.
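The naming rule above can be illustrated with a small sketch. `Definition` and `runtimeName` are hypothetical names introduced here. They mirror the stated rule (`package` plus `name` joined with a dot) and are not the package's own code.

```typescript
// Hypothetical helper that mirrors the naming rule stated above.
interface Definition {
  name: string;
  package?: string;
}

// Join package and name with a dot when a package is set; otherwise use the bare name.
function runtimeName(def: Definition): string {
  return def.package ? `${def.package}.${def.name}` : def.name;
}

const scout: Definition = { name: "scout", package: "code-analysis" };
const qualified = runtimeName(scout); // "code-analysis.scout"
const plain = runtimeName({ name: "scout" }); // "scout"
```

The stored definition keeps `name` and `package` as separate fields. Only the derived runtime name combines them.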
224
226
 
225
227
  Precedence is by parsed runtime name:
226
228
  1. project scope
@@ -290,9 +292,13 @@ subagent({
290
292
  })
291
293
  ```
292
294
 
293
- Chain steps can use templated variables such as `{task}`, `{previous}`, and
294
- `{chain_dir}`. This is the main way to pass structured summaries between steps
295
- without forcing each step to rediscover everything.
295
+ Chain steps can use templated variables such as `{task}`, `{previous}`,
296
+ `{chain_dir}`, and `{outputs.name}`. Set `as: "name"` on a step or
297
+ parallel task to make its output available to later steps once it succeeds. Prefer named outputs
298
+ when a later step needs one specific result; keep `{previous}` for simple linear
299
+ handoffs or full fan-in summaries. Use `phase` and `label` for status readability.
300
+ Use `outputSchema` when later steps need reliable structured data; the child must
301
+ call `structured_output` with schema-valid JSON, or the step fails.
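To make the difference between `{previous}` and `{outputs.name}` concrete, here is a minimal substitution sketch. `TemplateContext` and `renderTemplate` are hypothetical names. The package's real substitution rules (escaping, missing values, extra variables) may differ.

```typescript
// Hypothetical renderer for illustration only; not the package's implementation.
interface TemplateContext {
  task: string;
  previous: string;
  chain_dir: string;
  outputs: Record<string, string>;
}

function renderTemplate(template: string, ctx: TemplateContext): string {
  return template.replace(
    /\{(task|previous|chain_dir|outputs\.([A-Za-z0-9_]+))\}/g,
    (match: string, key: string, outputName: string | undefined) => {
      if (outputName !== undefined) {
        // Leave unknown named outputs untouched so the gap stays visible.
        return Object.prototype.hasOwnProperty.call(ctx.outputs, outputName)
          ? ctx.outputs[outputName]
          : match;
      }
      return ctx[key as "task" | "previous" | "chain_dir"];
    },
  );
}

const rendered = renderTemplate(
  "Deploy plan:\n{outputs.deployPlan}\n\nAll findings:\n{previous}",
  {
    task: "fix review findings",
    previous: "summary",
    chain_dir: "/tmp/chain",
    outputs: { deployPlan: "update docs" },
  },
);
```

A named output pulls in exactly one earlier result, while `{previous}` carries whatever the preceding step or fan-in produced as a whole.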
296
302
 
297
303
  ### Async/background
298
304
 
@@ -684,7 +690,39 @@ For feature work, use this sequence as scaffolding for parent-agent behavior:
684
690
  clarify → validation contract → planner → async worker → parallel async fresh-context reviewers/validators → async fix worker → follow-up review when warranted → parent review
685
691
  ```
686
692
 
687
- The validation contract defines what done means before code is written: expected behavior, acceptance checks, commands or user flows to exercise, and evidence the worker should return. Keep it lightweight for small tasks, but make it explicit enough that reviewers and validators are checking the intended outcome rather than the worker’s own assumptions.
693
+ The validation contract defines acceptance before code is written: expected behavior, acceptance checks, commands or user flows to exercise, and evidence the worker should return. Keep it lightweight for small tasks, but make it explicit enough that reviewers and validators are checking the intended outcome rather than the worker’s own assumptions.
694
+
695
+ Use the structured `acceptance` field when the run should carry an explicit acceptance contract. If omitted, the run stays lightweight. When present, `acceptance` must be an object: define concrete `criteria`, required `evidence`, optional runtime `verify` commands, optional independent `review`, and optional `maxFinalizationTurns`. The runtime continues the same child session for a bounded self-review/repair loop before evaluating the final report. Because that loop needs a single child session, set `acceptance` on single runs, sequential chain steps, parallel task items, and dynamic fanout child templates, not on static parallel groups or dynamic fanout groups. Do not call a run reviewed just because the worker says it is done; reviewed means a reviewer gate returned a result. Child-reported command success is evidence, not runtime verification.
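The placement rule above can be sketched as a small check that flags `acceptance` set on a parallel group rather than on its task items. The types and `misplacedAcceptance` are hypothetical. Only the field names come from the text, `review` is left out because its shape is not described here, and treating `evidence` as a required field is an assumption.

```typescript
// Hypothetical types for illustration; only the field names come from the text.
interface Acceptance {
  criteria: string[];
  evidence: string[];
  verify?: { id: string; command: string }[];
  stopRules?: string[];
  maxFinalizationTurns?: number;
}
interface TaskStep { agent: string; task: string; acceptance?: Acceptance }
interface ParallelGroup { parallel: TaskStep[]; concurrency?: number; acceptance?: unknown }
type Step = TaskStep | ParallelGroup;

// Return the indexes of parallel groups that carry `acceptance` themselves.
function misplacedAcceptance(chain: Step[]): number[] {
  const bad: number[] = [];
  chain.forEach((step, index) => {
    if ("parallel" in step && step.acceptance !== undefined) bad.push(index);
  });
  return bad;
}

const exampleChain: Step[] = [
  // Fine: acceptance sits on the parallel task item.
  { parallel: [{ agent: "reviewer", task: "Plan fixes", acceptance: { criteria: ["Plan lists each fix"], evidence: ["residual-risks"] } }] },
  // Fine: acceptance sits on a sequential step.
  { agent: "worker", task: "Apply fixes", acceptance: { criteria: ["Fixes applied"], evidence: ["changed-files"], maxFinalizationTurns: 3 } },
  // Misplaced: acceptance sits on the group instead of its items.
  { parallel: [{ agent: "reviewer", task: "Validate" }], acceptance: { criteria: ["All pass"] } },
];
const misplaced = misplacedAcceptance(exampleChain); // [2]
```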
696
+
697
+ Goal-style requests map to `acceptance`. If the user says `/goal`, “goal”, “active goal”, “continue until evidence says done”, or “verify against a goal” for a subagent run, create an explicit run-scoped acceptance contract: `criteria` for the target, `evidence` and `verify` for proof, `stopRules` for constraints, and `maxFinalizationTurns` for the bounded loop budget.
698
+
699
+ When launching a writer/worker from a plan, PRD, spec, issue, or broad fix, set structured `acceptance` proactively. Put implementation instructions, plan paths, and handoff artifacts in `task`; put the definition of done in `acceptance.criteria`, proof requirements in `acceptance.evidence` and `acceptance.verify`, constraints in `acceptance.stopRules`, and usually set `maxFinalizationTurns: 3`. Do not bury all validation requirements only in the task prompt.
700
+
701
+ Example writer handoff:
702
+
703
+ ```typescript
704
+ subagent({
705
+ agent: "worker",
706
+ async: true,
707
+ task: "Implement the plan at /Users/me/docs/mcp-alignment-plan.md. Use scout artifacts in ./handoff/ as context. Do not commit the scout artifacts.",
708
+ acceptance: {
709
+ criteria: [
710
+ "Implementation follows /Users/me/docs/mcp-alignment-plan.md",
711
+ "Plan acceptance checks are addressed",
712
+ "Scout handoff artifacts are not committed",
713
+ "Focused validation for changed behavior passes",
714
+ "Residual risks or skipped checks are reported"
715
+ ],
716
+ evidence: ["changed-files", "commands-run", "validation-output", "residual-risks"],
717
+ verify: [{ id: "focused", command: "npm test -- --runInBand" }],
718
+ stopRules: [
719
+ "Do not edit unrelated files",
720
+ "Stop and report if the plan requires an unapproved product decision"
721
+ ],
722
+ maxFinalizationTurns: 3
723
+ }
724
+ })
725
+ ```
688
726
 
689
727
  The first `worker` implements the approved plan. The parent continues with independent inspection or validation prep while it runs, not parallel edits to the same worktree. When the async worker completes, treat its handoff as the transition into review, not as final completion, unless the user explicitly asked for worker-only work, review-only output, or to stop after implementation. Parallel reviewers inspect the resulting diff from fresh context. Validators check behavior with the best available evidence: commands, tests, browser/CLI interaction, screenshots, logs, or manual reproduction notes. The final `worker` applies synthesized review fixes in forked context, then the parent looks over the final diff before completing. The parent may launch these steps as an initial async chain when the workflow is already clear, or as follow-up subagent runs after each async completion. Initial chains should pass `async: true` so the main chat is unblocked; avoid `clarify: true` unless the user asked for foreground clarification. Do not stop after parallel review unless the user explicitly asked for review-only output or the review surfaced a decision that needs approval first.
690
728
 
@@ -697,7 +735,7 @@ For very large work, split into serial milestones instead of launching a swarm o
697
735
  Keep orchestration authority in the parent session. Child subagents should not launch more subagents, read this skill, or run their own orchestration loops unless the parent intentionally selected a fanout agent whose builtin `tools` includes `subagent`. Spawned subagents do not receive the `pi-subagents` skill, parent-only status/control/slash messages, or prior parent `subagent` tool-call/tool-result artifacts. Ordinary children also do not receive the `subagent` extension tool. Child context filtering strips old hidden orchestration-instruction messages when they appear in inherited history. Every child receives a boundary instruction: ordinary children are told the parent owns orchestration and they must not propose or run subagents; explicit fanout children are told to use `subagent` only for the assigned fanout work, with `maxSubagentDepth` still enforced. Implementation children must call real edit/write tools instead of printing pseudo tool calls. Pass children concrete role-specific work instead.
698
736
 
699
737
  1. Clarify first. This is mandatory. Gather code context with `scout` or `context-builder`, add `researcher` only when external evidence matters, then ask the user clarifying questions with `interview` until scope, acceptance criteria, constraints, and non-goals are clear.
700
- 2. Define the validation contract. State what done means before implementation: expected behavior, checks to run, user flows to exercise, and evidence required in the worker handoff. For UI, CLI, integration, or workflow changes, include at least one validator angle that uses the product the way a user would rather than only reading code.
738
+ 2. Define the validation contract. State acceptance before implementation: expected behavior, checks to run, user flows to exercise, and evidence required in the worker handoff. For UI, CLI, integration, or workflow changes, include at least one validator angle that uses the product the way a user would rather than only reading code.
701
739
  3. Plan when useful. For complex work, call `planner` or write a plan doc yourself and get approval before implementation. For simple work, confirm shared understanding and explicitly note why planning is skipped.
702
740
  4. Implement with one writer. After approval, launch `worker` asynchronously with a proper meta prompt that includes clarified requirements, relevant context, plan path or summary, the validation contract, and output expectations. Packaged `worker` defaults to forked context; pass `context: "fresh"` only when you intentionally want a fresh child. While it runs, prepare validation or inspect adjacent code instead of editing the same worktree.
703
741
  5. Require a useful worker handoff. Ask the worker to report changed files, what was implemented, what was left undone, commands run with exit codes, validation evidence, surprises or new risks, decisions made inside approved scope, and decisions needing parent approval.
@@ -712,6 +750,11 @@ Example implementation handoff after clarification and optional planning:
712
750
  subagent({
713
751
  agent: "worker",
714
752
  task: "Implement the approved feature.\n\nClarified requirements:\n- ...\n\nPlan: see ~/Documents/docs/...-plan.md\n\nValidation contract:\n- ...\n\nReturn a handoff with changed files, what was implemented, what was left undone, commands run with exit codes, validation evidence, surprises/new risks, and decisions needing parent approval.",
753
+ acceptance: {
754
+ criteria: ["Implement the approved feature without widening scope"],
755
+ evidence: ["changed-files", "tests-added", "commands-run", "residual-risks", "no-staged-files"],
756
+ maxFinalizationTurns: 3
757
+ },
715
758
  async: true
716
759
  })
717
760
  ```
@@ -766,7 +809,7 @@ subagent({
766
809
  /run-chain review-chain -- review this branch
767
810
  ```
768
811
 
769
- Use saved `.chain.md` workflows when the user wants a repeatable multi-agent flow without rewriting the chain each time.
812
+ Use saved `.chain.md` or `.chain.json` workflows when the user wants a repeatable multi-agent flow without rewriting the chain each time. Prefer `.chain.json` for dynamic fanout or inline `outputSchema` objects; `.chain.md` remains the simple sequential/static authoring format.
770
813
 
771
814
  ## Error Handling
772
815