npm - pi-subagents - Versions diffs - 0.25.0 → 0.28.0 - Mend

pi-subagents 0.25.0 → 0.28.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (40) hide show

package/CHANGELOG.md +34 -0
package/README.md +175 -19
package/package.json +1 -1
package/prompts/parallel-context-build.md +3 -1
package/prompts/parallel-handoff-plan.md +3 -1
package/skills/pi-subagents/SKILL.md +60 -17
package/src/agents/agent-management.ts +71 -15
package/src/agents/agent-serializer.ts +13 -2
package/src/agents/agents.ts +88 -17
package/src/agents/chain-serializer.ts +120 -0
package/src/extension/fanout-child.ts +2 -0
package/src/extension/index.ts +5 -2
package/src/extension/schemas.ts +132 -6
package/src/intercom/result-intercom.ts +5 -0
package/src/runs/background/async-execution.ts +88 -6
package/src/runs/background/async-status.ts +11 -1
package/src/runs/background/run-status.ts +10 -1
package/src/runs/background/subagent-runner.ts +665 -39
package/src/runs/foreground/chain-execution.ts +369 -118
package/src/runs/foreground/execution.ts +392 -19
package/src/runs/foreground/subagent-executor.ts +126 -3
package/src/runs/shared/acceptance-contract.ts +318 -0
package/src/runs/shared/acceptance-evaluation.ts +221 -0
package/src/runs/shared/acceptance-finalization.ts +173 -0
package/src/runs/shared/acceptance-reports.ts +127 -0
package/src/runs/shared/acceptance.ts +22 -0
package/src/runs/shared/chain-outputs.ts +101 -0
package/src/runs/shared/completion-guard.ts +26 -3
package/src/runs/shared/dynamic-fanout.ts +293 -0
package/src/runs/shared/parallel-utils.ts +33 -1
package/src/runs/shared/pi-args.ts +11 -0
package/src/runs/shared/structured-output.ts +77 -0
package/src/runs/shared/subagent-prompt-runtime.ts +53 -3
package/src/runs/shared/workflow-graph.ts +210 -0
package/src/shared/formatters.ts +2 -2
package/src/shared/settings.ts +53 -4
package/src/shared/types.ts +265 -1
package/src/shared/utils.ts +7 -0
package/src/slash/slash-commands.ts +41 -3
package/src/tui/render.ts +178 -45

package/CHANGELOG.md CHANGED Viewed

@@ -2,6 +2,40 @@
 ## [Unreleased]
+## [0.28.0] - 2026-06-03
+### Added
+- Added foreground-only `timeoutMs`/`maxRuntimeMs` for single, parallel, and chain subagent runs. Timed-out children are soft-interrupted, keep completed sibling/prior results, and return `timedOut: true` with a stable timeout message.
+- Added per-agent `maxExecutionTimeMs` and `maxTokens` resource limits. Foreground and async children stop with a clear `resourceLimitExceeded` result when the configured runtime or observed token budget is reached.
+### Changed
+- Strengthened tool and skill guidance so writer subagents launched from plans, specs, issues, or broad fixes proactively use structured `acceptance` instead of burying validation requirements only in task prose.
+### Fixed
+- Removed a provider-unfriendly required-only subschema from the public `acceptance` tool schema so Kimi models served through OpenCode Go can load the `subagent` tool, while keeping runtime validation for empty acceptance contracts.
+- Clarified acceptance-report prompts so required evidence like `diff-summary` must be copied into structured JSON fields such as `diffSummary`, not only described in visible prose.
+## [0.27.0] - 2026-05-30
+### Changed
+- Reworked public acceptance config to be object-only and evidence-driven, removing public `level`/disable shorthands. Explicit acceptance now triggers a same-session self-review/repair finalization loop, with `maxFinalizationTurns` controlling the cap.
+- Documented goal-style acceptance guidance so `/goal`, “active goal”, and “work until evidence says done” requests map to run-scoped `acceptance` contracts.
+- Refined acceptance finalization prompts and status output to emphasize evidence, blockers, stop rules, and finalization progress such as `completed after 1/3 turns`.
+### Fixed
+- Treat explicit acceptance as the completion contract for acceptance-enabled runs, avoiding implementation completion-guard false positives when the visible output is only an `acceptance-report` or a finalization self-review turn does not need a repair edit.
+## [0.26.0] - 2026-05-29
+### Added
+- Added first-wave acceptance gates with optional public `acceptance` config, inferred effective policies, structured child reports, provenance ledgers, checked evidence gates, explicit runtime verification commands, async/status persistence, and saved `.chain.json` validation.
+- Added chain step metadata (`phase`, `label`), named outputs (`as` with `{outputs.name}`), workflow graph snapshots, and strict `outputSchema` structured-output contracts across foreground and async chain execution.
+- Added dynamic chain fanout with `expand`/single-template `parallel`/`collect`, structured named-output sources, bounded item expansion, collected result outputs, async status graph persistence, and saved `.chain.json` support.
+### Fixed
+- Fixed dynamic fanout acceptance blockers around real `structured_output` tool validation, malformed dynamic-like chain rejection, async dynamic failure status/details, dynamic child intercom target indexing, and saved `.chain.json` management diagnostics.
+- Fixed acceptance-gate semantics so reviewed status requires an independent reviewer result, required criteria must be reported as satisfied, only fenced `acceptance-report` blocks satisfy attestation, malformed reports preserve parse errors, `{ level: "none", reason }` disables inferred gates, and zero-child dynamic aggregates no longer fabricate evidence.
 ## [0.25.0] - 2026-05-21
 ### Added

package/README.md CHANGED Viewed

@@ -145,7 +145,7 @@ Use `~/.pi/agent/settings.json` for a user override or `.pi/settings.json` for a
 ## Where running subagents show up
-Foreground runs stream progress in the conversation while they run.
+Foreground runs stream progress in the conversation while they run. Use `timeoutMs` or its alias `maxRuntimeMs` when a foreground run must return within a wall-clock budget. When the timeout expires, running children are soft-interrupted, completed children stay in the result, and timed-out children return `timedOut: true` with a stable timeout message.
 Background runs keep working after control returns to you. Inspect active runs with `subagent({ action: "status" })`, or a specific run with `subagent({ action: "status", id: "..." })`.
@@ -246,7 +246,7 @@ Skip this section until you want exact syntax.
 | `/run <agent> [task]` | Run one agent; omit the task for self-contained agents |
 | `/chain agent1 "task1" -> agent2 "task2"` | Run agents in sequence |
 | `/parallel agent1 "task1" -> agent2 "task2"` | Run agents in parallel |
-| `/run-chain <chainName> -- <task>` | Launch a saved `.chain.md` workflow |
+| `/run-chain <chainName> -- <task>` | Launch a saved `.chain.md` or `.chain.json` workflow |
 | `/subagents-doctor` | Show read-only setup diagnostics |
 Commands validate agent names locally, support tab completion, and send results back into the conversation.
@@ -436,6 +436,8 @@ defaultProgress: true
 completionGuard: false
 interactive: true
 maxSubagentDepth: 1
+maxExecutionTimeMs: 600000
+maxTokens: 50000
 ---
 Your system prompt goes here.
@@ -462,6 +464,8 @@ Important fields:
 | `completionGuard` | Set `false` only for non-implementation agents that may mention implementation words while using mutation-capable tools such as `bash`. |
 | `interactive` | Parsed for compatibility but not enforced in v1. |
 | `maxSubagentDepth` | Tightens nested delegation for this agent’s children. |
+| `maxExecutionTimeMs` | Stops each foreground or async child run for this agent after the given number of milliseconds. |
+| `maxTokens` | Stops each foreground or async child run for this agent when observed input plus output tokens reach the limit. Token enforcement is best-effort because usage is reported after model events arrive. |
 ### Tool and extension selection
@@ -492,14 +496,14 @@ When `extensions` is present, it takes precedence over extension paths implied b
 ## Chain files
-Chains are reusable `.chain.md` workflows stored separately from agent files.
+Chains are reusable workflows stored separately from agent files. Use `.chain.md` for simple sequential saved chains. Use `.chain.json` when a chain needs dynamic fanout.
 | Scope | Path |
 |-------|------|
-| User | `~/.pi/agent/chains/**/*.chain.md` |
-| Project | `.pi/chains/**/*.chain.md` |
+| User | `~/.pi/agent/chains/**/*.chain.md`, `~/.pi/agent/chains/**/*.chain.json` |
+| Project | `.pi/chains/**/*.chain.md`, `.pi/chains/**/*.chain.json` |
-Nested subdirectories are discovered recursively. If user and project scopes define the same parsed runtime chain name, the project chain wins. Chains support the same optional `package` frontmatter as agents; `name: review-flow` plus `package: code-analysis` runs as `code-analysis.review-flow`.
+Nested subdirectories are discovered recursively. If both `.chain.md` and `.chain.json` define the same parsed runtime chain name in the same scope, `.chain.json` wins. If user and project scopes define the same parsed runtime chain name, the project chain wins. Chains support the same optional `package` frontmatter as agents; `name: review-flow` plus `package: code-analysis` runs as `code-analysis.review-flow`.
 Example:
@@ -510,23 +514,67 @@ description: Gather context then plan implementation
 ---
 ## scout
+phase: Context
+label: Map auth flow
+as: context
 output: context.md
 Analyze the codebase for {task}
 ## planner
+phase: Planning
+label: Implementation plan
 reads: context.md
 model: anthropic/claude-sonnet-4-5:high
 progress: true
-Create an implementation plan based on {previous}
+Create an implementation plan based on {outputs.context}
 ```
-Each `## agent-name` section is a step. Config lines such as `output`, `outputMode`, `reads`, `model`, `skills`, and `progress` go immediately after the header. A blank line separates config from task text.
+Each `.chain.md` `## agent-name` section is a step. Config lines such as `phase`, `label`, `as`, `outputSchema`, `output`, `outputMode`, `reads`, `model`, `skills`, and `progress` go immediately after the header. A blank line separates config from task text. In saved `.chain.md` files, `outputSchema` is a path to a JSON Schema file; direct tool calls and `.chain.json` files can pass the schema object inline.
 For `output`, `reads`, `skills`, and `progress`, chain behavior is three-state: omitted inherits from the agent, a value overrides, and `false` disables.
-Create chains by writing `.chain.md` files directly or with the `subagent({ action: "create", config: ... })` management action. Run them with natural language or:
+Use `phase` to group related work in status output, `label` for a readable step name, and `as` to store a successful step or parallel task result for later `{outputs.name}` references. Duplicate `as` names, invalid identifiers, and unknown output references fail before child execution.
+Dynamic fanout is available only through direct `subagent({ chain: [...] })` JSON or saved `.chain.json` files. It expands an array from a prior structured named output, runs one child template per item, and stores the ordered collection under `collect.as`. The source must be structured output; prose is never parsed. `expand.maxItems` is required, over-limit arrays fail, nested fanout and arbitrary expressions are not supported, and `.chain.md` has no dynamic syntax in this release.
+```json
+{
+  "name": "dynamic-review",
+  "description": "Find review targets, fan out reviewers, then synthesize.",
+  "chain": [
+    {
+      "agent": "scout",
+      "task": "Return {\"items\":[{\"path\":\"...\",\"reason\":\"...\"}]} via structured_output.",
+      "as": "targets",
+      "outputSchema": { "type": "object" }
+    },
+    {
+      "expand": {
+        "from": { "output": "targets", "path": "/items" },
+        "item": "target",
+        "key": "/path",
+        "maxItems": 12
+      },
+      "parallel": {
+        "agent": "reviewer",
+        "label": "Review {target.path}",
+        "task": "Review {target.path}. Reason: {target.reason}",
+        "outputSchema": { "type": "object" }
+      },
+      "collect": { "as": "reviews" },
+      "concurrency": 4
+    },
+    {
+      "agent": "worker",
+      "task": "Synthesize fixes from {outputs.reviews}"
+    }
+  ]
+}
+```
+Create simple `.chain.md` chains by writing files directly or with the `subagent({ action: "create", config: ... })` management action. Create dynamic `.chain.json` chains by writing the JSON file directly. Run saved chains with natural language or:
 ```text
 /run-chain scout-planner -- refactor authentication
@@ -541,6 +589,7 @@ Task templates support:
 | `{task}` | Original task from the first step. |
 | `{previous}` | Output from the prior step, or aggregated output from a parallel step. |
 | `{chain_dir}` | Path to the chain artifact directory. |
+| `{outputs.name}` | Text value from a prior step or completed parallel task with `as: "name"`. |
 Parallel outputs are aggregated with clear separators before being passed to the next step:
@@ -634,14 +683,49 @@ These are the parameters the LLM passes when it calls the `subagent` tool. Most
 // Chain with fan-out/fan-in
 { chain: [
-  { agent: "scout", task: "Gather context" },
+  { agent: "scout", task: "Gather context", phase: "Context", label: "Map code", as: "context" },
   { parallel: [
-    { agent: "worker", task: "Implement feature A from {previous}" },
-    { agent: "worker", task: "Implement feature B from {previous}" }
+    { agent: "worker", task: "Implement feature A from {outputs.context}", label: "Feature A", as: "featureA" },
+    { agent: "worker", task: "Implement feature B from {outputs.context}", label: "Feature B", as: "featureB" }
   ], concurrency: 2, failFast: true },
-  { agent: "reviewer", task: "Review all changes from {previous}" }
+  { agent: "reviewer", task: "Review {outputs.featureA} and {outputs.featureB}" }
 ]}
+// Dynamic fanout from structured output
+{ chain: [
+  {
+    agent: "scout",
+    task: "Return review targets as structured_output: { items: [{ path, reason }] }",
+    as: "targets",
+    outputSchema: { type: "object" }
+  },
+  {
+    expand: { from: { output: "targets", path: "/items" }, item: "target", key: "/path", maxItems: 12 },
+    parallel: { agent: "reviewer", task: "Review {target.path}. Reason: {target.reason}", outputSchema: { type: "object" } },
+    collect: { as: "reviews" },
+    concurrency: 4
+  },
+  { agent: "worker", task: "Synthesize fixes from {outputs.reviews}" }
+] }
+// Strict structured output for reliable handoff data
+{ chain: [
+  {
+    agent: "scout",
+    task: "Return the key files and risks for {task}",
+    as: "scan",
+    outputSchema: {
+      type: "object",
+      required: ["files", "risks"],
+      properties: {
+        files: { type: "array", items: { type: "string" } },
+        risks: { type: "array", items: { type: "string" } }
+      }
+    }
+  },
+  { agent: "planner", task: "Plan from this scan: {outputs.scan}" }
+] }
 // Worktree isolation
 { tasks: [
   { agent: "worker", task: "Implement auth" },
@@ -711,10 +795,11 @@ Agent definitions are not loaded into context by default. Management actions let
 | `outputMode` | `"inline" \| "file-only"` | `inline` | Return saved output inline or as a concise saved-file reference. `file-only` requires an `output` path. |
 | `skill` | `string \| string[] \| false` | agent default | Override skills or disable all. |
 | `model` | string | agent default | Override model. |
-| `tasks` | array | - | Top-level parallel tasks. Supports `agent`, `task`, `cwd`, `count`, `output`, `outputMode`, `reads`, `progress`, `skill`, and `model`. |
+| `tasks` | array | - | Top-level parallel tasks. Supports `agent`, `task`, `cwd`, `count`, `output`, `outputMode`, `reads`, `progress`, `skill`, `model`, and `acceptance`. |
 | `concurrency` | number | config or `4` | Top-level parallel concurrency. |
+| `timeoutMs` / `maxRuntimeMs` | number | - | Foreground wall-clock timeout for single, parallel, and chain runs. Timed-out children return `timedOut: true`; async/background runs reject it. |
 | `worktree` | boolean | false | Create isolated git worktrees for parallel tasks. |
-| `chain` | array | - | Sequential and parallel chain steps. |
+| `chain` | array | - | Sequential, static parallel, and dynamic fanout chain steps. Sequential steps and parallel child tasks support `phase`, `label`, `as`, `outputSchema`, and `acceptance` in addition to the usual execution fields. Dynamic fanout uses `expand`, one child `parallel` template, and `collect`; group-level acceptance is not supported because there is no child session to finalize. |
 | `context` | `fresh \| fork` | agent default or `fresh` | `fork` creates real branched sessions from the parent leaf. Packaged `planner`, `worker`, and `oracle` default to `fork`. |
 | `chainDir` | string | temp chain dir | Persistent directory for chain artifacts. |
 | `clarify` | boolean | true for chains | Show TUI preview/edit flow. |
@@ -726,12 +811,13 @@ Agent definitions are not loaded into context by default. Management actions let
 | `includeProgress` | boolean | false | Include full progress in result. |
 | `share` | boolean | false | Upload session export to GitHub Gist. |
 | `sessionDir` | string | derived | Override session log directory. |
+| `acceptance` | object | omitted | Explicit acceptance contract. When present, the child gets a structured contract, then the runtime continues the same session for a bounded self-review/repair loop before evaluating acceptance. |
 `context: "fork"` fails fast when the parent session is not persisted, the current leaf is missing, or the branched child session cannot be created. It never silently downgrades to `fresh`. In multi-agent runs, if any requested agent has `defaultContext: fork` and the launch omits `context`, the whole invocation uses forked context; pass `context: "fresh"` when you intentionally want a fresh run.
 Use `outputMode: "file-only"` when a saved output may be large and the parent only needs a pointer. The returned text is a compact reference like `Output saved to: /abs/report.md (48.2 KB, 2847 lines). Read this file if needed.` Failed runs and save errors still return normal inline output for debugging. In chains, later `{previous}` steps receive the same compact reference when the prior step used file-only mode.
-Sequential and parallel chain tasks accept `agent`, `task`, `cwd`, `output`, `outputMode`, `reads`, `progress`, `skill`, and `model`. Parallel tasks also accept `count`. Parallel step groups accept `parallel`, `concurrency`, `failFast`, and `worktree`.
+Sequential and parallel chain tasks accept `agent`, `task`, `phase`, `label`, `as`, `outputSchema`, `cwd`, `output`, `outputMode`, `reads`, `progress`, `skill`, and `model`. Parallel tasks also accept `count`. Parallel step groups accept `parallel`, `concurrency`, `failFast`, and `worktree`. If `outputSchema` is present, the child must call `structured_output` with schema-valid JSON; prose-only completion or invalid JSON fails the step. Validated structured values are preserved on the step result, and `as` also exposes a compact text representation through `{outputs.name}`.
 Status and control actions:
@@ -830,6 +916,19 @@ Session directory precedence is: `params.sessionDir`, then `config.defaultSessio
 Controls nested delegation when no inherited `PI_SUBAGENT_MAX_DEPTH` is already in effect. Per-agent `maxSubagentDepth` can tighten the limit for that agent’s child runs, but cannot relax an inherited stricter limit. This applies even to children that explicitly declare `tools: subagent`; at the cap, execution fanout is blocked instead of silently hiding nested work.
+### Agent resource limits
+Set `maxExecutionTimeMs` and `maxTokens` in agent frontmatter or through `subagent({ action: "create" | "update", config })` to bound a specific agent across foreground and async runs.
+```yaml
+maxExecutionTimeMs: 600000
+maxTokens: 50000
+```
+When a limit is reached, the child receives a soft interrupt, the run fails with a clear `Resource limit exceeded...` error, and the result includes `resourceLimitExceeded` with the limit kind, configured limit, and observed token count when available. Resource-limit failures do not trigger fallback model retries. `maxTokens` is best-effort because providers report usage after message events; a child may exceed the exact limit before the runtime can stop it.
+Spawn-count and per-agent child-concurrency quotas are not part of this release; use `maxSubagentDepth` and parallel `concurrency` for those boundaries today.
 ### `intercomBridge`
 ```json
@@ -888,7 +987,7 @@ Debug artifacts live under `{sessionDir}/subagent-artifacts/` or a user-scoped t
 - `{runId}_{agent}.jsonl`
 - `{runId}_{agent}_meta.json`
-Metadata records timing, usage, exit code, final model, attempted models, and fallback attempt outcomes.
+Metadata records timing, usage, exit code, final model, attempted models, fallback attempt outcomes, and any resource-limit termination reason.
 Session files are stored under a per-run session directory. With `context: "fork"`, each child starts with `--session <branched-session-file>` produced from the parent’s current leaf. That is a real session fork, not an injected summary.
@@ -906,13 +1005,70 @@ Async runs write:
 `status.json` powers the widget and `subagent({ action: "status" })` output. `events.jsonl` contains wrapper events plus child Pi JSON events annotated with run and step metadata. Nested fanout status is stored as compact sidecar event/registry metadata and merged into parent status views and result/intercom payloads; full recursive status snapshots are not embedded in parent result files. `output-<n>.log` is a live human-readable tail. Fallback information is persisted so background runs are debuggable after completion.
+## Acceptance Gates
+`acceptance` is an explicit contract. Omit it for lightweight runs. Set it on single runs, top-level parallel task items, sequential chain steps, static parallel task items, and dynamic fanout child templates when the child must prove the work meets concrete criteria. Do not set it on static parallel groups or dynamic fanout aggregate groups; those groups do not own a same-session child turn.
+If you are coming from Codex Goals, `acceptance` is the subagent equivalent for one delegated run. When a user says `/goal`, “goal”, “active goal”, “continue until evidence says done”, or “verify against a goal”, translate that into an acceptance contract: `criteria` are the target, `evidence` and `verify` are proof, `stopRules` are constraints, and `maxFinalizationTurns` is the bounded loop budget.
+```ts
+{
+  agent: "worker",
+  task: "Implement the fix",
+  acceptance: {
+    criteria: ["Patch the bug without widening scope"],
+    evidence: ["changed-files", "tests-added", "commands-run", "residual-risks", "no-staged-files"],
+    verify: [{ id: "focused", command: "npm test", timeoutMs: 120000 }],
+    maxFinalizationTurns: 3
+  }
+}
+```
+When `acceptance` is present, the initial child prompt includes a standardized acceptance section and asks for a fenced `acceptance-report` JSON block. After the child’s initial completion, the runtime continues the same persisted child session with an acceptance finalization prompt. The child can repair omissions in that same session, then must return the final `acceptance-report`. Missing or malformed finalization reports reject the run when the loop limit is reached.
+Public acceptance config is evidence-driven. There is no public `level` field and no `acceptance: "checked"` shorthand. Runtime provenance is derived from what actually happened:
+- `attested`: the child returned a structured acceptance report.
+- `checked`: runtime structural checks passed, such as required criteria, required evidence, and no staged files.
+- `verified`: configured runtime verification commands passed. Child-reported command success does not count.
+- `reviewed`: an independent reviewer result is present.
+- `rejected`: attestation, structural checks, verification, review, or finalization failed.
+Self-review finalization never counts as `reviewed`, and it never counts as `verified` unless configured runtime verification commands actually pass. The visible child output remains the initial answer; finalization reports and residual risks are stored in the acceptance ledger and async/status details.
+When delegating implementation from a plan or spec, keep the task focused on what to implement and put the definition of done in `acceptance` so the runtime can finalize and evaluate it:
+```ts
+subagent({
+  agent: "worker",
+  async: true,
+  task: "Implement the plan at /Users/me/docs/mcp-alignment-plan.md. Use scout artifacts in ./handoff/ as context. Do not commit the scout artifacts.",
+  acceptance: {
+    criteria: [
+      "Implementation follows /Users/me/docs/mcp-alignment-plan.md",
+      "Plan acceptance checks are addressed",
+      "Scout handoff artifacts are not committed",
+      "Focused validation for changed behavior passes",
+      "Residual risks or skipped checks are reported"
+    ],
+    evidence: ["changed-files", "commands-run", "validation-output", "residual-risks"],
+    verify: [{ id: "focused", command: "npm test -- --runInBand" }],
+    stopRules: [
+      "Do not edit unrelated files",
+      "Stop and report if the plan requires an unapproved product decision"
+    ],
+    maxFinalizationTurns: 3
+  }
+})
+```
 ## Live progress
-Foreground runs show compact live progress for single, chain, and parallel modes: current tool, recent output, token counts, duration, activity freshness, and current-tool duration.
+Foreground runs show compact live progress for single, chain, and parallel modes: current tool, recent output, token counts, duration, activity freshness, current-tool duration, and chain graph metadata when available.
 Press `Ctrl+O` to expand the full streaming view with complete output per step.
-Sequential chains show a flow line like `done scout → running planner`. Chains with parallel steps show per-step cards instead.
+Sequential chains show a flow line like `done scout → running planner`. Chains with parallel steps show per-step cards instead. Chain status uses `label` and `phase` metadata when present, while falling back to agent names for older chains.
 ## Session sharing

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "pi-subagents",
-  "version": "0.25.0",
+  "version": "0.28.0",
   "description": "Pi extension for delegating tasks to subagents with chains, parallel execution, and TUI clarification",
   "author": "Nico Bailon",
   "license": "MIT",

package/prompts/parallel-context-build.md CHANGED Viewed

@@ -4,12 +4,14 @@ description: Parallel context builders for planning handoff
 Launch fresh-context `context-builder` subagents in parallel to build grounded handoff context for planning or implementation.
-Use the `subagent` tool in chain mode with a single parallel step, not top-level parallel tasks, so relative output files live under the temporary chain directory. Use `context: "fresh"` unless I explicitly ask for forked context. Give every parallel task a distinct `output` path, for example:
+Use the `subagent` tool in chain mode with a single parallel step, not top-level parallel tasks, so relative output files live under the temporary chain directory. Use `context: "fresh"` unless I explicitly ask for forked context. Give every parallel task a distinct `output` path, `label`, and `as` name, for example:
 - `context-build/request-and-scope.md`
 - `context-build/codebase-and-patterns.md`
 - `context-build/validation-and-risks.md`
+Use one phase such as `phase: "Context build"` for the parallel tasks so async status is readable. A later synthesis step can reference specific outputs with `{outputs.requestScope}`, `{outputs.codebasePatterns}`, and `{outputs.validationRisks}` instead of relying only on `{previous}`.
 Do not write these context artifacts into the repository unless I explicitly ask for persistent files.
 Treat the slash command arguments as the primary request, target, or focus:

package/prompts/parallel-handoff-plan.md CHANGED Viewed

@@ -19,12 +19,14 @@ Use the `subagent` tool in chain mode:
 2. Second step: a synthesis `context-builder` that reads the parallel findings and writes the final handoff plan and meta-prompt.
-Use distinct output paths under the chain directory. Example outputs:
+Use distinct output paths, `label` values, and `as` names under the chain directory. Example outputs:
 - `handoff/external-reference.md`
 - `handoff/local-context.md`
 - `handoff/implementation-strategy.md`
 - `handoff/final-handoff-plan.md`
+Use phases such as `Research`, `Local context`, and `Synthesis` so async status is readable. Prefer `{outputs.externalReference}`, `{outputs.localContext}`, and `{outputs.implementationStrategy}` in the synthesis task when those specific inputs are available; keep `{previous}` only when the whole parallel fan-in summary is the desired input.
 Do not write these artifacts into the repository unless I explicitly ask for persistent files.
 Role guidance:

package/skills/pi-subagents/SKILL.md CHANGED Viewed

@@ -32,7 +32,7 @@ Humans often use the slash-command layer instead:
 - `/run` — launch a single agent
 - `/chain` — launch a chain of steps
 - `/parallel` — launch top-level parallel tasks
-- `/run-chain` — launch a saved `.chain.md` workflow
+- `/run-chain` — launch a saved `.chain.md` or `.chain.json` workflow
 - `/subagents-doctor` — diagnose setup, discovery, async paths, and intercom bridge state
 Prefer the tool when you are writing agent logic. Prefer the slash commands when
@@ -118,7 +118,9 @@ Use this when a broad diff has known reviewer findings across several items and
 2. One writer worker. It receives the planner summaries through `{previous}`, the parent’s accepted scope, stop rules, and verification contract. It is the only child allowed to edit the active worktree.
 3. A parallel read-only validation fanout. Validators inspect the worker diff from fresh context with distinct angles, report pass/fail, remaining blockers, and missing verification.
-Prefer `async: true`, `context: "fresh"` for planners/validators, `outputMode: "file-only"` for large summaries, and per-stage output names that will not collide. Use this pattern instead of launching several writer workers into a dirty worktree. Include non-blocking suggestions in the writer prompt only when they are small, safe, and do not expand product scope; otherwise record them as deferred.
+Prefer `async: true`, `context: "fresh"` for planners/validators, `outputMode: "file-only"` for large summaries, and per-stage output names that will not collide. Add `phase` and `label` to make async status readable, and use `as` plus `{outputs.name}` when a later step needs a specific earlier result instead of the whole `{previous}` blob. Use this pattern instead of launching several writer workers into a dirty worktree. Include non-blocking suggestions in the writer prompt only when they are small, safe, and do not expand product scope; otherwise record them as deferred.
+When the first step can return a structured target list, prefer dynamic fanout instead of hand-authoring a static parallel group. Use `outputSchema` and `as` on the producer, then an `expand` step with `from: { output, path }`, an explicit `maxItems`, one `parallel` child template, and `collect.as`. Item templates may use `{item}` or a named item such as `{target.path}`. Do not use dynamic fanout for prose outputs, nested fanout, dynamic agent selection, reducers, `when` conditions, or arbitrary expressions; `.chain.md` does not support this syntax, so use direct JSON or a saved `.chain.json`.
 Example shape:
@@ -128,14 +130,14 @@ subagent({
   context: "fresh",
   chain: [
     { parallel: [
-      { agent: "reviewer", task: "Plan fixes for deploy docs/workflow. Inspect the current diff. Do not modify project/source files; returning findings via the configured output artifact is allowed.", output: "plans/deploy.md", outputMode: "file-only" },
-      { agent: "reviewer", task: "Plan fixes for scheduler contract. Inspect the current diff. Do not modify project/source files; returning findings via the configured output artifact is allowed.", output: "plans/scheduler.md", outputMode: "file-only" },
-      { agent: "reviewer", task: "Plan fixes for sandbox/security. Inspect the current diff. Do not modify project/source files; returning findings via the configured output artifact is allowed.", output: "plans/sandbox.md", outputMode: "file-only" }
+      { agent: "reviewer", phase: "Planning", label: "Deploy docs", as: "deployPlan", task: "Plan fixes for deploy docs/workflow. Inspect the current diff. Do not modify project/source files; returning findings via the configured output artifact is allowed.", output: "plans/deploy.md", outputMode: "file-only" },
+      { agent: "reviewer", phase: "Planning", label: "Scheduler contract", as: "schedulerPlan", task: "Plan fixes for scheduler contract. Inspect the current diff. Do not modify project/source files; returning findings via the configured output artifact is allowed.", output: "plans/scheduler.md", outputMode: "file-only" },
+      { agent: "reviewer", phase: "Planning", label: "Sandbox/security", as: "sandboxPlan", task: "Plan fixes for sandbox/security. Inspect the current diff. Do not modify project/source files; returning findings via the configured output artifact is allowed.", output: "plans/sandbox.md", outputMode: "file-only" }
     ], concurrency: 3 },
-    { agent: "worker", task: "Apply only the accepted fixes from these planning summaries. You are the sole writer for the active worktree. Run focused validation and report changed files, commands, failures, and remaining issues.\n\nPlanning summaries:\n{previous}", output: "worker/fixes.md", outputMode: "file-only", progress: true },
+    { agent: "worker", phase: "Implementation", label: "Apply accepted fixes", as: "workerResult", task: "Apply only the accepted fixes from these planning summaries. You are the sole writer for the active worktree.\n\nDeploy plan:\n{outputs.deployPlan}\n\nScheduler plan:\n{outputs.schedulerPlan}\n\nSandbox plan:\n{outputs.sandboxPlan}", acceptance: { criteria: ["Accepted fixes from each planning summary are applied", "Focused validation for changed behavior passes", "Changed files, validation commands, failures, and residual risks are reported"], evidence: ["changed-files", "commands-run", "validation-output", "residual-risks"], stopRules: ["Do not expand product scope beyond accepted fixes", "Stop and report if a fix requires an unapproved decision"], maxFinalizationTurns: 3 }, output: "worker/fixes.md", outputMode: "file-only", progress: true },
     { parallel: [
-      { agent: "reviewer", task: "Validate the post-worker diff for deploy and scheduler fixes. Do not modify project/source files; returning findings via the configured output artifact is allowed.", output: "validation/deploy-scheduler.md", outputMode: "file-only" },
-      { agent: "reviewer", task: "Validate the post-worker diff for sandbox/security fixes. Do not modify project/source files; returning findings via the configured output artifact is allowed.", output: "validation/sandbox.md", outputMode: "file-only" }
+      { agent: "reviewer", phase: "Validation", label: "Deploy/scheduler validation", task: "Validate the post-worker diff for deploy and scheduler fixes. Start from the worker result: {outputs.workerResult}. Do not modify project/source files; returning findings via the configured output artifact is allowed.", output: "validation/deploy-scheduler.md", outputMode: "file-only" },
+      { agent: "reviewer", phase: "Validation", label: "Sandbox validation", task: "Validate the post-worker diff for sandbox/security fixes. Start from the worker result: {outputs.workerResult}. Do not modify project/source files; returning findings via the configured output artifact is allowed.", output: "validation/sandbox.md", outputMode: "file-only" }
     ], concurrency: 2 }
   ]
 })
@@ -217,10 +219,10 @@ Agent files can live in:
 - legacy `.agents/**/*.md` — still read for compatibility, but `.pi/agents/` wins on conflicts
 Chains live in:
-- `~/.pi/agent/chains/**/*.chain.md` — user scope
-- `.pi/chains/**/*.chain.md` — project scope
+- `~/.pi/agent/chains/**/*.chain.md` and `~/.pi/agent/chains/**/*.chain.json` — user scope
+- `.pi/chains/**/*.chain.md` and `.pi/chains/**/*.chain.json` — project scope
-Discovery is recursive. `.chain.md` files do not define agents. Agents and chains can set optional frontmatter `package: code-analysis`; `name: scout` plus `package: code-analysis` registers as runtime name `code-analysis.scout` while serialization keeps `name` and `package` separate.
+Discovery is recursive. `.chain.md` files do not define agents. Use `.chain.md` for simple saved chains and `.chain.json` for dynamic fanout or inline schema objects. Agents and chains can set optional frontmatter/package metadata; `name: scout` plus `package: code-analysis` registers as runtime name `code-analysis.scout` while serialization keeps `name` and `package` separate.
 Precedence is by parsed runtime name:
 1. project scope
@@ -290,9 +292,13 @@ subagent({
 })
 ```
-Chain steps can use templated variables such as `{task}`, `{previous}`, and
-`{chain_dir}`. This is the main way to pass structured summaries between steps
-without forcing each step to rediscover everything.
+Chain steps can use templated variables such as `{task}`, `{previous}`,
+`{chain_dir}`, and `{outputs.name}`. Use `as: "name"` on a successful step or
+parallel task to make that output available to later steps. Prefer named outputs
+when a later step needs one specific result; keep `{previous}` for simple linear
+handoffs or full fan-in summaries. Use `phase` and `label` for status readability.
+Use `outputSchema` when later steps need reliable structured data; the child must
+call `structured_output` with schema-valid JSON, or the step fails.
 ### Async/background
@@ -684,7 +690,39 @@ For feature work, use this sequence as scaffolding for parent-agent behavior:
 clarify → validation contract → planner → async worker → parallel async fresh-context reviewers/validators → async fix worker → follow-up review when warranted → parent review
 ```
-The validation contract defines what done means before code is written: expected behavior, acceptance checks, commands or user flows to exercise, and evidence the worker should return. Keep it lightweight for small tasks, but make it explicit enough that reviewers and validators are checking the intended outcome rather than the worker’s own assumptions.
+The validation contract defines acceptance before code is written: expected behavior, acceptance checks, commands or user flows to exercise, and evidence the worker should return. Keep it lightweight for small tasks, but make it explicit enough that reviewers and validators are checking the intended outcome rather than the worker’s own assumptions.
+Use the structured `acceptance` field when the run should carry an explicit acceptance contract. If omitted, the run stays lightweight. When present, acceptance is object-only: define concrete `criteria`, required `evidence`, optional runtime `verify` commands, optional independent `review`, and optionally `maxFinalizationTurns`. The runtime continues the same child session for a bounded self-review/repair loop before evaluating the final report, so set `acceptance` on single runs, sequential chain steps, parallel task items, and dynamic fanout child templates, not on static parallel or dynamic fanout groups. Do not call a run reviewed just because the worker says it is done; reviewed means a reviewer gate returned a result. Child-reported command success is evidence, not runtime verification.
+Goal-style requests map to `acceptance`. If the user says `/goal`, “goal”, “active goal”, “continue until evidence says done”, or “verify against a goal” for a subagent run, create an explicit run-scoped acceptance contract: `criteria` for the target, `evidence` and `verify` for proof, `stopRules` for constraints, and `maxFinalizationTurns` for the bounded loop budget.
+When launching a writer/worker from a plan, PRD, spec, issue, or broad fix, set structured `acceptance` proactively. Put implementation instructions, plan paths, and handoff artifacts in `task`; put the definition of done in `acceptance.criteria`, proof requirements in `acceptance.evidence` and `acceptance.verify`, constraints in `acceptance.stopRules`, and usually set `maxFinalizationTurns: 3`. Do not bury all validation requirements only in the task prompt.
+Example writer handoff:
+```typescript
+subagent({
+  agent: "worker",
+  async: true,
+  task: "Implement the plan at /Users/me/docs/mcp-alignment-plan.md. Use scout artifacts in ./handoff/ as context. Do not commit the scout artifacts.",
+  acceptance: {
+    criteria: [
+      "Implementation follows /Users/me/docs/mcp-alignment-plan.md",
+      "Plan acceptance checks are addressed",
+      "Scout handoff artifacts are not committed",
+      "Focused validation for changed behavior passes",
+      "Residual risks or skipped checks are reported"
+    ],
+    evidence: ["changed-files", "commands-run", "validation-output", "residual-risks"],
+    verify: [{ id: "focused", command: "npm test -- --runInBand" }],
+    stopRules: [
+      "Do not edit unrelated files",
+      "Stop and report if the plan requires an unapproved product decision"
+    ],
+    maxFinalizationTurns: 3
+  }
+})
+```
 The first `worker` implements the approved plan. The parent continues with independent inspection or validation prep while it runs, not parallel edits to the same worktree. When the async worker completes, treat its handoff as the transition into review, not as final completion, unless the user explicitly asked for worker-only work, review-only output, or to stop after implementation. Parallel reviewers inspect the resulting diff from fresh context. Validators check behavior with the best available evidence: commands, tests, browser/CLI interaction, screenshots, logs, or manual reproduction notes. The final `worker` applies synthesized review fixes in forked context, then the parent looks over the final diff before completing. The parent may launch these steps as an initial async chain when the workflow is already clear, or as follow-up subagent runs after each async completion. Initial chains should pass `async: true` so the main chat is unblocked; avoid `clarify: true` unless the user asked for foreground clarification. Do not stop after parallel review unless the user explicitly asked for review-only output or the review surfaced a decision that needs approval first.
@@ -697,7 +735,7 @@ For very large work, split into serial milestones instead of launching a swarm o
 Keep orchestration authority in the parent session. Child subagents should not launch more subagents, read this skill, or run their own orchestration loops unless the parent intentionally selected a fanout agent whose builtin `tools` includes `subagent`. Spawned subagents do not receive the `pi-subagents` skill, parent-only status/control/slash messages, or prior parent `subagent` tool-call/tool-result artifacts. Ordinary children also do not receive the `subagent` extension tool. Child context filtering strips old hidden orchestration-instruction messages when they appear in inherited history. Every child receives a boundary instruction: ordinary children are told the parent owns orchestration and they must not propose or run subagents; explicit fanout children are told to use `subagent` only for the assigned fanout work, with `maxSubagentDepth` still enforced. Implementation children must call real edit/write tools instead of printing pseudo tool calls. Pass children concrete role-specific work instead.
 1. Clarify first. This is mandatory. Gather code context with `scout` or `context-builder`, add `researcher` only when external evidence matters, then ask the user clarifying questions with `interview` until scope, acceptance criteria, constraints, and non-goals are clear.
-2. Define the validation contract. State what done means before implementation: expected behavior, checks to run, user flows to exercise, and evidence required in the worker handoff. For UI, CLI, integration, or workflow changes, include at least one validator angle that uses the product the way a user would rather than only reading code.
+2. Define the validation contract. State acceptance before implementation: expected behavior, checks to run, user flows to exercise, and evidence required in the worker handoff. For UI, CLI, integration, or workflow changes, include at least one validator angle that uses the product the way a user would rather than only reading code.
 3. Plan when useful. For complex work, call `planner` or write a plan doc yourself and get approval before implementation. For simple work, confirm shared understanding and explicitly note why planning is skipped.
 4. Implement with one writer. After approval, launch `worker` asynchronously with a proper meta prompt that includes clarified requirements, relevant context, plan path or summary, the validation contract, and output expectations. Packaged `worker` defaults to forked context; pass `context: "fresh"` only when you intentionally want a fresh child. While it runs, prepare validation or inspect adjacent code instead of editing the same worktree.
 5. Require a useful worker handoff. Ask the worker to report changed files, what was implemented, what was left undone, commands run with exit codes, validation evidence, surprises or new risks, decisions made inside approved scope, and decisions needing parent approval.
@@ -712,6 +750,11 @@ Example implementation handoff after clarification and optional planning:
 subagent({
   agent: "worker",
   task: "Implement the approved feature.\n\nClarified requirements:\n- ...\n\nPlan: see ~/Documents/docs/...-plan.md\n\nValidation contract:\n- ...\n\nReturn a handoff with changed files, what was implemented, what was left undone, commands run with exit codes, validation evidence, surprises/new risks, and decisions needing parent approval.",
+  acceptance: {
+    criteria: ["Implement the approved feature without widening scope"],
+    evidence: ["changed-files", "tests-added", "commands-run", "residual-risks", "no-staged-files"],
+    maxFinalizationTurns: 3
+  },
   async: true
 })
 ```
@@ -766,7 +809,7 @@ subagent({
 /run-chain review-chain -- review this branch
 ```
-Use saved `.chain.md` workflows when the user wants a repeatable multi-agent flow without rewriting the chain each time.
+Use saved `.chain.md` or `.chain.json` workflows when the user wants a repeatable multi-agent flow without rewriting the chain each time. Prefer `.chain.json` for dynamic fanout or inline `outputSchema` objects; `.chain.md` remains the simple sequential/static authoring format.
 ## Error Handling