npm - @chllming/wave-orchestration - Versions diffs - 0.8.3 → 0.8.4 - Mend

@chllming/wave-orchestration 0.8.3 → 0.8.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (39)

package/CHANGELOG.md +19 -0
package/README.md +47 -11
package/docs/README.md +6 -2
package/docs/concepts/what-is-a-wave.md +1 -1
package/docs/plans/architecture-hardening-migration.md +8 -1
package/docs/plans/current-state.md +15 -7
package/docs/plans/end-state-architecture.md +82 -69
package/docs/plans/examples/wave-example-live-proof.md +1 -1
package/docs/plans/migration.md +235 -62
package/docs/plans/wave-orchestrator.md +37 -11
package/docs/reference/cli-reference.md +34 -14
package/docs/reference/coordination-and-closure.md +19 -6
package/docs/reference/npmjs-trusted-publishing.md +5 -4
package/docs/reference/sample-waves.md +4 -4
package/package.json +1 -1
package/releases/manifest.json +20 -0
package/scripts/wave-orchestrator/agent-state.mjs +0 -491
package/scripts/wave-orchestrator/autonomous.mjs +10 -6
package/scripts/wave-orchestrator/{launcher-closure.mjs → closure-engine.mjs} +190 -74
package/scripts/wave-orchestrator/{launcher-derived-state.mjs → derived-state-engine.mjs} +34 -146
package/scripts/wave-orchestrator/{launcher-gates.mjs → gate-engine.mjs} +395 -139
package/scripts/wave-orchestrator/human-input-resolution.mjs +14 -10
package/scripts/wave-orchestrator/human-input-workflow.mjs +104 -0
package/scripts/wave-orchestrator/implementation-engine.mjs +120 -0
package/scripts/wave-orchestrator/launcher-runtime.mjs +5 -6
package/scripts/wave-orchestrator/launcher.mjs +271 -724
package/scripts/wave-orchestrator/projection-writer.mjs +256 -0
package/scripts/wave-orchestrator/reconcile-format.mjs +32 -0
package/scripts/wave-orchestrator/reducer-snapshot.mjs +297 -0
package/scripts/wave-orchestrator/replay.mjs +3 -1
package/scripts/wave-orchestrator/result-envelope.mjs +589 -0
package/scripts/wave-orchestrator/retry-control.mjs +5 -0
package/scripts/wave-orchestrator/{launcher-retry.mjs → retry-engine.mjs} +267 -18
package/scripts/wave-orchestrator/role-helpers.mjs +51 -0
package/scripts/wave-orchestrator/{launcher-supervisor.mjs → session-supervisor.mjs} +178 -103
package/scripts/wave-orchestrator/shared.mjs +1 -0
package/scripts/wave-orchestrator/traces.mjs +10 -1
package/scripts/wave-orchestrator/wave-files.mjs +11 -9
package/scripts/wave-orchestrator/wave-state-reducer.mjs +52 -5

package/docs/plans/migration.md CHANGED

@@ -1,102 +1,275 @@
 # Migration
-For the staged internal cutover from the legacy launcher-centric runtime to the authority-set / reducer / phase-engine architecture, see [architecture-hardening-migration.md](./architecture-hardening-migration.md). This page stays focused on package adoption and upgrade steps for repo operators.
+This page is the operator-facing upgrade guide for adopting repos. It explains how to move from older Wave package versions onto the current `0.8.4` surface without guessing which files are package-owned, which files are repo-owned, and which validations to trust after the bump.
+For the completed internal architecture cutover record, see [architecture-hardening-migration.md](./architecture-hardening-migration.md). That document is historical. This one is the practical repo-upgrade checklist.
+## What `0.8.4` Changes
+`0.8.4` is a hardening release, not a new authoring model.
+- contradiction replay no longer depends on component-matrix parsing when the trace does not declare promoted components
+- `requireComponentPromotionsFromWave` now disables both component-promotion proof blocking and component-matrix current-level blocking before the configured threshold
+- `projection-writer.mjs` is now the single persistence layer for projection outputs, while `derived-state-engine.mjs` computes those payloads without persisting them directly
+- starter docs, release notes, README, and publishing guidance now describe the shipped runtime instead of transitional architecture claims
+There are no new CLI flags or wave-file section requirements in `0.8.4`.
+## Upgrade Contract
+- `pnpm up @chllming/wave-orchestration` updates the runtime in `node_modules`.
+- `pnpm exec wave upgrade` writes `.wave/install-state.json` and `.wave/upgrade-history/*` only.
+- `wave upgrade` does not rewrite repo-owned `wave.config.json`, `docs/agents/*`, `docs/plans/waves/*`, `skills/*`, `docs/context7/*`, or repo-specific reference docs.
+- `.tmp/<lane>-wave-launcher/` is runtime state, not migration source of truth. Do not treat old generated artifacts as the thing to preserve.
 ## Default Adoption Path
+Use this when the repo is not already running Wave or you are replacing a very old local starter copy.
 1. Install the package from npmjs with `pnpm add -D @chllming/wave-orchestration`.
 2. For a fresh repo, run `pnpm exec wave init`.
-3. For a repo that already has Wave config, docs, or waves you want to preserve, run `pnpm exec wave init --adopt-existing`.
-4. Edit `wave.config.json` for the repo's docs, roles, validation rules, executor defaults, skill attachment policy, and component-cutover matrix paths.
-5. Replace the starter plan docs, sample waves, starter `skills/` bundles, and component cutover matrix with repository-specific ones.
-6. Configure Context7 bundles for the external libraries that repo actually uses.
-7. Run `pnpm exec wave doctor` and `pnpm exec wave launch --lane main --dry-run --no-dashboard` until validation passes.
-8. Inspect seeded coordination and inbox artifacts with `pnpm exec wave coord show --lane main --wave 0 --dry-run --json` and `pnpm exec wave coord inbox --lane main --wave 0 --agent A1 --dry-run`.
-9. Upgrade later with `pnpm up @chllming/wave-orchestration` and `pnpm exec wave upgrade`.
+3. For a repo that already owns Wave config, docs, or waves, run `pnpm exec wave init --adopt-existing`.
+4. Review `wave.config.json` for docs roots, roles, validation thresholds, executor defaults, skill attachments, Context7 bundles, and component-cutover matrix paths.
+5. Replace starter sample plans, starter skills, and starter prompts with repo-owned versions where needed.
+6. Run the validation checklist in this doc before the first live launcher run.
-GitHub Packages remains available as an authenticated fallback path, and maintainer npm publishing setup is documented in [npmjs-trusted-publishing.md](../reference/npmjs-trusted-publishing.md).
+GitHub Packages remains an authenticated fallback install path, but npmjs is the default public distribution channel.
-## Upgrade Contract
+## Safe Upgrade Flow For Any Existing Repo
+Use this flow before the version-specific sections below.
-- Package upgrades change the runtime behavior in `node_modules`; they do not copy a new starter scaffold into the repo.
-- `wave upgrade` writes `.wave/install-state.json` and `.wave/upgrade-history/*` only.
-- Existing `wave.config.json`, role prompts, plan docs, `skills/` bundles, Context7 bundles, and wave files are never overwritten by the upgrade flow.
-- Fresh `wave init` seeds the starter `skills/` library. `wave init --adopt-existing` records existing repo-owned skill bundles when they are already present, but does not replace or rewrite them.
-- The current runtime expects the post-roadmap model: typed coordination, compiled inboxes, `A8` integration, staged closure, orchestrator-first clarification, and operational runtime policy.
+### 1. Upgrade When The Lane Is Idle
-## Upgrading From 0.6.x To 0.8.3
+- Prefer upgrading between waves, not mid-attempt.
+- If a lane still has running sessions, finish or intentionally stop that attempt before changing package versions.
+- If a repo is stranded after a prior crash, inspect `wave control status` first, then decide whether to relaunch or reconcile on the upgraded package.
-Read `CHANGELOG.md` first, then treat this section as the repo-owned migration checklist for adopted `0.6.x` workspaces.
+### 2. Bump The Package
-`wave upgrade` updates the installed runtime only. It does not copy planner starter files into a repo that already owns its docs, skills, and Context7 bundles.
+```bash
+pnpm up @chllming/wave-orchestration
+pnpm exec wave upgrade
+```
-`0.8.3` carries forward the `0.8.2` completed-wave control-status hardening and fixes the human-answer reconciliation path: answered feedback now closes the linked clarification or escalation chain in canonical coordination, re-syncs helper-assignment projections, and preserves ad-hoc `--run <id>` context when writing safe continuation intent.
+### 3. Sync Repo-Owned Starter Surface Only If You Copied It
-### Required Repo Changes
+If your repo copied package-owned starter docs, prompts, or skills instead of treating them as read-only package material, sync the copied files that you still want to match upstream.
-If the repo adopted Wave before the planner corpus became a tracked required surface, sync:
+The common sync set is:
+- `docs/agents/wave-launcher-role.md`
+- `docs/agents/wave-orchestrator-role.md`
 - `docs/agents/wave-planner-role.md`
+- `skills/wave-core/`
 - `skills/role-planner/`
+- runtime and closure-role starter skills under `skills/`
 - `docs/context7/planner-agent/`
 - `docs/reference/wave-planning-lessons.md`
-- the `planner-agentic` bundle entry in `docs/context7/bundles.json`
+- `docs/plans/current-state.md`
+- `docs/plans/end-state-architecture.md`
+- `docs/plans/wave-orchestrator.md`
+- `docs/plans/migration.md`
-If the repo copied the shipped starter architecture docs or skills and wants the `0.8.3` authority-model language, also sync:
+If your repo never copied those starter files, do not invent migration work. The installed package already carries the new runtime behavior.
-- `docs/agents/wave-launcher-role.md`
-- `docs/agents/wave-orchestrator-role.md`
-- `skills/wave-core/`
-- the relevant runtime and closure-role starter skills under `skills/`
-- `docs/plans/architecture-hardening-migration.md`
+### 4. Re-validate Before A Live Run
-### Recommended Upgrade Validation
+Run these from the repo root:
-After syncing those repo-owned files:
+```bash
+pnpm exec wave doctor
+pnpm exec wave launch --lane main --dry-run --no-dashboard
+pnpm exec wave control status --lane main --wave 0 --json
+pnpm exec wave coord inbox --lane main --wave 0 --agent A1 --dry-run
+```
-1. Run `pnpm exec wave doctor`.
-2. Run `pnpm exec wave launch --lane main --dry-run --no-dashboard`.
-3. Use `pnpm exec wave dashboard --lane <lane> --attach current` or `--attach global` when you need to reattach to a live tmux-backed dashboard without reverse-engineering the socket or session name.
-4. If your operators answer human-input tickets through `wave feedback respond`, update any repo-local runbooks so ad-hoc runs always pass `--run <id>` when responding outside the main roadmap lane.
+Use `pnpm exec wave dashboard --lane <lane> --attach current` or `--attach global` when you need to reattach to an existing tmux-backed dashboard after the upgrade.
-## Upgrading From 0.5.4 To 0.6.1
+## Version-Specific Upgrade Guidance
-Read `CHANGELOG.md` first, then treat the rest of this page as the manual repo-owned migration checklist for the `0.6.1` release. `wave upgrade` will update package-owned runtime code only; it will not rewrite the docs, prompts, config, or wave files that your repo already owns.
+## Upgrading From `0.8.3` To `0.8.4`
-### Required Repo Changes
+This is the smallest migration.
-1. Rename legacy `evaluator` config and prompt terminology to `cont-QA`.
-2. Keep `A0` as the final closure owner that emits both the final verdict and `[wave-gate]`.
-3. Add `E0` only when the wave needs benchmark-driven tuning or service-output evaluation.
-4. Add wave-level `## Eval targets` whenever `cont-EVAL` is present.
-5. Update any starter docs or examples that still describe the pre-`0.6.1` evaluator model.
+### What changed
-In practice that means checking:
+- contradiction replay for non-promoted traces is now independent of component-matrix parsing
+- component-promotion threshold handling is now consistent between proof validation and matrix-current-level validation
+- projection output writes are centralized in `projection-writer.mjs`
+### Required repo changes
+None for wave shape, config keys, or CLI usage.
+### Recommended checks
+1. Re-run `pnpm exec wave doctor`.
+2. Re-run `pnpm exec wave launch --lane main --dry-run --no-dashboard`.
+3. If your repo copied starter architecture docs or starter skills, sync them so local runbooks stop describing the older split projection behavior.
+4. If you keep historical trace fixtures, replay at least one contradiction-blocked trace and one promoted-component trace after the upgrade.
+## Upgrading From `0.8.0`-`0.8.2` To `0.8.4`
+Treat this as one upgrade to the current surface.
+### What changed across that range
+- completed-wave `wave control status` projection hardened in `0.8.2`
+- human-input reconciliation and ad-hoc `--run <id>` context hardening landed in `0.8.3`
+- contradiction replay, component-threshold consistency, and projection-writer centralization landed in `0.8.4`
+### Required repo changes
+Usually none for config shape.
+### Strongly recommended sync
+If your repo copied upstream starter docs or skills, sync:
+- the current operator runbook and architecture docs
+- the launcher and orchestrator role prompts
+- the relevant runtime skills and closure-role starter skills
+### Validation focus
+- check that completed waves do not show stale blockers through `wave control status`
+- answer at least one human-feedback ticket in a test lane and confirm the linked clarification or escalation chain closes cleanly
+- replay one contradiction-blocked trace if your repo relies on trace-based regression checks
+## Upgrading From `0.6.x` Or `0.7.x` To `0.8.4`
+This is the main migration path for older adopted repos.
+### Behavioral changes you must account for
+- `wave control` is the preferred operator surface for status, rerun, proof, and telemetry work
+- `cont-QA` and optional `cont-EVAL` remain distinct closure roles; older overloaded evaluator language should be removed
+- planner corpus files are now treated as required starter surface for repos that use planner workflows
+- live closure depends on validated result envelopes plus canonical state, not only older summary-era behavior
+- control-plane state, reducer state, and replay are now first-class runtime surfaces, not optional internals
+### Required repo changes
+1. Remove or rename any legacy `evaluator` role/config terminology to `cont-QA`.
+2. Keep `A0` as the final closure owner and add `E0` only when the wave needs eval-driven tuning.
+3. Add wave-level `## Eval targets` whenever `cont-EVAL` is present.
+4. Sync the planner starter corpus if the repo uses `wave project` or `wave draft`:
+   - `docs/agents/wave-planner-role.md`
+   - `skills/role-planner/`
+   - `docs/context7/planner-agent/`
+   - `docs/reference/wave-planning-lessons.md`
+   - the `planner-agentic` entry in `docs/context7/bundles.json`
+5. Review any repo-owned docs or internal runbooks that still describe one overloaded evaluator role, marker-era closure, or pre-control-plane retry/proof workflow.
+### Additional validation
+Run the default validation set, then also check:
+```bash
+pnpm exec wave control status --lane main --wave 0 --json
+pnpm exec wave control rerun get --lane main --wave 0 --json
+pnpm exec wave control proof get --lane main --wave 0 --json
+```
+If your repo carries proof-first waves, verify that required proof artifacts are still present locally and not only in historical summaries.
+## Upgrading From `0.5.x` Or Earlier To `0.8.4`
+Do not treat this as a tiny patch bump.
+### Recommended approach
+1. Read [docs/reference/migration-0.2-to-0.5.md](../reference/migration-0.2-to-0.5.md) first if the repo still looks pre-`0.5`.
+2. Run `pnpm exec wave init --adopt-existing` on a branch so the workspace records install state without overwriting repo-owned material.
+3. Move the repo onto the `0.6.x` and later surface using the section above.
+4. Re-run the full validation checklist before any live executor run.
+### Why
+Older repos often differ in:
+- role naming
+- closure ordering
+- runtime config keys
+- planner starter corpus
+- proof and retry operator surfaces
+- generated state layout under `.tmp/`
+Trying to jump directly with ad hoc edits usually leaves hidden drift in prompts, docs, or config.
+## Repo-Owned Files To Audit During Any Upgrade
+These are the highest-value files to check when a repo copied starter surface instead of reading from the package.
+### Prompts and skills
-- `wave.config.json`
-  Remove or rename `roles.evaluator*`, `skills.byRole.evaluator`, and `runtimePolicy.defaultExecutorByRole.evaluator`.
 - `docs/agents/*.md`
-  Rename or replace any legacy evaluator prompt files so the repo clearly distinguishes `cont-QA`, `cont-EVAL`, and optional security review.
+- `skills/*`
+- `docs/context7/bundles.json`
+- `docs/context7/planner-agent/`
+### Operator docs and runbooks
+- `docs/plans/current-state.md`
+- `docs/plans/wave-orchestrator.md`
+- `docs/plans/end-state-architecture.md`
+- `docs/plans/migration.md`
+- `docs/reference/cli-reference.md`
+- `docs/reference/wave-control.md`
+- `docs/reference/sample-waves.md`
+### Config and wave contracts
+- `wave.config.json`
 - `docs/plans/waves/*.md`
-  Update wave agent headings, role prompts, and closure expectations to use `A0` for `cont-QA`, optional `E0` for eval tuning, and optional security review before integration.
-- `docs/reference/` and other operator docs
-  Refresh any examples or internal runbooks that still describe one overloaded evaluator role.
+- `docs/evals/benchmark-catalog.json`
+- component-cutover matrix files under `docs/plans/`
+## Validation Checklist After The Upgrade
+Use this exact sequence unless your repo has a better repo-specific smoke suite.
+```bash
+pnpm exec wave doctor --json
+pnpm exec wave launch --lane main --dry-run --no-dashboard
+pnpm exec wave control status --lane main --wave 0 --json
+pnpm exec wave coord show --lane main --wave 0 --dry-run --json
+pnpm exec wave coord inbox --lane main --wave 0 --agent A1 --dry-run
+```
+For repos that extend or test the runtime itself, also run:
+```bash
+pnpm test
+```
+For repos that depend on replay parity, replay at least:
+- one contradiction-blocked trace
+- one promoted-component trace
+- one retry-history trace
+## Troubleshooting
+### `wave doctor` fails after the upgrade
+- check whether the repo is missing planner starter surface it previously copied
+- check whether old `evaluator` naming is still present in config or prompts
+- check whether wave files now declare closure roles or eval targets inconsistently with the current runtime
-### Closure And Marker Changes
+### A live lane looks blocked after the bump
-Live `0.6.1` closure is stricter than `0.5.4`.
+- use `wave control status --lane <lane> --wave <n> --json`
+- confirm whether the blocker is canonical coordination, dependency, proof, or human-input state
+- do not trust old generated markdown alone
-- `cont-EVAL` must leave a report plus a final `[wave-eval]` marker whose `target_ids` exactly matches the wave contract and whose `benchmark_ids` stays within the benchmark catalog.
-- Security review, when present, must leave a report plus a final `[wave-security]` marker.
-- `cont-QA` must leave both the final `Verdict:` line and the final `[wave-gate]` marker.
-- Older evaluator-era or verdict-only artifacts remain replay-readable, but they do not satisfy live completion anymore.
+### Replay differs from old expectations
-### Recommended Upgrade Validation
+- verify whether the trace declares promoted components
+- verify whether the repo relied on pre-`0.8.4` component-threshold behavior
+- compare `storedOutcome.gateSnapshot` against recomputed replay output before changing live policy
-After updating repo-owned files:
+## Summary
-1. Run `pnpm exec wave doctor`.
-2. Run `pnpm exec wave launch --lane main --dry-run --no-dashboard`.
-3. Use `pnpm exec wave coord show --lane main --wave 0 --dry-run --json` as a read-only inspection path for the coordination state.
-4. Use `pnpm exec wave coord inbox --lane main --wave 0 --agent A1 --dry-run` when you want the launcher to materialize shared-summary and inbox artifacts for review.
-5. If the repo adopts `cont-EVAL`, verify that every live eval wave declares `## Eval targets` and that the benchmark ids exist in `docs/evals/benchmark-catalog.json`.
+`0.8.4` does not introduce a new authoring model. It hardens replay, makes component-promotion thresholds behave consistently, and finishes the projection-writer ownership boundary. For most repos already on `0.8.x`, the upgrade is package bump plus validation. For older adopted repos, the real work is syncing repo-owned prompts, skills, and runbooks so they describe the runtime the package now ships.
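As an editorial illustration of the `evaluator` to `cont-QA` rename step in the checklist above, the sketch below scans a parsed `wave.config.json` object for the legacy keys named in the removed `0.6.1` guidance (`roles.evaluator*`, `skills.byRole.evaluator`, and `runtimePolicy.defaultExecutorByRole.evaluator`). The function name `findLegacyEvaluatorKeys` and the sample config values are hypothetical. Nothing here is part of the package or its CLI, and the real config schema may differ.

```javascript
// Hypothetical helper, not shipped with @chllming/wave-orchestration.
// Reports config paths that still use legacy `evaluator` naming.
function findLegacyEvaluatorKeys(config) {
  const hits = [];

  // roles.evaluator*: any key under `roles` that starts with "evaluator".
  for (const key of Object.keys(config.roles ?? {})) {
    if (key.startsWith("evaluator")) hits.push(`roles.${key}`);
  }

  // skills.byRole.evaluator
  const byRole = config.skills?.byRole;
  if (byRole && Object.prototype.hasOwnProperty.call(byRole, "evaluator")) {
    hits.push("skills.byRole.evaluator");
  }

  // runtimePolicy.defaultExecutorByRole.evaluator
  const executors = config.runtimePolicy?.defaultExecutorByRole;
  if (executors && Object.prototype.hasOwnProperty.call(executors, "evaluator")) {
    hits.push("runtimePolicy.defaultExecutorByRole.evaluator");
  }

  return hits;
}

// Assumed sample shape, for illustration only.
const sampleConfig = {
  roles: { evaluatorAgentId: "A0", contQaAgentId: "A0" },
  skills: { byRole: { evaluator: ["role-evaluator"] } },
  runtimePolicy: { defaultExecutorByRole: { "cont-qa": "codex" } },
};

console.log(findLegacyEvaluatorKeys(sampleConfig));
```

A check like this could run before `pnpm exec wave doctor` to surface leftover naming early, but `wave doctor` remains the validation the guide itself points to.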

package/docs/plans/wave-orchestrator.md CHANGED

@@ -12,10 +12,33 @@ This runbook is the operational view of the architecture:
 - executor adapters preserve Claude, Codex, and OpenCode-specific runtime features at the edge
 - closure makes completion depend on integrated proof and shared state, not on free-form agent narration
+## Runtime Module Map
+The live runtime is organized around explicit modules:
+- `launcher.mjs`
+  Thin orchestrator for CLI parsing, launcher lock handling, wave iteration, and engine sequencing.
+- `implementation-engine.mjs`
+  Chooses initial or retry implementation fan-out.
+- `derived-state-engine.mjs`
+  Computes shared summary, inbox, assignment, dependency, ledger, docs queue, and integration or security projection payloads from canonical state.
+- `gate-engine.mjs`
+  Evaluates live gates from validated result envelopes plus canonical state.
+- `retry-engine.mjs`
+  Produces reducer-driven retry and resume plans.
+- `closure-engine.mjs`
+  Runs staged closure sequencing across `cont-EVAL`, security, integration, documentation, and `cont-QA` using the wave's effective role bindings.
+- `wave-state-reducer.mjs`
+  Reconstructs deterministic wave state for live queries and replay.
+- `session-supervisor.mjs`
+  Launches and monitors sessions and writes observed `wave_run`, `attempt`, and `agent_run` lifecycle facts.
+- `projection-writer.mjs`
+  Persists projection outputs such as dashboards, traces, board projections, compiled summaries and inboxes, assignment and dependency snapshots, docs queues, ledgers, and integration or security summaries. Clarification-triage workflow artifacts stay workflow-owned.
 ## What It Does
 - parses wave plans from `docs/plans/waves/`
-- supports transient ad-hoc runs from `.wave/adhoc/runs/` on the same launcher substrate
+- supports transient ad-hoc runs from `.wave/adhoc/runs/` on the same runtime substrate
 - fans a wave out into one session per `## Agent ...` section
 - supports standing role imports from `docs/agents/*.md`
 - seeds a coordination log, generated board, compiled shared summary, and per-agent inboxes
@@ -24,11 +47,12 @@ This runbook is the operational view of the architecture:
 - validates Context7 declarations and exit contracts from configurable wave thresholds
 - validates component promotions and component-owned proof from configurable wave thresholds
 - writes prompts, logs, dashboards, coordination state, and status summaries under `.tmp/`
-- supports launcher-side Context7 prefetch and injection for headless runs
+- supports runtime-side Context7 prefetch and injection for headless runs
 - supports headless execution through `codex`, `claude`, `opencode`, and the local smoke executor
 - can retry rate-limited `codex`, `claude`, and `opencode` launches with per-agent exponential backoff via `--agent-rate-limit-*`
 - supports a file-backed human feedback queue
-- performs a closure sweep so optional `cont-EVAL`, optional security review, integration, documentation, and cont-QA gates reflect final landed state
+- performs a closure sweep so optional `cont-EVAL`, optional security review, integration, documentation, and cont-QA gates reflect final landed state through the wave's effective closure-role bindings
+- rebuilds contradiction blockers from canonical control-plane events during replay and materializes human-blocked waves as `clarifying` plus blocked `waveState`
 ## Main Commands
@@ -147,14 +171,14 @@ Compatibility note:
 The canonical conversational state is the JSONL log under `.tmp/<lane>-wave-launcher/coordination/`. The markdown board is a generated projection for humans, not a decision input.
-Control-plane facts that drive reruns, proof, attempt state, contradictions, facts, and operator tasks are appended separately under `.tmp/<lane>-wave-launcher/control-plane/`. Result envelopes live under `.tmp/<lane>-wave-launcher/results/`. Legacy proof and retry files remain derived projections for compatibility, not decision inputs.
+Control-plane facts that drive reruns, proof, attempt state, contradictions, facts, human-input workflow, and operator tasks are appended separately under `.tmp/<lane>-wave-launcher/control-plane/`. Result envelopes live under `.tmp/<lane>-wave-launcher/results/wave-<n>/attempt-<a>/<agent>.json`. Legacy proof and retry files remain derived projections for compatibility, not decision inputs.
 Capability-targeted requests now become deterministic helper assignments. The runtime resolves the assignee from explicit targets, `capabilityRouting.preferredAgents`, then least-busy matching capability owners, writes that assignment into `.tmp/<lane>-wave-launcher/assignments/`, mirrors the decision into coordination state, and keeps the wave blocked until the linked follow-up resolves.
 Clarification flow is orchestrator-first:
 1. Agent emits `clarification-request` through `wave coord post`.
-2. The launcher triages it from repo policy, ownership, prior decisions, or targeted rerouting.
+2. The orchestrator triages it from repo policy, ownership, prior decisions, or targeted rerouting.
 3. Only unresolved items become human feedback tickets.
 4. Routed clarification follow-up requests remain blocking until they resolve.
 5. Human escalations are written back into coordination state, the ledger, and trace artifacts.
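The helper-assignment paragraph in this hunk describes a resolution order: explicit targets first, then `capabilityRouting.preferredAgents`, then the least-busy agent that owns the capability. A minimal sketch of that ordering follows. The function name `resolveAssignee`, the agent record fields (`id`, `capabilities`, `openAssignments`), the flat `preferredAgents` list, and the tie-break by id are all assumptions made for illustration; the shipped runtime may model these differently.

```javascript
// Hypothetical sketch of the documented resolution order.
// Data shapes and names are assumed, not taken from the package source.
function resolveAssignee(request, agents, preferredAgents = []) {
  const knownIds = new Set(agents.map((agent) => agent.id));

  // 1. An explicit target wins when it names a known agent.
  if (request.target && knownIds.has(request.target)) return request.target;

  // Only agents that own the requested capability are eligible from here on.
  const owners = agents.filter((agent) =>
    agent.capabilities.includes(request.capability)
  );

  // 2. The first preferred agent that owns the capability.
  const preferred = preferredAgents.find((id) =>
    owners.some((agent) => agent.id === id)
  );
  if (preferred) return preferred;

  // 3. The least-busy owner; ties are broken by id so the result is deterministic.
  const ranked = [...owners].sort(
    (a, b) => a.openAssignments - b.openAssignments || a.id.localeCompare(b.id)
  );
  return ranked.length > 0 ? ranked[0].id : null;
}

const sampleAgents = [
  { id: "A1", capabilities: ["docs"], openAssignments: 2 },
  { id: "A2", capabilities: ["docs"], openAssignments: 0 },
  { id: "A3", capabilities: ["security"], openAssignments: 0 },
];

console.log(resolveAssignee({ capability: "docs" }, sampleAgents));
```

Returning `null` when no owner matches mirrors the documented behavior only loosely: the text says the wave stays blocked until the linked follow-up resolves, and how an unroutable request is surfaced is not stated in the source.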
@@ -168,7 +192,8 @@ Retry intent, operator tasks, attempt lifecycle, and proof injection are now fir
 - canonical control events live under `.tmp/<lane>-wave-launcher/control-plane/`
 - projected retry overrides still live under `.tmp/<lane>-wave-launcher/control/`
 - projected proof registries still live under `.tmp/<lane>-wave-launcher/proof/`
-- live traces now copy the control-plane log alongside the proof registry so replay keeps the same operator-visible facts
+- live traces now copy the control-plane log alongside the proof registry so replay keeps the same operator-visible facts and contradiction blockers
+- `session-supervisor.mjs` writes observed `wave_run.started|completed|failed`, `attempt.running|completed|failed`, and `agent_run.started|completed|failed|timed_out` events into that control-plane log
 For a full end-to-end explainer of helper assignments, deliverables, integration, and why an agent can be locally done while the wave stays blocked, see [docs/reference/coordination-and-closure.md](../reference/coordination-and-closure.md).
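The lifecycle facts listed in this hunk (`wave_run.started|completed|failed`, `attempt.running|completed|failed`, `agent_run.started|completed|failed|timed_out`) can be written down as a small lookup table. The sketch below assumes event names are dotted `entity.state` strings, which is how the doc notates them but may not be how the control-plane log stores them. `OBSERVED_LIFECYCLE` and `isObservedLifecycleEvent` are hypothetical names.

```javascript
// Hypothetical lookup of the lifecycle states named in the runbook.
// The dotted "entity.state" string format is an assumption.
const OBSERVED_LIFECYCLE = {
  wave_run: ["started", "completed", "failed"],
  attempt: ["running", "completed", "failed"],
  agent_run: ["started", "completed", "failed", "timed_out"],
};

function isObservedLifecycleEvent(name) {
  const dot = name.indexOf(".");
  if (dot === -1) return false;
  const entity = name.slice(0, dot);
  const state = name.slice(dot + 1);
  // Own-property check so inherited keys such as "toString" never match.
  if (!Object.prototype.hasOwnProperty.call(OBSERVED_LIFECYCLE, entity)) {
    return false;
  }
  return OBSERVED_LIFECYCLE[entity].includes(state);
}

console.log(isObservedLifecycleEvent("agent_run.timed_out"));
```

Note the asymmetry the table makes visible: attempts are recorded as `running` rather than `started`, and only agent runs carry a `timed_out` state.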
@@ -208,7 +233,7 @@ pnpm exec wave changelog --since-installed
 3. Review `.wave/upgrade-history/` for any manual follow-up. The upgrade flow does not overwrite repo-owned plans, waves, or config.
-## What The Launcher Writes
+## What The Runtime Writes
 - prompts: `.tmp/<lane>-wave-launcher/prompts/`
 - logs: `.tmp/<lane>-wave-launcher/logs/`
@@ -235,7 +260,7 @@ pnpm exec wave changelog --since-installed
 - proof registries: `.tmp/<lane>-wave-launcher/proof/`
   Projected from control-plane state for compatibility. Operator-registered authoritative proof bundles that feed integration, cont-QA, and replay.
 - retry overrides: `.tmp/<lane>-wave-launcher/control/`
-  Projected from control-plane state for compatibility. Operator-applied targeted retry overrides, applied once per attempt and then cleared by the launcher.
+  Projected from control-plane state for compatibility. Operator-applied targeted retry overrides, applied once per attempt and then cleared after execution.
 - clarification triage: `.tmp/<lane>-wave-launcher/feedback/triage/`
 - dashboards: `.tmp/<lane>-wave-launcher/dashboards/`
   Dashboard JSON is a versioned contract. `global.json` and `wave-<n>.json` now carry explicit `schemaVersion` and `kind` fields.
@@ -248,7 +273,7 @@ pnpm exec wave changelog --since-installed
 Ad-hoc runs mirror the same state shape under `.tmp/<lane>-wave-launcher/adhoc/<run-id>/`, including dry-run previews at `.tmp/<lane>-wave-launcher/adhoc/<run-id>/dry-run/`. Their docs queue can still point at canonical shared-plan docs when the run reports a shared-plan delta.
-The launcher entrypoint in `scripts/wave-orchestrator/launcher.mjs` is being hardened toward a thin orchestrator over reducer, derived-state, retry, gate, closure, and supervision modules. The CLI and `traceVersion: 2` replay contract stay unchanged.
+The launcher entrypoint in `scripts/wave-orchestrator/launcher.mjs` now acts as a thin orchestrator over reducer, derived-state, retry, gate, closure, and supervision modules. The CLI and `traceVersion: 2` replay contract stay unchanged.
 ## Trace Contract
@@ -275,9 +300,10 @@ The launcher entrypoint in `scripts/wave-orchestrator/launcher.mjs` is being har
 - `outcome.json` is the stored replay baseline. Replay compares recomputed gates and quality against it instead of trusting only inline metadata.
 - For `traceVersion: 2`, launched agents must have copied prompt/log/status/inbox/summary artifacts, and promoted-component waves must include the copied component matrix JSON.
 - `security.json` stores the derived per-wave security state that feeds integration summaries, gate snapshots, and replay.
+- Non-promoted contradiction replay relies on copied control-plane facts and result artifacts; copied component matrices are only required when the trace declares promoted components.
 - `quality.json` is cumulative through the current attempt. It is intended for regression comparison, not only for one-shot pass/fail reporting.
 - `quality.json` also reports capability-assignment and dependency-resolution metrics, plus coordination response metrics (overdue acknowledgements, clarification timing, human escalation counts), in addition to the Phase 2/3 communication, fallback, and closure metrics.
-- Replay support is internal. The source tree contains helpers to load, validate, and replay trace bundles against the same gate logic the launcher uses, but there is no public replay CLI yet.
+- Replay support is internal. The source tree contains helpers to load, validate, and replay trace bundles against the same gate logic the runtime uses, but there is no public replay CLI yet.
 - Replay is read-only and hash-validating for `traceVersion: 2` bundles. It ignores inline summary duplicates in `run-metadata.json` and returns a stored-vs-recomputed comparison report for gate and quality state. Legacy `traceVersion: 1` bundles remain best-effort and emit warnings instead of claiming full hermetic replay.
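The trace contract says replay returns a stored-vs-recomputed comparison for gate state. A rough sketch of such a comparison follows. The snapshot shape used here (a map of gate name to `{ ok: boolean }`) and the name `compareGateSnapshots` are assumptions for illustration only; the actual `outcome.json` layout and the internal replay helpers are not shown in this diff.

```javascript
// Hypothetical comparison of a stored gate snapshot against a recomputed one.
// The { gateName: { ok } } shape is assumed, not the package's real schema.
function compareGateSnapshots(stored, recomputed) {
  const gates = [
    ...new Set([...Object.keys(stored), ...Object.keys(recomputed)]),
  ].sort();

  const diffs = [];
  for (const gate of gates) {
    // A gate missing from one side is reported as null on that side.
    const before = stored[gate]?.ok ?? null;
    const after = recomputed[gate]?.ok ?? null;
    if (before !== after) diffs.push({ gate, stored: before, recomputed: after });
  }
  return { matches: diffs.length === 0, diffs };
}

const storedSnapshot = { integration: { ok: true }, contQa: { ok: false } };
const recomputedSnapshot = {
  integration: { ok: true },
  contQa: { ok: true },
  security: { ok: true },
};

console.log(compareGateSnapshots(storedSnapshot, recomputedSnapshot));
```

Because replay is described as read-only, the sketch only reports differences and never mutates either snapshot.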
 ## Authoring Rules
@@ -375,7 +401,7 @@ pnpm exec wave feedback respond --id <request-id> --response "..."
 ## Closure Sweep
-If implementation agents ran, the launcher does not stop at `exit 0`. It checks implementation exit contracts, promoted component proof, helper assignments, required dependencies, and the integration recommendation first. When present, `cont-EVAL` must satisfy its declared eval targets before integration can close. Optional security review then runs before integration so the reviewer can publish findings and approval-sensitive actions while the wave is still active. In the default planner shape `E0` is report-only; if a wave explicitly assigns `E0` non-report files, the launcher also applies the normal implementation proof gates to that role. Security reviewers stay report-only by default. Documentation and cont-QA closure only run after integration is explicitly ready for doc closure; if `cont-EVAL`, security review, or integration reports more work, or if helper assignments or required dependency tickets remain open, the wave stops there and retries only the implicated owners plus the relevant closure steward. When multiple implementation agents share a promoted component, owners that already landed valid proof stay reusable while the launcher retries only the sibling owners that still owe closure evidence.
+If implementation agents ran, the runtime does not stop at `exit 0`. It checks implementation exit contracts, promoted component proof, helper assignments, required dependencies, and the integration recommendation first. When present, `cont-EVAL` must satisfy its declared eval targets before integration can close. Optional security review then runs before integration so the reviewer can publish findings and approval-sensitive actions while the wave is still active. In the default planner shape `E0` is report-only; if a wave explicitly assigns `E0` non-report files, the runtime also applies the normal implementation proof gates to that role. Security reviewers stay report-only by default. Waves may override the default closure role ids; derived state, reducer snapshots, retry or resume planning, and closure sequencing all honor those wave-specific bindings consistently. Documentation and cont-QA closure only run after integration is explicitly ready for doc closure; if `cont-EVAL`, security review, or integration reports more work, or if helper assignments or required dependency tickets remain open, the wave stops there and retries only the implicated owners plus the relevant closure steward. When multiple implementation agents share a promoted component, owners that already landed valid proof stay reusable while the runtime retries only the sibling owners that still owe closure evidence.
 Live closure is fail-closed:

package/docs/reference/cli-reference.md CHANGED

@@ -7,6 +7,19 @@ summary: "Complete syntax reference for all wave CLI commands, flags, and operat
 Complete syntax for every `wave` command. All commands use `pnpm exec wave` as the entry point.
+## Command Families
+- Runtime:
+  `wave launch`, `wave autonomous`, and `wave local` cover dry-run validation, live execution, and executor-specific prompt transport.
+- Operator control:
+  `wave control` is the preferred surface for live status, tasks, reruns, proof bundles, and telemetry.
+- Compatibility and inspection:
+  `wave coord`, `wave retry`, and `wave proof` remain available where older runbooks still depend on them.
+- Planning and transient work:
+  `wave project`, `wave draft`, and `wave adhoc` cover defaults, authored waves, and operator-driven one-off runs.
+- Setup and lifecycle:
+  `wave init`, `wave doctor`, `wave upgrade`, and `wave self-update` cover workspace adoption, validation, and package upgrades.
 ## wave launch
 Launch waves for execution.
@@ -15,6 +28,10 @@ Launch waves for execution.
 wave launch [options]
 ```
+Defaults below reflect the starter workspace surface in this repo. Lane config can override executor, timeout, retry, and terminal defaults.
+Closure-role bindings do not have a CLI override surface. When a wave file declares custom integration, documentation, `cont-QA`, `cont-EVAL`, or security-review role ids, launch, retry, reducer, and closure flows honor those wave-level bindings end to end.
 | Flag | Default | Description |
 |------|---------|-------------|
 | `--lane <name>` | `main` | Lane name |
@@ -22,20 +39,23 @@ wave launch [options]
 | `--end-wave <n>` | last available | Last wave to launch |
 | `--auto-next` | off | Start from next unfinished wave and continue |
 | `--resume-control-state` | off | Preserve the prior auto-generated relaunch plan instead of treating the launch as a fresh wave start |
-| `--executor <id>` | lane config | Default executor: `codex`, `claude`, `opencode`, `local` |
+| `--executor <id>` | `codex` | Default executor: `codex`, `claude`, `opencode`, `local` |
 | `--codex-sandbox <mode>` | `danger-full-access` | Codex sandbox isolation level |
-| `--timeout-minutes <n>` | `60` | Max minutes to wait per wave |
-| `--max-retries-per-wave <n>` | `3` | Relaunch failed agents per wave |
-| `--agent-rate-limit-retries <n>` | `3` | Per-agent retries for 429 errors |
-| `--agent-rate-limit-base-delay-seconds <n>` | `1` | Base exponential backoff for 429 |
-| `--agent-rate-limit-max-delay-seconds <n>` | `60` | Max backoff delay for 429 |
-| `--agent-launch-stagger-ms <n>` | `250` | Delay between agent launches |
-| `--terminal-surface <mode>` | configured | `tmux`, `vscode`, or `none` |
+| `--timeout-minutes <n>` | `240` | Max minutes to wait per wave |
+| `--max-retries-per-wave <n>` | `1` | Relaunch failed agents per wave |
+| `--agent-rate-limit-retries <n>` | `2` | Per-agent retries for 429 errors |
+| `--agent-rate-limit-base-delay-seconds <n>` | `20` | Base exponential backoff for 429 |
+| `--agent-rate-limit-max-delay-seconds <n>` | `180` | Max backoff delay for 429 |
+| `--agent-launch-stagger-ms <n>` | `1200` | Delay between agent launches |
+| `--terminal-surface <mode>` | `vscode` | `tmux`, `vscode`, or `none` |
 | `--no-dashboard` | off | Disable per-wave tmux dashboard |
 | `--cleanup-sessions` | on | Kill lane tmux sessions after each wave |
 | `--keep-sessions` | off | Keep lane tmux sessions |
 | `--keep-terminals` | off | Keep temporary terminal entries |
 | `--orchestrator-id <id>` | generated | Stable orchestrator identity |
+| `--orchestrator-board <path>` | default board path | Write coordination-board updates to a specific shared board |
+| `--no-orchestrator-board` | off | Disable shared orchestrator-board writes for this run |
+| `--coordination-note <text>` | empty | Append a startup intent note to orchestrator-board updates |
 | `--resident-orchestrator` | off | Launch long-running non-owning orchestrator session |
 | `--no-telemetry` | off | Disable Wave Control event publication |
 | `--no-context7` | off | Disable Context7 prefetch |
@@ -57,13 +77,13 @@ wave autonomous [options]
 | `--lane <name>` | `main` | Lane name |
 | `--executor <id>` | lane config | `codex`, `claude`, or `opencode` (not `local`) |
 | `--codex-sandbox <mode>` | `danger-full-access` | Codex sandbox mode |
-| `--timeout-minutes <n>` | `60` | Per-wave timeout passed to launcher |
-| `--max-retries-per-wave <n>` | `3` | Per-wave relaunches inside launcher |
+| `--timeout-minutes <n>` | `240` | Per-wave timeout passed to launcher |
+| `--max-retries-per-wave <n>` | `1` | Per-wave relaunches inside launcher |
 | `--max-attempts-per-wave <n>` | `1` | External attempts per wave |
-| `--agent-rate-limit-retries <n>` | `3` | Per-agent 429 retries |
-| `--agent-rate-limit-base-delay-seconds <n>` | `1` | Base 429 backoff |
-| `--agent-rate-limit-max-delay-seconds <n>` | `60` | Max 429 backoff |
-| `--agent-launch-stagger-ms <n>` | `250` | Delay between agent launches |
+| `--agent-rate-limit-retries <n>` | `2` | Per-agent 429 retries |
+| `--agent-rate-limit-base-delay-seconds <n>` | `20` | Base 429 backoff |
+| `--agent-rate-limit-max-delay-seconds <n>` | `180` | Max 429 backoff |
+| `--agent-launch-stagger-ms <n>` | `1200` | Delay between agent launches |
 | `--orchestrator-id <id>` | `<lane>-autonomous` | Orchestrator identity |
 | `--resident-orchestrator` | off | Launch resident orchestrator for each wave |
 | `--dashboard` | off | Enable dashboards |
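
The three `--agent-rate-limit-*` flags in the tables above together describe a capped exponential backoff for 429 responses. The sketch below shows one common way such a schedule is computed, using the new starter defaults (2 retries, 20 s base, 180 s cap). The doubling formula and the name `rateLimitDelays` are assumptions for illustration; this diff does not show the runtime's actual backoff code, which may differ (for example by adding jitter).

```javascript
// Hypothetical helper, not part of wave-orchestration.
// Assumes delay = min(maxDelay, baseDelay * 2^attempt), with attempt starting at 0.
function rateLimitDelays({ retries = 2, baseDelaySeconds = 20, maxDelaySeconds = 180 } = {}) {
  const delays = [];
  for (let attempt = 0; attempt < retries; attempt += 1) {
    // Double the base delay on each attempt, but never exceed the cap.
    delays.push(Math.min(maxDelaySeconds, baseDelaySeconds * 2 ** attempt));
  }
  return delays;
}

console.log(rateLimitDelays());               // [ 20, 40 ]
console.log(rateLimitDelays({ retries: 5 })); // [ 20, 40, 80, 160, 180 ]
```

Under these assumptions, the 0.8.4 defaults wait far longer between attempts than the 0.8.3 values (1 s base, 60 s cap), while allowing one fewer retry per agent.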

package/docs/reference/coordination-and-closure.md CHANGED

@@ -25,6 +25,17 @@ Those are related, but they are not the same.
 An implementation agent can be locally complete and still leave the wave blocked if it created open helper work, unresolved clarification chains, or required dependencies.
+At runtime, those distinctions map onto separate modules:
+- `implementation-engine.mjs` selects implementation work
+- `derived-state-engine.mjs` rebuilds the blackboard projections
+- `gate-engine.mjs` evaluates closure and barrier state from envelopes plus canonical logs
+- `retry-engine.mjs` decides what can safely resume
+- `closure-engine.mjs` sequences the staged closeout
+- `session-supervisor.mjs` only launches sessions and records observed facts
+Closure roles are resolved from the wave definition first, then from starter defaults. In other words, integration, documentation, `cont-QA`, `cont-EVAL`, and security review keep the same semantics even when a wave overrides the default role ids such as `A8`, `A9`, `A0`, `E0`, or `A7`.
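
As a rough illustration of the "wave definition first, then starter defaults" rule, the sketch below merges per-wave overrides over a default table. The field names (`integration`, `contQa`, and so on), the function name `resolveClosureRoles`, and the pairing of each default id with a role (inferred from the order of the two lists in the paragraph above) are assumptions, not the package's actual data model.

```javascript
// Hypothetical starter defaults; the role-to-id pairing is assumed from list order in the text.
const STARTER_CLOSURE_ROLES = Object.freeze({
  integration: "A8",
  documentation: "A9",
  contQa: "A0",
  contEval: "E0",
  securityReview: "A7",
});

// Wave-level bindings win; any role the wave leaves unset falls back to the starter default.
function resolveClosureRoles(waveBindings = {}) {
  const resolved = { ...STARTER_CLOSURE_ROLES };
  for (const [role, agentId] of Object.entries(waveBindings)) {
    // Ignore unknown roles and empty ids so a partial override cannot erase a default.
    if (role in STARTER_CLOSURE_ROLES && typeof agentId === "string" && agentId !== "") {
      resolved[role] = agentId;
    }
  }
  return resolved;
}

// A wave that renames only its integration steward keeps every other default.
console.log(resolveClosureRoles({ integration: "I1" }));
```

Resolving the bindings once and passing the result to every consumer is one way to keep derived state, retry planning, and closure sequencing consistent with each other.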
 ## Durable State Surfaces
 The runtime writes several different artifacts, but they do different jobs:
@@ -52,6 +63,8 @@ The runtime writes several different artifacts, but they do different jobs:
 The important rule is that decisions come from the canonical authority set: wave definitions, the coordination log, the control-plane log, and immutable result envelopes. The markdown board is a projection for humans. See [wave-orchestrator.md](../plans/wave-orchestrator.md).
+That control-plane log also carries observed `wave_run`, `attempt`, and `agent_run` lifecycle facts from `session-supervisor.mjs`. When human feedback or escalation remains open, the reducer materializes the wave as `clarifying` with blocked `waveState` instead of flattening it into generic progress.
 Live waves now keep refreshing that derived state while agents are still running. Shared summaries, inboxes, dashboard coordination metrics, and clarification routing are not only recomputed at attempt boundaries; they are also refreshed during active wave execution so stale clarification and acknowledgement timing is machine-visible before the attempt ends.
 ## What Agents Should Use
@@ -61,7 +74,7 @@ Use the coordination log for conversational or workflow state:
 - `request`
   Use this when you need another agent or capability owner to do work. Target it explicitly. This is the kind that becomes a helper assignment.
 - `blocker`
-  Use this when the wave is blocked, but not because the launcher needs to route work to a specific assignee.
+  Use this when the wave is blocked, but not because the runtime needs to route work to a specific assignee.
 - `handoff`
   Use this for continuity and context transfer. This is informative by itself; it is not the same as a blocking helper assignment.
 - `evidence`
@@ -164,7 +177,7 @@ pnpm exec wave control task create \
 What happens next:
 - the request lands in the canonical coordination log
-- the launcher derives a helper assignment for `agent:A8`
+- the runtime derives a helper assignment for `agent:A8`
 - that assignment is written into the assignment snapshot
 - the shared summary and A8 inbox now show the open helper work
@@ -230,9 +243,9 @@ pnpm exec wave coord post \
 What happens next:
-1. the launcher triages the clarification from repo policy, ownership, prior decisions, and routing context
+1. the orchestrator triages the clarification from repo policy, ownership, prior decisions, and routing context
 2. if it can answer inside the wave, it writes the resolution back into coordination state
-3. if another owner can answer it, the launcher opens a targeted follow-up request and keeps the clarification chain blocking
+3. if another owner can answer it, the runtime opens a targeted follow-up request and keeps the clarification chain blocking
 4. only after policy and routed follow-up paths are exhausted does it create human feedback or escalation artifacts
 5. until that chain is resolved, clarification remains a closure barrier and any routed follow-up also remains blocking helper work
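
The escalation order in that list can be read as a simple decision ladder. The following sketch is only an illustration: the function name `triageClarification`, the input shape, and the returned action labels are hypothetical and do not correspond to identifiers in the package.

```javascript
// Hypothetical decision ladder for a clarification request.
// `policyAnswer`: a resolution found from repo policy, ownership, prior decisions, or routing context.
// `routableOwner`: another owner who could answer the question, if one exists.
function triageClarification({ policyAnswer = null, routableOwner = null } = {}) {
  if (policyAnswer !== null) {
    // Step 2: answerable inside the wave, so the resolution goes back into coordination state.
    return { action: "resolve-in-wave", resolution: policyAnswer, blocking: false };
  }
  if (routableOwner !== null) {
    // Step 3: open a targeted follow-up request; the clarification chain stays blocking.
    return { action: "route-follow-up", target: routableOwner, blocking: true };
  }
  // Step 4: only now fall back to human feedback or escalation.
  // Step 5: the unresolved chain remains a closure barrier.
  return { action: "human-feedback", blocking: true };
}

console.log(triageClarification({ routableOwner: "agent:A8" }).action); // route-follow-up
```

Checking the in-wave answer first and the human path last mirrors the stated intent that human feedback is created only after the policy and routed follow-up paths are exhausted.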
@@ -406,7 +419,7 @@ That gives Wave two useful properties:
 ## Targeted Retry Behavior
-When closure fails, the launcher does not always relaunch the entire wave.
+When closure fails, the runtime does not always relaunch the entire wave.
 It tries to relaunch only the implicated owners:
@@ -425,7 +438,7 @@ pnpm exec wave control rerun get --lane main --wave 10 --json
 pnpm exec wave control rerun request --lane main --wave 10 --agent A2 --agent A7 --clear-reuse A2 --reason "Resume sibling-owned component closure"
 ```
-The canonical rerun request is written under `.tmp/<lane>-wave-launcher/control-plane/`, projected to `.tmp/<lane>-wave-launcher/control/` for compatibility, consumed by the launcher on the next retry decision, and then cleared by default after one application. This is the supported path for:
+The canonical rerun request is written under `.tmp/<lane>-wave-launcher/control-plane/`, projected to `.tmp/<lane>-wave-launcher/control/` for compatibility, consumed by the retry engine on the next retry decision, and then cleared by default after one application. This is the supported path for:
 - rerunning only specific owners
 - preserving explicit reuse selectors such as attempt ids, proof bundle ids, derived-summary reuse, and invalidated component ids through the compatibility projection