@chllming/wave-orchestration 0.6.1 → 0.6.2

package/CHANGELOG.md CHANGED
@@ -2,6 +2,15 @@
2
2
 
3
3
  ## Unreleased
4
4
 
5
+ ## 0.6.2 - 2026-03-22
6
+
7
+ - Added first-class `claude.effort` support across config profiles, lane overrides, and per-agent `### Executor` blocks, and now emit `--effort` in Claude launch previews and live runs.
8
+ - Clarified operator runtime visibility with additive `launch-preview.json` `limits` metadata, including explicit known turn ceilings for Claude/OpenCode and explicit Codex opacity when Wave does not emit a turn-limit flag.
9
+ - Clarified dashboard and terminal UX: global wave counts now distinguish done, active, pending, and failed agents; the current-wave dashboard keeps a stable terminal name; and TTY dashboards use simple color cues for faster scanning.
10
+ - Pruned stale dry-run executor preview directories when wave agent sets shrink, so manual inspection of `.tmp/.../dry-run/executors/` matches the current manifest.
11
+ - Preserved already-landed implementation slices for shared promoted components by retrying only the sibling owners that still owe closure proof instead of blindly replaying the landed owner.
12
+ - Added release-surface alignment regression coverage and updated the shipped docs so README, runtime-config references, changelog, and release metadata match the `0.6.2` package surface.
13
+
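To make the `claude.effort` entry above concrete, here is a minimal sketch of how resolved settings could be turned into Claude launch arguments. The function name `buildClaudeArgs` and the shape of the `settings` object are hypothetical and are not taken from the Wave source. Only the `--effort` and `--max-turns` flags, the `low|medium|high|max` values, and the `claude -p --no-session-persistence` invocation come from the docs in this diff.

```javascript
// Hypothetical sketch, not Wave's actual implementation.
// Effort levels as listed in the Claude runtime-config table.
const CLAUDE_EFFORT_LEVELS = ["low", "medium", "high", "max"];

function buildClaudeArgs(settings = {}) {
  // Base flags documented for headless Claude launches.
  const args = ["-p", "--no-session-persistence"];

  if (settings.effort !== undefined) {
    if (!CLAUDE_EFFORT_LEVELS.includes(settings.effort)) {
      throw new Error(`Unsupported claude.effort: ${settings.effort}`);
    }
    args.push("--effort", settings.effort);
  }

  // Only emit a turn ceiling when one was actually resolved.
  if (Number.isInteger(settings.maxTurns)) {
    args.push("--max-turns", String(settings.maxTurns));
  }
  return args;
}

console.log(buildClaudeArgs({ effort: "high", maxTurns: 12 }).join(" "));
```

Rejecting unknown effort values early keeps a typo in a profile from surfacing only at launch time. Whether Wave validates at this point is an assumption.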
5
14
  ## 0.6.1 - 2026-03-22
6
15
 
7
16
  - Published the post-merge `main` source as `0.6.1` so the default branch, tagged source, and package docs all agree on the current release.
package/README.md CHANGED
@@ -1,48 +1,91 @@
1
1
  # Wave Orchestration
2
2
 
3
- Wave Orchestration is a repository harness for running multi-agent work in bounded waves. You define shared plan docs plus per-wave markdown, the launcher validates the wave, compiles prompts and inboxes, runs implementation agents first, then performs staged closure. Every run writes durable state under `.tmp/<lane>-wave-launcher/` so humans can inspect progress, replay outcomes, and intervene only when needed.
3
+ Wave Orchestration is my framework for "vibe-coding." It keeps the speed of agentic coding, but makes the runtime, coordination, and context model explicit enough to inspect, replay, and improve.
4
+
5
+ The framework does three things:
6
+
7
+ 1. It abstracts the agent runtime away without flattening everything to the lowest common denominator. The same waves, skills, planning, evaluation, proof, and traces can run across Claude, Codex, and OpenCode while still preserving runtime-native features through executor adapters.
8
+ 2. It runs work as a blackboard-style multi-agent system. Agents do not just exchange chat messages; they work against shared state, generated inboxes, explicit ownership, and staged closure, and a wave keeps going until the declared goals, proof, production-live criteria, or eval targets are actually satisfied.
9
+ 3. It compiles context dynamically for the task at hand. Shared memory, generated runtime files, project defaults, skills, Context7, and cached external docs are assembled at runtime so you do not have to hand-maintain separate Claude, Codex, or other context files.
10
+
11
+ ## Core Ideas
12
+
13
+ - `One orchestrator, many runtimes.`
14
+ Planning, skills, evals, proof, and traces stay constant while the executor adapter changes.
15
+ - `A blackboard-style multi-agent system.`
16
+ The coordination log is canonical shared state; the rolling board, shared summary, inboxes, ledger, and integration views are generated projections over that state.
17
+ - `Completion is goal-driven and proof-bounded.`
18
+ Waves close only when deliverables, proof artifacts, eval targets, dependencies, and closure stewards agree.
19
+ - `Context is compiled, not hand-maintained.`
20
+ Wave builds runtime context from repo state, project memory, skills, Context7, and generated overlays.
21
+ - `The system is inspectable and replayable.`
22
+ Dry-run previews, logs, dashboards, ledgers, traces, and replay make the system debuggable instead of mysterious.
23
+
24
+ ## How The Architecture Works
25
+
26
+ 1. Define shared docs plus `docs/plans/waves/wave-<n>.md` files, or generate them with `wave draft`.
27
+ 2. Run `wave launch --dry-run` to validate the wave and materialize prompts, shared summaries, inboxes, dashboards, and executor previews before any live execution.
28
+ 3. During live execution, implementation agents write claims, evidence, requests, and decisions into the canonical coordination log instead of relying on ad hoc terminal narration.
29
+ 4. The launcher compiles blackboard projections from that state: rolling board, shared summary, per-agent inboxes, ledger, docs queue, dependency views, and integration summaries.
30
+ 5. Closure runs only when the integrated state is ready: optional `cont-EVAL` (`E0`), optional security review, integration (`A8`), documentation (`A9`), and `cont-QA` (`A0`).
31
+
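The closure ordering in step 5 can be sketched as a small function. This is an illustrative sketch only: `closureStages` and its option names are hypothetical. The stage ids `E0`, `A8`, `A9`, and `A0` are the ones named in the README. The README does not give the security reviewer an agent id, so the sketch does not assume one.

```javascript
// Hypothetical sketch of the staged-closure order described above.
function closureStages({ hasContEval = false, hasSecurityReview = false } = {}) {
  const stages = [];

  // Optional stages run first, and only when the wave declares them.
  if (hasContEval) stages.push({ id: "E0", role: "cont-EVAL" });
  if (hasSecurityReview) stages.push({ id: null, role: "security-review" });

  // Required stages always run in this order.
  stages.push(
    { id: "A8", role: "integration" },
    { id: "A9", role: "documentation" },
    { id: "A0", role: "cont-QA" }
  );
  return stages;
}

console.log(closureStages({ hasContEval: true }).map((s) => s.role).join(" -> "));
```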
32
+ ## Architecture Surfaces
33
+
34
+ - `Wave contract`
35
+ Shared plan docs, wave markdown, deliverables, proof artifacts, and eval targets define the goal.
36
+ - `Shared state`
37
+ The coordination log is the source of truth; the board is for humans, not the scheduler.
38
+ - `Runtime abstraction`
39
+ Executor adapters preserve Codex, Claude, and OpenCode-specific launch features without changing the higher-level wave contract.
40
+ - `Compiled context`
41
+ Project profile memory, shared summary, inboxes, skills, Context7, and runtime overlays are generated for the chosen executor.
42
+ - `Proof and closure`
43
+ Exit contracts, proof artifacts, eval markers, and closure stewards stop waves from closing on narrative-only PASS.
44
+ - `Replay and audit`
45
+ Traces capture the attempt so failures can be inspected and replayed instead of guessed from screenshots.
4
46
 
5
- ## How It Works
47
+ ## Example Output
6
48
 
7
- 1. Write shared docs and one or more `docs/plans/waves/wave-<n>.md` files.
8
- 2. Run `wave launch --dry-run` to validate the wave and materialize prompts, inboxes, dashboards, and executor previews.
9
- 3. A real launch runs implementation agents first. Agents post claims, evidence, requests, and decisions into the coordination log and rolling message board.
10
- 4. When implementation gates pass, closure runs in order: optional `cont-EVAL` (`E0`), integration (`A8`), documentation (`A9`), and `cont-QA` (`A0`).
11
- 5. Operators use the generated ledgers, inboxes, feedback queue, dependency views, and traces instead of guessing from raw terminal output.
49
+ Representative rolling message board output from a real wave run:
12
50
 
13
- ## Features
51
+ <img src="./docs/image.png" alt="Example rolling message board output showing claims, evidence, requests, and cont-QA closure for a wave run" width="100%" />
14
52
 
15
- - Planner foundation with saved project profile memory, draft specs, and rendered wave markdown
16
- - Implementation-first execution with staged closure and retry support
17
- - Durable coordination log, rolling message board, compiled inboxes, and per-wave ledger
18
- - Dry-run prompt and executor preview mode before any real agent launch
19
- - Context7 bundle selection, caching, and prompt injection
20
- - Multi-executor support for Codex, Claude Code, OpenCode, and a local smoke executor
21
- - Cross-runtime skill packs loaded from `skills/` and resolved by lane, role, runtime, deploy kind, and per-agent attachment
22
- - Human feedback routing, clarification triage, helper assignment, and cross-lane dependencies
23
- - Replayable trace bundles for regression and release verification
53
+ ## Common MAS Failure Cases
24
54
 
25
- ## Example Output
55
+ Recent multi-agent research keeps returning to the same failure modes:
26
56
 
27
- Representative rolling message board output from a real wave run:
57
+ - `Cosmetic board, no canonical state`
58
+ Agents appear coordinated, but there is no machine-trustable source of truth underneath the conversation.
59
+ - `Hidden evidence never gets pooled`
60
+ One agent has the critical fact, but it never reaches shared state before closure.
61
+ - `Communication without global-state reconstruction`
62
+ Agents exchange information, but nobody reconstructs the correct cross-agent picture.
63
+ - `Simultaneous coordination collapse`
64
+ A team that looks fine in serial work falls apart when multiple owners, blockers, or resources must move together.
65
+ - `Expert signal gets averaged away`
66
+ The strongest specialist view is diluted into a weaker compromise.
67
+ - `Contradictions get smoothed over`
68
+ Conflicts are narrated away instead of being turned into explicit repair work.
69
+ - `Premature closure`
70
+ Agents say they are done before proof, evals, or integrated state actually support PASS.
28
71
 
29
- <img src="./docs/image.png" alt="Example rolling message board output showing claims, evidence, requests, and cont-QA closure for a wave run" width="100%" />
72
+ Wave is built to mitigate those failures with canonical shared state, generated blackboard projections, explicit ownership, goal-driven and proof-bounded closure, and replayable traces. For the research framing and the current gaps, see [docs/research/coordination-failure-review.md](./docs/research/coordination-failure-review.md).
30
73
 
31
74
  ## Quick Start
32
75
 
33
76
  Current release:
34
77
 
35
- - `@chllming/wave-orchestration@0.6.1`
36
- - Release tag: [`v0.6.1`](https://github.com/chllming/wave-orchestration/releases/tag/v0.6.1)
78
+ - `@chllming/wave-orchestration@0.6.2`
79
+ - Release tag: [`v0.6.2`](https://github.com/chllming/wave-orchestration/releases/tag/v0.6.2)
37
80
  - Public install path: npmjs
38
81
  - Authenticated fallback: GitHub Packages
39
82
 
40
- Highlights in `0.6.1`:
83
+ Highlights in `0.6.2`:
41
84
 
42
- - `cont-EVAL` (`E0`) is now a first-class optional eval stage before integration, separate from final `cont-QA` closure.
43
- - Optional security review now has a dedicated role, report path, and `[wave-security]` closure marker.
44
- - `wave adhoc plan|run|show|promote` now supports transient operator requests on the same launcher substrate.
45
- - Starter docs and skills now cover the current `0.6.1` closure, benchmark, security, and provider surfaces.
85
+ - Runtime previews and docs now expose first-class Claude effort plus structured limit metadata, making known Claude/OpenCode ceilings explicit and Codex opacity explicit.
86
+ - The global dashboard and VS Code terminal surfaces are easier to read: active vs pending counts are distinct, the current-wave dashboard keeps a stable terminal name, and TTY dashboards now use simple color cues.
87
+ - Dry-run executor preview directories now prune stale agent folders when a wave shrinks.
88
+ - Shared promoted-component retries now preserve already-landed owner slices and relaunch only the sibling owners still needed for closure.
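The dry-run pruning highlight above amounts to a set difference between what is on disk and what the current manifest declares. The sketch below assumes one preview directory per agent id. That layout, the function name `staleExecutorDirs`, and the example ids are all hypothetical. It only computes which directories would be stale and leaves the actual removal to the caller.

```javascript
// Hypothetical sketch: find preview directories that no longer match the manifest.
// Assumes each preview directory is named after an agent id (an assumption).
function staleExecutorDirs(existingDirs, manifestAgentIds) {
  const current = new Set(manifestAgentIds);
  return existingDirs.filter((name) => !current.has(name));
}

// Example: the wave shrank from three agents to two.
console.log(staleExecutorDirs(["A1", "A2", "A8"], ["A1", "A8"]));
```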
46
89
 
47
90
  Requirements:
48
91
 
@@ -59,7 +102,7 @@ pnpm add -D @chllming/wave-orchestration
59
102
  pnpm exec wave init
60
103
  pnpm exec wave doctor
61
104
  pnpm exec wave launch --lane main --dry-run --no-dashboard
62
- pnpm exec wave coord show --lane main --wave 0 --dry-run
105
+ pnpm exec wave coord show --lane main --wave 0 --dry-run --json
63
106
  ```
64
107
 
65
108
  If the repo already has Wave config, plans, or waves you want to keep:
@@ -99,14 +142,16 @@ node scripts/wave.mjs launch --lane main --dry-run --no-dashboard
99
142
  ## Learn More
100
143
 
101
144
  - [docs/README.md](./docs/README.md): docs map and suggested structure
102
- - [docs/concepts/what-is-a-wave.md](./docs/concepts/what-is-a-wave.md): wave anatomy, lifecycle, and closure model
145
+ - [docs/concepts/what-is-a-wave.md](./docs/concepts/what-is-a-wave.md): wave anatomy, blackboard execution model, and proof-bounded closure
146
+ - [docs/concepts/runtime-agnostic-orchestration.md](./docs/concepts/runtime-agnostic-orchestration.md): how one orchestration substrate spans Claude, Codex, OpenCode, and local execution
147
+ - [docs/concepts/context7-vs-skills.md](./docs/concepts/context7-vs-skills.md): compiled context, external truth, and repo-owned operating knowledge
103
148
  - [docs/guides/planner.md](./docs/guides/planner.md): `wave project` and `wave draft` workflow
104
- - [docs/concepts/context7-vs-skills.md](./docs/concepts/context7-vs-skills.md): when to use external docs vs repo-owned skills
105
149
  - [docs/guides/terminal-surfaces.md](./docs/guides/terminal-surfaces.md): tmux, VS Code terminal registry, and dry-run surfaces
106
150
  - [docs/plans/wave-orchestrator.md](./docs/plans/wave-orchestrator.md): operator runbook
107
151
  - [docs/plans/context7-wave-orchestrator.md](./docs/plans/context7-wave-orchestrator.md): Context7 setup and bundle authoring
108
152
  - [docs/reference/runtime-config/README.md](./docs/reference/runtime-config/README.md): executor, runtime, and skill-projection configuration
109
153
  - [docs/reference/skills.md](./docs/reference/skills.md): skill bundle format, resolution order, and runtime projection
154
+ - [docs/research/coordination-failure-review.md](./docs/research/coordination-failure-review.md): MAS failure modes from the research and how Wave responds
110
155
  - [CHANGELOG.md](./CHANGELOG.md): release history
111
156
 
112
157
  ## Research Sources
package/docs/README.md CHANGED
@@ -1,6 +1,12 @@
1
1
  # Wave Documentation
2
2
 
3
- This repository now uses a layered docs structure, but the useful path is journey-first:
3
+ These docs are organized around three core ideas:
4
+
5
+ - one orchestrator, many runtimes across Claude, Codex, OpenCode, and local execution
6
+ - a blackboard-style multi-agent system with goal-driven, proof-bounded closure
7
+ - compiled context from shared state, skills, runtime files, and Context7 instead of hand-maintained per-runtime context files
8
+
9
+ The useful path is journey-first:
4
10
 
5
11
  - start with one core concept doc
6
12
  - then use one end-to-end workflow guide
@@ -22,7 +28,11 @@ This repository now uses a layered docs structure, but the useful path is journe
22
28
  ## Start Here
23
29
 
24
30
  - New to Wave:
25
- Read [concepts/what-is-a-wave.md](./concepts/what-is-a-wave.md). It now covers the core execution model, runtime posture, closure, and state model in one place.
31
+ Read [concepts/what-is-a-wave.md](./concepts/what-is-a-wave.md). It covers the blackboard execution model, proof-bounded closure, runtime posture, and durable state model in one place.
32
+ - Want the runtime abstraction story:
33
+ Read [concepts/runtime-agnostic-orchestration.md](./concepts/runtime-agnostic-orchestration.md) to see how planning, skills, evals, proof, and traces stay stable across Claude, Codex, OpenCode, and local execution.
34
+ - Want the context story:
35
+ Read [concepts/context7-vs-skills.md](./concepts/context7-vs-skills.md) for the compiled-context model: shared summary, inboxes, project defaults, skills, Context7, and runtime overlays.
26
36
  - Drafting or revising waves:
27
37
  Read [guides/author-and-run-waves.md](./guides/author-and-run-waves.md), then use [plans/wave-orchestrator.md](./plans/wave-orchestrator.md) as the operator runbook.
28
38
  - Adding a security review pass:
@@ -37,8 +47,10 @@ This repository now uses a layered docs structure, but the useful path is journe
37
47
  Start with [guides/author-and-run-waves.md](./guides/author-and-run-waves.md), then use [plans/wave-orchestrator.md](./plans/wave-orchestrator.md) for the live operator flow.
38
48
  - Tuning runtime behavior:
39
49
  Read [reference/runtime-config/README.md](./reference/runtime-config/README.md) and [reference/skills.md](./reference/skills.md).
50
+ - Want the research framing behind the design:
51
+ Read [research/coordination-failure-review.md](./research/coordination-failure-review.md) for the common MAS failure modes and how Wave tries to mitigate them, then use [research/agent-context-sources.md](./research/agent-context-sources.md) as the bibliography.
40
52
  - Looking for supporting concept pages:
41
- Use [concepts/runtime-agnostic-orchestration.md](./concepts/runtime-agnostic-orchestration.md), [concepts/operating-modes.md](./concepts/operating-modes.md), and [concepts/context7-vs-skills.md](./concepts/context7-vs-skills.md) after the main concept and workflow docs.
53
+ Use [concepts/operating-modes.md](./concepts/operating-modes.md) after the main concept, runtime, and context docs.
42
54
 
43
55
  ## Package vs Repo-Owned Material
44
56
 
@@ -4,6 +4,30 @@ Context7 and skills solve different problems.
4
4
 
5
5
  Use Context7 for external library truth. Use skills for repo-owned, reusable operating knowledge.
6
6
 
7
+ That comparison matters because Wave treats context as something to compile at runtime, not something humans should maintain separately for Claude, Codex, OpenCode, and every other executor.
8
+
9
+ ## Compiled Context, Not Hand-Maintained Context Files
10
+
11
+ The active context for an agent is assembled from multiple layers:
12
+
13
+ - repository source and the wave's owned files
14
+ - wave markdown and shared plan docs
15
+ - generated shared summary and per-agent inbox
16
+ - saved project defaults such as `.wave/project-profile.json`
17
+ - resolved repo-owned skills
18
+ - selected Context7 snippets for external library truth
19
+ - generated runtime overlays and launch artifacts
20
+
21
+ Because of that, the question is not "which hand-written context file does this runtime use?" The question is "which context sources does this wave compile for the selected runtime right now?"
22
+
23
+ Runtime-specific context is still real, but it is mostly generated:
24
+
25
+ - Claude gets merged system-prompt and settings overlays
26
+ - Codex gets executor flags plus runtime-projected skills
27
+ - OpenCode gets generated config, attachments, and runtime instructions
28
+
29
+ That keeps the context model unified even when the transport layer differs.
30
+
7
31
  ## Short Version
8
32
 
9
33
  - Context7
@@ -1,15 +1,22 @@
1
1
  # Runtime-Agnostic Orchestration
2
2
 
3
+ In short: one orchestrator, many runtimes.
4
+
3
5
  Wave is runtime agnostic at the orchestration layer.
4
6
 
5
- That means planning, coordination, closure, and traces do not depend on whether the selected executor is Codex, Claude Code, OpenCode, or the local smoke executor.
7
+ That means planning, skills, evaluation, proof, coordination, closure, and traces do not depend on whether the selected executor is Codex, Claude Code, OpenCode, or the local smoke executor.
8
+
9
+ Wave abstracts the runtime away without flattening everything to the lowest common denominator. The wave contract stays stable while the executor adapter preserves the useful runtime-native features.
6
10
 
7
11
  ## What Stays The Same Across Runtimes
8
12
 
9
13
  These layers are runtime-neutral:
10
14
 
11
15
  - wave parsing and validation
16
+ - planner-produced wave specs and authored wave markdown
17
+ - eval targets, deliverables, and proof artifacts
12
18
  - component and closure gates
19
+ - skill resolution and attachment policy
13
20
  - compiled shared summaries and per-agent inboxes
14
21
  - coordination log and rendered message board
15
22
  - helper assignments and dependency handling
@@ -34,11 +41,19 @@ Runtime-specific behavior is isolated to the executor adapter layer:
34
41
 
35
42
  The orchestration substrate above those adapters does not need to know how the runtime transports prompts.
36
43
 
44
+ This is the important distinction:
45
+
46
+ - the orchestration layer owns goals, ownership, proof, and shared state
47
+ - the executor adapter owns prompt transport, runtime-native flags, files, and settings
48
+
49
+ That split is what lets Wave stay portable without giving up runtime-specific leverage.
50
+
37
51
  ## Why This Matters
38
52
 
39
53
  Runtime agnosticism gives you:
40
54
 
41
- - the same plan and closure model across vendors
55
+ - the same plan, skill, and closure model across vendors
56
+ - the same eval and proof model across vendors
42
57
  - replay and audit surfaces that do not care which runtime produced the work
43
58
  - per-role runtime choice without rewriting authoring conventions
44
59
  - retry-time fallback without inventing a second planning model
@@ -2,6 +2,8 @@
2
2
 
3
3
  A wave is the main planning and execution unit in Wave Orchestration.
4
4
 
5
+ It turns free-form agent runs into a bounded blackboard-style work package with shared state, explicit ownership, dynamic context, goal-driven execution, and proof-bounded closure.
6
+
5
7
  It is not just a prompt file. A wave is a bounded slice of repository work with:
6
8
 
7
9
  - explicit scope
@@ -34,6 +36,16 @@ Waves force a higher planning bar than ad hoc prompts. A good wave answers:
34
36
  - What evidence closes the wave?
35
37
  - Which dependencies, helper requests, or escalations can still block completion?
36
38
 
39
+ ## Why This Is A Blackboard-Style Model
40
+
41
+ Wave is blackboard-style because agents work against shared state instead of treating chat output as the system of record.
42
+
43
+ - the canonical coordination log is the machine-readable source of truth
44
+ - the rolling board is a human projection over that state, not the scheduler's authority
45
+ - shared summaries and per-agent inboxes are compiled views over the same state
46
+ - helper assignments, clarification flow, dependencies, and integration all operate on that shared state
47
+ - closure depends on the integrated state, not on whether an agent says "done"
48
+
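The relationship between the canonical log and its projections can be illustrated with a toy example. The entry shape (`kind`, `from`, `to`, `text`) and both function names are hypothetical and are not Wave's schema. The point is only that the board and the inboxes are derived views that can be regenerated from the log at any time.

```javascript
// Hypothetical sketch: projections are pure functions of the coordination log.
// Assumed entry shape: { kind, from, to, text }, where `to` is optional.

// Human-facing board: one rendered line per log entry.
function projectBoard(log) {
  return log.map((e) => {
    const target = e.to ? ` -> ${e.to}` : "";
    return `[${e.kind}] ${e.from}${target}: ${e.text}`;
  });
}

// Per-agent inboxes: only entries addressed to a specific agent.
function projectInboxes(log) {
  const inboxes = new Map();
  for (const entry of log) {
    if (!entry.to) continue; // unaddressed entries appear on the board only
    if (!inboxes.has(entry.to)) inboxes.set(entry.to, []);
    inboxes.get(entry.to).push(entry);
  }
  return inboxes;
}

const exampleLog = [
  { kind: "claim", from: "A1", text: "owning the parser slice" },
  { kind: "request", from: "A2", to: "A1", text: "need the parser types exported" },
];
console.log(projectBoard(exampleLog).join("\n"));
console.log(projectInboxes(exampleLog).get("A1").length);
```

Because both views are computed from the same log, neither can drift from the state the scheduler relies on.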
37
49
  ## Wave Anatomy
38
50
 
39
51
  Wave markdown is the authored execution surface today. A typical wave can include:
@@ -136,6 +148,22 @@ Current live waves are strict about closure artifacts:
136
148
  - `cont-QA` must emit both a final `Verdict:` line and a final `[wave-gate]` marker.
137
149
  - Replay keeps read-only compatibility with older traces and older evaluator-era artifacts, but live waves do not pass on verdict-only or underspecified closure markers.
138
150
 
151
+ ## Context Is Compiled At Runtime
152
+
153
+ Wave also treats context as something to compile for the current task, not something humans should hand-maintain separately for each runtime.
154
+
155
+ The active context for an agent is assembled from:
156
+
157
+ - repository source and owned files
158
+ - wave markdown and shared plan docs
159
+ - saved project defaults such as `.wave/project-profile.json`
160
+ - the generated shared summary and the agent's inbox
161
+ - resolved skills and runtime-specific skill projections
162
+ - selected Context7 snippets for external library truth
163
+ - generated executor overlays and launch artifacts
164
+
165
+ That is why switching an agent between Codex, Claude, or OpenCode does not require maintaining separate parallel context files. The orchestrator recomputes the context package for the selected runtime and the current wave state.
166
+
139
167
  ## What Makes A Wave "Done"
140
168
 
141
169
  A wave is not done because an agent said so. It is done only when the runtime surfaces agree:
@@ -16,6 +16,8 @@ The catalog is reference metadata, not a run-history database. It tells the wave
16
16
 
17
17
  For a full authored wave example that uses these patterns, see [docs/reference/sample-waves.md](../reference/sample-waves.md).
18
18
 
19
+ These benchmark families are also Wave's operator-facing vocabulary for common MAS failure modes. For the research-side framing and the current architectural gaps, see [docs/research/coordination-failure-review.md](../research/coordination-failure-review.md).
20
+
19
21
  ## Migrating From Legacy Evaluator Waves
20
22
 
21
23
  If your `0.5.4`-era repo still talks about a single `evaluator` role, split that surface before adopting `0.6.1`:
@@ -45,6 +45,8 @@ Use `tmux` when:
45
45
 
46
46
  By default the launcher can start per-wave dashboard sessions in tmux.
47
47
 
48
+ When `--terminal-surface vscode` is active, Wave also maintains a stable current-wave dashboard terminal entry instead of creating a new wave-numbered dashboard attach target for every wave transition.
49
+
48
50
  Important flags:
49
51
 
50
52
  - `--no-dashboard`
@@ -4,6 +4,14 @@ The Wave Orchestrator coordinates repository work as bounded execution waves.
4
4
 
5
5
  For the broader docs map, concept pages, and workflow guides, start at [docs/README.md](../README.md).
6
6
 
7
+ This runbook is the operational view of the architecture:
8
+
9
+ - one wave contract defines goals, ownership, proof, and closure
10
+ - one canonical coordination log acts as the shared blackboard state
11
+ - generated board, shared summary, inboxes, ledger, and integration outputs are projections over that state
12
+ - executor adapters preserve Claude, Codex, and OpenCode-specific runtime features at the edge
13
+ - closure makes completion depend on integrated proof and shared state, not on free-form agent narration
14
+
7
15
  ## What It Does
8
16
 
9
17
  - parses wave plans from `docs/plans/waves/`
@@ -260,7 +268,7 @@ The launcher entrypoint in `scripts/wave-orchestrator/launcher.mjs` now delegate
260
268
  - Skills resolve only after that executor choice is known. Runtime-specific skill overlays are regenerated whenever retry-time fallback changes the selected executor.
261
269
  - Runtime mix targets are enforced before launch and again before any retry-time fallback reassignment.
262
270
  - Fallbacks are declared in profiles or lane policy, can be applied automatically on retry when the next executor is available and still satisfies mix targets, and are recorded in the ledger, integration summary, and traces when used.
263
- - Generic `budget.minutes` caps per-agent attempt timeouts. Generic `budget.turns` seeds `claude.maxTurns` and `opencode.steps` when executor-specific values are not set.
271
+ - Generic `budget.minutes` caps per-agent attempt timeouts. Generic `budget.turns` seeds `claude.maxTurns` and `opencode.steps` when executor-specific values are not set; Codex turn ceilings remain external to Wave and show up in preview metadata as opaque when Wave cannot inspect them.
264
272
  - The launcher writes runtime overlay files under `.tmp/<lane>-wave-launcher/executors/`; these should stay ignored and local.
265
273
 
266
274
  Runtime authoring examples:
@@ -294,7 +302,7 @@ Runtime authoring examples:
294
302
  - opencode.config_json: {"instructions":["Keep shared-plan edits concise."]}
295
303
  ````
296
304
 
297
- Dry-run is the intended validation path for these runtime surfaces. `wave launch --dry-run --no-dashboard` now writes compiled prompts, merged runtime overlays, and `launch-preview.json` files under `.tmp/<lane>-wave-launcher/dry-run/` so the harness can verify invocation shape without requiring the executor binaries to run.
305
+ Dry-run is the intended validation path for these runtime surfaces. `wave launch --dry-run --no-dashboard` now writes compiled prompts, merged runtime overlays, and `launch-preview.json` files under `.tmp/<lane>-wave-launcher/dry-run/` so the harness can verify invocation shape, attempt budgets, and known or opaque turn-limit metadata without requiring the executor binaries to run.
298
306
 
299
307
  ## Human Feedback Queue
300
308
 
@@ -308,7 +316,7 @@ pnpm exec wave feedback respond --id <request-id> --response "..."
308
316
 
309
317
  ## Closure Sweep
310
318
 
311
- If implementation agents ran, the launcher does not stop at `exit 0`. It checks implementation exit contracts, promoted component proof, helper assignments, required dependencies, and the integration recommendation first. When present, `cont-EVAL` must satisfy its declared eval targets before integration can close. Optional security review then runs before integration so the reviewer can publish findings and approval-sensitive actions while the wave is still active. In the default planner shape `E0` is report-only; if a wave explicitly assigns `E0` non-report files, the launcher also applies the normal implementation proof gates to that role. Security reviewers stay report-only by default. Documentation and cont-QA closure only run after integration is explicitly ready for doc closure; if `cont-EVAL`, security review, or integration reports more work, or if helper assignments or required dependency tickets remain open, the wave stops there and retries only the implicated owners plus the relevant closure steward.
319
+ If implementation agents ran, the launcher does not stop at `exit 0`. It checks implementation exit contracts, promoted component proof, helper assignments, required dependencies, and the integration recommendation first. When present, `cont-EVAL` must satisfy its declared eval targets before integration can close. Optional security review then runs before integration so the reviewer can publish findings and approval-sensitive actions while the wave is still active. In the default planner shape `E0` is report-only; if a wave explicitly assigns `E0` non-report files, the launcher also applies the normal implementation proof gates to that role. Security reviewers stay report-only by default. Documentation and cont-QA closure only run after integration is explicitly ready for doc closure; if `cont-EVAL`, security review, or integration reports more work, or if helper assignments or required dependency tickets remain open, the wave stops there and retries only the implicated owners plus the relevant closure steward. When multiple implementation agents share a promoted component, owners that already landed valid proof stay reusable while the launcher retries only the sibling owners that still owe closure evidence.
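The shared promoted-component rule at the end of that paragraph can be sketched as a filter over a component's owners. The `owners` shape, the `hasValidProof` field, and the function name are hypothetical. The launcher's real proof checks are not shown in this diff.

```javascript
// Hypothetical sketch: retry only the owners that still owe closure evidence.
// Assumed owner shape: { agentId, hasValidProof }.
function ownersToRetry(owners) {
  return owners
    .filter((owner) => !owner.hasValidProof) // landed owners stay reusable
    .map((owner) => owner.agentId);
}

const sharedComponentOwners = [
  { agentId: "A1", hasValidProof: true },  // already landed, not replayed
  { agentId: "A2", hasValidProof: false }, // still owes closure proof
];
console.log(ownersToRetry(sharedComponentOwners));
```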
312
320
 
313
321
  Live closure is fail-closed:
314
322
 
@@ -1,6 +1,6 @@
1
1
  # Runtime Configuration Reference
2
2
 
3
- This directory is the canonical reference for executor configuration in Wave `0.6.1`.
3
+ This directory is the canonical reference for executor configuration in the packaged Wave release.
4
4
 
5
5
  Use it when you need the full supported surface for:
6
6
 
@@ -65,7 +65,7 @@ These fields are shared across runtimes:
65
65
  | Model | `model` in profile, `executors.claude.model`, `executors.opencode.model` | `model` | Codex uses shared `model` from profile or agent only |
66
66
  | Fallbacks | `fallbacks` in profile | `fallbacks` | Runtime ids used for retry-time reassignment |
67
67
  | Tags | `tags` in profile | `tags` | Stored in resolved executor state for policy and traces |
68
- | Budget turns | `budget.turns` in profile | `budget.turns` | Seeds Claude `maxTurns` and OpenCode `steps` when runtime-specific values are absent |
68
+ | Budget turns | `budget.turns` in profile | `budget.turns` | Seeds Claude `maxTurns` and OpenCode `steps` when runtime-specific values are absent; it does not set a Codex turn limit |
69
69
  | Budget minutes | `budget.minutes` in profile | `budget.minutes` | Caps attempt timeout |
70
70
 
71
71
  ## Runtime Pages
@@ -83,7 +83,7 @@ Wave writes runtime artifacts here:
83
83
 
84
84
  Common files:
85
85
 
86
- - `launch-preview.json`: resolved invocation lines, env vars, and retry mode
86
+ - `launch-preview.json`: resolved invocation lines, env vars, retry mode, and structured attempt/turn-limit metadata
87
87
  - `skills.resolved.md`: compact metadata-first skill catalog for the selected agent and runtime
88
88
  - `skills.expanded.md`: full canonical/debug skill payload with `SKILL.md` bodies and adapters
89
89
  - `skills.metadata.json`: resolved skill ids, activation metadata, permissions, hashes, and generated artifact paths
@@ -100,7 +100,7 @@ Runtime-specific delivery:
100
100
  - OpenCode injects the compact catalog into `opencode.json` and attaches `skill.json`, `SKILL.md`, the selected adapter, and recursive `references/**` files through `--file`.
101
101
  - Local keeps skills prompt-only.
102
102
 
103
- `launch-preview.json` also records the resolved skill metadata so dry-run can verify the exact runtime plus skill combination before any live launch.
103
+ `launch-preview.json` also records the resolved skill metadata plus a `limits` section. For Claude and OpenCode, that section reports the known turn ceiling and whether it came from the runtime-specific setting or generic `budget.turns`. For Codex, it explicitly records that Wave emitted no turn-limit flag and that any effective ceiling may come from the selected Codex profile or upstream runtime.
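As a rough illustration of the precedence this paragraph describes, the sketch below resolves a turn ceiling and its source for each runtime. Only `turnLimitSource` and the value `"not-set-by-wave"` appear in the Wave docs; the `turnLimit` key, the other source labels, and the function itself are assumptions for the example and may not match the real `limits` schema.

```python
from typing import Optional, TypedDict


class TurnLimit(TypedDict):
    turnLimit: Optional[int]   # hypothetical key name
    turnLimitSource: str       # key name taken from the Codex docs


def resolve_turn_limit(
    runtime: str,
    runtime_specific: Optional[int] = None,  # e.g. Claude maxTurns or OpenCode steps
    budget_turns: Optional[int] = None,      # generic budget.turns
) -> TurnLimit:
    """Sketch of the documented precedence; source labels other than
    "not-set-by-wave" are made up for illustration."""
    if runtime == "codex":
        # Wave emits no Codex turn-limit flag, so budget.turns is ignored here.
        return {"turnLimit": None, "turnLimitSource": "not-set-by-wave"}
    if runtime not in ("claude", "opencode"):
        raise ValueError(f"runtime not covered by this sketch: {runtime}")
    if runtime_specific is not None:
        return {"turnLimit": runtime_specific, "turnLimitSource": "runtime-specific"}
    if budget_turns is not None:
        return {"turnLimit": budget_turns, "turnLimitSource": "budget.turns"}
    return {"turnLimit": None, "turnLimitSource": "unset"}


print(resolve_turn_limit("claude", runtime_specific=4, budget_turns=12))
print(resolve_turn_limit("opencode", budget_turns=12))
print(resolve_turn_limit("codex", budget_turns=12))
```

The point of the sketch is the ordering: a runtime-specific value wins over generic `budget.turns`, and Codex never receives a Wave-emitted ceiling regardless of the budget.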
104
104
 
105
105
  ## Recommended Validation Path
106
106
 
@@ -12,6 +12,7 @@ Wave launches Claude headlessly with `claude -p --no-session-persistence`.
12
12
  | Prompt mode | `executors.claude.appendSystemPromptMode` | n/a | Uses `--append-system-prompt-file` or `--system-prompt-file` |
13
13
  | Permission mode | `executors.claude.permissionMode`, `executors.profiles.<name>.claude.permissionMode` | `claude.permission_mode` | Adds `--permission-mode <mode>` |
14
14
  | Permission prompt tool | `executors.claude.permissionPromptTool`, `executors.profiles.<name>.claude.permissionPromptTool` | `claude.permission_prompt_tool` | Adds `--permission-prompt-tool <tool>` |
15
+ | Effort | `executors.claude.effort`, `executors.profiles.<name>.claude.effort` | `claude.effort` | Adds `--effort low\|medium\|high\|max` |
15
16
  | Max turns | `executors.claude.maxTurns`, `executors.profiles.<name>.claude.maxTurns` | `claude.max_turns` | Adds `--max-turns <n>` |
16
17
  | MCP config | `executors.claude.mcpConfig`, `executors.profiles.<name>.claude.mcpConfig` | `claude.mcp_config` | Adds repeated `--mcp-config <path>` |
17
18
  | Strict MCP mode | `executors.claude.strictMcpConfig`, `executors.profiles.<name>.claude.strictMcpConfig` | n/a | Adds `--strict-mcp-config` |
@@ -27,6 +28,8 @@ Wave launches Claude headlessly with `claude -p --no-session-persistence`.
27
28
 
28
29
  Wave always writes `claude-system-prompt.txt` for the harness runtime instructions.
29
30
 
31
+ Wave validates the effort enum only. Model-specific compatibility for values such as `max` remains enforced by Claude Code itself.
32
+
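A minimal sketch of what "validates the effort enum only" could look like, assuming the four values listed in the table above. The helper name `effort_args` and the error message are hypothetical; the sketch deliberately performs no model-specific check, since that is left to Claude Code.

```python
from typing import List, Optional

# Enum values as listed in the runtime table; no per-model compatibility logic.
ALLOWED_EFFORTS = ("low", "medium", "high", "max")


def effort_args(effort: Optional[str]) -> List[str]:
    """Return the CLI arguments for a configured effort value (hypothetical helper).

    An unset effort adds nothing; an unknown value is rejected up front.
    """
    if effort is None:
        return []
    if effort not in ALLOWED_EFFORTS:
        raise ValueError(f"unsupported claude.effort value: {effort!r}")
    return ["--effort", effort]


print(effort_args("high"))
print(effort_args(None))
```

A value such as `max` passes this check even if the selected model does not support it; in that case the failure would surface from Claude Code at launch time, not from the enum validation.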
30
33
  Wave writes `claude-settings.json` only when at least one inline overlay input is present:
31
34
 
32
35
  - `settingsJson`
@@ -57,6 +60,7 @@ If no inline overlay data is present, Wave passes the base `claude.settings` fil
57
60
  },
58
61
  "claude": {
59
62
  "agent": "reviewer",
63
+ "effort": "high",
60
64
  "permissionMode": "plan",
61
65
  "allowedTools": ["Read"],
62
66
  "disallowedTools": ["Edit"]
@@ -84,6 +88,7 @@ If no inline overlay data is present, Wave passes the base `claude.settings` fil
84
88
 
85
89
  - id: claude
86
90
  - model: claude-sonnet-4-6
91
+ - claude.effort: high
87
92
  - claude.permission_mode: plan
88
93
  - claude.max_turns: 4
89
94
  - claude.settings_json: {"permissions":{"allow":["Read"]}}
@@ -102,4 +107,4 @@ For a dry run, inspect:
102
107
  - `claude-settings.json`, when generated
103
108
  - `launch-preview.json`
104
109
 
105
- `launch-preview.json` shows the final `claude -p` invocation and whether `--settings`, `--allowedTools`, `--disallowedTools`, `--mcp-config`, or `--system-prompt-file` were included.
110
+ `launch-preview.json` shows the final `claude -p` invocation, whether `--effort`, `--settings`, `--allowedTools`, `--disallowedTools`, `--mcp-config`, or `--system-prompt-file` were included, and the resolved `limits` block for attempt timeout plus known turn ceiling.
@@ -20,6 +20,7 @@ Wave launches Codex with `codex exec` and pipes the generated task prompt throug
20
20
  ## Notes
21
21
 
22
22
  - There is no `executors.codex.model` key today. Use profile `model` or per-agent `model`.
23
+ - Generic `budget.turns` does not set a Codex turn limit. If Codex stops on a turn ceiling, that limit came from the selected Codex profile or upstream Codex runtime, not from a Wave-emitted CLI flag.
23
24
  - `codex.images`, `codex.add_dirs`, and `codex.config` accept either a string array in `wave.config.json` or a comma-separated list in a wave file.
24
25
  - Relative paths are passed to Codex relative to the repository root because Wave launches the executor from the repo workspace.
25
26
 
@@ -35,7 +36,6 @@ Wave launches Codex with `codex exec` and pipes the generated task prompt throug
35
36
  "model": "gpt-5-codex",
36
37
  "fallbacks": ["claude", "opencode"],
37
38
  "budget": {
38
- "turns": 12,
39
39
  "minutes": 45
40
40
  },
41
41
  "codex": {
@@ -78,4 +78,4 @@ For a dry run, inspect:
78
78
  - `launch-preview.json` for the final `codex exec` command
79
79
  - any referenced prompt file under `.tmp/<lane>-wave-launcher/dry-run/prompts/`
80
80
 
81
- The preview records the exact `--profile`, repeated `-c`, `--image`, and `--add-dir` flags that Wave would use in a live launch.
81
+ The preview records the exact `--profile`, repeated `-c`, `--image`, and `--add-dir` flags that Wave would use in a live launch. It also includes a `limits` block that makes Wave's Codex visibility explicit: `turnLimitSource: "not-set-by-wave"` means Wave emitted no Codex turn-limit flag, so any effective ceiling is external to the Wave CLI invocation.
@@ -90,4 +90,4 @@ For a dry run, inspect:
90
90
  - `opencode.json`
91
91
  - `launch-preview.json`
92
92
 
93
- `launch-preview.json` shows the final `opencode run` command and the exported `OPENCODE_CONFIG` path.
93
+ `launch-preview.json` shows the final `opencode run` command, the exported `OPENCODE_CONFIG` path, and the resolved `limits` block for attempt timeout plus known step ceiling.
@@ -7,6 +7,8 @@ summary: "Primary external sources used as inspiration for planning, harness des
7
7
 
8
8
  This repository does not commit converted paper/article caches. Keep any hydrated local copies under `docs/research/agent-context-cache/` or another ignored cache directory.
9
9
 
10
+ For a narrative synthesis of the most relevant MAS failure modes and how Wave responds to them, start with [coordination-failure-review.md](./coordination-failure-review.md) and then use this page as the bibliography.
11
+
10
12
  ## Practice Articles
11
13
 
12
14
  - [Harness engineering: leveraging Codex in an agent-first world](https://openai.com/index/harness-engineering/)
@@ -17,7 +17,28 @@ The Wave orchestrator addresses several coordination failure modes constructivel
17
17
 
18
18
  That is materially stronger than the common "agents talk in a shared channel and we hope that was enough" pattern criticized by recent multi-agent papers.
19
19
 
20
- The main weakness is empirical, not architectural. The repo does not yet contain a benchmark family that proves the blackboard actually helps agents reconstruct distributed state under HiddenBench or Silo-Bench style pressure, or that it handles DPBench-style simultaneous coordination reliably.
20
+ The main weakness is empirical, not architectural. The repo now carries coordination-oriented benchmark vocabulary, but it does not yet present enough hard evidence that the blackboard reconstructs distributed state under HiddenBench or Silo-Bench style pressure, or that it handles DPBench-style simultaneous coordination reliably.
21
+
22
+ ## Common MAS Failure Cases
23
+
24
+ The research cited in this repo keeps returning to a fairly stable set of failure modes. In Wave language, the common ones are:
25
+
26
+ - `Cosmetic board, no canonical state`
27
+ Agents appear coordinated because they share a board or chat, but there is no machine-trustable source of truth underneath. Wave responds with a canonical coordination log and treats the board as a projection.
28
+ - `Hidden evidence never gets pooled`
29
+ One agent has the decision-changing fact, but it never reaches the shared state before closure. Wave responds with shared summaries, per-agent inboxes, and integration gating, but this still needs stronger empirical validation.
30
+ - `Communication without global-state reconstruction`
31
+ Agents exchange information, yet nobody reconstructs the correct cross-agent picture. Wave responds with integration summaries and barrier-based closure so the final decision depends on integrated state rather than message volume.
32
+ - `Simultaneous coordination collapse`
33
+ A team that looks competent in serial work falls apart when multiple owners, blockers, or resources must move together. Wave responds with helper assignments, dependency barriers, and staged closure, but still lacks a stronger contention benchmark story.
34
+ - `Expert signal gets averaged away`
35
+ The strongest specialist view is diluted into a weaker compromise. Wave responds with explicit ownership, named stewards, and capability routing instead of free-form consensus, though expertise weighting is still shallow.
36
+ - `Blackboard projection drift`
37
+ The raw shared state may be right, but summaries, inboxes, ledgers, or integration artifacts lose the important fact. Wave responds by compiling those surfaces from canonical state and by adding `blackboard-fidelity` to the eval vocabulary.
38
+ - `Contradictions get smoothed over`
39
+ Conflicting claims look resolved in prose, but the system never turns them into bounded repair work. Wave responds with clarification flow, integration barriers, and contradiction-oriented eval vocabulary, though subtle semantic conflicts can still leak through.
40
+ - `Premature closure`
41
+ Agents say they are done before proof, evals, or integrated state actually support PASS. Wave responds with structured proof markers, exit contracts, eval gates, closure stewards, and replay-visible traces.
21
42
 
22
43
  ## What The Papers Warn About
23
44
 
@@ -175,23 +196,26 @@ That alignment matters. In many MAS projects the docs promise a blackboard, but
175
196
 
176
197
  ## What Is Still Missing To Make The Claim Credible
177
198
 
178
- ### 1. No distributed-information benchmark family yet
199
+ ### 1. The benchmark vocabulary exists, but the empirical proof is still thin
179
200
 
180
- The biggest gap is in [docs/evals/benchmark-catalog.json](../evals/benchmark-catalog.json). The current families are:
201
+ [docs/evals/benchmark-catalog.json](../evals/benchmark-catalog.json) and [docs/evals/README.md](../evals/README.md) now define coordination-oriented benchmark families such as:
181
202
 
182
- - `service-output`
183
- - `latency`
184
- - `quality-regression`
203
+ - `hidden-profile-pooling`
204
+ - `silo-escape`
205
+ - `simultaneous-coordination`
206
+ - `expertise-leverage`
207
+ - `blackboard-fidelity`
208
+ - `contradiction-recovery`
185
209
 
186
- There is nothing yet for:
210
+ That is a real improvement because the repo now has a vocabulary for the exact MAS failures the research highlights.
187
211
 
188
- - hidden-profile reconstruction
189
- - silo escape under partial information
190
- - blackboard consistency across raw log, summary, inboxes, ledger, and integration state
191
- - contradiction injection and recovery
192
- - simultaneous coordination under contention
212
+ The remaining gap is not the absence of categories. The gap is still empirical proof:
193
213
 
194
- So the repo can reasonably claim "we built mechanisms intended to mitigate these failures," but it cannot yet claim "we demonstrated that these mechanisms overcome the failures highlighted by HiddenBench, Silo-Bench, or DPBench."
214
+ - not enough published results showing those families are exercised systematically
215
+ - not enough evidence that the blackboard actually improves hidden-state reconstruction
216
+ - not enough stress data showing simultaneous coordination remains reliable under contention
217
+
218
+ So the repo can reasonably claim "we built mechanisms and eval categories intended to mitigate these failures," but it still cannot claim "we demonstrated that those mechanisms consistently overcome the failures highlighted by HiddenBench, Silo-Bench, or DPBench."
195
219
 
196
220
  ### 2. Information integration is supported, but not measured directly
197
221
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@chllming/wave-orchestration",
3
- "version": "0.6.1",
3
+ "version": "0.6.2",
4
4
  "license": "MIT",
5
5
  "description": "Generic wave-based multi-agent orchestration for repository work.",
6
6
  "repository": {
@@ -2,6 +2,24 @@
2
2
  "schemaVersion": 1,
3
3
  "packageName": "@chllming/wave-orchestration",
4
4
  "releases": [
5
+ {
6
+ "version": "0.6.2",
7
+ "date": "2026-03-22",
8
+ "summary": "Runtime preview visibility, dashboard/operator UX fixes, dry-run cleanup, and safer shared-component retries.",
9
+ "features": [
10
+ "Claude runtime config now exposes first-class `claude.effort`, and runtime previews now include structured `limits` metadata for known attempt and turn ceilings.",
11
+ "Codex previews and docs now make turn-limit visibility explicit: Wave records when it emitted no Codex turn-limit flag and warns that any effective ceiling may come from the selected Codex profile or upstream runtime.",
12
+ "The dashboard surface now distinguishes done, active, pending, and failed counts, keeps a stable `Current Wave Dashboard` terminal target, and adds simple TTY color cues for faster scanning.",
13
+ "Dry-run executor preview directories are pruned when wave agent sets shrink, so stale overlay folders no longer linger under `.tmp/.../dry-run/executors/`.",
14
+ "Shared promoted-component retries now preserve already-landed owner slices and relaunch only the sibling owners still required for closure proof."
15
+ ],
16
+ "manualSteps": [
17
+ "After upgrading, rerun `pnpm exec wave doctor` and `pnpm exec wave launch --lane main --dry-run --no-dashboard` to inspect the new preview `limits` metadata and confirm your repo runtime config still resolves as expected.",
18
+ "If you relied on a local Claude wrapper only to inject `--effort`, move that setting into `wave.config.json` or the agent `### Executor` block and retire the wrapper when convenient.",
19
+ "If you document Codex turn ceilings in repo-local guidance, update that guidance to reflect that Wave now reports Codex ceiling visibility as opaque unless the limit is surfaced by the selected Codex profile or runtime."
20
+ ],
21
+ "breaking": false
22
+ },
5
23
  {
6
24
  "version": "0.6.1",
7
25
  "date": "2026-03-22",