@chllming/wave-orchestration 0.9.0 → 0.9.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (68)
  1. package/CHANGELOG.md +57 -0
  2. package/LICENSE.md +21 -0
  3. package/README.md +133 -20
  4. package/docs/README.md +12 -4
  5. package/docs/agents/wave-security-role.md +1 -0
  6. package/docs/architecture/README.md +1498 -0
  7. package/docs/concepts/operating-modes.md +2 -2
  8. package/docs/guides/author-and-run-waves.md +14 -4
  9. package/docs/guides/planner.md +2 -2
  10. package/docs/guides/{recommendations-0.9.0.md → recommendations-0.9.2.md} +8 -7
  11. package/docs/guides/sandboxed-environments.md +158 -0
  12. package/docs/guides/terminal-surfaces.md +14 -12
  13. package/docs/plans/current-state.md +11 -3
  14. package/docs/plans/end-state-architecture.md +3 -1
  15. package/docs/plans/examples/wave-example-design-handoff.md +1 -1
  16. package/docs/plans/examples/wave-example-live-proof.md +1 -1
  17. package/docs/plans/migration.md +70 -19
  18. package/docs/plans/sandbox-end-state-architecture.md +153 -0
  19. package/docs/reference/cli-reference.md +71 -7
  20. package/docs/reference/coordination-and-closure.md +18 -1
  21. package/docs/reference/corridor.md +225 -0
  22. package/docs/reference/github-packages-setup.md +1 -1
  23. package/docs/reference/migration-0.2-to-0.5.md +9 -7
  24. package/docs/reference/npmjs-token-publishing.md +53 -0
  25. package/docs/reference/npmjs-trusted-publishing.md +4 -50
  26. package/docs/reference/package-publishing-flow.md +272 -0
  27. package/docs/reference/runtime-config/README.md +61 -3
  28. package/docs/reference/sample-waves.md +5 -5
  29. package/docs/reference/skills.md +1 -1
  30. package/docs/reference/wave-control.md +358 -27
  31. package/docs/roadmap.md +39 -204
  32. package/package.json +1 -1
  33. package/releases/manifest.json +38 -0
  34. package/scripts/wave-cli-bootstrap.mjs +52 -1
  35. package/scripts/wave-orchestrator/agent-process-runner.mjs +344 -0
  36. package/scripts/wave-orchestrator/agent-state.mjs +0 -1
  37. package/scripts/wave-orchestrator/artifact-schemas.mjs +7 -0
  38. package/scripts/wave-orchestrator/autonomous.mjs +47 -14
  39. package/scripts/wave-orchestrator/closure-engine.mjs +138 -17
  40. package/scripts/wave-orchestrator/config.mjs +199 -3
  41. package/scripts/wave-orchestrator/context7.mjs +231 -29
  42. package/scripts/wave-orchestrator/control-cli.mjs +42 -5
  43. package/scripts/wave-orchestrator/coordination.mjs +14 -0
  44. package/scripts/wave-orchestrator/corridor.mjs +363 -0
  45. package/scripts/wave-orchestrator/dashboard-renderer.mjs +115 -43
  46. package/scripts/wave-orchestrator/derived-state-engine.mjs +44 -4
  47. package/scripts/wave-orchestrator/gate-engine.mjs +126 -38
  48. package/scripts/wave-orchestrator/install.mjs +46 -0
  49. package/scripts/wave-orchestrator/launcher-progress.mjs +91 -0
  50. package/scripts/wave-orchestrator/launcher-runtime.mjs +290 -75
  51. package/scripts/wave-orchestrator/launcher.mjs +201 -53
  52. package/scripts/wave-orchestrator/ledger.mjs +7 -2
  53. package/scripts/wave-orchestrator/planner.mjs +1 -0
  54. package/scripts/wave-orchestrator/projection-writer.mjs +36 -1
  55. package/scripts/wave-orchestrator/provider-runtime.mjs +104 -0
  56. package/scripts/wave-orchestrator/reducer-snapshot.mjs +6 -0
  57. package/scripts/wave-orchestrator/retry-control.mjs +3 -3
  58. package/scripts/wave-orchestrator/retry-engine.mjs +93 -6
  59. package/scripts/wave-orchestrator/role-helpers.mjs +30 -0
  60. package/scripts/wave-orchestrator/session-supervisor.mjs +94 -85
  61. package/scripts/wave-orchestrator/shared.mjs +1 -0
  62. package/scripts/wave-orchestrator/supervisor-cli.mjs +1306 -0
  63. package/scripts/wave-orchestrator/terminals.mjs +12 -32
  64. package/scripts/wave-orchestrator/tmux-adapter.mjs +300 -0
  65. package/scripts/wave-orchestrator/traces.mjs +25 -0
  66. package/scripts/wave-orchestrator/wave-control-client.mjs +14 -1
  67. package/scripts/wave-orchestrator/wave-files.mjs +38 -5
  68. package/scripts/wave.mjs +13 -0
@@ -0,0 +1,153 @@
+ # Sandbox End-State Architecture
+
+ This document is the sandbox-runtime companion to [end-state-architecture.md](./end-state-architecture.md). The core architecture still applies: the canonical authority set remains wave definitions, the coordination log, and the control-plane event log. This page narrows that model to the execution environments that impose short-lived `exec` sessions, process ceilings, or terminal instability.
+
+ The goal is straightforward: sandbox client commands must stay short and disposable, while long-running wave ownership moves to a durable supervisor that can survive launcher exit, sandbox timeout, and terminal churn.
+
+ For the operator-facing setup flow in LEAPclaw, OpenClaw, Nemoshell, Docker, and similar environments, read [../guides/sandboxed-environments.md](../guides/sandboxed-environments.md). This page is the deeper design and authority-model reference.
+
+ ---
+
+ ## Problem Statement
+
+ Sandboxed runtimes have failure modes that the generic architecture does not need to describe in detail:
+
+ - the sandbox `exec` session may have a wall-clock timeout that is much shorter than a real wave
+ - bursty `spawnSync` and `tmux` probes can hit `EAGAIN`, `EMFILE`, or related process pressure limits
+ - the launcher process can die before child agents finish, leaving orphaned sessions and ambiguous status
+ - a missing `tmux` session is not enough evidence that the actual agent process failed
+
+ The shipped runtime now has an initial async supervisor wrapper plus forwarded closure-gap handling, but it does not yet satisfy the full sandbox ownership model described here.
+
+ ---
+
+ ## Target Command Model
+
+ Sandbox-facing commands should follow an async submit/observe pattern:
+
+ - `wave submit [launcher options]`
+ Validate the request, persist a run request, print a `runId`, and exit quickly.
+ - `wave supervise`
+ Long-running daemon command that owns launch, monitoring, retry, adoption, and cleanup. This command is not intended to be bound to a short sandbox `exec` lifetime.
+ - `wave status --run-id <id>`
+ Read canonical supervisor state for a run.
+ - `wave wait --run-id <id> --timeout-seconds <n>`
+ Observe until a state change or timeout. Timing out never cancels the run.
+ - `wave attach --run-id <id>`
+ Optional operator projection surface for `tmux` or another terminal UI. This is not a liveness authority.
+
+ Compatibility rules:
+
+ - `wave launch` remains the canonical full launcher surface for direct local execution and dry-run validation.
+ - `wave autonomous` should submit and observe wave execution when it is used in sandbox-oriented flows.
+ - `wave submit`, `wave supervise`, `wave status`, `wave wait`, and `wave attach` are the preferred sandbox-facing surface, even while some internals remain partial.
+
+ ---
+
+ ## Canonical Authority In Sandboxed Runs
+
+ The canonical authority set does not change, but sandbox supervision adds one more durable runtime layer:
+
+ - wave definitions remain authoritative for declared work, closure roles, proof artifacts, and task contracts
+ - coordination and control-plane logs remain authoritative for workflow, lifecycle, proof, and blocker state
+ - supervisor run state becomes the canonical record of daemon-owned runtime observation for a submitted run
+
+ The supervisor-owned state should converge on this per-run structure under `.tmp/<lane>-wave-launcher/supervisor/runs/<runId>/`:
+
+ - `request.json`
+ Immutable submitted request.
+ - `state.json`
+ Current daemon-owned run snapshot, including `runId`, `status`, `submittedAt`, `startedAt`, `completedAt`, `launcherPid`, `supervisorId`, `leaseExpiresAt`, `terminalDisposition`, and the latest observed launcher status.
+ - `events.jsonl`
+ Supervisor-local observation history for adoption, retries, reconciliation, and cleanup decisions.
+ - `launcher-status.json`
+ Canonical launcher completion status written atomically by the detached launcher wrapper.
+ - `launcher.log`
+ Human-facing log stream only.
+ - `agents/<agentId>.runtime.json`
+ Agent runtime observation record with fields such as `pid`, `pgid`, `attempt`, `startedAt`, `lastHeartbeatAt`, `exitCode`, `exitReason`, `statusPath`, and optional projection metadata like `tmuxSessionName`.
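A representative `state.json` snapshot assembled from the fields listed above. All values are illustrative, and the `launcherStatus` field name used here for "the latest observed launcher status" is an assumption, not the shipped schema:

```json
{
  "runId": "run-20260329-a1b2",
  "status": "running",
  "submittedAt": "2026-03-29T12:00:00.000Z",
  "startedAt": "2026-03-29T12:00:04.000Z",
  "completedAt": null,
  "launcherPid": 48213,
  "supervisorId": "supervisor-7f3c",
  "leaseExpiresAt": "2026-03-29T12:05:04.000Z",
  "terminalDisposition": null,
  "launcherStatus": "wave-2-in-progress"
}
```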
+
+ Authority rules:
+
+ - `tmux` is projection-only
+ - dashboards, summaries, inboxes, and board markdown remain projections only
+ - missing `tmux` state cannot by itself fail a run and is warning-only telemetry
+ - pid checks, heartbeats, and atomic status files outrank terminal presence for liveness
+
+ ---
+
+ ## Daemon Ownership, Adoption, And Process Control
+
+ The end state requires one daemon-owned control path for long-running work:
+
+ 1. `wave submit` writes the request and exits.
+ 2. `wave supervise` claims or renews a lease, launches work, and records observed runtime facts.
+ 3. `wave status` and `wave wait` read canonical state only.
+ 4. If the daemon dies, a later daemon instance can adopt active runs after lease expiry and continue observation without relaunching healthy agents.
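The claim-or-renew step in the control path above can be sketched as a write-then-rename update of `state.json`. The function shape, the TTL, and the `.tmp` suffix are assumptions for illustration, not the shipped `session-supervisor.mjs` implementation:

```javascript
import { readFileSync, writeFileSync, renameSync } from "node:fs";

// Claim the lease when it is free, expired, or already ours; otherwise yield.
// Returns the updated state on success, or null when another live daemon holds it.
function tryClaimLease(statePath, supervisorId, ttlMs = 30_000, now = Date.now()) {
  const state = JSON.parse(readFileSync(statePath, "utf8"));
  const held =
    state.supervisorId &&
    state.leaseExpiresAt &&
    Date.parse(state.leaseExpiresAt) > now;
  if (held && state.supervisorId !== supervisorId) return null;
  const next = {
    ...state,
    supervisorId,
    leaseExpiresAt: new Date(now + ttlMs).toISOString(),
  };
  // Write-then-rename keeps concurrent readers from seeing a partial file.
  const tmpPath = statePath + ".tmp";
  writeFileSync(tmpPath, JSON.stringify(next, null, 2));
  renameSync(tmpPath, statePath);
  return next;
}
```

A restarted daemon running the same routine naturally adopts runs whose `leaseExpiresAt` has passed, which is the adoption behavior step 4 requires.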
+
+ The daemon must own:
+
+ - bounded process launch concurrency
+ - async retry with jittered backoff for `EAGAIN`, `EMFILE`, and `ENFILE`
+ - orphan adoption after stale lease detection
+ - conservative orphan cleanup only after lease expiry, stale heartbeat, and failed pid confirmation
+ - reconciliation between launcher status files, live pid state, and control-plane events
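The jittered-backoff requirement above can be sketched as follows. The retryable error set comes from the list above, while the base delay, cap, and attempt budget are illustrative values rather than the shipped defaults:

```javascript
// Process-pressure errors worth retrying, per the requirement above.
const RETRYABLE = new Set(["EAGAIN", "EMFILE", "ENFILE"]);

// Exponential backoff with jitter in [exp/2, exp) to avoid synchronized bursts.
function backoffDelayMs(attempt, baseMs = 250, maxMs = 10_000) {
  const exp = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.floor(exp / 2 + Math.random() * (exp / 2));
}

// Retry a launch callback only on process-pressure errors, rethrowing the rest.
async function spawnWithRetry(launch, maxAttempts = 5) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await launch();
    } catch (err) {
      if (!RETRYABLE.has(err?.code) || attempt + 1 >= maxAttempts) throw err;
      await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
    }
  }
}
```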
+
+ The daemon must not depend on:
+
+ - repeated `spawnSync("tmux", "list-sessions")` calls in the steady-state wait loop
+ - one sandbox client process staying alive for the full wave duration
+ - terminal presence as the source of truth for agent or wave health
+
+ ---
+
+ ## Closure Semantics For Forwarded Proof Gaps
+
+ Closure staging still runs in the normal order:
+
+ `implementation + proof -> cont-EVAL -> security/A7 -> integration/A8 -> docs/A9 -> cont-QA/A0`
+
+ Sandbox stability does not change closure authority, but the daemon must preserve one special case:
+
+ - `wave-proof-gap` from a closure-stage agent is a forwarded soft blocker, not an immediate full-wave stop
+
+ Forwarding rules:
+
+ - if A7 returns `wave-proof-gap`, the daemon still dispatches A8, A9, and A0 with the gap included as structured input
+ - if A8 returns `wave-proof-gap`, the daemon still dispatches A9 and A0
+ - if A9 returns `wave-proof-gap`, the daemon still dispatches A0
+ - later closure agents must evaluate the currently available artifacts and report what is true; they must not refuse to run only because an earlier closure-stage agent reported `wave-proof-gap`
+ - the final wave disposition remains blocked until the forwarded closure gaps are resolved
+
+ Non-forwardable closure failures remain hard stops. Examples include malformed outputs, missing proof envelopes, explicit integration blockers, or invalid marker formats.
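The forwarding rules above can be condensed into one decision per closure stage. This is a minimal sketch with a simplified stage list and result labels; the shipped closure engine is more involved:

```javascript
// Closure stages in documented order: security/A7 -> integration/A8 -> docs/A9 -> cont-QA/A0.
const CLOSURE_ORDER = ["security", "integration", "docs", "cont-qa"];

function planRemainingClosure(stage, result) {
  const later = CLOSURE_ORDER.slice(CLOSURE_ORDER.indexOf(stage) + 1);
  if (result === "ok") {
    return { dispatch: later, forwardedGap: false, waveBlocked: false };
  }
  if (result === "wave-proof-gap") {
    // Forwarded soft blocker: later stages still run with the gap as structured
    // input, but the final wave disposition stays blocked until it resolves.
    return { dispatch: later, forwardedGap: true, waveBlocked: true };
  }
  // Malformed output, missing proof envelope, etc. remain hard stops.
  return { dispatch: [], forwardedGap: false, waveBlocked: true };
}
```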
+
+ ---
+
+ ## Current Implementation Status
+
+ Already landed:
+
+ - `wave submit`, `wave supervise`, `wave status`, `wave wait`, and `wave attach` exist as a file-backed async wrapper over the existing launcher
+ - supervisor state now includes lease-backed daemon ownership, `events.jsonl`, exact lane-scoped lookup, and detached launcher-status reconciliation
+ - agent runtime records now capture per-agent pid, heartbeat, runner metadata, terminal disposition, and attach or log-follow metadata for supervisor-owned runs
+ - `wave autonomous` now submits and observes single-wave runs through the supervisor surface instead of binding them to one blocking launcher subprocess
+ - closure-stage `wave-proof-gap` forwarding now continues later closure stages and records the blocker instead of failing the whole sweep immediately
+ - retry planning now invalidates later closure reuse from the earliest forwarded closure-gap stage
+ - agent execution now uses detached process runners by default, which lowers tmux session churn and memory pressure in wide fan-outs; tmux remains dashboard-only and `wave attach --agent` falls back to log following when no live session exists
+ - launcher progress journaling now lets the supervisor recover finalized runs and safely resume the active wave without a repo-wide rescan
+
+ Still missing for the true end state:
+
+ - broader resume semantics beyond "restart the active wave with preserved control state"; recovery can now use finalized progress journals and canonical run-state completion, but multi-wave and auto-next recovery is still conservative
+ - fully tmux-free live dashboard projection; dashboard attach now falls back to the last written dashboard file, but live dashboard sessions still use tmux today
+ - full success inference from canonical runtime facts alone; the daemon still refuses to synthesize success from agent runtime files without either finalized progress or canonical run-state completion
+
+ ---
+
+ ## Remaining Gap Plan
+
+ 1. Implement supervisor lease, heartbeat, and stale-lock reclamation so a restarted daemon can adopt active runs without relaunching healthy work.
+ 2. Move liveness authority to pid, heartbeat, and atomic status files; keep `tmux` as projection-only and remove sync terminal probes from steady-state monitoring.
+ 3. Materialize supervisor events and per-agent runtime records as canonical daemon state, not only ad hoc wrapper files.
+ 4. Extend forwarded closure-gap handling into retry planning so the earliest forwarded gap invalidates later closure outputs for reuse while still preserving them for operator evidence.
+ 5. Converge sandbox-facing entrypoints so `submit/status/wait` become the default operator path and `autonomous` no longer owns a multi-hour blocking launcher process.
@@ -13,6 +13,8 @@ When a command targets lane-scoped runtime state, it also accepts `--project <id
 
  - Runtime:
  `wave launch`, `wave autonomous`, and `wave local` cover dry-run validation, live execution, and executor-specific prompt transport.
+ - Sandbox async supervision:
+ `wave submit`, `wave supervise`, `wave status`, `wave wait`, and `wave attach` provide the sandbox-friendly submit-and-observe surface for long-running waves.
  - Operator control:
  `wave control` is the preferred surface for live status, tasks, reruns, proof bundles, and telemetry.
  - Compatibility and inspection:
@@ -43,7 +45,7 @@ Closure-role bindings do not have a CLI override surface. When a wave file decla
  | `--auto-next` | off | Start from next unfinished wave and continue |
  | `--resume-control-state` | off | Preserve the prior auto-generated relaunch plan instead of treating the launch as a fresh wave start |
  | `--executor <id>` | `codex` | Default executor: `codex`, `claude`, `opencode`, `local` |
- | `--codex-sandbox <mode>` | `danger-full-access` | Codex sandbox isolation level |
+ | `--codex-sandbox <mode>` | lane config | Codex sandbox isolation override; falls back to `danger-full-access` only when config is unset |
  | `--timeout-minutes <n>` | `240` | Max minutes to wait per wave |
  | `--max-retries-per-wave <n>` | `1` | Relaunch failed agents per wave |
  | `--agent-rate-limit-retries <n>` | `2` | Per-agent retries for 429 errors |
@@ -51,9 +53,9 @@ Closure-role bindings do not have a CLI override surface. When a wave file decla
  | `--agent-rate-limit-max-delay-seconds <n>` | `180` | Max backoff delay for 429 |
  | `--agent-launch-stagger-ms <n>` | `1200` | Delay between agent launches |
  | `--terminal-surface <mode>` | `vscode` | `tmux`, `vscode`, or `none` |
- | `--no-dashboard` | off | Disable per-wave tmux dashboard |
- | `--cleanup-sessions` | on | Kill lane tmux sessions after each wave |
- | `--keep-sessions` | off | Keep lane tmux sessions |
+ | `--no-dashboard` | off | Disable the per-wave dashboard projection session |
+ | `--cleanup-sessions` | on | Kill lane tmux dashboard and projection sessions after each wave |
+ | `--keep-sessions` | off | Keep lane tmux dashboard and projection sessions |
  | `--keep-terminals` | off | Keep temporary terminal entries |
  | `--orchestrator-id <id>` | generated | Stable orchestrator identity |
  | `--orchestrator-board <path>` | default board path | Write coordination-board updates to a specific shared board |
@@ -80,7 +82,7 @@ wave autonomous [options]
  | `--project <id>` | config default | Project id |
  | `--lane <name>` | `main` | Lane name |
  | `--executor <id>` | lane config | `codex`, `claude`, or `opencode` (not `local`) |
- | `--codex-sandbox <mode>` | `danger-full-access` | Codex sandbox mode |
+ | `--codex-sandbox <mode>` | lane config | Codex sandbox override passed to launcher; falls back to `danger-full-access` only when config is unset |
  | `--timeout-minutes <n>` | `240` | Per-wave timeout passed to launcher |
  | `--max-retries-per-wave <n>` | `1` | Per-wave relaunches inside launcher |
  | `--max-attempts-per-wave <n>` | `1` | External attempts per wave |
@@ -91,9 +93,65 @@ wave autonomous [options]
  | `--orchestrator-id <id>` | `<lane>-autonomous` | Orchestrator identity |
  | `--resident-orchestrator` | off | Launch resident orchestrator for each wave |
  | `--dashboard` | off | Enable dashboards |
- | `--keep-sessions` | off | Keep tmux sessions between waves |
+ | `--keep-sessions` | off | Keep tmux dashboard and projection sessions between waves |
  | `--keep-terminals` | off | Keep terminal entries between waves |
 
+ When you run Wave in a sandbox with short-lived `exec` sessions, prefer the async supervisor surface instead of binding the whole run to one long-lived `wave autonomous` client process. The end-state sandbox model is documented in [../plans/sandbox-end-state-architecture.md](../plans/sandbox-end-state-architecture.md).
+
+ ## wave submit
+
+ Submit a launcher request for daemon-owned execution and return quickly with a `runId`.
+
+ ```
+ wave submit [launcher options] [--json]
+ ```
+
+ Current implementation status: this is a file-backed wrapper over `wave-launcher.mjs` with daemon leases, exact-context lookup, launcher-status reconciliation, progress journaling, and process-backed agent execution. It is the preferred sandbox-facing entrypoint for LEAPclaw, OpenClaw, Nemoshell, Docker, and similar short-lived exec environments, even though the broader daemon convergence described in [../plans/sandbox-end-state-architecture.md](../plans/sandbox-end-state-architecture.md) is still conservative in some recovery paths.
+
+ `wave submit` accepts the same launcher options you would pass to `wave launch`, for example `--project`, `--lane`, `--start-wave`, `--end-wave`, `--executor`, `--codex-sandbox`, `--timeout-minutes`, `--agent-launch-stagger-ms`, `--resident-orchestrator`, `--no-dashboard`, and `--dry-run`. Use `--json` when you want a structured payload containing `runId`, `project`, `lane`, optional `adhocRunId`, and `statePath`.
+
+ For concrete setup guidance, read [../guides/sandboxed-environments.md](../guides/sandboxed-environments.md).
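For wrapper scripts, the `--json` payload can be captured and parsed like this. The sample payload below is hypothetical and stands in for a live `wave submit --json` call:

```shell
# In a real sandbox session the payload would come from:
#   payload="$(wave submit --project app --lane main --json)"
# Hypothetical sample payload using the documented fields (runId, project, lane, statePath):
payload='{"runId":"run-123","project":"app","lane":"main","statePath":".tmp/main-wave-launcher/supervisor/runs/run-123/state.json"}'

# Extract runId with node so the parse stays strict JSON, not grep heuristics.
run_id="$(printf '%s' "$payload" | node -e 'let d="";process.stdin.on("data",c=>d+=c);process.stdin.on("end",()=>console.log(JSON.parse(d).runId))')"
echo "$run_id"   # → run-123
```

The captured `run_id` is then what `wave status`, `wave wait`, and `wave attach` take via `--run-id`.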
+
+ ## wave supervise
+
+ Run the supervisor loop that claims queued submitted runs and reconciles launcher status.
+
+ ```
+ wave supervise [--project <id>] [--lane <name>] [--once]
+ ```
+
+ Use `--once` for a single reconciliation pass in tests or wrapper scripts. The shipped daemon now renews a lease, reconciles detached launcher status, and can adopt already-running submitted runs from the same lane-scoped supervisor root.
+
+ ## wave status
+
+ Read the current supervisor-owned state for a submitted run.
+
+ ```
+ wave status --run-id <id> --project <id> --lane <name> [--adhoc-run <id>] [--json]
+ ```
+
+ Current implementation status: reads the thin file-backed supervisor state from the exact lane-scoped supervisor root. `--project` and `--lane` are required so status does not guess across unrelated state trees.
+
+ ## wave wait
+
+ Wait for a submitted run to reach terminal state or until the wait timeout expires.
+
+ ```
+ wave wait --run-id <id> --project <id> --lane <name> [--adhoc-run <id>] [--timeout-seconds <n>] [--json]
+ ```
+
+ `wave wait` is observational only. Timing out does not cancel or kill the underlying run.
+
+ ## wave attach
+
+ Attach to a projection for a submitted run.
+
+ ```
+ wave attach --run-id <id> --project <id> --lane <name> [--adhoc-run <id>] (--agent <id> | --dashboard)
+ ```
+
+ `--agent <id>` attaches to a live session only when the runtime record explicitly exposes one; otherwise it follows the recorded log, or prints the recent log tail if the agent is already terminal. `--dashboard` reuses the current lane dashboard attach surface and falls back to the last written dashboard file when no live dashboard session exists. Missing projections are treated as operator errors, not as run-health failures.
+
  ## wave control
 
  Unified operator control surface. Preferred over legacy `wave coord`, `wave retry`, and `wave proof`.
@@ -114,6 +172,10 @@ The JSON payload now includes:
  Versioned wave-level signal state for wrappers and external operators.
  - `signals.agents`
  Versioned per-agent signal state, including `shouldWake` plus any observed ack metadata.
+ - `supervisor`
+ The most relevant lane-scoped supervisor run for this wave, including degraded states such as `launcher-lost-agents-running`, recovery fields such as `sessionBackend`, `recoveryState`, and `resumeAction`, plus any recorded per-agent runtime summary.
+ - `forwardedClosureGaps`
+ Earliest-first forwarded `wave-proof-gap` records from the relaunch plan, including the stage key, originating agent, attempt, detail, and downstream closure targets.
 
  Starter repos also include `scripts/wave-status.sh` and `scripts/wave-watch.sh` as thin readers over this JSON payload. They use exit `0` for completed, `20` for input-required, `40` for failed, and `30` from `wave-watch.sh --until-change` when the signal changed but the wave stayed active. For the full wrapper contract, read [../guides/signal-wrappers.md](../guides/signal-wrappers.md).
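A thin wrapper over those documented exit codes might look like this; the function name is illustrative:

```shell
# Map the documented wave-status.sh / wave-watch.sh exit codes to labels.
describe_wave_exit() {
  case "$1" in
    0) echo completed ;;
    20) echo input-required ;;
    30) echo signal-changed ;;   # from wave-watch.sh --until-change
    40) echo failed ;;
    *) echo unknown ;;
  esac
}

# Typical call site (the scripts/wave-status.sh invocation itself is elided here):
#   scripts/wave-status.sh; describe_wave_exit "$?"
describe_wave_exit 20   # → input-required
```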
 
@@ -505,6 +567,8 @@ wave dashboard --dashboard-file <path> [--project <id>] [--lane <lane>] [--messa
  wave dashboard --project <id> --lane <lane> --attach current|global
  ```
 
+ `wave dashboard --attach current|global` attaches to the live dashboard session when one exists; otherwise it follows the last written dashboard JSON for that target.
+
  ## Workspace Commands
 
  **Initialize workspace:**
@@ -565,7 +629,7 @@ Interactive draft currently offers worker role kinds:
  - `research`
  - `security`
 
- Agentic planner payloads also accept `workerAgents[].roleKind = "design"`. The shipped `0.9.0` surface uses `design-pass` as the default executor profile for that role and typically assigns a packet path like `docs/plans/waves/design/wave-<n>-<agentId>.md`. Interactive draft scaffolds the docs-first default; hybrid design stewards are authored by explicitly adding implementation-owned paths and the normal implementation contract sections.
+ Agentic planner payloads also accept `workerAgents[].roleKind = "design"`. The shipped `0.9.2` surface uses `design-pass` as the default executor profile for that role and typically assigns a packet path like `docs/plans/waves/design/wave-<n>-<agentId>.md`. Interactive draft scaffolds the docs-first default; hybrid design stewards are authored by explicitly adding implementation-owned paths and the normal implementation contract sections.
 
  ## Ad-Hoc Task Commands
 
@@ -36,6 +36,8 @@ At runtime, those distinctions map onto separate modules:
 
  Closure roles are resolved from the wave definition first, then from starter defaults. In other words, integration, documentation, `cont-QA`, `cont-EVAL`, and security review keep the same semantics even when a wave overrides the default role ids such as `A8`, `A9`, `A0`, `E0`, or `A7`.
 
+ If `externalProviders.corridor.enabled` is on, Wave also materializes a normalized Corridor artifact before security and integration run. Security review still owns the human-readable report and `[wave-security]` marker, but the security gate can fail closed when the saved Corridor artifact reports a fetch failure or matched blocking findings on implementation-owned paths.
+
  ## Durable State Surfaces
 
  The runtime writes several different artifacts, but they do different jobs:
@@ -56,6 +58,12 @@ The runtime writes several different artifacts, but they do different jobs:
  `.tmp/<lane>-wave-launcher/inboxes/wave-<n>/<agent>.md`
  - integration summary:
  `.tmp/<lane>-wave-launcher/integration/wave-<n>.json`
+ - security summary:
+ `.tmp/<lane>-wave-launcher/security/wave-<n>.json`
+ - security markdown summary:
+ `.tmp/<lane>-wave-launcher/security/wave-<n>.md`
+ - Corridor security context:
+ `.tmp/<lane>-wave-launcher/security/wave-<n>-corridor.json`
  - wave dashboard:
  `.tmp/<lane>-wave-launcher/dashboards/wave-<n>.json`
  - run-state:
@@ -134,7 +142,7 @@ Practical rule:
 
  That means a targeted helper request only blocks while it remains open *and* still has blocking severity in coordination state.
 
- For the practical `0.9.0` recommendation on when to keep records blocking versus when to downgrade them to `soft`, `stale`, or `advisory`, see [../guides/recommendations-0.9.0.md](../guides/recommendations-0.9.0.md).
+ For the practical `0.9.2` recommendation on when to keep records blocking versus when to downgrade them to `soft`, `stale`, or `advisory`, see [../guides/recommendations-0.9.2.md](../guides/recommendations-0.9.2.md).
 
  This page is documenting runtime semantics first. The important contract is that closure follows the durable coordination state, not that a particular human or agent used one exact command path to mutate it.
 
@@ -392,6 +400,15 @@ If present, security review must emit a final `[wave-security]` marker and publi
  - `concerns` remains visible in summaries and traces
  - `clear` is only valid when no unresolved findings or approvals remain
 
+ Corridor does not replace that review. When `externalProviders.corridor.enabled` is on:
+
+ - Wave first materializes the normalized Corridor artifact
+ - `requiredAtClosure: true` turns provider fetch failures into `corridor-fetch-failed`
+ - matched findings at or above the configured threshold turn the gate into `corridor-blocked`
+ - matched findings still stay visible in security and integration summaries even when the human reviewer reports only advisory concerns
+
+ Only implementation-owned non-doc, non-`.tmp/`, non-markdown paths are eligible for Corridor matching. See [corridor.md](./corridor.md) for the provider-specific rules.
+
  ### Integration
 
  Integration must reconcile cross-agent state and report `ready-for-doc-closure` only when there is no remaining meaningful contradiction, blocker, proof gap, or deploy risk.
@@ -0,0 +1,225 @@
+ ---
+ title: "Corridor"
+ summary: "How Wave loads Corridor security context, matches findings against implementation-owned paths, and uses the result during closure."
+ ---
+
+ # Corridor
+
+ Corridor is Wave's optional external security-context provider.
+
+ It does not replace the report-owning security reviewer. Instead, it adds a machine-readable guardrail and findings input that Wave can materialize before security and integration closure run.
+
+ Use it when you want Wave to:
+
+ - pull guardrail or findings context for a project
+ - filter that context down to the implementation-owned paths in the current wave
+ - persist the normalized result as a runtime artifact
+ - fail closure automatically when the fetch fails or matched findings cross a configured severity threshold
+
+ ## Modes
+
+ Wave supports three Corridor modes under `externalProviders.corridor`:
+
+ ```json
+ {
+ "externalProviders": {
+ "corridor": {
+ "enabled": true,
+ "mode": "hybrid",
+ "baseUrl": "https://app.corridor.dev/api",
+ "apiTokenEnvVar": "CORRIDOR_API_TOKEN",
+ "apiKeyFallbackEnvVar": "CORRIDOR_API_KEY",
+ "teamId": "corridor-team-id",
+ "projectId": "corridor-project-id",
+ "severityThreshold": "critical",
+ "findingStates": ["open", "potential"],
+ "requiredAtClosure": true
+ }
+ }
+ }
+ ```
+
+ - `direct`
+ Wave calls Corridor from the repo runtime with `CORRIDOR_API_TOKEN` or the fallback `CORRIDOR_API_KEY`.
+ - `broker`
+ Wave calls an owned `wave-control` deployment with `WAVE_API_TOKEN`, and that service uses deployment-owned Corridor credentials.
+ - `hybrid`
+ Wave tries the owned `wave-control` broker first and falls back to direct auth if broker setup or broker delivery fails.
+
+ Notes:
+
+ - direct mode requires both `teamId` and `projectId` in config; the live fetches use `projectId`, while `teamId` keeps the project identity explicit in repo config
+ - broker mode requires an owned Wave Control endpoint; the packaged default endpoint intentionally rejects Corridor brokering
+ - if `findingStates` is omitted or set to `[]`, Wave does not filter by finding state and the provider may return all states
+ - if you only want active findings, set `findingStates` explicitly, for example `["open", "potential"]`
+
+ ## What Wave Matches
+
+ Wave does not send every file in the repo to Corridor matching.
+
+ The runtime builds the relevant path set from implementation-owned paths in the current wave:
+
+ - security reviewers are excluded
+ - design stewards are excluded
+ - integration, documentation, and `cont-QA` owners are excluded
+ - `cont-EVAL` only contributes owned paths when that agent is implementation-owning
+ - `.tmp/` paths are excluded
+ - `docs/` paths are excluded
+ - `.md` and `.txt` files are excluded
+
+ That means Corridor is aimed at code and implementation-owned assets, not shared-plan markdown or generated launcher state.
+
+ Matching is path-prefix based:
+
+ - an exact file match counts
+ - a finding under a matched owned directory also counts
+ - unmatched findings remain in the upstream provider but are dropped from the normalized Wave artifact
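The prefix rule above can be sketched as follows. Field names mirror the documented artifact shape; the helper names themselves are illustrative:

```javascript
// Exact file match, or the finding sits under a matched owned directory.
function matchesOwnedPath(affectedFile, ownedPath) {
  return affectedFile === ownedPath || affectedFile.startsWith(ownedPath + "/");
}

// Keep only findings that hit an eligible owned path; unmatched findings
// are dropped from the normalized artifact, per the rules above.
function matchFindings(findings, relevantOwnedPaths) {
  return findings
    .map((finding) => ({
      ...finding,
      matchedOwnedPaths: relevantOwnedPaths.filter((ownedPath) =>
        matchesOwnedPath(finding.affectedFile, ownedPath)
      ),
    }))
    .filter((finding) => finding.matchedOwnedPaths.length > 0);
}
```

Note that the directory check requires the `/` separator, so an owned path of `src/auth` matches `src/auth/token.ts` but not a sibling like `src/authx/util.ts`.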
+
+ ## Generated Artifact
+
+ Wave writes the normalized Corridor artifact to:
+
+ - `.tmp/<lane>-wave-launcher/security/wave-<n>-corridor.json`
+ - `.tmp/projects/<projectId>/<lane>-wave-launcher/security/wave-<n>-corridor.json` for explicit projects
+
+ Representative shape:
+
+ ```json
+ {
+ "schemaVersion": 1,
+ "wave": 7,
+ "lane": "main",
+ "projectId": "app",
+ "providerMode": "broker",
+ "source": "wave-control-broker",
+ "requiredAtClosure": true,
+ "severityThreshold": "critical",
+ "fetchedAt": "2026-03-29T12:00:00.000Z",
+ "relevantOwnedPaths": ["src/auth", "src/session"],
+ "guardrails": [{ "id": "r1", "name": "No secrets" }],
+ "matchedFindings": [
+ {
+ "id": "f1",
+ "title": "Hardcoded token",
+ "affectedFile": "src/auth/token.ts",
+ "severity": "critical",
+ "state": "open",
+ "matchedOwnedPaths": ["src/auth"]
+ }
+ ],
+ "blockingFindings": [
+ {
+ "id": "f1",
+ "title": "Hardcoded token",
+ "affectedFile": "src/auth/token.ts",
+ "severity": "critical",
+ "state": "open",
+ "matchedOwnedPaths": ["src/auth"]
+ }
+ ],
+ "blocking": true,
+ "error": null
+ }
+ ```
+
+ Important fields:
+
+ - `providerMode`: the configured mode after runtime resolution
+ - `source`: the actual fetch source such as direct Corridor API or owned Wave Control broker
+ - `relevantOwnedPaths`: the implementation-owned paths Wave considered eligible for matching
+ - `guardrails`: normalized provider-side guardrail/report metadata
+ - `matchedFindings`: findings that hit the wave's eligible owned paths
+ - `blockingFindings`: matched findings whose severity meets or exceeds `severityThreshold`
+ - `blocking`: whether the Corridor result alone is enough to fail the security gate
+ - `error`: the fetch or broker error when the load failed
135
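
The relationship between `matchedFindings`, `severityThreshold`, `blockingFindings`, and `blocking` can be sketched as follows. The severity ladder is an assumption; only `critical` appears in the artifact example above:

```typescript
// Assumed severity ordering, lowest to highest; only "critical" is
// confirmed by the artifact example. Unknown severities index to -1
// and never block.
const SEVERITY_ORDER = ["low", "medium", "high", "critical"];

interface MatchedFinding {
  id: string;
  severity: string;
}

// blockingFindings: matched findings at or above the configured threshold.
function blockingFindings(matched: MatchedFinding[], threshold: string): MatchedFinding[] {
  const floor = SEVERITY_ORDER.indexOf(threshold);
  return matched.filter((f) => SEVERITY_ORDER.indexOf(f.severity) >= floor);
}

// blocking: true when any matched finding clears the threshold.
function isBlocking(matched: MatchedFinding[], threshold: string): boolean {
  return blockingFindings(matched, threshold).length > 0;
}
```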
+
136
+ If the wave has no eligible implementation-owned paths, Wave still writes a successful artifact with `blocking: false` and a `detail` explaining that nothing qualified for matching.
137
+
138
+ ## Closure Behavior
139
+
140
+ Corridor is evaluated before security review finishes.
141
+
142
+ The security gate behaves like this:
143
+
144
+ 1. If Corridor is disabled, Wave ignores it.
145
+ 2. If Corridor is enabled and the fetch fails:
146
+ - `requiredAtClosure: true` turns that into `corridor-fetch-failed`
147
+ - `requiredAtClosure: false` keeps the failure visible in summaries without hard-failing the gate
148
+ 3. If Corridor loads and matched findings meet the configured threshold:
149
+ - the gate fails as `corridor-blocked`
150
+ 4. If Corridor loads cleanly:
151
+ - security review still runs and still owns the human-readable report plus `[wave-security]`
152
+
153
+ That separation matters:
154
+
155
+ - Corridor provides machine-readable blocking evidence
156
+ - the security reviewer still provides the threat-model-first narrative review, approvals, and final marker
157
+ - `concerns` from the human reviewer remain advisory, while `blocked` from either the reviewer or Corridor stops closure before integration
158
+
159
+ Matched Corridor findings are also copied into the generated security and integration summaries, so they remain visible even when the human reviewer reports advisory concerns rather than a hard block.
160
+
161
+ ## Broker Mode Through Wave Control
162
+
163
+ In broker mode, the repo runtime sends a normalized request to:
164
+
165
+ - `POST /api/v1/providers/corridor/context`
166
+
167
+ The request body contains:
168
+
169
+ - `projectId`: the active Wave project id
170
+ - `wave`: current wave number
171
+ - `ownedPaths`: the filtered implementation-owned paths
172
+ - `severityThreshold`
173
+ - `findingStates`
174
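
As a sketch, the broker request could be assembled like this. The endpoint path and body field names match the lists above; the builder itself and the example values are assumptions, and it deliberately stops short of performing the HTTP call:

```typescript
interface CorridorContextRequest {
  projectId: string; // the active Wave project id
  wave: number; // current wave number
  ownedPaths: string[]; // the filtered implementation-owned paths
  severityThreshold: string;
  findingStates: string[];
}

// Hypothetical helper pairing the documented endpoint path with the
// documented request body.
function buildCorridorContextRequest(req: CorridorContextRequest) {
  return {
    method: "POST" as const,
    path: "/api/v1/providers/corridor/context",
    body: req,
  };
}
```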
+
175
+ The owned `wave-control` deployment then:
176
+
177
+ - looks up the Wave project id inside `WAVE_BROKER_CORRIDOR_PROJECT_MAP`
178
+ - fetches Corridor reports and findings with the deployment-owned `WAVE_BROKER_CORRIDOR_API_TOKEN`
179
+ - returns the same normalized shape Wave would have produced in direct mode
180
+
181
+ Example mapping:
182
+
183
+ ```json
184
+ {
185
+ "app": {
186
+ "teamId": "corridor-team-uuid",
187
+ "projectId": "corridor-project-uuid"
188
+ }
189
+ }
190
+ ```
191
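
The broker-side lookup step can be sketched as below. The env variable name `WAVE_BROKER_CORRIDOR_PROJECT_MAP` and the mapping shape are from the docs; the parsing helper is an assumption:

```typescript
interface CorridorProjectRef {
  teamId: string;
  projectId: string;
}

// Hypothetical lookup of a Wave project id inside the JSON value of
// WAVE_BROKER_CORRIDOR_PROJECT_MAP; returns null for unmapped projects.
function resolveCorridorProject(
  projectMapJson: string,
  waveProjectId: string
): CorridorProjectRef | null {
  const map: Record<string, CorridorProjectRef> = JSON.parse(projectMapJson);
  return map[waveProjectId] ?? null;
}
```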
+
192
+ Broker mode requirements:
193
+
194
+ - `waveControl.endpoint` must point at an owned Wave Control deployment, not the packaged default endpoint
195
+ - Wave must have a bearer token, normally `WAVE_API_TOKEN`
196
+ - the service deployment must enable Corridor brokering with `WAVE_BROKER_OWNED_DEPLOYMENT=true`, `WAVE_BROKER_ENABLE_CORRIDOR=true`, `WAVE_BROKER_CORRIDOR_API_TOKEN`, and `WAVE_BROKER_CORRIDOR_PROJECT_MAP`
197
+
198
+ ## Prompt And Summary Surfaces
199
+
200
+ When Corridor loads during a live run, Wave also projects a compact text summary into the generated runtime context. That summary includes:
201
+
202
+ - the actual source used
203
+ - whether Corridor is currently blocking
204
+ - the configured threshold
205
+ - matched finding count
206
+ - up to five blocking findings
207
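
A sketch of that projection, assuming the artifact shape shown earlier; the exact wording of the generated text is not specified, so this rendering is illustrative:

```typescript
interface CorridorSummaryInput {
  source: string;
  blocking: boolean;
  severityThreshold: string;
  matchedFindings: { id: string }[];
  blockingFindings: { title: string; severity: string; affectedFile: string }[];
}

// Hypothetical rendering of the compact summary; caps the projected
// blocking findings at five, as described above.
function renderCorridorSummary(a: CorridorSummaryInput): string {
  return [
    `source: ${a.source}`,
    `blocking: ${a.blocking ? "yes" : "no"} (threshold: ${a.severityThreshold})`,
    `matched findings: ${a.matchedFindings.length}`,
    ...a.blockingFindings
      .slice(0, 5)
      .map((f) => `- [${f.severity}] ${f.title} (${f.affectedFile})`),
  ].join("\n");
}
```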
+
208
+ The same Corridor result then appears in:
209
+
210
+ - the generated security summary
211
+ - the integration summary
212
+ - the trace bundle's copied `corridor.json`
213
+
214
+ This keeps security context visible to humans without turning the provider response into the sole authority.
215
+
216
+ ## Recommended Pattern
217
+
218
+ For most repos:
219
+
220
+ - use `hybrid` if you want an owned Wave Control deployment in normal operation but still want direct-repo fallback during outages or incomplete broker setup
221
+ - set `findingStates` explicitly if you only want open or potential findings considered live
222
+ - leave `requiredAtClosure` enabled when Corridor is meant to be part of the actual release gate
223
+ - keep a report-owning security reviewer in the wave; Corridor should strengthen that review, not replace it
224
+
225
+ See [runtime-config/README.md](./runtime-config/README.md) for the config keys, [wave-control.md](./wave-control.md) for the owned broker surface, and [coordination-and-closure.md](./coordination-and-closure.md) for the closure-stage gate ordering.
@@ -3,7 +3,7 @@
3
3
  Use this page only if you intentionally want the legacy GitHub Packages install path.
4
4
 
5
5
  GitHub's npm registry still requires authentication for installs from `npm.pkg.github.com`, even when the package and backing repository are public.
6
- This is now the optional authenticated fallback path. The primary public install path is npmjs. Maintainer npm publishing setup is documented in [npmjs-trusted-publishing.md](./npmjs-trusted-publishing.md).
6
+ This is now the optional authenticated fallback path. The primary public install path is npmjs. Maintainer npm publishing setup is documented in [npmjs-token-publishing.md](./npmjs-token-publishing.md).
7
7
 
8
8
  ## `.npmrc`
9
9
 
@@ -1,12 +1,12 @@
1
1
  ---
2
2
  title: "Wave Orchestration Migration Guide: 0.2 to 0.5"
3
- summary: "How to migrate a repository from the earlier 0.2 wave baseline to the current post-roadmap Wave model."
3
+ summary: "How to migrate a repository from the earlier 0.2 wave baseline to the current Wave model."
4
4
  ---
5
5
 
6
6
  # Wave Orchestration Migration Guide: 0.2 to 0.5
7
7
 
8
8
  This guide explains how to migrate a repository from the earlier Wave
9
- Orchestration 0.2 baseline to the current post-roadmap Wave model.
9
+ Orchestration 0.2 baseline to the current Wave model.
10
10
 
11
11
  Current mainline note:
12
12
 
@@ -20,10 +20,11 @@ It uses two concrete references:
20
20
  - the 0.2-style baseline in the sibling `~/slowfast.ai` repo
21
21
  - the current target shape in this standalone `wave-orchestration` repo
22
22
 
23
- This is a migration guide for the current architecture described in
24
- [Roadmap](../roadmap.md), not just a changelog of whatever happens to be landed
25
- in one point-in-time package build. This document is about the shipped runtime
26
- shape, not only the semver label.
23
+ This is a migration guide for the current shipped architecture and release
24
+ direction described across [docs/architecture/README.md](../architecture/README.md)
25
+ and [Roadmap](../roadmap.md), not just a changelog of whatever happens to be
26
+ landed in one point-in-time package build. This document is about the shipped
27
+ runtime shape, not only the semver label.
27
28
 
28
29
  ## Baseline And Target
29
30
 
@@ -34,7 +35,8 @@ Use these files as the concrete examples while migrating:
34
35
  - 0.2 baseline wave example: `~/slowfast.ai/docs/plans/waves/wave-7.md`
35
36
  - current target config: [wave.config.json](../../wave.config.json)
36
37
  - current target sample wave: [wave-0.md](../plans/waves/wave-0.md)
37
- - current target architecture: [roadmap.md](../roadmap.md)
38
+ - current target architecture: [architecture/README.md](../architecture/README.md)
39
+ - current release direction: [roadmap.md](../roadmap.md)
38
40
  - current package workflow: [README.md](../../README.md)
39
41
 
40
42
  The migration is intentionally evolutionary: