@chllming/wave-orchestration 0.9.0 → 0.9.1
- package/CHANGELOG.md +28 -0
- package/README.md +119 -18
- package/docs/README.md +7 -3
- package/docs/architecture/README.md +1498 -0
- package/docs/concepts/operating-modes.md +2 -2
- package/docs/guides/author-and-run-waves.md +14 -4
- package/docs/guides/planner.md +2 -2
- package/docs/guides/{recommendations-0.9.0.md → recommendations-0.9.1.md} +8 -7
- package/docs/guides/sandboxed-environments.md +158 -0
- package/docs/guides/terminal-surfaces.md +14 -12
- package/docs/plans/current-state.md +5 -3
- package/docs/plans/end-state-architecture.md +3 -1
- package/docs/plans/examples/wave-example-design-handoff.md +1 -1
- package/docs/plans/examples/wave-example-live-proof.md +1 -1
- package/docs/plans/migration.md +46 -19
- package/docs/plans/sandbox-end-state-architecture.md +153 -0
- package/docs/reference/cli-reference.md +71 -7
- package/docs/reference/coordination-and-closure.md +1 -1
- package/docs/reference/github-packages-setup.md +1 -1
- package/docs/reference/migration-0.2-to-0.5.md +9 -7
- package/docs/reference/npmjs-token-publishing.md +53 -0
- package/docs/reference/npmjs-trusted-publishing.md +4 -50
- package/docs/reference/package-publishing-flow.md +272 -0
- package/docs/reference/runtime-config/README.md +2 -2
- package/docs/reference/sample-waves.md +5 -5
- package/docs/reference/skills.md +1 -1
- package/docs/roadmap.md +43 -201
- package/package.json +1 -1
- package/releases/manifest.json +19 -0
- package/scripts/wave-orchestrator/agent-process-runner.mjs +344 -0
- package/scripts/wave-orchestrator/agent-state.mjs +0 -1
- package/scripts/wave-orchestrator/artifact-schemas.mjs +7 -0
- package/scripts/wave-orchestrator/autonomous.mjs +47 -14
- package/scripts/wave-orchestrator/closure-engine.mjs +138 -17
- package/scripts/wave-orchestrator/control-cli.mjs +42 -5
- package/scripts/wave-orchestrator/dashboard-renderer.mjs +115 -43
- package/scripts/wave-orchestrator/derived-state-engine.mjs +6 -3
- package/scripts/wave-orchestrator/gate-engine.mjs +106 -38
- package/scripts/wave-orchestrator/install.mjs +13 -0
- package/scripts/wave-orchestrator/launcher-progress.mjs +91 -0
- package/scripts/wave-orchestrator/launcher-runtime.mjs +179 -68
- package/scripts/wave-orchestrator/launcher.mjs +201 -53
- package/scripts/wave-orchestrator/ledger.mjs +7 -2
- package/scripts/wave-orchestrator/projection-writer.mjs +13 -1
- package/scripts/wave-orchestrator/reducer-snapshot.mjs +6 -0
- package/scripts/wave-orchestrator/retry-control.mjs +3 -3
- package/scripts/wave-orchestrator/retry-engine.mjs +93 -6
- package/scripts/wave-orchestrator/role-helpers.mjs +30 -0
- package/scripts/wave-orchestrator/session-supervisor.mjs +94 -85
- package/scripts/wave-orchestrator/supervisor-cli.mjs +1306 -0
- package/scripts/wave-orchestrator/terminals.mjs +12 -32
- package/scripts/wave-orchestrator/tmux-adapter.mjs +300 -0
- package/scripts/wave-orchestrator/wave-files.mjs +38 -5
- package/scripts/wave.mjs +13 -0
|
@@ -0,0 +1,153 @@
|
|
|
1
|
+
# Sandbox End-State Architecture
|
|
2
|
+
|
|
3
|
+
This document is the sandbox-runtime companion to [end-state-architecture.md](./end-state-architecture.md). The core architecture still applies: the canonical authority set remains wave definitions, the coordination log, and the control-plane event log. This page narrows that model to the execution environments that impose short-lived `exec` sessions, process ceilings, or terminal instability.
|
|
4
|
+
|
|
5
|
+
The goal is straightforward: sandbox client commands must stay short and disposable, while long-running wave ownership moves to a durable supervisor that can survive launcher exit, sandbox timeout, and terminal churn.
|
|
6
|
+
|
|
7
|
+
For the operator-facing setup flow in LEAPclaw, OpenClaw, Nemoshell, Docker, and similar environments, read [../guides/sandboxed-environments.md](../guides/sandboxed-environments.md). This page is the deeper design and authority-model reference.
|
|
8
|
+
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
## Problem Statement
|
|
12
|
+
|
|
13
|
+
Sandboxed runtimes have failure modes that the generic architecture does not need to describe in detail:
|
|
14
|
+
|
|
15
|
+
- the sandbox `exec` session may have a wall-clock timeout that is much shorter than a real wave
|
|
16
|
+
- bursty `spawnSync` and `tmux` probes can hit `EAGAIN`, `EMFILE`, or related process pressure limits
|
|
17
|
+
- the launcher process can die before child agents finish, leaving orphaned sessions and ambiguous status
|
|
18
|
+
- a missing `tmux` session is not enough evidence that the actual agent process failed
|
|
19
|
+
|
|
20
|
+
The shipped runtime now has an initial async supervisor wrapper plus forwarded closure-gap handling, but it does not yet satisfy the full sandbox ownership model described here.
|
|
21
|
+
|
|
22
|
+
---
|
|
23
|
+
|
|
24
|
+
## Target Command Model
|
|
25
|
+
|
|
26
|
+
Sandbox-facing commands should follow an async submit/observe pattern:
|
|
27
|
+
|
|
28
|
+
- `wave submit [launcher options]`
|
|
29
|
+
Validate the request, persist a run request, print a `runId`, and exit quickly.
|
|
30
|
+
- `wave supervise`
|
|
31
|
+
Long-running daemon command that owns launch, monitoring, retry, adoption, and cleanup. This command is not intended to be bound to a short sandbox `exec` lifetime.
|
|
32
|
+
- `wave status --run-id <id>`
|
|
33
|
+
Read canonical supervisor state for a run.
|
|
34
|
+
- `wave wait --run-id <id> --timeout-seconds <n>`
|
|
35
|
+
Observe until a state change or timeout. Timing out never cancels the run.
|
|
36
|
+
- `wave attach --run-id <id>`
|
|
37
|
+
Optional operator projection surface for `tmux` or another terminal UI. This is not a liveness authority.
|
|
38
|
+
|
|
39
|
+
Compatibility rules:
|
|
40
|
+
|
|
41
|
+
- `wave launch` remains the canonical full launcher surface for direct local execution and dry-run validation.
|
|
42
|
+
- `wave autonomous` should submit and observe wave execution when it is used in sandbox-oriented flows.
|
|
43
|
+
- `wave submit`, `wave supervise`, `wave status`, `wave wait`, and `wave attach` are the preferred sandbox-facing surface, even while some internals remain partial.
|
|
44
|
+
|
|
45
|
+
---
|
|
46
|
+
|
|
47
|
+
## Canonical Authority In Sandboxed Runs
|
|
48
|
+
|
|
49
|
+
The canonical authority set does not change, but sandbox supervision adds one more durable runtime layer:
|
|
50
|
+
|
|
51
|
+
- wave definitions remain authoritative for declared work, closure roles, proof artifacts, and task contracts
|
|
52
|
+
- coordination and control-plane logs remain authoritative for workflow, lifecycle, proof, and blocker state
|
|
53
|
+
- supervisor run state becomes the canonical record of daemon-owned runtime observation for a submitted run
|
|
54
|
+
|
|
55
|
+
The supervisor-owned state should converge on this per-run structure under `.tmp/<lane>-wave-launcher/supervisor/runs/<runId>/`:
|
|
56
|
+
|
|
57
|
+
- `request.json`
|
|
58
|
+
Immutable submitted request.
|
|
59
|
+
- `state.json`
|
|
60
|
+
Current daemon-owned run snapshot, including `runId`, `status`, `submittedAt`, `startedAt`, `completedAt`, `launcherPid`, `supervisorId`, `leaseExpiresAt`, `terminalDisposition`, and the latest observed launcher status.
|
|
61
|
+
- `events.jsonl`
|
|
62
|
+
Supervisor-local observation history for adoption, retries, reconciliation, and cleanup decisions.
|
|
63
|
+
- `launcher-status.json`
|
|
64
|
+
Canonical launcher completion status written atomically by the detached launcher wrapper.
|
|
65
|
+
- `launcher.log`
|
|
66
|
+
Human-facing log stream only.
|
|
67
|
+
- `agents/<agentId>.runtime.json`
|
|
68
|
+
Agent runtime observation record with fields such as `pid`, `pgid`, `attempt`, `startedAt`, `lastHeartbeatAt`, `exitCode`, `exitReason`, `statusPath`, and optional projection metadata like `tmuxSessionName`.
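
The "written atomically" requirement for `launcher-status.json` can be sketched with the common write-to-temp-then-rename pattern. This is an illustration only, not the package's actual implementation: `writeJsonAtomic` and the temp-file naming scheme are hypothetical. Dynamic `import()` is used so the snippet runs as either a CommonJS or an ES module.

```javascript
// Hypothetical helper: write JSON to a temp file in the same directory,
// then rename it over the target. A rename within one filesystem replaces
// the target atomically on POSIX systems, so readers never see a partial file.
async function writeJsonAtomic(targetPath, value) {
  const fs = await import("node:fs/promises");
  const path = await import("node:path");
  // The temp name (pid + timestamp) is an assumption chosen to avoid collisions.
  const tempPath = path.join(
    path.dirname(targetPath),
    `.${path.basename(targetPath)}.${process.pid}.${Date.now()}.tmp`,
  );
  await fs.writeFile(tempPath, `${JSON.stringify(value, null, 2)}\n`, "utf8");
  await fs.rename(tempPath, targetPath);
}
```

A reader that polls the status file then sees either the previous complete document or the new one, which is what lets status files outrank terminal presence as a liveness signal.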
|
|
69
|
+
|
|
70
|
+
Authority rules:
|
|
71
|
+
|
|
72
|
+
- `tmux` is projection-only
|
|
73
|
+
- dashboards, summaries, inboxes, and board markdown remain projections only
|
|
74
|
+
- missing `tmux` state cannot by itself fail a run and is warning-only telemetry
|
|
75
|
+
- pid checks, heartbeats, and atomic status files outrank terminal presence for liveness
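
A minimal sketch of how these authority rules might translate into a liveness check. The field names `pid`, `exitCode`, `lastHeartbeatAt`, and `tmuxSessionName` come from the runtime record described above; `tmuxSessionPresent`, `staleAfterMs`, the function names, and the returned labels are assumptions made for illustration.

```javascript
// Signal 0 performs an existence check without delivering a signal.
function isPidAlive(pid) {
  try {
    process.kill(pid, 0);
    return true;
  } catch (error) {
    // EPERM means the process exists but belongs to another user.
    return error.code === "EPERM";
  }
}

// Hypothetical classifier: exit status first, then pid, then heartbeat age.
// Terminal presence only ever contributes a warning.
function classifyAgent(record, { now = Date.now(), staleAfterMs = 60000, pidAlive = isPidAlive } = {}) {
  const warnings = [];
  if (record.tmuxSessionName && record.tmuxSessionPresent === false) {
    warnings.push("tmux-session-missing"); // telemetry only, never a failure
  }
  let liveness;
  if (typeof record.exitCode === "number") {
    liveness = record.exitCode === 0 ? "exited-ok" : "exited-error";
  } else if (pidAlive(record.pid)) {
    const heartbeatAge = now - Date.parse(record.lastHeartbeatAt);
    liveness = heartbeatAge <= staleAfterMs ? "running" : "running-stale-heartbeat";
  } else {
    liveness = "lost";
  }
  return { liveness, warnings };
}
```

In this sketch a missing `tmux` session never changes `liveness`; it only appears in `warnings`, mirroring the warning-only rule.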
|
|
76
|
+
|
|
77
|
+
---
|
|
78
|
+
|
|
79
|
+
## Daemon Ownership, Adoption, And Process Control
|
|
80
|
+
|
|
81
|
+
The end state requires one daemon-owned control path for long-running work:
|
|
82
|
+
|
|
83
|
+
1. `wave submit` writes the request and exits.
|
|
84
|
+
2. `wave supervise` claims or renews a lease, launches work, and records observed runtime facts.
|
|
85
|
+
3. `wave status` and `wave wait` read canonical state only.
|
|
86
|
+
4. If the daemon dies, a later daemon instance can adopt active runs after lease expiry and continue observation without relaunching healthy agents.
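
The adoption step can be illustrated with a small lease check. This is a sketch under stated assumptions: `supervisorId`, `leaseExpiresAt`, and `completedAt` are fields named in the `state.json` description, while `canAdoptRun`, `claimLease`, and the 30-second default lease are hypothetical. A real daemon would also need an atomic write or lock around the claim so that two supervisors cannot both adopt the same run.

```javascript
// Returns true when this supervisor may own (or keep owning) the run.
function canAdoptRun(state, supervisorId, now = Date.now()) {
  if (state.completedAt) return false; // terminal runs need no owner
  if (state.supervisorId === supervisorId) return true; // already ours: renew
  const expiresAt = Date.parse(state.leaseExpiresAt);
  // An unparseable or missing lease is treated as unowned.
  return Number.isNaN(expiresAt) || expiresAt <= now;
}

// Returns the updated state snapshot, or null when another live lease exists.
function claimLease(state, supervisorId, { now = Date.now(), leaseMs = 30000 } = {}) {
  if (!canAdoptRun(state, supervisorId, now)) return null;
  return {
    ...state,
    supervisorId,
    leaseExpiresAt: new Date(now + leaseMs).toISOString(),
  };
}
```

Adoption in this sketch only changes ownership metadata; it does not relaunch anything, which matches the requirement to continue observation without relaunching healthy agents.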
|
|
87
|
+
|
|
88
|
+
The daemon must own:
|
|
89
|
+
|
|
90
|
+
- bounded process launch concurrency
|
|
91
|
+
- async retry with jittered backoff for `EAGAIN`, `EMFILE`, and `ENFILE`
|
|
92
|
+
- orphan adoption after stale lease detection
|
|
93
|
+
- conservative orphan cleanup only after lease expiry, stale heartbeat, and failed pid confirmation
|
|
94
|
+
- reconciliation between launcher status files, live pid state, and control-plane events
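
As an illustration of the retry item in the list above, the following sketch retries an async launch function with full-jitter exponential backoff. Only the errno codes come from the text; the helper names, the default delays, and the attempt limit are assumptions, and bounded launch concurrency is deliberately left out.

```javascript
// Process-pressure errors that are worth retrying.
const RETRYABLE_SPAWN_ERRORS = new Set(["EAGAIN", "EMFILE", "ENFILE"]);

function isRetryableSpawnError(error) {
  return Boolean(error) && RETRYABLE_SPAWN_ERRORS.has(error.code);
}

// Full jitter: a random delay in [0, min(capMs, baseMs * 2^attempt)).
function backoffDelayMs(attempt, { baseMs = 200, capMs = 5000, random = Math.random } = {}) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Retries `launch` only for process-pressure errors; other errors propagate at once.
async function launchWithRetry(launch, { maxAttempts = 5, sleep, ...backoff } = {}) {
  const wait = sleep ?? ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await launch();
    } catch (error) {
      const lastAttempt = attempt + 1 >= maxAttempts;
      if (!isRetryableSpawnError(error) || lastAttempt) throw error;
      await wait(backoffDelayMs(attempt, backoff));
    }
  }
}
```

Because the wait is awaited rather than blocking, other daemon work can proceed while a launch backs off, unlike a bursty `spawnSync` loop.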
|
|
95
|
+
|
|
96
|
+
The daemon must not depend on:
|
|
97
|
+
|
|
98
|
+
- repeated `spawnSync("tmux", "list-sessions")` calls in the steady-state wait loop
|
|
99
|
+
- one sandbox client process staying alive for the full wave duration
|
|
100
|
+
- terminal presence as the source of truth for agent or wave health
|
|
101
|
+
|
|
102
|
+
---
|
|
103
|
+
|
|
104
|
+
## Closure Semantics For Forwarded Proof Gaps
|
|
105
|
+
|
|
106
|
+
Closure staging still runs in the normal order:
|
|
107
|
+
|
|
108
|
+
`implementation + proof -> cont-EVAL -> security/A7 -> integration/A8 -> docs/A9 -> cont-QA/A0`
|
|
109
|
+
|
|
110
|
+
Sandbox stability does not change closure authority, but the daemon must preserve one special case:
|
|
111
|
+
|
|
112
|
+
- `wave-proof-gap` from a closure-stage agent is a forwarded soft blocker, not an immediate full-wave stop
|
|
113
|
+
|
|
114
|
+
Forwarding rules:
|
|
115
|
+
|
|
116
|
+
- if A7 returns `wave-proof-gap`, the daemon still dispatches A8, A9, and A0 with the gap included as structured input
|
|
117
|
+
- if A8 returns `wave-proof-gap`, the daemon still dispatches A9 and A0
|
|
118
|
+
- if A9 returns `wave-proof-gap`, the daemon still dispatches A0
|
|
119
|
+
- later closure agents must evaluate the currently available artifacts and report what is true; they must not refuse to run only because an earlier closure-stage agent reported `wave-proof-gap`
|
|
120
|
+
- the final wave disposition remains blocked until the forwarded closure gaps are resolved
|
|
121
|
+
|
|
122
|
+
Non-forwardable closure failures remain hard stops. Examples include malformed outputs, missing proof envelopes, explicit integration blockers, or invalid marker formats.
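
The forwarding rules can be sketched as a small sweep loop. The stage ids follow the closure order above and `wave-proof-gap` is the marker named in the text; the `{ status, detail }` result shape, the `"pass"` status, and the disposition labels are hypothetical and chosen only for this example.

```javascript
// Closure stages covered by the forwarding rules above.
const FORWARDING_STAGES = ["A7", "A8", "A9", "A0"];

// `runStage(stage, earlierGaps)` is a caller-supplied function (an assumption)
// that returns { status, detail } for one closure agent.
function runClosureSweep(runStage, stages = FORWARDING_STAGES) {
  const forwardedGaps = [];
  for (const stage of stages) {
    // Later stages receive a copy of the earlier gaps as structured input.
    const result = runStage(stage, forwardedGaps.slice());
    if (result.status === "wave-proof-gap") {
      forwardedGaps.push({ stage, detail: result.detail ?? null });
      continue; // soft blocker: keep dispatching later stages
    }
    if (result.status !== "pass") {
      // Anything else is treated as non-forwardable and stops the sweep.
      return { disposition: "hard-stop", failedStage: stage, forwardedGaps };
    }
  }
  return { disposition: forwardedGaps.length > 0 ? "blocked" : "closed", forwardedGaps };
}
```

With this shape, a gap from A7 still lets A8, A9, and A0 run, while the final disposition stays `blocked` until the recorded gaps are resolved.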
|
|
123
|
+
|
|
124
|
+
---
|
|
125
|
+
|
|
126
|
+
## Current Implementation Status
|
|
127
|
+
|
|
128
|
+
Already landed:
|
|
129
|
+
|
|
130
|
+
- `wave submit`, `wave supervise`, `wave status`, `wave wait`, and `wave attach` exist as a file-backed async wrapper over the existing launcher
|
|
131
|
+
- supervisor state now includes lease-backed daemon ownership, `events.jsonl`, exact lane-scoped lookup, and detached launcher-status reconciliation
|
|
132
|
+
- agent runtime records now capture per-agent pid, heartbeat, runner metadata, terminal disposition, and attach or log-follow metadata for supervisor-owned runs
|
|
133
|
+
- `wave autonomous` now submits and observes single-wave runs through the supervisor surface instead of binding them to one blocking launcher subprocess
|
|
134
|
+
- closure-stage `wave-proof-gap` forwarding now continues later closure stages and records the blocker instead of failing the whole sweep immediately
|
|
135
|
+
- retry planning now invalidates later closure reuse from the earliest forwarded closure-gap stage
|
|
136
|
+
- agent execution now uses detached process runners by default, which lowers tmux session churn and memory pressure in wide fan-outs; tmux remains dashboard-only and `wave attach --agent` falls back to log following when no live session exists
|
|
137
|
+
- launcher progress journaling now lets the supervisor recover finalized runs and safely resume the active wave without a repo-wide rescan
|
|
138
|
+
|
|
139
|
+
Still missing for the true end state:
|
|
140
|
+
|
|
141
|
+
- broader resume semantics beyond “restart the active wave with preserved control state”; recovery can now use finalized progress journals and canonical run-state completion, but multi-wave and auto-next recovery is still conservative
|
|
142
|
+
- fully tmux-free live dashboard projection; dashboard attach now falls back to the last written dashboard file, but live dashboard sessions still use tmux today
|
|
143
|
+
- full success inference from canonical runtime facts alone; the daemon still refuses to synthesize success from agent runtime files without either finalized progress or canonical run-state completion
|
|
144
|
+
|
|
145
|
+
---
|
|
146
|
+
|
|
147
|
+
## Remaining Gap Plan
|
|
148
|
+
|
|
149
|
+
1. Implement supervisor lease, heartbeat, and stale-lock reclamation so a restarted daemon can adopt active runs without relaunching healthy work.
|
|
150
|
+
2. Move liveness authority to pid, heartbeat, and atomic status files; keep `tmux` as projection-only and remove sync terminal probes from steady-state monitoring.
|
|
151
|
+
3. Materialize supervisor events and per-agent runtime records as canonical daemon state, not only ad hoc wrapper files.
|
|
152
|
+
4. Extend forwarded closure-gap handling into retry planning so the earliest forwarded gap invalidates later closure outputs for reuse while still preserving them for operator evidence.
|
|
153
|
+
5. Converge sandbox-facing entrypoints so `submit/status/wait` become the default operator path and `autonomous` no longer owns a multi-hour blocking launcher process.
|
|
@@ -13,6 +13,8 @@ When a command targets lane-scoped runtime state, it also accepts `--project <id
|
|
|
13
13
|
|
|
14
14
|
- Runtime:
|
|
15
15
|
`wave launch`, `wave autonomous`, and `wave local` cover dry-run validation, live execution, and executor-specific prompt transport.
|
|
16
|
+
- Sandbox async supervision:
|
|
17
|
+
`wave submit`, `wave supervise`, `wave status`, `wave wait`, and `wave attach` provide the sandbox-friendly submit-and-observe surface for long-running waves.
|
|
16
18
|
- Operator control:
|
|
17
19
|
`wave control` is the preferred surface for live status, tasks, reruns, proof bundles, and telemetry.
|
|
18
20
|
- Compatibility and inspection:
|
|
@@ -43,7 +45,7 @@ Closure-role bindings do not have a CLI override surface. When a wave file decla
|
|
|
43
45
|
| `--auto-next` | off | Start from next unfinished wave and continue |
|
|
44
46
|
| `--resume-control-state` | off | Preserve the prior auto-generated relaunch plan instead of treating the launch as a fresh wave start |
|
|
45
47
|
| `--executor <id>` | `codex` | Default executor: `codex`, `claude`, `opencode`, `local` |
|
|
46
|
-
| `--codex-sandbox <mode>` | `danger-full-access`
|
|
48
|
+
| `--codex-sandbox <mode>` | lane config | Codex sandbox isolation override; falls back to `danger-full-access` only when config is unset |
|
|
47
49
|
| `--timeout-minutes <n>` | `240` | Max minutes to wait per wave |
|
|
48
50
|
| `--max-retries-per-wave <n>` | `1` | Relaunch failed agents per wave |
|
|
49
51
|
| `--agent-rate-limit-retries <n>` | `2` | Per-agent retries for 429 errors |
|
|
@@ -51,9 +53,9 @@ Closure-role bindings do not have a CLI override surface. When a wave file decla
|
|
|
51
53
|
| `--agent-rate-limit-max-delay-seconds <n>` | `180` | Max backoff delay for 429 |
|
|
52
54
|
| `--agent-launch-stagger-ms <n>` | `1200` | Delay between agent launches |
|
|
53
55
|
| `--terminal-surface <mode>` | `vscode` | `tmux`, `vscode`, or `none` |
|
|
54
|
-
| `--no-dashboard` | off | Disable per-wave
|
|
55
|
-
| `--cleanup-sessions` | on | Kill lane tmux sessions after each wave |
|
|
56
|
-
| `--keep-sessions` | off | Keep lane tmux sessions |
|
|
56
|
+
| `--no-dashboard` | off | Disable the per-wave dashboard projection session |
|
|
57
|
+
| `--cleanup-sessions` | on | Kill lane tmux dashboard and projection sessions after each wave |
|
|
58
|
+
| `--keep-sessions` | off | Keep lane tmux dashboard and projection sessions |
|
|
57
59
|
| `--keep-terminals` | off | Keep temporary terminal entries |
|
|
58
60
|
| `--orchestrator-id <id>` | generated | Stable orchestrator identity |
|
|
59
61
|
| `--orchestrator-board <path>` | default board path | Write coordination-board updates to a specific shared board |
|
|
@@ -80,7 +82,7 @@ wave autonomous [options]
|
|
|
80
82
|
| `--project <id>` | config default | Project id |
|
|
81
83
|
| `--lane <name>` | `main` | Lane name |
|
|
82
84
|
| `--executor <id>` | lane config | `codex`, `claude`, or `opencode` (not `local`) |
|
|
83
|
-
| `--codex-sandbox <mode>` | `danger-full-access`
|
|
85
|
+
| `--codex-sandbox <mode>` | lane config | Codex sandbox override passed to launcher; falls back to `danger-full-access` only when config is unset |
|
|
84
86
|
| `--timeout-minutes <n>` | `240` | Per-wave timeout passed to launcher |
|
|
85
87
|
| `--max-retries-per-wave <n>` | `1` | Per-wave relaunches inside launcher |
|
|
86
88
|
| `--max-attempts-per-wave <n>` | `1` | External attempts per wave |
|
|
@@ -91,9 +93,65 @@ wave autonomous [options]
|
|
|
91
93
|
| `--orchestrator-id <id>` | `<lane>-autonomous` | Orchestrator identity |
|
|
92
94
|
| `--resident-orchestrator` | off | Launch resident orchestrator for each wave |
|
|
93
95
|
| `--dashboard` | off | Enable dashboards |
|
|
94
|
-
| `--keep-sessions` | off | Keep tmux sessions between waves |
|
|
96
|
+
| `--keep-sessions` | off | Keep tmux dashboard and projection sessions between waves |
|
|
95
97
|
| `--keep-terminals` | off | Keep terminal entries between waves |
|
|
96
98
|
|
|
99
|
+
When you run Wave in a sandbox with short-lived `exec` sessions, prefer the async supervisor surface instead of binding the whole run to one long-lived `wave autonomous` client process. The end-state sandbox model is documented in [../plans/sandbox-end-state-architecture.md](../plans/sandbox-end-state-architecture.md).
|
|
100
|
+
|
|
101
|
+
## wave submit
|
|
102
|
+
|
|
103
|
+
Submit a launcher request for daemon-owned execution and return quickly with a `runId`.
|
|
104
|
+
|
|
105
|
+
```
|
|
106
|
+
wave submit [launcher options] [--json]
|
|
107
|
+
```
|
|
108
|
+
|
|
109
|
+
Current implementation status: this is a file-backed wrapper over `wave-launcher.mjs` with daemon leases, exact-context lookup, launcher-status reconciliation, progress journaling, and process-backed agent execution. It is the preferred sandbox-facing entrypoint for LEAPclaw, OpenClaw, Nemoshell, Docker, and similar short-lived exec environments, even though the broader daemon convergence described in [../plans/sandbox-end-state-architecture.md](../plans/sandbox-end-state-architecture.md) is still conservative in some recovery paths.
|
|
110
|
+
|
|
111
|
+
`wave submit` accepts the same launcher options you would pass to `wave launch`, for example `--project`, `--lane`, `--start-wave`, `--end-wave`, `--executor`, `--codex-sandbox`, `--timeout-minutes`, `--agent-launch-stagger-ms`, `--resident-orchestrator`, `--no-dashboard`, and `--dry-run`. Use `--json` when you want a structured payload containing `runId`, `project`, `lane`, optional `adhocRunId`, and `statePath`.
|
|
112
|
+
|
|
113
|
+
For concrete setup guidance, read [../guides/sandboxed-environments.md](../guides/sandboxed-environments.md).
|
|
114
|
+
|
|
115
|
+
## wave supervise
|
|
116
|
+
|
|
117
|
+
Run the supervisor loop that claims queued submitted runs and reconciles launcher status.
|
|
118
|
+
|
|
119
|
+
```
|
|
120
|
+
wave supervise [--project <id>] [--lane <name>] [--once]
|
|
121
|
+
```
|
|
122
|
+
|
|
123
|
+
Use `--once` for a single reconciliation pass in tests or wrapper scripts. The shipped daemon now renews a lease, reconciles detached launcher status, and can adopt already-running submitted runs from the same lane-scoped supervisor root.
|
|
124
|
+
|
|
125
|
+
## wave status
|
|
126
|
+
|
|
127
|
+
Read the current supervisor-owned state for a submitted run.
|
|
128
|
+
|
|
129
|
+
```
|
|
130
|
+
wave status --run-id <id> --project <id> --lane <name> [--adhoc-run <id>] [--json]
|
|
131
|
+
```
|
|
132
|
+
|
|
133
|
+
Current implementation status: reads the thin file-backed supervisor state from the exact lane-scoped supervisor root. `--project` and `--lane` are required so status does not guess across unrelated state trees.
|
|
134
|
+
|
|
135
|
+
## wave wait
|
|
136
|
+
|
|
137
|
+
Wait for a submitted run to reach terminal state or until the wait timeout expires.
|
|
138
|
+
|
|
139
|
+
```
|
|
140
|
+
wave wait --run-id <id> --project <id> --lane <name> [--adhoc-run <id>] [--timeout-seconds <n>] [--json]
|
|
141
|
+
```
|
|
142
|
+
|
|
143
|
+
`wave wait` is observational only. Timing out does not cancel or kill the underlying run.
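
A wrapper script might chain `wave submit --json` into `wave wait` as sketched below. The flags and the payload fields `runId`, `project`, `lane`, and `adhocRunId` are the ones documented above; `buildWaitArgs` is a hypothetical helper, and mapping `adhocRunId` onto `--adhoc-run` is an assumption.

```javascript
// Builds the argv for `wave wait` from a parsed `wave submit --json` payload.
function buildWaitArgs(submitPayload, timeoutSeconds) {
  const args = [
    "wait",
    "--run-id", submitPayload.runId,
    "--project", submitPayload.project,
    "--lane", submitPayload.lane,
  ];
  // Assumption: the optional adhocRunId is passed back through --adhoc-run.
  if (submitPayload.adhocRunId) args.push("--adhoc-run", submitPayload.adhocRunId);
  if (Number.isFinite(timeoutSeconds)) args.push("--timeout-seconds", String(timeoutSeconds));
  args.push("--json");
  return args;
}
```

The resulting array can be handed to `child_process.execFile("wave", args, ...)`. Keeping `timeoutSeconds` below the sandbox `exec` limit lets the client exit and re-issue the same wait later, since a wait timeout leaves the run untouched.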
|
|
144
|
+
|
|
145
|
+
## wave attach
|
|
146
|
+
|
|
147
|
+
Attach to a projection for a submitted run.
|
|
148
|
+
|
|
149
|
+
```
|
|
150
|
+
wave attach --run-id <id> --project <id> --lane <name> [--adhoc-run <id>] (--agent <id> | --dashboard)
|
|
151
|
+
```
|
|
152
|
+
|
|
153
|
+
`--agent <id>` attaches to a live session only when the runtime record explicitly exposes one; otherwise it follows the recorded log, or prints the recent log tail if the agent is already terminal. `--dashboard` reuses the current lane dashboard attach surface and falls back to the last written dashboard file when no live dashboard session exists. Missing projections are treated as operator errors, not as run-health failures.
|
|
154
|
+
|
|
97
155
|
## wave control
|
|
98
156
|
|
|
99
157
|
Unified operator control surface. Preferred over legacy `wave coord`, `wave retry`, and `wave proof`.
|
|
@@ -114,6 +172,10 @@ The JSON payload now includes:
|
|
|
114
172
|
Versioned wave-level signal state for wrappers and external operators.
|
|
115
173
|
- `signals.agents`
|
|
116
174
|
Versioned per-agent signal state, including `shouldWake` plus any observed ack metadata.
|
|
175
|
+
- `supervisor`
|
|
176
|
+
The most relevant lane-scoped supervisor run for this wave, including degraded states such as `launcher-lost-agents-running`, recovery fields such as `sessionBackend`, `recoveryState`, and `resumeAction`, plus any recorded per-agent runtime summary.
|
|
177
|
+
- `forwardedClosureGaps`
|
|
178
|
+
Earliest-first forwarded `wave-proof-gap` records from the relaunch plan, including the stage key, originating agent, attempt, detail, and downstream closure targets.
|
|
117
179
|
|
|
118
180
|
Starter repos also include `scripts/wave-status.sh` and `scripts/wave-watch.sh` as thin readers over this JSON payload. They use exit `0` for completed, `20` for input-required, `40` for failed, and `30` from `wave-watch.sh --until-change` when the signal changed but the wave stayed active. For the full wrapper contract, read [../guides/signal-wrappers.md](../guides/signal-wrappers.md).
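
For callers that spawn these scripts from Node, the exit-code contract above can be captured in a small lookup. The numeric codes are taken from the paragraph above; the label strings and the function name are illustrative assumptions.

```javascript
// Exit codes documented for scripts/wave-status.sh and scripts/wave-watch.sh.
const WRAPPER_EXIT_STATES = new Map([
  [0, "completed"],
  [20, "input-required"],
  [30, "changed-still-active"], // only from `wave-watch.sh --until-change`
  [40, "failed"],
]);

// Maps a numeric exit code to a label; anything undocumented is "unknown".
function describeWrapperExit(code) {
  return WRAPPER_EXIT_STATES.get(code) ?? "unknown";
}
```

Treating any other code as `unknown` keeps a caller from mistaking a crashed wrapper for a wave-level result.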
|
|
119
181
|
|
|
@@ -505,6 +567,8 @@ wave dashboard --dashboard-file <path> [--project <id>] [--lane <lane>] [--messa
|
|
|
505
567
|
wave dashboard --project <id> --lane <lane> --attach current|global
|
|
506
568
|
```
|
|
507
569
|
|
|
570
|
+
`wave dashboard --attach current|global` attaches to the live dashboard session when one exists; otherwise it follows the last written dashboard JSON for that target.
|
|
571
|
+
|
|
508
572
|
## Workspace Commands
|
|
509
573
|
|
|
510
574
|
**Initialize workspace:**
|
|
@@ -565,7 +629,7 @@ Interactive draft currently offers worker role kinds:
|
|
|
565
629
|
- `research`
|
|
566
630
|
- `security`
|
|
567
631
|
|
|
568
|
-
Agentic planner payloads also accept `workerAgents[].roleKind = "design"`. The shipped `0.9.
|
|
632
|
+
Agentic planner payloads also accept `workerAgents[].roleKind = "design"`. The shipped `0.9.1` surface uses `design-pass` as the default executor profile for that role and typically assigns a packet path like `docs/plans/waves/design/wave-<n>-<agentId>.md`. Interactive draft scaffolds the docs-first default; hybrid design stewards are authored by explicitly adding implementation-owned paths and the normal implementation contract sections.
|
|
569
633
|
|
|
570
634
|
## Ad-Hoc Task Commands
|
|
571
635
|
|
|
@@ -134,7 +134,7 @@ Practical rule:
|
|
|
134
134
|
|
|
135
135
|
That means a targeted helper request only blocks while it remains open *and* still has blocking severity in coordination state.
|
|
136
136
|
|
|
137
|
-
For the practical `0.9.
|
|
137
|
+
For the practical `0.9.1` recommendation on when to keep records blocking versus when to downgrade them to `soft`, `stale`, or `advisory`, see [../guides/recommendations-0.9.1.md](../guides/recommendations-0.9.1.md).
|
|
138
138
|
|
|
139
139
|
This page is documenting runtime semantics first. The important contract is that closure follows the durable coordination state, not that a particular human or agent used one exact command path to mutate it.
|
|
140
140
|
|
|
@@ -3,7 +3,7 @@
|
|
|
3
3
|
Use this page only if you intentionally want the legacy GitHub Packages install path.
|
|
4
4
|
|
|
5
5
|
GitHub's npm registry still requires authentication for installs from `npm.pkg.github.com`, even when the package and backing repository are public.
|
|
6
|
-
This is now the optional authenticated fallback path. The primary public install path is npmjs. Maintainer npm publishing setup is documented in [npmjs-
|
|
6
|
+
This is now the optional authenticated fallback path. The primary public install path is npmjs. Maintainer npm publishing setup is documented in [npmjs-token-publishing.md](./npmjs-token-publishing.md).
|
|
7
7
|
|
|
8
8
|
## `.npmrc`
|
|
9
9
|
|
|
@@ -1,12 +1,12 @@
|
|
|
1
1
|
---
|
|
2
2
|
title: "Wave Orchestration Migration Guide: 0.2 to 0.5"
|
|
3
|
-
summary: "How to migrate a repository from the earlier 0.2 wave baseline to the current
|
|
3
|
+
summary: "How to migrate a repository from the earlier 0.2 wave baseline to the current Wave model."
|
|
4
4
|
---
|
|
5
5
|
|
|
6
6
|
# Wave Orchestration Migration Guide: 0.2 to 0.5
|
|
7
7
|
|
|
8
8
|
This guide explains how to migrate a repository from the earlier Wave
|
|
9
|
-
Orchestration 0.2 baseline to the current
|
|
9
|
+
Orchestration 0.2 baseline to the current Wave model.
|
|
10
10
|
|
|
11
11
|
Current mainline note:
|
|
12
12
|
|
|
@@ -20,10 +20,11 @@ It uses two concrete references:
|
|
|
20
20
|
- the 0.2-style baseline in the sibling `~/slowfast.ai` repo
|
|
21
21
|
- the current target shape in this standalone `wave-orchestration` repo
|
|
22
22
|
|
|
23
|
-
This is a migration guide for the current architecture
|
|
24
|
-
[
|
|
25
|
-
|
|
26
|
-
|
|
23
|
+
This is a migration guide for the current shipped architecture and release
|
|
24
|
+
direction described across [docs/architecture/README.md](../architecture/README.md)
|
|
25
|
+
and [Roadmap](../roadmap.md), not just a changelog of whatever happens to be
|
|
26
|
+
landed in one point-in-time package build. This document is about the shipped
|
|
27
|
+
runtime shape, not only the semver label.
|
|
27
28
|
|
|
28
29
|
## Baseline And Target
|
|
29
30
|
|
|
@@ -34,7 +35,8 @@ Use these files as the concrete examples while migrating:
|
|
|
34
35
|
- 0.2 baseline wave example: `~/slowfast.ai/docs/plans/waves/wave-7.md`
|
|
35
36
|
- current target config: [wave.config.json](../../wave.config.json)
|
|
36
37
|
- current target sample wave: [wave-0.md](../plans/waves/wave-0.md)
|
|
37
|
-
- current target architecture: [
|
|
38
|
+
- current target architecture: [architecture/README.md](../architecture/README.md)
|
|
39
|
+
- current release direction: [roadmap.md](../roadmap.md)
|
|
38
40
|
- current package workflow: [README.md](../../README.md)
|
|
39
41
|
|
|
40
42
|
The migration is intentionally evolutionary:
|
|
@@ -0,0 +1,53 @@
|
|
|
1
|
+
# npmjs Token Publishing
|
|
2
|
+
|
|
3
|
+
This repo now includes a dedicated npmjs publish workflow at [publish-npm.yml](../../.github/workflows/publish-npm.yml).
|
|
4
|
+
|
|
5
|
+
The current `0.9.1` release procedure publishes through a repository Actions secret named `NPM_TOKEN`.
|
|
6
|
+
|
|
7
|
+
## What This Repo Already Does
|
|
8
|
+
|
|
9
|
+
- `package.json` no longer hardcodes GitHub Packages as the publish registry.
|
|
10
|
+
- `publish-npm.yml` publishes tagged releases to `https://registry.npmjs.org`.
|
|
11
|
+
- `publish-package.yml` still publishes to GitHub Packages explicitly, so both registries can coexist.
|
|
12
|
+
- `publish-npm.yml` expects `NPM_TOKEN` in GitHub Actions secrets.
|
|
13
|
+
- The public install path is already npmjs; GitHub Packages remains the authenticated fallback path.
|
|
14
|
+
|
|
15
|
+
## One-Time npm Setup
|
|
16
|
+
|
|
17
|
+
1. Create an npm granular access token with:
|
|
18
|
+
- package or scope access for `@chllming/wave-orchestration`
|
|
19
|
+
- `Read and write` permission
|
|
20
|
+
- `Bypass 2FA` enabled
|
|
21
|
+
2. In the GitHub repo `chllming/agent-wave-orchestrator`, add that token as an Actions secret named `NPM_TOKEN`.
|
|
22
|
+
3. Rotate or revoke the token when no longer needed.
|
|
23
|
+
|
|
24
|
+
## GitHub Workflow Behavior
|
|
25
|
+
|
|
26
|
+
The npmjs workflow:
|
|
27
|
+
|
|
28
|
+
- runs on GitHub-hosted runners
|
|
29
|
+
- requires `contents: read`
|
|
30
|
+
- installs dependencies with `pnpm install --frozen-lockfile`
|
|
31
|
+
- runs `pnpm test`
|
|
32
|
+
- publishes with `pnpm publish --access public --no-git-checks`
|
|
33
|
+
- authenticates with `NODE_AUTH_TOKEN=${{ secrets.NPM_TOKEN }}`
|
|
34
|
+
|
|
35
|
+
## Security Follow-Up
|
|
36
|
+
|
|
37
|
+
After a successful npm publish:
|
|
38
|
+
|
|
39
|
+
1. Keep the token scoped only to this package or scope.
|
|
40
|
+
2. Rotate the token periodically.
|
|
41
|
+
3. Revoke emergency or temporary tokens once they are no longer needed.
|
|
42
|
+
|
|
43
|
+
If this repo later needs private npm dependencies during CI, consider a separate read-only install token rather than reusing the publish token.
|
|
44
|
+
|
|
45
|
+
## Release Checklist
|
|
46
|
+
|
|
47
|
+
1. Confirm [publish-npm.yml](../../.github/workflows/publish-npm.yml) is on the default branch.
|
|
48
|
+
2. Confirm `NPM_TOKEN` exists in the GitHub repo secrets.
|
|
49
|
+
3. Confirm the package version has been bumped and committed.
|
|
50
|
+
4. Confirm `README.md`, `CHANGELOG.md`, `releases/manifest.json`, and `docs/plans/migration.md` all describe the same release surface.
|
|
51
|
+
5. Push the release commit and release tag, for example `v0.9.1`.
|
|
52
|
+
6. Verify both `publish-npm.yml` and `publish-package.yml` start from the tag push.
|
|
53
|
+
7. Verify the npmjs publish completes successfully for the tagged source.
|
|
@@ -1,53 +1,7 @@
|
|
|
1
|
-
# npmjs Publishing
|
|
1
|
+
# npmjs Trusted Publishing
|
|
2
2
|
|
|
3
|
-
This repo
|
|
3
|
+
This repo does not currently use npm trusted publishing or OIDC-based publish credentials.
|
|
4
4
|
|
|
5
|
-
The
|
|
5
|
+
The live npmjs workflow is token-based and documented in [npmjs-token-publishing.md](./npmjs-token-publishing.md).
|
|
6
6
|
|
|
7
|
-
|
|
8
|
-
|
|
9
|
-
- `package.json` no longer hardcodes GitHub Packages as the publish registry.
|
|
10
|
-
- `publish-npm.yml` publishes tagged releases to `https://registry.npmjs.org`.
|
|
11
|
-
- `publish-package.yml` still publishes to GitHub Packages explicitly, so both registries can coexist.
|
|
12
|
-
- `publish-npm.yml` expects `NPM_TOKEN` in GitHub Actions secrets.
|
|
13
|
-
- The public install path is already npmjs; GitHub Packages remains the authenticated fallback path.
|
|
14
|
-
|
|
15
|
-
## One-Time npm Setup
|
|
16
|
-
|
|
17
|
-
1. Create an npm granular access token with:
|
|
18
|
-
- package or scope access for `@chllming/wave-orchestration`
|
|
19
|
-
- `Read and write` permission
|
|
20
|
-
- `Bypass 2FA` enabled
|
|
21
|
-
2. In the GitHub repo `chllming/agent-wave-orchestrator`, add that token as an Actions secret named `NPM_TOKEN`.
|
|
22
|
-
3. Rotate or revoke the token when no longer needed.
|
|
23
|
-
|
|
24
|
-
## GitHub Workflow Behavior
|
|
25
|
-
|
|
26
|
-
The npmjs workflow:
|
|
27
|
-
|
|
28
|
-
- runs on GitHub-hosted runners
|
|
29
|
-
- requires `contents: read`
|
|
30
|
-
- installs dependencies with `pnpm install --frozen-lockfile`
|
|
31
|
-
- runs `pnpm test`
|
|
32
|
-
- publishes with `pnpm publish --access public --no-git-checks`
|
|
33
|
-
- authenticates with `NODE_AUTH_TOKEN=${{ secrets.NPM_TOKEN }}`
|
|
34
|
-
|
|
35
|
-
## Security Follow-Up
|
|
36
|
-
|
|
37
|
-
After a successful npm publish:
|
|
38
|
-
|
|
39
|
-
1. Keep the token scoped only to this package or scope.
|
|
40
|
-
2. Rotate the token periodically.
|
|
41
|
-
3. Revoke emergency or temporary tokens once they are no longer needed.
|
|
42
|
-
|
|
43
|
-
If this repo later needs private npm dependencies during CI, consider a separate read-only install token rather than reusing the publish token.
|
|
44
|
-
|
|
45
|
-
## Release Checklist
|
|
46
|
-
|
|
47
|
-
1. Confirm [publish-npm.yml](../../.github/workflows/publish-npm.yml) is on the default branch.
|
|
48
|
-
2. Confirm `NPM_TOKEN` exists in the GitHub repo secrets.
|
|
49
|
-
3. Confirm the package version has been bumped and committed.
|
|
50
|
-
4. Confirm `README.md`, `CHANGELOG.md`, `releases/manifest.json`, and `docs/plans/migration.md` all describe the same release surface.
|
|
51
|
-
5. Push the release commit and release tag, for example `v0.9.0`.
|
|
52
|
-
6. Verify both `publish-npm.yml` and `publish-package.yml` start from the tag push.
|
|
53
|
-
7. Verify the npmjs publish completes successfully for the tagged source.
|
|
7
|
+
If this repo later migrates to trusted publishing, replace this compatibility stub with the real OIDC procedure in the same change as the workflow update.
|