@chllming/wave-orchestration 0.9.14 → 0.9.15

Files changed (47)
  1. package/CHANGELOG.md +27 -0
  2. package/README.md +7 -7
  3. package/docs/README.md +3 -3
  4. package/docs/concepts/operating-modes.md +1 -1
  5. package/docs/guides/author-and-run-waves.md +1 -1
  6. package/docs/guides/planner.md +2 -2
  7. package/docs/guides/recommendations-0.9.15.md +83 -0
  8. package/docs/guides/sandboxed-environments.md +2 -2
  9. package/docs/guides/signal-wrappers.md +10 -0
  10. package/docs/plans/agent-first-closure-hardening.md +612 -0
  11. package/docs/plans/current-state.md +3 -3
  12. package/docs/plans/end-state-architecture.md +1 -1
  13. package/docs/plans/examples/wave-example-design-handoff.md +1 -1
  14. package/docs/plans/examples/wave-example-live-proof.md +1 -1
  15. package/docs/plans/migration.md +75 -20
  16. package/docs/reference/cli-reference.md +34 -1
  17. package/docs/reference/coordination-and-closure.md +16 -1
  18. package/docs/reference/npmjs-token-publishing.md +3 -3
  19. package/docs/reference/package-publishing-flow.md +13 -11
  20. package/docs/reference/runtime-config/README.md +2 -2
  21. package/docs/reference/sample-waves.md +5 -5
  22. package/docs/reference/skills.md +1 -1
  23. package/docs/reference/wave-control.md +1 -1
  24. package/docs/roadmap.md +5 -3
  25. package/package.json +8 -9
  26. package/releases/manifest.json +35 -0
  27. package/scripts/context7-api-check.sh +0 -0
  28. package/scripts/context7-export-env.sh +0 -0
  29. package/scripts/wave-autonomous.mjs +0 -0
  30. package/scripts/wave-dashboard.mjs +0 -0
  31. package/scripts/wave-human-feedback.mjs +0 -0
  32. package/scripts/wave-launcher.mjs +0 -0
  33. package/scripts/wave-local-executor.mjs +0 -0
  34. package/scripts/wave-orchestrator/agent-state.mjs +200 -312
  35. package/scripts/wave-orchestrator/artifact-schemas.mjs +37 -2
  36. package/scripts/wave-orchestrator/closure-adjudicator.mjs +311 -0
  37. package/scripts/wave-orchestrator/control-cli.mjs +212 -18
  38. package/scripts/wave-orchestrator/dashboard-state.mjs +40 -0
  39. package/scripts/wave-orchestrator/derived-state-engine.mjs +3 -0
  40. package/scripts/wave-orchestrator/gate-engine.mjs +93 -1
  41. package/scripts/wave-orchestrator/install.mjs +1 -1
  42. package/scripts/wave-orchestrator/launcher.mjs +49 -10
  43. package/scripts/wave-orchestrator/signal-cli.mjs +271 -0
  44. package/scripts/wave-orchestrator/structured-signal-parser.mjs +499 -0
  45. package/scripts/wave-status.sh +0 -0
  46. package/scripts/wave-watch.sh +0 -0
  47. package/scripts/wave.mjs +9 -0
@@ -0,0 +1,612 @@
1
+ ---
2
+ title: "Agent-First Closure Hardening"
3
+ summary: "Patch plan for lenient closure signal ingestion, deterministic closure adjudication, and clearer runtime state in the Wave orchestrator."
4
+ ---
5
+
6
+ # Agent-First Closure Hardening
7
+
8
+ ## Summary
9
+
10
+ Wave currently holds a strong line on closure quality, which is correct, but some of that strictness sits in the wrong layer.
11
+
12
+ Today the runtime is strict about marker transport and syntax in `agent-state.mjs`, but relatively coarse in how it projects runtime state after the agent work is already over. That creates a bad failure mode for agent-operated runs:
13
+
14
+ - the work is actually done
15
+ - deliverables and proof artifacts are present
16
+ - the only failure is malformed or incomplete closure markers
17
+ - the runtime persists a relaunch plan
18
+ - `wave control status` can still look like `running` or "not done"
19
+ - operators end up manually judging closure even though the orchestrator already has enough evidence to do most of that deterministically
20
+
21
+ This plan hardens closure for agent-driven use without lowering semantic quality bars. The core change is to make closure transport more forgiving, keep semantic proof requirements strict, and introduce a deterministic adjudication layer before any rework or relaunch plan is persisted.
22
+
23
+ ## Problem Statement
24
+
25
+ The current runtime has three related issues.
26
+
27
+ ### 1. Marker transport is stricter than closure semantics
28
+
29
+ Implementation closure currently depends on exact structured marker parsing in `scripts/wave-orchestrator/agent-state.mjs`.
30
+
31
+ - `buildAgentExecutionSummary()` extracts `[wave-proof]`, `[wave-doc-delta]`, and `[wave-component]` markers with regex-only parsing.
32
+ - `validateImplementationSummary()` treats malformed or missing structured markers as terminal closure failures.
33
+ - `validateDocumentationClosureSummary()` already has a more pragmatic fallback for empty documentation runs, but implementation closure does not.
34
+
35
+ That asymmetry means the runtime already accepts deterministic closure leniency in some places, but not in the place where CLI agents are most likely to make transport mistakes.
36
+
37
+ ### 2. Closure transport failures escalate too quickly into relaunch plans
38
+
39
+ `scripts/wave-orchestrator/retry-engine.mjs` currently folds agent closure failures into `closureGate` selection logic. Once that relaunch state is persisted, `scripts/wave-orchestrator/control-cli.mjs` can surface it as the dominant blocking edge.
40
+
41
+ That is the right behavior for real semantic failures, but it is too aggressive for syntax-only or transport-only failures where the runtime can still prove that the owned work landed correctly.
42
+
43
+ ### 3. Runtime status conflates live execution with unresolved closure
44
+
45
+ The current dashboard and control projection can leave a wave feeling "running" even after the launcher and agent processes are no longer doing work.
46
+
47
+ The runtime needs to distinguish:
48
+
49
+ - agent execution is still active
50
+ - closure evidence is incomplete or ambiguous
51
+ - a controller or relaunch plan exists but no process is live
52
+
53
+ Without that split, agent operators see "not done" when what they really have is "done, pending adjudication" or "done, closure blocked."
54
+
55
+ ## Goals
56
+
57
+ - Keep closure quality high for semantic proof, deliverables, component promotions, and shared-state coherence.
58
+ - Make structured signal ingestion tolerant of key ordering, extra fields, and equivalent state spellings.
59
+ - Run deterministic closure adjudication before creating relaunch plans for transport-only failures.
60
+ - Expose runtime state so completed agent execution is never mislabeled as ongoing work.
61
+ - Improve the CLI surface for Codex, Claude Code, and OpenCode so they do not need to hand-type fragile marker lines.
62
+ - Keep the existing marker-based contract backward compatible during rollout.
63
+
64
+ ## Non-Goals
65
+
66
+ - Do not auto-close waves that still have semantic proof gaps, missing deliverables, missing proof artifacts, or unresolved blocking coordination.
67
+ - Do not replace the current closure sweep with an LLM judge as the primary source of truth.
68
+ - Do not require repositories to migrate their wave files immediately.
69
+ - Do not remove the existing `[wave-*]` markers in the first rollout.
70
+
71
+ ## Design Principles
72
+
73
+ ### Semantic strictness, transport leniency
74
+
75
+ The orchestrator should remain strict about whether the work is actually proven. It should become more forgiving only about how agents express that proof.
76
+
77
+ ### Deterministic first, specialist fallback second
78
+
79
+ If the runtime already has enough evidence to make a safe closure decision, it should do so itself. A specialist judge runtime should only run when the deterministic layer cannot decide.
80
+
81
+ ### Execution state is not closure state
82
+
83
+ The operator surface should tell an agent whether the run is still working, whether closure is still evaluating, and whether the controller is simply holding a relaunch plan.
84
+
85
+ ### Agent ergonomics are a product requirement
86
+
87
+ Wave is primarily operated through agents. The ergonomics should fit a CLI-driving coding agent, not assume a careful human operator babysitting marker syntax.
88
+
89
+ ## Proposal
90
+
91
+ ## 1. Replace regex-only signal parsing with normalized structured signal ingestion
92
+
93
+ ### Current surface
94
+
95
+ The current extraction path in `scripts/wave-orchestrator/agent-state.mjs` is regex-driven:
96
+
97
+ - `WAVE_PROOF_REGEX`
98
+ - `WAVE_DOC_DELTA_REGEX`
99
+ - `WAVE_DOC_CLOSURE_REGEX`
100
+ - `WAVE_INTEGRATION_REGEX`
101
+ - `WAVE_COMPONENT_REGEX`
102
+
103
+ This is brittle when agents emit valid intent with slightly different key order or extra keys.
104
+
105
+ ### Patch
106
+
107
+ Add a normalized structured signal parser that treats each `[wave-*]` line as:
108
+
109
+ - a marker type
110
+ - an unordered key/value map
111
+ - an optional free-text suffix
112
+
113
+ Implementation shape:
114
+
115
+ - add `scripts/wave-orchestrator/structured-signal-parser.mjs`
116
+ - move signal normalization into reusable functions such as:
117
+   - `parseWaveSignalLine(line)`
118
+   - `normalizeWaveProofSignal(record)`
119
+   - `normalizeWaveDocDeltaSignal(record)`
120
+   - `normalizeWaveComponentSignal(record)`
121
+   - `normalizeWaveIntegrationSignal(record)`
122
+ - keep the existing regexes as compatibility parsers for older logs and edge cases
123
+
124
+ ### Parsing rules
125
+
126
+ - Required keys remain required.
127
+ - Unknown keys are ignored for gating but preserved in diagnostics.
128
+ - Key order is irrelevant.
129
+ - Equivalent spellings are normalized:
130
+   - `state=complete` -> `state=met`
131
+   - optional aliases may be accepted where unambiguous
132
+ - Extra keys do not invalidate the marker.
133
+ - The normalized parser should accept lines that the current regex rejects when the required semantic fields are still present.
134
+
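The parsing rules above can be sketched roughly as follows. This is a minimal illustration, not the shipped `structured-signal-parser.mjs`: the marker grammar (`[wave-<type>] key=value ... detail=<free text>`), the list of required proof keys (taken from the `wave signal proof` flags proposed later in this plan), and the returned object shapes are all assumptions.

```javascript
// Hypothetical sketch. Assumed grammar: "[wave-<type>] key=value ... detail=<free text>".
const REQUIRED_PROOF_KEYS = ["completion", "durability", "proof", "state"];
const STATE_ALIASES = new Map([["complete", "met"]]);

// Split one marker line into a type, an unordered key/value map, and a free-text suffix.
function parseWaveSignalLine(line) {
  const match = /^\[(wave-[a-z-]+)\]\s*(.*)$/.exec(String(line).trim());
  if (!match) return null;
  const fields = Object.create(null);
  const tokens = match[2].split(/\s+/).filter(Boolean);
  let suffix = "";
  for (let i = 0; i < tokens.length; i += 1) {
    const eq = tokens[i].indexOf("=");
    const key = eq > 0 ? tokens[i].slice(0, eq) : "";
    if (key === "" || key === "detail") {
      // Assumption: free text starts at `detail=` or at the first non key=value token.
      const first = key === "detail" ? tokens[i].slice(eq + 1) : tokens[i];
      suffix = [first, ...tokens.slice(i + 1)].join(" ").trim();
      break;
    }
    fields[key] = tokens[i].slice(eq + 1);
  }
  return { type: match[1], fields, suffix };
}

// Required keys stay required; key order and extra keys do not matter.
function normalizeWaveProofSignal(record) {
  if (!record || record.type !== "wave-proof") {
    return { ok: false, reason: "not-a-proof-marker" };
  }
  const missing = REQUIRED_PROOF_KEYS.filter((key) => !(key in record.fields));
  if (missing.length > 0) {
    return { ok: false, reason: "missing-required-keys", missing };
  }
  const { completion, durability, proof, state: rawState } = record.fields;
  const state = STATE_ALIASES.get(rawState) ?? rawState;
  // Unknown keys are ignored for gating but reported for diagnostics.
  const unknownKeys = Object.keys(record.fields).filter(
    (key) => !REQUIRED_PROOF_KEYS.includes(key),
  );
  return {
    ok: true,
    normalized: state !== rawState,
    unknownKeys,
    signal: { completion, durability, proof, state, detail: record.suffix },
  };
}
```

In this sketch a reordered line such as `[wave-proof] state=complete proof=unit owner=A1 completion=contract durability=none detail=tests pass` is accepted, `state` is normalized to `met`, and `owner` is reported as an unknown key. The field values in that line are illustrative placeholders.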
135
+ ### Diagnostics
136
+
137
+ Extend `structuredSignalDiagnostics` so it records:
138
+
139
+ - `rawCount`
140
+ - `acceptedCount`
141
+ - `normalizedCount`
142
+ - `rejectedCount`
143
+ - rejected samples
144
+ - accepted-but-normalized samples
145
+ - unknown keys seen
146
+
147
+ This preserves debuggability while making transport more resilient.
148
+
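One possible shape for the extended diagnostics record is sketched below. The four counter names come from the list above; the sample field names (`rejectedSamples`, `normalizedSamples`, `unknownKeysSeen`), the sample cap, and the input shape are assumptions. The sketch also assumes that normalized lines count as a subset of accepted lines.

```javascript
// Hypothetical sketch: input items look like { raw, ok, normalized, unknownKeys }.
function buildStructuredSignalDiagnostics(results, maxSamples = 3) {
  const diagnostics = {
    rawCount: 0,
    acceptedCount: 0,
    normalizedCount: 0,
    rejectedCount: 0,
    rejectedSamples: [],
    normalizedSamples: [],
    unknownKeysSeen: [],
  };
  const unknownKeys = new Set();
  for (const result of results) {
    diagnostics.rawCount += 1;
    if (!result.ok) {
      diagnostics.rejectedCount += 1;
      if (diagnostics.rejectedSamples.length < maxSamples) {
        diagnostics.rejectedSamples.push(result.raw);
      }
      continue;
    }
    diagnostics.acceptedCount += 1;
    if (result.normalized) {
      // Accepted, but only after alias or ordering normalization.
      diagnostics.normalizedCount += 1;
      if (diagnostics.normalizedSamples.length < maxSamples) {
        diagnostics.normalizedSamples.push(result.raw);
      }
    }
    for (const key of result.unknownKeys ?? []) unknownKeys.add(key);
  }
  diagnostics.unknownKeysSeen = [...unknownKeys].sort();
  return diagnostics;
}
```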
149
+ ### Backward compatibility
150
+
151
+ `buildAgentExecutionSummary()` should prefer normalized records when present, then fall back to regex-derived legacy parsing. Existing marker lines remain valid.
152
+
153
+ ## 2. Add explicit closure transport versus semantic failure classification
154
+
155
+ ### Current surface
156
+
157
+ `validateImplementationSummary()` in `scripts/wave-orchestrator/agent-state.mjs` currently returns terminal failures such as:
158
+
159
+ - `missing-wave-proof`
160
+ - `invalid-wave-proof-format`
161
+ - `missing-doc-delta`
162
+ - `invalid-doc-delta-format`
163
+ - `missing-wave-component`
164
+ - `invalid-wave-component-format`
165
+
166
+ Those are all treated similarly by downstream retry logic even though they are not equally severe.
167
+
168
+ ### Patch
169
+
170
+ Introduce a formal closure failure taxonomy:
171
+
172
+ - `transport-failure`
173
+   Marker missing, malformed, or only partially parseable.
174
+ - `semantic-failure`
175
+   Marker is valid but says the proof is insufficient.
176
+ - `artifact-failure`
177
+   Required deliverable or proof artifact is missing.
178
+ - `state-failure`
179
+   Shared closure state is contradictory or blocking.
180
+
181
+ Implementation shape:
182
+
183
+ - extend validation results in `agent-state.mjs` with:
184
+   - `failureClass`
185
+   - `eligibleForAdjudication`
186
+   - `adjudicationHint`
187
+
188
+ Examples:
189
+
190
+ - `invalid-wave-proof-format` -> `failureClass: "transport-failure"`
191
+ - `missing-wave-proof` with exit code `0`, deliverables present, proof artifacts present -> `failureClass: "transport-failure"`
192
+ - `wave-proof-gap` -> `failureClass: "semantic-failure"`
193
+ - `missing-deliverable` -> `failureClass: "artifact-failure"`
194
+
195
+ This gives downstream engines enough shape to react proportionally.
196
+
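The taxonomy and examples above could map onto validation results roughly like this. It is a sketch under stated assumptions: the function name `classifyClosureFailure` and its input shape are hypothetical, unknown status codes fall back to `semantic-failure` so they fail closed, and `adjudicationHint` is left out because the plan does not define its contents.

```javascript
// Hypothetical sketch: only status codes named in this plan are covered.
const MALFORMED_MARKER_CODES = new Set([
  "invalid-wave-proof-format",
  "invalid-doc-delta-format",
  "invalid-wave-component-format",
]);
const MISSING_MARKER_CODES = new Set([
  "missing-wave-proof",
  "missing-doc-delta",
  "missing-wave-component",
]);

function classifyClosureFailure({ statusCode, exitCode, deliverablesPresent, proofArtifactsPresent }) {
  let failureClass;
  if (MALFORMED_MARKER_CODES.has(statusCode) || MISSING_MARKER_CODES.has(statusCode)) {
    failureClass = "transport-failure";
  } else if (statusCode === "missing-deliverable") {
    failureClass = "artifact-failure";
  } else {
    // Includes "wave-proof-gap"; unknown codes also fail closed (assumption).
    failureClass = "semantic-failure";
  }
  // Transport failures are only worth adjudicating when durable evidence exists.
  const eligibleForAdjudication =
    failureClass === "transport-failure" &&
    exitCode === 0 &&
    deliverablesPresent === true &&
    proofArtifactsPresent === true;
  return { statusCode, failureClass, eligibleForAdjudication };
}
```

Keeping `statusCode` in the result mirrors the plan's intent to preserve current status codes while adding richer metadata.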
197
+ ## 3. Add deterministic closure adjudication before relaunch
198
+
199
+ ### Current surface
200
+
201
+ `scripts/wave-orchestrator/closure-engine.mjs` forwards some closure blockers, and `scripts/wave-orchestrator/retry-engine.mjs` can turn them into a relaunch plan. There is no deterministic layer that asks: "is the work actually proven despite a bad marker?"
202
+
203
+ ### Patch
204
+
205
+ Add a new deterministic adjudication module:
206
+
207
+ - `scripts/wave-orchestrator/closure-adjudicator.mjs`
208
+
209
+ Primary interface:
210
+
211
+ - `evaluateClosureAdjudication({ wave, lanePaths, gate, summary, derivedState, agentRun, envelope, options })`
212
+
213
+ Outputs:
214
+
215
+ - `status: "pass"`
216
+ - `status: "rework-required"`
217
+ - `status: "ambiguous"`
218
+ - `reason`
219
+ - `evidence`
220
+ - `synthesizedSignals`
221
+
222
+ ### Eligibility rules
223
+
224
+ Only attempt deterministic adjudication when all of the following are true:
225
+
226
+ - the failure class is `transport-failure`
227
+ - the agent exit code is `0`
228
+ - required deliverables exist
229
+ - required proof artifacts exist
230
+ - there is no explicit negative semantic signal such as `wave-proof-gap`
231
+ - there is no unresolved blocking coordination owned by the same closure slice that would still fail the wave
232
+
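The eligibility rules above amount to a conjunction of six checks. The sketch below is illustrative only: `checkAdjudicationEligibility` and its input fields are hypothetical names, and the real `evaluateClosureAdjudication` interface takes richer inputs (`wave`, `gate`, `summary`, and so on) than the flattened evidence object assumed here.

```javascript
// Hypothetical sketch: evidence is assumed to be pre-computed by the caller.
function checkAdjudicationEligibility(evidence) {
  const checks = {
    transportFailure: evidence.failureClass === "transport-failure",
    cleanExit: evidence.exitCode === 0,
    deliverablesPresent: evidence.deliverablesPresent === true,
    proofArtifactsPresent: evidence.proofArtifactsPresent === true,
    // For example, an explicit wave-proof-gap marker.
    noNegativeSignal: (evidence.negativeSignals ?? []).length === 0,
    noBlockingCoordination: (evidence.unresolvedBlockers ?? []).length === 0,
  };
  // Report which rules failed so the decision stays explainable in postmortems.
  const failedChecks = Object.keys(checks).filter((name) => !checks[name]);
  return { eligible: failedChecks.length === 0, failedChecks };
}
```

Returning the failed check names rather than a bare boolean is a design choice of this sketch; it fits the plan's goal of keeping adjudication decisions traceable.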
233
+ ### Evidence sources
234
+
235
+ The adjudicator may use:
236
+
237
+ - result envelopes
238
+ - deliverable existence summaries
239
+ - proof artifact existence summaries
240
+ - component matrix expectations
241
+ - derived integration and documentation state
242
+ - structured diagnostics
243
+ - explicit negative gap markers
244
+
245
+ It should not inspect free-form narration as primary truth when durable artifacts already exist.
246
+
247
+ ### Adjudication outcomes
248
+
249
+ #### `pass`
250
+
251
+ If the work is semantically proven and only the transport failed:
252
+
253
+ - closure succeeds
254
+ - the gate result is marked as adjudicated
255
+ - a normalized synthetic signal record may be stored for replay/debugging
256
+ - no relaunch plan is created
257
+
258
+ #### `rework-required`
259
+
260
+ If the evidence is clear that the closure contract is not met:
261
+
262
+ - preserve current relaunch behavior
263
+
264
+ #### `ambiguous`
265
+
266
+ If the deterministic layer cannot decide:
267
+
268
+ - do not report the wave as still running
269
+ - mark closure as awaiting adjudication or specialist review
270
+ - only then consider a specialist judge runtime or explicit rerun policy
271
+
272
+ ### Persistence
273
+
274
+ Persist deterministic adjudication artifacts under a new path:
275
+
276
+ - `.tmp/<lane>-wave-launcher/closure/wave-<n>/attempt-<a>/<agent>.json`
277
+
278
+ That keeps replay and postmortems explainable.
279
+
280
+ ## 4. Use the adjudicator inside gate and closure sequencing
281
+
282
+ ### `gate-engine.mjs`
283
+
284
+ Add adjudication handling to implementation-gate reads, following the same spirit as the existing documentation auto-close path.
285
+
286
+ Patch points:
287
+
288
+ - implementation gate readers should call the adjudicator when validation returns `eligibleForAdjudication`
289
+ - returned gate payloads should carry:
290
+   - `adjudicated: true|false`
291
+   - `adjudicationStatus`
292
+   - `adjudicationArtifactPath`
293
+
294
+ This gives implementation closure the same kind of pragmatic recovery path that documentation closure already has for empty runs.
295
+
296
+ ### `closure-engine.mjs`
297
+
298
+ Refine forwarding behavior:
299
+
300
+ - keep forwarding true semantic proof gaps
301
+ - do not forward transport-only failures as closure-critical blockers before adjudication
302
+ - add a new intermediate stage status such as `awaiting-adjudication`
303
+
304
+ `isForwardableClosureGap()` should stop treating every proof-related failure the same way. It should forward only semantic gaps or adjudication-confirmed failures.
305
+
306
+ ## 5. Change retry planning so transport failures do not immediately become relaunch plans
307
+
308
+ ### Current surface
309
+
310
+ `retry-engine.mjs` computes `closureGate` from selected-agent failures without distinguishing syntax-only failures from true rework conditions.
311
+
312
+ ### Patch
313
+
314
+ Update retry planning to use the new failure taxonomy:
315
+
316
+ - `transport-failure` + adjudication pending -> do not create relaunch plan yet
317
+ - `transport-failure` + adjudication pass -> no relaunch plan
318
+ - `transport-failure` + adjudication ambiguous -> create a narrower adjudication or judge action, not a general relaunch
319
+ - `semantic-failure` or `artifact-failure` -> preserve current relaunch planning behavior
320
+
321
+ Add a more precise reason bucket split:
322
+
323
+ - `closureTransport`
324
+ - `closureSemantic`
325
+ - `closureArtifacts`
326
+ - `closureState`
327
+
328
+ This keeps the retry engine honest about what actually failed.
329
+
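The retry rules and reason buckets above can be read as a small decision table. The following is a sketch, not the `retry-engine.mjs` implementation: the function name, the `action` labels, and the one-to-one mapping from failure class to reason bucket are assumptions, as is treating `state-failure` like the other non-transport classes and treating a missing adjudication status as pending.

```javascript
// Hypothetical sketch: bucket mapping is inferred from the names in the plan.
const REASON_BUCKETS = new Map([
  ["transport-failure", "closureTransport"],
  ["semantic-failure", "closureSemantic"],
  ["artifact-failure", "closureArtifacts"],
  ["state-failure", "closureState"],
]);

function planClosureRetry({ failureClass, adjudicationStatus }) {
  const reasonBucket = REASON_BUCKETS.get(failureClass) ?? null;
  if (failureClass !== "transport-failure") {
    // Semantic and artifact failures keep the current relaunch behavior.
    return { action: "relaunch", reasonBucket };
  }
  switch (adjudicationStatus) {
    case "pass":
      return { action: "none", reasonBucket };
    case "ambiguous":
      // Narrower than a general relaunch: adjudication or judge action only.
      return { action: "request-adjudication", reasonBucket };
    case "rework-required":
      return { action: "relaunch", reasonBucket };
    default:
      // Pending or not yet adjudicated: do not create a relaunch plan yet.
      return { action: "wait-for-adjudication", reasonBucket };
  }
}
```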
330
+ ## 6. Split execution, closure, and controller state in status projections
331
+
332
+ ### Current surface
333
+
334
+ `control-cli.mjs`, `dashboard-state.mjs`, and the dashboard schema mostly project a single top-level "running-like" state. That is not enough once closure and live process state diverge.
335
+
336
+ ### Patch
337
+
338
+ Introduce explicit projected state fields:
339
+
340
+ - `executionState`
341
+   - `pending`
342
+   - `active`
343
+   - `settled`
344
+ - `closureState`
345
+   - `pending`
346
+   - `evaluating`
347
+   - `awaiting-adjudication`
348
+   - `blocked`
349
+   - `passed`
350
+   - `failed`
351
+ - `controllerState`
352
+   - `active`
353
+   - `idle`
354
+   - `stale`
355
+   - `relaunch-planned`
356
+
357
+ Top-level `status` can remain for compatibility, but it should be derived from those three states instead of hiding them.
358
+
359
+ ### Projection rules
360
+
361
+ - If no launcher process is live and no agent runtime is live, `executionState` must not be `active`.
362
+ - A persisted relaunch plan without a live launcher should project `controllerState: "relaunch-planned"`, not imply active execution.
363
+ - A wave with all agent processes done and only transport-level closure ambiguity should project `closureState: "awaiting-adjudication"`.
364
+
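A minimal sketch of the three projection rules above follows. The function name and input flags are hypothetical, and several details are assumptions rather than part of the plan: any live process projects `active`, a wave whose agents never started projects `pending`, and the `stale` controller state is not modeled because the plan does not say when it applies.

```javascript
// Hypothetical sketch: inputs are booleans the caller derives from live-process checks.
function projectWaveState({
  launcherLive,
  agentRuntimeLive,
  agentsStarted,
  relaunchPlanPersisted,
  transportAmbiguityOnly,
  closureState,
}) {
  // Rule 1: with no live launcher and no live agent runtime, never project "active".
  let executionState;
  if (launcherLive || agentRuntimeLive) executionState = "active";
  else executionState = agentsStarted ? "settled" : "pending";

  // Rule 2: a persisted relaunch plan without a live launcher is only a plan.
  let controllerState;
  if (launcherLive) controllerState = "active";
  else controllerState = relaunchPlanPersisted ? "relaunch-planned" : "idle";

  // Rule 3: settled agents plus transport-only ambiguity await adjudication.
  const projectedClosureState =
    executionState === "settled" && transportAmbiguityOnly
      ? "awaiting-adjudication"
      : closureState;

  return { executionState, closureState: projectedClosureState, controllerState };
}
```

For a finished run with a leftover relaunch plan and only a malformed marker outstanding, this sketch yields `settled`, `awaiting-adjudication`, and `relaunch-planned`, which matches the "done, pending adjudication" reading described in the problem statement.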
365
+ ### Module impact
366
+
367
+ - `scripts/wave-orchestrator/control-cli.mjs`
368
+ - `scripts/wave-orchestrator/dashboard-state.mjs`
369
+ - `scripts/wave-orchestrator/artifact-schemas.mjs`
370
+ - possibly `scripts/wave-orchestrator/session-supervisor.mjs` for stronger live-runtime detection
371
+
372
+ ## 7. Add agent-friendly signal helper commands
373
+
374
+ ### Current surface
375
+
376
+ Agents currently have to type fragile marker lines directly into their terminal output.
377
+
378
+ ### Patch
379
+
380
+ Add a `wave signal` family to the CLI:
381
+
382
+ - `pnpm exec wave signal proof --completion <level> --durability <level> --proof <level> --state <state> --detail "..."`
383
+ - `pnpm exec wave signal doc-delta --state <state> --path <file> --detail "..."`
384
+ - `pnpm exec wave signal component --id <component> --level <level> --state <state> --detail "..."`
385
+ - `pnpm exec wave signal integration --state <state> --claims <n> --conflicts <n> --blockers <n> --detail "..."`
386
+ - `pnpm exec wave signal doc-closure --state <state> --path <file> --detail "..."`
387
+
388
+ Behavior:
389
+
390
+ - print the canonical marker line to stdout so it lands in the captured log
391
+ - optionally support `--json` for machine-driven wrappers
392
+ - optionally support `--append-file <path>` for direct marker-file output in environments that want it
393
+
394
+ This is intentionally small. It reduces agent error without forcing a breaking transport rewrite.
395
+
396
+ ### Docs impact
397
+
398
+ Update:
399
+
400
+ - `docs/reference/cli-reference.md`
401
+ - `docs/guides/signal-wrappers.md`
402
+ - `docs/reference/coordination-and-closure.md`
403
+
404
+ ## 8. Add a specialist closure judge only as the last fallback
405
+
406
+ ### Why not first
407
+
408
+ A specialist judge runtime is useful, but it should not become the main closure authority. If used too early, it becomes an expensive manual-review simulator and weakens deterministic reproducibility.
409
+
410
+ ### Patch
411
+
412
+ Add an optional specialist adjudication mode behind config:
413
+
414
+ - `closurePolicy.adjudication.specialistJudge.enabled`
415
+ - `closurePolicy.adjudication.specialistJudge.thresholds`
416
+ - `closurePolicy.adjudication.specialistJudge.executorProfile`
417
+
418
+ When to use it:
419
+
420
+ - deterministic adjudication returns `ambiguous`
421
+ - closure ambiguity has lasted beyond a configured timeout
422
+ - the wave is otherwise settled
423
+
424
+ What it should do:
425
+
426
+ - evaluate the specific closure contract for the blocked slice
427
+ - emit one of:
428
+   - `pass`
429
+   - `rework-required`
430
+   - `insufficient-evidence`
431
+
432
+ What it should not do:
433
+
434
+ - re-implement code
435
+ - reopen the whole wave
436
+ - override clear semantic failures
437
+
438
+ This makes the judge a narrow specialist, not the default closure judge for every bad marker.
439
+
440
+ ## Module-Level Patch Map
441
+
442
+ ## `scripts/wave-orchestrator/agent-state.mjs`
443
+
444
+ - adopt the normalized structured signal parser
445
+ - enrich diagnostics
446
+ - extend validation payloads with failure classes and adjudication eligibility
447
+ - preserve current status codes for compatibility while adding richer metadata
448
+
449
+ ## `scripts/wave-orchestrator/structured-signal-parser.mjs`
450
+
451
+ - new shared parser/normalizer for `[wave-*]` lines
452
+ - house alias normalization, unknown-key handling, and diagnostics support
453
+
454
+ ## `scripts/wave-orchestrator/gate-engine.mjs`
455
+
456
+ - invoke deterministic adjudication for implementation transport failures
457
+ - expose adjudicated closure results in gate payloads
458
+ - mirror the existing documentation fallback style for implementation closure where safe
459
+
460
+ ## `scripts/wave-orchestrator/closure-engine.mjs`
461
+
462
+ - stop forwarding transport-only failures as closure-critical gaps before adjudication
463
+ - add `awaiting-adjudication` flow
464
+ - persist adjudication state into traceable closure artifacts
465
+
466
+ ## `scripts/wave-orchestrator/closure-adjudicator.mjs`
467
+
468
+ - new deterministic adjudication layer
469
+ - evaluate artifact-backed proof for syntax-only closure failures
470
+
471
+ ## `scripts/wave-orchestrator/retry-engine.mjs`
472
+
473
+ - distinguish transport, semantic, artifact, and state failures
474
+ - avoid relaunch plans when adjudication can still resolve the issue
475
+ - persist narrower adjudication intent where appropriate
476
+
477
+ ## `scripts/wave-orchestrator/control-cli.mjs`
478
+
479
+ - project `executionState`, `closureState`, and `controllerState`
480
+ - stop surfacing persisted relaunch plans as if work is still actively running
481
+ - add `wave signal ...` subcommands
482
+ - optionally add `wave control adjudication get --lane <lane> --wave <n> --json`
483
+
484
+ ## `scripts/wave-orchestrator/dashboard-state.mjs`
485
+
486
+ - project settled-versus-active execution cleanly
487
+ - distinguish closure evaluation from live runtime progress
488
+
489
+ ## `scripts/wave-orchestrator/session-supervisor.mjs`
490
+
491
+ - tighten live-process detection so status surfaces can tell when the controller is no longer active
492
+
493
+ ## `scripts/wave-orchestrator/artifact-schemas.mjs`
494
+
495
+ - version and normalize the new projected state fields and adjudication artifacts
496
+
497
+ ## Docs
498
+
499
+ - `docs/reference/coordination-and-closure.md`
500
+ - `docs/reference/cli-reference.md`
501
+ - `docs/guides/signal-wrappers.md`
502
+ - `docs/plans/wave-orchestrator.md`
503
+
504
+ ## Rollout Plan
505
+
506
+ ### Phase 1: transport hardening
507
+
508
+ - land normalized structured signal parsing
509
+ - keep current regex parsing as compatibility fallback
510
+ - add new diagnostics fields
511
+ - no change yet to retry policy
512
+
513
+ ### Phase 2: deterministic adjudication
514
+
515
+ - land `closure-adjudicator.mjs`
516
+ - wire implementation transport failures through adjudication before relaunch planning
517
+ - persist adjudication artifacts
518
+
519
+ ### Phase 3: status projection split
520
+
521
+ - add `executionState`, `closureState`, and `controllerState`
522
+ - update `wave control status`, dashboards, and artifact schemas
523
+ - keep legacy top-level status fields for compatibility
524
+
525
+ ### Phase 4: agent ergonomics
526
+
527
+ - add `wave signal ...` commands
528
+ - update docs and starter prompts to recommend helper usage
529
+
530
+ ### Phase 5: specialist judge fallback
531
+
532
+ - add optional judge runtime only for unresolved ambiguous cases after deterministic adjudication
533
+
534
+ ## Test Plan
535
+
536
+ ### Unit tests
537
+
538
+ - parser accepts reordered proof keys
539
+ - parser accepts extra unknown keys
540
+ - parser normalizes `state=complete` to `met`
541
+ - parser still accepts existing canonical marker lines unchanged
542
+ - parser rejects truly incomplete proof markers
543
+ - validation classifies malformed markers as `transport-failure`
544
+ - validation classifies proof gaps as `semantic-failure`
545
+
546
+ ### Integration tests
547
+
548
+ - implementation agent with exit code `0`, deliverables present, proof artifacts present, malformed proof marker, and coherent derived state is auto-adjudicated to pass
549
+ - implementation agent with malformed marker but missing deliverable still fails
550
+ - implementation agent with explicit `[wave-gap]` or `wave-proof-gap` still fails
551
+ - relaunch plan is not written for adjudication-pass cases
552
+ - a settled wave with only unresolved closure ambiguity projects `executionState: settled`, not active running
553
+ - `wave control status` shows `controllerState: relaunch-planned` when a relaunch plan exists but no live runtime is active
554
+
555
+ ### Regression tests
556
+
557
+ - preserve current documentation auto-close behavior for empty doc runs
558
+ - preserve current semantic failure behavior for real proof gaps
559
+ - preserve replay compatibility with older logs and envelopes
560
+
561
+ ### Eval additions
562
+
563
+ Add or extend eval coverage for:
564
+
565
+ - marker transport failure with semantically complete work
566
+ - stale relaunch plan with no live processes
567
+ - ambiguous closure requiring specialist adjudication
568
+
569
+ ## Risks And Mitigations
570
+
571
+ ### Risk: accidental premature closure
572
+
573
+ Mitigation:
574
+
575
+ - adjudication is only allowed for transport failures
576
+ - deliverables and proof artifacts must still exist
577
+ - explicit negative proof signals still fail closed
578
+ - component and shared-state gates remain active
579
+
580
+ ### Risk: status surface churn for existing tooling
581
+
582
+ Mitigation:
583
+
584
+ - keep legacy top-level `status`
585
+ - add new fields as additive schema changes first
586
+ - document projection precedence clearly
587
+
588
+ ### Risk: helper CLI commands add another surface to maintain
589
+
590
+ Mitigation:
591
+
592
+ - keep helpers thin
593
+ - have helpers emit the same canonical marker lines already used today
594
+ - do not require helper adoption on day one
595
+
596
+ ## Recommended Success Criteria
597
+
598
+ - closure transport errors stop causing unnecessary relaunches when evidence already proves the work landed
599
+ - `wave control status` no longer implies active work when the system is only waiting on closure adjudication or a stale relaunch decision
600
+ - agent-authored runs require fewer manual closure fixes
601
+ - documentation fallback behavior and implementation closure behavior follow the same design philosophy
602
+ - specialist judge runtimes are rare, narrow, and explainable
603
+
604
+ ## Suggested Execution Order
605
+
606
+ 1. land normalized signal parsing plus richer diagnostics
607
+ 2. land deterministic adjudication for implementation closure
608
+ 3. land state projection split for control status and dashboards
609
+ 4. land `wave signal ...` helper commands
610
+ 5. add optional specialist judge fallback
611
+
612
+ That order keeps risk low and makes each step independently reviewable.
@@ -1,10 +1,10 @@
1
1
  # Current State
2
2
 
3
- - The published package is `0.9.13`; that release keeps the shipped monorepo, design-role, signal-hygiene, detached process-runner, and sandbox supervisor surfaces, and adds three focused operational hardening fixes: `state=complete` now maps cleanly onto the existing proof semantics, restart-safe launcher validation now honors status-recoverable completed waves before stale promotion checks, and detached runners now derive per-agent broker sticky keys by default. The current authenticated Wave Control plus Corridor-backed security surface continues to ship in this repo, with the browser UI still organized around dashboard-first navigation and richer run or benchmark analytics summaries.
3
+ - The published package is `0.9.15`; that release keeps the shipped monorepo, design-role, signal-hygiene, detached process-runner, and sandbox supervisor surfaces, and adds a more explicit closure-hardening layer: canonical structured-signal parsing, deterministic adjudication for transport-only implementation failures, persisted adjudication artifacts, the `wave signal` helper family, and additive execution/closure/controller projections in the control plane. The current authenticated Wave Control plus Corridor-backed security surface continues to ship in this repo, with the browser UI still organized around dashboard-first navigation and richer run or benchmark analytics summaries.
4
4
  - The canonical shipped runtime architecture is documented in `docs/plans/end-state-architecture.md`; the sandbox-runtime companion is `docs/plans/sandbox-end-state-architecture.md`; historical cutover notes remain in `docs/plans/architecture-hardening-migration.md`.
5
5
  - The repository contains the published `@chllming/wave-orchestration` package plus the starter scaffold used by `wave init`.
6
6
  - The runtime is package-first and non-destructive for adopting repos: `wave init --adopt-existing` records existing repo-owned plans, waves, prompts, and config without overwriting them, and `wave upgrade` writes only `.wave/install-state.json` plus `.wave/upgrade-history/`.
7
- - The recommended `0.9.13` operating stance is documented in `docs/guides/recommendations-0.9.13.md`: keep proof and closure strict, keep generic `budget.turns` advisory, use targeted recovery for narrow restart-safe recovery work, and treat TMUX as an optional operator surface rather than a runtime requirement.
7
+ - The recommended `0.9.15` operating stance is documented in `docs/guides/recommendations-0.9.15.md`: keep proof and closure strict, keep generic `budget.turns` advisory, use targeted recovery for narrow restart-safe recovery work, treat TMUX as an optional operator surface rather than a runtime requirement, and treat `awaiting-adjudication` as distinct from both pass and rerun-needed.
8
8
  - Sandbox-safe setup guidance now ships in `docs/guides/sandboxed-environments.md`: use `wave submit/supervise/status/wait/attach` for short-lived clients, keep `tmux` optional and dashboard-only, and preserve `.tmp/` plus `.wave/` when running inside Nemoshell or Docker.
9
9
  - Runtime launch entrypoints now perform a best-effort npmjs version check, cache the result under `.wave/package-update-check.json`, and point operators at `pnpm exec wave self-update` when a newer published package exists.
10
10
  - The companion control plane now ships in two packages:
@@ -99,7 +99,7 @@
99
99
  - Live closure is strict: `cont-EVAL` must prove the declared eval contract with exact target and benchmark ids, and `cont-QA` must provide both final verdict and final gate artifacts. Legacy evaluator-era shapes remain replay-only compatibility inputs.
100
100
  - Proof-centric waves can now declare `### Proof artifacts`, and implementation proof validation can require those machine-visible local artifacts in addition to deliverables and structured proof markers.
101
101
  - Routed clarifications remain blocking until the linked follow-up request or escalation is fully resolved.
102
- - Operators can now use `wave control status`, `wave control task`, `wave control rerun`, and `wave control proof` as the preferred supported control surfaces during a live wave instead of editing runtime files by hand. Legacy `wave coord`, `wave retry`, and `wave proof` commands remain as compatibility paths.
102
+ - Operators can now use `wave control status`, `wave control task`, `wave control rerun`, `wave control adjudication`, and `wave control proof` as the preferred supported control surfaces during a live wave instead of editing runtime files by hand. Legacy `wave coord`, `wave retry`, and `wave proof` commands remain as compatibility paths, and `wave signal ...` now provides the preferred canonical marker-emission path for wrappers or orchestrator prompts.
103
103
  - Required inbound cross-lane dependency tickets under `.tmp/wave-orchestrator/dependencies/` block both autonomous wave launch and lane finalization while they remain unresolved.
104
104
  - Cross-lane dependency workflows now include:
105
105
  - `wave dep post|show|resolve|render`
@@ -1,6 +1,6 @@
1
1
  # End-State Architecture
2
2
 
3
- This document describes the canonical architecture for the current Wave runtime. It is the authoritative reference for the engine boundaries, canonical authority set, and artifact ownership model that the shipped `0.9.13` surface now follows.
3
+ This document describes the canonical architecture for the current Wave runtime. It is the authoritative reference for the engine boundaries, canonical authority set, and artifact ownership model that the shipped `0.9.15` surface now follows.
4
4
 
5
5
  For the sandbox-specific execution model, including async supervisor ownership, daemon adoption goals, and forwarded closure-gap behavior, read [sandbox-end-state-architecture.md](./sandbox-end-state-architecture.md).
6
6
 
@@ -1,6 +1,6 @@
1
1
  # Wave 12 - Optional Design Steward Handoff
2
2
 
3
- This is a showcase-first sample wave for the shipped `design` worker role in `0.9.13`.
3
+ This is a showcase-first sample wave for the shipped `design` worker role in `0.9.15`.
4
4
 
5
5
  This example demonstrates the docs-first design-steward path where a design packet is published before code-owning implementation begins.
6
6
 
@@ -2,7 +2,7 @@
2
2
 
3
3
  This is a showcase-first sample wave.
4
4
 
5
- Use it as the single reference example for the current `0.9.13` Wave surface.
5
+ Use it as the single reference example for the current `0.9.15` Wave surface.
6
6
 
7
7
  It intentionally combines more sections than a normal production wave so one file can demonstrate:
8
8