@chllming/wave-orchestration 0.6.3 → 0.7.0

Files changed (112)
  1. package/CHANGELOG.md +57 -1
  2. package/README.md +39 -7
  3. package/docs/agents/wave-orchestrator-role.md +50 -0
  4. package/docs/agents/wave-planner-role.md +39 -0
  5. package/docs/context7/bundles.json +9 -0
  6. package/docs/context7/planner-agent/README.md +25 -0
  7. package/docs/context7/planner-agent/manifest.json +83 -0
  8. package/docs/context7/planner-agent/papers/cooperbench-why-coding-agents-cannot-be-your-teammates-yet.md +3283 -0
  9. package/docs/context7/planner-agent/papers/dova-deliberation-first-multi-agent-orchestration-for-autonomous-research-automation.md +1699 -0
  10. package/docs/context7/planner-agent/papers/dpbench-large-language-models-struggle-with-simultaneous-coordination.md +2251 -0
  11. package/docs/context7/planner-agent/papers/incremental-planning-to-control-a-blackboard-based-problem-solver.md +1729 -0
  12. package/docs/context7/planner-agent/papers/silo-bench-a-scalable-environment-for-evaluating-distributed-coordination-in-multi-agent-llm-systems.md +3747 -0
  13. package/docs/context7/planner-agent/papers/todoevolve-learning-to-architect-agent-planning-systems.md +1675 -0
  14. package/docs/context7/planner-agent/papers/verified-multi-agent-orchestration-a-plan-execute-verify-replan-framework-for-complex-query-resolution.md +1173 -0
  15. package/docs/context7/planner-agent/papers/why-do-multi-agent-llm-systems-fail.md +5211 -0
  16. package/docs/context7/planner-agent/topics/planning-and-orchestration.md +24 -0
  17. package/docs/evals/README.md +96 -1
  18. package/docs/evals/arm-templates/README.md +13 -0
  19. package/docs/evals/arm-templates/full-wave.json +15 -0
  20. package/docs/evals/arm-templates/single-agent.json +15 -0
  21. package/docs/evals/benchmark-catalog.json +7 -0
  22. package/docs/evals/cases/README.md +47 -0
  23. package/docs/evals/cases/wave-blackboard-inbox-targeting.json +73 -0
  24. package/docs/evals/cases/wave-contradiction-conflict.json +104 -0
  25. package/docs/evals/cases/wave-expert-routing-preservation.json +69 -0
  26. package/docs/evals/cases/wave-hidden-profile-private-evidence.json +81 -0
  27. package/docs/evals/cases/wave-premature-closure-guard.json +71 -0
  28. package/docs/evals/cases/wave-silo-cross-agent-state.json +77 -0
  29. package/docs/evals/cases/wave-simultaneous-lockstep.json +92 -0
  30. package/docs/evals/cooperbench/real-world-mitigation.md +341 -0
  31. package/docs/evals/external-benchmarks.json +85 -0
  32. package/docs/evals/external-command-config.sample.json +9 -0
  33. package/docs/evals/external-command-config.swe-bench-pro.json +8 -0
  34. package/docs/evals/pilots/README.md +47 -0
  35. package/docs/evals/pilots/swe-bench-pro-public-full-wave-review-10.json +64 -0
  36. package/docs/evals/pilots/swe-bench-pro-public-pilot.json +111 -0
  37. package/docs/evals/wave-benchmark-program.md +302 -0
  38. package/docs/guides/planner.md +48 -11
  39. package/docs/plans/context7-wave-orchestrator.md +20 -0
  40. package/docs/plans/current-state.md +8 -1
  41. package/docs/plans/examples/wave-benchmark-improvement.md +108 -0
  42. package/docs/plans/examples/wave-example-live-proof.md +1 -1
  43. package/docs/plans/examples/wave-example-rollout-fidelity.md +340 -0
  44. package/docs/plans/wave-orchestrator.md +62 -11
  45. package/docs/plans/waves/reviews/wave-1-benchmark-operator.md +118 -0
  46. package/docs/reference/coordination-and-closure.md +436 -0
  47. package/docs/reference/live-proof-waves.md +25 -3
  48. package/docs/reference/npmjs-trusted-publishing.md +3 -3
  49. package/docs/reference/proof-metrics.md +90 -0
  50. package/docs/reference/runtime-config/README.md +61 -0
  51. package/docs/reference/sample-waves.md +29 -18
  52. package/docs/reference/wave-control.md +164 -0
  53. package/docs/reference/wave-planning-lessons.md +131 -0
  54. package/package.json +5 -4
  55. package/releases/manifest.json +18 -0
  56. package/scripts/research/agent-context-archive.mjs +18 -0
  57. package/scripts/research/manifests/agent-context-expanded-2026-03-22.mjs +17 -0
  58. package/scripts/research/sync-planner-context7-bundle.mjs +133 -0
  59. package/scripts/wave-orchestrator/artifact-schemas.mjs +232 -0
  60. package/scripts/wave-orchestrator/autonomous.mjs +7 -0
  61. package/scripts/wave-orchestrator/benchmark-cases.mjs +374 -0
  62. package/scripts/wave-orchestrator/benchmark-external.mjs +1384 -0
  63. package/scripts/wave-orchestrator/benchmark.mjs +972 -0
  64. package/scripts/wave-orchestrator/clarification-triage.mjs +78 -12
  65. package/scripts/wave-orchestrator/config.mjs +175 -0
  66. package/scripts/wave-orchestrator/control-cli.mjs +1123 -0
  67. package/scripts/wave-orchestrator/control-plane.mjs +697 -0
  68. package/scripts/wave-orchestrator/coord-cli.mjs +360 -2
  69. package/scripts/wave-orchestrator/coordination-store.mjs +211 -9
  70. package/scripts/wave-orchestrator/coordination.mjs +84 -0
  71. package/scripts/wave-orchestrator/dashboard-renderer.mjs +38 -3
  72. package/scripts/wave-orchestrator/dashboard-state.mjs +22 -0
  73. package/scripts/wave-orchestrator/evals.mjs +23 -0
  74. package/scripts/wave-orchestrator/executors.mjs +3 -2
  75. package/scripts/wave-orchestrator/feedback.mjs +55 -0
  76. package/scripts/wave-orchestrator/install.mjs +55 -1
  77. package/scripts/wave-orchestrator/launcher-closure.mjs +4 -1
  78. package/scripts/wave-orchestrator/launcher-runtime.mjs +24 -21
  79. package/scripts/wave-orchestrator/launcher.mjs +796 -35
  80. package/scripts/wave-orchestrator/planner-context.mjs +75 -0
  81. package/scripts/wave-orchestrator/planner.mjs +2270 -136
  82. package/scripts/wave-orchestrator/proof-cli.mjs +195 -0
  83. package/scripts/wave-orchestrator/proof-registry.mjs +317 -0
  84. package/scripts/wave-orchestrator/replay.mjs +10 -4
  85. package/scripts/wave-orchestrator/retry-cli.mjs +184 -0
  86. package/scripts/wave-orchestrator/retry-control.mjs +225 -0
  87. package/scripts/wave-orchestrator/shared.mjs +26 -0
  88. package/scripts/wave-orchestrator/swe-bench-pro-task.mjs +1004 -0
  89. package/scripts/wave-orchestrator/traces.mjs +157 -2
  90. package/scripts/wave-orchestrator/wave-control-client.mjs +532 -0
  91. package/scripts/wave-orchestrator/wave-control-schema.mjs +309 -0
  92. package/scripts/wave-orchestrator/wave-files.mjs +17 -5
  93. package/scripts/wave.mjs +27 -0
  94. package/skills/repo-coding-rules/SKILL.md +1 -0
  95. package/skills/role-cont-eval/SKILL.md +1 -0
  96. package/skills/role-cont-qa/SKILL.md +13 -6
  97. package/skills/role-deploy/SKILL.md +1 -0
  98. package/skills/role-documentation/SKILL.md +4 -0
  99. package/skills/role-implementation/SKILL.md +4 -0
  100. package/skills/role-infra/SKILL.md +2 -1
  101. package/skills/role-integration/SKILL.md +15 -8
  102. package/skills/role-planner/SKILL.md +39 -0
  103. package/skills/role-planner/skill.json +21 -0
  104. package/skills/role-research/SKILL.md +1 -0
  105. package/skills/role-security/SKILL.md +2 -2
  106. package/skills/runtime-claude/SKILL.md +2 -1
  107. package/skills/runtime-codex/SKILL.md +1 -0
  108. package/skills/runtime-local/SKILL.md +2 -0
  109. package/skills/runtime-opencode/SKILL.md +1 -0
  110. package/skills/wave-core/SKILL.md +25 -6
  111. package/skills/wave-core/references/marker-syntax.md +16 -8
  112. package/wave.config.json +45 -0
@@ -0,0 +1,309 @@
1
+ import crypto from "node:crypto";
2
+
3
+ export const WAVE_CONTROL_SCHEMA_VERSION = 1;
4
+ export const WAVE_CONTROL_EVENT_KIND = "wave-control-event";
5
+
6
+ export const WAVE_CONTROL_ENTITY_TYPES = new Set([
7
+ "task",
8
+ "proof_bundle",
9
+ "rerun_request",
10
+ "attempt",
11
+ "human_input",
12
+ "wave_run",
13
+ "agent_run",
14
+ "coordination_record",
15
+ "gate",
16
+ "artifact",
17
+ "benchmark_run",
18
+ "benchmark_item",
19
+ "verification",
20
+ "review",
21
+ ]);
22
+
23
+ export const WAVE_CONTROL_RUN_KINDS = new Set([
24
+ "roadmap",
25
+ "adhoc",
26
+ "benchmark",
27
+ "service",
28
+ "unknown",
29
+ ]);
30
+
31
+ export const WAVE_CONTROL_UPLOAD_POLICIES = new Set([
32
+ "local-only",
33
+ "metadata-only",
34
+ "selected",
35
+ "full",
36
+ ]);
37
+
38
+ export const WAVE_CONTROL_REPORT_MODES = new Set([
39
+ "disabled",
40
+ "metadata-only",
41
+ "metadata-plus-selected",
42
+ "full-artifact-upload",
43
+ ]);
44
+
45
+ export const WAVE_CONTROL_REVIEW_VALIDITIES = new Set([
46
+ "comparison-valid",
47
+ "review-only",
48
+ "benchmark-invalid",
49
+ "harness-setup-failure",
50
+ "proof-blocked",
51
+ "trustworthy-model-failure",
52
+ ]);
53
+
54
+ function normalizeText(value, fallback = null) {
55
+ const normalized = String(value ?? "").trim();
56
+ return normalized || fallback;
57
+ }
58
+
59
+ function normalizeNonNegativeInt(value, fallback = null) {
60
+ if (value === undefined || value === null || value === "") {
61
+ return fallback;
62
+ }
63
+ const parsed = Number.parseInt(String(value), 10);
64
+ return Number.isFinite(parsed) && parsed >= 0 ? parsed : fallback;
65
+ }
66
+
67
+ function normalizeBoolean(value, fallback = false) {
68
+ if (value === undefined) {
69
+ return fallback;
70
+ }
71
+ return value === true;
72
+ }
73
+
74
+ function normalizePlainObject(value) {
75
+ return value && typeof value === "object" && !Array.isArray(value)
76
+ ? JSON.parse(JSON.stringify(value))
77
+ : {};
78
+ }
79
+
80
+ function normalizeStringArray(values) {
81
+ if (!Array.isArray(values)) {
82
+ return [];
83
+ }
84
+ return Array.from(
85
+ new Set(
86
+ values
87
+ .map((value) => normalizeText(value, null))
88
+ .filter(Boolean),
89
+ ),
90
+ );
91
+ }
92
+
93
+ function assertEnum(value, allowed, label) {
94
+ if (!allowed.has(value)) {
95
+ throw new Error(`${label} must be one of ${Array.from(allowed).join(", ")} (got: ${value || "empty"})`);
96
+ }
97
+ }
98
+
99
+ function defaultId(prefix, value) {
100
+ const hash = crypto.createHash("sha1").update(String(value || "")).digest("hex").slice(0, 16);
101
+ return `${prefix}-${hash}`;
102
+ }
103
+
104
+ export function stableJsonStringify(value) {
105
+ return JSON.stringify(sortJsonValue(value));
106
+ }
107
+
108
+ function sortJsonValue(value) {
109
+ if (Array.isArray(value)) {
110
+ return value.map((entry) => sortJsonValue(entry));
111
+ }
112
+ if (value && typeof value === "object") {
113
+ return Object.fromEntries(
114
+ Object.keys(value)
115
+ .sort()
116
+ .map((key) => [key, sortJsonValue(value[key])]),
117
+ );
118
+ }
119
+ return value;
120
+ }
121
+
122
+ export function buildWaveControlConfigAttestationHash(value) {
123
+ return crypto.createHash("sha256").update(stableJsonStringify(value)).digest("hex");
124
+ }
125
+
126
+ export function normalizeWaveControlUploadPolicy(
127
+ value,
128
+ label = "waveControl.uploadPolicy",
129
+ fallback = "metadata-only",
130
+ ) {
131
+ const normalized = normalizeText(value, fallback)?.toLowerCase() || fallback;
132
+ assertEnum(normalized, WAVE_CONTROL_UPLOAD_POLICIES, label);
133
+ return normalized;
134
+ }
135
+
136
+ export function normalizeWaveControlReportMode(
137
+ value,
138
+ label = "waveControl.reportMode",
139
+ fallback = "metadata-plus-selected",
140
+ ) {
141
+ const normalized = normalizeText(value, fallback)?.toLowerCase() || fallback;
142
+ assertEnum(normalized, WAVE_CONTROL_REPORT_MODES, label);
143
+ return normalized;
144
+ }
145
+
146
+ export function normalizeWaveControlReviewValidity(
147
+ value,
148
+ label = "waveControl.reviewValidity",
149
+ fallback = "review-only",
150
+ ) {
151
+ const normalized = normalizeText(value, fallback)?.toLowerCase() || fallback;
152
+ assertEnum(normalized, WAVE_CONTROL_REVIEW_VALIDITIES, label);
153
+ return normalized;
154
+ }
155
+
156
+ export function normalizeWaveControlRunKind(
157
+ value,
158
+ label = "waveControl.runKind",
159
+ fallback = "unknown",
160
+ ) {
161
+ const normalized = normalizeText(value, fallback)?.toLowerCase() || fallback;
162
+ assertEnum(normalized, WAVE_CONTROL_RUN_KINDS, label);
163
+ return normalized;
164
+ }
165
+
166
+ export function normalizeWaveControlRunIdentity(rawIdentity = {}, defaults = {}) {
167
+ const source = normalizePlainObject(rawIdentity);
168
+ const fallback = normalizePlainObject(defaults);
169
+ return {
170
+ workspaceId: normalizeText(source.workspaceId, normalizeText(fallback.workspaceId, null)),
171
+ projectId: normalizeText(source.projectId, normalizeText(fallback.projectId, null)),
172
+ runId: normalizeText(source.runId, normalizeText(fallback.runId, null)),
173
+ runKind: normalizeWaveControlRunKind(
174
+ source.runKind,
175
+ "waveControl.runKind",
176
+ normalizeText(fallback.runKind, "unknown"),
177
+ ),
178
+ lane: normalizeText(source.lane, normalizeText(fallback.lane, null)),
179
+ wave: normalizeNonNegativeInt(source.wave, normalizeNonNegativeInt(fallback.wave, null)),
180
+ attempt: normalizeNonNegativeInt(
181
+ source.attempt,
182
+ normalizeNonNegativeInt(fallback.attempt, null),
183
+ ),
184
+ agentId: normalizeText(source.agentId, normalizeText(fallback.agentId, null)),
185
+ orchestratorId: normalizeText(
186
+ source.orchestratorId,
187
+ normalizeText(fallback.orchestratorId, null),
188
+ ),
189
+ runtimeVersion: normalizeText(
190
+ source.runtimeVersion,
191
+ normalizeText(fallback.runtimeVersion, null),
192
+ ),
193
+ benchmarkRunId: normalizeText(
194
+ source.benchmarkRunId,
195
+ normalizeText(fallback.benchmarkRunId, null),
196
+ ),
197
+ benchmarkItemId: normalizeText(
198
+ source.benchmarkItemId,
199
+ normalizeText(fallback.benchmarkItemId, null),
200
+ ),
201
+ };
202
+ }
203
+
204
+ export function normalizeWaveControlArtifactDescriptor(rawArtifact = {}, defaults = {}) {
205
+ const source = normalizePlainObject(rawArtifact);
206
+ const fallback = normalizePlainObject(defaults);
207
+ const path = normalizeText(source.path, normalizeText(fallback.path, null));
208
+ const kind = normalizeText(source.kind, normalizeText(fallback.kind, "artifact"));
209
+ const descriptor = {
210
+ path,
211
+ kind,
212
+ required: normalizeBoolean(source.required, Boolean(fallback.required)),
213
+ present:
214
+ source.present === undefined
215
+ ? normalizeBoolean(fallback.present, false)
216
+ : normalizeBoolean(source.present, false),
217
+ sha256: normalizeText(source.sha256, normalizeText(fallback.sha256, null)),
218
+ bytes: normalizeNonNegativeInt(source.bytes, normalizeNonNegativeInt(fallback.bytes, null)),
219
+ contentType: normalizeText(source.contentType, normalizeText(fallback.contentType, null)),
220
+ uploadPolicy: normalizeWaveControlUploadPolicy(
221
+ source.uploadPolicy,
222
+ "waveControl.artifact.uploadPolicy",
223
+ normalizeText(fallback.uploadPolicy, "metadata-only"),
224
+ ),
225
+ label: normalizeText(source.label, normalizeText(fallback.label, null)),
226
+ recordedAt: normalizeText(source.recordedAt, normalizeText(fallback.recordedAt, null)),
227
+ };
228
+ return {
229
+ artifactId:
230
+ normalizeText(source.artifactId, normalizeText(fallback.artifactId, null)) ||
231
+ defaultId("artifact", stableJsonStringify({ path, kind, sha256: descriptor.sha256 })),
232
+ ...descriptor,
233
+ };
234
+ }
235
+
236
+ export function normalizeWaveControlEventEnvelope(rawEvent = {}, defaults = {}) {
237
+ const source = normalizePlainObject(rawEvent);
238
+ const fallback = normalizePlainObject(defaults);
239
+ const entityType = normalizeText(source.entityType, normalizeText(fallback.entityType, null))?.toLowerCase();
240
+ assertEnum(entityType, WAVE_CONTROL_ENTITY_TYPES, "waveControl.entityType");
241
+ const entityId = normalizeText(source.entityId, normalizeText(fallback.entityId, null));
242
+ const action = normalizeText(source.action, normalizeText(fallback.action, null))?.toLowerCase();
243
+ if (!entityId) {
244
+ throw new Error("waveControl.entityId is required");
245
+ }
246
+ if (!action) {
247
+ throw new Error("waveControl.action is required");
248
+ }
249
+ const recordedAt = normalizeText(
250
+ source.recordedAt,
251
+ normalizeText(fallback.recordedAt, new Date().toISOString()),
252
+ );
253
+ const identity = normalizeWaveControlRunIdentity(
254
+ source.identity,
255
+ fallback.identity || {
256
+ workspaceId: fallback.workspaceId,
257
+ projectId: fallback.projectId,
258
+ runId: fallback.runId,
259
+ runKind: fallback.runKind,
260
+ lane: fallback.lane,
261
+ wave: fallback.wave,
262
+ attempt: fallback.attempt,
263
+ agentId: fallback.agentId,
264
+ orchestratorId: fallback.orchestratorId,
265
+ runtimeVersion: fallback.runtimeVersion,
266
+ benchmarkRunId: fallback.benchmarkRunId,
267
+ benchmarkItemId: fallback.benchmarkItemId,
268
+ },
269
+ );
270
+ const artifacts = (Array.isArray(source.artifacts) ? source.artifacts : fallback.artifacts || [])
271
+ .map((artifact) => normalizeWaveControlArtifactDescriptor(artifact))
272
+ .filter((artifact) => artifact.path || artifact.sha256 || artifact.kind);
273
+ const payload = {
274
+ schemaVersion: WAVE_CONTROL_SCHEMA_VERSION,
275
+ kind: WAVE_CONTROL_EVENT_KIND,
276
+ id:
277
+ normalizeText(source.id, normalizeText(fallback.id, null)) ||
278
+ defaultId(
279
+ "wctl",
280
+ stableJsonStringify({
281
+ workspaceId: identity.workspaceId,
282
+ projectId: identity.projectId,
283
+ runId: identity.runId,
284
+ lane: identity.lane,
285
+ wave: identity.wave,
286
+ attempt: identity.attempt,
287
+ orchestratorId: identity.orchestratorId,
288
+ runtimeVersion: identity.runtimeVersion,
289
+ entityType,
290
+ entityId,
291
+ action,
292
+ recordedAt,
293
+ }),
294
+ ),
295
+ category: normalizeText(source.category, normalizeText(fallback.category, "runtime")),
296
+ source: normalizeText(source.source, normalizeText(fallback.source, "wave")),
297
+ actor: normalizeText(source.actor, normalizeText(fallback.actor, null)),
298
+ recordedAt,
299
+ entityType,
300
+ entityId,
301
+ action,
302
+ identity,
303
+ tags: normalizeStringArray(source.tags ?? fallback.tags),
304
+ metrics: normalizePlainObject(source.metrics ?? fallback.metrics),
305
+ data: normalizePlainObject(source.data ?? fallback.data),
306
+ artifacts,
307
+ };
308
+ return payload;
309
+ }
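The private normalizers in the hunk above have a few edge cases that are easy to miss when reading the diff. The sketch below copies four of the helpers as they appear in the added file and runs them on made-up inputs; the sample values (`"lane-a"`, `"unknown"`, and so on) are illustrative only and do not come from the package.

```javascript
// Copied from the added file; these helpers are module-private there.
function normalizeText(value, fallback = null) {
  const normalized = String(value ?? "").trim();
  return normalized || fallback;
}

function normalizeNonNegativeInt(value, fallback = null) {
  if (value === undefined || value === null || value === "") {
    return fallback;
  }
  const parsed = Number.parseInt(String(value), 10);
  return Number.isFinite(parsed) && parsed >= 0 ? parsed : fallback;
}

function sortJsonValue(value) {
  if (Array.isArray(value)) {
    return value.map((entry) => sortJsonValue(entry));
  }
  if (value && typeof value === "object") {
    return Object.fromEntries(
      Object.keys(value)
        .sort()
        .map((key) => [key, sortJsonValue(value[key])]),
    );
  }
  return value;
}

function stableJsonStringify(value) {
  return JSON.stringify(sortJsonValue(value));
}

// Illustrative inputs (assumptions, not package data).
console.log(normalizeText("  lane-a  "));      // "lane-a": whitespace is trimmed
console.log(normalizeText("   ", "unknown"));  // "unknown": blank text falls back
console.log(normalizeNonNegativeInt("3.7"));   // 3: parseInt truncates the fraction
console.log(normalizeNonNegativeInt(-1, 0));   // 0: negative values fall back
console.log(stableJsonStringify({ b: 1, a: { d: 2, c: 3 } }));
// {"a":{"c":3,"d":2},"b":1}: keys are sorted at every nesting level
```

Because keys are sorted before serialization, two objects that differ only in key order produce the same string. That is what makes the derived `defaultId` values and the attestation hash in the hunk independent of property insertion order.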
@@ -1866,12 +1866,24 @@ function mergeUniqueStringArrays(...lists) {
1866
1866
  );
1867
1867
  }
1868
1868
 
1869
+ function mergeDefinedExecutorValues(...sections) {
1870
+ const merged = {};
1871
+ for (const section of sections) {
1872
+ if (!section || typeof section !== "object") {
1873
+ continue;
1874
+ }
1875
+ for (const [key, value] of Object.entries(section)) {
1876
+ if (value === null || value === undefined) {
1877
+ continue;
1878
+ }
1879
+ merged[key] = cloneExecutorValue(value);
1880
+ }
1881
+ }
1882
+ return merged;
1883
+ }
1884
+
1869
1885
  function mergeExecutorSections(baseSection, profileSection, inlineSection, arrayKeys = []) {
1870
- const merged = {
1871
- ...(cloneExecutorValue(baseSection) || {}),
1872
- ...(cloneExecutorValue(profileSection) || {}),
1873
- ...(cloneExecutorValue(inlineSection) || {}),
1874
- };
1886
+ const merged = mergeDefinedExecutorValues(baseSection, profileSection, inlineSection);
1875
1887
  for (const key of arrayKeys) {
1876
1888
  const mergedArray = mergeUniqueStringArrays(
1877
1889
  baseSection?.[key],
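The practical effect of swapping the object spread for `mergeDefinedExecutorValues` is that a `null` or `undefined` value in a later section no longer overwrites a defined value from an earlier one. A minimal sketch of the difference follows. `cloneExecutorValue` is not shown in this diff, so a JSON round-trip stands in for it here, and the section keys (`model`, `sandbox`, `search`) are hypothetical.

```javascript
// Stand-in for the package's cloneExecutorValue, which this diff does not show.
const cloneExecutorValue = (value) => JSON.parse(JSON.stringify(value));

// Same logic as the added function in the hunk above.
function mergeDefinedExecutorValues(...sections) {
  const merged = {};
  for (const section of sections) {
    if (!section || typeof section !== "object") {
      continue;
    }
    for (const [key, value] of Object.entries(section)) {
      if (value === null || value === undefined) {
        continue; // skip unset values so earlier sections survive
      }
      merged[key] = cloneExecutorValue(value);
    }
  }
  return merged;
}

// Hypothetical sections: base, profile, inline (lowest to highest precedence).
const base = { model: "base-model", sandbox: "workspace-write" };
const profile = { model: null };
const inline = { sandbox: undefined, search: true };

// Approximates the removed code path: later keys win even when unset.
const spreadMerged = { ...base, ...profile, ...inline };
// New code path: only defined values override.
const definedMerged = mergeDefinedExecutorValues(base, profile, inline);

console.log(spreadMerged.model, spreadMerged.sandbox);   // null undefined
console.log(definedMerged.model, definedMerged.sandbox); // base-model workspace-write
console.log(definedMerged.search);                       // true
```

Precedence is unchanged for defined values: a later section still wins whenever it actually sets a key.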
package/scripts/wave.mjs CHANGED
@@ -24,7 +24,10 @@ function printHelp() {
24
24
  wave feedback [feedback options]
25
25
  wave dashboard [dashboard options]
26
26
  wave local [local executor options]
27
+ wave control [control-plane options]
27
28
  wave coord [coordination options]
29
+ wave retry [retry control options]
30
+ wave proof [proof registry options]
28
31
  wave dep [dependency options]
29
32
  wave benchmark [benchmark options]
30
33
 
@@ -102,6 +105,14 @@ if (["init", "upgrade", "self-update", "changelog", "doctor"].includes(subcomman
102
105
  console.error(`[wave] ${error instanceof Error ? error.message : String(error)}`);
103
106
  process.exit(1);
104
107
  }
108
+ } else if (subcommand === "control") {
109
+ try {
110
+ const { runControlCli } = await import("./wave-orchestrator/control-cli.mjs");
111
+ await runControlCli(rest);
112
+ } catch (error) {
113
+ console.error(`[wave] ${error instanceof Error ? error.message : String(error)}`);
114
+ process.exit(1);
115
+ }
105
116
  } else if (subcommand === "coord") {
106
117
  try {
107
118
  const { runCoordinationCli } = await import("./wave-orchestrator/coord-cli.mjs");
@@ -110,6 +121,22 @@ if (["init", "upgrade", "self-update", "changelog", "doctor"].includes(subcomman
110
121
  console.error(`[wave] ${error instanceof Error ? error.message : String(error)}`);
111
122
  process.exit(1);
112
123
  }
124
+ } else if (subcommand === "retry") {
125
+ try {
126
+ const { runRetryCli } = await import("./wave-orchestrator/retry-cli.mjs");
127
+ await runRetryCli(rest);
128
+ } catch (error) {
129
+ console.error(`[wave] ${error instanceof Error ? error.message : String(error)}`);
130
+ process.exit(1);
131
+ }
132
+ } else if (subcommand === "proof") {
133
+ try {
134
+ const { runProofCli } = await import("./wave-orchestrator/proof-cli.mjs");
135
+ await runProofCli(rest);
136
+ } catch (error) {
137
+ console.error(`[wave] ${error instanceof Error ? error.message : String(error)}`);
138
+ process.exit(1);
139
+ }
113
140
  } else if (subcommand === "dep") {
114
141
  try {
115
142
  const { runDependencyCli } = await import("./wave-orchestrator/dep-cli.mjs");
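The three new branches (`control`, `retry`, `proof`) all follow the same shape as the existing ones: lazily import the CLI module, then call its entry point with the remaining arguments. The sketch below expresses that shape as a lookup table. The module paths and export names are taken from the diff; `SUBCOMMANDS`, `dispatch`, and the injectable `load` parameter are hypothetical names used for illustration, and the package itself keeps the explicit `else if` chain.

```javascript
// Module paths and export names come from the diff; the table itself is illustrative.
const SUBCOMMANDS = new Map([
  ["control", { module: "./wave-orchestrator/control-cli.mjs", run: "runControlCli" }],
  ["retry", { module: "./wave-orchestrator/retry-cli.mjs", run: "runRetryCli" }],
  ["proof", { module: "./wave-orchestrator/proof-cli.mjs", run: "runProofCli" }],
]);

// `load` defaults to dynamic import, so a module is only loaded when its
// subcommand is actually invoked. Passing a custom loader makes this testable.
async function dispatch(subcommand, rest, load = (specifier) => import(specifier)) {
  const entry = SUBCOMMANDS.get(subcommand);
  if (!entry) {
    throw new Error(`Unknown subcommand: ${subcommand}`);
  }
  const loaded = await load(entry.module);
  await loaded[entry.run](rest);
}
```

In the diff, each branch also wraps the call in `try`/`catch`, prints the error message with a `[wave]` prefix, and exits with status 1. A caller of this sketch would need to add the same handling around `dispatch`.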
@@ -21,6 +21,7 @@ Before editing any file, confirm:
21
21
  5. If the file has a corresponding test file, you will update or extend tests to cover your change.
22
22
  6. You have checked for other files that import or depend on the symbols you are changing.
23
23
  7. If the file is a config file (JSON, YAML), you have validated the resulting structure is well-formed.
24
+ 8. You are not editing generated runtime state (coordination logs, control-plane events, proof registries, trace bundles, dashboards). These are managed by the launcher and operator tooling.
24
25
 
25
26
  ## Change Hygiene
26
27
 
@@ -31,6 +31,7 @@ Execute these steps in order:
31
31
  - Prefer **targeted changes** over broad rewrites. Each change should address a specific gap identified in the review step.
32
32
  - Never skip the rerun step. Every change must be validated.
33
33
  - If a tune step introduces a regression in another target, revert the change and record the trade-off.
34
+ - Summaries and inboxes may refresh during execution. Re-read context before each iteration to pick up new evidence or coordination records from other agents.
34
35
 
35
36
  ## Benchmark Recording
36
37
 
@@ -16,13 +16,16 @@ Use this skill when the agent is the wave's final cont-QA closure steward.
16
16
 
17
17
  Execute these steps in order. Do not skip steps.
18
18
 
19
- 1. **Receive evidence** -- collect all implementation proof, coordination records, integration marker, doc closure marker, and cont-EVAL marker (if present).
20
- 2. **Review vs exit contracts** -- walk each agent's exit contract line by line. For each line, confirm a proof artifact backs it. Record pass or gap.
19
+ 1. **Receive evidence** -- collect all implementation proof, coordination records, integration marker, doc closure marker, cont-EVAL marker (if present), and security marker (if present).
20
+ 2. **Review vs exit contracts** -- walk each agent's exit contract line by line. For each line, confirm a proof artifact backs it. Record pass or gap. When the wave declares `### Proof artifacts`, verify those machine-visible artifacts are present.
21
21
  3. **Review vs promotions** -- walk each declared component promotion. Confirm evidence shows the component reached the declared target level, not just that adjacent code landed.
22
- 4. **Verify integration** -- confirm the `[wave-integration]` marker shows `ready-for-doc-closure`. Check that no later coordination records contradict it.
23
- 5. **Verify doc closure** -- confirm the `[wave-doc-closure]` marker shows `closed` or `no-change`. If `no-change`, verify the reasoning is valid given what the wave changed.
24
- 6. **Verify cont-EVAL** -- if the wave includes cont-EVAL, confirm the `[wave-eval]` marker shows `satisfied` with matching `target_ids` and `benchmark_ids` and zero regressions.
25
- 7. **Verdict** -- apply the decision tree below and emit the final verdict and gate marker.
22
+ 4. **Verify proof registry** -- check whether operator-registered proof bundles exist. Confirm they are `active` (not `revoked` or `superseded`). Only active bundles count as evidence.
23
+ 5. **Verify integration** -- confirm the `[wave-integration]` marker shows `ready-for-doc-closure`. Check that no later coordination records contradict it.
24
+ 6. **Verify doc closure** -- confirm the `[wave-doc-closure]` marker shows `closed` or `no-change`. If `no-change`, verify the reasoning is valid given what the wave changed.
25
+ 7. **Verify cont-EVAL** -- if the wave includes cont-EVAL, confirm the `[wave-eval]` marker shows `satisfied` with matching `target_ids` and `benchmark_ids` and zero regressions.
26
+ 8. **Verify security** -- if the wave includes a security review, confirm the `[wave-security]` marker shows `clear` or `concerns`. A `blocked` marker prevents closure.
27
+ 9. **Verify no pending rerun** -- confirm no active rerun request is outstanding. An uncleared rerun blocks closure.
28
+ 10. **Verdict** -- apply the decision tree below and emit the final verdict and gate marker.
26
29
 
27
30
  ## Evidence Review Checklist
28
31
 
@@ -38,6 +41,10 @@ Walk each item. Any unchecked item is a potential blocker.
38
41
  - [ ] cont-EVAL marker (if present) is `satisfied` with matching ids and zero regressions.
39
42
  - [ ] Runtime-facing proof is real evidence, not future-work notes or speculative validation.
40
43
  - [ ] No contradictions exist between implementation claims, integration summary, docs, and runtime state.
44
+ - [ ] Declared `### Proof artifacts` (if present) exist as files.
45
+ - [ ] Proof registry bundles are `active`, not `revoked` or `superseded`.
46
+ - [ ] No active rerun request is pending.
47
+ - [ ] Security marker (if present) shows `clear` or `concerns`, not `blocked`.
41
48
 
42
49
  ## Verdict Decision Tree
43
50
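The three checks this hunk adds to the cont-QA checklist (proof registry state, pending rerun, security marker) can be read as one predicate over wave state. A minimal sketch follows, assuming a hypothetical input shape: the field names `proofBundles`, `rerunRequest`, and `securityMarker` are invented here, and only the state names (`active`, `revoked`, `superseded`, `clear`, `concerns`, `blocked`) come from the skill text.

```javascript
// Hypothetical input shape; only the state names come from the skill text.
function findPotentialBlockers({ proofBundles = [], rerunRequest = null, securityMarker = null } = {}) {
  const blockers = [];
  for (const bundle of proofBundles) {
    // Only active bundles count as evidence; revoked or superseded ones do not.
    if (bundle.state !== "active") {
      blockers.push(`proof bundle ${bundle.id} is ${bundle.state}`);
    }
  }
  // An uncleared rerun request blocks closure.
  if (rerunRequest && rerunRequest.state === "active") {
    blockers.push("active rerun request is pending");
  }
  // `clear` and `concerns` allow closure; `blocked` does not.
  if (securityMarker === "blocked") {
    blockers.push("security marker is blocked");
  }
  return blockers;
}

// Illustrative wave state (not real package data).
const blockers = findPotentialBlockers({
  proofBundles: [
    { id: "bundle-1", state: "active" },
    { id: "bundle-2", state: "superseded" },
  ],
  rerunRequest: { state: "active" },
  securityMarker: "concerns",
});
console.log(blockers);
// [ 'proof bundle bundle-2 is superseded', 'active rerun request is pending' ]
```

An empty result means none of these three checks object to closure. The remaining checklist items and the verdict decision tree still apply.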
 
@@ -67,6 +67,7 @@ When rollback is necessary:
67
67
  4. Emit a `[deploy-status]` marker with `state=rolled-back`.
68
68
  5. Post a coordination record so integration and cont-QA see the rollback.
69
69
  6. Do not claim the deploy surface is healthy after a rollback. The wave's deploy exit contract is not met.
70
+ 7. If a rollback occurs, any operator-registered proof that relied on the deployed state should be superseded or revoked via `wave control proof supersede` or `wave control proof revoke`.
70
71
 
71
72
  ## Marker Format
72
73
 
@@ -30,6 +30,8 @@ These docs are your responsibility when the wave changes their content:
30
30
  | `docs/plans/component-cutover-matrix.md` and `.json` | Component maturity levels advance, new components are declared, or next-safe assumptions change. |
31
31
  | `docs/roadmap.md` | Roadmap items are completed, reordered, or newly added. |
32
32
  | `docs/reference/migration-*.md` | Migration steps are added, changed, or completed. |
33
+ | `CHANGELOG.md` | Feature, fix, or behavioral changes ship in a new version. |
34
+ | `docs/plans/wave-orchestrator.md` | Runtime behavior, CLI surface, or configuration changes. |
33
35
 
34
36
  These docs are **not** your responsibility:
35
37
 
@@ -37,6 +39,8 @@ These docs are **not** your responsibility:
37
39
  - Role definition docs under `docs/agents/` are updated by the orchestrator or planner, not by the wave documentation steward.
38
40
  - Research docs under `docs/research/` stay with the research role.
39
41
 
42
+ When an ad-hoc run reports a shared-plan delta, documentation closure still queues the canonical shared-plan docs alongside the ad-hoc closure report.
43
+
40
44
  ## No-Change Protocol
41
45
 
42
46
  When a wave does not require shared-plan doc updates:
@@ -21,11 +21,14 @@ Follow this sequence for each deliverable in your exit contract:
21
21
  - Tests that pass and cover the changed behavior.
22
22
  - Generated artifacts (built output, schemas, configs) that exist on disk.
23
23
  - Structured markers or summaries when the deliverable is not purely code.
24
+ - If the wave declares `### Proof artifacts`, ensure those machine-visible local files are present before closure.
24
25
  5. **Run tests** -- execute `pnpm test` or the repo's declared test command. Fix any regressions your changes introduced.
25
26
  6. **Verify exit contract** -- walk each line of your exit contract and confirm a proof artifact backs it. If any line lacks proof, either produce it or post a coordination record explaining the gap.
26
27
  7. **Coordination record** -- post a record summarizing what landed, what proof exists, and any downstream impacts on integration or documentation.
27
28
  8. **Handoff** -- if your work affects another agent's scope (interface changes, new dependencies, shifted proof expectations), post an explicit handoff naming the affected agent, files, and fields.
28
29
 
30
+ Note: summaries and inboxes may refresh during execution. Re-read context before major decisions rather than relying on the initial snapshot.
31
+
29
32
  ## Proof Standards
30
33
 
31
34
  - **Tests pass**: name the exact test file and the command that runs it. Example: "test/wave-orchestrator/planner.test.ts passes via pnpm test".
@@ -45,6 +48,7 @@ Post a coordination record immediately when any of these occur:
45
48
  - **Dependency**: your deliverable depends on another agent's work landing first.
46
49
  - **Proof gap**: you cannot produce the required proof for an exit contract line and need help.
47
50
  - **Completion**: you have finished all deliverables and want downstream agents (integration, documentation) to proceed.
51
+ - **Helper assignment received**: you received a targeted request from another agent. Acknowledge it promptly; unacknowledged requests become overdue and may be rerouted.
48
52
 
49
53
  ## Exit Contract Verification
50
54
 
@@ -18,7 +18,8 @@ Execute these steps for each infra surface assigned in the wave:
18
18
  2. **Verify each surface** -- check the current state of each surface against the required state.
19
19
  3. **Classify state** -- assign a status to each surface using the classification system below.
20
20
  4. **Emit markers** -- produce one `[infra-status]` marker per surface verified.
21
- 5. **Coordinate** -- post coordination records for any surface that blocks other agents or requires human approval.
21
+ 5. **Coordinate** -- post coordination records for any surface that blocks other agents or requires human approval. Use targeted requests so the finding becomes a helper assignment with an explicit owner.
22
+ 6. **Check dependencies** -- if the wave has inbound cross-lane dependency tickets, verify they are resolved before declaring infra conformance.
22
23
 
23
24
  ## Verification Surfaces
24
25
 
@@ -14,15 +14,17 @@
 
  Execute these steps in order:
 
- 1. **Collect evidence** -- re-read the compiled shared summary, your inbox, the board projection, and all coordination records posted by implementation agents and cont-EVAL (if present).
+ 1. **Collect evidence** -- re-read the compiled shared summary, your inbox, the board projection, and all coordination records posted by implementation agents and cont-EVAL (if present). Summaries refresh during execution, so use the latest version.
  2. **Check contradictions** -- identify claims from different agents that conflict (e.g., two agents claiming the same file, incompatible interface assumptions, inconsistent status claims).
- 3. **Verify proof gaps** -- walk each agent's exit contract and confirm proof artifacts exist. Flag any exit contract line that lacks durable evidence.
+ 3. **Verify proof gaps** -- walk each agent's exit contract and confirm proof artifacts exist. Flag any exit contract line that lacks durable evidence. When the wave declares `### Proof artifacts`, verify those artifacts are present. Check the proof registry for any revoked or superseded bundles that no longer satisfy closure.
  4. **Check helper assignments** -- verify that every helper assignment posted during the wave has a linked resolution or explicit follow-up.
  5. **Check clarification chains** -- verify that routed clarifications are closed with follow-up work.
- 6. **Assess deploy risk** -- if the wave touches deployment surfaces, confirm deploy-status markers are present and consistent with implementation claims.
- 7. **Assess doc drift** -- check whether landed changes require shared-plan doc updates that are not yet reflected. Flag drift for the documentation steward.
- 8. **Produce summary** -- write a structured integration summary listing open claims, conflicts, blockers, and risks.
- 9. **Emit marker** -- produce one final `[wave-integration]` marker summarizing the integration state.
+ 6. **Check rerun requests** -- verify that no active rerun request is pending. An uncleared rerun request blocks closure.
+ 7. **Check dependency tickets** -- verify that all inbound cross-lane dependency tickets are resolved or explicitly deferred with reasoning.
+ 8. **Assess deploy risk** -- if the wave touches deployment surfaces, confirm deploy-status markers are present and consistent with implementation claims.
+ 9. **Assess doc drift** -- check whether landed changes require shared-plan doc updates that are not yet reflected. Flag drift for the documentation steward.
+ 10. **Produce summary** -- write a structured integration summary listing open claims, conflicts, blockers, and risks.
+ 11. **Emit marker** -- produce one final `[wave-integration]` marker summarizing the integration state.
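
The three closure-gating checks added in this hunk (proof registry state, rerun requests, dependency tickets) can be sketched as a single function. This is a minimal illustration, not the package's implementation: the `ProofBundle` and `DependencyTicket` shapes, the `open` ticket status, and the `closure_blockers` helper are assumptions made for the example, since the diff does not show the real registry or ticket formats.

```python
from dataclasses import dataclass


@dataclass
class ProofBundle:
    # Hypothetical shape; states mirror the wording "active", "revoked", "superseded".
    bundle_id: str
    state: str


@dataclass
class DependencyTicket:
    # Hypothetical shape; "open" is an assumed status for an unresolved ticket.
    ticket_id: str
    status: str
    deferral_reason: str = ""


def closure_blockers(bundles, tickets, rerun_pending):
    """Return human-readable reasons why the wave cannot close yet."""
    blockers = []
    # Revoked or superseded bundles no longer satisfy closure.
    for bundle in bundles:
        if bundle.state != "active":
            blockers.append(f"proof bundle {bundle.bundle_id} is {bundle.state}")
    # An uncleared rerun request blocks closure.
    if rerun_pending:
        blockers.append("an active rerun request is pending")
    # Tickets must be resolved, or deferred with explicit reasoning.
    for ticket in tickets:
        if ticket.status == "resolved":
            continue
        if ticket.status == "deferred" and ticket.deferral_reason.strip():
            continue
        blockers.append(f"dependency ticket {ticket.ticket_id} is {ticket.status} without reasoning")
    return blockers
```

An empty list only means these three checks pass; the remaining steps (contradictions, helper assignments, deploy risk, doc drift) still apply.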
 
  ## Synthesis Checklist
 
@@ -37,6 +39,9 @@ Review each item. Any failure means the wave is `needs-more-work`:
  - [ ] Clarification chains are closed.
  - [ ] cont-EVAL marker (if present) shows `satisfied` with matching ids.
  - [ ] Deploy-status markers (if present) show `healthy` or have explicit downgrade reasoning.
+ - [ ] Cross-lane dependency tickets are resolved or explicitly deferred.
+ - [ ] No active rerun request is pending (check via `wave control rerun get`).
+ - [ ] Proof registry bundles are active, not revoked or superseded.
 
  ## Contradiction Resolution
 
@@ -55,8 +60,10 @@ The integration summary should be structured and machine-readable. Include:
  1. **Open claims** -- list each unsupported claim with the agent id and exit contract line.
  2. **Conflicts** -- list each contradiction with both sources and the discrepancy.
  3. **Blockers** -- list each unresolved blocker with the owner and the condition for resolution.
- 4. **Deploy risks** -- list any deploy surfaces that are not healthy or verified.
- 5. **Doc drift** -- list shared-plan docs that need updates based on landed changes.
+ 4. **Dependencies** -- list unresolved cross-lane dependency tickets with owner lane and status.
+ 5. **Deploy risks** -- list any deploy surfaces that are not healthy or verified.
+ 6. **Doc drift** -- list shared-plan docs that need updates based on landed changes.
+ 7. **Proof state** -- list any proof bundles that are revoked or superseded, and any declared proof artifacts that are missing.
 
  Keep the summary concise enough to drive relaunch decisions. Do not pad with observations that do not affect closure.
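
One way to keep the seven sections structured and machine-readable is a small record type serialized to JSON. The sketch below is only an illustration: the field names, the `IntegrationSummary` class, and the agent id `A3` are assumptions, because the diff lists the sections but does not define a schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class IntegrationSummary:
    # One list per section, in the order the skill lists them (names are illustrative).
    open_claims: list = field(default_factory=list)
    conflicts: list = field(default_factory=list)
    blockers: list = field(default_factory=list)
    dependencies: list = field(default_factory=list)
    deploy_risks: list = field(default_factory=list)
    doc_drift: list = field(default_factory=list)
    proof_state: list = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize all sections, including empty ones, so consumers see a fixed shape."""
        return json.dumps(asdict(self), indent=2)

    def has_findings(self) -> bool:
        """True when at least one section contains an entry that affects closure."""
        return any(asdict(self).values())


# Hypothetical example: a single unresolved blocker with owner and resolution condition.
summary = IntegrationSummary(
    blockers=[{"owner": "A3", "condition": "schema migration merged"}],
)
```

Emitting empty sections explicitly keeps the output shape stable, which makes it easier for a relaunch decision to distinguish "nothing found" from "section omitted".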
 
@@ -0,0 +1,39 @@
+ # Planner Role
+
+ Use this skill when the agent is producing a future wave or multi-wave roadmap packet.
+
+ ## Core Rules
+
+ - Stay read-only during planning.
+ - Turn simple requests into explicit wave contracts, not broad prose.
+ - Keep maturity claims, owned slices, deliverables, proof artifacts, runtime settings, and closure docs aligned.
+ - Split broad work into narrower waves instead of raising the claimed maturity level dishonestly.
+ - Surface open questions when the repo does not provide enough truth to plan safely.
+
+ ## Planning Checklist
+
+ For each proposed wave, answer these explicitly:
+
+ 1. What exact maturity level is being claimed?
+ 2. Which exact components are being promoted?
+ 3. Which implementation owners map to those promoted components?
+ 4. Which exact deliverables prove each owned slice?
+ 5. Which exact proof artifacts are required? Declare them in `### Proof artifacts` for machine-visible validation.
+ 6. Who owns live proof, if the target is `pilot-live` or above?
+ 7. What must A8, A9, and A0 reject if the wave lands incompletely?
+ 8. Which shared-plan docs must change when the wave closes?
+ 9. Which executor, model, budget, retry policy, and Context7 settings reduce avoidable failure?
+ 10. Does this work require cross-lane dependency tickets?
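
Questions 2 through 4 of the checklist amount to a mapping check: every promoted component needs an owner, and every owner needs deliverables. A minimal sketch follows, assuming plain dictionaries for the mappings; the `ownership_gaps` helper, the component names, and the agent ids are hypothetical and not part of the package.

```python
def ownership_gaps(promoted_components, owners, deliverables):
    """List gaps between promoted components, their owners, and owner deliverables.

    owners: component name -> agent id (assumed shape)
    deliverables: agent id -> list of exact file paths (assumed shape)
    """
    gaps = []
    for component in promoted_components:
        owner = owners.get(component)
        if owner is None:
            gaps.append(f"component {component} has no implementation owner")
        elif not deliverables.get(owner):
            gaps.append(f"owner {owner} of {component} has no deliverables")
    return gaps


# Hypothetical plan: "scheduler" is fully mapped, "ledger" has an owner with no
# deliverables, and "gateway" has no owner at all.
gaps = ownership_gaps(
    ["scheduler", "ledger", "gateway"],
    {"scheduler": "A1", "ledger": "A2"},
    {"A1": ["src/scheduler.py"], "A2": []},
)
```

Any non-empty result is a candidate for an open question in the planner output rather than a silent assumption.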
+
+ ## Maturity Rules
+
+ - Default to one honest maturity jump per component per wave.
+ - `repo-landed` means repo-local code, tests, and docs prove the claim.
+ - `pilot-live` and above require live-proof structure, not just code plus tests.
+ - Do not let “the code exists” become “the deployment works”.
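
The first and third maturity rules can be checked mechanically once the maturity ladder is known. The sketch below takes the ladder as a parameter because this diff names only `repo-landed` and `pilot-live`, not the full ordered list; the `maturity_issues` helper, its parameters, and the example level `planned` are assumptions for illustration.

```python
def maturity_issues(ladder, current, target, live_proof_owner=None,
                    live_threshold="pilot-live"):
    """Flag plans that skip maturity levels or claim live levels without a live-proof owner.

    ladder: ordered list of maturity level names, lowest first (caller-supplied).
    """
    issues = []
    jump = ladder.index(target) - ladder.index(current)
    if jump > 1:
        issues.append(f"{current} -> {target} skips {jump - 1} level(s); consider splitting the wave")
    # Levels at or above the threshold need live-proof structure, not just code plus tests.
    if ladder.index(target) >= ladder.index(live_threshold) and not live_proof_owner:
        issues.append(f"{target} requires a live-proof owner")
    return issues


# Hypothetical three-level ladder; "planned" is an invented placeholder level.
example_ladder = ["planned", "repo-landed", "pilot-live"]
issues = maturity_issues(example_ladder, "planned", "pilot-live")
```

Because the rule says "default to" one jump, the first message is phrased as a prompt to split rather than a hard rejection.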
+
+ ## Output Rules
+
+ - Emit structured JSON only.
+ - Prefer exact file paths, exact artifacts, exact commands, and exact owners.
+ - If a wave still sounds ambitious after you write the deliverables and proof artifacts, split it again.
@@ -0,0 +1,21 @@
+ {
+   "id": "role-planner",
+   "title": "Planner Role",
+   "description": "Guides the read-only planner through maturity selection, wave splitting, ownership mapping, proof design, runtime pinning, and closure-ready planning output.",
+   "activation": {
+     "when": "Attach when a planner agent is generating future waves or a reviewable roadmap packet from a simple task request.",
+     "roles": [],
+     "runtimes": [],
+     "deployKinds": []
+   },
+   "termination": "Stop when the plan, verifier notes, and open questions are explicit enough for review or apply.",
+   "permissions": {
+     "network": [],
+     "shell": [],
+     "mcpServers": []
+   },
+   "trust": {
+     "tier": "repo-owned"
+   },
+   "evalCases": []
+ }
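
The empty `permissions` lists in this manifest line up with the "Stay read-only during planning" rule. A small check like the one below could confirm that a manifest requests nothing, assuming an empty list means no access is granted; the diff does not state how the loader interprets these fields, and the `granted_permissions` helper is hypothetical.

```python
import json

# Abbreviated copy of the manifest above; only the fields used by the check are kept.
MANIFEST_TEXT = """
{
  "id": "role-planner",
  "permissions": {"network": [], "shell": [], "mcpServers": []},
  "trust": {"tier": "repo-owned"}
}
"""


def granted_permissions(manifest: dict) -> dict:
    """Return only the permission categories that list at least one entry."""
    permissions = manifest.get("permissions", {})
    return {name: entries for name, entries in permissions.items() if entries}


manifest = json.loads(MANIFEST_TEXT)
granted = granted_permissions(manifest)
```

For the planner manifest the result is an empty dictionary, which is what a read-only role would be expected to declare under the stated assumption.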
@@ -53,6 +53,7 @@ Escalate to the integration steward or post a coordination record when:
  - Uncertainty is high enough that proceeding without resolution could cause rework.
  - The research question requires access to systems or files outside your declared scope.
  - An external dependency has a breaking change, deprecation, or security advisory.
+ - A finding affects a cross-lane dependency that may need a dependency ticket.
 
  ## Customization
 
@@ -18,9 +18,9 @@ Execute these steps in order:
  2. **Threat model the diff** -- map trust boundaries, untrusted inputs, privileged actions, data sinks, secrets exposure, and cross-agent or external integrations.
  3. **Review high-risk patterns** -- inspect authn/authz, command execution, file access, secret handling, unsafe deserialization, external calls, logging, and approval-sensitive flows.
  4. **Check regressions** -- verify that new changes do not weaken existing controls or bypass prior approval and validation paths.
- 5. **Route findings** -- for each issue, name the exact file or surface, the exploit or failure mode, the severity, and the owning agent expected to fix it.
+ 5. **Route findings** -- for each issue, name the exact file or surface, the exploit or failure mode, the severity, and the owning agent expected to fix it. Use targeted coordination requests so the finding becomes a helper assignment.
  6. **Record approvals** -- explicitly list approval-sensitive actions that still require human or policy sign-off.
- 7. **Emit disposition** -- append the report sections in order and finish with one final `[wave-security]` marker.
+ 7. **Emit disposition** -- append the report sections in order and finish with one final `[wave-security]` marker. Security closure runs before integration, so a `blocked` marker prevents the entire closure sequence from advancing.
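
The ordering consequence described in step 7 can be pictured as a sequence that halts at the first `blocked` stage. The sketch below is illustrative only: the `run_closure` helper and the `clear` disposition are assumptions, and the only facts taken from the text are that security runs before integration and that `blocked` stops the sequence.

```python
def run_closure(stages, dispositions):
    """Walk closure stages in order and stop at the first one marked `blocked`.

    stages: ordered stage names (assumed representation).
    dispositions: stage name -> disposition string; "blocked" halts the sequence.
    Returns the stages that were reached, including the blocking one.
    """
    reached = []
    for stage in stages:
        reached.append(stage)
        if dispositions.get(stage) == "blocked":
            break
    return reached


# Security is blocked, so integration is never reached.
halted = run_closure(["security", "integration"], {"security": "blocked"})

# "clear" is a placeholder for any non-blocking disposition.
advanced = run_closure(["security", "integration"], {"security": "clear"})
```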
 
  ## Review Checklist
 
@@ -49,11 +49,12 @@ All three layers are binding. When they conflict, prefer the compiled task promp
 
  ## Context Management
 
- - Re-read the shared summary and inbox before starting work and before emitting final markers.
+ - Re-read the shared summary and inbox before starting work and before emitting final markers. Summaries and inboxes may refresh during execution via the live orchestration loop, so re-read before major decisions.
  - Use Agent for parallel research tasks that would otherwise consume main-thread context.
  - Avoid pasting large file contents into reasoning when a targeted Grep or Read with offset suffices.
  - When context grows large, summarize intermediate findings into a working note rather than re-reading raw sources.
  - Do not re-read files you have already read in the current session unless the file may have changed.
+ - Do not edit files under `.tmp/` (coordination logs, control-plane events, proof registries, dashboards, traces). These are managed by the launcher and operator tooling.
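
The new `.tmp/` rule is easy to enforce with a path check before any write. The following is a minimal sketch, assuming paths are given relative to the repository root with POSIX separators; the `is_launcher_managed` helper and the example file names are hypothetical and not something the package is known to ship.

```python
from pathlib import PurePosixPath


def is_launcher_managed(repo_relative_path: str) -> bool:
    """Return True when the path is `.tmp` itself or lies under the top-level `.tmp/` directory."""
    # PurePosixPath drops "." components, so "./.tmp/x" and ".tmp/x" are treated alike.
    parts = PurePosixPath(repo_relative_path).parts
    return parts[:1] == (".tmp",)


# Hypothetical paths for illustration.
blocked_write = is_launcher_managed("./.tmp/coordination/log.jsonl")
allowed_write = is_launcher_managed("docs/plans/current-state.md")
```

Matching on the first path component, rather than on a string prefix, avoids treating a sibling directory such as `.tmpfoo/` as managed.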
 
  ## Customization