cool-workflow 0.1.78

Files changed (193)
  1. package/.claude-plugin/plugin.json +20 -0
  2. package/.codex-plugin/mcp.json +10 -0
  3. package/.codex-plugin/plugin.json +38 -0
  4. package/.mcp.json +10 -0
  5. package/LICENSE +24 -0
  6. package/README.md +638 -0
  7. package/apps/architecture-review/app.json +51 -0
  8. package/apps/architecture-review/workflow.js +116 -0
  9. package/apps/end-to-end-golden-path/app.json +30 -0
  10. package/apps/end-to-end-golden-path/workflow.js +33 -0
  11. package/apps/pr-review-fix-ci/app.json +59 -0
  12. package/apps/pr-review-fix-ci/workflow.js +90 -0
  13. package/apps/release-cut/app.json +54 -0
  14. package/apps/release-cut/workflow.js +82 -0
  15. package/apps/research-synthesis/app.json +50 -0
  16. package/apps/research-synthesis/workflow.js +76 -0
  17. package/apps/workflow-app-framework-demo/app.json +29 -0
  18. package/apps/workflow-app-framework-demo/workflow.js +44 -0
  19. package/dist/agent-config.js +223 -0
  20. package/dist/candidate-scoring.js +715 -0
  21. package/dist/capability-core.js +630 -0
  22. package/dist/capability-dispatcher.js +86 -0
  23. package/dist/capability-registry.js +523 -0
  24. package/dist/cli.js +1276 -0
  25. package/dist/collaboration.js +727 -0
  26. package/dist/commit.js +570 -0
  27. package/dist/contract-migration.js +234 -0
  28. package/dist/coordinator.js +1163 -0
  29. package/dist/daemon.js +44 -0
  30. package/dist/dispatch.js +201 -0
  31. package/dist/drive.js +503 -0
  32. package/dist/error-feedback.js +415 -0
  33. package/dist/evidence-grounding.js +179 -0
  34. package/dist/evidence-reasoning.js +733 -0
  35. package/dist/execution-backend.js +1279 -0
  36. package/dist/harness.js +61 -0
  37. package/dist/mcp-server.js +1615 -0
  38. package/dist/multi-agent-eval.js +857 -0
  39. package/dist/multi-agent-host.js +764 -0
  40. package/dist/multi-agent-operator-ux.js +537 -0
  41. package/dist/multi-agent-trust.js +366 -0
  42. package/dist/multi-agent.js +1173 -0
  43. package/dist/node-snapshot.js +270 -0
  44. package/dist/observability.js +922 -0
  45. package/dist/operator-ux.js +971 -0
  46. package/dist/orchestrator/audit-operations.js +182 -0
  47. package/dist/orchestrator/candidate-operations.js +117 -0
  48. package/dist/orchestrator/cli-options.js +288 -0
  49. package/dist/orchestrator/collaboration-operations.js +86 -0
  50. package/dist/orchestrator/feedback-operations.js +81 -0
  51. package/dist/orchestrator/host-operations.js +78 -0
  52. package/dist/orchestrator/lifecycle-operations.js +462 -0
  53. package/dist/orchestrator/migration-operations.js +44 -0
  54. package/dist/orchestrator/multi-agent-operations.js +362 -0
  55. package/dist/orchestrator/report.js +369 -0
  56. package/dist/orchestrator/topology-operations.js +84 -0
  57. package/dist/orchestrator.js +874 -0
  58. package/dist/pipeline-contract.js +92 -0
  59. package/dist/pipeline-runner.js +285 -0
  60. package/dist/reclamation.js +882 -0
  61. package/dist/result-normalize.js +194 -0
  62. package/dist/run-export.js +64 -0
  63. package/dist/run-registry.js +1347 -0
  64. package/dist/run-state-schema.js +67 -0
  65. package/dist/sandbox-profile.js +471 -0
  66. package/dist/scheduler.js +266 -0
  67. package/dist/scheduling.js +184 -0
  68. package/dist/schema-validate.js +98 -0
  69. package/dist/state-explosion.js +1213 -0
  70. package/dist/state-migrations.js +463 -0
  71. package/dist/state-node.js +301 -0
  72. package/dist/state.js +308 -0
  73. package/dist/telemetry-attestation.js +156 -0
  74. package/dist/telemetry-ledger.js +145 -0
  75. package/dist/topology.js +527 -0
  76. package/dist/triggers.js +159 -0
  77. package/dist/trust-audit.js +475 -0
  78. package/dist/types/blackboard.js +2 -0
  79. package/dist/types/boundary.js +29 -0
  80. package/dist/types/candidate.js +2 -0
  81. package/dist/types/collaboration.js +2 -0
  82. package/dist/types/core.js +2 -0
  83. package/dist/types/drive.js +10 -0
  84. package/dist/types/error-feedback.js +2 -0
  85. package/dist/types/evidence-reasoning.js +2 -0
  86. package/dist/types/execution-backend.js +2 -0
  87. package/dist/types/multi-agent.js +2 -0
  88. package/dist/types/observability.js +2 -0
  89. package/dist/types/pipeline.js +2 -0
  90. package/dist/types/reclamation.js +8 -0
  91. package/dist/types/result.js +2 -0
  92. package/dist/types/run-registry.js +2 -0
  93. package/dist/types/run.js +2 -0
  94. package/dist/types/sandbox.js +2 -0
  95. package/dist/types/schedule.js +2 -0
  96. package/dist/types/state-node.js +2 -0
  97. package/dist/types/topology.js +2 -0
  98. package/dist/types/trust.js +2 -0
  99. package/dist/types/workbench.js +2 -0
  100. package/dist/types/worker.js +2 -0
  101. package/dist/types/workflow-app.js +2 -0
  102. package/dist/types.js +43 -0
  103. package/dist/verifier-registry.js +46 -0
  104. package/dist/verifier.js +78 -0
  105. package/dist/version.js +8 -0
  106. package/dist/workbench-host.js +172 -0
  107. package/dist/workbench.js +190 -0
  108. package/dist/worker-isolation.js +1028 -0
  109. package/dist/workflow-api.js +98 -0
  110. package/dist/workflow-app-framework.js +626 -0
  111. package/docs/agent-delegation-drive.7.md +190 -0
  112. package/docs/agent-framework.md +176 -0
  113. package/docs/candidate-scoring.7.md +106 -0
  114. package/docs/canonical-workflow-apps.7.md +137 -0
  115. package/docs/capability-topology-registry.7.md +168 -0
  116. package/docs/cli-mcp-parity.7.md +373 -0
  117. package/docs/contract-migration-tooling.7.md +123 -0
  118. package/docs/control-plane-scheduling.7.md +110 -0
  119. package/docs/coordinator-blackboard.7.md +183 -0
  120. package/docs/dogfood/architecture-review-cool-workflow.md +16 -0
  121. package/docs/dogfood-one-real-repo.7.md +168 -0
  122. package/docs/durable-state-and-locking.7.md +107 -0
  123. package/docs/end-to-end-golden-path.7.md +117 -0
  124. package/docs/error-feedback.7.md +153 -0
  125. package/docs/evidence-adoption-reasoning-chain.7.md +270 -0
  126. package/docs/execution-backends.7.md +300 -0
  127. package/docs/getting-started.md +99 -0
  128. package/docs/index.md +41 -0
  129. package/docs/mcp-app-surface.7.md +235 -0
  130. package/docs/multi-agent-cli-mcp-surface.7.md +265 -0
  131. package/docs/multi-agent-eval-replay-harness.7.md +302 -0
  132. package/docs/multi-agent-operator-ux.7.md +314 -0
  133. package/docs/multi-agent-runtime-core.7.md +231 -0
  134. package/docs/multi-agent-topologies.7.md +103 -0
  135. package/docs/multi-agent-trust-policy-audit.7.md +154 -0
  136. package/docs/node-snapshot-diff-replay.7.md +135 -0
  137. package/docs/observability-cost-accounting.7.md +194 -0
  138. package/docs/operator-ux.7.md +180 -0
  139. package/docs/pipeline-runner.7.md +136 -0
  140. package/docs/project-index.md +261 -0
  141. package/docs/real-execution-backends.7.md +142 -0
  142. package/docs/release-and-migration.7.md +280 -0
  143. package/docs/release-tooling.7.md +159 -0
  144. package/docs/routines.md +48 -0
  145. package/docs/run-registry-control-plane.7.md +312 -0
  146. package/docs/run-retention-reclamation.7.md +191 -0
  147. package/docs/sandbox-profiles.7.md +137 -0
  148. package/docs/scheduled-tasks.md +80 -0
  149. package/docs/security-trust-hardening.7.md +117 -0
  150. package/docs/state-explosion-management.7.md +264 -0
  151. package/docs/state-node.7.md +96 -0
  152. package/docs/team-collaboration.7.md +207 -0
  153. package/docs/unix-principles.md +192 -0
  154. package/docs/verifier-gated-commit.7.md +140 -0
  155. package/docs/web-desktop-workbench.7.md +215 -0
  156. package/docs/worker-isolation.7.md +167 -0
  157. package/docs/workflow-app-framework.7.md +274 -0
  158. package/manifest/README.md +43 -0
  159. package/manifest/plugin.manifest.json +316 -0
  160. package/manifest/pricing.policy.json +14 -0
  161. package/package.json +79 -0
  162. package/scripts/agents/claude-p-agent.js +104 -0
  163. package/scripts/agents/claude-p-agent.sh +9 -0
  164. package/scripts/agents/cw-attest-keygen.js +55 -0
  165. package/scripts/agents/cw-attest-wrap.js +143 -0
  166. package/scripts/block-unapproved-tag.sh +39 -0
  167. package/scripts/bump-version.js +249 -0
  168. package/scripts/canonical-apps.js +171 -0
  169. package/scripts/cw.js +4 -0
  170. package/scripts/dist-drift-check.js +79 -0
  171. package/scripts/dogfood-architecture-review.js +237 -0
  172. package/scripts/dogfood-release.js +624 -0
  173. package/scripts/forward-ref-docs.js +73 -0
  174. package/scripts/gen-manifests.js +232 -0
  175. package/scripts/golden-path.js +300 -0
  176. package/scripts/mcp-server.js +4 -0
  177. package/scripts/new-feature.js +121 -0
  178. package/scripts/parity-check.js +213 -0
  179. package/scripts/release-check.js +118 -0
  180. package/scripts/release-flow.js +272 -0
  181. package/scripts/release-gate.sh +85 -0
  182. package/scripts/sync-project-index.js +387 -0
  183. package/scripts/validate-run-state-schema.js +126 -0
  184. package/scripts/verify-container-selfref.js +64 -0
  185. package/scripts/version-sync-check.js +237 -0
  186. package/skills/cool-workflow/SKILL.md +162 -0
  187. package/skills/cool-workflow/references/commands.md +282 -0
  188. package/tsconfig.json +16 -0
  189. package/ui/workbench/app.css +76 -0
  190. package/ui/workbench/app.js +159 -0
  191. package/ui/workbench/index.html +32 -0
  192. package/workflows/architecture-review.workflow.js +84 -0
  193. package/workflows/research-synthesis.workflow.js +47 -0
@@ -0,0 +1,300 @@
1
+ # EXECUTION-BACKENDS(7)
2
+
3
+ ## NAME
4
+
5
+ Execution Backends - pluggable, swappable execution drivers for Cool Workflow (v0.1.29)
6
+
7
+ ## SYNOPSIS
8
+
9
+ ```text
10
+ node dist/cli.js backend list
11
+ node dist/cli.js backend show shell
12
+ node dist/cli.js backend probe container
13
+ node dist/cli.js dispatch <run-id> --sandbox readonly --backend shell
14
+ node dist/cli.js worker manifest <run-id> <worker-id>
15
+ ```
16
+
17
+ ## DESCRIPTION
18
+
19
+ An execution backend is a CW driver: a thin adapter that runs a dispatched
20
+ task/worker somewhere, under the requested sandbox profile, and records a
21
+ canonical result envelope plus a sandbox attestation. v0.1.29 lifts execution
22
+ out of the kernel into this driver layer.
23
+
24
+ The model is a BSD VFS / device-driver layer. There is ONE narrow
25
+ `ExecutionBackend` interface (the mechanism) and many interchangeable drivers
26
+ (`node`, `bun`, `shell`, `container`, `remote`, `ci`). The kernel —
27
+ orchestrator, dispatch, and pipeline-runner — never learns which backend ran a
28
+ task. WHAT to run and which evidence to record is kernel policy; HOW and WHERE
29
+ it runs is the driver's concern.
30
+
31
+ ```text
32
+ selected backend -> sandbox attestation -> execution/delegation -> canonical envelope
33
+ ```
34
+
35
+ The result envelope, evidence refs, and provenance a task produces are
36
+ schema-identical no matter which backend ran it. The backend id and its sandbox
37
+ attestation are recorded AS provenance, so eval/replay, the verifier gates, and
38
+ the v0.1.28 run registry do not care which backend executed a run.
39
+
40
+ ## THE CONTRACT
41
+
42
+ The `ExecutionBackend` interface is three members:
43
+
44
+ ```text
45
+ descriptor the capability descriptor: which sandbox dimensions it enforces vs
46
+ attests, local vs remote, kind (local/delegating), readiness
47
+ probe(ctx) live, deterministic readiness check
48
+ run(request) execute (or delegate) under a sandbox profile and return a
49
+ canonical ExecutionResultEnvelope { status, result, evidence, provenance }
50
+ ```
51
+
52
+ `run` takes a dispatch/worker manifest plus a resolved sandbox profile and
53
+ returns `{ result, evidence }` (byte-stable across backends) and `provenance`
54
+ (backend id + `SandboxAttestation` + optional delegation handle).
55
+
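As a rough illustration of the three-member contract described above, here is a minimal sketch of a driver object. The `echoBackend` driver, the field names inside `descriptor` and `request`, and the `"completed"` status value are assumptions made for this sketch; the source only names `descriptor`, `probe(ctx)`, `run(request)`, and the envelope keys.

```javascript
// Hypothetical in-memory driver; not one of CW's real backends.
const echoBackend = {
  // Capability descriptor: per-dimension enforce/attest declaration (shape assumed).
  descriptor: {
    id: "echo",
    kind: "local",
    locality: "local",
    dimensions: {
      read: "attest",
      write: "attest",
      command: "enforce",
      network: "attest",
      env: "enforce",
    },
  },

  // Readiness check: deterministic and side-effect free.
  probe(ctx) {
    return { ready: true, detail: `probed from ${ctx.cwd}` };
  },

  // "Executes" a request and returns an envelope with the four documented keys.
  run(request) {
    const { id, kind, locality, dimensions } = this.descriptor;
    const names = Object.keys(dimensions);
    return {
      status: "completed", // assumed success value; the doc only specifies "refused"
      result: { summary: `ran ${request.command}` },
      evidence: [`command:${request.command}`],
      provenance: {
        backendId: id,
        attestation: {
          backendId: id,
          locality,
          kind,
          sandboxProfileId: request.sandboxProfileId,
          enforced: names.filter((d) => dimensions[d] === "enforce"),
          attested: names.filter((d) => dimensions[d] === "attest"),
          status: "enforced",
        },
      },
    };
  },
};

const envelope = echoBackend.run({
  command: "node dist/cli.js list",
  sandboxProfileId: "readonly",
});
console.log(JSON.stringify(envelope.provenance, null, 2));
```

The point of the shape is that `result` and `evidence` carry no backend identity; everything backend-specific sits under `provenance`.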
56
+ ## THE SANDBOX PROFILE IS THE CONTRACT
57
+
58
+ Every backend MUST honor the five sandbox-profile dimensions: read, write,
59
+ command, network, env. For each dimension a driver declares one of:
60
+
61
+ `enforce`
62
+ : the driver actively restricts the dimension at execution time.
63
+
64
+ `attest`
65
+ : the driver records a verifiable claim but relies on the host/runner to
66
+ enforce it (mirrors the existing `sandbox.hostRequired` split).
67
+
68
+ `unsupported`
69
+ : the driver can neither enforce nor attest it.
70
+
71
+ A profile requires a dimension when it restricts it (`command` when
72
+ `execute.mode != any`, `network` when `network.mode != any`, `env` when
73
+ `env.inherit` is false; read/write are always bounded). If a required dimension
74
+ is `unsupported`, or the backend is not ready, or the command is denied by the
75
+ profile, the backend FAILS CLOSED: `run` returns `status: "refused"` with an
76
+ attestation whose `status` is `refused`. It never silently downgrades to an
77
+ unsandboxed execution.
78
+
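The required-dimension rule and the fail-closed check above can be sketched as follows. The profile fields (`execute.mode`, `network.mode`, `env.inherit`) come from the text; the function names, the `support` map, and the mode values other than `any` are assumptions for illustration.

```javascript
// Dimensions a profile requires, following the rule stated in the section above.
function requiredDimensions(profile) {
  const required = ["read", "write"]; // read/write are always bounded
  if (profile.execute.mode !== "any") required.push("command");
  if (profile.network.mode !== "any") required.push("network");
  if (profile.env.inherit === false) required.push("env");
  return required;
}

// Fail closed: refuse when any required dimension is unsupported by the driver.
// `support` maps dimension -> "enforce" | "attest" | "unsupported" (shape assumed).
function checkSandbox(profile, support) {
  const unenforceable = requiredDimensions(profile).filter(
    (d) => (support[d] ?? "unsupported") === "unsupported"
  );
  if (unenforceable.length > 0) {
    return { status: "refused", reason: "sandbox-unenforceable", unenforceable };
  }
  return { status: "enforced", unenforceable: [] };
}

// Hypothetical restrictive profile and a driver that cannot handle `network`.
const profile = {
  execute: { mode: "allowlist" },
  network: { mode: "none" },
  env: { inherit: false },
};
const support = {
  read: "attest",
  write: "attest",
  command: "enforce",
  network: "unsupported",
  env: "enforce",
};
console.log(checkSandbox(profile, support));
```

A missing entry in `support` is treated as `unsupported`, so an incomplete descriptor also refuses rather than downgrading.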
79
+ ## DRIVERS
80
+
81
+ `node` (default)
82
+ : Reproduces pre-v0.1.29 behavior exactly. The host runs the worker in-process
83
+ under CW's worker-output acceptance (a delegate-host execution). When it
84
+ executes a command it enforces command + env via the Node child process and
85
+ attests read/write/network to the host.
86
+
87
+ `bun`
88
+ : Node-compatible by default, Bun-friendly. Executes via the Node-compatible
89
+ runtime so evidence is byte-stable with `node`, and attests Bun availability
90
+ in provenance. Enforces command + env; attests read/write/network.
91
+
92
+ `shell`
93
+ : Runs a command/worker via the system shell (`/bin/sh -c`) under the sandbox
94
+ contract. Enforces command + env; attests read/write/network.
95
+
96
+ `container`
97
+ : Delegates to a container runtime (docker/podman) and records the
98
+ `image@digest` handle + attestation + result. A container can enforce all
99
+ five dimensions. Fails closed when no image is supplied.
100
+
101
+ `remote`
102
+ : Delegates to a remote runner and records the endpoint + job handle +
103
+ attestation + result. Fails closed when no endpoint is configured
104
+ (`CW_REMOTE_ENDPOINT` or `--endpoint`).
105
+
106
+ `ci`
107
+ : Delegates to a CI runner and records the job handle + attestation + result.
108
+ Fails closed when no CI job target is configured (`CW_CI_ENDPOINT` or
109
+ `--job`).
110
+
111
+ CW DELEGATES; IT DOES NOT BECOME THE EXECUTOR. The local drivers run a thin
112
+ child process to capture verifiable evidence (exit code + an output digest). The
113
+ container/remote/ci drivers delegate and record a handle + attestation +
114
+ result; they never reimplement a container runtime or a CI system.
115
+
116
+ ## SELECTION
117
+
118
+ Backend selection parallels `--sandbox`:
119
+
120
+ ```text
121
+ --backend <id> (flag) > CW_BACKEND (env) > node (default)
122
+ ```
123
+
124
+ Selection is recorded in run state (dispatch manifest, worker scope, worker
125
+ manifest, the RunDispatch) and surfaced in the v0.1.28 run registry as the
126
+ record's `backends` field. A per-task `backendId` overrides the run default.
127
+ `backend list|show|probe` and the `--backend` flag are declared once in
128
+ `src/capability-registry.ts`, so `cw <cmd> --json` and `cw_<cmd>` render one
129
+ data source and pass the v0.1.27 parity gate.
130
+
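A minimal sketch of the precedence chain above, assuming a hypothetical `resolveBackend` helper (not a CW function) and assuming that a per-task `backendId` wins over whatever the run default resolved to:

```javascript
// flag > CW_BACKEND env > "node" gives the run default; a per-task id overrides it.
function resolveBackend({ flag, env = {}, taskBackendId } = {}) {
  const runDefault = flag || env.CW_BACKEND || "node";
  return taskBackendId || runDefault;
}

console.log(resolveBackend());                                              // node
console.log(resolveBackend({ env: { CW_BACKEND: "shell" } }));              // shell
console.log(resolveBackend({ flag: "bun", env: { CW_BACKEND: "shell" } })); // bun
console.log(resolveBackend({ flag: "bun", taskBackendId: "container" }));   // container
```

Validation of the resolved id (the `backend-not-found` case) is deliberately left out of this sketch.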
131
+ ## EVIDENCE PARITY
132
+
133
+ The canonical evidence a local driver records for a command run is
134
+ backend-independent:
135
+
136
+ ```text
137
+ command:<command + args>
138
+ exitCode:<code>
139
+ stdoutSha256:sha256:<hex>
140
+ ```
141
+
142
+ Running CW's own self-verify (`node dist/cli.js list`) through `node`, `shell`,
143
+ and `bun` yields byte-identical `result` and `evidence`; only
144
+ `provenance.backendId` (and the attestation detail) differs. The
145
+ `test/execution-backends-smoke.js` gate proves this, proves the fail-closed
146
+ refusals, proves the recorded provenance and delegation handles, and proves the
147
+ verifier/registry stay backend-agnostic.
148
+
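To make the three evidence lines concrete, the sketch below builds them from a command, its exit code, and its captured stdout. The `commandEvidence` helper and the space-joined rendering of `<command + args>` are assumptions; only the three line prefixes come from the text above.

```javascript
const { createHash } = require("node:crypto");

// Builds the backend-independent evidence lines for one command run.
function commandEvidence(command, args, exitCode, stdout) {
  const digest = createHash("sha256").update(stdout).digest("hex");
  return [
    `command:${[command, ...args].join(" ")}`,
    `exitCode:${exitCode}`,
    `stdoutSha256:sha256:${digest}`,
  ];
}

// The same command and output give identical evidence regardless of which driver ran it.
const viaNode = commandEvidence("node", ["dist/cli.js", "list"], 0, "release-cut\n");
const viaShell = commandEvidence("node", ["dist/cli.js", "list"], 0, "release-cut\n");
console.log(viaNode.join("\n"));
console.log(JSON.stringify(viaNode) === JSON.stringify(viaShell));
```

Because only a digest of stdout is recorded, any difference in output bytes between drivers would show up as a different `stdoutSha256` line.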
149
+ ## ATTESTATION SHAPE
150
+
151
+ ```json
152
+ {
153
+ "backendId": "shell",
154
+ "locality": "local",
155
+ "kind": "local",
156
+ "sandboxProfileId": "readonly",
157
+ "required": ["read", "write", "network", "env"],
158
+ "enforced": ["command", "env"],
159
+ "attested": ["read", "write", "network"],
160
+ "unenforceable": [],
161
+ "status": "enforced",
162
+ "enforcedByCW": ["..."],
163
+ "hostRequired": ["..."]
164
+ }
165
+ ```
166
+
167
+ A delegating driver additionally records `handle` (e.g.
168
+ `{ "kind": "container", "ref": "img@sha256:..." }`).
169
+
170
+ ## FILES
171
+
172
+ ```text
173
+ .cw/runs/<run-id>/state.json
174
+ .cw/runs/<run-id>/dispatches/<dispatch-id>.json
175
+ .cw/runs/<run-id>/workers/<worker-id>/worker.json
176
+ .cw/runs/<run-id>/workers/<worker-id>/manifest.json
177
+ .cw/registry/index.json
178
+ ```
179
+
180
+ ## FAILURE MODES
181
+
182
+ Unknown backends fail closed with `backend-not-found` (CLI/dispatch/`CW_BACKEND`).
183
+
184
+ `run` returns `status: "refused"` with `attestation.status: "refused"` when:
185
+
186
+ - the command is denied by the sandbox profile (`sandbox-command-denied`),
187
+ - a required sandbox dimension is `unsupported` (`sandbox-unenforceable`),
188
+ - a local backend is not ready (`backend-not-ready`),
189
+ - a delegating backend has no delegation target (`delegation-target-missing`).
190
+
191
+ CW never silently downgrades a requested backend, and never runs a task
192
+ unsandboxed when the requested profile cannot be honored.
193
+
194
+ ## COMPATIBILITY
195
+
196
+ Execution Backends are introduced in CW v0.1.29. The default (`node`) backend
197
+ reproduces pre-v0.1.29 behavior exactly; runs with no backend selected keep
198
+ working and old run state loads unchanged (the backend fields are additive and
199
+ optional). The `ResultEnvelope` schema (`summary`, `findings`, `evidence`) is
200
+ unchanged — the backend id and attestation live in provenance and run state,
201
+ never in the result envelope.
202
+
203
+ ## SEE ALSO
204
+
205
+ sandbox-profiles(7), worker-isolation(7), cli-mcp-parity(7),
206
+ run-registry-control-plane(7), security-trust-hardening(7)
207
+
208
+ ## Web / Desktop Workbench (v0.1.30)
209
+
210
+ v0.1.30 adds the Web / Desktop Workbench: a read-only, localhost-only human
211
+ console that renders this surface (and the five operator panels: run
211
+ graph, blackboard, worker logs, candidate compare, audit timeline) for any run,
213
+ reading the SAME capability `--json` payloads. It is a THIRD FRONT DOOR alongside
214
+ the CLI and MCP that holds no authoritative state and forks no schema: each panel
215
+ equals its `cw <cmd> --json` payload byte-for-byte (parity-gated), and refresh
216
+ re-derives everything from disk. See
217
+ [web-desktop-workbench.7.md](web-desktop-workbench.7.md).
218
+
219
+ ## Observability + Cost Accounting (v0.1.31)
220
+
221
+ v0.1.31 adds Observability + Cost Accounting: `metrics show`/`metrics summary`
222
+ derive durations, failure/verifier/acceptance rates (with sample counts and
223
+ fail-closed `n/a`), and host-attested token/cost from existing durable run state
224
+ — no metrics database, no collector daemon, no hidden counter. Usage is additive
225
+ and optional (absent ⇒ `unreported`, never 0); cost is `attested` (attested usage
226
+ × a recorded pricing policy) or clearly `estimated`, with pricing as policy. Both
227
+ verbs are parity-gated and render read-only in the v0.1.30 Workbench. See
228
+ [observability-cost-accounting.7.md](observability-cost-accounting.7.md).
229
+
230
+
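The "absent means `unreported`, never 0" rule and the attested-versus-estimated split can be sketched as below. The `accountCost` helper, the `usage` and `pricing` field names, and the integer micro-USD unit are all hypothetical; the source does not show the real record shapes.

```javascript
// Derives a cost record from optional usage and a recorded pricing policy.
function accountCost(usage, pricing) {
  // Absent usage is reported as "unreported"; it is never coerced to 0.
  if (!usage || typeof usage.tokens !== "number") {
    return { coverage: "unreported" };
  }
  return {
    coverage: usage.attested ? "attested" : "estimated",
    tokens: usage.tokens,
    microUsd: usage.tokens * pricing.microUsdPerToken, // usage x pricing policy
    pricingPolicyId: pricing.id,
  };
}

const pricing = { id: "example-policy", microUsdPerToken: 3 }; // hypothetical policy
console.log(accountCost(undefined, pricing));
console.log(accountCost({ tokens: 1000, attested: true }, pricing));
console.log(accountCost({ tokens: 1000, attested: false }, pricing));
```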
231
+ ## Team Collaboration (v0.1.32)
232
+
233
+ v0.1.32 adds Team Collaboration: a host-attested actor and append-only
234
+ approvals/rejections/comments/handoffs provenance-linked to a durable target,
235
+ plus a review gate that STACKS ON the verifier gate — required approvals from
236
+ authorized roles, enforced inside `resolveCommitGate` AFTER the verifier checks
237
+ and never instead of them, failing closed on quorum/authority/self-approval and
238
+ recording who approved the very artifact that shipped. Policy (required approvals,
239
+ authorized roles, self-approval) is data, default off (pre-v0.1.32 behavior
240
+ unchanged). The verbs are parity-gated and render read-only in the v0.1.30
241
+ Workbench. See [Team Collaboration](team-collaboration.7.md).
242
+
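A minimal sketch of a review gate that stacks on the verifier gate, under assumed shapes: the `reviewGate` and `commitGate` helpers, the policy fields, and the reason strings are hypothetical, and the real `resolveCommitGate` is not shown in the source. Only the ordering (verifier first, review after, default off) is taken from the text above.

```javascript
// Counts approvals from authorized roles, ignoring self-approval unless allowed.
function reviewGate(policy, approvals, author) {
  if (!policy || !policy.requiredApprovals) {
    return { pass: true, reason: "review-policy-off" }; // default off
  }
  const approvers = new Set();
  for (const a of approvals) {
    if (!policy.authorizedRoles.includes(a.role)) continue; // no authority
    if (a.actor === author && !policy.allowSelfApproval) continue; // self-approval
    approvers.add(a.actor); // one vote per distinct actor
  }
  const pass = approvers.size >= policy.requiredApprovals;
  return { pass, reason: pass ? "quorum-met" : "quorum-not-met", approvers: [...approvers] };
}

// The review gate runs only after the verifier gate passes, never instead of it.
function commitGate(verifierPassed, policy, approvals, author) {
  if (!verifierPassed) return { pass: false, reason: "verifier-failed" };
  return reviewGate(policy, approvals, author);
}

const policy = { requiredApprovals: 2, authorizedRoles: ["maintainer"], allowSelfApproval: false };
const approvals = [
  { actor: "alice", role: "maintainer" },
  { actor: "bob", role: "maintainer" },
  { actor: "carol", role: "guest" },
];
console.log(commitGate(true, policy, approvals, "alice")); // alice is the author
console.log(commitGate(true, policy, approvals, "dave"));
```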
243
+ ## Release Tooling (v0.1.33)
244
+
245
+ The per-tag mechanical surfaces (version bump across 17 surfaces, feature scaffold, and the forward-reference docs) become deterministic scripts, with a de-duplicated release gate. See release-tooling(7).
246
+
247
+ ## Real Execution Backend Integrations (v0.1.34)
248
+
249
+ The container/remote/ci backends really execute (docker/podman run, remote/CI POST-and-poll) under the sandbox contract, with byte-stable evidence vs node and fail-closed refusal when a runtime/endpoint is unavailable. See real-execution-backends(7).
250
+
251
+ ## Node Snapshot / Diff / Replay (v0.1.35)
252
+
253
+ Per-node snapshot, structural diff, and isolated deterministic replay over StateNode, reusing the v0.1.23 eval harness; fail-closed on source drift (valid|stale|absent). See node-snapshot-diff-replay(7).
254
+
255
+ ## Contract Migration Tooling (v0.1.36)
256
+
257
+ A first-class declared migration registry (run-state + workflow-app) with per-edge compatibility proofs, fail-closed reachability, and a round-trip/non-destruction prover. See contract-migration-tooling(7).
258
+
259
+ ## Control-Plane Scheduling (v0.1.37)
260
+
261
+ Priority, concurrency limits, lease lifecycle, retry/backoff, and fail-closed park over the v0.1.28 Run Registry queue; policy-as-data, deterministic. See control-plane-scheduling(7).
262
+
263
+ ## Agent Delegation Drive (v0.1.38)
264
+
265
+ Spawns an external agent process per worker, captures `result.md` plus an attestation, and auto-drives plan -> dispatch -> fulfill -> accept -> commit. See agent-delegation-drive(7).
266
+
267
+ ## Run Retention & Provable Reclamation (v0.1.39)
268
+
269
+ Tiered, append-only, cryptographically verifiable run reclamation: seal the audit skeleton, free the reconstructable bulk, and prove it. See run-retention-reclamation(7).
270
+
271
+ ## Durable State & Locking (v0.1.40)
272
+
273
+ Atomic temp -> rename writes with fsync durability for authoritative stores, plus a portable stale-stealing file lock that serializes the cross-process read-modify-write stores. See durable-state-and-locking(7).
274
+
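The temp -> rename write pattern named above can be sketched with Node's standard `fs` calls. This is a generic illustration, not CW's implementation: the `writeFileAtomic` helper and the temp-file naming are assumptions, and the file lock is not covered.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Writes `data` to a temp file in the target's directory, fsyncs it, then renames
// it over the target so readers see either the old file or the complete new one.
function writeFileAtomic(target, data) {
  const dir = path.dirname(target);
  const tmp = path.join(dir, `.${path.basename(target)}.${process.pid}.tmp`);
  const fd = fs.openSync(tmp, "w");
  try {
    fs.writeSync(fd, data);
    fs.fsyncSync(fd); // flush file contents to disk before the rename
  } finally {
    fs.closeSync(fd);
  }
  fs.renameSync(tmp, target); // same-directory rename replaces the target in one step
}

// Usage against a throwaway directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "cw-atomic-"));
const target = path.join(dir, "state.json");
writeFileAtomic(target, JSON.stringify({ schemaVersion: 1 }));
writeFileAtomic(target, JSON.stringify({ schemaVersion: 2 }));
console.log(fs.readFileSync(target, "utf8"));
```

Keeping the temp file in the same directory as the target matters, since a rename across filesystems is not a single atomic step.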
275
+ ## Self-Audit Hardening & Pure-Router Decomposition (v0.1.41)
276
+
277
+ Evidence grounding, durable audit append, symlink-hardened containment, deterministic worker ids, and recursive redaction; self-describing BackendRegistry drivers (no per-id switches); and the orchestrator god-object decomposed into per-domain operation modules (a pure loadRun -> delegate router).
278
+
279
+ ## Robust Result Ingest (v0.1.42)
280
+
281
+ Captures findings/evidence from any reasonable agent output shape (alternate keys and prose), has CW derive grounded evidence itself, and warns on empty capture. This closes the v0.1.41 live-drive 'accepted with 0 captured' failure.
282
+
283
+ ## No-False-Green Gate & Launch Prep (v0.1.43)
284
+
285
+ Hard gate blocking empty-capture verifier-gated commits, plus quickstart and launch-prep docs.
286
+
287
+ ## Release-Gate Determinism & Agents Vendor (v0.1.44)
288
+
289
+ Release-readiness checks now validate the committed blob (`git show HEAD:<path>`) instead of the mutable working tree — eliminating false-red/false-green from concurrent working-tree writes (iCloud/Spotlight/editor). Adds the `agents` vendor manifest target: a generated `.agents/plugins/cool-workflow/` adapter giving any non-Claude AI agent one common interface to CW.
290
+
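Reading the committed blob instead of the working tree, as described above, can be sketched with `git show HEAD:<path>`. The `readCommittedBlob` helper and its injectable `exec` parameter are assumptions for illustration; CW's actual release-check code is not shown in the source.

```javascript
const { execFileSync } = require("node:child_process");

// Returns the content of `relPath` as committed at HEAD, ignoring any
// uncommitted edits in the working tree. `exec` is injectable for testing.
function readCommittedBlob(repoDir, relPath, exec = execFileSync) {
  return exec("git", ["show", `HEAD:${relPath}`], {
    cwd: repoDir,
    encoding: "utf8",
  });
}

// Demonstration with a stub in place of git, so the sketch runs anywhere.
const calls = [];
const stubExec = (cmd, args, opts) => {
  calls.push({ cmd, args, cwd: opts.cwd });
  return '{"version":"0.0.0"}\n'; // placeholder content, not real package data
};
console.log(readCommittedBlob("/repo", "package.json", stubExec));
console.log(calls[0].args.join(" "));
```

In a real checkout the third argument is omitted, so the call shells out to `git` and throws if the path does not exist at HEAD.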
291
+ ## P1-P2 Fixes & CI Content Surfaces (v0.1.49)
292
+
293
+ Migration DAG with reversible edges (v0.1.45), capability auto-discovery (v0.1.46), vendor-adapter registry (v0.1.47), state auto-compaction and P2 fixes (v0.1.48), plus CI content-surface determinism hardening (v0.1.49).
294
+ 0.1.51
295
+
296
+ 0.1.76
297
+
298
+ 0.1.77
299
+
300
+ 0.1.78
@@ -0,0 +1,99 @@
1
+ # Getting Started
2
+
3
+ From a fresh clone:
4
+
5
+ ```bash
6
+ cd plugins/cool-workflow
7
+ npm install
8
+ npm run build
9
+ node scripts/cw.js app list
10
+ ```
11
+
12
+ Create a run with a canonical workflow app:
13
+
14
+ ```bash
15
+ node scripts/cw.js plan release-cut \
16
+ --repo "$PWD" \
17
+ --version 0.1.25 \
18
+ --previousVersion 0.1.24 \
19
+ --releaseBranch main \
20
+ --dryRun true
21
+ ```
22
+
23
+ Use the returned run id:
24
+
25
+ ```bash
26
+ node scripts/cw.js status <run-id>
27
+ node scripts/cw.js graph <run-id>
28
+ node scripts/cw.js dispatch <run-id> --limit 1 --sandbox readonly
29
+ node scripts/cw.js worker summary <run-id>
30
+ node scripts/cw.js topology list
31
+ node scripts/cw.js topology apply <run-id> map-reduce --task <task-id>
32
+ node scripts/cw.js topology summary <run-id>
33
+ node scripts/cw.js multi-agent run <run-id> --topology judge-panel --task <task-id>
34
+ node scripts/cw.js multi-agent status <run-id>
35
+ node scripts/cw.js multi-agent graph <run-id>
36
+ node scripts/cw.js multi-agent dependencies <run-id>
37
+ node scripts/cw.js multi-agent failures <run-id>
38
+ node scripts/cw.js multi-agent evidence <run-id>
39
+ node scripts/cw.js multi-agent step <run-id> --sandbox readonly
40
+ node scripts/cw.js multi-agent blackboard <run-id> summary
41
+ node scripts/cw.js multi-agent score <run-id> <candidate-id> --criterion correctness=1 --evidence <ref>
42
+ node scripts/cw.js multi-agent select <run-id> <candidate-id> --reason "verified winner"
43
+ node scripts/cw.js multi-agent summary <run-id>
44
+ node scripts/cw.js blackboard summary <run-id>
45
+ node scripts/cw.js audit summary <run-id>
46
+ node scripts/cw.js audit multi-agent <run-id>
47
+ node scripts/cw.js audit policy <run-id>
48
+ node scripts/cw.js audit blackboard <run-id>
49
+ node scripts/cw.js audit judge <run-id>
50
+ node scripts/cw.js eval snapshot <run-id> --id <suite-id>
51
+ node scripts/cw.js eval replay .cw/evals/<suite-id>/snapshot.json
52
+ node scripts/cw.js eval compare .cw/evals/<suite-id>/snapshot.json .cw/evals/<suite-id>/replay-run.json
53
+ node scripts/cw.js eval score .cw/evals/<suite-id>/replay-run.json
54
+ node scripts/cw.js eval gate .cw/evals/<suite-id>
55
+ node scripts/cw.js eval report .cw/evals/<suite-id>/replay-run.json
56
+ node scripts/cw.js report <run-id> --show
57
+ ```
58
+
59
+ Run the deterministic regression commands:
60
+
61
+ ```bash
62
+ npm run check
63
+ npm test
64
+ npm run canonical-apps
65
+ npm run golden-path
66
+ npm run eval:replay
67
+ npm run fixture-compat
68
+ ```
69
+
70
+ Before cutting a release, run the full dry-run gate:
71
+
72
+ ```bash
73
+ npm run release:check
74
+ npm run dogfood:release
75
+ ```
76
+
77
+ The release check is non-destructive. It builds, type-checks, runs tests,
78
+ validates canonical apps and golden path behavior, checks old fixture
79
+ compatibility, verifies docs, runs the dogfood smoke proof, and checks version
80
+ synchronization. It does not tag, push, publish, or rewrite fixture files.
81
+
82
+ `npm run dogfood:release` is the real-repository release proof. It uses the
83
+ canonical `release-cut` app against this repository in dry-run mode, records CW
84
+ worker outputs from real command logs, scores and selects a release candidate,
85
+ creates a verifier-gated CW state commit, and writes
86
+ `.cw/runs/<run-id>/dogfood-summary.json`.
87
+
88
+ Trust audit records live under `.cw/runs/<run-id>/audit/`. CW records the
89
+ sandbox profile used by each worker, allowed and denied decisions, evidence
90
+ provenance, and why selected candidates or verifier-gated commits were
91
+ accepted. Multi-agent trust records add role policy, blackboard write audit,
92
+ message provenance, judge rationale, and policy violations. Inspect them with
93
+ `audit summary`, `audit worker`, `audit provenance`, `audit multi-agent`,
94
+ `audit policy`, `audit blackboard`, and `audit judge`.
95
+
96
+ Eval/replay artifacts live under `.cw/evals/<suite-id>/`. They let a release
97
+ gate prove replay completion, graph/dependency parity, evidence adoption,
98
+ trust/policy/audit parity, judge rationale, candidate scoring, selection, and
99
+ verifier-gated commit readiness without running live agents.
package/docs/index.md ADDED
@@ -0,0 +1,41 @@
1
+ # Cool Workflow Docs
2
+
3
+ Read these in order when you are new to CW:
4
+
5
+ 1. [Getting Started](getting-started.md) - clone, install, run a workflow, inspect it, and run the release check.
6
+ 2. [Project Index](project-index.md) - code-derived map of source modules, workflow apps, docs, tests, and sync targets.
7
+ 3. [Workflow App framework](workflow-app-framework.7.md) - userland app manifests, entrypoints, compatibility, and validation.
8
+ 4. [Sandbox Profiles](sandbox-profiles.7.md) - named worker policy contracts for read/write/execute/network/env handling.
9
+ 5. [Security / Trust Hardening](security-trust-hardening.7.md) - audit records, provenance, sandbox attestations, and acceptance rationale.
10
+ 6. [Multi-Agent Runtime Core](multi-agent-runtime-core.7.md) - first-class MultiAgentRun, roles, groups, memberships, fanout, fanin, and lifecycle state.
11
+ 7. [Coordinator / Blackboard](coordinator-blackboard.7.md) - shared topics, messages, context frames, artifact refs, snapshots, decisions, conflicts, and fanin evidence.
12
+ 8. [Multi-Agent Topologies](multi-agent-topologies.7.md) - official map-reduce, debate, and judge-panel recipes built on multi-agent and blackboard records.
13
+ 9. [Multi-Agent CLI + MCP Surface](multi-agent-cli-mcp-surface.7.md) - preferred host loop for run, status, step, blackboard, score, and select.
14
+ 10. [Multi-Agent Operator UX](multi-agent-operator-ux.7.md) - graph, dependencies, failures, and evidence adoption for topology-backed multi-agent runs.
15
+ 11. [Multi-Agent Trust / Policy / Audit](multi-agent-trust-policy-audit.7.md) - role authority, message provenance, blackboard write audit, judge rationale, and policy violations.
16
+ 12. [Multi-Agent Eval & Replay Harness](multi-agent-eval-replay-harness.7.md) - snapshots, isolated replays, comparison, scoring, gates, reports, and MCP parity.
17
+ 13. [State Explosion Management](state-explosion-management.7.md) - durable summary records, compact and focused graph views, blackboard digests, and stale-aware compaction for large multi-agent runs.
18
+ 14. [Evidence Adoption Reasoning Chain](evidence-adoption-reasoning-chain.7.md) - derived, fingerprinted reasoning chains explaining why each evidence item was adopted/rejected with basis, authority, rationale, and counterfactual, and a fail-closed `unexplained` state.
19
+ 15. [Run Registry / Control Plane](run-registry-control-plane.7.md) - derived, fingerprinted, fail-closed index over runs across repos: search, resume, archive, durable queue, cross-repo history, and failed-run rerun with provenance.
20
+ 16. [Execution Backends](execution-backends.7.md) - the pluggable driver layer (node/bun/shell/container/remote/ci): one narrow `ExecutionBackend` contract, sandbox attestation, identical envelopes across backends, and fail-closed delegation.
21
+ 17. [Operator UX](operator-ux.7.md) - `status`, `graph`, report, worker, candidate, feedback, commit, topology, multi-agent, blackboard, coordinator, and trust summaries.
22
+ 18. [MCP App Surface](mcp-app-surface.7.md) - JSON tool parity for agent hosts.
23
+ 19. [CLI ↔ MCP Parity](cli-mcp-parity.7.md) - the capability registry and fail-closed gate proving the CLI and MCP surfaces render one data source.
24
+ 20. [End-to-End Golden Path](end-to-end-golden-path.7.md) - deterministic proof of app, worker, verifier, candidate, commit, and report flow.
25
+ 21. [Dogfood One Real Repo](dogfood-one-real-repo.7.md) - dry-run release proof against the real Cool Workflow repository.
26
+ 22. [Web / Desktop Workbench](web-desktop-workbench.7.md) - a read-only, localhost-only human console rendering the run graph, blackboard, worker logs, candidate compare, and audit timeline over existing capability payloads — a third front door that holds no authoritative state.
27
+ 23. [Observability + Cost Accounting](observability-cost-accounting.7.md) - derived time/duration, failure/verifier/acceptance rates with sample counts and fail-closed `n/a`, plus host-attested token usage and attested-vs-estimated cost with explicit `unreported` coverage; pricing is policy as data.
28
+ 24. [Team Collaboration](team-collaboration.7.md) - host-attested actor, append-only approvals/rejections/comments/handoffs provenance-linked to durable targets, and a review gate that stacks on the verifier gate (required approvals from authorized roles, fail-closed quorum/authority/self-approval); policy is data.
29
+ 25. [Release And Migration](release-and-migration.7.md) - release and migration discipline for durable run state.
30
+ 26. [Release Tooling](release-tooling.7.md) - one-command version bump across every surface, a per-feature scaffolder, forward-reference doc automation, and a de-duplicated release gate.
31
+ 27. [Real Execution Backend Integrations](real-execution-backends.7.md) - container/remote/ci backends really execute (docker/podman run, remote/CI POST-and-poll) under the sandbox contract, byte-stable evidence vs node, fail-closed on an unavailable runtime/endpoint.
32
+ 28. [Node Snapshot / Diff / Replay](node-snapshot-diff-replay.7.md) - per-node snapshot, structural diff, and isolated deterministic replay over StateNode, reusing the eval harness; sha256-fingerprinted with fail-closed `valid|stale|absent` freshness.
33
+ 29. [Contract Migration Tooling](contract-migration-tooling.7.md) - a declared migration registry (run-state + workflow-app) with per-edge compatibility proofs, fail-closed reachability, and a round-trip/non-destruction prover over the existing migrateRunState pipeline.
34
+ 30. [Control-Plane Scheduling](control-plane-scheduling.7.md) - priority + hard concurrency ceiling + lease lifecycle + retry/backoff + fail-closed park policy over the v0.1.28 Run Registry queue; policy-as-data, deterministic, with a read-only `sched plan`.
35
+ 31. [Agent Delegation Drive](agent-delegation-drive.7.md) - the `agent` backend delegates each worker to an EXTERNAL agent process (claude/codex/HTTP endpoint) and `run --drive` auto-advances plan→dispatch→fulfill→accept→commit; the model runs in the agent's process, never in CW. Two-layer evidence, operator-vs-attested model, fail-closed park, replay without re-spawn.
36
+ 32. [Run Retention & Provable Reclamation](run-retention-reclamation.7.md) - tiered, append-only, cryptographically-verifiable disk reclamation over the v0.1.28 archive overlay: seal the audit skeleton, free the reconstructable/scratch bulk, and prove it via a hash-chained tombstone; `gc plan|run|verify`, write-ahead + fail-closed, explicit capability downgrade.
37
+ 33. [Durable State & Locking](durable-state-and-locking.7.md) - atomic (temp→rename) writes for every authoritative store with fsync-durability for the audit-essential ones, plus a portable stale-stealing file lock serializing the cross-process read-modify-write stores (home queue, archive overlay, reclamation chain); closes the prior verdict's non-atomic/unlocked P1.
38
+
39
+ CW is the base system. Workflow apps are userland. Release and migration rules
40
+ must preserve that line: stable contracts, explicit compatibility checks, and
41
+ inspectable state.
@@ -0,0 +1,235 @@
1
+ # MCP App Surface
2
+
3
+ Cool Workflow v0.1.13 completes the MCP bridge as a runtime surface for agent
4
+ hosts. The CLI remains the reference interface, and MCP exposes the same
5
+ operational contracts as explicit JSON tools.
6
+
7
+ The bridge follows CW's base-system discipline:
8
+
9
+ - old tool names remain compatible
10
+ - read-only inspection tools do not mutate state
11
+ - state-changing tools write durable run files
12
+ - inputs use stable names such as `runId`, `appId`, `workerId`,
13
+ `candidateId`, `selectionId`, `profileId`, `cwd`, `reason`, `evidence`, and
14
+ `criteria`
15
+ - errors fail closed through JSON-RPC errors and durable ErrorFeedback where the
16
+ runtime already records feedback
17
+
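As a rough illustration of the fail-closed rule above, a host can refuse to treat any response as a success unless it carries a `result` and no `error`. The envelope shape comes from JSON-RPC 2.0; the helper name `unwrapResponse` and the sample payloads are assumptions made for this sketch, not part of CW.

```javascript
// Fail closed on a JSON-RPC 2.0 response: only a response that carries a
// `result` and no `error` member is treated as success.
// `unwrapResponse` is a hypothetical helper name, not a CW API.
function unwrapResponse(response) {
  if (response === null || typeof response !== "object") {
    throw new Error("malformed JSON-RPC response");
  }
  if (response.error != null) {
    const { code, message } = response.error;
    throw new Error(`JSON-RPC error ${code}: ${message}`);
  }
  if (!("result" in response)) {
    throw new Error("JSON-RPC response has neither result nor error");
  }
  return response.result;
}

// Illustrative payloads only; they are not captured from a real CW session.
const ok = unwrapResponse({ jsonrpc: "2.0", id: 1, result: { runId: "run-id" } });
console.log(ok.runId); // prints "run-id"

let failure = null;
try {
  unwrapResponse({
    jsonrpc: "2.0",
    id: 2,
    error: { code: -32602, message: "Invalid params" },
  });
} catch (err) {
  failure = err.message;
}
console.log(failure); // prints "JSON-RPC error -32602: Invalid params"
```

Throwing instead of returning a partial value keeps the host from acting on a run it could not confirm.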
18
+ ## App Run Flow
19
+
20
+ Use `cw_app_list`, `cw_app_show`, and `cw_app_validate` to inspect app
21
+ contracts. `cw_app_package` writes a package artifact. `cw_app_run` creates a
22
+ run from a Workflow App framework app id and structured inputs:
23
+
24
+ ```json
25
+ {
26
+ "appId": "end-to-end-golden-path",
27
+ "cwd": "/repo",
28
+ "inputs": {
29
+ "question": "Prove the MCP runtime surface."
30
+ },
31
+ "sandbox": "readonly"
32
+ }
33
+ ```
34
+
35
+ The result includes `runId`, workflow/app id and version, `statePath`,
36
+ `reportPath`, pending task count, compact operator status, next actions, and
37
+ the resolved sandbox profile when one was requested.
38
+
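For hosts that speak MCP directly, the `cw_app_run` arguments above travel inside a JSON-RPC 2.0 `tools/call` request. The sketch below shows one way to build that envelope; `buildToolCall` is a hypothetical helper and the request `id` is arbitrary.

```javascript
// Build a JSON-RPC 2.0 request that invokes an MCP tool via "tools/call".
// `buildToolCall` is a hypothetical helper name used only for illustration.
function buildToolCall(id, name, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Same arguments as the JSON example in the text.
const appRunRequest = buildToolCall(1, "cw_app_run", {
  appId: "end-to-end-golden-path",
  cwd: "/repo",
  inputs: { question: "Prove the MCP runtime surface." },
  sandbox: "readonly",
});

console.log(JSON.stringify(appRunRequest, null, 2));
```

The same helper works for any other tool named in this document, because only `name` and `arguments` change between calls.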
39
+ `cw_plan` remains the lower-level planning tool and returns the full run object
40
+ for compatibility.
41
+
42
+ ## Worker Inspection
43
+
44
+ Worker isolation is first-class over MCP:
45
+
46
+ - `cw_worker_list`
47
+ - `cw_worker_show`
48
+ - `cw_worker_manifest`
49
+ - `cw_worker_validate`
50
+ - `cw_worker_output`
51
+ - `cw_worker_fail`
52
+ - `cw_worker_summary`
53
+
54
+ Worker records expose the worker id, task id, status, worker directory,
55
+ `input.md`, `result.md`, artifacts/logs directories, sandbox profile id,
56
+ sandbox policy, feedback ids, multi-agent metadata when present, and
57
+ result/verifier node ids.
58
+
59
+ An agent host should inspect `cw_worker_manifest`, write worker-local output to
60
+ the manifest `resultPath`, then call `cw_worker_output`. CW validates the
61
+ worker boundary, parses the `cw:result` block, creates result and verifier
62
+ nodes, updates the task, writes reports, and checkpoints state.
63
+
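The manifest, write, output sequence described above could be sketched as follows. This is a minimal outline under stated assumptions: `callTool` stands in for the host's MCP client and is assumed to return the already-parsed tool payload, `writeFile` stands in for the host's file writer, and `fulfillWorker` is a hypothetical name. Only the tool names, `runId`, `workerId`, and `resultPath` come from the text.

```javascript
// Hypothetical host-side helper. `callTool(name, args)` and
// `writeFile(path, text)` are injected by the host; neither is a CW API.
async function fulfillWorker({ callTool, writeFile }, runId, workerId, resultText) {
  // 1. Read the manifest to learn where worker-local output belongs.
  const manifest = await callTool("cw_worker_manifest", { runId, workerId });
  if (typeof manifest.resultPath !== "string") {
    throw new Error("manifest has no resultPath");
  }

  // 2. Write the result text (including its cw:result block) to that path only.
  await writeFile(manifest.resultPath, resultText);

  // 3. Ask CW to validate the worker boundary and record the output.
  return callTool("cw_worker_output", { runId, workerId });
}
```

Injecting both functions keeps the sketch independent of any particular MCP client library or filesystem API.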
64
+ ## Candidate Scoring
65
+
66
+ Candidate operations mirror the CLI:
67
+
68
+ - `cw_candidate_register`
69
+ - `cw_candidate_list`
70
+ - `cw_candidate_show`
71
+ - `cw_candidate_score`
72
+ - `cw_candidate_rank`
73
+ - `cw_candidate_select`
74
+ - `cw_candidate_reject`
75
+ - `cw_candidate_summary`
76
+
77
+ `cw_candidate_score` accepts structured `criteria` and evidence locators:
78
+
79
+ ```json
80
+ {
81
+ "runId": "run-id",
82
+ "candidateId": "candidate-one",
83
+ "criteria": { "correctness": 4, "evidence": 4, "fit": 2 },
84
+ "maxTotal": 10,
85
+ "evidence": ["docs/mcp-app-surface.7.md:1"],
86
+ "verdict": "pass",
87
+ "notes": "Evidence-backed candidate."
88
+ }
89
+ ```
90
+
91
+ `cw_candidate_rank` and `cw_candidate_select` support the same
92
+ evidence/verifier-gate policy as the CLI with `requireEvidence`,
93
+ `requireVerifierGate`, `minNormalized`, and `allowUnverified`. Missing evidence
94
+ or verifier gates fail closed and produce structured feedback through the
95
+ candidate scoring layer.
96
+
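To make the gate fields concrete, here is a small sketch of how a host might pre-check a score before calling `cw_candidate_select`. It assumes the normalized score is the sum of criterion points divided by `maxTotal`, and that `allowUnverified` waives the verifier-gate requirement; the source states neither rule, and the `verifierGated` flag and both function names are hypothetical. CW's own scoring layer remains the authority.

```javascript
// Assumption: normalized score = sum of criterion points / maxTotal.
function normalizedScore(criteria, maxTotal) {
  if (!(maxTotal > 0)) throw new Error("maxTotal must be positive");
  const total = Object.values(criteria).reduce((sum, points) => sum + points, 0);
  return total / maxTotal;
}

// Hypothetical fail-closed pre-check using the policy names from the text.
// `score.verifierGated` is an invented flag for this sketch.
function passesGate(score, policy) {
  const reasons = [];
  if (policy.requireEvidence && score.evidence.length === 0) {
    reasons.push("missing evidence");
  }
  if (policy.requireVerifierGate && !score.verifierGated && !policy.allowUnverified) {
    reasons.push("missing verifier gate");
  }
  if (policy.minNormalized !== undefined && score.normalized < policy.minNormalized) {
    reasons.push("below minNormalized");
  }
  return { ok: reasons.length === 0, reasons };
}

// Values taken from the cw_candidate_score example above.
const normalized = normalizedScore({ correctness: 4, evidence: 4, fit: 2 }, 10);
console.log(normalized); // prints 1

const verdict = passesGate(
  { normalized, evidence: [], verifierGated: true },
  { requireEvidence: true, requireVerifierGate: true, minNormalized: 0.8 }
);
console.log(verdict); // { ok: false, reasons: [ 'missing evidence' ] }
```

Returning the list of reasons rather than a bare boolean mirrors the structured feedback the text describes.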
97
+ ## Sandbox Profiles
98
+
99
+ Existing sandbox tools remain:
100
+
101
+ - `cw_sandbox_list`
102
+ - `cw_sandbox_show`
103
+ - `cw_sandbox_validate`
104
+
105
+ v0.1.13 adds `cw_sandbox_choose` and `cw_sandbox_resolve` as read-only helpers
106
+ that validate and resolve `sandbox`, `sandboxProfile`, `sandboxProfileId`, or
107
+ `profileId` without dispatching work. `cw_dispatch` accepts all three sandbox
108
+ field spellings for compatibility with different hosts.
109
+
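A host that receives sandbox settings from several sources may want to collapse the alternate spellings into one profile id before calling a tool. The following is a sketch only: the field names come from the text, while the helper name `pickSandboxProfileId` and the choice to reject conflicting values are assumptions about what a careful host might do, not documented CW behavior.

```javascript
// Field spellings listed in the text for the resolve helpers.
const SANDBOX_FIELDS = ["sandbox", "sandboxProfile", "sandboxProfileId", "profileId"];

// Hypothetical helper: returns the single profile id named by `args`,
// `undefined` when none is given, and throws when spellings disagree.
function pickSandboxProfileId(args) {
  const present = SANDBOX_FIELDS.filter(
    (field) => typeof args[field] === "string" && args[field] !== ""
  );
  if (present.length === 0) return undefined;

  const distinct = new Set(present.map((field) => args[field]));
  if (distinct.size > 1) {
    throw new Error(`conflicting sandbox fields: ${present.join(", ")}`);
  }
  return args[present[0]];
}

console.log(pickSandboxProfileId({ sandbox: "readonly" })); // prints "readonly"
console.log(pickSandboxProfileId({ cwd: "/repo" })); // prints undefined
```

Rejecting conflicts instead of picking a winner avoids silently running work under the wrong profile.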
110
+ ## Multi-Agent Runtime
111
+
112
+ v0.1.17 adds MCP parity for first-class multi-agent state.
113
+
114
+ v0.1.20 adds preferred host-facing tools for the full multi-agent loop:
115
+
116
+ - `cw_multi_agent_run`
117
+ - `cw_multi_agent_status`
118
+ - `cw_multi_agent_step`
119
+ - `cw_multi_agent_blackboard`
120
+ - `cw_multi_agent_score`
121
+ - `cw_multi_agent_select`
122
+
123
+ Use these when an agent host wants to drive `run -> status -> step ->
124
+ blackboard -> score -> select` without manually plumbing topology, blackboard,
125
+ candidate, and audit ids. The lower-level tools below remain advanced
126
+ primitives.
127
+
128
+ v0.1.22 adds audit parity for multi-agent trust:
129
+
130
+ - `cw_audit_multi_agent`
131
+ - `cw_audit_policy`
132
+ - `cw_audit_role`
133
+ - `cw_audit_blackboard`
134
+ - `cw_audit_judge`
135
+
136
+ These tools expose role policies, permission decisions, blackboard write audit,
137
+ message provenance, judge rationales, panel decisions, and policy violations in
138
+ deterministic JSON.
139
+
140
+ v0.1.24 adds eval/replay parity for multi-agent regression gates:
141
+
142
+ - `cw_eval_snapshot`
143
+ - `cw_eval_replay`
144
+ - `cw_eval_compare`
145
+ - `cw_eval_score`
146
+ - `cw_eval_gate`
147
+ - `cw_eval_report`
148
+
149
+ These tools create replay snapshots, run isolated replays, compare normalized
150
+ baseline/replay records, score metrics, fail closed on regressions, and return
151
+ artifact paths in deterministic JSON.
152
+
153
+ v0.1.25 adds State Explosion Management parity for large multi-agent runs:
154
+
155
+ - `cw_summary_refresh`
156
+ - `cw_summary_show`
157
+ - `cw_blackboard_summarize`
158
+ - `cw_multi_agent_summarize`
159
+ - `cw_multi_agent_graph_compact`
160
+
161
+ These tools refresh durable, versioned summary records, read the stale-aware
162
+ state-explosion report, return the blackboard digest, and return compact or
163
+ focused graph views with synthetic summary nodes. Every response keeps source
164
+ refs and expansion hints and never deletes raw blackboard, graph, audit, or
165
+ evidence records.
166
+
167
+ Read and inspect:
168
+
169
+ - `cw_multi_agent_summary`
170
+ - `cw_multi_agent_graph`
171
+ - `cw_multi_agent_run_show`
172
+ - `cw_multi_agent_role_show`
173
+ - `cw_multi_agent_group_show`
174
+ - `cw_multi_agent_membership_show`
175
+ - `cw_multi_agent_fanout_show`
176
+ - `cw_multi_agent_fanin_show`
177
+
178
+ Safe writes:
179
+
180
+ - `cw_multi_agent_run_create`
181
+ - `cw_multi_agent_run_transition`
182
+ - `cw_multi_agent_role_create`
183
+ - `cw_multi_agent_group_create`
184
+ - `cw_multi_agent_membership_create`
185
+ - `cw_multi_agent_fanout_create`
186
+ - `cw_multi_agent_fanin_collect`
187
+
188
+ These tools mirror the CLI state model. CW records and validates roles, groups,
189
+ memberships, fanout/fanin, and lifecycle state; the host still executes agents
190
+ and enforces OS/process/network/environment controls.
191
+
192
+ ## Verifier-Gated Commit
193
+
194
+ `cw_commit` accepts verifier-gate fields:
195
+
196
+ ```json
197
+ {
198
+ "runId": "run-id",
199
+ "selection": "selection-id",
200
+ "reason": "verified candidate selected"
201
+ }
202
+ ```
203
+
204
+ The supported gate fields are `verifier`, `verifierNode`, `candidate`, `selection`,
205
+ `allowUnverifiedCheckpoint`, and `reason`. The MCP response includes `runId`,
206
+ `commitId`, `verifierGated`, `checkpoint`, verifier/candidate/selection ids,
207
+ `evidenceCount`, `snapshotPath`, next actions, and the underlying commit record.
208
+
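Before sending `cw_commit`, a host can assemble the arguments and refuse to proceed when no gate reference is present and no explicit checkpoint was requested. This is a hedged sketch: the field names come from the text, but `buildCommitArgs` and the rule that at least one gate reference is needed are assumptions. CW performs its own gate checks regardless.

```javascript
// Hypothetical pre-flight builder for cw_commit arguments.
function buildCommitArgs({
  runId,
  reason,
  verifier,
  verifierNode,
  candidate,
  selection,
  allowUnverifiedCheckpoint = false,
}) {
  if (!runId) throw new Error("runId is required");

  // Keep only the gate references the caller actually supplied.
  const gateRefs = Object.fromEntries(
    Object.entries({ verifier, verifierNode, candidate, selection }).filter(
      ([, value]) => value !== undefined
    )
  );

  // Assumption: an ungated commit must be an explicit checkpoint.
  if (Object.keys(gateRefs).length === 0 && !allowUnverifiedCheckpoint) {
    throw new Error("no gate reference; set allowUnverifiedCheckpoint for an explicit checkpoint");
  }

  const args = { runId, ...gateRefs, reason };
  if (allowUnverifiedCheckpoint) args.allowUnverifiedCheckpoint = true;
  return args;
}

// Reproduces the JSON example above.
const commitArgs = buildCommitArgs({
  runId: "run-id",
  selection: "selection-id",
  reason: "verified candidate selected",
});
console.log(JSON.stringify(commitArgs));
```

After the call, the `verifierGated` and `checkpoint` fields in the response indicate which kind of commit CW actually recorded.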
209
+ Use `cw_commit_summary` for a read-only view of verifier-gated commits and
210
+ explicit checkpoints.
211
+
212
+ ## Operator Views
213
+
214
+ MCP exposes structured JSON equivalents of Operator UX:
215
+
216
+ - `cw_operator_status`
217
+ - `cw_operator_graph`
218
+ - `cw_operator_report`
219
+ - `cw_worker_summary`
220
+ - `cw_candidate_summary`
221
+ - `cw_feedback_summary`
222
+ - `cw_commit_summary`
223
+ - `cw_multi_agent_summary`
224
+
225
+ These tools return JSON summaries instead of console text. `cw_operator_report`
226
+ refreshes the Markdown report the same way the CLI renderer does; the rest are
227
+ read-only inspection tools.
228
+
229
+ ## CLI/MCP Parity
230
+
231
+ The CLI remains the easiest way for humans to drive a run. MCP is the stable
232
+ tool surface for agent hosts. New runtime capabilities should appear in both
233
+ surfaces, keep old names as aliases or wrappers, and use explicit JSON
234
+ contracts rather than host-specific policy hidden in the bridge.
235
+ 0.1.51