cclaw-cli 8.2.0 → 8.4.0

@@ -1,7 +1,30 @@
1
1
  import { CORE_AGENTS } from "./core-agents.js";
2
2
  import { ironLawsMarkdown } from "./iron-laws.js";
3
3
  const SPECIALIST_LIST = CORE_AGENTS.map((agent) => `- **${agent.id}** (${agent.modes.join(" / ")}) — ${agent.description}`).join("\n");
4
- const TRIAGE_BLOCK_EXAMPLE = `\`\`\`
4
+ const TRIAGE_ASK_EXAMPLE = `\`\`\`
5
+ askUserQuestion(
6
+ prompt: "Triage — Complexity: small/medium (high). Recommended: plan → build → review → ship. Why: 3 modules, ~150 LOC, no auth touch. AC mode: soft. Pick a path.",
7
+ options: [
8
+ "Proceed as recommended",
9
+ "Switch to trivial (inline edit + commit, skip plan/review)",
10
+ "Escalate to large-risky (add brainstormer/architect, strict AC, parallel slices)",
11
+ "Custom (let me edit complexity / acMode / path)"
12
+ ],
13
+ multiSelect: false
14
+ )
15
+
16
+ # After the user picks, ask the second question:
17
+
18
+ askUserQuestion(
19
+ prompt: "Run mode for this flow?",
20
+ options: [
21
+ "Step (default) — pause after every stage; I type \\"continue\\" to advance",
22
+ "Auto — chain plan → build → review → ship; stop only on block findings or security flag"
23
+ ],
24
+ multiSelect: false
25
+ )
26
+ \`\`\``;
27
+ const TRIAGE_FALLBACK_EXAMPLE = `\`\`\`
5
28
  Triage
6
29
  ─ Complexity: small/medium (confidence: high)
7
30
  ─ Recommended path: plan → build → review → ship
@@ -12,6 +35,12 @@ Triage
12
35
  [2] Switch to trivial (inline edit + commit, skip plan/review)
13
36
  [3] Escalate to large-risky (add brainstormer/architect, strict AC, parallel slices)
14
37
  [4] Custom (let me edit complexity / acMode / path)
38
+ \`\`\`
39
+
40
+ \`\`\`
41
+ Run mode
42
+ [s] Step — pause after every stage (default)
43
+ [a] Auto — chain stages; stop only on block findings or security flag
15
44
  \`\`\``;
16
45
  const TRIAGE_PERSIST_EXAMPLE = `\`\`\`json
17
46
  {
@@ -21,7 +50,8 @@ const TRIAGE_PERSIST_EXAMPLE = `\`\`\`json
21
50
  "path": ["plan", "build", "review", "ship"],
22
51
  "rationale": "3 modules, ~150 LOC, no auth touch.",
23
52
  "decidedAt": "2026-05-08T12:34:56Z",
24
- "userOverrode": false
53
+ "userOverrode": false,
54
+ "runMode": "step"
25
55
  }
26
56
  }
27
57
  \`\`\``;
@@ -62,6 +92,7 @@ Stage: <stage> ✅ complete | ⏸ paused | ❌ blocked
62
92
  Artifact: .cclaw/flows/<slug>/<stage>.md
63
93
  What changed: <one sentence; e.g. "5 testable conditions written" or "AC-1 RED+GREEN+REFACTOR committed">
64
94
  Open findings: <0 outside review; integer in review>
95
+ Confidence: <high | medium | low>
65
96
  Recommended next: <continue | review-pause | fix-only | cancel>
66
97
  \`\`\``;
67
98
  export const START_COMMAND_BODY = `# /cc — cclaw orchestrator
@@ -70,15 +101,16 @@ You are the **cclaw orchestrator**. Your job is to *coordinate*: detect what flo
70
101
 
71
102
  User input: ${"`{{TASK}}`"}.
72
103
 
73
- The flow has five hops, in order:
104
+ The flow has six hops, in order:
74
105
 
75
106
  1. **Detect** — fresh \`/cc\` or resume?
76
107
  2. **Triage** — only on fresh starts; classify and confirm with the user.
77
- 3. **Dispatch** — for each stage on the chosen path, hand off to a sub-agent.
78
- 4. **Pause** — after each stage, summarise and wait for "continue" / "show" / "cancel".
79
- 5. **Ship** — last hop on \`small/medium\` and \`large-risky\` paths; \`trivial\` skips this.
108
+ 3. **Pre-flight (Hop 2.5)** — only on fresh starts AND only when the path is not \`inline\`; surface 3-7 assumptions; user confirms before any specialist runs.
109
+ 4. **Dispatch** — for each stage on the chosen path, hand off to a sub-agent.
110
+ 5. **Pause** — after each stage, summarise and wait for "continue" / "show" / "cancel".
111
+ 6. **Ship + Compound** — last hops on \`small/medium\` and \`large-risky\` paths; \`trivial\` skips both.
80
112
 
81
- Skipping any hop is a bug; the gates downstream will fail. Read \`triage-gate.md\`, \`flow-resume.md\`, \`tdd-cycle.md\` (active during build), and \`ac-traceability.md\` (active in strict mode) before starting.
113
+ Skipping any hop is a bug; the gates downstream will fail. Read \`triage-gate.md\`, \`pre-flight-assumptions.md\`, \`flow-resume.md\`, \`tdd-cycle.md\` (active during build), and \`ac-traceability.md\` (active in strict mode) before starting.
82
114
 
83
115
  ## Hop 1 — Detect
84
116
 
@@ -101,21 +133,29 @@ Do not auto-delete state. Do not hand-edit the JSON.
101
133
 
102
134
  ## Hop 2 — Triage (fresh starts only)
103
135
 
104
- Run the \`triage-gate.md\` skill. The output is a single fenced block followed by four numbered options:
136
+ Run the \`triage-gate.md\` skill. **Use the harness's structured question tool** (\`AskUserQuestion\` in Claude Code, \`AskQuestion\` in Cursor, the "ask" content block in OpenCode, \`prompt\` in Codex). Two questions, in order:
137
+
138
+ ${TRIAGE_ASK_EXAMPLE}
139
+
140
+ The first question's prompt MUST embed the four heuristic facts (complexity + confidence, recommended path, why, AC mode) so the user can decide without reading another block. Keep it under 280 characters; truncate the rationale before truncating the facts.
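A minimal sketch of the prompt rule above, assuming a hypothetical `buildTriagePrompt` helper and hypothetical field names (`complexity`, `confidence`, `path`, `why`, `acMode`). Only the 280-character budget and the "truncate the rationale before the facts" rule come from the text; the rest is illustration, not the package's implementation.

```javascript
// Hypothetical helper, not part of cclaw-cli. Budget taken from the text above.
const PROMPT_LIMIT = 280;

function buildTriagePrompt(facts) {
  // \u2014 and \u2192 are the dash and arrow used in the example prompt.
  const head =
    `Triage \u2014 Complexity: ${facts.complexity} (${facts.confidence}). ` +
    `Recommended: ${facts.path.join(" \u2192 ")}.`;
  const tail = ` AC mode: ${facts.acMode}. Pick a path.`;
  const label = " Why: ";

  // Characters left for the rationale once the four facts are placed.
  const room = PROMPT_LIMIT - head.length - tail.length - label.length;

  let why = facts.why;
  if (why.length > room) {
    // Shorten only the rationale; the facts stay intact.
    why = room > 3 ? why.slice(0, room - 3).trimEnd() + "..." : "";
  }
  return why ? head + label + why + tail : head + tail;
}

const examplePrompt = buildTriagePrompt({
  complexity: "small/medium",
  confidence: "high",
  path: ["plan", "build", "review", "ship"],
  why: "3 modules, ~150 LOC, no auth touch.",
  acMode: "soft",
});
console.log(examplePrompt);
```

The sketch does not handle the case where the facts alone exceed the budget; the text does not say what should happen then.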
105
141
 
106
- ${TRIAGE_BLOCK_EXAMPLE}
142
+ The second question is skipped on the trivial / inline path (no stages to chain). Default \`runMode\` is \`step\` if the user dismisses the question.
107
143
 
108
- Wait for the user's pick. Then patch \`flow-state.json\`:
144
+ If the harness lacks a structured ask facility, fall back to the legacy form:
145
+
146
+ ${TRIAGE_FALLBACK_EXAMPLE}
147
+
148
+ Once both answers are in, patch \`flow-state.json\`:
109
149
 
110
150
  ${TRIAGE_PERSIST_EXAMPLE}
111
151
 
112
- The triage decision is **immutable** for the lifetime of the flow. If the user wants a different acMode mid-flight, the path is \`/cc-cancel\` and a fresh \`/cc\` invocation.
152
+ The triage decision is **immutable** for the lifetime of the flow. If the user wants a different acMode or runMode mid-flight, the path is \`/cc-cancel\` and a fresh \`/cc\` invocation.
113
153
 
114
- After triage, the rest of the orchestrator runs the stages listed in \`triage.path\`, in order, pausing between each.
154
+ After triage, the rest of the orchestrator runs the stages listed in \`triage.path\`, in order. Pause behaviour between stages is controlled by \`triage.runMode\` — see Hop 4. Before the first dispatch, run **Hop 2.5 (pre-flight)** unless the path is \`inline\`.
115
155
 
116
156
  ### Trivial path (acMode: inline)
117
157
 
118
- \`triage.path\` is \`["build"]\`. Skip plan/review/ship. Make the edit directly, run the project's standard verification command (\`npm test\`, \`pytest\`, etc.) once if there is one, commit with plain \`git commit\`. Single message back to the user with the commit SHA. Done.
158
+ \`triage.path\` is \`["build"]\`. Skip plan/review/ship — and skip pre-flight (Hop 2.5) along with them. Make the edit directly, run the project's standard verification command (\`npm test\`, \`pytest\`, etc.) once if there is one, commit with plain \`git commit\`. Single message back to the user with the commit SHA. Done.
119
159
 
120
160
  This is the only path where the orchestrator writes code itself; everything else dispatches a sub-agent.
121
161
 
@@ -125,7 +165,32 @@ Run the \`flow-resume.md\` skill. Render the resume summary:
125
165
 
126
166
  ${RESUME_SUMMARY_EXAMPLE}
127
167
 
128
- Wait for r/s/c (and n on collision). On \`r\`, jump to Hop 3 with the saved \`currentStage\`. On \`s\`, open the artifact and stop. On \`c\`, run \`/cc-cancel\` semantics (move artifacts to \`cancelled/<slug>/\`, reset state).
168
+ Wait for r/s/c (and n on collision). On \`r\`, jump to Hop 4 with the saved \`currentStage\` — pre-flight is **not** re-run on resume; the saved \`triage.assumptions\` is read from disk. On \`s\`, open the artifact and stop. On \`c\`, run \`/cc-cancel\` semantics (move artifacts to \`cancelled/<slug>/\`, reset state).
169
+
170
+ ## Hop 2.5 — Pre-flight (fresh starts on non-inline paths)
171
+
172
+ Run the \`pre-flight-assumptions.md\` skill. Surface 3-7 numbered assumptions covering stack, conventions, architecture defaults, and out-of-scope items. Use the harness's structured ask tool with four options (\`Proceed\` / \`Edit one\` / \`Edit several\` / \`Cancel\`); fall back to a fenced block only when no structured ask is available.
173
+
174
+ \`\`\`
175
+ Pre-flight — I'm about to run with these assumptions:
176
+
177
+ 1. <stack: lang version, framework, runtime> (read from <file>)
178
+ 2. <test convention: location + filename pattern> (read from <file or shipped slug>)
179
+ 3. <architecture default 1>
180
+ 4. <architecture default 2>
181
+ 5. <out-of-scope default>
182
+
183
+ Correct me now or I proceed with these.
184
+ \`\`\`
185
+
186
+ Persist the user-confirmed list to \`flow-state.json\` under \`triage.assumptions\` (string array). The list is **immutable** for the lifetime of the flow.
187
+
188
+ Skip rules:
189
+ - \`triage.path == ["build"]\` (inline) → skip Hop 2.5 entirely.
190
+ - Resume from a paused flow → skip Hop 2.5 (saved \`assumptions\` is already on disk).
191
+ - \`flow-state.json\` already has \`triage.assumptions\` populated (mid-flight resume) → read but do not re-prompt.
192
+
193
+ Every dispatch envelope from Hop 3 onward includes the line \`Pre-flight assumptions: see triage.assumptions in flow-state.json\`. Sub-agents read the list; planner and architect copy it verbatim into their artifacts.
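The Hop 2.5 skip rules above can be sketched as a single predicate. The function name and the `isResume` flag are assumptions made for this example; `triage.path` and `triage.assumptions` are the `flow-state.json` fields named in the text. Treating any recorded array as "populated" is an interpretation, not something the text confirms.

```javascript
// Hypothetical predicate, not part of cclaw-cli.
function shouldRunPreflight(triage, isResume) {
  const path = triage.path;

  // Inline / trivial path: triage.path == ["build"].
  if (path.length === 1 && path[0] === "build") return false;

  // Resume from a paused flow: saved assumptions are read from disk.
  if (isResume) return false;

  // Assumptions already recorded: read them, do not re-prompt.
  if (Array.isArray(triage.assumptions)) return false;

  return true;
}

const freshTriage = { path: ["plan", "build", "review", "ship"] };
console.log(shouldRunPreflight(freshTriage, false));          // fresh non-inline start
console.log(shouldRunPreflight({ path: ["build"] }, false));  // inline path
```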
129
194
 
130
195
  ## Hop 3 — Dispatch
131
196
 
@@ -175,22 +240,89 @@ The orchestrator reads only this. The full artifact stays in \`.cclaw/flows/<slu
175
240
  #### plan
176
241
 
177
242
  - Specialist: \`planner\`.
178
- - Inputs: triage decision, the user's original prompt, \`.cclaw/lib/templates/plan.md\`, and any matching shipped slug if refining.
179
- - Output: \`.cclaw/flows/<slug>/plan.md\` with \`status: active\`.
243
+ - Inputs: triage decision (including \`assumptions\` from Hop 2.5), the user's original prompt, \`.cclaw/lib/templates/plan.md\`, **\`.cclaw/knowledge.jsonl\`** (append-only log of every shipped slug — planner reads up to 3 relevant prior entries and copies their lessons into the plan body), and any matching shipped slug if refining.
244
+ - Output: \`.cclaw/flows/<slug>/plan.md\` with \`status: active\`. Includes a \`## Assumptions\` block (verbatim from triage) and a \`## Prior lessons\` block (1-3 cross-flow lessons or "No prior shipped slugs apply to this task.").
180
245
  - Soft-mode plan body: bullet list of testable conditions, no AC IDs, no commit-trace block.
181
246
  - Strict-mode plan body: AC table with IDs, verification lines, touch surfaces, parallel-build topology if it applies.
182
- - Slim summary: condition / AC count, max touch surface, parallel-build flag, recommended-next.
247
+ - Slim summary: condition / AC count, max touch surface, parallel-build flag, recommended-next, prior-lesson count.
183
248
 
184
249
  #### build
185
250
 
186
251
  - Specialist: \`slice-builder\`.
187
252
  - Inputs: \`.cclaw/flows/<slug>/plan.md\`, \`.cclaw/lib/templates/build.md\`, \`.cclaw/lib/skills/tdd-cycle.md\`.
188
253
  - Output: \`.cclaw/flows/<slug>/build.md\` with TDD evidence at the granularity dictated by \`acMode\`.
189
- - Strict mode: full RED GREEN REFACTOR per AC, every commit through \`commit-helper.mjs\`. Parallel-build only if planner declared it AND \`acMode == strict\`.
190
- - Soft mode: one TDD cycle for the whole feature; tests under \`tests/\` mirroring the production module path; plain \`git commit\`.
254
+ - Soft mode: one TDD cycle for the whole feature; tests under \`tests/\` mirroring the production module path; plain \`git commit\`. Sequential, single dispatch, no worktrees.
255
+ - Strict mode, sequential: full RED GREEN REFACTOR per AC, every commit through \`commit-helper.mjs\`. Single \`slice-builder\` dispatch in the main working tree.
256
+ - Strict mode, parallel: see "Parallel-build fan-out" below — only when planner declared \`topology: parallel-build\` AND ≥4 AC AND ≥2 disjoint touchSurface clusters.
191
257
  - Inline mode: not dispatched here — handled in the trivial path of Hop 2.
192
258
  - Slim summary: AC committed (strict) or conditions verified (soft), suite-status (passed / failed), open follow-ups.
193
259
 
260
+ ##### Parallel-build fan-out (strict mode + planner topology=parallel-build only)
261
+
262
+ When the planner artifact declares \`topology: parallel-build\` with ≥2 slices and \`acMode == strict\`, the orchestrator fans out one \`slice-builder\` sub-agent per slice, **capped at 5**, each in its own \`git worktree\`. This is the only fan-out cclaw uses outside of \`ship\`.
263
+
264
+ \`\`\`text
265
+ flows/<slug>/plan.md
266
+ topology: parallel-build
267
+ slices: [s-1, s-2, s-3] (max 5)
268
+
269
+
270
+ git worktree add .cclaw/worktrees/<slug>-s-1 -b cclaw/<slug>/s-1
271
+ git worktree add .cclaw/worktrees/<slug>-s-2 -b cclaw/<slug>/s-2
272
+ git worktree add .cclaw/worktrees/<slug>-s-3 -b cclaw/<slug>/s-3
273
+
274
+ ┌───────────────────┼───────────────────┐
275
+ ▼ ▼ ▼
276
+ slice-builder slice-builder slice-builder
277
+ (s-1; AC-1, AC-2) (s-2; AC-3) (s-3; AC-4, AC-5)
278
+ cwd: …/<slug>-s-1 cwd: …/<slug>-s-2 cwd: …/<slug>-s-3
279
+ RED→GREEN→REFACTOR RED→GREEN→REFACTOR RED→GREEN→REFACTOR
280
+ per AC, in slice per AC, in slice per AC, in slice
281
+ │ │ │
282
+ └───────────────────┼───────────────────┘
283
+
284
+ reviewer (mode=integration)
285
+ reads each branch, checks
286
+ cross-slice conflicts, AC↔commit
287
+ chain across the wave
288
+
289
+
290
+ merge cclaw/<slug>/s-1 → main, then s-2, then s-3
291
+ (fast-forward when wave was clean; otherwise stop and ask)
292
+
293
+
294
+ git worktree remove .cclaw/worktrees/<slug>-s-N (per slice)
295
+ \`\`\`
296
+
297
+ Dispatch envelope per slice:
298
+
299
+ \`\`\`
300
+ Dispatch slice-builder
301
+ ─ Stage: build
302
+ ─ Slug: <slug>
303
+ ─ Slice: s-N (acIds: [AC-N, AC-N+1])
304
+ ─ Working tree: .cclaw/worktrees/<slug>-s-N
305
+ ─ Branch: cclaw/<slug>/s-N
306
+ ─ AC mode: strict
307
+ ─ Touch surface (only paths this slice may modify): [<paths from plan>]
308
+ ─ Output: .cclaw/flows/<slug>/build.md (append, marked with slice id)
309
+ ─ Forbidden: read or modify any path outside touch surface; read another slice's worktree mid-flight; merge or rebase
310
+ \`\`\`
311
+
312
+ After every slice-builder returns:
313
+
314
+ 1. Patch \`flow-state.json\` with the per-slice progress.
315
+ 2. When **every** slice has reported, dispatch \`reviewer\` mode=\`integration\` (one sub-agent, reads from each branch).
316
+ 3. On clear integration review, merge slices into main one at a time. On block, dispatch \`slice-builder\` mode=\`fix-only\` against the cited file:line refs, then re-run the integration reviewer.
317
+ 4. Worktree cleanup happens after merge; the cclaw branches stay until ship.
318
+
319
+ Hard rules:
320
+
321
+ - **More than 5 parallel slices is forbidden.** If planner produced >5, the planner must merge thinner slices into fatter ones before build; do not generate "wave 2".
322
+ - Slice-builders never read each other's worktrees mid-flight. A slice that detects a conflict with another stops and raises an integration finding.
323
+ - If the harness lacks sub-agent dispatch or worktree creation fails (non-git repo, permissions), parallel-build degrades silently to inline-sequential. Record the fallback in \`flows/<slug>/build.md\` frontmatter (\`subAgentDispatch: inline-fallback\`) — not an error.
324
+ - \`auto\` runMode does **not** affect the integration-reviewer ask: a parallel wave that produces a block finding always asks the user before fix-only.
325
+
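As an illustration of the fan-out rules above, the sketch below derives the `git worktree add` commands for a wave and enforces the 5-slice cap. The helper name and the choice to return an empty list for fewer than two slices (sequential build in the main working tree) are assumptions; the command shape and the cap are taken from the text.

```javascript
// Hypothetical helper, not part of cclaw-cli.
const MAX_PARALLEL_SLICES = 5;

function worktreeCommands(slug, sliceIds) {
  if (sliceIds.length > MAX_PARALLEL_SLICES) {
    // The text forbids a "wave 2"; the planner has to merge slices instead.
    throw new Error(
      `planner produced ${sliceIds.length} slices; merge them down to ` +
        `${MAX_PARALLEL_SLICES} or fewer before build`
    );
  }
  // Fan-out needs at least two slices; otherwise no worktrees are created.
  if (sliceIds.length < 2) return [];

  return sliceIds.map(
    (id) => `git worktree add .cclaw/worktrees/${slug}-${id} -b cclaw/${slug}/${id}`
  );
}

console.log(worktreeCommands("demo", ["s-1", "s-2", "s-3"]).join("\n"));
```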
194
326
  #### review
195
327
 
196
328
  - Specialist: \`reviewer\` (mode = \`code\` for sequential build, \`integration\` for parallel-build).
@@ -202,11 +334,75 @@ The orchestrator reads only this. The full artifact stays in \`.cclaw/flows/<slu
202
334
 
203
335
  #### ship
204
336
 
205
- - Specialist: \`reviewer\` mode=\`release\` AND \`security-reviewer\` mode=\`threat-model\` if \`security_flag\` is true.
206
- - Pattern: **parallel fan-out + merge** (the only fan-out cclaw uses). Dispatch both specialists in the same message; merge their summaries in your context.
337
+ - Specialists fanned out in parallel (the only fan-out cclaw uses):
338
+ - \`reviewer\` mode=\`release\` always.
339
+ - \`reviewer\` mode=\`adversarial\` — **strict mode only** (see below).
340
+ - \`security-reviewer\` mode=\`threat-model\` — when \`security_flag\` is true.
341
+ - Pattern: **parallel fan-out + merge** (the canonical cclaw fan-out). Dispatch all specialists in the same message; merge their summaries in your context.
207
342
  - Inputs: \`.cclaw/flows/<slug>/plan.md\`, build.md, review.md.
208
- - Output: \`.cclaw/flows/<slug>/ship.md\` with the go/no-go decision, AC↔commit map (strict) or condition checklist (soft), release notes, and rollback plan.
209
- - After ship, run the compound learning gate (Hop 5).
343
+ - Output: \`.cclaw/flows/<slug>/ship.md\` with the go/no-go decision, AC↔commit map (strict) or condition checklist (soft), release notes, and rollback plan. Plus, in strict mode, \`.cclaw/flows/<slug>/pre-mortem.md\` written by the adversarial reviewer (see below).
344
+ - After ship, run the compound learning gate (Hop 6).
345
+
346
+ ##### Adversarial pre-mortem (strict mode only)
347
+
348
+ Before the ship gate finalises, the orchestrator dispatches \`reviewer\` mode=\`adversarial\` against the diff produced for this slug. The adversarial reviewer's specific job is to **think like the failure**: how would this break in production a week from now?
349
+
350
+ The adversarial sweep produces \`.cclaw/flows/<slug>/pre-mortem.md\`:
351
+
352
+ \`\`\`markdown
353
+ ---
354
+ slug: <slug>
355
+ stage: ship
356
+ status: pre-mortem
357
+ generated_by: reviewer mode=adversarial
358
+ generated_at: <iso>
359
+ ---
360
+
361
+ # Pre-mortem — <slug>
362
+
363
+ It is now <ship-date>+7d. This change shipped, then failed. What was the failure?
364
+
365
+ ## Most likely failure modes
366
+
367
+ 1. **<class>: <one-line failure>** — trigger: <input/condition>; impact: <user-visible result>; covered by AC: <yes/no, AC-N or "no AC tests this">.
368
+ 2. **<class>: ...**
369
+ 3. ...
370
+
371
+ ## Underexplored axes
372
+
373
+ - <axis (correctness/readability/architecture/security/perf)>: <what reviewer's code-mode pass might have missed>
374
+ - ...
375
+
376
+ ## Recommended pre-ship actions
377
+
378
+ - <add a regression test for failure 1: file:line>
379
+ - <surface decision X to the user before merge>
380
+ - <none — pre-mortem is satisfied>
381
+ \`\`\`
382
+
383
+ Failure classes the adversarial pass MUST consider (mark each as "covered" / "not covered" / "n/a"):
384
+
385
+ - **data-loss** — write paths that could lose user data on rollback or partial failure;
386
+ - **race** — concurrent operations on shared state without locking / ordering guarantees;
387
+ - **regression** — prior-shipped behaviour an existing test does not pin;
388
+ - **rollback impossibility** — schema migration / persisted state shape that cannot be reverted;
389
+ - **accidental scope** — diff touches files no AC mentions;
390
+ - **security-edge** — auth bypass, injection, leaked secret in logs, untrusted input.
391
+
392
+ The adversarial reviewer treats every "not covered" as a finding (axis varies; severity \`required\` by default, escalated to \`critical\` for data-loss / security-edge). Findings go into the existing Concern Ledger in \`review.md\`; the pre-mortem.md is a parallel artifact summarising the adversarial pass's reasoning so the user can read a one-page rationale.
393
+
394
+ Ship gate decision after fan-out:
395
+
396
+ | reviewer:release | reviewer:adversarial | security-reviewer | gate |
397
+ | --- | --- | --- | --- |
398
+ | clear | clear | clear | clear → ship may proceed |
399
+ | clear | block | any | block → fix-only loop or user override |
400
+ | any | any | block | block → fix-only loop |
401
+ | clear | warn | clear | warn → render adversarial findings, ask user |
402
+
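The gate table above can be read as a small decision function. This is a sketch under stated assumptions: the function name is hypothetical, the verdict strings mirror the table cells, and any combination the table does not list returns `"unspecified"` instead of guessing.

```javascript
// Hypothetical encoding of the ship gate table, not part of cclaw-cli.
function shipGate(release, adversarial, security) {
  // Row: any | any | block
  if (security === "block") return "block";
  // Row: clear | block | any
  if (release === "clear" && adversarial === "block") return "block";
  // Row: clear | clear | clear
  if (release === "clear" && adversarial === "clear" && security === "clear") {
    return "clear";
  }
  // Row: clear | warn | clear
  if (release === "clear" && adversarial === "warn" && security === "clear") {
    return "warn";
  }
  // Not covered by the table.
  return "unspecified";
}

console.log(shipGate("clear", "warn", "clear"));
```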
403
+ The adversarial pass runs **once per ship attempt**, not iteratively. If it produces \`block\`-level findings, the orchestrator dispatches \`slice-builder\` mode=\`fix-only\` and re-runs the **regular** reviewer (mode=\`code\`) to confirm the fix; the adversarial pass does not re-run unless the user explicitly requests it (the marginal value drops fast on second run).
404
+
405
+ In \`soft\` mode the adversarial pass is **skipped** by default — the lighter-weight regular reviewer is enough for small/medium work. The user can opt in with \`/cc <task> --adversarial\` if they want the extra sweep regardless.
210
406
 
211
407
  ### Discovery (large-risky only)
212
408
 
@@ -220,6 +416,10 @@ Each step is a separate dispatch + pause + slim summary. The user can stop after
220
416
 
221
417
  ## Hop 4 — Pause and resume
222
418
 
419
+ Pause behaviour depends on \`triage.runMode\` (default \`step\`).
420
+
421
+ ### \`step\` mode (default; safer; recommended for \`strict\` work)
422
+
223
423
  After every dispatch returns:
224
424
 
225
425
  1. Render the slim summary back to the user.
@@ -227,7 +427,42 @@ After every dispatch returns:
227
427
  3. Wait. Do **not** auto-advance. The user types \`continue\`, \`show\`, \`fix-only\`, or \`cancel\`.
228
428
  4. On \`continue\` → next stage in \`triage.path\`. On \`show\` → open the artifact and stop. On \`fix-only\` → re-dispatch slice-builder with mode=fix-only and the cited findings. On \`cancel\` → \`/cc-cancel\`.
229
429
 
230
- Resume from a fresh session works because everything is on disk: \`flow-state.json\` has \`currentStage\` and \`triage\`, \`flows/<slug>/*.md\` carries the artifacts. The next \`/cc\` invocation enters Hop 1 → detect → resume summary → continue from \`currentStage\`.
430
+ ### \`auto\` mode (autopilot; faster; recommended for \`inline\` / \`soft\` work)
431
+
432
+ After every dispatch returns:
433
+
434
+ 1. Render the slim summary back to the user (one block, no prompt).
435
+ 2. **Immediately** dispatch the next stage in \`triage.path\` — no waiting, no question.
436
+ 3. Stop unconditionally only on these hard gates (autopilot **always** asks here):
437
+ - \`reviewer\` returned \`block\` decision (open findings) → render the findings, ask \`continue with fix-only\` / \`cancel\`.
438
+ - \`security-reviewer\` raised any finding → ask before proceeding.
439
+ - \`reviewer\` returned \`cap-reached\` (5 iterations without convergence) → ask.
440
+ - **A returned slim summary has \`Confidence: low\`** → ask before proceeding (covered in detail below).
441
+ - About to run \`ship\` (last stage in \`triage.path\`) → ask \`ship now?\` once, then proceed on confirmation. Ship is the only stage that always confirms in autopilot.
442
+
443
+ Auto mode never silently skips a hard gate; it just removes the cosmetic pause between green stages. The user typed \`auto\` once during triage and meant it.
444
+
445
+ ### Confidence as a hard gate (both modes)
446
+
447
+ Every slim summary carries a \`Confidence: high | medium | low\` line. The orchestrator reads it and treats it as a quality signal for the dispatch that just returned, not a prediction of the next stage:
448
+
449
+ | Confidence | step mode | auto mode |
450
+ | --- | --- | --- |
451
+ | \`high\` | normal pause; render summary, ask continue | normal flow; chain to next stage |
452
+ | \`medium\` | normal pause; render summary, mention confidence in the user-facing line ("Plan ready (medium confidence — see Notes). Continue?") | render the summary inline ("medium — see Notes"); chain anyway. The Notes line is required when confidence is medium |
453
+ | \`low\` | hard gate. Render the summary, do **not** offer \`continue\` as a verb. Offer: \`expand <stage>\` (re-dispatch the same specialist with a richer envelope), \`show\` (open the artifact), \`override\` (acknowledge the risk and continue anyway), \`cancel\` | hard gate. Stop chaining. Render the summary, ask the same expand/show/override/cancel question. \`override\` is the only word that resumes auto-chaining |
454
+
455
+ A specialist that returns \`Confidence: low\` MUST also write a non-empty \`Notes:\` line that explains the dimension that drove confidence down (missing input, unverified citation, partial coverage, etc.). The orchestrator surfaces that Notes line verbatim — the sub-agent is the only one with the context to explain.
456
+
457
+ Repeated low-confidence on the same stage (the second consecutive dispatch returns low) is itself a routing signal: the orchestrator should suggest re-triage with a richer path (e.g. \`small/medium\` → \`large-risky\`) or splitting the slug, rather than dispatching the same specialist a third time.
458
+
459
+ Override is sticky to **this stage only** — the next stage starts with the normal high-confidence-default behaviour.
460
+
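One way to picture the pause rules for both run modes is a function that maps a returned slim summary to the orchestrator's next move. The function name, the summary field names (`confidence`, `reviewerDecision`, `securityFindings`, `nextStage`) and the return labels are all hypothetical; the gates themselves are the ones listed in the text.

```javascript
// Hypothetical sketch, not part of cclaw-cli.
// Returns "hard-gate", "pause", "ask", or "chain".
function afterDispatch(runMode, summary) {
  // Confidence: low stops both modes; `continue` is not offered.
  if (summary.confidence === "low") return "hard-gate";

  // step mode pauses after every stage.
  if (runMode === "step") return "pause";

  // auto mode: these gates always ask the user.
  if (summary.reviewerDecision === "block") return "ask";
  if (summary.reviewerDecision === "cap-reached") return "ask";
  if (summary.securityFindings > 0) return "ask";
  if (summary.nextStage === "ship") return "ask";

  // Green stage: chain to the next one without a prompt.
  return "chain";
}

console.log(afterDispatch("auto", { confidence: "medium", nextStage: "build" }));
```

In this sketch `medium` confidence chains in `auto` mode, which matches the table; the required Notes line is a rendering concern and is left out.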
461
+ ### Common rules for both modes
462
+
463
+ Resume from a fresh session works because everything is on disk: \`flow-state.json\` has \`currentStage\`, \`triage\` (with \`runMode\`), \`flows/<slug>/*.md\` carries the artifacts. The next \`/cc\` invocation enters Hop 1 → detect → resume summary → continue from \`currentStage\` with the saved runMode.
464
+
465
+ Resuming a paused \`auto\` flow re-enters auto mode silently. Resuming a paused \`step\` flow renders the slim summary again and waits for \`continue\`.
231
466
 
232
467
  ## Hop 5 — Compound (automatic)
233
468
 
@@ -244,8 +479,10 @@ After ship + compound, move every \`<stage>.md\` from \`flows/<slug>/\` into \`.
244
479
 
245
480
  ## Always-ask rules
246
481
 
247
- - Always run the triage gate on a fresh \`/cc\`. Never silently pick a path.
248
- - Always pause after every stage. Never auto-advance through plan → build → review without asking.
482
+ - Always run the triage gate on a fresh \`/cc\`. Never silently pick a path. Use the harness's structured question tool, not a printed code block.
483
+ - In \`step\` mode, always pause after every stage. Never auto-advance.
484
+ - In \`auto\` mode, never auto-advance past a hard gate (block / cap-reached / security finding / **Confidence: low** / ship). The user opted into chaining green stages, not chaining decisions.
485
+ - Always honour \`Confidence: low\` in the slim summary. Stop and ask, both modes. See "Confidence as a hard gate" above.
249
486
  - Always ask before \`git push\` or PR creation. Commit-helper auto-commits in strict mode; everything past commit is opt-in.
250
487
  - Always ask before deleting active artifacts (\`/cc-cancel\` is the supported way; do not \`rm\` artifacts directly).
251
488
  - Always show the slim summary back to the user; do not summarise from your own memory of the dispatch.
@@ -263,6 +500,7 @@ These skills auto-trigger during \`/cc\`. Do not re-explain them; obey them.
263
500
  - **conversation-language** — always-on; reply in the user's language but never translate \`AC-N\`, \`D-N\`, \`F-N\`, slugs, paths, frontmatter keys, mode names, or hook output.
264
501
  - **anti-slop** — always-on for any code-modifying step; bans redundant verification and environment shims.
265
502
  - **triage-gate** — Hop 2 of every fresh \`/cc\`.
503
+ - **pre-flight-assumptions** — Hop 2.5 of every fresh non-inline \`/cc\`; surfaces 3-7 stack/convention/architecture defaults for user confirmation.
266
504
  - **flow-resume** — when \`/cc\` is invoked with no task or with an active flow.
267
505
  - **plan-authoring** — on every edit to \`.cclaw/flows/<slug>/plan.md\`.
268
506
  - **ac-traceability** — strict mode only; before every commit.
@@ -270,7 +508,8 @@ These skills auto-trigger during \`/cc\`. Do not re-explain them; obey them.
270
508
  - **refinement** — when an existing plan match is detected.
271
509
  - **parallel-build** — strict mode + planner topology=parallel-build; enforces 5-slice cap and worktree dispatch.
272
510
  - **security-review** — when the diff touches sensitive surfaces.
273
- - **review-loop** — wraps every reviewer / security-reviewer invocation; runs the Concern Ledger + convergence detector.
511
+ - **review-loop** — wraps every reviewer / security-reviewer invocation; runs the Concern Ledger + Five-axis pass + convergence detector.
512
+ - **source-driven** — strict mode only (opt-in for soft); architect/planner detect stack version, fetch official doc deep-links, cite URLs, mark UNVERIFIED when docs are missing.
274
513
 
275
514
  ${ironLawsMarkdown()}
276
515
  `;
@@ -1,4 +1,4 @@
1
- import { type AcMode, type AcceptanceCriterionState, type BuildProfile, type DiscoverySpecialistId, type FlowStage, type RoutingClass, type TriageDecision } from "./types.js";
1
+ import { type AcMode, type AcceptanceCriterionState, type BuildProfile, type DiscoverySpecialistId, type FlowStage, type RoutingClass, type RunMode, type TriageDecision } from "./types.js";
2
2
  export declare const FLOW_STATE_SCHEMA_VERSION = 3;
3
3
  /** v8.0–v8.1 schema. Auto-migrated to v3 on read. */
4
4
  export declare const LEGACY_V8_FLOW_STATE_SCHEMA_VERSION = 2;
@@ -28,10 +28,29 @@ export declare class LegacyFlowStateError extends Error {
28
28
  export declare function isFlowStage(value: unknown): value is FlowStage;
29
29
  export declare function isRoutingClass(value: unknown): value is RoutingClass;
30
30
  export declare function isAcMode(value: unknown): value is AcMode;
31
+ export declare function isRunMode(value: unknown): value is RunMode;
31
32
  export declare function isDiscoverySpecialist(value: unknown): value is DiscoverySpecialistId;
32
33
  export declare function createInitialFlowState(nowIso?: string): FlowStateV82;
33
34
  /** @deprecated kept for source-level compatibility with v8.1 imports. */
34
35
  export declare const createInitialFlowStateV8: typeof createInitialFlowState;
36
+ /**
37
+ * Read a triage decision's pre-flight assumptions.
38
+ *
39
+ * Returns:
40
+ * - `[]` when no pre-flight ran (legacy state, trivial path, or older
41
+ * `step`/`auto` flow-state with no assumptions field). Callers should
42
+ * treat this as "no captured assumptions, do not surface anything".
43
+ * - the recorded array (possibly empty if the pre-flight ran but the user
44
+ * confirmed there were no assumptions to record — rare but valid).
45
+ */
46
+ export declare function assumptionsOf(triage: TriageDecision | null | undefined): readonly string[];
47
+ /**
48
+ * Read a triage decision's runMode with the documented default.
49
+ *
50
+ * v8.2 state files do not record runMode; treat them as `step` so existing
51
+ * flows keep their pause-between-stages behaviour byte-for-byte.
52
+ */
53
+ export declare function runModeOf(triage: TriageDecision | null | undefined): RunMode;
35
54
  /**
36
55
  * Validate a flow-state object. Throws on hard schema errors.
37
56
  *
@@ -1,4 +1,4 @@
1
- import { AC_MODES, FLOW_STAGES, ROUTING_CLASSES } from "./types.js";
1
+ import { AC_MODES, FLOW_STAGES, ROUTING_CLASSES, RUN_MODES } from "./types.js";
2
2
  export const FLOW_STATE_SCHEMA_VERSION = 3;
3
3
  /** v8.0–v8.1 schema. Auto-migrated to v3 on read. */
4
4
  export const LEGACY_V8_FLOW_STATE_SCHEMA_VERSION = 2;
@@ -19,6 +19,9 @@ export function isRoutingClass(value) {
19
19
  export function isAcMode(value) {
20
20
  return typeof value === "string" && AC_MODES.includes(value);
21
21
  }
22
+ export function isRunMode(value) {
23
+ return typeof value === "string" && RUN_MODES.includes(value);
24
+ }
22
25
  export function isDiscoverySpecialist(value) {
23
26
  return value === "brainstormer" || value === "architect" || value === "planner";
24
27
  }
@@ -62,7 +65,8 @@ function inferTriageFromLegacy(state) {
62
65
  path: ["plan", "build", "review", "ship"],
63
66
  rationale: "Auto-migrated from cclaw 8.0/8.1 flow-state (no triage recorded; preserved as strict).",
64
67
  decidedAt: state.startedAt,
65
- userOverrode: false
68
+ userOverrode: false,
69
+ runMode: "step"
66
70
  };
67
71
  }
68
72
  function assertAcArray(value) {
@@ -116,6 +120,44 @@ function assertTriageOrNull(value) {
116
120
  if (typeof triage.userOverrode !== "boolean") {
117
121
  throw new Error("triage.userOverrode must be a boolean");
118
122
  }
123
+ if (triage.runMode !== undefined && !isRunMode(triage.runMode)) {
124
+ throw new Error(`Invalid triage.runMode: ${String(triage.runMode)}`);
125
+ }
126
+ if (triage.assumptions !== undefined && triage.assumptions !== null) {
127
+ if (!Array.isArray(triage.assumptions)) {
128
+ throw new Error("triage.assumptions must be an array, null, or absent");
129
+ }
130
+ for (const entry of triage.assumptions) {
131
+ if (typeof entry !== "string") {
132
+ throw new Error("triage.assumptions entries must be strings");
133
+ }
134
+ }
135
+ }
136
+ }
137
+ /**
138
+ * Read a triage decision's pre-flight assumptions.
139
+ *
140
+ * Returns:
141
+ * - `[]` when no pre-flight ran (legacy state, trivial path, or older
142
+ * `step`/`auto` flow-state with no assumptions field). Callers should
143
+ * treat this as "no captured assumptions, do not surface anything".
144
+ * - the recorded array (possibly empty if the pre-flight ran but the user
145
+ * confirmed there were no assumptions to record — rare but valid).
146
+ */
147
+ export function assumptionsOf(triage) {
148
+ const value = triage?.assumptions;
149
+ if (value === null || value === undefined)
150
+ return [];
151
+ return value;
152
+ }
153
+ /**
154
+ * Read a triage decision's runMode with the documented default.
155
+ *
156
+ * v8.2 state files do not record runMode; treat them as `step` so existing
157
+ * flows keep their pause-between-stages behaviour byte-for-byte.
158
+ */
159
+ export function runModeOf(triage) {
160
+ return triage?.runMode ?? "step";
119
161
  }
120
162
  /**
121
163
  * Validate a flow-state object. Throws on hard schema errors.
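Note on the hunk above: a minimal sketch of how a caller might consume the two readers it adds. The `TriageLike` interface is a cut-down stand-in for `TriageDecision` (only the two new fields), and `shouldPause` plus the sample data are hypothetical, not part of cclaw-cli. The bodies of `runModeOf` and `assumptionsOf` are copied from the diff with type annotations added.

```typescript
// Cut-down types for illustration; the real TriageDecision has more fields.
type RunMode = "step" | "auto";
interface TriageLike {
  runMode?: RunMode;
  assumptions?: string[] | null;
}

// Same logic as in the diff, with type annotations added.
function runModeOf(triage: TriageLike | null | undefined): RunMode {
  return triage?.runMode ?? "step";
}

function assumptionsOf(triage: TriageLike | null | undefined): readonly string[] {
  const value = triage?.assumptions;
  if (value === null || value === undefined) return [];
  return value;
}

// Hypothetical caller: should the orchestrator wait for "continue" after a stage?
function shouldPause(triage: TriageLike | null | undefined): boolean {
  return runModeOf(triage) === "step";
}

// Made-up sample states.
const legacyV82: TriageLike = {}; // v8.2 file: neither field recorded
const autopilot: TriageLike = {
  runMode: "auto",
  assumptions: ["Target platform is Node 20"],
};

console.log(shouldPause(legacyV82)); // true: absent runMode defaults to "step"
console.log(shouldPause(autopilot)); // false
console.log(assumptionsOf(null).length); // 0
```

Because both readers accept `null` and `undefined`, a caller does not need to check whether triage has been recorded before asking for the run mode.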
package/dist/types.d.ts CHANGED
@@ -41,6 +41,21 @@ export type RoutingClass = (typeof ROUTING_CLASSES)[number];
41
41
  */
42
42
  export declare const AC_MODES: readonly ["inline", "soft", "strict"];
43
43
  export type AcMode = (typeof AC_MODES)[number];
44
+ /**
45
+ * How aggressively the orchestrator advances through the flow.
46
+ *
47
+ * - `step` (default): pause after every stage. The orchestrator renders the
48
+ * slim summary and waits for the user to type "continue". The original
49
+ * v8.2 behaviour, recommended for `strict` and unfamiliar work.
50
+ * - `auto`: render the slim summary and immediately dispatch the next stage
51
+ * without asking. Stops only on hard gates (block findings, security flag,
52
+ * ship). Recommended for `inline` / `soft` work the user has already
53
+ * scoped tightly.
54
+ *
55
+ * Selected at the triage gate; user can override per flow.
56
+ */
57
+ export declare const RUN_MODES: readonly ["step", "auto"];
58
+ export type RunMode = (typeof RUN_MODES)[number];
44
59
  /**
45
60
  * Decision recorded at the triage gate that opens every new flow.
46
61
  * Persisted in flow-state.json so resumes never re-trigger triage.
@@ -56,6 +71,31 @@ export interface TriageDecision {
56
71
  decidedAt: string;
57
72
  /** Did the user override the orchestrator's recommendation? */
58
73
  userOverrode: boolean;
74
+ /**
75
+ * Step-by-step (default) or autopilot. Persisted across resumes so the
76
+ * user only picks once per flow.
77
+ *
78
+ * Optional in TypeScript so v8.2 state files (which lack `runMode`) still
79
+ * validate; readers MUST default to `step` on absent.
80
+ */
81
+ runMode?: RunMode;
82
+ /**
83
+ * Pre-flight assumptions surfaced at Hop 2.5 (between triage and first
84
+ * dispatch). Each entry is one short sentence the orchestrator was about
85
+ * to silently default to (stack pick, lib version, file layout, target
86
+ * platform, code-style preference). The user either acknowledged or
87
+ * corrected these before any sub-agent ran.
88
+ *
89
+ * Optional and skipped entirely on the inline path. On soft/strict, the
90
+ * pre-flight skill writes 3-7 entries here; subsequent flows in the same
91
+ * project may seed defaults from the most recent shipped slug's
92
+ * `assumptions:` block.
93
+ *
94
+ * Reading rule: `null` or absent means "no pre-flight ran" (legacy state
95
+ * or trivial path). An empty array means "ran and the user accepted no
96
+ * assumptions are needed", which is rare but valid.
97
+ */
98
+ assumptions?: string[] | null;
59
99
  }
60
100
  export interface CliContext {
61
101
  cwd: string;
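Note on the `assumptions` reading rule above: the doc comment distinguishes "no pre-flight ran" (`null` or absent) from "ran, nothing to record" (empty array), while `assumptionsOf` returns `[]` in both cases. A caller that needs the distinction would have to inspect the raw field. The sketch below shows one way to do that; `preflightStatus`, `PreflightStatus`, and `TriageLike` are hypothetical names, not part of the package.

```typescript
// Hypothetical three-way classification of the raw `assumptions` field.
type PreflightStatus = "not-run" | "ran-empty" | "ran-with-assumptions";

// Cut-down stand-in for TriageDecision; only the field under discussion.
interface TriageLike {
  assumptions?: string[] | null;
}

function preflightStatus(triage: TriageLike | null | undefined): PreflightStatus {
  const value = triage?.assumptions;
  // null or absent: legacy state or trivial path, per the doc comment.
  if (value === null || value === undefined) return "not-run";
  // Present but empty: the pre-flight ran and the user recorded nothing.
  return value.length === 0 ? "ran-empty" : "ran-with-assumptions";
}

console.log(preflightStatus({}));                             // "not-run"
console.log(preflightStatus({ assumptions: null }));          // "not-run"
console.log(preflightStatus({ assumptions: [] }));            // "ran-empty"
console.log(preflightStatus({ assumptions: ["Uses pnpm"] })); // "ran-with-assumptions"
```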
package/dist/types.js CHANGED
@@ -21,3 +21,17 @@ export const ROUTING_CLASSES = ["trivial", "small-medium", "large-risky"];
21
21
  * Selected at the triage gate; user can override.
22
22
  */
23
23
  export const AC_MODES = ["inline", "soft", "strict"];
24
+ /**
25
+ * How aggressively the orchestrator advances through the flow.
26
+ *
27
+ * - `step` (default): pause after every stage. The orchestrator renders the
28
+ * slim summary and waits for the user to type "continue". The original
29
+ * v8.2 behaviour, recommended for `strict` and unfamiliar work.
30
+ * - `auto`: render the slim summary and immediately dispatch the next stage
31
+ * without asking. Stops only on hard gates (block findings, security flag,
32
+ * ship). Recommended for `inline` / `soft` work the user has already
33
+ * scoped tightly.
34
+ *
35
+ * Selected at the triage gate; user can override per flow.
36
+ */
37
+ export const RUN_MODES = ["step", "auto"];
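Note on `RUN_MODES`: the shipped `isRunMode` is plain JavaScript. A typed equivalent could look like the sketch below, which assumes the guard is written as a TypeScript type predicate. `parseRunMode` is a hypothetical helper that combines the validation rule from `assertTriageOrNull` with the `step` default from `runModeOf`; it is not part of the package.

```typescript
const RUN_MODES = ["step", "auto"] as const;
type RunMode = (typeof RUN_MODES)[number];

// Type predicate: narrows `unknown` to RunMode when it returns true.
// The cast widens the tuple so `.includes` accepts an arbitrary string.
function isRunMode(value: unknown): value is RunMode {
  return typeof value === "string" && (RUN_MODES as readonly string[]).includes(value);
}

// Hypothetical helper: absent means "step"; anything else must be a known mode.
function parseRunMode(raw: unknown): RunMode {
  if (raw === undefined) return "step";
  if (!isRunMode(raw)) {
    throw new Error(`Invalid triage.runMode: ${String(raw)}`);
  }
  return raw;
}

console.log(parseRunMode(undefined)); // "step"
console.log(parseRunMode("auto"));    // "auto"
```

Passing an unknown string such as `"turbo"` throws, which matches the behaviour of the validator in the flow-state hunk.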
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "cclaw-cli",
3
- "version": "8.2.0",
3
+ "version": "8.4.0",
4
4
  "description": "Lightweight harness-first flow toolkit for coding agents",
5
5
  "type": "module",
6
6
  "bin": {