cclaw-cli 0.7.1 → 0.9.0
- package/dist/content/agents.d.ts +9 -0
- package/dist/content/agents.js +177 -6
- package/dist/content/examples.d.ts +17 -0
- package/dist/content/examples.js +275 -4
- package/dist/content/harness-tool-refs.d.ts +20 -0
- package/dist/content/harness-tool-refs.js +240 -0
- package/dist/content/meta-skill.js +203 -33
- package/dist/content/skills.js +106 -49
- package/dist/content/stage-schema.js +63 -11
- package/dist/content/start-command.js +63 -17
- package/dist/content/subagents.js +169 -0
- package/dist/content/templates.js +44 -6
- package/dist/content/utility-skills.d.ts +2 -1
- package/dist/content/utility-skills.js +141 -2
- package/dist/doctor.js +77 -0
- package/dist/harness-adapters.js +55 -16
- package/dist/install.js +19 -0
- package/package.json +1 -1
package/dist/content/skills.js
CHANGED
@@ -1,7 +1,7 @@
 import { RUNTIME_ROOT } from "../constants.js";
-import { stageExamples } from "./examples.js";
+import { stageDomainExamples, stageExamples, stageGoodBadExamples } from "./examples.js";
 import { selfImprovementBlock } from "./learnings.js";
-import {
+import { stageAutoSubagentDispatch, stageSchema } from "./stage-schema.js";
 function rationalizationTable(stage) {
 const schema = stageSchema(stage);
 return `| Rationalization | Reality |
@@ -67,6 +67,25 @@ function decisionRecordBlock(stage) {
 return "";
 return `## Decision Record Template\n\nUse this format for every non-trivial architecture or scope decision made during this stage:\n\n\`\`\`\n${fmt}\n\`\`\`\n`;
 }
+function visualCommunicationBlock(stage) {
+if (stage !== "design")
+return "";
+return `## Visual Communication Rules
+
+Diagrams are load-bearing artifacts in the design stage, not decoration. A diagram that encodes structure wrongly (or hides structure behind generic labels) misleads every downstream reader. Apply these rules to **every** diagram in the design artifact:
+
+1. **Concrete names, never generic.** "Service A → Service B" is not a diagram; it is a shape. Every node must name a real component the team will build or touch (\`NotificationPublisher\`, \`FeedReadModel\`, \`Stripe webhook handler\`). If you cannot name it concretely, the design is not ready.
+2. **Every arrow is labeled.** Label with the message, action, or protocol it carries (\`publishEvent(user_id, payload)\`, \`GET /snapshot\`, \`dedupe-key upsert\`). Unlabeled arrows silently lose the contract between components.
+3. **Direction is explicit.** Use arrowheads, not bare lines; draw the flow of *data* (not "dependency") unless the diagram type is explicitly a dependency graph, in which case say so in a one-line caption.
+4. **Distinguish sync vs async.** Use a convention and state it once in a legend: e.g. solid arrow = synchronous request/response, dashed arrow = async message via queue/bus, double arrow = two-way. Async edges always name the queue or topic.
+5. **Show at least one failure edge.** Every non-trivial diagram needs one branch that represents the degraded or error path (timeout, reconnect, fallback to cache, poison-message routing). A diagram with only the happy path hides the interesting half of the design.
+6. **One level of detail per diagram.** Do not mix "service-level" and "class-level" on the same canvas. If you need both, produce two diagrams — one at the system boundary, one at the internal module — and cross-reference them.
+7. **Caption, not decoration.** Each diagram gets a one-sentence caption below it stating what the reader should take away ("*Publish path with idempotent outbox; SSE stream reads the projection, not the bus directly*"). If you cannot write the caption in one sentence, the diagram is doing two things at once.
+8. **Prefer text-based formats** (Mermaid, ASCII) over binary images in \`.cclaw/artifacts/\` so diffs stay reviewable. Binary/SVG is allowed when the diagram is already the source of truth elsewhere (e.g. \`docs/architecture/\`) and the artifact embeds a link plus a text-based summary.
+
+If a diagram cannot satisfy rules 1–5, do NOT include it — a missing diagram is honest; a misleading diagram is worse. Surface the gap in **Unresolved Decisions** and proceed without the diagram until the decisions that would populate it are locked.
+`;
+}
 function contextLoadingBlock(stage) {
 const trace = stageSchema(stage).crossStageTrace;
 const readLines = trace.readsFrom.length > 0
@@ -136,60 +155,81 @@ function waveExecutionModeBlock(stage) {
|
|
|
136
155
|
|
|
137
156
|
After plan approval (**WAIT_FOR_CONFIRM** / \`plan_wait_for_confirm\` satisfied), process **all tasks in the current dependency wave** sequentially: **RED → GREEN → REFACTOR** per task, recording evidence per slice. **Stop** only on **BLOCKED**, a test failure that **requires user input**, or **wave completion** (every task in the wave has the required RED / GREEN / REFACTOR evidence per the plan artifact).
|
|
138
157
|
|
|
158
|
+
### Walkthrough — Wave 1 with 3 tasks
|
|
159
|
+
|
|
160
|
+
The example below is **illustrative only** — do not copy the command names blindly, match them to your stack.
|
|
161
|
+
|
|
162
|
+
Assume Wave 1 from the plan artifact contains three tasks:
|
|
163
|
+
|
|
164
|
+
| Task ID | Description | AC | Verification |
|
|
165
|
+
|---|---|---|---|
|
|
166
|
+
| T-1 \`[~3m]\` | Add \`User.emailNormalized\` column | AC-1 | \`npm test -- users/schema\` |
|
|
167
|
+
| T-2 \`[~4m]\` | Normalize on write in \`UserRepo.save\` | AC-1 | \`npm test -- users/repo\` |
|
|
168
|
+
| T-3 \`[~3m]\` | Reject duplicates in \`UserService.signup\` | AC-2 | \`npm test -- users/service\` |
|
|
169
|
+
|
|
170
|
+
**Execution transcript** (one slice at a time, evidence captured per step):
|
|
171
|
+
|
|
172
|
+
**T-1 — RED**
|
|
173
|
+
|
|
174
|
+
> Run: \`npm test -- users/schema\` → **FAIL** (missing column: \`emailNormalized\`). Captured the failure stack as RED evidence. No production code touched yet.
|
|
175
|
+
|
|
176
|
+
**T-1 — GREEN**
|
|
177
|
+
|
|
178
|
+
> Added the column in the schema module. Re-ran \`npm test -- users/schema\` → **PASS**. Ran the full suite \`npm test\` → **PASS**. Captured both outputs as GREEN evidence.
|
|
179
|
+
|
|
180
|
+
**T-1 — REFACTOR**
|
|
181
|
+
|
|
182
|
+
> Extracted the column definition into a shared \`NormalizedEmail\` type used by T-2/T-3. Re-ran \`npm test\` → **PASS**. Captured REFACTOR note: "Extracted NormalizedEmail type to keep T-2/T-3 DRY; zero behavior change, all tests still green."
|
|
183
|
+
|
|
184
|
+
**T-2 — RED / GREEN / REFACTOR**: same shape — write the repo test that expects normalised writes, watch it fail (RED), implement normalisation inside \`UserRepo.save\` only (GREEN), then refactor the normaliser out of the repo into a helper shared with T-3 (REFACTOR).
|
|
185
|
+
|
|
186
|
+
**T-3 — RED / GREEN / REFACTOR**: write the service-level duplicate test that expects a rejection, watch it fail (RED), add the duplicate check in \`UserService.signup\` (GREEN), refactor the error message into a named constant (REFACTOR).
|
|
187
|
+
|
|
188
|
+
**Wave gate check**
|
|
189
|
+
|
|
190
|
+
After T-3 REFACTOR, before declaring Wave 1 done:
|
|
191
|
+
|
|
192
|
+
1. Run the **full suite** (\`npm test\`) one final time → **PASS** captured as wave-exit evidence.
|
|
193
|
+
2. Verify the TDD artifact contains RED, GREEN, and REFACTOR evidence for T-1, T-2, **and** T-3. No partial waves.
|
|
194
|
+
3. Only now mark Wave 1 complete. Wave 2 cannot start until this step.
|
|
195
|
+
|
|
196
|
+
**When to stop mid-wave (do NOT push through)**
|
|
197
|
+
|
|
198
|
+
- A RED test fails for a reason you did not predict (e.g. an unrelated flaky test) → **pause**, diagnose, log an operational-self-improvement entry, and decide with the user before proceeding.
|
|
199
|
+
- A GREEN step would require touching code outside the task's acceptance criterion → **pause**, the task is scoped wrong; adjust the plan or open a follow-up task.
|
|
200
|
+
- The same RED failure reappears after a GREEN change → **escalate** per the 3-attempts rule; do not keep patching.
|
|
201
|
+
|
|
139
202
|
`;
|
|
140
203
|
}
|
|
141
204
|
function stageCompletionProtocol(schema) {
|
|
142
205
|
const stage = schema.stage;
|
|
143
206
|
const gateIds = schema.requiredGates.map((g) => g.id);
|
|
144
207
|
const gateList = gateIds.map((id) => `\`${id}\``).join(", ");
|
|
145
|
-
const nextStage = schema.next === "done" ?
|
|
208
|
+
const nextStage = schema.next === "done" ? "done" : schema.next;
|
|
146
209
|
const mandatory = schema.mandatoryDelegations;
|
|
147
|
-
const
|
|
148
|
-
const
|
|
149
|
-
?
|
|
150
|
-
|
|
151
|
-
|
|
152
|
-
|
|
153
|
-
|
|
154
|
-
- Move all gate IDs for this stage (${gateList}) into \`stageGateCatalog.${stage}.passed\`
|
|
155
|
-
- Clear \`stageGateCatalog.${stage}.blocked\``;
|
|
156
|
-
const delegationBlock = mandatory.length > 0
|
|
157
|
-
? `0. **Delegation pre-flight** (BLOCKING):
|
|
158
|
-
- Mandatory agents for this stage: ${mandatory.map((a) => `\`${a}\``).join(", ")}.
|
|
159
|
-
- For each mandatory agent: confirm it was dispatched (via Task/delegate) and completed, OR record an explicit waiver with reason in \`${delegationLogRel}\`.
|
|
160
|
-
- Write a JSON entry per agent: \`{ "stage": "${stage}", "agent": "<name>", "mode": "mandatory", "status": "completed"|"waived", "waiverReason": "<if waived>", "ts": "<ISO timestamp>" }\`.
|
|
161
|
-
- If the harness does not support delegation, record status \`"waived"\` with reason \`"harness_limitation"\`.
|
|
162
|
-
- **Do NOT proceed to step 1 until every mandatory agent has an entry in the delegation log.**
|
|
163
|
-
`
|
|
164
|
-
: "";
|
|
165
|
-
let nextAction;
|
|
166
|
-
if (nextStage) {
|
|
167
|
-
const nextSchema = stageSchema(nextStage);
|
|
168
|
-
const nextDescription = nextSchema.skillDescription.charAt(0).toLowerCase() + nextSchema.skillDescription.slice(1);
|
|
169
|
-
nextAction = `4. Tell the user:\n\n > **Stage \`${stage}\` complete.** Next: **${nextStage}** — ${nextDescription}\n >\n > Run \`/cc-next\` to continue.`;
|
|
170
|
-
}
|
|
171
|
-
else {
|
|
172
|
-
nextAction = `4. Tell the user:\n\n > **Flow complete.** All stages finished. The project is ready for release.`;
|
|
173
|
-
}
|
|
210
|
+
const mandatoryList = mandatory.length > 0 ? mandatory.map((a) => `\`${a}\``).join(", ") : "none";
|
|
211
|
+
const nextDescription = schema.next === "done"
|
|
212
|
+
? "flow complete — release cut and handoff signed off"
|
|
213
|
+
: (() => {
|
|
214
|
+
const nextSchema = stageSchema(schema.next);
|
|
215
|
+
return nextSchema.skillDescription.charAt(0).toLowerCase() + nextSchema.skillDescription.slice(1);
|
|
216
|
+
})();
|
|
174
217
|
return `## Stage Completion Protocol
|
|
175
218
|
|
|
176
|
-
|
|
219
|
+
Apply the **Shared Stage Completion Protocol** from \`.cclaw/skills/using-cclaw/SKILL.md\` with these parameters — do NOT re-derive the generic steps here.
|
|
177
220
|
|
|
178
|
-
|
|
179
|
-
|
|
180
|
-
|
|
181
|
-
|
|
182
|
-
|
|
183
|
-
${
|
|
221
|
+
**Completion Parameters**
|
|
222
|
+
- \`stage\` — \`${stage}\`
|
|
223
|
+
- \`next\` — \`${nextStage}\` (${nextDescription})
|
|
224
|
+
- \`gates\` — ${gateList}
|
|
225
|
+
- \`artifact\` — \`${RUNTIME_ROOT}/artifacts/${schema.artifactFile}\`
|
|
226
|
+
- \`mandatory\` — ${mandatoryList}
|
|
184
227
|
|
|
185
|
-
|
|
228
|
+
When all required gates are satisfied and the artifact is written, execute the shared procedure (delegation pre-flight → flow-state update → artifact persistence → \`npx cclaw doctor\` → user handoff → STOP) using the parameters above. If any check fails, resolve the issue and re-run before proceeding.
|
|
186
229
|
|
|
187
230
|
## Resume Protocol
|
|
188
231
|
|
|
189
|
-
When resuming
|
|
190
|
-
1. Read the existing artifact and check which gates can be verified from artifact evidence.
|
|
191
|
-
2. For each unverified gate, ask the user to confirm ONE gate at a time. Do NOT batch multiple gate confirmations in a single message.
|
|
192
|
-
3. Update \`guardEvidence\` for each confirmed gate before proceeding.
|
|
232
|
+
When resuming this stage in a NEW session (artifact exists but not all of ${gateList} are passed), follow the **Shared Resume Protocol** in \`.cclaw/skills/using-cclaw/SKILL.md\` — confirm one gate at a time, update \`guardEvidence\` for each, never batch confirmations.
|
|
193
233
|
`;
|
|
194
234
|
}
|
|
195
235
|
function stageTransitionAutoAdvanceBlock(schema) {
|
|
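The wave gate check in the walkthrough above (every task in the wave needs RED, GREEN, and REFACTOR evidence before the wave closes) could be sketched roughly as follows. This is only an illustration, not cclaw's implementation: the `waveGateCheck` name and the shape of the `evidence` record are assumptions made for this example.

```javascript
// Hypothetical evidence shape (assumption, not from the package):
// { "T-1": { RED: "<captured output>", GREEN: "...", REFACTOR: "..." }, ... }
const REQUIRED_PHASES = ["RED", "GREEN", "REFACTOR"];

// Returns which task/phase pairs still lack evidence.
// The wave counts as complete only when nothing is missing ("no partial waves").
function waveGateCheck(taskIds, evidence) {
  const missing = [];
  for (const taskId of taskIds) {
    const taskEvidence = evidence[taskId] ?? {};
    for (const phase of REQUIRED_PHASES) {
      if (!taskEvidence[phase]) {
        missing.push(`${taskId}:${phase}`);
      }
    }
  }
  return { complete: missing.length === 0, missing };
}
```

A caller would pass the task IDs of the current wave (for example `["T-1", "T-2", "T-3"]`) and refuse to start the next wave while `missing` is non-empty.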
@@ -344,15 +384,15 @@ You MUST complete these steps in order:
 
 ${checklistItems}
 
+${stageGoodBadExamples(stage)}
+${stageDomainExamples(stage)}
 ${stageExamples(stage)}
 ${namedAntiPatternBlock(stage)}
 ${cognitivePatternsList(stage)}
 ## Interaction Protocol
 ${schema.interactionProtocol.map((item, i) => `${i + 1}. ${item}`).join("\n")}
 
-
-
-${ERROR_BUDGET_SPEC}
+**See \`.cclaw/skills/using-cclaw/SKILL.md\` "Shared Decision + Tool-Use Protocol"** for the full AskUserQuestion format, error/retry budget, and the 3-attempt escalation rule. Do not duplicate those rules here — apply them verbatim.
 
 ${waveExecutionModeBlock(stage)}
 ## Required Gates
@@ -368,15 +408,13 @@ ${reviewSectionsBlock(stage)}
 ${verificationBlock(stage)}
 ${crossStageTraceBlock(stage)}
 ${artifactValidationBlock(stage)}
+${visualCommunicationBlock(stage)}
 ${decisionRecordBlock(stage)}
 ## Common Rationalizations
 ${rationalizationTable(stage)}
 
-## Blockers
-${schema.blockers.length > 0 ? schema.blockers.map((item) => `- ${item}`).join("\n") : "- None — stage can always proceed"}
-
 ## Anti-Patterns
-${schema.antiPatterns.map((item) => `- ${item}`).join("\n")}
+${[...schema.antiPatterns, ...schema.blockers].map((item) => `- ${item}`).join("\n")}
 
 ## Red Flags
 ${schema.redFlags.map((item) => `- ${item}`).join("\n")}
@@ -389,6 +427,25 @@ ${stageTransitionAutoAdvanceBlock(schema)}
|
|
|
389
427
|
${progressiveDisclosureBlock(stage)}
|
|
390
428
|
${selfImprovementBlock(stage)}
|
|
391
429
|
## Handoff
|
|
430
|
+
|
|
431
|
+
Before closing the stage, announce the handoff explicitly so the user can steer. Use the **Handoff Menu** below; never auto-advance silently, even when \`/cc-next\` is available.
|
|
432
|
+
|
|
433
|
+
### Handoff Menu
|
|
434
|
+
|
|
435
|
+
Offer the user a lettered choice at the end of the stage (use \`AskUserQuestion\` / \`AskQuestion\` when the harness supports it, otherwise plain lettered text):
|
|
436
|
+
|
|
437
|
+
- **A) Advance** — run \`/cc-next\` and continue to the next stage. Default when all gates are satisfied and there are no open concerns.
|
|
438
|
+
- **B) Revise this stage** — stay on the current stage; apply the user's feedback, then re-ask for handoff.
|
|
439
|
+
- **C) Pause / park** — save state; stop here. Useful when the user wants to share the artifact with a human reviewer before continuing.
|
|
440
|
+
- **D) Rewind** — move to a prior stage (user names which). Use when downstream work revealed that an earlier stage was wrong.
|
|
441
|
+
- **E) Abandon** — mark the flow as cancelled; no further stages will run. Artifacts remain on disk.
|
|
442
|
+
|
|
443
|
+
Recommendation rules:
|
|
444
|
+
- If all required gates are satisfied AND the stage's completion status is \`DONE\`, recommend **A (Advance)**.
|
|
445
|
+
- If completion status is \`DONE_WITH_CONCERNS\`, recommend **B (Revise)** and name the concern.
|
|
446
|
+
- If completion status is \`BLOCKED\`, recommend **B (Revise)** or **C (Pause)** depending on whether the blocker is internal or external.
|
|
447
|
+
|
|
448
|
+
Reference data for the user:
|
|
392
449
|
- Next command: \`/cc-next\` (loads whatever stage is current in flow-state)
|
|
393
450
|
- Required artifact: \`.cclaw/artifacts/${schema.artifactFile}\`
|
|
394
451
|
- Stage stays blocked if any required gate is unsatisfied
|
|
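The recommendation rules in the Handoff Menu hunk above can be read as a small decision function. The sketch below is illustrative only; `recommendHandoff` and its parameter names are hypothetical. The added text says only that `BLOCKED` leads to B or C "depending on whether the blocker is internal or external", so mapping internal to B (Revise) and external to C (Pause) is an assumption, as is the fallback for any other state.

```javascript
// Hypothetical helper (not part of the package): maps a stage's completion
// status to the lettered option the Handoff Menu would recommend.
function recommendHandoff({ status, gatesSatisfied, blockerIsExternal = false }) {
  if (status === "DONE" && gatesSatisfied) {
    return "A"; // Advance
  }
  if (status === "DONE_WITH_CONCERNS") {
    return "B"; // Revise, and name the concern
  }
  if (status === "BLOCKED") {
    // Assumption: an external blocker means waiting (Pause), an internal one
    // means the stage itself needs more work (Revise).
    return blockerIsExternal ? "C" : "B";
  }
  // Assumption: anything else (e.g. DONE with unsatisfied gates) stays on the stage.
  return "B";
}
```

Options D (Rewind) and E (Abandon) are never recommended by these rules; they remain choices the user can make.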
@@ -195,7 +195,7 @@ const SCOPE = {
|
|
|
195
195
|
"**Error and Rescue Registry** — For each capability: what breaks, how detected, what fallback."
|
|
196
196
|
],
|
|
197
197
|
interactionProtocol: [
|
|
198
|
-
"For scope mode selection: use the Decision Protocol — present expand/selective/hold/reduce as labeled options with trade-offs and mark one as (recommended).
|
|
198
|
+
"For scope mode selection: use the Decision Protocol — present expand/selective/hold/reduce as labeled options with trade-offs and mark one as (recommended). Do NOT use a numeric Completeness rubric; recommend the option that best covers the prime-directive failure modes, four data-flow paths, observability, and deferred handling for the in-scope set with the smallest blast radius. Base your recommendation on default heuristics: greenfield -> expand, enhancement -> selective, bugfix/hotfix/refactor -> hold, broad blast radius -> reduce. If AskQuestion/AskUserQuestion is available, send exactly ONE question per call, validate fields against runtime schema, and on schema error immediately fall back to plain-text question instead of retrying guessed payloads.",
|
|
199
199
|
"Walk through the scope checklist interactively. Each checklist item that surfaces a decision should be presented to the user as a question, not as a monologue. Do not dump all items at once.",
|
|
200
200
|
"Challenge premise and verify the problem framing before anything else.",
|
|
201
201
|
"Take a position on every scope decision. Avoid hedging phrases like 'this could work' or 'there are many ways'; state your recommendation and one concrete condition that would change it.",
|
|
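The default heuristics named in the added scope-mode line (greenfield -> expand, enhancement -> selective, bugfix/hotfix/refactor -> hold, broad blast radius -> reduce) amount to a lookup table. A minimal sketch follows; the function name, the `workType` labels as keys, and the choice to let a broad blast radius override the work type are assumptions for illustration, not behavior confirmed by the package.

```javascript
// Hypothetical lookup of the default scope-mode heuristics.
const DEFAULT_SCOPE_MODE = {
  greenfield: "expand",
  enhancement: "selective",
  bugfix: "hold",
  hotfix: "hold",
  refactor: "hold",
};

// Returns the default recommendation, or null when the work type is unknown.
// Assumption: a broad blast radius takes precedence over the work type.
function recommendScopeMode(workType, broadBlastRadius = false) {
  if (broadBlastRadius) {
    return "reduce";
  }
  return DEFAULT_SCOPE_MODE[workType] ?? null;
}
```

The result is only a starting recommendation; the protocol still requires presenting all four modes with trade-offs and letting the user decide.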
@@ -350,6 +350,7 @@ const SCOPE = {
 artifactValidation: [
 { section: "Prime Directives", required: true, validationRule: "For each scoped capability: named failure modes, explicit error surface, four data-flow paths, interaction edge cases, observability expectations, and deferred-item handling." },
 { section: "Premise Challenge", required: true, validationRule: "Must contain explicit answers to: right problem? direct path? what if nothing?" },
+{ section: "Requirements", required: true, validationRule: "Table of stable requirement IDs (R1, R2, R3…) one per row with observable outcome, priority, and source. IDs are assigned once and never renumbered across scope/design/spec/plan/review; dropped requirements stay with Priority `DROPPED`." },
 { section: "Implementation Alternatives", required: true, validationRule: "2-3 options with Name, Summary, Effort, Risk, Pros, Cons, and Reuses. Must include minimal viable and ideal architecture options." },
 { section: "Scope Mode", required: true, validationRule: "Must state selected mode and rationale with default heuristic justification." },
 { section: "Mode-Specific Analysis", required: true, validationRule: "Must document the analysis matching the selected scope mode: EXPAND (10x and delight opportunities), SELECTIVE (hold-scope baseline then cherry-picked expansions), HOLD (minimum-change-set hardening), REDUCE (ruthless cuts and follow-up split)." },
@@ -393,7 +394,7 @@ const DESIGN = {
 "Codebase Investigation — Before any design decision, read the actual code in the blast radius. List every file that will be touched, its current responsibilities, and existing patterns (error handling, naming, test style). Design must conform to discovered patterns, not impose new ones without justification.",
 "Step 0: Scope Challenge — what existing code solves sub-problems? Minimum change set? Complexity check: 8+ files or 2+ new services = complexity smell → flag for possible scope reduction.",
 "Search Before Building — For each technical choice (library, pattern, architecture), search for existing solutions. Label findings: Layer 1 (exact match), Layer 2 (partial match, needs adaptation), Layer 3 (inspiration only), EUREKA (unexpected perfect solution). Default to existing before custom.",
-"Architecture Review — system design, component boundaries, data flow, scaling, security architecture. For each new codepath: one realistic production failure scenario. **Mandatory:** produce at least one architecture diagram (ASCII, Mermaid, or tool-generated) showing component boundaries and data flow direction.",
+"Architecture Review — system design, component boundaries, data flow, scaling, security architecture. For each new codepath: one realistic production failure scenario. **Mandatory:** produce at least one architecture diagram (ASCII, Mermaid, or tool-generated) showing component boundaries and data flow direction. Apply the **Visual Communication rules** (see below) — an unlabeled or generic diagram is worse than no diagram, because it pretends to encode decisions it does not.",
 "Code Quality Review — code organization, DRY violations, error handling patterns, over/under-engineering assessment.",
 "Test Review — diagram every new flow, data path, error path. For each: what test type covers it? Does one exist? What is the gap? Produce test plan artifact.",
 "Performance Review — N+1 queries, memory concerns, caching opportunities, slow code paths. What breaks at 10x load? At 100x?",
@@ -405,7 +406,7 @@ const DESIGN = {
 interactionProtocol: [
 "Review architecture decisions section-by-section.",
 "For EACH issue found in a review section, present it ONE AT A TIME. Do NOT batch multiple issues.",
-"For each issue: use the Decision Protocol — describe concretely with file/line references, present labeled options (A/B/C) with trade-offs, effort estimate (S/M/L/XL), risk level (Low/Med/High),
+"For each issue: use the Decision Protocol — describe concretely with file/line references, present labeled options (A/B/C) with trade-offs, effort estimate (S/M/L/XL), risk level (Low/Med/High), and mark one as (recommended). Do NOT use a numeric Completeness rubric; recommend the option that best covers architecture, data-flow, failure-modes, test, and perf review concerns for the issue with the lowest risk. If AskQuestion/AskUserQuestion is available, send exactly ONE question per call, validate fields against runtime schema, and on schema error immediately fall back to plain-text question instead of retrying guessed payloads.",
 "Only proceed to the next review section after ALL issues in the current section are resolved.",
 "If a section has no issues, say 'No issues found' and move on.",
 "Do not skip failure-mode mapping.",
@@ -583,7 +584,7 @@ const DESIGN = {
 { section: "Codebase Investigation", required: true, validationRule: "Must list blast-radius files with current responsibilities and discovered patterns." },
 { section: "Search Before Building", required: true, validationRule: "For each technical choice: Layer 1 (exact match), Layer 2 (partial match), Layer 3 (inspiration), EUREKA labels with reuse-first default." },
 { section: "Architecture Boundaries", required: true, validationRule: "Must list component boundaries with ownership." },
-{ section: "Architecture Diagram", required: true, validationRule: "At least one diagram (ASCII, Mermaid, or image) showing component boundaries and data flow direction." },
+{ section: "Architecture Diagram", required: true, validationRule: "At least one diagram (ASCII, Mermaid, or image) showing component boundaries and data flow direction. Diagram must: (1) label every node with a concrete component name (no generic 'Service A/B'), (2) label every arrow with the action or message (no unlabeled arrows), (3) mark direction of data flow explicitly, (4) distinguish synchronous from asynchronous edges (e.g. solid vs dashed, or `sync:` / `async:` prefix), (5) show at least one failure edge or degraded-mode branch when the system has one." },
 { section: "Data Flow", required: true, validationRule: "Must include happy path, nil input, empty input, upstream error paths." },
 { section: "Failure Mode Table", required: true, validationRule: "Each failure mode has: trigger, detection, mitigation, user impact." },
 { section: "Test Strategy", required: true, validationRule: "Must define unit/integration/e2e expectations with coverage targets." },
@@ -748,7 +749,7 @@ const SPEC = {
 traceabilityRule: "Every acceptance criterion must trace to a design decision. Every downstream plan task must trace to a spec criterion."
 },
 artifactValidation: [
-{ section: "Acceptance Criteria", required: true, validationRule: "Each criterion is observable, measurable, and falsifiable. Table
+{ section: "Acceptance Criteria", required: true, validationRule: "Each criterion is observable, measurable, and falsifiable. Table must include a Requirement Ref column linking to R# IDs in 02-scope.md and a Design Decision Ref column tracing back to design artifact. AC IDs (AC-1, AC-2…) are stable across revisions — dropped ACs stay with Priority `DROPPED`." },
 { section: "Edge Cases", required: true, validationRule: "At least one boundary and one error condition per criterion." },
 { section: "Constraints and Assumptions", required: true, validationRule: "All implicit assumptions surfaced. Constraints have sources." },
 { section: "Testability Map", required: true, validationRule: "Each criterion maps to a concrete test description with verification approach (unit, integration, e2e, manual) and command or manual steps." },
@@ -864,6 +865,8 @@ const PLAN = {
 cognitivePatterns: [
 { name: "Vertical Slice Thinking", description: "Each task delivers one thin end-to-end slice of value. Horizontal layers (all models, then all controllers) create integration risk. Vertical slices (one feature through all layers) reduce it." },
 { name: "Two-Minute Smell Test", description: "If a competent engineer cannot understand and start a task in two minutes, the task is too large or too vague. Break it down further." },
+{ name: "Five-Minute Budget (hard)", description: "Every plan step MUST fit a 2-to-5-minute execution budget on a competent implementer. If a step plausibly takes longer, it is two steps pretending to be one — split it. Measure by 'keyboard minutes on this slice', not by wall clock. Write the estimated minutes next to each task (e.g. `[~3m]`); when a TDD slice later consumes >2× the estimate, log an operational-self-improvement entry so future plans calibrate better." },
+{ name: "No Placeholders", description: "Plan text must be copy-pasteable. Forbidden tokens anywhere in the artifact: `TODO`, `TBD`, `FIXME`, `<fill-in>`, `<your-*-here>`, `xxx`, `...` (as ellipsis for omitted content — real commands use real args). Every acceptance-criterion link, file path, test command, and verification command must be concrete and runnable as written. A placeholder is a deferred decision masquerading as a plan; decide it now or remove the task." },
 { name: "Make the Change Easy, Then Make the Easy Change", description: "Refactor first, implement second. Never structural + behavioral changes simultaneously. Sequence tasks accordingly." },
 { name: "Diagnose Before Fix", description: "Before decomposing work, understand the current state of the codebase. Read existing code, tests, and conventions. Tasks should reference what exists, not assume a blank slate." },
 { name: "Scrap Signals", description: "If a task description is vague, the acceptance criterion is missing, or the verification command is a placeholder — it is scrap. Either rewrite it or remove it. Half-specified tasks waste more time than no tasks." },
@@ -891,6 +894,16 @@ const PLAN = {
 "Are there hidden dependencies between tasks in different waves?"
 ],
 stopGate: true
+},
+{
+title: "Five-Minute Budget + No-Placeholders Audit",
+evaluationPoints: [
+"Does every task carry an explicit minutes estimate (e.g. `[~3m]`) and does every estimate fit the 2-to-5-minute budget? Estimates >5 minutes must be split.",
+"Are all file paths, test commands, and verification commands copy-pasteable as written — no `TODO`, `TBD`, `FIXME`, `<fill-in>`, `<your-*-here>`, `xxx`, or ellipsis standing in for omitted args?",
+"Does every acceptance-criterion reference resolve to a real R# / AC-### in the spec (not a blank link)?",
+"If an estimate is genuinely uncertain (first-time integration, unfamiliar library), is the uncertainty named explicitly and scheduled as a spike task in wave 0, rather than hidden behind a large estimate?"
+],
+stopGate: true
 }
 ],
 completionStatus: ["DONE", "DONE_WITH_CONCERNS", "BLOCKED"],
@@ -902,11 +915,12 @@ const PLAN = {
|
|
|
902
915
|
artifactValidation: [
|
|
903
916
|
{ section: "Dependency Graph", required: true, validationRule: "Ordering and parallel opportunities explicit. No circular dependencies." },
|
|
904
917
|
{ section: "Dependency Waves", required: true, validationRule: "Every task belongs to a wave. Each wave has an exit gate and dependency statement." },
|
|
905
|
-
{ section: "Task List", required: true, validationRule: "Each task
|
|
918
|
+
{ section: "Task List", required: true, validationRule: "Each task row includes ID, description, acceptance criterion, verification command, and effort estimate (S/M/L). Every task must also carry a minutes estimate within the 2-5 minute budget." },
|
|
906
919
|
{ section: "Acceptance Mapping", required: true, validationRule: "Every spec criterion is covered by at least one task." },
|
|
907
920
|
{ section: "Risk Assessment", required: false, validationRule: "If present: per-task or per-wave risk identification with likelihood, impact, and mitigation strategy." },
|
|
908
921
|
{ section: "Boundary Map", required: false, validationRule: "If present: per-wave or per-task interface contracts listing what each task produces (exports) and consumes (imports) from other tasks." },
|
|
909
|
-
{ section: "WAIT_FOR_CONFIRM", required: true, validationRule: "Explicit marker present. Status: pending until user approves." }
|
|
922
|
+
{ section: "WAIT_FOR_CONFIRM", required: true, validationRule: "Explicit marker present. Status: pending until user approves." },
|
|
923
|
+
{ section: "No-Placeholder Scan", required: false, validationRule: "If present: confirmation that a text scan for `TODO`, `TBD`, `FIXME`, `<fill-in>`, `<your-*-here>`, `xxx`, or bare ellipses has zero hits in the task list. A placeholder is a deferred decision masquerading as a plan." }
|
|
910
924
|
],
|
|
911
925
|
namedAntiPattern: {
|
|
912
926
|
title: "Task Details Can Be Finalized During Coding",
|
|
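The `No-Placeholder Scan` rule added to the PLAN `artifactValidation` list in this hunk describes a plain text scan, so it could be checked mechanically. The following is a minimal sketch, assuming the task list is available as a string. `findPlaceholders`, `PLACEHOLDER_PATTERNS`, and the exact regular expressions are hypothetical and not part of cclaw; in particular, how `xxx` and "bare ellipses" should be matched is an interpretation of the rule text.

```javascript
// Hypothetical helper, not part of cclaw: scans task-list text for the
// placeholder markers named in the "No-Placeholder Scan" validation rule.
const PLACEHOLDER_PATTERNS = [
  { name: "TODO", pattern: /\bTODO\b/ },
  { name: "TBD", pattern: /\bTBD\b/ },
  { name: "FIXME", pattern: /\bFIXME\b/ },
  { name: "<fill-in>", pattern: /<fill-in>/i },
  { name: "<your-*-here>", pattern: /<your-[^>]*-here>/i },
  { name: "xxx", pattern: /\bxxx\b/i },
  // Assumption: a "bare ellipsis" is three dots or the single character U+2026.
  { name: "ellipsis", pattern: /\.{3}|\u2026/ },
];

// Returns one entry per (line, marker) pair; an empty array means zero hits.
function findPlaceholders(taskListText) {
  const hits = [];
  taskListText.split("\n").forEach((line, index) => {
    for (const { name, pattern } of PLACEHOLDER_PATTERNS) {
      if (pattern.test(line)) {
        hits.push({ line: index + 1, marker: name });
      }
    }
  });
  return hits;
}
```

The patterns deliberately carry no `g` flag, so `test` keeps no state between lines.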
@@ -1037,7 +1051,12 @@ const TDD = {
|
|
|
1037
1051
|
{ name: "Regression Paranoia", description: "Assume every change breaks something until the full suite proves otherwise. Partial test runs are lies of omission." },
|
|
1038
1052
|
{ name: "Refactor-as-Hygiene", description: "Refactoring is not optional cleanup — it is the third leg of TDD. GREEN without REFACTOR accumulates mess. REFACTOR without GREEN breaks things." },
|
|
1039
1053
|
{ name: "Evidence Over Anecdote", description: "Every claim about test state must be backed by captured output. 'It passed' without terminal evidence is not evidence. 'I saw it fail' without the failure output is not RED. Capture commands, outputs, and results — not summaries from memory." },
|
|
1040
|
-
{ name: "Characterization First", description: "Before changing existing behavior, write characterization tests that capture current behavior as-is. These tests document what the system does today — even if that behavior is wrong. Only after the characterization suite is green do you add the new RED test for the desired change. This prevents accidental behavior destruction during refactoring." }
|
|
1054
|
+
{ name: "Characterization First", description: "Before changing existing behavior, write characterization tests that capture current behavior as-is. These tests document what the system does today — even if that behavior is wrong. Only after the characterization suite is green do you add the new RED test for the desired change. This prevents accidental behavior destruction during refactoring." },
|
|
1055
|
+
{ name: "Test Pyramid Shape", description: "Healthy test suites look like a pyramid: many small fast tests at the base, fewer medium integration tests in the middle, few large end-to-end tests at the top. Each layer catches a different class of bug; none of them substitutes for another. If your suite is top-heavy (mostly E2E) it is slow and flaky; if it is base-only it misses integration contracts. During TDD, default to the smallest layer that can prove the behavior." },
|
|
1056
|
+
{ name: "Prove-It Pattern (bug fixes)", description: "For any reported regression or hotfix, the FIRST test is a reproduction — it must fail without your fix, pass with your fix, and fail again if the fix is reverted. This is the only way to prove you fixed the reported bug and not a superficially similar one. Skipping this step is how bugs come back two releases later wearing a different name." },
|
|
1057
|
+
{ name: "Test Size Model", description: "Size tests by scope, not by name: Small = pure logic, no I/O, <50ms; Medium = one process boundary, possibly filesystem or an in-memory DB; Large = multi-process / network / real external service. Small tests are the default; escalate to Medium only when a real boundary must be exercised, and to Large only for end-to-end user journeys. Record the size class in the TDD artifact so reviewers can sanity-check the pyramid shape." },
|
|
1058
|
+
{ name: "State Over Interaction", description: "Assert on observable outcomes (return values, state changes, persisted data, HTTP responses) — NOT on which helper methods were called, how many times, or in what order. Interaction-style assertions (`expect(mock.foo).toHaveBeenCalledWith(...)` without a state assertion) couple tests to implementation and shatter under harmless refactors. Use mocks only at trust boundaries (network, filesystem, time); for everything inside the module, let state do the asserting. If you cannot observe the outcome without a mock-spy, rework the seam before writing the test." },
|
|
1059
|
+
{ name: "Beyoncé Rule", description: "If you liked it, you should have put a test on it. Every surface that a caller can observe — public API, CLI flag, config key, exit code, persisted schema — is a contract, and every contract without a test is a silent regression waiting to happen. When a bug or production incident reveals an uncovered surface, the fix is never 'patch the code'; it is 'patch the code AND add the test that would have caught it'. Untested behavior does not exist for future refactors — it only exists until somebody accidentally removes it." }
|
|
1041
1060
|
],
|
|
1042
1061
|
reviewSections: [
|
|
1043
1062
|
{
|
|
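The `Test Size Model` principle added in this hunk sizes tests by scope rather than by name. A minimal sketch of that classification follows. The function name `classifyTestSize` and the trait names (`processCount`, `usesNetwork`, and so on) are illustrative assumptions, and treating a slow test without I/O as Medium is a choice made here, not something the source states.

```javascript
// Hypothetical sketch of the "Test Size Model": size by scope, not by name.
// All trait names are assumptions introduced for illustration.
function classifyTestSize(traits) {
  const {
    processCount = 1,
    usesNetwork = false,
    usesExternalService = false,
    usesFilesystem = false,
    usesInMemoryDb = false,
    durationMs = 0,
  } = traits;

  // Large: multi-process, network, or a real external service.
  if (processCount > 1 || usesNetwork || usesExternalService) return "Large";
  // Medium: a single boundary such as the filesystem or an in-memory DB.
  if (usesFilesystem || usesInMemoryDb) return "Medium";
  // Small: pure logic, no I/O, under 50 ms.
  if (durationMs < 50) return "Small";
  // Assumption: a slow test with no I/O is reported as Medium so it gets reviewed.
  return "Medium";
}
```

Recording the returned class next to each test in the TDD artifact would give reviewers the data the principle asks for.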
@@ -1061,6 +1080,37 @@ const TDD = {
|
|
|
1061
1080
|
"Is traceability complete: every change links to plan task ID and spec criterion?"
|
|
1062
1081
|
],
|
|
1063
1082
|
stopGate: true
|
|
1083
|
+
},
|
|
1084
|
+
{
|
|
1085
|
+
title: "Test Pyramid + Size Audit",
|
|
1086
|
+
evaluationPoints: [
|
|
1087
|
+
"Is the tests-added count skewed toward Small (unit) tests, with Medium and Large used only when a real boundary justifies the cost?",
|
|
1088
|
+
"Does every newly added test declare a size class (Small / Medium / Large) — either inline in the test file or in the TDD artifact table?",
|
|
1089
|
+
"Are Large tests reserved for genuine end-to-end user journeys (not substitutes for unit coverage)?",
|
|
1090
|
+
"Has the slice avoided using Medium/Large tests to paper over testability problems that should be fixed at the design layer?"
|
|
1091
|
+
],
|
|
1092
|
+
stopGate: false
|
|
1093
|
+
},
|
|
1094
|
+
{
|
|
1095
|
+
title: "Prove-It Reproduction (bug-fix slices)",
|
|
1096
|
+
evaluationPoints: [
|
|
1097
|
+
"Does the artifact identify this slice as a bug fix, and if so, include a reproduction test checked in alongside the fix?",
|
|
1098
|
+
"Is there captured RED evidence from running the reproduction WITHOUT the fix applied?",
|
|
1099
|
+
"Is there captured GREEN evidence from the same reproduction AFTER the fix was applied?",
|
|
1100
|
+
"Is there a note confirming the reproduction test fails again if the fix is reverted (or equivalent evidence that the test is actually pinned to this fix)?"
|
|
1101
|
+
],
|
|
1102
|
+
stopGate: false
|
|
1103
|
+
},
|
|
1104
|
+
{
|
|
1105
|
+
title: "State-over-Interaction + Beyoncé Coverage",
|
|
1106
|
+
evaluationPoints: [
|
|
1107
|
+
"Do assertions target observable state (return values, persisted data, HTTP responses, logs) rather than which internal helpers were called?",
|
|
1108
|
+
"Are mocks/spies used only at true trust boundaries (network, filesystem, time, external services), not for module-internal collaborators?",
|
|
1109
|
+
"For every public surface touched in this slice (exported API, CLI flag, config key, env var, exit code, schema field) — does at least one test observe it?",
|
|
1110
|
+
"If a bug or review finding revealed an uncovered surface, was a test added alongside the fix, not just the code change?",
|
|
1111
|
+
"Are interaction-style assertions (e.g. `toHaveBeenCalledWith` without a state assertion) justified by an explicit boundary comment, or flagged for follow-up?"
|
|
1112
|
+
],
|
|
1113
|
+
stopGate: false
|
|
1064
1114
|
}
|
|
1065
1115
|
],
|
|
1066
1116
|
completionStatus: ["DONE", "DONE_WITH_CONCERNS", "BLOCKED"],
|
|
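The `Prove-It Reproduction (bug-fix slices)` review section added in this hunk asks for four pieces of evidence. The sketch below shows one way to check them, assuming the evidence has been collected into a plain object. `checkProveIt` and every field name on `evidence` are hypothetical; cclaw itself expresses these checks as reviewer questions, not as code.

```javascript
// Hypothetical evidence check for a bug-fix slice; all field names are assumptions.
function checkProveIt(evidence) {
  const problems = [];
  const red = evidence.redWithoutFix;
  const green = evidence.greenWithFix;

  if (!evidence.reproductionTest) {
    problems.push("missing reproduction test checked in alongside the fix");
  }
  // RED: the reproduction must have failed, with captured output, before the fix.
  if (!red || red.passed !== false || !red.output) {
    problems.push("missing captured RED evidence without the fix");
  }
  // GREEN: the same reproduction must pass, with captured output, after the fix.
  if (!green || green.passed !== true || !green.output) {
    problems.push("missing captured GREEN evidence with the fix");
  }
  if (!evidence.revertNote) {
    problems.push("missing note that the test fails again when the fix is reverted");
  }
  return { ok: problems.length === 0, problems };
}
```

Requiring non-empty `output` mirrors the "Evidence Over Anecdote" principle: a boolean alone is a summary from memory, not captured output.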
@@ -1077,7 +1127,9 @@ const TDD = {
|
|
|
1077
1127
|
{ section: "REFACTOR Notes", required: true, validationRule: "What changed, why, behavior preservation confirmed." },
|
|
1078
1128
|
{ section: "Traceability", required: true, validationRule: "Plan task ID and spec criterion linked." },
|
|
1079
1129
|
{ section: "Verification Ladder", required: false, validationRule: "If present: per-slice verification tier (static, command, behavioral, human) with evidence for highest tier reached." },
|
|
1080
|
-
{ section: "Coverage Targets", required: false, validationRule: "If present: per-module or per-code-type coverage thresholds with current values and measurement commands." }
|
|
1130
|
+
{ section: "Coverage Targets", required: false, validationRule: "If present: per-module or per-code-type coverage thresholds with current values and measurement commands." },
|
|
1131
|
+
{ section: "Test Pyramid Shape", required: false, validationRule: "If present: per-slice count of Small/Medium/Large tests added, to let reviewers verify the suite is not drifting top-heavy." },
|
|
1132
|
+
{ section: "Prove-It Reproduction", required: false, validationRule: "Required for bug-fix slices: original failing reproduction test (RED without fix), passing output with fix (GREEN), and a note confirming the test fails again if the fix is reverted." }
|
|
1081
1133
|
],
|
|
1082
1134
|
namedAntiPattern: {
|
|
1083
1135
|
title: "Code Before Failing Test",
|
|
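The optional `Test Pyramid Shape` artifact section added in this hunk is a per-slice count of Small/Medium/Large tests. A minimal sketch of producing that count is shown below. `pyramidShape` is a hypothetical name, and the definition of a healthy shape used here (each layer at least as large as the one above it) is an assumption; the source only says the suite should not drift top-heavy.

```javascript
// Hypothetical tally for the "Test Pyramid Shape" section.
function pyramidShape(sizes) {
  const counts = { Small: 0, Medium: 0, Large: 0 };
  for (const size of sizes) {
    if (!Object.prototype.hasOwnProperty.call(counts, size)) {
      throw new Error(`Unknown size class: ${size}`);
    }
    counts[size] += 1;
  }
  // Assumption: a pyramid means Small >= Medium >= Large for this slice.
  const isPyramid = counts.Small >= counts.Medium && counts.Medium >= counts.Large;
  return { ...counts, isPyramid };
}
```

A slice that fails this check is not necessarily wrong, but it is the kind of result the review section asks the author to justify.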
@@ -1125,7 +1177,7 @@ const REVIEW = {
|
|
|
1125
1177
|
"Run Layer 1 (spec compliance) completely before starting Layer 2.",
|
|
1126
1178
|
"In each review section, present findings ONE AT A TIME. Do NOT batch.",
|
|
1127
1179
|
"Classify every finding as Critical, Important, or Suggestion.",
|
|
1128
|
-
"For each Critical finding: use the Decision Protocol — present resolution options (A/B/C) with trade-offs,
|
|
1180
|
+
"For each Critical finding: use the Decision Protocol — present resolution options (A/B/C) with trade-offs, and mark one as (recommended). Do NOT use a numeric Completeness rubric; recommend the option that fully closes the finding with no carry-over risk and the smallest blast radius. If AskQuestion/AskUserQuestion is available, send exactly ONE question per call, validate fields against runtime schema, and on schema error immediately fall back to plain-text question instead of retrying guessed payloads.",
|
|
1129
1181
|
"Resolve all critical blockers before ship.",
|
|
1130
1182
|
"For final verdict: use AskQuestion/AskUserQuestion only if runtime schema is confirmed; otherwise collect verdict with a plain-text single-choice prompt (APPROVED / APPROVED_WITH_CONCERNS / BLOCKED).",
|
|
1131
1183
|
"**STOP.** Do NOT proceed to ship until the user provides an explicit verdict."
|
|
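The updated Decision Protocol step in this hunk combines three rules: mark exactly one option as recommended, send one question per call, and fall back to plain text on a schema error instead of retrying. The sketch below illustrates that control flow under stated assumptions. `askTool`, `validate`, and `askPlainText` are hypothetical injected callbacks standing in for AskQuestion/AskUserQuestion, a runtime schema check, and a plain-text prompt; their real signatures are not given in the source. The callbacks are synchronous to keep the sketch short, and any error thrown by `askTool` is treated as a schema error, which is a simplification.

```javascript
// Renders the labeled options (A/B/C...) as a plain-text single question.
function formatPlainText(question, options) {
  const lines = options.map((option, index) => {
    const letter = String.fromCharCode(65 + index); // 65 is "A"
    const suffix = option.recommended ? " (recommended)" : "";
    return `(${letter}) ${option.label}${suffix}`;
  });
  return [question, ...lines].join("\n");
}

// Hypothetical sketch: one question per call, validate first, never retry
// a guessed payload after the structured tool rejects it.
function askOneQuestion({ question, options, askTool, validate, askPlainText }) {
  if (options.filter((option) => option.recommended).length !== 1) {
    throw new Error("Exactly one option must be marked as recommended");
  }
  const payload = { question, options };
  if (askTool && validate && validate(payload).length === 0) {
    try {
      return { via: "tool", answer: askTool(payload) };
    } catch (error) {
      // Fall through to plain text; the tool is not called a second time.
    }
  }
  return { via: "plain-text", answer: askPlainText(formatPlainText(question, options)) };
}
```

The single `try` with no loop is the point of the design: a rejected payload leads straight to the plain-text question.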
@@ -1336,7 +1388,7 @@ const SHIP = {
|
|
|
1336
1388
|
interactionProtocol: [
|
|
1337
1389
|
"Run preflight checks before any release action.",
|
|
1338
1390
|
"Document release notes and rollback plan explicitly.",
|
|
1339
|
-
"For finalization mode: use the Decision Protocol — present modes as labeled options (A/B/C/D) with consequences,
|
|
1391
|
+
"For finalization mode: use the Decision Protocol — present modes as labeled options (A/B/C/D) with consequences, and mark one as (recommended). Do NOT use a numeric Completeness rubric; recommend the mode that best addresses release blast-radius, rollback readiness, observability, and stakeholder communication — ties go to the most reversible option. If AskQuestion/AskUserQuestion is available, send exactly ONE question per call, validate fields against runtime schema, and on schema error immediately fall back to plain-text question instead of retrying guessed payloads.",
|
|
1340
1392
|
"Do not proceed if critical blockers remain from review.",
|
|
1341
1393
|
"**STOP.** Present finalization options and wait for user selection before executing any finalization action."
|
|
1342
1394
|
],
|
|
@@ -25,30 +25,69 @@ This is the **recommended way to start** working with cclaw. Use \`/cc-next\` fo
|
|
|
25
25
|
## HARD-GATE
|
|
26
26
|
|
|
27
27
|
- **Do not** skip reading \`${flowPath}\` — always check current state before acting.
|
|
28
|
-
- **Do not** start implementation stages directly from \`/cc <prompt>\` — always begin at brainstorm.
|
|
28
|
+
- **Do not** start implementation stages directly from \`/cc <prompt>\` — always begin at the first stage of the resolved track (brainstorm for standard, spec for quick).
|
|
29
|
+
- **Do not** start a stage pipeline for a task that is not a software change (pure question, non-software task, conversation).
|
|
29
30
|
|
|
30
31
|
## Algorithm
|
|
31
32
|
|
|
32
33
|
### With prompt (\`/cc <text>\`)
|
|
33
34
|
|
|
34
|
-
1.
|
|
35
|
-
|
|
36
|
-
|
|
35
|
+
1. **Phase 0 — Task classification.** Before any stage routing, classify the prompt:
|
|
36
|
+
|
|
37
|
+
| Class | Signals | Action |
|
|
38
|
+
|---|---|---|
|
|
39
|
+
| **non-software** | legal text / docs / marketing copy / meeting notes / therapy-style conversation | Respond directly, do NOT open a stage, do NOT mutate flow state. |
|
|
40
|
+
| **pure-question** | "how does X work?", "explain Y", "what are the trade-offs of Z?" | Answer directly, do NOT open a stage. |
|
|
41
|
+
| **trivial** | typo, one-liner, rename, config tweak, copy change, version bump with zero behavior change | Fast-path: skip \`brainstorm\` and \`scope\`, seed \`00-idea.md\`, move straight to \`design\` or \`spec\` depending on whether an interface change is involved. |
|
|
42
|
+
| **software — bug fix with repro** | regression / hotfix / named symptom + repro steps | Fast-path: set track to \`quick\`, seed \`04-spec.md\` with the reproduction, enter \`tdd\` with a RED reproduction test first. |
|
|
43
|
+
| **software — standard** | feature, refactor, migration, integration, architecture change | Full 8-stage flow starting at \`brainstorm\`. |
|
|
44
|
+
|
|
45
|
+
Record the chosen class in \`.cclaw/artifacts/00-idea.md\` on the \`Class:\` line. Do NOT silently treat a non-software task as software.
|
|
46
|
+
|
|
47
|
+
2. **Phase 1 — Origin-document discovery.** Before asking the user for context, scan for existing requirements/plan artifacts and merge them into initial context:
|
|
48
|
+
- \`.cclaw/artifacts/00-idea.md\` if it already exists (resumed flow).
|
|
49
|
+
- Common origin locations: \`docs/prd/**\`, \`docs/rfcs/**\`, \`docs/adr/**\`, \`docs/design/**\`, \`specs/**\`, \`prd/**\`, \`rfc/**\`, \`design/**\`, root-level \`PRD.md\` / \`SPEC.md\` / \`DESIGN.md\` / \`REQUIREMENTS.md\` / \`ROADMAP.md\`.
|
|
50
|
+
- Summarize each discovered doc in \`00-idea.md\` under a \`Discovered context\` section with path + 1-line summary.
|
|
51
|
+
- If an origin doc contradicts the prompt, surface the conflict to the user before routing.
|
|
52
|
+
|
|
53
|
+
3. **Phase 2 — Tech-stack + version detection.** Sniff the repo for stack + language versions and record under \`Stack:\`:
|
|
54
|
+
- Node: \`package.json\` \`engines\` / \`volta\` / \`packageManager\` / \`devDependencies\`.
|
|
55
|
+
- Python: \`pyproject.toml\` / \`requirements*.txt\` / \`.python-version\`.
|
|
56
|
+
- Go: \`go.mod\` (module + Go version).
|
|
57
|
+
- Rust: \`Cargo.toml\` (\`[package]\` + \`rust-version\`).
|
|
58
|
+
- Java/Kotlin: \`pom.xml\` / \`build.gradle*\` + toolchain version.
|
|
59
|
+
- Containers: \`Dockerfile\`, \`docker-compose*.yml\`.
|
|
60
|
+
- CI: \`.github/workflows\`, \`.gitlab-ci.yml\`.
|
|
61
|
+
Skip detection quietly if no markers are found — do NOT invent a stack.
|
|
62
|
+
|
|
63
|
+
4. Read \`${flowPath}\`.
|
|
64
|
+
5. If flow already has completed stages beyond brainstorm, warn the user that starting a new brainstorm will reset progress. Ask for confirmation before proceeding.
|
|
65
|
+
6. **Track heuristic** — classify the idea text and **recommend** a track (the user can override before any state mutation):
|
|
37
66
|
- **quick** (\`spec → tdd → review → ship\`) — single-purpose work where the spec is essentially already known.
|
|
38
67
|
Triggers (case-insensitive substring or close variant): \`bug\`, \`bugfix\`, \`fix\`, \`hotfix\`, \`patch\`, \`typo\`, \`regression\`, \`copy change\`, \`rename\`, \`bump\`, \`upgrade dep\`, \`config tweak\`, \`docs only\`, \`comment\`, \`lint\`, \`format\`, \`small\`, \`tiny\`, \`one-liner\`, \`revert\`.
|
|
39
68
|
- **standard** (full 8 stages — default) — anything that introduces a new capability, touches multiple modules, or has unclear scope.
|
|
40
69
|
Triggers: \`new feature\`, \`add\`, \`build\`, \`design\`, \`refactor\`, \`migration\`, \`platform\`, \`architecture\`, \`endpoint\`, \`schema\`, \`api\`, \`integrate\`, \`workflow\`, \`onboarding\`, or any prompt that does not match quick triggers.
|
|
41
70
|
- When triggers conflict (e.g. "small refactor that touches 5 modules") prefer **standard** — quick is opt-in and only safe when scope is genuinely tiny.
|
|
42
|
-
|
|
71
|
+
7. Present the recommendation as a single decision with explicit options:
|
|
43
72
|
> \`Recommended track: <quick|standard>\` because \`<one-line reason citing matched triggers>\`.
|
|
44
73
|
> Override? (A) keep \`<recommended>\` (B) switch to \`<other>\` (C) cancel.
|
|
45
74
|
If \`AskQuestion\`/\`AskUserQuestion\` is available, send exactly ONE question; on schema error, fall back to plain text.
|
|
46
|
-
|
|
47
|
-
|
|
48
|
-
|
|
49
|
-
|
|
50
|
-
|
|
51
|
-
|
|
75
|
+
8. Persist the chosen track to \`${flowPath}\` (\`track\` field). Compute \`skippedStages\` from the track and write that too. Use the **first stage of the chosen track** as \`currentStage\` (quick → \`spec\`, standard → \`brainstorm\`, trivial fast-path → \`design\` or \`spec\` per Phase 0).
|
|
76
|
+
9. Write the prompt to \`.cclaw/artifacts/00-idea.md\` with the following header lines: \`Class:\` (from Phase 0), \`Track:\` (chosen track + matched heuristic), \`Stack:\` (from Phase 2 detection, or \`unknown\`), and a \`Discovered context\` section if Phase 1 found origin docs.
|
|
77
|
+
10. Load the **first-stage skill for the chosen track** and its command file:
|
|
78
|
+
- quick → \`.cclaw/skills/specification-authoring/SKILL.md\` + \`.cclaw/commands/spec.md\`
|
|
79
|
+
- standard → \`.cclaw/skills/brainstorming/SKILL.md\` + \`.cclaw/commands/brainstorm.md\`
|
|
80
|
+
- trivial fast-path → design or spec skill per Phase 0 decision.
|
|
81
|
+
11. Execute that stage with the prompt + Phase 1/Phase 2 context as initial input.
|
|
82
|
+
|
|
83
|
+
### Reclassification on discovery
|
|
84
|
+
|
|
85
|
+
If during any stage the agent discovers evidence that contradicts the initial Phase 0 / track decision (e.g. a supposedly \`trivial\` change turns out to require schema migration, a \`quick\` bug fix turns out to need design discussion, an origin doc reveals scope 3× larger than the prompt), STOP and re-classify:
|
|
86
|
+
|
|
87
|
+
1. Surface the new evidence in plain text.
|
|
88
|
+
2. Propose the updated \`Class\` + \`Track\` with a one-line reason.
|
|
89
|
+
3. Use the Decision Protocol to let the user accept, override, or cancel.
|
|
90
|
+
4. On acceptance: update \`00-idea.md\` with a \`Reclassification:\` entry (old → new, reason, ISO timestamp) and update \`flow-state.json\` accordingly — do NOT rewrite prior artifacts, they stay as history.
|
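The track heuristic and the persistence step in the "With prompt" algorithm above (steps 6 and 8) can be sketched as a small pure function. The trigger lists and the quick track's stages are copied from the text. `recommendTrack` is a hypothetical name, and the order of the eight standard stages in `ALL_STAGES` is an assumption, since this section names the stages but never lists them together. The Phase 0 trivial fast-path is not modeled.

```javascript
// Trigger lists copied from the track heuristic text.
const QUICK_TRIGGERS = [
  "bug", "bugfix", "fix", "hotfix", "patch", "typo", "regression",
  "copy change", "rename", "bump", "upgrade dep", "config tweak",
  "docs only", "comment", "lint", "format", "small", "tiny",
  "one-liner", "revert",
];
const STANDARD_TRIGGERS = [
  "new feature", "add", "build", "design", "refactor", "migration",
  "platform", "architecture", "endpoint", "schema", "api", "integrate",
  "workflow", "onboarding",
];
// Assumption: order of the full 8-stage flow.
const ALL_STAGES = ["brainstorm", "scope", "design", "spec", "plan", "tdd", "review", "ship"];
const TRACK_STAGES = { quick: ["spec", "tdd", "review", "ship"], standard: ALL_STAGES };

// Hypothetical sketch: recommends a track; the user can still override it.
function recommendTrack(prompt) {
  const text = prompt.toLowerCase();
  const quickHits = QUICK_TRIGGERS.filter((trigger) => text.includes(trigger));
  const standardHits = STANDARD_TRIGGERS.filter((trigger) => text.includes(trigger));
  // Conflicts and prompts with no quick trigger both resolve to standard.
  const track = quickHits.length > 0 && standardHits.length === 0 ? "quick" : "standard";
  const stages = TRACK_STAGES[track];
  return {
    track,
    matched: track === "quick" ? quickHits : standardHits,
    currentStage: stages[0],
    skippedStages: ALL_STAGES.filter((stage) => !stages.includes(stage)),
  };
}
```

Plain substring matching is what the text describes, but it is coarse: a word such as "address" contains `add`. That is one reason the result is only a recommendation that the user confirms before any state is written.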
|
52
91
|
|
|
53
92
|
### Without prompt (\`/cc\`)
|
|
54
93
|
|
|
@@ -88,12 +127,15 @@ Do **not** silently discard an existing flow when the user provides a prompt. If
|
|
|
88
127
|
|
|
89
128
|
### Path A: \`/cc <prompt>\`
|
|
90
129
|
|
|
91
|
-
1.
|
|
92
|
-
2.
|
|
130
|
+
1. **Task classification (Phase 0).** Decide whether the prompt is \`software-standard\`, \`software-trivial\`, \`software-bugfix\`, \`pure-question\`, or \`non-software\`. Non-software and pure-question exit immediately — answer directly, do not open a stage.
|
|
131
|
+
2. **Origin-document discovery (Phase 1).** Scan for \`docs/prd/**\`, \`docs/rfcs/**\`, \`docs/adr/**\`, \`docs/design/**\`, \`specs/**\`, root-level \`PRD.md\` / \`SPEC.md\` / \`DESIGN.md\` / \`REQUIREMENTS.md\`. Summarize any hits in \`00-idea.md\` under \`Discovered context\`. Surface conflicts with the prompt before routing.
|
|
132
|
+
3. **Stack detection (Phase 2).** Inspect \`package.json\` engines, \`pyproject.toml\`, \`go.mod\`, \`Cargo.toml\`, \`pom.xml\`, \`build.gradle*\`, \`Dockerfile\`, \`docker-compose*.yml\`, and CI configs. Record stack + versions on the \`Stack:\` line. Do not invent stack details.
|
|
133
|
+
4. Read \`${flowPath}\`.
|
|
134
|
+
5. If \`completedStages\` is non-empty:
|
|
93
135
|
- Inform: "You have an active flow at stage **{currentStage}** with {N} completed stages. Starting a new brainstorm will reset progress."
|
|
94
136
|
- Ask: "Continue with reset? (A) Yes, start fresh (B) No, resume current flow"
|
|
95
137
|
- If (B) → switch to Path B behavior.
|
|
96
|
-
|
|
138
|
+
6. **Classify the idea** using the heuristic below and present a single track recommendation. Wait for explicit confirmation or override before mutating any state.
|
|
97
139
|
|
|
98
140
|
**Track heuristic** (lowercase substring match against the user prompt):
|
|
99
141
|
|
|
@@ -104,9 +146,13 @@ Do **not** silently discard an existing flow when the user provides a prompt. If
|
|
|
104
146
|
|
|
105
147
|
- On conflict, prefer \`standard\` (quick is opt-in for genuinely tiny work).
|
|
106
148
|
- Always state the recommendation as a one-line reason citing the matched trigger.
|
|
107
|
-
|
|
108
|
-
|
|
109
|
-
|
|
149
|
+
7. Persist the chosen track in \`${flowPath}\` (\`track\` + \`skippedStages\`). Set \`currentStage\` to the first stage of the chosen track (\`quick\` → \`spec\`, \`standard\` → \`brainstorm\`, trivial fast-path → \`design\` or \`spec\`). Reset gate catalog.
|
|
150
|
+
8. Write \`${RUNTIME_ROOT}/artifacts/00-idea.md\` with the user's prompt plus header lines: \`Class:\`, \`Track:\`, \`Stack:\`, and a \`Discovered context\` section from Phase 1.
|
|
151
|
+
9. Load and execute the **first stage skill of the chosen track** (\`brainstorming\` for standard, \`specification-authoring\` for quick) plus its matching command file.
|
|
152
|
+
|
|
153
|
+
### Reclassification on discovery
|
|
154
|
+
|
|
155
|
+
If mid-stage evidence contradicts the initial Class/Track decision (the "trivial" change needs a migration, the "quick" bug fix needs architecture work, an origin doc multiplies scope), STOP and re-classify using the Decision Protocol. Record \`Reclassification:\` in \`00-idea.md\` with old/new class and a one-line reason. Do NOT rewrite prior artifacts — they stay as history.
|
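Step 8 of Path A above writes `00-idea.md` with `Class:`, `Track:`, and `Stack:` header lines plus a `Discovered context` section. A minimal sketch of rendering that header follows. `renderIdeaHeader` is a hypothetical name and the exact file layout (line order, the `##` heading level, the bullet format) is an assumption; only the field names and the `unknown` fallback for the stack come from the text.

```javascript
// Hypothetical renderer for the 00-idea.md header; layout is an assumption.
function renderIdeaHeader({ prompt, taskClass, track, matchedTriggers = [], stack, discovered = [] }) {
  const matched = matchedTriggers.length > 0 ? ` (matched: ${matchedTriggers.join(", ")})` : "";
  const lines = [
    `Class: ${taskClass}`,
    `Track: ${track}${matched}`,
    // Record "unknown" rather than inventing a stack when detection found nothing.
    `Stack: ${stack || "unknown"}`,
    "",
  ];
  if (discovered.length > 0) {
    lines.push("## Discovered context", "");
    for (const doc of discovered) {
      lines.push(`- ${doc.path}: ${doc.summary}`); // path + 1-line summary
    }
    lines.push("");
  }
  lines.push(prompt);
  return lines.join("\n");
}
```

A `Reclassification:` entry would be appended to the same file later, leaving these header lines in place as history.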
|
110
156
|
|
|
111
157
|
### Path B: \`/cc\` (no arguments)
|
|
112
158
|
|