cclaw-cli 0.5.11 → 0.5.13
- package/dist/artifact-linter.js +16 -0
- package/dist/content/examples.js +71 -108
- package/dist/content/stage-schema.js +97 -10
- package/dist/content/templates.js +34 -3
- package/package.json +1 -1
package/dist/artifact-linter.js
CHANGED
|
@@ -201,6 +201,22 @@ function validateSectionBody(sectionBody, rule) {
|
|
|
201
201
|
}
|
|
202
202
|
}
|
|
203
203
|
}
|
|
204
|
+
if (/Status:\s*pending\s+until/iu.test(rule)) {
|
|
205
|
+
const statusLine = bodyLines.find((l) => /^\s*-?\s*Status\s*:/iu.test(l));
|
|
206
|
+
if (!statusLine) {
|
|
207
|
+
return { ok: false, details: "WAIT_FOR_CONFIRM section must contain a 'Status:' line." };
|
|
208
|
+
}
|
|
209
|
+
const validStatuses = ["pending", "approved"];
|
|
210
|
+
const statusMatch = /Status\s*:\s*(\S+)/iu.exec(statusLine);
|
|
211
|
+
const statusValue = statusMatch?.[1]?.toLowerCase();
|
|
212
|
+
if (!statusValue || !validStatuses.includes(statusValue)) {
|
|
213
|
+
const foundLabel = statusValue || "(empty)";
|
|
214
|
+
return {
|
|
215
|
+
ok: false,
|
|
216
|
+
details: "WAIT_FOR_CONFIRM Status must be exactly one of: " + validStatuses.join(", ") + ". Found: " + foundLabel + "."
|
|
217
|
+
};
|
|
218
|
+
}
|
|
219
|
+
}
|
|
204
220
|
const keywords = extractRequiredKeywords(rule);
|
|
205
221
|
if (keywords.length > 0) {
|
|
206
222
|
const bodyLower = sectionBody.toLowerCase();
|
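The status check added to `validateSectionBody` above can be read in isolation. The following is a minimal sketch, not the package's code: the function name `checkConfirmStatus` is hypothetical, `bodyLines` is assumed to be the section body split on newlines (the hunk does not show where it is defined), and the sketch returns `ok: true` where the real function continues on to its keyword checks.

```javascript
// Hypothetical standalone version of the WAIT_FOR_CONFIRM status check.
// Assumption: bodyLines in the real linter is the section body split by line.
function checkConfirmStatus(sectionBody) {
  const validStatuses = ["pending", "approved"];
  const bodyLines = sectionBody.split(/\r?\n/u);

  // Accept both "Status: x" and "- Status: x", as the regex in the hunk does.
  const statusLine = bodyLines.find((l) => /^\s*-?\s*Status\s*:/iu.test(l));
  if (!statusLine) {
    return { ok: false, details: "WAIT_FOR_CONFIRM section must contain a 'Status:' line." };
  }

  // Capture the first whitespace-delimited token after the colon.
  const statusMatch = /Status\s*:\s*(\S+)/iu.exec(statusLine);
  const statusValue = statusMatch?.[1]?.toLowerCase();
  if (!statusValue || !validStatuses.includes(statusValue)) {
    const foundLabel = statusValue || "(empty)";
    return {
      ok: false,
      details: "WAIT_FOR_CONFIRM Status must be exactly one of: " +
        validStatuses.join(", ") + ". Found: " + foundLabel + "."
    };
  }

  // Simplification: the real function goes on to keyword validation here.
  return { ok: true, details: "" };
}

console.log(checkConfirmStatus("- Status: pending\n- Confirmed by:").ok); // true
console.log(checkConfirmStatus("- Status: done").ok);                     // false
```

Because the value is captured with `\S+` and compared exactly, a line such as `Status: Approved.` fails the check: the trailing period becomes part of the captured token.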
package/dist/content/examples.js
CHANGED
|
@@ -236,11 +236,11 @@ Data flow: Gateway → Service (validate + enrich) → Publisher (fan-out) → Q
|
|
|
236
236
|
Design output should be **reviewable by someone who did not attend brainstorming**: they can trace from constraints → components → open decisions without reading code.`,
|
|
237
237
|
spec: `### Acceptance Criteria
|
|
238
238
|
|
|
239
|
-
| ID | Criterion (observable/measurable/falsifiable) |
|
|
240
|
-
| --- | --- |
|
|
241
|
-
| AC-1 | Given a signed-in user with an active session, when the server publishes a new notification event for that user, the client feed shows the new item within 5 seconds without a full page reload. |
|
|
242
|
-
| AC-2 | Given the same logical notification is published twice with the same dedupe key, when the client processes the stream, the feed contains exactly one visible item for that key. |
|
|
243
|
-
| AC-3 | Given the live connection is unavailable, when the user opens the notifications panel, the UI shows a non-blocking "live updates paused" banner and loads the latest snapshot via REST within 2 seconds. |
|
|
239
|
+
| ID | Criterion (observable/measurable/falsifiable) | Design Decision Ref |
|
|
240
|
+
| --- | --- | --- |
|
|
241
|
+
| AC-1 | Given a signed-in user with an active session, when the server publishes a new notification event for that user, the client feed shows the new item within 5 seconds without a full page reload. | Architecture: SSE delivery path |
|
|
242
|
+
| AC-2 | Given the same logical notification is published twice with the same dedupe key, when the client processes the stream, the feed contains exactly one visible item for that key. | Architecture: dedupe-key in event schema |
|
|
243
|
+
| AC-3 | Given the live connection is unavailable, when the user opens the notifications panel, the UI shows a non-blocking "live updates paused" banner and loads the latest snapshot via REST within 2 seconds. | Architecture: REST fallback + degraded UX |
|
|
244
244
|
|
|
245
245
|
### Edge Cases
|
|
246
246
|
|
|
@@ -267,136 +267,99 @@ Design output should be **reviewable by someone who did not attend brainstorming
|
|
|
267
267
|
|
|
268
268
|
- Approved by: user
|
|
269
269
|
- Date: 2026-04-14`,
|
|
270
|
-
plan: `###
|
|
271
|
-
|
|
272
|
-
| ID | Title | depends_on | acceptance_criteria | estimated_effort |
|
|
273
|
-
| --- | --- | --- | --- | --- |
|
|
274
|
-
| T1 | Define notification event schema + dedupe key rules | — | Spec criteria 2 satisfied in a written contract + fixtures | S |
|
|
275
|
-
| T2 | Implement publisher + outbox write path | T1 | Spec criterion 1 satisfied in integration test (happy path) | M |
|
|
276
|
-
| T3 | Implement client feed + SSE subscribe + REST fallback | T1, T2 | Spec criteria 1–3 satisfied in e2e-style tests (including degraded mode) | L |
|
|
270
|
+
plan: `### Dependency Graph
|
|
277
271
|
|
|
278
|
-
### Dependency graph (ASCII)
|
|
279
|
-
|
|
280
|
-
\`\`\`
|
|
281
|
-
T1 ──▶ T2 ──▶ T3
|
|
282
|
-
│ ▲
|
|
283
|
-
└────────────┘
|
|
284
272
|
\`\`\`
|
|
285
|
-
|
|
286
|
-
|
|
287
|
-
|
|
288
|
-
| Spec criterion | Tasks that cover it | Notes |
|
|
289
|
-
| --- | --- | --- |
|
|
290
|
-
| Criterion 1 (delivery) | T2, T3 | T2 proves publish path; T3 proves UI subscription path |
|
|
291
|
-
| Criterion 2 (idempotency) | T1, T2 | Schema + publisher tests must include dedupe cases |
|
|
292
|
-
| Criterion 3 (failure visibility) | T3 | Explicit degraded-mode test case |
|
|
293
|
-
|
|
294
|
-
### Sequencing rationale (sample)
|
|
295
|
-
|
|
296
|
-
- **T1 first** prevents rework when event keys change mid-build.
|
|
297
|
-
- **T2 before T3** ensures the UI is not built on a mocked publisher that will not match production semantics.
|
|
298
|
-
- **T3 last** integrates transport concerns once contracts are stable.
|
|
299
|
-
|
|
300
|
-
### Risk note
|
|
301
|
-
|
|
302
|
-
If T3 grows too large, split “transport” vs “UI state machine” into two tasks while keeping the dependency graph acyclic.`,
|
|
303
|
-
tdd: `### RED test (Vitest) — written before production code
|
|
304
|
-
|
|
305
|
-
\`\`\`typescript
|
|
306
|
-
import { describe, it, expect } from "vitest";
|
|
307
|
-
import { summarizeDedupedFeed } from "../notificationFeed";
|
|
308
|
-
|
|
309
|
-
describe("summarizeDedupedFeed", () => {
|
|
310
|
-
it("counts unique keys and unread items", () => {
|
|
311
|
-
const summary = summarizeDedupedFeed([
|
|
312
|
-
{ dedupeKey: "a", read: false },
|
|
313
|
-
{ dedupeKey: "a", read: true },
|
|
314
|
-
{ dedupeKey: "b", read: false },
|
|
315
|
-
]);
|
|
316
|
-
|
|
317
|
-
expect(summary).toEqual({ uniqueKeys: 2, unread: 1 });
|
|
318
|
-
});
|
|
319
|
-
});
|
|
273
|
+
T-1 ──▶ T-2 ──▶ T-3
|
|
274
|
+
│ ▲
|
|
275
|
+
└───────────────┘
|
|
320
276
|
\`\`\`
|
|
321
277
|
|
|
322
|
-
|
|
278
|
+
Parallel opportunity: none in this plan. T-1 is a prerequisite for both T-2 and T-3, and T-3 also needs T-2, so the tasks run in sequence.
|
|
323
279
|
|
|
324
|
-
|
|
325
|
-
FAIL src/notificationFeed.test.ts
|
|
326
|
-
Error: Cannot find module '../notificationFeed' imported from src/notificationFeed.test.ts
|
|
327
|
-
\`\`\`
|
|
280
|
+
### Dependency Waves
|
|
328
281
|
|
|
329
|
-
|
|
282
|
+
#### Wave 1 (foundation)
|
|
283
|
+
- Task IDs: T-1
|
|
284
|
+
- Verification gate: schema tests pass, dedupe key fixtures validated
|
|
330
285
|
|
|
331
|
-
|
|
286
|
+
#### Wave 2 (core logic)
|
|
287
|
+
- Task IDs: T-2
|
|
288
|
+
- Depends on: Wave 1 (T-1 complete)
|
|
289
|
+
- Verification gate: integration test proves publish-to-outbox path
|
|
332
290
|
|
|
333
|
-
|
|
334
|
-
|
|
335
|
-
|
|
291
|
+
#### Wave 3 (integration)
|
|
292
|
+
- Task IDs: T-3
|
|
293
|
+
- Depends on: Wave 2 (T-2 complete)
|
|
294
|
+
- Verification gate: e2e tests pass for delivery, dedupe, and degraded mode
|
|
336
295
|
|
|
337
|
-
|
|
296
|
+
Execution rule: complete and verify each wave before starting the next wave.
|
|
338
297
|
|
|
339
|
-
|
|
340
|
-
- Assertions that pass because the function returns \`undefined\` and the matcher is too loose.
|
|
298
|
+
### Task List
|
|
341
299
|
|
|
342
|
-
|
|
300
|
+
| Task ID | Description | Acceptance criterion | Verification command | Effort |
|
|
301
|
+
| --- | --- | --- | --- | --- |
|
|
302
|
+
| T-1 | Define notification event schema + dedupe key rules | AC-1, AC-2: schema contract + fixtures | \`\`\`pnpm vitest run tests/unit/notification-schema.test.ts\`\`\` | S |
|
|
303
|
+
| T-2 | Implement publisher + outbox write path | AC-1: integration test (happy path publish) | \`\`\`pnpm vitest run tests/integration/publisher.test.ts\`\`\` | M |
|
|
304
|
+
| T-3 | Implement client feed + SSE subscribe + REST fallback | AC-1, AC-2, AC-3: e2e tests including degraded mode | \`\`\`pnpm playwright test tests/e2e/notification-feed.spec.ts\`\`\` | L |
|
|
343
305
|
|
|
344
|
-
|
|
345
|
-
export type FeedItem = { dedupeKey: string; read: boolean };
|
|
306
|
+
### Acceptance Mapping
|
|
346
307
|
|
|
347
|
-
|
|
348
|
-
|
|
349
|
-
|
|
308
|
+
| Criterion ID | Task IDs |
|
|
309
|
+
| --- | --- |
|
|
310
|
+
| AC-1 (delivery within 5s) | T-2, T-3 |
|
|
311
|
+
| AC-2 (idempotency) | T-1, T-2 |
|
|
312
|
+
| AC-3 (failure visibility) | T-3 |
|
|
350
313
|
|
|
351
|
-
|
|
352
|
-
latestReadByKey.set(item.dedupeKey, item.read);
|
|
353
|
-
}
|
|
314
|
+
### Risk Assessment
|
|
354
315
|
|
|
355
|
-
|
|
356
|
-
|
|
357
|
-
|
|
358
|
-
|
|
316
|
+
| Task/Wave | Risk | Likelihood | Impact | Mitigation |
|
|
317
|
+
| --- | --- | --- | --- | --- |
|
|
318
|
+
| T-3 (Wave 3) | SSE reconnect logic complex | Medium | High | Spike reconnect in isolation before integrating with feed UI |
|
|
319
|
+
| Wave 2 → 3 | Publisher API contract may shift | Low | Medium | Pin contract in T-1 schema; T-2 integration test validates |
|
|
359
320
|
|
|
360
|
-
|
|
361
|
-
|
|
362
|
-
|
|
321
|
+
### WAIT_FOR_CONFIRM
|
|
322
|
+
- Status: pending
|
|
323
|
+
- Confirmed by:`,
|
|
324
|
+
tdd: `### RED Evidence
|
|
363
325
|
|
|
364
|
-
|
|
326
|
+
| Slice | Test name | Command | Failure output summary |
|
|
327
|
+
| --- | --- | --- | --- |
|
|
328
|
+
| S-1 (event schema + dedupe) | counts unique keys and unread items | \`\`\`pnpm vitest run tests/unit/dedupe-feed.test.ts\`\`\` | Cannot find module '../notificationFeed' |
|
|
329
|
+
| S-2 (publisher outbox) | publishes event to outbox with dedupe key | \`\`\`pnpm vitest run tests/integration/publisher.test.ts\`\`\` | publishToOutbox is not a function |
|
|
330
|
+
| S-3 (client feed + fallback) | shows notification within 5s via SSE | \`\`\`pnpm playwright test tests/e2e/notification-feed.spec.ts\`\`\` | Element [data-testid="feed-item"] not found |
|
|
365
331
|
|
|
366
|
-
|
|
332
|
+
### Acceptance Mapping
|
|
367
333
|
|
|
368
|
-
|
|
369
|
-
|
|
334
|
+
| Slice | Plan task ID | Spec criterion ID |
|
|
335
|
+
| --- | --- | --- |
|
|
336
|
+
| S-1 | T-1 | AC-1, AC-2 |
|
|
337
|
+
| S-2 | T-2 | AC-1 |
|
|
338
|
+
| S-3 | T-3 | AC-1, AC-2, AC-3 |
|
|
370
339
|
|
|
371
|
-
|
|
372
|
-
const latestReadByKey = new Map<string, boolean>();
|
|
373
|
-
for (const item of items) latestReadByKey.set(item.dedupeKey, item.read);
|
|
374
|
-
return latestReadByKey;
|
|
375
|
-
}
|
|
340
|
+
### Failure Analysis
|
|
376
341
|
|
|
377
|
-
|
|
378
|
-
|
|
342
|
+
| Slice | Expected missing behavior | Actual failure reason |
|
|
343
|
+
| --- | --- | --- |
|
|
344
|
+
| S-1 | notificationFeed module does not exist yet | Module import fails — correct: implementation missing |
|
|
345
|
+
| S-2 | publishToOutbox function not implemented | Function not found — correct: write path missing |
|
|
346
|
+
| S-3 | Feed UI not rendered, SSE not connected | DOM element missing — correct: client component not built |
|
|
379
347
|
|
|
380
|
-
|
|
381
|
-
for (const read of latestReadByKey.values()) {
|
|
382
|
-
if (!read) unread += 1;
|
|
383
|
-
}
|
|
348
|
+
### GREEN Evidence
|
|
384
349
|
|
|
385
|
-
|
|
386
|
-
|
|
387
|
-
\`\`\`
|
|
350
|
+
- Full suite command: \`\`\`pnpm vitest run && pnpm playwright test\`\`\`
|
|
351
|
+
- Full suite result: 47 tests passed (3 new + 44 existing), 0 failed, 0 skipped
|
|
388
352
|
|
|
389
|
-
###
|
|
353
|
+
### REFACTOR Notes
|
|
390
354
|
|
|
391
|
-
\`\`\`
|
|
392
|
-
|
|
355
|
+
- What changed: Extracted \`\`\`mergeLatestByDedupeKey\`\`\` helper from inline loop in \`\`\`summarizeDedupedFeed\`\`\`; moved SSE reconnect logic into \`\`\`useSSEConnection\`\`\` hook.
|
|
356
|
+
- Why: Dedupe merge logic is reused by both publisher and client; reconnect logic was duplicated across components.
|
|
357
|
+
- Behavior preserved: Full suite re-run confirms 47/47 pass after refactor.
|
|
393
358
|
|
|
394
|
-
|
|
359
|
+
### Traceability
|
|
395
360
|
|
|
396
|
-
|
|
397
|
-
|
|
398
|
-
Tests: 1 passed.
|
|
399
|
-
\`\`\``,
|
|
361
|
+
- Plan task IDs: T-1, T-2, T-3
|
|
362
|
+
- Spec criterion IDs: AC-1, AC-2, AC-3`,
|
|
400
363
|
review: `### Layer 1 — Spec compliance (per-criterion)
|
|
401
364
|
|
|
402
365
|
| Criterion | Status | Evidence |
|
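The wave layout in the new plan example (T-1, then T-2, then T-3) follows directly from the dependency graph. As an illustration only, the sketch below derives waves from a task list; `computeWaves` and the `{ id, dependsOn }` task shape are hypothetical and are not part of cclaw-cli. A task's wave is one more than the highest wave among its dependencies, and a cycle is reported as an error, in line with the "no circular dependencies" rule.

```javascript
// Hypothetical helper: groups tasks into dependency waves.
// Assumed task shape: { id: string, dependsOn: string[] }.
function computeWaves(tasks) {
  const byId = new Map(tasks.map((t) => [t.id, t]));
  const waveOf = new Map();   // task id -> 1-based wave number
  const visiting = new Set(); // ids on the current resolution path

  function resolve(id) {
    if (waveOf.has(id)) return waveOf.get(id);
    if (!byId.has(id)) throw new Error("Unknown task: " + id);
    if (visiting.has(id)) throw new Error("Circular dependency at " + id);
    visiting.add(id);
    let wave = 1;
    for (const dep of byId.get(id).dependsOn) {
      // A task runs one wave after its latest dependency.
      wave = Math.max(wave, resolve(dep) + 1);
    }
    visiting.delete(id);
    waveOf.set(id, wave);
    return wave;
  }

  const waves = [];
  for (const t of tasks) {
    const w = resolve(t.id);
    if (!waves[w - 1]) waves[w - 1] = [];
    waves[w - 1].push(t.id);
  }
  return waves;
}

const exampleWaves = computeWaves([
  { id: "T-1", dependsOn: [] },
  { id: "T-2", dependsOn: ["T-1"] },
  { id: "T-3", dependsOn: ["T-1", "T-2"] }
]);
console.log(JSON.stringify(exampleWaves)); // [["T-1"],["T-2"],["T-3"]]
```

Tasks that land in the same inner array have no dependency on each other and could run in parallel; in the example every wave holds a single task.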
package/dist/content/stage-schema.js
CHANGED
|
@@ -717,7 +717,30 @@ const SPEC = {
|
|
|
717
717
|
{ name: "Assumption Surfacing", description: "Implicit assumptions are invisible requirements. Force every assumption into an explicit statement. If you cannot name the assumption, you have not found it yet." },
|
|
718
718
|
{ name: "Ambiguity Classification", description: "Before resolving any unclear requirement, classify it: (A) Insufficient information — ask the user. (B) Multiple valid interpretations — enumerate and pick with justification. (C) Genuinely unknown — propose hypothesis and validation path. Never treat all ambiguity the same way." }
|
|
719
719
|
],
|
|
720
|
-
reviewSections: [
|
|
720
|
+
reviewSections: [
|
|
721
|
+
{
|
|
722
|
+
title: "Acceptance Criteria Audit",
|
|
723
|
+
evaluationPoints: [
|
|
724
|
+
"Is every criterion observable (can you point to evidence of pass/fail)?",
|
|
725
|
+
"Is every criterion measurable (numeric threshold or boolean outcome)?",
|
|
726
|
+
"Is every criterion falsifiable (can you describe what failure looks like)?",
|
|
727
|
+
"Does every criterion trace to a design decision (Design Decision Ref)?",
|
|
728
|
+
"Are there any vague adjectives (fast, intuitive, robust) without thresholds?"
|
|
729
|
+
],
|
|
730
|
+
stopGate: true
|
|
731
|
+
},
|
|
732
|
+
{
|
|
733
|
+
title: "Testability Audit",
|
|
734
|
+
evaluationPoints: [
|
|
735
|
+
"Does every criterion have a concrete test description in the Testability Map?",
|
|
736
|
+
"Does every test specify a verification approach (unit, integration, e2e, manual)?",
|
|
737
|
+
"Does every test include a runnable command or manual steps?",
|
|
738
|
+
"Are edge cases (boundary + error) defined for every criterion?",
|
|
739
|
+
"Can you run every verification command right now and get a meaningful result?"
|
|
740
|
+
],
|
|
741
|
+
stopGate: true
|
|
742
|
+
}
|
|
743
|
+
],
|
|
721
744
|
completionStatus: ["DONE", "DONE_WITH_CONCERNS", "BLOCKED"],
|
|
722
745
|
crossStageTrace: {
|
|
723
746
|
readsFrom: [".cclaw/artifacts/03-design.md", ".cclaw/artifacts/02-scope.md"],
|
|
@@ -729,6 +752,8 @@ const SPEC = {
|
|
|
729
752
|
{ section: "Edge Cases", required: true, validationRule: "At least one boundary and one error condition per criterion." },
|
|
730
753
|
{ section: "Constraints and Assumptions", required: true, validationRule: "All implicit assumptions surfaced. Constraints have sources." },
|
|
731
754
|
{ section: "Testability Map", required: true, validationRule: "Each criterion maps to a concrete test description with verification approach (unit, integration, e2e, manual) and command or manual steps." },
|
|
755
|
+
{ section: "Vague to Fixed", required: false, validationRule: "If present: table with original vague wording and rewritten observable/testable version for each ambiguous requirement." },
|
|
756
|
+
{ section: "Non-Functional Requirements", required: false, validationRule: "If present: performance thresholds, security constraints, scalability limits, reliability targets with measurable values." },
|
|
732
757
|
{ section: "Interface Contracts", required: false, validationRule: "If present: for each module boundary list produces (outputs) and consumes (inputs) with data types." },
|
|
733
758
|
{ section: "Approval", required: true, validationRule: "Explicit user approval marker present." }
|
|
734
759
|
],
|
|
@@ -839,9 +864,35 @@ const PLAN = {
|
|
|
839
864
|
cognitivePatterns: [
|
|
840
865
|
{ name: "Vertical Slice Thinking", description: "Each task delivers one thin end-to-end slice of value. Horizontal layers (all models, then all controllers) create integration risk. Vertical slices (one feature through all layers) reduce it." },
|
|
841
866
|
{ name: "Two-Minute Smell Test", description: "If a competent engineer cannot understand and start a task in two minutes, the task is too large or too vague. Break it down further." },
|
|
842
|
-
{ name: "Make the Change Easy, Then Make the Easy Change", description: "Refactor first, implement second. Never structural + behavioral changes simultaneously. Sequence tasks accordingly." }
|
|
867
|
+
{ name: "Make the Change Easy, Then Make the Easy Change", description: "Refactor first, implement second. Never structural + behavioral changes simultaneously. Sequence tasks accordingly." },
|
|
868
|
+
{ name: "Diagnose Before Fix", description: "Before decomposing work, understand the current state of the codebase. Read existing code, tests, and conventions. Tasks should reference what exists, not assume a blank slate." },
|
|
869
|
+
{ name: "Scrap Signals", description: "If a task description is vague, the acceptance criterion is missing, or the verification command is a placeholder — it is scrap. Either rewrite it or remove it. Half-specified tasks waste more time than no tasks." },
|
|
870
|
+
{ name: "Risk-First Exploration", description: "Sequence the highest-risk or most uncertain tasks first. If wave 1 proves the risky assumption wrong, the rest of the plan can adapt. If the risk is buried in wave 3, you discover failure late." }
|
|
871
|
+
],
|
|
872
|
+
reviewSections: [
|
|
873
|
+
{
|
|
874
|
+
title: "Task Decomposition Audit",
|
|
875
|
+
evaluationPoints: [
|
|
876
|
+
"Does every task target a single coherent area (vertical slice)?",
|
|
877
|
+
"Can each task be completed in 2-5 minutes?",
|
|
878
|
+
"Does every task have an acceptance criterion link and verification command?",
|
|
879
|
+
"Are there tasks that touch multiple unrelated areas?",
|
|
880
|
+
"Would a new engineer understand and start each task within two minutes?"
|
|
881
|
+
],
|
|
882
|
+
stopGate: true
|
|
883
|
+
},
|
|
884
|
+
{
|
|
885
|
+
title: "Wave Completeness Audit",
|
|
886
|
+
evaluationPoints: [
|
|
887
|
+
"Does every task belong to exactly one wave?",
|
|
888
|
+
"Does each wave have a verification gate?",
|
|
889
|
+
"Are wave dependencies explicit and acyclic?",
|
|
890
|
+
"Is the acceptance mapping complete — every spec criterion covered?",
|
|
891
|
+
"Are there hidden dependencies between tasks in different waves?"
|
|
892
|
+
],
|
|
893
|
+
stopGate: true
|
|
894
|
+
}
|
|
843
895
|
],
|
|
844
|
-
reviewSections: [],
|
|
845
896
|
completionStatus: ["DONE", "DONE_WITH_CONCERNS", "BLOCKED"],
|
|
846
897
|
crossStageTrace: {
|
|
847
898
|
readsFrom: [".cclaw/artifacts/04-spec.md", ".cclaw/artifacts/03-design.md", ".cclaw/artifacts/02-scope.md"],
|
|
@@ -851,10 +902,16 @@ const PLAN = {
|
|
|
851
902
|
artifactValidation: [
|
|
852
903
|
{ section: "Dependency Graph", required: true, validationRule: "Ordering and parallel opportunities explicit. No circular dependencies." },
|
|
853
904
|
{ section: "Dependency Waves", required: true, validationRule: "Every task belongs to a wave. Each wave has an exit gate and dependency statement." },
|
|
854
|
-
{ section: "Task List", required: true, validationRule: "Each task: ID, description, acceptance criterion link, verification command." },
|
|
905
|
+
{ section: "Task List", required: true, validationRule: "Each task: ID, description, acceptance criterion link, verification command, and effort estimate (S/M/L)." },
|
|
855
906
|
{ section: "Acceptance Mapping", required: true, validationRule: "Every spec criterion is covered by at least one task." },
|
|
907
|
+
{ section: "Risk Assessment", required: false, validationRule: "If present: per-task or per-wave risk identification with likelihood, impact, and mitigation strategy." },
|
|
908
|
+
{ section: "Boundary Map", required: false, validationRule: "If present: per-wave or per-task interface contracts listing what each task produces (exports) and consumes (imports) from other tasks." },
|
|
856
909
|
{ section: "WAIT_FOR_CONFIRM", required: true, validationRule: "Explicit marker present. Status: pending until user approves." }
|
|
857
|
-
]
|
|
910
|
+
],
|
|
911
|
+
namedAntiPattern: {
|
|
912
|
+
title: "Task Details Can Be Finalized During Coding",
|
|
913
|
+
description: "Underspecified tasks do not become clear during implementation — they become context thrash, broken sequencing, and rework. Every task needs an acceptance criterion, a verification command, and a wave assignment before execution starts. If you cannot describe what 'done' looks like for a task, the task is not ready."
|
|
914
|
+
}
|
|
858
915
|
};
|
|
859
916
|
// ---------------------------------------------------------------------------
|
|
860
917
|
// TDD — RED → GREEN → REFACTOR cycle (merged test + build)
|
|
@@ -977,14 +1034,38 @@ const TDD = {
|
|
|
977
1034
|
{ name: "Failure-First Thinking", description: "The failing test IS the specification. Until you see the right failure, you do not understand what you are building. Wrong failures are information." },
|
|
978
1035
|
{ name: "Minimal Viable Change", description: "The best implementation is the smallest one that passes all RED tests. Every extra line is risk. Resist the urge to 'improve while you are here.'" },
|
|
979
1036
|
{ name: "Regression Paranoia", description: "Assume every change breaks something until the full suite proves otherwise. Partial test runs are lies of omission." },
|
|
980
|
-
{ name: "Refactor-as-Hygiene", description: "Refactoring is not optional cleanup — it is the third leg of TDD. GREEN without REFACTOR accumulates mess. REFACTOR without GREEN breaks things." }
|
|
1037
|
+
{ name: "Refactor-as-Hygiene", description: "Refactoring is not optional cleanup — it is the third leg of TDD. GREEN without REFACTOR accumulates mess. REFACTOR without GREEN breaks things." },
|
|
1038
|
+
{ name: "Evidence Over Anecdote", description: "Every claim about test state must be backed by captured output. 'It passed' without terminal evidence is not evidence. 'I saw it fail' without the failure output is not RED. Capture commands, outputs, and results — not summaries from memory." }
|
|
1039
|
+
],
|
|
1040
|
+
reviewSections: [
|
|
1041
|
+
{
|
|
1042
|
+
title: "RED Evidence Audit",
|
|
1043
|
+
evaluationPoints: [
|
|
1044
|
+
"Does every slice have a captured failing test output?",
|
|
1045
|
+
"Does each failure reason match the expected missing behavior (not a typo or config error)?",
|
|
1046
|
+
"Were tests written BEFORE any production code for that slice?",
|
|
1047
|
+
"Does each RED test assert observable behavior, not implementation details?",
|
|
1048
|
+
"Is there a test for each acceptance criterion mapped in the plan?"
|
|
1049
|
+
],
|
|
1050
|
+
stopGate: true
|
|
1051
|
+
},
|
|
1052
|
+
{
|
|
1053
|
+
title: "GREEN/REFACTOR Audit",
|
|
1054
|
+
evaluationPoints: [
|
|
1055
|
+
"Does GREEN evidence show a FULL suite pass (not partial)?",
|
|
1056
|
+
"Is the GREEN implementation minimal — no features beyond what RED tests require?",
|
|
1057
|
+
"Does the REFACTOR step preserve all existing behavior (no new failures)?",
|
|
1058
|
+
"Are REFACTOR notes documented with rationale?",
|
|
1059
|
+
"Is traceability complete: every change links to plan task ID and spec criterion?"
|
|
1060
|
+
],
|
|
1061
|
+
stopGate: true
|
|
1062
|
+
}
|
|
981
1063
|
],
|
|
982
|
-
reviewSections: [],
|
|
983
1064
|
completionStatus: ["DONE", "DONE_WITH_CONCERNS", "BLOCKED"],
|
|
984
1065
|
crossStageTrace: {
|
|
985
|
-
readsFrom: [".cclaw/artifacts/05-plan.md", ".cclaw/artifacts/04-spec.md"],
|
|
1066
|
+
readsFrom: [".cclaw/artifacts/05-plan.md", ".cclaw/artifacts/04-spec.md", ".cclaw/artifacts/03-design.md"],
|
|
986
1067
|
writesTo: [".cclaw/artifacts/06-tdd.md"],
|
|
987
|
-
traceabilityRule: "Every RED test traces to a plan task. Every GREEN change traces to a RED test. Every plan task traces to a spec criterion. Evidence chain must be unbroken."
|
|
1068
|
+
traceabilityRule: "Every RED test traces to a plan task. Every GREEN change traces to a RED test. Every plan task traces to a spec criterion. Design decisions inform test strategy. Evidence chain must be unbroken."
|
|
988
1069
|
},
|
|
989
1070
|
artifactValidation: [
|
|
990
1071
|
{ section: "RED Evidence", required: true, validationRule: "Failing test output captured per slice." },
|
|
@@ -992,8 +1073,14 @@ const TDD = {
|
|
|
992
1073
|
{ section: "Failure Analysis", required: true, validationRule: "Failure reason matches expected missing behavior." },
|
|
993
1074
|
{ section: "GREEN Evidence", required: true, validationRule: "Full suite pass output captured." },
|
|
994
1075
|
{ section: "REFACTOR Notes", required: true, validationRule: "What changed, why, behavior preservation confirmed." },
|
|
995
|
-
{ section: "Traceability", required: true, validationRule: "Plan task ID and spec criterion linked." }
|
|
1076
|
+
{ section: "Traceability", required: true, validationRule: "Plan task ID and spec criterion linked." },
|
|
1077
|
+
{ section: "Verification Ladder", required: false, validationRule: "If present: per-slice verification tier (static, command, behavioral, human) with evidence for highest tier reached." },
|
|
1078
|
+
{ section: "Coverage Targets", required: false, validationRule: "If present: per-module or per-code-type coverage thresholds with current values and measurement commands." }
|
|
996
1079
|
],
|
|
1080
|
+
namedAntiPattern: {
|
|
1081
|
+
title: "Code Before Failing Test",
|
|
1082
|
+
description: "Production code written before a failing test is not TDD — it is guessing validated after the fact. Tests written after implementation confirm assumptions, not behavior. If you wrote code first, delete it and start with RED. Delete means delete — not 'keep as reference.' The failing test IS the specification."
|
|
1083
|
+
},
|
|
997
1084
|
waveExecutionAllowed: true
|
|
998
1085
|
};
|
|
999
1086
|
// ---------------------------------------------------------------------------
|
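Each `artifactValidation` entry in the schema above pairs a section name with a `required` flag. One way such entries could be applied to a markdown artifact is sketched below. This is an assumption about usage, not the linter's actual logic: `findMissingSections` is a hypothetical name, and matching is done on heading text only, ignoring case.

```javascript
// Hypothetical helper: lists required sections that have no matching heading.
// Assumed rule shape, as in the schema: { section, required, validationRule }.
function findMissingSections(markdown, artifactValidation) {
  const headings = new Set();
  for (const line of markdown.split(/\r?\n/u)) {
    // Match ATX headings ("#" through "######") and keep the title text.
    const m = /^#{1,6}\s+(.+?)\s*$/u.exec(line);
    if (m) headings.add(m[1].toLowerCase());
  }
  return artifactValidation
    .filter((rule) => rule.required && !headings.has(rule.section.toLowerCase()))
    .map((rule) => rule.section);
}

const tddRules = [
  { section: "RED Evidence", required: true, validationRule: "Failing test output captured per slice." },
  { section: "Traceability", required: true, validationRule: "Plan task ID and spec criterion linked." },
  { section: "Verification Ladder", required: false, validationRule: "If present: per-slice verification tier." }
];
const artifact = "## RED Evidence\n| Slice | Test name |\n\n## Coverage Targets\n";
console.log(findMissingSections(artifact, tddRules)); // [ 'Traceability' ]
```

Optional sections such as Verification Ladder are never reported as missing. The sketch does not skip headings that appear inside fenced code blocks, which a fuller implementation would need to handle.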
package/dist/content/templates.js
CHANGED
|
@@ -221,6 +221,16 @@ export const ARTIFACT_TEMPLATES = {
|
|
|
221
221
|
|---|---|---|
|
|
222
222
|
| AC-1 | | |
|
|
223
223
|
|
|
224
|
+
## Vague to Fixed
|
|
225
|
+
| Original (vague) | Rewritten (observable/testable) |
|
|
226
|
+
|---|---|
|
|
227
|
+
| | |
|
|
228
|
+
|
|
229
|
+
## Non-Functional Requirements
|
|
230
|
+
| Category | Requirement | Threshold | Measurement |
|
|
231
|
+
|---|---|---|---|
|
|
232
|
+
| | | | |
|
|
233
|
+
|
|
224
234
|
## Interface Contracts
|
|
225
235
|
| Module | Produces | Consumes |
|
|
226
236
|
|---|---|---|
|
|
@@ -254,15 +264,25 @@ export const ARTIFACT_TEMPLATES = {
|
|
|
254
264
|
Execution rule: complete and verify each wave before starting the next wave.
|
|
255
265
|
|
|
256
266
|
## Task List
|
|
257
|
-
| Task ID | Description | Acceptance criterion | Verification command |
|
|
258
|
-
|
|
259
|
-
| T-1 | | | |
|
|
267
|
+
| Task ID | Description | Acceptance criterion | Verification command | Effort |
|
|
268
|
+
|---|---|---|---|---|
|
|
269
|
+
| T-1 | | | | |
|
|
260
270
|
|
|
261
271
|
## Acceptance Mapping
|
|
262
272
|
| Criterion ID | Task IDs |
|
|
263
273
|
|---|---|
|
|
264
274
|
| AC-1 | T-1 |
|
|
265
275
|
|
|
276
|
+
## Risk Assessment
|
|
277
|
+
| Task/Wave | Risk | Likelihood | Impact | Mitigation |
|
|
278
|
+
|---|---|---|---|---|
|
|
279
|
+
| | | | | |
|
|
280
|
+
|
|
281
|
+
## Boundary Map
|
|
282
|
+
| Task/Wave | Produces (exports) | Consumes (imports from) |
|
|
283
|
+
|---|---|---|
|
|
284
|
+
| | | |
|
|
285
|
+
|
|
266
286
|
## WAIT_FOR_CONFIRM
|
|
267
287
|
- Status: pending
|
|
268
288
|
- Confirmed by:
|
|
@@ -296,6 +316,17 @@ Execution rule: complete and verify each wave before starting the next wave.
|
|
|
296
316
|
## Traceability
|
|
297
317
|
- Plan task IDs:
|
|
298
318
|
- Spec criterion IDs:
|
|
319
|
+
|
|
320
|
+
|
|
321
|
+
## Verification Ladder
|
|
322
|
+
| Slice | Tier reached | Evidence |
|
|
323
|
+
|---|---|---|
|
|
324
|
+
| S-1 | | |
|
|
325
|
+
|
|
326
|
+
## Coverage Targets
|
|
327
|
+
| Code type | Target | Current | Command |
|
|
328
|
+
|---|---|---|---|
|
|
329
|
+
| | | | |
|
|
299
330
|
`,
|
|
300
331
|
"07-review.md": `# Review Artifact
|
|
301
332
|
|
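The Acceptance Mapping template above is governed by the rule "Every spec criterion is covered by at least one task." A rough sketch of checking that rule against a filled-in table follows. The function `uncoveredCriteria` is hypothetical, and the parsing assumes the two-column `| Criterion ID | Task IDs |` layout and `AC-n` identifiers used in the templates and examples.

```javascript
// Hypothetical helper: returns criteria rows whose "Task IDs" cell is empty.
// Assumes a two-column markdown table: | Criterion ID | Task IDs |
function uncoveredCriteria(tableMarkdown) {
  const uncovered = [];
  for (const line of tableMarkdown.split(/\r?\n/u)) {
    // Drop the empty strings produced by the leading and trailing pipes.
    const cells = line.split("|").slice(1, -1).map((c) => c.trim());
    if (cells.length < 2) continue;
    const [criterion, taskCell] = cells;
    // Header and separator rows do not start with an AC-n identifier.
    if (!/^AC-\d+/u.test(criterion)) continue;
    const taskIds = taskCell.split(",").map((t) => t.trim()).filter(Boolean);
    if (taskIds.length === 0) uncovered.push(criterion);
  }
  return uncovered;
}

const mapping = [
  "| Criterion ID | Task IDs |",
  "|---|---|",
  "| AC-1 | T-2, T-3 |",
  "| AC-2 | T-1, T-2 |",
  "| AC-3 | |"
].join("\n");
console.log(uncoveredCriteria(mapping)); // [ 'AC-3' ]
```

The check only confirms that each listed criterion names at least one task. It does not verify that the listed task IDs exist in the Task List or that every spec criterion appears in the table.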