qfai 1.6.0 → 1.6.2

@@ -69,6 +69,9 @@ Prototyping stage policy:
  Implementation stage:
 
  - `/qfai-implement` orchestrates the full TDD micro-cycle (Red/Green/Refactor) one test at a time using `test-list.md` as the execution ledger.
+ - Each item requires watch it fail (RED observation confirmed), watch it pass (GREEN observation confirmed), and fresh evidence (command+result pairs, not status-only).
+ - Completion requires independent spec review and code quality review gates — both must PASS before an item is marked done.
+ - Parallel execution is allowed only for independent slices with no shared state; worktree separation is required.
 
  Legacy note:
 
@@ -56,14 +56,16 @@ Execute the TDD micro-cycle for each pending item in `test-list.md`, transitioni
 
  The execution ledger at `.qfai/specs/spec-XXXX/tdd/test-list.md` tracks progress with these required columns:
 
- | Column | Description |
- | --------- | -------------------------------------------------- |
- | TDD-ID | Unique identifier for the TDD item (e.g., TDD-001) |
- | TC-Refs | References to test cases from `06_Test-Cases.md` |
- | Layer | Test layer (Unit, Integration, etc.) |
- | Test file | Path to the test file |
- | Selector | Test selector/description for targeted execution |
- | Status | Current lifecycle status |
+ | Column | Description |
+ | --------- | -------------------------------------------------------- |
+ | TDD-ID | Unique identifier for the TDD item (e.g., TDD-0001) |
+ | TC-Refs | References to test cases from `06_Test-Cases.md` |
+ | Layer | Test layer (Unit, Integration, etc.) |
+ | Test file | Path to the test file |
+ | Selector | Test selector/description for targeted execution |
+ | Status | Current lifecycle status |
+ | DR-ID | Decision Record ID for exception items (blank otherwise) |
+ | Evidence | RED/GREEN command+result pairs proving the TDD cycle |
 
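The columns above can be read into plain records before any checks run. This is a minimal sketch, assuming the ledger is an ordinary Markdown pipe table with no escaped `|` characters inside cells. The names `parse_ledger`, `split_row`, and `REQUIRED_COLUMNS` are hypothetical and are not part of qfai.

```python
# Hypothetical helper names; only the eight column titles come from the ledger table.
REQUIRED_COLUMNS = ["TDD-ID", "TC-Refs", "Layer", "Test file",
                    "Selector", "Status", "DR-ID", "Evidence"]


def split_row(line: str) -> list[str]:
    """Split one pipe-table line into trimmed cells (no escaped pipes supported)."""
    text = line.strip().removeprefix("|").removesuffix("|")
    return [cell.strip() for cell in text.split("|")]


def parse_ledger(markdown: str) -> list[dict[str, str]]:
    """Return one dict per data row, keyed by the header cells."""
    table = [ln for ln in markdown.splitlines() if ln.strip().startswith("|")]
    if len(table) < 2:
        return []
    header = split_row(table[0])
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    # table[1] is the |---|---| separator row, so data starts at index 2.
    return [dict(zip(header, split_row(ln))) for ln in table[2:]]
```

Empty `DR-ID` and `Evidence` cells come back as empty strings, which keeps later "is this blank?" checks simple.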
  ### Status Lifecycle
 
@@ -75,7 +77,7 @@ Allowed transitions:
  - `red` -> `green` (make the test pass with minimal code)
  - `green` -> `refactor` (improve code quality while keeping tests green)
  - `refactor` -> `done` (item complete)
- - Any active status -> `exception` (anomaly detected; record DR-ID in Notes column if present)
+ - Any active status -> `exception` (anomaly detected; record DR-ID in DR-ID column)
 
  Backward transitions are prohibited. Attempting `green` -> `red` must produce:
  `"Backward transition prohibited: green -> red"`.
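The transition rules in this hunk can be sketched as a small lookup table. This is an illustration under two assumptions that the visible lines do not confirm: that `todo` -> `red` is the first forward step, and that "active" means `todo`, `red`, `green`, or `refactor`. The function name `check_transition` and the generic "Transition not allowed" message are hypothetical; only the backward-transition message is taken from the text.

```python
# Forward steps; "todo" -> "red" is an assumption (that line is outside the hunk).
FORWARD = {"todo": "red", "red": "green", "green": "refactor", "refactor": "done"}
# Assumed meaning of "active status".
ACTIVE = {"todo", "red", "green", "refactor"}
ORDER = ["todo", "red", "green", "refactor", "done"]


def check_transition(current: str, target: str) -> None:
    """Raise ValueError unless current -> target is an allowed transition."""
    if target == "exception" and current in ACTIVE:
        return
    if FORWARD.get(current) == target:
        return
    if (current in ORDER and target in ORDER
            and ORDER.index(target) < ORDER.index(current)):
        raise ValueError(f"Backward transition prohibited: {current} -> {target}")
    # Hypothetical message for skipped steps or unknown statuses.
    raise ValueError(f"Transition not allowed: {current} -> {target}")
```

Calling `check_transition("green", "red")` raises with exactly the message quoted above, while `check_transition("green", "exception")` returns without error.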
@@ -84,8 +86,8 @@ Backward transitions are prohibited. Attempting `green` -> `red` must produce:
 
  When transitioning to `exception`:
 
- - A DR-ID (Decision Record ID) should be recorded in the Notes column if present.
- - If a Notes column exists but is empty, emit warning: `"exception status requires DR-ID in Notes column"`.
+ - A DR-ID (Decision Record ID) must be recorded in the DR-ID column.
+ - If the DR-ID column is empty, emit error: `"exception status requires DR-ID in DR-ID column"`.
 
  ## Required Process
 
@@ -108,7 +110,9 @@ When transitioning to `exception`:
 
  1. Improve code quality (naming, structure, duplication removal) while keeping all tests green.
  2. Run the full relevant test suite to confirm nothing broke.
- 3. Transition status to `refactor`, then immediately to `done`.
+ 3. Transition status to `refactor`.
+ 4. Submit for spec review (TDDSpecReviewer) and code quality review (TDDCodeQualityReviewer).
+ 5. After both reviewers return PASS, run checkpoint verification, then transition to `done`.
 
  ### Completion
 
@@ -124,14 +128,64 @@ When transitioning to `exception`:
  - Orchestrator MUST NOT write test or production code directly.
  - Orchestrator updates `test-list.md` status after each phase completes.
 
- ### Sub-agent Roles
+ ### Formal Sub-agent Roster
 
- | Role | Responsibility |
- | ----------- | -------------------------------------------- |
- | TestWriter | Writes the failing test (Red phase) |
- | Implementer | Writes minimal production code (Green phase) |
- | Refactorer | Improves code quality (Refactor phase) |
- | TestRunner | Executes tests and reports pass/fail results |
+ This skill delegates to 6 named sub-agents. Each has explicit responsibilities, prohibitions, and handoff contracts.
+ RedGreenAuditor is the sole authority for RED/GREEN observation confirmation;
+ self-certification by TDDImplementer is prohibited.
+
+ #### TDDCycleController
+
+ - Responsibilities: reads `test-list.md`, selects the next pending item, enforces Red-Green-Refactor-Review-Checkpoint ordering,
+ blocks advancement until completion conditions are met, oversized item splitting (target: completion within minutes)
+ - Prohibitions: must not write test or production code directly, must not edit spec artifacts, must not authorize parallel dispatch without ParallelSliceDispatcher confirmation of independence
+
+ #### TDDImplementer
+
+ - Responsibilities: implements the selected single item only — writes a failing test first,
+ writes minimal production code to make it pass, performs refactor while keeping tests green, performs local self-inspection before handoff
+ - Prohibitions: must not write production code before the failing test exists,
+ must not confirm its own RED/GREEN observations (self-certification prohibited — only RedGreenAuditor may confirm RED/GREEN observations),
+ must not work on more than one item simultaneously, must not perform speculative generalization, must not mix unrelated refactoring
+
+ #### RedGreenAuditor
+
+ - Responsibilities: sole authority for confirming RED and GREEN observations — verifies that the test actually failed for the expected reason (watch it fail),
+ verifies that the test actually passed after implementation (watch it pass), verifies that refactored code maintains green state
+ - Prohibitions: must not accept reasoning-only confirmation without actual test execution output, must not accept setup failures / import errors / typo failures as valid RED observations
+
+ #### TDDSpecReviewer
+
+ - Responsibilities: reviews alignment with `01_Spec.md`, `06_Test-Cases.md`, `09_delta.md`, `10_Plan.md` — detects scope creep,
+ verifies `test-list.md` updates match spec references, performs spec review as an independent gate
+ - Prohibitions: must not issue style-only reviews that skip compliance checks, must not permit spec drift through reviewer notes alone while allowing completion
+
+ #### TDDCodeQualityReviewer
+
+ - Responsibilities: reviews duplication, naming, hidden coupling, edge cases, error boundaries, security assumptions —
+ verifies refactor achieves design improvement, performs code quality review as an independent gate
+ - Prohibitions: must not issue style-nit-only reviews that skip design analysis, must not conflate spec compliance with quality review scope,
+ must not be self-approved by TDDImplementer (TDDImplementer cannot serve as TDDCodeQualityReviewer for its own work)
+
+ #### ParallelSliceDispatcher
+
+ - Responsibilities: sole authority for authorizing parallel dispatch — evaluates independence of candidate slices, requires worktree/branch separation, defines post-merge integration verify conditions
+ - Prohibitions: must not authorize parallel dispatch for slices sharing the same behavior R/G/R cycle, same API surface, same fixture/mock/DI/global setup, or any unexplained independence claim
+
+ ### Handoff Contracts
+
+ All agent-to-agent transitions follow these 8 defined contracts:
+
+ 1. **TDDCycleController -> TDDImplementer**: Controller selects item, sets status to `red`, hands off item context (TDD-ID, TC-Refs, spec references) to Implementer
+ 2. **TDDImplementer -> RedGreenAuditor**: Implementer submits RED/GREEN observation
+ (test command + actual output: failing for RED, passing for GREEN) for verification; Auditor confirms or rejects the observation state
+ 3. **RedGreenAuditor -> TDDImplementer**: Auditor returns RED/GREEN confirmation
+ (RED: proceed to implementation; GREEN: proceed to spec review) or rejection (resubmit with valid and correctly classified test run)
+ 4. **TDDImplementer -> TDDSpecReviewer**: After GREEN confirmed by RedGreenAuditor, Implementer submits item for spec review with implementation summary and test evidence
+ 5. **TDDSpecReviewer -> TDDImplementer**: Reviewer returns PASS (proceed to quality review) or FAIL with required fixes
+ 6. **TDDImplementer -> TDDCodeQualityReviewer**: After spec review PASS, Implementer submits for code quality review
+ 7. **TDDCodeQualityReviewer -> TDDImplementer**: Reviewer returns PASS (item can be marked done) or FAIL with required fixes
+ 8. **TDDCycleController -> ParallelSliceDispatcher**: Controller requests parallel dispatch evaluation; Dispatcher returns authorization or denial with rationale
 
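One way to read the eight handoff contracts is as a fixed set of sender/receiver pairs, with everything else rejected. The sketch below is illustrative only. The agent names come from the roster, while `HANDOFFS`, `is_defined_handoff`, and `may_confirm_observation` are hypothetical helpers that qfai is not known to expose.

```python
# The eight (sender, receiver) pairs listed under "Handoff Contracts".
HANDOFFS = {
    ("TDDCycleController", "TDDImplementer"),
    ("TDDImplementer", "RedGreenAuditor"),
    ("RedGreenAuditor", "TDDImplementer"),
    ("TDDImplementer", "TDDSpecReviewer"),
    ("TDDSpecReviewer", "TDDImplementer"),
    ("TDDImplementer", "TDDCodeQualityReviewer"),
    ("TDDCodeQualityReviewer", "TDDImplementer"),
    ("TDDCycleController", "ParallelSliceDispatcher"),
}


def is_defined_handoff(sender: str, receiver: str) -> bool:
    """True only for the eight contracts in the roster."""
    return (sender, receiver) in HANDOFFS


def may_confirm_observation(agent: str) -> bool:
    """Only RedGreenAuditor may confirm RED/GREEN; self-certification is out."""
    return agent == "RedGreenAuditor"
```

Under this reading, a reviewer can never hand an item directly to another reviewer: `is_defined_handoff("TDDSpecReviewer", "TDDCodeQualityReviewer")` is false because every review result returns to TDDImplementer first.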
  ### Capability Probe (MUST)
 
@@ -163,19 +217,71 @@ Every major artifact in this stage MUST include this table schema:
 
  ## Parallelization Policy
 
- - **Default**: Serial execution. Items are processed one at a time in `test-list.md` order.
+ - **Default**: Serial execution. Items are processed one test at a time in `test-list.md` order.
  - **Exception**: When items target completely independent SUT modules with no shared state, parallel processing may be used with explicit user approval.
  - Serial execution ensures that each test is written and verified in isolation before moving to the next.
+ - ParallelSliceDispatcher is the sole authority for authorizing parallel dispatch.
+
+ ### Allow conditions (all must be true)
+
+ - Independent SUT (no shared source files under test)
+ - Independent test files (no shared test files or fixtures)
+ - No shared state (no shared database, global variable, singleton, or DI container)
+ - No sequential dependency (Slice B does not depend on Slice A output)
+ - Worktree or branch separation is available
+ - Post-merge integration verify plan exists
+
+ ### Deny conditions (any one blocks parallel dispatch)
+
+ - Same behavior Red/Green/Refactor cycle across slices
+ - Same public API surface modified by multiple slices
+ - Shared fixture, shared mock, shared DI container, shared global setup
+ - Sequential dependency: "A must finish before B has meaning"
+ - Independence claim cannot be explained with concrete file/module evidence
+
+ ### Post-parallel integration verify
+
+ - After parallel slices complete and merge, run integration verify on the merged result
+ - If integration verify fails, flag all slices for re-examination and roll back the merge
+ - If integration verify passes, state transitions back to TDDCycleController for sequential flow
 
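The six allow conditions lend themselves to a check that returns the reasons for denial rather than a bare yes/no, which also satisfies the requirement that a denial carry a rationale. The following is a rough sketch, not the dispatcher's actual logic. `Slice` and `denial_reasons` are hypothetical names, and the model assumes each slice can list its files and shared resources explicitly. It does not cover the separate requirement for explicit user approval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slice:
    """Hypothetical description of one candidate slice."""
    name: str
    source_files: frozenset[str]
    test_files: frozenset[str]                    # include fixtures here
    shared_state: frozenset[str] = frozenset()    # e.g. "db", "di-container"
    depends_on: frozenset[str] = frozenset()      # names of prerequisite slices
    isolated_worktree: bool = False


def denial_reasons(a: Slice, b: Slice, has_integration_plan: bool) -> list[str]:
    """Return why a and b may not run in parallel; empty means all six hold."""
    reasons = []
    if a.source_files & b.source_files:
        reasons.append("shared source files under test")
    if a.test_files & b.test_files:
        reasons.append("shared test files or fixtures")
    if a.shared_state & b.shared_state:
        reasons.append("shared state")
    if b.name in a.depends_on or a.name in b.depends_on:
        reasons.append("sequential dependency")
    if not (a.isolated_worktree and b.isolated_worktree):
        reasons.append("worktree or branch separation unavailable")
    if not has_integration_plan:
        reasons.append("no post-merge integration verify plan")
    return reasons
```

Because the function needs concrete file and resource sets as input, an independence claim that cannot be backed by such evidence simply cannot be expressed, which mirrors the last deny condition.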
  ## Completion Contract (Shared)
 
- Before declaring completion, you MUST:
+ ### Item completion checklist (10-point gate)
+
+ An item in `test-list.md` may transition to `done` only when ALL of the following are satisfied:
+
+ 1. Corresponding `TDD-ID` has been selected and is in progress
+ 2. A failing test was added first (test-first)
+ 3. RED was observed — RedGreenAuditor confirmed the test failed for the expected reason (watch it fail)
+ 4. Minimal production code was written to make the test pass
+ 5. GREEN was observed — RedGreenAuditor confirmed the test passes after implementation (watch it pass)
+ 6. Refactor was performed and GREEN was re-confirmed after refactor
+ 7. TDDSpecReviewer returned PASS (spec review gate)
+ 8. TDDCodeQualityReviewer returned PASS (code quality review gate)
+ 9. `test-list.md` Status and Evidence columns are updated with fresh evidence
+ 10. Checkpoint verification passed
+
+ ### Spec completion conditions
+
+ The skill may declare "this spec's implementation is complete" only when:
 
- - All `todo` items in `test-list.md` have been processed.
- - Each processed item reached `done` or `exception` status.
- - All tests pass (`npm test` or equivalent).
- - `test-list.md` reflects the final state accurately.
- - Exception items have DR-IDs recorded (in Notes column if present).
+ - All TC-\* from `06_Test-Cases.md` with applicable layer are present in `test-list.md`
+ - Each item reached `done` or valid `exception` (with DR-ID)
+ - 0 blocking reviewer issues remain
+ - Checkpoint verification passed
+ - No unresolved Change Request or waiver dependency exists
+
+ ### Completion prohibition conditions
+
+ Completion MUST NOT be declared when any of the following are true:
+
+ - No RED fresh evidence exists for the item
+ - No GREEN fresh evidence exists for the item
+ - Either reviewer (TDDSpecReviewer or TDDCodeQualityReviewer) has not been run or returned FAIL
+ - Items with `todo`, `red`, `green`, or `refactor` status still exist (for spec-level completion)
+ - Parallel slices were used but integration verify has not been run post-merge
+ - Checkpoint boundary was reached but verification was not executed
 
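As a rough illustration of the completion gates, the 10-point checklist can be treated as ten named booleans, and spec-level completion as "no row is left in a blocking status". All identifiers below (`CHECKLIST`, `unmet_conditions`, `blocking_items`) are hypothetical. The second function covers only the status and DR-ID condition and leaves out reviewer issues, checkpoint verification, and Change Request dependencies.

```python
# One key per checklist point, in the order listed above (names are illustrative).
CHECKLIST = (
    "tdd_id_selected", "failing_test_first", "red_confirmed",
    "minimal_code_written", "green_confirmed", "refactor_green_reconfirmed",
    "spec_review_pass", "quality_review_pass", "ledger_updated",
    "checkpoint_passed",
)


def unmet_conditions(item: dict[str, bool]) -> list[str]:
    """Return the checklist points that still block the `done` transition."""
    return [key for key in CHECKLIST if not item.get(key, False)]


def blocking_items(rows: list[dict[str, str]]) -> list[str]:
    """Return TDD-IDs that are neither `done` nor `exception` with a DR-ID."""
    blocking = []
    for row in rows:
        status = row["Status"]
        if status == "done":
            continue
        if status == "exception" and row.get("DR-ID", "").strip():
            continue
        blocking.append(row["TDD-ID"])
    return blocking
```

An item may move to `done` only when `unmet_conditions` returns an empty list. A missing key counts as unmet, so forgetting to record a gate fails closed.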
  ## Evidence (MANDATORY)
 
@@ -189,6 +295,28 @@ Required sections:
  - Exception items (if any) with DR-IDs
  - Commands executed
 
+ ### Per-item evidence contract (fresh evidence required)
+
+ Each TDD item MUST have fresh evidence containing at minimum:
+
+ - `TDD-ID` — the item identifier
+ - `TC-ref` — reference to the test case(s)
+ - `RED command` — the exact command executed to observe failure
+ - `RED result` — the failure output (result completeness is best-effort; truncated output is acceptable)
+ - `GREEN command` — the exact command executed to observe success
+ - `GREEN result` — the success output
+ - `Refactor verify command` — the exact command re-executed after refactor
+ - `Refactor verify result` — the output confirming GREEN is maintained
+ - `Spec review` — TDDSpecReviewer result (PASS or FAIL)
+ - `Code quality review` — TDDCodeQualityReviewer result (PASS or FAIL)
+
+ ### Evidence hard rules
+
+ - Status-only evidence (e.g., "Status: PASS" with no command) is invalid and MUST be rejected
+ - Both command and result are required; "should pass" or "looks good" alone is not acceptable
+ - Stale evidence from a previous run MUST NOT be reused to claim completion for a new cycle
+
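The per-item contract and hard rules above could be checked along these lines. This is a sketch under the assumption that evidence has already been split into the ten fields. `Evidence` and `evidence_problems` are hypothetical names. Staleness is not modeled, since detecting reuse from an earlier run would need timestamps or run IDs that the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """Hypothetical record with one field per required evidence entry."""
    tdd_id: str
    tc_ref: str
    red_command: str = ""
    red_result: str = ""
    green_command: str = ""
    green_result: str = ""
    refactor_command: str = ""
    refactor_result: str = ""
    spec_review: str = ""
    quality_review: str = ""


def evidence_problems(ev: Evidence) -> list[str]:
    """Return rule violations; an empty list means the minimum is met."""
    problems = []
    for phase in ("red", "green", "refactor"):
        command = getattr(ev, f"{phase}_command").strip()
        result = getattr(ev, f"{phase}_result").strip()
        # A result without a command is status-only evidence and is rejected.
        if not command or not result:
            problems.append(f"{phase}: command and result are both required")
    for review in ("spec_review", "quality_review"):
        if getattr(ev, review) not in ("PASS", "FAIL"):
            problems.append(f"{review}: expected PASS or FAIL")
    return problems
```

An entry whose only content is a `green_result` of "Status: PASS" fails all three phase checks and both review checks.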
  ## FINAL CHECKLIST (Check Last)
 
  - [ ] CRITICAL CONSTRAINTS were followed.
@@ -66,6 +66,19 @@ Each `spec-XXXX/` must satisfy:
 
  `_policies/` files must not contain lower-layer IDs (`US/AC/BR/EX/TC`) or `spec-XXXX` references.
 
+ ## TDD Execution Ledger (`tdd/test-list.md`)
+
+ Each `spec-XXXX/tdd/test-list.md` is the execution ledger for the TDD micro-cycle.
+
+ - **8 required columns**: TDD-ID, TC-Refs, Layer, Test file, Selector, Status, DR-ID, Evidence
+ - **Coverage** is measured as unit/component TC references from `06_Test-Cases.md` appearing in TC-Refs
+ - **Level column fallback**: when `06_Test-Cases.md` has no `Level` column, all TCs are treated as coverage targets (equivalent to all being unit/component)
+ - **Status=exception** rows must have a non-empty DR-ID (Decision Record reference)
+ - **Status in {green, refactor, done}** rows must have an existing Test file (resolved relative to project root)
+ - **TDD-ID** must match `TDD-NNNN` format and be unique within the spec (case-insensitive)
+ - Specs without `tdd/test-list.md` receive a `TDDLIST_MISSING` warning (not error)
+ - Old 6-column format (missing DR-ID/Evidence) triggers `TDDLIST_REQUIRED_COLUMN_MISSING` error
+
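The row-level rules in this list might be validated roughly as follows. This is a sketch, not qfai's validator. It assumes `N` in `TDD-NNNN` means a decimal digit and that case-insensitivity applies to both the format match and the uniqueness check. Of the strings it emits, only `TDDLIST_REQUIRED_COLUMN_MISSING` appears in the text above; the other messages and the name `validate_ledger` are made up for illustration. Coverage measurement and the `TDDLIST_MISSING` warning are out of scope here.

```python
import re
from pathlib import Path

REQUIRED = ["TDD-ID", "TC-Refs", "Layer", "Test file",
            "Selector", "Status", "DR-ID", "Evidence"]
# Assumption: exactly four decimal digits, matched case-insensitively.
TDD_ID = re.compile(r"TDD-\d{4}", re.IGNORECASE)


def validate_ledger(header: list[str], rows: list[dict[str, str]],
                    project_root: Path) -> list[str]:
    """Return findings for one ledger; an empty list means no rule was broken."""
    missing = [col for col in REQUIRED if col not in header]
    if missing:
        return [f"TDDLIST_REQUIRED_COLUMN_MISSING: {', '.join(missing)}"]
    findings = []
    seen = set()
    for row in rows:
        tdd_id = row["TDD-ID"]
        if not TDD_ID.fullmatch(tdd_id):
            findings.append(f"{tdd_id}: TDD-ID must match TDD-NNNN")
        if tdd_id.upper() in seen:
            findings.append(f"{tdd_id}: duplicate TDD-ID")
        seen.add(tdd_id.upper())
        if row["Status"] == "exception" and not row["DR-ID"].strip():
            findings.append(f"{tdd_id}: exception row has empty DR-ID")
        if (row["Status"] in {"green", "refactor", "done"}
                and not (project_root / row["Test file"]).is_file()):
            findings.append(f"{tdd_id}: test file not found: {row['Test file']}")
    return findings
```

A ledger in the old six-column layout stops at the first check, so row-level findings are only reported once the header is complete.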
  ## Notes
 
  - `specs/` is definition-only. Keep operational status as run logs under `.qfai/report/run-*/`.
@@ -1,4 +1,4 @@
  # TDD Execution Ledger
 
- | TDD-ID | TC-Refs | Layer | Test file | Selector | Status |
- | ------ | ------- | ----- | --------- | -------- | ------ |
+ | TDD-ID | TC-Refs | Layer | Test file | Selector | Status | DR-ID | Evidence |
+ | ------ | ------- | ----- | --------- | -------- | ------ | ----- | -------- |