npm - qfai - Versions diffs - 1.6.0 → 1.6.2 - Mend

qfai 1.6.0 → 1.6.2


Files changed (15)

package/assets/init/.qfai/assistant/instructions/workflow.md +3 -0
package/assets/init/.qfai/assistant/skills/qfai-implement/SKILL.md +154 -26
package/assets/init/.qfai/specs/README.md +13 -0
package/assets/init/.qfai/specs/spec-XXXX/tdd/test-list.md +2 -2
package/dist/cli/index.cjs +679 -341
package/dist/cli/index.cjs.map +1 -1
package/dist/cli/index.mjs +680 -342
package/dist/cli/index.mjs.map +1 -1
package/dist/index.cjs +683 -345
package/dist/index.cjs.map +1 -1
package/dist/index.d.cts +17 -1
package/dist/index.d.ts +17 -1
package/dist/index.mjs +684 -346
package/dist/index.mjs.map +1 -1
package/package.json +1 -1

package/assets/init/.qfai/assistant/instructions/workflow.md CHANGED

@@ -69,6 +69,9 @@ Prototyping stage policy:
 Implementation stage:
 - `/qfai-implement` orchestrates the full TDD micro-cycle (Red/Green/Refactor) one test at a time using `test-list.md` as the execution ledger.
+- Each item requires watch it fail (RED observation confirmed), watch it pass (GREEN observation confirmed), and fresh evidence (command+result pairs, not status-only).
+- Completion requires independent spec review and code quality review gates — both must PASS before an item is marked done.
+- Parallel execution is allowed only for independent slices with no shared state; worktree separation is required.
 Legacy note:

package/assets/init/.qfai/assistant/skills/qfai-implement/SKILL.md CHANGED

@@ -56,14 +56,16 @@ Execute the TDD micro-cycle for each pending item in `test-list.md`, transitioni
 The execution ledger at `.qfai/specs/spec-XXXX/tdd/test-list.md` tracks progress with these required columns:
-| Column    | Description                                        |
-| --------- | -------------------------------------------------- |
-| TDD-ID    | Unique identifier for the TDD item (e.g., TDD-001) |
-| TC-Refs   | References to test cases from `06_Test-Cases.md`   |
-| Layer     | Test layer (Unit, Integration, etc.)               |
-| Test file | Path to the test file                              |
-| Selector  | Test selector/description for targeted execution   |
-| Status    | Current lifecycle status                           |
+| Column    | Description                                              |
+| --------- | -------------------------------------------------------- |
+| TDD-ID    | Unique identifier for the TDD item (e.g., TDD-0001)      |
+| TC-Refs   | References to test cases from `06_Test-Cases.md`         |
+| Layer     | Test layer (Unit, Integration, etc.)                     |
+| Test file | Path to the test file                                    |
+| Selector  | Test selector/description for targeted execution         |
+| Status    | Current lifecycle status                                 |
+| DR-ID     | Decision Record ID for exception items (blank otherwise) |
+| Evidence  | RED/GREEN command+result pairs proving the TDD cycle     |
 ### Status Lifecycle
@@ -75,7 +77,7 @@ Allowed transitions:
 - `red` -> `green` (make the test pass with minimal code)
 - `green` -> `refactor` (improve code quality while keeping tests green)
 - `refactor` -> `done` (item complete)
-- Any active status -> `exception` (anomaly detected; record DR-ID in Notes column if present)
+- Any active status -> `exception` (anomaly detected; record DR-ID in DR-ID column)
 Backward transitions are prohibited. Attempting `green` -> `red` must produce:
 `"Backward transition prohibited: green -> red"`.
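
Reviewer note: the transition rules in this hunk can be read as a small state check. The sketch below is illustrative only. `Status`, `ORDER`, and `checkTransition` are hypothetical names, not part of qfai, and it assumes that `todo` -> `red` is the first forward step and that "active" means any status other than `done` or `exception`. Only the backward-transition message is taken from the skill text; the other message is invented for the example.

```typescript
// Hypothetical sketch; none of these names come from the qfai package.
type Status = "todo" | "red" | "green" | "refactor" | "done" | "exception";

// Forward order of the lifecycle (assumption: todo -> red is the first step).
const ORDER: Status[] = ["todo", "red", "green", "refactor", "done"];

// Returns null when the transition is allowed, otherwise an error message.
function checkTransition(from: Status, to: Status): string | null {
  const notAllowed = `Transition not allowed: ${from} -> ${to}`; // invented wording
  if (to === "exception") {
    // Assumption: done and exception are terminal, everything else is "active".
    return from === "done" || from === "exception" ? notAllowed : null;
  }
  const i = ORDER.indexOf(from);
  const j = ORDER.indexOf(to);
  if (i === -1) return notAllowed; // leaving `exception` is not described in the text
  if (j < i) return `Backward transition prohibited: ${from} -> ${to}`;
  return j === i + 1 ? null : notAllowed; // only single forward steps pass
}
```

For example, `checkTransition("green", "red")` yields the exact message quoted in the skill text, while `checkTransition("red", "green")` yields `null`.
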
@@ -84,8 +86,8 @@ Backward transitions are prohibited. Attempting `green` -> `red` must produce:
 When transitioning to `exception`:
-- A DR-ID (Decision Record ID) should be recorded in the Notes column if present.
-- If a Notes column exists but is empty, emit warning: `"exception status requires DR-ID in Notes column"`.
+- A DR-ID (Decision Record ID) must be recorded in the DR-ID column.
+- If the DR-ID column is empty, emit error: `"exception status requires DR-ID in DR-ID column"`.
 ## Required Process
@@ -108,7 +110,9 @@ When transitioning to `exception`:
 1. Improve code quality (naming, structure, duplication removal) while keeping all tests green.
 2. Run the full relevant test suite to confirm nothing broke.
-3. Transition status to `refactor`, then immediately to `done`.
+3. Transition status to `refactor`.
+4. Submit for spec review (TDDSpecReviewer) and code quality review (TDDCodeQualityReviewer).
+5. After both reviewers return PASS, run checkpoint verification, then transition to `done`.
 ### Completion
@@ -124,14 +128,64 @@ When transitioning to `exception`:
 - Orchestrator MUST NOT write test or production code directly.
 - Orchestrator updates `test-list.md` status after each phase completes.
-### Sub-agent Roles
+### Formal Sub-agent Roster
-| Role        | Responsibility                               |
-| ----------- | -------------------------------------------- |
-| TestWriter  | Writes the failing test (Red phase)          |
-| Implementer | Writes minimal production code (Green phase) |
-| Refactorer  | Improves code quality (Refactor phase)       |
-| TestRunner  | Executes tests and reports pass/fail results |
+This skill delegates to 6 named sub-agents. Each has explicit responsibilities, prohibitions, and handoff contracts.
+RedGreenAuditor is the sole authority for RED/GREEN observation confirmation;
+self-certification by TDDImplementer is prohibited.
+#### TDDCycleController
+- Responsibilities: reads `test-list.md`, selects the next pending item, enforces Red-Green-Refactor-Review-Checkpoint ordering,
+  blocks advancement until completion conditions are met, oversized item splitting (target: completion within minutes)
+- Prohibitions: must not write test or production code directly, must not edit spec artifacts, must not authorize parallel dispatch without ParallelSliceDispatcher confirmation of independence
+#### TDDImplementer
+- Responsibilities: implements the selected single item only — writes a failing test first,
+  writes minimal production code to make it pass, performs refactor while keeping tests green, performs local self-inspection before handoff
+- Prohibitions: must not write production code before the failing test exists,
+  must not confirm its own RED/GREEN observations (self-certification prohibited — only RedGreenAuditor may confirm RED/GREEN observations),
+  must not work on more than one item simultaneously, must not perform speculative generalization, must not mix unrelated refactoring
+#### RedGreenAuditor
+- Responsibilities: sole authority for confirming RED and GREEN observations — verifies that the test actually failed for the expected reason (watch it fail),
+  verifies that the test actually passed after implementation (watch it pass), verifies that refactored code maintains green state
+- Prohibitions: must not accept reasoning-only confirmation without actual test execution output, must not accept setup failures / import errors / typo failures as valid RED observations
+#### TDDSpecReviewer
+- Responsibilities: reviews alignment with `01_Spec.md`, `06_Test-Cases.md`, `09_delta.md`, `10_Plan.md` — detects scope creep,
+  verifies `test-list.md` updates match spec references, performs spec review as an independent gate
+- Prohibitions: must not issue style-only reviews that skip compliance checks, must not permit spec drift through reviewer notes alone while allowing completion
+#### TDDCodeQualityReviewer
+- Responsibilities: reviews duplication, naming, hidden coupling, edge cases, error boundaries, security assumptions —
+  verifies refactor achieves design improvement, performs code quality review as an independent gate
+- Prohibitions: must not issue style-nit-only reviews that skip design analysis, must not conflate spec compliance with quality review scope,
+  must not be self-approved by TDDImplementer (TDDImplementer cannot serve as TDDCodeQualityReviewer for its own work)
+#### ParallelSliceDispatcher
+- Responsibilities: sole authority for authorizing parallel dispatch — evaluates independence of candidate slices, requires worktree/branch separation, defines post-merge integration verify conditions
+- Prohibitions: must not authorize parallel dispatch for slices sharing the same behavior R/G/R cycle, same API surface, same fixture/mock/DI/global setup, or any unexplained independence claim
+### Handoff Contracts
+All agent-to-agent transitions follow these 8 defined contracts:
+1. **TDDCycleController -> TDDImplementer**: Controller selects item, sets status to `red`, hands off item context (TDD-ID, TC-Refs, spec references) to Implementer
+2. **TDDImplementer -> RedGreenAuditor**: Implementer submits RED/GREEN observation
+   (test command + actual output: failing for RED, passing for GREEN) for verification; Auditor confirms or rejects the observation state
+3. **RedGreenAuditor -> TDDImplementer**: Auditor returns RED/GREEN confirmation
+   (RED: proceed to implementation; GREEN: proceed to spec review) or rejection (resubmit with valid and correctly classified test run)
+4. **TDDImplementer -> TDDSpecReviewer**: After GREEN confirmed by RedGreenAuditor, Implementer submits item for spec review with implementation summary and test evidence
+5. **TDDSpecReviewer -> TDDImplementer**: Reviewer returns PASS (proceed to quality review) or FAIL with required fixes
+6. **TDDImplementer -> TDDCodeQualityReviewer**: After spec review PASS, Implementer submits for code quality review
+7. **TDDCodeQualityReviewer -> TDDImplementer**: Reviewer returns PASS (item can be marked done) or FAIL with required fixes
+8. **TDDCycleController -> ParallelSliceDispatcher**: Controller requests parallel dispatch evaluation; Dispatcher returns authorization or denial with rationale
 ### Capability Probe (MUST)
@@ -163,19 +217,71 @@ Every major artifact in this stage MUST include this table schema:
 ## Parallelization Policy
-- **Default**: Serial execution. Items are processed one at a time in `test-list.md` order.
+- **Default**: Serial execution. Items are processed one test at a time in `test-list.md` order.
 - **Exception**: When items target completely independent SUT modules with no shared state, parallel processing may be used with explicit user approval.
 - Serial execution ensures that each test is written and verified in isolation before moving to the next.
+- ParallelSliceDispatcher is the sole authority for authorizing parallel dispatch.
+### Allow conditions (all must be true)
+- Independent SUT (no shared source files under test)
+- Independent test files (no shared test files or fixtures)
+- No shared state (no shared database, global variable, singleton, or DI container)
+- No sequential dependency (Slice B does not depend on Slice A output)
+- Worktree or branch separation is available
+- Post-merge integration verify plan exists
+### Deny conditions (any one blocks parallel dispatch)
+- Same behavior Red/Green/Refactor cycle across slices
+- Same public API surface modified by multiple slices
+- Shared fixture, shared mock, shared DI container, shared global setup
+- Sequential dependency: "A must finish before B has meaning"
+- Independence claim cannot be explained with concrete file/module evidence
+### Post-parallel integration verify
+- After parallel slices complete and merge, run integration verify on the merged result
+- If integration verify fails, flag all slices for re-examination and roll back the merge
+- If integration verify passes, state transitions back to TDDCycleController for sequential flow
 ## Completion Contract (Shared)
-Before declaring completion, you MUST:
+### Item completion checklist (10-point gate)
+An item in `test-list.md` may transition to `done` only when ALL of the following are satisfied:
+1. Corresponding `TDD-ID` has been selected and is in progress
+2. A failing test was added first (test-first)
+3. RED was observed — RedGreenAuditor confirmed the test failed for the expected reason (watch it fail)
+4. Minimal production code was written to make the test pass
+5. GREEN was observed — RedGreenAuditor confirmed the test passes after implementation (watch it pass)
+6. Refactor was performed and GREEN was re-confirmed after refactor
+7. TDDSpecReviewer returned PASS (spec review gate)
+8. TDDCodeQualityReviewer returned PASS (code quality review gate)
+9. `test-list.md` Status and Evidence columns are updated with fresh evidence
+10. Checkpoint verification passed
+### Spec completion conditions
+The skill may declare "this spec's implementation is complete" only when:
-- All `todo` items in `test-list.md` have been processed.
-- Each processed item reached `done` or `exception` status.
-- All tests pass (`npm test` or equivalent).
-- `test-list.md` reflects the final state accurately.
-- Exception items have DR-IDs recorded (in Notes column if present).
+- All TC-\* from `06_Test-Cases.md` with applicable layer are present in `test-list.md`
+- Each item reached `done` or valid `exception` (with DR-ID)
+- 0 blocking reviewer issues remain
+- Checkpoint verification passed
+- No unresolved Change Request or waiver dependency exists
+### Completion prohibition conditions
+Completion MUST NOT be declared when any of the following are true:
+- No RED fresh evidence exists for the item
+- No GREEN fresh evidence exists for the item
+- Either reviewer (TDDSpecReviewer or TDDCodeQualityReviewer) has not been run or returned FAIL
+- Items with `todo`, `red`, `green`, or `refactor` status still exist (for spec-level completion)
+- Parallel slices were used but integration verify has not been run post-merge
+- Checkpoint boundary was reached but verification was not executed
 ## Evidence (MANDATORY)
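
Reviewer note: the allow/deny conditions added to the Parallelization Policy in this hunk amount to "every allow condition holds and no deny condition is present". A minimal sketch of that rule follows. The interfaces and `evaluateDispatch` are hypothetical and only mirror the bullet lists; how ParallelSliceDispatcher actually gathers these facts is not shown in the diff.

```typescript
// Hypothetical shapes; field names paraphrase the bullet lists, not a qfai API.
interface AllowConditions {
  independentSut: boolean;
  independentTestFiles: boolean;
  noSharedState: boolean;
  noSequentialDependency: boolean;
  worktreeOrBranchSeparation: boolean;
  integrationVerifyPlan: boolean;
}

interface DenyConditions {
  sameBehaviorCycle: boolean;
  samePublicApiSurface: boolean;
  sharedFixtureMockOrSetup: boolean;
  sequentialDependency: boolean;
  unexplainedIndependenceClaim: boolean;
}

interface DispatchDecision {
  authorized: boolean;
  reasons: string[]; // rationale returned with a denial
}

function evaluateDispatch(allow: AllowConditions, deny: DenyConditions): DispatchDecision {
  const reasons: string[] = [];
  // All allow conditions must be true.
  for (const name of Object.keys(allow) as (keyof AllowConditions)[]) {
    if (!allow[name]) reasons.push(`allow condition not met: ${name}`);
  }
  // Any single deny condition blocks parallel dispatch.
  for (const name of Object.keys(deny) as (keyof DenyConditions)[]) {
    if (deny[name]) reasons.push(`deny condition present: ${name}`);
  }
  return { authorized: reasons.length === 0, reasons };
}
```

Returning the list of reasons rather than a bare boolean matches the handoff contract in which the Dispatcher returns "authorization or denial with rationale".
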
@@ -189,6 +295,28 @@ Required sections:
 - Exception items (if any) with DR-IDs
 - Commands executed
+### Per-item evidence contract (fresh evidence required)
+Each TDD item MUST have fresh evidence containing at minimum:
+- `TDD-ID` — the item identifier
+- `TC-ref` — reference to the test case(s)
+- `RED command` — the exact command executed to observe failure
+- `RED result` — the failure output (result completeness is best-effort; truncated output is acceptable)
+- `GREEN command` — the exact command executed to observe success
+- `GREEN result` — the success output
+- `Refactor verify command` — the exact command re-executed after refactor
+- `Refactor verify result` — the output confirming GREEN is maintained
+- `Spec review` — TDDSpecReviewer result (PASS or FAIL)
+- `Code quality review` — TDDCodeQualityReviewer result (PASS or FAIL)
+### Evidence hard rules
+- Status-only evidence (e.g., "Status: PASS" with no command) is invalid and MUST be rejected
+- Both command and result are required; "should pass" or "looks good" alone is not acceptable
+- Stale evidence from a previous run MUST NOT be reused to claim completion for a new cycle
+- Empty evidence entries are rejected: minimum evidence per TDD item must be met
 ## FINAL CHECKLIST (Check Last)
 - [ ] CRITICAL CONSTRAINTS were followed.
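
Reviewer note: the per-item evidence contract and hard rules added above could be checked mechanically along these lines. This is a sketch under assumptions: `Observation`, `ItemEvidence`, and `evidenceProblems` are hypothetical names, the messages are invented, and staleness ("fresh" evidence) is not checked because the diff does not say how freshness is recorded.

```typescript
// Hypothetical sketch of the evidence contract; not qfai's implementation.
interface Observation {
  command: string; // exact command executed
  result: string;  // output observed (may be truncated)
}

type ReviewResult = "PASS" | "FAIL";

interface ItemEvidence {
  tddId: string;
  tcRef: string;
  red: Observation;
  green: Observation;
  refactorVerify: Observation;
  specReview?: ReviewResult;        // undefined = reviewer not run yet
  codeQualityReview?: ReviewResult; // undefined = reviewer not run yet
}

// Returns a list of problems; an empty list means the contract is met.
function evidenceProblems(e: ItemEvidence): string[] {
  const problems: string[] = [];
  const phases: [string, Observation][] = [
    ["RED", e.red],
    ["GREEN", e.green],
    ["Refactor verify", e.refactorVerify],
  ];
  for (const [label, obs] of phases) {
    // Both command and result are required; status-only entries are invalid.
    if (obs.command.trim() === "") problems.push(`${label}: command missing`);
    if (obs.result.trim() === "") problems.push(`${label}: result missing`);
  }
  if (e.specReview !== "PASS") problems.push("Spec review: not PASS");
  if (e.codeQualityReview !== "PASS") problems.push("Code quality review: not PASS");
  return problems;
}
```

An entry whose RED observation is only `"Status: PASS"` with an empty command would be reported as `RED: command missing`, which is the status-only case the hard rules reject.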

package/assets/init/.qfai/specs/README.md CHANGED

@@ -66,6 +66,19 @@ Each `spec-XXXX/` must satisfy:
 `_policies/` files must not contain lower-layer IDs (`US/AC/BR/EX/TC`) or `spec-XXXX` references.
+## TDD Execution Ledger (`tdd/test-list.md`)
+Each `spec-XXXX/tdd/test-list.md` is the execution ledger for the TDD micro-cycle.
+- **8 required columns**: TDD-ID, TC-Refs, Layer, Test file, Selector, Status, DR-ID, Evidence
+- **Coverage** is measured as unit/component TC references from `06_Test-Cases.md` appearing in TC-Refs
+- **Level column fallback**: when `06_Test-Cases.md` has no `Level` column, all TCs are treated as coverage targets (equivalent to all being unit/component)
+- **Status=exception** rows must have a non-empty DR-ID (Decision Record reference)
+- **Status in {green, refactor, done}** rows must have an existing Test file (resolved relative to project root)
+- **TDD-ID** must match `TDD-NNNN` format and be unique within the spec (case-insensitive)
+- Specs without `tdd/test-list.md` receive a `TDDLIST_MISSING` warning (not error)
+- Old 6-column format (missing DR-ID/Evidence) triggers `TDDLIST_REQUIRED_COLUMN_MISSING` error
 ## Notes
 - `specs/` is definition-only. Keep operational status as run logs under `.qfai/report/run-*/`.
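
Reviewer note: three of the row-level ledger rules added to this README (TDD-ID format, uniqueness, DR-ID on exception rows) are simple enough to sketch. `LedgerRow` and `rowProblems` are hypothetical names, and the sketch assumes the "case-insensitive" qualifier applies to both the format match and the uniqueness check. The Test file existence rule is left out because it needs filesystem access. Only the DR-ID message wording comes from the SKILL.md diff above; the other messages are invented.

```typescript
// Hypothetical sketch of row checks; not the validator shipped in qfai.
interface LedgerRow {
  tddId: string;
  status: string;
  drId: string;
}

function rowProblems(rows: LedgerRow[]): string[] {
  const problems: string[] = [];
  const seen: string[] = []; // lowercased IDs already encountered
  for (const row of rows) {
    const key = row.tddId.trim().toLowerCase();
    // Assumption: TDD-NNNN means "TDD-" followed by exactly four digits.
    if (!/^tdd-\d{4}$/.test(key)) {
      problems.push(`${row.tddId}: not in TDD-NNNN format`);
    }
    if (seen.indexOf(key) !== -1) {
      problems.push(`${row.tddId}: duplicate TDD-ID`);
    }
    seen.push(key);
    if (row.status === "exception" && row.drId.trim() === "") {
      problems.push(`${row.tddId}: exception status requires DR-ID in DR-ID column`);
    }
  }
  return problems;
}
```

Under this reading, `TDD-001` (the three-digit example removed from SKILL.md in this release) would no longer pass the format check.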

package/assets/init/.qfai/specs/spec-XXXX/tdd/test-list.md CHANGED

@@ -1,4 +1,4 @@
 # TDD Execution Ledger
-| TDD-ID | TC-Refs | Layer | Test file | Selector | Status |
-| ------ | ------- | ----- | --------- | -------- | ------ |
+| TDD-ID | TC-Refs | Layer | Test file | Selector | Status | DR-ID | Evidence |
+| ------ | ------- | ----- | --------- | -------- | ------ | ----- | -------- |
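
Reviewer note: since the template header grows from 6 to 8 columns, ledgers created from the old template lack DR-ID and Evidence. A minimal sketch of a header check follows. `REQUIRED_COLUMNS` and `missingColumns` are hypothetical names, and the parsing assumes a plain pipe-delimited markdown header row. Per the README diff, a non-empty result corresponds to the `TDDLIST_REQUIRED_COLUMN_MISSING` error.

```typescript
// Hypothetical sketch; qfai's own parser is not shown in this diff.
const REQUIRED_COLUMNS: string[] = [
  "TDD-ID", "TC-Refs", "Layer", "Test file",
  "Selector", "Status", "DR-ID", "Evidence",
];

// Returns the required columns that are absent from a markdown header row.
function missingColumns(headerLine: string): string[] {
  const cells = headerLine
    .split("|")
    .map((cell) => cell.trim())
    .filter((cell) => cell !== ""); // drop the empty edges around outer pipes
  return REQUIRED_COLUMNS.filter((col) => cells.indexOf(col) === -1);
}
```

Applied to the old header `| TDD-ID | TC-Refs | Layer | Test file | Selector | Status |`, this reports `DR-ID` and `Evidence` as missing; the new 8-column header reports nothing.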