npm - qfai - Versions diffs - 1.8.2 → 1.8.4 - Mend

qfai 1.8.2 → 1.8.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (39)

package/assets/init/.qfai/assistant/skills/qfai-prototyping/references/l1-review-guide.md CHANGED

@@ -2,10 +2,12 @@
 L1 checks implementation fidelity.
-## Inputs
+## Inputs (read from review-bundle.json)
-- screenshots
-- HTML snapshots
+- screenshots (round/candidate path, per declared screen)
+- HTML snapshots (round/candidate path, per declared screen)
+- accessibility snapshots (round/candidate path, per declared screen)
+- Playwright CLI command log (round/candidate path, per declared screen)
 - canonical UI contracts from `.qfai/contracts/ui/*.yaml`
 - latest code state
@@ -13,24 +15,23 @@ L1 checks implementation fidelity.
 For each declared screen:
-- the screen is reachable/rendered
-- screenshot exists
-- HTML snapshot exists
-- required elements are visibly present
-- required actions are wired or explicitly marked missing
+- the screen is reachable/rendered (confirm via goto log in command log)
+- screenshot, HTML, accessibility snapshot, and command log all exist at the round/candidate path
+- required elements are visibly present (cross-check screenshot + HTML + snapshot)
+- required actions are wired or explicitly marked missing (cross-check interaction commands in the command log vs `primaryTasks`)
 - blocking UI failures are identified
 ## Failure handling
-- Missing screenshot or HTML => score `0`, rerun required
+- Missing any of the 4 per-screen artifacts => score `0`, rerun required
 - Missing primary action wiring => blocking finding
 - Severe route/render failure => blocking finding
 ## Output
-Return:
+Write `evaluator-reviews/<candidate-id>.json` with:
 - per-screen findings
 - blocking/immediate-fix classification
-- a numeric score per axis in the range `0.0..1.0`
-- rationale tied to screenshot/HTML evidence
+- a numeric score per axis in the range `0..100`
+- rationale tied to screenshot / HTML / snapshot / command log refs (all entries in `evidenceRefs[]` MUST be concrete paths to existing artifacts)
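
The new L1 failure rule (any of the 4 per-screen artifacts missing means score `0` and a rerun) can be sketched as a pre-check. This is a minimal illustration, not the package's implementation: the file suffixes are taken from the evidence README later in this diff, while the function names and the result shape are hypothetical.

```python
from pathlib import Path

# Suffixes of the four per-screen artifacts, as listed in the evidence README.
ARTIFACT_SUFFIXES = (".png", ".html", ".snapshot.txt", ".commands.json")


def missing_artifacts(candidate_dir: Path, screen_id: str) -> list[str]:
    """Return the artifact file names that are absent for one declared screen."""
    return [
        f"{screen_id}{suffix}"
        for suffix in ARTIFACT_SUFFIXES
        if not (candidate_dir / f"{screen_id}{suffix}").is_file()
    ]


def l1_precheck(candidate_dir: Path, screen_ids: list[str]) -> dict[str, dict]:
    """Hypothetical pre-check: score 0 and request a rerun when anything is missing."""
    result = {}
    for screen_id in screen_ids:
        missing = missing_artifacts(candidate_dir, screen_id)
        result[screen_id] = {
            "missing": missing,
            "score": 0 if missing else None,  # None means: proceed to the real review
            "rerunRequired": bool(missing),
        }
    return result
```

Here `candidate_dir` is assumed to be a round/candidate directory such as `.qfai/evidence/prototyping/rounds/<rN>/candidates/<candidate-id>/`.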

package/assets/init/.qfai/assistant/skills/qfai-prototyping/references/l2-review-guide.md CHANGED

@@ -2,14 +2,19 @@
 L2 checks product experience and design alignment.
-## Inputs
-- screenshots
-- HTML snapshots
-- `.qfai/contracts/design/evaluation-axes.yaml`
-- `.qfai/contracts/design/anchor-selection.yaml`
+## Inputs (read from review-bundle.json)
+- screenshots (round/candidate path, per declared screen)
+- HTML snapshots (round/candidate path, per declared screen)
+- accessibility snapshots (round/candidate path, per declared screen)
+- Playwright CLI command log (round/candidate path, per declared screen)
+- `.qfai/contracts/design/evaluation-rubric.yaml`
+- `.qfai/contracts/design/selected-direction.yaml`
 - `.qfai/contracts/design/design-system.yaml`
-- previous iteration score
+- legacy inputs, if present (skip if absent):
+  - `.qfai/contracts/design/evaluation-axes.yaml`
+  - `.qfai/contracts/design/anchor-selection.yaml`
+- previous round score
 ## 3-layer evaluation family
@@ -27,13 +32,14 @@ L2 must explicitly use all of:
 - product-specific differentiation is visible
 - selected anchor direction is reflected in the current UI
 - design system checklist is respected
+- interaction outcomes in the command log are consistent with the experience the designer intended
 - experience findings are recorded separately from blocking L1 findings
 ## Output
-Return:
+Write to `evaluator-reviews/<candidate-id>.json` with:
 - per-axis findings
 - revise/manual-review classification
-- a numeric score per axis in the range `0.0..1.0`
-- rationale tied to screenshot/HTML evidence and axis refs
+- a numeric score per axis in the range `0..100`
+- rationale tied to screenshot / HTML / snapshot / command log refs and axis refs (all entries in `evidenceRefs[]` MUST be concrete paths to existing artifacts)

package/assets/init/.qfai/assistant/skills/qfai-prototyping/references/reviewer-gate.md CHANGED

@@ -1,15 +1,26 @@
 # Reviewer Gate
-The reviewer is an independent gate, not the implementation author.
+The reviewer is an independent gate, not the implementation author. The reviewer gate applies identically to all modes (per the resolved primary prototyping spec); modes differ only in `maxCycles`.
 ## Reviewer must verify
-- all declared screens have screenshot evidence
-- all declared screens have HTML snapshot evidence
+- all declared screens have all 4 per-screen artifacts for every active candidate in every round (screenshot, HTML, accessibility snapshot, command log)
+- canonical latest paths mirror the newest accepted winner/polish state
+- every round has `command-plans.json`, `review-bundle.json`, and per-candidate evaluator reviews
+- `review-bundle.json` contains all required fields (candidates, axisDefs, designSystemChecklist, commandPlanRef)
+- evaluator review `evidenceRefs[]` entries are concrete artifact refs (no placeholders)
 - L1 and L2 evaluators used the required inputs
 - the 3-layer evaluation family was referenced
 - missing evidence triggered rerun rather than waiver
-- `qfai validate --fail-on error` passed
+- `qfai validate --profile prototyping --fail-on error` passed
+- `prototyping.json` `maxCycles` matches the mode (no mode invariant violations)
+- winner_selected is true
+- post_selection_polish_completed is true
+- breakthrough_checked is true
+- best_of_history_present is true
+- all_reviewer_axes_perfect_100 is true
+- completion_eligible is true only after the completion certificate is valid
+- no completion claim is based on a 95-point threshold
 ## Reviewer output
@@ -21,4 +32,14 @@ Required fixes:
 - ...
 Evidence checked:
 - ...
+Gate fields:
+- mode: low-cost|standard|full-harness
+- maxCycles: <number matching mode>
+- winner_selected: true|false
+- post_selection_polish_completed: true|false
+- breakthrough_checked: true|false
+- best_of_history_present: true|false
+- all_reviewer_axes_perfect_100: true|false
+- completion_eligible: true|false
+- completion_certificate_valid: true|false
 ```
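
One way to read the new gate fields is as a single conjunction: completion is eligible only when every boolean field is true and `maxCycles` matches the mode. The sketch below assumes that reading. The 1 / 3 / 20 mapping comes from the discussion README in this diff; the function names are hypothetical and the real validator's logic is not shown in the diff.

```python
# maxCycles per mode, as stated in the discussion README (1 / 3 / 20).
MAX_CYCLES_BY_MODE = {"low-cost": 1, "standard": 3, "full-harness": 20}

# Gate fields that must all be true before completion can be claimed.
REQUIRED_TRUE_FIELDS = (
    "winner_selected",
    "post_selection_polish_completed",
    "breakthrough_checked",
    "best_of_history_present",
    "all_reviewer_axes_perfect_100",
    "completion_certificate_valid",
)


def gate_failures(gate: dict) -> list[str]:
    """Return the names of gate fields that block completion."""
    failures = [f for f in REQUIRED_TRUE_FIELDS if gate.get(f) is not True]
    expected = MAX_CYCLES_BY_MODE.get(gate.get("mode"))
    if expected is None:
        failures.append("mode")  # unknown or missing mode
    elif gate.get("maxCycles") != expected:
        failures.append("maxCycles")  # mode invariant violated
    return failures


def completion_eligible(gate: dict) -> bool:
    """Assumed rule: eligible only when nothing in the gate is failing."""
    return not gate_failures(gate)
```

Returning the failing field names rather than a bare boolean makes it easier to fill in the "Required fixes" part of the reviewer output.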

package/assets/init/.qfai/assistant/skills/qfai-sdd/SKILL.md CHANGED

@@ -200,7 +200,7 @@ Follow `.qfai/assistant/instructions/shared-skill-operating-baseline.md#delta-re
 - `05_Examples.md` must include `EX-ID` and `BR-Ref` mappings.
 - `06_Test-Cases.md` must include `TC-ID`, `EX-Ref`, `AC-Refs`, and `Type`.
 - `06_Test-Cases.md` quality depth must include normal-path plus error or boundary coverage.
-- Do not complete the stage until `qfai validate --fail-on error --format github | tee .qfai/report/validate.log` exits with `error=0`.
+- Do not complete the stage until `qfai validate --profile sdd --fail-on error --format github | tee .qfai/report/validate.log` exits with `error=0`.
 - Reference direction rules from `.qfai/specs/README.md` must be enforced.
 - Keep `specs/` definition-only and operational status under `.qfai/report/run-*`.
 - Traceability depth and density-smell review rules live in:
@@ -246,7 +246,7 @@ The canonical file set is defined by skill templates under `.qfai/assistant/skil
 8. Execute Phase 2 (Slice) and pass slice gate for each target spec.
 9. Execute Phase 3 (Plan finalize) after at least one slice gate passes.
 10. Execute Phase 4 (Delta update).
-11. Run `qfai validate --fail-on error --format github | tee .qfai/report/validate.log`.
+11. Run `qfai validate --profile sdd --fail-on error --format github | tee .qfai/report/validate.log`.
 12. Review `.qfai/report/specs-coverage/spec-*.md` and triage density-smell warnings.
 13. If validate fails, fix source-layer artifacts and repeat until `error=0`.
@@ -314,7 +314,7 @@ When declaring DONE, include:
 - [ ] `10_Plan.md` is finalized as How-only.
 - [ ] `specs/plan.md` was not created.
 - [ ] `09_delta.md` (or `*_delta.md`) contains adoption/rejection rationale.
-- [ ] `qfai validate --fail-on error --format github` ran and produced `error=0`.
+- [ ] `qfai validate --profile sdd --fail-on error --format github` ran and produced `error=0`.
 - [ ] `.qfai/report/specs-coverage/spec-*.md` was reviewed.
 - [ ] Quality gate checks are recorded in evidence.
 - [ ] Evidence file exists and is complete.

package/assets/init/.qfai/assistant/skills/qfai-sdd/references/rcp_footer.md CHANGED

@@ -31,7 +31,7 @@
 ## Validate Hard Gate (required)
-- Each review cycle must have run `qfai validate --fail-on error --format github`
+- Each review cycle must have run `qfai validate --profile sdd --fail-on error --format github`
 - `.qfai/report/validate.log` must exist and correspond to the latest artifacts
 ---

package/assets/init/.qfai/assistant/skills/qfai-sdd/references/sdd-quality-gate.md CHANGED

@@ -20,7 +20,7 @@ Use this file for the full quality gate checklist behind `/qfai-sdd`.
 ## Validation Checks
-- `qfai validate --fail-on error --format github | tee .qfai/report/validate.log`
+- `qfai validate --profile sdd --fail-on error --format github | tee .qfai/report/validate.log`
 - `error=0`
 - `.qfai/report/specs-coverage/spec-*.md` reviewed
 - Density-smell warnings triaged

package/assets/init/.qfai/assistant/skills/qfai-sdd/templates/contracts/absorption-policy.sample.yaml ADDED

@@ -0,0 +1,7 @@
+# Downstream absorption policy generated by /qfai-sdd
+minAbsorptionsPerSurvivor: 2
+require_rejected_reason: true
+allow_adapt_required: true
+coherence_review:
+  block_on_regression_alert: true
+  block_on_blocking_findings: true

package/assets/init/.qfai/assistant/skills/qfai-sdd/templates/contracts/evaluation-rubric.sample.yaml CHANGED

@@ -8,9 +8,28 @@ axes:
     weight: 1
   - id: functionality
     weight: 1
-hard_floors:
-  - functionality
-  - accessibility-risk
+  - id: accessibility-risk
+    weight: 1
+  - id: implementation-plausibility
+    weight: 1
 weighted_axes:
   - design-quality
   - originality
+hard_floors:
+  - id: functionality
+    min_score: 80
+  - id: accessibility-risk
+    min_score: 80
+  - id: conceptFit
+    min_score: 85
+  - id: originality
+    min_score: 80
+absorbable_categories:
+  - layout
+  - interaction
+  - content-hierarchy
+  - visual-language
+  - navigation
+  - motion
+coherence:
+  regression_threshold: 5
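
`hard_floors` changes from a plain list of axis ids to entries with an explicit `min_score`. A minimal sketch of how such floors might be applied to per-axis scores on the `0..100` scale follows. The floor values are copied from the sample above; the function name, the score dictionary shape, and the treatment of a missing score as a violation are assumptions.

```python
# Hard floors copied from the sample rubric in this diff.
HARD_FLOORS = [
    {"id": "functionality", "min_score": 80},
    {"id": "accessibility-risk", "min_score": 80},
    {"id": "conceptFit", "min_score": 85},
    {"id": "originality", "min_score": 80},
]


def hard_floor_violations(scores: dict[str, float], hard_floors=HARD_FLOORS) -> list[str]:
    """Return one message per axis that falls below its floor or has no score."""
    violations = []
    for floor in hard_floors:
        score = scores.get(floor["id"])
        if score is None:
            # Assumption: an unscored hard-floor axis counts as a violation.
            violations.append(f"{floor['id']}: no score recorded")
        elif score < floor["min_score"]:
            violations.append(f"{floor['id']}: {score} < {floor['min_score']}")
    return violations
```

A score exactly equal to `min_score` passes in this sketch; whether the package treats the floor as inclusive is not stated in the diff.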

package/assets/init/.qfai/assistant/skills/qfai-sdd/templates/contracts/evaluator-calibration.sample.yaml CHANGED

@@ -1,9 +1,15 @@
 # Downstream evaluator calibration generated by /qfai-sdd
 good_critique_examples:
   - Skeptical, specific, and actionable feedback
+  - Names which candidate strengths should be harvested and why
 too_lenient_examples:
   - Praise that ignores blandness
 blandness_fail_examples:
   - Technically correct but generic
 originality_fail_examples:
   - Library defaults with no product-specific decisions
+concept_fit_fail_examples:
+  - A refinement that improves novelty but breaks the exploration brief anchors
+coherence_regression_red_flags:
+  - Newly absorbed ideas overpower the selected concept
+  - Navigation or hierarchy changes make the surface feel like a different product

package/assets/init/.qfai/assistant/skills/qfai-sdd/templates/contracts/exploration-brief.sample.yaml CHANGED

@@ -8,3 +8,12 @@ brand_signals:
   - Confident
 differentiation_targets:
   - Avoid generic dashboard defaults
+parallel_candidate_routing: required
+concept_anchors:
+  - id: CONCEPT-0001
+    statement: Calm operational confidence
+  - id: CONCEPT-0002
+    statement: Guided depth without clutter
+non_goals:
+  - Loud novelty-first visuals
+  - Route switching that depends on hidden runtime state

package/assets/init/.qfai/assistant/skills/qfai-verify/SKILL.md CHANGED

@@ -95,7 +95,7 @@ Use the shared schema.
 - Follow `.qfai/assistant/instructions/shared-skill-delegation-baseline.md#reviewer-gate-baseline`.
 - Reviewer checks:
   - required roles were delegated;
-  - validate evidence exists: `qfai validate --fail-on error` completed with `error=0`;
+  - validate evidence exists: `qfai validate --profile verify --fail-on error` completed with `error=0`;
   - declared screens have mandatory screenshot and HTML evidence under `.qfai/evidence/prototyping/`;
   - Drift Protocol enforced;
   - test-layer policy enforced against `test-layers.md`.
@@ -132,7 +132,7 @@ Follow `.qfai/assistant/instructions/shared-skill-operating-baseline.md#delta-re
   - `.qfai/evidence/` is intentionally NOT tracked by Git (it ships with a local `.gitignore`).
   - Do NOT commit evidence files; summarize key outcomes in the PR description instead.
 - You MUST run the mandatory checks listed below and record outcomes.
-- In CI, you MUST keep QFAI validation on default/full mode (`qfai validate --fail-on error`). Do NOT use `--phase refinement`.
+- In CI, you MUST keep QFAI validation on full-scan mode (`qfai validate --profile verify --fail-on error` or default `qfai validate --fail-on error`). Do NOT use partial profiles.
 - Waivers are only for `warning` / `info` findings. If a waiver attempts to suppress an `error`, treat it as a failure and fix the root cause.
 - You MUST stop and escalate if any gate fails without an actionable fix list.
 - Completion must be approved by a reviewer who did not run the gates.
@@ -148,7 +148,7 @@ Run quality gates and produce evidence that the change is correct and safe.
 ## Success Criteria (Definition of Done)
 - Repo quality gates PASS (format/lint/type/test/build/etc).
-- QFAI checks PASS (at minimum: `qfai validate`, and optionally `qfai report`).
+- QFAI checks PASS (at minimum: `qfai validate --profile verify`, and optionally `qfai report`).
 - Declared screens have mandatory screenshot and HTML evidence.
 - A concise evidence summary exists (copy‑paste for PR).
 - The PR-ready summary includes **Change Classification (Primary/Tags)** per `.qfai/assistant/instructions/change-classification.md`.
@@ -341,12 +341,12 @@ If unknown, propose defaults and mark assumptions.
 Run (adjust as needed):
-- `qfai validate --fail-on error`
+- `qfai validate --profile verify --fail-on error`
 - `qfai report` (if used in this repo)
 Notes:
-- CI must run default/full validation only. `--phase refinement` is local-only.
+- CI must run default/full validation only. Partial profiles are local skill checks only.
 - If `QFAI-WAIVER-002` appears, remove the invalid waiver and resolve the underlying `error` finding.
 Capture:
@@ -454,7 +454,7 @@ Evidence must include:
 1. QFAI validation:
    ```bash
-   qfai validate --fail-on error
+   qfai validate --profile verify --fail-on error
    ```
 2. Repository standard gates (discover from package.json/CI/docs):

package/assets/init/.qfai/contracts/design/README.md CHANGED

@@ -6,6 +6,8 @@ Provide the downstream execution truth for exploration-first prototyping and fin
 These files are version-managed and may be read directly by `/qfai-prototyping`, `/qfai-implement`, `/qfai-atdd`, and `qfai validate`.
+> **Prototyping harness (spec-0012)**: `evaluation-rubric.yaml` is the source of evaluator axes, absorbable categories, and concept-fit hard floors. `absorption-policy.yaml` defines minimum absorption and curation expectations between rounds. `design-system.yaml` remains the downstream checklist for winner extraction and polish.
 ## Status After Init
 After `qfai init`, this directory contains only this README. This is the normal initial state. `/qfai-sdd` creates design files when a UI-bearing capability is normalized for downstream execution.
@@ -17,8 +19,9 @@ The absence of design files is not a defect for non-UI capabilities. For UI-bear
 Typical files:
 - `exploration-brief.yaml` — machine-readable exploration brief generated from discussion
-- `evaluation-rubric.yaml` — machine-readable evaluator rubric with weighted originality/design criteria
+- `evaluation-rubric.yaml` — machine-readable evaluator rubric with weighted axes, hard floors, and absorbable categories
 - `evaluator-calibration.yaml` — evaluator alignment examples and anti-leniency guidance
+- `absorption-policy.yaml` — round-to-round absorption thresholds and curation rules
 - `selected-direction.yaml` — current winning direction, rationale, and carry-forward rules
 - `design-system.yaml` — extracted final design system produced after direction convergence
 - `design-tokens*.yaml` — optional token definitions
@@ -28,6 +31,7 @@ Typical files:
 - `exploration-brief.yaml`
 - `evaluation-rubric.yaml`
 - `evaluator-calibration.yaml`
+- `absorption-policy.yaml`
 - `selected-direction.yaml`
 - `design-system.yaml`
 - `design-tokens.yaml`
@@ -38,3 +42,4 @@ Typical files:
 - **Not** a replacement for specs or UI contracts
 - **Not** an excuse for downstream skills to read discussion-side artifacts directly
 - **Not** a place to finalize a winner before prototyping convergence
+- **Not** a place to store round evidence; that belongs under `.qfai/evidence/prototyping/`

package/assets/init/.qfai/contracts/ui/README.md CHANGED

@@ -7,6 +7,8 @@ The contract must describe screen structure, action coverage targets, and stable
 > **Note:** UI contracts are the downstream execution truth for screen obligations. `/qfai-sdd` may derive them from discussion-side exploration, but `/qfai-prototyping`, `/qfai-implement`, and `/qfai-atdd` must read `contracts/ui/*.yaml` instead of reading `discussion-*/uiux/40_screen_contracts.md` directly.
+> **Prototyping harness (spec-0012)**: `screens[].id`, `screens[].route`, and `screens[].primary_tasks` (snake_case in YAML; surfaced as `primaryTasks` in the parsed `CanonicalScreenContract`) feed the round-based command-plan builders consumed by the AI evaluator sub-agent. Changes to screen IDs or routes must propagate to `.qfai/evidence/prototyping/rounds/<rN>/candidates/<candidate-id>/<screen-id>.*` evidence.
 ## File rules
 - File name: `ui-XXXX-<slug>.yaml`
@@ -123,9 +125,7 @@ screens:
 ## Example
-- Copy-ready repository sample:
-  `docs/examples/ui-contract.good.yaml`
-- Also available from prototyping skill template:
+- Copy-ready sample bundled with this package:
   `../../assistant/skills/qfai-prototyping/templates/contracts/ui-0001-order-mockable.yaml`
 ## FAQ (Typical failures)

package/assets/init/.qfai/discussion/README.md CHANGED

@@ -123,11 +123,15 @@ UI-bearing discussion packs (`ui_bearing: true`) may include a `prototyping.yaml
 ### Canonical namespaced schema (when present)
+The three modes (`low-cost`, `standard`, `full-harness`) share the same evidence obligations and validator gates; the only mode-dependent value is `maxCycles` (1 / 3 / 20). Choose `recommended_mode` based on the project's iteration budget, and list every mode the project allows under `allowed_modes`.
 ```yaml
 prototyping:
-  recommended_mode: full-harness
-  rationale: Exploration-first prototyping requires the full-harness runtime loop in packages/qfai.
+  recommended_mode: standard
+  rationale: Standard mode (maxCycles=3) matches our review cadence.
   allowed_modes:
+    - low-cost
+    - standard
     - full-harness
   surface: web
 ```
@@ -136,18 +140,19 @@ prototyping:
 If you create this artifact, populate all 4 fields.
-| Field              | Required | Description                                    |
-| ------------------ | -------- | ---------------------------------------------- |
-| `recommended_mode` | yes      | `full-harness`                                 |
-| `rationale`        | yes      | Non-empty string explaining the recommendation |
-| `allowed_modes`    | yes      | Unique array; must contain only `full-harness` |
-| `surface`          | yes      | `web`, `mobile`, `desktop`, or `mixed`         |
+| Field              | Required | Description                                                                |
+| ------------------ | -------- | -------------------------------------------------------------------------- |
+| `recommended_mode` | yes      | One of `low-cost`, `standard`, `full-harness`                              |
+| `rationale`        | yes      | Non-empty string explaining the recommendation                             |
+| `allowed_modes`    | yes      | Unique non-empty array drawn from `low-cost` / `standard` / `full-harness` |
+| `surface`          | yes      | `web`, `mobile`, `desktop`, or `mixed`                                     |
 ### Current behavior
 - Current discussion-pack readiness does not block on missing `prototyping.yaml`.
 - When `prototyping.yaml` is present, prefer the canonical namespaced schema under the `prototyping:` key.
-- `recommended_mode` should be included in `allowed_modes`. In packages/qfai, this means `recommended_mode` should be `full-harness` and `allowed_modes` should contain only `full-harness`.
+- `recommended_mode` MUST be included in `allowed_modes`.
+- Mode invariant (spec-0017 REQ-0001): the three modes share obligations except for `maxCycles` (1 for `low-cost`, 3 for `standard`, 20 for `full-harness`). Picking a different mode does not relax any other gate.
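
The field table and the rules above can be checked mechanically. The sketch below validates an already-parsed `prototyping.yaml` document (a plain dictionary) against those rules. It is an illustration under stated assumptions: the function name and error wording are hypothetical and do not reproduce the messages of `qfai validate`.

```python
MODES = ("low-cost", "standard", "full-harness")
SURFACES = ("web", "mobile", "desktop", "mixed")


def validate_prototyping(doc: dict) -> list[str]:
    """Return human-readable problems found in a parsed prototyping.yaml."""
    proto = doc.get("prototyping")
    if not isinstance(proto, dict):
        return ["missing top-level `prototyping:` mapping"]

    errors = []
    recommended = proto.get("recommended_mode")
    if recommended not in MODES:
        errors.append("`recommended_mode` must be one of " + ", ".join(MODES))

    rationale = proto.get("rationale")
    if not isinstance(rationale, str) or not rationale.strip():
        errors.append("`rationale` must be a non-empty string")

    allowed = proto.get("allowed_modes")
    if not isinstance(allowed, list) or not allowed:
        errors.append("`allowed_modes` must be a non-empty array")
    else:
        if any(allowed.count(mode) > 1 for mode in allowed):
            errors.append("`allowed_modes` must not contain duplicates")
        if any(mode not in MODES for mode in allowed):
            errors.append("`allowed_modes` contains an unknown mode")
        if recommended in MODES and recommended not in allowed:
            errors.append("`recommended_mode` must be included in `allowed_modes`")

    if proto.get("surface") not in SURFACES:
        errors.append("`surface` must be one of " + ", ".join(SURFACES))
    return errors
```

Run against the example document shown earlier in this file's diff (`recommended_mode: standard`, three allowed modes, `surface: web`), the function returns an empty list.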
 ## Suggested naming

package/assets/init/.qfai/evidence/README.md CHANGED

@@ -3,75 +3,95 @@
 ## Purpose
 Evidence files record what was actually executed and observed.
-`packages/qfai` v1.7.15 treats prototyping as `full-harness` only and UI-only.
+`packages/qfai` treats prototyping as a Playwright CLI + AI evaluator harness with unified obligations across all modes (spec-0012).
 ## Prototyping artifacts
-Canonical files:
+Round-scoped artifacts (for each round `<rN>`):
-- `.qfai/evidence/prototyping.md`
-- `.qfai/evidence/prototyping.json`
-- `.qfai/evidence/render.json`
-- `.qfai/evidence/browser-qa.json`
-- `.qfai/evidence/fullHarness.exit.json`
-- `.qfai/evidence/fullHarness.handoff.json`
-- `.qfai/evidence/fullHarness.fakeUiDetection.json`
+- `.qfai/evidence/prototyping/rounds/<rN>/command-plans.json` — candidate-aware Playwright CLI command plans
+- `.qfai/evidence/prototyping/rounds/<rN>/review-bundle.json` — evaluator input bundle
+- `.qfai/evidence/prototyping/rounds/<rN>/evaluator-reviews/<candidate-id>.json` — evaluator output per candidate
+- `.qfai/evidence/prototyping/rounds/<rN>/harvest.json` — harvest template for `r5|r3|r2`
+- `.qfai/evidence/prototyping/rounds/<rN>/narrow-decision.json` — survivor decision for `r5|r3|r2`
+- `.qfai/evidence/prototyping/rounds/<rN>/absorption-plan.json` — absorption plan for `r3|r2|r1`
+- `.qfai/evidence/prototyping/rounds/<rN>/reimplementation.json` — reimplementation record for `r3|r2|r1`
+- `.qfai/evidence/prototyping/rounds/<rN>/candidates/<candidate-id>/<screen-id>.png` — screenshot per declared screen
+- `.qfai/evidence/prototyping/rounds/<rN>/candidates/<candidate-id>/<screen-id>.html` — HTML snapshot per declared screen
+- `.qfai/evidence/prototyping/rounds/<rN>/candidates/<candidate-id>/<screen-id>.snapshot.txt` — accessibility snapshot per declared screen
+- `.qfai/evidence/prototyping/rounds/<rN>/candidates/<candidate-id>/<screen-id>.commands.json` — executed command log per declared screen
+Cross-round rollups:
+- `.qfai/evidence/prototyping.json` — `rounds[]` / `polishCycles[]` rollup with best-of-history / breakthrough / reviewer gate sections
+- `.qfai/evidence/prototyping.md` — reviewer-readable summary
+- `.qfai/evidence/breakthrough.json` — breakthrough decisions
+Canonical latest paths (mirror the newest accepted winner/polish state):
+- `.qfai/evidence/prototyping/screenshots/<screen-id>.png`
+- `.qfai/evidence/prototyping/html/<screen-id>.html`
 ## Execution contract
 Supported prototyping surfaces are `web`, `mobile`, `desktop`, and `mixed`.
 `cli`, API-only, backend-only, and `ui_bearing: false` classifications are not prototyping execution targets.
-## Obligation matrix
+Browser tool: `playwright-cli` (the only supported value per spec-0012).
-| surface / mode         | specs    | runtimeGate | uiFidelity | render evidence | browser QA | fullHarness |
-| ---------------------- | -------- | ----------- | ---------- | --------------- | ---------- | ----------- |
-| web / full-harness     | required | required    | required   | required        | required   | required    |
-| mobile / full-harness  | required | required    | required   | required        | required   | required    |
-| desktop / full-harness | required | required    | required   | required        | required   | required    |
-| mixed / full-harness   | required | required    | required   | required        | required   | required    |
+## Obligation matrix (spec-0012)
-`low-cost` and `standard` are unsupported in `packages/qfai` v1.7.15.
+Mode invariant: every row below is identical except for the `maxCycles` column. `maxCycles` is the only mode-dependent field.
+| surface / mode         | specs    | runtimeGate | uiFidelity | playwright evidence | reviewBundle | evaluatorReview | bestOfHistory | breakthrough | reviewerGate | maxCycles |
+| ---------------------- | -------- | ----------- | ---------- | ------------------- | ------------ | --------------- | ------------- | ------------ | ------------ | --------- |
+| web / low-cost         | required | required    | required   | required            | required     | required        | required      | required     | required     | 1         |
+| web / standard         | required | required    | required   | required            | required     | required        | required      | required     | required     | 3         |
+| web / full-harness     | required | required    | required   | required            | required     | required        | required      | required     | required     | 20        |
+| mobile / low-cost      | required | required    | required   | required            | required     | required        | required      | required     | required     | 1         |
+| mobile / standard      | required | required    | required   | required            | required     | required        | required      | required     | required     | 3         |
+| mobile / full-harness  | required | required    | required   | required            | required     | required        | required      | required     | required     | 20        |
+| desktop / low-cost     | required | required    | required   | required            | required     | required        | required      | required     | required     | 1         |
+| desktop / standard     | required | required    | required   | required            | required     | required        | required      | required     | required     | 3         |
+| desktop / full-harness | required | required    | required   | required            | required     | required        | required      | required     | required     | 20        |
+| mixed / low-cost       | required | required    | required   | required            | required     | required        | required      | required     | required     | 1         |
+| mixed / standard       | required | required    | required   | required            | required     | required        | required      | required     | required     | 3         |
+| mixed / full-harness   | required | required    | required   | required            | required     | required        | required      | required     | required     | 20        |
+Choosing a lower mode buys fewer cycles, not a weaker gate.
 ## Truthfulness rules
-- `mode.effective` must be `full-harness`.
-- `uiFidelity.mode` must be `interactive`.
-- Canonical screen contracts in `discussion-*/uiux/40_screen_contracts.md` are mandatory.
-- Browser QA is mandatory per screen.
-- Calibration is resolved from `fullHarness.calibrationRef.packPath`; scalar caller overrides are invalid.
-- `runtimeGate.evidenceRefs` must contain concrete render/browser QA/spec refs only.
-- `specCoverage` refs must use concrete declared refs plus concrete observed refs. Self-reference and synthetic strings are invalid.
+- `mode.effective` must be one of `low-cost`, `standard`, `full-harness`.
+- `maxCycles` must match `PROTOTYPING_MAX_CYCLES[mode]` or `QFAI-PROT-MODE-001` is raised.
+- Browser tool must be `playwright-cli`.
+- `uiFidelity.mode` must be `interactive` (captured via the Playwright CLI command plans).
+- Evidence capture is performed by the AI evaluator sub-agent via the Playwright CLI command plans generated by QFAI.
+- Canonical screen contracts in `.qfai/contracts/ui/*.yaml` are mandatory.
+- evaluator review `evidenceRefs[]` entries must contain concrete artifact refs that point to existing files; placeholders (`""`, `"tbd"`, `"TBD"`) are rejected.
+- Canonical latest paths must mirror the newest accepted winner/polish artifacts.
 - `mockPaths` is a negative-only ledger. Allowed values are `fail|finding` only.
-## fullHarness semantics
-Required fields:
+## Prototyping completion gate (spec-0012)
-- `enabled = true`
-- `runId`
-- `calibrationRef.configPath`
-- `calibrationRef.packPath`
-- `calibrationRef.packVersion`
-- `iterationCount`
-- `bestIteration`
-- `status`
-- `reviewerSignoff`
-- `reviewerLogs`
-- `iterations`
-- `scoringTrace`
-- `limitations`
+Completion requires all of the following for every mode:
-Review semantics:
+- all 4 per-screen artifacts present for every declared screen in the completion round / polish cycle
+- at least one `polish` cycle completed after winner selection
+- `bestOfHistory` present
+- `breakthrough` present
+- independent reviewer gate returned `PASS`
+- every reviewer sub-agent scored every evaluation axis at `100/100`
+- `qfai validate --profile prototyping --fail-on error` passes
-- `finalDecision = accepted` -> `reviewerSignoff.status = approved`
-- `finalDecision = rejected` -> `reviewerSignoff.status = rejected`
-- `finalDecision = abandoned` -> `reviewerSignoff.status = abandoned`
-- `reviewerLogs[last].verdict` must align with the final decision and termination semantics.
+If the polish-cycle budget is exhausted before the gate is satisfied, the run does not complete and returns `REVISE`.
 ## Prohibited patterns
-- `low-cost` or `standard` prototyping metadata
+- `browserProvider` or `renderProvider` config keys (rejected per spec-0012)
+- `playwright-mcp` as standard browser tool
+- Node Playwright direct invocation for evidence capture
+- Mode differences other than `maxCycles`
 - `cli` prototyping execution
 - self-reference such as `prototyping.json#/runtimeGate`
 - synthetic refs such as `specs: ...`
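
The truthfulness rule for `evidenceRefs[]` (concrete refs to existing files; `""`, `"tbd"`, and `"TBD"` rejected) can be sketched as a small filter. Assumptions: refs are relative paths resolved against a root directory chosen by the caller, and the function name is hypothetical. The diff does not say how the validator resolves refs.

```python
from pathlib import Path

# Placeholder values the README says are rejected.
PLACEHOLDERS = ("", "tbd", "TBD")


def invalid_evidence_refs(evidence_refs: list[str], root: Path) -> list[str]:
    """Return refs that are placeholders or do not point to an existing file."""
    invalid = []
    for ref in evidence_refs:
        if ref.strip() in PLACEHOLDERS or not (root / ref).is_file():
            invalid.append(ref)
    return invalid
```

An empty result means every entry is a concrete ref to a file that exists under `root`.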

package/assets/init/root/.github/workflows/qfai-validate.yml ADDED

@@ -0,0 +1,39 @@
+# QFAI validate CI workflow
+#
+# Generated by `qfai init` (spec-0017 REQ-0009). Runs `qfai validate --profile
+# full --fail-on error` on every push to main/master and every pull request.
+#
+# The `full` profile includes the QFAI-TEST-001 test-todo stub gate, so any
+# `it.todo` / `test.todo` / `describe.todo` in your test suite will fail CI
+# (you can opt out via `validation.testStrategy.forbidTestTodoStubs: false`
+# in qfai.config.yaml).
+#
+# If your project uses pnpm / yarn instead of npm, replace the `npm ci` step
+# with your package manager's install command (e.g. `pnpm install
+# --frozen-lockfile` + `pnpm/action-setup@v4`). `npx qfai` works with any
+# Node package manager.
+name: qfai validate
+on:
+  push:
+    branches: [main, master]
+  pull_request:
+jobs:
+  validate:
+    name: qfai validate (full profile, fail on error)
+    runs-on: ubuntu-latest
+    timeout-minutes: 10
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-node@v4
+        with:
+          # Match the QFAI repo's CI baseline (Node 20 LTS) and the package
+          # `engines: ">=18.0.0"`. Bump deliberately when QFAI raises its
+          # supported floor.
+          node-version: "20"
+          cache: npm
+      - run: npm ci
+      - name: qfai validate
+        run: npx qfai validate --profile full --fail-on error

package/assets/init/root/qfai.config.yaml CHANGED

@@ -40,5 +40,4 @@ prototyping:
     packPath: .qfai/evidence/calibration.yaml
   execution:
     targetUrl: null
-    browserProvider: playwright
-    renderProvider: playwright
+    browserTool: playwright-cli
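
For projects migrating an existing `qfai.config.yaml`, the change above combined with the evidence README's prohibited patterns suggests a simple check on the `prototyping.execution` mapping. This is a sketch under assumptions: it operates on an already-parsed dictionary, and the function name and messages are hypothetical rather than the validator's own.

```python
# Keys the evidence README lists as rejected.
REJECTED_KEYS = ("browserProvider", "renderProvider")


def execution_config_errors(execution: dict) -> list[str]:
    """Return problems found in a parsed `prototyping.execution` mapping."""
    errors = [
        f"`{key}` is no longer accepted; use `browserTool` instead"
        for key in REJECTED_KEYS
        if key in execution
    ]
    if execution.get("browserTool") != "playwright-cli":
        errors.append("`browserTool` must be `playwright-cli`")
    return errors
```

The 1.8.2 shape (both provider keys, no `browserTool`) yields three errors in this sketch, and the 1.8.4 shape yields none.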