npm - qfai - Versions diffs - 1.8.0 → 1.8.2 - Mend

qfai 1.8.0 → 1.8.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (60)

package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/uiux/30_exploration_brief.md ADDED

@@ -0,0 +1,29 @@
+# 30 Exploration Brief
+## Product Intent
+- What the product should feel like:
+- What users should immediately understand:
+## Must-preserve Interactions
+- Primary task:
+- Secondary task:
+- Critical state changes:
+## Brand Signals
+- Desired tone:
+- Desired visual character:
+- Must-avoid brand signals:
+## Differentiation Targets
+- How this surface should avoid generic layouts:
+- Where deliberate originality should show up:
+## Implementation Constraints
+- Technical constraints:
+- Accessibility constraints:
+- Operational constraints:

package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/uiux/31_reference_pool.md ADDED

@@ -0,0 +1,13 @@
+# 31 Reference Pool
+## Exploration References
+| Ref     | Type    | Why it matters | Adopted points | Rejected points | Local translation |
+| ------- | ------- | -------------- | -------------- | --------------- | ----------------- |
+| REF-001 | Product | [why]          | [adopted]      | [rejected]      | [translation]     |
+## Design Guideline Research
+| Ref    | Guideline   | Rule refs   | Why it matters | Local translation |
+| ------ | ----------- | ----------- | -------------- | ----------------- |
+| GL-001 | [guideline] | [rule refs] | [why]          | [translation]     |

package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/uiux/32_design_anti_goals.md ADDED

@@ -0,0 +1,10 @@
+# 32 Design Anti-goals
+## Anti-goals
+- Avoid generic library-default dashboards with no product character.
+- Avoid AI-slop patterns such as purple gradients over white cards without product rationale.
+## Recurrence Prevention
+- If a later iteration drifts toward a rejected direction, log the trigger and explicitly restate why the pattern is banned.

package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/uiux/33_exploration_rubric.md ADDED

@@ -0,0 +1,27 @@
+# 33 Exploration Rubric
+## Design Quality
+- What counts as coherent:
+- What breaks coherence:
+## Originality
+- What counts as deliberate design:
+- What counts as generic or AI-slop:
+## Craft
+- Typography, spacing, color, and contrast competence:
+## Functionality
+- Whether users can understand and complete the core task:
+## Accessibility Risk
+- Which issues are hard fails:
+## Implementation Plausibility
+- What level of complexity is acceptable for the current slice:

package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/uiux/34_evaluator_calibration.md ADDED

@@ -0,0 +1,17 @@
+# 34 Evaluator Calibration
+## Good Critique
+- Example of a skeptical but actionable critique:
+## Too Lenient
+- Example of praise that should be rejected because it ignores obvious blandness or usability issues:
+## Blandness Fail
+- Example of a design that is technically competent but too generic to pass:
+## Originality Fail
+- Example of a design that copies defaults without deliberate product-specific choices:

package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/uiux/50_review_input_bundle.md CHANGED

@@ -6,33 +6,27 @@ Consolidate all sidecar artifacts into a review-ready bundle for design reviewer
 ## Bundle Contents
-| Artifact                   | Path                                       | Status                    |
-| -------------------------- | ------------------------------------------ | ------------------------- |
-| Strategy                   | `uiux/10_implementation_strategy.md`       | [draft/reviewed/approved] |
-| Taste interview            | `uiux/11_design_taste_interview.md`        | [draft/reviewed/approved] |
-| Trend scan                 | `04_Sources.md#Trend Scan`                 | [draft/reviewed/approved] |
-| Invariant layer            | `uiux/20_design_eval_invariant.md`         | [draft/reviewed/approved] |
-| Trend-derived layer        | `uiux/21_design_eval_trend_derived.md`     | [draft/reviewed/approved] |
-| Product-specific layer     | `uiux/22_design_eval_product_specific.md`  | [draft/reviewed/approved] |
-| Aggregate layer            | `uiux/23_design_eval_aggregate.md`         | [draft/reviewed/approved] |
-| Dynamic overrides          | `uiux/24_design_eval_dynamic_overrides.md` | [optional]                |
-| Option comparison          | `uiux/30_option_comparison.md`             | [draft/reviewed/approved] |
-| Selected anchor            | `uiux/31_selected_anchor_screen.md`        | [draft/reviewed/approved] |
-| Screen contracts           | `uiux/40_screen_contracts.md`              | [draft/reviewed/approved] |
-| Prototyping recommendation | `../prototyping.yaml`                      | [draft/reviewed/approved] |
+| Artifact                   | Path                               | Status                    |
+| -------------------------- | ---------------------------------- | ------------------------- |
+| Exploration brief          | `uiux/30_exploration_brief.md`     | [draft/reviewed/approved] |
+| Reference pool             | `uiux/31_reference_pool.md`        | [draft/reviewed/approved] |
+| Design anti-goals          | `uiux/32_design_anti_goals.md`     | [draft/reviewed/approved] |
+| Exploration rubric         | `uiux/33_exploration_rubric.md`    | [draft/reviewed/approved] |
+| Evaluator calibration      | `uiux/34_evaluator_calibration.md` | [draft/reviewed/approved] |
+| Screen contracts           | `uiux/40_screen_contracts.md`      | [draft/reviewed/approved] |
+| Prototyping recommendation | `../prototyping.yaml`              | [draft/reviewed/approved] |
 ## Trend-derived review focus
-- Required trend categories are all present and complete.
+- Required references are all present and complete.
 - Stale / overused AI slop patterns are explicitly avoided.
-- Trend research is translated into scoring, comparison, and selected anchor decisions.
-- Scoring-ready axes use canonical fields: `origin`, `layer`, `source_refs`, `goal_refs`, `evidence_required`, `review_questions`.
+- Reference research is translated into exploration and evaluator calibration inputs.
+- Later iterations are not automatically preferred over stronger middle iterations.
 ## Review Checklist
-- [ ] Strategy aligns with surface type and project constraints
-- [ ] Trend categories are complete and translated into local design decisions
-- [ ] Competitive references include adopted_points, rejected_points, and local_translation
-- [ ] Scoring-ready axes expose canonical fields including origin/source_refs/goal_refs/evidence_required/review_questions
-- [ ] Selected anchor clearly documents rationale and downstream implications
+- [ ] Exploration brief aligns with surface type and project constraints
+- [ ] Reference pool is complete and translated into local design decisions
+- [ ] Evaluator calibration includes skeptical critique examples
+- [ ] Best-of-history handling is explicit
 - [ ] Screen contracts cover all required states

package/assets/init/.qfai/assistant/skills/qfai-implement/SKILL.md CHANGED

@@ -75,11 +75,13 @@ Execute the TDD micro-cycle for each pending item in `test-list.md`, transitioni
 ## Visual Review Guard
 - Review rendered output, screenshot evidence, or HTML output before closing any UI-affecting item.
-- Read the sidecar family first (selected anchor, strategy, screen contracts) whenever implementation touches UI or critique-driven behavior.
-- Read order: option comparison (30_option_comparison.md) → selected anchor screen (31_selected_anchor_screen.md) →
-  strategy (10_implementation_strategy.md) → taste interview (11_design_taste_interview.md) →
-  trend scan (04_Sources.md#Trend Scan) → 3-layer evaluation family (20/21/22/23 + optional 24) →
-  screen contracts (40_screen_contracts.md) → review input bundle (50_review_input_bundle.md) →
+- Read spec + contract inputs first whenever implementation touches UI or critique-driven behavior.
+- Read order: `01_Spec.md` → `03_Acceptance-Criteria.md` → `05_Examples.md` →
+  `.qfai/contracts/design/exploration-brief.yaml` →
+  `.qfai/contracts/design/anchor-selection.yaml` (legacy alias, when present) →
+  `.qfai/contracts/design/evaluation-axes.yaml` (legacy alias, when present) →
+  `.qfai/contracts/design/evaluation-rubric.yaml` → `.qfai/contracts/design/evaluator-calibration.yaml` →
+  `.qfai/contracts/design/selected-direction.yaml` → `.qfai/contracts/design/design-system.yaml` → `.qfai/contracts/ui/*.yaml` →
   optional design tokens → optional fallback mock → mermaid flows.
 - If code intent and rendered output diverge, treat the rendered/HTML result as the blocking review input and reconcile before DONE.

package/assets/init/.qfai/assistant/skills/qfai-prototyping/SKILL.md CHANGED

@@ -1,7 +1,7 @@
 ---
 name: qfai-prototyping
-title: QFAI Prototyping (Full-Harness Only)
-description: "Build a contract-aligned UI prototype and block completion until full-harness evidence and validate gate pass."
+title: QFAI Prototyping (Exploration-First Harness)
+description: "Run a planner/generator/evaluator UI harness with a 5→3→2→1 direction funnel, breakthrough detection, and final design-system extraction."
 argument-hint: "[--auto]"
 allowed-tools: [Read, Glob, Write, TodoWrite, Task, Bash]
 roles:
@@ -24,173 +24,225 @@ mode: execution-focused
 [DRIFT-PROTOCOL:MANDATORY]
-This skill is static-first for planning and file review, but the package execution contract is `full-harness` only.
-Do not default or downgrade prototyping modes.
+This skill owns prototyping orchestration directly.
+Do not rely on a CLI entrypoint or package runtime loop.
 ## CRITICAL CONSTRAINTS (Read First)
 - Scope is all specs from `.qfai/specs/spec-*`.
-- Evidence is mandatory in markdown + json under `.qfai/evidence/`.
-- DONE is forbidden until prototyping evidence, reviewer gate, and `qfai validate --fail-on error` pass.
-- Supported prototyping surfaces are `web`, `mobile`, `desktop`, and `mixed`.
+- Screenshot evidence and HTML snapshot evidence are mandatory.
+- Screenshot evidence path: `.qfai/evidence/prototyping/screenshots/<screen-id>.png`
+- HTML snapshot path: `.qfai/evidence/prototyping/html/<screen-id>.html`
+- If either screenshot or HTML is missing for a declared screen, that screen scores `0` and the run is incomplete.
+- Optional evidence is abolished. Missing mandatory evidence must trigger rerun, not waiver.
+- DONE is forbidden until `qfai validate --fail-on error` passes and `/qfai-verify` can approve the run.
+- Supported UI prototyping surfaces are `web`, `mobile`, `desktop`, and `mixed`.
 - `cli`, API-only, backend-only, and `ui_bearing: false` classifications are not prototyping execution targets.
-- Canonical screen contracts in `discussion-*/uiux/40_screen_contracts.md` are mandatory.
-- Browser QA, render evidence, runtimeGate, uiFidelity, specCoverage, and `fullHarness` are mandatory.
-- `uiFidelity` is screen-level and must be built from real render/browser evidence.
-- `mockPaths` is a negative-only issue ledger with `fail|finding` only.
-- Calibration pack is the SSOT. Runtime and validator both resolve from `calibrationRef.packPath`.
-- `--reviewer <id>` is mandatory and placeholder reviewer ids are rejected.
-- L1 and L2 findings must be fixed or dispositioned before PASS.
+- `cli` is not supported and is not an execution target for prototyping.
+- Evaluation is performed by sub-agents; machine checks are limited to schema/evidence validation and breakthrough trigger detection.
+- Shared evidence vocabulary includes `render.json`, `browser-qa.json`, `prototyping.json`, and `breakthrough.json`.
+- static-first evidence capture remains mandatory even when interactive review is used.
 ## Goal
-Build the minimum runnable slice for all specs and produce canonical `full-harness` evidence under `.qfai/evidence/`.
+Generate multiple design directions, converge on a winner, extract the selected direction and final design system, and keep the winner open to breakthrough pivots during later polish iterations.
-## Mode
+## Surface / Mode
-### Full-harness
+- surface / mode routing uses `standard` as the default execution path.
+- `standard` is the default when no explicit escalation to `full-harness` is requested.
+- `full-harness` is reserved for explicit escalation and review-heavy obligations.
-- Full-harness is the package default when prototyping execution is valid.
-- Each `qfai prototyping run --mode full-harness --reviewer <id>` invocation records exactly one measured iteration.
-- Multiple iterations are formed only by real code changes between runs.
-- The runtime does not self-modify code and does not fabricate evidence.
+## Required References
-## Obligation matrix
+Read and follow these references before execution:
-| surface / mode         | specs    | runtimeGate | uiFidelity | render evidence | browser QA | fullHarness |
-| ---------------------- | -------- | ----------- | ---------- | --------------- | ---------- | ----------- |
-| web / full-harness     | required | required    | required   | required        | required   | required    |
-| mobile / full-harness  | required | required    | required   | required        | required   | required    |
-| desktop / full-harness | required | required    | required   | required        | required   | required    |
-| mixed / full-harness   | required | required    | required   | required        | required   | required    |
+- `.qfai/assistant/skills/qfai-prototyping/references/evidence-requirements.md`
+- `.qfai/assistant/skills/qfai-prototyping/references/iteration-cycle.md`
+- `.qfai/assistant/skills/qfai-prototyping/references/l1-review-guide.md`
+- `.qfai/assistant/skills/qfai-prototyping/references/l2-review-guide.md`
+- `.qfai/contracts/design/anchor-selection.yaml` when legacy validator slices are exercised
+- `.qfai/contracts/design/evaluation-axes.yaml` when legacy validator slices are exercised
+- `.qfai/assistant/skills/qfai-prototyping/references/design-system-compliance.md`
+- `.qfai/assistant/skills/qfai-prototyping/references/reviewer-gate.md`
+- `.qfai/assistant/steering/test-layers.md`
-## Required evidence
+## Delegation Scope Table
-## Evidence (MANDATORY)
+All sub-agent delegation in this skill MUST follow the category-to-role mapping below.
+Assigning a task to a role not listed for the category is a violation and MUST be flagged.
+Evaluation scoring and screenshot capture must use only the allowed roles below.
-- `.qfai/evidence/prototyping.md`
-- `.qfai/evidence/prototyping.json`
-- `.qfai/evidence/render.json`
-- `.qfai/evidence/browser-qa.json`
-- `.qfai/evidence/fullHarness.exit.json`
-- `.qfai/evidence/fullHarness.handoff.json`
-- `.qfai/evidence/fullHarness.fakeUiDetection.json`
+| Category              | Allowed Role(s)                                        |
+| --------------------- | ------------------------------------------------------ |
+| UI implementation     | frontend-engineer, product-experience-architect        |
+| Screenshot capture    | devops-ci-engineer                                     |
+| Evaluation scoring    | product-surface-reviewer, product-experience-architect |
+| Build                 | devops-ci-engineer, backend-engineer                   |
+| Breakthrough planning | product-experience-architect, frontend-engineer        |
-## Truthfulness rules
+Any delegation map entry that assigns a category to an undefined or unlisted role MUST produce a violation finding naming the undefined role and the category.
-- `mode.effective` must be `full-harness`.
-- `runtimeGate` is observed-only. Synthetic status codes are invalid.
-- `runtimeGate.evidenceRefs` must contain concrete render/browser QA/spec refs only.
-- `specCoverage` must use concrete declared refs and concrete observed refs only.
-- Browser QA evidence must be preserved per screen.
-- `actionsWired` must reflect actionable control coverage, not finding counts.
-- `reviewerSignoff.status` represents final decision, not mere completion.
-- `reviewerLogs[].verdict` must align with decision/termination semantics.
+## Required Process
-## Review semantics
+### Step 0 — Execution Plan
-- `accepted` -> `approved`
-- `rejected` -> `rejected`
-- `abandoned` -> `abandoned`
-- Plateau stop or max-iterations stop must not produce `approved`.
+Before any code is written, create an execution plan record in the work evidence.
-## Delegation Scope Table
+Required fields:
-All sub-agent delegation in this skill MUST follow the category-to-role mapping below.
-Assigning a task to a role not listed for the category is a violation and MUST be flagged.
+- `targetIterations`: integer; minimum 2
+- `funnelPolicy`: `5->3->2->1`
+- `evaluationAxesSource`: ref to `.qfai/contracts/design/evaluation-rubric.yaml`
+- `delegationMap`: category-to-role assignments per Delegation Scope Table
+- `plannedAt`: ISO-8601 timestamp
-| Category           | Allowed Role(s)                                        |
-| ------------------ | ------------------------------------------------------ |
-| UI implementation  | frontend-engineer, product-experience-architect        |
-| Screenshot capture | devops-ci-engineer                                     |
-| Evaluation L1-L2   | product-surface-reviewer, product-experience-architect |
-| Build              | devops-ci-engineer, backend-engineer                   |
+### Step 1 — Read Inputs
-Any delegation map entry that assigns a category to an undefined or unlisted role (e.g., `"generic-code-writer"`) MUST produce a violation finding naming the undefined role and the category.
+Read the downstream-ready spec/contract inputs and verify:
-## Required process
+- `.qfai/specs/<spec-id>/01_Spec.md`
+- `.qfai/specs/<spec-id>/03_Acceptance-Criteria.md`
+- `.qfai/contracts/design/exploration-brief.yaml`
+- `.qfai/contracts/design/evaluation-rubric.yaml`
+- `.qfai/contracts/design/evaluator-calibration.yaml`
+- `.qfai/contracts/design/anchor-selection.yaml` when legacy validator slices are exercised
+- `.qfai/contracts/design/evaluation-axes.yaml` when legacy validator slices are exercised
+- `.qfai/contracts/design/selected-direction.yaml` when already created
+- `.qfai/contracts/design/design-system.yaml` when already created
+- `.qfai/contracts/ui/*.yaml`
-### Step 0 — Execution Plan (executionPlan)
+Read order:
-Before any code is written, create an `executionPlan` record with the following fields:
+1. `.qfai/specs/<spec-id>/01_Spec.md`
+2. `.qfai/specs/<spec-id>/03_Acceptance-Criteria.md`
+3. `.qfai/contracts/design/exploration-brief.yaml`
+4. `.qfai/contracts/design/evaluation-rubric.yaml`
+5. `.qfai/contracts/design/evaluator-calibration.yaml`
+6. `.qfai/contracts/design/anchor-selection.yaml` (legacy validator alias, when present)
+7. `.qfai/contracts/design/evaluation-axes.yaml` (legacy validator alias, when present)
+8. `.qfai/contracts/design/selected-direction.yaml`
+9. `.qfai/contracts/design/design-system.yaml`
+10. `.qfai/contracts/ui/*.yaml`
-- `targetIterations`: integer; minimum 2 for full-harness
-- `evaluationAxesSource`: reference to the discussion pack evaluation-family files (20/21/22/23)
-- `delegationMap`: category-to-role assignments per Delegation Scope Table above
-- `plannedAt`: ISO-8601 timestamp
+### Step 2 — Verify Execution Preconditions
-The executionPlan MUST be present in `prototyping.json` when `mode=full-harness`. A validator MUST reject any full-harness record without an executionPlan.
+Confirm all of the following before any evaluation:
-### Iteration Gate
+- classification is UI-bearing
+- surface is `web`, `mobile`, `desktop`, or `mixed`
+- every declared screen has a stable `screen-id`
+- the exploration brief, evaluation rubric, and evaluator calibration contracts satisfy the required schema
-- full-harness convergence requires a minimum of 2 iterations.
-- A single-iteration run that reports `converged=true` is invalid; the iteration gate MUST raise an error with message "minimum 2 iterations required before convergence".
-- The phase transition from iteration N to N+1 is blocked until `terminationCondition` is met or the gate explicitly authorizes continuation.
+### Step 3 — Generate Divergent Directions
-### 5-Step Iteration Cycle
+Generate 5 clearly distinct design directions before selecting a winner.
+Do not begin with a single incumbent direction.
-Each full-harness iteration follows this fixed sequence:
+### Step 4 — Capture Mandatory Evidence
-1. **Capture** — Run `packages/qfai/assets/scripts/capture-screenshots.js --url <url> --out <dir>` and record screenshot paths with timestamps under `scoringTrace[i].screenshotDir`.
-2. **Evaluate** — Launch L1 and L2 evaluator sub-agents with full context bundle: (a) screenshots from Step 1, (b) axisDefs from evaluation-family 20/21/22/23, (c) previousScore from prior iteration, (d) designSystemChecklist from `uiux/12_design_system.md`.
-3. **Identify** — Aggregate L1 + L2 findings; flag immediate-fix items.
-4. **Fix** — Apply fixes per finding disposition; do not close items without evidence.
-5. **Re-evaluate** — Re-run Steps 1–4; compare new score to prior score to check plateau.
+For every declared screen and every active direction:
-The sequence MUST NOT be permuted. Parallel execution of Capture+Evaluate is prohibited.
+- capture one screenshot and store it at the canonical screenshot path
+- capture one HTML snapshot and store it at the canonical HTML path
+- record missing evidence immediately; do not continue as if capture succeeded
-### Evaluator Input — 4 Required Elements
+### Step 5 — Launch Evaluation Reviewers
-When launching any L1 or L2 evaluator sub-agent, all 4 elements MUST be present in the input:
+Launch evaluation reviewer sub-agents with the full context bundle:
-(a) screenshots — paths produced by capture-screenshots.js for the current iteration
-(b) axisDefs — scoring axes from discussion-pack evaluation-family (20/21/22/23)
-(c) previousScore — aggregate score from the prior iteration (null for iteration 1)
-(d) designSystemChecklist — the compliance checklist derived from `uiux/12_design_system.md`
+- screenshots from Step 4
+- HTML snapshots from Step 4
+- `axisDefs` from `.qfai/contracts/design/evaluation-rubric.yaml`
+- `previousScore` from the prior iteration (`null` for iteration 1)
+- `designSystemChecklist` from `.qfai/contracts/design/design-system.yaml`
-If any element is missing, a reviewer check MUST raise a finding naming the missing element.
-Missing element (d) is a common error when `uiux/12_design_system.md` is absent; the reviewer MUST still flag it.
+### Step 6 — Direction Funnel
-### Visual Quality Structural Checklist
+Run the mandatory convergence funnel:
-Each iteration evaluation MUST score all 6 visual categories:
+- 5 directions -> top 3
+- top 3 remixed -> top 2
+- top 2 -> selected winner 1
+### Step 7 — Extract Winner Contracts
+After the first winner is selected:
+- write `.qfai/contracts/design/selected-direction.yaml`
+- extract `.qfai/contracts/design/design-system.yaml`
+### Step 8 — Polish the Winner
+Iterate on the selected winner with normal critique/rework loops.
+Do not assume the latest iteration is automatically best; keep best-of-history in evidence.
-1. Color — color palette adherence to design system tokens
-2. Typography — type scale, weight, line-height compliance
-3. Spacing — spacing scale and grid alignment
-4. Border radius — border-radius consistency across components
-5. Shadow — shadow elevation and opacity standards
-6. Do's&Don'ts — adherence to explicit do/don't rules from `uiux/12_design_system.md`
+## Iteration Gate
-### Lighthouse Gate (MUST for web full-harness)
+- Minimum 2 iterations are required before any terminal phase transition is allowed.
+- Do not mark the run as converged or complete after a single iteration.
+- Any phase transition to completion must pass through the iteration gate and reviewer gate.
-When `surface=web` and `mode=full-harness`, a Lighthouse performance/accessibility report MUST be captured and attached to the evidence. The reviewer gate MUST raise an error "Lighthouse Gate is MUST for full-harness + web surface" when the report is absent.
+### Step 9 — Breakthrough Detection
-### Steps (continued)
+After each polish iteration, run the mechanical breakthrough detector.
+If `allItemsPass95` is false and score improvement is below the configured plateau threshold and code change is below the configured diff threshold, trigger breakthrough branching.
-1. Read the latest discussion pack and verify `prototyping.yaml`, `04_Sources.md`, `20/21/22/23`, and `40_screen_contracts.md`.
-   Read order: option comparison / `30_option_comparison.md` -> selected anchor screen / `31_selected_anchor_screen.md` -> strategy / `10_implementation_strategy.md` -> taste interview / `11_design_taste_interview.md` -> trend scan / `04_Sources.md` -> 3-layer evaluation family (`20/21/22/23`) -> screen contracts / `40_screen_contracts.md`.
-2. Verify the classification is UI-bearing and the surface is `web`, `mobile`, `desktop`, or `mixed`.
-3. Create the executionPlan (Step 0 above).
-4. Implement the minimum runnable slice for all specs.
-5. Run `qfai prototyping run --mode full-harness --reviewer <id>` — this executes the 5-Step Iteration Cycle per iteration.
-6. Review render evidence, HTML snapshots, Browser QA, runtimeGate, uiFidelity, and specCoverage for every declared screen.
-7. Fix findings and rerun until the evidence is coherent.
-8. Run `qfai validate --fail-on error`.
-9. Route an independent reviewer and do not declare completion until the result is `PASS`.
+### Step 10 — Breakthrough Branch Loop
-## Reviewer gate
+When breakthrough is triggered:
+- generate exactly 2 branch directions
+- compare incumbent + 2 branches
+- replace the mainline if a branch wins
+- refresh selected-direction/design-system if the winner changes
+- record the decision in `.qfai/evidence/breakthrough.json`
+### Step 11 — Validate and Verify
+- Run `qfai validate --fail-on error`.
+- Route `/qfai-verify` or its equivalent gate workflow for final quality approval.
+- Do not declare completion until the reviewer result is `PASS`.
+## Evaluator Inputs (Mandatory)
+When launching any evaluation reviewer sub-agent, all 5 elements MUST be present:
+1. screenshots
+2. HTML snapshots
+3. axisDefs
+4. previousScore
+5. designSystemChecklist
+## Visual Quality Structural Checklist
+Each iteration evaluation MUST score all 6 visual categories:
+1. Design quality
+2. Originality
+3. Craft
+4. Functionality
+5. Accessibility risk
+6. Implementation plausibility
 ### Reviewer Gate (MUST)
-- Reviewer must verify full-harness evidence completeness.
-- Reviewer response must include `Result: PASS | REVISE` (matching shared-skill-delegation-baseline.md#reviewer-response-template).
-- Reviewer must verify calibration pack usage via `calibrationRef`.
-- Reviewer must reject self-reference, synthetic refs, and `mockPaths.status="pass"`.
-- Reviewer must verify `reviewerSignoff`, `reviewerLogs`, `terminationReason`, and `finalDecision` are semantically aligned.
-- Reviewer must verify Drift Protocol compliance and alignment with `test-layers.md`.
-- Review volume guidance remains signals, not gates.
-- Reviewer returns PASS or REVISE only.
+Reviewer checks are defined in:
+- `.qfai/assistant/skills/qfai-prototyping/references/reviewer-gate.md`
+- `.qfai/assistant/steering/test-layers.md`
+Minimum reviewer responsibilities:
+- enforce the Drift Protocol before approving a completion transition
+- verify mandatory screenshot/HTML evidence exists for every declared screen
+- verify exploration brief, evaluation rubric, and evaluator calibration were used
+- verify missing evidence caused rerun rather than waiver
+- verify `qfai validate --fail-on error` completed successfully
+- verify breakthrough trigger evidence is present
+- verify best-of-history handling is documented
+- treat score/volume heuristics as signals, not gates
+- return `Result: PASS | REVISE`
 ## Sub-agent Delegation (MANDATORY)
@@ -198,9 +250,9 @@ Follow `.qfai/assistant/instructions/shared-skill-delegation-baseline.md`.
 ### Orchestrator Protocol (MUST)
-- Additional prototyping-specific overrides:
-- do not self-approve;
-- keep evidence paths canonical and integrate delegated results only.
+- do not self-approve
+- keep evidence paths canonical
+- integrate delegated results only
 ### Capability Probe (MUST)
@@ -221,19 +273,22 @@ Follow `.qfai/assistant/instructions/shared-skill-operating-baseline.md#completi
 Prototyping-specific additions:
-- all specs are covered;
-- full-harness evidence is complete and truthful;
-- `qfai validate --fail-on error` passes;
-- reviewer returns `PASS`.
+- all specs are covered
+- all declared screens have screenshot + HTML evidence
+- `selected-direction.yaml` exists
+- `design-system.yaml` exists
+- `breakthrough.json` exists
+- `qfai validate --fail-on error` passes
+- reviewer returns `PASS`
 ## FINAL CHECKLIST (Check Last)
-### Completion Checklist (MUST)
 - All specs are covered in the Coverage Matrix.
-- Required full-harness evidence is present.
-- 404 findings are resolved or the run is not complete.
-- uiFidelity is present when required.
+- Every declared screen has screenshot evidence.
+- Every declared screen has HTML evidence.
+- Missing evidence triggered rerun instead of waiver.
+- Direction funnel `5->3->2->1` completed.
+- Breakthrough detector ran after polish iterations.
 - Reviewer returned PASS; otherwise status is REVISE.
 ## Completion Message & Next Actions (MUST)
@@ -242,4 +297,4 @@ Action:
 - Proceed: `/qfai-atdd`
 - Quality gate: `/qfai-verify`
-- Rework prototyping: rerun `/qfai-prototyping` with corrected evidence
+- Rework prototyping: rerun `/qfai-prototyping` with corrected screenshot/HTML evidence
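
Note on the `qfai-prototyping/SKILL.md` change above: the Step 0 field list, the Delegation Scope Table, and the Step 9 trigger rule can be read as simple mechanical checks. The following is a minimal Python sketch under stated assumptions, not code shipped in the package. `ExecutionPlan`, `plan_problems`, and `should_trigger_breakthrough` are hypothetical names, the delegation map is assumed to hold one role per category, and the plateau and diff thresholds are plain parameters because the diff only calls them "configured" without giving values.

```python
from dataclasses import dataclass
from datetime import datetime

# Category-to-role mapping copied from the Delegation Scope Table in the diff.
ALLOWED_ROLES = {
    "UI implementation": {"frontend-engineer", "product-experience-architect"},
    "Screenshot capture": {"devops-ci-engineer"},
    "Evaluation scoring": {"product-surface-reviewer", "product-experience-architect"},
    "Build": {"devops-ci-engineer", "backend-engineer"},
    "Breakthrough planning": {"product-experience-architect", "frontend-engineer"},
}


@dataclass
class ExecutionPlan:
    # Fields mirror the Step 0 list; the class itself is a hypothetical container.
    target_iterations: int
    funnel_policy: str
    evaluation_axes_source: str
    delegation_map: dict[str, str]  # assumption: one role per category
    planned_at: str  # ISO-8601 timestamp


def plan_problems(plan: ExecutionPlan) -> list[str]:
    """Return human-readable violations; an empty list means the plan looks valid."""
    problems = []
    if plan.target_iterations < 2:
        problems.append("targetIterations must be at least 2")
    if plan.funnel_policy != "5->3->2->1":
        problems.append("funnelPolicy must be 5->3->2->1")
    for category, role in plan.delegation_map.items():
        # Name both the role and the category, as the skill text requires.
        if role not in ALLOWED_ROLES.get(category, set()):
            problems.append(f"role {role!r} is not allowed for category {category!r}")
    try:
        # fromisoformat accepts a subset of ISO-8601, enough for this sketch.
        datetime.fromisoformat(plan.planned_at)
    except ValueError:
        problems.append("plannedAt is not an ISO-8601 timestamp")
    return problems


def should_trigger_breakthrough(
    all_items_pass_95: bool,
    score_delta: float,
    change_ratio: float,
    plateau_threshold: float,
    diff_threshold: float,
) -> bool:
    """Step 9 rule: all three conditions must hold to trigger branching."""
    return (
        not all_items_pass_95
        and score_delta < plateau_threshold
        and change_ratio < diff_threshold
    )
```

Under this reading, a plan with `targetIterations` of 1 or a role such as `generic-code-writer` is reported before any code is written, and a polish iteration that stalls on both score and code change, while some items still fail, opens the two-branch loop described in Step 10.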

package/assets/init/.qfai/assistant/skills/qfai-prototyping/references/design-system-compliance.md ADDED

@@ -0,0 +1,22 @@
+# Design System Compliance
+When `.qfai/contracts/design/design-system.yaml` exists and is required, evaluators must compare the implementation against:
+- color palette
+- typography scale and weights
+- spacing scale
+- border radius
+- shadow usage
+- explicit do/don't rules
+## Rule
+If the implementation clearly contradicts the design system on a primary screen, record an immediate-fix finding.
+## Evidence
+Support each finding with:
+- screenshot evidence
+- HTML snapshot evidence
+- the specific design-system clause or checklist item

package/assets/init/.qfai/assistant/skills/qfai-prototyping/references/evidence-requirements.md ADDED

@@ -0,0 +1,31 @@
+# Evidence Requirements
+## Mandatory evidence
+For every declared screen in `.qfai/contracts/ui/*.yaml`, collect both:
+- screenshot: `.qfai/evidence/prototyping/screenshots/<screen-id>.png`
+- HTML snapshot: `.qfai/evidence/prototyping/html/<screen-id>.html`
+If either artifact is missing:
+- the screen is scored `0`
+- the run is incomplete
+- rerun is mandatory
+Optional evidence is not allowed.
+## Capture rules
+- Use stable `screen-id` names from the canonical UI contracts.
+- Overwrite stale evidence with fresh evidence from the current iteration.
+- Do not reuse an older screenshot or HTML snapshot after a fix.
+- If capture fails, record the failure in work evidence and stop pretending the screen was evaluated.
+## Validate gate expectations
+`qfai validate --fail-on error` must be able to confirm:
+- every declared screen has a screenshot file
+- every declared screen has an HTML snapshot file
+- the file paths follow the canonical directories above
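
Note on `evidence-requirements.md` above: the per-screen rule (both a screenshot and an HTML snapshot at the canonical paths, otherwise the screen scores `0` and a rerun is mandatory) amounts to a file-existence check. The sketch below is one possible reading in Python and is not how `qfai validate` is implemented, which this diff does not show. The function name `missing_evidence` and the `root` parameter are assumptions; only the two directory paths come from the added file.

```python
from pathlib import Path

# Canonical evidence directories quoted from the added reference file.
SCREENSHOT_DIR = Path(".qfai/evidence/prototyping/screenshots")
HTML_DIR = Path(".qfai/evidence/prototyping/html")


def missing_evidence(root: Path, screen_ids: list[str]) -> dict[str, list[str]]:
    """Map each screen id to the evidence kinds it lacks.

    An empty dict means every declared screen has both artifacts.
    `root` is a hypothetical project root that contains `.qfai/`.
    """
    gaps: dict[str, list[str]] = {}
    for screen_id in screen_ids:
        expected = {
            "screenshot": root / SCREENSHOT_DIR / f"{screen_id}.png",
            "html": root / HTML_DIR / f"{screen_id}.html",
        }
        missing = [kind for kind, path in expected.items() if not path.is_file()]
        if missing:
            # Per the reference file: such a screen scores 0 and forces a rerun.
            gaps[screen_id] = missing
    return gaps
```

A caller would treat any non-empty result as an incomplete run and rerun capture for the listed screens rather than waive them.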