qfai 1.7.13 → 1.7.14

Files changed (43)
  1. package/README.md +7 -5
  2. package/assets/init/.qfai/assistant/agents/frontend-engineer.md +2 -2
  3. package/assets/init/.qfai/assistant/agents/product-experience-architect.md +2 -2
  4. package/assets/init/.qfai/assistant/agents/product-surface-reviewer.md +1 -1
  5. package/assets/init/.qfai/assistant/skills/qfai-discussion/SKILL.md +44 -18
  6. package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/01_Context.md +9 -0
  7. package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/03_Story-Workshop.md +1 -1
  8. package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/04_Sources.md +6 -6
  9. package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/14_Review-Request.md +7 -7
  10. package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/review/Rxx_reviewer.md +2 -2
  11. package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/review/review_request.md +2 -2
  12. package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/uiux/10_implementation_strategy.md +31 -13
  13. package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/uiux/20_trend_scan.md +41 -0
  14. package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/uiux/23_design_eval_aggregate.md +12 -0
  15. package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/uiux/40_screen_contracts.md +1 -1
  16. package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/uiux/50_review_input_bundle.md +2 -0
  17. package/assets/init/.qfai/assistant/skills/qfai-implement/SKILL.md +6 -2
  18. package/assets/init/.qfai/assistant/skills/qfai-prototyping/SKILL.md +264 -34
  19. package/assets/init/.qfai/assistant/steering/agent-catalog.yml +1 -1
  20. package/assets/init/.qfai/assistant/steering/agent-routing.yml +1 -1
  21. package/assets/init/.qfai/assistant/steering/manifest.md +4 -7
  22. package/assets/init/.qfai/assistant/steering/product.md +6 -6
  23. package/assets/init/.qfai/assistant/steering/review-profiles.yml +3 -0
  24. package/assets/init/.qfai/assistant/steering/ui-definition-protocol.md +2 -2
  25. package/assets/init/.qfai/contracts/ui/README.md +2 -2
  26. package/assets/init/.qfai/discussion/README.md +14 -22
  27. package/assets/init/.qfai/evidence/README.md +21 -12
  28. package/assets/uix-rev/comparison-review.md +3 -15
  29. package/assets/uix-rev/contracts-review.md +5 -2
  30. package/assets/uix-rev/scoring-review.md +10 -2
  31. package/assets/uix-rev/strategy-review.md +11 -7
  32. package/dist/cli/index.cjs +1993 -1279
  33. package/dist/cli/index.cjs.map +1 -1
  34. package/dist/cli/index.mjs +1930 -1216
  35. package/dist/cli/index.mjs.map +1 -1
  36. package/dist/index.cjs +1989 -1269
  37. package/dist/index.cjs.map +1 -1
  38. package/dist/index.d.cts +75 -62
  39. package/dist/index.d.ts +75 -62
  40. package/dist/index.mjs +1926 -1207
  41. package/dist/index.mjs.map +1 -1
  42. package/package.json +1 -1
  43. package/assets/uix-rev/migration-review.md +0 -17
@@ -16,7 +16,7 @@ roles:
16
16
  product-surface-reviewer,
17
17
  qa-gatekeeper,
18
18
  ]
19
- routing-profile: ui-bearing
19
+ routing-profile: ui-surface-aware
20
20
  mode: execution-focused
21
21
  ---
22
22
 
@@ -38,14 +38,25 @@ This skill is **static-first**. File-based checks and evidence are the default.
38
38
  - If a required API endpoint still returns `404`, the run is incomplete.
39
39
  - `L1` and `L2` critique findings must be reflected in the evidence pack or justified as `REVISE`.
40
40
  - `uiFidelity` is the canonical UI evidence block for UI-bearing surfaces.
41
- - non-ui skip semantics must be preserved. UI-only placeholders are not required when the surface is non-ui.
41
+ - `ui_bearing: false` specs are not prototyping execution targets. UI-only placeholders are not required for such specs.
42
42
  - Review rendered output, screenshot evidence, HTML snapshots, or preview artifacts before closing any UI-affecting run.
43
- - Read the canonical sidecar family first: option comparison / `30_option_comparison.md` -> selected anchor screen / `31_selected_anchor_screen.md` -> strategy / `10_implementation_strategy.md` -> taste interview / `11_design_taste_interview.md` -> trend scan / `04_Sources.md` -> 3-layer evaluation family (`20/21/22/23` + optional `24`) -> screen contracts / `40_screen_contracts.md` -> review input bundle / `50_review_input_bundle.md`.
43
+ - Read the canonical sidecar family first: option comparison / `30_option_comparison.md` -> selected anchor screen / `31_selected_anchor_screen.md` ->
44
+ strategy / `10_implementation_strategy.md` -> taste interview / `11_design_taste_interview.md` ->
45
+ trend scan / `04_Sources.md` -> 3-layer evaluation family (`20/21/22/23` + optional `24`) ->
46
+ screen contracts / `40_screen_contracts.md` -> review input bundle / `50_review_input_bundle.md`.
44
47
 
45
48
  ## Goal
46
49
 
47
50
  Build the minimum runnable vertical slice for **ALL specs** and produce canonical prototyping evidence under `.qfai/evidence/`.
48
51
 
52
+ ### Mode-specific Goals
53
+
54
+ | Mode | Goal |
55
+ | ------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
56
+ | low-cost | Static structure proof. Skeleton + evidence files only. |
57
+ | standard | Customer-presentable vertical slice. UI fidelity + static evidence. |
58
+ | full-harness | **Iterative design-improvement loop.** Evaluate → Identify → Fix → Re-evaluate until convergence, plateau, or max-iterations. Each iteration produces measurable quality delta. |
59
+
49
60
  ## Non-goals
50
61
 
51
62
  - Acceptance test automation (`/qfai-atdd`)
@@ -70,10 +81,12 @@ Record in `prototyping.json`:
70
81
 
71
82
  ## Surface Semantics
72
83
 
73
- - `surface: non-ui` means UI-specific evidence is `n/a`.
74
- - For non-ui projects, `uiFidelity`, render evidence, browser QA, and `runtimeGate.ui` may be absent.
75
- - Absent is normal for non-ui. Do not force skipped placeholders unless the project intentionally emits them.
76
- - For UI-bearing projects, route/contract fidelity must be captured when `uiFidelity` is required by mode.
84
+ Canonical prototyping surfaces are: `web`, `mobile`, `desktop`, `cli`, `mixed`.
85
+
86
+ - `ui_bearing: false` specs are **not** prototyping execution targets. Prototyping execution is only invoked for `ui_bearing: true` or `mixed` classifications.
87
+ - For `cli` surface: render screenshot evidence is not required; browser QA is not required. Only output / interaction / structured evidence is expected.
88
+ - For `web`, `mobile`, `desktop` surfaces: route/contract fidelity must be captured when `uiFidelity` is required by mode.
89
+ - `mixed` surface inherits the union of obligations from the constituent surfaces.
77
90
 
78
91
  ## Prototyping Modes
79
92
 
@@ -81,42 +94,57 @@ Record in `prototyping.json`:
81
94
 
82
95
  - Static checks only.
83
96
  - Suitable for early skeleton work.
84
- - UI-bearing projects may include `uiFidelity` and render/browser artifacts, but they are optional.
97
+ - `web`, `mobile`, `desktop`, `mixed` surfaces may include `uiFidelity` and render/browser artifacts, but they are optional.
98
+ - `cli` surface does not require `uiFidelity`, render evidence, or browser QA.
85
99
  - `skeleton` mode is allowed for lightweight UI proof.
86
100
 
87
101
  ### Standard
88
102
 
89
103
  - Static checks plus optional light validation.
90
104
  - This is the default mode.
91
- - UI-bearing projects require `uiFidelity`.
105
+ - `web`, `mobile`, `desktop`, `mixed` surfaces require `uiFidelity`.
106
+ - `cli` surface does not require `uiFidelity`, render evidence, or browser QA.
92
107
  - Runtime gate, render bundle, and browser QA bundle are optional.
93
108
 
94
109
  ### Full-harness
95
110
 
96
111
  - Explicit opt-in only. Never auto-activate.
97
112
  - Adds runtime-heavy obligations and full-harness audit metadata.
98
- - UI-bearing projects require runtime gate, render bundle, browser QA bundle, and `fullHarness`.
99
- - Non-ui projects require `fullHarness`, but UI-specific bundles remain n/a.
113
+ - `web`, `mobile`, `desktop`, `mixed` surfaces require runtime gate, render bundle, browser QA bundle, and `fullHarness`.
114
+ - `cli` surface requires `fullHarness` but not `uiFidelity`, render evidence, or browser QA.
115
+ - `ui_bearing: false` specs are not prototyping execution targets.
116
+ - Full-harness is an **iterative design-improvement loop**, not a single evidence-generation pass. See `## Full-Harness Iteration Protocol` below.
117
+ - The discussion 3-layer evaluation score measures **design direction quality** and MUST NOT be copied into `fullHarness.scoringTrace`.
118
+ Prototyping scores measure **implementation fidelity** against the selected anchor.
100
119
 
101
120
  ## Obligation Matrix
102
121
 
103
122
  ### surface / mode
104
123
 
105
- | surface / mode | specs | runtimeGate | uiFidelity | render evidence | browser QA | fullHarness |
106
- | ------------------------- | -------- | ----------- | --------------------------------- | ------------------------------------ | ------------ | ------------ |
107
- | non-ui / low-cost | required | optional | n/a | n/a | n/a | absent |
108
- | non-ui / standard | required | optional | n/a | n/a | n/a | absent |
109
- | non-ui / full-harness | required | optional | n/a | n/a | n/a | required |
110
- | ui-bearing / low-cost | required | optional | optional (`skeleton` allowed) | optional (`captured/skipped/failed`) | optional | absent |
111
- | ui-bearing / standard | required | optional | **required** (`interactive` only) | optional (`captured/skipped/failed`) | optional | absent |
112
- | ui-bearing / full-harness | required | required | **required** (`interactive` only) | **required** | **required** | **required** |
124
+ | surface / mode | specs | runtimeGate | uiFidelity | render evidence | browser QA | fullHarness |
125
+ | ---------------------- | -------- | ----------- | --------------------------------- | ------------------------------------ | ------------ | ------------ |
126
+ | web / low-cost | required | optional | optional (`skeleton` allowed) | optional (`captured/skipped/failed`) | optional | absent |
127
+ | web / standard | required | optional | **required** (`interactive` only) | optional (`captured/skipped/failed`) | optional | absent |
128
+ | web / full-harness | required | required | **required** (`interactive` only) | **required** | **required** | **required** |
129
+ | mobile / low-cost | required | optional | optional (`skeleton` allowed) | optional (`captured/skipped/failed`) | optional | absent |
130
+ | mobile / standard | required | optional | **required** (`interactive` only) | optional (`captured/skipped/failed`) | optional | absent |
131
+ | mobile / full-harness | required | required | **required** (`interactive` only) | **required** | **required** | **required** |
132
+ | desktop / low-cost | required | optional | optional (`skeleton` allowed) | optional (`captured/skipped/failed`) | optional | absent |
133
+ | desktop / standard | required | optional | **required** (`interactive` only) | optional (`captured/skipped/failed`) | optional | absent |
134
+ | desktop / full-harness | required | required | **required** (`interactive` only) | **required** | **required** | **required** |
135
+ | cli / low-cost | required | optional | n/a | n/a | n/a | absent |
136
+ | cli / standard | required | optional | n/a | n/a | n/a | absent |
137
+ | cli / full-harness | required | optional | n/a | n/a | n/a | **required** |
138
+ | mixed / low-cost | required | optional | optional (`skeleton` allowed) | optional (`captured/skipped/failed`) | optional | absent |
139
+ | mixed / standard | required | optional | **required** (`interactive` only) | optional (`captured/skipped/failed`) | optional | absent |
140
+ | mixed / full-harness | required | required | **required** (`interactive` only) | **required** | **required** | **required** |
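
As a rough illustration of the matrix above, the `required` cells can be encoded as a small lookup. This is only a sketch and not the qfai validator: the function name `required_blocks` and the string labels for evidence blocks are hypothetical, and `optional` / `n/a` / `absent` cells are simply left out of the returned set.

```python
# Hypothetical helper: returns the evidence blocks the matrix marks as required.
UI_SURFACES = {"web", "mobile", "desktop", "mixed"}
MODES = {"low-cost", "standard", "full-harness"}


def required_blocks(surface: str, mode: str) -> set:
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if surface not in UI_SURFACES and surface != "cli":
        raise ValueError(f"unknown surface: {surface}")

    required = {"specs"}  # required in every row of the matrix
    if mode == "full-harness":
        required.add("fullHarness")  # required for cli and UI surfaces alike
    if surface in UI_SURFACES:
        if mode in ("standard", "full-harness"):
            required.add("uiFidelity")
        if mode == "full-harness":
            required |= {"runtimeGate", "render evidence", "browser QA"}
    return required


print(sorted(required_blocks("cli", "full-harness")))
print(sorted(required_blocks("web", "standard")))
```

Treating the four UI surfaces as one group mirrors the table, where their rows are identical; a real implementation might need per-surface differences.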
113
141
 
114
142
  `uiFidelity.mode` policy:
115
143
 
116
144
  - `low-cost`: `skeleton` or `interactive`
117
145
  - `standard`: `interactive` only — `skeleton` is rejected by the validator
118
146
  - `full-harness`: `interactive` only — `skeleton` is rejected; render evidence, Browser QA, runtimeGate, and fullHarness block are all required
119
- - `non-ui`: `uiFidelity` is not emitted
147
+ - `cli`: `uiFidelity` is not emitted; render and browser QA are not required
120
148
 
121
149
  Interpretation:
122
150
 
@@ -132,27 +160,33 @@ Interpretation:
132
160
  - `.qfai/evidence/prototyping.json`
133
161
  - `.qfai/evidence/render.json` when render evidence is emitted or required by mode
134
162
  - `.qfai/evidence/browser-qa.json` when browser QA evidence is emitted or required by mode
163
+ - `.qfai/evidence/browserQa.summary.json` when browser QA evidence is emitted or required by mode
164
+ - `.qfai/evidence/browserQa.findings.json` when browser QA evidence is emitted or required by mode
165
+ - `.qfai/evidence/browserQa.repairs.json` when browser QA evidence is emitted or required by mode
166
+ - `.qfai/evidence/fullHarness.exit.json` when `mode.effective = full-harness`
167
+ - `.qfai/evidence/fullHarness.handoff.json` when `mode.effective = full-harness`
168
+ - `.qfai/evidence/fullHarness.fakeUiDetection.json` when `mode.effective = full-harness`
135
169
  - `Coverage Matrix` covering all specs
136
170
  - critique summary with `L1` / `L2` findings and disposition
137
171
 
138
172
  ### low-cost obligations
139
173
 
140
174
  - always: `specs[]`, `meta.generatedAt`, `meta.toolVersion`, `meta.commands[]`, `mode.*`
141
- - ui-bearing: `uiFidelity` optional, render/browser optional
142
- - non-ui: UI-specific evidence is n/a
175
+ - `web`, `mobile`, `desktop`, `mixed`: `uiFidelity` optional, render/browser optional
176
+ - `cli`: UI-specific evidence is n/a
143
177
 
144
178
  ### standard obligations
145
179
 
146
180
  - always: `specs[]`, `meta.*`, `mode.*`
147
- - ui-bearing: `uiFidelity` required
148
- - non-ui: UI-specific evidence is n/a
181
+ - `web`, `mobile`, `desktop`, `mixed`: `uiFidelity` required
182
+ - `cli`: UI-specific evidence is n/a
149
183
  - runtime gate and browser QA remain optional
150
184
 
151
185
  ### full-harness obligations
152
186
 
153
187
  - always: `specs[]`, `meta.*`, `mode.*`, `fullHarness`
154
- - ui-bearing: `runtimeGate`, `.qfai/evidence/render.json`, `.qfai/evidence/browser-qa.json`, `uiFidelity`
155
- - non-ui: UI-specific evidence remains n/a
188
+ - `web`, `mobile`, `desktop`, `mixed`: `runtimeGate`, `.qfai/evidence/render.json`, Browser QA bundle trio, `uiFidelity`
189
+ - `cli`: UI-specific evidence remains n/a
156
190
 
157
191
  ## Full-harness minimum completeness
158
192
 
@@ -161,31 +195,208 @@ When `mode.effective = full-harness`, record:
161
195
  - `fullHarness.enabled = true`
162
196
  - `fullHarness.available`
163
197
  - `fullHarness.runId`
164
- - `fullHarness.iterationCount >= 1`
198
+ - `fullHarness.iterationCount >= 1` (validator warns if `== 1` with `terminationReason: converged` — see QFAI-PROT-290)
199
+ - `fullHarness.scoringTrace` entries MUST equal `iterationCount` (validator warns on mismatch — see QFAI-PROT-291)
200
+ - `fullHarness.scoringTrace` SHOULD show measurable progression (non-monotonic traces are flagged as info — see QFAI-PROT-294)
165
201
  - `fullHarness.bestIteration >= 1`
166
202
  - `fullHarness.terminationReason`
167
203
  - `fullHarness.reviewerSignoff`
168
204
  - `fullHarness.scoringTrace`
205
+ - `fullHarness.exit`
206
+ - `fullHarness.handoff`
207
+ - `fullHarness.fakeUiDetection`
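
The three validator notes in the list above (QFAI-PROT-290, 291, 294) could be approximated by a check like the following. This is a minimal sketch under stated assumptions, not the shipped validator: the function name is hypothetical, and "non-monotonic" is interpreted here as "any iteration scores lower than the one before it".

```python
# Hypothetical check over a parsed `fullHarness` block (a plain dict).
def full_harness_flags(full_harness: dict) -> list:
    flags = []
    count = full_harness.get("iterationCount", 0)
    trace = full_harness.get("scoringTrace", [])

    # Converging after a single iteration is treated as suspicious.
    if count == 1 and full_harness.get("terminationReason") == "converged":
        flags.append("QFAI-PROT-290")

    # One scoringTrace entry is expected per iteration.
    if len(trace) != count:
        flags.append("QFAI-PROT-291")

    # Assumption: a trace is non-monotonic if any weightedTotal decreases.
    totals = [entry["weightedTotal"] for entry in trace]
    if any(later < earlier for earlier, later in zip(totals, totals[1:])):
        flags.append("QFAI-PROT-294")

    return flags


example = {
    "iterationCount": 3,
    "terminationReason": "plateau",
    "scoringTrace": [{"weightedTotal": 0.5}, {"weightedTotal": 0.4}],
}
print(full_harness_flags(example))
```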
169
208
 
170
209
  ## Canonical Bundles
171
210
 
172
211
  - render bundle: `.qfai/evidence/render.json`
173
212
  - browser QA bundle: `.qfai/evidence/browser-qa.json`
213
+ - browser QA summary: `.qfai/evidence/browserQa.summary.json`
214
+ - browser QA findings: `.qfai/evidence/browserQa.findings.json`
215
+ - browser QA repairs: `.qfai/evidence/browserQa.repairs.json`
216
+ - full-harness exit: `.qfai/evidence/fullHarness.exit.json`
217
+ - full-harness handoff: `.qfai/evidence/fullHarness.handoff.json`
218
+ - full-harness fake-UI detection: `.qfai/evidence/fullHarness.fakeUiDetection.json`
174
219
 
175
220
  Render bundle uses `captured | skipped | failed`.
176
221
  Browser QA bundle uses `completed | skipped | failed`.
177
222
 
223
+ ## Full-Harness Iteration Protocol
224
+
225
+ Full-harness mode executes a **multi-iteration improvement loop**. A single-pass evidence dump is not full-harness.
226
+
227
+ ### Iteration Cycle Definition
228
+
229
+ Each iteration consists of exactly 4 steps:
230
+
231
+ 1. **Evaluate**: Score the current implementation against the evaluation axes defined in `uiux/20-23` (3-layer evaluation family). Use the calibration baselines from `qfai.config.yaml > prototyping.calibration`.
232
+ 2. **Identify**: List concrete deficiencies with L1/L2 classification. Each finding MUST reference a specific evaluation axis and criterion.
233
+ 3. **Fix**: Apply targeted improvements to the identified deficiencies. Record what was changed and why.
234
+ 4. **Re-evaluate**: Re-score using the same evaluation axes. Record the delta per axis.
235
+
236
+ ### Calibration Configuration Reference
237
+
238
+ The iteration loop MUST read calibration parameters from `qfai.config.yaml`:
239
+
240
+ ```yaml
241
+ prototyping:
242
+ calibration:
243
+ packPath: ".qfai/evidence/calibration.yaml" # evaluation criteria source
244
+ thresholds:
245
+ accept: 0.8 # weighted total >= accept → converged
246
+ refine: 0.5 # weighted total >= refine → continue improving
247
+ maxIterations: 15 # hard ceiling on iteration count
248
+ plateauDelta: 0.02 # delta < this for N consecutive iterations → plateau
249
+ plateauLookback: 3 # N for plateau detection window
250
+ ```
251
+
252
+ Runtime constants (harness types): `MIN_ITERATIONS = 5`, `MAX_ITERATIONS = 15`.
253
+
254
+ ### Termination Conditions
255
+
256
+ The loop terminates when **any** of these conditions is met:
257
+
258
+ | Condition | `terminationReason` | Description |
259
+ | ---------------------- | ------------------- | --------------------------------------------------------------------------- |
260
+ | Accept threshold met | `converged` | `weightedTotal >= thresholds.accept` AND `iterationCount >= MIN_ITERATIONS` |
261
+ | Max iterations reached | `max-iterations` | `iterationCount >= maxIterations` |
262
+ | Score plateau detected | `plateau` | Score delta < `plateauDelta` for `plateauLookback` consecutive iterations |
263
+ | User manual stop | `manual-stop` | User explicitly requests termination |
264
+
265
+ **IMPORTANT**: `converged` with `iterationCount == 1` is a contradiction and will trigger a validator warning (QFAI-PROT-290).
266
+
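
The termination table can be read as a small decision function. The sketch below is illustrative only: the `Calibration` class and `termination_reason` function are hypothetical names, the order in which conditions are checked is an assumption (the source only says "any"), and the plateau test assumes the absolute score change is what gets compared with `plateauDelta`.

```python
from dataclasses import dataclass
from typing import List, Optional

MIN_ITERATIONS = 5  # runtime constant quoted in the skill text


@dataclass
class Calibration:
    accept: float = 0.8
    max_iterations: int = 15
    plateau_delta: float = 0.02
    plateau_lookback: int = 3


def termination_reason(scores: List[float], cal: Calibration,
                       manual_stop: bool = False) -> Optional[str]:
    """scores[i] is the weightedTotal of iteration i + 1; None means keep going."""
    n = len(scores)
    if manual_stop:
        return "manual-stop"
    if n == 0:
        return None
    if scores[-1] >= cal.accept and n >= MIN_ITERATIONS:
        return "converged"
    if n >= cal.max_iterations:
        return "max-iterations"
    if n > cal.plateau_lookback:
        # The last `plateau_lookback` deltas need lookback + 1 scores.
        recent = scores[-(cal.plateau_lookback + 1):]
        deltas = [b - a for a, b in zip(recent, recent[1:])]
        if all(abs(d) < cal.plateau_delta for d in deltas):
            return "plateau"
    return None


cal = Calibration()
print(termination_reason([0.9], cal))                     # accept met, but too early
print(termination_reason([0.6, 0.61, 0.615, 0.62], cal))  # three small deltas in a row
```

Note that a first-iteration score above `accept` does not terminate the loop here, which matches the `iterationCount >= MIN_ITERATIONS` clause in the table.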
267
+ ### Independent Evaluator Panel (MUST)
268
+
269
+ To prevent self-evaluation bias, the evaluator MUST be independent from the generator:
270
+
271
+ | Layer | Agent | Input Scope | Role |
272
+ | ---------------------- | ------------------------------ | ----------------------------------------------------------- | ------------------------------------------------------- |
273
+ | L1: Design Quality | `product-surface-reviewer` | Screenshot/HTML snapshot + evaluation axis definitions ONLY | UI/UX/visual coherence scoring |
274
+ | L2: Product Experience | `product-experience-architect` | Same as L1 + screen contracts + selected anchor | User journey / IA / transition coherence |
275
+ | L3: Process Audit | `qa-gatekeeper` | `fullHarness` evidence block ONLY | iterationCount/scoringTrace/terminationReason integrity |
276
+
277
+ **Operational Rules:**
278
+
279
+ - L1 and L2 MUST be launched via `task` tool in `background` mode with a separate context. They MUST NOT receive improvement history, previous scores, or generator plans.
280
+ - L3 operates on the final evidence file and does not need a separate context.
281
+ - The iteration's `weightedTotal` is the **minimum** of L1 and L2 scores. If either returns below `thresholds.refine`, the iteration decision is `pivot`.
282
+ - Fabricated reviewer names (e.g., `"completion-reviewer"` without actual agent invocation) are a process integrity violation.
283
+
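
A minimal sketch of the scoring rule above, assuming L1 and L2 each return a single score in the 0 to 1 range. The function name is hypothetical, and the `"accept"` label is an assumption: the source text only names `refine` and `pivot` as decision values.

```python
# Hypothetical combination of the two independent evaluator scores.
def iteration_decision(l1_score: float, l2_score: float,
                       accept: float = 0.8, refine: float = 0.5):
    # The iteration's weightedTotal is the minimum of the two scores,
    # so the weaker evaluation always decides the outcome.
    weighted_total = min(l1_score, l2_score)
    if weighted_total < refine:
        decision = "pivot"    # either evaluator is below thresholds.refine
    elif weighted_total >= accept:
        decision = "accept"   # assumed label for reaching thresholds.accept
    else:
        decision = "refine"
    return weighted_total, decision


print(iteration_decision(0.72, 0.90))
print(iteration_decision(0.40, 0.90))
```

Because the total is a minimum, "either score below `refine`" and "the total below `refine`" are the same condition, so one comparison covers both.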
284
+ ### scoringTrace Recording
285
+
286
+ Each iteration MUST produce a `scoringTrace` entry:
287
+
288
+ ```json
289
+ {
290
+ "iteration": 3,
291
+ "weightedTotal": 0.72,
292
+ "decision": "refine",
293
+ "evaluators": ["product-surface-reviewer", "product-experience-architect"],
294
+ "axisDelta": { "visual_coherence": 0.05, "navigation": 0.08, "accessibility": -0.01 },
295
+ "maxDeltaCap": 0.15
296
+ }
297
+ ```
298
+
299
+ **Score Scope Separation:**
300
+
301
+ - Discussion 3-layer scores evaluate **design direction quality** (option comparison).
302
+ - Prototyping scoringTrace evaluates **implementation fidelity** against the selected anchor.
303
+ - These are different evaluation targets. Copying discussion scores into `scoringTrace` is prohibited.
304
+
305
+ ### Maximum Delta Cap
306
+
307
+ Per-axis score improvement per iteration is capped at `maxDeltaPerAxisPerIteration: 0.15`.
308
+ Any reported delta exceeding this cap MUST trigger re-evaluation or justification.
309
+ This prevents single-iteration score inflation.
310
+
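
The cap can be checked mechanically against an `axisDelta` object like the one in the `scoringTrace` example. This is a sketch with a hypothetical function name; it assumes only positive deltas (improvements) count against the cap, since the rule is phrased in terms of score improvement.

```python
MAX_DELTA_PER_AXIS = 0.15  # value quoted in the skill text


def axes_over_cap(axis_delta: dict, cap: float = MAX_DELTA_PER_AXIS) -> list:
    """Return the axes whose reported improvement exceeds the cap."""
    return sorted(axis for axis, delta in axis_delta.items() if delta > cap)


reported = {"visual_coherence": 0.05, "navigation": 0.08, "accessibility": -0.01}
print(axes_over_cap(reported))             # nothing exceeds the cap
print(axes_over_cap({"navigation": 0.2}))  # needs re-evaluation or justification
```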
311
+ ## Evaluation Rigor Rules (Full-Harness)
312
+
313
+ ### Rubric-Based Scoring Structure
314
+
315
+ Each evaluation axis MUST use a 3-tier rubric:
316
+
317
+ | Tier | Criteria | Score Range |
318
+ | --------------------- | ---------------------------------------- | ----------- |
319
+ | `existence_gate` | Is the element present at all? | 0.0-0.3 |
320
+ | `quality_criteria` | Does it meet baseline quality standards? | 0.3-0.7 |
321
+ | `excellence_criteria` | Does it exceed expectations? | 0.7-1.0 |
322
+
323
+ An axis that fails `existence_gate` cannot score above 0.3 regardless of other qualities.
324
+
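
The existence-gate rule amounts to a ceiling on the axis score. A minimal sketch, assuming scores are floats in the 0.0 to 1.0 range; the function name is hypothetical.

```python
EXISTENCE_GATE_CEILING = 0.3  # upper bound of the existence_gate tier


def capped_axis_score(raw_score: float, passes_existence_gate: bool) -> float:
    """Clamp an axis score to 0.3 when the existence gate fails."""
    if not 0.0 <= raw_score <= 1.0:
        raise ValueError("axis scores are expected in the range 0.0-1.0")
    if passes_existence_gate:
        return raw_score
    return min(raw_score, EXISTENCE_GATE_CEILING)


print(capped_axis_score(0.85, passes_existence_gate=False))
print(capped_axis_score(0.85, passes_existence_gate=True))
```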
325
+ ### L1/L2 Classification and Agent-Fixable Assessment
326
+
327
+ | Level | Definition | Agent-fixable? | Action |
328
+ | --------- | ----------------------------------------------------------------------------------- | ----------------------------------- | ------------------------------------- |
329
+ | L1 | Structural deficiency (missing element, broken navigation, accessibility violation) | Yes — must fix in current iteration | Fix immediately |
330
+ | L2 | Quality shortfall (suboptimal spacing, weak contrast, inconsistent tone) | Yes if clearly defined | Fix or justify deferral with evidence |
331
+ | L1-manual | Requires human judgment (brand alignment, business logic correctness) | No | Record in `limitations` section |
332
+
333
+ ### Lighthouse Automated Gate (SHOULD)
334
+
335
+ When the surface is `web` and a dev server is available:
336
+
337
+ - Run Lighthouse audit (Performance, Accessibility, Best Practices, SEO).
338
+ - Record scores in evidence. Scores below 70 in any category are flagged as L1 findings.
339
+ - This is SHOULD (not MUST) because dev server availability is not guaranteed.
340
+
341
+ ## Asset Acquisition Strategy (Full-Harness)
342
+
343
+ When `mode.effective = full-harness`, professional-quality visual assets are REQUIRED (not optional).
344
+
345
+ ### Asset Rules
346
+
347
+ | Rule | Level | Description |
348
+ | ----------------------- | ------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
349
+ | Free asset sources | MUST | Use only properly licensed free assets (Unsplash, Pexels, Google Fonts, Heroicons, etc.). Record source URL and license in evidence. |
350
+ | Emoji prohibition | MUST | Emoji characters (U+1F000–U+1FAFF, U+2600–U+27BF) MUST NOT appear in UI output as decorative elements. Unicode symbols for functional purposes (e.g., ✓ for checkmarks) are allowed. |
351
+ | Placeholder prohibition | MUST | "Lorem ipsum", `placeholder.com` images, and gray boxes are not acceptable in full-harness final output. |
352
+ | Attribution | SHOULD | Record asset attributions in `prototyping.md` or a dedicated `assets.md`. |
353
+
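
One way to check the emoji rule is a character-class scan over the two quoted ranges. Note that ✓ (U+2713) itself falls inside U+2600–U+27BF, so a plain range match would flag the very symbol the rule allows; the sketch below handles that with an allowlist. The function name and the allowlist contents are assumptions (the source names only ✓ as an example), and deciding whether a symbol is decorative or functional still needs human or reviewer judgment.

```python
import re

# The two code point ranges quoted in the emoji prohibition rule.
EMOJI_RANGES = re.compile("[\U0001F000-\U0001FAFF\u2600-\u27BF]")

# Assumed allowlist of functional symbols; only the check mark is named in the rule.
FUNCTIONAL_SYMBOLS = {"\u2713"}


def find_prohibited_symbols(text: str, allowed=FUNCTIONAL_SYMBOLS) -> list:
    """Return characters in the prohibited ranges that are not allowlisted."""
    return [ch for ch in EMOJI_RANGES.findall(text) if ch not in allowed]


print(find_prohibited_symbols("Saved \u2713"))           # allowlisted check mark
print(find_prohibited_symbols("Launch \U0001F680 now"))  # rocket emoji is flagged
```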
354
+ ### Accessibility Checklist (Full-Harness MUST)
355
+
356
+ - Color contrast ratio ≥ 4.5:1 for normal text, ≥ 3:1 for large text (WCAG 2.1 AA)
357
+ - All interactive elements are keyboard-navigable
358
+ - Images have `alt` attributes (decorative images use `alt=""`)
359
+ - Form inputs have associated labels
360
+ - Focus indicators are visible
361
+
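
The contrast item in the checklist above can be verified numerically with the WCAG 2.1 relative-luminance and contrast-ratio formulas. The sketch below assumes 8-bit sRGB colors given as `(r, g, b)` tuples; the function names are hypothetical and it does not account for transparency or anti-aliasing.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.1 definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb) -> float:
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg, bg) -> float:
    l1, l2 = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def meets_aa(fg, bg, large_text: bool = False) -> bool:
    """AA thresholds: 4.5:1 for normal text, 3:1 for large text."""
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)


black, white, grey = (0, 0, 0), (255, 255, 255), (119, 119, 119)
print(round(contrast_ratio(black, white), 2))
print(meets_aa(grey, white), meets_aa(grey, white, large_text=True))
```

Mid-grey `#777777` on white is a useful boundary case: it falls just short of 4.5:1 for normal text while still clearing 3:1 for large text.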
362
+ ### Trust Signal Checklist (Full-Harness SHOULD)
363
+
364
+ - Consistent typography hierarchy (h1 > h2 > h3 > body)
365
+ - Consistent spacing rhythm (4px/8px grid or equivalent)
366
+ - Professional color palette (not random/clashing colors)
367
+ - Loading states and error states are designed (not browser defaults)
368
+ - No broken images or missing resources in rendered output
369
+
370
+ ### Dev Server Management Protocol
371
+
372
+ When a dev server is started for evidence collection:
373
+
374
+ 1. Record the process PID and port in evidence metadata.
375
+ 2. After evidence collection, terminate the dev server explicitly.
376
+ 3. Do not leave orphaned dev server processes running.
377
+
178
378
  ## Required Process
179
379
 
180
380
  1. Read `.qfai/specs/spec-*` and determine the surface and requested mode.
181
381
  2. Build the minimum runnable slice across **ALL specs**.
182
382
  3. Produce `prototyping.md` and `prototyping.json` with a complete Coverage Matrix.
183
- 4. If UI-bearing, capture `uiFidelity`; if full-harness, capture runtime gate, render bundle, and browser QA bundle.
383
+ 4. If `web`, `mobile`, `desktop`, or `mixed` surface, capture `uiFidelity`; if full-harness, capture runtime gate, render bundle, and browser QA bundle.
184
384
  5. Review rendered output, screenshot evidence, HTML snapshots, or preview artifacts against the canonical sidecar family.
185
- 6. Record critique findings, classify each as `L1` or `L2`, and either fix or mark the result `REVISE`.
186
- 7. Use the read order `option comparison (30_option_comparison.md) -> selected anchor screen (31_selected_anchor_screen.md) -> strategy (10_implementation_strategy.md) -> taste interview (11_design_taste_interview.md) -> trend scan (04_Sources.md) -> 3-layer evaluation family (20/21/22/23 + optional 24) -> screen contracts (40_screen_contracts.md) -> review input bundle (50_review_input_bundle.md)` when the project is UI-bearing.
187
- 8. Run `qfai validate --fail-on error`.
188
- 9. Route reviewer gate and do not declare completion until the result is `PASS`.
385
+ 6. **[full-harness only]** Execute the Full-Harness Iteration Protocol:
386
+ a. Initialize calibration from `qfai.config.yaml > prototyping.calibration`.
387
+ b. Run Evaluate → Identify → Fix → Re-evaluate cycle.
388
+ c. Launch independent evaluators (product-surface-reviewer, product-experience-architect) per iteration.
389
+ d. Record each iteration in `scoringTrace`.
390
+ e. Continue until termination condition is met.
391
+ f. Record `terminationReason`, `iterationCount`, `bestIteration`.
392
+ 7. Record critique findings, classify each as `L1` or `L2`, and either fix or mark the result `REVISE`.
393
+ 8. Use the following read order when the surface is `web`, `mobile`, `desktop`, or `mixed`:
394
+ option comparison (`30_option_comparison.md`) -> selected anchor screen (`31_selected_anchor_screen.md`) ->
395
+ strategy (`10_implementation_strategy.md`) -> taste interview (`11_design_taste_interview.md`) ->
396
+ trend scan (`04_Sources.md`) -> 3-layer evaluation family (`20/21/22/23` + optional `24`) ->
397
+ screen contracts (`40_screen_contracts.md`) -> review input bundle (`50_review_input_bundle.md`).
398
+ 9. Run `qfai validate --fail-on error`.
399
+ 10. Route reviewer gate and do not declare completion until the result is `PASS`.
189
400
 
190
401
  ## Sub-agent Delegation (MANDATORY)
191
402
 
@@ -223,6 +434,24 @@ Every major artifact in this stage MUST include this table schema:
223
434
  - Test volume floors/ratios are not gates; they are signals.
224
435
  - Reviewer must verify evidence obligations for the chosen `surface / mode`.
225
436
  - Do not declare DONE until Reviewer returns `PASS`; otherwise apply `REVISE`.
437
+ - **[full-harness only]** Reviewer MUST verify:
438
+ - `iterationCount > 1` (or explicit justification for single-iteration convergence).
439
+ - `scoringTrace` contains entries equal to `iterationCount`.
440
+ - `scoringTrace` shows measurable score progression (not all identical scores).
441
+ - `terminationReason` is consistent with the scoring trajectory.
442
+ - Independent evaluators were actually invoked (not fabricated names).
443
+ - `limitations` section is present and documents known shortcomings honestly.
444
+
445
+ ### Limitations Section (Full-Harness MUST)
446
+
447
+ When `mode.effective = full-harness`, the evidence MUST include a `## Limitations` section in `prototyping.md` that documents:
448
+
449
+ - Known quality shortcomings that were not resolved by the iteration loop.
450
+ - Evaluation axes where scores did not reach `accept` threshold.
451
+ - Areas where agent judgment is insufficient (requires human review).
452
+ - Technical constraints that prevented further improvement (e.g., asset licensing, browser API limitations).
453
+
454
+ Omitting limitations or recording an empty limitations section when `iterationCount < maxIterations` is a process integrity concern.
226
455
 
227
456
  ## Completion Contract (Shared)
228
457
 
@@ -231,8 +460,9 @@ Before DONE:
231
460
  - package assets and generated evidence must match the obligation matrix
232
461
  - `qfai validate --fail-on error` must pass
233
462
  - reviewer gate must return PASS
234
- - UI-bearing runs must reconcile `uiFidelity`, render evidence, and critique outputs
235
- - non-ui runs must preserve `n/a` semantics without fake placeholders
463
+ - `web`, `mobile`, `desktop`, `mixed` surface runs must reconcile `uiFidelity`, render evidence, and critique outputs
464
+ - `cli` surface runs preserve n/a semantics for render and browser QA without fake placeholders
465
+ - `ui_bearing: false` specs are not prototyping execution targets
236
466
 
237
467
  ## FINAL CHECKLIST (Check Last)
238
468
 
@@ -65,7 +65,7 @@ agents:
65
65
  - id: frontend-engineer
66
66
  kind: worker
67
67
  domain: frontend
68
- mission: Implement frontend behavior aligned with selected direction, strategy, screen contracts, and product-surface decisions.
68
+ mission: Implement frontend behavior aligned with selected anchor, strategy, screen contracts, and product-surface decisions.
69
69
  owned_artifacts: [ui-implementation, surface-evidence]
70
70
  tool_profile: frontend
71
71
  permission_profile: authoring
@@ -119,7 +119,7 @@ routing:
119
119
  rerun_policy: changed-scope-dependents
120
120
  - id: evidence
121
121
  mandatory_agents: []
122
- conditional_agents: [devops-ci-engineer, qa-gatekeeper]
122
+ conditional_agents: [devops-ci-engineer, qa-gatekeeper, product-experience-architect]
123
123
  parallel_groups: []
124
124
  blocking_agents: [qa-gatekeeper]
125
125
  rerun_policy: changed-scope-dependents
@@ -19,9 +19,9 @@
19
19
  ## Compatibility vs Change Rubric
20
20
 
21
21
  - Criteria (Compatibility): validate.json is an internal contract (not a stable API). CLI command system follows semver.
22
- - Criteria (Change): Breaking changes deferred until v2.0. Migration guide required.
22
+ - Criteria (Change): canonical consistency, validator alignment, and shipped SSOT alignment take priority. Breaking changes are allowed when required to restore canonical consistency.
23
23
  - Examples: `_shared/` -> `_policies/` rename (v1.5.3), spec-pack -> layered migration (v1.4.17)
24
- - Evidence: CHANGELOG.md, OQ-0003 (validate.json), OQ-0004 (legacy deprecation)
24
+ - Evidence: CHANGELOG.md
25
25
 
26
26
  ## Governance (Ownership / Review / Evidence)
27
27
 
@@ -54,14 +54,11 @@
54
54
  ## Non-goals / Not-now (Optional)
55
55
 
56
56
  - IDE plugin / GUI development
57
- - Plugin architecture (to be reconsidered in v2.0)
57
+ - Plugin architecture
58
58
  - Automated test generation
59
59
  - browser QA full audit / screenshot diff / repair loop / external critique adapter (v1.7.1)
60
60
  - auto-fix / rewrite for design findings (v1.7.2)
61
- - evidence schema versioning detail (deferred to v1.7.6, OQ-0001 of discussion-20260329130000123)
62
- - browser QA output normalization shape (deferred to v1.7.6, OQ-0002 of discussion-20260329130000123)
63
- - external critique provider / full-harness orchestration / calibration pack / cost observability / long-running handoff (v1.7.5 out of scope → v1.7.6 IN scope)
64
- - Evidence: 05_Scope.md (Out of Scope), OQ-0001, OQ-0002, discussion-20260329175059391
61
+ - Evidence: 05_Scope.md (Out of Scope)
65
62
 
66
63
  ## References (Optional)
67
64
 
@@ -37,8 +37,9 @@
37
37
 
38
38
  ## Release posture
39
39
 
40
- - Compatibility policy: semver. Maintain backward compatibility of the CLI command system.
41
- - Breaking change policy: Breaking changes deferred until v2.0. Migration guide (docs/migrations/) required.
40
+ - Compatibility policy: current canonical contract only.
41
+ - Breaking changes are allowed when required to restore canonical consistency.
42
+ - CLI/skill/docs/validator must match current package semantics.
42
43
  - Evidence: CHANGELOG.md, 09_Constraints.md (DL-02)
43
44
 
44
45
  ## Milestones
@@ -64,11 +65,10 @@
64
65
  | v1.7.7 (completed) | Remediation & Prototyping Readiness — static-first prototyping default + full-harness mode exposure + 3-layer eval reconciliation + strategy/contract upgrade + UI-bearing detection fix + render evidence wiring + browser QA findings + doc normalization + migration support |
65
66
  | v1.7.8 (completed) | Canonical Convergence — design taste interview + trend research + 3-layer evaluation convergence + scoring-ready schema + strategy/screen contract upgrade + UI-bearing detection unification + static-first prototyping rewrite + full-harness mode convergence + render evidence wiring + browser QA MVP + reviewer extension + migration normalization + docs normalization |
66
67
  | v1.7.9 (completed) | Convergence Correction Release — canonical validator registration, discussion completion convergence, honest render evidence/browser QA wiring, reviewer routing alignment, docs maturity normalization |
67
- | v1.7.13 (in progress) | Canonical Sidecar Convergence — selected direction SSOT moved to 31_selected_anchor_screen.md, option comparison remains in 30_option_comparison.md, sidecar-first read order, DDS/anchor vocabulary removal, validator semantics rewrite, template-validator self-consistency |
68
+ | v1.7.13 (completed) | Canonical Sidecar Convergence — selected anchor SSOT moved to 31_selected_anchor_screen.md, option comparison remains in 30_option_comparison.md, sidecar-first read order, DDS/anchor vocabulary removal, validator semantics rewrite, template-validator self-consistency |
69
+ | v1.7.14 (in progress) | Canonical Convergence Finalization — strict classification enforcement, namespaced-only prototyping.yaml, current-only shipped SSOT, regression net hardening |
68
70
 
69
71
  ## Open questions
70
72
 
71
73
  - Blocking: none
72
- - Non-blocking:
73
- - OQ-0003: validate.json external API stability (deferred to v2.0)
74
- - OQ-0004: Legacy spec-pack deprecation schedule (deferred to v2.0)
74
+ - Non-blocking: none
@@ -16,6 +16,9 @@ profiles:
16
16
  runtime-heavy:
17
17
  always_required: [completion-reviewer, qa-gatekeeper]
18
18
  conditional_required: [implementation-reviewer]
19
+ full-harness:
20
+ always_required: [completion-reviewer, product-surface-reviewer, qa-gatekeeper]
21
+ conditional_required: []
19
22
 
20
23
  optional_modes:
21
24
  devils-advocate:
@@ -9,7 +9,7 @@ Defined in spec-0013 (CAP-0013): downstream skills (prototyping / ATDD / TD
9
9
 
10
10
  1. **Discussion-side UI/UX Sidecar Artifacts** (`discussion-*/uiux/`) — **primary source of truth**
11
11
  - `30_option_comparison.md` — option comparison (comparison artifact)
12
- - `31_selected_anchor_screen.md` — selection result + SSOT for the selected direction
12
+ - `31_selected_anchor_screen.md` — selection result + SSOT for the selected anchor
13
13
  - `10_implementation_strategy.md` — implementation strategy (strict canonical schema)
14
14
  - `11_design_taste_interview.md` — design taste interview
15
15
  - `20-24` — 3-layer evaluation family (invariant / trend-derived / product-specific / aggregate / dynamic overrides)
@@ -47,7 +47,7 @@ Defined in spec-0013 (CAP-0013): downstream skills (prototyping / ATDD / TD
47
47
 
48
48
  ## Priority and Override Semantics
49
49
 
50
- - Sidecar artifacts (selected direction / strategy / contracts) are the **primary truth**
50
+ - Sidecar artifacts (selected anchor / strategy / contracts) are the **primary truth**
51
51
  - UI Contracts and Design Tokens are **supporting input, read only when present** (not primary truth)
52
52
  - The optional fallback mock is a **fallback** with still lower priority
53
53
  - If Design Token values conflict with the HTML Mock fallback values, a warning is issued
@@ -35,11 +35,11 @@ The contract must describe both screen structure and minimum mockable behavior.
35
35
 
36
36
  ### `data-qfai` marker convention
37
37
 
38
- - Recommended marker value: `CONTRACT_ID:ELEMENT_ID` (example: `data-qfai="CON-UI-0001:search_input"`).
38
+ - Canonical marker value: `CONTRACT_ID:ELEMENT_ID` (example: `data-qfai="CON-UI-0001:search_input"`).
39
39
  - Use `elements[].id` (stable ID) for the marker suffix, not `elements[].label`.
40
40
  - Even when label text is not visible in the UI, markers ensure fidelity coverage.
41
41
  - autogen generates expected markers from `elements[].id` automatically.
42
- - Legacy format (`CONTRACT_ID:ELEMENT_LABEL`) is accepted for backward compatibility but new implementations should use the id-based format.
42
+ - The id-based format (`CONTRACT_ID:ELEMENT_ID`) is the only canonical marker format.
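The marker convention above can be checked mechanically. The following is a minimal sketch in Python, not qfai's own implementation. It assumes a contract has already been parsed into a dict with an `id` and an `elements` list; that dict shape and the function names `expected_markers` and `missing_markers` are hypothetical and chosen only for illustration.

```python
import re


def expected_markers(contract):
    """Build CONTRACT_ID:ELEMENT_ID markers from elements[].id (never from the label)."""
    return {f'{contract["id"]}:{element["id"]}' for element in contract["elements"]}


def missing_markers(contract, html):
    """Return the expected markers that do not appear as data-qfai values in html."""
    # Simplification: only double-quoted attribute values are recognized.
    found = set(re.findall(r'data-qfai="([^"]+)"', html))
    return sorted(expected_markers(contract) - found)


# Hypothetical contract and markup, for illustration only.
contract = {
    "id": "CON-UI-0001",
    "elements": [
        {"id": "search_input", "label": "Search"},
        {"id": "submit_button", "label": "Go"},
    ],
}
html = '<input data-qfai="CON-UI-0001:search_input"><button data-qfai="CON-UI-0001:Go">'
print(missing_markers(contract, html))
```

In this sketch a label-based marker such as `CON-UI-0001:Go` does not satisfy the expectation for `submit_button`, which mirrors the rule that only the id-based format is canonical.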
43
43
 
44
44
  ## Mockable prototype minimum (L2)
45
45
 
@@ -2,7 +2,7 @@
2
2
 
3
3
  ## Purpose
4
4
 
5
- `discussion/` stores the unified discussion pack that merges interview outputs (discuss) and requirement intake (require). Discussion packs use 15 required markdown files plus required prototyping.yaml.
5
+ `discussion/` stores the unified discussion pack that merges interview outputs (discuss) and requirement intake (require). Discussion packs use 15 required markdown files. When the latest pack is `ui_bearing: true`, it must also include `prototyping.yaml`; when `ui_bearing: false`, `prototyping.yaml` is not required.
6
6
 
7
7
  This directory does not directly update `specs/`; it prepares decisions, requirements, open questions, and rationale as inputs for `/qfai-sdd`.
8
8
 
@@ -29,7 +29,7 @@ discussion/
29
29
  ├── 13_Deferred.md
30
30
  ├── 14_Review-Request.md
31
31
  ├── 99_delta.md
32
- └── prototyping.yaml
32
+ └── prototyping.yaml # required only when ui_bearing: true
33
33
  ```
34
34
 
35
35
  ## File responsibilities
@@ -103,11 +103,11 @@ discussion/
103
103
  - Use timestamp directory naming for new outputs: `discussion-YYYYMMDDhhmmssSSS`.
104
104
  - `14_Review-Request.md` must reference routing SSOT: `.qfai/assistant/steering/agent-routing.yml` and `.qfai/assistant/steering/review-profiles.yml`.
105
105
 
106
- ## prototyping.yaml (Required Recommendation Artifact)
106
+ ## prototyping.yaml (Classification-aware Recommendation Artifact)
107
107
 
108
- Each discussion pack **must** include a `prototyping.yaml` file that recommends the prototyping mode for the project. This is a required side artifact of the 15-file discussion pack plus required prototyping.yaml completion contract.
108
+ Each UI-bearing discussion pack (`ui_bearing: true`) **must** include a `prototyping.yaml` file that recommends the prototyping mode for the project. Non-UI discussion packs (`ui_bearing: false`) do not require `prototyping.yaml`.
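The classification rule above reduces to a single conditional. The following is a minimal sketch in Python, assuming the pack's `ui_bearing` flag and its file names have already been read. The function name `check_prototyping_presence` and the error text are hypothetical, not part of qfai.

```python
def check_prototyping_presence(ui_bearing, file_names):
    """Return an error string, or None when the pack satisfies the rule."""
    # UI-bearing packs must ship prototyping.yaml; non-UI packs do not need it.
    if ui_bearing and "prototyping.yaml" not in file_names:
        return "ui_bearing: true pack is missing prototyping.yaml"
    return None


# Hypothetical packs, for illustration only.
print(check_prototyping_presence(True, {"01_Context.md", "prototyping.yaml"}))
print(check_prototyping_presence(True, {"01_Context.md"}))
print(check_prototyping_presence(False, {"01_Context.md"}))
```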
109
109
 
110
- ### Canonical namespaced schema (recommended)
110
+ ### Canonical namespaced schema (required)
111
111
 
112
112
  ```yaml
113
113
  prototyping:
@@ -117,24 +117,9 @@ prototyping:
117
117
  - low-cost
118
118
  - standard
119
119
  - full-harness
120
- surface: web-ui
120
+ surface: web
121
121
  ```
122
122
 
123
- ### Legacy top-level schema (deprecated — read-only backward compatibility)
124
-
125
- The following top-level form is accepted by the parser for backward compatibility but produces a deprecation warning (`QFAI-PROT-231`). New artifacts MUST NOT emit this form; use the namespaced canonical schema above.
126
-
127
- ```yaml
128
- recommended_mode: standard
129
- rationale: ...
130
- allowed_modes:
131
- - low-cost
132
- - standard
133
- surface: web-ui
134
- ```
135
-
136
- If both forms are present in the same file, the namespaced form takes precedence and a conflict warning (`QFAI-PROT-232`) is emitted.
137
-
138
123
  ### Field reference
139
124
 
140
125
  All 4 fields are **required**. An artifact missing any field will fail validation.
@@ -144,7 +129,14 @@ All 4 fields are **required**. An artifact missing any field will fail validatio
144
129
  | `recommended_mode` | yes | `low-cost`, `standard`, or `full-harness` |
145
130
  | `rationale` | yes | Non-empty string explaining the recommendation |
146
131
  | `allowed_modes` | yes | Unique array of valid modes; must include `recommended_mode` |
147
- | `surface` | yes | `web-ui`, `mobile-ui`, `desktop-ui`, `mixed`, or `non-ui` |
132
+ | `surface` | yes | `web`, `mobile`, `desktop`, `cli`, or `mixed` |
133
+
134
+ ### Validation rules
135
+
136
+ - Only the canonical namespaced schema under the `prototyping:` key is accepted. Top-level recommendation keys (`recommended_mode`, `rationale`, `allowed_modes`, `surface` at root level) are not supported and will cause validation failure.
137
+ - Coexistence of top-level recommendation keys with the namespaced `prototyping:` block is invalid.
138
+ - `recommended_mode` must be included in `allowed_modes`. An artifact where `recommended_mode` is not in `allowed_modes` is invalid.
139
+ - An artifact that does not conform to the canonical namespaced schema is invalid and will be rejected by both validation and execution/CLI. No fallback to explicit mode or default mode is performed for invalid artifacts.
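The field reference and validation rules above can be read as one strict check with no fallback. The following is a minimal sketch in Python, assuming the YAML file has already been parsed into a plain dict. The function name `validate_prototyping` and the error messages are hypothetical; this is not qfai's validator and it does not reproduce its diagnostic codes.

```python
VALID_MODES = ("low-cost", "standard", "full-harness")
VALID_SURFACES = ("web", "mobile", "desktop", "cli", "mixed")
FIELDS = ("recommended_mode", "rationale", "allowed_modes", "surface")


def validate_prototyping(doc):
    """Return a list of error strings; an empty list means the artifact passes."""
    if not isinstance(doc, dict):
        return ["document root must be a mapping"]
    errors = []
    # Top-level recommendation keys are rejected, even next to a namespaced block.
    stray = [key for key in FIELDS if key in doc]
    if stray:
        errors.append(f"top-level keys are not supported: {stray}")
    block = doc.get("prototyping")
    if not isinstance(block, dict):
        errors.append("missing namespaced 'prototyping' mapping")
        return errors
    missing = [key for key in FIELDS if key not in block]
    if missing:
        errors.append(f"missing required fields: {missing}")
    mode = block.get("recommended_mode")
    if "recommended_mode" in block and mode not in VALID_MODES:
        errors.append("recommended_mode must be low-cost, standard, or full-harness")
    rationale = block.get("rationale")
    if "rationale" in block and not (isinstance(rationale, str) and rationale.strip()):
        errors.append("rationale must be a non-empty string")
    allowed = block.get("allowed_modes")
    if "allowed_modes" in block:
        if not isinstance(allowed, list) or not all(m in VALID_MODES for m in allowed):
            errors.append("allowed_modes must be a list of valid modes")
        else:
            if len(set(allowed)) != len(allowed):
                errors.append("allowed_modes must not contain duplicates")
            if "recommended_mode" in block and mode not in allowed:
                errors.append("allowed_modes must include recommended_mode")
    if "surface" in block and block["surface"] not in VALID_SURFACES:
        errors.append("surface must be web, mobile, desktop, cli, or mixed")
    return errors


# Hypothetical artifact, for illustration only.
good = {
    "prototyping": {
        "recommended_mode": "standard",
        "rationale": "Static-first review is sufficient for this pack.",
        "allowed_modes": ["low-cost", "standard", "full-harness"],
        "surface": "web",
    }
}
print(validate_prototyping(good))
```

Returning every error at once, rather than stopping at the first, is a design choice of this sketch; what matters for the rules above is that any non-empty result means the artifact is rejected rather than silently replaced by a default mode.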
148
140
 
149
141
  ## Suggested naming
150
142