qfai 1.7.13 → 1.7.14
- package/README.md +7 -5
- package/assets/init/.qfai/assistant/agents/frontend-engineer.md +2 -2
- package/assets/init/.qfai/assistant/agents/product-experience-architect.md +2 -2
- package/assets/init/.qfai/assistant/agents/product-surface-reviewer.md +1 -1
- package/assets/init/.qfai/assistant/skills/qfai-discussion/SKILL.md +44 -18
- package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/01_Context.md +9 -0
- package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/03_Story-Workshop.md +1 -1
- package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/04_Sources.md +6 -6
- package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/14_Review-Request.md +7 -7
- package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/review/Rxx_reviewer.md +2 -2
- package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/review/review_request.md +2 -2
- package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/uiux/10_implementation_strategy.md +31 -13
- package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/uiux/20_trend_scan.md +41 -0
- package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/uiux/23_design_eval_aggregate.md +12 -0
- package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/uiux/40_screen_contracts.md +1 -1
- package/assets/init/.qfai/assistant/skills/qfai-discussion/templates/uiux/50_review_input_bundle.md +2 -0
- package/assets/init/.qfai/assistant/skills/qfai-implement/SKILL.md +6 -2
- package/assets/init/.qfai/assistant/skills/qfai-prototyping/SKILL.md +264 -34
- package/assets/init/.qfai/assistant/steering/agent-catalog.yml +1 -1
- package/assets/init/.qfai/assistant/steering/agent-routing.yml +1 -1
- package/assets/init/.qfai/assistant/steering/manifest.md +4 -7
- package/assets/init/.qfai/assistant/steering/product.md +6 -6
- package/assets/init/.qfai/assistant/steering/review-profiles.yml +3 -0
- package/assets/init/.qfai/assistant/steering/ui-definition-protocol.md +2 -2
- package/assets/init/.qfai/contracts/ui/README.md +2 -2
- package/assets/init/.qfai/discussion/README.md +14 -22
- package/assets/init/.qfai/evidence/README.md +21 -12
- package/assets/uix-rev/comparison-review.md +3 -15
- package/assets/uix-rev/contracts-review.md +5 -2
- package/assets/uix-rev/scoring-review.md +10 -2
- package/assets/uix-rev/strategy-review.md +11 -7
- package/dist/cli/index.cjs +1993 -1279
- package/dist/cli/index.cjs.map +1 -1
- package/dist/cli/index.mjs +1930 -1216
- package/dist/cli/index.mjs.map +1 -1
- package/dist/index.cjs +1989 -1269
- package/dist/index.cjs.map +1 -1
- package/dist/index.d.cts +75 -62
- package/dist/index.d.ts +75 -62
- package/dist/index.mjs +1926 -1207
- package/dist/index.mjs.map +1 -1
- package/package.json +1 -1
- package/assets/uix-rev/migration-review.md +0 -17
|
@@ -16,7 +16,7 @@ roles:
|
|
|
16
16
|
product-surface-reviewer,
|
|
17
17
|
qa-gatekeeper,
|
|
18
18
|
]
|
|
19
|
-
routing-profile: ui-
|
|
19
|
+
routing-profile: ui-surface-aware
|
|
20
20
|
mode: execution-focused
|
|
21
21
|
---
|
|
22
22
|
|
|
@@ -38,14 +38,25 @@ This skill is **static-first**. File-based checks and evidence are the default.
|
|
|
38
38
|
- If a required API endpoint still returns `404`, the run is incomplete.
|
|
39
39
|
- `L1` and `L2` critique findings must be reflected in the evidence pack or justified as `REVISE`.
|
|
40
40
|
- `uiFidelity` is the canonical UI evidence block for UI-bearing surfaces.
|
|
41
|
-
-
|
|
41
|
+
- `ui_bearing: false` specs are not prototyping execution targets. UI-only placeholders are not required for such specs.
|
|
42
42
|
- Review rendered output, screenshot evidence, HTML snapshots, or preview artifacts before closing any UI-affecting run.
|
|
43
|
-
- Read the canonical sidecar family first: option comparison / `30_option_comparison.md` -> selected anchor screen / `31_selected_anchor_screen.md` ->
|
|
43
|
+
- Read the canonical sidecar family first: option comparison / `30_option_comparison.md` -> selected anchor screen / `31_selected_anchor_screen.md` ->
|
|
44
|
+
strategy / `10_implementation_strategy.md` -> taste interview / `11_design_taste_interview.md` ->
|
|
45
|
+
trend scan / `04_Sources.md` -> 3-layer evaluation family (`20/21/22/23` + optional `24`) ->
|
|
46
|
+
screen contracts / `40_screen_contracts.md` -> review input bundle / `50_review_input_bundle.md`.
|
|
44
47
|
|
|
45
48
|
## Goal
|
|
46
49
|
|
|
47
50
|
Build the minimum runnable vertical slice for **ALL specs** and produce canonical prototyping evidence under `.qfai/evidence/`.
|
|
48
51
|
|
|
52
|
+
### Mode-specific Goals
|
|
53
|
+
|
|
54
|
+
| Mode | Goal |
|
|
55
|
+
| ------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
56
|
+
| low-cost | Static structure proof. Skeleton + evidence files only. |
|
|
57
|
+
| standard | Customer-presentable vertical slice. UI fidelity + static evidence. |
|
|
58
|
+
| full-harness | **Iterative design-improvement loop.** Evaluate → Identify → Fix → Re-evaluate until convergence, plateau, or max-iterations. Each iteration produces measurable quality delta. |
|
|
59
|
+
|
|
49
60
|
## Non-goals
|
|
50
61
|
|
|
51
62
|
- Acceptance test automation (`/qfai-atdd`)
|
|
@@ -70,10 +81,12 @@ Record in `prototyping.json`:
|
|
|
70
81
|
|
|
71
82
|
## Surface Semantics
|
|
72
83
|
|
|
73
|
-
|
|
74
|
-
|
|
75
|
-
-
|
|
76
|
-
- For
|
|
84
|
+
Canonical prototyping surfaces are: `web`, `mobile`, `desktop`, `cli`, `mixed`.
|
|
85
|
+
|
|
86
|
+
- `ui_bearing: false` specs are **not** prototyping execution targets. Prototyping execution is only invoked for `ui_bearing: true` or `mixed` classifications.
|
|
87
|
+
- For `cli` surface: render screenshot evidence is not required; browser QA is not required. Only output / interaction / structured evidence is expected.
|
|
88
|
+
- For `web`, `mobile`, `desktop` surfaces: route/contract fidelity must be captured when `uiFidelity` is required by mode.
|
|
89
|
+
- `mixed` surface inherits the union of obligations from the constituent surfaces.
|
|
77
90
|
|
|
78
91
|
## Prototyping Modes
|
|
79
92
|
|
|
@@ -81,42 +94,57 @@ Record in `prototyping.json`:
|
|
|
81
94
|
|
|
82
95
|
- Static checks only.
|
|
83
96
|
- Suitable for early skeleton work.
|
|
84
|
-
-
|
|
97
|
+
- `web`, `mobile`, `desktop`, `mixed` surfaces may include `uiFidelity` and render/browser artifacts, but they are optional.
|
|
98
|
+
- `cli` surface does not require `uiFidelity`, render evidence, or browser QA.
|
|
85
99
|
- `skeleton` mode is allowed for lightweight UI proof.
|
|
86
100
|
|
|
87
101
|
### Standard
|
|
88
102
|
|
|
89
103
|
- Static checks plus optional light validation.
|
|
90
104
|
- This is the default mode.
|
|
91
|
-
-
|
|
105
|
+
- `web`, `mobile`, `desktop`, `mixed` surfaces require `uiFidelity`.
|
|
106
|
+
- `cli` surface does not require `uiFidelity`, render evidence, or browser QA.
|
|
92
107
|
- Runtime gate, render bundle, and browser QA bundle are optional.
|
|
93
108
|
|
|
94
109
|
### Full-harness
|
|
95
110
|
|
|
96
111
|
- Explicit opt-in only. Never auto-activate.
|
|
97
112
|
- Adds runtime-heavy obligations and full-harness audit metadata.
|
|
98
|
-
-
|
|
99
|
-
-
|
|
113
|
+
- `web`, `mobile`, `desktop`, `mixed` surfaces require runtime gate, render bundle, browser QA bundle, and `fullHarness`.
|
|
114
|
+
- `cli` surface requires `fullHarness` but not `uiFidelity`, render evidence, or browser QA.
|
|
115
|
+
- `ui_bearing: false` specs are not prototyping execution targets.
|
|
116
|
+
- Full-harness is an **iterative design-improvement loop**, not a single evidence-generation pass. See `## Full-Harness Iteration Protocol` below.
|
|
117
|
+
- The discussion 3-layer evaluation score measures **design direction quality** and MUST NOT be copied into `fullHarness.scoringTrace`.
|
|
118
|
+
Prototyping scores measure **implementation fidelity** against the selected anchor.
|
|
100
119
|
|
|
101
120
|
## Obligation Matrix
|
|
102
121
|
|
|
103
122
|
### surface / mode
|
|
104
123
|
|
|
105
|
-
| surface / mode
|
|
106
|
-
|
|
|
107
|
-
|
|
|
108
|
-
|
|
|
109
|
-
|
|
|
110
|
-
|
|
|
111
|
-
|
|
|
112
|
-
|
|
|
124
|
+
| surface / mode | specs | runtimeGate | uiFidelity | render evidence | browser QA | fullHarness |
|
|
125
|
+
| ---------------------- | -------- | ----------- | --------------------------------- | ------------------------------------ | ------------ | ------------ |
|
|
126
|
+
| web / low-cost | required | optional | optional (`skeleton` allowed) | optional (`captured/skipped/failed`) | optional | absent |
|
|
127
|
+
| web / standard | required | optional | **required** (`interactive` only) | optional (`captured/skipped/failed`) | optional | absent |
|
|
128
|
+
| web / full-harness | required | required | **required** (`interactive` only) | **required** | **required** | **required** |
|
|
129
|
+
| mobile / low-cost | required | optional | optional (`skeleton` allowed) | optional (`captured/skipped/failed`) | optional | absent |
|
|
130
|
+
| mobile / standard | required | optional | **required** (`interactive` only) | optional (`captured/skipped/failed`) | optional | absent |
|
|
131
|
+
| mobile / full-harness | required | required | **required** (`interactive` only) | **required** | **required** | **required** |
|
|
132
|
+
| desktop / low-cost | required | optional | optional (`skeleton` allowed) | optional (`captured/skipped/failed`) | optional | absent |
|
|
133
|
+
| desktop / standard | required | optional | **required** (`interactive` only) | optional (`captured/skipped/failed`) | optional | absent |
|
|
134
|
+
| desktop / full-harness | required | required | **required** (`interactive` only) | **required** | **required** | **required** |
|
|
135
|
+
| cli / low-cost | required | optional | n/a | n/a | n/a | absent |
|
|
136
|
+
| cli / standard | required | optional | n/a | n/a | n/a | absent |
|
|
137
|
+
| cli / full-harness | required | optional | n/a | n/a | n/a | **required** |
|
|
138
|
+
| mixed / low-cost | required | optional | optional (`skeleton` allowed) | optional (`captured/skipped/failed`) | optional | absent |
|
|
139
|
+
| mixed / standard | required | optional | **required** (`interactive` only) | optional (`captured/skipped/failed`) | optional | absent |
|
|
140
|
+
| mixed / full-harness | required | required | **required** (`interactive` only) | **required** | **required** | **required** |
|
|
113
141
|
|
|
114
142
|
`uiFidelity.mode` policy:
|
|
115
143
|
|
|
116
144
|
- `low-cost`: `skeleton` or `interactive`
|
|
117
145
|
- `standard`: `interactive` only — `skeleton` is rejected by the validator
|
|
118
146
|
- `full-harness`: `interactive` only — `skeleton` is rejected; render evidence, Browser QA, runtimeGate, and fullHarness block are all required
|
|
119
|
-
- `
|
|
147
|
+
- `cli`: `uiFidelity` is not emitted; render and browser QA are not required
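As a reading aid, the obligation matrix and the `uiFidelity.mode` policy above can be condensed into a small lookup. This is a minimal Python sketch, not the package's validator; `Obligations`, `obligations_for`, and `allowed_ui_fidelity_modes` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Set

UI_SURFACES = {"web", "mobile", "desktop", "mixed"}
MODES = {"low-cost", "standard", "full-harness"}


@dataclass(frozen=True)
class Obligations:
    """One row of the surface / mode matrix (hypothetical container)."""
    specs: str
    runtime_gate: str
    ui_fidelity: str
    render_evidence: str
    browser_qa: str
    full_harness: str


def obligations_for(surface: str, mode: str) -> Obligations:
    """Return the matrix row for a surface / mode pair."""
    if surface not in UI_SURFACES | {"cli"}:
        raise ValueError(f"unknown surface: {surface}")
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    full = mode == "full-harness"
    harness = "required" if full else "absent"
    if surface == "cli":
        # cli rows: UI-specific evidence is n/a in every mode.
        return Obligations("required", "optional", "n/a", "n/a", "n/a", harness)
    ui_fidelity = "optional" if mode == "low-cost" else "required"
    runtime = "required" if full else "optional"
    return Obligations("required", runtime, ui_fidelity, runtime, runtime, harness)


def allowed_ui_fidelity_modes(mode: str) -> Set[str]:
    """`skeleton` is accepted only in low-cost mode."""
    return {"skeleton", "interactive"} if mode == "low-cost" else {"interactive"}
```

For example, `obligations_for("cli", "full-harness")` reports `fullHarness` as required while the three UI-specific columns stay `n/a`.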
|
|
120
148
|
|
|
121
149
|
Interpretation:
|
|
122
150
|
|
|
@@ -132,27 +160,33 @@ Interpretation:
|
|
|
132
160
|
- `.qfai/evidence/prototyping.json`
|
|
133
161
|
- `.qfai/evidence/render.json` when render evidence is emitted or required by mode
|
|
134
162
|
- `.qfai/evidence/browser-qa.json` when browser QA evidence is emitted or required by mode
|
|
163
|
+
- `.qfai/evidence/browserQa.summary.json` when browser QA evidence is emitted or required by mode
|
|
164
|
+
- `.qfai/evidence/browserQa.findings.json` when browser QA evidence is emitted or required by mode
|
|
165
|
+
- `.qfai/evidence/browserQa.repairs.json` when browser QA evidence is emitted or required by mode
|
|
166
|
+
- `.qfai/evidence/fullHarness.exit.json` when `mode.effective = full-harness`
|
|
167
|
+
- `.qfai/evidence/fullHarness.handoff.json` when `mode.effective = full-harness`
|
|
168
|
+
- `.qfai/evidence/fullHarness.fakeUiDetection.json` when `mode.effective = full-harness`
|
|
135
169
|
- `Coverage Matrix` covering all specs
|
|
136
170
|
- critique summary with `L1` / `L2` findings and disposition
|
|
137
171
|
|
|
138
172
|
### low-cost obligations
|
|
139
173
|
|
|
140
174
|
- always: `specs[]`, `meta.generatedAt`, `meta.toolVersion`, `meta.commands[]`, `mode.*`
|
|
141
|
-
-
|
|
142
|
-
-
|
|
175
|
+
- `web`, `mobile`, `desktop`, `mixed`: `uiFidelity` optional, render/browser optional
|
|
176
|
+
- `cli`: UI-specific evidence is n/a
|
|
143
177
|
|
|
144
178
|
### standard obligations
|
|
145
179
|
|
|
146
180
|
- always: `specs[]`, `meta.*`, `mode.*`
|
|
147
|
-
-
|
|
148
|
-
-
|
|
181
|
+
- `web`, `mobile`, `desktop`, `mixed`: `uiFidelity` required
|
|
182
|
+
- `cli`: UI-specific evidence is n/a
|
|
149
183
|
- runtime gate and browser QA remain optional
|
|
150
184
|
|
|
151
185
|
### full-harness obligations
|
|
152
186
|
|
|
153
187
|
- always: `specs[]`, `meta.*`, `mode.*`, `fullHarness`
|
|
154
|
-
-
|
|
155
|
-
-
|
|
188
|
+
- `web`, `mobile`, `desktop`, `mixed`: `runtimeGate`, `.qfai/evidence/render.json`, Browser QA bundle trio, `uiFidelity`
|
|
189
|
+
- `cli`: UI-specific evidence remains n/a
|
|
156
190
|
|
|
157
191
|
## Full-harness minimum completeness
|
|
158
192
|
|
|
@@ -161,31 +195,208 @@ When `mode.effective = full-harness`, record:
|
|
|
161
195
|
- `fullHarness.enabled = true`
|
|
162
196
|
- `fullHarness.available`
|
|
163
197
|
- `fullHarness.runId`
|
|
164
|
-
- `fullHarness.iterationCount >= 1`
|
|
198
|
+
- `fullHarness.iterationCount >= 1` (validator warns if `== 1` with `terminationReason: converged` — see QFAI-PROT-290)
|
|
199
|
+
- `fullHarness.scoringTrace` entries MUST equal `iterationCount` (validator warns on mismatch — see QFAI-PROT-291)
|
|
200
|
+
- `fullHarness.scoringTrace` SHOULD show measurable progression (non-monotonic traces are flagged as info — see QFAI-PROT-294)
|
|
165
201
|
- `fullHarness.bestIteration >= 1`
|
|
166
202
|
- `fullHarness.terminationReason`
|
|
167
203
|
- `fullHarness.reviewerSignoff`
|
|
168
204
|
- `fullHarness.scoringTrace`
|
|
205
|
+
- `fullHarness.exit`
|
|
206
|
+
- `fullHarness.handoff`
|
|
207
|
+
- `fullHarness.fakeUiDetection`
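The three validator notes above (QFAI-PROT-290, 291, 294) can be illustrated with a short check over a `fullHarness` block. This is a sketch under assumptions: the real validator lives in the package's `dist/` bundle and is not shown in this diff, `check_full_harness` is a hypothetical name, and "non-monotonic" is read here as any drop in `weightedTotal` between consecutive entries.

```python
from typing import Dict, List, Tuple

Finding = Tuple[str, str]  # (severity, code)


def check_full_harness(block: Dict) -> List[Finding]:
    """Return findings for the three consistency rules listed above."""
    findings: List[Finding] = []
    count = block.get("iterationCount", 0)
    trace = block.get("scoringTrace", [])

    # QFAI-PROT-290: converging after a single iteration is suspicious.
    if count == 1 and block.get("terminationReason") == "converged":
        findings.append(("warning", "QFAI-PROT-290"))

    # QFAI-PROT-291: one scoringTrace entry per iteration.
    if len(trace) != count:
        findings.append(("warning", "QFAI-PROT-291"))

    # QFAI-PROT-294: flag traces whose total ever decreases (assumed reading).
    totals = [entry["weightedTotal"] for entry in trace]
    if any(later < earlier for earlier, later in zip(totals, totals[1:])):
        findings.append(("info", "QFAI-PROT-294"))

    return findings
```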
|
|
169
208
|
|
|
170
209
|
## Canonical Bundles
|
|
171
210
|
|
|
172
211
|
- render bundle: `.qfai/evidence/render.json`
|
|
173
212
|
- browser QA bundle: `.qfai/evidence/browser-qa.json`
|
|
213
|
+
- browser QA summary: `.qfai/evidence/browserQa.summary.json`
|
|
214
|
+
- browser QA findings: `.qfai/evidence/browserQa.findings.json`
|
|
215
|
+
- browser QA repairs: `.qfai/evidence/browserQa.repairs.json`
|
|
216
|
+
- full-harness exit: `.qfai/evidence/fullHarness.exit.json`
|
|
217
|
+
- full-harness handoff: `.qfai/evidence/fullHarness.handoff.json`
|
|
218
|
+
- full-harness fake-UI detection: `.qfai/evidence/fullHarness.fakeUiDetection.json`
|
|
174
219
|
|
|
175
220
|
Render bundle uses `captured | skipped | failed`.
|
|
176
221
|
Browser QA bundle uses `completed | skipped | failed`.
|
|
177
222
|
|
|
223
|
+
## Full-Harness Iteration Protocol
|
|
224
|
+
|
|
225
|
+
Full-harness mode executes a **multi-iteration improvement loop**. A single-pass evidence dump is not full-harness.
|
|
226
|
+
|
|
227
|
+
### Iteration Cycle Definition
|
|
228
|
+
|
|
229
|
+
Each iteration consists of exactly 4 steps:
|
|
230
|
+
|
|
231
|
+
1. **Evaluate**: Score the current implementation against the evaluation axes defined in `uiux/20-23` (3-layer evaluation family). Use the calibration baselines from `qfai.config.yaml > prototyping.calibration`.
|
|
232
|
+
2. **Identify**: List concrete deficiencies with L1/L2 classification. Each finding MUST reference a specific evaluation axis and criterion.
|
|
233
|
+
3. **Fix**: Apply targeted improvements to the identified deficiencies. Record what was changed and why.
|
|
234
|
+
4. **Re-evaluate**: Re-score using the same evaluation axes. Record the delta per axis.
|
|
235
|
+
|
|
236
|
+
### Calibration Configuration Reference
|
|
237
|
+
|
|
238
|
+
The iteration loop MUST read calibration parameters from `qfai.config.yaml`:
|
|
239
|
+
|
|
240
|
+
```yaml
|
|
241
|
+
prototyping:
|
|
242
|
+
calibration:
|
|
243
|
+
packPath: ".qfai/evidence/calibration.yaml" # evaluation criteria source
|
|
244
|
+
thresholds:
|
|
245
|
+
accept: 0.8 # weighted total >= accept → converged
|
|
246
|
+
refine: 0.5 # weighted total >= refine → continue improving
|
|
247
|
+
maxIterations: 15 # hard ceiling on iteration count
|
|
248
|
+
plateauDelta: 0.02 # delta < this for N consecutive iterations → plateau
|
|
249
|
+
plateauLookback: 3 # N for plateau detection window
|
|
250
|
+
```
|
|
251
|
+
|
|
252
|
+
Runtime constants (harness types): `MIN_ITERATIONS = 5`, `MAX_ITERATIONS = 15`.
|
|
253
|
+
|
|
254
|
+
### Termination Conditions
|
|
255
|
+
|
|
256
|
+
The loop terminates when **any** of these conditions is met:
|
|
257
|
+
|
|
258
|
+
| Condition | `terminationReason` | Description |
|
|
259
|
+
| ---------------------- | ------------------- | --------------------------------------------------------------------------- |
|
|
260
|
+
| Accept threshold met | `converged` | `weightedTotal >= thresholds.accept` AND `iterationCount >= MIN_ITERATIONS` |
|
|
261
|
+
| Max iterations reached | `max-iterations` | `iterationCount >= maxIterations` |
|
|
262
|
+
| Score plateau detected | `plateau` | Score delta < `plateauDelta` for `plateauLookback` consecutive iterations |
|
|
263
|
+
| User manual stop | `manual-stop` | User explicitly requests termination |
|
|
264
|
+
|
|
265
|
+
**IMPORTANT**: `converged` with `iterationCount == 1` is a contradiction and will trigger a validator warning (QFAI-PROT-290).
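The termination table above can be sketched as a single decision function. This is an illustrative Python sketch only: `termination_reason` is a hypothetical name, the defaults mirror the calibration example shown earlier, and the order in which conditions are checked is an assumption, since the text only says the loop stops when any condition holds.

```python
from typing import Optional, Sequence

MIN_ITERATIONS = 5  # runtime constant quoted in the text above


def termination_reason(
    totals: Sequence[float],
    *,
    accept: float = 0.8,
    max_iterations: int = 15,
    plateau_delta: float = 0.02,
    plateau_lookback: int = 3,
    manual_stop: bool = False,
) -> Optional[str]:
    """Return a terminationReason, or None if the loop should continue.

    `totals` holds one weightedTotal per completed iteration.
    """
    count = len(totals)
    if manual_stop:
        return "manual-stop"
    if count == 0:
        return None
    if totals[-1] >= accept and count >= MIN_ITERATIONS:
        return "converged"
    if count >= max_iterations:
        return "max-iterations"
    # Plateau: the last `plateau_lookback` deltas are all below plateau_delta.
    # The comparison follows the table literally, so regressions count too.
    deltas = [later - earlier for earlier, later in zip(totals, totals[1:])]
    recent = deltas[-plateau_lookback:]
    if len(recent) == plateau_lookback and all(d < plateau_delta for d in recent):
        return "plateau"
    return None
```

With this reading, a first iteration that already scores above `accept` does not terminate the loop, which matches the note that `converged` with `iterationCount == 1` is a contradiction.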
|
|
266
|
+
|
|
267
|
+
### Independent Evaluator Panel (MUST)
|
|
268
|
+
|
|
269
|
+
To prevent self-evaluation bias, the evaluator MUST be independent from the generator:
|
|
270
|
+
|
|
271
|
+
| Layer | Agent | Input Scope | Role |
|
|
272
|
+
| ---------------------- | ------------------------------ | ----------------------------------------------------------- | ------------------------------------------------------- |
|
|
273
|
+
| L1: Design Quality | `product-surface-reviewer` | Screenshot/HTML snapshot + evaluation axis definitions ONLY | UI/UX/visual coherence scoring |
|
|
274
|
+
| L2: Product Experience | `product-experience-architect` | Same as L1 + screen contracts + selected anchor | User journey / IA / transition coherence |
|
|
275
|
+
| L3: Process Audit | `qa-gatekeeper` | `fullHarness` evidence block ONLY | iterationCount/scoringTrace/terminationReason integrity |
|
|
276
|
+
|
|
277
|
+
**Operational Rules:**
|
|
278
|
+
|
|
279
|
+
- L1 and L2 MUST be launched via `task` tool in `background` mode with a separate context. They MUST NOT receive improvement history, previous scores, or generator plans.
|
|
280
|
+
- L3 operates on the final evidence file and does not need a separate context.
|
|
281
|
+
- The iteration's `weightedTotal` is the **minimum** of L1 and L2 scores. If either returns below `thresholds.refine`, the iteration decision is `pivot`.
|
|
282
|
+
- Fabricated reviewer names (e.g., `"completion-reviewer"` without actual agent invocation) are a process integrity violation.
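The scoring rule in the operational rules above (take the minimum of the L1 and L2 scores, and pivot when either falls below `thresholds.refine`) can be sketched as follows. `iteration_decision` is a hypothetical helper, and the `"accept"` label for scores at or above the accept threshold is an assumption; only `refine` and `pivot` appear as decision values in the source.

```python
from typing import Tuple


def iteration_decision(
    l1_score: float,
    l2_score: float,
    accept: float = 0.8,
    refine: float = 0.5,
) -> Tuple[float, str]:
    """Combine independent L1 / L2 scores into (weightedTotal, decision)."""
    # The weaker evaluator result determines the iteration's total.
    weighted_total = min(l1_score, l2_score)
    if weighted_total < refine:
        # Equivalent to "either evaluator returned below refine".
        decision = "pivot"
    elif weighted_total >= accept:
        decision = "accept"  # hypothetical label, not confirmed by the source
    else:
        decision = "refine"
    return weighted_total, decision
```

Taking the minimum rather than the mean means a strong visual score cannot mask a weak user-journey score, and vice versa.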
|
|
283
|
+
|
|
284
|
+
### scoringTrace Recording
|
|
285
|
+
|
|
286
|
+
Each iteration MUST produce a `scoringTrace` entry:
|
|
287
|
+
|
|
288
|
+
```json
|
|
289
|
+
{
|
|
290
|
+
"iteration": 3,
|
|
291
|
+
"weightedTotal": 0.72,
|
|
292
|
+
"decision": "refine",
|
|
293
|
+
"evaluators": ["product-surface-reviewer", "product-experience-architect"],
|
|
294
|
+
"axisDelta": { "visual_coherence": 0.05, "navigation": 0.08, "accessibility": -0.01 },
|
|
295
|
+
"maxDeltaCap": 0.15
|
|
296
|
+
}
|
|
297
|
+
```
|
|
298
|
+
|
|
299
|
+
**Score Scope Separation:**
|
|
300
|
+
|
|
301
|
+
- Discussion 3-layer scores evaluate **design direction quality** (option comparison).
|
|
302
|
+
- Prototyping scoringTrace evaluates **implementation fidelity** against the selected anchor.
|
|
303
|
+
- These are different evaluation targets. Copying discussion scores into `scoringTrace` is prohibited.
|
|
304
|
+
|
|
305
|
+
### Maximum Delta Cap
|
|
306
|
+
|
|
307
|
+
Per-axis score improvement per iteration is capped at `maxDeltaPerAxisPerIteration: 0.15`.
|
|
308
|
+
Any reported delta exceeding this cap MUST trigger re-evaluation or justification.
|
|
309
|
+
This prevents single-iteration score inflation.
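A check for the cap described above might look like the sketch below. It assumes the `axisDelta` shape from the `scoringTrace` example and treats only positive deltas (improvements) as subject to the cap; `axes_exceeding_cap` is a hypothetical name.

```python
from typing import Dict, List

MAX_DELTA_PER_AXIS_PER_ITERATION = 0.15  # value quoted in the text above


def axes_exceeding_cap(
    axis_delta: Dict[str, float],
    cap: float = MAX_DELTA_PER_AXIS_PER_ITERATION,
) -> List[str]:
    """Return the axes whose improvement in one iteration exceeds the cap."""
    return sorted(axis for axis, delta in axis_delta.items() if delta > cap)
```

Any axis returned here would need re-evaluation or a recorded justification before the iteration is accepted.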
|
|
310
|
+
|
|
311
|
+
## Evaluation Rigor Rules (Full-Harness)
|
|
312
|
+
|
|
313
|
+
### Rubric-Based Scoring Structure
|
|
314
|
+
|
|
315
|
+
Each evaluation axis MUST use a 3-tier rubric:
|
|
316
|
+
|
|
317
|
+
| Tier | Criteria | Score Range |
|
|
318
|
+
| --------------------- | ---------------------------------------- | ----------- |
|
|
319
|
+
| `existence_gate` | Is the element present at all? | 0.0-0.3 |
|
|
320
|
+
| `quality_criteria` | Does it meet baseline quality standards? | 0.3-0.7 |
|
|
321
|
+
| `excellence_criteria` | Does it exceed expectations? | 0.7-1.0 |
|
|
322
|
+
|
|
323
|
+
An axis that fails `existence_gate` cannot score above 0.3 regardless of other qualities.
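The gate rule above can be expressed as a clamp. This is a minimal sketch with hypothetical helper names; because the tier ranges in the table share their boundary values (0.3 and 0.7), the sketch assumes a boundary score belongs to the higher tier.

```python
def cap_axis_score(raw_score: float, passes_existence_gate: bool) -> float:
    """Clamp a score to [0, 1] and apply the existence-gate ceiling of 0.3."""
    score = min(max(raw_score, 0.0), 1.0)
    if not passes_existence_gate:
        score = min(score, 0.3)
    return score


def tier_for(score: float) -> str:
    """Map a score to its rubric tier (boundary handling is an assumption)."""
    if score < 0.3:
        return "existence_gate"
    if score < 0.7:
        return "quality_criteria"
    return "excellence_criteria"
```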
|
|
324
|
+
|
|
325
|
+
### L1/L2 Classification and Agent-Fixable Assessment
|
|
326
|
+
|
|
327
|
+
| Level | Definition | Agent-fixable? | Action |
|
|
328
|
+
| --------- | ----------------------------------------------------------------------------------- | ----------------------------------- | ------------------------------------- |
|
|
329
|
+
| L1 | Structural deficiency (missing element, broken navigation, accessibility violation) | Yes — must fix in current iteration | Fix immediately |
|
|
330
|
+
| L2 | Quality shortfall (suboptimal spacing, weak contrast, inconsistent tone) | Yes if clearly defined | Fix or justify deferral with evidence |
|
|
331
|
+
| L1-manual | Requires human judgment (brand alignment, business logic correctness) | No | Record in `limitations` section |
|
|
332
|
+
|
|
333
|
+
### Lighthouse Automated Gate (SHOULD)
|
|
334
|
+
|
|
335
|
+
When the surface is `web` and a dev server is available:
|
|
336
|
+
|
|
337
|
+
- Run Lighthouse audit (Performance, Accessibility, Best Practices, SEO).
|
|
338
|
+
- Record scores in evidence. Scores below 70 in any category are flagged as L1 findings.
|
|
339
|
+
- This is SHOULD (not MUST) because dev server availability is not guaranteed.
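The flagging rule above can be sketched as a filter over category scores. The sketch assumes scores on a 0 to 100 scale keyed by category name; that input shape, the finding dictionary, and the name `lighthouse_findings` are illustrative assumptions, not the package's evidence schema.

```python
from typing import Dict, List

THRESHOLD = 70  # scores below this are flagged as L1, per the rule above


def lighthouse_findings(scores: Dict[str, float], threshold: float = THRESHOLD) -> List[dict]:
    """Turn low Lighthouse category scores into L1 findings."""
    return [
        {"level": "L1", "category": category, "score": score}
        for category, score in scores.items()
        if score < threshold
    ]
```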
|
|
340
|
+
|
|
341
|
+
## Asset Acquisition Strategy (Full-Harness)
|
|
342
|
+
|
|
343
|
+
When `mode.effective = full-harness`, professional-quality visual assets are REQUIRED (not optional).
|
|
344
|
+
|
|
345
|
+
### Asset Rules
|
|
346
|
+
|
|
347
|
+
| Rule | Level | Description |
|
|
348
|
+
| ----------------------- | ------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
|
|
349
|
+
| Free asset sources | MUST | Use only properly licensed free assets (Unsplash, Pexels, Google Fonts, Heroicons, etc.). Record source URL and license in evidence. |
|
|
350
|
+
| Emoji prohibition | MUST | Emoji characters (U+1F000–U+1FAFF, U+2600–U+27BF) MUST NOT appear in UI output as decorative elements. Unicode symbols for functional purposes (e.g., ✓ for checkmarks) are allowed. |
|
|
351
|
+
| Placeholder prohibition | MUST | "Lorem ipsum", `placeholder.com` images, and gray boxes are not acceptable in full-harness final output. |
|
|
352
|
+
| Attribution | SHOULD | Record asset attributions in `prototyping.md` or a dedicated `assets.md`. |
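The code point ranges in the emoji rule above lend themselves to a simple scan. The sketch below only finds candidates: the second range also contains functional symbols such as U+2713 (the check mark the rule explicitly allows), so deciding whether a hit is decorative still needs review. `find_emoji_candidates` is a hypothetical name.

```python
from typing import List, Tuple

# Ranges quoted in the emoji prohibition rule above.
EMOJI_RANGES = ((0x1F000, 0x1FAFF), (0x2600, 0x27BF))


def find_emoji_candidates(text: str) -> List[Tuple[int, str]]:
    """Return (index, character) pairs that fall inside the listed ranges."""
    return [
        (index, char)
        for index, char in enumerate(text)
        if any(low <= ord(char) <= high for low, high in EMOJI_RANGES)
    ]
```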
|
|
353
|
+
|
|
354
|
+
### Accessibility Checklist (Full-Harness MUST)
|
|
355
|
+
|
|
356
|
+
- Color contrast ratio ≥ 4.5:1 for normal text, ≥ 3:1 for large text (WCAG 2.1 AA)
|
|
357
|
+
- All interactive elements are keyboard-navigable
|
|
358
|
+
- Images have `alt` attributes (decorative images use `alt=""`)
|
|
359
|
+
- Form inputs have associated labels
|
|
360
|
+
- Focus indicators are visible
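The contrast item in the checklist above can be verified numerically. The following sketch implements the WCAG 2.1 relative-luminance and contrast-ratio formulas for 8-bit sRGB colors; the function names are illustrative, and nothing here is taken from the package itself.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.1 definition)."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: RGB) -> float:
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground: RGB, background: RGB) -> float:
    """Ratio between 1 and 21; the order of the arguments does not matter."""
    lum_a = relative_luminance(foreground)
    lum_b = relative_luminance(background)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


def meets_aa(foreground: RGB, background: RGB, large_text: bool = False) -> bool:
    """Apply the AA thresholds from the checklist: 4.5:1 normal, 3:1 large."""
    return contrast_ratio(foreground, background) >= (3.0 if large_text else 4.5)
```

For instance, `meets_aa((118, 118, 118), (255, 255, 255))` passes for normal text, while a slightly lighter gray passes only at the large-text threshold.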
|
|
361
|
+
|
|
362
|
+
### Trust Signal Checklist (Full-Harness SHOULD)
|
|
363
|
+
|
|
364
|
+
- Consistent typography hierarchy (h1 > h2 > h3 > body)
|
|
365
|
+
- Consistent spacing rhythm (4px/8px grid or equivalent)
|
|
366
|
+
- Professional color palette (not random/clashing colors)
|
|
367
|
+
- Loading states and error states are designed (not browser defaults)
|
|
368
|
+
- No broken images or missing resources in rendered output
|
|
369
|
+
|
|
370
|
+
### Dev Server Management Protocol
|
|
371
|
+
|
|
372
|
+
When a dev server is started for evidence collection:
|
|
373
|
+
|
|
374
|
+
1. Record the process PID and port in evidence metadata.
|
|
375
|
+
2. After evidence collection, terminate the dev server explicitly.
|
|
376
|
+
3. Do not leave orphaned dev server processes running.
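The three steps above can be sketched with the Python standard library. This is an illustration under assumptions: `run_with_dev_server` and the metadata keys (`pid`, `port`, `exitCode`) are hypothetical, and the command to start a real dev server depends on the project.

```python
import subprocess
from typing import Callable, Dict, List


def run_with_dev_server(
    command: List[str],
    port: int,
    collect: Callable[[Dict], None],
) -> Dict:
    """Start a server process, collect evidence, then always stop the process."""
    process = subprocess.Popen(command)
    # Step 1: record PID and port so the process can be traced later.
    metadata = {"pid": process.pid, "port": port}
    try:
        collect(metadata)
    finally:
        # Steps 2 and 3: terminate explicitly, escalate if it does not exit.
        process.terminate()
        try:
            process.wait(timeout=5)
        except subprocess.TimeoutExpired:
            process.kill()
            process.wait()
    metadata["exitCode"] = process.returncode
    return metadata
```

Wrapping the collection step in `try` / `finally` ensures the server is stopped even when evidence collection raises.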
|
|
377
|
+
|
|
178
378
|
## Required Process
|
|
179
379
|
|
|
180
380
|
1. Read `.qfai/specs/spec-*` and determine the surface and requested mode.
|
|
181
381
|
2. Build the minimum runnable slice across **ALL specs**.
|
|
182
382
|
3. Produce `prototyping.md` and `prototyping.json` with a complete Coverage Matrix.
|
|
183
|
-
4. If
|
|
383
|
+
4. If `web`, `mobile`, `desktop`, or `mixed` surface, capture `uiFidelity`; if full-harness, capture runtime gate, render bundle, and browser QA bundle.
|
|
184
384
|
5. Review rendered output, screenshot evidence, HTML snapshots, or preview artifacts against the canonical sidecar family.
|
|
185
|
-
6.
|
|
186
|
-
|
|
187
|
-
|
|
188
|
-
|
|
385
|
+
6. **[full-harness only]** Execute the Full-Harness Iteration Protocol:
|
|
386
|
+
a. Initialize calibration from `qfai.config.yaml > prototyping.calibration`.
|
|
387
|
+
b. Run Evaluate → Identify → Fix → Re-evaluate cycle.
|
|
388
|
+
c. Launch independent evaluators (product-surface-reviewer, product-experience-architect) per iteration.
|
|
389
|
+
d. Record each iteration in `scoringTrace`.
|
|
390
|
+
e. Continue until termination condition is met.
|
|
391
|
+
f. Record `terminationReason`, `iterationCount`, `bestIteration`.
|
|
392
|
+
7. Record critique findings, classify each as `L1` or `L2`, and either fix or mark the result `REVISE`.
|
|
393
|
+
8. Use the following read order when the surface is `web`, `mobile`, `desktop`, or `mixed`:
|
|
394
|
+
option comparison (`30_option_comparison.md`) -> selected anchor screen (`31_selected_anchor_screen.md`) ->
|
|
395
|
+
strategy (`10_implementation_strategy.md`) -> taste interview (`11_design_taste_interview.md`) ->
|
|
396
|
+
trend scan (`04_Sources.md`) -> 3-layer evaluation family (`20/21/22/23` + optional `24`) ->
|
|
397
|
+
screen contracts (`40_screen_contracts.md`) -> review input bundle (`50_review_input_bundle.md`).
|
|
398
|
+
9. Run `qfai validate --fail-on error`.
|
|
399
|
+
10. Route reviewer gate and do not declare completion until the result is `PASS`.
|
|
189
400
|
|
|
190
401
|
## Sub-agent Delegation (MANDATORY)
|
|
191
402
|
|
|
@@ -223,6 +434,24 @@ Every major artifact in this stage MUST include this table schema:
|
|
|
223
434
|
- Test volume floors/ratios are not gates; they are signals.
|
|
224
435
|
- Reviewer must verify evidence obligations for the chosen `surface / mode`.
|
|
225
436
|
- Do not declare DONE until Reviewer returns `PASS`; otherwise apply `REVISE`.
|
|
437
|
+
- **[full-harness only]** Reviewer MUST verify:
|
|
438
|
+
- `iterationCount > 1` (or explicit justification for single-iteration convergence).
|
|
439
|
+
- `scoringTrace` contains entries equal to `iterationCount`.
|
|
440
|
+
- `scoringTrace` shows measurable score progression (not all identical scores).
|
|
441
|
+
- `terminationReason` is consistent with the scoring trajectory.
|
|
442
|
+
- Independent evaluators were actually invoked (not fabricated names).
|
|
443
|
+
- `limitations` section is present and documents known shortcomings honestly.
|
|
444
|
+
|
|
445
|
+
### Limitations Section (Full-Harness MUST)
|
|
446
|
+
|
|
447
|
+
When `mode.effective = full-harness`, the evidence MUST include a `## Limitations` section in `prototyping.md` that documents:
|
|
448
|
+
|
|
449
|
+
- Known quality shortcomings that were not resolved by the iteration loop.
|
|
450
|
+
- Evaluation axes where scores did not reach `accept` threshold.
|
|
451
|
+
- Areas where agent judgment is insufficient (requires human review).
|
|
452
|
+
- Technical constraints that prevented further improvement (e.g., asset licensing, browser API limitations).
|
|
453
|
+
|
|
454
|
+
Omitting limitations or recording an empty limitations section when `iterationCount < maxIterations` is a process integrity concern.
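A reviewer could check the presence of the section described above with a small text scan. The sketch assumes `prototyping.md` uses a level-2 `## Limitations` heading exactly as written above; `limitations_body` is a hypothetical helper and is not part of `qfai validate`.

```python
import re
from typing import Optional

_LIMITATIONS = re.compile(
    r"^## Limitations[ \t]*\n(.*?)(?=^## |\Z)",
    flags=re.MULTILINE | re.DOTALL,
)


def limitations_body(markdown: str) -> Optional[str]:
    """Return the text under `## Limitations`, or None if the heading is absent.

    An empty string means the heading exists but the section has no content.
    """
    match = _LIMITATIONS.search(markdown)
    return match.group(1).strip() if match else None
```

A `None` or empty result would correspond to the omitted or empty section that the text above treats as a process integrity concern.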
|
|
226
455
|
|
|
227
456
|
## Completion Contract (Shared)
|
|
228
457
|
|
|
@@ -231,8 +460,9 @@ Before DONE:
|
|
|
231
460
|
- package assets and generated evidence must match the obligation matrix
|
|
232
461
|
- `qfai validate --fail-on error` must pass
|
|
233
462
|
- reviewer gate must return PASS
|
|
234
|
-
-
|
|
235
|
-
-
|
|
463
|
+
- `web`, `mobile`, `desktop`, `mixed` surface runs must reconcile `uiFidelity`, render evidence, and critique outputs
|
|
464
|
+
- `cli` surface runs preserve n/a semantics for render and browser QA without fake placeholders
|
|
465
|
+
- `ui_bearing: false` specs are not prototyping execution targets
|
|
236
466
|
|
|
237
467
|
## FINAL CHECKLIST (Check Last)
|
|
238
468
|
|
|
@@ -65,7 +65,7 @@ agents:
|
|
|
65
65
|
- id: frontend-engineer
|
|
66
66
|
kind: worker
|
|
67
67
|
domain: frontend
|
|
68
|
-
mission: Implement frontend behavior aligned with selected
|
|
68
|
+
mission: Implement frontend behavior aligned with selected anchor, strategy, screen contracts, and product-surface decisions.
|
|
69
69
|
owned_artifacts: [ui-implementation, surface-evidence]
|
|
70
70
|
tool_profile: frontend
|
|
71
71
|
permission_profile: authoring
|
|
@@ -119,7 +119,7 @@ routing:
|
|
|
119
119
|
rerun_policy: changed-scope-dependents
|
|
120
120
|
- id: evidence
|
|
121
121
|
mandatory_agents: []
|
|
122
|
-
conditional_agents: [devops-ci-engineer, qa-gatekeeper]
|
|
122
|
+
conditional_agents: [devops-ci-engineer, qa-gatekeeper, product-experience-architect]
|
|
123
123
|
parallel_groups: []
|
|
124
124
|
blocking_agents: [qa-gatekeeper]
|
|
125
125
|
rerun_policy: changed-scope-dependents
|
|
@@ -19,9 +19,9 @@
|
|
|
19
19
|
## Compatibility vs Change Rubric
|
|
20
20
|
|
|
21
21
|
- Criteria (Compatibility): validate.json is an internal contract (not a stable API). CLI command system follows semver.
|
|
22
|
-
- Criteria (Change): Breaking changes
|
|
22
|
+
- Criteria (Change): canonical consistency, validator alignment, and shipped SSOT alignment take priority. Breaking changes are allowed when required to restore canonical consistency.
|
|
23
23
|
- Examples: `_shared/` -> `_policies/` rename (v1.5.3), spec-pack -> layered migration (v1.4.17)
|
|
24
|
-
- Evidence: CHANGELOG.md
|
|
24
|
+
- Evidence: CHANGELOG.md
|
|
25
25
|
|
|
26
26
|
## Governance (Ownership / Review / Evidence)
|
|
27
27
|
|
|
@@ -54,14 +54,11 @@
|
|
|
54
54
|
## Non-goals / Not-now (Optional)
|
|
55
55
|
|
|
56
56
|
- IDE plugin / GUI development
|
|
57
|
-
- Plugin architecture
|
|
57
|
+
- Plugin architecture
|
|
58
58
|
- Automated test generation
|
|
59
59
|
- browser QA full audit / screenshot diff / repair loop / external critique adapter (v1.7.1)
|
|
60
60
|
- auto-fix / rewrite for design findings (v1.7.2)
|
|
61
|
-
-
|
|
62
|
-
- browser QA output normalization shape (deferred to v1.7.6, OQ-0002 of discussion-20260329130000123)
|
|
63
|
-
- external critique provider / full-harness orchestration / calibration pack / cost observability / long-running handoff (v1.7.5 out of scope → v1.7.6 IN scope)
|
|
64
|
-
- Evidence: 05_Scope.md (Out of Scope), OQ-0001, OQ-0002, discussion-20260329175059391
|
|
61
|
+
- Evidence: 05_Scope.md (Out of Scope)
|
|
65
62
|
|
|
66
63
|
## References (Optional)
|
|
67
64
|
|
|
@@ -37,8 +37,9 @@
|
|
|
37
37
|
|
|
38
38
|
## Release posture
|
|
39
39
|
|
|
40
|
-
- Compatibility policy:
|
|
41
|
-
- Breaking
|
|
40
|
+
- Compatibility policy: current canonical contract only.
|
|
41
|
+
- Breaking changes are allowed when required to restore canonical consistency.
|
|
42
|
+
- CLI/skill/docs/validator must match current package semantics.
|
|
42
43
|
- Evidence: CHANGELOG.md, 09_Constraints.md (DL-02)
|
|
43
44
|
|
|
44
45
|
## Milestones
|
|
@@ -64,11 +65,10 @@
|
|
|
64
65
|
| v1.7.7 (completed) | Remediation & Prototyping Readiness — static-first prototyping default + full-harness mode exposure + 3-layer eval reconciliation + strategy/contract upgrade + UI-bearing detection fix + render evidence wiring + browser QA findings + doc normalization + migration support |
|
|
65
66
|
| v1.7.8 (completed) | Canonical Convergence — design taste interview + trend research + 3-layer evaluation convergence + scoring-ready schema + strategy/screen contract upgrade + UI-bearing detection unification + static-first prototyping rewrite + full-harness mode convergence + render evidence wiring + browser QA MVP + reviewer extension + migration normalization + docs normalization |
|
|
66
67
|
| v1.7.9 (completed) | Convergence Correction Release — canonical validator registration, discussion completion convergence, honest render evidence/browser QA wiring, reviewer routing alignment, docs maturity normalization |
|
|
67
|
-
| v1.7.13 (
|
|
68
|
+
| v1.7.13 (completed) | Canonical Sidecar Convergence — selected anchor SSOT moved to 31_selected_anchor_screen.md, option comparison remains in 30_option_comparison.md, sidecar-first read order, DDS/anchor vocabulary removal, validator semantics rewrite, template-validator self-consistency |
|
|
69
|
+
| v1.7.14 (in progress) | Canonical Convergence Finalization — strict classification enforcement, namespaced-only prototyping.yaml, current-only shipped SSOT, regression net hardening |
|
|
68
70
|
|
|
69
71
|
## Open questions
|
|
70
72
|
|
|
71
73
|
- Blocking: none
|
|
72
|
-
- Non-blocking:
|
|
73
|
-
- OQ-0003: validate.json external API stability (deferred to v2.0)
|
|
74
|
-
- OQ-0004: Legacy spec-pack deprecation schedule (deferred to v2.0)
|
|
74
|
+
- Non-blocking: none
|
|
@@ -16,6 +16,9 @@ profiles:
|
|
|
16
16
|
runtime-heavy:
|
|
17
17
|
always_required: [completion-reviewer, qa-gatekeeper]
|
|
18
18
|
conditional_required: [implementation-reviewer]
|
|
19
|
+
full-harness:
|
|
20
|
+
always_required: [completion-reviewer, product-surface-reviewer, qa-gatekeeper]
|
|
21
|
+
conditional_required: []
|
|
19
22
|
|
|
20
23
|
optional_modes:
|
|
21
24
|
devils-advocate:
|
|
@@ -9,7 +9,7 @@ Defined in spec-0013 (CAP-0013), downstream skills (prototyping / ATDD / TD
|
|
|
9
9
|
|
|
10
10
|
1. **Discussion-side UI/UX Sidecar Artifacts** (`discussion-*/uiux/`) — **primary source of truth**
|
|
11
11
|
- `30_option_comparison.md` — option comparison (comparison artifact)
|
|
12
|
-
- `31_selected_anchor_screen.md` — selection result + selected
|
|
12
|
+
- `31_selected_anchor_screen.md` — selection result + SSOT for the selected anchor
|
|
13
13
|
- `10_implementation_strategy.md` — implementation strategy (strict canonical schema)
|
|
14
14
|
- `11_design_taste_interview.md` — design taste interview
|
|
15
15
|
- `20-24` — 3-layer evaluation family (invariant / trend-derived / product-specific / aggregate / dynamic overrides)
|
|
@@ -47,7 +47,7 @@ Defined in spec-0013 (CAP-0013), downstream skills (prototyping / ATDD / TD
|
|
|
47
47
|
|
|
48
48
|
## Priority and Override Semantics
|
|
49
49
|
|
|
50
|
-
- sidecar artifacts (selected
|
|
50
|
+
- sidecar artifacts (selected anchor / strategy / contracts) are the **primary truth**
|
|
51
51
|
- UI Contracts and Design Tokens are **supporting input, read only when present** (not the primary truth)
|
|
52
52
|
- The optional fallback mock is an even lower-priority **fallback**
|
|
53
53
|
- If a Design Token value conflicts with the HTML Mock fallback value, a warning is issued
|
|
@@ -35,11 +35,11 @@ The contract must describe both screen structure and minimum mockable behavior.
|
|
|
35
35
|
|
|
36
36
|
### `data-qfai` marker convention
|
|
37
37
|
|
|
38
|
-
-
|
|
38
|
+
- Canonical marker value: `CONTRACT_ID:ELEMENT_ID` (example: `data-qfai="CON-UI-0001:search_input"`).
|
|
39
39
|
- Use `elements[].id` (stable ID) for the marker suffix, not `elements[].label`.
|
|
40
40
|
- Even when label text is not visible in the UI, markers ensure fidelity coverage.
|
|
41
41
|
- autogen generates expected markers from `elements[].id` automatically.
|
|
42
|
-
-
|
|
42
|
+
- The id-based format (`CONTRACT_ID:ELEMENT_ID`) is the only canonical marker format.
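The marker convention above can be illustrated with a minimal Python sketch. It is not qfai's autogen or validator code: the function names `parse_marker` and `expected_markers`, the dictionary shape of `elements`, and the rule that neither ID contains a colon or whitespace are all assumptions made for illustration.

```python
import re
from typing import Optional

# Assumption: neither the contract ID nor the element ID contains ':' or whitespace.
MARKER_RE = re.compile(r"([^:\s]+):([^:\s]+)")


def parse_marker(value: str) -> Optional[tuple[str, str]]:
    """Split a data-qfai value into (contract_id, element_id), or None if malformed."""
    match = MARKER_RE.fullmatch(value)
    return (match.group(1), match.group(2)) if match else None


def expected_markers(contract_id: str, elements: list[dict]) -> list[str]:
    """Build marker values from each element's stable `id`, never from its `label`."""
    return [f"{contract_id}:{element['id']}" for element in elements]


# Hypothetical contract elements; only `id` feeds the marker.
elements = [{"id": "search_input", "label": "Search"}]
markers = expected_markers("CON-UI-0001", elements)
```

A label-based value such as `Search input` has no `CONTRACT_ID:` prefix, so `parse_marker` returns `None` for it under these assumptions.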
|
|
43
43
|
|
|
44
44
|
## Mockable prototype minimum (L2)
|
|
45
45
|
|
|
@@ -2,7 +2,7 @@
|
|
|
2
2
|
|
|
3
3
|
## Purpose
|
|
4
4
|
|
|
5
|
-
`discussion/` stores the unified discussion pack that merges interview outputs (discuss) and requirement intake (require). Discussion packs use 15 required markdown files
|
|
5
|
+
`discussion/` stores the unified discussion pack that merges interview outputs (discuss) and requirement intake (require). Discussion packs use 15 required markdown files. When the latest pack is `ui_bearing: true`, it must also include `prototyping.yaml`; when `ui_bearing: false`, `prototyping.yaml` is not required.
|
|
6
6
|
|
|
7
7
|
This directory does not directly update `specs/`; it prepares decisions, requirements, open questions, and rationale as inputs for `/qfai-sdd`.
|
|
8
8
|
|
|
@@ -29,7 +29,7 @@ discussion/
|
|
|
29
29
|
├── 13_Deferred.md
|
|
30
30
|
├── 14_Review-Request.md
|
|
31
31
|
├── 99_delta.md
|
|
32
|
-
└── prototyping.yaml
|
|
32
|
+
└── prototyping.yaml # required only when ui_bearing: true
|
|
33
33
|
```
|
|
34
34
|
|
|
35
35
|
## File responsibilities
|
|
@@ -103,11 +103,11 @@ discussion/
|
|
|
103
103
|
- Use timestamp directory naming for new outputs: `discussion-YYYYMMDDhhmmssSSS`.
|
|
104
104
|
- `14_Review-Request.md` must reference routing SSOT: `.qfai/assistant/steering/agent-routing.yml` and `.qfai/assistant/steering/review-profiles.yml`.
|
|
105
105
|
|
|
106
|
-
## prototyping.yaml (
|
|
106
|
+
## prototyping.yaml (Classification-aware Recommendation Artifact)
|
|
107
107
|
|
|
108
|
-
Each discussion pack **must** include a `prototyping.yaml` file that recommends the prototyping mode for the project.
|
|
108
|
+
Each UI-bearing discussion pack (`ui_bearing: true`) **must** include a `prototyping.yaml` file that recommends the prototyping mode for the project. Non-UI discussion packs (`ui_bearing: false`) do not require `prototyping.yaml`.
|
|
109
109
|
|
|
110
|
-
### Canonical namespaced schema (
|
|
110
|
+
### Canonical namespaced schema (required)
|
|
111
111
|
|
|
112
112
|
```yaml
|
|
113
113
|
prototyping:
|
|
@@ -117,24 +117,9 @@ prototyping:
|
|
|
117
117
|
- low-cost
|
|
118
118
|
- standard
|
|
119
119
|
- full-harness
|
|
120
|
-
surface: web
|
|
120
|
+
surface: web
|
|
121
121
|
```
|
|
122
122
|
|
|
123
|
-
### Legacy top-level schema (deprecated — read-only backward compatibility)
|
|
124
|
-
|
|
125
|
-
The following top-level form is accepted by the parser for backward compatibility but produces a deprecation warning (`QFAI-PROT-231`). New artifacts MUST NOT emit this form; use the namespaced canonical schema above.
|
|
126
|
-
|
|
127
|
-
```yaml
|
|
128
|
-
recommended_mode: standard
|
|
129
|
-
rationale: ...
|
|
130
|
-
allowed_modes:
|
|
131
|
-
- low-cost
|
|
132
|
-
- standard
|
|
133
|
-
surface: web-ui
|
|
134
|
-
```
|
|
135
|
-
|
|
136
|
-
If both forms are present in the same file, the namespaced form takes precedence and a conflict warning (`QFAI-PROT-232`) is emitted.
|
|
137
|
-
|
|
138
123
|
### Field reference
|
|
139
124
|
|
|
140
125
|
All 4 fields are **required**. An artifact missing any field will fail validation.
|
|
@@ -144,7 +129,14 @@ All 4 fields are **required**. An artifact missing any field will fail validatio
|
|
|
144
129
|
| `recommended_mode` | yes | `low-cost`, `standard`, or `full-harness` |
|
|
145
130
|
| `rationale` | yes | Non-empty string explaining the recommendation |
|
|
146
131
|
| `allowed_modes` | yes | Unique array of valid modes; must include `recommended_mode` |
|
|
147
|
-
| `surface` | yes | `web
|
|
132
|
+
| `surface` | yes | `web`, `mobile`, `desktop`, `cli`, or `mixed` |
|
|
133
|
+
|
|
134
|
+
### Validation rules
|
|
135
|
+
|
|
136
|
+
- Only the canonical namespaced schema under the `prototyping:` key is accepted. Top-level recommendation keys (`recommended_mode`, `rationale`, `allowed_modes`, `surface` at root level) are not supported and will cause validation failure.
|
|
137
|
+
- Coexistence of top-level recommendation keys with the namespaced `prototyping:` block is invalid.
|
|
138
|
+
- `recommended_mode` must be included in `allowed_modes`. An artifact where `recommended_mode` is not in `allowed_modes` is invalid.
|
|
139
|
+
- An artifact that does not conform to the canonical namespaced schema is invalid and will be rejected by both validation and execution/CLI. No fallback to explicit mode or default mode is performed for invalid artifacts.
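The validation rules above can be sketched as a small Python check over an already-parsed `prototyping.yaml` document. This is an illustrative approximation, not the shipped qfai validator: the function name `validate_prototyping`, the error strings, and the choice to take a plain dict (rather than parse YAML) are assumptions.

```python
from typing import Any

VALID_MODES = {"low-cost", "standard", "full-harness"}
VALID_SURFACES = {"web", "mobile", "desktop", "cli", "mixed"}
FIELDS = ("recommended_mode", "rationale", "allowed_modes", "surface")


def validate_prototyping(doc: Any) -> list[str]:
    """Return rule violations; an empty list means the artifact is valid."""
    if not isinstance(doc, dict):
        return ["document root must be a mapping"]
    errors: list[str] = []

    # Top-level recommendation keys are rejected, with or without a namespaced block.
    stray = [key for key in FIELDS if key in doc]
    if stray:
        errors.append(f"top-level recommendation keys are not supported: {stray}")

    block = doc.get("prototyping")
    if not isinstance(block, dict):
        errors.append("missing namespaced 'prototyping:' block")
        return errors

    missing = [key for key in FIELDS if key not in block]
    if missing:
        errors.append(f"missing required fields: {missing}")

    mode = block.get("recommended_mode")
    if "recommended_mode" in block and not (isinstance(mode, str) and mode in VALID_MODES):
        errors.append("recommended_mode must be low-cost, standard, or full-harness")

    rationale = block.get("rationale")
    if "rationale" in block and not (isinstance(rationale, str) and rationale.strip()):
        errors.append("rationale must be a non-empty string")

    allowed = block.get("allowed_modes")
    if "allowed_modes" in block:
        well_formed = (
            isinstance(allowed, list)
            and all(isinstance(m, str) and m in VALID_MODES for m in allowed)
            and len(set(allowed)) == len(allowed)
        )
        if not well_formed:
            errors.append("allowed_modes must be a unique array of valid modes")
        elif mode not in allowed:
            errors.append("recommended_mode must be included in allowed_modes")

    surface = block.get("surface")
    if "surface" in block and not (isinstance(surface, str) and surface in VALID_SURFACES):
        errors.append("surface must be web, mobile, desktop, cli, or mixed")

    return errors
```

In this sketch a caller treats any non-empty result as a hard failure and does not fall back to an explicit or default mode, mirroring the last rule in the list.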
|
|
148
140
|
|
|
149
141
|
## Suggested naming
|
|
150
142
|
|