qualia-framework 4.5.0 → 5.3.0

Files changed (66)
  1. package/AGENTS.md +24 -0
  2. package/CLAUDE.md +12 -75
  3. package/README.md +23 -16
  4. package/agents/builder.md +9 -21
  5. package/agents/planner.md +8 -0
  6. package/agents/verifier.md +8 -0
  7. package/agents/visual-evaluator.md +132 -0
  8. package/bin/cli.js +54 -18
  9. package/bin/install.js +369 -29
  10. package/bin/qualia-ui.js +208 -1
  11. package/bin/slop-detect.mjs +5 -0
  12. package/bin/state.js +34 -1
  13. package/docs/install-redesign-builder-prompt.md +290 -0
  14. package/docs/install-redesign-pilot.md +234 -0
  15. package/docs/playwright-loop-builder-prompt.md +185 -0
  16. package/docs/playwright-loop-design-notes.md +108 -0
  17. package/docs/playwright-loop-pilot-results.md +170 -0
  18. package/docs/playwright-loop-tester-prompt.md +213 -0
  19. package/docs/polish-loop-supervised-run.md +111 -0
  20. package/docs/reviews/matt-pocock-skills-analysis.md +300 -0
  21. package/guide.md +9 -5
  22. package/hooks/env-empty-guard.js +74 -0
  23. package/hooks/pre-compact.js +19 -9
  24. package/hooks/pre-deploy-gate.js +8 -2
  25. package/hooks/pre-push.js +26 -12
  26. package/hooks/supabase-destructive-guard.js +62 -0
  27. package/hooks/vercel-account-guard.js +91 -0
  28. package/package.json +2 -1
  29. package/rules/design-brand.md +4 -0
  30. package/rules/design-laws.md +4 -0
  31. package/rules/design-product.md +4 -0
  32. package/rules/design-rubric.md +4 -0
  33. package/rules/grounding.md +4 -0
  34. package/skills/qualia-build/SKILL.md +40 -46
  35. package/skills/qualia-discuss/SKILL.md +51 -68
  36. package/skills/qualia-handoff/SKILL.md +1 -0
  37. package/skills/qualia-hook-gen/SKILL.md +206 -0
  38. package/skills/qualia-issues/SKILL.md +151 -0
  39. package/skills/qualia-map/SKILL.md +78 -35
  40. package/skills/qualia-new/REFERENCE.md +139 -0
  41. package/skills/qualia-new/SKILL.md +45 -121
  42. package/skills/qualia-optimize/REFERENCE.md +265 -0
  43. package/skills/qualia-optimize/SKILL.md +92 -232
  44. package/skills/qualia-plan/SKILL.md +58 -65
  45. package/skills/qualia-polish-loop/REFERENCE.md +265 -0
  46. package/skills/qualia-polish-loop/SKILL.md +201 -0
  47. package/skills/qualia-polish-loop/fixtures/broken.html +117 -0
  48. package/skills/qualia-polish-loop/fixtures/clean.html +196 -0
  49. package/skills/qualia-polish-loop/scripts/loop.mjs +323 -0
  50. package/skills/qualia-polish-loop/scripts/playwright-capture.mjs +206 -0
  51. package/skills/qualia-polish-loop/scripts/score.mjs +176 -0
  52. package/skills/qualia-prd/SKILL.md +199 -0
  53. package/skills/qualia-report/SKILL.md +141 -200
  54. package/skills/qualia-research/SKILL.md +28 -33
  55. package/skills/qualia-road/SKILL.md +103 -0
  56. package/skills/qualia-ship/SKILL.md +1 -0
  57. package/skills/qualia-task/SKILL.md +1 -1
  58. package/skills/qualia-test/SKILL.md +50 -2
  59. package/skills/qualia-triage/SKILL.md +152 -0
  60. package/skills/qualia-verify/SKILL.md +63 -104
  61. package/skills/qualia-zoom/SKILL.md +51 -0
  62. package/skills/zoho-workflow/SKILL.md +1 -1
  63. package/templates/CONTEXT.md +36 -0
  64. package/templates/decisions/ADR-template.md +30 -0
  65. package/tests/bin.test.sh +598 -7
  66. package/tests/state.test.sh +58 -0
@@ -1,6 +1,6 @@
  ---
  name: qualia-plan
- description: "Plan the current phase: spawns planner, validates with plan-checker in a revision loop (max 2), optionally runs discuss/research first. Use when ready to plan a phase."
+ description: "Plans the current phase by spawning a planner agent to break it into executable tasks with waves, then validates via a plan-checker revision loop (max 2 cycles). Supports gap-closure mode for verification failures. Use when the user says 'plan this phase', 'break this into tasks', 'create the plan', 'qualia-plan', 'plan phase 2', or after /qualia-new sets up the journey."
  allowed-tools:
  - Bash
  - Read
@@ -15,15 +15,15 @@ allowed-tools:

  # /qualia-plan — Plan a Phase

- Spawn a planner agent to break the current phase into executable tasks, then validate the plan with a checker (up to 2 revision cycles) before routing to build.
+ Spawn planner to break phase into tasks, validate with checker (max 2 revision cycles), route to build.

  ## Usage

- `/qualia-plan` — plan the next unplanned phase
+ `/qualia-plan` — plan next unplanned phase
  `/qualia-plan {N}` — plan specific phase N
  `/qualia-plan {N} --gaps` — plan fixes for verification failures
- `/qualia-plan {N} --skip-check` — skip the plan-checker validation loop (not recommended)
- `/qualia-plan {N} --auto` — plan + auto-chain into `/qualia-build {N} --auto` when done (no human approval between plan and build)
+ `/qualia-plan {N} --skip-check` — skip plan-checker loop (not recommended)
+ `/qualia-plan {N} --auto` — plan + chain into `/qualia-build {N} --auto` (no human gate)

  ## Process

@@ -38,9 +38,9 @@ node ~/.claude/bin/knowledge.js load patterns
  node ~/.claude/bin/knowledge.js load client
  ```

- If no phase number given, use the current phase from STATE.md.
+ No phase number → current phase from STATE.md.

- **Read phase-specific context if it exists:**
+ **Phase-specific context (if exists):**
  ```bash
  cat .planning/phase-{N}-context.md 2>/dev/null # from /qualia-discuss
  cat .planning/phase-{N}-research.md 2>/dev/null # from /qualia-research
@@ -48,19 +48,17 @@ cat .planning/phase-{N}-research.md 2>/dev/null # from /qualia-research

  ### 2. Optional: Suggest Deeper Prep

- **If ROADMAP.md marked this phase as a "research flag" AND no phase-{N}-research.md exists:**
+ **If ROADMAP.md flagged phase for research AND no phase-{N}-research.md:**

  - header: "Research first?"
- - question: "This phase was flagged for deeper research. Run /qualia-research {N} first?"
+ - question: "Phase flagged for research. Run /qualia-research {N} first?"
  - options:
-   - "Yes, research first" — Run /qualia-research {N} inline, then continue
-   - "Skip, plan directly" — I know enough
+   - "Yes, research first"
+   - "Skip, plan directly"

- **If phase involves compliance/regulatory/architectural stakes AND no phase-{N}-context.md exists:**
+ **If phase has compliance/regulatory/architectural stakes AND no phase-{N}-context.md:**

- Briefly suggest: *"Want to run /qualia-discuss {N} first to lock decisions? Optional."*
-
- Don't force it. Some phases don't need it.
+ Suggest: *"Run /qualia-discuss {N} first to lock decisions? Optional."*

  ### 3. Spawn Planner (Fresh Context)

@@ -69,12 +67,9 @@ node ~/.claude/bin/qualia-ui.js banner plan {N} "{phase name from ROADMAP.md}"
  node ~/.claude/bin/qualia-ui.js spawn planner "Breaking phase into tasks..."
  ```

- Spawn the planner:
-
  ```
  Agent(prompt="
- Read your role: @~/.claude/agents/planner.md
- Grounding + rubrics: @~/.claude/rules/grounding.md
+ Role: @~/.claude/agents/planner.md

  <project_context>
  @.planning/PROJECT.md
@@ -89,86 +84,84 @@ Phase {N} from ROADMAP.md:
  @.planning/ROADMAP.md

  Goal: {goal from ROADMAP.md}
- Requirements: {REQ-IDs from ROADMAP.md}
+ Reqs: {REQ-IDs from ROADMAP.md}
  Success criteria: {success criteria from ROADMAP.md}
  </phase_details>

  <locked_decisions>
- {if phase-{N}-context.md exists, inline its Locked Decisions section; else 'none'}
+ {phase-{N}-context.md Locked Decisions if exists, else 'none'}
  </locked_decisions>

  <research_findings>
- {if phase-{N}-research.md exists, inline its recommendation; else 'none'}
+ {phase-{N}-research.md recommendation if exists, else 'none'}
  </research_findings>

- {If --gaps: Also read @.planning/phase-{N}-verification.md for failures to fix. Create gap-closure plan.}
+ {--gaps → read @.planning/phase-{N}-verification.md failures. Create gap-closure plan.}

  <relevant_learnings>
- {inline any applicable patterns from knowledge/learned-patterns.md}
+ {applicable patterns from knowledge/learned-patterns.md}
  </relevant_learnings>

- Create the plan at .planning/phase-{N}-plan.md (or .planning/phase-{N}-gaps-plan.md for --gaps).
+ Output: .planning/phase-{N}-plan.md (or phase-{N}-gaps-plan.md for --gaps).
  ", subagent_type="qualia-planner", description="Plan phase {N}")
  ```

- ### 4. Validate the Plan (unless --skip-check)
+ ### 4. Validate Plan (unless --skip-check)

- Read the generated plan. Spawn the plan-checker:
+ Read generated plan. Spawn checker:

  ```
  Agent(prompt="
- Read your role: @~/.claude/agents/plan-checker.md
- Grounding + rubrics: @~/.claude/rules/grounding.md
+ Role: @~/.claude/agents/plan-checker.md

  <plan_path>.planning/phase-{N}-plan.md</plan_path>
  <phase_goal>{goal from ROADMAP.md}</phase_goal>
  <success_criteria>{criteria from ROADMAP.md}</success_criteria>
  <project_context>@.planning/PROJECT.md</project_context>

- Validate against the 7 rules. Return PASS or REVISE with structured issues.
+ Validate against 7 rules. Return PASS or REVISE with structured issues.
  ", subagent_type="qualia-plan-checker", description="Check plan phase {N}")
  ```

- **Revision loop (max 2 iterations):**
+ **Revision loop (max 2):**

- - Iteration 1: Check → if REVISE, re-spawn planner with checker issues
- - Iteration 2: Re-check → if REVISE or BLOCKED, escalate to user
+ - Iter 1: Check → REVISE → re-spawn planner with checker issues
+ - Iter 2: Re-check → REVISE/BLOCKED → escalate to user

- Rationale: Amazon/NeurIPS 2025 measured reflection gains at 74→86% for 1 round, 88% for 3 rounds. Iteration 3 only added 2pp over iteration 1 — not worth the extra planner spawn (serial cost ~30-60s).
+ (74→86% after 1 round, 88% after 3. Iter 3 adds only 2pp; not worth extra spawn.)

- For each revision:
+ Per revision:

  ```
  Agent(prompt="
- Read your role: @~/.claude/agents/planner.md
+ Role: @~/.claude/agents/planner.md

  <revision_mode>true</revision_mode>
  <current_plan>@.planning/phase-{N}-plan.md</current_plan>
  <checker_feedback>
- {inline REVISE output from plan-checker}
+ {REVISE output from plan-checker}
  </checker_feedback>

- Revise the plan in place. Address every issue. Do NOT add new tasks or change scope
- — only fix what the checker flagged.
+ Revise in place. Address every issue. Do NOT add tasks or change scope; fix only what checker flagged.
  ", subagent_type="qualia-planner", description="Revise plan phase {N}")
  ```

- After revision, spawn the checker again. Max 2 total revision cycles.
+ After revision, re-spawn checker. Max 2 total cycles.
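
Editorial sketch, not part of the package diff: the check/revise loop described in this hunk could be driven roughly as follows. `check` and `revise` are hypothetical stand-ins for the plan-checker and planner spawns, kept synchronous only to keep the sketch short. It assumes one reading of "max 2 cycles": at most two revise-and-re-check rounds, then escalation if the verdict is still not PASS.

```javascript
// Hypothetical driver for the plan-checker revision loop.
// check(plan)          -> { verdict: "PASS" | "REVISE" | "BLOCKED", issues?: string[] }
// revise(plan, issues) -> revised plan
function validatePlan(plan, check, revise, maxCycles = 2) {
  let result = check(plan);
  let revisions = 0;
  // Only a REVISE verdict triggers another planner pass; BLOCKED escalates at once.
  while (result.verdict === "REVISE" && revisions < maxCycles) {
    plan = revise(plan, result.issues);
    revisions += 1;
    result = check(plan);
  }
  const passed = result.verdict === "PASS";
  return {
    plan,
    passed,
    escalate: !passed, // caller shows remaining issues and asks the user
    revisions,
    issues: passed ? [] : result.issues || [],
  };
}
```

Capping the loop in the driver rather than in the planner prompt keeps the limit deterministic regardless of what either agent returns.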
  
- **If checker returns BLOCKED after 2 cycles:**
+ **BLOCKED after 2 cycles:**

  ```bash
  node ~/.claude/bin/qualia-ui.js fail "Plan failed validation after 2 revisions"
  ```

- Show the remaining issues. Ask:
- - "Skip validation and proceed anyway" (use `--skip-check`)
- - "Adjust the roadmap" (phase scope may be wrong)
- - "Adjust the phase goal" (success criteria may be under-specified)
+ Show remaining issues. Options:
+ - "Skip validation" (`--skip-check`)
+ - "Adjust roadmap" (scope wrong)
+ - "Adjust phase goal" (criteria under-specified)

  ### 5. Present Final Plan

- Render the story-file dashboard — this is a single command that parses the plan and shows the phase goal, waves, tasks, personas, dependencies, acceptance-criteria counts, and validation counts:
+ Render story-file dashboard:

  ```bash
  node ~/.claude/bin/qualia-ui.js plan-summary .planning/phase-{N}-plan.md
@@ -184,19 +177,19 @@ If "adjust" — get feedback, re-spawn planner with revision context, re-validat
  node ~/.claude/bin/state.js transition --to planned --phase {N}
  ```

- If state.js returns an error, show it and stop. Do NOT manually edit STATE.md or tracking.json.
+ Error → show, stop. Do NOT edit STATE.md or tracking.json manually.

  ### 7. Route (auto-chain aware)

- **If invoked with `--auto`:** immediately invoke `/qualia-build {N} --auto` inline. The user already approved the whole journey at `/qualia-new` time — no additional approval needed per phase plan.
+ **`--auto`:** invoke `/qualia-build {N} --auto` inline. User approved at `/qualia-new`; no per-phase gate.

  ```bash
  node ~/.claude/bin/qualia-ui.js info "Auto mode — chaining into /qualia-build {N}"
  ```

- Then invoke the `qualia-build` skill inline with `--auto`.
+ Then invoke `qualia-build` inline with `--auto`.

- **Otherwise (guided mode):** stop and show the next step:
+ **Guided mode:** stop, show next step:

  ```bash
  node ~/.claude/bin/qualia-ui.js end "PHASE {N} PLANNED" "/qualia-build {N}"
@@ -204,22 +197,22 @@ node ~/.claude/bin/qualia-ui.js end "PHASE {N} PLANNED" "/qualia-build {N}"

  ## Gap Closure Mode (`--gaps`)

- When invoked as `/qualia-plan {N} --gaps`, the planner is in gap-closure mode:
+ With `--gaps`, planner enters gap-closure mode:

- 1. Read `.planning/phase-{N}-verification.md`, extract ONLY the FAIL items
- 2. For each FAIL item, create a targeted fix task:
-    - **Files:** specific files that failed verification
-    - **Action:** specific fix (not "fix auth"; "add session persistence check in src/lib/auth.ts signIn function")
-    - **Acceptance Criteria:** the exact verification criterion that previously failed, restated as an observable behavior
- 3. Do NOT re-plan passing items. Do NOT add new features. Gap plans are surgical.
- 4. Write to `.planning/phase-{N}-gaps-plan.md` (separate from original plan)
- 5. All gap tasks are Wave 1 (parallel) unless they share files
- 6. Plan-checker still validates the gap plan — same 7 rules apply
+ 1. Read `.planning/phase-{N}-verification.md` → extract ONLY FAIL items
+ 2. Per FAIL, create targeted fix task:
+    - **Files:** specific files that failed
+    - **Action:** specific fix (not "fix auth"; "add session persistence check in src/lib/auth.ts signIn fn")
+    - **AC:** failed criterion restated as observable behavior
+ 3. Do NOT re-plan passing items. No new features. Surgical only.
+ 4. Output: `.planning/phase-{N}-gaps-plan.md`
+ 5. All gap tasks Wave 1 (parallel) unless they share files
+ 6. Plan-checker still validates; same 7 rules

  ## Rules

- 1. **Plan-checker is mandatory by default.** Only skip with `--skip-check`, and only if you know what you're doing.
- 2. **Max 3 revision cycles.** After 3 failed checks, escalate — the phase scope is probably wrong.
- 3. **Honor locked decisions.** If phase-{N}-context.md exists, its locked decisions are non-negotiable.
- 4. **One plan file per phase.** Don't create phase-1-plan.md AND phase-1-plan-v2.md. Edit in place.
- 5. **Revision is surgical.** When revising, only fix what the checker flagged; no scope creep.
+ 1. **Plan-checker mandatory by default.** Skip only with `--skip-check`.
+ 2. **Max 3 revision cycles.** 3 fails → escalate; scope probably wrong.
+ 3. **Honor locked decisions.** phase-{N}-context.md locked decisions non-negotiable.
+ 4. **One plan file per phase.** No phase-1-plan-v2.md. Edit in place.
+ 5. **Revision is surgical.** Fix only what checker flagged; no scope creep.
@@ -0,0 +1,265 @@
+ # REFERENCE — /qualia-polish-loop
+
+ Verbatim agent prompts and operational details. Loaded on demand by SKILL.md, not carried in the system prompt. Per progressive-disclosure discipline (Matt Pocock): the agent reads SKILL.md first, then this file when it needs the spawn templates.
+
+ ## Architecture summary
+
+ ```
+ SKILL.md driver (Claude session)
+
+ ├─ scripts/playwright-capture.mjs (deterministic Node — produces PNGs)
+ │ ↓ writes /tmp/qpl-{ts}/iter-{N}/{mobile,tablet,desktop}-*.png
+
+ ├─ Agent({subagent_type: "qualia-visual-evaluator", ...})
+ │ ↓ reads PNGs, returns single JSON envelope (eval.json)
+
+ ├─ scripts/loop.mjs record (deterministic — verdict + fingerprints)
+ │ ↓ exit 0=SUCCESS, 1=CONTINUE, 3=KILLED
+
+ ├─ Agent({subagent_type: "qualia-builder", ...}) × up to 3 in parallel
+ │ ↓ each fixes ONE issue, calls scripts/loop.mjs commit-fix
+
+ └─ scripts/loop.mjs report (final markdown report)
+ ↓ writes .planning/visual-polish-loop.md
+ ```
+
+ ## Capture: backend selection
+
+ The capture script (`scripts/playwright-capture.mjs`) auto-selects in this order:
+
+ 1. `import('playwright')` — preferred when available; gives deterministic `waitUntil: 'networkidle'`
+ 2. `import('playwright-core')` — same API, lighter package
+ 3. `~/.cache/ms-playwright/chromium-{version}/chrome-{linux64,linux,mac,win}/chrome` — if Playwright was ever installed for browsers but the package isn't import-resolvable
+ 4. `which google-chrome` / `chromium` / `chromium-browser` / `chrome` — system browser fallback
+
+ For backends 3 and 4 (binary-direct), the script uses `--headless=new --screenshot --virtual-time-budget`. Less precise than Playwright's `networkidle` waiting but works without any npm dependency.
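
Editorial sketch, not part of the package diff: the ordered fallback described above amounts to "try each probe in turn and keep the first that succeeds". The probe functions and paths below are hypothetical stubs; the real script's probes (dynamic `import()`, cache-directory lookup, `which`) are not shown in this diff and would likely be asynchronous.

```javascript
// Return the first backend whose probe yields a usable handle.
function selectBackend(probes) {
  for (const { name, probe } of probes) {
    try {
      const handle = probe();
      if (handle) return { name, handle };
    } catch (err) {
      // A throwing probe counts as "not available"; fall through to the next one.
    }
  }
  return null; // all probes failed: caller prints the setup hints
}

// Stubbed probes in the documented priority order (return values are made up).
const order = [
  { name: "playwright", probe: () => { throw new Error("not installed"); } },
  { name: "playwright-core", probe: () => null },
  { name: "cached-chromium", probe: () => "/hypothetical/cache/chrome" },
  { name: "system-chrome", probe: () => "/usr/bin/google-chrome" },
];

console.log(selectBackend(order).name); // cached-chromium
```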
+
+ Setup hints if all four fail:
+
+ ```bash
+ # Option A — Playwright (best stability)
+ npm i -D playwright && npx playwright install chromium
+
+ # Option B — system Chrome (fastest setup if you already have Chrome installed)
+ # (no action needed if google-chrome is on PATH)
+ ```
+
+ ## Vision-evaluator spawn template (VERBATIM)
+
+ The vision evaluator's anchored discipline: **DEFAULT TO 3.** Only score above 3 with a cited design principle. Only score below 3 with a quoted violation. Without anchoring, vision models return "looks great!" to everything — that failure mode is the entire reason this loop exists. The full rubric criteria live in `agents/visual-evaluator.md`; this section is the spawn template.
+
+ When the loop reaches step 2 (Evaluate), spawn ONE agent with the screenshots, brief, rubric, and previous-iteration context. Inline this prompt verbatim — do not paraphrase.
+
+ ```
+ Agent({
+ subagent_type: "qualia-visual-evaluator",
+ description: "Score iteration {N} screenshots against rubric",
+ prompt: `
+ Role: @~/.claude/agents/visual-evaluator.md
+
+ <rubric>
+ {INLINE rules/design-rubric.md §"The 8 dimensions" through §"Aggregate score"}
+ </rubric>
+
+ <brief>
+ {INLINE the relevant excerpt from .planning/DESIGN.md — sections "Direction", "Color", "Typography"}
+ </brief>
+
+ <product>
+ {INLINE the relevant excerpt from .planning/PRODUCT.md — register, voice, anti-references}
+ </product>
+
+ <screenshots>
+ - mobile (375px): /tmp/qpl-{ts}/iter-{N}/mobile-375.png
+ - tablet (768px): /tmp/qpl-{ts}/iter-{N}/tablet-768.png
+ - desktop (1440px): /tmp/qpl-{ts}/iter-{N}/desktop-1440.png
+ </screenshots>
+
+ <viewport_meta>
+ { "reduced_motion": {true|false}, "viewport_widths": [375, 768, 1440] }
+ </viewport_meta>
+
+ <previous_iteration>
+ {If N > 1, INLINE eval.json.top_issues from iter-{N-1} so the evaluator can verify regression vs improvement. Otherwise: "(first iteration — no prior data)"}
+ </previous_iteration>
+
+ <task>
+ This is iteration {N} of {max}. Read each screenshot. Score every dimension 1-5 with one-line evidence per dimension per viewport. Return a single fenced JSON block per the contract in your role file. No prose outside the JSON.
+ </task>
+ `
+ })
+ ```
+
+ The evaluator's role file (`agents/visual-evaluator.md`) carries the trust-boundary block, the calibration examples, and the JSON output contract. Together with this spawn template, the prompt prefix is stable across iterations — Anthropic prompt caching reuses the role + rubric + brief prefix, so the per-iteration cost is roughly: 3 image reads + the previous-iteration delta.
+
+ ## Fix-builder spawn template (VERBATIM)
+
+ When the loop has 1-3 issues to fix, spawn one builder per issue IN THE SAME RESPONSE TURN (parallel). Each fixes one dimension, narrowly.
+
+ ```
+ Agent({
+ subagent_type: "qualia-builder",
+ description: "Fix {dim} issue: {short description}",
+ prompt: `
+ Role: @~/.claude/agents/builder.md
+
+ <phase_context>
+ You are inside /qualia-polish-loop iteration {N}. The vision evaluator scored
+ the {dim} dimension at {score}. Your single task: fix that one dimension.
+
+ <design>
+ {INLINE .planning/DESIGN.md tokens relevant to {dim}}
+ </design>
+
+ <product>
+ {INLINE .planning/PRODUCT.md voice + register}
+ </product>
+ </phase_context>
+
+ <task_context>
+ # Issue
+ - Dimension: {dim}
+ - Severity: {severity}
+ - Description: {description}
+ - Likely file: {likely_file or "(infer from grep — start at the path the screenshot suggests)"}
+ - Recommended fix: {fix}
+
+ # Files probably affected
+ {1-3 candidate paths the loop has inferred from the URL routing}
+ </task_context>
+
+ <task>
+ 1. Read the likely file. If the issue is in a different file, follow the import graph until you find the source.
+ 2. Make the MINIMUM edit to fix this one dimension. Do not refactor. Do not change logic. Do not touch state management. Do not change copy unless this is a microcopy issue.
+ 3. Use design tokens from DESIGN.md. Do not invent new color values, font names, or spacing.
+ 4. After the edit, commit via the orchestrator (slop-detect-gated):
+ node ~/.claude/skills/qualia-polish-loop/scripts/loop.mjs commit-fix --state {STATE} --file {file} --slug {dim}-{short-keyword}
+ If slop-detect blocks (exit 2), READ the slop output and re-edit. If you cannot fix without violating slop-detect, return BLOCKED with the conflict.
+ 5. Return DONE with: file modified, lines changed, slop-detect: pass, commit: {sha}.
+ </task>
+
+ <rules>
+ - Vision says: {evidence from eval.json.viewport_results[].evidence[{dim}]}
+ - Do not add features.
+ - Do not write tests for this fix (the loop's next iteration is the test).
+ - Single commit. The orchestrator handles the slug + iteration prefix.
+ </rules>
+ `
+ })
+ ```
+
+ ## Iteration log entry (what `loop.mjs record` writes to state.json.iterations[])
+
+ ```json
+ {
+   "iteration": 1,
+   "scores": { "typography": 1, "color": 1, "spatial": 3, "layout": 1, "shadow": 3, "motion": 3, "microcopy": 1, "container": 1 },
+   "aggregate": 14,
+   "pass": false,
+   "failing_dims": ["typography", "color", "layout", "microcopy", "container"],
+   "top_issues": [
+     { "dim": "color", "severity": "critical", "description": "blue→purple gradient on hero", "likely_file": "src/styles/globals.css", "fix": "replace linear-gradient with single accent var(--accent)" },
+     { "dim": "typography", "severity": "critical", "description": "Inter as primary font-family", "likely_file": "src/styles/globals.css", "fix": "swap to Fraunces + JetBrains Mono per DESIGN.md §3" },
+     { "dim": "layout", "severity": "high", "description": "three identical feature cards in section 2", "likely_file": "src/pages/index.tsx", "fix": "vary card sizes per design-brand.md §Layout" }
+   ],
+   "tokens_used": 14500,
+   "timestamp": "2026-05-03T12:34:56.000Z"
+ }
+ ```
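
Editorial sketch, not part of the package diff: the `aggregate`, `pass`, and `failing_dims` fields in the entry above can be derived from `scores` alone. The pass threshold of 3 is an assumption inferred from the sample entry and the sample report (dimensions scoring 1 or 2 are listed as failing, 3 is not); the actual rule lives in `scripts/score.mjs`, which this diff does not show.

```javascript
// Assumed threshold: a dimension fails when its score is below 3.
const PASS_THRESHOLD = 3;

function summarize(scores) {
  const entries = Object.entries(scores);
  const aggregate = entries.reduce((sum, [, score]) => sum + score, 0);
  const failing_dims = entries
    .filter(([, score]) => score < PASS_THRESHOLD)
    .map(([dim]) => dim);
  return { aggregate, pass: failing_dims.length === 0, failing_dims };
}

const sample = { typography: 1, color: 1, spatial: 3, layout: 1, shadow: 3, motion: 3, microcopy: 1, container: 1 };
console.log(summarize(sample).aggregate); // 14
```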
+
+ ## Issue fingerprint (regression detection)
+
+ The orchestrator computes a fingerprint per top_issue for each iteration:
+
+ ```
+ fingerprint = `${dim}__${path.basename(likely_file)}__${first_32_chars_of_description}`
+ .toLowerCase().replace(/\W+/g, "_")
+ ```
+
+ State stores `state.fingerprints[fingerprint] = { iterations: [1,2,3], description, dim }`. The KILL trigger is **3 consecutive integer iterations in `iterations[]`** — non-consecutive recurrences don't kill (the issue may have been fixed, broken by a different change, then refixed; that's a different signal than "fix-builder cannot fix this").
+
+ When the kill trigger fires, the verdict becomes `killed_regression` and `state.kill_fingerprint` records which one. The user can `cat state.json | jq '.fingerprints | to_entries | map(select(.key == "{fingerprint}"))'` to see the recurrence pattern.
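
Editorial sketch, not part of the package diff: the fingerprint formula and the "3 consecutive iterations" kill check can be written out as below. `basename` is a stand-in for Node's `path.basename` so the sketch needs no imports, and `hasConsecutiveRun` is a hypothetical helper name; the real logic in `scripts/loop.mjs` may differ in detail.

```javascript
// Stand-in for path.basename (POSIX-style paths only).
const basename = (p) => p.split("/").pop();

// dim + file basename + first 32 chars of the description, lowercased,
// with every run of non-word characters collapsed to "_".
function fingerprint({ dim, likely_file, description }) {
  return `${dim}__${basename(likely_file)}__${description.slice(0, 32)}`
    .toLowerCase()
    .replace(/\W+/g, "_");
}

// True when `iterations` contains `runLength` consecutive integers.
function hasConsecutiveRun(iterations, runLength = 3) {
  const sorted = [...new Set(iterations)].sort((a, b) => a - b);
  let run = 1;
  for (let i = 1; i < sorted.length; i++) {
    run = sorted[i] === sorted[i - 1] + 1 ? run + 1 : 1;
    if (run >= runLength) return true;
  }
  return runLength <= 1 && sorted.length > 0;
}

const issue = {
  dim: "color",
  likely_file: "src/styles/globals.css",
  description: "blue→purple gradient on hero",
};
console.log(fingerprint(issue)); // color__globals_css__blue_purple_gradient_on_hero
console.log(hasConsecutiveRun([1, 2, 3])); // true
console.log(hasConsecutiveRun([1, 3, 4])); // false
```

Because the description is truncated before normalisation, two issues on the same file whose descriptions share their first 32 characters would collide under this scheme.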
+
+ ## Token-budget table
+
+ | Iterations | Tokens (est.) | Sized for |
+ |---|---|---|
+ | 2 | ~30K | known-clean page sanity-check |
+ | 4 | ~60K | mid-confidence |
+ | 6 | ~90K | default |
+ | 8 | ~120K | hard cap; pass `--budget 150000` to allow |
+
+ Per-iteration cost (rough):
+ - 3 screenshot reads ≈ 9K
+ - rubric + brief inlined ≈ 2K (cached after iter 1)
+ - previous-iteration delta ≈ 0.5K
+ - 3 fix-builder spawns × (file read + edit + commit-fix call) ≈ 3K
+ - **per-iteration ≈ 14.5K**
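
Editorial sketch, not part of the package diff: the per-iteration figures above add up to 14.5K, which reproduces the table rows when multiplied out. The constant and function names are hypothetical, the estimate ignores the prompt-caching discount noted for the rubric and brief, and the 100000 budget used in the example is taken from the sample report later in this file rather than from a documented default.

```javascript
// Rough per-iteration token costs, copied from the list above.
const PER_ITERATION_TOKENS = {
  screenshotReads: 9000,
  rubricAndBrief: 2000,
  previousIterationDelta: 500,
  fixBuilderSpawns: 3000,
};

function estimateTokens(iterations) {
  const perIteration = Object.values(PER_ITERATION_TOKENS).reduce((a, b) => a + b, 0);
  return iterations * perIteration;
}

function fitsBudget(iterations, budget) {
  return estimateTokens(iterations) <= budget;
}

console.log(estimateTokens(4)); // 58000
console.log(fitsBudget(8, 100000)); // false
console.log(fitsBudget(8, 150000)); // true
```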
+
+ ## Self-test scenarios (mapping to spec)
+
+ | # | Fixture | Expected | Verifier |
+ |---|---|---|---|
+ | 1 | `fixtures/clean.html` | SUCCESS in 1-2 iterations, all dims ≥ 4 | run capture, run evaluator inline, assert pass |
+ | 2 | `fixtures/broken.html` | SUCCESS in 4-6 iters; identifies banned font + gradient + 3-card grid + side-stripe + generic CTA | each fix-builder commits a `qpl-N:` change; final eval all dims ≥ 3 |
+ | 3 | Kill-switch | KILL at iter ≤ 4 with `LOOP_REGRESSION_DETECTED` | call `loop.mjs record` 3× with the same fingerprint; assert exit 3 + correct verdict |
+
+ The pilot-results doc at `docs/playwright-loop-pilot-results.md` records the actual outcome from `bash scripts/_self-tests.sh` (Scenario 3 is exercised by a deterministic unit-style invocation; Scenarios 1+2 require a real vision pass and are run by Claude when the loop ships).
+
+ ## Final report template (what `loop.mjs report` emits to stdout)
+
+ ```markdown
+ # Visual-Polish Loop Report
+
+ - **URL:** http://localhost:3000
+ - **Brief:** .planning/DESIGN.md
+ - **Started:** 2026-05-03T12:00:00Z
+ - **Final verdict:** SUCCESS
+ - **Iterations:** 4 / 8
+ - **Tokens used:** 58000 / 100000
+ - **Fixes committed:** 7
+
+ ## Iteration log
+
+ ### Iteration 1
+ - Scores: typo=1 colo=1 spat=3 layo=1 shad=3 moti=3 micr=1 cont=1
+ - Aggregate: 14/40 (avg 1.75)
+ - Pass: NO (failing: typography, color, layout, microcopy, container)
+ - Top issues:
+ - **color** [critical] blue→purple gradient on hero → src/styles/globals.css
+ - **typography** [critical] Inter as primary → src/styles/globals.css
+ - **layout** [high] three identical cards → src/pages/index.tsx
+
+ ### Iteration 2
+ - Scores: typo=3 colo=3 spat=3 layo=2 shad=3 moti=3 micr=2 cont=2
+ - Aggregate: 21/40 (avg 2.62)
+ - Pass: NO (failing: layout, microcopy, container)
+ - ...
+
+ ### Iteration 3
+ - Scores: typo=4 colo=3 spat=3 layo=3 shad=3 moti=3 micr=3 cont=3
+ - Aggregate: 25/40 (avg 3.13)
+ - Pass: YES
+
+ ## Fix commits (revertable)
+ - abc1234 qpl-1: color-gradient-removal — src/styles/globals.css
+ - def5678 qpl-1: typography-fraunces — src/styles/globals.css
+ - ...
+
+ ## Issue fingerprints (regression tracker)
+ - color__globals_css__blue_purple_gradient — iterations [1] — fixed at iter 2
+ ```
+
+ ## Why three viewports
+
+ Per the spec's hard constraint (§5g `prefers-reduced-motion` and §5c mobile-only failures), the loop MUST evaluate at mobile (375), tablet (768), and desktop (1440). The aggregate score is the **minimum** across viewports for each dimension — a layout that's elegant on desktop but breaks at 375 is a fail, full stop.
+
+ This is intentional. Most visual regressions Fawzi has documented in `/insights` (hero videos cropped wrong on mobile, touch targets < 44px on mobile, navigation collapse misbehaving) only show up below 768. Scoring on desktop alone is how we got "looks great in dev" → "looks broken on the user's phone."
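
Editorial sketch, not part of the package diff: "minimum across viewports for each dimension" can be computed as below. The input shape (one score map per viewport) and the function name are assumptions for illustration; the evaluator's real envelope is defined in `agents/visual-evaluator.md`, which this diff does not show.

```javascript
// Collapse per-viewport scores into one score per dimension by taking the minimum.
function minAcrossViewports(viewportScores) {
  const merged = {};
  for (const scores of Object.values(viewportScores)) {
    for (const [dim, score] of Object.entries(scores)) {
      merged[dim] = dim in merged ? Math.min(merged[dim], score) : score;
    }
  }
  return merged;
}

// A layout that scores 5 on desktop but 2 on mobile is scored 2 overall.
const perViewport = {
  mobile: { layout: 2, color: 4 },
  tablet: { layout: 4, color: 4 },
  desktop: { layout: 5, color: 3 },
};
console.log(minAcrossViewports(perViewport)); // { layout: 2, color: 3 }
```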
+
+ ## What the loop does NOT do (deferred to v5.2)
+
+ - Cross-browser rendering checks (Firefox / WebKit) — Chromium-only, per `qualia-polish` Stage 4 precedent
+ - Accessibility audits beyond what the rubric scores — use `/qualia-polish` Stage 3 (Lighthouse + axe) for that
+ - Performance regressions — use `/qualia-polish-loop` only after Lighthouse score passes
+ - Reference-image-only mode (compare to a target screenshot without a brief) — currently the brief is required; reference is supplemental
+ - Multi-page sweeps — one URL per invocation; chain `/qualia-polish-loop` per route for site-wide passes