qualia-framework 6.7.1 → 6.8.1

package/CHANGELOG.md ADDED
@@ -0,0 +1,2314 @@
1
+ # Changelog
2
+
3
+ All notable changes to the Qualia Framework are documented here.
4
+
5
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
6
+ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
+
8
+ > Note: git tags for historical versions were not retained; commit references are approximate
9
+ > and dates reflect commit history rather than npm publish timestamps.
10
+
11
+ ## [6.8.1] - 2026-06-10 (installer hygiene — global hook sweep, bin purge, bak routing)
12
+
13
+ Found via a live audit of an installed `~/.claude`: the retired brain experiment's `brain-inject.js` was still wired under `UserPromptSubmit`, firing a failing node process on every user prompt, and had survived four releases because the settings merge never looked at that event.
14
+
15
+ ### Fixed — installer
16
+ - **Settings hook sweep is now global.** The merge previously only cleaned events present in `qualiaHooks`, so deprecated entries under other events (`UserPromptSubmit`, etc.) were never pruned. The installer now sweeps every hook event for (a) retired Qualia hook commands and (b) `node "<CLAUDE_DIR>/..."` commands whose target file no longer exists, dropping empty blocks/events. User hooks (non-node or outside `CLAUDE_DIR`) are untouched.
17
+ - **`brain-inject.js` added to `DEPRECATED_HOOKS`** — the UserPromptSubmit half of the brain experiment was missing from the v6.8.0 prune list.
18
+ - **`bin/` orphan purge** — `DEPRECATED_BIN` removes the retired `build-brain-index.js`; bin/ previously had no deprecation pass at all.
19
+ - **`.bak` files routed to `backups/`** — `settings.json.bak.*` / `CLAUDE.md.bak.*` no longer accumulate in the `~/.claude` root across reinstalls; all three backup sites now write into a `backups/` subdir via a shared `bakPath()` helper.
20
+
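The global sweep described above can be sketched roughly as follows. This is a minimal illustration, not the installer's actual code: it assumes the `settings.json` layout `hooks[event] = [{ hooks: [{ command }] }]`, and `sweepHooks`, `shouldDrop`, and `nodeTarget` are hypothetical names. As a simplification, retired hooks are matched only when they appear as `node "<path>"` commands inside the Claude directory.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Illustrative list; the changelog names brain-inject.js as one retired hook.
const DEPRECATED_HOOKS = ["brain-inject.js"];

// Extract the script path from a `node "<path>"` command, or null if it is not one.
function nodeTarget(command) {
  const m = /^node\s+"([^"]+)"/.exec(command);
  return m ? m[1] : null;
}

// Decide whether a single hook command should be pruned.
function shouldDrop(command, claudeDir) {
  const target = nodeTarget(command);
  if (target === null) return false; // non-node user hook: keep
  const rel = path.relative(claudeDir, target);
  const inside = rel !== "" && !rel.startsWith("..") && !path.isAbsolute(rel);
  if (!inside) return false; // outside the Claude directory: keep
  if (DEPRECATED_HOOKS.includes(path.basename(target))) return true; // retired hook
  return !fs.existsSync(target); // dangling target
}

// Sweep every event, dropping pruned commands, then empty blocks, then empty events.
function sweepHooks(settings, claudeDir) {
  const hooks = {};
  for (const [event, blocks] of Object.entries(settings.hooks || {})) {
    const kept = blocks
      .map((block) => ({
        ...block,
        hooks: (block.hooks || []).filter((h) => !shouldDrop(h.command || "", claudeDir)),
      }))
      .filter((block) => block.hooks.length > 0);
    if (kept.length > 0) hooks[event] = kept;
  }
  return { ...settings, hooks };
}
```

Iterating over `Object.entries(settings.hooks)` rather than over a known list of events is the point of the fix: an event such as `UserPromptSubmit` is visited even when the framework no longer registers anything under it.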
21
+ ## [6.8.0] - 2026-06-06 (audit remediation — installer, guards, trust-score, polish)
22
+
23
+ Closes the verified findings from the 2026-06-06 framework audit (4 CRITICAL · 7 HIGH · 33 MEDIUM, adversarially verified). Root cause of most CRITICALs was a stale/incomplete install, not source rot — fixed in the installer so a clean reinstall is complete and correct.
24
+
25
+ ### Fixed — installer (the highest-leverage fixes)
26
+ - **Installed CLI no longer crashes.** `bin/install.js` now writes a root `package.json` (version marker) into the install, and `bin/cli.js` guards the `require("../package.json")` read (falls back to `.qualia-config.json`). Previously the installed `qualia-framework` CLI failed at startup with `MODULE_NOT_FOUND`.
27
+ - **`references/archetypes/*.md` now install to the canonical `references/` path** (in addition to `qualia-references/`), so `/qualia-scope` and `constitution.md` resolve their archetype DoDs. `qualia-scope` reads them via absolute paths.
28
+ - **Companion docs shipped:** `SOUL.md`, `FLAGS.md`, `TROUBLESHOOTING.md`, `CHANGELOG.md` are now copied by the installer and included in the npm package `files`.
29
+ - **`auto-report.js` added to the runtime manifest** so the Stop-hook ship-time auto-report is actually installed; `stop-session-log.js` guards the spawn with `fs.existsSync`.
30
+ - **Installer is idempotent for hooks:** reinstall prunes orphan `.js` hooks (e.g. legacy `brain-*.js`) and `DEPRECATED_HOOKS` lists them.
31
+
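A guarded version read with a fallback, as described in the first item above, might look like the sketch below. The function name `readInstalledVersion`, the `version` field inside `.qualia-config.json`, and the `"unknown"` sentinel are assumptions made for illustration.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Read the installed version, preferring package.json and falling back to
// .qualia-config.json. The `version` key in the config file is an assumption.
function readInstalledVersion(installDir) {
  for (const name of ["package.json", ".qualia-config.json"]) {
    try {
      const raw = fs.readFileSync(path.join(installDir, name), "utf8");
      const version = JSON.parse(raw).version;
      if (typeof version === "string" && version !== "") return version;
    } catch {
      // Missing or malformed file: fall through to the next candidate.
    }
  }
  return "unknown"; // hypothetical sentinel; never throws
}
```

Reading the file with `fs.readFileSync` inside a `try` avoids the uncaught `MODULE_NOT_FOUND` that a bare `require` raises when the file is absent.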
32
+ ### Fixed — guards fail closed (were failing open / silent)
33
+ - **`pre-deploy-gate`:** hook-registration timeout raised 180→600s and per-gate budgets lowered to 120s so four quality gates + security scan can't exhaust the hook timeout and let a deploy through. Brownfield ship-policy short-circuit now logs to stderr. `QUALIA_SKIP_LINT` is OWNER-gated.
34
+ - **`migration-guard`:** block reason now printed to **stderr** (was stdout, invisible to the model on exit 2).
35
+ - **`security-scan`:** unreadable files are recorded in `skipped[]`, degrade the score, and force a non-clean exit — no more silent fail-open per file.
36
+ - **`trust-score`:** now load-probes spine scripts (`node --check`) that are present, so a syntax-broken installed script forces status off PASS (it previously reported 100/PASS regardless).
37
+
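The fail-closed behavior described for `security-scan` can be illustrated with a small sketch. The scoring weights, the result shape, and the name `scanFiles` are hypothetical; only the idea (unreadable files land in `skipped[]`, lower the score, and prevent a clean exit) comes from the entry above.

```javascript
const fs = require("node:fs");

// Scan files for a pattern. Unreadable files are recorded in skipped[] and
// prevent a clean result. Pass a regex without the `g` flag, because global
// regexes keep state between test() calls.
function scanFiles(paths, pattern) {
  const findings = [];
  const skipped = [];
  for (const p of paths) {
    let text;
    try {
      text = fs.readFileSync(p, "utf8");
    } catch (err) {
      skipped.push({ path: p, reason: err.code || String(err) });
      continue;
    }
    if (pattern.test(text)) findings.push(p);
  }
  // Hypothetical scoring: fixed penalty per finding and per skipped file.
  const score = Math.max(0, 100 - 25 * findings.length - 10 * skipped.length);
  const clean = findings.length === 0 && skipped.length === 0;
  return { findings, skipped, score, exitCode: clean ? 0 : 1 };
}
```

The key design choice is that `clean` depends on `skipped` as well as `findings`, so a file the scanner could not read is never treated as a file that passed.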
38
+ ### Changed — config
39
+ - **Renamed the env key `CLAUDE_AGENT_FORK_ENABLED` → `CLAUDE_CODE_FORK_SUBAGENT`** — the official Claude Code variable (v2.1.117+); the old name was a no-op. Installer + `cli.js` repair env now agree.
40
+ - Installer writes a `$schema` reference and seeds a scoped `permissions.allow` baseline instead of leaving it empty.
41
+
42
+ ### Fixed — skills, agents, rules, docs
43
+ - Repointed dangling command routes (`/qualia-status`→`/qualia-doctor`, `/qualia-debug`→`/qualia-fix`, `/qualia-pause` and `/qualia-resume`→`/qualia`/`/qualia-handoff`), fixed the `~/qualia-memory` path, and added 6 omitted skills to the `/qualia-road` map.
44
+ - Added `rules/grounding.md`/`rules/trust-boundary.md` references to agents that were missing them; granted the verifier `Write` and qa-browser `Edit`, which their contracts need in order to write the verification file.
45
+ - Fixed stale rule paths (`rules/design-rubric.md`→`qualia-design/design-rubric.md`), replaced the non-existent `/supabase` skill references with `npx supabase`, annotated knowledge index stubs.
46
+
47
+ ### Improved — polish subsystem
48
+ - Screenshot capture is now **full-page** (was above-the-fold only) across all backends; visual-evaluator scores below-fold sections.
49
+ - `--vibe` now routes back through verification (tsc + slop-detect + a scoped vision pass) instead of terminating at the token swap.
50
+ - Deterministic loop token accounting (fixed per-iteration cost) and an iteration-monotonicity assertion; single-sourced design rubric + the 10-font banned list (lockstep with `bin/slop-detect.mjs`).
51
+
52
+ ## [6.7.1] - 2026-05-31 (docs patch — A4 surface-collapse follow-up)
53
+
54
+ Documentation-only patch. No code or behavior changes.
55
+
56
+ ### Fixed
57
+
58
+ - **User-facing help docs caught up with the A4 surface collapse.** `docs/onboarding.html` and `templates/help.html` still advertised `/qualia-discuss` as a live slash command, though it was retired into `qualia-scope` in 6.7.0 (`RETIRED_SKILLS`, PR #64). Both now point at `/qualia-scope` with accurate PROJECT/PHASE-mode descriptions (PR #66). No code path was affected — `qualia-discuss` was already absent from `ACTIVE_SKILLS`.
59
+
60
+ ## [6.7.0] - 2026-05-31 (A5 — per-increment, concurrency-aware state model)
61
+
62
+ Additive release. No breaking changes. Completes A5: multiple people can work the same project on different branches without merge-conflicting on the planning files. The change is **dual-layout** — projects without `.planning/increments/` run the exact legacy code path (all 65 prior state tests pass unchanged); the new path activates only after `state.js migrate`.
63
+
64
+ ### Added
65
+
66
+ - **Per-increment layout (`bin/state.js`)** — the committed source of truth becomes one file per increment, `.planning/increments/{id}.md` (frontmatter: `status`, `claimed_by`, `branch`, `verification`, `gap_cycles`, build counters, `deployed_url`; body carries a Definition-of-Done checklist), wrapped in `.planning/releases/{rid}.md`. Two people editing two increments edit two files → **no conflict by construction**.
67
+ - **`STATE.md` + `tracking.json` + `.cursor.json` are now LOCAL generated views** — `regenerateViews()` rebuilds them from the increment files. `tracking.json` stays **fat** (mirrors the active increment + keeps identity/lifetime + gains an `increments`/`releases`/`mode` index), so all 13 back-compat consumers (`report-payload.js`, `project-snapshot.js`, `trust-score.js`, `statusline.js`, hooks, …) keep working with zero edits. They're gitignored, so the only per-step-mutated files are never committed.
68
+ - **`state.js migrate`** — non-destructive conversion of a legacy project: backs up `STATE.md`+`tracking.json`, writes a reversible `migration-manifest.json`, is idempotent, and supports `migrate --revert` (byte-identical restore). Hard-stop, human-initiated — never silent.
69
+ - **`state.js claim` / `state.js release`** — claim writes `claimed_by`+`branch` onto one increment (a second actor is refused with `ALREADY_CLAIMED`); release ships the increment, runs the **profile-aware Definition-of-Done gate** (strict: any open `- [ ]` blocks; standard: open lines need a `WAIVED:` reason; `--force` overrides), and clears the claim.
70
+ - **`templates/planning.gitignore`** — installed into `.planning/.gitignore` by `qualia-new`; lists the now-local generated files.
71
+
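The profile-aware Definition-of-Done gate from the `claim`/`release` entry above can be sketched as a pure function. This is an illustration under assumptions: the name `dodGate`, the return shape, and the exact matching of `WAIVED:` on the same line as the open checkbox are not confirmed by the changelog.

```javascript
// Evaluate a Definition-of-Done checklist under a profile.
// Returns { ok, blocking }, where blocking lists the lines that stop the release.
function dodGate(body, profile, { force = false } = {}) {
  // Open items are Markdown checkboxes that have not been ticked.
  const open = body.split("\n").filter((line) => /^\s*- \[ \]/.test(line));
  const blocking =
    profile === "standard"
      ? open.filter((line) => !/WAIVED:\s*\S/.test(line)) // waiver needs a reason
      : open; // strict, and anything unrecognized: every open item blocks
  return { ok: force || blocking.length === 0, blocking };
}
```

For example, a body with one ticked item, one open item carrying `WAIVED: internal tool`, and one plain open item blocks on two lines under `strict` and on one line under `standard`; `{ force: true }` passes in both cases while still reporting what was overridden.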
72
+ ### Changed
73
+
74
+ - **`check` and `transition` are increment-aware** when `.planning/increments/` exists — `check` is read-only and returns `my_claim` / `next_increment` / `claimed_increments[]` so the router skips work held by another person; `transition` operates on one increment file (by `--id`, or `--phase N`/cursor for full back-compat with the existing plan→build→verify→ship flow).
75
+ - **Skills repointed** — `qualia-scope` claims the scoped increment, `qualia-ship` releases it, `qualia` routes around others' claims, `qualia-new` scaffolds `.planning/.gitignore`. All gated on the increment layout; legacy projects are unaffected.
76
+
77
+ ### Security
78
+
79
+ - **Increment-id validation** (`isValidIncrementId`) — `claim`/`release`/`transition --id` reject any id that isn't a plain `inc-<slug>` token before it reaches `path.join`, closing a path-traversal vector (`--id "../../etc/passwd"` → `INVALID_ID`).
80
+
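A validation step of the kind described above might look like this sketch. The slug grammar (lowercase alphanumerics separated by single hyphens) and the helper `incrementPath` are assumptions; the changelog only states that ids must be a plain `inc-<slug>` token and that bad ids yield `INVALID_ID`.

```javascript
const path = require("node:path");

// Assumed slug grammar: lowercase letters and digits, with single hyphens between groups.
const INCREMENT_ID = /^inc-[a-z0-9]+(?:-[a-z0-9]+)*$/;

function isValidIncrementId(id) {
  return typeof id === "string" && INCREMENT_ID.test(id);
}

// Validate first, then build the path, so untrusted input never reaches path.join.
function incrementPath(planningDir, id) {
  if (!isValidIncrementId(id)) {
    const err = new Error(`invalid increment id: ${JSON.stringify(id)}`);
    err.code = "INVALID_ID";
    throw err;
  }
  return path.join(planningDir, "increments", `${id}.md`);
}
```

An allow-list pattern is used instead of stripping `..` segments, since anything containing `/`, `.`, or whitespace simply fails to match.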
81
+ ### Tested
82
+
83
+ - 14 new `tests/state.test.sh` assertions (state suite 65 → 79, all green): migration + idempotency + byte-identical revert, claim/double-claim, router skip, release + DoD gate (strict/standard/force), id-traversal guard, and the **two-actor zero-conflict keystone** (two clones ship two increments on two branches → merge to main with zero conflicts, `STATE.md` never committed). All 10 suites green.
84
+
85
+ ## [6.6.0] - 2026-05-31 (B1 auto-capture — ship-time ERP reporting)
86
+
87
+ Additive release. No breaking changes. Completes B1: the ERP receives a session report **automatically when work ships**, with no manual `/qualia-report` — closing the "data is self-reported and optional" gap.
88
+
89
+ ### Added
90
+
91
+ - **`bin/auto-report.js`** — ship-time auto-capture. `maybeAutoReport()` fires ONLY when a project reaches `status: shipped`, deduped once per shipped `(milestone, phase)` via a marker file, and POSTs a session report tagged `source: 'auto'`. Fail-soft: enqueues to the existing `erp-retry` queue on any failure, never throws, never blocks. (Deliberately NOT triggered per-turn — the Stop hook fires every turn, so a naive hook there would spam the ERP with a new report every few minutes.)
92
+ - **`source` provenance on the report payload** — `bin/report-payload.js` `buildPayload()` tags `source` (`'auto'` when `SOURCE=auto`, else `'manual'`) and honors `DRY_RUN` for safe testing. The ERP side (`session_reports.source`) shipped separately in `qualia-erp`.
93
+
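The once-per-ship dedupe via a marker file can be sketched with an exclusive file create. The marker naming scheme and the function name `claimShipMarker` are hypothetical, and the sketch assumes `milestone` and `phase` are simple tokens that are safe to use in a file name.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Returns true the first time a (milestone, phase) pair is seen and false afterwards.
// The "wx" flag makes the write fail with EEXIST if the marker already exists.
function claimShipMarker(markerDir, milestone, phase) {
  fs.mkdirSync(markerDir, { recursive: true });
  const marker = path.join(markerDir, `shipped-${milestone}-${phase}.marker`);
  try {
    fs.writeFileSync(marker, new Date().toISOString(), { flag: "wx" });
    return true;
  } catch (err) {
    if (err.code === "EEXIST") return false; // already reported for this ship
    throw err;
  }
}
```

Creating the marker with `wx` combines the existence check and the write in one call, so two hook invocations racing on the same ship cannot both report.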
94
+ ### Changed
95
+
96
+ - **`hooks/stop-session-log.js`** — spawns `auto-report.js` **detached** (fire-and-forget) so the hook stays fast and non-blocking; the auto-report's own guards make it a no-op except the one turn right after a ship.
97
+ - **`bin/erp-retry.js`** — exports `postOnce` / `readApiKey` / `readConfig` so auto-report and the manual `/qualia-report` flow share ONE ERP-upload seam.
98
+
99
+ ### Tested
100
+
101
+ - New `tests/auto-report.test.sh` (14 assertions): source tagging, every guard (no key, ERP disabled, not-shipped), ship-time POST delivering `source:'auto'` to a stub endpoint, dedupe, fail-soft enqueue. All 10 suites green. E2E-verified against the live ERP (a real `source='auto'` row landed, then was removed).
102
+
103
+ ## [6.5.0] - 2026-05-30 (Seniority profiles + archetype-aware intake hardening)
104
+
105
+ Additive release. No breaking changes. Completes the A1/A3 intake work, adds the seniority profile primitive, and hardens the router against state drift.
106
+
107
+ ### Added
108
+
109
+ - **Seniority profile (`strict` / `standard`)** — a project now carries a `Profile:` field (STATE.md + tracking.json), resolved as `$QUALIA_PROFILE` (env wins) → STATE.md → tracking.json → `strict` (safe default; anything but the exact string `standard` coerces to `strict`). `state.js check` surfaces it; `state.js init --profile` sets it and re-init preserves it. `state.js` only stores/surfaces the field — gate enforcement is a documented contract owned by the consuming skill (`qualia-scope`), not enforced in `state.js`. Router + `guide.md` document the semantics: `strict` = hard gates, no waivers; `standard` = advisory, waiver logged to `.planning/decisions/`. (`bin/state.js`, `skills/qualia/SKILL.md`, `guide.md`)
110
+ - **`web-app` + `voice-agent` archetypes** — completes the archetype Definition-of-Done set that `qualia-scope` reads at runtime (`references/archetypes/${ARCHETYPE}.md`); the skill previously broke for these two selections. `qualia-scope` also registered in `bin/command-surface.js` `ACTIVE_SKILLS`. (`references/archetypes/web-app.md`, `references/archetypes/voice-agent.md`)
111
+
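The profile resolution order described above (environment, then `STATE.md`, then `tracking.json`, then `strict`) can be sketched as follows. The function names, the `tracking.profile` key, and the treatment of an empty environment value as unset are assumptions.

```javascript
// Anything other than the exact string "standard" coerces to "strict".
function coerceProfile(value) {
  return value === "standard" ? "standard" : "strict";
}

// Resolve in priority order: env, STATE.md, tracking.json, then the safe default.
function resolveProfile({ env = {}, stateMd = "", tracking = {} } = {}) {
  if (env.QUALIA_PROFILE !== undefined && env.QUALIA_PROFILE !== "") {
    return coerceProfile(env.QUALIA_PROFILE);
  }
  const m = /^Profile:[ \t]*(.*)$/m.exec(stateMd);
  if (m) return coerceProfile(m[1].trim());
  if (tracking.profile !== undefined) return coerceProfile(tracking.profile);
  return "strict";
}
```

Because coercion happens at every level, a typo such as `Standard` in any source yields the stricter profile rather than silently relaxing the gates.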
112
+ ### Fixed
113
+
114
+ - **Router no longer self-cancels on a drifted `STATE.md`.** `state.js check` read phase/status from `STATE.md` and returned `NO_PROJECT` (exit 1) whenever it was missing/corrupt — even with `tracking.json` intact — which made the `/qualia` router's parallel command batch cancel its siblings. `check` now reconstructs from the surviving file, exits 0, and routes to `state.js fix`; true `NO_PROJECT` (both files absent) is unchanged. (`bin/state.js`)
115
+
116
+ ### Changed
117
+
118
+ - **De-brittled skill-surface tests.** `tests/lib.test.sh`'s surface manifest and trust-score fixtures now derive from `command-surface.js` `ACTIVE_SKILLS` instead of hardcoding a count — adding or removing a skill no longer requires test edits.
119
+
120
+ ### Docs
121
+
122
+ - **A5 multi-person concurrency spec + line-cited implementation plan** (`docs/implementation-handoff.md`, `docs/a5-concurrency-plan.md`): partition state per-increment with `claimed_by`/`branch`, demote `STATE.md` to a generated dashboard, so concurrent contributors on different increments never conflict on planning files. Build deferred (highest-risk `state.js` rework — its own effort).
123
+
124
+ ### Deferred
125
+
126
+ - **Command-surface collapse.** Retiring `qualia-discuss` into `qualia-scope` is blocked on `qualia-scope` first absorbing the PROJECT MODE intake interface (`PROJECT_TYPE`, `project-discovery.md`); retiring it now would break `qualia-new` kickoff. All 26 skills remain active.
127
+
128
+ ## [6.4.0] - 2026-05-28 (Audit-driven remediation + mechanism grafts)
129
+
130
+ Substantive additive release. No breaking changes. Consolidates four feature commits previously labeled v6.3.1–v6.3.4 in commit messages (never published) plus today's CI cleanup and polish.
131
+
132
+ ### Added
133
+
134
+ - **`/qualia-idk` restored + enhanced** — three-scan diagnostic (planning + code + conversation/memory), returns guidance plus a paste-ready Qualia command sequence. Was retired in v6.3.0; owner pivot brought it back deeper. (`skills/qualia-idk/SKILL.md`)
135
+ - **`/qualia-secure`** — security audit of agent config (CLAUDE.md / settings.json / hooks / MCP). Two-pass: fast static pattern scan + optional Opus 4.7 adversarial red/blue/auditor pipeline. (`skills/qualia-secure/SKILL.md`, `bin/security-scan.js`)
136
+ - **`hooks/pre-compact.js`** — PreCompact hook writes `.planning/.compaction-snapshot.md` with in-flight state before context compaction. No git pollution (unlike the v6.2.0-removed version) — pure sidecar that `/qualia-resume` and `session-start` consume.
137
+ - **`bin/learning-candidates.js`** (`qualia-framework learn-scan`) — scans recent commits + daily-log for repeated fix-scope patterns and hot files; writes `~/.claude/knowledge/learning-candidates.md` with action hints.
138
+ - **`bin/status-snapshot.js`** (`qualia-framework status`) — portable operator snapshot (install health + active project + work in flight + ERP queue + memory state). Markdown / `--json` / `--write` / `--exit-code`.
139
+ - **`bin/security-scan.js`** (`qualia-framework secure`) — static scanner for AI-agent config surfaces. Detects leaked secrets (Anthropic/OpenAI/GitHub/AWS/Supabase JWT/Vercel/Bearer), unscoped Bash tool, service_role-in-client smells, hook hygiene issues. `--deep` emits a prompt pack for the `/qualia-secure` adversarial pass.
140
+ - **`bin/prune-deprecated.js`** — shared helper used by both install and doctor. Detects and removes ghost retired skills from `~/.claude/skills/` and `~/.codex/skills/`. Errors are surfaced (not swallowed) so users see permission/mount issues that would otherwise leave trap skills installed.
141
+ - **`SOUL.md`** — 17-line identity statement (Identity / Stance / 5 named principles / What We Reject). Linked from `README.md` and `CLAUDE.md`.
142
+ - **`FLAGS.md`** — single-page index of every Qualia skill flag, grouped by lane.
143
+ - **`TROUBLESHOOTING.md`** — 185-line problem-to-fix doc, grounded in actual error strings from `bin/state.js`, hooks, ERP retry.
144
+ - **Hard rules now pair negative with positive principle** in `CLAUDE.md` and `AGENTS.md` (italic explanation per rule).
145
+
146
+ ### Changed
147
+
148
+ - **`doctor` auto-prunes ghost retired skills** in every install home and lists each failure if any prune step errors out.
149
+ - **17 SKILL.md descriptions slimmed** to one verb-phrase + short Triggers list (was 3–6 sentence trigger essays). Aggregate savings: ~600 words per session in harness-routing instructions.
150
+ - **`README.md` and `guide.md` openers slimmed** — version-history walls (35+ and 25+ lines respectively) moved to `CHANGELOG.md`. Both files now lead with thesis + first commands + cross-links (`SOUL.md` / `FLAGS.md` / `TROUBLESHOOTING.md`).
151
+ - **`ACTIVE_SKILLS` count: 23 → 25** (`qualia-idk` restored, `qualia-secure` added).
152
+ - **Hooks count: 12 → 13** (`pre-compact.js` reintroduced with sidecar mechanism).
153
+
154
+ ### Fixed
155
+
156
+ - **macOS CI** — `tests/refs.test.sh` refactored off bash 4 `declare -A` (macOS ships bash 3.2). Replaced with temp-file key-value store. macOS CI went from red since 2026-05-21 to all 6/6 green on first PR after the fix.
157
+ - **CI matrix cleanup** — Windows runners dropped from the matrix. Qualia is Cyprus-stack (Linux/macOS dev); Windows isn't supported. Two coercion attempts (`pwd -W`, `cygpath -m`) didn't catch every shell-side path interpolation site, and the cure was more brittle than the disease. The workflow comment documents the path forward if Windows support is ever needed (env-var-pass paths instead of bash-interpolating them).
158
+ - **CodeRabbit advisory** — `bin/prune-deprecated.js` no longer silently swallows filesystem errors; returns `{removed, errors}` and surfaces failures via doctor or non-zero exit on the CLI.
159
+ - **CodeRabbit advisory** — `README.md` "First commands" code fence got a `text` language specifier (satisfies MD040).
160
+ - **Stale `/qualia-idk` reference** in `/qualia` skill description removed; clean separation between the mechanical router and the deep diagnostic.
161
+
162
+ ### Audit artifacts
163
+
164
+ Three `.planning/codebase/` audit documents drove this release: `qualia-confusion-audit.md` (16 defects with file:line citations), `ecc-vs-qualia.md` (8-pair head-to-head skill comparison), `qualia-vs-ecc-final.md` (16-query NotebookLM consolidated audit with execution order). The release closes all 14 actionable items from the final audit; one further item (Q2 AGENTS.md skill table) was deliberately rejected to preserve the 25-line instruction-budget cap.
165
+
166
+ ## [6.3.0] - 2026-05-23 (Harness hardening + command surface reduction)
167
+
168
+ ### Changed
169
+
170
+ - **Default install surface reduced from 35-ish skills to 23 active skills.** `bin/command-surface.js` is now the canonical active/retired skill manifest; installer summary and Claude/Codex skill copying use it.
171
+ - **Retired helper commands are deleted from shipped skill source and pruned on upgrade.** The installer still removes folded helpers such as `qualia-debug`, `qualia-vibe`, `qualia-idk`, `qualia-pause`, `qualia-resume`, `qualia-zoom`, `qualia-issues`, `qualia-triage`, `qualia-hook-gen`, `qualia-skill-new`, and `qualia-flush` from older installed homes.
172
+ - **`/qualia-polish --vibe` absorbs the separate vibe command.** Vibe token/extract scripts now ship under `skills/qualia-polish/scripts/`.
173
+
174
+ ### Added
175
+
176
+ - **`bin/harness-eval.js`** writes deterministic project eval artifacts under `.planning/evals/`, combining state health, ledger validation, plan contract validation, machine evidence, verification report status, trust score, and ERP linkage into a 0-100 score.
177
+ - **ERP reports and project snapshots carry the latest harness eval summary** (`status`, `score`, `phase`, `generated_at`, artifact path).
178
+ - **`qualia-framework eval` CLI** runs the harness eval (`--run --write --json` supported).
179
+
180
+ ### Fixed
181
+
182
+ - **PASS state transitions now require evidence.** When a phase contract exists, `state.js transition --to verified --verification pass` refuses to pass without a clean `.planning/evidence/phase-N-contract-run.json`; it also refuses PASS if the verification report contains `INSUFFICIENT EVIDENCE`.
183
+
184
+ ## [6.2.11] - 2026-05-23 (Owner approval integrity)
185
+
186
+ ### Changed
187
+
188
+ - **Fawzi OWNER code changed** from `QS-FAWZI-01` to `QS-FAWZI-11`.
189
+ - **Employee force-ship override is blocked.** `QUALIA_SHIP_FORCE=1` is OWNER-only in `/qualia-ship` and in the deploy hook.
190
+ - **Ship refusals are shorter and routed.** When the framework refuses deploy from the wrong state, it says why and prints the correct next command.
191
+
192
+ ### Added
193
+
194
+ - **`fawzi-approval-guard.js`** silently records employee proxy-approval claims such as "Fawzi said OK" into a local counter and ERP policy-event queue.
195
+ - **ERP policy-event contract** for `/api/v1/policy-events`.
196
+
197
+ ## [6.2.9] - 2026-05-22 (Codex hook noise + status line)
198
+
199
+ ### Fixed
200
+
201
+ - **PreToolUse hook spam.** Codex was printing eight "Running PreToolUse hook: Qualia X..." status messages on every Bash tool call, even though six of the eight hooks would immediately exit 0 because their `if` filter excluded them. Codex's hook schema doesn't include an `if` field — only `command`, `commandWindows`, `timeout`, `async`, `statusMessage`. The `if:` we shipped was silently dropped and Codex showed the `statusMessage` for every entry in the matched group before running it. v6.2.9 removes `statusMessage` from all conditional hooks; only the always-running hooks (`auto-update`, `git-guardrails`) still show a status line. Conditional hooks render silently.
202
+ - **`pre-deploy-gate.js` firing on unrelated commands.** Claude Code's `if: "Bash(vercel --prod*)"` matcher does substring matching (not glob), so any Bash command containing the literal text `vercel --prod` or `pre-deploy-gate` as a token tripped the gate and ran the full tsc/lint/test/build sweep. Added an inline `selfFilter()` IIFE at the top of the hook that reads `tool_input.command` from stdin and exits 0 fast unless the command actually starts with `vercel --prod` or `vercel deploy --prod`. Same pattern added to `pre-push.js`.
203
+
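The self-filter described in the second item above can be sketched as two small functions. The names `commandFromPayload` and `isProdDeploy` are hypothetical, and the stdin payload is assumed to be JSON with a `tool_input.command` string, as the entry states.

```javascript
const PROD_PREFIXES = ["vercel --prod", "vercel deploy --prod"];

// Pull tool_input.command out of the hook's stdin payload; empty string on any problem.
function commandFromPayload(raw) {
  try {
    const payload = JSON.parse(raw);
    const cmd = payload && payload.tool_input && payload.tool_input.command;
    return typeof cmd === "string" ? cmd : "";
  } catch {
    return "";
  }
}

// True only when the command starts with a production deploy, not when the
// text merely appears somewhere inside it.
function isProdDeploy(command) {
  const trimmed = command.trimStart();
  return PROD_PREFIXES.some((prefix) => trimmed.startsWith(prefix));
}
```

In a hook, the gate would exit 0 immediately when `isProdDeploy(commandFromPayload(stdin))` is false, so a commit message or `cat` command that merely mentions `vercel --prod` no longer triggers the full sweep.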
204
+ ### Added
205
+
206
+ - **`[tui] status_line = [...]` in Codex `config.toml`.** Installer now writes (or appends to existing config) the rich Codex bottom status line: `model-with-reasoning · task-progress · current-dir · git-branch · context-used · five-hour-limit · weekly-limit`. Codex doesn't support custom-command status lines like Claude's `statusLine`, so the Qualia phase/state info continues to render via the `SessionStart` banner at the top of the session; this commit ensures Codex's native bottom status line is explicitly configured rather than left at its default. Existing user configs are appended to only when they have no `[tui]` block; otherwise they are left untouched.
207
+
208
+ ## [6.2.8] - 2026-05-22 (Codex /goal integration + install hardening)
209
+
210
+ ### Fixed
211
+
212
+ - **Codex agent TOMLs missing `name` field.** `renderCodexAgentToml` now emits `name = "..."` as the first line of every `~/.codex/agents/*.toml`. v6.2.7 shipped without this and Codex 0.133 rejected all 9 agent files at session start with `Ignoring malformed agent role definition: ... must define a non-empty \`name\``.
213
+ - **ERP API key missing on Codex installs.** When both targets are installed, the installer now mirrors `~/.claude/.erp-api-key` → `~/.codex/.erp-api-key` (0600). Without this, every ERP write from a Codex session 401'd and the retry queue grew silently.
214
+ - **Deprecated skills lingered after upgrade.** Installer now prunes `qualia-task`, `qualia-quick`, `qualia-polish-loop`, `qualia-design`, `qualia-prd` from both `~/.claude/skills/` and `~/.codex/skills/` on every run. Previously left orphaned after the v5.7.0 / v5.8.0 / v4 consolidations.
215
+
216
+ ### Added
217
+
218
+ - **`bin/codex-goal.js`** — reads `.planning/STATE.md` + `.planning/ROADMAP.md` and emits a `/goal {objective}` line plus a token-budget suggestion calibrated to scope (`phase` 80k · `task` 30k · `feature` 30k · `quick` 10k). Output is meant to be pasted into a Codex session or consumed by the model via the `update_goal` tool.
219
+ - **`rules/codex-goal.md`** — guidance: detect Codex runtime, run `codex-goal.js`, call `update_goal` or surface the `/goal` line. Skip entirely on Claude Code.
220
+ - **`/goal` integration wired into phase-start skills** — `qualia-build`, `qualia-plan`, `qualia-feature` each open with a Step 0 that references `rules/codex-goal.md`. Codex users get native burn-vs-budget tracking against the active phase/task objective without extra typing.
221
+
222
+ ## [6.2.7] - 2026-05-21 (Codex runtime compatibility)
223
+
224
+ ### Added
225
+
226
+ - **Codex-native install surface** now writes `~/.codex/hooks.json`, `~/.codex/hooks/*.js`, `~/.codex/agents/*.toml`, `~/.codex/bin/*.js`, `~/.codex/rules/`, `~/.codex/skills/`, `~/.codex/qualia-design/`, `~/.codex/qualia-templates/`, `~/.codex/knowledge/`, and `~/.codex/qualia-guide.md`.
227
+ - **Codex agent conversion** turns framework `agents/*.md` frontmatter/body into Codex TOML agents.
228
+
229
+ ### Changed
230
+
231
+ - Shared runtime scripts and hooks now resolve their Qualia install home from `~/.claude` or `~/.codex`, so Codex-only installs no longer depend on a Claude install existing.
232
+ - Installer tests and packaged smoke tests now prove Codex runtime files are present, not just `AGENTS.md`.
233
+
234
+ ## [6.2.6] - 2026-05-21 (project snapshot upload)
235
+
236
+ ### Added
237
+
238
+ - **`qualia-framework project-snapshot --upload`** posts the current project snapshot to ERP's project snapshot intake.
239
+ - **Upload coverage** verifies the Framework sends the 0-to-100 progress object, ERP identifiers, and bearer token to `/api/v1/project-snapshots`.
240
+
241
+ ## [6.2.5] - 2026-05-21 (project snapshot export)
242
+
243
+ ### Added
244
+
245
+ - **`qualia-framework project-snapshot`** exports a single project progress JSON object for ERP/admin import.
246
+ - **`--write` snapshot mode** writes `.planning/snapshots/project-snapshot-*.json` with shared IDs, current milestone/phase, closed milestones, lifetime counters, and a 0–100 progress percentage.
247
+
248
+ ## [6.2.4] - 2026-05-21 (report payload contract hardening)
249
+
250
+ ### Added
251
+
252
+ - **`bin/report-payload.js`** is now the canonical Framework -> ERP report payload builder used by `/qualia-report`.
253
+ - **Executable coverage** now proves slug-like ERP IDs are omitted while canonical UUID fields are sent, keeping admin/project linking reliable.
254
+
255
+ ## [6.2.3] - 2026-05-21 (ERP ID guard hotfix)
256
+
257
+ ### Fixed
258
+
259
+ - **`/qualia-report`** now only sends `erp_project_id`, `client_id`, and `workspace_id` when values are UUID-shaped. This prevents ERP `422 VALIDATION_FAILED` responses when local `tracking.json` contains slug-like values such as `acme` or `qualia-solutions`.
260
+ - **ERP contract docs and work-packet template** now label ERP-native identifiers as UUIDs. Local slugs remain in `project_id` and `team_id`.
261
+
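The UUID-shape guard described above can be sketched as a small filter. The function name `erpIds` and the exact pattern (the generic 8-4-4-4-12 hexadecimal layout, without checking version bits) are assumptions.

```javascript
// Generic 8-4-4-4-12 hexadecimal shape; version and variant bits are not checked.
const UUID_SHAPE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
const ERP_ID_FIELDS = ["erp_project_id", "client_id", "workspace_id"];

// Return only the ERP identifier fields whose values look like UUIDs.
function erpIds(tracking) {
  const out = {};
  for (const field of ERP_ID_FIELDS) {
    const value = tracking[field];
    if (typeof value === "string" && UUID_SHAPE.test(value)) out[field] = value;
  }
  return out;
}
```

Slug-like values such as `acme` are omitted from the result rather than sent, which is what avoids the `422 VALIDATION_FAILED` response; local slugs in `project_id` and `team_id` are not touched by this filter.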
262
+ ## [6.2.2] - 2026-05-21 (memory/ERP operating model)
263
+
264
+ This patch incorporates the Framework/Memory/ERP connection note into the framework without reviving the old passive git-scrape model.
265
+
266
+ ### Added
267
+
268
+ - **`docs/ecosystem-operating-model.md`** documents the boundary: Framework builds, Memory remembers, ERP operates. `.planning` imports stay explicit and approved; ERP does not depend on passive git scraping or hook-created bot commits.
269
+ - **`docs/reviews/v6.2.2-memory-erp-audit.md`** records how the downloaded Framework/Memory/ERP note was applied and where it needed correction.
270
+ - **`templates/work-packet.md`** gives ERP/admins a clean way to hand approved project context into Claude Code or Codex sessions.
271
+ - **`docs/release.md`** documents the pre-publish, publish, and post-publish proof path so local package smoke is not mistaken for public install proof.
272
+ - **`tests/published-install-smoke.test.sh`** is the release-only public smoke: it fails until npm `latest` matches `package.json`, then runs `npx qualia-framework@latest install` into an isolated HOME/cache and verifies Claude + Codex output.
273
+
274
+ ### Changed
275
+
276
+ - **`tracking.json` and `/qualia-report`** now preserve/pass through optional `erp_project_id`, `client_id`, and `workspace_id` identifiers when present, so ERP dashboards can link reports without guessing from names.
277
+ - **`tests/install-smoke.test.sh`** now isolates npm cache/log/userconfig state and reports npm pack stderr instead of swallowing pack failures.
278
+ - **`skills/qualia-report`** now shows the safe piped `set-erp-key` command instead of the retired positional-key syntax.
279
+
280
+ ## [6.2.1] - 2026-05-21 (revival audit surface guard)
281
+
282
+ This patch closes the remaining active-surface drift found during the framework revival audit. v6.2.0 removed hook-created bot commits, but several user-facing surfaces still described the old model: ERP reads from git, `pre-compact` saves state, 32 skills, and stale v6.0 headings.
283
+
284
+ ### Changed
285
+
286
+ - **README / guide / onboarding** now reflect the current package identity, 33-skill surface, `/qualia-vibe`, and the explicit `/qualia-report` ERP contract.
287
+ - **`skills/qualia-road`, `skills/qualia-milestone`, `agents/roadmapper`, and `docs/erp-contract.md`** now describe `tracking.json` as local/report telemetry, not passive ERP git input.
288
+ - **`skills/qualia-verify`** now states the current fail-closed `INSUFFICIENT EVIDENCE` behavior directly instead of describing the old false-pass vector as if it still applied.
289
+ - **`skills/qualia-polish`** now requires optional Lighthouse/axe gate skips to be reported, not silently hidden.
290
+ - **`docs/reviews/v6.2.1-revival-audit.md`** records the evidence, remaining release blocker, and next high-leverage work.
291
+
292
+ ### Tests
293
+
294
+ - **`tests/refs.test.sh`** now also guards stale active-surface claims: passive ERP tracking, removed pre-compact guidance, stale skill count, stale README version, old `INSUFFICIENT EVIDENCE` wording, and silent optional gate language.
295
+ - **`tests/install-smoke.test.sh`** builds the actual npm tarball, extracts it, runs the packaged installer with target `Both`, and verifies Claude `CLAUDE.md`, Codex `AGENTS.md`, 11 hooks, no `pre-compact`, and matching installed config version.
296
+
297
+ ## [6.2.0] - 2026-05-20 (no more hook-created bot commits)
298
+
299
+ Two hooks were creating bot commits whose only purpose was to feed a phantom consumer:
300
+
301
+ - **`hooks/pre-push.js`** — created `chore(track): ERP sync …` on every push to stamp `.planning/tracking.json` into git. Rationale: "the ERP at portal.qualiasolutions.net reads tracking.json from GitHub." That rationale was false; the ERP repo has no route, cron, MCP, or background worker that reads `tracking.json` (verified — search returns only marketing copy and one substring-match comment).
302
+ - **`hooks/pre-compact.js`** — committed STATE.md + tracking.json on every Claude Code context compaction. Rationale: "preserve mutations across compactions." Misdiagnosed; `bin/state.js` already provides crash-safe atomic writes (`state.js:18`) plus a write-ahead journal (`state.js:36-64`) and recovery (`state.js:49-63`). Compaction is in-memory only — disk writes survive it without help.
303
+
304
+ The bot commits were pushing data into git for a consumer that doesn't exist, polluting every Qualia project's history, and triggering pointless redeploys on any Vercel project where GitHub auto-deploy had been left on by accident.
305
+
306
+ v6.2 strips both. The pre-push hook still stamps `tracking.json` locally so `bin/statusline.js`, `hooks/stop-session-log.js`, and `/qualia-report` see fresh telemetry — it just doesn't touch git anymore. The pre-compact hook is deleted outright; `state.js` already covers crash safety with a stronger guarantee (atomic + journaled) than git ever did.
307
+
308
+ ### Changed
309
+
310
+ - **`hooks/pre-push.js`** — rewrite. Stamps `last_commit`, `last_updated`, `last_pushed_at` locally only. No `git add`, no `git commit`, no rollback dance, no branch-guard mirroring (irrelevant without a commit). Net code reduction: ~220 → ~85 lines.
311
+ - **`docs/erp-contract.md`** — strikes the false "ERP reads `tracking.json` directly from git" claim and removes the dangling `GET /api/v1/tracking/:project` endpoint that was documented but never implemented on the ERP.
312
+ - **`bin/install.js`** — fixes the false "tracking.json syncs to ERP on every push" install-summary line; drops empty hook-event keys after Qualia-owned cleanup so settings.json doesn't accumulate empty `PreCompact: []` arrays.
313
+
314
+ ### Removed
315
+
316
+ - **`hooks/pre-compact.js`** — deleted. Added to `DEPRECATED_HOOKS` array in `bin/install.js` so existing installs get the file unlinked from `~/.claude/hooks/`. Added to `QUALIA_LEGACY_HOOK_FILES` in `bin/cli.js` so the `wire` path also strips it. The PreCompact event entry in user `settings.json` is detected and removed; if PreCompact is now empty, the key is dropped.
317
+
318
+ ### Migration
319
+
320
+ No action required. Existing `chore(track): ERP sync` and `state: pre-compaction save` commits in your project histories stay (you can't unmake history); v6.2 just stops creating new ones. Existing ERP integrations are unaffected — `/qualia-report` continues to POST to `/api/v1/reports` exactly as before. On upgrade, the framework will:
321
+
322
+ 1. Delete `~/.claude/hooks/pre-compact.js` from existing installs.
323
+ 2. Strip any `pre-compact.js` references from your `~/.claude/settings.json`.
324
+ 3. Drop the `PreCompact` key entirely if it becomes empty.
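
The three upgrade steps above can be illustrated with a minimal sketch. It assumes `settings.hooks` maps each event name to an array of blocks, each holding a `hooks` array of `{ type, command }` entries; `stripHook` is a hypothetical name, not the installer's actual function.

```javascript
// Sketch only: remove every hook command that references `fileName`,
// then drop blocks and event keys that end up empty.
function stripHook(settings, fileName) {
  const hooks = { ...(settings.hooks || {}) };
  for (const event of Object.keys(hooks)) {
    const blocks = (hooks[event] || [])
      .map((block) => ({
        ...block,
        hooks: (block.hooks || []).filter(
          (h) => !String(h.command || "").includes(fileName)
        ),
      }))
      .filter((block) => block.hooks.length > 0);
    if (blocks.length > 0) {
      hooks[event] = blocks;
    } else {
      delete hooks[event]; // drop the key rather than leave `PreCompact: []`
    }
  }
  return { ...settings, hooks };
}
```

Returning a new object instead of mutating the input keeps the sketch easy to test; a real installer would also need to write the result back to `settings.json`.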
325
+
326
+ ## [6.1.0] - 2026-05-17 (design pivot path — /qualia-vibe + design-surface fixes)
327
+
328
+ The design path was missing a fast pivot. `/qualia-polish` polishes within a chosen vibe; `/qualia-polish --redesign` rebuilds from scratch (~30 min). When a client says "different vibe" and you want a 3-minute token swap, neither fits. v6.1 adds **`/qualia-vibe`** — the impeccable middle path.
329
+
330
+ This release also closes the design-surface bugs surfaced by the v6.0 audit (viewport mismatch, slop-detect path resolution, dead `/qualia-design` references, bounce-easing contradiction) and adds the EventMaster discipline as a real rule.
331
+
332
+ ### New — `/qualia-vibe` skill
333
+
334
+ Layout-preserving aesthetic pivot. Modes:
335
+
336
+ - **Default** — propose ONE direction (per `rules/one-opinion.md`), justify it from PRODUCT.md / anti-references / scene sentence, apply on approval. Never enumerates a menu of options.
337
+ - **`--variants N`** — the opt-in menu, when the user explicitly asked for choices. Default N=3, max 5. Uses `AskUserQuestion`.
338
+ - **`--extract <URL or image path>`** — reverse-engineer a DESIGN.md draft from a reference site or screenshot. Captures via `playwright-capture.mjs`, runs an extract-mode prompt (description, not scoring), emits a DESIGN.md draft for user review.
339
+ - **`--sync [--write]`** — show drift between code (CSS vars, Tailwind config, font imports) and DESIGN.md. Three categories reported: undocumented (in code, not declared), orphaned (declared, not used), drifted (different values). With `--write`, patches DESIGN.md to match code.
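
The three drift categories reported by `--sync` can be sketched as a plain comparison of two token maps. This is an illustration under assumptions: `classifyDrift` and the shape of its inputs (objects mapping token name to value) are hypothetical and do not describe what `tokens.mjs` actually does.

```javascript
// Hypothetical helper: compare tokens found in code with tokens declared in DESIGN.md.
function classifyDrift(codeTokens, declaredTokens) {
  const undocumented = []; // in code, not declared
  const orphaned = [];     // declared, not used
  const drifted = [];      // present in both, with different values

  for (const [name, value] of Object.entries(codeTokens)) {
    if (!Object.hasOwn(declaredTokens, name)) {
      undocumented.push(name);
    } else if (declaredTokens[name] !== value) {
      drifted.push({ name, code: value, declared: declaredTokens[name] });
    }
  }
  for (const name of Object.keys(declaredTokens)) {
    if (!Object.hasOwn(codeTokens, name)) orphaned.push(name);
  }
  return { undocumented, orphaned, drifted };
}
```

Under this model, `--write` would resolve each `drifted` entry in favor of the `code` value, since the entry above says DESIGN.md is patched to match code.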
340
+
341
+ Hard contract: vibe NEVER touches JSX structure, routing, data flow, or layout grid. Only tokens (color, type, depth, motion) + DESIGN.md sections 1, 2, 3, 6, 7. If a pivot would need structural change, vibe stops and routes to `/qualia-polish --redesign`.
342
+
343
+ Supporting scripts:
344
+
345
+ - `skills/qualia-vibe/scripts/tokens.mjs` — `sync` (code↔DESIGN.md diff + patch) and `propose-variants` (LLM prompt scaffold). Pure utility, no LLM call.
346
+ - `skills/qualia-vibe/scripts/extract.mjs` — capture orchestrator + extract-mode prompt scaffold. Pure utility, no LLM call.
347
+ - Reuses `skills/qualia-polish/scripts/playwright-capture.mjs`, `bin/slop-detect.mjs`, `agents/visual-evaluator.md`.
348
+
349
+ ### New — `rules/one-opinion.md`
350
+
351
+ Codifies the EventMaster discipline (from a real session where the user pivoted aesthetic 3 times in one block and finally said `ANYTHING JUST CHANGE IT STOP ASKING`). Rule: when proposing a design decision the user has not already named, propose ONE opinionated direction with justification — never a menu. If rejected, ask what didn't fit and propose ONE replacement. Lazy-loaded by `/qualia-vibe`, `/qualia-polish` Stage 0, `/qualia-new` DESIGN.md step.
352
+
353
+ ### Fixed — design-surface bugs from v6.0 audit
354
+
355
+ - **Viewport mismatch:** `/qualia-polish` SKILL.md Stage 4 said 1280px desktop. `agents/visual-evaluator.md`, `REFERENCE.md`, and `playwright-capture.mjs` all use 1440. Aligned to 1440.
356
+ - **slop-detect path in `loop.mjs`:** Was hardcoded to `~/.claude/bin/slop-detect.mjs` and silently skipped the gate if missing. Now searches `SLOP_DETECT_SCRIPT` env, `~/.claude/bin/...`, and the framework-relative path. Skip is now logged-on-miss, not silent.
357
+ - **`loop.mjs commit-fix` git identity:** Was failing on fresh clones with no global git config. Now sets `Qualia Polish Loop / polish-loop@qualia.solutions` inline so the commit works regardless.
358
+ - **Dead `/qualia-design` references:** `qualia-design/frontend.md` still listed `/qualia-design` as an active command — it was deleted in v4.5. Replaced with `/qualia-polish` and `/qualia-vibe`.
359
+ - **Bounce easing token contradicting design-laws:** `qualia-design/design-reference.md` shipped `--ease-spring: cubic-bezier(0.34, 1.56, …)` as a generic token — but `design-laws.md` §6 and `slop-detect.mjs` MED-BOUNCE-EASING ban that exact curve. Removed the token, replaced with a note explaining when (and only when) brand-register vibes can override the ban.
360
+ - **Stale "deferred to v5.2" in REFERENCE.md:** Multi-route and reduced-motion support have been implemented for several releases now. Both were removed from the deferred list and replaced with an "aesthetic pivot belongs in /qualia-vibe" pointer.
361
+ - **Menu-of-options anti-pattern in `/qualia-polish` Stage 0:** Brief format listed `{editorial · brutalist · luxury · maximalist · …}` as options. Per `rules/one-opinion.md`, agent now proposes ONE direction inferred from PRODUCT.md. Pushback routes to `/qualia-vibe`, not to enumerating more.
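
The slop-detect path fix in the list above (environment override, then the installed copy, then a framework-relative copy, with a logged skip) might look roughly like this. `resolveSlopDetect`, its parameters, and the warning text are assumptions made for illustration; only the three search locations come from the entry.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Sketch only: return the first existing candidate, or null after logging the miss.
// `frameworkRoot` and `exists` are injected so the lookup can be tested without a real install.
function resolveSlopDetect({ env = process.env, frameworkRoot, exists = fs.existsSync }) {
  const candidates = [
    env.SLOP_DETECT_SCRIPT,
    path.join(os.homedir(), ".claude", "bin", "slop-detect.mjs"),
    path.join(frameworkRoot, "bin", "slop-detect.mjs"),
  ].filter(Boolean);

  const found = candidates.find((candidate) => exists(candidate));
  if (!found) {
    // Log the skip instead of silently bypassing the gate.
    console.warn(`slop-detect not found, gate skipped. Searched: ${candidates.join(", ")}`);
    return null;
  }
  return found;
}
```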
362
+
363
+ ### Changed — `bin/slop-detect.mjs`
364
+
365
+ - **Added banned fonts:** Montserrat, Poppins, Lato, Open Sans (already banned in `qualia-design/design-brand.md` but missing from the linter).
366
+ - **Added `--watch` flag:** Re-scan on file change with 200ms debounce. Watches tracked extensions (`tsx/jsx/ts/js/css/scss/html/svelte/vue/astro`). Makes slop-detect proactive instead of pre-commit only.
367
+ - **Broadened default scan paths:** Added `packages/` and `apps/` to the default roots (turbo / nx / pnpm-workspaces monorepo conventions).
368
+
369
+ ### Surface counts
370
+
371
+ - **33 skills** (was 32). `/qualia-vibe` added.
372
+ - **8 rules** in `rules/` (7 always-loaded + 1 lazy-loaded). `rules/one-opinion.md` added (lazy-loaded by design-adjacent skills).
373
+ - Same 9 agents, same 12 hooks.
374
+
375
+ ### Migration
376
+
377
+ - Run `npx qualia-framework@latest install` (or `migrate`) once. Pulls in `/qualia-vibe` and `rules/one-opinion.md`.
378
+ - `/qualia-vibe --extract` requires Playwright to be installed in the target project (same dependency as `/qualia-polish --loop`).
379
+ - `/qualia-vibe --sync` works on any project with `app/globals.css`, `src/app/globals.css`, or similar, and optionally `tailwind.config.{ts,js,mjs,cjs}`.
380
+ - No skill removed, no command renamed.
381
+
382
+ ---
383
+
384
+ ## [6.0.0] - 2026-05-17 (audit, cleanup, reliability)
385
+
386
+ A wide-surface audit pass. No new flagship feature — every change is a real bug, a silent failure surfaced, an outdated reference replaced, or a token-budget reduction. The framework is the same shape and the road commands behave exactly as before; what changed is that broken edges are now fixed.
387
+
388
+ ### Fixed — uninstall and migrate left orphan files on disk
389
+
390
+ `cli.js` shipped a `QUALIA_HOOK_FILES` manifest listing 9 hooks but the framework ships 12. Uninstall left `env-empty-guard.js`, `supabase-destructive-guard.js`, and `vercel-account-guard.js` orphaned under `~/.claude/hooks/`. Same shape for agents: `QUALIA_AGENT_FILES` was missing `visual-evaluator.md`. And `cmdMigrate()` `requiredBashHooks` was missing the same 3 hooks, so users on `qualia-framework migrate` ended up with an incomplete hook set vs. those on a fresh `install`. All three manifests now match what ships.
391
+
392
+ ### Fixed — silent catches in hooks
393
+
394
+ `auto-update.js` had two bare `catch {}` blocks (the npm-version fetch and the top-level wrapper) that swallowed every error type including ENOMEM and ETIMEDOUT. `pre-compact.js` had the same shape around the commit logic. Both now write to `~/.claude/.qualia-traces/` so failures are debuggable instead of vanishing. `env-empty-guard.js` was exiting 2 (BLOCK) on any unparseable stdin payload — which meant a malformed JSON from an unrelated tool call would block the user from running `vercel env`. Now allows on parse error and only blocks when it actually matched a `vercel env add` with an empty value.
395
+
396
+ ### Fixed — phantom `rules/frontend.md` references
397
+
398
+ `CLAUDE.md`, `AGENTS.md`, and `skills/qualia-feature/SKILL.md` pointed at `rules/frontend.md` which has not existed since the design substrate moved to `qualia-design/` in v4.5. Agents loading these references hit a dead end. Now points at `qualia-design/frontend.md`. Also added `rules/architecture.md` (the deep-modules rule from v5.x) to the discoverable substrate list.
399
+
400
+ ### Fixed — skill bugs that the registry never caught
401
+
402
+ - `/qualia-learn` declared `Read, Write, Edit, Glob, Grep` in `allowed-tools` but its process calls `node ~/.claude/bin/knowledge.js` and `git add`. Without `Bash` the skill literally could not run its own instructions. Added.
403
+ - `/qualia-map` described an `AskUserQuestion` "Re-scan?" prompt but did not declare the tool. Added.
404
+ - `/qualia-plan` Rule 2 said "Max 3 revision cycles" while the description and process said "max 2." Reconciled to 2 (matches the actual revision-loop code).
405
+ - `/qualia-postmortem` had `skills/qualia-design/SKILL.md` in its design-regression lookup table — that path has not existed since the polish skill consolidated design work. Now points at `qualia-polish` + `qualia-design/design-laws.md`.
406
+
407
+ ### Fixed — agent tool frontmatter gaps
408
+
409
+ `agents/planner.md` referenced `mcp__context7__*` tools but did not declare them in frontmatter; same for `agents/qa-browser.md` and Playwright MCP. Both agents were unable to call the tools their process described. Both frontmatters updated.
410
+
411
+ ### Fixed — hardcoded `/tmp` paths
412
+
413
+ `agents/verifier.md` drift audit wrote to `/tmp/used-tokens` and `/tmp/declared` (Windows-incompatible, symlink-attackable on multi-user systems). Now uses `mktemp` with cleanup. `agents/qa-browser.md` dev-server log goes to `${TMPDIR:-/tmp}/qualia-dev-server-$$.log` instead.
414
+
415
+ ### Changed — `tests/run-all.sh` replaces the `&&`-chain test orchestrator
416
+
417
+ `npm test` used to chain 8 suites with `&&`, so the first failure aborted everything else and you only ever saw the failure of the earliest-failing suite. New `tests/run-all.sh` runs every suite, collects failures, and prints which suites failed at the end. Also stripped silent `|| true` from `slop-detect.test.sh` so genuine slop-detect crashes can no longer pass the suite invisibly.
418
+
419
+ ### Changed — `rules/trust-boundary.md` extracts duplicated security block
420
+
421
+ Builder, planner, verifier, and visual-evaluator each carried a ~150-token inline copy of the trust-boundary block (~600 tokens duplicated across the four agents). Extracted to `rules/trust-boundary.md`; agents reference it with one line and add the role-specific reporting shape. Net: ~500 tokens saved per phase that spawns the full agent set, plus a single canonical place to update the rule.
422
+
423
+ ### Removed — stale version stamps from user-facing skill surfaces
424
+
425
+ `/qualia-road` had four "v4.5.0+" / "v5.0+" / "v5.1+" / "v5.3+" headings; removed. `/qualia-new` headings stamped `(v5.0 — REQUIRED)` / `(v4.5.0 OKLCH-first)`; removed. `/qualia-flush` had a "deferred to v4.3.0" caveat that has long since been fixed in `knowledge.js`; updated to reflect current behavior. Similar cleanups in `/qualia-polish`, `/qualia-verify`, `/qualia-milestone`, `/qualia-hook-gen`, `/qualia-feature`. `/qualia-feature` description no longer mentions the removed `/qualia-quick` and `/qualia-task`. Stale `<!-- v5.9: -->` comments in agent headers stripped.
426
+
427
+ ### Archived — pre-v4 CHANGELOG entries
428
+
429
+ Moved v2.x and v3.x release notes (851 lines) to `docs/archive/CHANGELOG-pre-v4.md`. Main `CHANGELOG.md` now ends at v4.0.0 with a pointer to the archive. The active changelog dropped from 2726 to 1875 lines.
430
+
431
+ ### Fixed — `templates/help.html` drift
432
+
433
+ Showed "28 skills" (actual 32) and was missing entries for `/qualia-hook-gen`, `/qualia-issues`, `/qualia-road`, `/qualia-triage`, `/qualia-zoom`, `/zoho-workflow`. Added. `docs/onboarding.html` version-stamped `v5.5`; bumped to v6.0.
434
+
435
+ ### Test posture
436
+
437
+ 489 tests still passing on the full suite. No tests removed; one test (em-dash detection) tightened to also assert exit code, not just message presence. Test runner is now fail-collect, not fail-fast, so future regressions in early suites do not hide later regressions.
438
+
439
+ ### Migration notes (v5.9.x → v6.0.0)
440
+
441
+ - Run `npx qualia-framework@latest install` (or `migrate`) once. The migrate path now wires the three v5.0 hooks (`vercel-account-guard.js`, `env-empty-guard.js`, `supabase-destructive-guard.js`) that earlier migrate runs missed.
442
+ - If a previous uninstall left orphan hook files behind, re-run `npx qualia-framework@latest uninstall` to remove them.
443
+ - No skill removed; no command renamed; no API change.
444
+
445
+ ---
446
+
447
+ ## [5.9.2] - 2026-05-16 (hook ordering + ERP payload fixes)
448
+
449
+ **Two production bugs that were silently corrupting state.** Neither was caught by tests because both required specific runtime conditions to reproduce — a blocked push in one case, a fresh project with no session_started_at stamp in the other. Both surfaced from real session traces, not synthetic testing.
450
+
451
+ ### Fixed — pre-push.js no longer orphans a bot commit when the push is blocked
452
+
453
+ - **Root cause:** Claude Code PreToolUse hooks run *in parallel*, not in declared order. `pre-push.js` (stamps `tracking.json` and creates a bot commit) and `branch-guard.js` (blocks non-OWNER pushes to main/master) both fire on `git push *`. When `pre-push` ran and `branch-guard` exited 2, the bot commit was already in the local branch — orphaned, sitting around to be shipped on the next push to a feature branch.
454
+ - **Fix:** `pre-push.js` now self-gates with a `wouldBeBlocked()` mirror of `branch-guard`'s checks (role from config + refspec parse + current-branch fallback). If the push would be blocked, it skips `commitStamp()` entirely and exits 0. `branch-guard` still produces the user-facing block message via its own exit-2.
455
+ - **Why not just declare ordering?** Claude Code hook config doesn't expose execution-order priorities for PreToolUse. The only safe primitive is self-gating. When v6.0 extracts the shared runtime, both hooks call one `wouldBeBlocked()` from `lib/`.
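
A minimal sketch of the self-gating check described above, assuming the role arrives as a string and the push arguments as an array of words. The real `wouldBeBlocked()` in `hooks/pre-push.js` is not shown in this changelog, so the signature and parsing below are hypothetical.

```javascript
const PROTECTED = new Set(["main", "master"]);

// Sketch only: mirror branch-guard's decision without producing its block message.
// `pushArgs` are the words after `git push`; `currentBranch` is the fallback target.
function wouldBeBlocked(role, pushArgs, currentBranch) {
  if (role === "OWNER") return false;

  // Options start with "-"; the first positional word is the remote, the rest are refspecs.
  const positional = pushArgs.filter((arg) => !arg.startsWith("-"));
  const refspecs = positional.slice(1);

  const targets = refspecs.length > 0
    ? refspecs.map((spec) => {
        const dst = spec.includes(":") ? spec.split(":").pop() : spec;
        return dst.replace(/^\+/, "").replace(/^refs\/heads\//, "");
      })
    : [currentBranch];

  return targets.some((target) => PROTECTED.has(target));
}
```

With a check like this, the pre-push hook can skip `commitStamp()` and exit 0 whenever the result is `true`, leaving the user-facing block to `branch-guard`.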
456
+
457
+ ### Fixed — qualia-report ERP payload no longer 422s on empty datetimes
458
+
459
+ - **Root cause:** The payload builder coalesced missing `session_started_at` and `last_pushed_at` to `''` via `t.session_started_at||''`. The ERP validator (Pydantic-style) accepts missing-Optional but rejects empty-string-as-datetime → HTTP 422 on every fresh project, never-pushed branch, or brand-new session.
460
+ - **Fix:** Conditional spread in the payload object — `...(t.session_started_at?{session_started_at:t.session_started_at}:{})` — drops the key entirely when the value is empty. ERP accepts the missing-Optional cleanly. Applied to both fields.
461
+ - **Verified:** Manual retry without the empty datetime keys succeeded with HTTP 200 in the session that caught the bug.
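
The conditional-spread fix above can be shown in isolation. In this sketch `t` stands for the parsed `tracking.json` object, as in the snippet quoted in the entry; `buildPayload` and `base` are hypothetical names.

```javascript
// Sketch only: omit optional datetime keys entirely when they are missing or empty,
// rather than sending "" for a datetime field.
function buildPayload(t, base = {}) {
  return {
    ...base,
    ...(t.session_started_at ? { session_started_at: t.session_started_at } : {}),
    ...(t.last_pushed_at ? { last_pushed_at: t.last_pushed_at } : {}),
  };
}
```

Spreading an empty object adds no keys, so a fresh project produces a payload without either field, which is the missing-Optional case the validator accepts.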
462
+
463
+ ### Files changed
464
+
465
+ - `hooks/pre-push.js` — added `wouldBeBlocked()` and early-exit
466
+ - `skills/qualia-report/SKILL.md` — payload builder uses conditional spreads
467
+ - `package.json`, `README.md`, `guide.md` — version bumped to 5.9.2
468
+ - `CHANGELOG.md` — this entry
469
+
470
+ ### Carries forward from v5.9.1
471
+
472
+ Demo-first gate at kickoff, AskUserQuestion-only mode, single free-text question, hard hand-off to `/qualia-discuss`.
473
+
474
+ ---
475
+
476
+ ## [5.9.1] - 2026-05-11 (qualia-new — demo-first gate, AskUserQuestion everywhere)
477
+
478
+ **Targeted UX fix for the kickoff flow.** Observed a session where `/qualia-new` opened with a free-text "tell me what you want to make" prompt, then drifted into ad-hoc clarification questioning ("What does the SaaS DO?") before ever reaching the Demo vs Full gate. That is both a spec violation and bad UX: the shape gate drives the entire downstream question set, so it must come first.
479
+
480
+ ### Changed — qualia-new step ordering
481
+
482
+ - **Step 1 is now the Demo/Full/Quick gate** via `AskUserQuestion`. It is the literal first interaction with the user — before "what are you building". Banner-only Step 0 stays; the next thing the user sees is the project-shape picker.
483
+ - **Step 3 is the only free-text question** in the kickoff flow: a single one-line pitch, accepted as-is even if broad ("a SaaS platform" is a valid answer). No clarification round, no ad-hoc questioning.
484
+ - **Step 4 hand-off to `/qualia-discuss` is now a hard rule.** "The next tool call after Step 3 must be `/qualia-discuss`. No free-form follow-up." Discovery depth is the discuss skill's job — not improvised.
485
+ - All downstream steps renumbered (old Step 2 → 5, etc.). Internal references updated.
486
+
487
+ ### Changed — qualia-new Rules section
488
+
489
+ - New Rule 1: Project type is the first question, period — even before "what are you building".
490
+ - New Rule 2: `AskUserQuestion` for every discrete-choice question; free-text only for Step 3.
491
+ - New Rule 3: No ad-hoc clarification questioning between Step 3 and `/qualia-discuss`.
492
+ - Demo non-negotiables (Rule 6) made explicit: 1 milestone, no mock data, real backend, real agent/platform functionality, DESIGN.md mandatory, focus on design + functionality. Speed never comes from mocking the backend.
493
+
494
+ ### Changed — qualia-discuss PROJECT MODE
495
+
496
+ - P1 (project-type detection): when invoked standalone, uses `AskUserQuestion` (interactive UI) — never a plain-text prompt. When invoked by `/qualia-new`, skip the gate entirely (already locked upstream).
497
+
498
+ ### Why
499
+
500
+ The screenshot evidence: a fresh `/qualia-new` session was free-form questioning the user before locking project shape. That violates the spec and produces shallow discovery — which then poisons every downstream phase. Reordering puts the shape gate first (where it always belonged) and tightens the rules so no future session can ad-hoc-clarify its way past the structured interview.
501
+
502
+ ---
503
+
504
+ ## [5.9.0] - 2026-05-11 (Deep-research fixes — surface drift, ERP retry queue, model tiering)
505
+
506
+ **The "fix what the deep-research audit found" release.** Three independent wins
507
+ shipped together: every dead command reference on user-facing surfaces is gone,
508
+ the ERP report upload now has a real retry mechanism (not just a lying message),
509
+ and four structured agents move from inherited Opus to Sonnet for a substantial
510
+ cost cut with no flow change.
511
+
512
+ ### Fixed — surface drift
513
+
514
+ - `rules/speed.md` no longer recommends `/qualia-quick` and `/qualia-task` (removed in v5.7). Now points at `/qualia-feature`.
515
+ - `templates/help.html` Quick Paths section no longer lists `/qualia-quick`, `/qualia-task`, `/qualia-design`. Replaced with `/qualia-feature` and `/qualia-polish`.
516
+ - `docs/onboarding.html` no longer lists `/qualia-quick`, `/qualia-task`, `/qualia-prd`, or `/qualia-polish-loop`. Updated to point at `/qualia-feature` and `/qualia-polish --loop`.
517
+ - `docs/playwright-loop-pilot-results.md` path references updated from `skills/qualia-polish-loop/` to `skills/qualia-polish/` (the v5.8.0 consolidation moved them; the historical pilot doc still pointed at the old location).
518
+ - New `tests/refs.test.sh` greps every active surface for backticked `/qualia-*` references and asserts each name maps to a shipped skill. Wired into `npm test`. Excludes historical surfaces (`docs/reviews/`, `docs/research/`, CHANGELOG) and migration-context prose ("Replaces /qualia-quick"). 10 references checked.
519
+
520
+ ### Added — ERP report retry queue
521
+
522
+ - New `bin/erp-retry.js` script. Persistent queue at `~/.claude/.erp-retry-queue.json`. Replaces the v5.8.0 "will appear in ERP after retry" message which was misleading — no retry mechanism existed before this release.
523
+ - `skills/qualia-report/SKILL.md` Step 6 enqueues failed uploads (all 3 attempts exhausted) into the queue with the full payload, idempotency key, and last error. Local commit still happens first; the queue is only the cloud-sync mechanism.
524
+ - `hooks/session-start.js` drains up to 5 queued items quietly on every Claude Code launch (2.5s per-item timeout, 8s total budget, fire-and-forget).
525
+ - New CLI: `qualia-framework erp-flush [show|clear]`. Drains verbosely, lists current queue, or clears with backup. Useful after rotating the API key or when the ERP comes back online.
526
+ - Permanent failures (401, 422) and 10-attempt exhaustion mark the item `give_up: true` — it stays in the queue for visibility but is no longer auto-retried until the user runs `erp-flush` after fixing the underlying issue.
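
The give-up rule in the list above (permanent 401/422 failures or 10 attempts) and the drain limit of 5 can be sketched as two small functions. Only the `give_up` field is named in this entry; `attempts`, `last_error`, and both function names are assumptions.

```javascript
const PERMANENT_STATUSES = new Set([401, 422]);
const MAX_ATTEMPTS = 10;

// Sketch only: record a failed upload and decide whether auto-retry should stop.
function recordFailure(item, status, error) {
  const attempts = (item.attempts || 0) + 1;
  const give_up = PERMANENT_STATUSES.has(status) || attempts >= MAX_ATTEMPTS;
  return { ...item, attempts, last_error: error, give_up };
}

// Items marked give_up stay in the queue for visibility but are skipped when draining.
function nextBatch(queue, limit = 5) {
  return queue.filter((item) => !item.give_up).slice(0, limit);
}
```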
527
+
528
+ ### Added — ERP payload contract completion
529
+
530
+ - `Idempotency-Key` header now sent with every report upload. Deterministic UUID-v5 hash of `(client_report_id, submitted_at, project)` so retries carry the same key. The ERP's documented 24-hour replay window protects against response-lost-mid-flight duplicates (`docs/erp-contract.md:42-49`).
531
+ - `session_duration_minutes` now computed from `tracking.json.session_started_at` to `submitted_at` and included in the payload. The ERP's example payload (`docs/erp-contract.md:93`) already expected this; the framework was silently omitting it.
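
A deterministic UUID-v5 key of the kind described above can be derived with Node's built-in `crypto` module. The namespace UUID and the `|` separator below are placeholders chosen for the sketch; the changelog does not say which namespace or serialization the framework uses.

```javascript
const crypto = require("node:crypto");

// Placeholder namespace (the RFC 4122 DNS namespace), used here only for illustration.
const NAMESPACE = "6ba7b810-9dad-11d1-80b4-00c04fd430c8";

// UUID v5: SHA-1 over namespace bytes + name, truncated to 16 bytes,
// with the version and variant bits set.
function uuidV5(name, namespace = NAMESPACE) {
  const nsBytes = Buffer.from(namespace.replace(/-/g, ""), "hex");
  const hash = crypto.createHash("sha1").update(nsBytes).update(name, "utf8").digest();
  const bytes = Buffer.from(hash.subarray(0, 16));
  bytes[6] = (bytes[6] & 0x0f) | 0x50; // version 5
  bytes[8] = (bytes[8] & 0x3f) | 0x80; // RFC 4122 variant
  const hex = bytes.toString("hex");
  return [hex.slice(0, 8), hex.slice(8, 12), hex.slice(12, 16), hex.slice(16, 20), hex.slice(20)].join("-");
}

// Same inputs always yield the same key, so a retry carries the original Idempotency-Key.
function idempotencyKey({ client_report_id, submitted_at, project }) {
  return uuidV5([client_report_id, submitted_at, project].join("|"));
}
```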
532
+
533
+ ### Changed — verifier strictness
534
+
535
+ - `/qualia-verify` now grep-checks the verification report for `INSUFFICIENT EVIDENCE` lines after the verifier returns. Any occurrence downgrades the verdict to FAIL. Closes the deep-research audit's #1 false-pass vector — the verifier was instructed to emit INSUFFICIENT EVIDENCE on tool-budget exhaustion, but the orchestrator was treating those as silent PASS.
536
+ - `agents/verifier.md` tool budget changes from fixed 25 calls to `max(25, task_count * 5)`. A 3-task phase still gets 25 calls; a 10-task phase now gets 50. The fixed budget was tight on phases with many tasks.
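
The scaled tool budget above is a one-line formula. As a sketch, with `verifierToolBudget` as a hypothetical name (the real budget is stated in the agent's prompt, not necessarily computed in code):

```javascript
// Budget never drops below the old fixed 25 calls and grows by 5 calls per task.
function verifierToolBudget(taskCount) {
  return Math.max(25, taskCount * 5);
}
```

This reproduces the two cases given in the entry: 3 tasks yield 25 calls and 10 tasks yield 50.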
537
+
538
+ ### Changed — model tiering (cost cut, zero flow risk)
539
+
540
+ - Four structured agents move to Sonnet via `model: sonnet` frontmatter:
541
+ - `agents/verifier.md` — deterministic grep + score against criteria
542
+ - `agents/plan-checker.md` — 11-rule structured checklist
543
+ - `agents/roadmapper.md` — template-driven milestone decomposition
544
+ - `agents/qa-browser.md` — mechanical browser interaction + finding capture
545
+ - `builder.md`, `planner.md`, `researcher.md`, `visual-evaluator.md` stay on inherited Opus where real architectural / vision / synthesis reasoning lives.
546
+ - `research-synthesizer.md` already used Haiku — unchanged.
547
+ - Net effect on a typical phase cycle (1 plan + 1-2 checker spawns + 1 verifier + 0-1 QA-browser): roughly 40% cost reduction without changing any prompts or skill flow.
548
+
549
+ ### Fixed — progress bar
550
+
551
+ - `bin/state.js:332` progress formula was `(phase - 1) / total_phases` which meant a completed 3-phase project showed 66% forever and a 1-phase project showed 0%. Now factors in `verified|polished|shipped|handed_off` statuses so the bar reaches 100% when the project is actually done.
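
The old formula and one possible shape of the fix can be compared side by side. The old formula is quoted from the entry above; `newProgress` is an assumption about how the done statuses are factored in, not the actual code at `bin/state.js:332`.

```javascript
const DONE_STATUSES = new Set(["verified", "polished", "shipped", "handed_off"]);

// Old behavior: the current phase never counts, so the last phase can never reach 100%.
function oldProgress(phase, totalPhases) {
  return Math.floor(((phase - 1) / totalPhases) * 100);
}

// Assumed fix: count the current phase as complete once its status is a done status.
function newProgress(phase, totalPhases, status) {
  const completed = phase - 1 + (DONE_STATUSES.has(status) ? 1 : 0);
  return Math.floor((completed / totalPhases) * 100);
}
```

Under the old formula a finished 3-phase project evaluates to 66 and a 1-phase project to 0, matching the symptoms described; with the status check both reach 100 once the final phase is done.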
552
+
553
+ ### Tests
554
+
555
+ - 472 assertions passing (up from 462). The 10 new assertions come from `tests/refs.test.sh`. All seven prior test suites continue to pass with the v5.9 changes.
556
+
557
+ ### Migration notes
558
+
559
+ - No breaking changes. Skills, agents, hooks, state machine, and CLI are all backward-compatible.
560
+ - Existing installs upgrade via `npx qualia-framework@latest install`. After upgrade, the next `/qualia-report` failure will enqueue the report; the queue drains on the next session-start. Old reports stranded by v5.8.0 are not auto-recovered (they were never queued), but re-running `/qualia-report` for those days will succeed if the ERP is reachable.
561
+ - Cost-curious users can confirm the model tiering by running `qualia-framework agents --failed` after a phase verification — the `model` field on each spawn record will show `sonnet` for the four tiered agents.
562
+
563
+ ## [5.8.0] - 2026-05-11 (Surface cleanup, polish-loop consolidation)
564
+
565
+ **The "stop shipping deprecated skills" release.** Removes three skills (the two
566
+ deprecated in v5.7.0, plus `/qualia-prd`) and consolidates `/qualia-polish-loop` into `/qualia-polish`
567
+ as a `--loop` flag. Skill count drops from 35 to 32.
568
+
569
+ ### Removed
570
+
571
+ - **`/qualia-quick`** removed. Was deprecated in v5.7.0; use `/qualia-feature`,
572
+ which auto-routes to the inline path for trivial work.
573
+ - **`/qualia-task`** removed. Was deprecated in v5.7.0; use `/qualia-feature`,
574
+ which auto-routes to a fresh builder spawn for 1-5 file work.
575
+ - **`/qualia-prd`** removed. Niche surface that overlapped `/qualia-discuss`
576
+ in PROJECT MODE (v5.6.0). Conversation-to-spec capture now lives in
577
+ `/qualia-discuss` for kickoff and in regular phase-context docs for
578
+ mid-project additions.
579
+
580
+ ### Changed
581
+
582
+ - **`/qualia-polish-loop` consolidated into `/qualia-polish --loop`**. The
583
+ autonomous visual-polish loop is now a flag on `/qualia-polish`, not a
584
+ separate skill. Scripts moved from `skills/qualia-polish-loop/scripts/`
585
+ to `skills/qualia-polish/scripts/`; fixtures and REFERENCE.md follow.
586
+ Visual evaluator agent description updated to point at the new surface.
587
+ The seven-scope table in `/qualia-polish` gains a "Loop" row.
588
+
589
+ ### Migration notes
590
+
591
+ - Most workflows update by find-and-replace: `/qualia-polish-loop` becomes
592
+ `/qualia-polish --loop`. Flag syntax (`--brief PATH`, `--max N`,
593
+ `--viewports`, `--ref PATH`, `--budget N`, `--reduced-motion`,
594
+ `--routes`) is preserved.
595
+ - `/qualia-quick`, `/qualia-task`, `/qualia-prd` are gone with no shim.
596
+ Projects still typing these will see a "command not found" path. The
597
+ v5.7.0 deprecation warnings were the migration window.
598
+ - README and guide.md reflect the new skill count (32) and v5.8 framing.
599
+
600
+ ### Tests
601
+
602
+ - `tests/bin.test.sh` updated: polish-loop path assertions now target
603
+ `skills/qualia-polish/`. Pass labels renamed. Added v5.8 cleanup
604
+ assertions confirming non-existence of `qualia-quick`, `qualia-task`,
605
+ `qualia-prd` and presence of `qualia-feature`.
606
+
607
+ ## [5.7.0] - 2026-05-11 (Feature consolidation, /qualia-feature)
608
+
609
+ **The "one command instead of three for everything between a typo and a phase"
610
+ release.** Replaces the `/qualia-quick` + `/qualia-task` split with a single
611
+ auto-scoped `/qualia-feature` command. The old commands stay functional for one
612
+ minor-version migration window so the ~40 active projects on prior versions
613
+ don't break overnight.
614
+
615
+ ### Added
616
+
617
+ - **`/qualia-feature` skill** (`skills/qualia-feature/SKILL.md`). Auto-scopes
618
+ the work into one of three buckets:
619
+ - **inline**: typo, comment edit, 1-line config change, single rename. Direct
620
+ work in the current Claude context, no subagent spawn.
621
+ - **spawn**: add a component, add a route, fix a non-trivial bug, refactor
622
+ one module. Fresh builder spawn with atomic commit and validation contract.
623
+ - **refuse**: 5+ files or multi-step workflow. Routes to `/qualia-plan`
624
+ instead of trying to fit a phase into a single-task slot.
625
+
626
+ Escape hatches: `--force-spawn` and `--force-inline` flags override the
627
+ classifier when it guesses wrong. The interactive gate also offers
628
+ "Force inline", "Force spawn", and "Route to /qualia-plan" options.
629
+
630
+ ### Changed
631
+
632
+ - **`/qualia-quick` and `/qualia-task` marked deprecated**
633
+ (`skills/qualia-quick/SKILL.md`, `skills/qualia-task/SKILL.md`). Both still
634
+ work as before for v5.7.x. They print a one-line deprecation warning and
635
+ end with "use /qualia-feature next time" instead of the plain end banner.
636
+ Removal scheduled for v5.8.0.
637
+
638
+ - **Primary-surface references updated**. The road skill, the install banner,
639
+ and the session-start tip now point at `/qualia-feature` as the canonical
640
+ single-feature path. Old references to `/qualia-quick` are gone from those
641
+ primary surfaces (the deprecated skills themselves still self-document).
642
+
643
+ ### Migration notes
644
+
645
+ - Existing projects don't break. `/qualia-quick` and `/qualia-task` still run
646
+ end-to-end with the same behavior they had in v5.6.0; the only change is the
647
+ deprecation warning on entry.
648
+ - The classifier is intentionally confident. When in doubt, it picks `spawn`
649
+ (one fewer surprise than guessing `inline` and contaminating the current
650
+ context). Use `--force-inline` for the over-scoped cases.
651
+ - v5.8.0 will remove `/qualia-quick` and `/qualia-task` entirely. Plan to
652
+ switch primary usage to `/qualia-feature` during the v5.7.x window.
653
+
654
+ ## [5.6.0] - 2026-05-11 (Demo vs Full Project, mandatory discovery)
655
+
656
+ **The "every project starts with the right kickoff shape" release.** Splits the
657
+ project workflow into two paths at the very first fork: demo (single
658
+ shippable milestone for sales conversations) versus full project (multi-milestone
659
+ arc to Handoff). Makes the kickoff interview mandatory and structured, not an
660
+ ad-hoc free-form chat. Surfaces a clean conversion path when a demo client signs.
661
+
662
+ ### Added
663
+
664
+ - **Step 0.6 Project Type Gate** in `skills/qualia-new/SKILL.md`. Asks Demo vs
665
+ Full Project as the first decision after the brownfield check. Every
666
+ downstream step (research scope, journey shape, milestone count) branches
667
+ on this answer. Demo design philosophy is enforced: real backend always,
668
+ DESIGN.md mandatory, slop-detect hard-block. Speed comes from skipping
669
+ multi-milestone planning, never from skipping design quality or mocking the
670
+ backend.
671
+
672
+ - **PROJECT MODE in `/qualia-discuss`** (`skills/qualia-discuss/SKILL.md`).
673
+ Two-mode skill now. With no args, runs the non-technical kickoff interview
674
+ (8 questions for demo, 14 for full project), captures answers verbatim to
675
+ `.planning/project-discovery.md`. With a phase number arg (e.g.
676
+ `/qualia-discuss 2`), runs the existing technical pre-plan grilling
677
+ unchanged. The PROJECT MODE rule "never go technical here" preserves the
678
+ client's own phrasing as input to PRODUCT.md voice and CONTEXT.md glossary.
679
+
680
+ - **Mandatory discovery in `/qualia-new` Step 1**. Replaces the free-form
681
+ deep-questioning loop with an inline invocation of `/qualia-discuss` in
682
+ PROJECT MODE. The discovery doc seeds PROJECT.md, PRODUCT.md, CONTEXT.md,
683
+ and (full projects only) JOURNEY.md milestone names.
684
+
685
+ - **`templates/project-discovery.md`**, the new discovery document template.
686
+ Demo path covers §1-§8 (pitch, users, "remember 24h later" sentence,
687
+ anti-references, brand voice, success criterion, hard constraints, out of
688
+ scope). Full project adds §9-§14 (milestone arc, compliance, integrations,
689
+ content ownership, post-handoff team, budget/timeline shape).
690
+
691
+ - **Demo-Extension Branch in `/qualia-milestone`** (`skills/qualia-milestone/SKILL.md`).
692
+ When a demo project closes its only milestone, asks "did the client sign?".
693
+ On yes: invokes `/qualia-discuss` for the §9-§14 top-up, then spawns the
694
+ roadmapper in `extend-demo-to-full` mode to add M2..M{N-1} plus Handoff,
695
+ with M1 untouched because it shipped. On no: closes the demo as a shipped
696
+ artifact and routes to `/qualia-report`.
697
+
698
+ - **`<scope>quick</scope>` in researcher agents** (`agents/researcher.md`,
699
+ `agents/research-synthesizer.md`). Demo projects pass `quick` scope; each
700
+ researcher gets a 3-call budget instead of 8. The synthesizer produces a
701
+ single-milestone suggestion (no Handoff, no multi-milestone arc). Total
702
+ research time drops roughly 3x with no loss of correctness on the demo
703
+ path. The extra depth is wasted when there is only one milestone to ship.
704
+
705
+ - **`project_type` frontmatter** in `templates/project.md`. Stored as
706
+ `project_type: demo | full`. The milestone-close branch reads it to decide
707
+ whether to fire the demo-extension question.
708
+
709
+ ### Changed
710
+
711
+ - **`/qualia-discuss` description and routing.** Description now documents
712
+ both modes. The skill picks mode based on whether an arg is passed (no
713
+ arg = PROJECT MODE, numeric arg = PHASE MODE). All existing PHASE MODE
714
+ behavior is preserved unchanged.
715
+
716
+ - **`/qualia-new` Step 8 research scope** now branches on `PROJECT_TYPE`.
717
+ Demo projects pass `<scope>quick</scope>` to all 4 researchers and the
718
+ synthesizer; full projects use the standard 8-call budget per researcher.
719
+
720
+ - **`/qualia-new` Step 10 roadmapper output** now branches on `PROJECT_TYPE`.
721
+ Demo: 1 milestone (the demo itself, 2-4 phases), no Handoff, fully detailed.
722
+ Full: standard 2-5 milestone arc ending in Handoff.
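The argument-based mode selection for `/qualia-discuss` described above might look roughly like this. It is a sketch only: the skill is a prompt file, not code, so `pickDiscussMode` is a hypothetical name, and rejecting non-numeric arguments is an assumption the changelog does not state.

```javascript
// Hypothetical sketch: no argument selects PROJECT MODE, a numeric argument selects PHASE MODE.
function pickDiscussMode(args) {
  if (args.length === 0) return { mode: 'PROJECT' };
  if (/^\d+$/.test(args[0])) return { mode: 'PHASE', phase: Number(args[0]) };
  // Assumption: anything else is treated as a usage error.
  throw new Error(`expected no argument or a phase number, got "${args[0]}"`);
}
```

For example, `pickDiscussMode(['2'])` corresponds to `/qualia-discuss 2` and yields phase mode for phase 2.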
723
+
724
+ ### Migration notes
725
+
726
+ - Projects on v5.5 or earlier do not have `project_type` in their PROJECT.md.
727
+ The milestone-close branch treats missing `project_type` as `full` and
728
+ skips the demo-extension question. Backward-compatible.
729
+ - Existing in-flight projects don't need to re-run discovery. The mandatory
730
+ interview only fires at `/qualia-new` kickoff for net-new projects.
731
+ - `/qualia-discuss N` (phase-mode usage) is unchanged. Existing scripts and
732
+ team habits keep working.
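The backward-compatible default for `project_type` can be illustrated with a small frontmatter reader. This is a minimal sketch, assuming YAML-style frontmatter fenced by `---` lines at the top of PROJECT.md; `readProjectType` is a hypothetical helper, not a function from the framework.

```javascript
// Hypothetical sketch: read project_type from frontmatter, treating a missing value as "full".
function readProjectType(markdown) {
  // Frontmatter is assumed to be the block between the first two "---" lines.
  const frontmatter = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  if (!frontmatter) return 'full';
  const match = /^project_type:\s*(demo|full)\s*$/m.exec(frontmatter[1]);
  // Projects created before v5.6 have no project_type and fall through to "full".
  return match ? match[1] : 'full';
}
```

With this default, a milestone-close step would only ask the demo-extension question when the value is explicitly `demo`.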
733
+
734
+ ## [5.5.0] - 2026-05-09 (Plan-discipline pass)
735
+
736
+ **The "stop the LLM from quietly simplifying the spec" release.** Pre-execution
737
+ discipline upgrades that catch the most common failure modes (silent scope
738
+ reduction, missing requirement coverage, decision drift) before the builder
739
+ spawns and burns context. Synthesised from a deep research pass across six
740
+ external coding-agent frameworks; only the patterns with clear leverage and
741
+ no ceremony tax were inherited.
742
+
743
+ ### Added
744
+
745
+ - **Scope-Reduction Prohibition** in `agents/planner.md` and `agents/builder.md`.
746
+ Banned phrases (`v1`, `simplified version`, `hardcoded for now`, `will be
747
+ wired later`, `stub`, `placeholder`, etc.) plus three legitimate split
748
+ reasons (context cost over 50%, missing info, dependency conflict). LLMs
749
+ systematically simplify specs; this is a zero-token-cost guardrail in
750
+ steady state that targets the #1 quality failure mode.
751
+
752
+ - **Decision Coverage Audit** in `agents/planner.md`. Planner verifies that
753
+ every `D-NN` row from `.planning/phase-{N}-context.md` is covered by a task
754
+ before returning the plan. Closes the discuss → plan → verify → ship
755
+ traceability loop.
756
+
757
+ - **Auto-heal loop in builder Self-Verify** (`agents/builder.md`). On `tsc`
758
+ or `slop-detect` failure, fix and retry up to 2x before BLOCKED. Moves
759
+ correctness checks from post-hoc verifier to in-builder self-heal,
760
+ saving a verifier round per failure.
761
+
762
+ - **Plan-checker Rule 9: Decision Coverage** (`agents/plan-checker.md`).
763
+ Reads `.planning/phase-{N}-context.md`; BLOCKS if any `D-NN` is unclaimed
764
+ by the plan, or if a `Deferred Ideas` row is being implemented.
765
+
766
+ - **Plan-checker Rule 10: Scope Reduction Detection**. Greps the plan for
767
+ banned phrases; BLOCKS with a quoted citation. Exception is the project's
768
+ own product versioning (e.g. `migrate to API v2`).
769
+
770
+ - **Plan-checker Rule 11: Requirement Coverage**. Reads ROADMAP.md for
771
+ per-phase REQ-IDs; BLOCKS if any REQ-ID listed for the current phase is
772
+ not referenced by any task.
773
+
774
+ - **`findScopeReductionPhrases()` in `bin/plan-contract.js`**. The same scan
775
+ runs on the JSON contract path (`task.action`, `task.acceptance_criteria`)
776
+ so both paths catch the same failure mode. Exported for reuse.
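To make the scan concrete, here is a rough sketch of a banned-phrase check over the two JSON contract fields. It is not the code in `bin/plan-contract.js`: `scanTaskForBannedPhrases` and the `BANNED_PHRASES` list are illustrative, the list only includes the phrases quoted above, and the product-versioning exception is not implemented.

```javascript
// Illustrative list taken from the phrases quoted in this changelog entry.
const BANNED_PHRASES = [
  'simplified version',
  'hardcoded for now',
  'will be wired later',
  'stub',
  'placeholder',
  'v1',
];

// Hypothetical sketch: report every banned phrase found in task.action
// or task.acceptance_criteria, along with the field it was found in.
function scanTaskForBannedPhrases(task) {
  const hits = [];
  for (const field of ['action', 'acceptance_criteria']) {
    const value = task[field];
    const text = Array.isArray(value) ? value.join('\n') : String(value || '');
    for (const phrase of BANNED_PHRASES) {
      // Word boundaries keep "v1" from matching inside longer tokens.
      if (new RegExp(`\\b${phrase}\\b`, 'i').test(text)) hits.push({ field, phrase });
    }
  }
  return hits;
}
```

A checker built on this could block the plan whenever the returned array is non-empty and quote each hit back to the planner.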
777
+
778
+ ### Changed
779
+
780
+ - **`/qualia-discuss` output format** (`skills/qualia-discuss/SKILL.md`).
781
+ Locked Decisions now carry stable `D-NN` IDs in a four-column table
782
+ (`ID | Decision | Rationale | Source`). Downstream agents reference these
783
+ IDs in commit messages and task Why fields for traceability.
784
+
785
+ - **`templates/phase-context.md`** gains an `ID` column on the Locked
786
+ Decisions table.
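With stable `D-NN` IDs in the first table column, a coverage audit reduces to a set difference. The sketch below assumes the Locked Decisions table is plain pipe-delimited Markdown and that a task "covers" a decision by mentioning its ID anywhere in the plan text; `uncoveredDecisions` is a hypothetical name.

```javascript
// Hypothetical sketch: list D-NN IDs from the context doc that the plan never mentions.
function uncoveredDecisions(contextMarkdown, planText) {
  const ids = new Set();
  // Collect IDs from table rows that start with "| D-NN |".
  for (const match of contextMarkdown.matchAll(/^\|\s*(D-\d+)\s*\|/gm)) {
    ids.add(match[1]);
  }
  // A decision counts as covered when its exact ID appears in the plan.
  return [...ids].filter((id) => !new RegExp(`\\b${id}\\b`).test(planText));
}
```

An empty result means every locked decision is claimed by the plan; anything else would be grounds for a block under Rule 9.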
787
+
788
+ ### Removed
789
+
790
+ - **9 build-time engineering archive docs** (~140KB) referenced only from
791
+ CHANGELOG: `install-redesign-builder-prompt.md`, `install-redesign-pilot.md`,
792
+ `instruction-budget-audit.md`, `journey-demo.html`, the four
793
+ `playwright-loop-*` engineering notes (the active `pilot-results.md`
794
+ stays), `polish-loop-supervised-run.md`, and the matching archived
795
+ session report.
796
+
797
+ ### Tests
798
+
799
+ - New regression: `tests/lib.test.sh` covers scope-reduction detection on
800
+ both `task.action` and `task.acceptance_criteria` (16 passing, was 15).
801
+ - Total suite: **491 passing across 7 suites**, no regressions.
802
+
803
+ ### Notes
804
+
805
+ - Plan-checker now covers 6 of 7 industry-standard goal-backward dimensions.
806
+ Per-task context budget enforcement was intentionally skipped (heuristic,
807
+ false-positive prone, low signal); the existing Quality Degradation Curve
808
+ in `rules/grounding.md` covers the prompt-side hint.
809
+ - No public API changes. Skill and agent prompt deltas are additive; existing
810
+ plans without `D-NN` IDs or REQ-IDs in ROADMAP.md continue to pass (Rules
811
+ 9 and 11 are conditional on those artifacts existing).
812
+
813
+ ## [5.4.0] — 2026-05-07 (Token-discipline + grounding pass)
814
+
815
+ **The "fewer tokens, no quality loss" release.** Driven by a re-study of
816
+ Matt Pocock's *Skills and Strategies for Real Engineering*, Patrick
817
+ Debois's *Context is the New Code*, Karpathy on Software 3.0, and Mario
818
+ Zechner on agentic search. Eight PRs landed:
819
+
820
+ ### Added
821
+
822
+ - **`rules/architecture.md`** — deep modules (Ousterhout), locality
823
+ over DRY, adapters at seams, layered service boundaries, interface
824
+ stability, test-the-seam-not-the-function. Auto-loaded only on
825
+ architectural-judgment tasks.
826
+
827
+ - **`rules/speed.md`** — CLI-first / MCP tier-list rule. Default to
828
+ CLI; reach for MCP only when the CLI doesn't expose the operation
829
+ or there's no CLI alternative. Per-MCP guidance for the 10+ MCP
830
+ servers we use. The `/supabase` skill (replacing 32 supabase MCP
831
+ tools) is the canonical pattern.
832
+
833
+ - **`docs/instruction-budget-audit.md`** — anchors the discipline. The
834
+ ~22 KB-per-session leak from auto-loaded design rules is documented
835
+ with cost breakdown and a relocation plan.
836
+
837
+ - **`tests/skills.test.sh`** (168 assertions) — smoke tests every
838
+ `skills/*/SKILL.md`: frontmatter parses, name matches folder,
839
+ description ≥ 50 chars, has trigger phrases (or `disable-model-
840
+ invocation`), body has h1 + section. **Plus a cache-aware spawn
841
+ audit:** every spawn to a custom `qualia-*` subagent must anchor
842
+ the prompt with `@~/.claude/agents/{name}.md` so Anthropic's prompt
843
+ cache reuses the prefix (documented 81–90% cost reduction).
844
+
845
+ - **`tests/slop-detect.test.sh`** (9 assertions) — behavioral tests
846
+ for the AI-tells gatekeeper. Verifies em-dash detection (HIGH),
847
+ banned-font detection (CRITICAL exits 1), purple-blue gradient
848
+ detection (CRITICAL), JSON output, missing-path graceful exit.
849
+
850
+ - **NotebookLM-first research path** in `agents/researcher.md` and
851
+ `/qualia-research`. Mandatory pre-flight: query `mcp__notebooklm-mcp__
852
+ cross_notebook_query` and `~/qualia-memory` (via knowledge.js +
853
+ qualia-recall) BEFORE any external Context7/WebFetch/WebSearch
854
+ call. Local pre-flight is FREE (not counted against the 8-call
855
+ external budget). Skip web round when local sources cover ≥ 80% at
856
+ confidence ≥ MEDIUM. Estimated 60–80% reduction in research-phase
857
+ tokens for `/qualia-new` and `/qualia-research` — the team has 21
858
+ curated NotebookLM notebooks covering most domains we ship in.
859
+
860
+ - **Citation-enforcement preambles** in `agents/builder.md`,
861
+ `agents/verifier.md`, `agents/plan-checker.md`, `agents/qa-browser.md`,
862
+ `agents/researcher.md`, `agents/research-synthesizer.md`,
863
+ `agents/roadmapper.md`. Every claim cites `file:line — "quoted"`,
864
+ no hedging language (banned: "seems", "appears", "probably",
865
+ "might", "likely", "possibly"), `INSUFFICIENT EVIDENCE` is the
866
+ correct answer when tool-budget is exhausted. Closes the *Never
867
+ Trust an LLM* gap surfaced by Pocock and Karpathy's *jagged
868
+ intelligence* framing.
869
+
870
+ - **Trigger phrases** added to three previously-vague skill
871
+ descriptions: `qualia-map` ("map this codebase", "brownfield
872
+ setup"), `qualia-milestone` ("close milestone", "next milestone"),
873
+ `qualia-research` ("best practices for X", "compare libraries").
874
+ Skills now fire on natural-language requests, not just slash commands.
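The skip rule in the NotebookLM-first research path (local coverage of at least 80% at confidence MEDIUM or better) can be expressed as a one-line predicate. This is a sketch: the agent applies the rule in prose, the `LOW` and `HIGH` levels are assumed, and `shouldSkipWebRound` is a hypothetical name.

```javascript
// Assumed ordering of confidence levels; only MEDIUM is named in the changelog.
const CONFIDENCE_RANK = { LOW: 0, MEDIUM: 1, HIGH: 2 };

// Hypothetical sketch: skip external calls when local sources already cover the question.
function shouldSkipWebRound({ coverage, confidence }) {
  // coverage is a fraction in [0, 1]; unknown confidence labels never qualify.
  return coverage >= 0.8 && CONFIDENCE_RANK[confidence] >= CONFIDENCE_RANK.MEDIUM;
}
```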
875
+
876
+ ### Changed
877
+
878
+ - **Design substrate relocated** from `~/.claude/rules/` (auto-loaded
879
+ by Claude Code's harness on every session) to `~/.claude/qualia-design/`
880
+ (load-on-demand via explicit `Read`). Six files affected:
881
+ `design-laws.md`, `design-brand.md`, `design-product.md`,
882
+ `design-rubric.md`, `design-reference.md`, `frontend.md`. **~22 KB
883
+ of instruction budget recovered per session** for non-frontend work.
884
+ Skills/agents that need the substrate `Read` it explicitly when they
885
+ need it (already the pattern in `agents/builder.md` and
886
+ `agents/verifier.md`). `bin/install.js` migrates legacy paths
887
+ on next install.
888
+
889
+ - **`state.js cmdTransition`** split from a 195-line monolith into 7
890
+ named helpers (`applyNoteOrActivity`, `applyPlannedTransition`,
891
+ `applyBuiltTransition`, `applyVerifiedTransition`,
892
+ `applyPolishedTransition`, `applyShippedTransition`,
893
+ `commitTransitionAtomic`). Orchestrator now reads top-to-bottom in
894
+ 84 lines. Behavior preserved exactly — verified by all 59
895
+ state.test.sh assertions.
896
+
897
+ - **Skill description boundaries sharpened** for two
898
+ picker-confusable pairs: `qualia-quick` vs `qualia-task` ("NO
899
+ spawn" vs "FRESH builder spawn"), `qualia-road` vs `qualia-help`
900
+ ("TERMINAL" vs "BROWSER HTML"). Picker now disambiguates correctly.
901
+
902
+ - **`rules/infrastructure.md`** Supabase guidance reconciled with the
903
+ new CLI-first rule. Was: "MCP server is available for direct
904
+ database operations". Now: "prefer `npx supabase` over MCP; reach
905
+ for MCP only when CLI can't".
906
+
907
+ - **`README.md`** — three updates:
908
+ - `## Don't run Claude's /init` warning at the top, directing users
909
+ to `/qualia-new` (greenfield) or `/qualia-map` (brownfield) per
910
+ Pocock's *Never run /init* — built-in `/init` generates bloated
911
+ docs that rot and consume instruction budget.
912
+ - `## What's Inside` — split rules section into 6 always-loaded
913
+ rules (`rules/`) + 6 lazy-loaded design files (`qualia-design/`).
914
+ - Skill listings now include `/qualia-flush`, `/qualia-postmortem`,
915
+ `/zoho-workflow` (previously installed but undocumented).
916
+
917
+ - **`SESSION_REPORT.md` + `V4_REVIEW.md`** moved to `docs/archive/`.
918
+ Historical artifacts; git history preserves them.
919
+
920
+ ### Test surface
921
+
922
+ | Group | Assertions |
923
+ |---|---:|
924
+ | statusline | 14 |
925
+ | state | 59 |
926
+ | hooks | 66 |
927
+ | bin | 146 |
928
+ | lib | 15 |
929
+ | skills (new) | 178 |
930
+ | slop-detect (new) | 9 |
931
+ | **Total** | **487** |
932
+
933
+ ### Source attribution (research basis)
934
+
935
+ - Matt Pocock — *Skills and Strategies for Real Engineering with
936
+ Claude Code* (13-source NotebookLM): citation discipline,
937
+ instruction-budget rule, deep modules, never-run-/init, never-trust-
938
+ an-LLM.
939
+ - Patrick Debois — *Context is the New Code*: CDLC, eval pillar.
940
+ - Andrej Karpathy — *Software 3.0 / Vibe Coding to Agentic
941
+ Engineering*: jagged intelligence, verifiable intelligence.
942
+ - Mario Zechner — *Tokens*: agentic search, context rot.
943
+ - Plus 12 sources in *Context and Tools: Mastering the AI Engineering
944
+ Lifecycle* on RAG, agent harnesses, voice agents, CLI vs MCP.
945
+
946
+ ## [5.3.0] — 2026-05-05 (Matt Pocock gaps)
947
+
948
+ **The "absorb the missing patterns" release.** A re-study of Matt
949
+ Pocock's published skills repo and Claude-Code-for-Real-Engineers
950
+ content surfaced three patterns Qualia hadn't yet absorbed: the
951
+ PRD-from-conversation flow, the parallel-interface-alternatives pattern
952
+ in deep-module refactoring, and the instruction-budget-to-hook
953
+ generator. v5.3 ships all three. Most of Matt's other patterns
954
+ (CONTEXT.md, ADRs, slim CLAUDE.md, /qualia-discuss = grill, /qualia-test
955
+ --tdd, /qualia-issues = to-issues, /qualia-zoom = zoom-out, /qualia-debug
956
+ = diagnose, /qualia-optimize --deepen ≈ improve-codebase-architecture)
957
+ were already in v5.x.
958
+
959
+ ### Added
960
+
961
+ - **`/qualia-prd`** — synthesize the current conversation into a
962
+ durable PRD at `.planning/PRD-{slug}.md`. Optionally opens a parent
963
+ GitHub issue. No interview — just synthesis. Pairs with
964
+ `/qualia-issues` to break the PRD into vertical-slice issues.
965
+ Distinct from `/qualia-plan` (phase-operational) — `/qualia-prd` is
966
+ feature-durable. Distinct from `/qualia-new` (one-shot project
967
+ setup) — `/qualia-prd` runs every time a new feature needs a spec.
968
+ Forked subagent for synthesis: full conversation context flows into
969
+ the fork, only `{path, summary}` flows back to the parent session.
970
+ Token discipline preserved.
971
+
972
+ - **`/qualia-hook-gen`** — convert a CLAUDE.md or rules/*.md
973
+ instruction into a deterministic Claude Code `pre-tool-use` hook.
974
+ Three patterns: block (exit 2 with message), rewrite (block with
975
+ alternative), warn (exit 0 with stderr message). Generates
976
+ `hooks/block-{name}.js` (pure Node, cross-platform — no `.sh`) and
977
+ the `settings.json` patch. Tests the hook before committing. After
978
+ install, surfaces the now-redundant CLAUDE.md line so the user can
979
+ trim instruction budget. Refuses non-deterministic rules
980
+ (stylistic / judgment-based) — recommends a skill or ESLint rule
981
+ instead. Closes the gap between Matt's "instruction-budget
982
+ discipline" advocacy and actually shrinking real-project CLAUDE.md
983
+ files.
984
+
985
+ - **Step 5b: Parallel Interface Design (Wave 3)** in
986
+ `/qualia-optimize --deepen` — after the strategist returns
987
+ deepening candidates and the user picks one, spawn 3 fan-out agents
988
+ IN THE SAME RESPONSE TURN, each producing a *radically different*
989
+ interface design (variant 1 = functional/data-oriented, variant 2 =
990
+ OOP/encapsulated, variant 3 = event-driven/message-based). Present
991
+ the 3 alternatives in a comparison table; user picks 1, 2, 3, or
992
+ hybrid. Then a synthesizer writes the Refactor RFC to
993
+ `.planning/REFACTOR-{slug}.md` honoring the user's pick. From
994
+ Matt's improve-codebase-architecture skill — produces dramatically
995
+ better refactor RFCs than a single-pass proposal. ~15K tokens per
996
+ candidate (cached prefix shared across the 3 spawns).
997
+
998
+ - **9 new tests** (bin.test.sh #135-143): qualia-prd installs,
999
+ describes synthesis flow, documents fork-based token discipline;
1000
+ qualia-hook-gen installs, documents block/rewrite/warn patterns,
1001
+ mandates pure-Node hooks; qualia-optimize Step 5b parallel-interface
1002
+ is documented; qualia-optimize REFERENCE.md has the spawn template;
1003
+ package.json version assertion accepts 5.3.x.
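As an illustration of the "block" pattern, the core of a generated hook could be a pure decision function like the one below. This is a sketch, not output from `/qualia-hook-gen`: the force-push rule and the `decide` name are made up for the example, and the event shape (`tool_name`, `tool_input.command`) is assumed to follow the JSON that Claude Code passes to pre-tool-use hooks on stdin.

```javascript
// Hypothetical sketch of a "block" hook's decision logic.
// Exit code 2 signals a block; the message would go to stderr.
function decide(event) {
  const command = (event.tool_input && event.tool_input.command) || '';
  // Example rule (illustrative only): never force-push from an agent session.
  if (event.tool_name === 'Bash' && /\bgit\s+push\b.*--force\b/.test(command)) {
    return { exitCode: 2, message: 'Blocked: force-push is not allowed here.' };
  }
  return { exitCode: 0, message: '' };
}
```

A complete hook file would read the event JSON from stdin, call `decide`, write the message to stderr, and exit with the returned code. Keeping the decision pure makes it easy to test before the hook is installed.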
1004
+
1005
+ ### Changed
1006
+
1007
+ - **`skills/qualia-optimize/SKILL.md`** — adds "Step 5b: Wave 3 --
1008
+ Parallel Interface Design" section. Triggers in `--deepen` mode
1009
+ only after candidate selection.
1010
+ - **`skills/qualia-optimize/REFERENCE.md`** — adds the verbatim
1011
+ "Parallel interface design prompt" with per-variant lens
1012
+ assignments and output contract (5 numbered sections per variant).
1013
+ - `package.json` version bumped to 5.3.0.
1014
+
1015
+ ### Honest carryover (still deferred)
1016
+
1017
+ - **CLAUDE.md slimming to 1 line** (Matt's "you are on WSL on
1018
+ Windows" pattern) — Qualia's CLAUDE.md is multi-tenant team-aware
1019
+ with role substitution. Can't go to 1 line without losing role
1020
+ injection. Wait until install.js can inline role into a session
1021
+ prompt instead of a global file. v5.4+ candidate.
1022
+ - **`user-invocable: false` frontmatter convention** — small
1023
+ marking that says "this skill is invoked by other skills, not
1024
+ users." Low impact, fold into v5.3.x patch.
1025
+ - **Graybox-locking pattern in /qualia-test** — when refactoring
1026
+ a deep module, lock the interface with comprehensive tests.
1027
+ Educational extension, fold into /qualia-test's TDD docs in
1028
+ v5.3.x.
1029
+ - **"What-to-do / supporting-info" structured sections** in
1030
+ SKILL.md — Matt's recent authoring convention. Fold into
1031
+ `/qualia-skill-new` template in v5.3.x.
1032
+
1033
+ ### Why these changes
1034
+
1035
+ The Matt-Pocock NotebookLM study surfaced exactly what we'd missed.
1036
+ v5.0 absorbed the philosophy (instruction-budget discipline, alignment
1037
+ substrate, deep modules); v5.1 shipped the visual-polish loop; v5.2
1038
+ hardened it. v5.3 closes the remaining patterns gap so Qualia is at
1039
+ parity with Matt's published toolkit while still carrying its own
1040
+ flagship contributions (visual-polish loop, multi-target install,
1041
+ state-machine project lifecycle, ERP integration, the 8-dimension
1042
+ design rubric, the Karpathy raw-to-wiki memory layer).
1043
+
1044
+ Tests: 291 (v5.2) → **300 (v5.3)**. All passing. slop-detect clean
1045
+ on all v5.3 changes.
1046
+
1047
+ ## [5.2.0] — 2026-05-05 (Polish-loop reliability)
1048
+
1049
+ **The "close the v5.1 deferrals" release.** v5.1 shipped the autonomous
1050
+ visual-polish loop (`/qualia-polish-loop`) with three named deferrals in
1051
+ the CHANGELOG: the `prefers-reduced-motion` forced-capture flag, multi-
1052
+ route sweeps, and the first real-project supervised run. v5.2 closes all
1053
+ three. The loop is now production-ready against multi-page sites and
1054
+ honors user a11y intent at capture time.
1055
+
1056
+ ### Added
1057
+
1058
+ - **`--reduced-motion`** flag on `playwright-capture.mjs` and
1059
+ `loop.mjs init`. Forces `prefers-reduced-motion: reduce` in the
1060
+ captured page. Implemented for both backends — Playwright's
1061
+ `newContext({ reducedMotion: 'reduce' })` and Chromium's
1062
+ `--force-prefers-reduced-motion` CLI flag. State records
1063
+ `state.reduced_motion: true` so the vision evaluator knows to score
1064
+ motion on CSS-declaration quality only (not penalize "no animation
1065
+ visible"). Closes v5.1 §5g adversarial deferral.
1066
+ - **`--routes URL1,URL2,URL3`** flag on `loop.mjs init`. Multi-route
1067
+ mode. State stores both `state.url` (first URL, for backward compat
1068
+ with single-route SKILL.md drivers) and `state.urls` (full list).
1069
+ Orchestrator drives capture+eval per URL each iteration; aggregate
1070
+ scores are min across URLs and viewports. The kill-switch fingerprint
1071
+ set is shared across all URLs — a regression on any route halts the
1072
+ loop. Closes v5.1 §"Multi-route sweeps" deferral.
1073
+ - **`docs/polish-loop-supervised-run.md`** — write-up of the first
1074
+ supervised end-to-end run. Captures real wall-clock time, token
1075
+ estimates, both flag combinations exercised, what worked, what
1076
+ surprised. Removes the "first real-project supervised run not done"
1077
+ caveat from v5.1's CHANGELOG.
1078
+ - **6 new tests** (tests #129-134 in `bin.test.sh`) covering: `--routes`
1079
+ multi-route init, `state.url` first-entry backward compat,
1080
+ `--reduced-motion` flag recording, capture --help documents
1081
+ `--reduced-motion`, init rejects missing `--url`/`--routes`, report
1082
+ renders multi-route header.
1083
+
1084
+ ### Changed
1085
+
1086
+ - **`scripts/loop.mjs report`** — when `state.urls.length > 1`, the
1087
+ header renders `**URLs (N):** url1, url2, ...` instead of a single
1088
+ URL. Reduced-motion mode renders an extra `**Reduced motion:**
1089
+ forced` row.
1090
+ - **`scripts/loop.mjs init`** — accepts either `--url` or `--routes`
1091
+ (one is required, `--routes` wins if both given). Records
1092
+ `state.reduced_motion` for downstream capture invocations.
1093
+ - `package.json` version bumped to 5.2.0.
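The `init` rules above (`--routes` wins over `--url`, `state.url` is the first entry, scores aggregate by minimum) can be sketched as follows. This is not the code in `scripts/loop.mjs`: `buildInitState` and `aggregateScore` are hypothetical names, and populating `state.urls` in single-URL mode is an assumption.

```javascript
// Hypothetical sketch of how init might derive state from its flags.
function buildInitState({ url, routes, reducedMotion = false }) {
  // --routes takes precedence when both flags are given.
  const urls = routes
    ? routes.split(',').map((entry) => entry.trim()).filter(Boolean)
    : url ? [url] : [];
  if (urls.length === 0) throw new Error('init requires --url or --routes');
  // state.url stays the first URL for single-route drivers.
  return { url: urls[0], urls, reduced_motion: Boolean(reducedMotion) };
}

// Aggregate score: the minimum across all URLs and viewports.
function aggregateScore(scores) {
  return Math.min(...scores);
}
```

Taking the minimum means one weak route or viewport holds the whole iteration back, which matches the shared kill-switch behavior described for multi-route mode.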
1094
+
1095
+ ### Honest caveats
1096
+
1097
+ - Multi-route mode requires the SKILL.md driver to loop the
1098
+ capture+eval per URL each iteration. The state-machine primitives
1099
+ (`init`, `record`, `commit-fix`, `report`) are URL-agnostic, but the
1100
+ orchestrator (Claude session) has to invoke them N times per
1101
+ iteration where N is `state.urls.length`. The SKILL.md guidance for
1102
+ multi-route is documented in `loop.mjs --help` and the v5.2
1103
+ supervised run write-up.
1104
+ - `--reduced-motion` on the chromium-binary backend uses Chrome's
1105
+ `--force-prefers-reduced-motion` flag which has been stable since
1106
+ Chrome 87 (2020). Older browsers may ignore it; Playwright's
1107
+ `reducedMotion: 'reduce'` is more reliable when both backends are
1108
+ available.
1109
+ - Pixel-similarity reference-image comparison still deferred to v5.3.
1110
+ The `--ref` flag remains rubric-anchored (the evaluator looks at
1111
+ both screenshots and scores against the rubric). True
1112
+ structural-similarity comparison needs an embedding step that's
1113
+ outside the scope of this release.
1114
+
1115
+ ### Tests
1116
+
1117
+ 274 (v5.0) → 285 (v5.1) → **291 (v5.2)**. All passing. slop-detect
1118
+ clean (0 critical) on all v5.2 changes.
1119
+
1120
+ ## [5.1.0] — 2026-05-03 (Multi-target install + live-progress redesign)
1121
+
1122
+ **The "first impression matters" release.** Every Qualia user — Fawzi,
1123
+ Hasan, Moayad, Rama, Sally, future hires — meets the framework first
1124
+ through `npx qualia-framework install`. v5.0's installer was functional
1125
+ but minimal: silent for 5 seconds, then a wall of green checkmarks. v5.1
1126
+ turns the install into a polished, intentional document: live progress on
1127
+ every operation, per-section timing, and a target-selection prompt that
1128
+ acknowledges Qualia ships AGENTS.md (the open standard adopted by Codex,
1129
+ Cursor, Continue, Aider, Devin) and meets users wherever they edit.
1130
+
1131
+ The root cause this addresses: silence reads as a hung process. When the
1132
+ installer copies 33 skills + 12 hooks + 24 templates and prints nothing
1133
+ between section headers and the final summary, the user wonders if their
1134
+ laptop froze. Live updates tell them the framework is alive and working
1135
+ through the list. Visual feedback is trust.
1136
+
1137
+ ### Added
1138
+
1139
+ - **Multi-target install** — after the team-code prompt, the installer
1140
+ asks: install to Claude Code (`~/.claude/`), OpenAI Codex
1141
+ (`~/.codex/`), or both. Default is `1` (Claude Code only) so legacy
1142
+ scripts keep working untouched. `2` writes only `~/.codex/AGENTS.md`
1143
+ with role substitution; `3` does both. Codex CLI is detected via
1144
+ `which codex`; if absent, the installer warns and writes the file
1145
+ anyway so the user is set up for when they install Codex.
1146
+ - **Live-progress lifecycle** — every meaningful copy operation prints a
1147
+ `⏳ doing...` line that overwrites in place to `✓ done` when the op
1148
+ completes. Long-running ops (settings.json merge, recursive template
1149
+ copy) show a Braille-pattern spinner. Sub-50ms ops skip the "doing"
1150
+ state and go straight to `✓` to avoid noise.
1151
+ - **Per-section summaries** — each install phase (Skills, Agents, Rules,
1152
+ Hooks, Templates, etc.) closes with a `└─ N items · Xs` line showing
1153
+ the count and elapsed time. The final summary card adds rows for
1154
+ **Targets** (Claude Code · Codex · Both) and **Time** (total install
1155
+ duration), plus a contextual "first command to try" hint that
1156
+ recommends `/qualia` if a `.planning/` folder exists in the current
1157
+ directory and `/qualia-new` otherwise.
1158
+ - **`bin/qualia-ui.js` extended** — exports `step()`, `spinner()`,
1159
+ `progress()`, `box()`, `kv()`, `divider()`, `section()`,
1160
+ `sectionClose()` for require-side consumption from `install.js` and
1161
+ any future CLI work. The existing CLI dispatch (`node qualia-ui.js
1162
+ banner` etc.) is preserved — gated behind `require.main === module`
1163
+ so importing the file doesn't trigger argv parsing.
1164
+ - **`docs/install-redesign-pilot.md`** — captured output of all three
1165
+ install scenarios (Claude only, Codex only, Both), timing
1166
+ measurements, and the TTY-degradation verification.
1167
+
1168
+ ### Changed
1169
+
1170
+ - **`bin/install.js`** — same semantic install logic as v5.0 (skills,
1171
+ agents, hooks, rules, templates, knowledge layer, ERP config,
1172
+ settings.json merge with backup) wrapped in the new live-progress
1173
+ primitives + section timing. Backward compatible: a piped install
1174
+ with only the team code (`echo "QS-FAWZI-11" | npx qualia-framework
1175
+ install`) still completes correctly and defaults to Claude only.
1176
+ - **`bin/qualia-ui.js`** — module is now require-able (gated CLI
1177
+ dispatch) and exports the new primitives alongside an OKLCH-tinted
1178
+ color palette. Color discipline preserved: teal/green/dim/white from
1179
+ v5.0; errors stay red, warns stay yellow.
1180
+ - **`package.json`** — `files` array adds `AGENTS.md` so the Codex
1181
+ install can read the framework copy at install time (it was already
1182
+ shipped in source but excluded from the published tarball).
1183
+ - **`tests/bin.test.sh`** — adds 11 tests for v5.1: target=1/2/3 paths,
1184
+ legacy single-line backward compat, Codex AGENTS.md backup before
1185
+ overwrite, no redundant backup on identical re-install, non-TTY log
1186
+ cleanliness (no orphan `\r` / hide-cursor / clear-line escapes),
1187
+ summary card shows Targets+Time, qualia-ui exports check, qualia-ui
1188
+ CLI dispatch still works, package.json version bump assertion.
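The target prompt's mapping (`1` Claude Code, `2` Codex, `3` both, default `1`) is small enough to sketch directly. `resolveTargets` is a hypothetical name, and falling back to Claude Code on unrecognized input is an assumption beyond the documented default.

```javascript
// Hypothetical sketch: map the target prompt answer to install targets.
function resolveTargets(answer) {
  switch ((answer || '').trim()) {
    case '2':
      return ['codex'];
    case '3':
      return ['claude', 'codex'];
    default:
      // "1", an empty answer, or anything unrecognized: Claude Code only,
      // which keeps legacy piped installs working.
      return ['claude'];
  }
}
```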
1189
+
1190
+ ### Honest caveats
1191
+
1192
+ - **Codex install scope in v5.1: AGENTS.md only (superseded by v6.2.7).** Codex's runtime did not
1193
+ then consume Claude-style skills/hooks/agents on disk in a way
1194
+ the framework can map 1:1 (Codex agents use a `.toml` format and a
1195
+ different hook schema). AGENTS.md carries the rules — the open
1196
+ standard Codex / Cursor / Continue / Aider / Devin all read — and
1197
+ commands continue to route through Claude Code. If Codex grows
1198
+ skill/hook support that lines up, the framework will extend the
1199
+ Codex install path here without breaking AGENTS.md compat.
1200
+ - **Spinner cosmetics on `cmd.exe`.** Modern Windows Terminal renders
1201
+ Braille frames fine. Older `cmd.exe` may render them as boxes; the
1202
+ install still completes correctly. The non-TTY (piped) path skips
1203
+ spinners entirely and prints plain `✓` lines.
1204
+ - **Total install time shown in seconds.** On modern hardware the entire
1205
+ install completes in ~100–300ms, so the Time row often shows `0.1s`
1206
+ or `0.2s`. Not a bug — the framework is just fast.
1207
+ - **Two readline races avoided.** v5.1's first implementation had a
1208
+ subtle bug: between `await askCode()` and `await askTarget()`, EOF
1209
+ on piped stdin emitted `'close'` before the second `question()`
1210
+ could attach a line listener — `target` always defaulted to `1`
1211
+ even when the user piped `CODE\n2\n`. Fixed by pre-buffering all
1212
+ stdin lines synchronously when the input is non-TTY. Documented here
1213
+ so future contributors don't reinvent the bug.
1214
+
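The pre-buffering fix described in the last caveat can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the installer's actual code: `makeAsker` and `readPipedStdin` are hypothetical names, and treating a blank line as "use the default" is an assumption.

```javascript
const fs = require("node:fs");

// Hypothetical helper: answers prompts from lines that were read up front,
// so a 'close' event on piped stdin can never race a later question.
function makeAsker(bufferedText) {
  const lines = bufferedText.split(/\r?\n/);
  let next = 0;
  // Returns the next buffered line, or the fallback for a blank or missing line.
  return function ask(fallback) {
    if (next < lines.length && lines[next] !== "") {
      return lines[next++].trim();
    }
    next++;
    return fallback;
  };
}

// Assumed wiring: only pre-buffer when stdin is not a TTY.
// fs.readFileSync(0, ...) reads file descriptor 0 (stdin) synchronously.
function readPipedStdin() {
  return process.stdin.isTTY ? null : fs.readFileSync(0, "utf8");
}

// Simulates the piped input `CODE\n2\n` from the caveat.
const ask = makeAsker("CODE\n2\n");
const code = ask("");    // first prompt consumes "CODE"
const target = ask("1"); // second prompt consumes "2" instead of defaulting
console.log(code, target);
```

Because every line is already in memory before the first prompt, the order in which prompts attach listeners no longer matters.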
1215
+ ### Why these changes
1216
+
1217
+ A boring install signals a boring framework. v5.0 was functional; v5.1
1218
+ is the version Fawzi can demo to a client and have them say "this feels
1219
+ like a product." The Codex addition is strategic: AGENTS.md adoption
1220
+ across Codex / Cursor / Continue / Aider / Devin means Qualia rules can
1221
+ follow the user across vendors without forking the framework. v5.1 makes
1222
+ that explicit, opt-in, and one keystroke away (`2` or `3`).
1223
+
1224
+ Hooks/skills/agents stay Claude-only because Claude Code is where the
1225
+ framework's automation lives. AGENTS.md is the rule layer; the rest is
1226
+ the engine.
1227
+
1228
+
1229
+
1230
+ ## [5.0.0] — 2026-05-03 (Visual-Polish Loop addendum)
1231
+
1232
+ **The "see your own work" addition** to the v5.0.0 release.
1233
+ Adds `/qualia-polish-loop`, the autonomous visual-polish loop that
1234
+ screenshots a live URL at three viewports, scores against the
1235
+ 8-dimension `rules/design-rubric.md` using a vision-anchored evaluator,
1236
+ fixes the top issues, and loops until every dimension scores at least 3 or the
1237
+ kill-switch trips. Closes the design-iteration churn friction documented
1238
+ in Fawzi's `/insights` data (the #1 strategic workflow opportunity from
1239
+ v5.0's hardening audit).
1240
+
1241
+ The root cause this addresses: the framework's design QA was text-only.
1242
+ `slop-detect` grep-scans CSS for known anti-patterns. The verifier
1243
+ scores design by reading TSX/CSS. Both miss visual failures that only
1244
+ show on a rendered page — hero-video framing on mobile, breakpoint
1245
+ collapse, spacing rhythm, motion presence. `/qualia-polish-loop` is the
1246
+ first Qualia skill that actually looks at what the browser draws.
1247
+
1248
+ ### Added
1249
+
1250
+ - **`/qualia-polish-loop`** — new flagship skill. CLI:
1251
+ `/qualia-polish-loop {url} [--brief PATH] [--max 8] [--viewports 375,768,1440] [--budget 100000]`.
1252
+ Pre-flight gates check substrate, brief, browser backend, URL
1253
+ reachability, working-tree cleanliness, and token budget before any
1254
+ capture runs. Each iteration: capture (3 viewports) → spawn the
1255
+ vision evaluator → record the verdict → spawn 1-3 fix-builders in
1256
+ parallel → commit each fix with `qpl-{N}:` prefix → loop. Stops on
1257
+ SUCCESS (all 8 dims at least 3, no critical issues), KILL on regression
1258
+ (same issue fingerprint 3 consecutive iterations), KILL on budget,
1259
+ KILL on max iterations.
1260
+ - **`agents/visual-evaluator.md`** — new agent role. Vision-anchored,
1261
+ rubric-inlined, JSON-output-contract. Defaults to score 3
1262
+ (acceptable); only deviates with cited evidence. Aggregate score is
1263
+ the MINIMUM across viewports per dimension — a layout that's elegant
1264
+ on desktop but breaks at 375px is a fail. Trust-boundary block
1265
+ refuses prompt injection from inlined project files.
1266
+ - **`skills/qualia-polish-loop/scripts/playwright-capture.mjs`** —
1267
+ capture helper. Auto-selects backend: `import('playwright')` first,
1268
+ then cached `~/.cache/ms-playwright/chromium-*` binary, then
1269
+ `google-chrome` / `chromium` on PATH. No npm dependency required if
1270
+ any Chrome-family binary is already on disk.
1271
+ - **`skills/qualia-polish-loop/scripts/loop.mjs`** — deterministic
1272
+ state-machine orchestrator. Subcommands: `init`, `record`, `status`,
1273
+ `commit-fix`, `report`. State persists in JSON outside the LLM
1274
+ context (iteration counter, fingerprints, fixes, verdict). The
1275
+ `commit-fix` subcommand runs `slop-detect` first; critical findings
1276
+ block the commit and the fix-builder must retry.
1277
+ - **`skills/qualia-polish-loop/scripts/score.mjs`** — pure scoring
1278
+ utility. Takes 8 dimension scores, computes pass/fail per the rubric
1279
+ formula. Pipe-friendly. Used by the orchestrator and reusable for
1280
+ ad-hoc scoring.
1281
+ - **`skills/qualia-polish-loop/fixtures/clean.html`** — synthetic
1282
+ well-designed page for self-test Scenario 1 (Fraunces + JetBrains
1283
+ Mono, OKLCH palette, varied work-grid layout, fluid clamp() spacing,
1284
+ reduced-motion respected, 65ch line lengths).
1285
+ - **`skills/qualia-polish-loop/fixtures/broken.html`** — synthetic
1286
+ deliberately-bad page for self-test Scenario 2. Hits Inter font,
1287
+ blue-purple gradient, gradient text, three identical cards, side-stripe
1288
+ border, generic CTAs, max-width:1280, em-dash, outline:none — the
1289
+ full slop suite.
1290
+ - **`docs/playwright-loop-pilot-results.md`** — self-test results across
1291
+ the 3 mandated scenarios with screenshots, scores, iteration counts,
1292
+ token measurements.
1293
+ - **`docs/playwright-loop-design-notes.md`** — 1-page integration notes:
1294
+ how /qualia-polish-loop relates to /qualia-polish, when to use which,
1295
+ what's deferred to v5.2.
1296
+
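The evaluator's minimum-across-viewports aggregate and the loop's regression kill-switch from the list above can be sketched as pure functions. This is an illustrative sketch, not the code in `loop.mjs` or `score.mjs`: the function names and the two dimension names (`layout`, `typography`) are hypothetical, and the pass rule assumes "3" means "at least 3".

```javascript
// Aggregate per-dimension score = minimum across viewports.
function aggregate(scoresByViewport) {
  const viewports = Object.values(scoresByViewport);
  const dims = Object.keys(viewports[0]);
  const out = {};
  for (const dim of dims) {
    out[dim] = Math.min(...viewports.map((v) => v[dim]));
  }
  return out;
}

// Assumed pass rule: every dimension at least 3 and no critical issues.
function passes(agg, criticalIssues) {
  return criticalIssues === 0 && Object.values(agg).every((s) => s >= 3);
}

// Regression kill-switch: same issue fingerprint in 3 consecutive iterations.
function shouldKill(fingerprints) {
  if (fingerprints.length < 3) return false;
  const [a, b, c] = fingerprints.slice(-3);
  return a === b && b === c;
}

// Elegant on desktop, broken at 375px: the minimum makes it a fail.
const agg = aggregate({
  375: { layout: 2, typography: 4 },
  768: { layout: 4, typography: 4 },
  1440: { layout: 5, typography: 3 },
});
console.log(agg, passes(agg, 0)); // { layout: 2, typography: 3 } false
console.log(shouldKill(["hero-crop", "hero-crop", "hero-crop"])); // true
```

Taking the minimum rather than the mean means one broken viewport cannot be averaged away by two good ones.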
1297
+ ### Changed
1298
+
1299
+ - **`bin/install.js`** — recursively copies `scripts/` and `fixtures/`
1300
+ subfolders from each skill (was: only `SKILL.md` + `REFERENCE.md`).
1301
+ Required so `qualia-polish-loop` can ship its Node helpers and HTML
1302
+ test fixtures alongside the skill.
1303
+ - **`skills/qualia-road/SKILL.md`** — already mentions
1304
+ `/qualia-polish-loop` (added in the v5.1-prep commit).
1305
+ - **`tests/bin.test.sh`** — adds 5 install assertions for the new skill,
1306
+ agent, and `scripts/` subfolder copying.
1307
+ - `package.json` stays at 5.0.0 — polish-loop folded into the same v5.0 release window per Fawzi's "all v5, not v5.1" call.
1308
+
1309
+ ### Honest caveats
1310
+
1311
+ - The vision evaluator runs as a Claude Agent spawn, NOT a separate
1312
+ vision-API call. Token cost per iteration is ~14.5K (3 PNG reads +
1313
+ rubric + brief + previous-iteration delta). The 100K budget cap covers
1314
+ ~6-7 iterations comfortably; 8 needs `--budget 150000`.
1315
+ - Capture defaults to dev-localhost. Vercel-preview mode is opt-in
1316
+ (`--deploy preview`) and slower (~30-60s/iter for the deploy).
1317
+ - Chromium-only — same precedent as `qualia-polish` Stage 4. Cross-
1318
+ browser visual diffs are deferred to v5.2.
1319
+ - One URL per invocation. Multi-route sweeps require chaining the loop
1320
+ per route. Deferred to v5.2.
1321
+
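The budget figures in the first caveat follow from simple arithmetic. A small sketch, treating "~14.5K" as exactly 14,500 tokens per iteration (an approximation; the helper names are hypothetical):

```javascript
// "~14.5K" tokens per iteration, taken from the caveat above as a round figure.
const TOKENS_PER_ITERATION = 14500;

// Whole iterations that fit inside a token budget.
function iterationsWithin(budget) {
  return Math.floor(budget / TOKENS_PER_ITERATION);
}

// Tokens needed for a given number of iterations.
function tokensFor(iterations) {
  return iterations * TOKENS_PER_ITERATION;
}

console.log(iterationsWithin(100000)); // 6
console.log(tokensFor(8));             // 116000, above the 100K default
console.log(iterationsWithin(150000)); // 10
```

Eight iterations need roughly 116K tokens, which is why the default 100K cap is not enough for `--max 8` and the caveat suggests `--budget 150000`.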
1322
+ ### Why these changes
1323
+
1324
+ Insights from `/insights` data (122 sessions, 292 commits, 10 days)
1325
+ identified design-iteration churn as the #1 recurring friction pattern.
1326
+ Quotes from the transcripts: "many frustrating iterations" on hero
1327
+ videos, "FUCK U / I CHANGED THE PAGE SO U STOP LYING" when Claude
1328
+ described unchanged screens. The root cause is that Claude could not
1329
+ see what the browser actually drew — only what was in the source. Every
1330
+ fix that didn't show up was a fight.
1331
+
1332
+ `/qualia-polish-loop` makes Claude look. Three viewports per iteration,
1333
+ real screenshots, vision-anchored scoring, deterministic kill-switch
1334
+ when the loop oscillates. It's the first Qualia skill where the agent
1335
+ can verify its own work end-to-end without a human in the visual loop.
1336
+
1337
+
1338
+
1339
+ ## [5.0.0] — 2026-05-03
1340
+
1341
+ **The "alignment discipline + insights-driven hardening" release.**
1342
+ Inspired by Matt Pocock's [skills](https://github.com/mattpocock/skills)
1343
+ repo and [Claude Code for Real Engineers](https://www.aihero.dev/cohorts/claude-code-for-real-engineers-2026-04),
1344
+ with a hardening pass driven by Fawzi's actual `/insights` data (122
1345
+ sessions, 292 commits, 10 days) and a 6-agent deep-research audit of
1346
+ the framework codebase.
1347
+
1348
+ Misalignment is the #1 failure mode in AI coding — vague plans become
1349
+ shipped bugs, terms overload across phases, decisions drift between
1350
+ sessions. v5.0 introduces an alignment substrate (CONTEXT.md domain
1351
+ glossary, decisions/ ADR folder), slims the global instruction budget
1352
+ per Matt's strongest claim ("LLMs realistically hold 300–500 instructions;
1353
+ bloating CLAUDE.md hamstrings every spawn"), folds grilling/deepening/TDD
1354
+ into existing skills, adds three small new skills for the autonomous
1355
+ loop (zoom, issues, triage), AND ships three insights-driven hooks
1356
+ (Vercel account verification, empty env-var guard, Supabase destructive-
1357
+ command guard) that prevent the documented top friction patterns
1358
+ deterministically. All existing skills remain. No substitution.
1359
+
1360
+ **Honest delta from earlier v5 drafts:** the deep-research audit caught
1361
+ real issues introduced by my own v5 work (dead `/qualia-grill` references
1362
+ in the road, a new prompt-injection surface from inlining CONTEXT.md
1363
+ into agent spawns, gh CLI heredoc shell injection in qualia-issues +
1364
+ qualia-triage). Those are all fixed before release.
1365
+
1366
+ ### Added
1367
+
1368
+ - **`templates/CONTEXT.md`** — domain glossary template seeded by
1369
+ `/qualia-new` from the discovery questioning. Loaded by every road
1370
+ agent BEFORE PROJECT.md and DESIGN.md. Each term has a one-sentence
1371
+ definition and an `Avoid:` list of rejected synonyms. Updated inline
1372
+ by `/qualia-discuss` as decisions crystallize. Kills the "wait,
1373
+ what do you mean by *user*?" round-trips that waste tokens and surface
1374
+ bugs late.
1375
+ - **`templates/decisions/ADR-template.md`** — Architecture Decision
1376
+ Record template. Created sparingly: only for hard-to-reverse,
1377
+ surprising-without-context, real-tradeoff decisions. Cargo-culting
1378
+ ADRs ruins the signal. `/qualia-discuss` writes them when triggered.
1379
+ - **`/qualia-zoom`** — 50-line user-fired-only skill
1380
+ (`disable-model-invocation: true`). Maps an unfamiliar code area in
1381
+ domain-glossary terms — modules, callers, dependencies, smallest safe
1382
+ edit window. Used 50× a day in mature projects. Doesn't propose
1383
+ changes — just orients.
1384
+ - **`/qualia-issues`** — Externalize a phase plan to GitHub as
1385
+ independent vertical-slice issues with `needs-triage` label. Each
1386
+ issue is demoable end-to-end (schema → API → UI → tests),
1387
+ dependency-ordered. Uses CONTEXT.md domain language in titles.
1388
+ Externalized work means parallel humans, parallel sessions, or
1389
+ autonomous agents can pull from the queue.
1390
+ - **`/qualia-triage`** — State machine over the open issue queue:
1391
+ `needs-triage` / `needs-info` / `ready-for-agent` / `ready-for-human`
1392
+ / `wontfix`. Optionally routes `ready-for-agent` issues into
1393
+ autonomous `/qualia-build` runs. Together with `/qualia-issues` this
1394
+ is the autonomous-loop unlock — agent pulls from backlog, builds,
1395
+ verifies, closes, picks next.
1396
+ - **`/qualia-road`** — Discoverable workflow map (every command, when
1397
+ to use it, how phases chain). Lives in `skills/qualia-road/SKILL.md`
1398
+ so the agent loads it on demand instead of carrying the road in the
1399
+ global system prompt. Trigger phrases: "how does Qualia work",
1400
+ "what's the workflow", "show me the road", "what command does X".
1401
+ - **`docs/reviews/matt-pocock-skills-analysis.md`** — Deep analysis of
1402
+ Matt's repo and the alignment-discipline framing, with the original
1403
+ 15-change proposal and the rationale for each accepted/rejected fold.
1404
+
1405
+ ### Changed
1406
+
1407
+ - **`CLAUDE.md` slimmed from 88 lines to ~22.** The Road, Quality
1408
+ Gates, Context Isolation, Compaction, and prose explanations all
1409
+ moved out of the global system prompt. They now live in
1410
+ `/qualia-road` (discoverable) and the hooks (deterministic
1411
+ enforcement). Per Matt's strongest claim: LLMs realistically hold
1412
+ 300–500 instructions; everything in CLAUDE.md burns budget on every
1413
+ spawn. Steering rules belong in skills (loaded on relevant tasks) or
1414
+ hooks (enforced deterministically), not in CLAUDE.md.
1415
+ - **`AGENTS.md` slimmed identically** — multi-vendor mirror (Codex,
1416
+ Cursor, Continue, Aider, Devin) of CLAUDE.md. Same content, same
1417
+ size, cross-tool compatibility.
1418
+ - **`/qualia-discuss` upgraded with grilling pattern.** Was passive
1419
+ "capture decisions"; now actively grills: one question at a time,
1420
+ proposes a recommended answer with each, walks every branch of the
1421
+ decision tree until resolved. Updates CONTEXT.md inline as terms
1422
+ crystallize. Writes ADRs only for hard-to-reverse decisions. Auto-
1423
+ creates CONTEXT.md from template if missing. Same job (write
1424
+ `phase-{N}-context.md`), much more powerful interview.
1425
+ - **`/qualia-optimize` gained deepening pillar.** New `--deepen` mode
1426
+ spawns the architecture strategist with Ousterhout vocabulary
1427
+ (depth, locality, leverage, seam, deletion test). The Wave 2
1428
+ architecture-strategist prompt now includes the deepening lens for full
1429
+ mode too. Reads CONTEXT.md and decisions/ as substrate. Refuses
1430
+ "extract a helper function" candidates that don't pass the deletion
1431
+ test — adding shallow modules is the disease, not the cure.
1432
+ - **`/qualia-test` gained `--tdd` mode.** Vertical-slice
1433
+ red→green→refactor loop. Anti-pattern: writing all tests first then
1434
+ all impl (horizontal). Correct: one test → one minimal impl → repeat.
1435
+ Test through public interface only. Never refactor while red. Each
1436
+ cycle is a commit.
1437
+ - **`/qualia-map` gained onboarding adapter.** New 5th parallel agent
1438
+ detects existing issue tracker (GH/GL/local), labels (mapped to
1439
+ canonical roles), domain docs (CONTEXT.md/glossary location), and
1440
+ existing CLAUDE.md/AGENTS.md. Writes `.planning/agents/{tracker,
1441
+ labels, domain}.md` adapter config. APPENDS (never overwrites)
1442
+ agent-skills block to existing CLAUDE.md/AGENTS.md. Brownfield
1443
+ projects now get Qualia capabilities without losing their existing
1444
+ process.
1445
+ - **`/qualia-new` emits CONTEXT.md and decisions/.** New Step 5a
1446
+ copies the templates and seeds the glossary from questioning
1447
+ answers (terms the user repeatedly used → glossary entries with
1448
+ `Avoid:` lines). Required for every new project — domain glossary
1449
+ is the single highest-leverage piece of substrate.
1450
+ - **`/qualia-report` tightened from 299 → 240 lines.** Hasan and
1451
+ Moayad will not get stuck on it. Improvements: empty-day handling
1452
+ (asks the employee what they did instead of forcing a fake report),
1453
+ structured synthesis template (verb list + 3-6 bullet format,
1454
+ blocker discipline), graceful pre-flight (no STATE.md / no git → soft
1455
+ warnings only, never blocks), git-push failure handling (exit code
1456
+ check, never silently fails), encouraging closing message ("You can
1457
+ clock out now. See you tomorrow."), self-service common-errors
1458
+ table for permission-denied/network/auth issues. ERP payload
1459
+ preserved verbatim — `client_report_id` (QS-REPORT-NN), full
1460
+ milestones[], build_count/deploy_count/deployed_url for tree
1461
+ rendering, 3-attempt retry with 1s/3s/9s backoff, 401/422 permanent-
1462
+ failure handling. The ERP connection stays beautiful: every payload
1463
+ field the portal needs is populated, every error path is actionable.
1464
+
1465
+ ### Hardening pass (insights-driven hooks, security, instruction-budget)
1466
+
1467
+ Added after the deep-research audit. Each item maps to a documented
1468
+ friction pattern in Fawzi's `/insights` report or a security/integrity
1469
+ issue surfaced by the 6 specialist agents.
1470
+
1471
+ **New hooks (insights-driven friction prevention)**
1472
+
1473
+ - **`hooks/vercel-account-guard.js`** — PreToolUse on `vercel --prod*` /
1474
+ `vercel deploy*`. Verifies active Vercel account against
1475
+ `~/.claude/.vercel-allowed-teams` (one team slug per line, mode 0600).
1476
+ Fail-OPEN when config is missing (don't block legit deploys when the
1477
+ user hasn't set the allowlist yet); fail-CLOSED when the active team
1478
+ doesn't match an allowed entry. Prevents the documented "wrong Vercel
1479
+ account redeploys" friction.
1480
+ - **`hooks/env-empty-guard.js`** — PreToolUse on `vercel env*`. Detects
1481
+ `vercel env add NAME ""` / `''` / trailing-`=` patterns and blocks
1482
+ with a clear message. Escape: `QUALIA_ALLOW_EMPTY_ENV=1`. Prevents
1483
+ the documented "empty env vars on production" incident.
1484
+ - **`hooks/supabase-destructive-guard.js`** — PreToolUse on `supabase*` /
1485
+ `npx supabase*`. Blocks `supabase db reset`, `supabase db push --force`,
1486
+ `supabase migration repair`. Escape: `QUALIA_ALLOW_DESTRUCTIVE=1`
1487
+ (shared with git-guardrails). Prevents the documented "DB wipe leaves
1488
+ orphan rows" incident.
1489
+
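The fail-open / fail-closed behaviour of the Vercel account guard can be sketched as a small decision function. This is a hypothetical illustration, not the contents of `hooks/vercel-account-guard.js`: `checkVercelTeam` is an invented name, how the hook discovers the active team is not covered here, and blocking on an empty allowlist file is an assumption.

```javascript
// allowlistText: contents of the allowlist file (one team slug per line),
// or null when the file does not exist.
function checkVercelTeam(allowlistText, activeTeam) {
  if (allowlistText === null) {
    // Fail-open: no allowlist configured yet, so do not block the deploy.
    return { allow: true, reason: "no allowlist configured (fail-open)" };
  }
  const allowed = allowlistText
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter(Boolean);
  if (allowed.includes(activeTeam)) {
    return { allow: true, reason: "team is allowlisted" };
  }
  // Fail-closed: an allowlist exists and the active team is not on it.
  return {
    allow: false,
    reason: `team "${activeTeam}" is not in the allowlist (fail-closed)`,
  };
}

console.log(checkVercelTeam(null, "any-team").allow);             // true
console.log(checkVercelTeam("team-a\nteam-b\n", "team-b").allow); // true
console.log(checkVercelTeam("team-a\n", "team-x").allow);         // false
```

Keeping the decision separate from file and process I/O makes the two failure modes easy to test in isolation.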
1490
+ **State machine + reliability**
1491
+
1492
+ - **`state.js cmdInit` throwaway-name guard** — `init --project test|tmp|
1493
+ scratch|demo|sample|untitled|new-project|...` returns `SUSPICIOUS_NAME`
1494
+ instead of creating a throwaway project. `--force` bypasses. Prevents
1495
+ the documented "test-name-only project accidentally created" incident.
1496
+ - **`hooks/pre-deploy-gate.js` lint escape** — `QUALIA_SKIP_LINT=1` skips
1497
+ the lint gate while keeping tsc, tests, build, and security gates
1498
+ mandatory. For the documented "lint blocks ship after JSX auto-fixer
1499
+ broke things" friction.
1500
+ - **`hooks/pre-compact.js` saves tracking.json too** — was only `git add
1501
+ STATE.md`, losing tracking.json mutations across compactions. ERP got
1502
+ stale data. Now stages both planning files.
1503
+ - **`hooks/pre-push.js` warns instead of blocks** on tracking.json
1504
+ corruption. A corrupt ERP-metadata file should never block a production
1505
+ hotfix. Self-service fix surfaced in the warning. Restore old behavior
1506
+ with `QUALIA_BLOCK_ON_TRACKING_FAIL=1`.
1507
+ - **`state.js cmdCloseMilestone`** — wraps `writeTracking` in the journal
1508
+ pattern (matching `cmdTransition`) so a crash mid-write can be
1509
+ detected and recovered. STATE.md is intentionally NOT mutated here
1510
+ (next `init --force` rewrites it for the new milestone).
1511
+
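The throwaway-name guard can be illustrated with a short sketch. It is not the actual `cmdInit` code: `checkProjectName` is a hypothetical name, the set contains only the names spelled out above (the changelog's list is truncated with `...`), and case-insensitive matching is an assumption.

```javascript
// Only the names listed in the changelog entry; the real list is longer.
const THROWAWAY_NAMES = new Set([
  "test", "tmp", "scratch", "demo", "sample", "untitled", "new-project",
]);

// Returns an error code for suspicious names unless force is set.
function checkProjectName(name, { force = false } = {}) {
  const normalized = String(name).trim().toLowerCase();
  if (!force && THROWAWAY_NAMES.has(normalized)) {
    return { ok: false, error: "SUSPICIOUS_NAME" };
  }
  return { ok: true };
}

console.log(checkProjectName("tmp"));                  // { ok: false, error: 'SUSPICIOUS_NAME' }
console.log(checkProjectName("tmp", { force: true })); // { ok: true }
console.log(checkProjectName("acme-portal"));          // { ok: true }
```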
1512
+ **Security hardening**
1513
+
1514
+ - **ERP API key no longer passed as curl CLI argument** — `bin/cli.js`
1515
+ `cmdErpPing` now uses Node's native `https.request` instead of
1516
+ `spawnSync("curl", ["-H", "Authorization: Bearer $KEY"])`. The CLI-arg
1517
+ pattern exposed the bearer token via `/proc/<pid>/cmdline` to any
1518
+ local process during the curl invocation. Native `https.request` keeps
1519
+ the auth header in-process.
1520
+ - **`set-erp-key` refuses positional arguments** — positional args land
1521
+ in `~/.bash_history` / `~/.zsh_history`. Now reads from stdin only:
1522
+ `printf '%s' "$KEY" | qualia-framework set-erp-key`. Helpful error
1523
+ message if positional is attempted.
1524
+ - **`bin/install.js` backs up CLAUDE.md and settings.json before
1525
+ overwrite** — `.bak.{ISO-timestamp}` files. Settings.json now uses
1526
+ atomic write (tmp + rename) to eliminate partial-write risk during
1527
+ the merge.
1528
+ - **Subagent prompt-injection defense** — `agents/builder.md`,
1529
+ `agents/planner.md`, `agents/verifier.md` now have a "Trust boundary"
1530
+ block warning that content within `<phase_context>`, `<task_context>`,
1531
+ `<project_context>`, `<glossary>`, `<decisions>`, etc. is project DATA
1532
+ not instructions. The agents are told to refuse and report directives
1533
+ appearing inside those tags as project-file injection attempts. This
1534
+ closes the new attack surface introduced by inlining CONTEXT.md into
1535
+ every road agent spawn.
1536
+ - **`qualia-issues` and `qualia-triage` use `gh issue create
1537
+ --body-file`** instead of heredoc-interpolated bodies. The previous
1538
+ pattern allowed shell injection via crafted plan content (shell
1539
+ metacharacters or a literal `EOF` line in user-controlled text).
1540
+
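The backup-then-atomic-write pattern used for `settings.json` can be sketched as below. This is a minimal illustration under assumptions, not the code in `bin/install.js`: `backupThenWriteAtomic` is a hypothetical name, the temp-file naming is invented, and replacing `:` in the ISO timestamp (for filesystems that reject colons) is an assumption.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Copies the existing file to `<file>.bak.<timestamp>`, then writes the new
// contents to a temp file in the same directory and renames it into place.
function backupThenWriteAtomic(file, contents) {
  let backup = null;
  if (fs.existsSync(file)) {
    const stamp = new Date().toISOString().replace(/:/g, "-");
    backup = `${file}.bak.${stamp}`;
    fs.copyFileSync(file, backup);
  }
  // Same directory as the target, so the rename stays on one filesystem.
  const tmp = `${file}.tmp-${process.pid}`;
  fs.writeFileSync(tmp, contents);
  fs.renameSync(tmp, file); // readers see the old file or the new one, never half
  return backup;
}

// Demo in a throwaway temp directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "qualia-demo-"));
const demoFile = path.join(demoDir, "settings.json");
backupThenWriteAtomic(demoFile, '{"hooks":{}}');
const demoBackup = backupThenWriteAtomic(demoFile, '{"hooks":{"Stop":[]}}');
console.log(path.basename(demoBackup).startsWith("settings.json.bak."));
```

A crash between the write and the rename leaves only a stray temp file behind; the original `settings.json` is untouched until the rename succeeds.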
1541
+ **Instruction-budget cuts (Matt Pocock discipline applied to rules)**
1542
+
1543
+ - **`globs:` frontmatter on `rules/design-brand.md`,
1544
+ `rules/design-laws.md`, `rules/design-product.md`,
1545
+ `rules/design-rubric.md`** — these 4 design rules files (~5,508
1546
+ tokens combined) were loading on every conversation regardless of
1547
+ whether the user was editing a backend API route or a SQL migration.
1548
+ Now glob-gated to `["*.tsx", "*.jsx", "*.css", "*.scss", "*.html",
1549
+ "*.vue", "*.svelte"]`. Saves ~5,508 tokens per non-frontend
1550
+ conversation. The single highest-ROI change in the entire audit.
1551
+ - **`globs:` on `rules/grounding.md`** + removed duplicate
1552
+ `@grounding.md` references from spawn templates in
1553
+ `qualia-build/SKILL.md`, `qualia-plan/SKILL.md`,
1554
+ `qualia-verify/SKILL.md`. Saves ~1,175 tokens of duplicate load per
1555
+ agent spawn.
1556
+ - **`agents/builder.md` duplicate-rules block deleted** — 21 lines
1557
+ re-stating security/design/perf rules that already auto-load from
1558
+ `rules/*.md`. Replaced with a single line pointing to the auto-load
1559
+ source of truth. Saves ~280 tokens per builder spawn.
1560
+ - **`templates/CONTEXT.md` prose stripped** — meta-commentary
1561
+ ("What this is", "Why it exists", "How to update this file") was
1562
+ loading into every road agent via `.planning/CONTEXT.md`. Replaced
1563
+ with a one-line HTML comment. Saves ~138 tokens per spawn.
1564
+
1565
+ **Discoverability + consistency**
1566
+
1567
+ - **`disable-model-invocation: true`** added to `qualia-road`,
1568
+ `qualia-ship`, `qualia-handoff` (joining `qualia-zoom` from earlier
1569
+ in v5). These are pure-reference / irreversible-production skills
1570
+ that should never auto-fire from natural-language ambiguity.
1571
+ - **5 under-described skill descriptions expanded** —
1572
+ `qualia-verify`, `qualia-build`, `qualia-plan`, `qualia-task`,
1573
+ `zoho-workflow` (each was <25 words with no "Use when" clause).
1574
+ Now 30-60 words with explicit trigger phrases per the framework's
1575
+ description-as-API-surface convention.
1576
+ - **`qualia-road` dead skill references removed** — `/qualia-grill`,
1577
+ `/qualia-deepen`, `/qualia-tdd`, `/qualia-onboard` were listed as
1578
+ commands but were folded into existing skills. Replaced with the
1579
+ actual folded locations (`/qualia-discuss`, `/qualia-optimize
1580
+ --deepen`, `/qualia-test --tdd`, `/qualia-map`).
1581
+ - **`templates/CONTEXT.md` `/qualia-grill` reference removed** — same
1582
+ fold, same dead reference, same fix.
1583
+ - **README.md and guide.md updated to v5** — version, skill count
1584
+ (32), hook count (12), template count (24), v5 changes summary.
1585
+
1586
+ ### Test coverage (+34 new tests, 226 → 260)
1587
+
1588
+ 19 v5-specific tests added in Wave B and 15 hardening-pass tests
1589
+ added in Wave C. Coverage now includes: CONTEXT.md install,
1590
+ decisions/ADR-template install, all 4 new skill folder installs,
1591
+ qualia-discuss grilling reference, qualia-discuss CONTEXT.md
1592
+ reference, qualia-road `disable-model-invocation`, qualia-road dead-
1593
+ ref guard, CONTEXT.md template `/qualia-grill` guard, builder.md
1594
+ trust-boundary block, qualia-issues `--body-file` (no heredoc
1595
+ injection), 3 new hooks parse-validity + install, state.js
1596
+ `next-report-id` (alloc + increment + --peek), state.js init
1597
+ suspicious-name guard (3 cases), pre-deploy-gate `QUALIA_SKIP_LINT`
1598
+ escape.
1599
+
1600
+ ### Deferred to v5.1
1601
+
1602
+ - **SKILL.md / REFERENCE.md / scripts/ split** for skills > 200 lines
1603
+ (qualia-new at 487, qualia-optimize at 466, others 200–300). The
1604
+ pattern is correct per Matt's progressive-disclosure rule, but no
1605
+ precedent exists in the framework's loader behavior, and the v5.0
1606
+ test suite assumes single-file skills. Deferring is honest. The
1607
+ oversized skills are still cohesive and execution-essential —
1608
+ splitting will be a focused v5.1 project with loader-behavior
1609
+ testing.
1610
+ - **Caveman-mode template** for in-flight subagent prompts (~25%
1611
+ token reduction × 50 builder spawns per project). Worth real money
1612
+ at scale. Deferred to keep v5.0 scope honest.
1613
+ - **Autonomous visual-polish loop with Playwright MCP** — for the
1614
+ documented "design iteration churn" friction (5-10 manual rounds
1615
+ on hero videos, mobile layouts). The insights report's #1 strategic
1616
+ workflow opportunity. Substantial design work; v5.1.
1617
+ - **Parallel lead-enrichment swarm** — multi-source confidence-scored
1618
+ merge for lead-gen pipelines. Insights report's #2 strategic
1619
+ opportunity. v5.1.
1620
+
1621
+ ### Why these changes
1622
+
1623
+ Matt Pocock teaches that the bottleneck on AI-coding quality is not
1624
+ the model — it's the alignment between human, agent, code, and
1625
+ domain. Every skill in his repo fixes one alignment failure mode.
1626
+ Qualia v4.5.0 was strong on execution machinery (phase/wave/task
1627
+ hierarchy, fresh-context spawning, the verifier loop, the design
1628
+ substrate) but treated alignment as a one-shot at `/qualia-new`.
1629
+ v5.0 adds the alignment rituals that prevent decision drift between
1630
+ phases: a domain glossary every agent reads, ADRs for hard-to-reverse
1631
+ decisions, grilling that exposes vagueness before code is written,
1632
+ deepening that fights shallow-module entropy, vertical-slice TDD that
1633
+ keeps tests honest. Combined with Qualia's existing execution machinery
1634
+ this is genuinely a different framework. The instruction-budget
1635
+ discipline (CLAUDE.md slim) is the multiplier — every spawn now uses
1636
+ its 300–500 instruction budget on the work, not on rules-it-already-knows.
1637
+
1638
+ ## [4.5.0] — 2026-05-02
1639
+
1640
+ **The "Design as a thread" release.** Design stops being a final phase
1641
+ and becomes a verification dimension every road agent honors. Informed
1642
+ by deep research (UI-Bench, Refactoring UI, Anthropic frontend-design
1643
+ skill, awesome-claude-design), the impeccable repo's vocabulary
1644
+ (OKLCH, registers, scene sentence, second-order slop test), and 5
1645
+ internal research agents.
1646
+
1647
+ ### Added
1648
+
1649
+ - **`rules/design-laws.md`** — universal design rules both registers
1650
+ honor: OKLCH mandate, color strategy commitment scale, scene sentence
1651
+ for theme, absolute bans (gradient text, side-stripe borders,
1652
+ glassmorphism default, hero-metric template, identical card grids,
1653
+ modal-as-first-thought), no em-dashes, second-order slop test, single
1654
+ icon family rule, container depth max 2.
1655
+ - **`rules/design-brand.md`** — Brand register (marketing, landing,
1656
+ portfolio): distinctiveness is the bar. Reflex-reject aesthetic lanes.
1657
+ Pre-code design brief (direction, color strategy, scene sentence,
1658
+ differentiation). Banned display fonts. Allowed visual flourishes
1659
+ (gradient meshes, noise, parallax signature motion).
1660
+ - **`rules/design-product.md`** — Product register (app UI, admin,
1661
+ dashboards): earned familiarity is the bar. Reference-app anchoring.
1662
+ Density target table (comfortable / standard / compact / ultra-compact).
1663
+ Per-component rules (buttons 3 variants max, tables tabular numerals,
1664
+ empty states require icon + context + action).
1665
+ - **`rules/design-rubric.md`** — 8-dimension anchored 1-5 scoring used
1666
+ by verifier and `/qualia-polish --critique`. Default to 3 (= ships).
1667
+ A score below 3 fails the phase, same as a functional bug.
1668
+ - **`templates/PRODUCT.md`** — REQUIRED at /qualia-new (v4.5.0+). Users,
1669
+ brand voice, register, anti-references (mandatory 3-5), strategic
1670
+ principles, differentiation. Substrate every road agent loads.
1671
+ - **`templates/DESIGN.md`** — Rewritten OKLCH-first. Mandatory direction
1672
+ block (aesthetic, color strategy, scene sentence, differentiation).
1673
+ Token-driven components. Anti-pattern checklist auto-runnable.
1674
+ - **`bin/slop-detect.mjs`** — Standalone CLI scanner. ~17 anti-pattern
1675
+ rules across critical / high / medium / low. Exits 1 on critical
1675
+ findings, which blocks the commit. No AI required. Builders run it pre-commit.
1677
+ - **`/qualia-polish` rewritten as scope-adaptive** — six modes:
1678
+ Component (~30s) | Section (~3m) | App (~12m) | Redesign (~30m) |
1679
+ Critique (read-only) | Quick (~1m). Stage selection follows from
1680
+ scope. Mandatory preflight gates (PRODUCT, DESIGN, slop-detect).
1681
+ - **Vision-loop architecture** for Redesign scope: webapp-testing skill
1682
+ + 3-viewport matrix (375/768/1280) + anchored rubric prompt + 2-iter
1683
+ hard cap with regression-stop.
1684
+
1685
+ ### Changed
1686
+
1687
+ - **`agents/planner.md`** — Inputs add product_context, design_spec,
1688
+ design_substrate. Task template adds **Design:** field (required for
1689
+ any .tsx/.jsx/.css task) with register, tokens used, scope, anti-
1690
+ pattern guard line.
1691
+ - **`agents/plan-checker.md`** — New Rule 7b: frontend tasks must have
1692
+ Design field with valid register, non-empty token list, no raw hex.
1693
+ Greps plan for absolute-ban patterns and rejects.
1694
+ - **`agents/builder.md`** — Self-verify step 4 mandatory: slop-detect
1695
+ on touched frontend files. Exit 1 BLOCKS commit. Rule 5 frontend
1696
+ standards rewritten to point at design substrate as source of truth.
1697
+ - **`agents/verifier.md`** — Adds Design Verification block: slop-detect
1698
+ gate (Step A), 8-dim rubric scoring (Step B), drift audit (Step C).
1699
+ Combined phase verdict: functional AND slop-detect AND rubric pass.
1700
+ - **`/qualia-new`** — Adds Step 5b (PRODUCT.md generation, 5 questions,
1701
+ required) and rewrites Step 7 (DESIGN.md OKLCH-first with mandatory
1702
+ 4-line direction commit).
1703
+ - **`bin/cli.js`** — QUALIA_BIN_FILES adds slop-detect.mjs.
1704
+ QUALIA_RULE_FILES adds design-laws / design-brand / design-product /
1705
+ design-rubric. Help text removes /qualia-design, adds /qualia-polish.
1706
+ - **`bin/install.js`** — slop-detect.mjs copied to ~/.claude/bin and
1707
+ chmodded executable. New rules + PRODUCT.md template auto-picked up
1708
+ by directory walks.
1709
+ - **`CLAUDE.md` + `AGENTS.md`** — Road updated: design described as a
1710
+ thread through every phase, not just Handoff Phase 1. /qualia-polish
1711
+ six modes documented.
1712
+
1713
+ ### Removed
1714
+
1715
+ - **`/qualia-design`** skill — its surface area is fully covered by
1716
+ `/qualia-polish` modes. Help text and references purged.
1717
+
1718
+ Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1719
+
1720
+ ---
1721
+
1722
+ ## [4.4.0] — 2026-04-28
1723
+
1724
+ **The "Plan as Data + Agent Telemetry" release.** Lays the substrate for deterministic verification and per-spawn observability. Skill rewiring lands in 4.5.
1725
+
1726
+ ### Added
1727
+
1728
+ - **`bin/plan-contract.js`** — JSON plan contract validator (zero deps, ~150 LOC). Replaces ad-hoc markdown re-parsing. Schema in `docs/plan-contract.md`. Four verification check types (`file-exists`, `grep-match`, `command-exit`, `behavioral`); behavioral evidence is structured (`{path, matcher?, description}`), blocking vibes-based passes at the schema level.
1729
+ - **`bin/agent-runs.js`** — append-only JSONL telemetry writer (`.planning/agent-runs.jsonl`). OTel GenAI-aligned field names. 14-code closed `failure_reason` enum. Side log files only on failure. Spec in `docs/agent-runs.md`.
1730
+ - **`qualia-framework agents`** — reader CLI. Flags: `--failed`, `--task <id>`, `--phase <n>`, `prune --before YYYY-MM-DD`.
1731
+ - **`docs/plan-contract.md`** + **`docs/agent-runs.md`** — committed v1 specs with locked design decisions.
1732
+ - **Drift detection:** `source_plan_hash` field on contracts; `plan-contract.checkDrift()` flags when the markdown plan diverges from its compiled contract.
1733
+ - **Privacy:** `QUALIA_TELEMETRY=off` disables agent-runs writes (reads still surface history).
1734
+ - `tests/lib.test.sh` — 15 cases covering both libraries.
1735
+ - `qualia-framework set-erp-key` for saving/enabling the ERP API key without manually editing `~/.claude/.erp-api-key`.
1736
+ - Targeted shell test scripts in `package.json` (`test:state`, `test:hooks`, `test:bin`, `test:lib`, `test:statusline`, `test:shell`); `npm test` runs the full shell gate.
1737
+
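As a rough illustration of what "structured behavioral evidence" could mean in practice, the sketch below validates a single verification check. It is not the code in `bin/plan-contract.js`: the `type` and `evidence` field names and the error strings are assumptions, and only the four check types and the `{path, matcher?, description}` shape come from the entry above.

```javascript
// Hypothetical sketch; the authoritative schema is docs/plan-contract.md.
const CHECK_TYPES = new Set(["file-exists", "grep-match", "command-exit", "behavioral"]);

// Returns a list of problems; an empty list means the check is well-formed.
function validateCheck(check) {
  const errors = [];
  if (!check || !CHECK_TYPES.has(check.type)) {
    errors.push(`unknown check type: ${check && check.type}`);
    return errors;
  }
  if (check.type === "behavioral") {
    const ev = check.evidence; // assumed field name
    if (!ev || typeof ev !== "object") {
      errors.push("behavioral check needs structured evidence");
    } else {
      if (typeof ev.path !== "string" || ev.path === "") errors.push("evidence.path is required");
      if (typeof ev.description !== "string" || ev.description === "") errors.push("evidence.description is required");
      if (ev.matcher !== undefined && typeof ev.matcher !== "string") errors.push("evidence.matcher must be a string when present");
    }
  }
  return errors;
}

// A free-text "looks fine" claim is rejected; a path plus description passes.
console.log(validateCheck({ type: "behavioral", evidence: "looks fine to me" }));
console.log(validateCheck({ type: "behavioral", evidence: { path: "src/login.ts", description: "rejects empty password" } }));
```

Rejecting free-text evidence at validation time is what would keep an unsupported "pass" from ever reaching the verifier.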
1738
+ ### Changed
1739
+
1740
+ - `migrate` now wires the full v4.3 hook set, including `git-guardrails.js`, `pre-compact.js`, and the `Stop` event.
1741
+ - `migrate` help text updated — no longer says "v2 to v3."
1742
+ - README/help docs now match the current package counts: 28 skills, 9 hooks, 8 agents, 6 rules, 21 templates.
1743
+ - `auto-update` caches failed npm lookups so offline registry checks do not add a timeout to every Bash command.
1744
+ - `install.js` copies `plan-contract.js` and `agent-runs.js` into `~/.claude/bin/`; `doctor` checks both.
1745
+
1746
+ ### Fixed
1747
+
1748
+ - State mutators now fail loudly with `STATE_LOCK_TIMEOUT` instead of proceeding without `.planning/.state.lock`.
1749
+ - `pre-push` now blocks when tracking sync cannot be committed instead of letting a stale ERP state push through silently.
1750
+ - `pre-compact` detects untracked `.planning/STATE.md` before compaction.
1751
+ - `pre-deploy-gate` surfaces the last gate output lines on failure instead of hiding root cause behind a generic block message.
1752
+ - `stop-session-log` reads `tracking.total_phases` and still supports legacy `phase_total`.
1753
+ - CLI analytics now reads top-level trace fields (`verification`, `gap_closure`) written by `state.js`.
1754
+ - Stale tests reconciled with v4.3 behavior: retired `block-env-edit`, PreToolUse exit code 2, forced milestone close tests, opt-in ERP key setup.
1755
+
1772
+ ## [4.3.0] — 2026-04-26
1773
+
1774
+ **The "Compound + Self-Healing" release.** Five sequenced phases (#9–#13 on GitHub) shipped in one window, scoped from the 2026-04-25 NotebookLM deep-dive on Anthropic's subagent upgrade, Karpathy's LLM knowledge bases, Cole Medin's parallel-worktrees playbook, and the mattpocock skills directory. Closes 11 of the 12 top-priority findings from the v4.1.0 audit. The framework now (a) compounds across every session through an automated memory layer, (b) heals itself when the verifier catches a gap, and (c) blocks destructive git operations universally.
1775
+
1776
+ ### Added
1777
+
1778
+ #### Memory layer (Karpathy raw → wiki, end-to-end automated)
1779
+
1780
+ - **`hooks/stop-session-log.js`** — Stop hook seeding the **raw tier**. Appends one mechanical line per turn to `~/.claude/knowledge/daily-log/{YYYY-MM-DD}.md` (project, branch, phase, task counts, commit count, top-3 touched files). Rate-limited 5min, skipped on no activity, never blocks.
1781
+ - **`bin/knowledge.js`** — Unified memory-layer loader. Subcommands: `load <file>` (with aliases `patterns`/`fixes`/`client` or any bare filename), `list`, `search <query>`, `append --type <pattern|fix|client> --title <T> --body <B>`, `path <file>`, `help`. Default invocation prints `index.md`. Subdirectory support: bare-name lookups auto-discover in `concepts/`/`connections/`/`daily-log/` if no top-level match exists; `concepts/foo` works as a qualified path; top-level wins on collision. Every command exits 0 on missing files with a `(no entries)` stub so skills can pipe output safely.
1782
+ - **`/qualia-flush` skill** — The LLM job that promotes raw → curated. Reads recent `daily-log/*.md` (default 14-day window, configurable), groups by project, identifies recurring patterns/decisions, writes promotions via `knowledge.js append`. Conservative: single-occurrence entries stay raw until they recur. `--dry-run`, `--project NAME` supported.
1783
+ - **`bin/knowledge-flush.js`** — Cron-runnable non-interactive `/qualia-flush` runner. Wraps `claude -p "/qualia-flush --days 7"` with a 5-minute hard cap and writes a structured JSONL audit log to `~/.claude/.qualia-flush.log`. Cron-spam-safe: missing CLI / empty daily-log → exits 0 with a logged skip; only true execution failures exit 1. Recommended cron: `0 3 * * 0 node ~/.claude/bin/knowledge-flush.js >> ~/.claude/.qualia-flush.log 2>&1`.
1784
+ - **`templates/knowledge/{agents,index}.md`** — Karpathy meta-doc + index entry point. Installed once at `~/.claude/knowledge/`, never overwrites existing content. Closes v4.1.0 audit finding #3 ("11 of 14 knowledge files are invisible") by giving every agent one deterministic place to start.
1785
+ - **Builder reads knowledge before writing code** (`agents/builder.md` §2b). New "Load Relevant Knowledge" section instructs the builder to call `knowledge.js` first. Hardcoded `cat` is forbidden — the loader is the only sanctioned path. Closes v4.1.0 audit finding #2 (the most-flagged miss).
1786
+
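The lookup order described for `knowledge.js` (top-level first, then `concepts/`, `connections/`, `daily-log/`, with qualified paths honored) can be sketched as below. This is an illustration only: the function name, the `.md` suffix rule, and the injected `exists` predicate are assumptions, not the loader's actual code.

```javascript
const path = require("node:path");

const SUBDIRS = ["concepts", "connections", "daily-log"];

// `exists` is injected so the lookup order can be exercised without touching disk.
function resolveKnowledgeFile(root, name, exists) {
  const file = name.endsWith(".md") ? name : `${name}.md`; // assumed: bare names map to .md files
  // Qualified names such as "concepts/foo" resolve here; for bare names this is the top-level check.
  const direct = path.join(root, file);
  if (exists(direct)) return direct;
  if (!name.includes("/")) {
    for (const dir of SUBDIRS) {
      const candidate = path.join(root, dir, file);
      if (exists(candidate)) return candidate;
    }
  }
  return null; // the caller would print the "(no entries)" stub and still exit 0
}

const root = path.join("home", "me", ".claude", "knowledge");
const onDisk = new Set([
  path.join(root, "foo.md"),
  path.join(root, "concepts", "foo.md"),
  path.join(root, "concepts", "bar.md"),
]);
const exists = (p) => onDisk.has(p);

console.log(resolveKnowledgeFile(root, "foo", exists));          // top-level wins on collision
console.log(resolveKnowledgeFile(root, "bar", exists));          // discovered in concepts/
console.log(resolveKnowledgeFile(root, "concepts/foo", exists)); // qualified path
console.log(resolveKnowledgeFile(root, "missing", exists) ?? "(no entries)");
```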
1787
+ #### Self-healing AI layer
1788
+
1789
+ - **`/qualia-postmortem` skill** — Cole Medin's pillar 5: "anytime we encounter a bug in a pull request, we don't just fix the bug, we fix the underlying system that allowed for the bug." After a verify FAIL, identifies which AI-layer file (`agents/X.md`, `rules/Y.md`, `skills/Z/SKILL.md`) should have caught the gap and proposes a surgical delta. Writes `.planning/phase-{N}-postmortem.md` for review. With `--apply`, edits the installed file and TODOs a framework-repo PR so the lesson survives reinstall. Promotes generalizable patterns to the knowledge layer. Conservative: max 3 deltas per postmortem.
1790
+ - **`/qualia-verify --adversarial` flag** — Spawns a SECOND verifier in fresh context with an adversarial prompt ("find what's wrong, not what's right"). Auto-enabled for the Handoff milestone and any phase whose plan touches `auth|payment|migration|rls|service_role` files. Findings union with the cooperative pass; either pass finding CRITICAL or HIGH = phase FAIL. Mitigates the "kid grading their own homework" bias documented in the NotebookLM source.
1791
+ - **`/qualia-verify` auto-invokes `/qualia-postmortem` on FAIL** before the gap-closure re-plan. The postmortem writes the report by default — no destructive AI-layer edits unless the user runs `--apply`.
1792
+
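The auto-enable rule and the "either pass can fail the phase" rule from the `--adversarial` entry can be sketched as follows. The function names, the input shape, and the case-insensitive match are assumptions for illustration; only the file pattern, the Handoff trigger, and the CRITICAL/HIGH rule come from the text above.

```javascript
// Pattern quoted in the entry; case-insensitive matching is an assumption.
const SENSITIVE = /auth|payment|migration|rls|service_role/i;

// True when a second, adversarial verifier should be spawned.
function needsAdversarialPass({ milestone, planFiles, flag }) {
  return Boolean(flag) || milestone === "Handoff" || planFiles.some((f) => SENSITIVE.test(f));
}

// Findings from both passes are merged; one CRITICAL or HIGH finding fails the phase.
function phaseVerdict(cooperativeFindings, adversarialFindings) {
  const findings = [...cooperativeFindings, ...adversarialFindings];
  const blocking = findings.some((f) => f.severity === "CRITICAL" || f.severity === "HIGH");
  return { verdict: blocking ? "FAIL" : "PASS", findings };
}

console.log(needsAdversarialPass({ milestone: "M2", planFiles: ["src/lib/auth.ts"], flag: false }));
console.log(phaseVerdict([{ severity: "LOW" }], [{ severity: "HIGH" }]).verdict);
```

A plain substring pattern like this is deliberately broad: `rls` would also match a file named `urls.ts`, which errs toward running the extra pass.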
1793
+ #### Forked subagents
1794
+
1795
+ - **`CLAUDE_CODE_FORK_SUBAGENT=1`** in `~/.claude/settings.json` env block. Anthropic shipped forked subagents in 2026-04 to solve the "design subagent loses 50k tokens of nuance" failure mode. Forks inherit full conversation history + share the prompt cache. On by default for all installs.
1796
+ - **`/qualia-design` prefers forks for batch fan-out** when the conversation contains design-taste context. Blank-context spawns are still used for mechanical anti-pattern fixes.
1797
+
1798
+ #### Operational tooling
1799
+
1800
+ - **`hooks/git-guardrails.js`** — PreToolUse/Bash hook blocking destructive git ops universally (OWNER too): `git push --force`/`-f` to main/master, `git reset --hard` while on main/master, `git clean -fd[x]`, `git branch -D main|master`, `rm -rf .git`. `--force-with-lease` is allowed. Escape hatch: `QUALIA_ALLOW_DESTRUCTIVE=1`. Exits 2 with a clear reason and remediation suggestion. Inspired by `mattpocock/skills/git-guardrails-claude-code`.
1801
+ - **`qualia-framework doctor`** (aliases: `health`, `health-check`) — Post-install diagnostic. Critical files, all 9 hooks present, knowledge layer initialized, `settings.json` hook wiring (SessionStart + PreToolUse + PreCompact + Stop), config metadata. Exits 0 if healthy, 1 with itemized issues. Inspired by davila7/claude-code-templates' `--health-check`.
1802
+ - **`qualia-framework flush`** — Convenience CLI wrapper around `bin/knowledge-flush.js` for ad-hoc invocation. `qualia-framework flush --dry-run` previews what the next cron run would do.
1803
+
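A minimal sketch of how the guardrail rules listed for `hooks/git-guardrails.js` might be matched is shown below. The regular expressions, the `blockReason` name, and the way the current branch is passed in are assumptions; the real hook may parse commands differently. The sketch only catches a force-push when `main` or `master` appears in the command itself.

```javascript
// Returns a reason string when the command should be blocked, otherwise null.
function blockReason(command, { branch = "", env = {} } = {}) {
  if (env.QUALIA_ALLOW_DESTRUCTIVE === "1") return null; // documented escape hatch
  const onProtected = /^(main|master)$/.test(branch);
  const namesProtected = /\b(main|master)\b/.test(command);
  // The whitespace guards keep --force-with-lease from matching the --force rule.
  const forced = /(^|\s)(--force|-f)(\s|$)/.test(command);
  if (/\bgit\s+push\b/.test(command) && forced && namesProtected) return "force-push to main/master";
  if (/\bgit\s+reset\s+--hard\b/.test(command) && onProtected) return "hard reset on main/master";
  if (/\bgit\s+clean\s+-fdx?\b/.test(command)) return "git clean would delete untracked files";
  if (/\bgit\s+branch\s+-D\s+(main|master)\b/.test(command)) return "deleting main/master";
  if (/\brm\s+-rf\s+\.git(\s|\/|$)/.test(command)) return "deleting the repository metadata";
  return null;
}

console.log(blockReason("git push --force origin main"));            // blocked
console.log(blockReason("git push --force-with-lease origin main")); // allowed
console.log(blockReason("rm -rf .github"));                          // allowed, not .git
```

In a PreToolUse hook, a non-null reason would be printed and the process would exit with code 2, matching the behavior described above.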
1804
+ ### Changed
1805
+
1806
+ - **Hook count: 7 → 9.** Added `git-guardrails.js`, `stop-session-log.js`. New `Stop` event in `settings.json`.
1807
+ - **`/qualia-learn` rewritten to use the loader.** Duplicate detection now goes through `knowledge.js search`. Append step uses `knowledge.js append --type pattern --title ...` — no manual ID generation, no shell-escaping concerns.
1808
+ - **`/qualia-debug`, `/qualia-plan`, `/qualia-new`, `/qualia-review` migrated off hardcoded `cat`.** Five `cat ~/.claude/knowledge/*.md` calls across four skills now go through the loader. Newly-added knowledge files become reachable to every skill via the index automatically.
1809
+ - **`agents/research-synthesizer.md` uses `model: haiku`.** Conservative first entry in the model-per-agent matrix. Synthesizer is pure markdown merging — no new reasoning needed, ~7× cheaper. Other agents (planner/builder/verifier/plan-checker/roadmapper/qa-browser) retain their default model — they're stakes-bearing.
1810
+ - **`bin/install.js` initializes the knowledge layer** on first install (never overwrites existing content). The `templates/knowledge/` subdirectory is excluded from the regular templates copy to avoid double-installation.
1811
+ - **`bin/install.js` ships `knowledge.js` + `knowledge-flush.js`** to `~/.claude/bin/` alongside `state.js`/`qualia-ui.js`/`statusline.js`. `QUALIA_BIN_FILES` updated for clean uninstall.
1812
+
1813
+ ### Fixed
1814
+
1815
+ - **Two pre-existing test failures** in `tests/bin.test.sh` that had been broken since v3.2.0: hook-count assertion still expected 8 hooks (had been wrong for two releases), and the hook-wiring assertion still grep'd for the deleted `block-env-edit.js`. Both updated to match the v4.3.0 install state (9 hooks, no block-env-edit).
1816
+
1817
+ ### Notes
1818
+
1819
+ Cumulative tests: **+49 new passing assertions**, 2 broken tests fixed, 168/168 node tests + 78/79 bin tests + 61/68 hook tests (the 7 pre-deploy-gate + 1 erp-api-key shell failures predate this release and are tracked separately).
1820
+
1821
+ What's deferred to v4.4.0: worktree-aware phase parallelism (Cole Medin's pillars 2–4 — `bin/qualia-worktree.sh`, port-from-hash, Supabase branch-per-worktree), audit cleanup (orphan skills, color drift, per-employee memory scoping — v4.1.0 findings #4, #6, #7, #8), `npm publish` post-merge.
1822
+
1823
+ For developers integrating the framework: re-run `npx qualia-framework@latest install` after this release lands to pick up the new hooks, knowledge layer, and `bin/` scripts. Existing `~/.claude/knowledge/*.md` content is preserved on reinstall.
1824
+
1825
+ ## [4.1.1] — 2026-04-22
1826
+
1827
+ **Critical silent-fail hotfix.** Follow-up to the v4.1.0 audit (`docs/reviews/v4.1.0-audit.html`) which surfaced 142 findings across 4 dimensions. This release addresses the 5 highest-risk issues — each one previously let an operation fail silently or skip safety checks without telling the user. Subsequent releases (v4.2.0 structural, v4.3.0 harness patterns) will handle the remaining findings.
1828
+
1829
+ ### Fixed
1830
+
1831
+ - **`grounding.md` phantom on stale installs** (`hooks/session-start.js`, `hooks/auto-update.js`). Users who upgraded from v4.0.5 → v4.1.0 without re-running `npx qualia-framework@latest install` had no `~/.claude/rules/grounding.md`, so every planner/builder/verifier spawn silently received empty grounding context. Session-start now spot-checks 5 critical files (`grounding.md`, `security.md`, `frontend.md`, `deployment.md`, `state.js`) once per 24h and prints a loud banner with the exact install command when anything is missing. `auto-update.js` invalidates the health cache on every version bump so newly-shipped critical files are verified on the next session.
1832
+ - **`qualia-report` ERP upload silent-fail chain** (`skills/qualia-report/SKILL.md`). Three independent silent failures were producing garbage reports: empty `API_KEY` → sent `Authorization: Bearer ` (blank token) then surfaced a generic 401 with "Ask Fawzi" and no diagnostic. Empty `CLIENT_REPORT_ID` from a state.js failure → commit message dropped the ID, ERP payload carried empty string. `SUBMITTED_BY` shell-interpolated into `node -e` script → a single quote in `git config user.name` silently broke the payload. Fixes: guard empty API key before the POST loop with a clear `~/.claude/.erp-api-key` diagnostic, validate `CLIENT_REPORT_ID` is non-empty and fail loudly if state.js didn't return one, pass `SUBMITTED_BY`/`SUBMITTED_AT`/`CLIENT_REPORT_ID`/`REPORT_FILE` via environment variables (which are inert to shell metacharacters) instead of string interpolation, and make the 401 handler actually explain the likely cause.
1833
+ - **`/qualia-ship` had no state guard and a hallucinated domain** (`skills/qualia-ship/SKILL.md`). The skill would run from any state (setup/planned/built — even `shipped` → double-deploy possible), its security scan only grepped `service_role` (missed hardcoded keys, tracked `.env`, `dangerouslySetInnerHTML`), and its post-deploy verification used a literal `{domain}` placeholder that expected the LLM to hallucinate the URL. Now: state gate refuses any status except `polished` or `verified+pass` (with `--force` escape hatch for hotfixes), security scan inlines the CRITICAL checks from `/qualia-review` verbatim so the two skills agree, and the URL is read from `tracking.json.deployed_url` with a loud error if missing. Description also gained the missing trigger phrases (`deploy`, `ship it`, `go live`, `push to prod`, `launch`).
1834
+ - **`templates/help.html:410` mis-described `/qualia-idk`.** It was listed as "Alias for /qualia. The smart router handles all 'idk' situations", which directly contradicted the actual SKILL.md, where it is defined as a diagnostic intelligence running two isolated scans. Team members reading the help page were sent to the wrong skill. The description now matches the skill.
1835
+ - **`hooks/session-start.js` silent error swallow.** The top-level `try { ... } catch { }` block recorded `result: "allow"` in traces even when the try body threw, so silent session-start crashes were invisible in analytics. Error cases now log to `.qualia-traces/{date}.jsonl` with `result: "error"` and the exception message.
1836
+ - **CI was failing on every PR, across all 18 matrix cells, and being ignored.** The `.github/workflows/test.yml` used `actions/setup-node@v4` with `cache: 'npm'`, which requires a `package-lock.json` — but this framework has zero runtime deps and no lockfile. Every PR since the workflow was added reported FAILURE before tests even ran, and merges happened on red CI. Removed the `cache:` key; tests now actually execute on every push.
1837
+ - **`pre-push.js` now passes `-c core.autocrlf=false` on its stamp commands.** Defensive fix against the Windows failure mode where autocrlf normalization produces an empty diff, the stamp-commit fails, and the hook's rollback restores stale `last_commit`. The behavioral test for this path is temporarily skipped on Windows (tracked for v4.1.2) — the Linux + macOS matrix is fully green with this change. Windows platform-specific investigation needs a live Windows box and is out of scope for this hotfix.
1838
+
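The `qualia-report` fix relies on passing values through the environment instead of splicing them into a `node -e` script. The skill itself does this from shell; the Node sketch below shows the same idea under that assumption, with hypothetical function and payload field names (only `client_report_id`, `SUBMITTED_BY`, and `CLIENT_REPORT_ID` appear in the entry).

```javascript
const { execFileSync } = require("node:child_process");

// The child reads its inputs from process.env, so no user-controlled text
// is ever interpolated into the source passed to `node -e`.
const script = `
  const payload = {
    submitted_by: process.env.SUBMITTED_BY,
    client_report_id: process.env.CLIENT_REPORT_ID,
  };
  process.stdout.write(JSON.stringify(payload));
`;

function buildPayload(submittedBy, clientReportId) {
  // Fail loudly instead of sending an empty ID.
  if (!clientReportId) throw new Error("CLIENT_REPORT_ID is empty; refusing to build the payload");
  const out = execFileSync(process.execPath, ["-e", script], {
    env: { ...process.env, SUBMITTED_BY: submittedBy, CLIENT_REPORT_ID: clientReportId },
    encoding: "utf8",
  });
  return JSON.parse(out);
}

// A single quote in the name no longer breaks anything.
console.log(buildPayload("Dana O'Brien", "QS-REPORT-03"));
```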
1839
+ ### Notes
1840
+
1841
+ Full framework review (142 findings) and v4.2.0 / v4.3.0 / v4.4.0 roadmap documented separately. This release handles the 5 highest-risk silent-fail paths; subsequent releases handle structural debt (3-tier memory, unified color module, orphan skill consolidation) and harness-engineering patterns (adversarial build, continuous reviewer agents, component-fetch skill).
1842
+
1843
+ ## [4.1.0] — 2026-04-21
1844
+
1845
+ **Command quality + build workflow hardening.** Deep research across 5 parallel Opus agents surfaced 15 concrete improvements — shipped across 4 commits. Every agent spawn now loads a shared **Grounding Protocol** (cite-or-INSUFFICIENT-EVIDENCE, no hedging, file:line evidence required for every finding) and deterministic scoring rubrics. Build workflow gains cache-aware prompt ordering (92% prefix-cache hit rate per Anthropic docs), explicit parallel wave dispatch, and a structured builder output contract. `qualia-debug` was fully rewritten from interactive to investigative one-shot. Research reports committed to `docs/research/`. 168/168 tests passing.
1846
+
1847
+ ### Added
1848
+
1849
+ - **`rules/grounding.md` — shared Grounding Protocol + 5 rubrics.** New file referenced from every skill that spawns a subagent. Contains: 8-rule Grounding Protocol (every claim requires `file:line — "quoted"` evidence, no hedging language, scores require rubric citations, output shapes are contracts, tool budgets enforced, preconditions checked); **Severity Rubric** with objective criteria per level and a deterministic `max(1, 5 − ⌊weighted_sum/8⌋)` category-score formula; **Task-Done Rubric** (compiles / no stubs / wired / AC validated / committed); **Evidence Citation Format**; **Deviation JSON Format**; **Design Quality Rubric** (6 dimensions × 3 levels); **cache-aware prompt-ordering rule**. Install.js picks this up automatically via `rules/` directory copy.
1850
+ - **Structured Output Contract for builder** (`agents/builder.md`). Builder must return `DONE — Task {N}: {commit_hash}` with file list, `BLOCKED — {reason}` with JSON deviation block (`{type, task, file, planned, actual, impact}`), or `PARTIAL — {done}; remaining: {left}`. Orchestrator can now parse results programmatically instead of regex-guessing free-text.
1851
+ - **Explicit file-based dependency graph for wave assignment** (`agents/planner.md`). Replaces vibes-based "tasks with no dependencies" with a mechanical algorithm: build `writes(T)` / `reads(T)` sets from Files and Context fields, declare edge A→B when `writes(A) ∩ reads(B) ≠ ∅`, topological-sort into waves, enforce write-conflict check within each wave. Worked example table included. Same inputs → same waves.
1852
+ - **Rule 8 for plan-checker** (`agents/plan-checker.md`). Each task's `**Validation:**` list must include at least one `grep-match` or `command-exit` that tests behavior — a task whose only Validation is `test -f {file}` fails the rule. Stops stubs and placeholders from passing the build gate.
1853
+ - **Tool budgets** across 3 open-ended agents and 1 skill: researcher (3 Context7 + 3 WebFetch + 2 WebSearch per dimension), verifier (25 bash/grep per invocation), plan-checker (10 per invocation), and qualia-debug (10 Read/Grep/Bash). Each returns INSUFFICIENT EVIDENCE instead of speculative output once its budget is exhausted.
1854
+ - **Frontend gate on verifier's Design Verification section** (`agents/verifier.md`). Grep the phase plan for `.tsx`/`.jsx`/`.css`/`Persona:\s*(frontend|ux)` first — if absent, skip the ~40-command design verification block entirely. Saves substantial time on backend-only phases.
1855
+ - **`<wave_context>` block in builder prompts** (`skills/qualia-build/SKILL.md`). Lists sibling tasks in the same wave (title + files only, ~50 tokens per task) so parallel builders don't make conflicting semantic choices on shared types or patterns.
1856
+ - **Evidence citation requirement for milestone suggestions** (`agents/research-synthesizer.md`). Every arc entry must cite `[DIMENSION.md: <finding>]`. Speculative milestones marked `[speculative — no source]`.
1857
+ - **Parallel Agent fan-out in `qualia-design`** for >5 target files. Batches of 5, one Agent per batch, all dispatched in a single response turn. Post-fix verification greps catch reverted anti-patterns (`outline:none` without replacement, generic fonts, `max-w-7xl`, missing alt, blue-purple gradients).
1858
+ - **Parallelized security scans in `qualia-review`.** Independent greps now explicitly dispatched as parallel Bash calls in one turn. Saves 15-30s on large codebases.
1859
+ - **Typed input contracts across 7 agents** (planner, plan-checker, builder, verifier, researcher, qa-browser, roadmapper). Replaces prose "You receive: X + Y" with `<variable>` blocks + types + sources. Catches missing inputs at prompt-assembly time instead of mid-execution.
1860
+ - **`<full_detail>` declared in roadmapper Input section.** Was a ghost parameter referenced in the body but never declared — orchestrator had no mechanism to pass it.
1861
+ - **Cache-aware prompt structure in `qualia-build`.** Split `<phase_context>` (PROJECT.md/DESIGN.md, phase-stable) from `<task_context>` (per-task @files, varies). Stable prefix first, dynamic last — preserves Anthropic prompt-caching prefix-hit across parallel wave tasks (docs report 92% hit rate + 81% cost reduction at Claude Code scale when prefix is byte-identical).
1862
+ - **Research reports in `docs/research/`** documenting the analysis: `2026-04-21-command-quality-deep-research.md` (15-item synthesis from 4 parallel Opus audits) and `2026-04-21-industry-best-practices.md` (255 lines, cited sources on prompt caching, verification loops, hallucination reduction, multi-agent orchestration).
1863
+
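The file-based wave assignment described for the planner can be sketched as below: an edge A→B exists when `writes(A) ∩ reads(B) ≠ ∅`, tasks are layered topologically, and two tasks writing the same file never share a wave. The task shape and the choice to defer a conflicting task to the next wave are assumptions; the entry does not say how `agents/planner.md` resolves write conflicts.

```javascript
const intersects = (a, b) => a.some((x) => b.includes(x));

// tasks: [{ id, writes: [files], reads: [files] }] -> array of waves (arrays of ids)
function assignWaves(tasks) {
  // deps.get(B) holds every A with writes(A) ∩ reads(B) ≠ ∅
  const deps = new Map(tasks.map((t) => [t.id, new Set()]));
  for (const a of tasks) {
    for (const b of tasks) {
      if (a !== b && intersects(a.writes, b.reads)) deps.get(b.id).add(a.id);
    }
  }

  const waves = [];
  const done = new Set();
  let remaining = [...tasks];
  while (remaining.length > 0) {
    const ready = remaining.filter((t) => [...deps.get(t.id)].every((d) => done.has(d)));
    if (ready.length === 0) throw new Error("dependency cycle between tasks");
    const wave = [];
    for (const t of ready) {
      // Write-conflict check: defer a task that writes a file already claimed in this wave.
      if (!wave.some((w) => intersects(w.writes, t.writes))) wave.push(t);
    }
    wave.forEach((t) => done.add(t.id));
    remaining = remaining.filter((t) => !done.has(t.id));
    waves.push(wave.map((t) => t.id));
  }
  return waves;
}

const plan = [
  { id: "T1", writes: ["src/types.ts"], reads: [] },
  { id: "T2", writes: ["src/api.ts"], reads: ["src/types.ts"] },
  { id: "T3", writes: ["src/form.tsx"], reads: ["src/types.ts"] },
  { id: "T4", writes: ["src/page.tsx"], reads: ["src/api.ts", "src/form.tsx"] },
];
console.log(assignWaves(plan)); // [ [ 'T1' ], [ 'T2', 'T3' ], [ 'T4' ] ]
```

Because the result depends only on the declared file sets and the input order, the same plan always yields the same waves.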
1864
+ ### Changed
1865
+
1866
+ - **Plan-checker revision loop capped at 2 iterations** (was 3). Amazon/NeurIPS 2025 measured reflection gains at 74%→86% for 1 round, only 88% for 3 rounds — iteration 3 added 2pp over iteration 1, not worth the extra planner spawn. Updated `qualia-plan/SKILL.md`, `plan-checker.md`, and all stale "3 cycles" references.
1867
+ - **`qualia-review` scoring replaced subjective thresholds with deterministic formula.** Quick-reference table rewritten to match the computed formula (earlier drafts had inconsistent boundary rules — verified mechanically during release QA and corrected).
1868
+ - **Verifier now receives PROJECT.md inlined in its spawn prompt.** Previously blind to project conventions — Quality scoring rubric referenced "project conventions" but verifier had no way to read them.
1869
+ - **Wave dispatch explicitly parallel in `qualia-build/SKILL.md`.** Replaced "parallel if multiple" language with an explicit instruction: spawn all wave tasks as separate `Agent()` calls in the SAME response turn — do NOT await one before the next. Prior natural-language phrasing relied on harness behavior rather than enforcing true concurrency.
1870
+ - **`qualia-debug` rewritten from interactive to investigative one-shot.** Previously required 4 mandatory user questions and a diagnosis-confirmation gate before any investigation. Now parses symptom from `$ARGUMENTS`, runs diagnostic grep batches (general/frontend/perf modes), hard 10-call tool budget, INSUFFICIENT EVIDENCE return instead of speculative fixes, structured DEBUG-{timestamp}.md report output to `.planning/`. Matches the one-shot pattern of every other `/qualia-*` command.
1871
+ - **`qualia-design` critique section now uses the structured Design Quality Rubric** (File | Dimension | Issue | Line | Severity) instead of vibes-based evaluation. Any dimension scoring below 4 is a mandatory fix.
1872
+
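For reference, the category-score formula from the Severity Rubric, `max(1, 5 − ⌊weighted_sum/8⌋)`, works out as in the sketch below. The per-severity weights are invented for illustration and are not taken from `rules/grounding.md`; only the formula itself comes from this changelog.

```javascript
// Hypothetical weights, chosen only to make the arithmetic visible.
const WEIGHTS = { CRITICAL: 8, HIGH: 4, MEDIUM: 2, LOW: 1 };

function categoryScore(findings) {
  const weightedSum = findings.reduce((sum, f) => sum + WEIGHTS[f.severity], 0);
  // Every full 8 points of weight costs one point; the score never drops below 1.
  return Math.max(1, 5 - Math.floor(weightedSum / 8));
}

console.log(categoryScore([]));                                                            // 5
console.log(categoryScore([{ severity: "HIGH" }, { severity: "MEDIUM" }, { severity: "LOW" }])); // sum 7, still 5
console.log(categoryScore([{ severity: "CRITICAL" }]));                                    // sum 8, score 4
```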
1873
+ ### Fixed
1874
+
1875
+ - **Latent `grep -qL` bug in `qualia-review` API auth check.** The combination of `-q` (quiet) and `-L` (list non-matching files) is undefined in POSIX and was producing inverted "UNPROTECTED" output. Rewrote as a clean `if ! grep -q ... then echo UNPROTECTED` loop. Verified against mock directory of protected + unprotected routes.
1876
+ - **Full `npx next build` removed from `qualia-review` Performance Scan.** A 30-120s side-effectful build triggered during a "scan" command was a hidden cost that made review surprisingly slow and polluted `.next/`. Replaced with `du -sh .next/static/chunks/*.js` against existing build artifacts, with a warning if no build output exists.
1877
+
1878
+ ### Notes
1879
+
1880
+ - **Always pin `@latest` when upgrading.** npx caches at `~/.npm/_npx/` and has no time-based TTL, so `npx qualia-framework install` can silently re-run a cached old copy. Use `npx qualia-framework@latest install` (or `npx clear-npx-cache` first). README updated to reflect this. ([npm/rfcs#700](https://github.com/npm/rfcs/issues/700))
1881
+ - Users who update the framework must re-run the install script so `~/.claude/rules/grounding.md` lands — every skill's spawn prompt now references this file.
1882
+ - Any client projects mid-phase where the plan-checker was on iteration 3 will now escalate at iteration 2. Acceptable trade-off per the measured reflection-gain data.
1883
+ - The builder Output Contract (`DONE/BLOCKED/PARTIAL`) is advisory today — existing orchestrator skills do not programmatically parse it. Enforcement will land in a follow-up minor when the parsing is wired through.
1884
+
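Since the builder Output Contract is not yet parsed by any orchestrator, the following is only a sketch of what such parsing might look like. The regular expressions, the numeric task ID, and the 7 to 40 character hex commit hash are assumptions; the three status-line shapes are the ones listed under Added.

```javascript
const DASH = "\u2014"; // the dash character used in the contract's status lines

// Parses the first line of a builder reply into a structured result.
function parseBuilderResult(text) {
  const first = text.trim().split("\n")[0];
  let m;
  if ((m = first.match(new RegExp(`^DONE ${DASH} Task (\\d+): ([0-9a-f]{7,40})$`)))) {
    return { status: "DONE", task: Number(m[1]), commit: m[2] };
  }
  if ((m = first.match(new RegExp(`^BLOCKED ${DASH} (.+)$`)))) {
    return { status: "BLOCKED", reason: m[1] };
  }
  if ((m = first.match(new RegExp(`^PARTIAL ${DASH} (.+); remaining: (.+)$`)))) {
    return { status: "PARTIAL", done: m[1], remaining: m[2] };
  }
  return { status: "UNPARSEABLE" };
}

console.log(parseBuilderResult("DONE \u2014 Task 3: a1b2c3d\nsrc/api.ts"));
console.log(parseBuilderResult("PARTIAL \u2014 schema migrated; remaining: seed data"));
```

Returning an explicit `UNPARSEABLE` status, rather than guessing, keeps a malformed reply from being mistaken for success.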
1885
+ ### Deferred to v4.2.0
1886
+
1887
+ - Mechanical-fix bypass in plan-checker (skip planner re-spawn for frontmatter/wave-assignment issues — ~4 hrs orchestration work, regression risk not suitable for this release).
1888
+ - Pre-Build Context Packet (single JSON consolidating PROJECT.md + DESIGN.md + plan + wave-context before spawning any builders).
1889
+ - Intra-wave task verification (run task Validation contracts immediately after each builder completes, before next wave starts).
1890
+ - New agents: migrator, dependency-auditor, rollback.
1891
+ - `curl` fallback in qa-browser for environments without Playwright MCP.
1892
+
1893
+ ## [4.0.5] — 2026-04-19
1894
+
1895
+ **Statusline refresh.** The phase segment now shows milestone + tasks +
1896
+ blockers (not just phase number), and the line closes with a
1897
+ `⬢ Qualia · {firstName}` signature pulled from
1898
+ `~/.claude/.qualia-config.json`. Static `hooks N` / `skills N` counters
1899
+ removed — they never changed between projects, so they added noise
1900
+ without signal. All 168 tests still green.
1901
+
1902
+ ### Added
1903
+
1904
+ - **`bin/statusline.js` — milestone segment.** When
1905
+ `.planning/tracking.json` has `milestone` + `milestone_name`, the
1906
+ statusline renders `M{n}·{shortName}` (name truncated to 14 chars)
1907
+ before the phase number. Previously only the phase number (`P1/3`)
1908
+ was visible — milestone context had to be looked up manually.
1909
+ - **`bin/statusline.js` — task progress.** When `tasks_total > 0`,
1910
+ renders `T{done}/{total}` alongside the phase. Gives mid-phase
1911
+ progress at a glance during `/qualia-build` waves.
1912
+ - **`bin/statusline.js` — blocker badge.** When
1913
+ `tracking.json.blockers` is a non-empty array, renders `!{n}` in
1914
+ red. Intentionally loud — blockers should never sit unnoticed.
1915
+ - **`bin/statusline.js` — Qualia signature.** Line 2 now ends with
1916
+ `⬢ Qualia · {firstName}` where `firstName` is the first whitespace-
1917
+ delimited token of `installed_by` in `~/.claude/.qualia-config.json`.
1918
+ This branded closer makes the statusline feel like ours rather than a
1919
+ generic Claude Code tool.
1920
+
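The four statusline additions can be pictured with the sketch below. It is not the code in `bin/statusline.js`: the `phase` and `tasks_done` field names, the space separator, and the fallback when `installed_by` is empty are assumptions, and colors are omitted.

```javascript
// Builds the milestone / phase / task / blocker segment from tracking.json data.
function phaseSegment(t) {
  const parts = [];
  if (t.milestone && t.milestone_name) {
    parts.push(`M${t.milestone}·${t.milestone_name.slice(0, 14)}`); // name truncated to 14 chars
  }
  parts.push(`P${t.phase}/${t.total_phases}`);
  if (t.tasks_total > 0) parts.push(`T${t.tasks_done}/${t.tasks_total}`);
  if (Array.isArray(t.blockers) && t.blockers.length > 0) parts.push(`!${t.blockers.length}`);
  return parts.join(" ");
}

// The signature uses the first whitespace-delimited token of installed_by.
function signature(config) {
  const firstName = String(config.installed_by || "").trim().split(/\s+/)[0];
  return firstName ? `⬢ Qualia · ${firstName}` : "⬢ Qualia";
}

console.log(phaseSegment({
  milestone: 2, milestone_name: "Payments and Billing",
  phase: 1, total_phases: 3, tasks_done: 2, tasks_total: 5, blockers: ["waiting on API key"],
}));
console.log(signature({ installed_by: "Dana Lee" }));
```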
1921
+ ### Removed
1922
+
1923
+ - **`bin/statusline.js` — hooks/skills counters.** The `hooks N`
1924
+ and `skills N` indicators were removed from line 1. Both counts are
1925
+ effectively static across all projects on a single machine (a given
1926
+ install has the same hooks and skills everywhere), so they were
1927
+ visual noise — they didn't help the employee understand *this*
1928
+ project's state. `mem N` is retained because it genuinely varies
1929
+ per project (different memories accumulated per working directory).
1930
+
1931
+ ## [4.0.4] — 2026-04-18
1932
+
1933
+ **Audit follow-up + ERP integration upgrade.** Eight concrete improvements
1934
+ from the framework deep-dive audit. Tests: 164 → 168 (+4 regression tests,
1935
+ covering the new `next-report-id` subcommand and the JOURNEY.md
1936
+ pre-populate on `close-milestone`).
1937
+
1938
+ ### Added
1939
+
1940
+ - **`qualia-framework erp-ping`** — new CLI subcommand that POSTs a
1941
+ synthetic `dry_run: true` payload to the ERP and prints HTTP code,
1942
+ response body, and the ERP-returned `report_id`. Single-command
1943
+ connectivity + key-validity + endpoint health check. Aliased as `ping`.
1944
+ - **`QS-REPORT-NN` client-side identifiers** — every session report
1945
+ now carries a stable, sequential client ID (`QS-REPORT-01`, `-02`, …
1946
+ per project) stored in `tracking.json.report_seq` and sent to the ERP
1947
+ in a new `client_report_id` field. It survives retries and UUID
1948
+ changes on the ERP side, and is the preferred dedupe key going forward.
1949
+ - **`state.js next-report-id [--peek]`** — new mutator subcommand that
1950
+ increments `report_seq` and returns the next `QS-REPORT-NN`. `--peek`
1951
+ returns without incrementing (for `/qualia-report --dry-run`).
1952
+ - **`/qualia-report --dry-run`** — assemble + print the payload but
1953
+ skip the POST, skip the git commit, and peek the sequence counter
1954
+ without consuming one. Useful for previewing before a real clock-out.
1955
+ - **`/qualia-report` now retries with backoff** — 3 attempts at 1s, 3s,
1956
+ 9s on transient failures (timeout, 5xx, network). 401/422 are
1957
+ permanent failures and fail fast. Local report commit is unchanged —
1958
+ no data loss on upload failure, just a stale ERP view until retry.
1959
+ - **`/qualia-report` now displays both IDs on success** — "Uploaded as
1960
+ QS-REPORT-03 (ERP: {uuid})". Employees and the ERP share the same
1961
+ stable reference.
1962
+ - **`/qualia-new --full-detail`** — new flag that instructs the
1963
+ roadmapper to write full phase-level detail for EVERY milestone
1964
+ upfront (not just M1). Default behavior (progressive detail)
1965
+ unchanged. `agents/roadmapper.md` honors `<full_detail>` in its
1966
+ prompt contract.
1967
+ - **Visible progressive-detail notice** — after journey approval,
1968
+ `/qualia-new` now explicitly tells the user "Milestone 1 is fully
1969
+ planned. M2..M{N-1} are sketched. Full detail fills in when
1970
+ /qualia-milestone opens each one." Previously only in template
1971
+ comments — easy for a new team member to miss.
1972
+
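The retry policy described for `/qualia-report` (three attempts, 1s/3s/9s backoff, fail fast on 401/422) could look roughly like the sketch below. The function name and result shape are hypothetical. How the three delays map onto the three attempts is not spelled out, so this sketch waits the listed delay after each failed attempt that still has a retry left; it also treats any non-2xx, non-5xx status as permanent, which goes beyond the two codes the entry names.

```javascript
const realSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// post() is expected to resolve to { status } or to throw on network errors and timeouts.
async function uploadWithRetry(post, { delaysMs = [1000, 3000, 9000], sleep = realSleep } = {}) {
  let last = { status: 0 };
  let attempts = 0;
  for (let i = 0; i < delaysMs.length; i++) {
    attempts = i + 1;
    try {
      last = await post();
    } catch (err) {
      last = { status: 0, error: String(err) }; // status 0 stands for "no HTTP response"
    }
    if (last.status >= 200 && last.status < 300) return { ok: true, attempts, status: last.status };
    const transient = last.status === 0 || last.status >= 500;
    if (!transient) break; // 401, 422 and similar: retrying cannot help
    if (i < delaysMs.length - 1) await sleep(delaysMs[i]);
  }
  return { ok: false, attempts, status: last.status };
}

// Succeeds on the first attempt, so no delay is incurred.
uploadWithRetry(async () => ({ status: 200 })).then((result) => console.log(result));
```

Injecting `sleep` keeps the policy testable without real waiting.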
1973
+ ### Fixed
1974
+
1975
+ - `bin/state.js` `close-milestone` now reads `.planning/JOURNEY.md` to
1976
+ pre-populate `tracking.json.milestone_name` with the next milestone's
1977
+ name. Previously, between `close-milestone` and the next
1978
+ `state.js init --force` from the roadmapper, `milestone_name` sat
1979
+ blank — the ERP tree view would briefly show an unnamed milestone.
1980
+ Falls back to blank if JOURNEY.md is missing (legacy projects, pre-v4).
1981
+ - `bin/cli.js` — `QUALIA_AGENT_FILES` expanded from 4 to all 8 agents
1982
+ (`planner`, `builder`, `verifier`, `qa-browser`, **`plan-checker`**,
1983
+ **`researcher`**, **`research-synthesizer`**, **`roadmapper`**).
1984
+ `qualia-framework uninstall` would previously leave the last 4 on
1985
+ disk as orphans.
1986
+ - `bin/cli.js` `cmdMigrate` — removed `block-env-edit.js` from
1987
+ `requiredEditHooks`. That hook was deleted in v3.2.0 and
1988
+ `install.js` actively purges it; `migrate` was trying to wire a
1989
+ non-existent file into `settings.json`.
1990
+ - `bin/install.js` + `bin/cli.js` — unpinned
1991
+ `next-devtools-mcp@0.3.10` → `@latest`. The pin had silently drifted out of date.
1992
+ - `bin/install.js` — warn (instead of OK) when an existing
1993
+ `~/.claude/.erp-api-key` is under 10 bytes. Clearly truncated or
1994
+ placeholder keys no longer silently pass install. Real bearer tokens
1995
+ are ≥ 20 bytes; the threshold is deliberately loose to avoid false
1996
+ positives.
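
A rough sketch of the key-length check described in the last entry. The function name and return shape are hypothetical, and the sketch measures the raw file size; whether the installer trims whitespace first is not stated.

```javascript
const fs = require("fs");

// Threshold from the changelog entry: anything under 10 bytes is suspicious.
const MIN_KEY_BYTES = 10;

// Hypothetical helper: classify an existing API key file by size.
function checkApiKeyFile(keyPath) {
  let size;
  try {
    size = fs.statSync(keyPath).size;
  } catch {
    // No key file at all is a separate case from a truncated one.
    return { level: "missing", size: 0 };
  }
  return size < MIN_KEY_BYTES
    ? { level: "warn", size }
    : { level: "ok", size };
}
```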
1997
+
1998
+ ### Changed
1999
+
2000
+ - `templates/tracking.json` — new field `report_seq: 0`.
2001
+ - `docs/erp-contract.md` — documented `client_report_id` (recommended)
2002
+ and `dry_run` (optional) on the POST payload.
2003
+
2004
+ ### Tests
2005
+
2006
+ - 164 → 168 (+4). New coverage: `next-report-id` increments,
2007
+ `next-report-id --peek` is side-effect-free, `close-milestone`
2008
+ pre-populates `milestone_name` from JOURNEY.md, and the fallback
2009
+ path when JOURNEY.md is absent.
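
The two `next-report-id` behaviours covered by these tests might look roughly like the sketch below. It operates on an in-memory object standing in for `tracking.json`; the function name is hypothetical, and the ID format is copied from the "QS-REPORT-03" example (how the `QS` prefix is derived is not stated in this changelog).

```javascript
// Hypothetical sketch: allocate or peek the next report sequence number.
function nextReportId(tracking, { peek = false, prefix = "QS" } = {}) {
  const next = (tracking.report_seq || 0) + 1;
  if (!peek) {
    // Only a real allocation consumes the counter.
    tracking.report_seq = next;
  }
  return `${prefix}-REPORT-${String(next).padStart(2, "0")}`;
}
```

A `--dry-run` would call the peek path, so previewing a report never burns an ID.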
2010
+
2011
+ ## [4.0.3] — 2026-04-18
2012
+
2013
+ **Zero-deferral release.** Closes the last two items that were previously
2014
+ deferred as trade-offs.
2015
+
2016
+ ### Fixed
2017
+
2018
+ - `hooks/pre-compact.js`: `--no-verify` and `--no-gpg-sign` are now
2019
+ configurable via `~/.claude/.qualia-config.json`:
2020
+
2021
+ ```json
2022
+ {
2023
+ "pre_compact": {
2024
+ "respect_user_hooks": true,
2025
+ "respect_gpg_signing": true
2026
+ }
2027
+ }
2028
+ ```
2029
+
2030
+ Default behavior is unchanged (bot commits bypass both, because
2031
+ compaction can fire at any moment and pre-commit test suites would
2032
+ routinely block the auto-save and lose STATE.md). Compliance-sensitive
2033
+ projects opt into strict mode per-flag. The flags used are included
2034
+ in the hook trace.
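
One way the two config keys could map to commit flags is sketched below. `commitFlags` is a hypothetical name; the only behaviour taken from the entry is that both bypass flags are used by default and each is dropped when its `respect_*` key is `true`.

```javascript
// Hypothetical sketch: derive git commit flags from .qualia-config.json content.
function commitFlags(config = {}) {
  const preCompact = config.pre_compact || {};
  const flags = [];
  // Default: bypass user pre-commit hooks so the auto-save cannot be blocked.
  if (preCompact.respect_user_hooks !== true) flags.push("--no-verify");
  // Default: skip GPG signing for the bot commit.
  if (preCompact.respect_gpg_signing !== true) flags.push("--no-gpg-sign");
  return flags;
}
```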
2035
+
2036
+ ### Added
2037
+
2038
+ - **All 26 skills now declare `allowed-tools`** in frontmatter. Per-skill
2039
+ conservative tool unions — wider rather than narrower to avoid
2040
+ breakage. Read-only skills (`qualia-help`, `qualia-resume`) declare
2041
+ it explicitly. The framework no longer relies on the user's default
2042
+ permission mode for tool scoping.
2043
+
2044
+ ## [4.0.2] — 2026-04-18
2045
+
2046
+ **Stability pass.** Closes every remaining HIGH + MEDIUM item from the
2047
+ v4.0.0 audit that could surface as a silent failure or instability.
2048
+ Tests: 159 → 164 (+5 regression tests).
2049
+
2050
+ ### Fixed — HIGH
2051
+
2052
+ - `hooks/session-start.js`: `readConfig()` now defined above its call
2053
+ site. It previously worked only through function-declaration hoisting and would have
2054
+ silently broken on any refactor to `const readConfig = …`.
2055
+ - `bin/state.js`: write-ahead journal (`.planning/.state.journal`)
2056
+ captures the pre-transition snapshot of STATE.md + tracking.json
2057
+ before the dual write. On next mutator invocation, if the journal
2058
+ exists we recover both files to the pre-transition state. A crashed
2059
+ mutator (SIGKILL / power loss between the two renames) no longer
2060
+ leaves the pair inconsistent. A corrupt journal is cleared, not fatal.
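
The write-ahead journal idea can be illustrated with a small sketch. Everything here is an assumption made for illustration: the JSON journal format, the function names, the `simulateCrash` switch, and keeping all files in one directory. Only the overall flow (snapshot, dual write, recover on next invocation, clear a corrupt journal) follows the entry above.

```javascript
const fs = require("fs");
const path = require("path");

const JOURNAL = ".state.journal";

// Run at the start of every mutator: roll back if a journal was left behind.
function recoverIfNeeded(dir) {
  const journalPath = path.join(dir, JOURNAL);
  if (!fs.existsSync(journalPath)) return "clean";
  let snapshot;
  try {
    snapshot = JSON.parse(fs.readFileSync(journalPath, "utf8"));
    if (snapshot === null || typeof snapshot !== "object" || Array.isArray(snapshot)) {
      throw new Error("bad journal");
    }
  } catch {
    // A corrupt journal is cleared, not fatal.
    fs.unlinkSync(journalPath);
    return "cleared-corrupt";
  }
  for (const [name, content] of Object.entries(snapshot)) {
    fs.writeFileSync(path.join(dir, name), content);
  }
  fs.unlinkSync(journalPath);
  return "recovered";
}

// updates: { "STATE.md": "...", "tracking.json": "..." }
function transition(dir, updates, simulateCrash = false) {
  recoverIfNeeded(dir);
  // 1. Journal the pre-transition content of every file we will touch.
  const snapshot = {};
  for (const name of Object.keys(updates)) {
    snapshot[name] = fs.readFileSync(path.join(dir, name), "utf8");
  }
  fs.writeFileSync(path.join(dir, JOURNAL), JSON.stringify(snapshot));
  // 2. Write each file via temp file + rename.
  let written = 0;
  for (const [name, content] of Object.entries(updates)) {
    const tmp = path.join(dir, name + ".tmp");
    fs.writeFileSync(tmp, content);
    fs.renameSync(tmp, path.join(dir, name));
    written += 1;
    if (simulateCrash && written === 1) {
      throw new Error("simulated crash between renames");
    }
  }
  // 3. Both writes landed: the journal is no longer needed.
  fs.unlinkSync(path.join(dir, JOURNAL));
}
```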
2061
+
2062
+ ### Fixed — MEDIUM
2063
+
2064
+ - `hooks/migration-guard.js`: DELETE / UPDATE `WHERE` scan is now
2065
+ per-statement (split on `;`) instead of file-global. A file
2066
+ containing `DELETE FROM foo;` followed by any later `… WHERE …`
2067
+ (in a SELECT, JOIN, etc.) would previously pass the check.
2068
+ - `hooks/migration-guard.js`: stdin read retry loop now sleeps 1ms
2069
+ between EAGAIN retries via `Atomics.wait` instead of spinning CPU.
2070
+ - `hooks/pre-push.js`: commit-failure path now unstages tracking.json
2071
+ and restores the working-tree copy, so the user's next manual commit
2072
+ isn't polluted by an aborted ERP-stamp change.
2073
+ - `bin/cli.js` — `cleanSettingsJson`: iterates ALL hook-event keys in
2074
+ settings.json instead of the hardcoded three (SessionStart /
2075
+ PreToolUse / PreCompact). Future hook events get cleaned automatically.
2076
+ - `bin/cli.js` — hook cleanup: introduce `QUALIA_LEGACY_HOOK_FILES`
2077
+ for removed-in-past-version hook filenames (currently
2078
+ `block-env-edit.js`). Uninstall cleans legacy hooks too.
2079
+ - `bin/statusline.js`: memory-path `dirKey` now strips BOTH `/` and `\`
2080
+ so Windows installs get a correct project key and the memory count
2081
+ actually renders.
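
For readers unfamiliar with the `Atomics.wait` trick mentioned for the stdin retry loop, a blocking sleep without a busy loop can be written roughly as follows. The function name `sleepSync` appears elsewhere in this changelog, but the body below is only a sketch of the general technique, not the framework's code.

```javascript
// Block the current thread for `ms` milliseconds without spinning the CPU.
// Atomics.wait parks the thread until the timeout because nothing ever
// notifies this private cell.
function sleepSync(ms) {
  const cell = new Int32Array(new SharedArrayBuffer(4));
  return Atomics.wait(cell, 0, 0, ms) === "timed-out";
}
```

Node.js permits `Atomics.wait` on the main thread, which is why this works in a plain hook script; browsers disallow it on the main thread.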
2082
+
2083
+ ### Tests
2084
+
2085
+ - +5 regression tests:
2086
+ · state.js recovers STATE.md + tracking.json from `.state.journal`
2087
+ · state.js: corrupt `.state.journal` is cleared without crashing
2088
+ · migration-guard: `DELETE FROM x; SELECT … WHERE …;` still blocks
2089
+ · migration-guard: `UPDATE … SET …; SELECT … WHERE …;` still blocks
2090
+ · install.js: reinstall preserves user-added hooks in settings.json
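
The two migration-guard cases above can be reproduced with a small per-statement scan. This is a simplified sketch with a hypothetical function name: it splits naively on `;` as the entry describes, so semicolons inside string literals or comments would confuse it.

```javascript
// Return every DELETE / UPDATE statement that has no WHERE clause of its own.
function findUnscopedWrites(sql) {
  return sql
    .split(";")
    .map((statement) => statement.trim())
    .filter(
      (statement) =>
        /^(DELETE\s+FROM|UPDATE)\b/i.test(statement) &&
        !/\bWHERE\b/i.test(statement)
    );
}

// The old file-global behaviour, for contrast: any WHERE anywhere passes.
function fileGlobalCheckPasses(sql) {
  return /\bWHERE\b/i.test(sql);
}
```

With `DELETE FROM x; SELECT * FROM y WHERE id = 1;` the file-global check passes, while the per-statement scan still reports `DELETE FROM x`.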
2091
+
2092
+ ## [4.0.1] — 2026-04-18
2093
+
2094
+ **Post-v4.0.0 audit cleanup.** No behavior changes on the happy path —
2095
+ all fixes patch latent bugs, silent failure modes, and documentation
2096
+ drift found in a full-framework audit. Tests grew from 156 to 159.
2097
+
2098
+ ### Fixed — ship-blockers
2099
+
2100
+ - `bin/qualia-ui.js`: `/qualia` journey-tree no longer crashes when
2101
+ `JOURNEY.md` lacks a `project:` frontmatter line. A `const projectName`
2102
+ was shadowing the function name inside its own initializer, triggering
2103
+ `ReferenceError: Cannot access 'projectName' before initialization`.
2104
+ - `templates/help.html`: version pill, subtitle, and footer now render
2105
+ the real installed version. Previously hardcoded `v3.6.0` in three
2106
+ places, and the `sed "s/{{VERSION}}/$VERSION/g"` in `/qualia-help`
2107
+ had nothing to replace.
2108
+ - `skills/qualia-help/SKILL.md`: version fallback chain rewritten —
2109
+ `.qualia-config.json` → `package.json` → `"latest"`. Previously
2110
+ fell back to the string `"v3"`.
2111
+ - `skills/qualia-design/SKILL.md` + `rules/frontend.md`: remove references
2112
+ to 5 non-existent design skills (`/bolder`, `/design-quieter`,
2113
+ `/colorize`, `/distill`, `/delight`). The rules file ships to every
2114
+ user project; the ghost commands would have 404'd.
2115
+ - `CLAUDE.md`: Road section rewritten to describe the v4 hierarchy.
2116
+ Previously no mention of milestones, `JOURNEY.md`, `/qualia-milestone`,
2117
+ `/qualia-idk`, or `--auto` — the file Claude reads every session was
2118
+ still on the v3 mental model.
2119
+ - `CHANGELOG.md`: add link references for v3.1.0 through v4.0.0 and
2120
+ point `[Unreleased]` at v4.0.0. Previous version headers rendered
2121
+ as plain text on GitHub / npm.
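
The `projectName` crash in the first entry is a temporal-dead-zone error. A minimal reproduction, with hypothetical function bodies, looks like this:

```javascript
// Stand-in for the outer helper that the const accidentally shadowed.
function projectName() {
  return "fallback-project";
}

function brokenTitle(frontmatter) {
  // `const projectName` shadows the outer function for this whole block,
  // so the call in the initializer touches the binding before it exists.
  const projectName = frontmatter.project || projectName();
  return projectName;
}

function fixedTitle(frontmatter) {
  // A different local name leaves the outer function reachable.
  const name = frontmatter.project || projectName();
  return name;
}
```

The broken version only throws when `project` is missing, because `||` short-circuits otherwise. That matches the entry: the crash appeared only for `JOURNEY.md` files without a `project:` line.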
2122
+
2123
+ ### Fixed — real bugs
2124
+
2125
+ - `hooks/pre-deploy-gate.js`: exits **2** (not 1) on block, matching
2126
+ Claude Code's PreToolUse hook contract. Previous code explicitly
2127
+ acknowledged the violation in a comment. Test assertions updated in
2128
+ `tests/runner.js` and `tests/hooks.test.sh`.
2129
+ - `bin/state.js` — lock: replace 50ms CPU busy-wait with
2130
+ `Atomics.wait`-backed `sleepSync` (no CPU starvation on constrained
2131
+ CI runners). Lock fall-through now traces `state-lock/fallthrough`
2132
+ so repeated contention is visible instead of silent.
2133
+ - `bin/state.js` — `cmdTransition`: back up BOTH `STATE.md` and
2134
+ `tracking.json` before the dual write, so a failure in the second
2135
+ write can roll both files back to a consistent pre-transition state.
2136
+ - `bin/install.js`: hooks are now merged into `settings.json` instead
2137
+ of clobbered. Previous code did `settings.hooks = {...}`, silently
2138
+ destroying any user-added hook entries on every reinstall.
2139
+ Qualia-owned hook commands are matched by filename and replaced;
2140
+ everything else is preserved.
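
The merge-instead-of-clobber behaviour in the last entry might be sketched as below. The function name, parameters, and substring-based filename match are hypothetical; the settings shape (event name mapped to blocks, each with a `hooks` array of `{ type, command }` entries) is modeled on Claude Code's `settings.json` and is an assumption of this sketch.

```javascript
// Merge Qualia hook blocks into existing settings hooks without dropping
// entries that belong to the user.
function mergeHooks(existing = {}, qualiaHooks = {}, ownedFiles = []) {
  const isOwned = (command) => ownedFiles.some((file) => command.includes(file));
  const merged = {};
  const events = new Set([...Object.keys(existing), ...Object.keys(qualiaHooks)]);
  for (const event of events) {
    // Strip stale Qualia-owned commands, keep everything else.
    const kept = (existing[event] || [])
      .map((block) => ({
        ...block,
        hooks: (block.hooks || []).filter((hook) => !isOwned(hook.command || "")),
      }))
      .filter((block) => block.hooks.length > 0);
    const list = [...kept, ...(qualiaHooks[event] || [])];
    if (list.length > 0) merged[event] = list;
  }
  return merged;
}
```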
2141
+
2142
+ ### Fixed — docs / drift
2143
+
2144
+ - `agents/plan-checker.md`: Rule 2 heading "6 mandatory fields" →
2145
+ "7 mandatory fields" (list contained 7).
2146
+ - `skills/qualia-task/SKILL.md` + `skills/qualia-plan/SKILL.md`:
2147
+ legacy `Done when:` → `Acceptance Criteria:` (matches v3.7.0 story-file
2148
+ format that plan-checker validates against).
2149
+ - `docs/erp-contract.md`: add v4 fields `milestone_name` and
2150
+ `milestones[]` to the request body example and required-fields table.
2151
+ `/qualia-report` already sent these; the contract doc didn't document
2152
+ them.
2153
+ - `guide.md`: "The 10 Commands" → "The Road Commands" (table has 13 rows).
2154
+ - `skills/qualia-new/SKILL.md`: strip stale "Unlike v3" language.
2155
+
2156
+ ### Tests
2157
+
2158
+ - +3 new regression tests (156 → 159):
2159
+ · `transition --to shipped` actually increments `deploy_count`
2160
+ · `qualia-ui journey-tree` renders milestones without crashing
2161
+ · `qualia-ui journey-tree` falls back to `projectName()` when
2162
+ `JOURNEY.md` frontmatter lacks `project:`
2163
+
2164
+ ### Deferred to v4.1
2165
+
2166
+ - `allowed-tools` frontmatter sweep across 26 skills — requires
2167
+ per-skill audit to avoid accidentally blocking tool access that
2168
+ skills rely on.
2169
+ - Finer-grained per-statement `WHERE`-clause scan in
2170
+ `migration-guard.js` / `pre-deploy-gate.js`.
2171
+
2172
+ ## [4.0.0] — 2026-04-18
2173
+
2174
+ **Full Journey release.** `/qualia-new` now maps the entire project
2175
+ arc from kickoff to client handoff upfront, and the Road can chain
2176
+ itself end-to-end in `--auto` mode with only two human gates per
2177
+ project. The milestone / phase / task hierarchy is locked down so the
2178
+ ERP renders a clean tree, and the team stops improvising milestones
2179
+ after each ship.
2180
+
2181
+ ### The big shift
2182
+
2183
+ Before v4, `/qualia-new` produced a v1 ROADMAP and stopped. Each
2184
+ subsequent milestone was invented when the previous one shipped,
2185
+ leading to structural drift (milestones collapsing into single
2186
+ phases, "Phase 0" entries at milestone level, skipped milestone
2187
+ numbers). The ERP rendered a flat list of heterogeneous entries.
2188
+
2189
+ v4 treats the **Journey** as a first-class artifact:
2190
+
2191
+ ```
2192
+ Project
2193
+ └─ Journey (all milestones defined upfront)
2194
+ └─ Milestone (a release — 2-5 total, Handoff is always last)
2195
+ └─ Phase (a feature-sized deliverable, 2-5 tasks)
2196
+ └─ Task (atomic unit, one commit, verification contract)
2197
+ ```
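
The constraints written into the tree above (2-5 milestones, Handoff last, 2-5 tasks per phase) can be expressed as a small validator. The data shape and function name are hypothetical; the framework stores the journey in Markdown, so treat this purely as an illustration of the rules.

```javascript
// Validate a journey object against the limits shown in the hierarchy diagram.
// journey: { milestones: [{ name, phases: [{ name, tasks: [...] }] }] }
function validateJourney(journey) {
  const errors = [];
  const milestones = journey.milestones || [];
  if (milestones.length < 2 || milestones.length > 5) {
    errors.push("expected 2-5 milestones");
  }
  const last = milestones[milestones.length - 1];
  if (!last || last.name !== "Handoff") {
    errors.push("last milestone must be Handoff");
  }
  for (const milestone of milestones) {
    for (const phase of milestone.phases || []) {
      const count = (phase.tasks || []).length;
      if (count < 2 || count > 5) {
        errors.push(`phase "${phase.name}" should have 2-5 tasks`);
      }
    }
  }
  return errors;
}
```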
2198
+
2199
+ ### Added
2200
+
2201
+ - **`.planning/JOURNEY.md`** — the North Star document. Lists every
2202
+ milestone with why-now, exit criteria, and phase sketches. Written
2203
+ during `/qualia-new`, updated on milestone closure. Hard rules: 2-5
2204
+ milestones, ≥ 2 phases per non-Handoff milestone, final milestone
2205
+ is always literally named "Handoff" with the fixed 4-phase template
2206
+ (Polish, Content + SEO, Final QA, Handoff).
2207
+ - **`/qualia-new` full-journey flow** — produces JOURNEY.md +
2208
+ REQUIREMENTS.md (grouped by milestone) + ROADMAP.md (M1's phase
2209
+ detail). **Research runs unconditionally** (no more `workflow.research`
2210
+ gate). **Single approval** on the whole journey replaces multiple
2211
+ mid-flow gates.
2212
+ - **`--auto` flag on `/qualia-new`, `/qualia-plan`, `/qualia-build`,
2213
+ `/qualia-verify`, `/qualia-milestone`** — chains the Road end-to-end.
2214
+ Two human gates per project total: journey approval at kickoff, and
2215
+ one pause at each milestone boundary ("Continue to M{N+1}?"). One
2216
+ halt case: gap-cycle limit exceeded on a failed phase.
2217
+ - **Milestone readiness guards** in `state.js close-milestone`:
2218
+ `MILESTONE_NOT_READY` (any phase not verified) and `MILESTONE_TOO_SMALL`
2219
+ (< 2 phases), both bypassable with `--force`.
2220
+ - **`tracking.json.milestones[]`** — array of closed milestone summaries
2221
+ (num, name, total_phases, phases_completed, tasks_completed,
2222
+ shipped_url, closed_at). The ERP uses this to render the project
2223
+ tree without replaying git history.
2224
+ - **`tracking.json.milestone_name`** — human name of the current
2225
+ milestone ("Foundation", "Core Features", etc.). Appears in status
2226
+ bar and ERP.
2227
+ - **`build_count` and `deploy_count` bump automatically** on every
2228
+ `built` and `shipped` transition. Previously always zero.
2229
+ - **Pre-inline context at builder dispatch.**
2230
+ `/qualia-build` reads PROJECT.md, DESIGN.md, and every `@file`
2231
+ referenced in the task's Context BEFORE spawning the builder subagent.
2232
+ Inlines them under `<pre-loaded-context>`. Saves 3-5 Read tool calls
2233
+ per task; builder starts already oriented.
2234
+ - **`qualia-ui.js journey-tree`** — ASCII ladder visualization of
2235
+ JOURNEY.md. Shipped milestones = green dot, current = teal diamond,
2236
+ future = dim open circle, Handoff = [FINAL] tag. Shown by
2237
+ `/qualia` router and at `/qualia-milestone` confirmation step.
2238
+ - **`qualia-ui.js milestone-complete`** — celebration banner on
2239
+ milestone closure. Distinguishes Handoff closure ("PROJECT SHIPPED")
2240
+ from intermediate milestones ("Next: {name}").
2241
+ - **5 new banner actions:** milestone ◆, journey ◯, auto ⚡,
2242
+ research ◱, roadmap ◐.
2243
+ - **`qualia-report` ERP payload updated** — now sends all v4 fields
2244
+ (project_id, team_id, git_remote, milestone_name, milestones[],
2245
+ build_count, deploy_count, session_started_at, last_pushed_at) so
2246
+ the ERP renders tree and dedupes correctly.
2247
+ - **`/qualia-idk` is now a real diagnostic skill**, not a `/qualia` alias.
2248
+ When the user's confusion is about *understanding the situation*
2249
+ (not picking the next command), it spawns two parallel isolated
2250
+ `Explore` subagents: one scans `.planning/` only, the other scans
2251
+ source code only. Each produces a 250-word view of its side. The
2252
+ main skill synthesizes both views + the user's stated confusion
2253
+ into a structured "What I see / What I think is happening / What
2254
+ to do next" diagnosis in plain language. Catches plan↔code drift
2255
+ that a state-only router can't see.
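
The milestone readiness guards listed above could be sketched like this. The error codes and the `--force` bypass come from the entry; the function name, the phase shape, and the order in which the two checks run are assumptions.

```javascript
// Decide whether a milestone may be closed.
// phases: [{ name, status }] where status "verified" means the phase passed.
function checkMilestoneReady(phases, { force = false } = {}) {
  if (force) return { ok: true };
  if (phases.length < 2) {
    return { ok: false, code: "MILESTONE_TOO_SMALL" };
  }
  if (phases.some((phase) => phase.status !== "verified")) {
    return { ok: false, code: "MILESTONE_NOT_READY" };
  }
  return { ok: true };
}
```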
2256
+
2257
+ ### Changed
2258
+
2259
+ - **`/qualia-handoff` is now explicit about the 4 deliverables** —
2260
+ verified production URL, updated documentation, client assets archive,
2261
+ ERP finalization. Halts if URL is down or latency > 1s, or if
2262
+ `.planning/archive/` is empty (project bypassed `/qualia-milestone`
2263
+ and has no archived milestones).
2264
+ - **`/qualia-milestone` reads next milestone from JOURNEY.md** instead
2265
+ of asking the user to name it. Dedicated `journey-tree` visualization
2266
+ at confirmation + `milestone-complete` banner at close.
2267
+ - **`roadmapper` agent rewritten** to produce JOURNEY + REQUIREMENTS +
2268
+ ROADMAP. **Dropped** the old "no review/deploy/handoff phases" rule —
2269
+ the Handoff milestone is now a first-class feature milestone with
2270
+ the 4 standard phases and their own requirements (HAND-01..HAND-15
2271
+ in REQUIREMENTS.md).
2272
+ - **`plan-checker` Rule 2** — task story-file fields are mandatory
2273
+ (Why / Depends on / Acceptance Criteria / Validation). Inherited
2274
+ from v3.7.0's story-file format.
2275
+ - **`/qualia` description scoped back to mechanical state routing.**
2276
+ Previously claimed "idk / stuck / lost / confused" triggers; those
2277
+ interpretive shades now route to `/qualia-idk`. `/qualia` stays the
2278
+ fast mechanical router ("what's my next command").
2279
+ - **`templates/requirements.md`** — multi-milestone format with fixed
2280
+ Handoff section.
2281
+ - **`templates/roadmap.md`** — scoped to current milestone only, with
2282
+ pointer to JOURNEY.md for the full arc.
2283
+
2284
+ ### Tests
2285
+
2286
+ 150 → 156 green. +6 covering: MILESTONE_NOT_READY, MILESTONE_TOO_SMALL,
2287
+ milestones[] append idempotency, check-output exposure of milestones +
2288
+ milestone_name, milestone summary cumulative task count (not current-
2289
+ phase only), build_count bump on `built`.
2290
+
2291
+ ### Migration
2292
+
2293
+ Fully additive. Projects created on v3.x continue to work:
2294
+ - Plans without story-file fields: `state.js` accepts both legacy
2295
+ `Done when:` and v3.7.0 `Acceptance Criteria:` anchors.
2296
+ - tracking.json missing `milestones[]` or `milestone_name`: `ensureLifetime`
2297
+ hydrates them to `[]` and `""` with zero risk.
2298
+ - Projects without JOURNEY.md (legacy): `/qualia-milestone` falls back
2299
+ to asking the user for the next milestone name. Recommended migration:
2300
+ run `/qualia-map` then regenerate JOURNEY.md via the roadmapper, but
2301
+ not required.
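
The hydration step for legacy `tracking.json` files amounts to filling in two defaults. A sketch under that assumption follows; the name `hydrateMilestoneFields` is hypothetical and this is not the body of `ensureLifetime`.

```javascript
// Fill in the v4 fields on a tracking object loaded from a v3.x project.
// Existing values are left untouched.
function hydrateMilestoneFields(tracking) {
  if (!Array.isArray(tracking.milestones)) tracking.milestones = [];
  if (typeof tracking.milestone_name !== "string") tracking.milestone_name = "";
  return tracking;
}
```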
2302
+
2303
+ ### Lineage of patterns
2304
+
2305
+ - **Story-file plan format** — embedded rationale and acceptance
2306
+ criteria per task (arrived in v3.7.0).
2307
+ - **State-machine auto-advance** — the `--auto` loop (arrived in v4.0.0).
2308
+ - **Pre-inline context at dispatch** — the builder `<pre-loaded-context>`
2309
+ block (arrived in v4.0.0).
2310
+ - **Journey-as-first-class-artifact** — JOURNEY.md (arrived in v4.0.0).
2311
+
2312
+ ---
2313
+
2314
+ _v2.x and v3.x history archived at [docs/archive/CHANGELOG-pre-v4.md](docs/archive/CHANGELOG-pre-v4.md)._