qualia-framework 5.1.0 → 5.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (53)
  1. package/README.md +50 -26
  2. package/agents/builder.md +8 -0
  3. package/agents/plan-checker.md +10 -1
  4. package/agents/planner.md +1 -1
  5. package/agents/qa-browser.md +10 -0
  6. package/agents/research-synthesizer.md +10 -0
  7. package/agents/researcher.md +38 -2
  8. package/agents/roadmapper.md +10 -0
  9. package/agents/verifier.md +15 -3
  10. package/agents/visual-evaluator.md +1 -1
  11. package/bin/install.js +42 -0
  12. package/bin/state.js +155 -133
  13. package/docs/archive/session-report-2026-04-18.md +199 -0
  14. package/docs/archive/v4.0.0-review.md +288 -0
  15. package/docs/instruction-budget-audit.md +113 -0
  16. package/docs/polish-loop-supervised-run.md +111 -0
  17. package/guide.md +11 -4
  18. package/hooks/session-start.js +1 -1
  19. package/package.json +5 -2
  20. package/rules/architecture.md +125 -0
  21. package/rules/infrastructure.md +1 -2
  22. package/rules/speed.md +55 -0
  23. package/skills/qualia-help/SKILL.md +1 -1
  24. package/skills/qualia-hook-gen/SKILL.md +206 -0
  25. package/skills/qualia-map/SKILL.md +1 -1
  26. package/skills/qualia-milestone/SKILL.md +1 -1
  27. package/skills/qualia-new/SKILL.md +2 -2
  28. package/skills/qualia-optimize/REFERENCE.md +65 -2
  29. package/skills/qualia-optimize/SKILL.md +26 -1
  30. package/skills/qualia-polish/SKILL.md +3 -3
  31. package/skills/qualia-polish-loop/REFERENCE.md +1 -1
  32. package/skills/qualia-polish-loop/SKILL.md +3 -3
  33. package/skills/qualia-polish-loop/fixtures/broken.html +2 -2
  34. package/skills/qualia-polish-loop/scripts/loop.mjs +26 -5
  35. package/skills/qualia-polish-loop/scripts/playwright-capture.mjs +14 -5
  36. package/skills/qualia-polish-loop/scripts/score.mjs +1 -1
  37. package/skills/qualia-postmortem/SKILL.md +1 -1
  38. package/skills/qualia-prd/SKILL.md +199 -0
  39. package/skills/qualia-quick/SKILL.md +1 -1
  40. package/skills/qualia-research/SKILL.md +5 -3
  41. package/skills/qualia-road/SKILL.md +15 -5
  42. package/skills/qualia-task/SKILL.md +1 -1
  43. package/templates/PRODUCT.md +1 -1
  44. package/tests/bin.test.sh +155 -8
  45. package/tests/skills.test.sh +143 -0
  46. package/tests/slop-detect.test.sh +160 -0
  47. package/docs/playwright-loop-review-2026-05-03.md +0 -65
  48. /package/{rules → qualia-design}/design-brand.md +0 -0
  49. /package/{rules → qualia-design}/design-laws.md +0 -0
  50. /package/{rules → qualia-design}/design-product.md +0 -0
  51. /package/{rules → qualia-design}/design-reference.md +0 -0
  52. /package/{rules → qualia-design}/design-rubric.md +0 -0
  53. /package/{rules → qualia-design}/frontend.md +0 -0
@@ -0,0 +1,125 @@
+ # Architecture Rules
+
+ How Qualia code stays navigable for future agents (human and AI). Read on architectural-judgment tasks: refactors, new module decisions, deep-module work, `/qualia-optimize --deepen`.
+
+ ## 1. Deep modules over shallow ones (Ousterhout)
+
+ A **deep module** hides significant complexity behind a small, stable interface. A **shallow module** exposes most of its internals — every caller has to know how it works to use it.
+
+ | | Deep | Shallow |
+ |---|---|---|
+ | Interface size (params, exports) | Small | Wide |
+ | Internal logic | Substantial | Trivial |
+ | Cost to change implementation | Low (callers don't notice) | High (every caller breaks) |
+ | Cost to read a caller | Low | High (must know module internals) |
+
+ Deep modules are the primary defense against AI-generated entropy. AI is excellent at generating implementation; humans (and AI under guidance) must defend the **interface**.
+
+ ### Smell — shallow code
+
+ If you spot any of these, the module is shallow and is a refactor candidate:
+
+ - A wrapper function that does only argument shuffling and a single inner call.
+ - A type-only file that re-exports types from elsewhere.
+ - A "service" that has 8 public methods and one private one.
+ - A util module where every export is used in exactly one place.
+ - A class whose every method is a one-liner pass-through.
+
+ `/qualia-optimize --deepen` is the skill that scouts for these and proposes interface consolidations.
+
+ ## 2. Locality over cleverness
+
+ Code that changes together should live together. The cost of a "DRY" abstraction is paid every time a future caller has to mentally fork to the abstraction's definition. Three similar lines beat a premature `extractCommon()` that everyone has to read twice.
+
+ Apply DRY only when:
+ - The duplication is exact (not just similar shape).
+ - The thing duplicated is unlikely to diverge.
+ - The caller doesn't need to know "the rule" to read the call site.
+
+ If any of those three is uncertain, leave the duplication. Pocock's rule of thumb: **three is the threshold, but only if all three would change in lock-step.**
+
+ ## 3. Adapters at seams
+
+ Wherever the system meets an external dependency (database, third-party API, AI provider, payment gateway), introduce an adapter. The adapter:
+
+ - Owns the dependency's specific shape (auth headers, response envelopes, error formats).
+ - Translates to a project-internal type that the rest of the code uses.
+ - Is the only file that needs to change when the dependency is swapped or upgraded.
+
+ This is the seam where tests inject fakes and where future migrations live. A codebase without adapters is a codebase that fights every dependency upgrade.
+
+ **Example seams** in a typical Qualia project:
+ - `lib/supabase/server.ts` — adapter over `@supabase/supabase-js`.
+ - `lib/openrouter/client.ts` — adapter over the OpenRouter REST API.
+ - `lib/retell/agent.ts` — adapter over the Retell AI SDK.
+ - `lib/zoho/contacts.ts` — adapter over the Zoho Books API.
+
+ Direct calls to vendor SDKs from feature code are a smell. Move them through an adapter.
+
+ ## 4. Progressive disclosure of complexity
+
+ A reader entering the codebase from a fresh clone should be able to follow this path:
+
+ ```
+ README.md → app/ or src/index → one feature folder → one route → one component → one util
+ ```
+
+ At each step the depth increases. The top is breadth (what does this app do?), the bottom is depth (how does this specific util compute X?). Skipping levels is a smell:
+
+ - An entry point that imports 30 modules from across the tree → the tree is shallow.
+ - A route handler that calls a database directly → no service layer; logic and IO are entangled.
+ - A component that owns its own data fetching, mutation logic, error handling, and rendering → no hook abstraction; the component will be impossible to test or replace.
+
+ The pattern that tells a fresh reader where to go next is **layered service boundaries**:
+ 1. **Routes / pages** — wiring only. No business logic.
+ 2. **Features / use cases** — business logic. Calls services.
+ 3. **Services** — orchestration. Calls adapters.
+ 4. **Adapters** — IO. The only layer that talks to vendors.
+ 5. **Domain types** — pure. No imports of the above.
+
+ Inversions of this order (domain types importing adapters, services calling components) are bugs in shape.
+
+ ## 5. Interface stability beats internal elegance
+
+ Once an interface has callers, changing it is expensive. Internal refactors are cheap. **Optimize for the cost of change.**
+
+ When in doubt:
+ - A new public function: write it conservatively. Defaults are a liability — every default is a future migration.
+ - An internal function: write it expressively. Internal callers can be fixed in one sweep.
+
+ Pocock's heuristic from de-slop: *the interface is the thing the AI shouldn't change without you. The implementation is the thing the AI can rewrite at will.*
+
+ ## 6. Test the seam, not the function
+
+ Unit tests that pin internal function signatures rot fast — every refactor breaks them. Tests against the **adapter** or **service interface** survive refactors.
+
+ Order of test value (high → low):
+ 1. **End-to-end / user-flow** — tests at the route level. Survive everything except feature changes.
+ 2. **Service-level** — tests at the use-case boundary. Survive most refactors.
+ 3. **Adapter-level** — tests with the vendor mocked. Survive vendor swaps.
+ 4. **Unit / function-level** — tests against a single internal function. Last resort. Only for genuinely tricky algorithms.
+
+ The pyramid in older textbooks (lots of unit, few e2e) has inverted in the AI era. AI generates internal functions cheaply; the seam tests are the ones that survive AI-driven refactors.
+
+ ## 7. The codebase IS the documentation
+
+ Per Pocock's *"Never run /init"*: static documentation rots within weeks. Agents and humans should be able to **explore** to discover, not **memorize** to recall.
+
+ Practical rules:
+ - README.md: orientation only — what is this app, how do I run it, where is the source of truth for the rest. Not API docs, not architecture diagrams that will lie within a sprint.
+ - `.planning/CONTEXT.md`: domain glossary, append-only as terms emerge. Discovered, not maintained.
+ - ADRs in `.planning/decisions/`: hard-to-reverse calls, dated, immutable. Future archaeology, not current spec.
+ - Anything else: it lives in code or it doesn't exist. If you need a diagram, generate it from the code at read-time.
+
+ A skill or agent that needs context should `Read` the relevant code, not a synopsis of it written six months ago.
+
+ ## 8. When to apply this rule file
+
+ Read this file (auto-load via skill or `@rules/architecture.md`) when:
+
+ - Planning a new module or feature with multiple components.
+ - The user requests `/qualia-optimize --deepen` or `--alignment`.
+ - A verifier is scoring "Container depth & nesting" (per `qualia-design/design-rubric.md` dimension 8).
+ - An ADR is being drafted for an architectural fork.
+
+ Do **not** auto-load this on quick fixes, copy edits, or single-component touch-ups — that wastes instruction budget. Use judgment.
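The adapter rule in §3 above can be sketched concretely. This is a minimal illustration, not code from the package — `VendorResponse`, `Invoice`, and `toInvoice` are hypothetical names standing in for a real seam like `lib/zoho/contacts.ts`:

```typescript
// Vendor-specific envelope: this shape is confined to the adapter file.
interface VendorResponse {
  data: { invoice_id: string; amount: string };
}

// Project-internal type: the only shape the rest of the code ever sees.
interface Invoice {
  id: string;
  totalCents: number;
}

// The adapter owns the translation. Swapping or upgrading the vendor
// touches only this function — callers and tests never notice.
function toInvoice(raw: VendorResponse): Invoice {
  return {
    id: raw.data.invoice_id,
    totalCents: Math.round(parseFloat(raw.data.amount) * 100),
  };
}

console.log(toInvoice({ data: { invoice_id: "inv_1", amount: "12.50" } }));
```

Tests inject a fake `VendorResponse` at this seam, which is exactly the "adapter-level" tier in §6's order of test value.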
@@ -8,8 +8,7 @@ Standard services across all Qualia projects. Use these unless the project expli

  ## Database: Supabase (every project)
  - Every project uses Supabase for auth, database, and storage
- - **CLI:** `npx supabase` migrations, type generation, local dev
- - **MCP:** Supabase MCP server is available in Claude Code for direct database operations
+ - **CLI-first** — prefer `npx supabase` (migrations, type generation, local dev, SQL) over the Supabase MCP server. The MCP imposes a token tax on every turn; the CLI hits the same API at zero token cost. Use the MCP only when you need a feature the CLI doesn't expose (e.g., interactive branch management).
  - Always enable RLS on every table (see `rules/security.md`)
  - Use `lib/supabase/server.ts` for server-side, `lib/supabase/client.ts` for client-side
  - Run `npx supabase gen types` after schema changes
package/rules/speed.md ADDED
@@ -0,0 +1,55 @@
+ ---
+ alwaysApply: true
+ ---
+
+ # Speed Rules (MANDATORY)
+
+ ## Direct tools over subagents
+
+ **NEVER use Task(Explore) or Task(general-purpose) for simple file lookups or code searches.** Use direct tools instead:
+ - **Glob** for finding files by name/pattern
+ - **Grep** for searching code content
+ - **Read** for reading specific files
+ - Only use Task(Explore) when you genuinely need 5+ rounds of deep codebase archaeology.
+
+ **NEVER spawn subagents when direct tools work.** Subagents are slow — a direct Glob or Grep returns in seconds, a fresh agent burns 5-10 seconds of spawn overhead plus its own context budget.
+
+ **Bias toward action, not exploration.** If the user asks you to fix something and you know the project structure, just do it.
+
+ ## CLI-first, MCP only when CLI can't
+
+ MCP servers impose a **token tax**: their tool definitions consume context-window space on every turn, even when unused. CLIs that the model already knows from training data (git, gh, supabase, vercel, railway, npx, curl) impose zero token tax — the model recalls them from weights, not from the prompt.
+
+ **Default to CLI.** Reach for MCP only when:
+ - The CLI doesn't exist or doesn't expose the operation you need.
+ - The CLI requires interactive auth that MCP has already brokered (e.g., Stripe Dashboard).
+ - The MCP returns structured JSON that would be expensive to parse from CLI text output.
+ - The MCP enforces governance (RLS-aware queries, scoped DB credentials) that the CLI doesn't.
+
+ If a `/skill-name` exists that wraps a CLI, prefer the skill — it's been hardened. Canonical example: the `/supabase` skill replaces 32 Supabase MCP tools with `supabase` CLI calls and saves substantial token budget.
+
+ ### MCP tier-list (when each is justified)
+
+ | Server | Always-on? | Justification |
+ |---|---|---|
+ | `claude-in-chrome` | On-demand | Browser automation has no CLI equivalent; use for QA flows only |
+ | `supabase` MCP | **Off** in favor of `/supabase` skill | CLI covers 95% of operations; MCP only if you need branch management interactively |
+ | `context7` | On-demand | Library docs at runtime — no CLI alternative for Context7 itself |
+ | `notebooklm-mcp` | On-demand | NotebookLM has no CLI; only loaded when researching against existing notebooks |
+ | `firecrawl-mcp` | On-demand | Web scraping; only loaded when a feature requires it |
+ | `next-devtools` | Always-on (dev) | Next.js 16 runtime errors not visible elsewhere |
+ | `mux`, `stitch`, `higgsfield` | On-demand | Specialized API surfaces; load only on relevant client work |
+ | `ZohoMCP`, `qualia-erp` | On-demand | Business ops only, not engineering-path |
+
+ The pattern: **on-demand by default; always-on only when the data is irreducibly remote AND there's no CLI.**
+
+ ## Use shortcuts (Qualia commands)
+
+ When a Qualia command exists for the situation, use it — don't reinvent:
+ - `/qualia` — what's my next step?
+ - `/qualia-quick` — small inline fix, no plan, no spawn
+ - `/qualia-task` — single focused task, fresh builder spawn, atomic commit
+ - `/qualia-ship` — full deploy pipeline (quality gates → commit → deploy → verify)
+ - `/qualia-review` — production audit
+ - `/qualia-pause` — save context before clearing the conversation
+ - `/qualia-learn` — save a lesson from a mistake
@@ -1,6 +1,6 @@
  ---
  name: qualia-help
- description: "Open the Qualia Framework reference guide in the browser. A beautiful themed HTML page with all commands, rules, services, and the road. Trigger on 'help', 'how does this work', 'show me the commands', 'qualia help', 'reference'."
+ description: "Open the BROWSER HTML reference for the Qualia Framework: a themed page with all commands, rules, services, and the road. The default when a browser is available. For terminal-only output (SSH, headless), use /qualia-road. Triggers: 'help', 'how does this work', 'show me the commands', 'qualia help', 'reference', 'open the docs'."
  allowed-tools:
  - Bash
  - Read
@@ -0,0 +1,206 @@
+ ---
+ name: qualia-hook-gen
+ description: "Take a project's CLAUDE.md or rules/*.md instruction and convert it deterministically into a Claude Code pre-tool-use hook. Generates block-{name}.js + the settings.json patch + activation steps. Lets users actually shrink their CLAUDE.md instead of just hearing the instruction-budget advice. Trigger on 'qualia-hook-gen', 'turn this rule into a hook', 'enforce this deterministically', 'block npm', 'force pnpm', 'convert claude.md to hooks', 'shrink my instruction budget'. v5.3 from Matt Pocock's enforce-deterministically-not-instructionally pattern."
+ allowed-tools:
+ - Bash
+ - Read
+ - Write
+ - Edit
+ - Grep
+ - Glob
+ argument-hint: "[--rule \"text\"] [--from CLAUDE.md] [--name HOOK_NAME] [--scope global|project] [--dry-run]"
+ ---
+
+ # /qualia-hook-gen — Convert instructions → deterministic hooks
+
+ LLMs have a realistic instruction budget of ~300-500 instructions before quality degrades (Matt Pocock). A line in CLAUDE.md like "use pnpm not npm" burns budget on EVERY request — even when the task has nothing to do with package management. Worse, it's non-deterministic: the model can still run `npm install` if it forgets.
+
+ The fix: convert that instruction into a deterministic `pre-tool-use` hook. The hook blocks the wrong command (or rewrites it to the right one) at execution time, frees the instruction budget, and works regardless of context window state.
+
+ ## When to use
+
+ - Your CLAUDE.md has 50+ lines and you want to slim it
+ - A specific instruction is enforceable as a CLI rule (use X not Y, never run Z, redirect A to B)
+ - You want a hook for a specific failure mode (e.g., "always use --force-with-lease, never --force")
+
+ ## What it does NOT do
+
+ - Hooks for stylistic guidance (e.g., "prefer composition over inheritance") — that's not enforceable by command match. Stays in skills.
+ - Hooks for non-deterministic checks (e.g., "validate the design feel"). Use `/qualia-polish` instead.
+ - Hooks that need state across multiple commands. Use Qualia's existing state.js machinery.
+
+ ## Process
+
+ ### 1. Identify the rule
+
+ Three input modes:
+
+ | Mode | Source |
+ |---|---|
+ | `--rule "..."` | Direct argument (e.g. `--rule "use pnpm not npm"`) |
+ | `--from CLAUDE.md` | Pull instructions from the file, list them, let user pick |
+ | (no arg) | Read CLAUDE.md, scan for enforceable rules, propose top 3 candidates |
+
+ ### 2. Classify enforceability
+
+ For the chosen rule, classify into one of three patterns:
+
+ | Pattern | Example | Hook shape |
+ |---|---|---|
+ | **Block** | "never use `git push --force` to main" | exit 2 with message if pattern matches |
+ | **Rewrite** | "use pnpm not npm" | exit 2 with message guiding to alternative |
+ | **Warn** | "prefer next/image over <img>" | exit 0 but print warning to stderr |
+
+ If the rule isn't classifiable as any of these — i.e. it's stylistic or judgment-based — HALT with: "This rule isn't deterministically enforceable. Keep it in CLAUDE.md or move to a skill. Examples of enforceable rules: package-manager redirects, destructive-command blocks, file-path enforcement."
+
+ ### 3. Generate the hook script
+
+ Write to `hooks/block-{name}.js` (Node, cross-platform — same shape as existing hooks):
+
+ ```javascript
+ #!/usr/bin/env node
+ // hooks/block-{name}.js — auto-generated by /qualia-hook-gen
+ // Original instruction: "{rule text}"
+ // Pattern: {block | rewrite | warn}
+ // Generated: {ISO date}
+
+ const { readFileSync } = require("fs");
+ let payload;
+ try { payload = JSON.parse(readFileSync(0, "utf8")); } catch { process.exit(0); }
+ const cmd = (payload.tool_input && payload.tool_input.command) || "";
+
+ // Match condition (regex from rule classification)
+ if (!/{matcher}/i.test(cmd)) process.exit(0); // not our concern
+
+ // Action
+ console.error("⚠ Qualia hook ({name}): {message}");
+ console.error("  Suggested: {suggested_alt}");
+ process.exit(2); // 2 = BLOCK in Claude Code hook protocol
+ ```
+
+ The exact matcher + message + suggestion are filled by the synthesizer based on the rule classification.
+
+ ### 4. Generate the settings.json patch
+
+ ```json
+ {
+   "hooks": {
+     "PreToolUse": [
+       {
+         "matcher": "Bash",
+         "hooks": [
+           {
+             "type": "command",
+             "if": "Bash({if-condition})",
+             "command": "node \"${HOME}/.claude/hooks/block-{name}.js\"",
+             "timeout": 5,
+             "statusMessage": "⬢ Checking {what}..."
+           }
+         ]
+       }
+     ]
+   }
+ }
+ ```
+
+ The `if` condition narrows when the hook fires (e.g., `Bash(npm*)` to fire only on npm). Saves cycles by skipping the hook entirely on irrelevant commands.
+
+ ### 5. Test the hook
+
+ ```bash
+ # Simulate a triggering command
+ echo '{"tool_input":{"command":"{triggering_example}"}}' | \
+   node hooks/block-{name}.js
+ echo "Exit: $?"  # should be 2
+
+ # Simulate a non-triggering command
+ echo '{"tool_input":{"command":"{safe_example}"}}' | \
+   node hooks/block-{name}.js
+ echo "Exit: $?"  # should be 0
+ ```
+
+ If the test passes, proceed. If not, debug the matcher regex.
+
+ ### 6. Activate
+
+ Two scopes:
+
+ | Scope | Action |
+ |---|---|
+ | `--scope project` (default for project rules) | Add the patch to `.claude/settings.json` in the project root |
+ | `--scope global` | Add to `~/.claude/settings.json`. Use only if the rule applies to ALL projects |
+
+ Use the existing settings-merge logic from `bin/install.js:756-778` (preserves user fields, atomic write, backup-before-overwrite).
+
+ ### 7. Suggest CLAUDE.md slim
+
+ After activating, scan CLAUDE.md / `rules/*.md` for the original instruction. If found, suggest the user remove it (don't auto-remove — let the user verify the hook works first):
+
+ ```
+ ✓ Hook installed: hooks/block-{name}.js
+ ✓ Settings patched: .claude/settings.json
+ ℹ You can now remove this line from CLAUDE.md (the hook enforces it deterministically):
+   > "{original instruction}"
+ ℹ Test with: echo '{"tool_input":{"command":"{triggering_example}"}}' | node hooks/block-{name}.js
+ ```
+
+ ### 8. Commit
+
+ ```bash
+ git add hooks/block-{name}.js .claude/settings.json
+ git -c user.name="Qualia Solutions" -c user.email="info@qualiasolutions.net" \
+   commit -m "feat(hook): block-{name} — enforces \"{rule}\" deterministically"
+ ```
+
+ ## Examples
+
+ **Block npm in favor of pnpm:**
+ ```
+ /qualia-hook-gen --rule "use pnpm not npm"
+ → hooks/block-npm.js (matches /^\s*npm\s+(install|i|run|exec)/, exit 2)
+ → .claude/settings.json (PreToolUse > Bash > if: Bash(npm*))
+ → "npm install" now blocks with: "Use pnpm not npm. Run: pnpm install"
+ ```
+
+ **Block destructive git on main:**
+ ```
+ /qualia-hook-gen --rule "never push --force to main"
+ → hooks/block-force-push-main.js (matches /git push.*--force.*main/)
+ → Already covered by hooks/git-guardrails.js — surface this overlap and skip
+ ```
+
+ **Force /server/ for service_role usage:**
+ ```
+ /qualia-hook-gen --rule "service_role only in lib/server/*"
+ → Not enforceable as a CLI hook (it's a code-level rule).
+ → HALT with recommendation: ESLint rule or pre-deploy-gate.js entry instead.
+ ```
+
+ ## Token discipline
+
+ This skill itself is short by design (~150 lines SKILL.md). REFERENCE.md (if added later) only carries verbatim hook templates. The whole point of `/qualia-hook-gen` is to REDUCE token cost across a project, not add to it.
+
+ Per-invocation: ~3K tokens for the rule-classification + hook-template synthesis. Net savings: every subsequent request saves the ~50-200 tokens that the moved CLAUDE.md instruction was costing.
+
+ ## Failure modes
+
+ | Symptom | Cause | Action |
+ |---|---|---|
+ | Rule isn't a CLI command | Stylistic / judgment-based | HALT with recommendation: skill or ESLint rule |
+ | Matcher would catch too much | Regex too greedy | Tighten with `--name` and explicit pattern; user-confirm before write |
+ | Hook conflicts with existing | Same command already hooked | Surface the conflict; refuse to overwrite without `--force` |
+ | Settings.json malformed | Pre-existing bad JSON | Refuse to patch; ask user to fix settings.json first |
+ | `node` not on hook PATH | Cross-platform issue | Use `process.execPath` resolution; the framework's existing hooks handle this |
+
+ ## Rules
+
+ 1. **Hook is determinism, skill is guidance.** Hooks can only block/rewrite/warn on CLI patterns. Stylistic rules stay in skills.
+ 2. **Never overwrite an existing hook silently.** If `hooks/block-{name}.js` exists, surface and ask.
+ 3. **Test before committing.** The hook must pass the trigger + non-trigger smoke tests before commit.
+ 4. **Suggest CLAUDE.md cleanup.** After install, surface the now-redundant CLAUDE.md line. Don't auto-delete — user verifies the hook works first.
+ 5. **Match Qualia's hook shape.** All hooks are pure Node, cross-platform, exit 0/2. No `.sh` scripts (Windows compat).
+
+ ## Pairs with
+
+ - `/qualia-optimize --deepen` — often runs after a hook-gen pass, once CLAUDE.md gets short enough that the codebase architecture becomes the next bottleneck
+ - Existing hooks: `git-guardrails.js`, `pre-deploy-gate.js`, `vercel-account-guard.js`, `env-empty-guard.js`, `supabase-destructive-guard.js`. New hooks generated by this skill follow the same conventions.
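The match-and-block logic of the step-3 template, filled in for the "use pnpm not npm" example above, can be sketched as a pure function. This is an illustration only — `decide` is a hypothetical name, and the real generated script wires this decision to the stdin payload and `process.exit` as shown in the template:

```typescript
// Decision core of a hypothetical block-npm hook. The matcher mirrors the
// regex from the skill's own example: /^\s*npm\s+(install|i|run|exec)/.
function decide(cmd: string): { exit: number; suggestion?: string } {
  if (!/^\s*npm\s+(install|i|run|exec)\b/i.test(cmd)) {
    return { exit: 0 }; // not our concern — the command passes through
  }
  return {
    exit: 2, // 2 = BLOCK in the Claude Code hook protocol
    suggestion: cmd.replace(/^\s*npm/, "pnpm"),
  };
}

console.log(decide("npm install left-pad")); // blocked, suggests the pnpm form
console.log(decide("pnpm install"));         // allowed
```

Keeping the decision pure makes the step-5 smoke tests trivial: feed trigger and non-trigger commands and assert on the exit code, without spawning a process.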
@@ -1,6 +1,6 @@
  ---
  name: qualia-map
- description: "Map an existing codebase to infer architecture, stack, conventions, what's already built, AND adapt Qualia to the repo's existing tracker/labels/glossary conventions (onboarding). For brownfield projects — run BEFORE /qualia-new so Validated requirements get inferred from existing code and Qualia commands respect the repo's existing process."
+ description: "Map an existing codebase to infer architecture, stack, conventions, what's already built, AND adapt Qualia to the repo's existing tracker/labels/glossary conventions (onboarding). For brownfield projects — run BEFORE /qualia-new so Validated requirements get inferred from existing code and Qualia commands respect the repo's existing process. Triggers: 'map this codebase', 'onboard to existing project', 'brownfield setup', 'what's already built here', 'scan the repo', 'inherited a codebase', 'audit this project before planning'."
  allowed-tools:
  - Bash
  - Read
@@ -1,6 +1,6 @@
  ---
  name: qualia-milestone
- description: "Close the current milestone and open the next one — loads the next milestone's scope from JOURNEY.md (no ad-hoc naming). Archives artifacts, marks requirements Complete, regenerates ROADMAP.md for the next milestone."
+ description: "Close the current milestone and open the next one — loads the next milestone's scope from JOURNEY.md (no ad-hoc naming). Archives artifacts, marks requirements Complete, regenerates ROADMAP.md for the next milestone. Triggers: 'close milestone', 'next milestone', 'milestone done', 'wrap up milestone', 'M1 done open M2', 'I want to advance to the next milestone', 'finish this milestone'."
  allowed-tools:
  - Bash
  - Read
@@ -183,7 +183,7 @@ git commit -m "docs: PRODUCT.md — register, users, voice, anti-references"
  If frontend work is involved, generate `.planning/DESIGN.md` from `templates/DESIGN.md`. The generation MUST commit to four things upfront (these go in §1 of DESIGN.md):

  1. **Aesthetic direction** — pick ONE: `editorial · brutalist · luxury · maximalist · retro-futuristic · organic · terminal-native · sci-fi · pastoral · industrial · ...`. Don't hedge ("modern minimal" is hedging — pick one extreme).
- 2. **Color strategy** — pick ONE: `Restrained · Committed · Full palette · Drenched`. See `rules/design-laws.md` §2.
+ 2. **Color strategy** — pick ONE: `Restrained · Committed · Full palette · Drenched`. See `qualia-design/design-laws.md` §2.
  3. **Scene sentence** — one concrete sentence: who uses this, where, ambient light, mood. NOT "observability dashboard" — "SRE glancing at incident severity on a 27-inch monitor at 2am in a dim room." Run the sentence, not the category.
  4. **Differentiation** — one sentence: what someone remembers 24 hours later.

@@ -198,7 +198,7 @@ Then fill the rest of DESIGN.md:
  - §9 Responsive: mobile-first
  - §10 Anti-pattern checklist (the auto-runnable one)

- Cross-check the result against `rules/design-laws.md` §8 absolute bans BEFORE writing — the design must not propose any banned pattern.
+ Cross-check the result against `qualia-design/design-laws.md` §8 absolute bans BEFORE writing — the design must not propose any banned pattern.

  ```bash
  git add .planning/DESIGN.md .planning/config.json
@@ -15,7 +15,7 @@ Agent(
  </planning>

  <rules>
- {rules/frontend.md content}
+ {qualia-design/frontend.md content}
  </rules>

  <task>
@@ -30,7 +30,7 @@ Analyze frontend:

  2. **Design Alignment**
  - Components vs DESIGN.md (colors, typography, spacing)
- - rules/frontend.md compliance: distinctive fonts? sharp accents? transitions? No card grids / gradients?
+ - qualia-design/frontend.md compliance: distinctive fonts? sharp accents? transitions? No card grids / gradients?
  - Consistency across app (buttons, spacing, colors)

  3. **Frontend Perf**
@@ -200,3 +200,66 @@ Format: What/Where/Why/Fix/Severity.
200
200
  description="Architecture synthesis + deepening"
201
201
  )
202
202
  ```
203
+
204
+ ## Parallel interface design prompt (`--deepen` Wave 3, fan-out × 3)
205
+
206
+ Spawn 3 agents in the SAME response turn. Each gets the same candidate but a *different* design constraint, so the alternatives differ structurally. Use this verbatim — the per-agent constraint is the only variable:
+
+ ```
+ Agent(
+ prompt="Interface designer (variant {1|2|3}/3). Produce ONE radically different
+ interface for this deep-module candidate. Other variants are running in parallel
+ with different constraints — yours is uniquely framed by your design lens.
+
+ <candidate>
+ {candidate block from arch strategist: files, problem, current shallow signature}
+ </candidate>
+
+ <context>
+ {INLINE .planning/CONTEXT.md (domain glossary — USE these terms verbatim)}
+ {INLINE .planning/decisions/*.md (ADRs constraining the design space)}
+ </context>
+
+ <your_lens>
+ Variant 1 → functional / data-oriented (no classes; pure functions; explicit data flow)
+ Variant 2 → OOP / encapsulated (class with private state; methods on a stable receiver)
+ Variant 3 → event-driven / message-based (subscriber model; commands and events)
+ [Use whichever lens is assigned to YOU above — the fan-out call passes only ONE]
+ </your_lens>
+
+ <task>
+ Design the interface only. Do NOT implement. Output:
+
+ 1. **Interface sketch** (TypeScript signatures, 5-15 lines). Function/class/event
+ names use CONTEXT.md domain language. No invented synonyms.
+
+ 2. **Locality gain** (1 sentence): what concentrates in this module's seam that
+ was previously scattered across N files?
+
+ 3. **Testability** (1-3 lines): where do mocks / adapters live? What's a
+ 1-line test name that would be easy to write against this interface?
+
+ 4. **Migration cost** (1 line): rough count — how many callers need updating?
+ Are there any breaking changes? Can it be staged incrementally?
+
+ 5. **Trade-off** (1 sentence): what does THIS shape sacrifice compared to the
+ other two variants?
+
+ Constraints:
+ - The interface should be DEEP (high leverage per surface area). Refuse a shallow
+ wrapper that just renames the existing functions.
+ - The deletion test must pass: deleting this module makes complexity vanish at
+ N callers, not just relocate it.
+ - Use CONTEXT.md terms. Do NOT invent new vocabulary.
+ - Output exactly the 5 numbered sections above. No prose preamble.
+ </task>",
+ subagent_type="general-purpose",
+ description="Interface variant {N}/3 — {functional|OOP|event-driven} lens"
+ )
+ ```
+
+ After all 3 return, present a comparison table to the user (see SKILL.md Step 5b). The user picks 1, 2, 3, or a hybrid. A single synthesizer agent then writes the Refactor RFC to `.planning/REFACTOR-{slug}.md`, honoring the user's pick.
+
+ **Token cost**: ~6K per variant × 3 variants = ~18K for the fan-out. The cached prefix (CONTEXT.md + ADRs + candidate block) is shared across the 3 spawns, so the effective cost is closer to ~12K. The RFC-pick synthesis stage adds ~3K. Total per deepening candidate: ~15K — well within Qualia's per-skill budget.
+
+ **Skip variants when one would obviously dominate**: if the codebase is heavily functional (e.g., Effect-based), the OOP variant adds zero value. The strategist may suggest 2 lenses instead of 3 in that case. The default is always 3 unless explicitly noted.
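The "only variable" claim above can be made concrete with a small sketch. This is a hypothetical helper (not part of the skill's code) that stamps each lens into one shared prompt template, so the three spawned prompts are identical except for the `<your_lens>` block:

```javascript
// Illustrative only: lens strings mirror the prompt template above,
// but buildVariantPrompts is not an API this package actually exports.
const LENSES = [
  "functional / data-oriented (no classes; pure functions; explicit data flow)",
  "OOP / encapsulated (class with private state; methods on a stable receiver)",
  "event-driven / message-based (subscriber model; commands and events)",
];

function buildVariantPrompts(candidateBlock, contextBlock) {
  // One template; the lens is the only per-agent variable.
  return LENSES.map(
    (lens, i) =>
      `Interface designer (variant ${i + 1}/3). Produce ONE radically different interface.\n` +
      `<candidate>\n${candidateBlock}\n</candidate>\n` +
      `<context>\n${contextBlock}\n</context>\n` +
      `<your_lens>\n${lens}\n</your_lens>`,
  );
}
```

All three strings share the same cached prefix material (candidate + context), which is what keeps the fan-out's effective token cost below 3× a single spawn.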
@@ -73,7 +73,7 @@ ls .planning/decisions/ 2>/dev/null && cat .planning/decisions/*.md 2>/dev/null
 
  Also read rules:
  ```bash
- cat ~/.claude/rules/frontend.md 2>/dev/null
+ cat ~/.claude/qualia-design/frontend.md 2>/dev/null
  cat ~/.claude/rules/security.md 2>/dev/null
  ```
 
@@ -127,6 +127,31 @@ Spawn **arch strategist** (@REFERENCE.md "Architecture strategist prompt (deepen
 
  **Skip Wave 2 for single-mode** (`--perf`, `--ui`, `--backend`, `--alignment`). Run for `full` and `deepen`.
 
+ ### Step 5b: Wave 3 -- Parallel Interface Design (`--deepen` only, after candidate selection)
+
+ After the strategist returns deepening candidates, present a numbered list to the user. The user picks ONE candidate (or `--auto` mode picks the highest-severity one).
+
+ For the chosen candidate, spawn **3 fan-out agents in parallel, in the same response turn**, each producing a *radically different* interface design for the proposed deep module. From Matt Pocock's improve-codebase-architecture skill: "spawn three sub-agents in parallel, each must produce a radically different interface for the deepened module."
+
+ Spawn 3 (@REFERENCE.md "Parallel interface design prompt"). Each receives:
+ - The candidate's files, problem, and current shallow signature
+ - CONTEXT.md domain glossary (use shared terms)
+ - A *different* design constraint (functional / OOP / event-driven / minimal-surface / hexagonal — assigned per agent so the variants differ in shape, not just naming)
+
+ Collect all 3 proposals. Present them to the user as a side-by-side table:
+
+ | # | Interface shape | Locality gain | Testability | Migration cost |
+ |---|---|---|---|---|
+ | 1 | {sketch} | {what concentrates} | {seams} | {N callers updated} |
+ | 2 | ... | ... | ... | ... |
+ | 3 | ... | ... | ... | ... |
+
+ The user picks `1`, `2`, `3`, or `hybrid` (with notes on which elements from which proposals to combine). The synthesizer then writes a "Refactor RFC" to `.planning/REFACTOR-{slug}.md` and optionally opens a GH issue (mirroring the `/qualia-prd` flow).
+
+ **Why parallel + radically different**: a single deepening proposal anchors on the first idea the LLM has. Three parallel proposals with diverse design constraints surface trade-offs the user can see at a glance — and the human's "taste" dominates the choice rather than the agent's first instinct. Empirically (Matt Pocock + Qualia internal testing), this produces dramatically better refactor RFCs than a single-pass proposal.
+
+ **Skip Wave 3 for `full` mode** (too many candidates to fan out per candidate). Run for `--deepen` only, once a candidate is selected.
+
  ### Step 6: Alignment Check (`full` and `alignment` modes)
 
  `alignment`: sole analysis. `full`: alongside Wave 1.
@@ -37,7 +37,7 @@ Before any work — design or otherwise — pass these gates. Skipping them prod
 
  | Gate | Required check | If fail |
  |---|---|---|
- | Substrate | `rules/design-laws.md`, `design-brand.md`, `design-product.md`, `design-rubric.md` exist and have been read | Read them. |
+ | Substrate | `qualia-design/design-laws.md`, `design-brand.md`, `design-product.md`, `design-rubric.md` exist and have been read | Read them. |
  | PRODUCT | `PRODUCT.md` exists at project root, has `register:` field, and is not a placeholder (`[TODO]` markers, < 200 chars) | Run setup: ask 5 questions and generate it. Never synthesize from prompt alone. |
  | DESIGN | `DESIGN.md` exists. If missing on App / Redesign scope, BLOCK and run setup. If missing on Component / Section / Critique / Quick scope, NUDGE and proceed. | Generate from PRODUCT.md + 3 questions. |
  | Slop-detect | `bin/slop-detect.mjs` is callable | Install or pull from framework. |
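The PRODUCT gate's check amounts to a few lines. A minimal sketch, assuming exactly the signals the table names (a `register:` field, literal `[TODO]` markers, under-200-char content); the helper name is illustrative, not part of the framework:

```javascript
// Hypothetical gate check mirroring the PRODUCT row of the gates table.
// Returns "pass" or a short failure reason.
function productGate(content) {
  // Must carry a register: field at the start of some line.
  if (!/^register:/m.test(content)) return "fail: missing register field";
  // Placeholder signals: [TODO] markers or a file too short to be real.
  if (content.includes("[TODO]") || content.trim().length < 200) return "fail: placeholder";
  return "pass";
}
```

On failure the skill runs setup (the 5 questions) rather than synthesizing PRODUCT.md from the prompt alone.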
@@ -100,7 +100,7 @@ For App / Redesign / Section scope on multi-file work:
  - If the conversation already contains design-taste discussion (font/color/motion preferences threaded across multiple turns), prefer **forked subagents** (`--fork-session`) so they inherit the taste context. Otherwise, blank-context fan-out is fine for mechanical fixes.
 
  Each agent receives:
- - `rules/design-laws.md` + the matching register file
+ - `qualia-design/design-laws.md` + the matching register file
  - `PRODUCT.md` + `DESIGN.md` (inlined)
  - Its 5 files (paths + contents)
  - Instruction: apply the Design Quality Rubric per file. Fix every dimension scoring < 3. Make literal edits. Do NOT change logic — only styling.
@@ -158,7 +158,7 @@ viewports: [
  For each iteration:
 
  1. Capture all 3 viewports
- 2. Pass to a vision-model agent with `rules/design-rubric.md` as the prompt anchor + DESIGN.md as the spec
+ 2. Pass to a vision-model agent with `qualia-design/design-rubric.md` as the prompt anchor + DESIGN.md as the spec
  3. The agent scores 8 dimensions, anchored 1-5, with evidence per dimension
  4. Apply fixes ONLY to dimensions scored 1 or 2 (don't nitpick 3s; prevents oscillation)
  5. STOP if: all dimensions ≥ 3 (success), OR any dimension regressed from previous iteration (regression-stop), OR 2 iterations reached (hard cap)
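The stop rules in step 5 reduce to a small pure function. A sketch, assuming scores arrive as an 8-element array per iteration; the helper name is illustrative and not part of the framework's scripts:

```javascript
// Hypothetical verdict helper for the step-5 stop rules.
// prevScores is null on the first iteration.
function stopVerdict(prevScores, currScores, iteration, maxIterations = 2) {
  // Success: every dimension at or above the acceptance bar.
  if (currScores.every((s) => s >= 3)) return "success";
  // Regression-stop: any dimension got worse than last iteration.
  if (prevScores && currScores.some((s, i) => s < prevScores[i])) return "regression-stop";
  // Hard cap on iterations to prevent oscillation.
  if (iteration >= maxIterations) return "hard-cap";
  return "continue";
}
```

The regression-stop fires before the hard cap so a worsening run ends immediately rather than burning the remaining budget.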
@@ -58,7 +58,7 @@ Agent({
  Role: @~/.claude/agents/visual-evaluator.md
 
  <rubric>
- {INLINE rules/design-rubric.md §"The 8 dimensions" through §"Aggregate score"}
+ {INLINE qualia-design/design-rubric.md §"The 8 dimensions" through §"Aggregate score"}
  </rubric>
 
  <brief>
@@ -18,7 +18,7 @@ See its own work. Fix its own work. Stop only when correct.
 
  ## What it does
 
- Takes a URL + design brief. Screenshots at 3 viewports (mobile / tablet / desktop). Spawns a vision evaluator that scores 8 dimensions of `rules/design-rubric.md` against the brief with cited evidence. Spawns up to 3 fix-builders in parallel for the top issues. Re-screenshots. Loops until all dimensions ≥ 3 or the kill-switch trips (regression, budget, or max iterations).
+ Takes a URL + design brief. Screenshots at 3 viewports (mobile / tablet / desktop). Spawns a vision evaluator that scores 8 dimensions of `qualia-design/design-rubric.md` against the brief with cited evidence. Spawns up to 3 fix-builders in parallel for the top issues. Re-screenshots. Loops until all dimensions ≥ 3 or the kill-switch trips (regression, budget, or max iterations).
 
  Different from `/qualia-polish`: that one is read+edit+slop-detect, single pass. This one is **see+edit+verify+repeat** with a real loop and real screenshots.
@@ -39,7 +39,7 @@ Run these in order. Halt on the first failure.
 
  | Gate | Check | If fail |
  |---|---|---|
- | Substrate | `rules/design-rubric.md`, `rules/design-laws.md` exist | Run `npx qualia install` |
+ | Substrate | `qualia-design/design-rubric.md`, `qualia-design/design-laws.md` exist | Run `npx qualia install` |
  | Brief | `--brief` PATH if provided, else `.planning/DESIGN.md`, else PRODUCT.md | If none, HALT: "No design brief found. Pass --brief or run /qualia-new." |
  | Browser | `node ~/.claude/skills/qualia-polish-loop/scripts/playwright-capture.mjs --url about:blank --out /tmp/qpl-preflight` exits 0 | HALT with the script's setup hint |
  | URL reachable | `curl -fsS -o /dev/null -w '%{http_code}' "$URL"` returns 2xx/3xx | HALT — start the dev server first |
@@ -91,7 +91,7 @@ node ~/.claude/skills/qualia-polish-loop/scripts/loop.mjs record \
 
  Exit codes: `0` = SUCCESS (all dims ≥ 3), `1` = CONTINUE (more iterations), `3` = KILLED (regression / budget / max).
 
- The orchestrator computes the verdict per `rules/design-rubric.md`:
+ The orchestrator computes the verdict per `qualia-design/design-rubric.md`:
 
  - **all aggregate scores ≥ 3 AND no critical issues remain** → SUCCESS, exit loop
  - **same issue fingerprint recurred 3 consecutive iterations** → KILL, `LOOP_REGRESSION_DETECTED`
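The `LOOP_REGRESSION_DETECTED` rule can be sketched as a set intersection over the last 3 iterations. This is an illustrative helper, not loop.mjs's actual implementation, assuming each iteration records its issue fingerprints as a `Set`:

```javascript
// Hypothetical recurrence check: a fingerprint trips the kill-switch only if
// it appears in EVERY one of the last `runLength` iterations (newest last).
function recurringFingerprints(history, runLength = 3) {
  if (history.length < runLength) return [];
  const recent = history.slice(-runLength);
  return [...recent[0]].filter((fp) => recent.every((set) => set.has(fp)));
}
```

A non-empty result means the same issue survived 3 consecutive fix passes, so the orchestrator exits with code `3` rather than looping again.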
@@ -1,8 +1,8 @@
  <!doctype html>
  <!--
  Deliberately broken page used by /qualia-polish-loop self-test Scenario 2.
- Hits multiple absolute-ban patterns from rules/design-laws.md and
- rules/design-brand.md so the vision evaluator has to identify them all.
+ Hits multiple absolute-ban patterns from qualia-design/design-laws.md and
+ qualia-design/design-brand.md so the vision evaluator has to identify them all.
  Banned font (Inter), pure white + pure black, blue-purple gradient,
  gradient text, identical 3-column card grid, "Get Started" / "Learn More"
  generic CTAs, side-stripe border-left:4px decorative, max-width:1280