npm - qualia-framework - Versions diffs - 6.7.1 → 6.8.1 - Mend

qualia-framework 6.7.1 → 6.8.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (38) hide show

package/CHANGELOG.md +2314 -0
package/FLAGS.md +73 -0
package/SOUL.md +17 -0
package/TROUBLESHOOTING.md +183 -0
package/agents/plan-checker.md +4 -0
package/agents/planner.md +8 -0
package/agents/qa-browser.md +6 -2
package/agents/research-synthesizer.md +4 -0
package/agents/researcher.md +4 -0
package/agents/roadmapper.md +4 -0
package/agents/verifier.md +1 -1
package/agents/visual-evaluator.md +8 -6
package/bin/cli.js +7 -1
package/bin/install.js +122 -15
package/bin/runtime-manifest.js +1 -0
package/bin/security-scan.js +24 -10
package/bin/trust-score.js +34 -0
package/hooks/migration-guard.js +4 -4
package/hooks/pre-deploy-gate.js +14 -4
package/hooks/stop-session-log.js +10 -7
package/package.json +6 -2
package/qualia-design/design-rubric.md +3 -1
package/rules/architecture.md +1 -1
package/rules/grounding.md +3 -1
package/rules/speed.md +2 -2
package/skills/qualia-idk/SKILL.md +3 -3
package/skills/qualia-polish/REFERENCE.md +11 -6
package/skills/qualia-polish/SKILL.md +20 -3
package/skills/qualia-polish/scripts/loop.mjs +24 -6
package/skills/qualia-polish/scripts/playwright-capture.mjs +89 -11
package/skills/qualia-polish/scripts/vibe-tokens.mjs +57 -1
package/skills/qualia-research/SKILL.md +1 -1
package/skills/qualia-road/SKILL.md +6 -0
package/skills/qualia-scope/SKILL.md +2 -2
package/skills/qualia-secure/SKILL.md +5 -5
package/templates/knowledge/index.md +4 -4
package/tests/bin.test.sh +4 -4
package/tests/lib.test.sh +2 -2

package/skills/qualia-polish/REFERENCE.md CHANGED Viewed

@@ -32,6 +32,8 @@ The capture script (`scripts/playwright-capture.mjs`) auto-selects in this order
 3. `~/.cache/ms-playwright/chromium-{version}/chrome-{linux64,linux,mac,win}/chrome` — if Playwright was ever installed for browsers but the package isn't import-resolvable
 4. `which google-chrome` / `chromium` / `chromium-browser` / `chrome` — system browser fallback
+All four backends capture **full-page** (top-to-bottom, including below-fold content), not just the first viewport-height — the evaluator scores card grids, CTAs, and footers, so the capture must contain them. Playwright uses `fullPage: true`; the binary-direct backends omit any fixed `--window-size` height clamp so the headless render extends to the full document height.
 For backends 3 and 4 (binary-direct), the script uses `--headless=new --screenshot --virtual-time-budget`. Less precise than Playwright's `networkidle` waiting but works without any npm dependency.
 Setup hints if all four fail:
@@ -70,6 +72,7 @@ Role: @${QUALIA_AGENTS}/visual-evaluator.md
 </product>
 <screenshots>
+One full-page PNG per viewport (captured top-to-bottom, includes below-fold content):
 - mobile (375px):  /tmp/qpl-{ts}/iter-{N}/mobile-375.png
 - tablet (768px):  /tmp/qpl-{ts}/iter-{N}/tablet-768.png
 - desktop (1440px): /tmp/qpl-{ts}/iter-{N}/desktop-1440.png
@@ -150,6 +153,8 @@ the {dim} dimension at {score}. Your single task: fix that one dimension.
 ## Iteration log entry (what `loop.mjs record` writes to state.json.iterations[])
+`tokens_used` is a **deterministic fixed charge** (`PER_ITER_TOKENS = 14500`) added by `loop.mjs` per iteration — NOT a number reported by the evaluator. The evaluator's JSON no longer carries a `tokens_used` field; trusting a model-guessed count to drive the budget kill-switch was unsound, so the loop charges the constant below for every iteration regardless of what any agent claims.
 ```json
 {
   "iteration": 1,
@@ -189,20 +194,20 @@ When the kill trigger fires, the verdict becomes `killed_regression` and `state.
 | 6 | ~90K  | default |
 | 8 | ~120K | hard cap; pass `--budget 150000` to allow |
-Per-iteration cost (rough):
-- 3 screenshot reads ≈ 9K
+Per-iteration cost is charged as a **fixed constant** — `loop.mjs` adds `PER_ITER_TOKENS = 14500` per iteration to `state.tokens_used` and the kill-switch trips off that deterministic sum, never off an agent's self-estimate. The breakdown below is the rationale for the constant, not a per-run measurement:
+- 3 full-page screenshot reads ≈ 9K (taller PNGs than viewport-height crops, but still 3 reads)
 - rubric + brief inlined ≈ 2K (cached after iter 1)
 - previous-iteration delta ≈ 0.5K
 - 3 fix-builder spawns × (file read + edit + commit-fix call) ≈ 3K
-- **per-iteration ≈ 14.5K**
+- **per-iteration constant = 14.5K** (deterministic — same charge every iteration)
 ## Self-test scenarios (mapping to spec)
 | # | Fixture | Expected | Verifier |
 |---|---|---|---|
-| 1 | `fixtures/clean.html` | SUCCESS in 1-2 iterations, all dims ≥ 4 | run capture, run evaluator inline, assert pass |
-| 2 | `fixtures/broken.html` | SUCCESS in 4-6 iters; identifies banned font + gradient + 3-card grid + side-stripe + generic CTA | each fix-builder commits a `qpl-N:` change; final eval all dims ≥ 3 |
-| 3 | Kill-switch | KILL at iter ≤ 4 with `LOOP_REGRESSION_DETECTED` | call `loop.mjs record` 3× with the same fingerprint; assert exit 3 + correct verdict |
+| 1 | `fixtures/clean.html` | Each viewport PNG is full-page (height > viewport-height when the fixture scrolls); SUCCESS in 1-2 iterations, all dims ≥ 4 | run capture, assert PNG is full-page, run evaluator inline, assert pass |
+| 2 | `fixtures/broken.html` | SUCCESS in 4-6 iters; identifies banned font + gradient + 3-card grid + side-stripe + generic CTA, INCLUDING below-fold offenders only visible in a full-page shot | each fix-builder commits a `qpl-N:` change; final eval all dims ≥ 3 |
+| 3 | Kill-switch | KILL at iter ≤ 4 with `LOOP_REGRESSION_DETECTED`; `state.tokens_used` increments by exactly `PER_ITER_TOKENS` (14500) per recorded iteration | call `loop.mjs record` 3× with the same fingerprint; assert exit 3 + correct verdict + deterministic token charge |
 The pilot-results doc at `docs/playwright-loop-pilot-results.md` records the actual outcome from `bash scripts/_self-tests.sh` (Scenario 3 is exercised by a deterministic unit-style invocation; Scenarios 1+2 require a real vision pass and are run by Claude when the loop ships).

package/skills/qualia-polish/SKILL.md CHANGED Viewed

@@ -32,7 +32,7 @@ The first argument selects the scope. Stage selection follows from scope.
 | `/qualia-polish --redesign` | **Redesign** | ~30m | all + Stage 1 mandatory + 2 vision iterations |
 | `/qualia-polish --critique` | **Critique** | read-only | 0, 4, 5 (no edits) |
 | `/qualia-polish --quick` | **Quick** | ~1m | 0, 2, 7 (gates only, no vision loop) |
-| `/qualia-polish --vibe` | **Aesthetic pivot** | ~3m | token-only direction swap |
+| `/qualia-polish --vibe` | **Aesthetic pivot** | ~3m | token-only direction swap, then 6, 7 (mandatory verify) |
 | `/qualia-polish --loop {url}` | **Loop** | ~5-15m | autonomous see/fix/verify, max 8 iterations |
 Other flags: `--register=brand|product` overrides register inference. Vibe-specific flags: `--variants N`, `--extract URL|image`, `--sync`, `--write`. Loop-specific flags: `--brief PATH`, `--max N`, `--viewports 375,768,1440`, `--ref PATH`, `--budget 100000`.
@@ -73,6 +73,21 @@ node ${QUALIA_SKILLS}/qualia-polish/scripts/vibe-extract.mjs --source https://ex
 Default flow proposes one opinionated direction per `rules/one-opinion.md`. `--variants` is opt-in only. `--extract` stages a reference screenshot for the visual evaluator to reverse-engineer into DESIGN.md tokens; the user must review low-confidence extracted values before application. `--sync --write` patches DESIGN.md to reflect tokens already present in code.
+**Mandatory post-vibe verify (the token swap does NOT terminate the run).** After applying the new tokens, a `--vibe` run MUST NOT stop at the swap — it routes back through **Stage 6 — Verify**. Before claiming done, run:
+```bash
+# 1. TypeScript still compiles (a token rename can break a typed theme map)
+npx tsc --noEmit 2>&1 | tail -10
+# 2. Anti-pattern scan on the files the swap touched (new tokens can reintroduce slop)
+node bin/slop-detect.mjs {changed_files}
+[ $? -eq 0 ] || { echo "vibe failed — slop-detect found critical issues in the new tokens"; exit 1; }
+```
+3. **Scoped vision check (best-effort, only if a dev server is reachable).** Capture one screenshot per viewport via `scripts/playwright-capture.mjs` and spawn a single `agents/visual-evaluator.md` pass scored ONLY on the dimensions a token swap can move: **color, typography, shadow, motion**. (Layout/container/microcopy/graphics are out of scope for `--vibe` — do not score or fix them here.) If no dev server and no resolvable browser backend, skip this step and note it in the report; never block the vibe on missing optional tooling. Any in-scope dimension scoring < 3 is a regression in the new direction — re-tune the offending tokens and re-verify, do not ship.
+Only after Stage 6 passes does the `--vibe` run reach Stage 7 (commit & state).
 ## Setup gates (non-optional, every scope)
 Before any work — design or otherwise — pass these gates. Skipping them produces generic output that ignores the project.
@@ -163,6 +178,8 @@ Each agent receives:
 For Component scope: do the work in main context. Read, fix, verify.
+**Optional best-effort vision check (Component scope).** If a dev server is reachable (`http://localhost:3000` or a port detected via `lsof`) AND `scripts/playwright-capture.mjs` can resolve a browser backend, capture ONE screenshot at a single viewport (desktop 1440 is sufficient for an isolated component) and spawn one `agents/visual-evaluator.md` pass scored on component-relevant dimensions ONLY — Typography, Color, Shadow, Motion, Microcopy (skip Layout/Container/Graphics per the rubric's scope guard). This is opt-in by environment: if there is no dev server or no resolvable backend, **omit the vision check without error and record that it was unavailable** — never block a component touch-up on missing optional tooling.
 **Apply fixes scoped by what's missing:**
 | If the file scores low on… | Apply fix from… |
@@ -216,7 +233,7 @@ If a11y &lt; 90 OR axe critical/serious violations: **fix programmatically** (th
 ### Stage 4 — Vision loop (Redesign scope only — max 2 iterations)
-Use `skills/qualia-polish/scripts/playwright-capture.mjs` (Playwright backend, with Chrome-binary fallback). Capture screenshots at 3 viewports: 375 / 768 / 1440 — these match what `agents/visual-evaluator.md` expects, and what the `--loop` mode captures. Single browser (Chromium) is fine in 2026 — cross-browser CSS rendering differences are vanishingly rare.
+Use `skills/qualia-polish/scripts/playwright-capture.mjs` (Playwright backend, with Chrome-binary fallback). Capture one **full-page** screenshot at each of 3 viewports: 375 / 768 / 1440 — these match what `agents/visual-evaluator.md` expects, and what the `--loop` mode captures. The `height` below sets the viewport window, not the screenshot height — each PNG runs top-to-bottom so the evaluator scores below-fold sections too. Single browser (Chromium) is fine in 2026 — cross-browser CSS rendering differences are vanishingly rare.
 ```
 viewports: [
@@ -325,6 +342,6 @@ node ${QUALIA_BIN}/qualia-ui.js end "POLISHED" "{next command — depends on con
 | `bin/slop-detect.mjs not found` | Framework not installed in project | Run `npx qualia install` or pull script from framework repo |
 | `PRODUCT.md missing` | Project predates the PRODUCT.md substrate | Run setup (ask 5 questions, generate). For Component scope, can proceed with nudge. |
 | `Lighthouse not installed` | Optional tool | Skip Stage 3 numeric gates, note in report. Don't fail. |
-| `webapp-testing skill not present` | Optional Anthropic skill | Skip Stage 4 vision loop on Redesign scope. Note in report. Recommend installing. |
+| `playwright-capture.mjs cannot resolve a browser backend` | None of the four backends resolve: no `playwright`/`playwright-core` package, no cached Chromium under `~/.cache/ms-playwright`, no system Chrome/Chromium on PATH | Skip Stage 4 vision loop (and any screenshot-dependent verify). Note in report. Install a backend: `npm i -D playwright && npx playwright install chromium` (or put `google-chrome` on PATH). |
 | Vision loop oscillates between iterations | Rubric not anchored properly | Verify rubric prompt instructs "default to 3, only 1-2 = fix". Hard cap at 2 iterations. |
 | User says "you missed X" after polish completes | Scope was too narrow | Re-run with wider scope (`/qualia-polish` whole-app). Don't argue scope. |

package/skills/qualia-polish/scripts/loop.mjs CHANGED Viewed

@@ -14,12 +14,13 @@
  *
  * Eval JSON contract (what the vision evaluator returns):
  *   {
- *     "iteration": 1,
+ *     "iteration": 1,            // optional; if present must equal state.iteration+1 (monotonicity assertion)
  *     "viewport_results": [{ viewport, scores, top_issues }],
  *     "aggregate_scores": { typography, color, spatial, layout, shadow, motion, microcopy, container, graphics },
- *     "top_issues": [{ dim, severity, description, likely_file, fix }],
- *     "tokens_used": 14500
+ *     "top_issues": [{ dim, severity, description, likely_file, fix }]
  *   }
+ *   NOTE: any ev.tokens_used is IGNORED — the budget charges a fixed
+ *   PER_ITER_TOKENS per iteration (deterministic kill-switch).
  *
  * State JSON shape (written by this script, read by Claude):
  *   {
@@ -35,6 +36,12 @@ import { spawnSync } from "node:child_process";
 const DIMS = ["typography", "color", "spatial", "layout", "shadow", "motion", "microcopy", "container", "graphics"];
+// Deterministic per-iteration token charge (evaluator + 3 builders ≈ 14500,
+// per REFERENCE.md:197). The budget kill-switch must NOT trust ev.tokens_used —
+// that is a model-guessed number and an under-report would defeat the budget
+// guard. Charge a fixed, known cost per iteration instead.
+const PER_ITER_TOKENS = 14500;
 // ── helpers ──────────────────────────────────────────────────────────────
 function loadState(p) {
   if (!existsSync(p)) { console.error(`state file not found: ${p}`); exit(2); }
@@ -98,7 +105,17 @@ function cmdRecord() {
   const state = loadState(statePath);
   const ev = JSON.parse(readFileSync(evalPath, "utf8"));
-  state.iteration = (ev.iteration ?? state.iteration + 1);
+  // Monotonicity assertion: each record() advances exactly one iteration. If the
+  // evaluator reports an iteration that disagrees with the script's own counter,
+  // the loop is desynced — fail loudly instead of trusting the eval's number,
+  // which would corrupt regression fingerprinting (state.iteration is the key
+  // used at :108-134) and budget accounting.
+  const expected = state.iteration + 1;
+  if (ev.iteration != null && ev.iteration !== expected) {
+    console.error(`loop desync: eval.iteration=${ev.iteration} expected=${expected}`);
+    exit(2);
+  }
+  state.iteration = expected;
   const scores = ev.aggregate_scores || {};
   const failingDims = DIMS.filter((d) => (scores[d] ?? 3) < 3);
   const total = DIMS.reduce((a, d) => a + (scores[d] ?? 3), 0);
@@ -133,7 +150,8 @@ function cmdRecord() {
     return its.length >= 3 && its.every((n, i, arr) => i === 0 || n === arr[i - 1] + 1);
   });
-  state.tokens_used += parseInt(ev.tokens_used || 0, 10) || 0;
+  // Charge the fixed per-iteration cost — do NOT trust ev.tokens_used.
+  state.tokens_used += PER_ITER_TOKENS;
   state.iterations.push({
     iteration: state.iteration,
@@ -142,7 +160,7 @@ function cmdRecord() {
     pass,
     failing_dims: failingDims,
     top_issues: issues,
-    tokens_used: ev.tokens_used || 0,
+    tokens_used: PER_ITER_TOKENS,
     timestamp: new Date().toISOString(),
   });

package/skills/qualia-polish/scripts/playwright-capture.mjs CHANGED Viewed

@@ -26,7 +26,7 @@ import { homedir } from "node:os";
 // ── Arg parsing ──────────────────────────────────────────────────────────
 function parseArgs() {
-  const args = { url: null, out: null, viewports: [375, 768, 1440], wait: 1500, reducedMotion: false };
+  const args = { url: null, out: null, viewports: [375, 768, 1440], wait: 1500, reducedMotion: false, states: false, stateSelector: null };
   for (let i = 2; i < argv.length; i++) {
     const a = argv[i];
     if (a === "--url" && argv[i + 1]) args.url = argv[++i];
@@ -35,15 +35,22 @@ function parseArgs() {
       args.viewports = argv[++i].split(",").map((s) => parseInt(s, 10)).filter((n) => Number.isFinite(n) && n > 0);
     } else if (a === "--wait" && argv[i + 1]) args.wait = parseInt(argv[++i], 10);
     else if (a === "--reduced-motion") args.reducedMotion = true;
+    else if (a === "--states") args.states = true;
+    else if (a === "--state-selector" && argv[i + 1]) args.stateSelector = argv[++i];
     else if (a === "--help" || a === "-h") {
       console.log(`playwright-capture.mjs — Screenshot capture for /qualia-polish --loop
 Usage:
-  node playwright-capture.mjs --url <url> --out <dir> [--viewports 375,768,1440] [--wait 1500] [--reduced-motion]
+  node playwright-capture.mjs --url <url> --out <dir> [--viewports 375,768,1440] [--wait 1500] [--reduced-motion] [--states]
 Flags:
   --reduced-motion   Force prefers-reduced-motion: reduce in the captured page.
                      Use when the brief explicitly opts out of motion (a11y mode).
+  --states           Also capture hover/focus interaction-state PNGs (Playwright
+                     backend only — the Chrome-binary backend cannot drive
+                     hover/focus from the CLI and silently skips them).
+  --state-selector   CSS selector to target for --states (default: first
+                     a, button, [role=button], or input on the page).
 Backend selection (auto):
   1. Playwright   — import('playwright') if installed
@@ -64,11 +71,14 @@ function viewportName(width) {
   if (width <= 900) return "tablet";
   return "desktop";
 }
-function viewportHeight(width) {
-  if (width <= 480) return 812;
-  if (width <= 900) return 1024;
-  return 900;
-}
+// Initial render height for the Playwright context viewport. Capture is
+// full-page (fullPage:true), so this is NOT a clamp on the captured PNG — it
+// only sets the layout viewport before Playwright stitches the full scroll.
+const INITIAL_RENDER_HEIGHT = 1080;
+// Headless Chrome (binary backend) has no --full-page flag, so a tall window
+// height approximates full-page capture. Replaces the old per-viewport clamp
+// (812/1024/900) that cropped everything below the fold.
+const FULLPAGE_WINDOW_HEIGHT = 8000;
 // ── Backend: Playwright (preferred when available) ──────────────────────
 async function captureViaPlaywright(args) {
@@ -84,7 +94,7 @@ async function captureViaPlaywright(args) {
     browser = await chromium.launch({ headless: true });
     for (const width of args.viewports) {
       const name = viewportName(width);
-      const height = viewportHeight(width);
+      const height = INITIAL_RENDER_HEIGHT;
       const file = join(args.out, `${name}-${width}.png`);
       try {
         const ctxOpts = { viewport: { width, height }, deviceScaleFactor: 1 };
@@ -93,9 +103,14 @@ async function captureViaPlaywright(args) {
         const page = await ctx.newPage();
         await page.goto(args.url, { waitUntil: "networkidle", timeout: 30000 });
         if (args.wait > 0) await page.waitForTimeout(args.wait);
-        await page.screenshot({ path: file, fullPage: false });
+        await page.screenshot({ path: file, fullPage: true });
+        const stateFiles = [];
+        if (args.states) {
+          const extra = await captureStates(page, args, name, width);
+          stateFiles.push(...extra);
+        }
         await ctx.close();
-        results.push({ viewport: name, width, height, file, ok: true, backend: "playwright", reducedMotion: !!args.reducedMotion });
+        results.push({ viewport: name, width, height, file, ok: true, backend: "playwright", reducedMotion: !!args.reducedMotion, ...(stateFiles.length ? { state_captures: stateFiles } : {}) });
       } catch (err) {
         results.push({ viewport: name, width, height, file, ok: false, backend: "playwright", error: err.message });
       }
@@ -106,6 +121,59 @@ async function captureViaPlaywright(args) {
   return results;
 }
+// ── Optional interaction-state capture (--states, Playwright only) ───────
+// Captures hover/focus/loading variant PNGs of the first interactive element
+// (or --state-selector) so the evaluator can score Motion/States honestly
+// instead of guessing from a single static above-fold frame. Best-effort: any
+// individual state that can't be captured is recorded as ok:false, never throws.
+async function captureStates(page, args, name, width) {
+  const out = [];
+  const selector = args.stateSelector
+    || "a:visible, button:visible, [role=button]:visible, input:visible";
+  let target;
+  try {
+    target = page.locator(selector).first();
+    await target.waitFor({ state: "visible", timeout: 2000 });
+  } catch {
+    out.push({ state: "hover", viewport: name, width, ok: false, error: "no interactive target found" });
+    return out;
+  }
+  // hover
+  const hoverFile = join(args.out, `${name}-${width}-hover.png`);
+  try {
+    await target.hover({ timeout: 2000 });
+    await page.waitForTimeout(Math.min(args.wait, 400));
+    await target.screenshot({ path: hoverFile });
+    out.push({ state: "hover", viewport: name, width, file: hoverFile, ok: true });
+  } catch (e) {
+    out.push({ state: "hover", viewport: name, width, file: hoverFile, ok: false, error: e.message });
+  }
+  // focus
+  const focusFile = join(args.out, `${name}-${width}-focus.png`);
+  try {
+    await target.focus({ timeout: 2000 });
+    await page.waitForTimeout(Math.min(args.wait, 400));
+    await target.screenshot({ path: focusFile });
+    out.push({ state: "focus", viewport: name, width, file: focusFile, ok: true });
+  } catch (e) {
+    out.push({ state: "focus", viewport: name, width, file: focusFile, ok: false, error: e.message });
+  }
+  // loading — re-navigate and grab the earliest paint (commit) before networkidle
+  const loadingFile = join(args.out, `${name}-${width}-loading.png`);
+  try {
+    await page.goto(args.url, { waitUntil: "commit", timeout: 30000 });
+    await page.screenshot({ path: loadingFile });
+    out.push({ state: "loading", viewport: name, width, file: loadingFile, ok: true });
+  } catch (e) {
+    out.push({ state: "loading", viewport: name, width, file: loadingFile, ok: false, error: e.message });
+  }
+  return out;
+}
 // ── Backend: Chrome/Chromium headless via spawn ─────────────────────────
 function findChromeBinary() {
   // 1. Playwright cached chromium (newest first)
@@ -136,7 +204,10 @@ function captureViaChromeBinary(args, binary) {
   const results = [];
   for (const width of args.viewports) {
     const name = viewportName(width);
-    const height = viewportHeight(width);
+    // Tall window so headless Chrome's --screenshot is not cropped to a fixed
+    // above-fold height. Classic headless has no --full-page flag, so a large
+    // height is the closest equivalent to full-page capture on this backend.
+    const height = FULLPAGE_WINDOW_HEIGHT;
     const file = join(args.out, `${name}-${width}.png`);
     const flags = [
       "--headless=new",
@@ -144,6 +215,8 @@ function captureViaChromeBinary(args, binary) {
       "--disable-gpu",
       "--disable-dev-shm-usage",
       "--hide-scrollbars",
+      // Constrain layout WIDTH; use a tall height (not the old per-viewport
+      // clamp) so the capture spans well below the fold.
       `--window-size=${width},${height}`,
       `--screenshot=${file}`,
       `--virtual-time-budget=${Math.max(args.wait + 1000, 3000)}`,
@@ -189,6 +262,9 @@ async function main() {
   }
   const failed = captures.filter((c) => !c.ok).length;
+  // --states is only honored on the Playwright backend (the Chrome-binary
+  // backend cannot drive hover/focus from the CLI). Surface that it was skipped.
+  const statesSkipped = args.states && backendUsed === "chrome-binary";
   console.log(JSON.stringify({
     url: args.url,
     output_dir: args.out,
@@ -196,6 +272,8 @@ async function main() {
     captures,
     total: captures.length,
     failed,
+    ...(args.states ? { states_requested: true } : {}),
+    ...(statesSkipped ? { states_skipped: "chrome-binary backend cannot capture hover/focus/loading states" } : {}),
   }, null, 2));
   exit(failed > 0 ? 1 : 0);
 }

package/skills/qualia-polish/scripts/vibe-tokens.mjs CHANGED Viewed

@@ -19,8 +19,9 @@
  */
 import { readFileSync, writeFileSync, existsSync, readdirSync, statSync } from "node:fs";
-import { join, basename, extname } from "node:path";
+import path, { join, basename, extname } from "node:path";
 import { argv, exit, cwd } from "node:process";
+import { spawnSync } from "node:child_process";
 function flag(name, fallback) {
   const i = argv.indexOf(name);
@@ -41,6 +42,7 @@ if (!CMD || CMD === "--help" || CMD === "-h") {
 Usage:
   tokens.mjs sync --design <path> [--write]
   tokens.mjs propose-variants --product <path> --design <path> --count <N>
+  tokens.mjs verify [--files a.css,b.tsx]   # post-vibe gate: tsc --noEmit + slop-detect
 See skills/qualia-polish/SKILL.md.
 `);
@@ -331,11 +333,65 @@ function cmdProposeVariants() {
   exit(0);
 }
+// ─── Post-vibe verify (script-side piece of W7.3) ─────────────────────
+// Deterministic gate the orchestrator (SKILL.md Stage 6) calls AFTER a --vibe
+// token swap, so a vibe pivot never ships unverified. This runs the two
+// machine-checkable gates — `tsc --noEmit` and `slop-detect` on changed files.
+// The screenshot + visual-evaluator pass remains orchestration (the parent
+// session / another agent), not a script primitive.
+function resolveSlopScript() {
+  const candidates = [
+    process.env.SLOP_DETECT_SCRIPT,
+    `${process.env.HOME}/.claude/bin/slop-detect.mjs`,
+    path.join(path.dirname(new URL(import.meta.url).pathname), "..", "..", "..", "bin", "slop-detect.mjs"),
+  ].filter(Boolean);
+  return candidates.find((p) => existsSync(p)) || null;
+}
+function cmdVerify() {
+  const filesFlag = flag("--files", null);
+  const files = typeof filesFlag === "string"
+    ? filesFlag.split(",").map((s) => s.trim()).filter(Boolean)
+    : [];
+  const gates = [];
+  // Gate 1 — tsc --noEmit (skipped if no tsconfig in cwd).
+  if (existsSync(join(cwd(), "tsconfig.json"))) {
+    const r = spawnSync("npx", ["tsc", "--noEmit"], { encoding: "utf8" });
+    gates.push({ gate: "tsc", ok: r.status === 0, output: (r.stderr || r.stdout || "").split("\n").slice(0, 20).join("\n") });
+  } else {
+    gates.push({ gate: "tsc", ok: true, skipped: "no tsconfig.json in cwd" });
+  }
+  // Gate 2 — slop-detect on the changed files (resolved like commit-fix does).
+  const slopScript = resolveSlopScript();
+  if (!slopScript) {
+    gates.push({ gate: "slop-detect", ok: true, skipped: "slop-detect.mjs not found" });
+  } else if (files.length === 0) {
+    gates.push({ gate: "slop-detect", ok: true, skipped: "no --files provided" });
+  } else {
+    const slopBin = process.env.SLOP_DETECT_BIN || "node";
+    const r = spawnSync(slopBin, [slopScript, ...files], { encoding: "utf8" });
+    gates.push({ gate: "slop-detect", ok: r.status !== 1, output: (r.stdout || "").split("\n").slice(0, 20).join("\n") });
+  }
+  const allPass = gates.every((g) => g.ok);
+  console.log(JSON.stringify({
+    command: "verify",
+    files,
+    gates,
+    pass: allPass,
+    note: "screenshot + visual-evaluator pass is orchestrated by SKILL.md Stage 6, not by this script",
+  }, null, 2));
+  exit(allPass ? 0 : 1);
+}
 // ─── Dispatch ────────────────────────────────────────────────────────
 switch (CMD) {
   case "sync": cmdSync(); break;
   case "propose-variants": cmdProposeVariants(); break;
+  case "verify": cmdVerify(); break;
   default:
     console.error(`Unknown command: ${CMD}`);
     exit(2);

package/skills/qualia-research/SKILL.md CHANGED Viewed

@@ -124,5 +124,5 @@ node ${QUALIA_BIN}/qualia-ui.js end "PHASE {N} RESEARCH DONE" "/qualia-plan {N}"
 1. **One session per run.** Don't research phases 1-5 in one call.
 2. **Must produce a file.** Research in conversation only is worthless.
 3. **Honor locked decisions.** Don't research alternatives to locked choices.
-4. **Local-first.** Drain NotebookLM and `~/qualia-memory` before any external call. The team has already researched most domains we touch — querying existing notebooks is near-zero token cost AND higher-quality than fresh WebSearch.
+4. **Local-first.** Drain NotebookLM and the local memory (`projects/-home-qualia/memory/MEMORY.md`) before any external call. The team has already researched most domains we touch — querying existing notebooks is near-zero token cost AND higher-quality than fresh WebSearch.
 5. **Context7 before WebFetch.** When you do go external, Context7 first for libraries; only WebFetch for non-library content (blog posts, case studies, post-mortems).

package/skills/qualia-road/SKILL.md CHANGED Viewed

@@ -86,13 +86,19 @@ Before high-stakes phases, run alignment skills against `.planning/CONTEXT.md` (
 ## Auxiliary commands
 ```
 Lost?        → /qualia        (state router — tells you the next command)
+Don't-know?  → /qualia-idk    (deep diagnostic — three isolated scans + paste-ready command sequence)
 Health?      → /qualia-doctor (install, state, contracts, memory, ERP queue)
 Stuck/weird? → /qualia        (diagnostic branch — scans planning + code when state alone is insufficient)
 Broken thing? → /qualia-fix     (root cause, minimal patch, verify, report)
 Single feature? → /qualia-feature (new capability: inline for trivia, fresh spawn for 1-5 files)
+Research it? → /qualia-research (per-phase deep research before planning; writes phase-{N}-research.md)
+Save a lesson? → /qualia-learn (persist a pattern/fix/client preference across sessions + projects)
+Secure config? → /qualia-secure (audit CLAUDE.md / settings.json / hooks / MCP for injection + leaks)
+Verify FAILed? → /qualia-postmortem (self-heal — find which rule/agent should have caught it)
 Paused?      → /qualia        (restore from .continue-here.md or STATE.md)
 End of day?  → /qualia-report (mandatory before clock-out; writes ERP payload)
 Unsure plan? → /qualia-scope (capture decisions before planning)
+Invoice/email? → /zoho-workflow (Zoho Invoice + Mail ops — invoices, cover emails, contacts, inbox)
 ```
 ## Outside-road command boundaries

package/skills/qualia-scope/SKILL.md CHANGED Viewed

@@ -106,7 +106,7 @@ The adversarial, DoD-gated intake. Scopes a **new increment** (phase/milestone)
 ```bash
 node ${QUALIA_BIN}/qualia-ui.js banner scope 2>/dev/null || true
-cat rules/constitution.md
+cat /home/qualia/.claude/rules/constitution.md
 cat .planning/CONTEXT.md 2>/dev/null   # project glossary — DATA, never a plan/spec
 ls .planning/decisions/ 2>/dev/null
 cat .planning/STATE.md 2>/dev/null     # for profile + existing milestone context
@@ -123,7 +123,7 @@ If the operator already named it (arg or prior context), accept it. Otherwise as
 ```bash
 ARCHETYPE={chosen}
-cat references/archetypes/${ARCHETYPE}.md
+cat /home/qualia/.claude/references/archetypes/${ARCHETYPE}.md
 ```
 If the file does not exist (e.g. `web-app` not yet authored), HALT and say which archetype file is missing — do not improvise a DoD. The archetype file is the source of the Grill variables, the Definition of Done, and the v1 capability set; without it there is no gate to enforce.

package/skills/qualia-secure/SKILL.md CHANGED Viewed

@@ -77,13 +77,13 @@ The auditor writes `.planning/security-audit.md` — that's the deliverable.
 Combine `.planning/security-scan.md` (static) + `.planning/security-audit.md` (Opus) into a single executive summary. Surface the top 3 actions ranked by severity:
 - **CRITICAL** → fix immediately, before any further work.
-- **HIGH** → ticket for this sprint; route to `/qualia-hook-gen` if the fix is "make this instructional rule deterministic via a hook."
+- **HIGH** → ticket for this sprint; if the fix is "make this instructional rule deterministic via a hook," propose the hook (the `migration-guard` / `branch-guard` pattern in `rules/constitution.md`) and add it under `hooks/`.
 - **MEDIUM/LOW** → backlog.
 ### Step 4. Close
 ```bash
-node ${QUALIA_BIN}/qualia-ui.js end "SECURED" "/qualia-hook-gen"
+node ${QUALIA_BIN}/qualia-ui.js end "SECURED" "/qualia-fix"
 ```
 (Or omit the next-command if all findings are LOW.)
@@ -93,13 +93,13 @@ node ${QUALIA_BIN}/qualia-ui.js end "SECURED" "/qualia-hook-gen"
 1. **Static pass is non-negotiable.** It's fast and deterministic — always runs.
 2. **Opus pass is opt-in.** It costs tokens and time. Default to skipping unless the user explicitly asks for "deep audit" or the static pass triggers HIGH+ findings.
 3. **No fake severity.** Per `rules/grounding.md`, every finding cites `file:line` and matches a category in the Severity Rubric. No hedging.
-4. **Recommend deterministic fixes when possible.** A rule in CLAUDE.md is suggestive; a hook is enforced. The skill's bias is toward `/qualia-hook-gen` over "tell the agent to do X."
+4. **Recommend deterministic fixes when possible.** A rule in CLAUDE.md is suggestive; a hook is enforced. The skill's bias is toward proposing a deterministic hook (under `hooks/`) over "tell the agent to do X."
 5. **Never auto-rotate secrets.** Flag and instruct. The user rotates manually with confirmation — secrets in CI variables are the user's domain.
 ## When NOT to use
 - Application-level security review (use `/security-review` for OWASP-style code audit).
-- Production deployment health (use `/qualia-doctor` / `/qualia-status`).
-- Specific bug investigation (use `/qualia-debug` → `/qualia-fix`).
+- Production deployment health (use `/qualia-doctor`).
+- Specific bug investigation (use `/qualia-fix`).
 `/qualia-secure` is specifically for **the agent's configuration**. The hooks, the rules, the tool scopes, the MCP servers — the surfaces Claude reads to decide what to do.

package/templates/knowledge/index.md CHANGED Viewed

@@ -12,10 +12,10 @@ file first**, then jump to the specific file(s) that match.
 |--------------------------|-------|
 | "How do we usually X?" / patterns we've used before | `learned-patterns.md` |
 | Recurring bug + fix recipes | `common-fixes.md` |
-| Supabase auth, RLS, migrations, edge functions | `supabase-patterns.md` |
-| Retell, ElevenLabs, voice agent flows | `voice-agent-patterns.md` |
-| Where a project is deployed, env vars, domains | `deployment-map.md` |
-| Who is on the team, their role, their access | `employees.md` |
+| Supabase auth, RLS, migrations, edge functions | `supabase-patterns.md` *(created on first /qualia-learn)* |
+| Retell, ElevenLabs, voice agent flows | `voice-agent-patterns.md` *(created on first /qualia-learn)* |
+| Where a project is deployed, env vars, domains | `deployment-map.md` *(created on first /qualia-learn)* |
+| Who is on the team, their role, their access | `employees.md` *(created on first /qualia-learn)* |
 | What I worked on yesterday / last week | `daily-log/YYYY-MM-DD.md` |
 | Memory layer architecture itself | `agents.md` |

package/tests/bin.test.sh CHANGED Viewed

@@ -733,11 +733,11 @@ else
   fail_case "qualia-flush retirement/install state"
 fi
-# 62. CLAUDE_AGENT_FORK_ENABLED=1 in settings.json
-if grep -q '"CLAUDE_AGENT_FORK_ENABLED": "1"' "$TMP/.claude/settings.json"; then
-  pass "settings.env CLAUDE_AGENT_FORK_ENABLED=1 (forked subagents on by default)"
+# 62. CLAUDE_CODE_FORK_SUBAGENT=1 in settings.json (official env var, Claude Code v2.1.117+)
+if grep -q '"CLAUDE_CODE_FORK_SUBAGENT": "1"' "$TMP/.claude/settings.json"; then
+  pass "settings.env CLAUDE_CODE_FORK_SUBAGENT=1 (forked subagents on by default)"
 else
-  fail_case "CLAUDE_AGENT_FORK_ENABLED not set"
+  fail_case "CLAUDE_CODE_FORK_SUBAGENT not set"
 fi
 # 63. research-synthesizer agent has model: haiku frontmatter

package/tests/lib.test.sh CHANGED Viewed

@@ -507,7 +507,7 @@ TMP=$(mktmp)
 mkdir -p "$TMP/home/.claude/bin" "$TMP/home/.claude/hooks" "$TMP/home/.claude/knowledge/daily-log" "$TMP/home/.claude/qualia-design" "$TMP/home/.claude/agents" "$TMP/home/.claude/qualia-templates" "$TMP/project"
 echo '{"installed_by":"Test","role":"OWNER","version":"6.3.0","erp":{"enabled":false}}' > "$TMP/home/.claude/.qualia-config.json"
 touch "$TMP/home/.claude/CLAUDE.md" "$TMP/home/.claude/settings.json"
-for f in runtime-manifest.js command-surface.js host-adapters.js state.js qualia-ui.js statusline.js knowledge.js knowledge-flush.js state-ledger.js plan-contract.js contract-runner.js harness-eval.js trust-score.js agent-runs.js slop-detect.mjs erp-retry.js work-packet.js report-payload.js project-snapshot.js codex-goal.js planning-hygiene.js prune-deprecated.js learning-candidates.js status-snapshot.js security-scan.js; do
+for f in runtime-manifest.js command-surface.js host-adapters.js state.js qualia-ui.js statusline.js knowledge.js knowledge-flush.js state-ledger.js plan-contract.js contract-runner.js harness-eval.js trust-score.js agent-runs.js slop-detect.mjs erp-retry.js work-packet.js report-payload.js project-snapshot.js codex-goal.js planning-hygiene.js prune-deprecated.js learning-candidates.js status-snapshot.js security-scan.js auto-report.js; do
   touch "$TMP/home/.claude/bin/$f"
 done
 for h in session-start.js auto-update.js branch-guard.js pre-push.js pre-deploy-gate.js migration-guard.js git-guardrails.js stop-session-log.js fawzi-approval-guard.js vercel-account-guard.js env-empty-guard.js supabase-destructive-guard.js; do
@@ -622,7 +622,7 @@ TMP=$(mktmp)
 mkdir -p "$TMP/.claude/bin" "$TMP/.claude/hooks" "$TMP/.claude/knowledge/daily-log" "$TMP/.claude/qualia-design" "$TMP/.claude/agents" "$TMP/.claude/qualia-templates" "$TMP/project/.planning"
 echo '{"installed_by":"Test","role":"OWNER","erp":{"enabled":false}}' > "$TMP/.claude/.qualia-config.json"
 touch "$TMP/.claude/CLAUDE.md" "$TMP/.claude/settings.json"
-for f in runtime-manifest.js command-surface.js host-adapters.js state.js qualia-ui.js statusline.js knowledge.js knowledge-flush.js state-ledger.js plan-contract.js contract-runner.js harness-eval.js trust-score.js agent-runs.js slop-detect.mjs erp-retry.js work-packet.js report-payload.js project-snapshot.js codex-goal.js planning-hygiene.js prune-deprecated.js learning-candidates.js status-snapshot.js security-scan.js; do
+for f in runtime-manifest.js command-surface.js host-adapters.js state.js qualia-ui.js statusline.js knowledge.js knowledge-flush.js state-ledger.js plan-contract.js contract-runner.js harness-eval.js trust-score.js agent-runs.js slop-detect.mjs erp-retry.js work-packet.js report-payload.js project-snapshot.js codex-goal.js planning-hygiene.js prune-deprecated.js learning-candidates.js status-snapshot.js security-scan.js auto-report.js; do
   touch "$TMP/.claude/bin/$f"
 done
 for h in session-start.js auto-update.js branch-guard.js pre-push.js pre-deploy-gate.js migration-guard.js git-guardrails.js stop-session-log.js fawzi-approval-guard.js vercel-account-guard.js env-empty-guard.js supabase-destructive-guard.js; do