npm - start-vibing - Versions diffs - 4.3.0 → 4.3.2 - Mend

start-vibing 4.3.0 → 4.3.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (22) hide show

package/package.json CHANGED Viewed

@@ -1,7 +1,7 @@
 {
 	"name": "start-vibing",
-	"version": "4.3.0",
-	"description": "Setup Claude Code with 9 plugins, 6 community skills, and 8 MCP servers. Parallel install, auto-accept, superpowers + ralph-loop. super-design 0.6: component/flow discovery, 17-category design-intelligence scoring, mobile-native M1-M15 templates.",
+	"version": "4.3.2",
+	"description": "Setup Claude Code with 9 plugins, 6 community skills, and 8 MCP servers. Parallel install, auto-accept, superpowers + ralph-loop. super-design 0.6.2: canonical audit-state.schema.json + verify --strict, per-viewport {sha256,phash,png_path} hashes with sharp/fpr fallback + MASK_SELECTORS, visual-regression.sh (pixelmatch→odiff→sha256-fallback), DTCG *.tokens.json + Tokens Studio aliases, @fixture-<id> dynamic routes + madge import-graph N=3, per-app monorepo state (pnpm/npm/yarn/Bun/Nx/Turbo).",
 	"type": "module",
 	"bin": {
 		"start-vibing": "./dist/cli.js"

package/template/.claude/agents/sd-audit.md CHANGED Viewed

@@ -150,6 +150,22 @@ Phase E — Form state coverage
   all valid, simulated 500, offline, paste into password, autocomplete
   tokens, Tab order vs visual order, Enter submits, mobile input zoom
   (font-size < 16px on iOS Safari).
+  Postel's Law robustness check (artifact Part 1 law table; "be liberal in
+  what you accept, conservative in what you send"). Per text/tel/email/date
+  input, verify the field is liberal on input:
+    - trims leading/trailing whitespace before validation;
+    - accepts the common format variants users actually type (phone:
+      "+55 (11) 9 9999-9999", "5511999999999", "11999999999"; date:
+      "2026-04-19", "19/04/2026", "Apr 19 2026"; email: case-insensitive
+      local-part where the provider allows it);
+    - accepts pasted values with mixed whitespace / soft hyphens / unicode
+      thin spaces without rejecting.
+  And strict on output: the value submitted to the backend and the value
+  re-rendered to the user are canonicalized (E.164 phone, ISO-8601 date,
+  trimmed). Record any field that rejects a legitimate variant that a
+  reasonable user would type as finding code `form-postel-<slug>`
+  (severity MEDIUM unless blocking primary conversion → HIGH).
 ```
 **Budget rule:** On large apps, cap to top 5 triggers per page (ranked by
@@ -172,11 +188,79 @@ For each of 10 heuristics (methodology §1), work audit questions. Score 0–4 v
 ### 3c. WCAG 2.2 AA manual pass
 Items NOT covered by axe (methodology §2.3): keyboard traps, focus-order-matches-visual-order, `:focus-visible` quality, reflow at 320px, text-spacing override, `prefers-reduced-motion`, alt text quality, link/button text adequacy.
+**WCAG 2.2 new Success Criteria — explicit checks (finding code prefix `a11y-wcag22-<sc>`):**
+- **2.4.11 Focus Not Obscured (Minimum) — AA** → `a11y-wcag22-2.4.11`. Tab/Shift+Tab through every page at all 3 viewports; verify every focused control is at least partly visible. Common fail: sticky headers/footers (`position:fixed`) covering focused links/inputs. Fix pattern: `html { scroll-padding-top: <header-h>; scroll-padding-bottom: <footer-h>; }`.
+- **2.5.7 Dragging Movements — AA** → `a11y-wcag22-2.5.7`. Enumerate every `draggable="true"`, drag-to-reorder list, range slider, kanban column, canvas drag. Each must expose a single-pointer non-dragging alternative (up/down buttons, ± steppers, numeric input, menu action). Keyboard-only is NOT sufficient (touch-only users).
+- **3.2.6 Consistent Help — A** → `a11y-wcag22-3.2.6`. If help mechanisms exist (contact, chat, self-help link, support email), verify they appear in the same relative DOM order across every page they occur on. Record snapshot quote per page, diff order, file finding if inconsistent.
+- **3.3.7 Redundant Entry — A** → `a11y-wcag22-3.3.7`. In any multi-step process (checkout, onboarding, registration), verify information previously entered is auto-populated or available for selection (e.g., "billing same as shipping" prefilled). Exceptions: essential re-entry (password confirmation), security-related, stale data. Browser autocomplete does not satisfy — the site must provide the value.
+- **3.3.8 Accessible Authentication (Minimum) — AA** → `a11y-wcag22-3.3.8`. On every auth surface (login, re-auth, 2FA, password reset), confirm no cognitive-function test (memorize password, transcribe OTP, puzzle CAPTCHA) is required unless an alternative exists (passkey, magic link), a mechanism helps (paste allowed + `autocomplete="username | current-password | one-time-code"`), or an object/personal-content exception applies. Fail pattern: `onpaste="return false"` or `autocomplete="off"` on password.
+- **3.3.9 Accessible Authentication (Enhanced) — AAA** → `a11y-wcag22-3.3.9` (advisory only, not required for AA audits). Flag as an advisory finding when AA passes only via the object-recognition or personal-content exception (e.g., "select all crosswalks" CAPTCHA). Passkeys / WebAuthn / biometrics / magic links clear this bar.
 ### 3d. Baymard (if e-commerce detected)
 If `package.json` has stripe/shopify/medusajs/saleor OR routes include /checkout /cart /products: checkout-flow + form-design + filter + PDP checklist (methodology §3).
+### 3e.0 Phase 0 — CrUX field data (MUST run before 3e synthetic lab audit)
+Lab numbers (Lighthouse / Playwright / web-vitals injected at audit time)
+are deterministic but reflect a single throttled machine. Google ranks on
+**field** data — Chrome User Experience Report (CrUX), 28-day p75 over real
+users. A site can score 95 in the lab and "Poor" in field due to device
+diversity. Field is authoritative; lab is indicative only.
+Before the synthetic pass in 3e, fetch CrUX for the origin (and for each
+templated page type if a key is configured):
+```bash
+# CrUX API (Chrome UX Report) — requires $CRUX_KEY or PageSpeed Insights key
+curl -s "https://chromeuxreport.googleapis.com/v1/records:queryOrigin?key=$CRUX_KEY" \
+  -H 'Content-Type: application/json' \
+  -d "{\"origin\":\"<site-origin>\",\"formFactor\":\"PHONE\"}" \
+  > "$SESSION_DIR/vitals/crux_origin_mobile.json"
+curl -s "https://chromeuxreport.googleapis.com/v1/records:queryOrigin?key=$CRUX_KEY" \
+  -H 'Content-Type: application/json' \
+  -d "{\"origin\":\"<site-origin>\",\"formFactor\":\"DESKTOP\"}" \
+  > "$SESSION_DIR/vitals/crux_origin_desktop.json"
+# Optional: per-URL record (only if URL has sufficient traffic)
+curl -s "https://chromeuxreport.googleapis.com/v1/records:queryRecord?key=$CRUX_KEY" \
+  -H 'Content-Type: application/json' \
+  -d "{\"url\":\"<full-url>\",\"formFactor\":\"PHONE\"}" \
+  > "$SESSION_DIR/vitals/crux_<slug>_mobile.json"
+```
+Capture field `p75` for LCP / INP / CLS (and FCP / TTFB when present).
+Outcomes:
+- **CrUX present + sufficient traffic** → field values are the verdict; lab
+  values annotate drill-down only.
+- **CrUX absent or insufficient traffic** → record the gap, fall back to
+  lab, and tag every performance finding as `source: "lab"`.
 ### 3e. Core Web Vitals
-Parse `session_dir/vitals/<page>.json`. LCP/INP/CLS/FCP/TTFB/TBT against thresholds (methodology §4). Doherty: interactions <400ms feedback.
+Parse `session_dir/vitals/<page>.json` (lab) AND `crux_*_mobile.json` /
+`crux_*_desktop.json` (field). LCP/INP/CLS/FCP/TTFB/TBT against thresholds
+(methodology §4). Doherty: interactions <400ms feedback.
+**Tag every performance finding with a `source` field:**
+```json
+{
+  "rule": "cwv-lcp",
+  "source": "lab" | "field" | "both",
+  "lab_value_ms": 3200,
+  "field_p75_ms": 4100,
+  "field_sample": "CrUX 28-day p75, PHONE",
+  "verdict": "needs-improvement"
+}
+```
+- `source: "both"` when lab + CrUX agree → highest confidence; proceed to fix.
+- `source: "field"` when CrUX fails but lab passes → real users hit it; still real.
+- `source: "lab"` when CrUX is absent / insufficient → note the gap and
+  surface as "unverified by field data" in the executive summary.
+- If lab and field disagree by > 30%, file a meta-finding
+  (`rule: perf-lab-field-divergence`) with both numbers and `source: "both"`.
 ### 3f. Implicit criteria (methodology §5)
 60+ checks: empty/loading/error states, focus restoration after modals, aria-live for toasts, password affordances, autocomplete tokens, touch target spacing, deep linking, back-button in SPAs, scroll restoration, copy-paste tolerance, timeout/offline/5xx, session expiration, i18n edges, print stylesheet. pass/fail/n-a with evidence.
@@ -265,6 +349,37 @@ mobile-native. Run the 21-item checklist verbatim against each mobile page:
 Each failed item → finding with `rule: mobile-pattern-M<N>`, evidence from
 Step 2.5 artifacts (NOT a fresh snapshot), `template_id: M<N>`.
+**Real-device vs emulation disclaimer (MANDATORY).** Playwright MCP drives
+Blink/Chromium in a resized viewport — it is NOT real iOS Safari (WebKit),
+Android Chrome on a low-end device, or any in-app WebView. Emulation can
+confirm layout, DOM, a11y tree, tab order, reduced-motion / forced-colors,
+computed CSS. It CANNOT confirm touch haptics, iOS safe-area rendering,
+iOS Safari font rasterization, PWA install/add-to-home-screen, iOS keyboard
+overlap via `visualViewport`, viewport-zoom quirks under pinch, Samsung
+Internet auto-dark, real Pointer Event latency, or hover-only fallbacks on
+real touch (iOS "sticky hover"). See methodology §9 for the full list.
+Any mobile finding whose verdict would require iOS Safari or Android Chrome
+on a real device to confirm — touch haptics, iOS safe-area insets, PWA
+install, pinch-zoom quirks, `@media (hover: hover)` behavior on real touch,
+payment sheet (Apple Pay / Google Pay), biometrics, push — MUST be tagged
+in the finding JSON as:
+```json
+{
+  "category": "real-device-required",
+  "real_device_required": true,
+  "emulation_verdict": "likely_fail | likely_pass | indeterminate",
+  "requires": ["ios-safari", "android-chrome"],
+  "rationale": "Playwright runs Blink; iOS Safari uses WebKit; cannot confirm X on emulation."
+}
+```
+`sd-synthesis` MUST surface a "real-device verification needed" banner at
+the top of the executive summary listing every finding where
+`real_device_required=true`, grouped by `requires` platform, so the human
+reviewer books a BrowserStack / Sauce / LambdaTest session before sign-off.
 Cross-reference the competitor component vocabulary from
 `.cache/evidence/component-comparison.md` — if every competitor uses bottom
 tabs on mobile and the product uses hamburger-only, density score drops AND
@@ -328,7 +443,11 @@ this to produce the executive DIS summary.
     document.head.appendChild(s);
   });
   window.__axe = await window.axe.run(document, {
-    runOnly: { type: 'tag', values: ['wcag2a','wcag2aa','wcag21a','wcag21aa','wcag22aa','best-practice'] }
+    runOnly: { type: 'tag', values: ['wcag2a','wcag2aa','wcag21a','wcag21aa','wcag22aa','best-practice'] },
+    // WCAG 2.2 rules (e.g., focus-not-obscured) ship under axe-core's
+    // experimental flag — without this, SC 2.4.11 / 2.5.7 / 2.5.8 etc.
+    // simply will NOT execute. Always enable for super-design audits.
+    experimental: true
   });
 })();
 ```

package/template/.claude/agents/sd-fix.md CHANGED Viewed

@@ -195,6 +195,17 @@ Source of truth: `references/fix-agent-playbook.md` §7.
 - Swap color used in >5 files → needs_human (too broad for single fix)
 - Convert design token itself → MEDIUM, escalate
+**Color-space rule (V4 and any new token):** When emitting new color tokens
+(V4 snap-to-nearest and any fresh tokens proposed by V-templates), express
+them in **OKLCH** — the perceptually uniform color space used by modern
+design systems (Tailwind v4, shadcn 2024+, Radix Colors). Hex / RGB are
+accepted ONLY when they match the existing codebase convention (e.g., the
+project's `tokens.css` / `globals.css` already defines all colors as hex).
+Mixing OKLCH tokens into a hex-only codebase requires a separate
+token-migration finding and is `needs_human`. Format:
+`--color-primary-500: oklch(0.65 0.20 265);` (lightness 0-1, chroma 0+,
+hue 0-360).
 ## ux templates (U1–U10)
 | ID | Fix |

package/template/.claude/agents/sd-research.md CHANGED Viewed

@@ -22,8 +22,29 @@ Output: exactly one file `docs/super-design/market-analysis.md` + evidence under
 3. **Detect niche.** Apply 8-signal scoring (playbook §1). Confidence = top / (top + second). If <0.55, use AskUserQuestion with 3 options from top verticals. Record reasoning to `.cache/evidence/niche.md`.
+   **Regulated-niche always-confirm rule.** Regulated niches: compliance-driven design choices override aesthetic preference, so always confirm. If the detected niche falls into any of the following — **fintech, healthtech, legaltech, gambling, crypto, insurance, children's-app** — ALWAYS fire `AskUserQuestion` to confirm niche + regulatory scope even when detector confidence is ≥0.95. These niches carry compliance implications (SOC2, HIPAA, PCI-DSS, GDPR, PSD2, COPPA, KYC/AML, age-gating, disclosure-mandated copy) that design directly affects — getting the niche wrong wastes the audit. Record the confirmation (selected scope, applicable regulations) to `.cache/evidence/niche.md` under `regulatory_scope:`.
 4. **Discover competitors.** 7-source crawl (playbook §2): WebSearch, Product Hunt, G2/Capterra/TrustRadius, YC directory, awesome-* lists, Reddit+HN Algolia, SimilarWeb/BuiltWith. Dedupe by domain. Rank fame × similarity × design-signal. Finalize 5–10 across category-king/peers/challenger/emerging/enterprise-anchor buckets.
+   **4a. Neumeier insertion test during discovery (per candidate).** For every candidate competitor considered for the final 5–10, apply Neumeier's insertion test (playbook §5.4): *"If this competitor's brand mark were swapped with the project's, would users notice?"* Score each on a 0–5 scale:
+   | Score | Meaning |
+   |---|---|
+   | 0 | Fully swappable — no brand equity, pure commodity visual language |
+   | 1 | Mostly swappable — generic category codes only |
+   | 2–3 | Partially distinct — some ownable elements but weak |
+   | 4 | Strong distinct identity — clear ownable signals |
+   | 5 | Instantly distinct — singular, unmistakable brand mark |
+   Competitors scoring ≤1 are **commodity benchmarks** (show what the category looks like by default); competitors scoring ≥4 **reveal defensible territory** (show what ownable positioning looks like). Include a healthy mix of both. Record the score and one-line justification per competitor in `market-analysis.md` (competitor table) and the per-competitor row in `.cache/evidence/<slug>/component-catalog.md` under a new `Insertion-test score:` field.
+   **4b. Vibe-quadrant final gate (self-check before step 5).** Before moving to step 5, plot the project draft position and each finalized competitor on a 2-axis vibe quadrant:
+   - **X axis:** serious ↔ playful
+   - **Y axis:** minimal ↔ expressive
+   If the project lands in the **same quadrant as ≥3 competitors**, surface a warning in `market-analysis.md` under a `## Positioning risk` section — exact text: `crowded quadrant — positioning risk` — and recommend **one axis shift** (per Kapferer prism §4.3 / Aaker dimensions §4.2) that would move the project into a less-occupied quadrant. **Do not auto-decide the shift**; document the warning and the recommended axis for synthesis (step 8) to reconcile with the user. Save the quadrant plot data (project + competitor coordinates) to `.cache/evidence/vibe-quadrant.md`.
 5. **Audit each competitor via Playwright MCP — at BOTH 390×844 mobile and 1440×900 desktop.** Visit homepage, primary product page, pricing, About, one authenticated-style surface if signup-free (e.g., docs, app tour). Per playbook §3 PLUS component-level extraction per §3bis below. Save to `.cache/evidence/<slug>/<viewport>/`.
 ### §3bis. Component-level extraction (mandatory, not optional)
@@ -97,11 +118,37 @@ and sd-fix use to recommend aesthetic direction.
 6. **Classify each.** Archetype (§4.1), Aaker peak (§4.2), vibe class, NN/g 4D tone (§7.1), hero-pattern.
+   **6a. Voice/tone capture — 8–12 copy sample rule (mandatory).** For each competitor, collect **8–12 distinct copy samples** (≥8 minimum; fewer = insufficient signal), one per surface where available:
+   - Hero headline
+   - Primary CTA label
+   - Error message
+   - Empty state
+   - 404 page
+   - Onboarding step 1
+   - Pricing caption / plan blurb
+   - Footer blurb
+   - ToS / legal excerpt
+   - (Optional extras: subhead, feature card, support article opener, confirmation toast)
+   Grade **each sample** on the NN/g 4D tone dimensions (playbook §7.1) using integers in {−1, 0, +1}:
+   - formal ↔ casual
+   - funny ↔ serious
+   - respectful ↔ irreverent
+   - enthusiastic ↔ matter-of-fact
+   Report per-sample scores + verbatim quote + source URL in `.cache/evidence/<slug>/copy-samples.md`, and the **mean + variance** per axis in `market-analysis.md` (tone row per competitor). Healthy brands are constant on voice, variable on tone.
+   **Insufficient-signal rule.** If fewer than 8 distinct samples can be collected (static site, gated app, locale blockers), **do not compute a tone profile** — flag the competitor as `tone-inconclusive` in `market-analysis.md` with a note listing which surfaces were missing. Never fabricate samples or scores to reach the threshold.
 7. **Build category-code matrix.** Tabulate dimensions (§5.1). Frequency per column. Classify codes obey/extend/subvert/open (§5.2).
-8. **Synthesize.** Archetype in whitespace via Neumeier insertion test (§5.4). Palette, typography, tone, audience, JTBD. Draft onliness statement.
+8. **Synthesize.** Archetype in whitespace via Neumeier insertion test (§5.4). Palette, typography, tone, audience, JTBD.
+8b. **Three-territories pitch (Q7 — Part 7 of `docs/compass_artifact_wf-2e33af6e-127f-402e-8ce6-cb506fc91b94_text_markdown.md` lines 515–519, 652–653).** Before drafting the onliness statement, produce THREE parallel variants of the design direction — **safe** (conforms to category codes), **expected** (the obvious evolution of category codes), **edgy** (the considered provocation / subversion). Each variant MUST include: palette strip (3–6 tokens), type specimen (primary + optional display), motion character (duration + easing archetype), one-line rationale tying back to archetype + category-code matrix from step 7. Build them in parallel — never serialize, or you will anchor to the first. Save to `.cache/evidence/territories/{safe,expected,edgy}.md` and include a summary table in the brief. The user chooses the primary territory (optionally stealing one detail from another) BEFORE the onliness statement lands. "Presenting one direction looks like opinion; presenting three looks like strategy" (artifact line 519).
-9. **Write `market-analysis.md`** per playbook §8 schema.
+9. **Draft onliness statement** against the chosen territory, then **write `market-analysis.md`** per playbook §8 schema (include the three-territories summary + chosen primary).
 10. **Self-check.** Fix gaps before returning.

package/template/.claude/skills/super-design/.schema-version ADDED Viewed

	@@ -0,0 +1 @@
1	+ 1.0.0

package/template/.claude/skills/super-design/SKILL.md CHANGED Viewed

@@ -8,7 +8,7 @@ description: >
   UX audit (WCAG 2.2 AA, Nielsen heuristics, Baymard, CWV), and synthesized
   overview. Re-audits only what changed since last run. On explicit user request,
   applies surgical fixes with full rollback.
-version: 0.6.0
+version: 0.6.2
 ---
 # super-design
@@ -86,9 +86,14 @@ Pass findings via files under `.super-design/sessions/<id>/`, not chat.
 ### Step 4: Write state + history
-- Atomic write `.audit-state.json` (.tmp then rename).
+- Atomic write `.audit-state.json` via `scripts/write-state.sh` (takes JSON
+  on stdin, writes `.tmp`, validates with `jq`, then renames). Do NOT write
+  the state file directly.
 - Append session to `audit-history.md`.
 - `git notes --ref=super-design add -f -m <json> HEAD`.
+- First-time notes setup (run once per clone, also in `setup-git-notes.sh`
+  if you extract it): `git config --add remote.origin.fetch '+refs/notes/super-design/*:refs/notes/super-design/*'`
+  — without this, notes don't round-trip across clones (artifact §7).
 ### Step 5: Return summary (≤5 sentences)
@@ -100,9 +105,96 @@ Do NOT paste overview into chat.
 - `--refresh-research` — rerun sd-research
 - `--only <cat>` — a11y | design | ux | perf | research
 - `--scope <url>` — specific route
+- `--app <name>` — scope the entire run to one monorepo app (matches a
+  `name` entry from `scripts/detect-apps.sh`). Required when `--scope <url>`
+  is ambiguous between multiple apps.
 - `--fix` — run sd-fix after audit
 - `--dry-run` — artifacts without committing state
 - `--ci` — non-interactive, create PR, exit non-zero on blockers
+- `--update-baselines` — Re-hash pages and tokens without re-auditing (use after accepted cosmetic drift). Also accepted by `scripts/visual-regression.sh` to overwrite `.super-design/baselines/*.png` with the current capture.
+- `--visual-regression` — Run `scripts/visual-regression.sh` after hashing. Reads the `visual_regression` block from `.audit-state.json` (engine: pixelmatch | odiff | sha256-fallback; threshold 0.1; max_diff_pixel_ratio 0.01). See artifact §16.
+- `MASK_SELECTORS=<sel,sel,...>` (env) — Extra CSS selectors masked in every screenshot captured by `scripts/hash-pages.sh`. Artifact §3.4 defaults (`[data-timestamp], .relative-time, [data-react-hydration], video, canvas`) are always applied.
+## Monorepo support
+Audit state is per-app (artifact §11 line 902) so independent deploys
+carry independent freshness, `git_sha_at_audit`, and tool results. Layout
+is auto-detected; nothing else to configure.
+### Detection
+`scripts/detect-apps.sh` reads the first workspace manifest it finds:
+| Manifest | Source of globs |
+|----------|-----------------|
+| `pnpm-workspace.yaml` | `packages:` list |
+| `package.json` | `workspaces: [...]` or `workspaces.packages: [...]` (npm, yarn, Bun) |
+| `turbo.json` | Presence → uses pnpm/npm/yarn workspaces; falls back to `apps/*` + `packages/*` if none |
+| `nx.json` | `workspaceLayout.appsDir` / `libsDir` (default `apps/*`, `libs/*`) |
+| `bunfig.toml` | Presence → falls back to `apps/*` + `packages/*` if package.json has no workspaces |
+Each matched directory that also has a `package.json` becomes an app
+with `name` taken from `package.json#name` (scope stripped), `path` the
+directory, and `state_path` = `<path>/docs/super-design/.audit-state.json`.
+If nothing matches, `detect-apps.sh` emits a `single` layout with
+`path: "."` and the repo-root state path — preserving existing single-app
+behavior.
+### Per-app pipeline
+- **Preflight**: per app, read `<app>/docs/super-design/.audit-state.json`
+  via `validate-state.sh <app_path>`.
+- **Change detection**: `scripts/detect-changes.sh --all-apps` loops over
+  every app and narrows `git diff` to `-- <app_path>/` so each app's
+  scope decision sees only its own files. Single-app shape is preserved
+  with `detect-changes.sh <last_sha>`.
+- **Write state**: `scripts/write-state.sh <app_path>` derives the target
+  path; for single-app repos pass `.` or omit.
+### URL → app disambiguation
+`--scope <url>` still targets one URL. When the URL maps cleanly to a
+single app (e.g. `apps/admin` serves `https://admin.example.com`), the
+pipeline picks that app automatically. When mapping is ambiguous
+(multiple apps serve overlapping hostnames, or URL patterns cross apps),
+the user MUST pass `--app <name>` — otherwise the skill aborts with a
+`{"error":"ambiguous-app","candidates":[...]}` verdict instead of
+guessing.
+## Scripts
+Reusable shell helpers under `scripts/`. All POSIX/bash, tested on
+Windows git-bash + Linux.
+- `discover-routes.sh` — emits `route_map` as a JSON array. Dynamic
+  segments (`[slug]`, `[[...all]]`, `$id`, `:uid`) are suffixed with
+  `@fixture-<id>` (artifact §2.7). Fixtures resolved from sibling
+  `*.fixture.json`, `fixtures/<name>.json`, or `$SUPER_DESIGN_FIXTURES`
+  env JSON; falls back to `@fixture-default` with a warning. Consumers
+  (hash-pages, sd-audit) MUST strip the suffix before navigating.
+- `build-import-graph.sh` — builds `.super-design/import-graph.json`
+  (`{nodes, edges, hash, backend}`) and persists `import_graph_sha` to
+  state. Prefers `npx madge --json <roots>`; falls back to a regex
+  scanner (JS/TS only, no alias resolution) if madge is missing.
+  - Query: `bash .../build-import-graph.sh importers <file> --hops 3`
+    → BFS over reversed edges; `detect-changes.sh` uses this to close
+    the component→pages gap when only components changed (Step 2 scope
+    decision: "Only components changed → re-audit pages importing them
+    (N=3 hops via madge)").
+- `hash-pages.sh` — captures 3 viewports per URL (mobile_375, tablet_768,
+  desktop_1280), emits `{html_hash, dom_structure_hash, viewport_hashes:
+  {<vp>: {sha256, phash, png_path}}}` per page to
+  `docs/super-design/.cache/hashes/hashes.json` and persists each PNG to
+  `<cache>/screenshots/<url-enc>/<vp>.png`. Applies artifact §3.4 mask
+  defaults plus `MASK_SELECTORS`; `phash` uses `sharp` when available
+  (tagged `phash:`) or a deterministic PNG fingerprint otherwise
+  (tagged `fpr:`, only useful for exact-match comparison).
+- `visual-regression.sh [--update-baselines] [<state>]` — reads the
+  `visual_regression` block from `.audit-state.json` and diffs current
+  screenshots against `.super-design/baselines/`. Engine chain:
+  `pixelmatch` → `odiff` → `sha256-fallback`. Emits
+  `{page, viewport, diff_ratio, threshold, pass, diff_image_path}` to
+  `<diff_dir>/results.json`. Exits non-zero if any page fails.
 ## References (Read on demand)

package/template/.claude/skills/super-design/audit-state.schema.json ADDED Viewed

@@ -0,0 +1,226 @@
+{
+  "$schema": "http://json-schema.org/draft-07/schema#",
+  "$id": "https://hakutaku.ai/schemas/super-design/audit-state.schema.json",
+  "title": "super-design audit state",
+  "description": "Canonical schema for docs/super-design/.audit-state.json — the state file that super-design reads on every run to decide what to re-audit. Derived from docs/compass_artifact_wf-f52f98c2-73c9-4483-b49c-6b080ef7dc92_text_markdown.md sections 5, 10, and 16.",
+  "type": "object",
+  "additionalProperties": true,
+  "required": [
+    "schema_version",
+    "skill_version",
+    "last_audit_at",
+    "git_sha_at_audit"
+  ],
+  "properties": {
+    "schema_version": {
+      "type": "string",
+      "description": "Semver string for this state file's schema. Major bumps force a full re-audit (see §12 line 934, §10).",
+      "pattern": "^\\d+\\.\\d+\\.\\d+(?:[-+].+)?$"
+    },
+    "skill_version": {
+      "type": "string",
+      "description": "Semver of the super-design skill that wrote this state.",
+      "pattern": "^\\d+\\.\\d+\\.\\d+(?:[-+].+)?$"
+    },
+    "last_audit_at": {
+      "type": "string",
+      "description": "ISO-8601 timestamp of the last completed audit (used for freshness cascade: 90d soft, 180d hard).",
+      "format": "date-time"
+    },
+    "git_sha_at_audit": {
+      "type": "string",
+      "description": "Git SHA (7–64 hex chars) that HEAD pointed to at audit time. Used as the range-start for next audit's git log (§2.1).",
+      "pattern": "^[0-9a-f]{7,64}$"
+    },
+    "git_branch": {
+      "type": "string",
+      "description": "Branch name at audit time (informational)."
+    },
+    "is_shallow_clone": {
+      "type": "boolean",
+      "description": "True if the repo was a shallow clone at audit time. Signals that __MISSING__ fallback (§6) may need git fetch --unshallow."
+    },
+    "theory_doc_sha": {
+      "type": "string",
+      "description": "sha256: of references/design-theory.md. Mismatch forces full re-audit (§9.1).",
+      "pattern": "^sha256:[0-9a-f]{4,64}$"
+    },
+    "market_analysis_sha": {
+      "type": "string",
+      "description": "sha256: of the current market-analysis.md output.",
+      "pattern": "^sha256:[0-9a-f]{4,64}$"
+    },
+    "tools": {
+      "type": "object",
+      "description": "Versions of the audit toolchain at audit time. Major bumps can invalidate prior findings.",
+      "additionalProperties": true,
+      "properties": {
+        "axe-core":       { "type": "string" },
+        "lighthouse":     { "type": "string" },
+        "playwright":     { "type": "string" },
+        "playwright-mcp": { "type": "string" },
+        "pa11y":          { "type": "string" }
+      }
+    },
+    "framework": {
+      "type": "object",
+      "description": "Detected framework info. Router/version migration triggers full re-audit (§7).",
+      "additionalProperties": true,
+      "properties": {
+        "name":    { "type": "string" },
+        "router":  { "type": "string" },
+        "version": { "type": "string" }
+      },
+      "required": ["name"]
+    },
+    "route_map": {
+      "type": "array",
+      "description": "Canonical route list at audit time. Diff against prior route_map produces new/deleted routes (§7).",
+      "items": { "type": "string" }
+    },
+    "pages_audited": {
+      "type": "array",
+      "description": "Per-page hashes + findings (§3, §5). Incremental audits compare these to decide which pages to re-visit.",
+      "items": {
+        "type": "object",
+        "additionalProperties": true,
+        "required": ["url"],
+        "properties": {
+          "url":                  { "type": "string" },
+          "route_file":           { "type": "string" },
+          "html_hash":            { "type": "string", "pattern": "^sha256:[0-9a-f]{4,64}$" },
+          "dom_structure_hash":   { "type": "string", "pattern": "^sha256:[0-9a-f]{4,64}$" },
+          "screenshot_hash":      { "type": "string", "pattern": "^(sha256|phash):[0-9a-f]{4,64}$" },
+          "viewport_hashes": {
+            "type": "object",
+            "description": "Per-viewport hashes (mobile_375/tablet_768/desktop_1280). Each entry may be a bare phash string (back-compat with artifact §10 line 492) or the extended {sha256,phash,png_path} object emitted by hash-pages.sh. Any viewport drift -> re-audit that page.",
+            "additionalProperties": {
+              "oneOf": [
+                { "type": "string" },
+                {
+                  "type": "object",
+                  "properties": {
+                    "sha256":   { "type": "string" },
+                    "phash":    { "type": "string" },
+                    "png_path": { "type": "string" }
+                  }
+                }
+              ]
+            }
+          },
+          "mask_selectors": {
+            "type": "array",
+            "description": "CSS selectors masked in this page's screenshots (artifact §3.4). Merged from hash-pages.sh defaults + MASK_SELECTORS env.",
+            "items": { "type": "string" }
+          },
+          "phash_engine": {
+            "type": "string",
+            "description": "Which engine produced the phash values. `phash:` = sharp-based aHash. `fpr:` = zero-dep PNG fingerprint fallback (exact-match only)."
+          },
+          "last_audited": { "type": "string", "format": "date-time" },
+          "findings_ids": {
+            "type": "array",
+            "items": { "type": "string" }
+          }
+        }
+      }
+    },
+    "components": {
+      "type": "object",
+      "description": "Map of component path -> xxh3 hash (§8). Source of truth for component-level change detection.",
+      "additionalProperties": { "type": "string" }
+    },
+    "token_hash": {
+      "type": "string",
+      "description": "sha256: hash of the canonicalized design-token map. Mismatch invalidates every page (§9).",
+      "pattern": "^sha256:[0-9a-f]{4,64}$"
+    },
+    "import_graph_sha": {
+      "type": "string",
+      "description": "sha256: of the serialized import graph (madge). Used to compute blast radius of component changes.",
+      "pattern": "^sha256:[0-9a-f]{4,64}$"
+    },
+    "findings_counts": {
+      "type": "object",
+      "description": "Aggregate tally after last audit (§5).",
+      "additionalProperties": false,
+      "properties": {
+        "blockers": { "type": "integer", "minimum": 0 },
+        "high":     { "type": "integer", "minimum": 0 },
+        "medium":   { "type": "integer", "minimum": 0 },
+        "nitpicks": { "type": "integer", "minimum": 0 }
+      }
+    },
+    "research_at": {
+      "type": "string",
+      "description": "ISO-8601 timestamp of the last sd-research run. Used to decide whether to rerun research (>90d -> refresh).",
+      "format": "date-time"
+    },
+    "ignored_paths": {
+      "type": "array",
+      "description": "Globs excluded from design-relevance classification (§2.3).",
+      "items": { "type": "string" }
+    },
+    "visual_regression": {
+      "type": "object",
+      "description": "Optional visual-regression config (§16). Absent when user hasn't opted in. Consumed by scripts/visual-regression.sh.",
+      "additionalProperties": true,
+      "properties": {
+        "enabled":              { "type": "boolean", "default": false },
+        "engine": {
+          "type": "string",
+          "description": "Engine chain: scripts/visual-regression.sh tries in order and falls back. `sha256-fallback` = exact-match only, always available.",
+          "enum": ["pixelmatch", "odiff", "resemble", "looks-same", "playwright", "sha256-fallback"],
+          "default": "pixelmatch"
+        },
+        "threshold":            { "type": "number", "minimum": 0, "default": 0.1 },
+        "max_diff_pixel_ratio": { "type": "number", "minimum": 0, "maximum": 1, "default": 0.01 },
+        "antialiasing":         { "type": "boolean", "default": true },
+        "viewports": {
+          "type": "array",
+          "items": {
+            "type": "object",
+            "required": ["label", "width", "height"],
+            "properties": {
+              "label":  { "type": "string" },
+              "width":  { "type": "integer", "minimum": 1 },
+              "height": { "type": "integer", "minimum": 1 }
+            }
+          }
+        },
+        "mask_selectors": {
+          "type": "array",
+          "items": { "type": "string" }
+        },
+        "baseline_dir": {
+          "type": "string",
+          "description": "Where accepted baseline PNGs live. Committed to git (small repos) or stored via LFS/artifacts (large).",
+          "default": ".super-design/baselines"
+        },
+        "current_dir": {
+          "type": "string",
+          "description": "Where hash-pages.sh writes the current run's PNGs. Usually gitignored.",
+          "default": "docs/super-design/.cache/hashes/screenshots"
+        },
+        "diff_dir": {
+          "type": "string",
+          "description": "Where diff images + results.json are emitted by visual-regression.sh.",
+          "default": "docs/super-design/.cache/hashes/diffs"
+        },
+        "docker_image": { "type": "string" }
+      }
+    }
+  }
+}