start-vibing 4.2.0 → 4.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -8,7 +8,7 @@ description: >
  UX audit (WCAG 2.2 AA, Nielsen heuristics, Baymard, CWV), and synthesized
  overview. Re-audits only what changed since last run. On explicit user request,
  applies surgical fixes with full rollback.
- version: 0.5.0
+ version: 0.6.0
  ---

  # super-design
@@ -18,16 +18,34 @@ version: 0.5.0
  Four-phase pipeline with 6 specialist agents:

  1. **Market research** (sd-research) — auto-detects niche from repo, finds 5–10
- competitors, extracts design language, produces market-analysis.md.
- 2. **UI/UX audit** (sd-audit)drives browser via Playwright MCP directly,
- applies Nielsen's 10 heuristics, WCAG 2.2 AA, Baymard (if e-commerce), and
- Core Web Vitals. Produces findings.json with SHOT+QUOTE+SEL+VAL evidence.
- 3. **Synthesis** (sd-synthesis) — unifies research + audit into overview.md.
+ competitors, extracts design language AND component vocabulary (buttons,
+ nav, cards, modals, forms, tokens per competitor × mobile+desktop),
+ produces market-analysis.md + component-comparison.md.
+ 2. **UI/UX audit** (sd-audit) drives browser via Playwright MCP directly.
+ Five layers:
+ - Route discovery + static snap (Nielsen + WCAG 2.2 AA + Baymard + CWV)
+ - **Step 2.5 component/modal/flow discovery** (Phase A inventory, B modal
+ enumeration, C flow exercising, D state matrix, E form coverage) — this
+ is where modal contents, empty/loading/error states, and flow errors
+ get real evidence instead of remaining a "checklist hypothetical".
+ - **Step 3g design-intelligence scoring** (17-category rubric → DIS 0–100)
+ catches implicit best practices that checklists miss (cards-in-flex-col,
+ low density, weak CTA hierarchy, vibecode smell).
+ - **Step 3h mobile-native audit** (21-item Duolingo/Linear/Arc/Cash-App
+ checklist) — replaces "responsive-web-on-a-phone" thinking.
+ - C16 ≤ 4 → design-skill advisory finding citing the typeui-* selection matrix.
+ Produces findings.json + design-intelligence.json with SHOT+QUOTE+SEL+VAL.
+ 3. **Synthesis** (sd-synthesis) — unifies research + audit + design-intelligence
+ into overview.md (per-page DIS table + executive summary).
  4. **Fix** (sd-fix + two-stage verify) — optional. Applies safe fixes with
  technical gates (types/lint/tests) AND semantic verification ("does this
- fix actually resolve the finding, or just mask it?"). After each successful
- fix, re-drives Playwright to capture an after-screenshot (full page +
- element crop) and emits `docs/super-design/sessions/<id>/fix-report.md`:
+ fix actually resolve the finding, or just mask it?"). Template families:
+ A1-A15 a11y · V1-V8 design · U1-U10 ux · P1-P10 perf · **M1-M15 mobile**
+ (cards-in-flex-col compact list, table-on-mobile → card-per-row,
+ centered-modal → bottom-sheet, etc.) · **DSC-1 design-skill advisory**
+ (proposes typeui-* direction, never auto-applies — HIGH risk). After each
+ successful fix, re-drives Playwright to capture an after-screenshot (full
+ page + element crop) and emits `docs/super-design/sessions/<id>/fix-report.md`:
  a self-contained visual diff with before/after images, file diffs,
  verification status, and commit SHA per finding.

@@ -0,0 +1,258 @@
+ # Component, Modal & Flow Discovery Playbook
+
+ > How sd-audit systematically exercises EVERY interactive element before
+ > running heuristics. Without this, the audit only sees static page snaps
+ > and misses modal contents, flow state, hover/focus/active variants, and
+ > loading/empty/error states.
+ >
+ > This playbook runs as **Step 2.5** in sd-audit, after route discovery
+ > and before per-viewport heuristic passes.
+
+ ## Why this matters
+
+ Static screenshots of pages show ~30% of what users interact with. The other 70%
+ lives inside modals, drawers, dropdowns, command palettes, error flows, empty
+ states, loading states, hover menus, focus rings. A page can score 95/100 on
+ Lighthouse and be unusable because its "Create" modal is broken — and the
+ auditor never opened it.
+
+ ## Discovery phases
+
+ ### Phase A — Interactive inventory (per page × viewport)
+
+ After navigating and dismissing banners:
+
+ ```js
+ // browser_evaluate
+ (() => {
+   const roots = [
+     '[role="button"]',
+     'button',
+     'a[href]',
+     '[role="link"]',
+     '[role="menuitem"]',
+     '[role="tab"]',
+     '[role="switch"]',
+     '[role="checkbox"]',
+     '[role="radio"]',
+     '[aria-haspopup]',
+     '[aria-expanded]',
+     '[data-trigger]',
+     '[data-state]',
+     'input',
+     'select',
+     'textarea',
+     'summary',
+   ];
+   const items = [];
+   const seen = new Set(); // an element can match several selectors; record each element once
+   roots.forEach(sel => {
+     document.querySelectorAll(sel).forEach(el => {
+       if (seen.has(el)) return;
+       seen.add(el);
+       if (!el.offsetParent && getComputedStyle(el).position !== 'fixed') return;
+       const r = el.getBoundingClientRect();
+       if (r.width === 0 || r.height === 0) return;
+       items.push({
+         selector: sel,
+         tag: el.tagName,
+         role: el.getAttribute('role'),
+         name: el.getAttribute('aria-label') || el.textContent?.trim().slice(0, 60) || '',
+         type: el.getAttribute('type'),
+         haspopup: el.getAttribute('aria-haspopup'),
+         expanded: el.getAttribute('aria-expanded'),
+         disabled: el.disabled || el.getAttribute('aria-disabled') === 'true',
+         rect: { x: r.x, y: r.y, w: r.width, h: r.height },
+       });
+     });
+   });
+   return items;
+ })()
+ ```
+
+ Save to `.super-design/sessions/<id>/interactive/<slug>_<vp>.json`.
+
+ Classify each:
+ - **navigation** — links, tabs, back buttons
+ - **action** — primary CTAs, submit, delete, save
+ - **trigger** — opens modal/drawer/dropdown (`aria-haspopup`, `data-trigger`)
+ - **input** — form fields
+ - **state-toggle** — switches, checkboxes, expanders
+
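+ A minimal classification sketch, assuming the item shape produced by the inventory
+ snippet above (the check ordering and the fallback to `action` are judgment calls):
+
+ ```js
+ // Sketch: map a Phase A inventory item onto the five buckets above.
+ function classify(item) {
+   const role = (item.role || '').toLowerCase();
+   const tag = item.tag.toLowerCase();
+   // Anything that advertises a popup or expandable state is treated as a trigger first.
+   if (item.haspopup || item.expanded !== null || item.selector === '[data-trigger]') return 'trigger';
+   if (['switch', 'checkbox', 'radio'].includes(role) ||
+       ['checkbox', 'radio'].includes(item.type || '') || tag === 'summary') return 'state-toggle';
+   if (['input', 'select', 'textarea'].includes(tag)) return 'input';
+   if (tag === 'a' || ['link', 'tab', 'menuitem'].includes(role)) return 'navigation';
+   return 'action'; // buttons, submits, everything else
+ }
+ ```
+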
+ ### Phase B — Modal & overlay discovery
+
+ For each trigger from Phase A:
+
+ ```
+ 1. Pre-click snapshot (already have from Phase 1)
+ 2. browser_click({ ref })                    # click the trigger
+ 3. browser_wait_for(text="<expected modal content>") or 500ms
+ 4. browser_snapshot → save as snapshots/<slug>_<vp>_<triggerName>_open.yaml
+ 5. browser_take_screenshot fullPage + element-scoped → screens/components/
+ 6. browser_console_messages(level="error") → record
+ 7. Inside open modal, run Phase A again (nested inventory)
+ 8. Look for [role="dialog"] or [data-state="open"] to confirm it opened
+ 9. Exercise modal internals:
+    - Tab through to find focus trap
+    - Press Escape to confirm dismiss
+    - Resize to mobile — check if it becomes bottom-sheet
+ 10. Close modal (button or Escape)
+ 11. Re-snapshot → confirm background restored, focus returned to trigger
+ ```
+
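+ One way to implement the "did it actually open" check in step 8, plus the focus-trap
+ probe from step 9, is a small `browser_evaluate` call. A sketch, assuming Radix-style
+ `data-state="open"` markup (adjust the selector per framework):
+
+ ```js
+ // browser_evaluate: probe the overlay that should now be open (steps 8 and 9).
+ (() => {
+   const overlay = document.querySelector(
+     '[role="dialog"], [role="alertdialog"], [data-state="open"]');
+   if (!overlay) return { opened: false };
+   const r = overlay.getBoundingClientRect();
+   return {
+     opened: r.width > 0 && r.height > 0,
+     // aria-modal plus focus landing inside the dialog are rough proxies for a focus trap
+     ariaModal: overlay.getAttribute('aria-modal') === 'true',
+     focusInside: overlay.contains(document.activeElement),
+     label: overlay.getAttribute('aria-label') || overlay.getAttribute('aria-labelledby') || null,
+   };
+ })()
+ ```
+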
+ **Modals a junior agent misses:**
+ - Confirmation dialogs (delete confirm, logout confirm)
+ - Date pickers (calendar popover)
+ - Color pickers
+ - Combobox dropdowns (autocomplete search)
+ - Popover menus (dropdown with options)
+ - Sheet / drawer (slide-in from right or bottom)
+ - Command palette (Cmd+K)
+ - Tooltips (hover-triggered on desktop)
+ - Toast notifications (programmatic — trigger an action that causes one)
+ - Error modals (submit invalid form)
+ - Share sheets
+ - File upload dialogs (click input[type=file])
+
+ ### Phase C — Flow exercising
+
+ A **flow** is a multi-step user journey. Every app has 3–10 critical flows.
+ sd-audit must auto-discover and exercise them.
+
+ **Auto-discover flows from routes + component names:**
+
+ | Route / name hint | Flow |
+ |---|---|
+ | `/login`, `/signin`, `/auth` | Login flow (happy + wrong password + locked account) |
+ | `/register`, `/signup` | Registration flow |
+ | `/forgot`, `/reset` | Password reset flow |
+ | `/onboarding`, `/welcome` | First-run flow |
+ | `/checkout`, `/cart` | Checkout flow (incl. errors: declined card, validation) |
+ | `/dashboard`, `/home` (authed) | Post-auth landing → primary CTA flow |
+ | List route (`/users`, `/orders`) | CRUD — create, edit, view, delete, filter, search, paginate |
+ | Detail route (`/users/:id`) | Edit flow, delete flow, related actions |
+ | `/settings`, `/profile` | Profile edit, preference toggles, account delete |
+ | `/support`, `/help`, `/chat` | Messaging flow |
+
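+ A sketch of that route-to-flow mapping (route strings come from the earlier
+ route-discovery step; the patterns just mirror the table above and will need tuning per app):
+
+ ```js
+ // Sketch: derive candidate flows from the discovered route list.
+ const FLOW_HINTS = [
+   { pattern: /\/(login|signin|auth)/,   flow: 'login' },
+   { pattern: /\/(register|signup)/,     flow: 'registration' },
+   { pattern: /\/(forgot|reset)/,        flow: 'password-reset' },
+   { pattern: /\/(onboarding|welcome)/,  flow: 'first-run' },
+   { pattern: /\/(checkout|cart)/,       flow: 'checkout' },
+   { pattern: /\/(settings|profile)/,    flow: 'profile-edit' },
+   { pattern: /\/(support|help|chat)/,   flow: 'messaging' },
+ ];
+
+ function discoverFlows(routes) {
+   const flows = new Set();
+   for (const route of routes) {
+     for (const { pattern, flow } of FLOW_HINTS) {
+       if (pattern.test(route)) flows.add(flow);
+     }
+     // A list route with a matching parameterised detail route implies a CRUD flow.
+     if (routes.some(r => r.startsWith(route + '/') && /[:[]/.test(r))) flows.add(`crud:${route}`);
+   }
+   return [...flows];
+ }
+ ```
+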
+ **Per flow:**
+
+ ```
+ 1. Plan steps (list of expected screens + actions)
+ 2. Execute step-by-step:
+    - Navigate / click to advance
+    - Per step: snapshot + screenshot + console
+    - Test error path (invalid input, network error via DevTools)
+    - Test back button preserves state
+ 3. Capture final success state (confirmation page, toast, redirect)
+ 4. If flow depends on creating test data, use burner account
+ ```
+
+ Save per-flow artifacts under `.super-design/sessions/<id>/flows/<flow_name>/step_NN_<action>.png`.
+
+ ### Phase D — State matrix per component
+
+ For each UI component class (Button, Input, Card, ListRow, Modal, NavItem…),
+ capture:
+
+ | State | How to trigger |
+ |---|---|
+ | default | Initial render |
+ | hover | `browser_hover` (desktop only, gate `@media hover`) |
+ | focus | Tab to element via `browser_press_key(Tab)` until focused |
+ | focus-visible | Same as focus (most systems collapse them now) |
+ | active | `browser_press_key` Enter/Space while focused |
+ | disabled | Find a disabled example (e.g., form before valid) |
+ | loading | Submit form → catch transient state; OR throttle network via DevTools |
+ | error | Invalid input + submit |
+ | empty | Navigate to route with no data (burner account OR delete all) |
+ | success | Complete a flow successfully |
+ | selected | Click tab / radio / checkbox that shows selected variant |
+
+ Save per component class:
+ ```
+ .super-design/sessions/<id>/components/
+   Button/
+     default.png
+     hover.png
+     focus.png
+     active.png
+     disabled.png
+     loading.png
+   Input/...
+   Modal/...
+ ```
+
+ Output `.super-design/sessions/<id>/component-state-matrix.json`:
+
+ ```json
+ {
+   "Button": {
+     "states_captured": ["default", "hover", "focus", "active", "disabled", "loading"],
+     "states_missing": ["error"],
+     "evidence": { "default": "components/Button/default.png", "...": "..." }
+   },
+   "...": {}
+ }
+ ```
+
+ **Missing states → finding.**
+
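+ A sketch of turning that matrix into findings. The required-state list and the finding
+ fields here are illustrative, not the findings.json schema:
+
+ ```js
+ // Sketch: one advisory finding per component class with uncaptured states.
+ const REQUIRED_STATES = ['default', 'hover', 'focus', 'disabled', 'loading', 'error', 'empty'];
+
+ function missingStateFindings(matrix) {
+   return Object.entries(matrix)
+     .map(([component, { states_captured = [] }]) => ({
+       component,
+       missing: REQUIRED_STATES.filter(s => !states_captured.includes(s)),
+     }))
+     .filter(f => f.missing.length > 0)
+     .map(f => ({
+       title: `${f.component}: no captured ${f.missing.join('/')} state`,
+       severity: f.missing.includes('error') ? 'MEDIUM' : 'LOW',
+       evidence: [`components/${f.component}/`],
+     }));
+ }
+ ```
+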
+ ### Phase E — Form state coverage
+
+ For each form discovered, test:
+ 1. Empty submit → validation messages
+ 2. Each field invalid individually → per-field error
+ 3. All valid → success state
+ 4. Server error (simulate 500) → error recovery
+ 5. Network offline → offline handling
+ 6. Paste into password field → paste NOT blocked
+ 7. Autocomplete tokens on login fields (`username`, `current-password`, `one-time-code`)
+ 8. Tab order matches visual order
+ 9. Submit via Enter key works
+ 10. Mobile viewport: does the input zoom on focus (iOS Safari zooms when `font-size < 16px`)?
+
+ Save: `.super-design/sessions/<id>/forms/<formId>_<scenario>.png`
+
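+ Scenarios 7 and 10 can be checked statically with a `browser_evaluate` sweep. A sketch
+ (the 16px threshold is the iOS Safari zoom floor referenced above):
+
+ ```js
+ // browser_evaluate: per-field autocomplete and zoom-risk report (scenarios 7 and 10).
+ (() =>
+   [...document.querySelectorAll('input, select, textarea')].map(el => ({
+     name: el.getAttribute('name') || el.id || el.getAttribute('aria-label') || '',
+     type: el.getAttribute('type') || el.tagName.toLowerCase(),
+     autocomplete: el.getAttribute('autocomplete'),          // expect username / current-password / one-time-code
+     fontSizePx: parseFloat(getComputedStyle(el).fontSize),  // < 16 on mobile → iOS Safari zooms on focus
+   }))
+ )()
+ ```
+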
+ ## Orchestration summary
+
+ sd-audit adds this between Step 2 and Step 3:
+
+ ```
+ Step 2.5 — Discovery
+   For each (page, viewport):
+     A. Interactive inventory
+     B. Modal/overlay enumeration (click every trigger)
+     C. Flow exercising (login, CRUD, checkout if applicable)
+     D. Component state matrix
+     E. Form state coverage
+ ```
+
+ This takes 3–5× longer than a static-only audit but produces ~3× the findings,
+ each with real evidence of failure conditions (not just hypothetical WCAG
+ violations on static markup).
+
+ ## Budget & skipping
+
+ For very large apps, scope Phase B/C/D to:
+ - Top 5 most-clicked triggers per page (ranked by proximity to primary CTA)
+ - Critical flows only (login + checkout + 1 CRUD)
+ - 3 component classes minimum (Button, Input, Modal) — rest deferred to full-mode audit
+
+ Record what was skipped in `.super-design/sessions/<id>/scope.json` so later
+ runs can close the gap.
+
+ ## Error handling
+
+ - **Triggered modal doesn't appear within 2s** — record "trigger broken" finding, move on
+ - **Console error after click** — record verbatim, still capture whatever rendered
+ - **Focus not trapped** — record a11y violation
+ - **Modal close fails** — force navigate away, record "close broken" finding
+
+ Never let a broken trigger abort the full audit — isolate and continue.
+
+ ## Hard rules
+
+ 1. Every Phase A inventory item must be considered for exercising; any skips must be recorded.
+ 2. Every modal opened must be screenshotted OPEN + CLOSED.
+ 3. Every flow must capture at least one error path, not just the happy path.
+ 4. Component state matrix must declare which states are MISSING (not just captured).
+ 5. Form state coverage — 10 scenarios per form; if only some are run, record which.
+ 6. Use ONE Playwright session; reuse it across phases; `browser_close` only at the end.
+ 7. Sequential, not parallel. Never spawn parallel tabs.
@@ -0,0 +1,376 @@
+ # Design Intelligence Rubric
+
+ > The missing layer between WCAG/Nielsen checklists and "does this feel
+ > designed or vibecoded?" — used by sd-audit as Phase 3g.
+
+ ## Why this exists
+
+ A UI can pass axe (zero WCAG violations), Nielsen (10/10 heuristics green),
+ Lighthouse (100 perf score) and still be visually horrible: card-stacked mobile
+ dashboards, microtext tables on phones, no visual hierarchy, shadcn defaults
+ slapped onto every surface with zero variant discipline, inline arbitrary
+ pixel values everywhere. This rubric codifies those **implicit** criteria.
+
+ Every category is scored **0–10** per page × viewport. Total score → a single
+ **design-intelligence score (DIS)** 0–100. Scores below 60 auto-flag MEDIUM
+ findings; below 40 auto-flag HIGH.
+
+ ## Evidence requirement
+
+ Each category's score MUST cite ≥1 piece of evidence from the audit session:
+ - **SHOT** (screenshot path)
+ - **CSS** (computed style excerpt)
+ - **DOM** (snapshot quote)
+ - **CODE** (source file + line, via Grep, for vibecode detection)
+
+ A score without evidence is invalid. The auditor records `n/a` instead of guessing.
+
+ ---
+
+ ## Category 1 — Visual hierarchy (weight 1.0)
+
+ **Question:** On this view, what is the single primary goal? Is it the most
+ dominant element visually?
+
+ | Score | Criteria |
+ |---|---|
+ | 10 | One dominant element ≥ 2× larger or distinctly heavier than the rest. Supporting info subordinate. Example: Cash App balance. |
+ | 7 | Primary clear but competing CTAs present. |
+ | 4 | Multiple equal-weight elements; user has to hunt. |
+ | 0 | Flat wall of cards/tables, no signal of where to look. |
+
+ **Detect:** `browser_evaluate` → collect computed `fontSize`, `fontWeight`, `lineHeight`, `color` of h1–h6, buttons, key metrics. Compute size-dominance ratio. Ratio > 2 → 10. Ratio < 1.3 → ≤4.
+
+ **Example fail (beats-market):** admin dashboard mobile — 10 equal-weight metric cards, no hero. Score: 2.
+
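+ A sketch of the dominance-ratio measurement (the weight bump for bold text and the
+ element set are illustrative; the >2 / <1.3 thresholds follow the Detect rule above):
+
+ ```js
+ // browser_evaluate: ratio of the most dominant text element to the runner-up.
+ (() => {
+   const els = [...document.querySelectorAll('h1, h2, h3, button, [role="button"], [class*="hero"]')]
+     .filter(el => el.getBoundingClientRect().height > 0);
+   const weights = els.map(el => {
+     const cs = getComputedStyle(el);
+     const bold = parseInt(cs.fontWeight, 10) >= 600 ? 1.2 : 1;  // crude boost for heavy weights
+     return parseFloat(cs.fontSize) * bold;
+   }).sort((a, b) => b - a);
+   if (weights.length < 2) return { dominanceRatio: null };
+   return { dominanceRatio: +(weights[0] / weights[1]).toFixed(2) };
+ })()
+ ```
+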
+ ---
+
+ ## Category 2 — Density calibration per viewport (weight 1.2)
+
+ **Question:** Does information density match the device context?
+
+ | Viewport | Expected primary entities visible above fold |
+ |---|---|
+ | Mobile 375×812 | 6–8 compact rows OR 1 hero + 4–5 rows |
+ | Tablet 768×1024 | 8–12 rows or 4 cards in 2×2 |
+ | Desktop 1440×900 | 12–20+ rows or data-dense tables |
+
+ **Score:**
+ - 10 if density within ±20% of target
+ - 5 if half or double target
+ - 0 if <25% of target (wasteful) or >3× (cramped, illegible)
+
+ **Detect:** Count `role=listitem | region | article` elements with `getBoundingClientRect` intersecting the viewport. Compare to the viewport target.
+
+ **Example fail (beats-market):** admin dashboard mobile — 3 cards above fold (target 6–8). Score: 3. Orders mobile shows 20 rows, but in microtext, so it also fails (see Category 7).
+
+ ---
+
+ ## Category 3 — Consistency: spacing scale (weight 0.8)
+
+ **Question:** Do paddings, margins, gaps come from a scale (4/8px or 0.25rem) or are they arbitrary magic numbers?
+
+ **Detect:**
+ - `browser_evaluate` → collect computed `padding`, `margin`, `gap` from ≥50 elements.
+ - Grep codebase for `p-\[\d+px\]`, `m-\[\d+px\]`, `gap-\[\d+px\]` (Tailwind arbitrary values).
+ - Count: on-scale vs off-scale.
+
+ | Score | Criteria |
+ |---|---|
+ | 10 | 95%+ values on a 4/8 scale; no arbitrary pixel values in code |
+ | 7 | 80–95% on-scale; a few arbitrary values in leaf components |
+ | 4 | 50–80% on-scale; arbitrary values common |
+ | 0 | Random pixel values everywhere; no visible scale |
+
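+ A sketch of the computed-style half of that check (sampling the first ~200 elements; the
+ 4px base and the 0.5px tolerance are assumptions):
+
+ ```js
+ // browser_evaluate: count spacing values that land on a 4px scale vs off it.
+ (() => {
+   const props = ['paddingTop', 'paddingLeft', 'marginTop', 'marginLeft', 'rowGap', 'columnGap'];
+   let onScale = 0, offScale = 0;
+   [...document.querySelectorAll('body *')].slice(0, 200).forEach(el => {
+     const cs = getComputedStyle(el);
+     props.forEach(p => {
+       const v = parseFloat(cs[p]);
+       if (!v || Number.isNaN(v)) return;        // skip zero and non-length values
+       const rem = v % 4;
+       if (rem < 0.5 || rem > 3.5) onScale++; else offScale++;
+     });
+   });
+   return { onScale, offScale, pctOnScale: +(100 * onScale / ((onScale + offScale) || 1)).toFixed(1) };
+ })()
+ ```
+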
+ ---
+
+ ## Category 4 — Consistency: typography scale (weight 0.8)
+
+ Same method as Category 3, applied to font-size, font-weight, line-height. Look for `text-\[\d+px\]` and arbitrary font-size values. Expected: 6–10 sizes total in a designed system; 30+ sizes = vibecoded.
+
+ ---
+
+ ## Category 5 — Consistency: color palette (weight 0.8)
+
+ **Detect:**
+ - Collect computed `color`, `background-color`, `border-color` from ≥100 elements.
+ - Count unique colors: <15 = disciplined, 30+ = vibecoded.
+ - Grep for `#[0-9a-f]{6}`, `rgb\(`, Tailwind arbitrary colors.
+
+ | Score | Criteria |
+ |---|---|
+ | 10 | ≤12 distinct colors, all from tokens |
+ | 7 | 12–20 colors, mostly tokens |
+ | 4 | 20–30 colors |
+ | 0 | 30+ colors, raw hex/rgb inline |
+
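+ A sketch of the unique-color census (sampling ~300 elements; the transparent-background
+ filter is an assumption, to avoid counting unset backgrounds):
+
+ ```js
+ // browser_evaluate: count distinct computed colors in use on the page.
+ (() => {
+   const colors = new Set();
+   [...document.querySelectorAll('body *')].slice(0, 300).forEach(el => {
+     const cs = getComputedStyle(el);
+     ['color', 'backgroundColor', 'borderTopColor'].forEach(p => {
+       const v = cs[p];
+       if (v && v !== 'rgba(0, 0, 0, 0)') colors.add(v);   // ignore fully transparent values
+     });
+   });
+   return { uniqueColors: colors.size, sample: [...colors].slice(0, 20) };
+ })()
+ ```
+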
+ ---
+
+ ## Category 6 — Whitespace & breathing room (weight 0.7)
+
+ **Question:** Does content have room to breathe, or is it crammed?
+
+ **Detect:** Compute average `padding-inline + margin-inline` per content block. Compare to container width. Measure content-to-chrome ratio.
+
+ | Score | Criteria |
+ |---|---|
+ | 10 | Content 60–75% of width, 25–40% whitespace |
+ | 7 | Content 75–85% |
+ | 4 | Content 85–95% OR <50% (too sparse) |
+ | 0 | Content touching edges, no breathing room |
+
+ ---
+
+ ## Category 7 — Text legibility (weight 1.2)
+
+ **Detect:** `browser_evaluate` → collect computed `fontSize` in px. Find the minimum across visible text.
+
+ | Viewport | Min body | Min meta | Min input |
+ |---|---|---|---|
+ | Mobile | 16px | 13px | 16px (iOS zoom floor) |
+ | Desktop | 14px | 12px | 14px |
+
+ | Score | Criteria |
+ |---|---|
+ | 10 | All text meets mins |
+ | 5 | One or two elements below by ≤1px |
+ | 0 | Widespread microtext (tables, chips, meta) below min |
+
+ **Example fail (beats-market):** orders mobile — table cells computed at ~8px. Score: 0.
+
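+ A sketch of the minimum-font-size sweep; compare the result against the per-viewport
+ minimums in the table above (the 13px offender threshold is illustrative):
+
+ ```js
+ // browser_evaluate: smallest computed font size among visible, text-bearing elements.
+ (() => {
+   let min = Infinity;
+   const offenders = [];
+   [...document.querySelectorAll('body *')].forEach(el => {
+     if (!el.textContent || !el.textContent.trim()) return;   // text-bearing elements only
+     const r = el.getBoundingClientRect();
+     if (r.width === 0 || r.height === 0) return;             // visible only
+     const size = parseFloat(getComputedStyle(el).fontSize);
+     min = Math.min(min, size);
+     if (size < 13) offenders.push({ tag: el.tagName, size, text: el.textContent.trim().slice(0, 40) });
+   });
+   return { minFontSizePx: min, offenders: offenders.slice(0, 10) };
+ })()
+ ```
+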
+ ---
+
+ ## Category 8 — CTA hierarchy (weight 1.0)
+
+ **Question:** Is there ONE primary CTA per view?
+
+ **Detect:** Count buttons with `variant=default | primary | filled` OR a bg-primary class. More than one above the fold = competing.
+
+ | Score | Criteria |
+ |---|---|
+ | 10 | Single primary CTA, rest secondary/ghost |
+ | 7 | Single primary + 1 competing |
+ | 4 | 2–3 competing primaries |
+ | 0 | Every button styled primary |
+
+ Reference: Baymard PDP — 51% of e-commerce pages fail due to competing CTAs.
+
+ ---
+
+ ## Category 9 — State coverage (weight 1.1)
+
+ Per page, does the UI handle: default / loading / empty / error / success?
+
+ **Detect per scenario:**
+ - Loading: Grep source for `isLoading`, `pending`, `Skeleton`, `aria-busy`.
+ - Empty: Grep for `isEmpty`, `EmptyState`, zero-result ternaries.
+ - Error: Grep for `role="alert"`, error boundaries, `onError`.
+
+ | Score | Criteria |
+ |---|---|
+ | 10 | 5/5 states rendered or demonstrably coded |
+ | 8 | 4/5 (usually missing success toast) |
+ | 5 | 3/5 (missing empty + error) |
+ | 0 | Only default; a failed fetch leaves broken UI |
+
+ ---
+
+ ## Category 10 — Touch targets (mobile only, weight 1.0)
+
+ **Detect:** `browser_evaluate` → get `getBoundingClientRect` of every clickable. Count targets < 44×44 px.
+
+ | Score | Criteria |
+ |---|---|
+ | 10 | 100% targets ≥ 44×44 OR 24×24 with 8px+ gap |
+ | 7 | 80–99% compliant |
+ | 4 | 50–80% compliant (common: icon-only buttons) |
+ | 0 | Widespread <24px targets |
+
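+ A sketch of that sweep (the clickable selector list is illustrative, and it does not
+ account for the 24×24-with-gap alternative in the 10-score row):
+
+ ```js
+ // browser_evaluate: list visible interactive targets smaller than 44×44 px.
+ (() => {
+   const clickable = document.querySelectorAll('a[href], button, [role="button"], input, select, textarea');
+   const small = [];
+   clickable.forEach(el => {
+     const r = el.getBoundingClientRect();
+     if (r.width === 0 || r.height === 0) return;             // skip hidden targets
+     if (r.width < 44 || r.height < 44) {
+       small.push({
+         name: (el.getAttribute('aria-label') || el.textContent || '').trim().slice(0, 40),
+         w: Math.round(r.width), h: Math.round(r.height),
+       });
+     }
+   });
+   return { total: clickable.length, under44: small.length, samples: small.slice(0, 10) };
+ })()
+ ```
+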
+ ---
+
+ ## Category 11 — Motion & feedback (weight 0.6)
+
+ **Question:** Do interactions give feedback? Are animations tasteful, and do they respect `prefers-reduced-motion`?
+
+ **Detect:**
+ - `browser_evaluate` with `matchMedia('(prefers-reduced-motion: reduce)')` + check for `transition` / `animation` on interactive elements.
+ - Missing hover/focus feedback on buttons = major fail.
+ - Animations >3s = excessive.
+
+ | Score | Criteria |
+ |---|---|
+ | 10 | Hover/focus/active feedback everywhere; animations ≤300ms; reduced-motion respected |
+ | 7 | Most interactions give feedback; reduced-motion partial |
+ | 4 | Some interactions static; reduced-motion ignored |
+ | 0 | No hover/focus feedback at all OR autoplay video + parallax with no disable |
+
+ ---
+
+ ## Category 12 — Nav pattern matches platform (weight 1.0)
+
+ | Viewport | Expected nav |
+ |---|---|
+ | Mobile (≤768) | Bottom tab bar (3–5), full-screen menus, gesture back |
+ | Tablet | Hybrid (collapsible sidebar or top tabs) |
+ | Desktop | Persistent sidebar or top navbar with search |
+
+ **Detect:** On the mobile viewport, check for `<nav>` in a fixed bottom position. On desktop, a fixed left sidebar OR top header with nav.
+
+ | Score | Criteria |
+ |---|---|
+ | 10 | Nav matches platform convention |
+ | 5 | Hybrid but functional (e.g., bottom FAB + hamburger) |
+ | 0 | Hamburger-only on mobile OR bottom tabs on desktop |
+
+ ---
+
+ ## Category 13 — Table-on-mobile detection (weight 1.2, mobile only)
+
+ **Detect:** At ≤768px, find `<table>` with >3 visible columns OR `display: table` containers with horizontal scroll AND text < 13px.
+
+ | Score | Criteria |
+ |---|---|
+ | 10 | No table, or table transformed to card-per-row |
+ | 5 | Table present but scrolls cleanly with sticky first col |
+ | 0 | Squashed desktop table with microtext |
+
+ **Example fail (beats-market):** admin orders mobile — 8-col desktop table rendered at 375px with ~8px text. Score: 0.
+
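+ A sketch of the squashed-table check, run at a mobile viewport (the 3-column and 13px
+ thresholds follow the Detect rule above):
+
+ ```js
+ // browser_evaluate: flag tables that are too wide or too small to read at this viewport.
+ (() =>
+   [...document.querySelectorAll('table')].map(table => {
+     const firstRow = table.querySelector('tr');
+     const cols = firstRow ? firstRow.querySelectorAll('td, th').length : 0;
+     const cellSizes = [...table.querySelectorAll('td, th')]
+       .map(c => parseFloat(getComputedStyle(c).fontSize));
+     const minCell = cellSizes.length ? Math.min(...cellSizes) : null;
+     return {
+       cols,
+       minCellFontPx: minCell,
+       overflowsViewport: table.scrollWidth > document.documentElement.clientWidth,
+       squashed: cols > 3 && minCell !== null && minCell < 13,
+     };
+   })
+ )()
+ ```
+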
+ ---
+
+ ## Category 14 — Modal/sheet appropriateness (weight 0.8)
+
+ | Viewport | Expected modal pattern |
+ |---|---|
+ | Mobile | Bottom sheet (slide-up) or full-screen with close top-left |
+ | Tablet | Centered dialog OR bottom sheet |
+ | Desktop | Centered dialog |
+
+ **Detect:** Open every `role=dialog` trigger. Measure position + dimensions. On mobile, a centered dialog with close in the top-right = fail.
+
+ | Score | Criteria |
+ |---|---|
+ | 10 | Correct per viewport |
+ | 5 | Centered on mobile but within reach |
+ | 0 | Unreachable thumb-zone close, wrong pattern for device |
+
+ ---
+
+ ## Category 15 — Color semantics (weight 0.6)
+
+ **Detect:** Collect colors used on: error messages, success states, warnings, info. Red = error? Green = success? Or decorative-only?
+
+ | Score | Criteria |
+ |---|---|
+ | 10 | Semantic colors distinct and consistent across app |
+ | 5 | Used but inconsistent (red elsewhere as brand) |
+ | 0 | No semantic color system |
+
+ ---
+
+ ## Category 16 — Design-system coherence (weight 1.1)
+
+ **The meta-category.** Does the app LOOK like it was designed by one team with one vision? Or does it look like a collection of shadcn defaults?
+
+ **Detect (aesthetic signal):**
+ - Does at least one of the following exist: custom color palette, custom font pairing, custom spacing rhythm, custom radius language, custom motion language, custom illustration/icon set?
+ - Grep package.json for ≥1 of: `typeui.sh`, `framer-motion`, custom fonts beyond system, Lottie, MagicUI, Aceternity, custom token file.
+
+ | Score | Criteria |
+ |---|---|
+ | 10 | Strong identity — you could recognize this app from a cropped screenshot |
+ | 7 | Some identity (brand color + maybe custom font) |
+ | 4 | shadcn defaults + 1 accent color |
+ | 0 | Pure shadcn defaults, zero customization, looks like `npx shadcn-ui@latest init` |
+
+ **When score ≤4:** recommend a typeui.sh aesthetic skill OR a frontend-design
+ session. See `design-skills-catalog.md`.
+
+ ---
+
+ ## Category 17 — Vibecode detection (weight 1.0)
+
+ **Question:** Does the code follow patterns (components, variants, tokens)
+ or is it hand-assembled divs with inline styles?
+
+ **Detect (source grep):**
+ - Count `<div className="..."` without a surrounding component → raw-div score
+ - Count inline `style={{` → inline-style count
+ - Count `@media` queries inline vs breakpoint tokens
+ - Count arbitrary Tailwind values (`\[\d+px\]`, `\[#[0-9a-f]+\]`)
+ - Check if shadcn components are wrapped into domain components (MetricRow, OrderCard) or used raw per page
+ - Check if types are co-located in `types/` vs inline `any`
+
+ | Score | Criteria |
+ |---|---|
+ | 10 | Domain components, design tokens, typed props, variants via CVA |
+ | 7 | Some composition, some raw shadcn usage |
+ | 4 | Mostly raw primitives, inconsistent composition |
+ | 0 | Flat page files with 500+ lines of inline JSX; zero reuse |
+
+ ---
+
+ ## Scoring formula
+
+ ```
+ DIS = Σ(score_i × weight_i) / Σ(weight_i) × 10
+
+ weight_sum = 15.9
+ max_raw = 159.0 → normalized to 100
+
+ Example (beats-market admin dashboard mobile):
+   C1  hierarchy:  2 × 1.0 = 2.0
+   C2  density:    3 × 1.2 = 3.6
+   C3  spacing:    7 × 0.8 = 5.6
+   C4  type:       6 × 0.8 = 4.8
+   C5  color:      7 × 0.8 = 5.6
+   C6  whitespace: 4 × 0.7 = 2.8
+   C7  legibility: 8 × 1.2 = 9.6
+   C8  CTA:        6 × 1.0 = 6.0
+   C9  states:     4 × 1.1 = 4.4
+   C10 touch:      6 × 1.0 = 6.0
+   C11 motion:     5 × 0.6 = 3.0
+   C12 nav:        4 × 1.0 = 4.0
+   C13 table:     10 × 1.2 = 12.0 (no table on dashboard)
+   C14 modal:      6 × 0.8 = 4.8
+   C15 color-sem:  6 × 0.6 = 3.6
+   C16 coherence:  3 × 1.1 = 3.3
+   C17 vibecode:   4 × 1.0 = 4.0
+   raw = 85.1 → DIS = 85.1 / 15.9 × 10 = 53.5 → MEDIUM severity
+ ```
+
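+ A sketch of the same computation in code. The category keys that the output format below
+ defines (`visual_hierarchy`, `density`, `design_system_coherence`) are reused; the remaining
+ key names are illustrative. Categories recorded as `n/a` are simply omitted from the
+ weighted sum, which is an assumption, not a rule stated by the rubric:
+
+ ```js
+ // Sketch: compute a page's DIS from its per-category scores (0–10 each).
+ const WEIGHTS = {
+   visual_hierarchy: 1.0, density: 1.2, spacing_scale: 0.8, type_scale: 0.8,
+   color_palette: 0.8, whitespace: 0.7, legibility: 1.2, cta_hierarchy: 1.0,
+   state_coverage: 1.1, touch_targets: 1.0, motion: 0.6, nav_pattern: 1.0,
+   table_on_mobile: 1.2, modal_pattern: 0.8, color_semantics: 0.6,
+   design_system_coherence: 1.1, vibecode: 1.0,          // sums to 15.9
+ };
+
+ function computeDIS(scores) {
+   const cats = Object.keys(WEIGHTS).filter(c => typeof scores[c] === 'number');
+   const weightSum = cats.reduce((s, c) => s + WEIGHTS[c], 0);
+   const raw = cats.reduce((s, c) => s + scores[c] * WEIGHTS[c], 0);
+   const dis = +((raw / weightSum) * 10).toFixed(1);
+   return { dis, severity: dis < 40 ? 'HIGH' : dis < 60 ? 'MEDIUM' : null };
+ }
+ ```
+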
+ ## Output format
+
+ Write `.super-design/sessions/<id>/design-intelligence.json`:
+
+ ```json
+ {
+   "pages": [
+     {
+       "page_url": "/admin",
+       "viewport": "375x812",
+       "categories": {
+         "visual_hierarchy": { "score": 2, "evidence": ["screens/admin_mobile.png", "styles/admin_mobile.json"], "notes": "10 equal-weight metric cards, no hero" },
+         "density": { "score": 3, "evidence": ["screens/admin_mobile.png"], "notes": "Only 3 metrics above fold; target 6-8" }
+       },
+       "dis": 53.5,
+       "severity": "MEDIUM",
+       "top_findings": ["F-0042", "F-0043", "F-0049"]
+     }
+   ],
+   "overall_dis": 54.2,
+   "overall_severity": "MEDIUM",
+   "weakest_categories": ["visual_hierarchy", "density", "design_system_coherence"]
+ }
+ ```
+
+ Each category score ≤ 4 SHOULD spawn a finding with a `template_id` from the M-family (M1–M15) in `fix-agent-playbook.md`.
+
+ ## Cross-references
+
+ - Mobile-specific categories (C2, C7, C10, C12, C13, C14) reference
+   `.claude/skills/mobile-app-patterns/SKILL.md` for fix patterns.
+ - Design-system coherence (C16) references `design-skills-catalog.md` for
+   typeui.sh skill suggestions.
+ - Vibecode (C17) references `fix-agent-playbook.md` V-templates (spacing,
+   type, color tokens).