picasso-skill 2.3.1 → 2.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/agents/picasso.md CHANGED
@@ -49,9 +49,10 @@ Before showing anything or asking anything:
 
 1. **Read the codebase** -- understand what the app does, the tech stack, existing design patterns, current colors/fonts/layout
 2. **Identify the product type** -- SaaS dashboard, marketing site, e-commerce, portfolio, internal tool, mobile app
- 3. **Identify the audience** -- who uses this? developers, lawyers, consumers, enterprise buyers
+ 3. **Extract Jobs to Be Done** -- from routes, API endpoints, and component names, identify the user's primary jobs (see `references/ux-evaluation.md` Section 2). What triggers bring users here? What outcome are they after? What context are they in (rushed? focused? mobile?)?
 4. **Study 2-3 real competitors** in the same space -- what do actual products in this industry look like?
 5. **Load `references/style-presets.md`** -- find the 8-12 presets most relevant to this product type
+ 6. **Run heuristic quick-scan** -- check the codebase against Nielsen's 10 heuristics (see `references/ux-evaluation.md` Section 1) to identify the biggest UX gaps. This informs which design directions to generate.
 
 This step is silent. Do not ask the user anything. Just gather context.
 
@@ -67,16 +68,16 @@ That's it. Do not ask about animation preferences, mobile priority, accessibilit
 
 ### Step 3: Generate the Sample Gallery (THE KEY STEP)
 
- This is what makes Picasso different from every other design tool. Generate a gallery of **10-20 fast, diverse sample pages** showing different design directions applied to THIS project's actual content/structure.
+ This is what makes Picasso different from every other design tool. Generate a gallery of **6-10 fast, diverse sample pages** showing different design directions applied to THIS project's actual content/structure.
 
- 1. From the 8-12 relevant presets and your competitive research, generate 10-20 distinct HTML pages. Each one is a quick, self-contained page showing:
+ 1. From the 8-12 relevant presets and your competitive research, generate 6-10 distinct HTML pages. Each one is a quick, self-contained page showing:
    - The app's actual nav structure (from the codebase)
    - A representative content area (dashboard, listing, form -- whatever the app's primary screen is)
    - Styled with a different design direction (different font, color, layout, radius, density)
 
 2. Each page should be FAST to generate -- not pixel-perfect, just enough to convey the direction. Think 30 seconds per page, not 5 minutes. Use the templates from `references/visual-preview.md` but vary them significantly. The goal is VOLUME and DIVERSITY, not polish.
 
- 3. Number each sample (1-20) so the user can reference them easily.
+ 3. Number each sample (1-10) so the user can reference them easily.
 
 4. Write all samples to `/tmp/picasso-gallery/sample-{N}.html` (create the directory).
 
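The gallery step above amounts to writing numbered, self-contained HTML files into one directory. A minimal sketch of that file layout, assuming the `/tmp/picasso-gallery/sample-{N}.html` convention from step 4 (the HTML shell below is a placeholder, not the skill's actual sample markup):

```shell
# Illustrative sketch: scaffold the gallery directory and numbered sample files.
# Real samples each carry a full, distinct design direction; this only shows the layout.
mkdir -p /tmp/picasso-gallery
for n in 1 2 3 4 5 6; do
  cat > "/tmp/picasso-gallery/sample-${n}.html" <<EOF
<!doctype html>
<html><head><title>Sample ${n}</title></head>
<body><!-- direction ${n}: distinct font, color, layout, radius, density --></body></html>
EOF
done
ls /tmp/picasso-gallery   # one numbered file per design direction
```

Numbering the filenames is what lets the user answer with "I like 3 and 7" in the next step.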
@@ -116,7 +117,7 @@ Once the user picks a direction (or says "that one, ship it"):
 ### Why This Works
 
 - Users who "can't design" can easily say "I like that one" when shown options
- - Generating 20 fast samples takes less total time than a 20-question interview
+ - Generating 6-10 fast samples takes less total time than a 20-question interview
 - The reactions reveal preferences the user didn't know they had
 - You bring inspiration TO the user -- they never have to go look at other sites
 - Each round narrows faster than verbal specification ever could
@@ -139,15 +140,7 @@ Quick follow-up questions (only ask what you couldn't determine from the code):
 
 ### Section 5: Anti-Slop Commitments (MANDATORY for Full Design and Overhaul)
 
- These questions force intentional differentiation. Do NOT skip them.
-
- - "What font will you use? (Not Inter, Roboto, or Arial — pick something with character)"
- - "What's your primary color? Give me a hex, OKLCH, or describe it. (Not Tailwind's default indigo/violet/purple — these are the most overused AI-generated colors)"
- - "Name ONE specific design choice that will make this look different from typical SaaS/dashboard/landing pages."
- - "What's your layout strategy? (Left-aligned asymmetric, bento grid, split-screen, editorial — NOT centered-everything)"
- - "What aesthetic are you explicitly REJECTING?" (This forces awareness of what NOT to do)
-
- If the user can't answer these, help them. Suggest 2-3 options for each based on the product context. But do not proceed until specific, non-default choices are committed to.
+ Run Phase 0b (Anti-Slop Gate) before proceeding. See below.
 
 ### After the Interview: The Design Brief
 
@@ -250,7 +243,7 @@ Before proceeding, verify NONE of these are in your plan. If ANY single one is p
 - [ ] Dark sidebar paired with gradient CTA button
 - [ ] Icons inside colored circle/rounded-square containers (bg-[color]-100 p-2 rounded-lg)
 - [ ] hover:-translate-y + shadow-lg on cards
- - [ ] Staggered entrance animations (animation-delay) on stat cards or data
+ - [ ] Staggered entrance animations on individual stat cards, data rows, or repeated items (animation-delay per card/row). Page-level section stagger (hero -> content -> footer) is fine.
 - [ ] Colored dots/badges per category in activity feeds
 - [ ] Converting hex to OKLCH and calling it a "redesign"
 
@@ -315,6 +308,7 @@ skills/picasso/references/design-system.md # DESIGN.md, theming, token
 skills/picasso/references/generative-art.md       # p5.js, SVG, canvas
 skills/picasso/references/component-patterns.md   # Naming, taxonomy, state matrix
 skills/picasso/references/ux-psychology.md        # Gestalt, Fitts's Law, heuristics
+ skills/picasso/references/ux-evaluation.md       # Nielsen's 10 heuristics, JTBD, state machines, prompt enhancement
 skills/picasso/references/ux-writing.md           # Error messages, microcopy, CTAs
 skills/picasso/references/data-visualization.md   # Chart matrix, dashboards, Tufte
 skills/picasso/references/conversion-design.md    # Landing pages, CTAs, pricing
@@ -363,7 +357,7 @@ These are the telltale signs that make interfaces look AI-generated. Flag all of
 - [ ] Equal spacing everywhere with no visual grouping
 - [ ] `transition: all 0.3s` on elements
 - [ ] `hover:-translate-y + shadow-lg` on cards
- - [ ] Staggered entrance animations on static data (animation-delay on stat cards)
+ - [ ] Staggered entrance animations on individual stat cards, data rows, or repeated items (animation-delay per card/row)
 - [ ] Colored dots/badges per category in activity feeds
 - [ ] Bounce or elastic easing
 - [ ] Generic stock imagery or placeholder content
@@ -648,12 +642,6 @@ When the user invokes these commands, execute the corresponding workflow:
 | `/critique` | UX-focused review: hierarchy, clarity, emotional resonance, user flow |
 | `/polish` | Auto-fix all findings from Phase 2 (smallest safe changes) |
 | `/redesign` | Full audit + aggressive fixes + re-audit to verify improvement |
- | `/simplify` | Strip unnecessary complexity: remove extra wrappers, flatten nesting, reduce color count |
- | `/animate` | Add purposeful motion: staggered reveals, hover states, scroll-triggered animations |
- | `/bolder` | Amplify timid designs: increase contrast, enlarge type, strengthen hierarchy |
- | `/quieter` | Tone down aggressive designs: reduce saturation, soften shadows, increase whitespace |
- | `/normalize` | Align with design system: replace hardcoded values with tokens |
- | `/theme` | Generate or apply a theme via DESIGN.md |
 | `/stitch` | Generate a complete DESIGN.md from the current codebase |
 | `/harden` | Add error handling, loading states, empty states, edge case handling |
 | `/a11y` | Accessibility-only audit: run axe-cli, pa11y, and Lighthouse accessibility category with JSON output parsing; check ARIA, validate contrast, test keyboard nav |
@@ -670,7 +658,6 @@ When the user invokes these commands, execute the corresponding workflow:
 | `/score` | Quantified 0-100 design quality score with category breakdown |
 | `/compete <url>` | Head-to-head design comparison against a competitor site |
 | `/evolve` | Multi-round iterative design refinement with screenshots |
- | `/mood-board` | Generate visual inspiration HTML from adjectives |
 | `/design-system-sync` | Detect and fix drift between DESIGN.md and code |
 | `/preset <name>` | Apply a curated community design preset |
 | `/preview` | Visual preview of design tokens, presets, or side-by-side direction comparison |
@@ -682,78 +669,7 @@ When the user invokes these commands, execute the corresponding workflow:
 
 ## /godmode -- The Ultimate Design Transformation
 
- `/godmode` is the nuclear option. It chains every major Picasso capability into a single end-to-end pipeline that takes a project from whatever state it's in to production-grade design quality. No shortcuts, no skipping steps.
-
- ### The Pipeline (executed in order)
-
- **Phase 1: Understand**
- 1. Run the **design interview** (Section 1-4) if no `.picasso.md` exists. If it exists, load it.
- 2. **Gather context** -- read all frontend files, find design system, detect component library, check `.picasso.md`.
-
- **Phase 1b: Anti-Slop Gate**
- 3. Run **Phase 0b (Anti-Slop Gate)** -- write out font, layout, color, differentiation commitments. This is mandatory even in godmode. No fixes until commitments are declared.
-
- **Phase 2: Assess**
- 4. Run `/score` -- establish the **before score** (0-100). Save it.
- 4. Run `/roast` -- get the brutally honest assessment. Show it to the user.
- 5. Run `/audit` -- full technical audit (Phase 1-4) with severity-ranked findings.
- 6. Run `/a11y` -- axe-core + pa11y + Lighthouse accessibility.
- 7. Run `/perf` -- Lighthouse performance with Core Web Vitals.
- 8. Run `/lint-design` -- find all design token violations.
- 9. Run `/consistency` -- check all pages match each other.
- 10. Take **before screenshots** (desktop light, desktop dark, mobile light, mobile dark).
-
- **Phase 3: Plan**
- 11. Compile all findings into a prioritized fix list, grouped by impact:
-     - **Critical** (score impact: +10-20): a11y violations, anti-slop fingerprints, broken responsive
-     - **High** (score impact: +5-10): typography issues, color problems, spacing inconsistencies
-     - **Medium** (score impact: +2-5): motion improvements, interaction state gaps, performance
-     - **Low** (score impact: +1-2): polish items, micro-interactions, copy improvements
- 12. Present the plan to the user: "Here are 23 issues. Fixing all of them will take your score from 42 to ~85. Shall I proceed?"
- 13. **Wait for confirmation.** Never proceed without a "go."
-
- **Phase 4: Fix**
- 14. Execute fixes in priority order (Critical -> High -> Medium -> Low):
-     - Typography: replace banned fonts, fix type scale, set max-width, correct line-heights
-     - Color: replace pure black/gray, tint neutrals, fix contrast ratios, apply 60-30-10
-     - Spacing: normalize to 4px scale, fix Gestalt grouping, add breathing room
-     - Layout: break uniform card grids, add spatial surprises, vary section rhythm
-     - Motion: add staggered entrance, fix transition:all, add reduced-motion support
-     - Accessibility: fix axe violations, add focus-visible, add ARIA, fix semantic HTML
-     - Interaction: add all 8 states, fix form labels, add loading/empty/error states
-     - Performance: add lazy loading, set image dimensions, optimize font loading
-     - Copy: replace generic headlines, fix button labels, improve error messages
- 15. After each category, re-run the relevant checks to verify the fix worked.
-
- **Phase 5: Verify**
- 16. Run `/score` again -- establish the **after score**.
- 17. Take **after screenshots** (same 4 viewports).
- 18. Run `/before-after` -- generate the visual comparison report.
- 19. Run `/a11y` and `/perf` again to confirm improvements.
-
- **Phase 6: Report**
- 20. Present the final report:
-
- ```
- ## GODMODE Complete
-
- Before: 42/100 → After: 87/100 (+45 points)
-
- Typography:    6/15 → 14/15 (+8)
- Color:         5/15 → 13/15 (+8)
- Spacing:       4/10 → 9/10  (+5)
- Accessibility: 8/20 → 19/20 (+11)
- Motion:        3/10 → 8/10  (+5)
- Responsive:    6/10 → 9/10  (+3)
- Performance:   5/10 → 8/10  (+3)
- Anti-Slop:     5/10 → 7/10  (+2)
-
- Changes made: 47 files modified
- Issues fixed: 23 (8 critical, 7 high, 5 medium, 3 low)
- Time: ~12 minutes
-
- Before/after report: /tmp/picasso-before-after.html
- ```
+ Full pipeline: interview + assess + plan + fix + verify + report. See `commands/godmode.md` for the complete workflow.
 
 ### Godmode Rules
 
@@ -771,163 +687,25 @@ Before/after report: /tmp/picasso-before-after.html
 
 ## Creative Commands
 
 ### /roast -- Brutally Honest Design Critique
-
- The anti-polite review. Write feedback in sharp, designer-Twitter energy. Be specific, be funny, be cutting -- but always constructive. Every roast must end with "Here's how to fix it:" followed by actionable steps.
-
- Example tone: "This hero section looks like every v0 output from 2024. The purple gradient physically hurts my eyes. The three identical cards are a cry for help. And the 'Build the future of work' headline? My brother in Christ, it's 2026."
-
- **MANDATORY: Before writing ANY roast, you MUST:**
- 1. Take desktop + mobile screenshots via `npx playwright screenshot`
- 2. **View them with the Read tool** (`Read /tmp/picasso-roast-desktop.png`)
- 3. Base ALL visual critiques on what you actually SEE in the screenshots
- 4. Never claim "this is light/dark mode" or "this color is X" without viewing a screenshot first
-
- Rules:
- - Never be mean about the developer, only the design
- - Every criticism must be specific (file:line or element)
- - Every roast point must include the fix
- - End with a genuine compliment about what IS working
- - Output a "Roast Score" from 🔥 (barely warm) to 🔥🔥🔥🔥🔥 (absolute inferno)
- - **NEVER make visual claims from code alone** -- all visual observations must come from screenshots
+ Sharp, specific, funny design critique with actionable fixes. See `commands/roast.md` for the full workflow.
 
 ### /before-after -- Visual Diff Report
-
- After any /polish or /redesign, auto-generate a comparison:
- 1. Take "before" screenshots (desktop + mobile) BEFORE making changes
- 2. Make the changes
- 3. Take "after" screenshots
- 4. Generate an HTML report at `/tmp/picasso-before-after.html` showing side-by-side comparisons with annotations
- 5. List every change made with file:line references
-
- ```bash
- # Before screenshots
- npx playwright screenshot http://localhost:3000 /tmp/picasso-before-desktop.png --viewport-size=1440,900
- npx playwright screenshot http://localhost:3000 /tmp/picasso-before-mobile.png --viewport-size=375,812
-
- # ... make changes ...
-
- # After screenshots
- npx playwright screenshot http://localhost:3000 /tmp/picasso-after-desktop.png --viewport-size=1440,900
- npx playwright screenshot http://localhost:3000 /tmp/picasso-after-mobile.png --viewport-size=375,812
- ```
+ Take before/after screenshots and generate an HTML side-by-side comparison report. See `commands/before-after.md` for the full workflow.
 
 ### /steal <url> -- Design DNA Extraction
-
- Point at any live website and extract its design DNA:
- 1. Screenshot the URL at multiple viewports
- 2. Analyze the screenshot visually for: fonts, color palette, spacing rhythm, border-radius, animation style, layout structure
- 3. Use bash to fetch the page and extract CSS:
- ```bash
- curl -s "<url>" | grep -oE 'font-family:[^;]+' | sort -u | head -10
- curl -s "<url>" | grep -oE '#[0-9a-fA-F]{3,8}' | sort | uniq -c | sort -rn | head -15
- curl -s "<url>" | grep -oE 'border-radius:[^;]+' | sort -u
- ```
- 4. Generate a `.picasso.md` config that matches the extracted aesthetic
- 5. Optionally generate a DESIGN.md based on the extraction
+ Extract design language (fonts, colors, spacing, radius) from any live website or Figma file into a `.picasso.md` config. See `commands/steal.md` for the full workflow.
 
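The version being removed here showed the extraction greps inline; a self-contained sketch of the same pattern, run against an inline HTML snippet instead of a live `curl -s "<url>"` fetch so it works offline (the sample markup and values are made up for illustration):

```shell
# Sketch of the /steal CSS extraction, applied to an inline sample page.
page='<style>
body { font-family: "Space Grotesk", sans-serif; color: #1a1a2e; }
h1   { font-family: Fraunces, serif; border-radius: 4px; }
.cta { background: #e94560; border-radius: 4px; }
</style>'

echo "$page" | grep -oE 'font-family:[^;]+' | sort -u              # distinct font stacks
echo "$page" | grep -oE '#[0-9a-fA-F]{3,8}' | sort | uniq -c | sort -rn  # hex colors by frequency
echo "$page" | grep -oE 'border-radius:[^;]+' | sort -u            # distinct radius values
```

The frequency-sorted color list is what lets the command guess primary vs. accent vs. neutral roles before writing `.picasso.md`.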
 ### /mood <word> -- Instant Aesthetic from a Single Word
-
- Generate a complete design system from an evocative word or phrase:
- 1. Parse the mood word(s): "cyberpunk", "cottage", "brutalist-banking", "warm-saas", "dark-editorial"
- 2. Map to design tokens:
-    - Color palette (5-7 OKLCH values)
-    - Font pairing (display + body + mono)
-    - Border radius scale
-    - Shadow style
-    - Motion intensity
-    - Spacing density
- 3. Generate a complete `.picasso.md` config
- 4. Generate a `DESIGN.md` with the full token set
- 5. Show a preview summary: "Mood: cyberpunk -> Neon green on near-black, JetBrains Mono headers, sharp 2px radius, high motion, dense layout"
-
- Include a mood mapping table:
- | Mood | Palette Direction | Typography | Radius | Motion |
- |---|---|---|---|---|
- | cyberpunk | neon on dark, high contrast | monospace display + geometric body | sharp (0-2px) | high, glitch effects |
- | cottage | warm earth tones, muted | serif display + rounded body | soft (12-16px) | gentle, slow fades |
- | brutalist | black/white + one accent | mono or slab | none (0px) | minimal, abrupt |
- | luxury | deep neutrals + gold/cream | thin serif display + elegant sans | subtle (4-8px) | smooth, slow |
- | editorial | high contrast, limited palette | strong serif + clean sans | minimal (2-4px) | moderate, text-focused |
- | playful | bright, saturated, varied | rounded sans + handwritten accent | large (16-24px) | bouncy, energetic |
- | corporate | conservative blue/gray | clean sans + readable body | standard (8px) | subtle, professional |
- | dark-tech | dark surfaces + accent glow | geometric sans + monospace | sharp (2-4px) | fast, precise |
- | warm-saas | warm neutrals + friendly accent | humanist sans | medium (8-12px) | moderate, smooth |
- | minimal | near-black + white + one accent | one font family, varied weights | subtle (4px) | very subtle |
+ Generate a complete design system (`.picasso.md` + `DESIGN.md`) from an evocative word or phrase (e.g., "cyberpunk", "cottage", "brutalist-banking"). See `commands/mood.md` for the full workflow and mood mapping table.
 
 ### /score -- Quantified Design Quality Score
-
- Run a comprehensive scoring algorithm:
-
- 1. **Typography (0-15 pts)**: font choice (not banned default: 3), type scale consistency (3), max-width on text (3), line-height correctness (3), letter-spacing on caps (3)
- 2. **Color (0-15 pts)**: no pure black/gray (3), OKLCH or HSL usage (3), tinted neutrals (3), 60-30-10 rule (3), semantic colors exist (3)
- 3. **Spacing (0-10 pts)**: consistent scale (5), Gestalt grouping (5)
- 4. **Accessibility (0-20 pts)**: axe-core violations (10), focus-visible (3), semantic HTML (3), alt text (2), reduced-motion (2)
- 5. **Motion (0-10 pts)**: no transition:all (3), stagger pattern (3), reduced-motion support (2), no bounce easing (2)
- 6. **Responsive (0-10 pts)**: works at 375px (5), touch targets (3), no horizontal scroll (2)
- 7. **Performance (0-10 pts)**: Lighthouse perf score mapped (0-100 -> 0-10)
- 8. **Anti-Slop (0-10 pts)**: deductions for each AI-slop fingerprint detected (-2 each, minimum 0)
-
- Total: 0-100. Output as:
- ```
- ## Picasso Design Score: 73/100
-
- Typography:    ████████████░░░ 12/15
- Color:         ████████████░░░ 11/15
- Spacing:       ████████░░ 8/10
- Accessibility: ████████████████ 16/20
- Motion:        ██████░░░░ 6/10
- Responsive:    ████████░░ 8/10
- Performance:   ██████░░░░ 6/10
- Anti-Slop:     ██████░░░░ 6/10
-
- Top issues to fix for +15 points:
- 1. Add prefers-reduced-motion support (+4)
- 2. Replace #000 with tinted near-black (+3)
- 3. ...
- ```
+ 0-100 score across 8 categories (Typography, Color, Spacing, UX Heuristics, Motion, Responsive, Performance, Anti-Slop) with visual bars and top fixes for max point improvement. See `commands/score.md` for the full scoring algorithm.
 
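The "visual bars" in the summarized `/score` output are simple proportional fills. A sketch of how such a bar can be rendered, under the assumption (from the removed example output) that one cell equals one point; the scoring rules themselves live in `commands/score.md`:

```shell
# Render a proportional score bar: one filled cell per point earned,
# one empty cell per point missed, e.g. 12/15 -> 12 filled + 3 empty.
bar() {
  score=$1; max=$2; out=""
  i=1
  while [ "$i" -le "$max" ]; do
    if [ "$i" -le "$score" ]; then out="${out}█"; else out="${out}░"; fi
    i=$((i + 1))
  done
  printf '%s %s/%s\n' "$out" "$score" "$max"
}

bar 12 15   # ████████████░░░ 12/15
bar 8 10    # ████████░░ 8/10
```

Fixed-width bars like this keep the category breakdown scannable in a plain-text report.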
 ### /compete <url> -- Competitive Design Analysis
-
- Compare the current project against a competitor:
- 1. Screenshot both sites (desktop + mobile)
- 2. Extract design DNA from both
- 3. Compare head-to-head across categories:
-    - Typography quality
-    - Color cohesion
-    - Spacing consistency
-    - Motion sophistication
-    - Mobile experience
-    - Performance (Lighthouse)
-    - Accessibility (axe)
- 4. Output a comparison table with winner per category
- 5. Generate specific recommendations: "Their typography is stronger because they use a modular type scale. Yours uses 7 different font sizes with no clear ratio."
+ Head-to-head comparison against a competitor site across typography, color, spacing, motion, mobile, performance, and accessibility. See `commands/compete.md` for the full workflow.
 
 ### /evolve -- Iterative Design Refinement Loop
-
- Multi-round design refinement with visual previews at every step:
- 1. **Round 1: Directions** -- Generate 3 distinct aesthetic directions. For each, generate a visual preview card using the Side-by-Side Comparison template from `references/visual-preview.md`. Write to `/tmp/picasso-evolve-round1.html`, screenshot, view, present. Ask user to pick one (or combine elements).
- 2. **Round 2: Implementation** -- Implement the chosen direction in the actual codebase. Screenshot the running app. Ask "What do you love? What's not right?"
- 3. **Round 3+: Refinement** -- Apply feedback. Screenshot again. Ask "Are we there? Or one more round?"
- 4. Continue until user says "ship it"
-
- Rules:
- - Round 1 MUST show visual previews, not just text descriptions
- - Each direction must be genuinely different (not three variations of the same thing)
- - Always screenshot between rounds so the user can SEE the change
- - Max 5 rounds before suggesting we ship (diminishing returns)
-
- ### /mood-board -- Generate Visual Inspiration
-
- When the user isn't sure what they want, generate a visual mood board:
- 1. Ask for 3-5 adjectives or reference points
- 2. Search `references/style-presets.md` for matching presets (2-4 best matches)
- 3. Generate a comparison HTML using the Side-by-Side Direction Comparison template from `references/visual-preview.md`, showing each matched preset as a visual card with:
-    - Rendered font samples (heading + body) using actual fonts from the Font Mapping table
-    - Color palette strip with the preset's 5 key colors
-    - A sample card component and button in that preset's style
- 4. Write to `/tmp/picasso-moodboard.html`
- 5. Open with Playwright MCP, screenshot, view with Read (mandatory -- never skip)
- 6. Present to user: "Based on your adjectives, these presets match. Which elements resonate?"
+ Multi-round refinement: generate 3 directions with previews, implement user's pick, refine until "ship it." Max 5 rounds. See `commands/evolve.md` for the full workflow.
 
 ### /design-system-sync -- Auto-sync Code to DESIGN.md
 
@@ -940,63 +718,13 @@ Detect drift between DESIGN.md and actual code:
 4. Offer to auto-fix all drift with a single confirmation
 
 ### /preset <name> -- Apply Community Preset
-
- Apply a curated design preset by name.
-
- **When no preset name is given** (`/preset` with no arguments):
- 1. Load `references/style-presets.md` to get all 22 presets
- 2. Generate a **visual preset browser** using the Preset Browser Grid template from `references/visual-preview.md`
-    - Grid of cards (4 columns), one per preset
-    - Each card: preset name (in its heading font), color palette strip, one-line mood, sample button
-    - Card background uses the preset's surface color, text uses its text color
- 3. Write to `/tmp/picasso-preset-browser.html`
- 4. Open with Playwright MCP, screenshot, view with Read
- 5. Present: "Here are all 22 presets. Which one catches your eye?"
- 6. Wait for user to pick before proceeding
-
- **When a preset name is given** (`/preset bold-signal`):
- 1. Load the named preset from `references/style-presets.md`
- 2. Generate a **visual preview** of the preset (Full Page Mood Preview from `references/visual-preview.md`)
- 3. Write to `/tmp/picasso-preset-{name}.html`, screenshot, view
- 4. Present: "Here's what {name} looks like. Apply it?"
- 5. After confirmation:
-    - Generate `.picasso.md` + `DESIGN.md` from the preset
-    - Apply to the codebase (CSS variables, Tailwind config, font imports, component styling)
+ Browse all 22 presets visually (no argument) or preview and apply a specific preset by name. Generates `.picasso.md` + `DESIGN.md` from the chosen preset. See `commands/preset.md` for the full workflow.
 
 ## Advanced Automation Commands
 
 ### /perf -- Performance Audit
 
- Run Lighthouse CLI, extract Core Web Vitals (LCP, CLS, INP/TBT), report scores with pass/fail thresholds:
-
- ```bash
- npx lighthouse http://localhost:3000 --only-categories=performance --output=json --output-path=/tmp/lh-perf.json --chrome-flags="--headless --no-sandbox" --quiet
- ```
-
- Parse the JSON output to extract these metrics with thresholds:
-
- | Metric | Pass | Needs Work | Fail |
- |---|---|---|---|
- | Performance Score | >= 90 | 50-89 | < 50 |
- | FCP (First Contentful Paint) | < 1.8s | 1.8-3.0s | > 3.0s |
- | LCP (Largest Contentful Paint) | < 2.5s | 2.5-4.0s | > 4.0s |
- | CLS (Cumulative Layout Shift) | < 0.1 | 0.1-0.25 | > 0.25 |
- | TBT (Total Blocking Time) | < 200ms | 200-600ms | > 600ms |
- | SI (Speed Index) | < 3.4s | 3.4-5.8s | > 5.8s |
-
- ```bash
- # Parse results from JSON
- node -e "
- const r = require('/tmp/lh-perf.json');
- const a = r.audits;
- console.log('Performance Score:', Math.round(r.categories.performance.score * 100));
- console.log('FCP:', a['first-contentful-paint'].displayValue);
- console.log('LCP:', a['largest-contentful-paint'].displayValue);
- console.log('CLS:', a['cumulative-layout-shift'].displayValue);
- console.log('TBT:', a['total-blocking-time'].displayValue);
- console.log('SI:', a['speed-index'].displayValue);
- "
- ```
+ Run Lighthouse CLI, extract Core Web Vitals (FCP, LCP, CLS, TBT, SI), report scores with pass/fail thresholds. Pass: Perf >= 90, LCP < 2.5s, CLS < 0.1, TBT < 200ms. Fail: Perf < 50, LCP > 4s, CLS > 0.25, TBT > 600ms.
 
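The parsing step that the condensed `/perf` entry summarizes (and that the removed version showed inline) can be sketched against a stubbed report file, so it runs without a live Lighthouse audit. The field names below match Lighthouse's JSON report structure; the metric values are made up:

```shell
# Stub a minimal Lighthouse report so the parse step runs standalone.
cat > /tmp/lh-perf.json <<'EOF'
{
  "categories": { "performance": { "score": 0.91 } },
  "audits": {
    "largest-contentful-paint": { "displayValue": "2.1 s" },
    "cumulative-layout-shift":  { "displayValue": "0.05" },
    "total-blocking-time":      { "displayValue": "150 ms" }
  }
}
EOF

# Same parse pattern /perf describes: category score 0-1 -> 0-100, plus CWV display values.
node -e "
const r = require('/tmp/lh-perf.json');
const a = r.audits;
console.log('Performance Score:', Math.round(r.categories.performance.score * 100));
console.log('LCP:', a['largest-contentful-paint'].displayValue);
console.log('CLS:', a['cumulative-layout-shift'].displayValue);
console.log('TBT:', a['total-blocking-time'].displayValue);
"
```

With the stub above, the score line prints `Performance Score: 91`, which would land in the "Pass" band (>= 90).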
 ### /visual-diff -- Visual Regression
 
@@ -1089,266 +817,26 @@ Report findings grouped by category with severity and suggested token replacemen
 
 ### /install-hooks -- Git Pre-commit Hook
 
- Generate a `.git/hooks/pre-commit` script that runs fast design checks (grep-based, no server needed):
-
- ```bash
- cat > .git/hooks/pre-commit << 'HOOK'
- #!/usr/bin/env bash
- set -e
-
- STAGED=$(git diff --cached --name-only --diff-filter=ACM | grep -E '\.(tsx|jsx|css|html|svelte|vue)$' || true)
- [ -z "$STAGED" ] && exit 0
-
- ERRORS=0
-
- echo "Running Picasso pre-commit checks..."
-
- # 1. transition:all detection
- if echo "$STAGED" | xargs grep -l 'transition:\s*all' 2>/dev/null; then
-   echo "ERROR: transition:all found. Specify properties explicitly."
-   ERRORS=$((ERRORS + 1))
- fi
-
- # 2. Pure black (#000) detection
- if echo "$STAGED" | xargs grep -l '#000000\|#000[^0-9a-fA-F]' 2>/dev/null; then
-   echo "ERROR: Pure black (#000) found. Use tinted near-black instead."
-   ERRORS=$((ERRORS + 1))
- fi
-
- # 3. outline:none detection (without focus-visible replacement)
- if echo "$STAGED" | xargs grep -l 'outline:\s*none\|outline:\s*0[^.]' 2>/dev/null; then
-   echo "WARNING: outline:none found. Ensure :focus-visible has a replacement."
-   ERRORS=$((ERRORS + 1))
- fi
-
- # 4. Missing alt text detection
- if echo "$STAGED" | xargs grep -l '<img' 2>/dev/null | xargs grep -L 'alt=' 2>/dev/null; then
-   echo "ERROR: <img> tags without alt attribute found."
-   ERRORS=$((ERRORS + 1))
- fi
-
- if [ "$ERRORS" -gt 0 ]; then
-   echo ""
-   echo "Picasso found $ERRORS design issue(s). Fix them before committing."
-   exit 1
- fi
-
- echo "Picasso pre-commit checks passed."
- exit 0
- HOOK
- chmod +x .git/hooks/pre-commit
- echo "Pre-commit hook installed."
- ```
+ Generate a pre-commit hook that checks staged frontend files for: `transition:all`, pure `#000`, `outline:none` without focus-visible, and missing img alt text.
 
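The checks the one-line summary lists are plain greps; the removed hook wired them to `git diff --cached`. A self-contained sketch of the same patterns, run against an inline sample file (the `/tmp/picasso-sample.css` path is illustrative):

```shell
# Sketch of the pre-commit grep checks on a sample snippet.
# The real hook runs these over staged files, not a here-doc.
cat > /tmp/picasso-sample.css <<'EOF'
.card { transition: all 0.3s; color: #000; }
.btn:focus { outline: none; }
EOF

errors=0
grep -q 'transition: all' /tmp/picasso-sample.css \
  && { echo "ERROR: transition:all found. Specify properties explicitly."; errors=$((errors + 1)); }
grep -qE '#000([^0-9a-fA-F]|$)' /tmp/picasso-sample.css \
  && { echo "ERROR: pure #000 found. Use tinted near-black instead."; errors=$((errors + 1)); }
grep -q 'outline: none' /tmp/picasso-sample.css \
  && { echo "WARNING: outline:none found. Ensure :focus-visible has a replacement."; errors=$((errors + 1)); }
echo "issues found: $errors"
```

On the sample above all three patterns hit, so it prints `issues found: 3`; in the hook a nonzero count maps to `exit 1`, which blocks the commit.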
### /ci-setup -- GitHub Actions Workflow

- Generate a `.github/workflows/picasso-review.yml` that runs on PRs touching frontend files:
-
- ```yaml
- name: Picasso Design Review
-
- on:
- pull_request:
- paths:
- - '**/*.tsx'
- - '**/*.jsx'
- - '**/*.css'
- - '**/*.html'
- - '**/*.svelte'
- - '**/*.vue'
-
- jobs:
- picasso-review:
- runs-on: ubuntu-latest
- steps:
- - uses: actions/checkout@v4
-
- - uses: actions/setup-node@v4
- with:
- node-version: '20'
- cache: 'npm'
-
- - run: npm ci
-
- - name: Start dev server
- run: npm run dev &
- env:
- PORT: 3000
-
- - name: Wait for server
- run: npx wait-on http://localhost:3000 --timeout 60000
-
- - name: Accessibility audit (axe-cli)
- run: npx axe-cli http://localhost:3000 --exit --save /tmp/axe-results.json || true
-
- - name: Accessibility audit (pa11y)
- run: npx pa11y http://localhost:3000 --reporter json > /tmp/pa11y-results.json || true
-
- - name: Lighthouse accessibility
- run: |
- npx lighthouse http://localhost:3000 --only-categories=accessibility --output=json --output-path=/tmp/lh-a11y.json --chrome-flags="--headless --no-sandbox" --quiet || true
-
- - name: Lighthouse performance
- run: |
- npx lighthouse http://localhost:3000 --only-categories=performance --output=json --output-path=/tmp/lh-perf.json --chrome-flags="--headless --no-sandbox" --quiet || true
-
- - name: Take screenshots
- run: |
- npx playwright install chromium --with-deps
- npx playwright screenshot http://localhost:3000 /tmp/picasso-desktop.png --viewport-size=1440,900
- npx playwright screenshot http://localhost:3000 /tmp/picasso-mobile.png --viewport-size=375,812
-
- - name: Parse scores
- id: scores
- run: |
- PERF=$(node -e "const r=require('/tmp/lh-perf.json');console.log(Math.round(r.categories.performance.score*100))" 2>/dev/null || echo "N/A")
- A11Y=$(node -e "const r=require('/tmp/lh-a11y.json');console.log(Math.round(r.categories.accessibility.score*100))" 2>/dev/null || echo "N/A")
- echo "perf=$PERF" >> $GITHUB_OUTPUT
- echo "a11y=$A11Y" >> $GITHUB_OUTPUT
-
- - name: Upload artifacts
- uses: actions/upload-artifact@v4
- with:
- name: picasso-results
- path: /tmp/picasso-*.png
-
- - name: Post PR comment
- uses: actions/github-script@v7
- with:
- script: |
- const perf = '${{ steps.scores.outputs.perf }}';
- const a11y = '${{ steps.scores.outputs.a11y }}';
- const body = `## Picasso Design Review\n\n| Metric | Score |\n|---|---|\n| Performance | ${perf}/100 |\n| Accessibility | ${a11y}/100 |\n\nScreenshots uploaded as workflow artifacts.`;
- github.rest.issues.createComment({
- issue_number: context.issue.number,
- owner: context.repo.owner,
- repo: context.repo.repo,
- body
- });
- ```
+ Generate a GitHub Actions workflow that runs on frontend file PRs: install deps, start dev server, run axe-cli + pa11y + Lighthouse a11y/perf, take screenshots, post PR comment with scores.

### /a11y -- Accessibility Audit (Enhanced)

- Run all three accessibility tools with JSON output parsing:
-
- ```bash
- # 1. axe-cli -- WCAG 2.1 AA violations
- npx axe-cli http://localhost:3000 --exit --save /tmp/axe-results.json 2>/dev/null
- node -e "
- const r = require('/tmp/axe-results.json');
- const v = r[0]?.violations || [];
- console.log('axe-cli: ' + v.length + ' violations');
- v.forEach(v => console.log(' [' + v.impact + '] ' + v.id + ': ' + v.description + ' (' + v.nodes.length + ' nodes)'));
- "
-
- # 2. pa11y -- HTML_CodeSniffer + WCAG 2.1 AA
- npx pa11y http://localhost:3000 --reporter json > /tmp/pa11y-results.json 2>/dev/null
- node -e "
- const r = require('/tmp/pa11y-results.json');
- console.log('pa11y: ' + r.length + ' issues');
- r.forEach(i => console.log(' [' + i.type + '] ' + i.code + ': ' + i.message));
- "
-
- # 3. Lighthouse accessibility category
- npx lighthouse http://localhost:3000 --only-categories=accessibility --output=json --output-path=/tmp/lh-a11y.json --chrome-flags="--headless --no-sandbox" --quiet
- node -e "
- const r = require('/tmp/lh-a11y.json');
- const score = Math.round(r.categories.accessibility.score * 100);
- console.log('Lighthouse a11y score: ' + score + '/100');
- const failed = Object.values(r.audits).filter(a => a.score === 0);
- failed.forEach(a => console.log(' FAIL: ' + a.id + ' - ' + a.title));
- "
- ```
-
- Combine results from all three tools, deduplicate overlapping findings, and report with severity levels.
+ Run axe-cli, pa11y, and Lighthouse accessibility category with JSON output parsing. Combine results from all three tools, deduplicate overlapping findings, and report with severity levels.
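The combine-and-deduplicate step could look like this. A hedged sketch: the finding shape is a simplified stand-in, not the real axe/pa11y/Lighthouse JSON schemas.

```python
# Sketch of merging findings from several a11y tools and deduplicating
# overlapping reports. The finding dict shape is a simplified stand-in.
SEVERITY_RANK = {"critical": 0, "serious": 1, "moderate": 2, "minor": 3}

def merge_findings(*tool_reports: list[dict]) -> list[dict]:
    """Merge findings keyed on (rule, selector); keep the worst severity."""
    merged: dict[tuple, dict] = {}
    for report in tool_reports:
        for f in report:
            key = (f["rule"], f["selector"])
            existing = merged.get(key)
            if existing is None or SEVERITY_RANK[f["severity"]] < SEVERITY_RANK[existing["severity"]]:
                merged[key] = f
    # Worst findings first in the report
    return sorted(merged.values(), key=lambda f: SEVERITY_RANK[f["severity"]])
```

Keying on (rule, selector) is what makes "axe and pa11y both flagged this image" collapse into one finding instead of two.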

### /quick-audit -- 5-Minute Fast Audit
-
- When time is short or you need a triage before committing to a full audit. Takes 5 minutes, not 30.
-
- Check exactly these 6 things and report pass/fail for each:
-
- 1. **Font** -- Is it a banned default (Inter, Roboto, Arial, system-ui)? → FAIL/PASS
- 2. **Color** -- Are neutrals pure gray (#808080, #ccc) or tinted? → FAIL/PASS
- 3. **Layout** -- Is everything centered on one axis with no spatial variation? → FAIL/PASS
- 4. **Spacing** -- Is spacing uniform everywhere or does it follow gestalt grouping? → FAIL/PASS
- 5. **Accessibility** -- Does `outline: none` exist without `:focus-visible` replacement? → FAIL/PASS
- 6. **Anti-Slop** -- Do 3+ AI-slop fingerprints appear simultaneously? → FAIL/PASS
-
- Output format:
- ```
- ## Quick Audit: [project name]
-
- Font: PASS ✓ (Cabinet Grotesk + DM Sans)
- Color: FAIL ✗ (pure #808080 in 4 places)
- Layout: PASS ✓ (asymmetric grid with primary card dominant)
- Spacing: FAIL ✗ (uniform 32px between all sections)
- Accessibility: PASS ✓ (focus-visible defined globally)
- Anti-Slop: FAIL ✗ (4 fingerprints: centered layout + uniform cards + indigo accent + same spacing)
-
- Result: 3/6 — Needs work. Start with color and spacing.
- ```
+ 6 binary checks (font, color, layout, spacing, a11y, anti-slop) with pass/fail for each. See `commands/quick-audit.md` for the full workflow.

### /autorefine -- Binary Evaluation Loop
-
- Iterative improvement using binary (pass/fail) criteria. Inspired by SkillForge's autoresearch pattern that improved one skill from 56% to 92%.
-
- ### How It Works
-
- 1. **Define 6 binary criteria** (exactly 6 -- fewer is insufficient signal, more is over-optimization):
- ```
- 1. Typography: Non-default font used? (yes/no)
- 2. Color: OKLCH or tinted neutrals? (yes/no)
- 3. Spacing: Follows 4px scale with gestalt grouping? (yes/no)
- 4. Anti-slop: Fewer than 3 slop fingerprints? (yes/no)
- 5. Motion: prefers-reduced-motion respected? (yes/no)
- 6. Accessibility: No axe-core critical violations? (yes/no)
- ```
-
- 2. **Run baseline evaluation** -- check all 6 criteria against current state. Report pass rate (e.g., 3/6 = 50%).
-
- 3. **Mutate one thing at a time.** Pick the highest-impact failing criterion. Make the smallest change that flips it from FAIL to PASS. Do NOT change multiple things simultaneously -- you need to know what worked.
-
- 4. **Re-evaluate all 6 criteria** after each mutation. Sometimes fixing one thing breaks another.
-
- 5. **Iterate until 6/6 pass** across 3 consecutive evaluations. If a criterion keeps flipping between PASS and FAIL, the fix is unstable -- investigate root cause.
-
- 6. **Stop after 8 mutations maximum.** If you haven't hit 95% by then, the remaining issues are structural and need a `/redesign`, not incremental fixes.
-
- ### Output format per iteration:
- ```
- ## Autorefine: Iteration 3
-
- Mutation: Replaced pure grays with blue-tinted OKLCH neutrals in globals.css
-
- Typography: PASS ✓
- Color: PASS ✓ ← flipped from FAIL
- Spacing: PASS ✓
- Anti-slop: PASS ✓
- Motion: FAIL ✗
- Accessibility: PASS ✓
-
- Pass rate: 5/6 (83%) — up from 67%
- Next: Add prefers-reduced-motion guard to animations
- ```
+ Define 6 binary criteria, mutate one thing at a time, iterate to 6/6 pass. Max 8 mutations. See `commands/autorefine.md` for the full workflow.
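The loop summarized above can be sketched as follows. This is a hedged sketch: `evaluate` and `mutate` are hypothetical callbacks standing in for the real criterion checks and code fixes.

```python
def autorefine(evaluate, mutate, max_mutations: int = 8) -> dict:
    """Binary evaluation loop: mutate one failing criterion at a time.

    evaluate() -> dict[str, bool]   (criterion name -> passed?)
    mutate(criterion)               (apply the smallest fix for one criterion)
    Both callbacks are hypothetical stand-ins for the real checks/fixes.
    """
    history = []
    for step in range(max_mutations + 1):
        results = evaluate()
        history.append(results)
        # Done once all criteria pass on 3 consecutive evaluations
        if len(history) >= 3 and all(all(h.values()) for h in history[-3:]):
            return {"status": "done", "mutations": step, "results": results}
        if step == max_mutations:
            break
        failing = [name for name, ok in results.items() if not ok]
        if not failing:
            continue  # passing, but not yet stable across 3 evaluations
        mutate(failing[0])  # one change at a time, so you know what worked
    return {"status": "needs-redesign", "mutations": max_mutations, "results": history[-1]}
```

Hitting the mutation cap without stabilizing returns `needs-redesign`, mirroring the "remaining issues are structural" rule above.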

---

## The Studio Standard
-
- Picasso is not a linter. It is not a checklist runner. It is a design studio that produces work indistinguishable from a senior human designer. Every invocation should feel like working with a creative director who:
-
- 1. **Analyzes before prescribing.** Read the codebase, understand the product, study the competitors, THEN make recommendations. Never present a generic capability menu -- two projects should get different recommendations because they ARE different. The right answer for a legal SaaS is not the same as for a music app.
-
- 2. **Delivers a creative vision** before writing code. A Design Brief that is specific to THIS project -- if you could swap the project name and the brief still works, it's too generic.
-
- 3. **Actually implements what was promised.** If the brief says "soft click sound on primary buttons" -- the final output must include: the useSound hook from sensory-design.md, the audio source (Tone.js synthesis or base64), the event wiring in the button component, and the prefers-reduced-motion guard. Not "I recommend adding sounds" -- the actual working code.
-
- 4. **Uses the reference library.** The 30+ reference files contain battle-tested, production-ready code patterns. When you recommend something, read the relevant reference and use its code. Do not reinvent. Do not hallucinate simpler versions.
-
- 5. **Verifies with screenshots.** Every visual claim is backed by an actual screenshot that was taken AND viewed. No exceptions.
-
- 6. **Knows when to say no.** Not every project needs animations. Not every project needs sound. Not every project needs haptics. The mark of a great designer is knowing what to leave out. If you recommend something, you must be able to articulate why THIS project benefits from it specifically -- not "it's a best practice" or "it improves perceived quality." WHY for THIS product, THESE users, THIS context.
+ Analyze the specific project before recommending anything. Deliver what was promised. Verify everything with screenshots. Know when to say no.

---

@@ -1370,7 +858,7 @@ Picasso is not a linter. It is not a checklist runner. It is a design studio tha
14. **NEVER pair a dark sidebar with a gradient CTA button.**
15. **NEVER put icons inside colored circle/rounded-square containers** (the `bg-color-100 p-2 rounded-lg` pattern).
16. **NEVER add hover:-translate-y + shadow-lg to cards.** Use subtle background color change only.
- 17. **NEVER add staggered entrance animations to static data** (animation-delay on stat cards).
+ 17. **NEVER add staggered entrance animations on individual stat cards, data rows, or repeated items** (animation-delay per card/row). Page-level section stagger (hero -> content -> footer) is fine.
18. **Prefer subtraction over addition.** The best redesign often removes visual noise rather than adding decoration.
19. **Study real competitors first.** Before any redesign, identify what actual products in the same industry look like. Match their energy, not a generic SaaS template.
20. **The restraint test:** Before writing any visual change, ask "Would Linear/Notion/Stripe do this?" If the answer is no, don't do it.
package/commands/godmode.md CHANGED
@@ -2,49 +2,39 @@ Run the Picasso /godmode command -- the ultimate design transformation pipeline.
 
Use the Picasso agent (subagent_type: "picasso") to execute the full godmode pipeline:

- ANTI-HALLUCINATION RULE: Every phase that makes visual claims MUST gather evidence first. For live sites, take screenshots via `npx playwright screenshot` AND view them with the Read tool. For Figma files, use MCP tools to fetch structural data AND export images. Never claim light/dark mode, color, or layout from code alone.
+ ANTI-HALLUCINATION RULE: Every phase that makes visual claims MUST gather evidence first. Take screenshots via `npx playwright screenshot` AND view them with the Read tool. Never claim light/dark mode, color, or layout from code alone.

- VISUAL EVIDENCE SOURCES:
- - **Live site:** Playwright screenshots (take AND view with Read tool)
- - **Figma file:** MCP data (structural facts) + `mcp__figma__get_image` (visual verification)
- - **Both available:** Use Figma as design intent, Playwright as implementation reality. Flag gaps.
+ If a Figma URL is provided, run /figma first to extract design tokens, then proceed with those as ground truth.

Phase 1: UNDERSTAND
- Check for .picasso.md config. If not found, run the design interview (ask what we're building, who it's for, aesthetic direction, priorities 1-5 for animations/mobile/a11y/dark mode/performance, constraints).
- Gather context: read all frontend files, find design system, detect component library.
- - If a Figma URL is available or Figma MCP is configured, fetch the design file structure and styles as ground truth.

Phase 2: ASSESS
- Take BEFORE screenshots (desktop + mobile) and VIEW them with the Read tool.
- - If Figma source exists, fetch design tokens via MCP and compare against implementation.
- Run /score to establish the BEFORE score (0-100 with category breakdown).
- - Run /roast for the brutally honest assessment (must be based on screenshots/Figma data, not code guessing).
+ - Run /roast for the brutally honest assessment (must be based on screenshots, not code guessing).
- Run /audit for full technical audit with severity-ranked findings.
- Run /a11y (axe-core + pa11y + Lighthouse accessibility).
- Run /perf (Lighthouse Core Web Vitals).
- Run /lint-design (find hardcoded colors, spacing violations, font inconsistencies).
- - If Figma MCP available: Run /figma --audit for Figma-specific design system health check.

Phase 3: PLAN
- Compile all findings into a prioritized fix list (Critical -> High -> Medium -> Low).
- - If Figma source exists, prioritize design-implementation gaps as High severity.
- Present the plan: "Found X issues. Fixing all = score ~Y. Proceed?"
- WAIT for user confirmation before proceeding.

Phase 4: FIX
- Execute fixes in priority order: typography, color, spacing, layout, motion, accessibility, interaction, performance, copy.
- - When Figma tokens are available, use them as the source of truth for fixes.
- Re-verify after each category.

Phase 5: VERIFY
- Run /score again for the AFTER score.
- Take AFTER screenshots and VIEW them with the Read tool.
- - If Figma source exists, re-compare to check implementation now matches design intent.
- Generate before/after comparison.

Phase 6: REPORT
- Show final score comparison with per-category breakdown.
- Show files modified and issues fixed.
- - If Figma comparison was done, show design fidelity score (% match).

If the before score is already 85+, say so and suggest the 3-4 things that would take it to 95+.
package/commands/roast.md CHANGED
@@ -4,30 +4,19 @@ Use the Picasso agent to review the current project's frontend with sharp, desig

MANDATORY FIRST STEP -- Gather visual evidence before writing anything:

- **Option A: Live site (localhost or URL)**
1. Take screenshots: `npx playwright screenshot http://localhost:PORT /tmp/picasso-roast-desktop.png --viewport-size=1440,900` (and mobile at 375,812)
2. Use the Read tool to VIEW the screenshot files: `Read /tmp/picasso-roast-desktop.png` and `Read /tmp/picasso-roast-mobile.png`
3. Base ALL visual observations on what you actually see in the screenshots, NOT on code/CSS classes

- **Option B: Figma file (URL provided or MCP available)**
- 1. Extract file_key from the Figma URL
- 2. Fetch the target frame via `mcp__figma__get_node` for structural data (spacing, colors, typography, auto-layout)
- 3. Export the frame as an image via `mcp__figma__get_image` for visual review
- 4. Fetch styles via `mcp__figma__get_styles` to check design system usage
- 5. Base structural observations on MCP data (exact values) and visual observations on the exported image
+ If a Figma URL is provided, run /figma first to extract design tokens, then proceed with those as ground truth.

- **Option C: Both exist (Figma + live site)**
- 1. Do both A and B
- 2. Include a "Design vs Implementation" delta section in the roast — flag where the dev diverged from the design
-
- 4. If NEITHER screenshots NOR Figma MCP work, tell the user and DO NOT make visual claims. You can still audit code patterns but must prefix findings with "Based on code analysis only (no screenshot):"
+ If NEITHER screenshots NOR Figma MCP work, tell the user and DO NOT make visual claims. You can still audit code patterns but must prefix findings with "Based on code analysis only (no screenshot):"

ANTI-HALLUCINATION RULES:
- NEVER say "this is light mode" or "dark mode" without viewing a screenshot or Figma frame data
- NEVER describe colors, layouts, or visual appearance from code alone
- NEVER claim "this looks like X" without a screenshot or Figma export to verify
- - Code classes (e.g. `dark:bg-gray-900`) tell you what COULD render; only screenshots/Figma show what DOES render
- - When using Figma MCP data, you CAN state exact values (e.g., "spacing is 17px" or "fill is #808080") because these are structural facts, not visual guesses
+ - Code classes (e.g. `dark:bg-gray-900`) tell you what COULD render; only screenshots show what DOES render

Rules:
- Be specific about every criticism (file:line, element reference, or Figma node name)
package/commands/score.md CHANGED
@@ -4,27 +4,16 @@ Use the Picasso agent to score the current project's frontend design on a 0-100

MANDATORY FIRST STEP -- Gather visual evidence before scoring:

- **Option A: Live site (localhost or URL)**
1. Take screenshots: `npx playwright screenshot http://localhost:PORT /tmp/picasso-score-desktop.png --viewport-size=1440,900` (and mobile at 375,812)
2. Use the Read tool to VIEW the screenshot files before scoring visual categories
3. If screenshots fail, tell the user and score only code-auditable categories (mark visual categories as "N/A - no screenshot")

- **Option B: Figma file (URL provided or MCP available)**
- 1. Fetch the target frame via `mcp__figma__get_node` for structural data
- 2. Fetch styles via `mcp__figma__get_styles` for design system analysis
- 3. Export frame as image via `mcp__figma__get_image` for visual review
- 4. Score based on both structural data (exact values) and exported image
- 5. Add bonus category: Design System Health (0-10)
-
- **Option C: Both (Figma + live site)**
- 1. Do both A and B
- 2. Add category: Design Fidelity (0-10) -- how closely implementation matches Figma intent
+ If a Figma URL is provided, run /figma first to extract design tokens, then proceed with those as ground truth.

ANTI-HALLUCINATION RULES:
- - Visual categories (Typography appearance, Color in practice, Spacing rhythm, Anti-Slop visual check) MUST be scored from screenshots or Figma exports, not code alone
+ - Visual categories (Typography appearance, Color in practice, Spacing rhythm, Anti-Slop visual check) MUST be scored from screenshots, not code alone
- Code-auditable categories (a11y violations via axe, transition:all grep, prefers-reduced-motion grep) can be scored from code
- - When using Figma MCP, structural data (exact spacing, color, typography values) IS factual and can be stated directly
- - Never claim "this looks like X" without viewing a screenshot or Figma export
+ - Never claim "this looks like X" without viewing a screenshot

Categories:
- Typography (0-15): font choice, type scale, max-width, line-height, letter-spacing
@@ -36,8 +25,4 @@ Categories:
- Performance (0-10): Lighthouse perf score mapped 0-100 -> 0-10
- Anti-Slop (0-10): deductions for each AI-slop fingerprint detected (-2 each)

- Bonus categories (when Figma MCP is available):
- - Design System Health (0-10): style usage %, component coverage, naming consistency, auto-layout adoption
- - Design Fidelity (0-10, only when both Figma + live): token match, spacing accuracy, structural parity
-
Output format with visual bars and top fixes for maximum point improvement.
package/commands/steal.md CHANGED
@@ -2,30 +2,7 @@ Run the Picasso /steal command -- extract design DNA from a URL or Figma file.

Use the Picasso agent to extract the design language from the provided source: $ARGUMENTS

- ## Input Detection
-
- - **Figma URL** (contains `figma.com/design/` or `figma.com/file/`): Use Figma MCP for precise extraction
- - **Live URL** (any other http/https): Use Playwright screenshots + source scraping
- - **Both provided**: Use both sources, Figma as ground truth and live site for verification
-
- ## Steps: Figma URL (Preferred)
-
- 1. Extract `file_key` and optional `node_id` from the Figma URL
- 2. Fetch styles via `mcp__figma__get_styles` — extract all color styles, text styles, effect styles
- 3. Fetch target frame via `mcp__figma__get_node` — extract auto-layout spacing, fills, strokes, radii
- 4. Fetch components via `mcp__figma__get_components` — understand component structure
- 5. Export frame as image via `mcp__figma__get_image` for visual reference
- 6. Analyze the extracted data:
- - **Colors:** All fill/stroke colors → convert to OKLCH. Identify primary, secondary, accent, neutral.
- - **Typography:** Font families, size scale, weight distribution, line-height ratios.
- - **Spacing:** Auto-layout itemSpacing and padding → detect base unit (4px? 8px?) and scale.
- - **Shadows:** Effect styles → map to elevation scale.
- - **Radii:** Border radius values → detect pattern (uniform? progressive?).
- - **Layout:** Auto-layout direction, alignment, wrapping → grid/flex patterns.
- 7. Generate a `.picasso.md` config matching the extracted aesthetic
- 8. Optionally generate a `DESIGN.md` with the full token set
-
- ## Steps: Live URL
+ ## Steps

1. Screenshot the URL at desktop (1440x900) and mobile (375x812)
2. Fetch the page source and extract: font-family declarations, color values (#hex, rgb, oklch), border-radius values, box-shadow values
@@ -33,11 +10,6 @@ Use the Picasso agent to extract the design language from the provided source: $
4. Generate a `.picasso.md` config that matches the extracted aesthetic
5. Optionally generate a `DESIGN.md` with the full token set

- ## Steps: Both (Figma + Live URL)
-
- 1. Run Figma extraction (ground truth for intended design)
- 2. Run live URL extraction (what actually shipped)
- 3. Generate tokens from Figma source
- 4. Note any divergences between design and implementation in the output
+ If a Figma URL is provided, run /figma first to extract design tokens, then proceed with those as ground truth.

- If no URL is provided, ask the user for one. Accept both Figma URLs and live URLs.
+ If no URL is provided, ask the user for one.
package/package.json CHANGED
@@ -1,6 +1,6 @@
{
"name": "picasso-skill",
- "version": "2.3.1",
+ "version": "2.5.0",
"description": "The ultimate AI design skill for producing distinctive, production-grade frontend interfaces",
"bin": {
"picasso-skill": "./bin/install.mjs"
package/references/ux-evaluation.md ADDED
@@ -0,0 +1,211 @@
+ # UX Evaluation Reference
+
+ Structured frameworks for evaluating interface quality. Use these during /score, /roast, /audit, and the visual discovery crawl phase.
+
+ ---
+
+ ## 1. Nielsen's 10 Usability Heuristics (Evaluation Checklist)
+
+ For each heuristic, check the listed indicators. Score pass/fail for each.
+
+ ### H1: Visibility of System Status
+ The system should always keep users informed about what is going on.
+ - [ ] Loading states exist for async actions (skeletons, spinners, progress bars)
+ - [ ] Form submission shows pending/success/error feedback
+ - [ ] Current page/section is highlighted in navigation
+ - [ ] Active filters/sorts are visually indicated
+ - [ ] Upload progress is shown
+ - **Check in code:** grep for loading states, skeleton components, progress indicators
+ - **Check in screenshot:** is the current nav item highlighted? Are there loading indicators?
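The code-side H1 check can be sketched as a simple pattern scan. A hedged sketch: the indicator patterns are illustrative, not an exhaustive list.

```python
import re

# Hypothetical H1 scan: count loading-state indicators in component source.
# The pattern list is illustrative, not exhaustive.
H1_PATTERNS = [r"isLoading", r"<Skeleton", r"<Spinner", r"aria-busy", r"<progress"]

def h1_signal(source: str) -> int:
    """Number of distinct loading-state indicators found in this file."""
    return sum(1 for p in H1_PATTERNS if re.search(p, source))
```

A count of zero across an app's async views is a strong H1 failure signal; the same scan shape works for the other "check in code" greps below.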
+
+ ### H2: Match Between System and Real World
+ Use language and concepts familiar to the user, not system-oriented terms.
+ - [ ] Button labels use verbs the user understands ("Save changes" not "Submit")
+ - [ ] Error messages explain the problem in plain language
+ - [ ] Navigation labels match user mental models
+ - [ ] Icons are conventional (trash = delete, pencil = edit, plus = add)
+ - **Check in code:** grep for generic labels ("Submit", "Click here", "Data")
+
+ ### H3: User Control and Freedom
+ Users need a clear emergency exit when they make mistakes.
+ - [ ] Modals have close buttons AND escape key support
+ - [ ] Destructive actions have confirmation OR undo
+ - [ ] Multi-step flows have back navigation
+ - [ ] Users can cancel in-progress operations
+ - **Check in code:** grep for confirm() dialogs, undo patterns, modal close handlers
+
+ ### H4: Consistency and Standards
+ Follow platform conventions. Same action = same result everywhere.
+ - [ ] Primary buttons look the same across all pages
+ - [ ] Same icon means the same thing everywhere
+ - [ ] Spacing and typography follow a consistent scale
+ - [ ] Color meanings are consistent (red = error, green = success)
+ - **Check in code:** grep for hardcoded colors, inconsistent button styles
+
+ ### H5: Error Prevention
+ Prevent problems from occurring in the first place.
+ - [ ] Required fields are marked before submission
+ - [ ] Date inputs use pickers (not free text)
+ - [ ] Destructive buttons are visually distinct (red/outlined, not primary)
+ - [ ] Inline validation catches errors before form submission
+ - **Check in code:** grep for required fields, inline validation, input types
+
+ ### H6: Recognition Rather Than Recall
+ Minimize memory load. Make options visible.
+ - [ ] Navigation is always visible (not hidden behind hamburger on desktop)
+ - [ ] Search results show context around matches
+ - [ ] Forms show labels (not placeholder-only)
+ - [ ] Recent items, favorites, or shortcuts are available
+ - **Check in screenshot:** are labels visible? Is navigation persistent?
+
+ ### H7: Flexibility and Efficiency of Use
+ Allow experts to speed up their workflow.
+ - [ ] Keyboard shortcuts exist for frequent actions
+ - [ ] Bulk operations are available for lists
+ - [ ] Command palette or search exists (Cmd+K)
+ - [ ] Default values are intelligent
+ - **Check in code:** grep for keyboard event listeners, bulk action patterns
+
+ ### H8: Aesthetic and Minimalist Design
+ Every extra element competes with relevant information.
+ - [ ] No decorative elements that don't serve a purpose
+ - [ ] Information hierarchy is clear (most important = most prominent)
+ - [ ] White space is used to group related elements
+ - [ ] No more than 3-4 colors for data categories
+ - **Check in screenshot:** squint test -- does hierarchy still read?
+
+ ### H9: Help Users Recognize, Diagnose, and Recover from Errors
+ Error messages should be in plain language, indicate the problem, and suggest a fix.
+ - [ ] Error messages follow: what happened + why + how to fix
+ - [ ] Form errors appear next to the relevant field
+ - [ ] API errors don't show raw technical messages to users
+ - [ ] Empty states guide the user on what to do next
+ - **Check in code:** grep for error handling, error messages, empty states
+
+ ### H10: Help and Documentation
+ Even though a system should be usable without docs, help should be available.
+ - [ ] Tooltips explain non-obvious UI elements
+ - [ ] Onboarding exists for first-time users
+ - [ ] Complex features have inline help or documentation links
+ - [ ] Keyboard shortcuts are discoverable
+ - **Check in code:** grep for tooltip components, help text, onboarding flows
+
+ ---
+
+ ## 2. Jobs to Be Done (JTBD) Framework
+
+ Use JTBD to understand WHY users interact with the app, not just WHAT they do. This informs design decisions during the crawl phase.
+
+ ### Extracting JTBD from Code
+
+ Analyze the codebase to identify user jobs:
+
+ 1. **Route structure** reveals user tasks:
+ - `/dashboard` = "When I start my day, I want to see what needs attention"
+ - `/clients/[id]` = "When I work on a client, I want all their info in one place"
+ - `/billing` = "When I need to invoice, I want to track time and generate bills"
+ - `/analyze` = "When I receive a contract, I want to understand the risks"
+
+ 2. **API endpoints** reveal user actions:
+ - POST /api/clients = "I want to onboard a new client"
+ - POST /api/analyze = "I want AI to review this document"
+ - GET /api/dashboard = "I want a summary of my practice"
+
+ 3. **Component names** reveal UI functions:
+ - `<ClientForm>` = data entry job
+ - `<TimerWidget>` = time tracking job
+ - `<RedlineView>` = document review job
+
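A minimal sketch of the route-based extraction, assuming a routes list has already been collected. The route-to-job map is a hypothetical starting table that a real run would extend per project.

```python
# Hypothetical route-to-job map; a real run would extend this per project.
ROUTE_JOBS = {
    "dashboard": "When I start my day, I want to see what needs attention",
    "billing": "When I need to invoice, I want to track time and generate bills",
    "analyze": "When I receive a contract, I want to understand the risks",
}

def jobs_from_routes(routes: list[str]) -> dict[str, str]:
    """Map route paths to Jobs-to-Be-Done statements by first path segment."""
    jobs = {}
    for route in routes:
        segment = route.strip("/").split("/")[0]
        if segment in ROUTE_JOBS:
            jobs[route] = ROUTE_JOBS[segment]
    return jobs
```

Unmapped routes are left out rather than guessed; those are the ones worth asking the user about.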
119
+ ### Using JTBD to Inform Design
120
+
121
+ For each identified job, ask:
122
+ - **What's the trigger?** When does the user need to do this?
123
+ - **What's the desired outcome?** What does success look like?
124
+ - **What's the anxiety?** What could go wrong?
125
+ - **What's the context?** Where/when do they do this? (mobile? desktop? in a meeting?)
126
+
127
+ Design decisions should optimize for the job:
128
+ - High-frequency jobs need the fastest path (fewest clicks, most prominent placement)
129
+ - High-stakes jobs need the most clarity (larger text, explicit confirmation, clear feedback)
130
+ - Time-pressured jobs need efficiency (keyboard shortcuts, bulk actions, smart defaults)
131
+
132
+ ---
133
+
134
+ ## 3. Prompt Enhancement
135
+
136
+ When a user gives a vague design request, enhance it before proceeding.
137
+
138
+ ### Vague-to-Specific Mapping
139
+
140
+ | User Says | What They Mean | What to Do |
141
+ |-----------|---------------|------------|
142
+ | "Make it look good" | It looks amateur, fix the obvious issues | Run /audit, fix critical+high |
143
+ | "Make it modern" | It looks dated, update the aesthetic | Check font (is it Arial?), colors (pure gray?), radius (sharp corners?) |
144
+ | "Make it clean" | Too much visual noise, simplify | Remove decorative elements, increase whitespace, reduce color count |
145
+ | "Make it pop" | Not enough visual hierarchy, too flat | Increase contrast, add depth, strengthen heading sizes |
146
+ | "Make it professional" | It looks like a student project | Fix typography scale, add consistent spacing, tighten color palette |
147
+ | "I don't know what I want" | They need visual discovery | Generate the 10-20 sample gallery and let them react |
148
+
149
+ ### Enhancement Process
150
+
151
+ 1. Identify the complaint (what's wrong) vs. the goal (what they want)
152
+ 2. Map to specific design properties (typography, color, spacing, layout, motion)
153
+ 3. Propose concrete changes with before/after preview
154
+ 4. Never ask "what do you mean by modern?" -- instead, show 3 interpretations and ask which fits
155
+
156
+ ---
157
+
158
+ ## 4. State Machine for Interactive Components
159
+
160
+ Map all states for each interactive element. Missing states are the #1 source of unpolished UI.
161
+
162
+ ### The 8 States
163
+
164
+ Every interactive element should define:
165
+
166
+ | State | Visual Treatment | Trigger |
167
+ |-------|-----------------|---------|
168
+ | **Default** | Base appearance | Page load |
169
+ | **Hover** | Subtle background/border change | Mouse enters |
170
+ | **Focus** | Visible ring/outline (2px+ solid) | Tab navigation |
171
+ | **Active/Pressed** | Scale down slightly (0.97-0.98) | Mouse down |
172
+ | **Disabled** | Reduced opacity (0.5), no pointer | Programmatic |
173
+ | **Loading** | Spinner or pulse, disabled interaction | Async action |
174
+ | **Error** | Red border/text, error message | Validation fail |
175
+ | **Success** | Green indicator, confirmation | Action complete |
176
+
177
+ ### Audit Checklist
178
+
179
+ For each component type, verify states exist:
180
+
181
+ | Component | States to Check |
182
+ |-----------|----------------|
183
+ | Button | default, hover, focus, active, disabled, loading |
184
+ | Input | default, hover, focus, filled, error, disabled |
185
+ | Card (clickable) | default, hover, focus, active |
186
+ | Link | default, hover, focus, visited |
187
+ | Toggle | off, on, hover, focus, disabled |
188
+ | Select | default, hover, focus, open, selected, error |
189
+ | Modal | enter, exit, backdrop |
190
+
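The audit checklist can be made mechanical. A sketch: `implemented_states` would come from parsing component variants or CSS selectors in a real audit, and the map below shows only a few of the component types from the table.

```python
# Required states per component type, mirroring part of the audit checklist.
REQUIRED_STATES = {
    "button": {"default", "hover", "focus", "active", "disabled", "loading"},
    "input": {"default", "hover", "focus", "filled", "error", "disabled"},
    "link": {"default", "hover", "focus", "visited"},
}

def missing_states(component: str, implemented_states: set[str]) -> set[str]:
    """States the checklist requires that the component does not implement."""
    return REQUIRED_STATES.get(component, set()) - implemented_states
```

Any non-empty result is an audit finding; an unknown component type yields an empty set rather than a false positive.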
+ ---
+
+ ## 5. Scoring with Heuristics
+
+ When running /score, add heuristic evaluation points:
+
+ ```
+ Heuristic Evaluation (0-20 pts):
+ H1 System status: /2 (loading states, feedback)
+ H2 Real world match: /2 (language, icons)
+ H3 User control: /2 (undo, escape, back)
+ H4 Consistency: /2 (styles, patterns)
+ H5 Error prevention: /2 (validation, confirmation)
+ H6 Recognition: /2 (labels, navigation)
+ H7 Efficiency: /2 (shortcuts, bulk ops)
+ H8 Minimal design: /2 (hierarchy, whitespace)
+ H9 Error recovery: /2 (messages, guidance)
+ H10 Help: /2 (tooltips, onboarding)
+ ```
+
+ This replaces the ad-hoc accessibility scoring with a structured UX evaluation.
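The 0-20 tally above reduces to a one-line sum. A hypothetical helper; subscores are the per-heuristic 0-2 values from the checklist.

```python
def heuristic_score(subscores: dict[str, int]) -> int:
    """Sum per-heuristic subscores (each clamped to 0-2) into the 0-20 total."""
    return sum(max(0, min(2, s)) for s in subscores.values())
```

Clamping keeps a miscounted subscore from inflating the total past the 2-point cap per heuristic.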