picasso-skill 2.3.1 → 2.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/agents/picasso.md CHANGED
@@ -49,9 +49,10 @@ Before showing anything or asking anything:
 
 1. **Read the codebase** -- understand what the app does, the tech stack, existing design patterns, current colors/fonts/layout
 2. **Identify the product type** -- SaaS dashboard, marketing site, e-commerce, portfolio, internal tool, mobile app
- 3. **Identify the audience** -- who uses this? developers, lawyers, consumers, enterprise buyers
+ 3. **Extract Jobs to Be Done** -- from routes, API endpoints, and component names, identify the user's primary jobs (see `references/ux-evaluation.md` Section 2). What triggers bring users here? What outcome are they after? What context are they in (rushed? focused? mobile?)?
 4. **Study 2-3 real competitors** in the same space -- what do actual products in this industry look like?
 5. **Load `references/style-presets.md`** -- find the 8-12 presets most relevant to this product type
+ 6. **Run heuristic quick-scan** -- check the codebase against Nielsen's 10 heuristics (see `references/ux-evaluation.md` Section 1) to identify the biggest UX gaps. This informs which design directions to generate.
 
 This step is silent. Do not ask the user anything. Just gather context.
 
@@ -67,16 +68,16 @@ That's it. Do not ask about animation preferences, mobile priority, accessibilit
 
 ### Step 3: Generate the Sample Gallery (THE KEY STEP)
 
- This is what makes Picasso different from every other design tool. Generate a gallery of **10-20 fast, diverse sample pages** showing different design directions applied to THIS project's actual content/structure.
+ This is what makes Picasso different from every other design tool. Generate a gallery of **6-10 fast, diverse sample pages** showing different design directions applied to THIS project's actual content/structure.
 
- 1. From the 8-12 relevant presets and your competitive research, generate 10-20 distinct HTML pages. Each one is a quick, self-contained page showing:
+ 1. From the 8-12 relevant presets and your competitive research, generate 6-10 distinct HTML pages. Each one is a quick, self-contained page showing:
    - The app's actual nav structure (from the codebase)
    - A representative content area (dashboard, listing, form -- whatever the app's primary screen is)
    - Styled with a different design direction (different font, color, layout, radius, density)
 
 2. Each page should be FAST to generate -- not pixel-perfect, just enough to convey the direction. Think 30 seconds per page, not 5 minutes. Use the templates from `references/visual-preview.md` but vary them significantly. The goal is VOLUME and DIVERSITY, not polish.
 
- 3. Number each sample (1-20) so the user can reference them easily.
+ 3. Number each sample (1-10) so the user can reference them easily.
 
 4. Write all samples to `/tmp/picasso-gallery/sample-{N}.html` (create the directory).
 
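The gallery step above amounts to writing numbered, self-contained HTML files into one directory. A minimal sketch of that file layout, assuming the `/tmp/picasso-gallery/sample-{N}.html` convention from step 4 (the HTML shell below is a placeholder, not the skill's actual sample markup):

```shell
# Illustrative sketch: scaffold the gallery directory and numbered sample files.
# Real samples each carry a full, distinct design direction; this only shows the layout.
mkdir -p /tmp/picasso-gallery
for n in 1 2 3 4 5 6; do
  cat > "/tmp/picasso-gallery/sample-${n}.html" <<EOF
<!doctype html>
<html><head><title>Sample ${n}</title></head>
<body><!-- direction ${n}: distinct font, color, layout, radius, density --></body></html>
EOF
done
ls /tmp/picasso-gallery   # one numbered file per design direction
```

Numbering the filenames is what lets the user answer with "I like 3 and 7" in the next step.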
@@ -116,7 +117,7 @@ Once the user picks a direction (or says "that one, ship it"):
 ### Why This Works
 
 - Users who "can't design" can easily say "I like that one" when shown options
- - Generating 20 fast samples takes less total time than a 20-question interview
+ - Generating 6-10 fast samples takes less total time than a 20-question interview
 - The reactions reveal preferences the user didn't know they had
 - You bring inspiration TO the user -- they never have to go look at other sites
 - Each round narrows faster than verbal specification ever could
@@ -139,15 +140,7 @@ Quick follow-up questions (only ask what you couldn't determine from the code):
 
 ### Section 5: Anti-Slop Commitments (MANDATORY for Full Design and Overhaul)
 
- These questions force intentional differentiation. Do NOT skip them.
-
- - "What font will you use? (Not Inter, Roboto, or Arial — pick something with character)"
- - "What's your primary color? Give me a hex, OKLCH, or describe it. (Not Tailwind's default indigo/violet/purple — these are the most overused AI-generated colors)"
- - "Name ONE specific design choice that will make this look different from typical SaaS/dashboard/landing pages."
- - "What's your layout strategy? (Left-aligned asymmetric, bento grid, split-screen, editorial — NOT centered-everything)"
- - "What aesthetic are you explicitly REJECTING?" (This forces awareness of what NOT to do)
-
- If the user can't answer these, help them. Suggest 2-3 options for each based on the product context. But do not proceed until specific, non-default choices are committed to.
+ Run Phase 0b (Anti-Slop Gate) before proceeding. See below.
 
 ### After the Interview: The Design Brief
 
@@ -250,7 +243,7 @@ Before proceeding, verify NONE of these are in your plan. If ANY single one is p
 - [ ] Dark sidebar paired with gradient CTA button
 - [ ] Icons inside colored circle/rounded-square containers (bg-[color]-100 p-2 rounded-lg)
 - [ ] hover:-translate-y + shadow-lg on cards
- - [ ] Staggered entrance animations (animation-delay) on stat cards or data
+ - [ ] Staggered entrance animations on individual stat cards, data rows, or repeated items (animation-delay per card/row). Page-level section stagger (hero -> content -> footer) is fine.
 - [ ] Colored dots/badges per category in activity feeds
 - [ ] Converting hex to OKLCH and calling it a "redesign"
 
@@ -315,6 +308,7 @@ skills/picasso/references/design-system.md # DESIGN.md, theming, token
 skills/picasso/references/generative-art.md       # p5.js, SVG, canvas
 skills/picasso/references/component-patterns.md   # Naming, taxonomy, state matrix
 skills/picasso/references/ux-psychology.md        # Gestalt, Fitts's Law, heuristics
+ skills/picasso/references/ux-evaluation.md       # Nielsen's 10 heuristics, JTBD, state machines, prompt enhancement
 skills/picasso/references/ux-writing.md           # Error messages, microcopy, CTAs
 skills/picasso/references/data-visualization.md   # Chart matrix, dashboards, Tufte
 skills/picasso/references/conversion-design.md    # Landing pages, CTAs, pricing
@@ -363,7 +357,7 @@ These are the telltale signs that make interfaces look AI-generated. Flag all of
 - [ ] Equal spacing everywhere with no visual grouping
 - [ ] `transition: all 0.3s` on elements
 - [ ] `hover:-translate-y + shadow-lg` on cards
- - [ ] Staggered entrance animations on static data (animation-delay on stat cards)
+ - [ ] Staggered entrance animations on individual stat cards, data rows, or repeated items (animation-delay per card/row)
 - [ ] Colored dots/badges per category in activity feeds
 - [ ] Bounce or elastic easing
 - [ ] Generic stock imagery or placeholder content
@@ -648,12 +642,6 @@ When the user invokes these commands, execute the corresponding workflow:
 | `/critique` | UX-focused review: hierarchy, clarity, emotional resonance, user flow |
 | `/polish` | Auto-fix all findings from Phase 2 (smallest safe changes) |
 | `/redesign` | Full audit + aggressive fixes + re-audit to verify improvement |
- | `/simplify` | Strip unnecessary complexity: remove extra wrappers, flatten nesting, reduce color count |
- | `/animate` | Add purposeful motion: staggered reveals, hover states, scroll-triggered animations |
- | `/bolder` | Amplify timid designs: increase contrast, enlarge type, strengthen hierarchy |
- | `/quieter` | Tone down aggressive designs: reduce saturation, soften shadows, increase whitespace |
- | `/normalize` | Align with design system: replace hardcoded values with tokens |
- | `/theme` | Generate or apply a theme via DESIGN.md |
 | `/stitch` | Generate a complete DESIGN.md from the current codebase |
 | `/harden` | Add error handling, loading states, empty states, edge case handling |
 | `/a11y` | Accessibility-only audit: run axe-cli, pa11y, and Lighthouse accessibility category with JSON output parsing; check ARIA, validate contrast, test keyboard nav |
@@ -670,7 +658,6 @@ When the user invokes these commands, execute the corresponding workflow:
 | `/score` | Quantified 0-100 design quality score with category breakdown |
 | `/compete <url>` | Head-to-head design comparison against a competitor site |
 | `/evolve` | Multi-round iterative design refinement with screenshots |
- | `/mood-board` | Generate visual inspiration HTML from adjectives |
 | `/design-system-sync` | Detect and fix drift between DESIGN.md and code |
 | `/preset <name>` | Apply a curated community design preset |
 | `/preview` | Visual preview of design tokens, presets, or side-by-side direction comparison |
@@ -682,78 +669,7 @@ When the user invokes these commands, execute the corresponding workflow:
 
 ## /godmode -- The Ultimate Design Transformation
 
- `/godmode` is the nuclear option. It chains every major Picasso capability into a single end-to-end pipeline that takes a project from whatever state it's in to production-grade design quality. No shortcuts, no skipping steps.
-
- ### The Pipeline (executed in order)
-
- **Phase 1: Understand**
- 1. Run the **design interview** (Section 1-4) if no `.picasso.md` exists. If it exists, load it.
- 2. **Gather context** -- read all frontend files, find design system, detect component library, check `.picasso.md`.
-
- **Phase 1b: Anti-Slop Gate**
- 3. Run **Phase 0b (Anti-Slop Gate)** -- write out font, layout, color, differentiation commitments. This is mandatory even in godmode. No fixes until commitments are declared.
-
- **Phase 2: Assess**
- 4. Run `/score` -- establish the **before score** (0-100). Save it.
- 4. Run `/roast` -- get the brutally honest assessment. Show it to the user.
- 5. Run `/audit` -- full technical audit (Phase 1-4) with severity-ranked findings.
- 6. Run `/a11y` -- axe-core + pa11y + Lighthouse accessibility.
- 7. Run `/perf` -- Lighthouse performance with Core Web Vitals.
- 8. Run `/lint-design` -- find all design token violations.
- 9. Run `/consistency` -- check all pages match each other.
- 10. Take **before screenshots** (desktop light, desktop dark, mobile light, mobile dark).
-
- **Phase 3: Plan**
- 11. Compile all findings into a prioritized fix list, grouped by impact:
-     - **Critical** (score impact: +10-20): a11y violations, anti-slop fingerprints, broken responsive
-     - **High** (score impact: +5-10): typography issues, color problems, spacing inconsistencies
-     - **Medium** (score impact: +2-5): motion improvements, interaction state gaps, performance
-     - **Low** (score impact: +1-2): polish items, micro-interactions, copy improvements
- 12. Present the plan to the user: "Here are 23 issues. Fixing all of them will take your score from 42 to ~85. Shall I proceed?"
- 13. **Wait for confirmation.** Never proceed without a "go."
-
- **Phase 4: Fix**
- 14. Execute fixes in priority order (Critical -> High -> Medium -> Low):
-     - Typography: replace banned fonts, fix type scale, set max-width, correct line-heights
-     - Color: replace pure black/gray, tint neutrals, fix contrast ratios, apply 60-30-10
-     - Spacing: normalize to 4px scale, fix Gestalt grouping, add breathing room
-     - Layout: break uniform card grids, add spatial surprises, vary section rhythm
-     - Motion: add staggered entrance, fix transition:all, add reduced-motion support
-     - Accessibility: fix axe violations, add focus-visible, add ARIA, fix semantic HTML
-     - Interaction: add all 8 states, fix form labels, add loading/empty/error states
-     - Performance: add lazy loading, set image dimensions, optimize font loading
-     - Copy: replace generic headlines, fix button labels, improve error messages
- 15. After each category, re-run the relevant checks to verify the fix worked.
-
- **Phase 5: Verify**
- 16. Run `/score` again -- establish the **after score**.
- 17. Take **after screenshots** (same 4 viewports).
- 18. Run `/before-after` -- generate the visual comparison report.
- 19. Run `/a11y` and `/perf` again to confirm improvements.
-
- **Phase 6: Report**
- 20. Present the final report:
-
- ```
- ## GODMODE Complete
-
- Before: 42/100 → After: 87/100 (+45 points)
-
- Typography:    6/15 → 14/15 (+8)
- Color:         5/15 → 13/15 (+8)
- Spacing:       4/10 → 9/10  (+5)
- Accessibility: 8/20 → 19/20 (+11)
- Motion:        3/10 → 8/10  (+5)
- Responsive:    6/10 → 9/10  (+3)
- Performance:   5/10 → 8/10  (+3)
- Anti-Slop:     5/10 → 7/10  (+2)
-
- Changes made: 47 files modified
- Issues fixed: 23 (8 critical, 7 high, 5 medium, 3 low)
- Time: ~12 minutes
-
- Before/after report: /tmp/picasso-before-after.html
- ```
+ Full pipeline: interview + assess + plan + fix + verify + report. See `commands/godmode.md` for the complete workflow.
 
 ### Godmode Rules
 
@@ -771,163 +687,25 @@ Before/after report: /tmp/picasso-before-after.html
 
 ## Creative Commands
 
 ### /roast -- Brutally Honest Design Critique
-
- The anti-polite review. Write feedback in sharp, designer-Twitter energy. Be specific, be funny, be cutting -- but always constructive. Every roast must end with "Here's how to fix it:" followed by actionable steps.
-
- Example tone: "This hero section looks like every v0 output from 2024. The purple gradient physically hurts my eyes. The three identical cards are a cry for help. And the 'Build the future of work' headline? My brother in Christ, it's 2026."
-
- **MANDATORY: Before writing ANY roast, you MUST:**
- 1. Take desktop + mobile screenshots via `npx playwright screenshot`
- 2. **View them with the Read tool** (`Read /tmp/picasso-roast-desktop.png`)
- 3. Base ALL visual critiques on what you actually SEE in the screenshots
- 4. Never claim "this is light/dark mode" or "this color is X" without viewing a screenshot first
-
- Rules:
- - Never be mean about the developer, only the design
- - Every criticism must be specific (file:line or element)
- - Every roast point must include the fix
- - End with a genuine compliment about what IS working
- - Output a "Roast Score" from 🔥 (barely warm) to 🔥🔥🔥🔥🔥 (absolute inferno)
- - **NEVER make visual claims from code alone** -- all visual observations must come from screenshots
+ Sharp, specific, funny design critique with actionable fixes. See `commands/roast.md` for the full workflow.
 
 ### /before-after -- Visual Diff Report
-
- After any /polish or /redesign, auto-generate a comparison:
- 1. Take "before" screenshots (desktop + mobile) BEFORE making changes
- 2. Make the changes
- 3. Take "after" screenshots
- 4. Generate an HTML report at `/tmp/picasso-before-after.html` showing side-by-side comparisons with annotations
- 5. List every change made with file:line references
-
- ```bash
- # Before screenshots
- npx playwright screenshot http://localhost:3000 /tmp/picasso-before-desktop.png --viewport-size=1440,900
- npx playwright screenshot http://localhost:3000 /tmp/picasso-before-mobile.png --viewport-size=375,812
-
- # ... make changes ...
-
- # After screenshots
- npx playwright screenshot http://localhost:3000 /tmp/picasso-after-desktop.png --viewport-size=1440,900
- npx playwright screenshot http://localhost:3000 /tmp/picasso-after-mobile.png --viewport-size=375,812
- ```
+ Take before/after screenshots and generate an HTML side-by-side comparison report. See `commands/before-after.md` for the full workflow.
 
 ### /steal <url> -- Design DNA Extraction
-
- Point at any live website and extract its design DNA:
- 1. Screenshot the URL at multiple viewports
- 2. Analyze the screenshot visually for: fonts, color palette, spacing rhythm, border-radius, animation style, layout structure
- 3. Use bash to fetch the page and extract CSS:
- ```bash
- curl -s "<url>" | grep -oE 'font-family:[^;]+' | sort -u | head -10
- curl -s "<url>" | grep -oE '#[0-9a-fA-F]{3,8}' | sort | uniq -c | sort -rn | head -15
- curl -s "<url>" | grep -oE 'border-radius:[^;]+' | sort -u
- ```
- 4. Generate a `.picasso.md` config that matches the extracted aesthetic
- 5. Optionally generate a DESIGN.md based on the extraction
+ Extract design language (fonts, colors, spacing, radius) from any live website or Figma file into a `.picasso.md` config. See `commands/steal.md` for the full workflow.
 
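The version being removed here showed the extraction greps inline; a self-contained sketch of the same pattern, run against an inline HTML snippet instead of a live `curl -s "<url>"` fetch so it works offline (the sample markup and values are made up for illustration):

```shell
# Sketch of the /steal CSS extraction, applied to an inline sample page.
page='<style>
body { font-family: "Space Grotesk", sans-serif; color: #1a1a2e; }
h1   { font-family: Fraunces, serif; border-radius: 4px; }
.cta { background: #e94560; border-radius: 4px; }
</style>'

echo "$page" | grep -oE 'font-family:[^;]+' | sort -u              # distinct font stacks
echo "$page" | grep -oE '#[0-9a-fA-F]{3,8}' | sort | uniq -c | sort -rn  # hex colors by frequency
echo "$page" | grep -oE 'border-radius:[^;]+' | sort -u            # distinct radius values
```

The frequency-sorted color list is what lets the command guess primary vs. accent vs. neutral roles before writing `.picasso.md`.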
 ### /mood <word> -- Instant Aesthetic from a Single Word
-
- Generate a complete design system from an evocative word or phrase:
- 1. Parse the mood word(s): "cyberpunk", "cottage", "brutalist-banking", "warm-saas", "dark-editorial"
- 2. Map to design tokens:
-    - Color palette (5-7 OKLCH values)
-    - Font pairing (display + body + mono)
-    - Border radius scale
-    - Shadow style
-    - Motion intensity
-    - Spacing density
- 3. Generate a complete `.picasso.md` config
- 4. Generate a `DESIGN.md` with the full token set
- 5. Show a preview summary: "Mood: cyberpunk -> Neon green on near-black, JetBrains Mono headers, sharp 2px radius, high motion, dense layout"
-
- Include a mood mapping table:
- | Mood | Palette Direction | Typography | Radius | Motion |
- |---|---|---|---|---|
- | cyberpunk | neon on dark, high contrast | monospace display + geometric body | sharp (0-2px) | high, glitch effects |
- | cottage | warm earth tones, muted | serif display + rounded body | soft (12-16px) | gentle, slow fades |
- | brutalist | black/white + one accent | mono or slab | none (0px) | minimal, abrupt |
- | luxury | deep neutrals + gold/cream | thin serif display + elegant sans | subtle (4-8px) | smooth, slow |
- | editorial | high contrast, limited palette | strong serif + clean sans | minimal (2-4px) | moderate, text-focused |
- | playful | bright, saturated, varied | rounded sans + handwritten accent | large (16-24px) | bouncy, energetic |
- | corporate | conservative blue/gray | clean sans + readable body | standard (8px) | subtle, professional |
- | dark-tech | dark surfaces + accent glow | geometric sans + monospace | sharp (2-4px) | fast, precise |
- | warm-saas | warm neutrals + friendly accent | humanist sans | medium (8-12px) | moderate, smooth |
- | minimal | near-black + white + one accent | one font family, varied weights | subtle (4px) | very subtle |
+ Generate a complete design system (`.picasso.md` + `DESIGN.md`) from an evocative word or phrase (e.g., "cyberpunk", "cottage", "brutalist-banking"). See `commands/mood.md` for the full workflow and mood mapping table.
 
 ### /score -- Quantified Design Quality Score
-
- Run a comprehensive scoring algorithm:
-
- 1. **Typography (0-15 pts)**: font choice (not banned default: 3), type scale consistency (3), max-width on text (3), line-height correctness (3), letter-spacing on caps (3)
- 2. **Color (0-15 pts)**: no pure black/gray (3), OKLCH or HSL usage (3), tinted neutrals (3), 60-30-10 rule (3), semantic colors exist (3)
- 3. **Spacing (0-10 pts)**: consistent scale (5), Gestalt grouping (5)
- 4. **Accessibility (0-20 pts)**: axe-core violations (10), focus-visible (3), semantic HTML (3), alt text (2), reduced-motion (2)
- 5. **Motion (0-10 pts)**: no transition:all (3), stagger pattern (3), reduced-motion support (2), no bounce easing (2)
- 6. **Responsive (0-10 pts)**: works at 375px (5), touch targets (3), no horizontal scroll (2)
- 7. **Performance (0-10 pts)**: Lighthouse perf score mapped (0-100 -> 0-10)
- 8. **Anti-Slop (0-10 pts)**: deductions for each AI-slop fingerprint detected (-2 each, minimum 0)
-
- Total: 0-100. Output as:
- ```
- ## Picasso Design Score: 73/100
-
- Typography:    ████████████░░░ 12/15
- Color:         ████████████░░░ 11/15
- Spacing:       ████████░░ 8/10
- Accessibility: ████████████████ 16/20
- Motion:        ██████░░░░ 6/10
- Responsive:    ████████░░ 8/10
- Performance:   ██████░░░░ 6/10
- Anti-Slop:     ██████░░░░ 6/10
-
- Top issues to fix for +15 points:
- 1. Add prefers-reduced-motion support (+4)
- 2. Replace #000 with tinted near-black (+3)
- 3. ...
- ```
+ 0-100 score across 8 categories (Typography, Color, Spacing, UX Heuristics, Motion, Responsive, Performance, Anti-Slop) with visual bars and top fixes for max point improvement. See `commands/score.md` for the full scoring algorithm.
 
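The "visual bars" in the summarized `/score` output are simple proportional fills. A sketch of how such a bar can be rendered, under the assumption (from the removed example output) that one cell equals one point; the scoring rules themselves live in `commands/score.md`:

```shell
# Render a proportional score bar: one filled cell per point earned,
# one empty cell per point missed, e.g. 12/15 -> 12 filled + 3 empty.
bar() {
  score=$1; max=$2; out=""
  i=1
  while [ "$i" -le "$max" ]; do
    if [ "$i" -le "$score" ]; then out="${out}█"; else out="${out}░"; fi
    i=$((i + 1))
  done
  printf '%s %s/%s\n' "$out" "$score" "$max"
}

bar 12 15   # ████████████░░░ 12/15
bar 8 10    # ████████░░ 8/10
```

Fixed-width bars like this keep the category breakdown scannable in a plain-text report.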
 ### /compete <url> -- Competitive Design Analysis
-
- Compare the current project against a competitor:
- 1. Screenshot both sites (desktop + mobile)
- 2. Extract design DNA from both
- 3. Compare head-to-head across categories:
-    - Typography quality
-    - Color cohesion
-    - Spacing consistency
-    - Motion sophistication
-    - Mobile experience
-    - Performance (Lighthouse)
-    - Accessibility (axe)
- 4. Output a comparison table with winner per category
- 5. Generate specific recommendations: "Their typography is stronger because they use a modular type scale. Yours uses 7 different font sizes with no clear ratio."
+ Head-to-head comparison against a competitor site across typography, color, spacing, motion, mobile, performance, and accessibility. See `commands/compete.md` for the full workflow.
 
 ### /evolve -- Iterative Design Refinement Loop
-
- Multi-round design refinement with visual previews at every step:
- 1. **Round 1: Directions** -- Generate 3 distinct aesthetic directions. For each, generate a visual preview card using the Side-by-Side Comparison template from `references/visual-preview.md`. Write to `/tmp/picasso-evolve-round1.html`, screenshot, view, present. Ask user to pick one (or combine elements).
- 2. **Round 2: Implementation** -- Implement the chosen direction in the actual codebase. Screenshot the running app. Ask "What do you love? What's not right?"
- 3. **Round 3+: Refinement** -- Apply feedback. Screenshot again. Ask "Are we there? Or one more round?"
- 4. Continue until user says "ship it"
-
- Rules:
- - Round 1 MUST show visual previews, not just text descriptions
- - Each direction must be genuinely different (not three variations of the same thing)
- - Always screenshot between rounds so the user can SEE the change
- - Max 5 rounds before suggesting we ship (diminishing returns)
-
- ### /mood-board -- Generate Visual Inspiration
-
- When the user isn't sure what they want, generate a visual mood board:
- 1. Ask for 3-5 adjectives or reference points
- 2. Search `references/style-presets.md` for matching presets (2-4 best matches)
- 3. Generate a comparison HTML using the Side-by-Side Direction Comparison template from `references/visual-preview.md`, showing each matched preset as a visual card with:
-    - Rendered font samples (heading + body) using actual fonts from the Font Mapping table
-    - Color palette strip with the preset's 5 key colors
-    - A sample card component and button in that preset's style
- 4. Write to `/tmp/picasso-moodboard.html`
- 5. Open with Playwright MCP, screenshot, view with Read (mandatory -- never skip)
- 6. Present to user: "Based on your adjectives, these presets match. Which elements resonate?"
+ Multi-round refinement: generate 3 directions with previews, implement user's pick, refine until "ship it." Max 5 rounds. See `commands/evolve.md` for the full workflow.
 
 ### /design-system-sync -- Auto-sync Code to DESIGN.md
 
@@ -940,63 +718,13 @@ Detect drift between DESIGN.md and actual code:
 4. Offer to auto-fix all drift with a single confirmation
 
 ### /preset <name> -- Apply Community Preset
-
- Apply a curated design preset by name.
-
- **When no preset name is given** (`/preset` with no arguments):
- 1. Load `references/style-presets.md` to get all 22 presets
- 2. Generate a **visual preset browser** using the Preset Browser Grid template from `references/visual-preview.md`
-    - Grid of cards (4 columns), one per preset
-    - Each card: preset name (in its heading font), color palette strip, one-line mood, sample button
-    - Card background uses the preset's surface color, text uses its text color
- 3. Write to `/tmp/picasso-preset-browser.html`
- 4. Open with Playwright MCP, screenshot, view with Read
- 5. Present: "Here are all 22 presets. Which one catches your eye?"
- 6. Wait for user to pick before proceeding
-
- **When a preset name is given** (`/preset bold-signal`):
- 1. Load the named preset from `references/style-presets.md`
- 2. Generate a **visual preview** of the preset (Full Page Mood Preview from `references/visual-preview.md`)
- 3. Write to `/tmp/picasso-preset-{name}.html`, screenshot, view
- 4. Present: "Here's what {name} looks like. Apply it?"
- 5. After confirmation:
-    - Generate `.picasso.md` + `DESIGN.md` from the preset
-    - Apply to the codebase (CSS variables, Tailwind config, font imports, component styling)
+ Browse all 22 presets visually (no argument) or preview and apply a specific preset by name. Generates `.picasso.md` + `DESIGN.md` from the chosen preset. See `commands/preset.md` for the full workflow.
 
 ## Advanced Automation Commands
 
 ### /perf -- Performance Audit
 
- Run Lighthouse CLI, extract Core Web Vitals (LCP, CLS, INP/TBT), report scores with pass/fail thresholds:
-
- ```bash
- npx lighthouse http://localhost:3000 --only-categories=performance --output=json --output-path=/tmp/lh-perf.json --chrome-flags="--headless --no-sandbox" --quiet
- ```
-
- Parse the JSON output to extract these metrics with thresholds:
-
- | Metric | Pass | Needs Work | Fail |
- |---|---|---|---|
- | Performance Score | >= 90 | 50-89 | < 50 |
- | FCP (First Contentful Paint) | < 1.8s | 1.8-3.0s | > 3.0s |
- | LCP (Largest Contentful Paint) | < 2.5s | 2.5-4.0s | > 4.0s |
- | CLS (Cumulative Layout Shift) | < 0.1 | 0.1-0.25 | > 0.25 |
- | TBT (Total Blocking Time) | < 200ms | 200-600ms | > 600ms |
- | SI (Speed Index) | < 3.4s | 3.4-5.8s | > 5.8s |
-
- ```bash
- # Parse results from JSON
- node -e "
- const r = require('/tmp/lh-perf.json');
- const a = r.audits;
- console.log('Performance Score:', Math.round(r.categories.performance.score * 100));
- console.log('FCP:', a['first-contentful-paint'].displayValue);
- console.log('LCP:', a['largest-contentful-paint'].displayValue);
- console.log('CLS:', a['cumulative-layout-shift'].displayValue);
- console.log('TBT:', a['total-blocking-time'].displayValue);
- console.log('SI:', a['speed-index'].displayValue);
- "
- ```
+ Run Lighthouse CLI, extract Core Web Vitals (FCP, LCP, CLS, TBT, SI), report scores with pass/fail thresholds. Pass: Perf >= 90, LCP < 2.5s, CLS < 0.1, TBT < 200ms. Fail: Perf < 50, LCP > 4s, CLS > 0.25, TBT > 600ms.
 
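The parsing step that the condensed `/perf` entry summarizes (and that the removed version showed inline) can be sketched against a stubbed report file, so it runs without a live Lighthouse audit. The field names below match Lighthouse's JSON report structure; the metric values are made up:

```shell
# Stub a minimal Lighthouse report so the parse step runs standalone.
cat > /tmp/lh-perf.json <<'EOF'
{
  "categories": { "performance": { "score": 0.91 } },
  "audits": {
    "largest-contentful-paint": { "displayValue": "2.1 s" },
    "cumulative-layout-shift":  { "displayValue": "0.05" },
    "total-blocking-time":      { "displayValue": "150 ms" }
  }
}
EOF

# Same parse pattern /perf describes: category score 0-1 -> 0-100, plus CWV display values.
node -e "
const r = require('/tmp/lh-perf.json');
const a = r.audits;
console.log('Performance Score:', Math.round(r.categories.performance.score * 100));
console.log('LCP:', a['largest-contentful-paint'].displayValue);
console.log('CLS:', a['cumulative-layout-shift'].displayValue);
console.log('TBT:', a['total-blocking-time'].displayValue);
"
```

With the stub above, the score line prints `Performance Score: 91`, which would land in the "Pass" band (>= 90).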
 ### /visual-diff -- Visual Regression
 
@@ -1089,266 +817,26 @@ Report findings grouped by category with severity and suggested token replacemen
 
 ### /install-hooks -- Git Pre-commit Hook
 
- Generate a `.git/hooks/pre-commit` script that runs fast design checks (grep-based, no server needed):
-
- ```bash
- cat > .git/hooks/pre-commit << 'HOOK'
- #!/usr/bin/env bash
- set -e
-
- STAGED=$(git diff --cached --name-only --diff-filter=ACM | grep -E '\.(tsx|jsx|css|html|svelte|vue)$' || true)
- [ -z "$STAGED" ] && exit 0
-
- ERRORS=0
-
- echo "Running Picasso pre-commit checks..."
-
- # 1. transition:all detection
- if echo "$STAGED" | xargs grep -l 'transition:\s*all' 2>/dev/null; then
-   echo "ERROR: transition:all found. Specify properties explicitly."
-   ERRORS=$((ERRORS + 1))
- fi
-
- # 2. Pure black (#000) detection
- if echo "$STAGED" | xargs grep -l '#000000\|#000[^0-9a-fA-F]' 2>/dev/null; then
-   echo "ERROR: Pure black (#000) found. Use tinted near-black instead."
-   ERRORS=$((ERRORS + 1))
- fi
-
- # 3. outline:none detection (without focus-visible replacement)
- if echo "$STAGED" | xargs grep -l 'outline:\s*none\|outline:\s*0[^.]' 2>/dev/null; then
-   echo "WARNING: outline:none found. Ensure :focus-visible has a replacement."
-   ERRORS=$((ERRORS + 1))
- fi
-
- # 4. Missing alt text detection
- if echo "$STAGED" | xargs grep -l '<img' 2>/dev/null | xargs grep -L 'alt=' 2>/dev/null; then
-   echo "ERROR: <img> tags without alt attribute found."
-   ERRORS=$((ERRORS + 1))
- fi
-
- if [ "$ERRORS" -gt 0 ]; then
-   echo ""
-   echo "Picasso found $ERRORS design issue(s). Fix them before committing."
-   exit 1
- fi
-
- echo "Picasso pre-commit checks passed."
- exit 0
- HOOK
- chmod +x .git/hooks/pre-commit
- echo "Pre-commit hook installed."
- ```
+ Generate a pre-commit hook that checks staged frontend files for: `transition:all`, pure `#000`, `outline:none` without focus-visible, and missing img alt text.
 
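The checks the one-line summary lists are plain greps; the removed hook wired them to `git diff --cached`. A self-contained sketch of the same patterns, run against an inline sample file (the `/tmp/picasso-sample.css` path is illustrative):

```shell
# Sketch of the pre-commit grep checks on a sample snippet.
# The real hook runs these over staged files, not a here-doc.
cat > /tmp/picasso-sample.css <<'EOF'
.card { transition: all 0.3s; color: #000; }
.btn:focus { outline: none; }
EOF

errors=0
grep -q 'transition: all' /tmp/picasso-sample.css \
  && { echo "ERROR: transition:all found. Specify properties explicitly."; errors=$((errors + 1)); }
grep -qE '#000([^0-9a-fA-F]|$)' /tmp/picasso-sample.css \
  && { echo "ERROR: pure #000 found. Use tinted near-black instead."; errors=$((errors + 1)); }
grep -q 'outline: none' /tmp/picasso-sample.css \
  && { echo "WARNING: outline:none found. Ensure :focus-visible has a replacement."; errors=$((errors + 1)); }
echo "issues found: $errors"
```

On the sample above all three patterns hit, so it prints `issues found: 3`; in the hook a nonzero count maps to `exit 1`, which blocks the commit.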
### /ci-setup -- GitHub Actions Workflow

- Generate a `.github/workflows/picasso-review.yml` that runs on PRs touching frontend files:
-
- ```yaml
- name: Picasso Design Review
-
- on:
- pull_request:
- paths:
- - '**/*.tsx'
- - '**/*.jsx'
- - '**/*.css'
- - '**/*.html'
- - '**/*.svelte'
- - '**/*.vue'
-
- jobs:
- picasso-review:
- runs-on: ubuntu-latest
- steps:
- - uses: actions/checkout@v4
-
- - uses: actions/setup-node@v4
- with:
- node-version: '20'
- cache: 'npm'
-
- - run: npm ci
-
- - name: Start dev server
- run: npm run dev &
- env:
- PORT: 3000
-
- - name: Wait for server
- run: npx wait-on http://localhost:3000 --timeout 60000
-
- - name: Accessibility audit (axe-cli)
- run: npx axe-cli http://localhost:3000 --exit --save /tmp/axe-results.json || true
-
- - name: Accessibility audit (pa11y)
- run: npx pa11y http://localhost:3000 --reporter json > /tmp/pa11y-results.json || true
-
- - name: Lighthouse accessibility
- run: |
- npx lighthouse http://localhost:3000 --only-categories=accessibility --output=json --output-path=/tmp/lh-a11y.json --chrome-flags="--headless --no-sandbox" --quiet || true
-
- - name: Lighthouse performance
- run: |
- npx lighthouse http://localhost:3000 --only-categories=performance --output=json --output-path=/tmp/lh-perf.json --chrome-flags="--headless --no-sandbox" --quiet || true
-
- - name: Take screenshots
- run: |
- npx playwright install chromium --with-deps
- npx playwright screenshot http://localhost:3000 /tmp/picasso-desktop.png --viewport-size=1440,900
- npx playwright screenshot http://localhost:3000 /tmp/picasso-mobile.png --viewport-size=375,812
-
- - name: Parse scores
- id: scores
- run: |
- PERF=$(node -e "const r=require('/tmp/lh-perf.json');console.log(Math.round(r.categories.performance.score*100))" 2>/dev/null || echo "N/A")
- A11Y=$(node -e "const r=require('/tmp/lh-a11y.json');console.log(Math.round(r.categories.accessibility.score*100))" 2>/dev/null || echo "N/A")
- echo "perf=$PERF" >> $GITHUB_OUTPUT
- echo "a11y=$A11Y" >> $GITHUB_OUTPUT
-
- - name: Upload artifacts
- uses: actions/upload-artifact@v4
- with:
- name: picasso-results
- path: /tmp/picasso-*.png
-
- - name: Post PR comment
- uses: actions/github-script@v7
- with:
- script: |
- const perf = '${{ steps.scores.outputs.perf }}';
- const a11y = '${{ steps.scores.outputs.a11y }}';
- const body = `## Picasso Design Review\n\n| Metric | Score |\n|---|---|\n| Performance | ${perf}/100 |\n| Accessibility | ${a11y}/100 |\n\nScreenshots uploaded as workflow artifacts.`;
- github.rest.issues.createComment({
- issue_number: context.issue.number,
- owner: context.repo.owner,
- repo: context.repo.repo,
- body
- });
- ```
+ Generate a GitHub Actions workflow that runs on frontend file PRs: install deps, start dev server, run axe-cli + pa11y + Lighthouse a11y/perf, take screenshots, post PR comment with scores.

### /a11y -- Accessibility Audit (Enhanced)

- Run all three accessibility tools with JSON output parsing:
-
- ```bash
- # 1. axe-cli -- WCAG 2.1 AA violations
- npx axe-cli http://localhost:3000 --exit --save /tmp/axe-results.json 2>/dev/null
- node -e "
- const r = require('/tmp/axe-results.json');
- const v = r[0]?.violations || [];
- console.log('axe-cli: ' + v.length + ' violations');
- v.forEach(v => console.log(' [' + v.impact + '] ' + v.id + ': ' + v.description + ' (' + v.nodes.length + ' nodes)'));
- "
-
- # 2. pa11y -- HTML_CodeSniffer + WCAG 2.1 AA
- npx pa11y http://localhost:3000 --reporter json > /tmp/pa11y-results.json 2>/dev/null
- node -e "
- const r = require('/tmp/pa11y-results.json');
- console.log('pa11y: ' + r.length + ' issues');
- r.forEach(i => console.log(' [' + i.type + '] ' + i.code + ': ' + i.message));
- "
-
- # 3. Lighthouse accessibility category
- npx lighthouse http://localhost:3000 --only-categories=accessibility --output=json --output-path=/tmp/lh-a11y.json --chrome-flags="--headless --no-sandbox" --quiet
- node -e "
- const r = require('/tmp/lh-a11y.json');
- const score = Math.round(r.categories.accessibility.score * 100);
- console.log('Lighthouse a11y score: ' + score + '/100');
- const failed = Object.values(r.audits).filter(a => a.score === 0);
- failed.forEach(a => console.log(' FAIL: ' + a.id + ' - ' + a.title));
- "
- ```
-
- Combine results from all three tools, deduplicate overlapping findings, and report with severity levels.
+ Run axe-cli, pa11y, and Lighthouse accessibility category with JSON output parsing. Combine results from all three tools, deduplicate overlapping findings, and report with severity levels.
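The combine-and-deduplicate step could look like this. A hedged sketch: the finding shape is a simplified stand-in, not the real axe/pa11y/Lighthouse JSON schemas.

```python
# Sketch of merging findings from several a11y tools and deduplicating
# overlapping reports. The finding dict shape is a simplified stand-in.
SEVERITY_RANK = {"critical": 0, "serious": 1, "moderate": 2, "minor": 3}

def merge_findings(*tool_reports: list[dict]) -> list[dict]:
    """Merge findings keyed on (rule, selector); keep the worst severity."""
    merged: dict[tuple, dict] = {}
    for report in tool_reports:
        for f in report:
            key = (f["rule"], f["selector"])
            existing = merged.get(key)
            if existing is None or SEVERITY_RANK[f["severity"]] < SEVERITY_RANK[existing["severity"]]:
                merged[key] = f
    # Worst findings first in the report
    return sorted(merged.values(), key=lambda f: SEVERITY_RANK[f["severity"]])
```

Keying on (rule, selector) is what makes "axe and pa11y both flagged this image" collapse into one finding instead of two.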

### /quick-audit -- 5-Minute Fast Audit
-
- When time is short or you need a triage before committing to a full audit. Takes 5 minutes, not 30.
-
- Check exactly these 6 things and report pass/fail for each:
-
- 1. **Font** -- Is it a banned default (Inter, Roboto, Arial, system-ui)? → FAIL/PASS
- 2. **Color** -- Are neutrals pure gray (#808080, #ccc) or tinted? → FAIL/PASS
- 3. **Layout** -- Is everything centered on one axis with no spatial variation? → FAIL/PASS
- 4. **Spacing** -- Is spacing uniform everywhere or does it follow gestalt grouping? → FAIL/PASS
- 5. **Accessibility** -- Does `outline: none` exist without `:focus-visible` replacement? → FAIL/PASS
- 6. **Anti-Slop** -- Do 3+ AI-slop fingerprints appear simultaneously? → FAIL/PASS
-
- Output format:
- ```
- ## Quick Audit: [project name]
-
- Font: PASS ✓ (Cabinet Grotesk + DM Sans)
- Color: FAIL ✗ (pure #808080 in 4 places)
- Layout: PASS ✓ (asymmetric grid with primary card dominant)
- Spacing: FAIL ✗ (uniform 32px between all sections)
- Accessibility: PASS ✓ (focus-visible defined globally)
- Anti-Slop: FAIL ✗ (4 fingerprints: centered layout + uniform cards + indigo accent + same spacing)
-
- Result: 3/6 — Needs work. Start with color and spacing.
- ```
+ 6 binary checks (font, color, layout, spacing, a11y, anti-slop) with pass/fail for each. See `commands/quick-audit.md` for the full workflow.

### /autorefine -- Binary Evaluation Loop
-
- Iterative improvement using binary (pass/fail) criteria. Inspired by SkillForge's autoresearch pattern that improved one skill from 56% to 92%.
-
- ### How It Works
-
- 1. **Define 6 binary criteria** (exactly 6 -- fewer is insufficient signal, more is over-optimization):
- ```
- 1. Typography: Non-default font used? (yes/no)
- 2. Color: OKLCH or tinted neutrals? (yes/no)
- 3. Spacing: Follows 4px scale with gestalt grouping? (yes/no)
- 4. Anti-slop: Fewer than 3 slop fingerprints? (yes/no)
- 5. Motion: prefers-reduced-motion respected? (yes/no)
- 6. Accessibility: No axe-core critical violations? (yes/no)
- ```
-
- 2. **Run baseline evaluation** -- check all 6 criteria against current state. Report pass rate (e.g., 3/6 = 50%).
-
- 3. **Mutate one thing at a time.** Pick the highest-impact failing criterion. Make the smallest change that flips it from FAIL to PASS. Do NOT change multiple things simultaneously -- you need to know what worked.
-
- 4. **Re-evaluate all 6 criteria** after each mutation. Sometimes fixing one thing breaks another.
-
- 5. **Iterate until 6/6 pass** across 3 consecutive evaluations. If a criterion keeps flipping between PASS and FAIL, the fix is unstable -- investigate root cause.
-
- 6. **Stop after 8 mutations maximum.** If you haven't hit 95% by then, the remaining issues are structural and need a `/redesign`, not incremental fixes.
-
- ### Output format per iteration:
- ```
- ## Autorefine: Iteration 3
-
- Mutation: Replaced pure grays with blue-tinted OKLCH neutrals in globals.css
-
- Typography: PASS ✓
- Color: PASS ✓ ← flipped from FAIL
- Spacing: PASS ✓
- Anti-slop: PASS ✓
- Motion: FAIL ✗
- Accessibility: PASS ✓
-
- Pass rate: 5/6 (83%) — up from 67%
- Next: Add prefers-reduced-motion guard to animations
- ```
+ Define 6 binary criteria, mutate one thing at a time, iterate to 6/6 pass. Max 8 mutations. See `commands/autorefine.md` for the full workflow.
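The loop summarized above can be sketched as follows. This is a hedged sketch: `evaluate` and `mutate` are hypothetical callbacks standing in for the real criterion checks and code fixes.

```python
def autorefine(evaluate, mutate, max_mutations: int = 8) -> dict:
    """Binary evaluation loop: mutate one failing criterion at a time.

    evaluate() -> dict[str, bool]   (criterion name -> passed?)
    mutate(criterion)               (apply the smallest fix for one criterion)
    Both callbacks are hypothetical stand-ins for the real checks/fixes.
    """
    history = []
    for step in range(max_mutations + 1):
        results = evaluate()
        history.append(results)
        # Done once all criteria pass on 3 consecutive evaluations
        if len(history) >= 3 and all(all(h.values()) for h in history[-3:]):
            return {"status": "done", "mutations": step, "results": results}
        if step == max_mutations:
            break
        failing = [name for name, ok in results.items() if not ok]
        if not failing:
            continue  # passing, but not yet stable across 3 evaluations
        mutate(failing[0])  # one change at a time, so you know what worked
    return {"status": "needs-redesign", "mutations": max_mutations, "results": history[-1]}
```

Hitting the mutation cap without stabilizing returns `needs-redesign`, mirroring the "remaining issues are structural" rule above.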

---

## The Studio Standard
-
- Picasso is not a linter. It is not a checklist runner. It is a design studio that produces work indistinguishable from a senior human designer. Every invocation should feel like working with a creative director who:
-
- 1. **Analyzes before prescribing.** Read the codebase, understand the product, study the competitors, THEN make recommendations. Never present a generic capability menu -- two projects should get different recommendations because they ARE different. The right answer for a legal SaaS is not the same as for a music app.
-
- 2. **Delivers a creative vision** before writing code. A Design Brief that is specific to THIS project -- if you could swap the project name and the brief still works, it's too generic.
-
- 3. **Actually implements what was promised.** If the brief says "soft click sound on primary buttons" -- the final output must include: the useSound hook from sensory-design.md, the audio source (Tone.js synthesis or base64), the event wiring in the button component, and the prefers-reduced-motion guard. Not "I recommend adding sounds" -- the actual working code.
-
- 4. **Uses the reference library.** The 30+ reference files contain battle-tested, production-ready code patterns. When you recommend something, read the relevant reference and use its code. Do not reinvent. Do not hallucinate simpler versions.
-
- 5. **Verifies with screenshots.** Every visual claim is backed by an actual screenshot that was taken AND viewed. No exceptions.
-
- 6. **Knows when to say no.** Not every project needs animations. Not every project needs sound. Not every project needs haptics. The mark of a great designer is knowing what to leave out. If you recommend something, you must be able to articulate why THIS project benefits from it specifically -- not "it's a best practice" or "it improves perceived quality." WHY for THIS product, THESE users, THIS context.
+ Analyze the specific project before recommending anything. Deliver what was promised. Verify everything with screenshots. Know when to say no.

---

@@ -1370,7 +858,7 @@ Picasso is not a linter. It is not a checklist runner. It is a design studio tha
14. **NEVER pair a dark sidebar with a gradient CTA button.**
15. **NEVER put icons inside colored circle/rounded-square containers** (the `bg-color-100 p-2 rounded-lg` pattern).
16. **NEVER add hover:-translate-y + shadow-lg to cards.** Use subtle background color change only.
- 17. **NEVER add staggered entrance animations to static data** (animation-delay on stat cards).
+ 17. **NEVER add staggered entrance animations on individual stat cards, data rows, or repeated items** (animation-delay per card/row). Page-level section stagger (hero -> content -> footer) is fine.
18. **Prefer subtraction over addition.** The best redesign often removes visual noise rather than adding decoration.
19. **Study real competitors first.** Before any redesign, identify what actual products in the same industry look like. Match their energy, not a generic SaaS template.
20. **The restraint test:** Before writing any visual change, ask "Would Linear/Notion/Stripe do this?" If the answer is no, don't do it.
package/commands/godmode.md CHANGED
@@ -2,49 +2,39 @@ Run the Picasso /godmode command -- the ultimate design transformation pipeline.
 
Use the Picasso agent (subagent_type: "picasso") to execute the full godmode pipeline:

- ANTI-HALLUCINATION RULE: Every phase that makes visual claims MUST gather evidence first. For live sites, take screenshots via `npx playwright screenshot` AND view them with the Read tool. For Figma files, use MCP tools to fetch structural data AND export images. Never claim light/dark mode, color, or layout from code alone.
+ ANTI-HALLUCINATION RULE: Every phase that makes visual claims MUST gather evidence first. Take screenshots via `npx playwright screenshot` AND view them with the Read tool. Never claim light/dark mode, color, or layout from code alone.

- VISUAL EVIDENCE SOURCES:
- - **Live site:** Playwright screenshots (take AND view with Read tool)
- - **Figma file:** MCP data (structural facts) + `mcp__figma__get_image` (visual verification)
- - **Both available:** Use Figma as design intent, Playwright as implementation reality. Flag gaps.
+ If a Figma URL is provided, run /figma first to extract design tokens, then proceed with those as ground truth.

Phase 1: UNDERSTAND
- Check for .picasso.md config. If not found, run the design interview (ask what we're building, who it's for, aesthetic direction, priorities 1-5 for animations/mobile/a11y/dark mode/performance, constraints).
- Gather context: read all frontend files, find design system, detect component library.
- - If a Figma URL is available or Figma MCP is configured, fetch the design file structure and styles as ground truth.

Phase 2: ASSESS
- Take BEFORE screenshots (desktop + mobile) and VIEW them with the Read tool.
- - If Figma source exists, fetch design tokens via MCP and compare against implementation.
- Run /score to establish the BEFORE score (0-100 with category breakdown).
- - Run /roast for the brutally honest assessment (must be based on screenshots/Figma data, not code guessing).
+ - Run /roast for the brutally honest assessment (must be based on screenshots, not code guessing).
- Run /audit for full technical audit with severity-ranked findings.
- Run /a11y (axe-core + pa11y + Lighthouse accessibility).
- Run /perf (Lighthouse Core Web Vitals).
- Run /lint-design (find hardcoded colors, spacing violations, font inconsistencies).
- - If Figma MCP available: Run /figma --audit for Figma-specific design system health check.

Phase 3: PLAN
- Compile all findings into a prioritized fix list (Critical -> High -> Medium -> Low).
- - If Figma source exists, prioritize design-implementation gaps as High severity.
- Present the plan: "Found X issues. Fixing all = score ~Y. Proceed?"
- WAIT for user confirmation before proceeding.

Phase 4: FIX
- Execute fixes in priority order: typography, color, spacing, layout, motion, accessibility, interaction, performance, copy.
- - When Figma tokens are available, use them as the source of truth for fixes.
- Re-verify after each category.

Phase 5: VERIFY
- Run /score again for the AFTER score.
- Take AFTER screenshots and VIEW them with the Read tool.
- - If Figma source exists, re-compare to check implementation now matches design intent.
- Generate before/after comparison.

Phase 6: REPORT
- Show final score comparison with per-category breakdown.
- Show files modified and issues fixed.
- - If Figma comparison was done, show design fidelity score (% match).

If the before score is already 85+, say so and suggest the 3-4 things that would take it to 95+.
package/commands/roast.md CHANGED
@@ -4,30 +4,19 @@ Use the Picasso agent to review the current project's frontend with sharp, desig

MANDATORY FIRST STEP -- Gather visual evidence before writing anything:

- **Option A: Live site (localhost or URL)**
1. Take screenshots: `npx playwright screenshot http://localhost:PORT /tmp/picasso-roast-desktop.png --viewport-size=1440,900` (and mobile at 375,812)
2. Use the Read tool to VIEW the screenshot files: `Read /tmp/picasso-roast-desktop.png` and `Read /tmp/picasso-roast-mobile.png`
3. Base ALL visual observations on what you actually see in the screenshots, NOT on code/CSS classes

- **Option B: Figma file (URL provided or MCP available)**
- 1. Extract file_key from the Figma URL
- 2. Fetch the target frame via `mcp__figma__get_node` for structural data (spacing, colors, typography, auto-layout)
- 3. Export the frame as an image via `mcp__figma__get_image` for visual review
- 4. Fetch styles via `mcp__figma__get_styles` to check design system usage
- 5. Base structural observations on MCP data (exact values) and visual observations on the exported image
+ If a Figma URL is provided, run /figma first to extract design tokens, then proceed with those as ground truth.

- **Option C: Both exist (Figma + live site)**
- 1. Do both A and B
- 2. Include a "Design vs Implementation" delta section in the roast — flag where the dev diverged from the design
-
- 4. If NEITHER screenshots NOR Figma MCP work, tell the user and DO NOT make visual claims. You can still audit code patterns but must prefix findings with "Based on code analysis only (no screenshot):"
+ If NEITHER screenshots NOR Figma MCP work, tell the user and DO NOT make visual claims. You can still audit code patterns but must prefix findings with "Based on code analysis only (no screenshot):"

ANTI-HALLUCINATION RULES:
- NEVER say "this is light mode" or "dark mode" without viewing a screenshot or Figma frame data
- NEVER describe colors, layouts, or visual appearance from code alone
- NEVER claim "this looks like X" without a screenshot or Figma export to verify
- - Code classes (e.g. `dark:bg-gray-900`) tell you what COULD render; only screenshots/Figma show what DOES render
- - When using Figma MCP data, you CAN state exact values (e.g., "spacing is 17px" or "fill is #808080") because these are structural facts, not visual guesses
+ - Code classes (e.g. `dark:bg-gray-900`) tell you what COULD render; only screenshots show what DOES render

Rules:
- Be specific about every criticism (file:line, element reference, or Figma node name)
package/commands/score.md CHANGED
@@ -4,27 +4,16 @@ Use the Picasso agent to score the current project's frontend design on a 0-100

MANDATORY FIRST STEP -- Gather visual evidence before scoring:

- **Option A: Live site (localhost or URL)**
1. Take screenshots: `npx playwright screenshot http://localhost:PORT /tmp/picasso-score-desktop.png --viewport-size=1440,900` (and mobile at 375,812)
2. Use the Read tool to VIEW the screenshot files before scoring visual categories
3. If screenshots fail, tell the user and score only code-auditable categories (mark visual categories as "N/A - no screenshot")

- **Option B: Figma file (URL provided or MCP available)**
- 1. Fetch the target frame via `mcp__figma__get_node` for structural data
- 2. Fetch styles via `mcp__figma__get_styles` for design system analysis
- 3. Export frame as image via `mcp__figma__get_image` for visual review
- 4. Score based on both structural data (exact values) and exported image
- 5. Add bonus category: Design System Health (0-10)
-
- **Option C: Both (Figma + live site)**
- 1. Do both A and B
- 2. Add category: Design Fidelity (0-10) -- how closely implementation matches Figma intent
+ If a Figma URL is provided, run /figma first to extract design tokens, then proceed with those as ground truth.

ANTI-HALLUCINATION RULES:
- - Visual categories (Typography appearance, Color in practice, Spacing rhythm, Anti-Slop visual check) MUST be scored from screenshots or Figma exports, not code alone
+ - Visual categories (Typography appearance, Color in practice, Spacing rhythm, Anti-Slop visual check) MUST be scored from screenshots, not code alone
- Code-auditable categories (a11y violations via axe, transition:all grep, prefers-reduced-motion grep) can be scored from code
- - When using Figma MCP, structural data (exact spacing, color, typography values) IS factual and can be stated directly
- - Never claim "this looks like X" without viewing a screenshot or Figma export
+ - Never claim "this looks like X" without viewing a screenshot

Categories:
- Typography (0-15): font choice, type scale, max-width, line-height, letter-spacing
@@ -36,8 +25,4 @@ Categories:
- Performance (0-10): Lighthouse perf score mapped 0-100 -> 0-10
- Anti-Slop (0-10): deductions for each AI-slop fingerprint detected (-2 each)

- Bonus categories (when Figma MCP is available):
- - Design System Health (0-10): style usage %, component coverage, naming consistency, auto-layout adoption
- - Design Fidelity (0-10, only when both Figma + live): token match, spacing accuracy, structural parity
-
Output format with visual bars and top fixes for maximum point improvement.
package/commands/steal.md CHANGED
@@ -2,30 +2,7 @@ Run the Picasso /steal command -- extract design DNA from a URL or Figma file.

Use the Picasso agent to extract the design language from the provided source: $ARGUMENTS

- ## Input Detection
-
- - **Figma URL** (contains `figma.com/design/` or `figma.com/file/`): Use Figma MCP for precise extraction
- - **Live URL** (any other http/https): Use Playwright screenshots + source scraping
- - **Both provided**: Use both sources, Figma as ground truth and live site for verification
-
- ## Steps: Figma URL (Preferred)
-
- 1. Extract `file_key` and optional `node_id` from the Figma URL
- 2. Fetch styles via `mcp__figma__get_styles` — extract all color styles, text styles, effect styles
- 3. Fetch target frame via `mcp__figma__get_node` — extract auto-layout spacing, fills, strokes, radii
- 4. Fetch components via `mcp__figma__get_components` — understand component structure
- 5. Export frame as image via `mcp__figma__get_image` for visual reference
- 6. Analyze the extracted data:
- - **Colors:** All fill/stroke colors → convert to OKLCH. Identify primary, secondary, accent, neutral.
- - **Typography:** Font families, size scale, weight distribution, line-height ratios.
- - **Spacing:** Auto-layout itemSpacing and padding → detect base unit (4px? 8px?) and scale.
- - **Shadows:** Effect styles → map to elevation scale.
- - **Radii:** Border radius values → detect pattern (uniform? progressive?).
- - **Layout:** Auto-layout direction, alignment, wrapping → grid/flex patterns.
- 7. Generate a `.picasso.md` config matching the extracted aesthetic
- 8. Optionally generate a `DESIGN.md` with the full token set
-
- ## Steps: Live URL
+ ## Steps

1. Screenshot the URL at desktop (1440x900) and mobile (375x812)
2. Fetch the page source and extract: font-family declarations, color values (#hex, rgb, oklch), border-radius values, box-shadow values
@@ -33,11 +10,6 @@ Use the Picasso agent to extract the design language from the provided source: $
4. Generate a `.picasso.md` config that matches the extracted aesthetic
5. Optionally generate a `DESIGN.md` with the full token set

- ## Steps: Both (Figma + Live URL)
-
- 1. Run Figma extraction (ground truth for intended design)
- 2. Run live URL extraction (what actually shipped)
- 3. Generate tokens from Figma source
- 4. Note any divergences between design and implementation in the output
+ If a Figma URL is provided, run /figma first to extract design tokens, then proceed with those as ground truth.

- If no URL is provided, ask the user for one. Accept both Figma URLs and live URLs.
+ If no URL is provided, ask the user for one.
package/package.json CHANGED
@@ -1,6 +1,6 @@
{
"name": "picasso-skill",
- "version": "2.3.1",
+ "version": "2.5.0",
"description": "The ultimate AI design skill for producing distinctive, production-grade frontend interfaces",
"bin": {
"picasso-skill": "./bin/install.mjs"
package/references/ux-evaluation.md ADDED
@@ -0,0 +1,211 @@
+ # UX Evaluation Reference
+
+ Structured frameworks for evaluating interface quality. Use these during /score, /roast, /audit, and the visual discovery crawl phase.
+
+ ---
+
+ ## 1. Nielsen's 10 Usability Heuristics (Evaluation Checklist)
+
+ For each heuristic, check the listed indicators. Score pass/fail for each.
+
+ ### H1: Visibility of System Status
+ The system should always keep users informed about what is going on.
+ - [ ] Loading states exist for async actions (skeletons, spinners, progress bars)
+ - [ ] Form submission shows pending/success/error feedback
+ - [ ] Current page/section is highlighted in navigation
+ - [ ] Active filters/sorts are visually indicated
+ - [ ] Upload progress is shown
+ - **Check in code:** grep for loading states, skeleton components, progress indicators
+ - **Check in screenshot:** is the current nav item highlighted? Are there loading indicators?
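The code-side H1 check can be sketched as a simple pattern scan. A hedged sketch: the indicator patterns are illustrative, not an exhaustive list.

```python
import re

# Hypothetical H1 scan: count loading-state indicators in component source.
# The pattern list is illustrative, not exhaustive.
H1_PATTERNS = [r"isLoading", r"<Skeleton", r"<Spinner", r"aria-busy", r"<progress"]

def h1_signal(source: str) -> int:
    """Number of distinct loading-state indicators found in this file."""
    return sum(1 for p in H1_PATTERNS if re.search(p, source))
```

A count of zero across an app's async views is a strong H1 failure signal; the same scan shape works for the other "check in code" greps below.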
+
+ ### H2: Match Between System and Real World
+ Use language and concepts familiar to the user, not system-oriented terms.
+ - [ ] Button labels use verbs the user understands ("Save changes" not "Submit")
+ - [ ] Error messages explain the problem in plain language
+ - [ ] Navigation labels match user mental models
+ - [ ] Icons are conventional (trash = delete, pencil = edit, plus = add)
+ - **Check in code:** grep for generic labels ("Submit", "Click here", "Data")
+
+ ### H3: User Control and Freedom
+ Users need a clear emergency exit when they make mistakes.
+ - [ ] Modals have close buttons AND escape key support
+ - [ ] Destructive actions have confirmation OR undo
+ - [ ] Multi-step flows have back navigation
+ - [ ] Users can cancel in-progress operations
+ - **Check in code:** grep for confirm() dialogs, undo patterns, modal close handlers
+
+ ### H4: Consistency and Standards
+ Follow platform conventions. Same action = same result everywhere.
+ - [ ] Primary buttons look the same across all pages
+ - [ ] Same icon means the same thing everywhere
+ - [ ] Spacing and typography follow a consistent scale
+ - [ ] Color meanings are consistent (red = error, green = success)
+ - **Check in code:** grep for hardcoded colors, inconsistent button styles
+
+ ### H5: Error Prevention
+ Prevent problems from occurring in the first place.
+ - [ ] Required fields are marked before submission
+ - [ ] Date inputs use pickers (not free text)
+ - [ ] Destructive buttons are visually distinct (red/outlined, not primary)
+ - [ ] Inline validation catches errors before form submission
+ - **Check in code:** grep for required fields, inline validation, input types
+
+ ### H6: Recognition Rather Than Recall
+ Minimize memory load. Make options visible.
+ - [ ] Navigation is always visible (not hidden behind hamburger on desktop)
+ - [ ] Search results show context around matches
+ - [ ] Forms show labels (not placeholder-only)
+ - [ ] Recent items, favorites, or shortcuts are available
+ - **Check in screenshot:** are labels visible? Is navigation persistent?
+
+ ### H7: Flexibility and Efficiency of Use
+ Allow experts to speed up their workflow.
+ - [ ] Keyboard shortcuts exist for frequent actions
+ - [ ] Bulk operations are available for lists
+ - [ ] Command palette or search exists (Cmd+K)
+ - [ ] Default values are intelligent
+ - **Check in code:** grep for keyboard event listeners, bulk action patterns
+
+ ### H8: Aesthetic and Minimalist Design
+ Every extra element competes with relevant information.
+ - [ ] No decorative elements that don't serve a purpose
+ - [ ] Information hierarchy is clear (most important = most prominent)
+ - [ ] White space is used to group related elements
+ - [ ] No more than 3-4 colors for data categories
+ - **Check in screenshot:** squint test -- does hierarchy still read?
+
+ ### H9: Help Users Recognize, Diagnose, and Recover from Errors
+ Error messages should be in plain language, indicate the problem, and suggest a fix.
+ - [ ] Error messages follow: what happened + why + how to fix
+ - [ ] Form errors appear next to the relevant field
+ - [ ] API errors don't show raw technical messages to users
+ - [ ] Empty states guide the user on what to do next
+ - **Check in code:** grep for error handling, error messages, empty states
+
+ ### H10: Help and Documentation
+ Even though a system should be usable without docs, help should be available.
+ - [ ] Tooltips explain non-obvious UI elements
+ - [ ] Onboarding exists for first-time users
+ - [ ] Complex features have inline help or documentation links
+ - [ ] Keyboard shortcuts are discoverable
+ - **Check in code:** grep for tooltip components, help text, onboarding flows
+
+ ---
+
+ ## 2. Jobs to Be Done (JTBD) Framework
+
+ Use JTBD to understand WHY users interact with the app, not just WHAT they do. This informs design decisions during the crawl phase.
+
+ ### Extracting JTBD from Code
+
+ Analyze the codebase to identify user jobs:
+
+ 1. **Route structure** reveals user tasks:
+ - `/dashboard` = "When I start my day, I want to see what needs attention"
+ - `/clients/[id]` = "When I work on a client, I want all their info in one place"
+ - `/billing` = "When I need to invoice, I want to track time and generate bills"
+ - `/analyze` = "When I receive a contract, I want to understand the risks"
+
+ 2. **API endpoints** reveal user actions:
+ - POST /api/clients = "I want to onboard a new client"
+ - POST /api/analyze = "I want AI to review this document"
+ - GET /api/dashboard = "I want a summary of my practice"
+
+ 3. **Component names** reveal UI functions:
+ - `<ClientForm>` = data entry job
+ - `<TimerWidget>` = time tracking job
+ - `<RedlineView>` = document review job
+
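A minimal sketch of the route-based extraction, assuming a routes list has already been collected. The route-to-job map is a hypothetical starting table that a real run would extend per project.

```python
# Hypothetical route-to-job map; a real run would extend this per project.
ROUTE_JOBS = {
    "dashboard": "When I start my day, I want to see what needs attention",
    "billing": "When I need to invoice, I want to track time and generate bills",
    "analyze": "When I receive a contract, I want to understand the risks",
}

def jobs_from_routes(routes: list[str]) -> dict[str, str]:
    """Map route paths to Jobs-to-Be-Done statements by first path segment."""
    jobs = {}
    for route in routes:
        segment = route.strip("/").split("/")[0]
        if segment in ROUTE_JOBS:
            jobs[route] = ROUTE_JOBS[segment]
    return jobs
```

Unmapped routes are left out rather than guessed; those are the ones worth asking the user about.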
119
+ ### Using JTBD to Inform Design
120
+
121
+ For each identified job, ask:
122
+ - **What's the trigger?** When does the user need to do this?
123
+ - **What's the desired outcome?** What does success look like?
124
+ - **What's the anxiety?** What could go wrong?
125
+ - **What's the context?** Where/when do they do this? (mobile? desktop? in a meeting?)
126
+
127
+ Design decisions should optimize for the job:
128
+ - High-frequency jobs need the fastest path (fewest clicks, most prominent placement)
129
+ - High-stakes jobs need the most clarity (larger text, explicit confirmation, clear feedback)
130
+ - Time-pressured jobs need efficiency (keyboard shortcuts, bulk actions, smart defaults)
131
+
132
+ ---
133
+
134
+ ## 3. Prompt Enhancement
135
+
136
+ When a user gives a vague design request, enhance it before proceeding.
137
+
138
+ ### Vague-to-Specific Mapping
139
+
140
+ | User Says | What They Mean | What to Do |
141
+ |-----------|---------------|------------|
142
+ | "Make it look good" | It looks amateur, fix the obvious issues | Run /audit, fix critical+high |
143
+ | "Make it modern" | It looks dated, update the aesthetic | Check font (is it Arial?), colors (pure gray?), radius (sharp corners?) |
144
+ | "Make it clean" | Too much visual noise, simplify | Remove decorative elements, increase whitespace, reduce color count |
145
+ | "Make it pop" | Not enough visual hierarchy, too flat | Increase contrast, add depth, strengthen heading sizes |
146
+ | "Make it professional" | It looks like a student project | Fix typography scale, add consistent spacing, tighten color palette |
147
+ | "I don't know what I want" | They need visual discovery | Generate the 10-20 sample gallery and let them react |
148
+
149
+ ### Enhancement Process
150
+
151
+ 1. Identify the complaint (what's wrong) vs. the goal (what they want)
152
+ 2. Map to specific design properties (typography, color, spacing, layout, motion)
153
+ 3. Propose concrete changes with before/after preview
154
+ 4. Never ask "what do you mean by modern?" -- instead, show 3 interpretations and ask which fits
155
+
156
+ ---
157
+
158
+ ## 4. State Machine for Interactive Components
159
+
160
+ Map all states for each interactive element. Missing states are the #1 source of unpolished UI.
161
+
162
+ ### The 8 States
163
+
164
+ Every interactive element should define:
165
+
166
+ | State | Visual Treatment | Trigger |
167
+ |-------|-----------------|---------|
168
+ | **Default** | Base appearance | Page load |
169
+ | **Hover** | Subtle background/border change | Mouse enters |
170
+ | **Focus** | Visible ring/outline (2px+ solid) | Tab navigation |
171
+ | **Active/Pressed** | Scale down slightly (0.97-0.98) | Mouse down |
172
+ | **Disabled** | Reduced opacity (0.5), no pointer | Programmatic |
173
+ | **Loading** | Spinner or pulse, disabled interaction | Async action |
174
+ | **Error** | Red border/text, error message | Validation fail |
175
+ | **Success** | Green indicator, confirmation | Action complete |
176
+
177
+ ### Audit Checklist
178
+
179
+ For each component type, verify states exist:
180
+
181
+ | Component | States to Check |
182
+ |-----------|----------------|
183
+ | Button | default, hover, focus, active, disabled, loading |
184
+ | Input | default, hover, focus, filled, error, disabled |
185
+ | Card (clickable) | default, hover, focus, active |
186
+ | Link | default, hover, focus, visited |
187
+ | Toggle | off, on, hover, focus, disabled |
188
+ | Select | default, hover, focus, open, selected, error |
189
+ | Modal | enter, exit, backdrop |
190
+
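The audit checklist can be made mechanical. A sketch: `implemented_states` would come from parsing component variants or CSS selectors in a real audit, and the map below shows only a few of the component types from the table.

```python
# Required states per component type, mirroring part of the audit checklist.
REQUIRED_STATES = {
    "button": {"default", "hover", "focus", "active", "disabled", "loading"},
    "input": {"default", "hover", "focus", "filled", "error", "disabled"},
    "link": {"default", "hover", "focus", "visited"},
}

def missing_states(component: str, implemented_states: set[str]) -> set[str]:
    """States the checklist requires that the component does not implement."""
    return REQUIRED_STATES.get(component, set()) - implemented_states
```

Any non-empty result is an audit finding; an unknown component type yields an empty set rather than a false positive.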
+ ---
+
+ ## 5. Scoring with Heuristics
+
+ When running /score, add heuristic evaluation points:
+
+ ```
+ Heuristic Evaluation (0-20 pts):
+ H1 System status: /2 (loading states, feedback)
+ H2 Real world match: /2 (language, icons)
+ H3 User control: /2 (undo, escape, back)
+ H4 Consistency: /2 (styles, patterns)
+ H5 Error prevention: /2 (validation, confirmation)
+ H6 Recognition: /2 (labels, navigation)
+ H7 Efficiency: /2 (shortcuts, bulk ops)
+ H8 Minimal design: /2 (hierarchy, whitespace)
+ H9 Error recovery: /2 (messages, guidance)
+ H10 Help: /2 (tooltips, onboarding)
+ ```
+
+ This replaces the ad-hoc accessibility scoring with a structured UX evaluation.
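The 0-20 tally above reduces to a one-line sum. A hypothetical helper; subscores are the per-heuristic 0-2 values from the checklist.

```python
def heuristic_score(subscores: dict[str, int]) -> int:
    """Sum per-heuristic subscores (each clamped to 0-2) into the 0-20 total."""
    return sum(max(0, min(2, s)) for s in subscores.values())
```

Clamping keeps a miscounted subscore from inflating the total past the 2-point cap per heuristic.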