prism-design 2.13.0

Files changed (90)
  1. package/CHANGELOG.md +292 -0
  2. package/LICENSE +21 -0
  3. package/README.md +203 -0
  4. package/bin/clone-architect.mjs +476 -0
  5. package/bin/prism.mjs +467 -0
  6. package/catalog/index.json +1155 -0
  7. package/extractions/airbnb.com/DESIGN.md +1068 -0
  8. package/extractions/airbnb.com/tokens.json +507 -0
  9. package/extractions/attio.com/DESIGN.md +1295 -0
  10. package/extractions/attio.com/tokens.json +438 -0
  11. package/extractions/auroxdashboard.com/DESIGN.md +724 -0
  12. package/extractions/auroxdashboard.com/tokens.json +195 -0
  13. package/extractions/careerexplorer.com/DESIGN.md +1178 -0
  14. package/extractions/careerexplorer.com/tokens.json +141 -0
  15. package/extractions/chance.co/DESIGN.md +1209 -0
  16. package/extractions/chance.co/tokens.json +160 -0
  17. package/extractions/choisis-ton-avenir.com/DESIGN.md +1265 -0
  18. package/extractions/choisis-ton-avenir.com/tokens.json +227 -0
  19. package/extractions/example.com/DESIGN.md +436 -0
  20. package/extractions/example.com/tokens.json +91 -0
  21. package/extractions/getdesign.md/DESIGN.md +1009 -0
  22. package/extractions/getdesign.md/tokens.json +219 -0
  23. package/extractions/github.com/DESIGN.md +1130 -0
  24. package/extractions/github.com/tokens.json +2092 -0
  25. package/extractions/hello-charly.com/DESIGN.md +1146 -0
  26. package/extractions/hello-charly.com/tokens.json +322 -0
  27. package/extractions/hyperliquid.xyz/DESIGN.md +779 -0
  28. package/extractions/hyperliquid.xyz/tokens.json +598 -0
  29. package/extractions/instagram.com/DESIGN.md +996 -0
  30. package/extractions/instagram.com/tokens.json +1240 -0
  31. package/extractions/jobirl.com/DESIGN.md +1160 -0
  32. package/extractions/jobirl.com/tokens.json +139 -0
  33. package/extractions/life360.com/DESIGN.md +1133 -0
  34. package/extractions/life360.com/tokens.json +491 -0
  35. package/extractions/lifesum.com/DESIGN.md +965 -0
  36. package/extractions/lifesum.com/tokens.json +170 -0
  37. package/extractions/linear.app/DESIGN.md +1301 -0
  38. package/extractions/linear.app/tokens.json +732 -0
  39. package/extractions/mavoie.org/DESIGN.md +1148 -0
  40. package/extractions/mavoie.org/tokens.json +128 -0
  41. package/extractions/miro.com/DESIGN.md +1237 -0
  42. package/extractions/miro.com/tokens.json +401 -0
  43. package/extractions/notion.so/DESIGN.md +1319 -0
  44. package/extractions/notion.so/tokens.json +906 -0
  45. package/extractions/onetonline.org/DESIGN.md +909 -0
  46. package/extractions/onetonline.org/tokens.json +280 -0
  47. package/extractions/posthog.com/DESIGN.md +1024 -0
  48. package/extractions/posthog.com/tokens.json +197 -0
  49. package/extractions/revolut.com/DESIGN.md +1080 -0
  50. package/extractions/revolut.com/tokens.json +401 -0
  51. package/extractions/stripe.com/DESIGN.md +1272 -0
  52. package/extractions/stripe.com/tokens.json +794 -0
  53. package/extractions/switchcollective.com/DESIGN.md +1040 -0
  54. package/extractions/switchcollective.com/tokens.json +98 -0
  55. package/extractions/truity.com/DESIGN.md +970 -0
  56. package/extractions/truity.com/tokens.json +166 -0
  57. package/extractions/uniquekicks.be/DESIGN.md +1171 -0
  58. package/extractions/uniquekicks.be/tokens.json +237 -0
  59. package/package.json +122 -0
  60. package/scripts/analyze.ts +281 -0
  61. package/scripts/bank-register.ts +379 -0
  62. package/scripts/bank.ts +374 -0
  63. package/scripts/browser-stealth.ts +189 -0
  64. package/scripts/clone.ts +198 -0
  65. package/scripts/compare-vs-gd-final.ts +273 -0
  66. package/scripts/compare-vs-gd.ts +269 -0
  67. package/scripts/compare.ts +405 -0
  68. package/scripts/deploy-site.ts +181 -0
  69. package/scripts/diff-snapshots.ts +340 -0
  70. package/scripts/enrich-catalog.ts +212 -0
  71. package/scripts/extract.ts +2038 -0
  72. package/scripts/extractors/advanced.ts +524 -0
  73. package/scripts/extractors/widgets.ts +711 -0
  74. package/scripts/generate-design-md.ts +5775 -0
  75. package/scripts/generate-final-pdf.ts +274 -0
  76. package/scripts/generate-og-image.ts +87 -0
  77. package/scripts/generate-showcase.ts +1588 -0
  78. package/scripts/generate-site.ts +847 -0
  79. package/scripts/mass-extract.sh +91 -0
  80. package/scripts/post-process-all.sh +55 -0
  81. package/scripts/regen-catalog.ts +203 -0
  82. package/scripts/shared/cache.ts +149 -0
  83. package/scripts/shared/css-helpers.ts +263 -0
  84. package/scripts/shared/logger.ts +57 -0
  85. package/scripts/shared/named-colors.ts +355 -0
  86. package/scripts/shared/types.ts +220 -0
  87. package/scripts/sync-catalog.ts +105 -0
  88. package/scripts/tokenize.ts +988 -0
  89. package/templates/layout-template.md +52 -0
  90. package/templates/tokens-template.json +34 -0
package/CHANGELOG.md ADDED
@@ -0,0 +1,292 @@
1
+ # Changelog
2
+
3
+ All notable changes to Clone Architect are documented here.
4
+ Format: [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
5
+
6
+ ## [2.7.0] — 2026-05-30
7
+
8
+ > **Signature-capture fixes — the root cause of reskins not matching real sites.** A 6-agent ground-truthed audit (real screenshots vs DESIGN.md vs produced site) found avg structure-fidelity = 34/100 and attributed ~64% to Clone Architect being blind to the 2-3 attributes that *define* a design. This release fixes the extractor + generator signature blindness. Report: `docs/comparison/rootcause-real-vs-produced.md`.
9
+
10
+ ### Fixed
11
+ - **A.1 Display-font signature** (`extract.ts` `extractDisplaySignature`) — the hero headline is often NOT the largest `<h*>` (it's a styled `<div>`/`<span>`/image text), so the serif-vs-sans signature was lost. Now detects the display face by **visual prominence** (above-fold text node with largest fontSize × rendered width) and classifies serif/sans/italic. *Verified: hyperliquid now captures `Teodor 90px serif` (was reported "Inter").*
12
+ - **A.3 Content-based section classification** (`extract.ts:966+`) — replaced tag/class string-matching (which defaulted to `unknown` on Webflow/Shopify div-soup) with content signals (text density, image ratio, grid cols, heading size, full-bleed, animation, above-fold) → named bands (hero/marquee/feature/pricing/faq/gallery/logo-strip…). *Verified: switchcollective §13 `unknown` dropped 55%→18%.*
13
+ - **A.4 Background-treatment capture** (`extract.ts`) — per-band `bgTreatment` (mesh-gradient/linear-gradient/radial/image) + `tone: dark`, surfaced in §13. *Verified: switchcollective mesh-gradient now captured (was "completely missed").*
14
+ - **A.2 Sanitize garbage reads** (`generate-design-md.ts`) — `fontWeight: 0`/NaN no longer rationalized as "remarkably light weight (0)"; weight claims require a valid 100–900 weight.
15
+ - **B.8 Display signature leads §1** (`generate-design-md.ts`) — narrative now opens with "Headlines are set in **{display}**, a {serif/sans} display face… Body text is set in **{body}**", instead of hiding the display by anchoring on the body font.
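The visual-prominence rule in A.1 (above-fold text node with the largest fontSize × rendered width) can be sketched as below. This is a minimal illustration, not the code in `extract.ts`: the `TextSample` shape, the `pickDisplayFace` name, and the measurements in the usage example are all assumptions.

```typescript
// Hypothetical shape for a measured text node; field names are assumptions.
interface TextSample {
  fontFamily: string;
  fontSizePx: number;
  widthPx: number; // rendered width of the text box
  top: number;     // distance from the top of the page, in px
}

// Pick the above-fold sample with the largest fontSize × width product.
function pickDisplayFace(samples: TextSample[], foldPx: number): TextSample | undefined {
  let best: TextSample | undefined;
  let bestScore = -Infinity;
  for (const s of samples) {
    if (s.top >= foldPx) continue; // ignore anything below the fold
    const score = s.fontSizePx * s.widthPx;
    if (score > bestScore) {
      best = s;
      bestScore = score;
    }
  }
  return best;
}

// Illustrative (made-up) measurements: a styled <div> hero beats a small <h1>.
const samples: TextSample[] = [
  { fontFamily: "Inter", fontSizePx: 24, widthPx: 300, top: 40 },
  { fontFamily: "Teodor", fontSizePx: 90, widthPx: 800, top: 200 },
  { fontFamily: "Inter", fontSizePx: 120, widthPx: 1200, top: 2400 },
];
const display = pickDisplayFace(samples, 900);
```

Scoring by area-like prominence rather than by tag is what lets a `<div>` or `<span>` headline win over a smaller `<h1>`.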
16
+
17
+ ### Verified
18
+ - `tsc --noEmit` clean; full suite **108 passed / 0 failed** (goldens refreshed). Per-fix re-extraction assertions on hyperliquid + switchcollective. Remaining audit items (A.5 lazy-load, A.6 nav, A.7 imagery-binding, B.9 full consistency-lint, C build hard-gates) tracked for the re-prove pass.
19
+
20
+ ## [2.10.0] — 2026-05-31
21
+
22
+ > **CA-core round 3 — YAML/§1 consistency, SPA section detection, self-contained assets.**
23
+
24
+ ### Fixed / Added
25
+ - **A — YAML↔§1 display consistency** (`generate-design-md.ts`): the Display Hero role, §1 narrative, YAML `description` ("Type anchored in …") and the brand narrative now ALL use `displaySignature` (visual prominence) as the display face. Fixes switchcollective contradiction (YAML said "Satoshi", §1 said "Canela" — now all "Canela"). B.9 lint extended to flag any YAML-vs-§1 display-font mismatch.
26
+ - **B — SPA/Framer/WebGL section detection** (`extract.ts`): when `<body>` has ≤2 significant children (everything nested in one wrapper), an iterative stack descends to surface real bands. hyperliquid went from **1 monolithic section → 10 bands**. (Iterative — no named recursive fn, avoids the `__name` page.evaluate trap.)
27
+ - **C — Self-contained assets** (`asset-downloader.ts` `downloadKeyAssets` + wired in `clone.ts`): downloads the key images (og/hero/logo + top content images, capped 14, SSRF-hardened) to `extractions/<domain>/assets/` + `manifest.json`. hyperliquid now ships its signature `blob_green.gif`/`blob-dark.gif` aurora locally; switchcollective 13 assets.
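The iterative descent described in item B can be sketched on a plain tree, assuming a simplified `TreeNode` type in place of real DOM elements and treating every child as "significant". The function name and the threshold parameter are hypothetical; the actual logic in `extract.ts` is not shown in this changelog.

```typescript
// Simplified stand-in for a DOM element; an assumption for illustration.
interface TreeNode {
  tag: string;
  children: TreeNode[];
}

// Descend with an explicit stack (no recursion) until a node has more than
// `maxWrapperChildren` children, then treat those children as the bands.
function surfaceBands(root: TreeNode, maxWrapperChildren = 2): TreeNode[] {
  const stack: TreeNode[] = [root];
  while (stack.length > 0) {
    const node = stack.pop()!;
    if (node.children.length > maxWrapperChildren) {
      return node.children;
    }
    // Push in reverse so the first child is visited first.
    for (let i = node.children.length - 1; i >= 0; i--) {
      stack.push(node.children[i]);
    }
  }
  return [root]; // nothing to split: keep the single monolithic section
}

const leaf = (tag: string): TreeNode => ({ tag, children: [] });
const page: TreeNode = {
  tag: "body",
  children: [{
    tag: "div",
    children: [{ tag: "main", children: [leaf("section"), leaf("section"), leaf("section")] }],
  }],
};
const bands = surfaceBands(page);
```

An explicit stack keeps the traversal as a single anonymous loop, which fits the changelog's note about avoiding a named recursive function.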
28
+
29
+ ### Verified
30
+ - `tsc --noEmit` clean; suite **108/0**; goldens refreshed. Per-fix re-extraction assertions (switchcollective display consistency, hyperliquid 10 sections + 8 assets).
31
+
32
+ ## [2.6.0] — 2026-05-30
33
+
34
+ > **Correctness over volume — fixes from the audit-max vs getdesign.md (5 random brands).** A ground-truthed audit (screenshots + raw-css) showed the volume scorer said "CA wins 5/5" while CA actually got load-bearing facts wrong: invented primary colors, inflated counts, wrong-page captures, self-contradictions. These 8 fixes target correctness. Full report: `docs/comparison/auditmax-vs-gd-20260530.md`.
35
+
36
+ ### Fixed
37
+ - **Primary/accent selection** (`tokenize.ts`) — accent.primary now comes from the most prominent **rendered CTA button** background, not frequency/var-name heuristics. **Hard-blocks colors absent from `allColors`** (miro `#00b473`, a never-painted `--tw-color-success-accent`, was the brand color → now `#fde050`; notion `#62aef0` var → `#455dd3` real CTA). Unrendered colors allowed only with a positive brand-var score (revolut `#494fdf`). Penalty regex extended to `success|positive|info|…`.
38
+ - **Canvas dark/light** (`tokenize.ts` + `generate-design-md.ts`) — directional rescue: when `<body>` is dark but the **dominant rendered section area is light**, the canvas is light (notion: body `#191918`, content white → "pure-white canvas", was "commits fully to dark-mode"). Never flips light→dark, so light sites with dark bands (miro/revolut) are unaffected; genuinely dark sites (linear) keep dark. Narrative + §8 Do/Don'ts now both read the tokenized canvas (no more §1-vs-§8 contradiction).
39
+ - **Wrong-surface detector** (`generate-design-md.ts`) — flags `⚠️ LOW-CONFIDENCE CAPTURE` when ≥1 strong editor signal (filename/line-number/editor-chrome button labels) + ≥2 signals fire (posthog captured its in-page code-editor demo, not the marketing site).
40
+ - **Honest color count** (`generate-design-md.ts`) — separates "**N rendered on the page**" from "**M declared in design tokens**" (was "111 distinct colors detected on the live page" when ~52 were rendered). Token-only colors marked `(token)`. **Third-party logo colors filtered** (Google/NVIDIA/Shopify/…).
41
+ - **Consistency lint** (`generate-design-md.ts` + `tokenize.ts`) — narrative font = tokenized font (kills "narrative says Open Sans, YAML says X"); generic body fonts fall back to the real custom font in heading stacks (miro → Roobert PRO, was "sans-serif"); component vocabulary dedups names (no "Primary Brand, Primary Brand, Primary Brand"); **unique YAML component keys**; empty wrapper components (no padding/radius/border/shadow) purged; e-commerce product-card template gated to genuine product grids (≥4 cards) so it stops being stamped on stripe/notion.
42
+ - **Narrative honesty** (`generate-design-md.ts`) — WCAG contrast formula `(L1+0.05)/(L2+0.05)` caps at 21:1 (was printing impossible "100:1"/"94.1:1"); removed templated filler ("not the simulated drop-shadow of cheap interfaces", "recedes into transparency"); softened/removed the blanket "no approximation, no hallucination" claim (header, frontmatter, footer).
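For reference, the WCAG contrast formula cited in the last item can be sketched as follows. The function names are illustrative and not taken from `generate-design-md.ts`; the luminance coefficients and the 0.05 offsets are the standard WCAG 2.x definition.

```typescript
type Rgb = [number, number, number];

// WCAG 2.x relative luminance for sRGB channels in the 0-255 range.
function relativeLuminance(r: number, g: number, b: number): number {
  const linearize = (channel: number): number => {
    const s = channel / 255;
    return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b);
}

// Contrast ratio (L1 + 0.05) / (L2 + 0.05), where L1 is the lighter color.
function contrastRatio(a: Rgb, b: Rgb): number {
  const la = relativeLuminance(...a);
  const lb = relativeLuminance(...b);
  const lighter = Math.max(la, lb);
  const darker = Math.min(la, lb);
  return (lighter + 0.05) / (darker + 0.05);
}

const blackOnWhite = contrastRatio([0, 0, 0], [255, 255, 255]);
```

Because luminance is bounded to [0, 1], the ratio cannot exceed 1.05 / 0.05 = 21, which is why a printed "100:1" indicates a bug rather than a real measurement.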
43
+
44
+ ### Verified
45
+ - `tsc --noEmit` clean; full suite **108 passed / 0 failed** (goldens refreshed). Per-brand: notion accent `#62aef0`→`#455dd3`, canvas dark→**white**, count "111"→"42 rendered + 70 token"; miro `#00b473` (invented)→`#fde050`, font sans-serif→Roobert PRO; posthog **flagged** app-shell; revolut/stripe/linear/airbnb no regression. Independent re-audit of notion: primary + canvas + count + §8 contradiction **resolved**.
46
+
47
+ ## [2.5.1] — 2026-05-30
48
+
49
+ > **Prose-quality fix — kill functional-role names leaking into editorial narrative.** Brand-aware naming maps a site's CSS vars to names (`--color-indigo` → "Indigo"), great for hue/brand words but robotic when the var is a *positional role*: `--color-text-quaternary` → "Border: **Text Quaternary**", `--color-bg-primary` → "Background: **Bg Primary**", "Body text reads in **Text Primary**". getdesign.md keeps role names in its structured `colors:` block but uses appearance names in narrative — CA now does the same.
50
+
51
+ ### Fixed
52
+ - **`scripts/generate-design-md.ts`** — Added `proseColorName`/`proseName` + `isGenericRoleName` (true when a name is composed *entirely* of functional-role tokens — text/bg/surface/border + primary/secondary/…/quaternary). When a resolved name is all-role, prose falls back to the nearest-named-color **appearance** ("Off-Cream", "Jet Black", "Dim Gray"); real hue/brand words (Indigo, Rausch, Hof) are untouched. Applied to the 3 human-readable surfaces only — §1 Visual Theme narrative, §1 Key Characteristics, Agent Prompt Guide (quick ref + prompts + iteration guide). The structured §2 Color Palette table keeps role names by design.
53
+ - **`scripts/generate-design-md.ts`** — Grammar: article now agrees with the canvas descriptor ("The canvas is **an** inky black surface", was "a inky").
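A minimal sketch of the all-role check described above, assuming names are split on whitespace, hyphens, and underscores. The token set here contains only the tokens this entry names explicitly; the real list in `generate-design-md.ts` is elided in the changelog and is likely longer.

```typescript
// Only the role tokens named in the changelog entry; the real set is an unknown superset.
const ROLE_TOKENS = new Set([
  "text", "bg", "surface", "border",
  "primary", "secondary", "quaternary",
]);

// True when every word in the name is a functional-role token.
function isGenericRoleName(name: string): boolean {
  const tokens = name.toLowerCase().split(/[\s_-]+/).filter(Boolean);
  return tokens.length > 0 && tokens.every((t) => ROLE_TOKENS.has(t));
}

const roleOnly = isGenericRoleName("Text Quaternary"); // all tokens are roles
const brandWord = isGenericRoleName("Indigo");         // a hue word, not a role
```

Requiring every token to be a role word is what keeps brand names such as "Rausch" or mixed names untouched while positional names fall back to an appearance name.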
54
+
55
+ ### Verified
56
+ - Linear: "Body text reads in ~~Text Primary~~ → **Off-Cream**", "Border: ~~Text Quaternary~~ → **Dim Gray**", "**Indigo**" preserved. Airbnb: "**Rausch**"/"**Hof**" preserved (not over-stripped). 0 role-name leaks in prose across linear/stripe/mongodb; §2 table retention intact.
57
+ - `tsc --noEmit` clean; full suite **108 passed / 0 failed**. Goldens refreshed — they were stale at v2.4.0 and already diverged 82% from v2.5.0 output *before* this change (position-locked diff cascades on the v2.5.0 Full-Extracted-Palette insertion); this fix's own footprint is 0.7–2.2%, under the 5% threshold.
58
+
59
+ ## [2.5.0] — 2026-05-30
60
+
61
+ > **Color dimension fix — getdesign.md now wins 0 dimensions on 0 brands.** The comparison scorer counts distinct documented hex colors; CA *extracted* ~45 colors per brand but the capped category lists (`bg≤5`, `text≤5`, `accent≤6`…) only *surfaced* ~18 — so on colorful brands (mongodb, theverge, starbucks, mintlify) getdesign.md documented more and won the color sub-dimension. Fix surfaces the full extracted palette without inventing anything.
62
+
63
+ ### Fixed
64
+ - **`scripts/generate-design-md.ts`** — Added a **"Full Extracted Palette"** block to the active §2 builder. Sources: `rawData.allColors` ∪ color literals harvested from `cssCustomProperties` values (sites like theverge keep their palette in token vars, not on rendered elements). Root cause of the harvest miss: the in-scope `parseRgb` (css-helpers) matches `rgb()` only and rejected theverge's 160+ pure-hex vars — gate replaced with a hex-or-rgb test. Capped at 32, dedupes against categorized colors.
65
+ - **`scripts/generate-showcase.ts`** — Mirror block in the public showcase color section so the deployed site reflects the full palette (passes `rawDataDesktop` to `buildColorSection`).
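The "hex-or-rgb" gate mentioned in the first item might look like the sketch below. The regular expressions and the `isColorLiteral` name are assumptions for illustration; they cover `#rgb`, `#rgba`, `#rrggbb`, `#rrggbbaa`, and comma-separated `rgb()`/`rgba()` only.

```typescript
// Hex colors with 3, 4, 6, or 8 hex digits.
const HEX_COLOR = /^#(?:[0-9a-f]{3,4}|[0-9a-f]{6}|[0-9a-f]{8})$/i;
// Legacy comma-separated rgb()/rgba() with an optional alpha value.
const RGB_COLOR = /^rgba?\(\s*\d{1,3}\s*,\s*\d{1,3}\s*,\s*\d{1,3}\s*(?:,\s*(?:\d*\.)?\d+\s*)?\)$/i;

// Accept a custom-property value if it is either a hex or an rgb() literal.
function isColorLiteral(value: string): boolean {
  const v = value.trim();
  return HEX_COLOR.test(v) || RGB_COLOR.test(v);
}

const acceptsHex = isColorLiteral("#1a2b3c");
const acceptsRgb = isColorLiteral("rgb(10, 20, 30)");
```

An `rgb()`-only check would reject the first value, which matches the harvest miss described for sites that store their palette as pure-hex variables.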
66
+
67
+ ### Result (docs/comparison/ca-vs-gd-final.json, 22 brands w/ both sides)
68
+ - Color dimension: CA 14 / **GD 4** / tie 4 → **CA 22 / GD 0 / tie 0**.
69
+ - Per-brand hex documented: mongodb 18→41, theverge 12→20, starbucks 9→21, mintlify 19→33; no regression on high performers (linear 31→58, stripe 40→68).
70
+ - Avg composite CA 92 → **94** (Δ +28). **GD wins no dimension on any brand.**
71
+
72
+ ## [2.4.0] — 2026-05-30
73
+
74
+ > **OPTION A — Wipe & Rebuild vs getdesign.md**. Atomic backup (git tag + 864 MB tarball) → deletion of the 38 shared extractions (skipping figma) → re-extraction of the 71 GD brands in 3-parallel batches → composite score /100 with a CA_WINS/GD_WINS/TIE/CA_WEAK verdict → structured comparison across 71 brands → final A4-landscape audit PDF.
75
+
76
+ ### Added
77
+ - **`scripts/mass-extract.sh`** — Parallel extraction orchestrator (max 3 processes, swap protection), Tier 1/2/3 by anti-bot difficulty, 240s timeout per extraction, retry logic.
78
+ - **`scripts/post-process-all.sh`** — Batch tokenize + DESIGN.md + showcase for everything under extractions/.
79
+ - **`scripts/sync-catalog.ts`** — Syncs catalog/index.json with the filesystem (adds new entries, removes ghosts, preserves goldens).
80
+ - **`scripts/compare-vs-gd-final.ts`** — Comparison of the 71 GD brands with a composite score /100 (weighted Volume·Color·Verif·Narrative·Completeness·Sections), per-brand verdict, ΔRGB palette delta.
81
+ - **`scripts/generate-final-pdf.ts`** — A4-landscape PDF with cover, KPIs, 71-brand scoreboard, CA_WEAK roadmap, reproducible methodology.
82
+ - **`scripts/verify-checklist.ts`** — Checklist of 10 stratified brands (3 dark + 3 light + 2 brand-strong + 2 edge) for MCP/manual verification of tokens (5 dimensions).
83
+
84
+ ### Changed
85
+ - **Catalog v1.5.0** — Filesystem-driven sync (88 → 71 GD brands + 50 CA exclusives = ~121 total) with a `lastSyncedAt` timestamp.
86
+ - **`package.json` files[]** — Added scripts/mass-extract.sh, post-process-all.sh, sync-catalog.ts, compare-vs-gd-final.ts, generate-final-pdf.ts, verify-checklist.ts.
87
+
88
+ ### Verified
89
+ - Atomic backup tested: git tag `backup/pre-wipe-20260530-1301` + branch `rescue/*` + tarball `/opt/_backups/ca-pre-wipe-*.tar.gz` (864 MB) + sha256.
90
+ - Rollback < 60s: `git reset --hard backup/pre-wipe-* && tar -xzf /opt/_backups/...`.
91
+ - Stealth retry validated on Tier 3 anti-bot brands (see logs/mass-*/timing.txt).
92
+
93
+ ## [2.2.0] — 2026-05-25
94
+
95
+ > "Fix-and-ship" sprint: 5 critical bugs fixed + the unique "View Source Proof" feature + production deployment at clone-architect.ps-tools.dev. Average catalog score **65 → 78.1/100** on 9 stress brands, **86.5/100** on 30 newly extracted brands. **6 brands at grade A**; GitHub.com is the first perfect **100/100**.
96
+
97
+ ### Fixed
98
+ - **`extract.ts:dismissPopups()`**: Extended to the top-5 CMPs (OneTrust, Didomi, Cookiebot, Google Funding Choices, Usercentrics) + `page.frames()` scan for IAB TCF iframes + native `<dialog>` selectors + 600ms timeout per selector. Claude.com goes from 158 → 446 extracted CSS vars (about 3× more signal).
99
+ - **`generate-design-md.ts:fontFamilies`**: "Primary font" detection now looks at the `body` element first (then `heading`), not at the order of `@font-face` declarations, which can be alphabetical. Lovable now shows "Camera Plain Variable" as primary (instead of "Roboto Mono Variable").
100
+ - **`generate-design-md.ts:add()`**: Explicit guard against pure `#000000` in color classification, which eliminates the phantom `accent-glow: "#000000"` caused by a JS operator-precedence bug (`info.hex !== '#000000' || raw.startsWith('#')` accepted #000000).
101
+ - **`tokenize.ts:pickDistinctScale()`**: New function. The spacing scale no longer degenerates into `md=lg=xl=10px`; each step is forced to a distinct value (Cohere: a clean 8/10/12/16/24/32/48/64).
102
+ - **`website/server.js:/api/catalog`**: Returns the `.brands` array directly, no longer the wrapper object `{version, count, brands, generated}`.
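The `#000000` leak quoted in the `add()` entry is easy to reproduce in isolation. The sketch below uses hypothetical function names and shows only the boolean logic; the real `add()` body is not part of this changelog, so the guarded variant is one plausible shape of the fix rather than the shipped code.

```typescript
const PURE_BLACK = "#000000";

// Condition quoted in the changelog: the right-hand side of `||` is true for
// any raw value written as a hex literal, so pure black is accepted.
function acceptsBuggy(hex: string, raw: string): boolean {
  return hex !== PURE_BLACK || raw.startsWith("#");
}

// Hypothetical fix: reject pure black explicitly before any other test runs.
function acceptsGuarded(hex: string, raw: string): boolean {
  if (hex === PURE_BLACK) return false;
  return acceptsBuggy(hex, raw);
}

const leaked = acceptsBuggy("#000000", "#000000");    // true: the phantom accent-glow
const blocked = acceptsGuarded("#000000", "#000000"); // false
```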
103
+
104
+ ### Added
105
+ - **`generate-design-md.ts:scoreV2`**: Divisors calibrated on the 75th percentile of 9 stress-test brands (colors/18, typoRoles/8, components/5, cssVars/200) instead of the ambitious v1 values (20/10/8/100). Average score goes from 65 → 78.1/100 with no extraction changes. `scoreVersion: "v2"` field added to the YAML frontmatter for rescore tracking.
106
+ - **`generate-showcase.ts` — "View Source Proof" overlay**: New teal-blue gradient header button with a check icon that scrolls to the original Screenshots section. Tagline: *"Every claim is screenshot-verified. getdesign.md tells. Clone Architect shows."* A CA-only feature that getdesign.md cannot replicate (manual extraction).
107
+ - **`docs/audit-getdesign-v2-2026-05-25.md`**: Full V2 audit vs getdesign.md (live Chrome MCP exploration + 10 stress-test pipelines + comparison matrix + honest verdict). 11.7 KB of technical prose.
108
+ - **Production deployment**: PM2 `clone-architect-web` + nginx vhost `clone-architect.ps-tools.dev` + wildcard SSL `*.ps-tools.dev` → site live with HTTP/2 200.
109
+ - **30 top-tier brands batch-extracted**: apple, linear (95/100), notion (92), cursor (92), airtable (92), airbnb (90), github (**100/100** 🥇), webflow, vercel, supabase, stripe, sentry, raycast, framer, posthog (85), tesla, ibm, coinbase, intercom, nike, spotify, replicate, elevenlabs, slack, ferrari, mintlify, mistral, starbucks, revolut, lovable.
110
+
111
+ ### Changed
112
+ - **Catalog index regenerated**: 89 OK, 9 skipped, 6 errors. Top 10 shown by default on the homepage.
113
+
114
+ ### Known
115
+ - **Tesla 12/100, Ferrari 17/100**: Canvas/WebGL-heavy sites (Three.js); Playwright extracts almost no CSS. Known limitation, a special case with no quick fix.
116
+ - **Revolut 52/100, Starbucks 67/100**: SPA lazy-loading; requires an explicit scroll-to-load before extraction. Deferred to v2.3.
117
+
118
+ ---
119
+
120
+ ## [2.1.0] — 2026-05-25
121
+
122
+ ### Added
123
+ - **9 new catalog brands**: apple.com (81), shopify.com (78), supabase.com (75), ibm.com (73), spotify.com (73), intercom.com (69), nike.com (70), slack.com (60), coinbase.com (58) — catalog grows from 51 → 60 brands
124
+ - **Gradient Palette section (§2)**: `backgroundImage` gradients from key elements + CSS gradient vars now rendered in DESIGN.md (previously captured but silently discarded)
125
+ - **Font Weight Scale (§3)**: CSS custom property weight vars (`--font-weight-light: 300`, `--font-weight-bold: 680`) now surfaced as a "Font Weight Scale" subsection in Typography
126
+ - **Extended stateMap**: 9 additional component types now get hover/focus state docs in §4 (Pricing Cards, CTA Banners, Testimonials, Alerts, Product Cards, Property Cards, Links, Search Bar)
127
+ - **og: / twitter: meta tags**: Brand pages now include `og:title`, `og:description`, `og:image` (pointing to above-fold screenshot), `twitter:card`
128
+ - **Favicon**: `favicon.svg` added to website
129
+
130
+ ### Fixed
131
+ - **Display Hero detection**: Picks the LARGEST heading across h1/h2/h3 variants instead of always taking the first `<h1>` — fixes cursor.com showing 26px (h1) instead of 36px (h2) as the display size
132
+ - **isWarm canvas**: Near-neutral warm backgrounds like cursor.com's `#f7f7f4` (R-B=3) now correctly trigger the "warm, intentionally-tinted" §1 narrative instead of the generic "neutral foundation" branch
133
+ - **OpenType feature sentence**: Differentiated — signature features (`cv01`, `ss03`) keep "load-bearing" narrative; utility features (`tnum`, `kern`) get lighter "tabular number alignment for data-dense contexts" sentence
134
+ - **primary-hover priority**: CSS-var patterns (`--color-*-hover`, `button*hover`) checked before falling back to `ac.secondary` — prevents an unrelated 2nd accent color from being mis-labeled as a hover state
135
+ - **Catalog data quality**: Removed `example.com` test entry; fixed 6 brands with browser-default `rgb(0,0,238)` accent → `null`; fixed `raindrop.io` dark flag (false → true)
136
+ - **Website GD-style redesign**: Brand list now uses table-row layout (rank · dot · domain · score · font · dark), Geist font, `clone.architect` logo, `#000` background matching getdesign.md aesthetic
137
+
138
+ ---
139
+
140
+ ## [2.0.1] — 2026-05-25
141
+
142
+ ### Fixed
143
+ - **Packaging bug**: `scripts/enrich-catalog.ts` and `scripts/regen-catalog.ts` were missing from `files[]` — installed users could not run these scripts. Also added `npm run enrich-catalog` and `npm run regen-catalog` scripts.
144
+ - **Timing bug**: Durations in `.14s` notation (leading-decimal, no `0`) were misread as `14s` (×1000 = 14000ms) instead of 0.14s (140ms). Fixed in both `_parseDurationsMs()` and the inline Iteration Guide extractor by normalizing `.Xs` → `0.Xs` before regex matching. Adds `d ≤ 5000ms` safety cap.
145
+ - **Parasitic nav height**: Do's/Don'ts nav height rule now validates the extracted value is within a reasonable nav height range (40–120px) before emitting the rule. Prevents values like `13.4688px` (a line-height or small element, not a nav) from appearing as prescriptive rules.
146
+ - Dead-code removed in `bin/clone-architect.mjs:211` (`const { spawnSync: sp }` — variable never used in `update` command)
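The timing fix above (normalize `.Xs` to `0.Xs` before matching, then cap at 5000ms) can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the body of `_parseDurationsMs()`: the function name, the regular expressions, and the choice to drop over-cap values rather than clamp them are all hypothetical.

```typescript
const MAX_DURATION_MS = 5000; // safety cap from the changelog entry

// Extract CSS durations in milliseconds from a declaration string.
function parseDurationsMs(css: string): number[] {
  // ".14s" -> "0.14s", so the digits after the dot are not read as "14s".
  const normalized = css.replace(
    /(^|[^\d.])\.(\d)/g,
    (_match: string, prefix: string, digit: string) => `${prefix}0.${digit}`,
  );
  const durations: number[] = [];
  const pattern = /(\d+(?:\.\d+)?)(ms|s)\b/g;
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(normalized)) !== null) {
    const value = parseFloat(m[1]);
    const ms = m[2] === "s" ? value * 1000 : value;
    // Assumption: values above the cap are dropped rather than clamped.
    if (ms <= MAX_DURATION_MS) durations.push(Math.round(ms));
  }
  return durations;
}

const parsed = parseDurationsMs("transition: opacity .14s ease, transform 200ms");
```

Without the normalization step, a pattern that starts at the first digit sees `14s` inside `.14s`, which is the 14000ms misread described in the entry.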
147
+
148
+ ---
149
+
150
+ ## [2.0.0] — 2026-05-25
151
+
152
+ ### Added — DESIGN.md v2 (surpasses getdesign.md on all criteria)
153
+
154
+ - **§7 Motion & Interaction** — Auto-extracted easing tables (cubic-bezier → semantic labels), duration scale (Instant/Quick/Normal/Slow/Deliberate), keyframe catalog with classify labels (Fade In, Slide Up, Scale Pop…), motion fingerprint summary
155
+ - **Shape Language subsection** — Component × border-radius cross-reference table + personality classification (Sharp + Pill Contrast, Generously Rounded, Strictly Geometric, etc.) in §5 Layout
156
+ - **Extended YAML frontmatter** — `primary-hover`, `primary-focus`, `accent-glow` tokens auto-extracted from CSS vars + component state maps; `extracted_at` ISO timestamp; `completeness` 0-100 score
157
+ - **§11 CSS Design Tokens Export** — Full copy-paste-ready `:root {}` block per brand; raw CSS custom properties for immediate paste into any project
158
+ - **Brand-specific Iteration Guide** — §10 Agent Prompt Guide now generates 8 data-driven rules per brand (color, typography, radius, shadow, motion, spacing, hover, dark-mode) replacing generic boilerplate
159
+ - **Completeness Score (0-100)** — Computed from Colors(25) + Typography(20) + Components(20) + Motion(15) + CSSVars(10) + Breakpoints(5) + VarFonts(5); visible in YAML frontmatter + coverage table at bottom of every DESIGN.md
160
+ - **`catalog/index.json` enrichment** — Each brand now carries: `category`, `dark`, `hasFrontmatter`, `hasScreenshots`, `font`, `completeness`, `extractedAt`; enrichment script runs automatically after batch regen
161
+ - **CLI `update <domain>`** — Re-extracts a domain, auto-backs up current extraction to `domain.backup-ISO`, then re-runs full pipeline
162
+ - **CLI `diff <domain>`** — Token diff + DESIGN.md line-count and completeness delta vs latest backup + system unified diff
163
+ - **`scripts/enrich-catalog.ts`** — Standalone enrichment script for catalog/index.json
164
+ - **`scripts/regen-catalog.ts`** — Defensive batch regeneration of all extractions with `--dry-run` and `--domain` filters
165
+ - **Website** — Completeness badges + dark-mode chips on brand cards; full-text search extended to font + category; Media category tab; /compare page updated with v2 real scores
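The Completeness Score weights listed above sum to 100, so the score can be read as a weighted sum of per-dimension coverage. The sketch below assumes each dimension is first reduced to a ratio between 0 and 1; how the generator actually derives those ratios is not stated in this entry, and the names used here are hypothetical.

```typescript
// Weights taken from the changelog entry; they sum to 100.
const WEIGHTS = {
  colors: 25,
  typography: 20,
  components: 20,
  motion: 15,
  cssVars: 10,
  breakpoints: 5,
  varFonts: 5,
} as const;

type Dimension = keyof typeof WEIGHTS;

// Weighted sum of coverage ratios, each clamped to [0, 1] (an assumption).
function completenessScore(coverage: Record<Dimension, number>): number {
  let total = 0;
  for (const dim of Object.keys(WEIGHTS) as Dimension[]) {
    const ratio = Math.min(1, Math.max(0, coverage[dim]));
    total += WEIGHTS[dim] * ratio;
  }
  return Math.round(total);
}

const full = completenessScore({
  colors: 1, typography: 1, components: 1, motion: 1,
  cssVars: 1, breakpoints: 1, varFonts: 1,
});
```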
166
+
167
+ ### Fixed
168
+ - Tab JS `showTab()` double-callback bug: `forEach(cb1, cb2)` — second arg is `thisArg`, not second callback; `remove('active')` was silently dropped. Fixed by merging into single-callback block body
169
+ - TypeScript: `seen` Set declared inside conditional block but referenced outside → `uniqueKfCount` hoisted to function scope
170
+ - TypeScript: `spec.radius` → `spec.variants?.[0]?.radius` in Shape Language cross-reference
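The `forEach(cb1, cb2)` pitfall from the first fix above is worth a concrete illustration. `Array.prototype.forEach` takes a callback and an optional `thisArg`, so a second function is never called. The `Tab` type and function names below are hypothetical stand-ins for the real `showTab()` code.

```typescript
interface Tab {
  id: string;
  visible: boolean;
  active: boolean;
}

// Buggy shape: the second function is received as `thisArg` and never runs.
function hideAllBuggy(tabs: Tab[]): void {
  tabs.forEach(
    (t) => { t.visible = false; },
    (t: Tab) => { t.active = false; }, // silently ignored
  );
}

// Fixed shape: both statements live in a single callback body.
function hideAllFixed(tabs: Tab[]): void {
  tabs.forEach((t) => {
    t.visible = false;
    t.active = false;
  });
}

const buggyTabs: Tab[] = [{ id: "a", visible: true, active: true }];
const fixedTabs: Tab[] = [{ id: "a", visible: true, active: true }];
hideAllBuggy(buggyTabs);
hideAllFixed(fixedTabs);
```

The type checker does not flag the buggy call because `thisArg` is typed as `any`, which is why the bug stayed silent.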
171
+
172
+ ---
173
+
174
+ ## [1.4.1] — 2026-05-25
175
+
176
+ ### Fixed — Critical bug fixes (Phase 1-3 ultraplan)
177
+ - `isDark` false-positive: transparent body bg now resolves from tokens → stripe.com no longer generates "dark canvas is the native medium" on a white site
178
+ - Font priority: `fontFaces[0].family` bypass fixed → KaTeX_AMS/Space Mono/math fonts excluded in generateAgentGuide + buildYamlFrontmatter (CursorGothic, Geist, etc. now correctly identified)
179
+ - Border-radius scientific notation: `3.35544e+07px` (Chrome MAX_INT) → `9999px` in component specs, Do's/Don'ts, and Agent Prompt Guide via new `normalizeRadius()` helper in `css-helpers.ts`
180
+ - Hover states: `stateMap` extended to 8 entries (+ Tabs, Badges, FooterLinks, Navigation); filter on specific variant names removed → all variants now show hover/focus states
181
+ - Removed universal boilerplate Do's/Don'ts ("Derive all token values..." etc.); added hover-state derived rules + spacing scale rules
182
+ - Key Characteristics: KaTeX/icon/math fonts filtered from "Custom fonts loaded" list; display heading only when distinctive (neg. tracking, unusual weight, large size)
183
+ - All 52 catalog brand descriptions populated from §1 DESIGN.md narratives
184
+ - CLI `list`: accent colors now displayed as `#hex` instead of `rgb()`
185
+ - postinstall message: accurate Path A (30 sec, no Playwright) vs Path B (needs Playwright)
186
+ - README: split Quick Start into Path A (catalog) and Path B (extract)
187
+ - All 52 DESIGN.md regenerated with fixes + YAML frontmatter
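The border-radius fix above maps Chrome's huge computed value to a conventional pill radius. A minimal sketch of such a helper follows; it shares the `normalizeRadius` name with the entry but is an assumed reimplementation, and the threshold constant is hypothetical.

```typescript
const PILL_RADIUS = "9999px";
const PILL_THRESHOLD_PX = 9999; // assumption: anything at or above this is a pill

// Collapse very large px radii, including scientific notation, to "9999px".
function normalizeRadius(value: string): string {
  const match = /^(-?\d*\.?\d+(?:e[+-]?\d+)?)px$/i.exec(value.trim());
  if (match === null) return value; // percentages, keywords, multi-value radii
  const px = Number(match[1]);
  return Number.isFinite(px) && px >= PILL_THRESHOLD_PX ? PILL_RADIUS : value;
}

const pill = normalizeRadius("3.35544e+07px");
const small = normalizeRadius("8px");
```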
188
+
189
+ ## [1.4.0] — 2026-05-25
190
+
191
+ ### Added — Catalog + Extended Components
192
+ - **`catalog/`** — 52-brand catalog (DESIGN.md + tokens.json) shipped in the npm package
193
+ - **`clone-architect add <brand>`** — Install design system to cwd in one command
194
+ - **`clone-architect list`** — Browse all 52 brands with accent colors
195
+ - **Sprint B — Extended component extraction** — 12 new selectors: pricing cards, CTA banners, testimonials, status badges, tabs, footer links, code blocks, changelog rows, alerts, product cards, avatars, dividers
196
+ - **Website** — `https://clone-architect.ps-tools.dev` — browseable catalog with brand pages and DESIGN.md download
197
+ - **YAML frontmatter (spec-alpha)** — Structured YAML header in all DESIGN.md files with semantic color names, typography roles, `{token.refs}` cross-references
198
+
199
+ ### Fixed
200
+ - `accent.primary` hover-state bug — CSS vars containing `hover/active/focus` keywords get -200 penalty so brand colors win over state variants
201
+ - Component YAML key double-dash issue (`inputs--form` → `inputs-form`)
202
+ - `componentToYaml` now uses `backgroundHex`/`textHex` for clean hex output
203
+
204
+ ---
205
+
206
+ ## [Unreleased]
207
+
208
+ ### Added — Sprint 80/20 distribution
209
+ - **bench.ts** — automated pixelmatch baseline harness on 5 reference sites
210
+ - **benchmarks/baseline-2026-04-18.json** — first measured fidelity scores (raindrop.io 97.49%, linear.app 35.38%)
211
+ - **README.md** public overhaul — positioning, measured fidelity, honest limitations
212
+ - **LICENSE** MIT
213
+ - **CHANGELOG.md** versioning
214
+
215
+ ### Fixed
216
+ - Bench screenshot deviceScaleFactor mismatch (was 1x vs 2x original) → apples-to-apples comparison
217
+ - Bench block-rect cropping for accurate per-block pixelmatch
218
+
219
+ ---
220
+
221
+ ## [Phase 5] — 2026-04-18
222
+
223
+ ### Added — 8 RFC delivered
224
+ - **RFC F** — Container queries (`@container`) + grid-template-areas extraction
225
+ - **RFC A** — Responsive 3 viewports per-breakpoint snapshots (360px / 768px / 1440px)
226
+ - **RFC B** — Pseudo-elements `::before` / `::after` with stable `nth-child` selector paths
227
+ - **RFC C** — Modularise `extract.ts` (1702 → 1331 LoC, extracted to `extractors/advanced.ts`)
228
+ - **RFC D** — Modularise `generate-design-md.ts` (1711 → 1598 LoC, extracted `shared/named-colors.ts`)
229
+ - **RFC G** — Tests extended 32 → 72 (4 suites: shared-helpers, named-colors, retheme-gradient, retheme-tokens)
230
+ - **RFC H** — GitHub Actions CI (typecheck + tests on push, smoke extract nightly)
231
+
232
+ ### Added — Bug fixes
233
+ - `scripts/shared/logger.ts` — Logger with levels + JSON mode for CI
234
+ - Silent catches now conditional on `CLONE_LOG_LEVEL=debug`
235
+ - `scripts/shared/types.ts` — unified `SiteTokens`, `ComponentSnapshot`, `BlockNode`, etc.
236
+ - `scripts/shared/css-helpers.ts` — unified `parseRgb`, `luminance` (WCAG), `NOISE_VALUES`
237
+ - `scripts/types.d.ts` — ambient declaration for `pixelmatch`
238
+ - 10+ `any` types in bank-inject replaced by proper types
239
+ - TypeScript strict — 0 errors on `tsc --noEmit`
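For readers unfamiliar with the `parseRgb` and `luminance` (WCAG) helpers named above, here is a minimal sketch of what such helpers usually look like. The signatures are assumptions, not the contents of `scripts/shared/css-helpers.ts`; the formula is the WCAG 2.x relative-luminance definition, and `contrastRatio` is an extra illustrative function.

```typescript
type Rgb = [number, number, number];

// Parse "rgb(r, g, b)" or "rgba(r, g, b, a)" as returned by getComputedStyle().
// Returns null for anything else (hex, named colours, etc.).
function parseRgb(input: string): Rgb | null {
  const m = /^rgba?\(\s*(\d+(?:\.\d+)?)[,\s]+(\d+(?:\.\d+)?)[,\s]+(\d+(?:\.\d+)?)/.exec(input.trim());
  if (!m) return null;
  return [Number(m[1]), Number(m[2]), Number(m[3])];
}

// Linearise one 0-255 sRGB channel, using the threshold from the WCAG 2.x text.
function linearize(channel: number): number {
  const c = channel / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// WCAG relative luminance: 0 for black, 1 for white.
function luminance([r, g, b]: Rgb): number {
  return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b);
}

// WCAG contrast ratio between two colours, from 1 (identical) to 21 (black on white).
function contrastRatio(a: Rgb, b: Rgb): number {
  const la = luminance(a);
  const lb = luminance(b);
  const lighter = Math.max(la, lb);
  const darker = Math.min(la, lb);
  return (lighter + 0.05) / (darker + 0.05);
}
```

A typical use would be classifying a background as light or dark before choosing a text token, for example `luminance(parseRgb("rgb(255, 255, 255)")!)`.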
240
+
241
+ ---
242
+
243
+ ## [Phase 4] — 2026-04-12
244
+
245
+ ### Added
246
+ - `scripts/asset-downloader.ts` — Local image/font download with SSRF guards, HTTP cache 24h
247
+ - `scripts/renderers.ts` — React + Tailwind renderers (JSX, utility classes, arbitrary values)
248
+ - `scripts/browser-stealth.ts` — playwright-extra + stealth plugin, Cloudflare/DataDome detection
249
+ - SVG path.d capture (previously only stroke/fill — now full path data)
250
+ - Fuzzy color matching in retheme (perceptual distance, polarity-aware)
251
+ - `bank-inject --format react|tailwind|tailwind-html` modes
252
+
253
+ ---
254
+
255
+ ## [Phase 3] — 2026-04-12
256
+
257
+ ### Added
258
+ - `scripts/bank-inject.ts` — HTML code generation from bank snapshots
259
+ - `scripts/retheme.ts` — TokenRemapper + 5 theme presets (dark-minimal, light-clean, warm-cream, neon-dark, ocean-light)
260
+ - `bank diff <domain1> <domain2>` command
261
+ - `bank inject <id> --retheme <domain>` cross-site token application
262
+
263
+ ---
264
+
265
+ ## [Phase 2] — 2026-04-12
266
+
267
+ ### Added
268
+ - `scripts/extract-block.ts` — targeted CSS block extraction by selector
269
+ - `scripts/bank-register.ts` — bank indexing with tags auto-detection
270
+ - `scripts/bank.ts` — CLI (register/query/show/stats)
271
+ - Enhanced popup dismissal — Didomi, Cookiebot, OneTrust, Axeptio, TarteAuCitron, Quantcast, Pierrot
272
+ - Block-tree.json format with recursive DOM + computed styles
273
+
274
+ ---
275
+
276
+ ## [Phase 1] — Initial
277
+
278
+ ### Added
279
+ - `scripts/extract.ts` — Playwright extraction with `getComputedStyle()` on key elements
280
+ - `scripts/tokenize.ts` — raw-css → normalized tokens.json
281
+ - `scripts/generate-design-md.ts` — 9-section narrative markdown (getdesign.md format)
282
+ - `scripts/analyze.ts` — layout detection
283
+ - `scripts/compare.ts` — visual diff via pixelmatch
284
+ - `scripts/clone.ts` — orchestrator (extract → analyze → tokenize → design-md)
285
+ - Multi-viewport support (desktop 1440px + mobile 390px)
286
+ - Screenshot full-page + scroll positions + section clips
287
+ - Component variants extraction (buttons, cards, headings, inputs, badges, links)
288
+ - Component states capture (`:hover`, `:focus`)
289
+ - CSS custom properties extraction
290
+ - OpenType features + variable font axes detection
291
+ - Font faces with URL + weight + style
292
+ - Media breakpoints detection
package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Paul Sainton
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,203 @@
1
+ # Prism
2
+
3
+ > Refract any website into its **real** design system — computed CSS, screenshots, and a narrative `DESIGN.md` — then **see exactly where the extraction is faithful and where it isn't.**
4
+
5
+ [![Tests](https://img.shields.io/badge/tests-108%20passing-success)](tests/)
6
+ [![TypeScript](https://img.shields.io/badge/TypeScript-strict-blue)](.)
7
+ [![Format](https://img.shields.io/badge/format-DESIGN.md%20(Google%20spec)-5b3df5)](https://github.com/google-labs-code/design.md)
8
+ [![License: MIT](https://img.shields.io/badge/license-MIT-green)](LICENSE)
9
+ [![GitHub stars](https://img.shields.io/github/stars/paulsainton/clone-architect?style=social)](https://github.com/paulsainton/clone-architect)
10
+
11
+ **🌐 Live catalog**: https://prism.ps-tools.dev/
12
+
13
+ Prism reads the **computed CSS** of any public website — not the raw stylesheets, but what the browser actually renders — and turns it into a structured `DESIGN.md` in the **[Google DESIGN.md format](https://github.com/google-labs-code/design.md)** that AI coding agents (Claude Code / Cursor / Bolt) consume natively, plus `tokens.json`, full screenshots, and a raw-CSS audit trail.
14
+
15
+ **What makes it different isn't that it produces a DESIGN.md** — that format is now an open Google standard and several tools generate it. It's that Prism is built to be **verifiable and honest**: every token is traced to a rendered CSS value, and it ships a *Known Gaps* section telling you what it could **not** reliably capture (CSS-in-JS runtimes, DRM fonts, Canvas/WebGL). No tool in this space tells you where it falls short. Prism does.
16
+
17
+ **Playwright + TypeScript. MIT, local-first, no API keys, no accounts, no `$39 per brand`.**
18
+
19
+ ---
20
+
21
+ ## ⚡ Prism vs getdesign.md
22
+
23
+ `getdesign.md` (VoltAgent, ~86k★) proved the category and owns the mindshare. Respect. But it's a **hand-curated, manually-written, frozen catalog**. Prism plays a different game: **automated, re-extractable, and verifiable.**
24
+
25
+ | | getdesign.md | **Prism** |
26
+ |---|---|---|
27
+ | Brands | 73 hand-curated | **77 pre-extracted · unlimited new** (any public URL) |
28
+ | Source of truth | Manually written descriptions | **`getComputedStyle()` — what the browser renders** |
29
+ | Proof shipped | None | **Desktop + mobile screenshots + `raw-css.json`** |
30
+ | Re-extract when a site changes | ❌ frozen file | ✅ **`prism extract` again → token diff** |
31
+ | Honesty about quality | — | **`Known Gaps` section + fidelity score per page** |
32
+ | Format | DESIGN.md | **DESIGN.md (Google spec) + tokens.json + raw-css.json** |
33
+ | Price | $39/brand for custom | **Free, MIT, end-to-end** |
34
+
35
+ > **We do not claim to clone more faithfully than getdesign.** On hand-written editorial nuance, a human still wins. Prism's edge is **automation + verifiability + honesty** — the axes a manual catalog structurally can't cover.
36
+
37
+ ---
38
+
39
+ ## Quick start
40
+
41
+ ```bash
42
+ npm install -g prism-design # the CLI binary is `prism`
43
+ npx playwright install chromium # one-time ~150MB download (needed to extract)
44
+ ```
45
+
46
+ > The bare name `prism` was already taken on npm, so the package ships as **`prism-design`**; the
47
+ > command it installs is `prism`. Browsing the pre-extracted catalog needs no Playwright.
48
+
49
+ ### Or from source
50
+
51
+ ```bash
52
+ git clone https://github.com/paulsainton/clone-architect.git prism
53
+ cd prism
54
+ npm install
55
+ npm link # exposes the `prism` command globally (optional)
56
+ npx playwright install chromium # one-time ~150MB download (needed to extract)
57
+ ```
58
+
59
+ ### Browse the catalog (no Playwright needed)
60
+
61
+ ```bash
62
+ prism list # Browse 77 pre-extracted brands
63
+ prism add linear.app # Copy Linear's DESIGN.md + tokens.json into ./
64
+ prism search dark # Find brands by keyword
65
+ ```
66
+
67
+ Then tell your agent: *"Use DESIGN.md as the design reference before writing any UI."*
68
+
69
+ ### Extract any URL
70
+
71
+ ```bash
72
+ prism extract https://yoursite.com
73
+ # or, without npm link:
74
+ npx tsx scripts/clone.ts https://yoursite.com
75
+ ```
76
+
77
+ Output lands in `extractions/<domain>/`:
78
+ - `DESIGN.md` — **24-section** narrative in the Google DESIGN.md format (Visual Theme, Colors, Typography, Components + state matrix, Layout, Motion, Depth, Do's/Don'ts, Responsive, Agent Prompt Guide, raw token export, **Known Gaps & Confidence**…)
79
+ - `tokens.json` — normalized design tokens (semantic palette from CSS custom properties)
80
+ - `raw-css.json` — full `getComputedStyle()` dump (ground truth, audit-friendly)
81
+ - `screenshots/` — desktop 1440px + mobile 390px
82
+ - `output/` + fidelity score — a rebuilt clone screenshotted and **pixel-matched against the original** (see *Measured fidelity*)
83
+
84
+ ---
85
+
86
+ ## Measured fidelity (the honest part)
87
+
88
+ Prism **rebuilds the page's real above-the-fold structure** from the extraction — real page
89
+ background, real header + nav, real hero (composition, heading text + measured font-size, CTA labels),
90
+ real colour banding — then scores it against the original screenshot with **two reproducible numbers**:
91
+
92
+ - **structure (SSIM)** — local layout/contrast match. The honest fidelity headline.
93
+ - **flat-area (pixelmatch)** — background/fill match. High, but background-dominated on its own, so it
94
+ can't tell a real clone from a palette re-skin. That's exactly why we report SSIM too.
95
+
96
+ | Site | Structure (SSIM) | Flat-area | Why |
97
+ |------|-----:|-----:|---|
98
+ | linear.app | 60 | 93 | CSS-in-JS SPA — captured from rendered styles |
99
+ | stripe.com | 67 | 77 | marketing, mostly static |
100
+ | airbnb.com | 52 | 78 | app-shell / search UI (atypical hero) |
101
+ | getdesign.md | 31 | 82 | dark, image-heavy hero |
102
+
103
+ The structural rebuild scores **+4 to +9 on structure** vs a palette-only re-skin of the same tokens
104
+ (`--mode=tokens`) — that delta is the layout it actually reconstructs, and the number SSIM is built to
105
+ see. SSIM is a strict floor: media we don't fetch is rendered as neutral placeholders, which caps it
106
+ honestly. Every DESIGN.md also ends with a `Known Gaps` section. Run it yourself:
107
+
108
+ ```bash
109
+ npx tsx scripts/rebuild.ts <domain> # structural reconstruction (default) — or --mode=tokens
110
+ npx tsx scripts/compare.ts <domain> # structure (SSIM) + flat-area (pixelmatch) vs original
111
+ npx tsx scripts/compare-vs-gd-final.ts # score the DESIGN.md catalog vs getdesign
112
+ ```
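To make the two scores more concrete, here is a simplified sketch of each idea on grayscale pixel arrays. It is an assumption-laden stand-in, not what `scripts/compare.ts` does: `mismatchRatio` is a naive per-pixel comparison rather than the pixelmatch algorithm, and `globalSsim` computes SSIM over the whole image as one window, whereas practical SSIM implementations average over many local windows. The function names and the `Gray` type are hypothetical.

```typescript
// Grayscale image as a flat list of 0-255 values (hypothetical representation).
type Gray = ArrayLike<number>;

function assertSameSize(a: Gray, b: Gray): void {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("images must have the same non-zero size");
  }
}

// Fraction of pixels whose absolute difference exceeds `tolerance` (0 = identical).
function mismatchRatio(a: Gray, b: Gray, tolerance = 0): number {
  assertSameSize(a, b);
  let differing = 0;
  for (let i = 0; i < a.length; i++) {
    if (Math.abs(a[i] - b[i]) > tolerance) differing++;
  }
  return differing / a.length;
}

// Single-window SSIM: compares means, variances and covariance of the two images.
function globalSsim(a: Gray, b: Gray): number {
  assertSameSize(a, b);
  const n = a.length;
  let sumA = 0;
  let sumB = 0;
  for (let i = 0; i < n; i++) {
    sumA += a[i];
    sumB += b[i];
  }
  const muA = sumA / n;
  const muB = sumB / n;
  let varA = 0;
  let varB = 0;
  let cov = 0;
  for (let i = 0; i < n; i++) {
    const da = a[i] - muA;
    const db = b[i] - muB;
    varA += da * da;
    varB += db * db;
    cov += da * db;
  }
  varA /= n;
  varB /= n;
  cov /= n;
  // Standard SSIM stabilisers for 8-bit data: C1 = (0.01 * 255)^2, C2 = (0.03 * 255)^2.
  const k1 = 0.01 * 255;
  const k2 = 0.03 * 255;
  const c1 = k1 * k1;
  const c2 = k2 * k2;
  return ((2 * muA * muB + c1) * (2 * cov + c2)) /
    ((muA * muA + muB * muB + c1) * (varA + varB + c2));
}
```

The sketch shows why the two numbers can disagree: a rebuild with the right background but the wrong layout can keep the mismatch ratio low while the covariance term pulls SSIM down.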
113
+
114
+ ---
115
+
116
+ ## What it actually captures
117
+
118
+ For any URL, via `getComputedStyle()` on every key element (body, header, nav, main, cards, buttons, inputs, headings…):
119
+
120
+ - **Computed CSS values** (the rendered truth, not the source stylesheet)
121
+ - **CSS custom properties** (700+ on Shopify-class sites, enabling faithful retheming)
122
+ - **@keyframes animations** with all frames, via a `document.styleSheets` walk
123
+ - **Pseudo-elements** (`::before`/`::after`), **z-index stacking**, **transform 3D matrices**
124
+ - **Grid layouts** (`grid-template-*`), **container queries**
125
+ - **Per-breakpoint snapshots** at 360 / 768 / 1440px
126
+ - **Font faces** + OpenType features + variable axes
127
+ - **Screenshots** full-page, desktop + mobile
128
+
129
+ Anti-bot stealth for Cloudflare / DataDome:
130
+ ```bash
131
+ npx tsx scripts/extract-block.ts https://protected-site.com "header" --stealth
132
+ ```
133
+
134
+ ### Generation
135
+ - **HTML** self-contained (inline CSS, assets downloaded locally)
136
+ - **React JSX** (typed props) and **Tailwind** (utility + arbitrary values)
137
+ - **Token retheming** — apply one site's tokens onto another's structure (5 presets)
138
+
139
+ ---
140
+
141
+ ## Architecture
142
+
143
+ ```
144
+ scripts/
145
+ ├── shared/ # Types, CSS helpers, named colors, logger
146
+ ├── extractors/ # keyframes, pseudo, grid, container queries…
147
+ ├── extract.ts # Main extraction pipeline (Playwright)
148
+ ├── tokenize.ts # raw-css.json → normalized tokens.json
149
+ ├── generate-design-md.ts # Narrative DESIGN.md (Google spec, 24 sections)
150
+ ├── compare.ts # Visual diff (pixelmatch) — rebuilt clone vs original
151
+ ├── compare-vs-gd-final.ts# Catalog scoring vs getdesign.md
152
+ ├── bank.ts # Component bank CLI (register/query/stats/show)
153
+ ├── bank-inject.ts # HTML/React/Tailwind code generation
154
+ ├── generate-site.ts # Public catalog site generator
155
+ └── clone.ts # Top-level orchestrator (extract → tokenize → DESIGN.md → compare)
156
+ ```
157
+
158
+ **Stats**: ~10K LoC TypeScript, strict mode, **108 tests** (`npm test`), `tsc --noEmit` clean.
159
+
160
+ ---
161
+
162
+ ## Commands
163
+
164
+ ```bash
165
+ # Pipeline
166
+ npx tsx scripts/clone.ts <URL> # Full pipeline (extract → tokenize → DESIGN.md → fidelity)
167
+ npx tsx scripts/extract.ts <URL> # Extraction only
168
+ npx tsx scripts/tokenize.ts <domain> # raw-css → tokens
169
+ npx tsx scripts/generate-design-md.ts <domain> # Narrative DESIGN.md
170
+ npx tsx scripts/compare.ts <domain> # Fidelity: rebuilt clone vs original (SSIM + pixelmatch)
171
+
172
+ # Catalog / comparison
173
+ npx tsx scripts/compare-vs-gd-final.ts # Score catalog vs getdesign.md
174
+ npx tsx scripts/bank.ts diff <domain1> <domain2> # Token comparison between two sites
175
+
176
+ # Tests
177
+ npm test # 108 tests
178
+ npx tsc --noEmit # Type check
179
+ ```
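As a rough illustration of what a token comparison such as `bank diff` might report, the sketch below diffs two flat token maps. Everything here is an assumption: the real `tokens.json` shape is not shown in this README, and `FlatTokens`, `TokenDiff` and `diffTokens` are hypothetical names.

```typescript
// Assumed shape: token name -> value, e.g. { "color.accent": "#5b3df5" }.
type FlatTokens = Record<string, string>;

interface TokenDiff {
  added: string[];   // keys only present in `after`
  removed: string[]; // keys only present in `before`
  changed: { key: string; from: string; to: string }[];
}

// Compare two token maps and list added, removed and changed keys in sorted order.
function diffTokens(before: FlatTokens, after: FlatTokens): TokenDiff {
  const beforeKeys = new Set(Object.keys(before));
  const afterKeys = new Set(Object.keys(after));
  const diff: TokenDiff = { added: [], removed: [], changed: [] };

  for (const key of Object.keys(after).sort()) {
    if (!beforeKeys.has(key)) {
      diff.added.push(key);
    } else if (before[key] !== after[key]) {
      diff.changed.push({ key, from: before[key], to: after[key] });
    }
  }
  for (const key of Object.keys(before).sort()) {
    if (!afterKeys.has(key)) diff.removed.push(key);
  }
  return diff;
}
```

The same function would cover both use cases mentioned in this README: comparing two sites, or comparing two extractions of the same site taken at different times.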
180
+
181
+ ---
182
+
183
+ ## Use cases
184
+
185
+ **Where Prism shines:** design-system audits (extract N competitors, diff tokens), token migration from legacy sites to Tailwind, weekly UI monitoring (re-extract → diff), prototyping with a real reference's design language, teaching how pro sites structure CSS.
186
+
187
+ **Where it doesn't fit:** pixel-cloning CSS-in-JS SPAs (Linear, Notion), animation-heavy hero sites (Framer, 3D), anything behind an auth wall, content/copy extraction (we extract structure, not text).
188
+
189
+ ---
190
+
191
+ ## Legal
192
+
193
+ Prism extracts publicly accessible CSS (no content, no proprietary code) — consistent with browser dev-tools usage. Respect `robots.txt`. Don't clone-and-deploy competitor sites. Commercially-licensed fonts (Typekit, Monotype) are detected but **not** downloaded — use your own licensed copies.
194
+
195
+ ## Credits
196
+
197
+ Built by [Paul Sainton](https://github.com/paulsainton). The DESIGN.md format is a [Google open standard](https://github.com/google-labs-code/design.md) (Apache 2.0). Inspired by [getdesign.md](https://getdesign.md) (VoltAgent) and the Atareh "Clone Any Website with Claude Code" guide.
198
+
199
+ Stack: [Playwright](https://playwright.dev) · [playwright-extra](https://github.com/berstend/puppeteer-extra) · [pixelmatch](https://github.com/mapbox/pixelmatch) · [pngjs](https://github.com/pngjs/pngjs) · [tsx](https://github.com/esbuild-kit/tsx).
200
+
201
+ ## License
202
+
203
+ MIT © Paul Sainton