designlang 9.0.0 → 10.0.0

@@ -0,0 +1,11 @@
+ {
+   "version": "0.0.1",
+   "configurations": [
+     {
+       "name": "designlang-site",
+       "runtimeExecutable": "npm",
+       "runtimeArgs": ["--prefix", "website", "run", "dev"],
+       "port": 3000
+     }
+   ]
+ }
package/CHANGELOG.md CHANGED
@@ -1,5 +1,47 @@
1
1
  # Changelog
2
2
 
3
+ ## [10.0.0] — 2026-04-22
4
+
5
+ **The Intent Release.** v9 captured *how* a site looks; v10 captures *what it is* — the semantic layer LLM agents need to rebuild a site faithfully, not just restyle a generic scaffold. Six new extractors, a multi-page crawl orchestrator, an optional smart-classifier LLM fallback, and a ready-to-paste prompt pack. 297/297 tests passing.
6
+
7
+ ### Added — extraction
8
+
9
+ - **Page Intent classifier** (`src/extractors/page-intent.js`) — labels the crawled URL as `landing` / `pricing` / `docs` / `blog` / `blog-post` / `product` / `about` / `dashboard` / `auth` / `legal`, with URL + title + meta + DOM-shape signals, a confidence score, and ranked alternates.
10
+ - **Section Roles** (`src/extractors/section-roles.js`) — annotates every semantic region with a role (`hero`, `feature-grid`, `logo-wall`, `stats`, `testimonial`, `pricing-table`, `faq`, `steps`, `comparison`, `gallery`, `bento`, `cta`, `footer`), extracts slot copy (headings, lede, CTA counts), and emits reading order.
11
+ - **Material Language** (`src/extractors/material-language.js`) — classifies the visual vocabulary (`glassmorphism` / `neumorphism` / `flat` / `brutalist` / `skeuomorphic` / `material-you` / `soft-ui` / `mixed`) from shadow complexity, backdrop-filter usage, saturation, and geometry.
12
+ - **Imagery Style** (`src/extractors/imagery-style.js`) — fingerprints the imagery (`photography` / `3d-render` / `isometric` / `flat-illustration` / `gradient-mesh` / `icon-only` / `screenshot` / `mixed`), plus dominant aspect ratio and image-radius profile.
13
+ - **Component Library detector** (`src/extractors/component-library.js`) — identifies shadcn/ui, Radix, Headless UI, MUI, Chakra, Mantine, Ant Design, Bootstrap, HeroUI/NextUI, Tailwind UI, Vuetify, or plain Tailwind, with evidence and alternates.
14
+ - **Logo extractor** (`src/extractors/logo.js`) — pulls the site's logo (SVG source or `<img>` bytes) and samples clearspace; writes `*-logo.svg` or `.png` plus `*-logo.json`.
15
+
16
+ ### Added — orchestration
17
+
18
+ - **Multi-page crawl** (`src/multipage.js`) — `--full` or `--pages <n>` auto-discovers canonical pages from nav (pricing/docs/blog/about/product), runs the full extractor pipeline on each, and emits a cross-page consistency report with shared tokens, per-page uniques, and pairwise Jaccard scores.
19
+ - **Smart classifier fallback** (`src/classifiers/smart.js`) — opt-in `--smart` flag routes low-confidence classifications through the OpenAI or Anthropic API (via `OPENAI_API_KEY` / `ANTHROPIC_API_KEY`). Gracefully no-ops when no key is set. Zero-dep — uses global `fetch`.
20
+
21
+ ### Added — LLM-native outputs
22
+
23
+ - **Prompt pack** (`src/formatters/prompt-pack.js`) — writes a `*-prompts/` directory with `v0.txt`, `lovable.txt`, `cursor.md`, `claude-artifacts.md`, and atomic `recipe-<component>.md` cards. Tokens, section order, voice, and library guidance are all inlined so one paste is enough.
24
+ - **Markdown sections** (`src/formatters/markdown.js`) — adds Page Intent, Section Roles, Material Language, Imagery Style, Component Library, and (when `--full`) Multi-Page Map sections to `*-design-language.md`.
25
+
26
+ ### Added — output files
27
+
28
+ - `*-intent.json` — page-type + section-role map
29
+ - `*-visual-dna.json` — material language + imagery style
30
+ - `*-library.json` — component library detection + evidence
31
+ - `*-logo.svg` | `*-logo.png` + `*-logo.json` (with `--full`)
32
+ - `*-multipage.json` — per-page design languages + consistency (with `--full` / `--pages`)
33
+ - `*-prompts/` — prompt pack directory
34
+
35
+ ### New CLI flags
36
+
37
+ - `--smart` — enable optional LLM refinement for low-confidence classifiers
38
+ - `--pages <n>` — explicitly crawl N canonical pages
39
+ - `--no-prompts` — skip the prompt-pack directory
40
+
41
+ ### Tests
42
+
43
+ - `tests/v10-features.test.js` — 15 new subtests covering page intent, section roles, component library, material language, imagery style, multi-page discovery, cross-page consistency, and prompt pack. Full suite: 297 passing.
44
+
3
45
  ## [9.0.0] — 2026-04-21
4
46
 
5
47
  **The Motion & Voice release.** Six new capabilities that push designlang past "extract the paint" and into "extract the *feel*, the *anatomy*, and the *voice*." No competing tool does any of these. All work ships with tests (282/282 passing).
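The 10.0.0 entry above mentions pairwise Jaccard scores in the cross-page consistency report. As a rough illustration of that metric (not the code in `src/multipage.js`, which this diff does not show), the sketch below assumes each page's tokens have been flattened to an array of strings. `jaccard`, `pairwiseJaccard`, and the sample pages are hypothetical names and data.

```javascript
// Jaccard similarity: |A ∩ B| / |A ∪ B|. Two empty sets count as identical.
function jaccard(a, b) {
  const setA = new Set(a);
  const setB = new Set(b);
  if (setA.size === 0 && setB.size === 0) return 1;
  let shared = 0;
  for (const item of setA) if (setB.has(item)) shared++;
  return shared / (setA.size + setB.size - shared);
}

// Score every unordered pair of pages exactly once.
function pairwiseJaccard(pages) {
  const names = Object.keys(pages);
  const scores = [];
  for (let i = 0; i < names.length; i++) {
    for (let j = i + 1; j < names.length; j++) {
      scores.push({ a: names[i], b: names[j], score: jaccard(pages[names[i]], pages[names[j]]) });
    }
  }
  return scores;
}

// Hypothetical token sets for three pages (illustrative data only).
const pages = {
  home: ['#111827', '#2563eb', 'Inter', '8px'],
  pricing: ['#111827', '#2563eb', 'Inter', '12px'],
  docs: ['#111827', 'Inter', 'JetBrains Mono'],
};
const scores = pairwiseJaccard(pages);
console.log(scores);
```

A score near 1 means two pages draw on almost the same tokens; a low score flags a page that drifts from the rest of the site.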
package/README.md CHANGED
@@ -17,10 +17,24 @@
17
17
 
18
18
  [![designlang on npm](https://pkgfolio.vercel.app/embed/pkg/designlang?v=2)](https://www.npmjs.com/package/designlang)
19
19
 
20
- **designlang** crawls any website with a headless browser, extracts every computed style from the live DOM, and generates **11+ output files** — including an AI-optimized markdown file, visual HTML preview, Tailwind config, React theme, shadcn/ui theme, Figma variables, W3C design tokens, CSS custom properties, **motion tokens**, **typed component anatomy stubs**, and a **brand voice** summary.
20
+ **designlang** crawls any website with a headless browser, extracts every computed style from the live DOM, and generates **17+ output files** — including an AI-optimized markdown file, visual HTML preview, Tailwind config, React theme, shadcn/ui theme, Figma variables, W3C design tokens, CSS custom properties, motion tokens, typed component anatomy stubs, a brand voice summary, **page intent + section roles**, **visual DNA** (material language + imagery style), **component library detection**, a **logo file**, a **multi-page consistency report**, and a **prompt pack** of ready-to-paste prompts for v0, Lovable, Cursor, and Claude Artifacts.
21
21
 
22
22
  But unlike every other tool out there, it also extracts **layout patterns** (grids, flexbox, containers), **motion language** (durations, easings, springs, scroll-linked animations), **component anatomy** (slots, variant × size × state matrices), **brand voice** (tone, CTA verbs, heading style), captures **responsive behavior** across 4 breakpoints, records **interaction states** (hover, focus, active), scores **WCAG accessibility**, lints your own token files, and lets you **drift-check a codebase against a live site**, **visual-diff two URLs**, **compare multiple brands**, or **sync live sites to local tokens**.
23
23
 
24
+ ## What's New in v10 — The Intent Release
25
+
26
+ Everything else captures *how* a site looks. v10 captures *what it is* — the semantic signal an LLM needs to rebuild a site faithfully instead of restyling a generic scaffold.
27
+
28
+ - **Page Intent** — classifier labels the URL as `landing` / `pricing` / `docs` / `blog` / `blog-post` / `product` / `about` / `dashboard` / `auth` / `legal`, with a confidence score and rival alternates. URL + title + meta + DOM-shape signals. Heuristic-only by default; opt into `--smart` for LLM refinement.
29
+ - **Section Roles** — every semantic region gets a role (`hero`, `feature-grid`, `logo-wall`, `stats`, `testimonial`, `pricing-table`, `faq`, `steps`, `comparison`, `gallery`, `bento`, `cta`, `footer`), plus reading order and extracted slot copy (headings, lede, CTA counts).
30
+ - **Multi-Page Crawl** — `--full` (or `--pages <n>`) auto-discovers the site's own canonical pages from its nav (pricing/docs/blog/about/product) and runs the full pipeline on each, then emits a cross-page consistency report — shared tokens, per-page uniques, and pairwise Jaccard scores. LLMs get a real design language, not just a homepage snapshot.
31
+ - **Material Language** — classifies the visual vocabulary as `glassmorphism` / `neumorphism` / `flat` / `brutalist` / `skeuomorphic` / `material-you` / `soft-ui` / `mixed` from shadow complexity, backdrop-filter usage, saturation, and geometry.
32
+ - **Imagery Style** — fingerprints the images: `photography` / `3d-render` / `isometric` / `flat-illustration` / `gradient-mesh` / `icon-only` / `screenshot` / `mixed`, plus dominant aspect ratio and image-radius profile.
33
+ - **Component Library Detection** — identifies `shadcn/ui`, `radix-ui`, `headlessui`, `mui`, `chakra-ui`, `mantine`, `ant-design`, `bootstrap`, `heroui`, `tailwind-ui`, `vuetify`, or plain `tailwindcss`, with evidence and alternates.
34
+ - **Logo Extraction** — `--full` writes `*-logo.svg` (or `.png`) plus `*-logo.json` with dimensions, aspect, and sampled clearspace.
35
+ - **Prompt Pack** — a `*-prompts/` directory with `v0.txt`, `lovable.txt`, `cursor.md`, `claude-artifacts.md`, and atomic `recipe-<component>.md` cards — tokens, section order, voice, and library inlined so one paste is enough.
36
+ - **`--smart` mode** — when a heuristic classifier returns low confidence, fall back to a small LLM call (uses `OPENAI_API_KEY` or `ANTHROPIC_API_KEY` from env). Completely optional — no key, no behavior change.
37
+
24
38
  ## What's New in v9 — The Motion & Voice Release
25
39
 
26
40
  - **Motion Language** — durations bucketed into semantic tokens (`instant`/`xs`/`sm`/`md`/`lg`/`xl`), easings classified into families (ease-out, spring-overshoot, steps), scroll-linked animation detection (`animation-timeline`, `view-timeline-name`), keyframe kind classification (slide / fade / reveal / rotate / scale / pulse), and a `feel` fingerprint — *springy*, *responsive*, *smooth*, *mechanical*, or *mixed*.
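The `--smart` bullet in the v10 list says the LLM is consulted only when a heuristic classifier returns low confidence. A minimal sketch of that gating follows, mirroring the conditions visible elsewhere in this diff (a `needsSmart` flag for page intent and component library, and a 0.55 confidence cut-off for material language). `selectSmartTasks` is a hypothetical helper name, and the sample result objects are assumed shapes, not output from the package.

```javascript
// Decide which classifier results would be sent for LLM refinement.
// Assumption: results carry a boolean `needsSmart` and/or a numeric `confidence`.
const LOW_CONFIDENCE = 0.55;

function selectSmartTasks({ pageIntent, materialLanguage, componentLibrary }) {
  const tasks = [];
  if (pageIntent?.needsSmart) tasks.push('pageIntent');
  if (materialLanguage && (materialLanguage.confidence || 0) < LOW_CONFIDENCE) tasks.push('materialLanguage');
  if (componentLibrary?.needsSmart) tasks.push('componentLibrary');
  return tasks;
}

// Illustrative inputs: a confident page intent, a shaky material label,
// and a component library the heuristics could not identify.
const tasks = selectSmartTasks({
  pageIntent: { type: 'pricing', confidence: 0.9, needsSmart: false },
  materialLanguage: { label: 'flat', confidence: 0.4 },
  componentLibrary: { library: 'unknown', confidence: 0, needsSmart: true },
});
console.log(tasks); // [ 'materialLanguage', 'componentLibrary' ]
```

With every classifier confident, the list is empty and no network call would be needed.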
@@ -6,6 +6,10 @@ import { resolve, join } from 'path';
  import chalk from 'chalk';
  import ora from 'ora';
  import { extractDesignLanguage } from '../src/index.js';
+ import { refineWithSmart } from '../src/classifiers/smart.js';
+ import { crawlCanonicalPages } from '../src/multipage.js';
+ import { extractLogo } from '../src/extractors/logo.js';
+ import { buildPromptPack } from '../src/formatters/prompt-pack.js';
  import { formatMarkdown } from '../src/formatters/markdown.js';
  import { formatTokens } from '../src/formatters/tokens.js';
  import { formatDtcgTokens } from '../src/formatters/dtcg-tokens.js';
@@ -48,7 +52,7 @@ const program = new Command();
  program
  .name('designlang')
  .description('Extract the complete design language from any website')
- .version('9.0.0');
+ .version('10.0.0');
  
  // ── Main command: extract ──────────────────────────────────────
  program
@@ -77,6 +81,9 @@ program
  .option('--tokens-legacy', 'Emit pre-v7 flat token JSON (backward compat)')
  .option('--platforms <csv>', 'Additional platforms: web,ios,android,flutter,wordpress,all (web is always emitted)', 'web')
  .option('--emit-agent-rules', 'Emit Cursor/Claude Code/generic agent rules')
+ .option('--smart', 'use optional LLM fallback when heuristic classifiers have low confidence (needs OPENAI_API_KEY or ANTHROPIC_API_KEY)')
+ .option('--pages <n>', 'crawl N canonical pages (pricing/docs/blog/about/product) in addition to the homepage', parseInt)
+ .option('--no-prompts', 'skip writing the prompt-pack directory')
  .option('--json', 'output raw JSON to stdout (for CI/CD)')
  .option('--json-pretty', 'output formatted JSON to stdout')
  .option('--no-history', 'skip saving to history')
@@ -173,6 +180,65 @@ program
173
180
  design.interactions = await captureInteractions(url, { width: merged.width, height: parseInt(merged.height) || 800, wait: merged.wait });
174
181
  }
175
182
 
183
+ // v10: optional LLM refinement for low-confidence classifiers.
184
+ if (merged.smart) {
185
+ spinner.text = 'Refining classifiers with smart mode...';
186
+ try {
187
+ const refined = await refineWithSmart({
188
+ enabled: true,
189
+ rawData: design._raw,
190
+ design,
191
+ pageIntent: design.pageIntent,
192
+ sectionRoles: design.sectionRoles,
193
+ materialLanguage: design.materialLanguage,
194
+ componentLibrary: design.componentLibrary,
195
+ });
196
+ if (refined.applied) {
197
+ if (refined.updates?.pageIntent) design.pageIntent = { ...design.pageIntent, ...refined.updates.pageIntent };
198
+ if (refined.updates?.materialLanguage) design.materialLanguage = { ...design.materialLanguage, ...refined.updates.materialLanguage };
199
+ if (refined.updates?.componentLibrary) design.componentLibrary = { ...design.componentLibrary, ...refined.updates.componentLibrary };
200
+ design._smart = { provider: refined.provider, errors: refined.errors };
201
+ } else {
202
+ design._smart = { skipped: refined.reason };
203
+ }
204
+ } catch (e) { design._smart = { error: e.message }; }
205
+ }
206
+
207
+ // v10: logo extraction via a fresh Playwright session.
208
+ if (merged.full || merged.screenshots) {
209
+ spinner.text = 'Extracting logo...';
210
+ try {
211
+ const { chromium } = await import('playwright');
212
+ const browser = await chromium.launch({ headless: true, ...(merged.systemChrome && { channel: 'chrome' }) });
213
+ const ctx = await browser.newContext({ viewport: { width: merged.width, height: parseInt(merged.height) || 800 } });
214
+ const lp = await ctx.newPage();
215
+ await lp.goto(url, { waitUntil: 'domcontentloaded', timeout: 20000 }).catch(() => {});
216
+ await lp.waitForLoadState('networkidle').catch(() => {});
217
+ mkdirSync(outDir, { recursive: true });
218
+ design.logo = await extractLogo(lp, outDir, prefix);
219
+ await browser.close();
220
+ } catch (e) { design.logo = { found: false, error: e.message }; }
221
+ }
222
+
223
+ // v10: multi-page canonical crawl (pricing/docs/blog/about/product).
224
+ const pagesArg = merged.pages != null ? merged.pages : (merged.full ? 5 : 0);
225
+ if (pagesArg > 0) {
226
+ spinner.text = `Crawling ${pagesArg} canonical pages...`;
227
+ try {
228
+ const mp = await crawlCanonicalPages({
229
+ homepageUrl: url,
230
+ homepageRawData: design._raw,
231
+ maxPages: pagesArg,
232
+ crawlerOptions: { width: merged.width, height: parseInt(merged.height) || 800 },
233
+ extract: (u, o) => extractDesignLanguage(u, o),
234
+ });
235
+ design.multiPage = mp;
236
+ } catch (e) { design.multiPage = { error: e.message }; }
237
+ }
238
+
239
+ // Drop the internal raw stash before JSON/output serialization.
240
+ delete design._raw;
241
+
176
242
  // JSON mode: output and exit
177
243
  if (jsonMode) {
178
244
  const output = opts.jsonPretty ? JSON.stringify(design, null, 2) : JSON.stringify(design);
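The hunk above resolves the number of canonical pages to crawl from `--pages` and `--full`. The sketch below isolates that expression so its edge cases are easy to see. `resolvePageCount` is a hypothetical wrapper, and the `NaN` case assumes a non-numeric `--pages` value reaches `parseInt` unchanged.

```javascript
// Mirrors the diff: merged.pages != null ? merged.pages : (merged.full ? 5 : 0)
function resolvePageCount({ pages, full }) {
  return pages != null ? pages : (full ? 5 : 0);
}

const cases = [
  resolvePageCount({}),                             // no flags
  resolvePageCount({ full: true }),                 // --full only
  resolvePageCount({ pages: 3, full: true }),       // --pages takes precedence over --full
  resolvePageCount({ pages: 0, full: true }),       // an explicit 0 disables the crawl
  resolvePageCount({ pages: parseInt('abc', 10) }), // non-numeric input yields NaN
];
console.log(cases);

// The crawl is guarded by `pagesArg > 0`, so NaN skips it without an error.
const wouldCrawl = cases.map((n) => n > 0);
console.log(wouldCrawl);
```

A stricter CLI could reject a `NaN` count with an explicit message instead of skipping the crawl silently.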
@@ -230,6 +296,30 @@ program
230
296
  }
231
297
  files.push({ name: `${prefix}-voice.json`, content: JSON.stringify(design.voice || {}, null, 2), label: 'Brand Voice' });
232
298
 
299
+ // v10: page intent + section roles + visual DNA + component library + multi-page + prompt pack.
300
+ files.push({ name: `${prefix}-intent.json`, content: JSON.stringify({ pageIntent: design.pageIntent, sectionRoles: design.sectionRoles }, null, 2), label: 'Page Intent + Section Roles' });
301
+ files.push({ name: `${prefix}-visual-dna.json`, content: JSON.stringify({ materialLanguage: design.materialLanguage, imageryStyle: design.imageryStyle }, null, 2), label: 'Visual DNA' });
302
+ files.push({ name: `${prefix}-library.json`, content: JSON.stringify(design.componentLibrary || {}, null, 2), label: 'Component Library Detection' });
303
+ if (design.logo && design.logo.found) {
304
+ files.push({ name: `${prefix}-logo.json`, content: JSON.stringify(design.logo, null, 2), label: 'Logo Metadata' });
305
+ }
306
+ if (design.multiPage) {
307
+ files.push({ name: `${prefix}-multipage.json`, content: JSON.stringify(design.multiPage, null, 2), label: 'Multi-Page Crawl' });
308
+ }
309
+ if (merged.prompts !== false) {
310
+ const pack = buildPromptPack(design);
311
+ const promptsDir = join(outDir, `${prefix}-prompts`);
312
+ mkdirSync(promptsDir, { recursive: true });
313
+ writeFileSync(join(promptsDir, 'v0.txt'), pack['v0.txt'], 'utf-8');
314
+ writeFileSync(join(promptsDir, 'lovable.txt'), pack['lovable.txt'], 'utf-8');
315
+ writeFileSync(join(promptsDir, 'cursor.md'), pack['cursor.md'], 'utf-8');
316
+ writeFileSync(join(promptsDir, 'claude-artifacts.md'), pack['claude-artifacts.md'], 'utf-8');
317
+ for (const r of pack.recipes) {
318
+ const slug = r.name.replace(/[^a-z0-9]+/gi, '-').toLowerCase() || 'component';
319
+ writeFileSync(join(promptsDir, `recipe-${slug}.md`), r.content, 'utf-8');
320
+ }
321
+ }
322
+
233
323
  for (const file of files) {
234
324
  writeFileSync(join(outDir, file.name), file.content, 'utf-8');
235
325
  }
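The recipe loop above derives each file name from the component name. The sketch below reproduces that expression and, as a hypothetical variant (`tidySlug`, which is not in the package), trims leading and trailing dashes so the `'component'` fallback also covers names made up entirely of punctuation.

```javascript
// Same expression as the diff: collapse non-alphanumeric runs to '-', lowercase,
// and fall back to 'component' only when the result is an empty string.
function recipeSlug(name) {
  return name.replace(/[^a-z0-9]+/gi, '-').toLowerCase() || 'component';
}

// Hypothetical variant: also strip dashes at either end before the fallback.
function tidySlug(name) {
  return name.replace(/[^a-z0-9]+/gi, '-').replace(/^-+|-+$/g, '').toLowerCase() || 'component';
}

for (const name of ['Pricing Card', 'Button (Primary)', '!!!', '']) {
  console.log(JSON.stringify(name), recipeSlug(name), tidySlug(name));
}
```

With the original expression, `'Button (Primary)'` becomes `button-primary-` and `'!!!'` becomes `-`, so the fallback fires only for an empty name.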
package/package.json CHANGED
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "name": "designlang",
3
- "version": "9.0.0",
4
- "description": "Extract the complete design language from any website — colors, typography, spacing, shadows, motion, component anatomy, and brand voice. Outputs AI-optimized markdown, W3C design tokens, motion tokens, typed component stubs, Tailwind config, and more.",
3
+ "version": "10.0.0",
4
+ "description": "Extract the complete design language from any website — colors, typography, spacing, shadows, motion, component anatomy, brand voice, page intent, section roles, material language, component library, imagery style, and logo. Outputs AI-optimized markdown, W3C design tokens, motion tokens, typed component stubs, Tailwind config, and ready-to-paste v0 / Lovable / Cursor / Claude-Artifacts prompts.",
5
5
  "type": "module",
6
6
  "bin": {
7
7
  "designlang": "./bin/design-extract.js"
@@ -0,0 +1,130 @@
+ // Optional LLM fallback for low-confidence classifications. No SDK deps: we
+ // hit the OpenAI or Anthropic REST API directly via global fetch. Runs only
+ // when the user passes --smart AND an API key is available in env. Silently
+ // no-ops otherwise so the core extractor stays zero-config.
+ //
+ // Consumers call `refineWithSmart({ enabled, rawData, design, pageIntent,
+ // sectionRoles, materialLanguage, componentLibrary })`. The digest is built
+ // internally, and we only hit the network for fields flagged `needsSmart`
+ // (or, for materialLanguage, with confidence below 0.55).
9
+
10
+ const TASKS = {
11
+ pageIntent: {
12
+ system: 'You classify a web page into one of these types: landing, pricing, docs, blog, blog-post, product, about, dashboard, auth, legal, unknown. Return only a JSON object {"type":"...","confidence":0.xx,"why":"one sentence"}.',
13
+ },
14
+ materialLanguage: {
15
+ system: 'You classify a website\'s visual material language. Choose one of: glassmorphism, neumorphism, flat, brutalist, skeuomorphic, material-you, soft-ui, mixed. Return only a JSON object {"label":"...","confidence":0.xx,"why":"one sentence"}.',
16
+ },
17
+ componentLibrary: {
18
+ system: 'You identify which UI component library a website most likely uses. Choose from: shadcn/ui, radix-ui, headlessui, mui, chakra-ui, mantine, ant-design, bootstrap, heroui, tailwind-ui, vuetify, tailwindcss, unknown. Return only a JSON object {"library":"...","confidence":0.xx,"why":"one sentence"}.',
19
+ },
20
+ };
21
+
22
+ function detectProvider() {
23
+ if (process.env.ANTHROPIC_API_KEY) return 'anthropic';
24
+ if (process.env.OPENAI_API_KEY) return 'openai';
25
+ return null;
26
+ }
27
+
28
+ function buildDigest({ rawData, design, pageIntent }) {
+ // A compact digest: URL, title, path, meta tags, section roles, the first
+ // 1500 chars of section text, and a class-name sample. Keep under ~3k tokens.
31
+ const sections = (rawData?.light?.sections) || [];
32
+ const text = sections.map(s => s.text || '').join('\n').slice(0, 1500);
33
+ const metas = (rawData?.light?.stack?.metas || []).slice(0, 10)
34
+ .map(m => `${m.name || ''}: ${(m.content || '').slice(0, 120)}`).join('\n');
35
+ const classes = (rawData?.light?.stack?.classNameSample || []).slice(0, 60).join(' | ').slice(0, 1500);
36
+ return [
37
+ `URL: ${rawData?.url || ''}`,
38
+ `TITLE: ${rawData?.title || ''}`,
39
+ `PATH: ${pageIntent?.path || ''}`,
40
+ `METAS:\n${metas}`,
41
+ `SECTION ROLES: ${(design?.regions || []).map(r => r.role).join(',')}`,
42
+ `TEXT SAMPLE:\n${text}`,
43
+ `CLASS SAMPLE:\n${classes}`,
44
+ ].join('\n\n');
45
+ }
46
+
47
+ async function callAnthropic(system, user) {
48
+ const res = await fetch('https://api.anthropic.com/v1/messages', {
49
+ method: 'POST',
50
+ headers: {
51
+ 'x-api-key': process.env.ANTHROPIC_API_KEY,
52
+ 'anthropic-version': '2023-06-01',
53
+ 'content-type': 'application/json',
54
+ },
55
+ body: JSON.stringify({
56
+ model: process.env.DESIGNLANG_MODEL || 'claude-haiku-4-5-20251001',
57
+ max_tokens: 200,
58
+ system,
59
+ messages: [{ role: 'user', content: user }],
60
+ }),
61
+ });
62
+ if (!res.ok) throw new Error(`anthropic ${res.status}`);
63
+ const json = await res.json();
64
+ const text = (json.content || []).map(b => b.text || '').join('');
65
+ return text;
66
+ }
67
+
68
+ async function callOpenAI(system, user) {
69
+ const res = await fetch('https://api.openai.com/v1/chat/completions', {
70
+ method: 'POST',
71
+ headers: {
72
+ 'authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
73
+ 'content-type': 'application/json',
74
+ },
75
+ body: JSON.stringify({
76
+ model: process.env.DESIGNLANG_MODEL || 'gpt-4o-mini',
77
+ messages: [
78
+ { role: 'system', content: system },
79
+ { role: 'user', content: user },
80
+ ],
81
+ max_tokens: 200,
82
+ response_format: { type: 'json_object' },
83
+ }),
84
+ });
85
+ if (!res.ok) throw new Error(`openai ${res.status}`);
86
+ const json = await res.json();
87
+ return json.choices?.[0]?.message?.content || '';
88
+ }
89
+
90
+ async function callLLM(provider, system, user) {
91
+ if (provider === 'anthropic') return callAnthropic(system, user);
92
+ return callOpenAI(system, user);
93
+ }
94
+
95
+ function parseJsonLoose(text) {
96
+ try { return JSON.parse(text); } catch { /* try to find a JSON object */ }
97
+ const m = text.match(/\{[\s\S]*\}/);
98
+ if (m) { try { return JSON.parse(m[0]); } catch {} }
99
+ return null;
100
+ }
101
+
102
+ export async function refineWithSmart({ enabled, rawData, design, pageIntent, sectionRoles, materialLanguage, componentLibrary }) {
103
+ if (!enabled) return { applied: false, reason: 'disabled' };
104
+ const provider = detectProvider();
105
+ if (!provider) return { applied: false, reason: 'no API key (set OPENAI_API_KEY or ANTHROPIC_API_KEY)' };
106
+
107
+ const digest = buildDigest({ rawData, design, pageIntent });
108
+ const updates = {};
109
+ const errors = [];
110
+
111
+ const queue = [];
112
+ if (pageIntent?.needsSmart) queue.push(['pageIntent', pageIntent]);
113
+ if (materialLanguage && (materialLanguage.confidence || 0) < 0.55) queue.push(['materialLanguage', materialLanguage]);
114
+ if (componentLibrary?.needsSmart) queue.push(['componentLibrary', componentLibrary]);
115
+
116
+ for (const [task, current] of queue) {
117
+ const spec = TASKS[task];
118
+ if (!spec) continue;
119
+ const user = `Digest:\n${digest}\n\nCurrent heuristic result:\n${JSON.stringify(current)}\n\nRespond with the requested JSON.`;
120
+ try {
121
+ const raw = await callLLM(provider, spec.system, user);
122
+ const parsed = parseJsonLoose(raw);
123
+ if (parsed) updates[task] = { ...parsed, smart: true, provider };
124
+ } catch (e) {
125
+ errors.push(`${task}: ${e.message}`);
126
+ }
127
+ }
128
+
129
+ return { applied: true, provider, updates, errors };
130
+ }
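`parseJsonLoose` above first tries a strict parse, then falls back to the widest `{...}` span in the reply. The sketch below copies that function and runs it on a few hypothetical model replies to show what it accepts and where the greedy match gives up.

```javascript
// Copied from the diff: strict JSON.parse first, then the widest {...} span.
function parseJsonLoose(text) {
  try { return JSON.parse(text); } catch { /* try to find a JSON object */ }
  const m = text.match(/\{[\s\S]*\}/);
  if (m) { try { return JSON.parse(m[0]); } catch {} }
  return null;
}

// Hypothetical replies, not captured from a real API call.
const clean = parseJsonLoose('{"type":"pricing","confidence":0.82}');
const wrapped = parseJsonLoose('Sure! {"label":"flat","confidence":0.7} Hope that helps.');
const fenced = parseJsonLoose('```json\n{"library":"mui","confidence":0.9}\n```');
const twoObjects = parseJsonLoose('{"a":1} and {"b":2}');
const noJson = parseJsonLoose('I cannot classify this page.');
console.log(clean, wrapped, fenced, twoObjects, noJson);
```

Because the regex is greedy, a reply containing two separate objects matches from the first `{` to the last `}`, fails to parse, and yields `null`; the caller then leaves the heuristic result untouched.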
@@ -0,0 +1,193 @@
1
+ // Detect the component library a site is built on. Fingerprints: class-name
2
+ // patterns, data-* attributes, script URLs, window globals. Returns the most
3
+ // likely library with a confidence score and the evidence that supported it —
4
+ // LLM agents consume this to pick the right scaffolding (e.g. "use shadcn/ui",
5
+ // "use MUI v5") when rebuilding.
6
+
7
+ const LIB_FINGERPRINTS = [
8
+ {
9
+ id: 'shadcn/ui',
10
+ score: (ctx) => {
11
+ let s = 0; const evidence = [];
12
+ // shadcn copies Radix attributes and uses Tailwind utility density.
13
+ if (ctx.radixAttrCount > 2 && ctx.tailwindLike > 0.4) { s += 0.5; evidence.push('radix+tailwind mix'); }
14
+ if (/\bbutton-primary|\bbutton-destructive/.test(ctx.classBlob) && ctx.radixAttrCount > 0) { s += 0.15; evidence.push('shadcn button tokens'); }
15
+ // `class="... bg-background text-foreground ..."` is a shadcn tell.
16
+ if (/\bbg-background\b|\btext-foreground\b|\bborder-input\b|\bring-offset-background\b/.test(ctx.classBlob)) {
17
+ s += 0.65; evidence.push('shadcn css tokens');
18
+ }
19
+ return { score: s, evidence };
20
+ },
21
+ },
22
+ {
23
+ id: 'radix-ui',
24
+ score: (ctx) => {
25
+ const count = ctx.radixAttrCount;
26
+ if (count === 0) return { score: 0, evidence: [] };
27
+ const score = Math.min(0.9, 0.3 + count * 0.05);
28
+ return { score, evidence: [`${count} radix attributes`] };
29
+ },
30
+ },
31
+ {
32
+ id: 'headlessui',
33
+ score: (ctx) => {
34
+ const m = (ctx.classSample.join(' ').match(/headlessui-[a-z]+/gi) || []).length;
35
+ if (m < 2) return { score: 0, evidence: [] };
36
+ return { score: Math.min(0.9, 0.3 + m * 0.08), evidence: [`${m} headlessui- class refs`] };
37
+ },
38
+ },
39
+ {
40
+ id: 'mui',
41
+ score: (ctx) => {
42
+ const m = (ctx.classBlob.match(/Mui[A-Z][A-Za-z]+-root/g) || []).length;
43
+ if (m < 2) return { score: 0, evidence: [] };
44
+ return { score: Math.min(0.95, 0.4 + m * 0.04), evidence: [`${m} Mui*-root classes`] };
45
+ },
46
+ },
47
+ {
48
+ id: 'chakra-ui',
49
+ score: (ctx) => {
50
+ const m = (ctx.classBlob.match(/\bchakra-[a-z]+/g) || []).length;
51
+ if (m < 3) return { score: 0, evidence: [] };
52
+ return { score: Math.min(0.95, 0.4 + m * 0.03), evidence: [`${m} chakra- classes`] };
53
+ },
54
+ },
55
+ {
56
+ id: 'mantine',
57
+ score: (ctx) => {
58
+ const m = (ctx.classBlob.match(/mantine-[A-Za-z0-9]+/g) || []).length;
59
+ if (m < 2) return { score: 0, evidence: [] };
60
+ return { score: Math.min(0.95, 0.4 + m * 0.04), evidence: [`${m} mantine- classes`] };
61
+ },
62
+ },
63
+ {
64
+ id: 'ant-design',
65
+ score: (ctx) => {
66
+ const m = (ctx.classBlob.match(/\bant-[a-z]+(-[a-z]+)*/g) || []).length;
67
+ if (m < 3) return { score: 0, evidence: [] };
68
+ return { score: Math.min(0.95, 0.4 + m * 0.03), evidence: [`${m} ant- classes`] };
69
+ },
70
+ },
71
+ {
72
+ id: 'bootstrap',
73
+ score: (ctx) => {
74
+ const hits = ['container', 'row', 'col-md-', 'btn-primary', 'navbar-nav', 'card-body']
75
+ .filter(k => ctx.classBlob.includes(k)).length;
76
+ if (hits < 3) return { score: 0, evidence: [] };
77
+ return { score: Math.min(0.9, 0.3 + hits * 0.1), evidence: [`bootstrap utility hits: ${hits}`] };
78
+ },
79
+ },
80
+ {
81
+ id: 'heroui',
82
+ score: (ctx) => {
83
+ const m = (ctx.classBlob.match(/\bheroui-|\bnextui-/g) || []).length;
84
+ if (m < 2) return { score: 0, evidence: [] };
85
+ return { score: Math.min(0.95, 0.4 + m * 0.05), evidence: [`${m} heroui/nextui classes`] };
86
+ },
87
+ },
88
+ {
89
+ id: 'tailwind-ui',
90
+ score: (ctx) => {
91
+ // Tailwind UI is a starter/template, not a runtime — use density of
92
+ // Tailwind utilities + typical Tailwind UI patterns as a weak signal.
93
+ if (ctx.tailwindLike < 0.6) return { score: 0, evidence: [] };
94
+ const patterns = ['ring-offset-', 'focus:ring-', 'hover:bg-gray-', 'prose prose-'];
95
+ const hits = patterns.filter(p => ctx.classBlob.includes(p)).length;
96
+ if (hits < 2) return { score: 0, evidence: [] };
97
+ return { score: 0.3 + hits * 0.12, evidence: [`tailwind density=${ctx.tailwindLike.toFixed(2)}, pattern hits=${hits}`] };
98
+ },
99
+ },
100
+ {
101
+ id: 'vuetify',
102
+ score: (ctx) => {
103
+ const m = (ctx.classBlob.match(/\bv-[a-z]+(-[a-z]+)?/g) || []).length;
104
+ if (m < 5) return { score: 0, evidence: [] };
105
+ return { score: Math.min(0.9, 0.3 + m * 0.02), evidence: [`${m} v-* classes`] };
106
+ },
107
+ },
108
+ {
109
+ id: 'tailwindcss',
110
+ score: (ctx) => {
111
+ if (ctx.tailwindLike < 0.35) return { score: 0, evidence: [] };
112
+ // Tailwind itself isn't a component library but we report it as a signal
113
+ // when no higher-level library is detected.
114
+ return { score: 0.3 + (ctx.tailwindLike - 0.35) * 1.2, evidence: [`tailwind-like class density ${(ctx.tailwindLike * 100).toFixed(0)}%`] };
115
+ },
116
+ },
117
+ ];
118
+
119
+ function computeTailwindLike(classSample) {
120
+ if (!classSample.length) return 0;
121
+ // A rough "looks like Tailwind" metric: fraction of class tokens that match
122
+ // common utility shapes (pt-4, bg-slate-100, text-2xl, flex, gap-x-2, etc.).
123
+ const utilRe = /^(?:sm:|md:|lg:|xl:|2xl:|hover:|focus:|dark:|group-hover:)*(?:p|m|px|py|pt|pb|pl|pr|mx|my|mt|mb|ml|mr|w|h|min-w|min-h|max-w|max-h|gap|space|text|font|leading|tracking|bg|border|rounded|shadow|ring|ringed|opacity|flex|grid|items|justify|content|self|place|overflow|z|inset|top|bottom|left|right|translate|rotate|scale|skew|transition|duration|ease|delay|animate)(?:-[a-z0-9/.:\[\]%-]+)?$/i;
124
+ let utils = 0, total = 0;
125
+ for (const cls of classSample) {
126
+ for (const tok of cls.split(/\s+/).slice(0, 30)) {
127
+ if (!tok) continue;
128
+ total++;
129
+ if (utilRe.test(tok)) utils++;
130
+ }
131
+ }
132
+ return total > 0 ? utils / total : 0;
133
+ }
134
+
135
+ function countRadixAttrs(classSample, attrSample = []) {
136
+ // Radix emits attributes like data-radix-popper-content-wrapper, data-state,
137
+ // data-orientation, data-slot. We don't have a full attr dump, but script src
138
+ // for @radix-ui is also a tell.
139
+ let n = 0;
140
+ for (const a of attrSample) {
141
+ if (/data-radix|data-slot|data-state|data-orientation/.test(a)) n++;
142
+ }
143
+ return n;
144
+ }
145
+
146
+ export function extractComponentLibrary(stackSignals = {}) {
147
+ const classSample = (stackSignals.classNameSample || []).slice(0, 500);
148
+ const classBlob = classSample.join(' ');
149
+ const scripts = (stackSignals.scripts || []).join(' ');
150
+ const attrSample = stackSignals.attrSample || [];
151
+
152
+ const tailwindLike = computeTailwindLike(classSample);
153
+ let radixAttrCount = countRadixAttrs(classSample, attrSample);
154
+ if (/@radix-ui/.test(scripts)) radixAttrCount += 3;
155
+
156
+ const ctx = { classSample, classBlob, scripts, tailwindLike, radixAttrCount };
157
+
158
+ const ranked = [];
159
+ for (const lib of LIB_FINGERPRINTS) {
160
+ const { score, evidence } = lib.score(ctx);
161
+ if (score > 0) {
162
+ ranked.push({ id: lib.id, score: Number(score.toFixed(3)), evidence });
163
+ }
164
+ }
165
+ // Tailwind CSS is a styling layer, not a component library. If any
166
+ // higher-level library also scored, demote tailwindcss so it doesn't shadow
167
+ // the real answer.
168
+ const hasHigherLevel = ranked.some(r => r.id !== 'tailwindcss' && r.id !== 'tailwind-ui' && r.score > 0.35);
169
+ if (hasHigherLevel) {
170
+ for (const r of ranked) {
171
+ if (r.id === 'tailwindcss') r.score = Math.min(r.score, 0.3);
172
+ }
173
+ }
174
+ ranked.sort((a, b) => b.score - a.score);
175
+
176
+ const primary = ranked[0] || { id: 'unknown', score: 0, evidence: [] };
177
+ // If shadcn and radix both score, prefer shadcn at the top and keep radix as
178
+ // the underlying primitive.
179
+ const alternates = ranked.slice(1, 5);
180
+
181
+ return {
182
+ library: primary.id,
183
+ confidence: primary.score,
184
+ evidence: primary.evidence,
185
+ alternates,
186
+ signals: {
187
+ tailwindLike: Number(tailwindLike.toFixed(3)),
188
+ radixAttrCount,
189
+ classSampleSize: classSample.length,
190
+ },
191
+ needsSmart: primary.score < 0.55,
192
+ };
193
+ }
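Most fingerprints above cap their score with `Math.min`, but the `shadcn/ui`, `tailwind-ui`, and `tailwindcss` scorers add their terms without a cap. The arithmetic below works out their maxima from the constants in the diff; it suggests that `shadcn/ui` and `tailwindcss` can report a `confidence` above 1, while `tailwind-ui` stays below it. `clamp01` is a hypothetical helper showing one way to bound the value, not something the package defines.

```javascript
// Maximum shadcn/ui score when all three signals fire (constants from the diff).
const shadcnMax = 0.5 + 0.15 + 0.65;

// tailwindcss score as a function of utility-class density (0..1).
const tailwindScore = (density) => 0.3 + (density - 0.35) * 1.2;
const tailwindMax = tailwindScore(1);

// tailwind-ui tops out at four pattern hits.
const tailwindUiMax = 0.3 + 4 * 0.12;

// Hypothetical helper: bound any score to the [0, 1] range.
const clamp01 = (x) => Math.min(1, Math.max(0, x));

console.log({ shadcnMax, tailwindMax, tailwindUiMax });
console.log(clamp01(shadcnMax), clamp01(tailwindMax), clamp01(tailwindUiMax));
```

The `tailwindcss` score is demoted to at most 0.3 whenever a higher-level library scores above 0.35, so its uncapped maximum matters only when Tailwind is the sole signal.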