designlang 9.0.0 → 10.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (17)

package/.claude/launch.json +11 -0
package/CHANGELOG.md +62 -0
package/README.md +16 -1
package/bin/design-extract.js +107 -1
package/package.json +2 -2
package/src/classifiers/smart.js +130 -0
package/src/extractors/component-library.js +193 -0
package/src/extractors/component-screenshots.js +161 -0
package/src/extractors/imagery-style.js +131 -0
package/src/extractors/logo.js +142 -0
package/src/extractors/material-language.js +152 -0
package/src/extractors/page-intent.js +172 -0
package/src/extractors/section-roles.js +135 -0
package/src/formatters/markdown.js +109 -0
package/src/formatters/prompt-pack.js +214 -0
package/src/index.js +27 -0
package/src/multipage.js +233 -0

package/.claude/launch.json ADDED

@@ -0,0 +1,11 @@
+{
+  "version": "0.0.1",
+  "configurations": [
+    {
+      "name": "designlang-site",
+      "runtimeExecutable": "npm",
+      "runtimeArgs": ["--prefix", "website", "run", "dev"],
+      "port": 3000
+    }
+  ]
+}

package/CHANGELOG.md CHANGED

@@ -1,5 +1,67 @@
 # Changelog
+## [10.1.0] — 2026-04-22
+**Component screenshots.** The existing `--screenshots` flag now emits cluster-aware, retina (2×), multi-variant PNGs instead of five hardcoded selectors and a full-page image.
+### Added
+- **`src/extractors/component-screenshots.js`** — queries the live DOM with the same candidate selector the crawler uses, groups matches by `kind + variantHint + sizeHint`, and captures up to three representatives per group. Falls back to the v9 hardcoded list when no clusters produced anything (auth / docs pages).
+- Retina capture via a dedicated Playwright context at `deviceScaleFactor: 2`.
+- **`*-screenshots.json`** — index file mapping every cropped PNG to its cluster name, variant, bounds, and fallback flag.
+- Markdown formatter gains a **Component Screenshots** section listing the first 20 crops.
+### Behaviour
+- No new CLI flags. `--screenshots` and `--full` continue to opt into capture.
+- Backward compatible — when no clusters match, the v9 hardcoded selector set still fires.
+### Tests
+297 → **299** passing.
+## [10.0.0] — 2026-04-22
+**The Intent Release.** v9 captured *how* a site looks; v10 captures *what it is* — the semantic layer LLM agents need to rebuild a site faithfully, not just restyle a generic scaffold. Six new extractors, a multi-page crawl orchestrator, an optional smart-classifier LLM fallback, and a ready-to-paste prompt pack. 297/297 tests passing.
+### Added — extraction
+- **Page Intent classifier** (`src/extractors/page-intent.js`) — labels the crawled URL as `landing` / `pricing` / `docs` / `blog` / `blog-post` / `product` / `about` / `dashboard` / `auth` / `legal`, with URL + title + meta + DOM-shape signals, a confidence score, and ranked alternates.
+- **Section Roles** (`src/extractors/section-roles.js`) — annotates every semantic region with a role (`hero`, `feature-grid`, `logo-wall`, `stats`, `testimonial`, `pricing-table`, `faq`, `steps`, `comparison`, `gallery`, `bento`, `cta`, `footer`), extracts slot copy (headings, lede, CTA counts), and emits reading order.
+- **Material Language** (`src/extractors/material-language.js`) — classifies the visual vocabulary (`glassmorphism` / `neumorphism` / `flat` / `brutalist` / `skeuomorphic` / `material-you` / `soft-ui` / `mixed`) from shadow complexity, backdrop-filter usage, saturation, and geometry.
+- **Imagery Style** (`src/extractors/imagery-style.js`) — fingerprints the imagery (`photography` / `3d-render` / `isometric` / `flat-illustration` / `gradient-mesh` / `icon-only` / `screenshot` / `mixed`), plus dominant aspect ratio and image-radius profile.
+- **Component Library detector** (`src/extractors/component-library.js`) — identifies shadcn/ui, Radix, Headless UI, MUI, Chakra, Mantine, Ant Design, Bootstrap, HeroUI/NextUI, Tailwind UI, Vuetify, or plain Tailwind, with evidence and alternates.
+- **Logo extractor** (`src/extractors/logo.js`) — pulls the site's logo (SVG source or `<img>` bytes) and samples clearspace; writes `*-logo.svg` or `.png` plus `*-logo.json`.
+### Added — orchestration
+- **Multi-page crawl** (`src/multipage.js`) — `--full` or `--pages <n>` auto-discovers canonical pages from nav (pricing/docs/blog/about/product), runs the full extractor pipeline on each, and emits a cross-page consistency report with shared tokens, per-page uniques, and pairwise Jaccard scores.
+- **Smart classifier fallback** (`src/classifiers/smart.js`) — opt-in `--smart` flag routes low-confidence classifications through the OpenAI or Anthropic API (via `OPENAI_API_KEY` / `ANTHROPIC_API_KEY`). Gracefully no-ops when no key is set. Zero-dep — uses global `fetch`.
+### Added — LLM-native outputs
+- **Prompt pack** (`src/formatters/prompt-pack.js`) — writes a `*-prompts/` directory with `v0.txt`, `lovable.txt`, `cursor.md`, `claude-artifacts.md`, and atomic `recipe-<component>.md` cards. Tokens, section order, voice, and library guidance are all inlined so one paste is enough.
+- **Markdown sections** (`src/formatters/markdown.js`) — adds Page Intent, Section Roles, Material Language, Imagery Style, Component Library, and (when `--full`) Multi-Page Map sections to `*-design-language.md`.
+### Added — output files
+- `*-intent.json` — page-type + section-role map
+- `*-visual-dna.json` — material language + imagery style
+- `*-library.json` — component library detection + evidence
+- `*-logo.svg` | `*-logo.png` + `*-logo.json` (with `--full`)
+- `*-multipage.json` — per-page design languages + consistency (with `--full` / `--pages`)
+- `*-prompts/` — prompt pack directory
+### New CLI flags
+- `--smart` — enable optional LLM refinement for low-confidence classifiers
+- `--pages <n>` — explicitly crawl N canonical pages
+- `--no-prompts` — skip the prompt-pack directory
+### Tests
+- `tests/v10-features.test.js` — 15 new subtests covering page intent, section roles, component library, material language, imagery style, multi-page discovery, cross-page consistency, and prompt pack. Full suite: 297 passing.
 ## [9.0.0] — 2026-04-21
 **The Motion & Voice release.** Six new capabilities that push designlang past "extract the paint" and into "extract the *feel*, the *anatomy*, and the *voice*." No competing tool does any of these. All work ships with tests (282/282 passing).
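
The pairwise Jaccard scores mentioned in the 10.0.0 entry can be illustrated with a minimal sketch. The per-page token arrays and the helper names `jaccard` and `pairwiseJaccard` are assumptions for illustration only; the actual scoring in `src/multipage.js` is not shown in this part of the diff and may differ.

```javascript
// Hypothetical sketch: Jaccard similarity = |A ∩ B| / |A ∪ B| over token sets.
function jaccard(a, b) {
  const setA = new Set(a);
  const setB = new Set(b);
  // Assumption: two empty token sets count as identical.
  if (setA.size === 0 && setB.size === 0) return 1;
  let shared = 0;
  for (const token of setA) {
    if (setB.has(token)) shared++;
  }
  return shared / (setA.size + setB.size - shared);
}

// Score every unordered pair of pages once.
function pairwiseJaccard(pages) {
  const scores = [];
  for (let i = 0; i < pages.length; i++) {
    for (let j = i + 1; j < pages.length; j++) {
      scores.push({
        a: pages[i].name,
        b: pages[j].name,
        score: jaccard(pages[i].tokens, pages[j].tokens),
      });
    }
  }
  return scores;
}

// Illustrative data, not taken from a real crawl.
const samplePages = [
  { name: 'home', tokens: ['#0a0908', '#ff4800', '16px'] },
  { name: 'pricing', tokens: ['#0a0908', '#ff4800', '24px'] },
];
console.log(pairwiseJaccard(samplePages)); // one pair, score 0.5 (2 shared of 4 distinct)
```

A score near 1 would suggest two pages share most of their tokens; a score near 0 would suggest the pages diverge.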

package/README.md CHANGED

@@ -7,6 +7,7 @@
   <a href="https://github.com/Manavarya09/design-extract/blob/main/LICENSE"><img src="https://img.shields.io/github/license/Manavarya09/design-extract?color=0A0908&labelColor=F3F1EA" alt="license"></a>
   <a href="https://nodejs.org"><img src="https://img.shields.io/node/v/designlang?color=0A0908&labelColor=F3F1EA" alt="node version"></a>
   <a href="https://designlang.manavaryasingh.com/"><img src="https://img.shields.io/badge/website-live-FF4800?labelColor=F3F1EA" alt="website"></a>
+[![SafeSkill 77/100](https://img.shields.io/badge/SafeSkill-77%2F100_Passes%20with%20Notes-yellow)](https://safeskill.dev/scan/manavarya09-design-extract)
 </p>
 ---
@@ -17,10 +18,24 @@
 [![designlang on npm](https://pkgfolio.vercel.app/embed/pkg/designlang?v=2)](https://www.npmjs.com/package/designlang)
-**designlang** crawls any website with a headless browser, extracts every computed style from the live DOM, and generates **11+ output files** — including an AI-optimized markdown file, visual HTML preview, Tailwind config, React theme, shadcn/ui theme, Figma variables, W3C design tokens, CSS custom properties, **motion tokens**, **typed component anatomy stubs**, and a **brand voice** summary.
+**designlang** crawls any website with a headless browser, extracts every computed style from the live DOM, and generates **17+ output files** — including an AI-optimized markdown file, visual HTML preview, Tailwind config, React theme, shadcn/ui theme, Figma variables, W3C design tokens, CSS custom properties, motion tokens, typed component anatomy stubs, a brand voice summary, **page intent + section roles**, **visual DNA** (material language + imagery style), **component library detection**, a **logo file**, a **multi-page consistency report**, and a **prompt pack** of ready-to-paste prompts for v0, Lovable, Cursor, and Claude Artifacts.
 But unlike every other tool out there, it also extracts **layout patterns** (grids, flexbox, containers), **motion language** (durations, easings, springs, scroll-linked animations), **component anatomy** (slots, variant × size × state matrices), **brand voice** (tone, CTA verbs, heading style), captures **responsive behavior** across 4 breakpoints, records **interaction states** (hover, focus, active), scores **WCAG accessibility**, lints your own token files, and lets you **drift-check a codebase against a live site**, **visual-diff two URLs**, **compare multiple brands**, or **sync live sites to local tokens**.
+## What's New in v10 — The Intent Release
+Everything else captures *how* a site looks. v10 captures *what it is* — the semantic signal an LLM needs to rebuild a site faithfully instead of restyling a generic scaffold.
+- **Page Intent** — classifier labels the URL as `landing` / `pricing` / `docs` / `blog` / `blog-post` / `product` / `about` / `dashboard` / `auth` / `legal`, with a confidence score and rival alternates. URL + title + meta + DOM-shape signals. Heuristic-only by default; opt into `--smart` for LLM refinement.
+- **Section Roles** — every semantic region gets a role (`hero`, `feature-grid`, `logo-wall`, `stats`, `testimonial`, `pricing-table`, `faq`, `steps`, `comparison`, `gallery`, `bento`, `cta`, `footer`), plus reading order and extracted slot copy (headings, lede, CTA counts).
+- **Multi-Page Crawl** — `--full` (or `--pages <n>`) auto-discovers the site's own canonical pages from its nav (pricing/docs/blog/about/product) and runs the full pipeline on each, then emits a cross-page consistency report — shared tokens, per-page uniques, and pairwise Jaccard scores. LLMs get a real design language, not just a homepage snapshot.
+- **Material Language** — classifies the visual vocabulary as `glassmorphism` / `neumorphism` / `flat` / `brutalist` / `skeuomorphic` / `material-you` / `soft-ui` / `mixed` from shadow complexity, backdrop-filter usage, saturation, and geometry.
+- **Imagery Style** — fingerprints the images: `photography` / `3d-render` / `isometric` / `flat-illustration` / `gradient-mesh` / `icon-only` / `screenshot` / `mixed`, plus dominant aspect ratio and image-radius profile.
+- **Component Library Detection** — identifies `shadcn/ui`, `radix-ui`, `headlessui`, `mui`, `chakra-ui`, `mantine`, `ant-design`, `bootstrap`, `heroui`, `tailwind-ui`, `vuetify`, or plain `tailwindcss`, with evidence and alternates.
+- **Logo Extraction** — `--full` writes `*-logo.svg` (or `.png`) plus `*-logo.json` with dimensions, aspect, and sampled clearspace.
+- **Prompt Pack** — a `*-prompts/` directory with `v0.txt`, `lovable.txt`, `cursor.md`, `claude-artifacts.md`, and atomic `recipe-<component>.md` cards — tokens, section order, voice, and library inlined so one paste is enough.
+- **`--smart` mode** — when a heuristic classifier returns low confidence, fall back to a small LLM call (uses `OPENAI_API_KEY` or `ANTHROPIC_API_KEY` from env). Completely optional — no key, no behavior change.
 ## What's New in v9 — The Motion & Voice Release
 - **Motion Language** — durations bucketed into semantic tokens (`instant`/`xs`/`sm`/`md`/`lg`/`xl`), easings classified into families (ease-out, spring-overshoot, steps), scroll-linked animation detection (`animation-timeline`, `view-timeline-name`), keyframe kind classification (slide / fade / reveal / rotate / scale / pulse), and a `feel` fingerprint — *springy*, *responsive*, *smooth*, *mechanical*, or *mixed*.

package/bin/design-extract.js CHANGED

@@ -6,6 +6,11 @@ import { resolve, join } from 'path';
 import chalk from 'chalk';
 import ora from 'ora';
 import { extractDesignLanguage } from '../src/index.js';
+import { refineWithSmart } from '../src/classifiers/smart.js';
+import { crawlCanonicalPages } from '../src/multipage.js';
+import { extractLogo } from '../src/extractors/logo.js';
+import { captureComponentScreenshotsV10 } from '../src/extractors/component-screenshots.js';
+import { buildPromptPack } from '../src/formatters/prompt-pack.js';
 import { formatMarkdown } from '../src/formatters/markdown.js';
 import { formatTokens } from '../src/formatters/tokens.js';
 import { formatDtcgTokens } from '../src/formatters/dtcg-tokens.js';
@@ -48,7 +53,7 @@ const program = new Command();
 program
   .name('designlang')
   .description('Extract the complete design language from any website')
-  .version('9.0.0');
+  .version('10.1.0');
 // ── Main command: extract ──────────────────────────────────────
 program
@@ -77,6 +82,9 @@ program
   .option('--tokens-legacy', 'Emit pre-v7 flat token JSON (backward compat)')
   .option('--platforms <csv>', 'Additional platforms: web,ios,android,flutter,wordpress,all (web is always emitted)', 'web')
   .option('--emit-agent-rules', 'Emit Cursor/Claude Code/generic agent rules')
+  .option('--smart', 'use optional LLM fallback when heuristic classifiers have low confidence (needs OPENAI_API_KEY or ANTHROPIC_API_KEY)')
+  .option('--pages <n>', 'crawl N canonical pages (pricing/docs/blog/about/product) in addition to the homepage', parseInt)
+  .option('--no-prompts', 'skip writing the prompt-pack directory')
   .option('--json', 'output raw JSON to stdout (for CI/CD)')
   .option('--json-pretty', 'output formatted JSON to stdout')
   .option('--no-history', 'skip saving to history')
@@ -173,6 +181,77 @@ program
         design.interactions = await captureInteractions(url, { width: merged.width, height: parseInt(merged.height) || 800, wait: merged.wait });
       }
+      // v10: optional LLM refinement for low-confidence classifiers.
+      if (merged.smart) {
+        spinner.text = 'Refining classifiers with smart mode...';
+        try {
+          const refined = await refineWithSmart({
+            enabled: true,
+            rawData: design._raw,
+            design,
+            pageIntent: design.pageIntent,
+            sectionRoles: design.sectionRoles,
+            materialLanguage: design.materialLanguage,
+            componentLibrary: design.componentLibrary,
+          });
+          if (refined.applied) {
+            if (refined.updates?.pageIntent) design.pageIntent = { ...design.pageIntent, ...refined.updates.pageIntent };
+            if (refined.updates?.materialLanguage) design.materialLanguage = { ...design.materialLanguage, ...refined.updates.materialLanguage };
+            if (refined.updates?.componentLibrary) design.componentLibrary = { ...design.componentLibrary, ...refined.updates.componentLibrary };
+            design._smart = { provider: refined.provider, errors: refined.errors };
+          } else {
+            design._smart = { skipped: refined.reason };
+          }
+        } catch (e) { design._smart = { error: e.message }; }
+      }
+      // v10: logo extraction via a fresh Playwright session.
+      if (merged.full || merged.screenshots) {
+        spinner.text = 'Extracting logo...';
+        try {
+          const { chromium } = await import('playwright');
+          const browser = await chromium.launch({ headless: true, ...(merged.systemChrome && { channel: 'chrome' }) });
+          const ctx = await browser.newContext({ viewport: { width: merged.width, height: parseInt(merged.height) || 800 } });
+          const lp = await ctx.newPage();
+          await lp.goto(url, { waitUntil: 'domcontentloaded', timeout: 20000 }).catch(() => {});
+          await lp.waitForLoadState('networkidle').catch(() => {});
+          mkdirSync(outDir, { recursive: true });
+          design.logo = await extractLogo(lp, outDir, prefix);
+          await browser.close();
+        } catch (e) { design.logo = { found: false, error: e.message }; }
+      }
+      // v10.1: cluster-aware retina component screenshots.
+      if (merged.full || merged.screenshots) {
+        spinner.text = 'Capturing component screenshots (retina)...';
+        try {
+          design.componentScreenshots = await captureComponentScreenshotsV10(url, outDir, {
+            width: merged.width,
+            height: parseInt(merged.height) || 800,
+            channel: merged.systemChrome ? 'chrome' : undefined,
+          });
+        } catch (e) { design.componentScreenshots = { error: e.message }; }
+      }
+      // v10: multi-page canonical crawl (pricing/docs/blog/about/product).
+      const pagesArg = merged.pages != null ? merged.pages : (merged.full ? 5 : 0);
+      if (pagesArg > 0) {
+        spinner.text = `Crawling ${pagesArg} canonical pages...`;
+        try {
+          const mp = await crawlCanonicalPages({
+            homepageUrl: url,
+            homepageRawData: design._raw,
+            maxPages: pagesArg,
+            crawlerOptions: { width: merged.width, height: parseInt(merged.height) || 800 },
+            extract: (u, o) => extractDesignLanguage(u, o),
+          });
+          design.multiPage = mp;
+        } catch (e) { design.multiPage = { error: e.message }; }
+      }
+      // Drop the internal raw stash before JSON/output serialization.
+      delete design._raw;
       // JSON mode: output and exit
       if (jsonMode) {
         const output = opts.jsonPretty ? JSON.stringify(design, null, 2) : JSON.stringify(design);
@@ -230,6 +309,33 @@ program
       }
       files.push({ name: `${prefix}-voice.json`, content: JSON.stringify(design.voice || {}, null, 2), label: 'Brand Voice' });
+      // v10: page intent + section roles + visual DNA + component library + multi-page + prompt pack.
+      files.push({ name: `${prefix}-intent.json`, content: JSON.stringify({ pageIntent: design.pageIntent, sectionRoles: design.sectionRoles }, null, 2), label: 'Page Intent + Section Roles' });
+      files.push({ name: `${prefix}-visual-dna.json`, content: JSON.stringify({ materialLanguage: design.materialLanguage, imageryStyle: design.imageryStyle }, null, 2), label: 'Visual DNA' });
+      files.push({ name: `${prefix}-library.json`, content: JSON.stringify(design.componentLibrary || {}, null, 2), label: 'Component Library Detection' });
+      if (design.logo && design.logo.found) {
+        files.push({ name: `${prefix}-logo.json`, content: JSON.stringify(design.logo, null, 2), label: 'Logo Metadata' });
+      }
+      if (design.multiPage) {
+        files.push({ name: `${prefix}-multipage.json`, content: JSON.stringify(design.multiPage, null, 2), label: 'Multi-Page Crawl' });
+      }
+      if (design.componentScreenshots && (design.componentScreenshots.components || []).length) {
+        files.push({ name: `${prefix}-screenshots.json`, content: JSON.stringify(design.componentScreenshots, null, 2), label: 'Component Screenshots index' });
+      }
+      if (merged.prompts !== false) {
+        const pack = buildPromptPack(design);
+        const promptsDir = join(outDir, `${prefix}-prompts`);
+        mkdirSync(promptsDir, { recursive: true });
+        writeFileSync(join(promptsDir, 'v0.txt'), pack['v0.txt'], 'utf-8');
+        writeFileSync(join(promptsDir, 'lovable.txt'), pack['lovable.txt'], 'utf-8');
+        writeFileSync(join(promptsDir, 'cursor.md'), pack['cursor.md'], 'utf-8');
+        writeFileSync(join(promptsDir, 'claude-artifacts.md'), pack['claude-artifacts.md'], 'utf-8');
+        for (const r of pack.recipes) {
+          const slug = r.name.replace(/[^a-z0-9]+/gi, '-').toLowerCase() || 'component';
+          writeFileSync(join(promptsDir, `recipe-${slug}.md`), r.content, 'utf-8');
+        }
+      }
       for (const file of files) {
         writeFileSync(join(outDir, file.name), file.content, 'utf-8');
       }
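
The recipe filename logic in the hunk above can be tried in isolation. This sketch wraps the same expression in a helper named `recipeSlug`, which is a hypothetical name used only here; the sample component names are also made up.

```javascript
// Same expression as in the diff, wrapped in a hypothetical helper.
function recipeSlug(name) {
  return name.replace(/[^a-z0-9]+/gi, '-').toLowerCase() || 'component';
}

console.log(recipeSlug('Primary Button')); // primary-button
console.log(recipeSlug('Card / Pricing')); // card-pricing
console.log(recipeSlug('Button (lg)'));    // button-lg-
console.log(recipeSlug(''));               // component
```

Note that runs of non-alphanumeric characters collapse to a single hyphen, that leading or trailing hyphens are not trimmed, and that the `'component'` fallback only applies when the name is an empty string.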

package/package.json CHANGED

@@ -1,7 +1,7 @@
 {
   "name": "designlang",
-  "version": "9.0.0",
-  "description": "Extract the complete design language from any website — colors, typography, spacing, shadows, motion, component anatomy, and brand voice. Outputs AI-optimized markdown, W3C design tokens, motion tokens, typed component stubs, Tailwind config, and more.",
+  "version": "10.1.0",
+  "description": "Extract the complete design language from any website — colors, typography, spacing, shadows, motion, component anatomy, brand voice, page intent, section roles, material language, component library, imagery style, and logo. Outputs AI-optimized markdown, W3C design tokens, motion tokens, typed component stubs, Tailwind config, and ready-to-paste v0 / Lovable / Cursor / Claude-Artifacts prompts.",
   "type": "module",
   "bin": {
     "designlang": "./bin/design-extract.js"

package/src/classifiers/smart.js ADDED

@@ -0,0 +1,130 @@
+// Optional LLM fallback for low-confidence classifications. No SDK deps — we
+// hit the OpenAI or Anthropic REST API directly via global fetch. Runs only
+// when the user passes --smart AND an API key is available in env. Silently
+// no-ops otherwise so the core extractor stays zero-config.
+//
+// Consumers call `refineWithSmart({ pageIntent, sectionRoles, materialLanguage,
+// componentLibrary }, digest)` — we only hit the network for fields where
+// `needsSmart` is true.
+const TASKS = {
+  pageIntent: {
+    system: 'You classify a web page into one of these types: landing, pricing, docs, blog, blog-post, product, about, dashboard, auth, legal, unknown. Return only a JSON object {"type":"...","confidence":0.xx,"why":"one sentence"}.',
+  },
+  materialLanguage: {
+    system: 'You classify a website\'s visual material language. Choose one of: glassmorphism, neumorphism, flat, brutalist, skeuomorphic, material-you, soft-ui, mixed. Return only a JSON object {"label":"...","confidence":0.xx,"why":"one sentence"}.',
+  },
+  componentLibrary: {
+    system: 'You identify which UI component library a website most likely uses. Choose from: shadcn/ui, radix-ui, headlessui, mui, chakra-ui, mantine, ant-design, bootstrap, heroui, tailwind-ui, vuetify, tailwindcss, unknown. Return only a JSON object {"library":"...","confidence":0.xx,"why":"one sentence"}.',
+  },
+};
+function detectProvider() {
+  if (process.env.ANTHROPIC_API_KEY) return 'anthropic';
+  if (process.env.OPENAI_API_KEY) return 'openai';
+  return null;
+}
+function buildDigest({ rawData, design, pageIntent }) {
+  // A compact digest — URL, title, meta description, first 1000 chars of
+  // visible text, top classes. Keep under ~3k tokens.
+  const sections = (rawData?.light?.sections) || [];
+  const text = sections.map(s => s.text || '').join('\n').slice(0, 1500);
+  const metas = (rawData?.light?.stack?.metas || []).slice(0, 10)
+    .map(m => `${m.name || ''}: ${(m.content || '').slice(0, 120)}`).join('\n');
+  const classes = (rawData?.light?.stack?.classNameSample || []).slice(0, 60).join(' | ').slice(0, 1500);
+  return [
+    `URL: ${rawData?.url || ''}`,
+    `TITLE: ${rawData?.title || ''}`,
+    `PATH: ${pageIntent?.path || ''}`,
+    `METAS:\n${metas}`,
+    `SECTION ROLES: ${(design?.regions || []).map(r => r.role).join(',')}`,
+    `TEXT SAMPLE:\n${text}`,
+    `CLASS SAMPLE:\n${classes}`,
+  ].join('\n\n');
+}
+async function callAnthropic(system, user) {
+  const res = await fetch('https://api.anthropic.com/v1/messages', {
+    method: 'POST',
+    headers: {
+      'x-api-key': process.env.ANTHROPIC_API_KEY,
+      'anthropic-version': '2023-06-01',
+      'content-type': 'application/json',
+    },
+    body: JSON.stringify({
+      model: process.env.DESIGNLANG_MODEL || 'claude-haiku-4-5-20251001',
+      max_tokens: 200,
+      system,
+      messages: [{ role: 'user', content: user }],
+    }),
+  });
+  if (!res.ok) throw new Error(`anthropic ${res.status}`);
+  const json = await res.json();
+  const text = (json.content || []).map(b => b.text || '').join('');
+  return text;
+}
+async function callOpenAI(system, user) {
+  const res = await fetch('https://api.openai.com/v1/chat/completions', {
+    method: 'POST',
+    headers: {
+      'authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
+      'content-type': 'application/json',
+    },
+    body: JSON.stringify({
+      model: process.env.DESIGNLANG_MODEL || 'gpt-4o-mini',
+      messages: [
+        { role: 'system', content: system },
+        { role: 'user', content: user },
+      ],
+      max_tokens: 200,
+      response_format: { type: 'json_object' },
+    }),
+  });
+  if (!res.ok) throw new Error(`openai ${res.status}`);
+  const json = await res.json();
+  return json.choices?.[0]?.message?.content || '';
+}
+async function callLLM(provider, system, user) {
+  if (provider === 'anthropic') return callAnthropic(system, user);
+  return callOpenAI(system, user);
+}
+function parseJsonLoose(text) {
+  try { return JSON.parse(text); } catch { /* try to find a JSON object */ }
+  const m = text.match(/\{[\s\S]*\}/);
+  if (m) { try { return JSON.parse(m[0]); } catch {} }
+  return null;
+}
+export async function refineWithSmart({ enabled, rawData, design, pageIntent, sectionRoles, materialLanguage, componentLibrary }) {
+  if (!enabled) return { applied: false, reason: 'disabled' };
+  const provider = detectProvider();
+  if (!provider) return { applied: false, reason: 'no API key (set OPENAI_API_KEY or ANTHROPIC_API_KEY)' };
+  const digest = buildDigest({ rawData, design, pageIntent });
+  const updates = {};
+  const errors = [];
+  const queue = [];
+  if (pageIntent?.needsSmart) queue.push(['pageIntent', pageIntent]);
+  if (materialLanguage && (materialLanguage.confidence || 0) < 0.55) queue.push(['materialLanguage', materialLanguage]);
+  if (componentLibrary?.needsSmart) queue.push(['componentLibrary', componentLibrary]);
+  for (const [task, current] of queue) {
+    const spec = TASKS[task];
+    if (!spec) continue;
+    const user = `Digest:\n${digest}\n\nCurrent heuristic result:\n${JSON.stringify(current)}\n\nRespond with the requested JSON.`;
+    try {
+      const raw = await callLLM(provider, spec.system, user);
+      const parsed = parseJsonLoose(raw);
+      if (parsed) updates[task] = { ...parsed, smart: true, provider };
+    } catch (e) {
+      errors.push(`${task}: ${e.message}`);
+    }
+  }
+  return { applied: true, provider, updates, errors };
+}
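
To see how a model reply ends up merged into the heuristic result, here is a minimal standalone sketch. `parseJsonLoose` is copied from the file above because it is not exported; the sample reply and the `heuristic` object are invented for illustration, and the merge mirrors the spread used in `bin/design-extract.js`.

```javascript
// Copy of parseJsonLoose from the diff, reproduced so the sketch runs on its own.
function parseJsonLoose(text) {
  try { return JSON.parse(text); } catch { /* fall through to the regex scan */ }
  const m = text.match(/\{[\s\S]*\}/);
  if (m) { try { return JSON.parse(m[0]); } catch {} }
  return null;
}

// Invented example: a reply that wraps the JSON object in prose.
const reply = 'Here is the result: {"type":"docs","confidence":0.82,"why":"sidebar nav"} Hope that helps.';
const parsed = parseJsonLoose(reply);

// Invented heuristic result; fields from the parsed reply overwrite matching keys.
const heuristic = { type: 'landing', confidence: 0.41, path: '/guide', needsSmart: true };
const merged = { ...heuristic, ...parsed, smart: true, provider: 'anthropic' };

console.log(merged.type, merged.confidence, merged.path); // docs 0.82 /guide
console.log(parseJsonLoose('no json here'));              // null
```

Because the regex is greedy, it spans from the first `{` to the last `}` in the reply, so a reply containing two separate JSON objects would fail to parse and yield `null`.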

package/src/extractors/component-library.js ADDED

@@ -0,0 +1,193 @@
+// Detect the component library a site is built on. Fingerprints: class-name
+// patterns, data-* attributes, script URLs, window globals. Returns the most
+// likely library with a confidence score and the evidence that supported it —
+// LLM agents consume this to pick the right scaffolding (e.g. "use shadcn/ui",
+// "use MUI v5") when rebuilding.
+const LIB_FINGERPRINTS = [
+  {
+    id: 'shadcn/ui',
+    score: (ctx) => {
+      let s = 0; const evidence = [];
+      // shadcn copies Radix attributes and uses Tailwind utility density.
+      if (ctx.radixAttrCount > 2 && ctx.tailwindLike > 0.4) { s += 0.5; evidence.push('radix+tailwind mix'); }
+      if (/\bbutton-primary|\bbutton-destructive/.test(ctx.classBlob) && ctx.radixAttrCount > 0) { s += 0.15; evidence.push('shadcn button tokens'); }
+      // `class="... bg-background text-foreground ..."` is a shadcn tell.
+      if (/\bbg-background\b|\btext-foreground\b|\bborder-input\b|\bring-offset-background\b/.test(ctx.classBlob)) {
+        s += 0.65; evidence.push('shadcn css tokens');
+      }
+      return { score: s, evidence };
+    },
+  },
+  {
+    id: 'radix-ui',
+    score: (ctx) => {
+      const count = ctx.radixAttrCount;
+      if (count === 0) return { score: 0, evidence: [] };
+      const score = Math.min(0.9, 0.3 + count * 0.05);
+      return { score, evidence: [`${count} radix attributes`] };
+    },
+  },
+  {
+    id: 'headlessui',
+    score: (ctx) => {
+      const m = (ctx.classSample.join(' ').match(/headlessui-[a-z]+/gi) || []).length;
+      if (m < 2) return { score: 0, evidence: [] };
+      return { score: Math.min(0.9, 0.3 + m * 0.08), evidence: [`${m} headlessui- class refs`] };
+    },
+  },
+  {
+    id: 'mui',
+    score: (ctx) => {
+      const m = (ctx.classBlob.match(/Mui[A-Z][A-Za-z]+-root/g) || []).length;
+      if (m < 2) return { score: 0, evidence: [] };
+      return { score: Math.min(0.95, 0.4 + m * 0.04), evidence: [`${m} Mui*-root classes`] };
+    },
+  },
+  {
+    id: 'chakra-ui',
+    score: (ctx) => {
+      const m = (ctx.classBlob.match(/\bchakra-[a-z]+/g) || []).length;
+      if (m < 3) return { score: 0, evidence: [] };
+      return { score: Math.min(0.95, 0.4 + m * 0.03), evidence: [`${m} chakra- classes`] };
+    },
+  },
+  {
+    id: 'mantine',
+    score: (ctx) => {
+      const m = (ctx.classBlob.match(/mantine-[A-Za-z0-9]+/g) || []).length;
+      if (m < 2) return { score: 0, evidence: [] };
+      return { score: Math.min(0.95, 0.4 + m * 0.04), evidence: [`${m} mantine- classes`] };
+    },
+  },
+  {
+    id: 'ant-design',
+    score: (ctx) => {
+      const m = (ctx.classBlob.match(/\bant-[a-z]+(-[a-z]+)*/g) || []).length;
+      if (m < 3) return { score: 0, evidence: [] };
+      return { score: Math.min(0.95, 0.4 + m * 0.03), evidence: [`${m} ant- classes`] };
+    },
+  },
+  {
+    id: 'bootstrap',
+    score: (ctx) => {
+      const hits = ['container', 'row', 'col-md-', 'btn-primary', 'navbar-nav', 'card-body']
+        .filter(k => ctx.classBlob.includes(k)).length;
+      if (hits < 3) return { score: 0, evidence: [] };
+      return { score: Math.min(0.9, 0.3 + hits * 0.1), evidence: [`bootstrap utility hits: ${hits}`] };
+    },
+  },
+  {
+    id: 'heroui',
+    score: (ctx) => {
+      const m = (ctx.classBlob.match(/\bheroui-|\bnextui-/g) || []).length;
+      if (m < 2) return { score: 0, evidence: [] };
+      return { score: Math.min(0.95, 0.4 + m * 0.05), evidence: [`${m} heroui/nextui classes`] };
+    },
+  },
+  {
+    id: 'tailwind-ui',
+    score: (ctx) => {
+      // Tailwind UI is a starter/template, not a runtime — use density of
+      // Tailwind utilities + typical Tailwind UI patterns as a weak signal.
+      if (ctx.tailwindLike < 0.6) return { score: 0, evidence: [] };
+      const patterns = ['ring-offset-', 'focus:ring-', 'hover:bg-gray-', 'prose prose-'];
+      const hits = patterns.filter(p => ctx.classBlob.includes(p)).length;
+      if (hits < 2) return { score: 0, evidence: [] };
+      return { score: 0.3 + hits * 0.12, evidence: [`tailwind density=${ctx.tailwindLike.toFixed(2)}, pattern hits=${hits}`] };
+    },
+  },
+  {
+    id: 'vuetify',
+    score: (ctx) => {
+      const m = (ctx.classBlob.match(/\bv-[a-z]+(-[a-z]+)?/g) || []).length;
+      if (m < 5) return { score: 0, evidence: [] };
+      return { score: Math.min(0.9, 0.3 + m * 0.02), evidence: [`${m} v-* classes`] };
+    },
+  },
+  {
+    id: 'tailwindcss',
+    score: (ctx) => {
+      if (ctx.tailwindLike < 0.35) return { score: 0, evidence: [] };
+      // Tailwind itself isn't a component library but we report it as a signal
+      // when no higher-level library is detected.
+      return { score: 0.3 + (ctx.tailwindLike - 0.35) * 1.2, evidence: [`tailwind-like class density ${(ctx.tailwindLike * 100).toFixed(0)}%`] };
+    },
+  },
+];
+function computeTailwindLike(classSample) {
+  if (!classSample.length) return 0;
+  // A rough "looks like Tailwind" metric: fraction of class tokens that match
+  // common utility shapes (pt-4, bg-slate-100, text-2xl, flex, gap-x-2, etc.).
+  const utilRe = /^(?:sm:|md:|lg:|xl:|2xl:|hover:|focus:|dark:|group-hover:)*(?:p|m|px|py|pt|pb|pl|pr|mx|my|mt|mb|ml|mr|w|h|min-w|min-h|max-w|max-h|gap|space|text|font|leading|tracking|bg|border|rounded|shadow|ring|ringed|opacity|flex|grid|items|justify|content|self|place|overflow|z|inset|top|bottom|left|right|translate|rotate|scale|skew|transition|duration|ease|delay|animate)(?:-[a-z0-9/.:\[\]%-]+)?$/i;
+  let utils = 0, total = 0;
+  for (const cls of classSample) {
+    for (const tok of cls.split(/\s+/).slice(0, 30)) {
+      if (!tok) continue;
+      total++;
+      if (utilRe.test(tok)) utils++;
+    }
+  }
+  return total > 0 ? utils / total : 0;
+}
+function countRadixAttrs(classSample, attrSample = []) {
+  // Radix emits attributes like data-radix-popper-content-wrapper, data-state,
+  // data-orientation, data-slot. We don't have a full attr dump, but script src
+  // for @radix-ui is also a tell.
+  let n = 0;
+  for (const a of attrSample) {
+    if (/data-radix|data-slot|data-state|data-orientation/.test(a)) n++;
+  }
+  return n;
+}
+export function extractComponentLibrary(stackSignals = {}) {
+  const classSample = (stackSignals.classNameSample || []).slice(0, 500);
+  const classBlob = classSample.join(' ');
+  const scripts = (stackSignals.scripts || []).join(' ');
+  const attrSample = stackSignals.attrSample || [];
+  const tailwindLike = computeTailwindLike(classSample);
+  let radixAttrCount = countRadixAttrs(classSample, attrSample);
+  if (/@radix-ui/.test(scripts)) radixAttrCount += 3;
+  const ctx = { classSample, classBlob, scripts, tailwindLike, radixAttrCount };
+  const ranked = [];
+  for (const lib of LIB_FINGERPRINTS) {
+    const { score, evidence } = lib.score(ctx);
+    if (score > 0) {
+      ranked.push({ id: lib.id, score: Number(score.toFixed(3)), evidence });
+    }
+  }
+  // Tailwind CSS is a styling layer, not a component library. If any
+  // higher-level library also scored, demote tailwindcss so it doesn't shadow
+  // the real answer.
+  const hasHigherLevel = ranked.some(r => r.id !== 'tailwindcss' && r.id !== 'tailwind-ui' && r.score > 0.35);
+  if (hasHigherLevel) {
+    for (const r of ranked) {
+      if (r.id === 'tailwindcss') r.score = Math.min(r.score, 0.3);
+    }
+  }
+  ranked.sort((a, b) => b.score - a.score);
+  const primary = ranked[0] || { id: 'unknown', score: 0, evidence: [] };
+  // If shadcn and radix both score, prefer shadcn at the top and keep radix as
+  // the underlying primitive.
+  const alternates = ranked.slice(1, 5);
+  return {
+    library: primary.id,
+    confidence: primary.score,
+    evidence: primary.evidence,
+    alternates,
+    signals: {
+      tailwindLike: Number(tailwindLike.toFixed(3)),
+      radixAttrCount,
+      classSampleSize: classSample.length,
+    },
+    needsSmart: primary.score < 0.55,
+  };
+}
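
As a worked example of how match counts turn into a confidence and a `needsSmart` flag, the sketch below mirrors the `mui` fingerprint formula and the `0.55` cutoff from the file above. The helper name `muiScore` and the constant name `NEEDS_SMART_BELOW` are hypothetical; in the real file, a zero score is not ranked at all and the result falls back to `unknown`.

```javascript
// Hypothetical helper mirroring the `mui` fingerprint: 0.4 base + 0.04 per match, capped at 0.95.
function muiScore(matchCount) {
  if (matchCount < 2) return 0;
  return Number(Math.min(0.95, 0.4 + matchCount * 0.04).toFixed(3));
}

// Same threshold as `needsSmart: primary.score < 0.55` in the diff.
const NEEDS_SMART_BELOW = 0.55;

for (const n of [1, 3, 4, 20]) {
  const score = muiScore(n);
  console.log(`matches=${n} score=${score} needsSmart=${score < NEEDS_SMART_BELOW}`);
}
// matches=1 score=0 needsSmart=true
// matches=3 score=0.52 needsSmart=true
// matches=4 score=0.56 needsSmart=false
// matches=20 score=0.95 needsSmart=false
```

Under this formula, four `Mui*-root` class matches are the minimum needed for the heuristic result to stand without the optional `--smart` refinement.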