dembrandt 0.3.0 → 0.4.0

package/README.md CHANGED
@@ -4,417 +4,101 @@
4
4
  [![npm downloads](https://img.shields.io/npm/dm/dembrandt.svg)](https://www.npmjs.com/package/dembrandt)
5
5
  [![license](https://img.shields.io/npm/l/dembrandt.svg)](https://github.com/thevangelist/dembrandt/blob/main/LICENSE)
6
6
 
7
- A CLI tool for extracting design tokens and brand assets from any website. Powered by Playwright with advanced bot detection avoidance.
7
+ Extract any website’s design system into design tokens in a few seconds: logo, colors, typography, borders, and more. One command.
8
8
 
9
9
  ![Dembrandt Demo](showcase.png)
10
10
 
11
- ## Quick Start
11
+ ## Install
12
12
 
13
13
  ```bash
14
- npx dembrandt stripe.com
14
+ npx dembrandt bmw.de
15
15
  ```
16
16
 
17
- No installation required! Extract design tokens from any website in seconds. Or install globally with `npm install -g dembrandt`.
17
+ Or install globally: `npm install -g dembrandt` then run `dembrandt bmw.de`
18
18
 
19
- ## What It Does
19
+ Requires Node.js 18+
20
20
 
21
- Dembrandt analyzes live websites and extracts their complete design system:
21
+ ## What to expect from extraction?
22
22
 
23
- - **Logo** Logo detection (img/svg) with dimensions and source URL
24
- - **Favicons** All favicon variants with sizes and types
25
- - **Colors** — Semantic colors, color palette with confidence scoring, CSS variables (both hex and RGB formats)
26
- - **Typography** — Font families, sizes, weights, line heights, font sources (Google Fonts, Adobe Fonts, custom)
27
- - **Spacing** — Margin and padding scales with grid system detection (4px/8px/custom)
28
- - **Border Radius** Corner radius patterns with usage frequency
29
- - **Borders** — Border widths, styles (solid, dashed, dotted), and colors with confidence scoring
30
- - **Shadows** Box shadow values for elevation systems
31
- - **Buttons** — Component styles with variants and states
32
- - **Inputs** — Form field styles (input, textarea, select)
33
- - **Links** — Link styles with hover states and decorations
34
- - **Breakpoints** — Responsive design breakpoints from media queries
35
- - **Icons** — Icon system detection (Font Awesome, Material Icons, SVG)
36
- - **Frameworks** — CSS framework detection (Tailwind, Bootstrap, Material-UI, Chakra)
37
-
38
- Perfect for competitive analysis, brand audits, or rebuilding a brand when you don't have design guidelines.
39
-
40
- ## Why It Matters
41
-
42
- **Designers** — Analyze competitor systems, document production tokens, audit brand consistency
43
-
44
- **Developers** — Migrate design tokens, reverse engineer components, validate implementations
45
-
46
- **Product Managers** — Track competitor evolution, quantify design debt, evaluate vendors
47
-
48
- **Marketing** — Audit competitor brands, plan rebrands, monitor brand compliance
49
-
50
- **Engineering Leaders** — Measure technical debt, plan migrations, assess acquisition targets
51
-
52
- ## How It Works
53
-
54
- Uses Playwright to render the page, extracts computed styles from the DOM, analyzes color usage and confidence, groups similar typography, detects spacing patterns, and returns actionable design tokens.
55
-
56
- ### Extraction Process
57
-
58
- 1. **Browser Launch** - Launches Chromium with stealth configuration
59
- 2. **Anti-Detection** - Injects scripts to bypass bot detection
60
- 3. **Navigation** - Navigates to target URL with retry logic
61
- 4. **Hydration** - Waits for SPAs to fully load (8s initial + 4s stabilization)
62
- 5. **Content Validation** - Verifies page content is substantial (>500 chars)
63
- 6. **Parallel Extraction** - Runs all extractors concurrently for speed
64
- 7. **Analysis** - Analyzes computed styles, DOM structure, and CSS variables
65
- 8. **Scoring** - Assigns confidence scores based on context and usage
66
-
67
- ### Color Confidence
68
-
69
- - **High** — Logo, brand elements, primary buttons
70
- - **Medium** — Interactive elements, icons, navigation
71
- - **Low** — Generic UI components (filtered from display)
72
-
73
- Only shows high and medium confidence colors in terminal. Full palette in JSON.
74
-
75
- ### Typography Detection
76
-
77
- Samples all heading levels (h1-h6), body text, buttons, links. Groups by font family, size, and weight. Detects Google Fonts, Adobe Fonts, custom @font-face.
78
-
79
- ### Framework Detection
80
-
81
- Recognizes Tailwind CSS, Bootstrap, Material-UI, and others by class patterns and CDN links.
82
-
83
- ## Installation
84
-
85
- ### Using npx (Recommended)
86
-
87
- No installation needed! Run directly with `npx`:
88
-
89
- ```bash
90
- npx dembrandt stripe.com
91
- ```
92
-
93
- The first run will automatically install Chromium (~170MB).
94
-
95
- ### Global Installation
96
-
97
- Install globally for repeated use:
98
-
99
- ```bash
100
- npm install -g dembrandt
101
- dembrandt stripe.com
102
- ```
103
-
104
- ### Prerequisites
105
-
106
- - Node.js 18 or higher
107
-
108
- ### Development Setup
109
-
110
- For contributors who want to work on dembrandt:
111
-
112
- ```bash
113
- git clone https://github.com/thevangelist/dembrandt.git
114
- cd dembrandt
115
- npm install
116
- npm link
117
- ```
23
+ - Colors (semantic, palette, CSS variables)
24
+ - Typography (fonts, sizes, weights, sources)
25
+ - Spacing (margin/padding scales)
26
+ - Borders (radius, widths, styles, colors)
27
+ - Shadows
28
+ - Components (buttons, inputs, links)
29
+ - Breakpoints
30
+ - Icons & frameworks
118
31
 
119
32
  ## Usage
120
33
 
121
- ### Basic Usage
122
-
123
34
  ```bash
124
- # Using npx (no installation)
125
- npx dembrandt <url>
126
-
127
- # Or if installed globally
128
- dembrandt <url>
129
-
130
- # Examples
131
- dembrandt stripe.com
132
- dembrandt https://github.com
133
- dembrandt tailwindcss.com
35
+ dembrandt <url> # Basic extraction
36
+ dembrandt bmw.de --save-output # Save JSON to output folder
37
+ dembrandt bmw.de --json-only # JSON output only (no save)
38
+ dembrandt bmw.de --debug # Visible browser
39
+ dembrandt bmw.de --dark-mode # Dark mode
40
+ dembrandt bmw.de --mobile # Mobile viewport
41
+ dembrandt bmw.de --slow # 3x timeouts
134
42
  ```
135
43
 
136
- ### Options
137
-
138
- **`--json-only`** - Output raw JSON to stdout instead of formatted terminal display
139
-
140
- ```bash
141
- dembrandt stripe.com --json-only > tokens.json
142
- ```
143
-
144
- Note: JSON is automatically saved to `output/domain.com/` regardless of this flag.
145
-
146
- **`-d, --debug`** - Run with visible browser and detailed logs
147
-
148
- ```bash
149
- dembrandt stripe.com --debug
150
- ```
151
-
152
- Useful for troubleshooting bot detection, timeouts, or extraction issues.
153
-
154
- **`--verbose-colors`** - Show medium and low confidence colors in terminal output
155
-
156
- ```bash
157
- dembrandt stripe.com --verbose-colors
158
- ```
159
-
160
- By default, only high-confidence colors are shown. Use this flag to see all detected colors.
161
-
162
- **`--dark-mode`** - Extract colors from dark mode
163
-
164
- ```bash
165
- dembrandt stripe.com --dark-mode
166
- ```
167
-
168
- Enables dark mode preference detection for sites that support it.
169
-
170
- **`--mobile`** - Extract from mobile viewport
171
-
172
- ```bash
173
- dembrandt stripe.com --mobile
174
- ```
175
-
176
- Simulates a mobile device viewport for responsive design token extraction.
177
-
178
- **`--slow`** - Use 3x longer timeouts for slow-loading sites
179
-
180
- ```bash
181
- dembrandt linkedin.com --slow
182
- ```
183
-
184
- Helpful for sites with heavy JavaScript, complex SPAs, or aggressive bot detection that need extra time to fully load.
185
-
186
- ## Output
187
-
188
- ### Automatic JSON Saves
189
-
190
- Every extraction is automatically saved to `output/domain.com/YYYY-MM-DDTHH-MM-SS.json` with:
191
-
192
- - Complete design token data
193
- - Timestamped for version tracking
194
- - Organized by domain
195
-
196
- Example: `output/stripe.com/2025-11-22T14-30-45.json`
197
-
198
- ### Terminal Output
199
-
200
- Clean, formatted tables showing:
201
-
202
- - Color palette with confidence ratings (with visual swatches)
203
- - CSS variables with color previews
204
- - Typography hierarchy with context
205
- - Spacing scale (4px/8px grid detection)
206
- - Shadow system
207
- - Button variants
208
- - Component style breakdowns
209
- - Framework and icon system detection
210
-
211
- ### JSON Output Format
212
-
213
- Complete extraction data for programmatic use:
214
-
215
- ```json
216
- {
217
- "url": "https://example.com",
218
- "extractedAt": "2025-11-22T...",
219
- "logo": { "source": "img", "url": "...", "width": 120, "height": 40 },
220
- "colors": {
221
- "semantic": { "primary": "#3b82f6", ... },
222
- "palette": [{ "color": "#3b82f6", "confidence": "high", "count": 45, "sources": [...] }],
223
- "cssVariables": { "--color-primary": "#3b82f6", ... }
224
- },
225
- "typography": {
226
- "styles": [{ "fontFamily": "Inter", "fontSize": "16px", "fontWeight": "400", ... }],
227
- "sources": { "googleFonts": [...], "adobeFonts": false, "customFonts": [...] }
228
- },
229
- "spacing": { "scaleType": "8px", "commonValues": [{ "px": "16px", "rem": "1rem", "count": 42 }, ...] },
230
- "borderRadius": { "values": [{ "value": "8px", "count": 15, "confidence": "high" }, ...] },
231
- "shadows": [{ "shadow": "0 2px 4px rgba(0,0,0,0.1)", "count": 8, "confidence": "high" }, ...],
232
- "components": {
233
- "buttons": [{ "backgroundColor": "...", "color": "...", "padding": "...", ... }],
234
- "inputs": [{ "type": "input", "border": "...", "borderRadius": "...", ... }]
235
- },
236
- "breakpoints": [{ "px": "768px" }, ...],
237
- "iconSystem": [{ "name": "Font Awesome", "type": "icon-font" }, ...],
238
- "frameworks": [{ "name": "Tailwind CSS", "confidence": "high", "evidence": "class patterns" }]
239
- }
240
- ```
241
-
242
- ## Examples
243
-
244
- ### Extract Design Tokens
245
-
246
- ```bash
247
- # Analyze a single site (auto-saves JSON to output/stripe.com/)
248
- dembrandt stripe.com
249
-
250
- # View saved JSON files
251
- ls output/stripe.com/
252
-
253
- # Output to stdout for piping
254
- dembrandt stripe.com --json-only | jq '.colors.semantic'
255
-
256
- # Debug mode for difficult sites
257
- dembrandt example.com --debug
258
- ```
259
-
260
- ### Compare Competitors
261
-
262
- ```bash
263
- # Extract tokens from multiple competitors (auto-saved to output/)
264
- for site in stripe.com square.com paypal.com; do
265
- dembrandt $site
266
- done
267
-
268
- # Compare color palettes from most recent extractions
269
- jq '.colors.palette[] | select(.confidence=="high")' output/stripe.com/2025-11-22T*.json output/square.com/2025-11-22T*.json
270
-
271
- # Compare semantic colors across competitors
272
- jq '.colors.semantic' output/*/2025-11-22T*.json
273
- ```
274
-
275
- ### Integration with Design Tools
276
-
277
- ```bash
278
- # Extract and convert to custom config format
279
- dembrandt mysite.com --json-only | jq '{
280
- colors: .colors.semantic,
281
- fontFamily: .typography.sources,
282
- spacing: .spacing.commonValues
283
- }' > design-tokens.json
284
-
285
- # Or use the built-in Tailwind CSS exporter (see lib/exporters.js)
286
- # Converts extracted tokens to Tailwind config format
287
- ```
44
+ By default, results display in terminal only. Use `--save-output` to save JSON to `output/bmw.de/YYYY-MM-DDTHH-MM-SS.json`
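Reviewer sketch of the save behavior this line describes, not the package's actual code. The gating condition mirrors the `opts.saveOutput && !opts.jsonOnly` check in the `index.js` part of this diff, and the hostname handling mirrors its `.replace("www.", "")` call. The helper names `shouldSave` and `outputPath` are hypothetical, and the timestamp formatting is an assumption chosen to produce the documented `YYYY-MM-DDTHH-MM-SS` shape (in UTC).

```javascript
// Hypothetical helpers; names are illustrative and do not come from dembrandt.

// Save only when --save-output is set and --json-only is not.
function shouldSave(opts) {
  return Boolean(opts.saveOutput) && !opts.jsonOnly;
}

// Build output/<domain>/<timestamp>.json for a full URL and a Date.
function outputPath(url, date = new Date()) {
  const domain = new URL(url).hostname.replace("www.", "");
  // Assumed formatting: ISO string cut to whole seconds, colons replaced by dashes.
  const timestamp = date.toISOString().slice(0, 19).replace(/:/g, "-");
  return `output/${domain}/${timestamp}.json`;
}

// Usage: prints a path only when saving is enabled.
const opts = { saveOutput: true, jsonOnly: false };
if (shouldSave(opts)) {
  console.log(outputPath("https://www.bmw.de", new Date("2025-11-22T14:30:45Z")));
}
```

The sketch expects a URL with a scheme; `new URL("bmw.de")` would throw, so a bare domain would need normalizing first.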
288
45
 
289
46
  ## Use Cases
290
47
 
291
- ### Brand Audits
292
-
293
- Extract and document your company's current design system from production websites.
48
+ - Brand audits & competitive analysis
49
+ - Design system documentation
50
+ - Reverse engineering brands
51
+ - Multi-site brand consolidation
294
52
 
295
- ### Competitive Analysis
296
-
297
- Compare design systems across competitors to identify trends and opportunities.
298
-
299
- ### Design System Migration
300
-
301
- Document legacy design tokens before migrating to a new system.
302
-
303
- ### Reverse Engineering
304
-
305
- Rebuild a brand when original design guidelines are unavailable.
306
-
307
- ### Quality Assurance
308
-
309
- Verify design consistency across different pages and environments.
310
-
311
- ## Advanced Features
312
-
313
- ### Bot Detection Avoidance
314
-
315
- - Stealth mode with anti-detection scripts
316
- - Automatic fallback to visible browser on detection
317
- - Human-like interaction simulation (mouse movement, scrolling)
318
- - Custom user agent and browser fingerprinting
319
-
320
- ### Smart Retry Logic
321
-
322
- - Automatic retry on navigation failures (up to 2 attempts)
323
- - SPA hydration detection and waiting
324
- - Content validation to ensure page is fully loaded
325
- - Detailed progress logging at each step
326
-
327
- ### Comprehensive Logging
328
-
329
- - Real-time spinner with step-by-step progress
330
- - Detailed extraction metrics (colors found, styles detected, etc.)
331
- - Error context with URL, stage, and attempt information
332
- - Debug mode with full stack traces
333
-
334
- ## Troubleshooting
335
-
336
- ### Bot Detection Issues
337
-
338
- If you encounter timeouts or network errors:
339
-
340
- ```bash
341
- dembrandt example.com --debug
342
- ```
343
-
344
- This will automatically retry with a visible browser.
345
-
346
- ### Page Not Loading
347
-
348
- Some sites require longer load times. The tool waits 8 seconds for SPA hydration, but you can modify this in the source.
53
+ ## How It Works
349
54
 
350
- ### Empty Content
55
+ Uses Playwright to render the page, extracts computed styles from the DOM, analyzes color usage and confidence, groups similar typography, detects spacing patterns, and returns actionable design tokens.
351
56
 
352
- If content length is < 500 chars, the tool will automatically retry (up to 2 attempts).
57
+ ### Extraction Process
353
58
 
354
- ### Debug Mode
59
+ 1. Browser Launch - Launches Chromium with stealth configuration
60
+ 2. Anti-Detection - Injects scripts to bypass bot detection
61
+ 3. Navigation - Navigates to target URL with retry logic
62
+ 4. Hydration - Waits for SPAs to fully load (8s initial + 4s stabilization)
63
+ 5. Content Validation - Verifies page content is substantial (>500 chars)
64
+ 6. Parallel Extraction - Runs all extractors concurrently for speed
65
+ 7. Analysis - Analyzes computed styles, DOM structure, and CSS variables
66
+ 8. Scoring - Assigns confidence scores based on context and usage
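Reviewer sketch of how steps 3 to 5 could fit together, under stated assumptions. The `>500 chars` threshold comes from step 5, and the limit of 2 attempts comes from the 0.3.0 README text removed in this diff. The function `loadWithRetry` and its `load` callback are hypothetical stand-ins for navigation plus the hydration wait; they are not dembrandt or Playwright APIs.

```javascript
// Hypothetical helper; `load` stands in for "navigate, wait for hydration, return page text".
async function loadWithRetry(load, { minChars = 500, maxAttempts = 2 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const text = await load(attempt);
      // Content validation: accept only pages with more than minChars characters.
      if (text.length > minChars) return { text, attempt };
      lastError = new Error(`content too short: ${text.length} chars`);
    } catch (err) {
      // Navigation failure: remember it and retry if attempts remain.
      lastError = err;
    }
  }
  throw lastError;
}

// Usage with a fake loader: the first attempt returns too little content, the second succeeds.
loadWithRetry(async (attempt) => (attempt === 1 ? "short" : "x".repeat(600)))
  .then(({ attempt }) => console.log(`loaded on attempt ${attempt}`));
```

Treating a thin page the same way as a failed navigation keeps the retry loop in one place. Whether the real tool does this is not visible in the diff.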
355
67
 
356
- Use `--debug` to see:
68
+ ### Color Confidence
357
69
 
358
- - Browser launch confirmation
359
- - Step-by-step progress logs
360
- - Full error stack traces
361
- - Extraction metrics
70
+ - High Logo, brand elements, primary buttons
71
+ - Medium Interactive elements, icons, navigation
72
+ - Low Generic UI components (filtered from display)
73
+ - Only shows high and medium confidence colors in terminal. Full palette in JSON.
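Reviewer sketch of the filtering rule stated in the list above, assuming palette entries shaped like the 0.3.0 README JSON example (`color`, `confidence`, `count`). The function name `terminalColors` is hypothetical. The `--verbose-colors` option text in `index.js` ("Show medium and low confidence colors") suggests the real default may be stricter, so this illustrates the README wording only.

```javascript
// Hypothetical helper; the entry shape follows the 0.3.0 README JSON example.
function terminalColors(palette) {
  // Keep high and medium confidence for display; the input array is left untouched,
  // so low-confidence entries remain available for the full JSON output.
  return palette.filter(
    (entry) => entry.confidence === "high" || entry.confidence === "medium"
  );
}

// Usage with made-up sample data.
const palette = [
  { color: "#3b82f6", confidence: "high", count: 45 },
  { color: "#111827", confidence: "medium", count: 12 },
  { color: "#eeeeee", confidence: "low", count: 3 },
];
console.log(terminalColors(palette).map((entry) => entry.color));
```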
362
74
 
363
75
  ## Limitations
364
76
 
365
- - Dark mode requires `--dark-mode` flag (not automatically detected)
77
+ - Dark mode requires --dark-mode flag (not automatically detected)
366
78
  - Hover/focus states extracted from CSS (not fully interactive)
367
79
  - Canvas/WebGL-rendered sites cannot be analyzed (e.g., Tesla, Apple Vision Pro demos)
368
80
  - JavaScript-heavy sites require hydration time (8s initial + 4s stabilization)
369
81
  - Some dynamically-loaded content may be missed
370
- - Default viewport is 1920x1080 (use `--mobile` for responsive analysis)
371
-
372
- ## Architecture
373
-
374
- ```
375
- dembrandt/
376
- ├── index.js # CLI entry point, command handling
377
- ├── lib/
378
- │ ├── extractors.js # Core extraction logic with stealth mode
379
- │ ├── display.js # Terminal output formatting
380
- │ └── exporters.js # Export to Tailwind CSS config (NEW)
381
- ├── output/ # Auto-saved JSON extractions (gitignored)
382
- │ ├── stripe.com/
383
- │ │ ├── 2025-11-22T14-30-45.json
384
- │ │ └── 2025-11-22T15-12-33.json
385
- │ └── github.com/
386
- │ └── 2025-11-22T14-35-12.json
387
- ├── package.json
388
- └── README.md
389
- ```
82
+ - Default viewport is 1920x1080 (use --mobile for responsive analysis)
390
83
 
391
84
  ## Ethics & Legality
392
85
 
393
86
  Dembrandt extracts publicly available design information (colors, fonts, spacing) from website DOMs for analysis purposes. This falls under fair use in most jurisdictions (USA's DMCA § 1201(f), EU Software Directive 2009/24/EC) when used for competitive analysis, documentation, or learning.
394
87
 
395
- **Legal:** Analyzing public HTML/CSS is generally legal. Does not bypass protections or violate copyright. Check site ToS before mass extraction.
88
+ Legal: Analyzing public HTML/CSS is generally legal. Does not bypass protections or violate copyright. Check site ToS before mass extraction.
396
89
 
397
- **Ethical:** Use for inspiration and analysis, not direct copying. Respect servers (no mass crawling), give credit to sources, be transparent about data origin.
90
+ Ethical: Use for inspiration and analysis, not direct copying. Respect servers (no mass crawling), give credit to sources, be transparent about data origin.
398
91
 
399
92
  ## Contributing
400
93
 
401
- Issues and pull requests welcome. Please include:
94
+ Bugs you found? Weird websites that make it cry? Pull requests (even one-liners make me happy)?
402
95
 
403
- - Clear description of the issue/feature
404
- - Example URLs that demonstrate the problem
405
- - Expected vs actual behavior
96
+ Spam me in [Issues](https://github.com/thevangelist/dembrandt/issues) or PRs. I reply to everything.
406
97
 
407
- ## License
98
+ Let's keep the light alive together.
408
99
 
409
- MIT
100
+ @thevangelist
410
101
 
411
- ## Roadmap
102
+ ---
412
103
 
413
- - [x] Dark mode extraction (via `--dark-mode` flag)
414
- - [x] Mobile viewport support (via `--mobile` flag)
415
- - [x] Clickable terminal links for modern terminals
416
- - [?] Figma integration branch: `figma-integration`
417
- - [ ] Animation/transition detection
418
- - [ ] Interactive state capture (hover, focus, active)
419
- - [ ] Multi-page analysis
420
- - [ ] Configuration file support
104
+ MIT do whatever you want with it.
package/index.js CHANGED
@@ -10,7 +10,7 @@
10
10
  import { program } from "commander";
11
11
  import chalk from "chalk";
12
12
  import ora from "ora";
13
- import { chromium } from "playwright";
13
+ import { chromium } from "playwright-core";
14
14
  import { extractBranding } from "./lib/extractors.js";
15
15
  import { displayResults } from "./lib/display.js";
16
16
  import { writeFileSync, mkdirSync } from "fs";
@@ -23,6 +23,7 @@ program
23
23
  .argument("<url>")
24
24
  .option("--json-only", "Output raw JSON")
25
25
  .option("-d, --debug", "Force visible browser")
26
+ .option("--save-output", "Save JSON file to output folder")
26
27
  .option("--verbose-colors", "Show medium and low confidence colors")
27
28
  .option("--dark-mode", "Extract colors from dark mode")
28
29
  .option("--mobile", "Extract from mobile viewport")
@@ -93,8 +94,8 @@ program
93
94
 
94
95
  console.log();
95
96
 
96
- // Save JSON output automatically (unless --json-only)
97
- if (!opts.jsonOnly) {
97
+ // Save JSON output only if --save-output is specified
98
+ if (opts.saveOutput && !opts.jsonOnly) {
98
99
  try {
99
100
  const domain = new URL(url).hostname.replace("www.", "");
100
101
  const timestamp = new Date()