similarbuild 0.3.2 → 0.3.3
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "similarbuild",
-  "version": "0.3.
+  "version": "0.3.3",
   "description": "Visual migration framework for Claude Code — clone a live page, get a paste-ready WordPress/Elementor or Shopify section file, validated and auto-corrected.",
   "type": "module",
   "bin": {
@@ -402,6 +402,8 @@ node .claude/skills/sb-inspect-live/scripts/inspect-live.mjs \
 
 **Home inspection reuse (V03-0a):** if `page.type === 'home'` AND `globalsExtracted.homeInspectionPath !== null`, SKIP this inspection and reuse `globalsExtracted.homeInspectionPath` as the `{inspection-path}` for this step. Step 3.5a has already inspected the home page; re-running it would cost ~5-8s and produce identical content. Log `[build-site] Step 4b: re-using home inspection from Step 3.5 ({path})`.
 
+**Organized section crops (V03-3):** after each inspection, copy or link the crops from `{inspection-path}/sections/*.png` to `{output_folder}/{project-slug}/screenshots/{page.slug}/` (with a page prefix such as `home/`, `pdp/originals/`, `pages/privacy-policy/`, etc., derived from `page.type + page.slug`). This gives the user an inspectable `screenshots/<page>/<idx>-<sectionType>.png` structure that they can open in a browser or file manager to see WHAT the composer received as input. Use `ln -sf` (symlink) or `cp`; do not move the files, because the inspection also needs the original folder for cross-validation.
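The copy/link step described above can be planned as a pure mapping before any file is touched. The sketch below is illustrative only: `pagePrefix`, `planCropLinks`, the `'pdp'` type name, and the `pages/` fallback are assumptions, not names taken from the package. Only the `screenshots/<page>/<idx>-<sectionType>.png` layout comes from the text.

```javascript
// Hypothetical helper: maps a page to its screenshots sub-folder.
// The type names ('home', 'pdp') and the 'pages/' fallback are assumptions.
function pagePrefix(page) {
  if (page.type === 'home') return 'home'
  if (page.type === 'pdp') return `pdp/${page.slug}`
  return `pages/${page.slug}`
}

// Builds the list of { src, dest } pairs without touching the disk.
// Each crop keeps its file name, e.g. "01-hero.png".
function planCropLinks(sectionCrops, screenshotsRoot, page) {
  const destDir = `${screenshotsRoot}/${pagePrefix(page)}`
  return sectionCrops.map((crop) => {
    const fileName = crop.path.split('/').pop()
    return { src: crop.path, dest: `${destDir}/${fileName}` }
  })
}

const cropPlan = planCropLinks(
  [{ idx: 1, sectionType: 'hero', path: 'out/inspect/sections/01-hero.png' }],
  'out/my-shop/screenshots',
  { type: 'page', slug: 'privacy-policy' },
)
console.log(cropPlan[0].dest)
```

Each pair could then be handed to Node's `fs.promises.symlink` or `fs.promises.copyFile`. A symlink target is resolved relative to the link's own directory, so resolving `src` to an absolute path first is the safer choice.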
+
 Capture `inspection`. Batch-specific branches:
 
 - `inspection.widgetBlocked === true` → mark the page as `❌` in `pageResults[]`, note the reason, and **continue to the next page** (do not stop the batch). Exception: if this is the FIRST page AND it is the `home`, stop and escalate; the whole site is probably behind a bot wall.
@@ -574,8 +576,9 @@ Record in `pageResults[].coverage = { buildHeight, liveMainHeight, ratio, source:
 
 Additional pre-checks (identical to `/build-page`):
 1. `--no-auto-correct` was passed → escalate on the first diff of each page: routes that would go to Step 4j become ⚠️ directly.
-2. `iteration >= auto_correct_max_iterations` (default 2) →
-3. **Same `violations[]` 2 iterations in a row** → the fixHint did not take. Do not run a 3rd.
+2. `iteration >= auto_correct_max_iterations` (default 2) → page goes to **❌** (NOT ⚠️) per the V03-3 zero-fabrication policy. The composer had its 2 attempts and couldn't produce a faithful build; shipping the partial output as "warning" would be misleading. Mark `pageResults[].status = ❌`, message: `"compose-failed after 2 attempts — see TODOs in {output-path} for unreadable sections"`. The fragment IS still written so the user can inspect/edit it manually, but the orchestrator's contract with the user (per the V03-3 hard-fail rule) is: don't claim something is ready when it's not.
+3. **Same `violations[]` 2 iterations in a row** → the fixHint did not take. Do not run a 3rd. Immediate ❌.
+4. **(V03-3) `review.violations[]` includes `composition-fabrication-detected` or `composition-todo-bloat`** (composer self-flagged unreadable sections) → goes to Step 4j once. If the 2nd attempt also flags, ship as ❌, not as ⚠️.
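The four pre-checks above amount to a small decision function. The following is a minimal sketch under stated assumptions: `nextAction`, `sameViolations`, and the option names are hypothetical, violations are treated as plain string ids, and `iteration` is taken to count compose attempts already made. Only the two violation ids and the default of 2 come from the text.

```javascript
// Violation ids quoted from pre-check 4; treating violations as plain
// string ids is an assumption about the shape of review.violations[].
const FABRICATION_FLAGS = ['composition-fabrication-detected', 'composition-todo-bloat']

// True when two violation lists hold the same ids, ignoring order.
function sameViolations(a, b) {
  const x = [...a].sort()
  const y = [...b].sort()
  return x.length === y.length && x.every((v, i) => v === y[i])
}

// Returns 'warn' (⚠️), 'fail' (❌) or 'retry' (go to Step 4j).
// `iteration` is assumed to count compose attempts already made.
function nextAction({ noAutoCorrect, iteration, maxIterations = 2, violations, previousViolations }) {
  if (noAutoCorrect) return 'warn' // pre-check 1
  if (iteration >= maxIterations) return 'fail' // pre-check 2
  if (previousViolations && sameViolations(violations, previousViolations)) return 'fail' // pre-check 3
  const flagged = violations.some((v) => FABRICATION_FLAGS.includes(v))
  if (flagged && iteration >= 2) return 'fail' // pre-check 4: second flagged attempt
  return 'retry'
}

console.log(nextAction({ iteration: 1, violations: ['composition-todo-bloat'] }))
```

With the default limit of 2, pre-check 4 is already covered by pre-check 2; the separate branch only matters if the limit is raised.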
 
 ### Step 4j — Auto-correct iteration (CLOSED loop per page)
 
@@ -39,32 +39,45 @@ A single `.html` file written to `outputPath`. The file is a fragment — no `<h
 
 1. **Read the inputs.** Parse `inspection.json` (capture `sectionType`, `tokens`, `dom`, **`domLive`**, `pseudoElements`, `imgUrls`, **`hydratedHeader`**, **`hydratedFooter`**, and `screenshot` path) and `assets-map.json` (the URL → localPath / inline-SVG dictionary). If `fixHints` is given, also read `previousHtmlPath`.
 
-**§V03-
+**§V03-3 — Section-level vision composition with ZERO-FABRICATION enforcement (REPLACES §V03-2).** The previous full-page screenshot approach (§V03-2) failed because page screenshots 24000+ pixels tall get downscaled when read, making digits and labels unreadable; the LLM then filled the gaps with plausible-but-wrong content (fabricated FAQ answers, wrong prices, hallucinated product counts, an "old version of the site" pulled from training data).
 
-**
+v0.3.3 fixes this with **section crops in native resolution**: `inspection.sectionCrops[]` carries one image per visual band (hero, products grid, FAQ, etc.) at the viewport's native width (390px on mobile). Each crop is HD-readable for the section it represents.
 
-
-2. **Identify visible components** in the section bbox you're composing: gallery images (count and approximate aspect), price (current + compare-at if struck through), sale banners, variant selectors (size dropdowns, color swatches), tier pricing offers, CTAs (color, label text), reviews snippets, trust badges, FAQ accordions, etc.
-3. **For each visible component, emit REAL markup** reflecting what you saw. Examples:
-- 6-thumb gallery visible → emit `<ul class="pdp__thumbs"><li><img src="…" alt="…"></li>×6</ul>` with the thumb URLs taken from `inspection.imgUrls` filtered by Shopify-CDN patterns + product-slug match. If a thumb URL isn't in imgUrls, emit a `<!-- TODO: thumb N — visible in screenshot at (x,y), no src in inspection -->` comment and a placeholder `<img>` so the user knows.
-- Price "$29.00" with strikethrough "$34.00" → `<span class="pdp__price">$29.00</span> <s class="pdp__compare">$34.00</s>` — read the digits LITERALLY from the screenshot, do not pull from a different DOM source unless they match.
-- "Mother's Day Sale — Buy 2 Pairs, Get 2 FREE" banner → emit the banner with the EXACT text you read.
-- 3 tier offers (1 Pair $29, 2 Pairs + 2 FREE $58, 3 Pairs + 5 FREE $88) → emit 3 radio cards with the literal pricing and "Best Seller"/"Best Value" badges as you see them.
-- Variant selectors (Size XL dropdown, Color "Classic Black" dropdown) → emit `<select>` elements with option lists matching what's visible (you may not see all options if dropdown is closed — emit the ones visible + a `<!-- TODO: additional options likely below the fold -->` comment).
-- CTA "ADD TO CART" red button → emit `<button class="pdp__cta">ADD TO CART</button>` with red background color sampled from your visual reading (or use a known brand red from `inspection.tokens.colors`).
+**Workflow when composing a body page** (anything NOT `--target-section=header|footer`):
 
-
-- **Texts and visible structure** → read from screenshot (most reliable for lazy-hydrated content).
-- **`href` for clickable links** → look in `inspection.domLive` / `inspection.hydratedHeader` / `inspection.hydratedFooter` / `inspection.imgUrls` for matching elements (by visible link text or image alt). If no match, emit `href="#"` plus a `<!-- TODO: resolve href for "…" -->` comment.
-- **`form action`, `method`, hidden inputs** → ALWAYS from `inspection.hydratedFooter.forms` / `inspection.dom` / `inspection.domLive` (never guess these — submission will break).
-- **`src` for images** → resolve via `assetsMap.assets[url].localPath` when URL appears in `inspection.imgUrls`. When you see an image in the screenshot but can't find its URL in imgUrls (custom-element-rendered img), emit a placeholder + TODO comment.
-- **Computed style (fonts, colors, spacing)** → from `inspection.tokens` + `domLive[].computedStyle` (canonical for layout-derivable values like body background).
+1. **For each section you compose**, find the matching entry in `inspection.sectionCrops[]` by approximate bbox.y range and `sectionType` keyword. Read the crop image via Read tool — `Read({ file_path: cropEntry.path })`.
 
-
+2. **CRITICAL — Zero-fabrication rules. Treat these as hard contract:**
+- **Literal text** (prices, review counts, button labels, headings, badge text, product names, percentages, dimensions): emit ONLY what you can read clearly in the crop AT NATIVE RESOLUTION. If you're not 100% sure of the exact characters (one digit looks like another, text is too small even in native, etc.), DO NOT GUESS — emit a `<!-- TODO: <visual description>, unreadable -->` comment + structural placeholder.
+- **FAQ answers**: NEVER write FAQ answer bodies based on "plausible content for this brand". If `<details>`/`<summary>` accordion is visible in the crop but answers are collapsed, emit each `<summary>` text verbatim from the crop + empty `<div class="faq-a"><!-- TODO: answer body collapsed in source — open accordion or fetch live --></div>`. Same for reviews: if Loox/Yotpo widget appears as the band, emit empty `<div class="reviews-mount"><!-- TODO: third-party reviews widget, integrate at deploy time --></div>` — DO NOT compose fake reviews.
+- **Product counts**: count cards/thumbs in the crop. If there are 4 product cards visible, emit exactly 4. Do not "round" to 3 because typical Shopify themes have 3.
+- **Cross-validate every numeric literal against `inspection.domLive` text nodes.** Before emitting `<span>$29.00</span>`: grep `inspection.domLive` recursively for any text node containing `$29` or `29.00`. If found → safe to emit. If NOT found → emit `<!-- TODO: price "$29.00" read from crop but not present in DOM; verify -->` instead.
+- **Site version awareness**: if the crop shows a banner like "Mother's Day Sale", "Black Friday Sale", limited-edition badges — emit them. Do NOT skip "because the site usually doesn't have this". The crop captured the LIVE state.
+- **NEVER complete content based on knowledge of the public site.** You may have seen this URL in training data. That data is OLD. The crop is NEW. Crop wins, training data is irrelevant.
 
-
+3. **Emit real markup reflecting what the crop shows:**
+- Counted gallery thumbs → emit each `<img>` with src resolved through `assetsMap.assets[url].localPath` matching URLs in `inspection.imgUrls`. If you can see 6 thumbs but `imgUrls` only has 3 matching URLs, emit 3 real `<img>` + 3 placeholders with `<!-- TODO: thumb N visible in crop, no src in inspection.imgUrls -->`.
+- Literal price `$29.00` (cross-validated against DOM) → `<span class="price">$29.00</span> <s class="compare">$34.00</s>`.
+- Banner with exact visible text → emit verbatim.
+- Variant selectors: emit `<select>` with option list MATCHING WHAT'S VISIBLE. Inferred options (XS/XXXL not visible in crop) → emit only what's seen + `<!-- TODO: additional sizes likely available -->`.
+- CTA: read the literal label text + observe button color. Emit with the right text + use color from `inspection.tokens.colors` for accent/primary.
 
-
+4. **Hybridize DOM + vision (data source by field type):**
+- **Texts/structure/counts** → crop image (truth of current state).
+- **Hrefs** → `inspection.domLive` / `hydratedHeader` / `hydratedFooter` / `imgUrls` match by visible text. No match → `href="#"` + TODO.
+- **Form action, method, hidden inputs** → ALWAYS DOM (never guess; otherwise submission breaks).
+- **Image src** → `assetsMap.assets[url].localPath` resolved by visible alt or src pattern.
+- **Computed style** → `inspection.tokens` + `domLive.computedStyle` for primary/accent colors and font tokens.
+
+5. **Self-validation before submit:** before calling `build-wp.mjs write`, scan the composed HTML for fabrication signals:
+- Every numeric literal `$NNNN` or `NNN reviews` MUST either (a) appear in a crop you actually read, or (b) be marked with a nearby TODO comment.
+- Every FAQ `<div class="faq-a">` body MUST be non-empty ONLY if you literally read it from a crop. Empty bodies + TODO comment is the correct output when accordions are collapsed.
+- Reviews widget bands → empty mount-div + TODO. Never fake review text.
+If any violation is found, REWRITE before submitting. If you can't make it pass, return preflight error `composition-fabrication-detected` and let the orchestrator's Step 4j re-try with feedback.
+
+6. **Hard-fail after 2 attempts.** The orchestrator (Step 4j) limits to 2 compose iterations. If iteration 2 still has fabrication signals, the orchestrator marks the page `❌` and ships a structural placeholder fragment with TODO comments — NEVER ship plausible-but-wrong content as "ready" or "partial".
+
+This contract enforces what the user explicitly asked for: "you can't deliver me something different, and if you can't manage it, fail on the 2nd attempt".
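The cross-validation rule in step 2 (check a price read from the crop against `inspection.domLive` text before emitting it) can be sketched as follows. This is an illustration, not the package's implementation: `collectText` and `emitPrice` are hypothetical names, and the `text` / `children` fields are an assumed node shape.

```javascript
// Collects every text string found in a domLive-like tree.
// The `text` and `children` field names are assumptions about the node shape.
function collectText(nodes, out = []) {
  for (const node of nodes) {
    if (!node || typeof node !== 'object') continue
    if (typeof node.text === 'string') out.push(node.text)
    if (Array.isArray(node.children)) collectText(node.children, out)
  }
  return out
}

// Emits the price span only when the literal is found in the DOM text;
// otherwise emits the TODO comment quoted in the rule.
function emitPrice(price, domLive) {
  const roots = Array.isArray(domLive) ? domLive : [domLive]
  const haystack = collectText(roots).join('\n')
  const bare = price.replace(/^\$/, '') // "$29.00" -> "29.00"
  const whole = price.replace(/\.\d+$/, '') // "$29.00" -> "$29"
  if (haystack.includes(bare) || haystack.includes(whole)) {
    return `<span class="price">${price}</span>`
  }
  return `<!-- TODO: price "${price}" read from crop but not present in DOM; verify -->`
}

const sampleDom = [{ tag: 'body', children: [{ tag: 'span', text: 'Now $29.00' }] }]
console.log(emitPrice('$29.00', sampleDom))
console.log(emitPrice('$34.00', sampleDom))
```

A plain substring match is loose: `$29` would also match `$290`. A stricter variant might require a non-digit after the match, at the cost of more false TODOs.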
 
 **§V03-1 — Use `domLive` as the canonical body tree when present.** When `inspection.domLive` is non-null, it holds the live-walker snapshot taken BEFORE Cap A substituted `dom[]` with the shadow-flattened tree. The flattened `dom[]` carries `bbox={0,0,0,0}` and empty `computedStyle` because parseHTMLUnsafe returns a detached doc — it's structurally rich (gallery imgs, custom-element children) but useless for layout. The composer needs real bboxes, real `computedStyle.background`, real heights. Always prefer `inspection.domLive` for body section composition (bbox, computedStyle, hero detection, section ordering). Use `inspection.dom` only when `domLive === null` (page had no shadow roots — flatten didn't fire) or when you specifically need shadow-flattened content like a PDP gallery (consult `dom` for image-rich PDP nodes, but use `domLive` for the surrounding layout).
 
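The §V03-1 preference order can be written as a small selector. This is a hedged sketch: `pickBodyTree` and `hasLayout` are hypothetical names, and the `x/y/w/h` bbox keys are assumed from the inspection script's own `bboxKey` helper.

```javascript
// Prefers domLive (real bboxes and computed styles) and falls back to
// dom only when domLive is absent, as §V03-1 describes.
function pickBodyTree(inspection) {
  const live = inspection.domLive
  if (live === null || live === undefined) {
    return { source: 'dom', tree: inspection.dom }
  }
  return { source: 'domLive', tree: Array.isArray(live) ? live : [live] }
}

// A node from the shadow-flattened tree carries bbox={0,0,0,0};
// such a node is not usable for layout decisions.
function hasLayout(node) {
  const b = node.bbox
  return Boolean(b) && (b.w > 0 || b.h > 0)
}

const picked = pickBodyTree({ domLive: { tag: 'html', children: [] }, dom: [] })
console.log(picked.source, picked.tree.length)
```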
@@ -422,6 +422,164 @@ async function main() {
 
   Object.assign(result, extracted)
 
+  // §V03-3 — Section-level crops at NATIVE resolution.
+  // The full-page screenshot can be 24k+ pixels tall on long pages;
+  // when a vision model reads it, the image is downscaled to fit and
+  // small details (prices, badges, button labels) become unreadable.
+  // Capture each classified section as its own viewport-width crop in
+  // native resolution so the composer can read details accurately.
+  log('capturing section-level crops')
+  try {
+    const cropsDir = join(OUTPUT_DIR, 'sections')
+    await mkdir(cropsDir, { recursive: true })
+
+    // Strategy: capture TOP-LEVEL vertical bands. Walk into <body>/<main>
+    // and treat each direct child with meaningful height as a section
+    // crop. classifySection's sectionType (when present) is preserved
+    // as the section label; otherwise we tag with the element's tag +
+    // first class. This is more complete than relying on classifySection
+    // alone — many real-world pages have hero/trust/pillars/guarantee/
+    // banner that classifySection doesn't recognize as a known type but
+    // are clearly distinct visual bands.
+    const sectionList = []
+    const seenBbox = new Set()
+    function bboxKey(b) {
+      return `${b?.x}|${b?.y}|${b?.w}|${b?.h}`
+    }
+    function labelFromNode(n) {
+      if (n.sectionType) return n.sectionType
+      const cls = Array.isArray(n.classes) ? n.classes[0] : null
+      const tag = n.tag || 'section'
+      const slug = (cls || tag).replace(/[^a-z0-9-]/gi, '-').toLowerCase().slice(0, 32)
+      return slug || 'section'
+    }
+    const sourceDom = result.domLive
+      ? Array.isArray(result.domLive)
+        ? result.domLive
+        : [result.domLive]
+      : result.dom
+
+    // Find the host container (body or main) — walk in one level and
+    // pick the node with the most children + largest bbox.
+    function findHost(arr) {
+      for (const node of arr) {
+        if (!node || typeof node !== 'object') continue
+        if (node.tag === 'body' || node.tag === 'main') return node
+        // Recurse one level for the case the root is e.g. <html>
+        if (Array.isArray(node.children)) {
+          const inner = findHost(node.children)
+          if (inner) return inner
+        }
+      }
+      return null
+    }
+    const host = findHost(sourceDom)
+    const directChildren = host && Array.isArray(host.children) ? host.children : sourceDom
+
+    for (const child of directChildren) {
+      if (!child || typeof child !== 'object') continue
+      const bbox = child.bbox
+      if (!bbox || typeof bbox.h !== 'number') continue
+      // Skip tiny strips (probably wrappers/spacers) and zero-height.
+      if (bbox.h < 60 || bbox.w < 200) continue
+      // Skip absurdly tall single bands (entire page wrapped in one
+      // div) — fall back to nested crops in that case.
+      if (bbox.h > 6000) {
+        if (Array.isArray(child.children)) {
+          for (const grand of child.children) {
+            if (!grand || typeof grand !== 'object' || !grand.bbox) continue
+            if (grand.bbox.h < 60 || grand.bbox.w < 200) continue
+            if (grand.bbox.h > 6000) continue
+            const key = bboxKey(grand.bbox)
+            if (seenBbox.has(key)) continue
+            seenBbox.add(key)
+            sectionList.push({
+              sectionType: labelFromNode(grand),
+              bbox: grand.bbox,
+            })
+          }
+        }
+        continue
+      }
+      const key = bboxKey(bbox)
+      if (seenBbox.has(key)) continue
+      seenBbox.add(key)
+      sectionList.push({
+        sectionType: labelFromNode(child),
+        bbox,
+      })
+    }
+
+    // Sort by y so crops are numbered top-to-bottom — matches the
+    // user's reading order through the page.
+    sectionList.sort((a, b) => (a.bbox.y || 0) - (b.bbox.y || 0))
+
+    // Use sharp to crop the already-captured full-page screenshot.
+    // This is faster (no extra page.screenshot per section) and avoids
+    // page-level scroll/layout race conditions where clip regions outside
+    // the rendered viewport return "clipped area outside resulting image".
+    // The screenshot was captured at deviceScaleFactor=3 for iPhone profile,
+    // so pixel coords are bbox-coord * dpr; we read metadata to detect.
+    let sharp = null
+    try {
+      sharp = (await import('sharp')).default
+    } catch (_) {
+      log('sharp not available — section crops skipped (install: npm i sharp)')
+      result.sectionCrops = []
+    }
+    const sectionCrops = []
+    if (sharp) {
+      const fullMeta = await sharp(screenshotPath).metadata()
+      const dpr = Math.max(1, Math.round(fullMeta.width / VIEWPORT_W))
+      let idx = 0
+      for (const sec of sectionList) {
+        idx++
+        const slug =
+          String(idx).padStart(2, '0') +
+          '-' +
+          String(sec.sectionType).replace(/[^a-z0-9-]/gi, '-').toLowerCase()
+        const cropPath = join(cropsDir, `${slug}.png`)
+        // Clamp to image bounds — bbox may extend beyond captured area.
+        const extractW = Math.max(
+          1,
+          Math.min(Math.round((sec.bbox.w || VIEWPORT_W) * dpr), fullMeta.width),
+        )
+        const extractH = Math.max(
+          1,
+          Math.min(Math.round(sec.bbox.h * dpr), fullMeta.height - Math.round(sec.bbox.y * dpr)),
+        )
+        const extractY = Math.max(0, Math.round(sec.bbox.y * dpr))
+        if (extractY + extractH > fullMeta.height || extractH < 30 * dpr) {
+          log(`section crop ${idx} ${sec.sectionType} skipped: outside image bounds`)
+          continue
+        }
+        try {
+          await sharp(screenshotPath)
+            .extract({
+              left: 0,
+              top: extractY,
+              width: extractW,
+              height: extractH,
+            })
+            .toFile(cropPath)
+          sectionCrops.push({
+            idx,
+            sectionType: sec.sectionType,
+            bbox: sec.bbox,
+            path: cropPath,
+          })
+        } catch (err) {
+          log(`section crop ${idx} ${sec.sectionType} failed: ${err?.message || err}`)
+        }
+      }
+    }
+    result.sectionCrops = sectionCrops
+    log(`section crops: ${sectionCrops.length} captured`)
+  } catch (err) {
+    log(`section crops phase failed: ${err?.message || err}`)
+    result.sectionCrops = []
+  }
+
   // §3.2 — Trigger-and-observe header hamburger menu, if present.
   // Composer was guessing "Pattern A drawer-left" from sectionType=header
   // alone, ending up with the wrong animation in example-shop (the live
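The clamp arithmetic in the hunk above (CSS-pixel bbox scaled by the device pixel ratio, then clipped to the screenshot) can be isolated as a pure function for testing. This is a sketch that mirrors the added lines: `cropRect` is a hypothetical name, and the sample image size of 1170x30000 with a 390px viewport is an assumed example, not a value from the package.

```javascript
// Converts a CSS-pixel bbox into an extract rectangle in image pixels.
// Returns null when the band falls outside the image or is thinner
// than 30 CSS pixels after clamping, matching the skip condition above.
function cropRect(bbox, imageW, imageH, viewportW) {
  const dpr = Math.max(1, Math.round(imageW / viewportW))
  const top = Math.max(0, Math.round(bbox.y * dpr))
  const width = Math.max(1, Math.min(Math.round((bbox.w || viewportW) * dpr), imageW))
  const height = Math.max(
    1,
    Math.min(Math.round(bbox.h * dpr), imageH - Math.round(bbox.y * dpr)),
  )
  if (top + height > imageH || height < 30 * dpr) return null
  return { left: 0, top, width, height }
}

// A 390px-wide viewport captured at 3x gives a 1170px-wide screenshot.
const heroRect = cropRect({ x: 0, y: 100, w: 390, h: 500 }, 1170, 30000, 390)
console.log(heroRect)
```

Keeping the arithmetic separate from the `sharp` call makes the out-of-bounds cases checkable without an image on disk.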