similarbuild 0.3.3 → 0.3.5

package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "similarbuild",
3
- "version": "0.3.3",
3
+ "version": "0.3.5",
4
4
  "description": "Visual migration framework for Claude Code — clone a live page, get a paste-ready WordPress/Elementor or Shopify section file, validated and auto-corrected.",
5
5
  "type": "module",
6
6
  "bin": {
@@ -37,47 +37,81 @@ A single `.html` file written to `outputPath`. The file is a fragment — no `<h
37
37
 
38
38
  ## On Activation
39
39
 
40
- 1. **Read the inputs.** Parse `inspection.json` (capture `sectionType`, `tokens`, `dom`, **`domLive`**, `pseudoElements`, `imgUrls`, **`hydratedHeader`**, **`hydratedFooter`**, and `screenshot` path) and `assets-map.json` (the URL → localPath / inline-SVG dictionary). If `fixHints` is given, also read `previousHtmlPath`.
40
+ 1. **Read the inputs.** Parse `inspection.json` (capture `sectionType`, `tokens`, `dom`, **`domLive`**, `pseudoElements`, `imgUrls`, **`hydratedHeader`**, **`hydratedFooter`**, and **`sectionCrops[]`**) and `assets-map.json` (the URL → localPath / inline-SVG dictionary). If `fixHints` is given, also read `previousHtmlPath`.
41
41
 
42
- **§V03-3 Section-level vision composition with ZERO-FABRICATION enforcement (REPLACES §V03-2).** The previous full-page screenshot approach (§V03-2) failed because page screenshots 24000+ pixels tall get downscaled when read, making digits/labels unreadable; the LLM then filled the gaps with plausible-but-wrong content (fabricated FAQ answers, wrong prices, hallucinated product counts, "old version of the site" pulled from training data).
42
+ **CANONICAL VISUAL INPUT = `sectionCrops[]`.** Do NOT read `inspection.screenshot` (full-page): it is downscaled when loaded as an image, and digits become unreadable. The crops are HD-readable per section. The composer uses one crop per section it composes.
43
43
 
44
- v0.3.3 fixes this with **section crops in native resolution**: `inspection.sectionCrops[]` carries one image per visual band (hero, products grid, FAQ, etc.) at the viewport's native width (390px on mobile). Each crop is HD-readable for the section it represents.
44
+ **§V03-4 FIVE-WAY section classification + ZERO-FABRICATION HARD ENFORCEMENT (replaces §V03-3).** Before composing ANY section, classify it into ONE of five categories. Pick the FIRST category that matches. Each category has a distinct compose recipe. Defaulting to a placeholder is FORBIDDEN; a placeholder is only the last resort, for category (E).
45
45
 
46
46
  **Workflow when composing a body page** (anything NOT `--target-section=header|footer`):
47
47
 
48
- 1. **For each section you compose**, find the matching entry in `inspection.sectionCrops[]` by approximate bbox.y range and `sectionType` keyword. Read the crop image via Read tool — `Read({ file_path: cropEntry.path })`.
48
+ 1. **Find the section's crop.** For each section you compose, find the matching `inspection.sectionCrops[]` entry by `bbox.y` proximity. Read the crop via `Read({ file_path: cropEntry.path })`.
49
49
 
50
- 2. **CRITICAL Zero-fabrication rules. Treat these as hard contract:**
51
- - **Literal text** (prices, review counts, button labels, headings, badge text, product names, percentages, dimensions): emit ONLY what you can read clearly in the crop AT NATIVE RESOLUTION. If you're not 100% sure of the exact characters (one digit looks like another, text is too small even in native, etc.), DO NOT GUESS — emit a `<!-- TODO: <visual description>, unreadable -->` comment + structural placeholder.
52
- - **FAQ answers**: NEVER write FAQ answer bodies based on "plausible content for this brand". If `<details>`/`<summary>` accordion is visible in the crop but answers are collapsed, emit each `<summary>` text verbatim from the crop + empty `<div class="faq-a"><!-- TODO: answer body collapsed in source — open accordion or fetch live --></div>`. Same for reviews: if Loox/Yotpo widget appears as the band, emit empty `<div class="reviews-mount"><!-- TODO: third-party reviews widget, integrate at deploy time --></div>` — DO NOT compose fake reviews.
53
- - **Product counts**: count cards/thumbs in the crop. If there are 4 product cards visible, emit exactly 4. Do not "round" to 3 because typical Shopify themes have 3.
54
- - **Cross-validate every numeric literal against `inspection.domLive` text nodes.** Before emitting `<span>$29.00</span>`: grep `inspection.domLive` recursively for any text node containing `$29` or `29.00`. If found → safe to emit. If NOT found → emit `<!-- TODO: price "$29.00" read from crop but not present in DOM; verify -->` instead.
55
- - **Site version awareness**: if the crop shows a banner like "Mother's Day Sale", "Black Friday Sale", limited-edition badges — emit them. Do NOT skip "because the site usually doesn't have this". The crop captured the LIVE state.
56
- - **NEVER complete content based on knowledge of the public site.** You may have seen this URL in training data. That data is OLD. The crop is NEW. Crop wins, training data is irrelevant.
50
+ 2. **Classify the section** into A / B / C / D / E:
57
51
 
58
- 3. **Emit real markup reflecting what the crop shows:**
59
- - Counted gallery thumbs → emit each `<img>` with src resolved through `assetsMap.assets[url].localPath` matching URLs in `inspection.imgUrls`. If you can see 6 thumbs but `imgUrls` only has 3 matching URLs, emit 3 real `<img>` + 3 placeholders with `<!-- TODO: thumb N visible in crop, no src in inspection.imgUrls -->`.
60
- - Literal price `$29.00` (cross-validated against DOM) → `<span class="price">$29.00</span> <s class="compare">$34.00</s>`.
61
- - Banner with exact visible text → emit verbatim.
62
- - Variant selectors: emit `<select>` with option list MATCHING WHAT'S VISIBLE. Inferred options (XS/XXXL not visible in crop) → emit only what's seen + `<!-- TODO: additional sizes likely available -->`.
63
- - CTA: read the literal label text + observe button color. Emit with the right text + use color from `inspection.tokens.colors` for accent/primary.
52
+ **(A) IMAGE-BLOCK**: the section is dominated by a single `<img>` element. Detect by either:
53
+ - `inspection.domLive` (or `dom`): the section's root node has a child `<img>` whose `bbox` covers ≥80% of the section's `bbox`, OR
54
+ - Wrapper class match: ancestor has class containing `product-info__image`, `image-with-text`, `hero-image`, `trust-badges`, `features-points`, `mothers-day`, `single-image`, `banner-image`, or similar "single asset" pattern.
55
+ - In Shopify Dawn/OS 2.0: **`<div class="product-info__image">` is THE canonical flag** — every match is an image-block.
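Editorial sketch (not part of the package): the ≥80% bbox-coverage test for category (A) could look roughly like the following. The node shape (`tag`, `bbox` with `x`/`y`/`width`/`height`, `children`) and the function names are assumptions for illustration; the real `inspection.domLive` schema may differ, and the wrapper-class heuristic is not covered here.

```javascript
// Hypothetical shapes: bbox = { x, y, width, height }; node = { tag, bbox, children }.
function area(b) {
  return Math.max(0, b.width) * Math.max(0, b.height)
}

// Overlap area of two axis-aligned boxes (0 when they do not intersect).
function intersectionArea(a, b) {
  const w = Math.min(a.x + a.width, b.x + b.width) - Math.max(a.x, b.x)
  const h = Math.min(a.y + a.height, b.y + b.height) - Math.max(a.y, b.y)
  return w > 0 && h > 0 ? w * h : 0
}

// True when a direct <img> child covers at least `threshold` of the section.
function isImageBlock(section, threshold = 0.8) {
  const total = area(section.bbox)
  if (total === 0) return false // e.g. a detached node with an all-zero bbox
  return (section.children || []).some(
    (child) =>
      child.tag === 'img' &&
      intersectionArea(child.bbox, section.bbox) / total >= threshold
  )
}

const banner = {
  tag: 'section',
  bbox: { x: 0, y: 100, width: 390, height: 400 },
  children: [{ tag: 'img', bbox: { x: 0, y: 120, width: 390, height: 360 }, children: [] }],
}
const textHeavy = {
  tag: 'section',
  bbox: { x: 0, y: 100, width: 390, height: 400 },
  children: [{ tag: 'img', bbox: { x: 0, y: 100, width: 390, height: 200 }, children: [] }],
}
console.log(isImageBlock(banner), isImageBlock(textHeavy))
```

Measuring the intersection rather than the raw image area keeps an image that overflows its section from being over-counted.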
64
56
 
65
- 4. **Hybridize DOM + vision (data source by field type):**
66
- - **Texts/structure/counts** → crop image (truth of current state).
67
- - **Hrefs** → `inspection.domLive` / `hydratedHeader` / `hydratedFooter` / `imgUrls` match by visible text. No match → `href="#"` + TODO.
68
- - **Form action, method, hidden inputs** → ALWAYS DOM (never guess; broken submission else).
69
- - **Image src** → `assetsMap.assets[url].localPath` resolved by visible alt or src pattern.
70
- - **Computed style** → `inspection.tokens` + `domLive.computedStyle` for primary/accent colors and font tokens.
57
+ Recipe: **DO NOT reproduce the image's contents in HTML/CSS.** Emit:
58
+ ```html
59
+ <section class="es-{section-slug}">
60
+ <img src="{assetsMap.assets[url].localPath}" alt="{img.alt || section description}" loading="lazy" width="{img.width}" height="{img.height}" />
61
+ </section>
62
+ ```
63
+ Pull the URL from `inspection.imgUrls` matching the section's bbox / src pattern. If asset wasn't downloaded (failed extract or didn't appear in imgUrls), emit `<img src="" alt="..." data-todo="asset-missing">` + `<!-- TODO: image asset not in assetsMap (URL: …) — download manually -->`. No CSS reproduction. No SVG re-creation. No text "interpretation" of what's inside the image.
71
64
 
72
- 5. **Self-validation before submit:** before calling `build-wp.mjs write`, scan the composed HTML for fabrication signals:
73
- - Every numeric literal `$NNNN` or `NNN reviews` MUST either (a) appear in a crop you actually read, or (b) be marked with a nearby TODO comment.
74
- - Every FAQ `<div class="faq-a">` body MUST be non-empty ONLY if you literally read it from a crop. Empty bodies + TODO comment is the correct output when accordions are collapsed.
75
- - Reviews widget bands → empty mount-div + TODO. Never fake review text.
76
- If any violation found, REWRITE before submit. If you can't make it pass, return preflight error `composition-fabrication-detected` and let the orchestrator's Step 4j re-try with feedback.
65
+ **Examples from real feedback (everstride PDP)**: `mothers-day-new.svg` banner, `trust-badges-shipping.svg`, `features-points.svg`, `60-day-fit.svg` cards → all category (A).
77
66
 
78
- 6. **Hard-fail after 2 attempts.** The orchestrator (Step 4j) limits to 2 compose iterations. If iteration 2 still has fabrication signals, the orchestrator marks the page `❌` and ships a structural placeholder fragment with TODO comments. NEVER ship plausible-but-wrong content as "ready" or "partial".
67
+ **(B) PURE MARKUP**: the section is text + structured layout, no dominant image. Examples: FAQ accordion (`<details>/<summary>`), comparison tables (`<table>` with cells), pricing tiers, trust copy blocks, headings with paragraphs.
79
68
 
80
- This contract enforces what the user explicitly asked for: "you can't deliver me something different, and if you can't manage it, fail on the 2nd attempt".
69
+ Recipe: READ the crop carefully + cross-validate every literal against `inspection.domLive` text nodes. Emit real semantic markup (`<table>`, `<dl>`, `<details>`, `<ul>` of `<li>`, etc.) that mirrors the live DOM structure.
70
+
71
+ **Cross-validation is MANDATORY for every literal**:
72
+ - Numeric (`$NN`, `NN reviews`, percentages, ratings): grep `inspection.domLive` recursively for a text node containing the exact substring. NOT found → emit `<!-- TODO: literal "$29.00" read from crop but not in DOM; verify -->` + placeholder.
73
+ - Headings/copy/labels: same rule. "What Customers Say", "All rights reserved", "Mix & Match Colors" — if not in DOM verbatim, do NOT emit.
74
+ - Counts: count items visible in crop AND cross-validate against `domLive` children of the container. If mismatch, prefer DOM count (DOM is authoritative for structure) + log discrepancy.
75
+
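Editorial sketch (not part of the package): the "grep `inspection.domLive` recursively" rule for literals might be implemented along these lines. The node shape (`text`, `children`) and both function names are hypothetical; only the TODO wording is taken from the recipe above.

```javascript
// Hypothetical node shape: { text?: string, children?: node[] }.
// Collects every non-empty text value found anywhere under `node`.
function collectText(node, out = []) {
  if (!node) return out
  if (typeof node.text === 'string' && node.text.trim() !== '') out.push(node.text)
  for (const child of node.children || []) collectText(child, out)
  return out
}

// Returns the literal when some text node contains it verbatim,
// otherwise a TODO comment in the format the recipe asks for.
function emitLiteralOrTodo(literal, domLive) {
  const roots = Array.isArray(domLive) ? domLive : [domLive]
  const found = roots.some((root) =>
    collectText(root).some((text) => text.includes(literal))
  )
  return found
    ? literal
    : `<!-- TODO: literal "${literal}" read from crop but not in DOM; verify -->`
}

const sampleDom = {
  children: [{ text: 'Now $29.00' }, { children: [{ text: '1,204 reviews' }] }],
}
console.log(emitLiteralOrTodo('$29.00', sampleDom))
console.log(emitLiteralOrTodo('$34.00', sampleDom))
```

An exact-substring check is deliberately strict: a literal that only almost matches the DOM is treated as unverified rather than silently accepted.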
76
+ **(C) MIXED (image + text)** — the section has BOTH a meaningful image AND text content (heading + body paragraph + image side-by-side). Common Shopify pattern: `<images-with-text-scrolling>` custom element.
77
+
78
+ Recipe: emit `<section>` with `<img>` (rule A for the image part — use real asset URL when available) + `<h*>` heading + `<p>` paragraph(s) literal from crop, cross-validated. Do NOT collapse to image-only or text-only — preserve both.
79
+
80
+ **(D) WIDGET-RENDERED** — third-party widget (Loox reviews, "you may also like" carousel, Instagram feed) where the crop SHOWS the widget already rendered with content (reviews with names + photos + stars + quotes; product cards with prices + titles).
81
+
82
+ Recipe: READ the crop and emit real markup for each visible card/item. Cross-validate against DOM when widget exposes content as JSON-LD or hidden HTML. Mark container with `data-source="widget-name"` so the deploy team knows where to integrate the real widget later. **Do NOT emit empty mount-div when content is visible.** The "deploy time placeholder" excuse is ONLY for (E).
83
+
84
+ **(E) WIDGET-EMPTY / OPAQUE** — the crop is blank, only shows "Loading..." text, or the section is completely unreadable (zero text visible, no images, no structure). Last resort.
85
+
86
+ Recipe: emit STRUCTURAL placeholder (heading from DOM if available + `<div class="placeholder-grid">` with N visual placeholder-cards) + `<!-- TODO: <widget-name or section-description> — content not visible in crop, integrate at deploy -->`. Mount-div alone (empty `<div>`) is FORBIDDEN — placeholder must have visible structure so the deployed page is not visually broken.
87
+
88
+ 2.5. **CARD-COUNT cross-validation (V03-5)**. When the section is a grid/carousel of repeating cards (products, reviews, trust badges, gallery thumbs, testimonial photos), the crop may only show the first row OR the first few items of a scroll-x carousel. Composer MUST count cards from `inspection.domLive` (or `inspection.dom`) recursively, NOT just from the crop. Recipe:
89
+ - Find the section's root node in domLive (by bbox.y match or sectionType).
90
+ - Walk children, count direct repeat-pattern items: `.product-card`, `.review-card`, `.feature-card`, `<li>` inside `<ul>` carousel, `[role="listitem"]`, etc.
91
+ - If DOM count > crop visible count → emit DOM count items, with `<img>` placeholder + literal text (cross-validated) for the items visible in crop, and `<img>` placeholder + `<!-- TODO: card N text below the fold -->` for items beyond crop.
92
+ - This fixes bugs 3.4 (home only 3 cards), 3.6 (trust badges only 2 of 4), 4.1 (PDP gallery only 1 thumb of 6), 6.1 (collection only 1 of 3 products).
93
+ - **NEVER emit fewer items than DOM has.** Crop visibility is sample, DOM is authority for structure.
94
+
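Editorial sketch (not part of the package): counting repeat-pattern cards from the DOM and planning which ones get crop text versus a TODO could be done roughly as below. The node shape (`tag`, `classes`, `role`, `children`) and the function names are assumptions, and the fallback to the crop count when the DOM yields zero cards is an assumption of this sketch rather than something the recipe states.

```javascript
// Hypothetical node shape: { tag, classes?: string[], role?: string, children?: node[] }.
const CARD_CLASSES = ['product-card', 'review-card', 'feature-card']

function isRepeatItem(node) {
  const classes = node.classes || []
  return (
    classes.some((c) => CARD_CLASSES.includes(c)) ||
    node.role === 'listitem' ||
    node.tag === 'li'
  )
}

// Counts repeat items under a section root. It does not descend into a
// matched item, so an <li> nested inside a card is not counted twice.
function countCards(node) {
  let count = 0
  for (const child of node.children || []) {
    count += isRepeatItem(child) ? 1 : countCards(child)
  }
  return count
}

// DOM count is authoritative; cards beyond what the crop shows get a TODO.
function planCards(domCount, cropCount) {
  const total = domCount > 0 ? domCount : cropCount // assumption: fall back to crop
  return Array.from({ length: total }, (_, i) => ({
    index: i + 1,
    source: i < cropCount ? 'crop' : 'todo',
  }))
}

const section = {
  tag: 'section',
  children: [
    { tag: 'h2', children: [] },
    { tag: 'ul', children: Array.from({ length: 4 }, () => ({ tag: 'li', children: [] })) },
  ],
}
const plan = planCards(countCards(section), 2)
console.log(plan.map((p) => p.source).join(','))
```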
95
+ 3. **Hybridize sources by field type** (applies to B, C, D):
96
+ - **Texts/structure/counts** → crop, cross-validated against DOM.
97
+ - **Hrefs** → `inspection.domLive` / `hydratedHeader` / `hydratedFooter` / `imgUrls` by matching visible link text. No match → `href="#"` + TODO.
98
+ - **Form action/method/hidden inputs** → ALWAYS DOM (never guess; broken submission else).
99
+ - **Image src** → `assetsMap.assets[url].localPath` by alt/src pattern matching imgUrls.
100
+ - **Computed style** → `inspection.tokens` + `domLive.computedStyle`.
101
+
102
+ 4. **Self-validation BEFORE calling `build-wp.mjs write`** — scan composed HTML for fabrication signals:
103
+ - Every numeric/price literal must either appear in `inspection.domLive` text nodes OR have a nearby TODO comment.
104
+ - Every `<h*>` / heading text must appear in DOM verbatim OR have TODO.
105
+ - Every `<p>` text > 30 chars must appear in DOM verbatim OR have TODO.
106
+ - Every FAQ `<div class="faq-a">` body MUST be either (a) empty + TODO when accordion was closed, OR (b) verbatim from crop where accordion was open.
107
+ - Image-block sections (A) MUST NOT contain SVG reproductions or CSS-styled `<div>` mimicking the image — they MUST be `<img>` + nothing else inside the section.
108
+ If any violation → REWRITE before submit. If can't pass after 1 rewrite, return preflight error `composition-fabrication-detected` for orchestrator Step 4j retry.
109
+
110
+ 5. **Hard-fail after 2 attempts.** Orchestrator (Step 4j) caps at 2 compose iterations. 2nd iteration still flagged → page goes to **❌** with TODO-only fragment shipped (not as "ready" or "partial"). NEVER ship plausible-but-wrong content as a warning.
111
+
112
+ 6. **NEVER complete content from training data.** You may have seen this URL during training. That data is OLD. The crop is NEW. Crop wins. Training data is IRRELEVANT — never fill gaps with "what this site probably has".
113
+
114
+ This contract enforces the user's hard requirement: "you can't deliver me something different, and if you can't manage it, fail on the 2nd attempt."
81
115
 
82
116
  **§V03-1 — Use `domLive` as the canonical body tree when present.** When `inspection.domLive` is non-null, it holds the live-walker snapshot taken BEFORE Cap A substituted `dom[]` with the shadow-flattened tree. The flattened `dom[]` carries `bbox={0,0,0,0}` and empty `computedStyle` because parseHTMLUnsafe returns a detached doc — it's structurally rich (gallery imgs, custom-element children) but useless for layout. The composer needs real bboxes, real `computedStyle.background`, real heights. Always prefer `inspection.domLive` for body section composition (bbox, computedStyle, hero detection, section ordering). Use `inspection.dom` only when `domLive === null` (page had no shadow roots — flatten didn't fire) or when you specifically need shadow-flattened content like a PDP gallery (consult `dom` for image-rich PDP nodes, but use `domLive` for the surrounding layout).
83
117
 
@@ -140,24 +174,118 @@ A single `.html` file written to `outputPath`. The file is a fragment — no `<h
140
174
 
141
175
  When `--target-section=header` or `--target-section=footer` is passed AND the corresponding `inspection.hydratedHeader` / `inspection.hydratedFooter` is non-null, follow this composition recipe instead of the generic A-H pattern lookup:
142
176
 
143
- **Footer recipe** (when `hydratedFooter` present):
177
+ **Footer recipe** (when `hydratedFooter` present) — v0.3.5 rewrite:
178
+
179
+ **Composition order = DOM order of `hydratedFooter.html`.** Walk the raw HTML once to capture the visual sequence (e.g. brand → newsletter → menu columns → Need Help → social → payments → copyright). DO NOT invent an order that differs from the source — the live site's order is the canonical reference.
144
180
 
145
- 1. **Outer shell.** `<footer class="es-footer">` with reset + `box-sizing: border-box` + the page background-color taken from `inspection.tokens.colors.background` (or fallback `#fafafa`).
146
- 2. **Grouping.** Split `hydratedFooter.links` into clusters by their adjacent heading in `hydratedFooter.headings`. Order of headings reflects the live DOM order. For each heading: emit a column with `<p class="footer__col-title">{heading.text}</p>` followed by `<ul>` of `<li><a href>` from the links bucketed under that heading.
147
- - Heuristic for bucketing: walk `hydratedFooter.html` once (in your head, you have the raw outerHTML in `hydratedFooter.html` if needed) — links that appear AFTER a heading and BEFORE the next heading belong to that heading. If you can't disambiguate, fall back to grouping by `href` prefix patterns (`/policies/` → "More Information", `/products/` → "Collections", `/pages/` → split by name).
148
- 3. **Newsletter.** If `hydratedFooter.forms[]` is non-empty AND `hydratedFooter.inputs[]` contains an `email`-typed input, emit a real `<form action="{form.action}" method="{form.method}">` with the email input, preserving `name`, `placeholder`, `required`, `aria-label`. Hidden inputs (`type=hidden`) are preserved verbatim — they're typically Shopify form-type tokens. Add a submit `<button type="submit">` even if not in the source.
149
- 4. **Social media.** Detect social links by `href` matching `/facebook|instagram|tiktok|youtube|x\.com|twitter|pinterest/`. Group in a separate `<ul class="footer__social">` with inline SVG icons (you can ship the standard FB/IG icons from the bundled patterns). Skip if no matches.
150
- 5. **Payment icons.** Detect by `hydratedFooter.images[]` with `alt` matching `/amazon|visa|mastercard|amex|apple pay|google pay|discover|diners|shop pay|paypal/i` — emit those as a `<ul class="footer__payments">` with `<img>` referencing the `assetsMap` (resolve src via the standard pipeline). If `images[]` is empty but you'd expect them, fall back to text labels.
151
- 6. **Copyright.** If any link or heading text matches `© YEAR, Brand.`, preserve verbatim at the bottom.
152
- 7. **NO image-slice fallback.** When hydrated data is present, never compose a single `<img>` of the footer. The hydrated payload gives everything needed for real markup.
181
+ Use `hydratedFooter.blocks[]` (new in v0.3.5) as the primary structured data: each block has `{heading, paragraphs[], links[], mailtos[]}` already grouped per source heading.
153
182
 
154
- **Header recipe** (when `hydratedHeader` present):
183
+ 1. **Outer shell.** `<footer class="es-footer">` with reset + `box-sizing: border-box` + background from `inspection.tokens.colors.background` (fallback `#fafafa`).
184
+
185
+ 2. **Brand row (top).** When `hydratedFooter.images[]` includes a logo-looking image (alt matching brand name, or first image of the footer), emit `<a class="es-footer__brand" href="/"><img src="{assetsMap.localPath}"></a>`. Immediately below, if `hydratedFooter.paragraphs[0]` exists AND is long enough to be the brand description (typically >50 chars, sits ABOVE the column area in the live source), emit `<p class="es-footer__brand-desc">{paragraphs[0] verbatim}</p>`. This is the "Socks can be more powerful..." copy in everstride — DO NOT skip.
186
+
187
+ 3. **Newsletter (after brand description).** If `hydratedFooter.forms[]` non-empty AND `inputs[]` contains an `email`-typed input, emit `<form action="{form.action}" method="{form.method}">` with email input (preserve `name`, `placeholder`, `required`, `aria-label`) + all hidden inputs verbatim + `<button type="submit">`. **Position: directly below brand desc, ABOVE the menu columns.** Newsletter is high-conversion CTA — keep it where the live placed it.
188
+
189
+ 4. **Menu columns side-by-side (NOT stacked).** Take `hydratedFooter.blocks[]` filtered to those with `links.length >= 2` (menu-style blocks). Emit them as a grid:
190
+ ```css
191
+ .es-footer__cols.es-footer__cols {
192
+ display: grid;
193
+ grid-template-columns: repeat(2, minmax(0, 1fr));
194
+ gap: 24px;
195
+ }
196
+ @media (min-width: 750px) {
197
+ .es-footer__cols.es-footer__cols { grid-template-columns: repeat(3, minmax(0, 1fr)); }
198
+ }
199
+ ```
200
+ For each menu block: emit column with `<p class="es-footer__col-title">{block.heading.text verbatim}</p>` + `<ul>` of `<li><a href="{link.href}">{link.text verbatim}</a></li>` from `block.links`. **NEVER stack vertically on mobile** — at least 2 columns side-by-side is the canonical layout, otherwise footer becomes an absurd long vertical strip (bug 2.4).
201
+
202
+ 5. **Info blocks (Need Help? and similar).** Blocks with `paragraphs.length >= 1` AND `links.length < 2` are info-text blocks. For each: emit column with heading + `block.paragraphs[]` as `<p>` verbatim + `mailto:` links from `block.mailtos[]` as `<a href="mailto:{email}">{email}</a>`. This captures "Need Help? + Have a question? Email us at info@... + Our friendly support team is available 24/7..." (bug 2.5b — was emitting only heading without paragraphs).
203
+
204
+ 6. **Social icons.** Detect from `hydratedFooter.links` filtered by `href` matching `/facebook|instagram|tiktok|youtube|x\.com|twitter|pinterest/i`. Emit as `<ul class="es-footer__social">` with inline SVG icons (use `hydratedFooter.inlineSvgs[]` matching parent class containing `social`; fall back to bundled FB/IG inline SVG patterns). Position: between menu columns and payment icons (or wherever the source places them).
205
+
206
+ 7. **Payment icons.** Detect by EITHER:
207
+ - `hydratedFooter.images[]` with `alt` matching `/amazon|visa|mastercard|amex|apple pay|google pay|discover|diners|shop pay|paypal/i`, OR
208
+ - `hydratedFooter.inlineSvgs[]` with `ariaLabel`, `title`, or `parentClass` matching the same patterns, OR
209
+ - `hydratedFooter.inlineSvgs[]` where `parentClass` contains `payment-icons` / `footer__payment`.
210
+ Emit as `<ul class="es-footer__payments">` with `<img>` (resolved via assetsMap) OR inline SVG verbatim. If none of these match BUT you can see payment icons in the section crop, emit `<!-- TODO: payment icons visible in crop but not capturable from DOM -->`. NEVER skip the payment row when source had it (bug 2.1).
211
+
212
+ 8. **Copyright.** Capture `© YEAR, Brand.` verbatim from `hydratedFooter.html` (typically last `<p>` in the footer matching `/^(©|copyright)/i`). Emit `<p class="es-footer__copyright">© 2026, Everstride.</p>` — **verbatim only, NEVER add "All rights reserved" or any other plausible-sounding suffix** (bug 2.5a fabrication regression).
213
+
214
+ 9. **Cross-validate every literal** before emit: every block heading, every link text, every paragraph, every copyright text MUST appear verbatim in `hydratedFooter.html`. Validator (`build-wp.mjs validate --inspection-path`) will catch fabrications; pre-validate yourself to avoid rework.
215
+
216
+ 10. **NO image-slice fallback.** When `hydratedFooter` is present, never compose a single `<img>` of the footer. Hydrated payload always provides enough for real markup.
217
+
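Editorial sketch (not part of the package): steps 4 to 7 of the footer recipe amount to partitioning `hydratedFooter.blocks[]` and filtering links and images by pattern. A minimal version, assuming blocks carry `links[]` and `paragraphs[]` arrays, links carry `href`, and images carry `alt`, might read as follows. The two regexes are copied from the recipe; the function names are hypothetical.

```javascript
const SOCIAL_RE = /facebook|instagram|tiktok|youtube|x\.com|twitter|pinterest/i
const PAYMENT_RE =
  /amazon|visa|mastercard|amex|apple pay|google pay|discover|diners|shop pay|paypal/i

// Menu blocks have 2+ links; info blocks have text and fewer than 2 links.
// Blocks matching neither rule are returned in neither list.
function partitionFooterBlocks(blocks) {
  const menus = []
  const infos = []
  for (const block of blocks || []) {
    const links = block.links || []
    const paragraphs = block.paragraphs || []
    if (links.length >= 2) menus.push(block)
    else if (paragraphs.length >= 1) infos.push(block)
  }
  return { menus, infos }
}

function socialLinks(links) {
  return (links || []).filter((link) => SOCIAL_RE.test(link.href || ''))
}

function paymentImages(images) {
  return (images || []).filter((img) => PAYMENT_RE.test(img.alt || ''))
}

const sampleBlocks = [
  { heading: { text: 'Shop' }, paragraphs: [], links: [{ href: '/a' }, { href: '/b' }] },
  { heading: { text: 'Need Help?' }, paragraphs: ['Have a question?'], links: [] },
]
const { menus, infos } = partitionFooterBlocks(sampleBlocks)
console.log(menus.length, infos.length)
```

Neither regex uses the `g` flag, so repeated `test()` calls carry no `lastIndex` state between items.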
218
+ **Header recipe** (when `hydratedHeader` present) — v0.3.5 rewrite:
219
+
220
+ 1. **Announcement bar FIRST (when `inspection.hydratedAnnouncementBar` is non-null).** This is captured separately in v0.3.5 — typically a `<aside>` sibling above the `<header>` carrying promotional text ("Mother's Day Sale", "Free Shipping over $X", etc.). Emit as the very first element of `clean/global/header.html`:
221
+ ```html
222
+ <div class="es-announcement-bar">
223
+ {hydratedAnnouncementBar.text verbatim}
224
+ </div>
225
+ ```
226
+ With CSS giving it the pink/promo background read from the crop. **NEVER skip the announcement bar** when it was captured — bug 1.1 (banner not in header) AND bug 3.1 (banner ended up inline in body) both trace to this being omitted. Also remove from per-page body via stripper.
227
+
228
+ 2. **Outer shell.** `<header class="es-header">` below the announcement bar.
229
+
230
+ 3. **Layout: grid with 3 zones (centered logo).** Default mobile + desktop:
231
+ ```css
232
+ .es-header.es-header {
233
+ display: grid;
234
+ grid-template-columns: auto 1fr auto;
235
+ align-items: center;
236
+ padding: 12px 16px;
237
+ }
238
+ .es-header.es-header > .es-header__left { justify-self: start; }
239
+ .es-header.es-header > .es-header__brand { justify-self: center; }
240
+ .es-header.es-header > .es-header__right { justify-self: end; }
241
+ ```
242
+ This guarantees the brand stays VISUALLY centered regardless of how many icons are in left/right clusters, which fixes bug 1.3 (off-center logo).
243
+
244
+ 4. **Brand (center).** If `hydratedHeader.images[]` includes a logo-looking image (alt matching brand name OR class containing `logo`/`brand`), emit `<a class="es-header__brand" href="/"><img src="{assetsMap.localPath}" alt="{brand}"></a>`. If brand is text-only (no img), emit `<span class="es-header__brand">{brand-name}</span>`.
245
+
246
+ 5. **Left cluster: hamburger + search.** On mobile (`@media (max-width: 749px)`), emit `<div class="es-header__left">` with a `<button>` hamburger trigger (aria-controls a drawer) + a `<button>` search trigger. On desktop, replace hamburger with horizontal nav links (rule 7).
247
+
248
+ 6. **Right cluster: utility icons.** Links/buttons matching `/account|cart|search/` go into `<div class="es-header__right">` with proper SVG icons (use `hydratedHeader.inlineSvgs[]` matching parent class containing `account|cart|search`, OR fall back to bundled icon set).
249
+
250
+ 7. **Desktop nav (visible only ≥750px).** Take category links from `hydratedHeader.links` filtered by hrefs matching `/products|collections|categories|shop/` (top-level shop nav). Emit as `<nav class="es-header__nav-desktop"><ul>{links}</ul></nav>` between brand and right cluster, with `display: none` on mobile.
251
+
252
+ 8. **Mobile drawer (V03-5 corrected UX).** Bug 1.2 — previous output had logo inside drawer + no overlay. Correct UX:
253
+ ```html
254
+ <div class="es-drawer-backdrop" data-drawer-backdrop hidden>
255
+ <div class="es-drawer" data-drawer role="dialog" aria-modal="true">
256
+ <button class="es-drawer__close" aria-label="Close menu" data-drawer-close>✕</button>
257
+ <nav>
258
+ <ul class="es-drawer__nav">
259
+ <!-- category links from hydratedHeader.links, NO LOGO inside -->
260
+ </ul>
261
+ </nav>
262
+ <!-- Optional: account link / login at footer of drawer if source has it -->
263
+ </div>
264
+ </div>
265
+ ```
266
+ CSS:
267
+ ```css
268
+ .es-drawer-backdrop.es-drawer-backdrop[hidden] { display: none; }
269
+ .es-drawer-backdrop.es-drawer-backdrop {
270
+ position: fixed; inset: 0;
271
+ background: rgba(0,0,0,.5);
272
+ z-index: 9999;
273
+ }
274
+ .es-drawer.es-drawer {
275
+ position: absolute; top: 0; left: 0; bottom: 0;
276
+ width: min(85vw, 360px);
277
+ background: #fff;
278
+ padding: 24px 20px;
279
+ overflow-y: auto;
280
+ }
281
+ .es-drawer__close.es-drawer__close {
282
+ position: absolute; top: 12px; right: 12px;
283
+ background: none; border: 0; font-size: 24px;
284
+ }
285
+ ```
286
+ JS: vanilla toggle on hamburger click. **NO LOGO inside drawer** — logo is in the header shell, not duplicated. **Backdrop overlay is mandatory** — drawer with no backdrop is the v0.3.x bug.
155
287
 
156
- 1. **Outer shell.** `<header class="es-header">` with sticky position if the page tokens indicate so.
157
- 2. **Promo bar.** If any link's text matches a promotional pattern (`/sale|discount|free|% off/i`) OR `hydratedHeader.headings[]` contains a short standalone heading at the top, emit `<div class="es-header__promo">{text}</div>`.
158
- 3. **Brand.** If `hydratedHeader.images[]` includes one with `alt` matching the brand name (derived from URL or generic `logo`), emit `<a class="es-header__brand" href="/">` with that `<img>`.
159
- 4. **Nav.** Take `hydratedHeader.links` filtered to category-looking hrefs (`/products/...`, `/collections/...`, top-level pages). Emit as `<nav><ul>{links}</ul></nav>`. On mobile (`@media (max-width: 749px)`), collapse into a `<details><summary aria-label="Menu">` drawer — keep all category links inside.
160
- 5. **Utility icons.** Links with text "Open search", "Account", "Cart" (or matching hrefs `/search`, `/account`, `/cart`) get rendered as a right-side cluster of icon buttons.
288
+ 9. **Cross-validate every literal** (brand text, link texts, announcement bar text) against `hydratedHeader.html` / `hydratedAnnouncementBar.html` before emit. Validator catches fabrications.
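Editorial sketch (not part of the package): the "vanilla toggle on hamburger click" from step 8 could be wired roughly as below. The `data-drawer-backdrop` and `data-drawer-close` attributes come from the markup in the recipe; `data-drawer-trigger` on the hamburger button and both function names are assumptions of this sketch.

```javascript
// Toggles visibility through the backdrop's `hidden` property, which the
// recipe's CSS maps to `display: none`.
function createDrawerController(backdrop) {
  const open = () => { backdrop.hidden = false }
  const close = () => { backdrop.hidden = true }
  const toggle = () => { backdrop.hidden = !backdrop.hidden }
  return { open, close, toggle }
}

function wireDrawer(doc) {
  const backdrop = doc.querySelector('[data-drawer-backdrop]')
  // Hypothetical attribute on the hamburger button in the header's left cluster.
  const trigger = doc.querySelector('[data-drawer-trigger]')
  if (!backdrop || !trigger) return null

  const drawer = createDrawerController(backdrop)
  trigger.addEventListener('click', drawer.toggle)

  const closeButton = backdrop.querySelector('[data-drawer-close]')
  if (closeButton) closeButton.addEventListener('click', drawer.close)

  // A click on the dimmed area (the backdrop itself, not the drawer) closes it.
  backdrop.addEventListener('click', (event) => {
    if (event.target === backdrop) drawer.close()
  })
  return drawer
}

// Only wire automatically when running in a browser.
if (typeof document !== 'undefined') wireDrawer(document)
```

Keeping the state change in `createDrawerController` separate from the DOM wiring lets the toggle logic be exercised without a browser.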
161
289
 
162
290
  **Output contract for header/footer:**
163
291
  - Markup includes a top comment `<!-- sb-build-wp: composed from hydrated snapshot (V03-0a) -->`.
@@ -51,7 +51,17 @@ The Elementor + active theme stack frequently injects `!important` on these prop
51
51
  - `font-weight` — set on `h1..h6`, `body`
52
52
  - `color` — set on links via `.elementor a`, on body via theme
53
53
 
54
- Use `!important` ONLY on these four properties when they appear on critical text. Do NOT spray `!important` everywhere; overuse turns debugging into a nightmare. Specifically: do NOT `!important` margins, paddings, widths, heights, backgrounds, transforms, transitions.
54
+ ### v0.3.5 Expanded property list (after the real-world "everything flat in WordPress" bug 3.4):
55
+
56
+ The above 4-property list was insufficient: when output was published in WordPress, the products grid and other card-based layouts came out flat (no border-radius, no shadow, no gaps). The Elementor theme stack also normalizes/overrides these visual properties on `*` selectors. Add `!important` ALSO to:
57
+
58
+ - `border-radius` — on cards, buttons, image containers (themes flatten cards otherwise)
59
+ - `box-shadow` — on cards and elevated elements
60
+ - `background-color` — on color-bearing blocks (`.es-card`, `.es-cta`, `.es-banner`) when the visual identity depends on it
61
+ - `gap` (and `column-gap`, `row-gap`) — on grid/flex containers (themes that reset child margins also reset gaps)
62
+ - `padding` — on card-like containers and section blocks where the live padding is visually critical
63
+
64
+ Apply via the same chained-scope pattern (`.es-card.es-card { border-radius: 12px !important; }`). Still DO NOT spray `!important` on margin, width, height, transforms, transitions — those rarely conflict with theme.
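Editorial sketch (not part of the package): one way to keep `!important` confined to the listed properties is to generate the chained-scope rule from an allow-list. The helper name `chainedRule` is hypothetical, and the set below holds only the v0.3.5 additions; the original four properties would need to be added to it, since that list is only partly visible in this diff.

```javascript
// Properties from the v0.3.5 expanded list above.
const IMPORTANT_PROPS = new Set([
  'border-radius',
  'box-shadow',
  'background-color',
  'gap',
  'column-gap',
  'row-gap',
  'padding',
])

// Builds `.name.name { ... }` and appends !important only to allow-listed props.
function chainedRule(className, declarations) {
  const selector = `.${className}.${className}`
  const body = Object.entries(declarations)
    .map(([prop, value]) => {
      const flag = IMPORTANT_PROPS.has(prop) ? ' !important' : ''
      return `${prop}: ${value}${flag};`
    })
    .join(' ')
  return `${selector} { ${body} }`
}

console.log(chainedRule('es-card', { 'border-radius': '12px', margin: '0 auto' }))
// prints: .es-card.es-card { border-radius: 12px !important; margin: 0 auto; }
```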
55
65
 
56
66
  ## Reset universal
57
67
 
@@ -78,7 +78,7 @@ function findAll(re, str) {
78
78
  return out
79
79
  }
80
80
 
81
- function validateHtml(html) {
81
+ function validateHtml(html, inspection = null) {
82
82
  const errors = []
83
83
  const warnings = []
84
84
  const info = []
@@ -228,6 +228,103 @@ function validateHtml(html) {
228
228
  fabricationRisks.push({ section: m[1].trim(), reason: m[2].trim() })
229
229
  }
230
230
 
231
+ // 13. §V03-4 fabrication detector — when inspection JSON is provided,
232
+ // cross-validate every literal text in the composed HTML against the
233
+ // inspection's text corpus (domLive + dom + hydratedHeader.html +
234
+ // hydratedFooter.html). Literals that don't appear ANYWHERE in the
235
+ // inspection corpus are flagged as fabrications.
236
+ //
237
+ // Detected literal types:
238
+ // - prices/money: \$\d{1,5}(\.\d{2})? (dollar-prefixed amounts only)
239
+ // - count phrases: \d[\d,]{1,9}\+?\s+(reviews|customers|women|sold|pairs|...)
240
+ // - heading text content (h1-h6, 3-120 chars)
241
+ // - meaningful paragraph text, 30-400 chars
242
+ //
243
+ // Findings emit error `composition-fabrication-detected` so the orchestrator
244
+ // (Step 4j) can re-trigger composition with feedback.
245
+ const fabricationFindings = []
246
+ if (inspection && typeof inspection === 'object') {
247
+ const corpus = buildInspectionCorpus(inspection)
248
+ const looseCorpus = corpus
249
+ .toLowerCase()
250
+ .replace(/\s+/g, ' ')
251
+ .replace(/[‘’]/g, "'")
252
+ .replace(/[“”]/g, '"')
253
+ .replace(/[—–-]+/g, '-')
254
+ // strip all HTML comments (including TODO-commented regions) so they are not scanned
255
+ const htmlBody = html.replace(/<!--[\s\S]*?-->/g, '')
256
+ // (a) money literals
257
+ const moneyRe = /\$\d{1,5}(?:\.\d{2})?/g
258
+ const moneyHits = new Set()
259
+ for (let m = moneyRe.exec(htmlBody); m; m = moneyRe.exec(htmlBody)) moneyHits.add(m[0])
260
+ for (const lit of moneyHits) {
261
+ if (!corpusHas(looseCorpus, lit)) {
262
+ fabricationFindings.push({
263
+ kind: 'money',
264
+ literal: lit,
265
+ severity: 'high',
266
+ message: `Money literal "${lit}" not found in inspection corpus (domLive/dom/hydratedHeader/hydratedFooter)`,
267
+ })
268
+ }
269
+ }
270
+ // (b) count phrases like "19,479 Reviews", "240,000+ Women"
271
+ const countRe = /\b(\d[\d,]{1,9}\+?)\s+(reviews?|customers?|women|sold|pairs?|orders?|stars?|users?|clinicians?|five-star|verified)\b/gi
272
+ const countHits = new Set()
273
+ for (let m = countRe.exec(htmlBody); m; m = countRe.exec(htmlBody)) countHits.add(m[0])
274
+ for (const lit of countHits) {
275
+ if (!corpusHas(looseCorpus, lit)) {
276
+ fabricationFindings.push({
277
+ kind: 'count',
278
+ literal: lit,
279
+ severity: 'high',
280
+ message: `Count phrase "${lit}" not found in inspection corpus`,
281
+ })
282
+ }
283
+ }
284
+ // (c) headings — text content of h1..h6
285
+ const headingRe = /<h[1-6][^>]*>([\s\S]*?)<\/h[1-6]>/gi
286
+ const headingHits = new Set()
287
+ for (let m = headingRe.exec(htmlBody); m; m = headingRe.exec(htmlBody)) {
288
+ const txt = m[1].replace(/<[^>]+>/g, '').trim()
289
+ if (txt.length >= 3 && txt.length <= 120) headingHits.add(txt)
290
+ }
291
+ for (const lit of headingHits) {
292
+ if (!corpusHas(looseCorpus, lit)) {
293
+ fabricationFindings.push({
294
+ kind: 'heading',
295
+ literal: lit,
296
+ severity: 'high',
297
+ message: `Heading "${lit}" not found in inspection corpus`,
298
+ })
299
+ }
300
+ }
301
+ // (d) long paragraphs (>30 chars) — text content of <p>
302
+ const paraRe = /<p[^>]*>([\s\S]*?)<\/p>/gi
303
+ const paraHits = new Set()
304
+ for (let m = paraRe.exec(htmlBody); m; m = paraRe.exec(htmlBody)) {
305
+ const txt = m[1].replace(/<[^>]+>/g, '').trim()
306
+ if (txt.length >= 30 && txt.length <= 400) paraHits.add(txt)
307
+ }
308
+ for (const lit of paraHits) {
309
+ if (!corpusHas(looseCorpus, lit)) {
310
+ fabricationFindings.push({
311
+ kind: 'paragraph',
312
+ literal: lit.slice(0, 80) + (lit.length > 80 ? '…' : ''),
313
+ severity: 'medium',
314
+ message: `Paragraph "${lit.slice(0, 80)}${lit.length > 80 ? '…' : ''}" not found in inspection corpus`,
315
+ })
316
+ }
317
+ }
318
+ if (fabricationFindings.length > 0) {
319
+ errors.push({
320
+ rule: 'composition-fabrication-detected',
321
+ severity: 'high',
322
+ message: `§V03-4 fabrication-detector found ${fabricationFindings.length} literal(s) not present in inspection corpus. The composer should re-read the section crops and either remove the fabricated literals OR replace with TODO comments.`,
323
+ findings: fabricationFindings,
324
+ })
325
+ }
326
+ }
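
To make the literal extraction in checks (a) and (b) concrete, here is a minimal sketch that applies the same two patterns to a made-up HTML fragment. The helper `uniqueMatches` and the sample string are assumptions for illustration; the diff itself collects hits with an `exec` loop and a `Set`.

```javascript
// Same patterns as moneyRe and countRe in the hunk above.
const moneyRe = /\$\d{1,5}(?:\.\d{2})?/g
const countRe = /\b(\d[\d,]{1,9}\+?)\s+(reviews?|customers?|women|sold|pairs?|orders?|stars?|users?|clinicians?|five-star|verified)\b/gi

// Hypothetical helper: unique full matches, in order of first appearance.
// matchAll requires the g flag, which both patterns carry.
function uniqueMatches(re, text) {
  return [...new Set(Array.from(text.matchAll(re), (m) => m[0]))]
}

// Made-up fragment, not taken from any real page.
const sampleHtml = '<p>Now $29.00 (was $45). Loved by 19,479 Reviews and 240,000+ Women.</p>'
const moneyLiterals = uniqueMatches(moneyRe, sampleHtml)
const countLiterals = uniqueMatches(countRe, sampleHtml)
console.log(moneyLiterals, countLiterals)
```

Note that the count pattern needs at least two characters from `[\d,]`, so a phrase such as "1M+ Sold" or "5 stars" is not picked up by it.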
327
+
231
328
  return {
232
329
  passed: errors.length === 0,
233
330
  errorCount: errors.length,
@@ -236,9 +333,62 @@ function validateHtml(html) {
236
333
  warnings,
237
334
  info,
238
335
  fabricationRisks,
336
+ fabricationFindings,
239
337
  }
240
338
  }
241
339
 
340
+ // §V03-4: Build a single text corpus from all inspection text sources for
341
+ // fabrication cross-validation (the caller lowercases and normalizes it). Includes:
342
+ // - domLive (preferred) / dom: every node.text + node.attrs.alt / aria-label / title
343
+ // - hydratedHeader.html, hydratedFooter.html (outerHTML strings with tags stripped)
344
+ function buildInspectionCorpus(inspection) {
345
+ const parts = []
346
+ function walkText(arr) {
347
+ if (!arr) return
348
+ const stack = Array.isArray(arr) ? [...arr] : [arr]
349
+ while (stack.length) {
350
+ const n = stack.pop()
351
+ if (!n || typeof n !== 'object') continue
352
+ if (typeof n.text === 'string' && n.text.trim()) parts.push(n.text)
353
+ if (n.attrs && typeof n.attrs === 'object') {
354
+ if (typeof n.attrs.alt === 'string') parts.push(n.attrs.alt)
355
+ if (typeof n.attrs['aria-label'] === 'string') parts.push(n.attrs['aria-label'])
356
+ if (typeof n.attrs.title === 'string') parts.push(n.attrs.title)
357
+ }
358
+ if (Array.isArray(n.children)) for (const c of n.children) stack.push(c)
359
+ }
360
+ }
361
+ walkText(inspection.domLive)
362
+ if (!inspection.domLive) walkText(inspection.dom)
363
+ if (inspection.hydratedHeader && typeof inspection.hydratedHeader.html === 'string') {
364
+ parts.push(inspection.hydratedHeader.html.replace(/<[^>]+>/g, ' '))
365
+ }
366
+ if (inspection.hydratedFooter && typeof inspection.hydratedFooter.html === 'string') {
367
+ parts.push(inspection.hydratedFooter.html.replace(/<[^>]+>/g, ' '))
368
+ }
369
+ return parts.join(' \n ')
370
+ }
371
+
372
+ // Loose comparison: literal appears in the corpus with whitespace and
373
+ // punctuation normalization. Handles "$29.00" matching "$ 29.00" etc.
374
+ function corpusHas(looseCorpusLower, literal) {
375
+ if (!literal) return true
376
+ const needle = literal
377
+ .toLowerCase()
378
+ .replace(/\s+/g, ' ')
379
+ .replace(/[‘’]/g, "'")
380
+ .replace(/[“”]/g, '"')
381
+ .replace(/[—–-]+/g, '-')
382
+ .trim()
383
+ if (!needle) return true
384
+ // direct substring
385
+ if (looseCorpusLower.includes(needle)) return true
386
+ // also try removing all whitespace (handles "$29 .00" vs "$29.00")
387
+ const compact = needle.replace(/\s+/g, '')
388
+ const compactCorpus = looseCorpusLower.replace(/\s+/g, '')
389
+ return compactCorpus.includes(compact)
390
+ }
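
The loose comparison above can be tried in isolation with the sketch below. It mirrors the normalization steps from the diff, but wraps them in a hypothetical `makeCorpusMatcher` factory (an assumption of this example, not the package's API) that precomputes the whitespace-free corpus once instead of on every call.

```javascript
// Normalization mirroring the diff: lowercase, collapse whitespace,
// straighten curly quotes, fold runs of dashes into a single hyphen.
function normalizeLoose(s) {
  return s
    .toLowerCase()
    .replace(/\s+/g, ' ')
    .replace(/[\u2018\u2019]/g, "'")
    .replace(/[\u201C\u201D]/g, '"')
    .replace(/[\u2014\u2013-]+/g, '-')
    .trim()
}

// Hypothetical wrapper: normalizes the corpus once and returns a matcher.
function makeCorpusMatcher(rawCorpus) {
  const loose = normalizeLoose(rawCorpus)
  const compact = loose.replace(/\s+/g, '')
  return (literal) => {
    const needle = normalizeLoose(literal || '')
    if (!needle) return true // empty literals are never flagged
    // direct substring first, then a whitespace-free comparison
    return loose.includes(needle) || compact.includes(needle.replace(/\s+/g, ''))
  }
}

// Made-up corpus with a curly apostrophe, an em dash and a split price.
const has = makeCorpusMatcher('Mother\u2019s  Day Sale \u2014 now $ 29.00\nFree shipping')
console.log(has("Mother's Day Sale"), has('$29.00'), has('$39.00'))
```

One trade-off worth keeping in mind: the whitespace-free fallback can also match across word boundaries, so it favors fewer false alarms over strictness.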
391
+
242
392
  // --- preflight ----------------------------------------------------------------
243
393
  //
244
394
  // Pre-dispatch checklist. The composer is creative — it picks patterns and
@@ -425,8 +575,21 @@ async function main() {
425
575
  } catch (err) {
426
576
  fail(`validate: cannot read ${path}: ${err.message}`, 1)
427
577
  }
428
- log(values.verbose, `validating ${path} (${html.length} bytes)`)
429
- const report = validateHtml(html)
578
+ // §V03-4 fabrication detector — when --inspection-path is passed, load
579
+ // the inspection JSON and cross-validate composed HTML literals
580
+ // against the inspection text corpus. Without --inspection-path, the
581
+ // fabrication check is skipped (back-compat for older callers).
582
+ let inspection = null
583
+ if (values['inspection-path']) {
584
+ try {
585
+ const insRaw = await readFile(resolve(values['inspection-path']), 'utf8')
586
+ inspection = JSON.parse(insRaw)
587
+ } catch (err) {
588
+ log(values.verbose, `validate: could not load --inspection-path: ${err.message}`)
589
+ }
590
+ }
591
+ log(values.verbose, `validating ${path} (${html.length} bytes${inspection ? ', with fabrication check' : ''})`)
592
+ const report = validateHtml(html, inspection)
430
593
  process.stdout.write(`${JSON.stringify(report, null, 2)}\n`)
431
594
  process.exit(report.passed ? 0 : 3)
432
595
  }
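
Since the report is printed as JSON and the process exits with code 3 on failure, a caller can turn `fabricationFindings` into feedback for a re-composition pass. The sketch below is a hypothetical consumer, not the package's orchestrator: the function name `fabricationFeedback`, the output format and the sample report are all assumptions for illustration.

```javascript
// Hypothetical consumer (not part of similarbuild): one feedback line per
// finding, with high-severity findings listed first.
function fabricationFeedback(report) {
  const findings = Array.isArray(report.fabricationFindings) ? report.fabricationFindings : []
  const rank = { high: 0, medium: 1 }
  return [...findings]
    .sort((a, b) => (rank[a.severity] ?? 2) - (rank[b.severity] ?? 2))
    .map((f) => `[${f.severity}] ${f.kind}: "${f.literal}"`)
}

// Made-up report using only fields that appear in the diff.
const sampleReport = {
  passed: false,
  fabricationFindings: [
    { kind: 'paragraph', literal: 'Trusted by professionals everywhere', severity: 'medium' },
    { kind: 'money', literal: '$39.00', severity: 'high' },
  ],
}
const feedback = fabricationFeedback(sampleReport)
console.log(feedback.join('\n'))
```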
@@ -259,8 +259,32 @@ async function main() {
259
259
  }
260
260
  return best?.el || null
261
261
  }
262
+ // §V03-5 — announcement bar (promo bar on top of the page,
263
+ // typically <aside> sibling above the <header>). Common in e-commerce
264
+ // ("Mother's Day Sale", "Free Shipping over $X", etc.). Captured as a
265
+ // separate snapshot so the composer can compose it as part of the
266
+ // global chrome (above the header) — not as inline body content of
267
+ // the home page.
268
+ function pickAnnouncementBar() {
269
+ const candidates = document.querySelectorAll(
270
+ 'aside[class*="announcement"], [class*="announcement-bar"], [class*="promo-bar"], [class*="shopify-section-group-header-group"][class*="announcement"], aside[class*="shopify-section-group-header"]',
271
+ )
272
+ let best = null
273
+ for (const el of candidates) {
274
+ const r = el.getBoundingClientRect()
275
+ if (r.height < 20 || r.height > 200) continue
276
+ // must be near top of page
277
+ if (Math.abs(r.y + window.scrollY) > 200) continue
278
+ const txt = (el.textContent || '').trim()
279
+ if (!txt) continue
280
+ const score = txt.length / 10 + (r.width >= 320 ? 5 : 0)
281
+ if (!best || score > best.score) best = { el, score }
282
+ }
283
+ return best?.el || null
284
+ }
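
The selection rule in `pickAnnouncementBar` can be exercised without a browser by feeding it plain records. The sketch below is a pure-data restatement under that assumption: each candidate is a hypothetical `{ top, height, width, text }` object, where `top` stands in for `r.y + window.scrollY`.

```javascript
// Pure-data version of the filter and score from the diff (no DOM needed).
function pickBestBar(candidates) {
  let best = null
  for (const c of candidates) {
    if (c.height < 20 || c.height > 200) continue // too thin or too tall for a promo bar
    if (Math.abs(c.top) > 200) continue // must sit near the top of the page
    const txt = (c.text || '').trim()
    if (!txt) continue
    // longer text scores higher; a width of at least 320px adds a fixed bonus
    const score = txt.length / 10 + (c.width >= 320 ? 5 : 0)
    if (!best || score > best.score) best = { candidate: c, score }
  }
  return best ? best.candidate : null
}

// Made-up candidates for illustration.
const picked = pickBestBar([
  { top: 0, height: 12, width: 1280, text: 'Skip to content' },
  { top: 0, height: 40, width: 1280, text: 'Free Shipping over $50' },
  { top: 900, height: 40, width: 1280, text: 'Mid-page promo banner with a long text' },
  { top: 40, height: 36, width: 200, text: 'Sale' },
])
console.log(picked.text)
```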
262
285
  const footer = pickFooter()
263
286
  const header = pickHeader()
287
+ const announcement = pickAnnouncementBar()
264
288
  window.__sbHydratedFooter = footer
265
289
  ? {
266
290
  html: footer.outerHTML,
@@ -289,6 +313,29 @@ async function main() {
289
313
  })(),
290
314
  }
291
315
  : null
316
+ window.__sbHydratedAnnouncementBar = announcement
317
+ ? {
318
+ html: announcement.outerHTML,
319
+ text: (() => {
320
+ // Strip inline <script>/<style> tags before reading text —
321
+ // Shopify themes inline CSS variables and JS at the top of
322
+ // the announcement-bar section, which would otherwise leak
323
+ // into the "text" field as garbage.
324
+ const clone = announcement.cloneNode(true)
325
+ clone.querySelectorAll('script,style').forEach((n) => n.remove())
326
+ return (clone.textContent || '').replace(/\s+/g, ' ').trim()
327
+ })(),
328
+ bbox: (() => {
329
+ const r = announcement.getBoundingClientRect()
330
+ return {
331
+ x: Math.round(r.x + window.scrollX),
332
+ y: Math.round(r.y + window.scrollY),
333
+ w: Math.round(r.width),
334
+ h: Math.round(r.height),
335
+ }
336
+ })(),
337
+ }
338
+ : null
292
339
  })
293
340
 
294
341
  // Return to the top before layout reads. content-visibility:auto on
@@ -453,6 +500,84 @@ async function main() {
453
500
  const slug = (cls || tag).replace(/[^a-z0-9-]/gi, '-').toLowerCase().slice(0, 32)
454
501
  return slug || 'section'
455
502
  }
503
+
504
+ // §V03-4 image-block detection. A section is an image-block when
505
+ // its visible content is dominated by a single <img>: either by
506
+ // bbox coverage (img takes ≥80% of section area) OR by a known
507
+ // wrapper-class pattern (Shopify Dawn / OS 2.0 themes use specific
508
+ // wrapper classes like `product-info__image` to delimit a region
509
+ // whose entire visual content is one CDN-hosted SVG/PNG/JPG asset).
510
+ // The composer (sb-build-wp §V03-4 rule A) treats image-block
511
+ // sections as "download asset + emit <img src>" — NEVER tries to
512
+ // reproduce the visual contents in HTML/CSS.
513
+ const IMAGE_BLOCK_WRAPPER_PATTERNS = [
514
+ /product-info__image/i,
515
+ /image-with-text/i,
516
+ /\bhero-image\b/i,
517
+ /\btrust-badges\b/i,
518
+ /\bfeatures-points\b/i,
519
+ /\bmothers?-day\b/i,
520
+ /\bsingle-image\b/i,
521
+ /\bbanner-image\b/i,
522
+ /\bguarantee-card\b/i,
523
+ /\bbest-fit-size-chart\b/i,
524
+ ]
525
+ function classifyAsImageBlock(node) {
526
+ if (!node || typeof node !== 'object') return null
527
+ const sectionBbox = node.bbox
528
+ if (!sectionBbox || !sectionBbox.h || !sectionBbox.w) return null
529
+ const sectionArea = sectionBbox.h * sectionBbox.w
530
+ if (sectionArea <= 0) return null
531
+ // Match by wrapper-class pattern on the node itself OR any
532
+ // descendant (the wrapper might be a child of the section root).
533
+ function findClassMatch(n) {
534
+ if (!n || typeof n !== 'object') return null
535
+ if (Array.isArray(n.classes)) {
536
+ for (const cls of n.classes) {
537
+ for (const pat of IMAGE_BLOCK_WRAPPER_PATTERNS) {
538
+ if (pat.test(cls)) return cls
539
+ }
540
+ }
541
+ }
542
+ if (Array.isArray(n.children)) {
543
+ for (const c of n.children) {
544
+ const m = findClassMatch(c)
545
+ if (m) return m
546
+ }
547
+ }
548
+ return null
549
+ }
550
+ const classMatch = findClassMatch(node)
551
+ // Find the largest <img> descendant by bbox area.
552
+ let bestImg = null
553
+ function findBiggestImg(n) {
554
+ if (!n || typeof n !== 'object') return
555
+ if (n.tag === 'img' && n.bbox && n.bbox.h > 0 && n.bbox.w > 0) {
556
+ const a = n.bbox.h * n.bbox.w
557
+ if (!bestImg || a > bestImg.area) {
558
+ const src = (n.attrs && (n.attrs.src || n.attrs.srcset)) || ''
559
+ bestImg = { area: a, bbox: n.bbox, src }
560
+ }
561
+ }
562
+ if (Array.isArray(n.children)) {
563
+ for (const c of n.children) findBiggestImg(c)
564
+ }
565
+ }
566
+ findBiggestImg(node)
567
+ const imgArea = bestImg ? bestImg.area : 0
568
+ const coverage = sectionArea > 0 ? imgArea / sectionArea : 0
569
+ if (classMatch || coverage >= 0.8) {
570
+ return {
571
+ reason: classMatch
572
+ ? `wrapper-class:${classMatch}`
573
+ : `img-coverage:${(coverage * 100).toFixed(0)}%`,
574
+ imgSrc: bestImg ? bestImg.src : null,
575
+ imgBbox: bestImg ? bestImg.bbox : null,
576
+ coverage,
577
+ }
578
+ }
579
+ return null
580
+ }
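
The coverage branch of `classifyAsImageBlock` boils down to one ratio: the largest `<img>` bbox area divided by the section bbox area. The sketch below shows only that branch on a made-up node tree; the wrapper-class branch is omitted, and the helper names `area`, `biggestImg` and `imgCoverage` are assumptions of this example.

```javascript
// Area of a bbox, or 0 when the bbox is missing or degenerate.
function area(bbox) {
  return bbox && bbox.w > 0 && bbox.h > 0 ? bbox.w * bbox.h : 0
}

// Depth-first search for the <img> node with the largest bbox area.
function biggestImg(node, best = null) {
  if (!node || typeof node !== 'object') return best
  if (node.tag === 'img' && area(node.bbox) > area(best && best.bbox)) best = node
  for (const child of node.children || []) best = biggestImg(child, best)
  return best
}

// Fraction of the section area covered by its largest image.
function imgCoverage(section) {
  const sectionArea = area(section.bbox)
  if (sectionArea === 0) return 0
  const img = biggestImg(section)
  return img ? area(img.bbox) / sectionArea : 0
}

// Made-up section: one large banner image plus a small icon.
const section = {
  tag: 'section',
  bbox: { x: 0, y: 0, w: 1200, h: 500 },
  children: [
    {
      tag: 'div',
      bbox: { x: 0, y: 0, w: 1200, h: 500 },
      children: [
        { tag: 'img', bbox: { x: 0, y: 0, w: 1200, h: 450 }, attrs: { src: 'banner.png' } },
        { tag: 'img', bbox: { x: 10, y: 460, w: 32, h: 32 }, attrs: { src: 'icon.svg' } },
      ],
    },
  ],
}
const coverage = imgCoverage(section)
console.log(coverage >= 0.8 ? 'image-block' : 'regular section')
```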
456
581
  const sourceDom = result.domLive
457
582
  ? Array.isArray(result.domLive)
458
583
  ? result.domLive
@@ -496,6 +621,7 @@ async function main() {
496
621
  sectionList.push({
497
622
  sectionType: labelFromNode(grand),
498
623
  bbox: grand.bbox,
624
+ imageBlock: classifyAsImageBlock(grand),
499
625
  })
500
626
  }
501
627
  }
@@ -507,6 +633,7 @@ async function main() {
507
633
  sectionList.push({
508
634
  sectionType: labelFromNode(child),
509
635
  bbox,
636
+ imageBlock: classifyAsImageBlock(child),
510
637
  })
511
638
  }
512
639
 
@@ -567,6 +694,7 @@ async function main() {
567
694
  sectionType: sec.sectionType,
568
695
  bbox: sec.bbox,
569
696
  path: cropPath,
697
+ imageBlock: sec.imageBlock || null,
570
698
  })
571
699
  } catch (err) {
572
700
  log(`section crop ${idx} ${sec.sectionType} failed: ${err?.message || err}`)
@@ -1801,7 +1929,7 @@ function extractInPage({ selector, maxDepth, maxChildren, maxText }) {
1801
1929
  try {
1802
1930
  parsed = new DOMParser().parseFromString(snapshot.html, 'text/html')
1803
1931
  } catch (_) {
1804
- return { ...snapshot, links: [], headings: [], inputs: [], forms: [], images: [] }
1932
+ return { ...snapshot, links: [], headings: [], inputs: [], forms: [], images: [], blocks: [], paragraphs: [], inlineSvgs: [] }
1805
1933
  }
1806
1934
  const root = parsed.body.firstElementChild || parsed.body
1807
1935
  const norm = (s) => (s || '').replace(/\s+/g, ' ').trim()
@@ -1812,7 +1940,8 @@ function extractInPage({ selector, maxDepth, maxChildren, maxText }) {
1812
1940
  label: a.getAttribute('aria-label') || null,
1813
1941
  }))
1814
1942
  .filter((l) => l.href && (l.text || l.label))
1815
- const headings = Array.from(root.querySelectorAll('h1, h2, h3, h4, h5, h6, p.bold, .footer__block-title, [class*="heading"]'))
1943
+ const HEADING_SELECTOR = 'h1, h2, h3, h4, h5, h6, p.bold, .footer__block-title, [class*="heading"]'
1944
+ const headings = Array.from(root.querySelectorAll(HEADING_SELECTOR))
1816
1945
  .map((h) => ({ tag: h.tagName.toLowerCase(), text: norm(h.textContent) }))
1817
1946
  .filter((h) => h.text)
1818
1947
  const inputs = Array.from(root.querySelectorAll('input')).map((i) => ({
@@ -1833,8 +1962,96 @@ function extractInPage({ selector, maxDepth, maxChildren, maxText }) {
1833
1962
  alt: img.getAttribute('alt') || '',
1834
1963
  width: img.getAttribute('width') || null,
1835
1964
  height: img.getAttribute('height') || null,
1965
+ classList: Array.from(img.classList || []),
1966
+ parentClass: img.parentElement ? Array.from(img.parentElement.classList || []) : [],
1836
1967
  }))
1837
1968
  .filter((img) => img.src)
1969
+ // §V03-5 — inline SVG capture. Composer needs the raw SVG markup to
1970
+ // preserve ornamental decorators (Clinicians' Choice flanking flora,
1971
+ // social icons, payment cards when those are inline rather than <img>).
1972
+ // NOTE: no size filter is applied in this pass. Every <svg> under root is
1973
+ // captured, with width/height recorded alongside the raw markup.
1974
+ const inlineSvgs = Array.from(root.querySelectorAll('svg'))
1975
+ .map((svg, i) => {
1976
+ const r = svg.getBoundingClientRect ? svg.getBoundingClientRect() : { width: 0, height: 0 }
1977
+ const w = svg.getAttribute('width') || r.width || null
1978
+ const h = svg.getAttribute('height') || r.height || null
1979
+ const viewBox = svg.getAttribute('viewBox') || null
1980
+ const ariaLabel = svg.getAttribute('aria-label') || null
1981
+ const title = svg.querySelector('title')?.textContent || null
1982
+ const classes = Array.from(svg.classList || [])
1983
+ return {
1984
+ idx: i,
1985
+ outerHTML: svg.outerHTML,
1986
+ width: w,
1987
+ height: h,
1988
+ viewBox,
1989
+ ariaLabel,
1990
+ title,
1991
+ classes,
1992
+ parentClass: svg.parentElement ? Array.from(svg.parentElement.classList || []) : [],
1993
+ }
1994
+ })
1995
+ // §V03-5 — blocks: heading-anchored sub-sections. Walks DOM in source
1996
+ // order and groups text content under each heading (e.g. "Need Help?"
1997
+ // heading + its paragraphs + its mailto link). This lets the composer
1998
+ // emit semantically grouped output where the live source intended.
1999
+ const blocks = []
2000
+ let currentBlock = null
2001
+ function nodeHeadingText(el) {
2002
+ if (!el) return null
2003
+ if (el.matches && el.matches(HEADING_SELECTOR)) return norm(el.textContent)
2004
+ return null
2005
+ }
2006
+ function walkInOrder(el) {
2007
+ if (!el || el.nodeType !== 1) return
2008
+ const headingText = nodeHeadingText(el)
2009
+ if (headingText) {
2010
+ if (currentBlock) blocks.push(currentBlock)
2011
+ currentBlock = {
2012
+ heading: { tag: el.tagName.toLowerCase(), text: headingText },
2013
+ paragraphs: [],
2014
+ links: [],
2015
+ mailtos: [],
2016
+ }
2017
+ return // don't descend; siblings of the heading carry the content
2018
+ }
2019
+ // collect paragraphs / inline content for the current block
2020
+ if (currentBlock) {
2021
+ if (el.tagName === 'P') {
2022
+ // <p> can contain inline children (<a>, <strong>, etc.) — what
2023
+ // matters is that it has no block-level nested elements.
2024
+ const hasBlockChild = Array.from(el.children).some((c) =>
2025
+ /^(DIV|UL|OL|SECTION|ARTICLE|FORM|HEADER|FOOTER|NAV|FIGURE)$/.test(c.tagName),
2026
+ )
2027
+ const txt = norm(el.textContent)
2028
+ // Copyright / "All rights reserved" lines are end-of-section
2029
+ // markers — close the current block before they get attached
2030
+ // to an unrelated heading like "Need Help?".
2031
+ if (/^(©|copyright)/i.test(txt) || /all rights reserved/i.test(txt)) {
2032
+ if (currentBlock) {
2033
+ blocks.push(currentBlock)
2034
+ currentBlock = null
2035
+ }
2036
+ } else if (txt && txt.length >= 8 && !hasBlockChild) {
2037
+ currentBlock.paragraphs.push(txt)
2038
+ }
2039
+ }
2040
+ if (el.tagName === 'A') {
2041
+ const href = el.getAttribute('href') || ''
2042
+ if (href.startsWith('mailto:')) currentBlock.mailtos.push(href.replace(/^mailto:/, ''))
2043
+ else if (href) currentBlock.links.push({ href, text: norm(el.textContent) })
2044
+ }
2045
+ }
2046
+ for (const c of el.children) walkInOrder(c)
2047
+ }
2048
+ walkInOrder(root)
2049
+ if (currentBlock) blocks.push(currentBlock)
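
The heading-anchored grouping can be illustrated on plain data. The sketch below is a simplification under stated assumptions: it takes a flat, source-ordered list of hypothetical `{ tag, text, href }` records rather than walking a DOM, and it treats only `h1`..`h6` as headings (the diff uses the broader `HEADING_SELECTOR`).

```javascript
// Groups paragraphs and links under the most recent heading.
function groupByHeading(nodes) {
  const blocks = []
  let current = null
  for (const n of nodes) {
    if (/^h[1-6]$/.test(n.tag)) {
      if (current) blocks.push(current)
      current = { heading: n.text, paragraphs: [], links: [], mailtos: [] }
      continue
    }
    if (!current) continue // content before the first heading belongs to no block
    if (n.tag === 'p') {
      // copyright lines close the current block instead of joining it
      if (/^(©|copyright)/i.test(n.text) || /all rights reserved/i.test(n.text)) {
        blocks.push(current)
        current = null
      } else if (n.text.length >= 8) {
        current.paragraphs.push(n.text)
      }
    } else if (n.tag === 'a' && n.href) {
      if (n.href.startsWith('mailto:')) current.mailtos.push(n.href.slice('mailto:'.length))
      else current.links.push({ href: n.href, text: n.text })
    }
  }
  if (current) blocks.push(current)
  return blocks
}

// Made-up footer content for illustration.
const grouped = groupByHeading([
  { tag: 'h3', text: 'Need Help?' },
  { tag: 'p', text: 'We reply within one business day.' },
  { tag: 'a', text: 'Email us', href: 'mailto:help@example.com' },
  { tag: 'p', text: '© 2025 Example Store. All rights reserved.' },
  { tag: 'p', text: 'Orphan paragraph after the copyright line.' },
])
console.log(JSON.stringify(grouped, null, 2))
```

The copyright line closes the "Need Help?" block, so the paragraph that follows it is not attached to an unrelated heading.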
2050
+ // §V03-5 — paragraphs: all <p> in order, for sections without
2051
+ // heading anchors (e.g. footer brand description sitting alone).
2052
+ const paragraphs = Array.from(root.querySelectorAll('p'))
2053
+ .map((p) => norm(p.textContent))
2054
+ .filter((t) => t && t.length >= 8 && t.length < 400)
1838
2055
  return {
1839
2056
  html: snapshot.html,
1840
2057
  bbox: snapshot.bbox,
@@ -1843,10 +2060,18 @@ function extractInPage({ selector, maxDepth, maxChildren, maxText }) {
1843
2060
  inputs,
1844
2061
  forms,
1845
2062
  images,
2063
+ inlineSvgs,
2064
+ blocks,
2065
+ paragraphs,
1846
2066
  }
1847
2067
  }
1848
2068
  const hydratedHeader = extractChrome(window.__sbHydratedHeader)
1849
2069
  const hydratedFooter = extractChrome(window.__sbHydratedFooter)
2070
+ // §V03-5 announcement bar — captured separately to allow the composer
2071
+ // to emit it as part of the global chrome (clean/global/header.html
2072
+ // composed with the bar prepended) instead of leaking into per-page
2073
+ // body content.
2074
+ const hydratedAnnouncementBar = window.__sbHydratedAnnouncementBar || null
1850
2075
 
1851
2076
  return {
1852
2077
  sectionType,
@@ -1863,6 +2088,7 @@ function extractInPage({ selector, maxDepth, maxChildren, maxText }) {
1863
2088
  externalIframes,
1864
2089
  hydratedHeader,
1865
2090
  hydratedFooter,
2091
+ hydratedAnnouncementBar,
1866
2092
  }
1867
2093
  }
1868
2094