similarbuild 0.3.3 → 0.3.5
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "similarbuild",
|
|
3
|
-
"version": "0.3.
|
|
3
|
+
"version": "0.3.5",
|
|
4
4
|
"description": "Visual migration framework for Claude Code — clone a live page, get a paste-ready WordPress/Elementor or Shopify section file, validated and auto-corrected.",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"bin": {
|
|
@@ -37,47 +37,81 @@ A single `.html` file written to `outputPath`. The file is a fragment — no `<h
|
|
|
37
37
|
|
|
38
38
|
## On Activation
|
|
39
39
|
|
|
40
|
-
1. **Read the inputs.** Parse `inspection.json` (capture `sectionType`, `tokens`, `dom`, **`domLive`**, `pseudoElements`, `imgUrls`, **`hydratedHeader`**, **`hydratedFooter`**, and
|
|
40
|
+
1. **Read the inputs.** Parse `inspection.json` (capture `sectionType`, `tokens`, `dom`, **`domLive`**, `pseudoElements`, `imgUrls`, **`hydratedHeader`**, **`hydratedFooter`**, and **`sectionCrops[]`**) and `assets-map.json` (the URL → localPath / inline-SVG dictionary). If `fixHints` is given, also read `previousHtmlPath`.
|
|
41
41
|
|
|
42
|
-
|
|
42
|
+
**CANONICAL VISUAL INPUT = `sectionCrops[]`.** Do NOT read `inspection.screenshot` (full-page) — it's downscaled when loaded as image and digits become unreadable. The crops are HD-readable per section. Composer uses one crop per section it composes.
|
|
43
43
|
|
|
44
|
-
|
|
44
|
+
**§V03-4: FIVE-WAY section classification + ZERO-FABRICATION HARD ENFORCEMENT (replaces §V03-3).** Before composing ANY section, classify it into ONE of five categories. Pick the FIRST category that matches. Each category has a distinct compose recipe. Using a placeholder as the default is FORBIDDEN; a placeholder is only the last resort, for category (E).
|
|
45
45
|
|
|
46
46
|
**Workflow when composing a body page** (anything NOT `--target-section=header|footer`):
|
|
47
47
|
|
|
48
|
-
1. **For each section you compose
|
|
48
|
+
1. **Find the section's crop.** For each section you compose, find the matching `inspection.sectionCrops[]` entry by `bbox.y` proximity. Read the crop via `Read({ file_path: cropEntry.path })`.
|
|
49
49
|
|
|
50
|
-
2. **
|
|
51
|
-
- **Literal text** (prices, review counts, button labels, headings, badge text, product names, percentages, dimensions): emit ONLY what you can read clearly in the crop AT NATIVE RESOLUTION. If you're not 100% sure of the exact characters (one digit looks like another, text is too small even in native, etc.), DO NOT GUESS — emit a `<!-- TODO: <visual description>, unreadable -->` comment + structural placeholder.
|
|
52
|
-
- **FAQ answers**: NEVER write FAQ answer bodies based on "plausible content for this brand". If `<details>`/`<summary>` accordion is visible in the crop but answers are collapsed, emit each `<summary>` text verbatim from the crop + empty `<div class="faq-a"><!-- TODO: answer body collapsed in source — open accordion or fetch live --></div>`. Same for reviews: if Loox/Yotpo widget appears as the band, emit empty `<div class="reviews-mount"><!-- TODO: third-party reviews widget, integrate at deploy time --></div>` — DO NOT compose fake reviews.
|
|
53
|
-
- **Product counts**: count cards/thumbs in the crop. If there are 4 product cards visible, emit exactly 4. Do not "round" to 3 because typical Shopify themes have 3.
|
|
54
|
-
- **Cross-validate every numeric literal against `inspection.domLive` text nodes.** Before emitting `<span>$29.00</span>`: grep `inspection.domLive` recursively for any text node containing `$29` or `29.00`. If found → safe to emit. If NOT found → emit `<!-- TODO: price "$29.00" read from crop but not present in DOM; verify -->` instead.
|
|
55
|
-
- **Site version awareness**: if the crop shows a banner like "Mother's Day Sale", "Black Friday Sale", limited-edition badges — emit them. Do NOT skip "because the site usually doesn't have this". The crop captured the LIVE state.
|
|
56
|
-
- **NEVER complete content based on knowledge of the public site.** You may have seen this URL in training data. That data is OLD. The crop is NEW. Crop wins, training data is irrelevant.
|
|
50
|
+
2. **Classify the section** into A / B / C / D / E:
|
|
57
51
|
|
|
58
|
-
|
|
59
|
-
-
|
|
60
|
-
-
|
|
61
|
-
-
|
|
62
|
-
- Variant selectors: emit `<select>` with option list MATCHING WHAT'S VISIBLE. Inferred options (XS/XXXL not visible in crop) → emit only what's seen + `<!-- TODO: additional sizes likely available -->`.
|
|
63
|
-
- CTA: read the literal label text + observe button color. Emit with the right text + use color from `inspection.tokens.colors` for accent/primary.
|
|
52
|
+
**(A) IMAGE-BLOCK** — the section is dominated by a single `<img>` element. Detect by either:
|
|
53
|
+
- `inspection.domLive` (or `dom`) — the section's root node has a child `<img>` whose `bbox` covers ≥80% of the section's `bbox`, OR
|
|
54
|
+
- Wrapper class match: ancestor has class containing `product-info__image`, `image-with-text`, `hero-image`, `trust-badges`, `features-points`, `mothers-day`, `single-image`, `banner-image`, or similar "single asset" pattern.
|
|
55
|
+
- In Shopify Dawn/OS 2.0: **`<div class="product-info__image">` is THE canonical flag** — every match is an image-block.
|
|
64
56
|
|
|
65
|
-
|
|
66
|
-
|
|
67
|
-
|
|
68
|
-
|
|
69
|
-
|
|
70
|
-
|
|
57
|
+
Recipe: **DO NOT reproduce the image's contents in HTML/CSS.** Emit:
|
|
58
|
+
```html
|
|
59
|
+
<section class="es-{section-slug}">
|
|
60
|
+
<img src="{assetsMap.assets[url].localPath}" alt="{img.alt || section description}" loading="lazy" width="{img.width}" height="{img.height}" />
|
|
61
|
+
</section>
|
|
62
|
+
```
|
|
63
|
+
Pull the URL from `inspection.imgUrls` matching the section's bbox / src pattern. If asset wasn't downloaded (failed extract or didn't appear in imgUrls), emit `<img src="" alt="..." data-todo="asset-missing">` + `<!-- TODO: image asset not in assetsMap (URL: …) — download manually -->`. No CSS reproduction. No SVG re-creation. No text "interpretation" of what's inside the image.
|
|
71
64
|
|
|
72
|
-
|
|
73
|
-
- Every numeric literal `$NNNN` or `NNN reviews` MUST either (a) appear in a crop you actually read, or (b) be marked with a nearby TODO comment.
|
|
74
|
-
- Every FAQ `<div class="faq-a">` body MUST be non-empty ONLY if you literally read it from a crop. Empty bodies + TODO comment is the correct output when accordions are collapsed.
|
|
75
|
-
- Reviews widget bands → empty mount-div + TODO. Never fake review text.
|
|
76
|
-
If any violation found, REWRITE before submit. If you can't make it pass, return preflight error `composition-fabrication-detected` and let the orchestrator's Step 4j re-try with feedback.
|
|
65
|
+
**Examples from real feedback (everstride PDP)**: `mothers-day-new.svg` banner, `trust-badges-shipping.svg`, `features-points.svg`, `60-day-fit.svg` cards — all category (A).
|
|
77
66
|
|
|
78
|
-
|
|
67
|
+
**(B) PURE MARKUP** — the section is text + structured layout, no dominant image. Examples: FAQ accordion (`<details>/<summary>`), comparison tables (`<table>` with cells), pricing tiers, trust copy blocks, headings with paragraphs.
|
|
79
68
|
|
|
80
|
-
|
|
69
|
+
Recipe: READ the crop carefully + cross-validate every literal against `inspection.domLive` text nodes. Emit real semantic markup (`<table>`, `<dl>`, `<details>`, `<ul>` of `<li>`, etc.) that mirrors the live DOM structure.
|
|
70
|
+
|
|
71
|
+
**Cross-validation is MANDATORY for every literal**:
|
|
72
|
+
- Numeric (`$NN`, `NN reviews`, percentages, ratings): grep `inspection.domLive` recursively for a text node containing the exact substring. NOT found → emit `<!-- TODO: literal "$29.00" read from crop but not in DOM; verify -->` + placeholder.
|
|
73
|
+
- Headings/copy/labels: same rule. "What Customers Say", "All rights reserved", "Mix & Match Colors" — if not in DOM verbatim, do NOT emit.
|
|
74
|
+
- Counts: count items visible in crop AND cross-validate against `domLive` children of the container. If mismatch, prefer DOM count (DOM is authoritative for structure) + log discrepancy.
|
|
75
|
+
|
|
76
|
+
**(C) MIXED (image + text)** — the section has BOTH a meaningful image AND text content (heading + body paragraph + image side-by-side). Common Shopify pattern: `<images-with-text-scrolling>` custom element.
|
|
77
|
+
|
|
78
|
+
Recipe: emit `<section>` with `<img>` (rule A for the image part — use real asset URL when available) + `<h*>` heading + `<p>` paragraph(s) literal from crop, cross-validated. Do NOT collapse to image-only or text-only — preserve both.
|
|
79
|
+
|
|
80
|
+
**(D) WIDGET-RENDERED** — third-party widget (Loox reviews, "you may also like" carousel, Instagram feed) where the crop SHOWS the widget already rendered with content (reviews with names + photos + stars + quotes; product cards with prices + titles).
|
|
81
|
+
|
|
82
|
+
Recipe: READ the crop and emit real markup for each visible card/item. Cross-validate against DOM when widget exposes content as JSON-LD or hidden HTML. Mark container with `data-source="widget-name"` so the deploy team knows where to integrate the real widget later. **Do NOT emit empty mount-div when content is visible.** The "deploy time placeholder" excuse is ONLY for (E).
|
|
83
|
+
|
|
84
|
+
**(E) WIDGET-EMPTY / OPAQUE** — the crop is blank, only shows "Loading..." text, or the section is completely unreadable (zero text visible, no images, no structure). Last resort.
|
|
85
|
+
|
|
86
|
+
Recipe: emit STRUCTURAL placeholder (heading from DOM if available + `<div class="placeholder-grid">` with N visual placeholder-cards) + `<!-- TODO: <widget-name or section-description> — content not visible in crop, integrate at deploy -->`. Mount-div alone (empty `<div>`) is FORBIDDEN — placeholder must have visible structure so the deployed page is not visually broken.
|
|
87
|
+
|
|
88
|
+
2.5. **CARD-COUNT cross-validation (V03-5)**. When the section is a grid/carousel of repeating cards (products, reviews, trust badges, gallery thumbs, testimonial photos), the crop may only show the first row OR the first few items of a scroll-x carousel. Composer MUST count cards from `inspection.domLive` (or `inspection.dom`) recursively, NOT just from the crop. Recipe:
|
|
89
|
+
- Find the section's root node in domLive (by bbox.y match or sectionType).
|
|
90
|
+
- Walk children, count direct repeat-pattern items: `.product-card`, `.review-card`, `.feature-card`, `<li>` inside `<ul>` carousel, `[role="listitem"]`, etc.
|
|
91
|
+
- If DOM count > crop visible count → emit DOM count items, with `<img>` placeholder + literal text (cross-validated) for the items visible in crop, and `<img>` placeholder + `<!-- TODO: card N text below the fold -->` for items beyond crop.
|
|
92
|
+
- This fixes bugs 3.4 (home only 3 cards), 3.6 (trust badges only 2 of 4), 4.1 (PDP gallery only 1 thumb of 6), 6.1 (collection only 1 of 3 products).
|
|
93
|
+
- **NEVER emit fewer items than DOM has.** Crop visibility is sample, DOM is authority for structure.
|
|
94
|
+
|
|
95
|
+
3. **Hybridize sources by field type** (applies to B, C, D):
|
|
96
|
+
- **Texts/structure/counts** → crop, cross-validated against DOM.
|
|
97
|
+
- **Hrefs** → `inspection.domLive` / `hydratedHeader` / `hydratedFooter` / `imgUrls` by matching visible link text. No match → `href="#"` + TODO.
|
|
98
|
+
- **Form action/method/hidden inputs** → ALWAYS DOM (never guess; a wrong value breaks form submission).
|
|
99
|
+
- **Image src** → `assetsMap.assets[url].localPath` by alt/src pattern matching imgUrls.
|
|
100
|
+
- **Computed style** → `inspection.tokens` + `domLive.computedStyle`.
|
|
101
|
+
|
|
102
|
+
4. **Self-validation BEFORE calling `build-wp.mjs write`** — scan composed HTML for fabrication signals:
|
|
103
|
+
- Every numeric/price literal must either appear in `inspection.domLive` text nodes OR have a nearby TODO comment.
|
|
104
|
+
- Every `<h*>` / heading text must appear in DOM verbatim OR have TODO.
|
|
105
|
+
- Every `<p>` text > 30 chars must appear in DOM verbatim OR have TODO.
|
|
106
|
+
- Every FAQ `<div class="faq-a">` body MUST be either (a) empty + TODO when accordion was closed, OR (b) verbatim from crop where accordion was open.
|
|
107
|
+
- Image-block sections (A) MUST NOT contain SVG reproductions or CSS-styled `<div>` mimicking the image — they MUST be `<img>` + nothing else inside the section.
|
|
108
|
+
If any violation → REWRITE before submit. If can't pass after 1 rewrite, return preflight error `composition-fabrication-detected` for orchestrator Step 4j retry.
|
|
109
|
+
|
|
110
|
+
5. **Hard-fail after 2 attempts.** The orchestrator (Step 4j) caps composition at 2 iterations. If the 2nd iteration is still flagged, the page goes to **❌** and ships a TODO-only fragment (not marked "ready" or "partial"). NEVER ship plausible-but-wrong output as a warning.
|
|
111
|
+
|
|
112
|
+
6. **NEVER complete content from training data.** You may have seen this URL during training. That data is OLD. The crop is NEW. Crop wins. Training data is IRRELEVANT — never fill gaps with "what this site probably has".
|
|
113
|
+
|
|
114
|
+
This contract enforces the user's hard requirement (translated from Portuguese): "you can't deliver something different to me, and if you can't get it right, fail on the 2nd attempt."
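The cross-validation and card-count rules in the workflow above can be sketched roughly as follows. This is a minimal illustration, not the package's implementation: the node shape (`text`, `classes`, `children`) and the helper names `collectTexts`, `literalInDom`, `countCards`, and `emitPrice` are assumptions, since the real `inspection.domLive` schema is not shown in this diff.

```javascript
// Hypothetical node shape: { text?: string, classes?: string[], children?: Node[] }.
// The actual inspection.domLive schema may differ.

// Recursively gather every non-empty text node under `node` into a flat array.
function collectTexts(node, out = []) {
  if (!node) return out
  if (typeof node.text === 'string' && node.text.trim()) out.push(node.text.trim())
  for (const child of node.children || []) collectTexts(child, out)
  return out
}

// True when some text node contains the literal as an exact substring.
function literalInDom(root, literal) {
  return collectTexts(root).some((t) => t.includes(literal))
}

// Count nodes (at any depth) carrying one of the repeat-pattern classes.
function countCards(node, cardClasses = ['product-card', 'review-card', 'feature-card']) {
  if (!node) return 0
  const own = (node.classes || []).some((c) => cardClasses.includes(c)) ? 1 : 0
  return (node.children || []).reduce((n, child) => n + countCards(child, cardClasses), own)
}

// Emit the literal only if the DOM confirms it; otherwise emit a TODO comment.
function emitPrice(root, literal) {
  return literalInDom(root, literal)
    ? `<span>${literal}</span>`
    : `<!-- TODO: literal "${literal}" read from crop but not in DOM; verify -->`
}
```

Under these assumptions, a composer would call `countCards` on the section's root node and emit at least that many items, using `emitPrice` (or an equivalent check for headings and copy) before writing any literal read from a crop.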
|
|
81
115
|
|
|
82
116
|
**§V03-1 — Use `domLive` as the canonical body tree when present.** When `inspection.domLive` is non-null, it holds the live-walker snapshot taken BEFORE Cap A substituted `dom[]` with the shadow-flattened tree. The flattened `dom[]` carries `bbox={0,0,0,0}` and empty `computedStyle` because parseHTMLUnsafe returns a detached doc — it's structurally rich (gallery imgs, custom-element children) but useless for layout. The composer needs real bboxes, real `computedStyle.background`, real heights. Always prefer `inspection.domLive` for body section composition (bbox, computedStyle, hero detection, section ordering). Use `inspection.dom` only when `domLive === null` (page had no shadow roots — flatten didn't fire) or when you specifically need shadow-flattened content like a PDP gallery (consult `dom` for image-rich PDP nodes, but use `domLive` for the surrounding layout).
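As a rough illustration of the preference order described in §V03-1, the selection could look like the sketch below. The function name `pickBodyTree` and the `needShadowContent` option are hypothetical; only the `domLive` and `dom` fields come from the text above.

```javascript
// Sketch only: `inspection` carries the `domLive` and `dom` fields named in §V03-1.
function pickBodyTree(inspection, { needShadowContent = false } = {}) {
  const live = inspection.domLive ?? null
  // No live snapshot (page had no shadow roots, flatten did not fire):
  // `dom` is the only tree available for both layout and content.
  if (live === null) return { layout: inspection.dom, content: inspection.dom }
  // Live snapshot present: layout (bbox, computedStyle) always comes from
  // domLive; the flattened dom is consulted only for shadow-flattened content.
  return { layout: live, content: needShadowContent ? inspection.dom : live }
}
```

For a PDP gallery, a caller would pass `{ needShadowContent: true }` to read image-rich nodes from `dom` while still taking the surrounding layout from `domLive`.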
|
|
83
117
|
|
|
@@ -140,24 +174,118 @@ A single `.html` file written to `outputPath`. The file is a fragment — no `<h
|
|
|
140
174
|
|
|
141
175
|
When `--target-section=header` or `--target-section=footer` is passed AND the corresponding `inspection.hydratedHeader` / `inspection.hydratedFooter` is non-null, follow this composition recipe instead of the generic A-H pattern lookup:
|
|
142
176
|
|
|
143
|
-
**Footer recipe** (when `hydratedFooter` present):
|
|
177
|
+
**Footer recipe** (when `hydratedFooter` present) — v0.3.5 rewrite:
|
|
178
|
+
|
|
179
|
+
**Composition order = DOM order of `hydratedFooter.html`.** Walk the raw HTML once to capture the visual sequence (e.g. brand → newsletter → menu columns → Need Help → social → payments → copyright). DO NOT invent an order that differs from the source — the live site's order is the canonical reference.
|
|
144
180
|
|
|
145
|
-
|
|
146
|
-
2. **Grouping.** Split `hydratedFooter.links` into clusters by their adjacent heading in `hydratedFooter.headings`. Order of headings reflects the live DOM order. For each heading: emit a column with `<p class="footer__col-title">{heading.text}</p>` followed by `<ul>` of `<li><a href>` from the links bucketed under that heading.
|
|
147
|
-
- Heuristic for bucketing: walk `hydratedFooter.html` once (in your head, you have the raw outerHTML in `hydratedFooter.html` if needed) — links that appear AFTER a heading and BEFORE the next heading belong to that heading. If you can't disambiguate, fall back to grouping by `href` prefix patterns (`/policies/` → "More Information", `/products/` → "Collections", `/pages/` → split by name).
|
|
148
|
-
3. **Newsletter.** If `hydratedFooter.forms[]` is non-empty AND `hydratedFooter.inputs[]` contains an `email`-typed input, emit a real `<form action="{form.action}" method="{form.method}">` with the email input, preserving `name`, `placeholder`, `required`, `aria-label`. Hidden inputs (`type=hidden`) are preserved verbatim — they're typically Shopify form-type tokens. Add a submit `<button type="submit">` even if not in the source.
|
|
149
|
-
4. **Social media.** Detect social links by `href` matching `/facebook|instagram|tiktok|youtube|x\.com|twitter|pinterest/`. Group in a separate `<ul class="footer__social">` with inline SVG icons (you can ship the standard FB/IG icons from the bundled patterns). Skip if no matches.
|
|
150
|
-
5. **Payment icons.** Detect by `hydratedFooter.images[]` with `alt` matching `/amazon|visa|mastercard|amex|apple pay|google pay|discover|diners|shop pay|paypal/i` — emit those as a `<ul class="footer__payments">` with `<img>` referencing the `assetsMap` (resolve src via the standard pipeline). If `images[]` is empty but you'd expect them, fall back to text labels.
|
|
151
|
-
6. **Copyright.** If any link or heading text matches `© YEAR, Brand.`, preserve verbatim at the bottom.
|
|
152
|
-
7. **NO image-slice fallback.** When hydrated data is present, never compose a single `<img>` of the footer. The hydrated payload gives everything needed for real markup.
|
|
181
|
+
Use `hydratedFooter.blocks[]` (new in v0.3.5) as primary structured data — each block has `{heading, paragraphs[], links[], mailtos[]}` already grouped per source heading.
|
|
153
182
|
|
|
154
|
-
|
|
183
|
+
1. **Outer shell.** `<footer class="es-footer">` with reset + `box-sizing: border-box` + background from `inspection.tokens.colors.background` (fallback `#fafafa`).
|
|
184
|
+
|
|
185
|
+
2. **Brand row (top).** When `hydratedFooter.images[]` includes a logo-looking image (alt matching brand name, or first image of the footer), emit `<a class="es-footer__brand" href="/"><img src="{assetsMap.localPath}"></a>`. Immediately below, if `hydratedFooter.paragraphs[0]` exists AND is long enough to be the brand description (typically >50 chars, sits ABOVE the column area in the live source), emit `<p class="es-footer__brand-desc">{paragraphs[0] verbatim}</p>`. This is the "Socks can be more powerful..." copy in everstride — DO NOT skip.
|
|
186
|
+
|
|
187
|
+
3. **Newsletter (after brand description).** If `hydratedFooter.forms[]` non-empty AND `inputs[]` contains an `email`-typed input, emit `<form action="{form.action}" method="{form.method}">` with email input (preserve `name`, `placeholder`, `required`, `aria-label`) + all hidden inputs verbatim + `<button type="submit">`. **Position: directly below brand desc, ABOVE the menu columns.** Newsletter is high-conversion CTA — keep it where the live placed it.
|
|
188
|
+
|
|
189
|
+
4. **Menu columns side-by-side (NOT stacked).** Take `hydratedFooter.blocks[]` filtered to those with `links.length >= 2` (menu-style blocks). Emit them as a grid:
|
|
190
|
+
```css
|
|
191
|
+
.es-footer__cols.es-footer__cols {
|
|
192
|
+
display: grid;
|
|
193
|
+
grid-template-columns: repeat(2, minmax(0, 1fr));
|
|
194
|
+
gap: 24px;
|
|
195
|
+
}
|
|
196
|
+
@media (min-width: 750px) {
|
|
197
|
+
.es-footer__cols.es-footer__cols { grid-template-columns: repeat(3, minmax(0, 1fr)); }
|
|
198
|
+
}
|
|
199
|
+
```
|
|
200
|
+
For each menu block: emit column with `<p class="es-footer__col-title">{block.heading.text verbatim}</p>` + `<ul>` of `<li><a href="{link.href}">{link.text verbatim}</a></li>` from `block.links`. **NEVER stack vertically on mobile** — at least 2 columns side-by-side is the canonical layout, otherwise footer becomes an absurd long vertical strip (bug 2.4).
|
|
201
|
+
|
|
202
|
+
5. **Info blocks (Need Help? and similar).** Blocks with `paragraphs.length >= 1` AND `links.length < 2` are info-text blocks. For each: emit column with heading + `block.paragraphs[]` as `<p>` verbatim + `mailto:` links from `block.mailtos[]` as `<a href="mailto:{email}">{email}</a>`. This captures "Need Help? + Have a question? Email us at info@... + Our friendly support team is available 24/7..." (bug 2.5b — was emitting only heading without paragraphs).
|
|
203
|
+
|
|
204
|
+
6. **Social icons.** Detect from `hydratedFooter.links` filtered by `href` matching `/facebook|instagram|tiktok|youtube|x\.com|twitter|pinterest/i`. Emit as `<ul class="es-footer__social">` with inline SVG icons (use `hydratedFooter.inlineSvgs[]` matching parent class containing `social`; fall back to bundled FB/IG inline SVG patterns). Position: between menu columns and payment icons (or wherever the source places them).
|
|
205
|
+
|
|
206
|
+
7. **Payment icons.** Detect by EITHER:
|
|
207
|
+
- `hydratedFooter.images[]` with `alt` matching `/amazon|visa|mastercard|amex|apple pay|google pay|discover|diners|shop pay|paypal/i`, OR
|
|
208
|
+
- `hydratedFooter.inlineSvgs[]` with `ariaLabel`, `title`, or `parentClass` matching the same patterns, OR
|
|
209
|
+
- `hydratedFooter.inlineSvgs[]` where `parentClass` contains `payment-icons` / `footer__payment`.
|
|
210
|
+
Emit as `<ul class="es-footer__payments">` with `<img>` (resolved via assetsMap) OR inline SVG verbatim. If none of these match BUT you can see payment icons in the section crop, emit `<!-- TODO: payment icons visible in crop but not capturable from DOM -->`. NEVER skip the payment row when source had it (bug 2.1).
|
|
211
|
+
|
|
212
|
+
8. **Copyright.** Capture `© YEAR, Brand.` verbatim from `hydratedFooter.html` (typically last `<p>` in the footer matching `/^(©|copyright)/i`). Emit `<p class="es-footer__copyright">© 2026, Everstride.</p>` — **verbatim only, NEVER add "All rights reserved" or any other plausible-sounding suffix** (bug 2.5a fabrication regression).
|
|
213
|
+
|
|
214
|
+
9. **Cross-validate every literal** before emit: every block heading, every link text, every paragraph, every copyright text MUST appear verbatim in `hydratedFooter.html`. Validator (`build-wp.mjs validate --inspection-path`) will catch fabrications; pre-validate yourself to avoid rework.
|
|
215
|
+
|
|
216
|
+
10. **NO image-slice fallback.** When `hydratedFooter` is present, never compose a single `<img>` of the footer. Hydrated payload always provides enough for real markup.
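The block routing in steps 4 to 6 of the footer recipe might be sketched as below. The thresholds and the social regex are taken from the recipe; the function name `partitionFooter` and the exact shape of each link object (`{ href, text }`) are assumptions for illustration.

```javascript
// Regex from step 6 of the footer recipe (no `g` flag, so .test() is stateless).
const SOCIAL_RE = /facebook|instagram|tiktok|youtube|x\.com|twitter|pinterest/i

// Split the hydrated footer into menu columns, info blocks, and social links.
function partitionFooter(hydratedFooter) {
  const blocks = hydratedFooter.blocks || []
  // Step 4: menu-style blocks have at least 2 links.
  const menu = blocks.filter((b) => (b.links || []).length >= 2)
  // Step 5: info blocks have at least 1 paragraph and fewer than 2 links.
  const info = blocks.filter(
    (b) => (b.paragraphs || []).length >= 1 && (b.links || []).length < 2
  )
  // Step 6: social links are detected by href pattern.
  const social = (hydratedFooter.links || []).filter((l) => SOCIAL_RE.test(l.href || ''))
  return { menu, info, social }
}
```

Blocks that satisfy neither rule (for example a heading with no paragraphs and a single link) fall into no bucket in this sketch; how the real composer handles them is not stated in the recipe.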
|
|
217
|
+
|
|
218
|
+
**Header recipe** (when `hydratedHeader` present) — v0.3.5 rewrite:
|
|
219
|
+
|
|
220
|
+
1. **Announcement bar FIRST (when `inspection.hydratedAnnouncementBar` is non-null).** This is captured separately in v0.3.5 — typically a `<aside>` sibling above the `<header>` carrying promotional text ("Mother's Day Sale", "Free Shipping over $X", etc.). Emit as the very first element of `clean/global/header.html`:
|
|
221
|
+
```html
|
|
222
|
+
<div class="es-announcement-bar">
|
|
223
|
+
{hydratedAnnouncementBar.text verbatim}
|
|
224
|
+
</div>
|
|
225
|
+
```
|
|
226
|
+
With CSS giving it the pink/promo background read from the crop. **NEVER skip the announcement bar** when it was captured — bug 1.1 (banner not in header) AND bug 3.1 (banner ended up inline in body) both trace to this being omitted. Also remove from per-page body via stripper.
|
|
227
|
+
|
|
228
|
+
2. **Outer shell.** `<header class="es-header">` below the announcement bar.
|
|
229
|
+
|
|
230
|
+
3. **Layout: grid with 3 zones (centered logo).** Default mobile + desktop:
|
|
231
|
+
```css
|
|
232
|
+
.es-header.es-header {
|
|
233
|
+
display: grid;
|
|
234
|
+
grid-template-columns: auto 1fr auto;
|
|
235
|
+
align-items: center;
|
|
236
|
+
padding: 12px 16px;
|
|
237
|
+
}
|
|
238
|
+
.es-header.es-header > .es-header__left { justify-self: start; }
|
|
239
|
+
.es-header.es-header > .es-header__brand { justify-self: center; }
|
|
240
|
+
.es-header.es-header > .es-header__right { justify-self: end; }
|
|
241
|
+
```
|
|
242
|
+
This guarantees the brand stays VISUALLY centered regardless of how many icons are in left/right clusters, which fixes bug 1.3 (off-center logo).
|
|
243
|
+
|
|
244
|
+
4. **Brand (center).** If `hydratedHeader.images[]` includes a logo-looking image (alt matching brand name OR class containing `logo`/`brand`), emit `<a class="es-header__brand" href="/"><img src="{assetsMap.localPath}" alt="{brand}"></a>`. If brand is text-only (no img), emit `<span class="es-header__brand">{brand-name}</span>`.
|
|
245
|
+
|
|
246
|
+
5. **Left cluster: hamburger + search.** On mobile (`@media (max-width: 749px)`), emit `<div class="es-header__left">` with a `<button>` hamburger trigger (aria-controls a drawer) + a `<button>` search trigger. On desktop, replace hamburger with horizontal nav links (rule 7).
|
|
247
|
+
|
|
248
|
+
6. **Right cluster: utility icons.** Links/buttons matching `/account|cart|search/` go into `<div class="es-header__right">` with proper SVG icons (use `hydratedHeader.inlineSvgs[]` matching parent class containing `account|cart|search`, OR fall back to bundled icon set).
|
|
249
|
+
|
|
250
|
+
7. **Desktop nav (visible only ≥750px).** Take category links from `hydratedHeader.links` filtered by hrefs matching `/products|collections|categories|shop/` (top-level shop nav). Emit as `<nav class="es-header__nav-desktop"><ul>{links}</ul></nav>` between brand and right cluster, with `display: none` on mobile.
|
|
251
|
+
|
|
252
|
+
8. **Mobile drawer (V03-5 corrected UX).** Bug 1.2 — previous output had logo inside drawer + no overlay. Correct UX:
|
|
253
|
+
```html
|
|
254
|
+
<div class="es-drawer-backdrop" data-drawer-backdrop hidden>
|
|
255
|
+
<div class="es-drawer" data-drawer role="dialog" aria-modal="true">
|
|
256
|
+
<button class="es-drawer__close" aria-label="Close menu" data-drawer-close>✕</button>
|
|
257
|
+
<nav>
|
|
258
|
+
<ul class="es-drawer__nav">
|
|
259
|
+
<!-- category links from hydratedHeader.links, NO LOGO inside -->
|
|
260
|
+
</ul>
|
|
261
|
+
</nav>
|
|
262
|
+
<!-- Optional: account link / login at footer of drawer if source has it -->
|
|
263
|
+
</div>
|
|
264
|
+
</div>
|
|
265
|
+
```
|
|
266
|
+
CSS:
|
|
267
|
+
```css
|
|
268
|
+
.es-drawer-backdrop.es-drawer-backdrop[hidden] { display: none; }
|
|
269
|
+
.es-drawer-backdrop.es-drawer-backdrop {
|
|
270
|
+
position: fixed; inset: 0;
|
|
271
|
+
background: rgba(0,0,0,.5);
|
|
272
|
+
z-index: 9999;
|
|
273
|
+
}
|
|
274
|
+
.es-drawer.es-drawer {
|
|
275
|
+
position: absolute; top: 0; left: 0; bottom: 0;
|
|
276
|
+
width: min(85vw, 360px);
|
|
277
|
+
background: #fff;
|
|
278
|
+
padding: 24px 20px;
|
|
279
|
+
overflow-y: auto;
|
|
280
|
+
}
|
|
281
|
+
.es-drawer__close.es-drawer__close {
|
|
282
|
+
position: absolute; top: 12px; right: 12px;
|
|
283
|
+
background: none; border: 0; font-size: 24px;
|
|
284
|
+
}
|
|
285
|
+
```
|
|
286
|
+
JS: vanilla toggle on hamburger click. **NO LOGO inside drawer** — logo is in the header shell, not duplicated. **Backdrop overlay is mandatory** — drawer with no backdrop is the v0.3.x bug.
|
|
155
287
|
|
|
156
|
-
|
|
157
|
-
2. **Promo bar.** If any link's text matches a promotional pattern (`/sale|discount|free|% off/i`) OR `hydratedHeader.headings[]` contains a short standalone heading at the top, emit `<div class="es-header__promo">{text}</div>`.
|
|
158
|
-
3. **Brand.** If `hydratedHeader.images[]` includes one with `alt` matching the brand name (derived from URL or generic `logo`), emit `<a class="es-header__brand" href="/">` with that `<img>`.
|
|
159
|
-
4. **Nav.** Take `hydratedHeader.links` filtered to category-looking hrefs (`/products/...`, `/collections/...`, top-level pages). Emit as `<nav><ul>{links}</ul></nav>`. On mobile (`@media (max-width: 749px)`), collapse into a `<details><summary aria-label="Menu">` drawer — keep all category links inside.
|
|
160
|
-
5. **Utility icons.** Links with text "Open search", "Account", "Cart" (or matching hrefs `/search`, `/account`, `/cart`) get rendered as a right-side cluster of icon buttons.
|
|
288
|
+
9. **Cross-validate every literal** (brand text, link texts, announcement bar text) against `hydratedHeader.html` / `hydratedAnnouncementBar.html` before emit. Validator catches fabrications.
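The "vanilla toggle" mentioned in step 8 is not spelled out in the recipe. One possible sketch, assuming the backdrop's `hidden` attribute is the only state and that clicking the dimmed area should also close the drawer (both assumptions, as is the function name `wireDrawer`):

```javascript
// Wires open/close behaviour for the drawer. In a browser the arguments are
// DOM elements; anything with addEventListener and a `hidden` property works.
function wireDrawer({ trigger, backdrop, closeButton }) {
  const open = () => { backdrop.hidden = false }
  const close = () => { backdrop.hidden = true }
  trigger.addEventListener('click', open)
  closeButton.addEventListener('click', close)
  // A click on the backdrop itself (not on the drawer panel inside it) closes.
  backdrop.addEventListener('click', (event) => {
    if (event.target === backdrop) close()
  })
  return { open, close }
}
```

In a page, `backdrop` and `closeButton` would be looked up with `document.querySelector('[data-drawer-backdrop]')` and `document.querySelector('[data-drawer-close]')`; the selector for the hamburger `trigger` is not named in the recipe and would have to be chosen by the composer.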
|
|
161
289
|
|
|
162
290
|
**Output contract for header/footer:**
|
|
163
291
|
- Markup includes a top comment `<!-- sb-build-wp: composed from hydrated snapshot (V03-0a) -->`.
|
|
@@ -51,7 +51,17 @@ The Elementor + active theme stack frequently injects `!important` on these prop
|
|
|
51
51
|
- `font-weight` — set on `h1..h6`, `body`
|
|
52
52
|
- `color` — set on links via `.elementor a`, on body via theme
|
|
53
53
|
|
|
54
|
-
|
|
54
|
+
### v0.3.5: Expanded property list (after the real-world "everything flat in WordPress" bug 3.4):
|
|
55
|
+
|
|
56
|
+
The 4-property list above was insufficient: when output was published in WordPress, the products grid and other card-based layouts came out "all flat" (no border-radius, no shadow, no gaps). The Elementor theme stack also normalizes/overrides these visual properties on `*` selectors. Add `!important` ALSO to:
|
|
57
|
+
|
|
58
|
+
- `border-radius` — on cards, buttons, image containers (themes flatten cards otherwise)
|
|
59
|
+
- `box-shadow` — on cards and elevated elements
|
|
60
|
+
- `background-color` — on color-bearing blocks (`.es-card`, `.es-cta`, `.es-banner`) when the visual identity depends on it
|
|
61
|
+
- `gap` (and `column-gap`, `row-gap`) — on grid/flex containers (themes that reset child margins also reset gaps)
|
|
62
|
+
- `padding` — on card-like containers and section blocks where the live padding is visually critical
|
|
63
|
+
|
|
64
|
+
Apply via the same chained-scope pattern (`.es-card.es-card { border-radius: 12px !important; }`). Still DO NOT spray `!important` on margin, width, height, transforms, transitions — those rarely conflict with theme.
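For illustration only, the chained-scope pattern can be generated mechanically. The helper below is hypothetical (`importantRule` is not part of the package); it repeats the class in the selector to raise specificity and appends `!important` to each declaration.

```javascript
// Build a chained-scope rule such as `.es-card.es-card { ... !important; }`.
// `className` is given without the leading dot.
function importantRule(className, declarations) {
  const selector = `.${className}.${className}` // repeated class raises specificity
  const body = Object.entries(declarations)
    .map(([prop, value]) => `${prop}: ${value} !important;`)
    .join(' ')
  return `${selector} { ${body} }`
}
```

Calling `importantRule('es-card', { 'border-radius': '12px' })` yields the same rule shown in the paragraph above.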
|
|
55
65
|
|
|
56
66
|
## Reset universal
|
|
57
67
|
|
|
@@ -78,7 +78,7 @@ function findAll(re, str) {
|
|
|
78
78
|
return out
|
|
79
79
|
}
|
|
80
80
|
|
|
81
|
-
function validateHtml(html) {
|
|
81
|
+
function validateHtml(html, inspection = null) {
|
|
82
82
|
const errors = []
|
|
83
83
|
const warnings = []
|
|
84
84
|
const info = []
|
|
@@ -228,6 +228,103 @@ function validateHtml(html) {
|
|
|
228
228
|
fabricationRisks.push({ section: m[1].trim(), reason: m[2].trim() })
|
|
229
229
|
}
|
|
230
230
|
|
|
231
|
+
// 13. §V03-4 fabrication detector — when inspection JSON is provided,
|
|
232
|
+
// cross-validate every literal text in the composed HTML against the
|
|
233
|
+
// inspection's text corpus (domLive + dom + hydratedHeader.html +
|
|
234
|
+
// hydratedFooter.html). Literals that don't appear ANYWHERE in the
|
|
235
|
+
// inspection corpus are flagged as fabrications.
|
|
236
|
+
//
|
|
237
|
+
// Detected literal types:
|
|
238
|
+
// - prices/money: \$\d{1,5}(\.\d{2})?
|
|
239
|
+
// - count phrases: \d[\d,]{1,9}\+?\s+(reviews|customers|women|sold|pairs|...)
|
|
240
|
+
// - headings text content (h1-h6 + p.bold-like)
|
|
241
|
+
// - meaningful paragraph text >30 chars
|
|
242
|
+
//
|
|
243
|
+
// Findings emit error `composition-fabrication-detected` so the orchestrator
|
|
244
|
+
// (Step 4j) can re-trigger composition with feedback.
|
|
245
|
+
const fabricationFindings = []
|
|
246
|
+
if (inspection && typeof inspection === 'object') {
|
|
247
|
+
const corpus = buildInspectionCorpus(inspection)
|
|
248
|
+
const looseCorpus = corpus
|
|
249
|
+
.toLowerCase()
|
|
250
|
+
.replace(/\s+/g, ' ')
|
|
251
|
+
.replace(/[‘’]/g, "'")
|
|
252
|
+
.replace(/[“”]/g, '"')
|
|
253
|
+
.replace(/[—–-]+/g, '-')
|
|
254
|
+
// strip TODO-commented regions so they don't get re-scanned
|
|
255
|
+
const htmlBody = html.replace(/<!--[\s\S]*?-->/g, '')
|
|
256
|
+
// (a) money literals
|
|
257
|
+
const moneyRe = /\$\d{1,5}(?:\.\d{2})?/g
|
|
258
|
+
const moneyHits = new Set()
|
|
259
|
+
for (let m = moneyRe.exec(htmlBody); m; m = moneyRe.exec(htmlBody)) moneyHits.add(m[0])
|
|
260
|
+
for (const lit of moneyHits) {
|
|
261
|
+
if (!corpusHas(looseCorpus, lit)) {
|
|
262
|
+
fabricationFindings.push({
|
|
263
|
+
kind: 'money',
|
|
264
|
+
literal: lit,
|
|
265
|
+
severity: 'high',
|
|
266
|
+
message: `Money literal "${lit}" not found in inspection corpus (domLive/dom/hydratedHeader/hydratedFooter)`,
|
|
267
|
+
})
|
|
268
|
+
}
|
|
269
|
+
}
|
|
270
|
+
// (b) count phrases like "19,479 Reviews", "240,000+ Women" (countRe needs at least two digit/comma characters, so "1M+ Sold" is not matched)
|
|
271
|
+
const countRe = /\b(\d[\d,]{1,9}\+?)\s+(reviews?|customers?|women|sold|pairs?|orders?|stars?|users?|clinicians?|five-star|verified)\b/gi
|
|
272
|
+
const countHits = new Set()
|
|
273
|
+
for (let m = countRe.exec(htmlBody); m; m = countRe.exec(htmlBody)) countHits.add(m[0])
|
|
274
|
+
for (const lit of countHits) {
|
|
275
|
+
if (!corpusHas(looseCorpus, lit)) {
|
|
276
|
+
fabricationFindings.push({
|
|
277
|
+
kind: 'count',
|
|
278
|
+
literal: lit,
|
|
279
|
+
severity: 'high',
|
|
280
|
+
message: `Count phrase "${lit}" not found in inspection corpus`,
|
|
281
|
+
})
|
|
282
|
+
}
|
|
283
|
+
}
|
|
284
|
+
// (c) headings — text content of h1..h6
|
|
285
|
+
const headingRe = /<h[1-6][^>]*>([\s\S]*?)<\/h[1-6]>/gi
|
|
286
|
+
const headingHits = new Set()
|
|
287
|
+
for (let m = headingRe.exec(htmlBody); m; m = headingRe.exec(htmlBody)) {
|
|
288
|
+
const txt = m[1].replace(/<[^>]+>/g, '').trim()
|
|
289
|
+
if (txt.length >= 3 && txt.length <= 120) headingHits.add(txt)
|
|
290
|
+
}
|
|
291
|
+
for (const lit of headingHits) {
|
|
292
|
+
if (!corpusHas(looseCorpus, lit)) {
|
|
293
|
+
fabricationFindings.push({
|
|
294
|
+
kind: 'heading',
|
|
295
|
+
literal: lit,
|
|
296
|
+
severity: 'high',
|
|
297
|
+
message: `Heading "${lit}" not found in inspection corpus`,
|
|
298
|
+
})
|
|
299
|
+
}
|
|
300
|
+
}
|
|
301
|
+
// (d) long paragraphs (30-400 chars): text content of <p>
|
|
302
|
+
const paraRe = /<p[^>]*>([\s\S]*?)<\/p>/gi
|
|
303
|
+
const paraHits = new Set()
|
|
304
|
+
for (let m = paraRe.exec(htmlBody); m; m = paraRe.exec(htmlBody)) {
|
|
305
|
+
const txt = m[1].replace(/<[^>]+>/g, '').trim()
|
|
306
|
+
if (txt.length >= 30 && txt.length <= 400) paraHits.add(txt)
|
|
307
|
+
}
|
|
308
|
+
for (const lit of paraHits) {
|
|
309
|
+
if (!corpusHas(looseCorpus, lit)) {
|
|
310
|
+
fabricationFindings.push({
|
|
311
|
+
kind: 'paragraph',
|
|
312
|
+
literal: lit.slice(0, 80) + (lit.length > 80 ? '…' : ''),
|
|
313
|
+
severity: 'medium',
|
|
314
|
+
message: `Paragraph "${lit.slice(0, 80)}${lit.length > 80 ? '…' : ''}" not found in inspection corpus`,
|
|
315
|
+
})
|
|
316
|
+
}
|
|
317
|
+
}
|
|
318
|
+
if (fabricationFindings.length > 0) {
|
|
319
|
+
errors.push({
|
|
320
|
+
rule: 'composition-fabrication-detected',
|
|
321
|
+
severity: 'high',
|
|
322
|
+
message: `§V03-4 fabrication-detector found ${fabricationFindings.length} literal(s) not present in inspection corpus. The composer should re-read the section crops and either remove the fabricated literals OR replace with TODO comments.`,
|
|
323
|
+
findings: fabricationFindings,
|
|
324
|
+
})
|
|
325
|
+
}
|
|
326
|
+
}
|
|
327
|
+
|
|
231
328
|
return {
|
|
232
329
|
passed: errors.length === 0,
|
|
233
330
|
errorCount: errors.length,
|
|
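The money and count detection in the hunk above can be tried in isolation. The sketch below reuses the two regexes exactly as they appear in the diff; the wrapper `extractLiterals` and the sample HTML are hypothetical and not part of the package.

```javascript
// Hypothetical helper (not in the package): collect unique money and count
// literals from composed HTML, using the regexes shown in the hunk above.
function extractLiterals(html) {
  // HTML comments (e.g. TODO placeholders) are stripped before scanning.
  const body = html.replace(/<!--[\s\S]*?-->/g, '')
  const moneyRe = /\$\d{1,5}(?:\.\d{2})?/g
  const countRe = /\b(\d[\d,]{1,9}\+?)\s+(reviews?|customers?|women|sold|pairs?|orders?|stars?|users?|clinicians?|five-star|verified)\b/gi
  return {
    money: [...new Set(body.match(moneyRe) || [])],
    counts: [...new Set(body.match(countRe) || [])],
  }
}

// Assumed sample input for illustration only.
const sample =
  '<p>Now $29.00 <!-- TODO: was $99 --></p>' +
  '<span>19,479 Reviews</span><span>1M+ Sold</span>'
const literals = extractLiterals(sample)
console.log(literals)
// money  -> ['$29.00']          (the commented-out $99 is ignored)
// counts -> ['19,479 Reviews']  ("1M+ Sold" does not match countRe)
```

With the regex as written, a count needs at least two digit-or-comma characters before the keyword, so abbreviated figures such as "1M+" fall outside the check.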
@@ -236,9 +333,62 @@ function validateHtml(html) {
|
|
|
236
333
|
warnings,
|
|
237
334
|
info,
|
|
238
335
|
fabricationRisks,
|
|
336
|
+
fabricationFindings,
|
|
239
337
|
}
|
|
240
338
|
}
|
|
241
339
|
|
|
340
|
+
// §V03-4: build a single text corpus (lowercased later by the caller) from all inspection text
|
|
341
|
+
// sources for fabrication cross-validation. Includes:
|
|
342
|
+
// - domLive (preferred) / dom: every node.text + node.attrs.alt / aria-label / title
|
|
343
|
+
// - hydratedHeader.html, hydratedFooter.html (raw outerHTML strings)
|
|
344
|
+
function buildInspectionCorpus(inspection) {
|
|
345
|
+
const parts = []
|
|
346
|
+
function walkText(arr) {
|
|
347
|
+
if (!arr) return
|
|
348
|
+
const stack = Array.isArray(arr) ? [...arr] : [arr]
|
|
349
|
+
while (stack.length) {
|
|
350
|
+
const n = stack.pop()
|
|
351
|
+
if (!n || typeof n !== 'object') continue
|
|
352
|
+
if (typeof n.text === 'string' && n.text.trim()) parts.push(n.text)
|
|
353
|
+
if (n.attrs && typeof n.attrs === 'object') {
|
|
354
|
+
if (typeof n.attrs.alt === 'string') parts.push(n.attrs.alt)
|
|
355
|
+
if (typeof n.attrs['aria-label'] === 'string') parts.push(n.attrs['aria-label'])
|
|
356
|
+
if (typeof n.attrs.title === 'string') parts.push(n.attrs.title)
|
|
357
|
+
}
|
|
358
|
+
if (Array.isArray(n.children)) for (const c of n.children) stack.push(c)
|
|
359
|
+
}
|
|
360
|
+
}
|
|
361
|
+
walkText(inspection.domLive)
|
|
362
|
+
if (!inspection.domLive) walkText(inspection.dom)
|
|
363
|
+
if (inspection.hydratedHeader && typeof inspection.hydratedHeader.html === 'string') {
|
|
364
|
+
parts.push(inspection.hydratedHeader.html.replace(/<[^>]+>/g, ' '))
|
|
365
|
+
}
|
|
366
|
+
if (inspection.hydratedFooter && typeof inspection.hydratedFooter.html === 'string') {
|
|
367
|
+
parts.push(inspection.hydratedFooter.html.replace(/<[^>]+>/g, ' '))
|
|
368
|
+
}
|
|
369
|
+
return parts.join(' \n ')
|
|
370
|
+
}
|
|
371
|
+
|
|
372
|
+
// Loose comparison: literal appears in the corpus with whitespace and
|
|
373
|
+
// punctuation normalization. Handles "$29.00" matching "$ 29.00" etc.
|
|
374
|
+
function corpusHas(looseCorpusLower, literal) {
|
|
375
|
+
if (!literal) return true
|
|
376
|
+
const needle = literal
|
|
377
|
+
.toLowerCase()
|
|
378
|
+
.replace(/\s+/g, ' ')
|
|
379
|
+
.replace(/[‘’]/g, "'")
|
|
380
|
+
.replace(/[“”]/g, '"')
|
|
381
|
+
.replace(/[—–-]+/g, '-')
|
|
382
|
+
.trim()
|
|
383
|
+
if (!needle) return true
|
|
384
|
+
// direct substring
|
|
385
|
+
if (looseCorpusLower.includes(needle)) return true
|
|
386
|
+
// also try removing all whitespace (handles "$29 .00" vs "$29.00")
|
|
387
|
+
const compact = needle.replace(/\s+/g, '')
|
|
388
|
+
const compactCorpus = looseCorpusLower.replace(/\s+/g, '')
|
|
389
|
+
return compactCorpus.includes(compact)
|
|
390
|
+
}
|
|
391
|
+
|
|
242
392
|
// --- preflight ----------------------------------------------------------------
|
|
243
393
|
//
|
|
244
394
|
// Pre-dispatch checklist. The composer is creative — it picks patterns and
|
|
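The corpus build and loose matching added in the hunk above (`buildInspectionCorpus` plus `corpusHas`) can be sketched end to end as follows. This is a simplified illustration, not the package's code: `normalizeLoose` and `buildCorpus` are hypothetical names, only `text` and `alt` are collected, and the sample nodes are invented.

```javascript
// Hypothetical shared normalizer mirroring the replace chain in the diff:
// lowercase, collapse whitespace, straighten quotes, unify dashes.
function normalizeLoose(s) {
  return s
    .toLowerCase()
    .replace(/\s+/g, ' ')
    .replace(/[\u2018\u2019]/g, "'")
    .replace(/[\u201C\u201D]/g, '"')
    .replace(/[\u2014\u2013-]+/g, '-')
}

// Simplified corpus builder: iterative walk collecting node.text and alt text.
function buildCorpus(nodes) {
  const parts = []
  const stack = [...nodes]
  while (stack.length) {
    const n = stack.pop()
    if (!n || typeof n !== 'object') continue
    if (typeof n.text === 'string' && n.text.trim()) parts.push(n.text)
    if (n.attrs && typeof n.attrs.alt === 'string') parts.push(n.attrs.alt)
    if (Array.isArray(n.children)) stack.push(...n.children)
  }
  return normalizeLoose(parts.join(' \n '))
}

// Substring match first, then a whitespace-free comparison as a fallback.
function corpusHas(looseCorpus, literal) {
  const needle = normalizeLoose(literal).trim()
  if (!needle) return true
  if (looseCorpus.includes(needle)) return true
  return looseCorpus.replace(/\s+/g, '').includes(needle.replace(/\s+/g, ''))
}

// Assumed sample inspection nodes for illustration only.
const corpus = buildCorpus([
  { text: 'Mother\u2019s Day Sale', children: [{ text: 'Only $ 29.00 today' }] },
])
console.log(corpusHas(corpus, "Mother's Day Sale")) // true (curly quote normalized)
console.log(corpusHas(corpus, '$29.00'))            // true (whitespace-free fallback)
console.log(corpusHas(corpus, '$39.00'))            // false (would be flagged)
```

Because the check is a plain substring test, a shorter literal such as `$29` is also treated as present when the corpus only contains `$29.99`; the detector catches literals that appear nowhere, not literals that are slightly altered.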
@@ -425,8 +575,21 @@ async function main() {
|
|
|
425
575
|
} catch (err) {
|
|
426
576
|
fail(`validate: cannot read ${path}: ${err.message}`, 1)
|
|
427
577
|
}
|
|
428
|
-
|
|
429
|
-
|
|
578
|
+
// §V03-4 fabrication detector — when --inspection-path is passed, load
|
|
579
|
+
// the inspection JSON and cross-validate composed HTML literals
|
|
580
|
+
// against the inspection text corpus. Without --inspection-path, the
|
|
581
|
+
// fabrication check is skipped (back-compat for older callers).
|
|
582
|
+
let inspection = null
|
|
583
|
+
if (values['inspection-path']) {
|
|
584
|
+
try {
|
|
585
|
+
const insRaw = await readFile(resolve(values['inspection-path']), 'utf8')
|
|
586
|
+
inspection = JSON.parse(insRaw)
|
|
587
|
+
} catch (err) {
|
|
588
|
+
log(values.verbose, `validate: could not load --inspection-path: ${err.message}`)
|
|
589
|
+
}
|
|
590
|
+
}
|
|
591
|
+
log(values.verbose, `validating ${path} (${html.length} bytes${inspection ? ', with fabrication check' : ''})`)
|
|
592
|
+
const report = validateHtml(html, inspection)
|
|
430
593
|
process.stdout.write(`${JSON.stringify(report, null, 2)}\n`)
|
|
431
594
|
process.exit(report.passed ? 0 : 3)
|
|
432
595
|
}
|
|
@@ -259,8 +259,32 @@ async function main() {
|
|
|
259
259
|
}
|
|
260
260
|
return best?.el || null
|
|
261
261
|
}
|
|
262
|
+
// §V03-5 — announcement bar (promo bar on top of the page,
|
|
263
|
+
// typically <aside> sibling above the <header>). Common in e-commerce
|
|
264
|
+
// ("Mother's Day Sale", "Free Shipping over $X", etc.). Captured as
|
|
265
|
+
// separate snapshot so the composer can compose it as part of the
|
|
266
|
+
// global chrome (above the header) — not as inline body content of
|
|
267
|
+
// the home page.
|
|
268
|
+
function pickAnnouncementBar() {
|
|
269
|
+
const candidates = document.querySelectorAll(
|
|
270
|
+
'aside[class*="announcement"], [class*="announcement-bar"], [class*="promo-bar"], [class*="shopify-section-group-header-group"][class*="announcement"], aside[class*="shopify-section-group-header"]',
|
|
271
|
+
)
|
|
272
|
+
let best = null
|
|
273
|
+
for (const el of candidates) {
|
|
274
|
+
const r = el.getBoundingClientRect()
|
|
275
|
+
if (r.height < 20 || r.height > 200) continue
|
|
276
|
+
// must be near top of page
|
|
277
|
+
if (Math.abs(r.y + window.scrollY) > 200) continue
|
|
278
|
+
const txt = (el.textContent || '').trim()
|
|
279
|
+
if (!txt) continue
|
|
280
|
+
const score = txt.length / 10 + (r.width >= 320 ? 5 : 0)
|
|
281
|
+
if (!best || score > best.score) best = { el, score }
|
|
282
|
+
}
|
|
283
|
+
return best?.el || null
|
|
284
|
+
}
|
|
262
285
|
const footer = pickFooter()
|
|
263
286
|
const header = pickHeader()
|
|
287
|
+
const announcement = pickAnnouncementBar()
|
|
264
288
|
window.__sbHydratedFooter = footer
|
|
265
289
|
? {
|
|
266
290
|
html: footer.outerHTML,
|
|
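The filter-and-score logic of `pickAnnouncementBar` in the hunk above can be exercised without a browser. The sketch below is a hypothetical pure-function version: each candidate is a plain record, and `top` stands in for `r.y + window.scrollY`. The thresholds and score formula are copied from the diff; everything else is assumed for illustration.

```javascript
// Hypothetical pure version of the candidate selection (no DOM access).
// Each candidate: { text, top, width, height }, with top relative to the document.
function pickBestCandidate(candidates) {
  let best = null
  for (const c of candidates) {
    if (c.height < 20 || c.height > 200) continue // bar-sized elements only
    if (Math.abs(c.top) > 200) continue // must sit near the top of the page
    const txt = (c.text || '').trim()
    if (!txt) continue // empty bars are ignored
    // Longer text scores higher; elements at least 320px wide get a bonus.
    const score = txt.length / 10 + (c.width >= 320 ? 5 : 0)
    if (!best || score > best.score) best = { candidate: c, score }
  }
  return best ? best.candidate : null
}

// Assumed sample candidates for illustration only.
const picked = pickBestCandidate([
  { text: 'Free Shipping over $50', top: 0, width: 1280, height: 40 },
  { text: 'A much longer promotional message placed far down the page', top: 900, width: 1280, height: 40 },
  { text: 'Sale', top: 0, width: 200, height: 10 },
])
console.log(picked.text) // 'Free Shipping over $50'
```

The second candidate has more text but is rejected by the position check, and the third is rejected by the height check, so neither is ever scored.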
@@ -289,6 +313,29 @@ async function main() {
|
|
|
289
313
|
})(),
|
|
290
314
|
}
|
|
291
315
|
: null
|
|
316
|
+
window.__sbHydratedAnnouncementBar = announcement
|
|
317
|
+
? {
|
|
318
|
+
html: announcement.outerHTML,
|
|
319
|
+
text: (() => {
|
|
320
|
+
// Strip inline <script>/<style> tags before reading text —
|
|
321
|
+
// Shopify themes inline CSS variables and JS at the top of
|
|
322
|
+
// the announcement-bar section, which would otherwise leak
|
|
323
|
+
// into the "text" field as garbage.
|
|
324
|
+
const clone = announcement.cloneNode(true)
|
|
325
|
+
clone.querySelectorAll('script,style').forEach((n) => n.remove())
|
|
326
|
+
return (clone.textContent || '').replace(/\s+/g, ' ').trim()
|
|
327
|
+
})(),
|
|
328
|
+
bbox: (() => {
|
|
329
|
+
const r = announcement.getBoundingClientRect()
|
|
330
|
+
return {
|
|
331
|
+
x: Math.round(r.x + window.scrollX),
|
|
332
|
+
y: Math.round(r.y + window.scrollY),
|
|
333
|
+
w: Math.round(r.width),
|
|
334
|
+
h: Math.round(r.height),
|
|
335
|
+
}
|
|
336
|
+
})(),
|
|
337
|
+
}
|
|
338
|
+
: null
|
|
292
339
|
})
|
|
293
340
|
|
|
294
341
|
// Return to the top before layout reads. content-visibility:auto on
|
|
@@ -453,6 +500,84 @@ async function main() {
|
|
|
453
500
|
const slug = (cls || tag).replace(/[^a-z0-9-]/gi, '-').toLowerCase().slice(0, 32)
|
|
454
501
|
return slug || 'section'
|
|
455
502
|
}
|
|
503
|
+
|
|
504
|
+
// §V03-4 image-block detection. A section is an image-block when
|
|
505
|
+
// its visible content is dominated by a single <img>: either by
|
|
506
|
+
// bbox coverage (img takes ≥80% of section area) OR by a known
|
|
507
|
+
// wrapper-class pattern (Shopify Dawn / OS 2.0 themes use specific
|
|
508
|
+
// wrapper classes like `product-info__image` to delimit a region
|
|
509
|
+
// whose entire visual content is one CDN-hosted SVG/PNG/JPG asset).
|
|
510
|
+
// The composer (sb-build-wp §V03-4 rule A) treats image-block
|
|
511
|
+
// sections as "download asset + emit <img src>" — NEVER tries to
|
|
512
|
+
// reproduce the visual contents in HTML/CSS.
|
|
513
|
+
const IMAGE_BLOCK_WRAPPER_PATTERNS = [
|
|
514
|
+
/product-info__image/i,
|
|
515
|
+
/image-with-text/i,
|
|
516
|
+
/\bhero-image\b/i,
|
|
517
|
+
/\btrust-badges\b/i,
|
|
518
|
+
/\bfeatures-points\b/i,
|
|
519
|
+
/\bmothers?-day\b/i,
|
|
520
|
+
/\bsingle-image\b/i,
|
|
521
|
+
/\bbanner-image\b/i,
|
|
522
|
+
/\bguarantee-card\b/i,
|
|
523
|
+
/\bbest-fit-size-chart\b/i,
|
|
524
|
+
]
|
|
525
|
+
function classifyAsImageBlock(node) {
|
|
526
|
+
if (!node || typeof node !== 'object') return null
|
|
527
|
+
const sectionBbox = node.bbox
|
|
528
|
+
if (!sectionBbox || !sectionBbox.h || !sectionBbox.w) return null
|
|
529
|
+
const sectionArea = sectionBbox.h * sectionBbox.w
|
|
530
|
+
if (sectionArea <= 0) return null
|
|
531
|
+
// Match by wrapper-class pattern on the node itself OR any
|
|
532
|
+
// descendant (the wrapper might be a child of the section root).
|
|
533
|
+
function findClassMatch(n) {
|
|
534
|
+
if (!n || typeof n !== 'object') return null
|
|
535
|
+
if (Array.isArray(n.classes)) {
|
|
536
|
+
for (const cls of n.classes) {
|
|
537
|
+
for (const pat of IMAGE_BLOCK_WRAPPER_PATTERNS) {
|
|
538
|
+
if (pat.test(cls)) return cls
|
|
539
|
+
}
|
|
540
|
+
}
|
|
541
|
+
}
|
|
542
|
+
if (Array.isArray(n.children)) {
|
|
543
|
+
for (const c of n.children) {
|
|
544
|
+
const m = findClassMatch(c)
|
|
545
|
+
if (m) return m
|
|
546
|
+
}
|
|
547
|
+
}
|
|
548
|
+
return null
|
|
549
|
+
}
|
|
550
|
+
const classMatch = findClassMatch(node)
|
|
551
|
+
// Find the largest <img> descendant by bbox area.
|
|
552
|
+
let bestImg = null
|
|
553
|
+
function findBiggestImg(n) {
|
|
554
|
+
if (!n || typeof n !== 'object') return
|
|
555
|
+
if (n.tag === 'img' && n.bbox && n.bbox.h > 0 && n.bbox.w > 0) {
|
|
556
|
+
const a = n.bbox.h * n.bbox.w
|
|
557
|
+
if (!bestImg || a > bestImg.area) {
|
|
558
|
+
const src = (n.attrs && (n.attrs.src || n.attrs.srcset)) || ''
|
|
559
|
+
bestImg = { area: a, bbox: n.bbox, src }
|
|
560
|
+
}
|
|
561
|
+
}
|
|
562
|
+
if (Array.isArray(n.children)) {
|
|
563
|
+
for (const c of n.children) findBiggestImg(c)
|
|
564
|
+
}
|
|
565
|
+
}
|
|
566
|
+
findBiggestImg(node)
|
|
567
|
+
const imgArea = bestImg ? bestImg.area : 0
|
|
568
|
+
const coverage = sectionArea > 0 ? imgArea / sectionArea : 0
|
|
569
|
+
if (classMatch || coverage >= 0.8) {
|
|
570
|
+
return {
|
|
571
|
+
reason: classMatch
|
|
572
|
+
? `wrapper-class:${classMatch}`
|
|
573
|
+
: `img-coverage:${(coverage * 100).toFixed(0)}%`,
|
|
574
|
+
imgSrc: bestImg ? bestImg.src : null,
|
|
575
|
+
imgBbox: bestImg ? bestImg.bbox : null,
|
|
576
|
+
coverage,
|
|
577
|
+
}
|
|
578
|
+
}
|
|
579
|
+
return null
|
|
580
|
+
}
|
|
456
581
|
const sourceDom = result.domLive
|
|
457
582
|
? Array.isArray(result.domLive)
|
|
458
583
|
? result.domLive
|
|
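The bbox-coverage half of `classifyAsImageBlock` in the hunk above reduces to "largest `<img>` area divided by section area, compared against 0.8". A minimal sketch of just that rule follows; `largestImgArea`, `imageCoverage`, and the sample tree are hypothetical, and the wrapper-class patterns are deliberately left out.

```javascript
// Hypothetical simplified classifier: coverage rule only, no class patterns.
// Nodes are plain objects: { tag, bbox: { w, h }, children? }.
function largestImgArea(node) {
  if (!node || typeof node !== 'object') return 0
  let best = 0
  if (node.tag === 'img' && node.bbox && node.bbox.w > 0 && node.bbox.h > 0) {
    best = node.bbox.w * node.bbox.h
  }
  for (const child of node.children || []) {
    best = Math.max(best, largestImgArea(child))
  }
  return best
}

function imageCoverage(section) {
  const area = section.bbox.w * section.bbox.h
  return area > 0 ? largestImgArea(section) / area : 0
}

// Assumed sample section for illustration only.
const section = {
  tag: 'section',
  bbox: { w: 1000, h: 500 },
  children: [
    { tag: 'div', bbox: { w: 1000, h: 500 }, children: [{ tag: 'img', bbox: { w: 1000, h: 450 } }] },
    { tag: 'img', bbox: { w: 100, h: 100 } },
  ],
}
const coverage = imageCoverage(section)
console.log(coverage >= 0.8) // true: the 1000x450 image covers 90% of the section
```

In the diff itself a wrapper-class match anywhere in the subtree is sufficient on its own, so a section can be classified as an image block even when no `<img>` is found, in which case `imgSrc` and `imgBbox` are null.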
@@ -496,6 +621,7 @@ async function main() {
|
|
|
496
621
|
sectionList.push({
|
|
497
622
|
sectionType: labelFromNode(grand),
|
|
498
623
|
bbox: grand.bbox,
|
|
624
|
+
imageBlock: classifyAsImageBlock(grand),
|
|
499
625
|
})
|
|
500
626
|
}
|
|
501
627
|
}
|
|
@@ -507,6 +633,7 @@ async function main() {
|
|
|
507
633
|
sectionList.push({
|
|
508
634
|
sectionType: labelFromNode(child),
|
|
509
635
|
bbox,
|
|
636
|
+
imageBlock: classifyAsImageBlock(child),
|
|
510
637
|
})
|
|
511
638
|
}
|
|
512
639
|
|
|
@@ -567,6 +694,7 @@ async function main() {
|
|
|
567
694
|
sectionType: sec.sectionType,
|
|
568
695
|
bbox: sec.bbox,
|
|
569
696
|
path: cropPath,
|
|
697
|
+
imageBlock: sec.imageBlock || null,
|
|
570
698
|
})
|
|
571
699
|
} catch (err) {
|
|
572
700
|
log(`section crop ${idx} ${sec.sectionType} failed: ${err?.message || err}`)
|
|
@@ -1801,7 +1929,7 @@ function extractInPage({ selector, maxDepth, maxChildren, maxText }) {
|
|
|
1801
1929
|
try {
|
|
1802
1930
|
parsed = new DOMParser().parseFromString(snapshot.html, 'text/html')
|
|
1803
1931
|
} catch (_) {
|
|
1804
|
-
return { ...snapshot, links: [], headings: [], inputs: [], forms: [], images: [] }
|
|
1932
|
+
return { ...snapshot, links: [], headings: [], inputs: [], forms: [], images: [], blocks: [], paragraphs: [], inlineSvgs: [] }
|
|
1805
1933
|
}
|
|
1806
1934
|
const root = parsed.body.firstElementChild || parsed.body
|
|
1807
1935
|
const norm = (s) => (s || '').replace(/\s+/g, ' ').trim()
|
|
@@ -1812,7 +1940,8 @@ function extractInPage({ selector, maxDepth, maxChildren, maxText }) {
|
|
|
1812
1940
|
label: a.getAttribute('aria-label') || null,
|
|
1813
1941
|
}))
|
|
1814
1942
|
.filter((l) => l.href && (l.text || l.label))
|
|
1815
|
-
const
|
|
1943
|
+
const HEADING_SELECTOR = 'h1, h2, h3, h4, h5, h6, p.bold, .footer__block-title, [class*="heading"]'
|
|
1944
|
+
const headings = Array.from(root.querySelectorAll(HEADING_SELECTOR))
|
|
1816
1945
|
.map((h) => ({ tag: h.tagName.toLowerCase(), text: norm(h.textContent) }))
|
|
1817
1946
|
.filter((h) => h.text)
|
|
1818
1947
|
const inputs = Array.from(root.querySelectorAll('input')).map((i) => ({
|
|
@@ -1833,8 +1962,96 @@ function extractInPage({ selector, maxDepth, maxChildren, maxText }) {
|
|
|
1833
1962
|
alt: img.getAttribute('alt') || '',
|
|
1834
1963
|
width: img.getAttribute('width') || null,
|
|
1835
1964
|
height: img.getAttribute('height') || null,
|
|
1965
|
+
classList: Array.from(img.classList || []),
|
|
1966
|
+
parentClass: img.parentElement ? Array.from(img.parentElement.classList || []) : [],
|
|
1836
1967
|
}))
|
|
1837
1968
|
.filter((img) => img.src)
|
|
1969
|
+
// §V03-5 — inline SVG capture. Composer needs the raw SVG markup to
|
|
1970
|
+
// preserve ornamental decorators (Clinicians' Choice flanking flora,
|
|
1971
|
+
// social icons, payment cards when those are inline rather than <img>).
|
|
1972
|
+
// Intended: skip extremely small (single-path icon < 16x16) and absurdly large
|
|
1973
|
+
// svgs that are likely page-level decorative blobs (no size filter is applied below).
|
|
1974
|
+
const inlineSvgs = Array.from(root.querySelectorAll('svg'))
|
|
1975
|
+
.map((svg, i) => {
|
|
1976
|
+
const r = svg.getBoundingClientRect ? svg.getBoundingClientRect() : { width: 0, height: 0 }
|
|
1977
|
+
const w = svg.getAttribute('width') || r.width || null
|
|
1978
|
+
const h = svg.getAttribute('height') || r.height || null
|
|
1979
|
+
const viewBox = svg.getAttribute('viewBox') || null
|
|
1980
|
+
const ariaLabel = svg.getAttribute('aria-label') || null
|
|
1981
|
+
const title = svg.querySelector('title')?.textContent || null
|
|
1982
|
+
const classes = Array.from(svg.classList || [])
|
|
1983
|
+
return {
|
|
1984
|
+
idx: i,
|
|
1985
|
+
outerHTML: svg.outerHTML,
|
|
1986
|
+
width: w,
|
|
1987
|
+
height: h,
|
|
1988
|
+
viewBox,
|
|
1989
|
+
ariaLabel,
|
|
1990
|
+
title,
|
|
1991
|
+
classes,
|
|
1992
|
+
parentClass: svg.parentElement ? Array.from(svg.parentElement.classList || []) : [],
|
|
1993
|
+
}
|
|
1994
|
+
})
|
|
1995
|
+
// §V03-5 — blocks: heading-anchored sub-sections. Walks DOM in source
|
|
1996
|
+
// order and groups text content under each heading (e.g. "Need Help?"
|
|
1997
|
+
// heading + its paragraphs + its mailto link). This lets the composer
|
|
1998
|
+
// emit semantically grouped output where the live source intended.
|
|
1999
|
+
const blocks = []
|
|
2000
|
+
let currentBlock = null
|
|
2001
|
+
function nodeHeadingText(el) {
|
|
2002
|
+
if (!el) return null
|
|
2003
|
+
if (el.matches && el.matches(HEADING_SELECTOR)) return norm(el.textContent)
|
|
2004
|
+
return null
|
|
2005
|
+
}
|
|
2006
|
+
function walkInOrder(el) {
|
|
2007
|
+
if (!el || el.nodeType !== 1) return
|
|
2008
|
+
const headingText = nodeHeadingText(el)
|
|
2009
|
+
if (headingText) {
|
|
2010
|
+
if (currentBlock) blocks.push(currentBlock)
|
|
2011
|
+
currentBlock = {
|
|
2012
|
+
heading: { tag: el.tagName.toLowerCase(), text: headingText },
|
|
2013
|
+
paragraphs: [],
|
|
2014
|
+
links: [],
|
|
2015
|
+
mailtos: [],
|
|
2016
|
+
}
|
|
2017
|
+
return // don't descend; siblings of the heading carry the content
|
|
2018
|
+
}
|
|
2019
|
+
// collect paragraphs / inline content for the current block
|
|
2020
|
+
if (currentBlock) {
|
|
2021
|
+
if (el.tagName === 'P') {
|
|
2022
|
+
// <p> can contain inline children (<a>, <strong>, etc.) — what
|
|
2023
|
+
// matters is that it has no block-level nested elements.
|
|
2024
|
+
const hasBlockChild = Array.from(el.children).some((c) =>
|
|
2025
|
+
/^(DIV|UL|OL|SECTION|ARTICLE|FORM|HEADER|FOOTER|NAV|FIGURE)$/.test(c.tagName),
|
|
2026
|
+
)
|
|
2027
|
+
const txt = norm(el.textContent)
|
|
2028
|
+
// Copyright / "All rights reserved" lines are end-of-section
|
|
2029
|
+
// markers — close the current block before they get attached
|
|
2030
|
+
// to an unrelated heading like "Need Help?".
|
|
2031
|
+
if (/^(©|copyright)/i.test(txt) || /all rights reserved/i.test(txt)) {
|
|
2032
|
+
if (currentBlock) {
|
|
2033
|
+
blocks.push(currentBlock)
|
|
2034
|
+
currentBlock = null
|
|
2035
|
+
}
|
|
2036
|
+
} else if (txt && txt.length >= 8 && !hasBlockChild) {
|
|
2037
|
+
currentBlock.paragraphs.push(txt)
|
|
2038
|
+
}
|
|
2039
|
+
}
|
|
2040
|
+
if (el.tagName === 'A') {
|
|
2041
|
+
const href = el.getAttribute('href') || ''
|
|
2042
|
+
if (href.startsWith('mailto:')) currentBlock.mailtos.push(href.replace(/^mailto:/, ''))
|
|
2043
|
+
else if (href) currentBlock.links.push({ href, text: norm(el.textContent) })
|
|
2044
|
+
}
|
|
2045
|
+
}
|
|
2046
|
+
for (const c of el.children) walkInOrder(c)
|
|
2047
|
+
}
|
|
2048
|
+
walkInOrder(root)
|
|
2049
|
+
if (currentBlock) blocks.push(currentBlock)
|
|
2050
|
+
// §V03-5 — paragraphs: all <p> in order, for sections without
|
|
2051
|
+
// heading anchors (e.g. footer brand description sitting alone).
|
|
2052
|
+
const paragraphs = Array.from(root.querySelectorAll('p'))
|
|
2053
|
+
.map((p) => norm(p.textContent))
|
|
2054
|
+
.filter((t) => t && t.length >= 8 && t.length < 400)
|
|
1838
2055
|
return {
|
|
1839
2056
|
html: snapshot.html,
|
|
1840
2057
|
bbox: snapshot.bbox,
|
|
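The heading-anchored grouping that `walkInOrder` performs in the hunk above can be illustrated on a flat list instead of a parsed DOM. This is a hedged sketch under simplifying assumptions: `groupByHeading` is a hypothetical name, only `h1`-`h6` count as headings (the real `HEADING_SELECTOR` is broader), and the block-child check on `<p>` is omitted.

```javascript
// Hypothetical flat-input version: elements are already in document order,
// each { tag, text, href? }. The real code walks a DOMParser tree instead.
function groupByHeading(elements) {
  const blocks = []
  let current = null
  for (const el of elements) {
    if (/^h[1-6]$/.test(el.tag)) {
      if (current) blocks.push(current) // a new heading closes the open block
      current = { heading: { tag: el.tag, text: el.text }, paragraphs: [], links: [], mailtos: [] }
    } else if (!current) {
      continue // content before the first heading is not grouped
    } else if (el.tag === 'p') {
      if (/^(©|copyright)/i.test(el.text) || /all rights reserved/i.test(el.text)) {
        blocks.push(current) // copyright lines end the current block
        current = null
      } else if (el.text.length >= 8) {
        current.paragraphs.push(el.text)
      }
    } else if (el.tag === 'a' && el.href) {
      if (el.href.startsWith('mailto:')) current.mailtos.push(el.href.replace(/^mailto:/, ''))
      else current.links.push({ href: el.href, text: el.text })
    }
  }
  if (current) blocks.push(current)
  return blocks
}

// Assumed sample footer content for illustration only.
const blocks = groupByHeading([
  { tag: 'h3', text: 'Need Help?' },
  { tag: 'p', text: 'Our team replies within one business day.' },
  { tag: 'a', text: 'Email us', href: 'mailto:support@example.com' },
  { tag: 'p', text: '© 2024 Example Store. All rights reserved.' },
])
console.log(JSON.stringify(blocks, null, 2))
```

The copyright paragraph closes the "Need Help?" block rather than being appended to it, which is the behavior the comment in the diff describes.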
@@ -1843,10 +2060,18 @@ function extractInPage({ selector, maxDepth, maxChildren, maxText }) {
|
|
|
1843
2060
|
inputs,
|
|
1844
2061
|
forms,
|
|
1845
2062
|
images,
|
|
2063
|
+
inlineSvgs,
|
|
2064
|
+
blocks,
|
|
2065
|
+
paragraphs,
|
|
1846
2066
|
}
|
|
1847
2067
|
}
|
|
1848
2068
|
const hydratedHeader = extractChrome(window.__sbHydratedHeader)
|
|
1849
2069
|
const hydratedFooter = extractChrome(window.__sbHydratedFooter)
|
|
2070
|
+
// §V03-5 announcement bar — captured separately to allow the composer
|
|
2071
|
+
// to emit it as part of the global chrome (clean/global/header.html
|
|
2072
|
+
// composed with the bar prepended) instead of leaking into per-page
|
|
2073
|
+
// body content.
|
|
2074
|
+
const hydratedAnnouncementBar = window.__sbHydratedAnnouncementBar || null
|
|
1850
2075
|
|
|
1851
2076
|
return {
|
|
1852
2077
|
sectionType,
|
|
@@ -1863,6 +2088,7 @@ function extractInPage({ selector, maxDepth, maxChildren, maxText }) {
|
|
|
1863
2088
|
externalIframes,
|
|
1864
2089
|
hydratedHeader,
|
|
1865
2090
|
hydratedFooter,
|
|
2091
|
+
hydratedAnnouncementBar,
|
|
1866
2092
|
}
|
|
1867
2093
|
}
|
|
1868
2094
|
|