npm - @zenuml/core - Versions diffs - 3.47.0 → 3.47.2 - Mend

@zenuml/core 3.47.0 → 3.47.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (30)

package/.agents/skills/babysit-pr/SKILL.md +223 -0
package/.agents/skills/babysit-pr/agents/openai.yaml +7 -0
package/.agents/skills/dia-scoring/SKILL.md +139 -0
package/.agents/skills/dia-scoring/agents/openai.yaml +7 -0
package/.agents/skills/dia-scoring/references/selectors-and-keys.md +253 -0
package/.agents/skills/land-pr/SKILL.md +120 -0
package/.agents/skills/propagate-core-release/SKILL.md +205 -0
package/.agents/skills/propagate-core-release/agents/openai.yaml +7 -0
package/.agents/skills/propagate-core-release/references/downstreams.md +42 -0
package/.agents/skills/ship-branch/SKILL.md +105 -0
package/.agents/skills/submit-branch/SKILL.md +76 -0
package/.agents/skills/validate-branch/SKILL.md +72 -0
package/.claude/skills/emoji-eval/SKILL.md +187 -0
package/.claude/skills/propagate-core-release/SKILL.md +81 -76
package/.claude/skills/propagate-core-release/agents/openai.yaml +2 -2
package/AGENTS.md +1 -1
package/dist/stats.html +1 -1
package/dist/zenuml.esm.mjs +16210 -15460
package/dist/zenuml.js +540 -535
package/docs/superpowers/plans/2026-03-30-emoji-support.md +1220 -0
package/docs/superpowers/plans/2026-03-30-self-correcting-scoring.md +206 -0
package/e2e/data/compare-cases.js +233 -0
package/e2e/tools/compare-case.html +17 -3
package/package.json +3 -3
package/playwright.config.ts +1 -1
package/scripts/analyze-compare-case/collect-data.mjs +159 -16
package/scripts/analyze-compare-case/config.mjs +1 -1
package/scripts/analyze-compare-case/report.mjs +5 -0
package/scripts/analyze-compare-case/residual-scopes.mjs +23 -1
package/scripts/analyze-compare-case/scoring.mjs +13 -0

package/.agents/skills/babysit-pr/SKILL.md ADDED

@@ -0,0 +1,223 @@
+---
+name: babysit-pr
+description: Monitor and fix failing GitHub Actions CI checks on PRs for mermaid-js/zenuml-core. Use when the user says "babysit PR", "check PR status", "fix CI", "PR is failing", "watch this PR", "why is CI red", or when used with /loop to continuously monitor a PR. Also use when Playwright snapshot failures occur in CI, lint/format issues block merging, or unit tests fail on a PR. Triggers on any PR monitoring, CI failure diagnosis, or automated fix-and-retry workflow.
+---
+# Babysit PR
+Monitor a GitHub Actions PR, diagnose failures, attempt fixes, and retry — up to 3 times total.
+## Scope
+This skill targets **mermaid-js/zenuml-core** only. All commands run from the zenuml-core directory.
+## Step 1: Find the PR
+Resolve which PR to babysit, in this priority order:
+1. **Explicit PR number** — if the user provided one (e.g., `#341`), use it
+2. **Current branch PR** — run `gh pr view --json number,title,headRefName,state,statusCheckRollup` from the zenuml-core directory
+3. **Recently failed PR** — if no PR on current branch, find the most recent failed PR in the last 10 minutes:
+   ```bash
+   gh run list --repo mermaid-js/zenuml-core --status failure --limit 5 --json databaseId,headBranch,event,createdAt,conclusion,name
+   ```
+   Filter to runs created within the last 10 minutes. If multiple, pick the most recent.
+If no PR is found, tell the user and stop.
+## Step 2: Check CI Status
+```bash
+gh pr checks <PR_NUMBER> --repo mermaid-js/zenuml-core
+```
+**If all checks pass**: Report success and stop. Nothing to babysit.
+**If checks are still running**: Report status and wait. Use `gh run watch <RUN_ID> --repo mermaid-js/zenuml-core` to wait for completion (with a 10-minute timeout). Then re-evaluate.
+**If checks failed**: Proceed to Step 3.
+## Step 3: Diagnose Failures
+For each failed check, pull the logs:
+```bash
+gh run view <RUN_ID> --repo mermaid-js/zenuml-core --log-failed
+```
+Categorize the failure:
+| Category | Indicators |
+|----------|-----------|
+| **Playwright snapshot mismatch** | `Error: A snapshot.*doesn't match`, `Screenshot comparison failed`, pixel diff errors, `-linux.png` referenced |
+| **Playwright test logic failure** | Assertion errors, timeouts, element not found — but NOT snapshot diffs |
+| **Unit test failure** | Failures in `bun run test`, vitest output |
+| **Lint/format failure** | ESLint errors, Prettier diffs |
+| **Build failure** | Vite build errors, TypeScript compilation errors |
+| **Merge conflict** | `CONFLICT`, `merge conflict`, cannot rebase cleanly |
+| **Infra/flaky** | Network timeouts, runner issues, cache failures |
+## Step 4: Attempt Fix
+**Important**: Before fixing, make sure the local branch is up to date with the PR branch:
+```bash
+git fetch origin && git checkout <PR_BRANCH> && git pull origin <PR_BRANCH>
+```
+Before any local `bun pw` run in this workflow, verify that port `8080` is either free or already owned by a dev server started from this repo. `playwright.config.ts` reuses existing servers outside CI, so a Vite server from another repo will produce invalid local results.
+```bash
+PORT="${PORT:-8080}"
+THIS_REPO="$(pwd -P)"
+LISTENER_PID="$(lsof -tiTCP:${PORT} -sTCP:LISTEN 2>/dev/null | head -n1 || true)"
+if [ -n "$LISTENER_PID" ]; then
+  LISTENER_CMD="$(ps -p "$LISTENER_PID" -o command=)"
+  if [[ "$LISTENER_CMD" != *"$THIS_REPO"* ]]; then
+    echo "Port ${PORT} is owned by another repo; killing PID ${LISTENER_PID}"
+    kill "$LISTENER_PID"
+  fi
+fi
+```
+If you killed a different repo's server, do **not** start Vite manually. Let `bun pw` launch the correct dev server from this folder.
+### Fix by Category
+#### Playwright Snapshot Mismatch (Linux)
+This is the most common CI-only failure because snapshots are platform-specific.
+1. **Verify it's a snapshot diff** (not a logic error) by reading the failure log
+2. **Check if the change is intentional** — look at recent commits on the branch. If they modified rendering code, SVG output, or CSS, snapshot updates are expected
+3. **Trigger the Linux snapshot update workflow**:
+   ```bash
+   gh workflow run update-snapshots.yml --repo mermaid-js/zenuml-core --ref <PR_BRANCH>
+   ```
+4. **Wait for the workflow to complete**:
+   ```bash
+   # Find the run ID (most recent on that branch)
+   gh run list --repo mermaid-js/zenuml-core --workflow update-snapshots.yml --branch <PR_BRANCH> --limit 1 --json databaseId,status
+   # Watch it
+   gh run watch <RUN_ID> --repo mermaid-js/zenuml-core
+   ```
+5. **Pull the auto-committed snapshots** locally:
+   ```bash
+   git pull origin <PR_BRANCH>
+   ```
+6. The update-snapshots workflow commits and verifies automatically. If it passes, CI should go green on next run.
+#### Playwright Test Logic Failure
+1. **Reproduce locally first**:
+   ```bash
+   # Run the 8080 ownership preflight above first.
+   bun pw --grep "<test name pattern>"
+   ```
+2. **Read the failing test** to understand what it expects
+3. **Fix the code** (not the test, unless the test expectation is wrong)
+4. **Verify locally**: `bun pw --grep "<test name pattern>"`
+5. **Commit and push**
+#### Unit Test Failure
+1. **Reproduce locally**:
+   ```bash
+   bun run test --run
+   ```
+2. **Fix the code or test**
+3. **Verify**: `bun run test --run`
+4. **Commit and push**
+#### Lint/Format Failure
+1. **Auto-fix**:
+   ```bash
+   bun eslint
+   bun prettier
+   ```
+2. **Verify no remaining issues**:
+   ```bash
+   bun eslint 2>&1 | tail -5
+   ```
+3. **Commit and push** the formatting fixes
+#### Build Failure
+1. **Reproduce locally**:
+   ```bash
+   bun build
+   ```
+2. **Read the error** — usually TypeScript errors or missing imports
+3. **Fix, verify locally, commit and push**
+#### Merge Conflict
+1. **Report to user** — do NOT auto-resolve merge conflicts. Show what's conflicting and ask for guidance.
+#### Infra/Flaky
+1. **Re-run the failed job**:
+   ```bash
+   gh run rerun <RUN_ID> --repo mermaid-js/zenuml-core --failed
+   ```
+2. If it fails again with the same infra error, report to user.
+## Step 5: Push and Monitor
+After applying a fix:
+1. **Run the full local test suite** before pushing (when the failure category allows local reproduction):
+   ```bash
+   bun run test --run   # unit tests
+   # Run the 8080 ownership preflight above first.
+   bun pw               # playwright (local, macOS — won't catch Linux snapshot diffs)
+   bun eslint           # lint
+   ```
+2. **Commit with a clear message**:
+   ```bash
+   git add <specific files>
+   git commit -m "fix: <what was fixed> to pass CI"
+   ```
+3. **Push**:
+   ```bash
+   git push origin <PR_BRANCH>
+   ```
+4. **Wait for CI** — use `gh run watch` on the new run
+5. **Evaluate result** — go back to Step 2
+## Step 6: Retry Budget
+Track attempts. Each "attempt" is one push-and-wait cycle (or one workflow trigger-and-wait for snapshot updates).
+- **Maximum 3 attempts total**
+- After each failed attempt, re-diagnose from scratch (Step 3) — the failure mode may have changed
+- **If a test passes on retry without code changes**, flag it as potentially flaky:
+  > "Test `<name>` passed on retry without changes — likely flaky. Consider investigating stability."
+- **After 3 failed attempts**, stop and report:
+  - What was tried
+  - What the current failure is
+  - Your best theory for root cause
+  - Suggested next steps for the user
+## Step 7: Summary Report
+After babysitting completes (success or exhausted retries), produce a brief report:
+```
+## PR #<number> Babysit Report
+- **Status**: [PASSED | FAILED after N attempts]
+- **Failures found**: <list of categories>
+- **Fixes applied**: <list of commits pushed>
+- **Flaky tests**: <any tests that passed on retry without changes>
+- **Manual attention needed**: <anything unresolved>
+```
+## Safety Rules
+- **Never force-push** — always regular `git push`
+- **Never resolve merge conflicts automatically** — report and ask
+- **Never push while CI is still running** from a previous attempt — wait for it to finish first
+- **Never modify the snapshot update workflow itself** — only trigger it
+- **Always verify fixes locally** before pushing (except Linux snapshot updates which can only be verified in CI)
+- **Check for in-progress CI** before pushing — avoid wasting CI minutes on runs that will be superseded
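
The Step 3 category table in the added `babysit-pr/SKILL.md` above could be approximated in code roughly as follows. This is a minimal JavaScript sketch and not something the package ships: `categorizeFailure`, the category names, and every regex are assumptions for illustration, loosely derived from the table's indicators. Rules are checked in order and the first match wins, so the more specific categories come first.

```javascript
// Hypothetical rule list: [category, pattern]. The patterns are illustrative
// guesses based on the Step 3 table, not strings the skill itself defines.
const CATEGORY_RULES = [
  ["merge-conflict", /CONFLICT|merge conflict/i],
  ["playwright-snapshot", /Screenshot comparison failed|-linux\.png|snapshot.*doesn't match/i],
  ["build", /vite.*build|error TS\d+/i],
  ["lint-format", /eslint|prettier/i],
  ["unit-test", /vitest/i],
  ["infra-flaky", /ETIMEDOUT|ECONNRESET|cache.*fail/i],
];

// Returns the first matching category for a failed-run log excerpt.
// Playwright logic failures have no reliable text marker in this sketch,
// so anything unmatched is handed back for a manual read of the log.
function categorizeFailure(logText) {
  for (const [category, pattern] of CATEGORY_RULES) {
    if (pattern.test(logText)) return category;
  }
  return "needs-manual-triage";
}
```

A caller would pass the text returned by `gh run view ... --log-failed` and branch on the result. Real logs may match several rules, so the ordering above is a design choice rather than a given.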

package/.agents/skills/babysit-pr/agents/openai.yaml ADDED

@@ -0,0 +1,7 @@
+interface:
+  display_name: "Babysit PR"
+  short_description: "Monitor and fix failing PR CI checks"
+  default_prompt: "Use $babysit-pr to diagnose failing GitHub Actions checks on a zenuml-core PR, fix what is actionable, push updates, and watch CI until it is green."
+policy:
+  allow_implicit_invocation: true

package/.agents/skills/dia-scoring/SKILL.md ADDED

@@ -0,0 +1,139 @@
+---
+name: dia-scoring
+description: Score HTML-vs-SVG diagram parity in compare-case pages, including message labels, fragment labels, sequence numbers, arrows, participant headers, icons, stereotypes, participant colors, participant groups, comments, and residual diff scopes. Use Playwright for page inspection and semantic attribution.
+---
+# Dia Scoring
+Use this skill when the task is to measure **message labels, fragment labels, sequence numbers, message arrows, participant labels, participant boxes, participant icons, stereotypes, participant colors, participant groups, inline comments, and residual diff hotspots** between the HTML renderer and the native SVG renderer on `compare-case.html`.
+## Diff Source of Truth
+The `native-diff-ext` Chrome extension is the absolute source of truth for pixel diff. The analyzer script (`scripts/analyze-compare-case.mjs`) uses the same CDP screenshot capture method and the same diff algorithm (`cy/diff-algorithm.js`), producing identical results when run against the same viewport.
+- Use the **analyzer script** as the primary scoring tool (automated, reproducible, CLI-driven)
+- Use the **extension** for live interactive inspection in the browser
+- Both use CDP `Page.captureScreenshot` with `DOM.getBoxModel` border-box clip
+- When calibrating the skill, verify against the extension's live `#diff-panel canvas`
+The workflow:
+1. Run `node scripts/analyze-compare-case.mjs --case <name> --json` for structured data.
+2. Use `--output-dir <dir>` when you need saved `html.png`, `svg.png`, `diff.png`, and `report.json`.
+3. For live browser inspection, navigate to `http://localhost:8080/e2e/tools/compare-case.html?case=<name>` and use the extension's `#diff-panel canvas`.
+4. Use Playwright page inspection for semantic attribution (element positions, font metrics, DOM structure).
+## Offset Anchor
+All reported offsets must use the **outermost frame's top-left corner** as the anchor.
+- HTML anchor: the compare-case HTML frame root
+- SVG anchor: the compare-case SVG root / outer frame root
+- Do not report alternate offset systems
+- Do not anchor offsets to participant boxes, label boxes, stereotype boxes, or local containers
+- If a local-container-relative reading differs from the frame-anchor reading, prefer the frame-anchor reading in all reporting
+## Browser Requirement
+Use **Playwright browser tools only** for browser interaction in this workflow.
+- Preferred tools: `browser_navigate`, `browser_snapshot`, `browser_evaluate`, `browser_take_screenshot`, `browser_click`, `browser_wait_for`
+- Do not use Chrome DevTools browser tools for scoring, DOM inspection, screenshot capture, or residual validation
+- Do not build your own pixel diff from HTML/SVG screenshots. For pixel comparison, use only the extension-rendered `#diff-panel canvas`
+## Rules
+- Do not use `html-to-image` for capture.
+- Use browser-native screenshots only.
+- Use Playwright for browser-native screenshots and page inspection.
+- All offset calculations must be anchored to the outermost frame's top-left corner.
+- When recalibrating the skill itself, verify against the extension's live `#diff-panel canvas`.
+- Do not use Chrome DevTools browser tools for this workflow.
+- Scope:
+  - normal messages
+  - self messages
+  - returns
+  - creation messages (e.g., `«payload»`, `new Order()`)
+  - fragment conditions such as `[cond]`, `[else]`
+  - fragment section labels such as `catch`, `finally`
+  - participant label text and participant box geometry
+  - participant icons (actor, database, ec2, lambda, azurefunction, sqs, sns, iam, boundary, control, entity)
+  - participant stereotypes such as `«BFF»`, `«Interface»`
+  - participant background colors (`#FFEBE6`, `#0747A6`, etc.) and computed text contrast
+  - participant groups (dashed outline containers with title bar)
+  - inline comments (`// text`) above messages and fragments, including styled comments (`// [red] text`)
+  - residual `html-only` and `svg-only` diff clusters scoped back to nearby elements
+- For each supported message, include:
+  - label text
+  - fragment condition / section label text when present
+  - sequence number text, including fragment sequence numbers when present
+  - arrow geometry keyed by sequence number
+  - normal/return arrow endpoint deltas: `left_dx`, `right_dx`, `width_dx`
+  - self-arrow loop geometry from the painted loop path plus arrowhead, not the outer `svg` viewport
+  - self-arrow vertical deltas: `top_dy`, `bottom_dy`, `height_dy`
+- For participant icons, include:
+  - icon presence (HTML vs SVG)
+  - participant label text when the participant has an icon
+  - icon position relative to participant label
+  - icon visual match confirmation from diff image
+- For participant stereotypes, include:
+  - stereotype text presence (HTML vs SVG), e.g. `«BFF»`
+  - stereotype position relative to participant label (above label, smaller font)
+  - stereotype offset must be measured with per-letter glyph-box comparison relative to the outermost frame anchor
+  - do not use participant-box-relative or other local-container-relative deltas in final reporting
+  - do not mark a stereotype as clean from glyph boxes alone; also check the live `#diff-panel canvas` in the stereotype row
+  - if glyph-box deltas are `0/0` but the panel still shows localized red/blue pixels overlapping the stereotype glyph union, report the stereotype as `ambiguous` or `paint-level residual`, not clean
+  - stereotype text color matching participant background contrast
+- For participant colors, include:
+  - background fill color (hex value) on participant rect
+  - text color contrast (dark text on light bg, white text on dark bg)
+  - color application to both top and bottom participant boxes
+- For participant groups, include:
+  - group name text presence and position (centered title bar)
+  - dashed outline rect enclosing grouped participants
+  - group bounds: leftmost to rightmost participant with margin
+  - group height extending to diagram bottom
+- For inline comments, include:
+  - comment text presence and position (above the associated statement)
+  - comment Y offset from the message/fragment it belongs to
+  - fragment-level comments (e.g. `// comment 4` before `if(...)`) positioned above fragment header
+  - when all letters are `ambiguous` due to large positional offset (e.g. fragment comments at wrong X), the analyzer reports `box_dx` / `box_dy` from the bounding boxes instead of suppressing the measurement
+  - styled comment color application (e.g. `// [red] text`)
+- For participant boxes, use the analyzer script output directly:
+  - Report `html_box`, `svg_box`, `dx`, `dy`, `dw`, `dh` from the script's `participant_boxes` section
+  - The script already applies stroke correction (`strokedElementOuterRect`) — do not re-measure with `browser_evaluate`
+- For residual scopes, include:
+  - connected `html-only` and `svg-only` diff clusters from `#diff-panel canvas`
+  - cluster `size`, `bbox`, and `centroid`
+  - nearest scoped HTML and SVG targets at that position
+  - summaries that explain which element a remaining positional diff most likely belongs to
+  - live native diff panel confirmation before claiming a hotspot is real
+  - the largest confirmed live-panel `html-only` and `svg-only` clusters with approximate positions
+  - grouped summaries of where the panel's red and blue pixels are concentrated
+- Do not report a residual hotspot as real if it is absent from the live `#diff-panel canvas`.
+- Do not stop at totals like `HTML-only (44)` or `SVG-only (55)` when residuals matter; report where those pixels are.
+- Each reported letter must be backed by:
+  - direct HTML-vs-SVG browser layout positions
+  - pixel-panel confirmation from `#diff-panel canvas`
+- Participant stereotypes are first-class targets, not just part of `participant-box` or `participant-label`.
+- If the evidence is weak or contradictory, keep the letter `ambiguous`.
+## Known Analyzer Internals
+### Arrow pairing by sequence number
+The analyzer pairs arrows by sequence number (`text` field), not by label text (`pairText`). Calibrated on `repro-creation-return-arrow` (2026-03-24).
+## Commands
+Run from [../..](../..):
+```bash
+node scripts/analyze-compare-case.mjs --case async-2a
+node scripts/analyze-compare-case.mjs --case async-2a --json
+node scripts/analyze-compare-case.mjs --case async-2a --output-dir tmp/message-elements/async-2a
+```
+## References
+- Selector and pairing details: [references/selectors-and-keys.md](references/selectors-and-keys.md)
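
The frame-anchored arrow deltas that the added `dia-scoring/SKILL.md` asks for (`left_dx`, `right_dx`, `width_dx`, plus `top_dy`, `bottom_dy`, `height_dy` for self arrows) can be illustrated with a short JavaScript sketch. The function names, the rectangle shape, and the sign convention (SVG minus HTML) are assumptions made here; the diff does not show how the analyzer actually computes them.

```javascript
// Convert an absolute rect into coordinates relative to the outermost
// frame's top-left corner (the only anchor the skill allows in reports).
function toFrameRelative(rect, frameOrigin) {
  return {
    left: rect.left - frameOrigin.x,
    right: rect.right - frameOrigin.x,
    top: rect.top - frameOrigin.y,
    bottom: rect.bottom - frameOrigin.y,
  };
}

// Endpoint-based deltas between frame-relative HTML and SVG arrow rects.
// Assumed convention: positive means the SVG edge sits further right/down.
function arrowDeltas(htmlRect, svgRect, { self = false } = {}) {
  const deltas = {
    left_dx: svgRect.left - htmlRect.left,
    right_dx: svgRect.right - htmlRect.right,
    width_dx: (svgRect.right - svgRect.left) - (htmlRect.right - htmlRect.left),
  };
  if (self) {
    // Vertical deltas are only reported for self-arrow loops.
    deltas.top_dy = svgRect.top - htmlRect.top;
    deltas.bottom_dy = svgRect.bottom - htmlRect.bottom;
    deltas.height_dy = (svgRect.bottom - svgRect.top) - (htmlRect.bottom - htmlRect.top);
  }
  return deltas;
}
```

Each side is normalized against its own frame root first, so the two renderers can sit at different page positions without affecting the deltas.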

package/.agents/skills/dia-scoring/agents/openai.yaml ADDED

@@ -0,0 +1,7 @@
+interface:
+  display_name: "Dia Scoring"
+  short_description: "Diagram label, number, and arrow offsets"
+  default_prompt: "Use $dia-scoring to measure message label, sequence-number, and arrow parity for a compare-case page."
+policy:
+  allow_implicit_invocation: true

package/.agents/skills/dia-scoring/references/selectors-and-keys.md ADDED

@@ -0,0 +1,253 @@
+# Selectors And Keys
+The analyzer uses these roots:
+- HTML root: `#html-output .frame`, fallback `#html-output .sequence-diagram`
+- SVG root: `#svg-output > svg`
+Offset anchor:
+- All reported offsets use the outermost frame root's top-left corner
+- HTML side: `#html-output .frame`, fallback `#html-output .sequence-diagram`
+- SVG side: `#svg-output > svg`
+- Do not emit final `dx` / `dy` values from participant-local or other nested-container anchors
+HTML label extraction:
+- Normal messages: iterate `.interaction`, skip `.return`, `.creation`, and self interactions, then read `.message .editable-span-base`
+- Self messages: `.self-invocation .label .editable-span-base`
+- Returns: `.interaction.return .message .editable-span-base`, fallback `.interaction.return .name`
+- Fragment conditions: `.fragment .segment > .text-skin-fragment:not(.finally)`, using only visible child spans when conditional branches are stacked
+- Fragment sections:
+  - `.fragment.fragment-tcf .segment > .header.inline-block.bg-skin-frame.opacity-65`
+  - `.fragment.fragment-tcf .segment > .header.finally`
+SVG label extraction:
+- Normal messages: `g.message:not(.self-call) > text.message-label`
+- Self messages: `g.message.self-call > text.message-label`
+- Returns: `g.return > text.return-label`
+- Fragment conditions: `g.fragment > text.fragment-condition`
+- Fragment condition / section groups: `g.fragment > g` containing `text.fragment-section-label`
+  - texts starting with `[` are treated as `fragment-condition`
+  - other texts are treated as `fragment-section`
+Pairing key:
+- Semantic grouping is by `kind + text`
+- Duplicate labels are paired by top-to-bottom order within that group
+- Output key is:
+  - `kind`
+  - `text`
+  - `y_order`
+- Fragment labels also include `owner=<fragment header>` in the human-readable summary when available
+Per-letter scoring:
+- Grapheme segmentation uses `Intl.Segmenter`, fallback `Array.from`
+- Glyph boxes come from browser layout ranges, not whole-word centroids
+- Numeric `dx` and `dy` are only emitted when direct layout evidence and diff-image evidence agree
+Arrow extraction:
+- HTML normal/return messages:
+  - line: direct child `svg` line strip inside `.message`
+  - head: direct child arrowhead `svg` inside `.message`
+- HTML self messages:
+  - loop: painted geometry inside `svg.arrow`
+  - parts: outer loop path plus nested arrowhead path
+- SVG normal messages:
+  - line: `line.message-line`
+  - head: `svg.arrow-head`
+- SVG returns:
+  - line: `line.return-line`
+  - head: `polyline.return-arrow`
+- SVG self messages:
+  - loop: painted geometry inside the outer `svg` under `g.message.self-call`
+  - parts: outer loop path plus nested arrowhead path
+Arrow scoring:
+- Arrows are keyed by sequence number when numbering is available, for example `arrow:1.2.3`
+- Normal and return arrows are measured as one combined geometry item:
+  - line + arrow head together
+- Self arrows are measured as one loop geometry item
+- Self arrows use the union of the painted loop path and arrowhead path, not the outer viewport box
+- Arrow output is endpoint-based, not box-centroid-based
+- For normal and return arrows, report:
+  - `left_dx`
+  - `right_dx`
+  - `width_dx`
+- For self arrows, also report:
+  - `top_dy`
+  - `bottom_dy`
+  - `height_dy`
+- Do not report `dy` for horizontal message arrows
+HTML sequence number extraction:
+- Normal messages: `.interaction:not(.return):not(.creation):not(.self-invocation):not(.self) > .message > .absolute.text-xs`
+- Self messages: `.interaction.self-invocation > .message .absolute.text-xs`
+- Returns: `.interaction.return > .message > .absolute.text-xs`
+- Fragments: `.fragment > .header > .absolute.text-xs`
+SVG sequence number extraction:
+- Normal messages: `g.message:not(.self-call) > text.seq-number`
+- Self messages: `g.message.self-call > text.seq-number`
+- Returns: `g.return > text.seq-number`
+- Fragments: `g.fragment > text.seq-number`
+## Participant Icon Extraction
+## Participant Header Extraction
+HTML participant header extraction:
+- Participant root: `.participant[data-participant-id]`
+- Participant box: outer border box from the participant root element
+- Participant stereotype: `label.interface`, when present
+- Participant label: last `.name` descendant, measured by glyph boxes
+SVG participant header extraction:
+- Participant root: `g.participant[data-participant]`
+- Skip `g.participant-bottom`
+- Participant box element: `:scope > rect.participant-box`
+- Participant box measurement must use the painted outer bounds of the stroked rect, not the inset rect geometry
+- Participant stereotype:
+  - prefer `:scope > text.stereotype-label`
+  - fallback: top-most direct `text` child above `text.participant-label`
+- Participant label: `:scope > text.participant-label`
+Participant stereotype pairing and scoring:
+- Pair by participant name
+- Validate text equality, for example `«BFF»`
+- Measure stereotype offset by per-letter glyph boxes relative to the outermost frame root, not by participant-local anchors or whole-word box centroids
+- Report:
+  - `letter_deltas`
+  - concise aggregate only when the per-letter evidence agrees
+- Do not mark the stereotype clean from glyph boxes alone
+- Also check the live `#diff-panel canvas` over the union of the HTML and SVG stereotype glyph boxes
+- If localized red or blue pixels persist in that stereotype region while glyph-box deltas are `0/0`, classify it as `ambiguous` or `paint-level residual`
+Participant box pairing and scoring:
+- Pair by participant name
+- Report `html_box` and `svg_box` with `x`, `y`, `w`, `h`
+- Box `x` / `y` values are frame-anchor-relative
+- Report box deltas:
+  - `dx`
+  - `dy`
+  - `dw`
+  - `dh`
+HTML icon extraction:
+- Participant root: `.participant[data-participant-id]`
+- Top-row participant only: keep the top-most entry for each participant id
+- Icon host: first child inside the centered participant row when it is an async icon host
+  - `[aria-description]`
+  - or contains `svg`
+  - or has `h-6` sizing class from `AsyncIcon`
+- Icon box: union of painted SVG shapes when available, fallback to the host box
+- Participant label: last `.name` descendant, measured by glyph boxes
+SVG icon extraction:
+- Participant root: `g.participant[data-participant]`
+- Skip `g.participant-bottom`
+- Icon element: `:scope > g[transform]`
+- Icon box: union of painted shapes within that transformed group
+- Participant label: `:scope > text.participant-label`
+Icon pairing:
+- Pair by participant name
+- Only report participant icon rows for participants where at least one side has an icon
+- Participant labels for icon-bearing participants are paired by participant name, not raw label text
+Icon scoring:
+- Absolute icon drift:
+  - `icon_dx`
+  - `icon_dy`
+- Absolute icon drift is measured from the outermost frame anchor
+- Relative icon drift against the participant label anchor:
+  - `relative_dx`
+  - `relative_dy`
+- If there is no participant label on one side, use the participant box center as the anchor
+- Report presence mismatch if one renderer has an icon and the other does not
+- Diff confirmation is taken from `#diff-panel canvas`, scoped to the union of the HTML and SVG icon boxes
+## Residual Scope Attribution
+Residual scope extraction:
+- Build connected clusters from the live `#diff-panel canvas` colors:
+  - red = `html-only`
+  - blue = `svg-only`
+- Ignore green `match` and magenta `color diff` pixels for positional scoping
+- Each cluster reports:
+  - `size`
+  - `bbox`
+  - `centroid`
+- These panel-derived clusters are the source of truth for residual hotspots.
+Residual scope candidates:
+- HTML side:
+  - labels
+  - numbers
+  - arrows
+  - participant stereotypes
+  - participant labels
+  - participant icons
+  - participant boxes
+  - diagram root fallback
+- SVG side:
+  - labels
+  - numbers
+  - arrows
+  - participant stereotypes
+  - participant labels
+  - participant icons
+  - participant boxes
+  - `rect.frame-border-inner`, fallback `rect.frame-border` / `rect.frame-box`
+  - diagram root fallback
+Residual scope attribution:
+- Pick the closest candidate to the cluster centroid on each side
+- Prefer targets that contain the centroid
+- Prefer more specific categories over large containers:
+  - `participant-icon`
+  - `participant-stereotype`
+  - `label`, `number`, `participant-label`
+  - `arrow`
+  - `participant-box`
+  - `frame-border`
+  - `diagram-root`
+- Use cluster/target overlap and centroid distance as tie-breakers
+Residual scope output:
+- `residual_scopes`: all attributed clusters
+- `residual_scope_summary`: top 20 concise lines for terminal use
+- `residual_scope_html_only_top`: top 10 `html-only` clusters
+- `residual_scope_svg_only_top`: top 10 `svg-only` clusters
+- When answering from the live panel, also report:
+  - the largest red clusters from `#diff-panel canvas`
+  - the largest blue clusters from `#diff-panel canvas`
+  - approximate diagram-space positions or bounding boxes
+  - attributed HTML and SVG targets for those clusters
+  - a short grouped summary of where the red and blue pixels are concentrated
+Live panel validation:
+- Source of truth for residual hotspots is `#diff-panel canvas`
+- Confirm the hotspot by reading the panel's actual red and blue pixels at that area
+- If the panel shows no red or blue pixels there, do not report that hotspot as a real residual diff
+- If the panel shows non-zero red or blue totals, do not stop at the totals alone; locate the dominant clusters and report them
+- Do not build or rely on a separate screenshot-to-screenshot diff for pixel comparison when `#diff-panel canvas` is available on the page
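
The residual scope extraction described above (connected `html-only` and `svg-only` clusters, each with `size`, `bbox`, and `centroid`) can be sketched as a flood fill over a grid of already-classified pixels. This JavaScript is an illustration only: `findClusters`, the grid-of-strings input, and the choice of 4-connectivity are assumptions, since the diff does not show how the analyzer reads the canvas or which connectivity it uses.

```javascript
// grid[y][x] holds a class label such as "html-only", "svg-only", or "match".
// Returns the connected clusters of the requested kind, largest first.
function findClusters(grid, kind) {
  const height = grid.length;
  const width = height ? grid[0].length : 0;
  const seen = Array.from({ length: height }, () => new Array(width).fill(false));
  const clusters = [];
  const neighbors = [[1, 0], [-1, 0], [0, 1], [0, -1]]; // 4-connectivity (assumed)

  for (let y0 = 0; y0 < height; y0++) {
    for (let x0 = 0; x0 < width; x0++) {
      if (seen[y0][x0] || grid[y0][x0] !== kind) continue;
      // Iterative flood fill from this seed pixel.
      const stack = [[x0, y0]];
      seen[y0][x0] = true;
      let size = 0, sumX = 0, sumY = 0;
      let minX = x0, maxX = x0, minY = y0, maxY = y0;
      while (stack.length) {
        const [x, y] = stack.pop();
        size++; sumX += x; sumY += y;
        minX = Math.min(minX, x); maxX = Math.max(maxX, x);
        minY = Math.min(minY, y); maxY = Math.max(maxY, y);
        for (const [dx, dy] of neighbors) {
          const nx = x + dx, ny = y + dy;
          if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
          if (seen[ny][nx] || grid[ny][nx] !== kind) continue;
          seen[ny][nx] = true;
          stack.push([nx, ny]);
        }
      }
      clusters.push({
        kind,
        size,
        bbox: { x: minX, y: minY, w: maxX - minX + 1, h: maxY - minY + 1 },
        centroid: { x: sumX / size, y: sumY / size },
      });
    }
  }
  return clusters.sort((a, b) => b.size - a.size);
}
```

Sorting by size makes it straightforward to take the top entries for lists such as `residual_scope_html_only_top`. Attribution to the nearest candidate element would be a separate step on each cluster's centroid.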