@zenuml/core 3.47.0 → 3.47.2

Files changed (30)
  1. package/.agents/skills/babysit-pr/SKILL.md +223 -0
  2. package/.agents/skills/babysit-pr/agents/openai.yaml +7 -0
  3. package/.agents/skills/dia-scoring/SKILL.md +139 -0
  4. package/.agents/skills/dia-scoring/agents/openai.yaml +7 -0
  5. package/.agents/skills/dia-scoring/references/selectors-and-keys.md +253 -0
  6. package/.agents/skills/land-pr/SKILL.md +120 -0
  7. package/.agents/skills/propagate-core-release/SKILL.md +205 -0
  8. package/.agents/skills/propagate-core-release/agents/openai.yaml +7 -0
  9. package/.agents/skills/propagate-core-release/references/downstreams.md +42 -0
  10. package/.agents/skills/ship-branch/SKILL.md +105 -0
  11. package/.agents/skills/submit-branch/SKILL.md +76 -0
  12. package/.agents/skills/validate-branch/SKILL.md +72 -0
  13. package/.claude/skills/emoji-eval/SKILL.md +187 -0
  14. package/.claude/skills/propagate-core-release/SKILL.md +81 -76
  15. package/.claude/skills/propagate-core-release/agents/openai.yaml +2 -2
  16. package/AGENTS.md +1 -1
  17. package/dist/stats.html +1 -1
  18. package/dist/zenuml.esm.mjs +16210 -15460
  19. package/dist/zenuml.js +540 -535
  20. package/docs/superpowers/plans/2026-03-30-emoji-support.md +1220 -0
  21. package/docs/superpowers/plans/2026-03-30-self-correcting-scoring.md +206 -0
  22. package/e2e/data/compare-cases.js +233 -0
  23. package/e2e/tools/compare-case.html +17 -3
  24. package/package.json +3 -3
  25. package/playwright.config.ts +1 -1
  26. package/scripts/analyze-compare-case/collect-data.mjs +159 -16
  27. package/scripts/analyze-compare-case/config.mjs +1 -1
  28. package/scripts/analyze-compare-case/report.mjs +5 -0
  29. package/scripts/analyze-compare-case/residual-scopes.mjs +23 -1
  30. package/scripts/analyze-compare-case/scoring.mjs +13 -0
@@ -0,0 +1,223 @@
1
+ ---
2
+ name: babysit-pr
3
+ description: Monitor and fix failing GitHub Actions CI checks on PRs for mermaid-js/zenuml-core. Use when the user says "babysit PR", "check PR status", "fix CI", "PR is failing", "watch this PR", "why is CI red", or when used with /loop to continuously monitor a PR. Also use when Playwright snapshot failures occur in CI, lint/format issues block merging, or unit tests fail on a PR. Triggers on any PR monitoring, CI failure diagnosis, or automated fix-and-retry workflow.
4
+ ---
5
+
6
+ # Babysit PR
7
+
8
+ Monitor a GitHub Actions PR, diagnose failures, attempt fixes, and retry — up to 3 times total.
9
+
10
+ ## Scope
11
+
12
+ This skill targets **mermaid-js/zenuml-core** only. All commands run from the zenuml-core directory.
13
+
14
+ ## Step 1: Find the PR
15
+
16
+ Resolve which PR to babysit, in this priority order:
17
+
18
+ 1. **Explicit PR number** — if the user provided one (e.g., `#341`), use it
19
+ 2. **Current branch PR** — run `gh pr view --json number,title,headRefName,state,statusCheckRollup` from the zenuml-core directory
20
+ 3. **Recently failed PR** — if no PR on current branch, find the most recent failed PR in the last 10 minutes:
21
+ ```bash
22
+ gh run list --repo mermaid-js/zenuml-core --status failure --limit 5 --json databaseId,headBranch,event,createdAt,conclusion,name
23
+ ```
24
+ Filter to runs created within the last 10 minutes. If multiple, pick the most recent.
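The "last 10 minutes, most recent wins" filter can be sketched as a small function over the JSON that `gh run list --json ...` prints. This is only an illustration: `pickRecentFailedRun` and its parameters are hypothetical names, and it assumes `createdAt` is an ISO 8601 timestamp string.

```javascript
// Hypothetical helper: choose the most recent failed run created within a time window.
// `runs` mirrors the fields requested via `gh run list --json databaseId,headBranch,createdAt`.
function pickRecentFailedRun(runs, now = Date.now(), windowMs = 10 * 60 * 1000) {
  const recent = runs
    // Parse the timestamp once per run (assumed ISO 8601).
    .map((run) => ({ ...run, createdMs: Date.parse(run.createdAt) }))
    // Drop runs whose timestamp could not be parsed.
    .filter((run) => Number.isFinite(run.createdMs))
    // Keep only runs created inside the window ending at `now`.
    .filter((run) => now - run.createdMs >= 0 && now - run.createdMs <= windowMs)
    // Newest first.
    .sort((a, b) => b.createdMs - a.createdMs);
  return recent.length > 0 ? recent[0] : null;
}
```

A `null` result corresponds to the "no PR is found" case, where the skill tells the user and stops.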
25
+
26
+ If no PR is found, tell the user and stop.
27
+
28
+ ## Step 2: Check CI Status
29
+
30
+ ```bash
31
+ gh pr checks <PR_NUMBER> --repo mermaid-js/zenuml-core
32
+ ```
33
+
34
+ **If all checks pass**: Report success and stop. Nothing to babysit.
35
+
36
+ **If checks are still running**: Report status and wait. Use `gh run watch <RUN_ID> --repo mermaid-js/zenuml-core` to wait for completion (with a 10-minute timeout). Then re-evaluate.
37
+
38
+ **If checks failed**: Proceed to Step 3.
39
+
40
+ ## Step 3: Diagnose Failures
41
+
42
+ For each failed check, pull the logs:
43
+
44
+ ```bash
45
+ gh run view <RUN_ID> --repo mermaid-js/zenuml-core --log-failed
46
+ ```
47
+
48
+ Categorize the failure:
49
+
50
+ | Category | Indicators |
51
+ |----------|-----------|
52
+ | **Playwright snapshot mismatch** | `Error: A]snapshot.*doesn't match`, `Screenshot comparison failed`, pixel diff errors, `-linux.png` referenced |
53
+ | **Playwright test logic failure** | Assertion errors, timeouts, element not found — but NOT snapshot diffs |
54
+ | **Unit test failure** | Failures in `bun run test`, vitest output |
55
+ | **Lint/format failure** | ESLint errors, Prettier diffs |
56
+ | **Build failure** | Vite build errors, TypeScript compilation errors |
57
+ | **Merge conflict** | `CONFLICT`, `merge conflict`, cannot rebase cleanly |
58
+ | **Infra/flaky** | Network timeouts, runner issues, cache failures |
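The category table can be read as an ordered list of log-pattern rules. The sketch below is a rough heuristic under stated assumptions: the regular expressions, the category identifiers, and the rule order are all illustrative guesses based on the indicators in the table, not patterns taken from the skill or from any tool's guaranteed output.

```javascript
// Illustrative rules only: each pattern is an assumption derived from the table's indicators.
// Order matters: the first matching rule wins.
const FAILURE_RULES = [
  { category: "merge-conflict", pattern: /CONFLICT|merge conflict/i },
  { category: "playwright-snapshot", pattern: /Screenshot comparison failed|snapshot.*doesn't match|-linux\.png/i },
  { category: "lint-format", pattern: /eslint|prettier/i },
  { category: "build", pattern: /vite.*build|error TS\d+/i },
  { category: "unit-test", pattern: /vitest/i },
  { category: "playwright-logic", pattern: /playwright|locator|Timeout \d+ms exceeded/i },
  { category: "infra-flaky", pattern: /ETIMEDOUT|ECONNRESET|runner.*lost|cache.*fail/i },
];

// Return the first matching category, or "unknown" when nothing matches.
function classifyFailure(logText) {
  const rule = FAILURE_RULES.find((r) => r.pattern.test(logText));
  return rule ? rule.category : "unknown";
}
```

An "unknown" result is a signal to read the log by hand rather than to guess a category.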
59
+
60
+ ## Step 4: Attempt Fix
61
+
62
+ **Important**: Before fixing, make sure the local branch is up to date with the PR branch:
63
+ ```bash
64
+ git fetch origin && git checkout <PR_BRANCH> && git pull origin <PR_BRANCH>
65
+ ```
66
+
67
+ Before any local `bun pw` run in this workflow, verify that port `8080` is either free or already owned by a dev server started from this repo. `playwright.config.ts` reuses existing servers outside CI, so a Vite server from another repo will produce invalid local results.
68
+
69
+ ```bash
70
+ PORT="${PORT:-8080}"
71
+ THIS_REPO="$(pwd -P)"
72
+ LISTENER_PID="$(lsof -tiTCP:${PORT} -sTCP:LISTEN 2>/dev/null | head -n1 || true)"
73
+
74
+ if [ -n "$LISTENER_PID" ]; then
75
+ LISTENER_CMD="$(ps -p "$LISTENER_PID" -o command=)"
76
+ if [[ "$LISTENER_CMD" != *"$THIS_REPO"* ]]; then
77
+ echo "Port ${PORT} is owned by another repo; killing PID ${LISTENER_PID}"
78
+ kill "$LISTENER_PID"
79
+ fi
80
+ fi
81
+ ```
82
+
83
+ If you killed a different repo's server, do **not** start Vite manually. Let `bun pw` launch the correct dev server from this folder.
84
+
85
+ ### Fix by Category
86
+
87
+ #### Playwright Snapshot Mismatch (Linux)
88
+
89
+ This is the most common CI-only failure because snapshots are platform-specific.
90
+
91
+ 1. **Verify it's a snapshot diff** (not a logic error) by reading the failure log
92
+ 2. **Check if the change is intentional** — look at recent commits on the branch. If they modified rendering code, SVG output, or CSS, snapshot updates are expected
93
+ 3. **Trigger the Linux snapshot update workflow**:
94
+ ```bash
95
+ gh workflow run update-snapshots.yml --repo mermaid-js/zenuml-core --ref <PR_BRANCH>
96
+ ```
97
+ 4. **Wait for the workflow to complete**:
98
+ ```bash
99
+ # Find the run ID (most recent on that branch)
100
+ gh run list --repo mermaid-js/zenuml-core --workflow update-snapshots.yml --branch <PR_BRANCH> --limit 1 --json databaseId,status
101
+ # Watch it
102
+ gh run watch <RUN_ID> --repo mermaid-js/zenuml-core
103
+ ```
104
+ 5. **Pull the auto-committed snapshots** locally:
105
+ ```bash
106
+ git pull origin <PR_BRANCH>
107
+ ```
108
+ 6. The update-snapshots workflow commits and verifies automatically. If it passes, CI should go green on next run.
109
+
110
+ #### Playwright Test Logic Failure
111
+
112
+ 1. **Reproduce locally first**:
113
+ ```bash
114
+ # Run the 8080 ownership preflight above first.
115
+ bun pw --grep "<test name pattern>"
116
+ ```
117
+ 2. **Read the failing test** to understand what it expects
118
+ 3. **Fix the code** (not the test, unless the test expectation is wrong)
119
+ 4. **Verify locally**: `bun pw --grep "<test name pattern>"`
120
+ 5. **Commit and push**
121
+
122
+ #### Unit Test Failure
123
+
124
+ 1. **Reproduce locally**:
125
+ ```bash
126
+ bun run test --run
127
+ ```
128
+ 2. **Fix the code or test**
129
+ 3. **Verify**: `bun run test --run`
130
+ 4. **Commit and push**
131
+
132
+ #### Lint/Format Failure
133
+
134
+ 1. **Auto-fix**:
135
+ ```bash
136
+ bun eslint
137
+ bun prettier
138
+ ```
139
+ 2. **Verify no remaining issues**:
140
+ ```bash
141
+ bun eslint 2>&1 | tail -5
142
+ ```
143
+ 3. **Commit and push** the formatting fixes
144
+
145
+ #### Build Failure
146
+
147
+ 1. **Reproduce locally**:
148
+ ```bash
149
+ bun run build
150
+ ```
151
+ 2. **Read the error** — usually TypeScript errors or missing imports
152
+ 3. **Fix, verify locally, commit and push**
153
+
154
+ #### Merge Conflict
155
+
156
+ 1. **Report to user** — do NOT auto-resolve merge conflicts. Show what's conflicting and ask for guidance.
157
+
158
+ #### Infra/Flaky
159
+
160
+ 1. **Re-run the failed job**:
161
+ ```bash
162
+ gh run rerun <RUN_ID> --repo mermaid-js/zenuml-core --failed
163
+ ```
164
+ 2. If it fails again with the same infra error, report to user.
165
+
166
+ ## Step 5: Push and Monitor
167
+
168
+ After applying a fix:
169
+
170
+ 1. **Run the full local test suite** before pushing (when the failure category allows local reproduction):
171
+ ```bash
172
+ bun run test --run # unit tests
173
+ # Run the 8080 ownership preflight above first.
174
+ bun pw # playwright (local, macOS — won't catch Linux snapshot diffs)
175
+ bun eslint # lint
176
+ ```
177
+ 2. **Commit with a clear message**:
178
+ ```bash
179
+ git add <specific files>
180
+ git commit -m "fix: <what was fixed> to pass CI"
181
+ ```
182
+ 3. **Push**:
183
+ ```bash
184
+ git push origin <PR_BRANCH>
185
+ ```
186
+ 4. **Wait for CI** — use `gh run watch` on the new run
187
+ 5. **Evaluate result** — go back to Step 2
188
+
189
+ ## Step 6: Retry Budget
190
+
191
+ Track attempts. Each "attempt" is one push-and-wait cycle (or one workflow trigger-and-wait for snapshot updates).
192
+
193
+ - **Maximum 3 attempts total**
194
+ - After each failed attempt, re-diagnose from scratch (Step 3) — the failure mode may have changed
195
+ - **If a test passes on retry without code changes**, flag it as potentially flaky:
196
+ > "Test `<name>` passed on retry without changes — likely flaky. Consider investigating stability."
197
+ - **After 3 failed attempts**, stop and report:
198
+ - What was tried
199
+ - What the current failure is
200
+ - Your best theory for root cause
201
+ - Suggested next steps for the user
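The retry budget amounts to a small decision over the attempts recorded so far. The sketch below assumes a hypothetical record shape (`result`, `changedCode`) and hypothetical action names; it only illustrates the three-attempt cap and the flaky-test flag described above.

```javascript
// Hypothetical attempt record: { result: "passed" | "failed", changedCode: boolean }.
// Decide what to do next given the attempts recorded so far.
function evaluateBudget(attempts, maxAttempts = 3) {
  const last = attempts[attempts.length - 1];
  const remaining = Math.max(0, maxAttempts - attempts.length);

  if (last && last.result === "passed") {
    // A pass without code changes suggests the earlier failure was flaky.
    return { action: "done", remaining, flakySuspected: !last.changedCode };
  }
  if (attempts.length >= maxAttempts) {
    // Budget exhausted: stop and produce the report for the user.
    return { action: "stop-and-report", remaining: 0, flakySuspected: false };
  }
  // Otherwise re-diagnose from scratch and try again.
  return { action: "retry", remaining, flakySuspected: false };
}
```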
202
+
203
+ ## Step 7: Summary Report
204
+
205
+ After babysitting completes (success or exhausted retries), produce a brief report:
206
+
207
+ ```
208
+ ## PR #<number> Babysit Report
209
+ - **Status**: [PASSED | FAILED after N attempts]
210
+ - **Failures found**: <list of categories>
211
+ - **Fixes applied**: <list of commits pushed>
212
+ - **Flaky tests**: <any tests that passed on retry without changes>
213
+ - **Manual attention needed**: <anything unresolved>
214
+ ```
215
+
216
+ ## Safety Rules
217
+
218
+ - **Never force-push** — always regular `git push`
219
+ - **Never resolve merge conflicts automatically** — report and ask
220
+ - **Never push while CI is still running** from a previous attempt — wait for it to finish first
221
+ - **Never modify the snapshot update workflow itself** — only trigger it
222
+ - **Always verify fixes locally** before pushing (except Linux snapshot updates which can only be verified in CI)
223
+ - **Check for in-progress CI** before pushing — avoid wasting CI minutes on runs that will be superseded
@@ -0,0 +1,7 @@
1
+ interface:
2
+ display_name: "Babysit PR"
3
+ short_description: "Monitor and fix failing PR CI checks"
4
+ default_prompt: "Use $babysit-pr to diagnose failing GitHub Actions checks on a zenuml-core PR, fix what is actionable, push updates, and watch CI until it is green."
5
+
6
+ policy:
7
+ allow_implicit_invocation: true
@@ -0,0 +1,139 @@
1
+ ---
2
+ name: dia-scoring
3
+ description: Score HTML-vs-SVG diagram parity in compare-case pages, including message labels, fragment labels, sequence numbers, arrows, participant headers, icons, stereotypes, participant colors, participant groups, comments, and residual diff scopes. Use Playwright for page inspection and semantic attribution.
4
+ ---
5
+
6
+ # Dia Scoring
7
+
8
+ Use this skill when the task is to measure **message labels, fragment labels, sequence numbers, message arrows, participant labels, participant boxes, participant icons, stereotypes, participant colors, participant groups, inline comments, and residual diff hotspots** between the HTML renderer and the native SVG renderer on `compare-case.html`.
9
+
10
+ ## Diff Source of Truth
11
+
12
+ The `native-diff-ext` Chrome extension is the absolute source of truth for pixel diff. The analyzer script (`scripts/analyze-compare-case.mjs`) uses the same CDP screenshot capture method and the same diff algorithm (`cy/diff-algorithm.js`), producing identical results when run against the same viewport.
13
+
14
+ - Use the **analyzer script** as the primary scoring tool (automated, reproducible, CLI-driven)
15
+ - Use the **extension** for live interactive inspection in the browser
16
+ - Both use CDP `Page.captureScreenshot` with `DOM.getBoxModel` border-box clip
17
+ - When calibrating the skill, verify against the extension's live `#diff-panel canvas`
18
+
19
+ The workflow:
20
+
21
+ 1. Run `node scripts/analyze-compare-case.mjs --case <name> --json` for structured data.
22
+ 2. Use `--output-dir <dir>` when you need saved `html.png`, `svg.png`, `diff.png`, and `report.json`.
23
+ 3. For live browser inspection, navigate to `http://localhost:8080/e2e/tools/compare-case.html?case=<name>` and use the extension's `#diff-panel canvas`.
24
+ 4. Use Playwright page inspection for semantic attribution (element positions, font metrics, DOM structure).
25
+
26
+ ## Offset Anchor
27
+
28
+ All reported offsets must use the **outermost frame's top-left corner** as the anchor.
29
+
30
+ - HTML anchor: the compare-case HTML frame root
31
+ - SVG anchor: the compare-case SVG root / outer frame root
32
+ - Do not report alternate offset systems
33
+ - Do not anchor offsets to participant boxes, label boxes, stereotype boxes, or local containers
34
+ - If a local-container-relative reading differs from the frame-anchor reading, prefer the frame-anchor reading in all reporting
35
+
36
+ ## Browser Requirement
37
+
38
+ Use **Playwright browser tools only** for browser interaction in this workflow.
39
+
40
+ - Preferred tools: `browser_navigate`, `browser_snapshot`, `browser_evaluate`, `browser_take_screenshot`, `browser_click`, `browser_wait_for`
41
+ - Do not use Chrome DevTools browser tools for scoring, DOM inspection, screenshot capture, or residual validation
42
+ - Do not build your own pixel diff from HTML/SVG screenshots. For pixel comparison, use only the extension-rendered `#diff-panel canvas`
43
+
44
+ ## Rules
45
+
46
+ - Do not use `html-to-image` for capture.
47
+ - Use browser-native screenshots only.
48
+ - Use Playwright for browser-native screenshots and page inspection.
49
+ - All offset calculations must be anchored to the outermost frame's top-left corner.
50
+ - When recalibrating the skill itself, verify against the extension's live `#diff-panel canvas`.
51
+ - Do not use Chrome DevTools browser tools for this workflow.
52
+ - Scope:
53
+ - normal messages
54
+ - self messages
55
+ - returns
56
+ - creation messages (e.g., `«payload»`, `new Order()`)
57
+ - fragment conditions such as `[cond]`, `[else]`
58
+ - fragment section labels such as `catch`, `finally`
59
+ - participant label text and participant box geometry
60
+ - participant icons (actor, database, ec2, lambda, azurefunction, sqs, sns, iam, boundary, control, entity)
61
+ - participant stereotypes such as `«BFF»`, `«Interface»`
62
+ - participant background colors (`#FFEBE6`, `#0747A6`, etc.) and computed text contrast
63
+ - participant groups (dashed outline containers with title bar)
64
+ - inline comments (`// text`) above messages and fragments, including styled comments (`// [red] text`)
65
+ - residual `html-only` and `svg-only` diff clusters scoped back to nearby elements
66
+ - For each supported message, include:
67
+ - label text
68
+ - fragment condition / section label text when present
69
+ - sequence number text, including fragment sequence numbers when present
70
+ - arrow geometry keyed by sequence number
71
+ - normal/return arrow endpoint deltas: `left_dx`, `right_dx`, `width_dx`
72
+ - self-arrow loop geometry from the painted loop path plus arrowhead, not the outer `svg` viewport
73
+ - self-arrow vertical deltas: `top_dy`, `bottom_dy`, `height_dy`
74
+ - For participant icons, include:
75
+ - icon presence (HTML vs SVG)
76
+ - participant label text when the participant has an icon
77
+ - icon position relative to participant label
78
+ - icon visual match confirmation from diff image
79
+ - For participant stereotypes, include:
80
+ - stereotype text presence (HTML vs SVG), e.g. `«BFF»`
81
+ - stereotype position relative to participant label (above label, smaller font)
82
+ - stereotype offset must be measured with per-letter glyph-box comparison relative to the outermost frame anchor
83
+ - do not use participant-box-relative or other local-container-relative deltas in final reporting
84
+ - do not mark a stereotype as clean from glyph boxes alone; also check the live `#diff-panel canvas` in the stereotype row
85
+ - if glyph-box deltas are `0/0` but the panel still shows localized red/blue pixels overlapping the stereotype glyph union, report the stereotype as `ambiguous` or `paint-level residual`, not clean
86
+ - stereotype text color matching participant background contrast
87
+ - For participant colors, include:
88
+ - background fill color (hex value) on participant rect
89
+ - text color contrast (dark text on light bg, white text on dark bg)
90
+ - color application to both top and bottom participant boxes
91
+ - For participant groups, include:
92
+ - group name text presence and position (centered title bar)
93
+ - dashed outline rect enclosing grouped participants
94
+ - group bounds: leftmost to rightmost participant with margin
95
+ - group height extending to diagram bottom
96
+ - For inline comments, include:
97
+ - comment text presence and position (above the associated statement)
98
+ - comment Y offset from the message/fragment it belongs to
99
+ - fragment-level comments (e.g. `// comment 4` before `if(...)`) positioned above fragment header
100
+ - when all letters are `ambiguous` due to large positional offset (e.g. fragment comments at wrong X), the analyzer reports `box_dx` / `box_dy` from the bounding boxes instead of suppressing the measurement
101
+ - styled comment color application (e.g. `// [red] text`)
102
+ - For participant boxes, use the analyzer script output directly:
103
+ - Report `html_box`, `svg_box`, `dx`, `dy`, `dw`, `dh` from the script's `participant_boxes` section
104
+ - The script already applies stroke correction (`strokedElementOuterRect`) — do not re-measure with `browser_evaluate`
105
+ - For residual scopes, include:
106
+ - connected `html-only` and `svg-only` diff clusters from `#diff-panel canvas`
107
+ - cluster `size`, `bbox`, and `centroid`
108
+ - nearest scoped HTML and SVG targets at that position
109
+ - summaries that explain which element a remaining positional diff most likely belongs to
110
+ - live native diff panel confirmation before claiming a hotspot is real
111
+ - the largest confirmed live-panel `html-only` and `svg-only` clusters with approximate positions
112
+ - grouped summaries of where the panel's red and blue pixels are concentrated
113
+ - Do not report a residual hotspot as real if it is absent from the live `#diff-panel canvas`.
114
+ - Do not stop at totals like `HTML-only (44)` or `SVG-only (55)` when residuals matter; report where those pixels are.
115
+ - Each reported letter must be backed by:
116
+ - direct HTML-vs-SVG browser layout positions
117
+ - pixel-panel confirmation from `#diff-panel canvas`
118
+ - Participant stereotypes are first-class targets, not just part of `participant-box` or `participant-label`.
119
+ - If the evidence is weak or contradictory, keep the letter `ambiguous`.
120
+
121
+ ## Known Analyzer Internals
122
+
123
+ ### Arrow pairing by sequence number
124
+
125
+ The analyzer pairs arrows by sequence number (`text` field), not by label text (`pairText`). Calibrated on `repro-creation-return-arrow` (2026-03-24).
126
+
127
+ ## Commands
128
+
129
+ Run from [../..](../..):
130
+
131
+ ```bash
132
+ node scripts/analyze-compare-case.mjs --case async-2a
133
+ node scripts/analyze-compare-case.mjs --case async-2a --json
134
+ node scripts/analyze-compare-case.mjs --case async-2a --output-dir tmp/message-elements/async-2a
135
+ ```
136
+
137
+ ## References
138
+
139
+ - Selector and pairing details: [references/selectors-and-keys.md](references/selectors-and-keys.md)
@@ -0,0 +1,7 @@
1
+ interface:
2
+ display_name: "Dia Scoring"
3
+ short_description: "Diagram label, number, and arrow offsets"
4
+ default_prompt: "Use $dia-scoring to measure message label, sequence-number, and arrow parity for a compare-case page."
5
+
6
+ policy:
7
+ allow_implicit_invocation: true
@@ -0,0 +1,253 @@
1
+ # Selectors And Keys
2
+
3
+ The analyzer uses these roots:
4
+
5
+ - HTML root: `#html-output .frame`, fallback `#html-output .sequence-diagram`
6
+ - SVG root: `#svg-output > svg`
7
+
8
+ Offset anchor:
9
+
10
+ - All reported offsets use the outermost frame root's top-left corner
11
+ - HTML side: `#html-output .frame`, fallback `#html-output .sequence-diagram`
12
+ - SVG side: `#svg-output > svg`
13
+ - Do not emit final `dx` / `dy` values from participant-local or other nested-container anchors
14
+
15
+ HTML label extraction:
16
+
17
+ - Normal messages: iterate `.interaction`, skip `.return`, `.creation`, and self interactions, then read `.message .editable-span-base`
18
+ - Self messages: `.self-invocation .label .editable-span-base`
19
+ - Returns: `.interaction.return .message .editable-span-base`, fallback `.interaction.return .name`
20
+ - Fragment conditions: `.fragment .segment > .text-skin-fragment:not(.finally)`, using only visible child spans when conditional branches are stacked
21
+ - Fragment sections:
22
+ - `.fragment.fragment-tcf .segment > .header.inline-block.bg-skin-frame.opacity-65`
23
+ - `.fragment.fragment-tcf .segment > .header.finally`
24
+
25
+ SVG label extraction:
26
+
27
+ - Normal messages: `g.message:not(.self-call) > text.message-label`
28
+ - Self messages: `g.message.self-call > text.message-label`
29
+ - Returns: `g.return > text.return-label`
30
+ - Fragment conditions: `g.fragment > text.fragment-condition`
31
+ - Fragment condition / section groups: `g.fragment > g` containing `text.fragment-section-label`
32
+ - texts starting with `[` are treated as `fragment-condition`
33
+ - other texts are treated as `fragment-section`
34
+
35
+ Pairing key:
36
+
37
+ - Semantic grouping is by `kind + text`
38
+ - Duplicate labels are paired by top-to-bottom order within that group
39
+ - Output key is:
40
+ - `kind`
41
+ - `text`
42
+ - `y_order`
43
+ - Fragment labels also include `owner=<fragment header>` in the human-readable summary when available
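Pairing by `kind + text` with top-to-bottom ordering for duplicates can be sketched as below. This is not the analyzer's code: `pairLabels`, the item shape (`kind`, `text`, `y`), and the zero-based `y_order` are assumptions made for illustration.

```javascript
// Hypothetical item shape: { kind, text, y } where y is the frame-anchor-relative top.
function pairLabels(htmlItems, svgItems) {
  // Group items by kind + text, then order each group top to bottom.
  const group = (items) => {
    const map = new Map();
    for (const item of items) {
      const key = `${item.kind}\u0000${item.text}`;
      if (!map.has(key)) map.set(key, []);
      map.get(key).push(item);
    }
    for (const list of map.values()) list.sort((a, b) => a.y - b.y);
    return map;
  };

  const html = group(htmlItems);
  const svg = group(svgItems);
  const pairs = [];
  for (const key of new Set([...html.keys(), ...svg.keys()])) {
    const h = html.get(key) ?? [];
    const s = svg.get(key) ?? [];
    const [kind, text] = key.split("\u0000");
    // The n-th duplicate on one side pairs with the n-th duplicate on the other;
    // a missing counterpart is reported as null.
    for (let i = 0; i < Math.max(h.length, s.length); i++) {
      pairs.push({ kind, text, y_order: i, html: h[i] ?? null, svg: s[i] ?? null });
    }
  }
  return pairs;
}
```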
44
+
45
+ Per-letter scoring:
46
+
47
+ - Grapheme segmentation uses `Intl.Segmenter`, fallback `Array.from`
48
+ - Glyph boxes come from browser layout ranges, not whole-word centroids
49
+ - Numeric `dx` and `dy` are only emitted when direct layout evidence and diff-image evidence agree
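The segmentation rule (prefer `Intl.Segmenter`, fall back to `Array.from`) can be sketched as follows. `splitGraphemes` is a hypothetical name; note that the fallback splits by code point, so combining sequences may come out as more than one entry on runtimes without `Intl.Segmenter`.

```javascript
// Split text into user-perceived characters for per-letter scoring.
function splitGraphemes(text) {
  if (typeof Intl !== "undefined" && typeof Intl.Segmenter === "function") {
    const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
    // Each segment record carries the grapheme in its `segment` property.
    return Array.from(segmenter.segment(text), (part) => part.segment);
  }
  // Fallback: split by Unicode code point.
  return Array.from(text);
}
```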
50
+
51
+ Arrow extraction:
52
+
53
+ - HTML normal/return messages:
54
+ - line: direct child `svg` line strip inside `.message`
55
+ - head: direct child arrowhead `svg` inside `.message`
56
+ - HTML self messages:
57
+ - loop: painted geometry inside `svg.arrow`
58
+ - parts: outer loop path plus nested arrowhead path
59
+ - SVG normal messages:
60
+ - line: `line.message-line`
61
+ - head: `svg.arrow-head`
62
+ - SVG returns:
63
+ - line: `line.return-line`
64
+ - head: `polyline.return-arrow`
65
+ - SVG self messages:
66
+ - loop: painted geometry inside the outer `svg` under `g.message.self-call`
67
+ - parts: outer loop path plus nested arrowhead path
68
+
69
+ Arrow scoring:
70
+
71
+ - Arrows are keyed by sequence number when numbering is available, for example `arrow:1.2.3`
72
+ - Normal and return arrows are measured as one combined geometry item:
73
+ - line + arrow head together
74
+ - Self arrows are measured as one loop geometry item
75
+ - Self arrows use the union of the painted loop path and arrowhead path, not the outer viewport box
76
+ - Arrow output is endpoint-based, not box-centroid-based
77
+ - For normal and return arrows, report:
78
+ - `left_dx`
79
+ - `right_dx`
80
+ - `width_dx`
81
+ - For self arrows, also report:
82
+ - `top_dy`
83
+ - `bottom_dy`
84
+ - `height_dy`
85
+ - Do not report `dy` for horizontal message arrows
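The endpoint-based arrow output can be illustrated with a small delta function. The box shape (`left`, `right`, `top`, `bottom`, all frame-anchor-relative) and the sign convention (SVG minus HTML) are assumptions for this sketch; the source does not state which direction the analyzer subtracts.

```javascript
// Hypothetical arrow geometry: { left, right, top, bottom }, frame-anchor-relative.
// Assumed sign convention: delta = SVG value minus HTML value.
function arrowDeltas(htmlArrow, svgArrow, { self = false } = {}) {
  const result = {
    left_dx: svgArrow.left - htmlArrow.left,
    right_dx: svgArrow.right - htmlArrow.right,
    width_dx: (svgArrow.right - svgArrow.left) - (htmlArrow.right - htmlArrow.left),
  };
  if (self) {
    // Vertical deltas are reported only for self-arrow loops.
    result.top_dy = svgArrow.top - htmlArrow.top;
    result.bottom_dy = svgArrow.bottom - htmlArrow.bottom;
    result.height_dy = (svgArrow.bottom - svgArrow.top) - (htmlArrow.bottom - htmlArrow.top);
  }
  return result;
}
```

Leaving `self` at its default keeps the vertical fields out of the result, matching the rule that horizontal message arrows report no `dy`.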
86
+
87
+ HTML sequence number extraction:
88
+
89
+ - Normal messages: `.interaction:not(.return):not(.creation):not(.self-invocation):not(.self) > .message > .absolute.text-xs`
90
+ - Self messages: `.interaction.self-invocation > .message .absolute.text-xs`
91
+ - Returns: `.interaction.return > .message > .absolute.text-xs`
92
+ - Fragments: `.fragment > .header > .absolute.text-xs`
93
+
94
+ SVG sequence number extraction:
95
+
96
+ - Normal messages: `g.message:not(.self-call) > text.seq-number`
97
+ - Self messages: `g.message.self-call > text.seq-number`
98
+ - Returns: `g.return > text.seq-number`
99
+ - Fragments: `g.fragment > text.seq-number`
100
+
101
+ ## Participant Icon Extraction
102
+
103
+ ## Participant Header Extraction
104
+
105
+ HTML participant header extraction:
106
+
107
+ - Participant root: `.participant[data-participant-id]`
108
+ - Participant box: outer border box from the participant root element
109
+ - Participant stereotype: `label.interface`, when present
110
+ - Participant label: last `.name` descendant, measured by glyph boxes
111
+
112
+ SVG participant header extraction:
113
+
114
+ - Participant root: `g.participant[data-participant]`
115
+ - Skip `g.participant-bottom`
116
+ - Participant box element: `:scope > rect.participant-box`
117
+ - Participant box measurement must use the painted outer bounds of the stroked rect, not the inset rect geometry
118
+ - Participant stereotype:
119
+ - prefer `:scope > text.stereotype-label`
120
+ - fallback: top-most direct `text` child above `text.participant-label`
121
+ - Participant label: `:scope > text.participant-label`
122
+
123
+ Participant stereotype pairing and scoring:
124
+
125
+ - Pair by participant name
126
+ - Validate text equality, for example `«BFF»`
127
+ - Measure stereotype offset by per-letter glyph boxes relative to the outermost frame root, not by participant-local anchors or whole-word box centroids
128
+ - Report:
129
+ - `letter_deltas`
130
+ - concise aggregate only when the per-letter evidence agrees
131
+ - Do not mark the stereotype clean from glyph boxes alone
132
+ - Also check the live `#diff-panel canvas` over the union of the HTML and SVG stereotype glyph boxes
133
+ - If localized red or blue pixels persist in that stereotype region while glyph-box deltas are `0/0`, classify it as `ambiguous` or `paint-level residual`
134
+
135
+ Participant box pairing and scoring:
136
+
137
+ - Pair by participant name
138
+ - Report `html_box` and `svg_box` with `x`, `y`, `w`, `h`
139
+ - Box `x` / `y` values are frame-anchor-relative
140
+ - Report box deltas:
141
+ - `dx`
142
+ - `dy`
143
+ - `dw`
144
+ - `dh`
145
+
146
+ HTML icon extraction:
147
+
148
+ - Participant root: `.participant[data-participant-id]`
149
+ - Top-row participant only: keep the top-most entry for each participant id
150
+ - Icon host: first child inside the centered participant row when it is an async icon host
151
+ - `[aria-description]`
152
+ - or contains `svg`
153
+ - or has `h-6` sizing class from `AsyncIcon`
154
+ - Icon box: union of painted SVG shapes when available, fallback to the host box
155
+ - Participant label: last `.name` descendant, measured by glyph boxes
156
+
157
+ SVG icon extraction:
158
+
159
+ - Participant root: `g.participant[data-participant]`
160
+ - Skip `g.participant-bottom`
161
+ - Icon element: `:scope > g[transform]`
162
+ - Icon box: union of painted shapes within that transformed group
163
+ - Participant label: `:scope > text.participant-label`
164
+
165
+ Icon pairing:
166
+
167
+ - Pair by participant name
168
+ - Only report participant icon rows for participants where at least one side has an icon
169
+ - Participant labels for icon-bearing participants are paired by participant name, not raw label text
170
+
171
+ Icon scoring:
172
+
173
+ - Absolute icon drift:
174
+ - `icon_dx`
175
+ - `icon_dy`
176
+ - Absolute icon drift is measured from the outermost frame anchor
177
+ - Relative icon drift against the participant label anchor:
178
+ - `relative_dx`
179
+ - `relative_dy`
180
+ - If there is no participant label on one side, use the participant box center as the anchor
181
+ - Report presence mismatch if one renderer has an icon and the other does not
182
+ - Diff confirmation is taken from `#diff-panel canvas`, scoped to the union of the HTML and SVG icon boxes
183
+
184
+ ## Residual Scope Attribution
185
+
186
+ Residual scope extraction:
187
+
188
+ - Build connected clusters from the live `#diff-panel canvas` colors:
189
+ - red = `html-only`
190
+ - blue = `svg-only`
191
+ - Ignore green `match` and magenta `color diff` pixels for positional scoping
192
+ - Each cluster reports:
193
+ - `size`
194
+ - `bbox`
195
+ - `centroid`
196
+ - These panel-derived clusters are the source of truth for residual hotspots.
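Cluster extraction can be sketched as a flood fill over a grid of already-classified pixels. This is an illustration rather than the analyzer's implementation: it assumes each pixel has been mapped from its panel color to a label such as `"html-only"` or `"svg-only"`, and it assumes 4-connectivity, which the source does not specify.

```javascript
// `grid[y][x]` holds a hypothetical per-pixel label, e.g. "html-only", "svg-only", "match".
// Returns connected clusters of `kind`, largest first (4-connectivity is an assumption).
function findClusters(grid, kind) {
  const height = grid.length;
  const width = height > 0 ? grid[0].length : 0;
  const seen = Array.from({ length: height }, () => new Array(width).fill(false));
  const clusters = [];

  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      if (seen[y][x] || grid[y][x] !== kind) continue;
      // Start a new cluster and flood-fill it with an explicit stack.
      const stack = [[x, y]];
      seen[y][x] = true;
      let size = 0, sumX = 0, sumY = 0;
      let minX = x, maxX = x, minY = y, maxY = y;
      while (stack.length > 0) {
        const [cx, cy] = stack.pop();
        size += 1; sumX += cx; sumY += cy;
        minX = Math.min(minX, cx); maxX = Math.max(maxX, cx);
        minY = Math.min(minY, cy); maxY = Math.max(maxY, cy);
        for (const [dx, dy] of [[1, 0], [-1, 0], [0, 1], [0, -1]]) {
          const nx = cx + dx, ny = cy + dy;
          if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
          if (seen[ny][nx] || grid[ny][nx] !== kind) continue;
          seen[ny][nx] = true;
          stack.push([nx, ny]);
        }
      }
      clusters.push({
        kind,
        size,
        bbox: { x: minX, y: minY, w: maxX - minX + 1, h: maxY - minY + 1 },
        centroid: { x: sumX / size, y: sumY / size },
      });
    }
  }
  return clusters.sort((a, b) => b.size - a.size);
}
```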
197
+
198
+ Residual scope candidates:
199
+
200
+ - HTML side:
201
+ - labels
202
+ - numbers
203
+ - arrows
204
+ - participant stereotypes
205
+ - participant labels
206
+ - participant icons
207
+ - participant boxes
208
+ - diagram root fallback
209
+ - SVG side:
210
+ - labels
211
+ - numbers
212
+ - arrows
213
+ - participant stereotypes
214
+ - participant labels
215
+ - participant icons
216
+ - participant boxes
217
+ - `rect.frame-border-inner`, fallback `rect.frame-border` / `rect.frame-box`
218
+ - diagram root fallback
219
+
220
+ Residual scope attribution:
221
+
222
+ - Pick the closest candidate to the cluster centroid on each side
223
+ - Prefer targets that contain the centroid
224
+ - Prefer more specific categories over large containers:
225
+ - `participant-icon`
226
+ - `participant-stereotype`
227
+ - `label`, `number`, `participant-label`
228
+ - `arrow`
229
+ - `participant-box`
230
+ - `frame-border`
231
+ - `diagram-root`
232
+ - Use cluster/target overlap and centroid distance as tie-breakers
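One possible reading of these attribution rules is a three-level sort: containment first, then category specificity, then centroid distance. The precedence between the rules, the candidate shape (`category`, `box`), and the name `attributeCluster` are assumptions for this sketch, and the overlap tie-breaker is omitted for brevity.

```javascript
// Specificity tiers follow the order listed above; lower is more specific.
const TIER = {
  "participant-icon": 0,
  "participant-stereotype": 1,
  "label": 2, "number": 2, "participant-label": 2,
  "arrow": 3,
  "participant-box": 4,
  "frame-border": 5,
  "diagram-root": 6,
};

// Hypothetical candidate shape: { category, box: { x, y, w, h } }.
function attributeCluster(cluster, candidates) {
  const { x, y } = cluster.centroid;
  const scored = candidates.map((candidate) => {
    const b = candidate.box;
    const contains = x >= b.x && x <= b.x + b.w && y >= b.y && y <= b.y + b.h;
    const distance = Math.hypot(x - (b.x + b.w / 2), y - (b.y + b.h / 2));
    return { candidate, contains, tier: TIER[candidate.category] ?? 99, distance };
  });
  // Assumed precedence: containing targets, then specificity, then distance to box center.
  scored.sort((a, b) =>
    (Number(b.contains) - Number(a.contains)) || (a.tier - b.tier) || (a.distance - b.distance));
  return scored.length > 0 ? scored[0].candidate : null;
}
```

Running it once with the HTML candidates and once with the SVG candidates gives one attributed target per side for each cluster.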
233
+
234
+ Residual scope output:
235
+
236
+ - `residual_scopes`: all attributed clusters
237
+ - `residual_scope_summary`: top 20 concise lines for terminal use
238
+ - `residual_scope_html_only_top`: top 10 `html-only` clusters
239
+ - `residual_scope_svg_only_top`: top 10 `svg-only` clusters
240
+ - When answering from the live panel, also report:
241
+ - the largest red clusters from `#diff-panel canvas`
242
+ - the largest blue clusters from `#diff-panel canvas`
243
+ - approximate diagram-space positions or bounding boxes
244
+ - attributed HTML and SVG targets for those clusters
245
+ - a short grouped summary of where the red and blue pixels are concentrated
246
+
247
+ Live panel validation:
248
+
249
+ - Source of truth for residual hotspots is `#diff-panel canvas`
250
+ - Confirm the hotspot by reading the panel's actual red and blue pixels at that area
251
+ - If the panel shows no red or blue pixels there, do not report that hotspot as a real residual diff
252
+ - If the panel shows non-zero red or blue totals, do not stop at the totals alone; locate the dominant clusters and report them
253
+ - Do not build or rely on a separate screenshot-to-screenshot diff for pixel comparison when `#diff-panel canvas` is available on the page