@zenuml/core 3.47.1 → 3.47.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (53)
  1. package/.agents/skills/babysit-pr/SKILL.md +223 -0
  2. package/.agents/skills/babysit-pr/agents/openai.yaml +7 -0
  3. package/.agents/skills/dia-scoring/SKILL.md +139 -0
  4. package/.agents/skills/dia-scoring/agents/openai.yaml +7 -0
  5. package/.agents/skills/dia-scoring/references/selectors-and-keys.md +253 -0
  6. package/.agents/skills/land-pr/SKILL.md +120 -0
  7. package/.agents/skills/propagate-core-release/SKILL.md +205 -0
  8. package/.agents/skills/propagate-core-release/agents/openai.yaml +7 -0
  9. package/.agents/skills/propagate-core-release/references/downstreams.md +42 -0
  10. package/.agents/skills/ship-branch/SKILL.md +105 -0
  11. package/.agents/skills/submit-branch/SKILL.md +76 -0
  12. package/.agents/skills/validate-branch/SKILL.md +72 -0
  13. package/.claude/skills/emoji-eval/SKILL.md +187 -0
  14. package/.claude/skills/propagate-core-release/SKILL.md +81 -76
  15. package/.claude/skills/propagate-core-release/agents/openai.yaml +2 -2
  16. package/.claude/skills/zenuml-ux-research/SKILL.md +183 -0
  17. package/.claude/skills/zenuml-ux-research/references/assertion-catalog.md +261 -0
  18. package/.claude/skills/zenuml-ux-research/references/best-practices-overview.md +56 -0
  19. package/.claude/skills/zenuml-ux-research/references/report-template.md +89 -0
  20. package/.claude/skills/zenuml-ux-research/references/scenarios/edit-message-label.md +37 -0
  21. package/.claude/skills/zenuml-ux-research/references/scenarios/insert-message.md +36 -0
  22. package/.claude/skills/zenuml-ux-research/references/scenarios/insert-participant.md +31 -0
  23. package/.claude/skills/zenuml-ux-research/references/scenarios/rename-participant.md +33 -0
  24. package/.claude/skills/zenuml-ux-research/references/scenarios/undo-insert.md +35 -0
  25. package/AGENTS.md +1 -1
  26. package/dist/stats.html +1 -1
  27. package/dist/zenuml.esm.mjs +22732 -20169
  28. package/dist/zenuml.js +590 -543
  29. package/docs/superpowers/plans/2026-03-30-emoji-support.md +1220 -0
  30. package/docs/superpowers/plans/2026-03-30-self-correcting-scoring.md +206 -0
  31. package/docs/superpowers/plans/2026-04-15-keyboard-editing-on-diagram.md +1992 -0
  32. package/docs/superpowers/plans/2026-04-15-zenuml-ux-research-skill.md +1452 -0
  33. package/docs/ux-research/.gitkeep +0 -0
  34. package/docs/ux-research/2026-04-15-rename-participant.md +156 -0
  35. package/docs/ux-research/2026-04-18-insert-participant.md +151 -0
  36. package/e2e/data/compare-cases.js +233 -0
  37. package/e2e/fixtures/create-message.html +26 -0
  38. package/e2e/fixtures/editable-label.html +1 -0
  39. package/e2e/fixtures/empty-diagram.html +23 -0
  40. package/e2e/fixtures/insert-participant.html +23 -0
  41. package/e2e/fixtures/reorder-cross-fragment.html +31 -0
  42. package/e2e/fixtures/reorder-fragment.html +29 -0
  43. package/e2e/fixtures/reorder-message.html +27 -0
  44. package/e2e/fixtures/type-switch.html +29 -0
  45. package/e2e/tools/compare-case.html +16 -2
  46. package/index.html +44 -0
  47. package/package.json +3 -3
  48. package/playwright.config.ts +1 -1
  49. package/scripts/analyze-compare-case/collect-data.mjs +139 -16
  50. package/scripts/analyze-compare-case/config.mjs +1 -1
  51. package/scripts/analyze-compare-case/report.mjs +3 -0
  52. package/scripts/analyze-compare-case/residual-scopes.mjs +23 -1
  53. package/scripts/analyze-compare-case/scoring.mjs +1 -0
@@ -0,0 +1,187 @@
1
+ ---
2
+ name: emoji-eval
3
+ description: Evaluate emoji rendering quality in ZenUML diagrams. Renders test cases in both DOM and SVG modes, takes screenshots, and scores emoji visibility, position, spacing, box fit, and decorator coexistence. Reports per-case scores with HTML-vs-SVG parity check. Use when testing emoji rendering, after emoji-related code changes, or when the user asks to evaluate/score emoji rendering.
4
+ ---
5
+
6
+ # Emoji Rendering Evaluator
7
+
8
+ Automatically score emoji rendering quality in ZenUML diagrams by rendering test cases in both DOM and SVG modes, taking screenshots, and evaluating what the agent sees.
9
+
10
+ ## Prerequisites
11
+
12
+ - Dev server running on `http://localhost:8080` (`bun dev`)
13
+ - Playwright MCP available for browser automation
14
+
15
+ ## Test Cases
16
+
17
+ Run ALL of these cases unless the user specifies a subset:
18
+
19
+ ```javascript
20
+ const EMOJI_TEST_CASES = {
21
+ "emoji-basic": "[rocket] Production\nA->Production.deploy()",
22
+ "emoji-multi": "[rocket] Production\n[lock] AuthService\n[fire] Cache\nProduction->AuthService.validate()\nAuthService->Cache.get()",
23
+ "emoji-with-type": "@Database [fire] HotDB\n@Actor [star] Admin\nAdmin->HotDB.query()",
24
+ "emoji-with-stereotype": '<<service>> [lock] Auth\n<<gateway>> [globe] API\nAPI->Auth.validate()',
25
+ "emoji-inline": "[rocket]User->[fire]Server.request()",
26
+ "emoji-async-message": "A->B: [check] validated\nB->C: [warning] review needed",
27
+ "emoji-comment": "// [eyes] review phase\nA->B.process()",
28
+ "emoji-colon-override": "[:red:] Alert\nA->Alert.trigger()",
29
+ "emoji-css-combo": "// [rocket, red] deploy note\nA->B.deploy()",
30
+ "emoji-complex": "@Database [fire] HotDB\n[rocket] Production\n<<service>> [lock] Auth\nProduction->Auth.validate(token)\n Auth->HotDB.check(token)\n Auth-->Production: [check] valid",
31
+ };
32
+ ```
33
+
34
+ ## Procedure
35
+
36
+ For each test case:
37
+
38
+ ### Step 1: Render in DOM mode
39
+
40
+ 1. Navigate to `http://localhost:8080`
41
+ 2. Set the code via CodeMirror:
42
+ ```javascript
43
+ await page.evaluate((code) => {
44
+ const cm = document.querySelector('.CodeMirror');
45
+ cm.CodeMirror.setValue(code);
46
+ }, testCode);
47
+ ```
48
+ 3. Click the "DOM" button to switch to DOM view
49
+ 4. Wait 1 second for rendering to complete
50
+ 5. Take a screenshot of the diagram area — save as `emoji-eval-{caseName}-dom.png`
51
+
52
+ ### Step 2: Render in SVG mode
53
+
54
+ 1. Click the "SVG" button to switch to SVG view
55
+ 2. Wait 1 second for rendering to complete
56
+ 3. Take a screenshot of the diagram area — save as `emoji-eval-{caseName}-svg.png`
57
+
58
+ ### Step 3: Evaluate both screenshots
59
+
60
+ Read each screenshot and score on the criteria below. Use your understanding of sequence diagrams to judge:
61
+
62
+ - A participant header should be a box with the name inside
63
+ - Emoji should appear inline to the LEFT of the participant name
64
+ - @Type icons (actor, database) should appear ABOVE the name, in their own row
65
+ - Stereotypes (`<<name>>`) should appear ABOVE the name, below the icon
66
+ - Messages should be horizontal arrows with labels
67
+ - Comments should be italicized text above messages
68
+
69
+ ## Scoring Criteria
70
+
71
+ Score each criterion 0-3:
72
+
73
+ ### 1. Emoji Visibility (per participant with emoji)
74
+ - **0**: Emoji not visible, blank box, or tofu character
75
+ - **1**: Something visible but wrong character or garbled
76
+ - **2**: Correct emoji visible but poor contrast or very small
77
+ - **3**: Correct emoji clearly visible
78
+
79
+ ### 2. Emoji Position (per participant with emoji)
80
+ - **0**: Emoji in wrong location (after name, outside box, overlapping other elements)
81
+ - **1**: Before name but overlapping the name text
82
+ - **2**: Correct position with minor vertical misalignment
83
+ - **3**: Perfectly aligned inline before the name
84
+
85
+ ### 3. Spacing (per participant with emoji)
86
+ - **0**: Emoji and name overlap or no gap
87
+ - **1**: Too tight (characters touching) or too wide (looks disconnected)
88
+ - **2**: Acceptable gap, slightly off
89
+ - **3**: Natural, comfortable spacing
90
+
91
+ ### 4. Box Fit (per participant with emoji)
92
+ - **0**: Emoji or name overflows the participant box boundary
93
+ - **1**: Box boundary clips the emoji or text
94
+ - **2**: Box fits but looks cramped
95
+ - **3**: Box comfortably accommodates emoji + name with padding
96
+
97
+ ### 5. Decorator Coexistence (only for cases with @Type or stereotype)
98
+ - **0**: @Type icon or stereotype is missing or broken
99
+ - **1**: Both present but overlapping or misaligned
100
+ - **2**: Both present, minor layout issues
101
+ - **3**: Perfect layout — icon above, stereotype below icon, emoji inline with name
102
+
103
+ ### 6. Message/Comment Emoji (only for cases with emoji in messages or comments)
104
+ - **0**: Emoji not visible in message/comment text
105
+ - **1**: Emoji visible but breaks the message layout
106
+ - **2**: Emoji visible, minor alignment issues
107
+ - **3**: Emoji renders naturally inline with message/comment text
108
+
109
+ ## Parity Check
110
+
111
+ For each criterion scored in both DOM and SVG:
112
+ - **Match**: Both scores are equal → mark as `=`
113
+ - **Close**: Scores differ by 1 → mark as `~`
114
+ - **Divergent**: Scores differ by 2+ → mark as `!=` (flag for investigation)
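
The parity rules above can be sketched as a small helper. This is only an illustration, not part of the skill: the function names `parityMark` and `countDivergences` are hypothetical, and scores are assumed to be integers from 0 to 3 keyed by criterion name.

```javascript
// Hypothetical helper: classify DOM-vs-SVG parity for one criterion.
// Scores are assumed to be integers 0-3, as in the Scoring Criteria.
function parityMark(domScore, svgScore) {
  const diff = Math.abs(domScore - svgScore);
  if (diff === 0) return "=";  // Match
  if (diff === 1) return "~";  // Close
  return "!=";                 // Divergent: flag for investigation
}

// Count divergent criteria for one case. Only criteria scored in
// both modes are compared; the object shape is an assumption.
function countDivergences(domScores, svgScores) {
  return Object.keys(domScores).filter(
    (key) => key in svgScores && parityMark(domScores[key], svgScores[key]) === "!="
  ).length;
}
```

A per-case count like this could feed the "Parity divergences" line of the summary, assuming divergences are counted per criterion rather than per case.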
115
+
116
+ ## Output Format
117
+
118
+ Present results as a markdown report:
119
+
120
+ ```markdown
121
+ # Emoji Rendering Evaluation Report
122
+
123
+ **Date:** YYYY-MM-DD
124
+ **Branch:** {current git branch}
125
+ **Total cases:** {N}
126
+
127
+ ## Summary
128
+
129
+ | Case | DOM Score | SVG Score | Parity | Status |
130
+ |------|----------|-----------|--------|--------|
131
+ | emoji-basic | 12/12 | 10/12 | ~ | PASS |
132
+ | emoji-multi | 12/12 | 11/12 | ~ | PASS |
133
+ | ... | ... | ... | ... | ... |
134
+
135
+ **Overall DOM:** {total}/{max} ({percentage}%)
136
+ **Overall SVG:** {total}/{max} ({percentage}%)
137
+ **Parity divergences:** {count}
138
+
139
+ ## Detailed Results
140
+
141
+ ### emoji-basic
142
+ **DSL:**
143
+ \`\`\`
144
+ [rocket] Production
145
+ A->Production.deploy()
146
+ \`\`\`
147
+
148
+ **DOM render:**
149
+ [screenshot: emoji-eval-emoji-basic-dom.png]
150
+
151
+ | Criterion | Score | Notes |
152
+ |-----------|-------|-------|
153
+ | Emoji visibility | 3 | Rocket emoji clearly visible |
154
+ | Emoji position | 3 | Correctly before "Production" |
155
+ | Spacing | 3 | Natural gap |
156
+ | Box fit | 3 | Box fits comfortably |
157
+ | **Total** | **12/12** | |
158
+
159
+ **SVG render:**
160
+ [screenshot: emoji-eval-emoji-basic-svg.png]
161
+
162
+ | Criterion | Score | Notes |
163
+ |-----------|-------|-------|
164
+ | Emoji visibility | 3 | Rocket emoji visible |
165
+ | Emoji position | 2 | Slightly tighter than DOM |
166
+ | Spacing | 2 | Tighter spacing than DOM |
167
+ | Box fit | 3 | Box fits |
168
+ | **Total** | **10/12** | |
169
+
170
+ **Parity:** Spacing is tighter in SVG (~ close)
171
+
172
+ ---
173
+ (repeat for each case)
174
+ ```
175
+
176
+ ## Pass/Fail Thresholds
177
+
178
+ - **PASS**: All criteria >= 2, total >= 75%
179
+ - **WARN**: Any criterion at 1, or total 50-75%
180
+ - **FAIL**: Any criterion at 0, or total < 50%
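
One possible reading of these thresholds, as a sketch. The function name `caseStatus` is hypothetical. Two assumptions are made because the text leaves them open: FAIL is checked before WARN, and a total of exactly 75% counts as PASS (the PASS rule says `>= 75%`, while the WARN range also names 75).

```javascript
// Hypothetical helper: derive PASS / WARN / FAIL for one case in one mode.
// `scores` holds the applicable criterion scores (each 0-3).
function caseStatus(scores) {
  const total = scores.reduce((sum, s) => sum + s, 0);
  const max = scores.length * 3;
  // An empty score list is treated as 0% here (assumption).
  const pct = max === 0 ? 0 : (total / max) * 100;

  // Assumed precedence: the most severe status wins.
  if (scores.some((s) => s === 0) || pct < 50) return "FAIL";
  if (scores.some((s) => s === 1) || pct < 75) return "WARN";
  return "PASS";
}
```

Under this reading, the sample SVG row with scores 3, 2, 2, 3 (10/12) is a PASS, while four scores of 2 (8/12, about 67%) is a WARN even though no single criterion is below 2.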
181
+
182
+ ## When to use this skill
183
+
184
+ - After emoji-related code changes
185
+ - Before creating a PR that touches emoji rendering
186
+ - When the user asks to "evaluate emoji", "score emoji rendering", "check emoji quality"
187
+ - When debugging emoji visual issues
@@ -1,11 +1,11 @@
1
1
  ---
2
2
  name: propagate-core-release
3
- description: Propagate a published `@zenuml/core` release to downstream projects by updating each consumer on its own branch and opening or reusing draft PRs. Use when the user says "push core to downstreams", "update downstream projects", "propagate release", "open downstream PRs", "submit downstream drafts", or wants the newly published zenuml/core version rolled out across mermaid, mermaid live editor, web-sequence, the IntelliJ plugin, confluence-plugin-cloud, and diagramly.ai.
3
+ description: Propagate a published `@zenuml/core` release by opening or reusing per-repo downstream issues with explicit rollout instructions. Use when the user says "push core to downstreams", "update downstream projects", "propagate release", "open downstream issues", "file rollout issues", or wants the newly published zenuml/core version handed off across mermaid, mermaid live editor, web-sequence, the IntelliJ plugin, confluence-plugin-cloud, and diagramly.ai.
4
4
  ---
5
5
 
6
6
  # Propagate Core Release
7
7
 
8
- Update downstream consumers after `@zenuml/core` has already been published. This skill creates or reuses per-repo update branches and draft PRs, but does not merge anything.
8
+ Coordinate downstream consumers after `@zenuml/core` has already been published. This skill creates or reuses per-repo GitHub issues with clear implementation instructions for each downstream team. It does not edit downstream repos or open PRs on their behalf.
9
9
 
10
10
  ## Scope
11
11
 
@@ -14,17 +14,17 @@ This skill is for the post-publish propagation step only.
14
14
  It should:
15
15
 
16
16
  1. identify the published `@zenuml/core` version to roll out
17
- 2. update each downstream repo to that version
18
- 3. create or reuse a branch in each downstream repo
19
- 4. push the branch
20
- 5. create or reuse a **draft** PR
21
- 6. summarize which repos succeeded, failed, or were skipped
17
+ 2. inspect each downstream repo's update conventions from [references/downstreams.md](references/downstreams.md)
18
+ 3. create or reuse one downstream issue per repo for that version
19
+ 4. include explicit repo-specific instructions in each issue body
20
+ 5. summarize which repos succeeded, failed, or were skipped
22
21
 
23
22
  It should not:
24
23
 
25
24
  - publish `@zenuml/core`
26
- - merge downstream PRs
27
- - auto-fix unrelated downstream test failures beyond straightforward dependency-update fallout
25
+ - update downstream code directly
26
+ - create downstream branches or PRs
27
+ - auto-fix unrelated downstream test failures or implementation details
28
28
 
29
29
  Renderer integration rule:
30
30
 
@@ -33,15 +33,14 @@ Renderer integration rule:
33
33
 
34
34
  ## Downstream Repos
35
35
 
36
- Read [references/downstreams.md](references/downstreams.md) before starting. It contains the canonical downstream repo list and repo slug assumptions.
36
+ Read [references/downstreams.md](references/downstreams.md) before starting. It contains the canonical downstream repo list, repo slug assumptions, and repo-specific update commands that must be copied into the issue instructions.
37
37
 
38
38
  ## Preconditions
39
39
 
40
40
  Before starting:
41
41
 
42
42
  - confirm the target `@zenuml/core` version is already published
43
- - confirm `gh auth status` is healthy for all target orgs
44
- - confirm you have local checkout strategy for each downstream repo
43
+ - confirm `gh auth status` is healthy for all target orgs and repos where issues will be filed
45
44
  - if the user did not specify the target version, discover the latest published one first
46
45
 
47
46
  If the published version is ambiguous, stop and ask.
@@ -51,46 +50,50 @@ If the published version is ambiguous, stop and ask.
51
50
  Treat each downstream repo as an independent unit of work.
52
51
 
53
52
  - Continue processing the remaining repos if one repo fails.
54
- - Keep a per-repo status ledger as you go: `updated`, `already-updated`, `draft-pr-open`, `blocked`, `failed`.
55
- - Prefer deterministic updates and small diffs.
56
- - Reuse existing update branches or draft PRs when they already target the same core version.
53
+ - Keep a per-repo status ledger as you go: `issue-opened`, `issue-reused`, `already-tracked`, `blocked`, `failed`.
54
+ - Prefer deterministic, reusable issue text.
55
+ - Check for same-version issues before creating anything new.
56
+ - Reuse an existing open issue when it already targets the same core version.
57
+ - If the same version already has a closed issue, treat it as `already-tracked` and report it instead of opening a duplicate unless the user explicitly asks to reopen or replace it.
57
58
 
58
- ## Branch Naming
59
+ ## Issue Rules
59
60
 
60
- Use a consistent branch name across downstream repos:
61
+ Each downstream repo should get at most one open issue per core version.
61
62
 
62
- ```text
63
- chore/zenuml-core-v<version>
64
- ```
65
-
66
- Example:
67
-
68
- ```text
69
- chore/zenuml-core-v1.2.3
70
- ```
71
-
72
- ## Draft PR Rules
73
-
74
- All PRs created by this skill must be draft PRs.
63
+ Before creating a new issue, search that repo for issues matching the target version in the title or body. Prefer exact matches on `@zenuml/core v<version>`.
75
64
 
76
65
  Use a consistent title pattern:
77
66
 
78
67
  ```text
79
- chore: update @zenuml/core to v<version>
68
+ chore: roll out @zenuml/core v<version>
80
69
  ```
81
70
 
82
- Use a concise body:
71
+ Use a clear body with actionable instructions:
83
72
 
84
73
  ```markdown
85
74
  ## Summary
86
- - update `@zenuml/core` to `v<version>`
87
-
88
- ## Notes
89
- - automated downstream propagation after core publish
90
- - draft PR for repo-specific verification
75
+ - `@zenuml/core` `v<version>` has been published
76
+ - this repo needs to adopt that release
77
+
78
+ ## Required Work
79
+ 1. Run: `<update-command>`
80
+ 2. Run: `<lockfile-refresh-command>` and include the lockfile in the PR when applicable
81
+ 3. Run: `<verify-command>` when applicable
82
+ 4. Keep the diff scoped to the core upgrade and any required integration fix
83
+ 5. Open a downstream PR that links back to this issue
84
+
85
+ ## Repo-Specific Notes
86
+ - <repo-specific note 1>
87
+ - <repo-specific note 2>
88
+
89
+ ## Acceptance Criteria
90
+ - repo is updated to `@zenuml/core` `v<version>` or the equivalent vendored build output
91
+ - lockfile is refreshed when the repo uses one
92
+ - verification command passes locally, or failure details are documented in the PR
93
+ - no unrelated dependency or renderer migrations are mixed into the change
91
94
  ```
92
95
 
93
- If a draft PR already exists for the same branch or same target version, reuse it and report it instead of creating a duplicate.
96
+ If an issue already exists for the same target version, do not create a duplicate. Reuse the open one, or report the closed one as already tracked.
94
97
 
95
98
  ## Workflow
96
99
 
@@ -110,31 +113,31 @@ Record:
110
113
 
111
114
  For each repo in [references/downstreams.md](references/downstreams.md):
112
115
 
113
- 1. Ensure you have a local checkout or clone target.
114
- 2. Fetch latest default branch state.
115
- 3. Create or reuse `chore/zenuml-core-v<version>`.
116
- 4. Update the dependency or bundled artifact according to the repo's conventions.
117
- 5. Inspect the diff and keep it scoped to the propagation work.
118
- 6. Run lightweight repo-appropriate verification if it is cheap and obvious.
119
- 7. Commit with:
120
-
121
- ```text
122
- chore: update @zenuml/core to v<version>
123
- ```
124
-
125
- 8. Push the branch.
126
- 9. Create or reuse a **draft** PR.
116
+ 1. Read the repo row carefully and extract the update command, verification command, and notes.
117
+ 2. Search for existing issues in that repo for the same core version, checking both open and closed issues.
118
+ 3. If an open match exists, reuse it and record the URL.
119
+ 4. If only a closed match exists, record it as `already-tracked` and do not create a duplicate unless the user explicitly asked for that.
120
+ 5. Otherwise create a new issue using the standard title and a repo-specific body.
121
+ 6. Make sure the issue body includes:
122
+ - the target core version
123
+ - the exact update command from the table
124
+ - the lockfile refresh command when the repo uses pnpm or yarn
125
+ - the exact verify command when one is defined
126
+ - the renderer and API caveats from the repo notes
127
+ - an explicit instruction to open a PR linked to the issue after the work is complete
128
+ - a version marker that makes future deduplication easy, such as `Core version: v<version>`
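
The reuse-or-create decision in steps 2 to 5 could look roughly like the sketch below. It is not taken from the skill: the issue object shape (`title`, `body`, `state`, `url`) and the function names are assumptions, and fetching the issues (for example with the GitHub CLI) is left out.

```javascript
// Escape a string for literal use inside a RegExp.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// True when `text` mentions v<version> exactly, so that
// v3.47.1 does not match v3.47.10 or v3.47.1.2.
function mentionsVersion(text, version) {
  const pattern = new RegExp(`v${escapeRegExp(version)}(?![0-9])(?!\\.[0-9])`);
  return pattern.test(text);
}

// Hypothetical decision helper. `issues` is assumed to contain both
// open and closed issues for one downstream repo.
function resolveIssueAction(issues, version) {
  const matches = issues.filter(
    (issue) =>
      mentionsVersion(issue.title ?? "", version) ||
      mentionsVersion(issue.body ?? "", version)
  );
  const open = matches.find((issue) => issue.state === "open");
  if (open) return { status: "issue-reused", url: open.url };

  const closed = matches.find((issue) => issue.state === "closed");
  if (closed) return { status: "already-tracked", url: closed.url };

  // No match: the caller would create an issue with the standard title.
  return { status: "create", title: `chore: roll out @zenuml/core v${version}` };
}
```

Matching on the bare `v<version>` string is a simplification; a stricter variant might require the `Core version: v<version>` marker in the body.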
127
129
 
128
130
  ### Step 3: Handle repo-specific blockers
129
131
 
130
132
  If a repo fails, capture exactly why:
131
133
 
134
+ - missing issue creation permissions
135
+ - existing issue search is ambiguous
136
+ - existing closed issue should be reopened but the policy is unclear
132
137
  - dependency location unclear
133
- - package manager / lockfile conflict
134
- - update compiles locally but tests fail
135
- - missing permissions
136
- - repo missing locally and clone failed
137
- - PR creation failed
138
+ - package manager or package filter is unclear
139
+ - repo notes are insufficient to write a safe instruction
140
+ - issue creation failed
138
141
 
139
142
  Do not let one repo failure stop the rest of the batch.
140
143
 
@@ -143,21 +146,23 @@ Do not let one repo failure stop the rest of the batch.
143
146
  At the end, produce a per-repo summary with:
144
147
 
145
148
  - repo
146
- - branch
147
- - PR URL or reused PR URL
149
+ - issue URL or matched prior issue URL
148
150
  - final status
149
151
  - blocker if any
150
152
 
151
- ## Repo Update Guidance
153
+ ## Repo Issue Guidance
152
154
 
153
- Each downstream has specific update and verification commands documented in [references/downstreams.md](references/downstreams.md). Follow the table exactly do not guess package managers or update commands.
155
+ Each downstream has specific update and verification commands documented in [references/downstreams.md](references/downstreams.md). Follow the table exactly when drafting instructions. Do not guess package managers, package filters, or update commands.
154
156
 
155
157
  For each repo:
156
158
 
157
- 1. Run the **Update Command** from the table
158
- 2. Run the **lockfile refresh** (`pnpm install` or `yarn install`) — always commit the updated lockfile
159
- 3. Run the **Verify Command** from the table — if it fails, report the failure and move on
160
- 4. Commit only the dependency change + lockfile — nothing else
159
+ 1. Include the **Update Command** from the table verbatim
160
+ 2. Include the lockfile refresh step:
161
+ - `pnpm install` for pnpm repos
162
+ - `yarn install` for yarn repos
163
+ 3. Include the **Verify Command** from the table verbatim when one exists
164
+ 4. Tell the downstream team to keep the change scoped to the core upgrade and any required integration fix
165
+ 5. Tell the downstream team to open a PR after verification and link it back to the issue
161
166
 
162
167
  Special handling for renderer API changes:
163
168
 
@@ -165,23 +170,24 @@ Special handling for renderer API changes:
165
170
  - `mermaid-js/mermaid-live-editor` is an indirect SVG-renderer consumer through `@mermaid-js/mermaid-zenuml`. Do not add `@zenuml/core` directly there just to follow a core release.
166
171
  - `web-sequence`, `confluence-plugin-cloud`, `diagramly.ai`, and similar downstreams stay on the HTML-renderer path unless the user explicitly asks for a renderer migration.
167
172
 
168
- Prefer the smallest change that updates the downstream safely:
173
+ Prefer the smallest downstream task description that updates the repo safely:
169
174
 
170
175
  - package dependency bumps
171
176
  - lockfile refreshes
172
- - vendored asset refreshes only when the repo actually vendors core output (e.g., `jetbrains-zenuml`)
177
+ - vendored asset refreshes only when the repo actually vendors core output, such as `jetbrains-zenuml`
173
178
 
174
- Do not opportunistically clean up unrelated code while touching the downstream repo.
179
+ Do not ask downstream teams to opportunistically clean up unrelated code while doing the upgrade.
175
180
 
176
- If a downstream repo needs custom update logic that is not obvious from the table or its files, stop on that repo and report the ambiguity.
181
+ If a downstream repo needs custom update logic that is not obvious from the table or its notes, stop on that repo and report the ambiguity instead of inventing instructions.
177
182
 
178
183
  ## Safety
179
184
 
185
+ - Never update downstream repos directly from this skill.
180
186
  - Never merge downstream PRs from this skill.
181
- - Never force-push unless the user explicitly asks.
182
- - Never batch all downstream repos into one branch or one PR.
187
+ - Never batch all downstream repos into one issue.
188
+ - Never file duplicate issues for the same repo and core version.
183
189
  - Never hide per-repo failures behind a single "batch failed" message.
184
- - Never update unrelated dependencies in the same PR.
190
+ - Never ask downstream teams to update unrelated dependencies in the same PR.
185
191
 
186
192
  ## Output
187
193
 
@@ -189,12 +195,11 @@ Final report format:
189
195
 
190
196
  ```markdown
191
197
  ## Downstream Propagation Report
192
- - Core version: v<version>
198
+ - Core version: `v<version>`
193
199
  - Overall: <N> succeeded, <N> reused, <N> skipped, <N> failed
194
200
 
195
201
  ### Repo Results
196
- - `<repo>`: draft PR opened | draft PR reused | already updated | failed
197
- branch: `<branch-name>`
198
- pr: <url or none>
202
+ - `<repo>`: issue opened | issue reused | already tracked | failed
203
+ issue: <url or none>
199
204
  notes: <short reason or blocker>
200
205
  ```
@@ -1,7 +1,7 @@
1
1
  interface:
2
2
  display_name: "Propagate Core Release"
3
- short_description: "Open downstream draft PRs for a published core version"
4
- default_prompt: "Use $propagate-core-release after @zenuml/core has been published to update the configured downstream repos on per-repo branches and open or reuse draft PRs. Do not merge the downstream PRs."
3
+ short_description: "Open downstream rollout issues for a published core version"
4
+ default_prompt: "Use $propagate-core-release after @zenuml/core has been published to open or reuse per-repo downstream issues with explicit rollout instructions. Do not update downstream repos directly and do not open PRs on their behalf."
5
5
 
6
6
  policy:
7
7
  allow_implicit_invocation: true
@@ -0,0 +1,183 @@
1
+ ---
2
+ name: zenuml-ux-research
3
+ description: Audit one ZenUML user interaction scenario at a time (e.g., inserting a message, renaming a participant) against diagramming-tool best practices. Uses claude-in-chrome to walk through the flow in a live browser and writes a gap-only markdown report to docs/ux-research/. Use when the user says "audit ux of", "zenuml ux research", "analyze interaction for zenuml", "run ux research on", or "/zenuml-ux-research". Produces a research report, not an audit pass/fail matrix.
4
+ ---
5
+
6
+ # ZenUML UX Research
7
+
8
+ This skill audits a single ZenUML interaction scenario against diagramming-tool best practices and writes a gap-only markdown report. It is a research tool, not an audit or regression tool. Run it interactively, read the report, and act on it by hand. Never wire it into CI.
9
+
10
+ ## When to invoke
11
+
12
+ - User asks "audit ux of X", "zenuml ux research", "analyze interaction for X".
13
+ - User runs `/zenuml-ux-research <scenario-id>` or `/zenuml-ux-research "free-text goal"`.
14
+ - User asks for a specific gap analysis in the ZenUML editor experience.
15
+
16
+ Do NOT invoke this skill for pixel-level comparison (that is `dia-scoring`), parser behavior, or build/deploy tasks.
17
+
18
+ ## Invocation parameters
19
+
20
+ - **Scenario identifier (positional):** either a catalog ID like `insert-message` or a free-text goal like `"audit how users insert a message between A and B"`.
21
+ - **`--url <url>` (optional):** target URL. Default `http://localhost:4000`. Can point to a deployed staging URL.
22
+ - **`--allow-prod` (optional):** required for any URL that is not `localhost`, `127.0.0.1`, or a known staging subdomain. The skill is read-only against the target, but this flag forces the human to confirm they know they're pointing at a real-users environment.
23
+
24
+ ## Dependencies
25
+
26
+ - `claude-in-chrome` MCP tools (for walkthrough). If these are not yet loaded in the session, the skill must instruct the user to load them via `ToolSearch` and stop; the walkthrough cannot run in text-only mode.
27
+ - ZenUML dev server or a reachable URL (default `http://localhost:4000`).
28
+ - `Read` / `Grep` tools (for static source analysis of `zenuml-core/src/`).
29
+ - `WebSearch` / `WebFetch` tools (for targeted best-practice lookups; optional).
30
+
31
+ ## Files this skill uses at runtime
32
+
33
+ - `references/scenarios/<scenario-id>.md` — loaded at Phase A.
34
+ - `references/assertion-catalog.md` — loaded at Phase B.
35
+ - `references/best-practices-overview.md` — loaded at Phase B for narrative framing.
36
+ - `references/report-template.md` — loaded at Phase F.
37
+
38
+ ## Report output
39
+
40
+ - Written to `zenuml-core/docs/ux-research/YYYY-MM-DD-<scenario-id>.md`.
41
+ - Create the directory if it doesn't exist (`mkdir -p`).
42
+ - On filename collision, append `-2`, `-3`, etc. Never overwrite.
43
+ - Never commit the report automatically — the human decides.
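
The naming and collision rule above can be illustrated with a short sketch. The function name `nextReportName` is hypothetical, and the list of existing filenames is assumed to come from a directory listing of `docs/ux-research/`.

```javascript
// Hypothetical helper: pick a report filename that never overwrites
// an existing one. `date` is a YYYY-MM-DD string.
function nextReportName(date, scenarioId, existingNames) {
  const taken = new Set(existingNames);
  const base = `${date}-${scenarioId}`;
  let candidate = `${base}.md`;
  // On collision, try -2, -3, ... until a free name is found.
  for (let n = 2; taken.has(candidate); n += 1) {
    candidate = `${base}-${n}.md`;
  }
  return candidate;
}
```

For example, if `2026-04-15-rename-participant.md` already exists, the next report for the same day and scenario would be named `2026-04-15-rename-participant-2.md`.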
44
+
45
+ ## Workflow
46
+
47
+ ### Phase A — Scenario resolution
48
+
49
+ 1. Determine whether the invocation is a catalog ID or free-text.
50
+ 2. **Catalog ID:**
51
+ - Check that `references/scenarios/<id>.md` exists.
52
+ - If not, list all available scenario filenames (glob `references/scenarios/*.md`) and stop.
53
+ - Load the file. Verify it has front matter with `id` and `title`, plus headings for `User intent`, `Starting DSL`, `Target DSL`, `Relevant assertion categories`. If any are missing, print which field is missing from which file and stop.
54
+ 3. **Free-text:**
55
+ - Synthesize a scenario record with the same fields (id, title, user intent, starting DSL, target DSL, relevant categories).
56
+ - Present the synthesized record to the user and wait for explicit confirmation.
57
+ - Do NOT proceed on an unconfirmed synthesized scenario.
58
+ 4. Check `--url` reachability with a quick HTTP GET.
59
+ - If unreachable and the URL is local: print the exact fix command (`cd /Users/penxia/ai-personal/zenuml-core && bun run dev`) and stop.
60
+ - If unreachable and the URL is remote: print the HTTP status and stop.
61
+ - If reachable and the URL is non-local but `--allow-prod` was not passed: warn and stop. Exception: non-local URLs that match known staging patterns (e.g., contain `staging`, `preview`, or `github.io` for the gh-pages build) may proceed with a one-line warning instead of a hard stop.
62
+ 5. Confirm `claude-in-chrome` tools are loaded. If not: instruct the user to load them via `ToolSearch` with query `"select:mcp__claude-in-chrome__tabs_context_mcp,mcp__claude-in-chrome__tabs_create_mcp,mcp__claude-in-chrome__navigate,mcp__claude-in-chrome__find,mcp__claude-in-chrome__computer,mcp__claude-in-chrome__read_page,mcp__claude-in-chrome__read_console_messages,mcp__claude-in-chrome__javascript_tool"` and stop.
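
The URL gating described in step 4 might be expressed as follows. This is a sketch under assumptions: the function name `urlPolicy` and its return values are hypothetical, the staging hints are checked against the hostname only, and the reachability check itself is omitted.

```javascript
// Hypothetical helper: decide whether a reachable target URL may be audited.
// Returns "proceed", "warn-and-proceed", or "stop".
function urlPolicy(rawUrl, allowProd) {
  const host = new URL(rawUrl).hostname;

  // Local targets never need --allow-prod.
  if (host === "localhost" || host === "127.0.0.1") return "proceed";

  // The human explicitly confirmed a non-local target.
  if (allowProd) return "proceed";

  // Known staging patterns get a one-line warning, not a hard stop.
  const stagingHints = ["staging", "preview", "github.io"];
  if (stagingHints.some((hint) => host.includes(hint))) return "warn-and-proceed";

  return "stop";
}
```

Checking the hostname rather than the full URL is a deliberately narrow choice, so that a path segment such as `/preview` on a production host does not bypass the flag.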
63
+
64
+ ### Phase B — Hypothesis formation
65
+
66
+ 1. Read the scenario's User intent.
67
+ 2. Read `references/best-practices-overview.md` for narrative framing.
68
+ 3. Scan `references/assertion-catalog.md` for rules whose category is in the scenario's Relevant assertion categories list. Treat these as **priors** — starting points for what you expect to see — not as a checklist.
69
+ 4. Form a short list of expectations in working memory. Example for `rename-participant`: "I expect Enter on selected participant to enter edit mode (KBD-03). I expect caret at end (EDT-02). I expect Escape to cancel (KBD-04). I expect undo granularity to be at the label level (UND-02)."
70
+ 5. **Hypotheses are NOT limited to the catalog.** Form open-ended expectations based on general best practices and common sense. If the scenario suggests territory the catalog is silent on, run **1–3** targeted `WebSearch` queries (e.g., "how does tldraw handle arrow-key navigation between shapes"). Keep the budget tight.
71
+
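Step 3 amounts to filtering the catalog by the scenario's relevant categories. A rough sketch follows, assuming the catalog has already been parsed into records; the `Rule` shape and the category labels are hypothetical, since the catalog's on-disk format is not shown here.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Rule:
    rule_id: str   # e.g. "KBD-03"
    category: str  # hypothetical category label, e.g. "keyboard"
    check: str     # what the rule expects to observe


def priors_for(rules: Iterable[Rule], relevant_categories: Iterable[str]) -> List[Rule]:
    """Keep only rules whose category the scenario lists as relevant."""
    wanted = set(relevant_categories)
    return [rule for rule in rules if rule.category in wanted]
```

The result is a list of priors only. Open-ended hypotheses from step 5 are formed in addition to it, not filtered through it.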
72
+ ### Phase C — Browser walkthrough
73
+
74
+ 1. `mcp__claude-in-chrome__tabs_context_mcp` → get current tab state (do not reuse existing tabs from prior sessions).
75
+ 2. `mcp__claude-in-chrome__tabs_create_mcp` → open a new tab.
76
+ 3. `mcp__claude-in-chrome__navigate` → navigate to the `--url`.
77
+ 4. Wait for the page to load. Call `mcp__claude-in-chrome__read_console_messages` after each interaction to catch runtime errors.
78
+ 5. **Seed the starting state** by interacting with the DSL editor pane to type the scenario's Starting DSL. This is setup, not walkthrough — failures here are infrastructure errors.
79
+ - If Starting DSL is empty, no seeding is needed.
80
+ - If seeding itself fails (e.g., the DSL editor is unreachable), stop, report "could not seed starting state via DSL editor" as a walkthrough-blocker, and do NOT write a report. This is worse than a gap — it's a dead environment.
81
+ 6. **Attempt to reach Target DSL via the most discoverable path a new user would try first.** Record each step:
82
+ - What was attempted (e.g., "clicked canvas area to the right of participant B")
83
+ - What happened (e.g., "no visible change; console warning: `[zenuml] unknown click target`")
84
+ - Whether it advanced toward Target DSL
85
+ 7. If the first path fails or hits friction, try 1–2 alternative paths (toolbar, keyboard shortcut, DSL edit). Record each.
86
+ 8. **Capture screenshots only at decision moments**, not every step — keeps reports readable. Use `mcp__claude-in-chrome__computer` for screenshots if the tool is available.
87
+ 9. **Hard stop after 3 failed attempts on the same step.** Record "could not perform step X after 3 attempts" and move on or stop. Do NOT loop.
88
+
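The hard stop in step 9 is a bounded retry. As an illustration only, with `attempt_step` and the record fields being hypothetical names rather than anything the skill defines:

```python
from typing import Callable, Dict

MAX_ATTEMPTS = 3  # the hard stop from step 9


def attempt_step(name: str, action: Callable[[], bool]) -> Dict[str, object]:
    """Run `action` at most MAX_ATTEMPTS times and return a walkthrough record."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        if action():
            return {"step": name, "ok": True, "attempts": attempt, "note": ""}
    # Never loop past the limit: record the failure and let the caller decide.
    return {
        "step": name,
        "ok": False,
        "attempts": MAX_ATTEMPTS,
        "note": f"could not perform step {name} after {MAX_ATTEMPTS} attempts",
    }
```

Whether the walkthrough then moves on or stops is left to the caller, as in the step itself.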
89
+ ### Phase D — Gap detection
90
+
91
+ 1. For each observation, compare against the corresponding hypothesis.
92
+ 2. **If observation matches hypothesis: drop it. Do not record. Silence is correct.**
93
+ 3. **If observation diverges from hypothesis: record a gap.** Each gap has:
94
+ - Headline (short, e.g., "Enter on selected participant does nothing")
95
+ - Observed (verbatim)
96
+ - Expected (from hypothesis)
97
+ - Catalog ID (scan `references/assertion-catalog.md` for a rule whose `Applies when` and `Check` match; cite that ID. If no rule matches, label the gap `novel — candidate for new rule`.)
98
+ - Exemplars (from the catalog if cited, else from web search)
99
+ - Rationale
100
+ - Severity (`low`, `med`, `high`) — use the catalog rule's severity if cited, else judge based on impact
101
+ 4. Novel gaps are flagged but NOT auto-written to the catalog. The human reviews them and folds them in manually later.
102
+
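The drop-or-record rule above can be sketched as one function that returns nothing for a match and a gap record for a divergence. The `Gap` fields below are a subset of the list in step 3, and every name is an assumption for illustration; the short label `novel` stands in for the full novel-gap label.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Gap:
    headline: str
    observed: str    # verbatim observation
    expected: str    # from the hypothesis
    catalog_id: str  # a catalog ID such as "KBD-03", or "novel"
    severity: str    # "low", "med", or "high"


def detect_gap(
    headline: str,
    observed: str,
    expected: str,
    matched: bool,
    catalog_id: Optional[str] = None,
    catalog_severity: Optional[str] = None,
    judged_severity: str = "med",
) -> Optional[Gap]:
    """Return None when the observation matches; otherwise build a Gap."""
    if matched:
        return None  # silence is correct: matching observations are not recorded
    if catalog_id is None:
        # No catalog rule matched: flag as novel and judge severity by impact.
        return Gap(headline, observed, expected, "novel", judged_severity)
    # Cited rule: prefer the catalog's severity when it provides one.
    return Gap(headline, observed, expected, catalog_id, catalog_severity or judged_severity)
```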
103
+ ### Phase E — Targeted static source analysis
104
+
105
+ For each gap, use `Grep` on `/Users/penxia/ai-personal/zenuml-core/src/` to find the relevant code path:
106
+
107
+ - For keyboard interactions: grep for the key name (`'Enter'`, `'Escape'`) and keydown listeners.
108
+ - For selection state: grep for `select`, `Selection`, `aria-selected`, and the Jotai atoms.
109
+ - For inline editing: grep for `contenteditable`, `input`, and component names like `Participant` and `Message`.
110
+ - For undo/redo: grep for `undo`, `history`, and the Jotai atoms that track history state.
111
+
112
+ Attach `file:line` pointers to each gap. If no handler is found, write `"no code path found — this is a missing implementation, not a misrouted one."` Often the most useful finding.
113
+
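To make the `file:line` pointer format concrete, here is a plain-Python stand-in for the `Grep` tool. It is a sketch under assumptions: the function names are hypothetical, and restricting the search to `.ts` and `.tsx` files is a guess about the source tree, not something the skill specifies.

```python
import re
from pathlib import Path
from typing import List, Sequence


def pointers_in_text(rel_path: str, text: str, pattern: str) -> List[str]:
    """Return 'file:line' pointers for every line of `text` matching `pattern`."""
    regex = re.compile(pattern)
    return [
        f"{rel_path}:{lineno}"
        for lineno, line in enumerate(text.splitlines(), start=1)
        if regex.search(line)
    ]


def pointers_in_tree(root: str, pattern: str,
                     suffixes: Sequence[str] = (".ts", ".tsx")) -> List[str]:
    """Walk `root` and collect pointers from files with the given suffixes."""
    root_path = Path(root)
    hits: List[str] = []
    for path in sorted(root_path.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            rel = path.relative_to(root_path).as_posix()
            hits.extend(pointers_in_text(rel, text, pattern))
    return hits
```

An empty result is the signal to attach the "no code path found" note to the gap instead of a pointer.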
114
+ ### Phase F — Report writing
115
+
116
+ 1. Load `references/report-template.md`.
117
+ 2. Fill in all `{{placeholder}}` fields.
118
+ 3. Determine the output path:
119
+ - Today's date in `YYYY-MM-DD` format.
120
+ - Filename: `YYYY-MM-DD-<scenario-id>.md`.
121
+ - Full path: `/Users/penxia/ai-personal/zenuml-core/docs/ux-research/YYYY-MM-DD-<scenario-id>.md`.
122
+ 4. Create `docs/ux-research/` if it doesn't exist.
123
+ 5. If the filename already exists for today, append `-2`, `-3`, etc.
124
+ 6. Write the file.
125
+ 7. If gap count is zero, render the zero-gap form (omit Gaps and Playwright snippet sections, collapse the walkthrough to a one-line "No gaps observed on <sha>").
126
+
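Steps 3 to 6 describe a date-stamped filename with a numeric suffix on collision. A minimal sketch, assuming the suffix goes before the `.md` extension (the text does not say where it is appended) and using hypothetical function names:

```python
from datetime import date
from pathlib import Path
from typing import Set


def report_filename(scenario_id: str, today: date, existing: Set[str]) -> str:
    """Pick YYYY-MM-DD-<scenario-id>.md, appending -2, -3, ... on collision."""
    base = f"{today.isoformat()}-{scenario_id}"
    candidate = f"{base}.md"
    suffix = 2
    while candidate in existing:
        candidate = f"{base}-{suffix}.md"
        suffix += 1
    return candidate


def write_report(out_dir: Path, scenario_id: str, body: str, today: date) -> Path:
    """Create the output directory if needed, then write the report."""
    out_dir.mkdir(parents=True, exist_ok=True)
    existing = {p.name for p in out_dir.iterdir()}
    path = out_dir / report_filename(scenario_id, today, existing)
    path.write_text(body, encoding="utf-8")
    return path
```

Passing `today` in explicitly keeps the naming logic deterministic; a caller would normally supply `date.today()`.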
127
+ ### Phase G — Hand-off
128
+
129
+ 1. Print the report path.
130
+ 2. Print a one-line summary: `"Found N gaps (X high, Y med, Z low). Report at <path>."`
131
+ 3. Stop. Do NOT:
132
+ - auto-commit the report
133
+ - auto-fix any gap
134
+ - open a PR
135
+ - notify anyone
136
+ - run additional scenarios
137
+
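The one-line summary in step 2 can be produced from the list of gap severities. This is an illustrative sketch; `summary_line` is a hypothetical name, and the example path in the test is made up.

```python
from collections import Counter
from typing import Sequence


def summary_line(severities: Sequence[str], report_path: str) -> str:
    """Format the hand-off summary from a list of 'high'/'med'/'low' values."""
    counts = Counter(severities)  # missing keys count as 0
    return (
        f"Found {len(severities)} gaps "
        f"({counts['high']} high, {counts['med']} med, {counts['low']} low). "
        f"Report at {report_path}."
    )
```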
138
+ ## Error handling
139
+
140
+ **Invocation-time (fail fast, clear instructions):**
141
+
142
+ - Scenario ID not found → list `references/scenarios/*.md` filenames, stop.
143
+ - Free-text goal too ambiguous (e.g., can't infer starting or target DSL) → ask one clarifying question, re-confirm, only proceed on confirmation.
144
+ - Scenario file malformed → print file path and missing field, stop.
145
+ - `claude-in-chrome` tools not loaded → instruct user to load via ToolSearch (query shown above), stop.
146
+ - URL unreachable → print the fix command (local URL) or the HTTP status (remote URL), stop.
147
+ - Non-local URL without `--allow-prod` → warn and stop (known staging patterns get a one-line warning and proceed).
148
+
149
+ **Walkthrough-time (observe and record, do not panic):**
150
+
151
+ - Target state unreachable after 2–3 paths → this is itself a high-severity gap. Write a report with "scenario target is unreachable via discovered interaction paths" as the primary finding.
152
+ - Console error mid-walkthrough → captured, included in the walkthrough step, does not halt unless the app becomes unresponsive.
153
+ - Browser crash → stop, print observations so far, do NOT write a partial report.
154
+ - Screenshot failure → skipped, walkthrough step still recorded with `screenshot: failed`.
155
+ - Same step fails 3 times → stop retrying, record "could not perform step X", move on or stop.
156
+
157
+ **Analysis-time (degrade gracefully):**
158
+
159
+ - Static analysis finds no handler → say so explicitly.
160
+ - Web search returns nothing → fall back to catalog and common sense.
161
+ - Catalog has no matching rule → label gap `novel`, flag as growth candidate.
162
+
163
+ **Output-time:**
164
+
165
+ - `docs/ux-research/` does not exist → create it.
166
+ - Filename collision → append `-2`, `-3`, etc.
167
+ - Git SHA capture fails → write `unknown` in metadata. Do not abort.
168
+
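The "do not abort" rule for Git SHA capture maps to a call that swallows every failure mode and falls back to `unknown`. A sketch, assuming the short SHA is wanted (the skill does not say whether the metadata uses the short or full form) and with `capture_git_sha` as a hypothetical name:

```python
import subprocess


def capture_git_sha(repo_dir: str) -> str:
    """Return the current commit SHA, or 'unknown' if it cannot be read."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--short", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
            timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        # Covers a missing git binary, a missing directory, a non-repo
        # directory (non-zero exit), and a timeout.
        return "unknown"
    return result.stdout.strip() or "unknown"
```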
169
+ ## What this skill does NOT do
170
+
171
+ - Retry failed walkthrough paths indefinitely.
172
+ - Auto-fix any gap.
173
+ - Commit the report, open a PR, notify anyone.
174
+ - Run multiple scenarios in one invocation.
175
+ - Run Playwright — only emits a snippet for the human to use.
176
+ - Touch production deploy state.
177
+
178
+ ## Extending the skill
179
+
180
+ - **New scenario:** drop a new file into `references/scenarios/`, matching the format of existing scenarios. The skill discovers it automatically.
181
+ - **New assertion rule:** append it to `references/assertion-catalog.md` with the next sequential ID in its category. Never renumber existing rules.
182
+ - **Catalog growth from novel gaps:** when a run flags a `novel` gap, the human reviews and, if appropriate, adds a new rule to the catalog by hand.
183
+ - **Calibration drift:** any substantial change to this SKILL.md should be followed by re-running both calibration scenarios (see the plan document).
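
For the "next sequential ID in its category" rule, a small helper can compute the ID from the existing ones. This sketch assumes IDs follow the `PREFIX-NN` shape seen in the examples (`KBD-03`, `EDT-02`, `UND-02`) with two-digit zero padding; `next_rule_id` is a hypothetical name.

```python
import re
from typing import Iterable


def next_rule_id(existing_ids: Iterable[str], prefix: str) -> str:
    """Return the next ID for `prefix`, one past the highest number in use."""
    pattern = re.compile(rf"^{re.escape(prefix)}-(\d+)$")
    numbers = [int(m.group(1)) for m in map(pattern.match, existing_ids) if m]
    next_number = max(numbers, default=0) + 1
    return f"{prefix}-{next_number:02d}"
```

Taking one past the maximum, rather than counting rules, means a removed rule never has its ID reused, which keeps existing IDs stable.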