@rolepod/uiproof 0.4.1 → 0.6.0

@@ -1,6 +1,6 @@
1
1
  {
2
2
  "$schema": "https://json-schema.org/draft/2019-09/schema",
3
- "rolepod_mcp_version": "0.4.1",
3
+ "rolepod_mcp_version": "0.6.0",
4
4
  "tools": {
5
5
  "rolepod_browser_open": {
6
6
  "$schema": "https://json-schema.org/draft/2019-09/schema#"
@@ -32,6 +32,39 @@
32
32
  "rolepod_browser_navigate": {
33
33
  "$schema": "https://json-schema.org/draft/2019-09/schema#"
34
34
  },
35
+ "rolepod_browser_hover": {
36
+ "$schema": "https://json-schema.org/draft/2019-09/schema#"
37
+ },
38
+ "rolepod_browser_drag": {
39
+ "$schema": "https://json-schema.org/draft/2019-09/schema#"
40
+ },
41
+ "rolepod_browser_fill_form": {
42
+ "$schema": "https://json-schema.org/draft/2019-09/schema#"
43
+ },
44
+ "rolepod_browser_upload_file": {
45
+ "$schema": "https://json-schema.org/draft/2019-09/schema#"
46
+ },
47
+ "rolepod_browser_handle_dialog": {
48
+ "$schema": "https://json-schema.org/draft/2019-09/schema#"
49
+ },
50
+ "rolepod_browser_console": {
51
+ "$schema": "https://json-schema.org/draft/2019-09/schema#"
52
+ },
53
+ "rolepod_browser_network": {
54
+ "$schema": "https://json-schema.org/draft/2019-09/schema#"
55
+ },
56
+ "rolepod_browser_set_env": {
57
+ "$schema": "https://json-schema.org/draft/2019-09/schema#"
58
+ },
59
+ "rolepod_browser_evaluate": {
60
+ "$schema": "https://json-schema.org/draft/2019-09/schema#"
61
+ },
62
+ "rolepod_browser_pages": {
63
+ "$schema": "https://json-schema.org/draft/2019-09/schema#"
64
+ },
65
+ "rolepod_browser_switch_page": {
66
+ "$schema": "https://json-schema.org/draft/2019-09/schema#"
67
+ },
35
68
  "rolepod_verify_ui_flow": {
36
69
  "$schema": "https://json-schema.org/draft/2019-09/schema#"
37
70
  },
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@rolepod/uiproof",
3
- "version": "0.4.1",
3
+ "version": "0.6.0",
4
4
  "description": "Multi-platform UI/mobile automation for AI agents — MCP server + shipped skills.",
5
5
  "keywords": [
6
6
  "mcp",
@@ -43,6 +43,15 @@ MCP server. No fallback (D-024).
43
43
  3. Surface counts + critical/serious issues inline; reference the report
44
44
  path for the full list.
45
45
 
46
+ ## Evidence routing
47
+
48
+ Run artifacts are saved under:
49
+
50
+ - **Standalone:** `.rolepod-uiproof/artifacts/<prefix>_<ts>_<uuid>/`
51
+ - **With `rolepod` parent** (when `ROLEPOD_PARENT=1` is set by the parent's SessionStart hook): `.rolepod/evidence/<ts>-rolepod-uiproof-<skill>/`
52
+
53
+ Either way the run directory contains a `manifest.json` per Extension Protocol v1, so the parent's `check-work` skill can aggregate results into the verify phase report. Standalone users can read the manifest themselves — same shape.
54
+
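The two layouts above can be sketched as a small path helper. This is a minimal illustration, not the package's implementation: the function name `run_directory`, its parameters, and the example timestamp and skill values are assumptions; only the directory templates and the `ROLEPOD_PARENT=1` switch come from the text.

```python
import os
from pathlib import Path


def run_directory(prefix, ts, run_uuid, skill, env=None):
    """Return the artifact directory for one run (hypothetical helper).

    Standalone:  .rolepod-uiproof/artifacts/<prefix>_<ts>_<uuid>/
    With parent: .rolepod/evidence/<ts>-rolepod-uiproof-<skill>/
    """
    env = os.environ if env is None else env
    # Assumption: the parent signals its presence with exactly the value "1".
    if env.get("ROLEPOD_PARENT") == "1":
        return Path(".rolepod") / "evidence" / f"{ts}-rolepod-uiproof-{skill}"
    return Path(".rolepod-uiproof") / "artifacts" / f"{prefix}_{ts}_{run_uuid}"


standalone = run_directory("verify", "20260524T101512", "a1b2c3d4", "verify-ui", env={})
parented = run_directory(
    "verify", "20260524T101512", "a1b2c3d4", "verify-ui", env={"ROLEPOD_PARENT": "1"}
)
```

Passing `env` explicitly keeps the helper easy to exercise without touching the real process environment.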
46
55
  ## If the tool is unavailable
47
56
 
48
57
  Surface plainly:
@@ -0,0 +1,123 @@
1
+ ---
2
+ name: check-errors
3
+ description: Drive a flow and fail if any console error or failed network request occurs. Thin wrapper over verify_ui_flow with strict error-only assertions. Use to gate merges on "no regressions during this flow".
4
+ ---
5
+
6
+ # /check-errors
7
+
8
+ Thin wrapper over **`rolepod_verify_ui_flow`** focused on the question:
9
+
10
+ > Does this flow run cleanly — no console errors, no failed requests?
11
+
12
+ Use after `/verify-ui` confirms the feature works, OR as a fast smoke
13
+ check before merging.
14
+
15
+ ## When to use
16
+
17
+ - After feature work, to gate "did I introduce a regression somewhere?"
18
+ - During PR review, to confirm the happy path doesn't spew errors.
19
+ - After dependency upgrades, to catch a quiet console break.
20
+ - After CSP / CORS / API auth changes — common cause of silent 4xx/5xx.
21
+
22
+ ## When NOT to use
23
+
24
+ - You want to assert specific UI text — use `/verify-ui` instead.
25
+ - You only care about visual regression — use `/visual-diff`.
26
+ - You want a11y compliance — use `/audit-a11y`.
27
+ - Backend-only diff with no UI surface.
28
+
29
+ ## Inputs
30
+
31
+ - `url` — entry point.
32
+ - `steps` *(optional)* — drive the flow. Same shape as `/verify-ui` steps.
33
+ - `exclude_console_patterns` *(optional)* — substrings; matching console
34
+ errors are ignored. Useful for third-party SDKs that always log
35
+ noise (e.g. `["facebook.com", "googletagmanager"]`).
36
+ - `exclude_request_patterns` *(optional)* — same idea for URLs.
37
+ - `allow_4xx` *(optional, default false)* — if true, only 5xx counts as
38
+ a failure. Useful when 4xx is part of the auth happy path.
39
+
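The filtering rules described in these inputs can be illustrated with a short sketch. It assumes plain substring matching (as the text states) and models each request as a `(url, status)` pair; how the real tool treats requests that fail without an HTTP status is not stated in the diff, so that case is left out. All function names here are hypothetical.

```python
def is_excluded(text, patterns):
    """True when any exclude pattern occurs as a substring of text."""
    return any(pattern in text for pattern in patterns)


def failing_console_errors(messages, exclude_patterns=()):
    """Console error messages that still count after exclusions."""
    return [m for m in messages if not is_excluded(m, exclude_patterns)]


def failing_requests(requests, exclude_patterns=(), allow_4xx=False):
    """(url, status) pairs that count as failures.

    With allow_4xx=True only 5xx responses fail; otherwise 4xx and 5xx both do.
    """
    threshold = 500 if allow_4xx else 400
    return [
        (url, status)
        for url, status in requests
        if status >= threshold and not is_excluded(url, exclude_patterns)
    ]


observed = [
    ("https://app.example.com/api/me", 401),
    ("https://app.example.com/analytics", 500),
    ("https://app.example.com/api/pay", 502),
    ("https://app.example.com/", 200),
]
strict = failing_requests(observed, ["/analytics"])
lenient = failing_requests(observed, ["/analytics"], allow_4xx=True)
```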
40
+ ## Process
41
+
42
+ Call `rolepod_verify_ui_flow` with:
43
+
44
+ ```json
45
+ {
46
+ "mode": "assert",
47
+ "open": { "platform": "web", "url": "<url>" },
48
+ "steps": [ ...user-provided... ],
49
+ "expect": [
50
+ { "kind": "no_console_errors", "exclude_patterns": [...] },
51
+ { "kind": "no_failed_requests", "exclude_patterns": [...], "allow_4xx": false }
52
+ ],
53
+ "capture": ["screenshot", "console", "har"]
54
+ }
55
+ ```
56
+
57
+ Surface the result. On `passed: false`, point the user at `console.json`
58
+ and `network.har` in `evidence_paths` so they can drill in.
59
+
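The mapping from `/check-errors` inputs to the `rolepod_verify_ui_flow` payload shown above can be written out as a small builder. This is a sketch: the function name `build_check_errors_payload` is an assumption, and it only assembles the dictionary; it does not call the MCP tool.

```python
def build_check_errors_payload(
    url,
    steps=None,
    exclude_console_patterns=None,
    exclude_request_patterns=None,
    allow_4xx=False,
):
    """Assemble the verify_ui_flow input described in the Process section."""
    return {
        "mode": "assert",
        "open": {"platform": "web", "url": url},
        "steps": list(steps or []),
        "expect": [
            {
                "kind": "no_console_errors",
                "exclude_patterns": list(exclude_console_patterns or []),
            },
            {
                "kind": "no_failed_requests",
                "exclude_patterns": list(exclude_request_patterns or []),
                "allow_4xx": allow_4xx,
            },
        ],
        "capture": ["screenshot", "console", "har"],
    }


payload = build_check_errors_payload(
    "https://app.example.com",
    exclude_console_patterns=["sentry.io"],
    exclude_request_patterns=["/analytics"],
)
```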
60
+ ## Outputs
61
+
62
+ Same shape as `/verify-ui`:
63
+
64
+ - `passed: boolean`
65
+ - `failure_reason` — e.g. `Expectations failed: expect[0] no_console_errors`
66
+ - `evidence_paths.console` — JSON dump of console messages
67
+ - `evidence_paths.har` — full HAR file
68
+
69
+ ## Examples
70
+
71
+ ### Smoke check — landing page
72
+
73
+ User: "Open https://app.example.com and confirm no errors fire."
74
+
75
+ ```json
76
+ {
77
+ "open": { "platform": "web", "url": "https://app.example.com" },
78
+ "steps": [],
79
+ "expect": [
80
+ { "kind": "no_console_errors" },
81
+ { "kind": "no_failed_requests" }
82
+ ],
83
+ "capture": ["screenshot", "console", "har"]
84
+ }
85
+ ```
86
+
87
+ ### Drive a flow then assert clean
88
+
89
+ User: "Sign in then dashboard — make sure no console errors."
90
+
91
+ ```json
92
+ {
93
+ "open": { "platform": "web", "url": "https://app.example.com/login" },
94
+ "steps": [
95
+ { "kind": "fill_form", "fields": [
96
+ { "query": "Email", "value": "test@example.com" },
97
+ { "query": "Password", "value": "..." }
98
+ ]},
99
+ { "kind": "click", "query": "Sign in" },
100
+ { "kind": "wait_for", "condition": { "kind": "url_matches", "pattern": "dashboard" } }
101
+ ],
102
+ "expect": [
103
+ { "kind": "no_console_errors", "exclude_patterns": ["sentry.io"] },
104
+ { "kind": "no_failed_requests", "exclude_patterns": ["/analytics"] }
105
+ ],
106
+ "capture": ["screenshot", "console", "har"]
107
+ }
108
+ ```
109
+
110
+ ## Evidence routing
111
+
112
+ Run artifacts are saved under:
113
+
114
+ - **Standalone:** `.rolepod-uiproof/artifacts/<prefix>_<ts>_<uuid>/`
115
+ - **With `rolepod` parent** (when `ROLEPOD_PARENT=1` is set by the parent's SessionStart hook): `.rolepod/evidence/<ts>-rolepod-uiproof-<skill>/`
116
+
117
+ Either way the run directory contains a `manifest.json` per Extension Protocol v1. Because `/check-errors` wraps `rolepod_verify_ui_flow`, the manifest is written by the underlying composite tool — same shape, same fields.
118
+
119
+ ## If the tool is unavailable
120
+
121
+ > The `/check-errors` skill needs the **rolepod-uiproof** MCP server,
122
+ > which is not currently available. Confirm the plugin is installed and
123
+ > try again, or check that `npx -y rolepod-uiproof` is reachable.
@@ -22,6 +22,20 @@ MCP server. No fallback (D-024).
22
22
  - The scenario is too vague to scaffold — ask the user to clarify before
23
23
  calling.
24
24
 
25
+ ## Coverage
26
+
27
+ The codegen handles every step kind and expect kind supported by
28
+ `/verify-ui` (click, type, key, navigate, wait_for, hover, drag,
29
+ fill_form, upload, dialog, set_env, switch_page, evaluate; text_visible,
30
+ text_absent, url_matches, ref_in_state, no_console_errors,
31
+ no_failed_requests, request_made, response_status).
32
+
33
+ Playwright-test gets first-class translation for everything that has a
34
+ direct Playwright API. Pytest+selenium covers the basics; expect kinds
35
+ that need network introspection (no_failed_requests, request_made,
36
+ response_status) emit a TODO referencing `selenium-wire` or BiDi, since
37
+ upstream Selenium has no network-capture primitive.
38
+
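The pytest+selenium fallback described above might look roughly like the following dispatch. This is a hypothetical sketch of the idea, not the package's codegen: the function name, the emitted strings, and the set of handled kinds are all assumptions. The generated `url_matches` line assumes the emitted test file imports `re`.

```python
# Expect kinds that need network introspection, per the Coverage notes.
NETWORK_EXPECT_KINDS = {"no_failed_requests", "request_made", "response_status"}


def selenium_expect_line(expect):
    """Return one line of generated pytest+selenium code for an expect entry."""
    kind = expect["kind"]
    if kind in NETWORK_EXPECT_KINDS:
        # No network-capture primitive upstream, so emit a TODO instead.
        return f"# TODO: '{kind}' needs network capture (e.g. selenium-wire or BiDi)"
    if kind == "text_visible":
        return f"assert {expect['text']!r} in driver.find_element('tag name', 'body').text"
    if kind == "url_matches":
        return f"assert re.search({expect['pattern']!r}, driver.current_url)"
    return f"# TODO: expect kind '{kind}' is not covered by this sketch"


todo_line = selenium_expect_line({"kind": "request_made", "url_pattern": "/api/checkout"})
text_line = selenium_expect_line({"kind": "text_visible", "text": "Thank you"})
```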
25
39
  ## Inputs
26
40
 
27
41
  - `framework` — `playwright-test` | `vitest+playwright` | `pytest+selenium`.
@@ -48,6 +62,15 @@ MCP server. No fallback (D-024).
48
62
  3. Print the generated file path and the setup steps. Surface
49
63
  `dependencies` as an install command.
50
64
 
65
+ ## Evidence routing
66
+
67
+ Run artifacts (the generated test file) are saved under:
68
+
69
+ - **Standalone:** `.rolepod-uiproof/artifacts/<prefix>_<ts>_<uuid>/`
70
+ - **With `rolepod` parent** (when `ROLEPOD_PARENT=1` is set by the parent's SessionStart hook): `.rolepod/evidence/<ts>-rolepod-uiproof-<skill>/`
71
+
72
+ Either way the run directory contains a `manifest.json` per Extension Protocol v1 (with `phase: "build"` for this skill).
73
+
51
74
  ## If the tool is unavailable
52
75
 
53
76
  Surface plainly:
@@ -1,83 +1,125 @@
1
1
  ---
2
2
  name: verify-ui
3
- description: Drive a real browser session through steps and assert expected outcomes; save evidence under ./.rolepod-uiproof/artifacts/. Use when a diff changes visible behavior and code-level tests do not prove it. v0.1 web only.
3
+ description: Drive a real browser session through steps and assert expected outcomes including console errors, network failures, and visual state. Save evidence under ./.rolepod-uiproof/artifacts/. Web only.
4
4
  ---
5
5
 
6
6
  # /verify-ui
7
7
 
8
8
  Single-backend skill. Calls **`rolepod_verify_ui_flow`** on the rolepod-uiproof
9
9
  MCP server and surfaces the structured result. No fallback (D-024) — if the
10
- tool is unavailable, this skill fails with a clear diagnostic so the caller
11
- (typically the user, or the parent `rolepod` plugin's `check-work` skill)
12
- can decide what to do next.
10
+ tool is unavailable, this skill fails with a clear diagnostic.
13
11
 
14
12
  ## When to use
15
13
 
16
14
  - A diff changes user-visible behavior on a web target.
17
- - A URL is reachable (dev server is running, or the target is a deployed URL).
18
- - Code-level tests (unit, type-check, lint) do not prove the visible
19
- outcome.
15
+ - A URL is reachable (dev server is running, or the target is deployed).
16
+ - You want to prove the UI works AND has no console errors / failed
17
+ requests / regressed visuals — code-level tests can't do that.
20
18
 
21
19
  ## When NOT to use
22
20
 
23
21
  - Backend-only diffs (no UI change).
24
22
  - Doc, config, or build-tool changes with no behavior surface.
25
- - No dev server / target available — ask the user to spin one up first
26
- before invoking.
27
- - iOS / Android targets: mobile ships in v0.3 (`platform: 'ios' | 'android'`).
23
+ - No dev server / target available — ask the user to spin one up first.
24
+ - iOS / Android targets — mobile is partially supported (basic input);
25
+ console / network / set_env / evaluate are web-only.
28
26
 
29
27
  ## Modes
30
28
 
31
- - `mode: 'assert'` (default) — the assertions describe what the **feature
29
+ - `mode: 'assert'` (default) — assertions describe what the **feature
32
30
  should do**; pass = feature works.
33
- - `mode: 'reproduce'` — the assertions describe what the **bug looks like**;
31
+ - `mode: 'reproduce'` — assertions describe what the **bug looks like**;
34
32
  pass = bug reproduces. When `minimize: true` (default) the tool then
35
33
  removes steps one-by-one to find the shortest still-reproducing sequence
36
- and writes a `replay-minimized.json` bundle next to `replay.json`.
34
+ and writes `replay-minimized.json` next to `replay.json`.
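The one-by-one step removal described for `mode: 'reproduce'` can be sketched as a greedy loop. This is an illustration under assumptions: `reproduces` stands in for re-running the flow and checking that the bug still appears, and the real tool's removal order and stopping rule are not given in the diff. A greedy pass yields a sequence from which no single step can be dropped, which is not always the globally shortest one.

```python
def minimize_steps(steps, reproduces):
    """Drop steps one at a time while the bug still reproduces.

    steps:      ordered list of step objects
    reproduces: callable taking a candidate step list, returning True when
                the bug still reproduces (hypothetical stand-in for a rerun)
    """
    current = list(steps)
    index = 0
    while index < len(current):
        candidate = current[:index] + current[index + 1:]
        if reproduces(candidate):
            # The step at `index` was unnecessary; keep the shorter list
            # and retry the same position.
            current = candidate
        else:
            index += 1
    return current


# Toy oracle: the bug needs steps "b" and "d", in any company.
minimal = minimize_steps(["a", "b", "c", "d"], lambda s: "b" in s and "d" in s)
```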
37
35
 
38
36
  ## Inputs
39
37
 
40
- - `target` — the URL to open (web only in v0.1).
41
- - `steps` — ordered UI actions. Each is one of:
42
- - `{ kind: 'click', query: <accessible name substring> }`
43
- - `{ kind: 'type', query: <accessible name substring>, text: <string>, clear_first?: boolean }`
44
- - `{ kind: 'key', key: <e.g. 'Enter'> }`
45
- - `{ kind: 'wait_for', condition: { kind, ... } }`
46
- - `{ kind: 'navigate', url: <string> }`
47
- - `expect`: ordered assertions. Each is one of:
48
- - `{ kind: 'text_visible', text: <string> }`
49
- - `{ kind: 'text_absent', text: <string> }`
50
- - `{ kind: 'url_matches', pattern: <regex string> }`
51
- - `{ kind: 'ref_in_state', query: <accessible name substring>, state: 'visible' | 'enabled' | 'focused' }`
52
- - `capture` *(optional)* — defaults to `['screenshot']`. v0.1 only emits
53
- screenshots and a replay bundle; HAR / console / video land in later
54
- milestones.
55
- - `close_on_finish` *(optional)* defaults to `true`.
38
+ ### `open` — context setup
39
+
40
+ ```json
41
+ { "platform": "web", "url": "https://...", "browser": "chromium" }
42
+ ```
43
+
44
+ Optional: `viewport`, `headless`, `user_agent`, `locale`. UA / locale /
45
+ timezone MUST be set here; they cannot change mid-session.
46
+
47
+ ### `steps`: UI actions in order
48
+
49
+ Each step is one of:
50
+
51
+ - `{ "kind": "click", "query": "Submit" }`
52
+ - `{ "kind": "type", "query": "Email", "text": "x@y.com", "clear_first": true }`
53
+ - `{ "kind": "key", "key": "Enter" }`
54
+ - `{ "kind": "wait_for", "condition": { ... } }`
55
+ - `{ "kind": "navigate", "url": "https://..." }`
56
+ - `{ "kind": "hover", "query": "More" }`
57
+ - `{ "kind": "drag", "from_query": "Card A", "to_query": "Column 2" }`
58
+ - `{ "kind": "fill_form", "fields": [ { "query": "Name", "value": "Alice" }, { "query": "Subscribe", "value": true, "kind": "checkbox" } ] }`
59
+ - `{ "kind": "upload", "query": "Avatar", "file_path": "/abs/path/to/file.png" }`
60
+ - `{ "kind": "dialog", "action": "accept" }` — **place BEFORE the action that triggers the dialog**
61
+ - `{ "kind": "set_env", "viewport": { "width": 375, "height": 812 } }` — also accepts offline, geolocation, color_scheme, reduced_motion, extra_headers, network_throttle, cpu_throttle
62
+ - `{ "kind": "switch_page", "index": 1 }` — multi-page (popups, target=_blank)
63
+ - `{ "kind": "evaluate", "script": "return document.title" }` — gated by `ROLEPOD_ALLOW_EVAL=1`
64
+
65
+ ### `expect` — assertions
66
+
67
+ - `{ "kind": "text_visible", "text": "..." }`
68
+ - `{ "kind": "text_absent", "text": "..." }`
69
+ - `{ "kind": "url_matches", "pattern": "regex" }`
70
+ - `{ "kind": "ref_in_state", "query": "Submit", "state": "enabled" }`
71
+ - `{ "kind": "no_console_errors", "exclude_patterns": ["3rd-party.com"] }`
72
+ - `{ "kind": "no_failed_requests", "exclude_patterns": ["/analytics"], "allow_4xx": false }`
73
+ - `{ "kind": "request_made", "url_pattern": "/api/checkout", "method": "POST", "min_count": 1 }`
74
+ - `{ "kind": "response_status", "url_pattern": "/api/me", "status": 200 }`
75
+
76
+ ### `capture` — evidence
77
+
78
+ Default: `["screenshot"]`. Available:
79
+
80
+ - `screenshot` — `final.png`
81
+ - `console` — `console.json` (filtered errors+warnings, ring buffer up to 1000)
82
+ - `har` — `network.har` (full HAR)
83
+ - `video` — `videos/*.webm`
84
+ - `trace` — `trace.zip` (Playwright trace; view with `npx playwright show-trace`)
85
+ - `a11y_tree` — `a11y_tree.json` (final snapshot)
86
+
87
+ ### Defaults
88
+
89
+ - `close_on_finish: true`
90
+ - `minimize: true` (only consulted when `mode: 'reproduce'`)
56
91
 
57
92
  ## Outputs
58
93
 
59
- - `run_id`: folder name under `./.rolepod-uiproof/artifacts/`.
60
- - `passed` — boolean.
61
- - `failed_at_step` *(when not passed)*: 0-based step index.
62
- - `failure_reason` *(when not passed)*: human-readable explanation.
63
- - `evidence_paths` — `{ screenshots: string[], replay_bundle?: string }`.
64
- - `final_url_or_screen` — page URL at the end of the run.
94
+ - `run_id`, `passed`, `failed_at_step`, `failure_reason`,
95
+ `final_url_or_screen`
96
+ - `evidence_paths: { screenshots, replay_bundle, console?, a11y_tree?, har?, trace?, video? }`
97
+ - `minimized` (only on `mode: 'reproduce'` + `passed: true` + `minimize: true`)
65
98
 
66
99
  ## Process
67
100
 
68
- 1. Construct a `rolepod_verify_ui_flow` input from the user's intent:
69
- - `mode: 'assert'`
70
- - `open: { platform: 'web', url: <target> }`
71
- - `steps`, `expect`, `capture`, `close_on_finish` per inputs above.
101
+ 1. Build the `rolepod_verify_ui_flow` input.
72
102
  2. Call the tool.
73
- 3. Report the structured result. If `passed: false`, include
74
- `failed_at_step`, `failure_reason`, and the screenshot path so the user
75
- can inspect the failure.
103
+ 3. Report the structured result. On failure include `failed_at_step` +
104
+ `failure_reason` + relevant evidence paths (screenshot, console.json
105
+ if console errors caused the failure).
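Step 3 can be illustrated with a small formatter over the result fields listed under Outputs. This is a sketch: the function name `summarize_result` and the text layout are assumptions, and the example paths are placeholders rather than real artifact locations.

```python
def summarize_result(result):
    """Render a verify_ui_flow-style result dict as a short text report."""
    if result.get("passed"):
        return "PASS"
    step = result.get("failed_at_step", "n/a")
    reason = result.get("failure_reason", "no reason given")
    lines = [f"FAIL (step {step}): {reason}"]
    paths = result.get("evidence_paths", {})
    # Only mention evidence that was actually captured.
    for key in ("screenshots", "console", "har"):
        if key in paths:
            lines.append(f"  {key}: {paths[key]}")
    return "\n".join(lines)


report = summarize_result(
    {
        "passed": False,
        "failure_reason": "Expectations failed: expect[1] no_console_errors",
        "evidence_paths": {
            "screenshots": ["out/final.png"],  # placeholder path
            "console": "out/console.json",     # placeholder path
        },
    }
)
```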
76
106
 
77
- ## If the tool is unavailable
107
+ ## Default suggestion
108
+
109
+ For ANY user-visible flow, default-include `no_console_errors` and
110
+ `no_failed_requests` in `expect`. Real UI bugs surface as console errors
111
+ or 5xx responses far more often than as wrong text.
112
+
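The default suggestion above amounts to appending the two error-focused assertions when the caller has not supplied them. A minimal sketch, assuming a hypothetical helper name and that a bare `{"kind": ...}` entry is acceptable (as in the smoke-check example elsewhere in this diff):

```python
DEFAULT_EXPECT_KINDS = ("no_console_errors", "no_failed_requests")


def with_default_expects(expect):
    """Return expect plus any missing default error assertions."""
    present = {entry["kind"] for entry in expect}
    extras = [{"kind": kind} for kind in DEFAULT_EXPECT_KINDS if kind not in present]
    return list(expect) + extras


merged = with_default_expects(
    [
        {"kind": "text_visible", "text": "Thank you"},
        {"kind": "no_failed_requests", "exclude_patterns": ["/analytics"]},
    ]
)
```

User-supplied entries are kept as written, so an existing `no_failed_requests` with its own `exclude_patterns` is not overwritten.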
113
+ ## Evidence routing
114
+
115
+ Run artifacts are saved under:
78
116
 
79
- The rolepod-uiproof MCP server is not registered or is not responding. Surface
80
- this plainly:
117
+ - **Standalone:** `.rolepod-uiproof/artifacts/<prefix>_<ts>_<uuid>/`
118
+ - **With `rolepod` parent** (when `ROLEPOD_PARENT=1` is set by the parent's SessionStart hook): `.rolepod/evidence/<ts>-rolepod-uiproof-<skill>/`
119
+
120
+ Either way the run directory contains a `manifest.json` per Extension Protocol v1, so the parent's `check-work` skill can aggregate results into the verify phase report. Standalone users can read the manifest themselves — same shape.
121
+
122
+ ## If the tool is unavailable
81
123
 
82
124
  > The `/verify-ui` skill needs the **rolepod-uiproof** MCP server, which is
83
125
  > not currently available. Confirm the plugin is installed and try again,
@@ -85,50 +127,84 @@ this plainly:
85
127
 
86
128
  Do **not** attempt this work via Playwright MCP, Chrome DevTools MCP, or
87
129
  any other backend from inside this skill. Multi-backend routing is the
88
- job of the parent `rolepod` plugin's `check-work` / `debug-issue` skills
89
- (D-024).
130
+ job of the parent `rolepod` plugin's `check-work` / `debug-issue` skills.
90
131
 
91
132
  ## Examples
92
133
 
93
- ### Success — verify a search result on example.com
134
+ ### Success — verify checkout flow with no errors
94
135
 
95
- User: "Verify that opening https://example.com shows the heading 'Example
96
- Domain' and links to iana.org."
97
-
98
- Skill invokes `rolepod_verify_ui_flow` with:
136
+ User: "Verify https://shop.example.com/checkout works — fill the form,
137
+ submit, expect a success page and no errors."
99
138
 
100
139
  ```json
101
140
  {
102
141
  "mode": "assert",
103
- "open": { "platform": "web", "url": "https://example.com" },
104
- "steps": [],
142
+ "open": { "platform": "web", "url": "https://shop.example.com/checkout" },
143
+ "steps": [
144
+ { "kind": "fill_form", "fields": [
145
+ { "query": "Name", "value": "Alice" },
146
+ { "query": "Email", "value": "alice@example.com" },
147
+ { "query": "Card", "value": "4242 4242 4242 4242" }
148
+ ]},
149
+ { "kind": "click", "query": "Pay" },
150
+ { "kind": "wait_for", "condition": { "kind": "text_visible", "text": "Thank you" } }
151
+ ],
105
152
  "expect": [
106
- { "kind": "text_visible", "text": "Example Domain" },
107
- { "kind": "text_visible", "text": "More information" }
108
- ]
153
+ { "kind": "text_visible", "text": "Thank you" },
154
+ { "kind": "no_console_errors" },
155
+ { "kind": "no_failed_requests", "exclude_patterns": ["/analytics"] },
156
+ { "kind": "response_status", "url_pattern": "/api/checkout", "status": 200 }
157
+ ],
158
+ "capture": ["screenshot", "console", "har"]
109
159
  }
110
160
  ```
111
161
 
112
- Returns:
162
+ ### Failure with evidence
163
+
164
+ When `no_console_errors` fails, the result surfaces:
113
165
 
114
166
  ```json
115
167
  {
116
- "run_id": "verify_20260524T101512_a1b2c3d4",
117
- "passed": true,
168
+ "passed": false,
169
+ "failure_reason": "Expectations failed: expect[1] no_console_errors",
118
170
  "evidence_paths": {
119
- "screenshots": [".rolepod-uiproof/artifacts/verify_…/final.png"],
120
- "replay_bundle": ".rolepod-uiproof/artifacts/verify_…/replay.json"
121
- },
122
- "final_url_or_screen": "https://example.com/"
171
+ "screenshots": ["…/final.png"],
172
+ "console": "…/console.json"
173
+ }
123
174
  }
124
175
  ```
125
176
 
126
- ### Failure: MCP server not available
177
+ Open `console.json` to inspect the errors.
127
178
 
128
- The MCP server is not registered. The skill returns:
179
+ ### Dialog handling
129
180
 
130
- > The `/verify-ui` skill needs the **rolepod-uiproof** MCP server, which is
131
- > not currently available. Confirm the plugin is installed and try again.
181
+ User: "When the user clicks Delete, a confirm dialog appears. Verify
182
+ that accepting it deletes the row."
183
+
184
+ ```json
185
+ {
186
+ "steps": [
187
+ { "kind": "dialog", "action": "accept" },
188
+ { "kind": "click", "query": "Delete" },
189
+ { "kind": "wait_for", "condition": { "kind": "text_absent", "text": "Row A" } }
190
+ ],
191
+ "expect": [ { "kind": "text_absent", "text": "Row A" } ]
192
+ }
193
+ ```
194
+
195
+ The `dialog` step arms a one-shot handler; the *next* trigger (the click)
196
+ fires it. Un-armed dialogs are auto-dismissed.
197
+
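The one-shot behaviour described above can be modelled in a few lines. This is a toy model under assumptions, not the package's handler: the class name `DialogArbiter` and the `"dismiss"` label for the auto-dismiss case are hypothetical.

```python
class DialogArbiter:
    """Toy model of an armed, one-shot dialog handler."""

    def __init__(self):
        self._armed_action = None

    def arm(self, action):
        """Called by a `dialog` step, before the triggering action."""
        self._armed_action = action

    def on_dialog(self):
        """Called when a dialog appears; returns the action taken."""
        action, self._armed_action = self._armed_action, None
        # An un-armed dialog falls back to dismissal.
        return action or "dismiss"


arbiter = DialogArbiter()
arbiter.arm("accept")
first = arbiter.on_dialog()   # consumes the armed handler
second = arbiter.on_dialog()  # nothing armed any more
```

This is why the `dialog` step has to come before the click: arming after the dialog has already appeared would be too late.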
198
+ ### Responsive + dark mode
199
+
200
+ User: "Verify mobile dark-mode layout."
132
201
 
133
- No other backend is attempted. The caller decides whether to escalate to
134
- the parent rolepod plugin's `check-work` skill.
202
+ ```json
203
+ {
204
+ "steps": [
205
+ { "kind": "set_env", "viewport": { "width": 375, "height": 812 }, "color_scheme": "dark" }
206
+ ],
207
+ "expect": [ { "kind": "text_visible", "text": "Menu" } ],
208
+ "capture": ["screenshot"]
209
+ }
210
+ ```
@@ -44,6 +44,15 @@ MCP server. No fallback (D-024).
44
44
  3. Report `diff_pct`, `passed`, and the three image paths. If the baseline
45
45
  was just seeded, say so explicitly.
46
46
 
47
+ ## Evidence routing
48
+
49
+ Run artifacts are saved under:
50
+
51
+ - **Standalone:** `.rolepod-uiproof/artifacts/<prefix>_<ts>_<uuid>/`
52
+ - **With `rolepod` parent** (when `ROLEPOD_PARENT=1` is set by the parent's SessionStart hook): `.rolepod/evidence/<ts>-rolepod-uiproof-<skill>/`
53
+
54
+ Baselines under `.rolepod-uiproof/baselines/` stay in the same location regardless of mode; they are user-curated config, not per-run evidence. In both modes the run directory contains a `manifest.json` per Extension Protocol v1.
55
+
47
56
  ## If the tool is unavailable
48
57
 
49
58
  Surface plainly: