zeno-mobile-runner 0.1.8 → 0.2.0

Files changed (62)
  1. package/CHANGELOG.md +49 -0
  2. package/FEATURES.md +1 -1
  3. package/README.md +167 -238
  4. package/clients/kotlin/README.md +1 -1
  5. package/clients/kotlin/build.gradle.kts +1 -1
  6. package/clients/python/pyproject.toml +1 -1
  7. package/clients/rust/Cargo.lock +1 -1
  8. package/clients/rust/Cargo.toml +1 -1
  9. package/clients/typescript/package.json +1 -1
  10. package/docs/agent-discovery.md +10 -0
  11. package/docs/ai-agents.md +18 -0
  12. package/docs/benchmarking.md +39 -0
  13. package/docs/benchmarks/2026-06-09-android-workflow.md +73 -0
  14. package/docs/benchmarks/2026-06-09-android-workflow.results.jsonl +20 -0
  15. package/docs/benchmarks/2026-06-09-framework-baseline-status.md +32 -0
  16. package/docs/benchmarks/2026-06-09-ios-appium-comparison.md +115 -0
  17. package/docs/benchmarks/2026-06-09-ios-appium-comparison.results.jsonl +40 -0
  18. package/docs/benchmarks/2026-06-09-ios-demo.md +90 -0
  19. package/docs/benchmarks/2026-06-09-ios-demo.results.jsonl +20 -0
  20. package/docs/benchmarks/2026-06-09-ios-maestro-comparison.md +128 -0
  21. package/docs/benchmarks/2026-06-09-ios-maestro-comparison.results.jsonl +40 -0
  22. package/docs/benchmarks/2026-06-09-ios-workflow-comparison.md +143 -0
  23. package/docs/benchmarks/2026-06-09-ios-workflow-comparison.results.jsonl +40 -0
  24. package/docs/benchmarks/2026-06-09-ios-xctest-floor.md +106 -0
  25. package/docs/benchmarks/2026-06-09-ios-xctest-floor.results.jsonl +40 -0
  26. package/docs/benchmarks/README.md +36 -0
  27. package/docs/benchmarks/benchmark-lab-v1.json +155 -0
  28. package/docs/benchmarks/benchmark-lab-v1.md +95 -0
  29. package/docs/clients.md +16 -0
  30. package/docs/demo.md +36 -1
  31. package/docs/frameworks.md +10 -0
  32. package/docs/npm.md +44 -2
  33. package/docs/protocol-fixtures/core-session.responses.jsonl +1 -1
  34. package/docs/protocol.md +10 -10
  35. package/docs/scenario-authoring.md +15 -0
  36. package/docs/trace-privacy.md +9 -0
  37. package/docs/troubleshooting.md +6 -0
  38. package/examples/android-workflow.json +79 -0
  39. package/examples/ios-shim-workflow.json +79 -0
  40. package/examples/react-native-expo-workflow.json +75 -0
  41. package/package.json +6 -1
  42. package/prebuilds/darwin-arm64/zmr +0 -0
  43. package/prebuilds/darwin-x64/zmr +0 -0
  44. package/prebuilds/linux-arm64/zmr +0 -0
  45. package/prebuilds/linux-x64/zmr +0 -0
  46. package/scripts/benchmark-lab.py +253 -0
  47. package/scripts/create-android-demo-app.sh +324 -29
  48. package/scripts/create-ios-demo-app.sh +174 -7
  49. package/scripts/create-react-native-expo-demo-app.sh +727 -0
  50. package/scripts/demo.sh +3 -0
  51. package/scripts/install-ios-shim.sh +2 -2
  52. package/shims/ios/ZMRShim.swift +10 -0
  53. package/shims/ios/ZMRShimUITestCase.swift +42 -0
  54. package/shims/ios/protocol.md +1 -0
  55. package/src/cli_import.zig +31 -15
  56. package/src/cli_trace.zig +38 -16
  57. package/src/cli_validate.zig +12 -6
  58. package/src/ios.zig +49 -12
  59. package/src/ios_shim.zig +36 -2
  60. package/src/main.zig +3 -0
  61. package/src/version.zig +1 -1
  62. package/viewer/app.js +23 -3
package/CHANGELOG.md CHANGED
@@ -2,6 +2,55 @@
2
2
 
3
3
  All notable changes to Zeno Mobile Runner are tracked here.
4
4
 
5
+ ## Unreleased
6
+
7
+ ## 0.2.0 (2026-06-10)
8
+
9
+ ### Added
10
+
11
+ - Added a public-safe iOS simulator benchmark evidence pack with 20 repeated
12
+ runs of the generated iOS smoke scenario.
13
+ - Added a public-safe iOS simulator baseline runner benchmark comparison on the
14
+ same generated demo app.
15
+ - Added a second public-safe iOS baseline comparison plus a native shim floor
16
+ evidence pack for the generated demo app.
17
+ - Added a richer public-safe iOS workflow benchmark pack covering profile
18
+ entry, catalog selection, save, review, and final-state assertion on the
19
+ generated demo app.
20
+ - Added the first Android workflow benchmark pack for the generated demo app,
21
+ covering 20 repeated UIAutomator-path ZMR runs.
22
+ - Added a generated React Native/Expo benchmark fixture with stable `testID`
23
+ values, accessibility labels, deep-link setup, and Android/iOS ZMR workflow
24
+ scenarios.
25
+ - The trace viewer loads a served bundle directly from
26
+ `viewer/index.html?bundle=<url>`, so CI artifact links and shared triage can
27
+ open a trace without manual file selection.
28
+ - The iOS XCTest shim cold-build timeout is tunable with the
29
+ `ZMR_IOS_SHIM_TIMEOUT_MS` environment variable for slower CI hardware.
30
+ - Added a nightly `device-smoke` GitHub Actions workflow that runs the public
31
+ demo apps on a real Android emulator and iOS simulator and uploads traces,
32
+ reports, and redacted bundles as evidence artifacts.
33
+ - Added real captured screenshots under `docs/assets/` (trace viewer, device
34
+ screens, CLI failure-diagnosis loop, HTML report) plus
35
+ `scripts/capture-screenshots.sh` to regenerate them from fresh demo runs.
36
+ The assets ship in the repository only, not in the npm package.
37
+ - Added Mermaid architecture, verification-loop, trace-lifecycle, and
38
+ trace-to-test diagrams to the README and core docs, and rewrote the README
39
+ around the AI-coding-agent verification workflow.
40
+
41
+ ### Fixed
42
+
43
+ - `zmr validate`, `zmr report`, `zmr export`, and `zmr import` now accept
44
+ flags before positional arguments, matching the documented command forms,
45
+ and unknown-flag errors print a help hint.
46
+ - Generated Android demo scenarios clear app state before launching so
47
+ repeated runs no longer fail on leftover screens from a previous session.
48
+
49
+ - Fixed generated iOS shim one-shot log file creation on macOS by using a
50
+ portable `mktemp` template with `XXXXXX` at the end.
51
+ - Skipped the slow iOS system-open confirmation probe for simulator custom URL
52
+ schemes while keeping it for universal web links.
53
+
5
54
  ## 0.1.8 (2026-06-06)
6
55
 
7
56
  ### Changed
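Reviewer note: the 0.2.0 changelog entry for the trace viewer describes opening a served bundle through `viewer/index.html?bundle=<url>`. The sketch below shows one way a CI job might build such a link. Both host names are hypothetical, and it assumes the viewer reads the `bundle` parameter with standard query-string decoding, which this diff does not confirm.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def viewer_link(viewer_base: str, bundle_url: str) -> str:
    """Return a viewer URL whose `bundle` query parameter points at a served bundle."""
    # urlencode percent-encodes the bundle URL so its own `:` and `/` survive as one value.
    return f"{viewer_base}?{urlencode({'bundle': bundle_url})}"


# Hypothetical CI artifact locations, for illustration only.
bundle = "https://ci.example.com/artifacts/login-smoke-redacted.zmrtrace"
link = viewer_link("https://ci.example.com/zmr/viewer/index.html", bundle)
print(link)
```

Encoding the bundle URL keeps any query string it carries (for example a signed artifact token) from being parsed as a separate viewer parameter.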
package/FEATURES.md CHANGED
@@ -142,7 +142,7 @@ state, and writes deterministic traces. It does not embed an LLM.
142
142
 
143
143
  ## Current Limitations
144
144
 
145
- - Current release status is `0.1.8`, a public developer preview rather than
145
+ - Current release status is `0.2.0`, a public developer preview rather than
146
146
  a production-stable `1.0.0`.
147
147
  - Physical iOS log capture is still simulator-first. Physical iOS screenshots
148
148
  are available when the XCTest/XCUIAutomation shim is configured.
package/README.md CHANGED
@@ -1,301 +1,230 @@
1
1
  # Zeno Mobile Runner
2
2
 
3
- > Agent-native mobile UI automation for React Native, Expo, Flutter, and native Android/iOS apps.
3
+ > The verification loop for AI coding agents building Expo, React Native,
4
+ > Flutter, and native Android/iOS apps.
4
5
 
5
6
  [![CI](https://github.com/johnmikel/zeno-mobile-runner/actions/workflows/ci.yml/badge.svg)](https://github.com/johnmikel/zeno-mobile-runner/actions/workflows/ci.yml)
6
7
  [![Release](https://img.shields.io/github/v/release/johnmikel/zeno-mobile-runner?include_prereleases)](https://github.com/johnmikel/zeno-mobile-runner/releases)
7
8
  [![npm](https://img.shields.io/npm/v/zeno-mobile-runner)](https://www.npmjs.com/package/zeno-mobile-runner)
8
9
  [![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)
9
10
 
10
- ZMR gives AI agents and test harnesses a typed mobile control plane. It can
11
- install and launch apps, observe the UI, choose an action, wait for the screen to
12
- settle, assert state, and export a replayable trace. The runner does not embed an
13
- LLM. Agents stay outside and use ZMR through CLI JSON, scenarios, JSON-RPC, MCP,
14
- or optional protocol clients.
11
+ Your coding agent can write mobile code, but it cannot see the phone. ZMR is
12
+ its eyes and hands: a typed mobile control plane that installs and launches
13
+ apps, observes the UI, taps and types, waits for the screen to settle, asserts
14
+ state, and exports a replayable trace as proof. The runner does not embed an
15
+ LLM. Agents stay outside and drive ZMR through MCP, JSON-RPC, CLI JSON, or
16
+ JSON scenarios.
17
+
18
+ ![ZMR trace viewer showing a passed iOS run with timeline, device screenshot, UI tree, and selector payload](docs/assets/viewer-hero.png)
19
+
20
+ <p align="center">
21
+ <img src="docs/assets/device-ios-demo.png" width="260" alt="iOS simulator screenshot captured by ZMR during a scenario run" />
22
+ &nbsp;&nbsp;
23
+ <img src="docs/assets/device-android-demo.png" width="260" alt="Android emulator screenshot captured by ZMR during a scenario run" />
24
+ </p>
25
+
26
+ <p align="center"><em>Real on-device screenshots from ZMR traces: the same demo flow
27
+ driven on an iOS simulator and an Android emulator.</em></p>
28
+
29
+ ## Why agents need this
30
+
31
+ - **Agents can't verify what they can't observe.** ZMR returns semantic UI
32
+ trees with stable selectors, screenshots, and typed action results an agent
33
+ can reason about — not raw pixels it has to guess at.
34
+ - **Evidence, not vibes.** Every session can write a deterministic trace:
35
+ events, screenshots, UI hierarchies, timings, assertion results, HTML and
36
+ JUnit reports, and a redacted shareable bundle.
37
+ - **Tests fall out for free.** After a live agent session, `zmr discover`
38
+ turns the trace into a reviewable JSON scenario that replays in CI without
39
+ an LLM in the loop.
40
+
41
+ ## How it works
42
+
43
+ ```mermaid
44
+ flowchart LR
45
+ A["AI coding agent<br/>Claude Code · Cursor · custom harness"]
46
+ subgraph zmr["ZMR — one small Zig binary"]
47
+ MCP["MCP server<br/><code>zmr mcp</code>"]
48
+ RPC["JSON-RPC stdio/TCP<br/><code>zmr serve</code>"]
49
+ CLI["CLI + JSON scenarios<br/><code>zmr run</code>"]
50
+ CORE["Core engine<br/>selectors · waits · assertions<br/>scenario runner · trace writer"]
51
+ MCP --> CORE
52
+ RPC --> CORE
53
+ CLI --> CORE
54
+ end
55
+ subgraph devices["Devices"]
56
+ AND["Android emulator/device<br/>ADB · UI Automator · optional shim"]
57
+ IOS["iOS simulator/device<br/>simctl · devicectl · XCTest shim"]
58
+ end
59
+ TRACE["Trace<br/>events.jsonl · screenshots · UI trees<br/>report.html · junit.xml · .zmrtrace"]
60
+ A -- "MCP tools" --> MCP
61
+ A -- "JSON-RPC" --> RPC
62
+ A -- "CLI JSON" --> CLI
63
+ CORE --> AND
64
+ CORE --> IOS
65
+ CORE --> TRACE
66
+ ```
67
+
68
+ No app instrumentation is required on Android. iOS selector actions use an
69
+ app-local XCTest shim that the wizard scaffolds. ZMR works below the
70
+ JavaScript/Dart layer, so React Native, Expo, Flutter, and fully native apps
71
+ are all driven the same way. See [docs/frameworks.md](docs/frameworks.md).
15
72
 
16
- ## Install
73
+ ## Five-minute start
17
74
 
18
75
  Inside a mobile app repo:
19
76
 
20
77
  ```bash
21
- npm install --save-dev zeno-mobile-runner
78
+ npm install --save-dev zeno-mobile-runner # bun add --dev zeno-mobile-runner
22
79
  npx zmr-wizard --app-id com.example.mobiletest --package-json
23
80
  npx zmr doctor --strict --json --config .zmr/config.json
24
81
  ```
25
82
 
26
- Run a generated smoke scenario:
83
+ Hook it up to your coding agent (Claude Code shown; any MCP client works):
27
84
 
28
85
  ```bash
29
- npm run zmr:validate
30
- npm run zmr:android
31
- npm run zmr:ios
86
+ claude mcp add zmr -- npx zmr mcp --config .zmr/config.json --trace-dir traces/zmr-agent
87
+ ```
88
+
89
+ Or in an `.mcp.json` / MCP client config:
90
+
91
+ ```json
92
+ {
93
+ "mcpServers": {
94
+ "zmr": {
95
+ "command": "npx",
96
+ "args": ["zmr", "mcp", "--config", ".zmr/config.json", "--trace-dir", "traces/zmr-agent"]
97
+ }
98
+ }
99
+ }
100
+ ```
101
+
102
+ Then ask the agent to verify its own work: *"launch the app, walk through
103
+ onboarding, and show me the trace."*
104
+
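Reviewer note: for projects that generate the MCP client config from a setup script, the `.mcp.json` fragment above can be produced programmatically. This is a minimal sketch; the helper name `zmr_mcp_config` is hypothetical, and it uses only the command, arguments, and paths that the README shows.

```python
import json


def zmr_mcp_config(config_path: str = ".zmr/config.json",
                   trace_dir: str = "traces/zmr-agent") -> dict:
    """Build the MCP client config shown in the README (hypothetical helper)."""
    return {
        "mcpServers": {
            "zmr": {
                "command": "npx",
                "args": ["zmr", "mcp", "--config", config_path,
                         "--trace-dir", trace_dir],
            }
        }
    }


# Serialize for an `.mcp.json` file; writing it to disk is left to the caller.
mcp_json = json.dumps(zmr_mcp_config(), indent=2)
print(mcp_json)
```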
105
+ ## The agent verification loop
106
+
107
+ ```mermaid
108
+ sequenceDiagram
109
+ participant Agent as AI agent
110
+ participant ZMR
111
+ participant Device as Emulator / simulator
112
+ Agent->>ZMR: semantic_snapshot
113
+ ZMR->>Device: capture UI + screenshot
114
+ ZMR-->>Agent: roles, stable selectors, bounds
115
+ Agent->>ZMR: tap / type / swipe / open_link
116
+ ZMR->>Device: execute + settle
117
+ Agent->>ZMR: wait_visible / assert_visible
118
+ ZMR-->>Agent: typed result + trace events
119
+ Agent->>ZMR: trace_discover
120
+ ZMR-->>Agent: reviewable replay scenario
121
+ Agent->>ZMR: trace_export --redact
122
+ ZMR-->>Agent: .zmrtrace evidence bundle
32
123
  ```
33
124
 
34
- ## React Native, Expo, and Flutter
35
-
36
- ZMR works below the JavaScript or Dart framework layer. It drives the installed
37
- Android or iOS app through platform lifecycle commands, deep links, accessibility
38
- semantics, screenshots, logs, selector actions, waits, assertions, and traces.
39
-
40
- - **React Native:** prefer `testID`, `accessibilityLabel`, stable text, and deep
41
- links for direct navigation into important states.
42
- - **Expo development builds:** pass `--expo-dev-client-scheme <scheme>` to the
43
- wizard so ZMR scaffolds dev-client open-link scenarios.
44
- - **Flutter:** ZMR supports Flutter apps at the Android and iOS app level when
45
- the app exposes stable semantics labels, text, deep links, or native ids. It is
46
- not a Flutter widget-tree driver and does not inspect Flutter internals.
47
- - **Native Android/iOS:** use resource ids, content descriptions, accessibility
48
- identifiers, XCTest labels, and app-owned deep links.
49
-
50
- See [docs/frameworks.md](docs/frameworks.md) and
51
- [docs/app-integration.md](docs/app-integration.md) for app-side setup guidance.
52
-
53
- ## Why ZMR
54
-
55
- - **Agent-native protocol:** structured snapshots, semantic mobile trees,
56
- actions, waits, assertions, live trace events, trace explanation, and
57
- redacted trace export over JSON-RPC or MCP.
58
- - **Trace-first debugging:** every run can produce screenshots, UI trees, logs,
59
- timings, action inputs, assertion results, and HTML/JUnit reports.
60
- - **Fast local core:** Zig owns orchestration, subprocess control, selectors,
61
- waits, retries, scenario execution, and packaged binaries.
62
- - **App-local setup:** `.zmr/config.json`, smoke scenarios, shim commands, and
63
- traces live in the app repo.
64
- - **Android and iOS:** Android uses ADB/UI Automator plus an optional native
65
- shim. iOS simulators use `simctl`; physical iOS devices use `devicectl`;
66
- selector-grade iOS automation uses the XCTest/XCUIAutomation shim.
67
-
68
- ## Scenario Example
69
-
70
- ZMR scenarios are JSON so agents and build scripts can generate, validate, and
71
- mutate them without a second DSL.
125
+ The MCP server exposes the full loop as mobile-native tools:
126
+
127
+ | Group | Tools |
128
+ | --- | --- |
129
+ | Observe | `snapshot`, `semantic_snapshot` |
130
+ | App lifecycle | `install_app`, `launch_app`, `stop_app`, `clear_state`, `open_link` |
131
+ | Act | `tap`, `type`, `erase_text`, `hide_keyboard`, `swipe`, `press_back` |
132
+ | Wait | `wait_visible`, `wait_not_visible`, `wait_any`, `scroll_until_visible` |
133
+ | Assert | `assert_visible`, `assert_not_visible`, `assert_healthy` |
134
+ | Evidence | `trace_events`, `trace_explain`, `trace_discover`, `trace_explore`, `trace_export`, `scenario_validate` |
135
+
136
+ The same surface is available over JSON-RPC for harnesses that embed ZMR
137
+ directly; see [docs/protocol.md](docs/protocol.md) and
138
+ [docs/ai-agents.md](docs/ai-agents.md). When a run fails, `zmr explain`
139
+ diagnoses the trace for humans and agents alike:
140
+
141
+ ![Terminal session showing a failed run, zmr explain diagnosing the failure with visible texts, and the fixed run passing](docs/assets/cli-run-explain.png)
142
+
143
+ ## Deterministic scenarios for CI
144
+
145
+ Scenarios are plain JSON — agents and build scripts generate, validate, and
146
+ mutate them without a second DSL, and they replay in CI with no LLM cost:
72
147
 
73
148
  ```json
74
149
  {
75
150
  "name": "Login smoke",
76
151
  "appId": "com.example.mobiletest",
77
152
  "steps": [
153
+ { "action": "clearState" },
78
154
  { "action": "launch" },
79
155
  { "action": "assertHealthy", "timeoutMs": 5000 },
80
156
  { "action": "tap", "selector": { "resourceId": "email" } },
81
157
  { "action": "typeText", "text": "user@example.com" },
82
- { "action": "tap", "selector": { "resourceId": "password" } },
83
- { "action": "typeText", "text": "password" },
84
158
  { "action": "tap", "selector": { "text": "Login" } },
85
159
  { "action": "waitVisible", "selector": { "text": "Welcome" }, "timeoutMs": 30000 }
86
160
  ]
87
161
  }
88
162
  ```
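Reviewer note: since the README says build scripts can generate scenarios, here is a minimal sketch of doing so. It uses only the action and selector names from the 0.2.0 example above. The helper names are hypothetical, and `check_shape` is a cheap local pre-check under those assumptions, not a substitute for `zmr validate`.

```python
import json


def login_smoke(app_id: str, email: str) -> dict:
    """Build the README's 0.2.0 login smoke scenario for a given app and email."""
    return {
        "name": "Login smoke",
        "appId": app_id,
        "steps": [
            {"action": "clearState"},
            {"action": "launch"},
            {"action": "assertHealthy", "timeoutMs": 5000},
            {"action": "tap", "selector": {"resourceId": "email"}},
            {"action": "typeText", "text": email},
            {"action": "tap", "selector": {"text": "Login"}},
            {"action": "waitVisible", "selector": {"text": "Welcome"}, "timeoutMs": 30000},
        ],
    }


def check_shape(scenario: dict) -> list:
    """Return a list of structural problems; an empty list means the shape looks right."""
    problems = [f"missing top-level key: {key}"
                for key in ("name", "appId", "steps") if key not in scenario]
    for index, step in enumerate(scenario.get("steps", [])):
        if "action" not in step:
            problems.append(f"step {index} has no action")
    return problems


scenario = login_smoke("com.example.mobiletest", "user@example.com")
print(json.dumps(scenario, indent=2))
```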
89
163
 
90
- Useful commands:
91
-
92
164
  ```bash
93
- zmr version --json
94
- zmr schemas --json
95
- zmr devices --json
96
- zmr inspect --json
97
- zmr explore --from-trace traces/zmr-agent --out .zmr/discovered/login-smoke.json --goal "find a stable login smoke" --include-actions --validate --json
98
- zmr discover --from-trace traces/zmr-agent --out .zmr/discovered/replay-smoke.json --include-actions --validate --json
99
- zmr draft --from-trace traces/zmr-agent --out .zmr/discovered/surface-smoke.json --json
100
- zmr draft --from-trace traces/zmr-agent --out .zmr/discovered/replay-smoke.json --include-actions --json
101
- zmr init --app --json --dir . --app-id com.example.mobiletest
102
165
  zmr validate --json .zmr/login-smoke.json
103
166
  zmr run .zmr/login-smoke.json --json --trace-dir traces/login-smoke
104
- zmr run .zmr/login-smoke.json --json --trace-dir traces/login-smoke --discover-out .zmr/discovered/login-smoke.json
105
- zmr explain --json traces/login-smoke
106
167
  zmr report traces/login-smoke --out traces/login-smoke/report.html --junit traces/login-smoke/junit.xml
107
- zmr import flow-yaml .zmr/legacy-flow.yaml --out .zmr/legacy-flow.json
108
- zmr export traces/login-smoke --out traces/login-smoke-redacted.zmrtrace --redact
109
- ```
110
-
111
- For traced runs, `zmr run --json` returns executable `nextCommands` for
112
- HTML and JUnit reporting, failure explanation, `zmr discover --from-trace`,
113
- and redacted export so agents can continue from a run summary without guessing
114
- the next handoff. The generated report handoff writes `report.html` and
115
- `junit.xml` beside the trace for CI artifact collection.
116
-
117
- When an agent should produce the reviewable scenario in the same command, add
118
- `--discover-out .zmr/discovered/<name>.json`. ZMR still treats the generated
119
- file as review-first: it writes from trace evidence, validates the file, and
120
- returns the embedded `discovery` result without crawling or committing tests.
121
- The `discovery.replay` object shows how many trace action events were
122
- considered for replay, how many became scenario steps, and how many were
123
- skipped.
124
-
125
- See [docs/scenario-authoring.md](docs/scenario-authoring.md) for selector and
126
- wait guidance.
127
-
128
- ## Agent Workflow
129
-
130
- Agents can use the CLI, JSON-RPC, or MCP surface. Start JSON-RPC over stdio:
131
-
132
- ```bash
133
- zmr inspect --json --dir .
134
- ```
135
-
136
- `zmr inspect --json` gives agents a read-only handoff for the app checkout:
137
- config status, generated agent instructions, configured platform scenarios, and
138
- recommended next commands. It does not launch devices or write tests.
139
-
140
- After a live session has produced trace artifacts, agents can ask ZMR to turn
141
- the trace into a validated, reviewable scenario candidate. `zmr explore` is
142
- the agent-facing handoff: it records the goal, writes from existing trace
143
- evidence, validates the candidate, and returns guardrails that make the limits
144
- machine-readable:
145
-
146
- ```bash
147
- zmr explore --from-trace traces/zmr-agent \
148
- --out .zmr/discovered/login-smoke.json \
149
- --goal "find a stable login smoke" \
150
- --include-actions \
151
- --validate \
152
- --json
168
+ zmr export traces/login-smoke --out login-smoke-redacted.zmrtrace --redact
153
169
  ```
154
170
 
155
- `zmr explore` is not an autonomous crawler. It does not launch devices, invent
156
- missing actions, discover credentials, or commit tests. Its JSON includes
157
- `autonomous:false`, `reviewRequired:true`, `guardrails`, and the same replay
158
- coverage and validation fields as `zmr discover`.
159
-
160
- When an agent wants the lower-level trace-to-test primitive directly, use
161
- `zmr discover`:
171
+ Traced `zmr run --json` responses include executable `nextCommands` so agents
172
+ can continue to reporting, explanation, discovery, or export without guessing.
173
+ Open any exported bundle in the static [trace viewer](viewer/index.html) — or
174
+ serve it and link straight to it with `viewer/index.html?bundle=<url>`.
162
175
 
163
- ```bash
164
- zmr discover --from-trace traces/zmr-agent \
165
- --out .zmr/discovered/replay-smoke.json \
166
- --include-actions \
167
- --validate \
168
- --json
169
- ```
176
+ For repeat-run reliability gates, p95 duration thresholds, baseline
177
+ comparisons against your current E2E tool, and multi-device matrices, see
178
+ [docs/benchmarking.md](docs/benchmarking.md) and the public
179
+ [Benchmark Lab](docs/benchmarks/README.md) evidence.
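Reviewer note: a repeat-run gate with a p95 duration threshold, as mentioned above, could look roughly like the sketch below. It is not the logic of `scripts/benchmark-lab.py`, which this diff lists but does not show. The function names and the nearest-rank percentile definition are assumptions, and the inputs are plain lists because the record fields of the `results.jsonl` files are not visible here.

```python
def p95(durations_ms: list) -> float:
    """Nearest-rank 95th percentile of a non-empty list of durations."""
    if not durations_ms:
        raise ValueError("need at least one duration")
    ordered = sorted(durations_ms)
    # Integer ceiling of 0.95 * n gives the 1-based nearest rank without float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def gate(durations_ms: list, passed: list, max_p95_ms: float,
         min_pass_rate: float = 1.0) -> bool:
    """Pass only if the runs are both fast enough and reliable enough."""
    pass_rate = sum(1 for ok in passed if ok) / len(passed)
    return p95(durations_ms) <= max_p95_ms and pass_rate >= min_pass_rate


# Twenty synthetic runs, mirroring the 20-run packs described in the changelog.
durations = [100 * i for i in range(1, 21)]
print(p95(durations), gate(durations, [True] * 20, max_p95_ms=2000))
```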
170
180
 
171
- `zmr discover` is offline and review-first. It writes a scenario from stable
172
- trace evidence, optionally validates it immediately, and returns next commands
173
- for deterministic reruns. It does not crawl the app, discover credentials, or
174
- commit tests. Its JSON includes `replay` coverage metadata so agents can report
175
- which trace action events became replay steps and which were skipped.
176
-
177
- For CLI-driven agent loops, `zmr run --json --trace-dir traces/zmr-agent
178
- --discover-out .zmr/discovered/replay-smoke.json` performs the same
179
- trace-backed discovery after the run and embeds the discover result in the run
180
- JSON response.
181
-
182
- For the lower-level draft primitive, agents can still ask ZMR to write a
183
- conservative surface-smoke scenario from the latest snapshot:
184
-
185
- ```bash
186
- zmr draft --from-trace traces/zmr-agent \
187
- --out .zmr/discovered/surface-smoke.json \
188
- --json
189
- zmr validate --json .zmr/discovered/surface-smoke.json
190
- ```
191
-
192
- `zmr draft` writes `launch`, `snapshot`, and `assertVisible` steps from stable
193
- visible selectors. It does not tap controls or type into fields unless
194
- `--include-actions` is explicitly requested.
195
-
196
- When the trace was produced by an agent or JSON-RPC/MCP session that took typed
197
- actions, add `--include-actions` to replay successful supported actions before
198
- the final snapshot assertions:
199
-
200
- ```bash
201
- zmr draft --from-trace traces/zmr-agent \
202
- --out .zmr/discovered/replay-smoke.json \
203
- --include-actions \
204
- --json
205
- zmr validate --json .zmr/discovered/replay-smoke.json
206
- ```
207
-
208
- Replay drafts only use trace events with enough stable data to reproduce them,
209
- such as launch, deep links, selector taps, selector text entry,
210
- selector/timeout-preserving waits, back, keyboard hiding, coordinate-complete
211
- swipes, direction/timeout-preserving selector scrolls, `assertNoneVisible`
212
- selector arrays, selector/timeout-preserving `assertVisible` and
213
- `assertNotVisible`, and timed `assertHealthy` checks.
214
- Native selector wait traces include timeout context for successful waits and
215
- timeout diagnostics.
216
- Unsupported or underspecified events are skipped with warnings instead of guessed.
217
- Text entry events whose text was redacted from the trace are also skipped.
218
-
219
- ```bash
220
- zmr serve --transport stdio --config .zmr/config.json --trace-dir traces/zmr-agent
221
- ```
222
-
223
- Agents that support the Model Context Protocol can use the native MCP surface:
224
-
225
- ```bash
226
- zmr mcp --config .zmr/config.json --trace-dir traces/zmr-agent
227
- ```
228
-
229
- The MCP server exposes mobile-specific tools such as `semantic_snapshot`,
230
- `install_app`, `launch_app`, `stop_app`, `clear_state`, `tap`, `type`,
231
- `erase_text`, `hide_keyboard`, `swipe`, `press_back`, `open_link`,
232
- `wait_visible`, `wait_not_visible`, `wait_any`, `scroll_until_visible`,
233
- `assert_visible`, `assert_not_visible`, `assert_healthy`, `scenario_validate`,
234
- `trace_events`, `trace_explain`, `trace_explore`, `trace_discover`, and
235
- `trace_export`.
236
-
237
- For agent-led discovery and test authoring, see
238
- [docs/agent-discovery.md](docs/agent-discovery.md). ZMR supports that loop
239
- through MCP, JSON-RPC, trace events, in-band trace discovery, offline surface
240
- drafts, replay drafts, live and offline guarded trace exploration, and traced
241
- run `nextCommands` today. Built-in exploration is review-first and
242
- trace-backed, not an unbounded autonomous crawler.
243
-
244
- ## Optional Protocol Clients
245
-
246
- Clients are thin wrappers around `zmr serve --transport stdio`. They do not
247
- replace the runner; they make it easier for agents and test code to call the
248
- same JSON-RPC protocol.
249
-
250
- TypeScript and Python are the most common starting points for app teams and
251
- agent harnesses. Go, Rust, Swift, and Kotlin clients are reference integrations
252
- for teams that want to embed the protocol from those ecosystems. Go and Rust
253
- also include typed trace discovery and scenario validation helpers for
254
- host-side agent loops. Swift and Kotlin include lightweight discovery and
255
- validation helpers for host-side automation.
256
-
257
- | Language | Entry point | Example |
258
- | --- | --- | --- |
259
- | TypeScript | `clients/typescript/index.mjs` + `index.d.ts` | `node clients/typescript/examples/fake-session.mjs` |
260
- | Python | `clients/python/zmr_client.py` + `pyproject.toml` | `python3 clients/python/examples/fake_session.py` |
261
- | Go | `clients/go/zmr/client.go` | `go run ./clients/go/examples/fake-session` |
262
- | Rust | `clients/rust/src/lib.rs` | `cargo run --manifest-path clients/rust/Cargo.toml --example fake_session` |
263
- | Swift | `clients/swift/Sources/ZMRClient` | `swift build --package-path clients/swift` |
264
- | Kotlin | `clients/kotlin/src/main/kotlin/dev/zmr` | `gradle -p clients/kotlin build` |
265
-
266
- See [clients/README.md](clients/README.md), [docs/clients.md](docs/clients.md),
267
- and [docs/client-installation.md](docs/client-installation.md).
268
-
269
- ## Platform Support
181
+ ## Platform support
270
182
 
271
183
  | Target | Status | Notes |
272
184
  | --- | --- | --- |
273
185
  | Android emulator | Supported | ADB/UI Automator, optional Android shim, emulator lifecycle helpers |
274
186
  | Android physical device | Supported | Requires ADB connection and app build/install surface |
275
- | iOS simulator | Supported | `simctl` plus app-local XCTest/XCUIAutomation shim for native selector actions, native waits, and bounded snapshots |
276
- | iOS physical device | Supported, validate locally | `devicectl` lifecycle plus app-local XCTest/XCUIAutomation shim; run pilots on your own app/device before relying on it in CI |
277
- | Cloud device farms | Not included | ZMR is focused on local and self-managed device targets in this preview |
187
+ | iOS simulator | Supported | `simctl` plus app-local XCTest/XCUIAutomation shim for native selector actions |
188
+ | iOS physical device | Supported, validate locally | `devicectl` lifecycle plus XCTest shim; pilot on your own app/device before relying on it in CI |
189
+ | Cloud device farms | Not included | ZMR focuses on local and self-managed device targets in this preview |
278
190
 
279
- Current release: `0.1.8` developer preview. Protocol version:
280
- `2026-04-28`.
191
+ Slow CI hardware can extend the iOS shim cold-build timeout with
192
+ `ZMR_IOS_SHIM_TIMEOUT_MS`. Current release: `0.2.0` developer preview.
193
+ Protocol version: `2026-04-28`.
194
+
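Reviewer note: a CI wrapper might set `ZMR_IOS_SHIM_TIMEOUT_MS` per job rather than globally. The sketch below assumes the variable is read as an integer number of milliseconds, as its name suggests. The helper name and the 600000 value are illustrative, not defaults taken from the package.

```python
import os


def shim_env(timeout_ms: int, base_env=None) -> dict:
    """Return a copy of the environment with the iOS shim timeout override set."""
    if timeout_ms <= 0:
        raise ValueError("timeout_ms must be positive")
    env = dict(os.environ if base_env is None else base_env)
    env["ZMR_IOS_SHIM_TIMEOUT_MS"] = str(timeout_ms)
    return env


env = shim_env(600_000, base_env={"PATH": "/usr/bin"})
# A wrapper could then pass `env` to a subprocess call, for example:
# subprocess.run(["npx", "zmr", "run", ".zmr/login-smoke.json", "--json"], env=env, check=True)
print(env["ZMR_IOS_SHIM_TIMEOUT_MS"])
```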
195
+ ## Optional protocol clients
196
+
197
+ TypeScript and Python clients are the common starting points; Go, Rust, Swift,
198
+ and Kotlin reference clients embed the same JSON-RPC protocol from those
199
+ ecosystems. All are thin wrappers around `zmr serve --transport stdio`. See
200
+ [docs/clients.md](docs/clients.md) and
201
+ [docs/client-installation.md](docs/client-installation.md).
281
202
 
282
203
  ## Documentation
283
204
 
284
- - [FEATURES.md](FEATURES.md): complete feature list and limitations
205
+ **For agents**
206
+
207
+ - [docs/ai-agents.md](docs/ai-agents.md): JSON-RPC and MCP agent workflows
208
+ - [docs/agent-discovery.md](docs/agent-discovery.md): agent-led discovery, `zmr explore`/`discover`/`draft`, and the trace-to-test loop
209
+ - [skills/zmr-mobile-testing/SKILL.md](skills/zmr-mobile-testing/SKILL.md): reusable agent skill
210
+
211
+ **For test authors**
212
+
285
213
  - [docs/install.md](docs/install.md): source, npm, Homebrew, and app setup
286
214
  - [docs/frameworks.md](docs/frameworks.md): React Native, Expo, Flutter, and native app guidance
287
- - [docs/expo-smoke.md](docs/expo-smoke.md): reproducible Expo and iOS smoke test
288
- - [docs/production-readiness.md](docs/production-readiness.md): release, reliability, framework, and agent-readiness gates
289
- - [docs/app-integration.md](docs/app-integration.md): app-side Android/iOS shims
290
215
  - [docs/scenario-authoring.md](docs/scenario-authoring.md): selectors, waits, and scenario design
291
- - [docs/agent-discovery.md](docs/agent-discovery.md): agent-led discovery and scenario authoring loop
216
+ - [docs/app-integration.md](docs/app-integration.md): app-side Android/iOS shims
217
+ - [docs/expo-smoke.md](docs/expo-smoke.md): reproducible Expo and iOS smoke test
218
+ - [docs/benchmarking.md](docs/benchmarking.md): repeat-run gates, reports, device matrix, baselines
219
+
220
+ **Reference**
221
+
222
+ - [FEATURES.md](FEATURES.md): complete feature list and limitations
292
223
  - [docs/protocol.md](docs/protocol.md): JSON-RPC methods and schemas
293
- - [docs/ai-agents.md](docs/ai-agents.md): JSON-RPC and MCP agent workflows
294
- - [docs/clients.md](docs/clients.md): language client guide
295
- - [docs/client-installation.md](docs/client-installation.md): npm, Homebrew, TS, Python, Go, Rust, Swift, and Kotlin setup
296
224
  - [docs/trace-privacy.md](docs/trace-privacy.md): safe trace export
225
+ - [docs/production-readiness.md](docs/production-readiness.md): release, reliability, and agent-readiness gates
297
226
  - [docs/troubleshooting.md](docs/troubleshooting.md): common setup and runtime issues
298
- - [skills/zmr-mobile-testing/SKILL.md](skills/zmr-mobile-testing/SKILL.md): reusable agent skill
227
+ - [docs/benchmarks](docs/benchmarks/README.md): public-safe benchmark evidence
299
228
 
300
229
  ## License
301
230
 
package/clients/kotlin/README.md CHANGED
@@ -27,7 +27,7 @@ gradle -p clients/kotlin runFakeSession \
27
27
  ```
28
28
 
29
29
  ```kotlin
30
- implementation(files("path/to/zeno-mobile-runner/clients/kotlin/build/libs/zmr-client-0.1.8.jar"))
30
+ implementation(files("path/to/zeno-mobile-runner/clients/kotlin/build/libs/zmr-client-0.2.0.jar"))
31
31
  ```
32
32
 
33
33
  ```kotlin
package/clients/kotlin/build.gradle.kts CHANGED
@@ -4,7 +4,7 @@ plugins {
4
4
  }
5
5
 
6
6
  group = "dev.zmr"
7
- version = "0.1.8"
7
+ version = "0.2.0"
8
8
 
9
9
  kotlin {
10
10
  jvmToolchain(17)
package/clients/python/pyproject.toml CHANGED
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
4
4
 
5
5
  [project]
6
6
  name = "zmr-client"
7
- version = "0.1.8.dev1"
7
+ version = "0.2.0.dev1"
8
8
  description = "Python JSON-RPC client for Zeno Mobile Runner."
9
9
  requires-python = ">=3.9"
10
10
  license = { text = "MIT" }
package/clients/rust/Cargo.lock CHANGED
@@ -100,7 +100,7 @@ checksum = "b8848ee67ecc8aedbaf3e4122217aff892639231befc6a1b58d29fff4c2cabaa"
100
100
 
101
101
  [[package]]
102
102
  name = "zmr-client"
103
- version = "0.1.8"
103
+ version = "0.2.0"
104
104
  dependencies = [
105
105
  "serde",
106
106
  "serde_json",
package/clients/rust/Cargo.toml CHANGED
@@ -1,6 +1,6 @@
1
1
  [package]
2
2
  name = "zmr-client"
3
- version = "0.1.8"
3
+ version = "0.2.0"
4
4
  edition = "2021"
5
5
  license = "MIT"
6
6
  description = "Rust JSON-RPC client for Zeno Mobile Runner."
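Reviewer note: this release bumps the version string in several client manifests, and the Python client carries a `.dev1` suffix while the others do not. A release script might guard against a missed bump with a check like the sketch below. The helper names are hypothetical, and the version table is copied by hand from the hunks in this diff, not parsed from the files.

```python
import re


def release_part(version: str) -> str:
    """Keep MAJOR.MINOR.PATCH and drop any suffix such as `.dev1`."""
    match = re.match(r"^(\d+\.\d+\.\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return match.group(1)


def mismatched(versions: dict, expected: str) -> dict:
    """Return the manifests whose release part differs from the expected version."""
    return {path: v for path, v in versions.items() if release_part(v) != expected}


# New-side values as they appear in this diff.
versions = {
    "clients/kotlin/build.gradle.kts": "0.2.0",
    "clients/python/pyproject.toml": "0.2.0.dev1",
    "clients/rust/Cargo.lock": "0.2.0",
    "clients/rust/Cargo.toml": "0.2.0",
    "clients/typescript/package.json": "0.2.0",
}
print(mismatched(versions, "0.2.0"))
```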
package/clients/typescript/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@zmr/client",
3
- "version": "0.1.8",
3
+ "version": "0.2.0",
4
4
  "type": "module",
5
5
  "main": "index.mjs",
6
6
  "types": "index.d.ts",
package/docs/agent-discovery.md CHANGED
@@ -11,6 +11,16 @@ trace-backed, not an unbounded crawler: it does not launch devices, invent
11
11
  missing actions, discover credentials, or commit files. Keep autonomous
12
12
  planning in the agent, and keep ZMR as the deterministic mobile control plane.
13
13
 
14
+ ```mermaid
15
+ flowchart LR
16
+ SESSION["Live agent session<br/>or zmr run"] --> TRACE["Trace directory"]
17
+ TRACE --> DISCOVER["zmr discover / draft / explore<br/>--from-trace"]
18
+ DISCOVER --> CANDIDATE["Scenario candidate<br/>.zmr/discovered/*.json"]
19
+ CANDIDATE --> REVIEW["Human / agent review"]
20
+ REVIEW --> VALIDATE["zmr validate --json"]
21
+ VALIDATE --> CI["zmr run in CI<br/>report.html · junit.xml"]
22
+ ```
23
+
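Reviewer note: the trace-to-test flow in the diagram above can be scripted as a list of command lines with a review pause between discovery and validation. The sketch below only builds the argument lists and runs nothing. The flags are the ones shown elsewhere in this diff; the function name and the default CI trace directory are hypothetical.

```python
def trace_to_test_commands(trace_dir: str, candidate: str,
                           ci_trace_dir: str = "traces/ci-replay") -> list:
    """Return argv lists for the discover, validate, and CI replay stages."""
    return [
        # Stage 1: write a scenario candidate from existing trace evidence.
        ["zmr", "discover", "--from-trace", trace_dir, "--out", candidate,
         "--include-actions", "--validate", "--json"],
        # Stage 2: re-validate after a human or agent has reviewed the candidate.
        ["zmr", "validate", "--json", candidate],
        # Stage 3: deterministic replay in CI, writing a fresh trace.
        ["zmr", "run", candidate, "--json", "--trace-dir", ci_trace_dir],
    ]


commands = trace_to_test_commands("traces/zmr-agent",
                                  ".zmr/discovered/replay-smoke.json")
for argv in commands:
    print(" ".join(argv))
```

Keeping the stages as separate argument lists leaves room for the review step, which the docs describe as required before a discovered scenario is trusted.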
14
24
  ## Recommended Loop
15
25
 
16
26
  1. Validate local setup:
package/docs/ai-agents.md CHANGED
@@ -4,6 +4,24 @@ ZMR is built for external agents. The runner provides device state, typed
4
4
  actions, waits, assertions, trace explanation, and trace export; the agent
5
5
  decides the next step.
6
6
 
7
+ ```mermaid
8
+ sequenceDiagram
9
+ participant Agent as AI agent
10
+ participant ZMR
11
+ participant Device as Emulator / simulator
12
+ Agent->>ZMR: semantic_snapshot
13
+ ZMR->>Device: capture UI + screenshot
14
+ ZMR-->>Agent: roles, stable selectors, bounds
15
+ Agent->>ZMR: tap / type / swipe / open_link
16
+ ZMR->>Device: execute + settle
17
+ Agent->>ZMR: wait_visible / assert_visible
18
+ ZMR-->>Agent: typed result + trace events
19
+ Agent->>ZMR: trace_discover
20
+ ZMR-->>Agent: reviewable replay scenario
21
+ Agent->>ZMR: trace_export --redact
22
+ ZMR-->>Agent: .zmrtrace evidence bundle
23
+ ```
24
+
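Reviewer note: over MCP, each arrow from the agent in the diagram above corresponds to a JSON-RPC 2.0 `tools/call` request. The sketch below encodes a few of them using tool names from the README table. The argument shapes (`selector`, `timeoutMs`) are assumptions borrowed from the scenario format, not confirmed tool schemas, and a real client would complete the MCP initialize handshake before sending these.

```python
import itertools
import json

_ids = itertools.count(1)  # JSON-RPC request ids, unique per session


def tool_call(name: str, arguments=None) -> str:
    """Encode one MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    })


# Observe, act, wait, then ask for a replay candidate. Argument shapes are assumed.
loop = [
    tool_call("semantic_snapshot"),
    tool_call("tap", {"selector": {"text": "Login"}}),
    tool_call("wait_visible", {"selector": {"text": "Welcome"}, "timeoutMs": 30000}),
    tool_call("trace_discover"),
]
for message in loop:
    print(message)
```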
7
25
  ## Agent Setup Loop
8
26
 
9
27
  Start inside the app checkout: