zeno-mobile-runner 0.1.8 → 0.2.1

Files changed (66)
  1. package/CHANGELOG.md +72 -0
  2. package/FEATURES.md +1 -1
  3. package/README.md +175 -238
  4. package/clients/kotlin/README.md +1 -1
  5. package/clients/kotlin/build.gradle.kts +1 -1
  6. package/clients/python/pyproject.toml +1 -1
  7. package/clients/rust/Cargo.lock +1 -1
  8. package/clients/rust/Cargo.toml +1 -1
  9. package/clients/typescript/package.json +1 -1
  10. package/docs/agent-discovery.md +10 -0
  11. package/docs/ai-agents.md +18 -0
  12. package/docs/benchmarking.md +39 -0
  13. package/docs/benchmarks/2026-06-09-android-workflow.md +73 -0
  14. package/docs/benchmarks/2026-06-09-android-workflow.results.jsonl +20 -0
  15. package/docs/benchmarks/2026-06-09-framework-baseline-status.md +32 -0
  16. package/docs/benchmarks/2026-06-09-ios-appium-comparison.md +115 -0
  17. package/docs/benchmarks/2026-06-09-ios-appium-comparison.results.jsonl +40 -0
  18. package/docs/benchmarks/2026-06-09-ios-demo.md +90 -0
  19. package/docs/benchmarks/2026-06-09-ios-demo.results.jsonl +20 -0
  20. package/docs/benchmarks/2026-06-09-ios-maestro-comparison.md +128 -0
  21. package/docs/benchmarks/2026-06-09-ios-maestro-comparison.results.jsonl +40 -0
  22. package/docs/benchmarks/2026-06-09-ios-workflow-comparison.md +143 -0
  23. package/docs/benchmarks/2026-06-09-ios-workflow-comparison.results.jsonl +40 -0
  24. package/docs/benchmarks/2026-06-09-ios-xctest-floor.md +106 -0
  25. package/docs/benchmarks/2026-06-09-ios-xctest-floor.results.jsonl +40 -0
  26. package/docs/benchmarks/README.md +36 -0
  27. package/docs/benchmarks/benchmark-lab-v1.json +155 -0
  28. package/docs/benchmarks/benchmark-lab-v1.md +95 -0
  29. package/docs/clients.md +16 -0
  30. package/docs/demo.md +36 -1
  31. package/docs/frameworks.md +10 -0
  32. package/docs/npm.md +44 -2
  33. package/docs/protocol-fixtures/core-session.responses.jsonl +1 -1
  34. package/docs/protocol.md +10 -10
  35. package/docs/scenario-authoring.md +15 -0
  36. package/docs/trace-privacy.md +9 -0
  37. package/docs/troubleshooting.md +6 -0
  38. package/examples/android-workflow.json +79 -0
  39. package/examples/ios-dev-client-open-link.json +24 -13
  40. package/examples/ios-dev-client-route-snapshot.json +33 -8
  41. package/examples/ios-shim-workflow.json +79 -0
  42. package/examples/react-native-expo-workflow.json +75 -0
  43. package/npm/scenarios.mjs +15 -8
  44. package/npm/wizard.mjs +1 -1
  45. package/package.json +6 -1
  46. package/prebuilds/darwin-arm64/zmr +0 -0
  47. package/prebuilds/darwin-x64/zmr +0 -0
  48. package/prebuilds/linux-arm64/zmr +0 -0
  49. package/prebuilds/linux-x64/zmr +0 -0
  50. package/scripts/benchmark-lab.py +253 -0
  51. package/scripts/create-android-demo-app.sh +324 -29
  52. package/scripts/create-ios-demo-app.sh +174 -7
  53. package/scripts/create-react-native-expo-demo-app.sh +727 -0
  54. package/scripts/demo.sh +3 -0
  55. package/scripts/install-ios-shim.sh +2 -2
  56. package/shims/ios/ZMRShim.swift +10 -0
  57. package/shims/ios/ZMRShimUITestCase.swift +49 -1
  58. package/shims/ios/protocol.md +1 -0
  59. package/src/cli_import.zig +31 -15
  60. package/src/cli_trace.zig +38 -16
  61. package/src/cli_validate.zig +12 -6
  62. package/src/ios.zig +44 -11
  63. package/src/ios_shim.zig +36 -2
  64. package/src/main.zig +6 -0
  65. package/src/version.zig +1 -1
  66. package/viewer/app.js +23 -3
package/CHANGELOG.md CHANGED
@@ -2,6 +2,78 @@
2
2
 
3
3
  All notable changes to Zeno Mobile Runner are tracked here.
4
4
 
5
+ ## Unreleased
6
+
7
+ ## 0.2.1 (2026-06-10)
8
+
9
+ ### Fixed
10
+
11
+ - iOS simulator `openLink` now asks the XCTest shim to accept the SpringBoard
12
+ "Open in <App>?" confirmation for custom URL schemes too, not just
13
+ http/https universal links. Custom schemes are the common Expo dev-client
14
+ deep-link case (`exp+scheme://expo-development-client/...`), and the
15
+ unaccepted dialog previously blocked navigation entirely. The shim's
16
+ `acceptSystemAlert` also gained a single alert-existence probe so the
17
+ best-effort accept stays fast when no dialog appears.
18
+ - The generated Expo dev-client scenarios no longer pass when only the Expo
19
+ dev launcher rendered. The old `waitAny` markers also matched launcher
20
+ chrome ("Home", "Continue", "Sign in"), so runs exited green even though
21
+ the app's JS bundle never loaded. The scenarios now wait for the launcher's
22
+ persistent marker to be gone (`waitNotVisible` on "evelopment servers",
23
+ covering both case-sensitive spellings) — passing immediately when the deep
24
+ link navigates, and failing when the launcher is stuck — then assert no
25
+ bundle-error screen ("Unable to load" / "There was a problem loading") is
26
+ visible before `assertHealthy` and `snapshot`. Verified both directions
27
+ against a real Expo SDK 56 app: passes in ~24s with Metro serving, fails
28
+ with a wait timeout when the bundler is down.
29
+
30
+ ## 0.2.0 (2026-06-10)
31
+
32
+ ### Added
33
+
34
+ - Added a public-safe iOS simulator benchmark evidence pack with 20 repeated
35
+ runs of the generated iOS smoke scenario.
36
+ - Added a public-safe iOS simulator baseline runner benchmark comparison on the
37
+ same generated demo app.
38
+ - Added a second public-safe iOS baseline comparison plus a native shim floor
39
+ evidence pack for the generated demo app.
40
+ - Added a richer public-safe iOS workflow benchmark pack covering profile
41
+ entry, catalog selection, save, review, and final-state assertion on the
42
+ generated demo app.
43
+ - Added the first Android workflow benchmark pack for the generated demo app,
44
+ covering 20 repeated UIAutomator-path ZMR runs.
45
+ - Added a generated React Native/Expo benchmark fixture with stable `testID`
46
+ values, accessibility labels, deep-link setup, and Android/iOS ZMR workflow
47
+ scenarios.
48
+ - The trace viewer loads a served bundle directly from
49
+ `viewer/index.html?bundle=<url>`, so CI artifact links and shared triage can
50
+ open a trace without manual file selection.
51
+ - The iOS XCTest shim cold-build timeout is tunable with the
52
+ `ZMR_IOS_SHIM_TIMEOUT_MS` environment variable for slower CI hardware.
53
+ - Added a nightly `device-smoke` GitHub Actions workflow that runs the public
54
+ demo apps on a real Android emulator and iOS simulator and uploads traces,
55
+ reports, and redacted bundles as evidence artifacts.
56
+ - Added real captured screenshots under `docs/assets/` (trace viewer, device
57
+ screens, CLI failure-diagnosis loop, HTML report) plus
58
+ `scripts/capture-screenshots.sh` to regenerate them from fresh demo runs.
59
+ The assets ship in the repository only, not in the npm package.
60
+ - Added Mermaid architecture, verification-loop, trace-lifecycle, and
61
+ trace-to-test diagrams to the README and core docs, and rewrote the README
62
+ around the AI-coding-agent verification workflow.
63
+
64
+ ### Fixed
65
+
66
+ - `zmr validate`, `zmr report`, `zmr export`, and `zmr import` now accept
67
+ flags before positional arguments, matching the documented command forms,
68
+ and unknown-flag errors print a help hint.
69
+ - Generated Android demo scenarios clear app state before launching so
70
+ repeated runs no longer fail on leftover screens from a previous session.
71
+
72
+ - Fixed generated iOS shim one-shot log file creation on macOS by using a
73
+ portable `mktemp` template with `XXXXXX` at the end.
74
+ - Skipped the slow iOS system-open confirmation probe for simulator custom URL
75
+ schemes while keeping it for universal web links.
76
+
5
77
  ## 0.1.8 (2026-06-06)
6
78
 
7
79
  ### Changed
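The 0.2.0 entry above notes that the trace viewer can load a served bundle from `viewer/index.html?bundle=<url>`. A minimal Python sketch of building such a link for a CI artifact follows. The host `ci.example.com`, the artifact path, and the helper name `viewer_link` are hypothetical, and the sketch assumes the viewer reads the `bundle` query parameter with standard URL decoding, which this diff does not confirm.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def viewer_link(viewer_base: str, bundle_url: str) -> str:
    """Return a trace-viewer URL that points at a served bundle.

    viewer_base is wherever the package's viewer/ directory is hosted
    (hypothetical here); bundle_url is the served .zmrtrace file.
    """
    # urlencode percent-encodes the bundle URL, so any "?" or "&" inside it
    # cannot be mistaken for part of the viewer's own query string.
    query = urlencode({"bundle": bundle_url})
    return f"{viewer_base.rstrip('/')}/viewer/index.html?{query}"


# Hypothetical CI artifact location, for illustration only.
bundle = "https://ci.example.com/artifacts/run-42/login-smoke-redacted.zmrtrace?token=abc&x=1"
link = viewer_link("https://ci.example.com/artifacts/", bundle)

# Round-trip check: decoding the query string recovers the original bundle URL.
recovered = parse_qs(urlsplit(link).query)["bundle"][0]
print(link)
print(recovered == bundle)
```

Encoding the whole bundle URL as a single parameter value keeps signed or tokenized artifact URLs intact when the link is pasted into a CI comment or a triage thread.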
package/FEATURES.md CHANGED
@@ -142,7 +142,7 @@ state, and writes deterministic traces. It does not embed an LLM.
142
142
 
143
143
  ## Current Limitations
144
144
 
145
- - Current release status is `0.1.8`, a public developer preview rather than
145
+ - Current release status is `0.2.1`, a public developer preview rather than
146
146
  a production-stable `1.0.0`.
147
147
  - Physical iOS log capture is still simulator-first. Physical iOS screenshots
148
148
  are available when the XCTest/XCUIAutomation shim is configured.
package/README.md CHANGED
@@ -1,301 +1,238 @@
1
1
  # Zeno Mobile Runner
2
2
 
3
- > Agent-native mobile UI automation for React Native, Expo, Flutter, and native Android/iOS apps.
3
+ > The verification loop for AI coding agents building Expo, React Native,
4
+ > Flutter, and native Android/iOS apps.
4
5
 
5
6
  [![CI](https://github.com/johnmikel/zeno-mobile-runner/actions/workflows/ci.yml/badge.svg)](https://github.com/johnmikel/zeno-mobile-runner/actions/workflows/ci.yml)
6
7
  [![Release](https://img.shields.io/github/v/release/johnmikel/zeno-mobile-runner?include_prereleases)](https://github.com/johnmikel/zeno-mobile-runner/releases)
7
8
  [![npm](https://img.shields.io/npm/v/zeno-mobile-runner)](https://www.npmjs.com/package/zeno-mobile-runner)
8
9
  [![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)
9
10
 
10
- ZMR gives AI agents and test harnesses a typed mobile control plane. It can
11
- install and launch apps, observe the UI, choose an action, wait for the screen to
12
- settle, assert state, and export a replayable trace. The runner does not embed an
13
- LLM. Agents stay outside and use ZMR through CLI JSON, scenarios, JSON-RPC, MCP,
14
- or optional protocol clients.
11
+ Your coding agent can write mobile code, but it cannot see the phone. ZMR is
12
+ its eyes and hands: a typed mobile control plane that installs and launches
13
+ apps, observes the UI, taps and types, waits for the screen to settle, asserts
14
+ state, and exports a replayable trace as proof. The runner does not embed an
15
+ LLM. Agents stay outside and drive ZMR through MCP, JSON-RPC, CLI JSON, or
16
+ JSON scenarios.
17
+
18
+ ![ZMR trace viewer showing a passed iOS run with timeline, device screenshot, UI tree, and selector payload](docs/assets/viewer-hero.png)
19
+
20
+ <p align="center">
21
+ <img src="docs/assets/device-ios-demo.png" width="260" alt="iOS simulator screenshot captured by ZMR during a scenario run" />
22
+ &nbsp;&nbsp;
23
+ <img src="docs/assets/device-android-demo.png" width="260" alt="Android emulator screenshot captured by ZMR during a scenario run" />
24
+ </p>
25
+
26
+ <p align="center"><em>Real on-device screenshots from ZMR traces: the same demo flow
27
+ driven on an iOS simulator and an Android emulator.</em></p>
28
+
29
+ ## Why agents need this
30
+
31
+ - **Agents can't verify what they can't observe.** ZMR returns semantic UI
32
+ trees with stable selectors, screenshots, and typed action results an agent
33
+ can reason about — not raw pixels it has to guess at.
34
+ - **Evidence, not vibes.** Every session can write a deterministic trace:
35
+ events, screenshots, UI hierarchies, timings, assertion results, HTML and
36
+ JUnit reports, and a redacted shareable bundle.
37
+ - **Tests fall out for free.** After a live agent session, `zmr discover`
38
+ turns the trace into a reviewable JSON scenario that replays in CI without
39
+ an LLM in the loop.
40
+
41
+ ## How it works
42
+
43
+ ```mermaid
44
+ flowchart LR
45
+ A["AI coding agent<br/>Claude Code · Cursor · custom harness"]
46
+ subgraph zmr["ZMR — one small Zig binary"]
47
+ MCP["MCP server<br/><code>zmr mcp</code>"]
48
+ RPC["JSON-RPC stdio/TCP<br/><code>zmr serve</code>"]
49
+ CLI["CLI + JSON scenarios<br/><code>zmr run</code>"]
50
+ CORE["Core engine<br/>selectors · waits · assertions<br/>scenario runner · trace writer"]
51
+ MCP --> CORE
52
+ RPC --> CORE
53
+ CLI --> CORE
54
+ end
55
+ subgraph devices["Devices"]
56
+ AND["Android emulator/device<br/>ADB · UI Automator · optional shim"]
57
+ IOS["iOS simulator/device<br/>simctl · devicectl · XCTest shim"]
58
+ end
59
+ TRACE["Trace<br/>events.jsonl · screenshots · UI trees<br/>report.html · junit.xml · .zmrtrace"]
60
+ A -- "MCP tools" --> MCP
61
+ A -- "JSON-RPC" --> RPC
62
+ A -- "CLI JSON" --> CLI
63
+ CORE --> AND
64
+ CORE --> IOS
65
+ CORE --> TRACE
66
+ ```
67
+
68
+ No app instrumentation is required on Android. iOS selector actions use an
69
+ app-local XCTest shim that the wizard scaffolds. ZMR works below the
70
+ JavaScript/Dart layer, so React Native, Expo, Flutter, and fully native apps
71
+ are all driven the same way. See [docs/frameworks.md](docs/frameworks.md).
15
72
 
16
- ## Install
73
+ ## Five-minute start
17
74
 
18
75
  Inside a mobile app repo:
19
76
 
20
77
  ```bash
21
- npm install --save-dev zeno-mobile-runner
78
+ npm install --save-dev zeno-mobile-runner # bun add --dev zeno-mobile-runner
22
79
  npx zmr-wizard --app-id com.example.mobiletest --package-json
23
80
  npx zmr doctor --strict --json --config .zmr/config.json
24
81
  ```
25
82
 
26
- Run a generated smoke scenario:
83
+ Hook it up to your coding agent (Claude Code shown; any MCP client works):
27
84
 
28
85
  ```bash
29
- npm run zmr:validate
30
- npm run zmr:android
31
- npm run zmr:ios
86
+ claude mcp add zmr -- npx zmr mcp --config .zmr/config.json --trace-dir traces/zmr-agent
87
+ ```
88
+
89
+ Claude Code users can instead install the plugin, which bundles the MCP server
90
+ and a mobile-testing skill:
91
+
92
+ ```text
93
+ /plugin marketplace add johnmikel/zeno-mobile-runner
94
+ /plugin install zmr@zmr-marketplace
32
95
  ```
33
96
 
34
- ## React Native, Expo, and Flutter
35
-
36
- ZMR works below the JavaScript or Dart framework layer. It drives the installed
37
- Android or iOS app through platform lifecycle commands, deep links, accessibility
38
- semantics, screenshots, logs, selector actions, waits, assertions, and traces.
39
-
40
- - **React Native:** prefer `testID`, `accessibilityLabel`, stable text, and deep
41
- links for direct navigation into important states.
42
- - **Expo development builds:** pass `--expo-dev-client-scheme <scheme>` to the
43
- wizard so ZMR scaffolds dev-client open-link scenarios.
44
- - **Flutter:** ZMR supports Flutter apps at the Android and iOS app level when
45
- the app exposes stable semantics labels, text, deep links, or native ids. It is
46
- not a Flutter widget-tree driver and does not inspect Flutter internals.
47
- - **Native Android/iOS:** use resource ids, content descriptions, accessibility
48
- identifiers, XCTest labels, and app-owned deep links.
49
-
50
- See [docs/frameworks.md](docs/frameworks.md) and
51
- [docs/app-integration.md](docs/app-integration.md) for app-side setup guidance.
52
-
53
- ## Why ZMR
54
-
55
- - **Agent-native protocol:** structured snapshots, semantic mobile trees,
56
- actions, waits, assertions, live trace events, trace explanation, and
57
- redacted trace export over JSON-RPC or MCP.
58
- - **Trace-first debugging:** every run can produce screenshots, UI trees, logs,
59
- timings, action inputs, assertion results, and HTML/JUnit reports.
60
- - **Fast local core:** Zig owns orchestration, subprocess control, selectors,
61
- waits, retries, scenario execution, and packaged binaries.
62
- - **App-local setup:** `.zmr/config.json`, smoke scenarios, shim commands, and
63
- traces live in the app repo.
64
- - **Android and iOS:** Android uses ADB/UI Automator plus an optional native
65
- shim. iOS simulators use `simctl`; physical iOS devices use `devicectl`;
66
- selector-grade iOS automation uses the XCTest/XCUIAutomation shim.
67
-
68
- ## Scenario Example
69
-
70
- ZMR scenarios are JSON so agents and build scripts can generate, validate, and
71
- mutate them without a second DSL.
97
+ Or in an `.mcp.json` / MCP client config:
98
+
99
+ ```json
100
+ {
101
+   "mcpServers": {
102
+     "zmr": {
103
+       "command": "npx",
104
+       "args": ["zmr", "mcp", "--config", ".zmr/config.json", "--trace-dir", "traces/zmr-agent"]
105
+     }
106
+   }
107
+ }
108
+ ```
109
+
110
+ Then ask the agent to verify its own work: *"launch the app, walk through
111
+ onboarding, and show me the trace."*
112
+
113
+ ## The agent verification loop
114
+
115
+ ```mermaid
116
+ sequenceDiagram
117
+ participant Agent as AI agent
118
+ participant ZMR
119
+ participant Device as Emulator / simulator
120
+ Agent->>ZMR: semantic_snapshot
121
+ ZMR->>Device: capture UI + screenshot
122
+ ZMR-->>Agent: roles, stable selectors, bounds
123
+ Agent->>ZMR: tap / type / swipe / open_link
124
+ ZMR->>Device: execute + settle
125
+ Agent->>ZMR: wait_visible / assert_visible
126
+ ZMR-->>Agent: typed result + trace events
127
+ Agent->>ZMR: trace_discover
128
+ ZMR-->>Agent: reviewable replay scenario
129
+ Agent->>ZMR: trace_export --redact
130
+ ZMR-->>Agent: .zmrtrace evidence bundle
131
+ ```
132
+
133
+ The MCP server exposes the full loop as mobile-native tools:
134
+
135
+ | Group | Tools |
136
+ | --- | --- |
137
+ | Observe | `snapshot`, `semantic_snapshot` |
138
+ | App lifecycle | `install_app`, `launch_app`, `stop_app`, `clear_state`, `open_link` |
139
+ | Act | `tap`, `type`, `erase_text`, `hide_keyboard`, `swipe`, `press_back` |
140
+ | Wait | `wait_visible`, `wait_not_visible`, `wait_any`, `scroll_until_visible` |
141
+ | Assert | `assert_visible`, `assert_not_visible`, `assert_healthy` |
142
+ | Evidence | `trace_events`, `trace_explain`, `trace_discover`, `trace_explore`, `trace_export`, `scenario_validate` |
143
+
144
+ The same surface is available over JSON-RPC for harnesses that embed ZMR
145
+ directly — see [docs/protocol.md](docs/protocol.md) and
146
+ [docs/ai-agents.md](docs/ai-agents.md). When a run fails, `zmr explain`
147
+ diagnoses the trace for humans and agents alike:
148
+
149
+ ![Terminal session showing a failed run, zmr explain diagnosing the failure with visible texts, and the fixed run passing](docs/assets/cli-run-explain.png)
150
+
151
+ ## Deterministic scenarios for CI
152
+
153
+ Scenarios are plain JSON — agents and build scripts generate, validate, and
154
+ mutate them without a second DSL, and they replay in CI with no LLM cost:
72
155
 
73
156
  ```json
74
157
  {
75
158
    "name": "Login smoke",
76
159
    "appId": "com.example.mobiletest",
77
160
    "steps": [
161
+     { "action": "clearState" },
78
162
      { "action": "launch" },
79
163
      { "action": "assertHealthy", "timeoutMs": 5000 },
80
164
      { "action": "tap", "selector": { "resourceId": "email" } },
81
165
      { "action": "typeText", "text": "user@example.com" },
82
-     { "action": "tap", "selector": { "resourceId": "password" } },
83
-     { "action": "typeText", "text": "password" },
84
166
      { "action": "tap", "selector": { "text": "Login" } },
85
167
      { "action": "waitVisible", "selector": { "text": "Welcome" }, "timeoutMs": 30000 }
86
168
    ]
87
169
  }
88
170
  ```
89
171
 
90
- Useful commands:
91
-
92
172
  ```bash
93
- zmr version --json
94
- zmr schemas --json
95
- zmr devices --json
96
- zmr inspect --json
97
- zmr explore --from-trace traces/zmr-agent --out .zmr/discovered/login-smoke.json --goal "find a stable login smoke" --include-actions --validate --json
98
- zmr discover --from-trace traces/zmr-agent --out .zmr/discovered/replay-smoke.json --include-actions --validate --json
99
- zmr draft --from-trace traces/zmr-agent --out .zmr/discovered/surface-smoke.json --json
100
- zmr draft --from-trace traces/zmr-agent --out .zmr/discovered/replay-smoke.json --include-actions --json
101
- zmr init --app --json --dir . --app-id com.example.mobiletest
102
173
  zmr validate --json .zmr/login-smoke.json
103
174
  zmr run .zmr/login-smoke.json --json --trace-dir traces/login-smoke
104
- zmr run .zmr/login-smoke.json --json --trace-dir traces/login-smoke --discover-out .zmr/discovered/login-smoke.json
105
- zmr explain --json traces/login-smoke
106
175
  zmr report traces/login-smoke --out traces/login-smoke/report.html --junit traces/login-smoke/junit.xml
107
- zmr import flow-yaml .zmr/legacy-flow.yaml --out .zmr/legacy-flow.json
108
- zmr export traces/login-smoke --out traces/login-smoke-redacted.zmrtrace --redact
109
- ```
110
-
111
- For traced runs, `zmr run --json` returns executable `nextCommands` for
112
- HTML and JUnit reporting, failure explanation, `zmr discover --from-trace`,
113
- and redacted export so agents can continue from a run summary without guessing
114
- the next handoff. The generated report handoff writes `report.html` and
115
- `junit.xml` beside the trace for CI artifact collection.
116
-
117
- When an agent should produce the reviewable scenario in the same command, add
118
- `--discover-out .zmr/discovered/<name>.json`. ZMR still treats the generated
119
- file as review-first: it writes from trace evidence, validates the file, and
120
- returns the embedded `discovery` result without crawling or committing tests.
121
- The `discovery.replay` object shows how many trace action events were
122
- considered for replay, how many became scenario steps, and how many were
123
- skipped.
124
-
125
- See [docs/scenario-authoring.md](docs/scenario-authoring.md) for selector and
126
- wait guidance.
127
-
128
- ## Agent Workflow
129
-
130
- Agents can use the CLI, JSON-RPC, or MCP surface. Start JSON-RPC over stdio:
131
-
132
- ```bash
133
- zmr inspect --json --dir .
134
- ```
135
-
136
- `zmr inspect --json` gives agents a read-only handoff for the app checkout:
137
- config status, generated agent instructions, configured platform scenarios, and
138
- recommended next commands. It does not launch devices or write tests.
139
-
140
- After a live session has produced trace artifacts, agents can ask ZMR to turn
141
- the trace into a validated, reviewable scenario candidate. `zmr explore` is
142
- the agent-facing handoff: it records the goal, writes from existing trace
143
- evidence, validates the candidate, and returns guardrails that make the limits
144
- machine-readable:
145
-
146
- ```bash
147
- zmr explore --from-trace traces/zmr-agent \
148
- --out .zmr/discovered/login-smoke.json \
149
- --goal "find a stable login smoke" \
150
- --include-actions \
151
- --validate \
152
- --json
153
- ```
154
-
155
- `zmr explore` is not an autonomous crawler. It does not launch devices, invent
156
- missing actions, discover credentials, or commit tests. Its JSON includes
157
- `autonomous:false`, `reviewRequired:true`, `guardrails`, and the same replay
158
- coverage and validation fields as `zmr discover`.
159
-
160
- When an agent wants the lower-level trace-to-test primitive directly, use
161
- `zmr discover`:
162
-
163
- ```bash
164
- zmr discover --from-trace traces/zmr-agent \
165
- --out .zmr/discovered/replay-smoke.json \
166
- --include-actions \
167
- --validate \
168
- --json
176
+ zmr export traces/login-smoke --out login-smoke-redacted.zmrtrace --redact
169
177
  ```
170
178
 
171
- `zmr discover` is offline and review-first. It writes a scenario from stable
172
- trace evidence, optionally validates it immediately, and returns next commands
173
- for deterministic reruns. It does not crawl the app, discover credentials, or
174
- commit tests. Its JSON includes `replay` coverage metadata so agents can report
175
- which trace action events became replay steps and which were skipped.
179
+ Traced `zmr run --json` responses include executable `nextCommands` so agents
180
+ can continue to reporting, explanation, discovery, or export without guessing.
181
+ Open any exported bundle in the static [trace viewer](viewer/index.html) or
182
+ serve it and link straight to it with `viewer/index.html?bundle=<url>`.
176
183
 
177
- For CLI-driven agent loops, `zmr run --json --trace-dir traces/zmr-agent
178
- --discover-out .zmr/discovered/replay-smoke.json` performs the same
179
- trace-backed discovery after the run and embeds the discover result in the run
180
- JSON response.
184
+ For repeat-run reliability gates, p95 duration thresholds, baseline
185
+ comparisons against your current E2E tool, and multi-device matrices, see
186
+ [docs/benchmarking.md](docs/benchmarking.md) and the public
187
+ [Benchmark Lab](docs/benchmarks/README.md) evidence.
181
188
 
182
- For the lower-level draft primitive, agents can still ask ZMR to write a
183
- conservative surface-smoke scenario from the latest snapshot:
184
-
185
- ```bash
186
- zmr draft --from-trace traces/zmr-agent \
187
- --out .zmr/discovered/surface-smoke.json \
188
- --json
189
- zmr validate --json .zmr/discovered/surface-smoke.json
190
- ```
191
-
192
- `zmr draft` writes `launch`, `snapshot`, and `assertVisible` steps from stable
193
- visible selectors. It does not tap controls or type into fields unless
194
- `--include-actions` is explicitly requested.
195
-
196
- When the trace was produced by an agent or JSON-RPC/MCP session that took typed
197
- actions, add `--include-actions` to replay successful supported actions before
198
- the final snapshot assertions:
199
-
200
- ```bash
201
- zmr draft --from-trace traces/zmr-agent \
202
- --out .zmr/discovered/replay-smoke.json \
203
- --include-actions \
204
- --json
205
- zmr validate --json .zmr/discovered/replay-smoke.json
206
- ```
207
-
208
- Replay drafts only use trace events with enough stable data to reproduce them,
209
- such as launch, deep links, selector taps, selector text entry,
210
- selector/timeout-preserving waits, back, keyboard hiding, coordinate-complete
211
- swipes, direction/timeout-preserving selector scrolls, `assertNoneVisible`
212
- selector arrays, selector/timeout-preserving `assertVisible` and
213
- `assertNotVisible`, and timed `assertHealthy` checks.
214
- Native selector wait traces include timeout context for successful waits and
215
- timeout diagnostics.
216
- Unsupported or underspecified events are skipped with warnings instead of guessed.
217
- Text entry events whose text was redacted from the trace are also skipped.
218
-
219
- ```bash
220
- zmr serve --transport stdio --config .zmr/config.json --trace-dir traces/zmr-agent
221
- ```
222
-
223
- Agents that support the Model Context Protocol can use the native MCP surface:
224
-
225
- ```bash
226
- zmr mcp --config .zmr/config.json --trace-dir traces/zmr-agent
227
- ```
228
-
229
- The MCP server exposes mobile-specific tools such as `semantic_snapshot`,
230
- `install_app`, `launch_app`, `stop_app`, `clear_state`, `tap`, `type`,
231
- `erase_text`, `hide_keyboard`, `swipe`, `press_back`, `open_link`,
232
- `wait_visible`, `wait_not_visible`, `wait_any`, `scroll_until_visible`,
233
- `assert_visible`, `assert_not_visible`, `assert_healthy`, `scenario_validate`,
234
- `trace_events`, `trace_explain`, `trace_explore`, `trace_discover`, and
235
- `trace_export`.
236
-
237
- For agent-led discovery and test authoring, see
238
- [docs/agent-discovery.md](docs/agent-discovery.md). ZMR supports that loop
239
- through MCP, JSON-RPC, trace events, in-band trace discovery, offline surface
240
- drafts, replay drafts, live and offline guarded trace exploration, and traced
241
- run `nextCommands` today. Built-in exploration is review-first and
242
- trace-backed, not an unbounded autonomous crawler.
243
-
244
- ## Optional Protocol Clients
245
-
246
- Clients are thin wrappers around `zmr serve --transport stdio`. They do not
247
- replace the runner; they make it easier for agents and test code to call the
248
- same JSON-RPC protocol.
249
-
250
- TypeScript and Python are the most common starting points for app teams and
251
- agent harnesses. Go, Rust, Swift, and Kotlin clients are reference integrations
252
- for teams that want to embed the protocol from those ecosystems. Go and Rust
253
- also include typed trace discovery and scenario validation helpers for
254
- host-side agent loops. Swift and Kotlin include lightweight discovery and
255
- validation helpers for host-side automation.
256
-
257
- | Language | Entry point | Example |
258
- | --- | --- | --- |
259
- | TypeScript | `clients/typescript/index.mjs` + `index.d.ts` | `node clients/typescript/examples/fake-session.mjs` |
260
- | Python | `clients/python/zmr_client.py` + `pyproject.toml` | `python3 clients/python/examples/fake_session.py` |
261
- | Go | `clients/go/zmr/client.go` | `go run ./clients/go/examples/fake-session` |
262
- | Rust | `clients/rust/src/lib.rs` | `cargo run --manifest-path clients/rust/Cargo.toml --example fake_session` |
263
- | Swift | `clients/swift/Sources/ZMRClient` | `swift build --package-path clients/swift` |
264
- | Kotlin | `clients/kotlin/src/main/kotlin/dev/zmr` | `gradle -p clients/kotlin build` |
265
-
266
- See [clients/README.md](clients/README.md), [docs/clients.md](docs/clients.md),
267
- and [docs/client-installation.md](docs/client-installation.md).
268
-
269
- ## Platform Support
189
+ ## Platform support
270
190
 
271
191
  | Target | Status | Notes |
272
192
  | --- | --- | --- |
273
193
  | Android emulator | Supported | ADB/UI Automator, optional Android shim, emulator lifecycle helpers |
274
194
  | Android physical device | Supported | Requires ADB connection and app build/install surface |
275
- | iOS simulator | Supported | `simctl` plus app-local XCTest/XCUIAutomation shim for native selector actions, native waits, and bounded snapshots |
276
- | iOS physical device | Supported, validate locally | `devicectl` lifecycle plus app-local XCTest/XCUIAutomation shim; run pilots on your own app/device before relying on it in CI |
277
- | Cloud device farms | Not included | ZMR is focused on local and self-managed device targets in this preview |
195
+ | iOS simulator | Supported | `simctl` plus app-local XCTest/XCUIAutomation shim for native selector actions |
196
+ | iOS physical device | Supported, validate locally | `devicectl` lifecycle plus XCTest shim; pilot on your own app/device before relying on it in CI |
197
+ | Cloud device farms | Not included | ZMR focuses on local and self-managed device targets in this preview |
198
+
199
+ Slow CI hardware can extend the iOS shim cold-build timeout with
200
+ `ZMR_IOS_SHIM_TIMEOUT_MS`. Current release: `0.2.1` developer preview.
201
+ Protocol version: `2026-04-28`.
202
+
203
+ ## Optional protocol clients
278
204
 
279
- Current release: `0.1.8` developer preview. Protocol version:
280
- `2026-04-28`.
205
+ TypeScript and Python clients are the common starting points; Go, Rust, Swift,
206
+ and Kotlin reference clients embed the same JSON-RPC protocol from those
207
+ ecosystems. All are thin wrappers around `zmr serve --transport stdio`. See
208
+ [docs/clients.md](docs/clients.md) and
209
+ [docs/client-installation.md](docs/client-installation.md).
281
210
 
282
211
  ## Documentation
283
212
 
284
- - [FEATURES.md](FEATURES.md): complete feature list and limitations
213
+ **For agents**
214
+
215
+ - [docs/ai-agents.md](docs/ai-agents.md): JSON-RPC and MCP agent workflows
216
+ - [docs/agent-discovery.md](docs/agent-discovery.md): agent-led discovery, `zmr explore`/`discover`/`draft`, and the trace-to-test loop
217
+ - [skills/zmr-mobile-testing/SKILL.md](skills/zmr-mobile-testing/SKILL.md): reusable agent skill
218
+
219
+ **For test authors**
220
+
285
221
  - [docs/install.md](docs/install.md): source, npm, Homebrew, and app setup
286
222
  - [docs/frameworks.md](docs/frameworks.md): React Native, Expo, Flutter, and native app guidance
287
- - [docs/expo-smoke.md](docs/expo-smoke.md): reproducible Expo and iOS smoke test
288
- - [docs/production-readiness.md](docs/production-readiness.md): release, reliability, framework, and agent-readiness gates
289
- - [docs/app-integration.md](docs/app-integration.md): app-side Android/iOS shims
290
223
  - [docs/scenario-authoring.md](docs/scenario-authoring.md): selectors, waits, and scenario design
291
- - [docs/agent-discovery.md](docs/agent-discovery.md): agent-led discovery and scenario authoring loop
224
+ - [docs/app-integration.md](docs/app-integration.md): app-side Android/iOS shims
225
+ - [docs/expo-smoke.md](docs/expo-smoke.md): reproducible Expo and iOS smoke test
226
+ - [docs/benchmarking.md](docs/benchmarking.md): repeat-run gates, reports, device matrix, baselines
227
+
228
+ **Reference**
229
+
230
+ - [FEATURES.md](FEATURES.md): complete feature list and limitations
292
231
  - [docs/protocol.md](docs/protocol.md): JSON-RPC methods and schemas
293
- - [docs/ai-agents.md](docs/ai-agents.md): JSON-RPC and MCP agent workflows
294
- - [docs/clients.md](docs/clients.md): language client guide
295
- - [docs/client-installation.md](docs/client-installation.md): npm, Homebrew, TS, Python, Go, Rust, Swift, and Kotlin setup
296
232
  - [docs/trace-privacy.md](docs/trace-privacy.md): safe trace export
233
+ - [docs/production-readiness.md](docs/production-readiness.md): release, reliability, and agent-readiness gates
297
234
  - [docs/troubleshooting.md](docs/troubleshooting.md): common setup and runtime issues
298
- - [skills/zmr-mobile-testing/SKILL.md](skills/zmr-mobile-testing/SKILL.md): reusable agent skill
235
+ - [docs/benchmarks](docs/benchmarks/README.md): public-safe benchmark evidence
299
236
 
300
237
  ## License
301
238
 
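The README diff above describes scenarios as plain JSON that agents and build scripts can generate and validate without a second DSL. A minimal Python sketch of that idea follows, using only the action names and fields that appear in the README's login smoke example. The helper names `login_smoke` and `clears_state_before_launch` are hypothetical, and the ordering check is a local sanity test for illustration, not something ZMR is stated to enforce.

```python
import json


def login_smoke(app_id: str, email: str) -> dict:
    """Build the README's login smoke scenario as a plain dict."""
    return {
        "name": "Login smoke",
        "appId": app_id,
        "steps": [
            {"action": "clearState"},
            {"action": "launch"},
            {"action": "assertHealthy", "timeoutMs": 5000},
            {"action": "tap", "selector": {"resourceId": "email"}},
            {"action": "typeText", "text": email},
            {"action": "tap", "selector": {"text": "Login"}},
            {"action": "waitVisible", "selector": {"text": "Welcome"}, "timeoutMs": 30000},
        ],
    }


def clears_state_before_launch(scenario: dict) -> bool:
    """Hypothetical local check: clearState must come before launch."""
    actions = [step["action"] for step in scenario["steps"]]
    if "clearState" not in actions or "launch" not in actions:
        return False
    return actions.index("clearState") < actions.index("launch")


scenario = login_smoke("com.example.mobiletest", "user@example.com")
text = json.dumps(scenario, indent=2)  # the JSON a build script would write to disk
print(text)
print(clears_state_before_launch(scenario))
```

Written to `.zmr/login-smoke.json`, the output could then be checked with the `zmr validate --json .zmr/login-smoke.json` command shown in the README.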
package/clients/kotlin/README.md CHANGED
@@ -27,7 +27,7 @@ gradle -p clients/kotlin runFakeSession \
27
27
  ```
28
28
 
29
29
  ```kotlin
30
- implementation(files("path/to/zeno-mobile-runner/clients/kotlin/build/libs/zmr-client-0.1.8.jar"))
30
+ implementation(files("path/to/zeno-mobile-runner/clients/kotlin/build/libs/zmr-client-0.2.1.jar"))
31
31
  ```
32
32
 
33
33
  ```kotlin
package/clients/kotlin/build.gradle.kts CHANGED
@@ -4,7 +4,7 @@ plugins {
4
4
  }
5
5
 
6
6
  group = "dev.zmr"
7
- version = "0.1.8"
7
+ version = "0.2.1"
8
8
 
9
9
  kotlin {
10
10
  jvmToolchain(17)
package/clients/python/pyproject.toml CHANGED
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
4
4
 
5
5
  [project]
6
6
  name = "zmr-client"
7
- version = "0.1.8.dev1"
7
+ version = "0.2.1.dev1"
8
8
  description = "Python JSON-RPC client for Zeno Mobile Runner."
9
9
  requires-python = ">=3.9"
10
10
  license = { text = "MIT" }
package/clients/rust/Cargo.lock CHANGED
@@ -100,7 +100,7 @@ checksum = "b8848ee67ecc8aedbaf3e4122217aff892639231befc6a1b58d29fff4c2cabaa"
100
100
 
101
101
  [[package]]
102
102
  name = "zmr-client"
103
- version = "0.1.8"
103
+ version = "0.2.1"
104
104
  dependencies = [
105
105
  "serde",
106
106
  "serde_json",
package/clients/rust/Cargo.toml CHANGED
@@ -1,6 +1,6 @@
1
1
  [package]
2
2
  name = "zmr-client"
3
- version = "0.1.8"
3
+ version = "0.2.1"
4
4
  edition = "2021"
5
5
  license = "MIT"
6
6
  description = "Rust JSON-RPC client for Zeno Mobile Runner."
package/clients/typescript/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@zmr/client",
3
- "version": "0.1.8",
3
+ "version": "0.2.1",
4
4
  "type": "module",
5
5
  "main": "index.mjs",
6
6
  "types": "index.d.ts",
package/docs/agent-discovery.md CHANGED
@@ -11,6 +11,16 @@ trace-backed, not an unbounded crawler: it does not launch devices, invent
11
11
  missing actions, discover credentials, or commit files. Keep autonomous
12
12
  planning in the agent, and keep ZMR as the deterministic mobile control plane.
13
13
 
14
+ ```mermaid
15
+ flowchart LR
16
+ SESSION["Live agent session<br/>or zmr run"] --> TRACE["Trace directory"]
17
+ TRACE --> DISCOVER["zmr discover / draft / explore<br/>--from-trace"]
18
+ DISCOVER --> CANDIDATE["Scenario candidate<br/>.zmr/discovered/*.json"]
19
+ CANDIDATE --> REVIEW["Human / agent review"]
20
+ REVIEW --> VALIDATE["zmr validate --json"]
21
+ VALIDATE --> CI["zmr run in CI<br/>report.html · junit.xml"]
22
+ ```
23
+
14
24
  ## Recommended Loop
15
25
 
16
26
  1. Validate local setup: