zig-mobile-runner 0.1.0
- package/CHANGELOG.md +484 -0
- package/CONTRIBUTING.md +42 -0
- package/FEATURES.md +112 -0
- package/LICENSE +21 -0
- package/README.md +255 -0
- package/SECURITY.md +34 -0
- package/build.zig +38 -0
- package/build.zig.zon +7 -0
- package/clients/README.md +144 -0
- package/clients/go/README.md +24 -0
- package/clients/go/examples/fake-session/main.go +93 -0
- package/clients/go/go.mod +3 -0
- package/clients/go/zmr/client.go +432 -0
- package/clients/kotlin/README.md +35 -0
- package/clients/kotlin/build.gradle.kts +35 -0
- package/clients/kotlin/settings.gradle.kts +15 -0
- package/clients/kotlin/src/main/kotlin/dev/zmr/FakeSession.kt +86 -0
- package/clients/kotlin/src/main/kotlin/dev/zmr/ZmrClient.kt +67 -0
- package/clients/python/README.md +29 -0
- package/clients/python/examples/fake_session.py +48 -0
- package/clients/python/pyproject.toml +13 -0
- package/clients/python/zmr_client.py +202 -0
- package/clients/rust/Cargo.lock +107 -0
- package/clients/rust/Cargo.toml +10 -0
- package/clients/rust/README.md +19 -0
- package/clients/rust/examples/fake_session.rs +70 -0
- package/clients/rust/src/lib.rs +461 -0
- package/clients/swift/Package.swift +16 -0
- package/clients/swift/README.md +36 -0
- package/clients/swift/Sources/ZMRClient/ZMRClient.swift +114 -0
- package/clients/swift/Sources/ZMRFakeSession/main.swift +86 -0
- package/clients/typescript/README.md +34 -0
- package/clients/typescript/examples/fake-session.mjs +36 -0
- package/clients/typescript/index.d.ts +144 -0
- package/clients/typescript/index.mjs +192 -0
- package/clients/typescript/package.json +8 -0
- package/docs/adr/0001-agent-native-runner-boundary.md +31 -0
- package/docs/adr/0002-app-local-zmr-contract.md +39 -0
- package/docs/adr/0003-ios-simulator-xctest-shim.md +41 -0
- package/docs/adr/0004-benchmark-claims-and-baseline-collection.md +37 -0
- package/docs/adr/README.md +12 -0
- package/docs/ai-agents.md +156 -0
- package/docs/app-integration.md +316 -0
- package/docs/benchmarking.md +275 -0
- package/docs/client-installation.md +141 -0
- package/docs/clients.md +98 -0
- package/docs/config.md +175 -0
- package/docs/demo.md +259 -0
- package/docs/dsl.md +57 -0
- package/docs/install.md +233 -0
- package/docs/market-positioning.md +70 -0
- package/docs/npm.md +359 -0
- package/docs/protocol-fixtures/README.md +8 -0
- package/docs/protocol-fixtures/core-session.requests.jsonl +8 -0
- package/docs/protocol-fixtures/core-session.responses.jsonl +8 -0
- package/docs/protocol-versioning.md +65 -0
- package/docs/protocol.md +560 -0
- package/docs/publication.md +77 -0
- package/docs/release-audit.md +99 -0
- package/docs/release-candidate.md +111 -0
- package/docs/release-evidence.md +188 -0
- package/docs/release-notes-template.md +58 -0
- package/docs/roadmap.md +334 -0
- package/docs/scenario-authoring.md +88 -0
- package/docs/shipping.md +170 -0
- package/docs/trace-privacy.md +88 -0
- package/docs/troubleshooting.md +256 -0
- package/examples/android-app-auth-probe.json +89 -0
- package/examples/android-app-error-state.json +13 -0
- package/examples/android-app-login-smoke.json +192 -0
- package/examples/android-app-onboarding.json +12 -0
- package/examples/android-app-referral-deep-link.json +12 -0
- package/examples/android-shim-smoke.json +19 -0
- package/examples/demo-failure.json +12 -0
- package/examples/demo-fake.json +14 -0
- package/examples/ios-dev-client-open-link.json +26 -0
- package/examples/ios-dev-client-route-snapshot.json +24 -0
- package/examples/ios-shim-smoke.json +23 -0
- package/examples/ios-smoke.json +9 -0
- package/go.work +3 -0
- package/npm/agents.mjs +183 -0
- package/npm/app-config.mjs +95 -0
- package/npm/build-zmr.mjs +21 -0
- package/npm/commands.mjs +104 -0
- package/npm/generated-files.mjs +50 -0
- package/npm/index.mjs +75 -0
- package/npm/init-app.mjs +80 -0
- package/npm/package-scripts.mjs +72 -0
- package/npm/postinstall.mjs +21 -0
- package/npm/scaffold.mjs +179 -0
- package/npm/scenarios.mjs +93 -0
- package/npm/setup.mjs +69 -0
- package/npm/wizard.mjs +117 -0
- package/npm/zmr.mjs +23 -0
- package/package.json +114 -0
- package/prebuilds/darwin-arm64/zmr +0 -0
- package/prebuilds/darwin-x64/zmr +0 -0
- package/prebuilds/linux-arm64/zmr +0 -0
- package/prebuilds/linux-x64/zmr +0 -0
- package/schemas/README.md +26 -0
- package/schemas/action-result.schema.json +27 -0
- package/schemas/capabilities-output.schema.json +98 -0
- package/schemas/devices-output.schema.json +25 -0
- package/schemas/doctor-output.schema.json +51 -0
- package/schemas/explain-output.schema.json +51 -0
- package/schemas/import-output.schema.json +23 -0
- package/schemas/init-output.schema.json +71 -0
- package/schemas/json-rpc.schema.json +55 -0
- package/schemas/release-manifest.schema.json +43 -0
- package/schemas/release-readiness-output.schema.json +127 -0
- package/schemas/run-output.schema.json +43 -0
- package/schemas/scenario.schema.json +128 -0
- package/schemas/schemas-output.schema.json +26 -0
- package/schemas/semantic-snapshot.schema.json +116 -0
- package/schemas/snapshot.schema.json +60 -0
- package/schemas/trace-event.schema.json +14 -0
- package/schemas/trace-manifest.schema.json +59 -0
- package/schemas/validate-output.schema.json +42 -0
- package/schemas/version-output.schema.json +23 -0
- package/schemas/zmr-config.schema.json +75 -0
- package/scripts/android-emulator.sh +126 -0
- package/scripts/assert-ios-physical-ready.sh +213 -0
- package/scripts/benchmark-command.sh +307 -0
- package/scripts/benchmark.sh +359 -0
- package/scripts/benchmark_gate.py +117 -0
- package/scripts/benchmark_result_row.py +88 -0
- package/scripts/compare-benchmarks.py +288 -0
- package/scripts/create-android-demo-app.sh +342 -0
- package/scripts/create-ios-demo-app.sh +261 -0
- package/scripts/demo-android-real.sh +232 -0
- package/scripts/demo-ios-real.sh +270 -0
- package/scripts/demo.sh +464 -0
- package/scripts/device-matrix.sh +338 -0
- package/scripts/ensure-ios-shim-target.rb +237 -0
- package/scripts/install-android-shim.sh +281 -0
- package/scripts/install-ios-shim.sh +589 -0
- package/scripts/pilot-gate.sh +560 -0
- package/scripts/release-readiness.py +838 -0
- package/scripts/release-readiness.sh +91 -0
- package/scripts/run-android-pilot.sh +561 -0
- package/scripts/run-ios-pilot.sh +509 -0
- package/shims/android/README.md +21 -0
- package/shims/android/ZMRShimInstrumentedTest.java +152 -0
- package/shims/android/protocol.md +18 -0
- package/shims/ios/README.md +50 -0
- package/shims/ios/ZMRShim.swift +110 -0
- package/shims/ios/ZMRShimUITestCase.swift +475 -0
- package/shims/ios/protocol.md +74 -0
- package/skills/zmr-mobile-testing/SKILL.md +127 -0
- package/src/android.zig +344 -0
- package/src/android_device_info.zig +99 -0
- package/src/android_emulator.zig +154 -0
- package/src/android_screen_recording.zig +112 -0
- package/src/android_shell.zig +112 -0
- package/src/bundle.zig +124 -0
- package/src/bundle_redaction.zig +272 -0
- package/src/bundle_tar.zig +123 -0
- package/src/cli_devices.zig +97 -0
- package/src/cli_doctor.zig +114 -0
- package/src/cli_import.zig +70 -0
- package/src/cli_info.zig +39 -0
- package/src/cli_init.zig +72 -0
- package/src/cli_output.zig +467 -0
- package/src/cli_run.zig +259 -0
- package/src/cli_serve.zig +287 -0
- package/src/cli_trace.zig +111 -0
- package/src/cli_validate.zig +41 -0
- package/src/command.zig +211 -0
- package/src/config.zig +305 -0
- package/src/config_diagnostics.zig +212 -0
- package/src/config_paths.zig +49 -0
- package/src/device_registry.zig +37 -0
- package/src/doctor.zig +412 -0
- package/src/doctor_hints.zig +52 -0
- package/src/errors.zig +55 -0
- package/src/fake_device.zig +163 -0
- package/src/health.zig +28 -0
- package/src/importer.zig +343 -0
- package/src/importer_json.zig +100 -0
- package/src/importer_model.zig +103 -0
- package/src/ios.zig +399 -0
- package/src/ios_devices.zig +219 -0
- package/src/ios_lifecycle.zig +72 -0
- package/src/ios_shim.zig +242 -0
- package/src/ios_snapshot.zig +20 -0
- package/src/json_fields.zig +80 -0
- package/src/json_rpc.zig +150 -0
- package/src/json_rpc_methods.zig +318 -0
- package/src/json_rpc_observation.zig +31 -0
- package/src/json_rpc_params.zig +52 -0
- package/src/json_rpc_protocol.zig +110 -0
- package/src/json_rpc_trace.zig +73 -0
- package/src/main.zig +135 -0
- package/src/mcp.zig +234 -0
- package/src/mcp_protocol.zig +64 -0
- package/src/mcp_trace.zig +83 -0
- package/src/report.zig +346 -0
- package/src/report_html.zig +63 -0
- package/src/report_values.zig +27 -0
- package/src/run_options.zig +152 -0
- package/src/runner.zig +280 -0
- package/src/runner_actions.zig +109 -0
- package/src/runner_config.zig +6 -0
- package/src/runner_diagnostics.zig +268 -0
- package/src/runner_events.zig +170 -0
- package/src/runner_native.zig +88 -0
- package/src/runner_waits.zig +300 -0
- package/src/scaffold.zig +472 -0
- package/src/scenario.zig +346 -0
- package/src/scenario_fields.zig +50 -0
- package/src/schema_registry.zig +53 -0
- package/src/selector.zig +84 -0
- package/src/semantic.zig +171 -0
- package/src/trace.zig +315 -0
- package/src/trace_json.zig +340 -0
- package/src/trace_summary.zig +218 -0
- package/src/trace_summary_diagnostic.zig +202 -0
- package/src/types.zig +120 -0
- package/src/uiautomator.zig +164 -0
- package/src/validation.zig +187 -0
- package/src/version.zig +22 -0
- package/viewer/app.js +373 -0
- package/viewer/index.html +126 -0
- package/viewer/parser.js +233 -0
- package/viewer/styles.css +585 -0
@@ -0,0 +1,99 @@
# Release Completion Audit

This audit maps the release objective to concrete evidence. Treat it as the
source of truth before tagging a public release or making competitive claims.

Current status: **ready for `0.1.0` developer preview**. Not production-stable.

Latest release-candidate evidence:

- Evidence: `traces/release-candidate/20260517-180801/evidence.jsonl`
- Summary: `traces/release-candidate/20260517-180801/summary.md`
- Dev preview: `ready`
- Production: `blocked`
- Market claim: `blocked`

The latest local candidate passed its generated public Android demo app build
and generated public iOS simulator demo. Production and market claims remain
blocked until the hardware and benchmark evidence below exists.

Latest full local gate verification:

- Date: `2026-05-18`
- Command: `./scripts/release-gate.sh`
- Result: passed
- Zig tests: 214/214 passed with `zig test src/main.zig -target aarch64-macos.15.0`
- Coverage: `94.40%` line coverage
- Release artifacts: built and verified
- Release smoke: passed on the local macOS archive
- npm package dry-run: passed

Additional local pilot evidence:

- `traces/hardware-pilots/20260517-evidence.jsonl`
- Public generated iOS simulator lifecycle pilot: 20/20 passed, p95 10392ms.
- Public generated iOS simulator selector-shim pilot: 20/20 passed, p95 4175ms.
- Public generated Android emulator pilot: 20/20 passed, p95 12596ms, after
  cleaning generated build artifacts and hardening the generated demo's first
  screen wait from 10s to 30s.

This strengthens the public generated-demo evidence for both platforms, but it
does not replace the required real app/device pilots for production readiness.

## Prompt-to-artifact checklist

| Requirement | Evidence | Current state |
| --- | --- | --- |
| Leaner, easier-to-understand core | Focused modules under `src/cli_*`, `src/runner_*`, `src/json_rpc_*`, `src/ios_*`, `src/android_*`, plus focused tests | Implemented and covered by `zig test src/main.zig -target aarch64-macos.15.0` |
| Developer-friendly first run | `npm install`, `npx zmr-wizard`, `.zmr/config.json`, smoke scenarios, package scripts | Implemented; covered by `tests/npm-package.test.mjs` and `tests/init-app-test.sh` |
| AI-agent usability | JSON-RPC, MCP, semantic snapshots, live trace events, schemas, clients, agent skill | Implemented; covered by protocol fixtures, client tests, MCP tests, and docs |
| Public package hygiene | npm files whitelist, public-safety scan, private trace exclusion | Implemented; covered by `tests/npm-package.test.mjs` and `tests/public-safety-test.sh` |
| App-install package surface | npm tarball exposes app-facing commands and excludes maintainer-only release tooling | Implemented; covered by `tests/npm-package.test.mjs` and `npm pack --dry-run` |
| Android/iOS local demos | Public generated Android and iOS demo scripts | Implemented; included in release-candidate dev-preview evidence |
| Release artifacts | archives, checksums, SBOM, Homebrew formula, npm dry-run | Implemented; covered by `./scripts/release-gate.sh` |
| Dev-preview release readiness | `zmr-release-readiness --target dev-preview --json` | Ready when `satisfied` includes local release gate plus public Android/iOS demos |
| Production release readiness | `zmr-release-readiness --target production --json` | Blocked until physical iOS readiness and repeated real-app/device pilots pass with structured thresholds, app-id, app-root, app-artifact, and device evidence |
| Competitive market claim | `zmr-release-readiness --target market-claim --json` | Blocked until same-device benchmark evidence exists |

## Required evidence before production

Run these before claiming production readiness:

```bash
zmr-release-readiness --evidence traces/release-candidate/<run>/evidence.jsonl \
  --evidence /path/to/private-app/traces/zmr-pilots/evidence.jsonl \
  --target production --json
```

Production readiness requires:

- physical iOS readiness with concrete device evidence
- Android hardware pilot with structured `runs >= 20`, `minPassRate >= 100`, `maxFailures <= 0`, app-id evidence, app-root evidence, and Android device evidence
- iOS simulator hardware pilot with structured `runs >= 20`, `minPassRate >= 100`, `maxFailures <= 0`, app-id evidence, app-root evidence, iOS app-artifact evidence, and iOS simulator device evidence
- iOS physical hardware pilot with structured `runs >= 20`, `minPassRate >= 100`, `maxFailures <= 0`, app-id evidence, app-root evidence, iOS app-artifact evidence, and physical device evidence
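
The threshold part of the list above can be sketched as a small row check. This is an illustration only, not the logic in `scripts/release-readiness.py`: the function name `pilot_row_problems` is hypothetical, the app-id and app-root field names are the ones listed in `docs/release-evidence.md`, and the sketch ignores device evidence, iOS app-artifact evidence, and evidence carried only as flags in a recorded command.

```python
def _number(value):
    """Return value if it is a real number (bools excluded), else None."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None
    return value


def pilot_row_problems(row: dict) -> list[str]:
    """List the reasons a hardware pilot row misses the production thresholds."""
    problems = []

    runs = _number(row.get("runs"))
    if runs is None or runs < 20:
        problems.append("runs must be >= 20")

    rate = _number(row.get("minPassRate"))
    if rate is None or rate < 100:
        problems.append("minPassRate must be >= 100")

    failures = _number(row.get("maxFailures"))
    if failures is None or failures > 0:
        problems.append("maxFailures must be <= 0")

    # Structured app-id and app-root fields; command-flag evidence is not parsed here.
    if not (row.get("androidAppId") or row.get("iosAppId")):
        problems.append("missing app-id evidence")
    if not (row.get("androidAppRoot") or row.get("iosAppRoot") or row.get("appRoot")):
        problems.append("missing app-root evidence")

    return problems
```

An empty list means the row meets the structured thresholds checked here; it says nothing about the requirements the sketch skips.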

Market-claim readiness additionally requires same-device benchmark evidence
with candidate and baseline name evidence against the specific tool or runner
being discussed, results path evidence for the source benchmark rows, and
measured result evidence proving the thresholds were met. The benchmark
comparison row must also include `sameContext: true` and structured platform,
device, app id, scenario, and app-build context, with at least 20 candidate
runs and at least 20 baseline runs.

## Release wording

Use this wording for the current release:

> ZMR `0.1.0` is a public developer preview for local, agent-native
> mobile automation. It is not production-stable yet. Production and competitive
> claims require the release-readiness evidence gates.

Do not say:

- production-ready
- stable `1.0`
- better than another runner
- fully certified on physical iOS

unless the matching evidence exists in `evidence.jsonl` and
`zmr-release-readiness` returns `ready` for that target.
@@ -0,0 +1,111 @@
# Release Candidate Gate

`scripts/release-candidate.sh` is the evidence-producing gate for deciding
whether a build is ready to publish as a dev-preview release candidate. It
wraps the existing release checks, adds public Android/iOS demo evidence, and
can optionally require private app/device pilots.

## Local Mode

Run this before opening or tagging a release candidate when no devices are
attached:

```bash
./scripts/release-candidate.sh --mode local
```

Local mode runs `./scripts/release-gate.sh`, builds the generated public
Android demo APK, and runs the generated public iOS simulator demo five times
by default. If an Android AVD is available, pass `--local-android-avd <name>`
to run the generated Android demo app on an emulator instead of only building
it:

```bash
./scripts/release-candidate.sh --mode local \
  --local-android-avd Small_Phone \
  --local-android-demo-runs 5
```

Use `--local-android-device <serial>` when the emulator serial is not
`emulator-5554`. Override the iOS demo run count with
`--local-ios-demo-runs <n>` when collecting slower or faster release-candidate
evidence.

Each run of the gate writes:

- `evidence.jsonl`: one row per gate step with command, status, mode,
  duration, structured app/device provenance, and structured threshold fields
  for hardware pilot rows.
- `summary.md`: a human-readable checklist suitable for release notes or PR
  review, including the matching `zmr-release-readiness` command and its
  blocked requirement output.
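
For a quick look at an `evidence.jsonl` file before running the readiness check, a tally by status is often enough. The sketch below assumes each row is a JSON object with a `status` key; the list above names the fields in prose, so the exact key spelling is an assumption, and `tally_evidence` is a hypothetical helper rather than part of the package.

```python
import json
from collections import Counter


def tally_evidence(jsonl_text: str) -> Counter:
    """Count evidence rows by status, skipping blank lines."""
    counts = Counter()
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        row = json.loads(line)  # raises json.JSONDecodeError on a malformed row
        counts[row.get("status", "unknown")] += 1
    return counts
```

A malformed row raises here on purpose; silently skipping it would hide exactly the kind of evidence problem the gate is meant to surface.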

Turn that evidence into an explicit release decision:

```bash
zmr-release-readiness --evidence traces/release-candidate/<run>/evidence.jsonl \
  --target dev-preview
```

For production or market-claim checks, keep private app pilot evidence in the
app repository and pass it as a second evidence file:

```bash
zmr-release-readiness \
  --evidence traces/release-candidate/<run>/evidence.jsonl \
  --evidence /path/to/app/traces/zmr-pilots/evidence.jsonl \
  --target production \
  --json
```

`dev-preview` requires the local release gate plus public Android and iOS demo
evidence. `production` additionally requires repeated real app Android, iOS
simulator, and physical iOS pilots. `market-claim` additionally requires a
same-host/device benchmark comparison before claiming leadership over other
mobile E2E runners.
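
The three targets are cumulative, which can be modelled as an ordered list where each target adds requirements to the one before it. The sketch below is only a model of that paragraph: the requirement labels are illustrative stand-ins rather than the names `zmr-release-readiness` actually reports, and `required_for` and `blocked_for` are hypothetical helpers.

```python
# Targets in increasing order of strictness.
TARGET_ORDER = ["dev-preview", "production", "market-claim"]

# Requirements each target adds on top of the previous one (illustrative labels).
ADDED_REQUIREMENTS = {
    "dev-preview": ["local release gate", "public Android demo", "public iOS demo"],
    "production": ["Android pilot", "iOS simulator pilot", "physical iOS pilot"],
    "market-claim": ["benchmark comparison"],
}


def required_for(target: str) -> list[str]:
    """Return every requirement for a target, including inherited ones."""
    index = TARGET_ORDER.index(target)  # raises ValueError for an unknown target
    required = []
    for name in TARGET_ORDER[: index + 1]:
        required.extend(ADDED_REQUIREMENTS[name])
    return required


def blocked_for(target: str, satisfied: set[str]) -> list[str]:
    """Return the requirements still unsatisfied for a target."""
    return [name for name in required_for(target) if name not in satisfied]
```

With only the three dev-preview labels satisfied, `blocked_for("dev-preview", ...)` is empty while `blocked_for("production", ...)` still lists the three pilots.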

## Hardware Mode

Run this before claiming real app/device reliability:

```bash
./scripts/release-candidate.sh --mode hardware \
  --android-app-root /path/to/mobile-app \
  --android-app-id com.example.mobiletest \
  --android-device emulator-5554 \
  --ios-app-root /path/to/mobile-app \
  --ios-app-path /path/to/mobile-app/build/Debug-iphonesimulator/Sample.app \
  --ios-app-id com.example.mobiletest \
  --ios-device booted \
  --ios-shim /path/to/mobile-app/.zmr/ios-shim \
  --xcrun xcrun \
  --ios-physical-app-root /path/to/mobile-app \
  --ios-physical-app-path /path/to/mobile-app/build/Release-iphoneos/Sample.ipa \
  --ios-physical-app-id com.example.mobiletest \
  --ios-physical-device <physical-device-id> \
  --ios-physical-shim /path/to/mobile-app/.zmr/ios-shim
```

Hardware mode delegates evidence collection through `scripts/pilot-gate.sh`.
The Android+iOS simulator gate writes the Android and simulator rows, and the
physical iOS gate writes both `physical iOS readiness` and
`iOS physical hardware pilot` rows. If hardware mode is run with a custom
`--xcrun` path, the release-candidate gate forwards that path to the
physical-readiness check as well as the iOS pilots. For
`--ios-physical-device`, use the `serial` value from:

```bash
zmr devices --json --platform ios --ios-device-type physical
```

The physical iOS step must use a connected, trusted device with Developer Mode
enabled. Missing physical iOS evidence means physical iOS reliability is not
shipped.

## Full Candidate

Run both local and hardware gates:

```bash
./scripts/release-candidate.sh --mode all --runs 20
```

Use `--dry-run` to inspect the command plan and generate planned
`evidence.jsonl` / `summary.md` files without executing device or release
commands.
@@ -0,0 +1,188 @@
# Release Evidence Checklist

Use this checklist before publishing a release or making reliability/performance
claims. The goal is to map every public product claim to a concrete artifact
that another maintainer can inspect.

After running `scripts/release-candidate.sh`, use
`zmr-release-readiness --evidence <evidence.jsonl> --target dev-preview` to
check whether the evidence supports a dev-preview release. Use
`--target production` for real app/device readiness and `--target market-claim`
before making same-device competitive claims.

The JSON output includes a `requirements` array. Each row reports whether a
requirement is `satisfied`, `missing`, `planned`, `failed`, or `insufficient`,
and includes the matching evidence row or reason when available. It also
includes `passed`, `satisfied`, `recommendedWording`, and `claimLimitations` so
agents can summarize readiness without accidentally upgrading a dev-preview
result into a production or competitive claim.

- `blocked` lists every unsatisfied requirement, including missing, failed,
  planned, and insufficient evidence rows.
- `passed` lists raw evidence row names whose command status is passed; a row
  can still be insufficient for a release target.
- `missing` lists absent or unreadable evidence.
- `insufficient` lists passed evidence rows that do not meet threshold, app,
  device, target, or benchmark proof requirements.
- `satisfied` lists validated requirement names after threshold, target, and
  app-id checks.

Agents should use `satisfied`, `blocked`, `missing`, `insufficient`,
`recommendedWording`, and `claimLimitations` instead of scraping the
human-readable text output.

`nextSteps` is the shortest executable remediation plan; one step can cover
multiple blocked requirements when a single command writes several evidence
rows. Each `nextSteps` item includes `covers`, a list of blocked requirement or
evidence issue labels the step is intended to resolve, plus a legacy `command`
string and a structured `commands` array. Agents should execute `commands` in
order when a step needs multiple shell commands.

Malformed evidence JSONL still returns blocked JSON when `--json` is set, with
the invalid file and line listed in `missing`, `blocked`, and `nextSteps`.
Malformed evidence JSONL is reported as `invalid evidence` in
`claimLimitations` and `recommendedWording`; it is not treated as missing
evidence unless required rows are also absent.

Missing evidence next steps are target-aware. For `dev-preview`, a missing
file points to `./scripts/release-candidate.sh --mode local` for source-checkout
release verification. For missing `production` or `market-claim` evidence,
readiness returns two app-install-safe commands via `zmr-pilot-gate`: one grouped
Android+iOS simulator pilot, and one physical iOS pilot that also writes the
`physical iOS readiness` row. For `market-claim`, it then appends
`zmr-benchmark`, `zmr-benchmark-command`, and `zmr-compare-benchmarks`. Those
commands are available from the npm package and write the pilot and competitive
benchmark rows to the requested evidence file. When the evidence file itself is
missing, that file-level next step covers both the missing file and the
production or market-claim rows its command sequence writes, so agents do not
receive duplicate default pilot commands.

When an evidence file contains failed or planned rows, `blocked` also includes
`failed evidence:` and `planned evidence:` blockers with matching `nextSteps`;
a later passed row does not make those row-level blockers disappear. Those
`nextSteps` reuse the recorded evidence command when the row includes one.
Repeated failed or planned rows are reported once per evidence name.
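
An agent consuming the JSON output might turn `nextSteps` into an ordered plan along these lines. The sketch prefers the structured `commands` array and falls back to the legacy `command` string, as described above. It assumes `nextSteps` is a top-level list of objects and treats each `commands` entry as an opaque value, since this checklist does not spell out the entry shape; `remediation_plan` is a hypothetical helper name.

```python
def remediation_plan(report: dict) -> list[dict]:
    """Flatten readiness `nextSteps` into steps with an ordered command list."""
    plan = []
    for step in report.get("nextSteps", []):
        # Prefer the structured array; fall back to the legacy single command.
        commands = list(step.get("commands") or [])
        if not commands and step.get("command"):
            commands = [step["command"]]
        plan.append({
            "covers": list(step.get("covers", [])),
            "commands": commands,
        })
    return plan
```

The sketch only builds the plan. Whether an agent should execute the commands automatically, especially device pilots, is a separate decision for the caller.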

You can pass `--evidence` more than once. Keep public release-candidate evidence
in this repository and private real-app pilot evidence in the app repository,
then evaluate both together:

```bash
zmr-release-readiness \
  --evidence traces/release-candidate/<run>/evidence.jsonl \
  --evidence /path/to/app/traces/zmr-pilots/evidence.jsonl \
  --target production \
  --json
```

## Core Product Evidence

| Claim | Command | Required Evidence |
| --- | --- | --- |
| Zig core builds and tests | `zig test src/main.zig -target aarch64-macos.15.0` | All tests pass. |
| Coverage stays above the release threshold | `./scripts/coverage.sh` | Coverage is at least 90%. |
| Public demo runs without mobile hardware | `./scripts/demo.sh` | Demo exits zero and writes generic traces under `traces/`. |
| Release archives are reproducible | `./scripts/build-release.sh && ./scripts/verify-release-artifacts.sh` | `dist/RELEASE_MANIFEST.json`, checksums, SBOM, notices, and archives verify. |
| npm package contents are public-safe | `npm pack --dry-run` | Tarball includes only public code, docs, clients, schemas, examples, shims, and prebuilds. |
| Public repo contains no private app references | `bash tests/public-safety-test.sh` | Safety scan exits zero. |
| Public Android demo builds | `scripts/create-android-demo-app.sh --out /tmp/zmr-android-demo` | Signed debug APK and `.zmr/android-smoke.json` are generated and the scenario validates. |
| Public Android demo runs | `zmr-demo-android --out /tmp/zmr-android-demo --device emulator-5554 --avd <avd-name> --runs 5` | Generated app installs on an emulator/device and reports `100%` pass rate with trace artifacts. |
| Public iOS simulator demo runs | `zmr-demo-ios --out /tmp/zmr-ios-demo --device booted --runs 5 --cleanup-build-products` | Generated app runs repeated iOS smoke and shim smoke flows with redacted traces. The cleanup flag removes Xcode `DerivedData` after reports/traces are written. |

## Client Evidence

| Claim | Command | Required Evidence |
| --- | --- | --- |
| TypeScript client can drive ZMR | `node --test tests/typescript-client.test.mjs` | Fake-session client test passes. |
| Python client can drive ZMR | `python3 -W error -m unittest tests/python_client_test.py` | Fake-session client test passes. |
| Go client can drive ZMR | `bash tests/go-client-test.sh` | Go tests and fake-session example pass. |
| Rust client can drive ZMR | `bash tests/rust-client-test.sh` | Rust tests and fake-session example pass. |
| Swift client can drive ZMR | `swift test --package-path clients/swift` | Swift package test passes when Swift is installed. |
| Kotlin client can drive ZMR | `gradle -p clients/kotlin test` | Kotlin/JVM test passes when Gradle is installed. |

## Local Device Evidence

| Claim | Command | Required Evidence |
| --- | --- | --- |
| Android emulator/device path is ready | `zmr doctor --strict --json --config .zmr/config.json` | Android checks are `ok`, or warnings are explicitly documented before release. |
| iOS simulator path is ready | `zmr doctor --strict --json --config .zmr/config.json` | `ios-simulators` and `ios-shim` checks are `ok`. |
| Physical iOS path is ready | `zmr-assert-ios-physical-ready --device <physical-device-id> --xcrun xcrun --evidence-out traces/zmr-pilots/evidence.jsonl` | The requested physical device identifier from `zmr devices` is present with `"ready": true`, and the command appends a `physical iOS readiness` JSONL row. |
| Multi-device matrix works | `zmr-device-matrix --matrix .zmr/device-matrix.json --trace-root traces/zmr-matrix --min-pass-rate 100 --max-failures 0` | `summary.json` reports `passRate: 100.0` and `failed: 0`. |
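
The multi-device matrix row can be checked mechanically once `summary.json` exists. The sketch below assumes `passRate` and `failed` are top-level keys of that file, which the table implies but does not state outright; `matrix_passed` is a hypothetical helper, not a command shipped by the package.

```python
import json


def matrix_passed(summary_text: str,
                  min_pass_rate: float = 100.0,
                  max_failures: int = 0) -> bool:
    """Return True when a device-matrix summary meets both thresholds."""
    summary = json.loads(summary_text)
    return (
        summary["passRate"] >= min_pass_rate
        and summary["failed"] <= max_failures
    )
```

The defaults mirror the `--min-pass-rate 100 --max-failures 0` flags in the table row.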

## Real Pilot Evidence

Run these in a private app checkout with private scenarios, app builds, and raw
traces kept out of the public repository.

| Claim | Command | Required Evidence |
| --- | --- | --- |
| Android and iOS simulator pilots are reliable | `zmr-pilot-gate --android --ios --android-app-root . --android-app-id <android-app-id> --android-device <android-device-id> --ios-app-root . --ios-app-path ./build/Debug-iphonesimulator/Sample.app --ios-app-id <ios-app-id> --ios-device booted --ios-shim ./.zmr/ios-shim --runs 20 --min-pass-rate 100 --max-failures 0 --evidence-out traces/zmr-pilots/evidence.jsonl` | Android and iOS simulator trace roots contain run summaries, selector-grade traces, and redacted bundles; no failures. |
| Physical iOS pilot is reliable | `zmr-pilot-gate --ios --ios-device-type physical --ios-device <physical-device-id> --ios-app-root . --ios-app-path ./build/Release-iphoneos/Sample.ipa --ios-app-id <ios-app-id> --ios-shim ./.zmr/ios-shim --runs 20 --min-pass-rate 100 --max-failures 0 --evidence-out traces/zmr-pilots/evidence.jsonl` | Physical iOS trace root contains selector-grade traces; no failures. |
| ZMR is faster than a local baseline | collect 20 ZMR rows with `zmr-benchmark`, collect 20 baseline rows with `zmr-benchmark-command`, then run `zmr-compare-benchmarks --results traces/bench-comparison/results.jsonl --candidate zmr --baseline baseline --min-candidate-pass-rate 100 --max-candidate-failures 0 --min-mean-speedup 1.25 --min-p95-speedup 1.25 --evidence-out traces/bench-comparison/evidence.jsonl` | Comparison report exits zero and appends a `competitive benchmark comparison` evidence row with candidate/baseline run counts, mean/p95 speedup, and same-context proof against the baseline collected on the same host/device/app build. |
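
To make the `--min-mean-speedup 1.25` and `--min-p95-speedup 1.25` thresholds concrete, here is one plausible way to compute them from raw durations. The definitions are assumptions, not the formulas in `scripts/compare-benchmarks.py`: speedup is taken as baseline divided by candidate, the mean speedup is a ratio of means, and p95 uses the nearest-rank method. All function names are hypothetical.

```python
from statistics import mean


def p95(values: list[float]) -> float:
    """Nearest-rank 95th percentile: the value at rank ceil(0.95 * n)."""
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # integer ceiling of 0.95 * n
    return ordered[rank - 1]


def speedups(candidate_ms: list[float], baseline_ms: list[float]) -> tuple[float, float]:
    """Return (mean speedup, p95 speedup), where values above 1 favor the candidate."""
    return (
        mean(baseline_ms) / mean(candidate_ms),
        p95(baseline_ms) / p95(candidate_ms),
    )


def meets_claim(candidate_ms: list[float], baseline_ms: list[float],
                min_mean: float = 1.25, min_p95: float = 1.25,
                min_runs: int = 20) -> bool:
    """Check run counts and both speedup thresholds."""
    if len(candidate_ms) < min_runs or len(baseline_ms) < min_runs:
        return False
    mean_speedup, p95_speedup = speedups(candidate_ms, baseline_ms)
    return mean_speedup >= min_mean and p95_speedup >= min_p95
```

Under these definitions, 20 candidate runs at 1000 ms against 20 baseline runs at 1500 ms give a speedup of 1.5 on both measures and pass, while a baseline at 1100 ms gives 1.1 and does not.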

When collecting private real-app pilot evidence, write a machine-readable file
that can be evaluated with public release evidence:

```bash
zmr-pilot-gate \
  --android \
  --android-app-root . \
  --android-app-id <android-app-id> \
  --android-device emulator-5554 \
  --ios \
  --ios-app-root . \
  --ios-app-path ./build/Debug-iphonesimulator/Sample.app \
  --ios-app-id <ios-app-id> \
  --ios-device booted \
  --ios-shim ./.zmr/ios-shim \
  --runs 20 \
  --min-pass-rate 100 \
  --max-failures 0 \
  --trace-root traces/zmr-pilots \
  --evidence-out traces/zmr-pilots/evidence.jsonl
```

For physical iOS, run a separate physical-device pilot with
`--ios-device-type physical`; `zmr-pilot-gate` writes both `physical iOS
readiness` and `iOS physical hardware pilot` rows to the same evidence file.
The `physical iOS readiness` row must include concrete physical device evidence
(`iosDeviceId`, `deviceId`, `device`, or a `--device` flag in the recorded
command). Use the physical device identifier from `zmr devices`, not `booted`
or simulator aliases; a generic passed row is reported as `insufficient`.

Each hardware pilot evidence row includes `runs`, `minPassRate`, `maxFailures`,
a concrete app id (`androidAppId` or `iosAppId`, or an explicit app-id flag in
the recorded command), and app root evidence (`androidAppRoot`, `iosAppRoot`,
`appRoot`, or an explicit app-root flag). iOS simulator and physical pilot rows
must also include app artifact evidence (`iosAppPath`, `appPath`,
`--ios-app-path`, or `--app-path`) for the built `.app` or `.ipa` that was
tested.

Pilot threshold evidence must be structured JSON fields: `runs`,
`minPassRate`, and `maxFailures`. The recorded `command` remains useful
provenance for app, device, and rerun instructions, but command flags do not
count for actual pilot outcomes.

`zmr-release-readiness --target production` requires at least 20 runs,
`minPassRate >= 100`, `maxFailures <= 0`, app-id evidence, app-root evidence,
and iOS app-artifact evidence for the corresponding pilot rows. The Android
hardware pilot row requires Android device evidence (`androidDeviceId`,
`deviceId`, `device`, `--android-device`, or `--device`). The iOS simulator
hardware pilot row requires iOS simulator device evidence (`iosDeviceId`,
`deviceId`, `device`, `--ios-device`, or `--device`); `booted` is accepted for
simulator evidence. The iOS physical-device pilot row also requires physical
device evidence (`iosDeviceId`, `deviceId`, `device`, `--ios-device`, or a
concrete `--device` flag), and `booted` is not accepted as a physical device.
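
The pilot row requirements above can be sketched as a small pre-check to run before handing an evidence file to `zmr-release-readiness`. This is a minimal illustration, not the tool's actual logic: it assumes each evidence row is a flat JSON object using the field names listed above, the `kind` labels and the helper names (`flag_value`, `evidence`, `check_pilot_row`) are hypothetical, and it does not detect simulator aliases other than `booted`.

```python
import shlex


def flag_value(command, flags):
    """Value following the first matching flag in a recorded command, else None."""
    tokens = shlex.split(command or "")
    for i, token in enumerate(tokens[:-1]):
        if token in flags:
            return tokens[i + 1]
    return None


def evidence(row, fields, flags):
    """First non-empty structured field, else the value of a matching command flag."""
    for field in fields:
        if row.get(field):
            return row[field]
    return flag_value(row.get("command"), flags)


def is_number(value):
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_pilot_row(row, kind):
    """Return a list of problems; kind is 'android', 'ios-simulator', or 'ios-physical'."""
    problems = []
    platform = "android" if kind == "android" else "ios"

    # Thresholds must be structured fields; command flags do not count.
    if not (is_number(row.get("runs")) and row["runs"] >= 20):
        problems.append("runs must be a structured field >= 20")
    if not (is_number(row.get("minPassRate")) and row["minPassRate"] >= 100):
        problems.append("minPassRate must be a structured field >= 100")
    if not (is_number(row.get("maxFailures")) and row["maxFailures"] <= 0):
        problems.append("maxFailures must be a structured field <= 0")

    # App id and app root may come from a field or from the recorded command.
    if not evidence(row, [f"{platform}AppId"], [f"--{platform}-app-id"]):
        problems.append("missing app-id evidence")
    if not evidence(row, [f"{platform}AppRoot", "appRoot"], [f"--{platform}-app-root"]):
        problems.append("missing app-root evidence")

    # iOS rows also need the built .app or .ipa that was tested.
    if platform == "ios" and not evidence(
        row, ["iosAppPath", "appPath"], ["--ios-app-path", "--app-path"]
    ):
        problems.append("missing iOS app-artifact evidence")

    device = evidence(
        row,
        [f"{platform}DeviceId", "deviceId", "device"],
        [f"--{platform}-device", "--device"],
    )
    if not device:
        problems.append("missing device evidence")
    elif kind == "ios-physical" and device == "booted":
        problems.append("booted is not accepted as a physical device")
    return problems
```

An empty list means the row carries every kind of evidence described above. To check a whole evidence file, parse each line with `json.loads` and pass the resulting object to `check_pilot_row`; how a row identifies which pilot it describes is not specified here, so the caller has to supply `kind`.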

For market-claim readiness, benchmark comparison evidence must include all of
the following:

- The competitive thresholds used to justify the claim:
  `minCandidatePassRate >= 100`, `maxCandidateFailures <= 0`,
  `minMeanSpeedup >= 1.25`, and `minP95Speedup >= 1.25`.
- Candidate name evidence (`candidate`, `candidateName`, or a concrete
  `--candidate` flag) and baseline name evidence (`baseline`, `baselineName`,
  or a concrete `--baseline` flag) so the claim names both compared tools.
- Results path evidence (`results`, `resultsPath`, or a concrete `--results`
  flag) so maintainers can inspect the source benchmark rows.
- Measured result evidence: `candidatePassRate >= minCandidatePassRate`,
  `candidateFailures <= maxCandidateFailures`, `meanSpeedup >= minMeanSpeedup`,
  `p95Speedup >= minP95Speedup`, `candidateRuns >= 20`, and
  `baselineRuns >= 20`. Measured result evidence and sample-size evidence must
  both be structured JSON fields emitted by the comparison tool; command flags
  do not count for those actual outcomes.
- Same benchmark context evidence: `sameContext: true` plus structured
  `context.platform`, `context.device`, `context.appId`, `context.scenario`,
  and `context.appBuild` fields proving the candidate and baseline rows came
  from the same platform, device, app id, scenario, and app build.

A passed comparison row without those thresholds, named-tool evidence, results
evidence, measured result evidence, sample-size evidence, or same-context
evidence is reported as `insufficient`, not ready.
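
The comparison-row requirements can be sketched the same way. This is an illustrative check under stated assumptions, not the logic of `zmr-release-readiness`: it assumes the row is a JSON object, that the thresholds are stored as structured fields, and that `context` is a nested object. The helper names (`named`, `num`, `check_comparison_row`) are hypothetical.

```python
import operator
import shlex

OPS = {">=": operator.ge, "<=": operator.le}

# (threshold field, required bound, measured field, comparison)
RULES = [
    ("minCandidatePassRate", 100, "candidatePassRate", ">="),
    ("maxCandidateFailures", 0, "candidateFailures", "<="),
    ("minMeanSpeedup", 1.25, "meanSpeedup", ">="),
    ("minP95Speedup", 1.25, "p95Speedup", ">="),
]


def named(row, fields, flag):
    """A non-empty structured field, else the value after `flag` in the recorded command."""
    for field in fields:
        if row.get(field):
            return row[field]
    tokens = shlex.split(row.get("command") or "")
    if flag in tokens[:-1]:
        return tokens[tokens.index(flag) + 1]
    return None


def num(row, key):
    """Structured numeric field, or None when it is missing or not a number."""
    value = row.get(key)
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return value
    return None


def check_comparison_row(row):
    """Return a list of problems; an empty list means the row looks sufficient."""
    problems = []
    for threshold_key, bound, measured_key, op in RULES:
        threshold, measured = num(row, threshold_key), num(row, measured_key)
        if threshold is None or not OPS[op](threshold, bound):
            problems.append(f"{threshold_key} must be {op} {bound}")
        elif measured is None or not OPS[op](measured, threshold):
            problems.append(f"{measured_key} must be {op} {threshold_key}")

    # Sample sizes must be structured fields emitted by the comparison tool.
    for key in ("candidateRuns", "baselineRuns"):
        runs = num(row, key)
        if runs is None or runs < 20:
            problems.append(f"{key} must be a structured field >= 20")

    # Names and results path may come from a field or a concrete command flag.
    if not named(row, ["candidate", "candidateName"], "--candidate"):
        problems.append("missing candidate name evidence")
    if not named(row, ["baseline", "baselineName"], "--baseline"):
        problems.append("missing baseline name evidence")
    if not named(row, ["results", "resultsPath"], "--results"):
        problems.append("missing results path evidence")

    context = row.get("context")
    context = context if isinstance(context, dict) else {}
    required = ("platform", "device", "appId", "scenario", "appBuild")
    if row.get("sameContext") is not True or any(not context.get(k) for k in required):
        problems.append("missing same-context evidence")
    return problems
```

Any non-empty result corresponds to a row that the text above describes as `insufficient`, even if the comparison itself passed.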

## Evidence Rules

- Do not publish raw traces from private apps.
- Publish only sanitized summaries, redacted `.zmrtrace` bundles, or generated
  markdown reports that do not include private identifiers or credentials.
- Do not claim physical iOS reliability until the physical iOS pilot evidence
  exists for a connected, trusted, ready device.
- Do not claim speed leadership from generic fake demos. Use app-local
  benchmark rows collected on the same machine and device state.
- Treat missing evidence as not shipped, even when unit tests and package gates
  pass.

@@ -0,0 +1,58 @@
# ZMR Release Notes Template

## Version

`vX.Y.Z`

## Release Type

- Dev preview
- Alpha
- Beta
- Stable

## Highlights

- ...

## Platform Support

- Android:
- iOS:

## Breaking Changes

- None.

## Added

- ...

## Changed

- ...

## Fixed

- ...

## Known Limitations

- ...

## Verification

Paste the release gate output summary:

```text
zig fmt --check build.zig src
bash -n scripts/*.sh tests/*.sh
zig test src/main.zig -target aarch64-macos.15.0
./scripts/demo.sh
./scripts/coverage.sh
./scripts/build-release.sh
```

## Checksums

Paste `dist/SHA256SUMS`.