npm - agent-device - Versions diffs - 0.10.1 → 0.10.2 - Mend

agent-device 0.10.1 → 0.10.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (56)

package/README.md +4 -1
package/dist/src/331.js +3 -3
package/dist/src/425.js +1 -0
package/dist/src/bin.js +28 -28
package/dist/src/core/dispatch.d.ts +2 -0
package/dist/src/core/session-surface.d.ts +3 -0
package/dist/src/core/settings-contract.d.ts +2 -1
package/dist/src/daemon/app-log-ios.d.ts +2 -1
package/dist/src/daemon/app-log-process.d.ts +1 -1
package/dist/src/daemon/app-log.d.ts +1 -1
package/dist/src/daemon/context.d.ts +2 -0
package/dist/src/daemon/handlers/interaction-common.d.ts +30 -1
package/dist/src/daemon/handlers/interaction-read.d.ts +14 -0
package/dist/src/daemon/handlers/interaction-touch.d.ts +2 -3
package/dist/src/daemon/handlers/interaction.d.ts +5 -12
package/dist/src/daemon/handlers/snapshot-capture.d.ts +11 -4
package/dist/src/daemon/snapshot-processing.d.ts +1 -0
package/dist/src/daemon/types.d.ts +3 -1
package/dist/src/daemon.js +39 -39
package/dist/src/platforms/android/index.d.ts +1 -1
package/dist/src/platforms/android/input-actions.d.ts +1 -0
package/dist/src/platforms/android/settings.d.ts +1 -1
package/dist/src/platforms/ios/apps.d.ts +1 -1
package/dist/src/platforms/ios/macos-helper.d.ts +69 -0
package/dist/src/platforms/ios/runner-client.d.ts +1 -1
package/dist/src/utils/command-schema.d.ts +1 -0
package/dist/src/utils/interactors.d.ts +1 -1
package/dist/src/utils/snapshot-lines.d.ts +5 -2
package/dist/src/utils/snapshot.d.ts +8 -1
package/dist/src/utils/text-surface.d.ts +19 -0
package/ios-runner/AgentDeviceRunner/AgentDeviceRunnerUITests/RunnerTests+CommandExecution.swift +8 -0
package/ios-runner/AgentDeviceRunner/AgentDeviceRunnerUITests/RunnerTests+Interaction.swift +60 -0
package/ios-runner/AgentDeviceRunner/AgentDeviceRunnerUITests/RunnerTests+Lifecycle.swift +1 -1
package/ios-runner/AgentDeviceRunner/AgentDeviceRunnerUITests/RunnerTests+Models.swift +4 -0
package/macos-helper/Package.swift +18 -0
package/macos-helper/Sources/AgentDeviceMacOSHelper/SnapshotTraversal.swift +543 -0
package/macos-helper/Sources/AgentDeviceMacOSHelper/main.swift +545 -0
package/package.json +4 -1
package/skills/agent-device/SKILL.md +25 -334
package/skills/agent-device/references/bootstrap-install.md +167 -0
package/skills/agent-device/references/coordinate-system.md +24 -4
package/skills/agent-device/references/debugging.md +115 -0
package/skills/agent-device/references/exploration.md +193 -0
package/skills/agent-device/references/macos-desktop.md +55 -57
package/skills/agent-device/references/remote-tenancy.md +56 -47
package/skills/agent-device/references/verification.md +103 -0
package/dist/src/274.js +0 -1
package/dist/src/daemon/handlers/interaction-fill.d.ts +0 -3
package/dist/src/daemon/handlers/interaction-press.d.ts +0 -3
package/skills/agent-device/references/batching.md +0 -79
package/skills/agent-device/references/logs-and-debug.md +0 -113
package/skills/agent-device/references/perf-metrics.md +0 -53
package/skills/agent-device/references/permissions.md +0 -70
package/skills/agent-device/references/session-management.md +0 -101
package/skills/agent-device/references/snapshot-refs.md +0 -102
package/skills/agent-device/references/video-recording.md +0 -49

package/skills/agent-device/SKILL.md CHANGED

@@ -3,346 +3,37 @@ name: agent-device
 description: Automates interactions for Apple-platform apps (iOS, tvOS, macOS) and Android devices. Use when navigating apps, taking snapshots/screenshots, tapping, typing, scrolling, or extracting UI info across mobile, TV, and desktop targets.
 ---
-# Apple and Android Automation with agent-device
+# agent-device
-For exploration, use snapshot refs. For deterministic replay, use selectors.
-For structured exploratory QA bug hunts and reporting, use [../dogfood/SKILL.md](../dogfood/SKILL.md).
+Use this skill as a router.
-## Start Here (Read This First)
+## QA modes
-Use this skill as a router, not a full manual.
+- Open-ended bug hunt with reporting: use [../dogfood/SKILL.md](../dogfood/SKILL.md).
+- Pass/fail QA from acceptance criteria: stay in this skill, start with [references/bootstrap-install.md](references/bootstrap-install.md), then use the QA loop in [references/exploration.md](references/exploration.md).
-1. Pick one mode:
-   - Normal interaction flow
-   - Debug/crash flow
-   - Replay maintenance flow
-2. Run one canonical flow below.
-3. Open references only if blocked.
+## Mental model
-## Decision Map
+- First choose the correct target and open the app or session you want to work on.
+- Then inspect the current UI with `snapshot -i` and pick targets from the actual UI state.
+- Act with `press`, `fill`, `get`, `is`, `wait`, or `find`.
+- Re-snapshot after meaningful UI changes instead of reusing stale refs.
+- End by capturing proof if needed, then `close`.
-- No target context yet: `devices` -> pick target -> `open`.
-- Normal UI task: `open` -> `snapshot -i` -> `press/click/fill` -> `diff snapshot -i` -> `close`
-- Debug/crash (iOS/Android): `open <app>` -> `logs clear --restart` -> reproduce -> `network dump` -> `logs path` -> targeted `grep`
-- Replay drift: `replay -u <path>` -> verify updated selectors
-- Remote multi-tenant run: allocate lease -> point client at remote daemon base URL -> run commands with tenant isolation flags -> heartbeat/release lease
-- Device-scope isolation run: set iOS simulator set / Android allowlist -> run selectors within scope only
-- macOS desktop task: run the macOS desktop flow, then open [references/macos-desktop.md](references/macos-desktop.md) if context menus, Finder rows, or desktop-specific snapshot behavior matters
+## Decision rules
-## Target Selection Rules
+- Use plain `snapshot` when you need to verify whether text is visible.
+- Use `snapshot -i` mainly for interactive exploration and choosing refs.
+- Use `fill` to replace text.
+- Use `type` to append text.
+- Prefer `@ref` or selector targeting over raw coordinates.
+- Keep the default loop short: `open` -> explore/act -> optional debug or verify -> `close`.
-- iOS local QA: use simulators unless the task explicitly requires a physical device.
-- iOS local QA in mixed simulator/device environments: run `ensure-simulator` first and pass `--device`, `--udid`, or `--ios-simulator-device-set` on later commands.
-- macOS desktop app automation: use `--platform macos`, or `--platform apple --target desktop` when the caller wants one Apple-family selector path.
-- Android local QA: use `install` or `reinstall` for `.apk`/`.aab` files, then relaunch by installed package name.
-- Android React Native + Metro flows: prefer `open <package> --remote-config <path> --relaunch`.
-- In mixed-device environments, always pin the exact target with `--serial`, `--device`, `--udid`, or an isolation scope.
-- For session-bound automation runs, prefer a pre-bound session/platform instead of repeating selectors on every command: set `AGENT_DEVICE_SESSION`, set `AGENT_DEVICE_PLATFORM`, and the daemon will enforce the shared lock policy across CLI, typed client, and RPC entry points.
-- Use `--session-lock reject|strip` (or `AGENT_DEVICE_SESSION_LOCK`) only when you need to override the default reject behavior. Lock mode applies to nested `batch` steps too.
+## Choose a reference
-## Canonical Flows
-### 1) Normal Interaction Flow
-```bash
-agent-device open Settings --platform ios
-agent-device snapshot -i
-agent-device press @e3
-agent-device diff snapshot -i
-agent-device fill @e5 "test"
-agent-device close
-```
-### 1a) Local iOS Simulator QA Flow
-```bash
-agent-device ensure-simulator --platform ios --device "iPhone 16" --boot
-agent-device open MyApp --platform ios --device "iPhone 16" --session qa-ios --relaunch
-agent-device snapshot -i
-agent-device press @e3
-agent-device close
-```
-Use this when a physical iPhone is also connected and you want deterministic simulator-only automation.
-### 1b) Android React Native + Metro QA Flow
-```bash
-agent-device reinstall MyApp /path/to/app-debug.apk --platform android --serial emulator-5554
-agent-device open com.example.myapp --remote-config ./agent-device.remote.json --relaunch
-agent-device snapshot -i
-agent-device close
-```
-Do not use `open <apk|aab> --relaunch` on Android. Install/reinstall binaries first, then relaunch by package.
-### 1c) Session-Bound Automation Flow
-```bash
-export AGENT_DEVICE_SESSION=qa-ios
-export AGENT_DEVICE_PLATFORM=ios
-export AGENT_DEVICE_SESSION_LOCK=strip
-agent-device open MyApp --relaunch
-agent-device snapshot -i
-agent-device batch --steps-file /tmp/qa-steps.json --json
-agent-device close
-```
-Use this for orchestrators that must preserve one bound session/device across many plain CLI calls without a wrapper script. In `strip` mode, conflicting selectors such as `--target`, `--device`, `--udid`, `--serial`, and isolation-scope overrides are ignored instead of retargeting the run.
-### 1d) Android Emulator Session-Bound Flow
-```bash
-export AGENT_DEVICE_SESSION=qa-android
-export AGENT_DEVICE_PLATFORM=android
-agent-device reinstall MyApp /path/to/app-debug.apk --serial emulator-5554
-agent-device --session-lock reject open com.example.myapp --relaunch
-agent-device snapshot -i
-agent-device close --shutdown
-```
-Use this when an Android emulator session must stay pinned while an agent or test runner issues plain CLI commands over time.
-### 1e) macOS Desktop Flow
-```bash
-agent-device open TextEdit --platform macos
-agent-device snapshot -i
-agent-device fill @e3 "desktop smoke test"
-agent-device screenshot /tmp/macos-textedit.png
-agent-device close
-```
-Use this for host Mac desktop apps. Prefer the Apple runner interaction flow (`open`, `snapshot`, `press`, `click`, `fill`, `scroll`, `back`, `record`, `screenshot`). macOS also supports `clipboard read|write`, `trigger-app-event` when a desktop deep-link template is configured, and only `settings appearance light|dark|toggle` under the `settings` command. Do not rely on mobile-only helpers like `install`, `push`, `logs`, or `network` on macOS.
-Prefer selectors or snapshot refs (`@e...`) over raw x/y commands on macOS because the window origin can move between runs.
-Open [references/macos-desktop.md](references/macos-desktop.md) when you need Finder-style list traversal, context-menu flows, or macOS-specific snapshot expectations.
-### 2) Debug/Crash Flow
-```bash
-agent-device open MyApp --platform ios
-agent-device logs clear --restart
-agent-device network dump 25
-agent-device logs path
-```
-Logging is off by default. Enable only for debugging windows.
-`logs clear --restart` requires an active app session (`open <app>` first).
-### 3) Replay Maintenance Flow
-```bash
-agent-device replay -u ./session.ad
-```
-### 4) Remote Tenant Lease Flow (HTTP JSON-RPC)
-```bash
-# Client points directly at the remote daemon HTTP base URL.
-export AGENT_DEVICE_DAEMON_BASE_URL=http://mac-host.example:4310
-export AGENT_DEVICE_DAEMON_AUTH_TOKEN=<token>
-# Allocate lease
-curl -sS "${AGENT_DEVICE_DAEMON_BASE_URL}/rpc" \
-  -H "content-type: application/json" \
-  -H "Authorization: Bearer <token>" \
-  -d '{"jsonrpc":"2.0","id":"alloc-1","method":"agent_device.lease.allocate","params":{"runId":"run-123","tenantId":"acme","ttlMs":60000}}'
-# Use lease in tenant-isolated command execution
-agent-device \
-  --tenant acme \
-  --session-isolation tenant \
-  --run-id run-123 \
-  --lease-id <lease-id> \
-  session list --json
-# Heartbeat and release
-curl -sS "${AGENT_DEVICE_DAEMON_BASE_URL}/rpc" \
-  -H "content-type: application/json" \
-  -H "Authorization: Bearer <token>" \
-  -d '{"jsonrpc":"2.0","id":"hb-1","method":"agent_device.lease.heartbeat","params":{"leaseId":"<lease-id>","ttlMs":60000}}'
-curl -sS "${AGENT_DEVICE_DAEMON_BASE_URL}/rpc" \
-  -H "content-type: application/json" \
-  -H "Authorization: Bearer <token>" \
-  -d '{"jsonrpc":"2.0","id":"rel-1","method":"agent_device.lease.release","params":{"leaseId":"<lease-id>"}}'
-```
-Notes:
-- `AGENT_DEVICE_DAEMON_BASE_URL` makes the CLI skip local daemon discovery/startup and call the remote HTTP daemon directly.
-- `AGENT_DEVICE_DAEMON_AUTH_TOKEN` is sent in both the JSON-RPC request token and HTTP auth headers.
-- In remote daemon mode, `--debug` does not tail a local `daemon.log`; inspect logs on the remote host instead.
-## Command Skeleton (Minimal)
-### Session and navigation
-```bash
-agent-device devices
-agent-device devices --platform ios --ios-simulator-device-set /tmp/tenant-a/simulators
-agent-device devices --platform android --android-device-allowlist emulator-5554,device-1234
-agent-device ensure-simulator --device "iPhone 16" --ios-simulator-device-set /tmp/tenant-a/simulators
-agent-device ensure-simulator --device "iPhone 16" --runtime com.apple.CoreSimulator.SimRuntime.iOS-18-4 --ios-simulator-device-set /tmp/tenant-a/simulators --boot
-agent-device open [app|url] [url]
-agent-device open [app] --relaunch
-agent-device close [app]
-agent-device install <app> <path-to-binary>
-agent-device install-from-source <url> [--header "name:value"]
-agent-device reinstall <app> <path-to-binary>
-agent-device session list
-```
-Use `boot` only as fallback when `open` cannot find/connect to a ready target.
-If the workspace repeats the same selectors or device/session flags, prefer a checked-in `agent-device.json` or `--config <path>` over repeating them inline.
-Environment-level defaults follow the same fields via `AGENT_DEVICE_*` names, so persistent host-specific values belong there rather than in committed project config.
-That includes bound-session defaults such as `sessionLock` / `AGENT_DEVICE_SESSION_LOCK` when automation should consistently reject or strip conflicting device routing flags.
-For Android emulators by AVD name, use `boot --platform android --device <avd-name>`.
-For Android emulators without GUI, add `--headless`.
-Use `--target mobile|tv` with `--platform` (required) to pick phone/tablet vs TV targets (AndroidTV/tvOS).
-For Android React Native + Metro flows, install or reinstall the APK first, then use `open <package> --remote-config <path> --relaunch`; do not use `open <apk|aab> --relaunch`.
-For local iOS QA in mixed simulator/device environments, use `ensure-simulator` and pass `--device` or `--udid` so automation does not attach to a physical device by accident.
-For session-bound automation, prefer `AGENT_DEVICE_SESSION` + `AGENT_DEVICE_PLATFORM`; that bound-session default now enables lock mode automatically.
-Isolation scoping quick reference:
-- `--ios-simulator-device-set <path>` scopes iOS simulator discovery + command execution to one simulator set.
-- `--android-device-allowlist <serials>` scopes Android discovery/selection to comma/space separated serials.
-- Scope is applied before selectors (`--device`, `--udid`, `--serial`); out-of-scope selectors fail with `DEVICE_NOT_FOUND`.
-- With iOS simulator-set scope enabled, iOS physical devices are not enumerated.
-- In bound-session `strip` mode, conflicting per-call scope/selectors are ignored and the configured binding is restored for the request. Batch steps still inherit the parent `--platform` when they do not set their own.
-Simulator provisioning quick reference:
-- Use `ensure-simulator` to create or reuse a named iOS simulator inside a device set before starting a session.
-- `--device <name>` is required (e.g. `"iPhone 16 Pro"`). `--runtime <id>` pins the runtime; omit to use the newest compatible one.
-- `--boot` boots it immediately. Returns `udid`, `device`, `runtime`, `ios_simulator_device_set`, `created`, `booted`.
-- Idempotent: safe to call repeatedly; reuses an existing matching simulator by default.
-TV quick reference:
-- AndroidTV: `open`/`apps` use TV launcher discovery automatically.
-- TV target selection works on emulators/simulators and connected physical devices (AndroidTV + AppleTV).
-- tvOS: runner-driven interactions and snapshots are supported (`snapshot`, `wait`, `press`, `fill`, `get`, `scroll`, `back`, `home`, `app-switcher`, `record` and related selector flows).
-- tvOS `back`/`home`/`app-switcher` map to Siri Remote actions (`menu`, `home`, double-home) in the runner.
-- tvOS follows iOS simulator-only command semantics for helpers like `pinch`, `settings`, and `push`.
-### Snapshot and targeting
-```bash
-agent-device snapshot -i
-agent-device diff snapshot -i
-agent-device find "Sign In" click
-agent-device press @e1
-agent-device fill @e2 "text"
-agent-device is visible 'id="anchor"'
-```
-`press` is canonical tap command; `click` is an alias.
-On macOS, use `click --button secondary <@ref|selector>` to open a context menu before the next `snapshot -i`.
-For desktop-specific heuristics and Finder guidance, see [references/macos-desktop.md](references/macos-desktop.md).
-### Utilities
-```bash
-agent-device appstate
-agent-device clipboard read
-agent-device clipboard write "token"
-agent-device keyboard status
-agent-device keyboard dismiss
-agent-device perf --json
-agent-device network dump [limit] [summary|headers|body|all]
-agent-device push <bundle|package> <payload.json|inline-json>
-agent-device trigger-app-event screenshot_taken '{"source":"qa"}'
-agent-device get text @e1
-agent-device screenshot out.png
-agent-device settings permission grant notifications
-agent-device settings permission reset camera
-agent-device trace start
-agent-device trace stop ./trace.log
-```
-### Batch (when sequence is already known)
-```bash
-agent-device batch --steps-file /tmp/batch-steps.json --json
-```
-### Performance Check
-- Use `agent-device perf --json` (or `metrics --json`) after `open`.
-- For detailed metric semantics, caveats, and interpretation guidance, see [references/perf-metrics.md](references/perf-metrics.md).
-## Guardrails (High Value Only)
-- Re-snapshot after UI mutations (navigation/modal/list changes).
-- Prefer `snapshot -i`; scope/depth only when needed.
-- Use refs for discovery, selectors for replay/assertions.
-- `find "<query>" click --json` returns `{ ref, locator, query, x, y }` — all derived from the matched snapshot node. Do not rely on these fields from raw `press`/`click` responses for observability; use `find` instead.
-- Use `fill` for clear-then-type semantics; use `type` for focused append typing.
-- Use `install` for in-place app upgrades (keep app data when platform permits), and `reinstall` for deterministic fresh-state runs.
-- App binary format support for `install`/`reinstall`: Android `.apk`/`.aab`, iOS `.app`/`.ipa`.
-- Android `.aab` requires `bundletool` in `PATH`, or `AGENT_DEVICE_BUNDLETOOL_JAR=<path-to-bundletool-all.jar>` with `java` in `PATH`.
-- Android `.aab` optional: set `AGENT_DEVICE_ANDROID_BUNDLETOOL_MODE=<mode>` to control bundletool `build-apks --mode` (default: `universal`).
-- iOS `.ipa`: extract/install from `Payload/*.app`; when multiple app bundles are present, `<app>` is used as a bundle id/name hint.
-- iOS `appstate` is session-scoped; Android `appstate` is live foreground state. iOS responses include `device_udid` and `ios_simulator_device_set` for isolation verification.
-- iOS `open` responses include `device_udid` and `ios_simulator_device_set` to confirm which simulator handled the session.
-- Clipboard helpers: `clipboard read` / `clipboard write <text>` are supported on macOS, Android, and iOS simulators; iOS physical devices are not supported yet.
-- Android keyboard helpers: `keyboard status|get|dismiss` report keyboard visibility/type and dismiss via keyevent when visible.
-- `network dump` is best-effort and parses HTTP(s) entries from the session app log file.
-- Biometric settings: iOS simulator supports `settings faceid|touchid <match|nonmatch|enroll|unenroll>`; Android supports `settings fingerprint <match|nonmatch>` where runtime tooling is available.
-- For AndroidTV/tvOS selection, always pair `--target` with `--platform` (`ios`, `android`, or `apple` alias); target-only selection is invalid.
-- `push` simulates notification delivery:
-  - iOS simulator uses APNs-style payload JSON.
-  - Android uses broadcast action + typed extras (string/boolean/number).
-- `trigger-app-event` requires app-defined deep-link hooks and URL template configuration (`AGENT_DEVICE_APP_EVENT_URL_TEMPLATE` or platform-specific variants).
-- On macOS, set `AGENT_DEVICE_MACOS_APP_EVENT_URL_TEMPLATE` when the desktop app uses a different deep-link template than iOS/Android.
-- `trigger-app-event` requires an active session or explicit selectors (`--platform`, `--device`, `--udid`, `--serial`); on iOS physical devices, custom-scheme triggers require active app context.
-- Canonical trigger behavior and caveats are documented in [`website/docs/docs/commands.md`](../../website/docs/docs/commands.md) under **App event triggers**.
-- Permission settings are app-scoped and require an active session app:
-  `settings permission <grant|deny|reset> <camera|microphone|photos|contacts|notifications> [full|limited]`
-- iOS simulator permission alerts: use `alert wait` then `alert accept/dismiss` — `accept`/`dismiss` retry internally for up to 2 s so you do not need manual sleeps. See [references/permissions.md](references/permissions.md).
-- `full|limited` mode applies only to iOS `photos`; other targets reject mode.
-- On Android, non-ASCII `fill/type` may require an ADB keyboard IME on some system images; only install IME APKs from trusted sources and verify checksum/signature.
-- If using `--save-script`, prefer explicit path syntax (`--save-script=flow.ad` or `./flow.ad`).
-- For tenant-isolated remote runs, always pass `--tenant`, `--session-isolation tenant`, `--run-id`, and `--lease-id` together.
-- Use short lease TTLs and heartbeat only while work is active; release leases immediately after run completion/failure.
-- Env equivalents for scoped runs: `AGENT_DEVICE_IOS_SIMULATOR_DEVICE_SET` (compat `IOS_SIMULATOR_DEVICE_SET`) and
-  `AGENT_DEVICE_ANDROID_DEVICE_ALLOWLIST` (compat `ANDROID_DEVICE_ALLOWLIST`).
-- For explicit remote client mode, prefer `AGENT_DEVICE_DAEMON_BASE_URL` / `--daemon-base-url` instead of relying on local daemon metadata or loopback-only ports.
-## Common Failure Patterns
-- `Failed to access Android app sandbox for /path/app-debug.apk`: Android relaunch/runtime-hint flow received an APK path instead of an installed package name. Use `reinstall` first, then `open <package> --relaunch`.
-- `mkdir: Needs 1 argument` while writing `ReactNativeDevPrefs.xml`: likely an older `agent-device` build or stale global install is still using the shell-based Android runtime-hint writer. Verify the exact binary being invoked.
-- `Failed to terminate iOS app`: the flow may have selected a physical iPhone or an unavailable iOS target. Re-run with `ensure-simulator`, then pin the simulator with `--device` or `--udid`.
-## Security and Trust Notes
-- Prefer a preinstalled `agent-device` binary over on-demand package execution.
-- If install is required, pin an exact version (for example: `npx --yes agent-device@<exact-version> --help`).
-- Signing/provisioning environment variables are optional, sensitive, and only for iOS physical-device setup.
-- Logs/artifacts are written under `~/.agent-device`; replay scripts write to explicit paths you provide.
-- For remote daemon mode, prefer `AGENT_DEVICE_DAEMON_SERVER_MODE=http|dual` on the host plus client-side `AGENT_DEVICE_DAEMON_BASE_URL`, with `AGENT_DEVICE_HTTP_AUTH_HOOK` and tenant-scoped lease admission where needed.
-- Keep logging off unless debugging and use least-privilege/isolated environments for autonomous runs.
-## Common Mistakes
-- Mixing debug flow into normal runs (keep logs off unless debugging).
-- Continuing to use stale refs after screen transitions.
-- Using URL opens with Android `--activity` (unsupported combination).
-- Treating `boot` as default first step instead of fallback.
-## References
-- [references/snapshot-refs.md](references/snapshot-refs.md)
-- [references/macos-desktop.md](references/macos-desktop.md)
-- [references/logs-and-debug.md](references/logs-and-debug.md)
-- [references/session-management.md](references/session-management.md)
-- [references/permissions.md](references/permissions.md)
-- [references/video-recording.md](references/video-recording.md)
-- [references/coordinate-system.md](references/coordinate-system.md)
-- [references/batching.md](references/batching.md)
-- [references/perf-metrics.md](references/perf-metrics.md)
-- [references/remote-tenancy.md](references/remote-tenancy.md)
+- Pick target device, install, open, or manage sessions: [references/bootstrap-install.md](references/bootstrap-install.md)
+- Need to discover UI, pick refs, wait, query, or interact: [references/exploration.md](references/exploration.md)
+- Need logs, network, alerts, permissions, or failure triage: [references/debugging.md](references/debugging.md)
+- Need screenshots, diff, recording, replay maintenance, or perf data: [references/verification.md](references/verification.md)
+- Need desktop surfaces, menu bar behavior, or macOS-specific interaction rules: [references/macos-desktop.md](references/macos-desktop.md)
+- Need to connect to a remote `agent-device` daemon over HTTP or use tenant leases: [references/remote-tenancy.md](references/remote-tenancy.md)
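
Note on the new SKILL.md "Mental model": the loop it describes (open, snapshot, act, re-snapshot, close) can be sketched as a dry-run script. This is only an illustration, not part of the package. The `ad` wrapper and `DRY_RUN` variable are hypothetical helpers introduced here, and the app name and the `@e3` / `@e5` refs are placeholders taken from the removed 0.10.1 example; real refs come from the output of `snapshot -i`.

```shell
#!/bin/sh
# Dry-run sketch of the default loop: open -> snapshot -> act -> re-snapshot -> close.
# DRY_RUN=1 (the default here) only prints each command; DRY_RUN=0 would execute it.
# `ad` and DRY_RUN are hypothetical helpers, not part of agent-device.
ad() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf 'agent-device %s\n' "$*"
  else
    agent-device "$@"
  fi
}

default_loop() {
  ad open Settings --platform ios   # choose the target and open the session
  ad snapshot -i                    # inspect the current UI and pick refs
  ad press @e3                      # act on a ref from that snapshot (placeholder ref)
  ad snapshot -i                    # re-snapshot: the UI changed, old refs may be stale
  ad fill @e5 "test"                # replace text in a field from the fresh snapshot
  ad close                          # end the session
}

default_loop
```

Printing the commands first makes it easy to review the sequence before pointing it at a real device.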

package/skills/agent-device/references/bootstrap-install.md ADDED

@@ -0,0 +1,167 @@
+# Bootstrap and Install
+## When to open this file
+Open this file when you still need to choose the right target, start the right session, install or relaunch the app, or pin automation to one device before interacting.
+## Main commands to reach for first
+- `devices`
+- `ensure-simulator`
+- `open`
+- `install` or `reinstall`
+- `close`
+- `session list`
+## Most common mistake to avoid
+Do not start acting before you have pinned the correct target and opened an `app` session. In mixed-device environments, always pass `--device`, `--udid`, or `--serial`.
+## Canonical loop
+```bash
+agent-device ensure-simulator --platform ios --device "iPhone 17 Pro" --boot
+agent-device open MyApp --platform ios --device "iPhone 17 Pro" --relaunch
+agent-device snapshot -i
+agent-device close
+```
+## Choose the right starting point
+- iOS local QA: prefer simulators unless the task explicitly requires physical hardware.
+- iOS in mixed simulator and device environments: run `ensure-simulator` first, then keep using `--device` or `--udid`.
+- TV targets: use `--target tv` together with `--platform` when the task is for tvOS or Android TV rather than phone or tablet surfaces.
+- Android binary flow: use `install` or `reinstall` for `.apk` or `.aab`, then open by installed package name.
+- Android React Native plus Metro flow: `reinstall <app> <apk>` first, then `open <package> --remote-config <path> --relaunch`.
+- macOS desktop app flow: use `open <app> --platform macos`. Only load [macos-desktop.md](macos-desktop.md) if a desktop surface or macOS-specific behavior matters.
+TV example:
+```bash
+agent-device open MyTvApp --platform ios --target tv
+agent-device open com.example.androidtv --platform android --target tv
+```
+## Session rules
+- Use `--session <name>` when you need a named session:
+```bash
+agent-device --session auth open Settings --platform ios
+agent-device --session auth snapshot -i
+```
+- Use `open <app>` before interactions.
+- Use `close` when done. Add `--shutdown` when you want simulators or emulators torn down with the session.
+- Use semantic session names when you need multiple concurrent runs.
+- Use `--save-script=<path>` on `close` when you want to keep a replay script.
+- For dev loops where state can linger, prefer `open <app> --relaunch`.
+- In iOS sessions, use `open <app>` for the app itself. Use `open <url>` for deep links, and `open <app> <url>` when you need to launch the app and deep link in one step.
+- On iOS, `appstate` is session-scoped and requires the matching active session on the target device.
+## After a session is established
+Once you have opened the correct session on the correct target, default to the conservative rule: keep the session binding on follow-up commands, and stop repeating device-routing flags unless you are intentionally retargeting.
+- Prefer `--session <name>` on follow-up commands, or use sandboxed `AGENT_DEVICE_SESSION`.
+- Do not keep repeating `--platform`, `--target`, `--device`, `--udid`, `--serial`, or similar target-selection flags on normal follow-up commands.
+- Only omit follow-up session flags when the environment explicitly guarantees isolation.
+Good shared-host pattern:
+```bash
+agent-device --session auth open Settings --platform ios --device "iPhone 17 Pro"
+agent-device --session auth snapshot -i
+agent-device --session auth press @e3
+agent-device --session auth close
+```
+Bad shared-host pattern:
+```bash
+agent-device --session auth open Settings --platform ios --device "iPhone 17 Pro"
+agent-device --session auth snapshot -i --platform ios --device "iPhone 17 Pro"
+```
+Use target-selection flags again only when you are choosing the target before opening a session, or when you explicitly mean to retarget.
+## Session-bound automation
+Use this when an orchestrator must keep plain CLI calls on one session and device.
+```bash
+export AGENT_DEVICE_SESSION=qa-ios
+export AGENT_DEVICE_PLATFORM=ios
+export AGENT_DEVICE_SESSION_LOCK=strip
+agent-device open MyApp --relaunch
+agent-device snapshot -i
+agent-device close
+```
+- `AGENT_DEVICE_SESSION` plus `AGENT_DEVICE_PLATFORM` provides the default binding.
+- `--session-lock reject|strip` controls whether conflicting per-call routing flags fail or are ignored.
+- Conflicts include explicit retargeting flags such as `--platform`, `--target`, `--device`, `--udid`, `--serial`, `--ios-simulator-device-set`, and `--android-device-allowlist`.
+- Lock policy applies to nested `batch` steps too.
+- Compatibility aliases remain supported: `--session-locked`, `--session-lock-conflicts`, `AGENT_DEVICE_SESSION_LOCKED`, and `AGENT_DEVICE_SESSION_LOCK_CONFLICTS`.
+Android emulator variant:
+```bash
+export AGENT_DEVICE_SESSION=qa-android
+export AGENT_DEVICE_PLATFORM=android
+agent-device reinstall MyApp /path/to/app-debug.apk --serial emulator-5554
+agent-device --session-lock reject open com.example.myapp --relaunch
+agent-device snapshot -i
+agent-device close --shutdown
+```
+## Scoped discovery
+Use scoped discovery when one run must not see host-global device lists.
+```bash
+agent-device devices --platform ios --ios-simulator-device-set /tmp/tenant-a/simulators
+agent-device devices --platform android --android-device-allowlist emulator-5554,device-1234
+```
+- Scope is applied before `--device`, `--udid`, and `--serial`.
+- Out-of-scope selectors fail with `DEVICE_NOT_FOUND`.
+- With iOS simulator-set scope enabled, iOS physical devices are not enumerated.
+- If the scoped iOS simulator set is empty, the error should point at the set path and suggest creating a simulator in that set.
+- Environment equivalents:
+  - `AGENT_DEVICE_IOS_SIMULATOR_DEVICE_SET`
+  - `AGENT_DEVICE_ANDROID_DEVICE_ALLOWLIST`
+## Session inspection and replay
+```bash
+agent-device session list
+agent-device replay ./session.ad --session auth
+agent-device replay -u ./session.ad --session auth
+```
+- iOS session entries include `device_udid` and `ios_simulator_device_set`. Use them to confirm routing in concurrent runs.
+- Prefer selector-based actions and assertions in saved replay scripts.
+- Tenant isolation namespaces sessions as `<tenant>:<session>` during tenant-scoped runs.
+## When to leave this file
+- Once the correct target and session are pinned, move to [exploration.md](exploration.md).
+- If opening, startup, permissions, or logs become the blocker, switch to [debugging.md](debugging.md).
+## Install and open examples
+```bash
+agent-device reinstall MyApp /path/to/app-debug.apk --platform android --serial emulator-5554
+agent-device open com.example.myapp --remote-config ./agent-device.remote.json --relaunch
+```
+Do not use `open <apk|aab> --relaunch` on Android.
+## Security and trust notes
+- Treat signing, provisioning, and daemon auth values as host secrets. Do not paste them into shared logs or commit them to source control.
+- Prefer Xcode Automatic Signing over manual overrides when a physical iOS device is involved.
+- Keep persistent host-specific defaults in environment variables rather than checked-in project config.
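
Note on the "Session-bound automation" rules in bootstrap-install.md: the difference between `reject` and `strip` may be easier to see in a small model. The sketch below is an illustrative approximation only, not the daemon's implementation. The function names are hypothetical, the flag list is the one quoted in the added text, and it assumes every routing flag is passed as two words (`--flag value`).

```shell
#!/bin/sh
# Illustrative model of the session-lock policy; NOT the daemon's actual code.
# Assumption: each routing flag is passed as "--flag value" (two separate words).
is_routing_flag() {
  case "$1" in
    --platform|--target|--device|--udid|--serial|--ios-simulator-device-set|--android-device-allowlist)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# apply_session_lock <reject|strip> <args...>
# reject: fail on the first conflicting routing flag.
# strip:  drop each conflicting flag and its value, keep everything else.
# Prints the surviving arguments as one space-joined line (for display only).
apply_session_lock() {
  mode=$1
  shift
  kept=""
  while [ "$#" -gt 0 ]; do
    if is_routing_flag "$1"; then
      if [ "$mode" = "reject" ]; then
        echo "conflicting routing flag: $1" >&2
        return 1
      fi
      shift                                # strip mode: drop the flag
      if [ "$#" -gt 0 ]; then shift; fi    # and drop its value
    else
      kept="${kept:+$kept }$1"
      shift
    fi
  done
  printf '%s\n' "$kept"
}
```

Under this model, `apply_session_lock strip snapshot -i --device "iPhone 17 Pro"` prints `snapshot -i`, while the same call with `reject` returns a non-zero status.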

package/skills/agent-device/references/coordinate-system.md CHANGED

@@ -1,8 +1,28 @@
 # Coordinate System
-All coordinate-based actions use device screen coordinates:
+## When to open this file
-- Origin: top-left of the device screen
-- Units: device points for iOS, pixels for Android
+Open this file only when you must use raw coordinates instead of selectors or `@ref` targeting.
-Use screenshots to reason about coordinates.
+## Main commands to reach for first
+- `screenshot`
+- coordinate-based `click` or `swipe`
+## Most common mistake to avoid
+Do not assume coordinates mean the same thing across platforms or runs. Prefer selectors and refs first.
+## Canonical loop
+```bash
+agent-device screenshot /tmp/current-screen.png
+agent-device click 120 240
+```
+## Rules
+- Origin is the top-left of the device screen.
+- iOS uses device points.
+- Android uses pixels.
+- Use screenshots to reason about coordinates before acting.
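
Note on the coordinate rules above: because iOS uses device points while screenshots are typically measured in pixels, a position read off a screenshot may need converting before it is passed to a coordinate-based command. The helper below is a minimal sketch under two assumptions that the diff does not state: the screenshot is captured at native pixel resolution, and the device scale factor is known (3 is used here purely as an example). `px_to_points` is a hypothetical name.

```shell
#!/bin/sh
# px_to_points <pixel> <scale>
# Converts a screenshot pixel coordinate to points, rounding to the nearest point.
# Assumption: the screenshot is at native resolution and <scale> is the device scale factor.
px_to_points() {
  echo $(( ($1 + $2 / 2) / $2 ))
}

# A target seen at pixel (360, 720) on a hypothetical 3x device:
x=$(px_to_points 360 3)   # 120
y=$(px_to_points 720 3)   # 240
echo "agent-device click $x $y"
```

On Android no conversion is needed under the rules above, since coordinates are already in pixels.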