agent-device 0.10.0 → 0.10.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (76)
  1. package/README.md +4 -607
  2. package/dist/src/331.js +3 -3
  3. package/dist/src/425.js +1 -0
  4. package/dist/src/bin.js +28 -28
  5. package/dist/src/core/dispatch.d.ts +2 -0
  6. package/dist/src/core/session-surface.d.ts +3 -0
  7. package/dist/src/core/settings-contract.d.ts +2 -1
  8. package/dist/src/daemon/android-system-dialog.d.ts +11 -0
  9. package/dist/src/daemon/app-log-ios.d.ts +2 -1
  10. package/dist/src/daemon/app-log-process.d.ts +1 -1
  11. package/dist/src/daemon/app-log.d.ts +1 -1
  12. package/dist/src/daemon/context.d.ts +2 -0
  13. package/dist/src/daemon/handlers/interaction-common.d.ts +30 -1
  14. package/dist/src/daemon/handlers/interaction-read.d.ts +14 -0
  15. package/dist/src/daemon/handlers/interaction-touch.d.ts +45 -0
  16. package/dist/src/daemon/handlers/interaction.d.ts +2 -0
  17. package/dist/src/daemon/handlers/record-trace-android.d.ts +18 -0
  18. package/dist/src/daemon/handlers/record-trace-ios.d.ts +52 -0
  19. package/dist/src/daemon/handlers/record-trace-recording.d.ts +32 -0
  20. package/dist/src/daemon/handlers/record-trace.d.ts +2 -7
  21. package/dist/src/daemon/handlers/snapshot-capture.d.ts +11 -4
  22. package/dist/src/daemon/record-trace-errors.d.ts +6 -0
  23. package/dist/src/daemon/recording-gestures.d.ts +3 -0
  24. package/dist/src/daemon/recording-telemetry.d.ts +20 -0
  25. package/dist/src/daemon/recording-timing.d.ts +24 -0
  26. package/dist/src/daemon/request-router.d.ts +6 -0
  27. package/dist/src/daemon/script-utils.d.ts +1 -0
  28. package/dist/src/daemon/snapshot-processing.d.ts +1 -0
  29. package/dist/src/daemon/touch-reference-frame.d.ts +7 -0
  30. package/dist/src/daemon/types.d.ts +65 -11
  31. package/dist/src/daemon.js +62 -36
  32. package/dist/src/platforms/android/index.d.ts +1 -1
  33. package/dist/src/platforms/android/input-actions.d.ts +5 -0
  34. package/dist/src/platforms/android/settings.d.ts +1 -1
  35. package/dist/src/platforms/ios/apps.d.ts +1 -1
  36. package/dist/src/platforms/ios/macos-helper.d.ts +69 -0
  37. package/dist/src/platforms/ios/runner-client.d.ts +2 -2
  38. package/dist/src/platforms/ios/runner-session.d.ts +5 -0
  39. package/dist/src/platforms/ios/runner-xctestrun.d.ts +3 -1
  40. package/dist/src/recording/overlay.d.ts +10 -0
  41. package/dist/src/utils/command-schema.d.ts +2 -0
  42. package/dist/src/utils/interactors.d.ts +8 -8
  43. package/dist/src/utils/snapshot-lines.d.ts +5 -2
  44. package/dist/src/utils/snapshot.d.ts +8 -1
  45. package/dist/src/utils/text-surface.d.ts +19 -0
  46. package/dist/src/utils/video.d.ts +9 -0
  47. package/ios-runner/AgentDeviceRunner/AgentDeviceRunnerUITests/RunnerTests+CommandExecution.swift +196 -51
  48. package/ios-runner/AgentDeviceRunner/AgentDeviceRunnerUITests/RunnerTests+Interaction.swift +133 -0
  49. package/ios-runner/AgentDeviceRunner/AgentDeviceRunnerUITests/RunnerTests+Lifecycle.swift +1 -1
  50. package/ios-runner/AgentDeviceRunner/AgentDeviceRunnerUITests/RunnerTests+Models.swift +33 -1
  51. package/ios-runner/AgentDeviceRunner/AgentDeviceRunnerUITests/RunnerTests+ScreenRecorder.swift +4 -6
  52. package/ios-runner/AgentDeviceRunner/AgentDeviceRunnerUITests/RunnerTests.swift +1 -0
  53. package/ios-runner/AgentDeviceRunner/RecordingScripts/recording-overlay.swift +571 -0
  54. package/ios-runner/AgentDeviceRunner/RecordingScripts/recording-trim.swift +140 -0
  55. package/macos-helper/Package.swift +18 -0
  56. package/macos-helper/Sources/AgentDeviceMacOSHelper/SnapshotTraversal.swift +543 -0
  57. package/macos-helper/Sources/AgentDeviceMacOSHelper/main.swift +545 -0
  58. package/package.json +4 -1
  59. package/skills/agent-device/SKILL.md +25 -334
  60. package/skills/agent-device/references/bootstrap-install.md +167 -0
  61. package/skills/agent-device/references/coordinate-system.md +24 -4
  62. package/skills/agent-device/references/debugging.md +115 -0
  63. package/skills/agent-device/references/exploration.md +193 -0
  64. package/skills/agent-device/references/macos-desktop.md +55 -57
  65. package/skills/agent-device/references/remote-tenancy.md +56 -47
  66. package/skills/agent-device/references/verification.md +103 -0
  67. package/dist/src/274.js +0 -1
  68. package/dist/src/daemon/handlers/interaction-fill.d.ts +0 -3
  69. package/dist/src/daemon/handlers/interaction-press.d.ts +0 -3
  70. package/skills/agent-device/references/batching.md +0 -79
  71. package/skills/agent-device/references/logs-and-debug.md +0 -113
  72. package/skills/agent-device/references/perf-metrics.md +0 -53
  73. package/skills/agent-device/references/permissions.md +0 -70
  74. package/skills/agent-device/references/session-management.md +0 -101
  75. package/skills/agent-device/references/snapshot-refs.md +0 -102
  76. package/skills/agent-device/references/video-recording.md +0 -41
package/skills/agent-device/references/verification.md ADDED
@@ -0,0 +1,103 @@
1
+ # Verification
2
+
3
+ ## When to open this file
4
+
5
+ Open this file when the task needs evidence, regression checks, replay maintenance, or startup performance measurements after the main interaction flow is already working.
6
+
7
+ ## Main commands to reach for first
8
+
9
+ - `screenshot`
10
+ - `diff snapshot`
11
+ - `record`
12
+ - `replay -u`
13
+ - `perf`
14
+
15
+ ## Most common mistake to avoid
16
+
17
+ Do not use verification tools as the first exploration step. First get the app into the correct state with the normal interaction flow, then capture proof or maintain replay assets.
18
+
19
+ ## Canonical loop
20
+
21
+ ```bash
22
+ agent-device open Settings --platform ios
23
+ agent-device snapshot -i
24
+ agent-device press @e5
25
+ agent-device diff snapshot -i
26
+ agent-device screenshot /tmp/settings-proof.png
27
+ agent-device close
28
+ ```
29
+
30
+ ## Structural verification with diff snapshot
31
+
32
+ Use `diff snapshot` when you need a compact view of how the UI changed between nearby states.
33
+
34
+ ```bash
35
+ agent-device snapshot -i
36
+ agent-device press @e5
37
+ agent-device diff snapshot -i
38
+ ```
39
+
40
+ - Initialize the baseline at a stable point.
41
+ - Perform the mutation.
42
+ - Run `diff snapshot` to confirm the expected structural change.
43
+ - Re-run full `snapshot` only when you need fresh refs.
44
+
45
+ ## Visual artifacts
46
+
47
+ Use `screenshot` when the proof needs a rendered image instead of a structural tree.
48
+
49
+ ## Session recording
50
+
51
+ Use `record` for debugging, documentation, or shareable verification artifacts.
52
+
53
+ ```bash
54
+ agent-device record start ./recordings/ios.mov
55
+ agent-device open App
56
+ agent-device snapshot -i
57
+ agent-device press @e3
58
+ agent-device close
59
+ agent-device record stop
60
+ ```
61
+
62
+ - `record` supports iOS simulators, iOS devices, and Android.
63
+ - On iOS, recording is a wrapper around `simctl` for simulators and the corresponding device capture path for physical devices.
64
+ - On Android, recording is a wrapper around `adb`.
65
+ - Recording writes a video artifact and a gesture-telemetry sidecar JSON.
66
+ - On macOS hosts, touch overlay burn-in is available for supported recordings.
67
+ - On non-macOS hosts, recording still succeeds but the video stays raw and `record stop` can return an `overlayWarning`.
68
+ - If the agent already knows the interaction sequence and wants a more lifelike, uninterrupted recording, drive the flow with `batch` while recording instead of replanning between each step.
69
+
70
+ Example:
71
+
72
+ ```bash
73
+ agent-device record start ./recordings/smoke.mov
74
+ agent-device batch --session sim --platform ios --steps-file /tmp/smoke-steps.json --json
75
+ agent-device record stop
76
+ ```
77
+
78
+ - Use this only after exploration has stabilized the flow.
79
+ - Keep the batch short and add `wait` or `is exists` guards after mutating steps so the recorded flow still tracks realistic UI timing.
80
+
81
+ ## Replay maintenance
82
+
83
+ Use replay updates when selectors drift but the recorded scenario is still correct.
84
+
85
+ ```bash
86
+ agent-device replay -u ./session.ad
87
+ ```
88
+
89
+ - Prefer selector-based actions in recorded `.ad` replays.
90
+ - Use update mode for maintenance, not as a substitute for fixing a broken interaction strategy.
91
+
92
+ ## Performance checks
93
+
94
+ Use `perf --json` or `metrics --json` when you need startup timing for the active session.
95
+
96
+ ```bash
97
+ agent-device open Settings --platform ios
98
+ agent-device perf --json
99
+ ```
100
+
101
+ - Current startup data is command round-trip timing around `open`.
102
+ - It is not true first-frame or first-interactive telemetry.
103
+ - `fps`, `memory`, and `cpu` are currently placeholders.
package/dist/src/274.js DELETED
@@ -1 +0,0 @@
1
- import{AppError as e}from"./331.js";let t="<wifi|airplane|location> <on|off>",r="appearance <light|dark|toggle>",a="faceid <match|nonmatch|enroll|unenroll>",i="touchid <match|nonmatch|enroll|unenroll>",n="fingerprint <match|nonmatch>",s="permission <grant|deny|reset> <camera|microphone|photos|contacts|contacts-limited|notifications|calendar|location|location-always|media-library|motion|reminders|siri> [full|limited]",o=`settings ${t} | settings ${r} | settings ${a} | settings ${i} | settings ${n} | settings ${s}`,l=`settings requires ${t}, ${r}, ${a}, ${i}, ${n}, or ${s}`;function c(e){let t=[],r=[];for(let a of e){let e=a.depth??0;for(;t.length>0&&e<=t[t.length-1];)t.pop();let i=a.label?.trim()||a.value?.trim()||a.identifier?.trim()||"",n=d(a.type??"Element"),s="group"===n&&!i;s&&t.push(e);let o=s?e:Math.max(0,e-t.length);r.push({node:a,depth:o,type:n,text:u(a,o,s,n)})}return r}function u(e,t,r,a){let i=a??d(e.type??"Element"),n=m(e,i),s=" ".repeat(t),o=e.ref?`@${e.ref}`:"",l=[!1===e.enabled?"disabled":null].filter(Boolean).join(", "),c=l?` [${l}]`:"",u=n?` "${n}"`:"";return r?`${s}${o} [${i}]${c}`.trimEnd():`${s}${o} [${i}]${u}${c}`.trimEnd()}function m(e,t){var r,a;let i=e.label?.trim(),n=e.value?.trim();if("text-field"===(r=t)||"text-view"===r||"search"===r){if(n)return n;if(i)return i}else if(i)return i;if(n)return n;let s=e.identifier?.trim();return!s||(a=s,/^[\w.]+:id\/[\w.-]+$/i.test(a)&&("group"===t||"image"===t||"list"===t||"collection"===t))?"":s}function d(e){let 
t=e.replace(/XCUIElementType/gi,"").toLowerCase(),r=e.includes(".")&&(e.startsWith("android.")||e.startsWith("androidx.")||e.startsWith("com."));switch(t.includes(".")&&(t=t.replace(/^android\.widget\./,"").replace(/^android\.view\./,"").replace(/^android\.webkit\./,"").replace(/^androidx\./,"").replace(/^com\.google\.android\./,"").replace(/^com\.android\./,"")),t){case"application":return"application";case"navigationbar":return"navigation-bar";case"tabbar":return"tab-bar";case"button":case"imagebutton":return"button";case"link":return"link";case"cell":return"cell";case"statictext":case"checkedtextview":return"text";case"textfield":case"edittext":return"text-field";case"textview":return r?"text":"text-view";case"textarea":return"text-view";case"switch":return"switch";case"slider":return"slider";case"image":case"imageview":return"image";case"webview":return"webview";case"framelayout":case"linearlayout":case"relativelayout":case"constraintlayout":case"viewgroup":case"view":case"group":return"group";case"listview":case"recyclerview":return"list";case"collectionview":return"collection";case"searchfield":return"search";case"segmentedcontrol":return"segmented-control";case"window":return"window";case"checkbox":return"checkbox";case"radio":return"radio";case"menuitem":return"menu-item";case"toolbar":return"toolbar";case"scrollarea":case"scrollview":case"nestedscrollview":return"scroll-area";case"table":return"table";default:return t||"element"}}let p=100,h=new Set(["batch","replay"]);function f(t){let r;try{r=JSON.parse(t)}catch{throw new e("INVALID_ARGS","Batch steps must be valid JSON.")}if(!Array.isArray(r)||0===r.length)throw new e("INVALID_ARGS","Batch steps must be a non-empty JSON array.");return r}function w(t,r){if(!Array.isArray(t)||0===t.length)throw new e("INVALID_ARGS","batch requires a non-empty batchSteps array.");if(t.length>r)throw new e("INVALID_ARGS",`batch has ${t.length} steps; max allowed is ${r}.`);let a=[];for(let r=0;r<t.length;r+=1){let 
i=t[r];if(!i||"object"!=typeof i)throw new e("INVALID_ARGS",`Invalid batch step at index ${r}.`);let n="string"==typeof i.command?i.command.trim().toLowerCase():"";if(!n)throw new e("INVALID_ARGS",`Batch step ${r+1} requires command.`);if(h.has(n))throw new e("INVALID_ARGS",`Batch step ${r+1} cannot run ${n}.`);if(void 0!==i.positionals&&!Array.isArray(i.positionals))throw new e("INVALID_ARGS",`Batch step ${r+1} positionals must be an array.`);let s=i.positionals??[];if(s.some(e=>"string"!=typeof e))throw new e("INVALID_ARGS",`Batch step ${r+1} positionals must contain only strings.`);if(void 0!==i.flags&&("object"!=typeof i.flags||Array.isArray(i.flags)||!i.flags))throw new e("INVALID_ARGS",`Batch step ${r+1} flags must be an object.`);if(void 0!==i.runtime&&("object"!=typeof i.runtime||Array.isArray(i.runtime)||!i.runtime))throw new e("INVALID_ARGS",`Batch step ${r+1} runtime must be an object.`);a.push({command:n,positionals:s,flags:i.flags??{},runtime:i.runtime})}return a}export{TextDecoder,styleText}from"node:util";export{p as DEFAULT_BATCH_MAX_STEPS,l as SETTINGS_INVALID_ARGS_MESSAGE,o as SETTINGS_USAGE_OVERRIDE,c as buildSnapshotDisplayLines,m as displayLabel,d as formatRole,u as formatSnapshotLine,f as parseBatchStepsJson,w as validateAndNormalizeBatchSteps};
package/dist/src/daemon/handlers/interaction-fill.d.ts DELETED
@@ -1,3 +0,0 @@
1
- import type { DaemonResponse } from '../types.ts';
2
- import type { InteractionHandlerParams } from './interaction-common.ts';
3
- export declare function handleFillCommand(params: InteractionHandlerParams): Promise<DaemonResponse>;
package/dist/src/daemon/handlers/interaction-press.d.ts DELETED
@@ -1,3 +0,0 @@
1
- import type { DaemonResponse } from '../types.ts';
2
- import type { InteractionHandlerParams } from './interaction-common.ts';
3
- export declare function handlePressCommand(params: InteractionHandlerParams): Promise<DaemonResponse>;
package/skills/agent-device/references/batching.md DELETED
@@ -1,79 +0,0 @@
1
- # Batching
2
-
3
- ## When to use batch
4
-
5
- - The agent already knows a short sequence of commands.
6
- - Steps belong to one logical screen flow.
7
- - You want one result object with per-step timing and failure context.
8
-
9
- ## When not to use batch
10
-
11
- - Flows are unrelated and should be retried independently.
12
- - The workflow is highly dynamic and requires replanning after each step.
13
- - You need human approvals between steps.
14
-
15
- ## CLI patterns
16
-
17
- From file:
18
-
19
- ```bash
20
- agent-device batch --session sim --platform ios --steps-file /tmp/batch-steps.json --json
21
- ```
22
-
23
- Inline (small payloads only):
24
-
25
- ```bash
26
- agent-device batch --steps '[{"command":"open","positionals":["settings"]}]'
27
- ```
28
-
29
- ## Step payload contract
30
-
31
- ```json
32
- [
33
- { "command": "open", "positionals": ["settings"], "flags": {} },
34
- { "command": "wait", "positionals": ["label=\"Privacy & Security\"", "3000"], "flags": {} },
35
- { "command": "click", "positionals": ["label=\"Privacy & Security\""], "flags": {} },
36
- { "command": "get", "positionals": ["text", "label=\"Tracking\""], "flags": {} }
37
- ]
38
- ```
39
-
40
- Rules:
41
-
42
- - `positionals` optional, defaults to `[]`.
43
- - `flags` optional, defaults to `{}`.
44
- - nested `batch` and `replay` are rejected.
45
- - stop-on-first-error is the supported mode (`--on-error stop`).
46
-
47
- ## Response handling
48
-
49
- Success includes:
50
-
51
- - `total`, `executed`, `totalDurationMs`
52
- - `results[]` entries with `step`, `command`, `durationMs`, and optional `data`
53
-
54
- Failure includes:
55
-
56
- - `details.step`
57
- - `details.command`
58
- - `details.executed`
59
- - `details.partialResults`
60
-
61
- Use these fields to replan from the first failing step.
62
-
63
- ## Common error categories and agent actions
64
-
65
- - `INVALID_ARGS`: payload/step shape issue; fix payload and retry.
66
- - `SESSION_NOT_FOUND`: open or select the correct session, then retry.
67
- - `UNSUPPORTED_OPERATION`: switch command/target to supported operation.
68
- - `AMBIGUOUS_MATCH`: refine selector/locator, then retry failed step.
69
- - `COMMAND_FAILED`: add sync guard (`wait`, `is exists`) and retry from failed step.
70
-
71
- ## Reliability guardrails
72
-
73
- - Add sync guards after mutating steps.
74
- - Assume snapshot/ref drift after navigation.
75
- - Keep batch size moderate (about 5-20 steps).
76
- - Split long workflows into phases:
77
- 1. navigate
78
- 2. verify/extract
79
- 3. cleanup
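The step payload rules in the batching reference above (optional `positionals` and `flags`, nested `batch` and `replay` rejected) can be sketched as a small pre-flight check. This is a minimal TypeScript sketch, not the package's implementation: `normalizeBatchSteps` and `BatchStep` are hypothetical names, and the 100-step cap assumes the `DEFAULT_BATCH_MAX_STEPS` value visible in the deleted `274.js` bundle.

```typescript
// Hypothetical pre-flight check for a batch payload; names are illustrative only.
interface BatchStep {
  command: string;
  positionals: string[];
  flags: Record<string, unknown>;
}

// Commands the batching reference says cannot be nested inside a batch.
const NESTED_COMMANDS_REJECTED = new Set<string>(["batch", "replay"]);

function normalizeBatchSteps(raw: unknown, maxSteps: number = 100): BatchStep[] {
  if (!Array.isArray(raw) || raw.length === 0) {
    throw new Error("Batch steps must be a non-empty JSON array.");
  }
  if (raw.length > maxSteps) {
    throw new Error(`batch has ${raw.length} steps; max allowed is ${maxSteps}.`);
  }
  return raw.map((entry: unknown, index: number): BatchStep => {
    if (typeof entry !== "object" || entry === null) {
      throw new Error(`Invalid batch step at index ${index}.`);
    }
    const step = entry as { command?: unknown; positionals?: unknown; flags?: unknown };
    const command = typeof step.command === "string" ? step.command.trim().toLowerCase() : "";
    if (command === "") {
      throw new Error(`Batch step ${index + 1} requires command.`);
    }
    if (NESTED_COMMANDS_REJECTED.has(command)) {
      throw new Error(`Batch step ${index + 1} cannot run ${command}.`);
    }
    // `positionals` is optional and defaults to [].
    const positionals = step.positionals ?? [];
    if (!Array.isArray(positionals) || positionals.some((p) => typeof p !== "string")) {
      throw new Error(`Batch step ${index + 1} positionals must be an array of strings.`);
    }
    // `flags` is optional and defaults to {}.
    const flags = step.flags ?? {};
    if (typeof flags !== "object" || flags === null || Array.isArray(flags)) {
      throw new Error(`Batch step ${index + 1} flags must be an object.`);
    }
    return {
      command,
      positionals: positionals as string[],
      flags: flags as Record<string, unknown>,
    };
  });
}
```

A check of this kind, run before handing a steps file to `batch`, would surface payload-shape problems locally; the real CLI reports them as `INVALID_ARGS`.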
package/skills/agent-device/references/logs-and-debug.md DELETED
@@ -1,113 +0,0 @@
1
- # Logs (Token-Efficient Debugging)
2
-
3
- Logging is off by default in normal flows. Enable it on demand for debugging windows. App output is written to a session-scoped file so agents can grep it instead of loading full logs into context.
4
- `network dump` parses recent HTTP(s) entries from this same session app log file.
5
-
6
- ## Data Handling
7
-
8
- - Default app logs are stored under `~/.agent-device/sessions/<session>/app.log`.
9
- - Replay scripts saved with `--save-script` are written to the explicit path you provide.
10
- - Log files may contain sensitive runtime data; review before sharing and clean up when finished.
11
- - Use `AGENT_DEVICE_APP_LOG_REDACT_PATTERNS` to redact sensitive patterns at write time when needed.
12
-
13
- ## Retention and Cleanup
14
-
15
- - Keep logging scoped to active debug windows (`logs clear --restart` before repro, `logs stop` after repro).
16
- - Prefer bounded inspection (`grep -n`, `tail -50`) instead of reading full logs into context.
17
- - Clear session logs when finished:
18
- `agent-device logs clear`
19
- - Close session to stop background logging state:
20
- `agent-device close`
21
-
22
- ## Quick Flow
23
-
24
- ```bash
25
- agent-device open MyApp --platform ios # or --platform android
26
- agent-device logs clear --restart # Preferred: stop stream, clear logs, and start streaming again
27
- agent-device network dump 25 # Parse latest HTTP(s) requests (method/url/status) from app.log
28
- agent-device logs path # Print path, e.g. ~/.agent-device/sessions/default/app.log
29
- agent-device logs doctor # Check tool/runtime readiness for current session/device
30
- agent-device logs mark "before tap" # Insert a timeline marker into app.log
31
- # ... run flows; on failure, grep the path (see below)
32
- agent-device logs stop # Stop streaming (optional; close also stops)
33
- ```
34
-
35
- Precondition: `logs clear --restart` requires an active app session (`open <app>` first).
36
-
37
- ## Command Notes
38
-
39
- - `logs path`: returns log file path and metadata (`active`, `state`, `backend`, size, timestamps).
40
- - `logs start`: starts streaming; requires an active app session (`open` first). Supported on iOS simulator, iOS device, and Android.
41
- - `logs stop`: stops streaming. Session `close` also stops logging.
42
- - `logs clear`: truncates `app.log` and removes rotated `app.log.N` files. Requires logging to be stopped first.
43
- - `logs clear --restart`: convenience reset for repro loops (stop stream, clear files, restart stream).
44
- - `logs doctor`: reports backend/tool checks and readiness notes for troubleshooting.
45
- - `logs mark`: writes a timestamped marker line to the session log.
46
- - `network dump [limit] [summary|headers|body|all]`: parses recent HTTP(s) lines from the session app log and returns request summaries.
47
- - `network log ...`: alias for `network dump`.
48
-
49
- ## Behavior and Limits
50
-
51
- - `logs start` appends to `app.log` and rotates to `app.log.1` when `app.log` exceeds 5 MB.
52
- - `network dump` scans the last 4000 app-log lines, returns up to 200 entries, and truncates payload/header fields at 2048 characters.
53
- - Android log streaming automatically rebinds to the app PID after process restarts.
54
- - iOS log capture relies on Unified Logging signals (for example `os_log`); plain stdout/stderr output may be limited depending on app/runtime.
55
- - Retention knobs:
56
- - `AGENT_DEVICE_APP_LOG_MAX_BYTES`
57
- - `AGENT_DEVICE_APP_LOG_MAX_FILES`
58
- - Optional write-time redaction patterns:
59
- - `AGENT_DEVICE_APP_LOG_REDACT_PATTERNS` (comma-separated regex)
60
-
61
- ## Grep Patterns
62
-
63
- After getting the path from `logs path`, run `grep` (or `grep -E`) so only matching lines enter context.
64
-
65
- ```bash
66
- # Get path first, then grep it; -n adds line numbers
67
- grep -n "Error\|Exception\|Fatal" <path>
68
- grep -n -E "Error|Exception|Fatal|crash" <path>
69
-
70
- # Bounded context: last N lines only
71
- tail -50 <path>
72
- ```
73
-
74
- - Use `-n` for line numbers.
75
- - Use `-E` for extended regex so `|` in the pattern does not need escaping.
76
- - Prefer targeted patterns (e.g. `Error`, `Exception`, or app-specific tags) over reading the full file.
77
-
78
- ## Crash Triage Fast Path
79
-
80
- Always start from the session app log, then branch by platform.
81
-
82
- ```bash
83
- agent-device logs path
84
- grep -n -E "SIGABRT|SIGSEGV|EXC_|fatal|exception|terminated|killed|jetsam|memorystatus|FATAL EXCEPTION|Abort message" <path>
85
- nl -ba <path> | sed -n '<start>,<end>p'
86
- ```
87
-
88
- ### iOS
89
-
90
- ```bash
91
- # If log shows ReportCrash / SIGABRT / EXC_*, inspect simulator DiagnosticReports:
92
- ls -lt ~/Library/Logs/DiagnosticReports | grep -E "<AppName>|<BundleId>" | head
93
- ```
94
-
95
- - `SIGABRT`: app/runtime abort; inspect `.ips` triggered thread and top frames.
96
- - `SIGKILL` + jetsam/memorystatus markers: memory-pressure kill.
97
- - `EXC_BAD_ACCESS`/`SIGSEGV`: native memory access issue.
98
-
99
- ### Android
100
-
101
- ```bash
102
- # Capture fatal crash lines around app process death:
103
- adb -s <serial> logcat -d | grep -n -E "FATAL EXCEPTION|Process: <package>|Abort message|signal [0-9]+ \\(SIG"
104
- ```
105
-
106
- - `FATAL EXCEPTION` with Java stack: uncaught Java/Kotlin exception.
107
- - `signal 6 (SIGABRT)` or `signal 11 (SIGSEGV)` with tombstone refs: native crash path (NDK/JNI/runtime).
108
- - `Low memory killer` / `Killing <pid>` entries: OS memory-pressure/process reclaim.
109
-
110
- ## Stop Conditions
111
-
112
- - If no crash signature appears in app log, switch to platform-native crash sources (`.ips` on iOS, logcat/tombstone flow on Android).
113
- - If signatures are present and root cause class is identified (abort, native fault, memory pressure), stop collecting broad logs and focus on reproducing the specific path.
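The crash triage branches in the logs reference above map log signatures to a root-cause class. A minimal TypeScript sketch of that mapping follows. `classifyCrashLine`, `triageLog`, and the `CrashClass` labels are hypothetical, and the patterns cover only signatures named in the reference (the `Killing <pid>` form is left out because its exact format is not shown), so this is not an exhaustive detector.

```typescript
// Hypothetical classifier built from the signatures named in the logs reference.
type CrashClass = "memory-pressure" | "native-fault" | "abort" | "java-exception";

// Order matters: the first matching class wins.
const CRASH_SIGNATURES: ReadonlyArray<[CrashClass, RegExp]> = [
  ["memory-pressure", /jetsam|memorystatus|Low memory killer/i],
  ["native-fault", /EXC_BAD_ACCESS|SIGSEGV/],
  ["abort", /SIGABRT|Abort message/],
  ["java-exception", /FATAL EXCEPTION/],
];

function classifyCrashLine(line: string): CrashClass | null {
  for (const [crashClass, pattern] of CRASH_SIGNATURES) {
    if (pattern.test(line)) {
      return crashClass;
    }
  }
  return null;
}

// Returns 1-based line numbers (as `grep -n` would) with their class; unmatched lines are skipped.
function triageLog(logText: string): Array<{ lineNumber: number; crashClass: CrashClass }> {
  const hits: Array<{ lineNumber: number; crashClass: CrashClass }> = [];
  logText.split("\n").forEach((line, index) => {
    const crashClass = classifyCrashLine(line);
    if (crashClass !== null) {
      hits.push({ lineNumber: index + 1, crashClass });
    }
  });
  return hits;
}
```

Once a class is identified, the stop conditions in the reference apply: stop collecting broad logs and reproduce the specific path.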
package/skills/agent-device/references/perf-metrics.md DELETED
@@ -1,53 +0,0 @@
1
- # Performance Metrics (`perf` / `metrics`)
2
-
3
- Use this reference when you need to measure launch performance in agent workflows.
4
-
5
- ## Quick flow
6
-
7
- ```bash
8
- agent-device open Settings --platform ios
9
- agent-device perf --json
10
- ```
11
-
12
- Alias:
13
-
14
- ```bash
15
- agent-device metrics --json
16
- ```
17
-
18
- ## What is measured today
19
-
20
- - Session-scoped `startup` timing only.
21
- - Sampling method: `open-command-roundtrip`.
22
- - Unit: milliseconds (`ms`).
23
- - Source: elapsed wall-clock time around each session `open` command dispatch for the active app target.
24
-
25
- ## Output fields to use
26
-
27
- - `metrics.startup.lastDurationMs`: most recent startup sample.
28
- - `metrics.startup.lastMeasuredAt`: ISO timestamp of most recent sample.
29
- - `metrics.startup.sampleCount`: number of retained samples.
30
- - `metrics.startup.samples[]`: recent startup history for the current session.
31
- - `sampling.startup.method`: current sampling method identifier.
32
-
33
- ## Platform support (current)
34
-
35
- - iOS simulator: supported for startup sampling.
36
- - iOS physical device: supported for startup sampling.
37
- - Android emulator/device: supported for startup sampling.
38
- - `fps`, `memory`, and `cpu`: currently placeholders (`available: false`).
39
-
40
- ## Interpretation guidance
41
-
42
- - Treat startup values as command round-trip timing, not true app first-frame or first-interactive telemetry.
43
- - Compare like-for-like runs:
44
- - same device target
45
- - same app build
46
- - same workflow/session steps
47
- - Use multiple runs and compare trend/median, not one-off samples.
48
-
49
- ## Common pitfalls
50
-
51
- - Running `perf` before any `open` in the session yields no startup sample yet.
52
- - Comparing values across different devices/runtimes introduces large noise.
53
- - Interpreting current `startup` as CPU/FPS/memory would be incorrect.
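The interpretation guidance in the perf reference above recommends comparing the trend or median across runs rather than one-off samples. A minimal TypeScript sketch of that comparison follows. `medianMs` and `startupRegressed` are hypothetical helpers, and the 20% tolerance is an arbitrary illustration. The inputs are plain millisecond durations that the caller has already extracted from `perf --json` output; the element shape of `metrics.startup.samples[]` is not documented here, so the sketch does not assume one.

```typescript
// Hypothetical helpers; durations are plain millisecond numbers supplied by the caller.
function medianMs(durationsMs: readonly number[]): number {
  if (durationsMs.length === 0) {
    throw new Error("No startup samples yet; run `open` before `perf`.");
  }
  // Copy before sorting so the caller's array is not mutated.
  const sorted = durationsMs.slice().sort((a, b) => a - b);
  const middle = Math.floor(sorted.length / 2);
  // Even-length input: average the two middle values.
  return sorted.length % 2 === 0 ? (sorted[middle - 1] + sorted[middle]) / 2 : sorted[middle];
}

// Flags a regression when the candidate median exceeds the baseline median by more than `tolerance`.
function startupRegressed(
  baselineMs: readonly number[],
  candidateMs: readonly number[],
  tolerance: number = 0.2,
): boolean {
  return medianMs(candidateMs) > medianMs(baselineMs) * (1 + tolerance);
}
```

Both sample sets should come from like-for-like runs (same device target, app build, and workflow), as the guidance requires.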
package/skills/agent-device/references/permissions.md DELETED
@@ -1,70 +0,0 @@
1
- # Permissions and Setup
2
-
3
- ## iOS snapshots
4
-
5
- iOS snapshots use XCTest and do not require macOS Accessibility permissions.
6
-
7
- ## iOS physical device runner
8
-
9
- For iOS physical devices, XCTest runner setup requires valid signing/provisioning.
10
- Use Automatic Signing in Xcode, or provide optional overrides:
11
-
12
- - `AGENT_DEVICE_IOS_TEAM_ID`
13
- - `AGENT_DEVICE_IOS_SIGNING_IDENTITY`
14
- - `AGENT_DEVICE_IOS_PROVISIONING_PROFILE`
15
- - `AGENT_DEVICE_IOS_BUNDLE_ID` (optional runner bundle-id base override)
16
-
17
- Free Apple Developer (Personal Team) accounts may reject generic bundle IDs as unavailable.
18
- Set `AGENT_DEVICE_IOS_BUNDLE_ID` to a unique reverse-DNS identifier when that happens.
19
-
20
- Security guidance for these overrides:
21
-
22
- - These variables are optional and only needed for physical-device XCTest setup.
23
- - Treat values as sensitive host configuration; do not share in chat logs or commit to source control.
24
- - Do not provide private keys or unrelated secrets; use the minimum values required for signing.
25
- - Prefer Xcode Automatic Signing when possible to reduce manual secret/config handling.
26
- - For autonomous/CI runs, keep these unset by default and require explicit opt-in for physical-device workflows.
27
-
28
- If setup/build takes long, increase:
29
-
30
- - `AGENT_DEVICE_DAEMON_TIMEOUT_MS` (default `90000`, for example `120000`)
31
-
32
- If daemon startup fails with stale metadata hints, clean stale files and retry:
33
-
34
- - `~/.agent-device/daemon.json`
35
- - `~/.agent-device/daemon.lock`
36
-
37
- ## iOS permission alerts (simulator only)
38
-
39
- iOS apps trigger system permission dialogs (camera, location, notifications, etc.) on first use.
40
- Use `alert` to handle them without tapping coordinates:
41
-
42
- ```bash
43
- agent-device alert wait # block until an alert appears (default 10 s timeout)
44
- agent-device alert accept # accept the frontmost alert
45
- agent-device alert dismiss # dismiss the frontmost alert
46
- agent-device alert get # read alert title/message without acting
47
- ```
48
-
49
- **Timing note:** `alert accept` and `alert dismiss` include a built-in 2 s retry window.
50
- If the alert is present in the UI hierarchy but not yet interactive, the command retries every 300 ms
51
- rather than failing immediately. You do not need to add manual sleeps between triggering the alert
52
- and accepting it.
53
-
54
- **Preferred pattern for clean simulator sessions:**
55
-
56
- ```bash
57
- agent-device open MyApp --platform ios
58
- agent-device alert wait 5000 # wait up to 5 s for the permission prompt
59
- agent-device alert accept # accept; retries internally if not yet actionable
60
- ```
61
-
62
- `alert` is only supported on iOS simulators; iOS physical devices are not supported.
63
-
64
- ## iOS: "Allow Paste" dialog
65
-
66
- iOS 16+ shows an "Allow Paste" prompt when an app reads the system pasteboard. Under XCUITest (which `agent-device` uses), this prompt is suppressed by the testing runtime. Use `xcrun simctl pbcopy booted` to set clipboard content directly on the simulator instead.
67
-
68
- ## Simulator troubleshooting
69
-
70
- - If snapshots return 0 nodes, restart Simulator and re-open the app.
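The timing note for `alert accept` and `alert dismiss` in the permissions reference above describes a 2 s retry window with a retry every 300 ms. A minimal TypeScript sketch of how such a window bounds the number of attempts follows. `simulateRetryWindow` is a hypothetical helper that advances a virtual clock instead of sleeping; whether the real command schedules its attempts exactly this way is an assumption.

```typescript
// Hypothetical model of a bounded retry window; uses a virtual clock, not real sleeps.
interface RetryOutcome {
  ok: boolean;
  attempts: number;
  elapsedMs: number;
}

function simulateRetryWindow(
  attempt: () => boolean,
  windowMs: number = 2000,
  intervalMs: number = 300,
): RetryOutcome {
  if (intervalMs <= 0) {
    throw new Error("intervalMs must be positive.");
  }
  let attempts = 0;
  // Each failed attempt advances the virtual clock by one interval.
  for (let elapsedMs = 0; elapsedMs <= windowMs; elapsedMs += intervalMs) {
    attempts += 1;
    if (attempt()) {
      return { ok: true, attempts, elapsedMs };
    }
  }
  // Window exhausted without a successful attempt.
  return { ok: false, attempts, elapsedMs: windowMs };
}
```

Under these assumptions the defaults allow at most seven attempts (at 0, 300, ..., 1800 ms), which is why a manual sleep between triggering the alert and accepting it is not needed.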
package/skills/agent-device/references/session-management.md DELETED
@@ -1,101 +0,0 @@
1
- # Session Management
2
-
3
- ## Named sessions
4
-
5
- ```bash
6
- agent-device --session auth open Settings --platform ios
7
- agent-device --session auth snapshot -i
8
- ```
9
-
10
- Sessions isolate device context. A device can only be held by one session at a time.
11
-
12
- ## Best practices
13
-
14
- - Name sessions semantically.
15
- - Close sessions when done.
16
- - Use separate sessions for parallel work.
17
- - For orchestrated QA runs, prefer a pre-bound session/platform over repeating per-command selectors.
18
- - For remote tenant-scoped automation, run commands with:
19
- `--tenant <id> --session-isolation tenant --run-id <id> --lease-id <id>`
20
- - In iOS sessions, use `open <app>`. `open <url>` opens deep links; on devices `http(s)://` opens Safari when no app is active, and custom schemes require an active app in the session.
21
- - In iOS sessions, `open <app> <url>` opens a deep link.
22
- - On iOS, `appstate` is session-scoped and requires a matching active session on the target device.
23
- - For dev loops where runtime state can persist (for example React Native Fast Refresh), use `open <app> --relaunch` to restart the app process in the same session.
24
- - Use `--save-script [path]` to record replay scripts on `close`; path is a file path and parent directories are created automatically.
25
- - Use `close --shutdown` on iOS simulators or Android emulators to shut down the target as part of session teardown, preventing resource leakage in multi-tenant or CI workloads.
26
- - For ambiguous bare `--save-script` values, prefer `--save-script=workflow.ad` or `./workflow.ad`.
27
- - For deterministic replay scripts, prefer selector-based actions and assertions.
28
- - Use `replay -u` to update selector drift during maintenance.
29
-
30
- ## Session-bound automation
31
-
32
- Use this when an external orchestrator must keep every CLI call on the same session/device without a wrapper script.
33
-
34
- ```bash
35
- export AGENT_DEVICE_SESSION=qa-ios
36
- export AGENT_DEVICE_PLATFORM=ios
37
- export AGENT_DEVICE_SESSION_LOCK=strip
38
-
39
- agent-device open MyApp --relaunch
40
- agent-device snapshot -i
41
- agent-device press @e3
42
- agent-device close
43
- ```
44
-
45
- - `AGENT_DEVICE_SESSION` and `AGENT_DEVICE_PLATFORM` provide the default binding when `--session` and `--platform` are omitted.
46
- - A configured `AGENT_DEVICE_SESSION` enables lock policy enforcement by convention. The default mode is `reject`.
47
- - `--session-lock reject|strip` or `AGENT_DEVICE_SESSION_LOCK=reject|strip` controls whether conflicting selectors fail or are ignored.
48
- - The daemon enforces the same lock policy for CLI requests, typed client calls, and direct RPC commands.
49
- - Conflicts include explicit retargeting selectors such as `--platform`, `--target`, `--device`, `--udid`, `--serial`, `--ios-simulator-device-set`, and `--android-device-allowlist`.
50
- - `--session-locked`, `--session-lock-conflicts`, `AGENT_DEVICE_SESSION_LOCKED`, and `AGENT_DEVICE_SESSION_LOCK_CONFLICTS` remain supported as compatibility aliases.
51
- - Lock policy applies to nested `batch` steps too. If a step omits `platform`, it still inherits the parent batch `--platform` instead of being silently replaced by an environment default.
52
-
53
- Android emulator variant:
54
-
55
- ```bash
56
- export AGENT_DEVICE_SESSION=qa-android
57
- export AGENT_DEVICE_PLATFORM=android
58
-
59
- agent-device reinstall MyApp /path/to/app-debug.apk --serial emulator-5554
60
- agent-device --session-lock reject open com.example.myapp --relaunch
61
- agent-device snapshot -i
62
- agent-device close --shutdown
63
- ```
64
-
65
- ## Scoped device isolation
66
-
67
- Use scoped discovery when sessions must not see host-global device lists.
68
-
69
- ```bash
70
- agent-device devices --platform ios --ios-simulator-device-set /tmp/tenant-a/simulators
71
- agent-device devices --platform android --android-device-allowlist emulator-5554,device-1234
72
- ```
73
-
74
- - Scope is applied before selectors (`--device`, `--udid`, `--serial`).
75
- - If the selector target is outside the scope, resolution fails with `DEVICE_NOT_FOUND`.
76
- - If the scoped iOS simulator set is empty (first-run), the error includes the set path and a suggested `xcrun simctl --set <path> create ...` command.
77
- - With iOS simulator-set scope enabled, iOS physical devices are not enumerated.
78
- - Environment equivalents:
79
- - `AGENT_DEVICE_IOS_SIMULATOR_DEVICE_SET` (compat: `IOS_SIMULATOR_DEVICE_SET`)
80
- - `AGENT_DEVICE_ANDROID_DEVICE_ALLOWLIST` (compat: `ANDROID_DEVICE_ALLOWLIST`)
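
The "scope before selectors" rule can be pictured as a membership check against the allowlist. The sketch below is a hypothetical model, not the tool's code: `resolve_in_scope` is an invented helper, and the only details taken from the source are the comma-separated allowlist format and the `DEVICE_NOT_FOUND` failure.

```bash
# Illustrative model only; real resolution happens inside agent-device.
# Usage: resolve_in_scope <comma-separated allowlist> <requested serial>
# Prints the serial when it is inside the scope, otherwise reports
# DEVICE_NOT_FOUND on stderr and returns non-zero.
resolve_in_scope() {
  case ",$1," in
    *",$2,"*) printf '%s\n' "$2" ;;
    *) printf 'DEVICE_NOT_FOUND\n' >&2; return 1 ;;
  esac
}
```

For example, `resolve_in_scope emulator-5554,device-1234 emulator-5556` fails even if `emulator-5556` is attached to the host, because the scope is applied before the selector is matched.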
81
-
82
- ## Listing sessions
83
-
84
- ```bash
85
- agent-device session list
86
- ```
87
-
88
- iOS session entries include `device_udid` and `ios_simulator_device_set` (null when using the default set). Use these fields to confirm device routing in concurrent multi-session runs without additional `simctl` calls.
89
-
90
- ## Replay within sessions
91
-
92
- ```bash
93
- agent-device replay ./session.ad --session auth
94
- agent-device replay -u ./session.ad --session auth
95
- ```
96
-
97
- ## Tenant isolation note
98
-
99
- When session isolation is set to tenant mode, the session namespace is scoped as
100
- `<tenant>:<session>`. For remote runs, allocate and maintain an active lease
101
- for the same tenant/run scope before executing tenant-isolated commands.
@@ -1,102 +0,0 @@
1
- # Snapshot Refs and Selectors
2
-
3
- ## Purpose
4
-
5
- Refs are useful for discovery/debugging. For deterministic scripts, use selectors.
6
- For tap interactions, `press` is canonical; `click` is an equivalent alias.
7
- For host Mac desktop apps, pair this reference with [macos-desktop.md](macos-desktop.md) because context menus and native list/table structures need desktop-specific handling.
8
-
9
- ## Snapshot
10
-
11
- ```bash
12
- agent-device snapshot -i
13
- ```
14
-
15
- Output:
16
-
17
- ```
18
- Page: com.apple.Preferences
19
- App: com.apple.Preferences
20
-
21
- @e1 [ioscontentgroup]
22
- @e2 [button] "Camera"
23
- @e3 [button] "Privacy & Security"
24
- ```
25
-
26
- ## Using refs (discovery/debug)
27
-
28
- ```bash
29
- agent-device press @e2
30
- agent-device fill @e5 "test"
31
- ```
32
-
33
- On macOS, if actions live in a context menu, use:
34
-
35
- ```bash
36
- agent-device click @e5 --button secondary --platform macos
37
- agent-device snapshot -i
38
- ```
39
-
40
- ## Using selectors (deterministic)
41
-
42
- ```bash
43
- agent-device press 'id="camera_row" || label="Camera" role=button'
44
- agent-device fill 'id="search_input" editable=true' "test"
45
- agent-device is visible 'id="camera_settings_anchor"'
46
- ```
47
-
48
- ## Ref lifecycle
49
-
50
- Refs can become invalid when UI changes (navigation, modal, dynamic list updates).
51
- Re-snapshot after transitions if you keep using refs.
52
-
53
- ## Scope snapshots
54
-
55
- Use `-s` to scope to labels/identifiers. This reduces size and speeds up results:
56
-
57
- ```bash
58
- agent-device snapshot -i -s "Camera"
59
- agent-device snapshot -i -s @e3
60
- ```
61
-
62
- ## Diff snapshots (structural)
63
-
64
- Use `diff snapshot` when you need compact state-change visibility between nearby UI states:
65
-
66
- ```bash
67
- agent-device snapshot -i # First snapshot initializes baseline
68
- agent-device press @e5
69
- agent-device diff snapshot -i # Shows +/− structural lines vs prior snapshot
70
- ```
71
-
72
- Efficient pattern:
73
-
74
- - Initialize once at a stable point.
75
- - Mutate UI (`press`, `fill`, `swipe`).
76
- - Run `diff snapshot` after interactions to confirm the expected shape of the change with bounded output.
77
- - Re-run a full or scoped `snapshot` only when you need fresh refs to select the next step.
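
The pattern above can be wrapped in a small helper. This is a sketch under stated assumptions: `confirm_change` and the `AD` override variable are hypothetical names introduced here so the sequence can be dry-run with `AD=echo`; the `agent-device` subcommands are the ones shown in this section.

```bash
# AD is a hypothetical override for the CLI binary (e.g. AD=echo for a dry run).
AD="${AD:-agent-device}"

# Usage: confirm_change <ref or selector to press>
confirm_change() {
  "$AD" snapshot -i >/dev/null   # initialize the baseline at a stable point
  "$AD" press "$1"               # mutate the UI
  "$AD" diff snapshot -i         # bounded structural diff against the baseline
}
```

Calling `confirm_change @e5` runs the three steps in order; take a fresh `snapshot` afterwards only if the next step needs new refs.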
78
-
79
- ## Troubleshooting
80
-
81
- - Ref not found: re-snapshot.
82
- - If `snapshot` returns 0 nodes, foreground app state or accessibility state may have changed. Re-open the app or retry after state is stable.
83
- - On macOS, use `snapshot --raw --platform macos` to distinguish collector filtering from truly missing AX content.
84
-
85
- ## Stop Conditions
86
-
87
- - If refs are unstable after UI transitions, switch to selector-based targeting and stop investing in ref-only flows.
88
-
89
- ## find click response
90
-
91
- `find "<query>" click --json` returns deterministic matched-target metadata:
92
-
93
- ```json
94
- { "ref": "@e3", "locator": "any", "query": "Increment", "x": 195, "y": 422 }
95
- ```
96
-
97
- Fields come from the matched snapshot node, not the platform runner. Use them for observability and replay quality; they are stable across runs for the same UI state.
98
-
99
- ## Replay note
100
-
101
- - Prefer selector-based actions in recorded `.ad` replays.
102
- - Use `agent-device replay -u <path>` to update selector drift and rewrite replay scripts in place.