claude-in-mobile 3.7.0 → 3.8.1

Files changed (77)
  1. package/README.md +401 -263
  2. package/dist/adapters/android-adapter.d.ts.map +1 -1
  3. package/dist/adapters/android-adapter.js +8 -1
  4. package/dist/adapters/android-adapter.js.map +1 -1
  5. package/dist/adapters/desktop-adapter.d.ts +2 -2
  6. package/dist/adapters/desktop-adapter.d.ts.map +1 -1
  7. package/dist/adapters/desktop-adapter.js.map +1 -1
  8. package/dist/adapters/ios-adapter.d.ts.map +1 -1
  9. package/dist/adapters/ios-adapter.js +5 -2
  10. package/dist/adapters/ios-adapter.js.map +1 -1
  11. package/dist/adb/client.d.ts.map +1 -1
  12. package/dist/adb/client.js +12 -6
  13. package/dist/adb/client.js.map +1 -1
  14. package/dist/adb/resolver.d.ts +24 -0
  15. package/dist/adb/resolver.d.ts.map +1 -0
  16. package/dist/adb/resolver.js +130 -0
  17. package/dist/adb/resolver.js.map +1 -0
  18. package/dist/adb/ui-parser.d.ts +14 -1
  19. package/dist/adb/ui-parser.d.ts.map +1 -1
  20. package/dist/adb/ui-parser.js +47 -1
  21. package/dist/adb/ui-parser.js.map +1 -1
  22. package/dist/desktop/client.d.ts +13 -25
  23. package/dist/desktop/client.d.ts.map +1 -1
  24. package/dist/desktop/client.js +326 -198
  25. package/dist/desktop/client.js.map +1 -1
  26. package/dist/desktop/gradle.d.ts +2 -2
  27. package/dist/desktop/gradle.d.ts.map +1 -1
  28. package/dist/desktop/gradle.js.map +1 -1
  29. package/dist/desktop/index.d.ts +1 -1
  30. package/dist/desktop/index.d.ts.map +1 -1
  31. package/dist/desktop/index.js +1 -1
  32. package/dist/desktop/index.js.map +1 -1
  33. package/dist/desktop/types.d.ts +26 -1
  34. package/dist/desktop/types.d.ts.map +1 -1
  35. package/dist/device-manager.d.ts +14 -2
  36. package/dist/device-manager.d.ts.map +1 -1
  37. package/dist/device-manager.js +33 -14
  38. package/dist/device-manager.js.map +1 -1
  39. package/dist/errors.d.ts +2 -1
  40. package/dist/errors.d.ts.map +1 -1
  41. package/dist/errors.js +9 -2
  42. package/dist/errors.js.map +1 -1
  43. package/dist/index.js +2 -2
  44. package/dist/index.js.map +1 -1
  45. package/dist/ios/wda/wda-client.d.ts.map +1 -1
  46. package/dist/ios/wda/wda-client.js +18 -2
  47. package/dist/ios/wda/wda-client.js.map +1 -1
  48. package/dist/store/google-play.d.ts.map +1 -1
  49. package/dist/store/google-play.js +0 -1
  50. package/dist/store/google-play.js.map +1 -1
  51. package/dist/tools/app-tools.d.ts.map +1 -1
  52. package/dist/tools/app-tools.js +27 -0
  53. package/dist/tools/app-tools.js.map +1 -1
  54. package/dist/tools/desktop-tools.d.ts.map +1 -1
  55. package/dist/tools/desktop-tools.js +39 -18
  56. package/dist/tools/desktop-tools.js.map +1 -1
  57. package/dist/tools/helpers/resolve-element.d.ts +2 -0
  58. package/dist/tools/helpers/resolve-element.d.ts.map +1 -1
  59. package/dist/tools/helpers/resolve-element.js +2 -2
  60. package/dist/tools/helpers/resolve-element.js.map +1 -1
  61. package/dist/tools/interaction-tools.d.ts.map +1 -1
  62. package/dist/tools/interaction-tools.js +50 -17
  63. package/dist/tools/interaction-tools.js.map +1 -1
  64. package/dist/tools/meta/desktop-meta.d.ts.map +1 -1
  65. package/dist/tools/meta/desktop-meta.js +8 -7
  66. package/dist/tools/meta/desktop-meta.js.map +1 -1
  67. package/dist/tools/system-tools.d.ts.map +1 -1
  68. package/dist/tools/system-tools.js +147 -1
  69. package/dist/tools/system-tools.js.map +1 -1
  70. package/dist/tools/ui-tools.d.ts.map +1 -1
  71. package/dist/tools/ui-tools.js +4 -2
  72. package/dist/tools/ui-tools.js.map +1 -1
  73. package/dist/utils/sanitize.d.ts +1 -1
  74. package/dist/utils/sanitize.d.ts.map +1 -1
  75. package/dist/utils/sanitize.js +13 -7
  76. package/dist/utils/sanitize.js.map +1 -1
  77. package/package.json +1 -1
package/README.md CHANGED
@@ -1,54 +1,124 @@
1
1
  # Claude Mobile
2
2
 
3
- MCP server for mobile and desktop automation — Android (via ADB), iOS Simulator (via simctl), Desktop (any macOS app), and Aurora OS (via audb). Like [Claude in Chrome](https://www.anthropic.com/news/claude-for-chrome) but for mobile devices and desktop apps.
4
-
5
- Control your Android phone, emulator, iOS Simulator, Desktop applications, or Aurora OS device with natural language through Claude.
6
-
7
- ## Features
8
-
9
- - **Unified API** — Same commands work for Android, iOS, Desktop, Aurora OS, and Browser
10
- - **Token-optimized** — 8 meta-tools + 3 optional modules instead of 81 separate tools (~85% token reduction per request)
11
- - **Dynamic modules** — Browser, Desktop, and Store modules load on demand, keeping the default tool list lean
12
- - **Browser automation** — Control Chrome/Chromium via CDP: navigate, click, fill forms, evaluate JS, take screenshots
13
- - **Smart screenshots** — Auto-compressed for optimal LLM processing
14
- - **Annotated screenshots** — Screenshots with colored bounding boxes and numbered element labels
15
- - **Security hardened** — Shell injection protection, URL scheme validation, path traversal blocking, input sanitization
16
- - **Structured errors** — Typed error codes (`[CODE] message`) with auto-retry hints for transient failures
17
- - **Telemetry** — Per-tool call metrics (count, avg latency, error rate) via `system(action:'metrics')`
18
- - **Multi-device parallel** — Run the same action on multiple devices simultaneously via `flow_parallel`
19
- - **Flow engine** — `flow_batch` for sequential commands, `flow_run` for conditional loops, `flow_parallel` for fan-out
20
- - **Permission management** — Grant, revoke, and reset app permissions (Android runtime, iOS privacy services)
21
- - **Store management** — Upload builds to Google Play, Huawei AppGallery, and RuStore (optional module)
22
- - **Desktop support** — Test any macOS app (SwiftUI, AppKit, Electron, Compose) with window management, clipboard, and performance metrics
3
+ MCP server for mobile, desktop, and browser automation — Android (ADB), iOS Simulator (simctl + WDA), Desktop (any macOS app), Aurora OS (audb), and Browser (CDP). Like [Claude in Chrome](https://www.anthropic.com/news/claude-for-chrome) but for devices, apps, and browsers.
4
+
5
+ Control your Android phone, emulator, iOS Simulator, desktop app, Aurora device, or headless browser with natural language through Claude.
6
+
7
+ ---
8
+
9
+ ## Table of Contents
10
+
11
+ - [Quick Start](#quick-start)
12
+ - [Features at a Glance](#features-at-a-glance)
13
+ - [Quality Engineering](#quality-engineering)
14
+ - [Installation](#installation)
15
+ - [Homebrew (macOS)](#homebrew-macos)
16
+ - [One-liner (any client)](#one-liner-any-client)
17
+ - [Claude Code](#claude-code)
18
+ - [OpenCode](#opencode)
19
+ - [Other Agents (Pi, Qwen, Gemini, Codex, Cursor)](#other-agents)
20
+ - [From npm / source](#from-npm--source)
21
+ - [Windows](#windows)
22
+ - [Platform Guides](#platform-guides)
23
+ - [Android](#android)
24
+ - [iOS](#ios)
25
+ - [Desktop](#desktop)
26
+ - [Browser](#browser)
27
+ - [Aurora OS](#aurora-os)
28
+ - [Tools Reference](#tools-reference)
29
+ - [Core Meta-Tools](#core-meta-tools)
30
+ - [Optional Modules](#optional-modules)
31
+ - [Flow Tools](#flow-tools)
32
+ - [Native CLI](#native-cli)
33
+ - [Architecture](#architecture)
34
+ - [License](#license)
35
+
36
+ ---
37
+
38
+ ## Quick Start
39
+
40
+ ```bash
41
+ # Install via Homebrew (macOS)
42
+ brew tap AlexGladkov/claude-in-mobile https://github.com/AlexGladkov/claude-in-mobile
43
+ brew install claude-in-mobile
44
+
45
+ # Verify dependencies
46
+ claude-in-mobile doctor
47
+
48
+ # Add to Claude Code
49
+ claude mcp add --scope user --transport stdio mobile -- npx claude-in-mobile@latest
50
+ ```
51
+
52
+ Then talk to Claude naturally:
53
+
54
+ ```
55
+ "Take a screenshot of the Android emulator"
56
+ "Tap on the Login button"
57
+ "Type hello in the search field"
58
+ "Switch to iOS simulator"
59
+ ```
60
+
61
+ ---
62
+
63
+ ## Features at a Glance
64
+
65
+ | Feature | Description |
66
+ |---------|-------------|
67
+ | **Unified API** | Same 8 meta-tools work across Android, iOS, Desktop, Aurora, and Browser |
68
+ | **Token-optimized** | 8 meta-tools + 3 optional modules instead of 81 tools (~85% token reduction) |
69
+ | **Dynamic modules** | Browser, Desktop, Store load on demand — default tool list stays lean |
70
+ | **Smart screenshots** | Auto-compressed for optimal LLM processing |
71
+ | **Annotated screenshots** | Colored bounding boxes + numbered element labels |
72
+ | **Security hardened** | Shell injection protection, URL validation, path traversal blocking |
73
+ | **Structured errors** | Typed error codes with auto-recovery hints |
74
+ | **Multi-device parallel** | Run actions on multiple devices simultaneously |
75
+ | **Flow engine** | Batch, conditional loops, and fan-out flows |
76
+ | **Permission management** | Grant/revoke/reset app permissions (Android + iOS) |
77
+ | **Store publishing** | Google Play, Huawei AppGallery, RuStore |
78
+ | **Telemetry** | Per-tool call metrics via `system(action:'metrics')` |
79
+ | **Doctor command** | `claude-in-mobile doctor` — checks all dependencies at once |
80
+
81
+ ---
82
+
83
+ ## Quality Engineering
84
+
85
+ Advanced testing and monitoring built into Claude Mobile:
86
+
87
+ | Feature | What it does | How to use |
88
+ |---------|-------------|------------|
89
+ | **Accessibility Auditing** | WCAG 2.2 checks: missing labels, touch targets < 48px, focus order, duplicates | `accessibility(action:'audit')` |
90
+ | **Visual Regression** | Baseline screenshots + pixel-level diff detection | `visual(action:'baseline_save')`, `visual(action:'compare')` |
91
+ | **Test Recorder** | Record taps/swipes/input, replay without code | `recorder(action:'start')`, `recorder(action:'play')` |
92
+ | **Multi-Device Sync** | Barrier-based coordination for parallel testing | `sync(action:'create')`, `sync(action:'barrier')` |
93
+ | **App Autopilot** | Autonomous BFS/DFS exploration with self-healing locators | `autopilot(action:'explore')` |
94
+ | **Performance Monitor** | Real-time memory, CPU, FPS tracking with snapshots | `performance(action:'start')`, `performance(action:'snapshot')` |
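The touch-target check listed in the table above can be illustrated with a short TypeScript sketch. It is not the package's audit code: `UiElement` and `undersizedTargets` are hypothetical names, and the only value taken from the README is the 48px threshold.

```typescript
// Hypothetical element shape; the real accessibility tree has more fields.
interface UiElement {
  id: string;
  width: number;   // rendered width in pixels
  height: number;  // rendered height in pixels
  clickable: boolean;
}

// Return the clickable elements whose width or height is below the minimum size.
function undersizedTargets(elements: UiElement[], minSize = 48): UiElement[] {
  return elements.filter(
    (e) => e.clickable && (e.width < minSize || e.height < minSize),
  );
}
```

Non-clickable elements are skipped, since a touch-target rule only makes sense for elements a user is expected to tap.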
95
+
96
+ ---
23
97
 
24
98
  ## Installation
25
99
 
26
- ### Native CLI via Homebrew (macOS)
100
+ ### Homebrew (macOS)
27
101
 
28
102
  ```bash
29
103
  brew tap AlexGladkov/claude-in-mobile https://github.com/AlexGladkov/claude-in-mobile
30
104
  brew install claude-in-mobile
31
105
  ```
32
106
 
33
- The CLI wraps all device automation tools plus store management (Google Play, Huawei AppGallery, RuStore):
107
+ Verify setup:
34
108
 
35
109
  ```bash
36
- claude-in-mobile screenshot android
37
- claude-in-mobile tap android 540 960 --from-size 540x960
38
- claude-in-mobile store upload --package com.example.app --file app.aab
39
- claude-in-mobile huawei upload --package com.example.app --file app.aab
40
- claude-in-mobile rustore upload --package com.example.app --file app.apk
110
+ claude-in-mobile doctor
41
111
  ```
42
112
 
43
113
  ### One-liner (any client)
44
114
 
45
- Using [add-mcp](https://github.com/neondatabase/add-mcp) — auto-detects installed clients:
115
+ Auto-detects installed clients via [add-mcp](https://github.com/neondatabase/add-mcp):
46
116
 
47
117
  ```bash
48
118
  npx add-mcp claude-in-mobile -y
49
119
  ```
50
120
 
51
- Or target a specific client:
121
+ Target a specific client:
52
122
 
53
123
  ```bash
54
124
  npx add-mcp claude-in-mobile -a claude-code -y
@@ -56,30 +126,39 @@ npx add-mcp claude-in-mobile -a opencode -y
56
126
  npx add-mcp claude-in-mobile -a cursor -y
57
127
  ```
58
128
 
59
- ### Claude Code CLI
129
+ ### Claude Code
60
130
 
61
131
  ```bash
132
+ # Project-local
62
133
  claude mcp add --transport stdio mobile -- npx claude-in-mobile@latest
134
+
135
+ # Global (all projects)
136
+ claude mcp add --scope user --transport stdio mobile -- npx claude-in-mobile@latest
63
137
  ```
64
138
 
65
- To add globally (available in all projects):
139
+ #### Claude Code Plugin
66
140
 
67
141
  ```bash
68
- claude mcp add --scope user --transport stdio mobile -- npx claude-in-mobile@latest
142
+ claude plugin marketplace add AlexGladkov/claude-in-mobile
143
+ claude plugin install claude-in-mobile@claude-in-mobile
69
144
  ```
70
145
 
71
146
  ### OpenCode
72
147
 
73
- Use the interactive setup:
148
+ Two modes:
149
+
150
+ **A) MCP server** (Node.js):
74
151
 
75
152
  ```bash
76
153
  opencode mcp add
154
+ # Choose local MCP → npx -y claude-in-mobile
77
155
  ```
78
156
 
79
- Or add manually to `opencode.json` (project root or `~/.config/opencode/opencode.json`):
157
+ Or in `opencode.json`:
80
158
 
81
159
  ```json
82
160
  {
161
+ "$schema": "https://opencode.ai/config.json",
83
162
  "mcp": {
84
163
  "mobile": {
85
164
  "type": "local",
@@ -90,89 +169,73 @@ Or add manually to `opencode.json` (project root or `~/.config/opencode/opencode
90
169
  }
91
170
  ```
92
171
 
93
- ### Cursor
94
-
95
- Add to `.cursor/mcp.json`:
172
+ **B) Native CLI + Skill** (no Node.js needed):
96
173
 
97
- ```json
98
- {
99
- "mcpServers": {
100
- "mobile": {
101
- "command": "npx",
102
- "args": ["-y", "claude-in-mobile"]
103
- }
104
- }
105
- }
174
+ ```bash
175
+ claude-in-mobile setup opencode # project-local
176
+ claude-in-mobile setup opencode --global # user-wide
106
177
  ```
107
178
 
108
- ### Any MCP Client
179
+ ### Other Agents
109
180
 
110
- Print a config snippet for your client:
181
+ Native CLI skill works with any agent that supports Agent Skills:
111
182
 
112
183
  ```bash
113
- npx claude-in-mobile --init <client-name>
114
- # Supported: opencode, cursor, claude-code
184
+ claude-in-mobile setup pi --global # Pi
185
+ claude-in-mobile setup qwen --global # Qwen Code
186
+ claude-in-mobile setup gemini --global # Gemini CLI
187
+ claude-in-mobile setup codex --global # Codex
188
+ claude-in-mobile setup cursor --global # Cursor
115
189
  ```
116
190
 
117
- ### From npm
191
+ Drop `--global` for project-local install. Restart the agent after setup.
118
192
 
119
- ```bash
120
- npx claude-in-mobile
121
- ```
193
+ <details>
194
+ <summary>MCP server config for Qwen / Gemini / Codex / Cursor</summary>
122
195
 
123
- ### From source
196
+ **Qwen Code** — `.qwen/settings.json` or `~/.qwen/settings.json`:
124
197
 
125
- ```bash
126
- git clone https://github.com/AlexGladkov/claude-in-mobile.git
127
- cd claude-in-mobile
128
- npm install
129
- npm run build:all # Builds TypeScript + Desktop companion
198
+ ```json
199
+ { "mcpServers": { "mobile": { "command": "npx", "args": ["-y", "claude-in-mobile"] } } }
130
200
  ```
131
201
 
132
- > **Note:** For Desktop support, you need to run `npm run build:desktop` (or `build:all`) to compile the Desktop companion app.
202
+ **Gemini CLI** — `.gemini/settings.json` or `~/.gemini/settings.json`:
133
203
 
134
- #### Using a local build with MCP clients
204
+ ```json
205
+ { "mcpServers": { "mobile": { "command": "npx", "args": ["-y", "claude-in-mobile"] } } }
206
+ ```
135
207
 
136
- After building from source, point your MCP client to the local `dist/index.js` instead of using npx:
208
+ **Codex**:
137
209
 
138
- ```json
139
- {
140
- "mcpServers": {
141
- "mobile": {
142
- "command": "node",
143
- "args": ["/path/to/claude-in-mobile/dist/index.js"]
144
- }
145
- }
146
- }
210
+ ```bash
211
+ codex mcp add mobile -- npx -y claude-in-mobile
147
212
  ```
148
213
 
149
- For OpenCode (`opencode.json`):
214
+ **Cursor** — `.cursor/mcp.json`:
150
215
 
151
216
  ```json
152
- {
153
- "mcp": {
154
- "mobile": {
155
- "type": "local",
156
- "command": ["node", "/path/to/claude-in-mobile/dist/index.js"],
157
- "enabled": true
158
- }
159
- }
160
- }
217
+ { "mcpServers": { "mobile": { "command": "npx", "args": ["-y", "claude-in-mobile"] } } }
161
218
  ```
162
219
 
163
- ### Manual configuration
220
+ </details>
221
+
222
+ ### From npm / source
164
223
 
165
- Add to your Claude Code settings (`~/.claude.json` or project settings):
224
+ ```bash
225
+ # npm (no install)
226
+ npx claude-in-mobile
227
+
228
+ # From source
229
+ git clone https://github.com/AlexGladkov/claude-in-mobile.git
230
+ cd claude-in-mobile
231
+ npm install
232
+ npm run build:all
233
+ ```
234
+
235
+ Using a local build with any MCP client:
166
236
 
167
237
  ```json
168
- {
169
- "mcpServers": {
170
- "mobile": {
171
- "command": "npx",
172
- "args": ["-y", "claude-in-mobile"]
173
- }
174
- }
175
- }
238
+ { "mcpServers": { "mobile": { "command": "node", "args": ["/path/to/claude-in-mobile/dist/index.js"] } } }
176
239
  ```
177
240
 
178
241
  ### Windows
@@ -181,157 +244,277 @@ Add to your Claude Code settings (`~/.claude.json` or project settings):
181
244
  claude mcp add --transport stdio mobile -- cmd /c npx claude-in-mobile@latest
182
245
  ```
183
246
 
184
- ## Requirements
247
+ ---
248
+
249
+ ## Platform Guides
185
250
 
186
251
  ### Android
187
- - ADB installed and in PATH
188
- - Connected Android device (USB debugging enabled) or emulator
189
252
 
190
- ### iOS
191
- - macOS with Xcode installed
192
- - iOS Simulator (no physical device support yet)
193
- - **WebDriverAgent** for full UI inspection and element-based interaction:
194
- ```bash
195
- npm install -g appium
196
- appium driver install xcuitest
197
- ```
198
- Or set `WDA_PATH` environment variable to custom WebDriverAgent location
253
+ **Requirements:**
254
+ - ADB installed (auto-discovered or set `ADB_PATH`)
255
+ - USB debugging enabled on device, or running emulator
199
256
 
200
- ### Desktop
201
- - macOS (Windows/Linux support planned)
202
- - JDK 17+ for building the Desktop companion
203
- - Any macOS application (SwiftUI, AppKit, Electron, Compose) — launch by `bundleId`, `.app` path, or attach by PID
204
- - Accessibility permissions required: System Settings → Privacy & Security → Accessibility
257
+ **ADB discovery order:**
205
258
 
206
- ### Aurora OS
207
- - audb CLI installed and in PATH (`cargo install audb-client`)
208
- - Connected Aurora OS device with SSH enabled
209
- - Python on device required for tap/swipe: `devel-su pkcon install python`
259
+ | Priority | Location |
260
+ |----------|----------|
261
+ | 1 | `ADB_PATH` env var |
262
+ | 2 | `$ANDROID_HOME/platform-tools/adb` |
263
+ | 3 | `$ANDROID_SDK_ROOT/platform-tools/adb` |
264
+ | 4 | OS default: `~/Library/Android/sdk` (macOS), `%LOCALAPPDATA%\Android\Sdk` (Windows), `~/Android/Sdk` (Linux) |
265
+ | 5 | `adb` from `PATH` |
210
266
 
211
- ## Available Tools
267
+ If none found → `[ADB_NOT_INSTALLED]` error with probed paths.
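The discovery order in the table above can be sketched in TypeScript as follows. This is an illustration under stated assumptions, not the contents of `dist/adb/resolver.js`: `AdbEnv`, `adbCandidates`, `resolveAdb`, and the `exists` predicate are hypothetical names, and appending `platform-tools/adb` to the OS-default SDK directory is an assumption (on Windows the binary would normally be `adb.exe`).

```typescript
// Hypothetical view of the environment variables the README mentions.
interface AdbEnv {
  ADB_PATH?: string;
  ANDROID_HOME?: string;
  ANDROID_SDK_ROOT?: string;
  LOCALAPPDATA?: string;
}

type Platform = "darwin" | "win32" | "linux";

// Build the ordered list of adb locations to probe (priorities 1-5 in the table).
function adbCandidates(env: AdbEnv, platform: Platform, home: string): string[] {
  const sep = platform === "win32" ? "\\" : "/";
  const join = (...parts: string[]): string => parts.join(sep);
  const out: string[] = [];

  if (env.ADB_PATH) out.push(env.ADB_PATH); // 1
  if (env.ANDROID_HOME) out.push(join(env.ANDROID_HOME, "platform-tools", "adb")); // 2
  if (env.ANDROID_SDK_ROOT) out.push(join(env.ANDROID_SDK_ROOT, "platform-tools", "adb")); // 3

  // 4: OS-default SDK directory (assumed to contain platform-tools/adb).
  let sdk: string | undefined;
  if (platform === "darwin") sdk = join(home, "Library", "Android", "sdk");
  else if (platform === "win32") sdk = env.LOCALAPPDATA ? join(env.LOCALAPPDATA, "Android", "Sdk") : undefined;
  else sdk = join(home, "Android", "Sdk");
  if (sdk) out.push(join(sdk, "platform-tools", "adb"));

  out.push("adb"); // 5: bare name, left to PATH lookup
  return out;
}

// Return the first candidate accepted by `exists`, or fail and list the probed paths.
function resolveAdb(candidates: string[], exists: (p: string) => boolean): string {
  const hit = candidates.find(exists);
  if (hit === undefined) {
    throw new Error(`[ADB_NOT_INSTALLED] adb not found; probed: ${candidates.join(", ")}`);
  }
  return hit;
}
```

Passing `exists` in as a parameter keeps the ordering logic free of filesystem access, so it can be exercised without a real SDK installed.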
212
268
 
213
- v3.7.0 provides **8 core meta-tools** + **3 optional modules**. Each meta-tool uses an `action` parameter to select the operation. All v3.0/v3.1 tool names still work as backward-compatible aliases.
269
+ **Examples:**
214
270
 
215
- ### Core Meta-Tools (always loaded)
271
+ ```
272
+ "Show connected devices"
273
+ "Take a screenshot on Android"
274
+ "Tap on Settings"
275
+ "Swipe down to scroll"
276
+ "Type 'hello' in the search field"
277
+ "Press the back button"
278
+ "Grant camera permission to com.example.app"
279
+ "Launch com.example.app"
280
+ ```
216
281
 
217
- | Meta-Tool | Actions | Description |
218
- |-----------|---------|-------------|
219
- | `device` | `list`, `set`, `set_target`, `get_target`, `enable_module`, `disable_module`, `list_modules` | Device management and module control |
220
- | `input` | `tap`, `double_tap`, `long_press`, `swipe`, `text`, `key` | Touch/keyboard input |
221
- | `screen` | `capture`, `annotate` | Screenshots and visual annotation |
222
- | `ui` | `tree`, `find`, `find_tap`, `tap_text`, `analyze`, `wait`, `assert_visible`, `assert_gone` | UI hierarchy and element interaction |
223
- | `app` | `launch`, `stop`, `install`, `list` | App lifecycle management |
224
- | `system` | `activity`, `shell`, `wait`, `open_url`, `logs`, `clear_logs`, `info`, `webview`, `clipboard_*`, `permission_*`, `file_*`, `metrics`, `reset_metrics` | System operations, clipboard, permissions, files, telemetry |
225
- | `flow_batch` | — | Execute multiple commands in one round-trip |
226
- | `flow_run` | — | Multi-step automation with conditionals and loops |
282
+ **CLI:**
227
283
 
228
- ### Optional Modules (loaded on demand)
284
+ ```bash
285
+ claude-in-mobile screenshot android
286
+ claude-in-mobile tap android 540 960
287
+ claude-in-mobile input android "hello world"
288
+ claude-in-mobile ui-dump android | grep "Login"
289
+ ```
229
290
 
230
- These modules are hidden by default to save tokens. They auto-enable when you call them, or use `device(action:'enable_module', module:'<name>')`.
291
+ #### Coordinate space (raw `x`/`y` in tap / swipe / long_press)
231
292
 
232
- | Module | Actions | Description |
233
- |--------|---------|-------------|
234
- | `browser` | `open`, `close`, `list_sessions`, `navigate`, `click`, `fill`, `fill_form`, `press_key`, `snapshot`, `screenshot`, `evaluate`, `wait_for_selector`, `clear_session` | Chrome/Chromium automation via CDP |
235
- | `desktop` | `launch`, `stop`, `windows`, `focus`, `resize`, `clipboard_get`, `clipboard_set`, `performance`, `monitors` | Desktop app testing (any macOS app: SwiftUI, AppKit, Electron, Compose) |
236
- | `store` | `upload`, `set_notes`, `submit`, `get_releases`, `discard`, `promote`, `halt_rollout`, `get_versions` | Google Play, Huawei AppGallery, RuStore publishing |
293
+ When you call an input tool with raw `x`/`y` (or `x1`/`y1`/`x2`/`y2` for swipe), the values are interpreted in **the most recent screenshot's pixel space** and auto-scaled to device coordinates before dispatch. The scale comes from the last `screen_capture` call: e.g., capture at `preset='low'` (270×480) on a 1080×2400 device sets a 4× factor, so `tap(135, 240)` becomes `tap(540, 960)` on the device.
237
294
 
238
- ### Flow Tools
295
+ This is convenient for the common flow `screen_capture → reason about pixel → tap`, but has two gotchas worth knowing:
239
296
 
240
- | Tool | Description |
241
- |------|-------------|
242
- | `flow_batch` | Sequential execution of multiple commands in one round-trip (max 50) |
243
- | `flow_run` | Multi-step flows with `if_not_found`, `repeat`, `on_error` handling (max 20 steps) |
244
- | `flow_parallel` | Run the same action on multiple devices concurrently via `Promise.allSettled` (max 10 devices) |
297
+ - **Coordinates from `ui_find` / `ui_tree` are device coordinates**, not screenshot coordinates. They come from `uiautomator` which always reports in device space. If the most recent screenshot was at a low preset, passing those device coords as raw `x`/`y` will over-scale them. Prefer `index`, `text`, or `resourceId` for ui-sourced taps to avoid the issue entirely.
298
+ - **No screenshot taken yet?** Then there's no scale stored, and raw `x`/`y` are passed through 1:1 as device coords.
245
299
 
246
- ### Backward Compatibility
300
+ The cleanest mental model: *raw coords match whatever pixel space you're looking at on screen* (your last screenshot). For everything else, use the resolver fields (`index`, `text`, `resourceId`, `label`).
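The scaling behaviour described in this section can be illustrated with a minimal TypeScript sketch. It assumes a single uniform scale factor remembered from the last screenshot (4 in the README's example) and uses hypothetical names (`Point`, `toDevice`, `lastScale`); the package's actual dispatch code may differ.

```typescript
interface Point {
  x: number;
  y: number;
}

// Convert raw x/y to device coordinates.
// `lastScale` is the factor stored by the most recent screenshot, or null if none was taken.
function toDevice(raw: Point, lastScale: number | null): Point {
  if (lastScale === null) {
    return { x: raw.x, y: raw.y }; // no screenshot yet: pass through 1:1
  }
  return { x: Math.round(raw.x * lastScale), y: Math.round(raw.y * lastScale) };
}

// Coordinates read off a low-preset screenshot are scaled up as intended.
const fromScreenshot = toDevice({ x: 135, y: 240 }, 4); // { x: 540, y: 960 }

// Device coordinates (e.g. from ui_find) passed as raw x/y get scaled a second time.
const overScaled = toDevice({ x: 540, y: 960 }, 4); // { x: 2160, y: 3840 }

// With no stored scale, the same values are used unchanged.
const passThrough = toDevice({ x: 540, y: 960 }, null); // { x: 540, y: 960 }
```

The second call shows the first gotcha: an x of 2160 lies beyond a 1080-pixel-wide screen, which is why the resolver fields are the safer choice for coordinates that did not come from a screenshot.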
247
301
 
248
- All v3.0/v3.1 tool names work as aliases. For example, `tap` maps to `input(action:'tap')`, `screenshot` maps to `screen(action:'capture')`, `launch_app` maps to `app(action:'launch')`.
302
+ ---
249
303
 
250
- > For detailed Desktop API documentation, see [Desktop Specification](docs/SPEC_DESKTOP.md)
304
+ ### iOS
251
305
 
252
- ## Usage Examples
306
+ **Requirements:**
307
+ - macOS with Xcode
308
+ - iOS Simulator (no physical device support yet)
309
+ - WebDriverAgent for full UI inspection (optional but recommended)
253
310
 
254
- Just talk to Claude naturally:
311
+ **WebDriverAgent setup:**
255
312
 
256
- ```
257
- "Show me all connected devices"
258
- "Take a screenshot of the Android emulator"
259
- "Take a screenshot on iOS"
260
- "Tap on Settings"
261
- "Swipe down to scroll"
262
- "Type 'hello world' in the search field"
263
- "Press the back button on Android"
264
- "Open Safari on iOS"
265
- "Switch to iOS simulator"
266
- "Run the app on both platforms"
313
+ ```bash
314
+ # Automatic (via Appium)
315
+ npm install -g appium
316
+ appium driver install xcuitest
317
+
318
+ # Or set custom path
319
+ export WDA_PATH=/path/to/WebDriverAgent
267
320
  ```
268
321
 
269
- ### Permission Management
322
+ On first use, WDA is auto-built (~2 min one-time), launched on simulator, and connected on port 8100+.
270
323
 
271
- ```
272
- "Grant camera permission to com.example.app on Android"
273
- "Revoke location access from com.example.app"
274
- "Reset all permissions for com.apple.Maps on iOS"
275
- ```
324
+ **What WDA enables:**
325
+ - `ui(action:'tree')` — full accessibility tree
326
+ - `ui(action:'find')` — element discovery by label/text
327
+ - `input(action:'tap', label:'...')` — element-based tapping
328
+ - Improved swipe and gesture simulation
329
+
330
+ **Troubleshooting:**
331
+
332
+ ```bash
333
+ # Install Xcode CLI tools
334
+ xcode-select --install
276
335
 
277
- ### Annotated Screenshots
336
+ # Accept license
337
+ sudo xcodebuild -license accept
278
338
 
339
+ # Check simulator is booted
340
+ xcrun simctl list | grep Booted
341
+
342
+ # Check port
343
+ lsof -i :8100
279
344
  ```
280
- "Take an annotated screenshot" → Screenshot with green (clickable) and red (non-clickable) bounding boxes + numbered element index
345
+
346
+ <details>
347
+ <summary>Manual WDA test</summary>
348
+
349
+ ```bash
350
+ cd ~/.appium/node_modules/appium-xcuitest-driver/node_modules/appium-webdriveragent
351
+ xcodebuild test -project WebDriverAgent.xcodeproj \
352
+ -scheme WebDriverAgentRunner \
353
+ -destination 'platform=iOS Simulator,id=<DEVICE_UDID>'
281
354
  ```
282
355
 
283
- ### Platform Selection
356
+ </details>
284
357
 
285
- You can explicitly specify the platform:
358
+ **Examples:**
286
359
 
287
360
  ```
288
- "Screenshot on android" → Uses Android device
289
- "Screenshot on ios" → Uses iOS simulator
290
- "Screenshot on desktop" → Uses Desktop app
291
- "Screenshot on aurora" → Uses Aurora OS device
292
- "Screenshot" → Uses last active device
361
+ "Take a screenshot on iOS"
362
+ "Open Safari on iOS"
363
+ "Tap on the Login button"
364
+ "Type my email in the text field"
365
+ "Swipe left on the card"
366
+ "Reset all permissions for com.apple.Maps"
293
367
  ```
294
368
 
295
- Or set the active device:
369
+ ---
370
+
371
+ ### Desktop
372
+
373
+ **Requirements:**
374
+ - macOS (Windows/Linux planned)
375
+ - Accessibility permissions: System Settings → Privacy & Security → Accessibility
376
+ - JDK 17+ (for building Desktop companion)
377
+
378
+ **Supported apps:** Any macOS application — SwiftUI, AppKit, Electron, Compose Desktop.
379
+
380
+ **Launch modes:**
381
+
382
+ | Mode | Example |
383
+ |------|---------|
384
+ | By `bundleId` | `desktop(action:'launch', bundleId:'com.apple.Calculator')` |
385
+ | By `.app` path | `desktop(action:'launch', appPath:'/Applications/Slack.app')` |
386
+ | Attach by PID | `desktop(action:'launch', pid:12345)` |
387
+
388
+ **Enable the module first:**
296
389
 
297
390
  ```
298
- "Use the iPhone 15 simulator"
299
- "Switch to the Android emulator"
300
- "Switch to desktop"
301
- "Switch to Aurora device"
391
+ "Enable desktop module"
302
392
  ```
303
393
 
304
- ### Desktop Examples
394
+ Or it auto-enables on first `desktop(...)` call.
395
+
396
+ **Examples:**
305
397
 
306
398
  ```
307
- "Launch my desktop app from /path/to/app"
399
+ "Launch Calculator"
308
400
  "Take a screenshot of the desktop app"
309
- "Get window info"
401
+ "Get window list"
310
402
  "Resize window to 1280x720"
311
- "Tap at coordinates 100, 200"
403
+ "Tap at 100, 200 on desktop"
312
404
  "Get clipboard content"
313
- "Set clipboard to 'test text'"
314
405
  "Get performance metrics"
315
406
  "Stop the desktop app"
316
407
  ```
317
408
 
318
- ### Aurora Examples
409
+ > Full API documentation: [docs/SPEC_DESKTOP.md](docs/SPEC_DESKTOP.md)
410
+
411
+ ---
412
+
413
+ ### Browser
414
+
415
+ **Requirements:**
416
+ - Chrome or Chromium installed (or set `CHROME_PATH`)
417
+
418
+ Browser automation via Chrome DevTools Protocol (CDP). The `browser` module loads on demand.
419
+
420
+ **Examples:**
421
+
422
+ ```
423
+ "Open https://example.com in the browser"
424
+ "Click the Sign In button"
425
+ "Fill the email field with test@example.com"
426
+ "Take a browser screenshot"
427
+ "Execute JS: document.title"
428
+ "Wait for the loading spinner to disappear"
429
+ ```
430
+
431
+ **Available actions:**
432
+
433
+ | Action | Description |
434
+ |--------|-------------|
435
+ | `open` | Open URL in new session |
436
+ | `navigate` | Go to URL in existing session |
437
+ | `click` | Click element by ref |
438
+ | `fill` | Type into input field |
439
+ | `fill_form` | Fill multiple fields at once |
440
+ | `press_key` | Keyboard input |
441
+ | `snapshot` | DOM snapshot with element refs |
442
+ | `screenshot` | Visual screenshot |
443
+ | `evaluate` | Run JavaScript |
444
+ | `wait_for_selector` | Wait for element to appear |
445
+ | `close` | Close session |
446
+ | `list_sessions` | Show active sessions |
447
+ | `clear_session` | Reset cookies/storage |
448
+
449
+ ---
450
+
451
+ ### Aurora OS
452
+
453
+ **Requirements:**
454
+ - `audb` CLI: `cargo install audb-client`
455
+ - SSH-enabled Aurora OS device
456
+ - Python on device for tap/swipe: `devel-su pkcon install python`
457
+
458
+ **Examples:**
319
459
 
320
460
  ```
321
- "List all Aurora devices"
461
+ "List Aurora devices"
322
462
  "Take a screenshot on Aurora"
323
- "Tap at coordinates 100, 200 on Aurora"
463
+ "Tap at 100, 200 on Aurora"
324
464
  "Launch ru.example.app on Aurora"
325
- "List installed apps on Aurora device"
465
+ "List installed apps on Aurora"
326
466
  "Get logs from Aurora device"
327
- "Push file.txt to /home/defaultuser/ on Aurora device"
467
+ "Push file.txt to /home/defaultuser/"
328
468
  ```
329
469
 
470
+ ---
471
+
472
+ ## Tools Reference
473
+
474
+ v3.8.0 provides **8 core meta-tools** + **3 optional modules**. Each meta-tool uses an `action` parameter.
475
+
476
+ ### Core Meta-Tools
477
+
478
+ | Meta-Tool | Actions | Description |
479
+ |-----------|---------|-------------|
480
+ | `device` | `list`, `set`, `set_target`, `get_target`, `enable_module`, `disable_module`, `list_modules` | Device management, module control |
481
+ | `input` | `tap`, `double_tap`, `long_press`, `swipe`, `text`, `key` | Touch and keyboard input |
482
+ | `screen` | `capture`, `annotate` | Screenshots and visual annotation |
483
+ | `ui` | `tree`, `find`, `find_tap`, `tap_text`, `analyze`, `wait`, `assert_visible`, `assert_gone` | UI hierarchy, element interaction |
484
+ | `app` | `launch`, `stop`, `install`, `list` | App lifecycle |
485
+ | `system` | `activity`, `shell`, `wait`, `open_url`, `logs`, `clear_logs`, `info`, `webview`, `clipboard_*`, `permission_*`, `file_*`, `metrics`, `reset_metrics` | System ops, clipboard, permissions, files, telemetry |
486
+ | `flow_batch` | — | Execute multiple commands in one round-trip (max 50) |
487
+ | `flow_run` | — | Multi-step automation with conditionals and loops (max 20 steps) |
488
+
489
+ ### Optional Modules
490
+
491
+ Load on demand via `device(action:'enable_module', module:'<name>')` or auto-enable on first call.
492
+
493
+ | Module | Actions | Description |
494
+ |--------|---------|-------------|
495
+ | `browser` | `open`, `close`, `list_sessions`, `navigate`, `click`, `fill`, `fill_form`, `press_key`, `snapshot`, `screenshot`, `evaluate`, `wait_for_selector`, `clear_session` | Chrome/Chromium via CDP |
496
+ | `desktop` | `launch`, `stop`, `windows`, `focus`, `resize`, `clipboard_get`, `clipboard_set`, `performance`, `monitors` | Any macOS app |
497
+ | `store` | `upload`, `set_notes`, `submit`, `get_releases`, `discard`, `promote`, `halt_rollout`, `get_versions` | Google Play, Huawei AppGallery, RuStore |
498
+
499
+ ### Flow Tools
500
+
501
+ | Tool | Description |
502
+ |------|-------------|
503
+ | `flow_batch` | Sequential execution, one round-trip (max 50 commands) |
504
+ | `flow_run` | Multi-step flows with `if_not_found`, `repeat`, `on_error` (max 20 steps) |
505
+ | `flow_parallel` | Same action on multiple devices via `Promise.allSettled` (max 10) |
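
The `flow_parallel` row above mentions `Promise.allSettled` and a limit of 10 devices. A minimal sketch of that pattern follows; the function name `runOnDevices`, the `DeviceAction` type, and the result shape are hypothetical and are not taken from the package.

```typescript
// Hypothetical callback type: performs one action on one device.
type DeviceAction = (deviceId: string) => Promise<string>;

// Limit stated in the Flow Tools table.
const MAX_PARALLEL_DEVICES = 10;

interface DeviceResult {
  deviceId: string;
  ok: boolean;
  detail: string;
}

async function runOnDevices(
  deviceIds: string[],
  action: DeviceAction,
): Promise<DeviceResult[]> {
  if (deviceIds.length > MAX_PARALLEL_DEVICES) {
    throw new Error(`at most ${MAX_PARALLEL_DEVICES} devices are allowed`);
  }
  // allSettled waits for every device, so one failing device
  // does not reject the whole batch.
  const settled = await Promise.allSettled(deviceIds.map((id) => action(id)));
  return settled.map((result, i) => ({
    deviceId: String(deviceIds[i]),
    ok: result.status === "fulfilled",
    detail:
      result.status === "fulfilled" ? result.value : String(result.reason),
  }));
}
```

Unlike `Promise.all`, this reports a per-device outcome, which is presumably why the README names `Promise.allSettled` for multi-device runs.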
506
+
507
+ ### Backward Compatibility
508
+
509
+ All v3.0/v3.1 tool names work as aliases: `tap` → `input(action:'tap')`, `screenshot` → `screen(action:'capture')`, `launch_app` → `app(action:'launch')`, etc.
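
One way to picture the alias mapping is a lookup table from a legacy tool name to a meta-tool plus `action`. The sketch below is illustrative only: `LEGACY_ALIASES` and `resolveLegacyTool` are hypothetical names, and the table holds just the three mappings listed above.

```typescript
// Hypothetical shape of a meta-tool call.
interface MetaCall {
  tool: string;
  action: string;
}

// Only the three aliases named in the README are listed here;
// the real alias table is assumed to be larger.
const LEGACY_ALIASES = new Map<string, MetaCall>([
  ["tap", { tool: "input", action: "tap" }],
  ["screenshot", { tool: "screen", action: "capture" }],
  ["launch_app", { tool: "app", action: "launch" }],
]);

// Returns the meta-tool call for a legacy name, or undefined if unknown.
function resolveLegacyTool(name: string): MetaCall | undefined {
  return LEGACY_ALIASES.get(name);
}
```

With such a table, a call to `screenshot` would resolve to `screen(action:'capture')` before dispatch.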
510
+
511
+ ---
512
+
330
513
  ## Native CLI
331
514
 
332
- A 2 MB native Rust binary with all the same commands. No Node.js, no dependencies.
515
+ 2 MB Rust binary. No Node.js, no dependencies.
333
516
 
334
- ### Install CLI
517
+ ### Install
335
518
 
336
519
  ```bash
337
520
  brew tap AlexGladkov/claude-in-mobile
@@ -340,15 +523,16 @@ brew install claude-in-mobile
340
523
 
341
524
  Or download from [Releases](https://github.com/AlexGladkov/claude-in-mobile/releases).
342
525
 
343
- ### Advantages over MCP
526
+ ### Why use the CLI
344
527
 
345
- - **Easy install** `brew install` or copy a single 2 MB binary
346
- - **No dependencies** — no Node.js, no npm, nothing
347
- - **Use from terminal** run commands directly, no Claude Code or MCP client needed
348
- - **Test automation** write universal `.sh` scripts for any platform without learning platform internals
349
- - **Token-efficient** skill documentation loads only when used; MCP v3.4.0 reduced schema overhead by ~85% (8 meta-tools vs 81 individual tools)
350
- - **Fast** ~5ms command startup (Rust) vs ~500ms (Node.js MCP)
351
- - **CI/CD ready** exit codes, stdout/stderr, runs anywhere
528
+ | | CLI | MCP Server |
529
+ |---|---|---|
530
+ | **Install** | `brew install` or copy binary | `npx` / npm |
531
+ | **Dependencies** | None | Node.js |
532
+ | **Startup** | ~5ms | ~500ms |
533
+ | **Use from terminal** | Direct commands | Needs MCP client |
534
+ | **CI/CD** | Exit codes, stdout/stderr | Not designed for CI |
535
+ | **Token cost** | Skill loads on demand | Schema always present |
352
536
 
353
537
  ### Test script example
354
538
 
@@ -362,75 +546,27 @@ claude-in-mobile screenshot android -o result.png
362
546
  claude-in-mobile ui-dump android | grep "Welcome" && echo "PASS" || echo "FAIL"
363
547
  ```
364
548
 
365
- ### Claude Code Plugin
366
-
367
- ```bash
368
- claude plugin marketplace add AlexGladkov/claude-in-mobile
369
- claude plugin install claude-in-mobile@claude-in-mobile
370
- ```
371
-
372
- After installing, Claude Code controls devices with natural language. The skill loads into context only on demand — no token overhead when not in use.
549
+ ### Store management (CLI)
373
550
 
374
- See [cli/README.md](cli/README.md) for full CLI documentation.
375
-
376
- ## iOS WebDriverAgent Setup
377
-
378
- For full iOS UI inspection and element-based interaction, WebDriverAgent is required. It enables:
379
- - `get_ui` - JSON accessibility tree inspection
380
- - `tap` with `label` or `text` parameters - Element-based tapping
381
- - `find_element` - Element discovery and querying
382
- - `swipe` - Improved gesture simulation
383
-
384
- ### Installation
385
-
386
- **Automatic (via Appium):**
387
551
  ```bash
388
- npm install -g appium
389
- appium driver install xcuitest
390
- ```
391
-
392
- **Manual:**
393
- Set the `WDA_PATH` environment variable to your WebDriverAgent location:
394
- ```bash
395
- export WDA_PATH=/path/to/WebDriverAgent
552
+ claude-in-mobile store upload --package com.example.app --file app.aab
553
+ claude-in-mobile huawei upload --package com.example.app --file app.aab
554
+ claude-in-mobile rustore upload --package com.example.app --file app.apk
396
555
  ```
397
556
 
398
- ### First Use
557
+ ### Doctor
399
558
 
400
- On first use, WebDriverAgent will be automatically:
401
- 1. Discovered from Appium installation or `WDA_PATH`
402
- 2. Built with xcodebuild (one-time, ~2 minutes)
403
- 3. Launched on the iOS simulator
404
- 4. Connected via HTTP on port 8100+
559
+ Check all dependencies at once:
405
560
 
406
- ### Troubleshooting
407
-
408
- **Build fails:**
409
561
  ```bash
410
- # Install Xcode command line tools
411
- xcode-select --install
412
-
413
- # Accept license
414
- sudo xcodebuild -license accept
415
-
416
- # Set Xcode path
417
- sudo xcode-select -s /Applications/Xcode.app
562
+ claude-in-mobile doctor
418
563
  ```
419
564
 
420
- **Session fails:**
421
- - Ensure simulator is booted: `xcrun simctl list | grep Booted`
422
- - Check port availability: `lsof -i :8100`
423
- - Try restarting the simulator
565
+ Checks: ADB, ANDROID_HOME, Xcode, simctl, Appium, WDA, JDK, audb-client, Chrome. Output is color-coded and includes fix suggestions.
424
566
 
425
- **Manual test:**
426
- ```bash
427
- cd ~/.appium/node_modules/appium-xcuitest-driver/node_modules/appium-webdriveragent
428
- xcodebuild test -project WebDriverAgent.xcodeproj \
429
- -scheme WebDriverAgentRunner \
430
- -destination 'platform=iOS Simulator,id=<DEVICE_UDID>'
431
- ```
567
+ ---
432
568
 
433
- ## How It Works
569
+ ## Architecture
434
570
 
435
571
  ```
436
572
  ┌─────────────┐ ┌──────────────────┐ ┌─────────────────┐
@@ -438,19 +574,21 @@ xcodebuild test -project WebDriverAgent.xcodeproj \
438
574
  ├─────────────┤ │ Claude Mobile │ ├─────────────────┤
439
575
  │ OpenCode │────▶│ MCP Server │────▶│ iOS (simctl+WDA)│
440
576
  ├─────────────┤ │ │ ├─────────────────┤
441
- │ Cursor │────▶│ 8 meta-tools │────▶│ Desktop (Compose)│
577
+ │ Cursor │────▶│ 8 meta-tools │────▶│ Desktop (macOS) │
442
578
  ├─────────────┤ │ + 3 modules │ ├─────────────────┤
443
- Any MCP │────▶│ (auto-detects │────▶│ Aurora (audb) │
444
- Client│ client) │ ├─────────────────┤
445
- └─────────────┘ │────▶│ Browser (CDP) │
446
- └──────────────────┘ └─────────────────┘
579
+ │ Qwen/Gemini │────▶│ │────▶│ Aurora (audb) │
580
+ ├─────────────┤ │ Auto-detects │ ├─────────────────┤
581
+ │ Any MCP │────▶│ platform │────▶│ Browser (CDP) │
582
+ └─────────────┘ └──────────────────┘ └─────────────────┘
447
583
  ```
448
584
 
449
- 1. Claude sends commands through MCP protocol (8 meta-tools + 3 optional modules)
450
- 2. Server routes to appropriate platform (ADB, simctl+WDA, Desktop, audb, or CDP)
451
- 3. Commands execute on your device, desktop app, or browser
452
- 4. Results (screenshots, UI data, metrics) return to Claude
453
- 5. Dynamic modules auto-enable when first called — no manual setup needed
585
+ 1. Client sends commands via MCP protocol (8 meta-tools + 3 optional modules)
586
+ 2. Server routes to platform adapter (ADB, simctl+WDA, Desktop, audb, CDP)
587
+ 3. Commands execute on device/app/browser
588
+ 4. Results (screenshots, UI trees, metrics) return to client
589
+ 5. Modules auto-enable on first call — no manual setup needed
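
Steps 2 and 5 above can be sketched roughly as follows. This is a simplified illustration, not the server's code: the `Platform` keys, `route`, and the in-memory `enabledModules` set are assumptions, while the backend names and module names come from the README.

```typescript
// Platform keys are hypothetical; backend names come from step 2.
type Platform = "android" | "ios" | "desktop" | "aurora" | "browser";

const BACKENDS: Record<Platform, string> = {
  android: "ADB",
  ios: "simctl+WDA",
  desktop: "Desktop",
  aurora: "audb",
  browser: "CDP",
};

// The three optional modules listed in the Tools Reference.
const OPTIONAL_MODULES = new Set(["browser", "desktop", "store"]);
const enabledModules = new Set<string>();

interface RouteResult {
  backend: string;
  enabledNow: boolean; // true if this call enabled an optional module
}

function route(tool: string, platform: Platform): RouteResult {
  // Step 5: an optional module is enabled lazily on its first call.
  const enabledNow = OPTIONAL_MODULES.has(tool) && !enabledModules.has(tool);
  if (enabledNow) {
    enabledModules.add(tool);
  }
  // Step 2: pick the backend for the target platform.
  return { backend: BACKENDS[platform], enabledNow };
}
```

Core meta-tools such as `input` skip the module check entirely; only the first call to an optional module reports `enabledNow: true`.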
590
+
591
+ ---
454
592
 
455
593
  ## License
456
594