claude-in-mobile 3.7.0 → 3.8.0

Files changed (64)
  1. package/README.md +391 -264
  2. package/dist/adapters/android-adapter.d.ts.map +1 -1
  3. package/dist/adapters/android-adapter.js +8 -1
  4. package/dist/adapters/android-adapter.js.map +1 -1
  5. package/dist/adapters/desktop-adapter.d.ts +2 -2
  6. package/dist/adapters/desktop-adapter.d.ts.map +1 -1
  7. package/dist/adapters/desktop-adapter.js.map +1 -1
  8. package/dist/adapters/ios-adapter.d.ts.map +1 -1
  9. package/dist/adapters/ios-adapter.js +5 -2
  10. package/dist/adapters/ios-adapter.js.map +1 -1
  11. package/dist/adb/client.d.ts.map +1 -1
  12. package/dist/adb/client.js +12 -6
  13. package/dist/adb/client.js.map +1 -1
  14. package/dist/adb/resolver.d.ts +24 -0
  15. package/dist/adb/resolver.d.ts.map +1 -0
  16. package/dist/adb/resolver.js +130 -0
  17. package/dist/adb/resolver.js.map +1 -0
  18. package/dist/desktop/client.d.ts +13 -25
  19. package/dist/desktop/client.d.ts.map +1 -1
  20. package/dist/desktop/client.js +326 -198
  21. package/dist/desktop/client.js.map +1 -1
  22. package/dist/desktop/gradle.d.ts +2 -2
  23. package/dist/desktop/gradle.d.ts.map +1 -1
  24. package/dist/desktop/gradle.js.map +1 -1
  25. package/dist/desktop/index.d.ts +1 -1
  26. package/dist/desktop/index.d.ts.map +1 -1
  27. package/dist/desktop/index.js +1 -1
  28. package/dist/desktop/index.js.map +1 -1
  29. package/dist/desktop/types.d.ts +26 -1
  30. package/dist/desktop/types.d.ts.map +1 -1
  31. package/dist/device-manager.d.ts +14 -2
  32. package/dist/device-manager.d.ts.map +1 -1
  33. package/dist/device-manager.js +33 -14
  34. package/dist/device-manager.js.map +1 -1
  35. package/dist/errors.d.ts +2 -1
  36. package/dist/errors.d.ts.map +1 -1
  37. package/dist/errors.js +9 -2
  38. package/dist/errors.js.map +1 -1
  39. package/dist/index.js +2 -2
  40. package/dist/index.js.map +1 -1
  41. package/dist/ios/wda/wda-client.d.ts.map +1 -1
  42. package/dist/ios/wda/wda-client.js +18 -2
  43. package/dist/ios/wda/wda-client.js.map +1 -1
  44. package/dist/store/google-play.d.ts.map +1 -1
  45. package/dist/store/google-play.js +0 -1
  46. package/dist/store/google-play.js.map +1 -1
  47. package/dist/tools/desktop-tools.d.ts.map +1 -1
  48. package/dist/tools/desktop-tools.js +39 -18
  49. package/dist/tools/desktop-tools.js.map +1 -1
  50. package/dist/tools/helpers/resolve-element.d.ts +2 -0
  51. package/dist/tools/helpers/resolve-element.d.ts.map +1 -1
  52. package/dist/tools/helpers/resolve-element.js +2 -2
  53. package/dist/tools/helpers/resolve-element.js.map +1 -1
  54. package/dist/tools/interaction-tools.d.ts.map +1 -1
  55. package/dist/tools/interaction-tools.js +30 -4
  56. package/dist/tools/interaction-tools.js.map +1 -1
  57. package/dist/tools/meta/desktop-meta.d.ts.map +1 -1
  58. package/dist/tools/meta/desktop-meta.js +8 -7
  59. package/dist/tools/meta/desktop-meta.js.map +1 -1
  60. package/dist/utils/sanitize.d.ts +1 -1
  61. package/dist/utils/sanitize.d.ts.map +1 -1
  62. package/dist/utils/sanitize.js +13 -7
  63. package/dist/utils/sanitize.js.map +1 -1
  64. package/package.json +1 -1
package/README.md CHANGED
@@ -1,54 +1,124 @@
1
1
  # Claude Mobile
2
2
 
3
- MCP server for mobile and desktop automation — Android (via ADB), iOS Simulator (via simctl), Desktop (any macOS app), and Aurora OS (via audb). Like [Claude in Chrome](https://www.anthropic.com/news/claude-for-chrome) but for mobile devices and desktop apps.
4
-
5
- Control your Android phone, emulator, iOS Simulator, Desktop applications, or Aurora OS device with natural language through Claude.
6
-
7
- ## Features
8
-
9
- - **Unified API** — Same commands work for Android, iOS, Desktop, Aurora OS, and Browser
10
- - **Token-optimized** — 8 meta-tools + 3 optional modules instead of 81 separate tools (~85% token reduction per request)
11
- - **Dynamic modules** — Browser, Desktop, and Store modules load on demand, keeping the default tool list lean
12
- - **Browser automation** Control Chrome/Chromium via CDP: navigate, click, fill forms, evaluate JS, take screenshots
13
- - **Smart screenshots** — Auto-compressed for optimal LLM processing
14
- - **Annotated screenshots** — Screenshots with colored bounding boxes and numbered element labels
15
- - **Security hardened** — Shell injection protection, URL scheme validation, path traversal blocking, input sanitization
16
- - **Structured errors** — Typed error codes (`[CODE] message`) with auto-retry hints for transient failures
17
- - **Telemetry** — Per-tool call metrics (count, avg latency, error rate) via `system(action:'metrics')`
18
- - **Multi-device parallel** — Run the same action on multiple devices simultaneously via `flow_parallel`
19
- - **Flow engine** `flow_batch` for sequential commands, `flow_run` for conditional loops, `flow_parallel` for fan-out
20
- - **Permission management** Grant, revoke, and reset app permissions (Android runtime, iOS privacy services)
21
- - **Store management** — Upload builds to Google Play, Huawei AppGallery, and RuStore (optional module)
22
- - **Desktop support** — Test any macOS app (SwiftUI, AppKit, Electron, Compose) with window management, clipboard, and performance metrics
3
+ MCP server for mobile, desktop, and browser automation — Android (ADB), iOS Simulator (simctl + WDA), Desktop (any macOS app), Aurora OS (audb), and Browser (CDP). Like [Claude in Chrome](https://www.anthropic.com/news/claude-for-chrome) but for devices, apps, and browsers.
4
+
5
+ Control your Android phone, emulator, iOS Simulator, desktop app, Aurora device, or headless browser with natural language through Claude.
6
+
7
+ ---
8
+
9
+ ## Table of Contents
10
+
11
+ - [Quick Start](#quick-start)
12
+ - [Features at a Glance](#features-at-a-glance)
13
+ - [Quality Engineering](#quality-engineering)
14
+ - [Installation](#installation)
15
+ - [Homebrew (macOS)](#homebrew-macos)
16
+ - [One-liner (any client)](#one-liner-any-client)
17
+ - [Claude Code](#claude-code)
18
+ - [OpenCode](#opencode)
19
+ - [Other Agents (Pi, Qwen, Gemini, Codex, Cursor)](#other-agents)
20
+ - [From npm / source](#from-npm--source)
21
+ - [Windows](#windows)
22
+ - [Platform Guides](#platform-guides)
23
+ - [Android](#android)
24
+ - [iOS](#ios)
25
+ - [Desktop](#desktop)
26
+ - [Browser](#browser)
27
+ - [Aurora OS](#aurora-os)
28
+ - [Tools Reference](#tools-reference)
29
+ - [Core Meta-Tools](#core-meta-tools)
30
+ - [Optional Modules](#optional-modules)
31
+ - [Flow Tools](#flow-tools)
32
+ - [Native CLI](#native-cli)
33
+ - [Architecture](#architecture)
34
+ - [License](#license)
35
+
36
+ ---
37
+
38
+ ## Quick Start
39
+
40
+ ```bash
41
+ # Install via Homebrew (macOS)
42
+ brew tap AlexGladkov/claude-in-mobile https://github.com/AlexGladkov/claude-in-mobile
43
+ brew install claude-in-mobile
44
+
45
+ # Verify dependencies
46
+ claude-in-mobile doctor
47
+
48
+ # Add to Claude Code
49
+ claude mcp add --scope user --transport stdio mobile -- npx claude-in-mobile@latest
50
+ ```
51
+
52
+ Then talk to Claude naturally:
53
+
54
+ ```
55
+ "Take a screenshot of the Android emulator"
56
+ "Tap on the Login button"
57
+ "Type hello in the search field"
58
+ "Switch to iOS simulator"
59
+ ```
60
+
61
+ ---
62
+
63
+ ## Features at a Glance
64
+
65
+ | Feature | Description |
66
+ |---------|-------------|
67
+ | **Unified API** | Same 8 meta-tools work across Android, iOS, Desktop, Aurora, and Browser |
68
+ | **Token-optimized** | 8 meta-tools + 3 optional modules instead of 81 tools (~85% token reduction) |
69
+ | **Dynamic modules** | Browser, Desktop, Store load on demand — default tool list stays lean |
70
+ | **Smart screenshots** | Auto-compressed for optimal LLM processing |
71
+ | **Annotated screenshots** | Colored bounding boxes + numbered element labels |
72
+ | **Security hardened** | Shell injection protection, URL validation, path traversal blocking |
73
+ | **Structured errors** | Typed error codes with auto-recovery hints |
74
+ | **Multi-device parallel** | Run actions on multiple devices simultaneously |
75
+ | **Flow engine** | Batch, conditional loops, and fan-out flows |
76
+ | **Permission management** | Grant/revoke/reset app permissions (Android + iOS) |
77
+ | **Store publishing** | Google Play, Huawei AppGallery, RuStore |
78
+ | **Telemetry** | Per-tool call metrics via `system(action:'metrics')` |
79
+ | **Doctor command** | `claude-in-mobile doctor` — checks all dependencies at once |
80
+
81
+ ---
82
+
83
+ ## Quality Engineering
84
+
85
+ Advanced testing and monitoring built into Claude Mobile:
86
+
87
+ | Feature | What it does | How to use |
88
+ |---------|-------------|------------|
89
+ | **Accessibility Auditing** | WCAG 2.2 checks: missing labels, touch targets < 48px, focus order, duplicates | `accessibility(action:'audit')` |
90
+ | **Visual Regression** | Baseline screenshots + pixel-level diff detection | `visual(action:'baseline_save')`, `visual(action:'compare')` |
91
+ | **Test Recorder** | Record taps/swipes/input, replay without code | `recorder(action:'start')`, `recorder(action:'play')` |
92
+ | **Multi-Device Sync** | Barrier-based coordination for parallel testing | `sync(action:'create')`, `sync(action:'barrier')` |
93
+ | **App Autopilot** | Autonomous BFS/DFS exploration with self-healing locators | `autopilot(action:'explore')` |
94
+ | **Performance Monitor** | Real-time memory, CPU, FPS tracking with snapshots | `performance(action:'start')`, `performance(action:'snapshot')` |
95
+
96
+ ---
23
97
 
24
98
  ## Installation
25
99
 
26
- ### Native CLI via Homebrew (macOS)
100
+ ### Homebrew (macOS)
27
101
 
28
102
  ```bash
29
103
  brew tap AlexGladkov/claude-in-mobile https://github.com/AlexGladkov/claude-in-mobile
30
104
  brew install claude-in-mobile
31
105
  ```
32
106
 
33
- The CLI wraps all device automation tools plus store management (Google Play, Huawei AppGallery, RuStore):
107
+ Verify setup:
34
108
 
35
109
  ```bash
36
- claude-in-mobile screenshot android
37
- claude-in-mobile tap android 540 960 --from-size 540x960
38
- claude-in-mobile store upload --package com.example.app --file app.aab
39
- claude-in-mobile huawei upload --package com.example.app --file app.aab
40
- claude-in-mobile rustore upload --package com.example.app --file app.apk
110
+ claude-in-mobile doctor
41
111
  ```
42
112
 
43
113
  ### One-liner (any client)
44
114
 
45
- Using [add-mcp](https://github.com/neondatabase/add-mcp) — auto-detects installed clients:
115
+ Auto-detects installed clients via [add-mcp](https://github.com/neondatabase/add-mcp):
46
116
 
47
117
  ```bash
48
118
  npx add-mcp claude-in-mobile -y
49
119
  ```
50
120
 
51
- Or target a specific client:
121
+ Target a specific client:
52
122
 
53
123
  ```bash
54
124
  npx add-mcp claude-in-mobile -a claude-code -y
@@ -56,30 +126,39 @@ npx add-mcp claude-in-mobile -a opencode -y
56
126
  npx add-mcp claude-in-mobile -a cursor -y
57
127
  ```
58
128
 
59
- ### Claude Code CLI
129
+ ### Claude Code
60
130
 
61
131
  ```bash
132
+ # Project-local
62
133
  claude mcp add --transport stdio mobile -- npx claude-in-mobile@latest
134
+
135
+ # Global (all projects)
136
+ claude mcp add --scope user --transport stdio mobile -- npx claude-in-mobile@latest
63
137
  ```
64
138
 
65
- To add globally (available in all projects):
139
+ #### Claude Code Plugin
66
140
 
67
141
  ```bash
68
- claude mcp add --scope user --transport stdio mobile -- npx claude-in-mobile@latest
142
+ claude plugin marketplace add AlexGladkov/claude-in-mobile
143
+ claude plugin install claude-in-mobile@claude-in-mobile
69
144
  ```
70
145
 
71
146
  ### OpenCode
72
147
 
73
- Use the interactive setup:
148
+ Two modes:
149
+
150
+ **A) MCP server** (Node.js):
74
151
 
75
152
  ```bash
76
153
  opencode mcp add
154
+ # Choose local MCP → npx -y claude-in-mobile
77
155
  ```
78
156
 
79
- Or add manually to `opencode.json` (project root or `~/.config/opencode/opencode.json`):
157
+ Or in `opencode.json`:
80
158
 
81
159
  ```json
82
160
  {
161
+ "$schema": "https://opencode.ai/config.json",
83
162
  "mcp": {
84
163
  "mobile": {
85
164
  "type": "local",
@@ -90,89 +169,73 @@ Or add manually to `opencode.json` (project root or `~/.config/opencode/opencode
90
169
  }
91
170
  ```
92
171
 
93
- ### Cursor
94
-
95
- Add to `.cursor/mcp.json`:
172
+ **B) Native CLI + Skill** (no Node.js needed):
96
173
 
97
- ```json
98
- {
99
- "mcpServers": {
100
- "mobile": {
101
- "command": "npx",
102
- "args": ["-y", "claude-in-mobile"]
103
- }
104
- }
105
- }
174
+ ```bash
175
+ claude-in-mobile setup opencode # project-local
176
+ claude-in-mobile setup opencode --global # user-wide
106
177
  ```
107
178
 
108
- ### Any MCP Client
179
+ ### Other Agents
109
180
 
110
- Print a config snippet for your client:
181
+ Native CLI skill works with any agent that supports Agent Skills:
111
182
 
112
183
  ```bash
113
- npx claude-in-mobile --init <client-name>
114
- # Supported: opencode, cursor, claude-code
184
+ claude-in-mobile setup pi --global # Pi
185
+ claude-in-mobile setup qwen --global # Qwen Code
186
+ claude-in-mobile setup gemini --global # Gemini CLI
187
+ claude-in-mobile setup codex --global # Codex
188
+ claude-in-mobile setup cursor --global # Cursor
115
189
  ```
116
190
 
117
- ### From npm
191
+ Drop `--global` for project-local install. Restart the agent after setup.
118
192
 
119
- ```bash
120
- npx claude-in-mobile
121
- ```
193
+ <details>
194
+ <summary>MCP server config for Qwen / Gemini / Codex / Cursor</summary>
122
195
 
123
- ### From source
196
+ **Qwen Code** — `.qwen/settings.json` or `~/.qwen/settings.json`:
124
197
 
125
- ```bash
126
- git clone https://github.com/AlexGladkov/claude-in-mobile.git
127
- cd claude-in-mobile
128
- npm install
129
- npm run build:all # Builds TypeScript + Desktop companion
198
+ ```json
199
+ { "mcpServers": { "mobile": { "command": "npx", "args": ["-y", "claude-in-mobile"] } } }
130
200
  ```
131
201
 
132
- > **Note:** For Desktop support, you need to run `npm run build:desktop` (or `build:all`) to compile the Desktop companion app.
202
+ **Gemini CLI** `.gemini/settings.json` or `~/.gemini/settings.json`:
133
203
 
134
- #### Using a local build with MCP clients
204
+ ```json
205
+ { "mcpServers": { "mobile": { "command": "npx", "args": ["-y", "claude-in-mobile"] } } }
206
+ ```
135
207
 
136
- After building from source, point your MCP client to the local `dist/index.js` instead of using npx:
208
+ **Codex**:
137
209
 
138
- ```json
139
- {
140
- "mcpServers": {
141
- "mobile": {
142
- "command": "node",
143
- "args": ["/path/to/claude-in-mobile/dist/index.js"]
144
- }
145
- }
146
- }
210
+ ```bash
211
+ codex mcp add mobile -- npx -y claude-in-mobile
147
212
  ```
148
213
 
149
- For OpenCode (`opencode.json`):
214
+ **Cursor** `.cursor/mcp.json`:
150
215
 
151
216
  ```json
152
- {
153
- "mcp": {
154
- "mobile": {
155
- "type": "local",
156
- "command": ["node", "/path/to/claude-in-mobile/dist/index.js"],
157
- "enabled": true
158
- }
159
- }
160
- }
217
+ { "mcpServers": { "mobile": { "command": "npx", "args": ["-y", "claude-in-mobile"] } } }
161
218
  ```
162
219
 
163
- ### Manual configuration
220
+ </details>
164
221
 
165
- Add to your Claude Code settings (`~/.claude.json` or project settings):
222
+ ### From npm / source
223
+
224
+ ```bash
225
+ # npm (no install)
226
+ npx claude-in-mobile
227
+
228
+ # From source
229
+ git clone https://github.com/AlexGladkov/claude-in-mobile.git
230
+ cd claude-in-mobile
231
+ npm install
232
+ npm run build:all
233
+ ```
234
+
235
+ Using a local build with any MCP client:
166
236
 
167
237
  ```json
168
- {
169
- "mcpServers": {
170
- "mobile": {
171
- "command": "npx",
172
- "args": ["-y", "claude-in-mobile"]
173
- }
174
- }
175
- }
238
+ { "mcpServers": { "mobile": { "command": "node", "args": ["/path/to/claude-in-mobile/dist/index.js"] } } }
176
239
  ```
177
240
 
178
241
  ### Windows
@@ -181,157 +244,266 @@ Add to your Claude Code settings (`~/.claude.json` or project settings):
181
244
  claude mcp add --transport stdio mobile -- cmd /c npx claude-in-mobile@latest
182
245
  ```
183
246
 
184
- ## Requirements
247
+ ---
248
+
249
+ ## Platform Guides
185
250
 
186
251
  ### Android
187
- - ADB installed and in PATH
188
- - Connected Android device (USB debugging enabled) or emulator
189
252
 
190
- ### iOS
191
- - macOS with Xcode installed
192
- - iOS Simulator (no physical device support yet)
193
- - **WebDriverAgent** for full UI inspection and element-based interaction:
194
- ```bash
195
- npm install -g appium
196
- appium driver install xcuitest
197
- ```
198
- Or set `WDA_PATH` environment variable to custom WebDriverAgent location
253
+ **Requirements:**
254
+ - ADB installed (auto-discovered or set `ADB_PATH`)
255
+ - USB debugging enabled on device, or running emulator
199
256
 
200
- ### Desktop
201
- - macOS (Windows/Linux support planned)
202
- - JDK 17+ for building the Desktop companion
203
- - Any macOS application (SwiftUI, AppKit, Electron, Compose) — launch by `bundleId`, `.app` path, or attach by PID
204
- - Accessibility permissions required: System Settings → Privacy & Security → Accessibility
257
+ **ADB discovery order:**
205
258
 
206
- ### Aurora OS
207
- - audb CLI installed and in PATH (`cargo install audb-client`)
208
- - Connected Aurora OS device with SSH enabled
209
- - Python on device required for tap/swipe: `devel-su pkcon install python`
259
+ | Priority | Location |
260
+ |----------|----------|
261
+ | 1 | `ADB_PATH` env var |
262
+ | 2 | `$ANDROID_HOME/platform-tools/adb` |
263
+ | 3 | `$ANDROID_SDK_ROOT/platform-tools/adb` |
264
+ | 4 | OS default: `~/Library/Android/sdk` (macOS), `%LOCALAPPDATA%\Android\Sdk` (Windows), `~/Android/Sdk` (Linux) |
265
+ | 5 | `adb` from `PATH` |
210
266
 
211
- ## Available Tools
267
+ If none found → `[ADB_NOT_INSTALLED]` error with probed paths.
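Editor's illustration (not part of the diff): the discovery order in the table above can be sketched as a small pure function. This is a minimal sketch under stated assumptions, not the contents of the package's `dist/adb/resolver.js`. The names `adbCandidates`, `resolveAdb`, and `isUsable` are hypothetical, and the `adb.exe` file name on Windows is an assumption the README does not state.

```typescript
// Hypothetical sketch of the documented discovery order; not the package's code.
type Platform = "darwin" | "win32" | "linux";

interface Env {
  ADB_PATH?: string;
  ANDROID_HOME?: string;
  ANDROID_SDK_ROOT?: string;
  LOCALAPPDATA?: string;
  HOME?: string;
}

// Build the ordered candidate list, priority 1 first.
function adbCandidates(env: Env, platform: Platform): string[] {
  const sep = platform === "win32" ? "\\" : "/";
  // Assumption: the binary is named adb.exe on Windows.
  const exe = platform === "win32" ? "adb.exe" : "adb";
  const inSdk = (sdk: string): string => [sdk, "platform-tools", exe].join(sep);

  const candidates: string[] = [];
  if (env.ADB_PATH) candidates.push(env.ADB_PATH);                        // priority 1
  if (env.ANDROID_HOME) candidates.push(inSdk(env.ANDROID_HOME));         // priority 2
  if (env.ANDROID_SDK_ROOT) candidates.push(inSdk(env.ANDROID_SDK_ROOT)); // priority 3

  // Priority 4: OS default SDK location.
  if (platform === "darwin" && env.HOME) {
    candidates.push(inSdk(`${env.HOME}/Library/Android/sdk`));
  } else if (platform === "win32" && env.LOCALAPPDATA) {
    candidates.push(inSdk(`${env.LOCALAPPDATA}\\Android\\Sdk`));
  } else if (platform === "linux" && env.HOME) {
    candidates.push(inSdk(`${env.HOME}/Android/Sdk`));
  }

  // Priority 5: bare name, left for the caller to look up on PATH.
  candidates.push(exe);
  return candidates;
}

// Return the first usable candidate, or fail while listing every probed path.
function resolveAdb(candidates: string[], isUsable: (path: string) => boolean): string {
  const found = candidates.find(isUsable);
  if (found === undefined) {
    throw new Error(`[ADB_NOT_INSTALLED] probed: ${candidates.join(", ")}`);
  }
  return found;
}
```

The existence check is injected as `isUsable` so the ordering logic can be exercised without touching the file system; a real caller would pass a predicate that checks the file and searches `PATH` for the bare name.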
212
268
 
213
- v3.7.0 provides **8 core meta-tools** + **3 optional modules**. Each meta-tool uses an `action` parameter to select the operation. All v3.0/v3.1 tool names still work as backward-compatible aliases.
269
+ **Examples:**
214
270
 
215
- ### Core Meta-Tools (always loaded)
271
+ ```
272
+ "Show connected devices"
273
+ "Take a screenshot on Android"
274
+ "Tap on Settings"
275
+ "Swipe down to scroll"
276
+ "Type 'hello' in the search field"
277
+ "Press the back button"
278
+ "Grant camera permission to com.example.app"
279
+ "Launch com.example.app"
280
+ ```
216
281
 
217
- | Meta-Tool | Actions | Description |
218
- |-----------|---------|-------------|
219
- | `device` | `list`, `set`, `set_target`, `get_target`, `enable_module`, `disable_module`, `list_modules` | Device management and module control |
220
- | `input` | `tap`, `double_tap`, `long_press`, `swipe`, `text`, `key` | Touch/keyboard input |
221
- | `screen` | `capture`, `annotate` | Screenshots and visual annotation |
222
- | `ui` | `tree`, `find`, `find_tap`, `tap_text`, `analyze`, `wait`, `assert_visible`, `assert_gone` | UI hierarchy and element interaction |
223
- | `app` | `launch`, `stop`, `install`, `list` | App lifecycle management |
224
- | `system` | `activity`, `shell`, `wait`, `open_url`, `logs`, `clear_logs`, `info`, `webview`, `clipboard_*`, `permission_*`, `file_*`, `metrics`, `reset_metrics` | System operations, clipboard, permissions, files, telemetry |
225
- | `flow_batch` | — | Execute multiple commands in one round-trip |
226
- | `flow_run` | — | Multi-step automation with conditionals and loops |
282
+ **CLI:**
227
283
 
228
- ### Optional Modules (loaded on demand)
284
+ ```bash
285
+ claude-in-mobile screenshot android
286
+ claude-in-mobile tap android 540 960
287
+ claude-in-mobile input android "hello world"
288
+ claude-in-mobile ui-dump android | grep "Login"
289
+ ```
229
290
 
230
- These modules are hidden by default to save tokens. They auto-enable when you call them, or use `device(action:'enable_module', module:'<name>')`.
291
+ ---
231
292
 
232
- | Module | Actions | Description |
233
- |--------|---------|-------------|
234
- | `browser` | `open`, `close`, `list_sessions`, `navigate`, `click`, `fill`, `fill_form`, `press_key`, `snapshot`, `screenshot`, `evaluate`, `wait_for_selector`, `clear_session` | Chrome/Chromium automation via CDP |
235
- | `desktop` | `launch`, `stop`, `windows`, `focus`, `resize`, `clipboard_get`, `clipboard_set`, `performance`, `monitors` | Desktop app testing (any macOS app: SwiftUI, AppKit, Electron, Compose) |
236
- | `store` | `upload`, `set_notes`, `submit`, `get_releases`, `discard`, `promote`, `halt_rollout`, `get_versions` | Google Play, Huawei AppGallery, RuStore publishing |
293
+ ### iOS
237
294
 
238
- ### Flow Tools
295
+ **Requirements:**
296
+ - macOS with Xcode
297
+ - iOS Simulator (no physical device support yet)
298
+ - WebDriverAgent for full UI inspection (optional but recommended)
239
299
 
240
- | Tool | Description |
241
- |------|-------------|
242
- | `flow_batch` | Sequential execution of multiple commands in one round-trip (max 50) |
243
- | `flow_run` | Multi-step flows with `if_not_found`, `repeat`, `on_error` handling (max 20 steps) |
244
- | `flow_parallel` | Run the same action on multiple devices concurrently via `Promise.allSettled` (max 10 devices) |
300
+ **WebDriverAgent setup:**
245
301
 
246
- ### Backward Compatibility
302
+ ```bash
303
+ # Automatic (via Appium)
304
+ npm install -g appium
305
+ appium driver install xcuitest
247
306
 
248
- All v3.0/v3.1 tool names work as aliases. For example, `tap` maps to `input(action:'tap')`, `screenshot` maps to `screen(action:'capture')`, `launch_app` maps to `app(action:'launch')`.
307
+ # Or set custom path
308
+ export WDA_PATH=/path/to/WebDriverAgent
309
+ ```
249
310
 
250
- > For detailed Desktop API documentation, see [Desktop Specification](docs/SPEC_DESKTOP.md)
311
+ On first use, WDA is auto-built (~2 min one-time), launched on simulator, and connected on port 8100+.
251
312
 
252
- ## Usage Examples
313
+ **What WDA enables:**
314
+ - `ui(action:'tree')` — full accessibility tree
315
+ - `ui(action:'find')` — element discovery by label/text
316
+ - `input(action:'tap', label:'...')` — element-based tapping
317
+ - Improved swipe and gesture simulation
253
318
 
254
- Just talk to Claude naturally:
319
+ **Troubleshooting:**
255
320
 
256
- ```
257
- "Show me all connected devices"
258
- "Take a screenshot of the Android emulator"
259
- "Take a screenshot on iOS"
260
- "Tap on Settings"
261
- "Swipe down to scroll"
262
- "Type 'hello world' in the search field"
263
- "Press the back button on Android"
264
- "Open Safari on iOS"
265
- "Switch to iOS simulator"
266
- "Run the app on both platforms"
267
- ```
321
+ ```bash
322
+ # Install Xcode CLI tools
323
+ xcode-select --install
268
324
 
269
- ### Permission Management
325
+ # Accept license
326
+ sudo xcodebuild -license accept
270
327
 
271
- ```
272
- "Grant camera permission to com.example.app on Android"
273
- "Revoke location access from com.example.app"
274
- "Reset all permissions for com.apple.Maps on iOS"
328
+ # Check simulator is booted
329
+ xcrun simctl list | grep Booted
330
+
331
+ # Check port
332
+ lsof -i :8100
275
333
  ```
276
334
 
277
- ### Annotated Screenshots
335
+ <details>
336
+ <summary>Manual WDA test</summary>
278
337
 
279
- ```
280
- "Take an annotated screenshot" → Screenshot with green (clickable) and red (non-clickable) bounding boxes + numbered element index
338
+ ```bash
339
+ cd ~/.appium/node_modules/appium-xcuitest-driver/node_modules/appium-webdriveragent
340
+ xcodebuild test -project WebDriverAgent.xcodeproj \
341
+ -scheme WebDriverAgentRunner \
342
+ -destination 'platform=iOS Simulator,id=<DEVICE_UDID>'
281
343
  ```
282
344
 
283
- ### Platform Selection
345
+ </details>
284
346
 
285
- You can explicitly specify the platform:
347
+ **Examples:**
286
348
 
287
349
  ```
288
- "Screenshot on android" → Uses Android device
289
- "Screenshot on ios" → Uses iOS simulator
290
- "Screenshot on desktop" → Uses Desktop app
291
- "Screenshot on aurora" → Uses Aurora OS device
292
- "Screenshot" → Uses last active device
350
+ "Take a screenshot on iOS"
351
+ "Open Safari on iOS"
352
+ "Tap on the Login button"
353
+ "Type my email in the text field"
354
+ "Swipe left on the card"
355
+ "Reset all permissions for com.apple.Maps"
293
356
  ```
294
357
 
295
- Or set the active device:
358
+ ---
359
+
360
+ ### Desktop
361
+
362
+ **Requirements:**
363
+ - macOS (Windows/Linux planned)
364
+ - Accessibility permissions: System Settings → Privacy & Security → Accessibility
365
+ - JDK 17+ (for building Desktop companion)
366
+
367
+ **Supported apps:** Any macOS application — SwiftUI, AppKit, Electron, Compose Desktop.
368
+
369
+ **Launch modes:**
370
+
371
+ | Mode | Example |
372
+ |------|---------|
373
+ | By `bundleId` | `desktop(action:'launch', bundleId:'com.apple.Calculator')` |
374
+ | By `.app` path | `desktop(action:'launch', appPath:'/Applications/Slack.app')` |
375
+ | Attach by PID | `desktop(action:'launch', pid:12345)` |
376
+
377
+ **Enable the module first:**
296
378
 
297
379
  ```
298
- "Use the iPhone 15 simulator"
299
- "Switch to the Android emulator"
300
- "Switch to desktop"
301
- "Switch to Aurora device"
380
+ "Enable desktop module"
302
381
  ```
303
382
 
304
- ### Desktop Examples
383
+ Or it auto-enables on first `desktop(...)` call.
384
+
385
+ **Examples:**
305
386
 
306
387
  ```
307
- "Launch my desktop app from /path/to/app"
388
+ "Launch Calculator"
308
389
  "Take a screenshot of the desktop app"
309
- "Get window info"
390
+ "Get window list"
310
391
  "Resize window to 1280x720"
311
- "Tap at coordinates 100, 200"
392
+ "Tap at 100, 200 on desktop"
312
393
  "Get clipboard content"
313
- "Set clipboard to 'test text'"
314
394
  "Get performance metrics"
315
395
  "Stop the desktop app"
316
396
  ```
317
397
 
318
- ### Aurora Examples
398
+ > Full API documentation: [docs/SPEC_DESKTOP.md](docs/SPEC_DESKTOP.md)
399
+
400
+ ---
401
+
402
+ ### Browser
403
+
404
+ **Requirements:**
405
+ - Chrome or Chromium installed (or set `CHROME_PATH`)
406
+
407
+ Browser automation via Chrome DevTools Protocol (CDP). The `browser` module loads on demand.
408
+
409
+ **Examples:**
410
+
411
+ ```
412
+ "Open https://example.com in the browser"
413
+ "Click the Sign In button"
414
+ "Fill the email field with test@example.com"
415
+ "Take a browser screenshot"
416
+ "Execute JS: document.title"
417
+ "Wait for the loading spinner to disappear"
418
+ ```
419
+
420
+ **Available actions:**
421
+
422
+ | Action | Description |
423
+ |--------|-------------|
424
+ | `open` | Open URL in new session |
425
+ | `navigate` | Go to URL in existing session |
426
+ | `click` | Click element by ref |
427
+ | `fill` | Type into input field |
428
+ | `fill_form` | Fill multiple fields at once |
429
+ | `press_key` | Keyboard input |
430
+ | `snapshot` | DOM snapshot with element refs |
431
+ | `screenshot` | Visual screenshot |
432
+ | `evaluate` | Run JavaScript |
433
+ | `wait_for_selector` | Wait for element to appear |
434
+ | `close` | Close session |
435
+ | `list_sessions` | Show active sessions |
436
+ | `clear_session` | Reset cookies/storage |
437
+
438
+ ---
439
+
440
+ ### Aurora OS
441
+
442
+ **Requirements:**
443
+ - `audb` CLI: `cargo install audb-client`
444
+ - SSH-enabled Aurora OS device
445
+ - Python on device for tap/swipe: `devel-su pkcon install python`
446
+
447
+ **Examples:**
319
448
 
320
449
  ```
321
- "List all Aurora devices"
450
+ "List Aurora devices"
322
451
  "Take a screenshot on Aurora"
323
- "Tap at coordinates 100, 200 on Aurora"
452
+ "Tap at 100, 200 on Aurora"
324
453
  "Launch ru.example.app on Aurora"
325
- "List installed apps on Aurora device"
454
+ "List installed apps on Aurora"
326
455
  "Get logs from Aurora device"
327
- "Push file.txt to /home/defaultuser/ on Aurora device"
456
+ "Push file.txt to /home/defaultuser/"
328
457
  ```
329
458
 
459
+ ---
460
+
461
+ ## Tools Reference
462
+
463
+ v3.8.0 provides **8 core meta-tools** + **3 optional modules**. Each meta-tool uses an `action` parameter.
464
+
465
+ ### Core Meta-Tools
466
+
467
+ | Meta-Tool | Actions | Description |
468
+ |-----------|---------|-------------|
469
+ | `device` | `list`, `set`, `set_target`, `get_target`, `enable_module`, `disable_module`, `list_modules` | Device management, module control |
470
+ | `input` | `tap`, `double_tap`, `long_press`, `swipe`, `text`, `key` | Touch and keyboard input |
471
+ | `screen` | `capture`, `annotate` | Screenshots and visual annotation |
472
+ | `ui` | `tree`, `find`, `find_tap`, `tap_text`, `analyze`, `wait`, `assert_visible`, `assert_gone` | UI hierarchy, element interaction |
473
+ | `app` | `launch`, `stop`, `install`, `list` | App lifecycle |
474
+ | `system` | `activity`, `shell`, `wait`, `open_url`, `logs`, `clear_logs`, `info`, `webview`, `clipboard_*`, `permission_*`, `file_*`, `metrics`, `reset_metrics` | System ops, clipboard, permissions, files, telemetry |
475
+ | `flow_batch` | — | Execute multiple commands in one round-trip (max 50) |
476
+ | `flow_run` | — | Multi-step automation with conditionals and loops (max 20 steps) |
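Editor's illustration (not part of the diff): the table above groups many operations under a few tools, each selected by an `action` parameter. A minimal sketch of that dispatch pattern follows, assuming a handler table per meta-tool. `makeMetaTool` and the string return values are hypothetical and do not reflect the server's real handlers or response format.

```typescript
// Hypothetical sketch of action-based dispatch; not the package's implementation.
type Handler = (args: Record<string, unknown>) => string;

function makeMetaTool(name: string, handlers: Record<string, Handler>) {
  return (call: { action: string } & Record<string, unknown>): string => {
    const { action, ...args } = call;
    // Own-property check so inherited keys such as "toString" are not treated as actions.
    const handler = Object.prototype.hasOwnProperty.call(handlers, action)
      ? handlers[action]
      : undefined;
    if (!handler) {
      const known = Object.keys(handlers).join(", ");
      throw new Error(`${name}: unknown action '${action}' (expected one of: ${known})`);
    }
    return handler(args);
  };
}

// The `screen` meta-tool from the table, with placeholder handlers.
const screen = makeMetaTool("screen", {
  capture: () => "captured",
  annotate: () => "annotated",
});
```

One schema per meta-tool, rather than one per operation, is what lets the tool list stay at 8 entries while still exposing every action.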
477
+
478
+ ### Optional Modules
479
+
480
+ Load on demand via `device(action:'enable_module', module:'<name>')` or auto-enable on first call.
481
+
482
+ | Module | Actions | Description |
483
+ |--------|---------|-------------|
484
+ | `browser` | `open`, `close`, `list_sessions`, `navigate`, `click`, `fill`, `fill_form`, `press_key`, `snapshot`, `screenshot`, `evaluate`, `wait_for_selector`, `clear_session` | Chrome/Chromium via CDP |
485
+ | `desktop` | `launch`, `stop`, `windows`, `focus`, `resize`, `clipboard_get`, `clipboard_set`, `performance`, `monitors` | Any macOS app |
486
+ | `store` | `upload`, `set_notes`, `submit`, `get_releases`, `discard`, `promote`, `halt_rollout`, `get_versions` | Google Play, Huawei AppGallery, RuStore |
487
+
488
+ ### Flow Tools
489
+
490
+ | Tool | Description |
491
+ |------|-------------|
492
+ | `flow_batch` | Sequential execution, one round-trip (max 50 commands) |
493
+ | `flow_run` | Multi-step flows with `if_not_found`, `repeat`, `on_error` (max 20 steps) |
494
+ | `flow_parallel` | Same action on multiple devices via `Promise.allSettled` (max 10) |
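Editor's illustration (not part of the diff): the `flow_parallel` row names `Promise.allSettled` and a limit of 10 devices. The sketch below shows that fan-out pattern under stated assumptions. `fanOut`, `DeviceResult`, and the error wording are hypothetical; only the limit and the use of `Promise.allSettled` come from the table.

```typescript
// Hypothetical sketch; names and result shape are illustrative, not the package's API.
const MAX_PARALLEL_DEVICES = 10; // limit quoted in the Flow Tools table

interface DeviceResult<T> {
  deviceId: string;
  ok: boolean;
  value?: T;
  error?: string;
}

function fanOut<T>(
  deviceIds: string[],
  action: (deviceId: string) => Promise<T>,
): Promise<DeviceResult<T>[]> {
  if (deviceIds.length > MAX_PARALLEL_DEVICES) {
    throw new Error(
      `flow_parallel: ${deviceIds.length} devices exceeds max ${MAX_PARALLEL_DEVICES}`,
    );
  }
  // Schedule every action; the promise chain turns a synchronous throw
  // inside `action` into a rejection for that device only.
  const tasks = deviceIds.map((id) => Promise.resolve().then(() => action(id)));
  // allSettled waits for all tasks and never rejects, so one failing device
  // does not hide the results from the others.
  return Promise.allSettled(tasks).then((settled) =>
    settled.map((result, i): DeviceResult<T> =>
      result.status === "fulfilled"
        ? { deviceId: deviceIds[i], ok: true, value: result.value }
        : { deviceId: deviceIds[i], ok: false, error: String(result.reason) },
    ),
  );
}
```

Compared with `Promise.all`, which rejects on the first failure, this returns one entry per device in input order, so a caller can report partial success.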
495
+
496
+ ### Backward Compatibility
497
+
498
+ All v3.0/v3.1 tool names work as aliases: `tap` → `input(action:'tap')`, `screenshot` → `screen(action:'capture')`, `launch_app` → `app(action:'launch')`, etc.
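Editor's illustration (not part of the diff): the alias scheme described above amounts to a lookup from a legacy tool name to a meta-tool plus action. The sketch below lists only the three mappings the README names; the "etc." is left unfilled. `LEGACY_ALIASES`, `resolveAlias`, and the `MetaCall` shape are hypothetical.

```typescript
// Hypothetical alias table: only the three mappings named in the README are listed.
interface MetaCall {
  tool: string;
  action: string;
  args: Record<string, unknown>;
}

const LEGACY_ALIASES: Record<string, { tool: string; action: string }> = {
  tap: { tool: "input", action: "tap" },
  screenshot: { tool: "screen", action: "capture" },
  launch_app: { tool: "app", action: "launch" },
};

// Translate a legacy call into a meta-tool call, or return undefined if the
// name is not a known alias.
function resolveAlias(
  name: string,
  args: Record<string, unknown> = {},
): MetaCall | undefined {
  // Own-property check so names like "toString" are not mistaken for aliases.
  if (!Object.prototype.hasOwnProperty.call(LEGACY_ALIASES, name)) return undefined;
  const target = LEGACY_ALIASES[name];
  return { tool: target.tool, action: target.action, args };
}
```

For example, `resolveAlias("launch_app", { package: "com.example.app" })` yields a call to `app` with `action: "launch"`; the `package` argument name here is also only illustrative.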
499
+
500
+ ---
501
+
330
502
  ## Native CLI
331
503
 
332
- A 2 MB native Rust binary with all the same commands. No Node.js, no dependencies.
504
+ 2 MB Rust binary. No Node.js, no dependencies.
333
505
 
334
- ### Install CLI
506
+ ### Install
335
507
 
336
508
  ```bash
337
509
  brew tap AlexGladkov/claude-in-mobile
@@ -340,15 +512,16 @@ brew install claude-in-mobile
340
512
 
341
513
  Or download from [Releases](https://github.com/AlexGladkov/claude-in-mobile/releases).
342
514
 
343
- ### Advantages over MCP
515
+ ### Why use the CLI
344
516
 
345
- - **Easy install** `brew install` or copy a single 2 MB binary
346
- - **No dependencies** — no Node.js, no npm, nothing
347
- - **Use from terminal** run commands directly, no Claude Code or MCP client needed
348
- - **Test automation** write universal `.sh` scripts for any platform without learning platform internals
349
- - **Token-efficient** skill documentation loads only when used; MCP v3.4.0 reduced schema overhead by ~85% (8 meta-tools vs 81 individual tools)
350
- - **Fast** ~5ms command startup (Rust) vs ~500ms (Node.js MCP)
351
- - **CI/CD ready** exit codes, stdout/stderr, runs anywhere
517
+ | | CLI | MCP Server |
518
+ |---|---|---|
519
+ | **Install** | `brew install` or copy binary | `npx` / npm |
520
+ | **Dependencies** | None | Node.js |
521
+ | **Startup** | ~5ms | ~500ms |
522
+ | **Use from terminal** | Direct commands | Needs MCP client |
523
+ | **CI/CD** | Exit codes, stdout/stderr | Not designed for CI |
524
+ | **Token cost** | Skill loads on demand | Schema always present |
352
525
 
353
526
  ### Test script example
354
527
 
@@ -362,75 +535,27 @@ claude-in-mobile screenshot android -o result.png
362
535
  claude-in-mobile ui-dump android | grep "Welcome" && echo "PASS" || echo "FAIL"
363
536
  ```
364
537
 
365
- ### Claude Code Plugin
538
+ ### Store management (CLI)
366
539
 
367
540
  ```bash
368
- claude plugin marketplace add AlexGladkov/claude-in-mobile
369
- claude plugin install claude-in-mobile@claude-in-mobile
370
- ```
371
-
372
- After installing, Claude Code controls devices with natural language. The skill loads into context only on demand — no token overhead when not in use.
373
-
374
- See [cli/README.md](cli/README.md) for full CLI documentation.
375
-
376
- ## iOS WebDriverAgent Setup
377
-
378
- For full iOS UI inspection and element-based interaction, WebDriverAgent is required. It enables:
379
- - `get_ui` - JSON accessibility tree inspection
380
- - `tap` with `label` or `text` parameters - Element-based tapping
381
- - `find_element` - Element discovery and querying
382
- - `swipe` - Improved gesture simulation
383
-
384
- ### Installation
385
-
386
- **Automatic (via Appium):**
387
- ```bash
388
- npm install -g appium
389
- appium driver install xcuitest
390
- ```
391
-
392
- **Manual:**
393
- Set the `WDA_PATH` environment variable to your WebDriverAgent location:
394
- ```bash
395
- export WDA_PATH=/path/to/WebDriverAgent
541
+ claude-in-mobile store upload --package com.example.app --file app.aab
542
+ claude-in-mobile huawei upload --package com.example.app --file app.aab
543
+ claude-in-mobile rustore upload --package com.example.app --file app.apk
396
544
  ```
397
545
 
398
- ### First Use
399
-
400
- On first use, WebDriverAgent will be automatically:
401
- 1. Discovered from Appium installation or `WDA_PATH`
402
- 2. Built with xcodebuild (one-time, ~2 minutes)
403
- 3. Launched on the iOS simulator
404
- 4. Connected via HTTP on port 8100+
546
+ ### Doctor
405
547
 
406
- ### Troubleshooting
548
+ Check all dependencies at once:
407
549
 
408
- **Build fails:**
409
550
  ```bash
410
- # Install Xcode command line tools
411
- xcode-select --install
412
-
413
- # Accept license
414
- sudo xcodebuild -license accept
415
-
416
- # Set Xcode path
417
- sudo xcode-select -s /Applications/Xcode.app
551
+ claude-in-mobile doctor
418
552
  ```
419
553
 
420
- **Session fails:**
421
- - Ensure simulator is booted: `xcrun simctl list | grep Booted`
422
- - Check port availability: `lsof -i :8100`
423
- - Try restarting the simulator
554
+ Checks: ADB, ANDROID_HOME, Xcode, simctl, Appium, WDA, JDK, audb-client, Chrome. Color-coded output with fix suggestions.
424
555
 
425
- **Manual test:**
426
- ```bash
427
- cd ~/.appium/node_modules/appium-xcuitest-driver/node_modules/appium-webdriveragent
428
- xcodebuild test -project WebDriverAgent.xcodeproj \
429
- -scheme WebDriverAgentRunner \
430
- -destination 'platform=iOS Simulator,id=<DEVICE_UDID>'
431
- ```
556
+ ---
432
557
 
433
- ## How It Works
558
+ ## Architecture
434
559
 
435
560
  ```
436
561
  ┌─────────────┐ ┌──────────────────┐ ┌─────────────────┐
@@ -438,19 +563,21 @@ xcodebuild test -project WebDriverAgent.xcodeproj \
438
563
  ├─────────────┤ │ Claude Mobile │ ├─────────────────┤
439
564
  │ OpenCode │────▶│ MCP Server │────▶│ iOS (simctl+WDA)│
440
565
  ├─────────────┤ │ │ ├─────────────────┤
441
- │ Cursor │────▶│ 8 meta-tools │────▶│ Desktop (Compose)│
566
+ │ Cursor │────▶│ 8 meta-tools │────▶│ Desktop (macOS) │
442
567
  ├─────────────┤ │ + 3 modules │ ├─────────────────┤
443
- Any MCP │────▶│ (auto-detects │────▶│ Aurora (audb) │
444
- Client│ client) │ ├─────────────────┤
445
- └─────────────┘ │────▶│ Browser (CDP) │
446
- └──────────────────┘ └─────────────────┘
568
+ │ Qwen/Gemini │────▶│ │────▶│ Aurora (audb) │
569
+ ├─────────────┤ │ Auto-detects │ ├─────────────────┤
570
+ │ Any MCP │────▶│ platform │────▶│ Browser (CDP) │
571
+ └─────────────┘ └──────────────────┘ └─────────────────┘
447
572
  ```
448
573
 
449
- 1. Claude sends commands through MCP protocol (8 meta-tools + 3 optional modules)
450
- 2. Server routes to appropriate platform (ADB, simctl+WDA, Desktop, audb, or CDP)
451
- 3. Commands execute on your device, desktop app, or browser
452
- 4. Results (screenshots, UI data, metrics) return to Claude
453
- 5. Dynamic modules auto-enable when first called — no manual setup needed
574
+ 1. Client sends commands via MCP protocol (8 meta-tools + 3 optional modules)
575
+ 2. Server routes to platform adapter (ADB, simctl+WDA, Desktop, audb, CDP)
576
+ 3. Commands execute on device/app/browser
577
+ 4. Results (screenshots, UI trees, metrics) return to client
578
+ 5. Modules auto-enable on first call — no manual setup needed
579
+
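Steps 2 and 5 above (routing to a platform backend, and enabling a module on its first call) can be sketched as a small router. This is an illustration under stated assumptions rather than the server's actual code: `createRouter`, `BACKENDS`, and the platform keys are hypothetical, and treating `browser`, `desktop`, and `store` as the three optional modules is a guess based on the tool table. Only the backend names (ADB, simctl+WDA, Desktop, audb, CDP) come from the README.

```typescript
// Hypothetical sketch: platform keys and function names are assumptions.
type Platform = "android" | "ios" | "desktop" | "aurora" | "browser";

// Backend named for each platform in step 2 of the list above.
const BACKENDS: Record<Platform, string> = {
  android: "ADB",
  ios: "simctl+WDA",
  desktop: "Desktop",
  aurora: "audb",
  browser: "CDP",
};

type RouteResult = { backend: string; newlyEnabled: boolean };

// Build a router that tracks which optional modules have been enabled so far.
function createRouter(optionalModules: ReadonlySet<string>) {
  const enabled = new Set<string>();

  return function route(tool: string, platform: Platform): RouteResult {
    let newlyEnabled = false;
    // Step 5: an optional module is switched on the first time it is called.
    if (optionalModules.has(tool) && !enabled.has(tool)) {
      enabled.add(tool);
      newlyEnabled = true;
    }
    // Step 2: pick the backend for the requested platform.
    return { backend: BACKENDS[platform], newlyEnabled };
  };
}
```

With `createRouter(new Set(["browser", "desktop", "store"]))`, the first `browser` call would report `newlyEnabled: true` and later calls `false`, matching the "no manual setup" behaviour described in step 5.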
580
+ ---
454
581
 
455
582
  ## License
456
583