mobile-device-mcp 0.2.2 → 0.2.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,225 +1,328 @@
1
- # mobile-device-mcp
2
-
3
- MCP server that gives AI coding assistants (Claude Code, Cursor, Windsurf) the ability to **see and interact with mobile devices**. 34 tools for screenshots, UI inspection, touch interaction, AI-powered visual analysis, and Flutter widget tree inspection.
4
-
5
- > AI assistants can read your code but can't see your phone. This fixes that.
6
-
7
- ## The Problem
8
-
9
- Web developers have browser DevTools, Playwright, and Puppeteer — AI assistants can click around, take screenshots, and verify fixes. Mobile developers? They're stuck manually screenshotting, copying logs, and describing what's on screen. They're **human middleware** between the AI and the device.
10
-
11
- ## What This Does
12
-
13
- ```
14
- Developer: "The login button doesn't work"
15
-
16
- Without this tool: With this tool:
17
- 1. Manually screenshot 1. AI calls take_screenshot → sees the screen
18
- 2. Paste into AI chat 2. AI calls smart_tap("login button") → taps it
19
- 3. AI guesses what's wrong 3. AI calls verify_screen("error message shown") → sees result
20
- 4. Apply fix, rebuild 4. AI calls visual_diff confirms fix worked
21
- 5. Repeat 4-5 times 5. Done.
22
- ```
23
-
24
- ## Quick Start
25
-
26
- ### Prerequisites
27
- - Node.js 18+
28
- - Android device/emulator connected via ADB
29
- - ADB installed (Android SDK Platform Tools)
30
-
31
- ### Setup (One-time, 30 seconds)
32
-
33
- 1. **Get a Google AI key** (free tier available): [aistudio.google.com/apikey](https://aistudio.google.com/apikey)
34
-
35
- 2. **Add `.mcp.json` to your project root:**
36
-
37
- ```json
38
- {
39
- "mcpServers": {
40
- "mobile-device": {
41
- "type": "stdio",
42
- "command": "npx",
43
- "args": ["-y", "mobile-device-mcp"],
44
- "env": {
45
- "GOOGLE_API_KEY": "your-google-api-key"
46
- }
47
- }
48
- }
49
- }
50
- ```
51
-
52
- 3. **Open your AI coding assistant** from that directory. That's it.
53
-
54
- The server starts and stops automatically — you never run it manually. Your AI assistant manages it as a background process via the MCP protocol.
55
-
56
- ### Verify It Works
57
-
58
- **Claude Code:** type `/mcp` you should see `mobile-device: Connected`
59
-
60
- **Cursor:** check MCP panel in settings
61
-
62
- Then just talk to your phone:
63
-
64
- ```
65
- You: "Open my app, tap the login button, type test@email.com in the email field"
66
- AI: [takes screenshot → sees the screen → smart_tap("login button") → smart_type("email field", "test@email.com")]
67
-
68
- You: "Find all the bugs on this screen"
69
- AI: [analyze_screen → inspects layout, checks for overflow, missing labels, broken states]
70
-
71
- You: "Navigate to settings and verify dark mode works"
72
- AI: [smart_tap("settings") → take_screenshot → smart_tap("dark mode toggle") → visual_diff → reports result]
73
- ```
74
-
75
- No test scripts. No manual screenshots. Just describe what you want in plain English.
76
-
77
- ### Works with Any AI Coding Assistant
78
-
79
- | Tool | Config file | Docs |
80
- |------|------------|------|
81
- | **Claude Code** | `.mcp.json` in project root | [claude.ai/docs](https://claude.ai/docs) |
82
- | **Cursor** | `.cursor/mcp.json` | [cursor.com/docs](https://cursor.com/docs) |
83
- | **VS Code + Copilot** | MCP settings | [code.visualstudio.com](https://code.visualstudio.com) |
84
- | **Windsurf** | MCP settings | [windsurf.com](https://windsurf.com) |
85
-
86
- All use the same JSON config — just put it in the right file for your editor.
87
-
88
- ### Drop Into Any Project
89
-
90
- Copy `.mcp.json` into any mobile project — Flutter, React Native, Kotlin, Swift — and your AI assistant gets device superpowers in that directory. No global install needed.
91
-
92
- ## Tools (34 total)
93
-
94
- ### Phase 1 Device Control (18 tools)
95
-
96
- | Tool | What it does |
97
- |------|-------------|
98
- | `list_devices` | List all connected Android devices/emulators |
99
- | `get_device_info` | Model, manufacturer, Android version, SDK level |
100
- | `get_screen_size` | Screen resolution in pixels |
101
- | `take_screenshot` | Capture screenshot (PNG or JPEG, configurable quality & resize) |
102
- | `get_ui_elements` | Get the accessibility/UI element tree as structured JSON |
103
- | `tap` | Tap at coordinates |
104
- | `double_tap` | Double tap at coordinates |
105
- | `long_press` | Long press at coordinates |
106
- | `swipe` | Swipe between two points |
107
- | `type_text` | Type text into the focused field |
108
- | `press_key` | Press a key (home, back, enter, volume, etc.) |
109
- | `list_apps` | List installed apps |
110
- | `get_current_app` | Get the foreground app |
111
- | `launch_app` | Launch an app by package name |
112
- | `stop_app` | Force stop an app |
113
- | `install_app` | Install an APK |
114
- | `uninstall_app` | Uninstall an app |
115
- | `get_logs` | Get logcat entries with filtering |
116
-
117
- ### Phase 2 AI Visual Analysis (8 tools)
118
-
119
- These tools use AI vision (Claude or Gemini) to understand what's on screen. Requires `ANTHROPIC_API_KEY` or `GOOGLE_API_KEY`.
120
-
121
- | Tool | What it does |
122
- |------|-------------|
123
- | `analyze_screen` | AI describes the screen: app name, screen type, interactive elements, visible text, suggestions |
124
- | `find_element` | Find a UI element by description: *"the login button"*, *"email input field"* |
125
- | `smart_tap` | Find an element by description and tap it in one step |
126
- | `smart_type` | Find an input field by description, focus it, and type text |
127
- | `suggest_actions` | Plan actions to achieve a goal: *"log into the app"*, *"add item to cart"* |
128
- | `visual_diff` | Compare current screen with a previous screenshot — what changed? |
129
- | `extract_text` | Extract all visible text from the screen (AI-powered OCR) |
130
- | `verify_screen` | Verify an assertion: *"the login was successful"*, *"error message is showing"* |
131
-
132
- ### Phase 3 Flutter Widget Tree (8 tools)
133
-
134
- These tools connect to a running Flutter app in debug/profile mode via the Dart VM Service Protocol. Maps every widget to its source code location (`file:line`).
135
-
136
- | Tool | What it does |
137
- |------|-------------|
138
- | `flutter_connect` | Discover and connect to a running Flutter app on the device |
139
- | `flutter_disconnect` | Disconnect from the Flutter app and clean up resources |
140
- | `flutter_get_widget_tree` | Get the full widget tree (summary or detailed) |
141
- | `flutter_get_widget_details` | Get detailed properties of a specific widget by ID |
142
- | `flutter_find_widget` | Search the widget tree by type, text, or description |
143
- | `flutter_get_source_map` | Map every widget to its source code location (file:line:column) |
144
- | `flutter_screenshot_widget` | Screenshot a specific widget in isolation |
145
- | `flutter_debug_paint` | Toggle debug paint overlay (shows widget boundaries & padding) |
146
-
147
- ## Performance
148
-
149
- The server is optimized to minimize latency and AI token costs:
150
-
151
- - **3-tier element search**: local text match (<1ms) → cached AI → fresh AI. `smart_tap` is 37x faster than naive AI calls.
152
- - **Screenshot compression**: AI tools auto-compress to JPEG q=80, 720w — **65% smaller** (251KB → 88KB) with zero quality loss. Saves ~55K tokens per screenshot.
153
- - **Parallel capture**: Screenshot + UI tree fetched simultaneously via `Promise.all()`.
154
- - **TTL caching**: 3-second cache avoids redundant ADB calls for rapid-fire tool usage.
155
-
156
- ## Environment Variables
157
-
158
- | Variable | Description | Default |
159
- |----------|-------------|---------|
160
- | `ANTHROPIC_API_KEY` | Anthropic API key for Claude vision | — |
161
- | `GOOGLE_API_KEY` or `GEMINI_API_KEY` | Google API key for Gemini vision (recommended — cheapest) | — |
162
- | `MCP_AI_PROVIDER` | Force AI provider: `"anthropic"` or `"google"` | Auto-detected |
163
- | `MCP_AI_MODEL` | Override AI model | `gemini-2.5-flash` / `claude-sonnet-4-20250514` |
164
- | `MCP_ADB_PATH` | Custom ADB binary path | Auto-discovered |
165
- | `MCP_DEFAULT_DEVICE` | Default device serial | Auto-discovered |
166
- | `MCP_SCREENSHOT_FORMAT` | `"png"` or `"jpeg"` | `jpeg` |
167
- | `MCP_SCREENSHOT_QUALITY` | JPEG quality (1-100) | `80` |
168
- | `MCP_SCREENSHOT_MAX_WIDTH` | Resize screenshots to this max width | `720` |
169
- | `MCP_AI_SCREENSHOT` | Send screenshots to AI (`"true"`/`"false"`) | `true` |
170
- | `MCP_AI_UITREE` | Send UI tree to AI (`"true"`/`"false"`) | `true` |
171
-
172
- ## Architecture
173
-
174
- ```
175
- src/
176
- ├── index.ts # CLI entry point (auto-discovery, env config)
177
- ├── server.ts # MCP server factory
178
- ├── types.ts # Shared interfaces
179
- ├── drivers/android/ # ADB driver (DeviceDriver implementation)
180
- │ ├── adb.ts # Low-level ADB command wrapper
181
- │ └── index.ts # AndroidDriver class
182
- ├── tools/ # MCP tool registrations
183
- │ ├── device-tools.ts # Device management
184
- │ ├── screen-tools.ts # Screenshots & UI inspection
185
- │ ├── interaction-tools.ts # Touch, type, keys
186
- │ ├── app-tools.ts # App management
187
- │ ├── log-tools.ts # Logcat
188
- │ ├── ai-tools.ts # AI-powered tools
189
- │ └── flutter-tools.ts # Flutter widget inspection tools
190
- ├── drivers/flutter/ # Dart VM Service driver
191
- │ ├── index.ts # FlutterDriver (discovery, inspection, source mapping)
192
- │ └── vm-service.ts # JSON-RPC 2.0 WebSocket client (DDS redirect handling)
193
- ├── ai/ # AI visual analysis engine
194
- │ ├── client.ts # Multi-provider client (Anthropic + Google)
195
- │ ├── prompts.ts # System prompts & UI element summarizer
196
- │ ├── analyzer.ts # ScreenAnalyzer orchestrator
197
- │ └── element-search.ts # Local element search (no AI needed)
198
- └── utils/
199
- ├── discovery.ts # ADB auto-discovery
200
- └── image.ts # PNG parsing, JPEG compression, bilinear resize
201
- ```
202
-
203
- ## Roadmap
204
-
205
- - [x] Phase 1: Android ADB device control (18 tools)
206
- - [x] Phase 2: AI visual analysis layer (8 tools)
207
- - [x] Multi-provider AI (Anthropic Claude + Google Gemini)
208
- - [x] Performance optimization (3-tier search, caching, parallel capture)
209
- - [x] Screenshot compression pipeline (JPEG, resize, configurable quality)
210
- - [x] npm publish (`npx mobile-device-mcp`)
211
- - [x] Phase 3: Flutter widget tree integration (8 tools, Dart VM Service Protocol)
212
- - [ ] Phase 4: iOS support (simulators via xcrun simctl, devices via idevice)
213
- - [ ] Phase 5: Monetization (license keys, usage analytics)
214
- - [ ] Multi-device orchestration
215
-
216
- ## Tested On
217
-
218
- - Pixel 8, Android 16, SDK 36 — 44/44 tests passed (22 device + 10 AI + 12 Flutter)
219
- - Flutter 3.41.3, metroping app (debug mode)
220
- - Google Gemini 2.5 Flash
221
- - Windows 11 + wireless ADB
222
-
223
- ## License
224
-
225
- MIT
1
+ <p align="center">
2
+ <img src="assets/icon.png" alt="mobile-device-mcp" width="128" height="128" />
3
+ </p>
4
+
5
+ <h1 align="center">mobile-device-mcp</h1>
6
+
7
+ <p align="center">
8
+ <a href="https://www.npmjs.com/package/mobile-device-mcp"><img src="https://img.shields.io/npm/v/mobile-device-mcp" alt="npm version" /></a>
9
+ <a href="https://www.npmjs.com/package/mobile-device-mcp"><img src="https://img.shields.io/npm/dm/mobile-device-mcp" alt="npm downloads" /></a>
10
+ <a href="https://github.com/saranshbamania/mobile-device-mcp"><img src="https://img.shields.io/github/stars/saranshbamania/mobile-device-mcp" alt="GitHub stars" /></a>
11
+ <a href="LICENSE"><img src="https://img.shields.io/badge/License-BSL%201.1-blue.svg" alt="License: BSL 1.1" /></a>
12
+ </p>
13
+
14
+ MCP server that gives AI coding assistants (Claude Code, Cursor, Windsurf) the ability to **see and interact with mobile devices**. 49 tools for screenshots, UI inspection, touch interaction, AI-powered visual analysis, Flutter widget tree inspection, video recording, and test generation.
15
+
16
+ > AI assistants can read your code but can't see your phone. This fixes that.
17
+
18
+ ## Why This One?
19
+
20
+ | Feature | mobile-device-mcp | mobile-next/mobile-mcp | appium/appium-mcp |
21
+ |---------|:-:|:-:|:-:|
22
+ | Total tools | **49** | 20 | ~15 |
23
+ | Setup | `npx` (30 sec) | `npx` | Requires Appium server |
24
+ | AI visual analysis | **12 tools** (Claude + Gemini) | None | Vision-based finding |
25
+ | Flutter widget tree | **10 tools** (Dart VM Service) | None | None |
26
+ | Smart element finding | **4-tier** (<1ms local search) | Accessibility tree only | XPath/selectors |
27
+ | Companion app (23x faster UI tree) | Yes | No | No |
28
+ | Video recording | Yes | No | No |
29
+ | Test script generation | **TS, Python, JSON** | No | Java/TestNG only |
30
+ | iOS simulator support | Yes | Yes | Yes |
31
+ | iOS real device | Planned | Yes | Yes |
32
+ | Screenshot compression | **89%** (251KB -> 28KB) | None | 50-80% |
33
+ | Multi-provider AI | Claude + Gemini | N/A | Single provider |
34
+ | Price | Free tier + Pro | Free | Free |
35
+
36
+ ## The Problem
37
+
38
+ Web developers have browser DevTools, Playwright, and Puppeteer -- AI assistants can click around, take screenshots, and verify fixes. Mobile developers? They're stuck manually screenshotting, copying logs, and describing what's on screen. They're **human middleware** between the AI and the device.
39
+
40
+ ## What This Does
41
+
42
+ ```
43
+ Developer: "The login button doesn't work"
44
+
45
+ Without this tool: With this tool:
46
+ 1. Manually screenshot 1. AI calls take_screenshot -> sees the screen
47
+ 2. Paste into AI chat 2. AI calls smart_tap("login button") -> taps it
48
+ 3. AI guesses what's wrong 3. AI calls verify_screen("error message shown") -> sees result
49
+ 4. Apply fix, rebuild 4. AI calls visual_diff -> confirms fix worked
50
+ 5. Repeat 4-5 times 5. Done.
51
+ ```
52
+
53
+ ## Quick Start
54
+
55
+ ### Prerequisites
56
+ - Node.js 18+
57
+ - Android device/emulator connected via ADB
58
+ - ADB installed (Android SDK Platform Tools)
59
+
60
+ ### Setup (One-time, 30 seconds)
61
+
62
+ 1. **Get a Google AI key** (free tier available): [aistudio.google.com/apikey](https://aistudio.google.com/apikey)
63
+
64
+ 2. **Add `.mcp.json` to your project root:**
65
+
66
+ ```json
67
+ {
68
+ "mcpServers": {
69
+ "mobile-device": {
70
+ "type": "stdio",
71
+ "command": "npx",
72
+ "args": ["-y", "mobile-device-mcp"],
73
+ "env": {
74
+ "GOOGLE_API_KEY": "your-google-api-key"
75
+ }
76
+ }
77
+ }
78
+ }
79
+ ```
80
+
81
+ 3. **Open your AI coding assistant** from that directory. That's it.
82
+
83
+ The server starts and stops automatically -- you never run it manually. Your AI assistant manages it as a background process via the MCP protocol.
84
+
85
+ ### Verify It Works
86
+
87
+ **Claude Code:** type `/mcp` -- you should see `mobile-device: Connected`
88
+
89
+ **Cursor:** check MCP panel in settings
90
+
91
+ Then just talk to your phone:
92
+
93
+ ```
94
+ You: "Open my app, tap the login button, type test@email.com in the email field"
95
+ AI: [takes screenshot -> sees the screen -> smart_tap("login button") -> smart_type("email field", "test@email.com")]
96
+
97
+ You: "Find all the bugs on this screen"
98
+ AI: [analyze_screen -> inspects layout, checks for overflow, missing labels, broken states]
99
+
100
+ You: "Navigate to settings and verify dark mode works"
101
+ AI: [smart_tap("settings") -> take_screenshot -> smart_tap("dark mode toggle") -> visual_diff -> reports result]
102
+ ```
103
+
104
+ No test scripts. No manual screenshots. Just describe what you want in plain English.
105
+
106
+ ### Works with Any AI Coding Assistant
107
+
108
+ | Tool | Config file | Docs |
109
+ |------|------------|------|
110
+ | **Claude Code** | `.mcp.json` in project root | [claude.ai/docs](https://claude.ai/docs) |
111
+ | **Cursor** | `.cursor/mcp.json` | [cursor.com/docs](https://cursor.com/docs) |
112
+ | **VS Code + Copilot** | MCP settings | [code.visualstudio.com](https://code.visualstudio.com) |
113
+ | **Windsurf** | MCP settings | [windsurf.com](https://windsurf.com) |
114
+
115
+ All use the same JSON config -- just put it in the right file for your editor.
116
+
117
+ ### Drop Into Any Project
118
+
119
+ Copy `.mcp.json` into any mobile project -- Flutter, React Native, Kotlin, Swift -- and your AI assistant gets device superpowers in that directory. No global install needed.
120
+
121
+ ## Free vs Pro
122
+
123
+ <a name="pro"></a>
124
+
125
+ ### Free (14 tools) -- no license key needed
126
+
127
+ | Tool | What it does |
128
+ |------|-------------|
129
+ | `list_devices` | List all connected Android devices/emulators |
130
+ | `get_device_info` | Model, manufacturer, Android version, SDK level |
131
+ | `get_screen_size` | Screen resolution in pixels |
132
+ | `take_screenshot` | Capture screenshot (PNG or JPEG, configurable quality & resize) |
133
+ | `get_ui_elements` | Get the accessibility/UI element tree as structured JSON |
134
+ | `tap` | Tap at coordinates |
135
+ | `double_tap` | Double tap at coordinates |
136
+ | `long_press` | Long press at coordinates |
137
+ | `swipe` | Swipe between two points |
138
+ | `type_text` | Type text into the focused field |
139
+ | `press_key` | Press a key (home, back, enter, volume, etc.) |
140
+ | `list_apps` | List installed apps |
141
+ | `get_current_app` | Get the foreground app |
142
+ | `get_logs` | Get logcat entries with filtering |
143
+
144
+ ### Pro (35 additional tools) -- [$9/mo](https://rzp.io/rzp/fCvY9mNK)
145
+
146
+ **[Get Pro License](https://rzp.io/rzp/fCvY9mNK)** -- unlock all 49 tools. After payment, you'll receive a license key. Add it to your `.mcp.json`:
147
+
148
+ ```json
149
+ {
150
+ "mcpServers": {
151
+ "mobile-device": {
152
+ "type": "stdio",
153
+ "command": "npx",
154
+ "args": ["-y", "mobile-device-mcp"],
155
+ "env": {
156
+ "GOOGLE_API_KEY": "your-google-api-key",
157
+ "MOBILE_MCP_LICENSE_KEY": "your-license-key"
158
+ }
159
+ }
160
+ }
161
+ }
162
+ ```
163
+
164
+ #### AI Visual Analysis (12 tools)
165
+
166
+ Use AI vision (Claude or Gemini) to understand what's on screen.
167
+
168
+ | Tool | What it does |
169
+ |------|-------------|
170
+ | `analyze_screen` | AI describes the screen: app name, screen type, interactive elements, visible text, suggestions |
171
+ | `find_element` | Find a UI element by description: *"the login button"*, *"email input field"* |
172
+ | `smart_tap` | Find an element by description and tap it in one step |
173
+ | `smart_type` | Find an input field by description, focus it, and type text |
174
+ | `suggest_actions` | Plan actions to achieve a goal: *"log into the app"*, *"add item to cart"* |
175
+ | `visual_diff` | Compare current screen with a previous screenshot -- what changed? |
176
+ | `extract_text` | Extract all visible text from the screen (AI-powered OCR) |
177
+ | `verify_screen` | Verify an assertion: *"the login was successful"*, *"error message is showing"* |
178
+ | `wait_for_settle` | Wait until the screen stops changing |
179
+ | `wait_for_element` | Wait for a specific element to appear on screen |
180
+ | `handle_popup` | Detect and dismiss popups, dialogs, permission prompts |
181
+ | `fill_form` | Fill multiple form fields in one step |
182
+
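Editor's note: the `wait_for_settle` tool above polls until the screen stops changing. A minimal sketch of that technique, assuming a generic frame source and illustrative thresholds (the package's actual internals and parameters are not documented here):

```typescript
// Poll a frame source until consecutive captures are near-identical.
// `capture`, the thresholds, and the timings are all illustrative.

function frameDiffRatio(a: Uint8Array, b: Uint8Array): number {
  // Fraction of bytes that differ between two same-sized frames.
  if (a.length !== b.length) return 1;
  let diff = 0;
  for (let i = 0; i < a.length; i++) if (a[i] !== b[i]) diff++;
  return diff / a.length;
}

async function waitForSettle(
  capture: () => Promise<Uint8Array>,
  { threshold = 0.01, stableFrames = 2, intervalMs = 250, timeoutMs = 5000 } = {}
): Promise<boolean> {
  let prev = await capture();
  let stable = 0;
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    await new Promise((r) => setTimeout(r, intervalMs));
    const next = await capture();
    if (frameDiffRatio(prev, next) <= threshold) {
      if (++stable >= stableFrames) return true; // screen has settled
    } else {
      stable = 0; // still animating; reset the stability counter
    }
    prev = next;
  }
  return false; // still changing when the timeout hit
}
```

Requiring several stable frames in a row (rather than one matching pair) avoids declaring victory mid-animation.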
183
+ #### Flutter Widget Tree (10 tools)
184
+
185
+ Connect to a running Flutter app in debug/profile mode via the Dart VM Service Protocol. Maps every widget to its source code location (`file:line`).
186
+
187
+ | Tool | What it does |
188
+ |------|-------------|
189
+ | `flutter_connect` | Discover and connect to a running Flutter app on the device |
190
+ | `flutter_disconnect` | Disconnect from the Flutter app and clean up resources |
191
+ | `flutter_get_widget_tree` | Get the full widget tree (summary or detailed) |
192
+ | `flutter_get_widget_details` | Get detailed properties of a specific widget by ID |
193
+ | `flutter_find_widget` | Search the widget tree by type, text, or description |
194
+ | `flutter_get_source_map` | Map every widget to its source code location (file:line:column) |
195
+ | `flutter_screenshot_widget` | Screenshot a specific widget in isolation |
196
+ | `flutter_debug_paint` | Toggle debug paint overlay (shows widget boundaries & padding) |
197
+ | `flutter_hot_reload` | Hot reload Flutter app (preserves state) |
198
+ | `flutter_hot_restart` | Hot restart Flutter app (resets state) |
199
+
200
+ #### iOS Simulator (4 tools)
201
+
202
+ macOS only. Control iOS simulators via `xcrun simctl`.
203
+
204
+ | Tool | What it does |
205
+ |------|-------------|
206
+ | `ios_list_simulators` | List available iOS simulators |
207
+ | `ios_boot_simulator` | Boot a simulator by name or UDID |
208
+ | `ios_shutdown_simulator` | Shut down a running simulator |
209
+ | `ios_screenshot` | Take a screenshot of a simulator |
210
+
211
+ #### Video Recording (2 tools)
212
+
213
+ | Tool | What it does |
214
+ |------|-------------|
215
+ | `record_screen` | Start recording the device screen |
216
+ | `stop_recording` | Stop recording and save the video |
217
+
218
+ #### Test Generation (3 tools)
219
+
220
+ | Tool | What it does |
221
+ |------|-------------|
222
+ | `start_test_recording` | Start recording your MCP tool calls |
223
+ | `stop_test_recording` | Stop recording and generate a test script |
224
+ | `get_recorded_actions` | Get recorded actions as TypeScript, Python, or JSON |
225
+
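Editor's note: the record-then-generate flow behind these three tools can be sketched as below. The interface, class, and output shapes are hypothetical illustrations of the idea; the package's real generated-script format is not documented here.

```typescript
// Hypothetical sketch: record tool calls, then replay them as JSON or as a
// TypeScript script. `mcp.callTool` in the generated output is an assumed name.

interface RecordedAction {
  tool: string;
  args: Record<string, unknown>;
}

class ActionRecorder {
  private actions: RecordedAction[] = [];

  record(tool: string, args: Record<string, unknown>): void {
    this.actions.push({ tool, args });
  }

  asJSON(): string {
    return JSON.stringify(this.actions, null, 2);
  }

  asTypeScript(): string {
    // One awaited call per recorded action, in recorded order.
    return this.actions
      .map((a) => `await mcp.callTool(${JSON.stringify(a.tool)}, ${JSON.stringify(a.args)});`)
      .join("\n");
  }
}
```

The same recorded list can feed any number of output backends, which is presumably how one recording yields TypeScript, Python, and JSON variants.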
226
+ #### App Management (4 tools)
227
+
228
+ | Tool | What it does |
229
+ |------|-------------|
230
+ | `launch_app` | Launch an app by package name |
231
+ | `stop_app` | Force stop an app |
232
+ | `install_app` | Install an APK |
233
+ | `uninstall_app` | Uninstall an app |
234
+
235
+ ## Performance
236
+
237
+ The server is optimized to minimize latency and AI token costs:
238
+
239
+ - **4-tier element search**: companion app (instant) -> local text match (<1ms) -> cached AI -> fresh AI. `smart_tap` is **~37x faster** than naive AI calls (205ms vs 7.6s).
240
+ - **Companion app**: AccessibilityService-based Android app provides UI tree in 105ms (23x faster than UIAutomator's 2448ms). Auto-installs on first use.
241
+ - **Screenshot compression**: AI tools auto-compress to JPEG q=60, 400w -- **89% smaller** (251KB -> 28KB) with no measurable loss in AI analysis quality.
242
+ - **Parallel capture**: Screenshot + UI tree fetched simultaneously via `Promise.all()`.
243
+ - **TTL caching**: 5-second cache avoids redundant ADB calls for rapid-fire tool usage.
244
+
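Editor's note: the tiered fallback and TTL cache described above can be sketched as follows. The tier ordering and the 5-second TTL come from the README; the signatures, cache key, and injected clock are illustrative assumptions.

```typescript
// First-hit-wins tiered lookup: cheap local tiers short-circuit the
// expensive AI call. A TTL cache with an injectable clock backs the
// "cached AI" tier and the rapid-fire ADB caching.

type Finder = (desc: string) => { x: number; y: number } | null;

function makeTieredFinder(tiers: Finder[]): Finder {
  return (desc) => {
    for (const tier of tiers) {
      const hit = tier(desc);
      if (hit) return hit; // earlier (cheaper) tiers win
    }
    return null;
  };
}

class TtlCache<V> {
  private store = new Map<string, { value: V; expires: number }>();
  constructor(private ttlMs: number, private now: () => number = Date.now) {}

  get(key: string): V | undefined {
    const entry = this.store.get(key);
    if (!entry || entry.expires <= this.now()) return undefined; // expired
    return entry.value;
  }

  set(key: string, value: V): void {
    this.store.set(key, { value, expires: this.now() + this.ttlMs });
  }
}
```

Injecting the clock keeps expiry behavior testable without real waits.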
245
+ ## Environment Variables
246
+
247
+ | Variable | Description | Default |
248
+ |----------|-------------|---------|
249
+ | `GOOGLE_API_KEY` or `GEMINI_API_KEY` | Google API key for Gemini vision (recommended) | -- |
250
+ | `ANTHROPIC_API_KEY` | Anthropic API key for Claude vision | -- |
251
+ | `MOBILE_MCP_LICENSE_KEY` | License key to unlock Pro tools | -- |
252
+ | `MCP_AI_PROVIDER` | Force AI provider: `"anthropic"` or `"google"` | Auto-detected |
253
+ | `MCP_AI_MODEL` | Override AI model | `gemini-2.5-flash` / `claude-sonnet-4-20250514` |
254
+ | `MCP_ADB_PATH` | Custom ADB binary path | Auto-discovered |
255
+ | `MCP_DEFAULT_DEVICE` | Default device serial | Auto-discovered |
256
+ | `MCP_SCREENSHOT_FORMAT` | `"png"` or `"jpeg"` | `jpeg` |
257
+ | `MCP_SCREENSHOT_QUALITY` | JPEG quality (1-100) | `80` |
258
+ | `MCP_SCREENSHOT_MAX_WIDTH` | Resize screenshots to this max width | `720` |
259
+
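Editor's note: a sketch of how the screenshot-related variables from this table might be read. The names and defaults match the table; the clamping and the function itself are assumptions, not the package's actual code.

```typescript
// Read screenshot config from an env-style map, falling back to the
// documented defaults (jpeg, quality 80, max width 720). Clamping quality
// to 1-100 is an assumed safety measure.

function readScreenshotConfig(env: Record<string, string | undefined>) {
  const quality = Math.min(100, Math.max(1, Number(env.MCP_SCREENSHOT_QUALITY ?? 80)));
  const format = env.MCP_SCREENSHOT_FORMAT === "png" ? "png" : "jpeg";
  const maxWidth = Number(env.MCP_SCREENSHOT_MAX_WIDTH ?? 720);
  return { format, quality, maxWidth };
}
```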
260
+ ## Architecture
261
+
262
+ ```
263
+ src/
264
+ |-- index.ts # CLI entry point (auto-discovery, env config)
265
+ |-- server.ts # MCP server factory
266
+ |-- license.ts # License validation and tier gating
267
+ |-- types.ts # Shared interfaces
268
+ |-- drivers/android/ # ADB driver (DeviceDriver implementation)
269
+ | |-- adb.ts # Low-level ADB command wrapper
270
+ | |-- companion-client.ts # TCP client for companion app
271
+ | +-- index.ts # AndroidDriver class (4-strategy UI element retrieval)
272
+ |-- drivers/flutter/ # Dart VM Service driver
273
+ | |-- index.ts # FlutterDriver (discovery, inspection, source mapping, hot reload)
274
+ | +-- vm-service.ts # JSON-RPC 2.0 WebSocket client (DDS redirect handling)
275
+ |-- drivers/ios/ # iOS Simulator driver (macOS only)
276
+ | |-- index.ts # IOSSimulatorDriver via xcrun simctl
277
+ | +-- simctl.ts # Low-level simctl command wrapper
278
+ |-- tools/ # MCP tool registrations (free + pro gating)
279
+ | |-- device-tools.ts # Device management
280
+ | |-- screen-tools.ts # Screenshots & UI inspection
281
+ | |-- interaction-tools.ts # Touch, type, keys
282
+ | |-- app-tools.ts # App management
283
+ | |-- log-tools.ts # Logcat
284
+ | |-- ai-tools.ts # AI-powered tools
285
+ | |-- flutter-tools.ts # Flutter widget inspection
286
+ | |-- ios-tools.ts # iOS simulator tools
287
+ | |-- video-tools.ts # Screen recording
288
+ | +-- recording-tools.ts # Test generation
289
+ |-- recording/ # Test script generation
290
+ | |-- recorder.ts # ActionRecorder (records MCP tool calls)
291
+ | +-- generator.ts # TestGenerator (TypeScript/Python/JSON output)
292
+ |-- ai/ # AI visual analysis engine
293
+ | |-- client.ts # Multi-provider client (Anthropic + Google)
294
+ | |-- prompts.ts # System prompts & UI element summarizer
295
+ | |-- analyzer.ts # ScreenAnalyzer orchestrator (caching, parallel capture)
296
+ | +-- element-search.ts # Local element search (text/alias matching, no AI needed)
297
+ +-- utils/
298
+ |-- discovery.ts # ADB auto-discovery
299
+ +-- image.ts # PNG parsing, JPEG compression, bilinear resize
300
+
301
+ companion-app/ # Android companion app (Kotlin)
302
+ # AccessibilityService + TCP JSON-RPC for fast UI tree
303
+ ```
304
+
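Editor's note: the companion app's "TCP JSON-RPC" channel noted in the tree above amounts to framing JSON-RPC 2.0 messages over a local socket. A sketch of just the message shapes, assuming newline-delimited framing and a hypothetical method name (neither is the app's documented protocol):

```typescript
// Build and parse JSON-RPC 2.0 messages for a raw TCP transport.
// Newline-delimited JSON is a common framing choice; the companion app's
// actual framing and method names are assumptions here.

interface RpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

let nextId = 1;

function buildRequest(method: string, params?: Record<string, unknown>): string {
  const req: RpcRequest = { jsonrpc: "2.0", id: nextId++, method, ...(params ? { params } : {}) };
  return JSON.stringify(req) + "\n"; // one message per line
}

function parseResponse(line: string): { id: number; result?: unknown; error?: unknown } {
  const msg = JSON.parse(line);
  if (msg.jsonrpc !== "2.0") throw new Error("not a JSON-RPC 2.0 message");
  return msg;
}
```

Any TCP client that writes one request line and reads one response line can talk this shape of protocol, which is what makes it a cheap bridge from Node to an on-device AccessibilityService.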
305
+ ## Roadmap
306
+
307
+ - [ ] iOS physical device support
308
+ - [ ] Multi-device orchestration
309
+ - [ ] CI/CD integration
310
+ - [ ] Cloud device farm support
311
+
312
+ ## Tested On
313
+
314
+ - **Devices**: Pixel 8 (Android 16), Samsung Galaxy series, Android emulators
315
+ - **Apps**: Telegram, Instagram, Spotify, WhatsApp, YouTube, Chrome, Settings, and Flutter apps
316
+ - **AI Providers**: Google Gemini 2.5 Flash, Anthropic Claude
317
+ - **Platforms**: Windows 11, macOS (iOS simulators)
318
+ - **Connection**: USB and wireless ADB
319
+
320
+ ## License
321
+
322
+ [Business Source License 1.1](LICENSE)
323
+
324
+ - **Free for individuals and non-commercial use**
325
+ - **Commercial use requires a paid license**
326
+ - Converts to Apache 2.0 on March 23, 2030
327
+
328
+ See [LICENSE](LICENSE) for full terms.
Binary file
@@ -0,0 +1,63 @@
1
+ <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 512 512" width="512" height="512">
2
+ <defs>
3
+ <linearGradient id="bg" x1="0%" y1="0%" x2="100%" y2="100%">
4
+ <stop offset="0%" style="stop-color:#6366f1"/>
5
+ <stop offset="100%" style="stop-color:#8b5cf6"/>
6
+ </linearGradient>
7
+ <linearGradient id="screen" x1="0%" y1="0%" x2="0%" y2="100%">
8
+ <stop offset="0%" style="stop-color:#1e1b4b"/>
9
+ <stop offset="100%" style="stop-color:#312e81"/>
10
+ </linearGradient>
11
+ <linearGradient id="eye" x1="0%" y1="0%" x2="100%" y2="100%">
12
+ <stop offset="0%" style="stop-color:#06b6d4"/>
13
+ <stop offset="100%" style="stop-color:#22d3ee"/>
14
+ </linearGradient>
15
+ </defs>
16
+
17
+ <!-- Rounded square background -->
18
+ <rect x="16" y="16" width="480" height="480" rx="96" ry="96" fill="url(#bg)"/>
19
+
20
+ <!-- Phone body -->
21
+ <rect x="156" y="72" width="200" height="368" rx="24" ry="24" fill="#1e1b4b" stroke="#a5b4fc" stroke-width="4"/>
22
+
23
+ <!-- Phone screen -->
24
+ <rect x="172" y="108" width="168" height="280" rx="8" ry="8" fill="url(#screen)"/>
25
+
26
+ <!-- AI Eye on screen -->
27
+ <!-- Outer eye shape -->
28
+ <path d="M196 248 Q256 200 316 248 Q256 296 196 248 Z" fill="none" stroke="url(#eye)" stroke-width="5" stroke-linecap="round"/>
29
+
30
+ <!-- Iris -->
31
+ <circle cx="256" cy="248" r="24" fill="url(#eye)" opacity="0.3"/>
32
+
33
+ <!-- Pupil -->
34
+ <circle cx="256" cy="248" r="14" fill="#22d3ee"/>
35
+
36
+ <!-- Pupil highlight -->
37
+ <circle cx="250" cy="242" r="5" fill="#ffffff" opacity="0.7"/>
38
+
39
+ <!-- Scan lines on screen (subtle) -->
40
+ <line x1="188" y1="160" x2="244" y2="160" stroke="#4f46e5" stroke-width="3" stroke-linecap="round" opacity="0.5"/>
41
+ <line x1="188" y1="176" x2="268" y2="176" stroke="#4f46e5" stroke-width="3" stroke-linecap="round" opacity="0.4"/>
42
+ <line x1="188" y1="192" x2="228" y2="192" stroke="#4f46e5" stroke-width="3" stroke-linecap="round" opacity="0.3"/>
43
+
44
+ <!-- Bottom UI dots on screen -->
45
+ <circle cx="228" cy="340" r="6" fill="#4f46e5" opacity="0.5"/>
46
+ <circle cx="256" cy="340" r="6" fill="#22d3ee" opacity="0.8"/>
47
+ <circle cx="284" cy="340" r="6" fill="#4f46e5" opacity="0.5"/>
48
+
49
+ <!-- Connection signals from phone (left side) -->
50
+ <path d="M148 200 Q120 220 128 248" fill="none" stroke="#a5b4fc" stroke-width="3" stroke-linecap="round" opacity="0.6"/>
51
+ <path d="M136 188 Q100 216 112 252" fill="none" stroke="#a5b4fc" stroke-width="3" stroke-linecap="round" opacity="0.4"/>
52
+
53
+ <!-- Connection signals from phone (right side) -->
54
+ <path d="M364 200 Q392 220 384 248" fill="none" stroke="#a5b4fc" stroke-width="3" stroke-linecap="round" opacity="0.6"/>
55
+ <path d="M376 188 Q412 216 400 252" fill="none" stroke="#a5b4fc" stroke-width="3" stroke-linecap="round" opacity="0.4"/>
56
+
57
+ <!-- Touch indicator (tap finger) -->
58
+ <circle cx="300" cy="296" r="10" fill="#f472b6" opacity="0.6"/>
59
+ <circle cx="300" cy="296" r="18" fill="none" stroke="#f472b6" stroke-width="2" opacity="0.3"/>
60
+
61
+ <!-- Home button / bar -->
62
+ <rect x="228" y="408" width="56" height="6" rx="3" ry="3" fill="#a5b4fc" opacity="0.5"/>
63
+ </svg>
@@ -1 +1 @@
1
- {"version":3,"file":"license.d.ts","sourceRoot":"","sources":["../src/license.ts"],"names":[],"mappings":"AAMA,MAAM,MAAM,WAAW,GAAG,MAAM,GAAG,KAAK,CAAC;AAEzC,MAAM,WAAW,WAAW;IAC1B,IAAI,EAAE,WAAW,CAAC;IAClB,KAAK,EAAE,OAAO,CAAC;CAChB;AAED,sCAAsC;AACtC,eAAO,MAAM,UAAU,aAoBrB,CAAC;AAEH,8DAA8D;AAC9D,eAAO,MAAM,mBAAmB;;;;;;CAe/B,CAAC;AAEF;;;;;GAKG;AACH,wBAAgB,eAAe,IAAI,WAAW,CAU7C;AAED,sCAAsC;AACtC,wBAAgB,gBAAgB,CAAC,IAAI,EAAE,WAAW,GAAG,IAAI,CAMxD"}
1
+ {"version":3,"file":"license.d.ts","sourceRoot":"","sources":["../src/license.ts"],"names":[],"mappings":"AAMA,MAAM,MAAM,WAAW,GAAG,MAAM,GAAG,KAAK,CAAC;AAEzC,MAAM,WAAW,WAAW;IAC1B,IAAI,EAAE,WAAW,CAAC;IAClB,KAAK,EAAE,OAAO,CAAC;CAChB;AAED,sCAAsC;AACtC,eAAO,MAAM,UAAU,aAoBrB,CAAC;AAEH,8DAA8D;AAC9D,eAAO,MAAM,mBAAmB;;;;;;CAiB/B,CAAC;AAEF;;;;;GAKG;AACH,wBAAgB,eAAe,IAAI,WAAW,CAU7C;AAED,sCAAsC;AACtC,wBAAgB,gBAAgB,CAAC,IAAI,EAAE,WAAW,GAAG,IAAI,CAMxD"}
package/dist/license.js CHANGED
@@ -30,11 +30,13 @@ export const PRO_UPGRADE_MESSAGE = {
30
30
  {
31
31
  type: "text",
32
32
  text: [
33
- "This is a Pro feature. Upgrade to unlock all 49 tools including AI vision, Flutter inspection, test generation, and more.",
33
+ "This is a Pro feature. Upgrade to Pro to unlock all 49 tools.",
34
34
  "",
35
- "Get your license key at: https://github.com/saranshbamania/mobile-device-mcp#pro",
35
+ "Pro includes: AI vision, Flutter inspection, iOS simulator, video recording, test generation, and more.",
36
36
  "",
37
- "Then add to your .mcp.json:",
37
+ "Get Pro: https://rzp.io/rzp/fCvY9mNK",
38
+ "",
39
+ "After payment, you'll receive a license key. Add it to your .mcp.json:",
38
40
  ' "MOBILE_MCP_LICENSE_KEY": "your-key-here"',
39
41
  ].join("\n"),
40
42
  },
@@ -1 +1 @@
1
- {"version":3,"file":"license.js","sourceRoot":"","sources":["../src/license.ts"],"names":[],"mappings":"AAAA,+DAA+D;AAC/D,6CAA6C;AAC7C,+DAA+D;AAE/D,MAAM,MAAM,GAAG,qBAAqB,CAAC;AASrC,sCAAsC;AACtC,MAAM,CAAC,MAAM,UAAU,GAAG,IAAI,GAAG,CAAC;IAChC,kBAAkB;IAClB,cAAc;IACd,iBAAiB;IACjB,iBAAiB;IACjB,4BAA4B;IAC5B,iBAAiB;IACjB,iBAAiB;IACjB,wBAAwB;IACxB,KAAK;IACL,YAAY;IACZ,YAAY;IACZ,OAAO;IACP,WAAW;IACX,WAAW;IACX,2BAA2B;IAC3B,WAAW;IACX,iBAAiB;IACjB,WAAW;IACX,UAAU;CACX,CAAC,CAAC;AAEH,8DAA8D;AAC9D,MAAM,CAAC,MAAM,mBAAmB,GAAG;IACjC,OAAO,EAAE;QACP;YACE,IAAI,EAAE,MAAe;YACrB,IAAI,EAAE;gBACJ,2HAA2H;gBAC3H,EAAE;gBACF,kFAAkF;gBAClF,EAAE;gBACF,6BAA6B;gBAC7B,6CAA6C;aAC9C,CAAC,IAAI,CAAC,IAAI,CAAC;SACb;KACF;IACD,OAAO,EAAE,IAAI;CACd,CAAC;AAEF;;;;;GAKG;AACH,MAAM,UAAU,eAAe;IAC7B,MAAM,GAAG,GAAG,OAAO,CAAC,GAAG,CAAC,sBAAsB,CAAC;IAE/C,IAAI,CAAC,GAAG,IAAI,GAAG,CAAC,IAAI,EAAE,KAAK,EAAE,EAAE,CAAC;QAC9B,OAAO,EAAE,IAAI,EAAE,MAAM,EAAE,KAAK,EAAE,IAAI,EAAE,CAAC;IACvC,CAAC;IAED,qEAAqE;IACrE,6DAA6D;IAC7D,OAAO,EAAE,IAAI,EAAE,KAAK,EAAE,KAAK,EAAE,IAAI,EAAE,CAAC;AACtC,CAAC;AAED,sCAAsC;AACtC,MAAM,UAAU,gBAAgB,CAAC,IAAiB;IAChD,IAAI,IAAI,CAAC,IAAI,KAAK,KAAK,EAAE,CAAC;QACxB,OAAO,CAAC,MAAM,CAAC,KAAK,CAAC,GAAG,MAAM,wCAAwC,CAAC,CAAC;IAC1E,CAAC;SAAM,CAAC;QACN,OAAO,CAAC,MAAM,CAAC,KAAK,CAAC,GAAG,MAAM,yDAAyD,CAAC,CAAC;IAC3F,CAAC;AACH,CAAC"}
1
+ {"version":3,"file":"license.js","sourceRoot":"","sources":["../src/license.ts"],"names":[],"mappings":"AAAA,+DAA+D;AAC/D,6CAA6C;AAC7C,+DAA+D;AAE/D,MAAM,MAAM,GAAG,qBAAqB,CAAC;AASrC,sCAAsC;AACtC,MAAM,CAAC,MAAM,UAAU,GAAG,IAAI,GAAG,CAAC;IAChC,kBAAkB;IAClB,cAAc;IACd,iBAAiB;IACjB,iBAAiB;IACjB,4BAA4B;IAC5B,iBAAiB;IACjB,iBAAiB;IACjB,wBAAwB;IACxB,KAAK;IACL,YAAY;IACZ,YAAY;IACZ,OAAO;IACP,WAAW;IACX,WAAW;IACX,2BAA2B;IAC3B,WAAW;IACX,iBAAiB;IACjB,WAAW;IACX,UAAU;CACX,CAAC,CAAC;AAEH,8DAA8D;AAC9D,MAAM,CAAC,MAAM,mBAAmB,GAAG;IACjC,OAAO,EAAE;QACP;YACE,IAAI,EAAE,MAAe;YACrB,IAAI,EAAE;gBACJ,+DAA+D;gBAC/D,EAAE;gBACF,yGAAyG;gBACzG,EAAE;gBACF,sCAAsC;gBACtC,EAAE;gBACF,wEAAwE;gBACxE,6CAA6C;aAC9C,CAAC,IAAI,CAAC,IAAI,CAAC;SACb;KACF;IACD,OAAO,EAAE,IAAI;CACd,CAAC;AAEF;;;;;GAKG;AACH,MAAM,UAAU,eAAe;IAC7B,MAAM,GAAG,GAAG,OAAO,CAAC,GAAG,CAAC,sBAAsB,CAAC;IAE/C,IAAI,CAAC,GAAG,IAAI,GAAG,CAAC,IAAI,EAAE,KAAK,EAAE,EAAE,CAAC;QAC9B,OAAO,EAAE,IAAI,EAAE,MAAM,EAAE,KAAK,EAAE,IAAI,EAAE,CAAC;IACvC,CAAC;IAED,qEAAqE;IACrE,6DAA6D;IAC7D,OAAO,EAAE,IAAI,EAAE,KAAK,EAAE,KAAK,EAAE,IAAI,EAAE,CAAC;AACtC,CAAC;AAED,sCAAsC;AACtC,MAAM,UAAU,gBAAgB,CAAC,IAAiB;IAChD,IAAI,IAAI,CAAC,IAAI,KAAK,KAAK,EAAE,CAAC;QACxB,OAAO,CAAC,MAAM,CAAC,KAAK,CAAC,GAAG,MAAM,wCAAwC,CAAC,CAAC;IAC1E,CAAC;SAAM,CAAC;QACN,OAAO,CAAC,MAAM,CAAC,KAAK,CAAC,GAAG,MAAM,yDAAyD,CAAC,CAAC;IAC3F,CAAC;AACH,CAAC"}
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "mobile-device-mcp",
3
- "version": "0.2.2",
3
+ "version": "0.2.4",
4
4
  "description": "MCP server that gives AI coding assistants (Claude, Cursor, Windsurf) the ability to see and interact with Android mobile devices via ADB — AI-powered visual inspection, element finding, and device automation",
5
5
  "type": "module",
6
6
  "main": "dist/index.js",