@mobilenext/mobile-mcp 0.0.17 → 0.0.19

package/README.md CHANGED
@@ -1,30 +1,44 @@
1
1
  # Mobile Next - MCP server for Mobile Development and Automation | iOS, Android, Simulator, Emulator, and physical devices
2
2
 
3
3
  This is a [Model Context Protocol (MCP) server](https://github.com/modelcontextprotocol) that enables scalable mobile automation, development through a platform-agnostic interface, eliminating the need for distinct iOS or Android knowledge. You can run it on emulators, simulators, and physical devices (iOS and Android).
4
- This server allows Agents and LLMs to interact with native iOS/Android applications and devices through structured accessibility snapshots or coordinate-based taps based on screenshots.
4
+ This server allows Agents and LLMs to interact with native iOS/Android applications and devices through structured accessibility snapshots or coordinate-based taps based on screenshots.
5
5
 
6
- https://github.com/user-attachments/assets/c4e89c4f-cc71-4424-8184-bdbc8c638fa1
7
-
8
-
9
- <p align="center">
6
+ <h4 align="center">
7
+ <a href="https://github.com/mobile-next/mobile-mcp">
8
+ <img src="https://img.shields.io/github/stars/mobile-next/mobile-mcp" alt="Mobile Next Stars" />
9
+ </a>
10
+ <a href="https://github.com/mobile-next/mobile-mcp">
11
+ <img src="https://img.shields.io/github/contributors/mobile-next/mobile-mcp?color=green" alt="Mobile Next Downloads" />
12
+ </a>
10
13
  <a href="https://www.npmjs.com/package/@mobilenext/mobile-mcp">
11
- <img src="https://img.shields.io/badge/npm-@mobilenext/mobile--mcp-red" alt="npm">
14
+ <img src="https://img.shields.io/npm/dm/@mobilenext/mobile-mcp?logo=npm&style=flat&color=red" alt="npm">
15
+ </a>
16
+ <a href="https://github.com/mobile-next/mobile-mcp/releases">
17
+ <img src="https://img.shields.io/github/release/mobile-next/mobile-mcp">
12
18
  </a>
13
- <a href="https://github.com/mobile-next/mobile-mcp">
14
- <img src="https://img.shields.io/badge/github-repo-black" alt="GitHub repo">
19
+ <a href="https://github.com/mobile-next/mobile-mcp/blob/main/LICENSE">
20
+ <img src="https://img.shields.io/badge/license-Apache 2.0-blue.svg" alt="Mobile MCP is released under the Apache-2.0 License">
15
21
  </a>
22
+
23
+ </p>
24
+
25
+ <h4 align="center">
26
+ <a href="http://mobilenexthq.com/join-slack">
27
+ <img src="https://img.shields.io/badge/join-Slack-blueviolet?logo=slack&style=flat" alt="Slack community channel" />
28
+ </a>
16
29
  </p>
17
30
 
31
+ https://github.com/user-attachments/assets/c4e89c4f-cc71-4424-8184-bdbc8c638fa1
32
+
18
33
  <p align="center">
19
34
  <a href="https://github.com/mobile-next/">
20
35
  <img alt="mobile-mcp" src="https://raw.githubusercontent.com/mobile-next/mobile-next-assets/refs/heads/main/mobile-mcp-banner.png" width="600">
21
36
  </a>
22
37
  </p>
23
38
 
24
-
25
39
  ### 🚀 Mobile MCP Roadmap: Building the Future of Mobile
26
40
 
27
- Join us on our journey as we continuously enhance Mobile MCP!
41
+ Join us on our journey as we continuously enhance Mobile MCP!
28
42
  Check out our detailed roadmap to see upcoming features, improvements, and milestones. Your feedback is invaluable in shaping the future of mobile automation.
29
43
 
30
44
  👉 [Explore the Roadmap](https://github.com/orgs/mobile-next/projects/3)
@@ -34,7 +48,7 @@ Check out our detailed roadmap to see upcoming features, improvements, and miles
34
48
 
35
49
  How we help to scale mobile automation:
36
50
 
37
- - 📲 Native app automation (iOS and Android) for testing or data-entry scenarios.
51
+ - 📲 Native app automation (iOS and Android) for testing or data-entry scenarios.
38
52
  - 📝 Scripted flows and form interactions without manually controlling simulators/emulators or physical devices (iPhone, Samsung, Google Pixel etc)
39
53
  - 🧭 Automating multi-step user journeys driven by an LLM
40
54
  - 👆 General-purpose mobile application interaction for agent-based frameworks
@@ -42,11 +56,11 @@ How we help to scale mobile automation:
42
56
 
43
57
  ## Main Features
44
58
 
45
- - 🚀 **Fast and lightweight**: Uses native accessibility trees for most interactions, or screenshot based coordinates where a11y labels are not available.
59
+ - 🚀 **Fast and lightweight**: Uses native accessibility trees for most interactions, or screenshot based coordinates where a11y labels are not available.
46
60
  - 🤖 **LLM-friendly**: No computer vision model required in Accessibility (Snapshot).
47
61
  - 🧿 **Visual Sense**: Evaluates and analyses what’s actually rendered on screen to decide the next action. If accessibility data or view-hierarchy coordinates are unavailable, it falls back to screenshot-based analysis.
48
62
  - 📊 **Deterministic tool application**: Reduces ambiguity found in purely screenshot-based approaches by relying on structured data whenever possible.
49
- - 📺 **Extract structured data**: Enables you to extract structred data from anything visible on screen.
63
+ - 📺 **Extract structured data**: Enables you to extract structred data from anything visible on screen.
50
64
 
51
65
  ## 🏗️ Mobile MCP Architecture
52
66
 
@@ -57,14 +71,14 @@ How we help to scale mobile automation:
57
71
  </p>
58
72
 
59
73
 
60
- ## 📚 Wiki page
74
+ ## 📚 Wiki page
61
75
 
62
76
  More details in our [wiki page](https://github.com/mobile-next/mobile-mcp/wiki) for setup, configuration and debugging related questions.
63
77
 
64
78
 
65
79
  ## Installation and configuration
66
80
 
67
- Setup our MCP with Cursor, Claude, VS Code, Github Copilot:
81
+ Setup our MCP with Cline, Cursor, Claude, VS Code, Github Copilot:
68
82
 
69
83
  ```json
70
84
  {
@@ -77,11 +91,13 @@ Setup our MCP with Cursor, Claude, VS Code, Github Copilot:
77
91
  }
78
92
 
79
93
  ```
94
+ [Cline:](https://docs.cline.bot/mcp/configuring-mcp-servers) To setup Cline, just add the json above to your MCP settings file.
95
+ [More in our wiki](https://github.com/mobile-next/mobile-mcp/wiki/Cline)
80
96
 
81
97
  [Claude Code:](https://docs.anthropic.com/en/docs/agents-and-tools/claude-code/overview)
82
98
 
83
99
  ```
84
- claude mcp add mobile -- npx -y @mobilenext/mobile-mcp@latest ⁠
100
+ claude mcp add mobile -- npx -y @mobilenext/mobile-mcp@latest
85
101
  ```
86
102
 
87
103
  [Read more in our wiki](https://github.com/mobile-next/mobile-mcp/wiki)! 🚀
@@ -89,7 +105,7 @@ claude mcp add mobile -- npx -y @mobilenext/mobile-mcp@latest ⁠
89
105
 
90
106
  ### 🛠️ How to Use 📝
91
107
 
92
- After adding the MCP server to your IDE/Client, you can instruct your AI assistant to use the available tools.
108
+ After adding the MCP server to your IDE/Client, you can instruct your AI assistant to use the available tools.
93
109
  For example, in Cursor's agent mode, you could use the prompts below to quickly validate, test and iterate on UI intereactions, read information from screen, go through complex workflows.
94
110
  Be descriptive, straight to the point.
95
111
 
@@ -101,48 +117,57 @@ You can specifiy detailed workflows in a single prompt, verify business logic, s
101
117
 
102
118
  **Search for a video, comment, like and share it.**
103
119
  ```
104
- Find the video called " Beginner Recipe for Tonkotsu Ramen" by Way of Ramen, click on like video, after liking write a comment " this was delicious, will make it next Friday", share the video with the first contact in your whatsapp list.
120
+ Find the video called " Beginner Recipe for Tonkotsu Ramen" by Way of
121
+ Ramen, click on like video, after liking write a comment " this was
122
+ delicious, will make it next Friday", share the video with the first
123
+ contact in your whatsapp list.
105
124
  ```
106
125
 
107
- **Download a successful step counter app, register, setup workout and 5 start the app**
126
+ **Download a successful step counter app, register, setup workout and 5-star the app**
108
127
  ```
109
- Find and Download a free "Pomodoro" app that has more thank 1k stars.
110
- Launch the app, register with my email, after registration find how to start a pomodoro timer.
111
- When the pomodoro timer started, go back to the app store and rate the app 5 stars,
112
- and leave a comment how useful the app is.
128
+ Find and Download a free "Pomodoro" app that has more than 1k stars.
129
+ Launch the app, register with my email, after registration find how to
130
+ start a pomodoro timer. When the pomodoro timer started, go back to the
131
+ app store and rate the app 5 stars, and leave a comment how useful the
132
+ app is.
113
133
  ```
114
134
 
115
135
  **Search in Substack, read, highlight, comment and save an article**
116
136
  ```
117
- Open Substack website, search for "Latest trends in AI automation 2025", open the first article,
118
- highlight the section titled "Emerging AI trends", and save article to reading list for later review,
119
- comment a random paragraph summary.
137
+ Open Substack website, search for "Latest trends in AI automation 2025",
138
+ open the first article, highlight the section titled "Emerging AI trends",
139
+ and save article to reading list for later review, comment a random
140
+ paragraph summary.
120
141
  ```
121
142
 
122
143
  **Reserve a workout class, set timer**
123
144
  ```
124
- Open ClassPass, search for yoga classes tomorrow morning within 2 miles,
145
+ Open ClassPass, search for yoga classes tomorrow morning within 2 miles,
125
146
  book the highest-rated class at 7 AM, confirm reservation,
126
- setup a timer for the booked slot in the phone
147
+ setup a timer for the booked slot in the phone
127
148
  ```
128
149
 
129
150
  **Find a local event, setup calendar event**
130
151
  ```
131
- Open Eventbrite, search for AI startup meetup events happening this weekend in "Austin, TX",
132
- select the most popular one, register and RSVP yes to the even, setup a calendar event as a reminder.
152
+ Open Eventbrite, search for AI startup meetup events happening this
153
+ weekend in "Austin, TX", select the most popular one, register and RSVP
154
+ yes to the event, setup a calendar event as a reminder.
133
155
  ```
134
156
 
135
157
  **Check weather forecast and send a Whatsapp/Telegram/Slack message**
136
158
  ```
137
- Open Weather app, check tomorrow's weather forecast for "Berlin", and send the summary
138
- via Whatsapp/Telegram/Slack to contact "Lauren Trown", thumbs up their response.
159
+ Open Weather app, check tomorrow's weather forecast for "Berlin", and
160
+ send the summary via Whatsapp/Telegram/Slack to contact "Lauren Trown",
161
+ thumbs up their response.
139
162
  ```
140
163
 
141
164
  - **Schedule a meeting in Zoom and share invite via email**
142
165
  ```
143
- Open Zoom app, schedule a meeting titled "AI Hackathon" for tomorrow at 10 AM with a duration of 1 hour,
144
- copy the invitation link, and send it via Gmail to contacts "team@example.com".
166
+ Open Zoom app, schedule a meeting titled "AI Hackathon" for tomorrow at
167
+ 10AM with a duration of 1 hour, copy the invitation link, and send it via
168
+ Gmail to contacts "team@example.com".
145
169
  ```
170
+ [More prompt examples can be found here.](https://github.com/mobile-next/mobile-mcp/wiki/Prompt-Example-repo-list)
146
171
 
147
172
  ## Prerequisites
148
173
 
@@ -176,7 +201,7 @@ On iOS, you'll need Xcode and to run the Simulator before using Mobile MCP with
176
201
 
177
202
  # Thanks to all contributors ❤️
178
203
 
179
- ### We appreciate everyone who has helped improve this project.
204
+ ### We appreciate everyone who has helped improve this project.
180
205
 
181
206
  <a href = "https://github.com/mobile-next/mobile-mcp/graphs/contributors">
182
207
  <img src = "https://contrib.rocks/image?repo=mobile-next/mobile-mcp"/>
package/lib/android.js CHANGED
@@ -120,7 +120,6 @@ class AndroidRobot {
120
120
  async swipe(direction) {
121
121
  const screenSize = await this.getScreenSize();
122
122
  const centerX = screenSize.width >> 1;
123
- // const centerY = screenSize[1] >> 1;
124
123
  let x0, y0, x1, y1;
125
124
  switch (direction) {
126
125
  case "up":
@@ -133,6 +132,50 @@ class AndroidRobot {
133
132
  y0 = Math.floor(screenSize.height * 0.20);
134
133
  y1 = Math.floor(screenSize.height * 0.80);
135
134
  break;
135
+ case "left":
136
+ x0 = Math.floor(screenSize.width * 0.80);
137
+ x1 = Math.floor(screenSize.width * 0.20);
138
+ y0 = y1 = Math.floor(screenSize.height * 0.50);
139
+ break;
140
+ case "right":
141
+ x0 = Math.floor(screenSize.width * 0.20);
142
+ x1 = Math.floor(screenSize.width * 0.80);
143
+ y0 = y1 = Math.floor(screenSize.height * 0.50);
144
+ break;
145
+ default:
146
+ throw new robot_1.ActionableError(`Swipe direction "${direction}" is not supported`);
147
+ }
148
+ this.adb("shell", "input", "swipe", `${x0}`, `${y0}`, `${x1}`, `${y1}`, "1000");
149
+ }
150
+ async swipeFromCoordinate(x, y, direction, distance) {
151
+ const screenSize = await this.getScreenSize();
152
+ let x0, y0, x1, y1;
153
+ // Use provided distance or default to 30% of screen dimension
154
+ const defaultDistanceY = Math.floor(screenSize.height * 0.3);
155
+ const defaultDistanceX = Math.floor(screenSize.width * 0.3);
156
+ const swipeDistanceY = distance || defaultDistanceY;
157
+ const swipeDistanceX = distance || defaultDistanceX;
158
+ switch (direction) {
159
+ case "up":
160
+ x0 = x1 = x;
161
+ y0 = y;
162
+ y1 = Math.max(0, y - swipeDistanceY);
163
+ break;
164
+ case "down":
165
+ x0 = x1 = x;
166
+ y0 = y;
167
+ y1 = Math.min(screenSize.height, y + swipeDistanceY);
168
+ break;
169
+ case "left":
170
+ x0 = x;
171
+ x1 = Math.max(0, x - swipeDistanceX);
172
+ y0 = y1 = y;
173
+ break;
174
+ case "right":
175
+ x0 = x;
176
+ x1 = Math.min(screenSize.width, x + swipeDistanceX);
177
+ y0 = y1 = y;
178
+ break;
136
179
  default:
137
180
  throw new robot_1.ActionableError(`Swipe direction "${direction}" is not supported`);
138
181
  }
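The coordinate arithmetic that the new Android `swipeFromCoordinate` performs can be tried without a device. The sketch below copies the math from the hunk above into a pure function; `swipeEndpoints` and the sample screen size are hypothetical names and values chosen for illustration, and the real method goes on to pass the result to `adb shell input swipe`.

```javascript
// Hypothetical helper (not part of the package): reproduces the endpoint
// math of AndroidRobot.swipeFromCoordinate without touching adb.
function swipeEndpoints(screenSize, x, y, direction, distance) {
  // As in the diff, `distance || default` means 0 or undefined both fall
  // back to 30% of the relevant screen dimension.
  const dy = distance || Math.floor(screenSize.height * 0.3);
  const dx = distance || Math.floor(screenSize.width * 0.3);
  switch (direction) {
    case "up":
      return { x0: x, y0: y, x1: x, y1: Math.max(0, y - dy) };
    case "down":
      return { x0: x, y0: y, x1: x, y1: Math.min(screenSize.height, y + dy) };
    case "left":
      return { x0: x, y0: y, x1: Math.max(0, x - dx), y1: y };
    case "right":
      return { x0: x, y0: y, x1: Math.min(screenSize.width, x + dx), y1: y };
    default:
      throw new Error(`Swipe direction "${direction}" is not supported`);
  }
}

// Assumed screen size, for illustration only.
const sampleScreen = { width: 1080, height: 2400 };
console.log(swipeEndpoints(sampleScreen, 540, 1200, "up"));      // default distance
console.log(swipeEndpoints(sampleScreen, 540, 300, "up", 500));  // end point clamped at 0
```

Because the end point is clamped to the screen bounds, a swipe that starts near an edge can end up shorter than the requested distance.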
@@ -164,6 +207,10 @@ class AndroidRobot {
164
207
  // only provide it if it's true, otherwise don't confuse llm
165
208
  element.focused = true;
166
209
  }
210
+ const resourceId = node["resource-id"];
211
+ if (resourceId !== null && resourceId !== "") {
212
+ element.identifier = resourceId;
213
+ }
167
214
  if (element.rect.width > 0 && element.rect.height > 0) {
168
215
  elements.push(element);
169
216
  }
@@ -215,7 +262,7 @@ class AndroidRobot {
215
262
  // console.error("Failed to get UIAutomator XML. Here's a screenshot: " + screenshot.toString("base64"));
216
263
  continue;
217
264
  }
218
- return dump;
265
+ return dump.substring(dump.indexOf("<?xml"));
219
266
  }
220
267
  throw new robot_1.ActionableError("Failed to get UIAutomator XML");
221
268
  }
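The change from `return dump` to `return dump.substring(dump.indexOf("<?xml"))` drops anything printed before the XML declaration. Here is a minimal sketch of that behaviour; the leading warning line is a made-up example, since the diff does not say what text actually precedes the XML on affected devices.

```javascript
// Hypothetical helper wrapping the one-line change from the diff.
function stripToXml(dump) {
  return dump.substring(dump.indexOf("<?xml"));
}

// Made-up noise before the declaration, for illustration only.
const noisy = "WARNING: something unrelated\n<?xml version='1.0'?><hierarchy/>";
console.log(stripToXml(noisy));

// If the marker is missing, indexOf returns -1 and substring treats a
// negative start as 0, so the input comes back unchanged.
console.log(stripToXml("no xml here"));
```

Note that a dump with no XML declaration is passed through unchanged rather than rejected, so a caller would still need to handle a parse failure.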
package/lib/index.js CHANGED
@@ -1,17 +1,60 @@
1
1
  #!/usr/bin/env node
2
2
  "use strict";
3
+ var __importDefault = (this && this.__importDefault) || function (mod) {
4
+ return (mod && mod.__esModule) ? mod : { "default": mod };
5
+ };
3
6
  Object.defineProperty(exports, "__esModule", { value: true });
7
+ const sse_js_1 = require("@modelcontextprotocol/sdk/server/sse.js");
4
8
  const stdio_js_1 = require("@modelcontextprotocol/sdk/server/stdio.js");
5
9
  const server_1 = require("./server");
6
10
  const logger_1 = require("./logger");
7
- async function main() {
8
- const transport = new stdio_js_1.StdioServerTransport();
11
+ const express_1 = __importDefault(require("express"));
12
+ const commander_1 = require("commander");
13
+ const startSseServer = async (port) => {
14
+ const app = (0, express_1.default)();
9
15
  const server = (0, server_1.createMcpServer)();
10
- await server.connect(transport);
11
- (0, logger_1.error)("mobile-mcp server running on stdio");
12
- }
13
- main().catch(err => {
14
- console.error("Fatal error in main():", err);
15
- (0, logger_1.error)("Fatal error in main(): " + JSON.stringify(err.stack));
16
- process.exit(1);
17
- });
16
+ let transport = null;
17
+ app.post("/mcp", (req, res) => {
18
+ if (transport) {
19
+ transport.handlePostMessage(req, res);
20
+ }
21
+ });
22
+ app.get("/mcp", (req, res) => {
23
+ if (transport) {
24
+ transport.close();
25
+ }
26
+ transport = new sse_js_1.SSEServerTransport("/mcp", res);
27
+ server.connect(transport);
28
+ });
29
+ app.listen(port, () => {
30
+ (0, logger_1.error)(`mobile-mcp ${(0, server_1.getAgentVersion)()} sse server listening on http://localhost:${port}/mcp`);
31
+ });
32
+ };
33
+ const startStdioServer = async () => {
34
+ try {
35
+ const transport = new stdio_js_1.StdioServerTransport();
36
+ const server = (0, server_1.createMcpServer)();
37
+ await server.connect(transport);
38
+ (0, logger_1.error)("mobile-mcp server running on stdio");
39
+ }
40
+ catch (err) {
41
+ console.error("Fatal error in main():", err);
42
+ (0, logger_1.error)("Fatal error in main(): " + JSON.stringify(err.stack));
43
+ process.exit(1);
44
+ }
45
+ };
46
+ const main = async () => {
47
+ commander_1.program
48
+ .version((0, server_1.getAgentVersion)())
49
+ .option("--port <port>", "Start SSE server on this port")
50
+ .option("--stdio", "Start stdio server (default)")
51
+ .parse(process.argv);
52
+ const options = commander_1.program.opts();
53
+ if (options.port) {
54
+ await startSseServer(+options.port);
55
+ }
56
+ else {
57
+ await startStdioServer();
58
+ }
59
+ };
60
+ main().then();
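The new `main` picks a transport from the parsed command-line options: a truthy `--port` starts the SSE server, anything else falls through to stdio. The sketch below isolates that decision so it can be read without `commander` or `express`. `chooseTransport` is a hypothetical name, and the options object stands in for what `program.opts()` would return.

```javascript
// Hypothetical helper mirroring the branch in main(); not part of the package.
function chooseTransport(options) {
  if (options.port) {
    // The diff coerces with unary plus, so a non-numeric value becomes NaN.
    return { kind: "sse", port: +options.port };
  }
  // --stdio is declared as an option but never read; stdio is simply
  // whatever happens when --port is absent.
  return { kind: "stdio" };
}

console.log(chooseTransport({ port: "8080" }));
console.log(chooseTransport({ stdio: true }));
console.log(chooseTransport({ port: "8080", stdio: true })); // --port wins
console.log(chooseTransport({ port: "abc" }));               // port is NaN
```

As written in the diff there is no validation of the port value, so a wrapper that needs a clear error for something like `--port abc` would have to check `Number.isInteger` itself before listening.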
package/lib/ios.js CHANGED
@@ -83,6 +83,10 @@ class IosRobot {
83
83
  const wda = await this.wda();
84
84
  await wda.swipe(direction);
85
85
  }
86
+ async swipeFromCoordinate(x, y, direction, distance) {
87
+ const wda = await this.wda();
88
+ await wda.swipeFromCoordinate(x, y, direction, distance);
89
+ }
86
90
  async listApps() {
87
91
  await this.assertTunnelRunning();
88
92
  const output = await this.ios("apps", "--all", "--list");
@@ -39,75 +39,13 @@ class Simctl {
39
39
  async terminateApp(packageName) {
40
40
  this.simctl("terminate", this.simulatorUuid, packageName);
41
41
  }
42
- static parseIOSAppData(inputText) {
43
- const result = [];
44
- let ParseState;
45
- (function (ParseState) {
46
- ParseState[ParseState["LOOKING_FOR_APP"] = 0] = "LOOKING_FOR_APP";
47
- ParseState[ParseState["IN_APP"] = 1] = "IN_APP";
48
- ParseState[ParseState["IN_PROPERTY"] = 2] = "IN_PROPERTY";
49
- })(ParseState || (ParseState = {}));
50
- let state = ParseState.LOOKING_FOR_APP;
51
- let currentApp = {};
52
- let appIdentifier = "";
53
- const lines = inputText.split("\n");
54
- for (let line of lines) {
55
- line = line.trim();
56
- if (line === "") {
57
- continue;
58
- }
59
- switch (state) {
60
- case ParseState.LOOKING_FOR_APP:
61
- // look for app identifier pattern: "com.example.app" = {
62
- const appMatch = line.match(/^"?([^"=]+)"?\s*=\s*\{/);
63
- if (appMatch) {
64
- appIdentifier = appMatch[1].trim();
65
- currentApp = {
66
- CFBundleIdentifier: appIdentifier,
67
- };
68
- state = ParseState.IN_APP;
69
- }
70
- break;
71
- case ParseState.IN_APP:
72
- if (line === "};") {
73
- result.push(currentApp);
74
- currentApp = {};
75
- state = ParseState.LOOKING_FOR_APP;
76
- }
77
- else {
78
- // look for property: PropertyName = Value;
79
- const propertyMatch = line.match(/^([^=]+)\s*=\s*(.+?);\s*$/);
80
- if (propertyMatch) {
81
- const propName = propertyMatch[1].trim();
82
- let propValue = propertyMatch[2].trim();
83
- // remove quotes if present (they're optional)
84
- if (propValue.startsWith('"') && propValue.endsWith('"')) {
85
- propValue = propValue.substring(1, propValue.length - 1);
86
- }
87
- // add property to current app
88
- currentApp[propName] = propValue;
89
- }
90
- else if (line.endsWith("{")) {
91
- // nested property like GroupContainers = {
92
- state = ParseState.IN_PROPERTY;
93
- }
94
- }
95
- break;
96
- case ParseState.IN_PROPERTY:
97
- if (line === "};") {
98
- // end of nested property
99
- state = ParseState.IN_APP;
100
- }
101
- // skip content of nested properties, we don't care of those right now
102
- break;
103
- }
104
- }
105
- return result;
106
- }
107
42
  async listApps() {
108
43
  const text = this.simctl("listapps", this.simulatorUuid).toString();
109
- const apps = Simctl.parseIOSAppData(text);
110
- return apps.map(app => ({
44
+ const result = (0, child_process_1.execFileSync)("plutil", ["-convert", "json", "-o", "-", "-r", "-"], {
45
+ input: text,
46
+ });
47
+ const output = JSON.parse(result.toString());
48
+ return Object.values(output).map(app => ({
111
49
  packageName: app.CFBundleIdentifier,
112
50
  appName: app.CFBundleDisplayName,
113
51
  }));
@@ -124,6 +62,10 @@ class Simctl {
124
62
  const wda = await this.wda();
125
63
  return wda.swipe(direction);
126
64
  }
65
+ async swipeFromCoordinate(x, y, direction, distance) {
66
+ const wda = await this.wda();
67
+ return wda.swipeFromCoordinate(x, y, direction, distance);
68
+ }
127
69
  async tap(x, y) {
128
70
  const wda = await this.wda();
129
71
  return wda.tap(x, y);
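With the hand-written parser gone, `listApps` relies on `plutil` to turn the `simctl listapps` property list into JSON and then maps over the object's values. The sketch below shows only that mapping step on a hand-made object; the bundle identifiers and app names are invented, and the assumption that the converted output is keyed by bundle identifier is taken from the pattern in the removed parser's comment (`"com.example.app" = {`).

```javascript
// Invented sample standing in for JSON.parse(plutil output).
const sampleOutput = {
  "com.example.notes": {
    CFBundleIdentifier: "com.example.notes",
    CFBundleDisplayName: "Notes",
    GroupContainers: { "group.com.example": "/some/path" },
  },
  "com.example.mail": {
    CFBundleIdentifier: "com.example.mail",
    CFBundleDisplayName: "Mail",
  },
};

// Same projection as the new listApps(): the keys are discarded and nested
// properties such as GroupContainers are simply ignored.
function toAppList(output) {
  return Object.values(output).map(app => ({
    packageName: app.CFBundleIdentifier,
    appName: app.CFBundleDisplayName,
  }));
}

console.log(toAppList(sampleOutput));
```

Nested dictionaries no longer need the `IN_PROPERTY` state the old parser used to skip them, because the JSON conversion preserves them as ordinary nested objects.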
package/lib/server.js CHANGED
@@ -1,6 +1,6 @@
1
1
  "use strict";
2
2
  Object.defineProperty(exports, "__esModule", { value: true });
3
- exports.createMcpServer = void 0;
3
+ exports.createMcpServer = exports.getAgentVersion = void 0;
4
4
  const mcp_js_1 = require("@modelcontextprotocol/sdk/server/mcp.js");
5
5
  const zod_1 = require("zod");
6
6
  const logger_1 = require("./logger");
@@ -14,6 +14,7 @@ const getAgentVersion = () => {
14
14
  const json = require("../package.json");
15
15
  return json.version;
16
16
  };
17
+ exports.getAgentVersion = getAgentVersion;
17
18
  const getLatestAgentVersion = async () => {
18
19
  const response = await fetch("https://api.github.com/repos/mobile-next/mobile-mcp/tags?per_page=1");
19
20
  const json = await response.json();
@@ -22,7 +23,7 @@ const getLatestAgentVersion = async () => {
22
23
  const checkForLatestAgentVersion = async () => {
23
24
  try {
24
25
  const latestVersion = await getLatestAgentVersion();
25
- const currentVersion = getAgentVersion();
26
+ const currentVersion = (0, exports.getAgentVersion)();
26
27
  if (latestVersion !== currentVersion) {
27
28
  (0, logger_1.trace)(`You are running an older version of the agent. Please update to the latest version: ${latestVersion}.`);
28
29
  }
@@ -34,12 +35,13 @@ const checkForLatestAgentVersion = async () => {
34
35
  const createMcpServer = () => {
35
36
  const server = new mcp_js_1.McpServer({
36
37
  name: "mobile-mcp",
37
- version: getAgentVersion(),
38
+ version: (0, exports.getAgentVersion)(),
38
39
  capabilities: {
39
40
  resources: {},
40
41
  tools: {},
41
42
  },
42
43
  });
44
+ const noParams = zod_1.z.object({});
43
45
  const tool = (name, description, paramsSchema, cb) => {
44
46
  const wrappedCb = async (args) => {
45
47
  try {
@@ -75,7 +77,9 @@ const createMcpServer = () => {
75
77
  throw new robot_1.ActionableError("No device selected. Use the mobile_use_device tool to select a device.");
76
78
  }
77
79
  };
78
- tool("mobile_list_available_devices", "List all available devices. This includes both physical devices and simulators. If there is more than one device returned, you need to let the user select one of them.", {}, async ({}) => {
80
+ tool("mobile_list_available_devices", "List all available devices. This includes both physical devices and simulators. If there is more than one device returned, you need to let the user select one of them.", {
81
+ noParams
82
+ }, async ({}) => {
79
83
  const iosManager = new ios_1.IosManager();
80
84
  const androidManager = new android_1.AndroidDeviceManager();
81
85
  const devices = simulatorManager.listBootedSimulators();
@@ -117,7 +121,9 @@ const createMcpServer = () => {
117
121
  }
118
122
  return `Selected device: ${device}`;
119
123
  });
120
- tool("mobile_list_apps", "List all the installed apps on the device", {}, async ({}) => {
124
+ tool("mobile_list_apps", "List all the installed apps on the device", {
125
+ noParams
126
+ }, async ({}) => {
121
127
  requireRobot();
122
128
  const result = await robot.listApps();
123
129
  return `Found these apps on device: ${result.map(app => `${app.appName} (${app.packageName})`).join(", ")}`;
@@ -136,7 +142,9 @@ const createMcpServer = () => {
136
142
  await robot.terminateApp(packageName);
137
143
  return `Terminated app ${packageName}`;
138
144
  });
139
- tool("mobile_get_screen_size", "Get the screen size of the mobile device in pixels", {}, async ({}) => {
145
+ tool("mobile_get_screen_size", "Get the screen size of the mobile device in pixels", {
146
+ noParams
147
+ }, async ({}) => {
140
148
  requireRobot();
141
149
  const screenSize = await robot.getScreenSize();
142
150
  return `Screen size is ${screenSize.width}x${screenSize.height} pixels`;
@@ -149,7 +157,9 @@ const createMcpServer = () => {
149
157
  await robot.tap(x, y);
150
158
  return `Clicked on screen at coordinates: ${x}, ${y}`;
151
159
  });
152
- tool("mobile_list_elements_on_screen", "List elements on screen and their coordinates, with display text or accessibility label. Do not cache this result.", {}, async ({}) => {
160
+ tool("mobile_list_elements_on_screen", "List elements on screen and their coordinates, with display text or accessibility label. Do not cache this result.", {
161
+ noParams
162
+ }, async ({}) => {
153
163
  requireRobot();
154
164
  const elements = await robot.getElementsOnScreen();
155
165
  const result = elements.map(element => {
@@ -159,6 +169,7 @@ const createMcpServer = () => {
159
169
  label: element.label,
160
170
  name: element.name,
161
171
  value: element.value,
172
+ identifier: element.identifier,
162
173
  coordinates: {
163
174
  x: element.rect.x,
164
175
  y: element.rect.y,
@@ -188,11 +199,23 @@ const createMcpServer = () => {
188
199
  return `Opened URL: ${url}`;
189
200
  });
190
201
  tool("swipe_on_screen", "Swipe on the screen", {
191
- direction: zod_1.z.enum(["up", "down"]).describe("The direction to swipe"),
192
- }, async ({ direction }) => {
202
+ direction: zod_1.z.enum(["up", "down", "left", "right"]).describe("The direction to swipe"),
203
+ x: zod_1.z.number().optional().describe("The x coordinate to start the swipe from, in pixels. If not provided, uses center of screen"),
204
+ y: zod_1.z.number().optional().describe("The y coordinate to start the swipe from, in pixels. If not provided, uses center of screen"),
205
+ distance: zod_1.z.number().optional().describe("The distance to swipe in pixels. Defaults to 400 pixels for iOS or 30% of screen dimension for Android"),
206
+ }, async ({ direction, x, y, distance }) => {
193
207
  requireRobot();
194
- await robot.swipe(direction);
195
- return `Swiped ${direction} on screen`;
208
+ if (x !== undefined && y !== undefined) {
209
+ // Use coordinate-based swipe
210
+ await robot.swipeFromCoordinate(x, y, direction, distance);
211
+ const distanceText = distance ? ` ${distance} pixels` : "";
212
+ return `Swiped ${direction}${distanceText} from coordinates: ${x}, ${y}`;
213
+ }
214
+ else {
215
+ // Use center-based swipe
216
+ await robot.swipe(direction);
217
+ return `Swiped ${direction} on screen`;
218
+ }
196
219
  });
197
220
  tool("mobile_type_keys", "Type text into the focused element", {
198
221
  text: zod_1.z.string().describe("The text to type"),
@@ -205,7 +228,9 @@ const createMcpServer = () => {
205
228
  }
206
229
  return `Typed text: ${text}`;
207
230
  });
208
- server.tool("mobile_take_screenshot", "Take a screenshot of the mobile device. Use this to understand what's on screen, if you need to press an element that is available through view hierarchy then you must list elements on screen instead. Do not cache this result.", {}, async ({}) => {
231
+ server.tool("mobile_take_screenshot", "Take a screenshot of the mobile device. Use this to understand what's on screen, if you need to press an element that is available through view hierarchy then you must list elements on screen instead. Do not cache this result.", {
232
+ noParams
233
+ }, async ({}) => {
209
234
  requireRobot();
210
235
  try {
211
236
  const screenSize = await robot.getScreenSize();
@@ -249,7 +274,9 @@ const createMcpServer = () => {
249
274
  await robot.setOrientation(orientation);
250
275
  return `Changed device orientation to ${orientation}`;
251
276
  });
252
- tool("mobile_get_orientation", "Get the current screen orientation of the device", {}, async () => {
277
+ tool("mobile_get_orientation", "Get the current screen orientation of the device", {
278
+ noParams
279
+ }, async () => {
253
280
  requireRobot();
254
281
  const orientation = await robot.getOrientation();
255
282
  return `Current device orientation is ${orientation}`;
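The reworked `swipe_on_screen` handler only uses the coordinate-based path when both `x` and `y` are supplied. The sketch below restates that branch as a pure function that returns which robot method would be called and the message the tool would return. `planSwipe` is a hypothetical name used for illustration.

```javascript
// Hypothetical helper: same branching as the handler in the diff, but it
// returns a description of the call instead of driving a device.
function planSwipe({ direction, x, y, distance }) {
  if (x !== undefined && y !== undefined) {
    const distanceText = distance ? ` ${distance} pixels` : "";
    return {
      method: "swipeFromCoordinate",
      args: [x, y, direction, distance],
      message: `Swiped ${direction}${distanceText} from coordinates: ${x}, ${y}`,
    };
  }
  return {
    method: "swipe",
    args: [direction],
    message: `Swiped ${direction} on screen`,
  };
}

console.log(planSwipe({ direction: "left", x: 100, y: 200, distance: 50 }));
// Only x given: falls back to the center swipe, and distance is ignored.
console.log(planSwipe({ direction: "up", x: 100, distance: 50 }));
```

One consequence of this design is that a caller who passes a single coordinate, or a `distance` without coordinates, gets a center-based swipe with no warning.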
@@ -29,7 +29,14 @@ class WebDriverAgent {
29
29
  },
30
30
  body: JSON.stringify({ capabilities: { alwaysMatch: { platformName: "iOS" } } }),
31
31
  });
32
+ if (!response.ok) {
33
+ const errorText = await response.text();
34
+ throw new robot_1.ActionableError(`Failed to create WebDriver session: ${response.status} ${errorText}`);
35
+ }
32
36
  const json = await response.json();
37
+ if (!json.value || !json.value.sessionId) {
38
+ throw new robot_1.ActionableError(`Invalid session response: ${JSON.stringify(json)}`);
39
+ }
33
40
  return json.value.sessionId;
34
41
  }
35
42
  async deleteSession(sessionId) {
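The added guards in `createSession` fail fast when WebDriverAgent returns a body without a session id, instead of throwing a `TypeError` on `json.value.sessionId`. Here is a small sketch of the second guard in isolation; `extractSessionId` is a hypothetical name, and a plain `Error` stands in for the package's `ActionableError`.

```javascript
// Hypothetical helper: the shape check from the diff, applied to an
// already-parsed response body.
function extractSessionId(json) {
  if (!json.value || !json.value.sessionId) {
    throw new Error(`Invalid session response: ${JSON.stringify(json)}`);
  }
  return json.value.sessionId;
}

console.log(extractSessionId({ value: { sessionId: "abc-123" } }));

try {
  extractSessionId({ value: { error: "session not created" } });
} catch (err) {
  // The raw body is included in the message, which helps when debugging.
  console.log(err.message);
}
```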
@@ -44,8 +51,8 @@ class WebDriverAgent {
44
51
  await this.deleteSession(sessionId);
45
52
  return result;
46
53
  }
47
- async getScreenSize() {
48
- return this.withinSession(async (sessionUrl) => {
54
+ async getScreenSize(sessionUrl) {
55
+ if (sessionUrl) {
49
56
  const url = `${sessionUrl}/wda/screen`;
50
57
  const response = await fetch(url);
51
58
  const json = await response.json();
@@ -54,7 +61,19 @@ class WebDriverAgent {
54
61
  height: json.value.screenSize.height,
55
62
  scale: json.value.scale || 1,
56
63
  };
57
- });
64
+ }
65
+ else {
66
+ return this.withinSession(async (sessionUrlInner) => {
67
+ const url = `${sessionUrlInner}/wda/screen`;
68
+ const response = await fetch(url);
69
+ const json = await response.json();
70
+ return {
71
+ width: json.value.screenSize.width,
72
+ height: json.value.screenSize.height,
73
+ scale: json.value.scale || 1,
74
+ };
75
+ });
76
+ }
58
77
  }
59
78
  async sendKeys(keys) {
60
79
  await this.withinSession(async (sessionUrl) => {
@@ -130,12 +149,13 @@ class WebDriverAgent {
130
149
  const acceptedTypes = ["TextField", "Button", "Switch", "Icon", "SearchField", "StaticText", "Image"];
131
150
  if (acceptedTypes.includes(source.type)) {
132
151
  if (source.isVisible === "1" && this.isVisible(source.rect)) {
133
- if (source.label !== null || source.name !== null) {
152
+ if (source.label !== null || source.name !== null || source.rawIdentifier !== null) {
134
153
  output.push({
135
154
  type: source.type,
136
155
  label: source.label,
137
156
  name: source.name,
138
157
  value: source.value,
158
+ identifier: source.rawIdentifier,
139
159
  rect: {
140
160
  x: source.rect.x,
141
161
  y: source.rect.y,
@@ -173,17 +193,95 @@ class WebDriverAgent {
  }
  async swipe(direction) {
  await this.withinSession(async (sessionUrl) => {
- const x0 = 200;
- let y0 = 600;
- const x1 = 200;
- let y1 = 200;
- if (direction === "up") {
- const tmp = y0;
- y0 = y1;
- y1 = tmp;
+ const screenSize = await this.getScreenSize(sessionUrl);
+ let x0, y0, x1, y1;
+ // Use 60% of the width/height for swipe distance
+ const verticalDistance = Math.floor(screenSize.height * 0.6);
+ const horizontalDistance = Math.floor(screenSize.width * 0.6);
+ const centerX = Math.floor(screenSize.width / 2);
+ const centerY = Math.floor(screenSize.height / 2);
+ switch (direction) {
+ case "up":
+ x0 = x1 = centerX;
+ y0 = centerY + Math.floor(verticalDistance / 2);
+ y1 = centerY - Math.floor(verticalDistance / 2);
+ break;
+ case "down":
+ x0 = x1 = centerX;
+ y0 = centerY - Math.floor(verticalDistance / 2);
+ y1 = centerY + Math.floor(verticalDistance / 2);
+ break;
+ case "left":
+ y0 = y1 = centerY;
+ x0 = centerX + Math.floor(horizontalDistance / 2);
+ x1 = centerX - Math.floor(horizontalDistance / 2);
+ break;
+ case "right":
+ y0 = y1 = centerY;
+ x0 = centerX - Math.floor(horizontalDistance / 2);
+ x1 = centerX + Math.floor(horizontalDistance / 2);
+ break;
+ default:
+ throw new robot_1.ActionableError(`Swipe direction "${direction}" is not supported`);
  }
  const url = `${sessionUrl}/actions`;
- await fetch(url, {
+ const response = await fetch(url, {
+ method: "POST",
+ headers: {
+ "Content-Type": "application/json",
+ },
+ body: JSON.stringify({
+ actions: [
+ {
+ type: "pointer",
+ id: "finger1",
+ parameters: { pointerType: "touch" },
+ actions: [
+ { type: "pointerMove", duration: 0, x: x0, y: y0 },
+ { type: "pointerDown", button: 0 },
+ { type: "pointerMove", duration: 1000, x: x1, y: y1 },
+ { type: "pointerUp", button: 0 }
+ ]
+ }
+ ]
+ }),
+ });
+ if (!response.ok) {
+ const errorText = await response.text();
+ throw new robot_1.ActionableError(`WebDriver actions request failed: ${response.status} ${errorText}`);
+ }
+ // Clear actions to ensure they complete
+ await fetch(`${sessionUrl}/actions`, {
+ method: "DELETE",
+ });
+ });
+ }
+ async swipeFromCoordinate(x, y, direction, distance = 400) {
+ await this.withinSession(async (sessionUrl) => {
+ // Use simple coordinates like the working swipe method
+ const x0 = x;
+ const y0 = y;
+ let x1 = x;
+ let y1 = y;
+ // Calculate target position based on direction and distance
+ switch (direction) {
+ case "up":
+ y1 = y - distance; // Move up by specified distance
+ break;
+ case "down":
+ y1 = y + distance; // Move down by specified distance
+ break;
+ case "left":
+ x1 = x - distance; // Move left by specified distance
+ break;
+ case "right":
+ x1 = x + distance; // Move right by specified distance
+ break;
+ default:
+ throw new robot_1.ActionableError(`Swipe direction "${direction}" is not supported`);
+ }
+ const url = `${sessionUrl}/actions`;
+ const response = await fetch(url, {
  method: "POST",
  headers: {
  "Content-Type": "application/json",
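Note on the hunk above: the rewritten `swipe` replaces the hard-coded 200/600 coordinates with endpoints centred on the screen and spanning 60% of the relevant dimension, then sends them as a four-step touch pointer sequence. The sketch below reproduces only that arithmetic and payload shape, without any network calls. The function names `swipeEndpoints` and `buildSwipeActions` are hypothetical, and a plain `Error` stands in for the package's `ActionableError`.

```javascript
// Hypothetical helper: computes swipe start/end points the way the new swipe() does.
function swipeEndpoints(screenSize, direction) {
  const halfV = Math.floor(Math.floor(screenSize.height * 0.6) / 2);
  const halfH = Math.floor(Math.floor(screenSize.width * 0.6) / 2);
  const centerX = Math.floor(screenSize.width / 2);
  const centerY = Math.floor(screenSize.height / 2);
  switch (direction) {
    case "up":
      return { x0: centerX, y0: centerY + halfV, x1: centerX, y1: centerY - halfV };
    case "down":
      return { x0: centerX, y0: centerY - halfV, x1: centerX, y1: centerY + halfV };
    case "left":
      return { x0: centerX + halfH, y0: centerY, x1: centerX - halfH, y1: centerY };
    case "right":
      return { x0: centerX - halfH, y0: centerY, x1: centerX + halfH, y1: centerY };
    default:
      // The package throws ActionableError here; a plain Error is used in this sketch.
      throw new Error(`Swipe direction "${direction}" is not supported`);
  }
}

// Hypothetical helper: builds the request body posted to `${sessionUrl}/actions`.
function buildSwipeActions({ x0, y0, x1, y1 }) {
  return {
    actions: [{
      type: "pointer",
      id: "finger1",
      parameters: { pointerType: "touch" },
      actions: [
        { type: "pointerMove", duration: 0, x: x0, y: y0 },
        { type: "pointerDown", button: 0 },
        // The drag itself now takes 1000 ms instead of jumping and then pausing.
        { type: "pointerMove", duration: 1000, x: x1, y: y1 },
        { type: "pointerUp", button: 0 },
      ],
    }],
  };
}

// Made-up screen size for illustration.
const points = swipeEndpoints({ width: 390, height: 844 }, "up");
console.log(points, JSON.stringify(buildSwipeActions(points)));
```

Because the endpoints scale with the reported screen size, the same call should behave comparably on phones and tablets, which the fixed coordinates in 0.0.17 could not guarantee.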
@@ -197,14 +295,21 @@ class WebDriverAgent {
  actions: [
  { type: "pointerMove", duration: 0, x: x0, y: y0 },
  { type: "pointerDown", button: 0 },
- { type: "pointerMove", duration: 0, x: x1, y: y1 },
- { type: "pause", duration: 1000 },
+ { type: "pointerMove", duration: 1000, x: x1, y: y1 },
  { type: "pointerUp", button: 0 }
  ]
  }
  ]
  }),
  });
+ if (!response.ok) {
+ const errorText = await response.text();
+ throw new robot_1.ActionableError(`WebDriver actions request failed: ${response.status} ${errorText}`);
+ }
+ // Clear actions to ensure they complete
+ await fetch(`${sessionUrl}/actions`, {
+ method: "DELETE",
+ });
  });
  }
  async setOrientation(orientation) {
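Note on `swipeFromCoordinate`: the new method starts at the caller's `(x, y)` and offsets one axis by `distance` (default 400). The diff shows no clamping of the target to the screen bounds, so the end point can be negative or off-screen. A minimal sketch of the target calculation follows; the function name `swipeTarget` is hypothetical and a plain `Error` stands in for `ActionableError`.

```javascript
// Hypothetical helper: computes the end point used by swipeFromCoordinate.
// No clamping is applied, matching what the diff shows.
function swipeTarget(x, y, direction, distance = 400) {
  switch (direction) {
    case "up":
      return { x1: x, y1: y - distance };
    case "down":
      return { x1: x, y1: y + distance };
    case "left":
      return { x1: x - distance, y1: y };
    case "right":
      return { x1: x + distance, y1: y };
    default:
      // The package throws ActionableError here; a plain Error is used in this sketch.
      throw new Error(`Swipe direction "${direction}" is not supported`);
  }
}

console.log(swipeTarget(200, 600, "up"));      // default distance of 400
console.log(swipeTarget(100, 300, "left", 150)); // target x goes negative
```

Callers that start near a screen edge may want to pass a smaller `distance` so the target stays on screen.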
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@mobilenext/mobile-mcp",
- "version": "0.0.17",
+ "version": "0.0.19",
  "description": "Mobile MCP",
  "repository": {
  "type": "git",
@@ -23,6 +23,10 @@
  ],
  "dependencies": {
  "@modelcontextprotocol/sdk": "^1.6.1",
+ "@types/commander": "^2.12.0",
+ "@types/express": "^5.0.3",
+ "commander": "^14.0.0",
+ "express": "^5.1.0",
  "fast-xml-parser": "^5.0.9",
  "zod-to-json-schema": "^3.24.4"
  },
@@ -40,8 +44,8 @@
  "eslint-plugin-import": "^2.31.0",
  "eslint-plugin-notice": "^1.0.0",
  "husky": "^9.1.7",
- "nyc": "^17.1.0",
  "mocha": "^11.1.0",
+ "nyc": "^17.1.0",
  "ts-node": "^10.9.2",
  "typescript": "^5.8.2"
  },