@mobilenext/mobile-mcp 0.0.18 β†’ 0.0.19

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,12 +1,12 @@
  # Mobile Next - MCP server for Mobile Development and Automation | iOS, Android, Simulator, Emulator, and physical devices
 
  This is a [Model Context Protocol (MCP) server](https://github.com/modelcontextprotocol) that enables scalable mobile automation, development through a platform-agnostic interface, eliminating the need for distinct iOS or Android knowledge. You can run it on emulators, simulators, and physical devices (iOS and Android).
- This server allows Agents and LLMs to interact with native iOS/Android applications and devices through structured accessibility snapshots or coordinate-based taps based on screenshots.
+ This server allows Agents and LLMs to interact with native iOS/Android applications and devices through structured accessibility snapshots or coordinate-based taps based on screenshots.
 
  <h4 align="center">
  <a href="https://github.com/mobile-next/mobile-mcp">
  <img src="https://img.shields.io/github/stars/mobile-next/mobile-mcp" alt="Mobile Next Stars" />
- </a>
+ </a>
  <a href="https://github.com/mobile-next/mobile-mcp">
  <img src="https://img.shields.io/github/contributors/mobile-next/mobile-mcp?color=green" alt="Mobile Next Downloads" />
  </a>
@@ -18,14 +18,14 @@ This server allows Agents and LLMs to interact with native iOS/Android applicati
  </a>
  <a href="https://github.com/mobile-next/mobile-mcp/blob/main/LICENSE">
  <img src="https://img.shields.io/badge/license-Apache 2.0-blue.svg" alt="Mobile MCP is released under the Apache-2.0 License">
- </a>
-
+ </a>
+
  </p>
 
  <h4 align="center">
  <a href="http://mobilenexthq.com/join-slack">
  <img src="https://img.shields.io/badge/join-Slack-blueviolet?logo=slack&style=flat" alt="Slack community channel" />
- </a>
+ </a>
  </p>
 
  https://github.com/user-attachments/assets/c4e89c4f-cc71-4424-8184-bdbc8c638fa1
@@ -38,7 +38,7 @@ https://github.com/user-attachments/assets/c4e89c4f-cc71-4424-8184-bdbc8c638fa1
 
  ### πŸš€ Mobile MCP Roadmap: Building the Future of Mobile
 
- Join us on our journey as we continuously enhance Mobile MCP!
+ Join us on our journey as we continuously enhance Mobile MCP!
  Check out our detailed roadmap to see upcoming features, improvements, and milestones. Your feedback is invaluable in shaping the future of mobile automation.
 
  πŸ‘‰ [Explore the Roadmap](https://github.com/orgs/mobile-next/projects/3)
@@ -48,7 +48,7 @@ Check out our detailed roadmap to see upcoming features, improvements, and miles
 
  How we help to scale mobile automation:
 
- - πŸ“² Native app automation (iOS and Android) for testing or data-entry scenarios.
+ - πŸ“² Native app automation (iOS and Android) for testing or data-entry scenarios.
  - πŸ“ Scripted flows and form interactions without manually controlling simulators/emulators or physical devices (iPhone, Samsung, Google Pixel etc)
  - 🧭 Automating multi-step user journeys driven by an LLM
  - πŸ‘† General-purpose mobile application interaction for agent-based frameworks
@@ -56,11 +56,11 @@ How we help to scale mobile automation:
 
  ## Main Features
 
- - πŸš€ **Fast and lightweight**: Uses native accessibility trees for most interactions, or screenshot based coordinates where a11y labels are not available.
+ - πŸš€ **Fast and lightweight**: Uses native accessibility trees for most interactions, or screenshot based coordinates where a11y labels are not available.
  - πŸ€– **LLM-friendly**: No computer vision model required in Accessibility (Snapshot).
  - 🧿 **Visual Sense**: Evaluates and analyses what’s actually rendered on screen to decide the next action. If accessibility data or view-hierarchy coordinates are unavailable, it falls back to screenshot-based analysis.
  - πŸ“Š **Deterministic tool application**: Reduces ambiguity found in purely screenshot-based approaches by relying on structured data whenever possible.
- - πŸ“Ί **Extract structured data**: Enables you to extract structred data from anything visible on screen.
+ - πŸ“Ί **Extract structured data**: Enables you to extract structred data from anything visible on screen.
 
  ## πŸ—οΈ Mobile MCP Architecture
 
@@ -71,7 +71,7 @@ How we help to scale mobile automation:
  </p>
 
 
- ## πŸ“š Wiki page
+ ## πŸ“š Wiki page
 
  More details in our [wiki page](https://github.com/mobile-next/mobile-mcp/wiki) for setup, configuration and debugging related questions.
 
@@ -91,8 +91,8 @@ Setup our MCP with Cline, Cursor, Claude, VS Code, Github Copilot:
  }
 
  ```
- [Cline:](https://docs.cline.bot/mcp/configuring-mcp-servers) To setup Cline, just add the json above to your MCP settings file.
- [More in our wiki](https://github.com/mobile-next/mobile-mcp/wiki/Cline)
+ [Cline:](https://docs.cline.bot/mcp/configuring-mcp-servers) To setup Cline, just add the json above to your MCP settings file.
+ [More in our wiki](https://github.com/mobile-next/mobile-mcp/wiki/Cline)
 
  [Claude Code:](https://docs.anthropic.com/en/docs/agents-and-tools/claude-code/overview)
 
@@ -105,7 +105,7 @@ claude mcp add mobile -- npx -y @mobilenext/mobile-mcp@latest
 
  ### πŸ› οΈ How to Use πŸ“
 
- After adding the MCP server to your IDE/Client, you can instruct your AI assistant to use the available tools.
+ After adding the MCP server to your IDE/Client, you can instruct your AI assistant to use the available tools.
  For example, in Cursor's agent mode, you could use the prompts below to quickly validate, test and iterate on UI intereactions, read information from screen, go through complex workflows.
  Be descriptive, straight to the point.
 
@@ -117,47 +117,55 @@ You can specifiy detailed workflows in a single prompt, verify business logic, s
 
  **Search for a video, comment, like and share it.**
  ```
- Find the video called " Beginner Recipe for Tonkotsu Ramen" by Way of Ramen, click on like video, after liking write a comment " this was delicious, will make it next Friday", share the video with the first contact in your whatsapp list.
+ Find the video called " Beginner Recipe for Tonkotsu Ramen" by Way of
+ Ramen, click on like video, after liking write a comment " this was
+ delicious, will make it next Friday", share the video with the first
+ contact in your whatsapp list.
  ```
 
- **Download a successful step counter app, register, setup workout and 5 start the app**
+ **Download a successful step counter app, register, setup workout and 5-star the app**
  ```
- Find and Download a free "Pomodoro" app that has more thank 1k stars.
- Launch the app, register with my email, after registration find how to start a pomodoro timer.
- When the pomodoro timer started, go back to the app store and rate the app 5 stars,
- and leave a comment how useful the app is.
+ Find and Download a free "Pomodoro" app that has more than 1k stars.
+ Launch the app, register with my email, after registration find how to
+ start a pomodoro timer. When the pomodoro timer started, go back to the
+ app store and rate the app 5 stars, and leave a comment how useful the
+ app is.
  ```
 
  **Search in Substack, read, highlight, comment and save an article**
  ```
- Open Substack website, search for "Latest trends in AI automation 2025", open the first article,
- highlight the section titled "Emerging AI trends", and save article to reading list for later review,
- comment a random paragraph summary.
+ Open Substack website, search for "Latest trends in AI automation 2025",
+ open the first article, highlight the section titled "Emerging AI trends",
+ and save article to reading list for later review, comment a random
+ paragraph summary.
  ```
 
  **Reserve a workout class, set timer**
  ```
- Open ClassPass, search for yoga classes tomorrow morning within 2 miles,
+ Open ClassPass, search for yoga classes tomorrow morning within 2 miles,
  book the highest-rated class at 7 AM, confirm reservation,
- setup a timer for the booked slot in the phone
+ setup a timer for the booked slot in the phone
  ```
 
  **Find a local event, setup calendar event**
  ```
- Open Eventbrite, search for AI startup meetup events happening this weekend in "Austin, TX",
- select the most popular one, register and RSVP yes to the even, setup a calendar event as a reminder.
+ Open Eventbrite, search for AI startup meetup events happening this
+ weekend in "Austin, TX", select the most popular one, register and RSVP
+ yes to the event, setup a calendar event as a reminder.
  ```
 
  **Check weather forecast and send a Whatsapp/Telegram/Slack message**
  ```
- Open Weather app, check tomorrow's weather forecast for "Berlin", and send the summary
- via Whatsapp/Telegram/Slack to contact "Lauren Trown", thumbs up their response.
+ Open Weather app, check tomorrow's weather forecast for "Berlin", and
+ send the summary via Whatsapp/Telegram/Slack to contact "Lauren Trown",
+ thumbs up their response.
  ```
 
  - **Schedule a meeting in Zoom and share invite via email**
  ```
- Open Zoom app, schedule a meeting titled "AI Hackathon" for tomorrow at 10 AM with a duration of 1 hour,
- copy the invitation link, and send it via Gmail to contacts "team@example.com".
+ Open Zoom app, schedule a meeting titled "AI Hackathon" for tomorrow at
+ 10AM with a duration of 1 hour, copy the invitation link, and send it via
+ Gmail to contacts "team@example.com".
  ```
  [More prompt examples can be found here.](https://github.com/mobile-next/mobile-mcp/wiki/Prompt-Example-repo-list)
 
@@ -193,7 +201,7 @@ On iOS, you'll need Xcode and to run the Simulator before using Mobile MCP with 
 
  # Thanks to all contributors ❀️
 
- ### We appreciate everyone who has helped improve this project.
+ ### We appreciate everyone who has helped improve this project.
 
  <a href = "https://github.com/mobile-next/mobile-mcp/graphs/contributors">
  <img src = "https://contrib.rocks/image?repo=mobile-next/mobile-mcp"/>
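Editor's note: the README hunk above tells Cline users to "add the json above" to their MCP settings, but the diff context only shows the tail of that JSON block (a closing `}` and a code fence). For orientation, a typical MCP client entry for this server follows the standard `npx` launch pattern visible in the `claude mcp add mobile -- npx -y @mobilenext/mobile-mcp@latest` hunk header; the exact README contents are outside this diff, so treat this as a sketch:

```json
{
  "mcpServers": {
    "mobile-mcp": {
      "command": "npx",
      "args": ["-y", "@mobilenext/mobile-mcp@latest"]
    }
  }
}
```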
package/lib/android.js CHANGED
@@ -120,7 +120,6 @@ class AndroidRobot {
  async swipe(direction) {
  const screenSize = await this.getScreenSize();
  const centerX = screenSize.width >> 1;
- // const centerY = screenSize[1] >> 1;
  let x0, y0, x1, y1;
  switch (direction) {
  case "up":
@@ -133,6 +132,50 @@ class AndroidRobot {
  y0 = Math.floor(screenSize.height * 0.20);
  y1 = Math.floor(screenSize.height * 0.80);
  break;
+ case "left":
+ x0 = Math.floor(screenSize.width * 0.80);
+ x1 = Math.floor(screenSize.width * 0.20);
+ y0 = y1 = Math.floor(screenSize.height * 0.50);
+ break;
+ case "right":
+ x0 = Math.floor(screenSize.width * 0.20);
+ x1 = Math.floor(screenSize.width * 0.80);
+ y0 = y1 = Math.floor(screenSize.height * 0.50);
+ break;
+ default:
+ throw new robot_1.ActionableError(`Swipe direction "${direction}" is not supported`);
+ }
+ this.adb("shell", "input", "swipe", `${x0}`, `${y0}`, `${x1}`, `${y1}`, "1000");
+ }
+ async swipeFromCoordinate(x, y, direction, distance) {
+ const screenSize = await this.getScreenSize();
+ let x0, y0, x1, y1;
+ // Use provided distance or default to 30% of screen dimension
+ const defaultDistanceY = Math.floor(screenSize.height * 0.3);
+ const defaultDistanceX = Math.floor(screenSize.width * 0.3);
+ const swipeDistanceY = distance || defaultDistanceY;
+ const swipeDistanceX = distance || defaultDistanceX;
+ switch (direction) {
+ case "up":
+ x0 = x1 = x;
+ y0 = y;
+ y1 = Math.max(0, y - swipeDistanceY);
+ break;
+ case "down":
+ x0 = x1 = x;
+ y0 = y;
+ y1 = Math.min(screenSize.height, y + swipeDistanceY);
+ break;
+ case "left":
+ x0 = x;
+ x1 = Math.max(0, x - swipeDistanceX);
+ y0 = y1 = y;
+ break;
+ case "right":
+ x0 = x;
+ x1 = Math.min(screenSize.width, x + swipeDistanceX);
+ y0 = y1 = y;
+ break;
  default:
  throw new robot_1.ActionableError(`Swipe direction "${direction}" is not supported`);
  }
@@ -219,7 +262,7 @@ class AndroidRobot {
  // console.error("Failed to get UIAutomator XML. Here's a screenshot: " + screenshot.toString("base64"));
  continue;
  }
- return dump;
+ return dump.substring(dump.indexOf("<?xml"));
  }
  throw new robot_1.ActionableError("Failed to get UIAutomator XML");
  }
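Editor's note: the new `swipeFromCoordinate` added to `AndroidRobot` above computes a start/end coordinate pair and hands it to `adb shell input swipe`. The endpoint math is easy to check in isolation: the swipe moves by the caller's `distance`, or by 30% of the relevant screen dimension when none is given, and is clamped to the screen bounds. The sketch below replicates that logic (the helper name `swipeEndpoints` is illustrative, not an export of this package):

```javascript
// Replicates the endpoint math of AndroidRobot.swipeFromCoordinate:
// start at (x, y), end `distance` pixels away in the swipe direction
// (default 30% of the screen dimension), clamped to screen bounds.
function swipeEndpoints(x, y, direction, distance, screen) {
  const dy = distance || Math.floor(screen.height * 0.3);
  const dx = distance || Math.floor(screen.width * 0.3);
  switch (direction) {
    case "up":    return { x0: x, y0: y, x1: x, y1: Math.max(0, y - dy) };
    case "down":  return { x0: x, y0: y, x1: x, y1: Math.min(screen.height, y + dy) };
    case "left":  return { x0: x, y0: y, x1: Math.max(0, x - dx), y1: y };
    case "right": return { x0: x, y0: y, x1: Math.min(screen.width, x + dx), y1: y };
    default: throw new Error(`Swipe direction "${direction}" is not supported`);
  }
}

const screen = { width: 1080, height: 2400 };
// Swiping up from y=500 with the default distance (30% of 2400 = 720)
// would overshoot the top edge, so the end point clamps to y1 = 0.
console.log(swipeEndpoints(540, 500, "up", undefined, screen));
```

The resulting pair is what the method passes to `adb shell input swipe x0 y0 x1 y1 1000`, with the trailing `1000` being the gesture duration in milliseconds.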
package/lib/ios.js CHANGED
@@ -83,6 +83,10 @@ class IosRobot {
  const wda = await this.wda();
  await wda.swipe(direction);
  }
+ async swipeFromCoordinate(x, y, direction, distance) {
+ const wda = await this.wda();
+ await wda.swipeFromCoordinate(x, y, direction, distance);
+ }
  async listApps() {
  await this.assertTunnelRunning();
  const output = await this.ios("apps", "--all", "--list");
@@ -62,6 +62,10 @@ class Simctl {
  const wda = await this.wda();
  return wda.swipe(direction);
  }
+ async swipeFromCoordinate(x, y, direction, distance) {
+ const wda = await this.wda();
+ return wda.swipeFromCoordinate(x, y, direction, distance);
+ }
  async tap(x, y) {
  const wda = await this.wda();
  return wda.tap(x, y);
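Editor's note: both `IosRobot` (physical devices) and `Simctl` (simulators) gain the same one-line `swipeFromCoordinate` forwarder, so `server.js` can call `robot.swipeFromCoordinate(...)` without knowing which backend is selected. A minimal sketch of that delegation shape, using a mock in place of the real WebDriverAgent client (`MockWda` and `SimctlLike` are illustrative names, not the package's classes):

```javascript
// Mock stand-in for the WebDriverAgent client; records calls instead of
// driving a device.
class MockWda {
  constructor() { this.calls = []; }
  async swipeFromCoordinate(x, y, direction, distance) {
    this.calls.push({ x, y, direction, distance });
  }
}

// Same forwarding shape as the Simctl/IosRobot methods in the diff:
// resolve the WDA client, then delegate the call verbatim.
class SimctlLike {
  constructor(wda) { this._wda = wda; }
  async wda() { return this._wda; }
  async swipeFromCoordinate(x, y, direction, distance) {
    const wda = await this.wda();
    return wda.swipeFromCoordinate(x, y, direction, distance);
  }
}

async function demo() {
  const wda = new MockWda();
  await new SimctlLike(wda).swipeFromCoordinate(100, 200, "left", 50);
  return wda.calls[0]; // the arguments arrive at the WDA client unchanged
}
```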
package/lib/server.js CHANGED
@@ -41,6 +41,7 @@ const createMcpServer = () => {
  tools: {},
  },
  });
+ const noParams = zod_1.z.object({});
  const tool = (name, description, paramsSchema, cb) => {
  const wrappedCb = async (args) => {
  try {
@@ -76,7 +77,9 @@ const createMcpServer = () => {
  throw new robot_1.ActionableError("No device selected. Use the mobile_use_device tool to select a device.");
  }
  };
- tool("mobile_list_available_devices", "List all available devices. This includes both physical devices and simulators. If there is more than one device returned, you need to let the user select one of them.", {}, async ({}) => {
+ tool("mobile_list_available_devices", "List all available devices. This includes both physical devices and simulators. If there is more than one device returned, you need to let the user select one of them.", {
+ noParams
+ }, async ({}) => {
  const iosManager = new ios_1.IosManager();
  const androidManager = new android_1.AndroidDeviceManager();
  const devices = simulatorManager.listBootedSimulators();
@@ -118,7 +121,9 @@ const createMcpServer = () => {
  }
  return `Selected device: ${device}`;
  });
- tool("mobile_list_apps", "List all the installed apps on the device", {}, async ({}) => {
+ tool("mobile_list_apps", "List all the installed apps on the device", {
+ noParams
+ }, async ({}) => {
  requireRobot();
  const result = await robot.listApps();
  return `Found these apps on device: ${result.map(app => `${app.appName} (${app.packageName})`).join(", ")}`;
@@ -137,7 +142,9 @@ const createMcpServer = () => {
  await robot.terminateApp(packageName);
  return `Terminated app ${packageName}`;
  });
- tool("mobile_get_screen_size", "Get the screen size of the mobile device in pixels", {}, async ({}) => {
+ tool("mobile_get_screen_size", "Get the screen size of the mobile device in pixels", {
+ noParams
+ }, async ({}) => {
  requireRobot();
  const screenSize = await robot.getScreenSize();
  return `Screen size is ${screenSize.width}x${screenSize.height} pixels`;
@@ -150,7 +157,9 @@ const createMcpServer = () => {
  await robot.tap(x, y);
  return `Clicked on screen at coordinates: ${x}, ${y}`;
  });
- tool("mobile_list_elements_on_screen", "List elements on screen and their coordinates, with display text or accessibility label. Do not cache this result.", {}, async ({}) => {
+ tool("mobile_list_elements_on_screen", "List elements on screen and their coordinates, with display text or accessibility label. Do not cache this result.", {
+ noParams
+ }, async ({}) => {
  requireRobot();
  const elements = await robot.getElementsOnScreen();
  const result = elements.map(element => {
@@ -190,11 +199,23 @@ const createMcpServer = () => {
  return `Opened URL: ${url}`;
  });
  tool("swipe_on_screen", "Swipe on the screen", {
- direction: zod_1.z.enum(["up", "down"]).describe("The direction to swipe"),
- }, async ({ direction }) => {
+ direction: zod_1.z.enum(["up", "down", "left", "right"]).describe("The direction to swipe"),
+ x: zod_1.z.number().optional().describe("The x coordinate to start the swipe from, in pixels. If not provided, uses center of screen"),
+ y: zod_1.z.number().optional().describe("The y coordinate to start the swipe from, in pixels. If not provided, uses center of screen"),
+ distance: zod_1.z.number().optional().describe("The distance to swipe in pixels. Defaults to 400 pixels for iOS or 30% of screen dimension for Android"),
+ }, async ({ direction, x, y, distance }) => {
  requireRobot();
- await robot.swipe(direction);
- return `Swiped ${direction} on screen`;
+ if (x !== undefined && y !== undefined) {
+ // Use coordinate-based swipe
+ await robot.swipeFromCoordinate(x, y, direction, distance);
+ const distanceText = distance ? ` ${distance} pixels` : "";
+ return `Swiped ${direction}${distanceText} from coordinates: ${x}, ${y}`;
+ }
+ else {
+ // Use center-based swipe
+ await robot.swipe(direction);
+ return `Swiped ${direction} on screen`;
+ }
  });
  tool("mobile_type_keys", "Type text into the focused element", {
  text: zod_1.z.string().describe("The text to type"),
@@ -207,7 +228,9 @@ const createMcpServer = () => {
  }
  return `Typed text: ${text}`;
  });
- server.tool("mobile_take_screenshot", "Take a screenshot of the mobile device. Use this to understand what's on screen, if you need to press an element that is available through view hierarchy then you must list elements on screen instead. Do not cache this result.", {}, async ({}) => {
+ server.tool("mobile_take_screenshot", "Take a screenshot of the mobile device. Use this to understand what's on screen, if you need to press an element that is available through view hierarchy then you must list elements on screen instead. Do not cache this result.", {
+ noParams
+ }, async ({}) => {
  requireRobot();
  try {
  const screenSize = await robot.getScreenSize();
@@ -251,7 +274,9 @@ const createMcpServer = () => {
  await robot.setOrientation(orientation);
  return `Changed device orientation to ${orientation}`;
  });
- tool("mobile_get_orientation", "Get the current screen orientation of the device", {}, async () => {
+ tool("mobile_get_orientation", "Get the current screen orientation of the device", {
+ noParams
+ }, async () => {
  requireRobot();
  const orientation = await robot.getOrientation();
  return `Current device orientation is ${orientation}`;
@@ -29,7 +29,14 @@ class WebDriverAgent {
  },
  body: JSON.stringify({ capabilities: { alwaysMatch: { platformName: "iOS" } } }),
  });
+ if (!response.ok) {
+ const errorText = await response.text();
+ throw new robot_1.ActionableError(`Failed to create WebDriver session: ${response.status} ${errorText}`);
+ }
  const json = await response.json();
+ if (!json.value || !json.value.sessionId) {
+ throw new robot_1.ActionableError(`Invalid session response: ${JSON.stringify(json)}`);
+ }
  return json.value.sessionId;
  }
  async deleteSession(sessionId) {
@@ -44,8 +51,8 @@ class WebDriverAgent {
  await this.deleteSession(sessionId);
  return result;
  }
- async getScreenSize() {
- return this.withinSession(async (sessionUrl) => {
+ async getScreenSize(sessionUrl) {
+ if (sessionUrl) {
  const url = `${sessionUrl}/wda/screen`;
  const response = await fetch(url);
  const json = await response.json();
@@ -54,7 +61,19 @@ class WebDriverAgent {
  height: json.value.screenSize.height,
  scale: json.value.scale || 1,
  };
- });
+ }
+ else {
+ return this.withinSession(async (sessionUrlInner) => {
+ const url = `${sessionUrlInner}/wda/screen`;
+ const response = await fetch(url);
+ const json = await response.json();
+ return {
+ width: json.value.screenSize.width,
+ height: json.value.screenSize.height,
+ scale: json.value.scale || 1,
+ };
+ });
+ }
  }
  async sendKeys(keys) {
  await this.withinSession(async (sessionUrl) => {
@@ -174,17 +193,95 @@ class WebDriverAgent {
  }
  async swipe(direction) {
  await this.withinSession(async (sessionUrl) => {
- const x0 = 200;
- let y0 = 600;
- const x1 = 200;
- let y1 = 200;
- if (direction === "up") {
- const tmp = y0;
- y0 = y1;
- y1 = tmp;
+ const screenSize = await this.getScreenSize(sessionUrl);
+ let x0, y0, x1, y1;
+ // Use 60% of the width/height for swipe distance
+ const verticalDistance = Math.floor(screenSize.height * 0.6);
+ const horizontalDistance = Math.floor(screenSize.width * 0.6);
+ const centerX = Math.floor(screenSize.width / 2);
+ const centerY = Math.floor(screenSize.height / 2);
+ switch (direction) {
+ case "up":
+ x0 = x1 = centerX;
+ y0 = centerY + Math.floor(verticalDistance / 2);
+ y1 = centerY - Math.floor(verticalDistance / 2);
+ break;
+ case "down":
+ x0 = x1 = centerX;
+ y0 = centerY - Math.floor(verticalDistance / 2);
+ y1 = centerY + Math.floor(verticalDistance / 2);
+ break;
+ case "left":
+ y0 = y1 = centerY;
+ x0 = centerX + Math.floor(horizontalDistance / 2);
+ x1 = centerX - Math.floor(horizontalDistance / 2);
+ break;
+ case "right":
+ y0 = y1 = centerY;
+ x0 = centerX - Math.floor(horizontalDistance / 2);
+ x1 = centerX + Math.floor(horizontalDistance / 2);
+ break;
+ default:
+ throw new robot_1.ActionableError(`Swipe direction "${direction}" is not supported`);
  }
  const url = `${sessionUrl}/actions`;
- await fetch(url, {
+ const response = await fetch(url, {
+ method: "POST",
+ headers: {
+ "Content-Type": "application/json",
+ },
+ body: JSON.stringify({
+ actions: [
+ {
+ type: "pointer",
+ id: "finger1",
+ parameters: { pointerType: "touch" },
+ actions: [
+ { type: "pointerMove", duration: 0, x: x0, y: y0 },
+ { type: "pointerDown", button: 0 },
+ { type: "pointerMove", duration: 1000, x: x1, y: y1 },
+ { type: "pointerUp", button: 0 }
+ ]
+ }
+ ]
+ }),
+ });
+ if (!response.ok) {
+ const errorText = await response.text();
+ throw new robot_1.ActionableError(`WebDriver actions request failed: ${response.status} ${errorText}`);
+ }
+ // Clear actions to ensure they complete
+ await fetch(`${sessionUrl}/actions`, {
+ method: "DELETE",
+ });
+ });
+ }
+ async swipeFromCoordinate(x, y, direction, distance = 400) {
+ await this.withinSession(async (sessionUrl) => {
+ // Use simple coordinates like the working swipe method
+ const x0 = x;
+ const y0 = y;
+ let x1 = x;
+ let y1 = y;
+ // Calculate target position based on direction and distance
+ switch (direction) {
+ case "up":
+ y1 = y - distance; // Move up by specified distance
+ break;
+ case "down":
+ y1 = y + distance; // Move down by specified distance
+ break;
+ case "left":
+ x1 = x - distance; // Move left by specified distance
+ break;
+ case "right":
+ x1 = x + distance; // Move right by specified distance
+ break;
+ default:
+ throw new robot_1.ActionableError(`Swipe direction "${direction}" is not supported`);
+ }
+ const url = `${sessionUrl}/actions`;
+ const response = await fetch(url, {
  method: "POST",
  headers: {
  "Content-Type": "application/json",
@@ -198,14 +295,21 @@ class WebDriverAgent {
  actions: [
  { type: "pointerMove", duration: 0, x: x0, y: y0 },
  { type: "pointerDown", button: 0 },
- { type: "pointerMove", duration: 0, x: x1, y: y1 },
- { type: "pause", duration: 1000 },
+ { type: "pointerMove", duration: 1000, x: x1, y: y1 },
  { type: "pointerUp", button: 0 }
  ]
  }
  ]
  }),
  });
+ if (!response.ok) {
+ const errorText = await response.text();
+ throw new robot_1.ActionableError(`WebDriver actions request failed: ${response.status} ${errorText}`);
+ }
+ // Clear actions to ensure they complete
+ await fetch(`${sessionUrl}/actions`, {
+ method: "DELETE",
+ });
  });
  }
  async setOrientation(orientation) {
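Editor's note: the `WebDriverAgent` hunks above drive swipes through the W3C WebDriver "Perform Actions" endpoint (`POST {session}/actions`), and the diff also fixes the gesture itself: the old payload moved the pointer with `duration: 0` and then paused, which reads as a press-and-hold rather than a drag, while the new payload animates the move over 1000 ms. A sketch of the payload these methods build (`buildSwipeActions` is an illustrative name; the JSON shape is taken directly from the diff):

```javascript
// Build the W3C actions payload for a one-finger swipe: jump to the start
// point, press down, glide to the end point over 1000 ms, then release.
function buildSwipeActions(x0, y0, x1, y1) {
  return {
    actions: [
      {
        type: "pointer",
        id: "finger1",
        parameters: { pointerType: "touch" },
        actions: [
          { type: "pointerMove", duration: 0, x: x0, y: y0 },    // position finger
          { type: "pointerDown", button: 0 },                    // touch down
          { type: "pointerMove", duration: 1000, x: x1, y: y1 }, // animated drag
          { type: "pointerUp", button: 0 },                      // release
        ],
      },
    ],
  };
}

// The diff POSTs this body to `${sessionUrl}/actions`, then issues a DELETE
// to the same URL to release the input source state afterwards.
const body = JSON.stringify(buildSwipeActions(200, 600, 200, 200));
```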
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@mobilenext/mobile-mcp",
- "version": "0.0.18",
+ "version": "0.0.19",
  "description": "Mobile MCP",
  "repository": {
  "type": "git",