@mobilenext/mobile-mcp 0.0.6

package/README.md ADDED
@@ -0,0 +1,172 @@
1
+ ## Mobile Next - MCP server for Mobile Automation
2
+
3
+ This is a [Model Context Protocol (MCP) server](https://github.com/modelcontextprotocol) that provides mobile automation capabilities powered by [Appium](https://github.com/appium).
4
+ This server allows Agents and LLMs to interact with native iOS/Android applications and devices through structured accessibility snapshots or coordinate-based taps based on screenshots.
5
+
6
+ <p align="center">
7
+ <a href="https://github.com/mobile-next/">
8
+ <img alt="mobile-mcp" src="https://github.com/mobile-next/mobile-next-assets/blob/main/mobile-mcp-banner.png?raw=true" width="600">
9
+ </a>
10
+ </p>
11
+
12
+ ### 🚀 Mobile MCP Roadmap: Building the Future of Mobile
13
+
14
+ Join us on our journey as we continuously enhance Mobile MCP!
15
+ Check out our detailed roadmap to see upcoming features, improvements, and milestones. Your feedback is invaluable in shaping the future of mobile automation.
16
+
17
+ 👉 [Explore the Roadmap](https://github.com/orgs/mobile-next/projects/1)
18
+
19
+ ### Main use cases
20
+
21
+ How we help to scale mobile automation:
22
+
23
+ - 📲 Native app automation (iOS and Android) for testing or data-entry scenarios.
24
+ - 📝 Scripted flows and form interactions without manually controlling simulators/emulators or physical devices (iPhone, Samsung, Google Pixel, etc.)
25
+ - 🧭 Automating multi-step user journeys driven by an LLM
26
+ - 👆 General-purpose mobile application interaction for agent-based frameworks
27
+ - 🤖 Agent-to-agent communication for mobile automation use cases and data extraction
28
+
29
+ ## Main Features
30
+
31
+ - 🚀 **Fast and lightweight**: Uses native accessibility trees for most interactions, or screenshot-based coordinates where a11y labels are not available.
32
+ - 🤖 **LLM-friendly**: No computer vision model is required when working from accessibility snapshots.
33
+ - 🧿 **Visual Sense**: Evaluates and analyses what’s actually rendered on screen to decide the next action. If accessibility data or view-hierarchy coordinates are unavailable, it falls back to screenshot-based analysis.
34
+ - 📊 **Deterministic tool application**: Reduces ambiguity found in purely screenshot-based approaches by relying on structured data whenever possible.
35
+ - 📺 **Extract structured data**: Enables you to extract structured data from anything visible on screen.
36
+
37
+ ## Mobile MCP Architecture
38
+
39
+ <p align="center">
40
+ <a href="/images/mobile-mcp-arch.png">
41
+ <img alt="mobile-mcp" src="https://github.com/mobile-next/mobile-next-assets/blob/main/mobile-mcp-arch.png?raw=true" width="600">
42
+ </a>
43
+ </p>
44
+
45
+
46
+
47
+ ## How to install
48
+
49
+ ```js
50
+ {
51
+ "mcpServers": {
52
+ "mobile-next": {
53
+ "command": "npx",
54
+ "args": [
55
+ "@mobile-next/appium-mcp@latest"
56
+ ]
57
+ }
58
+ }
59
+ }
60
+ ```
61
+
62
+ ## Prerequisites
63
+
64
+ What you will need to connect MCP with your agent and mobile devices:
65
+
66
+ - [Xcode command line tools](https://developer.apple.com/xcode/resources/)
67
+ - [Android Platform Tools](https://developer.android.com/tools/releases/platform-tools)
68
+ - [node.js](https://nodejs.org/en/download/)
69
+ - [MCP](https://modelcontextprotocol.io/introduction) supported foundational models or agents, like [Claude MCP](https://modelcontextprotocol.io/quickstart/server), [OpenAI Agent SDK](https://openai.github.io/openai-agents-python/mcp/), [Copilot Studio](https://www.microsoft.com/en-us/microsoft-copilot/blog/copilot-studio/introducing-model-context-protocol-mcp-in-copilot-studio-simplified-integration-with-ai-apps-and-agents/)
70
+
71
+ ### Simulators, Emulators, and Physical Devices
72
+
73
+ When launched, Appium MCP can connect to:
74
+ - iOS Simulators on macOS
75
+ - Android Emulators on Linux/Windows/macOS
76
+ - Physical iOS or Android devices (requires proper platform tools and drivers)
77
+
78
+ Make sure you have your mobile platform SDKs (Xcode, Android SDK) installed and configured properly before running Mobile Next Appium MCP.
79
+
80
+
81
+ ### Running in "headless" mode on Simulators/Emulators
82
+
83
+ When you do not have an actual phone connected, you can run Mobile Next Appium MCP with an emulator or simulator in the background.
84
+
85
+ For example, on Android:
86
+ 1. Start an emulator (avdmanager / emulator command).
87
+ 2. Run Appium MCP with the desired flags
88
+
89
+ On iOS, you'll need Xcode installed and a Simulator booted before using Appium MCP with that simulator instance:
90
+ `xcrun simctl list`
91
+ `xcrun simctl boot "iPhone 16"`
92
+
93
+
94
+ # Mobile Commands and interaction tools
95
+
96
+ These tools use accessibility-based element references on iOS or Android. By relying on the accessibility/automation IDs, you avoid the ambiguity of coordinate-based approaches.
97
+
98
+ ## mobile_install_app
99
+ - **Description:** Installs an app onto the device/emulator
100
+ - **Parameters:**
101
+ - `appPath` (string): Path or URL to the app file (e.g., .apk for Android, .ipa/.app for iOS)
102
+
103
+ ## mobile_launch_app
104
+ - **Description:** Launches the specified app on the device/emulator
105
+ - **Parameters:**
106
+ - `bundleId` (string): The application's unique bundle/package identifier (e.g., com.google.android.keep or com.apple.mobilenotes)
107
+
108
+ ## mobile_terminate_app
109
+ - **Description:** Terminates a running application
110
+ - **Parameters:**
111
+ - `bundleId` (string): The application's bundle/package identifier
112
+
113
+ ## mobile_element_tap
114
+ - **Description:** Taps on a UI element identified by accessibility locator
115
+ - **Parameters:**
116
+ - `element` (string): Human-readable element description (e.g., "Login button")
117
+ - `ref` (string): Accessibility/automation ID or reference from a snapshot
118
+
119
+ ## mobile_tap
120
+ - **Description:** Taps on specified screen coordinates
121
+ - **Parameters:**
122
+ - `x` (number): X-coordinate
123
+ - `y` (number): Y-coordinate
124
+
125
+ ## mobile_element_send_keys
126
+ - **Description:** Types text into a UI element (e.g., TextField)
127
+ - **Parameters:**
128
+ - `element` (string): Human-readable element description
129
+ - `ref` (string): Accessibility/automation ID of the element
130
+ - `text` (string): Text to type
131
+ - `submit` (boolean): Whether to press Enter/Return after typing
132
+
133
+ ## mobile_element_swipe
134
+ - **Description:** Performs a swipe gesture from one UI element to another
135
+ - **Parameters:**
136
+ - `startElement` (string): Human-readable description of the start element
137
+ - `startRef` (string): Accessibility/automation ID of the start element
138
+ - `endElement` (string): Human-readable description of the end element
139
+ - `endRef` (string): Accessibility/automation ID of the end element
140
+
141
+ ## mobile_swipe
142
+ - **Description:** Performs a swipe gesture between two sets of screen coordinates
143
+ - **Parameters:**
144
+ - `startX` (number): Start X-coordinate
145
+ - `startY` (number): Start Y-coordinate
146
+ - `endX` (number): End X-coordinate
147
+ - `endY` (number): End Y-coordinate
148
+
149
+ ## mobile_press_key
150
+ - **Description:** Presses hardware keys or triggers special events (e.g., back button on Android)
151
+ - **Parameters:**
152
+ - `key` (string): Key identifier (e.g., HOME, BACK, VOLUME_UP, etc.)
153
+
154
+ ## mobile_take_screenshot
155
+ - **Description:** Captures a screenshot of the current device screen
156
+ - **Parameters:**
157
+ - `raw` (boolean): Return a lossless image if true; otherwise, compressed by default
158
+
159
+ ## mobile_get_source
160
+ - **Description:** Fetches the current device UI structure (accessibility snapshot, in XML format)
161
+ - **Parameters:** None
162
+
163
+ ## mobile_wait
164
+ - **Description:** Waits for a specified time
165
+ - **Parameters:**
166
+ - `time` (number): Time to wait in seconds (capped at 10 seconds)
167
+
168
+ ## mobile_close_session
169
+ - **Description:** Closes the current Appium session
170
+ - **Parameters:** None
171
+
172
+
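
For orientation, here is a minimal sketch of the JSON-RPC 2.0 message an MCP client sends to invoke one of the tools listed in the README. The helper name `buildToolCall` and the request `id` are illustrative assumptions and are not part of this package; the `tools/call` method with `name` and `arguments` fields follows the MCP specification. In practice an MCP client library builds this message for you.

```javascript
// Hypothetical helper: builds the JSON-RPC 2.0 request an MCP client
// sends to invoke a tool. Not part of this package.
function buildToolCall(id, toolName, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Example: tap at screen coordinates using the README's `mobile_tap` tool.
const request = buildToolCall(1, "mobile_tap", { x: 120, y: 480 });
console.log(JSON.stringify(request));
```

Note that `lib/server.js` in this version registers differently named tools (for example `click-on-screen-at-coordinates`, which takes `x` and `y` between 0 and 1), so the tool name and arguments must match what the running server actually exposes.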
package/lib/android.js ADDED
@@ -0,0 +1,171 @@
1
+ "use strict";
2
+ var __createBinding = (this && this.__createBinding) || (Object.create ? (function(o, m, k, k2) {
3
+ if (k2 === undefined) k2 = k;
4
+ var desc = Object.getOwnPropertyDescriptor(m, k);
5
+ if (!desc || ("get" in desc ? !m.__esModule : desc.writable || desc.configurable)) {
6
+ desc = { enumerable: true, get: function() { return m[k]; } };
7
+ }
8
+ Object.defineProperty(o, k2, desc);
9
+ }) : (function(o, m, k, k2) {
10
+ if (k2 === undefined) k2 = k;
11
+ o[k2] = m[k];
12
+ }));
13
+ var __setModuleDefault = (this && this.__setModuleDefault) || (Object.create ? (function(o, v) {
14
+ Object.defineProperty(o, "default", { enumerable: true, value: v });
15
+ }) : function(o, v) {
16
+ o["default"] = v;
17
+ });
18
+ var __importStar = (this && this.__importStar) || (function () {
19
+ var ownKeys = function(o) {
20
+ ownKeys = Object.getOwnPropertyNames || function (o) {
21
+ var ar = [];
22
+ for (var k in o) if (Object.prototype.hasOwnProperty.call(o, k)) ar[ar.length] = k;
23
+ return ar;
24
+ };
25
+ return ownKeys(o);
26
+ };
27
+ return function (mod) {
28
+ if (mod && mod.__esModule) return mod;
29
+ var result = {};
30
+ if (mod != null) for (var k = ownKeys(mod), i = 0; i < k.length; i++) if (k[i] !== "default") __createBinding(result, mod, k[i]);
31
+ __setModuleDefault(result, mod);
32
+ return result;
33
+ };
34
+ })();
35
+ Object.defineProperty(exports, "__esModule", { value: true });
36
+ exports.listApps = exports.takeScreenshot = exports.swipe = exports.getElementsOnScreen = exports.getScreenSize = exports.resolveLaunchableActivities = exports.getConnectedDevices = void 0;
37
+ const child_process_1 = require("child_process");
38
+ const xml = __importStar(require("fast-xml-parser"));
39
+ const fs_1 = require("fs");
40
+ const getConnectedDevices = () => {
41
+ return (0, child_process_1.execSync)(`adb devices`)
42
+ .toString()
43
+ .split("\n")
44
+ .filter(line => !line.startsWith("List of devices attached"))
45
+ .filter(line => line.trim() !== "");
46
+ };
47
+ exports.getConnectedDevices = getConnectedDevices;
48
+ const resolveLaunchableActivities = (packageName) => {
49
+ return (0, child_process_1.execSync)(`adb shell cmd package resolve-activity ${packageName}`)
50
+ .toString()
51
+ .split("\n")
52
+ .map(line => line.trim())
53
+ .filter(line => line.startsWith("name="))
54
+ .map(line => line.substring("name=".length));
55
+ };
56
+ exports.resolveLaunchableActivities = resolveLaunchableActivities;
57
+ const getScreenSize = () => {
58
+ const screenSize = (0, child_process_1.execSync)("adb shell wm size")
59
+ .toString()
60
+ .split(" ")
61
+ .pop();
62
+ if (!screenSize) {
63
+ throw new Error("Failed to get screen size");
64
+ }
65
+ const [width, height] = screenSize.split("x").map(Number);
66
+ return [width, height];
67
+ };
68
+ exports.getScreenSize = getScreenSize;
69
+ const collectElements = (node, screenSize) => {
70
+ const elements = [];
71
+ const getCoordinates = (element) => {
72
+ const bounds = String(element.bounds);
73
+ const [, left, top, right, bottom] = bounds.match(/^\[(\d+),(\d+)\]\[(\d+),(\d+)\]$/)?.map(Number) || [];
74
+ return { left, top, right, bottom };
75
+ };
76
+ const getCenter = (coordinates) => {
77
+ return {
78
+ x: Math.floor((coordinates.left + coordinates.right) / 2),
79
+ y: Math.floor((coordinates.top + coordinates.bottom) / 2),
80
+ };
81
+ };
82
+ const normalizeCoordinates = (coordinates, screenSize) => {
83
+ return {
84
+ x: Number((coordinates.x / screenSize[0]).toFixed(3)),
85
+ y: Number((coordinates.y / screenSize[1]).toFixed(3)),
86
+ };
87
+ };
88
+ if (node.node) {
89
+ if (Array.isArray(node.node)) {
90
+ for (const childNode of node.node) {
91
+ elements.push(...collectElements(childNode, screenSize));
92
+ }
93
+ }
94
+ else {
95
+ elements.push(...collectElements(node.node, screenSize));
96
+ }
97
+ }
98
+ if (node.text) {
99
+ elements.push({
100
+ "text": node.text,
101
+ "coordinates": normalizeCoordinates(getCenter(getCoordinates(node)), screenSize),
102
+ });
103
+ }
104
+ if (node["content-desc"]) {
105
+ elements.push({
106
+ "text": node["content-desc"],
107
+ "coordinates": normalizeCoordinates(getCenter(getCoordinates(node)), screenSize),
108
+ });
109
+ }
110
+ return elements;
111
+ };
112
+ const getElementsOnScreen = () => {
113
+ const dump = (0, child_process_1.execSync)(`adb exec-out uiautomator dump /dev/tty`);
114
+ const parser = new xml.XMLParser({
115
+ ignoreAttributes: false,
116
+ attributeNamePrefix: ""
117
+ });
118
+ const parsedXml = parser.parse(dump);
119
+ const hierarchy = parsedXml.hierarchy;
120
+ const screenSize = (0, exports.getScreenSize)();
121
+ const elements = collectElements(hierarchy, screenSize);
122
+ return elements;
123
+ };
124
+ exports.getElementsOnScreen = getElementsOnScreen;
125
+ const swipe = (direction) => {
126
+ const screenSize = (0, exports.getScreenSize)();
127
+ const centerX = screenSize[0] >> 1;
128
+ // const centerY = screenSize[1] >> 1;
129
+ let x0, y0, x1, y1;
130
+ switch (direction) {
131
+ case "down":
132
+ x0 = x1 = centerX;
133
+ y0 = Math.floor(screenSize[1] * 0.80);
134
+ y1 = Math.floor(screenSize[1] * 0.20);
135
+ break;
136
+ case "up":
137
+ x0 = x1 = centerX;
138
+ y0 = Math.floor(screenSize[1] * 0.20);
139
+ y1 = Math.floor(screenSize[1] * 0.80);
140
+ break;
141
+ default:
142
+ throw new Error(`Swipe direction "${direction}" is not supported`);
143
+ }
144
+ (0, child_process_1.execSync)(`adb shell input swipe ${x0} ${y0} ${x1} ${y1} 1000`);
145
+ };
146
+ exports.swipe = swipe;
147
+ const takeScreenshot = async () => {
148
+ const randomFilename = `screenshot-${Date.now()}.png`;
149
+ // take screenshot and save on device
150
+ const remoteFilename = `/sdcard/Download/${randomFilename}`;
151
+ (0, child_process_1.execSync)(`adb shell screencap -p ${remoteFilename}`);
152
+ // pull the file locally
153
+ const localFilename = `/tmp/${randomFilename}`;
154
+ (0, child_process_1.execSync)(`adb pull ${remoteFilename} ${localFilename}`);
155
+ (0, child_process_1.execSync)(`adb shell rm ${remoteFilename}`);
156
+ const screenshot = (0, fs_1.readFileSync)(localFilename);
157
+ (0, fs_1.unlinkSync)(localFilename);
158
+ return screenshot;
159
+ };
160
+ exports.takeScreenshot = takeScreenshot;
161
+ const listApps = () => {
162
+ const result = (0, child_process_1.execSync)(`adb shell cmd package query-activities -a android.intent.action.MAIN -c android.intent.category.LAUNCHER`)
163
+ .toString()
164
+ .split("\n")
165
+ .map(line => line.trim())
166
+ .filter(line => line.startsWith("packageName="))
167
+ .map(line => line.substring("packageName=".length))
168
+ .filter((value, index, self) => self.indexOf(value) === index);
169
+ return result;
170
+ };
171
+ exports.listApps = listApps;
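
The coordinate handling in `collectElements` can be hard to follow in compiled form. The sketch below isolates it: parse a uiautomator `bounds` string, take the center, and normalize to the 0..1 range that `click-on-screen-at-coordinates` later multiplies back into pixels. The function names are illustrative, and unlike the code above (which produces `NaN` coordinates when `bounds` does not match), this version returns `null` for an unparseable value; that difference is an assumption about desirable behavior, not something the package does.

```javascript
// Parse a uiautomator bounds string such as "[0,100][200,300]".
// Returns null when the format is unexpected.
function parseBounds(bounds) {
  const match = /^\[(\d+),(\d+)\]\[(\d+),(\d+)\]$/.exec(String(bounds));
  if (!match) return null;
  const [left, top, right, bottom] = match.slice(1).map(Number);
  return { left, top, right, bottom };
}

// Center of the rectangle, normalized to the 0..1 range used by the tools.
function normalizedCenter(bounds, [width, height]) {
  const rect = parseBounds(bounds);
  if (!rect) return null;
  const x = Math.floor((rect.left + rect.right) / 2);
  const y = Math.floor((rect.top + rect.bottom) / 2);
  return {
    x: Number((x / width).toFixed(3)),
    y: Number((y / height).toFixed(3)),
  };
}

// Convert a normalized point back to device pixels, as the tap tool does.
function toPixels(point, [width, height]) {
  return { x: Math.floor(width * point.x), y: Math.floor(height * point.y) };
}

const screen = [1000, 2000];
const center = normalizedCenter("[0,100][200,300]", screen);
console.log(center, toPixels(center, screen));
```

Because the normalized value is rounded to three decimals, the round trip back to pixels can land a pixel or two away from the true center on high-resolution screens, which is normally harmless for taps.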
package/lib/index.js ADDED
@@ -0,0 +1,16 @@
1
+ "use strict";
2
+ Object.defineProperty(exports, "__esModule", { value: true });
3
+ const stdio_js_1 = require("@modelcontextprotocol/sdk/server/stdio.js");
4
+ const server_1 = require("./server");
5
+ const logger_1 = require("./logger");
6
+ async function main() {
7
+ const transport = new stdio_js_1.StdioServerTransport();
8
+ const server = (0, server_1.createMcpServer)();
9
+ await server.connect(transport);
10
+ (0, logger_1.error)("Appium MCP Server running on stdio");
11
+ }
12
+ main().catch(err => {
13
+ console.error("Fatal error in main():", err);
14
+ (0, logger_1.error)("Fatal error in main(): " + JSON.stringify(err.stack));
15
+ process.exit(1);
16
+ });
@@ -0,0 +1,40 @@
1
+ "use strict";
2
+ Object.defineProperty(exports, "__esModule", { value: true });
3
+ exports.listApps = exports.launchApp = exports.openUrl = exports.getScreenshot = exports.getConnectedDevices = void 0;
4
+ const child_process_1 = require("child_process");
5
+ const getConnectedDevices = () => {
6
+ return (0, child_process_1.execSync)(`xcrun simctl list devices`)
7
+ .toString()
8
+ .split("\n")
9
+ .map(line => {
10
+ // extract device name and UUID from the line
11
+ const match = line.match(/(.*?)\s+\(([\w-]+)\)\s+\(Booted\)/);
12
+ if (!match) {
13
+ return null;
14
+ }
15
+ const deviceName = match[1].trim();
16
+ const deviceUuid = match[2];
17
+ return {
18
+ name: deviceName,
19
+ uuid: deviceUuid,
20
+ };
21
+ })
22
+ .filter(line => line !== null);
23
+ };
24
+ exports.getConnectedDevices = getConnectedDevices;
25
+ const getScreenshot = (simulatorUuid) => {
26
+ return (0, child_process_1.execSync)(`xcrun simctl io "${simulatorUuid}" screenshot -`);
27
+ };
28
+ exports.getScreenshot = getScreenshot;
29
+ const openUrl = (simulatorUuid, url) => {
30
+ return (0, child_process_1.execSync)(`xcrun simctl openurl "${simulatorUuid}" "${url}"`);
31
+ };
32
+ exports.openUrl = openUrl;
33
+ const launchApp = (simulatorUuid, packageName) => {
34
+ return (0, child_process_1.execSync)(`xcrun simctl launch "${simulatorUuid}" "${packageName}"`);
35
+ };
36
+ exports.launchApp = launchApp;
37
+ const listApps = (simulatorUuid) => {
38
+ return (0, child_process_1.execSync)(`xcrun simctl list apps "${simulatorUuid}"`);
39
+ };
40
+ exports.listApps = listApps;
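
The simulator lookup above depends on one regular expression applied to each line of `xcrun simctl list devices`. The sketch below runs the same pattern against a made-up sample so the matching behavior is easier to see; the function name, the sample text, and the placeholder UUIDs are assumptions for illustration only.

```javascript
// Extract booted simulators from `xcrun simctl list devices` text output.
function parseBootedSimulators(output) {
  const devices = [];
  for (const line of output.split("\n")) {
    // Same pattern as the code above: name, (UUID), then the literal (Booted).
    const match = line.match(/(.*?)\s+\(([\w-]+)\)\s+\(Booted\)/);
    if (match) {
      devices.push({ name: match[1].trim(), uuid: match[2] });
    }
  }
  return devices;
}

// Made-up sample output; the UUIDs are placeholders.
const sample = [
  "== Devices ==",
  "-- iOS 18.0 --",
  "    iPhone 16 (11111111-2222-3333-4444-555555555555) (Booted)",
  "    iPhone 16 Pro (AAAAAAAA-BBBB-CCCC-DDDD-EEEEEEEEEEEE) (Shutdown)",
].join("\n");

console.log(parseBootedSimulators(sample));
```

Only lines ending in `(Booted)` are kept, so shut-down simulators and section headers are skipped without a separate filter step.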
package/lib/logger.js ADDED
@@ -0,0 +1,22 @@
1
+ "use strict";
2
+ Object.defineProperty(exports, "__esModule", { value: true });
3
+ exports.error = exports.trace = void 0;
4
+ const fs_1 = require("fs");
5
+ const writeLog = (message) => {
6
+ if (process.env.LOG_FILE) {
7
+ const logfile = process.env.LOG_FILE;
8
+ const timestamp = new Date().toISOString();
9
+ const levelStr = "INFO";
10
+ const logMessage = `[${timestamp}] ${levelStr} ${message}`;
11
+ (0, fs_1.appendFileSync)(logfile, logMessage + "\n");
12
+ }
13
+ console.error(message);
14
+ };
15
+ const trace = (message) => {
16
+ writeLog(message);
17
+ };
18
+ exports.trace = trace;
19
+ const error = (message) => {
20
+ writeLog(message);
21
+ };
22
+ exports.error = error;
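
In the logger above, both `trace` and `error` go through `writeLog`, which labels every line `INFO`. A minimal sketch of a level-aware variant follows, assuming one wanted the level to appear in the log file; `formatLogLine`, `createLogger`, and the `sink` callback are hypothetical names, not part of the package.

```javascript
// Format one log line; the level is passed in rather than hard-coded.
function formatLogLine(level, message, now = new Date()) {
  return `[${now.toISOString()}] ${level} ${message}`;
}

// Hypothetical factory: `sink` receives each formatted line (for example a
// function that appends to the file named by LOG_FILE).
function createLogger(sink) {
  return {
    trace: message => sink(formatLogLine("TRACE", message)),
    error: message => sink(formatLogLine("ERROR", message)),
  };
}

const lines = [];
const logger = createLogger(line => lines.push(line));
logger.error("adb not found");
console.log(lines[0]);
```

Passing the sink in keeps the formatting testable without touching the file system. As in the original, an MCP server running over stdio should keep diagnostics off stdout, since that stream carries protocol messages.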
package/lib/server.js ADDED
@@ -0,0 +1,138 @@
1
+ "use strict";
2
+ var __importDefault = (this && this.__importDefault) || function (mod) {
3
+ return (mod && mod.__esModule) ? mod : { "default": mod };
4
+ };
5
+ Object.defineProperty(exports, "__esModule", { value: true });
6
+ exports.createMcpServer = void 0;
7
+ const mcp_js_1 = require("@modelcontextprotocol/sdk/server/mcp.js");
8
+ const child_process_1 = require("child_process");
9
+ const logger_1 = require("./logger");
10
+ const zod_1 = require("zod");
11
+ const android_1 = require("./android");
12
+ const sharp_1 = __importDefault(require("sharp"));
13
+ const getAgentVersion = () => {
14
+ const json = require("../package.json");
15
+ return json.version;
16
+ };
17
+ const createMcpServer = () => {
18
+ const server = new mcp_js_1.McpServer({
19
+ name: "appium-mcp",
20
+ version: getAgentVersion(),
21
+ capabilities: {
22
+ resources: {},
23
+ tools: {},
24
+ },
25
+ });
26
+ const tool = (name, description, paramsSchema, cb) => {
27
+ const wrappedCb = async (args) => {
28
+ try {
29
+ (0, logger_1.trace)(`Invoking ${name} with args: ${JSON.stringify(args)}`);
30
+ const response = await cb(args);
31
+ (0, logger_1.trace)(`=> ${response}`);
32
+ return {
33
+ content: [{ type: "text", text: response }],
34
+ };
35
+ }
36
+ catch (error) {
37
+ (0, logger_1.trace)(`Tool '${description}' failed: ${error.message} stack: ${error.stack}`);
38
+ return {
39
+ content: [{ type: "text", text: `Error: ${error.message}` }],
40
+ isError: true,
41
+ };
42
+ }
43
+ };
44
+ server.tool(name, description, paramsSchema, args => wrappedCb(args));
45
+ };
46
+ tool("list-apps-on-device", "List all apps on device", {}, async ({}) => {
47
+ /*
48
+ const result = execSync(`adb shell pm list packages`)
49
+ .toString()
50
+ .split("\n")
51
+ .filter(line => line.startsWith("package:"))
52
+ .map(line => line.substring("package:".length));
53
+ */
54
+ const result = (0, android_1.listApps)();
55
+ return `Found these packages on device: ${result.join(",")}`;
56
+ });
57
+ tool("launch-app", "Launch an app on mobile device", {
58
+ packageName: zod_1.z.string().describe("The package name of the app to launch"),
59
+ }, async ({ packageName }) => {
60
+ (0, child_process_1.execSync)(`adb shell monkey -p "${packageName}" -c android.intent.category.LAUNCHER 1`);
61
+ return `Launched app ${packageName}`;
62
+ });
63
+ tool("get-screen-size", "Get the screen size of the mobile device in pixels", {}, async ({}) => {
64
+ const screenSize = (0, android_1.getScreenSize)();
65
+ return `Screen size is ${screenSize[0]}x${screenSize[1]} pixels`;
66
+ });
67
+ tool("click-on-screen-at-coordinates", "Click on the screen at given x,y coordinates", {
68
+ x: zod_1.z.number().describe("The x coordinate to click between 0 and 1"),
69
+ y: zod_1.z.number().describe("The y coordinate to click between 0 and 1"),
70
+ }, async ({ x, y }) => {
71
+ const screenSize = (0, android_1.getScreenSize)();
72
+ const x0 = Math.floor(screenSize[0] * x);
73
+ const y0 = Math.floor(screenSize[1] * y);
74
+ (0, child_process_1.execSync)(`adb shell input tap ${x0} ${y0}`);
75
+ return `Clicked on screen at coordinates: ${x}, ${y}`;
76
+ });
77
+ tool("list-elements-on-screen", "List elements on screen and their coordinates, based on text or accessibility label", {}, async ({}) => {
78
+ const elements = (0, android_1.getElementsOnScreen)();
79
+ return `Found these elements on screen: ${JSON.stringify(elements)}`;
80
+ });
81
+ tool("press-button", "Press a button on device", {
82
+ button: zod_1.z.string().describe("The button to press. Supported buttons: KEYCODE_BACK, KEYCODE_HOME, KEYCODE_MENU, KEYCODE_VOLUME_UP, KEYCODE_VOLUME_DOWN, KEYCODE_ENTER"),
83
+ }, async ({ button }) => {
84
+ (0, child_process_1.execSync)(`adb shell input keyevent ${button}`);
85
+ return `Pressed the button: ${button}`;
86
+ });
87
+ tool("open-url", "Open a URL in browser on device", {
88
+ url: zod_1.z.string().describe("The URL to open"),
89
+ }, async ({ url }) => {
90
+ (0, child_process_1.execSync)(`adb shell am start -a android.intent.action.VIEW -d "${url}"`);
91
+ return `Opened URL: ${url}`;
92
+ });
93
+ tool("swipe-on-screen", "Swipe on the screen", {
94
+ direction: zod_1.z.enum(["up", "down"]).describe("The direction to swipe"),
95
+ }, async ({ direction }) => {
96
+ (0, android_1.swipe)(direction);
97
+ return `Swiped ${direction} on screen`;
98
+ });
99
+ tool("type-text", "Type text into the focused element", {
100
+ text: zod_1.z.string().describe("The text to type"),
101
+ }, async ({ text }) => {
102
+ const _text = text.replace(/ /g, "\\ ");
103
+ (0, child_process_1.execSync)(`adb shell input text "${_text}"`);
104
+ return `Typed text: ${text}`;
105
+ });
106
+ server.tool("take-device-screenshot", "Take a screenshot of the mobile device", {}, async ({}) => {
107
+ try {
108
+ const screenshot = await (0, android_1.takeScreenshot)();
109
+ // Scale down the screenshot by 50%
110
+ const image = (0, sharp_1.default)(screenshot);
111
+ const metadata = await image.metadata();
112
+ if (!metadata.width) {
113
+ throw new Error("Failed to get screenshot metadata");
114
+ }
115
+ const resizedScreenshot = await image
116
+ .resize(Math.floor(metadata.width / 2))
117
+ .jpeg({ quality: 75 })
118
+ .toBuffer();
119
+ // debug:
120
+ // writeFileSync('/tmp/screenshot.png', screenshot);
121
+ // writeFileSync('/tmp/screenshot-scaled.jpg', resizedScreenshot);
122
+ const screenshot64 = resizedScreenshot.toString("base64");
123
+ (0, logger_1.trace)(`Screenshot taken: ${screenshot.length} bytes`);
124
+ return {
125
+ content: [{ type: "image", data: screenshot64, mimeType: "image/jpeg" }]
126
+ };
127
+ }
128
+ catch (err) {
129
+ (0, logger_1.error)(`Error taking screenshot: ${err.message} ${err.stack}`);
130
+ return {
131
+ content: [{ type: "text", text: `Error: ${err.message}` }],
132
+ isError: true,
133
+ };
134
+ }
135
+ });
136
+ return server;
137
+ };
138
+ exports.createMcpServer = createMcpServer;
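
Several handlers above interpolate tool arguments directly into a command string passed to `execSync`, so a value containing quotes or shell metacharacters could change the command that runs. One possible mitigation, sketched below under stated assumptions, is to validate inputs and build an argument array instead of a string. The helper names and the validation rules are illustrative and are not the package's behavior.

```javascript
// Android package names: dot-separated segments of letters, digits and
// underscores, each starting with a letter, with at least two segments.
const PACKAGE_NAME = /^[A-Za-z][A-Za-z0-9_]*(\.[A-Za-z][A-Za-z0-9_]*)+$/;

// Hypothetical helper: argv for launching an app, rejecting unexpected input.
function buildLaunchArgs(packageName) {
  if (!PACKAGE_NAME.test(packageName)) {
    throw new Error(`Invalid package name: ${packageName}`);
  }
  return ["shell", "monkey", "-p", packageName,
    "-c", "android.intent.category.LAUNCHER", "1"];
}

// Hypothetical helper: argv for a tap, with x and y given in the 0..1 range.
function buildTapArgs([width, height], x, y) {
  for (const value of [x, y]) {
    if (!Number.isFinite(value) || value < 0 || value > 1) {
      throw new Error(`Coordinate out of range: ${value}`);
    }
  }
  return ["shell", "input", "tap",
    String(Math.floor(width * x)), String(Math.floor(height * y))];
}

console.log(buildLaunchArgs("com.google.android.keep"));
console.log(buildTapArgs([1080, 2400], 0.5, 0.5));
// The arrays could then be passed to child_process.execFileSync("adb", args).
```

`execFileSync` does not spawn a host shell, which removes one layer of parsing. The device-side shell started by `adb shell` still interprets its arguments, which is why the sketch validates the package name rather than relying on the array form alone.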
package/package.json ADDED
@@ -0,0 +1,55 @@
1
+ {
2
+ "name": "@mobilenext/mobile-mcp",
3
+ "version": "0.0.6",
4
+ "description": "Mobile MCP",
5
+ "repository": {
6
+ "type": "git",
7
+ "url": "git+https://github.com/mobile-next/mobile-mcp.git"
8
+ },
9
+ "engines": {
10
+ "node": ">=18"
11
+ },
12
+ "license": "Apache-2.0",
13
+ "scripts": {
14
+ "build": "tsc",
15
+ "lint": "eslint .",
16
+ "watch": "tsc --watch",
17
+ "clean": "rm -rf lib"
18
+ },
19
+ "exports": {
20
+ "./package.json": "./package.json",
21
+ ".": {
22
+ "types": "./index.d.ts",
23
+ "default": "./index.js"
24
+ }
25
+ },
26
+ "dependencies": {
27
+ "@modelcontextprotocol/sdk": "^1.6.1",
28
+ "fast-xml-parser": "^5.0.9",
29
+ "sharp": "^0.33.5",
30
+ "zod-to-json-schema": "^3.24.4"
31
+ },
32
+ "devDependencies": {
33
+ "@eslint/eslintrc": "^3.2.0",
34
+ "@eslint/js": "^9.19.0",
35
+ "@stylistic/eslint-plugin": "^3.0.1",
36
+ "@types/node": "^22.13.10",
37
+ "@typescript-eslint/eslint-plugin": "^8.28.0",
38
+ "@typescript-eslint/parser": "^8.26.1",
39
+ "@typescript-eslint/utils": "^8.26.1",
40
+ "eslint": "^9.19.0",
41
+ "eslint-plugin": "^1.0.1",
42
+ "eslint-plugin-import": "^2.31.0",
43
+ "eslint-plugin-notice": "^1.0.0",
44
+ "typescript": "^5.8.2"
45
+ },
46
+ "main": "index.js",
47
+ "directories": {
48
+ "lib": "lib"
49
+ },
50
+ "author": "",
51
+ "bugs": {
52
+ "url": "https://github.com/mobile-next/mobile-mcp/issues"
53
+ },
54
+ "homepage": "https://github.com/mobile-next/mobile-mcp#readme"
55
+ }