mobai-mcp 2.0.0 → 2.2.0

package/README.md CHANGED
@@ -3,24 +3,27 @@
3
3
  [![npm version](https://badge.fury.io/js/mobai-mcp.svg)](https://www.npmjs.com/package/mobai-mcp)
4
4
  [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
5
5
 
6
- MCP (Model Context Protocol) server for [MobAI](https://mobai.run) - AI-powered mobile device automation. This server enables AI coding assistants like Cursor, Windsurf, Cline, and other MCP-compatible tools to control Android and iOS devices, emulators, and simulators.
6
+ MCP (Model Context Protocol) server for [MobAI](https://mobai.run) - AI-powered mobile device automation. Lets AI assistants (Claude Code, Cursor, Windsurf, Cline, and other MCP-compatible tools) control Android and iOS devices, emulators, and simulators via a single DSL-first interface.
7
7
 
8
- ## Features
8
+ ## How it works
9
9
 
10
- - **Device Control**: List, connect, and manage Android/iOS devices
11
- - **UI Automation**: Tap, type, swipe, and interact with native apps
12
- - **Web Automation**: Control Safari/Chrome and WebViews with CSS selectors
13
- - **DSL Batch Execution**: Execute multiple automation steps efficiently
14
- - **AI Agent**: Run autonomous agents to complete complex tasks
15
- - **Screenshot Capture**: Capture and save device screenshots
10
+ All device interaction is batched through one primary tool: **`execute_dsl`**. Instead of exposing dozens of fine-grained tools (tap, swipe, type…), the server accepts a JSON script describing a sequence of actions with predicates, assertions, waits, and conditional branches. This keeps round-trips low and encodes retry/failure strategies server-side.
11
+
12
+ A small set of companion tools handles device discovery, screenshots, app management, and running `.mob` test files.
16
13
 
17
14
  ## Prerequisites
18
15
 
19
16
  - Node.js 18+
20
- - [MobAI desktop app](https://mobai.run) running locally (provides the HTTP API on port 8686)
21
- - Connected Android or iOS device (or emulator/simulator)
17
+ - [MobAI desktop app](https://mobai.run) running locally (HTTP API on `127.0.0.1:8686`)
18
+ - A connected Android or iOS device, emulator, or simulator
19
+
20
+ ## Installation
21
+
22
+ ### Claude Code
22
23
 
23
- ## Installation & Configuration
24
+ ```bash
25
+ claude mcp add mobai -- npx -y mobai-mcp
26
+ ```
24
27
 
25
28
  ### Cursor
26
29
 
@@ -39,7 +42,7 @@ Add to `.cursor/mcp.json`:
39
42
 
40
43
  ### Claude Desktop
41
44
 
42
- Add to Claude Desktop config (`~/Library/Application Support/Claude/claude_desktop_config.json` on macOS):
45
+ Add to `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS):
43
46
 
44
47
  ```json
45
48
  {
@@ -52,146 +55,107 @@ Add to Claude Desktop config (`~/Library/Application Support/Claude/claude_deskt
52
55
  }
53
56
  ```
54
57
 
55
- ### Windsurf
58
+ ### Windsurf / Cline / other MCP clients
56
59
 
57
- Add to Windsurf MCP config:
60
+ The server speaks stdio — use your client's generic MCP configuration:
58
61
 
59
62
  ```json
60
63
  {
61
- "mcpServers": {
62
- "mobai": {
63
- "command": "npx",
64
- "args": ["-y", "mobai-mcp"]
65
- }
66
- }
64
+ "command": "npx",
65
+ "args": ["-y", "mobai-mcp"]
67
66
  }
68
67
  ```
69
68
 
70
- ### Cline / Other MCP Clients
69
+ ## Tools
71
70
 
72
- Configure according to your client's MCP server setup. The server uses stdio transport.
71
+ ### Device management
73
72
 
74
- ```json
75
- {
76
- "command": "npx",
77
- "args": ["-y", "mobai-mcp"]
78
- }
79
- ```
73
+ | Tool | Description |
74
+ |---|---|
75
+ | `list_devices` | List all connected Android and iOS devices |
76
+ | `get_device` | Get details about a specific device |
77
+ | `start_bridge` | Start the automation bridge on a device (required before interaction) |
78
+ | `stop_bridge` | Stop the automation bridge |
80
79
 
81
- ## Available Tools
80
+ ### Screenshots
82
81
 
83
- ### Device Management
84
- - `list_devices` - List all connected devices
85
- - `get_device` - Get device information
86
- - `start_bridge` - Start on-device automation bridge
87
- - `stop_bridge` - Stop automation bridge
82
+ | Tool | Description |
83
+ |---|---|
84
+ | `get_screenshot` | Fast, low-quality screenshot for LLM visual analysis (may be downscaled; response includes scale factor) |
85
+ | `save_screenshot` | Full-quality PNG to disk for reporting, debugging, or sharing |
88
86
 
89
- ### UI Automation
90
- - `get_screenshot` - Capture device screenshot
91
- - `get_ui_tree` - Get accessibility tree (supports text_regex and bounds filtering)
92
- - `tap` - Tap element by index or coordinates
93
- - `type_text` - Type text
94
- - `swipe` - Perform swipe gesture
95
- - `go_home` - Navigate to home screen
96
- - `launch_app` - Launch app by bundle ID
97
- - `list_apps` - List installed apps
87
+ ### Apps
98
88
 
99
- ### DSL Execution
100
- - `execute_dsl` - Execute batch automation script
89
+ | Tool | Description |
90
+ |---|---|
91
+ | `list_apps` | List installed apps on the device |
92
+ | `install_app` | Install an `.apk` or `.ipa` from a local file path |
93
+ | `uninstall_app` | Uninstall an app by bundle ID / package name |
94
+ | `debug_app` | Launch an app in debug mode and write stdout/stderr to a log file |
101
95
 
102
- ### AI Agent
103
- - `run_agent` - Run autonomous agent for complex tasks
96
+ ### Automation
104
97
 
105
- ### Web Automation
106
- - `web_list_pages` - List browser tabs/WebViews
107
- - `web_navigate` - Navigate to URL
108
- - `web_get_dom` - Get DOM tree
109
- - `web_click` - Click element by CSS selector
110
- - `web_type` - Type into element by CSS selector
111
- - `web_execute_js` - Execute JavaScript
98
+ | Tool | Description |
99
+ |---|---|
100
+ | `execute_dsl` | **Primary tool.** Execute a batch of DSL steps: tap, type, swipe, observe, assertions, web automation, metrics, screen recording, and more. |
112
101
 
113
- ### Low-Level
114
- - `http_request` - Make raw HTTP request to MobAI API
102
+ ### Test management
115
103
 
116
- ## Available Resources
104
+ Tests are `.mob` files on disk inside project directories. You read, write, and edit them directly using your assistant's filesystem tools — MobAI watches for changes and updates the UI live. MCP is only needed to discover projects and run tests.
117
105
 
118
- - `mobai://api-reference` - Complete API documentation
119
- - `mobai://dsl-guide` - DSL batch execution guide
120
- - `mobai://native-runner` - Native app automation guide
121
- - `mobai://web-runner` - Web automation guide
106
+ | Tool | Description |
107
+ |---|---|
108
+ | `test_get_active` | Get the active test project directory and its `.mob` cases |
109
+ | `test_list_projects` | List all known test project directories with their `.mob` cases |
110
+ | `test_run` | Run a `.mob` test case on a device (`project_dir` + `case_path` + `device_id`) |
122
111
 
123
- ## Example Usage
112
+ ## Resources
124
113
 
125
- ### List devices and take screenshot
114
+ Read these **before** attempting any device interaction — they describe the DSL schema, action set, predicates, failure strategies, and `.mob` syntax.
126
115
 
127
- ```
128
- Use the list_devices tool to see connected devices.
129
- Then use get_screenshot with the device ID.
130
- ```
116
+ | URI | Purpose |
117
+ |---|---|
118
+ | `mobai://reference/device-automation` | How to control devices — guide, all DSL actions, predicates, and failure strategies |
119
+ | `mobai://reference/testing` | Testing workflow, rules, error fixes, and `.mob` script syntax |
131
120
 
132
- ### Automate Settings app
121
+ ## Example
133
122
 
134
- ```
135
- Use execute_dsl with:
123
+ Open the iOS Settings app, navigate to Wi-Fi, and verify the toggle exists:
124
+
125
+ ```json
136
126
  {
137
127
  "version": "0.2",
138
128
  "steps": [
139
129
  {"action": "open_app", "bundle_id": "com.apple.Preferences"},
140
- {"action": "delay", "duration_ms": 1000},
141
- {"action": "observe", "context": "native", "include": ["ui_tree"]},
142
- {"action": "tap", "predicate": {"text_contains": "General"}}
130
+ {"action": "wait_for", "predicate": {"text": "Settings"}, "timeout_ms": 3000},
131
+ {"action": "tap", "predicate": {"text_contains": "Wi-Fi"}},
132
+ {"action": "wait_for", "predicate": {"type": "switch"}, "timeout_ms": 3000},
133
+ {"action": "assert_exists", "predicate": {"type": "switch"}},
134
+ {"action": "observe", "include": ["ui_tree"]}
143
135
  ]
144
136
  }
145
137
  ```
146
138
 
147
- ### Run AI agent
148
-
149
- ```
150
- Use run_agent with device_id and task: "Open Settings and enable WiFi"
151
- ```
152
-
153
- ## Comparison with Claude Code Plugin
154
-
155
- | Feature | Claude Code Plugin | MCP Server |
156
- |---------|-------------------|------------|
157
- | Platform | Claude Code only | Any MCP client |
158
- | Tools | http_request (generic) | Named tools + http_request |
159
- | Resources | Skills (markdown) | MCP resources |
160
- | Setup | Plugin install | npx |
161
-
162
- The MCP server provides the same functionality as the Claude Code plugin but works with any MCP-compatible AI tool.
139
+ Pass this as the `commands` argument (a JSON string) to `execute_dsl` along with a `device_id` from `list_devices`.
163
140
 
164
141
  ## Troubleshooting
165
142
 
166
- ### "Connection refused" error
167
- - Ensure MobAI desktop app is running
168
- - Check that API is available at http://127.0.0.1:8686
143
+ **"Connection refused"** — Make sure the MobAI desktop app is running and the API is reachable at `http://127.0.0.1:8686`.
169
144
 
170
- ### "Bridge not running" error
171
- - Use `start_bridge` tool first before automation
172
- - iOS bridge may take up to 60 seconds to start
145
+ **"Bridge not running"** — Call `start_bridge` first. The iOS bridge can take up to a minute to come up.
173
146
 
174
- ### Screenshots not visible
175
- - Screenshots are saved to `/tmp/mobai/screenshots/`
176
- - Use your AI tool's file reading capability to view them
147
+ **Screenshots not visible** — `get_screenshot` saves to `/tmp/mobai/screenshots/` by default and returns the file path. Use your assistant's file-reading capability to view them. DSL `observe` screenshots are extracted from the response and saved to the same directory.
177
148
 
178
149
  ## Development
179
150
 
180
151
  ```bash
181
- # Clone the repository
182
152
  git clone https://github.com/MobAI-App/mobai-mcp.git
183
153
  cd mobai-mcp
184
-
185
- # Install dependencies
186
154
  npm install
187
-
188
- # Build
189
155
  npm run build
190
-
191
- # Run locally
192
156
  node dist/index.js
193
157
  ```
194
158
 
195
159
  ## License
196
160
 
197
- Apache 2.0 - see [LICENSE](LICENSE) for details.
161
+ Apache 2.0 - see [LICENSE](LICENSE).
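
Reviewer sketch for the README's closing instruction: the new README says the script is passed to `execute_dsl` as the `commands` argument (a JSON string) together with a `device_id`. The snippet below shows one way a client might assemble those arguments. The helper name `buildExecuteDslArgs` and the device ID value are hypothetical; only the argument names `commands` and `device_id` and the `"version": "0.2"` envelope come from the diff.

```javascript
// Hypothetical helper: packs DSL steps into execute_dsl arguments.
// Only `commands`, `device_id`, and the "0.2" envelope come from the README.
function buildExecuteDslArgs(deviceId, steps) {
  if (!deviceId) throw new Error("device_id is required");
  if (!Array.isArray(steps) || steps.length === 0) {
    throw new Error("steps must be a non-empty array");
  }
  const script = { version: "0.2", steps };
  // The README states that `commands` is a JSON string, not an object.
  return { device_id: deviceId, commands: JSON.stringify(script) };
}

// "example-device-id" is a placeholder; real IDs come from list_devices.
const args = buildExecuteDslArgs("example-device-id", [
  { action: "open_app", bundle_id: "com.apple.Preferences" },
  { action: "wait_for", predicate: { text: "Settings" }, timeout_ms: 3000 },
  { action: "observe", include: ["ui_tree"] },
]);
console.log(args.commands);
```

Serializing once at the call site keeps the script editable as a plain object until the moment it is sent.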
package/dist/index.js CHANGED
@@ -107,8 +107,6 @@ async function doRequest(method, urlPath, payload, timeoutMs = DEFAULT_TIMEOUT_M
107
107
  const doGet = (p) => doRequest("GET", p);
108
108
  const doPost = (p, body) => doRequest("POST", p, body);
109
109
  const doDelete = (p) => doRequest("DELETE", p);
110
- const doPut = (p, body) => doRequest("PUT", p, body);
111
- const doPatch = (p, body) => doRequest("PATCH", p, body);
112
110
  function textResult(data) {
113
111
  return {
114
112
  content: [{ type: "text", text: typeof data === "string" ? data : JSON.stringify(data, null, 2) }],
@@ -127,8 +125,8 @@ function errResult(err) {
127
125
  const server = new Server({ name: "mobai", version: "1.0.0" }, {
128
126
  capabilities: { tools: {}, resources: {} },
129
127
  instructions: `MobAI controls Android and iOS devices. Before starting any device task, read the relevant MCP resources:
130
- - mobai://reference/device-automation — how to control devices
131
- - mobai://reference/testing — testing workflow, rules, and .mob script syntax
128
+ - mobai://reference/device-automation — how to control devices (read before ANY device interaction)
129
+ - mobai://reference/testing — .mob script syntax (read ONLY when user asks to create or fix test scripts)
132
130
  Check available skills in current work directory and load any relevant to the user's request.`,
133
131
  });
134
132
  // ---------------------------------------------------------------------------
@@ -171,7 +169,7 @@ const TOOLS = [
171
169
  // Screenshot
172
170
  {
173
171
  name: "get_screenshot",
174
- description: "Capture a fast, low-quality screenshot for LLM visual analysis. Returns the file path to the saved image. Use this for AI/LLM processing only — for full-quality screenshots use save_screenshot instead.",
172
+ description: "Capture a fast, low-quality screenshot for LLM visual analysis. Returns the file path to the saved image. The image may be downscaled by an integer factor so its long edge stays ≤ 2000px; when that happens the response includes a scale factor — multiply any coordinates you read off the image by that factor before using them in device actions (tap, swipe, drag, long-press, etc.). UI tree coordinates are already in device pixels, do not scale those. Use this for AI/LLM processing only — for full-quality screenshots use save_screenshot instead.",
175
173
  inputSchema: {
176
174
  type: "object",
177
175
  properties: { device_id: { type: "string", description: "Device ID" } },
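
Reviewer sketch for the updated `get_screenshot` description in this hunk: coordinates read off a downscaled image must be multiplied by the reported scale factor before they are used in device actions, while UI tree coordinates are left alone. The helper name `toDeviceCoords` is hypothetical, and how the scale factor appears in the response is not shown in this diff, so it is passed in explicitly here.

```javascript
// Hypothetical helper: converts a point read off a downscaled screenshot
// into device pixels by multiplying by the reported integer scale factor.
// Pass 1 (the default) when the response reports no scaling.
function toDeviceCoords(point, scaleFactor = 1) {
  if (!Number.isFinite(scaleFactor) || scaleFactor < 1) {
    throw new Error("scale factor must be a number >= 1");
  }
  return {
    x: Math.round(point.x * scaleFactor),
    y: Math.round(point.y * scaleFactor),
  };
}

// A point at (400, 900) on an image downscaled by 2 maps to (800, 1800).
console.log(toDeviceCoords({ x: 400, y: 900 }, 2));
```

Do not apply this conversion to coordinates taken from the UI tree; per the description they are already in device pixels.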
@@ -191,6 +189,20 @@ const TOOLS = [
191
189
  required: ["device_id"],
192
190
  },
193
191
  },
192
+ // Debug launch
193
+ {
194
+ name: "debug_app",
195
+ description: "Launch an app in debug mode and write logs to a file. Returns the log file path — use Read/Grep to inspect logs. Use kill_app to stop.",
196
+ inputSchema: {
197
+ type: "object",
198
+ properties: {
199
+ device_id: { type: "string", description: "Device ID" },
200
+ bundle_id: { type: "string", description: "Bundle ID of the app to debug" },
201
+ log_path: { type: "string", description: "Directory for log file (supports ~/). Defaults to OS temp directory." },
202
+ },
203
+ required: ["device_id", "bundle_id"],
204
+ },
205
+ },
194
206
  // App management
195
207
  {
196
208
  name: "list_apps",
@@ -230,7 +242,7 @@ const TOOLS = [
230
242
  name: "execute_dsl",
231
243
  description: `Execute a batch of DSL commands on a device. This is the primary tool for all device interaction — tap, type, swipe, observe, launch apps, assertions, web automation, and more.
232
244
 
233
- Read the MCP resource mobai://reference/device-automation to learn how to control devices before using this tool.
245
+ You MUST read the MCP resource mobai://reference/device-automation to learn how to control devices before using this tool.
234
246
 
235
247
  Input: JSON string with "version": "0.2" and "steps" array. Example:
236
248
  {"version":"0.2","steps":[
@@ -250,150 +262,25 @@ Input: JSON string with "version": "0.2" and "steps" array. Example:
250
262
  // Test management
251
263
  {
252
264
  name: "test_get_active",
253
- description: "Get the currently active test project and its cases. Use this to discover which test cases are available.",
265
+ description: "Get the currently active test project directory and its .mob test cases. Use this to discover the project path and available tests. The agent can then read/write/create/delete .mob files directly in the returned directory.",
254
266
  inputSchema: { type: "object", properties: {}, required: [] },
255
267
  },
256
268
  {
257
269
  name: "test_list_projects",
258
- description: "List all test projects with their test cases included inline",
270
+ description: "List all known test project directories with their .mob test cases. Each project is a directory containing .mob script files.",
259
271
  inputSchema: { type: "object", properties: {}, required: [] },
260
272
  },
261
- {
262
- name: "test_create_project",
263
- description: "Create a new test project",
264
- inputSchema: {
265
- type: "object",
266
- properties: { name: { type: "string", description: "Project name" } },
267
- required: ["name"],
268
- },
269
- },
270
- {
271
- name: "test_rename_project",
272
- description: "Rename an existing test project",
273
- inputSchema: {
274
- type: "object",
275
- properties: {
276
- project_id: { type: "string", description: "Project ID" },
277
- name: { type: "string", description: "New project name" },
278
- },
279
- required: ["project_id", "name"],
280
- },
281
- },
282
- {
283
- name: "test_create_case",
284
- description: "Create a new test case in a project",
285
- inputSchema: {
286
- type: "object",
287
- properties: {
288
- project_id: { type: "string", description: "Project ID" },
289
- name: { type: "string", description: "Test case name" },
290
- folder: { type: "string", description: "Optional folder path within the project" },
291
- },
292
- required: ["project_id", "name"],
293
- },
294
- },
295
- {
296
- name: "test_rename_case",
297
- description: "Rename an existing test case",
298
- inputSchema: {
299
- type: "object",
300
- properties: {
301
- project_id: { type: "string", description: "Project ID" },
302
- case_id: { type: "string", description: "Test case ID" },
303
- name: { type: "string", description: "New test case name" },
304
- },
305
- required: ["project_id", "case_id", "name"],
306
- },
307
- },
308
- {
309
- name: "test_delete_case",
310
- description: "Delete a test case from a project",
311
- inputSchema: {
312
- type: "object",
313
- properties: {
314
- project_id: { type: "string", description: "Project ID" },
315
- case_id: { type: "string", description: "Test case ID" },
316
- },
317
- required: ["project_id", "case_id"],
318
- },
319
- },
320
- {
321
- name: "test_get_script",
322
- description: "Get the .mob script content for a test case (with 1-based line numbers)",
323
- inputSchema: {
324
- type: "object",
325
- properties: {
326
- project_id: { type: "string", description: "Project ID" },
327
- case_id: { type: "string", description: "Test case ID" },
328
- },
329
- required: ["project_id", "case_id"],
330
- },
331
- },
332
- {
333
- name: "test_replace_script",
334
- description: "Replace the entire .mob script for a test case",
335
- inputSchema: {
336
- type: "object",
337
- properties: {
338
- project_id: { type: "string", description: "Project ID" },
339
- case_id: { type: "string", description: "Test case ID" },
340
- script: { type: "string", description: "New script content (without line numbers)" },
341
- },
342
- required: ["project_id", "case_id", "script"],
343
- },
344
- },
345
- {
346
- name: "test_update_line",
347
- description: "Update a single line in the .mob script",
348
- inputSchema: {
349
- type: "object",
350
- properties: {
351
- project_id: { type: "string", description: "Project ID" },
352
- case_id: { type: "string", description: "Test case ID" },
353
- line_number: { type: "number", description: "1-based line number to update" },
354
- content: { type: "string", description: "New line content" },
355
- },
356
- required: ["project_id", "case_id", "line_number", "content"],
357
- },
358
- },
359
- {
360
- name: "test_insert_after",
361
- description: "Insert a new line after the specified line number in the .mob script",
362
- inputSchema: {
363
- type: "object",
364
- properties: {
365
- project_id: { type: "string", description: "Project ID" },
366
- case_id: { type: "string", description: "Test case ID" },
367
- line_number: { type: "number", description: "1-based line number to insert after (0 = insert at beginning)" },
368
- content: { type: "string", description: "Line content to insert" },
369
- },
370
- required: ["project_id", "case_id", "line_number", "content"],
371
- },
372
- },
373
- {
374
- name: "test_delete_line",
375
- description: "Delete a line from the .mob script",
376
- inputSchema: {
377
- type: "object",
378
- properties: {
379
- project_id: { type: "string", description: "Project ID" },
380
- case_id: { type: "string", description: "Test case ID" },
381
- line_number: { type: "number", description: "1-based line number to delete" },
382
- },
383
- required: ["project_id", "case_id", "line_number"],
384
- },
385
- },
386
273
  {
387
274
  name: "test_run",
388
- description: "Run a test case on a device",
275
+ description: "Run a .mob test case on a device. The case_path is relative to the project directory.",
389
276
  inputSchema: {
390
277
  type: "object",
391
278
  properties: {
392
- project_id: { type: "string", description: "Project ID" },
393
- case_id: { type: "string", description: "Test case ID" },
279
+ project_dir: { type: "string", description: "Absolute path to the project directory" },
280
+ case_path: { type: "string", description: "Relative path to the .mob file within the project, e.g. auth/login.mob" },
394
281
  device_id: { type: "string", description: "Device ID to run the test on" },
395
282
  },
396
- required: ["project_id", "case_id", "device_id"],
283
+ required: ["project_dir", "case_path", "device_id"],
397
284
  },
398
285
  },
399
286
  ];
@@ -406,13 +293,6 @@ server.setRequestHandler(ListToolsRequestSchema, async () => {
406
293
  // ---------------------------------------------------------------------------
407
294
  // Tool call handler
408
295
  // ---------------------------------------------------------------------------
409
- function testCasePath(args) {
410
- const projectId = args?.project_id;
411
- const caseId = args?.case_id;
412
- if (!projectId || !caseId)
413
- throw new Error("project_id and case_id are required");
414
- return `/tests/projects/${projectId}/cases/${caseId}`;
415
- }
416
296
  server.setRequestHandler(CallToolRequestSchema, async (request) => {
417
297
  const { name, arguments: args } = request.params;
418
298
  try {
@@ -442,6 +322,12 @@ server.setRequestHandler(CallToolRequestSchema, async (request) => {
442
322
  return textResult(screenshotToFile(body));
443
323
  }
444
324
  // App management
325
+ case "debug_app": {
326
+ const body = { bundleId: args?.bundle_id };
327
+ if (args?.log_path)
328
+ body.logPath = args.log_path;
329
+ return textResult(await doPost(`/devices/${args?.device_id}/debug/launch`, body));
330
+ }
445
331
  case "list_apps":
446
332
  return textResult(await doGet(`/devices/${args?.device_id}/apps`));
447
333
  case "install_app":
@@ -468,56 +354,12 @@ server.setRequestHandler(CallToolRequestSchema, async (request) => {
468
354
  return textResult(await doGet("/tests/active"));
469
355
  case "test_list_projects":
470
356
  return textResult(await doGet("/tests/projects"));
471
- case "test_create_project":
472
- return textResult(await doPost("/tests/projects", { name: args?.name }));
473
- case "test_rename_project":
474
- return textResult(await doPatch(`/tests/projects/${args?.project_id}`, { name: args?.name }));
475
- case "test_create_case": {
476
- const body = { name: args?.name };
477
- if (args?.folder)
478
- body.folder = args.folder;
479
- return textResult(await doPost(`/tests/projects/${args?.project_id}/cases`, body));
480
- }
481
- case "test_rename_case": {
482
- const p = testCasePath(args);
483
- return textResult(await doPatch(p, { name: args?.name }));
484
- }
485
- case "test_delete_case": {
486
- const p = testCasePath(args);
487
- return textResult(await doDelete(p));
488
- }
489
- case "test_get_script": {
490
- const p = testCasePath(args);
491
- return textResult(await doGet(`${p}/script`));
492
- }
493
- case "test_replace_script": {
494
- const p = testCasePath(args);
495
- return textResult(await doPut(`${p}/script`, { script: args?.script }));
496
- }
497
- case "test_update_line": {
498
- const p = testCasePath(args);
499
- return textResult(await doPost(`${p}/script/update-line`, {
500
- line_number: args?.line_number,
501
- content: args?.content,
357
+ case "test_run":
358
+ return textResult(await doPost("/tests/cases/run", {
359
+ project_dir: args?.project_dir,
360
+ case_path: args?.case_path,
361
+ device_id: args?.device_id,
502
362
  }));
503
- }
504
- case "test_insert_after": {
505
- const p = testCasePath(args);
506
- return textResult(await doPost(`${p}/script/insert-after`, {
507
- line_number: args?.line_number,
508
- content: args?.content,
509
- }));
510
- }
511
- case "test_delete_line": {
512
- const p = testCasePath(args);
513
- return textResult(await doPost(`${p}/script/delete-line`, {
514
- line_number: args?.line_number,
515
- }));
516
- }
517
- case "test_run": {
518
- const p = testCasePath(args);
519
- return textResult(await doPost(`${p}/run`, { device_id: args?.device_id }));
520
- }
521
363
  default:
522
364
  return { content: [{ type: "text", text: `Unknown tool: ${name}` }], isError: true };
523
365
  }
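
Reviewer sketch for the `test_run` change above: 2.2.0 replaces `project_id` and `case_id` with `project_dir` (absolute path) and `case_path` (relative path to a `.mob` file). The snippet below shows how a caller might sanity-check the new arguments before sending them. The function name `buildTestRunArgs`, the example paths, and the specific checks are assumptions for illustration; the server's own validation is not visible in this diff.

```javascript
// Hypothetical validator for the 2.2.0 test_run arguments.
function buildTestRunArgs(projectDir, casePath, deviceId) {
  const fields = { project_dir: projectDir, case_path: casePath, device_id: deviceId };
  for (const [name, value] of Object.entries(fields)) {
    if (typeof value !== "string" || value.length === 0) {
      throw new Error(`${name} is required`);
    }
  }
  // Assumption: a relative case path has no leading slash and names a .mob file.
  if (casePath.startsWith("/") || !casePath.endsWith(".mob")) {
    throw new Error("case_path must be a relative path to a .mob file");
  }
  return fields;
}

// Example values are placeholders, not paths taken from the package.
console.log(buildTestRunArgs("/Users/me/tests/shop", "auth/login.mob", "example-device-id"));
```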
package/dist/resources.js CHANGED
@@ -11,6 +11,12 @@ export const RESOURCES = [
11
11
  description: "Testing workflow, rules, error fixes, and .mob script syntax for test generation",
12
12
  mimeType: "text/plain",
13
13
  },
14
+ {
15
+ uri: "mobai://claude-code-preview",
16
+ name: "Claude Code Preview Setup",
17
+ description: "How to preview a MobAI device's control UI inside Claude Code's preview panel",
18
+ mimeType: "text/plain",
19
+ },
14
20
  ];
15
21
  export function getResourceContent(uri) {
16
22
  switch (uri) {
@@ -18,13 +24,43 @@ export function getResourceContent(uri) {
18
24
  return DEVICE_AUTOMATION_REF;
19
25
  case "mobai://reference/testing":
20
26
  return TESTING_REF;
27
+ case "mobai://claude-code-preview":
28
+ return CLAUDE_CODE_PREVIEW;
21
29
  default:
22
30
  return null;
23
31
  }
24
32
  }
25
- // ---------------------------------------------------------------------------
26
- // Resource content copied verbatim from Go resources.go
27
- // ---------------------------------------------------------------------------
33
+ const CLAUDE_CODE_PREVIEW = `<claude-code-preview>
34
+ Prerequisite: the MobAI desktop app must be running. It owns the
35
+ localhost 8787 web server the preview panel will render.
36
+
37
+ 1. Call list_devices and grab the device's id and controlUrl.
38
+
39
+ 2. Write .claude/launch.json at the project root (or, inside a git
40
+ worktree, at the worktree root):
41
+
42
+ {
43
+ "version": "0.0.1",
44
+ "configurations": [{
45
+ "name": "MobAI — <device name>",
46
+ "runtimeExecutable": "sleep",
47
+ "runtimeArgs": ["86400"],
48
+ "port": 8787,
49
+ "url": "<controlUrl>"
50
+ }]
51
+ }
52
+
53
+ - runtimeExecutable + runtimeArgs is a no-op lifetime anchor for
54
+ Claude Code's panel; the real server is MobAI.
55
+ - port is the localhost port Claude Code binds the preview to;
56
+ always 8787 for MobAI.
57
+ - url is the device-specific URL (controlUrl from step 1) that the
58
+ panel actually displays.
59
+
60
+ 3. Call the mcp__Claude_Preview__preview_start tool with the "name"
61
+ from the configuration above.
62
+ </claude-code-preview>
63
+ `;
28
64
  const DEVICE_AUTOMATION_REF = `<device-automation-reference>
29
65
 
30
66
  <guide>
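
Reviewer sketch for the new `mobai://claude-code-preview` resource in this hunk: the resource describes a `.claude/launch.json` with a fixed port of 8787, a `sleep 86400` lifetime anchor, and the device's `controlUrl`. The snippet below generates that object. The function name `buildLaunchConfig`, the configuration name, and the example URL are hypothetical placeholders; the real `controlUrl` comes from `list_devices`.

```javascript
// Hypothetical helper: builds the .claude/launch.json content described in
// the claude-code-preview resource.
function buildLaunchConfig(configName, controlUrl) {
  if (!configName || !controlUrl) {
    throw new Error("configName and controlUrl are required");
  }
  return {
    version: "0.0.1",
    configurations: [{
      name: configName,            // passed to the preview start tool in step 3
      runtimeExecutable: "sleep",  // no-op lifetime anchor; MobAI is the real server
      runtimeArgs: ["86400"],
      port: 8787,                  // per the resource: always 8787 for MobAI
      url: controlUrl,             // device-specific URL from list_devices
    }],
  };
}

// The URL below is a placeholder, not a real controlUrl format.
const launch = buildLaunchConfig("MobAI preview", "http://localhost:8787/example-device");
console.log(JSON.stringify(launch, null, 2));
```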
@@ -54,15 +90,23 @@ const DEVICE_AUTOMATION_REF = `<device-automation-reference>
54
90
  </ocr-fallback>
55
91
 
56
92
  <execution-modes>
57
- Default (explore mode): non-last observe actions are skipped; only the final observe executes. Use "mode": "deterministic" when you need every observe to execute (observe → act → observe → act → observe).
58
- Example: {"version": "0.2", "mode": "deterministic", "steps": [{"action": "observe", "include": ["ui_tree"]}, {"action": "tap", "predicate": {"text": "Next"}}, {"action": "observe", "include": ["ui_tree"]}]}
93
+ Default (explore mode): only the last observe in a script runs; earlier observes are skipped. This is the right mode for the typical pattern: actions first, then one observe at the end to see the result. If a step fails, the error includes a debug UI tree so you don't need a separate observe.
94
+ Deterministic mode: every observe runs. Use only when you need to capture screen state between actions within a single script (rare — prefer separate execute_dsl calls so you can reason between steps).
59
95
  </execution-modes>
60
96
 
61
97
  <workflow>Observe screen → plan → act via execute_dsl → verify (end script with wait_for stable + observe) → repeat until done.</workflow>
62
98
 
99
+ <per-app-skills>
100
+ Before working with a known app, check ~/.claude/skills/ for a skill matching its bundle id or name (e.g. com-instagram-android, uber) and load it — it may already encode selectors, flows, and quirks learned on a prior run.
101
+ When you discover app-specific gotchas that would cost future sessions time — unstable selectors that only work with a specific predicate, hidden taps, flows that need an extra wait_for, React Native / Flutter screens that need OCR, dialogs that hijack input — create or update a skill at ~/.claude/skills/&lt;app-slug&gt;/SKILL.md capturing the finding. Keep each skill short: the specific quirk, the selector/flow that works, and one sentence on why the obvious approach fails. Do not write generic mobile-automation advice there — that belongs in this reference.
102
+
103
+ Also save reusable multi-step flows as labeled mobai CLI command sequences inside the same SKILL.md. When you confirm a flow works (login, dismiss onboarding, open-settings-and-toggle-X, checkout), add a section with a heading like "## Flow: login" and a fenced shell code block of "mobai ..." commands in order — one per step. Mark variable inputs with placeholders (&lt;EMAIL&gt;, &lt;OTP_CODE&gt;) so future sessions know what to substitute. On next run, replay the commands (shell them out or translate to execute_dsl) with placeholders substituted — this avoids re-deriving the flow from scratch. Shell commands are saved (not JSON DSL) because the MobAI CLI does not execute DSL JSON blobs, and shell commands stay replayable from either CLI or MCP sessions. If a snippet breaks because the app changed, update it in place.
104
+ </per-app-skills>
105
+
63
106
  <screenshot-tools>
64
107
  get_screenshot — fast low-quality image for LLM visual analysis.
65
108
  save_screenshot — full-quality PNG for reporting, debugging, or sharing.
109
+ To verify animations and UI transitions, use record_start/record_stop.
66
110
  </screenshot-tools>
67
111
 
68
112
  <infinite-scrolling>To collect data from infinite-scrolling views (feeds, search results), scroll to load a batch first, then observe with only_visible:false to get all loaded items in one go.</infinite-scrolling>
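
Reviewer sketch for the rewritten `<execution-modes>` text in this hunk: in the default explore mode only the last `observe` step runs, while `"mode": "deterministic"` runs every one. The snippet below models that rule as described in the text; it is not the server's implementation, and the function name `observesThatRun` is hypothetical.

```javascript
// Models the documented rule: returns the indices of observe steps that run.
function observesThatRun(script) {
  const observeIndices = script.steps
    .map((step, index) => (step.action === "observe" ? index : -1))
    .filter((index) => index !== -1);
  if (script.mode === "deterministic") return observeIndices;
  // Explore mode (default): only the last observe runs, if there is one.
  return observeIndices.slice(-1);
}

const steps = [
  { action: "observe", include: ["ui_tree"] },
  { action: "tap", predicate: { text: "Next" } },
  { action: "observe", include: ["ui_tree"] },
];
console.log(observesThatRun({ version: "0.2", steps }));
console.log(observesThatRun({ version: "0.2", mode: "deterministic", steps }));
```

In the example, explore mode keeps only the observe at index 2, while deterministic mode keeps indices 0 and 2.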
@@ -79,14 +123,13 @@ const DEVICE_AUTOMATION_REF = `<device-automation-reference>
79
123
  <target-element>{"predicate": Predicate}</target-element>
80
124
 
81
125
  <predicate context="native">
82
- <note>Prefer text_contains or text_regex over text (exact match) — UI text often changes with state, locale, or dynamic content. Exact match breaks easily. Prefer text fields over label fields — text is what the user sees on screen and is more reliable.</note>
126
+ <note>Prefer text_contains or text_regex over text (exact match) — UI text often changes with state, locale, or dynamic content. Exact match breaks easily.</note>
83
127
  <field name="text" type="string">Exact match — use only when the full text is short, static, and unique</field>
84
128
  <field name="text_contains" type="string">Substring, case-insensitive — preferred for most matching</field>
85
129
  <field name="text_starts_with" type="string">Prefix match</field>
86
130
  <field name="text_regex" type="string">Regex pattern — use for dynamic text (numbers, dates, counts)</field>
87
131
  <field name="type" type="string">button, input, switch, text, image, cell, scrollview</field>
88
- <field name="label" type="string">Accessibility label (exact) - use only when text fields are empty</field>
89
- <field name="label_contains" type="string">Accessibility label (partial) — use only when text fields are empty</field>
132
+ <field name="accessibility_id" type="string">Exact match on the #id shown in UI tree (without the # prefix)</field>
90
133
  <field name="enabled" type="bool">Enabled state</field>
91
134
  <field name="visible" type="bool">Visible state</field>
92
135
  <field name="selected" type="bool">Selected state</field>
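As an illustration of the native predicate fields in this hunk, the sketch below builds predicate objects that favor `text_contains` or `text_regex` over exact `text`, and strips the `#` prefix from an `accessibility_id`. The field names come from the reference; the `nativePredicate` helper, the sample values, and the regex dialect are assumptions.

```javascript
// Field names come from the native predicate reference; the helper itself is
// hypothetical and not part of mobai-mcp.
function nativePredicate({ contains, regex, exact, type, accessibilityId } = {}) {
  const predicate = {};
  // The UI tree shows "#id"; the predicate expects the id without the "#".
  if (accessibilityId !== undefined) predicate.accessibility_id = accessibilityId.replace(/^#/, "");
  // Preference order follows the note: substring, then regex, exact match last.
  if (contains !== undefined) predicate.text_contains = contains;
  else if (regex !== undefined) predicate.text_regex = regex;
  else if (exact !== undefined) predicate.text = exact;
  if (type !== undefined) predicate.type = type;
  return predicate;
}

// Dynamic text such as an item count is matched with a regex, not exact text.
const cartButton = nativePredicate({ regex: "^Cart \\(\\d+\\)$", type: "button" });
// An id copied from the UI tree, passed with its "#" and normalized by the helper.
const emailField = nativePredicate({ accessibilityId: "#login_email" });

console.log(JSON.stringify({ predicate: cartButton }));
console.log(JSON.stringify({ predicate: emailField }));
```

The objects are wrapped as `{"predicate": ...}` to match the target-element shape shown at the top of the hunk.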
@@ -165,9 +208,10 @@ const DEVICE_AUTOMATION_REF = `<device-automation-reference>
165
208
  Direction = semantic (where to look), not finger movement.
166
209
  <field name="direction" required="yes">down (look below), up (look above)</field>
167
210
  <field name="to_element" type="TargetElement"/>
168
- <field name="max_scrolls" type="int"/>
211
+ <field name="max_scrolls" type="int" default="10"/>
169
212
  <field name="amount">small, page, full</field>
170
213
  <example>{"action": "scroll", "direction": "down", "to_element": {"predicate": {"text": "Privacy"}}, "max_scrolls": 10}</example>
214
+ <note>scroll with to_element returns "reached end of scrollable content" if the list ends before the element is found. If it returns "element not found after scrolling" instead, the list has more content — increase max_scrolls or call scroll again to continue searching.</note>
171
215
  </action>
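The two scroll outcomes described in the note above lead to different follow-ups, which the sketch below makes explicit. The message strings are quoted from the note; how a client actually receives them (error text or a result field) is an assumption, and both function names are hypothetical.

```javascript
// Message strings are quoted from the scroll note; how they reach the caller
// is an assumption of this sketch.
const END_OF_CONTENT = "reached end of scrollable content";
const NOT_FOUND_YET = "element not found after scrolling";

function classifyScrollResult(message) {
  const text = String(message ?? "").toLowerCase();
  if (text.includes(END_OF_CONTENT)) return "end_of_list";   // list ended, element is absent
  if (text.includes(NOT_FOUND_YET)) return "keep_scrolling"; // more content remains
  return "other"; // anything else is left to the caller to interpret
}

// Builds the same action shape as the reference example, with max_scrolls adjustable.
function scrollTo(predicate, maxScrolls = 10) {
  return { action: "scroll", direction: "down", to_element: { predicate }, max_scrolls: maxScrolls };
}

console.log(classifyScrollResult("element not found after scrolling"));
console.log(JSON.stringify(scrollTo({ text_contains: "Privacy" }, 20)));
```

On "keep_scrolling" a caller would either raise `max_scrolls` or issue the scroll again; on "end_of_list" further scrolling in the same direction cannot find the element.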
172
216
 
173
217
  <action name="drag">
@@ -190,7 +234,7 @@ const DEVICE_AUTOMATION_REF = `<device-automation-reference>
190
234
 
191
235
  <action name="toggle">
192
236
  <field name="predicate" required="yes"/>
193
- <field name="state" required="yes">on or off</field>
237
+ <field name="state" required="no">Desired state: "on" or "off". If omitted, always toggles. If set, skips when already correct.</field>
194
238
  <example>{"action": "toggle", "predicate": {"type": "switch", "text_contains": "Wi-Fi"}, "state": "on"}</example>
195
239
  </action>
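The change to `state` makes toggle idempotent when a desired state is given. A minimal local model of that behavior, assuming only what the field description says (it does not talk to a device, and both helper names are hypothetical):

```javascript
// Models the documented semantics locally; nothing here touches a device.
function toggleOutcome(current, desired) {
  if (desired === undefined) return current === "on" ? "off" : "on"; // omitted: always flips
  if (desired !== "on" && desired !== "off") throw new Error('state must be "on" or "off"');
  return desired; // set: ends in the desired state, skipping the tap if already there
}

// Builds the action, leaving out "state" entirely when no desired state is given.
function toggleAction(predicate, state) {
  const action = { action: "toggle", predicate };
  if (state !== undefined) action.state = state;
  return action;
}

const wifi = { type: "switch", text_contains: "Wi-Fi" };
console.log(JSON.stringify(toggleAction(wifi, "on")), toggleOutcome("on", "on"));
console.log(JSON.stringify(toggleAction(wifi)), toggleOutcome("on", undefined));
```

Passing an explicit `state` is the safer choice in replayed flows, since running the same step twice leaves the switch where it was.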
196
240
 
@@ -342,6 +386,7 @@ const DEVICE_AUTOMATION_REF = `<device-automation-reference>
342
386
  <action name="record_stop">
343
387
  <field name="file_path">Override output directory</field>
344
388
  <returns>recording_path, frame_count, transition_hints (anomalies: jump/flash/stutter/incoherent_motion with from_frame, to_frame, type, delta_percent, region, message)</returns>
389
+ <note>transition_hints contains anomalous frame pairs (from_frame, to_frame). If transition_hints is empty, do not read any frames. If not empty, read only the flagged frame pairs. Read additional frames only if strictly necessary to investigate a flagged anomaly.</note>
345
390
  </action>
346
391
  </screen-recording>
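The `transition_hints` note amounts to a frame-selection rule: read nothing when the list is empty, otherwise read only the flagged pairs. A sketch of that rule follows, assuming `transition_hints` arrives as an array of objects with numeric `from_frame` and `to_frame`; the sample result is illustrative data, not real output.

```javascript
// Returns the frame indices worth reading, following the record_stop note.
// Assumes transition_hints is an array of { from_frame, to_frame, ... } objects.
function framesToRead(recordStopResult) {
  const hints = recordStopResult.transition_hints ?? [];
  const frames = new Set();
  for (const hint of hints) {
    frames.add(hint.from_frame);
    frames.add(hint.to_frame);
  }
  // Sorted and de-duplicated so overlapping pairs are read once, in order.
  return [...frames].sort((a, b) => a - b);
}

// Illustrative result only; values are made up for the example.
const sampleResult = {
  frame_count: 120,
  transition_hints: [
    { from_frame: 41, to_frame: 42, type: "jump" },
    { from_frame: 42, to_frame: 43, type: "flash" },
  ],
};
console.log(framesToRead(sampleResult));
console.log(framesToRead({ frame_count: 120, transition_hints: [] }));
```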
347
392
 
@@ -351,8 +396,15 @@ const TESTING_REF = `<testing-reference>
351
396
 
352
397
  <important>Read mobai://reference/device-automation to learn how to control devices before interacting with them.</important>
353
398
 
399
+ <file-model>
400
+ Tests are .mob files on disk inside project directories. You work with them directly:
401
+ - Use test_list_projects to discover project directories and their .mob files
402
+ - Read .mob files directly from the project directory using filesystem tools
403
+ - Create, edit, rename, and delete .mob files directly — MobAI watches for changes and updates the UI live
404
+ - Use test_run to execute a test on a device — this is the only operation that requires MCP
405
+ </file-model>
406
+
354
407
  <rules>
355
- <rule>Test scripts are ONLY accessible via MCP test_* tools. There are NO .mob files on disk. Do NOT use grep, find, cat, or any filesystem commands to look for scripts.</rule>
356
408
  <rule>Never ask the user for information you can get yourself — use observe, list_apps, get_ui_tree.</rule>
357
409
  <rule>Always add wait_for before every element interaction (tap, type, toggle, long_press, double_tap, drag). Exception: the element was asserted on the immediately preceding line.</rule>
358
410
  <rule>Always use predicates over coordinates — predicates survive layout changes.</rule>
@@ -361,22 +413,21 @@ const TESTING_REF = `<testing-reference>
361
413
  </rules>
362
414
 
363
415
  <workflow-create>
364
- 1. Observe the current screen
365
- 2. Plan the test steps from the user's description
366
- 3. Execute each action via DSL — add wait_for before every element interaction
367
- 4. Assert after key actions — verify expected state with assert_exists/assert_not_exists
368
- 5. Output the full script using MCP test tools
369
- 6. Verify — run the full script end-to-end
370
- 7. Fix — if steps fail, observe the screen, fix the failing lines
371
- 8. Re-run to verify fixes (max 3 retry cycles)
416
+ 1. Call test_list_projects to find the project directory and existing tests
417
+ 2. Observe the current screen on the device
418
+ 3. Plan the test steps from the user's description
419
+ 4. Write the .mob file directly to the project directory
420
+ 5. Run the test with test_run
421
+ 6. Fix — if steps fail, read the error, observe the screen, edit the .mob file
422
+ 7. Re-run to verify fixes (max 3 retry cycles)
372
423
  </workflow-create>
373
424
 
374
425
  <workflow-fix>
375
- 1. Read the current script
426
+ 1. Read the .mob file from disk
376
427
  2. Analyze the error messages — they reference exact line numbers
377
- 3. Reproduce — run the failing line individually via DSL to observe device state
378
- 4. Fix — update, insert, or delete lines as needed
379
- 5. Verify — re-run the test
428
+ 3. Reproduce — run a failing action via DSL to observe device state
429
+ 4. Edit the .mob file directly
430
+ 5. Re-run with test_run
380
431
  </workflow-fix>
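The run, fix, re-run cycle in the workflows above, with its cap of 3 retry cycles, can be sketched as a bounded loop. This is an illustration under assumptions: `runTest` and `fixTest` are hypothetical callbacks standing in for a test_run call and a direct edit of the .mob file, they are synchronous here for brevity (real calls would be asynchronous), and "max 3 retry cycles" is read as one initial run plus up to three fix and re-run rounds.

```javascript
// runTest and fixTest are hypothetical stand-ins for test_run and a .mob file edit.
function createAndStabilize({ runTest, fixTest, maxRetries = 3 }) {
  let result = runTest();
  let retries = 0;
  while (!result.passed && retries < maxRetries) {
    fixTest(result); // read the error, observe the screen, edit the .mob file
    retries += 1;
    result = runTest(); // re-run to verify the fix
  }
  return { passed: result.passed, retries };
}

// Stubbed usage: the test "fails" until two fixes have been applied.
let failuresLeft = 2;
const outcome = createAndStabilize({
  runTest: () => ({ passed: failuresLeft === 0, error: failuresLeft ? "stub failure" : null }),
  fixTest: () => { failuresLeft -= 1; },
});
console.log(outcome);
```

Once the cap is reached the loop returns the last failing result instead of continuing, which matches the workflow's intent of stopping after a bounded number of attempts.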
381
432
 
382
433
  <error-fixes>
@@ -405,10 +456,9 @@ const TESTING_REF = `<testing-reference>
405
456
 
406
457
  <verification>
407
458
  Check before every response:
408
- 1. Did you use MCP tools for all script mutations? (bare .mob lines in text are silently ignored)
409
- 2. Does every element interaction have a wait_for on the preceding line?
410
- 3. Are predicates used instead of coordinates wherever possible?
411
- 4. Did you observe the screen before acting?
459
+ 1. Does every element interaction have a wait_for on the preceding line?
460
+ 2. Are predicates used instead of coordinates wherever possible?
461
+ 3. Did you observe the screen before acting?
412
462
  </verification>
413
463
 
414
464
  <mob-script-syntax>
@@ -452,6 +502,14 @@ const TESTING_REF = `<testing-reference>
452
502
  delay 1000 — wait N ms
453
503
  press_key home|back|enter — hardware key
454
504
  navigate back|home — navigation shortcut
505
+ two_finger_tap "Map" — two-finger tap
506
+ pinch "Map" scale:0.5 — pinch (scale <1 = zoom out, >1 = zoom in)
507
+ pinch "Photo" scale:2.0 — pinch to zoom in
508
+ hide_keyboard — dismiss keyboard
509
+ copy_text "Field" — copy text from element
510
+ paste_text "Field" — paste clipboard into element
511
+ set_location 40.7128,-74.0060 — simulate GPS location (lat,lon)
512
+ reset_location — stop location simulation
455
513
  observe — observe screen
456
514
  screenshot "path.png" — take screenshot
457
515
  </actions>
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "mobai-mcp",
3
- "version": "2.0.0",
3
+ "version": "2.2.0",
4
4
  "mcpName": "io.github.MobAI-App/mobai-mcp",
5
5
  "description": "MCP server for MobAI - AI-powered mobile device automation",
6
6
  "type": "module",
package/server.json CHANGED
@@ -6,12 +6,12 @@
6
6
  "url": "https://github.com/MobAI-App/mobai-mcp",
7
7
  "source": "github"
8
8
  },
9
- "version": "2.0.0",
9
+ "version": "2.2.0",
10
10
  "packages": [
11
11
  {
12
12
  "registryType": "npm",
13
13
  "identifier": "mobai-mcp",
14
- "version": "2.0.0",
14
+ "version": "2.2.0",
15
15
  "transport": {
16
16
  "type": "stdio"
17
17
  }