mobai-mcp 2.0.0 → 2.2.0 (npm)

Files changed (5)

package/README.md CHANGED

@@ -3,24 +3,27 @@
 [![npm version](https://badge.fury.io/js/mobai-mcp.svg)](https://www.npmjs.com/package/mobai-mcp)
 [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
-MCP (Model Context Protocol) server for [MobAI](https://mobai.run) - AI-powered mobile device automation. This server enables AI coding assistants like Cursor, Windsurf, Cline, and other MCP-compatible tools to control Android and iOS devices, emulators, and simulators.
+MCP (Model Context Protocol) server for [MobAI](https://mobai.run) — AI-powered mobile device automation. Lets AI assistants (Claude Code, Cursor, Windsurf, Cline, and other MCP-compatible tools) control Android and iOS devices, emulators, and simulators via a single DSL-first interface.
-## Features
+## How it works
-- **Device Control**: List, connect, and manage Android/iOS devices
-- **UI Automation**: Tap, type, swipe, and interact with native apps
-- **Web Automation**: Control Safari/Chrome and WebViews with CSS selectors
-- **DSL Batch Execution**: Execute multiple automation steps efficiently
-- **AI Agent**: Run autonomous agents to complete complex tasks
-- **Screenshot Capture**: Capture and save device screenshots
+All device interaction is batched through one primary tool: **`execute_dsl`**. Instead of exposing dozens of fine-grained tools (tap, swipe, type…), the server accepts a JSON script describing a sequence of actions with predicates, assertions, waits, and conditional branches. This keeps round-trips low and encodes retry/failure strategies server-side.
+A small set of companion tools handles device discovery, screenshots, app management, and running `.mob` test files.
 ## Prerequisites
 - Node.js 18+
-- [MobAI desktop app](https://mobai.run) running locally (provides the HTTP API on port 8686)
-- Connected Android or iOS device (or emulator/simulator)
+- [MobAI desktop app](https://mobai.run) running locally (HTTP API on `127.0.0.1:8686`)
+- A connected Android or iOS device, emulator, or simulator
+## Installation
+### Claude Code
-## Installation & Configuration
+```bash
+claude mcp add mobai -- npx -y mobai-mcp
+```
 ### Cursor
@@ -39,7 +42,7 @@ Add to `.cursor/mcp.json`:
 ### Claude Desktop
-Add to Claude Desktop config (`~/Library/Application Support/Claude/claude_desktop_config.json` on macOS):
+Add to `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS):
 ```json
 {
@@ -52,146 +55,107 @@ Add to Claude Desktop config (`~/Library/Application Support/Claude/claude_deskt
 }
 ```
-### Windsurf
+### Windsurf / Cline / other MCP clients
-Add to Windsurf MCP config:
+The server speaks stdio — use your client's generic MCP configuration:
 ```json
 {
-  "mcpServers": {
-    "mobai": {
-      "command": "npx",
-      "args": ["-y", "mobai-mcp"]
-    }
-  }
+  "command": "npx",
+  "args": ["-y", "mobai-mcp"]
 }
 ```
-### Cline / Other MCP Clients
+## Tools
-Configure according to your client's MCP server setup. The server uses stdio transport.
+### Device management
-```json
-{
-  "command": "npx",
-  "args": ["-y", "mobai-mcp"]
-}
-```
+| Tool | Description |
+|---|---|
+| `list_devices` | List all connected Android and iOS devices |
+| `get_device` | Get details about a specific device |
+| `start_bridge` | Start the automation bridge on a device (required before interaction) |
+| `stop_bridge` | Stop the automation bridge |
-## Available Tools
+### Screenshots
-### Device Management
-- `list_devices` - List all connected devices
-- `get_device` - Get device information
-- `start_bridge` - Start on-device automation bridge
-- `stop_bridge` - Stop automation bridge
+| Tool | Description |
+|---|---|
+| `get_screenshot` | Fast, low-quality screenshot for LLM visual analysis (may be downscaled; response includes scale factor) |
+| `save_screenshot` | Full-quality PNG to disk for reporting, debugging, or sharing |
-### UI Automation
-- `get_screenshot` - Capture device screenshot
-- `get_ui_tree` - Get accessibility tree (supports text_regex and bounds filtering)
-- `tap` - Tap element by index or coordinates
-- `type_text` - Type text
-- `swipe` - Perform swipe gesture
-- `go_home` - Navigate to home screen
-- `launch_app` - Launch app by bundle ID
-- `list_apps` - List installed apps
+### Apps
-### DSL Execution
-- `execute_dsl` - Execute batch automation script
+| Tool | Description |
+|---|---|
+| `list_apps` | List installed apps on the device |
+| `install_app` | Install an `.apk` or `.ipa` from a local file path |
+| `uninstall_app` | Uninstall an app by bundle ID / package name |
+| `debug_app` | Launch an app in debug mode and write stdout/stderr to a log file |
-### AI Agent
-- `run_agent` - Run autonomous agent for complex tasks
+### Automation
-### Web Automation
-- `web_list_pages` - List browser tabs/WebViews
-- `web_navigate` - Navigate to URL
-- `web_get_dom` - Get DOM tree
-- `web_click` - Click element by CSS selector
-- `web_type` - Type into element by CSS selector
-- `web_execute_js` - Execute JavaScript
+| Tool | Description |
+|---|---|
+| `execute_dsl` | **Primary tool.** Execute a batch of DSL steps: tap, type, swipe, observe, assertions, web automation, metrics, screen recording, and more. |
-### Low-Level
-- `http_request` - Make raw HTTP request to MobAI API
+### Test management
-## Available Resources
+Tests are `.mob` files on disk inside project directories. You read, write, and edit them directly using your assistant's filesystem tools — MobAI watches for changes and updates the UI live. MCP is only needed to discover projects and run tests.
-- `mobai://api-reference` - Complete API documentation
-- `mobai://dsl-guide` - DSL batch execution guide
-- `mobai://native-runner` - Native app automation guide
-- `mobai://web-runner` - Web automation guide
+| Tool | Description |
+|---|---|
+| `test_get_active` | Get the active test project directory and its `.mob` cases |
+| `test_list_projects` | List all known test project directories with their `.mob` cases |
+| `test_run` | Run a `.mob` test case on a device (`project_dir` + `case_path` + `device_id`) |
-## Example Usage
+## Resources
-### List devices and take screenshot
+Read these **before** attempting any device interaction — they describe the DSL schema, action set, predicates, failure strategies, and `.mob` syntax.
-```
-Use the list_devices tool to see connected devices.
-Then use get_screenshot with the device ID.
-```
+| URI | Purpose |
+|---|---|
+| `mobai://reference/device-automation` | How to control devices — guide, all DSL actions, predicates, and failure strategies |
+| `mobai://reference/testing` | Testing workflow, rules, error fixes, and `.mob` script syntax |
-### Automate Settings app
+## Example
-```
-Use execute_dsl with:
+Open the iOS Settings app, navigate to Wi-Fi, and verify the toggle exists:
+```json
 {
   "version": "0.2",
   "steps": [
     {"action": "open_app", "bundle_id": "com.apple.Preferences"},
-    {"action": "delay", "duration_ms": 1000},
-    {"action": "observe", "context": "native", "include": ["ui_tree"]},
-    {"action": "tap", "predicate": {"text_contains": "General"}}
+    {"action": "wait_for", "predicate": {"text": "Settings"}, "timeout_ms": 3000},
+    {"action": "tap", "predicate": {"text_contains": "Wi-Fi"}},
+    {"action": "wait_for", "predicate": {"type": "switch"}, "timeout_ms": 3000},
+    {"action": "assert_exists", "predicate": {"type": "switch"}},
+    {"action": "observe", "include": ["ui_tree"]}
   ]
 }
 ```
-### Run AI agent
-```
-Use run_agent with device_id and task: "Open Settings and enable WiFi"
-```
-## Comparison with Claude Code Plugin
-| Feature | Claude Code Plugin | MCP Server |
-|---------|-------------------|------------|
-| Platform | Claude Code only | Any MCP client |
-| Tools | http_request (generic) | Named tools + http_request |
-| Resources | Skills (markdown) | MCP resources |
-| Setup | Plugin install | npx |
-The MCP server provides the same functionality as the Claude Code plugin but works with any MCP-compatible AI tool.
+Pass this as the `commands` argument (a JSON string) to `execute_dsl` along with a `device_id` from `list_devices`.
 ## Troubleshooting
-### "Connection refused" error
-- Ensure MobAI desktop app is running
-- Check that API is available at http://127.0.0.1:8686
+**"Connection refused"** — Make sure the MobAI desktop app is running and the API is reachable at `http://127.0.0.1:8686`.
-### "Bridge not running" error
-- Use `start_bridge` tool first before automation
-- iOS bridge may take up to 60 seconds to start
+**"Bridge not running"** — Call `start_bridge` first. The iOS bridge can take up to a minute to come up.
-### Screenshots not visible
-- Screenshots are saved to `/tmp/mobai/screenshots/`
-- Use your AI tool's file reading capability to view them
+**Screenshots not visible** — `get_screenshot` saves to `/tmp/mobai/screenshots/` by default and returns the file path. Use your assistant's file-reading capability to view them. DSL `observe` screenshots are extracted from the response and saved to the same directory.
 ## Development
 ```bash
-# Clone the repository
 git clone https://github.com/MobAI-App/mobai-mcp.git
 cd mobai-mcp
-# Install dependencies
 npm install
-# Build
 npm run build
-# Run locally
 node dist/index.js
 ```
 ## License
-Apache 2.0 - see [LICENSE](LICENSE) for details.
+Apache 2.0 — see [LICENSE](LICENSE).
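
The new README says the DSL script is passed to `execute_dsl` as the `commands` argument (a JSON string) together with a `device_id`. A minimal sketch of building that argument object is below. The helper name `buildExecuteDslArgs` and the device ID value are hypothetical and are not part of the package; the argument names and step shapes are taken from the README example.

```javascript
// Hypothetical helper (not part of mobai-mcp): builds the arguments object
// for an execute_dsl tool call as the README describes it.
function buildExecuteDslArgs(deviceId, steps) {
  if (!deviceId) {
    throw new Error("device_id is required");
  }
  if (!Array.isArray(steps) || steps.length === 0) {
    throw new Error("steps must be a non-empty array");
  }
  // The README example uses DSL version "0.2" with a "steps" array.
  const script = { version: "0.2", steps };
  // `commands` is a JSON string, not a nested object.
  return { device_id: deviceId, commands: JSON.stringify(script) };
}

// "example-device-id" is a placeholder; a real ID would come from list_devices.
const dslArgs = buildExecuteDslArgs("example-device-id", [
  { action: "open_app", bundle_id: "com.apple.Preferences" },
  { action: "tap", predicate: { text_contains: "Wi-Fi" } },
  { action: "observe", include: ["ui_tree"] },
]);
console.log(dslArgs.commands);
```

Serializing once in a helper keeps the string-versus-object distinction in one place, which is an easy detail to get wrong when assembling tool calls by hand.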

package/dist/index.js CHANGED

@@ -107,8 +107,6 @@ async function doRequest(method, urlPath, payload, timeoutMs = DEFAULT_TIMEOUT_M
 const doGet = (p) => doRequest("GET", p);
 const doPost = (p, body) => doRequest("POST", p, body);
 const doDelete = (p) => doRequest("DELETE", p);
-const doPut = (p, body) => doRequest("PUT", p, body);
-const doPatch = (p, body) => doRequest("PATCH", p, body);
 function textResult(data) {
     return {
         content: [{ type: "text", text: typeof data === "string" ? data : JSON.stringify(data, null, 2) }],
@@ -127,8 +125,8 @@ function errResult(err) {
 const server = new Server({ name: "mobai", version: "1.0.0" }, {
     capabilities: { tools: {}, resources: {} },
     instructions: `MobAI controls Android and iOS devices. Before starting any device task, read the relevant MCP resources:
-- mobai://reference/device-automation — how to control devices
-- mobai://reference/testing — testing workflow, rules, and .mob script syntax
+- mobai://reference/device-automation — how to control devices (read before ANY device interaction)
+- mobai://reference/testing — .mob script syntax (read ONLY when user asks to create or fix test scripts)
 Check available skills in current work directory and load any relevant to the user's request.`,
 });
 // ---------------------------------------------------------------------------
@@ -171,7 +169,7 @@ const TOOLS = [
     // Screenshot
     {
         name: "get_screenshot",
-        description: "Capture a fast, low-quality screenshot for LLM visual analysis. Returns the file path to the saved image. Use this for AI/LLM processing only — for full-quality screenshots use save_screenshot instead.",
+        description: "Capture a fast, low-quality screenshot for LLM visual analysis. Returns the file path to the saved image. The image may be downscaled by an integer factor so its long edge stays ≤ 2000px; when that happens the response includes a scale factor — multiply any coordinates you read off the image by that factor before using them in device actions (tap, swipe, drag, long-press, etc.). UI tree coordinates are already in device pixels, do not scale those. Use this for AI/LLM processing only — for full-quality screenshots use save_screenshot instead.",
         inputSchema: {
             type: "object",
             properties: { device_id: { type: "string", description: "Device ID" } },
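
Reviewer note on the hunk above: the updated `get_screenshot` description asks the caller to multiply coordinates read off a downscaled image by the returned scale factor, and to leave UI tree coordinates alone. The arithmetic can be sketched as follows. The helper name `toDeviceCoords` is hypothetical, and how the scale factor is named in the response is not shown in this diff, so it is passed in as a plain number here.

```javascript
// Hypothetical helper (not part of mobai-mcp): converts a point read off a
// downscaled screenshot into device pixels using the reported scale factor.
function toDeviceCoords(point, scale = 1) {
  if (!Number.isFinite(scale) || scale <= 0) {
    throw new Error("scale must be a positive number");
  }
  // The description says the factor is an integer, so rounding only guards
  // against fractional input coordinates.
  return { x: Math.round(point.x * scale), y: Math.round(point.y * scale) };
}

// A point at (180, 420) on an image downscaled by a factor of 2.
console.log(toDeviceCoords({ x: 180, y: 420 }, 2)); // { x: 360, y: 840 }

// With no scale factor reported, coordinates pass through unchanged.
console.log(toDeviceCoords({ x: 180, y: 420 })); // { x: 180, y: 420 }
```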
@@ -191,6 +189,20 @@ const TOOLS = [
             required: ["device_id"],
         },
     },
+    // Debug launch
+    {
+        name: "debug_app",
+        description: "Launch an app in debug mode and write logs to a file. Returns the log file path — use Read/Grep to inspect logs. Use kill_app to stop.",
+        inputSchema: {
+            type: "object",
+            properties: {
+                device_id: { type: "string", description: "Device ID" },
+                bundle_id: { type: "string", description: "Bundle ID of the app to debug" },
+                log_path: { type: "string", description: "Directory for log file (supports ~/). Defaults to OS temp directory." },
+            },
+            required: ["device_id", "bundle_id"],
+        },
+    },
     // App management
     {
         name: "list_apps",
@@ -230,7 +242,7 @@ const TOOLS = [
         name: "execute_dsl",
         description: `Execute a batch of DSL commands on a device. This is the primary tool for all device interaction — tap, type, swipe, observe, launch apps, assertions, web automation, and more.
-Read the MCP resource mobai://reference/device-automation to learn how to control devices before using this tool.
+You MUST read the MCP resource mobai://reference/device-automation to learn how to control devices before using this tool.
 Input: JSON string with "version": "0.2" and "steps" array. Example:
 {"version":"0.2","steps":[
@@ -250,150 +262,25 @@ Input: JSON string with "version": "0.2" and "steps" array. Example:
     // Test management
     {
         name: "test_get_active",
-        description: "Get the currently active test project and its cases. Use this to discover which test cases are available.",
+        description: "Get the currently active test project directory and its .mob test cases. Use this to discover the project path and available tests. The agent can then read/write/create/delete .mob files directly in the returned directory.",
         inputSchema: { type: "object", properties: {}, required: [] },
     },
     {
         name: "test_list_projects",
-        description: "List all test projects with their test cases included inline",
+        description: "List all known test project directories with their .mob test cases. Each project is a directory containing .mob script files.",
         inputSchema: { type: "object", properties: {}, required: [] },
     },
-    {
-        name: "test_create_project",
-        description: "Create a new test project",
-        inputSchema: {
-            type: "object",
-            properties: { name: { type: "string", description: "Project name" } },
-            required: ["name"],
-        },
-    },
-    {
-        name: "test_rename_project",
-        description: "Rename an existing test project",
-        inputSchema: {
-            type: "object",
-            properties: {
-                project_id: { type: "string", description: "Project ID" },
-                name: { type: "string", description: "New project name" },
-            },
-            required: ["project_id", "name"],
-        },
-    },
-    {
-        name: "test_create_case",
-        description: "Create a new test case in a project",
-        inputSchema: {
-            type: "object",
-            properties: {
-                project_id: { type: "string", description: "Project ID" },
-                name: { type: "string", description: "Test case name" },
-                folder: { type: "string", description: "Optional folder path within the project" },
-            },
-            required: ["project_id", "name"],
-        },
-    },
-    {
-        name: "test_rename_case",
-        description: "Rename an existing test case",
-        inputSchema: {
-            type: "object",
-            properties: {
-                project_id: { type: "string", description: "Project ID" },
-                case_id: { type: "string", description: "Test case ID" },
-                name: { type: "string", description: "New test case name" },
-            },
-            required: ["project_id", "case_id", "name"],
-        },
-    },
-    {
-        name: "test_delete_case",
-        description: "Delete a test case from a project",
-        inputSchema: {
-            type: "object",
-            properties: {
-                project_id: { type: "string", description: "Project ID" },
-                case_id: { type: "string", description: "Test case ID" },
-            },
-            required: ["project_id", "case_id"],
-        },
-    },
-    {
-        name: "test_get_script",
-        description: "Get the .mob script content for a test case (with 1-based line numbers)",
-        inputSchema: {
-            type: "object",
-            properties: {
-                project_id: { type: "string", description: "Project ID" },
-                case_id: { type: "string", description: "Test case ID" },
-            },
-            required: ["project_id", "case_id"],
-        },
-    },
-    {
-        name: "test_replace_script",
-        description: "Replace the entire .mob script for a test case",
-        inputSchema: {
-            type: "object",
-            properties: {
-                project_id: { type: "string", description: "Project ID" },
-                case_id: { type: "string", description: "Test case ID" },
-                script: { type: "string", description: "New script content (without line numbers)" },
-            },
-            required: ["project_id", "case_id", "script"],
-        },
-    },
-    {
-        name: "test_update_line",
-        description: "Update a single line in the .mob script",
-        inputSchema: {
-            type: "object",
-            properties: {
-                project_id: { type: "string", description: "Project ID" },
-                case_id: { type: "string", description: "Test case ID" },
-                line_number: { type: "number", description: "1-based line number to update" },
-                content: { type: "string", description: "New line content" },
-            },
-            required: ["project_id", "case_id", "line_number", "content"],
-        },
-    },
-    {
-        name: "test_insert_after",
-        description: "Insert a new line after the specified line number in the .mob script",
-        inputSchema: {
-            type: "object",
-            properties: {
-                project_id: { type: "string", description: "Project ID" },
-                case_id: { type: "string", description: "Test case ID" },
-                line_number: { type: "number", description: "1-based line number to insert after (0 = insert at beginning)" },
-                content: { type: "string", description: "Line content to insert" },
-            },
-            required: ["project_id", "case_id", "line_number", "content"],
-        },
-    },
-    {
-        name: "test_delete_line",
-        description: "Delete a line from the .mob script",
-        inputSchema: {
-            type: "object",
-            properties: {
-                project_id: { type: "string", description: "Project ID" },
-                case_id: { type: "string", description: "Test case ID" },
-                line_number: { type: "number", description: "1-based line number to delete" },
-            },
-            required: ["project_id", "case_id", "line_number"],
-        },
-    },
     {
         name: "test_run",
-        description: "Run a test case on a device",
+        description: "Run a .mob test case on a device. The case_path is relative to the project directory.",
         inputSchema: {
             type: "object",
             properties: {
-                project_id: { type: "string", description: "Project ID" },
-                case_id: { type: "string", description: "Test case ID" },
+                project_dir: { type: "string", description: "Absolute path to the project directory" },
+                case_path: { type: "string", description: "Relative path to the .mob file within the project, e.g. auth/login.mob" },
                 device_id: { type: "string", description: "Device ID to run the test on" },
             },
-            required: ["project_id", "case_id", "device_id"],
+            required: ["project_dir", "case_path", "device_id"],
         },
     },
 ];
@@ -406,13 +293,6 @@ server.setRequestHandler(ListToolsRequestSchema, async () => {
 // ---------------------------------------------------------------------------
 // Tool call handler
 // ---------------------------------------------------------------------------
-function testCasePath(args) {
-    const projectId = args?.project_id;
-    const caseId = args?.case_id;
-    if (!projectId || !caseId)
-        throw new Error("project_id and case_id are required");
-    return `/tests/projects/${projectId}/cases/${caseId}`;
-}
 server.setRequestHandler(CallToolRequestSchema, async (request) => {
     const { name, arguments: args } = request.params;
     try {
@@ -442,6 +322,12 @@ server.setRequestHandler(CallToolRequestSchema, async (request) => {
                 return textResult(screenshotToFile(body));
             }
             // App management
+            case "debug_app": {
+                const body = { bundleId: args?.bundle_id };
+                if (args?.log_path)
+                    body.logPath = args.log_path;
+                return textResult(await doPost(`/devices/${args?.device_id}/debug/launch`, body));
+            }
             case "list_apps":
                 return textResult(await doGet(`/devices/${args?.device_id}/apps`));
             case "install_app":
@@ -468,56 +354,12 @@ server.setRequestHandler(CallToolRequestSchema, async (request) => {
                 return textResult(await doGet("/tests/active"));
             case "test_list_projects":
                 return textResult(await doGet("/tests/projects"));
-            case "test_create_project":
-                return textResult(await doPost("/tests/projects", { name: args?.name }));
-            case "test_rename_project":
-                return textResult(await doPatch(`/tests/projects/${args?.project_id}`, { name: args?.name }));
-            case "test_create_case": {
-                const body = { name: args?.name };
-                if (args?.folder)
-                    body.folder = args.folder;
-                return textResult(await doPost(`/tests/projects/${args?.project_id}/cases`, body));
-            }
-            case "test_rename_case": {
-                const p = testCasePath(args);
-                return textResult(await doPatch(p, { name: args?.name }));
-            }
-            case "test_delete_case": {
-                const p = testCasePath(args);
-                return textResult(await doDelete(p));
-            }
-            case "test_get_script": {
-                const p = testCasePath(args);
-                return textResult(await doGet(`${p}/script`));
-            }
-            case "test_replace_script": {
-                const p = testCasePath(args);
-                return textResult(await doPut(`${p}/script`, { script: args?.script }));
-            }
-            case "test_update_line": {
-                const p = testCasePath(args);
-                return textResult(await doPost(`${p}/script/update-line`, {
-                    line_number: args?.line_number,
-                    content: args?.content,
+            case "test_run":
+                return textResult(await doPost("/tests/cases/run", {
+                    project_dir: args?.project_dir,
+                    case_path: args?.case_path,
+                    device_id: args?.device_id,
                 }));
-            }
-            case "test_insert_after": {
-                const p = testCasePath(args);
-                return textResult(await doPost(`${p}/script/insert-after`, {
-                    line_number: args?.line_number,
-                    content: args?.content,
-                }));
-            }
-            case "test_delete_line": {
-                const p = testCasePath(args);
-                return textResult(await doPost(`${p}/script/delete-line`, {
-                    line_number: args?.line_number,
-                }));
-            }
-            case "test_run": {
-                const p = testCasePath(args);
-                return textResult(await doPost(`${p}/run`, { device_id: args?.device_id }));
-            }
             default:
                 return { content: [{ type: "text", text: `Unknown tool: ${name}` }], isError: true };
         }
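
Reviewer note: in 2.2.0 `test_run` no longer takes `project_id` and `case_id`. It posts `project_dir`, `case_path`, and `device_id` to `/tests/cases/run`. A small sketch of assembling and validating that request body is below. The helper name `buildTestRunBody`, the project path, and the device ID are hypothetical; the field names and the `auth/login.mob` example path come from the diff.

```javascript
// Hypothetical helper (not part of mobai-mcp): builds the body that the
// 2.2.0 test_run handler posts to /tests/cases/run.
function buildTestRunBody(args) {
  const required = ["project_dir", "case_path", "device_id"];
  const missing = required.filter((key) => !args || !args[key]);
  if (missing.length > 0) {
    throw new Error(`missing required argument(s): ${missing.join(", ")}`);
  }
  return {
    project_dir: args.project_dir, // absolute path to the project directory
    case_path: args.case_path,     // path relative to project_dir
    device_id: args.device_id,
  };
}

// Placeholder values for illustration only.
const testRunBody = buildTestRunBody({
  project_dir: "/path/to/project",
  case_path: "auth/login.mob",
  device_id: "example-device-id",
});
console.log(JSON.stringify(testRunBody));
```

Note that the handler in the diff forwards the arguments without this check; the validation here is an addition for illustration.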

package/dist/resources.js CHANGED

@@ -11,6 +11,12 @@ export const RESOURCES = [
         description: "Testing workflow, rules, error fixes, and .mob script syntax for test generation",
         mimeType: "text/plain",
     },
+    {
+        uri: "mobai://claude-code-preview",
+        name: "Claude Code Preview Setup",
+        description: "How to preview a MobAI device's control UI inside Claude Code's preview panel",
+        mimeType: "text/plain",
+    },
 ];
 export function getResourceContent(uri) {
     switch (uri) {
@@ -18,13 +24,43 @@ export function getResourceContent(uri) {
             return DEVICE_AUTOMATION_REF;
         case "mobai://reference/testing":
             return TESTING_REF;
+        case "mobai://claude-code-preview":
+            return CLAUDE_CODE_PREVIEW;
         default:
             return null;
     }
 }
-// ---------------------------------------------------------------------------
-// Resource content — copied verbatim from Go resources.go
-// ---------------------------------------------------------------------------
+const CLAUDE_CODE_PREVIEW = `<claude-code-preview>
+Prerequisite: the MobAI desktop app must be running. It owns the
+localhost 8787 web server the preview panel will render.
+1. Call list_devices and grab the device's id and controlUrl.
+2. Write .claude/launch.json at the project root (or, inside a git
+   worktree, at the worktree root):
+   {
+     "version": "0.0.1",
+     "configurations": [{
+       "name": "MobAI — <device name>",
+       "runtimeExecutable": "sleep",
+       "runtimeArgs": ["86400"],
+       "port": 8787,
+       "url": "<controlUrl>"
+     }]
+   }
+   - runtimeExecutable + runtimeArgs is a no-op lifetime anchor for
+     Claude Code's panel; the real server is MobAI.
+   - port is the localhost port Claude Code binds the preview to;
+     always 8787 for MobAI.
+   - url is the device-specific URL (controlUrl from step 1) that the
+     panel actually displays.
+3. Call the mcp__Claude_Preview__preview_start tool with the "name"
+   from the configuration above.
+</claude-code-preview>
+`;
 const DEVICE_AUTOMATION_REF = `<device-automation-reference>
 <guide>
@@ -54,15 +90,23 @@ const DEVICE_AUTOMATION_REF = `<device-automation-reference>
   </ocr-fallback>
   <execution-modes>
-    Default (explore mode): non-last observe actions are skipped — only final observe executes. Use "mode": "deterministic" when you need every observe to execute (observe → act → observe → act → observe).
-    Example: {"version": "0.2", "mode": "deterministic", "steps": [{"action": "observe", "include": ["ui_tree"]}, {"action": "tap", "predicate": {"text": "Next"}}, {"action": "observe", "include": ["ui_tree"]}]}
+    Default (explore mode): only the last observe in a script runs — earlier observes are skipped. This is the right mode for the typical pattern: actions first, then one observe at the end to see the result. If a step fails, the error includes a debug UI tree so you don't need a separate observe.
+    Deterministic mode: every observe runs. Use only when you need to capture screen state between actions within a single script (rare — prefer separate execute_dsl calls so you can reason between steps).
   </execution-modes>
   <workflow>Observe screen → plan → act via execute_dsl → verify (end script with wait_for stable + observe) → repeat until done.</workflow>
+  <per-app-skills>
+    Before working with a known app, check ~/.claude/skills/ for a skill matching its bundle id or name (e.g. com-instagram-android, uber) and load it — it may already encode selectors, flows, and quirks learned on a prior run.
+    When you discover app-specific gotchas that would cost future sessions time — unstable selectors that only work with a specific predicate, hidden taps, flows that need an extra wait_for, React Native / Flutter screens that need OCR, dialogs that hijack input — create or update a skill at ~/.claude/skills/&lt;app-slug&gt;/SKILL.md capturing the finding. Keep each skill short: the specific quirk, the selector/flow that works, and one sentence on why the obvious approach fails. Do not write generic mobile-automation advice there — that belongs in this reference.
+    Also save reusable multi-step flows as labeled mobai CLI command sequences inside the same SKILL.md. When you confirm a flow works (login, dismiss onboarding, open-settings-and-toggle-X, checkout), add a section with a heading like "## Flow: login" and a fenced shell code block of "mobai ..." commands in order — one per step. Mark variable inputs with placeholders (&lt;EMAIL&gt;, &lt;OTP_CODE&gt;) so future sessions know what to substitute. On next run, replay the commands (shell them out or translate to execute_dsl) with placeholders substituted — this avoids re-deriving the flow from scratch. Shell commands are saved (not JSON DSL) because the MobAI CLI does not execute DSL JSON blobs, and shell commands stay replayable from either CLI or MCP sessions. If a snippet breaks because the app changed, update it in place.
+  </per-app-skills>
   <screenshot-tools>
     get_screenshot — fast low-quality image for LLM visual analysis.
     save_screenshot — full-quality PNG for reporting, debugging, or sharing.
+    To verify animations and UI transitions, use record_start/record_stop.
   </screenshot-tools>
   <infinite-scrolling>To collect data from infinite-scrolling views (feeds, search results), scroll to load a batch first, then observe with only_visible:false to get all loaded items in one go.</infinite-scrolling>
@@ -79,14 +123,13 @@ const DEVICE_AUTOMATION_REF = `<device-automation-reference>
   <target-element>{"predicate": Predicate}</target-element>
   <predicate context="native">
-    <note>Prefer text_contains or text_regex over text (exact match) — UI text often changes with state, locale, or dynamic content. Exact match breaks easily. Prefer text fields over label fields — text is what the user sees on screen and is more reliable.</note>
+    <note>Prefer text_contains or text_regex over text (exact match) — UI text often changes with state, locale, or dynamic content. Exact match breaks easily.</note>
     <field name="text" type="string">Exact match — use only when the full text is short, static, and unique</field>
     <field name="text_contains" type="string">Substring, case-insensitive — preferred for most matching</field>
     <field name="text_starts_with" type="string">Prefix match</field>
     <field name="text_regex" type="string">Regex pattern — use for dynamic text (numbers, dates, counts)</field>
     <field name="type" type="string">button, input, switch, text, image, cell, scrollview</field>
-    <field name="label" type="string">Accessibility label (exact) — use only when text fields are empty</field>
-    <field name="label_contains" type="string">Accessibility label (partial) — use only when text fields are empty</field>
+    <field name="accessibility_id" type="string">Exact match on the #id shown in UI tree (without the # prefix)</field>
     <field name="enabled" type="bool">Enabled state</field>
     <field name="visible" type="bool">Visible state</field>
     <field name="selected" type="bool">Selected state</field>
@@ -165,9 +208,10 @@ const DEVICE_AUTOMATION_REF = `<device-automation-reference>
     Direction = semantic (where to look), not finger movement.
     <field name="direction" required="yes">down (look below), up (look above)</field>
     <field name="to_element" type="TargetElement"/>
-    <field name="max_scrolls" type="int"/>
+    <field name="max_scrolls" type="int" default="10"/>
     <field name="amount">small, page, full</field>
     <example>{"action": "scroll", "direction": "down", "to_element": {"predicate": {"text": "Privacy"}}, "max_scrolls": 10}</example>
+    <note>scroll with to_element returns "reached end of scrollable content" if the list ends before the element is found. If it returns "element not found after scrolling" instead, the list has more content — increase max_scrolls or call scroll again to continue searching.</note>
   </action>
   <action name="drag">
@@ -190,7 +234,7 @@ const DEVICE_AUTOMATION_REF = `<device-automation-reference>
   <action name="toggle">
     <field name="predicate" required="yes"/>
-    <field name="state" required="yes">on or off</field>
+    <field name="state" required="no">Desired state: "on" or "off". If omitted, always toggles. If set, skips when already correct.</field>
     <example>{"action": "toggle", "predicate": {"type": "switch", "text_contains": "Wi-Fi"}, "state": "on"}</example>
   </action>
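The scroll and toggle hunks above change two defaults: `max_scrolls` now documents a default of 10, and `state` on `toggle` becomes optional. A minimal sketch of how a client might build these actions and react to the two scroll messages is shown here. The types and the helpers `scrollTo`, `toggle`, and `classifyScrollResult` are hypothetical and not part of mobai-mcp; only the field names and the quoted result strings come from the reference text, and how the result message is actually delivered to the caller is an assumption.

```typescript
// Illustrative sketch only: these types and helpers are hypothetical and are not
// part of mobai-mcp. Field names are copied from the reference text in the diff.
type Predicate = {
  text?: string;
  text_contains?: string;
  text_regex?: string;
  type?: string;
  accessibility_id?: string; // added in 2.2.0 where label / label_contains were removed
};

type ScrollAction = {
  action: "scroll";
  direction: "down" | "up";
  to_element: { predicate: Predicate };
  max_scrolls?: number; // the 2.2.0 reference documents a default of 10
};

type ToggleAction = {
  action: "toggle";
  predicate: Predicate;
  state?: "on" | "off"; // optional in 2.2.0: omitted means "always toggle"
};

// Build a scroll-to-element step; max_scrolls is left out unless the caller sets it.
function scrollTo(predicate: Predicate, maxScrolls?: number): ScrollAction {
  const step: ScrollAction = { action: "scroll", direction: "down", to_element: { predicate } };
  if (maxScrolls !== undefined) step.max_scrolls = maxScrolls;
  return step;
}

// Build a toggle step; state is left out unless a desired state is given.
function toggle(predicate: Predicate, state?: "on" | "off"): ToggleAction {
  const step: ToggleAction = { action: "toggle", predicate };
  if (state !== undefined) step.state = state;
  return step;
}

// Assumption: the two strings quoted in the scroll note appear verbatim
// somewhere in the message the caller receives.
type ScrollOutcome = "list_exhausted" | "keep_searching" | "other";

function classifyScrollResult(message: string): ScrollOutcome {
  if (message.includes("reached end of scrollable content")) return "list_exhausted";
  if (message.includes("element not found after scrolling")) return "keep_searching";
  return "other";
}
```

On `keep_searching`, the note suggests either raising `max_scrolls` or issuing the same scroll again; on `list_exhausted`, further scrolling in that direction should not help.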
@@ -342,6 +386,7 @@ const DEVICE_AUTOMATION_REF = `<device-automation-reference>
   <action name="record_stop">
     <field name="file_path">Override output directory</field>
     <returns>recording_path, frame_count, transition_hints (anomalies: jump/flash/stutter/incoherent_motion with from_frame, to_frame, type, delta_percent, region, message)</returns>
+    <note>transition_hints contains anomalous frame pairs (from_frame, to_frame). If transition_hints is empty, do not read any frames. If not empty, read only the flagged frame pairs. Read additional frames only if strictly necessary to investigate a flagged anomaly.</note>
   </action>
 </screen-recording>
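The `record_stop` note added above amounts to a small selection rule: read no frames when `transition_hints` is empty, otherwise read only the flagged pairs. A minimal sketch of that rule follows. The `TransitionHint` type and the `framesToRead` helper are hypothetical, and treating `from_frame` and `to_frame` as numeric indices is an assumption; the diff does not state their type.

```typescript
// Hypothetical shape for one entry of transition_hints; only the field names
// come from the diff. Numeric frame indices are an assumption.
type TransitionHint = {
  from_frame: number;
  to_frame: number;
  type: string; // e.g. "jump", "flash", "stutter", "incoherent_motion"
  message?: string;
};

// Return the distinct frame indices worth reading, in ascending order.
// An empty hint list yields an empty result: no frames should be read.
function framesToRead(hints: TransitionHint[]): number[] {
  const frames = new Set<number>();
  for (const hint of hints) {
    frames.add(hint.from_frame);
    frames.add(hint.to_frame);
  }
  return Array.from(frames).sort((a, b) => a - b);
}
```

Deduplicating matters when anomalies are adjacent, since one frame can close one flagged pair and open the next.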
@@ -351,8 +396,15 @@ const TESTING_REF = `<testing-reference>
 <important>Read mobai://reference/device-automation to learn how to control devices before interacting with them.</important>
+<file-model>
+  Tests are .mob files on disk inside project directories. You work with them directly:
+  - Use test_list_projects to discover project directories and their .mob files
+  - Read .mob files directly from the project directory using filesystem tools
+  - Create, edit, rename, and delete .mob files directly — MobAI watches for changes and updates the UI live
+  - Use test_run to execute a test on a device — this is the only operation that requires MCP
+</file-model>
 <rules>
-  <rule>Test scripts are ONLY accessible via MCP test_* tools. There are NO .mob files on disk. Do NOT use grep, find, cat, or any filesystem commands to look for scripts.</rule>
   <rule>Never ask the user for information you can get yourself — use observe, list_apps, get_ui_tree.</rule>
   <rule>Always add wait_for before every element interaction (tap, type, toggle, long_press, double_tap, drag). Exception: the element was asserted on the immediately preceding line.</rule>
   <rule>Always use predicates over coordinates — predicates survive layout changes.</rule>
@@ -361,22 +413,21 @@ const TESTING_REF = `<testing-reference>
 </rules>
 <workflow-create>
-  1. Observe the current screen
-  2. Plan the test steps from the user's description
-  3. Execute each action via DSL — add wait_for before every element interaction
-  4. Assert after key actions — verify expected state with assert_exists/assert_not_exists
-  5. Output the full script using MCP test tools
-  6. Verify — run the full script end-to-end
-  7. Fix — if steps fail, observe the screen, fix the failing lines
-  8. Re-run to verify fixes (max 3 retry cycles)
+  1. Call test_list_projects to find the project directory and existing tests
+  2. Observe the current screen on the device
+  3. Plan the test steps from the user's description
+  4. Write the .mob file directly to the project directory
+  5. Run the test with test_run
+  6. Fix — if steps fail, read the error, observe the screen, edit the .mob file
+  7. Re-run to verify fixes (max 3 retry cycles)
 </workflow-create>
 <workflow-fix>
-  1. Read the current script
+  1. Read the .mob file from disk
   2. Analyze the error messages — they reference exact line numbers
-  3. Reproduce — run the failing line individually via DSL to observe device state
-  4. Fix — update, insert, or delete lines as needed
-  5. Verify — re-run the test
+  3. Reproduce — run a failing action via DSL to observe device state
+  4. Edit the .mob file directly
+  5. Re-run with test_run
 </workflow-fix>
 <error-fixes>
@@ -405,10 +456,9 @@ const TESTING_REF = `<testing-reference>
 <verification>
   Check before every response:
-  1. Did you use MCP tools for all script mutations? (bare .mob lines in text are silently ignored)
-  2. Does every element interaction have a wait_for on the preceding line?
-  3. Are predicates used instead of coordinates wherever possible?
-  4. Did you observe the screen before acting?
+  1. Does every element interaction have a wait_for on the preceding line?
+  2. Are predicates used instead of coordinates wherever possible?
+  3. Did you observe the screen before acting?
 </verification>
 <mob-script-syntax>
@@ -452,6 +502,14 @@ const TESTING_REF = `<testing-reference>
     delay 1000                          — wait N ms
     press_key home|back|enter           — hardware key
     navigate back|home                  — navigation shortcut
+    two_finger_tap "Map"                — two-finger tap
+    pinch "Map" scale:0.5               — pinch (scale <1 = zoom out, >1 = zoom in)
+    pinch "Photo" scale:2.0             — pinch to zoom in
+    hide_keyboard                       — dismiss keyboard
+    copy_text "Field"                   — copy text from element
+    paste_text "Field"                  — paste clipboard into element
+    set_location 40.7128,-74.0060       — simulate GPS location (lat,lon)
+    reset_location                      — stop location simulation
     observe                             — observe screen
     screenshot "path.png"               — take screenshot
   </actions>

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "mobai-mcp",
-  "version": "2.0.0",
+  "version": "2.2.0",
   "mcpName": "io.github.MobAI-App/mobai-mcp",
   "description": "MCP server for MobAI - AI-powered mobile device automation",
   "type": "module",

package/server.json CHANGED

@@ -6,12 +6,12 @@
     "url": "https://github.com/MobAI-App/mobai-mcp",
     "source": "github"
   },
-  "version": "2.0.0",
+  "version": "2.2.0",
   "packages": [
     {
       "registryType": "npm",
       "identifier": "mobai-mcp",
-      "version": "2.0.0",
+      "version": "2.2.0",
       "transport": {
         "type": "stdio"
       }
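The two manifest hunks above bump three version fields together: `version` in package.json, the top-level `version` in server.json, and the `version` of the npm entry under `packages`. A release script could check that they agree; the sketch below is one possible approach. The `versionMismatches` helper and both types are hypothetical and cover only the fields visible in the diff.

```typescript
// Hypothetical minimal shapes, limited to fields visible in the diff.
type PackageJson = { name: string; version: string };
type ServerJson = {
  version: string;
  packages: { registryType: string; identifier: string; version: string }[];
};

// Return a human-readable problem for every version field that disagrees
// with package.json; an empty array means the manifests are in sync.
function versionMismatches(pkg: PackageJson, server: ServerJson): string[] {
  const problems: string[] = [];
  if (server.version !== pkg.version) {
    problems.push(`server.json version is ${server.version}, expected ${pkg.version}`);
  }
  for (const entry of server.packages) {
    if (entry.identifier === pkg.name && entry.version !== pkg.version) {
      problems.push(`packages entry ${entry.identifier} is ${entry.version}, expected ${pkg.version}`);
    }
  }
  return problems;
}
```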