native-devtools-mcp 0.5.0 → 0.5.1
- package/README.md +37 -11
- package/package.json +3 -3
package/README.md
CHANGED
@@ -1,22 +1,33 @@
 # native-devtools-mcp
 
-
+`native-devtools-mcp` is a Model Context Protocol (MCP) server for computer use on macOS, Windows, and Android. It gives AI agents and MCP clients direct control over native desktop apps and Android devices through screenshots, OCR, accessibility-based text lookup, input simulation, window management, and ADB.
 
-
-
-
-
+Use it when browser-only automation is not enough: Electron apps, system dialogs, desktop tools, native app testing, and Android device workflows. It works with [Claude Desktop](https://claude.ai/download), [Claude Code](https://docs.anthropic.com/en/docs/claude-code), [Cursor](https://cursor.com), and other MCP-compatible clients.
 
-
+Useful for MCP-based computer use, desktop automation, UI automation, native app testing, e2e testing, RPA, screen reading, mouse and keyboard control, and Android device automation.
 
-
+```bash
+npx -y native-devtools-mcp
+```
 
-**
+**Core capabilities**
+- Screenshots, OCR, and accessibility-first `find_text`
+- `click`, `type_text`, `scroll`, `launch_app`, `quit_app`, and window management
+- `element_at_point` for inspecting accessible UI elements at screen coordinates
+- `load_image` + `find_image` for non-text UI elements such as icons and custom controls
+- Android screenshots, text lookup, input, and app control over ADB
+- Local execution: screenshots and input stay on the machine
 
-
+**For AI agents:** Read [`AGENTS.md`](./AGENTS.md) for tool definitions, workflow patterns, and machine-readable usage guidance.
 
-[
+
+
+
+
+
+[Features](#-features) • [Installation](#-installation) • [Getting Started](#-getting-started) • [Recipes](#-recipes-and-examples) • [Security & Trust](#-security--trust) • [For AI Agents](#-for-ai-agents-llms) • [Android](#-android-support)
 
+<div align="center">
 <table>
 <tr>
 <td align="center"><strong>macOS</strong></td>
@@ -27,7 +38,6 @@ A Model Context Protocol (MCP) server that provides **Computer Use** capabilitie
 <td><img src="windows-demo-1.gif" width="450" alt="Windows Demo"></td>
 </tr>
 </table>
-
 </div>
 
 ---
@@ -40,6 +50,7 @@ A Model Context Protocol (MCP) server that provides **Computer Use** capabilitie
 - **🧩 Template Matching:** Find non-text UI elements (icons, shapes) using `load_image` + `find_image`, returning precise click coordinates.
 - **🔒 Local & Private:** 100% local execution. No screenshots or data are ever sent to external servers.
 - **📱 Android Support:** Connect to Android devices over ADB for screenshots, input simulation, UI element search, and app management — all from the same MCP server.
+- **🔍 Hover Tracking:** Track cursor hover transitions across UI elements in real-time. Configurable dwell threshold filters pass-through noise — designed for LLMs observing user navigation patterns. macOS only.
 - **🔌 Dual-Mode Interaction:**
 1. **Visual/Native:** Works with *any* app via screenshots & coordinates (Universal).
 2. **AppDebugKit:** Deep integration for supported apps to inspect the UI tree (DOM-like structure).
@@ -56,6 +67,8 @@ This MCP server is designed to be **highly discoverable and usable** by AI model
 3. `find_text`: A shortcut to find text on screen and get its coordinates immediately. Uses the platform **accessibility API** (macOS Accessibility / Windows UI Automation) for precise element-level matching, with OCR fallback.
 4. `element_at_point`: Inspect the accessibility element at given screen coordinates — returns name, role, label, value, bounds, pid, and app_name. Note: privacy-focused Electron apps (e.g. Signal) may restrict their AX tree, returning only a container — use `take_screenshot` with OCR as a fallback.
 5. `load_image` / `find_image`: Template matching for non-text UI elements (icons, shapes), returning screen coordinates for clicking.
+6. `start_hover_tracking` / `get_hover_events` / `stop_hover_tracking`: Track cursor hover transitions across UI elements. Configurable dwell threshold filters pass-throughs. macOS only.
+7. `launch_app` / `quit_app`: Launch apps with optional CLI args, or gracefully/forcefully quit them.
 
 ## 📦 Installation
 
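The `find_text` entry in the hunk above describes an accessibility-first lookup with an OCR fallback. The ordering can be sketched in Python as follows. This is an illustration only, not the server's implementation: `Match`, `find_text_with_fallback`, and the two stub backends are hypothetical names standing in for the platform accessibility API and the OCR pass.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

Point = Tuple[int, int]
Lookup = Callable[[str], Optional[Point]]  # hypothetical backend signature


@dataclass
class Match:
    text: str
    x: int
    y: int
    source: str  # "accessibility" or "ocr"


def find_text_with_fallback(query: str, accessibility: Lookup, ocr: Lookup) -> Optional[Match]:
    """Try the accessibility backend first; use OCR only if it finds nothing."""
    point = accessibility(query)
    if point is not None:
        return Match(query, point[0], point[1], "accessibility")
    point = ocr(query)
    if point is not None:
        return Match(query, point[0], point[1], "ocr")
    return None


# Stub backends for demonstration only; real backends would query the OS.
def fake_accessibility(query: str) -> Optional[Point]:
    return {"Save": (120, 40)}.get(query)


def fake_ocr(query: str) -> Optional[Point]:
    return {"Cancel": (200, 40)}.get(query)


if __name__ == "__main__":
    for label in ("Save", "Cancel", "Missing"):
        print(label, find_text_with_fallback(label, fake_accessibility, fake_ocr))
```

Recording which backend produced the match lets a caller treat OCR coordinates with more caution, for example when an app restricts its accessibility tree.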
@@ -119,6 +132,18 @@ Then restart your MCP client and you're ready to go.
 > { "permissions": { "allow": ["mcp__native-devtools__*"] } }
 > ```
 
+## 📚 Recipes and Examples
+
+- [Recipes and Examples Index](./examples/README.md)
+- [Claude Desktop Setup](./examples/claude-desktop-setup.md)
+- [Claude Code Setup](./examples/claude-code-setup.md)
+- [Cursor Setup](./examples/cursor-setup.md)
+- [End-to-End Desktop Flow](./examples/end-to-end-desktop-flow.md)
+- [Native App Click Flow](./examples/native-app-click-flow.md)
+- [OCR Fallback and Element Inspection](./examples/ocr-fallback-and-element-inspection.md)
+- [Template Matching Flow](./examples/template-matching-flow.md)
+- [Android Quickstart](./examples/android-quickstart.md)
+
 <details>
 <summary><strong>Manual configuration (without setup)</strong></summary>
 
@@ -319,6 +344,7 @@ graph TD
 | | Input | `CGEvent` (CoreGraphics) |
 | | Text Search (`find_text`) | `Accessibility API` (primary), Vision OCR (fallback) |
 | | Element Inspection (`element_at_point`) | `AXUIElementCopyElementAtPosition` + AX tree walk fallback (Accessibility API) |
+| | Hover Tracking (`start_hover_tracking`) | `CGEvent` cursor + Accessibility API polling (macOS only) |
 | | OCR | `VNRecognizeTextRequest` (Vision Framework) |
 | **Windows** | Screenshots | `BitBlt` (GDI) |
 | | Input | `SendInput` (Win32) |
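The hover-tracking rows added in this file mention polling and a configurable dwell threshold that filters pass-throughs. One plausible reading is sketched below in Python, under stated assumptions: the sample format (timestamp in milliseconds plus an element identifier), the function name `filter_hovers`, and the rule "report an element once the cursor has stayed on it for at least the threshold" are all hypothetical and are not taken from the package.

```python
from typing import Iterable, List, Optional, Tuple

Sample = Tuple[int, Optional[str]]  # (timestamp_ms, element id under cursor)


def filter_hovers(samples: Iterable[Sample], dwell_ms: int) -> List[Tuple[int, str]]:
    """Return (entered_at_ms, element) for elements hovered at least dwell_ms."""
    events: List[Tuple[int, str]] = []
    current: Optional[str] = None  # element currently under the cursor
    entered_at = 0                 # when the cursor entered `current`
    reported = False               # whether `current` was already emitted
    for timestamp, element in samples:
        if element != current:
            # Cursor moved to a different element: restart the dwell timer.
            current, entered_at, reported = element, timestamp, False
        if not reported and current is not None and timestamp - entered_at >= dwell_ms:
            events.append((entered_at, current))
            reported = True
    return events


if __name__ == "__main__":
    polled = [(0, "File"), (50, "Edit"), (100, "View"),
              (200, "View"), (450, "View"), (500, "Help")]
    # "File" and "Edit" are crossed quickly; only "View" is held long enough.
    print(filter_hovers(polled, dwell_ms=300))
```

With a threshold of 300 ms, the menu items the cursor merely crosses are dropped, which is the kind of pass-through noise the README text refers to.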
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "native-devtools-mcp",
-"version": "0.5.
+"version": "0.5.1",
 "mcpName": "io.github.sh3ll3x3c/native-devtools",
 "description": "MCP server for native app testing — screenshot, OCR, click, type, find_text, template matching. macOS, Windows & Android.",
 "license": "MIT",
@@ -53,8 +53,8 @@
 "bin"
 ],
 "optionalDependencies": {
-"@sh3ll3x3c/native-devtools-mcp-darwin-arm64": "0.5.
-"@sh3ll3x3c/native-devtools-mcp-win32-x64": "0.5.
+"@sh3ll3x3c/native-devtools-mcp-darwin-arm64": "0.5.1",
+"@sh3ll3x3c/native-devtools-mcp-win32-x64": "0.5.1"
 },
 "engines": {
 "node": ">=18"
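The `package.json` hunks above bump the package version and both platform-specific optional dependencies to the same value. A reviewer who wants to confirm that lockstep could use a check along these lines. It is a sketch, not part of the package: `mismatched_optional_deps` is a hypothetical helper, and the embedded manifest contains only the fields visible in this diff.

```python
import json

# Only the fields shown in the diff; the real manifest has more keys.
MANIFEST = json.loads("""
{
  "name": "native-devtools-mcp",
  "version": "0.5.1",
  "optionalDependencies": {
    "@sh3ll3x3c/native-devtools-mcp-darwin-arm64": "0.5.1",
    "@sh3ll3x3c/native-devtools-mcp-win32-x64": "0.5.1"
  }
}
""")


def mismatched_optional_deps(manifest: dict) -> dict:
    """Return optional dependencies whose pinned version differs from the package version."""
    version = manifest["version"]
    deps = manifest.get("optionalDependencies", {})
    return {name: pinned for name, pinned in deps.items() if pinned != version}


if __name__ == "__main__":
    stale = mismatched_optional_deps(MANIFEST)
    print("all platform packages in sync" if not stale else stale)
```

An empty result means every platform package is pinned to the wrapper's own version, which matches what the 0.5.1 hunks show.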