native-devtools-mcp 0.5.1 → 0.6.0
- package/README.md +9 -4
- package/package.json +3 -3
package/README.md
CHANGED
@@ -50,7 +50,7 @@ npx -y native-devtools-mcp
 - **🧩 Template Matching:** Find non-text UI elements (icons, shapes) using `load_image` + `find_image`, returning precise click coordinates.
 - **🔒 Local & Private:** 100% local execution. No screenshots or data are ever sent to external servers.
 - **📱 Android Support:** Connect to Android devices over ADB for screenshots, input simulation, UI element search, and app management — all from the same MCP server.
-- **🔍 Hover Tracking:** Track cursor hover transitions across UI elements in real-time. Configurable dwell threshold filters pass-through noise — designed for LLMs observing user navigation patterns.
+- **🔍 Hover Tracking:** Track cursor hover transitions across UI elements in real-time. Configurable dwell threshold filters pass-through noise — designed for LLMs observing user navigation patterns.
 - **🔌 Dual-Mode Interaction:**
 1. **Visual/Native:** Works with *any* app via screenshots & coordinates (Universal).
 2. **AppDebugKit:** Deep integration for supported apps to inspect the UI tree (DOM-like structure).
@@ -67,8 +67,9 @@ This MCP server is designed to be **highly discoverable and usable** by AI model
 3. `find_text`: A shortcut to find text on screen and get its coordinates immediately. Uses the platform **accessibility API** (macOS Accessibility / Windows UI Automation) for precise element-level matching, with OCR fallback.
 4. `element_at_point`: Inspect the accessibility element at given screen coordinates — returns name, role, label, value, bounds, pid, and app_name. Note: privacy-focused Electron apps (e.g. Signal) may restrict their AX tree, returning only a container — use `take_screenshot` with OCR as a fallback.
 5. `load_image` / `find_image`: Template matching for non-text UI elements (icons, shapes), returning screen coordinates for clicking.
-6. `start_hover_tracking` / `get_hover_events` / `stop_hover_tracking`: Track cursor hover transitions across UI elements. Configurable dwell threshold filters pass-throughs.
-7. `
+6. `start_hover_tracking` / `get_hover_events` / `stop_hover_tracking`: Track cursor hover transitions across UI elements. Configurable dwell threshold filters pass-throughs.
+7. `start_recording` / `stop_recording`: Record the frontmost app's window at ~5fps as timestamped JPEG frames. Automatically follows app switches.
+8. `launch_app` / `quit_app`: Launch apps with optional CLI args, or gracefully/forcefully quit them.

 ## 📦 Installation

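The tools named in this hunk are invoked through the Model Context Protocol's `tools/call` JSON-RPC method. As a minimal sketch, the Python below builds such requests for `start_recording` and `stop_recording`. The `tool_call` helper is hypothetical, and the empty `arguments` objects are an assumption: the diff does not show either tool's parameter schema, so consult the server's `tools/list` response before passing arguments.

```python
import json
from itertools import count
from typing import Optional

# Monotonic JSON-RPC request ids; any unique id scheme would do.
_ids = count(1)


def tool_call(name: str, arguments: Optional[dict] = None) -> dict:
    """Build an MCP `tools/call` request (JSON-RPC 2.0) for the given tool.

    `tool_call` is a hypothetical helper, not part of native-devtools-mcp.
    """
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# Tool names come from the README; the empty arguments are an assumption.
start = tool_call("start_recording")
stop = tool_call("stop_recording")

if __name__ == "__main__":
    # The MCP stdio transport sends each message as one line of JSON.
    print(json.dumps(start))
    print(json.dumps(stop))
```

A real session would first complete the MCP `initialize` handshake; the sketch covers only the shape of the call itself.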
@@ -344,12 +345,15 @@ graph TD
 | | Input | `CGEvent` (CoreGraphics) |
 | | Text Search (`find_text`) | `Accessibility API` (primary), Vision OCR (fallback) |
 | | Element Inspection (`element_at_point`) | `AXUIElementCopyElementAtPosition` + AX tree walk fallback (Accessibility API) |
-| | Hover Tracking (`start_hover_tracking`) | `CGEvent` cursor + Accessibility API polling
+| | Hover Tracking (`start_hover_tracking`) | `CGEvent` cursor + Accessibility API polling |
+| | Screen Recording (`start_recording`) | `CGWindowListCreateImage` at configurable fps |
 | | OCR | `VNRecognizeTextRequest` (Vision Framework) |
 | **Windows** | Screenshots | `BitBlt` (GDI) |
 | | Input | `SendInput` (Win32) |
 | | Text Search (`find_text`) | `UI Automation` (primary), WinRT OCR (fallback) |
 | | Element Inspection (`element_at_point`) | `IUIAutomation::ElementFromPoint` (UI Automation) |
+| | Hover Tracking (`start_hover_tracking`) | `GetCursorPos` + UI Automation polling |
+| | Screen Recording (`start_recording`) | `BitBlt` (GDI) at configurable fps |
 | | OCR | `Windows.Media.Ocr` (WinRT) |
 | **Android** | Screenshots | `screencap` / ADB framebuffer |
 | | Input | `adb shell input` (tap, swipe, text, keyevent) |
@@ -390,6 +394,7 @@ Works out of the box on **Windows 10/11**.
 * `find_text` uses **UI Automation (UIA)** as the primary search mechanism, querying the accessibility tree for element names. This is the same accessibility-first approach used on macOS (with the Accessibility API). Falls back to OCR automatically when UIA finds no matches.
 * OCR uses the built-in Windows Media OCR engine (offline).
 * **Note:** Cannot interact with "Run as Administrator" windows unless the MCP server itself is also running as Administrator.
+* **Screen Recording Performance:** Screen recording uses GDI/BitBlt at configurable fps (default 5). For higher fps requirements or game capture scenarios, DXGI Desktop Duplication API would provide hardware-accelerated capture — this is a planned future upgrade.

 ## 📜 License

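The recording entries in the README describe capture at a fixed rate (default 5 fps) that produces timestamped frames. The sketch below shows one general way to compute a fixed-rate schedule without accumulating drift. It illustrates the technique only and is not taken from the server's implementation; the function names and the `frame_<ms>.jpg` naming scheme are hypothetical.

```python
from typing import List


def frame_deadlines(start: float, fps: float, count: int) -> List[float]:
    """Return capture times (in seconds) for `count` frames at a fixed rate.

    Each deadline is derived from the start time rather than from the
    previous frame, so one slow capture does not shift every later frame.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return [start + n / fps for n in range(count)]


def frame_name(start: float, deadline: float) -> str:
    """Name a frame by its offset from the start in milliseconds (hypothetical scheme)."""
    offset_ms = round((deadline - start) * 1000)
    return f"frame_{offset_ms:06d}.jpg"


# At 5 fps the interval between frames is 200 ms.
deadlines = frame_deadlines(start=0.0, fps=5, count=4)
names = [frame_name(0.0, d) for d in deadlines]
```

A capture loop would sleep until each deadline and then grab the window, skipping any deadline that has already passed.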
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "native-devtools-mcp",
-"version": "0.
+"version": "0.6.0",
 "mcpName": "io.github.sh3ll3x3c/native-devtools",
 "description": "MCP server for native app testing — screenshot, OCR, click, type, find_text, template matching. macOS, Windows & Android.",
 "license": "MIT",
@@ -53,8 +53,8 @@
 "bin"
 ],
 "optionalDependencies": {
-"@sh3ll3x3c/native-devtools-mcp-darwin-arm64": "0.
-"@sh3ll3x3c/native-devtools-mcp-win32-x64": "0.
+"@sh3ll3x3c/native-devtools-mcp-darwin-arm64": "0.6.0",
+"@sh3ll3x3c/native-devtools-mcp-win32-x64": "0.6.0"
 },
 "engines": {
 "node": ">=18"
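Both `package.json` hunks move the package and its platform-specific `optionalDependencies` to 0.6.0 together. A release script could check that invariant with something like the sketch below. The `mismatched_pins` helper is hypothetical, and the manifest is an inline excerpt limited to fields visible in this diff, not the full file.

```python
import json

# Excerpt of the 0.6.0 manifest, limited to fields visible in the diff.
MANIFEST = json.loads("""
{
  "name": "native-devtools-mcp",
  "version": "0.6.0",
  "optionalDependencies": {
    "@sh3ll3x3c/native-devtools-mcp-darwin-arm64": "0.6.0",
    "@sh3ll3x3c/native-devtools-mcp-win32-x64": "0.6.0"
  }
}
""")


def mismatched_pins(manifest: dict) -> dict:
    """Return optional dependencies whose pin differs from the package version."""
    version = manifest["version"]
    deps = manifest.get("optionalDependencies", {})
    return {name: pin for name, pin in deps.items() if pin != version}


# Empty when every platform package is pinned to the main package version.
mismatches = mismatched_pins(MANIFEST)
```

The comparison is exact string equality, which suits exact pins like these; range specifiers such as `^0.6.0` would need a semver-aware check instead.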