native-devtools-mcp 0.3.4 → 0.3.6
- package/README.md +5 -1
- package/package.json +3 -3
package/README.md
CHANGED
@@ -54,7 +54,7 @@ This MCP server is designed to be **highly discoverable and usable** by AI model
 **Core Capabilities for System Prompts:**
 1. `take_screenshot`: The "eyes". Returns images + layout metadata + text locations (OCR).
 2. `click` / `type_text`: The "hands". Interacts with the system based on visual feedback.
-3. `find_text`: A shortcut to find text on screen and get its coordinates immediately.
+3. `find_text`: A shortcut to find text on screen and get its coordinates immediately. Uses the platform **accessibility API** (macOS Accessibility / Windows UI Automation) for precise element-level matching, with OCR fallback.
 4. `load_image` / `find_image`: Template matching for non-text UI elements (icons, shapes), returning screen coordinates for clicking.
 
 ## 📦 Installation (macOS + Windows)
@@ -190,6 +190,7 @@ graph TD
 subgraph "Your Machine"
 Sys -->|Screen/OCR| macOS[CoreGraphics / Vision]
 Sys -->|Input| Win[Win32 / SendInput]
+Sys -->|Text Search| UIA[UI Automation]
 Debug -.->|Inspect| App[Target App]
 end
 ```
@@ -201,9 +202,11 @@ graph TD
 |----|---------|----------|
 | **macOS** | Screenshots | `screencapture` (CLI) |
 | | Input | `CGEvent` (CoreGraphics) |
+| | Text Search (`find_text`) | `Accessibility API` (primary), Vision OCR (fallback) |
 | | OCR | `VNRecognizeTextRequest` (Vision Framework) |
 | **Windows** | Screenshots | `BitBlt` (GDI) |
 | | Input | `SendInput` (Win32) |
+| | Text Search (`find_text`) | `UI Automation` (primary), WinRT OCR (fallback) |
 | | OCR | `Windows.Media.Ocr` (WinRT) |
 
 ### Screenshot Coordinate Precision
@@ -254,6 +257,7 @@ On macOS, you must grant permissions to the **host application** (e.g., Terminal
 
 Works out of the box on **Windows 10/11**.
 * Uses standard Win32 APIs (GDI, SendInput).
+* `find_text` uses **UI Automation (UIA)** as the primary search mechanism, querying the accessibility tree for element names. This is the same accessibility-first approach used on macOS (with the Accessibility API). Falls back to OCR automatically when UIA finds no matches.
 * OCR uses the built-in Windows Media OCR engine (offline).
 * **Note:** Cannot interact with "Run as Administrator" windows unless the MCP server itself is also running as Administrator.
 
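The README changes above describe `find_text` as accessibility-first with an automatic OCR fallback when the accessibility search finds no matches. A minimal sketch of that control flow follows. It is not the package's implementation: the `Match` type, the `Search` signature, and both stub backends are hypothetical stand-ins for the platform APIs, and the coordinates are made-up sample data.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Match:
    """Hypothetical result record: matched text plus screen coordinates."""
    text: str
    x: int
    y: int
    source: str  # "accessibility" or "ocr"


# Hypothetical backend signature: takes the query, returns zero or more matches.
Search = Callable[[str], List[Match]]


def find_text(query: str, accessibility_search: Search, ocr_search: Search) -> List[Match]:
    """Try the accessibility backend first; use OCR only when it finds nothing."""
    matches = accessibility_search(query)
    if matches:
        return matches
    return ocr_search(query)


# Stub backends standing in for the real platform APIs (illustrative data only).
def stub_accessibility(query: str) -> List[Match]:
    elements = [Match("Save", 120, 48, "accessibility")]
    return [m for m in elements if query.lower() in m.text.lower()]


def stub_ocr(query: str) -> List[Match]:
    words = [Match("Save", 118, 50, "ocr"), Match("Draft", 300, 210, "ocr")]
    return [m for m in words if query.lower() in m.text.lower()]
```

With these stubs, a query for "save" is answered by the accessibility backend, while "draft" (absent from the stub accessibility tree) falls through to OCR. Passing the backends in as callables keeps the fallback rule separate from any platform-specific code.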
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "native-devtools-mcp",
-"version": "0.3.
+"version": "0.3.6",
 "description": "MCP server for computer-use / desktop automation of native apps (screenshots, OCR, input)",
 "license": "MIT",
 "repository": {
@@ -39,8 +39,8 @@
 "bin"
 ],
 "optionalDependencies": {
-"@sh3ll3x3c/native-devtools-mcp-darwin-arm64": "0.3.
-"@sh3ll3x3c/native-devtools-mcp-win32-x64": "0.3.
+"@sh3ll3x3c/native-devtools-mcp-darwin-arm64": "0.3.6",
+"@sh3ll3x3c/native-devtools-mcp-win32-x64": "0.3.6"
 },
 "engines": {
 "node": ">=18"
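In this release the package version and both platform-specific `optionalDependencies` move to 0.3.6 together. A small check like the one below could confirm that the pins stay in lockstep. This is a sketch only: the helper name `mismatched_optional_deps` is hypothetical and not part of the package, and the manifest is a trimmed copy of the fields shown in the diff.

```python
import json


def mismatched_optional_deps(manifest: dict) -> dict:
    """Return the optional dependencies whose pin differs from the package version."""
    version = manifest["version"]
    deps = manifest.get("optionalDependencies", {})
    return {name: pinned for name, pinned in deps.items() if pinned != version}


# Trimmed manifest containing only the fields shown in the diff above.
manifest = json.loads("""
{
  "name": "native-devtools-mcp",
  "version": "0.3.6",
  "optionalDependencies": {
    "@sh3ll3x3c/native-devtools-mcp-darwin-arm64": "0.3.6",
    "@sh3ll3x3c/native-devtools-mcp-win32-x64": "0.3.6"
  }
}
""")

# An empty result means every platform package matches the top-level version.
stale = mismatched_optional_deps(manifest)
```

An exact string comparison is used on purpose: the manifest pins exact versions rather than ranges, so any difference at all indicates a platform package that was not bumped.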