native-devtools-mcp 0.3.1 → 0.3.2
- package/README.md +26 -1
- package/package.json +18 -4
package/README.md
CHANGED
@@ -11,6 +11,8 @@
 
 A Model Context Protocol (MCP) server that provides **Computer Use** capabilities: screenshots, OCR, input simulation, and window management.
 
+[//]: # "Search keywords: MCP, Model Context Protocol, computer use, desktop automation, UI automation, RPA, screenshots, OCR, mouse, keyboard, screen reading, macOS, Windows, native-devtools-mcp"
+
 [Features](#-features) • [Installation](#-installation) • [For AI Agents](#-for-ai-agents-llms) • [Permissions](#-required-permissions-macos)
 
 
@@ -19,6 +21,10 @@ A Model Context Protocol (MCP) server that provides **Computer Use** capabilitie
 
 ---
 
+## 🔍 Search Keywords
+
+MCP, Model Context Protocol, computer use, desktop automation, UI automation, RPA, screenshots, OCR, screen reading, mouse, keyboard, macOS, Windows, native-devtools-mcp.
+
 ## 🚀 Features
 
 - **👀 Computer Vision:** Capture screenshots of screens, windows, or specific regions. Includes built-in OCR (text recognition) to "read" the screen.
@@ -33,7 +39,7 @@ A Model Context Protocol (MCP) server that provides **Computer Use** capabilitie
 
 This MCP server is designed to be **highly discoverable and usable** by AI models (Claude, Gemini, GPT).
 
-- **[📄 Read `
+- **[📄 Read `AGENTS.md`](./AGENTS.md):** A compact, token-optimized technical reference designed specifically for ingestion by LLMs. It contains intent definitions, schema examples, and reasoning patterns.
 
 **Core Capabilities for System Prompts:**
 1. `take_screenshot`: The "eyes". Returns images + layout metadata + text locations (OCR).
@@ -173,6 +179,25 @@ graph TD
 | | Input | `SendInput` (Win32) |
 | | OCR | `Windows.Media.Ocr` (WinRT) |
 
+### Screenshot Coordinate Precision
+
+Screenshots include metadata for accurate coordinate conversion:
+
+- `screenshot_origin_x/y`: Screen-space origin of the captured area (in points)
+- `screenshot_scale`: Display scale factor (e.g., 2.0 for Retina displays)
+- `screenshot_pixel_width/height`: Actual pixel dimensions of the image
+- `screenshot_window_id`: Window ID (for window captures)
+
+**Coordinate conversion:**
+```
+screen_x = screenshot_origin_x + (pixel_x / screenshot_scale)
+screen_y = screenshot_origin_y + (pixel_y / screenshot_scale)
+```
+
+**Implementation notes:**
+- **Window captures** (macOS): Uses `screencapture -o` which excludes window shadow. The captured image dimensions match `kCGWindowBounds × scale` exactly, ensuring click coordinates derived from screenshots land on intended UI elements.
+- **Region captures**: Origin coordinates are aligned to integers to match the actual captured area.
+
 </details>
 
 ## 🛡️ Privacy, Safety & Best Practices
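The coordinate conversion added in the README hunk above can be sketched in a few lines of Python. This is only an illustration, not code from the package: it assumes the screenshot metadata is available as a plain mapping keyed by the field names the README lists (`screenshot_origin_x`, `screenshot_origin_y`, `screenshot_scale`), and the helper name `pixel_to_screen` and the sample values are hypothetical.

```python
from typing import Mapping, Tuple


def pixel_to_screen(
    meta: Mapping[str, float], pixel_x: float, pixel_y: float
) -> Tuple[float, float]:
    """Convert a pixel position inside a screenshot to screen-space points.

    Applies the README formula:
        screen = screenshot_origin + pixel / screenshot_scale
    """
    scale = meta["screenshot_scale"]
    if scale <= 0:
        raise ValueError("screenshot_scale must be positive")
    screen_x = meta["screenshot_origin_x"] + pixel_x / scale
    screen_y = meta["screenshot_origin_y"] + pixel_y / scale
    return screen_x, screen_y


# Hypothetical metadata for a capture taken on a 2x (Retina) display.
example_meta = {
    "screenshot_origin_x": 100,
    "screenshot_origin_y": 50,
    "screenshot_scale": 2.0,
}

# Pixel (400, 300) in the image maps to screen point (300.0, 200.0).
print(pixel_to_screen(example_meta, 400, 300))
```

Dividing by the scale factor before adding the origin matters on high-DPI displays: the image is measured in pixels, while the origin (and, per the README, click coordinates) are in points.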
package/package.json
CHANGED
@@ -1,7 +1,7 @@
 {
 "name": "native-devtools-mcp",
-"version": "0.3.
-"description": "MCP server for
+"version": "0.3.2",
+"description": "MCP server for computer-use / desktop automation of native apps (screenshots, OCR, input)",
 "license": "MIT",
 "repository": {
 "type": "git",
@@ -11,6 +11,20 @@
 "keywords": [
 "mcp",
 "model-context-protocol",
+"computer-use",
+"desktop-automation",
+"ui-automation",
+"rpa",
+"ocr",
+"screenshot",
+"screen-reading",
+"mouse",
+"keyboard",
+"ai-agent",
+"llm",
+"claude",
+"gemini",
+"gpt",
 "devtools",
 "desktop",
 "testing",
@@ -25,8 +39,8 @@
 "bin"
 ],
 "optionalDependencies": {
-"@sh3ll3x3c/native-devtools-mcp-darwin-arm64": "0.3.
-"@sh3ll3x3c/native-devtools-mcp-win32-x64": "0.3.
+"@sh3ll3x3c/native-devtools-mcp-darwin-arm64": "0.3.2",
+"@sh3ll3x3c/native-devtools-mcp-win32-x64": "0.3.2"
 },
 "engines": {
 "node": ">=18"