@cool-mcp/desktop-automation 1.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +225 -0
- package/dist/index.d.ts +7 -0
- package/dist/index.d.ts.map +1 -0
- package/dist/index.js +671 -0
- package/dist/index.js.map +1 -0
- package/dist/input-controller.d.ts +63 -0
- package/dist/input-controller.d.ts.map +1 -0
- package/dist/input-controller.js +232 -0
- package/dist/input-controller.js.map +1 -0
- package/dist/screenshot.d.ts +22 -0
- package/dist/screenshot.d.ts.map +1 -0
- package/dist/screenshot.js +247 -0
- package/dist/screenshot.js.map +1 -0
- package/dist/types.d.ts +68 -0
- package/dist/types.d.ts.map +1 -0
- package/dist/types.js +5 -0
- package/dist/types.js.map +1 -0
- package/dist/utils.d.ts +37 -0
- package/dist/utils.d.ts.map +1 -0
- package/dist/utils.js +78 -0
- package/dist/utils.js.map +1 -0
- package/dist/window-manager.d.ts +24 -0
- package/dist/window-manager.d.ts.map +1 -0
- package/dist/window-manager.js +368 -0
- package/dist/window-manager.js.map +1 -0
- package/package.json +55 -0
package/README.md
ADDED
@@ -0,0 +1,225 @@
# Desktop Automation MCP

A cross-platform (macOS/Windows) desktop automation MCP server that provides mouse, keyboard, screenshot, and window management capabilities.

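## Features

- 🖱️ **Mouse**: click, double-click, right-click, drag, scroll, move
- ⌨️ **Keyboard**: text input, hotkeys, individual key control
- 📸 **Screenshots**: capture a window by process name, or capture the full screen
- 🪟 **Window management**: activate a window, get window information, list all windows
- 🖥️ **Multi-display support**: automatically handles multiple displays and DPI scaling
- 📐 **Coordinate conversion**: automatic conversion of normalized bbox (0-1000) coordinates
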
## Quick Start

### Using npx (recommended)

No installation is needed. Reference the package directly in your MCP configuration:

```json
{
  "mcpServers": {
    "desktop-automation": {
      "command": "npx",
      "args": ["-y", "@cool-mcp/desktop-automation"],
      "disabled": false
    }
  }
}
```

### Global installation

```bash
npm install -g @cool-mcp/desktop-automation
```

Then configure:

```json
{
  "mcpServers": {
    "desktop-automation": {
      "command": "desktop-automation-mcp",
      "disabled": false
    }
  }
}
```

### Local development

```bash
git clone <repo-url>
cd desktop-automation-mcp
npm install
npm run build
```

## MCP Configuration Locations

- Kiro: `~/.kiro/settings/mcp.json`, or `.kiro/settings/mcp.json` in the project
- Claude Desktop: `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS)
- Cursor: `.cursor/mcp.json` in the project

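If you need to locate these files from a script, the paths above can be assembled as in the sketch below. `mcpConfigCandidates` is a hypothetical helper, not something this package ships, and it assumes the home and project directories are passed in by the caller.

```javascript
// Hypothetical helper (not part of this package): returns the MCP config
// locations listed above for a given home directory and project directory.
function mcpConfigCandidates(homeDir, projectDir) {
  return {
    kiro: [
      `${homeDir}/.kiro/settings/mcp.json`,    // user-level config
      `${projectDir}/.kiro/settings/mcp.json`, // project-level config
    ],
    // Only the macOS location is listed in this README.
    claudeDesktop: [
      `${homeDir}/Library/Application Support/Claude/claude_desktop_config.json`,
    ],
    cursor: [`${projectDir}/.cursor/mcp.json`],
  };
}

// Assumption: HOME (or USERPROFILE on Windows) points at the home directory.
const homeDirectory = process.env.HOME ?? process.env.USERPROFILE ?? "~";
console.log(mcpConfigCandidates(homeDirectory, process.cwd()));
```

The helper only builds path strings; it does not check whether the files exist.
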
## Tools

### Screenshots

| Tool | Description |
|------|------|
| `screenshot` | Capture the window of a given process |
| `screenshot_fullscreen` | Capture the entire screen |

### Mouse

| Tool | Description |
|------|------|
| `click` | Click the mouse (accepts bbox or pixel coordinates) |
| `move_mouse` | Move the mouse |
| `drag` | Drag with the mouse |
| `scroll` | Scroll the mouse wheel |
| `get_mouse_position` | Get the current mouse position |

### Keyboard

| Tool | Description |
|------|------|
| `type_text` | Type text |
| `hotkey` | Press a key combination |
| `press_key` | Press a key down (without releasing it) |
| `release_key` | Release a key |

### Window management

| Tool | Description |
|------|------|
| `activate_window` | Activate a window (bring it to the front) |
| `get_window_info` | Get window information for a given process |
| `get_active_window` | Get information about the currently active window |
| `list_windows` | List all open windows |

### Other

| Tool | Description |
|------|------|
| `get_displays` | Get information about all displays |
| `convert_bbox_to_screen` | Convert bbox coordinates to screen coordinates |
| `wait` | Wait for a given amount of time |

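On the wire, an MCP client invokes any of these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below shows the shape of that request. `buildToolCall` is a hypothetical helper for illustration only; real clients, such as the one in `@modelcontextprotocol/sdk`, build this message internally.

```javascript
// Hypothetical helper: builds the JSON-RPC 2.0 message an MCP client
// sends to call a tool by name with the given arguments.
function buildToolCall(id, name, args = {}) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// The tool name and the `processName` argument are taken from this
// README's own screenshot example.
const request = buildToolCall(1, "screenshot", { processName: "Chrome" });
console.log(JSON.stringify(request, null, 2));
```
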
## Coordinate System

### bbox coordinates (normalized to 0-1000)

The model outputs normalized coordinates in the range 0-1000:

```
bbox = [x1, y1, x2, y2]
```

- `x1, y1`: top-left corner
- `x2, y2`: bottom-right corner
- Range: 0-1000

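A bbox can be checked against these rules before it is sent to a tool. The following is a minimal sketch: `parseBbox` is a hypothetical helper, not part of this package, and it assumes the bbox arrives either as an array or as the space-separated string form used in this README's `click` examples.

```javascript
// Hypothetical helper: parses a bbox given as "x1 y1 x2 y2" or as
// [x1, y1, x2, y2] and validates it against the rules listed above.
function parseBbox(bbox) {
  const parts = Array.isArray(bbox) ? bbox : String(bbox).trim().split(/\s+/);
  if (parts.length !== 4) {
    throw new Error(`bbox needs 4 values, got ${parts.length}`);
  }
  const nums = parts.map(Number);
  for (const n of nums) {
    if (!Number.isFinite(n) || n < 0 || n > 1000) {
      throw new RangeError(`bbox value outside 0-1000: ${n}`);
    }
  }
  const [x1, y1, x2, y2] = nums;
  if (x1 > x2 || y1 > y2) {
    throw new RangeError("bbox must list the top-left corner first");
  }
  return { x1, y1, x2, y2 };
}

console.log(parseBbox("450 300 550 350")); // { x1: 450, y1: 300, x2: 550, y2: 350 }
```
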
### Coordinate conversion

When using a bbox, you must also pass `windowBounds`, which is taken from the return value of `screenshot`:

```javascript
// Return value of screenshot
const screenshotResult = {
  base64: "...",
  width: 1920,
  height: 1080,
  windowBounds: {
    x: 100, // X position of the window on the screen
    y: 50,  // Y position of the window on the screen
    width: 1200,
    height: 800
  }
}

// Pass windowBounds when clicking
await click({
  bbox: "450 300 550 350",
  windowBounds: { x: 100, y: 50, width: 1200, height: 800 }
})
```

Conversion formula:
```
centerX = (x1 + x2) / 2 / 1000 * windowWidth + windowX
centerY = (y1 + y2) / 2 / 1000 * windowHeight + windowY
```

Here `windowX`, `windowY`, `windowWidth`, and `windowHeight` are the `x`, `y`, `width`, and `height` fields of `windowBounds`.

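The formula can be written out as a small function to make the arithmetic concrete. This is a sketch only: `bboxToScreenPoint` is a hypothetical name, the rounding to whole pixels is an added assumption, and the package's own `convert_bbox_to_screen` tool may differ in rounding or DPI handling.

```javascript
// Hypothetical helper: maps the center of a 0-1000 normalized bbox to
// screen coordinates, following the conversion formula above.
function bboxToScreenPoint([x1, y1, x2, y2], windowBounds) {
  const { x, y, width, height } = windowBounds;
  return {
    // Math.round is an assumption of this sketch, not part of the formula.
    x: Math.round(((x1 + x2) / 2 / 1000) * width + x),
    y: Math.round(((y1 + y2) / 2 / 1000) * height + y),
  };
}

const point = bboxToScreenPoint(
  [450, 300, 550, 350],
  { x: 100, y: 50, width: 1200, height: 800 }
);
console.log(point); // { x: 700, y: 310 }
```

With the values from the `click` example, the bbox center is (500, 325) in normalized units, which maps to 600 px and 260 px inside the window, then shifts by the window origin (100, 50).
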
## Examples

The calls below are shorthand for MCP tool invocations; the function-call syntax is illustrative.

### Capture the Chrome window and click

```javascript
// 1. Activate the window
await activate_window({ processName: 'Chrome' })

// 2. Take a screenshot
const result = await screenshot({ processName: 'Chrome' })
const { windowBounds } = result

// 3. Click (using bbox coordinates)
await click({
  bbox: '500 100 600 130',
  windowBounds,
  button: 'left'
})
```

### Type text and press Enter

```javascript
// Click the input field
await click({ x: 500, y: 300 })

// Type the text, then press Enter
await type_text({ text: 'Hello World\n' })
```

### Use hotkeys

```javascript
// Copy
await hotkey({ keys: 'ctrl c' }) // Windows
await hotkey({ keys: 'cmd c' })  // macOS

// Save
await hotkey({ keys: 'ctrl s' })
```

## Platform Differences

### macOS
- Uses `cmd` instead of `ctrl` as the primary modifier key
- Screenshots use the `screencapture` command
- Window management uses AppleScript

### Windows
- Uses `ctrl` as the primary modifier key
- Text input goes through the clipboard (more reliable)
- Window management uses PowerShell + Win32 API

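Because the primary modifier differs by platform, a caller that runs on both systems can choose it at runtime. The sketch below uses Node's `process.platform`, which is `"darwin"` on macOS and `"win32"` on Windows. `primaryModifier` and `shortcut` are hypothetical helpers, and the space-separated key string follows the `hotkey` examples in this README.

```javascript
// Hypothetical helper: returns the primary modifier key name for a platform.
function primaryModifier(platform = process.platform) {
  return platform === "darwin" ? "cmd" : "ctrl";
}

// Hypothetical helper: builds the space-separated key string used by
// the hotkey examples, e.g. "ctrl c".
function shortcut(key, platform = process.platform) {
  return `${primaryModifier(platform)} ${key}`;
}

console.log(shortcut("c")); // "cmd c" on macOS, "ctrl c" on other platforms
```

The result could then be passed as the `keys` argument of `hotkey`, so the same script issues the copy shortcut on either platform.
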
## Dependencies

- `@modelcontextprotocol/sdk`: MCP SDK
- `@nut-tree/nut-js`: cross-platform mouse and keyboard control
- `jimp`: image processing
- `active-win`: information about the active window
- `node-screenshots`: screenshot capture

## System Requirements

- Node.js >= 18
- macOS or Windows
- Accessibility permission must be granted (macOS)

## License

MIT

package/dist/index.d.ts
ADDED
@@ -0,0 +1 @@
{"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":";AACA;;;GAGG"}