claude-kvm-native 1.0.0
- package/LICENSE +21 -0
- package/README.md +244 -0
- package/index.js +441 -0
- package/lib/capture.js +204 -0
- package/lib/hid.js +248 -0
- package/lib/ssh.js +162 -0
- package/lib/types.js +138 -0
- package/lib/vnc.js +1332 -0
- package/package.json +51 -0
- package/tools/control.js +55 -0
- package/tools/index.js +48 -0
- package/tools/keyboard.js +56 -0
- package/tools/mouse.js +61 -0
- package/tools/screen.js +67 -0
- package/tools/ssh.js +62 -0
- package/tools/vlm.js +59 -0
- package/utils/keysym.js +158 -0
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 Rıza Emre ARAS <r.emrearas@proton.me>

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
package/README.md
ADDED
@@ -0,0 +1,244 @@
# Claude KVM

Claude KVM is an MCP tool that controls your remote desktop environment over VNC, with optional SSH access.

## Architecture

Claude KVM follows an **atomic instrument** design: each tool does one thing, Claude orchestrates the flow. The system provides three independent channels, each optimized for a different type of interaction:

```mermaid
graph TB
    subgraph MCP["MCP Client (Claude)"]
        AI["🤖 Claude"]
    end

    subgraph Server["claude-kvm · MCP Server (stdio)"]
        direction TB
        Router["Tool Router<br/><code>index.js</code>"]

        subgraph Channels["Channels"]
            direction LR
            subgraph VNC_Ch["VNC Channel"]
                direction TB
                VNC_Client["VNC Client<br/><code>lib/vnc.js</code>"]
                HID["HID Controller<br/><code>lib/hid.js</code>"]
                Capture["Screen Capture<br/><code>lib/capture.js</code>"]
            end

            subgraph SSH_Ch["SSH Channel"]
                direction TB
                SSH_Client["SSH Client<br/><code>lib/ssh.js</code>"]
            end

            subgraph VLM_Ch["VLM Channel"]
                direction TB
                VLM_Bin["claude-kvm-vlm<br/><i>Apple Silicon binary</i>"]
            end
        end
    end

    subgraph Local["Host Machine (Apple Silicon)"]
        MLX["MLX Framework<br/><i>FastVLM 0.5B</i>"]
    end

    subgraph Target["Target Machine"]
        VNC_Server["VNC Server<br/><i>:5900</i>"]
        SSH_Server["SSH Server<br/><i>:22</i>"]

        Desktop["🖥️ Desktop Environment"]
        Shell["💻 Shell"]
    end

    AI <--->|"stdio<br/>JSON-RPC"| Router

    Router --> VNC_Client
    Router --> HID
    Router --> Capture
    Router --> SSH_Client
    Router --> VLM_Bin

    VNC_Client <-->|"RFB Protocol<br/>TCP :5900"| VNC_Server
    HID --> VNC_Client
    Capture --> VNC_Client
    Capture -->|"PNG crop"| VLM_Bin

    SSH_Client <-->|"SSH Protocol<br/>TCP :22"| SSH_Server
    VLM_Bin -->|"stdin: PNG<br/>stdout: text"| MLX

    VNC_Server --> Desktop
    SSH_Server --> Shell

    classDef server fill:#1a1a2e,stroke:#16213e,color:#e5e5e5
    classDef channel fill:#0f3460,stroke:#533483,color:#e5e5e5
    classDef target fill:#1a1a2e,stroke:#e94560,color:#e5e5e5
    classDef local fill:#1a1a2e,stroke:#533483,color:#e5e5e5

    class Router server
    class VNC_Client,HID,Capture,SSH_Client,VLM_Bin channel
    class VNC_Server,SSH_Server,Desktop,Shell target
    class MLX local
```

### Channel Overview

| Channel | Transport         | Purpose                                         | Tools                                                                     |
|---------|-------------------|-------------------------------------------------|---------------------------------------------------------------------------|
| **VNC** | RFB over TCP      | Visual control: screen capture, mouse, keyboard | `screenshot` `cursor_crop` `diff_check` `set_baseline` `mouse` `keyboard` |
| **SSH** | SSH over TCP      | Text I/O: shell commands, file ops, osascript   | `ssh`                                                                     |
| **VLM** | stdin/stdout pipe | Pixel → text: on-device OCR and visual Q&A      | `vlm_query`                                                               |

### How They Work Together

Each channel has a strength. Claude picks the most efficient one, or combines them:

- **Read a web page** → VNC navigates, VLM reads text from a region, no screenshot needed
- **Run a shell command** → SSH returns text directly, faster than typing in a terminal via VNC
- **Verify a change** → `diff_check` detects change (5ms, no image), `cursor_crop` confirms placement (small image), `screenshot` only when needed (full image)
- **Debug a dialog** → VLM reads the button labels, SSH runs `osascript` to get window info, VNC clicks the right button

### Three-Layer Screen Strategy

Claude minimizes token cost with a progressive verification approach:

```
diff_check   → changeDetected: true/false   ~5ms     (text only, no image)
cursor_crop  → 300×300px around cursor      ~200ms   (small image)
screenshot   → full screen capture          ~1200ms  (full image, HiDPI)
```

Start cheap, escalate only when needed.
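
The escalation order above can be sketched as a small helper. This is an illustration only, not code from the package: `verifyChange`, the `tools` object, the `isCropEnough` callback, and the return shapes are all hypothetical stand-ins for however an MCP client invokes `diff_check`, `cursor_crop`, and `screenshot`.

```javascript
// Hypothetical sketch: `tools` is any object exposing async functions named
// after the README's tools. The result shapes used here are assumptions.
async function verifyChange(tools, isCropEnough = () => true) {
  // Layer 1: text-only change detection, the cheapest call.
  const diff = await tools.diff_check();
  if (!diff.changeDetected) {
    return { layer: "diff_check", changed: false };
  }

  // Layer 2: small crop around the cursor.
  const crop = await tools.cursor_crop();
  if (isCropEnough(crop)) {
    return { layer: "cursor_crop", changed: true, result: crop };
  }

  // Layer 3: full screenshot, only when the crop was not conclusive.
  const shot = await tools.screenshot();
  return { layer: "screenshot", changed: true, result: shot };
}
```

A caller would pass `isCropEnough` to decide whether the small image answers the question; when it returns `false`, the helper falls through to the full capture.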

### Coordinate Scaling

The VNC server's native resolution is scaled down to fit within `DISPLAY_MAX_DIMENSION` (default: 1280px). Claude works in scaled coordinates, and the claude-kvm server transparently converts between native and scaled space:

```
Native:  3840 × 2400  (VNC server framebuffer)
Scaled:  1280 × 800   (what Claude sees and targets)

click_at(640, 400)  →  VNC receives (1920, 1200)
```
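
The arithmetic behind that example can be sketched as follows. The function names `computeScale` and `toNative` are hypothetical, and the choices to scale by the longest side, never upscale, and round to whole pixels are assumptions rather than documented behaviour of the package.

```javascript
// Hypothetical sketch of the scaling arithmetic; not the package's code.
// Assumption: the longest side is fitted to maxDimension, never upscaled.
function computeScale(nativeWidth, nativeHeight, maxDimension = 1280) {
  const longest = Math.max(nativeWidth, nativeHeight);
  const factor = longest > maxDimension ? longest / maxDimension : 1;
  return {
    factor,
    scaledWidth: Math.round(nativeWidth / factor),
    scaledHeight: Math.round(nativeHeight / factor),
  };
}

// Convert a point in scaled space back to native framebuffer coordinates.
function toNative(x, y, factor) {
  return { x: Math.round(x * factor), y: Math.round(y * factor) };
}

const scale = computeScale(3840, 2400);      // factor 3, scaled size 1280 x 800
const point = toNative(640, 400, scale.factor); // { x: 1920, y: 1200 }
```

With the README's numbers the factor is 3, so the scaled click at (640, 400) maps to (1920, 1200) on the native framebuffer.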

## Usage

Create a `.mcp.json` file in your project root directory:

```json
{
  "mcpServers": {
    "claude-kvm": {
      "command": "npx",
      "args": ["-y", "claude-kvm"],
      "env": {
        "VNC_HOST": "192.168.1.100",
        "VNC_PORT": "5900",
        "VNC_AUTH": "auto",
        "VNC_USERNAME": "user",
        "VNC_PASSWORD": "pass",
        "SSH_HOST": "192.168.1.100",
        "SSH_USER": "user",
        "SSH_PASSWORD": "pass",
        "CLAUDE_KVM_VLM_TOOL_PATH": "/path/to/claude-kvm-vlm"
      }
    }
  }
}
```

Only the VNC connection parameters are required. SSH and all other parameters are optional.

### Configuration

#### VNC

| Parameter                    | Default     | Description                                    |
|------------------------------|-------------|------------------------------------------------|
| `VNC_HOST`                   | `127.0.0.1` | VNC server address                             |
| `VNC_PORT`                   | `5900`      | VNC port number                                |
| `VNC_AUTH`                   | `auto`      | Authentication mode (`auto` / `none`)          |
| `VNC_USERNAME`               |             | Username (for VeNCrypt Plain / ARD)            |
| `VNC_PASSWORD`               |             | Password                                       |
| `VNC_CONNECT_TIMEOUT_MS`     | `10000`     | TCP connection timeout (ms)                    |
| `VNC_SCREENSHOT_TIMEOUT_MS`  | `3000`      | Screenshot frame wait timeout (ms)             |

#### SSH (optional)

| Parameter       | Default | Description                                  |
|-----------------|---------|----------------------------------------------|
| `SSH_HOST`      |         | SSH server address (required to enable SSH)  |
| `SSH_USER`      |         | SSH username (required to enable SSH)        |
| `SSH_PASSWORD`  |         | SSH password (for password auth)             |
| `SSH_KEY`       |         | Path to private key file (for key auth)      |
| `SSH_PORT`      | `22`    | SSH port number                              |

The SSH tool is only registered when both `SSH_HOST` and `SSH_USER` are set. Authentication uses either password or key, whichever is provided.
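
The registration rule can be sketched as a function over the environment. This is a hedged illustration: `resolveSshConfig` and the shape of the returned object are hypothetical, and preferring the key when both `SSH_KEY` and `SSH_PASSWORD` are present is an assumption, since the README only says "whichever is provided".

```javascript
// Hypothetical sketch; not the package's code. Returns null when SSH should
// stay disabled, otherwise a plain config object.
function resolveSshConfig(env) {
  // Both variables must be set for the `ssh` tool to be registered.
  if (!env.SSH_HOST || !env.SSH_USER) {
    return null;
  }

  const config = {
    host: env.SSH_HOST,
    username: env.SSH_USER,
    port: env.SSH_PORT ? Number.parseInt(env.SSH_PORT, 10) : 22,
  };

  // Assumption: the key wins if both credentials are supplied.
  if (env.SSH_KEY) {
    config.privateKeyPath = env.SSH_KEY;
  } else if (env.SSH_PASSWORD) {
    config.password = env.SSH_PASSWORD;
  }
  return config;
}
```

Calling `resolveSshConfig(process.env)` at startup and registering the tool only for a non-null result would reproduce the behaviour described above.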

#### VLM (optional, macOS only)

| Parameter                  | Default | Description                                                                                    |
|----------------------------|---------|------------------------------------------------------------------------------------------------|
| `CLAUDE_KVM_VLM_TOOL_PATH` |         | Absolute path to `claude-kvm-vlm` binary (macOS arm64). Enables the `vlm_query` tool when set. |

The `vlm_query` tool is only registered when `CLAUDE_KVM_VLM_TOOL_PATH` is set. Requires Apple Silicon.

##### Quick Install

```bash
brew tap ARAS-Workspace/tap
brew install claude-kvm-vlm
```

The `claude-kvm-vlm` binary is built, code-signed and notarized via CI:

- [Build Workflow](https://github.com/ARAS-Workspace/claude-kvm/actions/runs/22114321867)
- [Source Code](https://github.com/ARAS-Workspace/claude-kvm/tree/vlm-tool)

#### Display & Input

| Parameter                    | Default     | Description                                    |
|------------------------------|-------------|------------------------------------------------|
| `DISPLAY_MAX_DIMENSION`      | `1280`      | Maximum dimension to scale screenshots to (px) |
| `HID_CLICK_HOLD_MS`          | `80`        | Mouse click hold duration (ms)                 |
| `HID_KEY_HOLD_MS`            | `50`        | Key press hold duration (ms)                   |
| `HID_TYPING_DELAY_MIN_MS`    | `30`        | Typing delay lower bound (ms)                  |
| `HID_TYPING_DELAY_MAX_MS`    | `100`       | Typing delay upper bound (ms)                  |
| `HID_SCROLL_EVENTS_PER_STEP` | `5`         | VNC scroll events per scroll step              |
| `DIFF_PIXEL_THRESHOLD`       | `30`        | Per-channel pixel difference threshold (0-255) |
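
To make `DIFF_PIXEL_THRESHOLD` concrete, here is one way a per-channel comparison against a baseline could look. It is a sketch under stated assumptions, not the package's algorithm: `hasChanged` is a hypothetical name, the buffers are assumed to be RGBA with alpha ignored, and a channel counts as changed only when its difference is strictly greater than the threshold.

```javascript
// Hypothetical sketch; not the package's code.
// Assumptions: RGBA byte layout, alpha ignored, strict ">" comparison.
function hasChanged(baseline, current, threshold = 30) {
  // Different sizes (for example after a resolution change) count as changed.
  if (baseline.length !== current.length) {
    return true;
  }
  for (let i = 0; i < baseline.length; i += 4) {
    for (let c = 0; c < 3; c++) {
      if (Math.abs(baseline[i + c] - current[i + c]) > threshold) {
        return true; // stop at the first pixel that differs enough
      }
    }
  }
  return false;
}

const before = Uint8Array.from([10, 10, 10, 255]);
const slightlyBrighter = Uint8Array.from([40, 10, 10, 255]); // red differs by 30
const clearlyBrighter = Uint8Array.from([41, 10, 10, 255]);  // red differs by 31
```

Under these assumptions `hasChanged(before, slightlyBrighter)` is `false` and `hasChanged(before, clearlyBrighter)` is `true`, which shows how the threshold suppresses small rendering noise.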

## Tools

| Tool            | Returns          | Description                                               |
|-----------------|------------------|-----------------------------------------------------------|
| `mouse`         | `(x, y)`         | Mouse actions: move, hover, click, click_at, scroll, drag |
| `keyboard`      | `OK`             | Keyboard actions: press, combo, type, paste               |
| `screenshot`    | `OK` + image     | Capture full screen                                       |
| `cursor_crop`   | `(x, y)` + image | Small crop around cursor position                         |
| `diff_check`    | `changeDetected` | Lightweight pixel change detection against baseline       |
| `set_baseline`  | `OK`             | Save current screen as diff reference                     |
| `health_check`  | JSON             | VNC/SSH status, resolution, uptime, memory                |
| `ssh`           | stdout/stderr    | Execute a command on the remote machine via SSH           |
| `vlm_query`     | text             | On-device VLM query on a cropped screen region (macOS)    |
| `wait`          | `OK`             | Wait for a specified duration                             |
| `task_complete` | summary          | Mark task as completed                                    |
| `task_failed`   | reason           | Mark task as failed                                       |

## Authentication

### VNC

Supports multiple VNC authentication methods:

- **None**: no authentication
- **VNC Auth**: password-based challenge-response (DES)
- **ARD**: Apple Remote Desktop (Diffie-Hellman + AES)
- **VeNCrypt**: TLS-wrapped auth (Plain, VNC, None subtypes)

macOS Screen Sharing (ARD) is auto-detected via the `RFB 003.889` version string.
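
An RFB server opens the handshake with a 12-byte greeting of the form `RFB xxx.yyy\n`, so the detection described above can be sketched as a small parser. `parseRfbVersion` and the `isAppleRemoteDesktop` field are hypothetical names for illustration; how `lib/vnc.js` actually performs the check is not shown in this diff.

```javascript
// Hypothetical sketch; not the package's code.
// Accepts the server greeting as a string, Buffer, or Uint8Array.
function parseRfbVersion(greeting) {
  const text = typeof greeting === "string"
    ? greeting
    : Buffer.from(greeting).toString("latin1");

  // ProtocolVersion message: "RFB " + 3 digits + "." + 3 digits + "\n".
  const match = /^RFB (\d{3})\.(\d{3})\n/.exec(text);
  if (!match) {
    return null; // not an RFB greeting
  }

  const major = Number(match[1]);
  const minor = Number(match[2]);
  return {
    major,
    minor,
    // 003.889 is the version string the README associates with ARD.
    isAppleRemoteDesktop: major === 3 && minor === 889,
  };
}
```

A client would run this on the first 12 bytes read from the socket and choose the ARD authentication path when `isAppleRemoteDesktop` is true.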

### SSH

Supports password and private key authentication. When the target is macOS, the SSH tool enables AppleScript execution (`osascript`), clipboard access (`pbpaste`/`pbcopy`), and system-level control.

---

Copyright (c) 2025 Riza Emre ARAS - MIT License