computer-control 0.1.0 → 0.1.2
- package/README.md +77 -153
- package/dist/cli.js +370 -312
- package/package.json +1 -1
package/README.md
CHANGED
@@ -1,154 +1,78 @@
 # Computer Control
 
-
-
-## Features
-
-**Browser Mode** (Chrome Extension)
-- Take screenshots of web pages
-- Click, type, scroll, and navigate
-- Read page content and accessibility trees
-- Execute JavaScript in page context
-- Record and export GIF recordings
-- Manage tabs and windows
-
-**Mac Mode** (Native macOS)
-- Control mouse and keyboard
-- Take screenshots and OCR
-- Read accessibility trees
-- Execute AppleScript
-- Record GIF screen captures
-
-## Quick Start
-
-### Option 1: Install from Chrome Web Store (Recommended)
-
-1. **Install the extension** from the [Chrome Web Store](https://chrome.google.com/webstore/detail/computer-control/kenhnnhgbbgkdbedfmijnllgpcognghl)
-
-2. **Install the CLI**
-```bash
-npm install -g computer-control
-# or
-bun install -g computer-control
-```
-
-3. **Run the setup wizard**
-```bash
-computer-control browser install
-```
-When prompted for the extension ID, enter: `kenhnnhgbbgkdbedfmijnllgpcognghl`
-
-4. **Add to your MCP config** (Claude Code, Cursor, etc.)
-```json
-{
-  "mcpServers": {
-    "computer-control-browser": {
-      "command": "computer-control",
-      "args": ["browser", "serve", "--skip-permissions"]
-    }
-  }
-}
-```
-
-5. **Restart your AI assistant** and start automating!
-
-### Option 2: Load Extension from Source
-
-1. **Clone the repository**
-```bash
-git clone https://github.com/mergd/computer-use.git
-cd computer-use
-bun install
-bun run build
-```
-
-2. **Build the extension**
-```bash
-cd extension && ./build.sh
-```
-
-3. **Load in Chrome**
-- Open `chrome://extensions`
-- Enable "Developer mode"
-- Click "Load unpacked"
-- Select the `extension/dist` folder
-- Copy the extension ID (32 lowercase letters)
-
-4. **Run the setup wizard**
-```bash
-computer-control browser install --extension-id YOUR_EXTENSION_ID
-```
-
-5. **Add to MCP config** (same as above)
-
-## Mac Mode Setup
-
-For native macOS control (no browser needed):
+MCP server for browser automation and macOS desktop control. Give your AI agent eyes and hands.
 
-
-
-
+## Getting Started
+
+### Browser Mode
+
+Install the CLI and the [Chrome extension](https://chromewebstore.google.com/detail/computer-control/kenhnnhgbbgkdbedfmijnllgpcognghl):
 
-
-computer-control
+```bash
+npm i -g computer-control
 ```
 
-
-- `cliclick` for mouse/keyboard: `brew install cliclick`
-- `gifsicle` for GIF recording: `brew install gifsicle`
+Start the server (native messaging bridge is registered automatically on first run):
 
-
--
-
+```bash
+computer-control browser serve
+```
+
+Then add the MCP endpoint to your AI client (Claude Code, Cursor, etc.):
 
-**MCP Config:**
 ```json
 {
   "mcpServers": {
-    "
-      "
-      "args": ["mac", "serve"]
+    "browser": {
+      "url": "http://127.0.0.1:62220/mcp"
     }
   }
 }
 ```
 
-
+### Mac Mode
+
+Native macOS control — no browser needed. Install deps, run the wizard, and you're set:
 
+```bash
+brew install cliclick gifsicle
+npm i -g computer-control
+computer-control mac setup
 ```
-computer-control browser
-├── install     Interactive setup wizard
-├── status      Check installation status
-├── serve       Start MCP server
-├── path        Print extension directory
-└── uninstall   Remove native host
 
-
-
-
-
+Grant **Accessibility** and **Screen Recording** permissions to your terminal app when prompted.
+
+```json
+{
+  "mcpServers": {
+    "mac": {
+      "command": "computer-control",
+      "args": ["mac", "serve"]
+    }
+  }
+}
 ```
 
-##
+## Tools
 
-### Browser
+### Browser
 
 | Tool | Description |
 |------|-------------|
-| `
-| `
-| `
-| `navigate` |
+| `computer` | Mouse, keyboard, and screenshots |
+| `read_page` | Accessibility tree of page elements |
+| `find` | Find elements by natural language |
+| `navigate` | Go to URL, back, forward |
 | `form_input` | Set form input values |
-| `javascript_tool` | Execute
-| `get_page_text` | Extract raw text content
-| `tabs_context` |
-| `tabs_create` |
+| `javascript_tool` | Execute JS in page context |
+| `get_page_text` | Extract raw text content |
+| `tabs_context` | Tab group context |
+| `tabs_create` | Open new tab |
 | `resize_window` | Resize browser window |
-| `gif_creator` | Record
-| `upload_image` | Upload image to file input
+| `gif_creator` | Record browser actions as GIF |
+| `upload_image` | Upload image to file input |
 
-### Mac
+### Mac
 
 | Tool | Description |
 |------|-------------|
@@ -156,66 +80,66 @@ computer-control mac
 | `mouse_click` | Click at coordinates |
 | `mouse_move` | Move cursor |
 | `mouse_scroll` | Scroll in direction |
-| `mouse_drag` | Drag
+| `mouse_drag` | Drag between points |
 | `type_text` | Type text at cursor |
 | `key_press` | Press key with modifiers |
 | `run_applescript` | Execute AppleScript |
-| `get_active_window` |
+| `get_active_window` | Focused window info |
 | `list_windows` | List open windows |
 | `focus_app` | Bring app to foreground |
-| `get_accessibility_tree` |
+| `get_accessibility_tree` | UI element hierarchy |
 | `ocr_screen` | Extract text via OCR |
 | `find` | Find elements by natural language |
-| `gif_start/
-
-## Troubleshooting
+| `gif_start` / `gif_stop` / `gif_export` | Record screen as GIF |
 
-
+## CLI Reference
 
-
-
-
-
-
-
-4. Try restarting Chrome
+```
+computer-control browser
+  serve       Start MCP server (auto-registers native host)
+  status      Check installation
+  install     Re-register native host (or use custom extension ID)
+  uninstall   Remove native host
 
-
+computer-control mac
+  setup       Setup wizard (deps + permissions)
+  status      Check deps & permissions
+  serve       Start MCP server
+```
 
-
-- Privacy & Security → Accessibility
-- Privacy & Security → Screen Recording
+## Troubleshooting
 
-
+**Extension not connecting?**
+Run `computer-control browser status` to check the native host registration. Make sure Chrome is running and the extension is enabled. Restart Chrome if needed.
 
-
+**Permission errors on Mac?**
+Add your terminal app to Accessibility and Screen Recording in System Settings → Privacy & Security. Restart the terminal after.
 
-
+**Port conflict?**
 ```bash
-lsof -i :62222 #
-lsof -i :62220 #
+lsof -i :62222 # WebSocket port
+lsof -i :62220 # HTTP port
 ```
 
 ## Development
 
 ```bash
-
+git clone https://github.com/mergd/computer-use.git
+cd computer-use
 bun install
-
-# Build everything
 bun run build
 
-# Build extension only
-cd extension && ./build.sh
-
 # Run from source
 bun src/cli.ts browser serve --skip-permissions
 bun src/cli.ts mac serve
+
+# Build extension from source
+cd extension && ./build.sh
 ```
 
 ## Privacy
 
-See [PRIVACY.md](PRIVACY.md)
+See [PRIVACY.md](PRIVACY.md).
 
 ## License
 
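
The 0.1.2 README in this diff replaces the command-based browser entry with an HTTP endpoint (`http://127.0.0.1:62220/mcp`) and keeps a command-based entry for Mac mode. The sketch below is one way to write both entries into a single config and to check the two ports named under Troubleshooting without `lsof`. It is a minimal illustration, not part of the package. The helper names `build_mcp_config` and `port_in_use` are hypothetical. Merging both entries into one `mcpServers` object is an assumption, since the README shows them separately, and whether a given client accepts a `url` entry depends on that client.

```python
import json
import socket

# Values copied from the 0.1.2 README shown in the diff above.
HTTP_PORT = 62220       # labelled "HTTP port" under Troubleshooting
WEBSOCKET_PORT = 62222  # labelled "WebSocket port" under Troubleshooting


def build_mcp_config() -> dict:
    """Combine the README's browser and mac entries (assumed to be mergeable)."""
    return {
        "mcpServers": {
            "browser": {"url": f"http://127.0.0.1:{HTTP_PORT}/mcp"},
            "mac": {"command": "computer-control", "args": ["mac", "serve"]},
        }
    }


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print(json.dumps(build_mcp_config(), indent=2))
    for name, port in (("HTTP", HTTP_PORT), ("WebSocket", WEBSOCKET_PORT)):
        state = "in use" if port_in_use(port) else "free"
        print(f"{name} port {port}: {state}")
```

A port reported as "in use" before `computer-control browser serve` has been started would point to the port conflict that the Troubleshooting section describes. Once the server is running, the same result is expected.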