computer-control 0.1.0
- package/README.md +222 -0
- package/dist/cli.js +51899 -0
- package/dist/native-host-entry.js +3084 -0
- package/package.json +41 -0
package/README.md
ADDED
@@ -0,0 +1,222 @@
# Computer Control

Browser automation and macOS desktop control for AI agents via the Model Context Protocol (MCP).

## Features

**Browser Mode** (Chrome Extension)
- Take screenshots of web pages
- Click, type, scroll, and navigate
- Read page content and accessibility trees
- Execute JavaScript in page context
- Record and export GIFs
- Manage tabs and windows

**Mac Mode** (Native macOS)
- Control mouse and keyboard
- Take screenshots and run OCR
- Read accessibility trees
- Execute AppleScript
- Record GIF screen captures

## Quick Start

### Option 1: Install from Chrome Web Store (Recommended)

1. **Install the extension** from the [Chrome Web Store](https://chrome.google.com/webstore/detail/computer-control/kenhnnhgbbgkdbedfmijnllgpcognghl)

2. **Install the CLI**
   ```bash
   npm install -g computer-control
   # or
   bun install -g computer-control
   ```

3. **Run the setup wizard**
   ```bash
   computer-control browser install
   ```
   When prompted for the extension ID, enter: `kenhnnhgbbgkdbedfmijnllgpcognghl`

4. **Add to your MCP config** (Claude Code, Cursor, etc.)
   ```json
   {
     "mcpServers": {
       "computer-control-browser": {
         "command": "computer-control",
         "args": ["browser", "serve", "--skip-permissions"]
       }
     }
   }
   ```

5. **Restart your AI assistant** and start automating!
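
If your MCP config already lists other servers, step 4 amounts to adding one more entry rather than replacing the file. A minimal sketch of that merge in TypeScript, assuming the config has already been parsed into an object; `addMcpServer`, `McpConfig`, and the `other-server` entry are illustrative names, not part of this package:

```typescript
// Shape of one server entry, taken from the JSON in step 4.
interface McpServerEntry {
  command: string;
  args: string[];
}

// Minimal view of an MCP client config; real files may carry other keys.
interface McpConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown;
}

// Return a copy of `config` with `entry` registered under `name`,
// leaving every other server and top-level key untouched.
function addMcpServer(config: McpConfig, name: string, entry: McpServerEntry): McpConfig {
  return {
    ...config,
    mcpServers: { ...(config.mcpServers ?? {}), [name]: entry },
  };
}

// Hypothetical existing config with one unrelated server.
const existing: McpConfig = {
  mcpServers: { "other-server": { command: "other", args: [] } },
};

const updated = addMcpServer(existing, "computer-control-browser", {
  command: "computer-control",
  args: ["browser", "serve", "--skip-permissions"],
});

console.log(JSON.stringify(updated, null, 2));
```

The same approach applies to the `computer-control-mac` entry shown under Mac Mode Setup.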

### Option 2: Load Extension from Source

1. **Clone the repository**
   ```bash
   git clone https://github.com/mergd/computer-use.git
   cd computer-use
   bun install
   bun run build
   ```

2. **Build the extension**
   ```bash
   cd extension && ./build.sh
   ```

3. **Load in Chrome**
   - Open `chrome://extensions`
   - Enable "Developer mode"
   - Click "Load unpacked"
   - Select the `extension/dist` folder
   - Copy the extension ID (32 lowercase letters)

4. **Run the setup wizard**
   ```bash
   computer-control browser install --extension-id YOUR_EXTENSION_ID
   ```

5. **Add to your MCP config** (same as step 4 of Option 1)
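
A mistyped or unreplaced extension ID is an easy way for step 4 to go wrong. Chrome extension IDs are 32 characters from the letters `a` to `p`, so a quick format check can catch the placeholder before you run the wizard. A minimal sketch; `isValidExtensionId` is a hypothetical helper, not something the CLI exposes:

```typescript
// Chrome extension IDs are 32 characters drawn from the letters a-p.
const EXTENSION_ID_PATTERN = /^[a-p]{32}$/;

// Check the format only; this does not confirm the extension is installed.
function isValidExtensionId(id: string): boolean {
  return EXTENSION_ID_PATTERN.test(id.trim());
}

// Published ID from Option 1.
console.log(isValidExtensionId("kenhnnhgbbgkdbedfmijnllgpcognghl")); // true
// Placeholder that was never replaced.
console.log(isValidExtensionId("YOUR_EXTENSION_ID")); // false
```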

## Mac Mode Setup

For native macOS control (no browser needed):

```bash
# Run the setup wizard
computer-control mac setup

# Check status
computer-control mac status
```

**Requirements:**
- `cliclick` for mouse/keyboard: `brew install cliclick`
- `gifsicle` for GIF recording: `brew install gifsicle`

**macOS Permissions** (grant to your terminal app):
- Accessibility (System Settings → Privacy & Security → Accessibility)
- Screen Recording (System Settings → Privacy & Security → Screen Recording)

**MCP Config:**
```json
{
  "mcpServers": {
    "computer-control-mac": {
      "command": "computer-control",
      "args": ["mac", "serve"]
    }
  }
}
```

## CLI Commands

```
computer-control browser
├── install      Interactive setup wizard
├── status       Check installation status
├── serve        Start MCP server
├── path         Print extension directory
└── uninstall    Remove native host

computer-control mac
├── setup        Interactive setup wizard
├── status       Check dependencies & permissions
└── serve        Start MCP server
```
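
For scripts that spawn the CLI, the tree above can be mirrored as a small lookup table so that typos are caught before the process starts. A sketch under that assumption; `COMMANDS` and `buildArgs` are illustrative names, and only the subcommands and the `--skip-permissions` flag come from this README:

```typescript
// Subcommands per mode, copied from the command tree.
const COMMANDS = {
  browser: ["install", "status", "serve", "path", "uninstall"],
  mac: ["setup", "status", "serve"],
} as const;

type Mode = keyof typeof COMMANDS;

// Build the argument list for `computer-control`, rejecting unknown pairs.
function buildArgs(mode: Mode, subcommand: string, flags: string[] = []): string[] {
  const allowed: readonly string[] = COMMANDS[mode];
  if (allowed.indexOf(subcommand) === -1) {
    throw new Error(`Unknown subcommand "${subcommand}" for mode "${mode}"`);
  }
  return [mode, subcommand, ...flags];
}

console.log(buildArgs("browser", "serve", ["--skip-permissions"]));
console.log(buildArgs("mac", "status"));
```

The returned array has the same shape as the `args` field in the MCP config entries.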

## Available Tools

### Browser Mode

| Tool | Description |
|------|-------------|
| `read_page` | Get accessibility tree of page elements |
| `find` | Find elements by natural language query |
| `computer` | Mouse/keyboard actions and screenshots |
| `navigate` | Navigate to URL or go back/forward |
| `form_input` | Set form input values |
| `javascript_tool` | Execute JavaScript in page context |
| `get_page_text` | Extract raw text content from page |
| `tabs_context` | Get tab group context info |
| `tabs_create` | Create new tab in MCP group |
| `resize_window` | Resize browser window |
| `gif_creator` | Record and export GIF of browser actions |
| `upload_image` | Upload image to file input or drag target |

### Mac Mode

| Tool | Description |
|------|-------------|
| `screenshot` | Capture screen or region |
| `mouse_click` | Click at coordinates |
| `mouse_move` | Move cursor |
| `mouse_scroll` | Scroll in direction |
| `mouse_drag` | Drag from one point to another |
| `type_text` | Type text at cursor |
| `key_press` | Press key with modifiers |
| `run_applescript` | Execute AppleScript |
| `get_active_window` | Get focused window info |
| `list_windows` | List open windows |
| `focus_app` | Bring app to foreground |
| `get_accessibility_tree` | Get UI element hierarchy |
| `ocr_screen` | Extract text via OCR |
| `find` | Find elements by natural language |
| `gif_start/stop/export` | Record screen as GIF |
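
MCP clients invoke these tools with a JSON-RPC 2.0 `tools/call` request whose `params` carry the tool name and an `arguments` object. The sketch below builds such a request for `navigate`. The `url` argument name is an assumption (this README does not list tool parameters), so check the schema your client reports via `tools/list` before relying on it; `buildToolCall` is likewise an illustrative helper:

```typescript
// JSON-RPC 2.0 envelope for an MCP `tools/call` request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Wrap a tool name and its arguments in the request envelope.
function buildToolCall(id: number, name: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// `url` is an assumed argument name; the real schema comes from `tools/list`.
const request = buildToolCall(1, "navigate", { url: "https://example.com" });
console.log(JSON.stringify(request));
```

In practice an MCP client library builds this envelope for you; seeing it spelled out mainly helps when reading server logs or debugging a failed call.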

## Troubleshooting

### Extension not connecting

1. Check that the extension is enabled in `chrome://extensions`
2. Verify the native host is registered:
   ```bash
   computer-control browser status
   ```
3. Make sure Chrome is running
4. Try restarting Chrome

### Permission errors on Mac

Grant permissions to your terminal app in System Settings:
- Privacy & Security → Accessibility
- Privacy & Security → Screen Recording

Then restart your terminal.

### MCP server not starting

Check whether either port is already in use:
```bash
lsof -i :62222  # Browser mode WebSocket port
lsof -i :62220  # Browser mode HTTP port
```

## Development

```bash
# Install dependencies
bun install

# Build everything
bun run build

# Build extension only
cd extension && ./build.sh

# Run from source
bun src/cli.ts browser serve --skip-permissions
bun src/cli.ts mac serve
```

## Privacy

See [PRIVACY.md](PRIVACY.md) for our privacy policy.

## License

MIT