@agent-sh/computer-use-linux 0.2.1 → 0.2.3
- package/README.md +56 -20
- package/npm/README.md +14 -0
- package/npm/install.js +1 -1
- package/package.json +5 -4
- package/skills/computer-use-linux/SKILL.md +125 -0
package/README.md
CHANGED
@@ -1,12 +1,22 @@
-
+<div align="center">
+<h1>computer-use-linux</h1>
+<p><strong>Linux desktop control for MCP hosts.</strong></p>
+<p>
+<a href="https://github.com/agent-sh/computer-use-linux/actions/workflows/ci.yml"><img src="https://github.com/agent-sh/computer-use-linux/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
+<a href="https://crates.io/crates/computer-use-linux"><img src="https://img.shields.io/crates/v/computer-use-linux.svg" alt="crates.io"></a>
+<a href="https://www.npmjs.com/package/@agent-sh/computer-use-linux"><img src="https://img.shields.io/npm/v/@agent-sh/computer-use-linux.svg" alt="npm"></a>
+<a href="LICENSE"><img src="https://img.shields.io/badge/License-MIT-yellow.svg" alt="License: MIT"></a>
+</p>
+</div>
+
+Linux desktop control for any MCP host: AT-SPI accessibility trees, portal screenshots, Wayland/X11 input, and multi-compositor window targeting for GNOME, KDE/KWin, Hyprland, i3, and COSMIC.
 
-
-
-
-
-[](LICENSE)
+```bash
+npm install -g @agent-sh/computer-use-linux@0.2.1
+computer-use-linux doctor | jq .readiness
+```
 
-Current release: [`v0.2.1`](https://github.com/
+Current release: [`v0.2.1`](https://github.com/agent-sh/computer-use-linux/releases/tag/v0.2.1). The Rust crate is published as [`computer-use-linux`](https://crates.io/crates/computer-use-linux), and the npm wrapper is published as [`@agent-sh/computer-use-linux`](https://www.npmjs.com/package/@agent-sh/computer-use-linux).
 
 ## What this is
 
@@ -102,7 +112,7 @@ COSMIC users do not need a second package or a separate helper install when usin
 Installs system packages on Debian/Ubuntu, Fedora/RHEL-like, or Arch-like distros; installs Rust if needed; builds both release binaries; installs them to `~/.local/bin`; enables `ydotoold` as a user service; enables GNOME AT-SPI settings when running under GNOME; and installs the bundled GNOME Shell extension on GNOME Wayland.
 
 ```bash
-git clone https://github.com/
+git clone https://github.com/agent-sh/computer-use-linux
 cd computer-use-linux
 ./install.sh
 # log out and back in if the GNOME extension was newly installed
@@ -121,7 +131,7 @@ computer-use-linux doctor
 For unreleased changes from `main`, install directly from Git:
 
 ```bash
-cargo install --git https://github.com/
+cargo install --git https://github.com/agent-sh/computer-use-linux
 ```
 
 Then, as needed:
@@ -148,15 +158,15 @@ You will still need `ydotoold` running and AT-SPI enabled (run `computer-use-lin
 
 Linux x86_64 / aarch64 builds are published with each tag. Each binary ships a `.sha256` next to it.
 
-- Release: <https://github.com/
+- Release: <https://github.com/agent-sh/computer-use-linux/releases/tag/v0.2.1>
 
 ```bash
 target=x86_64-unknown-linux-gnu
 version=v0.2.1
 for binary in computer-use-linux computer-use-linux-cosmic; do
   asset="$binary-$target"
-  curl -L -O "https://github.com/
-  curl -L -O "https://github.com/
+  curl -L -O "https://github.com/agent-sh/computer-use-linux/releases/download/$version/$asset"
+  curl -L -O "https://github.com/agent-sh/computer-use-linux/releases/download/$version/$asset.sha256"
   sha256sum -c "$asset.sha256"
   install -m 0755 "$asset" "$HOME/.local/bin/$binary"
 done
@@ -191,29 +201,51 @@ Restart Claude Desktop. The 15 tools should appear in the tools list.
 
 ### Hermes Agent
 
-
+Install the companion Hermes skill so Hermes has the desktop-specific runbook:
 
 ```bash
-hermes
-hermes
+hermes skills tap add agent-sh/computer-use-linux
+hermes skills install agent-sh/computer-use-linux/computer-use-linux
 ```
 
-
+The skill is optional but recommended for Hermes users. It teaches Hermes how to install, configure, verify, and call the Linux desktop MCP safely. It follows the same `skills/<name>/SKILL.md` tap layout used by Hermes community skills.
 
-
+Then add the stdio MCP server:
+
+```bash
+hermes mcp add computer-use-linux --command computer-use-linux --args mcp
+hermes mcp test computer-use-linux
+hermes mcp configure computer-use-linux
+```
 
-
+`configure` opens Hermes' tool-selection UI for the server. The generated config should look like this:
 
 ```yaml
 mcp_servers:
   computer-use-linux:
     command: computer-use-linux
     args: ["mcp"]
+    timeout: 120
+    connect_timeout: 30
 
 # Optional: expose the tools to subagents as well.
 inherit_mcp_toolsets: true
 ```
 
+If you installed the binary somewhere that is not on `PATH`, pass the absolute path as `--command`.
+
+Restart Hermes after editing the config. Hermes registers the tools as `mcp_computer_use_linux_<tool>` and creates the `mcp-computer-use-linux` runtime toolset.
+
+You can verify both sides before asking Hermes to use the desktop:
+
+```bash
+computer-use-linux doctor | jq .readiness
+hermes skills inspect agent-sh/computer-use-linux/computer-use-linux
+hermes chat --toolsets mcp-computer-use-linux -q "List the current desktop windows."
+```
+
+For one-off installs without adding the tap first, Hermes also accepts `hermes skills install agent-sh/computer-use-linux/skills/computer-use-linux`.
+
 ### Generic MCP client
 
 Spawn the binary with `["mcp"]` as the argv tail. It speaks JSON-RPC over stdio per the rmcp 2024-11-05 protocol; capability discovery happens through `tools/list` and the `doctor` tool. The server normally needs no MCP-specific configuration, but desktop runtime environment still matters (`DBUS_SESSION_BUS_ADDRESS`, `XDG_RUNTIME_DIR`, portals, AT-SPI, `ydotoold`, and optionally `COMPUTER_USE_LINUX_COSMIC_HELPER`).
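As a rough illustration of the stdio exchange described for generic MCP clients, the sketch below frames the three messages a client would typically send before it can read the tool list. It assumes the standard MCP handshake (`initialize`, then `notifications/initialized`, then `tools/list`) carried as newline-delimited JSON; the `frame` and `buildHandshake` helpers and the `example-client` name are hypothetical and not part of this package.

```javascript
// Sketch only: builds the opening MCP messages for a stdio server.
// Assumption: one JSON-RPC object per line, as in the MCP stdio transport.
function frame(message) {
  // JSON.stringify never emits raw newlines, so one message stays on one line.
  return JSON.stringify(message) + '\n';
}

function buildHandshake() {
  return [
    {
      jsonrpc: '2.0',
      id: 1,
      method: 'initialize',
      params: {
        protocolVersion: '2024-11-05',
        capabilities: {},
        // Placeholder client identity, chosen for illustration.
        clientInfo: { name: 'example-client', version: '0.0.0' },
      },
    },
    // Notifications carry no id and expect no response.
    { jsonrpc: '2.0', method: 'notifications/initialized' },
    { jsonrpc: '2.0', id: 2, method: 'tools/list' },
  ];
}
```

A real client would write each framed message to the stdin of `computer-use-linux mcp` and wait for the `initialize` response before sending the remaining two.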
@@ -284,6 +316,10 @@ If you're running this on a shared workstation, set `ydotoold`'s socket permissi
 
 If `doctor` is green and a specific tool still misbehaves, file an issue with the JSON output of `doctor` and the failing tool's request payload.
 
+## Contributing
+
+Contributions are welcome. See [CONTRIBUTING.md](CONTRIBUTING.md) for the local development workflow, CI gates, and PR expectations. Report security vulnerabilities through [SECURITY.md](SECURITY.md), not public issues.
+
 ## Credits
 
 Extracted from [`codex-desktop-linux`](https://github.com/avifenesh/codex-desktop-linux), the Linux distribution of Codex Desktop, which continues to ship this same binary as a bundled plugin. Maintained by [Avi Fenesh](https://github.com/avifenesh).
@@ -301,8 +337,8 @@ Built on top of:
 Publishing is tag-driven from GitHub Actions. The repository needs these Actions secrets:
 
 ```bash
-gh secret set CARGO_REGISTRY_TOKEN -R
-gh secret set NPM_TOKEN -R
+gh secret set CARGO_REGISTRY_TOKEN -R agent-sh/computer-use-linux
+gh secret set NPM_TOKEN -R agent-sh/computer-use-linux
 ```
 
 Then bump `Cargo.toml` and `package.json` together, update `CHANGELOG.md`, and push a `vX.Y.Z` tag. CI runs the full Rust and MCP safety gates, builds release assets for both architectures, publishes `computer-use-linux` to crates.io, and publishes the npm wrapper after the GitHub release binaries are available.
package/npm/README.md
CHANGED
@@ -12,8 +12,22 @@ desktop actions.
 ```bash
 npm install -g @agent-sh/computer-use-linux@0.2.1
 computer-use-linux doctor
+hermes skills tap add agent-sh/computer-use-linux
+hermes skills install agent-sh/computer-use-linux/computer-use-linux
 hermes mcp add computer-use-linux --command computer-use-linux --args mcp
 hermes mcp test computer-use-linux
+hermes mcp configure computer-use-linux
+```
+
+The generated Hermes config should look like this:
+
+```yaml
+mcp_servers:
+  computer-use-linux:
+    command: computer-use-linux
+    args: ["mcp"]
+    timeout: 120
+    connect_timeout: 30
 ```
 
 The package downloads the matching Linux x86_64 or aarch64 binary from the
package/npm/install.js
CHANGED
@@ -102,7 +102,7 @@ async function main() {
   const cosmicAsset = `computer-use-linux-cosmic-${targetArch}-unknown-linux-gnu`;
   const baseUrl =
     process.env.COMPUTER_USE_LINUX_DOWNLOAD_BASE ||
-    `https://github.com/
+    `https://github.com/agent-sh/computer-use-linux/releases/download/v${pkg.version}`;
   const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'computer-use-linux-'));
   const tmpBinary = path.join(tmpDir, asset);
   const tmpSha = path.join(tmpDir, `${asset}.sha256`);
package/package.json
CHANGED
@@ -1,16 +1,16 @@
 {
   "name": "@agent-sh/computer-use-linux",
-  "version": "0.2.
+  "version": "0.2.3",
   "description": "Linux desktop-control MCP server: AT-SPI accessibility trees, Wayland/X11 input, screenshots, and compositor window targeting.",
   "license": "MIT",
   "type": "commonjs",
-  "homepage": "https://github.com/
+  "homepage": "https://github.com/agent-sh/computer-use-linux#readme",
   "repository": {
     "type": "git",
-    "url": "git+https://github.com/
+    "url": "git+https://github.com/agent-sh/computer-use-linux.git"
   },
   "bugs": {
-    "url": "https://github.com/
+    "url": "https://github.com/agent-sh/computer-use-linux/issues"
   },
   "keywords": [
     "mcp",
@@ -34,6 +34,7 @@
   "files": [
     "LICENSE",
     "README.md",
+    "skills/computer-use-linux/SKILL.md",
     "npm/README.md",
     "npm/bin/computer-use-linux.js",
     "npm/install.js"
package/skills/computer-use-linux/SKILL.md
@@ -0,0 +1,125 @@
+---
+name: computer-use-linux
+description: "Use when Hermes needs Linux desktop observation or control through the computer-use-linux MCP server."
+version: 0.2.1
+author: agent-sh
+license: MIT
+platforms: [linux]
+---
+
+# computer-use-linux
+
+Use `computer-use-linux` when Hermes needs to observe or operate a local Linux desktop through MCP: inspect the accessibility tree, list/focus windows, take screenshots, click, scroll, type, press keys, or invoke AT-SPI actions.
+
+## When to Use
+
+Use this skill when:
+- The user wants Hermes to control a Linux GUI app.
+- You need desktop state from AT-SPI, screenshots, or compositor window metadata.
+- You are configuring the `computer-use-linux` MCP server for Hermes.
+- A desktop action needs target-aware input instead of blind shell commands.
+
+Do not use this for remote browsers, websites, or headless automation when a browser-specific tool is available. Do not assume desktop actions are safe just because the MCP connection works.
+
+## Install
+
+Preferred install for Hermes users:
+
+```bash
+npm install -g @agent-sh/computer-use-linux@0.2.1
+computer-use-linux doctor | jq .readiness
+```
+
+Rust users can install the same server from crates.io:
+
+```bash
+cargo install computer-use-linux --version 0.2.1
+computer-use-linux doctor | jq .readiness
+```
+
+If `doctor` reports missing input or accessibility support, run:
+
+```bash
+computer-use-linux setup
+systemctl --user enable --now ydotoold
+computer-use-linux setup-window-targeting
+computer-use-linux doctor | jq .readiness
+```
+
+On GNOME Wayland, log out and back in after `setup-window-targeting` if the GNOME Shell extension was newly installed.
+
+## Configure Hermes
+
+Add the server with the Hermes MCP CLI:
+
+```bash
+hermes mcp add computer-use-linux --command computer-use-linux --args mcp
+hermes mcp test computer-use-linux
+hermes mcp configure computer-use-linux
+```
+
+`configure` opens Hermes' tool-selection UI for this MCP server.
+
+The generated config should look like this:
+
+```yaml
+mcp_servers:
+  computer-use-linux:
+    command: computer-use-linux
+    args: ["mcp"]
+    timeout: 120
+    connect_timeout: 30
+```
+
+If the binary is not on `PATH`, pass the absolute path to `--command`.
+
+Hermes registers tools using the `mcp_<server>_<tool>` pattern. With this config, tool names are prefixed as `mcp_computer_use_linux_`, for example:
+
+| MCP tool | Hermes tool name |
+| --- | --- |
+| `doctor` | `mcp_computer_use_linux_doctor` |
+| `get_app_state` | `mcp_computer_use_linux_get_app_state` |
+| `list_windows` | `mcp_computer_use_linux_list_windows` |
+| `click` | `mcp_computer_use_linux_click` |
+| `type_text` | `mcp_computer_use_linux_type_text` |
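The naming rule in the table can be sketched as a one-line mapping. This is an illustration only: it assumes Hermes turns hyphens in the server name into underscores, which is inferred from the `mcp_computer_use_linux_` prefix shown here rather than from Hermes documentation, and `hermesToolName` is a hypothetical helper.

```javascript
// Sketch: derive the Hermes-side tool name from an MCP server and tool name.
// Assumption: hyphens become underscores; other characters pass through.
function hermesToolName(server, tool) {
  const normalize = (name) => name.replace(/-/g, '_');
  return `mcp_${normalize(server)}_${normalize(tool)}`;
}
```

For example, `hermesToolName('computer-use-linux', 'list_windows')` yields the `mcp_computer_use_linux_list_windows` entry from the table.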
+
+Restart Hermes after changing MCP config.
+
+## Procedure
+
+1. Start every desktop-control session with `doctor`.
+2. If `can_build_accessibility_tree` is false, run `setup` and restart the target app.
+3. If `can_query_windows` is false on GNOME Wayland, run `setup-window-targeting` and ask the user to log out and back in if setup says the shell extension needs a reload.
+4. Before targeted input, call `list_windows` or `focused_window` and verify the intended window by title, app id, pid, or wm class.
+5. Prefer semantic targeting from `get_app_state`: use element indices or role/name/text/states selectors.
+6. Use coordinates only when the UI surface has no useful accessibility tree.
+7. For text input, prefer `type_text` with a target selector (`window_id`, `pid`, `app_id`, `wm_class`, `title`, `tty`, `terminal_pid`, `terminal_command`, or `terminal_cwd`) rather than relying on current focus.
+8. After mutating actions, re-check state with `get_app_state`, `focused_window`, or an app-specific readback.
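Steps 1 to 3 amount to a small decision rule over the `doctor` readiness object. A minimal sketch, assuming the readiness JSON exposes `can_build_accessibility_tree` and `can_query_windows` as booleans; `nextSetupCommands` and its `isGnomeWayland` parameter are hypothetical names introduced for illustration.

```javascript
// Sketch: map a doctor readiness object to the setup commands from steps 2 and 3.
// Assumption: readiness fields are plain booleans, as the procedure implies.
function nextSetupCommands(readiness, isGnomeWayland) {
  const commands = [];
  if (readiness.can_build_accessibility_tree === false) {
    // Step 2: enable accessibility, then restart the target app.
    commands.push('computer-use-linux setup');
  }
  if (readiness.can_query_windows === false && isGnomeWayland) {
    // Step 3: a logout/login may be needed if the shell extension was just installed.
    commands.push('computer-use-linux setup-window-targeting');
  }
  return commands;
}
```

An empty result means neither of these two setup steps applies; it does not by itself mean the desktop is ready for input.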
+
+## Pitfalls
+
+- Already-running GTK, Qt, and Electron apps may need a restart after AT-SPI is enabled.
+- GNOME may show a portal prompt on the first screenshot or `get_app_state` call with screenshots enabled.
+- Desktop input is stateful. Avoid concurrent tool calls against this MCP server.
+- `click`, `drag`, `press_key`, `type_text`, `perform_action`, and `set_value` can change real application state.
+- `ydotoold` should run as a per-user service with its socket under `/run/user/$UID`, not as a system-wide service.
+- On COSMIC, the standard npm, Cargo, and install-script paths install the `computer-use-linux-cosmic` helper automatically. Manual binary installs must copy both binaries.
+
+## Verification
+
+Run:
+
+```bash
+computer-use-linux doctor | jq .readiness
+hermes chat --toolsets mcp-computer-use-linux -q "List the current desktop windows."
+```
+
+Ready output should have:
+
+- `can_register_mcp_tools: true`
+- `can_build_accessibility_tree: true`
+- `can_query_windows: true`
+- `can_send_development_input: true`
+- `blockers: []`
+
+If Hermes does not expose the tools, check startup logs for MCP discovery errors and confirm the server name in `config.yaml` is exactly `computer-use-linux`.
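The readiness fields listed under Verification in the SKILL.md hunk can be checked programmatically. The sketch below assumes the object printed by `computer-use-linux doctor | jq .readiness` has exactly those four boolean flags plus a `blockers` array; `REQUIRED_FLAGS` and `readinessProblems` are hypothetical names, not part of the package.

```javascript
// Sketch: list what keeps a doctor readiness object from being "ready".
// Assumption: the four flags are booleans and blockers is an array.
const REQUIRED_FLAGS = [
  'can_register_mcp_tools',
  'can_build_accessibility_tree',
  'can_query_windows',
  'can_send_development_input',
];

function readinessProblems(readiness) {
  // Any flag that is not strictly true counts as a problem.
  const problems = REQUIRED_FLAGS.filter((flag) => readiness[flag] !== true);
  if (!Array.isArray(readiness.blockers)) {
    problems.push('blockers');
  } else {
    for (const blocker of readiness.blockers) {
      problems.push(`blocker: ${JSON.stringify(blocker)}`);
    }
  }
  return problems;
}
```

An empty array corresponds to the ready state described in the skill; anything else names the flag or blocker to investigate before sending input.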