@vortex-os/computer-use 0.1.0

package/README.md ADDED
@@ -0,0 +1,74 @@
+ # @vortex-os/computer-use
+
+ Read-only **screen perception** for VortEX agents, exposed as an MCP server. It lets an agent *see* what is on screen (read a window's structure, capture a region as an image, and watch for on-screen changes) without ever moving the mouse or typing. It layers on `@vortex-os/base` but also works standalone.
+
+ > **Status: 0.1.0, Windows-first, read-only.** Mouse/keyboard **control is intentionally out of scope** for this release; this package only *perceives*. macOS/Linux backends are not yet implemented.
+
+ ## What it is
+
+ An MCP (Model Context Protocol) server that exposes six perception tools over stdio:
+
+ | Tool | What it does | Cost |
+ |---|---|---|
+ | `probe` | Reports whether this environment can perceive the screen (displays, DPI, capture latency). Never captures real screen content. | ~0 |
+ | `read_ui` | Reads the active/target window as a **structured accessibility tree** (UI Automation): element roles, coordinates, text. No image. | ~0 image tokens |
+ | `capture_screen` | Pixel capture (PNG) for what structure can't reach: canvases, games, remote desktops. Target by window, region, monitor, or cursor box. | image |
+ | `watch_capture` | Captures N frames at an interval in one process; with `changeOnly`, keeps only changed frames. | image(s) |
+ | `poll_change` | One non-blocking "did it change?" probe; returns a change percentage and (optionally) an image. Poll it on an interval to watch without blocking. | metadata, image optional |
+ | `beep` | A system beep, to get the user's attention while they look elsewhere. | n/a |
+
+ The design favors **structure first, pixels as fallback**: `read_ui` is cheap and precise for ordinary apps; `capture_screen` is for content that has no accessibility tree (games, custom canvases).
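
The `poll_change` row above suggests polling on an interval rather than blocking. A minimal sketch of such a loop follows. It is not taken from the package: `callTool` stands in for whatever tool-invocation function your MCP client provides, and the `changePercent` field name and the option names are assumptions made for illustration.

```javascript
// Hypothetical polling loop around the `poll_change` tool.
// `callTool(name, args)` is injected so the sketch does not depend on any
// particular MCP client; it must return a promise for the tool result.
async function watchForChange(callTool, options = {}) {
  const { intervalMs = 1000, thresholdPct = 1, maxPolls = 30 } = options;

  for (let poll = 1; poll <= maxPolls; poll++) {
    const result = await callTool("poll_change", {});

    // `changePercent` is an assumed field name, not documented by the package.
    const pct = typeof result?.changePercent === "number" ? result.changePercent : 0;
    if (pct >= thresholdPct) {
      return { changed: true, polls: poll, changePercent: pct };
    }

    // Yield between probes so the caller is never blocked for long.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }

  return { changed: false, polls: maxPolls, changePercent: 0 };
}
```

Capping the loop with `maxPolls` keeps a watch from running forever if nothing ever changes on screen.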
+
+ ## What it is NOT
+
+ - **Not control.** No clicking, typing, or app automation. Perception only.
+ - **Not comprehensive secret protection.** See *Privacy & redaction* below: the denylist is the real control; field-level masking is best-effort and does not catch plaintext secrets sitting in arbitrary windows.
+ - **Not cross-platform yet.** Windows only in 0.1.0.
+
+ ## Install
+
+ ```
+ npm i @vortex-os/computer-use
+ ```
+
+ Peer dependency: `@vortex-os/base` (`>=0.3.0 <1.0.0`). The MCP SDK (`@modelcontextprotocol/sdk`) is an optional dependency, loaded only when the server runs. No native build step.
+
+ ### Register the MCP server
+
+ The package ships a `vortex-mcp-computer-use` bin that launches the stdio server. Register it with your agent host. For Claude Code, add it to `.mcp.json`:
+
+ ```json
+ {
+   "mcpServers": {
+     "vortex-computer-use": {
+       "command": "npx",
+       "args": ["vortex-mcp-computer-use"]
+     }
+   }
+ }
+ ```
+
+ > Use a server name other than `computer-use` (e.g. `vortex-computer-use`): some hosts reserve that exact name and will silently skip a server registered under it. MCP servers load at session start, so **restart the agent** after adding it.
+
+ ## Privacy & redaction
+
+ Whatever you point this at is sent to your AI model. Two controls reduce accidental exposure; both run in the backend **before** any pixels or text reach the model:
+
+ 1. **Denylist (the primary control).** List window titles or process names that must never be captured. If a listed window is visible anywhere inside a capture region, the whole capture is refused (`{ "redacted": true }`, with no image and no text). This is the reliable defense against accidentally capturing a password manager or banking window during a watch.
+ 2. **Password-field masking.** In `read_ui`, fields the OS reports as password inputs are dropped (no value, no text, children not traversed).
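
To illustrate the second control, here is a rough sketch of pruning password fields from an accessibility tree before it leaves the backend. The node shape (`role`, `name`, `value`, `isPassword`, `children`) is hypothetical, and so is the choice to leave a placeholder node behind; the package may simply omit such nodes.

```javascript
// Hypothetical tree pruning: a password node keeps only its role, and its
// subtree is never visited, so no value or text below it can leak.
function maskPasswordFields(node) {
  if (node.isPassword) {
    // Assumed placeholder shape; value, name, and children are all discarded.
    return { role: node.role, redacted: true };
  }
  const children = (node.children ?? []).map(maskPasswordFields);
  return { ...node, children };
}
```

Returning before the recursion is what makes "children not traversed" hold: nothing under a password field is read at all.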
+
+ Copy `computer-use.config.example.json` to `computer-use.config.json` (next to the server) to configure the denylist, or set `VORTEX_CU_DENY_TITLES` / `VORTEX_CU_DENY_PROCS` (JSON arrays). **The denylist is read once at startup, so restart the server after changing it.**
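
The denylist behaviour can be sketched as follows, assuming the case-insensitive substring matching described in the example config file. The function names and the `{ title, process }` window shape are hypothetical, and how the real server combines the config file with the environment variables is not modelled here.

```javascript
// Parse a JSON-array value such as the content of VORTEX_CU_DENY_TITLES.
// This sketch throws on malformed input rather than silently disabling the list.
function parseDenyList(raw) {
  if (raw === undefined || raw === "") return [];
  const parsed = JSON.parse(raw); // throws SyntaxError on malformed JSON
  if (!Array.isArray(parsed) || !parsed.every((s) => typeof s === "string")) {
    throw new TypeError("deny list must be a JSON array of strings");
  }
  // Drop blank entries: an empty pattern would match every window.
  return parsed.filter((s) => s.trim() !== "");
}

// Case-insensitive substring match of one string against a list of patterns.
function matchesAny(text, patterns) {
  const haystack = String(text ?? "").toLowerCase();
  return patterns.some((p) => haystack.includes(p.toLowerCase()));
}

// True if any window visible in the capture region is denylisted, in which
// case the caller should refuse the whole capture.
function isCaptureDenied(visibleWindows, deny) {
  return visibleWindows.some(
    (w) => matchesAny(w.title, deny.titles) || matchesAny(w.process, deny.processes)
  );
}
```

A caller would check `isCaptureDenied(...)` before capturing anything and return `{ "redacted": true }` when it is true, so a single denylisted window blocks the entire region.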
+
+ **Honest limits.** This is *not* comprehensive secret-scanning. A plaintext token shown in a text editor or terminal (not a password field, not a denylisted window) will still be captured. Pixel-level password masking is intentionally out of scope for 0.1.0. Capture images are volatile: held only long enough to send, then deleted; they are never written to disk persistently.
+
+ ### Audit
+
+ Each perception call appends one metadata line (timestamp, tool, output size, a keyed HMAC of the output, and an HMAC of the window title) to a daily JSONL log under your user-local app data (`%LOCALAPPDATA%\vortex-computer-use\audit\`), which sits outside the synced instance data. **No raw images and no plaintext window titles are stored.** If the audit key can't be set up, perception still works and a warning is printed.
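
As an illustration of what one such audit line might contain, here is a small sketch using Node's built-in `crypto` module. The field names (`ts`, `tool`, `outputBytes`, `outputHmac`, `titleHmac`) and the use of HMAC-SHA256 are assumptions; the package's actual log schema and key handling are not documented here.

```javascript
// Hypothetical audit record builder: stores sizes and keyed digests only,
// never the output itself or the plaintext window title.
async function auditLine(key, { tool, output, windowTitle }) {
  // Dynamic import keeps the sketch loadable from both ESM and CommonJS.
  const { createHmac } = await import("node:crypto");
  const hmacHex = (text) => createHmac("sha256", key).update(text).digest("hex");

  const record = {
    ts: new Date().toISOString(),
    tool,
    outputBytes: Buffer.byteLength(output),
    outputHmac: hmacHex(output),
    titleHmac: hmacHex(windowTitle),
  };

  // One JSON object per line, as JSONL expects.
  return JSON.stringify(record) + "\n";
}
```

Because the digests are keyed, two entries can be compared for equality by someone holding the key, while the log on its own reveals neither the title nor the captured content.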
+
+ ## Verify
+
+ ```
+ npm run verify # node scripts/verify.mjs; needs a desktop session and captures the real screen
+ ```
+
+ Exercises every tool plus the redaction/audit gate (denylist blocking across all capture modes, no over-block, no title leak, audit written with no plaintext).
package/computer-use.config.example.json ADDED
@@ -0,0 +1,12 @@
+ {
+   "_comment": "Copy to computer-use.config.json to enable redaction. Empty by default — nothing is blocked until you add entries. The denylist is the primary control: any listed window/process that appears inside a capture region makes the whole capture fail-closed (no image, no structured text). Matching is case-insensitive substring. This is NOT comprehensive secret-scanning: plaintext secrets visible in non-listed windows (editors, terminals) are still captured. Env overrides: VORTEX_CU_DENY_TITLES / VORTEX_CU_DENY_PROCS (JSON arrays).",
+   "_restart": "The denylist is read once at server start; RESTART the MCP server (restart the agent / reload its MCP servers) after changing this file or the env vars for the change to take effect.",
+   "redaction": {
+     "denyWindowTitles": [],
+     "denyProcesses": []
+   },
+   "_examples": {
+     "denyWindowTitles": ["Bitwarden", "1Password", "KeePass", "Online Banking"],
+     "denyProcesses": ["Bitwarden", "1Password", "KeePassXC", "keeper"]
+   }
+ }
package/package.json ADDED
@@ -0,0 +1,50 @@
+ {
+   "name": "@vortex-os/computer-use",
+   "version": "0.1.0",
+   "description": "Add-on — read-only screen perception (structured UIA tree + pixel fallback + change watch) exposed as an MCP server, layered on @vortex-os/base. Windows-first. Control (mouse/keyboard) is intentionally out of scope.",
+   "license": "MIT",
+   "author": "vortex-os-project",
+   "homepage": "https://github.com/vortex-os-project/vortex#readme",
+   "repository": {
+     "type": "git",
+     "url": "git+https://github.com/vortex-os-project/vortex.git",
+     "directory": "modules/computer-use"
+   },
+   "type": "module",
+   "files": [
+     "scripts/mcp-stdio.mjs",
+     "scripts/worker.ps1",
+     "scripts/lib.ps1",
+     "scripts/probe.ps1",
+     "scripts/read-ui.ps1",
+     "scripts/point-to-ask.ps1",
+     "computer-use.config.example.json",
+     "README.md"
+   ],
+   "bin": {
+     "vortex-mcp-computer-use": "scripts/mcp-stdio.mjs"
+   },
+   "scripts": {
+     "verify": "node scripts/verify.mjs"
+   },
+   "peerDependencies": {
+     "@vortex-os/base": ">=0.3.0 <1.0.0"
+   },
+   "peerDependenciesMeta": {
+     "@vortex-os/base": {
+       "optional": false
+     }
+   },
+   "optionalDependencies": {
+     "@modelcontextprotocol/sdk": "^1.21.0"
+   },
+   "engines": {
+     "node": ">=22"
+   },
+   "os": [
+     "win32"
+   ],
+   "publishConfig": {
+     "access": "public"
+   }
+ }