@different-ai/opencode-browser 3.0.0 → 4.0.1

package/README.md CHANGED
@@ -1,14 +1,16 @@
1
1
  # OpenCode Browser
2
2
 
3
- Browser automation MCP server for [OpenCode](https://github.com/opencode-ai/opencode).
3
+ Browser automation plugin for [OpenCode](https://github.com/opencode-ai/opencode).
4
4
 
5
- Control your real Chrome browser with existing logins, cookies, and bookmarks. No DevTools Protocol, no security prompts.
5
+ Control your real Chromium browser (Chrome/Brave/Arc/Edge) using your existing profile (logins, cookies, bookmarks). No DevTools Protocol, no security prompts.
6
6
 
7
- ## Why?
7
+ ## Why this architecture
8
8
 
9
- Chrome 136+ blocks `--remote-debugging-port` on your default profile for security reasons. DevTools-based automation (like Playwright) triggers a security prompt every time.
9
+ This version is optimized for reliability and predictable multi-session behavior:
10
10
 
11
- OpenCode Browser uses a simple WebSocket connection between an MCP server and a Chrome extension. Your automation works with your existing browser session - no prompts, no separate profiles.
11
+ - **No WebSocket port**, so no port conflicts
12
+ - **Chrome Native Messaging** between extension and a local host process
13
+ - A local **broker** multiplexes multiple OpenCode plugin sessions and enforces **per-tab ownership**
12
14
 
13
15
  ## Installation
14
16
 
@@ -17,111 +19,67 @@ npx @different-ai/opencode-browser install
17
19
  ```
18
20
 
19
21
  The installer will:
22
+
20
23
  1. Copy the extension to `~/.opencode-browser/extension/`
21
- 2. Guide you to load the extension in Chrome
22
- 3. Update your `opencode.json` with MCP server config
24
+ 2. Walk you through loading + pinning it in `chrome://extensions`
25
+ 3. Ask for the extension ID and install a **Native Messaging Host manifest**
26
+ 4. Update your `.opencode.json` to load the plugin
23
27
 
24
- ## Configuration
28
+ ### Configure OpenCode
25
29
 
26
- Add to your `opencode.json`:
30
+ Your `.opencode.json` should contain:
27
31
 
28
32
  ```json
29
33
  {
30
- "mcp": {
31
- "browser": {
32
- "type": "local",
33
- "command": ["bunx", "@different-ai/opencode-browser", "serve"]
34
- }
35
- }
34
+ "$schema": "https://opencode.ai/config.json",
35
+ "plugin": ["@different-ai/opencode-browser"]
36
36
  }
37
37
  ```
38
38
 
39
- Then load the extension in Chrome:
40
- 1. Go to `chrome://extensions`
41
- 2. Enable "Developer mode"
42
- 3. Click "Load unpacked" and select `~/.opencode-browser/extension/`
43
-
44
- ## Available Tools
45
-
46
- | Tool | Description |
47
- |------|-------------|
48
- | `browser_status` | Check if browser extension is connected |
49
- | `browser_navigate` | Navigate to a URL |
50
- | `browser_click` | Click an element by CSS selector |
51
- | `browser_type` | Type text into an input field |
52
- | `browser_screenshot` | Capture the page (returns base64, optionally saves to file) |
53
- | `browser_snapshot` | Get accessibility tree with selectors + all page links |
54
- | `browser_get_tabs` | List all open tabs |
55
- | `browser_scroll` | Scroll page or element into view |
56
- | `browser_wait` | Wait for a duration |
57
- | `browser_execute` | Run JavaScript in page context |
58
-
59
- ### Screenshot Tool
60
-
61
- The `browser_screenshot` tool returns base64 image data by default, allowing AI to view images directly:
62
-
63
- ```javascript
64
- // Returns base64 image (AI can view it)
65
- browser_screenshot()
66
-
67
- // Save to current working directory
68
- browser_screenshot({ save: true })
69
-
70
- // Save to specific path
71
- browser_screenshot({ path: "my-screenshot.png" })
72
- ```
73
-
74
- ## Architecture
39
+ ## How it works
75
40
 
76
41
  ```
77
- OpenCode <──STDIO──> MCP Server <──WebSocket:19222──> Chrome Extension
78
- │ │
79
- └── @modelcontextprotocol/sdk └── chrome.tabs, chrome.scripting
42
+ OpenCode Plugin <-> Local Broker (unix socket) <-> Native Host <-> Chrome Extension
80
43
  ```
81
44
 
82
- **Two components:**
83
- 1. MCP Server (runs as separate process, manages WebSocket server)
84
- 2. Chrome extension (connects to server, executes browser commands)
45
+ - The extension connects to the native host.
46
+ - The plugin talks to the broker over a local unix socket.
47
+ - The broker forwards tool requests to the extension and enforces tab ownership.
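The broker side of this flow speaks newline-delimited JSON, as `broker.cjs` in this diff shows. The following is a minimal sketch of the messages a plugin session might send, not the plugin's actual code. The message fields (`type`, `op`, `id`, `tool`, `args`, `ok`, `data`, `error`) mirror what the broker reads and writes; the role string `"plugin"`, the tool name `"navigate"`, its `url` argument, and all helper names are assumptions made for illustration.

```javascript
"use strict";

// Serialize one message as a single newline-terminated JSON line.
function frame(msg) {
  return JSON.stringify(msg) + "\n";
}

// Greeting sent once per connection.
// Assumption: the role string "plugin"; the broker only special-cases "native-host".
function helloMessage(sessionId) {
  return { type: "hello", role: "plugin", sessionId };
}

// Tool request. The broker ignores requests whose `id` is not a number.
function toolRequest(id, tool, args = {}) {
  return { type: "request", id, op: "tool", tool, args };
}

// Return the payload of a successful response, or throw the broker's error.
function unwrapResponse(res) {
  if (res.type !== "response") throw new Error("Not a response message");
  if (!res.ok) throw new Error(res.error);
  return res.data;
}

// Hypothetical tool name and arguments, for illustration only.
const wire =
  frame(helloMessage("session-a")) +
  frame(toolRequest(1, "navigate", { url: "https://example.com" }));

console.log(wire);
```

In a real session these lines would be written to the unix socket, and each reply would be matched to its request by `id`.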
85
48
 
86
- **Benefits of MCP architecture:**
87
- - No session conflicts between OpenCode instances
88
- - Server runs independently of OpenCode process
89
- - Clean separation of concerns
90
- - Standard MCP protocol
49
+ ## Per-tab ownership
91
50
 
92
- ## Upgrading from v2.x (Plugin)
51
+ - The first time a session touches a tab, the broker **auto-claims** it for that session.
52
+ - Other sessions attempting to use the same tab will get an error.
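The ownership rules above can be sketched with a single map from tab to session. This is a simplified illustration rather than the broker's implementation: the real broker checks the claim before forwarding a tool call and auto-claims only after the extension responds, whereas this sketch merges both steps into one hypothetical `touchTab` helper. The function names and session IDs are assumptions.

```javascript
"use strict";

// tabId -> sessionId; a simplified stand-in for the broker's claims map.
const claims = new Map();

// Returns true if `sessionId` may use `tabId`, auto-claiming unowned tabs.
function touchTab(tabId, sessionId) {
  const owner = claims.get(tabId);
  if (owner === undefined) {
    claims.set(tabId, sessionId); // first touch: auto-claim
    return true;
  }
  return owner === sessionId;
}

// Explicit claim; `force` lets a session take over a tab owned by another.
function claimTab(tabId, sessionId, force = false) {
  const owner = claims.get(tabId);
  if (owner !== undefined && owner !== sessionId && !force) {
    throw new Error(`Tab ${tabId} is owned by another OpenCode session (${owner})`);
  }
  claims.set(tabId, sessionId);
}

console.log(touchTab(101, "session-a")); // true: unowned, so auto-claimed
console.log(touchTab(101, "session-b")); // false: owned by session-a
claimTab(101, "session-b", true);        // forced takeover
console.log(touchTab(101, "session-b")); // true
```

A forced claim silently replaces the previous owner, so it is best reserved for sessions that are known to be stale.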
93
53
 
94
- v3.0 migrates from plugin to MCP architecture:
54
+ Tools:
95
55
 
96
- 1. Run `npx @different-ai/opencode-browser install`
97
- 2. Replace plugin config with MCP config in `opencode.json`:
56
+ - `browser_claim_tab({ tabId })`
57
+ - `browser_release_tab({ tabId })`
58
+ - `browser_list_claims()`
98
59
 
99
- ```diff
100
- - "plugin": ["@different-ai/opencode-browser"]
101
- + "mcp": {
102
- + "browser": {
103
- + "type": "local",
104
- + "command": ["bunx", "@different-ai/opencode-browser", "serve"]
105
- + }
106
- + }
107
- ```
60
+ ## Available tools
108
61
 
109
- 3. Restart OpenCode
62
+ - `browser_version`
63
+ - `browser_status`
64
+ - `browser_get_tabs`
65
+ - `browser_navigate`
66
+ - `browser_click`
67
+ - `browser_type`
68
+ - `browser_screenshot`
69
+ - `browser_snapshot`
70
+ - `browser_scroll`
71
+ - `browser_wait`
72
+ - `browser_execute`
110
73
 
111
74
  ## Troubleshooting
112
75
 
113
- **"Chrome extension not connected"**
114
- - Make sure Chrome is running
115
- - Check that the extension is loaded and enabled
116
- - Click the extension icon to see connection status
117
-
118
- **"Failed to start WebSocket server"**
119
- - Port 19222 may be in use
120
- - Run `lsof -i :19222` to check what's using it
76
+ **Extension says native host not available**
77
+ - Re-run `npx @different-ai/opencode-browser install`
78
+ - Confirm the extension ID you pasted matches the loaded extension in `chrome://extensions`
121
79
 
122
- **"browser_execute fails on some sites"**
123
- - Sites with strict CSP block JavaScript execution
124
- - Use `browser_snapshot` to get page data instead
80
+ **Tab ownership errors**
81
+ - Use `browser_list_claims()` to see who owns a tab
82
+ - Use `browser_claim_tab({ tabId, force: true })` to take over intentionally
125
83
 
126
84
  ## Uninstall
127
85
 
@@ -129,18 +87,4 @@ v3.0 migrates from plugin to MCP architecture:
129
87
  npx @different-ai/opencode-browser uninstall
130
88
  ```
131
89
 
132
- Then remove the extension from Chrome and delete `~/.opencode-browser/` if desired.
133
-
134
- ## Platform Support
135
-
136
- - macOS ✓
137
- - Linux ✓
138
- - Windows (not yet supported)
139
-
140
- ## License
141
-
142
- MIT
143
-
144
- ## Credits
145
-
146
- Inspired by [Claude in Chrome](https://www.anthropic.com/news/claude-in-chrome) by Anthropic.
90
+ Then remove the unpacked extension in `chrome://extensions` and remove the plugin from `.opencode.json`.
package/bin/broker.cjs ADDED
@@ -0,0 +1,290 @@
1
+ #!/usr/bin/env node
2
+ "use strict";
3
+
4
+ const net = require("net");
5
+ const fs = require("fs");
6
+ const os = require("os");
7
+ const path = require("path");
8
+
9
+ const BASE_DIR = path.join(os.homedir(), ".opencode-browser");
10
+ const SOCKET_PATH = path.join(BASE_DIR, "broker.sock");
11
+
12
+ fs.mkdirSync(BASE_DIR, { recursive: true });
13
+
14
+ function nowIso() {
15
+ return new Date().toISOString();
16
+ }
17
+
18
+ function createJsonLineParser(onMessage) {
19
+ let buffer = "";
20
+ return (chunk) => {
21
+ buffer += chunk.toString("utf8");
22
+ while (true) {
23
+ const idx = buffer.indexOf("\n");
24
+ if (idx === -1) return;
25
+ const line = buffer.slice(0, idx);
26
+ buffer = buffer.slice(idx + 1);
27
+ if (!line.trim()) continue;
28
+ try {
29
+ onMessage(JSON.parse(line));
30
+ } catch {
31
+ // ignore malformed JSON lines (this also swallows errors thrown by onMessage)
32
+ }
33
+ }
34
+ };
35
+ }
36
+
37
+ function writeJsonLine(socket, msg) {
38
+ socket.write(JSON.stringify(msg) + "\n");
39
+ }
40
+
41
+ function wantsTab(toolName) {
42
+ return !["get_tabs", "get_active_tab"].includes(toolName);
43
+ }
44
+
45
+ // --- State ---
46
+ let host = null; // { socket }
47
+ let nextExtId = 0;
48
+ const extPending = new Map(); // extId -> { resolve, reject, sessionId, timeout }
49
+
50
+ const clients = new Set();
51
+
52
+ // Tab ownership: tabId -> { sessionId, claimedAt }
53
+ const claims = new Map();
54
+
55
+ function listClaims() {
56
+ const out = [];
57
+ for (const [tabId, info] of claims.entries()) {
58
+ out.push({ tabId, ...info });
59
+ }
60
+ out.sort((a, b) => a.tabId - b.tabId);
61
+ return out;
62
+ }
63
+
64
+ function releaseClaimsForSession(sessionId) {
65
+ for (const [tabId, info] of claims.entries()) {
66
+ if (info.sessionId === sessionId) claims.delete(tabId);
67
+ }
68
+ }
69
+
70
+ function checkClaim(tabId, sessionId) {
71
+ const existing = claims.get(tabId);
72
+ if (!existing) return { ok: true };
73
+ if (existing.sessionId === sessionId) return { ok: true };
74
+ return { ok: false, error: `Tab ${tabId} is owned by another OpenCode session (${existing.sessionId})` };
75
+ }
76
+
77
+ function setClaim(tabId, sessionId) {
78
+ claims.set(tabId, { sessionId, claimedAt: nowIso() });
79
+ }
80
+
81
+ function ensureHost() {
82
+ if (host && host.socket && !host.socket.destroyed) return;
83
+ throw new Error("Chrome extension is not connected (native host offline)");
84
+ }
85
+
86
+ function callExtension(tool, args, sessionId) {
87
+ ensureHost();
88
+ const extId = ++nextExtId;
89
+
90
+ return new Promise((resolve, reject) => {
91
+ extPending.set(extId, { resolve, reject, sessionId });
92
+ writeJsonLine(host.socket, {
93
+ type: "to_extension",
94
+ message: { type: "tool_request", id: extId, tool, args },
95
+ });
96
+
97
+ const timeout = setTimeout(() => {
98
+ if (!extPending.has(extId)) return;
99
+ extPending.delete(extId);
100
+ reject(new Error("Timed out waiting for extension"));
101
+ }, 60000);
102
+
103
+ // attach timeout to resolver
104
+ const pending = extPending.get(extId);
105
+ if (pending) pending.timeout = timeout;
106
+ });
107
+ }
108
+
109
+ async function resolveActiveTab(sessionId) {
110
+ const res = await callExtension("get_active_tab", {}, sessionId);
111
+ const tabId = res && typeof res.tabId === "number" ? res.tabId : undefined;
112
+ if (tabId === undefined) throw new Error("Could not determine active tab");
113
+ return tabId;
114
+ }
115
+
116
+ async function handleTool(pluginSocket, req) {
117
+ const { tool, args = {}, sessionId } = req;
118
+ if (!tool) throw new Error("Missing tool");
119
+
120
+ let tabId = args.tabId;
121
+
122
+ if (wantsTab(tool)) {
123
+ if (typeof tabId !== "number") {
124
+ tabId = await resolveActiveTab(sessionId);
125
+ }
126
+
127
+ const claimCheck = checkClaim(tabId, sessionId);
128
+ if (!claimCheck.ok) throw new Error(claimCheck.error);
129
+ }
130
+
131
+ const res = await callExtension(tool, { ...args, tabId }, sessionId);
132
+
133
+ const usedTabId =
134
+ res && typeof res.tabId === "number" ? res.tabId : typeof tabId === "number" ? tabId : undefined;
135
+ if (typeof usedTabId === "number") {
136
+ // Auto-claim on first touch
137
+ const existing = claims.get(usedTabId);
138
+ if (!existing) setClaim(usedTabId, sessionId);
139
+ }
140
+
141
+ return res;
142
+ }
143
+
144
+ function handleClientMessage(socket, client, msg) {
145
+ if (msg && msg.type === "hello") {
146
+ client.role = msg.role || "unknown";
147
+ client.sessionId = msg.sessionId;
148
+ if (client.role === "native-host") {
149
+ host = { socket };
150
+ // allow host to see current state
151
+ writeJsonLine(socket, { type: "host_ready", claims: listClaims() });
152
+ }
153
+ return;
154
+ }
155
+
156
+ if (msg && msg.type === "from_extension") {
157
+ const message = msg.message;
158
+ if (message && message.type === "tool_response" && typeof message.id === "number") {
159
+ const pending = extPending.get(message.id);
160
+ if (!pending) return;
161
+ extPending.delete(message.id);
162
+ if (pending.timeout) clearTimeout(pending.timeout);
163
+
164
+ if (message.error) {
165
+ pending.reject(new Error(message.error.content || String(message.error)));
166
+ } else {
167
+ // Forward full result payload so callers can read tabId
168
+ pending.resolve(message.result);
169
+ }
170
+ }
171
+ return;
172
+ }
173
+
174
+ if (msg && msg.type === "request" && typeof msg.id === "number") {
175
+ const requestId = msg.id;
176
+ const sessionId = msg.sessionId || client.sessionId;
177
+
178
+ const replyOk = (data) => writeJsonLine(socket, { type: "response", id: requestId, ok: true, data });
179
+ const replyErr = (err) =>
180
+ writeJsonLine(socket, { type: "response", id: requestId, ok: false, error: err.message || String(err) });
181
+
182
+ (async () => {
183
+ try {
184
+ if (msg.op === "status") {
185
+ replyOk({ broker: true, hostConnected: !!host && !!host.socket && !host.socket.destroyed, claims: listClaims() });
186
+ return;
187
+ }
188
+
189
+ if (msg.op === "list_claims") {
190
+ replyOk({ claims: listClaims() });
191
+ return;
192
+ }
193
+
194
+ if (msg.op === "claim_tab") {
195
+ const tabId = msg.tabId;
196
+ const force = !!msg.force;
197
+ if (typeof tabId !== "number") throw new Error("tabId is required");
198
+ const existing = claims.get(tabId);
199
+ if (existing && existing.sessionId !== sessionId && !force) {
200
+ throw new Error(`Tab ${tabId} is owned by another OpenCode session (${existing.sessionId})`);
201
+ }
202
+ setClaim(tabId, sessionId);
203
+ replyOk({ ok: true, tabId, sessionId });
204
+ return;
205
+ }
206
+
207
+ if (msg.op === "release_tab") {
208
+ const tabId = msg.tabId;
209
+ if (typeof tabId !== "number") throw new Error("tabId is required");
210
+ const existing = claims.get(tabId);
211
+ if (!existing) {
212
+ replyOk({ ok: true, tabId, released: false });
213
+ return;
214
+ }
215
+ if (existing.sessionId !== sessionId) {
216
+ throw new Error(`Tab ${tabId} is owned by another OpenCode session (${existing.sessionId})`);
217
+ }
218
+ claims.delete(tabId);
219
+ replyOk({ ok: true, tabId, released: true });
220
+ return;
221
+ }
222
+
223
+ if (msg.op === "tool") {
224
+ const result = await handleTool(socket, { tool: msg.tool, args: msg.args || {}, sessionId });
225
+ replyOk(result);
226
+ return;
227
+ }
228
+
229
+ throw new Error(`Unknown op: ${msg.op}`);
230
+ } catch (e) {
231
+ replyErr(e);
232
+ }
233
+ })();
234
+
235
+ return;
236
+ }
237
+ }
238
+
239
+ function start() {
240
+ try {
241
+ if (fs.existsSync(SOCKET_PATH)) fs.unlinkSync(SOCKET_PATH);
242
+ } catch {
243
+ // ignore
244
+ }
245
+
246
+ const server = net.createServer((socket) => {
247
+ socket.setNoDelay(true);
248
+
249
+ const client = { role: "unknown", sessionId: null };
250
+ clients.add(client);
251
+
252
+ socket.on(
253
+ "data",
254
+ createJsonLineParser((msg) => handleClientMessage(socket, client, msg))
255
+ );
256
+
257
+ socket.on("close", () => {
258
+ clients.delete(client);
259
+ if (client.role === "native-host" && host && host.socket === socket) {
260
+ host = null;
261
+ // fail pending extension requests
262
+ for (const [extId, pending] of extPending.entries()) {
263
+ extPending.delete(extId);
264
+ if (pending.timeout) clearTimeout(pending.timeout);
265
+ pending.reject(new Error("Native host disconnected"));
266
+ }
267
+ }
268
+ if (client.sessionId) releaseClaimsForSession(client.sessionId);
269
+ });
270
+
271
+ socket.on("error", () => {
272
+ // close handler will clean up
273
+ });
274
+ });
275
+
276
+ server.listen(SOCKET_PATH, () => {
277
+ // Restrict the socket to the owning user (mode 0600); ignore errors
278
+ try {
279
+ fs.chmodSync(SOCKET_PATH, 0o600);
280
+ } catch {}
281
+ console.error(`[browser-broker] listening on ${SOCKET_PATH}`);
282
+ });
283
+
284
+ server.on("error", (err) => {
285
+ console.error("[browser-broker] server error", err);
286
+ process.exit(1);
287
+ });
288
+ }
289
+
290
+ start();
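Because the broker frames every message as one JSON object per line, a message may arrive split across several socket chunks. The sketch below reuses the buffering approach of `createJsonLineParser` from the file above to show how partial chunks are reassembled and malformed lines dropped. The sample messages and chunk boundaries are made up for illustration.

```javascript
"use strict";

// Same buffering approach as createJsonLineParser in broker.cjs.
function createJsonLineParser(onMessage) {
  let buffer = "";
  return (chunk) => {
    buffer += chunk.toString("utf8");
    let idx;
    while ((idx = buffer.indexOf("\n")) !== -1) {
      const line = buffer.slice(0, idx);
      buffer = buffer.slice(idx + 1);
      if (!line.trim()) continue; // skip blank lines
      try {
        onMessage(JSON.parse(line));
      } catch {
        // malformed lines are dropped, as in the broker
      }
    }
  };
}

const received = [];
const feed = createJsonLineParser((msg) => received.push(msg));

// Partial line: nothing is emitted yet.
feed(Buffer.from('{"type":"hel'));
// Completes the first message and starts the second.
feed(Buffer.from('lo","role":"plugin"}\n{"ty'));
// Completes the second message; the trailing non-JSON line is ignored.
feed(Buffer.from('pe":"request","id":1,"op":"status"}\nnot json\n'));

console.log(received.length); // 2
```

One caveat with this approach: decoding each chunk separately can corrupt a multi-byte UTF-8 character that straddles a chunk boundary. Calling `socket.setEncoding("utf8")` or using Node's `StringDecoder` would avoid that, if non-ASCII payloads are expected.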