@different-ai/opencode-browser 2.1.0 → 4.0.0

package/README.md CHANGED
@@ -2,13 +2,15 @@
 
  Browser automation plugin for [OpenCode](https://github.com/opencode-ai/opencode).
 
- Control your real Chrome browser with existing logins, cookies, and bookmarks. No DevTools Protocol, no security prompts.
+ Control your real Chromium browser (Chrome/Brave/Arc/Edge) using your existing profile (logins, cookies, bookmarks). No DevTools Protocol, no security prompts.
 
- ## Why?
+ ## Why this architecture
 
- Chrome 136+ blocks `--remote-debugging-port` on your default profile for security reasons. DevTools-based automation (like Playwright) triggers a security prompt every time.
+ This version is optimized for reliability and predictable multi-session behavior:
 
- OpenCode Browser uses a simple WebSocket connection between an OpenCode plugin and a Chrome extension. Your automation works with your existing browser session - no prompts, no separate profiles.
+ - **No WebSocket port**: no port conflicts
+ - **Chrome Native Messaging** between extension and a local host process
+ - A local **broker** multiplexes multiple OpenCode plugin sessions and enforces **per-tab ownership**
 
  ## Installation
 
@@ -17,104 +19,66 @@ npx @different-ai/opencode-browser install
  ```
 
  The installer will:
+
  1. Copy the extension to `~/.opencode-browser/extension/`
- 2. Guide you to load the extension in Chrome
- 3. Update your `opencode.json` to use the plugin
+ 2. Walk you through loading + pinning it in `chrome://extensions`
+ 3. Ask for the extension ID and install a **Native Messaging Host manifest**
+ 4. Update your `opencode.json` to load the plugin
 
- ## Configuration
+ ### Configure OpenCode
 
- Add to your `opencode.json`:
+ Your `opencode.json` should contain:
 
  ```json
  {
+   "$schema": "https://opencode.ai/config.json",
    "plugin": ["@different-ai/opencode-browser"]
  }
  ```
 
- Then load the extension in Chrome:
- 1. Go to `chrome://extensions`
- 2. Enable "Developer mode"
- 3. Click "Load unpacked" and select `~/.opencode-browser/extension/`
-
- ## Available Tools
-
- | Tool | Description |
- |------|-------------|
- | `browser_status` | Check if browser is available or locked |
- | `browser_kill_session` | Request other session release + take over (no kill) |
- | `browser_release` | Release lock and stop server |
- | `browser_force_kill_session` | (Last resort) kill other OpenCode process |
- | `browser_navigate` | Navigate to a URL |
- | `browser_click` | Click an element by CSS selector |
- | `browser_type` | Type text into an input field |
- | `browser_screenshot` | Capture the visible page |
- | `browser_snapshot` | Get accessibility tree with selectors |
- | `browser_get_tabs` | List all open tabs |
- | `browser_scroll` | Scroll page or element into view |
- | `browser_wait` | Wait for a duration |
- | `browser_execute` | Run JavaScript in page context |
-
- ## Multi-Session Support
-
- Only one OpenCode session can use the browser at a time. This prevents conflicts when you have multiple terminals open.
-
- - `browser_status` - Check who has the lock
- - `browser_kill_session` - Request the other session to release (no kill)
- - `browser_release` - Release lock/server for this session
- - `browser_force_kill_session` - (Last resort) kill the other OpenCode process and take over
-
- In your prompts, you can say:
- - "If browser is locked, kill the session and proceed"
- - "If browser is locked, skip this task"
-
- ## Architecture
+ ## How it works
 
  ```
- OpenCode Plugin ◄──WebSocket:19222──► Chrome Extension
-        │                                    │
-        └── Lock file                        └── chrome.tabs, chrome.scripting
+ OpenCode Plugin <-> Local Broker (unix socket) <-> Native Host <-> Chrome Extension
  ```
 
- **Two components:**
- 1. OpenCode plugin (runs WebSocket server, defines tools)
- 2. Chrome extension (connects to plugin, executes commands)
+ - The extension connects to the native host.
+ - The plugin talks to the broker over a local unix socket.
+ - The broker forwards tool requests to the extension and enforces tab ownership.
 
- **No daemon. No MCP server. No native messaging host.**
+ ## Per-tab ownership
 
- ## Upgrading from v1.x
+ - First time a session touches a tab, the broker **auto-claims** it for that session.
+ - Other sessions attempting to use the same tab will get an error.
 
- v2.0 is a complete rewrite with a simpler architecture:
+ Tools:
 
- 1. Run `npx @different-ai/opencode-browser install` (cleans up old daemon automatically)
- 2. Replace MCP config with plugin config in `opencode.json`:
+ - `browser_claim_tab({ tabId })`
+ - `browser_release_tab({ tabId })`
+ - `browser_list_claims()`
 
- ```diff
- - "mcp": {
- -   "browser": {
- -     "type": "local",
- -     "command": ["npx", "@different-ai/opencode-browser", "start"],
- -     "enabled": true
- -   }
- - }
- + "plugin": ["@different-ai/opencode-browser"]
- ```
+ ## Available tools
 
- 3. Restart OpenCode
+ - `browser_status`
+ - `browser_get_tabs`
+ - `browser_navigate`
+ - `browser_click`
+ - `browser_type`
+ - `browser_screenshot`
+ - `browser_snapshot`
+ - `browser_scroll`
+ - `browser_wait`
+ - `browser_execute`
 
  ## Troubleshooting
 
- **"Chrome extension not connected"**
- - Make sure Chrome is running
- - Check that the extension is loaded and enabled
- - Click the extension icon to see connection status
-
- **"Browser locked by another session"**
- - Use `browser_kill_session` to take over
- - Or close the other OpenCode session
+ **Extension says native host not available**
+ - Re-run `npx @different-ai/opencode-browser install`
+ - Confirm the extension ID you pasted matches the loaded extension in `chrome://extensions`
 
- **"Failed to start WebSocket server"**
- - Port 19222 may be in use
- - Check if another OpenCode session is running
+ **Tab ownership errors**
+ - Use `browser_list_claims()` to see who owns a tab
+ - Use `browser_claim_tab({ tabId, force: true })` to take over intentionally
 
  ## Uninstall
 
@@ -122,18 +86,4 @@ v2.0 is a complete rewrite with a simpler architecture:
  npx @different-ai/opencode-browser uninstall
  ```
 
- Then remove the extension from Chrome and delete `~/.opencode-browser/` if desired.
-
- ## Platform Support
-
- - macOS ✓
- - Linux ✓
- - Windows (not yet supported)
-
- ## License
-
- MIT
-
- ## Credits
-
- Inspired by [Claude in Chrome](https://www.anthropic.com/news/claude-in-chrome) by Anthropic.
+ Then remove the unpacked extension in `chrome://extensions` and remove the plugin from `opencode.json`.
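
The ownership rules described in the new README (auto-claim on first touch, an error for any other session, explicit takeover with `force: true`) can be modeled in a few lines. This is a simplified sketch for illustration, not the package's code; the `ClaimTable` class and its method names are hypothetical.

```javascript
"use strict";

// Hypothetical in-memory model of per-tab ownership.
class ClaimTable {
  constructor() {
    this.claims = new Map(); // tabId -> sessionId
  }

  // Called whenever a session uses a tab: claims it if free,
  // throws if another session already owns it.
  touch(tabId, sessionId) {
    const owner = this.claims.get(tabId);
    if (owner !== undefined && owner !== sessionId) {
      throw new Error(`Tab ${tabId} is owned by another session (${owner})`);
    }
    this.claims.set(tabId, sessionId);
  }

  // Explicit claim; `force` takes the tab over from another session.
  claim(tabId, sessionId, force = false) {
    if (force) this.claims.set(tabId, sessionId);
    else this.touch(tabId, sessionId);
  }

  // Only the owning session may release; returns whether a claim was removed.
  release(tabId, sessionId) {
    const owner = this.claims.get(tabId);
    if (owner === undefined) return false;
    if (owner !== sessionId) {
      throw new Error(`Tab ${tabId} is owned by another session (${owner})`);
    }
    this.claims.delete(tabId);
    return true;
  }
}
```

In this model a forced claim silently replaces the previous owner, which matches the README's description of `force: true` as an intentional takeover.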
package/bin/broker.cjs ADDED
@@ -0,0 +1,290 @@
+ #!/usr/bin/env node
+ "use strict";
+
+ const net = require("net");
+ const fs = require("fs");
+ const os = require("os");
+ const path = require("path");
+
+ const BASE_DIR = path.join(os.homedir(), ".opencode-browser");
+ const SOCKET_PATH = path.join(BASE_DIR, "broker.sock");
+
+ fs.mkdirSync(BASE_DIR, { recursive: true });
+
+ function nowIso() {
+   return new Date().toISOString();
+ }
+
+ function createJsonLineParser(onMessage) {
+   let buffer = "";
+   return (chunk) => {
+     buffer += chunk.toString("utf8");
+     while (true) {
+       const idx = buffer.indexOf("\n");
+       if (idx === -1) return;
+       const line = buffer.slice(0, idx);
+       buffer = buffer.slice(idx + 1);
+       if (!line.trim()) continue;
+       try {
+         onMessage(JSON.parse(line));
+       } catch {
+         // ignore
+       }
+     }
+   };
+ }
+
+ function writeJsonLine(socket, msg) {
+   socket.write(JSON.stringify(msg) + "\n");
+ }
+
+ function wantsTab(toolName) {
+   return !["get_tabs", "get_active_tab"].includes(toolName);
+ }
+
+ // --- State ---
+ let host = null; // { socket }
+ let nextExtId = 0;
+ const extPending = new Map(); // extId -> { resolve, reject, sessionId, timeout }
+
+ const clients = new Set();
+
+ // Tab ownership: tabId -> { sessionId, claimedAt }
+ const claims = new Map();
+
+ function listClaims() {
+   const out = [];
+   for (const [tabId, info] of claims.entries()) {
+     out.push({ tabId, ...info });
+   }
+   out.sort((a, b) => a.tabId - b.tabId);
+   return out;
+ }
+
+ function releaseClaimsForSession(sessionId) {
+   for (const [tabId, info] of claims.entries()) {
+     if (info.sessionId === sessionId) claims.delete(tabId);
+   }
+ }
+
+ function checkClaim(tabId, sessionId) {
+   const existing = claims.get(tabId);
+   if (!existing) return { ok: true };
+   if (existing.sessionId === sessionId) return { ok: true };
+   return { ok: false, error: `Tab ${tabId} is owned by another OpenCode session (${existing.sessionId})` };
+ }
+
+ function setClaim(tabId, sessionId) {
+   claims.set(tabId, { sessionId, claimedAt: nowIso() });
+ }
+
+ function ensureHost() {
+   if (host && host.socket && !host.socket.destroyed) return;
+   throw new Error("Chrome extension is not connected (native host offline)");
+ }
+
+ function callExtension(tool, args, sessionId) {
+   ensureHost();
+   const extId = ++nextExtId;
+
+   return new Promise((resolve, reject) => {
+     extPending.set(extId, { resolve, reject, sessionId });
+     writeJsonLine(host.socket, {
+       type: "to_extension",
+       message: { type: "tool_request", id: extId, tool, args },
+     });
+
+     const timeout = setTimeout(() => {
+       if (!extPending.has(extId)) return;
+       extPending.delete(extId);
+       reject(new Error("Timed out waiting for extension"));
+     }, 60000);
+
+     // attach timeout to resolver
+     const pending = extPending.get(extId);
+     if (pending) pending.timeout = timeout;
+   });
+ }
+
+ async function resolveActiveTab(sessionId) {
+   const res = await callExtension("get_active_tab", {}, sessionId);
+   const tabId = res && typeof res.tabId === "number" ? res.tabId : undefined;
+   if (!tabId) throw new Error("Could not determine active tab");
+   return tabId;
+ }
+
+ async function handleTool(pluginSocket, req) {
+   const { tool, args = {}, sessionId } = req;
+   if (!tool) throw new Error("Missing tool");
+
+   let tabId = args.tabId;
+
+   if (wantsTab(tool)) {
+     if (typeof tabId !== "number") {
+       tabId = await resolveActiveTab(sessionId);
+     }
+
+     const claimCheck = checkClaim(tabId, sessionId);
+     if (!claimCheck.ok) throw new Error(claimCheck.error);
+   }
+
+   const res = await callExtension(tool, { ...args, tabId }, sessionId);
+
+   const usedTabId =
+     res && typeof res.tabId === "number" ? res.tabId : typeof tabId === "number" ? tabId : undefined;
+   if (typeof usedTabId === "number") {
+     // Auto-claim on first touch
+     const existing = claims.get(usedTabId);
+     if (!existing) setClaim(usedTabId, sessionId);
+   }
+
+   return res;
+ }
+
+ function handleClientMessage(socket, client, msg) {
+   if (msg && msg.type === "hello") {
+     client.role = msg.role || "unknown";
+     client.sessionId = msg.sessionId;
+     if (client.role === "native-host") {
+       host = { socket };
+       // allow host to see current state
+       writeJsonLine(socket, { type: "host_ready", claims: listClaims() });
+     }
+     return;
+   }
+
+   if (msg && msg.type === "from_extension") {
+     const message = msg.message;
+     if (message && message.type === "tool_response" && typeof message.id === "number") {
+       const pending = extPending.get(message.id);
+       if (!pending) return;
+       extPending.delete(message.id);
+       if (pending.timeout) clearTimeout(pending.timeout);
+
+       if (message.error) {
+         pending.reject(new Error(message.error.content || String(message.error)));
+       } else {
+         // Forward full result payload so callers can read tabId
+         pending.resolve(message.result);
+       }
+     }
+     return;
+   }
+
+   if (msg && msg.type === "request" && typeof msg.id === "number") {
+     const requestId = msg.id;
+     const sessionId = msg.sessionId || client.sessionId;
+
+     const replyOk = (data) => writeJsonLine(socket, { type: "response", id: requestId, ok: true, data });
+     const replyErr = (err) =>
+       writeJsonLine(socket, { type: "response", id: requestId, ok: false, error: err.message || String(err) });
+
+     (async () => {
+       try {
+         if (msg.op === "status") {
+           replyOk({ broker: true, hostConnected: !!host && !!host.socket && !host.socket.destroyed, claims: listClaims() });
+           return;
+         }
+
+         if (msg.op === "list_claims") {
+           replyOk({ claims: listClaims() });
+           return;
+         }
+
+         if (msg.op === "claim_tab") {
+           const tabId = msg.tabId;
+           const force = !!msg.force;
+           if (typeof tabId !== "number") throw new Error("tabId is required");
+           const existing = claims.get(tabId);
+           if (existing && existing.sessionId !== sessionId && !force) {
+             throw new Error(`Tab ${tabId} is owned by another OpenCode session (${existing.sessionId})`);
+           }
+           setClaim(tabId, sessionId);
+           replyOk({ ok: true, tabId, sessionId });
+           return;
+         }
+
+         if (msg.op === "release_tab") {
+           const tabId = msg.tabId;
+           if (typeof tabId !== "number") throw new Error("tabId is required");
+           const existing = claims.get(tabId);
+           if (!existing) {
+             replyOk({ ok: true, tabId, released: false });
+             return;
+           }
+           if (existing.sessionId !== sessionId) {
+             throw new Error(`Tab ${tabId} is owned by another OpenCode session (${existing.sessionId})`);
+           }
+           claims.delete(tabId);
+           replyOk({ ok: true, tabId, released: true });
+           return;
+         }
+
+         if (msg.op === "tool") {
+           const result = await handleTool(socket, { tool: msg.tool, args: msg.args || {}, sessionId });
+           replyOk(result);
+           return;
+         }
+
+         throw new Error(`Unknown op: ${msg.op}`);
+       } catch (e) {
+         replyErr(e);
+       }
+     })();
+
+     return;
+   }
+ }
+
+ function start() {
+   try {
+     if (fs.existsSync(SOCKET_PATH)) fs.unlinkSync(SOCKET_PATH);
+   } catch {
+     // ignore
+   }
+
+   const server = net.createServer((socket) => {
+     socket.setNoDelay(true);
+
+     const client = { role: "unknown", sessionId: null };
+     clients.add(client);
+
+     socket.on(
+       "data",
+       createJsonLineParser((msg) => handleClientMessage(socket, client, msg))
+     );
+
+     socket.on("close", () => {
+       clients.delete(client);
+       if (client.role === "native-host" && host && host.socket === socket) {
+         host = null;
+         // fail pending extension requests
+         for (const [extId, pending] of extPending.entries()) {
+           extPending.delete(extId);
+           if (pending.timeout) clearTimeout(pending.timeout);
+           pending.reject(new Error("Native host disconnected"));
+         }
+       }
+       if (client.sessionId) releaseClaimsForSession(client.sessionId);
+     });
+
+     socket.on("error", () => {
+       // close handler will clean up
+     });
+   });
+
+   server.listen(SOCKET_PATH, () => {
+     // Restrict the socket to owner read/write (0o600); ignore errors
+     try {
+       fs.chmodSync(SOCKET_PATH, 0o600);
+     } catch {}
+     console.error(`[browser-broker] listening on ${SOCKET_PATH}`);
+   });
+
+   server.on("error", (err) => {
+     console.error("[browser-broker] server error", err);
+     process.exit(1);
+   });
+ }
+
+ start();
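
The broker above speaks newline-delimited JSON over a unix socket. Below is a minimal sketch of a client that could query it, assuming the `hello` and `request`/`response` message shapes read off `handleClientMessage`. The helper names (`frame`, `lineDecoder`, `brokerStatus`) and the `role: "plugin"` value are illustrative assumptions, not taken from the package; the broker only special-cases the `native-host` role.

```javascript
"use strict";

const net = require("net");
const os = require("os");
const path = require("path");

// Same location the broker listens on.
const SOCKET_PATH = path.join(os.homedir(), ".opencode-browser", "broker.sock");

// Serialize one message as a single JSON line.
function frame(msg) {
  return JSON.stringify(msg) + "\n";
}

// Returns a function that accepts raw chunks and returns the complete
// messages seen so far; a partial trailing line stays buffered until
// its newline arrives.
function lineDecoder() {
  let buffer = "";
  return (chunk) => {
    buffer += chunk.toString("utf8");
    const lines = buffer.split("\n");
    buffer = lines.pop(); // keep the incomplete tail
    return lines.filter((l) => l.trim()).map((l) => JSON.parse(l));
  };
}

// Hypothetical helper: connect, announce the session, ask for status.
function brokerStatus(sessionId, socketPath = SOCKET_PATH) {
  return new Promise((resolve, reject) => {
    const socket = net.createConnection(socketPath);
    const decode = lineDecoder();
    socket.on("connect", () => {
      // The role value is an assumption; any non-"native-host" role is treated alike.
      socket.write(frame({ type: "hello", role: "plugin", sessionId }));
      socket.write(frame({ type: "request", id: 1, op: "status" }));
    });
    socket.on("data", (chunk) => {
      for (const msg of decode(chunk)) {
        if (msg.type !== "response" || msg.id !== 1) continue;
        socket.end();
        if (msg.ok) resolve(msg.data);
        else reject(new Error(msg.error));
      }
    });
    socket.on("error", reject);
  });
}
```

With a broker running, `brokerStatus("my-session").then(console.log)` would be expected to resolve with the `status` payload (`broker`, `hostConnected`, `claims`). Because the broker releases a session's claims when its socket closes, a longer-lived client would likely keep the connection open rather than reconnect per request.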