browserforce 1.0.2 → 1.0.6

package/README.md CHANGED
@@ -1,19 +1,27 @@
1
- # BrowserForce
1
+ # BrowserForce //
2
2
 
3
- **Give your AI agent your real Chrome browser.** Your logins, your cookies, your extensions — already there.
3
+ > "a lion doesn't concern itself with token counting" [@steipete](https://x.com/steipete), creator of [OpenClaw](https://github.com/openclaw/openclaw)
4
+ >
5
+ > "a 10x user doesn't concern itself with sandboxed browsers // sandboxes are for kids" — BrowserForce, your friendly neighborhood power source.
4
6
 
5
- Other browser tools spawn a fresh Chrome — no logins, no extensions, instantly flagged by bot detectors. BrowserForce connects to **your running browser** instead. One Chrome extension, full Playwright API, everything you're already logged into.
7
+ **You're giving an AI your real Chrome: your logins, cookies, and sessions. That takes conviction.** BrowserForce is built for people who use the best models and don't look back. Security is built in: lock URLs, block navigation, read-only mode, auto-cleanup. You stay in control.
8
+
9
+ **Fully autonomous browser control.** No manual tab clicking. Your agent browses as you, even from WhatsApp. Other tools make you click each tab, spawn a fresh Chrome, or only work with one AI client. BrowserForce connects to **your running browser** and auto-attaches to all tabs. One Chrome extension, full Playwright API, completely hands-off.
6
10
 
7
11
  Works with [OpenClaw](https://github.com/openclaw/openclaw), Claude, or any MCP-compatible agent.
8
12
 
9
- | | OpenClaw's built-in browser | BrowserForce |
10
- |---|---|---|
11
- | Browser | Spawns dedicated Chrome | **Uses your Chrome** |
12
- | Login state | Fresh — must log in every time | Already logged in |
13
- | Extensions | None | Your existing ones |
14
- | 2FA / Captchas | Blocked | Already passed (you did it) |
15
- | Bot detection | Easily detected | Runs in your real profile |
16
- | Cookies & sessions | Empty | Yours |
13
+ ## Comparison
14
+
15
+ | | Playwright MCP | OpenClaw Browser | Playwriter | Claude Extension | BrowserForce |
16
+ |---|---|---|---|---|---|
17
+ | Browser | Spawns new Chrome | Separate profile | Your Chrome | Your Chrome | **Your Chrome** |
18
+ | Login state | Fresh | Fresh (isolated) | Yours | Yours | **Yours** |
19
+ | Tab access | N/A (new browser) | Managed by agent | Click each tab | Click each tab | **All tabs, automatic** |
20
+ | Autonomous | Yes | Yes | No (manual click) | No (manual click) | **Yes (fully autonomous)** |
21
+ | Context method | Screenshots (100KB+) | Screenshots + snapshots | A11y snapshots (5-20KB) | Screenshots (100KB+) | **A11y snapshots (5-20KB)** |
22
+ | Tools | Many dedicated | 1 `browser` tool | 1 `execute` tool | Built-in | **1 `execute` tool** |
23
+ | Agent support | Any MCP client | OpenClaw only | Any MCP client | Claude only | **Any MCP client** |
24
+ | Playwright API | Partial | No | Full | No | **Full** |
17
25
 
18
26
  ## Setup
19
27
 
@@ -38,33 +46,44 @@ pnpm install
38
46
  3. Click **Load unpacked** → select the `extension/` folder
39
47
  4. Extension icon appears in your toolbar (gray = disconnected)
40
48
 
41
- ### 3. Start the relay
49
+ ### 3. Done
50
+
51
+ The relay auto-starts when you run any command or connect via MCP — no manual step needed. Extension icon turns green once connected.
52
+
53
+ To run the relay manually (optional):
42
54
 
43
55
  ```bash
44
56
  browserforce serve
45
57
  ```
46
58
 
47
- Or with pnpm (development):
59
+ ## Connect Your Agent
60
+
61
+ ### OpenClaw
62
+
63
+ Most OpenClaw users chat with their agent from Telegram or WhatsApp. BrowserForce lets your agent browse the web as you — no login flows, no captchas — even from a messaging app.
64
+
65
+ **Quick setup** (copy-paste into your terminal):
48
66
 
49
67
  ```bash
50
- pnpm relay
68
+ npm install -g browserforce && npx -y skills add ivalsaraj/browserforce
51
69
  ```
52
70
 
53
- ```
54
- BrowserForce
55
- ────────────────────────────────────────
56
- Status: http://127.0.0.1:19222/
57
- CDP: ws://127.0.0.1:19222/cdp?token=<TOKEN>
58
- ────────────────────────────────────────
71
+ Then start the relay (keep this running):
72
+
73
+ ```bash
74
+ browserforce serve
59
75
  ```
60
76
 
61
- Extension icon turns green: you're connected.
77
+ **Verify it works** by sending this to your agent:
62
78
 
63
- ## Connect Your Agent
79
+ > Go to https://x.com and give me top tweets
64
80
 
65
- ### OpenClaw
81
+ If your agent browses to the page and responds with the top tweets, you're all set.
82
+
83
+ <details>
84
+ <summary><b>Alternative: MCP server</b> (advanced)</summary>
66
85
 
67
- Add BrowserForce as an MCP server in `~/.openclaw/openclaw.json`:
86
+ If you prefer MCP over the skill, add to `~/.openclaw/openclaw.json`:
68
87
 
69
88
  ```json
70
89
  {
@@ -88,7 +107,7 @@ Add BrowserForce as an MCP server in `~/.openclaw/openclaw.json`:
88
107
  }
89
108
  ```
90
109
 
91
- Then add `"mcp-adapter"` to your agent's allowed tools. Your OpenClaw agent can now browse the web as you — no login flows, no captchas.
110
+ </details>
92
111
 
93
112
  ### Claude Desktop
94
113
 
@@ -138,16 +157,6 @@ browserforce -e "<code>" # Run Playwright JavaScript (one-shot)
138
157
 
139
158
  Each `-e` command is one-shot — state does not persist between calls. For persistent state, use the MCP server.
140
159
 
141
- ### OpenClaw Skill
142
-
143
- Install the skill directly:
144
-
145
- ```bash
146
- npx -y skills add ivalsaraj/browserforce
147
- ```
148
-
149
- Or add to your agent config manually — the skill teaches the agent to use BrowserForce CLI commands via Bash.
150
-
151
160
  ### Any Playwright Script
152
161
 
153
162
  ```javascript
@@ -201,8 +210,39 @@ state.results = await page.evaluate(() => document.title);
201
210
  | Tool | Description |
202
211
  |------|-------------|
203
212
  | `execute` | Run Playwright JavaScript in your real Chrome. Access `page`, `context`, `state`, `snapshot()`, `waitForPageLoad()`, `getLogs()`, and Node.js globals. |
213
+ | `screenshot_with_labels` | Take a screenshot with Vimium-style accessibility labels overlaid on interactive elements. |
204
214
  | `reset` | Reconnect to the relay and clear state. Use when the connection drops. |
205
215
 
216
+ ## Examples
217
+
218
+ Get started with simple prompts. The AI generates code and does the work.
219
+
220
+ <details>
221
+ <summary><b>Example 1: Read page content (X.com search)</b></summary>
222
+
223
+ **Prompt to AI:**
224
+ > Go to x.com/search and search for "browserforce". Show me the top 5 tweets you find.
225
+
226
+ **What the AI does:** Navigates to X, searches the term, extracts top tweets, returns them to you.
227
+
228
+ **Use case:** Quick research, trend tracking, social listening.
229
+
230
+ </details>
231
+
232
+ <details>
233
+ <summary><b>Example 2: Interact with a form (GitHub search)</b></summary>
234
+
235
+ **Prompt to AI:**
236
+ > Go to GitHub and search for "ai agents". Show me the top 3 repositories and their star counts.
237
+
238
+ **What the AI does:** Fills GitHub search, waits for results, extracts repo names + stars, returns them.
239
+
240
+ **Use case:** Finding libraries, competitive research, project discovery.
241
+
242
+ </details>
243
+
244
+ **8+ more examples** available in the [User Guide](GUIDE.md#examples).
245
+
206
246
  ## How It Works
207
247
 
208
248
  ```
@@ -230,16 +270,19 @@ The **Chrome extension** lives in your browser. It attaches Chrome's built-in de
230
270
 
231
271
  When the agent connects, it immediately sees all your open tabs as controllable Playwright pages. No clicking, no manual attachment.
232
272
 
233
- ## Extension Settings
273
+ ## You Stay in Control
234
274
 
235
- Click the extension icon to configure:
275
+ Click the extension icon to configure restrictions. Your browser, your rules:
236
276
 
237
- - **Auto / Manual mode** — Let the agent create tabs freely, or manually select which tabs it can access
238
- - **Lock URL** — Prevent the agent from navigating away from the current page
239
- **No new tabs**: Block tab creation
240
- **Read-only**: Observe only, no interactions
241
- **Auto-cleanup**: Automatically detach or close agent tabs after a timeout
242
- - **Custom instructions** — Pass text instructions to the agent
277
+ | Setting | What it does |
278
+ |---------|-------------|
279
+ | **Auto / Manual mode** | Let the agent create tabs freely, or hand-pick which tabs it can access |
280
+ | **Lock URL** | Prevent the agent from navigating away from the current page |
281
+ | **No new tabs** | Block the agent from opening new tabs |
282
+ | **Read-only** | Observe only: no clicks, no typing, no interactions |
283
+ | **Auto-detach** | Automatically detach inactive tabs after 5-60 minutes |
284
+ | **Auto-close** | Automatically close agent-created tabs after 5-60 minutes |
285
+ | **Custom instructions** | Pass text instructions to the agent (e.g. "don't click any buy buttons") |
243
286
 
244
287
  ## Security
245
288
 
@@ -249,6 +292,7 @@ Click the extension icon to configure:
249
292
  | **Auth** | Random token required for every CDP connection |
250
293
  | **Origin** | Extension only accepts connections from its own Chrome origin |
251
294
  | **Visibility** | Chrome shows "controlled by automated test software" on active tabs |
295
+ | **Restrictions** | Lock URLs, block navigation, read-only mode — enforced at the CDP level |
252
296
 
253
297
  Everything runs on your machine. The auth token is stored at `~/.browserforce/auth-token` with owner-only permissions.
254
298
 
@@ -280,38 +324,6 @@ RELAY_PORT=19333 browserforce serve
280
324
  | `ws://.../extension` | Chrome extension WebSocket |
281
325
  | `ws://.../cdp?token=...` | Agent CDP connection |
282
326
 
283
- ## Comparison
284
-
285
- ### vs Playwright MCP
286
-
287
- | | Playwright MCP | BrowserForce |
288
- |---|---|---|
289
- | Browser | Spawns new Chrome | **Uses your Chrome** |
290
- | Login state | Fresh — must log in every time | Already logged in |
291
- | Extensions | None | Your existing ones |
292
- | Bot detection | Always detected | Runs in your real profile |
293
- | Memory | Double (two Chrome instances) | Uses existing Chrome |
294
-
295
- ### vs Claude Browser Extension
296
-
297
- | | Claude Extension | BrowserForce |
298
- |---|---|---|
299
- | Agent support | Claude only | **Any MCP client** (OpenClaw, Claude, custom) |
300
- | Context method | Screenshots (100KB+) | Accessibility snapshots (5-20KB) |
301
- | Playwright API | No | Full |
302
- | Network interception | Limited | Full |
303
- | Raw CDP access | No | Yes |
304
-
305
- ### vs Antigravity (Jetski)
306
-
307
- | | Jetski | BrowserForce |
308
- |---|---|---|
309
- | Tools | 17+ tools | 1 `execute` tool |
310
- | Approach | Spawns subagent for browser tasks | Direct execution |
311
- | Latency | High (agent overhead) | Low |
312
- | LLM knowledge | Must learn custom tools | Already knows Playwright |
313
- | Context usage | High (many tool schemas) | Low |
314
-
315
327
  ## Troubleshooting
316
328
 
317
329
  | Problem | Fix |
package/bin.js CHANGED
@@ -43,7 +43,8 @@ function httpGet(url) {
43
43
  }
44
44
 
45
45
  async function connectBrowser() {
46
- const { getCdpUrl } = await import('./mcp/src/exec-engine.js');
46
+ const { getCdpUrl, ensureRelay } = await import('./mcp/src/exec-engine.js');
47
+ await ensureRelay();
47
48
  // playwright-core lives in mcp/node_modules (pnpm workspace sub-package).
48
49
  // Use createRequire from the mcp package context to locate it, then dynamic-import.
49
50
  const { createRequire } = await import('node:module');
package/mcp/src/a11y-labels.js ADDED
@@ -0,0 +1,466 @@
1
+ // BrowserForce — Accessibility Label Overlay for Screenshots
2
+ // Uses CDP AX tree + DOM.getBoxModel for element positioning,
3
+ // then injects browser-side label renderer for Vimium-style labels.
4
+
5
+ import {
6
+ INTERACTIVE_ROLES, CONTEXT_ROLES, SKIP_ROLES,
7
+ buildLocator, escapeLocatorName,
8
+ } from './snapshot.js';
9
+
10
+ // ─── Semaphore ────────────────────────────────────────────────────────────────
11
+
12
+ export class Semaphore {
13
+ constructor(max) {
14
+ this.max = max;
15
+ this.count = 0;
16
+ this.queue = [];
17
+ }
18
+
19
+ acquire() {
20
+ if (this.count < this.max) {
21
+ this.count++;
22
+ return Promise.resolve();
23
+ }
24
+ return new Promise(resolve => this.queue.push(resolve));
25
+ }
26
+
27
+ release() {
28
+ this.count--;
29
+ if (this.queue.length > 0) {
30
+ this.count++;
31
+ this.queue.shift()();
32
+ }
33
+ }
34
+ }
35
+
36
+ // ─── Box Model ────────────────────────────────────────────────────────────────
37
+
38
+ export function buildBoxFromQuad(quad) {
39
+ const xs = [quad[0], quad[2], quad[4], quad[6]];
40
+ const ys = [quad[1], quad[3], quad[5], quad[7]];
41
+ return {
42
+ x: Math.min(...xs),
43
+ y: Math.min(...ys),
44
+ width: Math.max(...xs) - Math.min(...xs),
45
+ height: Math.max(...ys) - Math.min(...ys),
46
+ };
47
+ }
48
+
49
+ // ─── CDP AX Tree → Snapshot Text ──────────────────────────────────────────────
50
+
51
+ export function buildSnapshotFromCdpNodes(nodes, scopeBackendNodeId, { refAll = false } = {}) {
52
+ // Build lookup maps
53
+ const byId = new Map();
54
+ for (const node of nodes) {
55
+ byId.set(node.nodeId, node);
56
+ }
57
+
58
+ // Find scope root if scoping requested
59
+ let scopeNodeId = null;
60
+ if (scopeBackendNodeId != null) {
61
+ const scopeNode = nodes.find(n => n.backendDOMNodeId === scopeBackendNodeId);
62
+ if (!scopeNode) {
63
+ throw new Error(
64
+ `Scoped element (backendNodeId=${scopeBackendNodeId}) has no matching accessibility node. ` +
65
+ 'The element may be aria-hidden or have no accessible role.'
66
+ );
67
+ }
68
+ scopeNodeId = scopeNode.nodeId;
69
+ }
70
+
71
+ // Build tree structure from parentId references
72
+ const childrenMap = new Map();
73
+ for (const node of nodes) {
74
+ if (!childrenMap.has(node.nodeId)) childrenMap.set(node.nodeId, []);
75
+ if (node.parentId) {
76
+ if (!childrenMap.has(node.parentId)) childrenMap.set(node.parentId, []);
77
+ childrenMap.get(node.parentId).push(node);
78
+ }
79
+ }
80
+
81
+ // Check if a nodeId is inside the scope subtree
82
+ function isInScope(nodeId) {
83
+ if (!scopeNodeId) return true;
84
+ let current = nodeId;
85
+ while (current) {
86
+ if (current === scopeNodeId) return true;
87
+ const node = byId.get(current);
88
+ current = node?.parentId || null;
89
+ }
90
+ return false;
91
+ }
92
+
93
+ const lines = [];
94
+ const refs = [];
95
+ let refCounter = 0;
96
+
97
+ function walk(nodeId, depth) {
98
+ const node = byId.get(nodeId);
99
+ if (!node || node.ignored) return;
100
+
101
+ const role = node.role?.value;
102
+ if (!role) return;
103
+
104
+ // Skip root wrapper and generic roles — recurse into children
105
+ if (role === 'RootWebArea' || role === 'WebArea') {
106
+ for (const child of childrenMap.get(nodeId) || []) {
107
+ walk(child.nodeId, depth);
108
+ }
109
+ return;
110
+ }
111
+ if (SKIP_ROLES.has(role)) {
112
+ for (const child of childrenMap.get(nodeId) || []) {
113
+ walk(child.nodeId, depth);
114
+ }
115
+ return;
116
+ }
117
+
118
+ // Check scope
119
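+ if (!isInScope(nodeId)) { for (const child of childrenMap.get(nodeId) || []) walk(child.nodeId, depth); return; } // keep descending: the scope root may be a descendant of this node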
120
+
121
+ const isInteractive = INTERACTIVE_ROLES.has(role);
122
+ const isContext = CONTEXT_ROLES.has(role);
123
+ const name = node.name?.value || '';
124
+ const children = childrenMap.get(nodeId) || [];
125
+
126
+ // Skip non-interactive, non-context nodes without interactive descendants
127
+ if (!isInteractive && !isContext) {
128
+ const hasInteractive = children.some(c => {
129
+ const r = c.role?.value;
130
+ return r && INTERACTIVE_ROLES.has(r);
131
+ });
132
+ if (!hasInteractive) {
133
+ // Still recurse — children may have interactive descendants
134
+ for (const child of children) {
135
+ walk(child.nodeId, depth);
136
+ }
137
+ return;
138
+ }
139
+ }
140
+
141
+ const indent = ' '.repeat(depth);
142
+ let lineText = `${indent}- ${role}`;
143
+ if (name) {
144
+ lineText += ` "${escapeLocatorName(name)}"`;
145
+ }
146
+
147
+ if (isInteractive || (refAll && isContext && node.backendDOMNodeId)) {
148
+ refCounter++;
149
+ const ref = `e${refCounter}`;
150
+ const locator = buildLocator(role, name, null);
151
+ lineText += ` [ref=${ref}]`;
152
+ refs.push({ ref, role, name, backendNodeId: node.backendDOMNodeId, locator });
153
+ }
154
+
155
+ const relevantChildren = children.filter(c => {
156
+ const r = c.role?.value;
157
+ return r && !c.ignored && (INTERACTIVE_ROLES.has(r) || CONTEXT_ROLES.has(r) || !SKIP_ROLES.has(r));
158
+ });
159
+ if (relevantChildren.length > 0) {
160
+ lineText += ':';
161
+ }
162
+
163
+ lines.push(lineText);
164
+
165
+ for (const child of children) {
166
+ walk(child.nodeId, depth + 1);
167
+ }
168
+ }
169
+
170
+ // Find root nodes (no parentId)
171
+ const roots = nodes.filter(n => !n.parentId);
172
+ for (const root of roots) {
173
+ walk(root.nodeId, 0);
174
+ }
175
+
176
+ return { text: lines.join('\n'), refs };
177
+ }
178
+
179
+ // ─── Browser-Side Label Renderer ──────────────────────────────────────────────
180
+ // Injected into page context via page.evaluate().
181
+ // Port of Playwriter's a11y-client.ts — Vimium-style color-coded labels.
182
+
183
+ const LABELS_CONTAINER_ID = '__bf_labels__';
184
+ const LABELS_TIMER_KEY = '__bf_labels_timer__';
185
+
186
+ export const ROLE_COLORS = {
187
+ link: ['#FFF785', '#FFC542', '#E3BE23'],
188
+ button: ['#FFE0B2', '#FFCC80', '#FFB74D'],
189
+ textbox: ['#FFCDD2', '#EF9A9A', '#E57373'],
190
+ combobox: ['#F8BBD0', '#F48FB1', '#F06292'],
191
+ searchbox: ['#F8BBD0', '#F48FB1', '#F06292'],
192
+ checkbox: ['#C8E6C9', '#A5D6A7', '#81C784'],
193
+ radio: ['#C8E6C9', '#A5D6A7', '#81C784'],
194
+ slider: ['#BBDEFB', '#90CAF9', '#64B5F6'],
195
+ spinbutton: ['#BBDEFB', '#90CAF9', '#64B5F6'],
196
+ switch: ['#D1C4E9', '#B39DDB', '#9575CD'],
197
+ menuitem: ['#FFE0B2', '#FFCC80', '#FFB74D'],
198
+ menuitemcheckbox: ['#FFE0B2', '#FFCC80', '#FFB74D'],
199
+ menuitemradio: ['#FFE0B2', '#FFCC80', '#FFB74D'],
200
+ option: ['#FFE0B2', '#FFCC80', '#FFB74D'],
201
+ tab: ['#FFE0B2', '#FFCC80', '#FFB74D'],
202
+ treeitem: ['#FFE0B2', '#FFCC80', '#FFB74D'],
203
+ img: ['#B3E5FC', '#81D4FA', '#4FC3F7'],
204
+ video: ['#B3E5FC', '#81D4FA', '#4FC3F7'],
205
+ audio: ['#B3E5FC', '#81D4FA', '#4FC3F7'],
206
+ };
207
+
208
+ const DEFAULT_COLORS = ['#FFF9C4', '#FFF59D', '#FFEB3B'];
209
+
210
+ // This string is injected into the page via page.evaluate().
211
+ // It sets up globalThis.__bf_a11y with renderA11yLabels and hideA11yLabels.
212
+ export const A11Y_CLIENT_CODE = `
213
+ (function() {
214
+ const CONTAINER_ID = '${LABELS_CONTAINER_ID}';
215
+ const TIMER_KEY = '${LABELS_TIMER_KEY}';
216
+ const ROLE_COLORS = ${JSON.stringify(ROLE_COLORS)};
217
+ const DEFAULT_COLORS = ${JSON.stringify(DEFAULT_COLORS)};
218
+
219
+ function renderA11yLabels(labels) {
220
+ const doc = document;
221
+ const win = window;
222
+
223
+ if (win[TIMER_KEY]) {
224
+ win.clearTimeout(win[TIMER_KEY]);
225
+ win[TIMER_KEY] = null;
226
+ }
227
+
228
+ doc.getElementById(CONTAINER_ID)?.remove();
229
+
230
+ const container = doc.createElement('div');
231
+ container.id = CONTAINER_ID;
232
+ container.style.cssText = 'position:absolute;left:0;top:0;z-index:2147483647;pointer-events:none;';
233
+
234
+ const style = doc.createElement('style');
235
+ style.textContent = '.__bf_label__{position:absolute;font:bold 12px Helvetica,Arial,sans-serif;padding:1px 4px;border-radius:3px;color:black;text-shadow:0 1px 0 rgba(255,255,255,0.6);white-space:nowrap;}';
236
+ container.appendChild(style);
237
+
238
+ const svg = doc.createElementNS('http://www.w3.org/2000/svg', 'svg');
239
+ svg.style.cssText = 'position:absolute;left:0;top:0;pointer-events:none;overflow:visible;';
240
+ svg.setAttribute('width', '' + doc.documentElement.scrollWidth);
241
+ svg.setAttribute('height', '' + doc.documentElement.scrollHeight);
242
+
243
+ const defs = doc.createElementNS('http://www.w3.org/2000/svg', 'defs');
244
+ svg.appendChild(defs);
245
+ const markerCache = {};
246
+
247
+ function getArrowMarkerId(color) {
248
+ if (markerCache[color]) return markerCache[color];
249
+ const markerId = 'bf-arrow-' + color.replace('#', '');
250
+ const marker = doc.createElementNS('http://www.w3.org/2000/svg', 'marker');
251
+ marker.setAttribute('id', markerId);
252
+ marker.setAttribute('viewBox', '0 0 10 10');
253
+ marker.setAttribute('refX', '9');
254
+ marker.setAttribute('refY', '5');
255
+ marker.setAttribute('markerWidth', '6');
256
+ marker.setAttribute('markerHeight', '6');
257
+ marker.setAttribute('orient', 'auto-start-reverse');
258
+ const path = doc.createElementNS('http://www.w3.org/2000/svg', 'path');
259
+ path.setAttribute('d', 'M 0 0 L 10 5 L 0 10 z');
260
+ path.setAttribute('fill', color);
261
+ marker.appendChild(path);
262
+ defs.appendChild(marker);
263
+ markerCache[color] = markerId;
264
+ return markerId;
265
+ }
266
+
267
+ container.appendChild(svg);
268
+
269
+ const placedLabels = [];
270
+ const LABEL_HEIGHT = 17;
271
+ const LABEL_CHAR_WIDTH = 7;
272
+
273
+ const viewportLeft = win.scrollX;
274
+ const viewportTop = win.scrollY;
275
+ const viewportRight = viewportLeft + win.innerWidth;
276
+ const viewportBottom = viewportTop + win.innerHeight;
277
+
278
+ let count = 0;
279
+ for (const item of labels) {
280
+ const ref = item.ref;
281
+ const role = item.role;
282
+ const box = item.box;
283
+
284
+ const rectLeft = box.x;
285
+ const rectTop = box.y;
286
+ const rectRight = rectLeft + box.width;
287
+ const rectBottom = rectTop + box.height;
288
+
289
+ if (box.width <= 0 || box.height <= 0) continue;
290
+ if (rectRight < viewportLeft || rectLeft > viewportRight ||
291
+ rectBottom < viewportTop || rectTop > viewportBottom) continue;
292
+
293
+ const labelWidth = ref.length * LABEL_CHAR_WIDTH + 8;
294
+ const labelLeft = rectLeft;
295
+ const labelTop = Math.max(0, rectTop - LABEL_HEIGHT);
296
+ const labelRect = { left: labelLeft, top: labelTop, right: labelLeft + labelWidth, bottom: labelTop + LABEL_HEIGHT };
297
+
298
+ let overlaps = false;
299
+ for (const placed of placedLabels) {
300
+ if (labelRect.left < placed.right && labelRect.right > placed.left &&
301
+ labelRect.top < placed.bottom && labelRect.bottom > placed.top) {
302
+ overlaps = true;
303
+ break;
304
+ }
305
+ }
306
+ if (overlaps) continue;
307
+
308
+ const colors = ROLE_COLORS[role] || DEFAULT_COLORS;
309
+ const label = doc.createElement('div');
310
+ label.className = '__bf_label__';
311
+ label.textContent = ref;
312
+ label.style.background = 'linear-gradient(to bottom, ' + colors[0] + ' 0%, ' + colors[1] + ' 100%)';
313
+ label.style.border = '1px solid ' + colors[2];
314
+ label.style.left = labelLeft + 'px';
315
+ label.style.top = labelTop + 'px';
316
+ container.appendChild(label);
317
+
318
+ const line = doc.createElementNS('http://www.w3.org/2000/svg', 'line');
319
+ line.setAttribute('x1', '' + (labelLeft + labelWidth / 2));
320
+ line.setAttribute('y1', '' + (labelTop + LABEL_HEIGHT));
321
+ line.setAttribute('x2', '' + (rectLeft + box.width / 2));
322
+ line.setAttribute('y2', '' + (rectTop + box.height / 2));
323
+ line.setAttribute('stroke', colors[2]);
324
+ line.setAttribute('stroke-width', '1.5');
325
+ line.setAttribute('marker-end', 'url(#' + getArrowMarkerId(colors[2]) + ')');
326
+ svg.appendChild(line);
327
+
328
+ placedLabels.push(labelRect);
329
+ count++;
330
+ }
331
+
332
+ doc.documentElement.appendChild(container);
333
+
334
+ win[TIMER_KEY] = win.setTimeout(function() {
335
+ doc.getElementById(CONTAINER_ID)?.remove();
336
+ win[TIMER_KEY] = null;
337
+ }, 30000);
338
+
339
+ return count;
340
+ }
341
+
342
+ function hideA11yLabels() {
343
+ if (window['${LABELS_TIMER_KEY}']) {
344
+ window.clearTimeout(window['${LABELS_TIMER_KEY}']);
345
+ window['${LABELS_TIMER_KEY}'] = null;
346
+ }
347
+ document.getElementById('${LABELS_CONTAINER_ID}')?.remove();
348
+ }
349
+
350
+ globalThis.__bf_a11y = { renderA11yLabels: renderA11yLabels, hideA11yLabels: hideA11yLabels };
351
+ })();
352
+ `;
353
+
354
+ // ─── CDP Helpers ──────────────────────────────────────────────────────────────
355
+
356
+ const MAX_CONCURRENCY = 24;
357
+ const BOX_MODEL_TIMEOUT_MS = 5000;
358
+ const MAX_SCREENSHOT_DIMENSION = 1568;
359
+
360
+ export async function resolveScopeBackendNodeId(cdp, selector) {
361
+ if (!selector) return null;
362
+ const { root } = await cdp.send('DOM.getDocument');
363
+ const { nodeId } = await cdp.send('DOM.querySelector', {
364
+ nodeId: root.nodeId, selector,
365
+ });
366
+ if (!nodeId) {
367
+ throw new Error(`Selector "${selector}" did not match any element on the page`);
368
+ }
369
+ const { node } = await cdp.send('DOM.describeNode', { nodeId });
370
+ return node.backendNodeId;
371
+ }
372
+
373
+ export async function getLabelBoxes(cdp, refs) {
374
+ const sema = new Semaphore(MAX_CONCURRENCY);
375
+ const results = await Promise.all(
376
+ refs.map(async (ref) => {
377
+ if (!ref.backendNodeId) return null;
378
+ await sema.acquire();
379
+ try {
380
+ const response = await Promise.race([
381
+ cdp.send('DOM.getBoxModel', { backendNodeId: ref.backendNodeId }),
382
+ new Promise(resolve => setTimeout(() => resolve(null), BOX_MODEL_TIMEOUT_MS)),
383
+ ]);
384
+ if (!response) return null;
385
+ const box = buildBoxFromQuad(response.model.border);
386
+ if (box.width <= 0 || box.height <= 0) return null;
387
+ return { ref: ref.ref, role: ref.role, box };
388
+ } catch {
389
+ return null;
390
+ } finally {
391
+ sema.release();
392
+ }
393
+ })
394
+ );
395
+ return results.filter(Boolean);
396
+ }
397
+
398
+ export async function injectA11yClient(page) {
399
+ const exists = await page.evaluate(() => typeof globalThis.__bf_a11y !== 'undefined');
400
+ if (!exists) {
401
+ await page.evaluate(A11Y_CLIENT_CODE);
402
+ }
403
+ }
404
+
405
+ export async function showLabels(page, labels) {
406
+ return page.evaluate((entries) => globalThis.__bf_a11y.renderA11yLabels(entries), labels);
407
+ }
408
+
409
+ export async function hideLabels(page) {
410
+ await page.evaluate(() => {
411
+ const timerKey = '__bf_labels_timer__';
412
+ if (window[timerKey]) {
413
+ window.clearTimeout(window[timerKey]);
414
+ window[timerKey] = null;
415
+ }
416
+ document.getElementById('__bf_labels__')?.remove();
417
+ });
418
+ }
419
+
420
+ // ─── Main Orchestrator ────────────────────────────────────────────────────────
421
+
422
+ export async function screenshotWithLabels(page, { selector, interactiveOnly = true } = {}) {
423
+ let cdp;
424
+ let labelsInjected = false;
425
+
426
+ try {
427
+ cdp = await page.context().newCDPSession(page);
428
+
429
+ const scopeId = selector
430
+ ? await resolveScopeBackendNodeId(cdp, selector)
431
+ : null;
432
+
433
+ const { nodes } = await cdp.send('Accessibility.getFullAXTree');
434
+ const { text, refs } = buildSnapshotFromCdpNodes(nodes, scopeId, {
435
+ refAll: !interactiveOnly,
436
+ });
437
+
438
+ const labels = await getLabelBoxes(cdp, refs);
439
+
440
+ await injectA11yClient(page);
441
+ labelsInjected = true;
442
+ const labelCount = await showLabels(page, labels);
443
+
444
+ const maxDim = MAX_SCREENSHOT_DIMENSION;
445
+ const viewport = await page.evaluate((max) => ({
446
+ width: Math.min(window.innerWidth, max),
447
+ height: Math.min(window.innerHeight, max),
448
+ }), maxDim);
449
+
450
+ const screenshot = await page.screenshot({
451
+ type: 'jpeg',
452
+ quality: 80,
453
+ scale: 'css',
454
+ clip: { x: 0, y: 0, ...viewport },
455
+ });
456
+
457
+ return { screenshot, snapshot: text, labelCount };
458
+ } finally {
459
+ if (labelsInjected) {
460
+ try { await hideLabels(page); } catch { /* page may have navigated */ }
461
+ }
462
+ if (cdp) {
463
+ try { await cdp.detach(); } catch { /* session may already be detached */ }
464
+ }
465
+ }
466
+ }
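The concurrency pattern in `getLabelBoxes` above can be tried in isolation. The following is a minimal sketch, not the package's code path: `Semaphore` and `buildBoxFromQuad` are copied from the file above, while `mapWithLimit`, `fakeSend`, and `demo` are hypothetical names introduced for illustration. `fakeSend` stands in for `cdp.send('DOM.getBoxModel', ...)` so the example runs without a browser.

```javascript
// Copied from the diff above: counting semaphore with a FIFO wait queue.
class Semaphore {
  constructor(max) { this.max = max; this.count = 0; this.queue = []; }
  acquire() {
    if (this.count < this.max) { this.count++; return Promise.resolve(); }
    return new Promise((resolve) => this.queue.push(resolve));
  }
  release() {
    this.count--;
    if (this.queue.length > 0) { this.count++; this.queue.shift()(); }
  }
}

// Copied from the diff above: bounding box of an 8-number quad [x1,y1,...,x4,y4].
function buildBoxFromQuad(quad) {
  const xs = [quad[0], quad[2], quad[4], quad[6]];
  const ys = [quad[1], quad[3], quad[5], quad[7]];
  return {
    x: Math.min(...xs),
    y: Math.min(...ys),
    width: Math.max(...xs) - Math.min(...xs),
    height: Math.max(...ys) - Math.min(...ys),
  };
}

// Hypothetical helper: run `worker` over `items` with at most `limit` in flight.
// A failing worker yields null, mirroring the catch branch in getLabelBoxes.
async function mapWithLimit(items, limit, worker) {
  const sema = new Semaphore(limit);
  return Promise.all(items.map(async (item) => {
    await sema.acquire();
    try { return await worker(item); }
    catch { return null; }
    finally { sema.release(); }
  }));
}

// Hypothetical demo: ten fake box-model lookups, at most three at a time.
async function demo() {
  let active = 0;
  let peak = 0;
  // Stand-in for a CDP round trip; records how many calls overlap.
  async function fakeSend(quad) {
    active++;
    peak = Math.max(peak, active);
    await new Promise((resolve) => setTimeout(resolve, 5));
    active--;
    return { model: { border: quad } };
  }
  const quads = Array.from({ length: 10 }, (_, i) => [i, 0, i + 10, 0, i + 10, 5, i, 5]);
  const boxes = await mapWithLimit(quads, 3, async (quad) => {
    const response = await fakeSend(quad);
    return buildBoxFromQuad(response.model.border);
  });
  return { boxes, peak };
}

demo().then(({ boxes, peak }) => console.log('peak in flight:', peak, 'first box:', boxes[0]));
```

The semaphore caps how many lookups are in flight at once, which is presumably why the file sets `MAX_CONCURRENCY` rather than firing one CDP request per ref simultaneously.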
package/mcp/src/exec-engine.js CHANGED
@@ -4,6 +4,8 @@
4
4
  import { readFileSync } from 'node:fs';
5
5
  import { join } from 'node:path';
6
6
  import { homedir } from 'node:os';
7
+ import { fileURLToPath } from 'node:url';
8
+ import { spawn } from 'node:child_process';
7
9
  import {
8
10
  TEST_ID_ATTRS,
9
11
  buildSnapshotText, parseSearchPattern, annotateStableAttrs,
@@ -11,8 +13,10 @@ import {
11
13
 
12
14
  // ─── Configuration ───────────────────────────────────────────────────────────
13
15
 
16
+ const DEFAULT_PORT = 19222;
14
17
  export const BF_DIR = join(homedir(), '.browserforce');
15
18
  export const CDP_URL_FILE = join(BF_DIR, 'cdp-url');
19
+ const RELAY_SCRIPT = fileURLToPath(new URL('../../relay/src/index.js', import.meta.url));
16
20
 
17
21
  export function getCdpUrl() {
18
22
  if (process.env.BF_CDP_URL) return process.env.BF_CDP_URL;
@@ -34,10 +38,59 @@ export function getRelayHttpUrl() {
34
38
  const parsed = new URL(cdpUrl);
35
39
  return `http://${parsed.hostname}:${parsed.port}`;
36
40
  } catch {
37
- return 'http://127.0.0.1:19222';
41
+ return `http://127.0.0.1:${DEFAULT_PORT}`;
38
42
  }
39
43
  }
40
44
 
45
+ // ─── Auto-start relay ───────────────────────────────────────────────────────
46
+
47
+ function getRelayPort() {
48
+ if (process.env.RELAY_PORT) return parseInt(process.env.RELAY_PORT, 10);
49
+ try {
50
+ const url = readFileSync(CDP_URL_FILE, 'utf8').trim();
51
+ if (url) {
52
+ const port = new URL(url).port;
53
+ if (port) return parseInt(port, 10);
54
+ }
55
+ } catch { /* fall through */ }
56
+ return DEFAULT_PORT;
57
+ }
58
+
59
+ async function isRelayRunning(port) {
60
+ try {
61
+ const res = await fetch(`http://127.0.0.1:${port}/`, {
62
+ signal: AbortSignal.timeout(500),
63
+ });
64
+ return res.ok;
65
+ } catch { return false; }
66
+ }
67
+
68
+ /**
69
+ * Ensure the relay server is running. If not, spawn it as a detached
70
+ * background process and wait for it to become reachable.
71
+ */
72
+ export async function ensureRelay() {
73
+ const port = getRelayPort();
74
+ if (await isRelayRunning(port)) return;
75
+
76
+ const child = spawn(process.execPath, [RELAY_SCRIPT], {
77
+ detached: true,
78
+ stdio: 'ignore',
79
+ env: { ...process.env, RELAY_PORT: String(port) },
80
+ });
81
+ child.unref();
82
+
83
+ const deadline = Date.now() + 5000;
84
+ while (Date.now() < deadline) {
85
+ await new Promise(r => setTimeout(r, 200));
86
+ if (await isRelayRunning(port)) {
87
+ process.stderr.write('[browserforce] Relay auto-started\n');
88
+ return;
89
+ }
90
+ }
91
+ throw new Error('Failed to auto-start relay server within 5s');
92
+ }
93
+
41
94
  // ─── Smart Page Load Detection ───────────────────────────────────────────────
42
95
  // Filters analytics/ad requests that never finish, polls document.readyState +
43
96
  // pending resource count.
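The wait loop inside `ensureRelay` follows a common poll-until-reachable pattern: sleep, probe, repeat until a deadline. Below is a minimal standalone sketch of that pattern. `waitUntil`, `probe`, and `makeProbe` are hypothetical names used only for illustration and are not part of BrowserForce; in the real code the probe is the HTTP check done by `isRelayRunning`.

```javascript
// Sketch only: poll an async probe until it reports success or the deadline passes.
// Defaults mirror the 5 s deadline and 200 ms interval seen in ensureRelay.
async function waitUntil(probe, { timeoutMs = 5000, intervalMs = 200 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    // Sleep first, then probe, as ensureRelay does after spawning the child.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
    if (await probe()) return true;
  }
  return false;
}

// Stand-in probe that reports success from its Nth call onward.
function makeProbe(succeedOn) {
  let calls = 0;
  return async () => ++calls >= succeedOn;
}

waitUntil(makeProbe(3), { timeoutMs: 1000, intervalMs: 10 })
  .then((ready) => console.log(ready ? 'reachable' : 'timed out'));
```

Unlike `ensureRelay`, this sketch returns `false` on timeout instead of throwing, which leaves the choice of error message to the caller.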
package/mcp/src/index.js CHANGED
@@ -1,5 +1,5 @@
1
1
  // BrowserForce — MCP Server
2
- // 2-tool architecture: execute (run Playwright code) + reset (reconnect)
2
+ // 3-tool architecture: execute (run Playwright code) + reset (reconnect) + screenshot_with_labels (visual a11y labels)
3
3
  // Connects to the relay via Playwright's CDP client.
4
4
 
5
5
  import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
@@ -7,8 +7,9 @@ import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js'
7
7
  import { z } from 'zod';
8
8
  import { chromium } from 'playwright-core';
9
9
  import {
10
- getCdpUrl, CodeExecutionTimeoutError, buildExecContext, runCode, formatResult,
10
+ getCdpUrl, ensureRelay, CodeExecutionTimeoutError, buildExecContext, runCode, formatResult,
11
11
  } from './exec-engine.js';
12
+ import { screenshotWithLabels } from './a11y-labels.js';
12
13
 
13
14
  // ─── Console Log Capture ─────────────────────────────────────────────────────
14
15
 
@@ -64,6 +65,7 @@ let browser = null;
64
65
 
65
66
  async function ensureBrowser() {
66
67
  if (browser?.isConnected()) return;
68
+ await ensureRelay();
67
69
  const cdpUrl = getCdpUrl();
68
70
  browser = await chromium.connectOverCDP(cdpUrl);
69
71
  browser.on('disconnected', () => {
@@ -350,6 +352,79 @@ server.tool(
350
352
  }
351
353
  );
352
354
 
355
+ // ─── Screenshot with Labels Tool ──────────────────────────────────────────────
356
+
357
+ const SCREENSHOT_LABELS_PROMPT = `Take a screenshot with Vimium-style accessibility labels on interactive elements.
358
+
359
+ Returns TWO content items:
360
+ 1. JPEG screenshot with color-coded labels (e1, e2, e3...) on buttons, links, inputs, etc.
361
+ 2. Text accessibility snapshot with matching refs and role/name locators
362
+
363
+ Labels are color-coded by role:
364
+ - Yellow: links
365
+ - Orange: buttons, menu items, tabs
366
+ - Red/pink: text inputs, search boxes
367
+ - Green: checkboxes, radio buttons
368
+ - Blue: sliders, spinbuttons, media
369
+ - Purple: switches
370
+
371
+ Use this tool when:
372
+ - You need to understand the visual layout of a page
373
+ - Text snapshot alone can't convey spatial relationships
374
+ - You need to verify element positions (dashboards, grids, maps)
375
+ - You need both visual context AND element refs for interaction
376
+
377
+ After getting the screenshot, use the refs to interact via the execute tool:
378
+ await state.page.locator('role=button[name="Submit"]').click();
379
+
380
+ Parameters:
381
+ - selector: CSS selector to scope labels to part of the page (e.g., '#main', '.sidebar'). Main frame only.
382
+ - interactiveOnly: Only label interactive elements like buttons/links/inputs (default: true)
383
+
384
+ Limitations:
385
+ - Main frame only — does not label elements inside cross-origin iframes
386
+ - Locators are role/name based — no data-testid matching`;
387
+
388
+ server.tool(
389
+ 'screenshot_with_labels',
390
+ SCREENSHOT_LABELS_PROMPT,
391
+ {
392
+ selector: z.string().optional().describe('CSS selector to scope labels to a subtree of the main frame'),
393
+ interactiveOnly: z.boolean().optional().describe('Only label interactive elements (default: true)'),
394
+ },
395
+ async ({ selector, interactiveOnly = true }) => {
396
+ await ensureBrowser();
397
+ const ctx = getContext();
398
+ const page = (userState.page && !userState.page.isClosed())
399
+ ? userState.page
400
+ : ctx.pages()[0] || null;
401
+ if (!page) {
402
+ return {
403
+ content: [{ type: 'text', text: 'Error: No pages available. Open a tab first.' }],
404
+ isError: true,
405
+ };
406
+ }
407
+
408
+ try {
409
+ const { screenshot, snapshot, labelCount } = await screenshotWithLabels(page, {
410
+ selector,
411
+ interactiveOnly,
412
+ });
413
+ return {
414
+ content: [
415
+ { type: 'image', data: screenshot.toString('base64'), mimeType: 'image/jpeg' },
416
+ { type: 'text', text: `Labels: ${labelCount} interactive elements\n\n${snapshot}` },
417
+ ],
418
+ };
419
+ } catch (err) {
420
+ return {
421
+ content: [{ type: 'text', text: `Error: ${err.message}` }],
422
+ isError: true,
423
+ };
424
+ }
425
+ }
426
+ );
427
+
353
428
  // ─── Start Server ────────────────────────────────────────────────────────────
354
429
 
355
430
  async function main() {
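The `screenshot_with_labels` handler added in this hunk prefers the page stored in `userState.page` and falls back to the first page of the context when that page is missing or closed. A minimal sketch of that selection and of the two-item result follows. The helper names `pickPage` and `buildLabelResult` are hypothetical (the package inlines this logic), and plain objects stand in for real Playwright pages:

```javascript
// Hypothetical helpers for illustration only; not exported by browserforce.

// Prefer the user's current page if it is still open, else the first
// page in the browser context, else null.
function pickPage(userState, ctx) {
  if (userState.page && !userState.page.isClosed()) return userState.page;
  return ctx.pages()[0] || null;
}

// Build the MCP tool result: a base64 JPEG image item followed by a text
// item that carries the label count and the accessibility snapshot.
function buildLabelResult(screenshot, snapshot, labelCount) {
  return {
    content: [
      { type: 'image', data: screenshot.toString('base64'), mimeType: 'image/jpeg' },
      { type: 'text', text: `Labels: ${labelCount} interactive elements\n\n${snapshot}` },
    ],
  };
}
```

Keeping the fallback separate from the screenshot call makes the "No pages available" error path easy to check without a live browser.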
package/package.json CHANGED
@@ -1,8 +1,8 @@
1
1
  {
2
2
  "name": "browserforce",
3
- "version": "1.0.2",
3
+ "version": "1.0.6",
4
4
  "type": "module",
5
- "description": "Give AI agents your real Chrome browser: your logins, cookies, and tabs. Works with OpenClaw, Claude, and any MCP agent.",
5
+ "description": "Give AI agents your real Chrome browser with progressive examples: simple reads, form interactions, multi-tab workflows, and state persistence. Search X and GitHub, extract ProductHunt data, test forms, compare A/B variants, monitor status pages. Works with OpenClaw, Claude, and any MCP agent.",
6
6
  "homepage": "https://github.com/ivalsaraj/browserforce",
7
7
  "repository": {
8
8
  "type": "git",
@@ -19,8 +19,8 @@ cookies, and extensions already active. No headless browser, no fresh profiles.
19
19
  ## Prerequisites
20
20
 
21
21
  The user must have:
22
- 1. BrowserForce relay running: `browserforce serve`
23
- 2. BrowserForce Chrome extension installed and connected (green icon)
22
+ 1. BrowserForce Chrome extension installed and connected (green icon)
23
+ 2. The relay auto-starts on first command; no manual step needed
24
24
 
25
25
  Check with: `browserforce status`
26
26