browserforce 1.0.0 → 1.0.5

package/README.md CHANGED
@@ -1,19 +1,24 @@
1
1
  # BrowserForce
2
2
 
3
- **Give your AI agent your real Chrome browser.** Your logins, your cookies, your extensions — already there.
3
+ > "a lion doesn't concern itself with token counting" [@steipete](https://x.com/steipete), creator of [OpenClaw](https://github.com/openclaw/openclaw)
4
4
 
5
- Other browser tools spawn a fresh Chrome — no logins, no extensions, instantly flagged by bot detectors. BrowserForce connects to **your running browser** instead. One Chrome extension, full Playwright API, everything you're already logged into.
5
+ **You're giving an AI your real Chrome — your logins, cookies, and sessions. That takes conviction.** BrowserForce is built for people who use the best models and don't look back. Security is built in: lock URLs, block navigation, read-only mode, auto-cleanup. You stay in control.
6
+
7
+ **Fully autonomous browser control.** No manual tab clicking. Your agent browses as you, even from WhatsApp. Other tools make you click each tab, spawn a fresh Chrome, or only work with one AI client. BrowserForce connects to **your running browser** and auto-attaches to all tabs. One Chrome extension, full Playwright API, completely hands-off.
6
8
 
7
9
  Works with [OpenClaw](https://github.com/openclaw/openclaw), Claude, or any MCP-compatible agent.
8
10
 
9
- | | OpenClaw's built-in browser | BrowserForce |
10
- |---|---|---|
11
- | Browser | Spawns dedicated Chrome | **Uses your Chrome** |
12
- | Login state | Fresh — must log in every time | Already logged in |
13
- | Extensions | None | Your existing ones |
14
- | 2FA / Captchas | Blocked | Already passed (you did it) |
15
- | Bot detection | Easily detected | Runs in your real profile |
16
- | Cookies & sessions | Empty | Yours |
11
+ ## Comparison
12
+
13
+ | | Playwright MCP | Playwriter | Claude Extension | Antigravity | BrowserForce |
14
+ |---|---|---|---|---|---|
15
+ | Tab access | N/A (new browser) | Click each tab | Click each tab | Click each tab | **All tabs, automatic** |
16
+ | Browser | Spawns new Chrome | Your Chrome | Your Chrome | Your Chrome | **Your Chrome** |
17
+ | Autonomous | Yes | No (manual click) | No (manual click) | No (manual click) | **Yes (fully autonomous)** |
18
+ | Tools | Many dedicated tools | 1 `execute` tool | Built-in | Many dedicated tools | **1 `execute` tool** |
19
+ | Agent support | Any MCP client | Any MCP client | Claude only | Custom | **Any MCP client** |
20
+ | Context method | Screenshots | A11y snapshots | Screenshots | Screenshots | **A11y snapshots** |
21
+ | Playwright API | Partial | Full | No | No | **Full** |
17
22
 
18
23
  ## Setup
19
24
 
@@ -26,7 +31,7 @@ npm install -g browserforce
26
31
  Or from source:
27
32
 
28
33
  ```bash
29
- git clone https://github.com/anthropics/browserforce.git
34
+ git clone https://github.com/ivalsaraj/browserforce.git
30
35
  cd browserforce
31
36
  pnpm install
32
37
  ```
@@ -64,7 +69,15 @@ Extension icon turns green — you're connected.
64
69
 
65
70
  ### OpenClaw
66
71
 
67
- Add to `~/.openclaw/openclaw.json`:
72
+ Install the BrowserForce skill:
73
+
74
+ ```bash
75
+ npx -y skills add ivalsaraj/browserforce
76
+ ```
77
+
78
+ The skill teaches your agent to use BrowserForce CLI commands via Bash. Your OpenClaw agent can now browse the web as you — no login flows, no captchas.
79
+
80
+ Or add BrowserForce as an MCP server in `~/.openclaw/openclaw.json`:
68
81
 
69
82
  ```json
70
83
  {
@@ -77,8 +90,8 @@ Add to `~/.openclaw/openclaw.json`:
77
90
  {
78
91
  "name": "browserforce",
79
92
  "transport": "stdio",
80
- "command": "node",
81
- "args": ["/absolute/path/to/browserforce/mcp/src/index.js"]
93
+ "command": "npx",
94
+ "args": ["-y", "browserforce", "mcp"]
82
95
  }
83
96
  ]
84
97
  }
@@ -88,8 +101,6 @@ Add to `~/.openclaw/openclaw.json`:
88
101
  }
89
102
  ```
90
103
 
91
- Then add `"mcp-adapter"` to your agent's allowed tools. Your OpenClaw agent can now browse the web as you — no login flows, no captchas.
92
-
93
104
  ### Claude Desktop
94
105
 
95
106
  Add to `~/Library/Application Support/Claude/claude_desktop_config.json`:
@@ -98,8 +109,8 @@ Add to `~/Library/Application Support/Claude/claude_desktop_config.json`:
98
109
  {
99
110
  "mcpServers": {
100
111
  "browserforce": {
101
- "command": "node",
102
- "args": ["/absolute/path/to/browserforce/mcp/src/index.js"]
112
+ "command": "npx",
113
+ "args": ["-y", "browserforce", "mcp"]
103
114
  }
104
115
  }
105
116
  }
@@ -113,8 +124,8 @@ Add to `~/.claude/mcp.json`:
113
124
  {
114
125
  "mcpServers": {
115
126
  "browserforce": {
116
- "command": "node",
117
- "args": ["/absolute/path/to/browserforce/mcp/src/index.js"]
127
+ "command": "npx",
128
+ "args": ["-y", "browserforce", "mcp"]
118
129
  }
119
130
  }
120
131
  }
@@ -138,16 +149,6 @@ browserforce -e "<code>" # Run Playwright JavaScript (one-shot)
138
149
 
139
150
  Each `-e` command is one-shot — state does not persist between calls. For persistent state, use the MCP server.
140
151
 
141
- ### OpenClaw Skill
142
-
143
- Install the skill directly:
144
-
145
- ```bash
146
- npx -y skills add anthropics/browserforce
147
- ```
148
-
149
- Or add to your agent config manually — the skill teaches the agent to use BrowserForce CLI commands via Bash.
150
-
151
152
  ### Any Playwright Script
152
153
 
153
154
  ```javascript
@@ -230,16 +231,19 @@ The **Chrome extension** lives in your browser. It attaches Chrome's built-in de
230
231
 
231
232
  When the agent connects, it immediately sees all your open tabs as controllable Playwright pages. No clicking, no manual attachment.
232
233
 
233
- ## Extension Settings
234
+ ## You Stay in Control
234
235
 
235
- Click the extension icon to configure:
236
+ Click the extension icon to configure restrictions. Your browser, your rules:
236
237
 
237
- - **Auto / Manual mode** — Let the agent create tabs freely, or manually select which tabs it can access
238
- - **Lock URL** — Prevent the agent from navigating away from the current page
239
- - **No new tabs**: Block tab creation
240
- - **Read-only**: Observe only, no interactions
241
- - **Auto-cleanup**: Automatically detach or close agent tabs after a timeout
242
- - **Custom instructions** — Pass text instructions to the agent
238
+ | Setting | What it does |
239
+ |---------|-------------|
240
+ | **Auto / Manual mode** | Let the agent create tabs freely, or hand-pick which tabs it can access |
241
+ | **Lock URL** | Prevent the agent from navigating away from the current page |
242
+ | **No new tabs** | Block the agent from opening new tabs |
243
+ | **Read-only** | Observe only: no clicks, no typing, no interactions |
244
+ | **Auto-detach** | Automatically detach inactive tabs after 5-60 minutes |
245
+ | **Auto-close** | Automatically close agent-created tabs after 5-60 minutes |
246
+ | **Custom instructions** | Pass text instructions to the agent (e.g. "don't click any buy buttons") |
243
247
 
244
248
  ## Security
245
249
 
@@ -249,6 +253,7 @@ Click the extension icon to configure:
249
253
  | **Auth** | Random token required for every CDP connection |
250
254
  | **Origin** | Extension only accepts connections from its own Chrome origin |
251
255
  | **Visibility** | Chrome shows "controlled by automated test software" on active tabs |
256
+ | **Restrictions** | Lock URLs, block navigation, read-only mode — enforced at the CDP level |
252
257
 
253
258
  Everything runs on your machine. The auth token is stored at `~/.browserforce/auth-token` with owner-only permissions.
254
259
 
@@ -290,4 +295,4 @@ RELAY_PORT=19333 browserforce serve
290
295
  | Extension keeps reconnecting | Normal — MV3 kills idle workers; it auto-recovers |
291
296
  | Port in use | `lsof -ti:19222 \| xargs kill -9` |
292
297
 
293
- > **Want the full walkthrough?** Read the [User Guide](GUIDE.md) for a plain-English explanation of what this does and how to get started.
298
+ > **Want the full walkthrough?** Read the [User Guide](https://github.com/ivalsaraj/browserforce/blob/main/GUIDE.md) for a plain-English explanation of what this does and how to get started.
@@ -0,0 +1,466 @@
1
+ // BrowserForce — Accessibility Label Overlay for Screenshots
2
+ // Uses CDP AX tree + DOM.getBoxModel for element positioning,
3
+ // then injects browser-side label renderer for Vimium-style labels.
4
+
5
+ import {
6
+ INTERACTIVE_ROLES, CONTEXT_ROLES, SKIP_ROLES,
7
+ buildLocator, escapeLocatorName,
8
+ } from './snapshot.js';
9
+
10
+ // ─── Semaphore ────────────────────────────────────────────────────────────────
11
+
12
+ export class Semaphore {
13
+ constructor(max) {
14
+ this.max = max;
15
+ this.count = 0;
16
+ this.queue = [];
17
+ }
18
+
19
+ acquire() {
20
+ if (this.count < this.max) {
21
+ this.count++;
22
+ return Promise.resolve();
23
+ }
24
+ return new Promise(resolve => this.queue.push(resolve));
25
+ }
26
+
27
+ release() {
28
+ this.count--;
29
+ if (this.queue.length > 0) {
30
+ this.count++;
31
+ this.queue.shift()();
32
+ }
33
+ }
34
+ }
35
+
36
+ // ─── Box Model ────────────────────────────────────────────────────────────────
37
+
38
+ export function buildBoxFromQuad(quad) {
39
+ const xs = [quad[0], quad[2], quad[4], quad[6]];
40
+ const ys = [quad[1], quad[3], quad[5], quad[7]];
41
+ return {
42
+ x: Math.min(...xs),
43
+ y: Math.min(...ys),
44
+ width: Math.max(...xs) - Math.min(...xs),
45
+ height: Math.max(...ys) - Math.min(...ys),
46
+ };
47
+ }
48
+
49
+ // ─── CDP AX Tree → Snapshot Text ──────────────────────────────────────────────
50
+
51
+ export function buildSnapshotFromCdpNodes(nodes, scopeBackendNodeId, { refAll = false } = {}) {
52
+ // Build lookup maps
53
+ const byId = new Map();
54
+ for (const node of nodes) {
55
+ byId.set(node.nodeId, node);
56
+ }
57
+
58
+ // Find scope root if scoping requested
59
+ let scopeNodeId = null;
60
+ if (scopeBackendNodeId != null) {
61
+ const scopeNode = nodes.find(n => n.backendDOMNodeId === scopeBackendNodeId);
62
+ if (!scopeNode) {
63
+ throw new Error(
64
+ `Scoped element (backendNodeId=${scopeBackendNodeId}) has no matching accessibility node. ` +
65
+ 'The element may be aria-hidden or have no accessible role.'
66
+ );
67
+ }
68
+ scopeNodeId = scopeNode.nodeId;
69
+ }
70
+
71
+ // Build tree structure from parentId references
72
+ const childrenMap = new Map();
73
+ for (const node of nodes) {
74
+ if (!childrenMap.has(node.nodeId)) childrenMap.set(node.nodeId, []);
75
+ if (node.parentId) {
76
+ if (!childrenMap.has(node.parentId)) childrenMap.set(node.parentId, []);
77
+ childrenMap.get(node.parentId).push(node);
78
+ }
79
+ }
80
+
81
+ // Check if a nodeId is inside the scope subtree
82
+ function isInScope(nodeId) {
83
+ if (!scopeNodeId) return true;
84
+ let current = nodeId;
85
+ while (current) {
86
+ if (current === scopeNodeId) return true;
87
+ const node = byId.get(current);
88
+ current = node?.parentId || null;
89
+ }
90
+ return false;
91
+ }
92
+
93
+ const lines = [];
94
+ const refs = [];
95
+ let refCounter = 0;
96
+
97
+ function walk(nodeId, depth) {
98
+ const node = byId.get(nodeId);
99
+ if (!node || node.ignored) return;
100
+
101
+ const role = node.role?.value;
102
+ if (!role) return;
103
+
104
+ // Skip root wrapper and generic roles — recurse into children
105
+ if (role === 'RootWebArea' || role === 'WebArea') {
106
+ for (const child of childrenMap.get(nodeId) || []) {
107
+ walk(child.nodeId, depth);
108
+ }
109
+ return;
110
+ }
111
+ if (SKIP_ROLES.has(role)) {
112
+ for (const child of childrenMap.get(nodeId) || []) {
113
+ walk(child.nodeId, depth);
114
+ }
115
+ return;
116
+ }
117
+
118
+ // Check scope
119
+ if (!isInScope(nodeId)) return;
120
+
121
+ const isInteractive = INTERACTIVE_ROLES.has(role);
122
+ const isContext = CONTEXT_ROLES.has(role);
123
+ const name = node.name?.value || '';
124
+ const children = childrenMap.get(nodeId) || [];
125
+
126
+ // Skip non-interactive, non-context nodes without interactive descendants
127
+ if (!isInteractive && !isContext) {
128
+ const hasInteractive = children.some(c => {
129
+ const r = c.role?.value;
130
+ return r && INTERACTIVE_ROLES.has(r);
131
+ });
132
+ if (!hasInteractive) {
133
+ // Still recurse — children may have interactive descendants
134
+ for (const child of children) {
135
+ walk(child.nodeId, depth);
136
+ }
137
+ return;
138
+ }
139
+ }
140
+
141
+ const indent = ' '.repeat(depth);
142
+ let lineText = `${indent}- ${role}`;
143
+ if (name) {
144
+ lineText += ` "${escapeLocatorName(name)}"`;
145
+ }
146
+
147
+ if (isInteractive || (refAll && isContext && node.backendDOMNodeId)) {
148
+ refCounter++;
149
+ const ref = `e${refCounter}`;
150
+ const locator = buildLocator(role, name, null);
151
+ lineText += ` [ref=${ref}]`;
152
+ refs.push({ ref, role, name, backendNodeId: node.backendDOMNodeId, locator });
153
+ }
154
+
155
+ const relevantChildren = children.filter(c => {
156
+ const r = c.role?.value;
157
+ return r && !c.ignored && (INTERACTIVE_ROLES.has(r) || CONTEXT_ROLES.has(r) || !SKIP_ROLES.has(r));
158
+ });
159
+ if (relevantChildren.length > 0) {
160
+ lineText += ':';
161
+ }
162
+
163
+ lines.push(lineText);
164
+
165
+ for (const child of children) {
166
+ walk(child.nodeId, depth + 1);
167
+ }
168
+ }
169
+
170
+ // Find root nodes (no parentId)
171
+ const roots = nodes.filter(n => !n.parentId);
172
+ for (const root of roots) {
173
+ walk(root.nodeId, 0);
174
+ }
175
+
176
+ return { text: lines.join('\n'), refs };
177
+ }
178
+
179
+ // ─── Browser-Side Label Renderer ──────────────────────────────────────────────
180
+ // Injected into page context via page.evaluate().
181
+ // Port of Playwriter's a11y-client.ts — Vimium-style color-coded labels.
182
+
183
+ const LABELS_CONTAINER_ID = '__bf_labels__';
184
+ const LABELS_TIMER_KEY = '__bf_labels_timer__';
185
+
186
+ export const ROLE_COLORS = {
187
+ link: ['#FFF785', '#FFC542', '#E3BE23'],
188
+ button: ['#FFE0B2', '#FFCC80', '#FFB74D'],
189
+ textbox: ['#FFCDD2', '#EF9A9A', '#E57373'],
190
+ combobox: ['#F8BBD0', '#F48FB1', '#F06292'],
191
+ searchbox: ['#F8BBD0', '#F48FB1', '#F06292'],
192
+ checkbox: ['#C8E6C9', '#A5D6A7', '#81C784'],
193
+ radio: ['#C8E6C9', '#A5D6A7', '#81C784'],
194
+ slider: ['#BBDEFB', '#90CAF9', '#64B5F6'],
195
+ spinbutton: ['#BBDEFB', '#90CAF9', '#64B5F6'],
196
+ switch: ['#D1C4E9', '#B39DDB', '#9575CD'],
197
+ menuitem: ['#FFE0B2', '#FFCC80', '#FFB74D'],
198
+ menuitemcheckbox: ['#FFE0B2', '#FFCC80', '#FFB74D'],
199
+ menuitemradio: ['#FFE0B2', '#FFCC80', '#FFB74D'],
200
+ option: ['#FFE0B2', '#FFCC80', '#FFB74D'],
201
+ tab: ['#FFE0B2', '#FFCC80', '#FFB74D'],
202
+ treeitem: ['#FFE0B2', '#FFCC80', '#FFB74D'],
203
+ img: ['#B3E5FC', '#81D4FA', '#4FC3F7'],
204
+ video: ['#B3E5FC', '#81D4FA', '#4FC3F7'],
205
+ audio: ['#B3E5FC', '#81D4FA', '#4FC3F7'],
206
+ };
207
+
208
+ const DEFAULT_COLORS = ['#FFF9C4', '#FFF59D', '#FFEB3B'];
209
+
210
+ // This string is injected into the page via page.evaluate().
211
+ // It sets up globalThis.__bf_a11y with renderA11yLabels and hideA11yLabels.
212
+ export const A11Y_CLIENT_CODE = `
213
+ (function() {
214
+ const CONTAINER_ID = '${LABELS_CONTAINER_ID}';
215
+ const TIMER_KEY = '${LABELS_TIMER_KEY}';
216
+ const ROLE_COLORS = ${JSON.stringify(ROLE_COLORS)};
217
+ const DEFAULT_COLORS = ${JSON.stringify(DEFAULT_COLORS)};
218
+
219
+ function renderA11yLabels(labels) {
220
+ const doc = document;
221
+ const win = window;
222
+
223
+ if (win[TIMER_KEY]) {
224
+ win.clearTimeout(win[TIMER_KEY]);
225
+ win[TIMER_KEY] = null;
226
+ }
227
+
228
+ doc.getElementById(CONTAINER_ID)?.remove();
229
+
230
+ const container = doc.createElement('div');
231
+ container.id = CONTAINER_ID;
232
+ container.style.cssText = 'position:absolute;left:0;top:0;z-index:2147483647;pointer-events:none;';
233
+
234
+ const style = doc.createElement('style');
235
+ style.textContent = '.__bf_label__{position:absolute;font:bold 12px Helvetica,Arial,sans-serif;padding:1px 4px;border-radius:3px;color:black;text-shadow:0 1px 0 rgba(255,255,255,0.6);white-space:nowrap;}';
236
+ container.appendChild(style);
237
+
238
+ const svg = doc.createElementNS('http://www.w3.org/2000/svg', 'svg');
239
+ svg.style.cssText = 'position:absolute;left:0;top:0;pointer-events:none;overflow:visible;';
240
+ svg.setAttribute('width', '' + doc.documentElement.scrollWidth);
241
+ svg.setAttribute('height', '' + doc.documentElement.scrollHeight);
242
+
243
+ const defs = doc.createElementNS('http://www.w3.org/2000/svg', 'defs');
244
+ svg.appendChild(defs);
245
+ const markerCache = {};
246
+
247
+ function getArrowMarkerId(color) {
248
+ if (markerCache[color]) return markerCache[color];
249
+ const markerId = 'bf-arrow-' + color.replace('#', '');
250
+ const marker = doc.createElementNS('http://www.w3.org/2000/svg', 'marker');
251
+ marker.setAttribute('id', markerId);
252
+ marker.setAttribute('viewBox', '0 0 10 10');
253
+ marker.setAttribute('refX', '9');
254
+ marker.setAttribute('refY', '5');
255
+ marker.setAttribute('markerWidth', '6');
256
+ marker.setAttribute('markerHeight', '6');
257
+ marker.setAttribute('orient', 'auto-start-reverse');
258
+ const path = doc.createElementNS('http://www.w3.org/2000/svg', 'path');
259
+ path.setAttribute('d', 'M 0 0 L 10 5 L 0 10 z');
260
+ path.setAttribute('fill', color);
261
+ marker.appendChild(path);
262
+ defs.appendChild(marker);
263
+ markerCache[color] = markerId;
264
+ return markerId;
265
+ }
266
+
267
+ container.appendChild(svg);
268
+
269
+ const placedLabels = [];
270
+ const LABEL_HEIGHT = 17;
271
+ const LABEL_CHAR_WIDTH = 7;
272
+
273
+ const viewportLeft = win.scrollX;
274
+ const viewportTop = win.scrollY;
275
+ const viewportRight = viewportLeft + win.innerWidth;
276
+ const viewportBottom = viewportTop + win.innerHeight;
277
+
278
+ let count = 0;
279
+ for (const item of labels) {
280
+ const ref = item.ref;
281
+ const role = item.role;
282
+ const box = item.box;
283
+
284
+ const rectLeft = box.x;
285
+ const rectTop = box.y;
286
+ const rectRight = rectLeft + box.width;
287
+ const rectBottom = rectTop + box.height;
288
+
289
+ if (box.width <= 0 || box.height <= 0) continue;
290
+ if (rectRight < viewportLeft || rectLeft > viewportRight ||
291
+ rectBottom < viewportTop || rectTop > viewportBottom) continue;
292
+
293
+ const labelWidth = ref.length * LABEL_CHAR_WIDTH + 8;
294
+ const labelLeft = rectLeft;
295
+ const labelTop = Math.max(0, rectTop - LABEL_HEIGHT);
296
+ const labelRect = { left: labelLeft, top: labelTop, right: labelLeft + labelWidth, bottom: labelTop + LABEL_HEIGHT };
297
+
298
+ let overlaps = false;
299
+ for (const placed of placedLabels) {
300
+ if (labelRect.left < placed.right && labelRect.right > placed.left &&
301
+ labelRect.top < placed.bottom && labelRect.bottom > placed.top) {
302
+ overlaps = true;
303
+ break;
304
+ }
305
+ }
306
+ if (overlaps) continue;
307
+
308
+ const colors = ROLE_COLORS[role] || DEFAULT_COLORS;
309
+ const label = doc.createElement('div');
310
+ label.className = '__bf_label__';
311
+ label.textContent = ref;
312
+ label.style.background = 'linear-gradient(to bottom, ' + colors[0] + ' 0%, ' + colors[1] + ' 100%)';
313
+ label.style.border = '1px solid ' + colors[2];
314
+ label.style.left = labelLeft + 'px';
315
+ label.style.top = labelTop + 'px';
316
+ container.appendChild(label);
317
+
318
+ const line = doc.createElementNS('http://www.w3.org/2000/svg', 'line');
319
+ line.setAttribute('x1', '' + (labelLeft + labelWidth / 2));
320
+ line.setAttribute('y1', '' + (labelTop + LABEL_HEIGHT));
321
+ line.setAttribute('x2', '' + (rectLeft + box.width / 2));
322
+ line.setAttribute('y2', '' + (rectTop + box.height / 2));
323
+ line.setAttribute('stroke', colors[2]);
324
+ line.setAttribute('stroke-width', '1.5');
325
+ line.setAttribute('marker-end', 'url(#' + getArrowMarkerId(colors[2]) + ')');
326
+ svg.appendChild(line);
327
+
328
+ placedLabels.push(labelRect);
329
+ count++;
330
+ }
331
+
332
+ doc.documentElement.appendChild(container);
333
+
334
+ win[TIMER_KEY] = win.setTimeout(function() {
335
+ doc.getElementById(CONTAINER_ID)?.remove();
336
+ win[TIMER_KEY] = null;
337
+ }, 30000);
338
+
339
+ return count;
340
+ }
341
+
342
+ function hideA11yLabels() {
343
+ if (window['${LABELS_TIMER_KEY}']) {
344
+ window.clearTimeout(window['${LABELS_TIMER_KEY}']);
345
+ window['${LABELS_TIMER_KEY}'] = null;
346
+ }
347
+ document.getElementById('${LABELS_CONTAINER_ID}')?.remove();
348
+ }
349
+
350
+ globalThis.__bf_a11y = { renderA11yLabels: renderA11yLabels, hideA11yLabels: hideA11yLabels };
351
+ })();
352
+ `;
353
+
354
+ // ─── CDP Helpers ──────────────────────────────────────────────────────────────
355
+
356
+ const MAX_CONCURRENCY = 24;
357
+ const BOX_MODEL_TIMEOUT_MS = 5000;
358
+ const MAX_SCREENSHOT_DIMENSION = 1568;
359
+
360
+ export async function resolveScopeBackendNodeId(cdp, selector) {
361
+ if (!selector) return null;
362
+ const { root } = await cdp.send('DOM.getDocument');
363
+ const { nodeId } = await cdp.send('DOM.querySelector', {
364
+ nodeId: root.nodeId, selector,
365
+ });
366
+ if (!nodeId) {
367
+ throw new Error(`Selector "${selector}" did not match any element on the page`);
368
+ }
369
+ const { node } = await cdp.send('DOM.describeNode', { nodeId });
370
+ return node.backendNodeId;
371
+ }
372
+
373
+ export async function getLabelBoxes(cdp, refs) {
374
+ const sema = new Semaphore(MAX_CONCURRENCY);
375
+ const results = await Promise.all(
376
+ refs.map(async (ref) => {
377
+ if (!ref.backendNodeId) return null;
378
+ await sema.acquire();
379
+ try {
380
+ const response = await Promise.race([
381
+ cdp.send('DOM.getBoxModel', { backendNodeId: ref.backendNodeId }),
382
+ new Promise(resolve => setTimeout(() => resolve(null), BOX_MODEL_TIMEOUT_MS)),
383
+ ]);
384
+ if (!response) return null;
385
+ const box = buildBoxFromQuad(response.model.border);
386
+ if (box.width <= 0 || box.height <= 0) return null;
387
+ return { ref: ref.ref, role: ref.role, box };
388
+ } catch {
389
+ return null;
390
+ } finally {
391
+ sema.release();
392
+ }
393
+ })
394
+ );
395
+ return results.filter(Boolean);
396
+ }
397
+
398
+ export async function injectA11yClient(page) {
399
+ const exists = await page.evaluate(() => typeof globalThis.__bf_a11y !== 'undefined');
400
+ if (!exists) {
401
+ await page.evaluate(A11Y_CLIENT_CODE);
402
+ }
403
+ }
404
+
405
+ export async function showLabels(page, labels) {
406
+ return page.evaluate((entries) => globalThis.__bf_a11y.renderA11yLabels(entries), labels);
407
+ }
408
+
409
+ export async function hideLabels(page) {
410
+ await page.evaluate(() => {
411
+ const timerKey = '__bf_labels_timer__';
412
+ if (window[timerKey]) {
413
+ window.clearTimeout(window[timerKey]);
414
+ window[timerKey] = null;
415
+ }
416
+ document.getElementById('__bf_labels__')?.remove();
417
+ });
418
+ }
419
+
420
+ // ─── Main Orchestrator ────────────────────────────────────────────────────────
421
+
422
+ export async function screenshotWithLabels(page, { selector, interactiveOnly = true } = {}) {
423
+ let cdp;
424
+ let labelsInjected = false;
425
+
426
+ try {
427
+ cdp = await page.context().newCDPSession(page);
428
+
429
+ const scopeId = selector
430
+ ? await resolveScopeBackendNodeId(cdp, selector)
431
+ : null;
432
+
433
+ const { nodes } = await cdp.send('Accessibility.getFullAXTree');
434
+ const { text, refs } = buildSnapshotFromCdpNodes(nodes, scopeId, {
435
+ refAll: !interactiveOnly,
436
+ });
437
+
438
+ const labels = await getLabelBoxes(cdp, refs);
439
+
440
+ await injectA11yClient(page);
441
+ labelsInjected = true;
442
+ const labelCount = await showLabels(page, labels);
443
+
444
+ const maxDim = MAX_SCREENSHOT_DIMENSION;
445
+ const viewport = await page.evaluate((max) => ({
446
+ width: Math.min(window.innerWidth, max),
447
+ height: Math.min(window.innerHeight, max),
448
+ }), maxDim);
449
+
450
+ const screenshot = await page.screenshot({
451
+ type: 'jpeg',
452
+ quality: 80,
453
+ scale: 'css',
454
+ clip: { x: 0, y: 0, ...viewport },
455
+ });
456
+
457
+ return { screenshot, snapshot: text, labelCount };
458
+ } finally {
459
+ if (labelsInjected) {
460
+ try { await hideLabels(page); } catch { /* page may have navigated */ }
461
+ }
462
+ if (cdp) {
463
+ try { await cdp.detach(); } catch { /* session may already be detached */ }
464
+ }
465
+ }
466
+ }
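The new file combines two small helpers: `Semaphore` caps the number of in-flight `DOM.getBoxModel` calls, and `buildBoxFromQuad` turns an eight-number quad into an axis-aligned box. The sketch below copies both helpers so it runs on its own. `fakeGetBoxModel` and `measureAll` are hypothetical names invented for illustration; the first stands in for the real CDP call, and the second only approximates the shape of `getLabelBoxes` (no timeout, no error swallowing).

```javascript
// Standalone copies of the two helpers from the diff above.
class Semaphore {
  constructor(max) { this.max = max; this.count = 0; this.queue = []; }
  acquire() {
    if (this.count < this.max) { this.count++; return Promise.resolve(); }
    // At capacity: park the caller until release() hands over a slot.
    return new Promise(resolve => this.queue.push(resolve));
  }
  release() {
    this.count--;
    if (this.queue.length > 0) { this.count++; this.queue.shift()(); }
  }
}

function buildBoxFromQuad(quad) {
  // A quad is [x1, y1, x2, y2, x3, y3, x4, y4]; take the bounding rectangle.
  const xs = [quad[0], quad[2], quad[4], quad[6]];
  const ys = [quad[1], quad[3], quad[5], quad[7]];
  return {
    x: Math.min(...xs),
    y: Math.min(...ys),
    width: Math.max(...xs) - Math.min(...xs),
    height: Math.max(...ys) - Math.min(...ys),
  };
}

// Hypothetical stand-in for cdp.send('DOM.getBoxModel', ...):
// resolves with a fixed border quad after a short delay.
function fakeGetBoxModel(quad) {
  return new Promise(resolve => setTimeout(() => resolve({ model: { border: quad } }), 5));
}

// Hypothetical driver: converts every quad while tracking peak concurrency.
async function measureAll(quads, maxConcurrency) {
  const sema = new Semaphore(maxConcurrency);
  let active = 0;
  let peak = 0;
  const boxes = await Promise.all(quads.map(async (quad) => {
    await sema.acquire();
    active++;
    peak = Math.max(peak, active);
    try {
      const response = await fakeGetBoxModel(quad);
      return buildBoxFromQuad(response.model.border);
    } finally {
      active--;
      sema.release();
    }
  }));
  return { boxes, peak };
}

// Usage: four quads, at most two lookups in flight at once.
measureAll([
  [10, 20, 110, 20, 110, 60, 10, 60],
  [0, 0, 50, 0, 50, 50, 0, 50],
  [5, 5, 15, 5, 15, 25, 5, 25],
  [100, 100, 300, 100, 300, 140, 100, 140],
], 2).then(({ boxes, peak }) => console.log(boxes, 'peak concurrency:', peak));
```

Releasing in a `finally` block, as the diff does, matters here: a rejected lookup would otherwise hold its slot forever and stall every queued caller.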
package/mcp/src/index.js CHANGED
@@ -1,5 +1,5 @@
1
1
  // BrowserForce — MCP Server
2
- // 2-tool architecture: execute (run Playwright code) + reset (reconnect)
2
+ // 3-tool architecture: execute (run Playwright code) + reset (reconnect) + screenshot_with_labels (visual a11y labels)
3
3
  // Connects to the relay via Playwright's CDP client.
4
4
 
5
5
  import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
@@ -9,6 +9,7 @@ import { chromium } from 'playwright-core';
9
9
  import {
10
10
  getCdpUrl, CodeExecutionTimeoutError, buildExecContext, runCode, formatResult,
11
11
  } from './exec-engine.js';
12
+ import { screenshotWithLabels } from './a11y-labels.js';
12
13
 
13
14
  // ─── Console Log Capture ─────────────────────────────────────────────────────
14
15
 
@@ -350,6 +351,79 @@ server.tool(
350
351
  }
351
352
  );
352
353
 
354
+ // ─── Screenshot with Labels Tool ──────────────────────────────────────────────
355
+
356
+ const SCREENSHOT_LABELS_PROMPT = `Take a screenshot with Vimium-style accessibility labels on interactive elements.
357
+
358
+ Returns TWO content items:
359
+ 1. JPEG screenshot with color-coded labels (e1, e2, e3...) on buttons, links, inputs, etc.
360
+ 2. Text accessibility snapshot with matching refs and role/name locators
361
+
362
+ Labels are color-coded by role:
363
+ - Yellow: links
364
+ - Orange: buttons, menu items, tabs
365
+ - Red/pink: text inputs, search boxes
366
+ - Green: checkboxes, radio buttons
367
+ - Blue: sliders, spinbuttons, media
368
+ - Purple: switches
369
+
370
+ Use this tool when:
371
+ - You need to understand the visual layout of a page
372
+ - Text snapshot alone can't convey spatial relationships
373
+ - You need to verify element positions (dashboards, grids, maps)
374
+ - You need both visual context AND element refs for interaction
375
+
376
+ After getting the screenshot, use the refs to interact via the execute tool:
377
+ await state.page.locator('role=button[name="Submit"]').click();
378
+
379
+ Parameters:
380
+ - selector: CSS selector to scope labels to part of the page (e.g., '#main', '.sidebar'). Main frame only.
381
+ - interactiveOnly: Only label interactive elements like buttons/links/inputs (default: true)
382
+
383
+ Limitations:
384
+ - Main frame only — does not label elements inside cross-origin iframes
385
+ - Locators are role/name based — no data-testid matching`;
386
+
387
+ server.tool(
388
+ 'screenshot_with_labels',
389
+ SCREENSHOT_LABELS_PROMPT,
390
+ {
391
+ selector: z.string().optional().describe('CSS selector to scope labels to a subtree of the main frame'),
392
+ interactiveOnly: z.boolean().optional().describe('Only label interactive elements (default: true)'),
393
+ },
394
+ async ({ selector, interactiveOnly = true }) => {
395
+ await ensureBrowser();
396
+ const ctx = getContext();
397
+ const page = (userState.page && !userState.page.isClosed())
398
+ ? userState.page
399
+ : ctx.pages()[0] || null;
400
+ if (!page) {
401
+ return {
402
+ content: [{ type: 'text', text: 'Error: No pages available. Open a tab first.' }],
403
+ isError: true,
404
+ };
405
+ }
406
+
407
+ try {
408
+ const { screenshot, snapshot, labelCount } = await screenshotWithLabels(page, {
409
+ selector,
410
+ interactiveOnly,
411
+ });
412
+ return {
413
+ content: [
414
+ { type: 'image', data: screenshot.toString('base64'), mimeType: 'image/jpeg' },
415
+ { type: 'text', text: `Labels: ${labelCount} interactive elements\n\n${snapshot}` },
416
+ ],
417
+ };
418
+ } catch (err) {
419
+ return {
420
+ content: [{ type: 'text', text: `Error: ${err.message}` }],
421
+ isError: true,
422
+ };
423
+ }
424
+ }
425
+ );
426
+
353
427
  // ─── Start Server ────────────────────────────────────────────────────────────
354
428
 
355
429
  async function main() {
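The `screenshot_with_labels` handler added above returns two content items on success (a base64 JPEG and a text snapshot) and a single text item with `isError: true` on failure. The sketch below mirrors those shapes for illustration. `buildLabelsResult` and `buildErrorResult` are hypothetical helper names that do not exist in the package, and the four-byte buffer is a placeholder rather than a real screenshot.

```javascript
// Hypothetical helper: mirrors the success shape returned by the handler in the diff.
function buildLabelsResult(screenshot, snapshot, labelCount) {
  return {
    content: [
      // Image data travels as base64 text alongside its MIME type.
      { type: 'image', data: screenshot.toString('base64'), mimeType: 'image/jpeg' },
      { type: 'text', text: `Labels: ${labelCount} interactive elements\n\n${snapshot}` },
    ],
  };
}

// Hypothetical helper: mirrors the failure shape (one text item plus isError).
function buildErrorResult(err) {
  return {
    content: [{ type: 'text', text: `Error: ${err.message}` }],
    isError: true,
  };
}

// Placeholder bytes standing in for a JPEG buffer from page.screenshot().
const fakeJpeg = Buffer.from([0xff, 0xd8, 0xff, 0xd9]);
const ok = buildLabelsResult(fakeJpeg, '- button "Submit" [ref=e1]', 1);
const failed = buildErrorResult(new Error('No pages available'));

console.log(ok.content.map(item => item.type), failed.isError);
```

A client can pair the two items by ref: the `e1` label drawn on the image corresponds to the `[ref=e1]` entry in the text snapshot.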
package/package.json CHANGED
@@ -1,12 +1,12 @@
1
1
  {
2
2
  "name": "browserforce",
3
- "version": "1.0.0",
3
+ "version": "1.0.5",
4
4
  "type": "module",
5
5
  "description": "Give AI agents your real Chrome browser — your logins, cookies, and tabs. Works with OpenClaw, Claude, and any MCP agent.",
6
- "homepage": "https://github.com/anthropics/browserforce",
6
+ "homepage": "https://github.com/ivalsaraj/browserforce",
7
7
  "repository": {
8
8
  "type": "git",
9
- "url": "https://github.com/anthropics/browserforce.git"
9
+ "url": "https://github.com/ivalsaraj/browserforce.git"
10
10
  },
11
11
  "license": "MIT",
12
12
  "keywords": [