npm - browserforce - Versions diffs - 1.0.2 → 1.0.6 - Mend

browserforce 1.0.2 → 1.0.6

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (7) hide show

package/README.md +87 -75
package/bin.js +2 -1
package/mcp/src/a11y-labels.js +466 -0
package/mcp/src/exec-engine.js +54 -1
package/mcp/src/index.js +77 -2
package/package.json +2 -2
package/skills/browserforce/SKILL.md +2 -2

package/README.md CHANGED Viewed

@@ -1,19 +1,27 @@
-# BrowserForce
+# BrowserForce //
-**Give your AI agent your real Chrome browser.** Your logins, your cookies, your extensions — already there.
+> "a lion doesn't concern itself with token counting" — [@steipete](https://x.com/steipete), creator of [OpenClaw](https://github.com/openclaw/openclaw)
+>
+> "a 10x user doesn't concern itself with sandboxed browsers // sandboxes are for kids" — BrowserForce, your friendly neighborhood power source.
-Other browser tools spawn a fresh Chrome — no logins, no extensions, instantly flagged by bot detectors. BrowserForce connects to **your running browser** instead. One Chrome extension, full Playwright API, everything you're already logged into.
+**You're giving an AI your real Chrome — your logins, cookies, and sessions. That takes conviction.** BrowserForce is built for people who use the best models and don't look back. Security is built in: lock URLs, block navigation, read-only mode, auto-cleanup — you stay in control.
+**Fully autonomous browser control.** No manual tab clicking. Your agent browses as you, even from WhatsApp. Other tools make you click each tab, spawn a fresh Chrome, or only work with one AI client. BrowserForce connects to **your running browser** and auto-attaches to all tabs. One Chrome extension, full Playwright API, completely hands-off.
 Works with [OpenClaw](https://github.com/openclaw/openclaw), Claude, or any MCP-compatible agent.
-| | OpenClaw's built-in browser | BrowserForce |
-|---|---|---|
-| Browser | Spawns dedicated Chrome | **Uses your Chrome** |
-| Login state | Fresh — must log in every time | Already logged in |
-| Extensions | None | Your existing ones |
-| 2FA / Captchas | Blocked | Already passed (you did it) |
-| Bot detection | Easily detected | Runs in your real profile |
-| Cookies & sessions | Empty | Yours |
+## Comparison
+| | Playwright MCP | OpenClaw Browser | Playwriter | Claude Extension | BrowserForce |
+|---|---|---|---|---|---|
+| Browser | Spawns new Chrome | Separate profile | Your Chrome | Your Chrome | **Your Chrome** |
+| Login state | Fresh | Fresh (isolated) | Yours | Yours | **Yours** |
+| Tab access | N/A (new browser) | Managed by agent | Click each tab | Click each tab | **All tabs, automatic** |
+| Autonomous | Yes | Yes | No (manual click) | No (manual click) | **Yes (fully autonomous)** |
+| Context method | Screenshots (100KB+) | Screenshots + snapshots | A11y snapshots (5-20KB) | Screenshots (100KB+) | **A11y snapshots (5-20KB)** |
+| Tools | Many dedicated | 1 `browser` tool | 1 `execute` tool | Built-in | **1 `execute` tool** |
+| Agent support | Any MCP client | OpenClaw only | Any MCP client | Claude only | **Any MCP client** |
+| Playwright API | Partial | No | Full | No | **Full** |
 ## Setup
@@ -38,33 +46,44 @@ pnpm install
 3. Click **Load unpacked** → select the `extension/` folder
 4. Extension icon appears in your toolbar (gray = disconnected)
-### 3. Start the relay
+### 3. Done
+The relay auto-starts when you run any command or connect via MCP — no manual step needed. Extension icon turns green once connected.
+To run the relay manually (optional):
 ```bash
 browserforce serve
 ```
-Or with pnpm (development):
+## Connect Your Agent
+### OpenClaw
+Most OpenClaw users chat with their agent from Telegram or WhatsApp. BrowserForce lets your agent browse the web as you — no login flows, no captchas — even from a messaging app.
+**Quick setup** (copy-paste into your terminal):
 ```bash
-pnpm relay
+npm install -g browserforce && npx -y skills add ivalsaraj/browserforce
 ```
-```
-  BrowserForce
-  ────────────────────────────────────────
-  Status:   http://127.0.0.1:19222/
-  CDP:      ws://127.0.0.1:19222/cdp?token=<TOKEN>
-  ────────────────────────────────────────
+Then start the relay (keep this running):
+```bash
+browserforce serve
 ```
-Extension icon turns green — you're connected.
+**Verify it works** — send this to your agent:
-## Connect Your Agent
+> Go to https://x.com and give me top tweets
-### OpenClaw
+If your agent browses to the page and responds with the title, you're all set.
+<details>
+<summary><b>Alternative: MCP server</b> (advanced)</summary>
-Add BrowserForce as an MCP server in `~/.openclaw/openclaw.json`:
+If you prefer MCP over the skill, add to `~/.openclaw/openclaw.json`:
 ```json
 {
@@ -88,7 +107,7 @@ Add BrowserForce as an MCP server in `~/.openclaw/openclaw.json`:
 }
 ```
-Then add `"mcp-adapter"` to your agent's allowed tools. Your OpenClaw agent can now browse the web as you — no login flows, no captchas.
+</details>
 ### Claude Desktop
@@ -138,16 +157,6 @@ browserforce -e "<code>"        # Run Playwright JavaScript (one-shot)
 Each `-e` command is one-shot — state does not persist between calls. For persistent state, use the MCP server.
-### OpenClaw Skill
-Install the skill directly:
-```bash
-npx -y skills add ivalsaraj/browserforce
-```
-Or add to your agent config manually — the skill teaches the agent to use BrowserForce CLI commands via Bash.
 ### Any Playwright Script
 ```javascript
@@ -201,8 +210,39 @@ state.results = await page.evaluate(() => document.title);
 | Tool | Description |
 |------|-------------|
 | `execute` | Run Playwright JavaScript in your real Chrome. Access `page`, `context`, `state`, `snapshot()`, `waitForPageLoad()`, `getLogs()`, and Node.js globals. |
+| `screenshot_with_labels` | Take a screenshot with Vimium-style accessibility labels overlaid on interactive elements. |
 | `reset` | Reconnect to the relay and clear state. Use when the connection drops. |
+## Examples
+Get started with simple prompts. The AI generates code and does the work.
+<details>
+<summary><b>Example 1: Read page content (X.com search)</b></summary>
+**Prompt to AI:**
+> Go to x.com/search and search for "browserforce". Show me the top 5 tweets you find.
+**What the AI does:** Navigates to X, searches the term, extracts top tweets, returns them to you.
+**Use case:** Quick research, trend tracking, social listening.
+</details>
+<details>
+<summary><b>Example 2: Interact with a form (GitHub search)</b></summary>
+**Prompt to AI:**
+> Go to GitHub and search for "ai agents". Show me the top 3 repositories and their star counts.
+**What the AI does:** Fills GitHub search, waits for results, extracts repo names + stars, returns them.
+**Use case:** Finding libraries, competitive research, project discovery.
+</details>
+**8+ more examples** available in the [User Guide](GUIDE.md#examples).
 ## How It Works
 ```
@@ -230,16 +270,19 @@ The **Chrome extension** lives in your browser. It attaches Chrome's built-in de
 When the agent connects, it immediately sees all your open tabs as controllable Playwright pages. No clicking, no manual attachment.
-## Extension Settings
+## You Stay in Control
-Click the extension icon to configure:
+Click the extension icon to configure restrictions. Your browser, your rules:
-- **Auto / Manual mode** — Let the agent create tabs freely, or manually select which tabs it can access
-- **Lock URL** — Prevent the agent from navigating away from the current page
-- **No new tabs** — Block tab creation
-- **Read-only** — Observe only, no interactions
-- **Auto-cleanup** — Automatically detach or close agent tabs after a timeout
-- **Custom instructions** — Pass text instructions to the agent
+| Setting | What it does |
+|---------|-------------|
+| **Auto / Manual mode** | Let the agent create tabs freely, or hand-pick which tabs it can access |
+| **Lock URL** | Prevent the agent from navigating away from the current page |
+| **No new tabs** | Block the agent from opening new tabs |
+| **Read-only** | Observe only — no clicks, no typing, no interactions |
+| **Auto-detach** | Automatically detach inactive tabs after 5-60 minutes |
+| **Auto-close** | Automatically close agent-created tabs after 5-60 minutes |
+| **Custom instructions** | Pass text instructions to the agent (e.g. "don't click any buy buttons") |
 ## Security
@@ -249,6 +292,7 @@ Click the extension icon to configure:
 | **Auth** | Random token required for every CDP connection |
 | **Origin** | Extension only accepts connections from its own Chrome origin |
 | **Visibility** | Chrome shows "controlled by automated test software" on active tabs |
+| **Restrictions** | Lock URLs, block navigation, read-only mode — enforced at the CDP level |
 Everything runs on your machine. The auth token is stored at `~/.browserforce/auth-token` with owner-only permissions.
@@ -280,38 +324,6 @@ RELAY_PORT=19333 browserforce serve
 | `ws://.../extension` | Chrome extension WebSocket |
 | `ws://.../cdp?token=...` | Agent CDP connection |
-## Comparison
-### vs Playwright MCP
-| | Playwright MCP | BrowserForce |
-|---|---|---|
-| Browser | Spawns new Chrome | **Uses your Chrome** |
-| Login state | Fresh — must log in every time | Already logged in |
-| Extensions | None | Your existing ones |
-| Bot detection | Always detected | Runs in your real profile |
-| Memory | Double (two Chrome instances) | Uses existing Chrome |
-### vs Claude Browser Extension
-| | Claude Extension | BrowserForce |
-|---|---|---|
-| Agent support | Claude only | **Any MCP client** (OpenClaw, Claude, custom) |
-| Context method | Screenshots (100KB+) | Accessibility snapshots (5-20KB) |
-| Playwright API | No | Full |
-| Network interception | Limited | Full |
-| Raw CDP access | No | Yes |
-### vs Antigravity (Jetski)
-| | Jetski | BrowserForce |
-|---|---|---|
-| Tools | 17+ tools | 1 `execute` tool |
-| Approach | Spawns subagent for browser tasks | Direct execution |
-| Latency | High (agent overhead) | Low |
-| LLM knowledge | Must learn custom tools | Already knows Playwright |
-| Context usage | High (many tool schemas) | Low |
 ## Troubleshooting
 | Problem | Fix |

package/bin.js CHANGED Viewed

@@ -43,7 +43,8 @@ function httpGet(url) {
 }
 async function connectBrowser() {
-  const { getCdpUrl } = await import('./mcp/src/exec-engine.js');
+  const { getCdpUrl, ensureRelay } = await import('./mcp/src/exec-engine.js');
+  await ensureRelay();
   // playwright-core lives in mcp/node_modules (pnpm workspace sub-package).
   // Use createRequire from the mcp package context to locate it, then dynamic-import.
   const { createRequire } = await import('node:module');

package/mcp/src/a11y-labels.js ADDED Viewed

@@ -0,0 +1,466 @@
+// BrowserForce — Accessibility Label Overlay for Screenshots
+// Uses CDP AX tree + DOM.getBoxModel for element positioning,
+// then injects browser-side label renderer for Vimium-style labels.
+import {
+  INTERACTIVE_ROLES, CONTEXT_ROLES, SKIP_ROLES,
+  buildLocator, escapeLocatorName,
+} from './snapshot.js';
+// ─── Semaphore ────────────────────────────────────────────────────────────────
+export class Semaphore {
+  constructor(max) {
+    this.max = max;
+    this.count = 0;
+    this.queue = [];
+  }
+  acquire() {
+    if (this.count < this.max) {
+      this.count++;
+      return Promise.resolve();
+    }
+    return new Promise(resolve => this.queue.push(resolve));
+  }
+  release() {
+    this.count--;
+    if (this.queue.length > 0) {
+      this.count++;
+      this.queue.shift()();
+    }
+  }
+}
+// ─── Box Model ────────────────────────────────────────────────────────────────
+export function buildBoxFromQuad(quad) {
+  const xs = [quad[0], quad[2], quad[4], quad[6]];
+  const ys = [quad[1], quad[3], quad[5], quad[7]];
+  return {
+    x: Math.min(...xs),
+    y: Math.min(...ys),
+    width: Math.max(...xs) - Math.min(...xs),
+    height: Math.max(...ys) - Math.min(...ys),
+  };
+}
+// ─── CDP AX Tree → Snapshot Text ──────────────────────────────────────────────
+export function buildSnapshotFromCdpNodes(nodes, scopeBackendNodeId, { refAll = false } = {}) {
+  // Build lookup maps
+  const byId = new Map();
+  for (const node of nodes) {
+    byId.set(node.nodeId, node);
+  }
+  // Find scope root if scoping requested
+  let scopeNodeId = null;
+  if (scopeBackendNodeId != null) {
+    const scopeNode = nodes.find(n => n.backendDOMNodeId === scopeBackendNodeId);
+    if (!scopeNode) {
+      throw new Error(
+        `Scoped element (backendNodeId=${scopeBackendNodeId}) has no matching accessibility node. ` +
+        'The element may be aria-hidden or have no accessible role.'
+      );
+    }
+    scopeNodeId = scopeNode.nodeId;
+  }
+  // Build tree structure from parentId references
+  const childrenMap = new Map();
+  for (const node of nodes) {
+    if (!childrenMap.has(node.nodeId)) childrenMap.set(node.nodeId, []);
+    if (node.parentId) {
+      if (!childrenMap.has(node.parentId)) childrenMap.set(node.parentId, []);
+      childrenMap.get(node.parentId).push(node);
+    }
+  }
+  // Check if a nodeId is inside the scope subtree
+  function isInScope(nodeId) {
+    if (!scopeNodeId) return true;
+    let current = nodeId;
+    while (current) {
+      if (current === scopeNodeId) return true;
+      const node = byId.get(current);
+      current = node?.parentId || null;
+    }
+    return false;
+  }
+  const lines = [];
+  const refs = [];
+  let refCounter = 0;
+  function walk(nodeId, depth) {
+    const node = byId.get(nodeId);
+    if (!node || node.ignored) return;
+    const role = node.role?.value;
+    if (!role) return;
+    // Skip root wrapper and generic roles — recurse into children
+    if (role === 'RootWebArea' || role === 'WebArea') {
+      for (const child of childrenMap.get(nodeId) || []) {
+        walk(child.nodeId, depth);
+      }
+      return;
+    }
+    if (SKIP_ROLES.has(role)) {
+      for (const child of childrenMap.get(nodeId) || []) {
+        walk(child.nodeId, depth);
+      }
+      return;
+    }
+    // Check scope
+    if (!isInScope(nodeId)) return;
+    const isInteractive = INTERACTIVE_ROLES.has(role);
+    const isContext = CONTEXT_ROLES.has(role);
+    const name = node.name?.value || '';
+    const children = childrenMap.get(nodeId) || [];
+    // Skip non-interactive, non-context nodes without interactive descendants
+    if (!isInteractive && !isContext) {
+      const hasInteractive = children.some(c => {
+        const r = c.role?.value;
+        return r && INTERACTIVE_ROLES.has(r);
+      });
+      if (!hasInteractive) {
+        // Still recurse — children may have interactive descendants
+        for (const child of children) {
+          walk(child.nodeId, depth);
+        }
+        return;
+      }
+    }
+    const indent = '  '.repeat(depth);
+    let lineText = `${indent}- ${role}`;
+    if (name) {
+      lineText += ` "${escapeLocatorName(name)}"`;
+    }
+    if (isInteractive || (refAll && isContext && node.backendDOMNodeId)) {
+      refCounter++;
+      const ref = `e${refCounter}`;
+      const locator = buildLocator(role, name, null);
+      lineText += ` [ref=${ref}]`;
+      refs.push({ ref, role, name, backendNodeId: node.backendDOMNodeId, locator });
+    }
+    const relevantChildren = children.filter(c => {
+      const r = c.role?.value;
+      return r && !c.ignored && (INTERACTIVE_ROLES.has(r) || CONTEXT_ROLES.has(r) || !SKIP_ROLES.has(r));
+    });
+    if (relevantChildren.length > 0) {
+      lineText += ':';
+    }
+    lines.push(lineText);
+    for (const child of children) {
+      walk(child.nodeId, depth + 1);
+    }
+  }
+  // Find root nodes (no parentId)
+  const roots = nodes.filter(n => !n.parentId);
+  for (const root of roots) {
+    walk(root.nodeId, 0);
+  }
+  return { text: lines.join('\n'), refs };
+}
+// ─── Browser-Side Label Renderer ──────────────────────────────────────────────
+// Injected into page context via page.evaluate().
+// Port of Playwriter's a11y-client.ts — Vimium-style color-coded labels.
+const LABELS_CONTAINER_ID = '__bf_labels__';
+const LABELS_TIMER_KEY = '__bf_labels_timer__';
+export const ROLE_COLORS = {
+  link: ['#FFF785', '#FFC542', '#E3BE23'],
+  button: ['#FFE0B2', '#FFCC80', '#FFB74D'],
+  textbox: ['#FFCDD2', '#EF9A9A', '#E57373'],
+  combobox: ['#F8BBD0', '#F48FB1', '#F06292'],
+  searchbox: ['#F8BBD0', '#F48FB1', '#F06292'],
+  checkbox: ['#C8E6C9', '#A5D6A7', '#81C784'],
+  radio: ['#C8E6C9', '#A5D6A7', '#81C784'],
+  slider: ['#BBDEFB', '#90CAF9', '#64B5F6'],
+  spinbutton: ['#BBDEFB', '#90CAF9', '#64B5F6'],
+  switch: ['#D1C4E9', '#B39DDB', '#9575CD'],
+  menuitem: ['#FFE0B2', '#FFCC80', '#FFB74D'],
+  menuitemcheckbox: ['#FFE0B2', '#FFCC80', '#FFB74D'],
+  menuitemradio: ['#FFE0B2', '#FFCC80', '#FFB74D'],
+  option: ['#FFE0B2', '#FFCC80', '#FFB74D'],
+  tab: ['#FFE0B2', '#FFCC80', '#FFB74D'],
+  treeitem: ['#FFE0B2', '#FFCC80', '#FFB74D'],
+  img: ['#B3E5FC', '#81D4FA', '#4FC3F7'],
+  video: ['#B3E5FC', '#81D4FA', '#4FC3F7'],
+  audio: ['#B3E5FC', '#81D4FA', '#4FC3F7'],
+};
+const DEFAULT_COLORS = ['#FFF9C4', '#FFF59D', '#FFEB3B'];
+// This string is injected into the page via page.evaluate().
+// It sets up globalThis.__bf_a11y with renderA11yLabels and hideA11yLabels.
+export const A11Y_CLIENT_CODE = `
+(function() {
+  const CONTAINER_ID = '${LABELS_CONTAINER_ID}';
+  const TIMER_KEY = '${LABELS_TIMER_KEY}';
+  const ROLE_COLORS = ${JSON.stringify(ROLE_COLORS)};
+  const DEFAULT_COLORS = ${JSON.stringify(DEFAULT_COLORS)};
+  function renderA11yLabels(labels) {
+    const doc = document;
+    const win = window;
+    if (win[TIMER_KEY]) {
+      win.clearTimeout(win[TIMER_KEY]);
+      win[TIMER_KEY] = null;
+    }
+    doc.getElementById(CONTAINER_ID)?.remove();
+    const container = doc.createElement('div');
+    container.id = CONTAINER_ID;
+    container.style.cssText = 'position:absolute;left:0;top:0;z-index:2147483647;pointer-events:none;';
+    const style = doc.createElement('style');
+    style.textContent = '.__bf_label__{position:absolute;font:bold 12px Helvetica,Arial,sans-serif;padding:1px 4px;border-radius:3px;color:black;text-shadow:0 1px 0 rgba(255,255,255,0.6);white-space:nowrap;}';
+    container.appendChild(style);
+    const svg = doc.createElementNS('http://www.w3.org/2000/svg', 'svg');
+    svg.style.cssText = 'position:absolute;left:0;top:0;pointer-events:none;overflow:visible;';
+    svg.setAttribute('width', '' + doc.documentElement.scrollWidth);
+    svg.setAttribute('height', '' + doc.documentElement.scrollHeight);
+    const defs = doc.createElementNS('http://www.w3.org/2000/svg', 'defs');
+    svg.appendChild(defs);
+    const markerCache = {};
+    function getArrowMarkerId(color) {
+      if (markerCache[color]) return markerCache[color];
+      const markerId = 'bf-arrow-' + color.replace('#', '');
+      const marker = doc.createElementNS('http://www.w3.org/2000/svg', 'marker');
+      marker.setAttribute('id', markerId);
+      marker.setAttribute('viewBox', '0 0 10 10');
+      marker.setAttribute('refX', '9');
+      marker.setAttribute('refY', '5');
+      marker.setAttribute('markerWidth', '6');
+      marker.setAttribute('markerHeight', '6');
+      marker.setAttribute('orient', 'auto-start-reverse');
+      const path = doc.createElementNS('http://www.w3.org/2000/svg', 'path');
+      path.setAttribute('d', 'M 0 0 L 10 5 L 0 10 z');
+      path.setAttribute('fill', color);
+      marker.appendChild(path);
+      defs.appendChild(marker);
+      markerCache[color] = markerId;
+      return markerId;
+    }
+    container.appendChild(svg);
+    const placedLabels = [];
+    const LABEL_HEIGHT = 17;
+    const LABEL_CHAR_WIDTH = 7;
+    const viewportLeft = win.scrollX;
+    const viewportTop = win.scrollY;
+    const viewportRight = viewportLeft + win.innerWidth;
+    const viewportBottom = viewportTop + win.innerHeight;
+    let count = 0;
+    for (const item of labels) {
+      const ref = item.ref;
+      const role = item.role;
+      const box = item.box;
+      const rectLeft = box.x;
+      const rectTop = box.y;
+      const rectRight = rectLeft + box.width;
+      const rectBottom = rectTop + box.height;
+      if (box.width <= 0 || box.height <= 0) continue;
+      if (rectRight < viewportLeft || rectLeft > viewportRight ||
+          rectBottom < viewportTop || rectTop > viewportBottom) continue;
+      const labelWidth = ref.length * LABEL_CHAR_WIDTH + 8;
+      const labelLeft = rectLeft;
+      const labelTop = Math.max(0, rectTop - LABEL_HEIGHT);
+      const labelRect = { left: labelLeft, top: labelTop, right: labelLeft + labelWidth, bottom: labelTop + LABEL_HEIGHT };
+      let overlaps = false;
+      for (const placed of placedLabels) {
+        if (labelRect.left < placed.right && labelRect.right > placed.left &&
+            labelRect.top < placed.bottom && labelRect.bottom > placed.top) {
+          overlaps = true;
+          break;
+        }
+      }
+      if (overlaps) continue;
+      const colors = ROLE_COLORS[role] || DEFAULT_COLORS;
+      const label = doc.createElement('div');
+      label.className = '__bf_label__';
+      label.textContent = ref;
+      label.style.background = 'linear-gradient(to bottom, ' + colors[0] + ' 0%, ' + colors[1] + ' 100%)';
+      label.style.border = '1px solid ' + colors[2];
+      label.style.left = labelLeft + 'px';
+      label.style.top = labelTop + 'px';
+      container.appendChild(label);
+      const line = doc.createElementNS('http://www.w3.org/2000/svg', 'line');
+      line.setAttribute('x1', '' + (labelLeft + labelWidth / 2));
+      line.setAttribute('y1', '' + (labelTop + LABEL_HEIGHT));
+      line.setAttribute('x2', '' + (rectLeft + box.width / 2));
+      line.setAttribute('y2', '' + (rectTop + box.height / 2));
+      line.setAttribute('stroke', colors[2]);
+      line.setAttribute('stroke-width', '1.5');
+      line.setAttribute('marker-end', 'url(#' + getArrowMarkerId(colors[2]) + ')');
+      svg.appendChild(line);
+      placedLabels.push(labelRect);
+      count++;
+    }
+    doc.documentElement.appendChild(container);
+    win[TIMER_KEY] = win.setTimeout(function() {
+      doc.getElementById(CONTAINER_ID)?.remove();
+      win[TIMER_KEY] = null;
+    }, 30000);
+    return count;
+  }
+  function hideA11yLabels() {
+    if (window['${LABELS_TIMER_KEY}']) {
+      window.clearTimeout(window['${LABELS_TIMER_KEY}']);
+      window['${LABELS_TIMER_KEY}'] = null;
+    }
+    document.getElementById('${LABELS_CONTAINER_ID}')?.remove();
+  }
+  globalThis.__bf_a11y = { renderA11yLabels: renderA11yLabels, hideA11yLabels: hideA11yLabels };
+})();
+`;
+// ─── CDP Helpers ──────────────────────────────────────────────────────────────
+const MAX_CONCURRENCY = 24;
+const BOX_MODEL_TIMEOUT_MS = 5000;
+const MAX_SCREENSHOT_DIMENSION = 1568;
+export async function resolveScopeBackendNodeId(cdp, selector) {
+  if (!selector) return null;
+  const { root } = await cdp.send('DOM.getDocument');
+  const { nodeId } = await cdp.send('DOM.querySelector', {
+    nodeId: root.nodeId, selector,
+  });
+  if (!nodeId) {
+    throw new Error(`Selector "${selector}" did not match any element on the page`);
+  }
+  const { node } = await cdp.send('DOM.describeNode', { nodeId });
+  return node.backendNodeId;
+}
+export async function getLabelBoxes(cdp, refs) {
+  const sema = new Semaphore(MAX_CONCURRENCY);
+  const results = await Promise.all(
+    refs.map(async (ref) => {
+      if (!ref.backendNodeId) return null;
+      await sema.acquire();
+      try {
+        const response = await Promise.race([
+          cdp.send('DOM.getBoxModel', { backendNodeId: ref.backendNodeId }),
+          new Promise(resolve => setTimeout(() => resolve(null), BOX_MODEL_TIMEOUT_MS)),
+        ]);
+        if (!response) return null;
+        const box = buildBoxFromQuad(response.model.border);
+        if (box.width <= 0 || box.height <= 0) return null;
+        return { ref: ref.ref, role: ref.role, box };
+      } catch {
+        return null;
+      } finally {
+        sema.release();
+      }
+    })
+  );
+  return results.filter(Boolean);
+}
+export async function injectA11yClient(page) {
+  const exists = await page.evaluate(() => typeof globalThis.__bf_a11y !== 'undefined');
+  if (!exists) {
+    await page.evaluate(A11Y_CLIENT_CODE);
+  }
+}
+export async function showLabels(page, labels) {
+  return page.evaluate((entries) => globalThis.__bf_a11y.renderA11yLabels(entries), labels);
+}
+export async function hideLabels(page) {
+  await page.evaluate(() => {
+    const timerKey = '__bf_labels_timer__';
+    if (window[timerKey]) {
+      window.clearTimeout(window[timerKey]);
+      window[timerKey] = null;
+    }
+    document.getElementById('__bf_labels__')?.remove();
+  });
+}
+// ─── Main Orchestrator ────────────────────────────────────────────────────────
+export async function screenshotWithLabels(page, { selector, interactiveOnly = true } = {}) {
+  let cdp;
+  let labelsInjected = false;
+  try {
+    cdp = await page.context().newCDPSession(page);
+    const scopeId = selector
+      ? await resolveScopeBackendNodeId(cdp, selector)
+      : null;
+    const { nodes } = await cdp.send('Accessibility.getFullAXTree');
+    const { text, refs } = buildSnapshotFromCdpNodes(nodes, scopeId, {
+      refAll: !interactiveOnly,
+    });
+    const labels = await getLabelBoxes(cdp, refs);
+    await injectA11yClient(page);
+    labelsInjected = true;
+    const labelCount = await showLabels(page, labels);
+    const maxDim = MAX_SCREENSHOT_DIMENSION;
+    const viewport = await page.evaluate((max) => ({
+      width: Math.min(window.innerWidth, max),
+      height: Math.min(window.innerHeight, max),
+    }), maxDim);
+    const screenshot = await page.screenshot({
+      type: 'jpeg',
+      quality: 80,
+      scale: 'css',
+      clip: { x: 0, y: 0, ...viewport },
+    });
+    return { screenshot, snapshot: text, labelCount };
+  } finally {
+    if (labelsInjected) {
+      try { await hideLabels(page); } catch { /* page may have navigated */ }
+    }
+    if (cdp) {
+      try { await cdp.detach(); } catch { /* session may already be detached */ }
+    }
+  }
+}

package/mcp/src/exec-engine.js CHANGED Viewed

@@ -4,6 +4,8 @@
 import { readFileSync } from 'node:fs';
 import { join } from 'node:path';
 import { homedir } from 'node:os';
+import { fileURLToPath } from 'node:url';
+import { spawn } from 'node:child_process';
 import {
   TEST_ID_ATTRS,
   buildSnapshotText, parseSearchPattern, annotateStableAttrs,
@@ -11,8 +13,10 @@ import {
 // ─── Configuration ───────────────────────────────────────────────────────────
+const DEFAULT_PORT = 19222;
 export const BF_DIR = join(homedir(), '.browserforce');
 export const CDP_URL_FILE = join(BF_DIR, 'cdp-url');
+const RELAY_SCRIPT = fileURLToPath(new URL('../../relay/src/index.js', import.meta.url));
 export function getCdpUrl() {
   if (process.env.BF_CDP_URL) return process.env.BF_CDP_URL;
@@ -34,10 +38,59 @@ export function getRelayHttpUrl() {
     const parsed = new URL(cdpUrl);
     return `http://${parsed.hostname}:${parsed.port}`;
   } catch {
-    return 'http://127.0.0.1:19222';
+    return `http://127.0.0.1:${DEFAULT_PORT}`;
   }
 }
+// ─── Auto-start relay ───────────────────────────────────────────────────────
+function getRelayPort() {
+  if (process.env.RELAY_PORT) return parseInt(process.env.RELAY_PORT, 10);
+  try {
+    const url = readFileSync(CDP_URL_FILE, 'utf8').trim();
+    if (url) {
+      const port = new URL(url).port;
+      if (port) return parseInt(port, 10);
+    }
+  } catch { /* fall through */ }
+  return DEFAULT_PORT;
+}
+async function isRelayRunning(port) {
+  try {
+    const res = await fetch(`http://127.0.0.1:${port}/`, {
+      signal: AbortSignal.timeout(500),
+    });
+    return res.ok;
+  } catch { return false; }
+}
+/**
+ * Ensure the relay server is running. If not, spawn it as a detached
+ * background process and wait for it to become reachable.
+ */
+export async function ensureRelay() {
+  const port = getRelayPort();
+  if (await isRelayRunning(port)) return;
+  const child = spawn(process.execPath, [RELAY_SCRIPT], {
+    detached: true,
+    stdio: 'ignore',
+    env: { ...process.env, RELAY_PORT: String(port) },
+  });
+  child.unref();
+  const deadline = Date.now() + 5000;
+  while (Date.now() < deadline) {
+    await new Promise(r => setTimeout(r, 200));
+    if (await isRelayRunning(port)) {
+      process.stderr.write('[browserforce] Relay auto-started\n');
+      return;
+    }
+  }
+  throw new Error('Failed to auto-start relay server within 5s');
+}
 // ─── Smart Page Load Detection ───────────────────────────────────────────────
 // Filters analytics/ad requests that never finish, polls document.readyState +
 // pending resource count.

package/mcp/src/index.js CHANGED Viewed

@@ -1,5 +1,5 @@
 // BrowserForce — MCP Server
-// 2-tool architecture: execute (run Playwright code) + reset (reconnect)
+// 3-tool architecture: execute (run Playwright code) + reset (reconnect) + screenshot_with_labels (visual a11y labels)
 // Connects to the relay via Playwright's CDP client.
 import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
@@ -7,8 +7,9 @@ import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js'
 import { z } from 'zod';
 import { chromium } from 'playwright-core';
 import {
-  getCdpUrl, CodeExecutionTimeoutError, buildExecContext, runCode, formatResult,
+  getCdpUrl, ensureRelay, CodeExecutionTimeoutError, buildExecContext, runCode, formatResult,
 } from './exec-engine.js';
+import { screenshotWithLabels } from './a11y-labels.js';
 // ─── Console Log Capture ─────────────────────────────────────────────────────
@@ -64,6 +65,7 @@ let browser = null;
 async function ensureBrowser() {
   if (browser?.isConnected()) return;
+  await ensureRelay();
   const cdpUrl = getCdpUrl();
   browser = await chromium.connectOverCDP(cdpUrl);
   browser.on('disconnected', () => {
@@ -350,6 +352,79 @@ server.tool(
   }
 );
+// ─── Screenshot with Labels Tool ──────────────────────────────────────────────
+const SCREENSHOT_LABELS_PROMPT = `Take a screenshot with Vimium-style accessibility labels on interactive elements.
+Returns TWO content items:
+1. JPEG screenshot with color-coded labels (e1, e2, e3...) on buttons, links, inputs, etc.
+2. Text accessibility snapshot with matching refs and role/name locators
+Labels are color-coded by role:
+- Yellow: links
+- Orange: buttons, menu items, tabs
+- Red/pink: text inputs, search boxes
+- Green: checkboxes, radio buttons
+- Blue: sliders, spinbuttons, media
+- Purple: switches
+Use this tool when:
+- You need to understand the visual layout of a page
+- Text snapshot alone can't convey spatial relationships
+- You need to verify element positions (dashboards, grids, maps)
+- You need both visual context AND element refs for interaction
+After getting the screenshot, use the refs to interact via the execute tool:
+  await state.page.locator('role=button[name="Submit"]').click();
+Parameters:
+- selector: CSS selector to scope labels to part of the page (e.g., '#main', '.sidebar'). Main frame only.
+- interactiveOnly: Only label interactive elements like buttons/links/inputs (default: true)
+Limitations:
+- Main frame only — does not label elements inside cross-origin iframes
+- Locators are role/name based — no data-testid matching`;
+server.tool(
+  'screenshot_with_labels',
+  SCREENSHOT_LABELS_PROMPT,
+  {
+    selector: z.string().optional().describe('CSS selector to scope labels to a subtree of the main frame'),
+    interactiveOnly: z.boolean().optional().describe('Only label interactive elements (default: true)'),
+  },
+  async ({ selector, interactiveOnly = true }) => {
+    await ensureBrowser();
+    const ctx = getContext();
+    const page = (userState.page && !userState.page.isClosed())
+      ? userState.page
+      : ctx.pages()[0] || null;
+    if (!page) {
+      return {
+        content: [{ type: 'text', text: 'Error: No pages available. Open a tab first.' }],
+        isError: true,
+      };
+    }
+    try {
+      const { screenshot, snapshot, labelCount } = await screenshotWithLabels(page, {
+        selector,
+        interactiveOnly,
+      });
+      return {
+        content: [
+          { type: 'image', data: screenshot.toString('base64'), mimeType: 'image/jpeg' },
+          { type: 'text', text: `Labels: ${labelCount} interactive elements\n\n${snapshot}` },
+        ],
+      };
+    } catch (err) {
+      return {
+        content: [{ type: 'text', text: `Error: ${err.message}` }],
+        isError: true,
+      };
+    }
+  }
+);
 // ─── Start Server ────────────────────────────────────────────────────────────
 async function main() {

package/package.json CHANGED Viewed

@@ -1,8 +1,8 @@
 {
   "name": "browserforce",
-  "version": "1.0.2",
+  "version": "1.0.6",
   "type": "module",
-  "description": "Give AI agents your real Chrome browser — your logins, cookies, and tabs. Works with OpenClaw, Claude, and any MCP agent.",
+  "description": "Give AI agents your real Chrome browser with progressive examples: simple reads, form interactions, multi-tab workflows, and state persistence. Search X and GitHub, extract ProductHunt data, test forms, compare A/B variants, monitor status pages. Works with OpenClaw, Claude, and any MCP agent.",
   "homepage": "https://github.com/ivalsaraj/browserforce",
   "repository": {
     "type": "git",

package/skills/browserforce/SKILL.md CHANGED Viewed

@@ -19,8 +19,8 @@ cookies, and extensions already active. No headless browser, no fresh profiles.
 ## Prerequisites
 The user must have:
-1. BrowserForce relay running: `browserforce serve`
-2. BrowserForce Chrome extension installed and connected (green icon)
+1. BrowserForce Chrome extension installed and connected (green icon)
+2. The relay auto-starts on first command — no manual step needed
 Check with: `browserforce status`