npm - @btraut/browser-bridge - Versions diffs - 0.6.0 → 0.7.0 - Mend

@btraut/browser-bridge 0.6.0 → 0.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (19)

package/CHANGELOG.md +18 -3
package/README.md +150 -70
package/dist/api.js +4 -3
package/dist/api.js.map +2 -2
package/dist/index.js +4 -3
package/dist/index.js.map +2 -2
package/extension/assets/ui.css +15 -14
package/extension/dist/background.js +9 -15
package/extension/dist/background.js.map +2 -2
package/extension/dist/content.js +10 -0
package/extension/dist/content.js.map +2 -2
package/extension/dist/options-ui.js +22 -17
package/extension/dist/options-ui.js.map +2 -2
package/extension/dist/popup-ui.js +34 -18
package/extension/dist/popup-ui.js.map +2 -2
package/extension/manifest.json +1 -1
package/package.json +1 -1
package/skills/browser-bridge/SKILL.md +28 -0
package/skills/browser-bridge/skill.json +1 -1

package/CHANGELOG.md CHANGED

@@ -6,17 +6,32 @@ The format is based on "Keep a Changelog", and this project adheres to Semantic
 ## [Unreleased]
+_TBD_
+## [0.7.0] - 2026-02-10
+### Fixed
+- MCP adapter: avoid SDK `_zod` crashes on tool calls by registering object-shaped output schemas and flagging `ok: false` envelopes as MCP errors.
+## [0.6.1] - 2026-02-10
 ### Added
-_TBD_
+- README: competitor feature comparison table.
 ### Fixed
-_TBD_
+- Extension popup menu: Settings/About always open in a new tab/window (no more crushing the UI inside the popup).
+- Extension options: default permissions mode to Granular when unset, and show a real empty state for the approved sites allowlist.
+- Extension options: remove the nested-card empty state styling, simplify the copy, and always show the Approved sites disclosure + list in both Granular and Bypass modes.
+- Extension options: add a drop shadow to the permission mode controls to match the rest of the settings containers.
+- Extension options: remove the "Bypass mode is intentionally unsafe" warning box.
+- Extension options: tighten and vertically align the Approved sites disclosure triangle.
 ### Changed
-_TBD_
+- Expand `scripts/cli-full-tool-smoke.sh` coverage (health-check, locator variants, ref reuse, more dom-snapshot modes, more screenshot options).
 ## [0.6.0] - 2026-02-09

package/README.md CHANGED

@@ -8,85 +8,53 @@
 Browser Bridge drives your real, local Chrome (not headless) and inspects page state through a Chrome extension plus a local daemon. You stay in the loop with your existing tabs and login state.
-What makes it different:
-- **Real browser state**: operate on your actual Chrome profile (tabs, cookies, logins, extensions).
-- **Two-plane architecture**: a **drive** plane that does what a user does (click, type, navigate), plus an **inspect** plane that reads state (DOM, console, screenshots). This separation makes runs less flaky and lets inspection happen in parallel.
-- **Token-efficient inspection**: stable element refs like `@e1` (find once, reuse everywhere) plus knobs to bound output (`--max-nodes`, `--compact`, `--interactive`, `--selector`).
-- **Structured errors for agents**: stable error codes with a `retryable` flag (no more guessing whether to retry).
-- **Recovery-first**: sessions have an explicit state machine with `session.recover()` and `diagnostics doctor`.
-- **Inspect beyond screenshots**: DOM snapshots (AX + HTML) and `inspect dom-diff` to detect page changes.
-## Why Browser Bridge
-Browser Bridge is built for agent reliability and "stay logged in" workflows in your real Chrome, not for headless test automation.
-If you're coming from Playwright/Puppeteer-style tooling:
-- Browser Bridge targets the user's existing, interactive Chrome session by default (typical Playwright/Puppeteer flows spin up a separate browser/context).
-- Browser Bridge surfaces retry guidance in the API (`retryable`) instead of forcing the agent to infer it from exceptions and timing.
-- Browser Bridge ships a first-class inspect plane (DOM snapshots, diffs, diagnostics) designed for LLM consumption, with output-bounding options to keep agent context small.
-If you're coming from an extension-only MCP tool:
-- Browser Bridge puts a stateful local Core daemon behind the tools (sessions, recovery, diagnostics, artifacts).
-- Drive actions are serialized for determinism; inspect is a separate plane that can keep producing structured state.
-- CLI works everywhere; MCP is optional.
-## How It Works
-Core keeps a session state machine and exposes a small set of stable tools:
-- `session.*` - lifecycle + recovery
-- `drive.*` - navigation + input (single-flight)
-- `inspect.*` - DOM snapshots/diffs + evaluation
-- `diagnostics.*` - health checks
-- `artifacts.*` - screenshots
+## 🏁 Install + Quickstart (Do This First)
-## Requirements
+You need Node.js 20+ and Chrome (stable). Browser Bridge is local-only (binds to 127.0.0.1).
-- Node.js 20+
-- Chrome (stable)
-- Browser Bridge extension (Chrome Web Store listing pending; see manual install below)
-- Local-only usage (all services bind to 127.0.0.1)
-## Install (CLI)
+1. Install the CLI:
 ```bash
 npm i -g @btraut/browser-bridge
 browser-bridge --help
 ```
-## Chrome Extension (Manual Install)
+2. Run the installer:
+```bash
+browser-bridge install
+```
-Chrome Web Store listing is pending. For now, install the extension manually:
+Select your client(s) (Codex, Claude, Cursor, etc).
-1. Download the latest pre-built extension zip from [GitHub Releases](https://github.com/btraut/browser-bridge/releases) (Assets), unzip it, and use the unzipped folder for step 3.
+3. Install the Chrome extension:
-Alternative (build from source):
+- Chrome Web Store listing is pending. For now, install manually.
+- Download the latest pre-built extension zip from [GitHub Releases](https://github.com/btraut/browser-bridge/releases) (Assets), unzip it.
+- Chrome -> `chrome://extensions` -> enable **Developer mode** -> **Load unpacked** -> select the folder with `manifest.json`.
-1. Clone this repo.
-2. Install deps and build:
+<details>
+<summary>Build the extension from source (instead of using a release zip)</summary>
 ```bash
 npm install
 npm run build
 ```
-3. Open Chrome and navigate to `chrome://extensions`.
-4. Enable **Developer mode**, click **Load unpacked**, and select the extension folder (the folder with `manifest.json`).
+Then load the unpacked extension from `packages/extension/`.
-Notes:
+</details>
-- Browser Bridge enforces a per-site allowlist for `drive.*` actions by default. The first time it acts on a new site, you'll see a permission prompt.
-- You can review/revoke approved sites (and optionally enable a dangerous bypass mode) via the extension options page (Extensions menu -> Browser Bridge -> Extension options).
-- If you click **Decline**, the command fails with `PERMISSION_DENIED` (non-retryable). If you don't respond in time, you'll see `PERMISSION_PROMPT_TIMEOUT` (retryable once after the user allows).
+4. Try it:
+```text
+Use Browser Bridge to navigate to https://example.com.
+```
-## Quickstart
+If Chrome shows a Browser Bridge permissions prompt, approve it, then tell the agent to retry.
-1. Install the extension.
-2. (Optional) Run `browser-bridge install` (skill + optional MCP).
-3. Run a quick CLI check (Core auto-starts by default):
+<details>
+<summary>CLI sanity check (debugging)</summary>
 ```bash
 browser-bridge session create
@@ -100,7 +68,123 @@ Notes:
 - `inspect dom-snapshot` defaults to `--format ax`; `--max-nodes` is only supported for AX snapshots.
-## Skills (Agent Clients)
+</details>
+## ✨ What You Get
+What makes it different:
+- **Real browser state**: operate on your actual Chrome profile (tabs, cookies, logins, extensions).
+- **Two-plane architecture**: a **drive** plane that does what a user does (click, type, navigate), plus an **inspect** plane that reads state (DOM, console, screenshots). This separation makes runs less flaky and lets inspection happen in parallel.
+- **Safe-by-default drive permissions**: `drive.*` actions are blocked on new sites until you approve them. You can allow once, always allow (per-site allowlist you can audit/revoke), or enable a clearly-labeled bypass mode if you want zero guardrails.
+- **Token-efficient inspection**: stable element refs like `@e1` (find once, reuse everywhere) plus knobs to bound output (`--max-nodes`, `--compact`, `--interactive`, `--selector`).
+- **Structured errors for agents**: stable error codes with a `retryable` flag (no more guessing whether to retry).
+- **Recovery-first**: sessions have an explicit state machine with `session.recover()` and `diagnostics doctor`.
+- **Inspect beyond screenshots**: DOM snapshots (AX + HTML) and `inspect dom-diff` to detect page changes.
+## 🆚 Feature Comparison
+| Category | Browser Bridge | Playwright MCP | agent-browser | mcp-chrome | Claude Code + Chrome |
+| --- | --- | --- | --- | --- | --- |
+| Uses your real, already-logged-in Chrome (tabs/cookies) | ✅ | ❌ | ❌ | ✅ | ✅ |
+| Visible browser (not headless) | ✅ | ✅ | ❌ | ✅ | ✅ |
+| Per-site permission prompts / allowlist | ✅ | ❌ | ❌ | ❌ | ✅ |
+| Drive/Inspect split (inspect without racing input) | ✅ | ❌ | ❌ | ❌ | ❌ |
+| Token-efficient inspection (element refs, bounded output, cleanup) | ✅ | ❌ | ❌ | ❌ | ❌ |
+| Structured errors + retry hints | ✅ | ❌ | ❌ | ❌ | ❌ |
+| Explicit recovery + doctor-style diagnostics | ✅ | ❌ | ❌ | ❌ | ❌ |
+| DOM diff (change detection) | ✅ | ❌ | ❌ | ❌ | ❌ |
+| HAR / network export | ✅ | ✅ | ✅ | ✅ | ❌ |
+| Open source | ✅ | ✅ | ✅ | ✅ | ❌ |
+## 🔒 Site Permissions (Drive Actions)
+Browser Bridge is intentionally safe: **drive actions** (`drive.navigate`, click, type, etc.) require **per-site approval**. `inspect.*` is not gated, so agents can inspect first and only ask for permission when it's time to click/type.
+<details>
+<summary>How approvals work (click to expand)</summary>
+- The first time a `drive.*` action targets a new site, Chrome opens a small permissions prompt.
+- Click **Allow this action** to allow once (no allowlist entry).
+- Click **Always allow actions on this site** to add the site to your approved-sites allowlist.
+- Click **Decline** to fail the command with `PERMISSION_DENIED` (non-retryable).
+- If you ignore the prompt, the command fails with `PERMISSION_PROMPT_TIMEOUT` (retryable). Default wait is 30 seconds; approve the prompt, then retry the command.
+Manage approvals (and bypass mode):
+- Open the extension options page from `chrome://extensions` (Browser Bridge -> **Extension options**) or from the Extensions toolbar menu (Browser Bridge -> **Extension options**).
+- The options page shows your **Approved sites** allowlist with revoke controls.
+- Switch **Permission mode** to **Bypass (dangerous)** to skip the allowlist and prompts entirely.
+- In bypass mode, the agent can take actions on any website without asking.
+- Restricted URLs (for example `chrome://` and `file://`) are still blocked.
+</details>
+## 🧰 Tools (MCP + CLI)
+The CLI mirrors the MCP tool surface.
+<details>
+<summary>All MCP tools (click to expand)</summary>
+**session**
+- `session.create`
+- `session.status`
+- `session.recover`
+- `session.close`
+**drive**
+- `drive.navigate`
+- `drive.go_back`
+- `drive.go_forward`
+- `drive.back`
+- `drive.forward`
+- `drive.click`
+- `drive.hover`
+- `drive.select`
+- `drive.type`
+- `drive.fill_form`
+- `drive.drag`
+- `drive.handle_dialog`
+- `drive.key`
+- `drive.key_press`
+- `drive.scroll`
+- `drive.wait_for`
+- `drive.tab_list`
+- `drive.tab_activate`
+- `drive.tab_close`
+**dialog**
+- `dialog.accept`
+- `dialog.dismiss`
+**inspect**
+- `inspect.dom_snapshot`
+- `inspect.dom_diff`
+- `inspect.find`
+- `inspect.extract_content`
+- `inspect.page_state`
+- `inspect.console_list`
+- `inspect.network_har`
+- `inspect.evaluate`
+- `inspect.performance_metrics`
+**artifacts**
+- `artifacts.screenshot`
+**misc**
+- `health_check`
+- `diagnostics.doctor`
+</details>
+## 🧩 Skills (Agent Clients)
 Browser Bridge skills work across many agent clients, including Codex and Claude Code.
@@ -110,13 +194,6 @@ Easiest option (recommended):
 browser-bridge install
 ```
-Skill only:
-```bash
-browser-bridge skill install
-browser-bridge skill status
-```
 Or copy the Browser Bridge skill into your agent skills directory (advanced):
 ```bash
@@ -131,7 +208,7 @@ cp -R "$(npm root -g)/@btraut/browser-bridge/skills/browser-bridge" ~/.claude/sk
 Restart your agent app if it does not pick up the new skill automatically.
-## MCP Server (Optional)
+## 🧪 MCP Server (Optional)
 The MCP server runs over stdio and forwards tool calls to Core. It is optional, since you can use the CLI directly. MCP clients launch it automatically when needed, so you typically do not run it yourself.
@@ -140,7 +217,8 @@ The MCP server runs over stdio and forwards tool calls to Core. It is optional,
 - Use your MCP client to call `tools/list`, then `session.create`
 - Override Core host/port with `--host`, `--port`, or `BROWSER_BRIDGE_CORE_HOST` / `BROWSER_BRIDGE_CORE_PORT`.
-## Manual MCP Setup (Advanced)
+<details>
+<summary>Manual MCP setup (advanced)</summary>
 Codex:
@@ -172,19 +250,21 @@ claude mcp add --transport stdio browser-bridge \
   -- browser-bridge mcp
 ```
-## Diagnostics
+</details>
+## 🩺 Diagnostics
 - CLI: `browser-bridge diagnostics doctor --session-id <id>`
 - Reports extension and debugger status alongside session state.
-## Recovery
+## 🔧 Recovery
 If drive or inspect gets into a bad state, recovery is explicit:
 - `browser-bridge session recover --session-id <id>`
 - Then retry the failed operation once (tools report whether failures are `retryable`).
-## Session TTL (Core Daemon)
+## 🧹 Session TTL (Core Daemon)
 The Core daemon keeps sessions in memory. By default, it automatically cleans up idle sessions after 1 hour.
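
Note on the Recovery and Site Permissions text in the README diff above: both describe a flow where a failed call is retried at most once, and only when the failure is flagged `retryable`. A minimal sketch of that decision, assuming an envelope shaped like `{ ok: true, result }` or `{ ok: false, error: { code, message, retryable } }` (inferred from the fragments in this diff, not confirmed by it). `nextStep` and its return strings are hypothetical names, not part of the package:

```javascript
// Hypothetical helper: decides what an agent should do with a tool response.
// Assumed envelope shape (not confirmed by the diff):
//   { ok: true, result } | { ok: false, error: { code, message, retryable } }
function nextStep(envelope, retriesSoFar) {
  if (envelope && envelope.ok === true) {
    return "done";
  }
  // Treat a missing or malformed error object as non-retryable.
  const retryable = Boolean(envelope && envelope.error && envelope.error.retryable);
  // The README says to retry the failed operation once, after recovery or approval.
  if (retryable && retriesSoFar < 1) {
    return "recover-and-retry";
  }
  return "give-up";
}

// Example: a prompt timeout is retryable once; a declined prompt is not.
const timedOut = { ok: false, error: { code: "PERMISSION_PROMPT_TIMEOUT", message: "no response", retryable: true } };
const declined = { ok: false, error: { code: "PERMISSION_DENIED", message: "declined", retryable: false } };
console.log(nextStep(timedOut, 0), nextStep(timedOut, 1), nextStep(declined, 0));
```

Keeping the decision in a pure function means the retry cap lives in one place, whichever transport (CLI or MCP) produced the envelope.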

package/dist/api.js CHANGED

@@ -2888,7 +2888,6 @@ var successEnvelopeSchema = (result) => import_zod.z.object({
   ok: import_zod.z.literal(true),
   result
 });
-var apiEnvelopeSchema = (result) => import_zod.z.union([successEnvelopeSchema(result), ErrorEnvelopeSchema]);
 // packages/shared/src/schemas.ts
 var import_zod2 = require("zod");
@@ -4620,9 +4619,11 @@ var createCoreClient = (options = {}) => {
 var toToolResult = (payload) => {
   const content = [{ type: "text", text: JSON.stringify(payload) }];
   if (payload && typeof payload === "object") {
+    const isErrorEnvelope = ErrorEnvelopeSchema.safeParse(payload).success;
     return {
       content,
-      structuredContent: payload
+      structuredContent: payload,
+      isError: isErrorEnvelope
     };
   }
   return { content };
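
Note on the hunk above: the new `isError` field is set from whether the payload parses as an error envelope, which matches the changelog entry about flagging `ok: false` envelopes as MCP errors. The sketch below reproduces that behavior without zod so it runs standalone. `looksLikeErrorEnvelope` is a hypothetical stand-in for `ErrorEnvelopeSchema.safeParse(payload).success`; the real schema is not shown in this diff and may be stricter.

```javascript
// Hypothetical stand-in for ErrorEnvelopeSchema.safeParse(payload).success.
// Assumes an error envelope is an object with ok === false and an error object.
const looksLikeErrorEnvelope = (payload) =>
  payload.ok === false && typeof payload.error === "object" && payload.error !== null;

// Mirrors the shape of toToolResult after the 0.7.0 change.
const toToolResult = (payload) => {
  const content = [{ type: "text", text: JSON.stringify(payload) }];
  if (payload && typeof payload === "object") {
    return {
      content,
      structuredContent: payload,
      // true only for error envelopes; success envelopes get isError: false
      isError: looksLikeErrorEnvelope(payload)
    };
  }
  // Non-object payloads carry text content only.
  return { content };
};

const failure = toToolResult({ ok: false, error: { code: "INTERNAL", message: "boom", retryable: false } });
const success = toToolResult({ ok: true, result: { session_id: "abc" } });
console.log(failure.isError, success.isError);
```

Here `session_id` and the `INTERNAL` code are illustrative values only.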
@@ -4635,7 +4636,7 @@ var toInternalErrorEnvelope = (error) => ({
     retryable: false
   }
 });
-var envelope = (schema) => apiEnvelopeSchema(schema);
+var envelope = (schema) => successEnvelopeSchema(schema);
 var TOOL_DEFINITIONS = [
   {
     name: "session.create",