@btraut/browser-bridge 0.6.0 → 0.7.0

package/CHANGELOG.md CHANGED
@@ -6,17 +6,32 @@ The format is based on "Keep a Changelog", and this project adheres to Semantic
 
  ## [Unreleased]
 
+ _TBD_
+
+ ## [0.7.0] - 2026-02-10
+
+ ### Fixed
+
+ - MCP adapter: avoid SDK `_zod` crashes on tool calls by registering object-shaped output schemas and flagging `ok: false` envelopes as MCP errors.
+
+ ## [0.6.1] - 2026-02-10
+
  ### Added
 
- _TBD_
+ - README: competitor feature comparison table.
 
  ### Fixed
 
- _TBD_
+ - Extension popup menu: Settings/About always open in a new tab/window (no more crushing the UI inside the popup).
+ - Extension options: default permissions mode to Granular when unset, and show a real empty state for the approved sites allowlist.
+ - Extension options: remove the nested-card empty state styling, simplify the copy, and always show the Approved sites disclosure + list in both Granular and Bypass modes.
+ - Extension options: add a drop shadow to the permission mode controls to match the rest of the settings containers.
+ - Extension options: remove the "Bypass mode is intentionally unsafe" warning box.
+ - Extension options: tighten and vertically align the Approved sites disclosure triangle.
 
  ### Changed
 
- _TBD_
+ - Expand `scripts/cli-full-tool-smoke.sh` coverage (health-check, locator variants, ref reuse, more dom-snapshot modes, more screenshot options).
 
  ## [0.6.0] - 2026-02-09
 
package/README.md CHANGED
@@ -8,85 +8,53 @@
 
  Browser Bridge drives your real, local Chrome (not headless) and inspects page state through a Chrome extension plus a local daemon. You stay in the loop with your existing tabs and login state.
 
- What makes it different:
-
- - **Real browser state**: operate on your actual Chrome profile (tabs, cookies, logins, extensions).
- - **Two-plane architecture**: a **drive** plane that does what a user does (click, type, navigate), plus an **inspect** plane that reads state (DOM, console, screenshots). This separation makes runs less flaky and lets inspection happen in parallel.
- - **Token-efficient inspection**: stable element refs like `@e1` (find once, reuse everywhere) plus knobs to bound output (`--max-nodes`, `--compact`, `--interactive`, `--selector`).
- - **Structured errors for agents**: stable error codes with a `retryable` flag (no more guessing whether to retry).
- - **Recovery-first**: sessions have an explicit state machine with `session.recover()` and `diagnostics doctor`.
- - **Inspect beyond screenshots**: DOM snapshots (AX + HTML) and `inspect dom-diff` to detect page changes.
-
- ## Why Browser Bridge
-
- Browser Bridge is built for agent reliability and "stay logged in" workflows in your real Chrome, not for headless test automation.
-
- If you're coming from Playwright/Puppeteer-style tooling:
-
- - Browser Bridge targets the user's existing, interactive Chrome session by default (typical Playwright/Puppeteer flows spin up a separate browser/context).
- - Browser Bridge surfaces retry guidance in the API (`retryable`) instead of forcing the agent to infer it from exceptions and timing.
- - Browser Bridge ships a first-class inspect plane (DOM snapshots, diffs, diagnostics) designed for LLM consumption, with output-bounding options to keep agent context small.
-
- If you're coming from an extension-only MCP tool:
-
- - Browser Bridge puts a stateful local Core daemon behind the tools (sessions, recovery, diagnostics, artifacts).
- - Drive actions are serialized for determinism; inspect is a separate plane that can keep producing structured state.
- - CLI works everywhere; MCP is optional.
-
- ## How It Works
-
- Core keeps a session state machine and exposes a small set of stable tools:
-
- - `session.*` - lifecycle + recovery
- - `drive.*` - navigation + input (single-flight)
- - `inspect.*` - DOM snapshots/diffs + evaluation
- - `diagnostics.*` - health checks
- - `artifacts.*` - screenshots
+ ## 🏁 Install + Quickstart (Do This First)
 
- ## Requirements
+ You need Node.js 20+ and Chrome (stable). Browser Bridge is local-only (binds to 127.0.0.1).
 
- - Node.js 20+
- - Chrome (stable)
- - Browser Bridge extension (Chrome Web Store listing pending; see manual install below)
- - Local-only usage (all services bind to 127.0.0.1)
-
- ## Install (CLI)
+ 1. Install the CLI:
 
  ```bash
  npm i -g @btraut/browser-bridge
  browser-bridge --help
  ```
 
- ## Chrome Extension (Manual Install)
+ 2. Run the installer:
+
+ ```bash
+ browser-bridge install
+ ```
 
- Chrome Web Store listing is pending. For now, install the extension manually:
+ Select your client(s) (Codex, Claude, Cursor, etc).
 
- 1. Download the latest pre-built extension zip from [GitHub Releases](https://github.com/btraut/browser-bridge/releases) (Assets), unzip it, and use the unzipped folder for step 3.
+ 3. Install the Chrome extension:
 
- Alternative (build from source):
+ - Chrome Web Store listing is pending. For now, install manually.
+ - Download the latest pre-built extension zip from [GitHub Releases](https://github.com/btraut/browser-bridge/releases) (Assets), unzip it.
+ - Chrome -> `chrome://extensions` -> enable **Developer mode** -> **Load unpacked** -> select the folder with `manifest.json`.
 
- 1. Clone this repo.
- 2. Install deps and build:
+ <details>
+ <summary>Build the extension from source (instead of using a release zip)</summary>
 
  ```bash
  npm install
  npm run build
  ```
 
- 3. Open Chrome and navigate to `chrome://extensions`.
- 4. Enable **Developer mode**, click **Load unpacked**, and select the extension folder (the folder with `manifest.json`).
+ Then load the unpacked extension from `packages/extension/`.
 
- Notes:
+ </details>
 
- - Browser Bridge enforces a per-site allowlist for `drive.*` actions by default. The first time it acts on a new site, you'll see a permission prompt.
- - You can review/revoke approved sites (and optionally enable a dangerous bypass mode) via the extension options page (Extensions menu -> Browser Bridge -> Extension options).
- - If you click **Decline**, the command fails with `PERMISSION_DENIED` (non-retryable). If you don't respond in time, you'll see `PERMISSION_PROMPT_TIMEOUT` (retryable once after the user allows).
+ 4. Try it:
+
+ ```text
+ Use Browser Bridge to navigate to https://example.com.
+ ```
 
- ## Quickstart
+ If Chrome shows a Browser Bridge permissions prompt, approve it, then tell the agent to retry.
 
- 1. Install the extension.
- 2. (Optional) Run `browser-bridge install` (skill + optional MCP).
- 3. Run a quick CLI check (Core auto-starts by default):
+ <details>
+ <summary>CLI sanity check (debugging)</summary>
 
  ```bash
  browser-bridge session create
@@ -100,7 +68,123 @@ Notes:
 
  - `inspect dom-snapshot` defaults to `--format ax`; `--max-nodes` is only supported for AX snapshots.
 
- ## Skills (Agent Clients)
+ </details>
+
+ ## ✨ What You Get
+
+ What makes it different:
+
+ - **Real browser state**: operate on your actual Chrome profile (tabs, cookies, logins, extensions).
+ - **Two-plane architecture**: a **drive** plane that does what a user does (click, type, navigate), plus an **inspect** plane that reads state (DOM, console, screenshots). This separation makes runs less flaky and lets inspection happen in parallel.
+ - **Safe-by-default drive permissions**: `drive.*` actions are blocked on new sites until you approve them. You can allow once, always allow (per-site allowlist you can audit/revoke), or enable a clearly-labeled bypass mode if you want zero guardrails.
+ - **Token-efficient inspection**: stable element refs like `@e1` (find once, reuse everywhere) plus knobs to bound output (`--max-nodes`, `--compact`, `--interactive`, `--selector`).
+ - **Structured errors for agents**: stable error codes with a `retryable` flag (no more guessing whether to retry).
+ - **Recovery-first**: sessions have an explicit state machine with `session.recover()` and `diagnostics doctor`.
+ - **Inspect beyond screenshots**: DOM snapshots (AX + HTML) and `inspect dom-diff` to detect page changes.
+
+ ## 🆚 Feature Comparison
+
+ | Category | Browser Bridge | Playwright MCP | agent-browser | mcp-chrome | Claude Code + Chrome |
+ | --- | --- | --- | --- | --- | --- |
+ | Uses your real, already-logged-in Chrome (tabs/cookies) | ✅ | ❌ | ❌ | ✅ | ✅ |
+ | Visible browser (not headless) | ✅ | ✅ | ❌ | ✅ | ✅ |
+ | Per-site permission prompts / allowlist | ✅ | ❌ | ❌ | ❌ | ✅ |
+ | Drive/Inspect split (inspect without racing input) | ✅ | ❌ | ❌ | ❌ | ❌ |
+ | Token-efficient inspection (element refs, bounded output, cleanup) | ✅ | ❌ | ❌ | ❌ | ❌ |
+ | Structured errors + retry hints | ✅ | ❌ | ❌ | ❌ | ❌ |
+ | Explicit recovery + doctor-style diagnostics | ✅ | ❌ | ❌ | ❌ | ❌ |
+ | DOM diff (change detection) | ✅ | ❌ | ❌ | ❌ | ❌ |
+ | HAR / network export | ✅ | ✅ | ✅ | ✅ | ❌ |
+ | Open source | ✅ | ✅ | ✅ | ✅ | ❌ |
+
+ ## 🔒 Site Permissions (Drive Actions)
+
+ Browser Bridge is intentionally safe: **drive actions** (`drive.navigate`, click, type, etc.) require **per-site approval**. `inspect.*` is not gated, so agents can inspect first and only ask for permission when it's time to click/type.
+
+ <details>
+ <summary>How approvals work (click to expand)</summary>
+
+ - The first time a `drive.*` action targets a new site, Chrome opens a small permissions prompt.
+ - Click **Allow this action** to allow once (no allowlist entry).
+ - Click **Always allow actions on this site** to add the site to your approved-sites allowlist.
+ - Click **Decline** to fail the command with `PERMISSION_DENIED` (non-retryable).
+ - If you ignore the prompt, the command fails with `PERMISSION_PROMPT_TIMEOUT` (retryable). Default wait is 30 seconds; approve the prompt, then retry the command.
+
+ Manage approvals (and bypass mode):
+
+ - Open the extension options page from `chrome://extensions` (Browser Bridge -> **Extension options**) or from the Extensions toolbar menu (Browser Bridge -> **Extension options**).
+ - The options page shows your **Approved sites** allowlist with revoke controls.
+ - Switch **Permission mode** to **Bypass (dangerous)** to skip the allowlist and prompts entirely.
+ - In bypass mode, the agent can take actions on any website without asking.
+ - Restricted URLs (for example `chrome://` and `file://`) are still blocked.
+
+ </details>
+
+ ## 🧰 Tools (MCP + CLI)
+
+ The CLI mirrors the MCP tool surface.
+
+ <details>
+ <summary>All MCP tools (click to expand)</summary>
+
+ **session**
+
+ - `session.create`
+ - `session.status`
+ - `session.recover`
+ - `session.close`
+
+ **drive**
+
+ - `drive.navigate`
+ - `drive.go_back`
+ - `drive.go_forward`
+ - `drive.back`
+ - `drive.forward`
+ - `drive.click`
+ - `drive.hover`
+ - `drive.select`
+ - `drive.type`
+ - `drive.fill_form`
+ - `drive.drag`
+ - `drive.handle_dialog`
+ - `drive.key`
+ - `drive.key_press`
+ - `drive.scroll`
+ - `drive.wait_for`
+ - `drive.tab_list`
+ - `drive.tab_activate`
+ - `drive.tab_close`
+
+ **dialog**
+
+ - `dialog.accept`
+ - `dialog.dismiss`
+
+ **inspect**
+
+ - `inspect.dom_snapshot`
+ - `inspect.dom_diff`
+ - `inspect.find`
+ - `inspect.extract_content`
+ - `inspect.page_state`
+ - `inspect.console_list`
+ - `inspect.network_har`
+ - `inspect.evaluate`
+ - `inspect.performance_metrics`
+
+ **artifacts**
+
+ - `artifacts.screenshot`
+
+ **misc**
+
+ - `health_check`
+ - `diagnostics.doctor`
+
+ </details>
+
+ ## 🧩 Skills (Agent Clients)
 
  Browser Bridge skills work across many agent clients, including Codex and Claude Code.
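
The Site Permissions text added in this hunk describes a small decision rule for `drive.*` actions: restricted URLs stay blocked, bypass mode skips prompts, and Granular mode prompts unless the site is already approved. A minimal sketch of that rule follows. The function name, the mode strings, and the choice to key the allowlist by hostname are illustrative assumptions, not taken from the Browser Bridge source.

```javascript
// Hypothetical sketch of the drive-permission rule described in the README.
// None of these identifiers come from the Browser Bridge code base.

// Schemes the README gives as examples of restricted URLs (not an exhaustive list).
const RESTRICTED_PROTOCOLS = new Set(["chrome:", "file:"]);

/**
 * Decide what happens when a drive.* action targets `targetUrl`.
 * @param {"granular" | "bypass"} mode permission mode (names assumed)
 * @param {Set<string>} approvedSites allowlist keyed by hostname (assumption)
 * @param {string} targetUrl absolute URL the action would touch
 * @returns {"block" | "allow" | "prompt"}
 */
function decideDriveAction(mode, approvedSites, targetUrl) {
  const url = new URL(targetUrl);
  // Restricted URLs are blocked in every mode, including bypass.
  if (RESTRICTED_PROTOCOLS.has(url.protocol)) return "block";
  // Bypass mode skips the allowlist and the prompt entirely.
  if (mode === "bypass") return "allow";
  // Granular mode: approved sites pass, new sites trigger a prompt.
  return approvedSites.has(url.hostname) ? "allow" : "prompt";
}

const approved = new Set(["example.com"]);
console.log(decideDriveAction("granular", approved, "https://example.com/login"));
console.log(decideDriveAction("granular", approved, "https://news.example.org/"));
console.log(decideDriveAction("bypass", approved, "chrome://extensions"));
```

The sketch leaves out the "allow once" choice, which grants a single action without adding an allowlist entry, and it does not model `inspect.*`, which the README says is not gated.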
 
@@ -110,13 +194,6 @@ Easiest option (recommended):
  browser-bridge install
  ```
 
- Skill only:
-
- ```bash
- browser-bridge skill install
- browser-bridge skill status
- ```
-
  Or copy the Browser Bridge skill into your agent skills directory (advanced):
 
  ```bash
@@ -131,7 +208,7 @@ cp -R "$(npm root -g)/@btraut/browser-bridge/skills/browser-bridge" ~/.claude/sk
 
  Restart your agent app if it does not pick up the new skill automatically.
 
- ## MCP Server (Optional)
+ ## 🧪 MCP Server (Optional)
 
  The MCP server runs over stdio and forwards tool calls to Core. It is optional, since you can use the CLI directly. MCP clients launch it automatically when needed, so you typically do not run it yourself.
 
@@ -140,7 +217,8 @@ The MCP server runs over stdio and forwards tool calls to Core. It is optional,
  - Use your MCP client to call `tools/list`, then `session.create`
  - Override Core host/port with `--host`, `--port`, or `BROWSER_BRIDGE_CORE_HOST` / `BROWSER_BRIDGE_CORE_PORT`.
 
- ## Manual MCP Setup (Advanced)
+ <details>
+ <summary>Manual MCP setup (advanced)</summary>
 
  Codex:
 
@@ -172,19 +250,21 @@ claude mcp add --transport stdio browser-bridge \
  -- browser-bridge mcp
  ```
 
- ## Diagnostics
+ </details>
+
+ ## 🩺 Diagnostics
 
  - CLI: `browser-bridge diagnostics doctor --session-id <id>`
  - Reports extension and debugger status alongside session state.
 
- ## Recovery
+ ## 🔧 Recovery
 
  If drive or inspect gets into a bad state, recovery is explicit:
 
  - `browser-bridge session recover --session-id <id>`
  - Then retry the failed operation once (tools report whether failures are `retryable`).
 
- ## Session TTL (Core Daemon)
+ ## 🧹 Session TTL (Core Daemon)
 
  The Core daemon keeps sessions in memory. By default, it automatically cleans up idle sessions after 1 hour.
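
The Recovery and Site Permissions text in this README diff both rely on the `retryable` flag: retry a retryable failure once, and stop on anything else (for example `PERMISSION_DENIED`). A minimal sketch of that client-side decision is below. The envelope shape `{ ok, error: { code, retryable } }` is inferred from the `ok: false` envelopes and the `retryable` field visible in this diff, and `nextStep` is a hypothetical helper, not part of the package.

```javascript
// Hypothetical helper: decide what an agent should do with a tool response.
// Assumed envelope shape: { ok: true, result } or { ok: false, error: { code, retryable } }.
function nextStep(envelope, retriesUsed) {
  if (envelope.ok) return "done";
  const retryable = envelope.error?.retryable === true;
  // The README advises a single retry after recovering (or approving the prompt).
  return retryable && retriesUsed === 0 ? "retry" : "give-up";
}

const denied = { ok: false, error: { code: "PERMISSION_DENIED", retryable: false } };
const timedOut = { ok: false, error: { code: "PERMISSION_PROMPT_TIMEOUT", retryable: true } };

console.log(nextStep(denied, 0));
console.log(nextStep(timedOut, 0));
console.log(nextStep(timedOut, 1));
```

Before acting on a `"retry"` result, the caller would run `browser-bridge session recover --session-id <id>` or wait for the user to approve the permission prompt, depending on the error code.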
 
package/dist/api.js CHANGED
@@ -2888,7 +2888,6 @@ var successEnvelopeSchema = (result) => import_zod.z.object({
  ok: import_zod.z.literal(true),
  result
  });
- var apiEnvelopeSchema = (result) => import_zod.z.union([successEnvelopeSchema(result), ErrorEnvelopeSchema]);
 
  // packages/shared/src/schemas.ts
  var import_zod2 = require("zod");
@@ -4620,9 +4619,11 @@ var createCoreClient = (options = {}) => {
  var toToolResult = (payload) => {
  const content = [{ type: "text", text: JSON.stringify(payload) }];
  if (payload && typeof payload === "object") {
+ const isErrorEnvelope = ErrorEnvelopeSchema.safeParse(payload).success;
  return {
  content,
- structuredContent: payload
+ structuredContent: payload,
+ isError: isErrorEnvelope
  };
  }
  return { content };
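
To make the effect of this hunk concrete, here is a dependency-free sketch of the post-change `toToolResult`. The real code checks the payload with `ErrorEnvelopeSchema.safeParse(payload).success`; because that zod schema is not shown in the diff, the sketch substitutes a hand-written `looksLikeErrorEnvelope` check, which is an assumption and may be looser or stricter than the actual schema.

```javascript
// Stand-in for ErrorEnvelopeSchema.safeParse(payload).success.
// Assumption: an error envelope has ok === false and an `error` object.
const looksLikeErrorEnvelope = (payload) =>
  payload.ok === false &&
  typeof payload.error === "object" &&
  payload.error !== null;

// Mirrors the shape of the 0.7.0 function: object payloads are returned as
// structuredContent and flagged with isError when they are error envelopes.
const toToolResult = (payload) => {
  const content = [{ type: "text", text: JSON.stringify(payload) }];
  if (payload && typeof payload === "object") {
    return {
      content,
      structuredContent: payload,
      isError: looksLikeErrorEnvelope(payload)
    };
  }
  // Non-object payloads carry only the text content.
  return { content };
};

const success = toToolResult({ ok: true, result: { session_id: "s1" } });
const failure = toToolResult({
  ok: false,
  error: { code: "PERMISSION_DENIED", retryable: false }
});
console.log(success.isError, failure.isError);
```

With the flag set, an MCP client can tell a failed tool call from a successful one without parsing the JSON text, which matches the changelog note about flagging `ok: false` envelopes as MCP errors.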
@@ -4635,7 +4636,7 @@ var toInternalErrorEnvelope = (error) => ({
4635
4636
  retryable: false
4636
4637
  }
4637
4638
  });
4638
- var envelope = (schema) => apiEnvelopeSchema(schema);
4639
+ var envelope = (schema) => successEnvelopeSchema(schema);
4639
4640
  var TOOL_DEFINITIONS = [
4640
4641
  {
4641
4642
  name: "session.create",