agent-browser-priv 0.27.3-priv.7 → 0.28.0-priv.1

package/README.md CHANGED
@@ -468,6 +468,7 @@ agent-browser upgrade # Upgrade agent-browser to the latest vers
  agent-browser doctor # Diagnose the install and auto-clean stale daemon files
  agent-browser doctor --fix # Also run destructive repairs
  agent-browser doctor --offline --quick # Skip network probes and the live launch test
+ agent-browser mcp # Start an MCP stdio server
  ```

  `doctor` checks your environment, browser backend install, daemon state,
@@ -488,6 +489,74 @@ agent-browser skills path [name] # Print skill directory path

  Serves bundled skill content that always matches the installed CLI version. AI agents use this to get current instructions rather than relying on cached copies. Set `AGENT_BROWSER_SKILLS_DIR` to override the skills directory path.

+ ### MCP Server
+
+ ```bash
+ agent-browser mcp
+ agent-browser mcp --tools all
+ agent-browser mcp --tools core,network,react
+ ```
+
+ Starts a Model Context Protocol server over stdio. MCP clients launch this command as a subprocess and exchange newline-delimited JSON-RPC on stdin and stdout. The server defaults to MCP protocol 2025-11-25 and accepts older supported client protocol versions during initialization.
+
+ The default tools profile is `core`, which keeps MCP context small for everyday browser automation. Use `--tools all` for the full typed CLI parity surface, or combine profiles with commas, such as `--tools core,network,react`.
+
+ Profiles:
+
+ - `core` — Default. Navigation, snapshots, interaction, waits, reads, screenshots, JavaScript eval, close, tab basics, and profile discovery
+ - `network` — Network routes, request inspection, HAR, headers, credentials, offline
+ - `state` — Cookies, storage, auth, saved state, sessions, profiles, skills
+ - `debug` — Console/errors, tracing, profiling, recording, clipboard, plugins, doctor, dashboard, install, upgrade, chat, diff, batch, confirm/deny
+ - `tabs` — Back/forward/reload, tabs, windows, frames, dialogs
+ - `react` — React tree/inspect/renders/suspense, vitals, pushstate
+ - `mobile` — Viewport/device/geolocation/media, touch, swipe, mouse, keyboard
+ - `all` — Every MCP tool, including the full typed CLI parity surface
+
+ Common tools include:
+
+ - `agent_browser_tools_profiles`
+ - `agent_browser_open`
+ - `agent_browser_snapshot`
+ - `agent_browser_click`
+ - `agent_browser_fill`
+ - `agent_browser_type`
+ - `agent_browser_press`
+ - `agent_browser_wait_for_selector`
+ - `agent_browser_screenshot`
+ - `agent_browser_get_url`
+ - `agent_browser_eval`
+ - `agent_browser_close`
+
+ Each tool has typed fields such as `url`, `selector`, `text`, `key`, and `session`, so MCP clients show meaningful approval prompts instead of raw command arrays. Each tool also accepts `extraArgs` for advanced CLI flags and exact CLI parity. Tool discovery is paginated and includes read-only/open-world annotations so modern MCP clients can load the large typed surface incrementally.
+
+ Example MCP client config:
+
+ ```json
+ {
+   "mcpServers": {
+     "agent-browser": {
+       "command": "agent-browser",
+       "args": ["mcp"]
+     }
+   }
+ }
+ ```
+
+ Full parity MCP client config:
+
+ ```json
+ {
+   "mcpServers": {
+     "agent-browser": {
+       "command": "agent-browser",
+       "args": ["mcp", "--tools", "all"]
+     }
+   }
+ }
+ ```
+
+ Tool invocations use the same config files and environment variables as the CLI. Use `session` in the tool arguments, or set `AGENT_BROWSER_SESSION`, to isolate browser state.
+

  ## Authentication

  agent-browser provides multiple ways to persist login sessions so you don't re-authenticate every run.
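Reviewer note on the MCP hunk above: the README says clients exchange newline-delimited JSON-RPC with `agent-browser mcp` over stdin and stdout. The sketch below illustrates only that framing, under stated assumptions. `encodeMessage` and `createLineDecoder` are hypothetical helper names, the `clientInfo` values are made up, and nothing here is taken from the package's own implementation.

```javascript
// Hypothetical helpers (not part of agent-browser): one JSON-RPC message per line.
function encodeMessage(message) {
  // JSON.stringify without an indent argument never emits a raw newline,
  // so each encoded message occupies exactly one line.
  return JSON.stringify(message) + "\n";
}

// Returns a push(chunk) function that buffers input and calls onMessage
// once for every complete line it has seen.
function createLineDecoder(onMessage) {
  let buffer = "";
  return function push(chunk) {
    buffer += chunk;
    let newline;
    while ((newline = buffer.indexOf("\n")) !== -1) {
      const line = buffer.slice(0, newline).trim();
      buffer = buffer.slice(newline + 1);
      if (line !== "") onMessage(JSON.parse(line));
    }
  };
}

// An initialize request as an MCP client might send it.
// The protocol version is the default named in the README.
const initialize = {
  jsonrpc: "2.0",
  id: 1,
  method: "initialize",
  params: {
    protocolVersion: "2025-11-25",
    capabilities: {},
    clientInfo: { name: "example-client", version: "0.0.0" }, // assumed values
  },
};

const received = [];
const push = createLineDecoder((msg) => received.push(msg));
const wire = encodeMessage(initialize);
push(wire.slice(0, 10)); // a pipe may split a message at any byte
push(wire.slice(10));
```

Buffering until a newline arrives matters because a subprocess pipe gives no guarantee that one read equals one message.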
@@ -646,6 +715,7 @@ agent-browser --session-name secure open example.com
  agent-browser includes security features for safe AI agent deployments. All features are opt-in, and existing workflows are unaffected until you explicitly enable a feature:

  - **Authentication Vault**: Store credentials locally (always encrypted), reference by name. The LLM never sees passwords. `auth login` navigates with `load` and then waits for login form selectors to appear (SPA-friendly, timeout follows the default action timeout). A key is auto-generated at `~/.agent-browser/.encryption-key` if `AGENT_BROWSER_ENCRYPTION_KEY` is not set: `echo "pass" | agent-browser auth save github --url https://github.com/login --username user --password-stdin` then `agent-browser auth login github`
+ - **Plugin System**: Extend agent-browser with external executable plugins. Plugins run out-of-process over the `agent-browser.plugin.v1` stdio JSON protocol and declare capabilities such as `credential.read`, `browser.provider`, `launch.mutate`, or `command.run`.
  - **Content Boundary Markers**: Wrap page output in delimiters so LLMs can distinguish tool output from untrusted content: `--content-boundaries`
  - **Domain Allowlist**: Restrict navigation to trusted domains (wildcards like `*.example.com` also match the bare domain): `--allowed-domains "example.com,*.example.com"`. Sub-resource requests (scripts, images, fetch) and WebSocket/EventSource connections to non-allowed domains are also blocked. Include any CDN domains your target pages depend on (e.g., `*.cdn.example.com`).
  - **Action Policy**: Gate destructive actions with a static policy file: `--action-policy ./policy.json`
@@ -660,9 +730,97 @@ agent-browser includes security features for safe AI agent deployments. All feat
  | `AGENT_BROWSER_ACTION_POLICY` | Path to action policy JSON file |
  | `AGENT_BROWSER_CONFIRM_ACTIONS` | Action categories requiring confirmation |
  | `AGENT_BROWSER_CONFIRM_INTERACTIVE` | Enable interactive confirmation prompts |
+ | `AGENT_BROWSER_PLUGINS` | JSON plugin registry override |

  See [Security documentation](https://agent-browser.dev/security) for details.

+ ### Plugin System
+
+ Plugins let third-party tools integrate without becoming built-in agent-browser dependencies. Add a plugin from npm or GitHub:
+
+ ```bash
+ agent-browser plugin add agent-browser-plugin-captcha
+ agent-browser plugin add @company/agent-browser-plugin-vault --name vault
+ agent-browser plugin add org/agent-browser-plugin-cloud-browser
+ ```
+
+ References are resolved by shape: `name` uses npm, `@scope/name` uses npm, and `owner/repo` uses GitHub. `plugin add` writes `./agent-browser.json` by default; use `--global` for `~/.agent-browser/config.json`.
+
+ Plugin packages should support `plugin.manifest` so `plugin add` can discover their name and capabilities automatically. If a plugin does not support manifests, pass `--capability <name>` during add.
+
+ Plugins can also be configured manually in `agent-browser.json`:
+
+ ```json
+ {
+   "plugins": [
+     {
+       "name": "vault",
+       "command": "agent-browser-plugin-vault",
+       "capabilities": ["credential.read"]
+     },
+     {
+       "name": "cloud-browser",
+       "command": "agent-browser-plugin-cloud-browser",
+       "capabilities": ["browser.provider"]
+     },
+     {
+       "name": "stealth",
+       "command": "agent-browser-plugin-stealth",
+       "capabilities": ["launch.mutate"]
+     },
+     {
+       "name": "captcha",
+       "command": "agent-browser-plugin-captcha",
+       "capabilities": ["command.run", "captcha.solve"]
+     }
+   ]
+ }
+ ```
+
+ Inspect configured plugins:
+
+ ```bash
+ agent-browser plugin list
+ agent-browser plugin show vault
+ ```
+
+ Use a credential provider plugin for one login:
+
+ ```bash
+ agent-browser auth login my-app --credential-provider vault --item "My App"
+ agent-browser auth login my-app --credential-provider vault --item "My App" --url https://app.example.com/login --username-selector "#email" --password-selector "#password" --submit-selector "button[type=submit]"
+ ```
+
+ Use a browser provider plugin:
+
+ ```bash
+ agent-browser --provider cloud-browser open https://example.com
+ ```
+
+ Use a launch mutator plugin for stealth or local launch customization. The plugin can append Chrome args, extensions, and init scripts before the browser starts:
+
+ ```bash
+ agent-browser open https://example.com
+ ```
+
+ Use a generic plugin command for domain-specific tools such as CAPTCHA solvers:
+
+ ```bash
+ agent-browser plugin run captcha captcha.solve --payload '{"siteKey":"...","url":"https://example.com"}'
+ ```
+
+ The protocol request always includes `protocol`, `type`, `capability`, and `request`. A credential plugin receives `credential.resolve`, a browser provider receives `browser.launch`, a launch mutator receives `launch.mutate`, and generic commands receive the supplied request type. `plugin run` is for `command.run` and custom capabilities; core capabilities and protocol request types use their dedicated command paths. agent-browser keeps browser automation, redaction-sensitive output, and policy enforcement in core.
+
+ Gate plugin access by capability action:
+
+ ```bash
+ agent-browser --confirm-actions plugin:vault:credential.read auth login my-app --credential-provider vault --item "My App"
+ agent-browser --confirm-actions plugin:cloud-browser:browser.provider --provider cloud-browser open https://example.com
+ agent-browser --confirm-actions plugin:stealth:launch.mutate open https://example.com
+ ```
+
+ Do not put vault tokens or passwords in plugin command args. Use the vault vendor's own login/session mechanism or environment outside agent-browser config.
+

  ## Snapshot Options

  The `snapshot` command supports filtering to reduce output size:
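Reviewer note on the Plugin System hunk above: the README states that every protocol request carries `protocol`, `type`, `capability`, and `request`, and names the request type each core capability receives. The sketch below assembles such an envelope under several assumptions. `buildPluginRequest` is a hypothetical helper, the `protocol` value is assumed to be the literal string `agent-browser.plugin.v1`, and the shape of the `request` payload (for example an `item` field) is a guess, since the diff does not show the wire format.

```javascript
// Assumption: the protocol field carries the protocol name quoted in the README.
const PROTOCOL = "agent-browser.plugin.v1";

// Request types the README lists for the core capabilities.
const CORE_REQUEST_TYPES = new Map([
  ["credential.read", "credential.resolve"],
  ["browser.provider", "browser.launch"],
  ["launch.mutate", "launch.mutate"],
]);

// Hypothetical helper: builds the four-field envelope described in the README.
// For generic commands the caller supplies the request type itself.
function buildPluginRequest(capability, request, suppliedType) {
  const type = CORE_REQUEST_TYPES.get(capability) ?? suppliedType;
  if (!type) {
    throw new Error(`no request type known for capability "${capability}"`);
  }
  return { protocol: PROTOCOL, type, capability, request };
}

// Payload shapes below are illustrative only.
const credentialRequest = buildPluginRequest("credential.read", { item: "My App" });
const captchaRequest = buildPluginRequest(
  "command.run",
  { siteKey: "...", url: "https://example.com" },
  "captcha.solve",
);
```

Which capability string accompanies a custom request type such as `captcha.solve` is not specified in the diff; the sketch uses `command.run` purely for illustration.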
@@ -728,7 +886,7 @@ This is useful for multimodal AI models that can reason about visual layout, unl
  | `--ignore-https-errors` | Ignore HTTPS certificate errors (useful for self-signed certs) |
  | `--allow-file-access` | Allow file:// URLs to access local files (Chromium only) |
  | `--hide-scrollbars <bool>` | Hide native scrollbars in headless Chromium screenshots, enabled by default (or `AGENT_BROWSER_HIDE_SCROLLBARS` env) |
- | `-p, --provider <name>` | Cloud browser provider (or `AGENT_BROWSER_PROVIDER` env) |
+ | `-p, --provider <name>` | Browser provider, including configured `browser.provider` plugins (or `AGENT_BROWSER_PROVIDER` env) |
  | `--device <name>` | iOS device name, e.g. "iPhone 15 Pro" (or `AGENT_BROWSER_IOS_DEVICE` env) |
  | `--json` | JSON output (for agents) |
  | `--annotate` | Annotated screenshot with numbered element labels (or `AGENT_BROWSER_ANNOTATE` env) |
@@ -859,7 +1017,14 @@ Create an `agent-browser.json` file to set persistent defaults instead of repeat
    "profile": "./browser-data",
    "userAgent": "my-agent/1.0",
    "hideScrollbars": false,
-   "ignoreHttpsErrors": true
+   "ignoreHttpsErrors": true,
+   "plugins": [
+     {
+       "name": "vault",
+       "command": "agent-browser-plugin-vault",
+       "capabilities": ["credential.read"]
+     }
+   ]
  }
  ```

@@ -870,7 +1035,7 @@ agent-browser --config ./ci-config.json open example.com
  AGENT_BROWSER_CONFIG=./ci-config.json agent-browser open example.com
  ```

- All options from the table above can be set in the config file using camelCase keys (e.g., `--executable-path` becomes `"executablePath"`, `--proxy-bypass` becomes `"proxyBypass"`). Unknown keys are ignored for forward compatibility.
+ All options from the table above can be set in the config file using camelCase keys (e.g., `--executable-path` becomes `"executablePath"`, `--proxy-bypass` becomes `"proxyBypass"`). Plugins are configured with the `"plugins"` array shown above. Unknown keys are ignored for forward compatibility.

  A [JSON Schema](agent-browser.schema.json) is available for IDE autocomplete and validation. Add a `$schema` key to your config file to enable it:

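Reviewer note on the config hunk above: the README describes the flag-to-key rule only by example (`--executable-path` becomes `"executablePath"`). A minimal sketch of that rule follows. `flagToConfigKey` is a hypothetical name and the regular expressions are an assumption about how the conversion could be done, not the CLI's actual parser.

```javascript
// Hypothetical helper: convert a long CLI flag to its camelCase config key.
function flagToConfigKey(flag) {
  return flag
    .replace(/^--/, "") // drop the leading dashes
    .replace(/-([a-z0-9])/g, (_match, ch) => ch.toUpperCase()); // kebab-case to camelCase
}

const examples = ["--executable-path", "--proxy-bypass", "--ignore-https-errors"].map(
  flagToConfigKey,
);
```

The third example agrees with the `"ignoreHttpsErrors"` key that appears in the sample config elsewhere in this diff.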
@@ -1277,9 +1442,11 @@ The daemon starts automatically on first command and persists between commands f
  | Platform | Binary |
  | ----------- | ----------- |
  | macOS ARM64 | Native Rust |
+ | macOS x64 | Native Rust |
  | Linux ARM64 | Native Rust |
  | Linux x64 | Native Rust |
- | Windows x64 | Native Rust |
+
+ Windows release artifacts are temporarily disabled while the fork stabilizes its Patchright-first CI lane.

  ## Usage with AI Agents

Binary file
Binary file
Binary file
@@ -38,12 +38,7 @@ function getBinaryName() {
        osKey = 'darwin';
        break;
      case 'linux':
-       if (isMusl()) {
-         console.error('Error: agent-browser-priv does not publish musl Linux binaries.');
-         console.error('Use a glibc-based Linux environment or build locally with npm run build:native.');
-         process.exit(1);
-       }
-       osKey = 'linux';
+       osKey = isMusl() ? 'linux-musl' : 'linux';
        break;
      case 'win32':
        osKey = 'win32';
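Reviewer note on the `getBinaryName()` hunk above: musl Linux now maps to a `linux-musl` OS key instead of exiting with an error. The rest of the function is not shown in this diff, so the sketch below is a guess at how the key could combine with the architecture. The resulting names follow the artifact names in the build script later in this diff (for example `agent-browser-priv-linux-musl-x64`). `binaryNameFor` and its parameters are hypothetical.

```javascript
// Hypothetical stand-in for the launcher's name lookup.
// platform and arch mirror Node's os.platform() and os.arch() values;
// musl is a boolean the caller determines (the real code calls isMusl()).
function binaryNameFor(platform, arch, musl) {
  let osKey;
  switch (platform) {
    case "darwin":
      osKey = "darwin";
      break;
    case "linux":
      osKey = musl ? "linux-musl" : "linux";
      break;
    case "win32":
      osKey = "win32";
      break;
    default:
      return null; // unsupported operating system
  }
  if (arch !== "x64" && arch !== "arm64") return null; // unsupported CPU
  const ext = platform === "win32" ? ".exe" : "";
  return `agent-browser-priv-${osKey}-${arch}${ext}`;
}
```

For instance, `binaryNameFor("linux", "arm64", true)` yields `agent-browser-priv-linux-musl-arm64`. Whether a binary with a given name is actually published is a separate question that the name mapping cannot answer.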
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "agent-browser-priv",
-   "version": "0.27.3-priv.7",
+   "version": "0.28.0-priv.1",
    "description": "Browser automation CLI for AI agents",
    "type": "module",
    "packageManager": "pnpm@11.1.3",
@@ -24,8 +24,8 @@
      "build:native": "pnpm run version:sync && cargo build --release --manifest-path cli/Cargo.toml && node scripts/copy-native.js",
      "build:linux": "pnpm run version:sync && docker compose -f docker/docker-compose.yml run --rm build-linux",
      "build:macos": "pnpm run version:sync && cargo build --release --manifest-path cli/Cargo.toml --target aarch64-apple-darwin && cp cli/target/aarch64-apple-darwin/release/agent-browser bin/agent-browser-priv-darwin-arm64",
-     "build:windows": "pnpm run version:sync && docker compose -f docker/docker-compose.yml run --rm build-windows",
-     "build:all-platforms": "pnpm run version:sync && (pnpm run build:linux & pnpm run build:windows & wait) && pnpm run build:macos",
+     "build:windows": "echo \"Windows builds are temporarily disabled\" && exit 0",
+     "build:all-platforms": "pnpm run version:sync && pnpm run build:linux && pnpm run build:macos",
      "build:docker": "docker build --platform linux/amd64 -t agent-browser-builder -f docker/Dockerfile.build .",
      "release": "pnpm run version:sync && pnpm run build:all-platforms",
      "postinstall": "node scripts/postinstall.js",
@@ -1,7 +1,7 @@
  #!/bin/bash
  set -euo pipefail

- # Build agent-browser for all platforms using Docker
+ # Build agent-browser release platforms using Docker
  # Usage: ./scripts/build-all-platforms.sh

  SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
@@ -14,7 +14,7 @@ GREEN='\033[0;32m'
  YELLOW='\033[1;33m'
  NC='\033[0m' # No Color

- echo -e "${YELLOW}Building agent-browser for all platforms...${NC}"
+ echo -e "${YELLOW}Building agent-browser release platforms...${NC}"
  echo ""

  # Ensure output directory exists
@@ -63,12 +63,22 @@ build_target "x86_64-unknown-linux-gnu" "x86_64-unknown-linux-gnu.2.28" "agent-b
  # Linux ARM64
  build_target "aarch64-unknown-linux-gnu" "aarch64-unknown-linux-gnu.2.28" "agent-browser-priv-linux-arm64"

- # Windows x64
- build_target "x86_64-pc-windows-gnu" "x86_64-pc-windows-gnu" "agent-browser-priv-win32-x64.exe"
+ # Windows x64 is temporarily disabled. Restore this line when Windows support
+ # is re-enabled:
+ # build_target "x86_64-pc-windows-gnu" "x86_64-pc-windows-gnu" "agent-browser-priv-win32-x64.exe"
+
+ # macOS x64 (via zig for cross-compilation)
+ build_target "x86_64-apple-darwin" "x86_64-apple-darwin" "agent-browser-priv-darwin-x64"

  # macOS ARM64 (via zig for cross-compilation)
  build_target "aarch64-apple-darwin" "aarch64-apple-darwin" "agent-browser-priv-darwin-arm64"

+ # Linux musl x64 (Alpine)
+ build_target "x86_64-unknown-linux-musl" "x86_64-unknown-linux-musl" "agent-browser-priv-linux-musl-x64"
+
+ # Linux musl ARM64 (Alpine)
+ build_target "aarch64-unknown-linux-musl" "aarch64-unknown-linux-musl" "agent-browser-priv-linux-musl-arm64"
+
  echo ""
  echo -e "${GREEN}Build complete!${NC}"
  echo ""
@@ -5,8 +5,10 @@
   *
   * Downloads the platform-specific native binary if not present.
   * On global installs, patches npm's bin entry to use the native binary directly:
-  * - Windows: Overwrites .cmd/.ps1 shims
   * - Mac/Linux: Replaces symlink to point to native binary
+  *
+  * Windows npm binaries are temporarily disabled. The shim helpers remain below
+  * for a future re-enable.
   */

  import { existsSync, mkdirSync, chmodSync, createWriteStream, unlinkSync, writeFileSync, symlinkSync, lstatSync } from 'fs';
@@ -37,11 +39,15 @@ if (platform() === 'linux' && isMusl()) {
    process.exit(0);
  }

+ if (platform() === 'win32') {
+   console.log('agent-browser-priv Windows npm binaries are temporarily disabled.');
+   console.log('Use Linux x64, Linux ARM64, or macOS ARM64 for the published package.');
+   process.exit(0);
+ }
+
  // Platform detection
  const osKey = platform();
- // Windows ARM64 falls back to x64 binary (no native ARM64 build available).
- // x64 binaries run via Windows' built-in emulation on ARM64.
- const effectiveArch = platform() === 'win32' && arch() === 'arm64' ? 'x64' : arch();
+ const effectiveArch = arch();
  const platformKey = `${osKey}-${effectiveArch}`;
  const ext = platform() === 'win32' ? '.exe' : '';
  const binaryName = `agent-browser-priv-${platformKey}${ext}`;
@@ -138,9 +144,6 @@ async function main() {
    }

    console.log(`Downloading native binary for ${platformKey}...`);
-   if (platform() === 'win32' && arch() === 'arm64') {
-     console.log(` Note: Using x64 binary on ARM64 Windows (runs via emulation)`);
-   }
    console.log(`URL: ${DOWNLOAD_URL}`);

    try {
@@ -236,6 +239,7 @@ function showInstallReminder() {
   */
  async function fixGlobalInstallBin() {
    if (platform() === 'win32') {
+     // Unreachable while Windows npm binaries are disabled; kept for future re-enable.
      await fixWindowsShims();
    } else {
      await fixUnixSymlink();
@@ -0,0 +1,153 @@
+ #!/usr/bin/env node
+
+ import { createHash } from "node:crypto";
+ import { writeFileSync } from "node:fs";
+ import { get } from "node:https";
+
+ const ASSETS = [
+   {
+     key: "darwinArm64",
+     name: "agent-browser-priv-darwin-arm64",
+   },
+   {
+     key: "linuxArm64",
+     name: "agent-browser-priv-linux-arm64",
+   },
+   {
+     key: "linuxX64",
+     name: "agent-browser-priv-linux-x64",
+   },
+ ];
+
+ function argValue(name) {
+   const index = process.argv.indexOf(`--${name}`);
+   return index === -1 ? undefined : process.argv[index + 1];
+ }
+
+ const version = argValue("version");
+ const formulaPath = argValue("formula");
+
+ if (!version || !formulaPath) {
+   console.error("Usage: node scripts/update-homebrew-formula.mjs --version <version> --formula <path>");
+   process.exit(1);
+ }
+
+ function assetUrl(name) {
+   return `https://github.com/liuwen/agent-browser-priv/releases/download/v${version}/${name}`;
+ }
+
+ function fetchBuffer(url, redirects = 0) {
+   return new Promise((resolve, reject) => {
+     get(url, (response) => {
+       if ([301, 302, 303, 307, 308].includes(response.statusCode)) {
+         if (!response.headers.location || redirects > 5) {
+           reject(new Error(`Too many redirects for ${url}`));
+           response.resume();
+           return;
+         }
+         response.resume();
+         resolve(fetchBuffer(response.headers.location, redirects + 1));
+         return;
+       }
+
+       if (response.statusCode !== 200) {
+         reject(new Error(`GET ${url} returned HTTP ${response.statusCode}`));
+         response.resume();
+         return;
+       }
+
+       const chunks = [];
+       response.on("data", (chunk) => chunks.push(chunk));
+       response.on("end", () => resolve(Buffer.concat(chunks)));
+     }).on("error", reject);
+   });
+ }
+
+ async function withRetries(label, fn) {
+   let lastError;
+   for (let attempt = 1; attempt <= 5; attempt += 1) {
+     try {
+       return await fn();
+     } catch (error) {
+       lastError = error;
+       const delayMs = attempt * 2000;
+       console.error(`${label} failed on attempt ${attempt}: ${error.message}`);
+       if (attempt < 5) {
+         await new Promise((resolve) => setTimeout(resolve, delayMs));
+       }
+     }
+   }
+   throw lastError;
+ }
+
+ function sha256(buffer) {
+   return createHash("sha256").update(buffer).digest("hex");
+ }
+
+ function renderFormula(checksums) {
+   return `class AgentBrowser < Formula
+   desc "Browser automation CLI for AI agents with Patchright as the default backend"
+   homepage "https://github.com/liuwen/agent-browser-priv"
+   version "${version}"
+   license "Apache-2.0"
+
+   on_macos do
+     if Hardware::CPU.arm?
+       url "${assetUrl("agent-browser-priv-darwin-arm64")}"
+       sha256 "${checksums.darwinArm64}"
+     end
+   end
+
+   on_linux do
+     if Hardware::CPU.arm?
+       url "${assetUrl("agent-browser-priv-linux-arm64")}"
+       sha256 "${checksums.linuxArm64}"
+     elsif Hardware::CPU.intel?
+       url "${assetUrl("agent-browser-priv-linux-x64")}"
+       sha256 "${checksums.linuxX64}"
+     end
+   end
+
+   def install
+     unsupported = "agent-browser Homebrew binary is published for macOS ARM64 and Linux x86_64/ARM64"
+     odie unsupported unless supported_platform?
+
+     binary = if OS.mac? && Hardware::CPU.arm?
+       "agent-browser-priv-darwin-arm64"
+     elsif OS.linux? && Hardware::CPU.arm?
+       "agent-browser-priv-linux-arm64"
+     elsif OS.linux? && Hardware::CPU.intel?
+       "agent-browser-priv-linux-x64"
+     else
+       odie unsupported
+     end
+
+     bin.install binary => "agent-browser"
+     bin.install_symlink bin/"agent-browser" => "agent-browser-priv"
+   end
+
+   test do
+     assert_match version.to_s, shell_output("#{bin}/agent-browser --version")
+     assert_match version.to_s, shell_output("#{bin}/agent-browser-priv --version")
+   end
+
+   def supported_platform?
+     (OS.mac? && Hardware::CPU.arm?) || (OS.linux? && (Hardware::CPU.arm? || Hardware::CPU.intel?))
+   end
+ end
+ `;
+ }
+
+ const checksums = {};
+ for (const asset of ASSETS) {
+   const url = assetUrl(asset.name);
+   const data = await withRetries(asset.name, () => fetchBuffer(url));
+   if (data.length < 100_000) {
+     throw new Error(`${asset.name} is too small (${data.length} bytes)`);
+   }
+   checksums[asset.key] = sha256(data);
+   console.log(`${asset.name}: ${checksums[asset.key]}`);
+ }
+
+ writeFileSync(formulaPath, renderFormula(checksums));
+ console.log(`Updated ${formulaPath}`);
@@ -66,6 +66,29 @@ agent-browser screenshot result.png
  The browser stays running across commands so these feel like a single
  session. Use `agent-browser close` (or `close --all`) when you're done.

+ ## MCP integration
+
+ For tools that support Model Context Protocol servers, start the stdio server:
+
+ ```bash
+ agent-browser mcp
+ agent-browser mcp --tools all
+ agent-browser mcp --tools core,network,react
+ ```
+
+ Configure the MCP client to launch `agent-browser` with `["mcp"]`. The server
+ defaults to MCP protocol 2025-11-25 and accepts older supported client protocol
+ versions during initialization. The default tools profile is `core`, which
+ keeps MCP context small for everyday browser automation. Use `--tools all` for
+ the full typed CLI parity surface, or combine profiles with commas, such as
+ `--tools core,network,react`. Profiles are `core`, `network`, `state`, `debug`,
+ `tabs`, `react`, `mobile`, and `all`; the `debug` profile includes plugin
+ registry and command.run tools. Each tool accepts typed arguments plus
+ `extraArgs` for advanced CLI flags and exact CLI parity. Tool discovery is
+ paginated and includes read-only/open-world annotations so modern MCP clients
+ can load the large typed surface incrementally. Use the tool `session` argument
+ or `AGENT_BROWSER_SESSION` to isolate browser sessions.
+
  ## Reading a page

  ```bash
@@ -204,6 +227,27 @@ agent-browser auth save my-app --url https://app.example.com/login \
  agent-browser auth login my-app # fills + clicks, waits for form
  ```

+ If credentials live in an external vault, use a configured credential provider
+ plugin instead of putting secrets in the command line:
+
+ ```bash
+ agent-browser plugin add agent-browser-plugin-vault --name vault
+ agent-browser plugin list
+ agent-browser auth login my-app --credential-provider vault --item "My App"
+ agent-browser auth login my-app --credential-provider vault --item "My App" --url https://app.example.com/login --username-selector "#email" --password-selector "#password"
+ ```
+
+ Plugins can also provide browser providers, launch mutators such as stealth
+ setup, and arbitrary namespaced commands:
+
+ ```bash
+ agent-browser --provider cloud-browser open https://example.com
+ agent-browser plugin run captcha captcha.solve --payload '{"siteKey":"...","url":"https://example.com"}'
+ ```
+
+ `plugin run` is for `command.run` and custom capabilities. Core capabilities
+ and protocol request types use their dedicated command paths.
+
  ### Persist session across runs

  ```bash
@@ -484,7 +528,7 @@ That pulls in:

  - `references/commands.md` — every command, flag, alias
  - `references/snapshot-refs.md` — deep dive on the snapshot + ref model
- - `references/authentication.md` — auth vault, credential handling
+ - `references/authentication.md` — auth vault, credential plugins, credential handling
  - `references/trust-boundaries.md` — safety rules for driving a real browser
  - `references/session-management.md` — persistence, multi-session workflows
  - `references/profiling.md` — Chrome DevTools tracing and profiling
@@ -10,6 +10,7 @@ Login flows, session persistence, OAuth, 2FA, and authenticated browsing.
  - [Persistent Profiles](#persistent-profiles)
  - [Session Persistence](#session-persistence)
  - [Basic Login Flow](#basic-login-flow)
+ - [Plugins](#plugins)
  - [Saving Authentication State](#saving-authentication-state)
  - [Restoring Authentication](#restoring-authentication)
  - [OAuth / SSO Flows](#oauth--sso-flows)
@@ -140,6 +141,80 @@ agent-browser wait --load networkidle
  agent-browser get url # Should be dashboard, not login
  ```

+ ## Plugins
+
+ Use credential provider plugins when credentials live in external vault software. Plugins are configured in `agent-browser.json` and run as external executables over the `agent-browser.plugin.v1` stdio JSON protocol.
+
+ Add a plugin with `plugin add`. A plain `name` or `@scope/name` resolves from npm; `owner/repo` resolves from GitHub:
+
+ ```bash
+ agent-browser plugin add agent-browser-plugin-vault --name vault
+ agent-browser plugin add @company/agent-browser-plugin-vault --name vault
+ agent-browser plugin add org/agent-browser-plugin-cloud-browser
+ ```
+
+ ```json
+ {
+   "plugins": [
+     {
+       "name": "vault",
+       "command": "agent-browser-plugin-vault",
+       "capabilities": ["credential.read"]
+     },
+     {
+       "name": "cloud-browser",
+       "command": "agent-browser-plugin-cloud-browser",
+       "capabilities": ["browser.provider"]
+     },
+     {
+       "name": "stealth",
+       "command": "agent-browser-plugin-stealth",
+       "capabilities": ["launch.mutate"]
+     },
+     {
+       "name": "captcha",
+       "command": "agent-browser-plugin-captcha",
+       "capabilities": ["command.run", "captcha.solve"]
+     }
+   ]
+ }
+ ```
+
+ Inspect configured plugins before use:
+
+ ```bash
+ agent-browser plugin list
+ agent-browser plugin show vault
+ ```
+
+ Resolve credentials just-in-time for one login:
+
+ ```bash
+ agent-browser auth login my-app --credential-provider vault --item "My App"
+ ```
+
+ Use a plugin as a browser provider or a generic domain command:
+
+ ```bash
+ agent-browser --provider cloud-browser open https://example.com
+ agent-browser plugin run captcha captcha.solve --payload '{"siteKey":"...","url":"https://example.com"}'
+ ```
+
+ `plugin run` is for `command.run` and custom capabilities. Core capabilities
+ and protocol request types use their dedicated command paths.
+
+ Use `--url`, `--username-selector`, `--password-selector`, and `--submit-selector` on `auth login` to override plugin-provided metadata for the current login only.
+
+ Gate plugin secret access separately from normal login automation:
+
+ ```bash
+ agent-browser --confirm-actions plugin:vault:credential.read auth login my-app --credential-provider vault --item "My App"
+ agent-browser --confirm-actions plugin:cloud-browser:browser.provider --provider cloud-browser open https://example.com
+ agent-browser --confirm-actions plugin:stealth:launch.mutate open https://example.com
+ ```
+
+ Do not put vault tokens or passwords in plugin command args. Use the vault vendor's own login/session mechanism or environment outside agent-browser config.
+

  ## Saving Authentication State

  After logging in, save state for reuse:
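Reviewer note on the Plugins hunk above: the docs say a plain `name` or `@scope/name` resolves from npm and `owner/repo` resolves from GitHub. The sketch below classifies a reference by that shape. `classifyPluginRef` and its regular expressions are assumptions made for illustration; the diff does not show how `plugin add` actually parses references or how it handles inputs outside these three shapes.

```javascript
// Hypothetical classifier for the three reference shapes named in the docs.
function classifyPluginRef(ref) {
  if (/^@[^/\s]+\/[^/\s]+$/.test(ref)) return "npm"; // @scope/name
  if (/^[^@/\s][^/\s]*\/[^/\s]+$/.test(ref)) return "github"; // owner/repo
  if (/^[^@/\s][^/\s]*$/.test(ref)) return "npm"; // name
  return "unknown"; // anything else is outside the documented shapes
}

const classified = [
  "agent-browser-plugin-vault",
  "@company/agent-browser-plugin-vault",
  "org/agent-browser-plugin-cloud-browser",
].map(classifyPluginRef);
```

The scoped form has to be checked before the `owner/repo` form because both contain exactly one slash; only the leading `@` tells them apart.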
@@ -297,6 +297,36 @@ Array.from(links).map(a => a.href);
  EOF
  ```
 
+ ## Authentication and Plugins
+
+ ```bash
+ agent-browser auth save <name> --url <url> --username <user> --password-stdin
+ agent-browser auth login <name> # Login using saved credentials
+ agent-browser auth login <name> --credential-provider <plugin> [--item <ref>] [--url <url>]
+ agent-browser auth login <name> --username-selector <s> --password-selector <s> [--submit-selector <s>]
+ agent-browser auth list # List saved auth profiles
+ agent-browser auth show <name> # Show profile metadata, no passwords
+ agent-browser auth delete <name> # Delete a saved profile
+ agent-browser plugin add <ref> # Add a plugin from npm or GitHub
+ agent-browser plugin list # List configured plugins
+ agent-browser plugin show <name> # Show one configured plugin
+ agent-browser plugin run <name> <type> --payload <json>
+ # Run an arbitrary plugin request
+ ```
+
+ Credential provider plugins run out-of-process over the
+ `agent-browser.plugin.v1` stdio JSON protocol and must declare
+ `credential.read`. Use `--confirm-actions plugin:<name>:credential.read`
+ to require explicit approval before a plugin resolves secrets.
+
+ Other capabilities use the same protocol:
+ - `browser.provider`: `agent-browser --provider <name> open <url>`
+ - `launch.mutate`: append local launch args, extensions, or init scripts
+ - `command.run`: `agent-browser plugin run <name> <type> --payload <json>`
+
+ `plugin run` is for `command.run` and custom capabilities. Core capabilities
+ and protocol request types use their dedicated command paths.
+
  ## State Management
 
  ```bash
@@ -304,6 +334,56 @@ agent-browser state save auth.json # Save cookies, storage, auth state
  agent-browser state load auth.json # Restore saved state
  ```
 
+ ## MCP Server
+
+ ```bash
+ agent-browser mcp
+ agent-browser mcp --tools all
+ agent-browser mcp --tools core,network,react
+ ```
+
+ Starts a stdio Model Context Protocol server. MCP clients should configure the
+ server command as `agent-browser` with args `["mcp"]`. The server defaults to
+ MCP protocol 2025-11-25 and accepts older supported client protocol versions
+ during initialization.
+
+ The default tools profile is `core`, which keeps MCP context small for everyday
+ browser automation. Use `--tools all` for the full typed CLI parity surface, or
+ combine profiles with commas, such as `--tools core,network,react`.
+
+ Profiles:
+
+ - `core` - Default. Navigation, snapshots, interaction, waits, reads, screenshots, JavaScript eval, close, tab basics, and profile discovery
+ - `network` - Network routes, request inspection, HAR, headers, credentials, offline
+ - `state` - Cookies, storage, auth, saved state, sessions, profiles, skills
+ - `debug` - Console/errors, tracing, profiling, recording, clipboard, plugins, doctor, dashboard, install, upgrade, chat, diff, batch, confirm/deny
+ - `tabs` - Back/forward/reload, tabs, windows, frames, dialogs
+ - `react` - React tree/inspect/renders/suspense, vitals, pushstate
+ - `mobile` - Viewport/device/geolocation/media, touch, swipe, mouse, keyboard
+ - `all` - Every MCP tool, including the full typed CLI parity surface
+
+ Common tools include:
+
+ - `agent_browser_tools_profiles`
+ - `agent_browser_open`
+ - `agent_browser_snapshot`
+ - `agent_browser_click`
+ - `agent_browser_fill`
+ - `agent_browser_type`
+ - `agent_browser_press`
+ - `agent_browser_wait_for_selector`
+ - `agent_browser_screenshot`
+ - `agent_browser_get_url`
+ - `agent_browser_eval`
+ - `agent_browser_close`
+
+ Tool calls use the same config files and environment variables as the CLI. Each
+ tool accepts typed arguments plus `extraArgs` for advanced CLI flags and exact
+ CLI parity. Tool discovery is paginated and includes read-only/open-world
+ annotations so modern MCP clients can load the large typed surface
+ incrementally. Use the `session` tool argument or `AGENT_BROWSER_SESSION` to
+ isolate browser state.
+
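As an illustration of the MCP Server section added in this hunk (not part of the package diff), here is a minimal Python sketch of what a client might prepare before launching the server: the launch entry (command `agent-browser`, args `["mcp"]`, optionally `--tools` with comma-joined profiles) and a single-line JSON-RPC `initialize` request carrying protocol version `2025-11-25`. The function names, the `example-client` identity, and the empty `capabilities` object are assumptions made for the sketch; the key under which a given MCP client stores the launch entry is client-specific and not shown.

```python
import json

# Protocol version string taken from the section text above.
DEFAULT_PROTOCOL_VERSION = "2025-11-25"


def mcp_server_entry(tools=None):
    """Build a launch entry: command `agent-browser`, args ["mcp", ...]."""
    args = ["mcp"]
    if tools:
        # Profiles are combined with commas, e.g. core,network,react.
        args += ["--tools", ",".join(tools)]
    return {"command": "agent-browser", "args": args}


def initialize_line(request_id=1, client_name="example-client", client_version="0.0.0"):
    """Serialize an MCP `initialize` request as one newline-terminated line.

    The client name and version are hypothetical placeholders.
    """
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": DEFAULT_PROTOCOL_VERSION,
            "capabilities": {},
            "clientInfo": {"name": client_name, "version": client_version},
        },
    }
    # json.dumps emits no raw newlines, so the message stays on one line.
    return json.dumps(message, separators=(",", ":")) + "\n"


if __name__ == "__main__":
    print(json.dumps(mcp_server_entry(["core", "network", "react"])))
    print(initialize_line(), end="")
```

A real client would write the `initialize` line to the subprocess's stdin and read the server's response from stdout; that part is left out so the sketch runs without the CLI installed.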
  ## Global Options
 
  ```bash
@@ -312,7 +392,7 @@ agent-browser --json ... # JSON output for parsing
  agent-browser --headed ... # Show browser window (not headless)
  agent-browser --cdp <port> ... # Connect via Chrome DevTools Protocol
  agent-browser --backend <name> ... # Local backend: patchright (default), chrome
- agent-browser -p <provider> ... # Cloud browser provider (--provider)
+ agent-browser -p <provider> ... # Browser provider or configured provider plugin
  agent-browser --proxy <url> ... # Use proxy server
  agent-browser --proxy-bypass <hosts> # Hosts to bypass proxy
  agent-browser --wait-until <state> # Navigation wait: none, domcontentloaded, load, networkidle
@@ -399,8 +479,9 @@ AGENT_BROWSER_EXTENSIONS="/ext1,/ext2" # Comma-separated extension paths
  AGENT_BROWSER_INIT_SCRIPTS="/a.js,/b.js" # Comma-separated init script paths
  AGENT_BROWSER_ENABLE="react-devtools" # Comma-separated built-in init script features
  AGENT_BROWSER_HIDE_SCROLLBARS="false" # Keep native scrollbars visible in headless Chromium screenshots
- AGENT_BROWSER_PROVIDER="browserbase" # Cloud browser provider
+ AGENT_BROWSER_PROVIDER="browserbase" # Browser provider or configured provider plugin
  AGENT_BROWSER_STREAM_PORT="9223" # Override WebSocket streaming port (default: OS-assigned)
  AGENT_BROWSER_CONFIG="./agent-browser.json" # Custom config file
  AGENT_BROWSER_CDP="9222" # Connect daemon to CDP port or WebSocket URL
+ AGENT_BROWSER_PLUGINS='[{"name":"vault","command":"agent-browser-plugin-vault","capabilities":["credential.read"]},{"name":"stealth","command":"agent-browser-plugin-stealth","capabilities":["launch.mutate"]}]'
  ```
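To make the shape of the new `AGENT_BROWSER_PLUGINS` value concrete, the following Python sketch (an illustration, not part of the package diff) parses the example value shown above and derives the `plugin:<name>:<capability>` strings that the diff uses with `--confirm-actions`. Treating `name`, `command`, and `capabilities` as required keys is an assumption drawn only from that example, and the helper names are hypothetical.

```python
import json

# Example value copied from the AGENT_BROWSER_PLUGINS line in the diff above.
EXAMPLE = (
    '[{"name":"vault","command":"agent-browser-plugin-vault",'
    '"capabilities":["credential.read"]},'
    '{"name":"stealth","command":"agent-browser-plugin-stealth",'
    '"capabilities":["launch.mutate"]}]'
)

# Assumption: these keys are inferred from the example, not from a schema.
ASSUMED_KEYS = ("name", "command", "capabilities")


def parse_plugins(raw):
    """Parse the JSON array and check each entry has the assumed keys."""
    plugins = json.loads(raw)
    if not isinstance(plugins, list):
        raise ValueError("expected a JSON array of plugin entries")
    for entry in plugins:
        for key in ASSUMED_KEYS:
            if key not in entry:
                raise ValueError(f"plugin entry missing {key!r}: {entry}")
    return plugins


def confirm_action_tokens(plugins):
    """Build one plugin:<name>:<capability> string per declared capability."""
    return [
        f"plugin:{plugin['name']}:{capability}"
        for plugin in plugins
        for capability in plugin["capabilities"]
    ]


if __name__ == "__main__":
    for token in confirm_action_tokens(parse_plugins(EXAMPLE)):
        print(token)
```

Each printed string matches the form passed to `--confirm-actions` in the examples earlier in the diff, such as `plugin:vault:credential.read`.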