agent-browser 0.27.3 → 0.28.0
- package/README.md +168 -3
- package/bin/agent-browser-darwin-arm64 +0 -0
- package/bin/agent-browser-darwin-x64 +0 -0
- package/bin/agent-browser-linux-arm64 +0 -0
- package/bin/agent-browser-linux-musl-arm64 +0 -0
- package/bin/agent-browser-linux-musl-x64 +0 -0
- package/bin/agent-browser-linux-x64 +0 -0
- package/bin/agent-browser-win32-x64.exe +0 -0
- package/package.json +1 -1
- package/skill-data/core/SKILL.md +45 -1
- package/skill-data/core/references/authentication.md +75 -0
- package/skill-data/core/references/commands.md +83 -2
package/README.md
CHANGED
|
@@ -463,6 +463,7 @@ agent-browser upgrade # Upgrade agent-browser to the latest vers
|
|
|
463
463
|
agent-browser doctor # Diagnose the install and auto-clean stale daemon files
|
|
464
464
|
agent-browser doctor --fix # Also run destructive repairs (reinstall Chrome, purge old state, ...)
|
|
465
465
|
agent-browser doctor --offline --quick # Skip network probes and the live launch test
|
|
466
|
+
agent-browser mcp # Start an MCP stdio server
|
|
466
467
|
```
|
|
467
468
|
|
|
468
469
|
`doctor` checks your environment, Chrome install, daemon state, config files,
|
|
@@ -483,6 +484,74 @@ agent-browser skills path [name] # Print skill directory path
|
|
|
483
484
|
|
|
484
485
|
Serves bundled skill content that always matches the installed CLI version. AI agents use this to get current instructions rather than relying on cached copies. Set `AGENT_BROWSER_SKILLS_DIR` to override the skills directory path.
|
|
485
486
|
|
|
487
|
+
### MCP Server
|
|
488
|
+
|
|
489
|
+
```bash
|
|
490
|
+
agent-browser mcp
|
|
491
|
+
agent-browser mcp --tools all
|
|
492
|
+
agent-browser mcp --tools core,network,react
|
|
493
|
+
```
|
|
494
|
+
|
|
495
|
+
Starts a Model Context Protocol server over stdio. MCP clients launch this command as a subprocess and exchange newline-delimited JSON-RPC on stdin and stdout. The server defaults to MCP protocol 2025-11-25 and accepts older supported client protocol versions during initialization.
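To make the framing concrete, here is a minimal Python sketch of one newline-delimited JSON-RPC message as it might appear on the server's stdin. The `initialize` method and its `protocolVersion`, `capabilities`, and `clientInfo` parameters follow the general MCP handshake. The client name `example-client` and the helper names `encode_message` and `decode_message` are hypothetical, and none of this is taken from agent-browser's own implementation.

```python
import json


def encode_message(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes any newline inside a string value, so the
    # serialized message always stays on one physical line.
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def decode_message(line: bytes) -> dict:
    """Parse one line read from the peer back into a JSON-RPC message."""
    return json.loads(line.decode("utf-8"))


# Hypothetical first request a client could send after spawning
# `agent-browser mcp` as a subprocess.
initialize_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2025-11-25",
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.0.0"},
    },
}

wire = encode_message(initialize_request)
```

A real client would write `wire` to the subprocess's stdin and read the response one line at a time from its stdout.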
|
|
496
|
+
|
|
497
|
+
The default tools profile is `core`, which keeps MCP context small for everyday browser automation. Use `--tools all` for the full typed CLI parity surface, or combine profiles with commas, such as `--tools core,network,react`.
|
|
498
|
+
|
|
499
|
+
Profiles:
|
|
500
|
+
|
|
501
|
+
- `core` — Default. Navigation, snapshots, interaction, waits, reads, screenshots, JavaScript eval, close, tab basics, and profile discovery
|
|
502
|
+
- `network` — Network routes, request inspection, HAR, headers, credentials, offline
|
|
503
|
+
- `state` — Cookies, storage, auth, saved state, sessions, profiles, skills
|
|
504
|
+
- `debug` — Console/errors, tracing, profiling, recording, clipboard, plugins, doctor, dashboard, install, upgrade, chat, diff, batch, confirm/deny
|
|
505
|
+
- `tabs` — Back/forward/reload, tabs, windows, frames, dialogs
|
|
506
|
+
- `react` — React tree/inspect/renders/suspense, vitals, pushstate
|
|
507
|
+
- `mobile` — Viewport/device/geolocation/media, touch, swipe, mouse, keyboard
|
|
508
|
+
- `all` — Every MCP tool, including the full typed CLI parity surface
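The comma-separated `--tools` value can be pictured with a small Python sketch. The profile names come from the list above. The function name `parse_tools_spec` and its handling of unknown names, duplicates, and `all` combined with other profiles are assumptions made for illustration, not documented CLI behavior.

```python
KNOWN_PROFILES = ("core", "network", "state", "debug", "tabs", "react", "mobile", "all")


def parse_tools_spec(spec: str = "core") -> list[str]:
    """Split a --tools value such as 'core,network,react' into profile names."""
    names = [part.strip() for part in spec.split(",") if part.strip()]
    unknown = [name for name in names if name not in KNOWN_PROFILES]
    if unknown:
        # Assumption: unknown profile names are rejected rather than ignored.
        raise ValueError(f"unknown tools profile(s): {', '.join(unknown)}")
    if not names:
        # Assumption: an empty value falls back to the documented default.
        return ["core"]
    if "all" in names:
        # Assumption: `all` subsumes every other profile.
        return ["all"]
    # Drop duplicates while keeping the order the user wrote.
    return list(dict.fromkeys(names))
```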
|
|
509
|
+
|
|
510
|
+
Common tools include:
|
|
511
|
+
|
|
512
|
+
- `agent_browser_tools_profiles`
|
|
513
|
+
- `agent_browser_open`
|
|
514
|
+
- `agent_browser_snapshot`
|
|
515
|
+
- `agent_browser_click`
|
|
516
|
+
- `agent_browser_fill`
|
|
517
|
+
- `agent_browser_type`
|
|
518
|
+
- `agent_browser_press`
|
|
519
|
+
- `agent_browser_wait_for_selector`
|
|
520
|
+
- `agent_browser_screenshot`
|
|
521
|
+
- `agent_browser_get_url`
|
|
522
|
+
- `agent_browser_eval`
|
|
523
|
+
- `agent_browser_close`
|
|
524
|
+
|
|
525
|
+
Each tool has typed fields such as `url`, `selector`, `text`, `key`, and `session`, so MCP clients show meaningful approval prompts instead of raw command arrays. Each tool also accepts `extraArgs` for advanced CLI flags and exact CLI parity. Tool discovery is paginated and includes read-only/open-world annotations so modern MCP clients can load the large typed surface incrementally.
|
|
526
|
+
|
|
527
|
+
Example MCP client config:
|
|
528
|
+
|
|
529
|
+
```json
|
|
530
|
+
{
|
|
531
|
+
"mcpServers": {
|
|
532
|
+
"agent-browser": {
|
|
533
|
+
"command": "agent-browser",
|
|
534
|
+
"args": ["mcp"]
|
|
535
|
+
}
|
|
536
|
+
}
|
|
537
|
+
}
|
|
538
|
+
```
|
|
539
|
+
|
|
540
|
+
Full parity MCP client config:
|
|
541
|
+
|
|
542
|
+
```json
|
|
543
|
+
{
|
|
544
|
+
"mcpServers": {
|
|
545
|
+
"agent-browser": {
|
|
546
|
+
"command": "agent-browser",
|
|
547
|
+
"args": ["mcp", "--tools", "all"]
|
|
548
|
+
}
|
|
549
|
+
}
|
|
550
|
+
}
|
|
551
|
+
```
|
|
552
|
+
|
|
553
|
+
Tool invocations use the same config files and environment variables as the CLI. Use `session` in the tool arguments, or set `AGENT_BROWSER_SESSION`, to isolate browser state.
|
|
554
|
+
|
|
486
555
|
## Authentication
|
|
487
556
|
|
|
488
557
|
agent-browser provides multiple ways to persist login sessions so you don't re-authenticate every run.
|
|
@@ -641,6 +710,7 @@ agent-browser --session-name secure open example.com
|
|
|
641
710
|
agent-browser includes security features for safe AI agent deployments. All features are opt-in, and existing workflows are unaffected until you explicitly enable a feature:
|
|
642
711
|
|
|
643
712
|
- **Authentication Vault**: Store credentials locally (always encrypted), reference by name. The LLM never sees passwords. `auth login` navigates with `load` and then waits for login form selectors to appear (SPA-friendly, timeout follows the default action timeout). A key is auto-generated at `~/.agent-browser/.encryption-key` if `AGENT_BROWSER_ENCRYPTION_KEY` is not set: `echo "pass" | agent-browser auth save github --url https://github.com/login --username user --password-stdin` then `agent-browser auth login github`
|
|
713
|
+
- **Plugin System**: Extend agent-browser with external executable plugins. Plugins run out-of-process over the `agent-browser.plugin.v1` stdio JSON protocol and declare capabilities such as `credential.read`, `browser.provider`, `launch.mutate`, or `command.run`.
|
|
644
714
|
- **Content Boundary Markers**: Wrap page output in delimiters so LLMs can distinguish tool output from untrusted content: `--content-boundaries`
|
|
645
715
|
- **Domain Allowlist**: Restrict navigation to trusted domains (wildcards like `*.example.com` also match the bare domain): `--allowed-domains "example.com,*.example.com"`. Sub-resource requests (scripts, images, fetch) and WebSocket/EventSource connections to non-allowed domains are also blocked. Include any CDN domains your target pages depend on (e.g., `*.cdn.example.com`).
|
|
646
716
|
- **Action Policy**: Gate destructive actions with a static policy file: `--action-policy ./policy.json`
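The wildcard rule in the Domain Allowlist entry above (a pattern such as `*.example.com` also matches the bare domain) can be sketched in Python as follows. The function names `parse_allowed_domains` and `is_allowed` are hypothetical, and details such as case folding are assumptions. This illustrates the stated rule and is not agent-browser's matcher.

```python
def parse_allowed_domains(value: str) -> list[str]:
    """Split a value like 'example.com,*.example.com' into patterns."""
    return [part.strip() for part in value.split(",") if part.strip()]


def is_allowed(host: str, allowed: list[str]) -> bool:
    """Return True if host matches an exact or wildcard pattern."""
    host = host.lower().rstrip(".")
    for pattern in allowed:
        pattern = pattern.lower()
        if pattern.startswith("*."):
            base = pattern[2:]
            # The wildcard covers any subdomain and the bare domain itself.
            if host == base or host.endswith("." + base):
                return True
        elif host == pattern:
            return True
    return False
```

Matching on `"." + base` rather than on the bare suffix keeps a lookalike host such as `evilexample.com` from passing the `*.example.com` pattern.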
|
|
@@ -655,9 +725,97 @@ agent-browser includes security features for safe AI agent deployments. All feat
|
|
|
655
725
|
| `AGENT_BROWSER_ACTION_POLICY` | Path to action policy JSON file |
|
|
656
726
|
| `AGENT_BROWSER_CONFIRM_ACTIONS` | Action categories requiring confirmation |
|
|
657
727
|
| `AGENT_BROWSER_CONFIRM_INTERACTIVE` | Enable interactive confirmation prompts |
|
|
728
|
+
| `AGENT_BROWSER_PLUGINS` | JSON plugin registry override |
|
|
658
729
|
|
|
659
730
|
See [Security documentation](https://agent-browser.dev/security) for details.
|
|
660
731
|
|
|
732
|
+
### Plugin System
|
|
733
|
+
|
|
734
|
+
Plugins let third-party tools integrate without becoming built-in agent-browser dependencies. Add a plugin from npm or GitHub:
|
|
735
|
+
|
|
736
|
+
```bash
|
|
737
|
+
agent-browser plugin add agent-browser-plugin-captcha
|
|
738
|
+
agent-browser plugin add @company/agent-browser-plugin-vault --name vault
|
|
739
|
+
agent-browser plugin add org/agent-browser-plugin-cloud-browser
|
|
740
|
+
```
|
|
741
|
+
|
|
742
|
+
References are resolved by shape: `name` uses npm, `@scope/name` uses npm, and `owner/repo` uses GitHub. `plugin add` writes `./agent-browser.json` by default; use `--global` for `~/.agent-browser/config.json`.
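The shape rule above can be sketched in a few lines of Python. The function name `classify_plugin_ref` and the error handling for malformed references are assumptions for illustration. Only the three shapes and their targets come from the text.

```python
def classify_plugin_ref(ref: str) -> str:
    """Return 'npm' or 'github' based on the shape of a plugin reference."""
    if ref.startswith("@"):
        # Scoped npm package: @scope/name
        scope, sep, name = ref[1:].partition("/")
        if scope and sep and name and "/" not in name:
            return "npm"
        raise ValueError(f"malformed scoped package reference: {ref!r}")
    parts = ref.split("/")
    if len(parts) == 1 and parts[0]:
        return "npm"  # plain package name
    if len(parts) == 2 and all(parts):
        return "github"  # owner/repo
    raise ValueError(f"unrecognized plugin reference: {ref!r}")
```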
|
|
743
|
+
|
|
744
|
+
Plugin packages should support `plugin.manifest` so `plugin add` can discover their name and capabilities automatically. If a plugin does not support manifests, pass `--capability <name>` during add.
|
|
745
|
+
|
|
746
|
+
Plugins can also be configured manually in `agent-browser.json`:
|
|
747
|
+
|
|
748
|
+
```json
|
|
749
|
+
{
|
|
750
|
+
"plugins": [
|
|
751
|
+
{
|
|
752
|
+
"name": "vault",
|
|
753
|
+
"command": "agent-browser-plugin-vault",
|
|
754
|
+
"capabilities": ["credential.read"]
|
|
755
|
+
},
|
|
756
|
+
{
|
|
757
|
+
"name": "cloud-browser",
|
|
758
|
+
"command": "agent-browser-plugin-cloud-browser",
|
|
759
|
+
"capabilities": ["browser.provider"]
|
|
760
|
+
},
|
|
761
|
+
{
|
|
762
|
+
"name": "stealth",
|
|
763
|
+
"command": "agent-browser-plugin-stealth",
|
|
764
|
+
"capabilities": ["launch.mutate"]
|
|
765
|
+
},
|
|
766
|
+
{
|
|
767
|
+
"name": "captcha",
|
|
768
|
+
"command": "agent-browser-plugin-captcha",
|
|
769
|
+
"capabilities": ["command.run", "captcha.solve"]
|
|
770
|
+
}
|
|
771
|
+
]
|
|
772
|
+
}
|
|
773
|
+
```
|
|
774
|
+
|
|
775
|
+
Inspect configured plugins:
|
|
776
|
+
|
|
777
|
+
```bash
|
|
778
|
+
agent-browser plugin list
|
|
779
|
+
agent-browser plugin show vault
|
|
780
|
+
```
|
|
781
|
+
|
|
782
|
+
Use a credential provider plugin for one login:
|
|
783
|
+
|
|
784
|
+
```bash
|
|
785
|
+
agent-browser auth login my-app --credential-provider vault --item "My App"
|
|
786
|
+
agent-browser auth login my-app --credential-provider vault --item "My App" --url https://app.example.com/login --username-selector "#email" --password-selector "#password" --submit-selector "button[type=submit]"
|
|
787
|
+
```
|
|
788
|
+
|
|
789
|
+
Use a browser provider plugin:
|
|
790
|
+
|
|
791
|
+
```bash
|
|
792
|
+
agent-browser --provider cloud-browser open https://example.com
|
|
793
|
+
```
|
|
794
|
+
|
|
795
|
+
Use a launch mutator plugin for stealth or local launch customization. The plugin can append Chrome args, extensions, and init scripts before the browser starts:
|
|
796
|
+
|
|
797
|
+
```bash
|
|
798
|
+
agent-browser open https://example.com
|
|
799
|
+
```
|
|
800
|
+
|
|
801
|
+
Use a generic plugin command for domain-specific tools such as CAPTCHA solvers:
|
|
802
|
+
|
|
803
|
+
```bash
|
|
804
|
+
agent-browser plugin run captcha captcha.solve --payload '{"siteKey":"...","url":"https://example.com"}'
|
|
805
|
+
```
|
|
806
|
+
|
|
807
|
+
The protocol request always includes `protocol`, `type`, `capability`, and `request`. A credential plugin receives `credential.resolve`, a browser provider receives `browser.launch`, a launch mutator receives `launch.mutate`, and generic commands receive the supplied request type. `plugin run` is for `command.run` and custom capabilities; core capabilities and protocol request types use their dedicated command paths. agent-browser keeps browser automation, redaction-sensitive output, and policy enforcement in core.
|
|
808
|
+
|
|
809
|
+
Gate plugin access by capability action:
|
|
810
|
+
|
|
811
|
+
```bash
|
|
812
|
+
agent-browser --confirm-actions plugin:vault:credential.read auth login my-app --credential-provider vault --item "My App"
|
|
813
|
+
agent-browser --confirm-actions plugin:cloud-browser:browser.provider --provider cloud-browser open https://example.com
|
|
814
|
+
agent-browser --confirm-actions plugin:stealth:launch.mutate open https://example.com
|
|
815
|
+
```
|
|
816
|
+
|
|
817
|
+
Do not put vault tokens or passwords in plugin command args. Use the vault vendor's own login/session mechanism or environment outside agent-browser config.
|
|
818
|
+
|
|
661
819
|
## Snapshot Options
|
|
662
820
|
|
|
663
821
|
The `snapshot` command supports filtering to reduce output size:
|
|
@@ -723,7 +881,7 @@ This is useful for multimodal AI models that can reason about visual layout, unl
|
|
|
723
881
|
| `--ignore-https-errors` | Ignore HTTPS certificate errors (useful for self-signed certs) |
|
|
724
882
|
| `--allow-file-access` | Allow file:// URLs to access local files (Chromium only) |
|
|
725
883
|
| `--hide-scrollbars <bool>` | Hide native scrollbars in headless Chromium screenshots, enabled by default (or `AGENT_BROWSER_HIDE_SCROLLBARS` env) |
|
|
726
|
-
| `-p, --provider <name>` |
|
|
884
|
+
| `-p, --provider <name>` | Browser provider, including configured `browser.provider` plugins (or `AGENT_BROWSER_PROVIDER` env) |
|
|
727
885
|
| `--device <name>` | iOS device name, e.g. "iPhone 15 Pro" (or `AGENT_BROWSER_IOS_DEVICE` env) |
|
|
728
886
|
| `--json` | JSON output (for agents) |
|
|
729
887
|
| `--annotate` | Annotated screenshot with numbered element labels (or `AGENT_BROWSER_ANNOTATE` env) |
|
|
@@ -820,7 +978,14 @@ Create an `agent-browser.json` file to set persistent defaults instead of repeat
|
|
|
820
978
|
"profile": "./browser-data",
|
|
821
979
|
"userAgent": "my-agent/1.0",
|
|
822
980
|
"hideScrollbars": false,
|
|
823
|
-
"ignoreHttpsErrors": true
|
|
981
|
+
"ignoreHttpsErrors": true,
|
|
982
|
+
"plugins": [
|
|
983
|
+
{
|
|
984
|
+
"name": "vault",
|
|
985
|
+
"command": "agent-browser-plugin-vault",
|
|
986
|
+
"capabilities": ["credential.read"]
|
|
987
|
+
}
|
|
988
|
+
]
|
|
824
989
|
}
|
|
825
990
|
```
|
|
826
991
|
|
|
@@ -831,7 +996,7 @@ agent-browser --config ./ci-config.json open example.com
|
|
|
831
996
|
AGENT_BROWSER_CONFIG=./ci-config.json agent-browser open example.com
|
|
832
997
|
```
|
|
833
998
|
|
|
834
|
-
All options from the table above can be set in the config file using camelCase keys (e.g., `--executable-path` becomes `"executablePath"`, `--proxy-bypass` becomes `"proxyBypass"`). Unknown keys are ignored for forward compatibility.
|
|
999
|
+
All options from the table above can be set in the config file using camelCase keys (e.g., `--executable-path` becomes `"executablePath"`, `--proxy-bypass` becomes `"proxyBypass"`). Plugins are configured with the `"plugins"` array shown above. Unknown keys are ignored for forward compatibility.
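The flag-to-key naming rule can be sketched in Python. The helper name `flag_to_config_key` is hypothetical, and the sketch covers long `--kebab-case` flags only. It illustrates the naming convention, not how agent-browser loads its config.

```python
def flag_to_config_key(flag: str) -> str:
    """Convert a long CLI flag such as '--executable-path' to its camelCase key."""
    if not flag.startswith("--") or len(flag) <= 2:
        raise ValueError(f"expected a long flag like --some-option, got {flag!r}")
    words = flag[2:].split("-")
    # The first word stays lowercase; each later word is capitalized.
    return words[0] + "".join(word.capitalize() for word in words[1:])
```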
|
|
835
1000
|
|
|
836
1001
|
A [JSON Schema](agent-browser.schema.json) is available for IDE autocomplete and validation. Add a `$schema` key to your config file to enable it:
|
|
837
1002
|
|
|
Binary files changed (7): the package/bin/agent-browser-* executables listed above
|
package/package.json
CHANGED
package/skill-data/core/SKILL.md
CHANGED
|
@@ -54,6 +54,29 @@ agent-browser screenshot result.png
|
|
|
54
54
|
The browser stays running across commands so these feel like a single
|
|
55
55
|
session. Use `agent-browser close` (or `close --all`) when you're done.
|
|
56
56
|
|
|
57
|
+
## MCP integration
|
|
58
|
+
|
|
59
|
+
For tools that support Model Context Protocol servers, start the stdio server:
|
|
60
|
+
|
|
61
|
+
```bash
|
|
62
|
+
agent-browser mcp
|
|
63
|
+
agent-browser mcp --tools all
|
|
64
|
+
agent-browser mcp --tools core,network,react
|
|
65
|
+
```
|
|
66
|
+
|
|
67
|
+
Configure the MCP client to launch `agent-browser` with `["mcp"]`. The server
|
|
68
|
+
defaults to MCP protocol 2025-11-25 and accepts older supported client protocol
|
|
69
|
+
versions during initialization. The default tools profile is `core`, which
|
|
70
|
+
keeps MCP context small for everyday browser automation. Use `--tools all` for
|
|
71
|
+
the full typed CLI parity surface, or combine profiles with commas, such as
|
|
72
|
+
`--tools core,network,react`. Profiles are `core`, `network`, `state`, `debug`,
|
|
73
|
+
`tabs`, `react`, `mobile`, and `all`; the `debug` profile includes plugin
|
|
74
|
+
registry and command.run tools. Each tool accepts typed arguments plus
|
|
75
|
+
`extraArgs` for advanced CLI flags and exact CLI parity. Tool discovery is
|
|
76
|
+
paginated and includes read-only/open-world annotations so modern MCP clients
|
|
77
|
+
can load the large typed surface incrementally. Use the tool `session` argument
|
|
78
|
+
or `AGENT_BROWSER_SESSION` to isolate browser sessions.
|
|
79
|
+
|
|
57
80
|
## Reading a page
|
|
58
81
|
|
|
59
82
|
```bash
|
|
@@ -192,6 +215,27 @@ agent-browser auth save my-app --url https://app.example.com/login \
|
|
|
192
215
|
agent-browser auth login my-app # fills + clicks, waits for form
|
|
193
216
|
```
|
|
194
217
|
|
|
218
|
+
If credentials live in an external vault, use a configured credential provider
|
|
219
|
+
plugin instead of putting secrets in the command line:
|
|
220
|
+
|
|
221
|
+
```bash
|
|
222
|
+
agent-browser plugin add agent-browser-plugin-vault --name vault
|
|
223
|
+
agent-browser plugin list
|
|
224
|
+
agent-browser auth login my-app --credential-provider vault --item "My App"
|
|
225
|
+
agent-browser auth login my-app --credential-provider vault --item "My App" --url https://app.example.com/login --username-selector "#email" --password-selector "#password"
|
|
226
|
+
```
|
|
227
|
+
|
|
228
|
+
Plugins can also provide browser providers, launch mutators such as stealth
|
|
229
|
+
setup, and arbitrary namespaced commands:
|
|
230
|
+
|
|
231
|
+
```bash
|
|
232
|
+
agent-browser --provider cloud-browser open https://example.com
|
|
233
|
+
agent-browser plugin run captcha captcha.solve --payload '{"siteKey":"...","url":"https://example.com"}'
|
|
234
|
+
```
|
|
235
|
+
|
|
236
|
+
`plugin run` is for `command.run` and custom capabilities. Core capabilities
|
|
237
|
+
and protocol request types use their dedicated command paths.
|
|
238
|
+
|
|
195
239
|
### Persist session across runs
|
|
196
240
|
|
|
197
241
|
```bash
|
|
@@ -470,7 +514,7 @@ That pulls in:
|
|
|
470
514
|
|
|
471
515
|
- `references/commands.md` — every command, flag, alias
|
|
472
516
|
- `references/snapshot-refs.md` — deep dive on the snapshot + ref model
|
|
473
|
-
- `references/authentication.md` — auth vault, credential handling
|
|
517
|
+
- `references/authentication.md` — auth vault, credential plugins, credential handling
|
|
474
518
|
- `references/trust-boundaries.md` — safety rules for driving a real browser
|
|
475
519
|
- `references/session-management.md` — persistence, multi-session workflows
|
|
476
520
|
- `references/profiling.md` — Chrome DevTools tracing and profiling
|
package/skill-data/core/references/authentication.md
CHANGED
|
@@ -10,6 +10,7 @@ Login flows, session persistence, OAuth, 2FA, and authenticated browsing.
|
|
|
10
10
|
- [Persistent Profiles](#persistent-profiles)
|
|
11
11
|
- [Session Persistence](#session-persistence)
|
|
12
12
|
- [Basic Login Flow](#basic-login-flow)
|
|
13
|
+
- [Plugins](#plugins)
|
|
13
14
|
- [Saving Authentication State](#saving-authentication-state)
|
|
14
15
|
- [Restoring Authentication](#restoring-authentication)
|
|
15
16
|
- [OAuth / SSO Flows](#oauth--sso-flows)
|
|
@@ -140,6 +141,80 @@ agent-browser wait --load networkidle
|
|
|
140
141
|
agent-browser get url # Should be dashboard, not login
|
|
141
142
|
```
|
|
142
143
|
|
|
144
|
+
## Plugins
|
|
145
|
+
|
|
146
|
+
Use credential provider plugins when credentials live in external vault software. Plugins are configured in `agent-browser.json` and run as external executables over the `agent-browser.plugin.v1` stdio JSON protocol.
|
|
147
|
+
|
|
148
|
+
Add a plugin with `plugin add`. A plain `name` or `@scope/name` resolves from npm; `owner/repo` resolves from GitHub:
|
|
149
|
+
|
|
150
|
+
```bash
|
|
151
|
+
agent-browser plugin add agent-browser-plugin-vault --name vault
|
|
152
|
+
agent-browser plugin add @company/agent-browser-plugin-vault --name vault
|
|
153
|
+
agent-browser plugin add org/agent-browser-plugin-cloud-browser
|
|
154
|
+
```
|
|
155
|
+
|
|
156
|
+
```json
|
|
157
|
+
{
|
|
158
|
+
"plugins": [
|
|
159
|
+
{
|
|
160
|
+
"name": "vault",
|
|
161
|
+
"command": "agent-browser-plugin-vault",
|
|
162
|
+
"capabilities": ["credential.read"]
|
|
163
|
+
},
|
|
164
|
+
{
|
|
165
|
+
"name": "cloud-browser",
|
|
166
|
+
"command": "agent-browser-plugin-cloud-browser",
|
|
167
|
+
"capabilities": ["browser.provider"]
|
|
168
|
+
},
|
|
169
|
+
{
|
|
170
|
+
"name": "stealth",
|
|
171
|
+
"command": "agent-browser-plugin-stealth",
|
|
172
|
+
"capabilities": ["launch.mutate"]
|
|
173
|
+
},
|
|
174
|
+
{
|
|
175
|
+
"name": "captcha",
|
|
176
|
+
"command": "agent-browser-plugin-captcha",
|
|
177
|
+
"capabilities": ["command.run", "captcha.solve"]
|
|
178
|
+
}
|
|
179
|
+
]
|
|
180
|
+
}
|
|
181
|
+
```
|
|
182
|
+
|
|
183
|
+
Inspect configured plugins before use:
|
|
184
|
+
|
|
185
|
+
```bash
|
|
186
|
+
agent-browser plugin list
|
|
187
|
+
agent-browser plugin show vault
|
|
188
|
+
```
|
|
189
|
+
|
|
190
|
+
Resolve credentials just-in-time for one login:
|
|
191
|
+
|
|
192
|
+
```bash
|
|
193
|
+
agent-browser auth login my-app --credential-provider vault --item "My App"
|
|
194
|
+
```
|
|
195
|
+
|
|
196
|
+
Use a plugin as a browser provider or a generic domain command:
|
|
197
|
+
|
|
198
|
+
```bash
|
|
199
|
+
agent-browser --provider cloud-browser open https://example.com
|
|
200
|
+
agent-browser plugin run captcha captcha.solve --payload '{"siteKey":"...","url":"https://example.com"}'
|
|
201
|
+
```
|
|
202
|
+
|
|
203
|
+
`plugin run` is for `command.run` and custom capabilities. Core capabilities
|
|
204
|
+
and protocol request types use their dedicated command paths.
|
|
205
|
+
|
|
206
|
+
Use `--url`, `--username-selector`, `--password-selector`, and `--submit-selector` on `auth login` to override plugin-provided metadata for the current login only.
|
|
207
|
+
|
|
208
|
+
Gate plugin secret access separately from normal login automation:
|
|
209
|
+
|
|
210
|
+
```bash
|
|
211
|
+
agent-browser --confirm-actions plugin:vault:credential.read auth login my-app --credential-provider vault --item "My App"
|
|
212
|
+
agent-browser --confirm-actions plugin:cloud-browser:browser.provider --provider cloud-browser open https://example.com
|
|
213
|
+
agent-browser --confirm-actions plugin:stealth:launch.mutate open https://example.com
|
|
214
|
+
```
|
|
215
|
+
|
|
216
|
+
Do not put vault tokens or passwords in plugin command args. Use the vault vendor's own login/session mechanism or environment outside agent-browser config.
|
|
217
|
+
|
|
143
218
|
## Saving Authentication State
|
|
144
219
|
|
|
145
220
|
After logging in, save state for reuse:
|
package/skill-data/core/references/commands.md
CHANGED
|
@@ -296,6 +296,36 @@ Array.from(links).map(a => a.href);
|
|
|
296
296
|
EOF
|
|
297
297
|
```
|
|
298
298
|
|
|
299
|
+
## Authentication and Plugins
|
|
300
|
+
|
|
301
|
+
```bash
|
|
302
|
+
agent-browser auth save <name> --url <url> --username <user> --password-stdin
|
|
303
|
+
agent-browser auth login <name> # Login using saved credentials
|
|
304
|
+
agent-browser auth login <name> --credential-provider <plugin> [--item <ref>] [--url <url>]
|
|
305
|
+
agent-browser auth login <name> --username-selector <s> --password-selector <s> [--submit-selector <s>]
|
|
306
|
+
agent-browser auth list # List saved auth profiles
|
|
307
|
+
agent-browser auth show <name> # Show profile metadata, no passwords
|
|
308
|
+
agent-browser auth delete <name> # Delete a saved profile
|
|
309
|
+
agent-browser plugin add <ref> # Add a plugin from npm or GitHub
|
|
310
|
+
agent-browser plugin list # List configured plugins
|
|
311
|
+
agent-browser plugin show <name> # Show one configured plugin
|
|
312
|
+
agent-browser plugin run <name> <type> --payload <json>
|
|
313
|
+
# Run an arbitrary plugin request
|
|
314
|
+
```
|
|
315
|
+
|
|
316
|
+
Credential provider plugins run out-of-process over the
|
|
317
|
+
`agent-browser.plugin.v1` stdio JSON protocol and must declare
|
|
318
|
+
`credential.read`. Use `--confirm-actions plugin:<name>:credential.read`
|
|
319
|
+
to require explicit approval before a plugin resolves secrets.
|
|
320
|
+
|
|
321
|
+
Other capabilities use the same protocol:
|
|
322
|
+
- `browser.provider`: `agent-browser --provider <name> open <url>`
|
|
323
|
+
- `launch.mutate`: append local launch args, extensions, or init scripts
|
|
324
|
+
- `command.run`: `agent-browser plugin run <name> <type> --payload <json>`
|
|
325
|
+
|
|
326
|
+
`plugin run` is for `command.run` and custom capabilities. Core capabilities
|
|
327
|
+
and protocol request types use their dedicated command paths.
|
|
328
|
+
|
|
299
329
|
## State Management
|
|
300
330
|
|
|
301
331
|
```bash
|
|
@@ -303,6 +333,56 @@ agent-browser state save auth.json # Save cookies, storage, auth state
|
|
|
303
333
|
agent-browser state load auth.json # Restore saved state
|
|
304
334
|
```
|
|
305
335
|
|
|
336
|
+
## MCP Server
|
|
337
|
+
|
|
338
|
+
```bash
|
|
339
|
+
agent-browser mcp
|
|
340
|
+
agent-browser mcp --tools all
|
|
341
|
+
agent-browser mcp --tools core,network,react
|
|
342
|
+
```
|
|
343
|
+
|
|
344
|
+
Starts a stdio Model Context Protocol server. MCP clients should configure the
|
|
345
|
+
server command as `agent-browser` with args `["mcp"]`. The server defaults to
|
|
346
|
+
MCP protocol 2025-11-25 and accepts older supported client protocol versions
|
|
347
|
+
during initialization.
|
|
348
|
+
|
|
349
|
+
The default tools profile is `core`, which keeps MCP context small for everyday
|
|
350
|
+
browser automation. Use `--tools all` for the full typed CLI parity surface, or
|
|
351
|
+
combine profiles with commas, such as `--tools core,network,react`.
|
|
352
|
+
|
|
353
|
+
Profiles:
|
|
354
|
+
|
|
355
|
+
- `core` - Default. Navigation, snapshots, interaction, waits, reads, screenshots, JavaScript eval, close, tab basics, and profile discovery
|
|
356
|
+
- `network` - Network routes, request inspection, HAR, headers, credentials, offline
|
|
357
|
+
- `state` - Cookies, storage, auth, saved state, sessions, profiles, skills
|
|
358
|
+
- `debug` - Console/errors, tracing, profiling, recording, clipboard, plugins, doctor, dashboard, install, upgrade, chat, diff, batch, confirm/deny
|
|
359
|
+
- `tabs` - Back/forward/reload, tabs, windows, frames, dialogs
|
|
360
|
+
- `react` - React tree/inspect/renders/suspense, vitals, pushstate
|
|
361
|
+
- `mobile` - Viewport/device/geolocation/media, touch, swipe, mouse, keyboard
|
|
362
|
+
- `all` - Every MCP tool, including the full typed CLI parity surface
|
|
363
|
+
|
|
364
|
+
Common tools include:
|
|
365
|
+
|
|
366
|
+
- `agent_browser_tools_profiles`
|
|
367
|
+
- `agent_browser_open`
|
|
368
|
+
- `agent_browser_snapshot`
|
|
369
|
+
- `agent_browser_click`
|
|
370
|
+
- `agent_browser_fill`
|
|
371
|
+
- `agent_browser_type`
|
|
372
|
+
- `agent_browser_press`
|
|
373
|
+
- `agent_browser_wait_for_selector`
|
|
374
|
+
- `agent_browser_screenshot`
|
|
375
|
+
- `agent_browser_get_url`
|
|
376
|
+
- `agent_browser_eval`
|
|
377
|
+
- `agent_browser_close`
|
|
378
|
+
|
|
379
|
+
Tool calls use the same config files and environment variables as the CLI. Each
|
|
380
|
+
tool accepts typed arguments plus `extraArgs` for advanced CLI flags and exact
|
|
381
|
+
CLI parity. Tool discovery is paginated and includes read-only/open-world
|
|
382
|
+
annotations so modern MCP clients can load the large typed surface
|
|
383
|
+
incrementally. Use the `session` tool argument or `AGENT_BROWSER_SESSION` to
|
|
384
|
+
isolate browser state.
|
|
385
|
+
|
|
306
386
|
## Global Options
|
|
307
387
|
|
|
308
388
|
```bash
|
|
@@ -310,7 +390,7 @@ agent-browser --session <name> ... # Isolated browser session
|
|
|
310
390
|
agent-browser --json ... # JSON output for parsing
|
|
311
391
|
agent-browser --headed ... # Show browser window (not headless)
|
|
312
392
|
agent-browser --cdp <port> ... # Connect via Chrome DevTools Protocol
|
|
313
|
-
agent-browser -p <provider> ... #
|
|
393
|
+
agent-browser -p <provider> ... # Browser provider or configured provider plugin
|
|
314
394
|
agent-browser --proxy <url> ... # Use proxy server
|
|
315
395
|
agent-browser --proxy-bypass <hosts> # Hosts to bypass proxy
|
|
316
396
|
agent-browser --headers <json> ... # HTTP headers scoped to URL's origin
|
|
@@ -396,8 +476,9 @@ AGENT_BROWSER_EXTENSIONS="/ext1,/ext2" # Comma-separated extension paths
|
|
|
396
476
|
AGENT_BROWSER_INIT_SCRIPTS="/a.js,/b.js" # Comma-separated init script paths
|
|
397
477
|
AGENT_BROWSER_ENABLE="react-devtools" # Comma-separated built-in init script features
|
|
398
478
|
AGENT_BROWSER_HIDE_SCROLLBARS="false" # Keep native scrollbars visible in headless Chromium screenshots
|
|
399
|
-
AGENT_BROWSER_PROVIDER="browserbase" #
|
|
479
|
+
AGENT_BROWSER_PROVIDER="browserbase" # Browser provider or configured provider plugin
|
|
400
480
|
AGENT_BROWSER_STREAM_PORT="9223" # Override WebSocket streaming port (default: OS-assigned)
|
|
401
481
|
AGENT_BROWSER_CONFIG="./agent-browser.json" # Custom config file
|
|
402
482
|
AGENT_BROWSER_CDP="9222" # Connect daemon to CDP port or WebSocket URL
|
|
483
|
+
AGENT_BROWSER_PLUGINS='[{"name":"vault","command":"agent-browser-plugin-vault","capabilities":["credential.read"]},{"name":"stealth","command":"agent-browser-plugin-stealth","capabilities":["launch.mutate"]}]'
|
|
403
484
|
```
|