agent-browser 0.27.0 → 0.27.2
- package/README.md +38 -20
- package/bin/agent-browser-darwin-arm64 +0 -0
- package/bin/agent-browser-darwin-x64 +0 -0
- package/bin/agent-browser-linux-arm64 +0 -0
- package/bin/agent-browser-linux-musl-arm64 +0 -0
- package/bin/agent-browser-linux-musl-x64 +0 -0
- package/bin/agent-browser-linux-x64 +0 -0
- package/bin/agent-browser-win32-x64.exe +0 -0
- package/package.json +7 -2
- package/scripts/build-all-platforms.sh +27 -16
- package/skill-data/core/SKILL.md +12 -9
- package/skill-data/core/references/commands.md +17 -3
- package/skill-data/core/references/video-recording.md +10 -8
- package/bin/.install-method +0 -1
package/README.md
CHANGED
|
@@ -42,6 +42,8 @@ agent-browser install # Download Chrome from Chrome for Testing (first time onl
|
|
|
42
42
|
|
|
43
43
|
### From Source
|
|
44
44
|
|
|
45
|
+
Requires Node.js 24+, pnpm 11+, and Rust.
|
|
46
|
+
|
|
45
47
|
```bash
|
|
46
48
|
git clone https://github.com/vercel-labs/agent-browser
|
|
47
49
|
cd agent-browser
|
|
@@ -73,6 +75,7 @@ Detects your installation method (npm, Homebrew, or Cargo) and runs the appropri
|
|
|
73
75
|
### Requirements
|
|
74
76
|
|
|
75
77
|
- **Chrome** - Run `agent-browser install` to download Chrome from [Chrome for Testing](https://developer.chrome.com/blog/chrome-for-testing/) (Google's official automation channel). Existing Chrome, Brave, Playwright, and Puppeteer installations are detected automatically. No Playwright or Node.js required for the daemon.
|
|
78
|
+
- **Node.js 24+ and pnpm 11+** - Only needed when building from source.
|
|
76
79
|
- **Rust** - Only needed when building from source (see From Source above).
|
|
77
80
|
|
|
78
81
|
## Quick Start
|
|
@@ -87,6 +90,13 @@ agent-browser screenshot page.png
|
|
|
87
90
|
agent-browser close
|
|
88
91
|
```
|
|
89
92
|
|
|
93
|
+
Clicks fail early when another element covers the target's click point,
|
|
94
|
+
for example a consent banner or modal. Dismiss or interact with the reported
|
|
95
|
+
covering element, then take a fresh snapshot before retrying the original ref.
|
|
96
|
+
|
|
97
|
+
Headless Chromium screenshots hide native scrollbars for consistent image output.
|
|
98
|
+
Pass `--hide-scrollbars false` when launching to keep native scrollbars visible.
|
|
99
|
+
|
|
90
100
|
### Traditional Selectors (also supported)
|
|
91
101
|
|
|
92
102
|
```bash
|
|
@@ -359,7 +369,7 @@ agent-browser diff url https://v1.com https://v2.com --selector "#main" # Scope
|
|
|
359
369
|
### Debug
|
|
360
370
|
|
|
361
371
|
```bash
|
|
362
|
-
agent-browser trace start
|
|
372
|
+
agent-browser trace start # Start recording trace
|
|
363
373
|
agent-browser trace stop [path] # Stop and save trace
|
|
364
374
|
agent-browser profiler start # Start Chrome DevTools profiling
|
|
365
375
|
agent-browser profiler stop [path] # Stop and save profile (.json)
|
|
@@ -422,7 +432,7 @@ agent-browser react renders start # Begin fiber render recordin
|
|
|
422
432
|
agent-browser react renders stop [--json] # Stop and print profile (--json for raw data)
|
|
423
433
|
agent-browser react suspense [--only-dynamic] [--json] # Suspense boundaries + classifier
|
|
424
434
|
# --only-dynamic hides the "static" list
|
|
425
|
-
agent-browser vitals [url] [--json] # LCP/CLS/TTFB/FCP/INP +
|
|
435
|
+
agent-browser vitals [url] [--json] # LCP/CLS/TTFB/FCP/INP + hydration summary
|
|
426
436
|
```
|
|
427
437
|
|
|
428
438
|
Each `react ...` subcommand requires `--enable react-devtools` to have been
|
|
@@ -432,6 +442,8 @@ binary). Without it the commands error with `React DevTools hook not installed
|
|
|
432
442
|
|
|
433
443
|
Works on any React app — Next.js, Remix, Vite+React, CRA, TanStack Start,
|
|
434
444
|
React Native Web, etc. `vitals` and `pushstate` are framework-agnostic.
|
|
445
|
+
`vitals` prints a summary by default; pass `--json` for the full structured
|
|
446
|
+
payload.
|
|
435
447
|
|
|
436
448
|
### Init scripts
|
|
437
449
|
|
|
@@ -626,14 +638,14 @@ agent-browser --session-name secure open example.com
|
|
|
626
638
|
|
|
627
639
|
## Security
|
|
628
640
|
|
|
629
|
-
agent-browser includes security features for safe AI agent deployments. All features are opt-in
|
|
641
|
+
agent-browser includes security features for safe AI agent deployments. All features are opt-in, and existing workflows are unaffected until you explicitly enable a feature:
|
|
630
642
|
|
|
631
|
-
- **Authentication Vault
|
|
632
|
-
- **Content Boundary Markers
|
|
633
|
-
- **Domain Allowlist
|
|
634
|
-
- **Action Policy
|
|
635
|
-
- **Action Confirmation
|
|
636
|
-
- **Output Length Limits
|
|
643
|
+
- **Authentication Vault**: Store credentials locally (always encrypted), reference by name. The LLM never sees passwords. `auth login` navigates with `load` and then waits for login form selectors to appear (SPA-friendly, timeout follows the default action timeout). A key is auto-generated at `~/.agent-browser/.encryption-key` if `AGENT_BROWSER_ENCRYPTION_KEY` is not set: `echo "pass" | agent-browser auth save github --url https://github.com/login --username user --password-stdin` then `agent-browser auth login github`
|
|
644
|
+
- **Content Boundary Markers**: Wrap page output in delimiters so LLMs can distinguish tool output from untrusted content: `--content-boundaries`
|
|
645
|
+
- **Domain Allowlist**: Restrict navigation to trusted domains (wildcards like `*.example.com` also match the bare domain): `--allowed-domains "example.com,*.example.com"`. Sub-resource requests (scripts, images, fetch) and WebSocket/EventSource connections to non-allowed domains are also blocked. Include any CDN domains your target pages depend on (e.g., `*.cdn.example.com`).
|
|
646
|
+
- **Action Policy**: Gate destructive actions with a static policy file: `--action-policy ./policy.json`
|
|
647
|
+
- **Action Confirmation**: Require explicit approval for sensitive action categories: `--confirm-actions eval,download`
|
|
648
|
+
- **Output Length Limits**: Prevent context flooding: `--max-output 50000`
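Note on the Domain Allowlist bullet above: the wildcard rule (`*.example.com` also matches the bare domain) can be sketched in shell as follows. The function name `domain_allowed` is hypothetical, and this is only an illustration of the documented rule, not agent-browser's actual matcher.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the host matches any comma-separated pattern.
# Illustrative only; agent-browser's real matching logic may differ.
domain_allowed() {
  local host=$1 patterns=$2 pattern base
  local -a list
  # Split on commas with read so that "*" in a pattern is never glob-expanded.
  IFS=',' read -ra list <<< "$patterns"
  for pattern in "${list[@]}"; do
    case $pattern in
      '*.'*)
        base=${pattern#'*.'}
        # A wildcard matches the bare domain and any subdomain of it.
        if [[ $host == "$base" || $host == *".$base" ]]; then
          return 0
        fi
        ;;
      *)
        if [[ $host == "$pattern" ]]; then
          return 0
        fi
        ;;
    esac
  done
  return 1
}
```

Under this sketch, `domain_allowed api.example.com "example.com,*.example.com"` succeeds, while a look-alike host such as `notexample.com` is rejected because the match requires a leading dot before the base domain.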
|
|
637
649
|
|
|
638
650
|
| Variable | Description |
|
|
639
651
|
| ----------------------------------- | ---------------------------------------- |
|
|
@@ -710,6 +722,7 @@ This is useful for multimodal AI models that can reason about visual layout, unl
|
|
|
710
722
|
| `--proxy-bypass <hosts>` | Hosts to bypass proxy (or `AGENT_BROWSER_PROXY_BYPASS` env) |
|
|
711
723
|
| `--ignore-https-errors` | Ignore HTTPS certificate errors (useful for self-signed certs) |
|
|
712
724
|
| `--allow-file-access` | Allow file:// URLs to access local files (Chromium only) |
|
|
725
|
+
| `--hide-scrollbars <bool>` | Hide native scrollbars in headless Chromium screenshots, enabled by default (or `AGENT_BROWSER_HIDE_SCROLLBARS` env) |
|
|
713
726
|
| `-p, --provider <name>` | Cloud browser provider (or `AGENT_BROWSER_PROVIDER` env) |
|
|
714
727
|
| `--device <name>` | iOS device name, e.g. "iPhone 15 Pro" (or `AGENT_BROWSER_IOS_DEVICE` env) |
|
|
715
728
|
| `--json` | JSON output (for agents) |
|
|
@@ -755,11 +768,11 @@ agent-browser dashboard stop
|
|
|
755
768
|
The dashboard runs as a standalone background process on port 4848, independent of browser sessions. It stays available even when no sessions are running, and it works from `http://localhost:4848` or a proxied/forwarded URL that reaches the dashboard server, such as `https://dashboard.agent-browser.localhost` or a Coder workspace URL. The browser stays on the dashboard origin; session-specific tabs, status, and stream traffic are proxied internally, so session ports do not need to be exposed.
|
|
756
769
|
|
|
757
770
|
The dashboard displays:
|
|
758
|
-
- **Live viewport
|
|
759
|
-
- **Activity feed
|
|
760
|
-
- **Console output
|
|
761
|
-
- **Session creation
|
|
762
|
-
- **AI Chat
|
|
771
|
+
- **Live viewport**: real-time JPEG frames from the browser
|
|
772
|
+
- **Activity feed**: chronological command/result stream with timing and expandable details
|
|
773
|
+
- **Console output**: browser console messages (log, warn, error)
|
|
774
|
+
- **Session creation**: create new sessions from the UI with local engines (Chrome, Lightpanda) or cloud providers (AgentCore, Browserbase, Browserless, Browser Use, Kernel)
|
|
775
|
+
- **AI Chat**: chat with an AI assistant directly in the dashboard (requires Vercel AI Gateway configuration)
|
|
763
776
|
|
|
764
777
|
### AI Chat
|
|
765
778
|
|
|
@@ -793,8 +806,8 @@ Create an `agent-browser.json` file to set persistent defaults instead of repeat
|
|
|
793
806
|
|
|
794
807
|
**Locations (lowest to highest priority):**
|
|
795
808
|
|
|
796
|
-
1. `~/.agent-browser/config.json
|
|
797
|
-
2. `./agent-browser.json
|
|
809
|
+
1. `~/.agent-browser/config.json`: user-level defaults
|
|
810
|
+
2. `./agent-browser.json`: project-level overrides (in working directory)
|
|
798
811
|
3. `AGENT_BROWSER_*` environment variables override config file values
|
|
799
812
|
4. CLI flags override everything
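Note on the priority list above: a minimal sketch of "last non-empty layer wins" resolution, assuming each layer contributes either a value or an empty string. The function `resolve_setting` is hypothetical and is not part of agent-browser.

```bash
#!/usr/bin/env bash
# Hypothetical resolver: later arguments have higher priority, mirroring
# user config < project config < environment variable < CLI flag.
# An empty string means "not set at this layer".
resolve_setting() {
  local value="" layer
  for layer in "$@"; do
    if [[ -n $layer ]]; then
      value=$layer
    fi
  done
  printf '%s\n' "$value"
}

# User config says true, project config says false, nothing else is set.
resolve_setting "true" "false" "" ""   # prints "false"
```

Testing for a non-empty string rather than truthiness matters here, because a value such as `false` for `hideScrollbars` is a legitimate override and must not be skipped.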
|
|
800
813
|
|
|
@@ -806,6 +819,7 @@ Create an `agent-browser.json` file to set persistent defaults instead of repeat
|
|
|
806
819
|
"proxy": "http://localhost:8080",
|
|
807
820
|
"profile": "./browser-data",
|
|
808
821
|
"userAgent": "my-agent/1.0",
|
|
822
|
+
"hideScrollbars": false,
|
|
809
823
|
"ignoreHttpsErrors": true
|
|
810
824
|
}
|
|
811
825
|
```
|
|
@@ -873,6 +887,10 @@ agent-browser get text @e1 # Get heading text
|
|
|
873
887
|
agent-browser hover @e4 # Hover the link
|
|
874
888
|
```
|
|
875
889
|
|
|
890
|
+
When a ref click is blocked by an overlay, the error includes the covering
|
|
891
|
+
element, such as `covered by <div#consent-banner>`. Click the banner or dialog
|
|
892
|
+
control first, then run `snapshot` again before reusing refs.
|
|
893
|
+
|
|
876
894
|
**Why use refs?**
|
|
877
895
|
|
|
878
896
|
- **Deterministic**: Ref points to exact element from snapshot
|
|
@@ -1229,7 +1247,7 @@ The daemon starts automatically on first command and persists between commands f
|
|
|
1229
1247
|
|
|
1230
1248
|
### Just ask the agent
|
|
1231
1249
|
|
|
1232
|
-
The simplest approach
|
|
1250
|
+
The simplest approach is to tell your agent to use it:
|
|
1233
1251
|
|
|
1234
1252
|
```
|
|
1235
1253
|
Use agent-browser to test the login flow. Run agent-browser --help to see available commands.
|
|
@@ -1245,7 +1263,7 @@ Add the skill to your AI coding assistant for richer context:
|
|
|
1245
1263
|
npx skills add vercel-labs/agent-browser
|
|
1246
1264
|
```
|
|
1247
1265
|
|
|
1248
|
-
This works with Claude Code, Codex, Cursor, Gemini CLI, GitHub Copilot, Goose, OpenCode, and Windsurf. The skill is fetched from the repository, so it stays up to date automatically
|
|
1266
|
+
This works with Claude Code, Codex, Cursor, Gemini CLI, GitHub Copilot, Goose, OpenCode, and Windsurf. The skill is fetched from the repository, so it stays up to date automatically. Do not copy `SKILL.md` from `node_modules` as it will become stale.
|
|
1249
1267
|
|
|
1250
1268
|
### Claude Code
|
|
1251
1269
|
|
|
@@ -1474,8 +1492,8 @@ Optional configuration via environment variables:
|
|
|
1474
1492
|
|
|
1475
1493
|
| Variable | Description | Default |
|
|
1476
1494
|
| ------------------------ | -------------------------------------------------------------------------------- | ------- |
|
|
1477
|
-
| `KERNEL_HEADLESS` | Run browser in headless mode (`true`/`false`) | `
|
|
1478
|
-
| `KERNEL_STEALTH` | Enable stealth mode to avoid bot detection (`true`/`false`) | `
|
|
1495
|
+
| `KERNEL_HEADLESS` | Run browser in headless mode (`true`/`false`) | `true` |
|
|
1496
|
+
| `KERNEL_STEALTH` | Enable stealth mode to avoid bot detection (`true`/`false`) | `false` |
|
|
1479
1497
|
| `KERNEL_TIMEOUT_SECONDS` | Session timeout in seconds | `300` |
|
|
1480
1498
|
| `KERNEL_PROFILE_NAME` | Browser profile name for persistent cookies/logins (created if it doesn't exist) | (none) |
|
|
1481
1499
|
|
|
Binary file
|
|
Binary file
|
|
Binary file
|
|
Binary file
|
|
Binary file
|
|
Binary file
|
|
Binary file
|
package/package.json
CHANGED
|
@@ -1,8 +1,13 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "agent-browser",
|
|
3
|
-
"version": "0.27.
|
|
3
|
+
"version": "0.27.2",
|
|
4
4
|
"description": "Browser automation CLI for AI agents",
|
|
5
5
|
"type": "module",
|
|
6
|
+
"packageManager": "pnpm@11.1.3",
|
|
7
|
+
"engines": {
|
|
8
|
+
"node": ">=24.0.0",
|
|
9
|
+
"pnpm": ">=11.0.0"
|
|
10
|
+
},
|
|
6
11
|
"files": [
|
|
7
12
|
"bin",
|
|
8
13
|
"scripts",
|
|
@@ -20,7 +25,7 @@
|
|
|
20
25
|
"build:macos": "npm run version:sync && (cargo build --release --manifest-path cli/Cargo.toml --target aarch64-apple-darwin & cargo build --release --manifest-path cli/Cargo.toml --target x86_64-apple-darwin & wait) && cp cli/target/aarch64-apple-darwin/release/agent-browser bin/agent-browser-darwin-arm64 && cp cli/target/x86_64-apple-darwin/release/agent-browser bin/agent-browser-darwin-x64",
|
|
21
26
|
"build:windows": "npm run version:sync && docker compose -f docker/docker-compose.yml run --rm build-windows",
|
|
22
27
|
"build:all-platforms": "npm run version:sync && (npm run build:linux & npm run build:windows & wait) && npm run build:macos",
|
|
23
|
-
"build:docker": "docker build -t agent-browser-builder -f docker/Dockerfile.build .",
|
|
28
|
+
"build:docker": "docker build --platform linux/amd64 -t agent-browser-builder -f docker/Dockerfile.build .",
|
|
24
29
|
"release": "npm run version:sync && npm run build:all-platforms && npm publish",
|
|
25
30
|
"postinstall": "node scripts/postinstall.js",
|
|
26
31
|
"build:dashboard": "cd packages/dashboard && pnpm build"
|
package/scripts/build-all-platforms.sh
CHANGED
|
@@ -1,5 +1,5 @@
|
|
|
1
1
|
#!/bin/bash
|
|
2
|
-
set -
|
|
2
|
+
set -euo pipefail
|
|
3
3
|
|
|
4
4
|
# Build agent-browser for all platforms using Docker
|
|
5
5
|
# Usage: ./scripts/build-all-platforms.sh
|
|
@@ -22,21 +22,32 @@ mkdir -p "$OUTPUT_DIR"
|
|
|
22
22
|
|
|
23
23
|
# Build the Docker image if needed
|
|
24
24
|
echo -e "${YELLOW}Building Docker cross-compilation image...${NC}"
|
|
25
|
-
docker build -t agent-browser-builder -f "$PROJECT_ROOT/docker/Dockerfile.build" "$PROJECT_ROOT"
|
|
25
|
+
docker build --platform linux/amd64 -t agent-browser-builder -f "$PROJECT_ROOT/docker/Dockerfile.build" "$PROJECT_ROOT"
|
|
26
26
|
|
|
27
27
|
# Function to build for a target
|
|
28
28
|
build_target() {
|
|
29
|
-
local
|
|
30
|
-
local
|
|
31
|
-
|
|
32
|
-
|
|
33
|
-
|
|
29
|
+
local rust_target=$1
|
|
30
|
+
local build_target=$2
|
|
31
|
+
local output_name=$3
|
|
32
|
+
|
|
33
|
+
echo -e "${YELLOW}Building for ${build_target}...${NC}"
|
|
34
|
+
|
|
35
|
+
rm -f "$OUTPUT_DIR/$output_name"
|
|
36
|
+
|
|
34
37
|
docker run --rm \
|
|
38
|
+
--platform linux/amd64 \
|
|
35
39
|
-v "$PROJECT_ROOT/cli:/build" \
|
|
36
40
|
-v "$OUTPUT_DIR:/output" \
|
|
37
41
|
agent-browser-builder \
|
|
38
|
-
-c "
|
|
39
|
-
|
|
42
|
+
-c "set -euo pipefail
|
|
43
|
+
cargo zigbuild --release --target ${build_target}
|
|
44
|
+
source_path=/build/target/${rust_target}/release/agent-browser
|
|
45
|
+
if [ -f \"\$source_path.exe\" ]; then
|
|
46
|
+
source_path=\"\$source_path.exe\"
|
|
47
|
+
fi
|
|
48
|
+
cp \"\$source_path\" /output/${output_name}
|
|
49
|
+
chmod +x /output/${output_name} 2>/dev/null || true"
|
|
50
|
+
|
|
40
51
|
if [ -f "$OUTPUT_DIR/$output_name" ]; then
|
|
41
52
|
echo -e "${GREEN}✓ Built ${output_name}${NC}"
|
|
42
53
|
else
|
|
@@ -47,25 +58,25 @@ build_target() {
|
|
|
47
58
|
|
|
48
59
|
# Build for each platform
|
|
49
60
|
# Linux x64
|
|
50
|
-
build_target "x86_64-unknown-linux-gnu" "agent-browser-linux-x64"
|
|
61
|
+
build_target "x86_64-unknown-linux-gnu" "x86_64-unknown-linux-gnu.2.28" "agent-browser-linux-x64"
|
|
51
62
|
|
|
52
63
|
# Linux ARM64
|
|
53
|
-
build_target "aarch64-unknown-linux-gnu" "agent-browser-linux-arm64"
|
|
64
|
+
build_target "aarch64-unknown-linux-gnu" "aarch64-unknown-linux-gnu.2.28" "agent-browser-linux-arm64"
|
|
54
65
|
|
|
55
66
|
# Windows x64
|
|
56
|
-
build_target "x86_64-pc-windows-gnu" "agent-browser-win32-x64.exe"
|
|
67
|
+
build_target "x86_64-pc-windows-gnu" "x86_64-pc-windows-gnu" "agent-browser-win32-x64.exe"
|
|
57
68
|
|
|
58
69
|
# macOS x64 (via zig for cross-compilation)
|
|
59
|
-
build_target "x86_64-apple-darwin" "agent-browser-darwin-x64"
|
|
70
|
+
build_target "x86_64-apple-darwin" "x86_64-apple-darwin" "agent-browser-darwin-x64"
|
|
60
71
|
|
|
61
72
|
# macOS ARM64 (via zig for cross-compilation)
|
|
62
|
-
build_target "aarch64-apple-darwin" "agent-browser-darwin-arm64"
|
|
73
|
+
build_target "aarch64-apple-darwin" "aarch64-apple-darwin" "agent-browser-darwin-arm64"
|
|
63
74
|
|
|
64
75
|
# Linux musl x64 (Alpine)
|
|
65
|
-
build_target "x86_64-unknown-linux-musl" "agent-browser-linux-musl-x64"
|
|
76
|
+
build_target "x86_64-unknown-linux-musl" "x86_64-unknown-linux-musl" "agent-browser-linux-musl-x64"
|
|
66
77
|
|
|
67
78
|
# Linux musl ARM64 (Alpine)
|
|
68
|
-
build_target "aarch64-unknown-linux-musl" "agent-browser-linux-musl-arm64"
|
|
79
|
+
build_target "aarch64-unknown-linux-musl" "aarch64-unknown-linux-musl" "agent-browser-linux-musl-arm64"
|
|
69
80
|
|
|
70
81
|
echo ""
|
|
71
82
|
echo -e "${GREEN}Build complete!${NC}"
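Note on the `build_target` change above: the function now takes both a plain Rust triple and a build target, presumably because cargo-zigbuild accepts a glibc-suffixed target (such as `.2.28`) while writing artifacts under the plain triple. The artifact lookup itself can be sketched in isolation as below. The helper name `resolve_artifact` and the temporary directories are illustrative assumptions, not part of the package.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper mirroring the lookup in the script above:
# prefer "<dir>/agent-browser.exe" when it exists, else "<dir>/agent-browser".
resolve_artifact() {
  local source_path="$1/agent-browser"
  if [ -f "$source_path.exe" ]; then
    source_path="$source_path.exe"
  fi
  printf '%s\n' "$source_path"
}

# Demonstration with throwaway directories (paths are illustrative only).
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/linux" "$tmp/windows"
touch "$tmp/linux/agent-browser" "$tmp/windows/agent-browser.exe"

resolve_artifact "$tmp/linux"     # path ends in /agent-browser
resolve_artifact "$tmp/windows"   # path ends in /agent-browser.exe
```

Checking for the `.exe` file at copy time keeps a single code path for all targets instead of special-casing the Windows triple by name.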
|
package/skill-data/core/SKILL.md
CHANGED
|
@@ -243,6 +243,9 @@ agent-browser screenshot --full full.png # full scroll height
|
|
|
243
243
|
agent-browser screenshot --annotate map.png # numbered labels + legend keyed to snapshot refs
|
|
244
244
|
```
|
|
245
245
|
|
|
246
|
+
Headless Chromium screenshots hide native scrollbars for consistent image output.
|
|
247
|
+
Pass `--hide-scrollbars false` when launching to keep native scrollbars visible.
|
|
248
|
+
|
|
246
249
|
`--annotate` is designed for multimodal models: each label `[N]` maps to ref `@eN`.
|
|
247
250
|
|
|
248
251
|
### Handle multiple pages via tabs
|
|
@@ -250,13 +253,11 @@ agent-browser screenshot --annotate map.png # numbered labels + legend keyed
|
|
|
250
253
|
```bash
|
|
251
254
|
agent-browser tab # list open tabs (with stable tabId)
|
|
252
255
|
agent-browser tab new https://docs... # open a new tab (and switch to it)
|
|
253
|
-
agent-browser tab
|
|
254
|
-
agent-browser tab close
|
|
256
|
+
agent-browser tab t2 # switch to tab t2
|
|
257
|
+
agent-browser tab close t2 # close tab t2
|
|
255
258
|
```
|
|
256
259
|
|
|
257
|
-
Stable `tabId`s mean `
|
|
258
|
-
when other tabs open or close. After switching, refs from a prior snapshot
|
|
259
|
-
on a different tab no longer apply — re-snapshot.
|
|
260
|
+
Stable `tabId`s mean `t2` points at the same tab across commands even when other tabs open or close. After switching, refs from a prior snapshot on a different tab no longer apply — re-snapshot.
|
|
260
261
|
|
|
261
262
|
### Run multiple browsers in parallel
|
|
262
263
|
|
|
@@ -287,8 +288,8 @@ agent-browser network har stop /tmp/trace.har
|
|
|
287
288
|
### Record a video of the workflow
|
|
288
289
|
|
|
289
290
|
```bash
|
|
290
|
-
agent-browser record start demo.webm
|
|
291
291
|
agent-browser open https://example.com
|
|
292
|
+
agent-browser record start demo.webm
|
|
292
293
|
agent-browser snapshot -i
|
|
293
294
|
agent-browser click @e3
|
|
294
295
|
agent-browser record stop
|
|
@@ -366,8 +367,9 @@ agent-browser snapshot -i
|
|
|
366
367
|
```
|
|
367
368
|
|
|
368
369
|
**Click does nothing / overlay swallows the click**
|
|
369
|
-
Some modals and cookie banners block other clicks.
|
|
370
|
-
|
|
370
|
+
Some modals and cookie banners block other clicks. If `click` reports
|
|
371
|
+
`covered by <...>`, interact with that covering element first. Otherwise,
|
|
372
|
+
snapshot, find the dismiss/close button, click it, then re-snapshot.
|
|
371
373
|
|
|
372
374
|
**Fill / type doesn't work**
|
|
373
375
|
Some custom input components intercept key events. Try:
|
|
@@ -444,7 +446,8 @@ agent-browser pushstate <url> # SPA navigation (auto-detects
|
|
|
444
446
|
```
|
|
445
447
|
|
|
446
448
|
Without `--enable react-devtools`, the `react …` commands error. `vitals`
|
|
447
|
-
and `pushstate` work on any site regardless of framework.
|
|
449
|
+
and `pushstate` work on any site regardless of framework. `vitals` prints a
|
|
450
|
+
summary by default; use `--json` for the full structured payload.
|
|
448
451
|
|
|
449
452
|
## Working safely
|
|
450
453
|
|
package/skill-data/core/references/commands.md
CHANGED
|
@@ -71,6 +71,11 @@ agent-browser drag @e1 @e2 # Drag and drop
|
|
|
71
71
|
agent-browser upload @e1 file.pdf # Upload files
|
|
72
72
|
```
|
|
73
73
|
|
|
74
|
+
Clicks fail before dispatch when another element covers the target's click
|
|
75
|
+
point. The error names the covering element, for example
|
|
76
|
+
`covered by <div#consent-banner>`. Dismiss or interact with that element, run a
|
|
77
|
+
fresh snapshot, then retry the original action.
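Note on the paragraph above: a script that wants to react to a blocked click could extract the covering element from the error text. The sketch below assumes only the documented `covered by <...>` fragment; the helper name `covering_element` and the rest of the sample message are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: pull the covering element out of a click error message.
# Only the "covered by <...>" fragment comes from the docs; the surrounding
# message format is an assumption.
covering_element() {
  local message=$1
  # Keep the regex in a variable so "<" and ">" need no escaping inside [[ ]].
  local re='covered by <([^>]+)>'
  if [[ $message =~ $re ]]; then
    printf '%s\n' "${BASH_REMATCH[1]}"
  else
    return 1
  fi
}

covering_element 'click blocked: covered by <div#consent-banner>'   # prints "div#consent-banner"
```

The function returns a non-zero status when the fragment is absent, so a caller can distinguish an overlay failure from any other click error before deciding to dismiss something and re-snapshot.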
|
|
78
|
+
|
|
74
79
|
## Get Information
|
|
75
80
|
|
|
76
81
|
```bash
|
|
@@ -103,9 +108,13 @@ agent-browser screenshot --full # Full page
|
|
|
103
108
|
agent-browser pdf output.pdf # Save as PDF
|
|
104
109
|
```
|
|
105
110
|
|
|
111
|
+
Headless Chromium screenshots hide native scrollbars for consistent image output.
|
|
112
|
+
Pass `--hide-scrollbars false` when launching to keep native scrollbars visible.
|
|
113
|
+
|
|
106
114
|
## Video Recording
|
|
107
115
|
|
|
108
116
|
```bash
|
|
117
|
+
agent-browser open https://example.com # Launch a browser session first
|
|
109
118
|
agent-browser record start ./demo.webm # Start recording
|
|
110
119
|
agent-browser click @e1 # Perform actions
|
|
111
120
|
agent-browser record stop # Stop and save video
|
|
@@ -300,7 +309,6 @@ agent-browser state load auth.json # Restore saved state
|
|
|
300
309
|
agent-browser --session <name> ... # Isolated browser session
|
|
301
310
|
agent-browser --json ... # JSON output for parsing
|
|
302
311
|
agent-browser --headed ... # Show browser window (not headless)
|
|
303
|
-
agent-browser --full ... # Full page screenshot (-f)
|
|
304
312
|
agent-browser --cdp <port> ... # Connect via Chrome DevTools Protocol
|
|
305
313
|
agent-browser -p <provider> ... # Cloud browser provider (--provider)
|
|
306
314
|
agent-browser --proxy <url> ... # Use proxy server
|
|
@@ -309,6 +317,7 @@ agent-browser --headers <json> ... # HTTP headers scoped to URL's origin
|
|
|
309
317
|
agent-browser --executable-path <p> # Custom browser executable
|
|
310
318
|
agent-browser --extension <path> ... # Load browser extension (repeatable)
|
|
311
319
|
agent-browser --ignore-https-errors # Ignore SSL certificate errors
|
|
320
|
+
agent-browser --hide-scrollbars false # Keep native scrollbars visible in headless Chromium screenshots
|
|
312
321
|
agent-browser --help # Show help (-h)
|
|
313
322
|
agent-browser --version # Show version (-V)
|
|
314
323
|
agent-browser <command> --help # Show detailed help for a command
|
|
@@ -327,7 +336,7 @@ agent-browser errors --clear # Clear errors
|
|
|
327
336
|
agent-browser highlight @e1 # Highlight element
|
|
328
337
|
agent-browser inspect # Open Chrome DevTools for this session
|
|
329
338
|
agent-browser trace start # Start recording trace
|
|
330
|
-
agent-browser trace stop trace.
|
|
339
|
+
agent-browser trace stop trace.json # Stop and save trace
|
|
331
340
|
agent-browser profiler start # Start Chrome DevTools profiling
|
|
332
341
|
agent-browser profiler stop trace.json # Stop and save profile
|
|
333
342
|
```
|
|
@@ -349,6 +358,9 @@ agent-browser vitals [url] [--json] # LCP/CLS/TTFB/FCP/INP + hyd
|
|
|
349
358
|
agent-browser pushstate <url> # SPA client-side nav (auto-detects Next router)
|
|
350
359
|
```
|
|
351
360
|
|
|
361
|
+
`vitals` prints a summary by default and uses the same fields as the structured
|
|
362
|
+
`--json` response.
|
|
363
|
+
|
|
352
364
|
## Init scripts
|
|
353
365
|
|
|
354
366
|
```bash
|
|
@@ -383,7 +395,9 @@ AGENT_BROWSER_EXECUTABLE_PATH="/path/chrome" # Custom browser path
|
|
|
383
395
|
AGENT_BROWSER_EXTENSIONS="/ext1,/ext2" # Comma-separated extension paths
|
|
384
396
|
AGENT_BROWSER_INIT_SCRIPTS="/a.js,/b.js" # Comma-separated init script paths
|
|
385
397
|
AGENT_BROWSER_ENABLE="react-devtools" # Comma-separated built-in init script features
|
|
398
|
+
AGENT_BROWSER_HIDE_SCROLLBARS="false" # Keep native scrollbars visible in headless Chromium screenshots
|
|
386
399
|
AGENT_BROWSER_PROVIDER="browserbase" # Cloud browser provider
|
|
387
400
|
AGENT_BROWSER_STREAM_PORT="9223" # Override WebSocket streaming port (default: OS-assigned)
|
|
388
|
-
|
|
401
|
+
AGENT_BROWSER_CONFIG="./agent-browser.json" # Custom config file
|
|
402
|
+
AGENT_BROWSER_CDP="9222" # Connect daemon to CDP port or WebSocket URL
|
|
389
403
|
```
|
package/skill-data/core/references/video-recording.md
CHANGED
|
@@ -16,11 +16,11 @@ Capture browser automation as video for debugging, documentation, or verificatio
|
|
|
16
16
|
## Basic Recording
|
|
17
17
|
|
|
18
18
|
```bash
|
|
19
|
-
#
|
|
19
|
+
# Launch the browser, then start recording
|
|
20
|
+
agent-browser open https://example.com
|
|
20
21
|
agent-browser record start ./demo.webm
|
|
21
22
|
|
|
22
23
|
# Perform actions
|
|
23
|
-
agent-browser open https://example.com
|
|
24
24
|
agent-browser snapshot -i
|
|
25
25
|
agent-browser click @e1
|
|
26
26
|
agent-browser fill @e2 "test input"
|
|
@@ -32,6 +32,9 @@ agent-browser record stop
|
|
|
32
32
|
## Recording Commands
|
|
33
33
|
|
|
34
34
|
```bash
|
|
35
|
+
# Launch a session first
|
|
36
|
+
agent-browser open
|
|
37
|
+
|
|
35
38
|
# Start recording to file
|
|
36
39
|
agent-browser record start ./output.webm
|
|
37
40
|
|
|
@@ -50,10 +53,9 @@ agent-browser record restart ./take2.webm
|
|
|
50
53
|
#!/bin/bash
|
|
51
54
|
# Record automation for debugging
|
|
52
55
|
|
|
53
|
-
agent-browser record start ./debug-$(date +%Y%m%d-%H%M%S).webm
|
|
54
|
-
|
|
55
56
|
# Run your automation
|
|
56
57
|
agent-browser open https://app.example.com
|
|
58
|
+
agent-browser record start ./debug-$(date +%Y%m%d-%H%M%S).webm
|
|
57
59
|
agent-browser snapshot -i
|
|
58
60
|
agent-browser click @e1 || {
|
|
59
61
|
echo "Click failed - check recording"
|
|
@@ -70,9 +72,8 @@ agent-browser record stop
|
|
|
70
72
|
#!/bin/bash
|
|
71
73
|
# Record workflow for documentation
|
|
72
74
|
|
|
73
|
-
agent-browser record start ./docs/how-to-login.webm
|
|
74
|
-
|
|
75
75
|
agent-browser open https://app.example.com/login
|
|
76
|
+
agent-browser record start ./docs/how-to-login.webm
|
|
76
77
|
agent-browser wait 1000 # Pause for visibility
|
|
77
78
|
|
|
78
79
|
agent-browser snapshot -i
|
|
@@ -99,6 +100,7 @@ TEST_NAME="${1:-e2e-test}"
|
|
|
99
100
|
RECORDING_DIR="./test-recordings"
|
|
100
101
|
mkdir -p "$RECORDING_DIR"
|
|
101
102
|
|
|
103
|
+
agent-browser open
|
|
102
104
|
agent-browser record start "$RECORDING_DIR/$TEST_NAME-$(date +%s).webm"
|
|
103
105
|
|
|
104
106
|
# Run test
|
|
@@ -141,6 +143,7 @@ cleanup() {
|
|
|
141
143
|
}
|
|
142
144
|
trap cleanup EXIT
|
|
143
145
|
|
|
146
|
+
agent-browser open
|
|
144
147
|
agent-browser record start ./automation.webm
|
|
145
148
|
# ... automation steps ...
|
|
146
149
|
```
|
|
@@ -149,9 +152,8 @@ agent-browser record start ./automation.webm
|
|
|
149
152
|
|
|
150
153
|
```bash
|
|
151
154
|
# Record video AND capture key frames
|
|
152
|
-
agent-browser record start ./flow.webm
|
|
153
|
-
|
|
154
155
|
agent-browser open https://example.com
|
|
156
|
+
agent-browser record start ./flow.webm
|
|
155
157
|
agent-browser screenshot ./screenshots/step1-homepage.png
|
|
156
158
|
|
|
157
159
|
agent-browser click @e1
|
package/bin/.install-method
DELETED
|
@@ -1 +0,0 @@
|
|
|
1
|
-
pnpm
|