agent-browser-stealth 0.14.0-fork.3 → 0.14.0-fork.5
- package/README.md +136 -1150
- package/bin/agent-browser-darwin-arm64 +0 -0
- package/bin/agent-browser-darwin-x64 +0 -0
- package/bin/agent-browser-linux-arm64 +0 -0
- package/bin/agent-browser-linux-x64 +0 -0
- package/bin/agent-browser-win32-x64.exe +0 -0
- package/dist/actions.d.ts +5 -1
- package/dist/actions.d.ts.map +1 -1
- package/dist/actions.js +118 -69
- package/dist/actions.js.map +1 -1
- package/dist/browser.d.ts +1 -0
- package/dist/browser.d.ts.map +1 -1
- package/dist/browser.js +35 -1
- package/dist/browser.js.map +1 -1
- package/dist/protocol.d.ts.map +1 -1
- package/dist/protocol.js +2 -0
- package/dist/protocol.js.map +1 -1
- package/dist/types.d.ts +10 -0
- package/dist/types.d.ts.map +1 -1
- package/package.json +4 -1
- package/scripts/clawhub-sync.sh +27 -0
- package/skills/agent-browser/SKILL.md +35 -14
- package/skills/agent-browser-stealth/SKILL.md +127 -0
- package/skills/dogfood/SKILL.md +216 -0
- package/skills/dogfood/references/issue-taxonomy.md +109 -0
- package/skills/dogfood/templates/dogfood-report-template.md +53 -0
package/README.md
CHANGED
@@ -1,1213 +1,199 @@
-# agent-browser
+# agent-browser-stealth
 
-Stealth-first browser
+Stealth-first fork of `agent-browser` for production browser automation under anti-bot pressure.
 
-
-- Always-on stealth (no opt-in flag)
-- Browser and protocol-level anti-fingerprint patches
-- Humanized interaction behavior by default
-- Verified against CreepJS using the built-in check script
+This README focuses on stealth architecture and principles. For full command coverage inherited from upstream, use:
 
-
+- upstream docs: <https://github.com/vercel-labs/agent-browser>
+- local help: `agent-browser --help`
 
-
+## What This Fork Optimizes
 
-
-
-
-
-
-```
-
-This is the fastest option -- commands run through the native Rust CLI directly with sub-millisecond parsing overhead.
-
-### Quick Start (no install)
-
-Run directly with `npx` if you want to try it without installing globally:
-
-```bash
-npx agent-browser-stealth install # Download Chromium (first time only)
-npx agent-browser-stealth open example.com
-```
-
-> **Note:** `npx` routes through Node.js before reaching the Rust CLI, so it is noticeably slower than a global install. For regular use, install globally.
-
-### Project Installation (local dependency)
-
-For projects that want to pin the version in `package.json`:
-
-```bash
-npm install agent-browser-stealth
-npx agent-browser-stealth install
-```
-
-Then use via `npx` or `package.json` scripts:
-
-```bash
-npx agent-browser-stealth open example.com
-```
-
-### Homebrew (macOS)
-
-```bash
-brew install agent-browser
-agent-browser install # Download Chromium
-```
-
-### From Source
-
-```bash
-git clone https://github.com/leeguooooo/agent-browser
-cd agent-browser
-pnpm install
-pnpm build
-pnpm build:native # Requires Rust (https://rustup.rs)
-pnpm link --global # Makes agent-browser available globally
-agent-browser install
-```
-
-### Fork Maintenance (Independent Release + Upstream Sync)
-
-If you maintain a fork and publish your own CLI, use this workflow:
-
-1. Keep an upstream-tracking branch (`upstream-main`) for clean sync history.
-2. Keep your release branch (`main`) for production-ready code only.
-3. Merge upstream into short-lived sync branches, then open PRs into `main`.
-
-One-time setup:
-
-```bash
-git remote add upstream https://github.com/vercel-labs/agent-browser.git
-git fetch upstream
-```
-
-Regular sync:
-
-```bash
-pnpm run sync:upstream:push
-```
-
-This command:
-- Fetches `upstream/main`
-- Fast-forwards local `upstream-main`
-- Creates `sync/YYYY-MM-DD` from local `main`
-- Merges `upstream-main` into the sync branch
-- Pushes the sync branch to `origin` (with `sync:upstream:push`)
-
-If merge conflicts occur, resolve them on the sync branch and open a PR as usual.
-
-Independent release checklist for forks:
-- Use your own npm package name and CLI binary name (avoid conflicts with upstream package ownership).
-- Update `repository`, `bugs`, and `homepage` in `package.json` to your fork.
-- Configure npm Trusted Publishing (OIDC) for your package and repository workflow.
-- Keep release tags and changelog in your own namespace/versioning policy.
-- Use dual-version format: `<upstream>-fork.<fork>` (example: `0.14.0-fork.1`).
-- `agent-browser --version` should show all three: full version, upstream version, and fork version.
-
-### Linux Dependencies
-
-On Linux, install system dependencies:
-
-```bash
-agent-browser install --with-deps
-# or manually: npx playwright install-deps chromium
-```
+- Stealth is always on (legacy `launch.stealth` is accepted but ignored).
+- Fingerprint surfaces are patched at multiple layers (launch args, CDP overrides, init scripts).
+- Behavioral signals are humanized (typing cadence, cursor path, pacing, retry backoff).
+- Region signals are auto-aligned (locale/timezone/Accept-Language) to reduce mismatch risk.
+- Verification/captcha handling is policy-driven (`--risk-mode off|warn|block`).
 
 ## Quick Start
 
-
-agent-browser open example.com
-agent-browser snapshot # Get accessibility tree with refs
-agent-browser click @e2 # Click by ref from snapshot
-agent-browser fill @e3 "test@example.com" # Fill by ref
-agent-browser get text @e1 # Get text by ref
-agent-browser screenshot page.png
-agent-browser --version # Includes upstream/fork metadata on fork builds
-agent-browser close
-```
-
-### Traditional Selectors (also supported)
-
-```bash
-agent-browser click "#submit"
-agent-browser fill "#email" "test@example.com"
-agent-browser find role button click --name "Submit"
-```
-
-## Commands
-
-### Core Commands
-
-```bash
-agent-browser open <url> # Navigate to URL (aliases: goto, navigate)
-agent-browser click <sel> # Click element (--new-tab to open in new tab)
-agent-browser dblclick <sel> # Double-click element
-agent-browser focus <sel> # Focus element
-agent-browser type <sel> <text> [--delay <ms>] # Type into element
-agent-browser fill <sel> <text> # Clear and fill
-agent-browser press <key> # Press key (Enter, Tab, Control+a) (alias: key)
-agent-browser keyboard type <text> [--delay <ms>] # Type with real keystrokes (no selector, current focus)
-agent-browser keyboard inserttext <text> # Insert text without key events (no selector)
-agent-browser keydown <key> # Hold key down
-agent-browser keyup <key> # Release key
-agent-browser hover <sel> # Hover element
-agent-browser select <sel> <val> # Select dropdown option
-agent-browser check <sel> # Check checkbox
-agent-browser uncheck <sel> # Uncheck checkbox
-agent-browser scroll <dir> [px] # Scroll (up/down/left/right)
-agent-browser scrollintoview <sel> # Scroll element into view (alias: scrollinto)
-agent-browser drag <src> <tgt> # Drag and drop
-agent-browser upload <sel> <files> # Upload files
-agent-browser screenshot [path] # Take screenshot (--full for full page, saves to a temporary directory if no path)
-agent-browser screenshot --annotate # Annotated screenshot with numbered element labels
-agent-browser pdf <path> # Save as PDF
-agent-browser snapshot # Accessibility tree with refs (best for AI)
-agent-browser eval <js> # Run JavaScript (-b for base64, --stdin for piped input)
-agent-browser connect <port> # Connect to browser via CDP
-agent-browser close # Close browser (aliases: quit, exit)
-```
-
-### Get Info
-
-```bash
-agent-browser get text <sel> # Get text content
-agent-browser get html <sel> # Get innerHTML
-agent-browser get value <sel> # Get input value
-agent-browser get attr <sel> <attr> # Get attribute
-agent-browser get title # Get page title
-agent-browser get url # Get current URL
-agent-browser get count <sel> # Count matching elements
-agent-browser get box <sel> # Get bounding box
-agent-browser get styles <sel> # Get computed styles
-```
-
-### Check State
-
-```bash
-agent-browser is visible <sel> # Check if visible
-agent-browser is enabled <sel> # Check if enabled
-agent-browser is checked <sel> # Check if checked
-```
-
-### Find Elements (Semantic Locators)
-
-```bash
-agent-browser find role <role> <action> [value] # By ARIA role
-agent-browser find text <text> <action> # By text content
-agent-browser find label <label> <action> [value] # By label
-agent-browser find placeholder <ph> <action> [value] # By placeholder
-agent-browser find alt <text> <action> # By alt text
-agent-browser find title <text> <action> # By title attr
-agent-browser find testid <id> <action> [value] # By data-testid
-agent-browser find first <sel> <action> [value] # First match
-agent-browser find last <sel> <action> [value] # Last match
-agent-browser find nth <n> <sel> <action> [value] # Nth match
-```
-
-**Actions:** `click`, `fill`, `type`, `hover`, `focus`, `check`, `uncheck`, `text`
-
-**Options:** `--name <name>` (filter role by accessible name), `--exact` (require exact text match)
-
-**Examples:**
-```bash
-agent-browser find role button click --name "Submit"
-agent-browser find text "Sign In" click
-agent-browser find label "Email" fill "test@test.com"
-agent-browser find first ".item" click
-agent-browser find nth 2 "a" text
-```
-
-### Wait
-
-```bash
-agent-browser wait <selector> # Wait for element to be visible
-agent-browser wait <ms> # Wait for time (milliseconds)
-agent-browser wait 2000-5000 # Random wait between 2-5 seconds
-agent-browser wait --text "Welcome" # Wait for text to appear
-agent-browser wait --url "**/dash" # Wait for URL pattern
-agent-browser wait --load networkidle # Wait for load state
-agent-browser wait --fn "window.ready === true" # Wait for JS condition
-```
-
-**Load states:** `load`, `domcontentloaded`, `networkidle`
-
-### Mouse Control
-
-```bash
-agent-browser mouse move <x> <y> # Move mouse
-agent-browser mouse down [button] # Press button (left/right/middle)
-agent-browser mouse up [button] # Release button
-agent-browser mouse wheel <dy> [dx] # Scroll wheel
-```
-
-### Browser Settings
-
-```bash
-agent-browser set viewport <w> <h> # Set viewport size
-agent-browser set device <name> # Emulate device ("iPhone 14")
-agent-browser set geo <lat> <lng> # Set geolocation
-agent-browser set offline [on|off] # Toggle offline mode
-agent-browser set headers <json> # Extra HTTP headers
-agent-browser set credentials <u> <p> # HTTP basic auth
-agent-browser set media [dark|light] # Emulate color scheme
-```
-
-### Cookies & Storage
-
-```bash
-agent-browser cookies # Get all cookies
-agent-browser cookies set <name> <val> # Set cookie
-agent-browser cookies clear # Clear cookies
-
-agent-browser storage local # Get all localStorage
-agent-browser storage local <key> # Get specific key
-agent-browser storage local set <k> <v> # Set value
-agent-browser storage local clear # Clear all
-
-agent-browser storage session # Same for sessionStorage
-```
-
-### Network
-
-```bash
-agent-browser network route <url> # Intercept requests
-agent-browser network route <url> --abort # Block requests
-agent-browser network route <url> --body <json> # Mock response
-agent-browser network unroute [url] # Remove routes
-agent-browser network requests # View tracked requests
-agent-browser network requests --filter api # Filter requests
-```
-
-### Tabs & Windows
-
-```bash
-agent-browser tab # List tabs
-agent-browser tab new [url] # New tab (optionally with URL)
-agent-browser tab <n> # Switch to tab n
-agent-browser tab close [n] # Close tab
-agent-browser window new # New window
-```
-
-### Frames
-
-```bash
-agent-browser frame <sel> # Switch to iframe
-agent-browser frame main # Back to main frame
-```
-
-### Dialogs
-
-```bash
-agent-browser dialog accept [text] # Accept (with optional prompt text)
-agent-browser dialog dismiss # Dismiss
-```
-
-### Diff
-
-```bash
-agent-browser diff snapshot # Compare current vs last snapshot
-agent-browser diff snapshot --baseline before.txt # Compare current vs saved snapshot file
-agent-browser diff snapshot --selector "#main" --compact # Scoped snapshot diff
-agent-browser diff screenshot --baseline before.png # Visual pixel diff against baseline
-agent-browser diff screenshot --baseline b.png -o d.png # Save diff image to custom path
-agent-browser diff screenshot --baseline b.png -t 0.2 # Adjust color threshold (0-1)
-agent-browser diff url https://v1.com https://v2.com # Compare two URLs (snapshot diff)
-agent-browser diff url https://v1.com https://v2.com --screenshot # Also visual diff
-agent-browser diff url https://v1.com https://v2.com --wait-until networkidle # Custom wait strategy
-agent-browser diff url https://v1.com https://v2.com --selector "#main" # Scope to element
-```
-
-### Debug
-
-```bash
-agent-browser trace start [path] # Start recording trace
-agent-browser trace stop [path] # Stop and save trace
-agent-browser profiler start # Start Chrome DevTools profiling
-agent-browser profiler stop [path] # Stop and save profile (.json)
-agent-browser console # View console messages (log, error, warn, info)
-agent-browser console --clear # Clear console
-agent-browser errors # View page errors (uncaught JavaScript exceptions)
-agent-browser errors --clear # Clear errors
-agent-browser highlight <sel> # Highlight element
-agent-browser state save <path> # Save auth state
-agent-browser state load <path> # Load auth state
-agent-browser state list # List saved state files
-agent-browser state show <file> # Show state summary
-agent-browser state rename <old> <new> # Rename state file
-agent-browser state clear [name] # Clear states for session
-agent-browser state clear --all # Clear all saved states
-agent-browser state clean --older-than <days> # Delete old states
-```
-
-### Navigation
-
-```bash
-agent-browser back # Go back
-agent-browser forward # Go forward
-agent-browser reload # Reload page
-```
-
-### Setup
-
-```bash
-agent-browser install # Download Chromium browser
-agent-browser install --with-deps # Also install system deps (Linux)
-```
-
-## Sessions
-
-Run multiple isolated browser instances:
-
-```bash
-# Different sessions
-agent-browser --session agent1 open site-a.com
-agent-browser --session agent2 open site-b.com
-
-# Or via environment variable
-AGENT_BROWSER_SESSION=agent1 agent-browser click "#btn"
-
-# List active sessions
-agent-browser session list
-# Output:
-# Active sessions:
-# -> default
-# agent1
-
-# Show current session
-agent-browser session
-```
-
-Each session has its own:
-- Browser instance
-- Cookies and storage
-- Navigation history
-- Authentication state
-
-## Session Persistence
-
-Use `--session-name` to automatically save and restore cookies and localStorage across browser restarts:
-
-```bash
-# Auto-save/load state for "twitter" session
-agent-browser --session-name twitter open twitter.com
-
-# Login once, then state persists automatically
-# State files stored in ~/.agent-browser/sessions/
-
-# Or via environment variable
-export AGENT_BROWSER_SESSION_NAME=twitter
-agent-browser open twitter.com
-```
-
-### State Encryption
-
-Encrypt saved session data at rest with AES-256-GCM:
-
-```bash
-# Generate key: openssl rand -hex 32
-export AGENT_BROWSER_ENCRYPTION_KEY=<64-char-hex-key>
-
-# State files are now encrypted automatically
-agent-browser --session-name secure open example.com
-```
-
-| Variable | Description |
-|----------|-------------|
-| `AGENT_BROWSER_SESSION_NAME` | Auto-save/load state persistence name |
-| `AGENT_BROWSER_ENCRYPTION_KEY` | 64-char hex key for AES-256-GCM encryption |
-| `AGENT_BROWSER_STATE_EXPIRE_DAYS` | Auto-delete states older than N days (default: 30) |
-
-## Snapshot Options
-
-The `snapshot` command supports filtering to reduce output size:
-
-```bash
-agent-browser snapshot # Full accessibility tree
-agent-browser snapshot -i # Interactive elements only (buttons, inputs, links)
-agent-browser snapshot -i -C # Include cursor-interactive elements (divs with onclick, etc.)
-agent-browser snapshot -c # Compact (remove empty structural elements)
-agent-browser snapshot -d 3 # Limit depth to 3 levels
-agent-browser snapshot -s "#main" # Scope to CSS selector
-agent-browser snapshot -i -c -d 5 # Combine options
-```
-
-| Option | Description |
-|--------|-------------|
-| `-i, --interactive` | Only show interactive elements (buttons, links, inputs) |
-| `-C, --cursor` | Include cursor-interactive elements (cursor:pointer, onclick, tabindex) |
-| `-c, --compact` | Remove empty structural elements |
-| `-d, --depth <n>` | Limit tree depth |
-| `-s, --selector <sel>` | Scope to CSS selector |
-
-The `-C` flag is useful for modern web apps that use custom clickable elements (divs, spans) instead of standard buttons/links.
-
-## Annotated Screenshots
-
-The `--annotate` flag overlays numbered labels on interactive elements in the screenshot. Each label `[N]` corresponds to ref `@eN`, so the same refs work for both visual and text-based workflows.
-
-```bash
-agent-browser screenshot --annotate
-# -> Screenshot saved to /tmp/screenshot-2026-02-17T12-00-00-abc123.png
-# [1] @e1 button "Submit"
-# [2] @e2 link "Home"
-# [3] @e3 textbox "Email"
-```
-
-After an annotated screenshot, refs are cached so you can immediately interact with elements:
-
-```bash
-agent-browser screenshot --annotate ./page.png
-agent-browser click @e2 # Click the "Home" link labeled [2]
-```
-
-This is useful for multimodal AI models that can reason about visual layout, unlabeled icon buttons, canvas elements, or visual state that the text accessibility tree cannot capture.
-
-## Options
-
-| Option | Description |
-|--------|-------------|
-| `--session <name>` | Use isolated session (or `AGENT_BROWSER_SESSION` env) |
-| `--session-name <name>` | Auto-save/restore session state (or `AGENT_BROWSER_SESSION_NAME` env) |
-| `--state <path>` | Load storage state from JSON file (or `AGENT_BROWSER_STATE` env) |
-| `--headers <json>` | Set HTTP headers scoped to the URL's origin |
-| `--executable-path <path>` | Custom browser executable (or `AGENT_BROWSER_EXECUTABLE_PATH` env) |
-| `--extension <path>` | Load browser extension (repeatable; or `AGENT_BROWSER_EXTENSIONS` env) |
-| `--args <args>` | Browser launch args, comma or newline separated (or `AGENT_BROWSER_ARGS` env) |
-| `--user-agent <ua>` | Custom User-Agent string (or `AGENT_BROWSER_USER_AGENT` env) |
-| `--proxy <url>` | Proxy server URL with optional auth (or `AGENT_BROWSER_PROXY` env) |
-| `--proxy-bypass <hosts>` | Hosts to bypass proxy (or `AGENT_BROWSER_PROXY_BYPASS` env) |
-| `--ignore-https-errors` | Ignore HTTPS certificate errors (useful for self-signed certs) |
-| `--allow-file-access` | Allow file:// URLs to access local files (Chromium only) |
-| `-p, --provider <name>` | Cloud browser provider (or `AGENT_BROWSER_PROVIDER` env) |
-| `--device <name>` | iOS device name, e.g. "iPhone 15 Pro" (or `AGENT_BROWSER_IOS_DEVICE` env) |
-| `--json` | JSON output (for agents) |
-| `--full, -f` | Full page screenshot |
-| `--annotate` | Annotated screenshot with numbered element labels (or `AGENT_BROWSER_ANNOTATE` env) |
-| `--headed` | Show browser window (not headless) |
-| `--cdp <port\|url>` | Connect via Chrome DevTools Protocol (port or WebSocket URL) |
-| `--auto-connect` | Auto-discover and connect to running Chrome (or `AGENT_BROWSER_AUTO_CONNECT` env) |
-| `--color-scheme <scheme>` | Color scheme: `dark`, `light`, `no-preference` (or `AGENT_BROWSER_COLOR_SCHEME` env) |
-| `--config <path>` | Use a custom config file (or `AGENT_BROWSER_CONFIG` env) |
-| `--debug` | Debug output |
-
-Project policy:
-- `--profile` / `AGENT_BROWSER_PROFILE` are forbidden
-- `--channel` / `AGENT_BROWSER_CHANNEL` are forbidden
-- Default mode must connect to an existing browser at `localhost:9333` (no automatic local-launch fallback)
-
-## Configuration
-
-Create an `agent-browser.json` file to set persistent defaults instead of repeating flags on every command.
-
-**Locations (lowest to highest priority):**
-
-1. `~/.agent-browser/config.json` -- user-level defaults
-2. `./agent-browser.json` -- project-level overrides (in working directory)
-3. `AGENT_BROWSER_*` environment variables override config file values
-4. CLI flags override everything
-
-**Example `agent-browser.json`:**
-
-```json
-{
-"headed": true,
-"proxy": "http://localhost:8080",
-"userAgent": "my-agent/1.0",
-"ignoreHttpsErrors": true
-}
-```
-
-Use `--config <path>` or `AGENT_BROWSER_CONFIG` to load a specific config file instead of the defaults:
+### Install
 
 ```bash
-agent-browser
-
-```
-
-All options from the table above can be set in the config file using camelCase keys (e.g., `--executable-path` becomes `"executablePath"`, `--proxy-bypass` becomes `"proxyBypass"`). Unknown keys are ignored for forward compatibility.
-
-Boolean flags accept an optional `true`/`false` value to override config settings. For example, `--headed false` disables `"headed": true` from config. A bare `--headed` is equivalent to `--headed true`.
-
-Auto-discovered config files that are missing are silently ignored. If `--config <path>` points to a missing or invalid file, agent-browser exits with an error. Extensions from user and project configs are merged (concatenated), not replaced.
-
-> **Tip:** If your project-level `agent-browser.json` contains environment-specific values (paths, proxies), consider adding it to `.gitignore`.
-
-## Default Timeout
-
-The default Playwright timeout for standard operations (clicks, waits, fills, etc.) is 25 seconds. This is intentionally below the CLI's 30-second IPC read timeout so that Playwright returns a proper error instead of the CLI timing out with EAGAIN.
-
-Override the default timeout via environment variable:
-
-```bash
-# Set a longer timeout for slow pages (in milliseconds)
-export AGENT_BROWSER_DEFAULT_TIMEOUT=45000
-```
-
-> **Note:** Setting this above 30000 (30s) may cause EAGAIN errors on slow operations because the CLI's read timeout will expire before Playwright responds. The CLI retries transient errors automatically, but response times will increase.
-
-| Variable | Description |
-|----------|-------------|
-| `AGENT_BROWSER_DEFAULT_TIMEOUT` | Default Playwright timeout in ms (default: 25000) |
-
-## Selectors
-
-### Refs (Recommended for AI)
-
-Refs provide deterministic element selection from snapshots:
-
-```bash
-# 1. Get snapshot with refs
-agent-browser snapshot
-# Output:
-# - heading "Example Domain" [ref=e1] [level=1]
-# - button "Submit" [ref=e2]
-# - textbox "Email" [ref=e3]
-# - link "Learn more" [ref=e4]
-
-# 2. Use refs to interact
-agent-browser click @e2 # Click the button
-agent-browser fill @e3 "test@example.com" # Fill the textbox
-agent-browser get text @e1 # Get heading text
-agent-browser hover @e4 # Hover the link
-```
-
-**Why use refs?**
-- **Deterministic**: Ref points to exact element from snapshot
-- **Fast**: No DOM re-query needed
-- **AI-friendly**: Snapshot + ref workflow is optimal for LLMs
-
-### CSS Selectors
-
-```bash
-agent-browser click "#id"
-agent-browser click ".class"
-agent-browser click "div > button"
-```
-
-### Text & XPath
-
-```bash
-agent-browser click "text=Submit"
-agent-browser click "xpath=//button"
-```
-
-### Semantic Locators
-
-```bash
-agent-browser find role button click --name "Submit"
-agent-browser find label "Email" fill "test@test.com"
-```
-
-## Agent Mode
-
-Use `--json` for machine-readable output:
-
-```bash
-agent-browser snapshot --json
-# Returns: {"success":true,"data":{"snapshot":"...","refs":{"e1":{"role":"heading","name":"Title"},...}}}
-
-agent-browser get text @e1 --json
-agent-browser is visible @e2 --json
-```
-
-### Optimal AI Workflow
-
-```bash
-# 1. Navigate and get snapshot
-agent-browser open example.com
-agent-browser snapshot -i --json # AI parses tree and refs
-
-# 2. AI identifies target refs from snapshot
-# 3. Execute actions using refs
-agent-browser click @e2
-agent-browser fill @e3 "input text"
-
-# 4. Get new snapshot if page changed
-agent-browser snapshot -i --json
-```
-
-### Command Chaining
-
-Commands can be chained with `&&` in a single shell invocation. The browser persists via a background daemon, so chaining is safe and more efficient:
-
-```bash
-# Open, wait for load, and snapshot in one call
-agent-browser open example.com && agent-browser wait --load networkidle && agent-browser snapshot -i
-
-# Chain multiple interactions
-agent-browser fill @e1 "user@example.com" && agent-browser fill @e2 "pass" && agent-browser click @e3
-
-# Navigate and screenshot
-agent-browser open example.com && agent-browser wait --load networkidle && agent-browser screenshot page.png
-```
-
-Use `&&` when you don't need intermediate output. Run commands separately when you need to parse output first (e.g., snapshot to discover refs before interacting).
-
-## Headed Mode
-
-Show the browser window for debugging:
-
-```bash
-agent-browser open example.com --headed
+npm install -g agent-browser-stealth
+agent-browser install
 ```
 
-
-
-## Authenticated Sessions
-
-Use `--headers` to set HTTP headers for a specific origin, enabling authentication without login flows:
+### Minimal Usage
 
 ```bash
-
-agent-browser
-
-# Requests to api.example.com include the auth header
-agent-browser snapshot -i --json
+agent-browser open https://example.com
+agent-browser snapshot -i
 agent-browser click @e2
-
-# Navigate to another domain - headers are NOT sent (safe!)
-agent-browser open other-site.com
-```
-
-This is useful for:
-- **Skipping login flows** - Authenticate via headers instead of UI
-- **Switching users** - Start new sessions with different auth tokens
-- **API testing** - Access protected endpoints directly
-- **Security** - Headers are scoped to the origin, not leaked to other domains
-
-To set headers for multiple origins, use `--headers` with each `open` command:
-
-```bash
-agent-browser open api.example.com --headers '{"Authorization": "Bearer token1"}'
-agent-browser open api.acme.com --headers '{"Authorization": "Bearer token2"}'
-```
-
-For global headers (all domains), use `set headers`:
-
-```bash
-agent-browser set headers '{"X-Custom-Header": "value"}'
-```
-
-## Custom Browser Executable
-
-Use a custom browser executable instead of the bundled Chromium. This is useful for:
|
|
694
|
-
- **Serverless deployment**: Use lightweight Chromium builds like `@sparticuz/chromium` (~50MB vs ~684MB)
|
|
695
|
-
- **System browsers**: Use an existing Chrome/Chromium installation
|
|
696
|
-
- **Custom builds**: Use modified browser builds
|
|
697
|
-
|
|
698
|
-
### CLI Usage
|
|
699
|
-
|
|
700
|
-
```bash
|
|
701
|
-
# Via flag
|
|
702
|
-
agent-browser --executable-path /path/to/chromium open example.com
|
|
703
|
-
|
|
704
|
-
# Via environment variable
|
|
705
|
-
AGENT_BROWSER_EXECUTABLE_PATH=/path/to/chromium agent-browser open example.com
|
|
706
33
|
```
|
|
707
34
|
|
|
708
|
-
|
|
709
|
-
|
|
710
|
-
```
|
|
711
|
-
|
|
712
|
-
|
|
713
|
-
|
|
714
|
-
|
|
715
|
-
|
|
716
|
-
|
|
717
|
-
|
|
718
|
-
|
|
719
|
-
|
|
720
|
-
|
|
721
|
-
|
|
35
|
+
## Stealth Architecture
|
|
36
|
+
|
|
37
|
+
```mermaid
|
|
38
|
+
flowchart TD
|
|
39
|
+
A["Command Input"] --> B["Stealth Policy Resolver"]
|
|
40
|
+
B --> C["Connection Mode Detection"]
|
|
41
|
+
C --> D["Launch Layer: Chromium Args"]
|
|
42
|
+
C --> E["CDP Layer: UA + Metadata Override"]
|
|
43
|
+
C --> F["Context Layer: Init Script Patches"]
|
|
44
|
+
D --> G["Behavior Layer: Humanized Interaction"]
|
|
45
|
+
E --> G
|
|
46
|
+
F --> G
|
|
47
|
+
G --> H["Risk Layer: Verification Detection and Handling"]
|
|
48
|
+
H --> I["Response with warnings and riskSignals"]
|
|
722
49
|
```
|
|
723
50
|
|
|
724
|
-
|
|
51
|
+
### Policy by Connection Mode
|
|
725
52
|
|
|
726
|
-
|
|
53
|
+
| Mode | Stealth Capabilities | Notes |
|
|
54
|
+
|---|---|---|
|
|
55
|
+
| Local Chromium launch | Chromium launch args + CDP UA override + context init scripts | Most complete stack |
|
|
56
|
+
| Existing browser via CDP | CDP UA override + context init scripts | No local Chromium arg injection |
|
|
57
|
+
| Cloud provider (browserbase/browseruse) | Context init scripts | Remote browser runtime controls launch layer |
|
|
58
|
+
| Kernel provider | Context init scripts + provider-managed stealth | Provider-side stealth may also apply |
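The table above can be read as a lookup from connection mode to the stealth layers that apply. The sketch below encodes it that way; the mode and layer identifiers are hypothetical labels chosen for this example and are not the package's internal names.

```typescript
// Hypothetical identifiers for the four modes and the layers named in the table.
type ConnectionMode = "local-launch" | "existing-cdp" | "cloud-provider" | "kernel-provider";
type StealthLayer = "launch-args" | "cdp-ua-override" | "init-scripts" | "provider-managed";

// One row per table row: which layers the runtime can apply in each mode.
const STEALTH_LAYERS: Record<ConnectionMode, readonly StealthLayer[]> = {
  "local-launch": ["launch-args", "cdp-ua-override", "init-scripts"],
  "existing-cdp": ["cdp-ua-override", "init-scripts"],
  "cloud-provider": ["init-scripts"],
  "kernel-provider": ["init-scripts", "provider-managed"],
};

// True when the given layer is available for the given mode.
function appliesLayer(mode: ConnectionMode, layer: StealthLayer): boolean {
  return STEALTH_LAYERS[mode].includes(layer);
}

console.log(appliesLayer("existing-cdp", "launch-args"));
```

Init scripts are the only layer present in every row, which is why they carry the bulk of the patch inventory.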
|
|
727
59
|
|
|
728
|
-
|
|
729
|
-
# Enable file access (required for JavaScript to access local files)
|
|
730
|
-
agent-browser --allow-file-access open file:///path/to/document.pdf
|
|
731
|
-
agent-browser --allow-file-access open file:///path/to/page.html
|
|
60
|
+
## Principle 1: Always-On Stealth with Explicit Boundaries
|
|
732
61
|
|
|
733
|
-
|
|
734
|
-
|
|
735
|
-
|
|
736
|
-
|
|
62
|
+
- Stealth defaults to enabled and does not depend on a runtime toggle.
|
|
63
|
+
- Project policy forbids:
|
|
64
|
+
- `--profile` / `AGENT_BROWSER_PROFILE`
|
|
65
|
+
- `--channel` / `AGENT_BROWSER_CHANNEL`
|
|
66
|
+
- Default CLI policy expects an existing browser on CDP `localhost:9333` unless explicit connection options are provided.
|
|
737
67
|
|
|
738
|
-
|
|
739
|
-
- Load and render local files
|
|
740
|
-
- Access other local files via JavaScript (XHR, fetch)
|
|
741
|
-
- Load local resources (images, scripts, stylesheets)
|
|
68
|
+
## Principle 2: Multi-Layer Fingerprint Hardening
|
|
742
69
|
|
|
743
|
-
|
|
70
|
+
### 2.1 Launch Layer (Local Chromium)
|
|
744
71
|
|
|
745
|
-
|
|
72
|
+
Injected Chromium args:
|
|
746
73
|
|
|
747
|
-
|
|
748
|
-
|
|
74
|
+
- `--disable-blink-features=AutomationControlled`
|
|
75
|
+
- `--use-gl=angle`
|
|
76
|
+
- `--use-angle=default`
|
|
749
77
|
|
|
750
|
-
|
|
78
|
+
If no custom UA is set, the runtime UA is normalized to remove `HeadlessChrome` tokens.
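As a rough illustration of the UA normalization described above, the sketch below rewrites the `HeadlessChrome` product token. Replacing it with `Chrome` is an assumption made for this example, and `normalizeUserAgent` is a hypothetical name, not the package's actual function.

```typescript
// Hypothetical helper: strip the headless marker from a Chromium user agent.
// Assumes the token is replaced with "Chrome" so the version suffix stays intact.
function normalizeUserAgent(userAgent: string): string {
  return userAgent.replace(/HeadlessChrome/g, "Chrome");
}

const headlessUa =
  "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/120.0.0.0 Safari/537.36";
console.log(normalizeUserAgent(headlessUa));
```

A user agent that carries no headless token passes through unchanged, so the function is safe to apply unconditionally.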
|
|
751
79
|
|
|
752
|
-
|
|
753
|
-
- Disables Chromium's `AutomationControlled` blink feature
|
|
754
|
-
- Replaces "HeadlessChrome" in User-Agent and userAgentData (including CDP-level override)
|
|
755
|
-
- Uses ANGLE rendering instead of SwiftShader to avoid GPU fingerprinting
|
|
756
|
-
- Adds realistic `navigator.plugins` and `navigator.mimeTypes` (passes `instanceof` checks)
|
|
757
|
-
- Patches `window.chrome.runtime` to match real Chrome
|
|
758
|
-
- Masks WebGL vendor/renderer
|
|
759
|
-
- Fixes `navigator.permissions.query` for notifications
|
|
760
|
-
- Reports realistic `navigator.hardwareConcurrency` and `performance.memory`
|
|
761
|
-
- Provides default media devices for `enumerateDevices()`
|
|
762
|
-
- Patches screen/window dimensions to avoid viewport-equals-screen fingerprint
|
|
763
|
-
- Sets opaque background color (headless default is transparent)
|
|
764
|
-
- Cleans up CDP-injected properties on the document
|
|
80
|
+
### 2.2 CDP Layer (Browser/Page Targets)
|
|
765
81
|
|
|
766
|
-
|
|
82
|
+
- Uses `Emulation.setUserAgentOverride` to align:
|
|
83
|
+
- `userAgent`
|
|
84
|
+
- `acceptLanguage`
|
|
85
|
+
- `userAgentMetadata` brands and versions
|
|
86
|
+
- Applies overrides for existing/new targets, including worker-relevant contexts.
|
|
87
|
+
- Forces opaque white background (`Emulation.setDefaultBackgroundColorOverride`) to avoid headless transparency fingerprints.
|
|
767
88
|
|
|
768
|
-
|
|
89
|
+
### 2.3 Context Init-Script Layer (Patch Inventory)
|
|
769
90
|
|
|
770
|
-
|
|
771
|
-
| --- | --- |
|
|
772
|
-
| like headless | 0% |
|
|
773
|
-
| headless | 0% |
|
|
774
|
-
| stealth | 0% |
|
|
91
|
+
The init script patch set is injected before page scripts and currently includes:
|
|
775
92
|
|
|
776
|
-
|
|
93
|
+
1. `navigator.webdriver` removal (including prototype-level cleanup).
|
|
94
|
+
2. CSS webdriver heuristic neutralization (`CSS.supports('border-end-end-radius: initial')` probe).
|
|
95
|
+
3. `window.chrome.runtime` bootstrap for missing runtime surfaces.
|
|
96
|
+
4. Locale/language normalization (`navigator.language`, `navigator.languages`).
|
|
97
|
+
5. Realistic `navigator.plugins` and `navigator.mimeTypes`.
|
|
98
|
+
6. `navigator.permissions.query` normalization for notifications.
|
|
99
|
+
7. WebGL vendor/renderer masking when SwiftShader indicators are present.
|
|
100
|
+
8. `cdc_` property cleanup on document/documentElement.
|
|
101
|
+
9. Window/screen dimension normalization (`outerWidth/outerHeight/screenX/screenY`).
|
|
102
|
+
10. Screen availability patching (`availWidth/availHeight`).
|
|
103
|
+
11. Hardware concurrency stabilization.
|
|
104
|
+
12. Notification permission consistency.
|
|
105
|
+
13. Active text color heuristic patching.
|
|
106
|
+
14. `navigator.connection` normalization.
|
|
107
|
+
15. Worker network signal normalization (`downlinkMax`).
|
|
108
|
+
16. `prefers-color-scheme` light-mode heuristic neutralization.
|
|
109
|
+
17. `navigator.share` exposure.
|
|
110
|
+
18. `navigator.contacts` exposure.
|
|
111
|
+
19. `contentIndex` exposure.
|
|
112
|
+
20. `navigator.pdfViewerEnabled` normalization.
|
|
113
|
+
21. Media devices surface normalization.
|
|
114
|
+
22. `navigator.userAgent` cleanup (strip `HeadlessChrome`).
|
|
115
|
+
23. `navigator.userAgentData` brand cleanup.
|
|
116
|
+
24. `performance.memory` stabilization.
|
|
117
|
+
25. Default background color patching at script level.
|
|
777
118
|
|
|
778
|
-
|
|
779
|
-
node scripts/check-creepjs-headless.js --binary ./cli/target/release/agent-browser
|
|
780
|
-
```
|
|
119
|
+
## Principle 3: Behavioral Humanization
|
|
781
120
|
|
|
782
|
-
|
|
121
|
+
- Navigation pacing jitter before `goto` (short randomized delay).
|
|
122
|
+
- Typing jitter for `type --delay` and `keyboard type --delay`:
|
|
123
|
+
- per-character randomized delay around the requested base delay (about ±40%).
|
|
124
|
+
- Click path humanization:
|
|
125
|
+
- cursor moves on a Bezier-like curve before click.
|
|
126
|
+
- Wait supports random ranges (`wait min-max`) for non-uniform timing.
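The timing rules in the list above can be sketched as two small functions. This is a minimal illustration assuming uniform sampling; the function names, the injectable `rand` parameter, and the exact distribution are assumptions, not the package's implementation.

```typescript
// Per-character typing delay: sample within +/-spread of the requested base delay.
// The 0.4 default mirrors the "about ±40%" figure; uniform sampling is assumed.
function jitteredDelay(baseMs: number, spread = 0.4, rand: () => number = Math.random): number {
  const factor = 1 - spread + rand() * 2 * spread; // falls in [1 - spread, 1 + spread)
  return Math.max(0, Math.round(baseMs * factor));
}

// Resolve a wait spec such as "2000-5000" (range) or "1500" (fixed) to milliseconds.
function resolveWait(spec: string, rand: () => number = Math.random): number {
  const match = /^(\d+)(?:-(\d+))?$/.exec(spec.trim());
  if (!match) throw new Error(`invalid wait spec: ${spec}`);
  const min = Number(match[1]);
  const max = match[2] !== undefined ? Number(match[2]) : min;
  if (max < min) throw new Error(`wait range is reversed: ${spec}`);
  return Math.round(min + rand() * (max - min));
}

console.log(jitteredDelay(100), resolveWait("2000-5000"));
```

Passing `rand` explicitly keeps the sampling deterministic in tests while production callers fall back to `Math.random`.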
|
|
783
127
|
|
|
784
|
-
|
|
128
|
+
## Principle 4: Region Signal Alignment
|
|
785
129
|
|
|
786
|
-
|
|
787
|
-
- **Random wait ranges** -- `wait 2000-5000` pauses for a random duration between 2 and 5 seconds
|
|
788
|
-
- **Bezier curve mouse movement** -- Before every `click`, the mouse moves to the target element along a randomized cubic Bezier curve with natural-looking control points
|
|
789
|
-
- **Navigation pacing** -- Each page navigation includes a short random delay (300-1000ms) to avoid burst patterns
|
|
130
|
+
Before navigation, the runtime derives region hints from the target URL's TLD and aligns:
|
|
790
131
|
|
|
791
|
-
|
|
132
|
+
- locale
|
|
133
|
+
- timezone
|
|
134
|
+
- `Accept-Language`
|
|
792
135
|
|
|
793
|
-
|
|
136
|
+
Examples of built-in mappings include `tw`, `jp`, `kr`, `sg`, `de`, `fr`, `uk`, `in`, `au`.
|
|
794
137
|
|
|
795
|
-
|
|
138
|
+
Manual overrides are supported:
|
|
796
139
|
|
|
797
|
-
|
|
140
|
+
- `AGENT_BROWSER_LOCALE`
|
|
141
|
+
- `AGENT_BROWSER_TIMEZONE` (or `TZ`)
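The TLD-based derivation and the manual overrides above could be combined roughly as follows. The mapping values, the `Accept-Language` format, and the precedence of `AGENT_BROWSER_TIMEZONE` over `TZ` are assumptions for illustration; the package's built-in table may differ.

```typescript
interface RegionHint {
  locale?: string;
  timezone?: string;
  acceptLanguage?: string;
}

// Illustrative subset only; values are assumptions, not the package's actual table.
const REGION_BY_TLD: Record<string, { locale: string; timezone: string }> = {
  jp: { locale: "ja-JP", timezone: "Asia/Tokyo" },
  de: { locale: "de-DE", timezone: "Europe/Berlin" },
  uk: { locale: "en-GB", timezone: "Europe/London" },
};

// Build an Accept-Language value from a locale, e.g. "ja-JP" -> "ja-JP,ja;q=0.9".
function acceptLanguageFor(locale: string): string {
  const lang = locale.split("-")[0];
  return lang === locale ? locale : `${locale},${lang};q=0.9`;
}

// Derive hints from the URL's TLD, then let environment overrides win.
function regionHintFor(url: string, env: Record<string, string | undefined> = {}): RegionHint {
  const tld = new URL(url).hostname.split(".").pop()?.toLowerCase() ?? "";
  const derived: { locale: string; timezone: string } | undefined = REGION_BY_TLD[tld];
  const locale = env.AGENT_BROWSER_LOCALE ?? derived?.locale;
  const timezone = env.AGENT_BROWSER_TIMEZONE ?? env.TZ ?? derived?.timezone;
  return { locale, timezone, acceptLanguage: locale ? acceptLanguageFor(locale) : undefined };
}

console.log(regionHintFor("https://www.example.jp/"));
```

Keeping all three values derived from one lookup avoids the mismatch (for example, a Japanese locale with a European timezone) that detectors tend to flag.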
|
|
798
142
|
|
|
799
|
-
|
|
143
|
+
## Principle 5: Verification-Aware Risk Control
|
|
800
144
|
|
|
801
|
-
|
|
145
|
+
When a navigation lands on verification/captcha pages, structured risk signals are generated from URL/title evidence.
|
|
802
146
|
|
|
803
|
-
|
|
147
|
+
Each entry in `riskSignals` includes:
|
|
804
148
|
|
|
805
|
-
|
|
149
|
+
- `code`
|
|
150
|
+
- `source` (`url` or `title`)
|
|
151
|
+
- `evidence`
|
|
152
|
+
- `confidence`
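A signal with the four fields above might be produced as in the sketch below. The field names come from the list; the numeric type of `confidence`, the matching patterns, and the `collectRiskSignals` name are assumptions made for this example.

```typescript
interface RiskSignal {
  code: string;
  source: "url" | "title";
  evidence: string;
  confidence: number; // type assumed; the source does not specify it
}

// Hypothetical patterns; the package's real detection rules are not shown in the README.
const RISK_PATTERNS: { code: string; pattern: RegExp; confidence: number }[] = [
  { code: "captcha", pattern: /captcha/i, confidence: 0.9 },
  { code: "verification", pattern: /verif(?:y|ication)/i, confidence: 0.6 },
];

// Scan the final URL and page title, recording the matched text as evidence.
function collectRiskSignals(url: string, title: string): RiskSignal[] {
  const signals: RiskSignal[] = [];
  const inputs = [["url", url], ["title", title]] as const;
  for (const [source, text] of inputs) {
    for (const rule of RISK_PATTERNS) {
      const match = rule.pattern.exec(text);
      if (match) {
        signals.push({ code: rule.code, source, evidence: match[0], confidence: rule.confidence });
      }
    }
  }
  return signals;
}

console.log(collectRiskSignals("https://example.com/captcha/challenge", "Home"));
```

Recording `source` and `evidence` alongside the code lets a caller judge for itself whether a title match is a false positive.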
|
|
806
153
|
|
|
807
|
-
|
|
154
|
+
### Risk Mode
|
|
808
155
|
|
|
809
|
-
|
|
156
|
+
- `warn` (default): retry with randomized backoff and return warnings + `riskSignals`.
|
|
157
|
+
- `block`: fail fast once a verification/captcha interstitial is detected.
|
|
158
|
+
- `off`: skip the detection and retry path.
|
|
810
159
|
|
|
811
160
|
```bash
|
|
812
|
-
|
|
813
|
-
|
|
814
|
-
|
|
815
|
-
agent-browser connect 9222
|
|
816
|
-
agent-browser snapshot
|
|
817
|
-
agent-browser tab
|
|
818
|
-
agent-browser close
|
|
819
|
-
|
|
820
|
-
# Or pass --cdp on each command
|
|
821
|
-
agent-browser --cdp 9222 snapshot
|
|
822
|
-
|
|
823
|
-
# Connect to remote browser via WebSocket URL
|
|
824
|
-
agent-browser --cdp "wss://your-browser-service.com/cdp?token=..." snapshot
|
|
161
|
+
agent-browser --risk-mode warn open https://example.com
|
|
162
|
+
agent-browser --risk-mode block open https://example.com
|
|
163
|
+
AGENT_BROWSER_RISK_MODE=off agent-browser open https://example.com
|
|
825
164
|
```
|
|
826
165
|
|
|
827
|
-
|
|
828
|
-
|
|
829
|
-
|
|
830
|
-
|
|
831
|
-
|
|
832
|
-
|
|
833
|
-
|
|
834
|
-
|
|
835
|
-
|
|
836
|
-
|
|
837
|
-
### Auto-Connect
|
|
838
|
-
|
|
839
|
-
Use `--auto-connect` to automatically discover and connect to a running Chrome instance without specifying a port:
|
|
840
|
-
|
|
841
|
-
```bash
|
|
842
|
-
# Auto-discover running Chrome with remote debugging
|
|
843
|
-
agent-browser --auto-connect open example.com
|
|
844
|
-
agent-browser --auto-connect snapshot
|
|
845
|
-
|
|
846
|
-
# Or via environment variable
|
|
847
|
-
AGENT_BROWSER_AUTO_CONNECT=1 agent-browser snapshot
|
|
166
|
+
```mermaid
|
|
167
|
+
flowchart TD
|
|
168
|
+
A["Navigate"] --> B["Collect URL and Title Signals"]
|
|
169
|
+
B --> C{"risk-mode"}
|
|
170
|
+
C -->|off| D["Return Success"]
|
|
171
|
+
C -->|block| E["Return Error with First Signal"]
|
|
172
|
+
C -->|warn| F["Retry up to 2 times"]
|
|
173
|
+
F --> G{"Signals Cleared"}
|
|
174
|
+
G -->|yes| H["Return Success + recovery warning + riskSignals"]
|
|
175
|
+
G -->|no| I["Return Success + warning + riskSignals"]
|
|
848
176
|
```
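The flowchart above can be approximated in code as follows. This is a sketch under stated assumptions: `navigate`, `detect`, and `sleep` are hypothetical callbacks supplied by the caller, the backoff timings and warning strings are invented for illustration, and only the branch structure and the limit of two retries come from the diagram.

```typescript
type RiskMode = "warn" | "block" | "off";

interface Signal {
  code: string;
}

interface NavResult {
  success: boolean;
  error?: string;
  warnings: string[];
  riskSignals: Signal[];
}

const defaultSleep = (ms: number): Promise<void> => new Promise<void>((resolve) => setTimeout(resolve, ms));

// navigate() loads the page; detect() returns signals for the page currently shown.
async function openWithRiskMode(
  mode: RiskMode,
  navigate: () => Promise<void>,
  detect: () => Promise<Signal[]>,
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<NavResult> {
  await navigate();
  if (mode === "off") return { success: true, warnings: [], riskSignals: [] };

  const first = await detect();
  if (first.length === 0) return { success: true, warnings: [], riskSignals: [] };

  if (mode === "block") {
    const error = `verification detected: ${first[0]?.code ?? "unknown"}`;
    return { success: false, error, warnings: [], riskSignals: first };
  }

  // warn: retry up to 2 times with randomized backoff (timings are assumptions).
  let latest = first;
  for (let attempt = 1; attempt <= 2 && latest.length > 0; attempt++) {
    await sleep(500 * attempt + Math.random() * 500);
    await navigate();
    latest = await detect();
  }
  const warning =
    latest.length === 0
      ? "verification page cleared after retry"
      : "verification page still present after retries";
  return { success: true, warnings: [warning], riskSignals: first };
}
```

In `warn` mode the call still succeeds in both outcomes, so callers that must stop on verification pages should inspect `riskSignals` or use `block`.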
|
|
849
177
|
|
|
850
|
-
|
|
851
|
-
1. Reading Chrome's `DevToolsActivePort` file from the default user data directory
|
|
852
|
-
2. Falling back to probing common debugging ports (9222, 9229, 9333)
|
|
178
|
+
## Operational Recommendations
|
|
853
179
|
|
|
854
|
-
|
|
855
|
-
-
|
|
856
|
-
-
|
|
857
|
-
-
|
|
180
|
+
- Prefer `--headed` for high-friction targets.
|
|
181
|
+
- Reuse session state with `--session-name` for continuity.
|
|
182
|
+
- Keep locale/timezone consistent with the target market.
|
|
183
|
+
- Use `--risk-mode block` in strict pipelines that require explicit operator intervention on verification pages.
|
|
858
184
|
|
|
859
|
-
##
|
|
185
|
+
## Validation Scripts
|
|
860
186
|
|
|
861
|
-
|
|
862
|
-
|
|
863
|
-
### Enable Streaming
|
|
864
|
-
|
|
865
|
-
Set the `AGENT_BROWSER_STREAM_PORT` environment variable:
|
|
187
|
+
Run public detector checks after stealth changes:
|
|
866
188
|
|
|
867
189
|
```bash
|
|
868
|
-
|
|
869
|
-
|
|
870
|
-
|
|
871
|
-
This starts a WebSocket server on the specified port that streams the browser viewport and accepts input events.
|
|
872
|
-
|
|
873
|
-
### WebSocket Protocol
|
|
874
|
-
|
|
875
|
-
Connect to `ws://localhost:9223` to receive frames and send input:
|
|
876
|
-
|
|
877
|
-
**Receive frames:**
|
|
878
|
-
```json
|
|
879
|
-
{
|
|
880
|
-
"type": "frame",
|
|
881
|
-
"data": "<base64-encoded-jpeg>",
|
|
882
|
-
"metadata": {
|
|
883
|
-
"deviceWidth": 1280,
|
|
884
|
-
"deviceHeight": 720,
|
|
885
|
-
"pageScaleFactor": 1,
|
|
886
|
-
"offsetTop": 0,
|
|
887
|
-
"scrollOffsetX": 0,
|
|
888
|
-
"scrollOffsetY": 0
|
|
889
|
-
}
|
|
890
|
-
}
|
|
891
|
-
```
|
|
892
|
-
|
|
893
|
-
**Send mouse events:**
|
|
894
|
-
```json
|
|
895
|
-
{
|
|
896
|
-
"type": "input_mouse",
|
|
897
|
-
"eventType": "mousePressed",
|
|
898
|
-
"x": 100,
|
|
899
|
-
"y": 200,
|
|
900
|
-
"button": "left",
|
|
901
|
-
"clickCount": 1
|
|
902
|
-
}
|
|
903
|
-
```
|
|
904
|
-
|
|
905
|
-
**Send keyboard events:**
|
|
906
|
-
```json
|
|
907
|
-
{
|
|
908
|
-
"type": "input_keyboard",
|
|
909
|
-
"eventType": "keyDown",
|
|
910
|
-
"key": "Enter",
|
|
911
|
-
"code": "Enter"
|
|
912
|
-
}
|
|
913
|
-
```
|
|
914
|
-
|
|
915
|
-
**Send touch events:**
|
|
916
|
-
```json
|
|
917
|
-
{
|
|
918
|
-
"type": "input_touch",
|
|
919
|
-
"eventType": "touchStart",
|
|
920
|
-
"touchPoints": [{ "x": 100, "y": 200 }]
|
|
921
|
-
}
|
|
922
|
-
```
|
|
923
|
-
|
|
924
|
-
### Programmatic API
|
|
925
|
-
|
|
926
|
-
For advanced use, control streaming directly via the protocol:
|
|
927
|
-
|
|
928
|
-
```typescript
|
|
929
|
-
import { BrowserManager } from 'agent-browser-stealth';
|
|
930
|
-
|
|
931
|
-
const browser = new BrowserManager();
|
|
932
|
-
await browser.launch({ headless: true });
|
|
933
|
-
await browser.navigate('https://example.com');
|
|
934
|
-
|
|
935
|
-
// Start screencast
|
|
936
|
-
await browser.startScreencast((frame) => {
|
|
937
|
-
// frame.data is base64-encoded image
|
|
938
|
-
// frame.metadata contains viewport info
|
|
939
|
-
console.log('Frame received:', frame.metadata.deviceWidth, 'x', frame.metadata.deviceHeight);
|
|
940
|
-
}, {
|
|
941
|
-
format: 'jpeg',
|
|
942
|
-
quality: 80,
|
|
943
|
-
maxWidth: 1280,
|
|
944
|
-
maxHeight: 720,
|
|
945
|
-
});
|
|
946
|
-
|
|
947
|
-
// Inject mouse events
|
|
948
|
-
await browser.injectMouseEvent({
|
|
949
|
-
type: 'mousePressed',
|
|
950
|
-
x: 100,
|
|
951
|
-
y: 200,
|
|
952
|
-
button: 'left',
|
|
953
|
-
});
|
|
954
|
-
|
|
955
|
-
// Inject keyboard events
|
|
956
|
-
await browser.injectKeyboardEvent({
|
|
957
|
-
type: 'keyDown',
|
|
958
|
-
key: 'Enter',
|
|
959
|
-
code: 'Enter',
|
|
960
|
-
});
|
|
961
|
-
|
|
962
|
-
// Stop when done
|
|
963
|
-
await browser.stopScreencast();
|
|
964
|
-
```
|
|
965
|
-
|
|
966
|
-
## Architecture
|
|
967
|
-
|
|
968
|
-
agent-browser uses a client-daemon architecture:
|
|
969
|
-
|
|
970
|
-
1. **Rust CLI** (fast native binary) - Parses commands, communicates with daemon
|
|
971
|
-
2. **Node.js Daemon** - Manages Playwright browser instance
|
|
972
|
-
3. **Fallback** - If native binary unavailable, uses Node.js directly
|
|
973
|
-
|
|
974
|
-
The daemon starts automatically on first command and persists between commands for fast subsequent operations.
|
|
975
|
-
|
|
976
|
-
**Browser Engine:** Uses Chromium by default. The daemon also supports Firefox and WebKit via the Playwright protocol.
|
|
977
|
-
|
|
978
|
-
## Platforms
|
|
979
|
-
|
|
980
|
-
| Platform | Binary | Fallback |
|
|
981
|
-
|----------|--------|----------|
|
|
982
|
-
| macOS ARM64 | Native Rust | Node.js |
|
|
983
|
-
| macOS x64 | Native Rust | Node.js |
|
|
984
|
-
| Linux ARM64 | Native Rust | Node.js |
|
|
985
|
-
| Linux x64 | Native Rust | Node.js |
|
|
986
|
-
| Windows x64 | Native Rust | Node.js |
|
|
987
|
-
|
|
988
|
-
## Usage with AI Agents
|
|
989
|
-
|
|
990
|
-
### Just ask the agent
|
|
991
|
-
|
|
992
|
-
The simplest approach -- just tell your agent to use it:
|
|
993
|
-
|
|
994
|
-
```
|
|
995
|
-
Use agent-browser to test the login flow. Run agent-browser --help to see available commands.
|
|
996
|
-
```
|
|
997
|
-
|
|
998
|
-
The `--help` output is comprehensive and most agents can figure it out from there.
|
|
999
|
-
|
|
1000
|
-
### AI Coding Assistants (recommended)
|
|
1001
|
-
|
|
1002
|
-
Add the skill to your AI coding assistant for richer context:
|
|
1003
|
-
|
|
1004
|
-
```bash
|
|
1005
|
-
npx skills add leeguooooo/agent-browser
|
|
1006
|
-
```
|
|
1007
|
-
|
|
1008
|
-
This works with Claude Code, Codex, Cursor, Gemini CLI, GitHub Copilot, Goose, OpenCode, and Windsurf. The skill is fetched from the repository, so it stays up to date automatically -- do not copy `SKILL.md` from `node_modules` as it will become stale.
|
|
1009
|
-
|
|
1010
|
-
### Claude Code
|
|
1011
|
-
|
|
1012
|
-
Install as a Claude Code skill:
|
|
1013
|
-
|
|
1014
|
-
```bash
|
|
1015
|
-
npx skills add leeguooooo/agent-browser
|
|
1016
|
-
```
|
|
1017
|
-
|
|
1018
|
-
This adds the skill to `.claude/skills/agent-browser/SKILL.md` in your project. The skill teaches Claude Code the full agent-browser workflow, including the snapshot-ref interaction pattern, session management, and timeout handling.
|
|
1019
|
-
|
|
1020
|
-
### AGENTS.md / CLAUDE.md
|
|
1021
|
-
|
|
1022
|
-
For more consistent results, add to your project or global instructions file:
|
|
1023
|
-
|
|
1024
|
-
```markdown
|
|
1025
|
-
## Browser Automation
|
|
1026
|
-
|
|
1027
|
-
Use `agent-browser` for web automation. Run `agent-browser --help` for all commands.
|
|
1028
|
-
|
|
1029
|
-
Core workflow:
|
|
1030
|
-
1. `agent-browser open <url>` - Navigate to page
|
|
1031
|
-
2. `agent-browser snapshot -i` - Get interactive elements with refs (@e1, @e2)
|
|
1032
|
-
3. `agent-browser click @e1` / `fill @e2 "text"` - Interact using refs
|
|
1033
|
-
4. Re-snapshot after page changes
|
|
1034
|
-
```
|
|
1035
|
-
|
|
1036
|
-
## Integrations
|
|
1037
|
-
|
|
1038
|
-
### iOS Simulator
|
|
1039
|
-
|
|
1040
|
-
Control real Mobile Safari in the iOS Simulator for authentic mobile web testing. Requires macOS with Xcode.
|
|
1041
|
-
|
|
1042
|
-
**Setup:**
|
|
1043
|
-
|
|
1044
|
-
```bash
|
|
1045
|
-
# Install Appium and XCUITest driver
|
|
1046
|
-
npm install -g appium
|
|
1047
|
-
appium driver install xcuitest
|
|
1048
|
-
```
|
|
1049
|
-
|
|
1050
|
-
**Usage:**
|
|
1051
|
-
|
|
1052
|
-
```bash
|
|
1053
|
-
# List available iOS simulators
|
|
1054
|
-
agent-browser device list
|
|
1055
|
-
|
|
1056
|
-
# Launch Safari on a specific device
|
|
1057
|
-
agent-browser -p ios --device "iPhone 16 Pro" open https://example.com
|
|
1058
|
-
|
|
1059
|
-
# Same commands as desktop
|
|
1060
|
-
agent-browser -p ios snapshot -i
|
|
1061
|
-
agent-browser -p ios tap @e1
|
|
1062
|
-
agent-browser -p ios fill @e2 "text"
|
|
1063
|
-
agent-browser -p ios screenshot mobile.png
|
|
1064
|
-
|
|
1065
|
-
# Mobile-specific commands
|
|
1066
|
-
agent-browser -p ios swipe up
|
|
1067
|
-
agent-browser -p ios swipe down 500
|
|
1068
|
-
|
|
1069
|
-
# Close session
|
|
1070
|
-
agent-browser -p ios close
|
|
1071
|
-
```
|
|
1072
|
-
|
|
1073
|
-
Or use environment variables:
|
|
1074
|
-
|
|
1075
|
-
```bash
|
|
1076
|
-
export AGENT_BROWSER_PROVIDER=ios
|
|
1077
|
-
export AGENT_BROWSER_IOS_DEVICE="iPhone 16 Pro"
|
|
1078
|
-
agent-browser open https://example.com
|
|
1079
|
-
```
|
|
1080
|
-
|
|
1081
|
-
| Variable | Description |
|
|
1082
|
-
|----------|-------------|
|
|
1083
|
-
| `AGENT_BROWSER_PROVIDER` | Set to `ios` to enable iOS mode |
|
|
1084
|
-
| `AGENT_BROWSER_IOS_DEVICE` | Device name (e.g., "iPhone 16 Pro", "iPad Pro") |
|
|
1085
|
-
| `AGENT_BROWSER_IOS_UDID` | Device UDID (alternative to device name) |
|
|
1086
|
-
|
|
1087
|
-
**Supported devices:** All iOS Simulators available in Xcode (iPhones, iPads), plus real iOS devices.
|
|
1088
|
-
|
|
1089
|
-
**Note:** The iOS provider boots the simulator, starts Appium, and controls Safari. First launch takes ~30-60 seconds; subsequent commands are fast.
|
|
1090
|
-
|
|
1091
|
-
#### Real Device Support
|
|
1092
|
-
|
|
1093
|
-
Appium also supports real iOS devices connected via USB. This requires additional one-time setup:
|
|
1094
|
-
|
|
1095
|
-
**1. Get your device UDID:**
|
|
1096
|
-
```bash
|
|
1097
|
-
xcrun xctrace list devices
|
|
1098
|
-
# or
|
|
1099
|
-
system_profiler SPUSBDataType | grep -A 5 "iPhone\|iPad"
|
|
1100
|
-
```
|
|
1101
|
-
|
|
1102
|
-
**2. Sign WebDriverAgent (one-time):**
|
|
1103
|
-
```bash
|
|
1104
|
-
# Open the WebDriverAgent Xcode project
|
|
1105
|
-
cd ~/.appium/node_modules/appium-xcuitest-driver/node_modules/appium-webdriveragent
|
|
1106
|
-
open WebDriverAgent.xcodeproj
|
|
1107
|
-
```
|
|
1108
|
-
|
|
1109
|
-
In Xcode:
|
|
1110
|
-
- Select the `WebDriverAgentRunner` target
|
|
1111
|
-
- Go to Signing & Capabilities
|
|
1112
|
-
- Select your Team (requires Apple Developer account, free tier works)
|
|
1113
|
-
- Let Xcode manage signing automatically
|
|
1114
|
-
|
|
1115
|
-
**3. Use with agent-browser:**
|
|
1116
|
-
```bash
|
|
1117
|
-
# Connect device via USB, then:
|
|
1118
|
-
agent-browser -p ios --device "<DEVICE_UDID>" open https://example.com
|
|
1119
|
-
|
|
1120
|
-
# Or use the device name if unique
|
|
1121
|
-
agent-browser -p ios --device "John's iPhone" open https://example.com
|
|
1122
|
-
```
|
|
1123
|
-
|
|
1124
|
-
**Real device notes:**
|
|
1125
|
-
- First run installs WebDriverAgent to the device (may require Trust prompt)
|
|
1126
|
-
- Device must be unlocked and connected via USB
|
|
1127
|
-
- Slightly slower initial connection than simulator
|
|
1128
|
-
- Tests against real Safari performance and behavior
|
|
1129
|
-
|
|
1130
|
-
### Browserbase
|
|
1131
|
-
|
|
1132
|
-
[Browserbase](https://browserbase.com) provides remote browser infrastructure that simplifies deploying browsing agents. Use it when running the agent-browser CLI in an environment where a local browser isn't feasible.
|
|
1133
|
-
|
|
1134
|
-
To enable Browserbase, use the `-p` flag:
|
|
1135
|
-
|
|
1136
|
-
```bash
|
|
1137
|
-
export BROWSERBASE_API_KEY="your-api-key"
|
|
1138
|
-
export BROWSERBASE_PROJECT_ID="your-project-id"
|
|
1139
|
-
agent-browser -p browserbase open https://example.com
|
|
1140
|
-
```
|
|
1141
|
-
|
|
1142
|
-
Or use environment variables for CI/scripts:
|
|
1143
|
-
|
|
1144
|
-
```bash
|
|
1145
|
-
export AGENT_BROWSER_PROVIDER=browserbase
|
|
1146
|
-
export BROWSERBASE_API_KEY="your-api-key"
|
|
1147
|
-
export BROWSERBASE_PROJECT_ID="your-project-id"
|
|
1148
|
-
agent-browser open https://example.com
|
|
1149
|
-
```
|
|
1150
|
-
|
|
1151
|
-
When enabled, agent-browser connects to a Browserbase session instead of launching a local browser. All commands work identically.
|
|
1152
|
-
|
|
1153
|
-
Get your API key and project ID from the [Browserbase Dashboard](https://browserbase.com/overview).
|
|
1154
|
-
|
|
1155
|
-
### Browser Use
|
|
1156
|
-
|
|
1157
|
-
[Browser Use](https://browser-use.com) provides cloud browser infrastructure for AI agents. Use it when running agent-browser in environments where a local browser isn't available (serverless, CI/CD, etc.).
|
|
1158
|
-
|
|
1159
|
-
To enable Browser Use, use the `-p` flag:
|
|
1160
|
-
|
|
1161
|
-
```bash
|
|
1162
|
-
export BROWSER_USE_API_KEY="your-api-key"
|
|
1163
|
-
agent-browser -p browseruse open https://example.com
|
|
1164
|
-
```
|
|
1165
|
-
|
|
1166
|
-
Or use environment variables for CI/scripts:
|
|
1167
|
-
|
|
1168
|
-
```bash
|
|
1169
|
-
export AGENT_BROWSER_PROVIDER=browseruse
|
|
1170
|
-
export BROWSER_USE_API_KEY="your-api-key"
|
|
1171
|
-
agent-browser open https://example.com
|
|
1172
|
-
```
|
|
1173
|
-
|
|
1174
|
-
When enabled, agent-browser connects to a Browser Use cloud session instead of launching a local browser. All commands work identically.
|
|
1175
|
-
|
|
1176
|
-
Get your API key from the [Browser Use Cloud Dashboard](https://cloud.browser-use.com/settings?tab=api-keys). Free credits are available to get started, with pay-as-you-go pricing after.
|
|
1177
|
-
|
|
1178
|
-
### Kernel
|
|
1179
|
-
|
|
1180
|
-
[Kernel](https://www.kernel.sh) provides cloud browser infrastructure for AI agents with features like stealth mode and persistent profiles.
|
|
1181
|
-
|
|
1182
|
-
To enable Kernel, use the `-p` flag:
|
|
1183
|
-
|
|
1184
|
-
```bash
|
|
1185
|
-
export KERNEL_API_KEY="your-api-key"
|
|
1186
|
-
agent-browser -p kernel open https://example.com
|
|
1187
|
-
```
|
|
1188
|
-
|
|
1189
|
-
Or use environment variables for CI/scripts:
|
|
1190
|
-
|
|
1191
|
-
```bash
|
|
1192
|
-
export AGENT_BROWSER_PROVIDER=kernel
|
|
1193
|
-
export KERNEL_API_KEY="your-api-key"
|
|
1194
|
-
agent-browser open https://example.com
|
|
190
|
+
node scripts/check-sannysoft-webdriver.js --binary ./cli/target/release/agent-browser
|
|
191
|
+
node scripts/check-creepjs-headless.js --binary ./cli/target/release/agent-browser
|
|
1195
192
|
```
|
|
1196
193
|
|
|
1197
|
-
|
|
1198
|
-
|
|
1199
|
-
| Variable | Description | Default |
|
|
1200
|
-
|----------|-------------|---------|
|
|
1201
|
-
| `KERNEL_HEADLESS` | Run browser in headless mode (`true`/`false`) | `false` |
|
|
1202
|
-
| `KERNEL_STEALTH` | Enable stealth mode to avoid bot detection (`true`/`false`) | `true` |
|
|
1203
|
-
| `KERNEL_TIMEOUT_SECONDS` | Session timeout in seconds | `300` |
|
|
1204
|
-
| `KERNEL_PROFILE_NAME` | Browser profile name for persistent cookies/logins (created if it doesn't exist) | (none) |
|
|
1205
|
-
|
|
1206
|
-
When enabled, agent-browser connects to a Kernel cloud session instead of launching a local browser. All commands work identically.
|
|
1207
|
-
|
|
1208
|
-
**Profile Persistence:** When `KERNEL_PROFILE_NAME` is set, the profile will be created if it doesn't already exist. Cookies, logins, and session data are automatically saved back to the profile when the browser session ends, making them available for future sessions.
|
|
194
|
+
## Upstream Compatibility
|
|
1209
195
|
|
|
1210
|
-
|
|
196
|
+
This fork intentionally keeps command workflows close to upstream while concentrating custom behavior in stealth, policy, and anti-detection handling.
|
|
1211
197
|
|
|
1212
198
|
## License
|
|
1213
199
|
|