agent-browser-priv 0.27.3-priv.3

Files changed (36)
  1. package/LICENSE +201 -0
  2. package/README.md +1564 -0
  3. package/bin/agent-browser.js +125 -0
  4. package/package.json +52 -0
  5. package/scripts/build-all-platforms.sh +76 -0
  6. package/scripts/check-version-sync.js +51 -0
  7. package/scripts/copy-native.js +36 -0
  8. package/scripts/postinstall.js +327 -0
  9. package/scripts/sync-version.js +81 -0
  10. package/scripts/windows-debug/provision.sh +220 -0
  11. package/scripts/windows-debug/run.sh +92 -0
  12. package/scripts/windows-debug/start.sh +43 -0
  13. package/scripts/windows-debug/stop.sh +28 -0
  14. package/scripts/windows-debug/sync.sh +27 -0
  15. package/skill-data/agentcore/SKILL.md +115 -0
  16. package/skill-data/core/SKILL.md +488 -0
  17. package/skill-data/core/references/authentication.md +303 -0
  18. package/skill-data/core/references/commands.md +403 -0
  19. package/skill-data/core/references/profiling.md +120 -0
  20. package/skill-data/core/references/proxy-support.md +194 -0
  21. package/skill-data/core/references/session-management.md +193 -0
  22. package/skill-data/core/references/snapshot-refs.md +219 -0
  23. package/skill-data/core/references/trust-boundaries.md +89 -0
  24. package/skill-data/core/references/video-recording.md +175 -0
  25. package/skill-data/core/templates/authenticated-session.sh +105 -0
  26. package/skill-data/core/templates/capture-workflow.sh +69 -0
  27. package/skill-data/core/templates/form-automation.sh +62 -0
  28. package/skill-data/dogfood/SKILL.md +220 -0
  29. package/skill-data/dogfood/references/issue-taxonomy.md +109 -0
  30. package/skill-data/dogfood/templates/dogfood-report-template.md +53 -0
  31. package/skill-data/electron/SKILL.md +236 -0
  32. package/skill-data/slack/SKILL.md +285 -0
  33. package/skill-data/slack/references/slack-tasks.md +348 -0
  34. package/skill-data/slack/templates/slack-report-template.md +163 -0
  35. package/skill-data/vercel-sandbox/SKILL.md +280 -0
  36. package/skills/agent-browser/SKILL.md +55 -0
@@ -0,0 +1,303 @@
1
+ # Authentication Patterns
2
+
3
+ Login flows, session persistence, OAuth, 2FA, and authenticated browsing.
4
+
5
+ **Related**: [session-management.md](session-management.md) for state persistence details, [SKILL.md](../SKILL.md) for quick start.
6
+
7
+ ## Contents
8
+
9
+ - [Import Auth from Your Browser](#import-auth-from-your-browser)
10
+ - [Persistent Profiles](#persistent-profiles)
11
+ - [Session Persistence](#session-persistence)
12
+ - [Basic Login Flow](#basic-login-flow)
13
+ - [Saving Authentication State](#saving-authentication-state)
14
+ - [Restoring Authentication](#restoring-authentication)
15
+ - [OAuth / SSO Flows](#oauth--sso-flows)
16
+ - [Two-Factor Authentication](#two-factor-authentication)
17
+ - [HTTP Basic Auth](#http-basic-auth)
18
+ - [Cookie-Based Auth](#cookie-based-auth)
19
+ - [Token Refresh Handling](#token-refresh-handling)
20
+ - [Security Best Practices](#security-best-practices)
21
+
22
+ ## Import Auth from Your Browser
23
+
24
+ The fastest way to authenticate is to reuse cookies from a Chrome session you are already logged into.
25
+
26
+ **Step 1: Start Chrome with remote debugging**
27
+
28
+ ```bash
29
+ # macOS
30
+ "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" --remote-debugging-port=9222
31
+
32
+ # Linux
33
+ google-chrome --remote-debugging-port=9222
34
+
35
+ # Windows
36
+ "C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222
37
+ ```
38
+
39
+ Log in to your target site(s) in this Chrome window as you normally would.
40
+
41
+ > **Security note:** `--remote-debugging-port` exposes full browser control on localhost. Any local process can connect and read cookies, execute JS, etc. Only use on trusted machines and close Chrome when done.
42
+
43
+ **Step 2: Grab the auth state**
44
+
45
+ ```bash
46
+ # Auto-discover the running Chrome and save its cookies + localStorage
47
+ agent-browser --auto-connect state save ./my-auth.json
48
+ ```
49
+
50
+ **Step 3: Reuse in automation**
51
+
52
+ ```bash
53
+ # Load auth at launch
54
+ agent-browser --state ./my-auth.json open https://app.example.com/dashboard
55
+
56
+ # Or load into an existing session
57
+ agent-browser state load ./my-auth.json
58
+ agent-browser open https://app.example.com/dashboard
59
+ ```
60
+
61
+ This works for any site, including those with complex OAuth flows, SSO, or 2FA -- as long as Chrome already has valid session cookies.
62
+
63
+ > **Security note:** State files contain session tokens in plaintext. Add them to `.gitignore`, delete when no longer needed, and set `AGENT_BROWSER_ENCRYPTION_KEY` for encryption at rest. See [Security Best Practices](#security-best-practices).
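
Because a saved state file holds session tokens in plaintext, it may also help to restrict who can read it. The following is a minimal sketch, not an agent-browser feature: `protect_state_file` is a hypothetical helper name, and it assumes a POSIX-style filesystem where `chmod 600` is meaningful.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of agent-browser): make a saved state
# file readable and writable by its owner only.
protect_state_file() {
  local state_file="$1"
  if [[ ! -f "$state_file" ]]; then
    echo "no such state file: $state_file" >&2
    return 1
  fi
  chmod 600 "$state_file"
}

# Example, after `agent-browser --auto-connect state save ./my-auth.json`:
#   protect_state_file ./my-auth.json
```

This complements, rather than replaces, encryption at rest via `AGENT_BROWSER_ENCRYPTION_KEY`.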
64
+
65
+ **Tip:** Combine with `--session-name` so the imported auth auto-persists across restarts:
66
+
67
+ ```bash
68
+ agent-browser --session-name myapp state load ./my-auth.json
69
+ # From now on, state is auto-saved/restored for "myapp"
70
+ ```
71
+
72
+ ## Persistent Profiles
73
+
74
+ Use `--profile` to point agent-browser at a Chrome user data directory. This persists everything (cookies, IndexedDB, service workers, cache) across browser restarts without explicit save/load:
75
+
76
+ ```bash
77
+ # First run: login once
78
+ agent-browser --profile ~/.myapp-profile open https://app.example.com/login
79
+ # ... complete login flow ...
80
+
81
+ # All subsequent runs: already authenticated
82
+ agent-browser --profile ~/.myapp-profile open https://app.example.com/dashboard
83
+ ```
84
+
85
+ Use different paths for different projects or test users:
86
+
87
+ ```bash
88
+ agent-browser --profile ~/.profiles/admin open https://app.example.com
89
+ agent-browser --profile ~/.profiles/viewer open https://app.example.com
90
+ ```
91
+
92
+ Or set via environment variable:
93
+
94
+ ```bash
95
+ export AGENT_BROWSER_PROFILE=~/.myapp-profile
96
+ agent-browser open https://app.example.com/dashboard
97
+ ```
98
+
99
+ ## Session Persistence
100
+
101
+ Use `--session-name` to auto-save and restore cookies + localStorage by name, without managing files:
102
+
103
+ ```bash
104
+ # Auto-saves state on close, auto-restores on next launch
105
+ agent-browser --session-name twitter open https://twitter.com
106
+ # ... login flow ...
107
+ agent-browser close # state saved to ~/.agent-browser/sessions/
108
+
109
+ # Next time: state is automatically restored
110
+ agent-browser --session-name twitter open https://twitter.com
111
+ ```
112
+
113
+ Encrypt state at rest:
114
+
115
+ ```bash
116
+ export AGENT_BROWSER_ENCRYPTION_KEY=$(openssl rand -hex 32)
117
+ agent-browser --session-name secure open https://app.example.com
118
+ ```
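
A key generated inline with `openssl rand` exists only in the current shell, and state encrypted with one key presumably cannot be restored with another. The sketch below shows one way to reuse a single key across runs. `load_or_create_key` and the key file path are hypothetical; agent-browser does not define either.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of agent-browser): reuse one key across
# runs by storing it in an owner-only file.
load_or_create_key() {
  local key_file="$1"
  if [[ ! -s "$key_file" ]]; then
    # umask 077 in a subshell so a new file is created with mode 600.
    (
      umask 077
      if command -v openssl >/dev/null 2>&1; then
        openssl rand -hex 32 > "$key_file"
      else
        # Fallback: 32 random bytes from /dev/urandom as 64 hex characters.
        od -An -N32 -tx1 /dev/urandom | tr -d ' \n' > "$key_file"
      fi
    )
  fi
  # Print the key without a trailing newline.
  tr -d '\n' < "$key_file"
}

# Example (the path is an assumption):
#   export AGENT_BROWSER_ENCRYPTION_KEY="$(load_or_create_key "$HOME/.agent-browser-key")"
```

In CI, a secret store is usually a better home for the key than a file on disk.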
119
+
120
+ ## Basic Login Flow
121
+
122
+ ```bash
123
+ # Navigate to login page
124
+ agent-browser open https://app.example.com/login
125
+ agent-browser wait --load networkidle
126
+
127
+ # Get form elements
128
+ agent-browser snapshot -i
129
+ # Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Sign In"
130
+
131
+ # Fill credentials
132
+ agent-browser fill @e1 "user@example.com"
133
+ agent-browser fill @e2 "password123"
134
+
135
+ # Submit
136
+ agent-browser click @e3
137
+ agent-browser wait --load networkidle
138
+
139
+ # Verify login succeeded
140
+ agent-browser get url # Should be dashboard, not login
141
+ ```
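
The final check in the flow above can be turned into an exit status so that a script stops when login fails. This is a sketch only: `assert_logged_in` is a hypothetical helper, and the `/login` path is an assumption about the target application.

```bash
#!/usr/bin/env bash
# Hypothetical helper: fail when the current URL still looks like a login page.
# The "/login" path is an assumption about the target app.
assert_logged_in() {
  local url="$1"
  case "$url" in
    */login*)
      echo "login appears to have failed, still at: $url" >&2
      return 1
      ;;
  esac
  return 0
}

# Example:
#   assert_logged_in "$(agent-browser get url)" || exit 1
```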
142
+
143
+ ## Saving Authentication State
144
+
145
+ After logging in, save state for reuse:
146
+
147
+ ```bash
148
+ # Login first (see above)
149
+ agent-browser open https://app.example.com/login
150
+ agent-browser snapshot -i
151
+ agent-browser fill @e1 "user@example.com"
152
+ agent-browser fill @e2 "password123"
153
+ agent-browser click @e3
154
+ agent-browser wait --url "**/dashboard"
155
+
156
+ # Save authenticated state
157
+ agent-browser state save ./auth-state.json
158
+ ```
159
+
160
+ ## Restoring Authentication
161
+
162
+ Skip login by loading saved state:
163
+
164
+ ```bash
165
+ # Load saved auth state
166
+ agent-browser state load ./auth-state.json
167
+
168
+ # Navigate directly to protected page
169
+ agent-browser open https://app.example.com/dashboard
170
+
171
+ # Verify authenticated
172
+ agent-browser snapshot -i
173
+ ```
174
+
175
+ ## OAuth / SSO Flows
176
+
177
+ For OAuth redirects:
178
+
179
+ ```bash
180
+ # Start OAuth flow
181
+ agent-browser open https://app.example.com/auth/google
182
+
183
+ # Handle redirects automatically
184
+ agent-browser wait --url "**/accounts.google.com**"
185
+ agent-browser snapshot -i
186
+
187
+ # Fill Google credentials
188
+ agent-browser fill @e1 "user@gmail.com"
189
+ agent-browser click @e2 # Next button
190
+ agent-browser wait 2000
191
+ agent-browser snapshot -i
192
+ agent-browser fill @e3 "password"
193
+ agent-browser click @e4 # Sign in
194
+
195
+ # Wait for redirect back
196
+ agent-browser wait --url "**/app.example.com**"
197
+ agent-browser state save ./oauth-state.json
198
+ ```
199
+
200
+ ## Two-Factor Authentication
201
+
202
+ Handle 2FA with manual intervention:
203
+
204
+ ```bash
205
+ # Login with credentials
206
+ agent-browser open https://app.example.com/login --headed # Show browser
207
+ agent-browser snapshot -i
208
+ agent-browser fill @e1 "user@example.com"
209
+ agent-browser fill @e2 "password123"
210
+ agent-browser click @e3
211
+
212
+ # Wait for user to complete 2FA manually
213
+ echo "Complete 2FA in the browser window..."
214
+ agent-browser wait --url "**/dashboard" --timeout 120000
215
+
216
+ # Save state after 2FA
217
+ agent-browser state save ./2fa-state.json
218
+ ```
219
+
220
+ ## HTTP Basic Auth
221
+
222
+ For sites using HTTP Basic Authentication:
223
+
224
+ ```bash
225
+ # Set credentials before navigation
226
+ agent-browser set credentials username password
227
+
228
+ # Navigate to protected resource
229
+ agent-browser open https://protected.example.com/api
230
+ ```
231
+
232
+ ## Cookie-Based Auth
233
+
234
+ Manually set authentication cookies:
235
+
236
+ ```bash
237
+ # Set auth cookie
238
+ agent-browser cookies set session_token "abc123xyz"
239
+
240
+ # Navigate to protected page
241
+ agent-browser open https://app.example.com/dashboard
242
+ ```
243
+
244
+ ## Token Refresh Handling
245
+
246
+ For sessions with expiring tokens:
247
+
248
+ ```bash
249
+ #!/bin/bash
250
+ # Wrapper that handles token refresh
251
+
252
+ STATE_FILE="./auth-state.json"
253
+
254
+ # Try loading existing state
255
+ if [[ -f "$STATE_FILE" ]]; then
256
+     agent-browser state load "$STATE_FILE"
257
+     agent-browser open https://app.example.com/dashboard
258
+
259
+     # Check if session is still valid
260
+     URL=$(agent-browser get url)
261
+     if [[ "$URL" == *"/login"* ]]; then
262
+         echo "Session expired, re-authenticating..."
263
+         # Perform fresh login
264
+         agent-browser snapshot -i
265
+         agent-browser fill @e1 "$USERNAME"
266
+         agent-browser fill @e2 "$PASSWORD"
267
+         agent-browser click @e3
268
+         agent-browser wait --url "**/dashboard"
269
+         agent-browser state save "$STATE_FILE"
270
+     fi
271
+ else
272
+     # First-time login
273
+     agent-browser open https://app.example.com/login
274
+     # ... login flow ...
275
+ fi
276
+ ```
277
+
278
+ ## Security Best Practices
279
+
280
+ 1. **Never commit state files** - They contain session tokens
281
+ ```bash
282
+ echo "*auth-state.json" >> .gitignore
283
+ ```
284
+
285
+ 2. **Use environment variables for credentials**
286
+ ```bash
287
+ agent-browser fill @e1 "$APP_USERNAME"
288
+ agent-browser fill @e2 "$APP_PASSWORD"
289
+ ```
290
+
291
+ 3. **Clean up after automation**
292
+ ```bash
293
+ agent-browser cookies clear
294
+ rm -f ./auth-state.json
295
+ ```
296
+
297
+ 4. **Use short-lived sessions for CI/CD**
298
+ ```bash
299
+ # Don't persist state in CI
300
+ agent-browser open https://app.example.com/login
301
+ # ... login and perform actions ...
302
+ agent-browser close # Session ends, nothing persisted
303
+ ```
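
For practice 2, a script can check that the credential variables are set before it starts filling forms. A minimal sketch, assuming bash; `require_env` is a hypothetical helper, not an agent-browser command.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report every named environment variable that is
# unset or empty, and return non-zero if any is missing.
require_env() {
  local name missing=0
  for name in "$@"; do
    # ${!name} is bash indirect expansion: the value of the variable named by $name.
    if [[ -z "${!name:-}" ]]; then
      echo "missing required environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example:
#   require_env APP_USERNAME APP_PASSWORD || exit 1
#   agent-browser fill @e1 "$APP_USERNAME"
#   agent-browser fill @e2 "$APP_PASSWORD"
```

Failing early avoids submitting a login form with empty fields.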
@@ -0,0 +1,403 @@
1
+ # Command Reference
2
+
3
+ Complete reference for all agent-browser commands. For quick start and common patterns, see SKILL.md.
4
+
5
+ ## Navigation
6
+
7
+ ```bash
8
+ agent-browser open # Launch browser (no navigation); stays on about:blank.
9
+ # Pair with `network route`, `cookies set --curl`, or
10
+ # `addinitscript` to stage state before the first navigation.
11
+ agent-browser open <url> # Launch + navigate (aliases: goto, navigate)
12
+ # Supports: https://, http://, file://, about:, data://
13
+ # Auto-prepends https:// if no protocol given
14
+ agent-browser back # Go back
15
+ agent-browser forward # Go forward
16
+ agent-browser reload # Reload page
17
+ agent-browser pushstate <url> # SPA client-side navigation. Auto-detects
18
+ # window.next.router.push (triggers RSC fetch on Next.js);
19
+ # falls back to history.pushState + popstate/navigate events.
20
+ agent-browser close # Close browser (aliases: quit, exit)
21
+ agent-browser connect 9222 # Connect to browser via CDP port
22
+ ```
23
+
24
+ ### Pre-navigation setup (one-turn batch)
25
+
26
+ ```bash
27
+ agent-browser batch \
28
+ '["open"]' \
29
+ '["network","route","*","--abort","--resource-type","script"]' \
30
+ '["cookies","set","--curl","cookies.curl","--domain","localhost"]' \
31
+ '["navigate","http://localhost:3000/target"]'
32
+ ```
33
+
34
+ `open` with no URL gives you a clean launch so any interception, cookies,
35
+ or init scripts you register take effect on the *first* real navigation.
36
+ Use it for SSR-only debugging (`--resource-type script`), protected-origin auth,
37
+ or capturing fresh `react suspense`/`vitals` state without noise from a
38
+ prior page.
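
Each `batch` argument in the example above is a JSON array written by hand. When the commands come from shell variables, a small quoting helper may reduce mistakes. This is a sketch under assumptions: `json_cmd` is a hypothetical helper, and it escapes only backslashes and double quotes.

```bash
#!/usr/bin/env bash
# Hypothetical helper: turn shell arguments into one JSON array string,
# the shape each `batch` argument takes in the example above.
# Escapes only backslashes and double quotes; control characters are not handled.
json_cmd() {
  local out="[" sep="" arg
  for arg in "$@"; do
    # Escape backslashes first, then double quotes.
    arg=$(printf '%s' "$arg" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
    out+="${sep}\"${arg}\""
    sep=","
  done
  printf '%s]\n' "$out"
}

# Example:
#   agent-browser batch "$(json_cmd open)" \
#     "$(json_cmd navigate "http://localhost:3000/target")"
```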
39
+
40
+ ## Snapshot (page analysis)
41
+
42
+ ```bash
43
+ agent-browser snapshot # Full accessibility tree
44
+ agent-browser snapshot -i # Interactive elements only (recommended)
45
+ agent-browser snapshot -c # Compact output
46
+ agent-browser snapshot -d 3 # Limit depth to 3
47
+ agent-browser snapshot -s "#main" # Scope to CSS selector
48
+ ```
49
+
50
+ ## Interactions (use @refs from snapshot)
51
+
52
+ ```bash
53
+ agent-browser click @e1 # Click
54
+ agent-browser click @e1 --new-tab # Click and open in new tab
55
+ agent-browser dblclick @e1 # Double-click
56
+ agent-browser focus @e1 # Focus element
57
+ agent-browser fill @e2 "text" # Clear and type
58
+ agent-browser type @e2 "text" # Type without clearing
59
+ agent-browser press Enter # Press key (alias: key)
60
+ agent-browser press Control+a # Key combination
61
+ agent-browser keydown Shift # Hold key down
62
+ agent-browser keyup Shift # Release key
63
+ agent-browser hover @e1 # Hover
64
+ agent-browser check @e1 # Check checkbox
65
+ agent-browser uncheck @e1 # Uncheck checkbox
66
+ agent-browser select @e1 "value" # Select dropdown option
67
+ agent-browser select @e1 "a" "b" # Select multiple options
68
+ agent-browser scroll down 500 # Scroll page (default: down 300px)
69
+ agent-browser scrollintoview @e1 # Scroll element into view (alias: scrollinto)
70
+ agent-browser drag @e1 @e2 # Drag and drop
71
+ agent-browser upload @e1 file.pdf # Upload files
72
+ ```
73
+
74
+ Clicks fail before dispatch when another element covers the target's click
75
+ point. The error names the covering element, for example
76
+ `covered by <div#consent-banner>`. Dismiss or interact with that element, run a
77
+ fresh snapshot, then retry the original action.
78
+
79
+ ## Get Information
80
+
81
+ ```bash
82
+ agent-browser get text @e1 # Get element text
83
+ agent-browser get html @e1 # Get innerHTML
84
+ agent-browser get value @e1 # Get input value
85
+ agent-browser get attr @e1 href # Get attribute
86
+ agent-browser get title # Get page title
87
+ agent-browser get url # Get current URL
88
+ agent-browser get cdp-url # Get CDP WebSocket URL
89
+ agent-browser get count ".item" # Count matching elements
90
+ agent-browser get box @e1 # Get bounding box
91
+ agent-browser get styles @e1 # Get computed styles (font, color, bg, etc.)
92
+ ```
93
+
94
+ ## Check State
95
+
96
+ ```bash
97
+ agent-browser is visible @e1 # Check if visible
98
+ agent-browser is enabled @e1 # Check if enabled
99
+ agent-browser is checked @e1 # Check if checked
100
+ ```
101
+
102
+ ## Screenshots and PDF
103
+
104
+ ```bash
105
+ agent-browser screenshot # Save to temporary directory
106
+ agent-browser screenshot path.png # Save to specific path
107
+ agent-browser screenshot --full # Full page
108
+ agent-browser pdf output.pdf # Save as PDF
109
+ ```
110
+
111
+ Headless Chromium screenshots hide native scrollbars for consistent image output.
112
+ Pass `--hide-scrollbars false` when launching to keep native scrollbars visible.
113
+
114
+ ## Video Recording
115
+
116
+ ```bash
117
+ agent-browser open https://example.com # Launch a browser session first
118
+ agent-browser record start ./demo.webm # Start recording
119
+ agent-browser click @e1 # Perform actions
120
+ agent-browser record stop # Stop and save video
121
+ agent-browser record restart ./take2.webm # Stop current + start new
122
+ ```
123
+
124
+ ## Wait
125
+
126
+ ```bash
127
+ agent-browser wait @e1 # Wait for element
128
+ agent-browser wait 2000 # Wait milliseconds
129
+ agent-browser wait --text "Success" # Wait for text (or -t)
130
+ agent-browser wait --url "**/dashboard" # Wait for URL pattern (or -u)
131
+ agent-browser wait --load networkidle # Wait for network idle (or -l)
132
+ agent-browser wait --fn "window.ready" # Wait for JS condition (or -f)
133
+ ```
134
+
135
+ ## Mouse Control
136
+
137
+ ```bash
138
+ agent-browser mouse move 100 200 # Move mouse
139
+ agent-browser mouse down left # Press button
140
+ agent-browser mouse up left # Release button
141
+ agent-browser mouse wheel 100 # Scroll wheel
142
+ ```
143
+
144
+ ## Semantic Locators (alternative to refs)
145
+
146
+ ```bash
147
+ agent-browser find role button click --name "Submit"
148
+ agent-browser find text "Sign In" click
149
+ agent-browser find text "Sign In" click --exact # Exact match only
150
+ agent-browser find label "Email" fill "user@test.com"
151
+ agent-browser find placeholder "Search" type "query"
152
+ agent-browser find alt "Logo" click
153
+ agent-browser find title "Close" click
154
+ agent-browser find testid "submit-btn" click
155
+ agent-browser find first ".item" click
156
+ agent-browser find last ".item" click
157
+ agent-browser find nth 2 "a" hover
158
+ ```
159
+
160
+ ## Browser Settings
161
+
162
+ ```bash
163
+ agent-browser set viewport 1920 1080 # Set viewport size
164
+ agent-browser set viewport 1920 1080 2 # 2x retina (same CSS size, higher res screenshots)
165
+ agent-browser set device "iPhone 14" # Emulate device
166
+ agent-browser set geo 37.7749 -122.4194 # Set geolocation (alias: geolocation)
167
+ agent-browser set offline on # Toggle offline mode
168
+ agent-browser set headers '{"X-Key":"v"}' # Extra HTTP headers
169
+ agent-browser set credentials user pass # HTTP basic auth (alias: auth)
170
+ agent-browser set media dark # Emulate color scheme
171
+ agent-browser set media light reduced-motion # Light mode + reduced motion
172
+ ```
173
+
174
+ ## Cookies and Storage
175
+
176
+ ```bash
177
+ agent-browser cookies # Get all cookies
178
+ agent-browser cookies set name value # Set cookie
179
+ agent-browser cookies clear # Clear cookies
180
+ agent-browser storage local # Get all localStorage
181
+ agent-browser storage local key # Get specific key
182
+ agent-browser storage local set k v # Set value
183
+ agent-browser storage local clear # Clear all
184
+ ```
185
+
186
+ ## Network
187
+
188
+ ```bash
189
+ agent-browser network route <url> # Intercept requests
190
+ agent-browser network route <url> --abort # Block requests
191
+ agent-browser network route <url> --body '{}' # Mock response
192
+ agent-browser network unroute [url] # Remove routes
193
+ agent-browser network requests # View tracked requests
194
+ agent-browser network requests --filter api # Filter requests
195
+ ```
196
+
197
+ ## Tabs and Windows
198
+
199
+ ```bash
200
+ agent-browser tab # List tabs with tabId and label
201
+ agent-browser tab new [url] # New tab
202
+ agent-browser tab new --label docs [url] # New tab with a memorable label
203
+ agent-browser tab t2 # Switch to tab by id
204
+ agent-browser tab docs # Switch to tab by label
205
+ agent-browser tab close # Close current tab
206
+ agent-browser tab close t2 # Close tab by id
207
+ agent-browser tab close docs # Close tab by label
208
+ agent-browser window new # New window
209
+ ```
210
+
211
+ Tab ids are stable strings of the form `t1`, `t2`, `t3`. They're never reused
212
+ within a session, so the same id keeps referring to the same tab across
213
+ commands. Positional integers are **not** accepted — `tab 2` errors with a
214
+ teaching message; use `t2`.
215
+
216
+ User-assigned labels (`docs`, `app`, `admin`) are interchangeable with ids
217
+ everywhere a tab ref is accepted. Labels are the agent-friendly way to write
218
+ multi-tab workflows:
219
+
220
+ ```bash
221
+ agent-browser tab new --label docs https://docs.example.com
222
+ agent-browser tab new --label app https://app.example.com
223
+ agent-browser tab docs # switch to docs
224
+ agent-browser snapshot # populate refs for docs
225
+ agent-browser click @e1 # ref click on docs
226
+ agent-browser tab app # switch to app
227
+ agent-browser tab close docs # close by label
228
+ ```
229
+
230
+ Labels are never auto-generated, never rewritten on navigation, and must be
231
+ unique within a session. To interact with another tab, switch to it first:
232
+ the daemon maintains a single active tab, so refs (`@eN`) belong to the tab
233
+ that was active when the snapshot ran.
234
+
235
+ ## Frames
236
+
237
+ ```bash
238
+ agent-browser frame "#iframe" # Switch to iframe by CSS selector
239
+ agent-browser frame @e3 # Switch to iframe by element ref
240
+ agent-browser frame main # Back to main frame
241
+ ```
242
+
243
+ ### Iframe support
244
+
245
+ Iframes are detected automatically during snapshots. When the main-frame snapshot runs, `Iframe` nodes are resolved and their content is inlined beneath the iframe element in the output (one level of nesting; iframes within iframes are not expanded).
246
+
247
+ ```bash
248
+ agent-browser snapshot -i
249
+ # @e3 [Iframe] "payment-frame"
250
+ # @e4 [input] "Card number"
251
+ # @e5 [button] "Pay"
252
+
253
+ # Interact directly — refs inside iframes already work
254
+ agent-browser fill @e4 "4111111111111111"
255
+ agent-browser click @e5
256
+
257
+ # Or switch frame context for scoped snapshots
258
+ agent-browser frame @e3 # Switch using element ref
259
+ agent-browser snapshot -i # Snapshot scoped to that iframe
260
+ agent-browser frame main # Return to main frame
261
+ ```
262
+
263
+ The `frame` command accepts:
264
+ - **Element refs** — `frame @e3` resolves the ref to an iframe element
265
+ - **CSS selectors** — `frame "#payment-iframe"` finds the iframe by selector
266
+ - **Frame name/URL** — matches against the browser's frame tree
267
+
268
+ ## Dialogs
269
+
270
+ By default, `alert` and `beforeunload` dialogs are automatically accepted so they never block the agent. `confirm` and `prompt` dialogs still require explicit handling. Use `--no-auto-dialog` to disable this behavior.
271
+
272
+ ```bash
273
+ agent-browser dialog accept [text] # Accept dialog
274
+ agent-browser dialog dismiss # Dismiss dialog
275
+ agent-browser dialog status # Check if a dialog is currently open
276
+ ```
277
+
278
+ ## JavaScript
279
+
280
+ ```bash
281
+ agent-browser eval "document.title" # Simple expressions only
282
+ agent-browser eval -b "<base64>" # Any JavaScript (base64 encoded)
283
+ agent-browser eval --stdin # Read script from stdin
284
+ ```
285
+
286
+ Use `-b`/`--base64` or `--stdin` for reliable execution. Shell escaping with nested quotes and special characters is error-prone.
287
+
288
+ ```bash
289
+ # Base64 encode your script, then:
290
+ agent-browser eval -b "ZG9jdW1lbnQucXVlcnlTZWxlY3RvcignW3NyYyo9Il9uZXh0Il0nKQ=="
291
+
292
+ # Or use stdin with heredoc for multiline scripts:
293
+ cat <<'EOF' | agent-browser eval --stdin
294
+ const links = document.querySelectorAll('a');
295
+ Array.from(links).map(a => a.href);
296
+ EOF
297
+ ```
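
The "Base64 encode your script" step can be sketched as a one-line helper. `encode_js` is a hypothetical name; the only assumption is that a `base64` utility is on the `PATH`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: base64-encode a script onto a single line for `eval -b`.
# `tr -d '\n'` removes the line wrapping that some base64 tools add.
encode_js() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Example:
#   agent-browser eval -b "$(encode_js 'document.title')"
```

Using `printf '%s'` rather than `echo` keeps a trailing newline out of the encoded script.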
298
+
299
+ ## State Management
300
+
301
+ ```bash
302
+ agent-browser state save auth.json # Save cookies, storage, auth state
303
+ agent-browser state load auth.json # Restore saved state
304
+ ```
305
+
306
+ ## Global Options
307
+
308
+ ```bash
309
+ agent-browser --session <name> ... # Isolated browser session
310
+ agent-browser --json ... # JSON output for parsing
311
+ agent-browser --headed ... # Show browser window (not headless)
312
+ agent-browser --cdp <port> ... # Connect via Chrome DevTools Protocol
313
+ agent-browser -p <provider> ... # Cloud browser provider (--provider)
314
+ agent-browser --proxy <url> ... # Use proxy server
315
+ agent-browser --proxy-bypass <hosts> # Hosts to bypass proxy
316
+ agent-browser --headers <json> ... # HTTP headers scoped to URL's origin
317
+ agent-browser --executable-path <p> # Custom browser executable
318
+ agent-browser --extension <path> ... # Load browser extension (repeatable)
319
+ agent-browser --ignore-https-errors # Ignore SSL certificate errors
320
+ agent-browser --hide-scrollbars false # Keep native scrollbars visible in headless Chromium screenshots
321
+ agent-browser --help # Show help (-h)
322
+ agent-browser --version # Show version (-V)
323
+ agent-browser <command> --help # Show detailed help for a command
324
+ ```
325
+
326
+ ## Debugging
327
+
328
+ ```bash
329
+ agent-browser --headed open example.com # Show browser window
330
+ agent-browser --cdp 9222 snapshot # Connect via CDP port
331
+ agent-browser connect 9222 # Alternative: connect command
332
+ agent-browser console # View console messages
333
+ agent-browser console --clear # Clear console
334
+ agent-browser errors # View page errors
335
+ agent-browser errors --clear # Clear errors
336
+ agent-browser highlight @e1 # Highlight element
337
+ agent-browser inspect # Open Chrome DevTools for this session
338
+ agent-browser trace start # Start recording trace
339
+ agent-browser trace stop trace.json # Stop and save trace
340
+ agent-browser profiler start # Start Chrome DevTools profiling
341
+ agent-browser profiler stop trace.json # Stop and save profile
342
+ ```
343
+
344
+ ## React / Web Vitals
345
+
346
+ The `react ...` commands require `--enable react-devtools` at launch.
347
+ `vitals` and `pushstate` are framework-agnostic.
348
+
349
+ ```bash
350
+ agent-browser open --enable react-devtools <url> # Launch with React hook installed
351
+ agent-browser react tree # Full component tree
352
+ agent-browser react inspect <fiberId> # Props, hooks, state, source
353
+ agent-browser react renders start # Begin re-render recording
354
+ agent-browser react renders stop [--json] # Stop and print render profile
355
+ agent-browser react suspense [--only-dynamic] [--json] # Suspense boundaries + classifier
356
+ # --only-dynamic hides the "static" list
357
+ agent-browser vitals [url] [--json] # LCP/CLS/TTFB/FCP/INP + hydration
358
+ agent-browser pushstate <url> # SPA client-side nav (auto-detects Next router)
359
+ ```
360
+
361
+ `vitals` prints a summary by default and uses the same fields as the structured
362
+ `--json` response.
363
+
364
+ ## Init scripts
365
+
366
+ ```bash
367
+ agent-browser open --init-script <path> # Register before first navigation (repeatable)
368
+ agent-browser addinitscript <js> # Register at runtime (returns identifier)
369
+ agent-browser removeinitscript <identifier> # Remove a previously registered init script
370
+ ```
371
+
372
+ ## cURL cookie import
373
+
374
+ ```bash
375
+ agent-browser cookies set --curl <file> # Auto-detects JSON/cURL/Cookie-header
376
+ agent-browser cookies set --curl <file> --domain example.com # Scope to a domain
377
+ ```
378
+
379
+ Supported formats: JSON array of `{name, value}`, a cURL dump from
380
+ DevTools -> Network -> Copy as cURL, or a bare Cookie header. Errors never
381
+ echo cookie values.
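
As an illustration of the JSON format, the sketch below writes a one-cookie file that could then be passed to `--curl`. `write_cookie_json` is a hypothetical helper, and it assumes the name and value contain no double quotes or backslashes.

```bash
#!/usr/bin/env bash
# Hypothetical helper: write one cookie as a JSON array of {name, value}.
# Assumes name and value contain no double quotes or backslashes.
write_cookie_json() {
  local name="$1" value="$2" out_file="$3"
  # Subshell with umask 077 so a newly created file is owner-only.
  (
    umask 077
    printf '[{"name":"%s","value":"%s"}]\n' "$name" "$value" > "$out_file"
  )
}

# Example:
#   write_cookie_json session_token "$SESSION_TOKEN" ./cookies.json
#   agent-browser cookies set --curl ./cookies.json --domain localhost
```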
382
+
383
+ ## Network route by resource type
384
+
385
+ ```bash
386
+ agent-browser network route '*' --abort --resource-type script # Block scripts only (SSR-lock pattern)
387
+ agent-browser network route '*' --resource-type image,font --body '' # Stub images and fonts
388
+ ```
389
+
390
+ ## Environment Variables
391
+
392
+ ```bash
393
+ AGENT_BROWSER_SESSION="mysession" # Default session name
394
+ AGENT_BROWSER_EXECUTABLE_PATH="/path/chrome" # Custom browser path
395
+ AGENT_BROWSER_EXTENSIONS="/ext1,/ext2" # Comma-separated extension paths
396
+ AGENT_BROWSER_INIT_SCRIPTS="/a.js,/b.js" # Comma-separated init script paths
397
+ AGENT_BROWSER_ENABLE="react-devtools" # Comma-separated built-in init script features
398
+ AGENT_BROWSER_HIDE_SCROLLBARS="false" # Keep native scrollbars visible in headless Chromium screenshots
399
+ AGENT_BROWSER_PROVIDER="browserbase" # Cloud browser provider
400
+ AGENT_BROWSER_STREAM_PORT="9223" # Override WebSocket streaming port (default: OS-assigned)
401
+ AGENT_BROWSER_CONFIG="./agent-browser.json" # Custom config file
402
+ AGENT_BROWSER_CDP="9222" # Connect daemon to CDP port or WebSocket URL
403
+ ```