agent-browser 0.24.1 → 0.25.0
- package/README.md +44 -7
- package/bin/agent-browser-darwin-arm64 +0 -0
- package/bin/agent-browser-darwin-x64 +0 -0
- package/bin/agent-browser-linux-arm64 +0 -0
- package/bin/agent-browser-linux-musl-arm64 +0 -0
- package/bin/agent-browser-linux-musl-x64 +0 -0
- package/bin/agent-browser-linux-x64 +0 -0
- package/bin/agent-browser-win32-x64.exe +0 -0
- package/package.json +1 -1
- package/skills/agent-browser/SKILL.md +105 -53
package/README.md
CHANGED
|
@@ -130,6 +130,8 @@ agent-browser stream status # Show runtime streaming state and bound p
|
|
|
130
130
|
agent-browser stream disable # Stop runtime WebSocket streaming
|
|
131
131
|
agent-browser close # Close browser (aliases: quit, exit)
|
|
132
132
|
agent-browser close --all # Close all active sessions
|
|
133
|
+
agent-browser chat "<instruction>" # AI chat: natural language browser control (single-shot)
|
|
134
|
+
agent-browser chat # AI chat: interactive REPL mode
|
|
133
135
|
```
|
|
134
136
|
|
|
135
137
|
### Get Info
|
|
@@ -203,21 +205,24 @@ agent-browser wait "#spinner" --state hidden
|
|
|
203
205
|
|
|
204
206
|
### Batch Execution
|
|
205
207
|
|
|
206
|
-
Execute multiple commands in a single invocation
|
|
207
|
-
|
|
208
|
-
when running multi-step workflows.
|
|
208
|
+
Execute multiple commands in a single invocation. Commands can be passed as
|
|
209
|
+
quoted arguments or piped as JSON via stdin. This avoids per-command process
|
|
210
|
+
startup overhead when running multi-step workflows.
|
|
209
211
|
|
|
210
212
|
```bash
|
|
211
|
-
#
|
|
213
|
+
# Argument mode: each quoted argument is a full command
|
|
214
|
+
agent-browser batch "open https://example.com" "snapshot -i" "screenshot"
|
|
215
|
+
|
|
216
|
+
# With --bail to stop on first error
|
|
217
|
+
agent-browser batch --bail "open https://example.com" "click @e1" "screenshot"
|
|
218
|
+
|
|
219
|
+
# Stdin mode: pipe commands as JSON
|
|
212
220
|
echo '[
|
|
213
221
|
["open", "https://example.com"],
|
|
214
222
|
["snapshot", "-i"],
|
|
215
223
|
["click", "@e1"],
|
|
216
224
|
["screenshot", "result.png"]
|
|
217
225
|
]' | agent-browser batch --json
|
|
218
|
-
|
|
219
|
-
# Stop on first error
|
|
220
|
-
agent-browser batch --bail < commands.json
|
|
221
226
|
```
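As a minimal sketch of the stdin mode described above, a workflow can be kept in a JSON file and fed to `batch` through a redirect. The file location, the steps, and the `command -v` guard are illustrative assumptions and not part of the package; only the `batch --bail < file` form comes from the documentation.

```bash
# Sketch: store a multi-step workflow as JSON and feed it to batch via stdin.
# The path and the steps below are illustrative placeholders.
commands_file="${TMPDIR:-/tmp}/commands.json"

# Each inner array is one command: the command name followed by its arguments.
cat > "$commands_file" <<'EOF'
[
  ["open", "https://example.com"],
  ["snapshot", "-i"],
  ["screenshot", "result.png"]
]
EOF

# Run the batch only when the CLI is installed; otherwise just show the file.
if command -v agent-browser >/dev/null 2>&1; then
  agent-browser batch --bail < "$commands_file"
else
  cat "$commands_file"
fi
```

Keeping the steps in a file makes the workflow easy to review and reuse, at the cost of one extra file to manage.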
|
|
222
227
|
|
|
223
228
|
### Clipboard
|
|
@@ -548,6 +553,7 @@ The `snapshot` command supports filtering to reduce output size:
|
|
|
548
553
|
```bash
|
|
549
554
|
agent-browser snapshot # Full accessibility tree
|
|
550
555
|
agent-browser snapshot -i # Interactive elements only (buttons, inputs, links)
|
|
556
|
+
agent-browser snapshot -i --urls # Interactive elements with link URLs
|
|
551
557
|
agent-browser snapshot -c # Compact (remove empty structural elements)
|
|
552
558
|
agent-browser snapshot -d 3 # Limit depth to 3 levels
|
|
553
559
|
agent-browser snapshot -s "#main" # Scope to CSS selector
|
|
@@ -557,6 +563,7 @@ agent-browser snapshot -i -c -d 5 # Combine options
|
|
|
557
563
|
| Option | Description |
|
|
558
564
|
| ---------------------- | ----------------------------------------------------------------------- |
|
|
559
565
|
| `-i, --interactive` | Only show interactive elements (buttons, links, inputs) |
|
|
566
|
+
| `-u, --urls` | Include href URLs for link elements |
|
|
560
567
|
| `-c, --compact` | Remove empty structural elements |
|
|
561
568
|
| `-d, --depth <n>` | Limit tree depth |
|
|
562
569
|
| `-s, --selector <sel>` | Scope to CSS selector |
|
|
@@ -621,6 +628,9 @@ This is useful for multimodal AI models that can reason about visual layout, unl
|
|
|
621
628
|
| `--confirm-interactive` | Interactive confirmation prompts; auto-denies if stdin is not a TTY (or `AGENT_BROWSER_CONFIRM_INTERACTIVE` env) |
|
|
622
629
|
| `--engine <name>` | Browser engine: `chrome` (default), `lightpanda` (or `AGENT_BROWSER_ENGINE` env) |
|
|
623
630
|
| `--no-auto-dialog` | Disable automatic dismissal of `alert`/`beforeunload` dialogs (or `AGENT_BROWSER_NO_AUTO_DIALOG` env) |
|
|
631
|
+
| `--model <name>` | AI model for chat command (or `AI_GATEWAY_MODEL` env) |
|
|
632
|
+
| `-v`, `--verbose` | Show tool commands and their raw output (chat) |
|
|
633
|
+
| `-q`, `--quiet` | Show only AI text responses, hide tool calls (chat) |
|
|
624
634
|
| `--config <path>` | Use a custom config file (or `AGENT_BROWSER_CONFIG` env) |
|
|
625
635
|
| `--debug` | Debug output |
|
|
626
636
|
|
|
@@ -650,6 +660,33 @@ The dashboard displays:
|
|
|
650
660
|
- **Activity feed** -- chronological command/result stream with timing and expandable details
|
|
651
661
|
- **Console output** -- browser console messages (log, warn, error)
|
|
652
662
|
- **Session creation** -- create new sessions from the UI with local engines (Chrome, Lightpanda) or cloud providers (AgentCore, Browserbase, Browserless, Browser Use, Kernel)
|
|
663
|
+
- **AI Chat** -- chat with an AI assistant directly in the dashboard (requires Vercel AI Gateway configuration)
|
|
664
|
+
|
|
665
|
+
### AI Chat
|
|
666
|
+
|
|
667
|
+
The dashboard includes an optional AI chat panel powered by the Vercel AI Gateway. The same functionality is available directly from the CLI via the `chat` command. Set these environment variables to enable AI chat:
|
|
668
|
+
|
|
669
|
+
```bash
|
|
670
|
+
export AI_GATEWAY_API_KEY=gw_your_key_here
|
|
671
|
+
export AI_GATEWAY_MODEL=anthropic/claude-sonnet-4.6 # optional, this is the default
|
|
672
|
+
export AI_GATEWAY_URL=https://ai-gateway.vercel.sh # optional, this is the default
|
|
673
|
+
```
|
|
674
|
+
|
|
675
|
+
**CLI usage:**
|
|
676
|
+
|
|
677
|
+
```bash
|
|
678
|
+
agent-browser chat "open google.com and search for cats" # Single-shot
|
|
679
|
+
agent-browser chat # Interactive REPL
|
|
680
|
+
agent-browser -q chat "summarize this page" # Quiet mode (text only)
|
|
681
|
+
agent-browser -v chat "fill in the login form" # Verbose (show command output)
|
|
682
|
+
agent-browser --model openai/gpt-4o chat "take a screenshot" # Override model
|
|
683
|
+
```
|
|
684
|
+
|
|
685
|
+
The `chat` command translates natural language instructions into agent-browser commands, executes them, and streams the AI response. In interactive mode, type `quit` to exit. Use `--json` for structured output suitable for agent consumption.
|
|
686
|
+
|
|
687
|
+
**Dashboard usage:**
|
|
688
|
+
|
|
689
|
+
The Chat tab is always visible in the dashboard. When `AI_GATEWAY_API_KEY` is set, the Rust server proxies requests to the gateway and streams responses back using the Vercel AI SDK's UI Message Stream protocol. Without the key, sending a message shows an error inline.
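The environment setup above can be checked before invoking `chat`. The sketch below uses a hypothetical helper named `chat_ready` (not part of agent-browser) that stops early when `AI_GATEWAY_API_KEY` is missing and fills in the documented defaults for the two optional variables.

```bash
# Hypothetical helper (not provided by agent-browser): verify the gateway
# environment before starting a chat session.
chat_ready() {
  if [ -z "${AI_GATEWAY_API_KEY:-}" ]; then
    echo "AI_GATEWAY_API_KEY is not set; AI chat is not enabled" >&2
    return 1
  fi
  # Fall back to the documented defaults when the optional variables are unset.
  : "${AI_GATEWAY_MODEL:=anthropic/claude-sonnet-4.6}"
  : "${AI_GATEWAY_URL:=https://ai-gateway.vercel.sh}"
  export AI_GATEWAY_MODEL AI_GATEWAY_URL
}

# Usage: only invoke chat once the environment is ready.
# chat_ready && agent-browser chat "summarize this page"
```

Failing fast in the shell gives a clearer message than discovering the missing key after a session has started.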
|
|
653
690
|
|
|
654
691
|
## Configuration
|
|
655
692
|
|
|
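package/bin/agent-browser-darwin-arm64: Binary file
package/bin/agent-browser-darwin-x64: Binary file
package/bin/agent-browser-linux-arm64: Binary file
package/bin/agent-browser-linux-musl-arm64: Binary file
package/bin/agent-browser-linux-musl-x64: Binary file
package/bin/agent-browser-linux-x64: Binary file
package/bin/agent-browser-win32-x64.exe: Binary file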
|
package/package.json
CHANGED
package/skills/agent-browser/SKILL.md
CHANGED
|
@@ -25,7 +25,7 @@ agent-browser snapshot -i
|
|
|
25
25
|
agent-browser fill @e1 "user@example.com"
|
|
26
26
|
agent-browser fill @e2 "password123"
|
|
27
27
|
agent-browser click @e3
|
|
28
|
-
agent-browser wait
|
|
28
|
+
agent-browser wait 2000
|
|
29
29
|
agent-browser snapshot -i # Check result
|
|
30
30
|
```
|
|
31
31
|
|
|
@@ -34,14 +34,14 @@ agent-browser snapshot -i # Check result
|
|
|
34
34
|
Commands can be chained with `&&` in a single shell invocation. The browser persists between commands via a background daemon, so chaining is safe and more efficient than separate calls.
|
|
35
35
|
|
|
36
36
|
```bash
|
|
37
|
-
# Chain open +
|
|
38
|
-
agent-browser open https://example.com && agent-browser
|
|
37
|
+
# Chain open + snapshot in one call (open already waits for page load)
|
|
38
|
+
agent-browser open https://example.com && agent-browser snapshot -i
|
|
39
39
|
|
|
40
40
|
# Chain multiple interactions
|
|
41
41
|
agent-browser fill @e1 "user@example.com" && agent-browser fill @e2 "password123" && agent-browser click @e3
|
|
42
42
|
|
|
43
43
|
# Navigate and capture
|
|
44
|
-
agent-browser open https://example.com && agent-browser
|
|
44
|
+
agent-browser open https://example.com && agent-browser screenshot
|
|
45
45
|
```
|
|
46
46
|
|
|
47
47
|
**When to chain:** Use `&&` when you don't need to read the output of an intermediate command before proceeding (e.g., open + wait + screenshot). Run commands separately when you need to parse the output first (e.g., snapshot to discover refs, then interact using those refs).
|
|
@@ -117,6 +117,11 @@ See [references/authentication.md](references/authentication.md) for OAuth, 2FA,
|
|
|
117
117
|
## Essential Commands
|
|
118
118
|
|
|
119
119
|
```bash
|
|
120
|
+
# Batch: ALWAYS use batch for 2+ sequential commands. Commands run in order.
|
|
121
|
+
agent-browser batch "open https://example.com" "snapshot -i"
|
|
122
|
+
agent-browser batch "open https://example.com" "screenshot"
|
|
123
|
+
agent-browser batch "click @e1" "wait 1000" "screenshot"
|
|
124
|
+
|
|
120
125
|
# Navigation
|
|
121
126
|
agent-browser open <url> # Navigate (aliases: goto, navigate)
|
|
122
127
|
agent-browser close # Close browser
|
|
@@ -124,6 +129,7 @@ agent-browser close --all # Close all active sessions
|
|
|
124
129
|
|
|
125
130
|
# Snapshot
|
|
126
131
|
agent-browser snapshot -i # Interactive elements with refs (recommended)
|
|
132
|
+
agent-browser snapshot -i --urls # Include href URLs for links
|
|
127
133
|
agent-browser snapshot -s "#selector" # Scope to CSS selector
|
|
128
134
|
|
|
129
135
|
# Interaction (use @refs from snapshot)
|
|
@@ -147,10 +153,10 @@ agent-browser get cdp-url # Get CDP WebSocket URL
|
|
|
147
153
|
|
|
148
154
|
# Wait
|
|
149
155
|
agent-browser wait @e1 # Wait for element
|
|
150
|
-
agent-browser wait --load networkidle # Wait for network idle
|
|
151
|
-
agent-browser wait --url "**/page" # Wait for URL pattern
|
|
152
156
|
agent-browser wait 2000 # Wait milliseconds
|
|
153
|
-
agent-browser wait --
|
|
157
|
+
agent-browser wait --url "**/page" # Wait for URL pattern
|
|
158
|
+
agent-browser wait --text "Welcome" # Wait for text to appear (substring match)
|
|
159
|
+
agent-browser wait --load networkidle # Wait for network idle (caution: see Pitfalls)
|
|
154
160
|
agent-browser wait --fn "!document.body.innerText.includes('Loading...')" # Wait for text to disappear
|
|
155
161
|
agent-browser wait "#spinner" --state hidden # Wait for element to disappear
|
|
156
162
|
|
|
@@ -159,6 +165,14 @@ agent-browser download @e1 ./file.pdf # Click element to trigger downlo
|
|
|
159
165
|
agent-browser wait --download ./output.zip # Wait for any download to complete
|
|
160
166
|
agent-browser --download-path ./downloads open <url> # Set default download directory
|
|
161
167
|
|
|
168
|
+
# Tab management
|
|
169
|
+
agent-browser tab list # List all open tabs
|
|
170
|
+
agent-browser tab new # Open a blank new tab
|
|
171
|
+
agent-browser tab new https://example.com # Open URL in a new tab
|
|
172
|
+
agent-browser tab 2 # Switch to tab by index (0-based)
|
|
173
|
+
agent-browser tab close # Close the current tab
|
|
174
|
+
agent-browser tab close 2 # Close tab by index
|
|
175
|
+
|
|
162
176
|
# Network
|
|
163
177
|
agent-browser network requests # Inspect tracked requests
|
|
164
178
|
agent-browser network requests --type xhr,fetch # Filter by resource type
|
|
@@ -210,6 +224,13 @@ agent-browser diff screenshot --baseline before.png # Visual pixel diff
|
|
|
210
224
|
agent-browser diff url <url1> <url2> # Compare two pages
|
|
211
225
|
agent-browser diff url <url1> <url2> --wait-until networkidle # Custom wait strategy
|
|
212
226
|
agent-browser diff url <url1> <url2> --selector "#main" # Scope to element
|
|
227
|
+
|
|
228
|
+
# Chat (AI natural language control)
|
|
229
|
+
agent-browser chat "open google.com and search for cats" # Single-shot instruction
|
|
230
|
+
agent-browser chat # Interactive REPL mode
|
|
231
|
+
agent-browser -q chat "summarize this page" # Quiet (text only, no tool calls)
|
|
232
|
+
agent-browser -v chat "fill in the login form" # Verbose (show command output)
|
|
233
|
+
agent-browser --model openai/gpt-4o chat "take a screenshot" # Override model
|
|
213
234
|
```
|
|
214
235
|
|
|
215
236
|
## Streaming
|
|
@@ -218,35 +239,62 @@ Every session automatically starts a WebSocket stream server on an OS-assigned p
|
|
|
218
239
|
|
|
219
240
|
## Batch Execution
|
|
220
241
|
|
|
221
|
-
|
|
242
|
+
ALWAYS use `batch` when running 2+ commands in sequence. Batch executes commands in order, so dependent commands (like navigate then screenshot) work correctly. Each quoted argument is a separate command.
|
|
222
243
|
|
|
223
244
|
```bash
|
|
224
|
-
|
|
225
|
-
|
|
226
|
-
|
|
227
|
-
|
|
228
|
-
|
|
229
|
-
|
|
245
|
+
# Navigate and take a snapshot
|
|
246
|
+
agent-browser batch "open https://example.com" "snapshot -i"
|
|
247
|
+
|
|
248
|
+
# Navigate, snapshot, and screenshot in one call
|
|
249
|
+
agent-browser batch "open https://example.com" "snapshot -i" "screenshot"
|
|
250
|
+
|
|
251
|
+
# Click, wait, then screenshot
|
|
252
|
+
agent-browser batch "click @e1" "wait 1000" "screenshot"
|
|
253
|
+
|
|
254
|
+
# With --bail to stop on first error
|
|
255
|
+
agent-browser batch --bail "open https://example.com" "click @e1" "screenshot"
|
|
256
|
+
```
|
|
257
|
+
|
|
258
|
+
Only use a single command (not batch) when you need to read the output before deciding the next command. For example, you must run `snapshot -i` as a single command when you need to read the refs to decide what to click. After reading the snapshot, batch the remaining steps.
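The read-then-batch split can be sketched as follows. This assumes snapshot lines have the shape `@eN [role] "name"`; the real output format may differ, and `ref_for` is a hypothetical helper rather than an agent-browser command. The sample text stands in for real `snapshot -i` output.

```bash
# Assumption: snapshot lines look like   @e5 [button] "Pay"
# ref_for is a hypothetical helper; adjust the pattern to the real output.
ref_for() {
  # $1 = role, $2 = accessible name; reads snapshot text on stdin
  # and prints the first matching ref.
  grep -F "[$1] \"$2\"" | sed -n 's/^[^@]*\(@e[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Stand-in for the output of a single `agent-browser snapshot -i` call.
sample='@e1 [heading] "Checkout"
@e5 [button] "Pay"'

pay_ref=$(printf '%s\n' "$sample" | ref_for button Pay)
echo "$pay_ref"

# With the ref known, the remaining steps can go into one batch:
# agent-browser batch "click $pay_ref" "wait 1000" "screenshot"
```

The snapshot is the only step whose output has to be read, so it runs alone; everything after it is batched.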
|
|
230
259
|
|
|
231
|
-
|
|
260
|
+
Stdin mode is also supported for programmatic use:
|
|
261
|
+
|
|
262
|
+
```bash
|
|
263
|
+
echo '[["open","https://example.com"],["screenshot"]]' | agent-browser batch --json
|
|
232
264
|
agent-browser batch --bail < commands.json
|
|
233
265
|
```
|
|
234
266
|
|
|
235
|
-
|
|
267
|
+
## Efficiency Strategies
|
|
268
|
+
|
|
269
|
+
These patterns minimize tool calls and token usage.
|
|
270
|
+
|
|
271
|
+
**Use `--urls` to avoid re-navigation.** When you need to visit links from a page, use `snapshot -i --urls` to get all href URLs upfront. Then `open` each URL directly instead of clicking refs and navigating back.
|
|
272
|
+
|
|
273
|
+
**Snapshot once, act many times.** Never re-snapshot the same page. Extract all needed info (refs, URLs, text) from a single snapshot, then batch the remaining actions.
|
|
274
|
+
|
|
275
|
+
**Multi-page workflow (e.g. "visit N sites and screenshot each"):**
|
|
276
|
+
|
|
277
|
+
```bash
|
|
278
|
+
# 1. Get all URLs in one call
|
|
279
|
+
agent-browser batch "open https://news.ycombinator.com" "snapshot -i --urls"
|
|
280
|
+
# Read output to extract URLs, then visit each directly:
|
|
281
|
+
# 2. One batch per target site
|
|
282
|
+
agent-browser batch "open https://github.com/example/repo" "screenshot"
|
|
283
|
+
agent-browser batch "open https://example.com/article" "screenshot"
|
|
284
|
+
agent-browser batch "open https://other.com/page" "screenshot"
|
|
285
|
+
```
|
|
286
|
+
|
|
287
|
+
This approach uses 4 tool calls instead of 14+. Never go back to the listing page between visits.
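When the URL list is already known, the per-site batches can be generated in a loop. This is a sketch under stated assumptions: `screenshot_each` and the `DRY_RUN` switch are hypothetical names introduced here, and the URLs are placeholders for values read from `snapshot -i --urls` output.

```bash
# Sketch: one batch call per target URL, never returning to the listing page.
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

screenshot_each() {
  for url in "$@"; do
    if [ "$DRY_RUN" = "1" ]; then
      printf 'agent-browser batch "open %s" "screenshot"\n' "$url"
    else
      agent-browser batch "open $url" "screenshot"
    fi
  done
}

# Placeholder URLs; in practice these come from the --urls snapshot.
screenshot_each "https://example.com/article" "https://other.com/page"
```

Set `DRY_RUN=0` to execute the batches instead of printing them.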
|
|
236
288
|
|
|
237
289
|
## Common Patterns
|
|
238
290
|
|
|
239
291
|
### Form Submission
|
|
240
292
|
|
|
241
293
|
```bash
|
|
242
|
-
|
|
243
|
-
agent-browser snapshot -i
|
|
244
|
-
|
|
245
|
-
agent-browser fill @e2 "jane@example.com"
|
|
246
|
-
agent-browser select @e3 "California"
|
|
247
|
-
agent-browser check @e4
|
|
248
|
-
agent-browser click @e5
|
|
249
|
-
agent-browser wait --load networkidle
|
|
294
|
+
# Navigate and get the form structure
|
|
295
|
+
agent-browser batch "open https://example.com/signup" "snapshot -i"
|
|
296
|
+
# Read the snapshot output to identify form refs, then fill and submit
|
|
297
|
+
agent-browser batch "fill @e1 \"Jane Doe\"" "fill @e2 \"jane@example.com\"" "select @e3 \"California\"" "check @e4" "click @e5" "wait 2000"
|
|
250
298
|
```
|
|
251
299
|
|
|
252
300
|
### Authentication with Auth Vault (Recommended)
|
|
@@ -271,17 +319,12 @@ agent-browser auth delete github
|
|
|
271
319
|
|
|
272
320
|
```bash
|
|
273
321
|
# Login once and save state
|
|
274
|
-
agent-browser open https://app.example.com/login
|
|
275
|
-
|
|
276
|
-
agent-browser fill @e1 "$USERNAME"
|
|
277
|
-
agent-browser fill @e2 "$PASSWORD"
|
|
278
|
-
agent-browser click @e3
|
|
279
|
-
agent-browser wait --url "**/dashboard"
|
|
280
|
-
agent-browser state save auth.json
|
|
322
|
+
agent-browser batch "open https://app.example.com/login" "snapshot -i"
|
|
323
|
+
# Read snapshot to find form refs, then fill and submit
|
|
324
|
+
agent-browser batch "fill @e1 \"$USERNAME\"" "fill @e2 \"$PASSWORD\"" "click @e3" "wait --url **/dashboard" "state save auth.json"
|
|
281
325
|
|
|
282
326
|
# Reuse in future sessions
|
|
283
|
-
agent-browser state load auth.json
|
|
284
|
-
agent-browser open https://app.example.com/dashboard
|
|
327
|
+
agent-browser batch "state load auth.json" "open https://app.example.com/dashboard"
|
|
285
328
|
```
|
|
286
329
|
|
|
287
330
|
### Session Persistence
|
|
@@ -311,8 +354,7 @@ agent-browser state clean --older-than 7
|
|
|
311
354
|
Iframe content is automatically inlined in snapshots. Refs inside iframes carry frame context, so you can interact with them directly.
|
|
312
355
|
|
|
313
356
|
```bash
|
|
314
|
-
agent-browser open https://example.com/checkout
|
|
315
|
-
agent-browser snapshot -i
|
|
357
|
+
agent-browser batch "open https://example.com/checkout" "snapshot -i"
|
|
316
358
|
# @e1 [heading] "Checkout"
|
|
317
359
|
# @e2 [Iframe] "payment-frame"
|
|
318
360
|
# @e3 [input] "Card number"
|
|
@@ -320,23 +362,19 @@ agent-browser snapshot -i
|
|
|
320
362
|
# @e5 [button] "Pay"
|
|
321
363
|
|
|
322
364
|
# Interact directly — no frame switch needed
|
|
323
|
-
agent-browser fill @e3 "4111111111111111"
|
|
324
|
-
agent-browser fill @e4 "12/28"
|
|
325
|
-
agent-browser click @e5
|
|
365
|
+
agent-browser batch "fill @e3 \"4111111111111111\"" "fill @e4 \"12/28\"" "click @e5"
|
|
326
366
|
|
|
327
367
|
# To scope a snapshot to one iframe:
|
|
328
|
-
agent-browser frame @e2
|
|
329
|
-
agent-browser snapshot -i # Only iframe content
|
|
368
|
+
agent-browser batch "frame @e2" "snapshot -i"
|
|
330
369
|
agent-browser frame main # Return to main frame
|
|
331
370
|
```
|
|
332
371
|
|
|
333
372
|
### Data Extraction
|
|
334
373
|
|
|
335
374
|
```bash
|
|
336
|
-
agent-browser open https://example.com/products
|
|
337
|
-
|
|
375
|
+
agent-browser batch "open https://example.com/products" "snapshot -i"
|
|
376
|
+
# Read snapshot to find element refs, then extract
|
|
338
377
|
agent-browser get text @e5 # Get specific element text
|
|
339
|
-
agent-browser get text body > page.txt # Get all page text
|
|
340
378
|
|
|
341
379
|
# JSON output for parsing
|
|
342
380
|
agent-browser snapshot -i --json
|
|
@@ -530,27 +568,29 @@ agent-browser diff url https://staging.example.com https://prod.example.com --sc
|
|
|
530
568
|
|
|
531
569
|
## Timeouts and Slow Pages
|
|
532
570
|
|
|
533
|
-
The default timeout is 25 seconds. This can be overridden with the `AGENT_BROWSER_DEFAULT_TIMEOUT` environment variable (value in milliseconds).
|
|
571
|
+
The default timeout is 25 seconds. This can be overridden with the `AGENT_BROWSER_DEFAULT_TIMEOUT` environment variable (value in milliseconds).
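Because the variable takes milliseconds, converting from seconds explicitly helps avoid an off-by-1000 mistake. A minimal sketch, where the 60-second value is only an example:

```bash
# Raise the default timeout from 25 seconds to 60 seconds.
# AGENT_BROWSER_DEFAULT_TIMEOUT expects milliseconds, so convert from seconds.
timeout_seconds=60
export AGENT_BROWSER_DEFAULT_TIMEOUT=$((timeout_seconds * 1000))
echo "$AGENT_BROWSER_DEFAULT_TIMEOUT"   # prints 60000
```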
|
|
534
572
|
|
|
535
|
-
|
|
536
|
-
# Wait for network activity to settle (best for slow pages)
|
|
537
|
-
agent-browser wait --load networkidle
|
|
573
|
+
**Important:** `open` already waits for the page `load` event before returning. In most cases, no additional wait is needed before taking a snapshot or screenshot. Only add an explicit wait when content loads asynchronously after the initial page load.
|
|
538
574
|
|
|
539
|
-
|
|
575
|
+
```bash
|
|
576
|
+
# Wait for a specific element to appear (preferred for dynamic content)
|
|
540
577
|
agent-browser wait "#content"
|
|
541
578
|
agent-browser wait @e1
|
|
542
579
|
|
|
580
|
+
# Wait a fixed duration (good default for slow SPAs)
|
|
581
|
+
agent-browser wait 2000
|
|
582
|
+
|
|
543
583
|
# Wait for a specific URL pattern (useful after redirects)
|
|
544
584
|
agent-browser wait --url "**/dashboard"
|
|
545
585
|
|
|
546
|
-
# Wait for
|
|
547
|
-
agent-browser wait --
|
|
586
|
+
# Wait for text to appear on the page
|
|
587
|
+
agent-browser wait --text "Results loaded"
|
|
548
588
|
|
|
549
|
-
# Wait
|
|
550
|
-
agent-browser wait
|
|
589
|
+
# Wait for a JavaScript condition
|
|
590
|
+
agent-browser wait --fn "document.querySelectorAll('.item').length > 0"
|
|
551
591
|
```
|
|
552
592
|
|
|
553
|
-
|
|
593
|
+
**Avoid `wait --load networkidle`** unless you are certain the site has no persistent network activity. Ad-heavy sites, sites with analytics/tracking, and sites with websockets will cause `networkidle` to hang indefinitely. Prefer `wait 2000` or `wait <selector>` instead.
|
|
554
594
|
|
|
555
595
|
## JavaScript Dialogs (alert / confirm / prompt)
|
|
556
596
|
|
|
@@ -764,6 +804,18 @@ agent-browser dashboard stop
|
|
|
764
804
|
|
|
765
805
|
The dashboard runs independently of browser sessions on port 4848 (configurable with `--port`). All sessions automatically stream to the dashboard. Sessions can also be created from the dashboard UI with local engines or cloud providers.
|
|
766
806
|
|
|
807
|
+
### Dashboard AI Chat
|
|
808
|
+
|
|
809
|
+
The dashboard has an optional AI chat tab powered by the Vercel AI Gateway. Enable it by setting:
|
|
810
|
+
|
|
811
|
+
```bash
|
|
812
|
+
export AI_GATEWAY_API_KEY=gw_your_key_here
|
|
813
|
+
export AI_GATEWAY_MODEL=anthropic/claude-sonnet-4.6 # optional default
|
|
814
|
+
export AI_GATEWAY_URL=https://ai-gateway.vercel.sh # optional default
|
|
815
|
+
```
|
|
816
|
+
|
|
817
|
+
The Chat tab is always visible in the dashboard. Set `AI_GATEWAY_API_KEY` to enable AI responses.
|
|
818
|
+
|
|
767
819
|
## Ready-to-Use Templates
|
|
768
820
|
|
|
769
821
|
| Template | Description |
|