agent-browser 0.22.2 → 0.23.0
- package/README.md +46 -3
- package/bin/agent-browser-darwin-arm64 +0 -0
- package/bin/agent-browser-darwin-x64 +0 -0
- package/bin/agent-browser-linux-arm64 +0 -0
- package/bin/agent-browser-linux-musl-arm64 +0 -0
- package/bin/agent-browser-linux-musl-x64 +0 -0
- package/bin/agent-browser-linux-x64 +0 -0
- package/bin/agent-browser-win32-x64.exe +0 -0
- package/package.json +3 -2
- package/skills/agent-browser/SKILL.md +33 -1
- package/skills/agent-browser/references/commands.md +1 -1
package/README.md
CHANGED
|
@@ -125,7 +125,11 @@ agent-browser pdf <path> # Save as PDF
|
|
|
125
125
|
agent-browser snapshot # Accessibility tree with refs (best for AI)
|
|
126
126
|
agent-browser eval <js> # Run JavaScript (-b for base64, --stdin for piped input)
|
|
127
127
|
agent-browser connect <port> # Connect to browser via CDP
|
|
128
|
+
agent-browser stream enable [--port <port>] # Start runtime WebSocket streaming
|
|
129
|
+
agent-browser stream status # Show runtime streaming state and bound port
|
|
130
|
+
agent-browser stream disable # Stop runtime WebSocket streaming
|
|
128
131
|
agent-browser close # Close browser (aliases: quit, exit)
|
|
132
|
+
agent-browser close --all # Close all active sessions
|
|
129
133
|
```
|
|
130
134
|
|
|
131
135
|
### Get Info
|
|
@@ -593,6 +597,32 @@ This is useful for multimodal AI models that can reason about visual layout, unl
|
|
|
593
597
|
| `--config <path>` | Use a custom config file (or `AGENT_BROWSER_CONFIG` env) |
|
|
594
598
|
| `--debug` | Debug output |
|
|
595
599
|
|
|
600
|
+
## Observability Dashboard
|
|
601
|
+
|
|
602
|
+
Monitor agent-browser sessions in real time with a local web dashboard showing a live viewport and command activity feed.
|
|
603
|
+
|
|
604
|
+
```bash
|
|
605
|
+
# Install the dashboard (one time)
|
|
606
|
+
agent-browser dashboard install
|
|
607
|
+
|
|
608
|
+
# Start the dashboard server (runs in background on port 4848)
|
|
609
|
+
agent-browser dashboard start
|
|
610
|
+
agent-browser dashboard start --port 8080 # Custom port
|
|
611
|
+
|
|
612
|
+
# All sessions are automatically visible in the dashboard
|
|
613
|
+
agent-browser open example.com
|
|
614
|
+
|
|
615
|
+
# Stop the dashboard
|
|
616
|
+
agent-browser dashboard stop
|
|
617
|
+
```
|
|
618
|
+
|
|
619
|
+
The dashboard runs as a standalone background process on port 4848, independent of browser sessions. It stays available even when no sessions are running. All sessions automatically stream to the dashboard.
|
|
620
|
+
|
|
621
|
+
The dashboard displays:
|
|
622
|
+
- **Live viewport** -- real-time JPEG frames from the browser
|
|
623
|
+
- **Activity feed** -- chronological command/result stream with timing and expandable details
|
|
624
|
+
- **Console output** -- browser console messages (log, warn, error)
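The lifecycle above (start the dashboard, run sessions, stop it) could be wrapped in a small shell helper. This is a minimal sketch, not part of agent-browser: the function name `with_dashboard` and the `AB` override variable are hypothetical, and it assumes `agent-browser dashboard install` has already been run once.

```bash
# Hypothetical wrapper, not shipped with agent-browser.
# AB names the CLI binary; override it (e.g. AB=echo) for a dry run.
AB="${AB:-agent-browser}"

# Usage: with_dashboard <port> <command> [args...]
# Starts the dashboard on <port>, runs the command, then stops the
# dashboard and returns the command's exit status.
with_dashboard() {
  if [ "$#" -lt 2 ]; then
    echo "usage: with_dashboard <port> <command> [args...]" >&2
    return 2
  fi
  port="$1"
  shift
  "$AB" dashboard start --port "$port" || return 1
  rc=0
  "$@" || rc=$?
  "$AB" dashboard stop
  return "$rc"
}
```

For example, `with_dashboard 8080 agent-browser open example.com` would start the dashboard on port 8080, open the page, and stop the dashboard afterwards.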
|
|
625
|
+
|
|
596
626
|
## Configuration
|
|
597
627
|
|
|
598
628
|
Create an `agent-browser.json` file to set persistent defaults instead of repeating flags on every command.
|
|
@@ -923,15 +953,28 @@ This is useful when:
|
|
|
923
953
|
|
|
924
954
|
Stream the browser viewport via WebSocket for live preview or "pair browsing" where a human can watch and interact alongside an AI agent.
|
|
925
955
|
|
|
926
|
-
###
|
|
956
|
+
### Streaming
|
|
957
|
+
|
|
958
|
+
Every session automatically starts a WebSocket stream server on an OS-assigned port. Use `stream status` to see the bound port and connection state:
|
|
959
|
+
|
|
960
|
+
```bash
|
|
961
|
+
agent-browser stream status
|
|
962
|
+
```
|
|
927
963
|
|
|
928
|
-
|
|
964
|
+
To bind to a specific port, set `AGENT_BROWSER_STREAM_PORT`:
|
|
929
965
|
|
|
930
966
|
```bash
|
|
931
967
|
AGENT_BROWSER_STREAM_PORT=9223 agent-browser open example.com
|
|
932
968
|
```
|
|
933
969
|
|
|
934
|
-
|
|
970
|
+
You can also manage streaming at runtime with `stream enable`, `stream disable`, and `stream status`:
|
|
971
|
+
|
|
972
|
+
```bash
|
|
973
|
+
agent-browser stream enable --port 9223 # Re-enable on a specific port
|
|
974
|
+
agent-browser stream disable # Stop streaming for the session
|
|
975
|
+
```
|
|
976
|
+
|
|
977
|
+
The WebSocket server streams the browser viewport and accepts input events.
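As an illustration of the runtime commands, the sketch below validates a port before passing it to `stream enable --port` and then prints the stream state. The function name `stream_on_port` and the `AB` override variable are assumptions made for this example; only the `stream enable --port <port>` and `stream status` invocations come from the README.

```bash
# Hypothetical wrapper, not shipped with agent-browser.
# AB names the CLI binary; override it (e.g. AB=echo) for a dry run.
AB="${AB:-agent-browser}"

# Usage: stream_on_port <port>
# Enables streaming on a fixed port, then prints the stream state.
stream_on_port() {
  port="$1"
  # Accept only decimal digits so a typo never reaches the CLI.
  case "$port" in
    ''|*[!0-9]*)
      echo "invalid port: $port" >&2
      return 2
      ;;
  esac
  # TCP ports range from 1 to 65535.
  if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
    echo "port out of range: $port" >&2
    return 2
  fi
  "$AB" stream enable --port "$port" && "$AB" stream status
}
```

Calling `stream_on_port 9223` would run `agent-browser stream enable --port 9223` followed by `agent-browser stream status`.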
|
|
935
978
|
|
|
936
979
|
### WebSocket Protocol
|
|
937
980
|
|
|
Binary file (7 platform executables under package/bin/, no textual diff)
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "agent-browser",
|
|
3
|
-
"version": "0.
|
|
3
|
+
"version": "0.23.0",
|
|
4
4
|
"description": "Headless browser automation CLI for AI agents",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"files": [
|
|
@@ -45,6 +45,7 @@
|
|
|
45
45
|
"postinstall": "node scripts/postinstall.js",
|
|
46
46
|
"changeset": "changeset",
|
|
47
47
|
"ci:version": "changeset version && pnpm run version:sync && pnpm install --no-frozen-lockfile",
|
|
48
|
-
"ci:publish": "pnpm run version:sync && changeset publish"
|
|
48
|
+
"ci:publish": "pnpm run version:sync && changeset publish",
|
|
49
|
+
"build:dashboard": "cd packages/dashboard && pnpm build"
|
|
49
50
|
}
|
|
50
51
|
}
|
package/skills/agent-browser/SKILL.md
CHANGED
|
@@ -110,6 +110,7 @@ See [references/authentication.md](references/authentication.md) for OAuth, 2FA,
|
|
|
110
110
|
# Navigation
|
|
111
111
|
agent-browser open <url> # Navigate (aliases: goto, navigate)
|
|
112
112
|
agent-browser close # Close browser
|
|
113
|
+
agent-browser close --all # Close all active sessions
|
|
113
114
|
|
|
114
115
|
# Snapshot
|
|
115
116
|
agent-browser snapshot -i # Interactive elements with refs (recommended)
|
|
@@ -171,6 +172,12 @@ agent-browser screenshot --screenshot-dir ./shots # Save to custom directory
|
|
|
171
172
|
agent-browser screenshot --screenshot-format jpeg --screenshot-quality 80
|
|
172
173
|
agent-browser pdf output.pdf # Save as PDF
|
|
173
174
|
|
|
175
|
+
# Live preview / streaming
|
|
176
|
+
agent-browser stream enable # Start runtime WebSocket streaming on an auto-selected port
|
|
177
|
+
agent-browser stream enable --port 9223 # Bind a specific localhost port
|
|
178
|
+
agent-browser stream status # Inspect enabled state, port, connection, and screencasting
|
|
179
|
+
agent-browser stream disable # Stop runtime streaming and remove the .stream metadata file
|
|
180
|
+
|
|
174
181
|
# Clipboard
|
|
175
182
|
agent-browser clipboard read # Read text from clipboard
|
|
176
183
|
agent-browser clipboard write "Hello, World!" # Write text to clipboard
|
|
@@ -192,6 +199,10 @@ agent-browser diff url <url1> <url2> --wait-until networkidle # Custom wait str
|
|
|
192
199
|
agent-browser diff url <url1> <url2> --selector "#main" # Scope to element
|
|
193
200
|
```
|
|
194
201
|
|
|
202
|
+
## Streaming
|
|
203
|
+
|
|
204
|
+
Every session automatically starts a WebSocket stream server on an OS-assigned port. Use `agent-browser stream status` to see the bound port and connection state. Use `stream disable` to tear it down, and `stream enable --port <port>` to re-enable on a specific port.
|
|
205
|
+
|
|
195
206
|
## Batch Execution
|
|
196
207
|
|
|
197
208
|
Execute multiple commands in a single invocation by piping a JSON array of string arrays to `batch`. This avoids per-command process startup overhead when running multi-step workflows.
|
|
@@ -566,9 +577,10 @@ Always close your browser session when done to avoid leaked processes:
|
|
|
566
577
|
```bash
|
|
567
578
|
agent-browser close # Close default session
|
|
568
579
|
agent-browser --session agent1 close # Close specific session
|
|
580
|
+
agent-browser close --all # Close all active sessions
|
|
569
581
|
```
|
|
570
582
|
|
|
571
|
-
If a previous session was not closed properly, the daemon may still be running. Use `agent-browser close` to clean it up
|
|
583
|
+
If a previous session was not closed properly, the daemon may still be running. Use `agent-browser close` to clean it up, or `agent-browser close --all` to shut down every session at once.
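In scripts, one way to apply this advice is to run the work and then always call `close --all`, preserving the work's exit status. The helper below is a minimal sketch: `run_with_cleanup` and the `AB` override variable are hypothetical names, and only `agent-browser close --all` is taken from the text above.

```bash
# Hypothetical wrapper, not shipped with agent-browser.
# AB names the CLI binary; override it (e.g. AB=echo) for a dry run.
AB="${AB:-agent-browser}"

# Usage: run_with_cleanup <command> [args...]
# Runs the command, then closes every active session even if the
# command failed, and returns the command's exit status.
run_with_cleanup() {
  rc=0
  "$@" || rc=$?
  "$AB" close --all
  return "$rc"
}
```

For example, `run_with_cleanup ./my-workflow.sh` (a placeholder script name) would shut down all sessions once the workflow exits, whether it succeeded or not.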
|
|
572
584
|
|
|
573
585
|
To auto-shutdown the daemon after a period of inactivity (useful for ephemeral/CI environments):
|
|
574
586
|
|
|
@@ -700,6 +712,26 @@ Supported engines:
|
|
|
700
712
|
|
|
701
713
|
Lightpanda does not support `--extension`, `--profile`, `--state`, or `--allow-file-access`. Install Lightpanda from https://lightpanda.io/docs/open-source/installation.
|
|
702
714
|
|
|
715
|
+
## Observability Dashboard
|
|
716
|
+
|
|
717
|
+
The dashboard is a standalone background server that shows live browser viewports, command activity, and console output for all sessions.
|
|
718
|
+
|
|
719
|
+
```bash
|
|
720
|
+
# Install the dashboard once
|
|
721
|
+
agent-browser dashboard install
|
|
722
|
+
|
|
723
|
+
# Start the dashboard server (background, port 4848)
|
|
724
|
+
agent-browser dashboard start
|
|
725
|
+
|
|
726
|
+
# All sessions are automatically visible in the dashboard
|
|
727
|
+
agent-browser open example.com
|
|
728
|
+
|
|
729
|
+
# Stop the dashboard
|
|
730
|
+
agent-browser dashboard stop
|
|
731
|
+
```
|
|
732
|
+
|
|
733
|
+
The dashboard runs independently of browser sessions on port 4848 (configurable with `--port`). All sessions automatically stream to the dashboard.
|
|
734
|
+
|
|
703
735
|
## Ready-to-Use Templates
|
|
704
736
|
|
|
705
737
|
| Template | Description |
|
package/skills/agent-browser/references/commands.md
CHANGED
|
@@ -287,6 +287,6 @@ AGENT_BROWSER_SESSION="mysession" # Default session name
|
|
|
287
287
|
AGENT_BROWSER_EXECUTABLE_PATH="/path/chrome" # Custom browser path
|
|
288
288
|
AGENT_BROWSER_EXTENSIONS="/ext1,/ext2" # Comma-separated extension paths
|
|
289
289
|
AGENT_BROWSER_PROVIDER="browserbase" # Cloud browser provider
|
|
290
|
-
AGENT_BROWSER_STREAM_PORT="9223" # WebSocket streaming port
|
|
290
|
+
AGENT_BROWSER_STREAM_PORT="9223" # Override WebSocket streaming port (default: OS-assigned)
|
|
291
291
|
AGENT_BROWSER_HOME="/path/to/agent-browser" # Custom install location
|
|
292
292
|
```
|