agent-browser 0.22.2 → 0.23.0

package/README.md CHANGED
@@ -125,7 +125,11 @@ agent-browser pdf <path> # Save as PDF
  agent-browser snapshot # Accessibility tree with refs (best for AI)
  agent-browser eval <js> # Run JavaScript (-b for base64, --stdin for piped input)
  agent-browser connect <port> # Connect to browser via CDP
+ agent-browser stream enable [--port <port>] # Start runtime WebSocket streaming
+ agent-browser stream status # Show runtime streaming state and bound port
+ agent-browser stream disable # Stop runtime WebSocket streaming
  agent-browser close # Close browser (aliases: quit, exit)
+ agent-browser close --all # Close all active sessions
  ```
 
  ### Get Info
@@ -593,6 +597,32 @@ This is useful for multimodal AI models that can reason about visual layout, unl
  | `--config <path>` | Use a custom config file (or `AGENT_BROWSER_CONFIG` env) |
  | `--debug` | Debug output |
 
+ ## Observability Dashboard
+
+ Monitor agent-browser sessions in real time with a local web dashboard showing a live viewport and command activity feed.
+
+ ```bash
+ # Install the dashboard (one time)
+ agent-browser dashboard install
+
+ # Start the dashboard server (runs in background on port 4848)
+ agent-browser dashboard start
+ agent-browser dashboard start --port 8080 # Custom port
+
+ # All sessions are automatically visible in the dashboard
+ agent-browser open example.com
+
+ # Stop the dashboard
+ agent-browser dashboard stop
+ ```
+
+ The dashboard runs as a standalone background process on port 4848, independent of browser sessions. It stays available even when no sessions are running. All sessions automatically stream to the dashboard.
+
+ The dashboard displays:
+ - **Live viewport** -- real-time JPEG frames from the browser
+ - **Activity feed** -- chronological command/result stream with timing and expandable details
+ - **Console output** -- browser console messages (log, warn, error)
+
  ## Configuration
 
  Create an `agent-browser.json` file to set persistent defaults instead of repeating flags on every command.
@@ -923,15 +953,28 @@ This is useful when:
 
  Stream the browser viewport via WebSocket for live preview or "pair browsing" where a human can watch and interact alongside an AI agent.
 
- ### Enable Streaming
+ ### Streaming
+
+ Every session automatically starts a WebSocket stream server on an OS-assigned port. Use `stream status` to see the bound port and connection state:
+
+ ```bash
+ agent-browser stream status
+ ```
 
- Set the `AGENT_BROWSER_STREAM_PORT` environment variable:
+ To bind to a specific port, set `AGENT_BROWSER_STREAM_PORT`:
 
  ```bash
  AGENT_BROWSER_STREAM_PORT=9223 agent-browser open example.com
  ```
 
- This starts a WebSocket server on the specified port that streams the browser viewport and accepts input events.
+ You can also manage streaming at runtime with `stream enable`, `stream disable`, and `stream status`:
+
+ ```bash
+ agent-browser stream enable --port 9223 # Re-enable on a specific port
+ agent-browser stream disable # Stop streaming for the session
+ ```
+
+ The WebSocket server streams the browser viewport and accepts input events.
 
  ### WebSocket Protocol
 
Binary file (7 files)
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "agent-browser",
- "version": "0.22.2",
+ "version": "0.23.0",
  "description": "Headless browser automation CLI for AI agents",
  "type": "module",
  "files": [
@@ -45,6 +45,7 @@
  "postinstall": "node scripts/postinstall.js",
  "changeset": "changeset",
  "ci:version": "changeset version && pnpm run version:sync && pnpm install --no-frozen-lockfile",
- "ci:publish": "pnpm run version:sync && changeset publish"
+ "ci:publish": "pnpm run version:sync && changeset publish",
+ "build:dashboard": "cd packages/dashboard && pnpm build"
  }
  }
@@ -110,6 +110,7 @@ See [references/authentication.md](references/authentication.md) for OAuth, 2FA,
  # Navigation
  agent-browser open <url> # Navigate (aliases: goto, navigate)
  agent-browser close # Close browser
+ agent-browser close --all # Close all active sessions
 
  # Snapshot
  agent-browser snapshot -i # Interactive elements with refs (recommended)
@@ -171,6 +172,12 @@ agent-browser screenshot --screenshot-dir ./shots # Save to custom directory
  agent-browser screenshot --screenshot-format jpeg --screenshot-quality 80
  agent-browser pdf output.pdf # Save as PDF
 
+ # Live preview / streaming
+ agent-browser stream enable # Start runtime WebSocket streaming on an auto-selected port
+ agent-browser stream enable --port 9223 # Bind a specific localhost port
+ agent-browser stream status # Inspect enabled state, port, connection, and screencasting
+ agent-browser stream disable # Stop runtime streaming and remove the .stream metadata file
+
  # Clipboard
  agent-browser clipboard read # Read text from clipboard
  agent-browser clipboard write "Hello, World!" # Write text to clipboard
@@ -192,6 +199,10 @@ agent-browser diff url <url1> <url2> --wait-until networkidle # Custom wait str
  agent-browser diff url <url1> <url2> --selector "#main" # Scope to element
  ```
 
+ ## Streaming
+
+ Every session automatically starts a WebSocket stream server on an OS-assigned port. Use `agent-browser stream status` to see the bound port and connection state. Use `stream disable` to tear it down, and `stream enable --port <port>` to re-enable on a specific port.
+
  ## Batch Execution
 
  Execute multiple commands in a single invocation by piping a JSON array of string arrays to `batch`. This avoids per-command process startup overhead when running multi-step workflows.
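
Reviewer note: the batch input described in the context line above (a JSON array of string arrays) can be generated rather than written by hand. The Python sketch below is an illustration only. `build_batch_payload` is a hypothetical helper that is not part of agent-browser, and it assumes each inner array holds the arguments that would normally follow `agent-browser` on the command line, which this diff does not confirm.

```python
import json


def build_batch_payload(commands):
    """Serialize argv-style commands as a JSON array of string arrays.

    Hypothetical helper for illustration; not part of agent-browser.
    """
    payload = []
    for argv in commands:
        # Each command must be a non-empty list or tuple of strings,
        # e.g. ["open", "example.com"].
        if not isinstance(argv, (list, tuple)) or not argv:
            raise ValueError(f"invalid command: {argv!r}")
        if not all(isinstance(arg, str) for arg in argv):
            raise ValueError(f"non-string argument in: {argv!r}")
        payload.append(list(argv))
    return json.dumps(payload)


if __name__ == "__main__":
    # The printed text is what would be piped to `batch` on stdin.
    print(build_batch_payload([["open", "example.com"], ["snapshot", "-i"], ["close"]]))
```

Validating the shape before serializing keeps a malformed command from reaching the CLI, where the error would be harder to trace back to its source.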
@@ -566,9 +577,10 @@ Always close your browser session when done to avoid leaked processes:
  ```bash
  agent-browser close # Close default session
  agent-browser --session agent1 close # Close specific session
+ agent-browser close --all # Close all active sessions
  ```
 
- If a previous session was not closed properly, the daemon may still be running. Use `agent-browser close` to clean it up before starting new work.
+ If a previous session was not closed properly, the daemon may still be running. Use `agent-browser close` to clean it up, or `agent-browser close --all` to shut down every session at once.
 
  To auto-shutdown the daemon after a period of inactivity (useful for ephemeral/CI environments):
 
@@ -700,6 +712,26 @@ Supported engines:
 
  Lightpanda does not support `--extension`, `--profile`, `--state`, or `--allow-file-access`. Install Lightpanda from https://lightpanda.io/docs/open-source/installation.
 
+ ## Observability Dashboard
+
+ The dashboard is a standalone background server that shows live browser viewports, command activity, and console output for all sessions.
+
+ ```bash
+ # Install the dashboard once
+ agent-browser dashboard install
+
+ # Start the dashboard server (background, port 4848)
+ agent-browser dashboard start
+
+ # All sessions are automatically visible in the dashboard
+ agent-browser open example.com
+
+ # Stop the dashboard
+ agent-browser dashboard stop
+ ```
+
+ The dashboard runs independently of browser sessions on port 4848 (configurable with `--port`). All sessions automatically stream to the dashboard.
+
  ## Ready-to-Use Templates
 
  | Template | Description |
@@ -287,6 +287,6 @@ AGENT_BROWSER_SESSION="mysession" # Default session name
  AGENT_BROWSER_EXECUTABLE_PATH="/path/chrome" # Custom browser path
  AGENT_BROWSER_EXTENSIONS="/ext1,/ext2" # Comma-separated extension paths
  AGENT_BROWSER_PROVIDER="browserbase" # Cloud browser provider
- AGENT_BROWSER_STREAM_PORT="9223" # WebSocket streaming port
+ AGENT_BROWSER_STREAM_PORT="9223" # Override WebSocket streaming port (default: OS-assigned)
  AGENT_BROWSER_HOME="/path/to/agent-browser" # Custom install location
  ```
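
Reviewer note: "OS-assigned" conventionally means binding to port 0 and letting the kernel pick a free port. The Python sketch below shows that general pattern combined with an `AGENT_BROWSER_STREAM_PORT` override. It is not agent-browser's actual implementation, which this diff does not show, and `pick_stream_port` is a hypothetical name.

```python
import os
import socket


def pick_stream_port(env=None):
    """Return the port from AGENT_BROWSER_STREAM_PORT, or an OS-assigned one.

    Hypothetical helper illustrating the general pattern only.
    """
    env = os.environ if env is None else env
    raw = env.get("AGENT_BROWSER_STREAM_PORT")
    if raw:
        port = int(raw)  # raises ValueError on non-numeric input
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")
        return port
    # Binding to port 0 asks the OS for any currently free port.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


if __name__ == "__main__":
    print(pick_stream_port())
```

The sketch closes its probe socket before returning, so another process could claim the port in the meantime. A real server would keep the socket it bound and report that port, which is presumably why `stream status` exists to show the bound port.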