npm - agent-browser - Versions diffs - 0.17.1 → 0.19.0 - Mend

agent-browser 0.17.1 → 0.19.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (36) hide show

package/README.md +194 -62
package/bin/agent-browser-darwin-arm64 +0 -0
package/bin/agent-browser-darwin-x64 +0 -0
package/bin/agent-browser-linux-arm64 +0 -0
package/bin/agent-browser-linux-x64 +0 -0
package/bin/agent-browser-win32-x64.exe +0 -0
package/dist/actions.d.ts.map +1 -1
package/dist/actions.js +108 -45
package/dist/actions.js.map +1 -1
package/dist/browser.d.ts +50 -4
package/dist/browser.d.ts.map +1 -1
package/dist/browser.js +245 -38
package/dist/browser.js.map +1 -1
package/dist/daemon.d.ts +5 -0
package/dist/daemon.d.ts.map +1 -1
package/dist/daemon.js +62 -3
package/dist/daemon.js.map +1 -1
package/dist/index.d.ts +7 -0
package/dist/index.d.ts.map +1 -0
package/dist/index.js +5 -0
package/dist/index.js.map +1 -0
package/dist/inspect-server.d.ts +26 -0
package/dist/inspect-server.d.ts.map +1 -0
package/dist/inspect-server.js +218 -0
package/dist/inspect-server.js.map +1 -0
package/dist/protocol.d.ts +3 -1
package/dist/protocol.d.ts.map +1 -1
package/dist/protocol.js +7 -3
package/dist/protocol.js.map +1 -1
package/dist/types.d.ts +10 -2
package/dist/types.d.ts.map +1 -1
package/package.json +11 -2
package/skills/agent-browser/SKILL.md +116 -14
package/skills/agent-browser/references/authentication.md +101 -0
package/skills/agent-browser/references/commands.md +2 -0
package/skills/vercel-sandbox/SKILL.md +280 -0

package/README.md CHANGED Viewed

@@ -115,6 +115,8 @@ agent-browser drag <src> <tgt>        # Drag and drop
 agent-browser upload <sel> <files>    # Upload files
 agent-browser screenshot [path]       # Take screenshot (--full for full page, saves to a temporary directory if no path)
 agent-browser screenshot --annotate   # Annotated screenshot with numbered element labels
+agent-browser screenshot --screenshot-dir ./shots    # Save to custom directory
+agent-browser screenshot --screenshot-format jpeg --screenshot-quality 80
 agent-browser pdf <path>              # Save as PDF
 agent-browser snapshot                # Accessibility tree with refs (best for AI)
 agent-browser eval <js>               # Run JavaScript (-b for base64, --stdin for piped input)
@@ -131,6 +133,7 @@ agent-browser get value <sel>         # Get input value
 agent-browser get attr <sel> <attr>   # Get attribute
 agent-browser get title               # Get page title
 agent-browser get url                 # Get current URL
+agent-browser get cdp-url             # Get CDP WebSocket URL (for DevTools, debugging)
 agent-browser get count <sel>         # Count matching elements
 agent-browser get box <sel>           # Get bounding box
 agent-browser get styles <sel>        # Get computed styles
@@ -164,6 +167,7 @@ agent-browser find nth <n> <sel> <action> [value]     # Nth match
 **Options:** `--name <name>` (filter role by accessible name), `--exact` (require exact text match)
 **Examples:**
 ```bash
 agent-browser find role button click --name "Submit"
 agent-browser find text "Sign In" click
@@ -177,14 +181,27 @@ agent-browser find nth 2 "a" text
 ```bash
 agent-browser wait <selector>         # Wait for element to be visible
 agent-browser wait <ms>               # Wait for time (milliseconds)
-agent-browser wait --text "Welcome"   # Wait for text to appear
+agent-browser wait --text "Welcome"   # Wait for text to appear (substring match)
 agent-browser wait --url "**/dash"    # Wait for URL pattern
 agent-browser wait --load networkidle # Wait for load state
 agent-browser wait --fn "window.ready === true"  # Wait for JS condition
+# Wait for text/element to disappear
+agent-browser wait --fn "!document.body.innerText.includes('Loading...')"
+agent-browser wait "#spinner" --state hidden
 ```
 **Load states:** `load`, `domcontentloaded`, `networkidle`
+### Clipboard
+```bash
+agent-browser clipboard read                      # Read text from clipboard
+agent-browser clipboard write "Hello, World!"     # Write text to clipboard
+agent-browser clipboard copy                      # Copy current selection (Ctrl+C)
+agent-browser clipboard paste                     # Paste from clipboard (Ctrl+V)
+```
 ### Mouse Control
 ```bash
@@ -283,6 +300,7 @@ agent-browser console --clear         # Clear console
 agent-browser errors                  # View page errors (uncaught JavaScript exceptions)
 agent-browser errors --clear          # Clear errors
 agent-browser highlight <sel>         # Highlight element
+agent-browser inspect                 # Open Chrome DevTools for the active page
 agent-browser state save <path>       # Save auth state
 agent-browser state load <path>       # Load auth state
 agent-browser state list              # List saved state files
@@ -308,6 +326,47 @@ agent-browser install                 # Download Chromium browser
 agent-browser install --with-deps     # Also install system deps (Linux)
 ```
+## Authentication
+agent-browser provides multiple ways to persist login sessions so you don't re-authenticate every run.
+### Quick summary
+| Approach | Best for | Flag / Env |
+|----------|----------|------------|
+| **Persistent profile** | Full browser state (cookies, IndexedDB, service workers, cache) across restarts | `--profile <path>` / `AGENT_BROWSER_PROFILE` |
+| **Session persistence** | Auto-save/restore cookies + localStorage by name | `--session-name <name>` / `AGENT_BROWSER_SESSION_NAME` |
+| **Import from your browser** | Grab auth from a Chrome session you already logged into | `--auto-connect` + `state save` |
+| **State file** | Load a previously saved state JSON on launch | `--state <path>` / `AGENT_BROWSER_STATE` |
+| **Auth vault** | Store credentials locally (encrypted), login by name | `auth save` / `auth login` |
+### Import auth from your browser
+If you are already logged in to a site in Chrome, you can grab that auth state and reuse it:
+```bash
+# 1. Launch Chrome with remote debugging enabled
+#    macOS:
+"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" --remote-debugging-port=9222
+#    Or use --auto-connect to discover an already-running Chrome
+# 2. Connect and save the authenticated state
+agent-browser --auto-connect state save ./my-auth.json
+# 3. Use the saved auth in future sessions
+agent-browser --state ./my-auth.json open https://app.example.com/dashboard
+# 4. Or use --session-name for automatic persistence
+agent-browser --session-name myapp state load ./my-auth.json
+# From now on, --session-name myapp auto-saves/restores this state
+```
+> **Security notes:**
+> - `--remote-debugging-port` exposes full browser control on localhost. Any local process can connect. Only use on trusted machines and close Chrome when done.
+> - State files contain session tokens in plaintext. Add them to `.gitignore` and delete when no longer needed. For encryption at rest, set `AGENT_BROWSER_ENCRYPTION_KEY` (see [State Encryption](#state-encryption)).
+For full details on login flows, OAuth, 2FA, cookie-based auth, and the auth vault, see the [Authentication](docs/src/app/sessions/page.mdx) docs.
 ## Sessions
 Run multiple isolated browser instances:
@@ -332,6 +391,7 @@ agent-browser session
 ```
 Each session has its own:
 - Browser instance
 - Cookies and storage
 - Navigation history
@@ -353,6 +413,7 @@ AGENT_BROWSER_PROFILE=~/.myapp-profile agent-browser open myapp.com
 ```
 The profile directory stores:
 - Cookies and localStorage
 - IndexedDB data
 - Service workers
@@ -389,10 +450,10 @@ export AGENT_BROWSER_ENCRYPTION_KEY=<64-char-hex-key>
 agent-browser --session-name secure open example.com
 ```
-| Variable | Description |
-|----------|-------------|
-| `AGENT_BROWSER_SESSION_NAME` | Auto-save/load state persistence name |
-| `AGENT_BROWSER_ENCRYPTION_KEY` | 64-char hex key for AES-256-GCM encryption |
+| Variable                          | Description                                        |
+| --------------------------------- | -------------------------------------------------- |
+| `AGENT_BROWSER_SESSION_NAME`      | Auto-save/load state persistence name              |
+| `AGENT_BROWSER_ENCRYPTION_KEY`    | 64-char hex key for AES-256-GCM encryption         |
 | `AGENT_BROWSER_STATE_EXPIRE_DAYS` | Auto-delete states older than N days (default: 30) |
 ## Security
@@ -406,16 +467,16 @@ agent-browser includes security features for safe AI agent deployments. All feat
 - **Action Confirmation** -- Require explicit approval for sensitive action categories: `--confirm-actions eval,download`
 - **Output Length Limits** -- Prevent context flooding: `--max-output 50000`
-| Variable | Description |
-|----------|-------------|
-| `AGENT_BROWSER_CONTENT_BOUNDARIES` | Wrap page output in boundary markers |
-| `AGENT_BROWSER_MAX_OUTPUT` | Max characters for page output |
-| `AGENT_BROWSER_ALLOWED_DOMAINS` | Comma-separated allowed domain patterns |
-| `AGENT_BROWSER_ACTION_POLICY` | Path to action policy JSON file |
-| `AGENT_BROWSER_CONFIRM_ACTIONS` | Action categories requiring confirmation |
-| `AGENT_BROWSER_CONFIRM_INTERACTIVE` | Enable interactive confirmation prompts |
+| Variable                            | Description                              |
+| ----------------------------------- | ---------------------------------------- |
+| `AGENT_BROWSER_CONTENT_BOUNDARIES`  | Wrap page output in boundary markers     |
+| `AGENT_BROWSER_MAX_OUTPUT`          | Max characters for page output           |
+| `AGENT_BROWSER_ALLOWED_DOMAINS`     | Comma-separated allowed domain patterns  |
+| `AGENT_BROWSER_ACTION_POLICY`       | Path to action policy JSON file          |
+| `AGENT_BROWSER_CONFIRM_ACTIONS`     | Action categories requiring confirmation |
+| `AGENT_BROWSER_CONFIRM_INTERACTIVE` | Enable interactive confirmation prompts  |
-See [Security documentation](https://agent-browser.vercel.app/security) for details.
+See [Security documentation](https://agent-browser.dev/security) for details.
 ## Snapshot Options
@@ -431,13 +492,13 @@ agent-browser snapshot -s "#main"         # Scope to CSS selector
 agent-browser snapshot -i -c -d 5         # Combine options
 ```
-| Option | Description |
-|--------|-------------|
-| `-i, --interactive` | Only show interactive elements (buttons, links, inputs) |
-| `-C, --cursor` | Include cursor-interactive elements (cursor:pointer, onclick, tabindex) |
-| `-c, --compact` | Remove empty structural elements |
-| `-d, --depth <n>` | Limit tree depth |
-| `-s, --selector <sel>` | Scope to CSS selector |
+| Option                 | Description                                                             |
+| ---------------------- | ----------------------------------------------------------------------- |
+| `-i, --interactive`    | Only show interactive elements (buttons, links, inputs)                 |
+| `-C, --cursor`         | Include cursor-interactive elements (cursor:pointer, onclick, tabindex) |
+| `-c, --compact`        | Remove empty structural elements                                        |
+| `-d, --depth <n>`      | Limit tree depth                                                        |
+| `-s, --selector <sel>` | Scope to CSS selector                                                   |
 The `-C` flag is useful for modern web apps that use custom clickable elements (divs, spans) instead of standard buttons/links.
@@ -445,6 +506,8 @@ The `-C` flag is useful for modern web apps that use custom clickable elements (
 The `--annotate` flag overlays numbered labels on interactive elements in the screenshot. Each label `[N]` corresponds to ref `@eN`, so the same refs work for both visual and text-based workflows.
+In native mode, annotated screenshots are supported on the CDP-backed browser path (`--native` with Chromium/Lightpanda). The Safari/WebDriver backend does not yet support `--annotate`.
 ```bash
 agent-browser screenshot --annotate
 # -> Screenshot saved to /tmp/screenshot-2026-02-17T12-00-00-abc123.png
@@ -484,6 +547,9 @@ This is useful for multimodal AI models that can reason about visual layout, unl
 | `--json` | JSON output (for agents) |
 | `--full, -f` | Full page screenshot |
 | `--annotate` | Annotated screenshot with numbered element labels (or `AGENT_BROWSER_ANNOTATE` env) |
+| `--screenshot-dir <path>` | Default screenshot output directory (or `AGENT_BROWSER_SCREENSHOT_DIR` env) |
+| `--screenshot-quality <n>` | JPEG quality 0-100 (or `AGENT_BROWSER_SCREENSHOT_QUALITY` env) |
+| `--screenshot-format <fmt>` | Screenshot format: `png`, `jpeg` (or `AGENT_BROWSER_SCREENSHOT_FORMAT` env) |
 | `--headed` | Show browser window (not headless) (or `AGENT_BROWSER_HEADED` env) |
 | `--cdp <port\|url>` | Connect via Chrome DevTools Protocol (port or WebSocket URL) |
 | `--auto-connect` | Auto-discover and connect to running Chrome (or `AGENT_BROWSER_AUTO_CONNECT` env) |
@@ -551,8 +617,8 @@ export AGENT_BROWSER_DEFAULT_TIMEOUT=45000
 > **Note:** Setting this above 30000 (30s) may cause EAGAIN errors on slow operations because the CLI's read timeout will expire before Playwright responds. The CLI retries transient errors automatically, but response times will increase.
-| Variable | Description |
-|----------|-------------|
+| Variable                        | Description                                       |
+| ------------------------------- | ------------------------------------------------- |
 | `AGENT_BROWSER_DEFAULT_TIMEOUT` | Default Playwright timeout in ms (default: 25000) |
 ## Selectors
@@ -578,6 +644,7 @@ agent-browser hover @e4                   # Hover the link
 ```
 **Why use refs?**
 - **Deterministic**: Ref points to exact element from snapshot
 - **Fast**: No DOM re-query needed
 - **AI-friendly**: Snapshot + ref workflow is optimal for LLMs
@@ -678,6 +745,7 @@ agent-browser open other-site.com
 ```
 This is useful for:
 - **Skipping login flows** - Authenticate via headers instead of UI
 - **Switching users** - Start new sessions with different auth tokens
 - **API testing** - Access protected endpoints directly
@@ -699,6 +767,7 @@ agent-browser set headers '{"X-Custom-Header": "value"}'
 ## Custom Browser Executable
 Use a custom browser executable instead of the bundled Chromium. This is useful for:
 - **Serverless deployment**: Use lightweight Chromium builds like `@sparticuz/chromium` (~50MB vs ~684MB)
 - **System browsers**: Use an existing Chrome/Chromium installation
 - **Custom builds**: Use modified browser builds
@@ -713,7 +782,22 @@ agent-browser --executable-path /path/to/chromium open example.com
 AGENT_BROWSER_EXECUTABLE_PATH=/path/to/chromium agent-browser open example.com
 ```
-### Serverless Example (Vercel/AWS Lambda)
+### Serverless (Vercel)
+Run agent-browser + Chrome in an ephemeral Vercel Sandbox microVM. No external server needed:
+```typescript
+import { Sandbox } from "@vercel/sandbox";
+const sandbox = await Sandbox.create({ runtime: "node24" });
+await sandbox.runCommand("agent-browser", ["open", "https://example.com"]);
+const result = await sandbox.runCommand("agent-browser", ["screenshot", "--json"]);
+await sandbox.stop();
+```
+See the [environments example](examples/environments/) for a working demo with a UI and deploy-to-Vercel button.
+### Serverless (AWS Lambda)
 ```typescript
 import chromium from '@sparticuz/chromium';
@@ -744,6 +828,7 @@ agent-browser screenshot report.png
 ```
 The `--allow-file-access` flag adds Chromium flags (`--allow-file-access-from-files`, `--allow-file-access`) that allow `file://` URLs to:
 - Load and render local files
 - Access other local files via JavaScript (XHR, fetch)
 - Load local resources (images, scripts, stylesheets)
@@ -771,10 +856,12 @@ agent-browser --cdp "wss://your-browser-service.com/cdp?token=..." snapshot
 ```
 The `--cdp` flag accepts either:
 - A port number (e.g., `9222`) for local connections via `http://localhost:{port}`
 - A full WebSocket URL (e.g., `wss://...` or `ws://...`) for remote browser services
 This enables control of:
 - Electron apps
 - Chrome/Chromium instances with remote debugging
 - WebView2 applications
@@ -794,10 +881,12 @@ AGENT_BROWSER_AUTO_CONNECT=1 agent-browser snapshot
 ```
 Auto-connect discovers Chrome by:
 1. Reading Chrome's `DevToolsActivePort` file from the default user data directory
 2. Falling back to probing common debugging ports (9222, 9229)
 This is useful when:
 - Chrome 144+ has remote debugging enabled via `chrome://inspect/#remote-debugging` (which uses a dynamic port)
 - You want a zero-configuration connection to your existing browser
 - You don't want to track which port Chrome is using
@@ -821,6 +910,7 @@ This starts a WebSocket server on the specified port that streams the browser vi
 Connect to `ws://localhost:9223` to receive frames and send input:
 **Receive frames:**
 ```json
 {
   "type": "frame",
@@ -837,6 +927,7 @@ Connect to `ws://localhost:9223` to receive frames and send input:
 ```
 **Send mouse events:**
 ```json
 {
   "type": "input_mouse",
@@ -849,6 +940,7 @@ Connect to `ws://localhost:9223` to receive frames and send input:
 ```
 **Send keyboard events:**
 ```json
 {
   "type": "input_keyboard",
@@ -859,6 +951,7 @@ Connect to `ws://localhost:9223` to receive frames and send input:
 ```
 **Send touch events:**
 ```json
 {
   "type": "input_touch",
@@ -879,16 +972,19 @@ await browser.launch({ headless: true });
 await browser.navigate('https://example.com');
 // Start screencast
-await browser.startScreencast((frame) => {
-  // frame.data is base64-encoded image
-  // frame.metadata contains viewport info
-  console.log('Frame received:', frame.metadata.deviceWidth, 'x', frame.metadata.deviceHeight);
-}, {
-  format: 'jpeg',
-  quality: 80,
-  maxWidth: 1280,
-  maxHeight: 720,
-});
+await browser.startScreencast(
+  (frame) => {
+    // frame.data is base64-encoded image
+    // frame.metadata contains viewport info
+    console.log('Frame received:', frame.metadata.deviceWidth, 'x', frame.metadata.deviceHeight);
+  },
+  {
+    format: 'jpeg',
+    quality: 80,
+    maxWidth: 1280,
+    maxHeight: 720,
+  }
+);
 // Inject mouse events
 await browser.injectMouseEvent({
@@ -940,18 +1036,18 @@ agent-browser open example.com
 Or add to your config file (`agent-browser.json`):
 ```json
-{"native": true}
+{ "native": true }
 ```
 ### What's Different
-| | Default (Node.js) | Native (`--native`) |
-|---|---|---|
-| **Runtime** | Node.js + Playwright | Pure Rust binary |
-| **Protocol** | Playwright protocol | Direct CDP / WebDriver |
-| **Install size** | Larger (Node.js + npm deps) | Smaller (single binary) |
-| **Browser support** | Chromium, Firefox, WebKit | Chromium, Safari (via WebDriver) |
-| **Stability** | Stable | Experimental |
+|                     | Default (Node.js)           | Native (`--native`)              |
+| ------------------- | --------------------------- | -------------------------------- |
+| **Runtime**         | Node.js + Playwright        | Pure Rust binary                 |
+| **Protocol**        | Playwright protocol         | Direct CDP / WebDriver           |
+| **Install size**    | Larger (Node.js + npm deps) | Smaller (single binary)          |
+| **Browser support** | Chromium, Firefox, WebKit   | Chromium, Safari (via WebDriver) |
+| **Stability**       | Stable                      | Experimental                     |
 ### Known Limitations
@@ -961,13 +1057,13 @@ Or add to your config file (`agent-browser.json`):
 ## Platforms
-| Platform | Binary | Fallback |
-|----------|--------|----------|
-| macOS ARM64 | Native Rust | Node.js |
-| macOS x64 | Native Rust | Node.js |
-| Linux ARM64 | Native Rust | Node.js |
-| Linux x64 | Native Rust | Node.js |
-| Windows x64 | Native Rust | Node.js |
+| Platform    | Binary      | Fallback |
+| ----------- | ----------- | -------- |
+| macOS ARM64 | Native Rust | Node.js  |
+| macOS x64   | Native Rust | Node.js  |
+| Linux ARM64 | Native Rust | Node.js  |
+| Linux x64   | Native Rust | Node.js  |
+| Windows x64 | Native Rust | Node.js  |
 ## Usage with AI Agents
@@ -1011,6 +1107,7 @@ For more consistent results, add to your project or global instructions file:
 Use `agent-browser` for web automation. Run `agent-browser --help` for all commands.
 Core workflow:
 1. `agent-browser open <url>` - Navigate to page
 2. `agent-browser snapshot -i` - Get interactive elements with refs (@e1, @e2)
 3. `agent-browser click @e1` / `fill @e2 "text"` - Interact using refs
@@ -1062,11 +1159,11 @@ export AGENT_BROWSER_IOS_DEVICE="iPhone 16 Pro"
 agent-browser open https://example.com
 ```
-| Variable | Description |
-|----------|-------------|
-| `AGENT_BROWSER_PROVIDER` | Set to `ios` to enable iOS mode |
+| Variable                   | Description                                     |
+| -------------------------- | ----------------------------------------------- |
+| `AGENT_BROWSER_PROVIDER`   | Set to `ios` to enable iOS mode                 |
 | `AGENT_BROWSER_IOS_DEVICE` | Device name (e.g., "iPhone 16 Pro", "iPad Pro") |
-| `AGENT_BROWSER_IOS_UDID` | Device UDID (alternative to device name) |
+| `AGENT_BROWSER_IOS_UDID`   | Device UDID (alternative to device name)        |
 **Supported devices:** All iOS Simulators available in Xcode (iPhones, iPads), plus real iOS devices.
@@ -1077,6 +1174,7 @@ agent-browser open https://example.com
 Appium also supports real iOS devices connected via USB. This requires additional one-time setup:
 **1. Get your device UDID:**
 ```bash
 xcrun xctrace list devices
 # or
@@ -1084,6 +1182,7 @@ system_profiler SPUSBDataType | grep -A 5 "iPhone\|iPad"
 ```
 **2. Sign WebDriverAgent (one-time):**
 ```bash
 # Open the WebDriverAgent Xcode project
 cd ~/.appium/node_modules/appium-xcuitest-driver/node_modules/appium-webdriveragent
@@ -1091,12 +1190,14 @@ open WebDriverAgent.xcodeproj
 ```
 In Xcode:
 - Select the `WebDriverAgentRunner` target
 - Go to Signing & Capabilities
 - Select your Team (requires Apple Developer account, free tier works)
 - Let Xcode manage signing automatically
 **3. Use with agent-browser:**
 ```bash
 # Connect device via USB, then:
 agent-browser -p ios --device "<DEVICE_UDID>" open https://example.com
@@ -1106,11 +1207,44 @@ agent-browser -p ios --device "John's iPhone" open https://example.com
 ```
 **Real device notes:**
 - First run installs WebDriverAgent to the device (may require Trust prompt)
 - Device must be unlocked and connected via USB
 - Slightly slower initial connection than simulator
 - Tests against real Safari performance and behavior
+### Browserless
+[Browserless](https://browserless.io) provides cloud browser infrastructure with a Sessions API. Use it when running agent-browser in environments where a local browser isn't available.
+To enable Browserless, use the `-p` flag:
+```bash
+export BROWSERLESS_API_KEY="your-api-token"
+agent-browser -p browserless open https://example.com
+```
+Or use environment variables for CI/scripts:
+```bash
+export AGENT_BROWSER_PROVIDER=browserless
+export BROWSERLESS_API_KEY="your-api-token"
+agent-browser open https://example.com
+```
+Optional configuration via environment variables:
+| Variable                   | Description                                      | Default                                 |
+| -------------------------- | ------------------------------------------------ | --------------------------------------- |
+| `BROWSERLESS_API_URL`      | Base API URL (for custom regions or self-hosted) | `https://production-sfo.browserless.io` |
+| `BROWSERLESS_BROWSER_TYPE` | Type of browser to use (chromium or chrome)      | chromium                                |
+| `BROWSERLESS_TTL`          | Session TTL in milliseconds                      | `300000`                                |
+| `BROWSERLESS_STEALTH`      | Enable stealth mode (`true`/`false`)             | `true`                                  |
+When enabled, agent-browser connects to a Browserless cloud session instead of launching a local browser. All commands work identically.
+Get your API token from the [Browserless Dashboard](https://browserless.io).
 ### Browserbase
 [Browserbase](https://browserbase.com) provides remote browser infrastructure to make deployment of agentic browsing agents easy. Use it when running the agent-browser CLI in an environment where a local browser isn't feasible.
@@ -1119,7 +1253,6 @@ To enable Browserbase, use the `-p` flag:
 ```bash
 export BROWSERBASE_API_KEY="your-api-key"
-export BROWSERBASE_PROJECT_ID="your-project-id"
 agent-browser -p browserbase open https://example.com
 ```
@@ -1128,13 +1261,12 @@ Or use environment variables for CI/scripts:
 ```bash
 export AGENT_BROWSER_PROVIDER=browserbase
 export BROWSERBASE_API_KEY="your-api-key"
-export BROWSERBASE_PROJECT_ID="your-project-id"
 agent-browser open https://example.com
 ```
 When enabled, agent-browser connects to a Browserbase session instead of launching a local browser. All commands work identically.
-Get your API key and project ID from the [Browserbase Dashboard](https://browserbase.com/overview).
+Get your API key from the [Browserbase Dashboard](https://browserbase.com/overview).
 ### Browser Use
@@ -1180,12 +1312,12 @@ agent-browser open https://example.com
 Optional configuration via environment variables:
-| Variable | Description | Default |
-|----------|-------------|---------|
-| `KERNEL_HEADLESS` | Run browser in headless mode (`true`/`false`) | `false` |
-| `KERNEL_STEALTH` | Enable stealth mode to avoid bot detection (`true`/`false`) | `true` |
-| `KERNEL_TIMEOUT_SECONDS` | Session timeout in seconds | `300` |
-| `KERNEL_PROFILE_NAME` | Browser profile name for persistent cookies/logins (created if it doesn't exist) | (none) |
+| Variable                 | Description                                                                      | Default |
+| ------------------------ | -------------------------------------------------------------------------------- | ------- |
+| `KERNEL_HEADLESS`        | Run browser in headless mode (`true`/`false`)                                    | `false` |
+| `KERNEL_STEALTH`         | Enable stealth mode to avoid bot detection (`true`/`false`)                      | `true`  |
+| `KERNEL_TIMEOUT_SECONDS` | Session timeout in seconds                                                       | `300`   |
+| `KERNEL_PROFILE_NAME`    | Browser profile name for persistent cookies/logins (created if it doesn't exist) | (none)  |
 When enabled, agent-browser connects to a Kernel cloud session instead of launching a local browser. All commands work identically.

package/bin/agent-browser-darwin-arm64 CHANGED Viewed

Binary file

package/bin/agent-browser-darwin-x64 CHANGED Viewed

Binary file

package/bin/agent-browser-linux-arm64 CHANGED Viewed

Binary file

package/bin/agent-browser-linux-x64 CHANGED Viewed

Binary file

package/bin/agent-browser-win32-x64.exe CHANGED Viewed

Binary file

package/dist/actions.d.ts.map CHANGED Viewed

	@@ -1 +1 @@
1	- {"version":3,"file":"actions.d.ts","sourceRoot":"","sources":["../src/actions.ts"],"names":[],"mappings":"~~AAIA~~,OAAO,KAAK,EAAE,cAAc,EAAE,eAAe,EAAE,MAAM,cAAc,CAAC;AAqBpE,OAAO,KAAK,EACV,OAAO,EACP,QAAQ,EAsIT,MAAM,YAAY,CAAC;AAQpB;;;GAGG;AACH,wBAAgB,0BAA0B,CACxC,QAAQ,EAAE,CAAC,CAAC,KAAK,EAAE,eAAe,KAAK,IAAI,CAAC,GAAG,IAAI,GAClD,IAAI,CAEN;AAQD;;;GAGG;AACH,wBAAgB,iBAAiB,CAAC,KAAK,EAAE,OAAO,EAAE,QAAQ,EAAE,MAAM,GAAG,KAAK,CAqDzE;AAKD,wBAAgB,gBAAgB,IAAI,IAAI,CAuBvC;AAED;;GAEG;AACH,wBAAsB,cAAc,CAAC,OAAO,EAAE,OAAO,EAAE,OAAO,EAAE,cAAc,GAAG,OAAO,CAAC,QAAQ,CAAC,CA6CjG"}
1	+ {"version":3,"file":"actions.d.ts","sourceRoot":"","sources":["../src/actions.ts"],"names":[],"mappings":"AAKA,OAAO,KAAK,EAAE,cAAc,EAAE,eAAe,EAAE,MAAM,cAAc,CAAC;AAqBpE,OAAO,KAAK,EACV,OAAO,EACP,QAAQ,EAsIT,MAAM,YAAY,CAAC;AAQpB;;;GAGG;AACH,wBAAgB,0BAA0B,CACxC,QAAQ,EAAE,CAAC,CAAC,KAAK,EAAE,eAAe,KAAK,IAAI,CAAC,GAAG,IAAI,GAClD,IAAI,CAEN;AAQD;;;GAGG;AACH,wBAAgB,iBAAiB,CAAC,KAAK,EAAE,OAAO,EAAE,QAAQ,EAAE,MAAM,GAAG,KAAK,CAqDzE;AAKD,wBAAgB,gBAAgB,IAAI,IAAI,CAuBvC;AAED;;GAEG;AACH,wBAAsB,cAAc,CAAC,OAAO,EAAE,OAAO,EAAE,OAAO,EAAE,cAAc,GAAG,OAAO,CAAC,QAAQ,CAAC,CA6CjG"}