agent-browser-priv 0.28.0-priv.1 → 0.31.1-priv.1

package/README.md CHANGED
@@ -64,6 +64,8 @@ On Linux, install system dependencies:
64
64
  agent-browser install --with-deps
65
65
  ```
66
66
 
67
+ This exits nonzero if the package manager cannot install every required browser library.
68
+
67
69
  ### Updating
68
70
 
69
71
  Upgrade to the latest version:
@@ -74,6 +76,12 @@ agent-browser upgrade
74
76
 
75
77
  Detects your installation method (npm, Homebrew, or Cargo) and runs the appropriate update command automatically.
76
78
 
79
+ `agent-browser doctor` reports the installed Patchright backend version, the
80
+ Patchright version pinned by the current `agent-browser` release, and npm latest
81
+ when network checks are enabled. Patchright updates are release-controlled:
82
+ after upgrading `agent-browser`, run `agent-browser install` to refresh the
83
+ backend to the version pinned by that release.
84
+
77
85
  ### Requirements
78
86
 
79
87
  - **Patchright backend** - Run `agent-browser install` to install pinned Patchright and its browser artifacts for the default local backend. Requires Node.js and npm for install and runtime launch.
@@ -93,12 +101,9 @@ agent-browser screenshot page.png
93
101
  agent-browser close
94
102
  ```
95
103
 
96
- Clicks fail early when another element covers the target's click point,
97
- for example a consent banner or modal. Dismiss or interact with the reported
98
- covering element, then take a fresh snapshot before retrying the original ref.
104
+ Clicks fail early when another element covers the target's click point, for example a consent banner or modal. Dismiss or interact with the reported covering element, then take a fresh snapshot before retrying the original ref.
99
105
 
100
- Headless Chromium screenshots hide native scrollbars for consistent image output.
101
- Pass `--hide-scrollbars false` when launching to keep native scrollbars visible.
106
+ Headless Chromium screenshots hide native scrollbars for consistent image output. Pass `--hide-scrollbars false` when launching to keep native scrollbars visible.
102
107
 
103
108
  ### Traditional Selectors (also supported)
104
109
 
@@ -116,6 +121,7 @@ agent-browser find role button click --name "Submit"
116
121
  agent-browser open # Launch browser (no navigation); stays on about:blank
117
122
  agent-browser open <url> # Launch + navigate to URL (aliases: goto, navigate)
118
123
  agent-browser open --wait-until none <url> # Return immediately after navigation is sent
124
+ agent-browser read [url] # Fetch agent-readable text, or read rendered active-tab DOM
119
125
  agent-browser click <sel> # Click element (--new-tab to open in new tab)
120
126
  agent-browser dblclick <sel> # Double-click element
121
127
  agent-browser focus <sel> # Focus element
@@ -166,6 +172,23 @@ agent-browser get box <sel> # Get bounding box
166
172
  agent-browser get styles <sel> # Get computed styles
167
173
  ```
168
174
 
175
+ ### Read Agent-Friendly Text
176
+
177
+ ```bash
178
+ agent-browser read
179
+ agent-browser read https://example.com/article
180
+ agent-browser read https://example.com/article --filter overview
181
+ agent-browser read https://example.com/article --outline
182
+ agent-browser read https://docs.example.com --llms index --filter auth
183
+ agent-browser read https://docs.example.com --llms full --filter auth
184
+ agent-browser read example.com/article --require-md
185
+ agent-browser read https://example.com/article --json
186
+ ```
187
+
188
+ `read` fetches a URL without launching Chrome. Omit the URL to read the rendered DOM of the active tab in the current browser session, including browser auth state and client-side updates. Explicit URL reads send `Accept: text/markdown` by default, try the same URL with `.md` appended when the first response is not markdown, walk ancestor paths toward `/` to find the nearest `llms.txt` for a matching docs link, print markdown or plain text when available, and fall back to readable text extracted from HTML. `--llms` and `--require-md` with no URL use the active tab URL because they depend on HTTP resources. `read` does not read `llms-full.txt` unless you ask for it.
189
+
190
+ Options: `--raw` prints the response body without HTML extraction, `--require-md` fails unless the server returns `Content-Type: text/markdown`, `--outline` prints a compact heading outline for one page, `--llms index` prints a compact nearest-ancestor `llms.txt` link list, `--llms full` reads the nearest-ancestor `llms-full.txt`, `--filter <text>` narrows page sections, llms links/sections, or outline headings, and `--timeout <ms>` changes the request timeout. Global safeguards such as `--allowed-domains`, `--content-boundaries`, and `--max-output` also apply to read fetches and output.
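The nearest-ancestor `llms.txt` lookup described in the added `read` text can be pictured with a small sketch. This is illustrative JavaScript, not the CLI's implementation (which is not part of this diff). The helper name `llmsTxtCandidates` is hypothetical, and the sketch assumes that "walk ancestor paths toward `/`" means probing each parent directory of the URL path in turn, nearest first.

```javascript
// Hypothetical helper (not from the package): list the llms.txt locations a
// client might probe for a page, nearest ancestor first, ending at the root.
function llmsTxtCandidates(pageUrl) {
  const url = new URL(pageUrl);
  const segments = url.pathname.split('/').filter(Boolean);
  // For a file-like URL, drop the last segment to start from its directory.
  // A URL ending in "/" already names a directory, so keep it as-is.
  if (!url.pathname.endsWith('/')) segments.pop();

  const candidates = [];
  for (let depth = segments.length; depth >= 0; depth -= 1) {
    const dir = segments.slice(0, depth).join('/');
    candidates.push(`${url.origin}${dir ? `/${dir}` : ''}/llms.txt`);
  }
  return candidates;
}

console.log(llmsTxtCandidates('https://docs.example.com/guides/auth/tokens'));
```

Under these assumptions the first candidate that exists would be treated as the "nearest" `llms.txt`; how the real command matches a docs link inside it is not specified here.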
191
+
169
192
  ### Check State
170
193
 
171
194
  ```bash
@@ -222,9 +245,7 @@ agent-browser wait "#spinner" --state hidden
222
245
 
223
246
  ### Batch Execution
224
247
 
225
- Execute multiple commands in a single invocation. Commands can be passed as
226
- quoted arguments or piped as JSON via stdin. This avoids per-command process
227
- startup overhead when running multi-step workflows.
248
+ Execute multiple commands in a single invocation. Commands can be passed as quoted arguments or piped as JSON via stdin. This avoids per-command process startup overhead when running multi-step workflows.
228
249
 
229
250
  ```bash
230
251
  # Argument mode: each quoted argument is a full command
@@ -318,15 +339,9 @@ agent-browser tab close [t<N>|label] # Close a tab (defaults to active
318
339
  agent-browser window new # New window
319
340
  ```
320
341
 
321
- Tab ids are stable strings of the form `t1`, `t2`, `t3`. They're never reused
322
- within a session, so scripts and agents can keep referring to the same tab
323
- even after other tabs are opened or closed. Positional integers like `tab 2`
324
- are **not** accepted; the `t` prefix disambiguates handles from indices and
325
- mirrors the `@e1` convention used for element refs.
342
+ Tab ids are stable strings of the form `t1`, `t2`, `t3`. They're never reused within a session, so scripts and agents can keep referring to the same tab even after other tabs are opened or closed. Positional integers like `tab 2` are **not** accepted; the `t` prefix disambiguates handles from indices and mirrors the `@e1` convention used for element refs.
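As a rough illustration of the handle rules in the paragraph above, the sketch below separates `t<N>` ids, labels, and rejected positional integers. It is not the CLI's parser: the function name `classifyTabRef` is hypothetical, and the assumption that ids start at `t1` is inferred only from the examples in the README text.

```javascript
// Hypothetical classifier (not from the package) for a tab argument.
function classifyTabRef(arg) {
  // Stable ids look like t1, t2, t3 (assumed to start at 1).
  if (/^t[1-9]\d*$/.test(arg)) return { kind: 'id', value: arg };
  // Bare integers such as "2" are positional and are rejected.
  if (/^\d+$/.test(arg)) {
    throw new Error(`positional index "${arg}" is not accepted; use a t<N> id or a label`);
  }
  // Anything else is treated as a user-assigned label such as "docs".
  return { kind: 'label', value: arg };
}

console.log(classifyTabRef('t2'));
console.log(classifyTabRef('docs'));
```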
326
343
 
327
- You can also assign a memorable label (`docs`, `app`, `admin`) and use it
328
- interchangeably with the id. Labels are never auto-generated and never
329
- rewritten on navigation — they're yours to name and keep:
344
+ You can also assign a memorable label (`docs`, `app`, `admin`) and use it interchangeably with the id. Labels are never auto-generated and never rewritten on navigation — they're yours to name and keep:
330
345
 
331
346
  ```bash
332
347
  agent-browser tab new --label docs https://docs.example.com
@@ -406,10 +421,7 @@ agent-browser pushstate <url> # SPA client-side nav; auto-detects window
406
421
 
407
422
  ### Pre-navigation setup
408
423
 
409
- Some flows (SSR debug, auth cookies for protected origins, init scripts)
410
- need state set up *before* the first navigation. Use `open` with no URL
411
- to launch the browser, then stage cookies / routes / init scripts, then
412
- navigate. `batch` sends it all in one CLI call:
424
+ Some flows (SSR debug, auth cookies for protected origins, init scripts) need state set up *before* the first navigation. Use `open` with no URL to launch the browser, then stage cookies / routes / init scripts, then navigate. `batch` sends it all in one CLI call:
413
425
 
414
426
  ```bash
415
427
  agent-browser batch \
@@ -419,14 +431,11 @@ agent-browser batch \
419
431
  '["navigate","http://localhost:3000/target"]'
420
432
  ```
421
433
 
422
- Without `batch` the same sequence is three commands that all reuse the
423
- same daemon (fast, but not one turn).
434
+ Without `batch` the same sequence is three commands that all reuse the same daemon (fast, but not one turn).
424
435
 
425
436
  ### React / Web Vitals
426
437
 
427
- Agent-browser ships with first-class React introspection and universal Web
428
- Vitals metrics. The React commands need the React DevTools hook installed at
429
- launch; Web Vitals and pushstate are framework-agnostic.
438
+ Agent-browser ships with first-class React introspection and universal Web Vitals metrics. The React commands need the React DevTools hook installed at launch; Web Vitals and pushstate are framework-agnostic.
430
439
 
431
440
  ```bash
432
441
  agent-browser open --enable react-devtools <url> # Launch with React hook installed
@@ -439,15 +448,10 @@ agent-browser react suspense [--only-dynamic] [--json] # Suspense boundaries +
439
448
  agent-browser vitals [url] [--json] # LCP/CLS/TTFB/FCP/INP + hydration summary
440
449
  ```
441
450
 
442
- Each `react ...` subcommand requires `--enable react-devtools` to have been
443
- passed at launch (the React DevTools `installHook.js` is embedded in the
444
- binary). Without it the commands error with `React DevTools hook not installed
451
+ Each `react ...` subcommand requires `--enable react-devtools` to have been passed at launch (the React DevTools `installHook.js` is embedded in the binary). Without it the commands error with `React DevTools hook not installed
445
452
  - relaunch with --enable react-devtools`.
446
453
 
447
- Works on any React app — Next.js, Remix, Vite+React, CRA, TanStack Start,
448
- React Native Web, etc. `vitals` and `pushstate` are framework-agnostic.
449
- `vitals` prints a summary by default; pass `--json` for the full structured
450
- payload.
454
+ Works on any React app — Next.js, Remix, Vite+React, CRA, TanStack Start, React Native Web, etc. `vitals` and `pushstate` are framework-agnostic. `vitals` prints a summary by default; pass `--json` for the full structured payload.
451
455
 
452
456
  ### Init scripts
453
457
 
@@ -567,7 +571,7 @@ agent-browser provides multiple ways to persist login sessions so you don't re-a
567
571
  |----------|----------|------------|
568
572
  | **Chrome profile reuse** | Reuse your existing Chrome login state (cookies, sessions) with zero setup | `--profile <name>` / `AGENT_BROWSER_PROFILE` |
569
573
  | **Persistent profile** | Full browser state (cookies, IndexedDB, service workers, cache) across restarts | `--profile <path>` / `AGENT_BROWSER_PROFILE` |
570
- | **Session persistence** | Auto-save/restore cookies + localStorage by name | `--session-name <name>` / `AGENT_BROWSER_SESSION_NAME` |
574
+ | **Session persistence** | Auto-save/restore cookies + localStorage from a stable session key | `--session <id> --restore` / `AGENT_BROWSER_RESTORE` |
571
575
  | **Import from your browser** | Grab auth from a Chrome session you already logged into | `--auto-connect` + `state save` |
572
576
  | **State file** | Load a previously saved state JSON on launch | `--state <path>` / `AGENT_BROWSER_STATE` |
573
577
  | **Auth vault** | Store credentials locally (encrypted), login by name | `auth save` / `auth login` |
@@ -588,9 +592,10 @@ agent-browser --auto-connect state save ./my-auth.json
588
592
  # 3. Use the saved auth in future sessions
589
593
  agent-browser --state ./my-auth.json open https://app.example.com/dashboard
590
594
 
591
- # 4. Or use --session-name for automatic persistence
592
- agent-browser --session-name myapp state load ./my-auth.json
593
- # From now on, --session-name myapp auto-saves/restores this state
595
+ # 4. Or use --restore for automatic persistence
596
+ SESSION="$(agent-browser session id --scope worktree --prefix myapp)"
597
+ agent-browser --session "$SESSION" --restore --state ./my-auth.json open https://app.example.com/dashboard
598
+ # From now on, --session "$SESSION" --restore auto-saves/restores this state
594
599
  ```
595
600
 
596
601
  > **Security notes:**
@@ -620,6 +625,12 @@ agent-browser session list
620
625
 
621
626
  # Show current session
622
627
  agent-browser session
628
+
629
+ # Generate a stable worktree-scoped session id
630
+ agent-browser session id --scope worktree --prefix next-dev-loop
631
+
632
+ # Inspect daemon, launch, and restore status
633
+ agent-browser session info --json
623
634
  ```
624
635
 
625
636
  Each session has its own:
@@ -678,18 +689,18 @@ The profile directory stores:
678
689
 
679
690
  ## Session Persistence
680
691
 
681
- Alternatively, use `--session-name` to automatically save and restore cookies and localStorage across browser restarts:
692
+ Use `--restore` with a stable `--session` to automatically save and restore cookies and localStorage across browser restarts:
682
693
 
683
694
  ```bash
684
- # Auto-save/load state for "twitter" session
685
- agent-browser --session-name twitter open twitter.com
695
+ # Generate a stable id for this worktree and auto-save/load state
696
+ SESSION="$(agent-browser session id --scope worktree --prefix twitter)"
697
+ agent-browser --session "$SESSION" --restore open twitter.com
686
698
 
687
699
  # Login once, then state persists automatically
688
700
  # State files stored in ~/.agent-browser/sessions/
689
701
 
690
- # Or via environment variable
691
- export AGENT_BROWSER_SESSION_NAME=twitter
692
- agent-browser open twitter.com
702
+ # Optional: validate restored state before auto-saving again
703
+ agent-browser --session "$SESSION" --restore --restore-check-text Dashboard open twitter.com
693
704
  ```
694
705
 
695
706
  ### State Encryption
@@ -701,12 +712,15 @@ Encrypt saved session data at rest with AES-256-GCM:
701
712
  export AGENT_BROWSER_ENCRYPTION_KEY=<64-char-hex-key>
702
713
 
703
714
  # State files are now encrypted automatically
704
- agent-browser --session-name secure open example.com
715
+ agent-browser --session secure --restore open example.com
705
716
  ```
706
717
 
707
718
  | Variable | Description |
708
719
  | --------------------------------- | -------------------------------------------------- |
709
- | `AGENT_BROWSER_SESSION_NAME` | Auto-save/load state persistence name |
720
+ | `AGENT_BROWSER_RESTORE` | Auto-save/load state persistence name |
721
+ | `AGENT_BROWSER_RESTORE_SAVE` | Restore save policy: `auto`, `always`, or `never` |
722
+ | `AGENT_BROWSER_NAMESPACE` | Namespace for daemon sockets and restore state |
723
+ | `AGENT_BROWSER_SESSION_NAME` | Legacy auto-save/load state persistence name |
710
724
  | `AGENT_BROWSER_ENCRYPTION_KEY` | 64-char hex key for AES-256-GCM encryption |
711
725
  | `AGENT_BROWSER_STATE_EXPIRE_DAYS` | Auto-delete states older than N days (default: 30) |
712
726
 
@@ -871,7 +885,13 @@ This is useful for multimodal AI models that can reason about visual layout, unl
871
885
  | Option | Description |
872
886
  |--------|-------------|
873
887
  | `--session <name>` | Use isolated session (or `AGENT_BROWSER_SESSION` env) |
874
- | `--session-name <name>` | Auto-save/restore session state (or `AGENT_BROWSER_SESSION_NAME` env) |
888
+ | `--restore [name]` | Auto-save/restore session state. Bare `--restore` uses `--session` as the key |
889
+ | `--restore-save <policy>` | Restore save policy: `auto`, `always`, or `never` |
890
+ | `--restore-check-url <glob>` | Validate restored state against a URL pattern |
891
+ | `--restore-check-text <text>` | Validate restored state against page text |
892
+ | `--restore-check-fn <js>` | Validate restored state against a truthy JavaScript expression |
893
+ | `--namespace <name>` | Isolate daemon sockets and restore-state directories |
894
+ | `--session-name <name>` | Legacy alias for restore persistence key |
875
895
  | `--profile <name\|path>` | Chrome profile name or persistent directory path (or `AGENT_BROWSER_PROFILE` env) |
876
896
  | `--state <path>` | Load storage state from JSON file (or `AGENT_BROWSER_STATE` env) |
877
897
  | `--headers <json>` | Set HTTP headers scoped to the URL's origin |
@@ -913,23 +933,68 @@ This is useful for multimodal AI models that can reason about visual layout, unl
913
933
  | `--config <path>` | Use a custom config file (or `AGENT_BROWSER_CONFIG` env) |
914
934
  | `--debug` | Debug output |
915
935
 
916
- ## Local backends
936
+ ## Patchright fork behavior
937
+
938
+ This fork keeps the upstream `agent-browser` command surface and adds a backend
939
+ choice for local Chrome-compatible launches. Normal commands remain the same:
940
+
941
+ ```bash
942
+ agent-browser open https://example.com
943
+ agent-browser snapshot -i
944
+ agent-browser get title
945
+ agent-browser close
946
+ ```
947
+
948
+ The fork-specific default is `--backend patchright` for local Chrome launches.
949
+ Patchright launches the browser process and exposes a localhost CDP endpoint;
950
+ the Rust daemon still drives the page through CDP after launch. This gives a
951
+ more realistic local browser lane for development, sandboxes, and CI without
952
+ making agents learn a separate wrapper command.
917
953
 
918
- This fork defaults local Chrome-compatible launches to Patchright. For fresh
919
- remote hosts, sandboxes, and CI environments, install the default backend once:
954
+ Fork-specific command surface:
955
+
956
+ ```bash
957
+ agent-browser install # install default Patchright backend
958
+ agent-browser install patchright # same, explicit target
959
+ agent-browser install chrome # install Chrome for Testing for --backend chrome
960
+ agent-browser install --with-deps # Linux system deps plus default backend
961
+ agent-browser --backend patchright open https://example.com
962
+ agent-browser --backend chrome open https://example.com
963
+ AGENT_BROWSER_BACKEND=chrome agent-browser open https://example.com
964
+ agent-browser doctor # includes Patchright/backend checks
965
+ agent-browser doctor --offline --quick
966
+ ```
967
+
968
+ ### Install or refresh Patchright
969
+
970
+ For fresh remote hosts, sandboxes, and CI environments, install the default
971
+ backend once:
920
972
 
921
973
  ```bash
922
974
  agent-browser install
923
975
  ```
924
976
 
925
- Then use the normal command surface:
977
+ That installs the Patchright npm package pinned by this `agent-browser` release
978
+ and downloads Patchright's Chromium artifacts. On Linux, add system packages
979
+ when needed:
926
980
 
927
981
  ```bash
928
- agent-browser --headed open https://example.com
929
- agent-browser --profile ~/.agent-browser/profiles/dev open https://example.com
982
+ agent-browser install --with-deps
930
983
  ```
931
984
 
932
- Use the built-in Chrome CDP launcher when a site behaves better on that lane:
985
+ After upgrading `agent-browser`, run `agent-browser install` again to refresh
986
+ the backend to the Patchright version pinned by the new release.
987
+
988
+ ### Switch backend per command
989
+
990
+ Use Patchright explicitly:
991
+
992
+ ```bash
993
+ agent-browser --backend patchright open https://example.com
994
+ ```
995
+
996
+ Use the upstream-style built-in Chrome launcher when a site behaves better on
997
+ that lane or when you want to avoid Node/Patchright at runtime:
933
998
 
934
999
  ```bash
935
1000
  agent-browser install chrome
@@ -937,15 +1002,75 @@ agent-browser --backend chrome open https://example.com
937
1002
  AGENT_BROWSER_BACKEND=chrome agent-browser open https://example.com
938
1003
  ```
939
1004
 
940
- Patchright is used only to launch the local Chrome-compatible browser and expose
941
- CDP on localhost. It does not solve CAPTCHA, Turnstile, or other human
942
- verification pages; preserve those pages for human handoff.
1005
+ For a durable default, use config:
1006
+
1007
+ ```json
1008
+ {
1009
+ "backend": "chrome"
1010
+ }
1011
+ ```
1012
+
1013
+ Put that in `~/.agent-browser/config.json` for your user default or
1014
+ `./agent-browser.json` for a project default. Command-line flags override env
1015
+ vars, env vars override config, project config overrides user config, and a
1016
+ missing auto-discovered config file is ignored.
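The precedence order stated above (flags, then env vars, then project config, then user config) can be sketched as a first-defined-wins lookup. This is a minimal illustration, not the daemon's actual resolution code. The function name `resolveBackend` and its argument shape are hypothetical, and the `patchright` fallback reflects the fork default described in this README.

```javascript
// Hypothetical resolver (not from the package): the first defined value wins,
// checked in the order the README documents.
function resolveBackend({ flag, env, projectConfig, userConfig } = {}) {
  const sources = [
    ['flag', flag],                            // --backend <name>
    ['env', env],                              // AGENT_BROWSER_BACKEND
    ['project config', projectConfig?.backend], // ./agent-browser.json
    ['user config', userConfig?.backend],       // ~/.agent-browser/config.json
  ];
  for (const [source, value] of sources) {
    if (value !== undefined && value !== null && value !== '') {
      return { backend: value, source };
    }
  }
  // A missing config file simply contributes nothing; fall back to the default.
  return { backend: 'patchright', source: 'default' };
}

console.log(resolveBackend({ env: 'chrome', projectConfig: { backend: 'patchright' } }));
```

In this sketch a missing config file is represented by passing `undefined`, which matches the README's note that an absent auto-discovered file is ignored rather than treated as an error.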
1017
+
1018
+ ### Diagnose backend state
1019
+
1020
+ `doctor` is extended in this fork:
1021
+
1022
+ ```bash
1023
+ agent-browser doctor
1024
+ agent-browser doctor --offline --quick
1025
+ agent-browser doctor --fix
1026
+ agent-browser doctor --json
1027
+ ```
1028
+
1029
+ It reports:
1030
+
1031
+ - installed Patchright backend path and installed Patchright npm version;
1032
+ - Patchright release pin embedded in the current binary;
1033
+ - npm latest Patchright version when network checks are enabled;
1034
+ - Chrome/Chrome for Testing availability;
1035
+ - stale daemons and version-mismatched sessions;
1036
+ - config files, encryption key, provider env, network reachability, and a live
1037
+ launch test unless `--quick` is used.
1038
+
1039
+ If doctor warns that the installed Patchright backend differs from the release
1040
+ pin, run:
1041
+
1042
+ ```bash
1043
+ agent-browser install
1044
+ ```
1045
+
1046
+ ### What Patchright helps and does not help
1047
+
1048
+ Patchright is not CAPTCHA solving, Turnstile solving, decaptcha, proxy
1049
+ rotation, or a guarantee of access. It still cannot pass pages that require a
1050
+ human action or a third-party solver, including the public Turnstile demo at
1051
+ `https://nopecha.com/captcha/turnstile`.
1052
+
1053
+ What it does provide is a stronger local development browser lane than vanilla
1054
+ headless Chrome in many real-world challenge environments. In practice it has
1055
+ performed better than ordinary automation-flavored browsers on many Cloudflare
1056
+ and AWS WAF-style interstitials, especially when used headed with a persistent
1057
+ profile. If a challenge remains, preserve the page, screenshot/text, and
1058
+ network evidence for human handoff instead of trying to bypass it in code.
1059
+
1060
+ ### Supported launch options
943
1061
 
944
1062
  The default Patchright backend honors `--proxy`, `--proxy-bypass`,
945
1063
  `--user-agent`, `--ignore-https-errors`, `--download-path`, and custom launch
946
1064
  args. Remote-debugging address and port args are reserved by agent-browser and
947
1065
  are forced to localhost.
948
1066
 
1067
+ Patchright is only valid with the Chrome engine:
1068
+
1069
+ ```bash
1070
+ agent-browser --engine chrome --backend patchright open https://example.com
1071
+ agent-browser --engine lightpanda open https://example.com # separate engine, no Patchright
1072
+ ```
1073
+
949
1074
  ## Observability Dashboard
950
1075
 
951
1076
  Monitor agent-browser sessions in real time with a local web dashboard showing a live viewport and command activity feed.
@@ -1013,6 +1138,7 @@ Create an `agent-browser.json` file to set persistent defaults instead of repeat
1013
1138
  ```json
1014
1139
  {
1015
1140
  "headed": true,
1141
+ "backend": "chrome",
1016
1142
  "proxy": "http://localhost:8080",
1017
1143
  "profile": "./browser-data",
1018
1144
  "userAgent": "my-agent/1.0",
@@ -1035,7 +1161,12 @@ agent-browser --config ./ci-config.json open example.com
1035
1161
  AGENT_BROWSER_CONFIG=./ci-config.json agent-browser open example.com
1036
1162
  ```
1037
1163
 
1038
- All options from the table above can be set in the config file using camelCase keys (e.g., `--executable-path` becomes `"executablePath"`, `--proxy-bypass` becomes `"proxyBypass"`). Plugins are configured with the `"plugins"` array shown above. Unknown keys are ignored for forward compatibility.
1164
+ All options from the table above can be set in the config file using camelCase
1165
+ keys (e.g., `--executable-path` becomes `"executablePath"`, `--proxy-bypass`
1166
+ becomes `"proxyBypass"`). Use `"backend": "chrome"` if you want the original
1167
+ built-in Chrome launcher as your default instead of this fork's Patchright
1168
+ backend. Plugins are configured with the `"plugins"` array shown above. Unknown
1169
+ keys are ignored for forward compatibility.
1039
1170
 
1040
1171
  A [JSON Schema](agent-browser.schema.json) is available for IDE autocomplete and validation. Add a `$schema` key to your config file to enable it:
1041
1172
 
@@ -1091,9 +1222,7 @@ agent-browser get text @e1 # Get heading text
1091
1222
  agent-browser hover @e4 # Hover the link
1092
1223
  ```
1093
1224
 
1094
- When a ref click is blocked by an overlay, the error includes the covering
1095
- element, such as `covered by <div#consent-banner>`. Click the banner or dialog
1096
- control first, then run `snapshot` again before reusing refs.
1225
+ When a ref click is blocked by an overlay, the error includes the covering element, such as `covered by <div#consent-banner>`. Click the banner or dialog control first, then run `snapshot` again before reusing refs.
1097
1226
 
1098
1227
  **Why use refs?**
1099
1228
 
@@ -1239,15 +1368,17 @@ AGENT_BROWSER_EXECUTABLE_PATH=/path/to/chromium agent-browser open example.com
1239
1368
  Run agent-browser + Chrome in an ephemeral Vercel Sandbox microVM. No external server needed:
1240
1369
 
1241
1370
  ```typescript
1242
- import { Sandbox } from "@vercel/sandbox";
1371
+ import { runAgentBrowserCommand, withAgentBrowserSandbox } from "@agent-browser/sandbox/vercel";
1243
1372
 
1244
- const sandbox = await Sandbox.create({ runtime: "node24" });
1245
- await sandbox.runCommand("agent-browser", ["open", "https://example.com"]);
1246
- const result = await sandbox.runCommand("agent-browser", ["screenshot", "--json"]);
1247
- await sandbox.stop();
1373
+ const result = await withAgentBrowserSandbox(async (sandbox) => {
1374
+ await runAgentBrowserCommand(sandbox, ["open", "https://example.com"]);
1375
+ return runAgentBrowserCommand(sandbox, ["screenshot"]);
1376
+ });
1248
1377
  ```
1249
1378
 
1250
- See the [environments example](examples/environments/) for a working demo with a UI and deploy-to-Vercel button.
1379
+ Install `@agent-browser/sandbox` and `@vercel/sandbox` in the consuming app. See the [sandbox helper example](examples/sandbox/) for minimal Eve and Vercel Sandbox usage, or the [environments example](examples/environments/) for a full UI demo with a deploy-to-Vercel button.
1380
+
1381
+ Fresh Vercel and Eve sandboxes install Chromium system dependencies by default. Pass `installSystemDependencies: false` only when your sandbox image already includes those libraries.
1251
1382
 
1252
1383
  ### Serverless (AWS Lambda)
1253
1384
 
Binary file
Binary file
Binary file
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "agent-browser-priv",
-   "version": "0.28.0-priv.1",
+   "version": "0.31.1-priv.1",
    "description": "Browser automation CLI for AI agents",
    "type": "module",
    "packageManager": "pnpm@11.1.3",
@@ -29,7 +29,9 @@
      "build:docker": "docker build --platform linux/amd64 -t agent-browser-builder -f docker/Dockerfile.build .",
      "release": "pnpm run version:sync && pnpm run build:all-platforms",
      "postinstall": "node scripts/postinstall.js",
-     "build:dashboard": "cd packages/dashboard && pnpm build"
+     "build:dashboard": "cd packages/dashboard && pnpm build",
+     "runtime:update-check": "node scripts/check-runtime-updates.mjs",
+     "runtime:update-patchright": "node scripts/update-patchright-version.mjs"
    },
    "keywords": [
      "browser",
@@ -0,0 +1,139 @@
+ #!/usr/bin/env node
+
+ import { execFileSync } from 'node:child_process';
+ import { readFileSync } from 'node:fs';
+ import { dirname, join } from 'node:path';
+ import { fileURLToPath } from 'node:url';
+
+ const rootDir = join(dirname(fileURLToPath(import.meta.url)), '..');
+ const installRsPath = join(rootDir, 'cli/src/install.rs');
+ const trackingPath = join(rootDir, 'priv/version-tracking.json');
+
+ function readJson(path) {
+   return JSON.parse(readFileSync(path, 'utf8'));
+ }
+
+ function readPatchrightPin() {
+   const source = readFileSync(installRsPath, 'utf8');
+   const match = source.match(/pub const PATCHRIGHT_VERSION:\s*&str\s*=\s*"([^"]+)"/);
+   if (!match) {
+     throw new Error('Could not find PATCHRIGHT_VERSION in cli/src/install.rs');
+   }
+   return match[1];
+ }
+
+ function npmLatest(packageName) {
+   return execFileSync('npm', ['view', packageName, 'version'], {
+     cwd: rootDir,
+     encoding: 'utf8',
+     timeout: 20_000,
+     stdio: ['ignore', 'pipe', 'pipe'],
+   }).trim();
+ }
+
+ function upstreamTags() {
+   const output = execFileSync(
+     'git',
+     [
+       'ls-remote',
+       '--tags',
+       '--refs',
+       'https://github.com/vercel-labs/agent-browser.git',
+       'refs/tags/v*',
+     ],
+     {
+       cwd: rootDir,
+       encoding: 'utf8',
+       timeout: 20_000,
+       stdio: ['ignore', 'pipe', 'pipe'],
+     },
+   );
+   return output
+     .trim()
+     .split('\n')
+     .filter(Boolean)
+     .map((line) => line.split(/\s+/)[1]?.replace('refs/tags/', ''))
+     .filter(Boolean)
+     .filter((tag) => /^v\d+\.\d+\.\d+/.test(tag));
+ }
+
+ function parseVersion(version) {
+   const cleaned = version.replace(/^v/, '').split('-')[0];
+   const parts = cleaned.split('.').map((part) => Number.parseInt(part, 10));
+   return [parts[0] || 0, parts[1] || 0, parts[2] || 0];
+ }
+
+ function compareVersions(a, b) {
+   const pa = parseVersion(a);
+   const pb = parseVersion(b);
+   for (let i = 0; i < 3; i += 1) {
+     if (pa[i] !== pb[i]) return pa[i] - pb[i];
+   }
+   return a.localeCompare(b);
+ }
+
+ function latestTag(tags) {
+   if (tags.length === 0) return null;
+   return [...tags].sort(compareVersions).at(-1);
+ }
+
+ function buildReport() {
+   const tracking = readJson(trackingPath);
+   const patchrightPinned = readPatchrightPin();
+   const patchrightLatest = npmLatest('patchright');
+   const latestUpstreamTag = latestTag(upstreamTags());
+
+   return {
+     patchright: {
+       pinned: patchrightPinned,
+       latest: patchrightLatest,
+       outdated: patchrightPinned !== patchrightLatest,
+       tracking_matches_source: tracking.patchright_pin === patchrightPinned,
+     },
+     agent_browser: {
+       tracked_upstream_tag: tracking.agent_browser_upstream_tag,
+       latest_upstream_tag: latestUpstreamTag,
+       outdated: Boolean(
+         latestUpstreamTag &&
+           compareVersions(latestUpstreamTag, tracking.agent_browser_upstream_tag) > 0,
+       ),
+     },
+     tracking: {
+       path: 'priv/version-tracking.json',
+       policy: tracking.policy,
+     },
+   };
+ }
+
+ function printHuman(report) {
+   console.log('Runtime update check');
+   console.log('');
+   console.log(
+     `Patchright: pinned ${report.patchright.pinned}, npm latest ${report.patchright.latest}`,
+   );
+   console.log(
+     ` status: ${report.patchright.outdated ? 'outdated' : 'current'}`,
+   );
+   console.log(
+     ` tracking: ${report.patchright.tracking_matches_source ? 'matches source' : 'does not match source'}`,
+   );
+   console.log('');
+   console.log(
+     `agent-browser upstream: tracked ${report.agent_browser.tracked_upstream_tag}, latest ${report.agent_browser.latest_upstream_tag ?? 'unknown'}`,
+   );
+   console.log(
+     ` status: ${report.agent_browser.outdated ? 'sync available' : 'current'}`,
+   );
+ }
+
+ const report = buildReport();
+
+ if (process.argv.includes('--json')) {
+   console.log(JSON.stringify(report, null, 2));
+ } else {
+   printHuman(report);
+ }
+
+ if (!report.patchright.tracking_matches_source) {
+   process.exitCode = 1;
+ }
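The tag ordering in the script above depends on comparing version parts numerically rather than as strings. The sketch below restates `parseVersion`, `compareVersions`, and `latestTag` in standalone form (lightly adapted, with an index lookup in place of `.at(-1)`) so the difference from a plain lexical sort is visible. The sample tags are made up for illustration and are not real upstream tags.

```javascript
// Standalone restatement of the script's version helpers, for illustration.
function parseVersion(version) {
  // Strip a leading "v" and any prerelease suffix, then read three numbers.
  const cleaned = version.replace(/^v/, '').split('-')[0];
  const parts = cleaned.split('.').map((part) => Number.parseInt(part, 10));
  return [parts[0] || 0, parts[1] || 0, parts[2] || 0];
}

function compareVersions(a, b) {
  const pa = parseVersion(a);
  const pb = parseVersion(b);
  for (let i = 0; i < 3; i += 1) {
    if (pa[i] !== pb[i]) return pa[i] - pb[i];
  }
  // Equal numeric parts: fall back to string order as a tie-breaker.
  return a.localeCompare(b);
}

function latestTag(tags) {
  if (tags.length === 0) return null;
  const sorted = [...tags].sort(compareVersions);
  return sorted[sorted.length - 1];
}

// Made-up sample tags: numeric comparison ranks 0.10.0 above 0.9.0,
// whereas a plain string sort would put "v0.9.0" last.
const sampleTags = ['v0.9.0', 'v0.10.0', 'v0.2.1'];
console.log('numeric latest:', latestTag(sampleTags));
console.log('lexical latest:', [...sampleTags].sort().pop());
```

Because the prerelease suffix is dropped before the numeric comparison, tags that differ only in that suffix are ordered by the string tie-breaker, which is not the same as SemVer prerelease precedence.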
@@ -31,6 +31,19 @@ const cargoVersion = cargoVersionMatch[1];
  const dashboardPkg = JSON.parse(readFileSync(join(rootDir, 'packages/dashboard/package.json'), 'utf-8'));
  const dashboardVersion = dashboardPkg.version;

+ // Read sandbox package versions
+ const sandboxPkg = JSON.parse(readFileSync(join(rootDir, 'packages/@agent-browser/sandbox/package.json'), 'utf-8'));
+ const sandboxVersion = sandboxPkg.version;
+ const sandboxVersionSource = readFileSync(join(rootDir, 'packages/@agent-browser/sandbox/src/version.ts'), 'utf-8');
+ const sandboxVersionMatch = sandboxVersionSource.match(/AGENT_BROWSER_SANDBOX_VERSION\s*=\s*"([^"]*)"/);
+
+ if (!sandboxVersionMatch) {
+   console.error('Could not find AGENT_BROWSER_SANDBOX_VERSION in packages/@agent-browser/sandbox/src/version.ts');
+   process.exit(1);
+ }
+
+ const sandboxRuntimeVersion = sandboxVersionMatch[1];
+
  const mismatches = [];
  if (packageVersion !== cargoVersion) {
    mismatches.push(` cli/Cargo.toml: ${cargoVersion}`);
@@ -38,6 +51,12 @@ if (packageVersion !== cargoVersion) {
  if (packageVersion !== dashboardVersion) {
    mismatches.push(` packages/dashboard: ${dashboardVersion}`);
  }
+ if (packageVersion !== sandboxVersion) {
+   mismatches.push(` packages/@agent-browser/sandbox/package.json: ${sandboxVersion}`);
+ }
+ if (packageVersion !== sandboxRuntimeVersion) {
+   mismatches.push(` packages/@agent-browser/sandbox/src/version.ts: ${sandboxRuntimeVersion}`);
+ }

  if (mismatches.length > 0) {
    console.error('Version mismatch detected!');