pi-agent-browser-native 0.2.34 → 0.2.35

Files changed (38)
  1. package/CHANGELOG.md +27 -0
  2. package/README.md +14 -14
  3. package/docs/ARCHITECTURE.md +19 -13
  4. package/docs/COMMAND_REFERENCE.md +257 -42
  5. package/docs/ELECTRON.md +3 -3
  6. package/docs/RELEASE.md +11 -11
  7. package/docs/REQUIREMENTS.md +5 -5
  8. package/docs/SUPPORT_MATRIX.md +23 -21
  9. package/docs/TOOL_CONTRACT.md +38 -27
  10. package/extensions/agent-browser/index.ts +518 -2402
  11. package/extensions/agent-browser/lib/argv-descriptor.ts +90 -0
  12. package/extensions/agent-browser/lib/argv-grammar.ts +128 -0
  13. package/extensions/agent-browser/lib/command-policy.ts +71 -0
  14. package/extensions/agent-browser/lib/command-taxonomy.ts +336 -0
  15. package/extensions/agent-browser/lib/electron/cleanup.ts +1 -0
  16. package/extensions/agent-browser/lib/executable-path.ts +19 -0
  17. package/extensions/agent-browser/lib/input-modes/params.ts +6 -6
  18. package/extensions/agent-browser/lib/orchestration/batch-stdin.ts +65 -0
  19. package/extensions/agent-browser/lib/orchestration/browser-run/browser-action-model.ts +154 -0
  20. package/extensions/agent-browser/lib/orchestration/browser-run/click-dispatch.ts +149 -0
  21. package/extensions/agent-browser/lib/orchestration/browser-run/diagnostics.ts +10 -28
  22. package/extensions/agent-browser/lib/orchestration/browser-run/final-result.ts +6 -2
  23. package/extensions/agent-browser/lib/orchestration/browser-run/index.ts +33 -27
  24. package/extensions/agent-browser/lib/orchestration/browser-run/prepare.ts +48 -22
  25. package/extensions/agent-browser/lib/orchestration/browser-run/process-output.ts +33 -10
  26. package/extensions/agent-browser/lib/orchestration/browser-run/prompt-guards.ts +93 -0
  27. package/extensions/agent-browser/lib/orchestration/browser-run/session-state.ts +19 -123
  28. package/extensions/agent-browser/lib/orchestration/browser-run/types.ts +26 -1
  29. package/extensions/agent-browser/lib/orchestration/electron-host/index.ts +860 -0
  30. package/extensions/agent-browser/lib/playbook.ts +9 -9
  31. package/extensions/agent-browser/lib/prompt-policy.ts +122 -0
  32. package/extensions/agent-browser/lib/results/action-recommendations.ts +3 -23
  33. package/extensions/agent-browser/lib/results/presentation/navigation.ts +2 -34
  34. package/extensions/agent-browser/lib/runtime.ts +93 -227
  35. package/extensions/agent-browser/lib/session-page-state.ts +31 -14
  36. package/extensions/agent-browser/lib/temp.ts +148 -23
  37. package/package.json +4 -4
  38. package/scripts/agent-browser-capability-baseline.mjs +198 -1
@@ -18,7 +18,7 @@ This project intentionally blocks normal `agent-browser` bash usage in most agen
18
18
 
19
19
  <!-- agent-browser-capability-baseline:start upstream-baseline -->
20
20
  <!-- Generated from scripts/agent-browser-capability-baseline.mjs. Run `npm run docs -- command-reference write` to update. Do not edit manually. -->
21
- This reference is baselined to the locally installed `agent-browser 0.27.0` command/help surface. Upstream `agent-browser` remains the source of truth for command semantics; this file is the local fallback for Pi agent sessions where direct binary help is blocked or discouraged.
21
+ This reference is baselined to the locally installed `agent-browser 0.27.0` command/help surface, audited against vercel-labs/agent-browser@4ad284890cb59564af603e6de403dd75dd19e832. Upstream `agent-browser` remains the source of truth for command semantics; this file is the local fallback for Pi agent sessions where direct binary help is blocked or discouraged.
22
22
 
23
23
  The lightweight drift check is `npm run verify -- command-reference`. Run it whenever the installed upstream `agent-browser` version changes or this reference is edited.
24
24
 
@@ -72,7 +72,7 @@ Tool parameters (use exactly one of `args`, `semanticAction`, `job`, `qa`, `sour
72
72
  - `sessionMode`:
73
73
  - `"auto"` reuses the extension-managed session when possible.
74
74
  - `"fresh"` rotates that managed session to a fresh upstream launch so launch-scoped flags like `--profile`, `--session-name`, `--cdp`, `--state`, `--auto-connect`, `--init-script`, `--enable`, `-p` / `--provider`, or iOS `--device` apply.
75
- - If a fresh launch fails or times out, read `details.managedSessionOutcome` for `preserved` vs `abandoned` (and related fields). A model-visible `Managed session outcome: …` line is appended only for failing calls that used `sessionMode: "fresh"`; `"auto"` failures can still populate the struct without that extra line.
75
+ - If a fresh launch fails or times out, read `details.managedSessionOutcome` for `preserved` vs `abandoned` (and related fields). A model-visible `Managed session outcome: …` line is appended only for failing calls that used `sessionMode: "fresh"`; `"auto"` failures can still populate the struct without that extra line. If you explicitly close the current wrapper-managed session with `--session <name> close`, later default auto calls rotate to a new wrapper-generated session instead of reusing the closed name; repeated closes and branch restores keep those generated names monotonic.
76
76
 
77
77
  ### Debug, diff, stream, dashboard, and chat families
78
78
 
@@ -153,9 +153,9 @@ Do not assume Playwright selector dialects such as `text=Close` or `button:has-t
153
153
 
154
154
  Treat `@e…` refs as page-scoped. After a successful `snapshot`, the wrapper records the latest refs and page target for that session; mutation-prone ref commands such as `click @e4`, `select @e5 chocolate`, or batch steps with old refs fail with `failureCategory: "stale-ref"` when the page target changed or the ref is absent from the latest same-page snapshot. If a session `snapshot -i` fails with `No active page`, the wrapper invalidates prior refs for that session; later mutation-prone `@e…` calls fail before upstream until a successful fresh `snapshot -i` records refs again. Inside `batch` stdin JSON, the wrapper also walks steps in order before spawn: steps whose first token can navigate or mutate set a latch; a later step whose first token is `snapshot` clears that latch for following rows; guarded steps that still mention `@e…` after an uncleared latch fail with the same `stale-ref` bucket without launching upstream. Same-snapshot form fills are allowed before a click or submit step, so a login-style `fill`, `fill`, `click` batch can run from one snapshot; split dynamic or autosubmit forms with a fresh snapshot if a fill itself rerenders the targets. Follow the `refresh-interactive-refs` next action (it includes `--session <name>` when needed) and prefer stable `find` or `semanticAction` locators when navigation or rerendering is likely. Contract detail: [`TOOL_CONTRACT.md`](TOOL_CONTRACT.md#details) (`refSnapshot`, `refSnapshotInvalidation`).
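
The in-order batch walk described above can be sketched as a small latch over the step list. This is a minimal illustration, not the wrapper's code: the `LATCHING` and `GUARDED` command sets, the `string[][]` step shape, and the function name `firstStaleRefStep` are all assumptions made for the example.

```typescript
// Hypothetical command sets; the wrapper's real taxonomy is not shown in this reference.
const LATCHING = new Set(["open", "click", "press", "select"]); // may navigate or mutate
const GUARDED = new Set(["click", "fill", "select", "type"]); // checked when they carry @e refs
const REF_PATTERN = /^@e\d+$/;

// Assumed step shape: each batch row is an argv-style token list.
type BatchStep = string[];

/** Returns the index of the first step that would be rejected as stale-ref, or -1. */
function firstStaleRefStep(steps: BatchStep[]): number {
  let latched = false;
  for (let i = 0; i < steps.length; i++) {
    const step = steps[i] ?? [];
    const command = step[0] ?? "";
    const rest = step.slice(1);
    if (command === "snapshot") {
      latched = false; // a fresh snapshot clears the latch for following rows
      continue;
    }
    const usesRef = rest.some((token) => REF_PATTERN.test(token));
    if (latched && usesRef && GUARDED.has(command)) {
      return i; // ref used after an uncleared navigate/mutate step
    }
    if (LATCHING.has(command)) {
      latched = true;
    }
  }
  return -1;
}
```

In this sketch `fill` is guarded but does not set the latch, which is why a login-style `fill`, `fill`, `click` batch passes from one snapshot while `click @e3` followed by `click @e4` does not.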
155
155
 
156
- A successful `click` result means upstream reported a target, not that the app definitely handled the event. When the workflow depends on a mutation, use `details.pageChangeSummary`, a wait, URL/text extraction, or a fresh `snapshot -i` before trusting the state; if nothing changed, retry with a current visible ref or stable selector and report the workflow issue. Preserve explicit user stop boundaries: if the user says to stop before a final order, post, purchase, or submit action, gather evidence from that page and do not click the final action. The wrapper avoids site-specific fallback clicks and keeps the verification burden explicit.
156
+ A successful `click` result means upstream reported a target, not that the app definitely handled the event. For top-level non-Electron clicks, the wrapper installs a bounded DOM-event probe; when upstream reports success but no trusted event reaches the target, it fails the tool and exposes `details.clickDispatch` plus a `Click dispatch diagnostic` line with explicit retry/inspect next actions (no in-page click replay). When the workflow depends on a mutation, use `details.pageChangeSummary`, a wait, URL/text extraction, or a fresh `snapshot -i` before trusting the state; if nothing changed, retry with a current visible ref or stable selector and report the workflow issue. Preserve explicit user stop boundaries: if the user says to stop before a final order, post, purchase, or submit action, gather evidence from that page and do not click the final action. The wrapper also blocks likely final order/submit click targets under those prompts and returns `details.promptGuard` with `failureCategory: "policy-blocked"`.
157
157
 
158
- When a **top-level** `click` succeeds (not a `click` hidden inside a `batch`/`job` tool call—the unified command must be `click`), the upstream payload includes `data.clicked`, and the wrapper sees the active tab URL unchanged after the same normalization it uses for ref guards (**`#fragment` ignored**), it may run one extra `snapshot -i` and surface `Possible overlay blockers` plus `details.overlayBlockers` (`candidates`, `summary`, and a `snapshot` map that can refresh `refSnapshot`) when that snapshot shows strong modal context (`dialog` / `alertdialog`) **and** up to three close/dismiss-like controls; page-wide words such as privacy, sign in, or banner alone do not trigger it. The URL check compares the session’s prior pinned tab target to `details.navigationSummary.url` after the click; that summary is gathered with one read-only `eval` when the click JSON omits **both** string `data.url` and `data.title`—if upstream already echoes either field, overlay diagnostics are skipped on this path. The diagnostic is skipped if the wrapper already applied tab-focus correction or about-blank recovery on that result. Appended `inspect-overlay-state` / `try-overlay-blocker-candidate-*` entries in `details.nextActions` include `--session <name>` when the session is named, same as other session-scoped follow-ups. Treat `inspect-overlay-state` as the safe first follow-up; only use a `try-overlay-blocker-candidate-*` next action when the candidate is clearly the control you intend to close.
158
+ When a **top-level** `click` succeeds (not a `click` hidden inside a `batch`/`job` tool call—the unified command must be `click`), the upstream payload includes `data.clicked`, no `details.clickDispatch` diagnostic fired for the same result, and the wrapper sees the active tab URL unchanged after the same normalization it uses for ref guards (**`#fragment` ignored**), it may run one extra `snapshot -i` and surface `Possible overlay blockers` plus `details.overlayBlockers` (`candidates`, `summary`, and a `snapshot` map that can refresh `refSnapshot`) when that snapshot shows strong modal context (`dialog` / `alertdialog`) **and** up to three close/dismiss-like controls; page-wide words such as privacy, sign in, or banner alone do not trigger it. The URL check compares the session’s prior pinned tab target to `details.navigationSummary.url` after the click; that summary is gathered with one read-only `eval` when the click JSON omits **both** string `data.url` and `data.title`—if upstream already echoes either field, overlay diagnostics are skipped on this path. The diagnostic is skipped if the wrapper already applied tab-focus correction or about-blank recovery on that result. Appended `inspect-overlay-state` / `try-overlay-blocker-candidate-*` entries in `details.nextActions` include `--session <name>` when the session is named, same as other session-scoped follow-ups. Treat `inspect-overlay-state` as the safe first follow-up; only use a `try-overlay-blocker-candidate-*` next action when the candidate is clearly the control you intend to close.
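
The "URL unchanged, `#fragment` ignored" comparison above can be illustrated with a short sketch. It assumes fragment stripping is the only normalization step, which the reference does not confirm; `stripFragment` and `isUrlUnchanged` are hypothetical names.

```typescript
/** Drops everything from the first "#" onward; other normalization is out of scope here. */
function stripFragment(url: string): string {
  const hashIndex = url.indexOf("#");
  return hashIndex === -1 ? url : url.slice(0, hashIndex);
}

/** True when two URLs differ at most by their fragment. */
function isUrlUnchanged(before: string, after: string): boolean {
  return stripFragment(before) === stripFragment(after);
}
```

Under this assumption, a click that only moves the page from `/settings` to `/settings#billing` still counts as "unchanged" and remains eligible for the overlay diagnostic.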
159
159
 
160
160
  ### Extract page data
161
161
 
@@ -259,13 +259,13 @@ Typical lifecycle:
259
259
  { "electron": { "action": "cleanup", "launchId": "electron-…" } }
260
260
  ```
261
261
 
262
- `electron.status` and `electron.cleanup` take either `launchId`, **`all: true`** (literal boolean) to walk every wrapper-tracked launch in one call, or neither when exactly one active launch exists—never both `launchId` and `all`. For `electron.launch`, `timeoutMs` bounds host CDP readiness with a **15s** default and **120s** cap in `extensions/agent-browser/lib/electron/launch.ts`. Optional `timeoutMs` on **`status`** applies to managed-session `get title` / `get url` reads (localhost CDP probes stay on a short fixed fetch budget). On **`cleanup`**, it caps upstream `close` **and** host teardown (process exit, debug-port idle check, isolated profile removal); when omitted it follows the implicit session close default (**5s** unless `PI_AGENT_BROWSER_IMPLICIT_SESSION_CLOSE_TIMEOUT_MS` overrides). On **`probe`**, it bounds each underlying upstream read subprocess—omit it to use the normal tool subprocess default, or raise it on slow desktops.
262
+ `electron.status` and `electron.cleanup` take either `launchId`, **`all: true`** (literal boolean) to walk every wrapper-tracked launch in one call, or neither when exactly one active launch exists—never both `launchId` and `all`. They can target the current branch-visible launch plus still-owned off-branch launch records by `launchId`; default no-arg calls are intentionally ambiguous when more than one active launch is owned. `/reload` preserves the current branch-visible active Electron launch and its isolated temp `userDataDir` for continuity, and cleans off-branch owned Electron launches; if cleanup is partial and skips or fails profile removal, the generic temp sweep preserves that `userDataDir` across reload, quit, later temp cleanup, process exit, and stale temp-root pruning after restart. For `electron.launch`, `timeoutMs` bounds host CDP readiness with a **15s** default and **120s** cap in `extensions/agent-browser/lib/electron/launch.ts`. Optional `timeoutMs` on **`status`** applies to managed-session `get title` / `get url` reads (localhost CDP probes stay on a short fixed fetch budget). On **`cleanup`**, it caps upstream `close` **and** host teardown (process exit, debug-port idle check, isolated profile removal); when omitted it follows the implicit session close default (**5s** unless `PI_AGENT_BROWSER_IMPLICIT_SESSION_CLOSE_TIMEOUT_MS` overrides). A successful managed-session close step retires that wrapper-managed session even when host process/profile cleanup remains partial. On **`probe`**, it bounds each underlying upstream read subprocess—omit it to use the normal tool subprocess default, or raise it on slow desktops.
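
The targeting rules for `status` / `cleanup` and the launch timeout bounds above can be sketched as follows. This is a hedged illustration under stated assumptions: the `ElectronTarget` and `Resolution` types, the function names, and the handling of non-positive or non-finite `timeoutMs` values are not taken from the package.

```typescript
// Hypothetical input shape; `all` is typed loosely so the literal-boolean check is visible.
type ElectronTarget = { launchId?: string; all?: unknown };

type Resolution =
  | { ok: true; launchIds: string[] }
  | { ok: false; reason: string };

function resolveTargets(target: ElectronTarget, activeLaunchIds: string[]): Resolution {
  const launchId = target.launchId;
  const hasId = typeof launchId === "string" && launchId.length > 0;
  const wantsAll = target.all === true; // only the literal boolean counts
  if (hasId && wantsAll) {
    return { ok: false, reason: "launchId and all are mutually exclusive" };
  }
  if (hasId) return { ok: true, launchIds: [launchId as string] };
  if (wantsAll) return { ok: true, launchIds: [...activeLaunchIds] };
  if (activeLaunchIds.length === 1) return { ok: true, launchIds: [...activeLaunchIds] };
  return {
    ok: false,
    reason: activeLaunchIds.length === 0 ? "no active launch" : "ambiguous: pass launchId or all: true",
  };
}

const LAUNCH_TIMEOUT_DEFAULT_MS = 15_000; // documented default
const LAUNCH_TIMEOUT_CAP_MS = 120_000; // documented cap

function effectiveLaunchTimeout(timeoutMs?: number): number {
  // Assumption: missing or invalid values fall back to the default.
  if (timeoutMs === undefined || !Number.isFinite(timeoutMs) || timeoutMs <= 0) {
    return LAUNCH_TIMEOUT_DEFAULT_MS;
  }
  return Math.min(timeoutMs, LAUNCH_TIMEOUT_CAP_MS);
}
```

The no-argument branch is deliberately strict: with two or more owned launches it returns an error rather than guessing, matching the "intentionally ambiguous" wording above.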
263
263
 
264
264
  `launch.handoff` defaults to `"snapshot"`, which attaches through upstream `connect`, lists targets, and captures a current `snapshot -i` in one call. Snapshot handoff retries briefly when the first Electron snapshot has no refs; if it still reports no refs, run `snapshot -i` once more before assuming the app is blank. Use `handoff: "tabs"` as the safer diagnostic starting point when you only need target discovery and do not want to snapshot app content yet, or `handoff: "connect"` when you want to attach first and run your own follow-up commands. `targetType` defaults to `"page"`; use `"webview"` or `"any"` for apps that expose useful webviews. When a matching CDP target exposes a WebSocket URL, launch connects to that target; otherwise it falls back to the browser port.
265
265
 
266
266
  After launch, prefer the exact `details.nextActions` payloads when present: `status-electron-launch` checks liveness, `probe-electron-launch` runs compact diagnostics for a tracked launch, `snapshot-electron-session` refreshes current refs, `list-electron-tabs` inspects targets, and `cleanup-electron-launch` removes the wrapper-owned process/profile when the run is done. If launch times out, inspect `details.electron.failure.diagnostics` for PID, wrapper profile, `DevToolsActivePort`, and timing evidence before retrying. If status/probe detects a session or target mismatch, follow `reattach-electron-launch` or a fresh snapshot action before using old refs. If a click/fill/type looks successful but the Electron PID or debug port dies, the wrapper now fails the result with `details.electronPostCommandHealth` and same-launch status/probe/cleanup next actions instead of leaving the agent on `about:blank`. If cleanup is partial (`failureCategory: "cleanup-failed"`), inspect `details.electron.cleanup.results` and use `retry-electron-cleanup` only for the same `launchId`.
267
267
 
268
- Manual path for externally launched apps: if you started the Electron app yourself with a debug port or DevTools URL, skip the wrapper lifecycle and attach directly with upstream `connect`. In this path you own app shutdown and profile cleanup; do not use `electron.cleanup`. `close` only closes the browser/CDP session and does not quit the manually launched app or remove explicit artifacts.
268
+ Manual path for externally launched apps: if you started the Electron app yourself with a debug port or DevTools URL, skip the wrapper lifecycle and attach directly with upstream `connect`. In this path you own app shutdown and profile cleanup; do not use `electron.cleanup`. Close commands (`close`, `quit`, or `exit`) only close the browser/CDP session and do not quit the manually launched app or remove explicit artifacts.
269
269
 
270
270
  ```json
271
271
  { "args": ["connect", "9222"], "sessionMode": "fresh" }
@@ -327,7 +327,7 @@ A successful wait-based download renders a readable summary such as `Download co
327
327
  { "args": ["pdf", "/tmp/page.pdf"] }
328
328
  ```
329
329
 
330
- The upstream screenshot aliases are `screenshot --full` for full-page capture and `screenshot --annotate` for labeled screenshots. When a user gives exact artifact paths for screenshots, recordings, downloads, PDFs, traces, or HAR files, use those paths or explicitly report why the artifact was unavailable; do not silently substitute another path in the final report.
330
+ The upstream screenshot aliases are `screenshot --full` for full-page capture and `screenshot --annotate` for labeled screenshots. When a user gives exact artifact paths for screenshots, recordings, downloads, PDFs, traces, or HAR files, use those paths or explicitly report why the artifact was unavailable; do not silently substitute another path in the final report. When the latest prompt names exact required screenshot paths, `close` / `quit` / `exit` can be blocked with `details.promptGuard.reason: "requested-artifacts-missing-before-close"` until those paths appear as verified explicit artifacts.
331
331
 
332
332
  Prefer `download <selector> <path>` when the target element itself is the downloadable link/control. Use `click` plus `wait --download [path]` when a previous action starts the download indirectly.
333
333
 
@@ -355,7 +355,7 @@ The wrapper keeps a bounded, metadata-only `details.artifactManifest` of recent
355
355
 
356
356
  This manifest cap controls what appears in `details.artifactManifest` and in summaries such as `Session artifacts: 42 live, 0 evicted (42/100 recent)`. It does not delete explicit files that upstream saved to paths you chose, such as screenshots, PDFs, downloads, traces, HAR files, or WebM recordings.
357
357
 
358
- Browser `close` is also not file cleanup. If `details.artifactManifest` is present with a non-empty `entries` list, a successful `close` appends an `Artifact lifecycle` note and reports `details.artifactCleanup` with the current retention summary and the same host-owned cleanup `note` as the contract (`extensions/agent-browser/index.ts`, `getArtifactCleanupGuidance`). Up to ten distinct user-chosen paths that still exist on disk appear in `explicitArtifactPaths` when matching `explicit-path` manifest rows exist in the recent window; deleted/stale paths are skipped. Otherwise that array is empty and visible text may omit the “Explicit artifact paths” line even though the lifecycle block still reminds you that close does not delete saved files. Delete any paths you care about with host file tools after inspection; the native browser tool intentionally does not remove arbitrary user-chosen filesystem paths.
358
+ Browser close commands (`close`, `quit`, or `exit`) are also not file cleanup. If `details.artifactManifest` is present with a non-empty `entries` list, a successful close command appends an `Artifact lifecycle` note and reports `details.artifactCleanup` with the current retention summary and the same host-owned cleanup `note` as the contract (`extensions/agent-browser/lib/orchestration/browser-run/diagnostics.ts`, `getArtifactCleanupGuidance`). Up to ten distinct user-chosen paths that still exist on disk appear in `explicitArtifactPaths` when matching `explicit-path` manifest rows exist in the recent window; deleted/stale paths are skipped. Otherwise that array is empty and visible text may omit the “Explicit artifact paths” line even though the lifecycle block still reminds you that close commands do not delete saved files. Delete any paths you care about with host file tools after inspection; the native browser tool intentionally does not remove arbitrary user-chosen filesystem paths.
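
The `explicitArtifactPaths` selection described above (distinct, still on disk, at most ten) can be sketched like this. The `ManifestEntry` field names and the injected `existsOnDisk` predicate are assumptions for the example; only the `explicit-path` row label and the limit of ten come from the text.

```typescript
// Hypothetical manifest row shape; only the "explicit-path" label appears in this reference.
interface ManifestEntry {
  kind: string;
  path: string;
}

function collectExplicitArtifactPaths(
  entries: ManifestEntry[],
  existsOnDisk: (path: string) => boolean,
  limit = 10,
): string[] {
  const seen = new Set<string>();
  const result: string[] = [];
  for (const entry of entries) {
    if (entry.kind !== "explicit-path") continue; // wrapper-managed rows are not listed
    if (seen.has(entry.path)) continue; // keep paths distinct
    seen.add(entry.path);
    if (!existsOnDisk(entry.path)) continue; // deleted or stale paths are skipped
    result.push(entry.path);
    if (result.length >= limit) break;
  }
  return result;
}
```

In Node, `fs.existsSync` could serve as the predicate. An empty result corresponds to the case where the visible text omits the "Explicit artifact paths" line.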
359
359
 
360
360
  Oversized snapshots and oversized generic outputs are different: when a persisted pi session is available, their wrapper-managed spill files are stored under the private session artifact directory and are governed by the byte budget `PI_AGENT_BROWSER_SESSION_ARTIFACT_MAX_BYTES` (default 32 MiB). Raise that byte budget as well for long QA sessions that need many full raw snapshots or large text spills to survive reload/resume.
361
361
 
@@ -438,18 +438,27 @@ Session note: `skills list`, `skills get …`, and `skills path …` are **state
438
438
  | `skills get core` | Print the core usage guide. |
439
439
  | `skills get core --full` | Print the full version-matched core command reference and templates. |
440
440
  | `skills get <name>` | Load a specialized skill such as `electron` or `slack`. Common specialized calls include `skills get electron`, `skills get slack`, `skills get dogfood`, `skills get vercel-sandbox`, and `skills get agentcore`. |
441
+ | `skills get <name> --full` | Include a skill's supplementary references/templates when present. |
442
+ | `skills get --all` | Print all visible bundled skills for broad audit/debug work. |
441
443
  | `skills path [name]` | Print a skill directory path. |
442
444
 
445
+ Skill-source debugging note: upstream honors `AGENT_BROWSER_SKILLS_DIR` as an override for bundled skill discovery. Normal agents should not need it, but it is useful when validating package layout or upstream skill packaging.
446
+
443
447
  ### Core page and element commands
444
448
 
445
449
  | Command | Purpose |
446
450
  | --- | --- |
447
- | `open <url>` | Navigate to a URL. |
451
+ | `open [url]` | Launch the browser and optionally navigate. URL-less `open` stays on `about:blank` so agents can stage routes, cookies, or init scripts before first navigation. |
452
+ | `open <url>` | Navigate to a URL; `goto <url>` and `navigate <url>` are equivalent navigation aliases when a URL is present. |
448
453
  | `click <sel>` | Click an element or `@ref`. |
454
+ | `click <sel> --new-tab` | Click a link/control while requesting a new tab. |
449
455
  | `dblclick <sel>` | Double-click an element. |
450
456
  | `type <sel> <text>` | Type into an element. |
451
457
  | `fill <sel> <text>` | Clear and fill an element. |
452
- | `press <key>` | Press a key such as `Enter`, `Tab`, or `Control+a`. Related key-hold aliases include `keydown Shift` and `keyup Shift`. |
458
+ | `press <key>` | Press a key such as `Enter`, `Tab`, or `Control+a`. `key <key>` is the upstream alias. |
459
+ | `key <key>` | Alias for `press <key>`. |
460
+ | `keydown <key>` | Hold a key down without releasing it, useful for modifiers. |
461
+ | `keyup <key>` | Release a key previously held by `keydown <key>`. Common modifier examples are `keydown Shift` and `keyup Shift`. |
453
462
  | `keyboard type <text>` | Type text with real keystrokes and no selector. |
454
463
  | `keyboard inserttext <text>` | Insert text without key events. |
455
464
  | `hover <sel>` | Hover an element. |
@@ -461,14 +470,19 @@ Session note: `skills list`, `skills get …`, and `skills path …` are **state
461
470
  | `upload <sel> <files...>` | Upload one or more files. |
462
471
  | `download <sel> <path>` | Download a file by clicking an element. |
463
472
  | `scroll <dir> [px]` | Scroll `up`, `down`, `left`, or `right`. |
464
- | `scrollintoview <sel>` | Scroll an element into view. |
473
+ | `scroll <dir> [px] --selector <sel>` | Scroll a specific scrollable element/container instead of the page. |
474
+ | `scrollintoview <sel>` | Scroll an element into view; `scrollinto <sel>` is the upstream alias. |
475
+ | `scrollinto <sel>` | Alias for `scrollintoview <sel>`. |
465
476
  | `wait <sel|ms>` | Wait for an element or a duration. |
466
- | `screenshot [path]` | Take a screenshot. |
477
+ | `screenshot [selector] [path]` | Take a full-page or element-scoped screenshot; a single selector-like argument scopes, while a path-like argument saves to that path. |
478
+ | `screenshot [path]` | Take a screenshot and optionally save it to a path. |
467
479
  | `pdf <path>` | Save the page as a PDF. |
468
- | `snapshot` | Print an accessibility tree with refs for AI interaction. |
469
- | `eval <js>` | Run JavaScript. Use `eval --stdin` through this wrapper for larger snippets. |
480
+ | `snapshot` | Print an accessibility tree with refs for AI interaction. Common options include `snapshot --interactive`, `snapshot --urls`, `snapshot --compact`, `snapshot --depth <n>`, `snapshot --selector <sel>`, and `snapshot --cursor` / `snapshot -C` for cursor/focus context when upstream returns it. |
481
+ | `eval <js>` | Run JavaScript. Use `eval --stdin` through this wrapper for larger snippets, or `eval -b <base64>` for shell-escaping-safe one-liners. |
470
482
  | `connect <port|url>` | Connect to a browser through CDP. |
471
- | `close [--all]` | Close the current browser or all sessions. |
483
+ | `close [--all]` | Close the current browser or all sessions; `quit` and `exit` are upstream close aliases. |
484
+ | `tap <selector>` | Touch-oriented tap alias for iOS/provider workflows. |
485
+ | `swipe <direction> [distance]` | Touch-oriented swipe for iOS/provider workflows. |
472
486
 
473
487
  On dashboards and other apps with nested scroll containers, `scroll <dir> [px]` may report a successful wheel action while the viewport appears unchanged because the page-level scroller was not the one containing the content. For top-level `scroll` calls without startup-scoped launch flags, the wrapper samples viewport and prominent scroll-container positions before and after the command; when nothing changes it appends `Scroll diagnostic: no observed scroll movement`, exposes `details.scrollNoop`, and adds exact `details.nextActions` for a fresh `snapshot -i` and screenshot. Use those before repeating page scrolls; when you need a specific panel, prefer `scrollintoview <@ref>` or a scoped interaction with the actual scrollable region.
474
488
 
@@ -490,6 +504,11 @@ Comboboxes vary by app. For native `<select>` controls, prefer raw `select <sele
490
504
  | `session list` | List active sessions. |
491
505
  | `state save <path>` | Save cookies, local storage, and session storage to a state file. |
492
506
  | `state load <path>` | Load cookies and storage from a state file. |
507
+ | `state list` | List saved state files. |
508
+ | `state show <filename>` | Show saved-state metadata without dumping secrets. |
509
+ | `state rename <old-name> <new-name>` | Rename a saved state file. |
510
+ | `state clear [session-name] [--all]` | Clear saved states for one name or all names; `state clear -a` is the upstream short alias for clearing all names. |
511
+ | `state clean --older-than <days>` | Delete expired saved-state files. |
493
512
  | `frame <selector|main>` | Switch iframe context by selector/ref/name/URL, or return to the main frame. |
494
513
  | `dialog accept [text]` | Accept an alert, confirm, or prompt dialog, optionally supplying prompt text. |
495
514
  | `dialog dismiss` | Dismiss or cancel the current dialog. |
@@ -512,13 +531,13 @@ These calls return plain text and stay stateless: the extension does not inject
512
531
 
513
532
  | Family | Surface |
514
533
  | --- | --- |
515
- | `get <what> [selector]` | `text`, `html`, `value`, `attr <name>`, `title`, `url`, `count`, `box`, `styles`, `cdp-url`. |
534
+ | `get <what> [selector]` | `text`, `html`, `value`, `attr <name>`, `title`, `url`, `count`, `get box <selector>`, `get styles <selector>`, and `get cdp-url`. |
516
535
  | `is <what> <selector>` | Check `visible`, `enabled`, or `checked`. |
517
- | `find <locator> <value> <action> [text]` | Locator types include `role`, `text`, `label`, `placeholder`, `alt`, `title`, `testid`, `first`, `last`, and `nth`. |
536
+ | `find <locator> <value> <action> [text]` | Locator types include `role`, `text`, `label`, `placeholder`, `alt`, `title`, and `testid`; selector helpers include `find first <sel>`, `find last <sel>`, and `find nth <n> <sel>`. Role/text filters include `find role <role> --name <name>` and `find ... --exact`. |
518
537
  | `mouse <action> [args]` | `move <x> <y>`, `down [btn]`, `up [btn]`, `wheel <dy> [dx]`. |
519
- | `set <setting> [value]` | `viewport <w> <h>`, `device <name>`, `geo <lat> <lng>`, `offline [on|off]`, `headers <json>`, `credentials <user> <pass>`, `media [dark|light] [reduced-motion]`. |
520
- | `network <action>` | `route <url> [--abort|--body <json>] [--resource-type <csv>]`, `unroute [url]`, `requests [--clear] [--filter <pattern>]`, `request <requestId>`, `har <start|stop> [path]`. `--resource-type` filters intercepted requests by CDP resource type, such as `script`, `image`, `font`, `xhr`, or `fetch`. |
521
- | `cookies [get|set|clear]` | Manage cookies. `set` supports `--url`, `--domain`, `--path`, `--httpOnly`, `--secure`, `--sameSite`, `--expires`, and `--curl <file>` for JSON, cURL, or bare Cookie-header bulk imports. |
538
+ | `set <setting> [value]` | `viewport <w> <h>`, `device <name>`, `geo <lat> <lng>`, `offline [on|off]`, `headers <json>`, `credentials <user> <pass>`, and `set media <features>` (`dark`, `light`, and/or `reduced-motion`). |
539
+ | `network <action>` | `network route <url> [--abort|--body <json>] [--resource-type <csv>]`, `network unroute [url]`, `network requests [--clear] [--filter <pattern>] [--type <csv>] [--method <method>] [--status <code|range>]`, `network request <requestId>`, `network har start`, and `network har stop [path]`. `--resource-type` filters intercepted requests by CDP resource type, such as `script`, `image`, `font`, `xhr`, or `fetch`; request listing filters accept resource types (`xhr,fetch`), methods (`POST`), and statuses (`2xx`, `400-499`). |
540
+ | `cookies [get|set|clear]` | Manage cookies. Full set form: `cookies set <name> <value> --url <url> --domain <domain> --path <path> --httpOnly --secure --sameSite <Strict|Lax|None> --expires <timestamp>`; also supports `cookies set --curl <file>` for JSON, cURL, or bare Cookie-header bulk imports. |
522
541
  | `storage <local|session>` | Manage web storage. |
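
The three `--status` shapes listed for `network requests` (a single code, a class such as `2xx`, or a range such as `400-499`) could be interpreted as in the sketch below. Upstream's actual parsing is not shown in this reference, so `matchesStatusFilter` is a hypothetical helper that only illustrates the documented forms.

```typescript
/** Returns true when an HTTP status matches a filter such as "200", "2xx", or "400-499". */
function matchesStatusFilter(status: number, filter: string): boolean {
  const trimmed = filter.trim();

  // Class form: "2xx" matches 200 through 299.
  const classMatch = /^([1-5])xx$/i.exec(trimmed);
  if (classMatch) {
    return Math.floor(status / 100) === Number(classMatch[1]);
  }

  // Range form: "400-499" is treated as inclusive on both ends.
  const rangeMatch = /^(\d{3})-(\d{3})$/.exec(trimmed);
  if (rangeMatch) {
    return status >= Number(rangeMatch[1]) && status <= Number(rangeMatch[2]);
  }

  // Exact form: "404".
  if (/^\d{3}$/.test(trimmed)) {
    return status === Number(trimmed);
  }

  return false; // unrecognized filters match nothing in this sketch
}
```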
523
542
 
524
543
  Privacy note: `cookies get` can expose real profile cookies. Do not run it against `--profile Default` or other authenticated profiles unless the user explicitly needs cookie inspection; prefer task-specific page actions and storage checks.
@@ -534,7 +553,7 @@ Stable tab ids look like `t1`, `t2`, and `t3`. Optional user labels such as `doc
534
553
  | `tab new [url]` | Open a new tab. |
535
554
  | `tab new --label <name> [url]` | Open a new tab with a user label. |
536
555
  | `tab <t<N>|label>` | Switch to a tab by id or label. |
537
- | `tab close [t<N>|label]` | Close the current tab or a referenced tab. |
556
+ | `tab close [t<N>|label]` | Close the current tab or a referenced tab. Generic references in workflows may say `tab close [target]`; use a stable `t<N>` id or label when you have one. |
538
557
 
539
558
  ### Snapshot
540
559
 
@@ -544,6 +563,7 @@ Stable tab ids look like `t1`, `t2`, and `t3`. Optional user labels such as `doc
544
563
  | `snapshot -i` / `snapshot --interactive` | Include only interactive elements. |
545
564
  | `snapshot -i --urls` | Include only interactive elements and link hrefs. |
546
565
  | `snapshot -u` / `snapshot --urls` | Include href URLs for link elements. |
566
+ | `snapshot -C` / `snapshot --cursor` | Include cursor/focus context when upstream provides it. |
547
567
  | `snapshot -c` / `snapshot --compact` | Remove empty structural elements. |
548
568
  | `snapshot -d <n>` / `snapshot --depth <n>` | Limit tree depth. |
549
569
  | `snapshot -s <sel>` / `snapshot --selector <sel>` | Scope to a CSS selector. |
@@ -562,16 +582,16 @@ When a snapshot is too large for inline output, the Pi wrapper renders a compact
562
582
  | `wait --text <text>` | Wait for text to appear on the page; failures may include `inspect-after-text-assertion-failure` with a session-scoped `snapshot -i` payload. |
563
583
  | `wait --download [path]` | Wait for a download started by a previous action and optionally save it to `path`; successful wrapper results include upstream-reported `savedFilePath`/`savedFile`, while `details.artifacts[].exists` is the wrapper's on-disk verification signal. |
564
584
  | `wait --download [path] --timeout <ms>` | Set download-start timeout in milliseconds. In the native Pi wrapper, use `25000` ms or less per call to stay under the upstream CLI IPC budget. |
565
- | `wait <selector> --state hidden` | Wait for an element to become hidden. |
566
- | `wait <selector> --state detached` | Wait for an element to detach. |
585
+
586
+ The current upstream v0.27.0 source does not parse `wait <selector> --state hidden` / `wait <selector> --state detached` as distinct wait modes, even though upstream help mentions those examples. Until upstream parser support exists, use `wait --fn "!document.querySelector('#spinner')"` or another explicit JavaScript predicate for disappearance/detach checks.
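
When the selector comes from a variable, the `wait --fn` predicate has to embed it as a valid JavaScript string literal. The sketch below builds the token list for the tool's `args` field; `waitUntilGoneArgs` is a hypothetical helper, not something the wrapper exports.

```typescript
// Hypothetical helper: builds the tokens for a disappearance check via `wait --fn`.
function waitUntilGoneArgs(selector: string): string[] {
  // JSON.stringify quotes and escapes the selector so the predicate stays valid
  // JavaScript even if the selector itself contains quotes.
  const predicate = `!document.querySelector(${JSON.stringify(selector)})`;
  return ["wait", "--fn", predicate];
}

const spinnerArgs = waitUntilGoneArgs("#spinner");
```

Here `spinnerArgs` holds `wait`, `--fn`, and the predicate `!document.querySelector("#spinner")` as three separate tokens, so no shell quoting is involved.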
567
587
 
568
588
  ### Diff, debug, and streaming
569
589
 
570
590
  | Command | Purpose |
571
591
  | --- | --- |
572
- | `diff snapshot` | Compare current versus last snapshot. |
573
- | `diff screenshot --baseline` | Compare current screenshot versus a baseline image. |
574
- | `diff url <u1> <u2>` | Compare two pages. |
592
+ | `diff snapshot` | Compare current versus last snapshot. Use `diff snapshot --baseline <file> --selector <sel> --compact --depth <n>` when you need a saved baseline, scoped subtree, compact output, or depth bound. |
593
+ | `diff screenshot --baseline` | Compare current screenshot versus a baseline image. Use `diff screenshot --baseline <file> --output <file> --threshold <0-1> --selector <sel> --full` when you need a saved diff image, threshold tuning, element scope, or full-page capture. |
594
+ | `diff url <u1> <u2>` | Compare two pages. Use `diff url <u1> <u2> --screenshot --wait-until <strategy> --selector <sel> --compact --depth <n>` when you need screenshot comparison, navigation wait control, or scoped/compact snapshot comparison. |
575
595
  | `trace start|stop [path]` | Record a Chrome DevTools trace. |
576
596
  | `profiler start|stop [path]` | Record a Chrome DevTools profile. |
577
597
  | `record start <path> [url]` | Start WebM video recording; output is written on `record stop`. Requires `ffmpeg` on `PATH` for the final encode. |
@@ -581,7 +601,7 @@ When a snapshot is too large for inline output, the Pi wrapper renders a compact
581
601
  | `errors [--clear]` | View or clear page errors. |
582
602
  | `highlight <sel>` | Highlight an element. |
583
603
  | `inspect` | Open Chrome DevTools for the active page. |
584
- | `clipboard <op> [text]` | Read/write clipboard: `read`, `write`, `copy`, `paste`. |
604
+ | `clipboard <op> [text]` | Read/write clipboard: `clipboard read`, `clipboard write <text>`, `clipboard copy`, and `clipboard paste`. |
585
605
  | `stream enable [--port <n>]` | Start runtime WebSocket streaming for this session. |
586
606
  | `stream disable` | Stop runtime WebSocket streaming. |
587
607
  | `stream status` | Show streaming status and active port. |
@@ -590,7 +610,7 @@ When a snapshot is too large for inline output, the Pi wrapper renders a compact
590
610
  | `react renders start` | Start recording React render activity. |
591
611
  | `react renders stop [--json]` | Stop render recording and print mount/re-render counts and changed details. |
592
612
  | `react suspense [--only-dynamic] [--json]` | Classify Suspense boundaries with grouped root-cause recommendations. |
593
- | `vitals [url] [--json]` | Report Core Web Vitals: LCP, CLS, TTFB, FCP, INP, plus React hydration timing when available. |
613
+ | `vitals [url] [--json]` | Report Core Web Vitals: LCP, CLS, TTFB, FCP, INP, plus React hydration timing when available. `web-vitals [url] [--json]` is the upstream alias. |
594
614
  | `pushstate <url>` | Perform SPA client-side navigation; detects Next.js router pushes and falls back to history navigation events. |
595
615
  | `removeinitscript <id>` | Remove an init script registered through upstream init-script mechanisms. |
596
616
 
@@ -600,16 +620,16 @@ Long-running or lifecycle commands should be explicitly paired with cleanup call
600
620
 
601
621
  `trace` and `profiler` share upstream Chrome tracing machinery. Do not run them at the same time. The wrapper tracks owner state it observes in the current Pi session and blocks conflicting starts/stops with "wrapper believes ..." wording because direct upstream CLI use or browser restarts can desynchronize wrapper-local state.
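
The owner tracking described above can be pictured as a small state holder. This is an illustrative sketch only, assuming a single owner slot per Pi session; the class name, method names, and exact messages are hypothetical and do not reflect the wrapper's real implementation.

```typescript
type TracingOwner = "trace" | "profiler";

// Hypothetical model of wrapper-local owner state for the shared tracing machinery.
class TracingOwnerGuard {
  private owner: TracingOwner | null = null;

  // Returns a message when the start conflicts with the other owner;
  // otherwise records ownership and returns null.
  start(kind: TracingOwner): string | null {
    if (this.owner !== null && this.owner !== kind) {
      return `wrapper believes ${this.owner} is already running; stop it before starting ${kind}`;
    }
    this.owner = kind;
    return null;
  }

  // Returns a message when the wrapper has not observed this owner starting;
  // otherwise clears ownership and returns null.
  stop(kind: TracingOwner): string | null {
    if (this.owner !== kind) {
      return `wrapper believes ${kind} is not running`;
    }
    this.owner = null;
    return null;
  }
}
```

The "believes" wording matters because this state is only what the wrapper observed; direct upstream CLI use or a browser restart can leave it stale.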
602
622
 
603
- ### Batch, auth, confirmations, sessions, chat, dashboard, and setup
623
+ ### Batch, auth, confirmations, sessions, chat, dashboard, devices, and setup
604
624
 
605
625
  | Command | Purpose |
606
626
  | --- | --- |
607
627
  | `batch [--bail] ["cmd" ...]` | Execute multiple commands sequentially from args or stdin. |
608
- | `auth save <name> [opts]` | Save an auth profile with options such as `--url`, `--username`, `--password`, or `--password-stdin`. Prefer `auth save <name> --password-stdin` with the tool `stdin` field; avoid putting passwords in `args`. |
628
+ | `auth save <name> [opts]` | Save an auth profile. Full credential form: `auth save <name> --url <url> --username <user> --password <pass>`; selector override form: `auth save <name> --username-selector <s> --password-selector <s> --submit-selector <s>`. Prefer `auth save <name> --password-stdin` with the tool `stdin` field; avoid putting passwords in `args`. |
609
629
  | `auth login <name>` | Login using saved credentials. |
610
630
  | `auth list` | List saved auth profiles. |
611
631
  | `auth show <name>` | Show auth profile metadata. |
612
- | `auth delete <name>` | Delete an auth profile. |
632
+ | `auth delete <name>` | Delete an auth profile; `auth remove <name>` is the upstream alias. |
613
633
  | `confirm <id>` | Approve a pending action. |
614
634
  | `deny <id>` | Deny a pending action. |
615
635
  | `session` | Show current session name. |
@@ -619,13 +639,14 @@ Long-running or lifecycle commands should be explicitly paired with cleanup call
619
639
  | `dashboard [start]` | Start the dashboard server on the default port `4848`. |
620
640
  | `dashboard start --port <n>` | Start the dashboard on a specific port. |
621
641
  | `dashboard stop` | Stop the dashboard server. |
642
+ | `device list` | List available iOS simulators. Use with `-p ios` when exercising iOS provider flows. |
622
643
  | `install` | Install browser binaries. |
623
644
  | `install --with-deps` | Install browser binaries plus Linux system dependencies. |
624
645
  | `upgrade` | Upgrade `agent-browser` to the latest version. |
625
646
  | `doctor [--fix]` | Diagnose install issues and optionally auto-clean stale files. Use `doctor --offline --quick` for a fast local-only check and `doctor --json` for structured output. |
626
647
  | `profiles` | List available Chrome profiles. |
627
648
 
628
- When these commands are invoked through the native `agent_browser` tool, structured diagnostic/status outputs are rendered as compact summaries. List-like outputs such as sessions, Chrome profiles, auth profiles, network requests, console messages, and page errors include counts and key fields; large outputs are previewed with a `Full output path:` spill file instead of dumping the entire payload into context. For `network requests`, the wrapper shows a failed-request summary split into actionable versus benign low-impact rows, then status, method, URL, resource/mime type, request id, and, when the installed upstream output includes body-like fields, bounded redacted payload, response, and failure/error snippets. Safe request IDs also produce `details.nextActions` for exact request details, actionable failed-request source lookup candidates, filtered request lists, or starting HAR capture before a repro. `network request <requestId>` can expose upstream full-detail body fields such as response bodies using the same bounded model-facing preview; its request URL stays diagnostic-only and does not overwrite `details.sessionTabTarget` for later ref guards. Header, cookie, auth, token, and other secret-like fields are not expanded in model-facing text or `details.data`; command echoes also redact `--body`, `--headers`, `--password`, proxy credentials, auth-bearing URLs, cookie/storage values, and bearer/basic credential text in positional arguments. Use upstream HAR or full raw details only when complete data is required.
649
+ When these commands are invoked through the native `agent_browser` tool, structured diagnostic/status outputs are rendered as compact summaries. Local inspection/setup calls (`auth save/list/show/delete/remove`, `dashboard start/stop`, `device list`, `doctor`, `install`, `upgrade`, `profiles`, `session list`, `state list/show/rename`, `state clean --older-than <days>`, `state clear --all`, `state clear -a`, and `state clear <session-name>`) are sessionless unless you explicitly pass `--session`; context-dependent calls such as root `session`, untargeted `state clear`, `auth login`, `chat`, and `state save/load` keep normal session behavior. List-like outputs such as sessions, Chrome profiles, auth profiles, network requests, console messages, and page errors include counts and key fields; large outputs are previewed with a `Full output path:` spill file instead of dumping the entire payload into context. For `network requests`, the wrapper shows a failed-request summary split into actionable versus benign low-impact rows, then status, method, URL, resource/mime type, request id, and, when the installed upstream output includes body-like fields, bounded redacted payload, response, and failure/error snippets. Safe request IDs also produce `details.nextActions` for exact request details, actionable failed-request source lookup candidates, filtered request lists, or starting HAR capture before a repro. `network request <requestId>` can expose upstream full-detail body fields such as response bodies using the same bounded model-facing preview; its request URL stays diagnostic-only and does not overwrite `details.sessionTabTarget` for later ref guards. Header, cookie, auth, token, and other secret-like fields are not expanded in model-facing text or `details.data`; command echoes also redact `--body`, `--headers`, `--password`, proxy credentials, auth-bearing URLs, cookie/storage values, and bearer/basic credential text in positional arguments. Use upstream HAR or full raw details only when complete data is required.
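
To make the command-echo redaction concrete, the sketch below masks the value that follows `--body`, `--headers`, or `--password`. It covers only those three flags in their space-separated form and is a hypothetical simplification; the wrapper's real redaction also handles proxy credentials, auth-bearing URLs, and the other cases listed above, and the `[REDACTED]` placeholder is an assumption.

```typescript
// Flags whose following token is treated as secret in this sketch.
const SECRET_VALUE_FLAGS = new Set(["--body", "--headers", "--password"]);

// Hypothetical helper: returns a copy of the args that is safe to echo.
function redactCommandEcho(args: readonly string[]): string[] {
  const redacted: string[] = [];
  let redactNext = false;
  for (const arg of args) {
    if (redactNext) {
      // The previous token was a secret-bearing flag, so hide this value.
      redacted.push("[REDACTED]");
      redactNext = false;
      continue;
    }
    redacted.push(arg);
    redactNext = SECRET_VALUE_FLAGS.has(arg);
  }
  return redacted;
}
```

Passing `--password-stdin` with the tool `stdin` field avoids the problem altogether, since the secret never appears in `args`.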
629
650
 
630
651
  ## Important global flags, config, and environment
631
652
 
@@ -656,6 +677,7 @@ When these commands are invoked through the native `agent_browser` tool, structu
656
677
  - `--download-path <path>`: default browser download directory. Environment: `AGENT_BROWSER_DOWNLOAD_PATH`.
657
678
  - `--engine <name>`: browser engine, `chrome` by default or `lightpanda`. Environment: `AGENT_BROWSER_ENGINE`.
658
679
  - `--no-auto-dialog`: disable automatic dismissal of alert/beforeunload dialogs. Environment: `AGENT_BROWSER_NO_AUTO_DIALOG`.
680
+ - `--idle-timeout <ms>`: close idle sessions after the requested idle window when upstream owns that session lifecycle. The wrapper also sets `AGENT_BROWSER_IDLE_TIMEOUT_MS` for its managed-session backstop.
659
681
 
660
682
  ### Output, provider, policy, and AI flags
661
683
 
@@ -672,7 +694,7 @@ When these commands are invoked through the native `agent_browser` tool, structu
672
694
  - `--confirm-interactive`: interactive confirmations; auto-denies when stdin is not a TTY. Environment: `AGENT_BROWSER_CONFIRM_INTERACTIVE`.
673
695
  - `-p, --provider <name>`: provider such as `ios`, `browserbase`, `kernel`, `browseruse`, `browserless`, or `agentcore`. Environment: `AGENT_BROWSER_PROVIDER`.
674
696
  - `--device <name>`: iOS device name. Environment: `AGENT_BROWSER_IOS_DEVICE`.
675
- - Provider-specific iOS examples from upstream include `agent-browser -p ios device list`, `agent-browser -p ios swipe up`, and `agent-browser -p ios tap @e1`; in pi, pass those tokens through `args` rather than bash. iOS requires external Xcode/Appium setup, and cloud providers (`browserbase`, `kernel`, `browseruse`, `browserless`, `agentcore`) require their upstream accounts, credentials, and provider-specific environment variables. The wrapper forwards provider flags/env and stays thin; it does not emulate provider setup or cloud browser behavior.
697
+ - Provider-specific iOS examples from upstream include `agent-browser -p ios device list`, `agent-browser -p ios swipe up`, and `agent-browser -p ios tap @e1`; in pi, pass those tokens through `args` rather than bash. iOS requires external Xcode/Appium setup, and cloud providers (`browserbase`, `kernel`, `browseruse`, `browserless`, `agentcore`) require their upstream accounts, credentials, and provider-specific environment variables. Common forwarded provider variables include `BROWSERBASE_API_KEY`, `BROWSERBASE_PROJECT_ID`, `BROWSERLESS_API_KEY`, `BROWSERLESS_API_URL`, `BROWSERLESS_BROWSER_TYPE`, `BROWSERLESS_STEALTH`, `BROWSERLESS_TTL`, `BROWSER_USE_API_KEY`, `KERNEL_API_KEY`, `KERNEL_HEADLESS`, `KERNEL_STEALTH`, `KERNEL_TIMEOUT_SECONDS`, `KERNEL_PROFILE_NAME`, `AGENTCORE_API_KEY`, `AGENTCORE_REGION`, `AGENTCORE_BROWSER_ID`, `AGENTCORE_PROFILE_ID`, `AGENTCORE_SESSION_TIMEOUT`, plus AWS names used by AgentCore such as `AWS_PROFILE`, `AWS_ACCESS_KEY_ID`, and `AWS_SECRET_ACCESS_KEY`. The wrapper forwards provider flags/env and stays thin; it does not emulate provider setup or cloud browser behavior.
676
698
  - `--model <name>`: AI model for `chat`. Environment: `AI_GATEWAY_MODEL`.
677
699
  - `-v, --verbose`: show tool commands and raw output.
678
700
  - `-q, --quiet`: show only AI text responses.
@@ -690,7 +712,7 @@ When these commands are invoked through the native `agent_browser` tool, structu
690
712
 
691
713
  Use `--config <path>` to load a specific config file. Boolean flags accept optional `true` or `false` values, such as `--headed false`, to override config. Browser extensions from user and project configs are merged rather than replaced.
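
The optional `true`/`false` value on boolean flags can be sketched as below. This is a hypothetical reading of the behavior described above, not the upstream flag parser; `readBooleanFlag` is a name invented for illustration.

```typescript
// Hypothetical helper: resolves a boolean flag that may carry an explicit value.
// Returns undefined when the flag is absent, so config defaults can apply.
function readBooleanFlag(args: readonly string[], flag: string): boolean | undefined {
  const index = args.indexOf(flag);
  if (index === -1) return undefined;

  // An explicit "true" or "false" right after the flag overrides config.
  const next = args[index + 1];
  if (next === "true") return true;
  if (next === "false") return false;

  // A bare flag means "enabled"; the next token belongs to something else.
  return true;
}
```

With this reading, `--headed false` resolves to `false`, a bare `--headed` resolves to `true`, and an absent flag leaves the config value in place.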
692
714
 
693
- Other useful environment variables include `AGENT_BROWSER_DEFAULT_TIMEOUT`, `AGENT_BROWSER_STREAM_PORT`, `AGENT_BROWSER_IDLE_TIMEOUT_MS`, `AGENT_BROWSER_ENCRYPTION_KEY`, `AGENT_BROWSER_STATE_EXPIRE_DAYS`, `AGENT_BROWSER_IOS_DEVICE`, `AGENT_BROWSER_IOS_UDID`, `AI_GATEWAY_URL`, and `AI_GATEWAY_API_KEY`. The upstream child also receives every parent variable whose name starts with `AGENT_BROWSER_`, `AGENTCORE_`, `AI_GATEWAY_`, `BROWSERBASE_`, `BROWSERLESS_`, `BROWSER_USE_`, `KERNEL_`, or `XDG_`, plus the explicit inherited-name allowlist in `buildAgentBrowserProcessEnv` (`extensions/agent-browser/lib/process.ts`).
715
+ Other useful environment variables include `AGENT_BROWSER_DEFAULT_TIMEOUT`, `AGENT_BROWSER_STREAM_PORT`, `AGENT_BROWSER_IDLE_TIMEOUT_MS`, `AGENT_BROWSER_ENCRYPTION_KEY`, `AGENT_BROWSER_STATE_EXPIRE_DAYS`, `AGENT_BROWSER_IOS_DEVICE`, `AGENT_BROWSER_IOS_UDID`, `AI_GATEWAY_URL`, `AI_GATEWAY_API_KEY`, the provider credential names listed above, and AWS credential names when using AgentCore. The upstream child also receives every parent variable whose name starts with `AGENT_BROWSER_`, `AGENTCORE_`, `AI_GATEWAY_`, `BROWSERBASE_`, `BROWSERLESS_`, `BROWSER_USE_`, `KERNEL_`, or `XDG_`, plus the explicit inherited-name allowlist in `buildAgentBrowserProcessEnv` (`extensions/agent-browser/lib/process.ts`).
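
The prefix-based forwarding can be approximated as a filter over the parent environment. The sketch below is a simplified stand-in for `buildAgentBrowserProcessEnv`, not a copy of it: `pickForwardedEnv` is hypothetical, and the `extraNames` parameter stands in for the explicit inherited-name allowlist, whose actual contents are not reproduced here.

```typescript
// Prefixes named in the paragraph above.
const FORWARDED_PREFIXES = [
  "AGENT_BROWSER_",
  "AGENTCORE_",
  "AI_GATEWAY_",
  "BROWSERBASE_",
  "BROWSERLESS_",
  "BROWSER_USE_",
  "KERNEL_",
  "XDG_",
];

// Hypothetical helper: keeps variables that match a forwarded prefix or an
// explicitly allowed name, and drops everything else.
function pickForwardedEnv(
  parentEnv: Record<string, string | undefined>,
  extraNames: readonly string[] = [],
): Record<string, string> {
  const allowed = new Set(extraNames);
  const childEnv: Record<string, string> = {};
  for (const [name, value] of Object.entries(parentEnv)) {
    if (value === undefined) continue;
    const hasPrefix = FORWARDED_PREFIXES.some((prefix) => name.startsWith(prefix));
    if (hasPrefix || allowed.has(name)) {
      childEnv[name] = value;
    }
  }
  return childEnv;
}
```

Calling it with `{ KERNEL_API_KEY: "k", UNRELATED: "x" }` keeps only `KERNEL_API_KEY`; names such as `AWS_PROFILE` pass through only if they are listed explicitly.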
694
716
 
695
717
  ## Wrapper-specific behavior worth knowing
696
718
 
@@ -717,20 +739,48 @@ Other useful environment variables include `AGENT_BROWSER_DEFAULT_TIMEOUT`, `AGE
717
739
 
718
740
  This generated block is review data for maintainers. The human-authored reference sections above remain the readable command guide.
719
741
 
742
+ #### Source evidence
743
+ - repository: `vercel-labs/agent-browser`
744
+ - upstream HEAD: `4ad284890cb59564af603e6de403dd75dd19e832`
745
+ - upstream package version: `0.27.0`
746
+ - inspected: `agent-browser --version`
747
+ - inspected: `agent-browser --help`
748
+ - inspected: `selected agent-browser <command> --help output`
749
+ - inspected: `README.md`
750
+ - inspected: `CHANGELOG.md`
751
+ - inspected: `agent-browser.schema.json`
752
+ - inspected: `cli/src/commands.rs`
753
+ - inspected: `cli/src/flags.rs`
754
+
720
755
  #### Upstream help commands sampled
721
756
  - root help: `agent-browser --help`
722
757
  - skills help: `agent-browser skills --help`
723
758
  - skills list: `agent-browser skills list`
724
759
  - core skill full: `agent-browser skills get core --full`
760
+ - open help: `agent-browser open --help`
761
+ - click help: `agent-browser click --help`
762
+ - key help: `agent-browser key --help`
763
+ - scroll help: `agent-browser scroll --help`
764
+ - scrollinto help: `agent-browser scrollinto --help`
765
+ - keydown help: `agent-browser keydown --help`
766
+ - keyup help: `agent-browser keyup --help`
767
+ - get help: `agent-browser get --help`
768
+ - is help: `agent-browser is --help`
769
+ - mouse help: `agent-browser mouse --help`
770
+ - set help: `agent-browser set --help`
725
771
  - tab help: `agent-browser tab --help`
726
772
  - snapshot help: `agent-browser snapshot --help`
773
+ - eval help: `agent-browser eval --help`
727
774
  - wait help: `agent-browser wait --help`
728
775
  - screenshot help: `agent-browser screenshot --help`
776
+ - pdf help: `agent-browser pdf --help`
777
+ - close help: `agent-browser close --help`
729
778
  - find help: `agent-browser find --help`
730
779
  - network help: `agent-browser network --help`
731
780
  - cookies help: `agent-browser cookies --help`
732
781
  - storage help: `agent-browser storage --help`
733
782
  - state help: `agent-browser state --help`
783
+ - session help: `agent-browser session --help`
734
784
  - frame help: `agent-browser frame --help`
735
785
  - dialog help: `agent-browser dialog --help`
736
786
  - window help: `agent-browser window --help`
@@ -745,14 +795,23 @@ This generated block is review data for maintainers. The human-authored referenc
745
795
  - trace help: `agent-browser trace --help`
746
796
  - profiler help: `agent-browser profiler --help`
747
797
  - record help: `agent-browser record --help`
798
+ - console help: `agent-browser console --help`
799
+ - errors help: `agent-browser errors --help`
800
+ - clipboard help: `agent-browser clipboard --help`
801
+ - tap help: `agent-browser tap --help`
802
+ - swipe help: `agent-browser swipe --help`
803
+ - device help: `agent-browser device --help`
804
+ - install help: `agent-browser install --help`
805
+ - upgrade help: `agent-browser upgrade --help`
806
+ - profiles help: `agent-browser profiles --help`
748
807
 
749
808
  #### Inventory sections
750
- - Built-in skills: 10 human-doc token(s), 11 upstream token(s)
751
- - Core page, element, navigation, and extraction commands: 38 human-doc token(s), 40 upstream token(s)
752
- - Sessions, state, tabs, frames, dialogs, and windows: 12 human-doc token(s), 8 upstream token(s)
753
- - Network, storage, artifacts, diagnostics, and performance: 29 human-doc token(s), 33 upstream token(s)
754
- - Batch, auth, confirmations, setup, dashboard, and AI commands: 19 human-doc token(s), 17 upstream token(s)
755
- - Global flags, config, providers, policy, and environment: 95 human-doc token(s), 90 upstream token(s)
809
+ - Built-in skills: 13 human-doc token(s), 13 upstream token(s)
810
+ - Core page, element, navigation, and extraction commands: 74 human-doc token(s), 74 upstream token(s)
811
+ - Sessions, state, tabs, frames, dialogs, and windows: 20 human-doc token(s), 16 upstream token(s)
812
+ - Network, storage, artifacts, diagnostics, and performance: 42 human-doc token(s), 51 upstream token(s)
813
+ - Batch, auth, confirmations, setup, dashboard, devices, and AI commands: 24 human-doc token(s), 24 upstream token(s)
814
+ - Global flags, config, providers, policy, and environment: 117 human-doc token(s), 90 upstream token(s)
756
815
 
757
816
  #### Human-authored doc tokens required
758
817
  ##### Built-in skills
@@ -760,20 +819,30 @@ This generated block is review data for maintainers. The human-authored referenc
760
819
  - `skills get core`
761
820
  - `skills get core --full`
762
821
  - `skills get <name>`
822
+ - `skills get <name> --full`
823
+ - `skills get --all`
763
824
  - `skills get electron`
764
825
  - `skills get slack`
765
826
  - `skills get dogfood`
766
827
  - `skills get vercel-sandbox`
767
828
  - `skills get agentcore`
768
829
  - `skills path [name]`
830
+ - `AGENT_BROWSER_SKILLS_DIR`
769
831
 
770
832
  ##### Core page, element, navigation, and extraction commands
833
+ - `open [url]`
771
834
  - `open <url>`
835
+ - `goto <url>`
836
+ - `navigate <url>`
772
837
  - `click <sel>`
838
+ - `click <sel> --new-tab`
773
839
  - `dblclick <sel>`
774
840
  - `type <sel> <text>`
775
841
  - `fill <sel> <text>`
776
842
  - `press <key>`
843
+ - `key <key>`
844
+ - `keydown <key>`
845
+ - `keyup <key>`
777
846
  - `keyboard type <text>`
778
847
  - `keyboard inserttext <text>`
779
848
  - `keydown Shift`
@@ -787,33 +856,70 @@ This generated block is review data for maintainers. The human-authored referenc
787
856
  - `upload <sel> <files...>`
788
857
  - `download <sel> <path>`
789
858
  - `scroll <dir> [px]`
859
+ - `scroll <dir> [px] --selector <sel>`
790
860
  - `scrollintoview <sel>`
861
+ - `scrollinto <sel>`
791
862
  - `wait <sel|ms>`
863
+ - `wait --url <pattern>`
864
+ - `wait --load <state>`
865
+ - `wait --fn <expression>`
866
+ - `wait --text <text>`
867
+ - `wait --download [path]`
868
+ - `screenshot [selector] [path]`
792
869
  - `screenshot [path]`
793
870
  - `screenshot --full`
794
871
  - `screenshot --annotate`
795
872
  - `pdf <path>`
796
873
  - `snapshot`
874
+ - `snapshot --cursor`
875
+ - `snapshot --interactive`
876
+ - `snapshot --urls`
877
+ - `snapshot --compact`
878
+ - `snapshot --depth <n>`
879
+ - `snapshot --selector <sel>`
797
880
  - `eval <js>`
881
+ - `eval --stdin`
882
+ - `eval -b <base64>`
798
883
  - `connect <port|url>`
799
884
  - `close [--all]`
885
+ - `quit`
886
+ - `exit`
800
887
  - `back`
801
888
  - `forward`
802
889
  - `reload`
803
890
  - `pushstate <url>`
804
891
  - `get <what> [selector]`
892
+ - `get cdp-url`
893
+ - `get box <selector>`
894
+ - `get styles <selector>`
805
895
  - `is <what> <selector>`
806
896
  - `find <locator> <value> <action>`
897
+ - `find first <sel>`
898
+ - `find last <sel>`
899
+ - `find nth <n> <sel>`
900
+ - `find role <role> --name <name>`
901
+ - `find ... --exact`
807
902
  - `mouse <action> [args]`
808
903
  - `set <setting> [value]`
904
+ - `set media <features>`
905
+ - `tap <selector>`
906
+ - `swipe <direction> [distance]`
809
907
 
810
908
  ##### Sessions, state, tabs, frames, dialogs, and windows
811
909
  - `session`
812
910
  - `session list`
813
911
  - `state save <path>`
814
912
  - `state load <path>`
913
+ - `state list`
914
+ - `state show <filename>`
915
+ - `state rename <old-name> <new-name>`
916
+ - `state clear [session-name] [--all]`
917
+ - `state clear -a`
918
+ - `state clean --older-than <days>`
815
919
  - `tab list`
920
+ - `tab new [url]`
816
921
  - `tab new --label <name> [url]`
922
+ - `tab close [target]`
817
923
  - `tab <t<N>|label>`
818
924
  - `frame <selector|main>`
819
925
  - `dialog accept [text]`
@@ -824,13 +930,21 @@ This generated block is review data for maintainers. The human-authored referenc
824
930
  ##### Network, storage, artifacts, diagnostics, and performance
825
931
  - `network <action>`
826
932
  - `network route <url> [--abort|--body <json>] [--resource-type <csv>]`
933
+ - `network unroute [url]`
934
+ - `network requests [--clear] [--filter <pattern>] [--type <csv>] [--method <method>] [--status <code|range>]`
827
935
  - `network request <requestId>`
936
+ - `network har start`
937
+ - `network har stop [path]`
828
938
  - `cookies [get|set|clear]`
939
+ - `cookies set <name> <value> --url <url> --domain <domain> --path <path> --httpOnly --secure --sameSite <Strict|Lax|None> --expires <timestamp>`
829
940
  - `cookies set --curl <file>`
830
941
  - `storage <local|session>`
831
942
  - `diff snapshot`
943
+ - `diff snapshot --baseline <file> --selector <sel> --compact --depth <n>`
832
944
  - `diff screenshot --baseline`
945
+ - `diff screenshot --baseline <file> --output <file> --threshold <0-1> --selector <sel> --full`
833
946
  - `diff url <u1> <u2>`
947
+ - `diff url <u1> <u2> --screenshot --wait-until <strategy> --selector <sel> --compact --depth <n>`
834
948
  - `trace start|stop [path]`
835
949
  - `profiler start|stop [path]`
836
950
  - `record start <path> [url]`
@@ -841,6 +955,10 @@ This generated block is review data for maintainers. The human-authored referenc
841
955
  - `highlight <sel>`
842
956
  - `inspect`
843
957
  - `clipboard <op> [text]`
958
+ - `clipboard read`
959
+ - `clipboard write <text>`
960
+ - `clipboard copy`
961
+ - `clipboard paste`
844
962
  - `stream enable [--port <n>]`
845
963
  - `stream disable`
846
964
  - `stream status`
@@ -850,21 +968,27 @@ This generated block is review data for maintainers. The human-authored referenc
850
968
  - `react renders stop [--json]`
851
969
  - `react suspense [--only-dynamic] [--json]`
852
970
  - `vitals [url] [--json]`
971
+ - `web-vitals [url] [--json]`
853
972
  - `removeinitscript <id>`
854
973
 
855
- ##### Batch, auth, confirmations, setup, dashboard, and AI commands
974
+ ##### Batch, auth, confirmations, setup, dashboard, devices, and AI commands
856
975
  - `batch [--bail]`
857
976
  - `auth save <name>`
977
+ - `auth save <name> --url <url> --username <user> --password <pass>`
978
+ - `auth save <name> --username-selector <s> --password-selector <s> --submit-selector <s>`
858
979
  - `auth save <name> --password-stdin`
859
980
  - `auth login <name>`
860
981
  - `auth list`
861
982
  - `auth show <name>`
862
983
  - `auth delete <name>`
984
+ - `auth remove <name>`
863
985
  - `confirm <id>`
864
986
  - `deny <id>`
865
987
  - `chat <message>`
988
+ - `dashboard [start]`
866
989
  - `dashboard start --port <n>`
867
990
  - `dashboard stop`
991
+ - `device list`
868
992
  - `install`
869
993
  - `install --with-deps`
870
994
  - `upgrade`
@@ -962,6 +1086,7 @@ This generated block is review data for maintainers. The human-authored referenc
962
1086
  - `AGENT_BROWSER_DEBUG`
963
1087
  - `AGENT_BROWSER_CONFIG`
964
1088
  - `AGENT_BROWSER_DEFAULT_TIMEOUT`
1089
+ - `--idle-timeout <ms>`
965
1090
  - `AGENT_BROWSER_STREAM_PORT`
966
1091
  - `AGENT_BROWSER_IDLE_TIMEOUT_MS`
967
1092
  - `AGENT_BROWSER_ENCRYPTION_KEY`
@@ -969,11 +1094,34 @@ This generated block is review data for maintainers. The human-authored referenc
969
1094
  - `AGENT_BROWSER_IOS_UDID`
970
1095
  - `AI_GATEWAY_URL`
971
1096
  - `AI_GATEWAY_API_KEY`
1097
+ - `BROWSERBASE_API_KEY`
1098
+ - `BROWSERBASE_PROJECT_ID`
1099
+ - `BROWSERLESS_API_KEY`
1100
+ - `BROWSERLESS_API_URL`
1101
+ - `BROWSERLESS_BROWSER_TYPE`
1102
+ - `BROWSERLESS_STEALTH`
1103
+ - `BROWSERLESS_TTL`
1104
+ - `BROWSER_USE_API_KEY`
1105
+ - `KERNEL_API_KEY`
1106
+ - `KERNEL_HEADLESS`
1107
+ - `KERNEL_STEALTH`
1108
+ - `KERNEL_TIMEOUT_SECONDS`
1109
+ - `KERNEL_PROFILE_NAME`
1110
+ - `AGENTCORE_API_KEY`
1111
+ - `AGENTCORE_REGION`
1112
+ - `AGENTCORE_BROWSER_ID`
1113
+ - `AGENTCORE_PROFILE_ID`
1114
+ - `AGENTCORE_SESSION_TIMEOUT`
1115
+ - `AWS_PROFILE`
1116
+ - `AWS_ACCESS_KEY_ID`
1117
+ - `AWS_SECRET_ACCESS_KEY`
972
1118
 
973
1119
  #### Upstream help tokens expected
974
1120
  ##### Built-in skills
975
1121
  - root help: `skills get core --full`
976
1122
  - skills help: `get <name> --full`
1123
+ - skills help: `get --all`
1124
+ - skills help: `AGENT_BROWSER_SKILLS_DIR`
977
1125
  - skills list: `core`
978
1126
  - skills list: `electron`
979
1127
  - skills list: `slack`
@@ -985,12 +1133,18 @@ This generated block is review data for maintainers. The human-authored referenc
985
1133
  - core skill full: `agent-browser state save ./auth.json`
986
1134
 
987
1135
  ##### Core page, element, navigation, and extraction commands
1136
+ - open help: `open [url]`
1137
+ - open help: `aliases still require a URL.`
988
1138
  - root help: `open <url>`
989
1139
  - root help: `click <sel>`
1140
+ - click help: `--new-tab`
990
1141
  - root help: `dblclick <sel>`
991
1142
  - root help: `type <sel> <text>`
992
1143
  - root help: `fill <sel> <text>`
993
1144
  - root help: `press <key>`
1145
+ - key help: `Aliases: key`
1146
+ - keydown help: `keydown <key>`
1147
+ - keyup help: `keyup <key>`
994
1148
  - root help: `keyboard type <text>`
995
1149
  - root help: `keyboard inserttext <text>`
996
1150
  - root help: `hover <sel>`
@@ -1002,35 +1156,71 @@ This generated block is review data for maintainers. The human-authored referenc
  - root help: `upload <sel> <files...>`
  - root help: `download <sel> <path>`
  - root help: `scroll <dir> [px]`
+ - scroll help: `--selector <sel>`
  - root help: `scrollintoview <sel>`
+ - scrollinto help: `Aliases: scrollinto`
  - root help: `wait <sel|ms>`
+ - wait help: `--url <pattern>`
+ - wait help: `--load <state>`
+ - wait help: `--fn <expression>`
+ - wait help: `--text <text>`
+ - wait help: `--download [path]`
  - root help: `screenshot [path]`
+ - screenshot help: `screenshot [selector] [path]`
  - root help: `pdf <path>`
+ - pdf help: `Save page as PDF`
  - root help: `snapshot`
+ - snapshot help: `--interactive`
+ - snapshot help: `--urls`
+ - snapshot help: `--compact`
+ - snapshot help: `--depth <n>`
+ - snapshot help: `--selector <sel>`
  - root help: `eval <js>`
+ - eval help: `--stdin`
+ - eval help: `-b, --base64`
  - root help: `connect <port|url>`
  - root help: `close [--all]`
+ - close help: `Aliases: quit, exit`
  - root help: `back`
  - root help: `forward`
  - root help: `reload`
  - root help: `pushstate <url>`
  - root help: `Get Info: agent-browser get <what> [selector]`
+ - get help: `box <selector>`
+ - get help: `styles <selector>`
+ - get help: `cdp-url`
  - root help: `Check State: agent-browser is <what> <selector>`
  - root help: `Find Elements: agent-browser find <locator> <value> <action> [text]`
+ - find help: `first <selector>`
+ - find help: `last <selector>`
+ - find help: `nth <index> <selector>`
+ - find help: `--name <name>`
+ - find help: `--exact`
  - root help: `Mouse: agent-browser mouse <action> [args]`
  - root help: `Browser Settings: agent-browser set <setting> [value]`
+ - set help: `media [dark|light]`
  - keyboard help: `type <text>`
  - keyboard help: `inserttext <text>`
  - screenshot help: `--full, -f`
  - screenshot help: `--annotate`
  - find help: `role <role>`
  - find help: `testid <id>`
+ - tap help: `tap <selector>`
+ - swipe help: `swipe <direction> [distance]`
 
  ##### Sessions, state, tabs, frames, dialogs, and windows
  - root help: `session list`
  - state help: `save <path>`
  - state help: `load <path>`
+ - state help: `list`
+ - state help: `show <filename>`
+ - state help: `rename <old-name> <new-name>`
+ - state help: `clear [session-name] [--all]`
+ - state help: `agent-browser state clear --all`
+ - state help: `clean --older-than <days>`
+ - tab help: `new [url]`
  - tab help: `new --label <name> [url]`
+ - tab help: `close [t<N>|label]`
  - tab help: `Stable tab ids`
  - frame help: `frame <selector|main>`
  - dialog help: `dialog <accept|dismiss|status> [text]`
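Reviewer note on the `wait` help lines added in the hunk above: the command now lists one positional form (`wait <sel|ms>`) and five flag forms. As a rough illustration only, and not code taken from this package, the sketch below maps a typed wait condition onto those argv shapes. `WaitCondition` and `buildWaitArgs` are hypothetical names, and the value syntax each flag accepts is an assumption that should be checked against the upstream `wait` help output.

```typescript
// Hypothetical helper: names and types are illustrative, not part of the package.
// Each variant corresponds to one `wait` form listed in the generated help lines.
type WaitCondition =
  | { kind: "selector"; selector: string }
  | { kind: "ms"; ms: number }
  | { kind: "url"; pattern: string }
  | { kind: "load"; state: string }
  | { kind: "fn"; expression: string }
  | { kind: "text"; text: string }
  | { kind: "download"; path?: string };

function buildWaitArgs(cond: WaitCondition): string[] {
  switch (cond.kind) {
    case "selector":
      return ["wait", cond.selector];
    case "ms":
      return ["wait", String(cond.ms)];
    case "url":
      return ["wait", "--url", cond.pattern];
    case "load":
      return ["wait", "--load", cond.state];
    case "fn":
      return ["wait", "--fn", cond.expression];
    case "text":
      return ["wait", "--text", cond.text];
    case "download":
      // `--download [path]` takes an optional value, so the path is appended only when given.
      return cond.path === undefined
        ? ["wait", "--download"]
        : ["wait", "--download", cond.path];
  }
}
```

A discriminated union keeps each flag form separate, so a caller cannot accidentally combine, say, a selector with `--text` unless the upstream CLI is confirmed to allow it.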
@@ -1039,6 +1229,9 @@ This generated block is review data for maintainers. The human-authored referenc
  ##### Network, storage, artifacts, diagnostics, and performance
  - root help: `network <action>`
  - root help: `--resource-type <csv>`
+ - network help: `unroute [url]`
+ - network help: `network har start`
+ - network help: `network har stop ./capture.har`
  - root help: `cookies [get|set|clear]`
  - root help: `cookies set --curl <file>`
  - root help: `storage <local|session>`
@@ -1053,6 +1246,10 @@ This generated block is review data for maintainers. The human-authored referenc
  - root help: `highlight <sel>`
  - root help: `inspect`
  - root help: `clipboard <op> [text]`
+ - clipboard help: `read`
+ - clipboard help: `write <text>`
+ - clipboard help: `copy`
+ - clipboard help: `paste`
  - root help: `stream enable [--port <n>]`
  - root help: `stream disable`
  - root help: `stream status`
@@ -1063,15 +1260,26 @@ This generated block is review data for maintainers. The human-authored referenc
  - root help: `react suspense [--only-dynamic] [--json]`
  - root help: `vitals [url] [--json]`
  - root help: `removeinitscript <id>`
+ - network help: `requests [options]`
+ - network help: `--type <types>`
+ - network help: `--method <method>`
+ - network help: `--status <code>`
  - network help: `request <requestId>`
  - network help: `har <start|stop>`
  - storage help: `set <key> <value>`
+ - diff help: `diff snapshot [options]`
+ - diff help: `--baseline <f>`
+ - diff help: `--output <file>`
+ - diff help: `--threshold <0-1>`
+ - diff help: `--wait-until <strategy>`
  - diff help: `diff screenshot --baseline <f>`
  - trace help: `trace <operation> [path]`
  - profiler help: `--categories <list>`
  - record help: `record restart <path.webm> [url]`
+ - console help: `--clear`
+ - errors help: `--clear`
 
- ##### Batch, auth, confirmations, setup, dashboard, and AI commands
+ ##### Batch, auth, confirmations, setup, dashboard, devices, and AI commands
  - root help: `batch [--bail]`
  - root help: `auth save <name>`
  - root help: `auth login <name>`
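Reviewer note on the `diff help` lines added in the hunk above: `--threshold <0-1>` documents a bounded numeric range, so a caller may want to reject out-of-range values before spawning the CLI. The sketch below is a minimal illustration under stated assumptions. `DiffScreenshotOptions` and `buildDiffScreenshotArgs` are hypothetical names, and it is assumed (not confirmed by the diff) that `--threshold` and `--output` combine with `diff screenshot --baseline <f>`.

```typescript
// Hypothetical option bag; field names mirror the flags listed in the diff help lines.
interface DiffScreenshotOptions {
  baseline: string;   // value for --baseline <f>
  output?: string;    // value for --output <file> (assumed valid for `diff screenshot`)
  threshold?: number; // value for --threshold <0-1> (assumed valid for `diff screenshot`)
}

function buildDiffScreenshotArgs(opts: DiffScreenshotOptions): string[] {
  const args = ["diff", "screenshot", "--baseline", opts.baseline];
  if (opts.threshold !== undefined) {
    // The help line documents the range as 0-1, so values outside it are rejected up front.
    if (!Number.isFinite(opts.threshold) || opts.threshold < 0 || opts.threshold > 1) {
      throw new RangeError(`--threshold must be between 0 and 1, got ${opts.threshold}`);
    }
    args.push("--threshold", String(opts.threshold));
  }
  if (opts.output !== undefined) {
    args.push("--output", opts.output);
  }
  return args;
}
```

Validating before building argv surfaces a bad threshold as a local `RangeError` rather than as whatever error text the CLI would emit.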
@@ -1079,12 +1287,19 @@ This generated block is review data for maintainers. The human-authored referenc
  - root help: `deny <id>`
  - root help: `chat <message>`
  - root help: `dashboard start --port <n>`
+ - device help: `device list`
  - root help: `install --with-deps`
  - root help: `upgrade`
  - root help: `doctor [--fix]`
  - root help: `profiles`
  - batch help: `--bail`
+ - auth help: `--url <url>`
+ - auth help: `--username <user>`
+ - auth help: `--password <pass>`
  - auth help: `--password-stdin`
+ - auth help: `--username-selector <s>`
+ - auth help: `--password-selector <s>`
+ - auth help: `--submit-selector <s>`
  - dashboard help: `dashboard [start|stop] [options]`
  - chat help: `chat <message>`
  - doctor help: `--offline`
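Reviewer note on the `auth help` lines added in the hunk above: both `--password <pass>` and `--password-stdin` are now listed. The sketch below shows one way a caller might assemble an `auth save <name>` invocation that always uses `--password-stdin`, so the secret never appears in argv. `AuthSaveOptions` and `buildAuthSaveArgs` are hypothetical names, and it is an assumption (not stated in the diff) that all of these flags attach to `auth save` rather than `auth login`.

```typescript
// Hypothetical option bag; the password is deliberately absent because it is
// expected to be written to the child process's stdin, not passed as an argument.
interface AuthSaveOptions {
  name: string;
  url: string;
  username: string;
  usernameSelector?: string;
  passwordSelector?: string;
  submitSelector?: string;
}

function buildAuthSaveArgs(opts: AuthSaveOptions): string[] {
  const args = [
    "auth", "save", opts.name,
    "--url", opts.url,
    "--username", opts.username,
    "--password-stdin",
  ];
  // Optional selector overrides, appended only when the caller supplies them.
  const optional: Array<[string, string | undefined]> = [
    ["--username-selector", opts.usernameSelector],
    ["--password-selector", opts.passwordSelector],
    ["--submit-selector", opts.submitSelector],
  ];
  for (const [flag, value] of optional) {
    if (value !== undefined) {
      args.push(flag, value);
    }
  }
  return args;
}
```

Keeping the password out of the option type makes it harder to leak through process listings or logged command lines.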