pi-agent-browser-native 0.2.33 → 0.2.35

Files changed (44)

package/docs/COMMAND_REFERENCE.md CHANGED

@@ -18,7 +18,7 @@ This project intentionally blocks normal `agent-browser` bash usage in most agen
 <!-- agent-browser-capability-baseline:start upstream-baseline -->
 <!-- Generated from scripts/agent-browser-capability-baseline.mjs. Run `npm run docs -- command-reference write` to update. Do not edit manually. -->
-This reference is baselined to the locally installed `agent-browser 0.27.0` command/help surface. Upstream `agent-browser` remains the source of truth for command semantics; this file is the local fallback for Pi agent sessions where direct binary help is blocked or discouraged.
+This reference is baselined to the locally installed `agent-browser 0.27.0` command/help surface, audited against vercel-labs/agent-browser@4ad284890cb59564af603e6de403dd75dd19e832. Upstream `agent-browser` remains the source of truth for command semantics; this file is the local fallback for Pi agent sessions where direct binary help is blocked or discouraged.
 The lightweight drift check is `npm run verify -- command-reference`. Run it whenever the installed upstream `agent-browser` version changes or this reference is edited.
@@ -27,6 +27,8 @@ Use `npm run benchmark:agent-browser` or `npm run verify -- benchmark` before an
 ## Core mental model
+Input mode chooser (one per call): **`args`** for the default open → snapshot -i → click/fill `@refs` flow; **`semanticAction`** for stable role/text/label targets; **`job`** / **`qa`** for multi-step checks; **`electron`** for desktop apps only; **`sourceLookup`** / **`networkSourceLookup`** are **experimental candidates-only** helpers (not authoritative mappings). Do not pass `--json` in `args`—the wrapper injects it. Match link and button text to the latest snapshot (on `https://example.com/` the main link is `Learn more`, not legacy `More information...` copy). See [`TOOL_CONTRACT.md`](TOOL_CONTRACT.md#input-mode-chooser) for snapshot variants (`-i` vs `--compact` vs full) and batching three or more getters.
 Tool parameters (use exactly one of `args`, `semanticAction`, `job`, `qa`, `sourceLookup`, `networkSourceLookup`, or `electron`):
 ```json
@@ -63,14 +65,14 @@ Tool parameters (use exactly one of `args`, `semanticAction`, `job`, `qa`, `sour
 - `semanticAction`: optional shorthand for common `find` flows and native dropdown `select`; compiles to upstream argv and is rejected together with `args`, `job`, `qa`, `sourceLookup`, `networkSourceLookup`, or `electron` on the same call.
 - `job`: optional constrained short-workflow schema; compiles to existing upstream `batch` args/stdin and reports the compiled plan in `details.compiledJob`.
 - `qa`: optional lightweight QA preset; compiles to the same batch path and reports `details.compiledQaPreset` plus `details.qaPreset` pass/fail evidence.
-- `sourceLookup`: optional experimental helper for local UI-to-source *candidates*; compiles to the same `batch` path, reports `details.compiledSourceLookup` and `details.sourceLookup`, and never reclassifies a fully successful upstream batch as failed the way `qa` can (see [`TOOL_CONTRACT.md`](TOOL_CONTRACT.md#sourcelookup) and the longer notes below).
-- `networkSourceLookup`: optional experimental helper for failed request-to-source *candidates*; compiles to generated `batch`, reports `details.compiledNetworkSourceLookup` and `details.networkSourceLookup`, and never assigns blame or edits files.
+- `sourceLookup`: **EXPERIMENTAL — candidates only** for local UI-to-source hints; compiles to the same `batch` path, reports `details.compiledSourceLookup` and `details.sourceLookup`, and never reclassifies a fully successful upstream batch as failed the way `qa` can (see [`TOOL_CONTRACT.md`](TOOL_CONTRACT.md#sourcelookup) and the longer notes below).
+- `networkSourceLookup`: **EXPERIMENTAL — candidates only** for failed request-to-source hints; compiles to generated `batch`, reports `details.compiledNetworkSourceLookup` and `details.networkSourceLookup`, and never assigns blame or edits files.
 - `electron`: optional Electron desktop-app shorthand. `list`, `status`, `cleanup`, and `probe` are wrapper-owned host/session helpers; `launch` starts a wrapper-owned isolated Electron profile and attaches through upstream `connect`.
 - `stdin`: only for `batch`, `eval --stdin`, and `auth save --password-stdin`; other command/stdin combinations are rejected before `agent-browser` is launched. `job`, `qa`, `sourceLookup`, `networkSourceLookup`, and `electron` generate or manage their own input.
 - `sessionMode`:
   - `"auto"` reuses the extension-managed session when possible.
   - `"fresh"` rotates that managed session to a fresh upstream launch so launch-scoped flags like `--profile`, `--session-name`, `--cdp`, `--state`, `--auto-connect`, `--init-script`, `--enable`, `-p` / `--provider`, or iOS `--device` apply.
-  - If a fresh launch fails or times out, read `details.managedSessionOutcome` for `preserved` vs `abandoned` (and related fields). A model-visible `Managed session outcome: …` line is appended only for failing calls that used `sessionMode: "fresh"`; `"auto"` failures can still populate the struct without that extra line.
+  - If a fresh launch fails or times out, read `details.managedSessionOutcome` for `preserved` vs `abandoned` (and related fields). A model-visible `Managed session outcome: …` line is appended only for failing calls that used `sessionMode: "fresh"`; `"auto"` failures can still populate the struct without that extra line. If you explicitly close the current wrapper-managed session with `--session <name> close`, later default auto calls rotate to a new wrapper-generated session instead of reusing the closed name; repeated closes and branch restores keep those generated names monotonic.
 ### Debug, diff, stream, dashboard, and chat families
@@ -105,7 +107,7 @@ React introspection requires the React DevTools init hook to be installed before
 Use `vitals [url]` for Core Web Vitals plus React hydration timing when available, and `pushstate <url>` for client-side SPA navigation without a full reload:
 ```json
-{ "args": ["vitals", "https://example.com", "--json"] }
+{ "args": ["vitals", "https://example.com"] }
 { "args": ["pushstate", "/dashboard?tab=settings"] }
 ```
@@ -151,9 +153,9 @@ Do not assume Playwright selector dialects such as `text=Close` or `button:has-t
 Treat `@e…` refs as page-scoped. After a successful `snapshot`, the wrapper records the latest refs and page target for that session; mutation-prone ref commands such as `click @e4`, `select @e5 chocolate`, or batch steps with old refs fail with `failureCategory: "stale-ref"` when the page target changed or the ref is absent from the latest same-page snapshot. If a session `snapshot -i` fails with `No active page`, the wrapper invalidates prior refs for that session; later mutation-prone `@e…` calls fail before upstream until a successful fresh `snapshot -i` records refs again. Inside `batch` stdin JSON, the wrapper also walks steps in order before spawn: steps whose first token can navigate or mutate set a latch; a later step whose first token is `snapshot` clears that latch for following rows; guarded steps that still mention `@e…` after an uncleared latch fail with the same `stale-ref` bucket without launching upstream. Same-snapshot form fills are allowed before a click or submit step, so a login-style `fill`, `fill`, `click` batch can run from one snapshot; split dynamic or autosubmit forms with a fresh snapshot if a fill itself rerenders the targets. Follow the `refresh-interactive-refs` next action (it includes `--session <name>` when needed) and prefer stable `find` or `semanticAction` locators when navigation or rerendering is likely. Contract detail: [`TOOL_CONTRACT.md`](TOOL_CONTRACT.md#details) (`refSnapshot`, `refSnapshotInvalidation`).
-A successful `click` result means upstream reported a target, not that the app definitely handled the event. When the workflow depends on a mutation, use `details.pageChangeSummary`, a wait, URL/text extraction, or a fresh `snapshot -i` before trusting the state; if nothing changed, retry with a current visible ref or stable selector and report the workflow issue. Preserve explicit user stop boundaries: if the user says to stop before a final order, post, purchase, or submit action, gather evidence from that page and do not click the final action. The wrapper avoids site-specific fallback clicks and keeps the verification burden explicit.
+A successful `click` result means upstream reported a target, not that the app definitely handled the event. For top-level non-Electron clicks, the wrapper installs a bounded DOM-event probe; when upstream reports success but no trusted event reaches the target, it fails the tool and exposes `details.clickDispatch` plus a `Click dispatch diagnostic` line with explicit retry/inspect next actions (no in-page click replay). When the workflow depends on a mutation, use `details.pageChangeSummary`, a wait, URL/text extraction, or a fresh `snapshot -i` before trusting the state; if nothing changed, retry with a current visible ref or stable selector and report the workflow issue. Preserve explicit user stop boundaries: if the user says to stop before a final order, post, purchase, or submit action, gather evidence from that page and do not click the final action. The wrapper also blocks likely final order/submit click targets under those prompts and returns `details.promptGuard` with `failureCategory: "policy-blocked"`.
-When a **top-level** `click` succeeds (not a `click` hidden inside a `batch`/`job` tool call—the unified command must be `click`), the upstream payload includes `data.clicked`, and the wrapper sees the active tab URL unchanged after the same normalization it uses for ref guards (**`#fragment` ignored**), it may run one extra `snapshot -i` and surface `Possible overlay blockers` plus `details.overlayBlockers` (`candidates`, `summary`, and a `snapshot` map that can refresh `refSnapshot`) when that snapshot shows strong modal context (`dialog` / `alertdialog`) **and** up to three close/dismiss-like controls; page-wide words such as privacy, sign in, or banner alone do not trigger it. The URL check compares the session’s prior pinned tab target to `details.navigationSummary.url` after the click; that summary is gathered with one read-only `eval` when the click JSON omits **both** string `data.url` and `data.title`—if upstream already echoes either field, overlay diagnostics are skipped on this path. The diagnostic is skipped if the wrapper already applied tab-focus correction or about-blank recovery on that result. Appended `inspect-overlay-state` / `try-overlay-blocker-candidate-*` entries in `details.nextActions` include `--session <name>` when the session is named, same as other session-scoped follow-ups. Treat `inspect-overlay-state` as the safe first follow-up; only use a `try-overlay-blocker-candidate-*` next action when the candidate is clearly the control you intend to close.
+When a **top-level** `click` succeeds (not a `click` hidden inside a `batch`/`job` tool call—the unified command must be `click`), the upstream payload includes `data.clicked`, no `details.clickDispatch` diagnostic fired for the same result, and the wrapper sees the active tab URL unchanged after the same normalization it uses for ref guards (**`#fragment` ignored**), it may run one extra `snapshot -i` and surface `Possible overlay blockers` plus `details.overlayBlockers` (`candidates`, `summary`, and a `snapshot` map that can refresh `refSnapshot`) when that snapshot shows strong modal context (`dialog` / `alertdialog`) **and** up to three close/dismiss-like controls; page-wide words such as privacy, sign in, or banner alone do not trigger it. The URL check compares the session’s prior pinned tab target to `details.navigationSummary.url` after the click; that summary is gathered with one read-only `eval` when the click JSON omits **both** string `data.url` and `data.title`—if upstream already echoes either field, overlay diagnostics are skipped on this path. The diagnostic is skipped if the wrapper already applied tab-focus correction or about-blank recovery on that result. Appended `inspect-overlay-state` / `try-overlay-blocker-candidate-*` entries in `details.nextActions` include `--session <name>` when the session is named, same as other session-scoped follow-ups. Treat `inspect-overlay-state` as the safe first follow-up; only use a `try-overlay-blocker-candidate-*` next action when the candidate is clearly the control you intend to close.
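Editorial sketch of the in-order batch walk described above: navigating or mutating steps set a latch, `snapshot` clears it, and later `@e…` steps fail as `stale-ref`. This TypeScript is illustrative only and is not the wrapper's code. The name `findStaleRefStep`, the contents of `LATCH_SETTING_COMMANDS`, and the array-of-token-arrays step shape are all assumptions.

```typescript
// One batch row as a list of tokens, e.g. ["click", "@e3"]. Assumed shape.
type BatchStep = string[];

// Assumed set of first tokens that can navigate or mutate the page.
// The wrapper's real list may differ; `fill` is left out so that a
// fill, fill, click batch can run from one snapshot.
const LATCH_SETTING_COMMANDS = new Set(["open", "click", "press"]);

// Returns the index of the first step that would be rejected as stale-ref,
// or -1 when every step may run against the current snapshot.
function findStaleRefStep(steps: BatchStep[]): number {
  let latched = false;
  for (const [index, step] of steps.entries()) {
    const command = step[0] ?? "";
    if (command === "snapshot") {
      latched = false; // a fresh snapshot records new refs for later rows
      continue;
    }
    const usesRef = step.slice(1).some((token) => token.startsWith("@e"));
    if (latched && usesRef) {
      return index; // old ref used after a possible page change
    }
    if (LATCH_SETTING_COMMANDS.has(command)) {
      latched = true;
    }
  }
  return -1;
}

const loginBatch: BatchStep[] = [
  ["fill", "@e1", "user@example.com"],
  ["fill", "@e2", "secret"],
  ["click", "@e3"],
];
console.log(findStaleRefStep(loginBatch)); // -1
```

Under these assumptions, a second `click @e4` directly after `click @e3` would be flagged at index 1, while the same pair with a `snapshot -i` row in between would pass.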
 ### Extract page data
@@ -186,6 +188,8 @@ Use `batch --bail` when later steps should stop after the first failed command.
 For short constrained flows, use top-level `job` instead of hand-writing `batch` stdin. Supported job steps are `open`, `click`, `fill`, `select`, `wait`, `assertText`, `assertUrl`, `waitForDownload`, and `screenshot`; `select` requires `selector` plus `value` or `values`, and compiles to upstream `select <selector> <value...>`. The wrapper compiles steps to upstream `batch` and records `details.compiledJob.steps[]`. There is still no separate first-class catalog of reusable named browser recipes above `job`, the `qa` preset, and raw `batch`; see [`ARCHITECTURE.md`](ARCHITECTURE.md#no-reusable-recipe-layer-yet) for the closed `RQ-0068` decision and revisit bar.
+**Job navigation is explicit.** A `click` step (or other navigation-prone interaction) does not prove the next page loaded. The wrapper does not auto-insert `assertUrl` or `assertText` after clicks inside `job`; add those steps yourself with the URL pattern or on-page text you expect, especially after forms, checkout, tabs, or submit buttons, before screenshots or later steps.
 ```json
 {
   "job": {
@@ -198,11 +202,28 @@ For short constrained flows, use top-level `job` instead of hand-writing `batch`
 }
 ```
+Navigation-prone flow (open → fill → click → assert destination → screenshot):
+```json
+{
+  "job": {
+    "steps": [
+      { "action": "open", "url": "https://shop.example/checkout" },
+      { "action": "fill", "selector": "#email", "text": "user@example.com" },
+      { "action": "click", "selector": "#continue" },
+      { "action": "assertUrl", "url": "**/shipping" },
+      { "action": "assertText", "text": "Shipping address" },
+      { "action": "screenshot", "path": ".dogfood/shipping.png" }
+    ]
+  }
+}
+```
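Editorial sketch: because the wrapper does not insert assertions after clicks, a caller could check a job before sending it. The TypeScript below is a hypothetical caller-side check, not part of the package. `findUnverifiedClicks` is an invented name, and the rule it applies (every `click` must be followed immediately by `assertUrl` or `assertText`) is a stricter heuristic than the documentation requires.

```typescript
// Minimal step shape; only `action` is taken from the documented job schema.
interface JobStep {
  action: string;
  [key: string]: unknown;
}

const ASSERT_ACTIONS = new Set(["assertUrl", "assertText"]);

// Returns the indices of click steps whose next step is not an assertion.
function findUnverifiedClicks(steps: JobStep[]): number[] {
  const unverified: number[] = [];
  steps.forEach((step, index) => {
    if (step.action !== "click") return;
    const next = steps[index + 1];
    if (next === undefined || !ASSERT_ACTIONS.has(next.action)) {
      unverified.push(index);
    }
  });
  return unverified;
}

const checkoutJob: JobStep[] = [
  { action: "open", url: "https://shop.example/checkout" },
  { action: "fill", selector: "#email", text: "user@example.com" },
  { action: "click", selector: "#continue" },
  { action: "assertUrl", url: "**/shipping" },
  { action: "assertText", text: "Shipping address" },
  { action: "screenshot", path: ".dogfood/shipping.png" },
];
console.log(findUnverifiedClicks(checkoutJob)); // []
```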
 On app pages that expose a native dropdown, add a `select` step such as `{ "action": "select", "selector": "#flavor", "value": "chocolate" }` before the assertion that depends on it.
 Use raw `args: ["batch"]` with `stdin` when you need arbitrary upstream commands, flags, or batch failure policies outside the constrained schema. Do not pass `stdin` with `job`, `qa`, `sourceLookup`, `networkSourceLookup`, or `electron`; those modes generate or manage their own input.
-For quick smoke/QA checks, use top-level `qa`. It clears enabled network/console/page-error buffers before opening the target URL, waits for page readiness, checks expected text/selector, inspects fresh network requests, console messages, and page errors, and can capture an evidence screenshot. The readiness wait defaults to `loadState: "domcontentloaded"`; set `loadState` to `"load"` or `"networkidle"` only when that stricter state is useful and the site is not expected to keep background requests alive. QA network diagnostics classify failed requests by likely impact and list failed rows first in the network preview: actionable document/script/API-style failures fail the preset, while common low-impact browser icon misses such as `favicon.ico` are surfaced as warnings (`qaPreset.warnings`) so they do not fail an otherwise healthy page. Failed QA presets report `details.resultCategory: "failure"`, `failureCategory: "qa-failure"`, and real Pi sessions treat the diagnostic as a failed tool result. Prose output also gets a model-visible result-category line including `Pi tool isError: true`; caller-requested `--json` output keeps the JSON string parseable and relies on the patched `isError` plus `details` fields.
+For quick smoke/QA checks, use top-level `qa`. It clears enabled network/console/page-error buffers before opening the target URL, waits for page readiness, checks expected text/selector, inspects fresh network requests, console messages, and page errors, and can capture an evidence screenshot. The readiness wait defaults to `loadState: "domcontentloaded"`; set `loadState` to `"load"` or `"networkidle"` only when that stricter state is useful and the site is not expected to keep background requests alive. QA network diagnostics classify failed requests by likely impact and list failed rows first in the network preview: actionable document/script/API-style failures fail the preset, while common low-impact browser icon misses such as `favicon.ico` are surfaced as warnings (`qaPreset.warnings`) so they do not fail an otherwise healthy page. Successful QA with no failed checks returns compact model-visible prose (page URL/title when known, checks run, optional screenshot verification) while keeping the full step matrix in `details.qaPreset` and `details.batchSteps`. Failed QA presets report `details.resultCategory: "failure"`, `failureCategory: "qa-failure"`, keep verbose per-step batch output, and real Pi sessions treat the diagnostic as a failed tool result. Prose output also gets a model-visible result-category line including `Pi tool isError: true`; caller-requested `--json` output keeps the JSON string parseable and relies on the patched `isError` plus `details` fields.
 The same classification drives plain `network requests` presentation: when any row counts as failed (HTTP status ≥ 400, `failed: true`, or a string `error`), model-facing text starts with a line like `Network failure summary: 0 actionable, 1 benign low-impact (1 total).`, and each preview line can end with an impact tag such as `[benign: low-impact browser icon asset]` or `[actionable: document, script, API, or non-benign request failure]`. When safe request IDs are present, `details.nextActions` adds bounded read-only follow-ups such as `network request <id>`, `networkSourceLookup` for actionable failed rows, `network requests --filter <path>`, and `network har start`; prefer those payloads over rebuilding request-id commands from prose. Rules live in `classifyNetworkRequestFailure` / `summarizeNetworkFailures` in `extensions/agent-browser/lib/results/network.ts`; QA aggregation is `analyzeQaPresetResults` in `extensions/agent-browser/index.ts`.
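Editorial sketch of the failed-row rule quoted above (HTTP status ≥ 400, `failed: true`, or a string `error`) and the resulting summary line. This TypeScript is a simplified illustration, not the package's `classifyNetworkRequestFailure` or `summarizeNetworkFailures`. The `NetworkRow` shape, the `url` and `status` field names, and the favicon-only benign rule are assumptions.

```typescript
// Assumed row shape; `failed` and `error` come from the text, the rest is illustrative.
interface NetworkRow {
  url: string;
  status?: number;
  failed?: boolean;
  error?: unknown;
}

// A row counts as failed on HTTP status >= 400, `failed: true`, or a string `error`.
function isFailedRow(row: NetworkRow): boolean {
  return (
    (typeof row.status === "number" && row.status >= 400) ||
    row.failed === true ||
    typeof row.error === "string"
  );
}

// Assumption: only favicon.ico misses are treated as benign in this sketch.
function isBenignIconMiss(row: NetworkRow): boolean {
  const path = row.url.split(/[?#]/)[0] ?? "";
  return path.endsWith("/favicon.ico");
}

// Builds the summary line, or returns null when no row failed.
function summarizeFailures(rows: NetworkRow[]): string | null {
  const failed = rows.filter(isFailedRow);
  if (failed.length === 0) return null;
  const benign = failed.filter(isBenignIconMiss).length;
  const actionable = failed.length - benign;
  return `Network failure summary: ${actionable} actionable, ${benign} benign low-impact (${failed.length} total).`;
}

const rows: NetworkRow[] = [
  { url: "https://example.com/", status: 200 },
  { url: "https://example.com/favicon.ico", status: 404 },
];
console.log(summarizeFailures(rows));
// Network failure summary: 0 actionable, 1 benign low-impact (1 total).
```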
@@ -212,7 +233,7 @@ The same classification drives plain `network requests` presentation: when any r
 Optional `loadState`, `checkNetwork`, `checkConsole`, and `checkErrors` default to `"domcontentloaded"`, `true`, `true`, and `true`; set a check to `false` to skip that diagnostic. Omit `expectedText` and `expectedSelector` when you only need load plus diagnostics.
-For attached Electron or manually connected CDP sessions, use `qa.attached` after the session exists. It does not open a URL and rejects `sessionMode: "fresh"` because it checks the current managed session.
+For attached Electron or manually connected CDP sessions, use `qa.attached` after the session exists. It does not open a URL and rejects `sessionMode: "fresh"` because it checks the current managed session. Before running diagnostics, the wrapper requires a readable `http:` or `https:` page URL on the attached session; missing URLs, read failures, and non-http(s) surfaces fail fast with recovery `nextActions` such as `tab list` and `snapshot -i` instead of running the full QA batch.
 ```json
 { "qa": { "attached": true, "expectedText": "Explorer", "screenshotPath": ".dogfood/electron.png" } }
@@ -238,13 +259,13 @@ Typical lifecycle:
 { "electron": { "action": "cleanup", "launchId": "electron-…" } }
 ```
-`electron.status` and `electron.cleanup` take either `launchId`, **`all: true`** (literal boolean) to walk every wrapper-tracked launch in one call, or neither when exactly one active launch exists—never both `launchId` and `all`. For `electron.launch`, `timeoutMs` bounds host CDP readiness with a **15s** default and **120s** cap in `extensions/agent-browser/lib/electron/launch.ts`. Optional `timeoutMs` on **`status`** applies to managed-session `get title` / `get url` reads (localhost CDP probes stay on a short fixed fetch budget). On **`cleanup`**, it caps upstream `close` **and** host teardown (process exit, debug-port idle check, isolated profile removal); when omitted it follows the implicit session close default (**5s** unless `PI_AGENT_BROWSER_IMPLICIT_SESSION_CLOSE_TIMEOUT_MS` overrides). On **`probe`**, it bounds each underlying upstream read subprocess—omit it to use the normal tool subprocess default, or raise it on slow desktops.
+`electron.status` and `electron.cleanup` take either `launchId`, **`all: true`** (literal boolean) to walk every wrapper-tracked launch in one call, or neither when exactly one active launch exists—never both `launchId` and `all`. They can target the current branch-visible launch plus still-owned off-branch launch records by `launchId`; default no-arg calls are intentionally ambiguous when more than one active launch is owned. `/reload` preserves the current branch-visible active Electron launch and its isolated temp `userDataDir` for continuity, and cleans off-branch owned Electron launches; if cleanup is partial and skips or fails profile removal, the generic temp sweep preserves that `userDataDir` across reload, quit, later temp cleanup, process exit, and stale temp-root pruning after restart. For `electron.launch`, `timeoutMs` bounds host CDP readiness with a **15s** default and **120s** cap in `extensions/agent-browser/lib/electron/launch.ts`. Optional `timeoutMs` on **`status`** applies to managed-session `get title` / `get url` reads (localhost CDP probes stay on a short fixed fetch budget). On **`cleanup`**, it caps upstream `close` **and** host teardown (process exit, debug-port idle check, isolated profile removal); when omitted it follows the implicit session close default (**5s** unless `PI_AGENT_BROWSER_IMPLICIT_SESSION_CLOSE_TIMEOUT_MS` overrides). A successful managed-session close step retires that wrapper-managed session even when host process/profile cleanup remains partial. On **`probe`**, it bounds each underlying upstream read subprocess—omit it to use the normal tool subprocess default, or raise it on slow desktops.
 `launch.handoff` defaults to `"snapshot"`, which attaches through upstream `connect`, lists targets, and captures a current `snapshot -i` in one call. Snapshot handoff retries briefly when the first Electron snapshot has no refs; if it still reports no refs, run `snapshot -i` once more before assuming the app is blank. Use `handoff: "tabs"` as the safer diagnostic starting point when you only need target discovery and do not want to snapshot app content yet, or `handoff: "connect"` when you want to attach first and run your own follow-up commands. `targetType` defaults to `"page"`; use `"webview"` or `"any"` for apps that expose useful webviews. When a matching CDP target exposes a WebSocket URL, launch connects to that target; otherwise it falls back to the browser port.
 After launch, prefer the exact `details.nextActions` payloads when present: `status-electron-launch` checks liveness, `probe-electron-launch` runs compact diagnostics for a tracked launch, `snapshot-electron-session` refreshes current refs, `list-electron-tabs` inspects targets, and `cleanup-electron-launch` removes the wrapper-owned process/profile when the run is done. If launch times out, inspect `details.electron.failure.diagnostics` for PID, wrapper profile, `DevToolsActivePort`, and timing evidence before retrying. If status/probe detects a session or target mismatch, follow `reattach-electron-launch` or a fresh snapshot action before using old refs. If a click/fill/type looks successful but the Electron PID or debug port dies, the wrapper now fails the result with `details.electronPostCommandHealth` and same-launch status/probe/cleanup next actions instead of leaving the agent on `about:blank`. If cleanup is partial (`failureCategory: "cleanup-failed"`), inspect `details.electron.cleanup.results` and use `retry-electron-cleanup` only for the same `launchId`.
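Editorial sketch of the `launchId` / `all` targeting rule for `electron.status` and `electron.cleanup`: one of the two, or neither when exactly one launch is active, and never both. This TypeScript is a hypothetical illustration, not the wrapper's implementation. `resolveLaunchTargets`, its error messages, and the `activeLaunchIds` parameter are assumptions.

```typescript
// Assumed input shape mirroring the documented `launchId` and `all` fields.
interface ElectronTargetInput {
  launchId?: string;
  all?: boolean;
}

// Returns the launch IDs a status/cleanup call would act on, or throws
// when the combination is rejected or the default target is ambiguous.
function resolveLaunchTargets(
  input: ElectronTargetInput,
  activeLaunchIds: readonly string[],
): string[] {
  if (input.launchId !== undefined && input.all === true) {
    throw new Error("Pass either launchId or all: true, never both.");
  }
  if (input.all === true) {
    return [...activeLaunchIds]; // walk every tracked launch in one call
  }
  if (input.launchId !== undefined) {
    return [input.launchId];
  }
  if (activeLaunchIds.length === 1) {
    return [...activeLaunchIds]; // unambiguous default
  }
  throw new Error(
    `Ambiguous target: ${activeLaunchIds.length} active launches; pass launchId or all: true.`,
  );
}

console.log(resolveLaunchTargets({}, ["electron-a"])); // [ 'electron-a' ]
```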
-Manual path for externally launched apps: if you started the Electron app yourself with a debug port or DevTools URL, skip the wrapper lifecycle and attach directly with upstream `connect`. In this path you own app shutdown and profile cleanup; do not use `electron.cleanup`. `close` only closes the browser/CDP session and does not quit the manually launched app or remove explicit artifacts.
+Manual path for externally launched apps: if you started the Electron app yourself with a debug port or DevTools URL, skip the wrapper lifecycle and attach directly with upstream `connect`. In this path you own app shutdown and profile cleanup; do not use `electron.cleanup`. close commands (`close`, `quit`, or `exit`) only close the browser/CDP session and do not quit the manually launched app or remove explicit artifacts.
 ```json
 { "args": ["connect", "9222"], "sessionMode": "fresh" }
@@ -306,10 +327,12 @@ A successful wait-based download renders a readable summary such as `Download co
 { "args": ["pdf", "/tmp/page.pdf"] }
 ```
-The upstream screenshot aliases are `screenshot --full` for full-page capture and `screenshot --annotate` for labeled screenshots. When a user gives exact artifact paths for screenshots, recordings, downloads, PDFs, traces, or HAR files, use those paths or explicitly report why the artifact was unavailable; do not silently substitute another path in the final report.
+The upstream screenshot aliases are `screenshot --full` for full-page capture and `screenshot --annotate` for labeled screenshots. When a user gives exact artifact paths for screenshots, recordings, downloads, PDFs, traces, or HAR files, use those paths or explicitly report why the artifact was unavailable; do not silently substitute another path in the final report. When the latest prompt names exact required screenshot paths, `close` / `quit` / `exit` can be blocked with `details.promptGuard.reason: "requested-artifacts-missing-before-close"` until those paths appear as verified explicit artifacts.
 Prefer `download <selector> <path>` when the target element itself is the downloadable link/control. Use `click` plus `wait --download [path]` when a previous action starts the download indirectly.
+For evidence-only screenshots, QA captures, or audit artifacts, save to an explicit path and branch on `details.artifactVerification` plus `details.artifacts` before reporting PASS/FAIL. Inline image attachments are optional convenience when size limits allow; do not require vision review unless the user asked for visual inspection.
 Wrapper result rendering is metadata-first for saved files:
 - screenshots return a saved-path summary, visible artifact metadata, structured `details.artifacts` metadata, and an inline image attachment when safe; the visible block includes artifact type, requested path, absolute path, existence, size, cwd, session, and repair/copy status when applicable
 - downloads, PDFs, `wait --download` files, `state save` state files, diff screenshot output images, traces, CPU profiles, completed WebM recordings from `record stop`, and path-bearing HAR captures return concise saved-path summaries plus structured `details.artifacts` metadata without inlining large files
@@ -332,7 +355,7 @@ The wrapper keeps a bounded, metadata-only `details.artifactManifest` of recent
 This manifest cap controls what appears in `details.artifactManifest` and in summaries such as `Session artifacts: 42 live, 0 evicted (42/100 recent)`. It does not delete explicit files that upstream saved to paths you chose, such as screenshots, PDFs, downloads, traces, HAR files, or WebM recordings.
-Browser `close` is also not file cleanup. If `details.artifactManifest` is present with a non-empty `entries` list, a successful `close` appends an `Artifact lifecycle` note and reports `details.artifactCleanup` with the current retention summary and the same host-owned cleanup `note` as the contract (`extensions/agent-browser/index.ts`, `getArtifactCleanupGuidance`). Up to ten distinct user-chosen paths that still exist on disk appear in `explicitArtifactPaths` when matching `explicit-path` manifest rows exist in the recent window; deleted/stale paths are skipped. Otherwise that array is empty and visible text may omit the “Explicit artifact paths” line even though the lifecycle block still reminds you that close does not delete saved files. Delete any paths you care about with host file tools after inspection; the native browser tool intentionally does not remove arbitrary user-chosen filesystem paths.
+Browser close commands (`close`, `quit`, or `exit`) are also not file cleanup. If `details.artifactManifest` is present with a non-empty `entries` list, a successful close command appends an `Artifact lifecycle` note and reports `details.artifactCleanup` with the current retention summary and the same host-owned cleanup `note` as the contract (`extensions/agent-browser/lib/orchestration/browser-run/diagnostics.ts`, `getArtifactCleanupGuidance`). Up to ten distinct user-chosen paths that still exist on disk appear in `explicitArtifactPaths` when matching `explicit-path` manifest rows exist in the recent window; deleted/stale paths are skipped. Otherwise that array is empty and visible text may omit the “Explicit artifact paths” line even though the lifecycle block still reminds you that close commands do not delete saved files. Delete any paths you care about with host file tools after inspection; the native browser tool intentionally does not remove arbitrary user-chosen filesystem paths.
 Oversized snapshots and oversized generic outputs are different: when a persisted pi session is available, their wrapper-managed spill files are stored under the private session artifact directory and are governed by the byte budget `PI_AGENT_BROWSER_SESSION_ARTIFACT_MAX_BYTES` (default 32 MiB). Raise that byte budget as well for long QA sessions that need many full raw snapshots or large text spills to survive reload/resume.
@@ -415,18 +438,27 @@ Session note: `skills list`, `skills get …`, and `skills path …` are **state
 | `skills get core` | Print the core usage guide. |
 | `skills get core --full` | Print the full version-matched core command reference and templates. |
 | `skills get <name>` | Load a specialized skill such as `electron` or `slack`. Common specialized calls include `skills get electron`, `skills get slack`, `skills get dogfood`, `skills get vercel-sandbox`, and `skills get agentcore`. |
+| `skills get <name> --full` | Include a skill's supplementary references/templates when present. |
+| `skills get --all` | Print all visible bundled skills for broad audit/debug work. |
 | `skills path [name]` | Print a skill directory path. |
+Skill-source debugging note: upstream honors `AGENT_BROWSER_SKILLS_DIR` as an override for bundled skill discovery. Normal agents should not need it, but it is useful when validating package layout or upstream skill packaging.
 ### Core page and element commands
 | Command | Purpose |
 | --- | --- |
-| `open <url>` | Navigate to a URL. |
+| `open [url]` | Launch the browser and optionally navigate. URL-less `open` stays on `about:blank` so agents can stage routes, cookies, or init scripts before first navigation. |
+| `open <url>` | Navigate to a URL; `goto <url>` and `navigate <url>` are equivalent navigation aliases when a URL is present. |
 | `click <sel>` | Click an element or `@ref`. |
+| `click <sel> --new-tab` | Click a link/control while requesting a new tab. |
 | `dblclick <sel>` | Double-click an element. |
 | `type <sel> <text>` | Type into an element. |
 | `fill <sel> <text>` | Clear and fill an element. |
-| `press <key>` | Press a key such as `Enter`, `Tab`, or `Control+a`. Related key-hold aliases include `keydown Shift` and `keyup Shift`. |
+| `press <key>` | Press a key such as `Enter`, `Tab`, or `Control+a`. `key <key>` is the upstream alias. |
+| `key <key>` | Alias for `press <key>`. |
+| `keydown <key>` | Hold a key down without releasing it, useful for modifiers. |
+| `keyup <key>` | Release a key previously held by `keydown <key>`. Common modifier examples are `keydown Shift` and `keyup Shift`. |
 | `keyboard type <text>` | Type text with real keystrokes and no selector. |
 | `keyboard inserttext <text>` | Insert text without key events. |
 | `hover <sel>` | Hover an element. |
@@ -438,14 +470,19 @@ Session note: `skills list`, `skills get …`, and `skills path …` are **state
 | `upload <sel> <files...>` | Upload one or more files. |
 | `download <sel> <path>` | Download a file by clicking an element. |
 | `scroll <dir> [px]` | Scroll `up`, `down`, `left`, or `right`. |
-| `scrollintoview <sel>` | Scroll an element into view. |
+| `scroll <dir> [px] --selector <sel>` | Scroll a specific scrollable element/container instead of the page. |
+| `scrollintoview <sel>` | Scroll an element into view; `scrollinto <sel>` is the upstream alias. |
+| `scrollinto <sel>` | Alias for `scrollintoview <sel>`. |
 | `wait <sel|ms>` | Wait for an element or a duration. |
-| `screenshot [path]` | Take a screenshot. |
+| `screenshot [selector] [path]` | Take a full-page or element-scoped screenshot; a single selector-like argument scopes the capture to that element, while a path-like argument saves the image to that path. |
+| `screenshot [path]` | Take a screenshot and optionally save it to a path. |
 | `pdf <path>` | Save the page as a PDF. |
-| `snapshot` | Print an accessibility tree with refs for AI interaction. |
-| `eval <js>` | Run JavaScript. Use `eval --stdin` through this wrapper for larger snippets. |
+| `snapshot` | Print an accessibility tree with refs for AI interaction. Common options include `snapshot --interactive`, `snapshot --urls`, `snapshot --compact`, `snapshot --depth <n>`, `snapshot --selector <sel>`, and `snapshot --cursor` / `snapshot -C` for cursor/focus context when upstream returns it. |
+| `eval <js>` | Run JavaScript. Use `eval --stdin` through this wrapper for larger snippets, or `eval -b <base64>` for shell-escaping-safe one-liners. |
 | `connect <port|url>` | Connect to a browser through CDP. |
-| `close [--all]` | Close the current browser or all sessions. |
+| `close [--all]` | Close the current browser or all sessions; `quit` and `exit` are upstream close aliases. |
+| `tap <selector>` | Touch-oriented tap alias for iOS/provider workflows. |
+| `swipe <direction> [distance]` | Touch-oriented swipe for iOS/provider workflows. |
 On dashboards and other apps with nested scroll containers, `scroll <dir> [px]` may report a successful wheel action while the viewport appears unchanged because the page-level scroller was not the one containing the content. For top-level `scroll` calls without startup-scoped launch flags, the wrapper samples viewport and prominent scroll-container positions before and after the command; when nothing changes it appends `Scroll diagnostic: no observed scroll movement`, exposes `details.scrollNoop`, and adds exact `details.nextActions` for a fresh `snapshot -i` and screenshot. Use those before repeating page scrolls; when you need a specific panel, prefer `scrollintoview <@ref>` or a scoped interaction with the actual scrollable region.
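The before/after comparison behind the scroll diagnostic can be pictured with a small Python sketch. This is an assumption-laden illustration, not the wrapper's code: the function name, the label-to-offset mapping, and the returned `moved` list are hypothetical; only the `scrollNoop` name echoes `details.scrollNoop` from the paragraph above.

```python
def detect_scroll_noop(before, after):
    """Compare sampled scroll offsets taken before and after a scroll command.

    `before` and `after` map a scroller label (for example "viewport" or a
    container selector) to an (x, y) offset tuple. These shapes are assumed.
    """
    moved = [
        name
        for name in before
        if name in after and before[name] != after[name]
    ]
    # No scroller changed position: treat the command as a no-op.
    return {"scrollNoop": not moved, "moved": moved}
```

When `scrollNoop` is true, the guidance above applies: take a fresh `snapshot -i` and a screenshot before repeating the page scroll.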
@@ -467,6 +504,11 @@ Comboboxes vary by app. For native `<select>` controls, prefer raw `select <sele
 | `session list` | List active sessions. |
 | `state save <path>` | Save cookies, local storage, and session storage to a state file. |
 | `state load <path>` | Load cookies and storage from a state file. |
+| `state list` | List saved state files. |
+| `state show <filename>` | Show saved-state metadata without dumping secrets. |
+| `state rename <old-name> <new-name>` | Rename a saved state file. |
+| `state clear [session-name] [--all]` | Clear saved states for one name or all names; `state clear -a` is the upstream short alias for clearing all names. |
+| `state clean --older-than <days>` | Delete expired saved-state files. |
 | `frame <selector|main>` | Switch iframe context by selector/ref/name/URL, or return to the main frame. |
 | `dialog accept [text]` | Accept an alert, confirm, or prompt dialog, optionally supplying prompt text. |
 | `dialog dismiss` | Dismiss or cancel the current dialog. |
@@ -489,13 +531,13 @@ These calls return plain text and stay stateless: the extension does not inject
 | Family | Surface |
 | --- | --- |
-| `get <what> [selector]` | `text`, `html`, `value`, `attr <name>`, `title`, `url`, `count`, `box`, `styles`, `cdp-url`. |
+| `get <what> [selector]` | `text`, `html`, `value`, `attr <name>`, `title`, `url`, `count`, `get box <selector>`, `get styles <selector>`, and `get cdp-url`. |
 | `is <what> <selector>` | Check `visible`, `enabled`, or `checked`. |
-| `find <locator> <value> <action> [text]` | Locator types include `role`, `text`, `label`, `placeholder`, `alt`, `title`, `testid`, `first`, `last`, and `nth`. |
+| `find <locator> <value> <action> [text]` | Locator types include `role`, `text`, `label`, `placeholder`, `alt`, `title`, and `testid`; selector helpers include `find first <sel>`, `find last <sel>`, and `find nth <n> <sel>`. Role/text filters include `find role <role> --name <name>` and `find ... --exact`. |
 | `mouse <action> [args]` | `move <x> <y>`, `down [btn]`, `up [btn]`, `wheel <dy> [dx]`. |
-| `set <setting> [value]` | `viewport <w> <h>`, `device <name>`, `geo <lat> <lng>`, `offline [on|off]`, `headers <json>`, `credentials <user> <pass>`, `media [dark|light] [reduced-motion]`. |
-| `network <action>` | `route <url> [--abort|--body <json>] [--resource-type <csv>]`, `unroute [url]`, `requests [--clear] [--filter <pattern>]`, `request <requestId>`, `har <start|stop> [path]`. `--resource-type` filters intercepted requests by CDP resource type, such as `script`, `image`, `font`, `xhr`, or `fetch`. |
-| `cookies [get|set|clear]` | Manage cookies. `set` supports `--url`, `--domain`, `--path`, `--httpOnly`, `--secure`, `--sameSite`, `--expires`, and `--curl <file>` for JSON, cURL, or bare Cookie-header bulk imports. |
+| `set <setting> [value]` | `viewport <w> <h>`, `device <name>`, `geo <lat> <lng>`, `offline [on|off]`, `headers <json>`, `credentials <user> <pass>`, and `set media <features>` (`dark`, `light`, and/or `reduced-motion`). |
+| `network <action>` | `network route <url> [--abort|--body <json>] [--resource-type <csv>]`, `network unroute [url]`, `network requests [--clear] [--filter <pattern>] [--type <csv>] [--method <method>] [--status <code|range>]`, `network request <requestId>`, `network har start`, and `network har stop [path]`. `--resource-type` filters intercepted requests by CDP resource type, such as `script`, `image`, `font`, `xhr`, or `fetch`; request listing filters accept resource types (`xhr,fetch`), methods (`POST`), and statuses (`2xx`, `400-499`). |
+| `cookies [get|set|clear]` | Manage cookies. Full set form: `cookies set <name> <value> --url <url> --domain <domain> --path <path> --httpOnly --secure --sameSite <Strict|Lax|None> --expires <timestamp>`; also supports `cookies set --curl <file>` for JSON, cURL, or bare Cookie-header bulk imports. |
 | `storage <local|session>` | Manage web storage. |
 Privacy note: `cookies get` can expose real profile cookies. Do not run it against `--profile Default` or other authenticated profiles unless the user explicitly needs cookie inspection; prefer task-specific page actions and storage checks.
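For the `network requests --status <code|range>` filter, the table gives `2xx` and `400-499` as example values. The Python sketch below shows one plausible reading of those three pattern shapes (exact code, class wildcard, inclusive range). It is an assumption about the semantics, not upstream's parser, and the function name is hypothetical.

```python
def status_matches(pattern, status):
    """Return True if an HTTP status satisfies a filter pattern.

    Assumed pattern shapes: "404" (exact), "2xx" (status class),
    "400-499" (inclusive range).
    """
    pattern = pattern.strip().lower()
    if len(pattern) == 3 and pattern.endswith("xx") and pattern[0].isdigit():
        return status // 100 == int(pattern[0])
    if "-" in pattern:
        low, high = pattern.split("-", 1)
        return int(low) <= status <= int(high)
    return status == int(pattern)
```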
@@ -511,7 +553,7 @@ Stable tab ids look like `t1`, `t2`, and `t3`. Optional user labels such as `doc
 | `tab new [url]` | Open a new tab. |
 | `tab new --label <name> [url]` | Open a new tab with a user label. |
 | `tab <t<N>|label>` | Switch to a tab by id or label. |
-| `tab close [t<N>|label]` | Close the current tab or a referenced tab. |
+| `tab close [t<N>|label]` | Close the current tab or a referenced tab. Generic references in workflows may say `tab close [target]`; use a stable `t<N>` id or label when you have one. |
 ### Snapshot
@@ -521,6 +563,7 @@ Stable tab ids look like `t1`, `t2`, and `t3`. Optional user labels such as `doc
 | `snapshot -i` / `snapshot --interactive` | Include only interactive elements. |
 | `snapshot -i --urls` | Include only interactive elements and link hrefs. |
 | `snapshot -u` / `snapshot --urls` | Include href URLs for link elements. |
+| `snapshot -C` / `snapshot --cursor` | Include cursor/focus context when upstream provides it. |
 | `snapshot -c` / `snapshot --compact` | Remove empty structural elements. |
 | `snapshot -d <n>` / `snapshot --depth <n>` | Limit tree depth. |
 | `snapshot -s <sel>` / `snapshot --selector <sel>` | Scope to a CSS selector. |
@@ -539,16 +582,16 @@ When a snapshot is too large for inline output, the Pi wrapper renders a compact
 | `wait --text <text>` | Wait for text to appear on the page; failures may include `inspect-after-text-assertion-failure` with a session-scoped `snapshot -i` payload. |
 | `wait --download [path]` | Wait for a download started by a previous action and optionally save it to `path`; successful wrapper results include upstream-reported `savedFilePath`/`savedFile`, while `details.artifacts[].exists` is the wrapper's on-disk verification signal. |
 | `wait --download [path] --timeout <ms>` | Set download-start timeout in milliseconds. In the native Pi wrapper, use `25000` ms or less per call to stay under the upstream CLI IPC budget. |
-| `wait <selector> --state hidden` | Wait for an element to become hidden. |
-| `wait <selector> --state detached` | Wait for an element to detach. |
+Current v0.27.0 source does not parse `wait <selector> --state hidden` / `wait <selector> --state detached` as distinct wait modes even though upstream help mentions those examples. Use `wait --fn "!document.querySelector('#spinner')"` or another explicit JavaScript predicate for disappearance/detach checks until upstream parser support exists.
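The `wait --fn` workaround above can be generated for any selector. This Python sketch builds the `args` tokens; the helper names are hypothetical, and the "hidden" predicate is only an approximation (no layout boxes) rather than upstream's definition of hidden.

```python
import json


def wait_for_detach_args(selector):
    """Build args that wait until no element matches `selector`."""
    # json.dumps produces a double-quoted, escaped literal that also works
    # as a JavaScript string for typical CSS selectors.
    literal = json.dumps(selector)
    return ["wait", "--fn", f"!document.querySelector({literal})"]


def wait_for_hidden_args(selector):
    """Build args that wait until the element is absent or has no layout boxes."""
    literal = json.dumps(selector)
    predicate = (
        f"!document.querySelector({literal}) || "
        f"document.querySelector({literal}).getClientRects().length === 0"
    )
    return ["wait", "--fn", predicate]
```

Passing the result as the tool's `args` keeps quoting out of the shell entirely, which matters for selectors containing quotes or brackets.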
 ### Diff, debug, and streaming
 | Command | Purpose |
 | --- | --- |
-| `diff snapshot` | Compare current versus last snapshot. |
-| `diff screenshot --baseline` | Compare current screenshot versus a baseline image. |
-| `diff url <u1> <u2>` | Compare two pages. |
+| `diff snapshot` | Compare current versus last snapshot. Use `diff snapshot --baseline <file> --selector <sel> --compact --depth <n>` when you need a saved baseline, scoped subtree, compact output, or depth bound. |
+| `diff screenshot --baseline` | Compare current screenshot versus a baseline image. Use `diff screenshot --baseline <file> --output <file> --threshold <0-1> --selector <sel> --full` when you need a saved diff image, threshold tuning, element scope, or full-page capture. |
+| `diff url <u1> <u2>` | Compare two pages. Use `diff url <u1> <u2> --screenshot --wait-until <strategy> --selector <sel> --compact --depth <n>` when you need screenshot comparison, navigation wait control, or scoped/compact snapshot comparison. |
 | `trace start|stop [path]` | Record a Chrome DevTools trace. |
 | `profiler start|stop [path]` | Record a Chrome DevTools profile. |
 | `record start <path> [url]` | Start WebM video recording; output is written on `record stop`. Requires `ffmpeg` on `PATH` for the final encode. |
@@ -558,7 +601,7 @@ When a snapshot is too large for inline output, the Pi wrapper renders a compact
 | `errors [--clear]` | View or clear page errors. |
 | `highlight <sel>` | Highlight an element. |
 | `inspect` | Open Chrome DevTools for the active page. |
-| `clipboard <op> [text]` | Read/write clipboard: `read`, `write`, `copy`, `paste`. |
+| `clipboard <op> [text]` | Read/write clipboard: `clipboard read`, `clipboard write <text>`, `clipboard copy`, and `clipboard paste`. |
 | `stream enable [--port <n>]` | Start runtime WebSocket streaming for this session. |
 | `stream disable` | Stop runtime WebSocket streaming. |
 | `stream status` | Show streaming status and active port. |
@@ -567,7 +610,7 @@ When a snapshot is too large for inline output, the Pi wrapper renders a compact
 | `react renders start` | Start recording React render activity. |
 | `react renders stop [--json]` | Stop render recording and print mount/re-render counts and changed details. |
 | `react suspense [--only-dynamic] [--json]` | Classify Suspense boundaries with grouped root-cause recommendations. |
-| `vitals [url] [--json]` | Report Core Web Vitals: LCP, CLS, TTFB, FCP, INP, plus React hydration timing when available. |
+| `vitals [url] [--json]` | Report Core Web Vitals: LCP, CLS, TTFB, FCP, INP, plus React hydration timing when available. `web-vitals [url] [--json]` is the upstream alias. |
 | `pushstate <url>` | Perform SPA client-side navigation; detects Next.js router pushes and falls back to history navigation events. |
 | `removeinitscript <id>` | Remove an init script registered through upstream init-script mechanisms. |
@@ -577,16 +620,16 @@ Long-running or lifecycle commands should be explicitly paired with cleanup call
 `trace` and `profiler` share upstream Chrome tracing machinery. Do not run them at the same time. The wrapper tracks owner state it observes in the current Pi session and blocks conflicting starts/stops with "wrapper believes ..." wording because direct upstream CLI use or browser restarts can desynchronize wrapper-local state.
-### Batch, auth, confirmations, sessions, chat, dashboard, and setup
+### Batch, auth, confirmations, sessions, chat, dashboard, devices, and setup
 | Command | Purpose |
 | --- | --- |
 | `batch [--bail] ["cmd" ...]` | Execute multiple commands sequentially from args or stdin. |
-| `auth save <name> [opts]` | Save an auth profile with options such as `--url`, `--username`, `--password`, or `--password-stdin`. Prefer `auth save <name> --password-stdin` with the tool `stdin` field; avoid putting passwords in `args`. |
+| `auth save <name> [opts]` | Save an auth profile. Full credential form: `auth save <name> --url <url> --username <user> --password <pass>`; selector override form: `auth save <name> --username-selector <s> --password-selector <s> --submit-selector <s>`. Prefer `auth save <name> --password-stdin` with the tool `stdin` field; avoid putting passwords in `args`. |
 | `auth login <name>` | Login using saved credentials. |
 | `auth list` | List saved auth profiles. |
 | `auth show <name>` | Show auth profile metadata. |
-| `auth delete <name>` | Delete an auth profile. |
+| `auth delete <name>` | Delete an auth profile; `auth remove <name>` is the upstream alias. |
 | `confirm <id>` | Approve a pending action. |
 | `deny <id>` | Deny a pending action. |
 | `session` | Show current session name. |
@@ -596,13 +639,14 @@ Long-running or lifecycle commands should be explicitly paired with cleanup call
 | `dashboard [start]` | Start the dashboard server on the default port `4848`. |
 | `dashboard start --port <n>` | Start the dashboard on a specific port. |
 | `dashboard stop` | Stop the dashboard server. |
+| `device list` | List available iOS simulators. Use with `-p ios` when exercising iOS provider flows. |
 | `install` | Install browser binaries. |
 | `install --with-deps` | Install browser binaries plus Linux system dependencies. |
 | `upgrade` | Upgrade `agent-browser` to the latest version. |
 | `doctor [--fix]` | Diagnose install issues and optionally auto-clean stale files. Use `doctor --offline --quick` for a fast local-only check and `doctor --json` for structured output. |
 | `profiles` | List available Chrome profiles. |
-When these commands are invoked through the native `agent_browser` tool, structured diagnostic/status outputs are rendered as compact summaries. List-like outputs such as sessions, Chrome profiles, auth profiles, network requests, console messages, and page errors include counts and key fields; large outputs are previewed with a `Full output path:` spill file instead of dumping the entire payload into context. For `network requests`, the wrapper shows a failed-request summary split into actionable versus benign low-impact rows, then status, method, URL, resource/mime type, request id, and, when the installed upstream output includes body-like fields, bounded redacted payload, response, and failure/error snippets. Safe request IDs also produce `details.nextActions` for exact request details, actionable failed-request source lookup candidates, filtered request lists, or starting HAR capture before a repro. `network request <requestId>` can expose upstream full-detail body fields such as response bodies using the same bounded model-facing preview; its request URL stays diagnostic-only and does not overwrite `details.sessionTabTarget` for later ref guards. Header, cookie, auth, token, and other secret-like fields are not expanded in model-facing text or `details.data`; command echoes also redact `--body`, `--headers`, `--password`, proxy credentials, auth-bearing URLs, cookie/storage values, and bearer/basic credential text in positional arguments. Use upstream HAR or full raw details only when complete data is required.
+When these commands are invoked through the native `agent_browser` tool, structured diagnostic/status outputs are rendered as compact summaries. Local inspection/setup calls (`auth save/list/show/delete/remove`, `dashboard start/stop`, `device list`, `doctor`, `install`, `upgrade`, `profiles`, `session list`, `state list/show/rename`, `state clean --older-than <days>`, `state clear --all`, `state clear -a`, and `state clear <session-name>`) are sessionless unless you explicitly pass `--session`; context-dependent calls such as root `session`, untargeted `state clear`, `auth login`, `chat`, and `state save/load` keep normal session behavior. List-like outputs such as sessions, Chrome profiles, auth profiles, network requests, console messages, and page errors include counts and key fields; large outputs are previewed with a `Full output path:` spill file instead of dumping the entire payload into context. For `network requests`, the wrapper shows a failed-request summary split into actionable versus benign low-impact rows, then status, method, URL, resource/mime type, request id, and, when the installed upstream output includes body-like fields, bounded redacted payload, response, and failure/error snippets. Safe request IDs also produce `details.nextActions` for exact request details, actionable failed-request source lookup candidates, filtered request lists, or starting HAR capture before a repro. `network request <requestId>` can expose upstream full-detail body fields such as response bodies using the same bounded model-facing preview; its request URL stays diagnostic-only and does not overwrite `details.sessionTabTarget` for later ref guards. Header, cookie, auth, token, and other secret-like fields are not expanded in model-facing text or `details.data`; command echoes also redact `--body`, `--headers`, `--password`, proxy credentials, auth-bearing URLs, cookie/storage values, and bearer/basic credential text in positional arguments. Use upstream HAR or full raw details only when complete data is required.
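The command-echo redaction described above can be sketched for the three named flags. This Python example is a simplified illustration, not the wrapper's redactor: the `[REDACTED]` placeholder and function name are hypothetical, and it ignores the other categories listed (proxy credentials, auth-bearing URLs, cookie/storage values, bearer/basic text).

```python
SENSITIVE_FLAGS = {"--body", "--headers", "--password"}  # flags named in the text above


def redact_args(args, placeholder="[REDACTED]"):
    """Return a copy of `args` with the value after each sensitive flag masked."""
    redacted = []
    hide_next = False
    for token in args:
        if hide_next:
            redacted.append(placeholder)  # mask the flag's value
            hide_next = False
            continue
        redacted.append(token)
        hide_next = token in SENSITIVE_FLAGS
    return redacted
```

For example, `redact_args(["auth", "save", "demo", "--password", "hunter2"])` keeps the flag name visible while masking its value, which is why `--password-stdin` remains the preferred form: the secret never enters `args` at all.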
 ## Important global flags, config, and environment
@@ -633,6 +677,7 @@ When these commands are invoked through the native `agent_browser` tool, structu
 - `--download-path <path>`: default browser download directory. Environment: `AGENT_BROWSER_DOWNLOAD_PATH`.
 - `--engine <name>`: browser engine, `chrome` by default or `lightpanda`. Environment: `AGENT_BROWSER_ENGINE`.
 - `--no-auto-dialog`: disable automatic dismissal of alert/beforeunload dialogs. Environment: `AGENT_BROWSER_NO_AUTO_DIALOG`.
+- `--idle-timeout <ms>`: close idle sessions after the requested idle window when upstream owns that session lifecycle. The wrapper also sets `AGENT_BROWSER_IDLE_TIMEOUT_MS` for its managed-session backstop.
 ### Output, provider, policy, and AI flags
@@ -649,7 +694,7 @@ When these commands are invoked through the native `agent_browser` tool, structu
 - `--confirm-interactive`: interactive confirmations; auto-denies when stdin is not a TTY. Environment: `AGENT_BROWSER_CONFIRM_INTERACTIVE`.
 - `-p, --provider <name>`: provider such as `ios`, `browserbase`, `kernel`, `browseruse`, `browserless`, or `agentcore`. Environment: `AGENT_BROWSER_PROVIDER`.
 - `--device <name>`: iOS device name. Environment: `AGENT_BROWSER_IOS_DEVICE`.
-- Provider-specific iOS examples from upstream include `agent-browser -p ios device list`, `agent-browser -p ios swipe up`, and `agent-browser -p ios tap @e1`; in pi, pass those tokens through `args` rather than bash. iOS requires external Xcode/Appium setup, and cloud providers (`browserbase`, `kernel`, `browseruse`, `browserless`, `agentcore`) require their upstream accounts, credentials, and provider-specific environment variables. The wrapper forwards provider flags/env and stays thin; it does not emulate provider setup or cloud browser behavior.
+- Provider-specific iOS examples from upstream include `agent-browser -p ios device list`, `agent-browser -p ios swipe up`, and `agent-browser -p ios tap @e1`; in pi, pass those tokens through `args` rather than bash. iOS requires external Xcode/Appium setup, and cloud providers (`browserbase`, `kernel`, `browseruse`, `browserless`, `agentcore`) require their upstream accounts, credentials, and provider-specific environment variables. Common forwarded provider variables include `BROWSERBASE_API_KEY`, `BROWSERBASE_PROJECT_ID`, `BROWSERLESS_API_KEY`, `BROWSERLESS_API_URL`, `BROWSERLESS_BROWSER_TYPE`, `BROWSERLESS_STEALTH`, `BROWSERLESS_TTL`, `BROWSER_USE_API_KEY`, `KERNEL_API_KEY`, `KERNEL_HEADLESS`, `KERNEL_STEALTH`, `KERNEL_TIMEOUT_SECONDS`, `KERNEL_PROFILE_NAME`, `AGENTCORE_API_KEY`, `AGENTCORE_REGION`, `AGENTCORE_BROWSER_ID`, `AGENTCORE_PROFILE_ID`, `AGENTCORE_SESSION_TIMEOUT`, plus AWS names used by AgentCore such as `AWS_PROFILE`, `AWS_ACCESS_KEY_ID`, and `AWS_SECRET_ACCESS_KEY`. The wrapper forwards provider flags/env and stays thin; it does not emulate provider setup or cloud browser behavior.
 - `--model <name>`: AI model for `chat`. Environment: `AI_GATEWAY_MODEL`.
 - `-v, --verbose`: show tool commands and raw output.
 - `-q, --quiet`: show only AI text responses.
@@ -667,7 +712,7 @@ When these commands are invoked through the native `agent_browser` tool, structu
 Use `--config <path>` to load a specific config file. Boolean flags accept optional `true` or `false` values, such as `--headed false`, to override config. Browser extensions from user and project configs are merged rather than replaced.
-Other useful environment variables include `AGENT_BROWSER_DEFAULT_TIMEOUT`, `AGENT_BROWSER_STREAM_PORT`, `AGENT_BROWSER_IDLE_TIMEOUT_MS`, `AGENT_BROWSER_ENCRYPTION_KEY`, `AGENT_BROWSER_STATE_EXPIRE_DAYS`, `AGENT_BROWSER_IOS_DEVICE`, `AGENT_BROWSER_IOS_UDID`, `AI_GATEWAY_URL`, and `AI_GATEWAY_API_KEY`. The upstream child also receives every parent variable whose name starts with `AGENT_BROWSER_`, `AGENTCORE_`, `AI_GATEWAY_`, `BROWSERBASE_`, `BROWSERLESS_`, `BROWSER_USE_`, `KERNEL_`, or `XDG_`, plus the explicit inherited-name allowlist in `buildAgentBrowserProcessEnv` (`extensions/agent-browser/lib/process.ts`).
+Other useful environment variables include `AGENT_BROWSER_DEFAULT_TIMEOUT`, `AGENT_BROWSER_STREAM_PORT`, `AGENT_BROWSER_IDLE_TIMEOUT_MS`, `AGENT_BROWSER_ENCRYPTION_KEY`, `AGENT_BROWSER_STATE_EXPIRE_DAYS`, `AGENT_BROWSER_IOS_DEVICE`, `AGENT_BROWSER_IOS_UDID`, `AI_GATEWAY_URL`, `AI_GATEWAY_API_KEY`, the provider credential names listed above, and AWS credential names when using AgentCore. The upstream child also receives every parent variable whose name starts with `AGENT_BROWSER_`, `AGENTCORE_`, `AI_GATEWAY_`, `BROWSERBASE_`, `BROWSERLESS_`, `BROWSER_USE_`, `KERNEL_`, or `XDG_`, plus the explicit inherited-name allowlist in `buildAgentBrowserProcessEnv` (`extensions/agent-browser/lib/process.ts`).
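The prefix-based forwarding rule can be sketched as a simple filter. This Python version is a stand-in for illustration only: the real logic lives in `buildAgentBrowserProcessEnv`, whose explicit inherited-name allowlist is not reproduced here, so `extra_allowlist` and the function name are hypothetical.

```python
# Prefixes copied from the paragraph above.
FORWARDED_PREFIXES = (
    "AGENT_BROWSER_",
    "AGENTCORE_",
    "AI_GATEWAY_",
    "BROWSERBASE_",
    "BROWSERLESS_",
    "BROWSER_USE_",
    "KERNEL_",
    "XDG_",
)


def build_child_env(parent_env, extra_allowlist=()):
    """Keep variables matching a forwarded prefix or named in the allowlist."""
    allowed = set(extra_allowlist)
    return {
        name: value
        for name, value in parent_env.items()
        if name.startswith(FORWARDED_PREFIXES) or name in allowed
    }
```

Under this sketch, a variable such as `AWS_PROFILE` reaches the child only if it appears in the allowlist, since it matches none of the prefixes.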
 ## Wrapper-specific behavior worth knowing
@@ -694,20 +739,48 @@ Other useful environment variables include `AGENT_BROWSER_DEFAULT_TIMEOUT`, `AGE
 This generated block is review data for maintainers. The human-authored reference sections above remain the readable command guide.
+#### Source evidence
+- repository: `vercel-labs/agent-browser`
+- upstream HEAD: `4ad284890cb59564af603e6de403dd75dd19e832`
+- upstream package version: `0.27.0`
+- inspected: `agent-browser --version`
+- inspected: `agent-browser --help`
+- inspected: `selected agent-browser <command> --help output`
+- inspected: `README.md`
+- inspected: `CHANGELOG.md`
+- inspected: `agent-browser.schema.json`
+- inspected: `cli/src/commands.rs`
+- inspected: `cli/src/flags.rs`
 #### Upstream help commands sampled
 - root help: `agent-browser --help`
 - skills help: `agent-browser skills --help`
 - skills list: `agent-browser skills list`
 - core skill full: `agent-browser skills get core --full`
+- open help: `agent-browser open --help`
+- click help: `agent-browser click --help`
+- key help: `agent-browser key --help`
+- scroll help: `agent-browser scroll --help`
+- scrollinto help: `agent-browser scrollinto --help`
+- keydown help: `agent-browser keydown --help`
+- keyup help: `agent-browser keyup --help`
+- get help: `agent-browser get --help`
+- is help: `agent-browser is --help`
+- mouse help: `agent-browser mouse --help`
+- set help: `agent-browser set --help`
 - tab help: `agent-browser tab --help`
 - snapshot help: `agent-browser snapshot --help`
+- eval help: `agent-browser eval --help`
 - wait help: `agent-browser wait --help`
 - screenshot help: `agent-browser screenshot --help`
+- pdf help: `agent-browser pdf --help`
+- close help: `agent-browser close --help`
 - find help: `agent-browser find --help`
 - network help: `agent-browser network --help`
 - cookies help: `agent-browser cookies --help`
 - storage help: `agent-browser storage --help`
 - state help: `agent-browser state --help`
+- session help: `agent-browser session --help`
 - frame help: `agent-browser frame --help`
 - dialog help: `agent-browser dialog --help`
 - window help: `agent-browser window --help`
@@ -722,14 +795,23 @@ This generated block is review data for maintainers. The human-authored referenc
 - trace help: `agent-browser trace --help`
 - profiler help: `agent-browser profiler --help`
 - record help: `agent-browser record --help`
+- console help: `agent-browser console --help`
+- errors help: `agent-browser errors --help`
+- clipboard help: `agent-browser clipboard --help`
+- tap help: `agent-browser tap --help`
+- swipe help: `agent-browser swipe --help`
+- device help: `agent-browser device --help`
+- install help: `agent-browser install --help`
+- upgrade help: `agent-browser upgrade --help`
+- profiles help: `agent-browser profiles --help`
 #### Inventory sections
-- Built-in skills: 10 human-doc token(s), 11 upstream token(s)
-- Core page, element, navigation, and extraction commands: 38 human-doc token(s), 40 upstream token(s)
-- Sessions, state, tabs, frames, dialogs, and windows: 12 human-doc token(s), 8 upstream token(s)
-- Network, storage, artifacts, diagnostics, and performance: 29 human-doc token(s), 33 upstream token(s)
-- Batch, auth, confirmations, setup, dashboard, and AI commands: 19 human-doc token(s), 17 upstream token(s)
-- Global flags, config, providers, policy, and environment: 95 human-doc token(s), 90 upstream token(s)
+- Built-in skills: 13 human-doc token(s), 13 upstream token(s)
+- Core page, element, navigation, and extraction commands: 74 human-doc token(s), 74 upstream token(s)
+- Sessions, state, tabs, frames, dialogs, and windows: 20 human-doc token(s), 16 upstream token(s)
+- Network, storage, artifacts, diagnostics, and performance: 42 human-doc token(s), 51 upstream token(s)
+- Batch, auth, confirmations, setup, dashboard, devices, and AI commands: 24 human-doc token(s), 24 upstream token(s)
+- Global flags, config, providers, policy, and environment: 117 human-doc token(s), 90 upstream token(s)
 #### Human-authored doc tokens required
 ##### Built-in skills
@@ -737,20 +819,30 @@ This generated block is review data for maintainers. The human-authored referenc
 - `skills get core`
 - `skills get core --full`
 - `skills get <name>`
+- `skills get <name> --full`
+- `skills get --all`
 - `skills get electron`
 - `skills get slack`
 - `skills get dogfood`
 - `skills get vercel-sandbox`
 - `skills get agentcore`
 - `skills path [name]`
+- `AGENT_BROWSER_SKILLS_DIR`
 ##### Core page, element, navigation, and extraction commands
+- `open [url]`
 - `open <url>`
+- `goto <url>`
+- `navigate <url>`
 - `click <sel>`
+- `click <sel> --new-tab`
 - `dblclick <sel>`
 - `type <sel> <text>`
 - `fill <sel> <text>`
 - `press <key>`
+- `key <key>`
+- `keydown <key>`
+- `keyup <key>`
 - `keyboard type <text>`
 - `keyboard inserttext <text>`
 - `keydown Shift`
@@ -764,33 +856,70 @@ This generated block is review data for maintainers. The human-authored referenc
 - `upload <sel> <files...>`
 - `download <sel> <path>`
 - `scroll <dir> [px]`
+- `scroll <dir> [px] --selector <sel>`
 - `scrollintoview <sel>`
+- `scrollinto <sel>`
 - `wait <sel|ms>`
+- `wait --url <pattern>`
+- `wait --load <state>`
+- `wait --fn <expression>`
+- `wait --text <text>`
+- `wait --download [path]`
+- `screenshot [selector] [path]`
 - `screenshot [path]`
 - `screenshot --full`
 - `screenshot --annotate`
 - `pdf <path>`
 - `snapshot`
+- `snapshot --cursor`
+- `snapshot --interactive`
+- `snapshot --urls`
+- `snapshot --compact`
+- `snapshot --depth <n>`
+- `snapshot --selector <sel>`
 - `eval <js>`
+- `eval --stdin`
+- `eval -b <base64>`
 - `connect <port|url>`
 - `close [--all]`
+- `quit`
+- `exit`
 - `back`
 - `forward`
 - `reload`
 - `pushstate <url>`
 - `get <what> [selector]`
+- `get cdp-url`
+- `get box <selector>`
+- `get styles <selector>`
 - `is <what> <selector>`
 - `find <locator> <value> <action>`
+- `find first <sel>`
+- `find last <sel>`
+- `find nth <n> <sel>`
+- `find role <role> --name <name>`
+- `find ... --exact`
 - `mouse <action> [args]`
 - `set <setting> [value]`
+- `set media <features>`
+- `tap <selector>`
+- `swipe <direction> [distance]`
 ##### Sessions, state, tabs, frames, dialogs, and windows
 - `session`
 - `session list`
 - `state save <path>`
 - `state load <path>`
+- `state list`
+- `state show <filename>`
+- `state rename <old-name> <new-name>`
+- `state clear [session-name] [--all]`
+- `state clear -a`
+- `state clean --older-than <days>`
 - `tab list`
+- `tab new [url]`
 - `tab new --label <name> [url]`
+- `tab close [target]`
 - `tab <t<N>|label>`
 - `frame <selector|main>`
 - `dialog accept [text]`
@@ -801,13 +930,21 @@ This generated block is review data for maintainers. The human-authored referenc
 ##### Network, storage, artifacts, diagnostics, and performance
 - `network <action>`
 - `network route <url> [--abort|--body <json>] [--resource-type <csv>]`
+- `network unroute [url]`
+- `network requests [--clear] [--filter <pattern>] [--type <csv>] [--method <method>] [--status <code|range>]`
 - `network request <requestId>`
+- `network har start`
+- `network har stop [path]`
 - `cookies [get|set|clear]`
+- `cookies set <name> <value> --url <url> --domain <domain> --path <path> --httpOnly --secure --sameSite <Strict|Lax|None> --expires <timestamp>`
 - `cookies set --curl <file>`
 - `storage <local|session>`
 - `diff snapshot`
+- `diff snapshot --baseline <file> --selector <sel> --compact --depth <n>`
 - `diff screenshot --baseline`
+- `diff screenshot --baseline <file> --output <file> --threshold <0-1> --selector <sel> --full`
 - `diff url <u1> <u2>`
+- `diff url <u1> <u2> --screenshot --wait-until <strategy> --selector <sel> --compact --depth <n>`
 - `trace start|stop [path]`
 - `profiler start|stop [path]`
 - `record start <path> [url]`
@@ -818,6 +955,10 @@ This generated block is review data for maintainers. The human-authored referenc
 - `highlight <sel>`
 - `inspect`
 - `clipboard <op> [text]`
+- `clipboard read`
+- `clipboard write <text>`
+- `clipboard copy`
+- `clipboard paste`
 - `stream enable [--port <n>]`
 - `stream disable`
 - `stream status`
@@ -827,21 +968,27 @@ This generated block is review data for maintainers. The human-authored referenc
 - `react renders stop [--json]`
 - `react suspense [--only-dynamic] [--json]`
 - `vitals [url] [--json]`
+- `web-vitals [url] [--json]`
 - `removeinitscript <id>`
-##### Batch, auth, confirmations, setup, dashboard, and AI commands
+##### Batch, auth, confirmations, setup, dashboard, devices, and AI commands
 - `batch [--bail]`
 - `auth save <name>`
+- `auth save <name> --url <url> --username <user> --password <pass>`
+- `auth save <name> --username-selector <s> --password-selector <s> --submit-selector <s>`
 - `auth save <name> --password-stdin`
 - `auth login <name>`
 - `auth list`
 - `auth show <name>`
 - `auth delete <name>`
+- `auth remove <name>`
 - `confirm <id>`
 - `deny <id>`
 - `chat <message>`
+- `dashboard [start]`
 - `dashboard start --port <n>`
 - `dashboard stop`
+- `device list`
 - `install`
 - `install --with-deps`
 - `upgrade`
@@ -939,6 +1086,7 @@ This generated block is review data for maintainers. The human-authored referenc
 - `AGENT_BROWSER_DEBUG`
 - `AGENT_BROWSER_CONFIG`
 - `AGENT_BROWSER_DEFAULT_TIMEOUT`
+- `--idle-timeout <ms>`
 - `AGENT_BROWSER_STREAM_PORT`
 - `AGENT_BROWSER_IDLE_TIMEOUT_MS`
 - `AGENT_BROWSER_ENCRYPTION_KEY`
@@ -946,11 +1094,34 @@ This generated block is review data for maintainers. The human-authored referenc
 - `AGENT_BROWSER_IOS_UDID`
 - `AI_GATEWAY_URL`
 - `AI_GATEWAY_API_KEY`
+- `BROWSERBASE_API_KEY`
+- `BROWSERBASE_PROJECT_ID`
+- `BROWSERLESS_API_KEY`
+- `BROWSERLESS_API_URL`
+- `BROWSERLESS_BROWSER_TYPE`
+- `BROWSERLESS_STEALTH`
+- `BROWSERLESS_TTL`
+- `BROWSER_USE_API_KEY`
+- `KERNEL_API_KEY`
+- `KERNEL_HEADLESS`
+- `KERNEL_STEALTH`
+- `KERNEL_TIMEOUT_SECONDS`
+- `KERNEL_PROFILE_NAME`
+- `AGENTCORE_API_KEY`
+- `AGENTCORE_REGION`
+- `AGENTCORE_BROWSER_ID`
+- `AGENTCORE_PROFILE_ID`
+- `AGENTCORE_SESSION_TIMEOUT`
+- `AWS_PROFILE`
+- `AWS_ACCESS_KEY_ID`
+- `AWS_SECRET_ACCESS_KEY`
 #### Upstream help tokens expected
 ##### Built-in skills
 - root help: `skills get core --full`
 - skills help: `get <name> --full`
+- skills help: `get --all`
+- skills help: `AGENT_BROWSER_SKILLS_DIR`
 - skills list: `core`
 - skills list: `electron`
 - skills list: `slack`
@@ -962,12 +1133,18 @@ This generated block is review data for maintainers. The human-authored referenc
 - core skill full: `agent-browser state save ./auth.json`
 ##### Core page, element, navigation, and extraction commands
+- open help: `open [url]`
+- open help: `aliases still require a URL.`
 - root help: `open <url>`
 - root help: `click <sel>`
+- click help: `--new-tab`
 - root help: `dblclick <sel>`
 - root help: `type <sel> <text>`
 - root help: `fill <sel> <text>`
 - root help: `press <key>`
+- key help: `Aliases: key`
+- keydown help: `keydown <key>`
+- keyup help: `keyup <key>`
 - root help: `keyboard type <text>`
 - root help: `keyboard inserttext <text>`
 - root help: `hover <sel>`
@@ -979,35 +1156,71 @@ This generated block is review data for maintainers. The human-authored referenc
 - root help: `upload <sel> <files...>`
 - root help: `download <sel> <path>`
 - root help: `scroll <dir> [px]`
+- scroll help: `--selector <sel>`
 - root help: `scrollintoview <sel>`
+- scrollinto help: `Aliases: scrollinto`
 - root help: `wait <sel|ms>`
+- wait help: `--url <pattern>`
+- wait help: `--load <state>`
+- wait help: `--fn <expression>`
+- wait help: `--text <text>`
+- wait help: `--download [path]`
 - root help: `screenshot [path]`
+- screenshot help: `screenshot [selector] [path]`
 - root help: `pdf <path>`
+- pdf help: `Save page as PDF`
 - root help: `snapshot`
+- snapshot help: `--interactive`
+- snapshot help: `--urls`
+- snapshot help: `--compact`
+- snapshot help: `--depth <n>`
+- snapshot help: `--selector <sel>`
 - root help: `eval <js>`
+- eval help: `--stdin`
+- eval help: `-b, --base64`
 - root help: `connect <port|url>`
 - root help: `close [--all]`
+- close help: `Aliases: quit, exit`
 - root help: `back`
 - root help: `forward`
 - root help: `reload`
 - root help: `pushstate <url>`
 - root help: `Get Info:  agent-browser get <what> [selector]`
+- get help: `box <selector>`
+- get help: `styles <selector>`
+- get help: `cdp-url`
 - root help: `Check State:  agent-browser is <what> <selector>`
 - root help: `Find Elements:  agent-browser find <locator> <value> <action> [text]`
+- find help: `first <selector>`
+- find help: `last <selector>`
+- find help: `nth <index> <selector>`
+- find help: `--name <name>`
+- find help: `--exact`
 - root help: `Mouse:  agent-browser mouse <action> [args]`
 - root help: `Browser Settings:  agent-browser set <setting> [value]`
+- set help: `media [dark|light]`
 - keyboard help: `type <text>`
 - keyboard help: `inserttext <text>`
 - screenshot help: `--full, -f`
 - screenshot help: `--annotate`
 - find help: `role <role>`
 - find help: `testid <id>`
+- tap help: `tap <selector>`
+- swipe help: `swipe <direction> [distance]`
 ##### Sessions, state, tabs, frames, dialogs, and windows
 - root help: `session list`
 - state help: `save <path>`
 - state help: `load <path>`
+- state help: `list`
+- state help: `show <filename>`
+- state help: `rename <old-name> <new-name>`
+- state help: `clear [session-name] [--all]`
+- state help: `agent-browser state clear --all`
+- state help: `clean --older-than <days>`
+- tab help: `new [url]`
 - tab help: `new --label <name> [url]`
+- tab help: `close [t<N>|label]`
 - tab help: `Stable tab ids`
 - frame help: `frame <selector|main>`
 - dialog help: `dialog <accept|dismiss|status> [text]`
@@ -1016,6 +1229,9 @@ This generated block is review data for maintainers. The human-authored referenc
 ##### Network, storage, artifacts, diagnostics, and performance
 - root help: `network <action>`
 - root help: `--resource-type <csv>`
+- network help: `unroute [url]`
+- network help: `network har start`
+- network help: `network har stop ./capture.har`
 - root help: `cookies [get|set|clear]`
 - root help: `cookies set --curl <file>`
 - root help: `storage <local|session>`
@@ -1030,6 +1246,10 @@ This generated block is review data for maintainers. The human-authored referenc
 - root help: `highlight <sel>`
 - root help: `inspect`
 - root help: `clipboard <op> [text]`
+- clipboard help: `read`
+- clipboard help: `write <text>`
+- clipboard help: `copy`
+- clipboard help: `paste`
 - root help: `stream enable [--port <n>]`
 - root help: `stream disable`
 - root help: `stream status`
@@ -1040,15 +1260,26 @@ This generated block is review data for maintainers. The human-authored referenc
 - root help: `react suspense [--only-dynamic] [--json]`
 - root help: `vitals [url] [--json]`
 - root help: `removeinitscript <id>`
+- network help: `requests [options]`
+- network help: `--type <types>`
+- network help: `--method <method>`
+- network help: `--status <code>`
 - network help: `request <requestId>`
 - network help: `har <start|stop>`
 - storage help: `set <key> <value>`
+- diff help: `diff snapshot [options]`
+- diff help: `--baseline <f>`
+- diff help: `--output <file>`
+- diff help: `--threshold <0-1>`
+- diff help: `--wait-until <strategy>`
 - diff help: `diff screenshot --baseline <f>`
 - trace help: `trace <operation> [path]`
 - profiler help: `--categories <list>`
 - record help: `record restart <path.webm> [url]`
+- console help: `--clear`
+- errors help: `--clear`
-##### Batch, auth, confirmations, setup, dashboard, and AI commands
+##### Batch, auth, confirmations, setup, dashboard, devices, and AI commands
 - root help: `batch [--bail]`
 - root help: `auth save <name>`
 - root help: `auth login <name>`
@@ -1056,12 +1287,19 @@ This generated block is review data for maintainers. The human-authored referenc
 - root help: `deny <id>`
 - root help: `chat <message>`
 - root help: `dashboard start --port <n>`
+- device help: `device list`
 - root help: `install --with-deps`
 - root help: `upgrade`
 - root help: `doctor [--fix]`
 - root help: `profiles`
 - batch help: `--bail`
+- auth help: `--url <url>`
+- auth help: `--username <user>`
+- auth help: `--password <pass>`
 - auth help: `--password-stdin`
+- auth help: `--username-selector <s>`
+- auth help: `--password-selector <s>`
+- auth help: `--submit-selector <s>`
 - dashboard help: `dashboard [start|stop] [options]`
 - chat help: `chat <message>`
 - doctor help: `--offline`
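
The "Upstream help tokens expected" entries above each pair a help source (for example `state help`) with a literal token expected in that help output. A drift check over such entries could be sketched as a plain substring test. This is a minimal illustration under stated assumptions, not the project's `npm run verify -- command-reference` implementation: the function names `parse_expected_tokens` and `missing_tokens` are hypothetical, and the `captured_help` text is invented sample data rather than real `agent-browser` output.

```python
import re
from typing import Dict, List, Tuple

# Matches generated entries such as: - find help: `--exact`
# The source label ends at the first colon; the token is the backticked remainder.
ENTRY = re.compile(r"^- (?P<source>[^:`]+): `(?P<token>.+)`$")


def parse_expected_tokens(lines: List[str]) -> List[Tuple[str, str]]:
    """Extract (source, token) pairs from reference list lines (hypothetical helper)."""
    entries = []
    for line in lines:
        match = ENTRY.match(line.strip())
        if match:
            entries.append((match.group("source"), match.group("token")))
    return entries


def missing_tokens(
    entries: List[Tuple[str, str]], help_text_by_source: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Return the entries whose token does not appear in the captured help text."""
    missing = []
    for source, token in entries:
        # A source with no captured help counts as missing every token.
        if token not in help_text_by_source.get(source, ""):
            missing.append((source, token))
    return missing


# Entries copied from the reference lists above.
reference_lines = [
    "- state help: `list`",
    "- state help: `clean --older-than <days>`",
    "- clipboard help: `paste`",
]

# Invented help text for illustration only (assumption, not real CLI output).
captured_help = {
    "state help": "list\nclear [session-name] [--all]",
    "clipboard help": "read\nwrite <text>\ncopy\npaste",
}

entries = parse_expected_tokens(reference_lines)
drift = missing_tokens(entries, captured_help)
```

With this sample data, `drift` contains only the `clean --older-than <days>` entry, because the invented `state help` text omits it. A substring test keeps the check cheap but cannot distinguish a token that moved to a different subcommand's help from one that stayed in place.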