pi-agent-browser-native 0.2.22 → 0.2.24

package/CHANGELOG.md CHANGED
@@ -2,6 +2,22 @@
2
2
 
3
3
  ## Unreleased
4
4
 
5
+ ## 0.2.24 - 2026-05-11
6
+
7
+ ### Added
8
+ - added custom `agent_browser` TUI rendering with colorized call/output text and built-in-style visual truncation for long visible output while preserving model-facing tool content
9
+
10
+ ## 0.2.23 - 2026-05-10
11
+
12
+ ### Fixed
13
+ - added safe `auth save --password-stdin` support for native tool calls and redacted password stdin from model-visible content, tool details, upstream failure output, and preserved parse-failure spill files
14
+ - improved session and launch-flag handling for agent workflows, including disabled `--auto-connect`, optional boolean flag values, dash-starting `--args` values, and stale `@ref` recovery guidance through pinned commands and user batch stdin
15
+ - expanded sensitive argument redaction for password and credential command forms
16
+
17
+ ### Changed
18
+ - rewrote the public README around outcome-first usage, fastest install paths, profile/auth workflow guidance, and release verification proof
19
+ - clarified native-tool command guidance for password stdin, cookie/privacy handling, stable tab ids, and explicit session persistence limits
20
+
5
21
  ## 0.2.22 - 2026-05-07
6
22
 
7
23
  ### Compatibility
package/README.md CHANGED
@@ -1,61 +1,86 @@
1
1
  # pi-agent-browser-native
2
2
 
3
- Native `pi` integration for [`agent-browser`](https://agent-browser.dev/).
3
+ A Pi extension that lets coding agents drive real browser sessions with a native `agent_browser` tool instead of brittle shell commands.
4
4
 
5
- ## Status
5
+ It is for Pi users who want agents to browse sites, inspect pages, click through flows, capture screenshots, use persistent profiles, and handle authenticated web apps without spending context on `agent-browser` CLI ceremony.
6
6
 
7
- Published pre-1.0 package.
7
+ ## What this looks like in Pi
8
8
 
9
- The native `agent_browser` tool, local verification workflow, package-content checks, and release checks are in place. Package install is the default path; checkout loading is for development and validation.
9
+ You prompt the agent in plain English:
10
10
 
11
- ## Goal
11
+ ```text
12
+ Use the agent_browser tool to open https://react.dev and then take an interactive snapshot.
13
+ ```
12
14
 
13
- Expose `agent-browser` to `pi` as a native tool so agents can automate the browser without going through a bash-backed skill.
15
+ The agent gets a native tool, not a bash workaround:
14
16
 
15
- ## Product stance
17
+ ```json
18
+ { "args": ["open", "https://react.dev"] }
19
+ { "args": ["snapshot", "-i"] }
20
+ ```
16
21
 
17
- - **Not bundled**: users install `agent-browser` separately and keep it on `PATH`
18
- - **Latest-version only**: no backward-compatibility support or shims for older `agent-browser` versions
19
- - **Thin wrapper**: stay close to upstream `agent-browser` instead of re-implementing its CLI
20
- - **Agent-invoked first**: the main UX is the agent calling the tool directly, like `read` or `write`
21
- - **Global-install first**: package behavior matters more than repo-local development wiring
22
+ The result is optimized for agent work:
22
23
 
23
- Upstream install/docs:
24
- - https://agent-browser.dev/
25
- - https://github.com/vercel-labs/agent-browser
24
+ - compact page snapshots that lead with useful page content instead of chrome/sidebar noise
25
+ - interactive `@eN` refs for follow-up clicks and form fills
26
+ - screenshots and downloaded files surfaced as Pi artifacts
27
+ - structured details for titles, URLs, saved files, sessions, and errors
28
+ - spill files for oversized raw output instead of dumping pages into context
29
+ - compact, colorized Pi TUI rows that can be expanded without changing what the agent receives
30
+ - recovery hints when a tab, selector, stale `@ref`, or launch mode needs a different next step
31
+
32
+ ## Who this is for
26
33
 
27
- ## Why this exists
34
+ - **Pi users** who want browser automation available as a normal tool beside `read`, `write`, and `bash`.
35
+ - **Coding agents** that need low-context browser workflows for docs, QA, research, dashboards, and web apps.
36
+ - **Maintainers** who want a thin integration that tracks the current upstream [`agent-browser`](https://agent-browser.dev/) CLI without bundling or re-implementing it.
28
37
 
29
- A native `pi` integration can improve on the current skill by adding:
38
+ ## The problem
30
39
 
31
- - structured tool calls instead of shell strings
32
- - parsed results instead of bash stdout
33
- - compact model-facing snapshot shaping with full raw spill files for oversized pages
34
- - main-content-first snapshot previews so the model sees the important page region before unrelated chrome or sidebar noise
35
- - inline screenshots and artifacts
36
- - lightweight session convenience inside `pi`
37
- - a better base for serious browser automation
40
+ `agent-browser` is powerful, but plain CLI use is awkward inside an agent harness:
38
41
 
39
- ## Example use cases
42
+ - shell strings are easy for agents to quote wrong
43
+ - large page snapshots can waste model context
44
+ - screenshots and downloads need artifact metadata, not just text paths
45
+ - implicit browser sessions need predictable reuse and cleanup
46
+ - profile/debug launches need a clear way to start fresh after public browsing
47
+ - secrets and auth material must not be echoed into model-visible output
48
+ - stale element refs need actionable recovery guidance, not generic failures
40
49
 
41
- - UI testing and exploratory QA
42
- - web research
43
- - driving web UIs for ChatGPT, Grok, Gemini, and Claude
44
- - authenticated browser sessions and persistent profiles
50
+ `pi-agent-browser-native` keeps upstream `agent-browser` as the browser engine and adds the Pi-native wrapper behavior needed for reliable agent use.
45
51
 
46
- ## Install and try
52
+ ## What it does
47
53
 
48
- The product direction is package-first. Prefer the package source for normal use; keep the local-checkout flow for development and pre-release validation.
54
+ | Pain | Native wrapper capability | Proof surface |
55
+ |---|---|---|
56
+ | Agents build fragile shell commands | Exposes `agent_browser` with exact `args`, controlled `stdin`, and `sessionMode` fields | `extensions/agent-browser/index.ts`, [`docs/TOOL_CONTRACT.md`](docs/TOOL_CONTRACT.md) |
57
+ | Page snapshots are too large | Shows compact, main-content-first summaries and stores full raw output in spill files when needed | `test/agent-browser.presentation.test.ts` |
58
+ | Screenshots/downloads get lost in text | Normalizes artifact paths and reports existence, size, cwd, session, and repair status | [`docs/COMMAND_REFERENCE.md`](docs/COMMAND_REFERENCE.md#download-screenshot-and-pdf-files) |
59
+ | Profile restores and tab drift confuse agents | Tracks managed sessions, pins intended tabs, and re-selects target tabs after drift | generated tab-recovery notes below; `test/agent-browser.resume-state.test.ts` |
60
+ | Auth/profile workflows can leak secrets | Supports `auth save --password-stdin` and redacts sensitive args, URLs, stdout/stderr, details, and parse-failure spills | `test/agent-browser.extension-validation.test.ts` |
61
+ | Stale `@eN` refs fail mysteriously | Adds recovery guidance to rerun `snapshot -i` or use stable `find` locators | `test/agent-browser.results.test.ts` |
62
+ | Direct binary help may be blocked in agent sessions | Publishes a repo-readable command reference and verifies it against the target upstream version | `npm run verify` |
49
63
 
50
- ### Preferred package install
64
+ ## Fastest way to try it
51
65
 
52
- Install `agent-browser` separately, then install this package into `pi`:
66
+ Install upstream `agent-browser` first and make sure it is on `PATH`:
67
+
68
+ - https://agent-browser.dev/
69
+ - https://github.com/vercel-labs/agent-browser
70
+
71
+ Then install this Pi package:
53
72
 
54
73
  ```bash
55
74
  pi install npm:pi-agent-browser-native
56
75
  ```
57
76
 
58
- To try a published package without installing it permanently, isolate that temporary package source from any configured checkout or global install:
77
+ Start Pi and ask for a browser action:
78
+
79
+ ```text
80
+ Use the agent_browser tool to open https://example.com and then take an interactive snapshot.
81
+ ```
82
+
83
+ For a one-off trial that does not touch your configured Pi extensions:
59
84
 
60
85
  ```bash
61
86
  pi --no-extensions -e npm:pi-agent-browser-native
@@ -67,127 +92,123 @@ For a specific published version:
67
92
  pi --no-extensions -e npm:pi-agent-browser-native@<version>
68
93
  ```
69
94
 
70
- ### First-run doctor
71
-
72
- Run the package doctor before first use or when `agent_browser` is missing or duplicated:
73
-
74
- ```bash
75
- pi-agent-browser-doctor
76
- # one-off without installing the package source permanently:
77
- npm exec --package pi-agent-browser-native -- pi-agent-browser-doctor
78
- # from a checkout:
79
- npm run doctor
80
- ```
81
-
82
- The doctor is read-only. It checks that upstream `agent-browser` is on `PATH`, that `agent-browser --version` matches the wrapper's capability baseline, and that Pi settings do not point at multiple active `pi-agent-browser-native` sources. It does not run upstream `agent-browser doctor --fix` or edit Pi settings.
83
-
84
- If it reports duplicate sources, keep exactly one active source. For normal use, keep `pi install npm:pi-agent-browser-native` and remove checkout paths from Pi settings. For temporary package or checkout trials, use `pi --no-extensions -e npm:pi-agent-browser-native[@<version>]` or `pi --no-extensions -e /path/to/checkout` so configured sources are bypassed.
85
-
86
- ### GitHub install
87
-
88
- For the source install path, prefer the repository URL:
95
+ To install directly from source instead of npm:
89
96
 
90
97
  ```bash
91
98
  pi install https://github.com/fitchmultz/pi-agent-browser-native
92
99
  ```
93
100
 
94
- To try the GitHub source without installing it permanently, isolate that temporary source extension from your normal installed package set:
101
+ For a temporary source trial, keep it isolated from your normal package sources:
95
102
 
96
103
  ```bash
97
104
  pi --no-extensions -e https://github.com/fitchmultz/pi-agent-browser-native
98
105
  ```
99
106
 
100
- This avoids duplicate `agent_browser` registrations when you already have `pi-agent-browser-native` installed globally.
101
-
102
- ### Current practical local-checkout flows
103
-
104
- This repository's `package.json` is itself a publishable pi package manifest that points at `extensions/agent-browser/index.ts`. That file is the real extension entrypoint for both the checkout and the published package.
105
-
106
- Use two local-checkout modes intentionally:
107
-
108
- - **Quick isolated smoke test:** run the checkout explicitly with `-e` and disable extension discovery:
109
-
110
- ```bash
111
- pi --no-extensions -e /absolute/path/to/pi-agent-browser-native
112
- ```
113
-
114
- This bypasses Pi settings and any configured checkout/global package sources, so it avoids duplicate `agent_browser` registrations. After editing extension code, restart this `pi` process to validate the new source; do not use this mode as proof that configured-source `/reload` works.
107
+ ## First-run health check
115
108
 
116
- - **Configured-source lifecycle validation:** run `npm run verify -- lifecycle` for the opt-in automated tmux harness, or keep exactly one active source for this extension in Pi settings and launch plain `pi` for manual checks. Use this mode when validating `/reload`, full restart, and `/resume` behavior because Pi's reload flow operates on discovered/configured resources.
109
+ Run the read-only doctor when installing, upgrading, or debugging missing/duplicated tools:
117
110
 
118
- The native tool exposed to the agent is named `agent_browser`.
119
-
120
- The primary session control parameter is `sessionMode`:
111
+ ```bash
112
+ pi-agent-browser-doctor
113
+ # one-off without permanent install:
114
+ npm exec --package pi-agent-browser-native -- pi-agent-browser-doctor
115
+ # from this checkout:
116
+ npm run doctor
117
+ ```
121
118
 
122
- - `"auto"` (default) reuses the extension-managed `pi`-scoped session when possible
123
- - `"fresh"` switches that managed session to a fresh upstream launch so launch-scoped flags like `--profile`, `--session-name`, `--cdp`, `--state`, `--auto-connect`, `--init-script`, and `--enable` apply and later auto calls follow the new browser
119
+ The doctor checks:
124
120
 
125
- ## Agent quick start
121
+ - upstream `agent-browser` exists on `PATH`
122
+ - the installed upstream version matches this wrapper's command-reference baseline
123
+ - Pi settings do not point at multiple active `pi-agent-browser-native` sources
126
124
 
127
- ### Mental model
125
+ It does **not** edit Pi settings and does **not** run upstream `agent-browser doctor --fix`.
128
126
 
129
- - `args`: exact CLI args after `agent-browser`
130
- - `stdin` — raw stdin only for `batch` and `eval --stdin` (other command/stdin combinations are rejected before `agent-browser` is launched)
131
- - `sessionMode`
132
- - `"auto"` — default, reuse the extension-managed `pi`-scoped session
133
- - `"fresh"` — switch that managed session to a new profile/debug launch
127
+ ## Common agent calls
134
128
 
135
- ### Common call shapes
129
+ You usually prompt the agent in natural language. These JSON snippets show the exact native tool shape the agent should use.
136
130
 
137
- Open a page, then take an interactive snapshot:
131
+ Open a page and inspect it:
138
132
 
139
133
  ```json
140
134
  { "args": ["open", "https://example.com"] }
141
135
  { "args": ["snapshot", "-i"] }
142
136
  ```
143
137
 
144
- Click a ref, then re-snapshot after navigation or a major DOM change:
138
+ Click a visible ref, then refresh refs after navigation or a DOM update:
145
139
 
146
140
  ```json
147
141
  { "args": ["click", "@e2"] }
148
142
  { "args": ["snapshot", "-i"] }
149
143
  ```
150
144
 
151
- Run a multi-step browser flow in one tool call:
145
+ Run a multi-step flow in one tool call:
152
146
 
153
147
  ```json
154
148
  { "args": ["batch"], "stdin": "[[\"open\",\"https://example.com\"],[\"snapshot\",\"-i\"]]" }
155
149
  ```
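Hand-escaping the nested JSON in `stdin` is easy to get wrong. A minimal TypeScript sketch, assuming a hypothetical helper named `batchCall` (it is not part of this package), builds the same call shape with `JSON.stringify`:

```typescript
// Hypothetical helper, not part of pi-agent-browser-native: builds the
// `batch` call shape shown above without hand-escaping the inner JSON.
type AgentBrowserCall = {
  args: string[];
  stdin?: string;
  sessionMode?: "auto" | "fresh";
};

function batchCall(steps: string[][]): AgentBrowserCall {
  // Each step is the list of CLI tokens for one command, e.g. ["snapshot", "-i"].
  return { args: ["batch"], stdin: JSON.stringify(steps) };
}

const call = batchCall([
  ["open", "https://example.com"],
  ["snapshot", "-i"],
]);
console.log(JSON.stringify(call));
```

Serializing the steps from an array keeps the quoting mechanical, which is the same reason the tool takes `args` as a list rather than a shell string.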
156
150
 
157
- Evaluate page JavaScript via stdin:
151
+ Evaluate page JavaScript through stdin:
158
152
 
159
153
  ```json
160
154
  { "args": ["eval", "--stdin"], "stdin": "document.title" }
161
155
  ```
162
156
 
163
- Download a file from a known link/control directly:
157
+ Save an auth profile without putting the password in `args`:
158
+
159
+ ```json
160
+ { "args": ["auth", "save", "demo", "--password-stdin"], "stdin": "<password>" }
161
+ ```
162
+
163
+ Download a file from a known link or control:
164
164
 
165
165
  ```json
166
166
  { "args": ["download", "@e5", "/tmp/report.pdf"] }
167
167
  ```
168
168
 
169
- For dashboards that start an export asynchronously after a click or navigation, click first and then wait for the download. The wrapper reports `Download completed: /tmp/report.csv` and exposes upstream-reported `details.savedFilePath` plus `details.savedFile` for the `wait` result; with upstream `agent-browser 0.27.0`, confirm `details.artifacts[].exists` before relying on a requested `wait --download <path>` file being present on disk (tracked upstream at [vercel-labs/agent-browser#1300](https://github.com/vercel-labs/agent-browser/issues/1300)):
169
+ For asynchronous exports, click first and then wait for the download:
170
170
 
171
171
  ```json
172
172
  { "args": ["click", "@export"] }
173
173
  { "args": ["wait", "--download", "/tmp/report.csv"] }
174
174
  ```
175
175
 
176
- Batch flows preserve the same saved-file metadata on the wait step:
176
+ With upstream `agent-browser 0.27.0`, treat `details.savedFilePath` as upstream-reported metadata and confirm `details.artifacts[].exists` before relying on the requested `wait --download <path>` file being present on disk.
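The check described above can be sketched in TypeScript. The field names `savedFilePath`, `artifacts`, `exists`, `path`, `requestedPath`, and `absolutePath` follow the README text; the exact types and the helper name `downloadIsOnDisk` are assumptions for illustration only:

```typescript
// Assumed shape of the `details` object on a `wait --download` result.
// Only the field names come from the README; the types are a guess.
type Artifact = {
  path?: string;
  requestedPath?: string;
  absolutePath?: string;
  exists?: boolean;
};
type WaitDetails = { savedFilePath?: string; artifacts?: Artifact[] };

// Hypothetical helper: true only when an artifact matching the requested
// path is explicitly reported as existing. `savedFilePath` alone is treated
// as upstream-reported metadata and is not trusted as proof of a file.
function downloadIsOnDisk(details: WaitDetails, requestedPath: string): boolean {
  const artifacts = details.artifacts ?? [];
  return artifacts.some(
    (a) =>
      a.exists === true &&
      (a.requestedPath === requestedPath ||
        a.path === requestedPath ||
        a.absolutePath === requestedPath)
  );
}
```

A caller would run this on the `wait` result before reading `/tmp/report.csv`, and fall back to another `wait` or a direct `download` when it returns `false`.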
177
+
178
+ Start a fresh profiled browser after the implicit public-browsing session already exists:
177
179
 
178
180
  ```json
179
- { "args": ["batch"], "stdin": "[[\"click\",\"@export\"],[\"wait\",\"--download\",\"/tmp/report.csv\"]]" }
181
+ { "args": ["--profile", "Default", "open", "https://example.com/account"], "sessionMode": "fresh" }
180
182
  ```
181
183
 
182
- Start a fresh profiled launch after you already used the implicit session:
184
+ After a successful unnamed fresh launch, later default `sessionMode: "auto"` calls follow that browser automatically.
185
+
186
+ ## Authenticated/profile workflows
187
+
188
+ The wrapper does not clone profiles or hide what upstream Chrome profile you chose. Passing `--profile` is an explicit upstream `agent-browser` choice.
189
+
190
+ Use these rules:
191
+
192
+ - Use public/temp profiles for tests and examples.
193
+ - Use `sessionMode: "fresh"` when switching from public browsing to `--profile`, `--session-name`, `--cdp`, `--state`, `--auto-connect`, `--init-script`, or `--enable`.
194
+ - Use `--session` when you want to manage a live upstream session name yourself.
195
+ - Do not treat `--session` as persisted auth or tab restore after `close`; use `--profile`, `--session-name`, or `--state` for persistence.
196
+ - Prefer page actions and storage checks over cookie dumps. `cookies get` can expose real profile cookies.
197
+ - Prefer `auth save --password-stdin` over putting passwords in `args`.
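The `sessionMode` rule in the list above can be expressed as a small pre-flight check. This is a sketch under stated assumptions: the helper names are hypothetical, the flag list is copied from the README, and accepting a `--flag=value` spelling is a guess about upstream syntax rather than documented behavior:

```typescript
// Launch-scoped flags as listed in the README.
const LAUNCH_SCOPED_FLAGS = [
  "--profile",
  "--session-name",
  "--cdp",
  "--state",
  "--auto-connect",
  "--init-script",
  "--enable",
];

// True when any token is a launch-scoped flag, either as a bare `--flag`
// token or in the assumed `--flag=value` form.
function hasLaunchScopedFlag(args: string[]): boolean {
  return args.some((token) =>
    LAUNCH_SCOPED_FLAGS.some(
      (flag) => token === flag || token.indexOf(flag + "=") === 0
    )
  );
}

// Hypothetical helper: suggest "fresh" only when launch-scoped flags are
// present and the caller is not naming the session via an explicit `--session`.
function suggestSessionMode(args: string[]): "auto" | "fresh" {
  const explicitSession = args.indexOf("--session") !== -1;
  return hasLaunchScopedFlag(args) && !explicitSession ? "fresh" : "auto";
}
```

The exact-token comparison matters here: `--session-name` is launch-scoped, while `--session` marks a user-managed session, so a prefix match on `--session` would conflate the two.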
198
+
199
+ Example explicit session plus profile launch:
183
200
 
184
201
  ```json
185
- { "args": ["--profile", "Default", "open", "https://example.com/account"], "sessionMode": "fresh" }
202
+ {
203
+ "args": ["--session", "auth-flow", "--profile", "Default", "open", "https://example.com/account"]
204
+ }
186
205
  ```
187
206
 
188
- After a successful unnamed fresh launch, later `sessionMode: "auto"` calls follow that new browser automatically.
207
+ ## React, SPA, and first-navigation setup
189
208
 
190
- React and SPA tooling added upstream in `agent-browser` v0.27.0 is passed through as native tool calls. Launch React introspection with the DevTools hook before first navigation, then use the `react` commands; `vitals` and `pushstate` work as regular command tokens:
209
+ React and SPA tooling from upstream `agent-browser` is passed through directly.
210
+
211
+ Launch React introspection before first navigation:
191
212
 
192
213
  ```json
193
214
  { "args": ["open", "--enable", "react-devtools", "https://example.com"], "sessionMode": "fresh" }
@@ -196,11 +217,16 @@ React and SPA tooling added upstream in `agent-browser` v0.27.0 is passed throug
196
217
  { "args": ["react", "renders", "start"] }
197
218
  { "args": ["react", "renders", "stop"] }
198
219
  { "args": ["react", "suspense", "--only-dynamic"] }
199
- { "args": ["vitals", "https://example.com", "--json"] }
220
+ ```
221
+
222
+ Use SPA and Web Vitals helpers as normal command tokens:
223
+
224
+ ```json
200
225
  { "args": ["pushstate", "/dashboard"] }
226
+ { "args": ["vitals", "https://example.com", "--json"] }
201
227
  ```
202
228
 
203
- For first-navigation setup, launch a fresh blank page before staging routes, cookies, or scripts:
229
+ For setup that must happen before first navigation, open a blank fresh page, stage routes/cookies/scripts, then navigate:
204
230
 
205
231
  ```json
206
232
  { "args": ["open"], "sessionMode": "fresh" }
@@ -209,68 +235,93 @@ For first-navigation setup, launch a fresh blank page before staging routes, coo
209
235
  { "args": ["navigate", "https://example.com"] }
210
236
  ```
211
237
 
212
- Name a new upstream session explicitly when you want to keep reusing it yourself:
238
+ ## Proof and verification
213
239
 
214
- ```json
215
- { "args": ["--session", "auth-flow", "open", "https://example.com"] }
240
+ The local verification gate is:
241
+
242
+ ```bash
243
+ npm run verify
216
244
  ```
217
245
 
218
- ### First useful prompt in a fresh `pi` session
246
+ It runs:
219
247
 
220
- ```text
221
- Use the agent_browser tool to open https://react.dev and then take an interactive snapshot.
248
+ - generated playbook/documentation drift checks
249
+ - `tsc --noEmit`
250
+ - the test suite
251
+ - command-reference baseline checks
252
+ - live command-reference verification against the targeted installed upstream `agent-browser`
253
+
254
+ The opt-in real-upstream suite is separate because it drives a real browser installation:
255
+
256
+ ```bash
257
+ npm run verify -- real-upstream
222
258
  ```
223
259
 
260
+ For package release confidence, follow [`docs/RELEASE.md`](docs/RELEASE.md). The release gate is:
261
+
262
+ ```bash
263
+ npm run doctor
264
+ npm run verify -- release
265
+ ```
266
+
267
+ `npm run verify -- release` includes the default verification gate plus packaged Pi smoke coverage. The package also has a `prepublishOnly` hook that runs default verification and `npm pack --dry-run` during `npm publish`.
268
+
269
+ ## How it works
270
+
271
+ `pi-agent-browser-native` is intentionally thin:
272
+
273
+ 1. Pi loads `extensions/agent-browser/index.ts` from the package manifest.
274
+ 2. The extension registers one native tool named `agent_browser`.
275
+ 3. Tool calls are translated into upstream `agent-browser` CLI invocations with controlled args, stdin, environment, timeout, and session planning.
276
+ 4. Upstream JSON/plain-text output is parsed into model-friendly content and structured details.
277
+ 5. Screenshots, downloads, recordings, traces, profiles, and spill files are normalized as Pi-visible artifacts where possible.
278
+ 6. Generated playbook text in docs and tool metadata stays aligned with `extensions/agent-browser/lib/playbook.ts`.
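Step 3 above can be illustrated with a simplified TypeScript sketch. It is not the extension's actual code: `planInvocation`, `acceptsStdin`, and both types are hypothetical names, and the stdin rule is inferred from the README's mentions of `batch`, `eval --stdin`, and `auth save --password-stdin`:

```typescript
// Hypothetical types for illustration; the real extension may differ.
type ToolCall = {
  args: string[];
  stdin?: string;
  sessionMode?: "auto" | "fresh";
};
type Invocation = { command: string; argv: string[]; stdin: string | undefined };

// Simplification: looks for command tokens anywhere in `args`, so leading
// global flags such as `--session <name>` do not hide the command.
function acceptsStdin(args: string[]): boolean {
  const has = (token: string) => args.indexOf(token) !== -1;
  if (has("batch")) return true;
  if (has("eval") && has("--stdin")) return true;
  return has("auth") && has("save") && has("--password-stdin");
}

function planInvocation(call: ToolCall): Invocation {
  // Reject unexpected stdin before anything would be launched.
  if (call.stdin !== undefined && !acceptsStdin(call.args)) {
    throw new Error("stdin is not accepted for this command");
  }
  // args stay an argv array, so no shell quoting is involved.
  return { command: "agent-browser", argv: call.args.slice(), stdin: call.stdin };
}
```

Keeping the plan as plain data (command, argv, stdin) separates validation from process spawning, which makes the rejection path testable without a browser installed.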
279
+
280
+ The upstream browser engine remains [`agent-browser`](https://agent-browser.dev/). This package does not bundle it and does not maintain compatibility shims for old upstream versions.
281
+
282
+ ## Current limits
283
+
284
+ - Published pre-1.0 package.
285
+ - Targets the current locally installed upstream `agent-browser` version only.
286
+ - Does not bundle `agent-browser`; users install it separately.
287
+ - Does not provide a human browser UI inside Pi; the primary UX is agent-invoked tool calls.
288
+ - Real authenticated profile use is powerful but sensitive. Treat profile and cookie access as user-approved, task-specific behavior.
289
+ - Wrapper tab/session recovery is best effort around observed upstream behavior, not a replacement for explicit profile/session design.
290
+
224
291
  ## Local development
225
292
 
226
- Do not track or rely on a repo-local `.pi/extensions/agent-browser.ts` autoload shim for this package. That creates an unnecessary second registration path.
293
+ Install upstream `agent-browser`, then install dependencies:
227
294
 
228
- The published entrypoint lives at `extensions/agent-browser/index.ts` and is referenced directly from this repo's `package.json`.
295
+ ```bash
296
+ npm install
297
+ ```
229
298
 
230
- Recommended local development setup:
231
- 1. Install `agent-browser` separately via the upstream project.
232
- 2. Run `npm install`.
233
- 3. For a quick checkout-only smoke test, launch `pi` from this repository root with discovery disabled:
299
+ Quick isolated checkout smoke test:
234
300
 
235
301
  ```bash
236
302
  pi --no-extensions -e .
237
303
  ```
238
304
 
239
- 4. Prompt the agent to use `agent_browser`.
240
- 5. For hot-reload or resume validation, run `npm run verify -- lifecycle` or configure exactly one active source for this extension in Pi settings, launch plain `pi`, and exercise `/reload` plus restart/`/resume`. Settings matter only in this configured-source mode; they are bypassed by `--no-extensions -e .`. See [`docs/RELEASE.md`](docs/RELEASE.md) for the automated harness behavior, cleanup, and transcript retention details.
305
+ This bypasses Pi settings and configured extensions. After editing extension code, restart that Pi process to test the new checkout.
241
306
 
242
- Example prompt:
307
+ Configured-source lifecycle validation:
243
308
 
244
- ```text
245
- Use the agent_browser tool to open https://react.dev and then take an interactive snapshot.
309
+ ```bash
310
+ npm run verify -- lifecycle
246
311
  ```
247
312
 
248
- For installed-package validation after a release, use exactly one active source. The canonical isolated validation sequence is:
313
+ Use lifecycle validation when testing `/reload`, full restart, `/resume`, managed-session continuity, or persisted artifact behavior.
314
+
315
+ Installed-package validation after publish:
249
316
 
250
317
  ```bash
251
318
  npm run verify -- package-pi
252
319
  pi --no-extensions -e npm:pi-agent-browser-native@<version>
253
320
  ```
254
321
 
255
- Only use plain `pi` for installed-package validation after disabling or removing the checkout source from Pi settings.
256
-
257
- Validated workflow examples:
258
-
259
- - open a page and snapshot it
260
- - click a link and confirm the destination title
261
- - use an explicit `--session` across multiple tool calls
262
- - use an explicit `--profile` and verify persisted browser storage across restarts
263
- - open `chat.com` or `chatgpt.com` headlessly with `--profile Default` without forcing `--headed` or `--auto-connect`
264
- - in configured-source lifecycle mode, verify `/reload` and full restart + `/resume` keep following the same implicit managed browser session
265
- - run `batch` with JSON via `stdin`
266
- - run `eval --stdin`
267
- - take a screenshot with inline attachment support and visible artifact metadata: artifact type, requested path, absolute path, existence, size, cwd, session, and repair/copy status when applicable
268
- - inspect upstream help/version through native tool calls like `{ "args": ["--help"] }` and `{ "args": ["--version"] }` via the tool's stateless plain-text inspection fallback
269
- - use `download <selector> <path>` for direct attachment/file-save workflows instead of trying to infer downloads from generic clicks or large eval dumps
270
- - for `.dogfood/...` or other dot-directory screenshot paths, rely on the wrapper's path normalization/repair contract; the visible result reports the requested path and absolute path rather than only an upstream temp path
271
- - use `click` plus `wait --download <path>` for asynchronous export flows, confirm `details.savedFilePath`/`details.savedFile` are present on the wait result or batch wait step, and check `details.artifacts[].exists` before relying on requested-path persistence
272
- - confirm oversized outputs show the actual spill file path directly in tool content, not just a details key name
273
- - inspect `details.artifactManifest` / `details.artifactRetentionSummary` during artifact-heavy flows to recover recent saved files, spill files, and visible eviction state after reload/resume
322
+ ## Generated native-tool playbook notes
323
+
324
+ These sections are generated from `extensions/agent-browser/lib/playbook.ts`. Run `npm run docs -- playbook write` after changing the canonical playbook source.
274
325
 
275
326
  <!-- agent-browser-playbook:start inspection -->
276
327
  <!-- Generated from extensions/agent-browser/lib/playbook.ts. Run `npm run docs -- playbook write` to update. -->
@@ -282,14 +333,6 @@ Native inspection calls use the `agent_browser` tool shape, not shell-like direc
282
333
  These calls return plain text and stay stateless: the extension does not inject its implicit session and does not let inspection consume the managed-session slot needed for later profile, session, CDP, state, or auto-connect launches.
283
334
  <!-- agent-browser-playbook:end inspection -->
284
335
 
285
- Current cautions:
286
- - passing `--profile` is an explicit upstream choice; this extension does not add its own profile-cloning or isolation layer
287
- - launch-scoped flags like `--profile`, `--session-name`, `--cdp`, `--state`, `--auto-connect`, `--init-script`, and `--enable` are for the first command that launches a session; if the implicit session is already active, retry that call with `sessionMode: "fresh"` or provide an explicit `--session ...` for the new launch
288
- - implicit `piab-*` sessions are extension-managed convenience sessions; they stay alive across `/reload` and resumable session transitions so later default calls can keep following the active managed browser on `/reload` or `/resume`, close when the originating `pi` process quits, rely on the configured idle timeout only as an abnormal-exit backstop, store persisted-session large snapshot spill files under a private session-scoped artifact directory with a bounded per-session budget so `details.fullOutputPath` and metadata-only `details.artifactManifest` survive reload/resume without unbounded growth, and still clean up process-private temp spill artifacts on shutdown
289
- - `sessionMode: "fresh"` without an explicit `--session` rotates that extension-managed session to the new browser so later auto calls keep using it
290
- - for local Unix launches, the wrapper uses a short private socket directory under `/tmp` so extension-generated session names do not trip upstream Unix socket-path limits in longer cwd/session-name combinations
291
- - wrapper-spawned commands clamp `AGENT_BROWSER_DEFAULT_TIMEOUT` to 25 seconds and use a 28-second process watchdog so a single upstream CLI call does not cross the upstream 30-second IPC read-timeout/retry path; split intentionally long waits into shorter tool calls
292
- - for direct headless local Chrome launches to `chat.com`, `chatgpt.com`, and `chat.openai.com`, the extension injects a normal Chrome user agent when the caller did not explicitly provide `--user-agent`; this keeps the default headless workflow usable without forcing `--headed` or `--auto-connect`
293
336
  <!-- agent-browser-playbook:start wrapper-tab-recovery -->
294
337
  <!-- Generated from extensions/agent-browser/lib/playbook.ts. Run `npm run docs -- playbook write` to update. -->
295
338
  - After launch-scoped open/goto/navigate calls that can restore existing tabs (for example --profile, --session-name, or --state), agent_browser best-effort re-selects the tab whose URL matches the returned page when restored tabs steal focus during launch.
@@ -297,59 +340,32 @@ Current cautions:
297
340
  - After a successful command on a known target tab, agent_browser also best-effort restores that intended tab if a restored/background tab steals focus after the command completes.
298
341
  - If a known session target unexpectedly reports about:blank, agent_browser preserves the prior intended target, best-effort re-selects it when it still exists, and reports exact recovery guidance when it cannot be re-selected.
299
342
  <!-- agent-browser-playbook:end wrapper-tab-recovery -->
300
- - oversized snapshots and oversized generic outputs compact inline content and print the actual spill file path directly in the tool result when a spill file exists; recent spills and explicit saved artifacts are also summarized in `details.artifactManifest`, including `evicted` entries when retention budgets remove older persisted files
301
- - artifact-producing commands render direct readable artifact metadata in visible content and `details.artifacts`: `kind`/`artifactType`, `path`, `requestedPath`, `absolutePath`, `exists`, `sizeBytes`, `status`, `cwd`, `session`, and `tempPath` when the wrapper repaired an upstream temp fallback
- - if the caller explicitly passes `--json`, the visible text content is valid JSON; for `stream status`, the wrapper enriches data with `wsUrl` and `frameFormat`
- - `trace` and `profiler` share upstream tracing machinery; the wrapper blocks starts/stops that conflict with owner state it observed in the current Pi session, but the message says "wrapper believes" because upstream or external CLI calls can desynchronize that local state
- - explicit caller-provided `--session` values are treated as user-managed and are not auto-closed by the extension
- - explicit caller-provided `--user-agent` values win over the ChatGPT/OpenAI compatibility workaround
- - tool progress/details redact sensitive invocation values such as `--headers`, proxy credentials, and auth-bearing URL parameters before echoing them back into Pi
-
- ### Switching from public browsing to a fresh profile/debug launch
-
- A common agent workflow is:
-
- 1. browse a public page with the default implicit session
- 2. then switch to a fresh authenticated/profile/debug launch
-
- Use `sessionMode: "fresh"` for that transition instead of relying on the implicit session:
-
- ```json
- {
- "args": ["--profile", "Default", "open", "https://example.com/account"],
- "sessionMode": "fresh"
- }
- ```
-
- After that call succeeds, later default `sessionMode: "auto"` calls continue in the new fresh browser.
-
- If you want to name the new upstream session yourself, pass an explicit session instead:
-
- ```json
- {
- "args": ["--session", "auth-flow", "--profile", "Default", "open", "https://example.com/account"]
- }
- ```
 
- ## Docs
+ ## Project map
 
- - [`docs/REQUIREMENTS.md`](docs/REQUIREMENTS.md) product requirements and constraints
- - [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md) — current architecture decision
- - [`docs/TOOL_CONTRACT.md`](docs/TOOL_CONTRACT.md) proposed v1 tool shape
- - [`docs/COMMAND_REFERENCE.md`](docs/COMMAND_REFERENCE.md) local repo-readable command reference for the blocked direct-binary path
- - [`docs/RELEASE.md`](docs/RELEASE.md) maintainer release and package verification workflow
+ | Path | Purpose |
+ |---|---|
+ | `extensions/agent-browser/index.ts` | Pi extension entrypoint and native tool wrapper |
+ | `extensions/agent-browser/lib/runtime.ts` | Args, session planning, redaction, process, and runtime helpers |
+ | `extensions/agent-browser/lib/results/` | Model-facing result rendering and error guidance |
+ | `extensions/agent-browser/lib/playbook.ts` | Canonical generated agent/browser guidance |
+ | `docs/COMMAND_REFERENCE.md` | Repo-readable native command reference |
+ | `docs/TOOL_CONTRACT.md` | Tool parameters, result shape, and behavior contract |
+ | `docs/ARCHITECTURE.md` | Design decisions and implementation structure |
+ | `docs/REQUIREMENTS.md` | Product requirements and constraints |
+ | `docs/RELEASE.md` | Release, package, and lifecycle verification workflow |
+ | `test/` | Wrapper, runtime, presentation, lifecycle, and package tests |
 
- ## Documentation rule
+ ## More docs
 
- When requirements change in chat:
+ - [`docs/COMMAND_REFERENCE.md`](docs/COMMAND_REFERENCE.md) full native command reference and upstream capability baseline
+ - [`docs/TOOL_CONTRACT.md`](docs/TOOL_CONTRACT.md) — exact tool contract
+ - [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md) — how the wrapper is designed
+ - [`docs/REQUIREMENTS.md`](docs/REQUIREMENTS.md) — product constraints and non-goals
+ - [`docs/RELEASE.md`](docs/RELEASE.md) — maintainer release workflow
 
- 1. update `docs/REQUIREMENTS.md`
- 2. update the affected design docs
- 3. update this README if user-facing expectations changed
+ ## Next action
 
- When the upstream `agent-browser` binary changes:
+ If you are a user, install the package and ask Pi to open a public page with `agent_browser`.
 
- 1. re-check the upstream command/help surface
- 2. update `docs/COMMAND_REFERENCE.md`
- 3. update tool guidance, README, and release docs if behavior or recommended usage changed
- 4. verify the blocked direct-binary path still has an equally usable local extension-side documentation path
+ If you are evaluating the implementation, read [`extensions/agent-browser/index.ts`](extensions/agent-browser/index.ts), then run `npm run verify`.
@@ -31,7 +31,7 @@ The extension should:
  - resolve `agent-browser` from `PATH`
  - invoke it directly, not through a shell
  - inject `--json`
- - support optional stdin only for `eval --stdin` and `batch`, rejecting other command/stdin combinations before launch
+ - support optional stdin only for `eval --stdin`, `batch`, and `auth save --password-stdin`, rejecting other command/stdin combinations before launch
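The stdin rule in the last bullet can be sketched as a small pre-launch check. This is a minimal illustration, not the extension's actual code: the function name `isStdinAllowed` is hypothetical, and the sketch assumes the command is the first token in `args` (no leading global flags). Only the three allowed command forms come from the requirement itself.

```typescript
// Hypothetical pre-launch guard; the name and matching strategy are assumptions.
// Allowed forms per the requirement: `batch`, `eval --stdin`, `auth save --password-stdin`.
// Simplification: expects the command as the first token, with no leading global flags.
function isStdinAllowed(args: string[]): boolean {
  const [command, subcommand] = args;

  if (command === "batch") return true;
  if (command === "eval") return args.includes("--stdin");
  if (command === "auth" && subcommand === "save") {
    return args.includes("--password-stdin");
  }
  // Every other command/stdin combination is rejected before launch.
  return false;
}

console.log(isStdinAllowed(["batch", "--bail"]));                           // true
console.log(isStdinAllowed(["eval", "--stdin"]));                           // true
console.log(isStdinAllowed(["eval", "1 + 1"]));                             // false
console.log(isStdinAllowed(["auth", "save", "demo", "--password-stdin"])); // true
console.log(isStdinAllowed(["open", "https://example.com"]));               // false
```

Rejecting before launch means a stray stdin body never reaches the upstream process, which is the behavior the bullet asks for.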
 
  ### Agent-first UX
 
@@ -34,7 +34,7 @@ Tool parameters:
  ```
 
  - `args`: exact `agent-browser` CLI tokens after the binary name.
- - `stdin`: only for `batch` and `eval --stdin`; other command/stdin combinations are rejected before `agent-browser` is launched.
+ - `stdin`: only for `batch`, `eval --stdin`, and `auth save --password-stdin`; other command/stdin combinations are rejected before `agent-browser` is launched.
  - `sessionMode`:
  - `"auto"` reuses the extension-managed session when possible.
  - `"fresh"` rotates that managed session to a fresh upstream launch so launch-scoped flags like `--profile`, `--session-name`, `--cdp`, `--state`, `--auto-connect`, `--init-script`, or `--enable` apply.
@@ -220,7 +220,7 @@ The tables below intentionally list more than the recommended workflow. Rare com
 
  ### Built-in skills
 
- Native-tool note: upstream skills are written for the standalone `agent-browser` CLI and may show bash/heredoc examples. In pi, convert those examples to `agent_browser` calls: pass CLI tokens in `args`, and pass heredoc/stdin bodies through the tool `stdin` field for `batch` or `eval --stdin`.
+ Native-tool note: upstream skills are written for the standalone `agent-browser` CLI and may show bash/heredoc examples. In pi, convert those examples to `agent_browser` calls: pass CLI tokens in `args`, and pass heredoc/stdin bodies through the tool `stdin` field for `batch`, `eval --stdin`, or `auth save --password-stdin`.
 
  | Command | Purpose |
  | --- | --- |
@@ -300,9 +300,11 @@ These calls return plain text and stay stateless: the extension does not inject
  | `cookies [get|set|clear]` | Manage cookies. `set` supports `--url`, `--domain`, `--path`, `--httpOnly`, `--secure`, `--sameSite`, `--expires`, and `--curl <file>` for JSON, cURL, or bare Cookie-header bulk imports. |
  | `storage <local|session>` | Manage web storage. |
 
+ Privacy note: `cookies get` can expose real profile cookies. Do not run it against `--profile Default` or other authenticated profiles unless the user explicitly needs cookie inspection; prefer task-specific page actions and storage checks.
+
  ### Tabs
 
- Stable tab ids look like `t1`, `t2`, and `t3`. Optional user labels such as `docs` or `app` are interchangeable with ids wherever a tab reference is accepted.
+ Stable tab ids look like `t1`, `t2`, and `t3`. Optional user labels such as `docs` or `app` are interchangeable with ids wherever a tab reference is accepted. Upstream help may refer to numeric tab positions, but this wrapper guidance uses stable `t<N>` ids because positional integers are not accepted by current upstream `agent-browser`.
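A caller-side guard for this rule might look like the sketch below. `isPositionalTabRef` is a hypothetical helper, not a wrapper or upstream API. It only flags bare integers, since both stable `t<N>` ids and user labels are accepted wherever a tab reference is expected.

```typescript
// Hypothetical helper: flags bare integer tab positions such as "2".
// Stable ids ("t2") and user labels ("docs") are not flagged.
function isPositionalTabRef(ref: string): boolean {
  return /^\d+$/.test(ref);
}

console.log(isPositionalTabRef("2"));    // true
console.log(isPositionalTabRef("t2"));   // false
console.log(isPositionalTabRef("docs")); // false
```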
 
  | Command | Purpose |
  | --- | --- |
@@ -377,7 +379,7 @@ When these diagnostic commands are invoked through the native `agent_browser` to
  | Command | Purpose |
  | --- | --- |
  | `batch [--bail] ["cmd" ...]` | Execute multiple commands sequentially from args or stdin. |
- | `auth save <name> [opts]` | Save an auth profile with options such as `--url`, `--username`, `--password`, or `--password-stdin`. |
+ | `auth save <name> [opts]` | Save an auth profile with options such as `--url`, `--username`, `--password`, or `--password-stdin`. Prefer `--password-stdin` with the tool `stdin` field; avoid putting passwords in `args`. |
  | `auth login <name>` | Login using saved credentials. |
  | `auth list` | List saved auth profiles. |
  | `auth show <name>` | Show auth profile metadata. |
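As a hedged illustration of the `auth save` guidance in this hunk, a native call can keep the password out of `args` by sending it through `stdin`. The `AgentBrowserCall` interface below is a hypothetical stand-in for the documented tool parameters (`args`, `stdin`, `sessionMode`), and `buildAuthSaveCall`, the profile name, URL, username, and password are placeholders rather than anything shipped by the package.

```typescript
// Hypothetical shape mirroring the documented tool parameters; not exported by the package.
interface AgentBrowserCall {
  args: string[];
  stdin?: string;
  sessionMode?: "auto" | "fresh";
}

// Illustrative helper: the password travels only in `stdin`, never in `args`.
function buildAuthSaveCall(
  name: string,
  url: string,
  username: string,
  password: string,
): AgentBrowserCall {
  return {
    args: ["auth", "save", name, "--url", url, "--username", username, "--password-stdin"],
    stdin: password,
  };
}

// Placeholder values for demonstration only.
const call = buildAuthSaveCall("demo", "https://example.com/login", "user@example.com", "s3cret");
console.log(call.args.includes("s3cret")); // false
```

Keeping the secret in `stdin` matters because `args` are echoed in tool progress and details, while the changelog states that password stdin is redacted from model-visible content.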