pi-agent-browser-native 0.2.12 → 0.2.14
- package/CHANGELOG.md +21 -0
- package/README.md +87 -27
- package/docs/ARCHITECTURE.md +9 -3
- package/docs/COMMAND_REFERENCE.md +383 -151
- package/docs/RELEASE.md +81 -26
- package/docs/REQUIREMENTS.md +10 -4
- package/docs/TOOL_CONTRACT.md +51 -11
- package/extensions/agent-browser/index.ts +847 -344
- package/extensions/agent-browser/lib/parsing.ts +20 -0
- package/extensions/agent-browser/lib/playbook.ts +79 -0
- package/extensions/agent-browser/lib/process.ts +56 -8
- package/extensions/agent-browser/lib/results/confirmation.ts +76 -0
- package/extensions/agent-browser/lib/results/envelope.ts +42 -5
- package/extensions/agent-browser/lib/results/presentation.ts +907 -50
- package/extensions/agent-browser/lib/results/shared.ts +166 -15
- package/extensions/agent-browser/lib/results/snapshot.ts +69 -7
- package/extensions/agent-browser/lib/results.ts +7 -1
- package/extensions/agent-browser/lib/runtime.ts +204 -15
- package/extensions/agent-browser/lib/temp.ts +131 -23
- package/package.json +14 -8
- package/scripts/agent-browser-capability-baseline.mjs +104 -0
- package/scripts/doctor.mjs +420 -0
package/CHANGELOG.md
CHANGED
@@ -2,6 +2,27 @@
 
 ## Unreleased
 
+## 0.2.14 - 2026-05-01
+
+### Changed
+- updated the local pi development baseline to `@mariozechner/pi-coding-agent` `0.71.1`
+- regenerated the npm lockfile against the current stable dependency graph
+
+### Compatibility
+- reviewed the pi `0.71.1` changelog and confirmed the extension is compatible with the current TypeBox 1.x package guidance, session-replacement safety rules, and latest package install/update behavior
+
+
+## 0.2.13 - 2026-04-30
+
+### Fixed
+- improved model-facing redaction across generic output, scalar extraction summaries, diagnostics, console/error previews, and compacted spill files so nested, multiline, and prefixed structured secrets are masked before entering tool content or summaries
+- adapted upstream `agent-browser skills get` output for Pi's native `agent_browser` tool by removing bash-oriented allowlist hints and translating quoted and heredoc CLI examples into native tool-call examples
+- reduced artifact-retention noise for routine explicit saved files while preserving retention metadata in details, and fixed explicit artifact manifest deduplication for same relative paths in different working directories
+
+### Changed
+- documented that oversized spill files contain redacted upstream payloads rather than raw secret-bearing output
+- added command-reference guidance for converting upstream standalone CLI examples into native `agent_browser` tool calls
+
 ## 0.2.12 - 2026-04-23
 
 ### Changed
package/README.md
CHANGED
@@ -55,12 +55,34 @@ Install `agent-browser` separately, then install this package into `pi`:
 pi install npm:pi-agent-browser-native
 ```
 
-To try a published package without installing it permanently:
+To try a published package without installing it permanently, isolate that temporary package source from any configured checkout or global install:
 
 ```bash
-pi -e npm:pi-agent-browser-native
+pi --no-extensions -e npm:pi-agent-browser-native
 ```
 
+For a specific published version:
+
+```bash
+pi --no-extensions -e npm:pi-agent-browser-native@<version>
+```
+
+### First-run doctor
+
+Run the package doctor before first use or when `agent_browser` is missing or duplicated:
+
+```bash
+pi-agent-browser-doctor
+# one-off without installing the package source permanently:
+npm exec --package pi-agent-browser-native -- pi-agent-browser-doctor
+# from a checkout:
+npm run doctor
+```
+
+The doctor is read-only. It checks that upstream `agent-browser` is on `PATH`, that `agent-browser --version` matches the wrapper's capability baseline, and that Pi settings do not point at multiple active `pi-agent-browser-native` sources. It does not run upstream `agent-browser doctor --fix` or edit Pi settings.
+
+If it reports duplicate sources, keep exactly one active source. For normal use, keep `pi install npm:pi-agent-browser-native` and remove checkout paths from Pi settings. For temporary package or checkout trials, use `pi --no-extensions -e npm:pi-agent-browser-native[@<version>]` or `pi --no-extensions -e /path/to/checkout` so configured sources are bypassed.
+
 ### GitHub install
 
 For the source install path, prefer the repository URL:
@@ -77,31 +99,35 @@ pi --no-extensions -e https://github.com/fitchmultz/pi-agent-browser-native
 
 This avoids duplicate `agent_browser` registrations when you already have `pi-agent-browser-native` installed globally.
 
-### Current practical local-checkout
+### Current practical local-checkout flows
 
-
+This repository's `package.json` is itself a publishable pi package manifest that points at `extensions/agent-browser/index.ts`. That file is the real extension entrypoint for both the checkout and the published package.
 
-
-
-
+Use two local-checkout modes intentionally:
+
+- **Quick isolated smoke test:** run the checkout explicitly with `-e` and disable extension discovery:
+
+```bash
+pi --no-extensions -e /absolute/path/to/pi-agent-browser-native
+```
 
-This
+This bypasses Pi settings and any configured checkout/global package sources, so it avoids duplicate `agent_browser` registrations. After editing extension code, restart this `pi` process to validate the new source; do not use this mode as proof that configured-source `/reload` works.
 
-
+- **Configured-source lifecycle validation:** run `npm run verify -- lifecycle` for the opt-in automated tmux harness, or keep exactly one active source for this extension in Pi settings and launch plain `pi` for manual checks. Use this mode when validating `/reload`, full restart, and `/resume` behavior because Pi's reload flow operates on discovered/configured resources.
 
 The native tool exposed to the agent is named `agent_browser`.
 
 The primary session control parameter is `sessionMode`:
 
 - `"auto"` (default) reuses the extension-managed `pi`-scoped session when possible
-- `"fresh"` switches that managed session to a fresh upstream launch so
+- `"fresh"` switches that managed session to a fresh upstream launch so launch-scoped flags like `--profile`, `--session-name`, `--cdp`, `--state`, and `--auto-connect` apply and later auto calls follow the new browser
 
 ## Agent quick start
 
 ### Mental model
 
 - `args` — exact CLI args after `agent-browser`
-- `stdin` — raw stdin only for `batch` and `eval --stdin`
+- `stdin` — raw stdin only for `batch` and `eval --stdin` (other command/stdin combinations are rejected before `agent-browser` is launched)
 - `sessionMode`
 - `"auto"` — default, reuse the extension-managed `pi`-scoped session
 - `"fresh"` — switch that managed session to a new profile/debug launch
@@ -134,12 +160,25 @@ Evaluate page JavaScript via stdin:
 { "args": ["eval", "--stdin"], "stdin": "document.title" }
 ```
 
-Download a file
+Download a file from a known link/control directly:
 
 ```json
 { "args": ["download", "@e5", "/tmp/report.pdf"] }
 ```
 
+For dashboards that start an export asynchronously after a click or navigation, click first and then wait for the download. The wrapper reports `Download completed: /tmp/report.csv` and exposes upstream-reported `details.savedFilePath` plus `details.savedFile` for the `wait` result; with upstream `agent-browser 0.26.0`, confirm `details.artifacts[].exists` before relying on a requested `wait --download <path>` file being present on disk (tracked upstream at [vercel-labs/agent-browser#1300](https://github.com/vercel-labs/agent-browser/issues/1300)):
+
+```json
+{ "args": ["click", "@export"] }
+{ "args": ["wait", "--download", "/tmp/report.csv"] }
+```
+
+Batch flows preserve the same saved-file metadata on the wait step:
+
+```json
+{ "args": ["batch"], "stdin": "[[\"click\",\"@export\"],[\"wait\",\"--download\",\"/tmp/report.csv\"]]" }
+```
+
 Start a fresh profiled launch after you already used the implicit session:
 
 ```json
@@ -164,19 +203,19 @@ Use the agent_browser tool to open https://react.dev and then take an interactiv
 
 Do not track or rely on a repo-local `.pi/extensions/agent-browser.ts` autoload shim for this package. That creates an unnecessary second registration path.
 
-The published entrypoint lives at `extensions/agent-browser/index.ts` and is referenced directly from this repo's `package.json`.
+The published entrypoint lives at `extensions/agent-browser/index.ts` and is referenced directly from this repo's `package.json`.
 
 Recommended local development setup:
 1. Install `agent-browser` separately via the upstream project.
 2. Run `npm install`.
-3.
-4. Launch `pi` from this repository root with only the checkout extension loaded:
+3. For a quick checkout-only smoke test, launch `pi` from this repository root with discovery disabled:
 
 ```bash
 pi --no-extensions -e .
 ```
 
-
+4. Prompt the agent to use `agent_browser`.
+5. For hot-reload or resume validation, run `npm run verify -- lifecycle` or configure exactly one active source for this extension in Pi settings, launch plain `pi`, and exercise `/reload` plus restart/`/resume`. Settings matter only in this configured-source mode; they are bypassed by `--no-extensions -e .`. See [`docs/RELEASE.md`](docs/RELEASE.md) for the automated harness behavior, cleanup, and transcript retention details.
 
 Example prompt:
 
@@ -184,7 +223,14 @@ Example prompt:
 Use the agent_browser tool to open https://react.dev and then take an interactive snapshot.
 ```
 
-For installed-package validation after a release,
+For installed-package validation after a release, use exactly one active source. The canonical isolated validation sequence is:
+
+```bash
+npm run verify -- package-pi
+pi --no-extensions -e npm:pi-agent-browser-native@<version>
+```
+
+Only use plain `pi` for installed-package validation after disabling or removing the checkout source from Pi settings.
 
 Validated workflow examples:
 
@@ -193,27 +239,41 @@ Validated workflow examples:
 - use an explicit `--session` across multiple tool calls
 - use an explicit `--profile` and verify persisted browser storage across restarts
 - open `chat.com` or `chatgpt.com` headlessly with `--profile Default` without forcing `--headed` or `--auto-connect`
-- verify `/reload` and full restart + `/resume` keep following the same implicit managed browser session
+- in configured-source lifecycle mode, verify `/reload` and full restart + `/resume` keep following the same implicit managed browser session
 - run `batch` with JSON via `stdin`
 - run `eval --stdin`
 - take a screenshot with inline attachment support
-- inspect `
-- use `download <selector> <path>` for attachment/file-save workflows instead of trying to infer downloads from generic clicks or large eval dumps
+- inspect upstream help/version through native tool calls like `{ "args": ["--help"] }` and `{ "args": ["--version"] }` via the tool's stateless plain-text inspection fallback
+- use `download <selector> <path>` for direct attachment/file-save workflows instead of trying to infer downloads from generic clicks or large eval dumps
+- use `click` plus `wait --download <path>` for asynchronous export flows, confirm `details.savedFilePath`/`details.savedFile` are present on the wait result or batch wait step, and check `details.artifacts[].exists` before relying on requested-path persistence
 - confirm oversized outputs show the actual spill file path directly in tool content, not just a details key name
+- inspect `details.artifactManifest` / `details.artifactRetentionSummary` during artifact-heavy flows to recover recent saved files, spill files, and visible eviction state after reload/resume
+
+<!-- agent-browser-playbook:start inspection -->
+<!-- Generated from extensions/agent-browser/lib/playbook.ts. Run `npm run docs -- playbook write` to update. -->
+Native inspection calls use the `agent_browser` tool shape, not shell-like direct-binary commands:
+
+- { "args": ["--help"] }
+- { "args": ["--version"] }
 
-
+These calls return plain text and stay stateless: the extension does not inject its implicit session and does not let inspection consume the managed-session slot needed for later profile, session, CDP, state, or auto-connect launches.
+<!-- agent-browser-playbook:end inspection -->
 
 Current cautions:
 - passing `--profile` is an explicit upstream choice; this extension does not add its own profile-cloning or isolation layer
---
-- implicit `piab-*` sessions are extension-managed convenience sessions; they stay alive across `pi` shutdown/reload so later default calls can keep following the active managed browser on `/reload` or `/resume`, rely on the configured idle timeout to reduce stale background daemons, store persisted-session large snapshot spill files under a private session-scoped artifact directory with a bounded per-session budget so `details.fullOutputPath`
+- launch-scoped flags like `--profile`, `--session-name`, `--cdp`, `--state`, and `--auto-connect` are for the first command that launches a session; if the implicit session is already active, retry that call with `sessionMode: "fresh"` or provide an explicit `--session ...` for the new launch
+- implicit `piab-*` sessions are extension-managed convenience sessions; they stay alive across `pi` shutdown/reload so later default calls can keep following the active managed browser on `/reload` or `/resume`, rely on the configured idle timeout to reduce stale background daemons, store persisted-session large snapshot spill files under a private session-scoped artifact directory with a bounded per-session budget so `details.fullOutputPath` and metadata-only `details.artifactManifest` survive reload/resume without unbounded growth, and still clean up process-private temp spill artifacts on shutdown
 - `sessionMode: "fresh"` without an explicit `--session` rotates that extension-managed session to the new browser so later auto calls keep using it
 - for local Unix launches, the wrapper uses a short private socket directory under `/tmp` so extension-generated session names do not trip upstream Unix socket-path limits in longer cwd/session-name combinations
 - for direct headless local Chrome launches to `chat.com`, `chatgpt.com`, and `chat.openai.com`, the extension injects a normal Chrome user agent when the caller did not explicitly provide `--user-agent`; this keeps the default headless workflow usable without forcing `--headed` or `--auto-connect`
-
-
---
---
+<!-- agent-browser-playbook:start wrapper-tab-recovery -->
+<!-- Generated from extensions/agent-browser/lib/playbook.ts. Run `npm run docs -- playbook write` to update. -->
+- After launch-scoped open/goto/navigate calls that can restore existing tabs (for example --profile, --session-name, or --state), agent_browser best-effort re-selects the tab whose URL matches the returned page when restored tabs steal focus during launch.
+- After a target tab is known for a session, later active-tab commands best-effort pin that tab inside the same upstream invocation when reconnect drift would otherwise move the command to a restored/background tab.
+- After a successful command on a known target tab, agent_browser also best-effort restores that intended tab if a restored/background tab steals focus after the command completes.
+- If a known session target unexpectedly reports about:blank, agent_browser preserves the prior intended target, best-effort re-selects it when it still exists, and reports exact recovery guidance when it cannot be re-selected.
+<!-- agent-browser-playbook:end wrapper-tab-recovery -->
+- oversized snapshots and oversized generic outputs compact inline content and print the actual spill file path directly in the tool result when a spill file exists; recent spills and explicit saved artifacts are also summarized in `details.artifactManifest`, including `evicted` entries when retention budgets remove older persisted files
 - explicit caller-provided `--session` values are treated as user-managed and are not auto-closed by the extension
 - explicit caller-provided `--user-agent` values win over the ChatGPT/OpenAI compatibility workaround
 - tool progress/details redact sensitive invocation values such as `--headers`, proxy credentials, and auth-bearing URL parameters before echoing them back into Pi
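Reviewer note on the README download guidance: the new text asks callers to confirm `details.artifacts[].exists` before trusting a `wait --download <path>` file. A minimal TypeScript sketch of that check follows, assuming the result `details` object carries `savedFilePath` and an `artifacts` array as the README describes. The `path` field on each artifact entry, the interface names, and the function name are hypothetical and are not taken from the package source.

```typescript
// Hypothetical shapes: only `savedFilePath`, `savedFile`, and
// `artifacts[].exists` are named in the README; `path` is an assumption.
interface ArtifactEntry {
  path?: string;
  exists?: boolean;
}

interface WaitDetails {
  savedFilePath?: string;
  savedFile?: unknown;
  artifacts?: ArtifactEntry[];
}

// Returns the saved path only when an artifact entry for that path
// reports that the file is actually present on disk.
function confirmedDownloadPath(details: WaitDetails): string | undefined {
  const saved = details.savedFilePath;
  if (!saved) return undefined;
  const artifacts = details.artifacts ?? [];
  const onDisk = artifacts.some(
    (entry) => entry.path === saved && entry.exists === true,
  );
  return onDisk ? saved : undefined;
}
```

A caller would treat an `undefined` result as "do not read the file yet" rather than as a failed download, since the README attributes the gap to an upstream issue.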
package/docs/ARCHITECTURE.md
CHANGED
@@ -9,7 +9,7 @@ Related docs:
 
 Build this as a **thin `pi` extension/package** that exposes `agent-browser` as one native tool while keeping upstream `agent-browser` as the source of truth.
 
-The package install path is the primary product path. Local checkout development should use explicit CLI loading, while package-manifest behavior and packaged contents matter more.
+The package install path is the primary product path. Local checkout development should use explicit CLI loading for isolated smoke tests and configured-source plain `pi` runs for lifecycle validation, while package-manifest behavior and packaged contents matter more.
 
 ## Chosen shape
 
@@ -31,7 +31,7 @@ The extension should:
 - resolve `agent-browser` from `PATH`
 - invoke it directly, not through a shell
 - inject `--json`
-- support optional stdin for
+- support optional stdin only for `eval --stdin` and `batch`, rejecting other command/stdin combinations before launch
 
 ### Agent-first UX
 
@@ -46,11 +46,17 @@ That means:
 
 The published package should load from the `pi` manifest in `package.json`.
 
-Local checkout
+Local checkout validation has two intentional modes:
+
+- **Quick isolated mode:** use explicit CLI loading such as `pi --no-extensions -e .` from the repository root. This bypasses Pi settings and extension discovery, avoids duplicate `agent_browser` registrations when another source is installed globally, and is the right mode for checkout smoke tests.
+- **Configured-source lifecycle mode:** configure exactly one active checkout or package source in Pi settings and launch plain `pi`. This is the right mode for validating `/reload`, restart, and `/resume` behavior because those lifecycle checks exercise discovered/configured resources.
+
+The repo should not add a repo-local `.pi/extensions/` autoload shim as the documented checkout path.
 
 Why:
 - avoids duplicate `agent_browser` registrations when the package is also installed globally
 - keeps the product contract centered on the package manifest instead of repo-local autoload wiring
+- keeps reload and resume validation tied to Pi's configured-source lifecycle instead of an isolated quick-test path
 - keeps the published tarball focused on the package manifest, extension code, canonical docs, and license
 
 The published package should exclude agent-only and superseded repo materials such as `AGENTS.md`, `docs/v1-tool-contract.md`, `docs/native-integration-design.md`, and other internal planning notes.
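Reviewer note on the stdin rule: both the README and ARCHITECTURE.md changes state that stdin is accepted only for `batch` and `eval --stdin`, and that other combinations are rejected before `agent-browser` is launched. The sketch below illustrates one way such a pre-launch gate could look. It is not the extension's implementation: the function name, result type, and error text are hypothetical, and it simplifies by assuming the command is the first element of `args` (real calls may place global flags first).

```typescript
// Hypothetical result type for the pre-launch check.
type StdinCheck = { ok: true } | { ok: false; reason: string };

// Simplifying assumption: args[0] is the command name.
function checkStdinUsage(args: readonly string[], stdin?: string): StdinCheck {
  // No stdin supplied: nothing to validate.
  if (stdin === undefined) return { ok: true };

  const [command, ...rest] = args;
  if (command === "batch") return { ok: true };
  if (command === "eval" && rest.includes("--stdin")) return { ok: true };

  // Any other command/stdin combination is rejected before spawning a process.
  return {
    ok: false,
    reason: `stdin is only accepted for batch and eval --stdin, got: ${command ?? "(no command)"}`,
  };
}
```

Running the check before the process is spawned means a malformed call fails fast with a clear message instead of leaving an upstream process waiting on input it never reads.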