pi-agent-browser-native 0.1.6 → 0.2.1
- package/CHANGELOG.md +24 -0
- package/README.md +99 -5
- package/docs/ARCHITECTURE.md +16 -8
- package/docs/TOOL_CONTRACT.md +27 -17
- package/extensions/agent-browser/index.ts +196 -59
- package/extensions/agent-browser/lib/results/envelope.ts +7 -0
- package/extensions/agent-browser/lib/results/presentation.ts +263 -22
- package/extensions/agent-browser/lib/results/shared.ts +25 -0
- package/extensions/agent-browser/lib/results/snapshot.ts +32 -16
- package/extensions/agent-browser/lib/runtime.ts +158 -32
- package/package.json +1 -1
package/CHANGELOG.md
CHANGED
@@ -1,5 +1,29 @@
 # Changelog
 
+## 0.2.1 - 2026-04-12
+
+### Fixed
+- the GitHub source trial docs now use `pi --no-extensions -e https://github.com/fitchmultz/pi-agent-browser-native` so published-package users do not hit duplicate `agent_browser` registration conflicts during source-path testing
+- successful unnamed `sessionMode: "fresh"` launches now rotate the extension-managed session to the new browser, and later default `sessionMode: "auto"` calls keep following that fresh session instead of silently snapping back to the older one
+- mixed-success `batch` failures now preserve per-step rendering, include the first failing step in the visible output and structured details, and still mark the overall tool call as an error so agents can recover from partial progress
+- implicit `piab-*` session names now include a stable cwd hash in addition to the `pi` session id so same-named checkouts and worktrees no longer collide onto the same browser session
+- value-taking flags like `--session`, `--profile`, `--session-name`, and `--cdp` now fail locally with direct validation errors when the value is missing or replaced by another flag, instead of producing confusing downstream JSON parse failures
+- the bash guard now catches wrapped `agent-browser` invocations such as `env agent-browser ...`, `npx --yes agent-browser ...`, `pnpm dlx agent-browser ...`, `yarn dlx agent-browser ...`, `bunx agent-browser ...`, and absolute-path execution, reducing accidental bypasses of the native-tool path
+
+## 0.2.0 - 2026-04-12
+
+### Changed
+- `batch` now reuses the richer standalone renderers, so batched snapshots keep the compact main-content-first view and batched screenshots keep inline image attachments instead of degrading to raw JSON-ish text
+- the tool schema now uses `sessionMode: "auto" | "fresh"` instead of the old implicit-session boolean so agents have a first-class way to request a fresh profiled/debug launch, and blocked startup-scoped reuse errors now include structured recovery hints
+- plain-text inspection commands like `agent_browser --help` and `--version` are now always allowed, removing the old prompt-dependent inspection gate and making the inspection contract local and predictable
+- navigation actions like `click`, `dblclick`, `back`, `forward`, and `reload` now include lightweight post-action title/url summaries when the wrapper can address the active session, reducing guess-and-check follow-up snapshots
+- compact snapshot rendering is leaner by default: fewer additional sections, fewer refs, smaller role summaries, and the raw spill path now stays in `details.fullOutputPath` instead of dominating the visible snapshot body
+- README and injected tool guidance now include a compact agent quick start with the core call shapes for `open` + `snapshot`, `click` + re-snapshot, `batch`, `eval --stdin`, and fresh profiled launches
+
+### Migration notes
+- replace any use of `useActiveSession` with `sessionMode`
+- use `sessionMode: "fresh"` when you need a new `--profile`, `--session-name`, or `--cdp` launch after the implicit session is already active
+
 ## 0.1.6 - 2026-04-12
 
 ### Changed
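The 0.2.1 entry about value-taking flags describes a local check that runs before anything is handed to upstream `agent-browser`. A minimal sketch of that kind of check follows. The helper name `validateValueFlags`, the flag set, and the error wording are all assumptions for illustration; the package's real implementation lives in `index.ts` and `runtime.ts` and is not shown in this part of the diff.

```typescript
// Hypothetical helper: names and messages are illustrative, not taken from the package.
const VALUE_FLAGS = new Set<string>(["--session", "--profile", "--session-name", "--cdp"]);

/** Returns a validation error message, or undefined when every value-taking flag has a value. */
function validateValueFlags(args: readonly string[]): string | undefined {
  for (let i = 0; i < args.length; i++) {
    const flag = args[i] ?? "";
    if (!VALUE_FLAGS.has(flag)) continue;
    const value = args[i + 1];
    if (value === undefined) {
      return `${flag} requires a value, but none was provided`;
    }
    if (value.startsWith("-")) {
      // Assumption: a token that starts with "-" is treated as another flag, not a value.
      return `${flag} requires a value, but got another flag: ${value}`;
    }
    i++; // skip the consumed value so it is not re-examined as a flag
  }
  return undefined;
}

// Usage: a well-formed call passes, a dangling flag is rejected locally.
const okResult = validateValueFlags(["--session", "auth-flow", "open", "https://example.com"]);
const missingResult = validateValueFlags(["--profile"]);
const swallowedResult = validateValueFlags(["--cdp", "--session", "x"]);
```

Failing at this point keeps the error message close to the actual mistake, which is the behavior the changelog contrasts with "confusing downstream JSON parse failures".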
package/README.md
CHANGED
@@ -69,12 +69,14 @@ For the source install path, prefer the repository URL:
 pi install https://github.com/fitchmultz/pi-agent-browser-native
 ```
 
-To try the GitHub source without installing it permanently:
+To try the GitHub source without installing it permanently, isolate that temporary source extension from your normal installed package set:
 
 ```bash
-pi -e https://github.com/fitchmultz/pi-agent-browser-native
+pi --no-extensions -e https://github.com/fitchmultz/pi-agent-browser-native
 ```
 
+This avoids duplicate `agent_browser` registrations when you already have `pi-agent-browser-native` installed globally.
+
 ### Current practical local-checkout flow
 
 Until you are using a published package release, prefer an explicit checkout-only run instead of installing the checkout into your normal `pi` package set:
@@ -87,6 +89,69 @@ This avoids duplicate `agent_browser` registrations if you also have the publish
 
 The native tool exposed to the agent is named `agent_browser`.
 
+The primary session control parameter is `sessionMode`:
+
+- `"auto"` (default) reuses the extension-managed `pi`-scoped session when possible
+- `"fresh"` switches that managed session to a fresh upstream launch so startup-scoped flags like `--profile`, `--session-name`, and `--cdp` apply and later auto calls follow the new browser
+
+## Agent quick start
+
+### Mental model
+
+- `args` — exact CLI args after `agent-browser`
+- `stdin` — raw stdin only for `batch` and `eval --stdin`
+- `sessionMode`
+  - `"auto"` — default, reuse the extension-managed `pi`-scoped session
+  - `"fresh"` — switch that managed session to a new profile/debug launch
+
+### Common call shapes
+
+Open a page, then take an interactive snapshot:
+
+```json
+{ "args": ["open", "https://example.com"] }
+{ "args": ["snapshot", "-i"] }
+```
+
+Click a ref, then re-snapshot after navigation or a major DOM change:
+
+```json
+{ "args": ["click", "@e2"] }
+{ "args": ["snapshot", "-i"] }
+```
+
+Run a multi-step browser flow in one tool call:
+
+```json
+{ "args": ["batch"], "stdin": "[[\"open\",\"https://example.com\"],[\"snapshot\",\"-i\"]]" }
+```
+
+Evaluate page JavaScript via stdin:
+
+```json
+{ "args": ["eval", "--stdin"], "stdin": "document.title" }
+```
+
+Start a fresh profiled launch after you already used the implicit session:
+
+```json
+{ "args": ["--profile", "Default", "open", "https://example.com/account"], "sessionMode": "fresh" }
+```
+
+After a successful unnamed fresh launch, later `sessionMode: "auto"` calls follow that new browser automatically.
+
+Name a new upstream session explicitly when you want to keep reusing it yourself:
+
+```json
+{ "args": ["--session", "auth-flow", "open", "https://example.com"] }
+```
+
+### First useful prompt in a fresh `pi` session
+
+```text
+Use the agent_browser tool to open https://react.dev and then take an interactive snapshot.
+```
+
 ## Local development
 
 Do not track or rely on a repo-local `.pi/extensions/agent-browser.ts` autoload shim for this package. When the package is also installed globally, that creates a duplicate `agent_browser` registration and blocks `pi` startup from this working directory.
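The `batch` call shape in the quick start carries its steps as a JSON string inside `stdin`, which is easy to get wrong when the quotes are escaped by hand. The sketch below builds that payload with `JSON.stringify` instead. The `AgentBrowserCall` interface and the `buildBatchCall` helper are hypothetical names introduced here; only the `args`, `stdin`, and `sessionMode` fields come from the README text above.

```typescript
// Shape of a tool call as described in the quick start; the type name is an assumption.
interface AgentBrowserCall {
  args: string[];
  stdin?: string;
  sessionMode?: "auto" | "fresh";
}

/** Builds a `batch` call whose stdin is a JSON array of per-step argument arrays. */
function buildBatchCall(steps: string[][]): AgentBrowserCall {
  return { args: ["batch"], stdin: JSON.stringify(steps) };
}

// Usage: the same two-step flow as the README example.
const batchCall = buildBatchCall([
  ["open", "https://example.com"],
  ["snapshot", "-i"],
]);
// batchCall.stdin is the unescaped form of the string shown in the README:
// [["open","https://example.com"],["snapshot","-i"]]
```

Serializing the whole call with `JSON.stringify(batchCall)` then produces the escaped `stdin` string in the form the quick start shows.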
@@ -116,13 +181,42 @@ Validated workflow examples:
 - run `batch` with JSON via `stdin`
 - run `eval --stdin`
 - take a screenshot with inline attachment support
-- inspect `agent_browser --help` and `--version`
+- inspect `agent_browser --help` and `--version` via the tool's plain-text inspection fallback
+
+Inspection commands like `agent_browser --help` and `--version` are always supported. They return plain text and are useful for debugging or capability checks, but they are not required for normal browsing workflows.
 
 Current cautions:
 - passing `--profile` is an explicit upstream choice; this extension does not add its own profile-cloning or isolation layer
-- startup-scoped flags like `--profile`, `--session-name`, and `--cdp` are for the first command that launches a session; if the implicit session is already active,
+- startup-scoped flags like `--profile`, `--session-name`, and `--cdp` are for the first command that launches a session; if the implicit session is already active, retry that call with `sessionMode: "fresh"` or provide an explicit `--session ...` for the new launch
 - implicit `piab-*` sessions are extension-managed convenience sessions; they are best-effort closed on `pi` shutdown, get an idle timeout to reduce stale background daemons, and clean up private temp spill artifacts on shutdown
--
+- `sessionMode: "fresh"` without an explicit `--session` rotates that extension-managed session to the new browser so later auto calls keep using it
+- explicit caller-provided `--session` values are treated as user-managed and are not auto-closed by the extension
+
+### Switching from public browsing to a fresh profile/debug launch
+
+A common agent workflow is:
+
+1. browse a public page with the default implicit session
+2. then switch to a fresh authenticated/profile/debug launch
+
+Use `sessionMode: "fresh"` for that transition instead of relying on the implicit session:
+
+```json
+{
+  "args": ["--profile", "Default", "open", "https://example.com/account"],
+  "sessionMode": "fresh"
+}
+```
+
+After that call succeeds, later default `sessionMode: "auto"` calls continue in the new fresh browser.
+
+If you want to name the new upstream session yourself, pass an explicit session instead:
+
+```json
+{
+  "args": ["--session", "auth-flow", "--profile", "Default", "open", "https://example.com/account"]
+}
+```
 
 ## Docs
 
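The README changes above describe three rules: an explicit `--session` is user-managed, an unnamed `"fresh"` launch rotates the extension-managed session, and later `"auto"` calls follow whichever managed session was committed last. A small model of those rules, written as a sketch, is shown below. Everything in it (`ManagedState`, `planSession`, `commitManagedLaunch`, and the `-N` suffix used to name fresh launches) is an assumption made for illustration, not the package's actual code or naming scheme.

```typescript
// Illustrative model only; type names, function names, and the "-N" suffix are assumptions.
type SessionMode = "auto" | "fresh";

interface ManagedState {
  baseName: string;
  active?: string;
  launches: number;
}

/** Picks the session a call should target. An explicit `--session` wins and is never tracked. */
function planSession(
  state: ManagedState,
  args: readonly string[],
  mode: SessionMode = "auto",
): { session: string; managed: boolean } {
  const idx = args.indexOf("--session");
  const explicit = idx >= 0 ? args[idx + 1] : undefined;
  if (explicit !== undefined) return { session: explicit, managed: false };
  if (mode === "fresh" || state.active === undefined) {
    state.launches += 1;
    return { session: `${state.baseName}-${state.launches}`, managed: true };
  }
  return { session: state.active, managed: true };
}

/** Call after a managed launch succeeds; returns the replaced session to best-effort close, if any. */
function commitManagedLaunch(state: ManagedState, session: string): string | undefined {
  const previous = state.active !== session ? state.active : undefined;
  state.active = session;
  return previous;
}

// Usage: public browsing, then a fresh profiled launch, then a default call.
const state: ManagedState = { baseName: "piab-demo", launches: 0 };
const first = planSession(state, ["open", "https://example.com"]);
commitManagedLaunch(state, first.session);
const fresh = planSession(state, ["--profile", "Default", "open", "https://example.com/account"], "fresh");
const replaced = commitManagedLaunch(state, fresh.session);
const next = planSession(state, ["snapshot", "-i"]);
```

In this model `replaced` is the session from the first call, which the caller would best-effort close, and `next` targets the fresh session. Committing only after success mirrors the wording "after that call succeeds".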
package/docs/ARCHITECTURE.md
CHANGED
@@ -59,29 +59,33 @@ The published package should exclude agent-only and superseded repo materials su
 
 ### Default
 
-If the caller does not provide `--session`, the extension should use an implicit session name derived from the current `pi` session id.
+If the caller does not provide `--session`, the extension should default to `sessionMode: "auto"` and use an implicit session name derived from the current `pi` session id plus a hash of the absolute cwd.
 
 Why:
 - works out of the box
 - gives continuity across calls
 - avoids forcing the agent to invent session names for basic browsing
 
-### Explicit upstream sessions
+### Explicit upstream sessions and fresh launches
 
 If the caller provides `--session`, `--profile`, `--cdp`, or similar upstream flags, the extension should respect them with minimal interference.
 
+The tool should also expose a first-class `sessionMode: "fresh"` escape hatch so agents can intentionally rotate the extension-managed session to a fresh upstream launch without inventing a fixed explicit session name.
+
 ### Ownership
 
 V1 ownership rule:
 - implicit auto-generated sessions are extension-managed convenience sessions
+- unnamed `sessionMode: "fresh"` launches rotate that extension-managed session to a new upstream browser
 - explicit/user-managed sessions are not auto-managed by default
--
+- extension-managed sessions should be reusable during an active `pi` session, but should still be cleaned up predictably
 
 Practical policy:
-- on normal `pi` shutdown, best-effort close the
-- also set an idle timeout on
-- clean up private temp spill artifacts owned by the
--
+- on normal `pi` shutdown, best-effort close the current extension-managed session
+- also set an idle timeout on extension-managed sessions so abandoned daemons self-clean after inactivity
+- clean up private temp spill artifacts owned by the extension-managed session on shutdown
+- if an unnamed fresh launch replaces an active extension-managed session, best-effort close the old managed session after the switch succeeds
+- leave explicit caller-provided `--session` choices alone unless the caller closes them explicitly
 
 This is primarily about ownership clarity and avoiding surprise, not adding a heavy safety wrapper. If the extension invented the session, the extension should clean it up. If the caller explicitly chose the upstream session model, the extension should stay out of the way.
 
@@ -92,7 +96,11 @@ The extension should surface that clearly and avoid hidden restart behavior in v
 
 That means explicit startup-scoping flags like `--profile`, `--session-name`, and `--cdp` should remain explicit upstream choices instead of being wrapped in extra hidden restart or cloning logic.
 
-If the implicit session is already active and one of those startup-scoped flags appears again
+If the implicit session is already active and one of those startup-scoped flags appears again while `sessionMode` is still `"auto"`, the extension should fail clearly instead of silently sending a command shape that upstream would ignore.
+
+That failure should include a structured recovery hint pointing to `sessionMode: "fresh"` as the first-line fix, while still allowing an explicit `--session` when the caller wants to name the new upstream session.
+
+A successful unnamed `sessionMode: "fresh"` launch should become the new extension-managed session so later default calls follow that browser instead of silently snapping back to the older managed session.
 
 ## Preferring the native tool
 
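The architecture text above derives the implicit session name from the `pi` session id plus a hash of the absolute cwd, so that two checkouts with the same directory name do not share a browser session. A minimal sketch of that derivation follows. The diff does not show which hash the package uses or how the name is laid out, so the 32-bit FNV-1a variant, the `piab-<id>-<hash>` layout, and both function names are stand-in assumptions chosen to keep the example dependency-free.

```typescript
// FNV-1a (32-bit, over UTF-16 code units) is only a small stand-in here; the
// package's real hash function and name layout are not shown in this diff.
function fnv1a32(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  // Left-pad to a fixed 8 hex digits so names have a stable length.
  return ("00000000" + hash.toString(16)).slice(-8);
}

/** Hypothetical name builder: stable for a given (session id, cwd) pair. */
function implicitSessionName(piSessionId: string, absoluteCwd: string): string {
  return `piab-${piSessionId}-${fnv1a32(absoluteCwd)}`;
}

// Usage: same `pi` session id, two different checkouts.
const nameA = implicitSessionName("abc123", "/work/a/repo");
const nameB = implicitSessionName("abc123", "/work/b/repo");
```

Because the hash input is the absolute path rather than the directory's base name, `nameA` and `nameB` differ even though both checkouts are called `repo`.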
package/docs/TOOL_CONTRACT.md
CHANGED
@@ -32,7 +32,7 @@ The tool also needs an operating playbook, not just a capability list. The model
 {
   "args": ["open", "https://example.com"],
   "stdin": "optional raw stdin content",
-  "
+  "sessionMode": "auto"
 }
 ```
 
@@ -69,15 +69,20 @@ Examples:
 { "args": ["batch"], "stdin": "[[\"open\",\"https://example.com\"],[\"snapshot\",\"-i\"]]" }
 ```
 
-### `
+### `sessionMode`
 
-- type: `
+- type: `"auto" | "fresh"`
 - optional
-- default: `
+- default: `"auto"`
 
 Behavior:
 - if `args` already include `--session`, upstream session choice wins
--
+- `"auto"` prepends the current extension-managed active session when appropriate
+- `"fresh"` rotates that managed session to a fresh upstream launch so startup-scoped flags like `--profile`, `--session-name`, or `--cdp` apply and later default calls follow the new browser
+
+Recommended use:
+- use `"auto"` for the common browse/snapshot/click flow inside one `pi` session
+- use `"fresh"` when switching from an already-active implicit session to a new profile/debug/auth launch without inventing a fixed explicit session name
 
 ## Wrapper behavior
 
@@ -87,8 +92,8 @@ The extension should:
 - parse JSON output into tool details
 - handle observed JSON result shapes, including the array returned by `batch --json`
 - allow plain-text fallback for inspection commands like `--help` and `--version`
--
--
+- support those inspection commands unconditionally so the tool contract stays local and predictable
+- still describe normal browser workflows in guidance so models do not overuse inspection for routine tasks
 - surface stderr and non-zero exits clearly
 - attach images when the result points to a screenshot-like artifact
 
@@ -104,7 +109,8 @@ Primary content should be:
 
 Examples:
 - small `snapshot` results should include the actual snapshot text
-- oversized `snapshot` results should switch to a compact view that preserves the primary content, nearby sections, high-value refs,
+- oversized `snapshot` results should switch to a compact view that preserves the primary content, nearby sections, and a trimmed set of high-value refs, while exposing the full raw snapshot path via `details.fullOutputPath`
+- successful navigation actions like `click`, `back`, `forward`, and `reload` should include a lightweight post-action title/url summary when the wrapper can address the active session
 - `tab list` should include a readable tab summary
 - `screenshot` should include the saved-path summary plus the inline image attachment when available
 
@@ -116,6 +122,7 @@ Recommended details:
 {
   "args": ["snapshot", "-i"],
   "effectiveArgs": ["--session", "pi-abc123", "--json", "snapshot", "-i"],
+  "sessionMode": "auto",
   "sessionName": "pi-abc123",
   "usedImplicitSession": true,
   "data": {
@@ -136,7 +143,8 @@ For oversized snapshots, details should switch to a compact metadata object and
 
 Worth doing in v1:
 - screenshots → inline image attachment
-- snapshots → origin + ref count + main-content-first compact preview, with
+- snapshots → origin + ref count + main-content-first compact preview, with the raw snapshot spill path kept in `details.fullOutputPath` when the inline result would otherwise be too large
+- navigation actions like `click`, `back`, `forward`, and `reload` → lightweight post-action title/url summary when available
 - tab lists → compact summary/table
 - stream status → enabled/connected/port summary
 
@@ -149,16 +157,18 @@ If `agent-browser` is not on `PATH`, fail with a message that:
 
 ## Session behavior
 
-- maintain one
-- derive
+- maintain one extension-managed active session per `pi` session for the common path
+- derive the base implicit session name from the official `pi` session id plus a cwd hash so same-named checkouts do not collide
 - respect explicit upstream `--session` with minimal interference
-- treat the
-- on normal `pi` shutdown, best-effort close the
-- set an idle timeout on
-- clean up private temp spill artifacts owned by the
--
+- treat the extension-managed session as convenience state owned by the wrapper
+- on normal `pi` shutdown, best-effort close the current extension-managed session
+- set an idle timeout on extension-managed sessions so abandoned daemons eventually self-clean
+- clean up private temp spill artifacts owned by the extension-managed session on shutdown
+- when an unnamed `sessionMode: "fresh"` launch succeeds, make it the new extension-managed session so later default calls keep using it
+- if that unnamed fresh launch replaced an already-active managed session, best-effort close the old managed session after the switch succeeds
+- treat explicit caller-provided `--session` choices as user-managed
 - pass explicit `--profile` straight through to upstream `agent-browser`; no profile-cloning or isolation layer is added in v1
-- if startup-scoped flags like `--profile`, `--session-name`, or `--cdp` are supplied after the implicit session is already active
+- if startup-scoped flags like `--profile`, `--session-name`, or `--cdp` are supplied after the implicit session is already active while `sessionMode` is `"auto"`, return a validation error with a structured recovery hint that recommends `sessionMode: "fresh"`
 
 ## Non-goals
 
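The last session-behavior rule in the contract (startup-scoped flags supplied under `"auto"` while the implicit session is already active) can be read as a small pre-flight check. The sketch below is one possible reading. The contract only says the error carries a "structured recovery hint", so the `ReuseCheck` shape, the `recoveryHint` field names, and the function name `checkStartupScopedReuse` are hypothetical.

```typescript
type SessionMode = "auto" | "fresh";

const STARTUP_SCOPED_FLAGS = ["--profile", "--session-name", "--cdp"];

// Field names in the hint are hypothetical; the contract only says the hint is "structured".
interface ReuseCheck {
  ok: boolean;
  error?: string;
  recoveryHint?: { retryWith: { sessionMode: "fresh" }; alternative: string };
}

function checkStartupScopedReuse(
  args: readonly string[],
  mode: SessionMode,
  implicitSessionActive: boolean,
): ReuseCheck {
  // An explicit upstream session wins, so the implicit session is not involved.
  if (args.indexOf("--session") !== -1) return { ok: true };
  // A fresh launch, or no active implicit session yet, means the flags can apply.
  if (mode === "fresh" || !implicitSessionActive) return { ok: true };
  const offending = args.filter((arg) => STARTUP_SCOPED_FLAGS.indexOf(arg) !== -1);
  if (offending.length === 0) return { ok: true };
  return {
    ok: false,
    error: `${offending.join(", ")} only applies at launch, but the implicit session is already active`,
    recoveryHint: {
      retryWith: { sessionMode: "fresh" },
      alternative: "pass an explicit --session <name> for the new launch",
    },
  };
}

// Usage: the same args are blocked under "auto" and allowed under "fresh".
const profiledArgs = ["--profile", "Default", "open", "https://example.com/account"];
const blocked = checkStartupScopedReuse(profiledArgs, "auto", true);
const allowed = checkStartupScopedReuse(profiledArgs, "fresh", true);
```

Returning the hint as data rather than only as prose lets an agent retry mechanically, which appears to be the point of "structured" in the contract wording.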