pi-agent-browser-native 0.1.6 → 0.2.1

package/CHANGELOG.md CHANGED
@@ -1,5 +1,29 @@
  # Changelog
 
+ ## 0.2.1 - 2026-04-12
+
+ ### Fixed
+ - the GitHub source trial docs now use `pi --no-extensions -e https://github.com/fitchmultz/pi-agent-browser-native` so published-package users do not hit duplicate `agent_browser` registration conflicts during source-path testing
+ - successful unnamed `sessionMode: "fresh"` launches now rotate the extension-managed session to the new browser, and later default `sessionMode: "auto"` calls keep following that fresh session instead of silently snapping back to the older one
+ - mixed-success `batch` failures now preserve per-step rendering, include the first failing step in the visible output and structured details, and still mark the overall tool call as an error so agents can recover from partial progress
+ - implicit `piab-*` session names now include a stable cwd hash in addition to the `pi` session id so same-named checkouts and worktrees no longer collide onto the same browser session
+ - value-taking flags like `--session`, `--profile`, `--session-name`, and `--cdp` now fail locally with direct validation errors when the value is missing or replaced by another flag, instead of producing confusing downstream JSON parse failures
+ - the bash guard now catches wrapped `agent-browser` invocations such as `env agent-browser ...`, `npx --yes agent-browser ...`, `pnpm dlx agent-browser ...`, `yarn dlx agent-browser ...`, `bunx agent-browser ...`, and absolute-path execution, reducing accidental bypasses of the native-tool path
+
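The local flag validation described in the 0.2.1 entry can be pictured roughly as follows. This is only a sketch under stated assumptions: the flag list is taken from the changelog entry, while the function name `findMissingFlagValue`, the error wording, and the rule that any token starting with `-` counts as "another flag" are hypothetical and not taken from the package source. The `--flag=value` form is not handled here.

```typescript
// Hypothetical helper; only the flag names come from the changelog entry.
const VALUE_FLAGS = new Set(["--session", "--profile", "--session-name", "--cdp"]);

// Returns an error message for the first value-taking flag whose value is
// missing or replaced by another flag, or null when every flag has a value.
function findMissingFlagValue(args: string[]): string | null {
  for (let i = 0; i < args.length; i++) {
    const arg = args[i];
    if (!VALUE_FLAGS.has(arg)) continue;
    const value = args[i + 1];
    // Assumption: a token that starts with "-" is treated as a flag, not a value.
    if (value === undefined || value.startsWith("-")) {
      return `${arg} requires a value`;
    }
    i++; // skip the value that this flag consumed
  }
  return null;
}
```

Under these assumptions, `["--session", "--profile", "Default"]` is rejected at `--session` before anything is sent upstream, which is the kind of direct error the entry contrasts with a downstream JSON parse failure.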
+ ## 0.2.0 - 2026-04-12
+
+ ### Changed
+ - `batch` now reuses the richer standalone renderers, so batched snapshots keep the compact main-content-first view and batched screenshots keep inline image attachments instead of degrading to raw JSON-ish text
+ - the tool schema now uses `sessionMode: "auto" | "fresh"` instead of the old implicit-session boolean so agents have a first-class way to request a fresh profiled/debug launch, and blocked startup-scoped reuse errors now include structured recovery hints
+ - plain-text inspection commands like `agent_browser --help` and `--version` are now always allowed, removing the old prompt-dependent inspection gate and making the inspection contract local and predictable
+ - navigation actions like `click`, `dblclick`, `back`, `forward`, and `reload` now include lightweight post-action title/url summaries when the wrapper can address the active session, reducing guess-and-check follow-up snapshots
+ - compact snapshot rendering is leaner by default: fewer additional sections, fewer refs, smaller role summaries, and the raw spill path now stays in `details.fullOutputPath` instead of dominating the visible snapshot body
+ - README and injected tool guidance now include a compact agent quick start with the core call shapes for `open` + `snapshot`, `click` + re-snapshot, `batch`, `eval --stdin`, and fresh profiled launches
+
+ ### Migration notes
+ - replace any use of `useActiveSession` with `sessionMode`
+ - use `sessionMode: "fresh"` when you need a new `--profile`, `--session-name`, or `--cdp` launch after the implicit session is already active
+
  ## 0.1.6 - 2026-04-12
 
  ### Changed
package/README.md CHANGED
@@ -69,12 +69,14 @@ For the source install path, prefer the repository URL:
  pi install https://github.com/fitchmultz/pi-agent-browser-native
  ```
 
- To try the GitHub source without installing it permanently:
+ To try the GitHub source without installing it permanently, isolate that temporary source extension from your normal installed package set:
 
  ```bash
- pi -e https://github.com/fitchmultz/pi-agent-browser-native
+ pi --no-extensions -e https://github.com/fitchmultz/pi-agent-browser-native
  ```
 
+ This avoids duplicate `agent_browser` registrations when you already have `pi-agent-browser-native` installed globally.
+
  ### Current practical local-checkout flow
 
  Until you are using a published package release, prefer an explicit checkout-only run instead of installing the checkout into your normal `pi` package set:
@@ -87,6 +89,69 @@ This avoids duplicate `agent_browser` registrations if you also have the publish
 
  The native tool exposed to the agent is named `agent_browser`.
 
+ The primary session control parameter is `sessionMode`:
+
+ - `"auto"` (default) reuses the extension-managed `pi`-scoped session when possible
+ - `"fresh"` switches that managed session to a fresh upstream launch so startup-scoped flags like `--profile`, `--session-name`, and `--cdp` apply and later auto calls follow the new browser
+
+ ## Agent quick start
+
+ ### Mental model
+
+ - `args` — exact CLI args after `agent-browser`
+ - `stdin` — raw stdin only for `batch` and `eval --stdin`
+ - `sessionMode`
+   - `"auto"` — default, reuse the extension-managed `pi`-scoped session
+   - `"fresh"` — switch that managed session to a new profile/debug launch
+
+ ### Common call shapes
+
+ Open a page, then take an interactive snapshot:
+
+ ```json
+ { "args": ["open", "https://example.com"] }
+ { "args": ["snapshot", "-i"] }
+ ```
+
+ Click a ref, then re-snapshot after navigation or a major DOM change:
+
+ ```json
+ { "args": ["click", "@e2"] }
+ { "args": ["snapshot", "-i"] }
+ ```
+
+ Run a multi-step browser flow in one tool call:
+
+ ```json
+ { "args": ["batch"], "stdin": "[[\"open\",\"https://example.com\"],[\"snapshot\",\"-i\"]]" }
+ ```
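The escaped `stdin` string in the `batch` call shape is easier to produce programmatically than by hand. The sketch below builds it with `JSON.stringify`. The field names `args`, `stdin`, and `sessionMode` come from the README text in this diff; the `AgentBrowserCall` type and the `batchCall` helper are hypothetical names used only for illustration.

```typescript
// Hypothetical shapes for illustration; only the field names come from the README.
type BatchStep = string[];

interface AgentBrowserCall {
  args: string[];
  stdin?: string;
  sessionMode?: "auto" | "fresh";
}

// Builds a `batch` call whose stdin is the JSON array of steps.
function batchCall(steps: BatchStep[]): AgentBrowserCall {
  // JSON.stringify produces the quoting that the README example escapes by hand.
  return { args: ["batch"], stdin: JSON.stringify(steps) };
}

const exampleCall = batchCall([
  ["open", "https://example.com"],
  ["snapshot", "-i"],
]);
console.log(exampleCall.stdin);
// prints [["open","https://example.com"],["snapshot","-i"]]
```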
+
+ Evaluate page JavaScript via stdin:
+
+ ```json
+ { "args": ["eval", "--stdin"], "stdin": "document.title" }
+ ```
+
+ Start a fresh profiled launch after you already used the implicit session:
+
+ ```json
+ { "args": ["--profile", "Default", "open", "https://example.com/account"], "sessionMode": "fresh" }
+ ```
+
+ After a successful unnamed fresh launch, later `sessionMode: "auto"` calls follow that new browser automatically.
+
+ Name a new upstream session explicitly when you want to keep reusing it yourself:
+
+ ```json
+ { "args": ["--session", "auth-flow", "open", "https://example.com"] }
+ ```
+
+ ### First useful prompt in a fresh `pi` session
+
+ ```text
+ Use the agent_browser tool to open https://react.dev and then take an interactive snapshot.
+ ```
+
  ## Local development
 
  Do not track or rely on a repo-local `.pi/extensions/agent-browser.ts` autoload shim for this package. When the package is also installed globally, that creates a duplicate `agent_browser` registration and blocks `pi` startup from this working directory.
@@ -116,13 +181,42 @@ Validated workflow examples:
  - run `batch` with JSON via `stdin`
  - run `eval --stdin`
  - take a screenshot with inline attachment support
- - inspect `agent_browser --help` and `--version`
+ - inspect `agent_browser --help` and `--version` via the tool's plain-text inspection fallback
+
+ Inspection commands like `agent_browser --help` and `--version` are always supported. They return plain text and are useful for debugging or capability checks, but they are not required for normal browsing workflows.
 
  Current cautions:
  - passing `--profile` is an explicit upstream choice; this extension does not add its own profile-cloning or isolation layer
- - startup-scoped flags like `--profile`, `--session-name`, and `--cdp` are for the first command that launches a session; if the implicit session is already active, the extension returns a validation error instead of silently letting upstream ignore those flags
+ - startup-scoped flags like `--profile`, `--session-name`, and `--cdp` are for the first command that launches a session; if the implicit session is already active, retry that call with `sessionMode: "fresh"` or provide an explicit `--session ...` for the new launch
  - implicit `piab-*` sessions are extension-managed convenience sessions; they are best-effort closed on `pi` shutdown, get an idle timeout to reduce stale background daemons, and clean up private temp spill artifacts on shutdown
- - explicit upstream sessions like `--session`, `--profile`, `--session-name`, and `--cdp` are treated as user-managed and are not auto-closed by the extension
+ - `sessionMode: "fresh"` without an explicit `--session` rotates that extension-managed session to the new browser so later auto calls keep using it
+ - explicit caller-provided `--session` values are treated as user-managed and are not auto-closed by the extension
+
+ ### Switching from public browsing to a fresh profile/debug launch
+
+ A common agent workflow is:
+
+ 1. browse a public page with the default implicit session
+ 2. then switch to a fresh authenticated/profile/debug launch
+
+ Use `sessionMode: "fresh"` for that transition instead of relying on the implicit session:
+
+ ```json
+ {
+   "args": ["--profile", "Default", "open", "https://example.com/account"],
+   "sessionMode": "fresh"
+ }
+ ```
+
+ After that call succeeds, later default `sessionMode: "auto"` calls continue in the new fresh browser.
+
+ If you want to name the new upstream session yourself, pass an explicit session instead:
+
+ ```json
+ {
+   "args": ["--session", "auth-flow", "--profile", "Default", "open", "https://example.com/account"]
+ }
+ ```
 
  ## Docs
 
@@ -59,29 +59,33 @@ The published package should exclude agent-only and superseded repo materials su
 
  ### Default
 
- If the caller does not provide `--session`, the extension should use an implicit session name derived from the current `pi` session id.
+ If the caller does not provide `--session`, the extension should default to `sessionMode: "auto"` and use an implicit session name derived from the current `pi` session id plus a hash of the absolute cwd.
 
  Why:
  - works out of the box
  - gives continuity across calls
  - avoids forcing the agent to invent session names for basic browsing
 
- ### Explicit upstream sessions
+ ### Explicit upstream sessions and fresh launches
 
  If the caller provides `--session`, `--profile`, `--cdp`, or similar upstream flags, the extension should respect them with minimal interference.
 
+ The tool should also expose a first-class `sessionMode: "fresh"` escape hatch so agents can intentionally rotate the extension-managed session to a fresh upstream launch without inventing a fixed explicit session name.
+
  ### Ownership
 
  V1 ownership rule:
  - implicit auto-generated sessions are extension-managed convenience sessions
+ - unnamed `sessionMode: "fresh"` launches rotate that extension-managed session to a new upstream browser
  - explicit/user-managed sessions are not auto-managed by default
- - implicit sessions should be reusable during an active `pi` session, but should still be cleaned up predictably
+ - extension-managed sessions should be reusable during an active `pi` session, but should still be cleaned up predictably
 
  Practical policy:
- - on normal `pi` shutdown, best-effort close the implicit session
- - also set an idle timeout on implicit sessions so abandoned daemons self-clean after inactivity
- - clean up private temp spill artifacts owned by the implicit session on shutdown
- - leave explicit upstream sessions like `--session`, `--profile`, `--session-name`, and `--cdp` alone unless the caller closes them explicitly
+ - on normal `pi` shutdown, best-effort close the current extension-managed session
+ - also set an idle timeout on extension-managed sessions so abandoned daemons self-clean after inactivity
+ - clean up private temp spill artifacts owned by the extension-managed session on shutdown
+ - if an unnamed fresh launch replaces an active extension-managed session, best-effort close the old managed session after the switch succeeds
+ - leave explicit caller-provided `--session` choices alone unless the caller closes them explicitly
 
  This is primarily about ownership clarity and avoiding surprise, not adding a heavy safety wrapper. If the extension invented the session, the extension should clean it up. If the caller explicitly chose the upstream session model, the extension should stay out of the way.
 
@@ -92,7 +96,11 @@ The extension should surface that clearly and avoid hidden restart behavior in v
 
  That means explicit startup-scoping flags like `--profile`, `--session-name`, and `--cdp` should remain explicit upstream choices instead of being wrapped in extra hidden restart or cloning logic.
 
- If the implicit session is already active and one of those startup-scoped flags appears again, the extension should fail clearly instead of silently sending a command shape that upstream would ignore.
+ If the implicit session is already active and one of those startup-scoped flags appears again while `sessionMode` is still `"auto"`, the extension should fail clearly instead of silently sending a command shape that upstream would ignore.
+
+ That failure should include a structured recovery hint pointing to `sessionMode: "fresh"` as the first-line fix, while still allowing an explicit `--session` when the caller wants to name the new upstream session.
+
+ A successful unnamed `sessionMode: "fresh"` launch should become the new extension-managed session so later default calls follow that browser instead of silently snapping back to the older managed session.
 
  ## Preferring the native tool
 
@@ -32,7 +32,7 @@ The tool also needs an operating playbook, not just a capability list. The model
  {
    "args": ["open", "https://example.com"],
    "stdin": "optional raw stdin content",
-   "useActiveSession": true
+   "sessionMode": "auto"
  }
  ```
 
@@ -69,15 +69,20 @@ Examples:
  { "args": ["batch"], "stdin": "[[\"open\",\"https://example.com\"],[\"snapshot\",\"-i\"]]" }
  ```
 
- ### `useActiveSession`
+ ### `sessionMode`
 
- - type: `boolean`
+ - type: `"auto" | "fresh"`
  - optional
- - default: `true`
+ - default: `"auto"`
 
  Behavior:
  - if `args` already include `--session`, upstream session choice wins
- - otherwise the extension prepends its implicit active session when `useActiveSession` is `true`
+ - `"auto"` prepends the current extension-managed active session when appropriate
+ - `"fresh"` rotates that managed session to a fresh upstream launch so startup-scoped flags like `--profile`, `--session-name`, or `--cdp` apply and later default calls follow the new browser
+
+ Recommended use:
+ - use `"auto"` for the common browse/snapshot/click flow inside one `pi` session
+ - use `"fresh"` when switching from an already-active implicit session to a new profile/debug/auth launch without inventing a fixed explicit session name
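The `sessionMode` behavior rules above can be read as a small decision function. The following is a minimal sketch, not the extension's actual code: `ManagedState`, `Resolved`, `resolveSession`, and the `freshName` parameter are hypothetical names, the error wording is invented, and the extra `--json` flag that the wrapper adds to effective args is left out for brevity. The returned `state` is what a wrapper would adopt only after the launch succeeds.

```typescript
type SessionMode = "auto" | "fresh";

// Hypothetical record of the extension-managed session.
interface ManagedState {
  name: string;    // current extension-managed session name
  active: boolean; // true once a command has launched that session
}

type Resolved =
  | { ok: true; effectiveArgs: string[]; state: ManagedState }
  | { ok: false; error: string; hint: { sessionMode: "fresh" } };

const STARTUP_SCOPED = ["--profile", "--session-name", "--cdp"];

function resolveSession(
  args: string[],
  mode: SessionMode,
  state: ManagedState,
  freshName: string, // assumed to be generated by the caller
): Resolved {
  // Rule 1: an explicit --session means the upstream session choice wins.
  if (args.includes("--session")) {
    return { ok: true, effectiveArgs: args, state };
  }
  // Rule 2: "fresh" rotates the managed session to a new upstream launch.
  if (mode === "fresh") {
    return {
      ok: true,
      effectiveArgs: ["--session", freshName, ...args],
      state: { name: freshName, active: true },
    };
  }
  // Rule 3: "auto" refuses startup-scoped flags on an already-active session
  // and points the caller at sessionMode "fresh" as the recovery path.
  const startupFlag = args.find((a) => STARTUP_SCOPED.includes(a));
  if (state.active && startupFlag !== undefined) {
    return {
      ok: false,
      error: `${startupFlag} cannot apply to the already-active managed session`,
      hint: { sessionMode: "fresh" },
    };
  }
  // Otherwise "auto" prepends the current managed session.
  return {
    ok: true,
    effectiveArgs: ["--session", state.name, ...args],
    state: { ...state, active: true },
  };
}
```

Returning the next state instead of mutating it keeps the rotation step explicit, so a failed fresh launch can simply discard the result and keep the old managed session.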
 
  ## Wrapper behavior
 
@@ -87,8 +92,8 @@ The extension should:
  - parse JSON output into tool details
  - handle observed JSON result shapes, including the array returned by `batch --json`
  - allow plain-text fallback for inspection commands like `--help` and `--version`
- - discourage exploratory inspection calls unless the user explicitly asks or debugging requires them
- - deflect normal-task `--help` inspection back into the standard browser workflow instead of letting the model relearn the tool from scratch each session
+ - support those inspection commands unconditionally so the tool contract stays local and predictable
+ - still describe normal browser workflows in guidance so models do not overuse inspection for routine tasks
  - surface stderr and non-zero exits clearly
  - attach images when the result points to a screenshot-like artifact
 
@@ -104,7 +109,8 @@ Primary content should be:
 
  Examples:
  - small `snapshot` results should include the actual snapshot text
- - oversized `snapshot` results should switch to a compact view that preserves the primary content, nearby sections, high-value refs, and a path to the spilled full raw snapshot
+ - oversized `snapshot` results should switch to a compact view that preserves the primary content, nearby sections, and a trimmed set of high-value refs, while exposing the full raw snapshot path via `details.fullOutputPath`
+ - successful navigation actions like `click`, `back`, `forward`, and `reload` should include a lightweight post-action title/url summary when the wrapper can address the active session
  - `tab list` should include a readable tab summary
  - `screenshot` should include the saved-path summary plus the inline image attachment when available
 
@@ -116,6 +122,7 @@ Recommended details:
  {
    "args": ["snapshot", "-i"],
    "effectiveArgs": ["--session", "pi-abc123", "--json", "snapshot", "-i"],
+   "sessionMode": "auto",
    "sessionName": "pi-abc123",
    "usedImplicitSession": true,
    "data": {
@@ -136,7 +143,8 @@ For oversized snapshots, details should switch to a compact metadata object and
 
  Worth doing in v1:
  - screenshots → inline image attachment
- - snapshots → origin + ref count + main-content-first compact preview, with full raw snapshot spill files when the inline result would otherwise be too large
+ - snapshots → origin + ref count + main-content-first compact preview, with the raw snapshot spill path kept in `details.fullOutputPath` when the inline result would otherwise be too large
+ - navigation actions like `click`, `back`, `forward`, and `reload` → lightweight post-action title/url summary when available
  - tab lists → compact summary/table
  - stream status → enabled/connected/port summary
 
@@ -149,16 +157,18 @@ If `agent-browser` is not on `PATH`, fail with a message that:
 
  ## Session behavior
 
- - maintain one implicit active session per `pi` session for the common path
- - derive that implicit session from the official `pi` session id
+ - maintain one extension-managed active session per `pi` session for the common path
+ - derive the base implicit session name from the official `pi` session id plus a cwd hash so same-named checkouts do not collide
  - respect explicit upstream `--session` with minimal interference
- - treat the implicit session as extension-managed convenience state
- - on normal `pi` shutdown, best-effort close the implicit session
- - set an idle timeout on implicit sessions so abandoned daemons eventually self-clean
- - clean up private temp spill artifacts owned by the implicit session on shutdown
- - treat explicit upstream session choices like `--session`, `--profile`, `--session-name`, and `--cdp` as user-managed
+ - treat the extension-managed session as convenience state owned by the wrapper
+ - on normal `pi` shutdown, best-effort close the current extension-managed session
+ - set an idle timeout on extension-managed sessions so abandoned daemons eventually self-clean
+ - clean up private temp spill artifacts owned by the extension-managed session on shutdown
+ - when an unnamed `sessionMode: "fresh"` launch succeeds, make it the new extension-managed session so later default calls keep using it
+ - if that unnamed fresh launch replaced an already-active managed session, best-effort close the old managed session after the switch succeeds
+ - treat explicit caller-provided `--session` choices as user-managed
  - pass explicit `--profile` straight through to upstream `agent-browser`; no profile-cloning or isolation layer is added in v1
- - if startup-scoped flags like `--profile`, `--session-name`, or `--cdp` are supplied after the implicit session is already active, return a validation error instead of silently relying on upstream to ignore them
+ - if startup-scoped flags like `--profile`, `--session-name`, or `--cdp` are supplied after the implicit session is already active while `sessionMode` is `"auto"`, return a validation error with a structured recovery hint that recommends `sessionMode: "fresh"`
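Deriving the implicit session name from the `pi` session id plus a cwd hash might look like the sketch below. The `piab-` prefix is taken from the README text in this diff; the SHA-256 choice, the 8-character truncation, the exact name layout, and the function name `implicitSessionName` are assumptions made for illustration rather than the package's actual scheme.

```typescript
import { createHash } from "node:crypto";
import { resolve } from "node:path";

// Hypothetical derivation: same pi session id in two different checkouts
// yields two different names because the absolute cwd feeds the hash.
function implicitSessionName(piSessionId: string, cwd: string): string {
  const cwdHash = createHash("sha256")
    .update(resolve(cwd)) // hash the absolute path so relative spellings agree
    .digest("hex")
    .slice(0, 8); // assumed truncation length
  return `piab-${piSessionId}-${cwdHash}`;
}
```

Because the hash depends only on the absolute path, the name stays stable across calls from the same directory, which is what lets later `"auto"` calls find the same managed session.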
 
  ## Non-goals