pi-agent-browser-native 0.2.39 → 0.2.41
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +39 -0
- package/README.md +98 -3
- package/docs/ARCHITECTURE.md +27 -8
- package/docs/COMMAND_REFERENCE.md +84 -6
- package/docs/RELEASE.md +3 -2
- package/docs/SUPPORT_MATRIX.md +15 -11
- package/docs/TOOL_CONTRACT.md +78 -8
- package/extensions/agent-browser/index.ts +14 -3
- package/extensions/agent-browser/lib/config-policy.js +679 -0
- package/extensions/agent-browser/lib/config.ts +198 -0
- package/extensions/agent-browser/lib/input-modes/params.ts +1 -1
- package/extensions/agent-browser/lib/launch-scoped-flags.ts +67 -0
- package/extensions/agent-browser/lib/playbook.ts +42 -17
- package/extensions/agent-browser/lib/results/presentation/browser-profile-recovery.ts +67 -0
- package/extensions/agent-browser/lib/results/presentation/errors.ts +4 -0
- package/extensions/agent-browser/lib/runtime.ts +6 -73
- package/extensions/agent-browser/lib/web-search.ts +703 -0
- package/package.json +3 -1
- package/scripts/config.mjs +381 -0
package/CHANGELOG.md
CHANGED
|
@@ -2,6 +2,45 @@
|
|
|
2
2
|
|
|
3
3
|
## Unreleased
|
|
4
4
|
|
|
5
|
+
No changes yet.
|
|
6
|
+
|
|
7
|
+
## 0.2.41 - 2026-06-03
|
|
8
|
+
|
|
9
|
+
### Added
|
|
10
|
+
|
|
11
|
+
- Added Exa support to the optional `agent_browser_web_search` companion tool. When both Exa and Brave credentials are configured, `provider: "auto"` now prefers Exa by default; use `pi-agent-browser-config web-search prefer brave --global` or `webSearch.preferredProvider: "brave"` to keep Brave as the default provider.
|
|
12
|
+
- Added `webSearch.enabled: false` to disable the companion search tool even when `EXA_API_KEY` or `BRAVE_API_KEY` is present.
|
|
13
|
+
|
|
14
|
+
### Changed
|
|
15
|
+
|
|
16
|
+
- Tightened project-local web-search credential config: `.pi/config/pi-agent-browser-native/config.json` may only reference the matching provider env var (`$EXA_API_KEY` / `${EXA_API_KEY}` for Exa, `$BRAVE_API_KEY` / `${BRAVE_API_KEY}` for Brave). Project-local custom env aliases now fail config validation; move aliases to global config or `PI_AGENT_BROWSER_CONFIG`.
|
|
17
|
+
- Updated `pi-agent-browser-config web-search set-key`, `set-command`, and `clear` to require `--provider`; `set-env` still infers the provider from `EXA_API_KEY` or `BRAVE_API_KEY`.
|
|
18
|
+
- Browser profile/executable prompt guidance now comes only from trusted global config or `PI_AGENT_BROWSER_CONFIG`; project-local browser config is status-only for host profile/executable launch guidance and cannot shadow trusted global guidance.
|
|
19
|
+
- `--executable-path` is treated as launch-scoped, so using it after an active implicit browser session returns fresh-session recovery guidance instead of being quietly ignored by session reuse.
|
|
20
|
+
|
|
21
|
+
### Fixed
|
|
22
|
+
|
|
23
|
+
- Rejected `--session-mode` inside `agent_browser.args` with guidance to use the top-level `sessionMode` field.
|
|
24
|
+
- Added profile/config recovery hints and `profiles` / `doctor` next actions for Chrome profile and user-data-dir failures, including upstream `Chrome profile ... not found` errors.
|
|
25
|
+
- Added web-search rate-limit guidance and serialized companion web-search requests so agents avoid repeated or parallel provider calls after HTTP 429s.
|
|
26
|
+
- Clarified web-search disable precedence: `webSearch.enabled` is evaluated after global → project → `PI_AGENT_BROWSER_CONFIG` merge, so users know when to use global, project, or override-level disable.
|
|
27
|
+
|
|
28
|
+
## 0.2.40 - 2026-06-02
|
|
29
|
+
|
|
30
|
+
### Added
|
|
31
|
+
|
|
32
|
+
- Added Pi-scoped `pi-agent-browser-native` package config at `~/.pi/config/pi-agent-browser-native/config.json`, `.pi/config/pi-agent-browser-native/config.json`, and the `PI_AGENT_BROWSER_CONFIG` override, including the `pi-agent-browser-config` helper for redacted setup/status and conservative browser profile hints.
|
|
33
|
+
- Added the optional Brave-backed `agent_browser_web_search` companion tool, registered only when a usable Brave credential source is configured or resolvable, with compact normalized results for current/live web information.
|
|
34
|
+
|
|
35
|
+
### Changed
|
|
36
|
+
|
|
37
|
+
- Documented optional web-search setup, config precedence, credential safety, and browser profile guidance across the README, command reference, tool contract, architecture notes, and support matrix.
|
|
38
|
+
|
|
39
|
+
### Fixed
|
|
40
|
+
|
|
41
|
+
- Hardened Brave credential handling so project-local config only accepts exact inert `$ENV_VAR` or `${ENV_VAR}` references, rejects plaintext/malformed/interpolation-literal/command-backed secrets, and keeps raw or entity-encoded API keys out of tool output and errors.
|
|
42
|
+
- Cleaned Brave result text by decoding common HTML entities while stripping decoded HTML tags safely and preserving placeholder text such as `<version>`.
|
|
43
|
+
|
|
5
44
|
## 0.2.39 - 2026-06-02
|
|
6
45
|
|
|
7
46
|
### Added
|
package/README.md
CHANGED
|
@@ -146,6 +146,90 @@ The doctor checks:
|
|
|
146
146
|
|
|
147
147
|
It does **not** edit Pi settings and does **not** run upstream `agent-browser doctor --fix`.
|
|
148
148
|
|
|
149
|
+
## Optional package config and web search
|
|
150
|
+
|
|
151
|
+
`pi-agent-browser-native` also reads package-owned config under Pi-scoped paths:
|
|
152
|
+
|
|
153
|
+
- global user config: `~/.pi/config/pi-agent-browser-native/config.json`
|
|
154
|
+
- project config: `.pi/config/pi-agent-browser-native/config.json`
|
|
155
|
+
- explicit override: `PI_AGENT_BROWSER_CONFIG=/path/to/config.json`
|
|
156
|
+
|
|
157
|
+
`pi install npm:pi-agent-browser-native` loads the extension, but it does **not** usually put the package helper on your shell `PATH`. You can configure web search by writing the config file directly, or run the helper through `npm exec` when you want a command to write it for you.
|
|
158
|
+
|
|
159
|
+
Inspect paths/status with the helper when available on `PATH`, or through npm:
|
|
160
|
+
|
|
161
|
+
```bash
|
|
162
|
+
npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config paths
|
|
163
|
+
npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config show
|
|
164
|
+
```
|
|
165
|
+
|
|
166
|
+
The optional `agent_browser_web_search` companion tool is registered only when a usable Exa or Brave credential source is configured or resolvable. It is not an `agent_browser` input mode and does not launch a browser; agents may use it whenever current/live external web information helps, then use `agent_browser` when they need page interaction, screenshots, authenticated/profile content, or DOM inspection. If both keys are available, the default provider is Exa because its `/search` endpoint returns agent-friendly highlights and search modes; set `webSearch.preferredProvider` to `"brave"` when you prefer Brave Search.
|
|
167
|
+
|
|
168
|
+
Get an Exa API key from the [Exa dashboard](https://dashboard.exa.ai/api-keys) or a Brave Search API key from the [Brave Search API dashboard](https://api-dashboard.search.brave.com/). Most users can simply export `EXA_API_KEY` or `BRAVE_API_KEY` in the environment that launches `pi`; config is only needed when you want Pi-scoped secret references, a preferred provider, or to disable this built-in search tool.
|
|
169
|
+
|
|
170
|
+
Most config users should store env-var references in the Pi-scoped config:
|
|
171
|
+
|
|
172
|
+
```bash
|
|
173
|
+
mkdir -p ~/.pi/config/pi-agent-browser-native
|
|
174
|
+
cat > ~/.pi/config/pi-agent-browser-native/config.json <<'JSON'
|
|
175
|
+
{
|
|
176
|
+
"version": 1,
|
|
177
|
+
"webSearch": {
|
|
178
|
+
"enabled": true,
|
|
179
|
+
"preferredProvider": "exa",
|
|
180
|
+
"exaApiKey": "$EXA_API_KEY",
|
|
181
|
+
"braveApiKey": "$BRAVE_API_KEY"
|
|
182
|
+
}
|
|
183
|
+
}
|
|
184
|
+
JSON
|
|
185
|
+
```
|
|
186
|
+
|
|
187
|
+
If you prefer the helper, run it through npm unless you know `pi-agent-browser-config` is already on your `PATH`:
|
|
188
|
+
|
|
189
|
+
```bash
|
|
190
|
+
# Store env-var references in global config.
|
|
191
|
+
npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config web-search set-env EXA_API_KEY --global
|
|
192
|
+
npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config web-search set-env BRAVE_API_KEY --global
|
|
193
|
+
|
|
194
|
+
# Store an env-var reference in project config.
|
|
195
|
+
npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config web-search set-env EXA_API_KEY --project
|
|
196
|
+
|
|
197
|
+
# Prefer Brave when both Exa and Brave keys are available, or clear with "auto".
|
|
198
|
+
npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config web-search prefer brave --global
|
|
199
|
+
|
|
200
|
+
# Disable this package's built-in web-search tool in global config even if API keys are in the environment.
|
|
201
|
+
# Global disable applies to normal runs unless a project config or PI_AGENT_BROWSER_CONFIG override explicitly re-enables it.
|
|
202
|
+
npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config web-search disable --global
|
|
203
|
+
|
|
204
|
+
# Hard-disable web search for one run, regardless of project config, by using the highest-priority override layer.
|
|
205
|
+
cat > /tmp/pi-agent-browser-disable-web-search.json <<'JSON'
|
|
206
|
+
{ "version": 1, "webSearch": { "enabled": false } }
|
|
207
|
+
JSON
|
|
208
|
+
PI_AGENT_BROWSER_CONFIG=/tmp/pi-agent-browser-disable-web-search.json pi
|
|
209
|
+
|
|
210
|
+
# Store a plaintext key in global Pi-scoped user config; output stays redacted.
|
|
211
|
+
printf '%s' "$EXA_API_KEY" | npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config web-search set-key --provider exa --stdin
|
|
212
|
+
|
|
213
|
+
# Store a global secret-manager command source.
|
|
214
|
+
npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config web-search set-command "op read 'op://Private/Brave Search/API Key'" --provider brave --global
|
|
215
|
+
```
|
|
216
|
+
|
|
217
|
+
Config merges in this order: global → project → `PI_AGENT_BROWSER_CONFIG` override. `webSearch.enabled` is evaluated after that merge. Use `web-search disable --global` for a user default, `web-search disable --project` for one repo, and a `PI_AGENT_BROWSER_CONFIG` override with `{ "webSearch": { "enabled": false } }` when web search must stay off even if project config exists. Project-local plaintext, custom env aliases, interpolation-literal, malformed, and command-backed web-search keys are refused; project config may only use the matching provider env refs (`$EXA_API_KEY` / `${EXA_API_KEY}` for Exa and `$BRAVE_API_KEY` / `${BRAVE_API_KEY}` for Brave). `web-search set-key`, `set-command`, and `clear` require `--provider`; `set-env` infers Exa/Brave from `EXA_API_KEY` or `BRAVE_API_KEY` unless you pass `--provider`. The tool content, details, status output, and docs examples must not expose resolved keys.
|
|
218
|
+
|
|
219
|
+
For Exa, the tool defaults to `searchType: "auto"` with `contents.highlights: true`. Agents may pass `searchType` (`fast`, `instant`, `deep-lite`, `deep`, or `deep-reasoning`) only when the task needs that latency/depth tradeoff; structured output schemas are intentionally not exposed yet.
|
|
220
|
+
|
|
221
|
+
The same config file can record conservative browser defaults such as a profile hint or a Chromium-compatible executable path:
|
|
222
|
+
|
|
223
|
+
```bash
|
|
224
|
+
# Ask the agent to use this profile for signed-in/account-specific work.
|
|
225
|
+
npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config browser profile set "Profile 1" --policy authenticated-only
|
|
226
|
+
|
|
227
|
+
# Ask the agent to launch a different Chromium-compatible browser executable.
|
|
228
|
+
npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config browser executable set "/Applications/Brave Browser.app/Contents/MacOS/Brave Browser"
|
|
229
|
+
```
|
|
230
|
+
|
|
231
|
+
This adds agent guidance for signed-in/account-specific tasks; current releases do not auto-inject `--profile` or `--executable-path` for every launch. Configure profile/executable guidance globally or through `PI_AGENT_BROWSER_CONFIG`; project-local browser config is not trusted to steer host executable/profile prompt guidance. Ask the agent to run `agent_browser` with `args: ["profiles"]` and `args: ["doctor"]` when profile resolution fails. The upstream `profiles` command lists Chrome profiles from Chrome's user data directory; `Default` is not canonical on every machine. Use the displayed profile directory name, a full profile/user-data directory path when upstream accepts one, or a configured `browser.executablePath` plus `sessionMode: "fresh"` for a different Chromium-compatible browser.
|
|
232
|
+
|
|
149
233
|
## Common agent calls
|
|
150
234
|
|
|
151
235
|
You usually prompt the agent in natural language. These JSON snippets show the exact native tool shape the agent should use.
|
|
@@ -349,19 +433,30 @@ Artifact cleanup is host-owned, not a browser command. Close commands (`close`,
|
|
|
349
433
|
Start a fresh profiled browser after the implicit public-browsing session already exists:
|
|
350
434
|
|
|
351
435
|
```json
|
|
352
|
-
{ "args": ["--profile", "
|
|
436
|
+
{ "args": ["--profile", "Profile 1", "open", "https://example.com/account"], "sessionMode": "fresh" }
|
|
437
|
+
```
|
|
438
|
+
|
|
439
|
+
Start a fresh launch with a different Chromium-compatible executable:
|
|
440
|
+
|
|
441
|
+
```json
|
|
442
|
+
{
|
|
443
|
+
"args": ["--executable-path", "/Applications/Brave Browser.app/Contents/MacOS/Brave Browser", "open", "https://example.com/account"],
|
|
444
|
+
"sessionMode": "fresh"
|
|
445
|
+
}
|
|
353
446
|
```
|
|
354
447
|
|
|
355
448
|
After a successful unnamed fresh launch, later default `sessionMode: "auto"` calls follow that browser automatically. If the fresh launch fails or times out, `details.managedSessionOutcome` records whether the previous managed session was preserved or the attempted fresh session was abandoned before any managed session became current; a `Managed session outcome: …` line is appended only when the failing call used `sessionMode: "fresh"`. If you explicitly close the current wrapper-managed session with `--session <name> close`, later default auto calls rotate to a new wrapper-generated session instead of reusing that closed name, and repeated closes keep reserving fresh names across resume/branch restore.
|
|
356
449
|
|
|
357
450
|
## Authenticated/profile workflows
|
|
358
451
|
|
|
359
|
-
The wrapper does not clone profiles or hide what upstream Chrome profile you chose. Passing `--profile` is an explicit upstream `agent-browser` choice. Visible page content from real profiles is model-visible and may persist in transcripts or saved artifacts; redaction protects credential-like cookie/storage/auth values, not ordinary page text you asked the browser to read.
|
|
452
|
+
The wrapper does not clone profiles or hide what upstream Chrome/Chromium profile or executable you chose. Passing `--profile` or `--executable-path` is an explicit upstream `agent-browser` choice. Visible page content from real profiles is model-visible and may persist in transcripts or saved artifacts; redaction protects credential-like cookie/storage/auth values, not ordinary page text you asked the browser to read.
|
|
360
453
|
|
|
361
454
|
Use these rules:
|
|
362
455
|
|
|
363
456
|
- Use public/temp profiles for tests and examples.
|
|
364
|
-
-
|
|
457
|
+
- Do not assume `--profile Default` is correct. Ask the agent to run `profiles` to list Chrome profile directory names, then `doctor` if profile/user-data-dir resolution still fails.
|
|
458
|
+
- For non-Chrome Chromium browsers such as Brave, Edge, Arc, or Vivaldi, use `--executable-path <path>` when upstream can launch that executable. If you need that browser's existing login state, use the browser's real profile/user-data directory path when upstream accepts it, or attach with `--auto-connect` / `connect` to a debug-enabled running browser when appropriate.
|
|
459
|
+
- Use `sessionMode: "fresh"` when switching from public browsing to `--profile`, `--executable-path`, `--session-name`, `--cdp`, `--state`, `--auto-connect`, `--init-script`, `--enable`, `-p` / `--provider`, or iOS `--device`.
|
|
365
460
|
- Use `--session` when you want to manage a live upstream session name yourself.
|
|
366
461
|
- Do not treat `--session` as persisted auth or tab restore after `close`, `quit`, or `exit`; use `--profile`, `--session-name`, or `--state` for persistence.
|
|
367
462
|
- Prefer page actions and storage checks over cookie dumps. `cookies get` can expose real profile cookies.
|
package/docs/ARCHITECTURE.md
CHANGED
|
@@ -15,17 +15,22 @@ The package install path is the primary product path. Local checkout development
|
|
|
15
15
|
|
|
16
16
|
## Chosen shape
|
|
17
17
|
|
|
18
|
-
### One primary tool
|
|
18
|
+
### One primary tool plus optional companion search
|
|
19
19
|
|
|
20
|
-
V1
|
|
20
|
+
V1 exposes one native browser tool:
|
|
21
21
|
|
|
22
22
|
- `agent_browser`
|
|
23
23
|
|
|
24
|
+
It may also expose one optional companion tool:
|
|
25
|
+
|
|
26
|
+
- `agent_browser_web_search`, registered only when an Exa or Brave Search credential source is configured or resolvable and `webSearch.enabled` is not false
|
|
27
|
+
|
|
24
28
|
Why:
|
|
25
|
-
-
|
|
26
|
-
-
|
|
27
|
-
-
|
|
28
|
-
-
|
|
29
|
+
- keeps browser automation centered on `agent_browser`
|
|
30
|
+
- avoids colliding with generic `web_search`
|
|
31
|
+
- keeps live search separate from browser state, screenshots, refs, and session lifecycle
|
|
32
|
+
- supports requested Exa/Brave provider choice without turning search into an `agent_browser` input mode
|
|
33
|
+
- keeps optional search invisible when it cannot run or when the user disables it
|
|
29
34
|
|
|
30
35
|
### Direct subprocess execution
|
|
31
36
|
|
|
@@ -53,6 +58,20 @@ That means:
|
|
|
53
58
|
- no manual user orchestration as the main workflow
|
|
54
59
|
- any future slash commands should be minimal and secondary
|
|
55
60
|
|
|
61
|
+
### Package-owned config
|
|
62
|
+
|
|
63
|
+
Pi docs use `settings.json` for package/resource loading and filtering, not arbitrary extension secrets. For user-tunable package behavior, this package owns Pi-scoped config files instead:
|
|
64
|
+
|
|
65
|
+
- global: `~/.pi/config/pi-agent-browser-native/config.json`
|
|
66
|
+
- project-local: `.pi/config/pi-agent-browser-native/config.json`
|
|
67
|
+
- explicit override: `PI_AGENT_BROWSER_CONFIG=/path/to/config.json`
|
|
68
|
+
|
|
69
|
+
Config layers merge in that order: global, project, override. The shared policy module (`extensions/agent-browser/lib/config-policy.js`) owns provider descriptors, environment variable names, config keys, project-local credential safety, layer validation/merge, redacted status projection, and credential summaries for both runtime config loading and the `pi-agent-browser-config` CLI. The config reader accepts v1 fields for `webSearch.enabled`, `webSearch.preferredProvider`, `webSearch.exaApiKey`, `webSearch.braveApiKey`, and conservative browser defaults such as `browser.defaultProfile` and `browser.executablePath`. Web-search key fields follow Pi model/provider-style value resolution for trusted global/override config: literal values, `$ENV_VAR` / `${ENV_VAR}` interpolation, escapes (`$$`, `$!`), and leading `!command` resolved at request time. Project-local plaintext, interpolation-literal, malformed, and command-backed web-search keys are rejected because project config can be copied, committed, or supplied by a repository; project config may use only the matching provider env ref (`$EXA_API_KEY` / `${EXA_API_KEY}` for Exa, `$BRAVE_API_KEY` / `${BRAVE_API_KEY}` for Brave), so a repository cannot redirect search credentials to arbitrary host env vars. `EXA_API_KEY` and `BRAVE_API_KEY` remain environment fallbacks when no config credential source exists for that provider. Browser default values keep their source scope; profile/executable prompt guidance is emitted only from trusted global or explicit override config, not from project-local config that could steer host profile or executable choices.
|
|
70
|
+
|
|
71
|
+
`agent_browser_web_search` registration is conditional. `webSearch.enabled: false` disables registration even when environment keys are present, but it is evaluated after config merge. A global disable is the normal user default and can still be overridden by project config or `PI_AGENT_BROWSER_CONFIG`; a project disable applies to one repo; an explicit `PI_AGENT_BROWSER_CONFIG` file with `webSearch.enabled: false` is the highest-priority hard disable for that run. Literal and env-backed sources must resolve at startup; command-backed sources are considered configured without running the command until tool execution, so secret managers do not slow startup or prompt unexpectedly. The tool resolves the selected key lazily, chooses Exa or Brave from available credentials (preferring Exa by default unless `webSearch.preferredProvider` says otherwise), then follows one provider-agnostic execution path through provider adapters for request building, HTTP JSON fetch, response normalization, and provider-specific detail fields. It calls Exa `/search` with highlights or Brave Search and returns compact result details without exposing keys.
|
|
72
|
+
|
|
73
|
+
Browser default config is intentionally conservative. It can add prompt guidance for signed-in/account-specific tasks and alternate Chromium-compatible executables, but current releases do not auto-inject `--profile` or `--executable-path` into every launch. Project-local browser config is status-only for launch guidance unless the profile policy is `explicit-only`; executable guidance must come from global or explicit override config. Automatic launch-default mutation would affect privacy, browser state, and host executable choice, so it needs a separate explicit design and test pass.
|
|
74
|
+
|
|
56
75
|
### Prompt guidance budget
|
|
57
76
|
|
|
58
77
|
Runtime `promptGuidelines` are a Tier A budget, not a full manual. They stay short enough to load on every `agent_browser`-aware turn and carry only high-impact rules: input-mode choice, the open → snapshot → ref loop, launch-scoped session handling, artifact verification, structured `nextActions`, extraction basics, and hard safety boundaries such as “stop before order/post/purchase/submit.”
|
|
@@ -145,7 +164,7 @@ This is primarily about ownership clarity and avoiding surprise, not adding a he
|
|
|
145
164
|
`agent-browser` startup flags are sticky once a session is already running.
|
|
146
165
|
The extension should surface that clearly and avoid hidden restart behavior in v1.
|
|
147
166
|
|
|
148
|
-
That means explicit startup-scoping flags like `--
|
|
167
|
+
That means explicit startup-scoping flags like `--auto-connect`, `--cdp`, `--enable`, `--executable-path`, `--init-script`, `--device`, `--profile`, `--provider`, `-p`, `--session-name`, and `--state` should remain explicit upstream choices instead of being wrapped in extra hidden restart or cloning logic.
|
|
149
168
|
|
|
150
169
|
The wrapper may still apply narrow compatibility normalizations when observed behavior justifies them and the result remains thin, local, and opt-out. For example, if a specific site starts rejecting the default local headless Chrome user agent while the same flow works with a normal Chrome UA, the extension may inject a domain-specific fallback UA only when the caller did not already choose `--user-agent`, `--headed`, `--cdp`, `--auto-connect`, or a provider-backed launch.
|
|
151
170
|
|
|
@@ -153,7 +172,7 @@ If the implicit session is already active and one of those startup-scoped flags
|
|
|
153
172
|
|
|
154
173
|
That failure should include a structured recovery hint pointing to `sessionMode: "fresh"` as the first-line fix, while still allowing an explicit `--session` when the caller wants to name the new upstream session.
|
|
155
174
|
|
|
156
|
-
Implementation detail lives in `extensions/agent-browser/lib/argv-descriptor.ts` and `extensions/agent-browser/lib/argv-grammar.ts` (command discovery, `VALUE_FLAGS`, `parseArgvDescriptor`) plus `extensions/agent-browser/lib/runtime.ts` (`getStartupScopedFlags`, `buildExecutionPlan`):
|
|
175
|
+
Implementation detail lives in `extensions/agent-browser/lib/launch-scoped-flags.ts` (canonical flag metadata shared with playbook/docs assertions), `extensions/agent-browser/lib/argv-descriptor.ts` and `extensions/agent-browser/lib/argv-grammar.ts` (command discovery, `VALUE_FLAGS`, `parseArgvDescriptor`) plus `extensions/agent-browser/lib/runtime.ts` (`getStartupScopedFlags`, `buildExecutionPlan`):
|
|
157
176
|
|
|
158
177
|
- **Command discovery:** Leading argv is scanned with a value-taking allowlist so known global flags and documented command flags consume their values before the upstream command word is identified. Missing-value prevalidation is intentionally limited to upstream global value flags; command-scoped flags and literal text are left to upstream parsing so values like `fill #field --password` are not rejected by wrapper heuristics before the CLI sees them. When upstream adds new global flags that take values ahead of the command, extend both the command-discovery and prevalidation allowlists; when it adds command-specific flags, extend only command discovery/redaction as needed. A smaller set of global boolean flags may be followed by an optional `true`/`false` literal; when present, that literal is consumed as the flag value before command discovery continues.
|
|
159
178
|
- **`--state` disambiguation:** Persisted browser `--state` before the command participates in launch-scoped validation and tab-correction hints. The same flag spelling after a `wait` command is excluded from startup-scoped detection so upstream help examples such as `wait @ref --state hidden` do not spuriously require `sessionMode: "fresh"` while an implicit session is active. As of upstream `agent-browser 0.27.1`, the parser does not implement those `wait --state` examples as distinct wait modes, so agent-facing docs recommend `wait --fn` predicates for disappearance checks instead.
|
|
@@ -71,7 +71,7 @@ Tool parameters (use exactly one of `args`, `semanticAction`, `job`, `qa`, `sour
|
|
|
71
71
|
- `stdin`: only for `batch`, `eval --stdin`, and `auth save --password-stdin`; other command/stdin combinations are rejected before `agent-browser` is launched. `job`, `qa`, `sourceLookup`, `networkSourceLookup`, and `electron` generate or manage their own input.
|
|
72
72
|
- `sessionMode`:
|
|
73
73
|
- `"auto"` reuses the extension-managed session when possible.
|
|
74
|
-
- `"fresh"` rotates that managed session to a fresh upstream launch so launch-scoped flags
|
|
74
|
+
- `"fresh"` rotates that managed session to a fresh upstream launch so launch-scoped flags (`--auto-connect`, `--cdp`, `--enable`, `--executable-path`, `--init-script`, `--device`, `--profile`, `--provider`, `-p`, `--session-name`, `--state`) apply.
|
|
75
75
|
- If a fresh launch fails or times out, read `details.managedSessionOutcome` for `preserved` vs `abandoned` (and related fields). A model-visible `Managed session outcome: …` line is appended only for failing calls that used `sessionMode: "fresh"`; `"auto"` failures can still populate the struct without that extra line. If you explicitly close the current wrapper-managed session with `--session <name> close`, later default auto calls rotate to a new wrapper-generated session instead of reusing the closed name; repeated closes and branch restores keep those generated names monotonic.
|
|
76
76
|
|
|
77
77
|
### Debug, diff, stream, dashboard, and chat families
|
|
@@ -374,15 +374,24 @@ Browser close commands (`close`, `quit`, or `exit`) are also not file cleanup. I
|
|
|
374
374
|
|
|
375
375
|
Oversized snapshots and oversized generic outputs are different: when a persisted pi session is available, their wrapper-managed spill files are stored under the private session artifact directory and are governed by the byte budget `PI_AGENT_BROWSER_SESSION_ARTIFACT_MAX_BYTES` (default 32 MiB). Raise that byte budget as well for long QA sessions that need many full raw snapshots or large text spills to survive reload/resume.
|
|
376
376
|
|
|
377
|
-
### Switch from an already-active implicit session to a fresh profiled launch
|
|
377
|
+
### Switch from an already-active implicit session to a fresh profiled or alternate-browser launch
|
|
378
378
|
|
|
379
379
|
```json
|
|
380
380
|
{
|
|
381
|
-
"args": ["--profile", "
|
|
381
|
+
"args": ["--profile", "Profile 1", "open", "https://mail.google.com"],
|
|
382
382
|
"sessionMode": "fresh"
|
|
383
383
|
}
|
|
384
384
|
```
|
|
385
385
|
|
|
386
|
+
```json
|
|
387
|
+
{
|
|
388
|
+
"args": ["--executable-path", "/Applications/Brave Browser.app/Contents/MacOS/Brave Browser", "open", "https://mail.google.com"],
|
|
389
|
+
"sessionMode": "fresh"
|
|
390
|
+
}
|
|
391
|
+
```
|
|
392
|
+
|
|
393
|
+
`profiles` lists Chrome profile directory names from Chrome's user data directory; `Default` is common but not guaranteed. When profile resolution fails, use `agent_browser` diagnostics first: run `{ "args": ["profiles"] }` and `{ "args": ["doctor"] }`, then tell the user which profile name/path or browser executable setting to configure before retrying. For non-Chrome Chromium browsers, pass `--executable-path <path>` to the browser binary and use a full profile/user-data directory path only when upstream accepts that path.
|
|
394
|
+
|
|
386
395
|
### Recover tabs when focus lands somewhere unexpected
|
|
387
396
|
|
|
388
397
|
```json
|
|
@@ -663,11 +672,80 @@ Long-running or lifecycle commands should be explicitly paired with cleanup call
|
|
|
663
672
|
|
|
664
673
|
When these commands are invoked through the native `agent_browser` tool, structured diagnostic/status outputs are rendered as compact summaries. Local inspection/setup calls (`auth save/list/show/delete/remove`, `dashboard start/stop`, `device list`, `doctor`, `install`, `upgrade`, `profiles`, `session list`, `state list/show/rename`, `state clean --older-than <days>`, `state clear --all`, `state clear -a`, and `state clear <session-name>`) are sessionless unless you explicitly pass `--session`; context-dependent calls such as root `session`, untargeted `state clear`, `auth login`, `chat`, and `state save/load` keep normal session behavior. List-like outputs such as sessions, Chrome profiles, auth profiles, network requests, console messages, and page errors include counts and key fields; large outputs are previewed with a `Full output path:` spill file instead of dumping the entire payload into context. For `network requests`, the wrapper shows a failed-request summary split into actionable versus benign low-impact rows, then status, method, URL, resource/mime type, request id, and, when the installed upstream output includes body-like fields, bounded redacted payload, response, and failure/error snippets. Safe request IDs also produce `details.nextActions` for exact request details, actionable failed-request source lookup candidates, filtered request lists, or starting HAR capture before a repro. `network request <requestId>` can expose upstream full-detail body fields such as response bodies using the same bounded model-facing preview; its request URL stays diagnostic-only and does not overwrite `details.sessionTabTarget` for later ref guards. Header, cookie, auth, token, and other secret-like fields are not expanded in model-facing text or `details.data`; command echoes also redact `--body`, `--headers`, `--password`, proxy credentials, auth-bearing URLs, cookie/storage values, and bearer/basic credential text in positional arguments. Use upstream HAR or full raw details only when complete data is required.
|
|
665
674
|
|
|
675
|
+
## Optional package config and companion web search
|
|
676
|
+
|
|
677
|
+
`pi-agent-browser-native` has package-owned config under Pi-scoped paths. This is separate from upstream `agent-browser` config and from Pi package settings:
|
|
678
|
+
|
|
679
|
+
- global: `~/.pi/config/pi-agent-browser-native/config.json`
|
|
680
|
+
- project-local: `.pi/config/pi-agent-browser-native/config.json`
|
|
681
|
+
- explicit override: `PI_AGENT_BROWSER_CONFIG=/path/to/config.json`
|
|
682
|
+
|
|
683
|
+
Get an Exa API key from the [Exa dashboard](https://dashboard.exa.ai/api-keys) or a Brave Search API key from the [Brave Search API dashboard](https://api-dashboard.search.brave.com/). If both keys are available, `agent_browser_web_search` prefers Exa by default because its `/search` endpoint returns token-efficient highlights and agent-oriented search modes; set `webSearch.preferredProvider` to `"brave"` when Brave Search is preferred. You can also disable this package's search tool with `webSearch.enabled: false` when another search tool should win. Config merges global → project → `PI_AGENT_BROWSER_CONFIG` override, so `enabled` is read from the final merged config: a global disable can be re-enabled by project or override config, while an override file with `enabled: false` is the highest-priority hard disable for that run.
|
|
684
|
+
|
|
685
|
+
`pi install npm:pi-agent-browser-native` loads the extension, but it does **not** usually put the package helper on your shell `PATH`. The clearest setup is to write the config file directly and keep actual keys in the environment that launches `pi`:
|
|
686
|
+
|
|
687
|
+
```bash
|
|
688
|
+
mkdir -p ~/.pi/config/pi-agent-browser-native
|
|
689
|
+
cat > ~/.pi/config/pi-agent-browser-native/config.json <<'JSON'
|
|
690
|
+
{
|
|
691
|
+
"version": 1,
|
|
692
|
+
"webSearch": {
|
|
693
|
+
"enabled": true,
|
|
694
|
+
"preferredProvider": "exa",
|
|
695
|
+
"exaApiKey": "$EXA_API_KEY",
|
|
696
|
+
"braveApiKey": "$BRAVE_API_KEY"
|
|
697
|
+
}
|
|
698
|
+
}
|
|
699
|
+
JSON
|
|
700
|
+
```
|
|
701
|
+
|
|
702
|
+
If you prefer the helper, run it through npm unless you know `pi-agent-browser-config` is already on your `PATH`:
|
|
703
|
+
|
|
704
|
+
```bash
|
|
705
|
+
npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config paths
|
|
706
|
+
npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config show
|
|
707
|
+
npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config web-search set-env EXA_API_KEY --global
|
|
708
|
+
npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config web-search set-env BRAVE_API_KEY --global
|
|
709
|
+
npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config web-search set-env EXA_API_KEY --project
|
|
710
|
+
npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config web-search prefer brave --global
|
|
711
|
+
npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config web-search disable --global
|
|
712
|
+
npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config web-search disable --project
|
|
713
|
+
npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config web-search set-command "op read 'op://Private/Brave Search/API Key'" --provider brave --global
|
|
714
|
+
printf '%s' "$EXA_API_KEY" | npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config web-search set-key --provider exa --stdin
|
|
715
|
+
npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config browser profile set "Profile 1" --policy authenticated-only
|
|
716
|
+
npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config browser executable set "/Applications/Brave Browser.app/Contents/MacOS/Brave Browser"
|
|
717
|
+
```
|
|
718
|
+
|
|
719
|
+
The optional `agent_browser_web_search` tool is registered only when Exa or Brave credentials are available and the final merged config has not set `webSearch.enabled` to `false`. It is a separate custom tool, not an `agent_browser` input mode, and does not launch a browser. Use it when current/live external web information would help; use `agent_browser` for browser interaction, screenshots, authenticated/profile pages, and DOM inspection. Disable scope is explicit: `web-search disable --global` sets the normal user default, `web-search disable --project` disables it for one repo, and a `PI_AGENT_BROWSER_CONFIG` override containing `{ "version": 1, "webSearch": { "enabled": false } }` wins over both for a hard per-run disable. Project-local plaintext, custom env aliases, interpolation-literal, malformed, and command-backed web-search keys are refused; project config may only use the matching provider env refs (`$EXA_API_KEY` / `${EXA_API_KEY}` for Exa and `$BRAVE_API_KEY` / `${BRAVE_API_KEY}` for Brave). `web-search set-key`, `set-command`, and `clear` require `--provider`; `set-env` infers Exa/Brave from `EXA_API_KEY` or `BRAVE_API_KEY` unless you pass `--provider`. For Exa, the tool defaults to `searchType: "auto"` with `contents.highlights: true`; use `fast`, `instant`, `deep-lite`, `deep`, or `deep-reasoning` only when the task needs that latency/depth tradeoff.
|
|
720
|
+
|
|
721
|
+
Example config:
|
|
722
|
+
|
|
723
|
+
```json
|
|
724
|
+
{
|
|
725
|
+
"version": 1,
|
|
726
|
+
"webSearch": {
|
|
727
|
+
"enabled": true,
|
|
728
|
+
"preferredProvider": "exa",
|
|
729
|
+
"exaApiKey": "$EXA_API_KEY",
|
|
730
|
+
"braveApiKey": "$BRAVE_API_KEY"
|
|
731
|
+
},
|
|
732
|
+
"browser": {
|
|
733
|
+
"defaultProfile": {
|
|
734
|
+
"name": "Profile 1",
|
|
735
|
+
"policy": "authenticated-only"
|
|
736
|
+
},
|
|
737
|
+
"executablePath": "/Applications/Brave Browser.app/Contents/MacOS/Brave Browser"
|
|
738
|
+
}
|
|
739
|
+
}
|
|
740
|
+
```
|
|
741
|
+
|
|
742
|
+
Browser default config is conservative: it adds agent guidance for signed-in/account-specific tasks and alternate Chromium-compatible executables; current releases do not auto-inject `--profile` or `--executable-path` for every launch. Configure profile/executable guidance globally or through `PI_AGENT_BROWSER_CONFIG`; project-local browser config is not trusted to steer host executable/profile prompt guidance. Ask the agent to run `profiles` and `doctor` when profile resolution fails, then use the reported Chrome profile directory name, a full profile/user-data directory path if upstream accepts one, or the configured `browser.executablePath` with top-level `sessionMode: "fresh"`.
|
|
743
|
+
|
|
666
744
|
## Important global flags, config, and environment
|
|
667
745
|
|
|
668
746
|
### Authentication and session flags
|
|
669
747
|
|
|
670
|
-
- `--profile <name|path>`: reuse Chrome profile login state or use a persistent custom profile. Environment: `AGENT_BROWSER_PROFILE`.
|
|
748
|
+
- `--profile <name|path>`: reuse Chrome profile login state by directory name from `profiles`, or use a persistent custom profile/profile-directory path when upstream accepts it. Environment: `AGENT_BROWSER_PROFILE`.
|
|
671
749
|
- `--session <name>`: use an isolated session. Environment: `AGENT_BROWSER_SESSION`.
|
|
672
750
|
- `--session-name <name>`: auto-save/restore cookies and local storage by name. Environment: `AGENT_BROWSER_SESSION_NAME`.
|
|
673
751
|
- `--state <path>`: load saved auth state from JSON. Environment: `AGENT_BROWSER_STATE`.
|
|
@@ -678,7 +756,7 @@ When these commands are invoked through the native `agent_browser` tool, structu
|
|
|
678
756
|
|
|
679
757
|
### Browser launch and runtime flags
|
|
680
758
|
|
|
681
|
-
- `--executable-path <path>`: custom browser executable. Environment: `AGENT_BROWSER_EXECUTABLE_PATH`.
|
|
759
|
+
- `--executable-path <path>`: custom Chromium-compatible browser executable, such as Brave/Edge/Arc/Vivaldi when upstream can launch that binary. Environment: `AGENT_BROWSER_EXECUTABLE_PATH`.
|
|
682
760
|
- `--extension <path>`: load browser extensions; repeatable. Environment: `AGENT_BROWSER_EXTENSIONS`.
|
|
683
761
|
- `--args <args>`: browser launch args, comma or newline separated. Environment: `AGENT_BROWSER_ARGS`.
|
|
684
762
|
- `--user-agent <ua>`: custom user agent. Environment: `AGENT_BROWSER_USER_AGENT`.
|
|
@@ -732,7 +810,7 @@ Other useful environment variables include `AGENT_BROWSER_DEFAULT_TIMEOUT`, `AGE
|
|
|
732
810
|
## Wrapper-specific behavior worth knowing
|
|
733
811
|
|
|
734
812
|
- The extension may keep following one implicit managed session across later tool calls.
|
|
735
|
-
- If launch-scoped flags like `--profile`, `--session-name`, `--cdp`, `--state`, `--auto-connect`, `--init-script`, `--enable`, `--provider` / `-p`, or provider device flags like `--device` would be ignored because that implicit session is already active, retry with `sessionMode: "fresh"`.
|
|
813
|
+
- If launch-scoped flags like `--profile`, `--executable-path`, `--session-name`, `--cdp`, `--state`, `--auto-connect`, `--init-script`, `--enable`, `--provider` / `-p`, or provider device flags like `--device` would be ignored because that implicit session is already active, retry with `sessionMode: "fresh"`.
|
|
736
814
|
- If a `sessionMode: "fresh"` call fails (including upstream failure, timeout, missing binary, or **`qa`** reclassification after a nominally successful batch), read `details.managedSessionOutcome` before assuming where the next default call will go: `preserved` means the prior managed session remains current, while `abandoned` means no managed session became current. When the failure reason is not the fresh launch itself—for example `failureCategory: "qa-failure"`—`status`/`summary` may still describe the managed-session transition while `succeeded` on this object matches the final tool outcome.
|
|
737
815
|
<!-- agent-browser-playbook:start wrapper-tab-recovery -->
|
|
738
816
|
<!-- Generated from extensions/agent-browser/lib/playbook.ts. Run `npm run docs -- playbook write` to update. -->
|
package/docs/RELEASE.md
CHANGED
|
@@ -57,7 +57,7 @@ npm run smoke:platform:all
|
|
|
57
57
|
|
|
58
58
|
This mode uses the extension harness and the real `agent-browser` on `PATH` against a deterministic local file fixture, then verifies top-level `qa`, `semanticAction`, constrained `job`, screenshot artifact verification, and session close. Use `npm run verify -- dogfood --keep-artifacts` or `--artifact-dir <path>` only while debugging, then delete retained screenshots. This smoke complements, but does not replace, human-readable interactive transcript evidence.
|
|
59
59
|
|
|
60
|
-
Every release also requires interactive `tmux`-driven Pi dogfood with the native `agent_browser` tool against real sites. For extension-focused release smokes, use `pi --no-extensions --no-skills -e .` from the checkout before publish so auto-loaded dogfood/QA skills cannot replace the bounded smoke workflow; run separate skill-enabled dogfood only when validating skill routing or report-generation behavior. Drive prompts with `tmux send-keys`, exercise at least one simple static site and one real documentation/product site, include the higher-level `qa` or `job`/`batch` surfaces when they changed, close every opened browser session, remove screenshots/temp artifacts, and record the outcome in the release notes or support-matrix evidence. Automated localhost, fake-upstream, and deterministic dogfood gates do not replace this human-readable live-site transcript evidence. When `electron.*` surfaces, attached-session diagnostics, or `qa.attached` changed, add a local Electron pass: `electron.list` → `electron.launch` (expect isolated profile behavior) → `snapshot -i` or `electron.probe` / `qa.attached` → `electron.cleanup` with the returned `launchId`, verifying status/mismatch guidance if you simulate a dead renderer or stale refs. For dense-dashboard stress coverage, use the [public Grafana stress checklist](#public-grafana-stress-checklist) below; it is a maintainer workflow, not bundled product skill or recipe runtime.
|
|
60
|
+
Every release also requires interactive `tmux`-driven Pi dogfood with the native `agent_browser` tool against real sites. For extension-focused release smokes, use `pi --no-extensions --no-skills -e .` from the checkout before publish so auto-loaded dogfood/QA skills cannot replace the bounded smoke workflow; run separate skill-enabled dogfood only when validating skill routing or report-generation behavior. Drive prompts with `tmux send-keys`, exercise at least one simple static site and one real documentation/product site, include the higher-level `qa` or `job`/`batch` surfaces when they changed, close every opened browser session, remove screenshots/temp artifacts, and record the outcome in the release notes or support-matrix evidence. Automated localhost, fake-upstream, and deterministic dogfood gates do not replace this human-readable live-site transcript evidence. When `agent_browser_web_search` or package config changed, add one key-free smoke proving the optional tool is absent without config, one fake/unit-backed smoke in the default suite, and one opt-in live Exa or Brave Search check with a real key while confirming the key does not appear in transcripts, stdout/stderr, config status, PR text, or artifacts. When `electron.*` surfaces, attached-session diagnostics, or `qa.attached` changed, add a local Electron pass: `electron.list` → `electron.launch` (expect isolated profile behavior) → `snapshot -i` or `electron.probe` / `qa.attached` → `electron.cleanup` with the returned `launchId`, verifying status/mismatch guidance if you simulate a dead renderer or stale refs. For dense-dashboard stress coverage, use the [public Grafana stress checklist](#public-grafana-stress-checklist) below; it is a maintainer workflow, not bundled product skill or recipe runtime.
|
|
61
61
|
|
|
62
62
|
When reviewing saved session JSONL after a failed smoke or a `qa` preset that reclassified an upstream-successful batch, expect `agent_browser` tool rows to carry `isError: true` whenever `details.resultCategory` is `failure`. For normal prose output, model-visible text should end with a `Pi tool isError: true` category line; for caller-requested `--json` output, the hook preserves parseable JSON and only patches `isError`. The extension applies that patch on the `tool_result` path so Pi’s transcript matches the wrapper contract ([`TOOL_CONTRACT.md`](TOOL_CONTRACT.md#details)). Preserve a normal Pi session directory for those checks; avoiding `--no-session` keeps this evidence intact ([`AGENTS.md`](../AGENTS.md) preferred validation workflow).
|
|
63
63
|
|
|
@@ -157,7 +157,8 @@ Maintainer constraints for evolving scenarios and version bumps are summarized u
|
|
|
157
157
|
`npm run verify -- package-pi` runs the same package-content checks and additionally confirms that:
|
|
158
158
|
|
|
159
159
|
- the packed package can be loaded through Pi SDK resource loading with the same isolation principle as `pi --no-extensions -e <package-source>`
|
|
160
|
-
-
|
|
160
|
+
- `agent_browser` is registered without requiring optional Brave config
|
|
161
|
+
- any optional companion tools remain governed by their own configuration gates
|
|
161
162
|
- the registered `agent_browser` source resolves inside the extracted packed package path, not the working checkout
|
|
162
163
|
- the packaged `agent_browser` tool can be executed through Pi's loaded native tool definition with a deterministic fake upstream `agent-browser --version` binary
|
|
163
164
|
|
package/docs/SUPPORT_MATRIX.md
CHANGED
|
@@ -30,7 +30,7 @@ When upstream ships a new `agent-browser` or the inventory changes:
|
|
|
30
30
|
- Source of truth: `CAPABILITY_BASELINE.inventorySections` in the same file (stable `id` keys: `skills`, `core-commands`, `state-tabs-frames-dialogs`, `network-storage-artifacts-diagnostics`, `batch-auth-setup-ai`, `options-and-env`).
|
|
31
31
|
- Status: supported for the current wrapper contract after the 2026-05-26 all-command audit.
|
|
32
32
|
- High-priority support gaps: 2026-05-26 audit found sessionless local commands and command-scoped value flags needed sharper wrapper handling; runtime/tests/docs now cover those paths. Remaining upstream-owned caveat: `agent-browser 0.27.1` help mentions `wait <selector> --state hidden`, but source parsing does not implement that distinct wait mode, so wrapper docs steer agents to `wait --fn` predicates.
|
|
33
|
-
- Post-`v0.2.29` review state: commits `eb55320` through `86abbfb` add browser guidance/smoke coverage plus `RQ-0086` click-probe reduction, `RQ-0087` same-snapshot form fill batching, `RQ-0088` current-ref fallback on locator misses, `RQ-0089` direct-upstream click mutation investigation, and `RQ-0090` stop-boundary/artifact-path guidance. Verification gates below were rerun on 2026-05-18 after those tasks landed. Constrained `job` (`RQ-0064`), the lightweight `qa` preset (`RQ-0065`), the experimental `sourceLookup` helper (`RQ-0066`),
|
|
33
|
+
- Post-`v0.2.29` review state: commits `eb55320` through `86abbfb` add browser guidance/smoke coverage plus `RQ-0086` click-probe reduction, `RQ-0087` same-snapshot form fill batching, `RQ-0088` current-ref fallback on locator misses, `RQ-0089` direct-upstream click mutation investigation, and `RQ-0090` stop-boundary/artifact-path guidance. Verification gates below were rerun on 2026-05-18 after those tasks landed. Constrained `job` (`RQ-0064`), the lightweight `qa` preset (`RQ-0065`), the experimental `sourceLookup` helper (`RQ-0066`), the experimental `networkSourceLookup` helper (`RQ-0067`), optional Exa/Brave-backed `agent_browser_web_search` with Pi-scoped package config (`RQ-0121`), and agent recovery for search/profile configuration failures (`RQ-0122`) are implemented; see [`TOOL_CONTRACT.md`](TOOL_CONTRACT.md#job), [`TOOL_CONTRACT.md`](TOOL_CONTRACT.md#qa), [`TOOL_CONTRACT.md`](TOOL_CONTRACT.md#sourcelookup), [`TOOL_CONTRACT.md`](TOOL_CONTRACT.md#networksourcelookup), and [`TOOL_CONTRACT.md`](TOOL_CONTRACT.md#optional-companion-web-search). Reusable browser recipes (`RQ-0068`) are intentionally not adopted as a runtime surface; see [`ARCHITECTURE.md`](ARCHITECTURE.md#no-reusable-recipe-layer-yet).
|
|
34
34
|
|
|
35
35
|
## Open UX/reliability follow-ups from 2026-05-29 agent feedback
|
|
36
36
|
|
|
@@ -58,15 +58,15 @@ Re-run the gates below before each release; this table records what the closure
|
|
|
58
58
|
|
|
59
59
|
| Gate | Evidence | Status |
|
|
60
60
|
| --- | --- | --- |
|
|
61
|
-
| Default local gate | `npm run verify` checks generated playbook drift, `tsc --noEmit`, unit/fake tests, generated command-reference blocks, and live command-reference sampling. | Pass on 2026-06-
|
|
62
|
-
| Real upstream contract | `npm run verify -- real-upstream` runs the localhost fixture matrix against the real installed `agent-browser` matching the baseline. | Pass on 2026-06-
|
|
63
|
-
| Packaged Pi smoke | `npm run verify -- package-pi` validates package contents, loads
|
|
64
|
-
| Deterministic dogfood smoke | `npm run verify -- dogfood` (`scripts/verify-agent-browser-dogfood.ts`) drives the native wrapper against a local file fixture through top-level `qa`, `semanticAction`, constrained `job`, screenshot artifact verification, and session close with the real `agent-browser` on `PATH`. | Pass on 2026-06-
|
|
61
|
+
| Default local gate | `npm run verify` checks generated playbook drift, `tsc --noEmit`, unit/fake tests, generated command-reference blocks, and live command-reference sampling. | Pass on 2026-06-03 as part of `npm run verify -- release` (`agent-browser 0.27.1` on `PATH`). |
|
|
62
|
+
| Real upstream contract | `npm run verify -- real-upstream` runs the localhost fixture matrix against the real installed `agent-browser` matching the baseline. | Pass on 2026-06-03 (`npm run verify -- real-upstream`, `agent-browser 0.27.1` on `PATH`; updated Web Vitals shape assertions for upstream 0.27.1 structured output). |
|
|
63
|
+
| Packaged Pi smoke | `npm run verify -- package-pi` validates package contents, loads the packaged `agent_browser` tool without requiring optional Brave config, and executes fake-upstream `--version`. | Pass on 2026-06-03 as part of `npm run verify -- release` (`npm run verify -- package-pi` slice). |
|
|
64
|
+
| Deterministic dogfood smoke | `npm run verify -- dogfood` (`scripts/verify-agent-browser-dogfood.ts`) drives the native wrapper against a local file fixture through top-level `qa`, `semanticAction`, constrained `job`, screenshot artifact verification, and session close with the real `agent-browser` on `PATH`. | Pass on 2026-06-03 (`npm run verify -- dogfood`, `agent-browser 0.27.1`; artifacts cleaned by the harness). |
|
|
65
65
|
| Efficiency benchmark | `npm run verify -- benchmark` runs deterministic browser workflow accounting plus focused benchmark tests, including JSONL sampling fixtures and job/qa/sourceLookup/networkSourceLookup/Electron scenario coverage. | Pass on 2026-05-29 (`npm run verify -- benchmark`). |
|
|
66
|
-
| Crabbox platform smoke | `npm run check:platform-smoke` syntax-checks the harness and cheap invariants. `npm run smoke:platform:all` runs doctor first, then fast target-local `platform-build` (`npm run verify -- platform-target`, pack, clean Pi install) plus `browser-dogfood-smoke` on Crabbox `macos`, `ubuntu`, and `windows-native`; see [`platform-smoke.md`](platform-smoke.md). | Pass on 2026-06-
|
|
67
|
-
| `verify -- release` / `prepublishOnly` | `npm run verify -- release` chains the default gate with packaged Pi smoke and the release-blocking Crabbox platform matrix (`verifySteps` `release` in [`scripts/project.mjs`](../scripts/project.mjs)). `package.json` `prepublishOnly` runs that compose before `npm pack --dry-run` during `npm publish`. It intentionally omits standalone lifecycle, real-upstream, host-only dogfood, and benchmark modes—see [`RELEASE.md`](RELEASE.md#pre-release-checks). | Pass on 2026-06-
|
|
68
|
-
| Configured-source lifecycle | `npm run verify -- lifecycle` (`scripts/verify-lifecycle.mjs`) drives `/reload`, closes and relaunches Pi with the same exact `--session-id`, checks the JSONL session header id, session continuity, slash-command sentinel tokens (`v1` then `v2` after rewriting the packaged extension to simulate pickup), persisted spill reachability, and real Pi `tool_result` failure-patch semantics for a QA reclassification with a fake upstream on `PATH`. Default Pi model is `zai/glm-5.1`; default per-step wait is **180000 ms** (`DEFAULT_TIMEOUT_MS`); override model with `--model <id>` and waits with `--timeout-ms <ms>`. Passthrough flags in [`scripts/project.mjs`](../scripts/project.mjs): `--keep-artifacts`, `--model`, `--verbose`, and `--timeout-ms` plus a value (for example `npm run verify -- lifecycle --model openai-codex/gpt-5.5:minimal --keep-artifacts --verbose --timeout-ms 600000`). | Pass on 2026-
|
|
69
|
-
| Quick isolated Pi smoke | `pi --no-extensions --no-skills -e . --tools agent_browser` from repo root; native `agent_browser` only. | Last interactive tmux checkout smoke pass on 2026-05-29 (`agent-browser 0.27.0` at the time). The 2026-06-
|
|
66
|
+
| Crabbox platform smoke | `npm run check:platform-smoke` syntax-checks the harness and cheap invariants. `npm run smoke:platform:all` runs doctor first, then fast target-local `platform-build` (`npm run verify -- platform-target`, pack, clean Pi install) plus `browser-dogfood-smoke` on Crabbox `macos`, `ubuntu`, and `windows-native`; see [`platform-smoke.md`](platform-smoke.md). | Pass on 2026-06-03 (`npm run check:platform-smoke`, `npm run smoke:platform:ubuntu-image`, and `npm run verify -- release`, whose platform slice ran the macOS/Ubuntu/native-Windows Crabbox matrix; artifacts cleaned after evidence capture). |
|
|
67
|
+
| `verify -- release` / `prepublishOnly` | `npm run verify -- release` chains the default gate with packaged Pi smoke and the release-blocking Crabbox platform matrix (`verifySteps` `release` in [`scripts/project.mjs`](../scripts/project.mjs)). `package.json` `prepublishOnly` runs that compose before `npm pack --dry-run` during `npm publish`. It intentionally omits standalone lifecycle, real-upstream, host-only dogfood, and benchmark modes—see [`RELEASE.md`](RELEASE.md#pre-release-checks). | Pass on 2026-06-03 (`npm run verify -- release`, including macOS/Ubuntu/native-Windows Crabbox matrix). |
|
|
68
|
+
| Configured-source lifecycle | `npm run verify -- lifecycle` (`scripts/verify-lifecycle.mjs`) drives `/reload`, closes and relaunches Pi with the same exact `--session-id`, checks the JSONL session header id, session continuity, slash-command sentinel tokens (`v1` then `v2` after rewriting the packaged extension to simulate pickup), persisted spill reachability, and real Pi `tool_result` failure-patch semantics for a QA reclassification with a fake upstream on `PATH`. Default Pi model is `zai/glm-5.1`; default per-step wait is **180000 ms** (`DEFAULT_TIMEOUT_MS`); override model with `--model <id>` and waits with `--timeout-ms <ms>`. Passthrough flags in [`scripts/project.mjs`](../scripts/project.mjs): `--keep-artifacts`, `--model`, `--verbose`, and `--timeout-ms` plus a value (for example `npm run verify -- lifecycle --model openai-codex/gpt-5.5:minimal --keep-artifacts --verbose --timeout-ms 600000`). | Pass on 2026-06-03 (`npm run verify -- lifecycle`). Treat any future unexplained red lifecycle gate as a release blocker. |
|
|
69
|
+
| Quick isolated Pi smoke | `pi --no-extensions --no-skills -e . --tools agent_browser` from repo root; native `agent_browser` only. | Last interactive tmux checkout smoke pass on 2026-05-29 (`agent-browser 0.27.0` at the time). The 2026-06-03 Crabbox matrix now covers clean packed Pi install plus deterministic wrapper dogfood on all required platforms for `agent-browser 0.27.1`; run a new manual tmux smoke before publish when human-readable transcript evidence is required. Broader historical coverage also includes version/help/skills, open/snapshot/click, eval stdin, batch stdin, screenshot, explicit session, `sessionMode: "fresh"`, network requests, console/errors, diff snapshot, stream status/disable, dashboard start/stop, and chat credential-failure pass-through during RQ-0055. |
|
|
70
70
|
|
|
71
71
|
## Baseline checklist by inventory section
|
|
72
72
|
|
|
@@ -77,11 +77,15 @@ Re-run the gates below before each release; this table records what the closure
|
|
|
77
77
|
| Sessions, state, tabs, frames, dialogs, and windows | 20 canonical tokens from baseline section `state-tabs-frames-dialogs`; see [`scripts/agent-browser-capability-baseline.mjs`](../scripts/agent-browser-capability-baseline.mjs) and generated [`COMMAND_REFERENCE.md`](COMMAND_REFERENCE.md#session-state-frames-dialogs-windows-and-inspection-commands). | [`COMMAND_REFERENCE.md`](COMMAND_REFERENCE.md#session-state-frames-dialogs-windows-and-inspection-commands), stateful workflow notes, [`TOOL_CONTRACT.md`](TOOL_CONTRACT.md#details). | Stateful summaries/redaction, state artifact handling, sessionless local command planning, managed-session restore, tab target pinning, and close alias cleanup. | Extension-validation stateful matrix, runtime session/resume tests, presentation redaction tests, lifecycle harness. | Supported. External profile/auth state remains operator-owned. |
|
|
78
78
|
| Network, storage, artifacts, diagnostics, and performance | 42 canonical tokens from baseline section `network-storage-artifacts-diagnostics`; see [`scripts/agent-browser-capability-baseline.mjs`](../scripts/agent-browser-capability-baseline.mjs) and generated [`COMMAND_REFERENCE.md`](COMMAND_REFERENCE.md#page-state-finding-mouse-settings-network-and-storage). | [`COMMAND_REFERENCE.md`](COMMAND_REFERENCE.md#page-state-finding-mouse-settings-network-and-storage), diagnostic sections, [`TOOL_CONTRACT.md`](TOOL_CONTRACT.md#details). | Thin passthrough plus compact diagnostics, artifact metadata, missing-ffmpeg warnings, sensitive-data redaction, timeout bounds, and cleanup-pair guidance. | Fake non-core matrix and safe real-upstream coverage for network/HAR, diff, trace/profiler, console/errors/highlight, stream, vitals, and React missing-renderer. | Supported. Environment-sensitive operations need suitable local/browser state. |
|
|
79
79
|
| Batch, auth, confirmations, setup, dashboard, devices, and AI commands | 24 canonical tokens from baseline section `batch-auth-setup-ai`; see [`scripts/agent-browser-capability-baseline.mjs`](../scripts/agent-browser-capability-baseline.mjs) and generated [`COMMAND_REFERENCE.md`](COMMAND_REFERENCE.md#batch-auth-confirmations-sessions-chat-dashboard-devices-and-setup). | [`COMMAND_REFERENCE.md`](COMMAND_REFERENCE.md#batch-auth-confirmations-sessions-chat-dashboard-devices-and-setup), README security notes, release docs. | Native-tool batch stdin, generated `job`/`qa`/lookup batch plans, auth/confirmation redaction, sessionless local auth/setup/dashboard/doctor planning, timeout/cleanup guidance. | Unit/fake batch/auth/confirmation/dashboard/chat/doctor tests; extension-validation for structured input modes; efficiency benchmark scenarios. | Supported. Interactive side-effecting setup/auth/chat remains upstream-owned. |
|
|
80
|
-
| Global flags, config, providers, policy, and environment | 117 canonical tokens from baseline section `options-and-env`; see [`scripts/agent-browser-capability-baseline.mjs`](../scripts/agent-browser-capability-baseline.mjs) and generated [`COMMAND_REFERENCE.md`](COMMAND_REFERENCE.md#important-global-flags-config-and-environment). | [`COMMAND_REFERENCE.md`](COMMAND_REFERENCE.md#important-global-flags-config-and-environment), README provider/setup notes, [`TOOL_CONTRACT.md`](TOOL_CONTRACT.md#sessionmode), architecture/runtime docs. | Runtime handles command discovery, value-flag prevalidation, launch-scoped flags, redacted echoes, fresh-session recovery hints, explicit sessions, provider/device launch-scoping, curated env forwarding, and
|
|
80
|
+
| Global flags, config, providers, policy, and environment | 117 canonical tokens from baseline section `options-and-env`; see [`scripts/agent-browser-capability-baseline.mjs`](../scripts/agent-browser-capability-baseline.mjs) and generated [`COMMAND_REFERENCE.md`](COMMAND_REFERENCE.md#important-global-flags-config-and-environment). | [`COMMAND_REFERENCE.md`](COMMAND_REFERENCE.md#important-global-flags-config-and-environment), README provider/setup notes, [`TOOL_CONTRACT.md`](TOOL_CONTRACT.md#sessionmode), architecture/runtime docs. | Runtime handles command discovery, value-flag prevalidation, launch-scoped flags, redacted echoes, fresh-session recovery hints, explicit sessions, provider/device launch-scoping, curated env forwarding, subprocess completion, and package-owned Pi-scoped config for optional companion features. | Runtime tests for flags/planning/redaction/session behavior; process tests for env and stdio-linger completion; config/web-search/CLI tests; fake provider/specialized-skill matrix; package doctor. | Supported. Provider clouds, iOS/Appium, proxies, profiles, and credentials require external setup. |
|
|
81
81
|
|
|
82
82
|
## Follow-up decision after closure
|
|
83
83
|
|
|
84
|
-
Native `job`, `qa`, experimental `sourceLookup`, experimental `networkSourceLookup`,
|
|
84
|
+
Native `job`, `qa`, experimental `sourceLookup`, experimental `networkSourceLookup`, first-class Electron lifecycle/probe support, and optional Exa/Brave-backed companion web search are shipped.
|
|
85
|
+
|
|
86
|
+
`RQ-0121` adds Pi-scoped package config plus optional Exa/Brave web search without turning search into an `agent_browser` input mode. Config lives at `~/.pi/config/pi-agent-browser-native/config.json`, `.pi/config/pi-agent-browser-native/config.json`, or an explicit `PI_AGENT_BROWSER_CONFIG` override, with global → project → override merge order and `EXA_API_KEY` / `BRAVE_API_KEY` as fallbacks only when no config credential source exists for that provider. `webSearch.exaApiKey` and `webSearch.braveApiKey` support Pi model/provider-style literal, `$ENV_VAR` / `${ENV_VAR}`, escape, and `!command` values in trusted global/override config; project-local config rejects plaintext, custom env aliases, interpolation-literal, malformed, and command-backed keys and allows only matching provider env refs. `webSearch.enabled: false` disables the tool even when environment keys exist after the final config merge: global disable is a user default, project disable is repo-scoped, and `PI_AGENT_BROWSER_CONFIG` disable wins for a hard per-run off switch. `webSearch.preferredProvider` chooses the default provider when both credentials resolve; Exa is the default because `/search` with highlights is the more agent-oriented result shape. `agent_browser_web_search` registers only when a usable credential source is available, resolves command secrets lazily at execution, calls Exa `/search` with highlights or Brave Search, returns compact normalized result details, and never exposes the key. Browser default profile/executable config records conservative prompt guidance only from trusted global or explicit override config; project-local browser config is not trusted to steer host profile/executable prompt guidance, and no config auto-injects launch args. Contract: [`TOOL_CONTRACT.md`](TOOL_CONTRACT.md#optional-companion-web-search); human workflow: README optional package config and [`COMMAND_REFERENCE.md`](COMMAND_REFERENCE.md#optional-package-config-and-companion-web-search); implementation: `extensions/agent-browser/lib/config.ts`, `extensions/agent-browser/lib/web-search.ts`, `scripts/config.mjs`, and conditional registration in `extensions/agent-browser/index.ts`; fake coverage: `test/agent-browser.config.test.ts`, `test/agent-browser.web-search.test.ts`, and `test/agent-browser.config-cli.test.ts`.
|
|
87
|
+
|
|
88
|
+
`RQ-0122` tracks the private shared-session finding reported by @deepakness: agents should not fan out web searches until provider rate limits are hit, and browser/profile config failures should lead to diagnostics plus user-facing setup recommendations instead of repeated failing opens. The wrapper now serializes companion web-search calls with a small request gate, adds agent-visible 429 guidance, rejects `--session-mode` inside native `args` in favor of top-level `sessionMode`, surfaces profile/user-data-dir failures with `inspect-browser-profiles` and `run-agent-browser-doctor` next actions, treats `--executable-path` as launch-scoped for active implicit sessions, and adds conservative config/prompt/docs support for non-default Chromium-compatible browser executables. Contract: [`TOOL_CONTRACT.md`](TOOL_CONTRACT.md#details) and [`TOOL_CONTRACT.md`](TOOL_CONTRACT.md#sessionmode); human workflow: README authenticated/profile workflows and [`COMMAND_REFERENCE.md`](COMMAND_REFERENCE.md#switch-from-an-already-active-implicit-session-to-a-fresh-profiled-or-alternate-browser-launch); implementation: `extensions/agent-browser/lib/web-search.ts`, `extensions/agent-browser/lib/config-policy.js`, `extensions/agent-browser/lib/results/presentation/browser-profile-recovery.ts`, `extensions/agent-browser/lib/results/presentation/errors.ts`, `extensions/agent-browser/lib/launch-scoped-flags.ts`, `extensions/agent-browser/lib/runtime.ts`, `extensions/agent-browser/lib/config.ts`, `scripts/config.mjs`, and `extensions/agent-browser/lib/playbook.ts`; fake coverage: `test/agent-browser.web-search.test.ts`, `test/agent-browser.presentation-skills-recovery.test.ts`, `test/agent-browser.results.test.ts`, `test/agent-browser.runtime.test.ts`, `test/agent-browser.config.test.ts`, `test/agent-browser.config-cli.test.ts`, and prompt-guidance assertions in `test/agent-browser.extension-validation.test.ts`.
|
|
85
89
|
|
|
86
90
|
`RQ-0066` shipped as the bounded evidence model in [`TOOL_CONTRACT.md`](TOOL_CONTRACT.md#sourcelookup): it compiles to upstream `batch` steps (`is visible`, `get html`, `react inspect`, `react tree` as applicable), merges `details.sourceLookup` into the tool `details` alongside batch presentation, and never reclassifies an upstream-successful batch to failed solely because no candidates were found (unlike `qa` diagnostic reclassification). Wrapper-tracked packaged Electron no-candidate results now add bounded `workspaceRoot` / `electronContext` when available, limitations that the scan only covers the Pi cwd and does not unpack installed app resources or `app.asar`, and live Electron `snapshot` / `probe` / `tab list` next actions. Fake coverage: `agentBrowserExtension explains packaged Electron sourceLookup no-candidate boundaries` in [`test/agent-browser.extension-validation.test.ts`](../test/agent-browser.extension-validation.test.ts).
|
|
87
91
|
|