pi-agent-browser-native 0.2.38 → 0.2.40

@@ -10,9 +10,10 @@ Related docs:
 
  ## V1 tool
 
- V1 should expose one primary native tool:
+ V1 exposes one primary native browser tool and one optional companion search tool:
 
  - `agent_browser`
+ - `agent_browser_web_search` when a Brave Search credential source is configured or resolvable
 
  ## Why this tool shape
 
@@ -33,6 +34,49 @@ The native command reference in `docs/COMMAND_REFERENCE.md` is driven by the sam
 
  Agent-facing efficiency claims are measured with `npm run benchmark:agent-browser` or `npm run verify -- benchmark`. The benchmark is deterministic and does not launch a browser; it tracks representative workflow success, tool calls, model-visible output size, stale-ref failures and recoveries, artifact success, failure-category coverage, and elapsed-time estimates so future abstractions can prove they reduce agent work before replacing raw tool use.
 
+ ## Optional companion web search
+
+ `agent_browser_web_search` is a separate custom tool, not an `agent_browser` input mode. It is registered only when the extension can see a configured Brave Search credential source from `~/.pi/config/pi-agent-browser-native/config.json`, `.pi/config/pi-agent-browser-native/config.json`, `PI_AGENT_BROWSER_CONFIG`, or the `BRAVE_API_KEY` environment fallback. Command credential sources such as `"!op read 'op://Private/Brave Search/API Key'"` are allowed only from trusted global or explicit-override config; they make the tool available without running the command at startup, and the key is resolved when the tool executes. Project-local config may use inert exact `$ENV_VAR` / `${ENV_VAR}` references only; interpolation literals and malformed `$` values are rejected.
+
+ Use it when live/current external web information would help answer a task, find current docs/news, or discover candidate URLs. Use `agent_browser` when the task needs browser interaction, screenshots, authenticated/profile content, page inspection, or DOM work. The search tool is Brave-only today, namespaced to avoid colliding with generic `web_search`, and must not expose the resolved API key in content, details, errors, status output, docs examples, logs, or PR artifacts.
+
+ Schema:
+
+ ```json
+ {
+   "query": "search text",
+   "count": 5,
+   "offset": 0,
+   "country": "US",
+   "searchLang": "en-US",
+   "safesearch": "moderate",
+   "freshness": "pw"
+ }
+ ```
+
+ Result details:
+
+ ```json
+ {
+   "provider": "brave",
+   "query": "search text",
+   "returnedQuery": "search text",
+   "count": 5,
+   "offset": 0,
+   "fetchedAt": "2026-06-02T00:00:00.000Z",
+   "results": [
+     {
+       "title": "Result title",
+       "url": "https://example.com/",
+       "description": "Compact summary",
+       "source": "Example",
+       "age": "1 day ago",
+       "language": "en"
+     }
+   ]
+ }
+ ```
+
  ## Input mode chooser
 
  Use exactly one top-level input per call:
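
Reviewer note on the `Optional companion web search` section added above: the credential-source rules are easier to check against a concrete classifier. The following is a minimal sketch, not the package's implementation. `ConfigOrigin`, `CredentialSource`, and `classifyCredentialSource` are hypothetical names, and the handling of plain literal values in trusted config is an assumption that the documented text does not confirm.

```typescript
// Hypothetical sketch of the documented credential-source rules.
// None of these names come from pi-agent-browser-native itself.
type ConfigOrigin = "global" | "explicit-override" | "project-local";

type CredentialSource =
  | { kind: "command"; command: string }
  | { kind: "env"; name: string }
  | { kind: "literal" }
  | { kind: "rejected"; reason: string };

// Matches exactly `$NAME` or `${NAME}` with nothing before or after it.
const EXACT_ENV_REF = /^\$(?:([A-Za-z_][A-Za-z0-9_]*)|\{([A-Za-z_][A-Za-z0-9_]*)\})$/;

function classifyCredentialSource(value: string, origin: ConfigOrigin): CredentialSource {
  const trusted = origin !== "project-local";

  // `!command` sources are documented as allowed only from trusted config.
  if (value.startsWith("!")) {
    return trusted
      ? { kind: "command", command: value.slice(1).trim() }
      : { kind: "rejected", reason: "command sources require trusted config" };
  }

  // Inert exact env references are the only form project-local config may use.
  const match = EXACT_ENV_REF.exec(value);
  if (match) {
    return { kind: "env", name: match[1] ?? match[2] ?? "" };
  }

  // Everything else from project-local config (interpolation literals such as
  // `prefix-${KEY}`, malformed `$` values, plain strings) is rejected.
  if (!trusted) {
    return { kind: "rejected", reason: "project-local config accepts exact env references only" };
  }

  // Assumption: trusted config may also hold a literal value.
  return { kind: "literal" };
}
```

The classifier never runs the command or reads the environment, which mirrors the documented behavior that a command source makes the tool available at startup while the key itself is resolved only when the tool executes.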
@@ -88,7 +132,7 @@ The extension always plans normal browser commands with `--json` prepended in `e
  - For Electron desktop apps, prefer top-level electron for wrapper-owned discovery, isolated launch, status, compact probe, and cleanup: list first, treat likely-sensitive annotations as hints rather than enforcement, launch with the default snapshot handoff unless handoff: "tabs" is the safer diagnostic starting point, use electron.probe or snapshot -i/qa.attached for current-session state, and always cleanup the returned launchId when done. electron.launch uses an isolated temporary profile; it does not reuse the app's normal signed-in profile or attach to an already-running authenticated app. For signed-in local app state, host-launch the normal app with --remote-debugging-port when appropriate, then use raw args connect <port|url>; after connect, inspect tab list, select the stable tab id such as tab t2, then run a condition wait or snapshot -i before using refs. close commands (`close`, `quit`, or `exit`) only close the browser/CDP session; leave manually launched app shutdown, profile cleanup, and explicit artifacts to the host owner.
  - For provider or specialized app workflows, load version-matched upstream guidance with skills get agentcore|electron|slack|dogfood|vercel-sandbox through the native tool; add --full when you need references/templates, and use skills get --all only for broad skill audits. Provider launches such as -p ios, --provider browserbase/kernel/browseruse/browserless/agentcore, and iOS --device are upstream-owned setup paths; use sessionMode fresh when switching providers and expect external credentials or local Appium/Xcode setup to be required.
  - For dialogs and frames, use dialog status/accept/dismiss and frame <selector|main> through native args; when --confirm-actions produces a pending confirmation, use details.nextActions or exact confirm <id> / deny <id> calls instead of inventing ids.
- - If a session lands on the wrong page or tab, an interaction changes origin unexpectedly, or an open call returns blocked, blank, or otherwise unexpected results, use tab list / tab <tab-id-or-label> / snapshot -i to recover state before retrying different URLs or fallback strategies. For headed demos, put --headed on the first launch with sessionMode=fresh and verify with screenshot/tab/get-url evidence because tool success cannot prove the OS window is visible to the user. For desktop readiness, prefer real conditions first: wait --text, wait --url, wait --fn, wait --load <state>, wait --download, or qa.attached; for disappearance checks in agent-browser 0.27.0, use wait --fn predicates instead of stale upstream-help examples like wait <selector> --state hidden. Use electron.probe/status for wrapper-owned launch health or target mismatch. Fixed waits are a last resort, must stay below the wrapper IPC budget (wait 30000 is intentionally blocked), and a successful payload like "waited":"timeout" means elapsed time only—verify completion with an observed condition, fresh snapshot, or screenshot.
+ - If a session lands on the wrong page or tab, an interaction changes origin unexpectedly, or an open call returns blocked, blank, or otherwise unexpected results, use tab list / tab <tab-id-or-label> / snapshot -i to recover state before retrying different URLs or fallback strategies. For headed demos, put --headed on the first launch with sessionMode=fresh and verify with screenshot/tab/get-url evidence because tool success cannot prove the OS window is visible to the user. For desktop readiness, prefer real conditions first: wait --text, wait --url, wait --fn, wait --load <state>, wait --download, or qa.attached; for disappearance checks in agent-browser 0.27.1, use wait --fn predicates instead of stale upstream-help examples like wait <selector> --state hidden. Use electron.probe/status for wrapper-owned launch health or target mismatch. Fixed waits are a last resort, must stay below the wrapper IPC budget (wait 30000 is intentionally blocked), and a successful payload like "waited":"timeout" means elapsed time only—verify completion with an observed condition, fresh snapshot, or screenshot.
  - For feed, timeline, or inbox reading tasks, focus on the main timeline/list region and read the first item there rather than unrelated composer or sidebar content.
  - For read-only browsing tasks, prefer extracting the answer from the current snapshot, structured ref labels, or eval --stdin on the current page before navigating away. Only click into media viewers, detail routes, or new pages when the current view does not contain the needed information.
  - For downloads, prefer download <selector> <path> when an element click should save a file. Do not rely on click alone when you need the downloaded file on disk.
@@ -0,0 +1,176 @@
+ # Platform smoke testing
+
+ `pi-agent-browser-native` uses a Crabbox-backed local platform smoke gate to prove the package on macOS, Ubuntu Linux, and native Windows before release.
+
+ This is a release-blocking gate. Missing Crabbox setup, Docker, macOS SSH, the native Windows template, upstream `agent-browser`, or browser runtime dependencies is a blocked release setup, not a skipped pass.
+
+ ## Required release gate
+
+ Run the cheap harness checks first, then the full matrix:
+
+ ```sh
+ npm run check:platform-smoke
+ npm run smoke:platform:ubuntu-image
+ npm run smoke:platform:all
+ ```
+
+ `smoke:platform:all` runs `smoke:platform:doctor` before any target suite starts. The canonical `npm run verify -- release` gate also runs the same platform doctor and full `macos,ubuntu,windows-native` matrix after default verification and packaged Pi smoke, so `npm publish` cannot pass `prepublishOnly` without the platform gate.
+
+ Per-target commands are for diagnosis:
+
+ ```sh
+ npm run smoke:platform:macos
+ npm run smoke:platform:ubuntu
+ npm run smoke:platform:windows-native
+ npm run verify -- platform-smoke run --target ubuntu --suite platform-build
+ ```
+
+ ## Targets
+
+ | Target | Crabbox provider | Shell contract | Release status |
+ | --- | --- | --- | --- |
+ | `macos` | `ssh` static localhost | POSIX shell on macOS | Required |
+ | `ubuntu` | `local-container` | POSIX shell in a Docker-compatible local container | Required |
+ | `windows-native` | `parallels` | native Windows PowerShell over OpenSSH | Required |
+
+ ## Required environment
+
+ Install Crabbox on the macOS maintainer host and keep it on `PATH`:
+
+ ```sh
+ brew install openclaw/tap/crabbox
+ crabbox --version
+ crabbox providers
+ ```
+
+ Use `PLATFORM_SMOKE_CRABBOX=/path/to/crabbox` only when testing a non-default Crabbox binary.
+
+ Standard configuration knobs:
+
+ ```sh
+ PLATFORM_SMOKE_MAC_HOST=localhost
+ PLATFORM_SMOKE_MAC_USER="$USER"
+ PLATFORM_SMOKE_MAC_WORK_ROOT="/Users/$USER/crabbox/pi-agent-browser-native"
+
+ # Default local image built by npm run smoke:platform:ubuntu-image.
+ PLATFORM_SMOKE_UBUNTU_IMAGE="pi-agent-browser-native-platform:node24-agent-browser0.27.1"
+
+ PLATFORM_SMOKE_WINDOWS_VM="pi-extension-windows-template"
+ PLATFORM_SMOKE_WINDOWS_SNAPSHOT="crabbox-ready"
+ PLATFORM_SMOKE_WINDOWS_USER="<windows-ssh-user>"
+ PLATFORM_SMOKE_WINDOWS_WORK_ROOT="C:\\crabbox\\pi-agent-browser-native"
+
+ # Optional: names of secret env vars to redact/forward if future live suites need them.
+ PLATFORM_SMOKE_AUTH_ENV=""
+ ```
+
+ The Ubuntu target image is derived from `node:24-bookworm`, installs `agent-browser@0.27.1`, installs Debian Chromium through apt, creates a non-root `circleci` user, and sets `AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium`. Rebuild it after upstream rebaselining, or override `PLATFORM_SMOKE_UBUNTU_IMAGE` with an equivalent prepared local image. Do not install `agent-browser` ad hoc inside the Ubuntu smoke command.
+
+ The configured upstream `agent-browser` baseline is imported from [`scripts/agent-browser-capability-baseline.mjs`](../scripts/agent-browser-capability-baseline.mjs). Target-local browser suites verify that exact `agent-browser` version before running. Bake the exact upstream CLI and browser runtime into the Windows template/snapshot for speed and reproducibility; missing or stale Windows `agent-browser` / browser readiness is a blocked setup, not something the smoke command repairs. The Windows browser suite checks the preinstalled browser cache and prewarms one short local file URL before the extension harness runs.
+
+ ## Target setup expectations
+
+ Crabbox does not install project runtime tools. The macOS host, Ubuntu image, and Windows template must already provide:
+
+ - Node/npm at or above the configured Node major baseline in [`platform-smoke.config.mjs`](../platform-smoke.config.mjs).
+ - Git and `tar`.
+ - Upstream `agent-browser` matching this wrapper’s capability baseline. The Ubuntu target gets it from [`scripts/platform-smoke/linux-image/Dockerfile`](../scripts/platform-smoke/linux-image/Dockerfile); the Windows template gets it from the shared `pi-extension-windows-template` / `crabbox-ready` snapshot.
+ - Browser/runtime dependencies needed by upstream `agent-browser`.
+ - Native PowerShell and OpenSSH Server on Windows.
+
+ For Windows, reuse `pi-extension-windows-template` with the shared canonical `crabbox-ready` power-off snapshot. Do not create one-off project VMs. If a reusable tool is missing, update the shared template, verify from a fresh SSH session, remove caches/secrets/checkouts, shut down cleanly, and promote a known-good power-off snapshot.
+
+ ## What the suites prove
+
+ Each required target runs `platform-build` and `browser-dogfood-smoke` on one Crabbox lease, serially.
+
+ ### `platform-build`
+
+ 1. Verify the target Node major version.
+ 2. Run `npm ci` in the synced checkout.
+ 3. Run `npm run verify -- platform-target`, a fast target-local gate covering generated docs, TypeScript, package/platform harness tests, and runtime planning. The full unit/fake suite still runs once in the host default gate before the release matrix starts; target-local smoke must not duplicate that full suite on every OS. Browser subprocess behavior is then exercised by the target-local `browser-dogfood-smoke` suite against the real upstream binary.
+ 4. Run `npm pack`.
+ 5. Create a clean target-local Pi project.
+ 6. Install the packed tarball with `npm install --no-save`.
+ 7. Run `pi install -l ./node_modules/pi-agent-browser-native` from the clean project.
+ 8. Run `pi list` and assert the package is registered from the packed install.
+ 9. Assert the release proof did not use `pi -e .` or `pi --extension .`.
+
+ ### `browser-dogfood-smoke`
+
+ 1. Run `npm ci` in the synced checkout if needed.
+ 2. Run the deterministic model-free browser smoke through `scripts/verify-agent-browser-dogfood.ts`.
+ 3. Exercise native wrapper surfaces against the deterministic local file fixture from `scripts/verify-agent-browser-dogfood.ts`: top-level `qa`, `semanticAction`, constrained `job`, screenshot artifact verification, and session close.
+ 4. Persist the dogfood JSON report and stdout/stderr evidence.
+ 5. Fail on missing browser artifacts, failed tool calls, leaked secrets, or unclosed sessions.
+
+ The dogfood suite intentionally uses the checkout harness while `platform-build` proves packed Pi installation. Together they catch OS-specific packaging, install, path, process, browser, and wrapper bugs without using an LLM.
+
+ ## Artifact contract
+
+ Every target suite writes host-side evidence under:
+
+ ```text
+ .artifacts/platform-smoke/<run-id>/<target>/<suite>/
+ ```
+
+ Required files include:
+
+ ```text
+ summary.json
+ artifact-manifest.json
+ target.json
+ suite.json
+ command.txt
+ exit-code.txt
+ crabbox.stdout.txt
+ crabbox.stderr.txt
+ crabbox.timing.json
+ assertions.json
+ failures.md # only when assertions fail
+ ```
+
+ `platform-build` also writes:
+
+ ```text
+ node-version.txt
+ packed-tarball.txt
+ packed-node-install.stdout.txt
+ packed-node-install.stderr.txt
+ pi-install.stdout.txt
+ pi-install.stderr.txt
+ pi-list.stdout.txt
+ pi-list.stderr.txt
+ ```
+
+ `browser-dogfood-smoke` also writes:
+
+ ```text
+ node-version.txt
+ dogfood-artifacts.txt
+ dogfood.stdout.txt
+ dogfood.stderr.txt
+ dogfood-report.json
+ ```
+
+ Each target also writes a `lease-cleanup` artifact directory with `crabbox.stop.*` files. Cleanup failures are failing test results. Ubuntu and Windows runs also invoke Crabbox cleanup for stale direct-provider state after stopping the owned lease.
+
+ Passing suites must satisfy:
+
+ ```text
+ summary.ok === assertions.ok
+ artifact-manifest.missing.length === 0
+ ```
+
+ The harness redacts configured secret values and token-like text from persisted artifacts, then fails if a redaction scan still finds raw secrets.
+
+ ## Source of truth
+
+ - Config: [`platform-smoke.config.mjs`](../platform-smoke.config.mjs)
+ - CLI: [`scripts/platform-smoke.mjs`](../scripts/platform-smoke.mjs)
+ - Crabbox wrapper: [`scripts/platform-smoke/crabbox-runner.mjs`](../scripts/platform-smoke/crabbox-runner.mjs)
+ - Target commands/assertions: [`scripts/platform-smoke/targets.mjs`](../scripts/platform-smoke/targets.mjs)
+ - Platform doctor: [`scripts/platform-smoke/doctor.mjs`](../scripts/platform-smoke/doctor.mjs)
+ - Artifact helpers: [`scripts/platform-smoke/artifacts.mjs`](../scripts/platform-smoke/artifacts.mjs)
+ - Windows build suite: [`scripts/platform-smoke/platform-build-windows.ps1`](../scripts/platform-smoke/platform-build-windows.ps1)
+ - Windows browser suite: [`scripts/platform-smoke/browser-dogfood-windows.ps1`](../scripts/platform-smoke/browser-dogfood-windows.ps1)
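
Reviewer note on the `Artifact contract` section of the new file above: the two "passing suites must satisfy" conditions can be expressed as a small check over parsed evidence. This is a sketch under stated assumptions, not the harness code in `scripts/platform-smoke/artifacts.mjs`. `SuiteEvidence` and `contractViolations` are hypothetical names, and only the `ok` and `missing` fields are taken from the documented contract.

```typescript
// Hypothetical shape: the contract only documents `summary.ok`,
// `assertions.ok`, and `artifact-manifest.missing`.
interface SuiteEvidence {
  summary: { ok: boolean };
  assertions: { ok: boolean };
  artifactManifest: { missing: string[] };
}

// Returns one message per violated condition; an empty array means the two
// documented conditions hold.
function contractViolations(evidence: SuiteEvidence): string[] {
  const problems: string[] = [];

  // Condition 1: summary.ok === assertions.ok
  if (evidence.summary.ok !== evidence.assertions.ok) {
    problems.push("summary.ok does not match assertions.ok");
  }

  // Condition 2: artifact-manifest.missing.length === 0
  if (evidence.artifactManifest.missing.length !== 0) {
    problems.push(`missing artifacts: ${evidence.artifactManifest.missing.join(", ")}`);
  }

  return problems;
}
```

In practice the inputs would come from `summary.json`, `assertions.json`, and `artifact-manifest.json` under `.artifacts/platform-smoke/<run-id>/<target>/<suite>/`; the sketch takes already-parsed objects so that it stays independent of the file layout.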
@@ -41,7 +41,6 @@ import {
    getImplicitSessionCloseTimeoutMs,
    getImplicitSessionIdleTimeoutMs,
    hasLaunchScopedTabCorrectionFlag,
-   hasUsableBraveApiKey,
    extractExplicitSessionName,
    redactInvocationArgs,
    restoreManagedSessionStateFromBranch,
@@ -118,6 +117,8 @@ import {
    type VisibleRefFallbackDiagnostic,
  } from "./lib/results/selector-recovery.js";
  import { withOptionalSessionArgs } from "./lib/results/next-actions.js";
+ import { canRegisterWebSearchTool, loadAgentBrowserConfigSync } from "./lib/config.js";
+ import { createAgentBrowserWebSearchTool } from "./lib/web-search.js";
 
  const DEFAULT_SESSION_MODE = "auto" as const;
  const DIRECT_AGENT_BROWSER_BASH_BYPASS_ENV = "PI_AGENT_BROWSER_ALLOW_DIRECT_BASH";
@@ -947,8 +948,13 @@ function getInstalledDocsPaths(): { readmePath: string; commandReferencePath: st
 
  export default function agentBrowserExtension(pi: ExtensionAPI) {
    const ephemeralSessionSeed = createEphemeralSessionSeed();
-   const hasBraveApiKey = hasUsableBraveApiKey();
-   const toolPromptGuidelines = buildToolPromptGuidelines({ includeBraveSearch: hasBraveApiKey, docs: getInstalledDocsPaths() });
+   const agentBrowserConfig = loadAgentBrowserConfigSync({ cwd: process.cwd() });
+   const webSearchToolAvailable = canRegisterWebSearchTool(agentBrowserConfig);
+   const toolPromptGuidelines = buildToolPromptGuidelines({
+     browserDefaultProfile: agentBrowserConfig.browserDefaultProfile,
+     includeBraveSearch: webSearchToolAvailable,
+     docs: getInstalledDocsPaths(),
+   });
    const implicitSessionIdleTimeoutMs = String(getImplicitSessionIdleTimeoutMs());
    const implicitSessionCloseTimeoutMs = getImplicitSessionCloseTimeoutMs();
    let managedSessionActive = false;
@@ -1267,4 +1273,8 @@ export default function agentBrowserExtension(pi: ExtensionAPI) {
          : runBrowserCommand();
      },
    });
+
+   if (webSearchToolAvailable) {
+     pi.registerTool(createAgentBrowserWebSearchTool(agentBrowserConfig));
+   }
  }
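
Reviewer note on the registration hunk above: the change gates registration on configuration being visible, while the docs hunk states that the key itself is resolved only when the tool executes. The sketch below illustrates that split in isolation. It is not the package's code: `Tool`, `Registry`, `registerSearchIfConfigured`, and the `resolveKey` callback are hypothetical stand-ins for `pi.registerTool`, `canRegisterWebSearchTool`, and whatever resolver the real tool uses.

```typescript
// Illustrative stand-ins only; the real ExtensionAPI and tool shape differ.
interface Tool {
  name: string;
  execute(query: string): Promise<string>;
}

interface Registry {
  registerTool(tool: Tool): void;
}

function registerSearchIfConfigured(
  registry: Registry,
  credentialConfigured: boolean,
  resolveKey: () => Promise<string>,
): void {
  // No configured credential source: the tool is simply never registered.
  if (!credentialConfigured) return;

  registry.registerTool({
    name: "agent_browser_web_search",
    async execute(query: string): Promise<string> {
      // The (possibly slow or interactive) resolver runs here, not at startup.
      const key = await resolveKey();
      if (key.length === 0) {
        throw new Error("search credential resolved to an empty value");
      }
      // A real tool would call the search provider here. The sketch returns
      // only non-secret data so the key never reaches model-visible output.
      return `searched: ${query}`;
    },
  });
}
```

Registering with a resolver that counts its own invocations is a simple way to confirm the deferral: the count stays at zero until `execute` is first called.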