pi-agent-browser-native 0.2.37 → 0.2.39

@@ -0,0 +1,176 @@
1
+ # Platform smoke testing
2
+
3
+ `pi-agent-browser-native` uses a Crabbox-backed local platform smoke gate to prove the package on macOS, Ubuntu Linux, and native Windows before release.
4
+
5
+ This is a release-blocking gate. If Crabbox setup, Docker, macOS SSH, the native Windows template, upstream `agent-browser`, or browser runtime dependencies are missing, the release setup is blocked; the gate never counts as a skipped pass.
6
+
7
+ ## Required release gate
8
+
9
+ Run the cheap harness checks first, then the full matrix:
10
+
11
+ ```sh
12
+ npm run check:platform-smoke
13
+ npm run smoke:platform:ubuntu-image
14
+ npm run smoke:platform:all
15
+ ```
16
+
17
+ `smoke:platform:all` runs `smoke:platform:doctor` before any target suite starts. The canonical `npm run verify -- release` gate also runs the same platform doctor and full `macos,ubuntu,windows-native` matrix after default verification and packaged Pi smoke, so `npm publish` cannot pass `prepublishOnly` without the platform gate.
18
+
19
+ Per-target commands are for diagnosis:
20
+
21
+ ```sh
22
+ npm run smoke:platform:macos
23
+ npm run smoke:platform:ubuntu
24
+ npm run smoke:platform:windows-native
25
+ npm run verify -- platform-smoke run --target ubuntu --suite platform-build
26
+ ```
27
+
28
+ ## Targets
29
+
30
+ | Target | Crabbox provider | Shell contract | Release status |
31
+ | --- | --- | --- | --- |
32
+ | `macos` | `ssh` static localhost | POSIX shell on macOS | Required |
33
+ | `ubuntu` | `local-container` | POSIX shell in a Docker-compatible local container | Required |
34
+ | `windows-native` | `parallels` | native Windows PowerShell over OpenSSH | Required |
35
+
36
+ ## Required environment
37
+
38
+ Install Crabbox on the macOS maintainer host and keep it on `PATH`:
39
+
40
+ ```sh
41
+ brew install openclaw/tap/crabbox
42
+ crabbox --version
43
+ crabbox providers
44
+ ```
45
+
46
+ Use `PLATFORM_SMOKE_CRABBOX=/path/to/crabbox` only when testing a non-default Crabbox binary.
47
+
48
+ Standard configuration knobs:
49
+
50
+ ```sh
51
+ PLATFORM_SMOKE_MAC_HOST=localhost
52
+ PLATFORM_SMOKE_MAC_USER="$USER"
53
+ PLATFORM_SMOKE_MAC_WORK_ROOT="/Users/$USER/crabbox/pi-agent-browser-native"
54
+
55
+ # Default local image built by npm run smoke:platform:ubuntu-image.
56
+ PLATFORM_SMOKE_UBUNTU_IMAGE="pi-agent-browser-native-platform:node24-agent-browser0.27.1"
57
+
58
+ PLATFORM_SMOKE_WINDOWS_VM="pi-extension-windows-template"
59
+ PLATFORM_SMOKE_WINDOWS_SNAPSHOT="crabbox-ready"
60
+ PLATFORM_SMOKE_WINDOWS_USER="<windows-ssh-user>"
61
+ PLATFORM_SMOKE_WINDOWS_WORK_ROOT="C:\\crabbox\\pi-agent-browser-native"
62
+
63
+ # Optional: names of secret env vars to redact/forward if future live suites need them.
64
+ PLATFORM_SMOKE_AUTH_ENV=""
65
+ ```
66
+
67
+ The Ubuntu target image is derived from `node:24-bookworm`, installs `agent-browser@0.27.1`, installs Debian Chromium through apt, creates a non-root `circleci` user, and sets `AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium`. Rebuild it after upstream rebaselining, or override `PLATFORM_SMOKE_UBUNTU_IMAGE` with an equivalent prepared local image. Do not install `agent-browser` ad hoc inside the Ubuntu smoke command.
68
+
69
+ The configured upstream `agent-browser` baseline is imported from [`scripts/agent-browser-capability-baseline.mjs`](../scripts/agent-browser-capability-baseline.mjs). Target-local browser suites verify that exact `agent-browser` version before running. Bake the exact upstream CLI and browser runtime into the Windows template/snapshot for speed and reproducibility; missing or stale Windows `agent-browser` / browser readiness is a blocked setup, not something the smoke command repairs. The Windows browser suite checks the preinstalled browser cache and prewarms one short local file URL before the extension harness runs.
70
+
71
+ ## Target setup expectations
72
+
73
+ Crabbox does not install project runtime tools. The macOS host, Ubuntu image, and Windows template must already provide:
74
+
75
+ - Node/npm at or above the configured Node major baseline in [`platform-smoke.config.mjs`](../platform-smoke.config.mjs).
76
+ - Git and `tar`.
77
+ - Upstream `agent-browser` matching this wrapper’s capability baseline. The Ubuntu target gets it from [`scripts/platform-smoke/linux-image/Dockerfile`](../scripts/platform-smoke/linux-image/Dockerfile); the Windows template gets it from the shared `pi-extension-windows-template` / `crabbox-ready` snapshot.
78
+ - Browser/runtime dependencies needed by upstream `agent-browser`.
79
+ - Native PowerShell and OpenSSH Server on Windows.
80
+
81
+ For Windows, reuse `pi-extension-windows-template` with the shared canonical `crabbox-ready` power-off snapshot. Do not create one-off project VMs. If a reusable tool is missing, update the shared template, verify from a fresh SSH session, remove caches/secrets/checkouts, shut down cleanly, and promote a known-good power-off snapshot.
82
+
83
+ ## What the suites prove
84
+
85
+ Each required target runs `platform-build` and `browser-dogfood-smoke` on one Crabbox lease, serially.
86
+
87
+ ### `platform-build`
88
+
89
+ 1. Verify the target Node major version.
90
+ 2. Run `npm ci` in the synced checkout.
91
+ 3. Run `npm run verify -- platform-target`, a fast target-local gate covering generated docs, TypeScript, package/platform harness tests, and runtime planning. The full unit/fake suite still runs once in the host default gate before the release matrix starts; target-local smoke must not duplicate that full suite on every OS. Browser subprocess behavior is then exercised by the target-local `browser-dogfood-smoke` suite against the real upstream binary.
92
+ 4. Run `npm pack`.
93
+ 5. Create a clean target-local Pi project.
94
+ 6. Install the packed tarball with `npm install --no-save`.
95
+ 7. Run `pi install -l ./node_modules/pi-agent-browser-native` from the clean project.
96
+ 8. Run `pi list` and assert the package is registered from the packed install.
97
+ 9. Assert the release proof did not use `pi -e .` or `pi --extension .`.
98
+
99
+ ### `browser-dogfood-smoke`
100
+
101
+ 1. Run `npm ci` in the synced checkout if needed.
102
+ 2. Run the deterministic model-free browser smoke through `scripts/verify-agent-browser-dogfood.ts`.
103
+ 3. Exercise native wrapper surfaces against the deterministic local file fixture from `scripts/verify-agent-browser-dogfood.ts`: top-level `qa`, `semanticAction`, constrained `job`, screenshot artifact verification, and session close.
104
+ 4. Persist the dogfood JSON report and stdout/stderr evidence.
105
+ 5. Fail on missing browser artifacts, failed tool calls, leaked secrets, or unclosed sessions.
106
+
107
+ The dogfood suite intentionally uses the checkout harness while `platform-build` proves packed Pi installation. Together they catch OS-specific packaging, install, path, process, browser, and wrapper bugs without using an LLM.
108
+
109
+ ## Artifact contract
110
+
111
+ Every target suite writes host-side evidence under:
112
+
113
+ ```text
114
+ .artifacts/platform-smoke/<run-id>/<target>/<suite>/
115
+ ```
116
+
117
+ Required files include:
118
+
119
+ ```text
120
+ summary.json
121
+ artifact-manifest.json
122
+ target.json
123
+ suite.json
124
+ command.txt
125
+ exit-code.txt
126
+ crabbox.stdout.txt
127
+ crabbox.stderr.txt
128
+ crabbox.timing.json
129
+ assertions.json
130
+ failures.md # only when assertions fail
131
+ ```
132
+
133
+ `platform-build` also writes:
134
+
135
+ ```text
136
+ node-version.txt
137
+ packed-tarball.txt
138
+ packed-node-install.stdout.txt
139
+ packed-node-install.stderr.txt
140
+ pi-install.stdout.txt
141
+ pi-install.stderr.txt
142
+ pi-list.stdout.txt
143
+ pi-list.stderr.txt
144
+ ```
145
+
146
+ `browser-dogfood-smoke` also writes:
147
+
148
+ ```text
149
+ node-version.txt
150
+ dogfood-artifacts.txt
151
+ dogfood.stdout.txt
152
+ dogfood.stderr.txt
153
+ dogfood-report.json
154
+ ```
155
+
156
+ Each target also writes a `lease-cleanup` artifact directory with `crabbox.stop.*` files. Cleanup failures are failing test results. Ubuntu and Windows runs also invoke Crabbox cleanup for stale direct-provider state after stopping the owned lease.
157
+
158
+ Passing suites must satisfy:
159
+
160
+ ```text
161
+ summary.ok === assertions.ok
162
+ artifact-manifest.missing.length === 0
163
+ ```
164
+
165
+ The harness redacts configured secret values and token-like text from persisted artifacts, then fails if a redaction scan still finds raw secrets.
166
+
167
+ ## Source of truth
168
+
169
+ - Config: [`platform-smoke.config.mjs`](../platform-smoke.config.mjs)
170
+ - CLI: [`scripts/platform-smoke.mjs`](../scripts/platform-smoke.mjs)
171
+ - Crabbox wrapper: [`scripts/platform-smoke/crabbox-runner.mjs`](../scripts/platform-smoke/crabbox-runner.mjs)
172
+ - Target commands/assertions: [`scripts/platform-smoke/targets.mjs`](../scripts/platform-smoke/targets.mjs)
173
+ - Platform doctor: [`scripts/platform-smoke/doctor.mjs`](../scripts/platform-smoke/doctor.mjs)
174
+ - Artifact helpers: [`scripts/platform-smoke/artifacts.mjs`](../scripts/platform-smoke/artifacts.mjs)
175
+ - Windows build suite: [`scripts/platform-smoke/platform-build-windows.ps1`](../scripts/platform-smoke/platform-build-windows.ps1)
176
+ - Windows browser suite: [`scripts/platform-smoke/browser-dogfood-windows.ps1`](../scripts/platform-smoke/browser-dogfood-windows.ps1)
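The two passing-suite conditions from the artifact contract (`summary.ok === assertions.ok` and `artifact-manifest.missing.length === 0`) could be checked roughly as follows. This is a minimal sketch, not the harness's own code: `SuiteEvidence` and `contractViolations` are hypothetical names, and the `ok` fields are inferred from the contract text rather than from a schema shown in this diff.

```typescript
// Assumed shapes: `missing` matches what the manifest helper writes;
// the `ok` booleans are inferred from the documented contract.
interface SuiteEvidence {
  summary: { ok: boolean };
  assertions: { ok: boolean };
  manifest: { missing: string[] };
}

// Hypothetical helper: returns one message per violated condition.
function contractViolations(evidence: SuiteEvidence): string[] {
  const violations: string[] = [];
  if (evidence.summary.ok !== evidence.assertions.ok) {
    violations.push("summary.ok disagrees with assertions.ok");
  }
  if (evidence.manifest.missing.length !== 0) {
    violations.push(`missing artifacts: ${evidence.manifest.missing.join(", ")}`);
  }
  return violations;
}

const good = contractViolations({
  summary: { ok: true },
  assertions: { ok: true },
  manifest: { missing: [] },
});
const bad = contractViolations({
  summary: { ok: true },
  assertions: { ok: false },
  manifest: { missing: ["exit-code.txt"] },
});
console.log(good, bad);
```

Returning a list of violations rather than a single boolean keeps room for a `failures.md`-style report that names every broken condition at once.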
@@ -30,7 +30,7 @@ export const QUICK_START_GUIDELINES = [
30
30
  ] as const;
31
31
 
32
32
  export const BRAVE_SEARCH_PROMPT_GUIDELINE =
33
- "When a non-empty BRAVE_API_KEY is available in the current environment, prefer the Brave Search API via bash/curl to discover specific destination URLs, then open the chosen URL with agent_browser instead of browsing a search engine results page just to find the target.";
33
+ "With BRAVE_API_KEY set, use Brave Search via bash/curl to find exact destination URLs, then open the chosen URL with agent_browser; do not browse search results just to locate a target.";
34
34
 
35
35
  export const SHARED_BROWSER_PLAYBOOK_GUIDELINES = [
36
36
  "Standard workflow: open the page, snapshot -i, interact using current @refs from that snapshot, and re-snapshot after navigation, scrolling, rerendering, or other major DOM changes because refs are page-scoped; the wrapper fails mutation-prone stale/recycled refs before upstream can silently target a different current-page element.",
@@ -51,7 +51,7 @@ export const SHARED_BROWSER_PLAYBOOK_GUIDELINES = [
51
51
  "For Electron desktop apps, prefer top-level electron for wrapper-owned discovery, isolated launch, status, compact probe, and cleanup: list first, treat likely-sensitive annotations as hints rather than enforcement, launch with the default snapshot handoff unless handoff: \"tabs\" is the safer diagnostic starting point, use electron.probe or snapshot -i/qa.attached for current-session state, and always cleanup the returned launchId when done. electron.launch uses an isolated temporary profile; it does not reuse the app's normal signed-in profile or attach to an already-running authenticated app. For signed-in local app state, host-launch the normal app with --remote-debugging-port when appropriate, then use raw args connect <port|url>; after connect, inspect tab list, select the stable tab id such as tab t2, then run a condition wait or snapshot -i before using refs. close commands (`close`, `quit`, or `exit`) only close the browser/CDP session; leave manually launched app shutdown, profile cleanup, and explicit artifacts to the host owner.",
52
52
  "For provider or specialized app workflows, load version-matched upstream guidance with skills get agentcore|electron|slack|dogfood|vercel-sandbox through the native tool; add --full when you need references/templates, and use skills get --all only for broad skill audits. Provider launches such as -p ios, --provider browserbase/kernel/browseruse/browserless/agentcore, and iOS --device are upstream-owned setup paths; use sessionMode fresh when switching providers and expect external credentials or local Appium/Xcode setup to be required.",
53
53
  "For dialogs and frames, use dialog status/accept/dismiss and frame <selector|main> through native args; when --confirm-actions produces a pending confirmation, use details.nextActions or exact confirm <id> / deny <id> calls instead of inventing ids.",
54
- "If a session lands on the wrong page or tab, an interaction changes origin unexpectedly, or an open call returns blocked, blank, or otherwise unexpected results, use tab list / tab <tab-id-or-label> / snapshot -i to recover state before retrying different URLs or fallback strategies. For headed demos, put --headed on the first launch with sessionMode=fresh and verify with screenshot/tab/get-url evidence because tool success cannot prove the OS window is visible to the user. For desktop readiness, prefer real conditions first: wait --text, wait --url, wait --fn, wait --load <state>, wait --download, or qa.attached; for disappearance checks in agent-browser 0.27.0, use wait --fn predicates instead of stale upstream-help examples like wait <selector> --state hidden. Use electron.probe/status for wrapper-owned launch health or target mismatch. Fixed waits are a last resort, must stay below the wrapper IPC budget (wait 30000 is intentionally blocked), and a successful payload like \"waited\":\"timeout\" means elapsed time only—verify completion with an observed condition, fresh snapshot, or screenshot.",
54
+ "If a session lands on the wrong page or tab, an interaction changes origin unexpectedly, or an open call returns blocked, blank, or otherwise unexpected results, use tab list / tab <tab-id-or-label> / snapshot -i to recover state before retrying different URLs or fallback strategies. For headed demos, put --headed on the first launch with sessionMode=fresh and verify with screenshot/tab/get-url evidence because tool success cannot prove the OS window is visible to the user. For desktop readiness, prefer real conditions first: wait --text, wait --url, wait --fn, wait --load <state>, wait --download, or qa.attached; for disappearance checks in agent-browser 0.27.1, use wait --fn predicates instead of stale upstream-help examples like wait <selector> --state hidden. Use electron.probe/status for wrapper-owned launch health or target mismatch. Fixed waits are a last resort, must stay below the wrapper IPC budget (wait 30000 is intentionally blocked), and a successful payload like \"waited\":\"timeout\" means elapsed time only—verify completion with an observed condition, fresh snapshot, or screenshot.",
55
55
  "For feed, timeline, or inbox reading tasks, focus on the main timeline/list region and read the first item there rather than unrelated composer or sidebar content.",
56
56
  "For read-only browsing tasks, prefer extracting the answer from the current snapshot, structured ref labels, or eval --stdin on the current page before navigating away. Only click into media viewers, detail routes, or new pages when the current view does not contain the needed information.",
57
57
  "For downloads, prefer download <selector> <path> when an element click should save a file. Do not rely on click alone when you need the downloaded file on disk.",
@@ -1,15 +1,16 @@
1
1
  /**
2
2
  * Purpose: Execute the upstream agent-browser binary for the pi-agent-browser extension.
3
- * Responsibilities: Spawn the agent-browser subprocess without a shell, forward a curated environment surface, stream optional stdin, bound in-memory output buffering, spill oversized stdout safely to a private temp file under a disk budget, and honor abort signals.
3
+ * Responsibilities: Spawn the agent-browser subprocess, forward a curated environment surface, stream optional stdin, bound in-memory output buffering, spill oversized stdout safely to a private temp file under a disk budget, and honor abort signals.
4
4
  * Scope: Process execution only; argument planning, output formatting, and pi tool registration live elsewhere.
5
5
  * Usage: Called by the extension tool after argument validation and session planning are complete.
6
- * Invariants/Assumptions: The binary name is always `agent-browser`, the wrapper never shells out, and callers handle semantic success/error interpretation.
6
+ * Invariants/Assumptions: The binary name is always `agent-browser`; Windows routes through PowerShell to invoke npm launchers with escaped argv; callers handle semantic success/error interpretation.
7
7
  */
8
8
 
9
9
  import { type ChildProcessWithoutNullStreams, spawn } from "node:child_process";
10
10
  import { chmod, mkdir } from "node:fs/promises";
11
11
  import { env as processEnv, platform as processPlatform } from "node:process";
12
12
 
13
+ import { GLOBAL_BOOLEAN_FLAGS_WITH_OPTIONAL_VALUES, GLOBAL_VALUE_FLAGS, getFlagName } from "./argv-grammar.js";
13
14
  import { openSecureTempFile, writeSecureTempChunk } from "./temp.js";
14
15
 
15
16
  const MAX_BUFFERED_STDOUT_BYTES = 512 * 1_024;
@@ -107,6 +108,52 @@ function appendTail(text: string, addition: string, maxChars: number): string {
107
108
  return combined.length <= maxChars ? combined : combined.slice(combined.length - maxChars);
108
109
  }
109
110
 
111
+ function quoteWindowsPowerShellArg(value: string): string {
112
+ return `'${value.replace(/'/g, "''")}'`;
113
+ }
114
+
115
+ const WINDOWS_LEADING_GLOBAL_VALUE_FLAGS = new Set<string>(GLOBAL_VALUE_FLAGS);
116
+
117
+ /** Exported for unit tests that lock Windows launcher argv ordering. */
118
+ export function reorderWindowsLeadingGlobalArgs(args: string[]): string[] {
119
+ const leadingGlobals: string[] = [];
120
+ let index = 0;
121
+ while (index < args.length && args[index]?.startsWith("-")) {
122
+ const token = args[index];
123
+ const flagName = getFlagName(token);
124
+ leadingGlobals.push(token);
125
+ index += 1;
126
+ if (WINDOWS_LEADING_GLOBAL_VALUE_FLAGS.has(flagName) && !token.includes("=") && index < args.length) {
127
+ leadingGlobals.push(args[index]);
128
+ index += 1;
129
+ continue;
130
+ }
131
+ if (GLOBAL_BOOLEAN_FLAGS_WITH_OPTIONAL_VALUES.has(flagName) && ["true", "false"].includes(args[index] ?? "")) {
132
+ leadingGlobals.push(args[index]);
133
+ index += 1;
134
+ }
135
+ }
136
+ if (leadingGlobals.length === 0 || index >= args.length) return args;
137
+ return [args[index], ...leadingGlobals, ...args.slice(index + 1)];
138
+ }
139
+
140
+ function buildAgentBrowserSpawnCommand(args: string[]): { command: string; args: string[] } {
141
+ if (processPlatform !== "win32") {
142
+ return { command: "agent-browser", args };
143
+ }
144
+ const commandLine = ["&", "agent-browser", ...reorderWindowsLeadingGlobalArgs(args).map(quoteWindowsPowerShellArg)].join(" ");
145
+ return { command: "powershell.exe", args: ["-NoLogo", "-NoProfile", "-ExecutionPolicy", "Bypass", "-Command", commandLine] };
146
+ }
147
+
148
+ function terminateSpawnedChild(child: ChildProcessWithoutNullStreams, signal: NodeJS.Signals): void {
149
+ if (processPlatform === "win32" && child.pid) {
150
+ const killer = spawn("taskkill.exe", ["/PID", String(child.pid), "/T", "/F"], { stdio: "ignore" });
151
+ killer.on("error", () => undefined);
152
+ killer.unref();
153
+ }
154
+ child.kill(signal);
155
+ }
156
+
110
157
  /** Exported for unit tests that lock subprocess exit-code precedence. */
111
158
  export function resolveSpawnedChildExitCode(input: {
112
159
  closeCode?: number | null;
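To make the Windows launch path in this hunk concrete, the sketch below walks one argv through the same reorder-then-quote steps. It is an illustration under assumptions: the flag sets are hypothetical stand-ins (the real ones come from `./argv-grammar.js`, which this diff does not show), and `flagName` only guesses at what `getFlagName` does.

```typescript
// Hypothetical stand-ins for GLOBAL_VALUE_FLAGS and
// GLOBAL_BOOLEAN_FLAGS_WITH_OPTIONAL_VALUES.
const VALUE_FLAGS = new Set<string>(["--session"]);
const OPTIONAL_BOOLEAN_FLAGS = new Set<string>(["--headed"]);

// Assumed behaviour of getFlagName: drop an inline "=value" suffix.
function flagName(token: string): string {
  const eq = token.indexOf("=");
  return eq === -1 ? token : token.slice(0, eq);
}

// PowerShell single-quoted strings escape a quote by doubling it.
function quotePowerShellArg(value: string): string {
  return `'${value.replace(/'/g, "''")}'`;
}

// Move the first non-flag token (the subcommand) ahead of leading global flags.
function reorderLeadingGlobals(args: string[]): string[] {
  const leading: string[] = [];
  let index = 0;
  while (index < args.length && (args[index] ?? "").startsWith("-")) {
    const token = args[index] ?? "";
    leading.push(token);
    index += 1;
    const next = args[index];
    if (next === undefined) break;
    const name = flagName(token);
    if (VALUE_FLAGS.has(name) && !token.includes("=")) {
      leading.push(next); // the flag's separate value travels with it
      index += 1;
    } else if (OPTIONAL_BOOLEAN_FLAGS.has(name) && (next === "true" || next === "false")) {
      leading.push(next);
      index += 1;
    }
  }
  const command = args[index];
  if (leading.length === 0 || command === undefined) return args;
  return [command, ...leading, ...args.slice(index + 1)];
}

const reordered = reorderLeadingGlobals(["--session", "demo", "open", "https://example.com"]);
const commandLine = ["&", "agent-browser", ...reordered.map(quotePowerShellArg)].join(" ");
console.log(commandLine);
```

Under these stand-in sets, the subcommand `open` moves in front of `--session demo`, and every argument is wrapped in single quotes so PowerShell treats it as a literal.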
@@ -234,17 +281,27 @@ async function ensureAgentBrowserSocketDir(socketDir: string): Promise<boolean>
234
281
  }
235
282
  }
236
283
 
284
+ function getChildEnvName(name: string): string | undefined {
285
+ if (processPlatform === "win32") {
286
+ const upperName = name.toUpperCase();
287
+ if (INHERITED_ENV_NAMES.has(upperName)) return upperName;
288
+ return INHERITED_ENV_PREFIXES.some((prefix) => upperName.startsWith(prefix)) ? upperName : undefined;
289
+ }
290
+ if (INHERITED_ENV_NAMES.has(name) || INHERITED_ENV_PREFIXES.some((prefix) => name.startsWith(prefix))) {
291
+ return name;
292
+ }
293
+ return undefined;
294
+ }
295
+
237
296
  export function buildAgentBrowserProcessEnv(
238
297
  baseEnv: NodeJS.ProcessEnv = processEnv,
239
298
  overrides: NodeJS.ProcessEnv | undefined = undefined,
240
299
  ): NodeJS.ProcessEnv {
241
300
  const childEnv: NodeJS.ProcessEnv = {};
242
301
  for (const [name, value] of Object.entries(baseEnv)) {
243
- if (
244
- value !== undefined &&
245
- (INHERITED_ENV_NAMES.has(name) || INHERITED_ENV_PREFIXES.some((prefix) => name.startsWith(prefix)))
246
- ) {
247
- childEnv[name] = value;
302
+ const childName = getChildEnvName(name);
303
+ if (value !== undefined && childName) {
304
+ childEnv[childName] = value;
248
305
  }
249
306
  }
250
307
 
@@ -254,10 +311,11 @@ export function buildAgentBrowserProcessEnv(
254
311
  }
255
312
 
256
313
  for (const [name, value] of Object.entries(overrides)) {
314
+ const childName = getChildEnvName(name) ?? name;
257
315
  if (value === undefined) {
258
- delete childEnv[name];
316
+ delete childEnv[childName];
259
317
  } else {
260
- childEnv[name] = value;
318
+ childEnv[childName] = value;
261
319
  }
262
320
  }
263
321
  clampUpstreamDefaultTimeout(childEnv);
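The effect of the new name normalization can be seen in a small standalone sketch. Everything here is illustrative: the allow-list contents are hypothetical (the real `INHERITED_ENV_NAMES` and `INHERITED_ENV_PREFIXES` are defined outside this diff), and the platform is passed as a parameter so both branches can be exercised on any host.

```typescript
// Hypothetical allow-list; the real sets are not shown in this diff.
const ALLOWED_NAMES = new Set<string>(["PATH", "HOME"]);
const ALLOWED_PREFIXES = ["AGENT_BROWSER_"];

// On Windows, match case-insensitively and emit the upper-case name;
// elsewhere, names must match exactly.
function childEnvName(name: string, platform: string): string | undefined {
  const candidate = platform === "win32" ? name.toUpperCase() : name;
  const allowed =
    ALLOWED_NAMES.has(candidate) || ALLOWED_PREFIXES.some((prefix) => candidate.startsWith(prefix));
  return allowed ? candidate : undefined;
}

function filterEnv(base: Record<string, string | undefined>, platform: string): Record<string, string> {
  const child: Record<string, string> = {};
  for (const [name, value] of Object.entries(base)) {
    const childName = childEnvName(name, platform);
    if (value !== undefined && childName) child[childName] = value;
  }
  return child;
}

const baseEnv = { Path: "C:\\Windows", agent_browser_session: "demo", SECRET: "x" };
const onWindows = filterEnv(baseEnv, "win32");
const onLinux = filterEnv(baseEnv, "linux");
console.log(onWindows, onLinux);
```

With this allow-list, the Windows branch forwards `Path` as `PATH` and `agent_browser_session` as `AGENT_BROWSER_SESSION`, while the exact-match branch forwards nothing from the same input.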
@@ -371,7 +429,8 @@ export async function runAgentBrowserProcess(options: {
371
429
  });
372
430
  };
373
431
 
374
- const child = spawn("agent-browser", args, {
432
+ const spawnCommand = buildAgentBrowserSpawnCommand(args);
433
+ const child = spawn(spawnCommand.command, spawnCommand.args, {
375
434
  cwd,
376
435
  env: buildAgentBrowserProcessEnv(processEnv, effectiveEnv),
377
436
  stdio: ["pipe", "pipe", "pipe"],
@@ -384,15 +443,15 @@ export async function runAgentBrowserProcess(options: {
384
443
  } else {
385
444
  timedOut = true;
386
445
  }
387
- child.kill("SIGTERM");
446
+ terminateSpawnedChild(child, "SIGTERM");
388
447
  killTimer = setTimeout(() => {
389
- child.kill("SIGKILL");
448
+ terminateSpawnedChild(child, "SIGKILL");
390
449
  }, 2_000);
391
450
  };
392
451
  const recordStdinError = (error: unknown) => {
393
452
  const stdinError = error instanceof Error ? error : new Error(String(error));
394
453
  const errorCode = (stdinError as NodeJS.ErrnoException).code;
395
- if (errorCode === "EPIPE" || errorCode === "ERR_STREAM_DESTROYED") {
454
+ if (errorCode === "EPIPE" || errorCode === "EOF" || errorCode === "ERR_STREAM_DESTROYED") {
396
455
  return;
397
456
  }
398
457
  if (!spawnError) {
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "pi-agent-browser-native",
3
- "version": "0.2.37",
3
+ "version": "0.2.39",
4
4
  "description": "pi extension that exposes agent-browser as a native tool for browser automation",
5
5
  "type": "module",
6
6
  "author": "Mitch Fultz (https://github.com/fitchmultz)",
@@ -31,8 +31,11 @@
31
31
  },
32
32
  "files": [
33
33
  "extensions",
34
+ "platform-smoke.config.mjs",
34
35
  "scripts/doctor.mjs",
35
36
  "scripts/agent-browser-capability-baseline.mjs",
37
+ "scripts/platform-smoke.mjs",
38
+ "scripts/platform-smoke",
36
39
  "README.md",
37
40
  "CHANGELOG.md",
38
41
  "LICENSE",
@@ -40,6 +43,7 @@
40
43
  "docs/COMMAND_REFERENCE.md",
41
44
  "docs/ELECTRON.md",
42
45
  "docs/RELEASE.md",
46
+ "docs/platform-smoke.md",
43
47
  "docs/REQUIREMENTS.md",
44
48
  "docs/SUPPORT_MATRIX.md",
45
49
  "docs/TOOL_CONTRACT.md"
@@ -56,9 +60,9 @@
56
60
  "typebox": "*"
57
61
  },
58
62
  "devDependencies": {
59
- "@earendil-works/pi-ai": "^0.76.0",
60
- "@earendil-works/pi-coding-agent": "^0.76.0",
61
- "@earendil-works/pi-tui": "^0.76.0",
63
+ "@earendil-works/pi-ai": "^0.78.0",
64
+ "@earendil-works/pi-coding-agent": "^0.78.0",
65
+ "@earendil-works/pi-tui": "^0.78.0",
62
66
  "@types/node": "^25.6.1",
63
67
  "tsx": "^4.21.0",
64
68
  "typebox": "^1.1.38",
@@ -71,6 +75,14 @@
71
75
  "docs": "node ./scripts/project.mjs docs",
72
76
  "doctor": "node ./scripts/doctor.mjs",
73
77
  "benchmark:agent-browser": "node ./scripts/agent-browser-efficiency-benchmark.mjs",
78
+ "check:platform-smoke": "node --check platform-smoke.config.mjs && node --check scripts/platform-smoke.mjs && node --check scripts/platform-smoke/doctor.mjs && node --check scripts/platform-smoke/crabbox-runner.mjs && node --check scripts/platform-smoke/targets.mjs && node --check scripts/platform-smoke/artifacts.mjs && tsx --test test/platform-smoke.test.ts",
79
+ "smoke:platform": "node scripts/platform-smoke.mjs",
80
+ "smoke:platform:doctor": "node scripts/platform-smoke.mjs doctor",
81
+ "smoke:platform:ubuntu-image": "docker build -t pi-agent-browser-native-platform:node24-agent-browser0.27.1 --build-arg AGENT_BROWSER_VERSION=0.27.1 -f scripts/platform-smoke/linux-image/Dockerfile .",
82
+ "smoke:platform:macos": "node scripts/platform-smoke.mjs run --target macos",
83
+ "smoke:platform:ubuntu": "node scripts/platform-smoke.mjs run --target ubuntu",
84
+ "smoke:platform:windows-native": "node scripts/platform-smoke.mjs run --target windows-native",
85
+ "smoke:platform:all": "npm run smoke:platform:doctor && node scripts/platform-smoke.mjs run --target macos,ubuntu,windows-native",
74
86
  "typecheck": "node ./scripts/project.mjs verify typecheck",
75
87
  "test": "tsx --test test/**/*.test.ts",
76
88
  "verify": "node ./scripts/project.mjs verify",
@@ -0,0 +1,18 @@
1
+ // Platform smoke configuration for pi-agent-browser-native.
2
+ // Crabbox owns the target lease/sync loop; this file is the project source of truth for release-blocking platform coverage.
3
+
4
+ import { CAPABILITY_BASELINE } from "./scripts/agent-browser-capability-baseline.mjs";
5
+
6
+ export default {
7
+ packageName: "pi-agent-browser-native",
8
+ artifactRoot: ".artifacts/platform-smoke",
9
+ requiredTargets: ["macos", "ubuntu", "windows-native"],
10
+ requiredSuites: ["platform-build", "browser-dogfood-smoke"],
11
+ requiredCrabbox: {
12
+ install: "Homebrew package or PLATFORM_SMOKE_CRABBOX override",
13
+ minVersion: "0.24.0",
14
+ },
15
+ ubuntuContainerImage: "pi-agent-browser-native-platform:node24-agent-browser0.27.1",
16
+ nodeValidationMajor: 22,
17
+ agentBrowserVersion: CAPABILITY_BASELINE.targetVersion,
18
+ };
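The `nodeValidationMajor` and `requiredCrabbox.minVersion` values in this config suggest threshold checks of roughly the following form. This is a hedged sketch only: `nodeMajor`, `meetsNodeBaseline`, and `atLeastVersion` are hypothetical helpers, and how the real doctor and target suites compare versions is not visible in this diff.

```typescript
// Parse the major component from "v24.1.0" or "24.1.0"; NaN when unparsable.
function nodeMajor(version: string): number {
  return Number.parseInt(version.replace(/^v/, "").split(".")[0] ?? "", 10);
}

// True when the target's Node major is at or above the configured baseline.
function meetsNodeBaseline(version: string, baselineMajor: number): boolean {
  const major = nodeMajor(version);
  return Number.isInteger(major) && major >= baselineMajor;
}

// Compare dotted numeric versions such as "0.24.0".
// Pre-release or build suffixes are deliberately not handled in this sketch.
function atLeastVersion(actual: string, minimum: string): boolean {
  const a = actual.split(".").map(Number);
  const m = minimum.split(".").map(Number);
  for (let i = 0; i < Math.max(a.length, m.length); i += 1) {
    const diff = (a[i] ?? 0) - (m[i] ?? 0);
    if (diff !== 0) return diff > 0;
  }
  return true;
}

console.log(meetsNodeBaseline("v24.1.0", 22), atLeastVersion("0.23.9", "0.24.0"));
```

Comparing components numerically matters here: a plain string comparison would rank `0.100.0` below `0.24.0`.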
@@ -14,8 +14,8 @@ export const COMMAND_REFERENCE_BASELINE_BLOCK_IDS = Object.freeze(["upstream-bas
14
14
 
15
15
  const sourceEvidence = Object.freeze({
16
16
  repository: "vercel-labs/agent-browser",
17
- upstreamHead: "4ad284890cb59564af603e6de403dd75dd19e832",
18
- upstreamPackageVersion: "0.27.0",
17
+ upstreamHead: "90050f2913159875e2c3719e424746396ccb3cbf",
18
+ upstreamPackageVersion: "0.27.1",
19
19
  inspectedSources: Object.freeze([
20
20
  "agent-browser --version",
21
21
  "agent-browser --help",
@@ -349,7 +349,8 @@ const inventorySections = Object.freeze([
349
349
  "diff screenshot --baseline <file> --output <file> --threshold <0-1> --selector <sel> --full",
350
350
  "diff url <u1> <u2>",
351
351
  "diff url <u1> <u2> --screenshot --wait-until <strategy> --selector <sel> --compact --depth <n>",
352
- "trace start|stop [path]",
352
+ "trace start",
353
+ "trace stop [path]",
353
354
  "profiler start|stop [path]",
354
355
  "record start <path> [url]",
355
356
  "record restart <path> [url]",
@@ -386,7 +387,8 @@ const inventorySections = Object.freeze([
386
387
  root("storage <local|session>"),
387
388
  root("diff snapshot"),
388
389
  root("diff screenshot --baseline"),
389
- root("trace start|stop [path]"),
390
+ root("trace start"),
391
+ root("trace stop [path]"),
390
392
  root("profiler start|stop [path]"),
391
393
  root("record start <path> [url]"),
392
394
  root("record stop"),
@@ -422,7 +424,8 @@ const inventorySections = Object.freeze([
422
424
  ["diff help", "--threshold <0-1>"],
423
425
  ["diff help", "--wait-until <strategy>"],
424
426
  ["diff help", "diff screenshot --baseline <f>"],
425
- ["trace help", "trace <operation> [path]"],
427
+ ["trace help", "trace start"],
428
+ ["trace help", "trace stop [path]"],
426
429
  ["profiler help", "--categories <list>"],
427
430
  ["record help", "record restart <path.webm> [url]"],
428
431
  ["console help", "--clear"],
@@ -703,7 +706,7 @@ const inventorySections = Object.freeze([
703
706
  ]);
704
707
 
705
708
  export const CAPABILITY_BASELINE = Object.freeze({
706
- targetVersion: "0.27.0",
709
+ targetVersion: "0.27.1",
707
710
  sourceEvidence,
708
711
  helpCommands,
709
712
  inventorySections,
@@ -0,0 +1,94 @@
1
+ /** Artifact helpers for platform smoke suites. */
2
+
3
+ import { existsSync, mkdirSync, readdirSync, readFileSync, writeFileSync } from "node:fs";
4
+ import { relative, resolve } from "node:path";
5
+
6
+ export function createSuiteDir(artifactRoot, runId, targetName, suiteName) {
7
+ const dir = resolve(process.cwd(), artifactRoot, runId, targetName, suiteName);
8
+ mkdirSync(dir, { recursive: true });
9
+ return dir;
10
+ }
11
+
12
+ export function writeCommand(dir, command) {
13
+ writeFileSync(resolve(dir, "command.txt"), `${command}\n`);
14
+ }
15
+
16
+ export function writeExitCode(dir, code, signal) {
17
+ writeFileSync(resolve(dir, "exit-code.txt"), `code=${code}\nsignal=${signal ?? "none"}\n`);
18
+ }
19
+
20
+ export function writeSummary(dir, data) {
21
+ writeFileSync(resolve(dir, "summary.json"), JSON.stringify({ ...data, writtenAt: new Date().toISOString() }, null, 2));
22
+ }
23
+
24
+ export function writeManifest(dir, expectedFiles) {
25
+ const present = [];
26
+ function walk(current) {
27
+ for (const entry of readdirSync(current, { withFileTypes: true })) {
28
+ const path = resolve(current, entry.name);
29
+ if (entry.isDirectory()) walk(path);
30
+ else if (entry.isFile()) present.push(relative(dir, path));
31
+ }
32
+ }
33
+ if (existsSync(dir)) walk(dir);
34
+ const allPresent = [...new Set([...present, "artifact-manifest.json"])].sort();
35
+ const manifest = {
36
+ expected: expectedFiles,
37
+ present: allPresent,
38
+ missing: expectedFiles.filter((file) => !allPresent.includes(file)),
39
+ writtenAt: new Date().toISOString(),
40
+ };
41
+ writeFileSync(resolve(dir, "artifact-manifest.json"), JSON.stringify(manifest, null, 2));
42
+ return manifest;
43
+ }
44
+
45
+ export function collectSecretValues(envNames, env = process.env) {
46
+ return [...new Set(envNames.map((name) => env[name]).filter((value) => typeof value === "string" && value.length >= 8))];
47
+ }
48
+
49
+ export function redactSecrets(text, secretValues = []) {
50
+ let redacted = String(text ?? "");
51
+ for (const secret of secretValues) {
52
+ redacted = redacted.split(secret).join("[REDACTED_SECRET]");
53
+ }
54
+ return redacted;
55
+ }
56
+
57
+ export function scanForSecrets(text, secretValues = []) {
58
+ const content = String(text ?? "");
59
+ const violations = [];
60
+ for (const secret of secretValues) {
61
+ if (secret && content.includes(secret)) violations.push("raw forwarded secret value");
62
+ }
63
+ for (const [pattern, label] of [
64
+ [/bearer\s+[A-Za-z0-9\-._~+/]{20,}=*/gi, "bearer token"],
65
+ [/Authorization:\s*Bearer\s+[A-Za-z0-9\-._~+/]{20,}=*/gi, "authorization header"],
66
+ [/(?:api[_-]?key|access[_-]?token|refresh[_-]?token|cookie)\s*[:=]\s*["']?[A-Za-z0-9_./+\-=]{20,}/gi, "token-like field"],
67
+ ]) {
68
+ if (pattern.test(content)) violations.push(label);
69
+ }
70
+ return [...new Set(violations)];
71
+ }
72
+
73
+ export function scanArtifactTextFiles(dir, secretValues = []) {
74
+ const findings = [];
75
+ function walk(current) {
76
+ for (const entry of readdirSync(current, { withFileTypes: true })) {
77
+ const path = resolve(current, entry.name);
78
+ if (entry.isDirectory()) {
79
+ walk(path);
80
+ continue;
81
+ }
82
+ if (!entry.isFile()) continue;
83
+ if (!/\.(?:txt|json|jsonl|md|log|ps1|mjs|js)$/i.test(entry.name)) continue;
84
+ try {
85
+ const text = readFileSync(path, "utf8");
86
+ for (const violation of scanForSecrets(text, secretValues)) findings.push({ file: relative(dir, path), violation });
87
+ } catch {
88
+ // Ignore unreadable or non-text files.
89
+ }
90
+ }
91
+ }
92
+ walk(dir);
93
+ return findings;
94
+ }
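The redact-then-scan flow built from these helpers can be illustrated with a reduced version. The sketch below keeps the same replacement strategy but checks only one of the token patterns, and the secret value and log text are made up for the example.

```typescript
// Replace every occurrence of each configured secret value.
function redactSecrets(text: string, secretValues: string[]): string {
  let redacted = text;
  for (const secret of secretValues) {
    redacted = redacted.split(secret).join("[REDACTED_SECRET]");
  }
  return redacted;
}

// Reduced scan: configured raw values plus the bearer-token pattern only.
function scanForSecrets(text: string, secretValues: string[]): string[] {
  const violations = new Set<string>();
  for (const secret of secretValues) {
    if (secret && text.includes(secret)) violations.add("raw forwarded secret value");
  }
  if (/bearer\s+[A-Za-z0-9\-._~+/]{20,}=*/i.test(text)) violations.add("bearer token");
  return Array.from(violations);
}

// Made-up secret; it is longer than the 8-character minimum used by collectSecretValues.
const secrets = ["s3cr3t-value-1234"];
const rawLog = "login ok token=s3cr3t-value-1234\nAuthorization: Bearer abcdefghijklmnopqrstuvwxyz012345";

const redactedLog = redactSecrets(rawLog, secrets);
const remaining = scanForSecrets(redactedLog, secrets);
console.log(redactedLog);
console.log(remaining);
```

In this example the configured value is replaced, but the bearer token is not a configured value, so the follow-up scan still reports it. That is the case the post-redaction scan exists to catch.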