pi-agent-browser-native 0.2.48 → 0.2.50

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (189)

package/docs/RELEASE.md CHANGED

@@ -38,7 +38,7 @@ For PR-ready local confidence before release-only lifecycle and platform cost, r
 npm run verify -- pre-pr
 ```
-`pre-pr` composes the default gate with `npm run verify -- package`: generated docs, TypeScript, the full unit/fake suite, live command-reference sampling, and package-content verification. It intentionally does not run lifecycle, packaged Pi smoke, Crabbox platform smoke, real-upstream, dogfood, or benchmark modes.
+`pre-pr` composes the default gate with `npm run verify -- package`: generated docs, clean `dist/` build, TypeScript, the full unit/fake suite, live command-reference sampling, and package-content verification. It intentionally does not run lifecycle, packaged Pi smoke, Crabbox platform smoke, startup-profile, real-upstream, dogfood, or benchmark modes.
 `npm run verify -- release` runs:
@@ -47,9 +47,18 @@ npm run verify -- pre-pr
 3. `npm run verify -- package-pi`, which first validates package contents via `npm pack --json --dry-run` and then smoke-loads the packed package in Pi isolation
 4. `npm run smoke:platform:doctor` and the full Crabbox matrix from [`platform-smoke.md`](platform-smoke.md): macOS SSH, Ubuntu local-container, and native Windows Parallels targets running fast target-local `platform-build` plus `browser-dogfood-smoke`
-`npm publish` runs npm’s `prepublishOnly` script from `package.json`, which executes the same `npm run verify -- release` gate and then `npm pack --dry-run`. That concatenated gate is everything in the default `npm run verify` step (generated playbook drift, TypeScript, the unit/fake suite, generated command-reference blocks, and live upstream command-reference sampling against the targeted `agent-browser` on `PATH`), the configured-source lifecycle harness, the packaged Pi smoke in `package-pi`, and the release-blocking Crabbox platform matrix. Using `npm publish --ignore-scripts` skips that contract intentionally.
+`npm publish` runs npm’s `prepublishOnly` script from `package.json`, which executes the same `npm run verify -- release` gate and then `npm pack --dry-run`. That concatenated gate is everything in the default `npm run verify` step (generated playbook drift, clean `dist/` build, TypeScript, the unit/fake suite, generated command-reference blocks, and live upstream command-reference sampling against the targeted `agent-browser` on `PATH`), the configured-source lifecycle harness, the packaged Pi smoke in `package-pi`, and the release-blocking Crabbox platform matrix. Using `npm publish --ignore-scripts` skips that contract intentionally.
-`prepublishOnly` intentionally does **not** run the standalone host-only `npm run verify -- real-upstream`, `npm run verify -- dogfood`, or `npm run verify -- benchmark` modes; those remain separate `npm run verify` modes in [`scripts/project.mjs`](https://github.com/fitchmultz/pi-agent-browser-native/blob/main/scripts/project.mjs). The platform matrix includes its own fast target-local build/package gate and browser dogfood suite, and is automated through the `release` slice.
+`prepublishOnly` intentionally does **not** run the standalone host-only `npm run verify -- startup-profile`, `npm run verify -- real-upstream`, `npm run verify -- dogfood`, or `npm run verify -- benchmark` modes; those remain separate `npm run verify` modes in [`scripts/project.mjs`](https://github.com/fitchmultz/pi-agent-browser-native/blob/main/scripts/project.mjs). The platform matrix includes its own fast target-local build/package gate and browser dogfood suite, and is automated through the `release` slice.
+Run the opt-in startup profiler whenever package layout, the compiled entrypoint, top-level imports, schema registration, or prompt/config startup logic changes:
+```bash
+npm run build
+npm run verify -- startup-profile --samples 3
+```
+The profiler first clean-builds `dist/`, then records only direct package entrypoint import/factory timing in fresh Node processes, writes `.artifacts/startup-profile/latest.json`, and includes a safety block confirming it did not launch Pi, tmux, mise, npm, browsers, or `agent-browser`. Full Pi TUI ready-prompt profiling is intentionally excluded because repeated real Pi/tmux launches proved too invasive for routine verification on the operator machine.
 For a deterministic host-only real-browser wrapper smoke without model choice in the loop, run:
@@ -76,7 +85,7 @@ Every release also requires interactive `tmux`-driven Pi dogfood with the native
 When reviewing saved session JSONL after a failed smoke or a `qa` preset that reclassified an upstream-successful batch, expect `agent_browser` tool rows to carry `isError: true` whenever `details.resultCategory` is `failure`. For normal prose output, model-visible text should end with a `Pi tool isError: true` category line; for caller-requested `--json` output, the hook preserves parseable JSON and only patches `isError`. The extension applies that patch on the `tool_result` path so Pi’s transcript matches the wrapper contract ([`TOOL_CONTRACT.md`](TOOL_CONTRACT.md#details)). Preserve a normal Pi session directory for those checks; avoiding `--no-session` keeps this evidence intact ([`AGENTS.md`](https://github.com/fitchmultz/pi-agent-browser-native/blob/main/AGENTS.md) preferred validation workflow).
-The configured-source lifecycle regression harness is required before release because it launches an interactive `pi` process under `tmux` with `--approve` and validates `/reload`, full relaunch with the same exact Pi 0.79 `--session-id`, managed-session continuity, persisted artifacts, and Pi failure-patch behavior. Branch-backed `session_tree` rehydration and cleanup ownership are validated by focused extension harness tests:
+The configured-source lifecycle regression harness is required before release because it launches an interactive `pi` process under `tmux` with `--approve` and validates `/reload`, full relaunch with the same exact Pi 0.79 `--session-id`, managed-session continuity, persisted artifacts, compiled-entrypoint pickup after process restart, and Pi failure-patch behavior. Branch-backed `session_tree` rehydration and cleanup ownership are validated by focused extension harness tests:
 ```bash
 npm run verify -- lifecycle
@@ -155,7 +164,7 @@ Evaluator expectations after the queued Sauce Demo fixes: the agent should indep
 [`scripts/agent-browser-efficiency-benchmark.mjs`](https://github.com/fitchmultz/pi-agent-browser-native/blob/main/scripts/agent-browser-efficiency-benchmark.mjs) is an accounting-only benchmark: it does not shell out to `agent-browser`, launch a browser, or read or write Pi sessions. It models representative `agent_browser` call shapes (including optional `stdin` for `batch` and top-level `job`, `qa`, or experimental `sourceLookup` / `networkSourceLookup` objects that compile to batch) and aggregates success rate, tool-call counts, UTF-8 size of model-visible strings, stale-ref failure and recovery counts, artifact success, distinct failure-category coverage, and summed elapsed-time estimates. When extending scenarios, keep them aligned with the closed `RQ-0068` “no reusable recipe layer” rationale in [`ARCHITECTURE.md`](ARCHITECTURE.md#no-reusable-recipe-layer-yet) (benchmark ids cited there are the canonical inventory for that evidence bar).
 - **During development:** `npm run benchmark:agent-browser` prints a Markdown report; `npm run benchmark:agent-browser -- --json` saves machine-readable metrics; `npm run benchmark:agent-browser -- --compare path/to/prior.json` fails with exit code `1` on regressions (see the script’s `--help` for exit codes). Optional `--sample-jsonl path/to/session.jsonl` adds a `jsonlSample` section with real UTF-8 byte totals and per-workflow/overall p95 sizes for model-visible `agent_browser` tool-result text without changing deterministic scenario metrics; comparison ignores `jsonlSample` blocks.
-- **Default gate:** `npm run verify` checks generated playbook drift, runs `tsc --noEmit`, runs the full unit/fake suite under `test/**/*.test.ts` with Node test concurrency pinned to `1` (including [`test/agent-browser.efficiency-benchmark.test.ts`](https://github.com/fitchmultz/pi-agent-browser-native/blob/main/test/agent-browser.efficiency-benchmark.test.ts) for scenario coverage and comparison behavior), verifies generated command-reference baseline blocks, and samples live upstream command-reference tokens. It does not spawn the standalone benchmark script’s JSON/Markdown run; that is what the opt-in slice below adds.
+- **Default gate:** `npm run verify` checks generated playbook drift, clean-builds `dist/`, runs `tsc --noEmit`, runs the full unit/fake suite under `test/**/*.test.ts` with Node test concurrency pinned to `1` (including [`test/agent-browser.efficiency-benchmark.test.ts`](https://github.com/fitchmultz/pi-agent-browser-native/blob/main/test/agent-browser.efficiency-benchmark.test.ts) for scenario coverage and comparison behavior), verifies generated command-reference baseline blocks, and samples live upstream command-reference tokens. It does not spawn the standalone benchmark script’s JSON/Markdown run; that is what the opt-in slice below adds.
 - **Pre-PR gate:** `npm run verify -- pre-pr` runs the default gate plus `npm run verify -- package` for larger handoffs that need package-content confidence without lifecycle, platform, real-upstream, dogfood, or benchmark cost.
 - **Opt-in slice:** `npm run verify -- benchmark` runs the benchmark script once with `--json` and then that same test module alone. It is intentionally **not** part of `npm run verify -- pre-pr` or `npm run verify -- release`, so routine handoff and publish gates stay decoupled from benchmark churn while still allowing a focused check after editing scenarios or `CURRENT_BENCHMARK_VERSION`.
@@ -168,9 +177,11 @@ Maintainer constraints for evolving scenarios and version bumps are summarized u
 - no repo-local `.pi/extensions/agent-browser.ts` autoload shim is present
 - `LICENSE` exists in the repo and the packed tarball
 - canonical published docs are present
+- `npm pack --json --dry-run` runs the `prepack` build and packs the compiled `dist/extensions/agent-browser/index.js` entrypoint
+- GitHub/source installs run the package `prepare` build so Pi can load the ignored compiled `dist/extensions/agent-browser/index.js` entrypoint from a fresh clone
 - the package-level doctor command and capability baseline are present
-- extension source files are present, including the split result-rendering modules required by the published facade
-- agent-only and superseded docs are absent from the tarball
+- compiled extension runtime files are present, including the split result-rendering modules required by the published facade
+- source-only, agent-only, and superseded docs are absent from the tarball
 `npm run verify -- package-pi` runs the same package-content checks and additionally confirms that:
@@ -187,7 +198,7 @@ Current forbidden packed files include:
 - `AGENTS.md`
 - archived planning drafts under `docs/archive/`
 - `.pi/extensions/agent-browser.ts`
-- test and repo-only maintenance files
+- TypeScript extension source and other test/repo-only maintenance files
 For a full packed file listing:
@@ -203,7 +214,7 @@ Before publishing, validate both local-checkout modes without mixing their assum
 1. Install `agent-browser` separately.
 2. Launch `pi --approve --no-extensions -e .` from this trusted repository root. Omit `--approve` only when testing Pi's Project Trust prompt.
-3. Confirm the checkout extension loads from `extensions/agent-browser/index.ts`.
+3. Confirm the checkout package loads the compiled `dist/extensions/agent-browser/index.js` entrypoint (run `npm run build` first after source edits).
 4. Run a smoke prompt that exercises `agent_browser`.
 5. Restart the `pi` process after extension edits; Pi settings and `/reload` are not the validation target in this isolated mode.
@@ -221,7 +232,7 @@ Run the automated harness for deterministic configured-source lifecycle regressi
 npm run verify -- lifecycle
 ```
-The harness creates an isolated `PI_CODING_AGENT_DIR`, writes settings with exactly one temporary configured package source, runs `pi` in `tmux` with `--approve`, default model **`zai/glm-5.1`**, and a deterministic `--session-id`, puts a deterministic fake `agent-browser` first on `PATH`, drives `/reload`, closes Pi, and relaunches with the same exact session id instead of typing `/resume`. It also asserts the JSONL session header id, same-page managed-session continuity, persisted spill reachability, and real Pi `tool_result` failure-patch semantics for a QA reclassification. Per-step tmux waits default to **180000 ms** (three minutes) in [`scripts/verify-lifecycle.mjs`](https://github.com/fitchmultz/pi-agent-browser-native/blob/main/scripts/verify-lifecycle.mjs) (`DEFAULT_TIMEOUT_MS`); override with `--timeout-ms <ms>` when slower models or cold starts need more headroom. Override the model when needed:
+The harness creates an isolated `PI_CODING_AGENT_DIR`, writes settings with exactly one temporary configured package source, runs `pi` in `tmux` with `--approve`, default model **`zai/glm-5.1`**, and a deterministic `--session-id`, puts a deterministic fake `agent-browser` first on `PATH`, drives `/reload`, closes Pi, and relaunches with the same exact session id instead of typing `/resume`. It also asserts the JSONL session header id, same-page managed-session continuity, compiled JS code pickup after full process relaunch, persisted spill reachability, and real Pi `tool_result` failure-patch semantics for a QA reclassification. Per-step tmux waits default to **180000 ms** (three minutes) in [`scripts/verify-lifecycle.mjs`](https://github.com/fitchmultz/pi-agent-browser-native/blob/main/scripts/verify-lifecycle.mjs) (`DEFAULT_TIMEOUT_MS`); override with `--timeout-ms <ms>` when slower models or cold starts need more headroom. Override the model when needed:
 ```bash
 npm run verify -- lifecycle --model openai-codex/gpt-5.5:minimal
@@ -235,7 +246,7 @@ npm run verify -- lifecycle --model openai-codex/gpt-5.5:minimal --timeout-ms 60
 On failure it retains transcripts/session artifacts; on success it performs best-effort cleanup. It does not replace occasional real-browser manual smoke testing.
-**Lifecycle triage:** a timeout on sentinel `v2` after `/reload` often means Pi rejected reload while the TUI still showed `Working…` (`Wait for the current response to finish before reloading`), even when the session JSONL already has a final assistant message. Re-run with `--keep-artifacts --verbose`, inspect the retained pane capture, and confirm the configured model follows tool prompts reliably. Slower models may need a higher `--timeout-ms` than the **180000 ms** default.
+**Lifecycle triage:** a timeout on sentinel `v2` after exact-session relaunch means the new compiled entrypoint did not load after process restart. A reload-step timeout or missing post-reload snapshot often means Pi rejected reload while the TUI still showed `Working…` (`Wait for the current response to finish before reloading`), even when the session JSONL already has a final assistant message. Re-run with `--keep-artifacts --verbose`, inspect the retained pane capture, and confirm the configured model follows tool prompts reliably. Slower models may need a higher `--timeout-ms` than the **180000 ms** default.
 ### Environment and automation pitfalls

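Reviewer sketch (not part of the package): the RELEASE.md hunks above describe how the `pre-pr` and `release` gates compose and which opt-in modes they leave out. The following is a minimal model of that composition, assuming the step names below stand in for the real `verifySteps` entries in `scripts/project.mjs`. The `Step` type, `modes` table, and `skippedOptIns` helper are all hypothetical and exist only to make the described relationships checkable.

```typescript
// Hypothetical model of the verify gates described in RELEASE.md.
// The real definitions live in scripts/project.mjs and may differ.
type Step =
  | "default"
  | "package"
  | "lifecycle"
  | "package-pi"
  | "platform"
  | "startup-profile"
  | "real-upstream"
  | "dogfood"
  | "benchmark";

// Composed gates, as the docs describe them.
const modes: Record<string, readonly Step[]> = {
  "pre-pr": ["default", "package"],
  release: ["default", "lifecycle", "package-pi", "platform"],
};

// Standalone modes that the docs say prepublishOnly intentionally skips.
const optIn: readonly Step[] = [
  "startup-profile",
  "real-upstream",
  "dogfood",
  "benchmark",
];

// Returns the opt-in modes that a composed gate does not run.
function skippedOptIns(mode: string): Step[] {
  const steps = modes[mode] ?? [];
  return optIn.filter((step) => !steps.includes(step));
}
```

Under this model, `skippedOptIns("release")` lists all four opt-in modes, which matches the statement that `prepublishOnly` does not run `startup-profile`, `real-upstream`, `dogfood`, or `benchmark`.
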
package/docs/REQUIREMENTS.md CHANGED

@@ -106,7 +106,7 @@ The design should comfortably support workflows such as:
 - isolated authenticated browser sessions
 - headless authenticated `chat.com` / ChatGPT / OpenAI browsing without forcing `--headed` or `--auto-connect`
 - upstream profile/debug workflows without adding a local profile-cloning layer in this package
-- provider-backed or iOS device launches where upstream owns credentials, env, and setup; the wrapper forwards argv and a curated provider-related environment without emulating those backends
+- provider-backed or iOS device launches where upstream owns credentials, env, and setup; the wrapper forwards argv and the parent environment without emulating those backends
 - desktop Electron targets using top-level `electron` for discover → isolated launch → attach → probe/cleanup, or raw `args: ["connect", …]` when the operator launches the real app with a debug port for signed-in state (see [`TOOL_CONTRACT.md`](TOOL_CONTRACT.md#electron) and [`COMMAND_REFERENCE.md`](COMMAND_REFERENCE.md#electron-desktop-apps))
 ## Implications for the implementation

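Reviewer sketch (not part of the package): the REQUIREMENTS.md change above replaces "a curated provider-related environment" with forwarding "the parent environment". One plausible reading, combined with the "wrapper overrides" wording in SUPPORT_MATRIX.md, is a plain merge in which wrapper-set values win. The `buildChildEnv` helper and every variable name used with it are hypothetical; the package's actual process code may behave differently.

```typescript
// Hypothetical illustration of "parent env forwarding with wrapper overrides".
// Nothing here is taken from the package source.
type Env = Record<string, string | undefined>;

// Forwards every defined parent variable, then applies wrapper overrides.
// An override explicitly set to undefined removes the variable from the result.
function buildChildEnv(
  parentEnv: Env,
  wrapperOverrides: Env,
): Record<string, string> {
  const merged: Record<string, string> = {};
  const combined: Env = { ...parentEnv, ...wrapperOverrides };
  for (const [key, value] of Object.entries(combined)) {
    if (value !== undefined) merged[key] = value;
  }
  return merged;
}
```

With a parent environment of `{ PATH: "/usr/bin", EXAMPLE_PROVIDER_KEY: "k" }` and an override of `{ PATH: "/fake/bin" }`, the child sees the overridden `PATH` and the untouched provider variable. That is the practical difference from a curated allowlist, where unlisted provider variables would be dropped.
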
package/docs/SUPPORT_MATRIX.md CHANGED

@@ -50,15 +50,16 @@ Re-run the gates below before each release; this table records what the closure
 | Gate | Evidence | Status |
 | --- | --- | --- |
-| Default local gate | `npm run verify` checks generated playbook drift, `tsc --noEmit`, unit/fake tests, generated command-reference blocks, and live command-reference sampling. | **Current for 0.27.2:** pass on 2026-06-11 inside `npm run verify -- release`; 561 passed, 1 skipped, then command-reference generated blocks and live sampling passed with `agent-browser 0.27.2` on `PATH`. |
+| Default local gate | `npm run verify` checks generated playbook drift, clean-builds generated `dist/`, runs `tsc --noEmit`, unit/fake tests, generated command-reference blocks, and live command-reference sampling. | **Current for 0.27.2:** pass on 2026-06-11 inside `npm run verify -- release`; 561 passed, 1 skipped, then command-reference generated blocks and live sampling passed with `agent-browser 0.27.2` on `PATH`. |
 | Pre-PR local gate | `npm run verify -- pre-pr` composes the default gate with package-content verification. Use before larger local handoffs or PR-ready claims when lifecycle/platform/live dogfood cost is not warranted. | Added 2026-06-10; orchestration is locked by `test/project-verify.test.ts` and does not change release mode. |
 | Real upstream contract | `npm run verify -- real-upstream` runs the localhost fixture matrix against the real installed `agent-browser` matching the baseline. | **Current for 0.27.2:** pass on 2026-06-11 (`npm run verify -- real-upstream`, `agent-browser 0.27.2` on `PATH`; includes 0.27.2 off-viewport click, frame-scoped selector/wait/click, form command, and wait-download artifact coverage). |
-| Packaged Pi smoke | `npm run verify -- package-pi` validates package contents, loads the packaged `agent_browser` tool without requiring optional Brave config, and executes fake-upstream `--version`. | **Current for 0.27.2:** pass on 2026-06-11 as part of `npm run verify -- release` (`verify-package.mjs --smoke-pi`; packed 117 files, packaged `agent_browser --version` invocation passed). |
+| Packaged Pi smoke | `npm run verify -- package-pi` validates package contents, loads the packaged `agent_browser` tool without requiring optional Brave config, and executes fake-upstream `--version`. | **Current for 0.27.2:** pass on 2026-06-11 as part of `npm run verify -- release` and rerun after the compiled-entrypoint change (`verify-package.mjs --smoke-pi`; packed 117 files, packaged `agent_browser --version` invocation passed). |
+| Startup profile | `npm run verify -- startup-profile --samples <n>` clean-builds generated `dist/`, records direct package entrypoint import/factory timing in fresh Node processes, and writes `.artifacts/startup-profile/latest.json`. It must not launch Pi, tmux, mise, npm, browsers, or `agent-browser`; full Pi TUI ready-prompt profiling is intentionally excluded after it proved too invasive for routine verification. Run this opt-in evidence when package layout, the compiled entrypoint, top-level imports, schema registration, or prompt/config startup logic changes. | **Current for compiled entrypoint:** pass on 2026-06-11 with direct compiled entrypoint import+factory median 47.136 ms in earlier samples, below the 250 ms direct-import guard and below the prior ~96 ms TypeScript-entrypoint baseline. Full-Pi startup numbers from the unsafe tmux profiler are not accepted as ongoing release evidence. |
 | Deterministic dogfood smoke | `npm run verify -- dogfood` (`scripts/verify-agent-browser-dogfood.ts`) drives the native wrapper against a local file fixture through top-level `qa`, `semanticAction`, constrained `job`, screenshot artifact verification, and session close with the real `agent-browser` on `PATH`. | **Current for 0.27.2:** pass on 2026-06-11 (`npm run verify -- dogfood`, `agent-browser 0.27.2`; `qa-url`, fresh/current opens, semantic click, job screenshot artifact, and close all passed). |
 | Efficiency benchmark | `npm run verify -- benchmark` runs deterministic browser workflow accounting plus focused benchmark tests, including JSONL sampling fixtures and job/qa/sourceLookup/networkSourceLookup/Electron scenario coverage. | **Historical / pending refresh:** pass on 2026-05-29 (`npm run verify -- benchmark`). This deterministic gate is not upstream-version-specific, but rerun before claiming current benchmark evidence after benchmark or workflow-scenario edits. |
 | Crabbox platform smoke | `npm run check:platform-smoke` syntax-checks the harness and cheap invariants. `npm run smoke:platform:ubuntu-image` builds the project-owned Linux image, `npm run smoke:platform:doctor` checks Crabbox 0.26.0+ and local target readiness, and `npm run smoke:platform:all` runs doctor first, then fast target-local `platform-build` (`npm run verify -- platform-target`, pack, clean Pi install) plus `browser-dogfood-smoke` on Crabbox `macos`, `ubuntu`, and `windows-native`; see [`platform-smoke.md`](platform-smoke.md). Target artifacts include Crabbox/provider/work-root metadata, and release review also checks provider-specific `crabbox list` commands for leftover leases/clones. | **Current for 0.27.2:** pass on 2026-06-11 inside `npm run verify -- release`; rebuilt Ubuntu image `pi-agent-browser-native-platform:node24-agent-browser0.27.2`, refreshed the Windows `crabbox-ready` template snapshot to `agent-browser 0.27.2`, doctor passed, then Crabbox platform smoke passed for macOS, Ubuntu, and native Windows. |
 | `verify -- release` / `prepublishOnly` | `npm run verify -- release` chains the default gate with the configured-source lifecycle harness, packaged Pi smoke, and the release-blocking Crabbox platform matrix (`verifySteps` `release` in [`scripts/project.mjs`](https://github.com/fitchmultz/pi-agent-browser-native/blob/main/scripts/project.mjs)). `package.json` `prepublishOnly` runs that compose before `npm pack --dry-run` during `npm publish`. It intentionally omits standalone real-upstream, host-only dogfood, and benchmark modes—see [`RELEASE.md`](RELEASE.md#pre-release-checks). | **Current for 0.27.2:** pass on 2026-06-11 (`npm run verify -- release`), including default unit/fake gate, generated docs checks, live command-reference sampling, lifecycle harness, packaged Pi smoke, and macOS/Ubuntu/native-Windows Crabbox platform smoke. |
-| Configured-source lifecycle | `npm run verify -- lifecycle` (`scripts/verify-lifecycle.mjs`) drives `/reload`, closes and relaunches Pi with the same exact `--session-id`, checks the JSONL session header id, session continuity, slash-command sentinel tokens (`v1` then `v2` after rewriting the packaged extension to simulate pickup), persisted spill reachability, and real Pi `tool_result` failure-patch semantics for a QA reclassification with a fake upstream on `PATH`. Default Pi model is `zai/glm-5.1`; default per-step wait is **180000 ms** (`DEFAULT_TIMEOUT_MS`); override model with `--model <id>` and waits with `--timeout-ms <ms>`. Passthrough flags in [`scripts/project.mjs`](https://github.com/fitchmultz/pi-agent-browser-native/blob/main/scripts/project.mjs): `--keep-artifacts`, `--model`, `--verbose`, and `--timeout-ms` plus a value (for example `npm run verify -- lifecycle --model openai-codex/gpt-5.5:minimal --keep-artifacts --verbose --timeout-ms 600000`). | **Current for 0.27.2:** pass on 2026-06-11 inside `npm run verify -- release`; exact session `piab-lifecycle-16278`, managed browser session `piab-pi-agent-browser-piablifecycl-186485dc`, persisted full output verified before cleanup. |
+| Configured-source lifecycle | `npm run verify -- lifecycle` (`scripts/verify-lifecycle.mjs`) drives `/reload`, closes and relaunches Pi with the same exact `--session-id`, checks the JSONL session header id, session continuity, slash-command sentinel tokens (`v1` before reload and `v2` after full relaunch because compiled JS package modules are process-cached), persisted spill reachability, and real Pi `tool_result` failure-patch semantics for a QA reclassification with a fake upstream on `PATH`. Default Pi model is `zai/glm-5.1`; default per-step wait is **180000 ms** (`DEFAULT_TIMEOUT_MS`); override model with `--model <id>` and waits with `--timeout-ms <ms>`. Passthrough flags in [`scripts/project.mjs`](https://github.com/fitchmultz/pi-agent-browser-native/blob/main/scripts/project.mjs): `--keep-artifacts`, `--model`, `--verbose`, and `--timeout-ms` plus a value (for example `npm run verify -- lifecycle --model openai-codex/gpt-5.5:minimal --keep-artifacts --verbose --timeout-ms 600000`). | **Current for 0.27.2:** lifecycle-focused pass on 2026-06-11 after compiled-entrypoint update; managed browser session continuity and persisted full output verified before cleanup. |
 | Quick isolated Pi smoke | `pi --approve --no-extensions --no-skills -e . --tools agent_browser` from trusted repo root; native `agent_browser` only. | **Current for 0.27.2:** pass on 2026-06-11 via tmux with `pi --approve --no-extensions --no-skills -e .`; native `agent_browser` only. Covered `qa` with `sessionMode: "fresh"` against `https://example.com`, `open` and compact `snapshot -i` on `https://react.dev`, `semanticAction` link click to `https://react.dev/learn`, screenshot artifact verification at `/tmp/piab-release-smoke-react.png`, and `close`; explicit screenshot and temporary session artifacts were removed after evidence capture. Broader historical coverage also includes version/help/skills, eval stdin, batch stdin, explicit session, network requests, console/errors, diff snapshot, stream status/disable, dashboard start/stop, and chat credential-failure pass-through during RQ-0055. |
 Runtime floor note: package metadata keeps Pi core package peer ranges wildcard per installed Pi package docs, but `pi-agent-browser-doctor` / `npm run doctor` treats `pi --version` below 0.79.0 as a setup failure. This keeps package dependency shape aligned with Pi package loading while still making unsupported host Pi versions a release and first-run blocker.
@@ -72,7 +73,7 @@ Runtime floor note: package metadata keeps Pi core package peer ranges wildcard
 | Sessions, state, tabs, frames, dialogs, and windows | 20 canonical tokens from baseline section `state-tabs-frames-dialogs`; see [`scripts/agent-browser-capability-baseline.mjs`](../scripts/agent-browser-capability-baseline.mjs) and generated [`COMMAND_REFERENCE.md`](COMMAND_REFERENCE.md#session-state-frames-dialogs-windows-and-inspection-commands). | [`COMMAND_REFERENCE.md`](COMMAND_REFERENCE.md#session-state-frames-dialogs-windows-and-inspection-commands), stateful workflow notes, [`TOOL_CONTRACT.md`](TOOL_CONTRACT.md#details). | Stateful summaries/redaction, state artifact handling, sessionless local command planning, managed-session restore, tab target pinning, and close alias cleanup. | Extension-validation stateful matrix, runtime session/resume tests, presentation redaction tests, lifecycle harness. | Supported. External profile/auth state remains operator-owned. |
 | Network, storage, artifacts, diagnostics, and performance | 42 canonical tokens from baseline section `network-storage-artifacts-diagnostics`; see [`scripts/agent-browser-capability-baseline.mjs`](../scripts/agent-browser-capability-baseline.mjs) and generated [`COMMAND_REFERENCE.md`](COMMAND_REFERENCE.md#page-state-finding-mouse-settings-network-and-storage). | [`COMMAND_REFERENCE.md`](COMMAND_REFERENCE.md#page-state-finding-mouse-settings-network-and-storage), diagnostic sections, [`TOOL_CONTRACT.md`](TOOL_CONTRACT.md#details). | Thin passthrough plus compact diagnostics, route-mock warnings, useful-but-redacted storage output, stream idempotency normalization, artifact metadata, missing-ffmpeg warnings, sensitive-data redaction, timeout bounds, and cleanup-pair guidance. | Fake non-core matrix and safe real-upstream coverage for network/HAR, diff, trace/profiler, console/errors/highlight, stream, vitals, and React missing-renderer. | Supported. Environment-sensitive operations need suitable local/browser state. |
 | Batch, auth, confirmations, setup, dashboard, devices, and AI commands | 24 canonical tokens from baseline section `batch-auth-setup-ai`; see [`scripts/agent-browser-capability-baseline.mjs`](../scripts/agent-browser-capability-baseline.mjs) and generated [`COMMAND_REFERENCE.md`](COMMAND_REFERENCE.md#batch-auth-confirmations-sessions-chat-dashboard-devices-and-setup). | [`COMMAND_REFERENCE.md`](COMMAND_REFERENCE.md#batch-auth-confirmations-sessions-chat-dashboard-devices-and-setup), README security notes, release docs. | Native-tool batch stdin, generated `job`/`qa`/lookup batch plans, auth/confirmation redaction, sessionless local auth/setup/dashboard/doctor planning, timeout/cleanup guidance. | Unit/fake batch/auth/confirmation/dashboard/chat/doctor tests; extension-validation for structured input modes; efficiency benchmark scenarios. | Supported. Interactive side-effecting setup/auth/chat remains upstream-owned. |
-| Global flags, config, providers, policy, and environment | 120 canonical tokens from baseline section `options-and-env`; see [`scripts/agent-browser-capability-baseline.mjs`](../scripts/agent-browser-capability-baseline.mjs) and generated [`COMMAND_REFERENCE.md`](COMMAND_REFERENCE.md#important-global-flags-config-and-environment). | [`COMMAND_REFERENCE.md`](COMMAND_REFERENCE.md#important-global-flags-config-and-environment), README provider/setup notes, [`TOOL_CONTRACT.md`](TOOL_CONTRACT.md#sessionmode), architecture/runtime docs. | Runtime handles command discovery, value-flag prevalidation, launch-scoped flags, redacted echoes, fresh-session recovery hints, explicit sessions, provider/device launch-scoping, curated env forwarding, subprocess completion, and package-owned Pi-scoped config for optional companion features. | Runtime tests for flags/planning/redaction/session behavior; process tests for env and stdio-linger completion; config/web-search/CLI tests; fake provider/specialized-skill matrix; package doctor. | Supported. Provider clouds, iOS/Appium, proxies, profiles, and credentials require external setup. |
+| Global flags, config, providers, policy, and environment | 120 canonical tokens from baseline section `options-and-env`; see [`scripts/agent-browser-capability-baseline.mjs`](../scripts/agent-browser-capability-baseline.mjs) and generated [`COMMAND_REFERENCE.md`](COMMAND_REFERENCE.md#important-global-flags-config-and-environment). | [`COMMAND_REFERENCE.md`](COMMAND_REFERENCE.md#important-global-flags-config-and-environment), README provider/setup notes, [`TOOL_CONTRACT.md`](TOOL_CONTRACT.md#sessionmode), architecture/runtime docs. | Runtime handles command discovery, value-flag prevalidation, launch-scoped flags, redacted echoes, fresh-session recovery hints, explicit sessions, provider/device launch-scoping, parent env forwarding with wrapper overrides, subprocess completion, and package-owned Pi-scoped config for optional companion features. | Runtime tests for flags/planning/redaction/session behavior; process tests for env and stdio-linger completion; config/web-search/CLI tests; fake provider/specialized-skill matrix; package doctor. | Supported. Provider clouds, iOS/Appium, proxies, profiles, and credentials require external setup. |
 ## Follow-up decision after closure

package/docs/TOOL_CONTRACT.md CHANGED

@@ -36,7 +36,7 @@ Agent-facing efficiency claims are measured with `npm run benchmark:agent-browse
 ## Optional companion web search
-`agent_browser_web_search` is a separate custom tool, not an `agent_browser` input mode. It is registered only when the extension can see at least one configured/resolvable Exa or Brave credential source from `~/.pi/config/pi-agent-browser-native/config.json`, `.pi/config/pi-agent-browser-native/config.json`, `PI_AGENT_BROWSER_CONFIG`, or the `EXA_API_KEY` / `BRAVE_API_KEY` environment fallbacks, and only when the final available merged config does not set `webSearch.enabled` to `false`. Config layers merge global → project → `PI_AGENT_BROWSER_CONFIG` override; under Pi 0.79+, globally installed and CLI-loaded copies read `.pi/config/...` by default because extensions are developer-trusted code, and they skip the project layer when Pi reports the project is untrusted or when launched with `--no-approve`. Disable scope is explicit: a global disable is a normal user default, a project disable applies to one repo, and an override file with `webSearch.enabled: false` is the highest-priority hard disable for that run. Command credential sources such as `"!op read 'op://Private/Exa/API Key'"` are allowed only from trusted global or explicit-override config; they make the tool available without running the command at startup, and the key is resolved when the tool executes. Project-local config may use only matching provider env refs (`$EXA_API_KEY` / `${EXA_API_KEY}` for Exa and `$BRAVE_API_KEY` / `${BRAVE_API_KEY}` for Brave); custom env aliases, interpolation literals, and malformed `$` values are rejected. Browser profile/executable config uses the same paths but only trusted global or explicit override values are emitted as host launch prompt guidance; project-local browser config is loaded by default but is not trusted to steer local profiles or executable paths.
+`agent_browser_web_search` is a separate custom tool, not an `agent_browser` input mode. It is available when the extension can see at least one configured/resolvable Exa or Brave credential source from `~/.pi/config/pi-agent-browser-native/config.json`, `.pi/config/pi-agent-browser-native/config.json`, `PI_AGENT_BROWSER_CONFIG`, or the `EXA_API_KEY` / `BRAVE_API_KEY` environment fallbacks, and runtime execution still checks that the final available merged config has not set `webSearch.enabled` to `false`. Config layers merge global → project → `PI_AGENT_BROWSER_CONFIG` override; under Pi 0.79+, globally installed and CLI-loaded copies read `.pi/config/...` when Pi trust allows that project layer, and they skip the project layer when Pi reports the project is untrusted or when launched with `--no-approve`. Disable scope is explicit: a global disable is a normal user default, a project disable applies to one repo, and an override file with `webSearch.enabled: false` is the highest-priority hard disable for that run. Credential sources may be plaintext, `$ENV_VAR` / `${ENV_VAR}` interpolation, escaped literals, or command sources such as `"!op read 'op://Private/Exa/API Key'"` from any loaded config layer; they make the tool available without exposing the value in status text, and command values resolve when the tool executes. Browser profile/executable config uses the same paths and emits prompt guidance from the highest-priority loaded layer, including project config when that layer is loaded.
 Use it when live/current external web information would help answer a task, find current docs/news, or discover candidate URLs. Use `agent_browser` when the task needs browser interaction, screenshots, authenticated/profile content, page inspection, or DOM work. The search tool is namespaced to avoid colliding with generic `web_search`, chooses Exa or Brave automatically from available credentials, defaults to Exa when both are available (unless `webSearch.preferredProvider` is set), and must not expose resolved API keys in content, details, errors, status output, docs examples, logs, or PR artifacts.
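
The layer order described in the changed paragraph (global → project → `PI_AGENT_BROWSER_CONFIG` override, with the project layer skipped when Pi reports the project as untrusted) can be sketched roughly as below. This is an illustration only, not the package's implementation: `mergeConfigLayers`, `isWebSearchEnabled`, the `projectTrusted` flag, and the shallow per-section merge are all assumptions made for the example.

```javascript
// Hypothetical sketch: merge config layers in priority order (lowest first).
// The function names and the shallow per-section merge are assumptions,
// not the package's actual API.
function mergeConfigLayers({ global, project, override, projectTrusted }) {
	const layers = [global, projectTrusted ? project : undefined, override];
	const merged = {};
	for (const layer of layers) {
		if (!layer) continue; // layer is missing or was skipped
		for (const [section, value] of Object.entries(layer)) {
			const isPlainObject = value !== null && typeof value === "object" && !Array.isArray(value);
			// Later layers win key by key inside an object section.
			merged[section] = isPlainObject ? { ...(merged[section] ?? {}), ...value } : value;
		}
	}
	return merged;
}

// Web search stays enabled unless the final merged config sets enabled to false.
function isWebSearchEnabled(merged) {
	return merged.webSearch?.enabled !== false;
}

const merged = mergeConfigLayers({
	global: { version: 1, webSearch: { enabled: true, preferredProvider: "exa" } },
	project: { webSearch: { preferredProvider: "brave" } },
	override: { webSearch: { enabled: false } },
	projectTrusted: true,
});
console.log(merged.webSearch.preferredProvider, isWebSearchEnabled(merged)); // prints "brave false"
```

In this sketch the override file's `webSearch.enabled: false` wins over the global `true`, matching the "highest-priority hard disable" wording, while the project layer only changes the preferred provider.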

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "pi-agent-browser-native",
-  "version": "0.2.48",
+  "version": "0.2.50",
   "description": "pi extension that exposes agent-browser as a native tool for browser automation",
   "type": "module",
   "author": "Mitch Fultz (https://github.com/fitchmultz)",
@@ -31,7 +31,7 @@
     "pi-agent-browser-doctor": "scripts/doctor.mjs"
   },
   "files": [
-    "extensions",
+    "dist",
     "platform-smoke.config.mjs",
     "scripts/config.mjs",
     "scripts/doctor.mjs",
@@ -52,7 +52,7 @@
   ],
   "pi": {
     "extensions": [
-      "./extensions/agent-browser/index.ts"
+      "./dist/extensions/agent-browser/index.js"
     ]
   },
   "peerDependencies": {
@@ -86,9 +86,13 @@
     "smoke:platform:windows-native": "node scripts/platform-smoke.mjs run --target windows-native",
     "smoke:platform:all": "npm run smoke:platform:doctor && node scripts/platform-smoke.mjs run --target macos,ubuntu,windows-native",
     "typecheck": "node ./scripts/project.mjs verify typecheck",
-    "test": "tsx --test test/**/*.test.ts",
+    "test": "node ./scripts/build.mjs && tsx --test test/**/*.test.ts",
     "verify": "node ./scripts/project.mjs verify",
-    "prepublishOnly": "npm run verify -- release && npm pack --dry-run"
+    "prepublishOnly": "npm run verify -- release && npm pack --dry-run",
+    "build": "node ./scripts/build.mjs",
+    "startup-profile": "node ./scripts/profile-startup.mjs",
+    "prepack": "npm run build",
+    "prepare": "npm run build"
   },
   "packageManager": "npm@11.14.0"
 }

package/scripts/config.mjs CHANGED

@@ -9,7 +9,13 @@ import { chmodSync, existsSync, mkdirSync, readFileSync, writeFileSync } from "n
 import { dirname } from "node:path";
 import process from "node:process";
-import {
+async function loadConfigPolicyModule() {
+	const sourcePolicyUrl = new URL("../extensions/agent-browser/lib/config-policy.js", import.meta.url);
+	if (existsSync(sourcePolicyUrl)) return import(sourcePolicyUrl.href);
+	return import("../dist/extensions/agent-browser/lib/config-policy.js");
+}
+const {
 	AGENT_BROWSER_CONFIG_ENV,
 	BRAVE_API_KEY_ENV,
 	DEFAULT_WEB_SEARCH_PROVIDER,
@@ -22,11 +28,10 @@ import {
 	getWebSearchProviderConfigKey,
 	getWebSearchProviderEnvVar,
 	getWebSearchProviderLabel,
-	isProjectSafeCredentialValueForProvider,
 	isWebSearchProvider,
 	loadAgentBrowserConfigStateSync,
 	summarizeConfigFiles,
-} from "../extensions/agent-browser/lib/config-policy.js";
+} = await loadConfigPolicyModule();
 const DEFAULT_CONFIG = { version: 1 };
@@ -44,9 +49,9 @@ Usage through npm exec:
   npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config paths
   npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config show
   npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config web-search status
-  npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config web-search set-key --stdin --provider <exa|brave> [--global]
+  npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config web-search set-key --stdin --provider <exa|brave> [--global|--project]
   npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config web-search set-env <ENV_VAR> [--provider brave|exa] [--global|--project]
-  npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config web-search set-command <command> --provider <exa|brave> [--global]
+  npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config web-search set-command <command> --provider <exa|brave> [--global|--project]
   npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config web-search clear --provider <exa|brave|all> [--global|--project]
   npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config web-search prefer <exa|brave|auto> [--global|--project]
   npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config web-search enable [--global|--project]
@@ -55,14 +60,14 @@ Usage through npm exec:
   npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config browser profile set <name|path> [--policy explicit-only|authenticated-only|always] [--global|--project]
   npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config browser profile clear [--global|--project]
   npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config browser executable status
-  npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config browser executable set <path> [--global]
+  npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config browser executable set <path> [--global|--project]
   npm exec --yes --package pi-agent-browser-native@latest -- pi-agent-browser-config browser executable clear [--global|--project]
 Notes:
   Global config:  ~/.pi/config/pi-agent-browser-native/config.json
   Project config: .pi/config/pi-agent-browser-native/config.json
   Override:       ${AGENT_BROWSER_CONFIG_ENV}=/path/to/config.json
-  Project-local plaintext, custom env aliases, interpolation-literal, malformed, and command-backed web-search keys are refused; use matching ${EXA_API_KEY_ENV} or ${BRAVE_API_KEY_ENV} set-env references there.
+  Loaded config may use plaintext, environment interpolation, or !command credential sources; displayed status redacts resolved keys.
   Use --provider for set-key, set-command, and clear; set-env infers exa/brave from ${EXA_API_KEY_ENV} or ${BRAVE_API_KEY_ENV}.
 `;
 }
@@ -206,13 +211,12 @@ async function handleWebSearch(args, flags) {
 	}
 	if (action === "set-key") {
 		const provider = getWebSearchProvider(flags);
-		if (flags.get("--project")) throw new UsageError(`Plaintext ${getWebSearchProviderLabel(provider)} keys cannot be written to project-local config. Use set-env or set-command.`);
 		const key = await readSecretFromStdin(Boolean(flags.get("--stdin")));
-		const { path } = selectWritePath(flags);
+		const { path, scope } = selectWritePath(flags);
 		mutateConfig(path, (config) => {
 			setWebSearchCredential(config, provider, key);
 		});
-		console.log(`Saved ${getWebSearchProviderLabel(provider)} key to global config: ${path}`);
+		console.log(`Saved ${getWebSearchProviderLabel(provider)} key to ${scope} config: ${path}`);
 		return;
 	}
 	if (action === "set-env") {
@@ -220,9 +224,6 @@ async function handleWebSearch(args, flags) {
 		if (!envName || !/^[A-Za-z_][A-Za-z0-9_]*$/.test(envName)) throw new UsageError("set-env requires a valid environment variable name.");
 		const provider = getWebSearchProvider(flags, { envName });
 		const envReference = `$${envName}`;
-		if (flags.get("--project") && !isProjectSafeCredentialValueForProvider(envReference, provider)) {
-			throw new UsageError(`Project-local ${getWebSearchProviderLabel(provider)} env references must use ${getWebSearchProviderEnvVar(provider)} exactly; custom env aliases belong in global config or ${AGENT_BROWSER_CONFIG_ENV}.`);
-		}
 		const { path, scope } = selectWritePath(flags);
 		mutateConfig(path, (config) => {
 			setWebSearchCredential(config, provider, envReference);
@@ -232,7 +233,6 @@ async function handleWebSearch(args, flags) {
 	}
 	if (action === "set-command") {
 		const provider = getWebSearchProvider(flags);
-		if (flags.get("--project")) throw new UsageError(`Command-backed ${getWebSearchProviderLabel(provider)} keys cannot be written to project-local config. Use set-env there.`);
 		const command = args.slice(1).join(" ").trim();
 		if (!command) throw new UsageError("set-command requires a command string.");
 		const { path, scope } = selectWritePath(flags);
@@ -291,9 +291,6 @@ function handleBrowser(args, flags) {
 			if (!name) throw new UsageError("browser profile set requires a profile name or profile directory path.");
 			const policy = flags.get("--policy") || "authenticated-only";
 			if (!["explicit-only", "authenticated-only", "always"].includes(policy)) throw new UsageError("Invalid --policy value.");
-			if (flags.get("--project") && policy !== "explicit-only") {
-				throw new UsageError("Project-local browser profile config may only use --policy explicit-only; authenticated or always profile guidance must be configured globally or through PI_AGENT_BROWSER_CONFIG.");
-			}
 			const { path, scope } = selectWritePath(flags);
 			mutateConfig(path, (config) => {
 				config.browser = { ...(config.browser ?? {}), defaultProfile: { name, policy } };
@@ -319,9 +316,6 @@ function handleBrowser(args, flags) {
 		if (action === "set") {
 			const executablePath = args.slice(2).join(" ").trim();
 			if (!executablePath) throw new UsageError("browser executable set requires a browser executable path.");
-			if (flags.get("--project")) {
-				throw new UsageError("Project-local browser executable config cannot steer host launch guidance; configure it globally or through PI_AGENT_BROWSER_CONFIG.");
-			}
 			const { path, scope } = selectWritePath(flags);
 			mutateConfig(path, (config) => {
 				config.browser = { ...(config.browser ?? {}), executablePath };
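
With the project-scope restrictions removed, any loaded layer may hold a plaintext key, a `$ENV_VAR` / `${ENV_VAR}` reference, or a `!command` source. A rough sketch of telling these forms apart follows. `classifyCredentialSource` is a hypothetical helper, not a function from `config-policy.js`; escaped literals are left out because the diff does not show their syntax, and treating a malformed `$` value as plaintext is an assumption of this example.

```javascript
// Hypothetical helper: classify a stored credential source string.
// Escaped literals are deliberately not handled; their syntax is not shown in the diff.
const ENV_NAME = /^[A-Za-z_][A-Za-z0-9_]*$/; // same pattern that set-env validates against

function classifyCredentialSource(value) {
	// "!..." marks a command whose output supplies the key at tool execution time.
	if (value.startsWith("!")) return { kind: "command", command: value.slice(1).trim() };
	const braced = /^\$\{([^}]*)\}$/.exec(value); // ${NAME}
	const bare = /^\$(.+)$/.exec(value); // $NAME
	const envName = braced ? braced[1] : bare ? bare[1] : undefined;
	if (envName !== undefined && ENV_NAME.test(envName)) return { kind: "env", envName };
	// The plaintext value is intentionally not echoed back in the result.
	return { kind: "plaintext" };
}

console.log(classifyCredentialSource("$EXA_API_KEY").kind);
console.log(classifyCredentialSource("!op read 'op://Private/Exa/API Key'").kind);
```

Returning only the kind for plaintext values keeps the sketch in line with the note that displayed status redacts resolved keys.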

package/scripts/doctor.mjs CHANGED

@@ -20,7 +20,10 @@ import { CAPABILITY_BASELINE, CAPABILITY_BASELINE_SOURCE } from "./agent-browser
 const execFile = promisify(execFileCallback);
 const PACKAGE_NAME = "pi-agent-browser-native";
 const REPO_URL_FRAGMENT = "github.com/fitchmultz/pi-agent-browser-native";
-const EXTENSION_ENTRYPOINT = "extensions/agent-browser/index.ts";
+const EXTENSION_ENTRYPOINTS = Object.freeze([
+	"extensions/agent-browser/index.ts",
+	"dist/extensions/agent-browser/index.js",
+]);
 const EXPECTED_VERSION = CAPABILITY_BASELINE.targetVersion;
 const MINIMUM_PI_VERSION = "0.79.0";
 const DEFAULT_AGENT_DIR = resolve(homedir(), ".pi/agent");
@@ -163,15 +166,13 @@ function sourceLooksLikeThisPackage(source, cwd, sourceBaseDir = cwd) {
 	if (!isPathLikeSource(text)) return false;
 	const resolvedSource = resolve(sourceBaseDir, expandUserPath(text));
-	const cwdEntrypoint = resolve(cwd, EXTENSION_ENTRYPOINT);
-	const packageEntrypoint = resolve(THIS_PACKAGE_ROOT, EXTENSION_ENTRYPOINT);
+	const cwdEntrypoints = EXTENSION_ENTRYPOINTS.map((entrypoint) => resolve(cwd, entrypoint));
+	const packageEntrypoints = EXTENSION_ENTRYPOINTS.map((entrypoint) => resolve(THIS_PACKAGE_ROOT, entrypoint));
 	return (
 		resolvedSource === cwd ||
-		resolvedSource === cwdEntrypoint ||
 		resolvedSource === THIS_PACKAGE_ROOT ||
-		resolvedSource === packageEntrypoint ||
-		isInsidePath(cwdEntrypoint, resolvedSource) ||
-		isInsidePath(packageEntrypoint, resolvedSource)
+		cwdEntrypoints.some((entrypoint) => resolvedSource === entrypoint || isInsidePath(entrypoint, resolvedSource)) ||
+		packageEntrypoints.some((entrypoint) => resolvedSource === entrypoint || isInsidePath(entrypoint, resolvedSource))
 	);
 }
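
The change above replaces a single entrypoint constant with a frozen list so that both the TypeScript source entrypoint and the built `dist/` entrypoint identify this package. A simplified sketch of that matching idea is shown here. It works on already-resolved POSIX path strings, omits the `isInsidePath` checks (that helper's body is not shown in the diff), and uses the hypothetical names `looksLikeThisPackage` and `roots`.

```javascript
// Simplified sketch on already-resolved POSIX paths (no filesystem access).
// The real doctor script resolves paths with node:path and also applies
// isInsidePath checks, which this example leaves out.
const ENTRYPOINTS = Object.freeze([
	"extensions/agent-browser/index.ts",
	"dist/extensions/agent-browser/index.js",
]);

// Hypothetical helper: true when the source is a known root directory or
// one of the candidate entrypoints under any of those roots.
function looksLikeThisPackage(resolvedSource, roots) {
	return roots.some(
		(root) =>
			resolvedSource === root ||
			ENTRYPOINTS.some((entrypoint) => resolvedSource === `${root}/${entrypoint}`),
	);
}

const roots = ["/work/checkout", "/opt/pkg"];
console.log(looksLikeThisPackage("/opt/pkg/dist/extensions/agent-browser/index.js", roots));
console.log(looksLikeThisPackage("/opt/pkg/dist/other.js", roots));
```

Keeping the candidates in one list means a future entrypoint location would need only a new array entry rather than another pair of comparisons.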