pi-agent-browser-native 0.1.2 → 0.1.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/CHANGELOG.md CHANGED
@@ -1,5 +1,19 @@
1
1
  # Changelog
2
2
 
3
+ ## 0.1.4 - 2026-04-12
4
+
5
+ ### Changed
6
+ - removed the tracked repo-local `.pi/extensions/agent-browser.ts` autoload shim because it conflicts with the globally installed package and blocks `pi` startup from this repository root
7
+ - local checkout validation now uses explicit CLI loading with `pi --no-extensions -e .` instead of repo-local `.pi/extensions/` auto-discovery
8
+ - aligned the package description and keywords with the GitHub repository metadata used for your other public `pi` extensions
9
+
10
+ ## 0.1.3 - 2026-04-12
11
+
12
+ ### Changed
13
+ - when `BRAVE_API_KEY` is present and non-empty, the extension now tells agents to prefer the Brave Search API via `bash`/`curl` for URL discovery and then open the chosen destination with `agent_browser` instead of driving a search engine results page in the browser
14
+ - when `BRAVE_API_KEY` is absent, the extension behavior remains unchanged
15
+ - added a small runtime helper and unit coverage for the `BRAVE_API_KEY` gate so the change stays explicit and minimal
16
+
3
17
  ## 0.1.2 - 2026-04-11
4
18
 
5
19
  ### Changed
package/README.md CHANGED
@@ -6,7 +6,7 @@ Native `pi` integration for [`agent-browser`](https://agent-browser.dev/).
6
6
 
7
7
  Early working scaffold.
8
8
 
9
- The package scaffold, native `agent_browser` tool, local typecheck/test setup, tracked repo-local development entrypoint, and release/package verification workflow are in place.
9
+ The package scaffold, native `agent_browser` tool, local typecheck/test setup, and release/package verification workflow are in place.
10
10
 
11
11
  ## Goal
12
12
 
@@ -77,27 +77,28 @@ pi -e https://github.com/fitchmultz/pi-agent-browser-native
77
77
 
78
78
  ### Current practical local-checkout flow
79
79
 
80
- Until you are using a published package release, install from a checkout:
80
+ Until you are using a published package release, prefer an explicit checkout-only run instead of installing the checkout into your normal `pi` package set:
81
81
 
82
82
  ```bash
83
- pi install /absolute/path/to/pi-agent-browser-native
83
+ pi --no-extensions -e /absolute/path/to/pi-agent-browser-native
84
84
  ```
85
85
 
86
- Or try it for one session only:
87
-
88
- ```bash
89
- pi -e /absolute/path/to/pi-agent-browser-native
90
- ```
86
+ This avoids duplicate `agent_browser` registrations if you also have the published package installed globally.
91
87
 
92
88
  The native tool exposed to the agent is named `agent_browser`.
93
89
 
94
90
  ## Local development
95
91
 
96
- This repository now tracks `.pi/extensions/agent-browser.ts` as a thin development entrypoint that re-exports the real extension from `extensions/agent-browser/index.ts`. That keeps repo-root `pi` launches working without changing the published package layout.
92
+ Do not track or rely on a repo-local `.pi/extensions/agent-browser.ts` autoload shim for this package. When the package is also installed globally, that creates a duplicate `agent_browser` registration and blocks `pi` startup from this working directory.
97
93
 
98
94
  1. Install `agent-browser` separately via the upstream project.
99
95
  2. Run `npm install`.
100
- 3. Launch `pi` from this repository root.
96
+ 3. Launch `pi` from this repository root with only the checkout extension loaded:
97
+
98
+ ```bash
99
+ pi --no-extensions -e .
100
+ ```
101
+
101
102
  4. Prompt the agent to use `agent_browser`.
102
103
 
103
104
  Example prompt:
@@ -9,7 +9,7 @@ Related docs:
9
9
 
10
10
  Build this as a **thin `pi` extension/package** that exposes `agent-browser` as one native tool while keeping upstream `agent-browser` as the source of truth.
11
11
 
12
- The package install path is the primary product path. Repo-local `.pi/` wiring exists only as a thin tracked development shim, while package-manifest behavior and packaged contents matter more.
12
+ The package install path is the primary product path. Local checkout development should use explicit CLI loading, while package-manifest behavior and packaged contents matter more.
13
13
 
14
14
  ## Chosen shape
15
15
 
@@ -42,15 +42,15 @@ That means:
42
42
  - no manual user orchestration as the main workflow
43
43
  - any future slash commands should be minimal and secondary
44
44
 
45
- ### Package layout versus repo-local development
45
+ ### Package layout versus local checkout development
46
46
 
47
47
  The published package should load from the `pi` manifest in `package.json`.
48
48
 
49
- Repo-local development should use one thin tracked shim at `.pi/extensions/agent-browser.ts` that re-exports `extensions/agent-browser/index.ts`.
49
+ Local checkout development should use explicit CLI loading such as `pi --no-extensions -e .` from the repository root instead of repo-local `.pi/extensions/` auto-discovery.
50
50
 
51
51
  Why:
52
- - keeps repo-root `pi` launches working during development
53
- - avoids making local `.pi/` wiring the product contract
52
+ - avoids duplicate `agent_browser` registrations when the package is also installed globally
53
+ - keeps the product contract centered on the package manifest instead of repo-local autoload wiring
54
54
  - keeps the published tarball focused on the package manifest, extension code, canonical docs, and license
55
55
 
56
56
  The published package should exclude agent-only and superseded repo materials such as `AGENTS.md`, `docs/v1-tool-contract.md`, `docs/native-integration-design.md`, and other internal planning notes.
package/docs/RELEASE.md CHANGED
@@ -28,7 +28,7 @@ npm run verify:release
28
28
 
29
29
  `npm run verify:package` confirms that:
30
30
 
31
- - the tracked repo-local development entrypoint exists at `.pi/extensions/agent-browser.ts`
31
+ - no repo-local `.pi/extensions/agent-browser.ts` autoload shim is present
32
32
  - `LICENSE` exists in the repo and the packed tarball
33
33
  - canonical published docs are present
34
34
  - extension source files are present
@@ -51,11 +51,11 @@ node scripts/verify-package.mjs --list-files
51
51
 
52
52
  ## Local development validation
53
53
 
54
- Before publishing, also validate the repo-local development path:
54
+ Before publishing, also validate the explicit local-checkout path:
55
55
 
56
56
  1. Install `agent-browser` separately.
57
- 2. Launch `pi` from this repository root.
58
- 3. Confirm the tracked `.pi/extensions/agent-browser.ts` shim loads the current extension code.
57
+ 2. Launch `pi --no-extensions -e .` from this repository root.
58
+ 3. Confirm the checkout extension loads from `extensions/agent-browser/index.ts`.
59
59
  4. Run a smoke prompt that exercises `agent_browser`.
60
60
 
61
61
  Example prompt:
@@ -81,6 +81,6 @@ Before publishing:
81
81
 
82
82
  - update `CHANGELOG.md`
83
83
  - confirm README install guidance still leads with the package-first flow
84
- - confirm local-checkout instructions still work for pre-release validation
84
+ - confirm the explicit local-checkout instructions still work for pre-release validation
85
85
  - rerun `npm run verify:release`
86
86
  - publish only after the tarball contents match expectations
@@ -48,8 +48,8 @@ Define the product requirements and constraints for `pi-agent-browser-native`.
48
48
  - User-facing install docs should also include the GitHub source path `pi install https://github.com/fitchmultz/pi-agent-browser-native`.
49
49
  - Keep the current local-checkout path documented as the practical pre-release and development flow.
50
50
  - Most users will install this extension globally rather than as a project-local extension.
51
- - Repo-local `.pi/` wiring is for development convenience only and should not drive the product design.
52
- - The tracked repo-local development entrypoint should stay a thin re-export, not a second implementation path.
51
+ - Local checkout development should use explicit CLI loading such as `pi --no-extensions -e .` or `pi --no-extensions -e /absolute/path/to/pi-agent-browser-native`.
52
+ - Do **not** rely on repo-local `.pi/extensions/` auto-discovery for this package, because it conflicts with the global installed-package path.
53
53
 
54
54
  ### Native-tool preference
55
55
 
@@ -70,7 +70,7 @@ Define the product requirements and constraints for `pi-agent-browser-native`.
70
70
  ### Testing guidance
71
71
 
72
72
  - The primary confidence path is a real `pi` session driven in `tmux`.
73
- - Launch `pi` from the repository root for local development so the tracked `.pi` development entrypoint loads.
73
+ - For local checkout validation, launch `pi --no-extensions -e .` from the repository root so only the checkout copy loads.
74
74
  - Prefer full `pi` restart over `/reload` when validating extension changes.
75
75
  - Use `/resume` when needed after restart.
76
76
  - Keep testing broader than a single smoke site like `example.com`.
@@ -2,7 +2,7 @@
2
2
  * Purpose: Register the native agent_browser tool for pi so agents can invoke agent-browser without going through bash.
3
3
  * Responsibilities: Define the tool schema, inject thin wrapper behavior around the upstream CLI, manage implicit session convenience, and return pi-friendly content/details.
4
4
  * Scope: Native tool registration and orchestration only; the wrapper intentionally stays close to the upstream agent-browser CLI.
5
- * Usage: Loaded by pi through the package manifest or the local `.pi/extensions/agent-browser.ts` development entrypoint.
5
+ * Usage: Loaded by pi through the package manifest in this package, or explicitly via `pi --no-extensions -e .` during local checkout development.
6
6
  * Invariants/Assumptions: agent-browser is installed separately on PATH, the wrapper targets the current locally installed upstream version only, and no backward-compatibility shims are provided.
7
7
  */
8
8
 
@@ -19,6 +19,7 @@ import {
19
19
  createEphemeralSessionSeed,
20
20
  createImplicitSessionName,
21
21
  getLatestUserPrompt,
22
+ hasUsableBraveApiKey,
22
23
  validateToolArgs,
23
24
  } from "./lib/runtime.js";
24
25
  import { cleanupSecureTempArtifacts } from "./lib/temp.js";
@@ -78,8 +79,14 @@ function buildInspectionDeflectionMessage(): string {
78
79
  ].join("\n");
79
80
  }
80
81
 
82
+ function buildBraveSearchGuidance(hasBraveApiKey: boolean): string {
83
+ if (!hasBraveApiKey) return "";
84
+ return "\n- A non-empty `BRAVE_API_KEY` is available in the current environment. For web search or URL discovery, prefer the Brave Search API via `bash`/`curl` to find the destination URL, then open that URL with `agent_browser` instead of using browser automation to drive Google or another search engine results page. If the Brave request fails, fall back to the normal workflow.";
85
+ }
86
+
81
87
  export default function agentBrowserExtension(pi: ExtensionAPI) {
82
88
  const ephemeralSessionSeed = createEphemeralSessionSeed();
89
+ const braveSearchGuidance = buildBraveSearchGuidance(hasUsableBraveApiKey());
83
90
  let implicitSessionActive = false;
84
91
  let implicitSessionName = createImplicitSessionName(undefined, process.cwd(), ephemeralSessionSeed);
85
92
  let implicitSessionCwd = process.cwd();
@@ -112,7 +119,8 @@ export default function agentBrowserExtension(pi: ExtensionAPI) {
112
119
  return {
113
120
  systemPrompt:
114
121
  event.systemPrompt +
115
- "\n\nProject rule: when browser automation is needed, prefer the native `agent_browser` tool. Do not run direct `agent-browser` bash commands unless the user explicitly asks for a bash-oriented workflow or browser-integration debugging.\n\nBrowser operating playbook:\n- Standard workflow: open the page, then snapshot -i, then interact via refs, then re-snapshot after navigation or major DOM changes.\n- For user-specific or authenticated content like feeds, inboxes, dashboards, and accounts, start with an authenticated browser strategy instead of public browsing. Prefer `--profile Default` on the first browser call and let the current implicit session carry continuity. Use `--auto-connect` only if profile-based reuse is unavailable or the task is specifically about attaching to a running debug-enabled browser.\n- Do not invent fixed explicit session names for routine tasks. Use the implicit session unless you truly need multiple isolated browser sessions in the same conversation.\n- When using startup-scoped flags like `--profile`, `--session-name`, or `--cdp`, put them on the first command for that session. If you intentionally use an explicit `--session`, keep using that same explicit session for follow-ups.\n- If a session lands on the wrong page or tab, an interaction changes origin unexpectedly, or an `open` call returns blocked, blank, or otherwise unexpected results, use `tab list`, `tab <n>`, and `snapshot -i` to recover state before retrying different URLs or fallback strategies. Only use `wait` with an explicit argument like milliseconds, `--load`, `--url`, `--fn`, or `--text`.\n- For feed, timeline, or inbox reading tasks, focus on the main timeline/list region and read the first item there rather than unrelated composer or sidebar content.\n- For read-only browsing tasks, prefer extracting the answer from the current snapshot, structured ref labels, or `eval --stdin` on the current page before navigating away. Only click into media viewers, detail routes, or new pages when the current view does not contain the needed information.\n- When using `eval --stdin`, scope checks and actions to the target element or route whenever possible instead of relying on broad page-wide text heuristics.\n- When using `eval --stdin` for extraction, return the value you want instead of relying on `console.log` as the primary result channel.\n- Do not use `agent_browser --help` for normal browsing tasks.",
122
+ "\n\nProject rule: when browser automation is needed, prefer the native `agent_browser` tool. Do not run direct `agent-browser` bash commands unless the user explicitly asks for a bash-oriented workflow or browser-integration debugging.\n\nBrowser operating playbook:\n- Standard workflow: open the page, then snapshot -i, then interact via refs, then re-snapshot after navigation or major DOM changes.\n- For user-specific or authenticated content like feeds, inboxes, dashboards, and accounts, start with an authenticated browser strategy instead of public browsing. Prefer `--profile Default` on the first browser call and let the current implicit session carry continuity. Use `--auto-connect` only if profile-based reuse is unavailable or the task is specifically about attaching to a running debug-enabled browser.\n- Do not invent fixed explicit session names for routine tasks. Use the implicit session unless you truly need multiple isolated browser sessions in the same conversation.\n- When using startup-scoped flags like `--profile`, `--session-name`, or `--cdp`, put them on the first command for that session. If you intentionally use an explicit `--session`, keep using that same explicit session for follow-ups.\n- If a session lands on the wrong page or tab, an interaction changes origin unexpectedly, or an `open` call returns blocked, blank, or otherwise unexpected results, use `tab list`, `tab <n>`, and `snapshot -i` to recover state before retrying different URLs or fallback strategies. Only use `wait` with an explicit argument like milliseconds, `--load`, `--url`, `--fn`, or `--text`.\n- For feed, timeline, or inbox reading tasks, focus on the main timeline/list region and read the first item there rather than unrelated composer or sidebar content.\n- For read-only browsing tasks, prefer extracting the answer from the current snapshot, structured ref labels, or `eval --stdin` on the current page before navigating away. Only click into media viewers, detail routes, or new pages when the current view does not contain the needed information.\n- When using `eval --stdin`, scope checks and actions to the target element or route whenever possible instead of relying on broad page-wide text heuristics.\n- When using `eval --stdin` for extraction, return the value you want instead of relying on `console.log` as the primary result channel.\n- Do not use `agent_browser --help` for normal browsing tasks." +
123
+ braveSearchGuidance,
116
124
  };
117
125
  });
118
126
 
@@ -141,6 +149,11 @@ export default function agentBrowserExtension(pi: ExtensionAPI) {
141
149
  promptGuidelines: [
142
150
  "Use this tool whenever the task requires a real browser or live web content.",
143
151
  "Standard workflow: open the page, snapshot -i, interact using refs, and re-snapshot after navigation or major DOM changes.",
152
+ ...(braveSearchGuidance
153
+ ? [
154
+ "When a non-empty BRAVE_API_KEY is available in the current environment, prefer the Brave Search API via bash/curl to discover specific destination URLs, then open the chosen URL with agent_browser instead of browsing a search engine results page just to find the target.",
155
+ ]
156
+ : []),
144
157
  "For authenticated or user-specific content like feeds, inboxes, dashboards, and accounts, prefer --profile Default on the first browser call and let the implicit session carry continuity. Use --auto-connect only if profile-based reuse is unavailable or the task is specifically about attaching to a running debug-enabled browser.",
145
158
  "Do not invent fixed explicit session names for routine tasks. Use the implicit session unless you truly need multiple isolated browser sessions in the same conversation.",
146
159
  "When using --profile, --session-name, or --cdp, put them on the first command for that session. If you intentionally use an explicit --session, keep using that same explicit session for follow-ups.",
@@ -10,6 +10,7 @@ import { createHash, randomUUID } from "node:crypto";
10
10
  import { basename } from "node:path";
11
11
 
12
12
  const STARTUP_SCOPED_FLAGS = ["--cdp", "--profile", "--session-name"] as const;
13
+ const BRAVE_API_KEY_ENV = "BRAVE_API_KEY";
13
14
  const INSPECTION_ALLOW_PATTERNS = [
14
15
  /\bagent[_ -]?browser\s+--(?:help|version)\b/i,
15
16
  /\bagent[_ -]?browser\b.*\b(?:help|version|docs?|documentation|tool contract|tool guidance|tool description)\b/i,
@@ -77,6 +78,10 @@ export interface PromptPolicy {
77
78
  allowLegacyAgentBrowserBash: boolean;
78
79
  }
79
80
 
81
+ export function hasUsableBraveApiKey(apiKey: string | null | undefined = process.env[BRAVE_API_KEY_ENV]): boolean {
82
+ return typeof apiKey === "string" && apiKey.trim().length > 0;
83
+ }
84
+
80
85
  export function createEphemeralSessionSeed(): string {
81
86
  return randomUUID();
82
87
  }
package/package.json CHANGED
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "name": "pi-agent-browser-native",
3
- "version": "0.1.2",
4
- "description": "Native pi integration for agent-browser",
3
+ "version": "0.1.4",
4
+ "description": "pi extension that exposes agent-browser as a native tool for browser automation",
5
5
  "type": "module",
6
6
  "author": "Mitch Fultz (https://github.com/fitchmultz)",
7
7
  "keywords": [
@@ -10,7 +10,9 @@
10
10
  "pi-extension",
11
11
  "extension",
12
12
  "agent-browser",
13
- "browser-automation"
13
+ "browser-automation",
14
+ "typescript",
15
+ "npm-package"
14
16
  ],
15
17
  "license": "MIT",
16
18
  "repository": {