pi-chrome 0.15.15 → 0.15.17
- package/CHANGELOG.md +9 -0
- package/CONTRIBUTING.md +3 -3
- package/README.md +10 -14
- package/docs/COMPARISON.md +3 -3
- package/docs/EXAMPLES.md +3 -5
- package/docs/FAQ.md +7 -14
- package/extensions/chrome-profile-bridge/browser-extension/manifest.json +1 -1
- package/extensions/chrome-profile-bridge/browser-extension/service_worker.js +6 -6
- package/extensions/chrome-profile-bridge/index.ts +7 -6
- package/package.json +2 -2
package/CHANGELOG.md
CHANGED
@@ -2,6 +2,15 @@
 
 All notable user-facing changes to `pi-chrome`.
 
+## 0.15.17 — 2026-05-14
+
+- **Docs accuracy pass.** Updated README, FAQ, comparison, contributing notes, and package metadata for the current real-input-only, terminal-authorized tool surface.
+- **Input verification fix.** `includeSnapshot=true` now works for `chrome_click`, `chrome_type`, `chrome_fill`, and `chrome_key`, returning the Chrome-input result plus a fresh snapshot.
+
+## 0.15.16 — 2026-05-14
+
+- **Visible `/chrome` loading state.** Bare `/chrome` and `/chrome status` now immediately say “Checking Chrome connection…” before probing the companion extension, so a slow Chrome bridge no longer looks like the command did nothing.
+
 ## 0.15.15 — 2026-05-14
 
 - **Terminal authorization restored.** `/chrome authorize` is back to terminal-based confirmation. Removed the browser-side Chrome consent page and companion-extension consent polling.
package/CONTRIBUTING.md
CHANGED
@@ -5,8 +5,8 @@ Thanks for considering a contribution. pi-chrome aims to be the **de-facto brows
 ## Non-negotiables
 
 1. **No re-login.** Every change must keep working against the user's already-signed-in Chrome profile. Anything that requires a fresh profile or extra auth steps is out of scope.
-2. **
-3. **
+2. **Verifiable action results.** Input tools must return structured details and support `includeSnapshot` where verification matters. Agents need enough evidence to avoid blind retries.
+3. **Chrome real input.** Interactive controls use Chrome's input layer through `chrome.debugger`; do not re-expose synthetic/untrusted input as public UX.
 4. **Benchmarks gate features.** Add a page in `test-suite/` that fails before your change and passes after. We accept PRs faster when there's a green/red verdict to point at.
 
 ## Local dev
@@ -25,7 +25,7 @@ python3 -m http.server 8765
 
 1. Register in `extensions/chrome-profile-bridge/index.ts` (the `register*Tool` calls near line 840+).
 2. Implement the handler in `extensions/chrome-profile-bridge/browser-extension/service_worker.js`.
-3. Return
+3. Return structured details and support `includeSnapshot` for user-visible state changes when relevant.
 4. Add a benchmark page under `test-suite/challenges/` and a manifest entry.
 5. Update `README.md` "What an agent gets" table.
 6. Add a `CHANGELOG.md` entry.
package/README.md
CHANGED
@@ -1,7 +1,7 @@
 # pi-chrome
 
 > **The fastest way to give a [Pi](https://pi.dev) agent your real Chrome.**
-> No
+> No remote-debug port. No throwaway profile. No re-login. Watch it work — or run silent.
 
 **MIT · 0 runtime deps · loopback-only bridge (`127.0.0.1:17318`) · inspect [`extensions/chrome-profile-bridge/browser-extension/`](./extensions/chrome-profile-bridge/browser-extension) before loading.** Verify connectivity in one command: `/chrome doctor`.
 
@@ -12,7 +12,7 @@ Agent: chrome_tab(list) → chrome_snapshot(uid:…) → chrome_screenshot(...)
 You: [keeps coding — agent never asked you to log in]
 ```
 
-`pi-chrome` ships **
+`pi-chrome` ships **19 browser tools** for Pi agents, backed by a small MIT-licensed Chrome extension that runs inside the Chrome profile **you already use** — including every site you're already signed into.
 
 ---
 
@@ -30,7 +30,7 @@ Then in Pi:
 
 On macOS this opens `chrome://extensions`, reveals the bundled `browser-extension/` folder in Finder, and copies its path to your clipboard. In Chrome: **Developer mode** → **Load unpacked** → paste the path. Done.
 
-Verify, then authorize current Pi session
+Verify, then authorize current Pi session from the terminal:
 
 ```text
 /chrome doctor
@@ -120,27 +120,23 @@ You: [files the ticket with the folder attached]
 
 ---
 
-##
+## Verifiable actions
 
-
+Input tools return structured details such as the coordinates used, target tag, uploaded paths, key pressed, or scroll distance. For click/type/fill/key calls, pass `includeSnapshot: true` to get a fresh page snapshot in the same result:
 
 ```text
-chrome_click(
-
+chrome_click(uid:"el-3", includeSnapshot:true) →
+result: { input:"chrome", x:412, y:238, tag:"BUTTON" }
+snapshot: { title, url, text, elements:[...] }
 ```
 
-
-chrome_type(react-input, "hello") →
-"Typed into el-7 — valueMatches=true; pageMutated=true"
-```
-
-This is why agents using pi-chrome don't get stuck in retry loops on broken sites. They get the **reason** the action didn't land and can fix course in one turn.
+Agents can verify page state immediately instead of blindly retrying.
 
 ---
 
 ## What an agent gets
 
-**
+**19 tools**, grouped by job. Every one runs against your already-open tabs.
 
 | Category | Tools |
 | --------------- | ---------------------------------------------------------------------------------------------- |
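The README hunk above describes the `{ result, snapshot }` shape returned when `includeSnapshot: true` is passed. A minimal sketch of how an agent harness might check that shape follows. The helper name `verifyAction` and its `expect` options are hypothetical; only the `result`/`snapshot` keys and the `title`, `url`, `text`, `elements` snapshot fields come from the README example, and the sample values are made up.

```javascript
// Hypothetical helper: checks an input-tool result that was requested with
// includeSnapshot:true. Field names follow the README example; the
// `expect` options are illustrative and not part of pi-chrome.
function verifyAction(envelope, expect = {}) {
  const problems = [];
  if (!envelope || !envelope.snapshot) {
    problems.push("no snapshot returned (was includeSnapshot set?)");
    return { ok: false, problems };
  }
  const { snapshot } = envelope;
  if (expect.urlIncludes && !String(snapshot.url || "").includes(expect.urlIncludes)) {
    problems.push(`url does not include "${expect.urlIncludes}"`);
  }
  if (expect.textIncludes && !String(snapshot.text || "").includes(expect.textIncludes)) {
    problems.push(`page text does not include "${expect.textIncludes}"`);
  }
  return { ok: problems.length === 0, problems };
}

// Example envelope shaped like the README snippet (values are made up).
const envelope = {
  result: { input: "chrome", x: 412, y: 238, tag: "BUTTON" },
  snapshot: { title: "Cart", url: "https://example.test/cart", text: "1 item in cart", elements: [] },
};

console.log(verifyAction(envelope, { urlIncludes: "/cart", textIncludes: "1 item" }));
```

Returning a list of problems rather than a bare boolean gives the agent something concrete to act on before it decides whether to retry.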
package/docs/COMPARISON.md
CHANGED
@@ -52,7 +52,7 @@ We benchmark in public — see [`../test-suite/`](../test-suite). Where exact sc
 1. **Profile attach, not driver launch.** Every other driver fights cookie persistence, login walls, MFA, and extension state. pi-chrome inherits all of it because it *is* your Chrome.
 2. **Chrome input against your real profile.** Interactive tools use CDP input for reliability while still controlling the Chrome profile you already use.
 3. **Extension bridge transport.** No `--remote-debugging-port`, no throwaway Chromium. Survives Chrome auto-updates. Works alongside your normal Chrome usage.
-4. **
+4. **Structured action results.** Input tools return target coordinates/tags and can include a fresh snapshot (`includeSnapshot`) so agents can verify state instead of blindly retrying.
 5. **Multi-session shared bridge.** Planner + worker + audit Pi sessions all drive the same Chrome concurrently.
 6. **Stable element uids.** `chrome_snapshot` returns deterministic uids you can pass to subsequent actions — similar to BrowserGym's `bid`, but built into the snapshot tool itself.
 
@@ -101,9 +101,9 @@ These wrap a driver with an LLM loop. They are **higher-level than pi-chrome** a
 
 `pi-chrome` exposes tools that any Pi agent can call. If you want to use it from outside Pi:
 
-1. The local bridge speaks HTTP JSON
+1. The local bridge speaks HTTP JSON over `127.0.0.1:17318` (default). The API is internal; use the Pi tool surface unless you are building an adapter.
 2. Tool surface mirrors Playwright closely (click/type/navigate/snapshot/screenshot/evaluate/wait_for) so adapter code is short.
-3.
+3. `includeSnapshot` on input tools lets agent harnesses verify state after actions.
 
 If you want a first-class pi-chrome adapter for Browser Use / Stagehand / LangGraph, file an issue with your use case.
 
package/docs/EXAMPLES.md
CHANGED
@@ -119,10 +119,8 @@ On my staging app:
 ### React controlled inputs
 
 ```text
-chrome_fill
-
-After each fill, the result envelope's valueMatches=true confirms the
-component re-rendered with the new value.
+Use `chrome_fill` for React inputs when you want to replace the full value.
+Pass `includeSnapshot=true` to verify the component re-rendered with the new value.
 ```
 
 ### File upload without the native picker
@@ -160,7 +158,7 @@ Interactive tools use Chrome's real input layer by default: clicks, typing, fill
 - sign-in flows
 - guarded buttons
 - audio/video controls
-- fullscreen
+- fullscreen and other user-activation checks
 - pages with strict CSP or user-activation checks
 
 Chrome may show its debugger banner while pi-chrome is attached.
package/docs/FAQ.md
CHANGED
@@ -32,9 +32,9 @@ Chrome control is also locked per Pi session until you run `/chrome authorize`;
 
 Yes. The first session opens the local bridge; later sessions detect it and pipe their commands through the same bridge. Each Pi session must be authorized with `/chrome authorize` before its chrome_* tools work.
 
-## Why
+## Why ship as an unpacked extension?
 
-
+pi-chrome ships as an unpacked extension so the source and broad browser permissions are easy to inspect and update with the npm package. The downside: you load it manually from `chrome://extensions` and reload it after package updates.
 
 ## What happens when I update pi-chrome?
 
@@ -42,8 +42,8 @@ Web Store extensions cannot communicate with a local process bridge controlled b
 
 ## What's the install footprint?
 
-- Pi side: one extension that registers
-- Chrome side: one unpacked extension, ~
+- Pi side: one extension that registers 19 tools and a few slash commands.
+- Chrome side: one unpacked extension, ~2000 LOC of plain JavaScript, no dependencies.
 
 ## Can I script it without Pi?
 
@@ -53,18 +53,11 @@ The Pi-facing tools are thin wrappers around an HTTP bridge at `127.0.0.1:17318`
 
 Not always. `chrome_evaluate` and `chrome_snapshot` run in the page's MAIN world through the Function constructor, so pages whose CSP blocks `'unsafe-eval'` can reject them. `chrome_screenshot`, `chrome_navigate`, tab tools, and real Chrome input still work because they use extension/browser APIs rather than page JavaScript.
 
-##
+## How do I tell whether a click or type worked?
 
-
-- The element was occluded (look for `occludedBy: <selector>` in the envelope).
-- The click handler called `event.preventDefault()` and the page intentionally ignored it.
-- The target changed after your snapshot; take a fresh snapshot or screenshot.
+Use `includeSnapshot=true` on `chrome_click`, `chrome_type`, `chrome_fill`, or `chrome_key`. The tool returns the Chrome-input result plus a fresh snapshot, so the agent can verify text, URL, visible elements, or form values before continuing.
 
-
-
-## Why does `chrome_type` return `valueMatches=false`?
-
-The field rejected or transformed the typed value. Common culprits: contenteditable rich-text editors, native date pickers, masked-input libraries, or masks. Try `chrome_fill`, then verify with `includeSnapshot=true`.
+If the page did not change, take a fresh snapshot or screenshot and check for overlays, disabled controls, stale element uids, or app-side validation.
 
 ## How do I attach a file to a React file input?
 
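The FAQ text above advises checking for overlays, disabled controls, stale uids, or validation when the page did not change. One way to detect that case is to compare the snapshot taken before the action with the one returned after it. The sketch below assumes the snapshot fields shown in the README example (`title`, `url`, `text`, `elements`); `snapshotChanged` is a hypothetical helper, not part of pi-chrome.

```javascript
// Hypothetical helper: reports which top-level snapshot fields differ.
function snapshotChanged(before, after) {
  const changed = [];
  for (const key of ["title", "url", "text"]) {
    if ((before[key] ?? "") !== (after[key] ?? "")) changed.push(key);
  }
  // Compare element lists structurally; JSON is enough for plain data.
  if (JSON.stringify(before.elements ?? []) !== JSON.stringify(after.elements ?? [])) {
    changed.push("elements");
  }
  return changed;
}

// Made-up snapshots: the click did not alter anything.
const before = { title: "Login", url: "https://example.test/login", text: "Sign in", elements: [] };
const after = { title: "Login", url: "https://example.test/login", text: "Sign in", elements: [] };

const diff = snapshotChanged(before, after);
if (diff.length === 0) {
  console.log("page did not change: check overlays, disabled controls, stale uids, validation");
} else {
  console.log("changed fields:", diff.join(", "));
}
```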
package/extensions/chrome-profile-bridge/browser-extension/service_worker.js
CHANGED
@@ -712,7 +712,7 @@ async function dispatch(action, params) {
     case "page.evaluate":
       return evaluateInTab(params);
     case "page.click":
-      return
+      return withOptionalSnapshot(params, chromeInputClick);
     case "page.hover":
       return chromeInputHover(params);
     case "page.drag":
@@ -720,11 +720,11 @@ async function dispatch(action, params) {
     case "page.upload":
       return chromeInputUpload(params);
     case "page.type":
-      return
+      return withOptionalSnapshot(params, chromeInputType);
     case "page.fill":
-      return
+      return withOptionalSnapshot(params, chromeInputFill);
     case "page.key":
-      return
+      return withOptionalSnapshot(params, chromeInputKey);
     case "page.scroll":
       return chromeInputScroll(params);
     case "page.tap":
@@ -932,8 +932,8 @@ async function evaluateInTab(params) {
   return v;
 }
 
-async function
-  const result = await
+async function withOptionalSnapshot(params, actionFn) {
+  const result = await actionFn(params);
   if (params.includeSnapshot) {
     const snapshot = await executeInTab({ ...params, foreground: false }, snapshotPage, [params.maxElements || 80, null, null, null]);
     return { result, snapshot };
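The hunks above route `page.click`, `page.type`, `page.fill`, and `page.key` through one wrapper instead of repeating the snapshot logic per action. A self-contained sketch of that wrapper pattern follows. The action and snapshot functions are stand-ins (`fakeClick` and `fakeSnapshot` are hypothetical); the real `withOptionalSnapshot` calls `executeInTab` with `snapshotPage`, and its behavior when `includeSnapshot` is unset is not visible in the diff, so returning the bare result here is an assumption.

```javascript
// Sketch of the higher-order wrapper; `takeSnapshot` is injected so the
// example runs without Chrome. Returning the bare result when
// includeSnapshot is falsy is an assumption, not taken from the diff.
async function withOptionalSnapshotSketch(params, actionFn, takeSnapshot) {
  const result = await actionFn(params);
  if (params.includeSnapshot) {
    const snapshot = await takeSnapshot(params.maxElements || 80);
    return { result, snapshot };
  }
  return result;
}

// Stand-ins for a Chrome input action and the page snapshot.
const fakeClick = async (params) => ({ input: "chrome", x: params.x, y: params.y, tag: "BUTTON" });
const fakeSnapshot = async (maxElements) => ({
  title: "Demo",
  url: "about:blank",
  text: "",
  elements: [],
  maxElements,
});

(async () => {
  console.log(await withOptionalSnapshotSketch({ x: 1, y: 2 }, fakeClick, fakeSnapshot));
  console.log(await withOptionalSnapshotSketch({ x: 1, y: 2, includeSnapshot: true }, fakeClick, fakeSnapshot));
})();
```

Passing the action as a function keeps each `case` in the dispatcher to a single line, and any future input action can opt in to snapshots the same way.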
package/extensions/chrome-profile-bridge/index.ts
CHANGED
@@ -8,9 +8,8 @@ import { dirname, join, resolve } from "node:path";
 /**
  * Existing-profile Chrome bridge for pi.
  *
- * This is intentionally not a
- *
- * remote debugging. Instead, install the companion Chrome extension from the
+ * This is intentionally not a remote-debugging-port integration. Chrome blocks default-profile
+ * remote debugging in many normal launches, so pi-chrome uses a companion extension from the
  * browser-extension folder bundled next to this Pi extension.
  *
  * The companion extension runs inside the user's real Chrome profile and polls this local
@@ -496,18 +495,18 @@ export default function (pi: ExtensionAPI): void {
   pi.on("before_agent_start", (event) => {
     const primer = `
 <chrome-profile-bridge>
-Chrome control is available through the chrome_* tools via a companion Chrome extension installed in the user's normal Chrome profile. Tools target the existing signed-in profile
+Chrome control is available through the chrome_* tools via a companion Chrome extension installed in the user's normal Chrome profile. Tools target the existing signed-in profile: no remote-debug port, no throwaway profile.
 
 Capability model (important):
 - Interactive controls (click/type/fill/key/hover/drag/scroll/tap) use Chrome's real input layer via chrome.debugger / CDP. Events satisfy normal user-activation gates.
 - Input bypasses page CSP because it is injected at browser input layer, not page JavaScript. Chrome may show the “Pi Chrome Connector started debugging this browser” banner while attached.
 - \`chrome_evaluate\` and \`chrome_snapshot\` run in MAIN world via the **Function constructor**, which requires \`'unsafe-eval'\` in the page CSP. Pages with strict CSP (e.g. github.com, many bank/SaaS apps) will throw \`EvalError: ... 'unsafe-eval' is not an allowed source of script\` and chrome_snapshot will return empty. On those pages, drive the page with \`chrome_screenshot\` + viewport-coordinate \`chrome_click\`/\`chrome_type\`/\`chrome_key\`. \`chrome_navigate\`, \`chrome_screenshot\`, \`chrome_tab\`, and Chrome input all keep working under any CSP.
--
+- Input tools return structured details and support \`includeSnapshot=true\` on click/type/fill/key. Use the fresh snapshot to verify state instead of repeating blindly.
 
 Usage rules:
 1. If a chrome_* tool says Chrome control is locked, ask the user to run \`/chrome authorize\` before retrying.
 2. \`chrome_snapshot\` before clicking/typing; pass \`uid\` over \`selector\`.
-3. \`includeSnapshot=true\` on click/type/fill to verify in one round trip.
+3. \`includeSnapshot=true\` on click/type/fill/key to verify in one round trip.
 4. If \`chrome_evaluate\` returns null when you expected a value, the expression evaluated to null/undefined in the page; surface the value via \`JSON.stringify\` to confirm.
 5. \`chrome_navigate\` supports an optional \`initScript\` that runs at document_start in MAIN world for the next navigation (good for seeding localStorage or stubbing Date.now).
 6. By default chrome_* tools focus Chrome so the user can watch; pass \`background=true\` or run /chrome background on for session-wide background execution.
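The primer above tells the agent to fall back to `chrome_screenshot` plus viewport-coordinate input when a strict CSP rejects `chrome_evaluate` or `chrome_snapshot`. A harness could make that decision from the error text. In the sketch below, `isUnsafeEvalBlocked` and `pickStrategy` are hypothetical names, the strategy labels are illustrative, and matching on the `'unsafe-eval'` substring is an assumption based on the error message quoted in the primer.

```javascript
// Hypothetical: detect the CSP failure described in the primer by looking
// for the 'unsafe-eval' marker in the error message.
function isUnsafeEvalBlocked(error) {
  const message = error instanceof Error ? error.message : String(error ?? "");
  return message.includes("'unsafe-eval'");
}

// Hypothetical: choose how to drive the page after a snapshot attempt.
function pickStrategy(snapshotError) {
  if (!snapshotError) return "snapshot+uid";
  return isUnsafeEvalBlocked(snapshotError) ? "screenshot+coordinates" : "report-error";
}

// Illustrative error text modeled on the message quoted in the primer.
const cspError = new EvalError("'unsafe-eval' is not an allowed source of script");

console.log(pickStrategy(null));
console.log(pickStrategy(cspError));
console.log(pickStrategy(new Error("tab was closed")));
```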
@@ -684,6 +683,7 @@ Usage rules:
   };
 
   const statusHandler = async (ctx: ExtensionContext) => {
+    ctx.ui.notify("Checking Chrome connection…", "info");
     ctx.ui.notify(await statusSummary(), "info");
   };
 
@@ -723,6 +723,7 @@ Usage rules:
 
   const openCommandMenu = async (ctx: ExtensionContext): Promise<void> => {
     while (true) {
+      ctx.ui.notify("Checking Chrome connection…", "info");
       const choice = await ctx.ui.select(`pi-chrome\n${await statusSummary()}`, [
         "Authorize Chrome control…",
         "Lock Chrome control",
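Both hunks above add an immediate notice before awaiting `statusSummary()`, so the user sees feedback while the bridge probe is still pending. The ordering can be sketched with stand-ins: `fakeUi` and `slowStatusSummary` below are hypothetical and only mimic the `ctx.ui.notify(message, level)` call shape visible in the diff.

```javascript
// Stand-in UI that records notifications in the order they arrive.
const shown = [];
const fakeUi = { notify: (message, level) => shown.push(`${level}: ${message}`) };

// Stand-in for a slow bridge probe (the delay value is arbitrary).
const slowStatusSummary = () =>
  new Promise((resolve) => setTimeout(() => resolve("Chrome bridge: connected"), 50));

async function statusHandlerSketch(ui) {
  ui.notify("Checking Chrome connection…", "info"); // emitted before the probe starts
  ui.notify(await slowStatusSummary(), "info"); // emitted once the probe resolves
}

const pending = statusHandlerSketch(fakeUi);
// The loading notice is already recorded here, before the probe finishes.
console.log(shown);
pending.then(() => console.log(shown));
```

The first `notify` runs synchronously up to the first `await`, which is why the notice appears even when the probe takes a long time.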
package/package.json
CHANGED
@@ -1,11 +1,11 @@
 {
   "name": "pi-chrome",
-  "version": "0.15.
+  "version": "0.15.17",
   "scripts": {
     "version": "node scripts/sync-manifest-version.js",
     "prepublishOnly": "node scripts/sync-manifest-version.js"
   },
-  "description": "Give a Pi agent your real, signed-in Chrome. No
+  "description": "Give a Pi agent your real, signed-in Chrome. No remote-debug port, no throwaway profile, no re-login. 19 tools for click, type, navigate, screenshot, network capture, file upload, drag, and touch.",
   "keywords": [
     "pi",
     "pi-package",