pi-chrome 0.14.8 → 0.15.0

package/CHANGELOG.md CHANGED
@@ -2,6 +2,17 @@
2
2
 
3
3
  All notable user-facing changes to `pi-chrome`.
4
4
 
5
+ ## 0.15.0 — 2026-05-13
6
+
7
+ - **README rewrite — top-3 recipes as terminal mockups.** PR triage, Linear standup, and Bug-repro-with-evidence each get a copy-pasteable prompt → tool trace → result block modeled on the hero example. The other six recipes (form auto-fill, admin cross-check, visual diff, auth-only data pull, network forensics, file upload) collapsed into a `<details>` block so the section sells before it catalogs.
8
+ - **Comparison table rewritten.** Dropped the all-✅ "Works on strict-CSP pages" row (zero signal). New table leads with "Time from `pi install` → first useful action on your real account" (~60s vs. hours) and "Survives MFA / SSO without code" (✅ already logged in). Multi-session row reframed as the bolded "Multiple agents drive the same Chrome at once". Footnote ² rewritten to highlight mode-aware scoring + open invitation for competing tools to PR their scores.
9
+ - **Section reorder: sells before catalogs.** New flow: hero → 60-second install → 30-second try-this → killer recipes → comparison → honest results → tool catalog → click/watch modes (with Diagnostics folded in) → architecture → benchmark suite → security model & why unpacked (combined) → composes-with → roadmap → contributing → license. Hero blockquote now precedes shields badges so pi.dev no longer scrapes a broken-image badge as the description. Package `description` shortened to 255 chars so pi.dev hero stops truncating mid-word. `author` set to `"tianrendong (Earendil Inc.)"`.
10
+
11
+ ## 0.14.9
12
+
13
+ - Primer (agent system prompt) now teaches the **trusted-mode escape hatch** explicitly. Previously the bridge would hit a CSP-locked page (github.com, banks, many SaaS apps), `chrome_evaluate`/`chrome_snapshot` would throw `EvalError: 'unsafe-eval' is not an allowed source of script`, and the agent would conclude *"bridge can't drive this page"* and ask the user for a fallback. New primer makes three things self-discoverable: (1) `trusted: true` on click/type/key/fill/hover/drag/scroll dispatches through chrome.debugger / CDP and bypasses page CSP entirely, (2) the recipe for strict-CSP pages is `chrome_screenshot` + trusted input at viewport coordinates, (3) when synthetic input produces no `pageMutated` or you see a CSP/eval error, **escalate to `trusted: true` yourself instead of asking the user**. Also corrects the old claim that `chrome_evaluate` works without `'unsafe-eval'` (it does not — Function constructor is gated by `script-src`).
14
+ - Add `scripts/sync-manifest-version.js` wired to npm's `version` + `prepublishOnly` lifecycle hooks. Bumping the package version with `npm version <bump>` now auto-syncs `extensions/chrome-profile-bridge/browser-extension/manifest.json` and stages it into the version commit — kills the recurring drift class (cf. 0.14.4, 0.14.8, this fix).
15
+
5
16
  ## 0.14.8
6
17
 
7
18
  - Repo moved to its own home: https://github.com/tianrendong/pi-chrome. No code changes; updated `repository`, `homepage`, and `bugs` URLs in `package.json`.
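The 0.14.9 entry above describes a version-sync step, but the diff does not show the body of `scripts/sync-manifest-version.js`. The following is a minimal sketch of what such a step could look like, not the package's actual script. The function name `syncManifestVersion` and its return shape are hypothetical, and file I/O plus any `git add` staging are left out so that the core logic stays a pure function.

```javascript
// Hypothetical sketch: the real scripts/sync-manifest-version.js is not shown in this diff.
// Chrome extension manifests accept one to four dot-separated integers as a version,
// so prerelease strings such as "1.0.0-beta.1" are rejected here.
const CHROME_VERSION_PATTERN = /^\d+(\.\d+){0,3}$/;

function syncManifestVersion(packageJsonText, manifestJsonText) {
  const pkg = JSON.parse(packageJsonText);
  const manifest = JSON.parse(manifestJsonText);

  if (typeof pkg.version !== "string" || !CHROME_VERSION_PATTERN.test(pkg.version)) {
    throw new Error(`package.json version "${pkg.version}" is not valid for a Chrome manifest`);
  }

  // Report whether anything changed so a caller can decide whether to stage the file.
  const changed = manifest.version !== pkg.version;
  manifest.version = pkg.version;

  // Two-space indentation with a trailing newline.
  return { changed, text: JSON.stringify(manifest, null, 2) + "\n" };
}
```

Serializing with `JSON.stringify(value, null, 2)` puts each array element on its own line, which would be consistent with the reformatted `permissions` and `host_permissions` arrays in the manifest hunk of this diff. npm runs the `version` lifecycle script after bumping `package.json` and before creating the commit, so a script wired there can stage the rewritten manifest into the same commit.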
package/README.md CHANGED
@@ -1,12 +1,10 @@
1
1
  # pi-chrome
2
2
 
3
- [![npm version](https://img.shields.io/npm/v/pi-chrome.svg)](https://www.npmjs.com/package/pi-chrome)
4
- [![npm downloads](https://img.shields.io/npm/dm/pi-chrome.svg)](https://www.npmjs.com/package/pi-chrome)
5
- [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](./LICENSE)
6
-
7
3
  > **The fastest way to give a [Pi](https://pi.dev) agent your real Chrome.**
8
4
  > No CDP. No throwaway profile. No re-login. Watch it work — or run silent.
9
5
 
6
+ **MIT · 0 runtime deps · loopback-only bridge (`127.0.0.1:17318`) · inspect [`extensions/chrome-profile-bridge/browser-extension/`](./extensions/chrome-profile-bridge/browser-extension) before loading.** Verify connectivity in one command: `/chrome doctor`.
7
+
10
8
  ```text
11
9
  You: "Find my open GitHub PR tab, summarize review state, and screenshot the failing CI."
12
10
  Agent: chrome_tab(list) → chrome_snapshot(uid:…) → chrome_screenshot(...)
@@ -14,46 +12,11 @@ Agent: chrome_tab(list) → chrome_snapshot(uid:…) → chrome_screenshot(...)
14
12
  You: [keeps coding — agent never asked you to log in]
15
13
  ```
16
14
 
17
- `pi-chrome` ships **20+ browser tools** for Pi agents, backed by a small MIT-licensed Chrome extension that runs inside the Chrome profile **you already use** — including every site you're already signed into.
18
-
19
- ---
20
-
21
- ## Why pi-chrome vs. everything else
22
-
23
- > Short version: **pi-chrome is primitives — "Playwright for the Chrome you're already signed into."** Not an agent loop. Plug it under any agent framework (Browser Use, Stagehand, LangGraph) or call its tools directly from a Pi agent. See [docs/COMPARISON.md](./docs/COMPARISON.md) for the full three-axis landscape (drivers, agents, cloud providers).
24
-
25
- | | **pi-chrome** | Playwright / Puppeteer | CDP-based agents | Selenium / WebDriver |
26
- | ------------------------------ | --------------------------------- | ----------------------------- | ----------------------------- | ----------------------------- |
27
- | Uses your real signed-in Chrome | ✅ yes (extension in your profile) | ❌ throwaway profile | ⚠️ requires `--remote-debug` | ❌ throwaway profile |
28
- | Re-login required | **Never** | Every run | Sometimes | Every run |
29
- | Watch agent work, live | ✅ default; toggle quiet | ❌ headless or new window | ⚠️ debugger banner always | ❌ new window |
30
- | Works on strict-CSP pages | ✅ `new Function` MAIN-world | ✅ | ✅ | ✅ |
31
- | Real browser-trusted clicks | ✅ opt-in (`chrome clicks on`) | ✅ | ✅ | ✅ |
32
- | Multi-session safe | ✅ shared local bridge | ❌ port collisions | ❌ | ❌ |
33
- | Network/console capture | ✅ built-in | ✅ | ✅ | ⚠️ via extensions |
34
- | Honest result envelopes¹ | ✅ | ⚠️ | ❌ | ❌ |
35
- | Built-in benchmark suite² | ✅ 38 primitives + 4 long-horizon | n/a | n/a | n/a |
36
-
37
- ¹ Every action returns `pageMutated`, `defaultPrevented`, `elementVisible`, `occludedBy`, and `valueMatches` so the agent knows when a click didn't take effect — instead of looping blindly.
38
- ² See [`test-suite/`](./test-suite) — 38 primitive challenges plus 4 hermetic BrowserGym-style tasks. Scoring is expected-outcome-by-mode (`synthetic` / `trusted` / `manual`), not raw PASS count. Pages grade any browser-control tool on trusted clicks, pointer humanization, keyboard fidelity, drag/drop, clipboard, Shadow DOM, iframes, file uploads, network capture, and fingerprint leaks.
39
-
40
- ---
41
-
42
- ## What an agent gets
43
-
44
- **20 tools**, grouped by job. Every one runs against your already-open tabs.
45
-
46
- | Category | Tools |
47
- | --------------- | ---------------------------------------------------------------------------------------------- |
48
- | **Tabs** | `chrome_tab` (list/new/activate/close/version), `chrome_launch` |
49
- | **Inspect** | `chrome_snapshot` (uids + selectors + text + viewport), `chrome_screenshot`, `chrome_evaluate` |
50
- | **Navigate** | `chrome_navigate` (with optional `initScript` at `document_start`), `chrome_wait_for` |
51
- | **Interact** | `chrome_click`, `chrome_type`, `chrome_fill`, `chrome_key`, `chrome_hover` |
52
- | **Gesture** | `chrome_drag` (HTML5 DataTransfer), `chrome_scroll` (wheel + momentum), `chrome_tap` (touch) |
53
- | **Files** | `chrome_upload_file` (no native picker; works with React/Vue/Angular file inputs) |
54
- | **Observe** | `chrome_list_console_messages`, `chrome_list_network_requests`, `chrome_get_network_request` (with response body) |
15
+ [![npm version](https://img.shields.io/npm/v/pi-chrome.svg)](https://www.npmjs.com/package/pi-chrome)
16
+ [![npm downloads](https://img.shields.io/npm/dm/pi-chrome.svg)](https://www.npmjs.com/package/pi-chrome)
17
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](./LICENSE)
55
18
 
56
- Each tool is documented inline in Pi agents see the parameters and the gotchas (synthetic vs. trusted, autoplay gates, file picker limits) without trial-and-error.
19
+ `pi-chrome` ships **20+ browser tools** for Pi agents, backed by a small MIT-licensed Chrome extension that runs inside the Chrome profile **you already use**, including every site you're already signed into.
57
20
 
58
21
  ---
59
22
 
@@ -101,16 +64,43 @@ You'll watch the agent jump to your GitHub tab and read the page — using **you
101
64
 
102
65
  ## Killer recipes (copy-paste into Pi)
103
66
 
104
- Each one assumes tabs you already have open + accounts you're already signed into.
67
+ Each recipe assumes the relevant tab is already open in the Chrome you control.
105
68
 
106
69
  **PR triage**
107
- > Use chrome_tab list to find my GitHub notifications tab, snapshot it, summarize PRs needing my review.
70
+
71
+ ```text
72
+ You: "Use chrome_tab list to find my GitHub notifications tab, then summarize PRs needing my review today, sorted by staleness."
73
+ Agent: chrome_tab(list) → chrome_snapshot(uid:el-notifications) → chrome_evaluate(...)
74
+ ✓ 7 PRs waiting on you. 2 stale >3d (storage-rewrite, billing-v2).
75
+ 1 just turned CI-green (api-keys-prune). Full sorted list below.
76
+ You: [pastes the list straight into Linear]
77
+ ```
108
78
 
109
79
  **Linear standup**
110
- > Open my Linear current cycle in the active tab, snapshot it, write a 5-bullet standup.
80
+
81
+ ```text
82
+ You: "Open my Linear current cycle in the active tab and write a 5-bullet standup."
83
+ Agent: chrome_tab(activate, urlIncludes:"linear.app") → chrome_snapshot(uid:el-cycle) → chrome_evaluate(...)
84
+ ✓ 5 in-progress, 2 blocked. Standup draft:
85
+ • Shipped: bridge auto-recover. • In flight: trusted-mode retry path.
86
+ • Blocked: vendor portal CSP (waiting on infra).
87
+ • Next: benchmark v2. • Risk: none today.
88
+ You: [drops it into #standup]
89
+ ```
111
90
 
112
91
  **Bug repro with evidence**
113
- > Open the staging app I'm already signed into, reproduce \<bug>, save a screenshot of each step under `./repro/`.
92
+
93
+ ```text
94
+ You: "Repro the checkout 500 on staging, save a screenshot at each step under ./repro/."
95
+ Agent: chrome_navigate(staging) → chrome_click(uid:el-add-to-cart) → chrome_screenshot(./repro/01-cart.png)
96
+ → chrome_click(uid:el-checkout) → chrome_list_network_requests() → chrome_screenshot(./repro/02-500.png)
97
+ ✓ POST /api/checkout → 500. Response body saved → ./repro/checkout-500.json
98
+ 3 screenshots in ./repro/. Trigger: missing tax_id when cart contains digital goods.
99
+ You: [files the ticket with the folder attached]
100
+ ```
101
+
102
+ <details>
103
+ <summary><strong>More recipes</strong> (form auto-fill, admin cross-check, visual diff, auth-only data pull, network forensics, file upload)</summary>
114
104
 
115
105
  **Form auto-fill (no submit)**
116
106
  > Open the vendor portal, fill the new-vendor form from this JSON, stop before submit.
@@ -130,23 +120,65 @@ Each one assumes tabs you already have open + accounts you're already signed int
130
120
  **File upload through React**
131
121
  > Open the photo uploader, `chrome_upload_file` with `./fixtures/sample.png`, confirm preview rendered.
132
122
 
123
+ </details>
124
+
133
125
  ---
134
126
 
135
- ## Architecture
127
+ ## Why pi-chrome vs. Playwright / CDP / Selenium
128
+
129
+ > Short version: **pi-chrome is primitives — "Playwright for the Chrome you're already signed into."** Not an agent loop. Plug it under any agent framework (Browser Use, Stagehand, LangGraph) or call its tools directly from a Pi agent. See [docs/COMPARISON.md](./docs/COMPARISON.md) for the full three-axis landscape (drivers, agents, cloud providers).
130
+
131
+ | | **pi-chrome** | Playwright / Puppeteer | CDP-based agents | Selenium / WebDriver |
132
+ | ------------------------------ | --------------------------------- | ----------------------------- | ----------------------------- | ----------------------------- |
133
+ | **Time from `pi install` → first useful action on your real account** | ~60s (load unpacked, `/chrome doctor`) | hours (script login, store creds, debug headless) | 30+ min (`--remote-debug` setup, attach) | hours (driver + login script) |
134
+ | **Survives MFA / SSO without code** | ✅ already logged in | ❌ | ⚠️ if you re-auth | ❌ |
135
+ | Uses your real signed-in Chrome | ✅ extension in your profile | ❌ throwaway profile | ⚠️ requires `--remote-debug` | ❌ throwaway profile |
136
+ | Re-login required | **Never** | Every run | Sometimes | Every run |
137
+ | **Multiple agents drive the same Chrome at once** | ✅ shared bridge | ❌ port collisions | ❌ | ❌ |
138
+ | Watch agent work, live | ✅ default; toggle quiet | ❌ headless or new window | ⚠️ debugger banner always | ❌ new window |
139
+ | Real browser-trusted clicks | ✅ opt-in (`chrome clicks on`) | ✅ | ✅ | ✅ |
140
+ | Network/console capture | ✅ built-in | ✅ | ✅ | ⚠️ via extensions |
141
+ | **Honest result envelopes¹** | ✅ | ⚠️ | ❌ | ❌ |
142
+ | Self-graded by built-in benchmark² | ✅ 38 primitives + 4 long-horizon | n/a | n/a | n/a |
143
+
144
+ ¹ Every action returns `pageMutated`, `defaultPrevented`, `elementVisible`, `occludedBy`, and `valueMatches` so the agent knows when a click didn't take effect — instead of looping blindly.
145
+ ² [`test-suite/`](./test-suite) is mode-aware: a synthetic-events tool is *expected* to fail clipboard. If you build a competing tool, send a PR with your scores. We benchmark in public.
146
+
147
+ ---
148
+
149
+ ## Honest results
136
150
 
151
+ Most browser-automation libraries return `void` or a generic ack. `pi-chrome` returns a structured envelope on every interaction:
152
+
153
+ ```text
154
+ chrome_click(occluded-button) →
155
+ "Clicked el-3 — pageMutated=false; occluded by <div#overlay>"
137
156
  ```
138
- ┌──────────────────────┐ ┌──────────────────────────┐
139
- │ Pi agent (terminal) │ ─── http://127.0.0.1:17318 ─→ │ Chrome extension │
140
- │ chrome_* tools │ │ (your real profile) │
141
- └──────────┬───────────┘ └─────────┬────────────────┘
142
- │ same machine │
143
- ▼ ▼
144
- Other Pi sessions Tabs you already have open
145
- share the same bridge (signed in to GitHub,
146
- automatically Linear, Stripe, etc.)
157
+
158
+ ```text
159
+ chrome_type(react-input, "hello") →
160
+ "Typed into el-7 — valueMatches=true; pageMutated=true"
147
161
  ```
148
162
 
149
- Multiple Pi sessions (planner / worker / audit) can all drive the same Chrome at once. The first session opens the local bridge; later sessions detect it and pipe their commands through.
163
+ This is why agents using pi-chrome don't get stuck in retry loops on broken sites. They get the **reason** the action didn't land and can fix course in one turn.
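As an illustration of how an agent loop might act on the envelope described above, here is a minimal sketch. It assumes the envelope arrives as a plain object whose keys match the field names listed in the README (`pageMutated`, `elementVisible`, `occludedBy`, `valueMatches`). The README only shows text renderings of results, so that object shape, the function name `diagnose`, and the `next` labels are all hypothetical.

```javascript
// Hypothetical sketch: maps a result envelope to a suggested next step.
// The object shape is an assumption; only the field names come from the README.
function diagnose(result) {
  if (result.occludedBy) {
    return { landed: false, next: "clear-occlusion", reason: `occluded by ${result.occludedBy}` };
  }
  if (result.elementVisible === false) {
    return { landed: false, next: "scroll-into-view", reason: "target not visible" };
  }
  if (result.valueMatches === false) {
    return { landed: false, next: "refill-value", reason: "field value differs from what was typed" };
  }
  if (result.pageMutated === false) {
    // The action likely did not take effect; repeating it unchanged is unlikely to help.
    return { landed: false, next: "escalate-trusted", reason: "no page mutation after the action" };
  }
  return { landed: true, next: "continue", reason: "action took effect" };
}
```

Checking occlusion before `pageMutated` is a design choice in this sketch: an overlay explains a missing mutation, so it is the more specific diagnosis and is reported first.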
164
+
165
+ ---
166
+
167
+ ## What an agent gets
168
+
169
+ **20 tools**, grouped by job. Every one runs against your already-open tabs.
170
+
171
+ | Category | Tools |
172
+ | --------------- | ---------------------------------------------------------------------------------------------- |
173
+ | **Tabs** | `chrome_tab` (list/new/activate/close/version), `chrome_launch` |
174
+ | **Inspect** | `chrome_snapshot` (uids + selectors + text + viewport), `chrome_screenshot`, `chrome_evaluate` |
175
+ | **Navigate** | `chrome_navigate` (with optional `initScript` at `document_start`), `chrome_wait_for` |
176
+ | **Interact** | `chrome_click`, `chrome_type`, `chrome_fill`, `chrome_key`, `chrome_hover` |
177
+ | **Gesture** | `chrome_drag` (HTML5 DataTransfer), `chrome_scroll` (wheel + momentum), `chrome_tap` (touch) |
178
+ | **Files** | `chrome_upload_file` (no native picker; works with React/Vue/Angular file inputs) |
179
+ | **Observe** | `chrome_list_console_messages`, `chrome_list_network_requests`, `chrome_get_network_request` (with response body) |
180
+
181
+ Each tool is documented inline in Pi — agents see the parameters and the gotchas (synthetic vs. trusted, autoplay gates, file picker limits) without trial-and-error.
150
182
 
151
183
  ---
152
184
 
@@ -166,9 +198,7 @@ Multiple Pi sessions (planner / worker / audit) can all drive the same Chrome at
166
198
 
167
199
  Per-call `trusted: true / false` on any input tool wins over the global mode.
168
200
 
169
- ---
170
-
171
- ## Background / watch modes
201
+ ### Background / watch modes
172
202
 
173
203
  By default, every `chrome_*` call focuses Chrome and activates the target tab so you can **watch the agent work** — invaluable for demos, debugging, and first-time confidence.
174
204
 
@@ -180,51 +210,56 @@ By default, every `chrome_*` call focuses Chrome and activates the target tab so
180
210
 
181
211
  Per-call `background: true` wins over the session toggle.
182
212
 
183
- ---
213
+ ### Diagnostics
184
214
 
185
- ## Honest results
215
+ - `/chrome doctor` — single command: connectivity, extension version, bridge owner, version drift, MAIN-world helper injection, `chrome_evaluate("1+1") === 2`, fingerprint flags.
216
+ - `/chrome onboard` — guided first-time setup.
217
+ - `/chrome quiet status`, `/chrome clicks status` — current modes.
186
218
 
187
- Most browser-automation libraries return `void` or a generic ack. `pi-chrome` returns a structured envelope on every interaction:
219
+ If the loaded Chrome extension is older than `pi-chrome` on disk, `/chrome doctor` tells you to reload it from `chrome://extensions`.
188
220
 
189
- ```text
190
- chrome_click(occluded-button) →
191
- "Clicked el-3 — pageMutated=false; occluded by <div#overlay>"
192
- ```
221
+ ---
193
222
 
194
- ```text
195
- chrome_type(react-input, "hello") →
196
- "Typed into el-7 — valueMatches=true; pageMutated=true"
223
+ ## Architecture
224
+
225
+ ```
226
+ ┌──────────────────────┐ ┌──────────────────────────┐
227
+ │ Pi agent (terminal) │ ─── http://127.0.0.1:17318 ─→ │ Chrome extension │
228
+ │ chrome_* tools │ │ (your real profile) │
229
+ └──────────┬───────────┘ └─────────┬────────────────┘
230
+ │ same machine │
231
+ ▼ ▼
232
+ Other Pi sessions Tabs you already have open
233
+ share the same bridge (signed in to GitHub,
234
+ automatically Linear, Stripe, etc.)
197
235
  ```
198
236
 
199
- This is why agents using pi-chrome don't get stuck in retry loops on broken sites. They get the **reason** the action didn't land and can fix course in one turn.
237
+ Multiple Pi sessions (planner / worker / audit) can all drive the same Chrome at once. The first session opens the local bridge; later sessions detect it and pipe their commands through.
200
238
 
201
239
  ---
202
240
 
203
- ## Diagnostics
204
-
205
- - `/chrome doctor` — single command: connectivity, extension version, bridge owner, version drift, MAIN-world helper injection, `chrome_evaluate("1+1") === 2`, fingerprint flags.
206
- - `/chrome onboard` — guided first-time setup.
207
- - `/chrome quiet status`, `/chrome clicks status` — current modes.
208
-
209
- If the loaded Chrome extension is older than `pi-chrome` on disk, `/chrome doctor` tells you to reload it from `chrome://extensions`.
241
+ ## Built-in benchmark suite
210
242
 
211
- ---
243
+ [`test-suite/`](./test-suite) is a benchmark for **any** browser-control agent (not just pi-chrome). It includes **38 primitive challenges** plus **4 hermetic BrowserGym-style long-horizon tasks**.
212
244
 
213
- ## Composes with
245
+ Scoring is **expected-outcome-by-mode**, not raw PASS count: each challenge has an expected verdict per mode (`synthetic`, `trusted`, `manual`) and a tool grades itself by whether its actual outcome matches the expected one. This avoids false equivalence between modes — a synthetic-events tool isn't supposed to satisfy a clipboard user-activation gate; matching that expectation is the pass.
214
246
 
215
- - **[pi-qq](https://www.npmjs.com/package/pi-qq)** `/qq summarize what the active GitHub tab shows` without polluting the main transcript.
216
- - **[pi-bar](https://www.npmjs.com/package/pi-bar)** — when the agent scrapes large pages, watch the context-usage segment turn yellow → red as a signal to `/qq` for a recap.
217
- - **PR demo skills** — screenshots write to `.pi/chrome-screenshots/` so you can attach them to PR descriptions or demo bundles.
247
+ Each challenge exposes `window.__verdict` / `window.__reason` / `window.__events` and a manifest entry with expected results per mode.
218
248
 
219
- ---
249
+ ```bash
250
+ cd test-suite && python3 -m http.server 8765
251
+ # open http://127.0.0.1:8765/ in the Chrome window pi-chrome controls
252
+ ```
220
253
 
221
- ## Why an unpacked Chrome extension?
254
+ Categories: `trusted-input`, `pointer-humanization`, `keyboard`, `activation-gates`, `scroll`, `drag-drop`, `clipboard`, `native-controls`, `frameworks`, `editing`, `dom-complexity`, `frames`, `files`, `observability`, `fingerprint`, `agent-safety`.
222
255
 
223
- `pi-chrome` cannot ship through the Chrome Web Store — a Web Store extension cannot talk to a local bridge controlled by another tool on the same machine. So it ships as a small MIT-licensed unpacked extension in [`extensions/chrome-profile-bridge/browser-extension/`](./extensions/chrome-profile-bridge/browser-extension). **Read the source before loading.** `/chrome doctor` reports the extension version and warns when it drifts from your installed `pi-chrome`.
256
+ If you build a competing tool, please open a PR with your scores. We benchmark in public.
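The expected-outcome-by-mode scoring described in this section can be sketched as follows. The manifest entry shape (`id` plus an `expected` map keyed by mode) and the `"PASS"` / `"FAIL"` verdict strings are assumptions made for illustration; the diff does not show the real manifest format in `test-suite/`.

```javascript
// Hypothetical sketch of expected-outcome-by-mode grading.
// A challenge counts as matched when the tool's actual verdict equals the verdict
// expected for the mode it ran in, even when that expected verdict is a failure.
function gradeChallenge(entry, mode, actualVerdict) {
  const expected = entry.expected[mode];
  if (expected === undefined) {
    throw new Error(`no expected verdict for mode "${mode}" in challenge ${entry.id}`);
  }
  return { id: entry.id, mode, expected, actual: actualVerdict, matches: expected === actualVerdict };
}

function scoreRun(entries, mode, actualById) {
  const grades = entries.map((entry) => gradeChallenge(entry, mode, actualById[entry.id]));
  const matched = grades.filter((grade) => grade.matches).length;
  return { matched, total: grades.length, grades };
}
```

Under this scheme a synthetic-events tool that fails a clipboard user-activation challenge still scores a match for that challenge, because failure is the expected outcome in `synthetic` mode.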
224
257
 
225
258
  ---
226
259
 
227
- ## Security model
260
+ ## Security model & why unpacked
261
+
262
+ **Unpacked on purpose.** A Web Store extension cannot talk to a local bridge controlled by another tool on the same machine — so pi-chrome ships its bridge as an inspectable, MIT-licensed folder you load once with Developer Mode. Every line is yours to read in [`extensions/chrome-profile-bridge/browser-extension/`](./extensions/chrome-profile-bridge/browser-extension). `/chrome doctor` reports the loaded extension version and warns when it drifts from your installed `pi-chrome`.
228
263
 
229
264
  The companion extension runs in the Chrome profile where you install it and has broad tab/scripting permissions. Only install it from a package source you trust.
230
265
 
@@ -238,22 +273,11 @@ There is no network exposure; the bridge binds to loopback only.
238
273
 
239
274
  ---
240
275
 
241
- ## Built-in benchmark suite
242
-
243
- [`test-suite/`](./test-suite) is a benchmark for **any** browser-control agent (not just pi-chrome). It includes **38 primitive challenges** plus **4 hermetic BrowserGym-style long-horizon tasks**.
244
-
245
- Scoring is **expected-outcome-by-mode**, not raw PASS count: each challenge has an expected verdict per mode (`synthetic`, `trusted`, `manual`) and a tool grades itself by whether its actual outcome matches the expected one. This avoids false equivalence between modes — a synthetic-events tool isn't supposed to satisfy a clipboard user-activation gate; matching that expectation is the pass.
246
-
247
- Each challenge exposes `window.__verdict` / `window.__reason` / `window.__events` and a manifest entry with expected results per mode.
248
-
249
- ```bash
250
- cd test-suite && python3 -m http.server 8765
251
- # open http://127.0.0.1:8765/ in the Chrome window pi-chrome controls
252
- ```
253
-
254
- Categories: `trusted-input`, `pointer-humanization`, `keyboard`, `activation-gates`, `scroll`, `drag-drop`, `clipboard`, `native-controls`, `frameworks`, `editing`, `dom-complexity`, `frames`, `files`, `observability`, `fingerprint`, `agent-safety`.
276
+ ## Composes with
255
277
 
256
- If you build a competing tool, please open a PR with your scores. We benchmark in public.
278
+ - **[pi-qq](https://www.npmjs.com/package/pi-qq)**: `/qq summarize what the active GitHub tab shows` without polluting the main transcript.
279
+ - **[pi-bar](https://www.npmjs.com/package/pi-bar)** — when the agent scrapes large pages, watch the context-usage segment turn yellow → red as a signal to `/qq` for a recap.
280
+ - **PR demo skills** — screenshots write to `.pi/chrome-screenshots/` so you can attach them to PR descriptions or demo bundles.
257
281
 
258
282
  ---
259
283
 
package/extensions/chrome-profile-bridge/browser-extension/manifest.json CHANGED
@@ -1,10 +1,21 @@
1
1
  {
2
2
  "manifest_version": 3,
3
3
  "name": "Pi Chrome Connector",
4
- "version": "0.14.7",
4
+ "version": "0.15.0",
5
5
  "description": "Lets Pi control tabs in Chrome via a local connector at 127.0.0.1.",
6
- "permissions": ["tabs", "scripting", "storage", "activeTab", "alarms", "webNavigation", "debugger"],
7
- "host_permissions": ["<all_urls>", "http://127.0.0.1:17318/*"],
6
+ "permissions": [
7
+ "tabs",
8
+ "scripting",
9
+ "storage",
10
+ "activeTab",
11
+ "alarms",
12
+ "webNavigation",
13
+ "debugger"
14
+ ],
15
+ "host_permissions": [
16
+ "<all_urls>",
17
+ "http://127.0.0.1:17318/*"
18
+ ],
8
19
  "background": {
9
20
  "service_worker": "service_worker.js"
10
21
  },
@@ -436,9 +436,10 @@ export default function (pi: ExtensionAPI): void {
436
436
  Chrome control is available through the chrome_* tools via a companion Chrome extension installed in the user's normal Chrome profile. Tools target the existing signed-in profile, no CDP, no throwaway profile.
437
437
 
438
438
  Capability model (important):
439
- - All input is **synthetic DOM events** (\`isTrusted=false\`). Synthetic events drive React/Vue/Angular state fine, but they do NOT satisfy Chrome's user-activation gates: audio/video autoplay, clipboard write, file pickers, fullscreen, and Web Push prompts will NOT open from a chrome_click.
440
- - \`chrome_evaluate\` runs in MAIN world via the Function constructor. It works on pages with strict CSP (\`script-src 'self'\` without \`'unsafe-eval'\`), and surfaces thrown exceptions.
441
- - Tool results include \`pageMutated\`, \`defaultPrevented\`, \`elementVisible\`, \`occludedBy\`, and (for type/fill) \`valueMatches\`. If \`pageMutated\` is false after a click that should have changed something, the click likely didn't take effect do NOT just retry; check the action result and snapshot for the cause.
439
+ - Default input path is **synthetic DOM events** (\`isTrusted=false\`). Synthetic events drive React/Vue/Angular state fine, but they do NOT satisfy Chrome's user-activation gates: audio/video autoplay, clipboard write, file pickers, fullscreen, and Web Push prompts will NOT open from a synthetic chrome_click.
440
+ - **Trusted escape hatch**: chrome_click / chrome_type / chrome_key / chrome_fill / chrome_hover / chrome_drag / chrome_scroll all accept \`trusted: true\`, which dispatches through chrome.debugger / CDP. Trusted events are browser-trusted (\`isTrusted=true\`) and **bypass page CSP entirely** because they're injected at the input layer, not via JS. Default mode is \`auto\`: synthetic first, silent CDP retry only when the click looks gated. If a synthetic click/type produced no \`pageMutated\` or you got a CSP/eval error from chrome_evaluate, escalate to \`trusted: true\` yourself — don't ask the user.
441
+ - \`chrome_evaluate\` and \`chrome_snapshot\` run in MAIN world via the **Function constructor**, which requires \`'unsafe-eval'\` in the page CSP. Pages with strict CSP (e.g. github.com, many bank/SaaS apps) will throw \`EvalError: ... 'unsafe-eval' is not an allowed source of script\` and chrome_snapshot will return empty. On those pages, drive the page with \`chrome_screenshot\` (extension API, not gated by CSP) + \`chrome_click\`/\`chrome_type\`/\`chrome_key\` with \`trusted: true\` and viewport coordinates. \`chrome_navigate\`, \`chrome_screenshot\`, \`chrome_tab\`, and trusted input all keep working under any CSP.
442
+ - Tool results include \`pageMutated\`, \`defaultPrevented\`, \`elementVisible\`, \`occludedBy\`, and (for type/fill) \`valueMatches\`. If \`pageMutated\` is false after a click that should have changed something, the click likely didn't take effect — do NOT just retry the same way; either escalate to \`trusted: true\` or check the snapshot for occlusion.
442
443
 
443
444
  Usage rules:
444
445
  1. \`chrome_snapshot\` before clicking/typing; pass \`uid\` over \`selector\`.
@@ -446,7 +447,7 @@ Usage rules:
446
447
  3. If \`chrome_evaluate\` returns null when you expected a value, the expression evaluated to null/undefined in the page; surface the value via \`JSON.stringify\` to confirm.
447
448
  4. \`chrome_navigate\` supports an optional \`initScript\` that runs at document_start in MAIN world for the next navigation (good for seeding localStorage or stubbing Date.now).
448
449
  5. By default chrome_* tools focus Chrome so the user can watch; pass \`background=true\` or run /chrome quiet to silence the whole session.
449
- 6. If you hit an autoplay/clipboard/file-picker gate, tell the user; this bridge cannot satisfy it.
450
+ 6. If you hit an autoplay/clipboard/file-picker gate, tell the user; this bridge cannot satisfy it. (Generic clicks/typing/CSP gates are fine — escalate to \`trusted: true\`.)
450
451
  7. Run /chrome doctor when in doubt about connectivity or capabilities.
451
452
  </chrome-profile-bridge>`;
452
453
  return { systemPrompt: event.systemPrompt + primer };
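The escalation rules in the primer above can be summarized as a small decision function. This is a sketch of the decision logic only, not code from the package. The function name `planNextInput`, its input fields, and the `action` labels are hypothetical. The rules themselves come from the primer text: user-activation gates go to the user, an `'unsafe-eval'` error switches to screenshot plus trusted input at viewport coordinates, and a click with `pageMutated` false is retried with `trusted: true`.

```javascript
// Hypothetical sketch of the primer's escalation rules as a pure function.
// Gates the primer says the bridge cannot satisfy; the agent should tell the user.
const USER_ACTIVATION_GATES = ["autoplay", "clipboard", "file-picker"];

function planNextInput({ errorMessage = "", lastResult = null, gate = null } = {}) {
  if (gate !== null && USER_ACTIVATION_GATES.includes(gate)) {
    return { action: "tell-user", trusted: false };
  }
  // Strict-CSP pages: chrome_evaluate / chrome_snapshot fail, so inspect with
  // chrome_screenshot and send trusted input at viewport coordinates instead.
  if (/unsafe-eval/.test(errorMessage)) {
    return { action: "screenshot-then-coordinates", trusted: true };
  }
  // The previous synthetic action produced no mutation: retry once as trusted.
  if (lastResult !== null && lastResult.pageMutated === false) {
    return { action: "retry-same-target", trusted: true };
  }
  // Default path: snapshot first, then act on a uid with synthetic events.
  return { action: "snapshot-then-uid", trusted: false };
}
```

The gate check comes first in this sketch because the primer's usage rule 6 treats autoplay, clipboard, and file-picker gates as cases to report to the user, in contrast to generic click, typing, and CSP failures, which the agent is told to escalate on its own.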
package/package.json CHANGED
@@ -1,7 +1,11 @@
1
1
  {
2
2
  "name": "pi-chrome",
3
- "version": "0.14.8",
4
- "description": "The de-facto browser automation toolkit for Pi agents. Drive your existing logged-in Chrome — no re-login, no throwaway profile, no CDP. 20+ tools (click, type, navigate, screenshot, network capture, file upload, drag, scroll, touch) + honest result envelopes + a built-in benchmark suite.",
3
+ "version": "0.15.0",
4
+ "scripts": {
5
+ "version": "node scripts/sync-manifest-version.js",
6
+ "prepublishOnly": "node scripts/sync-manifest-version.js"
7
+ },
8
+ "description": "Give a Pi agent your real, signed-in Chrome. No CDP, no throwaway profile, no re-login. 20+ tools (click, type, navigate, screenshot, network capture, file upload, drag, touch) with honest result envelopes — and a built-in browser-control benchmark suite.",
5
9
  "keywords": [
6
10
  "pi",
7
11
  "pi-package",
@@ -32,6 +36,7 @@
32
36
  "stagehand-alternative"
33
37
  ],
34
38
  "license": "MIT",
39
+ "author": "tianrendong (Earendil Inc.)",
35
40
  "homepage": "https://github.com/tianrendong/pi-chrome#readme",
36
41
  "repository": {
37
42
  "type": "git",