pi-chrome 0.14.2 → 0.14.4
- package/CHANGELOG.md +74 -0
- package/CONTRIBUTING.md +50 -0
- package/LICENSE +21 -0
- package/README.md +195 -102
- package/SECURITY.md +38 -0
- package/docs/COMPARISON.md +87 -0
- package/docs/EXAMPLES.md +166 -0
- package/docs/FAQ.md +79 -0
- package/extensions/chrome-profile-bridge/browser-extension/manifest.json +1 -1
- package/package.json +38 -5
package/CHANGELOG.md
ADDED
|
@@ -0,0 +1,74 @@
|
|
|
1
|
+
# Changelog
|
|
2
|
+
|
|
3
|
+
All notable user-facing changes to `pi-chrome`.
|
|
4
|
+
|
|
5
|
+
## 0.14.4
|
|
6
|
+
|
|
7
|
+
- Sync the `manifest.json` version to match `package.json`. 0.14.3 shipped with a stale manifest, which would trigger spurious `/chrome doctor` drift warnings. No code or behavior changes.
|
|
8
|
+
|
|
9
|
+
## 0.14.3
|
|
10
|
+
|
|
11
|
+
- Documentation & discoverability overhaul.
|
|
12
|
+
- New README: hero, alternatives comparison table, 20-tool reference grouped by job, killer recipes, architecture diagram, honest-results explainer, benchmark suite plug.
|
|
13
|
+
- `docs/COMPARISON.md` — deep comparison vs Playwright / Puppeteer / Stagehand / browser-use / Selenium.
|
|
14
|
+
- `docs/EXAMPLES.md` — ready-to-paste agent prompts (PR triage, Linear standup, bug repro, network forensics, multi-tab admin cross-check, etc.).
|
|
15
|
+
- `docs/FAQ.md` — covers Brave/Arc, incognito, detection, banner, multi-session, CSP, file uploads, common envelope causes.
|
|
16
|
+
- `CHANGELOG.md`, `CONTRIBUTING.md`, `SECURITY.md`, `LICENSE` added.
|
|
17
|
+
- `package.json`: `homepage`, `repository`, `bugs`, expanded keywords for npm search.
|
|
18
|
+
- No code changes to the tools or extension.
|
|
19
|
+
|
|
20
|
+
## 0.14.2
|
|
21
|
+
|
|
22
|
+
- Recover from foreign-extension input overlays so input tools don't get hijacked by other Chrome extensions.
|
|
23
|
+
|
|
24
|
+
## 0.14.1
|
|
25
|
+
|
|
26
|
+
- Hardened `attachDebugger` retry path so trusted clicks survive transient Chrome debugger contention.
|
|
27
|
+
- New `trusted.debug` diagnostic surface.
|
|
28
|
+
|
|
29
|
+
## 0.14.0
|
|
30
|
+
|
|
31
|
+
- Bare `/chrome` opens a `SettingsList` dialog. `space` cycles values, `enter` saves.
|
|
32
|
+
|
|
33
|
+
## 0.13.0
|
|
34
|
+
|
|
35
|
+
- Flattened `/chrome` tree. Cycle-on-pick for `clicks` / `quiet`. Status header at the top of the picker.
|
|
36
|
+
|
|
37
|
+
## 0.12.1
|
|
38
|
+
|
|
39
|
+
- Fixed anti-automation regressions across benchmark challenges (`07/13/21/27/28`).
|
|
40
|
+
|
|
41
|
+
## 0.12.0
|
|
42
|
+
|
|
43
|
+
- Unified all slash commands under `/chrome` (`/chrome doctor`, `/chrome onboard`, `/chrome clicks`, `/chrome quiet`).
|
|
44
|
+
|
|
45
|
+
## 0.11.x
|
|
46
|
+
|
|
47
|
+
- Plain-English audit pass across `/chrome doctor`, `/chrome onboard`, `/chrome quiet`, README.
|
|
48
|
+
- `chrome_tap` (real CDP touch events).
|
|
49
|
+
- Smoother `trustedScroll`; CDP debugger auto-recovers from detach.
|
|
50
|
+
- Smart-auto trusted mode default. Extension renamed to **Pi Chrome Connector**.
|
|
51
|
+
- Per-event scroll delta cap so IntersectionObserver thresholds land naturally.
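The per-event scroll delta cap in the 0.11.x notes can be pictured as splitting one large scroll into several smaller wheel steps. The sketch below is only an illustration: `splitScrollDelta` and the 120 px cap are assumptions, not pi-chrome's actual code or values.

```typescript
// Hypothetical helper: split a total scroll distance into per-event deltas,
// each no larger in magnitude than `maxPerEvent`.
function splitScrollDelta(total: number, maxPerEvent: number): number[] {
  if (!isFinite(total) || !isFinite(maxPerEvent) || maxPerEvent <= 0) {
    throw new RangeError("total must be finite and maxPerEvent must be positive and finite");
  }
  const sign = total < 0 ? -1 : 1;
  let remaining = Math.abs(total);
  const steps: number[] = [];
  while (remaining > 0) {
    // Take as much as the cap allows, then carry the rest to the next event.
    const step = Math.min(remaining, maxPerEvent);
    steps.push(sign * step);
    remaining -= step;
  }
  return steps;
}

console.log(splitScrollDelta(300, 120)); // [120, 120, 60]
```

Smaller steps mean an observer watching the viewport sees intermediate positions instead of one jump, which is the effect the changelog entry describes.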
|
|
52
|
+
|
|
53
|
+
## 0.10.x
|
|
54
|
+
|
|
55
|
+
- Trusted-input mode via `chrome.debugger` (CDP) — opt-in, indistinguishable-from-human clicks/keys.
|
|
56
|
+
- `chrome_key` modifiers (Cmd+V, Ctrl+Shift+Tab, etc.) for trusted chords.
|
|
57
|
+
- Interactive `/chrome-trusted` picker (later folded into `/chrome`).
|
|
58
|
+
|
|
59
|
+
## 0.9.x
|
|
60
|
+
|
|
61
|
+
- Humanized synthetic input (pointer paths, key cadence variance).
|
|
62
|
+
- Anti-automation benchmark test suite landed in `test-suite/`.
|
|
63
|
+
|
|
64
|
+
## 0.8.0
|
|
65
|
+
|
|
66
|
+
- `chrome_evaluate` no longer returns `null` for valid expressions — dedicated MAIN-world `evaluateInTab` pipeline with statement-mode fallback and tagged envelope for `undefined`/errors/symbols/bigints.
|
|
67
|
+
- Truthful action result envelopes: `isTrusted`, `defaultPrevented`, `elementVisible`, `occludedBy`, `valueMatches`, `pageMutated`.
|
|
68
|
+
- Extended `/chrome doctor`: bridge mode/URL, extension version drift, MAIN-world helper injection, `navigator.webdriver` fingerprint, CDP availability probe.
|
|
69
|
+
- Removed misleading `returnByValue` param. Implemented `chrome_screenshot.fullPage` via tile stitching.
|
|
70
|
+
- Autoplay-gate heuristic for `chrome_click`.
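One way to read the "tagged envelope" item in the 0.8.0 notes: values such as `undefined`, bigints, and symbols do not survive `JSON.stringify`, so a result can be wrapped in a small tagged object before it crosses the bridge. The shape below (`kind`, `value`, `message`) and the `toEnvelope` helper are hypothetical, not the envelope pi-chrome actually emits.

```typescript
// Hypothetical envelope shape; field names are assumptions for illustration.
type EvalEnvelope =
  | { kind: "value"; value: unknown }
  | { kind: "undefined" }
  | { kind: "bigint"; value: string }
  | { kind: "symbol"; value: string }
  | { kind: "error"; message: string };

// Run `thunk` and wrap its outcome so it can be serialized as JSON.
function toEnvelope(thunk: () => unknown): EvalEnvelope {
  try {
    const result = thunk();
    if (result === undefined) return { kind: "undefined" };
    if (typeof result === "bigint") return { kind: "bigint", value: String(result) };
    if (typeof result === "symbol") return { kind: "symbol", value: String(result) };
    return { kind: "value", value: result };
  } catch (err) {
    // Thrown values are reported as data instead of collapsing to null.
    return { kind: "error", message: err instanceof Error ? err.message : String(err) };
  }
}

console.log(JSON.stringify(toEnvelope(() => undefined))); // {"kind":"undefined"}
```

With a tag on every result, the caller can tell "the expression returned nothing" apart from "the expression failed".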
|
|
71
|
+
|
|
72
|
+
## 0.7.0
|
|
73
|
+
|
|
74
|
+
- Initial public `pi-chrome` release. Companion Chrome extension + local bridge + first tool set.
|
package/CONTRIBUTING.md
ADDED
|
@@ -0,0 +1,50 @@
|
|
|
1
|
+
# Contributing to pi-chrome
|
|
2
|
+
|
|
3
|
+
Thanks for considering a contribution. pi-chrome aims to be the **de-facto browser-control toolkit for Pi agents** — that means a few non-negotiables.
|
|
4
|
+
|
|
5
|
+
## Non-negotiables
|
|
6
|
+
|
|
7
|
+
1. **No re-login.** Every change must keep working against the user's already-signed-in Chrome profile. Anything that requires a fresh profile or extra auth steps is out of scope.
|
|
8
|
+
2. **Honest result envelopes.** Every action tool returns `pageMutated`, `defaultPrevented`, `elementVisible`, `occludedBy` (when relevant), `valueMatches` (for input). Agents need to know **why** something didn't take effect.
|
|
9
|
+
3. **Quiet by default, trusted by opt-in.** Synthetic DOM events first. CDP/`chrome.debugger` only when explicitly requested (`trusted: true`) or when the smart-auto heuristic detects an obvious user-activation gate.
|
|
10
|
+
4. **Benchmarks gate features.** Add a page in `test-suite/` that fails before your change and passes after. We accept PRs faster when there's a green/red verdict to point at.
|
|
11
|
+
|
|
12
|
+
## Local dev
|
|
13
|
+
|
|
14
|
+
```bash
|
|
15
|
+
# Link from a checkout
|
|
16
|
+
pi install ./pi-chrome
|
|
17
|
+
|
|
18
|
+
# Run the benchmark dashboard
|
|
19
|
+
cd test-suite
|
|
20
|
+
python3 -m http.server 8765
|
|
21
|
+
# open http://127.0.0.1:8765/ in the Chrome window pi-chrome controls
|
|
22
|
+
```
|
|
23
|
+
|
|
24
|
+
## Adding a new tool
|
|
25
|
+
|
|
26
|
+
1. Register in `extensions/chrome-profile-bridge/index.ts` (the `register*Tool` calls near line 840+).
|
|
27
|
+
2. Implement the handler in `extensions/chrome-profile-bridge/browser-extension/service_worker.js`.
|
|
28
|
+
3. Return `pageMutated` plus the other relevant envelope fields.
|
|
29
|
+
4. Add a benchmark page under `test-suite/challenges/` and a manifest entry.
|
|
30
|
+
5. Update `README.md` "What an agent gets" table.
|
|
31
|
+
6. Add a `CHANGELOG.md` entry.
|
|
32
|
+
|
|
33
|
+
## Filing a bug
|
|
34
|
+
|
|
35
|
+
Include:
|
|
36
|
+
|
|
37
|
+
- `/chrome doctor` output
|
|
38
|
+
- `pi-chrome` version + extension version (the `doctor` output prints both)
|
|
39
|
+
- The exact tool call + the result envelope you got
|
|
40
|
+
- Page URL or a minimal repro page in `test-suite/`
|
|
41
|
+
|
|
42
|
+
## Releasing
|
|
43
|
+
|
|
44
|
+
- Bump `package.json` version.
|
|
45
|
+
- Move `CHANGELOG.md` notes from the working section to the new version header.
|
|
46
|
+
- `npm publish --access public`.
|
|
47
|
+
|
|
48
|
+
## Code of conduct
|
|
49
|
+
|
|
50
|
+
Be kind, be precise, ship things. PRs that break the "no re-login" promise will be closed with a note explaining which non-negotiable they hit.
|
package/LICENSE
ADDED
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
MIT License
|
|
2
|
+
|
|
3
|
+
Copyright (c) 2024-2026 pi-chrome contributors
|
|
4
|
+
|
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
7
|
+
in the Software without restriction, including without limitation the rights
|
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
10
|
+
furnished to do so, subject to the following conditions:
|
|
11
|
+
|
|
12
|
+
The above copyright notice and this permission notice shall be included in all
|
|
13
|
+
copies or substantial portions of the Software.
|
|
14
|
+
|
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
21
|
+
SOFTWARE.
|
package/README.md
CHANGED
|
@@ -1,63 +1,80 @@
|
|
|
1
1
|
# pi-chrome
|
|
2
2
|
|
|
3
3
|
[](https://www.npmjs.com/package/pi-chrome)
|
|
4
|
+
[](https://www.npmjs.com/package/pi-chrome)
|
|
5
|
+
[](./LICENSE)
|
|
4
6
|
|
|
5
|
-
|
|
7
|
+
> **The fastest way to give a [Pi](https://pi.dev) agent your real Chrome.**
|
|
8
|
+
> No CDP. No throwaway profile. No re-login. Watch it work — or run silent.
|
|
6
9
|
|
|
7
|
-
|
|
10
|
+
```text
|
|
11
|
+
You: "Find my open GitHub PR tab, summarize review state, and screenshot the failing CI."
|
|
12
|
+
Agent: chrome_tab(list) → chrome_snapshot(uid:…) → chrome_screenshot(...)
|
|
13
|
+
✓ 3 reviewers, 1 change requested, CI red on iOS. Saved → .pi/chrome-screenshots/ci.png
|
|
14
|
+
You: [keeps coding — agent never asked you to log in]
|
|
15
|
+
```
|
|
8
16
|
|
|
9
|
-
|
|
17
|
+
`pi-chrome` ships **20+ browser tools** for Pi agents, backed by a small MIT-licensed Chrome extension that runs inside the Chrome profile **you already use** — including every site you're already signed into.
|
|
10
18
|
|
|
11
|
-
|
|
19
|
+
---
|
|
12
20
|
|
|
13
|
-
|
|
14
|
-
- **Watch your authenticated Chrome work** — by default, `chrome_*` tool calls focus Chrome and activate the target tab so you can see the agent inspect, navigate, click, and type in real time. Switch to silent/background mode for the whole session with `/chrome quiet`, or pass `background: true` on a single tool call when you want quiet.
|
|
15
|
-
- **Full browser automation toolkit for Pi** — list/create/activate/close tabs, snapshot pages with usable CSS selectors, navigate, evaluate JavaScript, click, type, press keys, wait for page state, and capture screenshots.
|
|
16
|
-
- **Built-in setup and agent guidance** — `/chrome onboard` walks users through installing the companion extension, `/chrome doctor` checks connectivity and version drift, screenshots save to disk, and the prompt primer tells agents to inspect with `chrome_snapshot` before acting and avoid destructive actions unless explicitly requested.
|
|
21
|
+
## Why pi-chrome vs. everything else
|
|
17
22
|
|
|
18
|
-
|
|
23
|
+
| | **pi-chrome** | Playwright / Puppeteer | CDP-based agents | Selenium / WebDriver |
|
|
24
|
+
| ------------------------------ | --------------------------------- | ----------------------------- | ----------------------------- | ----------------------------- |
|
|
25
|
+
| Uses your real signed-in Chrome | ✅ yes (extension in your profile) | ❌ throwaway profile | ⚠️ requires `--remote-debugging-port` | ❌ throwaway profile |
|
|
26
|
+
| Re-login required | **Never** | Every run | Sometimes | Every run |
|
|
27
|
+
| Watch agent work, live | ✅ default; toggle quiet | ❌ headless or new window | ⚠️ debugger banner always | ❌ new window |
|
|
28
|
+
| Works on strict-CSP pages | ✅ `new Function` MAIN-world | ✅ | ✅ | ✅ |
|
|
29
|
+
| Real browser-trusted clicks | ✅ opt-in (`/chrome clicks on`) | ✅ | ✅ | ✅ |
|
|
30
|
+
| Multi-session safe | ✅ shared local bridge | ❌ port collisions | ❌ | ❌ |
|
|
31
|
+
| Network/console capture | ✅ built-in | ✅ | ✅ | ⚠️ via extensions |
|
|
32
|
+
| Honest result envelopes¹ | ✅ | ⚠️ | ❌ | ❌ |
|
|
33
|
+
| Built-in benchmark suite² | ✅ 30+ challenges | n/a | n/a | n/a |
|
|
19
34
|
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
```
|
|
35
|
+
¹ Every action returns `pageMutated`, `defaultPrevented`, `elementVisible`, `occludedBy`, and `valueMatches` so the agent knows when a click didn't take effect — instead of looping blindly.
|
|
36
|
+
² See [`test-suite/`](./test-suite) — static pages that grade any browser-control tool on trusted clicks, pointer humanization, keyboard fidelity, drag/drop, clipboard, Shadow DOM, iframes, file uploads, network capture, and fingerprint leaks.
|
|
23
37
|
|
|
24
|
-
|
|
38
|
+
---
|
|
25
39
|
|
|
26
|
-
|
|
27
|
-
pi install ./pi-chrome
|
|
28
|
-
```
|
|
40
|
+
## What an agent gets
|
|
29
41
|
|
|
30
|
-
|
|
42
|
+
**20 tools**, grouped by job. Every one runs against your already-open tabs.
|
|
31
43
|
|
|
32
|
-
|
|
44
|
+
| Category | Tools |
|
|
45
|
+
| --------------- | ---------------------------------------------------------------------------------------------- |
|
|
46
|
+
| **Tabs** | `chrome_tab` (list/new/activate/close/version), `chrome_launch` |
|
|
47
|
+
| **Inspect** | `chrome_snapshot` (uids + selectors + text + viewport), `chrome_screenshot`, `chrome_evaluate` |
|
|
48
|
+
| **Navigate** | `chrome_navigate` (with optional `initScript` at `document_start`), `chrome_wait_for` |
|
|
49
|
+
| **Interact** | `chrome_click`, `chrome_type`, `chrome_fill`, `chrome_key`, `chrome_hover` |
|
|
50
|
+
| **Gesture** | `chrome_drag` (HTML5 DataTransfer), `chrome_scroll` (wheel + momentum), `chrome_tap` (touch) |
|
|
51
|
+
| **Files** | `chrome_upload_file` (no native picker; works with React/Vue/Angular file inputs) |
|
|
52
|
+
| **Observe** | `chrome_list_console_messages`, `chrome_list_network_requests`, `chrome_get_network_request` (with response body) |
|
|
33
53
|
|
|
34
|
-
|
|
54
|
+
Each tool is documented inline in Pi — agents see the parameters and the gotchas (synthetic vs. trusted, autoplay gates, file picker limits) without trial-and-error.
|
|
35
55
|
|
|
36
|
-
|
|
56
|
+
---
|
|
37
57
|
|
|
38
|
-
|
|
39
|
-
|
|
58
|
+
## 60-second install
|
|
59
|
+
|
|
60
|
+
```bash
|
|
61
|
+
pi install npm:pi-chrome
|
|
40
62
|
```
|
|
41
63
|
|
|
42
|
-
|
|
64
|
+
Then in Pi:
|
|
43
65
|
|
|
44
|
-
|
|
45
|
-
|
|
46
|
-
|
|
66
|
+
```text
|
|
67
|
+
/chrome onboard
|
|
68
|
+
```
|
|
47
69
|
|
|
48
|
-
|
|
70
|
+
On macOS this opens `chrome://extensions`, reveals the bundled `browser-extension/` folder in Finder, and copies its path to your clipboard. In Chrome: **Developer mode** → **Load unpacked** → paste the path. Done.
|
|
49
71
|
|
|
50
|
-
|
|
51
|
-
2. Click **Load unpacked**.
|
|
52
|
-
3. Select the revealed/copied `browser-extension` folder.
|
|
53
|
-
4. Return to Pi and run:
|
|
72
|
+
Verify:
|
|
54
73
|
|
|
55
74
|
```text
|
|
56
75
|
/chrome doctor
|
|
57
76
|
```
|
|
58
77
|
|
|
59
|
-
Expected output:
|
|
60
|
-
|
|
61
78
|
```text
|
|
62
79
|
Performing Chrome bridge health check
|
|
63
80
|
pi-chrome v<version>
|
|
@@ -65,122 +82,198 @@ pi-chrome v<version>
|
|
|
65
82
|
✓ Companion Chrome extension responding (ID: <chrome-extension-id>, ext v<version>)
|
|
66
83
|
```
|
|
67
84
|
|
|
68
|
-
|
|
69
|
-
|
|
70
|
-
pi-chrome can drive Chrome two ways:
|
|
85
|
+
---
|
|
71
86
|
|
|
72
|
-
|
|
73
|
-
- **Real-looking clicks** — indistinguishable from a person clicking. They unlock the cases above, but Chrome shows a *"Pi Chrome Connector started debugging this browser"* banner at the top of every tab pi-chrome touches while it's working.
|
|
74
|
-
|
|
75
|
-
Pick a mode with `/chrome clicks`:
|
|
87
|
+
## Try this in 30 seconds after install
|
|
76
88
|
|
|
77
89
|
```text
|
|
78
|
-
|
|
79
|
-
|
|
80
|
-
|
|
81
|
-
|
|
90
|
+
Use chrome_tab list to find my GitHub notifications tab.
|
|
91
|
+
chrome_snapshot it, then write a 5-bullet triage:
|
|
92
|
+
which PRs need my review today, sorted by staleness.
|
|
93
|
+
Do not click anything yet.
|
|
82
94
|
```
|
|
83
95
|
|
|
84
|
-
|
|
96
|
+
You'll watch the agent jump to your GitHub tab and read the page — using **your** session, **your** filters, **your** orgs.
|
|
85
97
|
|
|
86
|
-
|
|
98
|
+
---
|
|
87
99
|
|
|
88
|
-
##
|
|
100
|
+
## Killer recipes (copy-paste into Pi)
|
|
89
101
|
|
|
90
|
-
|
|
102
|
+
Each one assumes tabs you already have open + accounts you're already signed into.
|
|
91
103
|
|
|
92
|
-
|
|
104
|
+
**PR triage**
|
|
105
|
+
> Use chrome_tab list to find my GitHub notifications tab, snapshot it, summarize PRs needing my review.
|
|
93
106
|
|
|
94
|
-
|
|
95
|
-
|
|
96
|
-
/chrome quiet on # explicit
|
|
97
|
-
/chrome quiet off # explicit
|
|
98
|
-
```
|
|
107
|
+
**Linear standup**
|
|
108
|
+
> Open my Linear current cycle in the active tab, snapshot it, write a 5-bullet standup.
|
|
99
109
|
|
|
100
|
-
|
|
110
|
+
**Bug repro with evidence**
|
|
111
|
+
> Open the staging app I'm already signed into, reproduce \<bug>, save a screenshot of each step under `./repro/`.
|
|
101
112
|
|
|
102
|
-
|
|
113
|
+
**Form auto-fill (no submit)**
|
|
114
|
+
> Open the vendor portal, fill the new-vendor form from this JSON, stop before submit.
|
|
103
115
|
|
|
104
|
-
|
|
116
|
+
**Admin cross-check**
|
|
117
|
+
> Across my Stripe / Postmark / our admin tabs, find any user where state disagrees.
|
|
105
118
|
|
|
106
|
-
|
|
119
|
+
**Local dev visual diff**
|
|
120
|
+
> Snapshot `localhost:3000` and the staging URL of the same page; tell me what's visually different.
|
|
107
121
|
|
|
108
|
-
|
|
109
|
-
|
|
122
|
+
**Auth-only data pull**
|
|
123
|
+
> Open my analytics dashboard tab and `chrome_evaluate` to extract today's KPIs from page state.
|
|
124
|
+
|
|
125
|
+
**Network forensics**
|
|
126
|
+
> Reproduce the checkout bug, then use `chrome_list_network_requests` to find the failing call and dump its response body.
|
|
127
|
+
|
|
128
|
+
**File upload through React**
|
|
129
|
+
> Open the photo uploader, `chrome_upload_file` with `./fixtures/sample.png`, confirm preview rendered.
|
|
130
|
+
|
|
131
|
+
---
|
|
132
|
+
|
|
133
|
+
## Architecture
|
|
134
|
+
|
|
135
|
+
```
|
|
136
|
+
┌──────────────────────┐ ┌──────────────────────────┐
|
|
137
|
+
│ Pi agent (terminal) │ ─── http://127.0.0.1:17318 ─→ │ Chrome extension │
|
|
138
|
+
│ chrome_* tools │ │ (your real profile) │
|
|
139
|
+
└──────────┬───────────┘ └─────────┬────────────────┘
|
|
140
|
+
│ same machine │
|
|
141
|
+
▼ ▼
|
|
142
|
+
Other Pi sessions Tabs you already have open
|
|
143
|
+
share the same bridge (signed in to GitHub,
|
|
144
|
+
automatically Linear, Stripe, etc.)
|
|
110
145
|
```
|
|
111
146
|
|
|
112
|
-
|
|
147
|
+
Multiple Pi sessions (planner / worker / audit) can all drive the same Chrome at once. The first session opens the local bridge; later sessions detect it and pipe their commands through.
|
|
148
|
+
|
|
149
|
+
---
|
|
150
|
+
|
|
151
|
+
## Click & input modes
|
|
152
|
+
|
|
153
|
+
`pi-chrome` can drive Chrome two ways:
|
|
154
|
+
|
|
155
|
+
- **Quiet** — synthetic DOM events. Fast, no UI banners. Drives React/Vue/Angular state. Won't satisfy autoplay, clipboard, file picker, fullscreen, or user-activation gates.
|
|
156
|
+
- **Trusted** — `chrome.debugger` / CDP under the hood. Indistinguishable from a person clicking. Shows Chrome's *"Pi Chrome Connector started debugging this browser"* banner while active.
|
|
113
157
|
|
|
114
158
|
```text
|
|
115
|
-
|
|
159
|
+
/chrome clicks auto # default: quiet, upgrade to trusted only when needed
|
|
160
|
+
/chrome clicks off # always quiet, never banner
|
|
161
|
+
/chrome clicks on # always trusted, banner stays up
|
|
162
|
+
/chrome clicks status
|
|
116
163
|
```
|
|
117
164
|
|
|
118
|
-
|
|
165
|
+
Per-call `trusted: true / false` on any input tool wins over the global mode.
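The precedence described here (per-call flag first, then the global `/chrome clicks` mode) could be modeled as below. This is a sketch: `resolveTrusted` and the `needsUserActivation` hint standing in for the smart-auto heuristic are hypothetical names, not the package's API.

```typescript
type ClicksMode = "auto" | "on" | "off";

// Hypothetical resolver: decide whether one input call should use trusted (CDP) input.
function resolveTrusted(
  globalMode: ClicksMode,
  perCallTrusted: boolean | undefined,
  needsUserActivation: boolean
): boolean {
  // An explicit per-call `trusted` value wins over the global mode.
  if (perCallTrusted !== undefined) return perCallTrusted;
  if (globalMode === "on") return true;
  if (globalMode === "off") return false;
  // "auto": stay quiet unless the action appears to sit behind a user-activation gate.
  return needsUserActivation;
}

console.log(resolveTrusted("off", true, false)); // true
```

Keeping the decision in one pure function makes the override order easy to audit: call-level intent, then session mode, then heuristic.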
|
|
166
|
+
|
|
167
|
+
---
|
|
168
|
+
|
|
169
|
+
## Background / watch modes
|
|
170
|
+
|
|
171
|
+
By default, every `chrome_*` call focuses Chrome and activates the target tab so you can **watch the agent work** — invaluable for demos, debugging, and first-time confidence.
|
|
119
172
|
|
|
120
173
|
```text
|
|
121
|
-
|
|
174
|
+
/chrome quiet # toggle for the whole session
|
|
175
|
+
/chrome quiet on # explicit
|
|
176
|
+
/chrome quiet off # explicit
|
|
122
177
|
```
|
|
123
178
|
|
|
124
|
-
|
|
179
|
+
Per-call `background: true` wins over the session toggle.
|
|
125
180
|
|
|
126
|
-
|
|
181
|
+
---
|
|
127
182
|
|
|
128
|
-
|
|
129
|
-
- **Linear standup:** "Open my Linear current cycle in the active tab, snapshot it, and write me a 5-bullet standup."
|
|
130
|
-
- **Bug repro with evidence:** "Open the staging app I'm already signed into, reproduce <bug>, and save a screenshot of each step under ./repro/."
|
|
131
|
-
- **Form auto-fill (no submit):** "Open <vendor> portal, fill the new-vendor form from this JSON, but stop before submit."
|
|
132
|
-
- **Admin cross-check:** "Across my Stripe / Postmark / our admin tabs, find any user where state disagrees."
|
|
133
|
-
- **Local dev visual diff:** "Snapshot localhost:3000 and the staging URL of the same page; tell me what's visually different."
|
|
134
|
-
- **Auth-only data pull:** "Open my analytics dashboard tab and chrome_evaluate to extract today's KPIs from the page state."
|
|
183
|
+
## Honest results
|
|
135
184
|
|
|
136
|
-
|
|
185
|
+
Most browser-automation libraries return `void` or a generic ack. `pi-chrome` returns a structured envelope on every interaction:
|
|
137
186
|
|
|
138
|
-
|
|
187
|
+
```text
|
|
188
|
+
chrome_click(occluded-button) →
|
|
189
|
+
"Clicked el-3 — pageMutated=false; occluded by <div#overlay>"
|
|
190
|
+
```
|
|
139
191
|
|
|
140
|
-
|
|
192
|
+
```text
|
|
193
|
+
chrome_type(react-input, "hello") →
|
|
194
|
+
"Typed into el-7 — valueMatches=true; pageMutated=true"
|
|
195
|
+
```
|
|
141
196
|
|
|
142
|
-
|
|
197
|
+
This is why agents using pi-chrome don't get stuck in retry loops on broken sites. They get the **reason** the action didn't land and can fix course in one turn.
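To make the idea concrete, here is a sketch of how an agent loop might branch on such an envelope. The field names come from the README; the TypeScript types, the `nextStep` function, and its return strings are assumptions made for illustration.

```typescript
// Field names follow the README; the types are assumed.
interface ActionEnvelope {
  pageMutated: boolean;
  defaultPrevented: boolean;
  elementVisible: boolean;
  occludedBy?: string;    // e.g. "div#overlay" when something covers the target
  valueMatches?: boolean; // only meaningful for input tools
}

// Hypothetical policy: turn an envelope into one suggested follow-up.
function nextStep(result: ActionEnvelope): string {
  if (!result.elementVisible) return "scroll the element into view and retry";
  if (result.occludedBy) return `dismiss ${result.occludedBy} and retry`;
  if (result.valueMatches === false) return "re-enter the value";
  if (result.defaultPrevented && !result.pageMutated) {
    return "the page blocked the event; consider trusted input";
  }
  return result.pageMutated ? "continue" : "re-snapshot to confirm nothing changed";
}

console.log(
  nextStep({ pageMutated: false, defaultPrevented: false, elementVisible: true, occludedBy: "div#overlay" })
); // dismiss div#overlay and retry
```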
|
|
143
198
|
|
|
144
|
-
|
|
199
|
+
---
|
|
145
200
|
|
|
146
|
-
|
|
147
|
-
- **pi-bar** — watch context pressure as the agent scrapes large pages; the footer's red threshold is a clean signal to `/qq` for a recap before context overflows.
|
|
148
|
-
- **PR demo skills** (such as `ios-pr-agent` / `ios-demo-record` workflows) — `chrome_screenshot` writes to `.pi/chrome-screenshots/` so you can attach images to PR descriptions or demo bundles.
|
|
201
|
+
## Diagnostics
|
|
149
202
|
|
|
150
|
-
|
|
203
|
+
- `/chrome doctor` — single command: connectivity, extension version, bridge owner, version drift, MAIN-world helper injection, `chrome_evaluate("1+1") === 2`, fingerprint flags.
|
|
204
|
+
- `/chrome onboard` — guided first-time setup.
|
|
205
|
+
- `/chrome quiet status`, `/chrome clicks status` — current modes.
|
|
151
206
|
|
|
152
|
-
|
|
207
|
+
If the loaded Chrome extension is older than `pi-chrome` on disk, `/chrome doctor` tells you to reload it from `chrome://extensions`.
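The drift check can be thought of as a plain version comparison between the package on disk and the loaded extension. The sketch below assumes simple `major.minor.patch` strings and uses hypothetical `compareVersions` and `driftMessage` helpers; how `/chrome doctor` actually compares versions is not shown in this diff.

```typescript
// Compare two "major.minor.patch" strings; returns -1, 0, or 1.
function compareVersions(a: string, b: string): number {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < 3; i++) {
    const x = pa[i] ?? 0;
    const y = pb[i] ?? 0;
    if (isNaN(x) || isNaN(y)) throw new Error(`not a numeric version: ${a} / ${b}`);
    if (x !== y) return x < y ? -1 : 1;
  }
  return 0;
}

// Hypothetical doctor-style message when the extension is older than the package.
function driftMessage(packageVersion: string, extensionVersion: string): string | null {
  return compareVersions(extensionVersion, packageVersion) < 0
    ? `Extension v${extensionVersion} is older than pi-chrome v${packageVersion}; reload it from chrome://extensions`
    : null;
}

console.log(driftMessage("0.14.4", "0.14.4")); // null
```

Comparing numerically per segment matters here: a string comparison would rank `0.9.0` above `0.14.0`.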
|
|
153
208
|
|
|
154
|
-
|
|
155
|
-
- `chrome_tab` — list, create, activate, close, or inspect tabs
|
|
156
|
-
- `chrome_snapshot` — inspect title, URL, visible text, viewport, and clickable/focusable selectors
|
|
157
|
-
- `chrome_navigate` — navigate an existing tab
|
|
158
|
-
- `chrome_evaluate` — evaluate JavaScript in a tab
|
|
159
|
-
- `chrome_click` — click by selector or coordinates
|
|
160
|
-
- `chrome_type` — type text, optionally focusing a selector first
|
|
161
|
-
- `chrome_key` — send keyboard keys
|
|
162
|
-
- `chrome_wait_for` — wait for a selector or expression
|
|
163
|
-
- `chrome_screenshot` — capture viewport screenshots to disk
|
|
209
|
+
---
|
|
164
210
|
|
|
165
|
-
|
|
211
|
+
## Composes with
|
|
166
212
|
|
|
167
|
-
|
|
168
|
-
|
|
213
|
+
- **[pi-qq](https://www.npmjs.com/package/pi-qq)** — `/qq summarize what the active GitHub tab shows` without polluting the main transcript.
|
|
214
|
+
- **[pi-bar](https://www.npmjs.com/package/pi-bar)** — when the agent scrapes large pages, watch the context-usage segment turn yellow → red as a signal to `/qq` for a recap.
|
|
215
|
+
- **PR demo skills** — screenshots write to `.pi/chrome-screenshots/` so you can attach them to PR descriptions or demo bundles.
|
|
169
216
|
|
|
170
|
-
|
|
217
|
+
---
|
|
171
218
|
|
|
172
|
-
|
|
219
|
+
## Why an unpacked Chrome extension?
|
|
173
220
|
|
|
174
|
-
|
|
221
|
+
`pi-chrome` cannot ship through the Chrome Web Store — a Web Store extension cannot talk to a local bridge controlled by another tool on the same machine. So it ships as a small MIT-licensed unpacked extension in [`extensions/chrome-profile-bridge/browser-extension/`](./extensions/chrome-profile-bridge/browser-extension). **Read the source before loading.** `/chrome doctor` reports the extension version and warns when it drifts from your installed `pi-chrome`.
|
|
175
222
|
|
|
176
|
-
|
|
223
|
+
---
|
|
177
224
|
|
|
178
225
|
## Security model
|
|
179
226
|
|
|
180
|
-
The companion
|
|
227
|
+
The companion extension runs in the Chrome profile where you install it and has broad tab/scripting permissions. Only install it from a package source you trust.
|
|
181
228
|
|
|
182
|
-
The Pi side listens on `127.0.0.1:17318` by default. Override before starting Pi
|
|
229
|
+
The Pi side listens on `127.0.0.1:17318` by default. Override before starting Pi:
|
|
183
230
|
|
|
184
231
|
```bash
|
|
185
232
|
PI_CHROME_BRIDGE_PORT=17319 pi
|
|
186
233
|
```
|
|
234
|
+
|
|
235
|
+
There is no network exposure; the bridge binds to loopback only.
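A sketch of how an override like this is commonly resolved, assuming the variable holds a decimal TCP port. `resolveBridgeUrl` is a hypothetical helper; only the variable name, the default port, and the loopback address come from the text above.

```typescript
const DEFAULT_BRIDGE_PORT = 17318;

// Resolve the bridge URL from an environment-like map, falling back to the default port.
function resolveBridgeUrl(env: Record<string, string | undefined>): string {
  const raw = env["PI_CHROME_BRIDGE_PORT"];
  const port = raw === undefined || raw.trim() === "" ? DEFAULT_BRIDGE_PORT : Number(raw);
  // Reject NaN, fractions, and anything outside the valid TCP port range.
  if (Math.floor(port) !== port || port < 1 || port > 65535) {
    throw new RangeError(`invalid PI_CHROME_BRIDGE_PORT: ${raw}`);
  }
  // Loopback only: the host is fixed and never read from the environment.
  return `http://127.0.0.1:${port}`;
}

console.log(resolveBridgeUrl({ PI_CHROME_BRIDGE_PORT: "17319" })); // http://127.0.0.1:17319
```

Passing the environment in as a plain map keeps the function easy to test without touching the real process environment.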
|
|
236
|
+
|
|
237
|
+
---
|
|
238
|
+
|
|
239
|
+
## Built-in benchmark suite
|
|
240
|
+
|
|
241
|
+
[`test-suite/`](./test-suite) is a static page benchmark for **any** browser-control agent (not just pi-chrome). Each challenge exposes `window.__verdict` / `window.__reason` / `window.__events` and a manifest entry with expected results per mode (`synthetic`, `trusted`, `manual`).
|
|
242
|
+
|
|
243
|
+
```bash
|
|
244
|
+
cd test-suite && python3 -m http.server 8765
|
|
245
|
+
# open http://127.0.0.1:8765/ in the Chrome window pi-chrome controls
|
|
246
|
+
```
|
|
247
|
+
|
|
248
|
+
Categories: `trusted-input`, `pointer-humanization`, `keyboard`, `activation-gates`, `scroll`, `drag-drop`, `clipboard`, `native-controls`, `frameworks`, `editing`, `dom-complexity`, `frames`, `files`, `observability`, `fingerprint`, `agent-safety`.
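The challenge contract described in this section (page globals plus per-mode expectations in the manifest) could be consumed by a runner roughly like this. Everything here is a sketch under assumptions: the `"pass"`/`"fail"` verdict values, the `ManifestEntry` shape, and the `grade` function are hypothetical and may not match the real suite.

```typescript
type Mode = "synthetic" | "trusted" | "manual";
type Verdict = "pass" | "fail"; // assumed values; the real suite may use others

// Assumed manifest entry shape: expected verdict per mode.
interface ManifestEntry {
  id: string;
  expected: Record<Mode, Verdict>;
}

// What a runner might read back from window.__verdict and window.__reason.
interface ChallengeState {
  verdict: Verdict;
  reason: string;
}

// Hypothetical grader: does the observed verdict match what the manifest expects?
function grade(entry: ManifestEntry, mode: Mode, state: ChallengeState): string {
  const expected = entry.expected[mode];
  return state.verdict === expected
    ? `${entry.id} [${mode}]: as expected (${expected})`
    : `${entry.id} [${mode}]: expected ${expected}, got ${state.verdict} (${state.reason})`;
}
```

Grading against a per-mode expectation, rather than a bare pass, lets a challenge that is supposed to fail under synthetic input still count as correct behavior.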
|
|
249
|
+
|
|
250
|
+
If you build a competing tool, please open a PR with your scores. We benchmark in public.
|
|
251
|
+
|
|
252
|
+
---
|
|
253
|
+
|
|
254
|
+
## Roadmap signals
|
|
255
|
+
|
|
256
|
+
`pi-chrome` is actively shipped. Things on the near roadmap:
|
|
257
|
+
|
|
258
|
+
- More observability tools (DOM mutation streams, performance traces)
|
|
259
|
+
- First-class iframe + Shadow-DOM uid stability across snapshots
|
|
260
|
+
- Web Push & service worker introspection
|
|
261
|
+
- Recorder mode that emits agent prompts from your own clicks
|
|
262
|
+
|
|
263
|
+
If you want one of those next, open an issue.
|
|
264
|
+
|
|
265
|
+
---
|
|
266
|
+
|
|
267
|
+
## Contributing
|
|
268
|
+
|
|
269
|
+
PRs welcome. The bar:
|
|
270
|
+
|
|
271
|
+
1. Add a benchmark page in `test-suite/` that fails before your change and passes after.
|
|
272
|
+
2. Keep `chrome_*` tool results honest — surface `pageMutated`, `valueMatches`, `defaultPrevented`, etc.
|
|
273
|
+
3. Don't break the "no re-login" guarantee. Anything that requires a fresh profile is out of scope.
|
|
274
|
+
|
|
275
|
+
---
|
|
276
|
+
|
|
277
|
+
## License
|
|
278
|
+
|
|
279
|
+
MIT. See [LICENSE](./LICENSE).
|
package/SECURITY.md
ADDED
|
@@ -0,0 +1,38 @@
|
|
|
1
|
+
# Security policy
|
|
2
|
+
|
|
3
|
+
## Reporting a vulnerability
|
|
4
|
+
|
|
5
|
+
Open a GitHub issue prefixed with `[security]` at https://github.com/tianrendong/pi-packs/issues, or contact the maintainer directly if the issue is sensitive. Please do **not** include exploit details in a public issue without coordinating first.
|
|
6
|
+
|
|
7
|
+
## Threat model
|
|
8
|
+
|
|
9
|
+
`pi-chrome` is a developer tool you install knowingly. It is **not** designed to defend against:
|
|
10
|
+
|
|
11
|
+
- Hostile pages running in your Chrome trying to detect or escape automation. (Standard browser security boundaries still apply, but a hostile page that already runs in your tab can do anything that page can already do.)
|
|
12
|
+
- Other processes on your local machine. The bridge binds to `127.0.0.1:17318` (loopback only) but **does not authenticate** local callers. Any process running as your user can issue commands. If your threat model includes hostile local processes, run pi-chrome on a separate user account.
|
|
13
|
+
|
|
14
|
+
`pi-chrome` **is** designed to:
|
|
15
|
+
|
|
16
|
+
- Never exfiltrate page state to the network. All communication is loopback (`127.0.0.1`).
|
|
17
|
+
- Surface every action with an honest result envelope so the agent can't silently do the wrong thing.
|
|
18
|
+
- Require explicit opt-in for trusted-input mode (`/chrome clicks on` or `trusted: true`), which uses `chrome.debugger` and shows Chrome's banner.
|
|
19
|
+
|
|
20
|
+
## The companion extension
|
|
21
|
+
|
|
22
|
+
The Chrome extension under `extensions/chrome-profile-bridge/browser-extension/` runs with broad permissions: `tabs`, `scripting`, `debugger`, `webNavigation`, etc. **Only install it from a package source you trust.** Read the source before loading. Pin a known-good commit if you're security-sensitive.
|
|
23
|
+
|
|
24
|
+
## Defaults
|
|
25
|
+
|
|
26
|
+
- Loopback bridge only. No remote port. No telemetry.
|
|
27
|
+
- Synthetic events first; trusted CDP only when explicitly enabled.
|
|
28
|
+
- Quiet mode optional; tab/window focus is observable (the user can see Pi acting).
|
|
29
|
+
|
|
30
|
+
## Override the port
|
|
31
|
+
|
|
32
|
+
```bash
|
|
33
|
+
PI_CHROME_BRIDGE_PORT=17319 pi
|
|
34
|
+
```
|
|
35
|
+
|
|
36
|
+
## Supported versions
|
|
37
|
+
|
|
38
|
+
The latest minor on npm is supported. Security patches will be released as soon as practical.
|
package/docs/COMPARISON.md
ADDED
|
@@ -0,0 +1,87 @@
|
|
|
1
|
+
# pi-chrome vs. other browser-automation stacks
|
|
2
|
+
|
|
3
|
+
This is the "should I use pi-chrome?" page. Honest comparisons. We benchmark in public — see [`../test-suite/`](../test-suite).
|
|
4
|
+
|
|
5
|
+
## TL;DR
|
|
6
|
+
|
|
7
|
+
| You are… | Use… |
|
|
8
|
+
| -------------------------------------------------------------- | ------------------------------- |
|
|
9
|
+
| A Pi agent operator who wants the agent to use **your real Chrome** (logged-in tabs, cookies, extensions) | **pi-chrome** |
|
|
10
|
+
| Writing deterministic end-to-end tests in CI | Playwright / Cypress |
|
|
11
|
+
| Building a hosted scraping fleet on isolated profiles | Playwright / Puppeteer / Browserless |
|
|
12
|
+
| Building a "general web agent" SaaS that fetches its own pages | browser-use / Stagehand / Skyvern |
|
|
13
|
+
| Debugging your own app from your editor without leaving your real session | **pi-chrome** |
|
|
14
|
+
|
|
15
|
+
Different tools, different jobs. `pi-chrome` is the only one that says: *"don't make me sign in again to do work I'd do as myself."*
|
|
16
|
+
|
|
17
|
+
---
|
|
18
|
+
|
|
19
|
+
## Feature matrix
|
|
20
|
+
|
|
21
|
+
| | **pi-chrome** | Playwright | Puppeteer | Stagehand | browser-use | Selenium |
|
|
22
|
+
| ---------------------------------------------- | ----------------- | ------------------- | ------------------- | ------------------- | ------------------- | ------------------- |
|
|
23
|
+
| Drives your **real, signed-in** Chrome profile | ✅ | ❌ throwaway | ❌ throwaway | ❌ throwaway | ❌ throwaway | ❌ throwaway |
|
|
24
|
+
| Zero re-login | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
|
|
25
|
+
| First-class Pi tool integration | ✅ 20 tools | n/a | n/a | n/a | n/a | n/a |
|
|
26
|
+
| Synthetic input (fast, no banner) | ✅ | ❌ always trusted | ❌ | ❌ | ❌ | ❌ |
|
|
27
|
+
| Trusted (CDP) input on demand | ✅ opt-in | ✅ always | ✅ always | ✅ | ✅ | ✅ |
|
|
28
|
+
| Honest result envelopes (`pageMutated`, `occludedBy`, `valueMatches`) | ✅ | ⚠️ partial | ⚠️ partial | ❌ | ❌ | ❌ |
|
|
29
|
+
| Network capture w/ response body | ✅ | ✅ | ✅ | ⚠️ | ⚠️ | ⚠️ |
|
|
30
|
+
| Console capture | ✅ | ✅ | ✅ | ⚠️ | ⚠️ | ⚠️ |
|
|
31
|
+
| Real touch events (mobile PWAs) | ✅ `chrome_tap` | ✅ | ✅ | ⚠️ | ⚠️ | ❌ |
|
|
32
|
+
| File upload through React/Vue controlled inputs | ✅ no native picker | ✅ | ✅ | ✅ | ⚠️ | ⚠️ |
|
|
33
|
+
| HTML5 drag/drop with `DataTransfer` | ✅ | ✅ | ⚠️ | ⚠️ | ⚠️ | ⚠️ |
|
|
34
|
+
| Multi-session shared bridge | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
|
|
35
|
+
| MIT-licensed | ✅ | Apache-2 | Apache-2 | MIT | MIT | Apache-2 |
|
|
36
|
+
| Watch the agent work, live | ✅ default | ❌ headless / new | ❌ | ❌ | ❌ | ❌ |
|
|
37
|
+
|
|
38
|
+
---
|
|
39
|
+
|
|
40
|
+
## "But Playwright has a `storageState` option"
|
|
41
|
+
|
|
42
|
+
Yes — you can export cookies and replay them. In practice for agent workflows that breaks down fast:
|
|
43
|
+
|
|
44
|
+
1. **OAuth + SSO** providers (Okta, Google, GitHub) often pin sessions to TLS fingerprints, device IDs, and browser-extension state that doesn't survive replay.
|
|
45
|
+
2. **MFA** tokens expire. Re-prompts in the middle of an agent run = dead end.
|
|
46
|
+
3. **Internal admin tools** often hard-pin to your real device for security.
|
|
47
|
+
4. **Your editor / IDE / colleagues** see your real Chrome, not a hidden one. Watching an agent in a separate window is jarring.
|
|
48
|
+
|
|
49
|
+
`pi-chrome` sidesteps all of this by living **inside** the Chrome profile you're already logged into. The agent operates as you, because it literally is you (in your tabs).
|
|
50
|
+
|
|
51
|
+
---
|
|
52
|
+
|
|
53
|
+
## "Is this safer than CDP?"
|
|
54
|
+
|
|
55
|
+
It's a different security boundary, not strictly safer. Trade-offs:
|
|
56
|
+
|
|
57
|
+
- **CDP-based tools** require `chrome --remote-debugging-port`. That port is unauthenticated and exposes the whole browser to any local process. Easy to misconfigure.
|
|
58
|
+
- **pi-chrome** runs through an extension you install yourself with broad permissions (tabs, scripting, debugger). The bridge listens on `127.0.0.1:17318` loopback only. **Only install the bundled extension if you trust the source you got the npm package from.**
|
|
59
|
+
|
|
60
|
+
If your threat model excludes extensions with broad permissions, neither approach is a fit — you want a sandboxed CI runner.
|
|
61
|
+
|
|
62
|
+
---
|
|
63
|
+
|
|
64
|
+
## When NOT to use pi-chrome
|
|
65
|
+
|
|
66
|
+
- **Headless production scraping at scale** — you want isolated profiles, parallel containers, retries. Use Playwright or Browserless.
|
|
67
|
+
- **Cross-browser testing** — pi-chrome is Chrome only by design (it's a Chrome extension). Use Playwright for Firefox/WebKit.
|
|
68
|
+
- **Untrusted page execution** — pi-chrome runs inside your real, signed-in profile, so a hostile page would share that profile. Use a sandbox.
|
|
69
|
+
- **You don't already use Chrome** — the whole value proposition is "your real Chrome". If your day is in Firefox, you'll get little out of it. Arc, Brave, and other Chromium-based browsers work fine technically.
|
|
70
|
+
|
|
71
|
+
---
|
|
72
|
+
|
|
73
|
+
## Reproducing this table
|
|
74
|
+
|
|
75
|
+
Run [`../test-suite/`](../test-suite) against your tool of choice and open a PR with your scores. The benchmark covers:
|
|
76
|
+
|
|
77
|
+
- trusted-input gates (clipboard, fullscreen, file picker, autoplay)
|
|
78
|
+
- pointer humanization (entropy, paths, timing)
|
|
79
|
+
- keyboard fidelity (Tab flows, modifier chords, IME)
|
|
80
|
+
- React-style value tracking and contenteditable
|
|
81
|
+
- Shadow DOM + iframe targeting
|
|
82
|
+
- file attachment to `<input type=file>`
|
|
83
|
+
- console & network observability
|
|
84
|
+
- fingerprint leaks (`navigator.webdriver`, runtime flags)
|
|
85
|
+
- agent-safety honeypots
|
|
86
|
+
|
|
87
|
+
`window.__verdict` / `window.__reason` / `window.__events` make it deterministic for any tool to grade itself.
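A harness could collect those globals after a challenge finishes roughly as follows. This is a sketch under assumptions: `collectVerdict` is a hypothetical helper, and the value shapes (`__verdict` and `__reason` as strings, `__events` as an array) are guesses that the suite does not confirm here. Only the three global names come from the text above.

```javascript
// Hypothetical helper: reads the three grading globals from a window-like object.
function collectVerdict(win) {
  return {
    verdict: win.__verdict ?? null, // assumed to be a short string
    reason: win.__reason ?? null, // assumed to be a human-readable string
    // Assumed to be an array; copied so later page activity cannot mutate it.
    events: Array.isArray(win.__events) ? [...win.__events] : [],
  };
}

// In a browser context `win` would be `window`; a plain object stands in here.
// The "pass" value is illustrative, not a documented verdict string.
const sample = collectVerdict({ __verdict: "pass", __events: ["click"] });
console.log(JSON.stringify(sample));
```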
|
package/docs/EXAMPLES.md
ADDED
|
@@ -0,0 +1,166 @@
|
|
|
1
|
+
# pi-chrome examples
|
|
2
|
+
|
|
3
|
+
Real, useful agent prompts. Drop any of these into Pi after running `/chrome onboard`. Each one uses Chrome tabs and accounts you already have.
|
|
4
|
+
|
|
5
|
+
## Daily workflow
|
|
6
|
+
|
|
7
|
+
### PR triage
|
|
8
|
+
|
|
9
|
+
```text
|
|
10
|
+
Use chrome_tab list to find my GitHub notifications tab.
|
|
11
|
+
chrome_snapshot it. Group PRs by:
|
|
12
|
+
- awaiting my review
|
|
13
|
+
- blocked on me (changes requested back)
|
|
14
|
+
- mergeable (approved + green CI)
|
|
15
|
+
Output a 5-bullet ranked triage. Do not click anything.
|
|
16
|
+
```
|
|
17
|
+
|
|
18
|
+
### Linear standup
|
|
19
|
+
|
|
20
|
+
```text
|
|
21
|
+
Open my Linear current cycle in the active tab.
|
|
22
|
+
chrome_snapshot, then write yesterday/today/blockers
|
|
23
|
+
in the exact format my standup channel uses.
|
|
24
|
+
```
|
|
25
|
+
|
|
26
|
+
### Slack catch-up
|
|
27
|
+
|
|
28
|
+
```text
|
|
29
|
+
For each unread channel in my Slack tab, chrome_snapshot,
|
|
30
|
+
extract the top 3 messages that mention me or my team,
|
|
31
|
+
and summarize what I missed in <100 words total.
|
|
32
|
+
```
|
|
33
|
+
|
|
34
|
+
## Debugging
|
|
35
|
+
|
|
36
|
+
### Reproduce a customer bug with evidence
|
|
37
|
+
|
|
38
|
+
```text
|
|
39
|
+
1. chrome_navigate to https://staging.acme.com/orders/<id>
|
|
40
|
+
2. chrome_snapshot
|
|
41
|
+
3. Click "Refund" with chrome_click
|
|
42
|
+
4. Use chrome_list_network_requests to capture the API call
|
|
43
|
+
5. chrome_get_network_request on the failing one — give me the response body
|
|
44
|
+
6. chrome_screenshot the error state to ./repro/refund-bug.png
|
|
45
|
+
```
|
|
46
|
+
|
|
47
|
+
### Visual diff local vs staging
|
|
48
|
+
|
|
49
|
+
```text
|
|
50
|
+
chrome_screenshot http://localhost:3000/pricing → ./diff/local.png
|
|
51
|
+
chrome_screenshot https://staging.acme.com/pricing → ./diff/staging.png
|
|
52
|
+
Then read both, describe layout differences in plain English.
|
|
53
|
+
```
|
|
54
|
+
|
|
55
|
+
### Console + network forensics
|
|
56
|
+
|
|
57
|
+
```text
|
|
58
|
+
Reproduce the checkout bug on the active tab.
|
|
59
|
+
After the failure:
|
|
60
|
+
- chrome_list_console_messages
|
|
61
|
+
- chrome_list_network_requests
|
|
62
|
+
Cross-reference the timestamps and tell me what broke first.
|
|
63
|
+
```
|
|
64
|
+
|
|
65
|
+
## Admin / ops
|
|
66
|
+
|
|
67
|
+
### Multi-tab cross-check
|
|
68
|
+
|
|
69
|
+
```text
|
|
70
|
+
I have Stripe, Postmark, and our internal admin open in 3 tabs.
|
|
71
|
+
For user <id>, chrome_snapshot each tab in turn and find
|
|
72
|
+
any field where state disagrees. Output a 3-column table.
|
|
73
|
+
```
|
|
74
|
+
|
|
75
|
+
### Bulk gentle action (safe form-fill, no submit)
|
|
76
|
+
|
|
77
|
+
```text
|
|
78
|
+
Open our vendor portal "Add Vendor" form.
|
|
79
|
+
For each row in ./vendors.csv:
|
|
80
|
+
- chrome_fill the form
|
|
81
|
+
- chrome_screenshot it
|
|
82
|
+
- STOP before submit
|
|
83
|
+
- chrome_evaluate "history.back()" to return to the list
|
|
84
|
+
I will review screenshots and submit manually.
|
|
85
|
+
```
|
|
86
|
+
|
|
87
|
+
### Auth-only data pull
|
|
88
|
+
|
|
89
|
+
```text
|
|
90
|
+
My analytics dashboard is open and the cookie auth would die in headless mode.
|
|
91
|
+
chrome_evaluate to read window.__APP_STATE__.dashboardData
|
|
92
|
+
and dump today's KPIs as JSON.
|
|
93
|
+
```
|
|
94
|
+
|
|
95
|
+
## Demos / PRs
|
|
96
|
+
|
|
97
|
+
### Capture screenshots for a PR description
|
|
98
|
+
|
|
99
|
+
```text
|
|
100
|
+
On localhost:3000/feature-x:
|
|
101
|
+
- empty state → ./pr/01-empty.png
|
|
102
|
+
- filled state → ./pr/02-filled.png
|
|
103
|
+
- error state (delete the API key from devtools first) → ./pr/03-error.png
|
|
104
|
+
Save each with chrome_screenshot. Output a markdown block I can paste into the PR.
|
|
105
|
+
```
|
|
106
|
+
|
|
107
|
+
### Record a guided demo flow
|
|
108
|
+
|
|
109
|
+
```text
|
|
110
|
+
On my staging app:
|
|
111
|
+
1. Walk the new-onboarding flow start to finish
|
|
112
|
+
2. After each chrome_click or chrome_navigate, chrome_screenshot
|
|
113
|
+
3. Save numbered PNGs under ./demo/
|
|
114
|
+
4. Write narration captions for each step
|
|
115
|
+
```
|
|
116
|
+
|
|
117
|
+
## Forms with frameworks
|
|
118
|
+
|
|
119
|
+
### React controlled inputs
|
|
120
|
+
|
|
121
|
+
```text
|
|
122
|
+
chrome_fill (not chrome_type) for React inputs — it uses the
|
|
123
|
+
framework-aware native value setter so the form's state actually updates.
|
|
124
|
+
After each fill, the result envelope's valueMatches=true confirms the
|
|
125
|
+
component re-rendered with the new value.
|
|
126
|
+
```
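The "native value setter" idea mentioned in the prompt above is a widely used technique for framework-controlled inputs. A minimal sketch follows. It is not taken from pi-chrome's source: `setNativeValue` is a hypothetical name, and the real `chrome_fill` may work differently.

```javascript
// Hypothetical helper illustrating the general technique, not pi-chrome's code.
function setNativeValue(element, value) {
  // Frameworks such as React may shadow `value` on the element instance,
  // so look up the original setter on the prototype instead.
  const proto = Object.getPrototypeOf(element);
  const descriptor = Object.getOwnPropertyDescriptor(proto, "value");
  if (descriptor && typeof descriptor.set === "function") {
    descriptor.set.call(element, value);
  } else {
    element.value = value; // fallback when no prototype setter exists
  }
  // Notify listeners so the framework re-reads the value.
  element.dispatchEvent(new Event("input", { bubbles: true }));
  // Same idea as the envelope's valueMatches check: did the value stick?
  return element.value === value;
}
```

In a page this would be called with a real `<input>` or `<textarea>` element. The boolean it returns plays the same role as `valueMatches` in the result envelope.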
|
|
127
|
+
|
|
128
|
+
### File upload without the native picker
|
|
129
|
+
|
|
130
|
+
```text
|
|
131
|
+
chrome_upload_file paths=[./fixtures/avatar.png] selector="input[type=file]"
|
|
132
|
+
# No native file picker opens. Works with React/Vue/Angular controlled inputs.
|
|
133
|
+
```
|
|
134
|
+
|
|
135
|
+
### Drag-to-reorder lists
|
|
136
|
+
|
|
137
|
+
```text
|
|
138
|
+
chrome_drag fromUid=row-3 toUid=row-1
|
|
139
|
+
# Fires real HTML5 dragstart/dragover/drop with a shared DataTransfer.
|
|
140
|
+
```
|
|
141
|
+
|
|
142
|
+
## Multi-session patterns
|
|
143
|
+
|
|
144
|
+
`pi-chrome` shares one bridge across all Pi sessions on the same machine. Useful patterns:
|
|
145
|
+
|
|
146
|
+
### Planner + Worker
|
|
147
|
+
|
|
148
|
+
- **Planner session** stays high level: "find the bug, decide the fix."
|
|
149
|
+
- **Worker session** runs the actual `chrome_*` tools.
|
|
150
|
+
- Both see the same Chrome state because they're both pointing at your real profile.
|
|
151
|
+
|
|
152
|
+
### Watcher
|
|
153
|
+
|
|
154
|
+
A third Pi session can run `chrome_snapshot` periodically in `background: true` mode and post summaries via `pi-qq` — handy for long-running flows.
|
|
155
|
+
|
|
156
|
+
## When to prefer trusted clicks
|
|
157
|
+
|
|
158
|
+
Pass `trusted: true` on `chrome_click` (or run `/chrome clicks on`) when:
|
|
159
|
+
|
|
160
|
+
- the click should open a file picker
|
|
161
|
+
- the click should write to the clipboard or read it
|
|
162
|
+
- the click should start an audio/video play
|
|
163
|
+
- the click should request fullscreen / push permission
|
|
164
|
+
- the page is wrapped in a strict user-activation guard (some paywalls / login flows)
|
|
165
|
+
|
|
166
|
+
Everything else is faster and quieter without it.
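The checklist above can be encoded as a small decision helper on the agent side. This is an illustrative sketch: `clickOptions` and the intent labels are hypothetical names invented here, and only the `trusted: true` option comes from the text.

```javascript
// Hypothetical intent labels, one per case in the list above.
const TRUSTED_INTENTS = new Set([
  "open-file-picker",
  "clipboard",
  "media-play",
  "fullscreen",
  "push-permission",
  "user-activation-guard",
]);

function clickOptions(intent) {
  // Synthetic input stays the default because it is faster and shows no banner.
  return TRUSTED_INTENTS.has(intent) ? { trusted: true } : {};
}

console.log(clickOptions("clipboard"), clickOptions("submit-form"));
```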
|
package/docs/FAQ.md
ADDED
|
@@ -0,0 +1,79 @@
|
|
|
1
|
+
# pi-chrome FAQ
|
|
2
|
+
|
|
3
|
+
## Does this work with Brave / Arc / Edge / Vivaldi?
|
|
4
|
+
|
|
5
|
+
Yes. Any Chromium-based browser that supports unpacked extensions and the `chrome.debugger` API will work. The extension is named "Pi Chrome Connector" but the source is browser-agnostic. Firefox / WebKit are out of scope (different extension models).
|
|
6
|
+
|
|
7
|
+
## Will it slow my browser down?
|
|
8
|
+
|
|
9
|
+
The companion extension is idle when no Pi command is in flight. It uses Manifest V3 service worker activation, so it wakes for a request and goes back to sleep. No content script is injected globally.
|
|
10
|
+
|
|
11
|
+
## Does it work in Chrome incognito?
|
|
12
|
+
|
|
13
|
+
By default no — extensions need explicit "Allow in incognito" permission. Toggle it on `chrome://extensions` if you want pi-chrome to see incognito tabs. We don't recommend it for sensitive work.
|
|
14
|
+
|
|
15
|
+
## Will sites detect that I'm automating?
|
|
16
|
+
|
|
17
|
+
For **synthetic** (quiet) input: yes, technically. `event.isTrusted` is `false`. Most sites don't check; some anti-bot defenses do.
|
|
18
|
+
|
|
19
|
+
For **trusted** (CDP) input: events are `isTrusted=true`, pointer paths are humanized, and key cadence has variance. Most fingerprint-based detectors don't fire, though some specifically check for an attached debugger. Chrome also shows its debugging banner while `chrome.debugger` is attached. That banner is the visible cost of trusted mode.
|
|
20
|
+
|
|
21
|
+
The [`test-suite/`](../test-suite) grades both modes against common detection signals.
|
|
22
|
+
|
|
23
|
+
## Why do I see a banner saying "Pi Chrome Connector started debugging this browser"?
|
|
24
|
+
|
|
25
|
+
That's Chrome's built-in warning when an extension uses `chrome.debugger`. pi-chrome uses it only in trusted-input mode. If you don't want to see it, run `/chrome clicks off` and accept that some sign-in flows / file pickers / clipboard ops won't work.
|
|
26
|
+
|
|
27
|
+
## Can a malicious page escape and access my other tabs?
|
|
28
|
+
|
|
29
|
+
No — pages cannot directly talk to extensions. Commands flow agent → local bridge (`127.0.0.1:17318`) → extension → tab. The bridge binds to loopback only. The risk surface is **other local processes** that could connect to `127.0.0.1:17318` and impersonate Pi. If that's in your threat model, run pi-chrome on a separate user account.
|
|
30
|
+
|
|
31
|
+
## Can multiple Pi sessions use it at once?
|
|
32
|
+
|
|
33
|
+
Yes. The first session opens the local bridge; later sessions detect it and pipe their commands through the same bridge. Planner + worker + audit can all drive the same Chrome concurrently.
|
|
34
|
+
|
|
35
|
+
## Why can't this be on the Chrome Web Store?
|
|
36
|
+
|
|
37
|
+
Web Store extensions cannot communicate with a local process bridge controlled by another tool — Google's policy. pi-chrome must ship as an unpacked extension you load yourself. The upside: you can read the source. The downside: each Chrome update may prompt you to re-confirm.
|
|
38
|
+
|
|
39
|
+
## What happens when I update pi-chrome?
|
|
40
|
+
|
|
41
|
+
`/chrome doctor` will warn you if the loaded extension is older than the installed `pi-chrome`. Reload it from `chrome://extensions` to pick up the new version. Trusted-input mode in particular requires re-approving the `debugger` permission once.
|
|
42
|
+
|
|
43
|
+
## What's the install footprint?
|
|
44
|
+
|
|
45
|
+
- Pi side: one extension that registers ~20 tools and a few slash commands.
|
|
46
|
+
- Chrome side: one unpacked extension, ~5000 LOC of plain JavaScript, no dependencies.
|
|
47
|
+
|
|
48
|
+
## Can I script it without Pi?
|
|
49
|
+
|
|
50
|
+
The Pi-facing tools are thin wrappers around an HTTP bridge at `127.0.0.1:17318`. You could call it directly from any process, but the API is internal and may change. If you need a stable scripting interface, file an issue and we'll consider stabilizing.
|
|
51
|
+
|
|
52
|
+
## Does `chrome_evaluate` work on strict-CSP pages?
|
|
53
|
+
|
|
54
|
+
Yes. The handler compiles with `new Function(...)` in the MAIN world, which works under `script-src 'self'` without `'unsafe-eval'`. Multi-statement bodies are supported via a statement-mode fallback. Exceptions are surfaced to the agent.
|
|
55
|
+
|
|
56
|
+
## Why does my click return `pageMutated=false`?
|
|
57
|
+
|
|
58
|
+
Either:
|
|
59
|
+
- The element was occluded (look for `occludedBy: <selector>` in the envelope).
|
|
60
|
+
- The click handler called `event.preventDefault()` and the page intentionally ignored it.
|
|
61
|
+
- The site rejects synthetic events. Try `trusted: true` or `/chrome clicks on`.
|
|
62
|
+
|
|
63
|
+
The result envelope tells you which one. **Don't blind-retry.**
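An agent-side wrapper might branch on the envelope instead of retrying blindly. A minimal sketch, assuming the envelope is a plain object carrying the `pageMutated` and `occludedBy` fields named above. `nextStepForClick` and its return labels are hypothetical, and the real envelope may expose more detail than is used here.

```javascript
// Hypothetical helper: picks a follow-up action from a click result envelope.
function nextStepForClick(envelope, alreadyTrusted = false) {
  if (envelope.pageMutated) return "done";
  if (envelope.occludedBy) {
    // Something covers the target; deal with the overlay before clicking again.
    return `clear-overlay:${envelope.occludedBy}`;
  }
  // No occlusion reported: the page may reject synthetic events.
  if (!alreadyTrusted) return "retry-with-trusted";
  // A trusted click also changed nothing, so stop and surface the envelope.
  return "report-to-user";
}

console.log(nextStepForClick({ pageMutated: false, occludedBy: "#cookie-banner" }));
```

The key design choice is that each branch leads to a different action, so the same click is never reissued unchanged.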
|
|
64
|
+
|
|
65
|
+
## Why does `chrome_type` return `valueMatches=false`?
|
|
66
|
+
|
|
67
|
+
The editor rejected the synthetic input. Common culprits: contenteditable rich-text editors, native date pickers, masked-input libraries. Try `chrome_fill` (uses framework-aware native setters) or `trusted: true`.
|
|
68
|
+
|
|
69
|
+
## How do I attach a file to a React file input?
|
|
70
|
+
|
|
71
|
+
`chrome_upload_file` — populates `input.files` via a real `DataTransfer` and fires `input` + `change` events. It does **not** open the native file picker (no synthetic event can; that's a user-activation gate). Works with React/Vue/Angular controlled inputs.
|
|
72
|
+
|
|
73
|
+
## Can it record videos?
|
|
74
|
+
|
|
75
|
+
Not yet. Screenshots only. Video recording is on the roadmap.
|
|
76
|
+
|
|
77
|
+
## How do I file a good bug report?
|
|
78
|
+
|
|
79
|
+
Include `/chrome doctor` output, the exact tool call, and the result envelope. If the page is public, link to it; if private, distill it into a benchmark page under `test-suite/challenges/`. See [CONTRIBUTING.md](../CONTRIBUTING.md).
|
package/extensions/chrome-profile-bridge/browser-extension/manifest.json
CHANGED
|
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
{
|
|
2
2
|
"manifest_version": 3,
|
|
3
3
|
"name": "Pi Chrome Connector",
|
|
4
|
-
"version": "0.14.
|
|
4
|
+
"version": "0.14.4",
|
|
5
5
|
"description": "Lets Pi control tabs in Chrome via a local connector at 127.0.0.1.",
|
|
6
6
|
"permissions": ["tabs", "scripting", "storage", "activeTab", "alarms", "webNavigation", "debugger"],
|
|
7
7
|
"host_permissions": ["<all_urls>", "http://127.0.0.1:17318/*"],
|
package/package.json
CHANGED
|
@@ -1,22 +1,55 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "pi-chrome",
|
|
3
|
-
"version": "0.14.
|
|
4
|
-
"description": "Drive your existing logged-in Chrome
|
|
3
|
+
"version": "0.14.4",
|
|
4
|
+
"description": "The de-facto browser automation toolkit for Pi agents. Drive your existing logged-in Chrome — no re-login, no throwaway profile, no CDP. 20+ tools (click, type, navigate, screenshot, network capture, file upload, drag, scroll, touch) + honest result envelopes + a built-in benchmark suite.",
|
|
5
5
|
"keywords": [
|
|
6
|
+
"pi",
|
|
6
7
|
"pi-package",
|
|
7
8
|
"pi-extension",
|
|
9
|
+
"pi-agent",
|
|
8
10
|
"chrome",
|
|
11
|
+
"chrome-extension",
|
|
9
12
|
"browser",
|
|
10
|
-
"automation",
|
|
13
|
+
"browser-automation",
|
|
14
|
+
"browser-control",
|
|
15
|
+
"web-automation",
|
|
16
|
+
"agent",
|
|
17
|
+
"ai-agent",
|
|
18
|
+
"llm-agent",
|
|
19
|
+
"llm-tools",
|
|
20
|
+
"agentic",
|
|
21
|
+
"playwright-alternative",
|
|
22
|
+
"puppeteer-alternative",
|
|
23
|
+
"selenium-alternative",
|
|
24
|
+
"cdp-alternative",
|
|
11
25
|
"authenticated-session",
|
|
12
26
|
"real-profile",
|
|
13
|
-
"web-debugging"
|
|
27
|
+
"web-debugging",
|
|
28
|
+
"web-scraping",
|
|
29
|
+
"screenshot",
|
|
30
|
+
"automation-tools",
|
|
31
|
+
"browser-use",
|
|
32
|
+
"stagehand-alternative"
|
|
14
33
|
],
|
|
15
34
|
"license": "MIT",
|
|
35
|
+
"homepage": "https://github.com/tianrendong/pi-packs/tree/main/packages/pi-chrome#readme",
|
|
36
|
+
"repository": {
|
|
37
|
+
"type": "git",
|
|
38
|
+
"url": "git+https://github.com/tianrendong/pi-packs.git",
|
|
39
|
+
"directory": "packages/pi-chrome"
|
|
40
|
+
},
|
|
41
|
+
"bugs": {
|
|
42
|
+
"url": "https://github.com/tianrendong/pi-packs/issues"
|
|
43
|
+
},
|
|
16
44
|
"type": "commonjs",
|
|
17
45
|
"files": [
|
|
18
46
|
"extensions",
|
|
19
|
-
"
|
|
47
|
+
"docs",
|
|
48
|
+
"README.md",
|
|
49
|
+
"CHANGELOG.md",
|
|
50
|
+
"CONTRIBUTING.md",
|
|
51
|
+
"SECURITY.md",
|
|
52
|
+
"LICENSE"
|
|
20
53
|
],
|
|
21
54
|
"pi": {
|
|
22
55
|
"extensions": [
|