pi-chrome 0.15.36 → 0.15.38

package/CHANGELOG.md CHANGED
@@ -2,9 +2,20 @@
 
  All notable user-facing changes to `pi-chrome`.
 
+ ## 0.15.38 — 2026-06-07
+
+ - **Overlay-safe click/fill fallbacks.** `chrome_click` and `chrome_fill` now fall back to DOM-dispatched click/value events when Chrome's debugger input path is blocked by another extension overlay (for example password-manager/autofill UI), unless `domFallback:false` is passed.
+
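The fallback described in the 0.15.38 entry amounts to a try/catch around the debugger input path. The sketch below is illustrative only: `clickWithFallback`, `cdpClick`, and `domClick` are hypothetical names rather than pi-chrome APIs. Only the `domFallback: false` opt-out and the `input: "dom-fallback"` result field are taken from this diff. Note that in the diffed code the catch covers any error raised on the debugger path, not only overlay blocking.

```javascript
// Minimal sketch of an overlay-safe fallback. `cdpClick` and `domClick` are
// hypothetical stand-ins for the debugger input path and the DOM-event path.
async function clickWithFallback(params, cdpClick, domClick) {
  try {
    // Preferred path: real input dispatched through the debugger.
    return await cdpClick(params);
  } catch (error) {
    // Callers can opt out of the fallback and receive the original error.
    if (params.domFallback === false) throw error;
    const result = await domClick(params);
    // Keep the original failure so the caller can tell why the fallback ran.
    return {
      ...result,
      input: "dom-fallback",
      reason: String(error?.message || error).slice(0, 500),
    };
  }
}
```

Returning the truncated failure reason alongside the result lets an agent distinguish a real debugger click from a synthetic DOM click, which matters on pages that check whether events are trusted.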
+ ## 0.15.37 — 2026-06-07
+
+ - **Hardened Chrome input targeting.** `chrome_click`/`chrome_fill`/related input paths now fail fast with resolved tab/CDP target metadata when debugger attach hits a stale or protected target, instead of surfacing bare `chrome-extension://` errors or hanging until the bridge timeout.
+ - **Internal timeouts and cleanup.** Companion extension commands, debugger attach, CDP commands, and script injection now have shorter internal timeouts with debugger cleanup, so stuck input dispatch returns actionable errors before the 30s bridge timeout.
+ - **Clear stale uid errors.** Snapshot uids that no longer map to live elements now report `snapshot uid ... is stale; refresh chrome_snapshot`.
+ - **`chrome_fill` fallback.** If real CDP input is blocked by another extension overlay (for example password-manager/autofill UI), `chrome_fill` falls back to setting the field value through the page DOM and dispatching `input`/`change` events, unless `domFallback:false` is passed.
+
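The "internal timeouts and cleanup" entry can be pictured as a promise race in which the timeout branch runs a cleanup hook before rejecting. The sketch below is modelled on the `withTimeout` helper that appears later in this diff; `neverSettles` and `demo` are hypothetical additions for illustration, and the 50 ms budget is arbitrary (the diff uses 3 s for attach, 5 s for CDP commands, 8 s for script injection, and 25 s per command).

```javascript
// Sketch of a timeout wrapper with a cleanup hook, modelled on the
// `withTimeout` helper added in this release.
function withTimeout(promise, ms, label, onTimeout) {
  let timer;
  return Promise.race([
    // Clear the timer as soon as the real work settles, either way.
    Promise.resolve(promise).finally(() => clearTimeout(timer)),
    new Promise((_, reject) => {
      timer = setTimeout(async () => {
        // A failing cleanup must not mask the timeout error itself.
        try { await onTimeout?.(); } catch {}
        reject(new Error(`${label} timed out after ${ms}ms`));
      }, ms);
    }),
  ]);
}

// Hypothetical stand-in for a debugger attach that never settles.
const neverSettles = new Promise(() => {});

// Hypothetical demo: reports the timeout message and whether cleanup ran.
async function demo() {
  let cleanedUp = false;
  try {
    await withTimeout(neverSettles, 50, "attach", () => { cleanedUp = true; });
    return { message: null, cleanedUp };
  } catch (error) {
    return { message: error.message, cleanedUp };
  }
}
```

Because every inner budget is shorter than the 30s bridge timeout, a stuck step surfaces as a labelled error from the extension instead of a generic bridge timeout.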
  ## 0.15.36 — 2026-06-03
 
- - **Richer page observation (ported from the `foxfirecodes` fork).** `chrome_snapshot` now returns a concise, agent-friendly observation — structural layout/context, page hints, visible actions, form fields, a page map, query matches, and a diff of changes since the previous snapshot — instead of a raw JSON dump. New `mode` (`auto`/`interactive`/`forms`/`pageMap`/`text`/`changes`/`full`), `query`, and `maxTextChars` parameters let the agent zoom in instead of dumping the whole page.
+ - **Richer page observation.** `chrome_snapshot` now returns a concise, agent-friendly observation — structural layout/context, page hints, visible actions, form fields, a page map, query matches, and a diff of changes since the previous snapshot — instead of a raw JSON dump. New `mode` (`auto`/`interactive`/`forms`/`pageMap`/`text`/`changes`/`full`), `query`, and `maxTextChars` parameters let the agent zoom in instead of dumping the whole page.
  - **New `chrome_find` tool.** Find elements, regions, or text by natural-language query (`'merge button'`, `'email error'`) and get ranked matches with stable uids and coordinates. Thin wrapper around `chrome_snapshot({ query })`.
  - **New `chrome_inspect` tool.** Inspect one snapshot uid/selector deeply: nearby text, nearby actions, form context, ancestors, and a suggested click target. Falls back to a focused snapshot if the loaded extension predates `page.inspect`.
  - **`includeSnapshot` now embeds the formatted snapshot.** `chrome_click`/`chrome_type`/`chrome_fill`/`chrome_key` with `includeSnapshot=true` append the fresh concise snapshot to the tool text so the agent can verify in one round trip.
package/README.md CHANGED
@@ -15,34 +15,58 @@ You: [keeps coding — agent never asked you to log in]
 
  ---
 
- ## 60-second install
+ ## 60-second install instruction
+
+ To install pi-chrome, run the following command:
 
  ```bash
  pi install npm:pi-chrome
  ```
 
- Then in Pi:
+
+ Then in Pi, run the next command, which will:
+
+ 1. Reveal the bundled browser-extension folder in Finder, and copy the folder path to your clipboard.
+ 2. Pop open the chrome://extensions webpage in Chrome.
+
+ In the Chrome Extensions page it opened, **YOU WILL NEED TO**:
+
+ 1. Turn on **developer mode** (top right).
+ 2. Click the **load unpacked** button (top left).
+ 3. Use **Cmd + Shift + G** (Mac) or **Ctrl + L** (Windows/Linux) to open the folder path field.
+ 4. **Cmd + V** (Mac) or **Ctrl + V** (Windows/Linux) to paste the copied path and press Enter.
+ 5. You're done with the chrome extensions page, and you can continue with the rest of the installation commands
 
  ```text
  /chrome onboard
  ```
 
- On macOS this opens `chrome://extensions`, reveals the bundled `browser-extension/` folder in Finder, and copies its path to your clipboard. In Chrome: **Developer mode** → **Load unpacked** → paste the path. Done.
+ Reload Pi so the newly installed package is actually loaded:
+
+ ```text
+ /reload
+ ```
+
 
- Verify, then authorize current Pi session from the terminal:
+ Verify the chrome connection:
 
  ```text
  /chrome doctor
- /chrome authorize
  ```
+ In the output, you just need to make sure the following line is present (It's okay if the other ones are still not checked):
+
+ ✓ Chrome is connected (companion extension v0.15.36, responded in 11ms).
 
+ Lastly, authorize the current session by running:
  ```text
- Performing Chrome bridge health check
- pi-chrome v<version>
- • Local bridge: mode=server, url=http://127.0.0.1:17318
- ✓ Companion Chrome extension responding (ID: <chrome-extension-id>, ext v<version>)
+ /chrome authorize
  ```
 
+ Run the following once more, and you should see all the lines checked:
+
+ ```text
+ /chrome doctor
+ ```
  ---
 
  ## Try this in 30 seconds after install
@@ -135,7 +159,7 @@ Agents can verify page state immediately instead of blindly retrying.
 
  ## What an agent gets
 
- **19 tools**, grouped by job. Every one runs against your already-open tabs.
+ **21 tools**, grouped by job. Every one runs against your already-open tabs.
 
  | Category | Tools |
  | --------------- | ---------------------------------------------------------------------------------------------- |
@@ -1,7 +1,7 @@
  {
    "manifest_version": 3,
    "name": "Pi Chrome Connector",
-   "version": "0.15.36",
+   "version": "0.15.38",
    "description": "Lets Pi control tabs in Chrome via a local connector at 127.0.0.1.",
    "permissions": [
      "tabs",
@@ -4,8 +4,25 @@ const POLL_ERROR_BACKOFF_MS = 2000;
  const DEFAULT_GROUP_COLOR = "blue";
  const PI_GROUP_RE = /^Pi(\b|\s*-)/i;
  const VALID_GROUP_COLORS = new Set(["grey", "blue", "red", "yellow", "green", "pink", "purple", "cyan", "orange"]);
+ const COMMAND_TIMEOUT_MS = 25_000;
+ const CDP_COMMAND_TIMEOUT_MS = 5_000;
+ const SCRIPTING_TIMEOUT_MS = 8_000;
+ const ATTACH_TIMEOUT_MS = 3_000;
  let polling = false;
 
+ function withTimeout(promise, ms, label, onTimeout) {
+   let timer;
+   return Promise.race([
+     Promise.resolve(promise).finally(() => clearTimeout(timer)),
+     new Promise((_, reject) => {
+       timer = setTimeout(async () => {
+         try { await onTimeout?.(); } catch {}
+         reject(new Error(`${label} timed out after ${ms}ms`));
+       }, ms);
+     }),
+   ]);
+ }
+
  // =================== Chrome input (CDP) layer ===================
  // Tracks which tabs we have attached chrome.debugger to.
  const attachedTabs = new Map(); // tabId -> { detachAt: number, pointer: {x,y} }
@@ -29,6 +46,31 @@ function recordAttachEvent(entry) {
    if (attachDebugLog.length > 20) attachDebugLog.shift();
  }
 
+ function normalPageTarget(target, tabId) {
+   const url = String(target?.url || "");
+   return target?.tabId === tabId && target?.type === "page" && !url.startsWith("chrome://") && !url.startsWith("chrome-extension://") && !url.startsWith("devtools://");
+ }
+
+ async function pageDebuggeeForTab(tabId) {
+   const targets = await new Promise((resolve) => chrome.debugger.getTargets((t) => resolve(t || []))).catch(() => []);
+   const target = targets.find((t) => normalPageTarget(t, tabId));
+   return target?.id ? { targetId: target.id } : { tabId };
+ }
+
+ async function debuggerAttachRaw(tabId, preferredDebuggee) {
+   const debuggee = preferredDebuggee || { tabId };
+   await withTimeout(
+     chrome.debugger.attach(debuggee, CDP_VERSION),
+     ATTACH_TIMEOUT_MS,
+     `Chrome debugger attach to tab ${tabId}`,
+     async () => {
+       attachedTabs.delete(tabId);
+       try { await chrome.debugger.detach(debuggee); } catch {}
+     },
+   );
+   return debuggee;
+ }
+
  async function attachDebugger(tabId) {
    if (!chrome.debugger) throw new Error("chrome.debugger API unavailable; reload the extension to grant the new permission");
    if (attachedTabs.has(tabId)) {
@@ -50,15 +92,23 @@ async function attachDebugger(tabId) {
        }
      }
    } catch {}
-   const attemptAttach = async () => {
+   let attachedDebuggee = null;
+   const attemptAttach = async (debuggee) => {
      try {
-       await chrome.debugger.attach({ tabId }, CDP_VERSION);
+       attachedDebuggee = await debuggerAttachRaw(tabId, debuggee);
        return null;
      } catch (error) {
        return error;
      }
    };
+   const retryPageTargetIfExtensionBlocked = async (err, kind) => {
+     if (!/Cannot access a chrome-extension:\/\/ URL of different extension/i.test(String(err?.message || err))) return err;
+     const pageDebuggee = await pageDebuggeeForTab(tabId);
+     recordAttachEvent({ kind, tabId, debuggee: pageDebuggee });
+     return attemptAttach(pageDebuggee);
+   };
    let err = await attemptAttach();
+   if (err) err = await retryPageTargetIfExtensionBlocked(err, "attach-page-target-retry");
    if (err) {
      const msg = String(err?.message || err);
      const transient = /Cannot access a chrome-extension|Cannot access contents of|No tab with id|Debugger is not attached|Another debugger|Target closed/i.test(msg);
@@ -70,6 +120,7 @@ async function attachDebugger(tabId) {
      }
      await sleep(180);
      err = await attemptAttach();
+     if (err) err = await retryPageTargetIfExtensionBlocked(err, "attach-page-target-retry2");
      if (err) {
        recordAttachEvent({ kind: "attach-retry-failed", tabId, message: String(err.message || err), tabUrl: tabSnapshot?.url });
        // One more try after a longer settle. Some Chrome builds need ~500ms after a navigation
@@ -77,37 +128,53 @@ async function attachDebugger(tabId) {
        // will accept the target.
        await sleep(500);
        err = await attemptAttach();
+       if (err) err = await retryPageTargetIfExtensionBlocked(err, "attach-page-target-retry3");
        if (err) {
          recordAttachEvent({ kind: "attach-retry2-failed", tabId, message: String(err.message || err), tabUrl: tabSnapshot?.url });
-         throw err;
+         const meta = await describeInputTarget(tabId);
+         throw new Error(`Chrome debugger attach failed for tab ${tabId}: ${String(err.message || err)}${targetMetaSuffix(meta)}`);
        }
      }
    }
-   recordAttachEvent({ kind: "attached", tabId });
+   recordAttachEvent({ kind: "attached", tabId, debuggee: attachedDebuggee });
    // Seed pointer in a plausible "just left the address bar" location.
-   const entry = { detachAt: Date.now() + INPUT_IDLE_DETACH_MS, pointer: { x: 120 + Math.random() * 200, y: 80 + Math.random() * 120 } };
+   const entry = { detachAt: Date.now() + INPUT_IDLE_DETACH_MS, pointer: { x: 120 + Math.random() * 200, y: 80 + Math.random() * 120 }, debuggee: attachedDebuggee || { tabId } };
    attachedTabs.set(tabId, entry);
    return entry;
  }
 
- async function inputDebug(params) {
-   const tab = params?.targetId ? await chrome.tabs.get(Number(params.targetId)).catch(() => null) : null;
+ async function describeInputTarget(tabId) {
+   const tab = await chrome.tabs.get(Number(tabId)).catch(() => null);
+   const active = (await chrome.tabs.query({ active: true, lastFocusedWindow: true }).catch(() => []))[0] || null;
    let targets = [];
    try { targets = await new Promise((resolve) => chrome.debugger.getTargets((t) => resolve(t || []))); } catch {}
+   return {
+     resolvedTab: tab ? { id: tab.id, windowId: tab.windowId, url: tab.url, status: tab.status, title: tab.title, active: tab.active } : null,
+     activeTab: active ? { id: active.id, windowId: active.windowId, url: active.url, status: active.status, title: active.title, active: active.active } : null,
+     attachedTabs: Array.from(attachedTabs.keys()),
+     cdpTargets: targets.map((t) => ({ id: t.id, tabId: t.tabId, type: t.type, url: t.url, attached: t.attached, extensionId: t.extensionId })),
+   };
+ }
+
+ function targetMetaSuffix(meta) {
+   return `\nTarget metadata: ${JSON.stringify(meta).slice(0, 4000)}`;
+ }
+
+ async function inputDebug(params) {
+   const requested = params?.targetId ? await describeInputTarget(Number(params.targetId)) : await describeInputTarget(-1);
    return {
      extensionVersion: chrome.runtime.getManifest().version,
      extensionId: chrome.runtime.id,
-     attachedTabs: Array.from(attachedTabs.keys()),
-     requestedTab: tab ? { id: tab.id, url: tab.url, status: tab.status, title: tab.title } : null,
-     cdpTargets: targets,
+     ...requested,
      recentAttachEvents: attachDebugLog.slice(),
    };
  }
 
  async function detachDebugger(tabId) {
-   if (!attachedTabs.has(tabId)) return;
+   const entry = attachedTabs.get(tabId);
+   if (!entry) return;
    attachedTabs.delete(tabId);
-   try { await chrome.debugger.detach({ tabId }); } catch {}
+   try { await chrome.debugger.detach(entry.debuggee || { tabId }); } catch {}
  }
 
  async function detachAll() {
@@ -134,14 +201,22 @@ setInterval(() => {
  }, 5000);
 
  function cdpRaw(tabId, method, params) {
-   return new Promise((resolve, reject) => {
-     chrome.debugger.sendCommand({ tabId }, method, params || {}, (result) => {
+   const debuggee = attachedTabs.get(tabId)?.debuggee || { tabId };
+   return withTimeout(new Promise((resolve, reject) => {
+     chrome.debugger.sendCommand(debuggee, method, params || {}, (result) => {
        if (chrome.runtime.lastError) reject(new Error(`${method}: ${chrome.runtime.lastError.message}`));
        else resolve(result);
      });
+   }), CDP_COMMAND_TIMEOUT_MS, `CDP ${method}`, async () => {
+     attachedTabs.delete(tabId);
+     try { await chrome.debugger.detach(debuggee); } catch {}
    });
  }
 
+ function executeScriptTimed(options, label) {
+   return withTimeout(chrome.scripting.executeScript(options), SCRIPTING_TIMEOUT_MS, label || "chrome.scripting.executeScript");
+ }
+
  // Wraps cdpRaw with one auto-recover on detached/closed sessions:
  // chrome.debugger.attach can stay cached in attachedTabs even after Chrome killed
  // the session (tab nav, devtools opened/closed, etc). Recover by detaching the
@@ -214,8 +289,7 @@ async function cdp(tabId, method, params) {
      }
      if (!isStale) throw error;
      attachedTabs.delete(tabId);
-     await chrome.debugger.attach({ tabId }, CDP_VERSION).catch(() => undefined);
-     attachedTabs.set(tabId, { detachAt: Date.now() + INPUT_IDLE_DETACH_MS, pointer: { x: 120 + Math.random() * 200, y: 80 + Math.random() * 120 } });
+     await attachDebugger(tabId).catch(() => undefined);
      return cdpRaw(tabId, method, params);
    }
  }
@@ -254,14 +328,18 @@ function cdpIsSyntaxError(details) {
 
  // Resolve target -> {x, y, rect} in viewport coords by running tiny script in tab.
  async function resolveTargetInTab(tabId, params) {
-   const results = await chrome.scripting.executeScript({
+   const results = await executeScriptTimed({
      target: { tabId, frameIds: [0] },
      world: "MAIN",
      func: (selector, uid, x, y) => {
        const state = window.__PI_CHROME_STATE__;
        let el = null;
-       if (uid && state && state.elements && state.elements[uid]) el = state.elements[uid];
-       else if (selector) el = document.querySelector(selector);
+       if (uid) {
+         el = state && state.elements ? state.elements[uid] : null;
+         if (!el || !el.isConnected) return { found: false, staleUid: true, reason: `snapshot uid ${uid} is stale; refresh chrome_snapshot`, url: location.href };
+       } else if (selector) {
+         el = document.querySelector(selector);
+       }
        if (el) {
          el.scrollIntoView({ block: "center", inline: "center", behavior: "instant" });
          const r = el.getBoundingClientRect();
@@ -271,8 +349,9 @@ async function resolveTargetInTab(tabId, params) {
        return { found: false };
      },
      args: [params.selector ?? null, params.uid ?? null, params.x ?? null, params.y ?? null],
-   });
+   }, `resolve input target in tab ${tabId}`);
    const v = results?.[0]?.result;
+   if (v?.staleUid) throw new Error(v.reason || "snapshot uid is stale; refresh chrome_snapshot");
    if (!v || !v.found) throw new Error("Could not resolve target element for Chrome input");
    return v;
  }
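The stale-uid check in the hunk above can be exercised outside a browser. The sketch below is illustrative only: `resolveUid` is a hypothetical name, `elements` stands in for the uid-to-element map held in the page-side snapshot state, and plain objects with an `isConnected` flag stand in for DOM nodes. The error text matches the message introduced in this diff.

```javascript
// Sketch of the stale-uid check. A uid is stale when it no longer maps to an
// element, or when the mapped element has been detached from the document.
function resolveUid(elements, uid) {
  const el = elements ? elements[uid] : null;
  if (!el || !el.isConnected) {
    return {
      found: false,
      staleUid: true,
      reason: `snapshot uid ${uid} is stale; refresh chrome_snapshot`,
    };
  }
  return { found: true, el };
}

// Hypothetical snapshot state for illustration.
const elements = {
  e1: { isConnected: true },  // still attached to the document
  e2: { isConnected: false }, // removed from the DOM after the snapshot
};
```

Checking `isConnected` as well as map membership matters because a snapshot can keep a reference to a node that the page has since removed or re-rendered.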
@@ -400,36 +479,70 @@ async function cdpTypeChar(tabId, ch) {
    await sleep(rng(35, 130));
  }
 
+ async function domClickFallback(tabId, params, cause) {
+   const results = await executeScriptTimed({
+     target: { tabId, frameIds: [0] },
+     world: "MAIN",
+     func: (selector, uid, x, y) => {
+       const state = window.__PI_CHROME_STATE__;
+       let el = uid && state && state.elements ? state.elements[uid] : null;
+       if (uid && (!el || !el.isConnected)) return { staleUid: true, reason: `snapshot uid ${uid} is stale; refresh chrome_snapshot`, url: location.href };
+       if (!el && selector) el = document.querySelector(selector);
+       if (!el && typeof x === "number" && typeof y === "number") el = document.elementFromPoint(x, y);
+       if (!el) throw new Error(`DOM fallback target not found: ${uid || selector || `${x},${y}`}`);
+       el.scrollIntoView({ block: "center", inline: "center", behavior: "instant" });
+       const rect = el.getBoundingClientRect();
+       const eventInit = { bubbles: true, cancelable: true, view: window, clientX: rect.left + rect.width / 2, clientY: rect.top + rect.height / 2, button: 0, buttons: 1 };
+       el.dispatchEvent(new PointerEvent("pointerdown", { ...eventInit, pointerId: 1, pointerType: "mouse", isPrimary: true }));
+       el.dispatchEvent(new MouseEvent("mousedown", eventInit));
+       if (typeof el.focus === "function") el.focus({ preventScroll: true });
+       el.dispatchEvent(new PointerEvent("pointerup", { ...eventInit, pointerId: 1, pointerType: "mouse", isPrimary: true, buttons: 0 }));
+       el.dispatchEvent(new MouseEvent("mouseup", { ...eventInit, buttons: 0 }));
+       el.click();
+       return { tag: el.tagName, url: location.href };
+     },
+     args: [params.selector ?? null, params.uid ?? null, params.x ?? null, params.y ?? null],
+   }, `DOM click fallback in tab ${tabId}`);
+   const v = results?.[0]?.result;
+   if (v?.staleUid) throw new Error(v.reason || "snapshot uid is stale; refresh chrome_snapshot");
+   return { input: "dom-fallback", reason: String(cause?.message || cause).slice(0, 500), tag: v?.tag };
+ }
+
  async function chromeInputClick(params) {
    const tab = await getTabByParams(params);
    if (params.foreground) await bringToFront(tab);
-   await attachDebugger(tab.id);
-   const resolved = await resolveTargetInTab(tab.id, params);
-   const point = resolved.rect ? pickInsideRect(resolved.rect) : { x: resolved.x, y: resolved.y };
-   await cdpMoveTo(tab.id, point.x, point.y);
-   await cdp(tab.id, "Input.dispatchMouseEvent", { type: "mousePressed", x: point.x, y: point.y, button: "left", buttons: 1, clickCount: 1, pointerType: "mouse", force: 0.5 });
-   await sleep(rng(45, 140));
-   await cdp(tab.id, "Input.dispatchMouseEvent", { type: "mouseReleased", x: point.x, y: point.y, button: "left", buttons: 0, clickCount: 1, pointerType: "mouse" });
-   // Reset :focus-visible if the click landed on a focusable element. CDP-driven pointer
-   // focus can leave :focus-visible=true in Chromium, which trips heuristics that expect
-   // Reset focus styling after pointer click when possible.
-   if (params.selector || params.uid) {
-     await chrome.scripting.executeScript({
-       target: { tabId: tab.id, frameIds: [0] },
-       world: "MAIN",
-       func: (sel, uid) => {
-         const state = window.__PI_CHROME_STATE__;
-         let el = null;
-         if (uid && state && state.elements && state.elements[uid]) el = state.elements[uid];
-         else if (sel) el = document.querySelector(sel);
-         if (el && typeof el.focus === "function" && el === document.activeElement) {
-           try { el.blur(); el.focus({ preventScroll: true, focusVisible: false }); } catch {}
-         }
-       },
-       args: [params.selector ?? null, params.uid ?? null],
-     }).catch(() => undefined);
+   try {
+     await attachDebugger(tab.id);
+     const resolved = await resolveTargetInTab(tab.id, params);
+     const point = resolved.rect ? pickInsideRect(resolved.rect) : { x: resolved.x, y: resolved.y };
+     await cdpMoveTo(tab.id, point.x, point.y);
+     await cdp(tab.id, "Input.dispatchMouseEvent", { type: "mousePressed", x: point.x, y: point.y, button: "left", buttons: 1, clickCount: 1, pointerType: "mouse", force: 0.5 });
+     await sleep(rng(45, 140));
+     await cdp(tab.id, "Input.dispatchMouseEvent", { type: "mouseReleased", x: point.x, y: point.y, button: "left", buttons: 0, clickCount: 1, pointerType: "mouse" });
+     // Reset :focus-visible if the click landed on a focusable element. CDP-driven pointer
+     // focus can leave :focus-visible=true in Chromium, which trips heuristics that expect
+     // Reset focus styling after pointer click when possible.
+     if (params.selector || params.uid) {
+       await executeScriptTimed({
+         target: { tabId: tab.id, frameIds: [0] },
+         world: "MAIN",
+         func: (sel, uid) => {
+           const state = window.__PI_CHROME_STATE__;
+           let el = null;
+           if (uid && state && state.elements && state.elements[uid]) el = state.elements[uid];
+           else if (sel) el = document.querySelector(sel);
+           if (el && typeof el.focus === "function" && el === document.activeElement) {
+             try { el.blur(); el.focus({ preventScroll: true, focusVisible: false }); } catch {}
+           }
+         },
+         args: [params.selector ?? null, params.uid ?? null],
+       }, `reset focus style in tab ${tab.id}`).catch(() => undefined);
+     }
+     return { input: "chrome", x: point.x, y: point.y, tag: resolved.tag };
+   } catch (error) {
+     if (params.domFallback === false) throw error;
+     return domClickFallback(tab.id, params, error);
    }
-   return { input: "chrome", x: point.x, y: point.y, tag: resolved.tag };
  }
 
  async function chromeInputHover(params) {
@@ -504,29 +617,74 @@ async function chromeInputType(params) {
    return { input: "chrome", length: text.length };
  }
 
+ async function domFillFallback(tabId, params, cause) {
+   if (!(params.selector || params.uid)) throw cause;
+   const results = await executeScriptTimed({
+     target: { tabId, frameIds: [0] },
+     world: "MAIN",
+     func: async (selector, uid, text, submit) => {
+       const state = window.__PI_CHROME_STATE__;
+       let el = uid && state && state.elements ? state.elements[uid] : null;
+       if (uid && (!el || !el.isConnected)) return { staleUid: true, reason: `snapshot uid ${uid} is stale; refresh chrome_snapshot`, url: location.href };
+       if (!el && selector) el = document.querySelector(selector);
+       if (!el) throw new Error(`DOM fallback target not found: ${uid || selector}`);
+       el.scrollIntoView({ block: "center", inline: "center", behavior: "instant" });
+       if (typeof el.focus === "function") el.focus({ preventScroll: true });
+       const value = String(text ?? "");
+       if ("value" in el) {
+         const proto = el instanceof HTMLTextAreaElement ? HTMLTextAreaElement.prototype : HTMLInputElement.prototype;
+         const setter = Object.getOwnPropertyDescriptor(proto, "value")?.set;
+         if (setter) setter.call(el, value);
+         else el.value = value;
+       } else if (el.isContentEditable) {
+         el.textContent = value;
+       } else {
+         throw new Error(`DOM fallback target is not fillable: <${el.tagName.toLowerCase()}>`);
+       }
+       el.dispatchEvent(new InputEvent("input", { bubbles: true, inputType: "insertText", data: value }));
+       el.dispatchEvent(new Event("change", { bubbles: true }));
+       if (submit) {
+         const form = el.closest("form");
+         if (form) form.requestSubmit ? form.requestSubmit() : form.submit();
+         else document.querySelector("button,[type=submit]")?.click();
+       }
+       return { valueMatches: "value" in el ? el.value === value : el.textContent === value, tag: el.tagName, url: location.href };
+     },
+     args: [params.selector ?? null, params.uid ?? null, params.text ?? "", params.submit === true],
+   }, `DOM fill fallback in tab ${tabId}`);
+   const v = results?.[0]?.result;
+   if (v?.staleUid) throw new Error(v.reason || "snapshot uid is stale; refresh chrome_snapshot");
+   return { input: "dom-fallback", length: String(params.text || "").length, valueMatches: v?.valueMatches, reason: String(cause?.message || cause).slice(0, 500), tag: v?.tag };
+ }
+
  async function chromeInputFill(params) {
    const tab = await getTabByParams(params);
    if (params.foreground) await bringToFront(tab);
-   await attachDebugger(tab.id);
-   if (!(params.selector || params.uid)) throw new Error("chrome.fill: selector or uid required");
-   const resolved = await resolveTargetInTab(tab.id, params);
-   const point = resolved.rect ? pickInsideRect(resolved.rect) : { x: resolved.x, y: resolved.y };
-   await cdpMoveTo(tab.id, point.x, point.y);
-   // Triple-click selects all in input fields.
-   for (let i = 1; i <= 3; i++) {
-     await cdp(tab.id, "Input.dispatchMouseEvent", { type: "mousePressed", x: point.x, y: point.y, button: "left", buttons: 1, clickCount: i, pointerType: "mouse", force: 0.5 });
-     await sleep(rng(20, 60));
-     await cdp(tab.id, "Input.dispatchMouseEvent", { type: "mouseReleased", x: point.x, y: point.y, button: "left", buttons: 0, clickCount: i, pointerType: "mouse" });
+   try {
+     await attachDebugger(tab.id);
+     if (!(params.selector || params.uid)) throw new Error("chrome.fill: selector or uid required");
+     const resolved = await resolveTargetInTab(tab.id, params);
+     const point = resolved.rect ? pickInsideRect(resolved.rect) : { x: resolved.x, y: resolved.y };
+     await cdpMoveTo(tab.id, point.x, point.y);
+     // Triple-click selects all in input fields.
+     for (let i = 1; i <= 3; i++) {
+       await cdp(tab.id, "Input.dispatchMouseEvent", { type: "mousePressed", x: point.x, y: point.y, button: "left", buttons: 1, clickCount: i, pointerType: "mouse", force: 0.5 });
+       await sleep(rng(20, 60));
+       await cdp(tab.id, "Input.dispatchMouseEvent", { type: "mouseReleased", x: point.x, y: point.y, button: "left", buttons: 0, clickCount: i, pointerType: "mouse" });
+       await sleep(rng(20, 60));
+     }
+     // Delete selection.
+     await cdp(tab.id, "Input.dispatchKeyEvent", { type: "keyDown", key: "Delete", code: "Delete", windowsVirtualKeyCode: 46 });
+     await cdp(tab.id, "Input.dispatchKeyEvent", { type: "keyUp", key: "Delete", code: "Delete", windowsVirtualKeyCode: 46 });
      await sleep(rng(20, 60));
+     const text = String(params.text || "");
+     for (const ch of Array.from(text)) await cdpTypeChar(tab.id, ch);
+     if (params.submit) await chromeInputKey({ ...params, key: "Enter" });
+     return { input: "chrome", length: text.length };
+   } catch (error) {
+     if (params.domFallback === false) throw error;
+     return domFillFallback(tab.id, params, error);
    }
-   // Delete selection.
-   await cdp(tab.id, "Input.dispatchKeyEvent", { type: "keyDown", key: "Delete", code: "Delete", windowsVirtualKeyCode: 46 });
-   await cdp(tab.id, "Input.dispatchKeyEvent", { type: "keyUp", key: "Delete", code: "Delete", windowsVirtualKeyCode: 46 });
-   await sleep(rng(20, 60));
-   const text = String(params.text || "");
-   for (const ch of Array.from(text)) await cdpTypeChar(tab.id, ch);
-   if (params.submit) await chromeInputKey({ ...params, key: "Enter" });
-   return { input: "chrome", length: text.length };
  }
 
  async function chromeInputScroll(params) {
@@ -714,7 +872,12 @@ async function pollLoop() {
 
  async function handleCommand(command) {
    try {
-     const result = await dispatch(command.action, command.params ?? {});
+     const result = await withTimeout(
+       dispatch(command.action, command.params ?? {}),
+       COMMAND_TIMEOUT_MS,
+       command.action || "Chrome command",
+       () => detachAll(),
+     );
      await postResult({ id: command.id, ok: true, result });
    } catch (error) {
      await postResult({ id: command.id, ok: false, error: error?.message ?? String(error) });
@@ -936,7 +1099,7 @@ async function getTabByParams(params) {
    let tab;
    if (params.targetId !== undefined) {
      const id = Number(params.targetId);
-     tab = tabs.find((candidate) => candidate.id === id);
+     tab = await chrome.tabs.get(id).catch(() => null);
      if (!tab?.id) {
        // Chrome tab ids are not stable across reloads/navigations; a long session can hold a
        // stale id. Surface the current tabs so the caller can re-target instead of guessing.
@@ -960,8 +1123,9 @@ async function getTabByParams(params) {
      tab = active[0] || tabs.find((candidate) => candidate.active) || tabs[0];
    }
    if (!tab?.id) throw new Error("No matching Chrome tab found");
-   if ((tab.url || "").startsWith("chrome://") || (tab.url || "").startsWith("chrome-extension://")) {
-     throw new Error(`Chrome blocks extension automation on protected URL: ${tab.url}`);
+   const url = tab.url || "";
+   if (url.startsWith("chrome://") || url.startsWith("chrome-extension://") || url.startsWith("devtools://")) {
+     throw new Error(`Chrome blocks extension automation on protected URL: tab=${tab.id} url=${url}`);
    }
    // Tabs Pi interacts with (page.* actions) join this session's group so the user can see exactly
    // which tabs Pi is driving. We only adopt *ungrouped* tabs — never hijack a tab the user (or
@@ -1032,7 +1196,7 @@ async function executeInTab(params, func, args) {
    // Phase 2: run the action via chrome.scripting.executeScript. The `func:` form is
    // injected by Chrome itself (not `new Function`), so it is CSP-safe, and it lets Chrome
    // serialize the invocation args. The wrapper references window.__piAction defined above.
-   const results = await chrome.scripting.executeScript({
+   const results = await executeScriptTimed({
      target: { tabId: tab.id },
      world: "MAIN",
      func: async (invocationArgs) => {
@@ -1043,7 +1207,7 @@ async function executeInTab(params, func, args) {
  }
  },
  args: [args || []],
- });
+ }, `execute page action in tab ${tab.id}`);
  const first = results?.[0];
  if (first?.error) {
  const message = typeof first.error === "string" ? first.error : (first.error.message || JSON.stringify(first.error));
@@ -1139,12 +1303,12 @@ async function snapshotInTab(params) {
  params.query ?? null,
  params.maxTextChars ?? null,
  ];
- await chrome.scripting.executeScript({
+ await executeScriptTimed({
  target: { tabId: tab.id, frameIds: [0] },
  world: "MAIN",
  files: ["snapshot_injected.js"],
- });
- const results = await chrome.scripting.executeScript({
+ }, `inject snapshot script in tab ${tab.id}`);
+ const results = await executeScriptTimed({
  target: { tabId: tab.id, frameIds: [0] },
  world: "MAIN",
  func: async (invocationArgs) => {
@@ -1157,7 +1321,7 @@ async function snapshotInTab(params) {
  }
  },
  args: [args],
- });
+ }, `run snapshot script in tab ${tab.id}`);
  const first = results?.[0];
  if (first?.error) {
  const message = typeof first.error === "string" ? first.error : (first.error.message || JSON.stringify(first.error));
@@ -1175,12 +1339,12 @@ async function inspectInTab(params) {
  const tab = await getTabByParams(params);
  if (params.foreground) await bringToFront(tab);
  const args = [params.uid ?? null, params.selector ?? null, params.scrollIntoView === true];
- await chrome.scripting.executeScript({
+ await executeScriptTimed({
  target: { tabId: tab.id, frameIds: [0] },
  world: "MAIN",
  files: ["snapshot_injected.js"],
- });
- const results = await chrome.scripting.executeScript({
+ }, `inject inspect script in tab ${tab.id}`);
+ const results = await executeScriptTimed({
  target: { tabId: tab.id, frameIds: [0] },
  world: "MAIN",
  func: async (invocationArgs) => {
@@ -1193,7 +1357,7 @@ async function inspectInTab(params) {
  }
  },
  args: [args],
- });
+ }, `run inspect script in tab ${tab.id}`);
  const first = results?.[0];
  if (first?.error) {
  const message = typeof first.error === "string" ? first.error : (first.error.message || JSON.stringify(first.error));
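The hunks above route every `chrome.scripting.executeScript` call through `executeScriptTimed(options, label)`, whose definition is not part of these hunks. A minimal sketch of how such a wrapper could work, assuming a `Promise.race` against a timer. The `withTimeout` helper, the error wording, and the 5000 ms default are all hypothetical, not the package's actual code:

```javascript
// Hypothetical helper: reject with a labelled error if `promise` does not settle within `ms`.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms}ms: ${label}`)), ms);
  });
  // Whichever settles first wins; always clear the timer so it cannot fire later.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Assumed shape of the wrapper used in the diff (the 5000 ms default is a guess).
function executeScriptTimed(options, label, ms = 5000) {
  return withTimeout(chrome.scripting.executeScript(options), ms, label);
}
```

A timeout of this kind only stops the caller from waiting; it does not cancel the injected script, which may still finish in the page afterwards.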
@@ -1413,6 +1413,7 @@ Usage rules:
  selector: Type.Optional(Type.String({ description: "CSS selector to click. Prefer uid from chrome_snapshot when available." })),
  x: Type.Optional(Type.Number({ description: "Viewport x coordinate if uid/selector is omitted." })),
  y: Type.Optional(Type.Number({ description: "Viewport y coordinate if uid/selector is omitted." })),
+ domFallback: Type.Optional(Type.Boolean({ description: "If true (default), fall back to DOM-dispatched click if Chrome's CDP input path is blocked by another extension overlay or debugger failure." })),
  includeSnapshot: Type.Optional(Type.Boolean({ description: "If true, include a fresh chrome_snapshot result after the click." })),
  maxElements: Type.Optional(Type.Number({ default: MAX_ELEMENTS, description: "Max elements in the included snapshot." })),
  targetId: Type.Optional(Type.String()),
@@ -1478,6 +1479,7 @@ Usage rules:
  uid: Type.Optional(Type.String({ description: "Stable element uid from chrome_snapshot." })),
  selector: Type.Optional(Type.String({ description: "CSS selector to fill if uid is omitted." })),
  submit: Type.Optional(Type.Boolean({ description: "If true, press Enter after filling." })),
+ domFallback: Type.Optional(Type.Boolean({ description: "If true (default), fall back to DOM value-setting if Chrome's CDP input path is blocked by another extension overlay or debugger failure." })),
  includeSnapshot: Type.Optional(Type.Boolean({ description: "If true, include a fresh chrome_snapshot result after filling." })),
  maxElements: Type.Optional(Type.Number({ default: MAX_ELEMENTS, description: "Max elements in the included snapshot." })),
  targetId: Type.Optional(Type.String()),
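The `domFallback` option added to both schemas defaults to true and is only disabled by an explicit `false`. A minimal sketch of that control flow, assuming hypothetical `cdpAction` and `domAction` callbacks in place of the package's real input paths. It falls back on any error, which may be broader than what the package actually does:

```javascript
// Hypothetical sketch: try the CDP input path first, then the DOM path unless opted out.
async function runWithDomFallback(params, cdpAction, domAction) {
  try {
    return { via: "cdp", result: await cdpAction(params) };
  } catch (error) {
    // Only an explicit `domFallback: false` disables the fallback; undefined means "on".
    if (params.domFallback === false) throw error;
    return { via: "dom", result: await domAction(params) };
  }
}
```

Returning which path was used (`via`) lets a caller report that a click or fill was synthesized through the DOM rather than delivered as real input.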
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "pi-chrome",
- "version": "0.15.36",
+ "version": "0.15.38",
  "scripts": {
  "test": "node test-suite/unit/csp-eval.test.mjs",
  "version": "node scripts/sync-manifest-version.js",