npm - pi-chrome - Versions diffs - 0.15.28 → 0.15.30 - Mend

pi-chrome 0.15.28 → 0.15.30

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (14)

package/CHANGELOG.md +18 -0
package/README.md +1 -1
package/docs/COMPARISON.md +1 -1
package/docs/EXAMPLES.md +1 -1
package/docs/FAQ.md +1 -1
package/extensions/chrome-profile-bridge/browser-extension/manifest.json +2 -1
package/extensions/chrome-profile-bridge/browser-extension/service_worker.js +209 -115
package/extensions/chrome-profile-bridge/index.ts +12 -7
package/package.json +2 -1
package/test-suite/README.md +2 -1
package/test-suite/challenges/42-strict-csp-evaluate.html +16 -0
package/test-suite/challenges/42-strict-csp-evaluate.js +21 -0
package/test-suite/manifest.json +52 -1
package/test-suite/unit/csp-eval.test.mjs +171 -0

package/CHANGELOG.md CHANGED

@@ -2,6 +2,24 @@
 All notable user-facing changes to `pi-chrome`.
+## 0.15.30 — 2026-05-31
+Tab grouping for `chrome_tab`.
+- **Pi-opened tabs auto-group.** `action=new` now drops every tab into a shared `Pi` tab group per window by default (created once, then reused), so agent tabs stay visually separated from your own. Opt out per call with `groupTitle:""` or `group:false`.
+- **`chrome_tab` can group/ungroup tabs.** New `action=group` (and `action=ungroup`) plus `groupTitle`/`groupColor` params. Grouping reuses an existing same-title group in the window instead of spawning duplicates. Defaults: title `Pi`, color `blue`; color validated against Chrome's 9 group colors. Target an existing tab with `targetId`/`urlIncludes`/`titleIncludes`.
+- **Tab listings include group info.** `formatTab` now reports `groupId` and a `group` record (`title`, `color`, `collapsed`, `windowId`, `piGroup`), and `chrome_tab list` prefixes grouped tabs with `[Group Title]`.
+- Requires the new `tabGroups` extension permission — reload the companion extension after updating.
+## 0.15.29 — 2026-05-31
+Strict-CSP support: `chrome_evaluate`, `chrome_snapshot`, `chrome_wait_for`, and `chrome_navigate initScript` now work on pages that block `unsafe-eval`.
+- **CDP-based evaluation bypasses page CSP.** `chrome_evaluate`/`chrome_snapshot` (and all snapshot-driven inspection) previously ran user code in the page MAIN world via the **Function constructor**, which is blocked by `script-src 'self'` without `'unsafe-eval'` — so they returned null/empty (or `EvalError`) on github.com and many bank/SaaS apps. They now evaluate through CDP `Runtime.evaluate`, a DevTools protocol command that is not subject to the page's Content-Security-Policy. Rich return values (undefined/function/symbol/bigint/Error markers, DOMRect expansion) and the expression/statement fallback are preserved.
+- **`chrome_wait_for` polls via CDP.** The selector/expression polling loop moved from in-page `new Function()` to service-worker-side CDP evaluation, so waits work under strict CSP too.
+- **`chrome_navigate initScript` injects via CDP.** Document-start init scripts now register with `Page.addScriptToEvaluateOnNewDocument` instead of `new Function()` on `webNavigation.onCommitted`, so seeding localStorage / stubbing `Date.now` works under strict CSP.
+- **Tests.** Added a Node unit harness (`test-suite/unit/csp-eval.test.mjs`, run via `npm test`) validating the evaluate/execute/waitFor refactor, and an in-browser regression challenge (42 `strict-csp-evaluate`) that reads a JS-only secret under strict CSP. Updated challenge 39's notes and docs (FAQ, EXAMPLES, COMPARISON, primer) which previously stated eval/snapshot fail under strict CSP.
 ## 0.15.28 — 2026-05-31
 Low-risk reliability fixes from a long-session bug report.
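
The 0.15.30 grouping defaults described above can be exercised outside Chrome. The sketch below copies `cleanGroupTitle`, `cleanGroupColor`, and `PI_GROUP_RE` from the `service_worker.js` diff and restates the `tab.new` decision as a pure function. `newTabGrouping` is a hypothetical name introduced only for this illustration; it is not part of the package. Read against the diff as written, an opt-out appears to be honored only when no `groupColor` is passed.

```javascript
// Constants and helpers copied from the service_worker.js diff.
const DEFAULT_GROUP_COLOR = "blue";
const PI_GROUP_RE = /^Pi(\b|\s*-)/i;
const VALID_GROUP_COLORS = new Set(["grey", "blue", "red", "yellow", "green", "pink", "purple", "cyan", "orange"]);

// Collapse whitespace, trim, cap at 80 characters, and fall back to "Pi".
function cleanGroupTitle(value) {
  const text = String(value || "Pi").replace(/\s+/g, " ").trim().slice(0, 80);
  return text || "Pi";
}

// Lowercase the color and fall back to the default when it is not a known group color.
function cleanGroupColor(value) {
  const color = String(value || DEFAULT_GROUP_COLOR).toLowerCase();
  return VALID_GROUP_COLORS.has(color) ? color : DEFAULT_GROUP_COLOR;
}

// Hypothetical helper (assumption, not in the package): mirrors the `tab.new`
// branch and returns null when the new tab would stay ungrouped.
function newTabGrouping(params) {
  const optOut = params.groupTitle === "" || params.group === false;
  if (optOut && !params.groupColor) return null;
  return {
    title: cleanGroupTitle(params.groupTitle || "Pi"),
    color: cleanGroupColor(params.groupColor),
  };
}

console.log(newTabGrouping({}));               // { title: 'Pi', color: 'blue' }
console.log(newTabGrouping({ group: false })); // null
console.log(newTabGrouping({ groupTitle: "  Pi -  audit ", groupColor: "TEAL" }));
console.log(PI_GROUP_RE.test("Pi - audit"), PI_GROUP_RE.test("Pizza"));
```

The last two calls show an unknown color falling back to the default and the `piGroup` pattern accepting `Pi - audit` while rejecting `Pizza`.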

package/README.md CHANGED

@@ -217,7 +217,7 @@ Multiple Pi sessions (planner / worker / audit) can all drive the same Chrome at
 ## Built-in benchmark suite
-[`test-suite/`](./test-suite) is a benchmark for **any** browser-control agent (not just pi-chrome). It includes **41 primitive challenges** plus **4 hermetic BrowserGym-style long-horizon tasks**.
+[`test-suite/`](./test-suite) is a benchmark for **any** browser-control agent (not just pi-chrome). It includes **42 primitive challenges** plus **4 hermetic BrowserGym-style long-horizon tasks**.
 Scoring tracks expected outcomes per challenge rather than raw PASS count, so tools are judged against their declared browser-control capability. Unit challenges are split into gate buckets:

package/docs/COMPARISON.md CHANGED

@@ -134,7 +134,7 @@ If your threat model excludes extensions with broad permissions, neither approac
 ## Public benchmarks worth knowing (for axis 2 / axis 3 comparison)
-Pi-chrome itself ships a benchmark suite ([`../test-suite/`](../test-suite)) of **41 primitive challenges** plus **4 hermetic BrowserGym-style long-horizon tasks** covering trusted input, pointer humanization, keyboard fidelity, drag/drop, Shadow DOM, iframes, file uploads, strict-CSP screenshot fallback, dynamic waits, tab lifecycle, network observability, fingerprint leaks, and agent-safety honeypots. Scoring tracks expected outcomes per challenge instead of raw PASS count, with `core`, `conditional`, and `quality` gate buckets. That's **driver-level** grading.
+Pi-chrome itself ships a benchmark suite ([`../test-suite/`](../test-suite)) of **42 primitive challenges** plus **4 hermetic BrowserGym-style long-horizon tasks** covering trusted input, pointer humanization, keyboard fidelity, drag/drop, Shadow DOM, iframes, file uploads, strict-CSP screenshot fallback and CDP eval/snapshot bypass, dynamic waits, tab lifecycle, network observability, fingerprint leaks, and agent-safety honeypots. Scoring tracks expected outcomes per challenge instead of raw PASS count, with `core`, `conditional`, and `quality` gate buckets. That's **driver-level** grading.
 For **agent-level** comparison (axis 2), the public benchmarks worth citing:

package/docs/EXAMPLES.md CHANGED

@@ -161,6 +161,6 @@ Interactive tools use Chrome's real input layer by default: clicks, typing, fill
 - fullscreen and other user-activation checks
 - pages where DOM injection/evaluate is limited, if the agent can use screenshots + coordinates
-Strict CSP note: `chrome_snapshot`/`chrome_evaluate` may be blocked on pages that disallow `unsafe-eval`; `chrome_screenshot`, tab/navigation tools, and real input still work.
+Strict CSP note: `chrome_snapshot`/`chrome_evaluate` work even on pages that disallow `unsafe-eval`, because they run via CDP `Runtime.evaluate` (not page-level `eval`/`new Function`), which is not subject to page CSP. `chrome_screenshot`, tab/navigation tools, and real input also work under any CSP.
 Chrome may show its debugger banner while pi-chrome is attached.
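
For orientation, the parameter object that the diff's `cdpEval` helper sends with `Runtime.evaluate` can be reproduced in isolation. This is a minimal sketch: `buildEvalParams` is a hypothetical name used only here, and the actual call in the extension goes through its own `cdp` helper, whose body is not shown in this diff.

```javascript
// Hypothetical helper (assumption): builds the same Runtime.evaluate parameters
// that cdpEval in the diff passes, with caller options spread last so they win.
function buildEvalParams(expression, opts) {
  return {
    expression,
    returnByValue: true, // ask for a JSON-serialized value instead of a remote object handle
    awaitPromise: true,  // resolve promises before returning
    userGesture: true,   // treat the evaluation as user-initiated
    ...(opts || {}),
  };
}

console.log(buildEvalParams("document.title"));
console.log(buildEvalParams("1 + 1", { awaitPromise: false }).awaitPromise); // false
```

Because the options are spread after the defaults, a caller can override any single flag without restating the rest.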

package/docs/FAQ.md CHANGED

@@ -55,7 +55,7 @@ pi-chrome controls web pages through Chrome extension APIs, page inspection, scr
 ## Does `chrome_evaluate` work on strict-CSP pages?
-Not always. `chrome_evaluate` and `chrome_snapshot` run in the page's MAIN world through the Function constructor, so pages whose CSP blocks `'unsafe-eval'` can reject them. `chrome_screenshot`, `chrome_navigate`, tab tools, and real Chrome input still work because they use extension/browser APIs rather than page JavaScript.
+Yes. `chrome_evaluate` and `chrome_snapshot` run in the page's MAIN world through CDP `Runtime.evaluate`, which is a DevTools protocol command and is **not** subject to the page's Content-Security-Policy. They work even on pages that block `'unsafe-eval'` (e.g. github.com and many bank/SaaS apps). `chrome_navigate`'s `initScript` injects at document_start via CDP and likewise bypasses CSP. `chrome_screenshot`, tab tools, and real Chrome input also keep working under any CSP.
 ## How do I tell whether a click or type worked?
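
The expression/statement fallback that `evaluateInTab` keeps in this release can be sketched locally. In the extension the wrapper string is sent to CDP and the retry is triggered by a `SyntaxError` reported in `exceptionDetails`; in this sketch Node's `Function` constructor is used only as a compile-time syntax check standing in for that round trip, which is an assumption of the example. `pickForm` is a hypothetical helper name.

```javascript
// The two wrapper forms, as built in the evaluateInTab diff.
const exprForm = (src) => `(async () => (${src}))()`;
const stmtForm = (src) => `(async () => { ${src} })()`;

// Hypothetical helper (assumption): choose the form that parses.
// The compiled functions are never invoked, so page globals are not needed.
function pickForm(src) {
  try {
    new Function(`return ${exprForm(src)};`);
    return { mode: "expression", code: exprForm(src) };
  } catch (error) {
    if (error.name !== "SyntaxError") throw error;
    new Function(`return ${stmtForm(src)};`); // throws again if the source is invalid in both forms
    return { mode: "statement", code: stmtForm(src) };
  }
}

console.log(pickForm("document.title").mode); // expression
console.log(pickForm("let n = 0; for (const x of [1, 2]) n += x; return n;").mode); // statement
```

Trying the expression form first lets callers pass `1+1` or `document.title` without a `return`, while multi-statement bodies still work through the second form.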

package/extensions/chrome-profile-bridge/browser-extension/manifest.json CHANGED

@@ -1,10 +1,11 @@
 {
   "manifest_version": 3,
   "name": "Pi Chrome Connector",
-  "version": "0.15.28",
+  "version": "0.15.30",
   "description": "Lets Pi control tabs in Chrome via a local connector at 127.0.0.1.",
   "permissions": [
     "tabs",
+    "tabGroups",
     "scripting",
     "storage",
     "activeTab",

package/extensions/chrome-profile-bridge/browser-extension/service_worker.js CHANGED

@@ -1,6 +1,9 @@
 const BRIDGE_URL = "http://127.0.0.1:17318";
 const CLIENT_NAME = `Pi Chrome Connector ${chrome.runtime.id}`;
 const POLL_ERROR_BACKOFF_MS = 2000;
+const DEFAULT_GROUP_COLOR = "blue";
+const PI_GROUP_RE = /^Pi(\b|\s*-)/i;
+const VALID_GROUP_COLORS = new Set(["grey", "blue", "red", "yellow", "green", "pink", "purple", "cyan", "orange"]);
 let polling = false;
 // =================== Chrome input (CDP) layer ===================
@@ -217,6 +220,38 @@ async function cdp(tabId, method, params) {
   }
 }
+// cdpEval: evaluate a JavaScript expression string in the page's MAIN world via CDP
+// Runtime.evaluate. Runtime.evaluate is a DevTools protocol command and is NOT subject to
+// the page's Content-Security-Policy, so it works on pages that ship `script-src 'self'`
+// without `'unsafe-eval'` (which blocks `eval`/`new Function`). Ensures the debugger is
+// attached first. Returns the raw CDP result ({ result, exceptionDetails }).
+async function cdpEval(tabId, expression, opts) {
+  await attachDebugger(tabId);
+  return cdp(tabId, "Runtime.evaluate", {
+    expression,
+    returnByValue: true,
+    awaitPromise: true,
+    userGesture: true,
+    ...(opts || {}),
+  });
+}
+function cdpExceptionText(details) {
+  if (!details) return "";
+  return String(
+    details.exception?.description ||
+      details.exception?.value ||
+      details.text ||
+      "",
+  );
+}
+function cdpIsSyntaxError(details) {
+  if (!details) return false;
+  const className = String(details.exception?.className || "");
+  return className === "SyntaxError" || /SyntaxError/.test(cdpExceptionText(details));
+}
 // Resolve target -> {x, y, rect} in viewport coords by running tiny script in tab.
 async function resolveTargetInTab(tabId, params) {
   const results = await chrome.scripting.executeScript({
@@ -677,6 +712,58 @@ function isVersionOlder(a, b) {
   return false;
 }
+function cleanGroupTitle(value) {
+  const text = String(value || "Pi").replace(/\s+/g, " ").trim().slice(0, 80);
+  return text || "Pi";
+}
+function cleanGroupColor(value) {
+  const color = String(value || DEFAULT_GROUP_COLOR).toLowerCase();
+  return VALID_GROUP_COLORS.has(color) ? color : DEFAULT_GROUP_COLOR;
+}
+async function groupRecord(groupId) {
+  if (typeof groupId !== "number" || groupId < 0 || !chrome.tabGroups) return null;
+  const group = await chrome.tabGroups.get(groupId).catch(() => null);
+  if (!group) return null;
+  return {
+    id: group.id,
+    title: group.title || "",
+    color: group.color || "",
+    collapsed: Boolean(group.collapsed),
+    windowId: group.windowId,
+    piGroup: Boolean(group.title && PI_GROUP_RE.test(group.title)),
+  };
+}
+// Find an existing tab group in `windowId` whose title matches `title` (case-insensitive).
+// Used so all Pi-opened tabs collect into one group per window instead of spawning new ones.
+async function findGroupByTitle(windowId, title) {
+  if (!chrome.tabGroups) return null;
+  const wanted = cleanGroupTitle(title).toLowerCase();
+  const groups = await chrome.tabGroups.query({ windowId }).catch(() => []);
+  const match = groups.find((g) => (g.title || "").trim().toLowerCase() === wanted);
+  return match ? match.id : null;
+}
+// Add `tab` to a tab group, then set title/color. If the tab is ungrouped, reuse an
+// existing same-title group in its window when present, otherwise create a new group.
+async function groupTab(tab, title, color) {
+  if (!chrome.tabGroups) throw new Error("chrome.tabGroups API unavailable; reload the extension after granting the tabGroups permission");
+  if (!tab || typeof tab.id !== "number") throw new Error("No tab to group");
+  const groupTitle = cleanGroupTitle(title);
+  let groupId = tab.groupId;
+  if (typeof groupId !== "number" || groupId < 0) {
+    const existing = await findGroupByTitle(tab.windowId, groupTitle);
+    groupId = existing !== null
+      ? await chrome.tabs.group({ groupId: existing, tabIds: [tab.id] })
+      : await chrome.tabs.group({ tabIds: [tab.id] });
+  }
+  await chrome.tabGroups.update(groupId, { title: groupTitle, color: cleanGroupColor(color), collapsed: false });
+  const grouped = await chrome.tabs.get(tab.id);
+  return { tab: await formatTab(grouped), group: await groupRecord(groupId) };
+}
 async function dispatch(action, params) {
   switch (action) {
     case "tab.version":
@@ -686,17 +773,31 @@ async function dispatch(action, params) {
         bridgeUrl: BRIDGE_URL,
         userAgent: navigator.userAgent,
       };
-    case "tab.list":
-      return (await chrome.tabs.query({})).map(formatTab);
+    case "tab.list": {
+      const tabs = await chrome.tabs.query({});
+      return Promise.all(tabs.map(formatTab));
+    }
     case "tab.new": {
       const tab = await chrome.tabs.create({ url: params.url || "about:blank", active: true });
-      return formatTab(tab);
+      // Every Pi-opened tab joins a group by default. Pass groupTitle:"" (or group:false) to opt out.
+      const optOut = params.groupTitle === "" || params.group === false;
+      if (optOut && !params.groupColor) return formatTab(tab);
+      return groupTab(tab, params.groupTitle || "Pi", params.groupColor);
     }
     case "tab.activate": {
       const tab = await getTabByParams(params);
       await chrome.windows.update(tab.windowId, { focused: true });
       return formatTab(await chrome.tabs.update(tab.id, { active: true }));
     }
+    case "tab.group": {
+      const tab = await getTabByParams(params);
+      return groupTab(tab, params.groupTitle || "Pi", params.groupColor);
+    }
+    case "tab.ungroup": {
+      const tab = await getTabByParams(params);
+      if (typeof tab.groupId === "number" && tab.groupId >= 0) await chrome.tabs.ungroup(tab.id);
+      return formatTab(await chrome.tabs.get(tab.id));
+    }
     case "tab.close": {
       const tab = await getTabByParams(params);
       await chrome.tabs.remove(tab.id);
@@ -739,8 +840,29 @@ async function dispatch(action, params) {
       return executeInTab(params, listNetworkRequests, [params.includePreservedRequests === true, params.clear === true]);
     case "page.network.get":
       return executeInTab(params, getNetworkRequest, [params.requestId]);
-    case "page.waitFor":
-      return executeInTab(params, waitForPage, [params.kind, params.value, params.timeoutMs || 10000, params.intervalMs || 250]);
+    case "page.waitFor": {
+      // Poll from the service worker via CDP (bypasses CSP). The old approach ran the polling
+      // loop in-page with new Function() for expression checks, which fails under strict CSP.
+      const tab = await getTabByParams(params);
+      if (params.foreground) await bringToFront(tab);
+      const timeoutMs = params.timeoutMs || 10000;
+      const intervalMs = params.intervalMs || 250;
+      const started = Date.now();
+      while (Date.now() - started < timeoutMs) {
+        let ok = false;
+        try {
+          const expr = params.kind === "selector"
+            ? `!!document.querySelector(${JSON.stringify(params.value)})`
+            : params.value;
+          ok = Boolean(await evaluateInTab({ ...params, expression: expr, foreground: false }));
+        } catch {
+          ok = false;
+        }
+        if (ok) return { elapsedMs: Date.now() - started };
+        await sleep(intervalMs);
+      }
+      throw new Error(`Timed out after ${timeoutMs}ms waiting for ${params.kind}: ${params.value}`);
+    }
     case "page.probe":
       // Lightweight capability probe for /chrome-doctor. Runs in MAIN world.
       return executeInTab(params, probePage, []);
@@ -758,7 +880,7 @@ async function dispatch(action, params) {
       } finally {
         if (params.initScript) await unregisterInitScript(tab.id).catch(() => undefined);
       }
-      return formatTab(await chrome.tabs.get(updated.id));
+      return await formatTab(await chrome.tabs.get(updated.id));
     }
     case "page.screenshot":
       return takeScreenshot(params);
@@ -767,7 +889,7 @@ async function dispatch(action, params) {
   }
 }
-function formatTab(tab) {
+async function formatTab(tab) {
   return {
     id: tab.id,
     windowId: tab.windowId,
@@ -778,6 +900,8 @@ function formatTab(tab) {
     status: tab.status,
     pinned: tab.pinned,
     incognito: tab.incognito,
+    groupId: typeof tab.groupId === "number" ? tab.groupId : -1,
+    group: await groupRecord(tab.groupId),
   };
 }
@@ -847,25 +971,33 @@ const HELPER_FUNCS = [
 async function executeInTab(params, func, args) {
   const tab = await getTabByParams(params);
   if (params.foreground) await bringToFront(tab);
-  const helperSource = HELPER_FUNCS.map((helper) => helper.toString()).join("\n");
+  // Phase 1: define the helpers and the action function as page globals via CDP
+  // Runtime.evaluate. This bypasses page CSP (no `eval`/`new Function`), which is the
+  // root cause of snapshot/click/etc silently failing on `script-src 'self'` sites.
+  // Each helper is a named function declaration, assigned to window.<name> so the action
+  // (which references helpers by bare name) resolves them as globals at call time.
+  const assignments = HELPER_FUNCS.map((helper) => `window.${helper.name}=${helper.toString()}`).join(";\n");
+  const actionAssign = `window.__piAction=(${func.toString()})`;
+  const defineRes = await cdpEval(tab.id, `(()=>{${assignments};\n${actionAssign};})()`);
+  if (defineRes.exceptionDetails) {
+    throw new Error(`Failed to inject Chrome page helpers: ${cdpExceptionText(defineRes.exceptionDetails) || "unknown error"}`);
+  }
+  // Phase 2: run the action via chrome.scripting.executeScript. The `func:` form is
+  // injected by Chrome itself (not `new Function`), so it is CSP-safe, and it lets Chrome
+  // serialize the invocation args. The wrapper references window.__piAction defined above.
   const results = await chrome.scripting.executeScript({
     target: { tabId: tab.id },
     world: "MAIN",
-    func: async (helperSource, source, invocationArgs) => {
+    func: async (invocationArgs) => {
       try {
-        // Helpers are plain function declarations; injecting them via Function constructor avoids
-        // running through `eval` (which is restricted under strict CSP) and keeps them isolated.
-        new Function(helperSource).call(globalThis);
-        // The action itself is reconstructed from its source text. We use `new Function` rather
-        // than `eval` because the latter is blocked by `script-src 'self'` (no `'unsafe-eval'`)
-        // CSPs that are common on production sites.
-        const injected = new Function(helperSource + "\nreturn (" + source + ");").call(globalThis);
-        return { ok: true, value: await injected(...invocationArgs) };
+        return { ok: true, value: await window.__piAction(...invocationArgs) };
       } catch (error) {
         return { ok: false, error: error?.stack || error?.message || String(error) };
       }
     },
-    args: [helperSource, func.toString(), args],
+    args: [args || []],
   });
   const first = results?.[0];
   if (first?.error) {
@@ -879,72 +1011,54 @@ async function executeInTab(params, func, args) {
   return envelope?.value;
 }
-// Dedicated executor for page.evaluate. Doesn't go through the helper-source injection chain;
-// that chain was the root cause of `chrome_evaluate` silently returning null on pages with strict
-// CSP. We build a single Function in MAIN world and invoke it directly.
+// Serializer for page.evaluate results. Embedded (via .toString()) into the CDP-evaluated
+// expression so we can return rich markers for values that don't survive returnByValue
+// (undefined/function/symbol/bigint/Error), plus expand DOMRect-like objects whose fields
+// are non-enumerable. Kept as a standalone function so it stays editable/lintable.
+function piEvalStringify(v) {
+  if (v === undefined) return { kind: "undefined" };
+  if (typeof v === "function") return { kind: "function", source: v.toString().slice(0, 500) };
+  if (typeof v === "symbol") return { kind: "symbol", description: v.description };
+  if (typeof v === "bigint") return { kind: "bigint", value: v.toString() };
+  if (v instanceof Error) return { kind: "error", name: v.name, message: v.message, stack: v.stack };
+  // DOMRect/DOMRectReadOnly (and getBoundingClientRect results) have non-enumerable
+  // properties, so JSON.stringify yields `{}`. Expand the fields explicitly.
+  if ((typeof DOMRectReadOnly !== "undefined" && v instanceof DOMRectReadOnly) ||
+      (typeof DOMRect !== "undefined" && v instanceof DOMRect) ||
+      (v && typeof v === "object" && typeof v.toJSON === "function" &&
+       typeof v.width === "number" && typeof v.height === "number" && typeof v.top === "number")) {
+    return { x: v.x, y: v.y, width: v.width, height: v.height, top: v.top, right: v.right, bottom: v.bottom, left: v.left };
+  }
+  return v;
+}
+// Dedicated executor for page.evaluate. Uses CDP Runtime.evaluate (via cdpEval) which is not
+// subject to the page's CSP, fixing `chrome_evaluate` silently returning null / failing on
+// pages that ship `script-src 'self'` without `'unsafe-eval'` (which blocks `eval`/`new Function`).
 async function evaluateInTab(params) {
   const tab = await getTabByParams(params);
   if (params.foreground) await bringToFront(tab);
   const expression = String(params.expression ?? "");
-  const awaitPromise = params.awaitPromise !== false;
-  const results = await chrome.scripting.executeScript({
-    target: { tabId: tab.id },
-    world: "MAIN",
-    func: async (expression, awaitPromise) => {
-      const stringify = (v) => {
-        if (v === undefined) return { kind: "undefined" };
-        if (typeof v === "function") return { kind: "function", source: v.toString().slice(0, 500) };
-        if (typeof v === "symbol") return { kind: "symbol", description: v.description };
-        if (typeof v === "bigint") return { kind: "bigint", value: v.toString() };
-        if (v instanceof Error) return { kind: "error", name: v.name, message: v.message, stack: v.stack };
-        // DOMRect/DOMRectReadOnly (and getBoundingClientRect results) have non-enumerable
-        // properties, so JSON.stringify yields `{}`. Expand the fields explicitly.
-        if ((typeof DOMRectReadOnly !== "undefined" && v instanceof DOMRectReadOnly) ||
-            (typeof DOMRect !== "undefined" && v instanceof DOMRect) ||
-            (v && typeof v === "object" && typeof v.toJSON === "function" &&
-             typeof v.width === "number" && typeof v.height === "number" && typeof v.top === "number")) {
-          return { x: v.x, y: v.y, width: v.width, height: v.height, top: v.top, right: v.right, bottom: v.bottom, left: v.left };
-        }
-        return v;
-      };
-      // Compile via the Function constructor. We try expression form first so callers can pass
-      // `1+1` or `document.title` without a `return`; if that's a SyntaxError we retry with the
-      // statement form so callers can use multi-statement bodies (loops, var decls, etc).
-      const compile = (src) => {
-        try {
-          return { fn: new Function(`return (async () => (${src}))();`), mode: "expression" };
-        } catch (e1) {
-          if (e1 && e1.name === "SyntaxError") {
-            try {
-              return { fn: new Function(`return (async () => { ${src} })();`), mode: "statement" };
-            } catch (e2) {
-              throw e2;
-            }
-          }
-          throw e1;
-        }
-      };
-      try {
-        const { fn } = compile(expression);
-        const value = await fn.call(globalThis);
-        const resolved = awaitPromise && value && typeof value.then === "function" ? await value : value;
-        return { ok: true, value: stringify(resolved) };
-      } catch (error) {
-        return { ok: false, error: error?.stack || error?.message || String(error) };
-      }
-    },
-    args: [expression, awaitPromise],
-  });
-  const first = results?.[0];
-  if (first?.error) {
-    const message = typeof first.error === "string" ? first.error : (first.error.message || JSON.stringify(first.error));
-    throw new Error(`chrome_evaluate failed: ${message}`);
-  }
-  const envelope = first?.result;
-  if (!envelope) throw new Error("chrome_evaluate returned no envelope from MAIN world");
-  if (envelope.ok === false) throw new Error(envelope.error || "chrome_evaluate failed");
-  const v = envelope.value;
-  // Unwrap special markers from MAIN world
+  const stringifySrc = `(${piEvalStringify.toString()})`;
+  // Wrap the user expression so the result is run through piEvalStringify in-page before it
+  // crosses the returnByValue boundary. Try expression form first (so `1+1` / `document.title`
+  // work without `return`); on a SyntaxError fall back to statement form for multi-statement
+  // bodies (loops, var decls, etc), matching the previous new Function() two-form behavior.
+  const buildWrapper = (form) => `(async () => { const __s=${stringifySrc}; const __v = await ${form}; return __s(__v); })()`;
+  const exprForm = `(async () => (${expression}))()`;
+  const stmtForm = `(async () => { ${expression} })()`;
+  let res = await cdpEval(tab.id, buildWrapper(exprForm));
+  if (res.exceptionDetails && cdpIsSyntaxError(res.exceptionDetails)) {
+    res = await cdpEval(tab.id, buildWrapper(stmtForm));
+  }
+  if (res.exceptionDetails) {
+    throw new Error(`chrome_evaluate failed: ${cdpExceptionText(res.exceptionDetails) || "evaluation failed"}`);
+  }
+  const result = res.result;
+  if (!result || result.type === "undefined") return undefined;
+  const v = result.value;
+  // Unwrap special markers produced by piEvalStringify.
   if (v && typeof v === "object" && !Array.isArray(v)) {
     if (v.kind === "undefined") return undefined;
     if (v.kind === "function") return `[Function: ${v.source}]`;
@@ -964,29 +1078,23 @@ async function withOptionalSnapshot(params, actionFn) {
   return result;
 }
-// One-shot init script registry, scoped per tab. The script source is injected at
-// document_start of the next committed navigation in that tab, in MAIN world, then cleared.
-const initScriptIds = new Map();
+// One-shot init script registry, scoped per tab. The source is registered with CDP
+// Page.addScriptToEvaluateOnNewDocument, which runs it at document_start in the page's MAIN
+// world and is NOT subject to page CSP (the old func:(code)=>new Function(code) path was
+// blocked by `script-src 'self'`). page.navigate registers before the nav and unregisters
+// after load, so only the intended navigation receives the script.
+const initScriptIds = new Map(); // tabId -> CDP script identifier
 async function registerInitScript(tabId, source) {
-  initScriptIds.set(tabId, source);
+  await attachDebugger(tabId);
+  await cdp(tabId, "Page.enable", {}).catch(() => undefined);
+  const result = await cdp(tabId, "Page.addScriptToEvaluateOnNewDocument", { source });
+  if (result && result.identifier !== undefined) initScriptIds.set(tabId, result.identifier);
 }
 async function unregisterInitScript(tabId) {
+  const identifier = initScriptIds.get(tabId);
+  if (identifier === undefined) return;
   initScriptIds.delete(tabId);
-}
-if (chrome.webNavigation && chrome.webNavigation.onCommitted) {
-  chrome.webNavigation.onCommitted.addListener((details) => {
-    if (details.frameId !== 0) return;
-    const source = initScriptIds.get(details.tabId);
-    if (!source) return;
-    chrome.scripting.executeScript({
-      target: { tabId: details.tabId, frameIds: [0] },
-      world: "MAIN",
-      injectImmediately: true,
-      func: (code) => { try { new Function(code).call(globalThis); } catch (e) { console.error("[pi-chrome init script]", e); } },
-      args: [source],
-    }).catch(() => undefined);
-  });
+  await cdp(tabId, "Page.removeScriptToEvaluateOnNewDocument", { identifier }).catch(() => undefined);
 }
 // Always inject early console/network capture at document_start on every navigation.
@@ -1058,7 +1166,7 @@ async function takeScreenshot(params) {
       await executeInTab({ ...params, foreground: false }, scrollToY, [tiles.originalScrollY]);
       return {
         fullPage: true,
-        tab: formatTab(tab),
+        tab: await formatTab(tab),
         dimensions: { width: tiles.width, height: tiles.height, viewportHeight: tiles.viewportHeight, dpr: tiles.dpr },
         tiles: captured,
       };
@@ -1067,7 +1175,7 @@ async function takeScreenshot(params) {
       format: params.format || "png",
       quality: params.format === "jpeg" ? params.quality : undefined,
     });
-    return { dataUrl, tab: formatTab(tab) };
+    return { dataUrl, tab: await formatTab(tab) };
   } finally {
     if (previousActiveId !== undefined && previousActiveId !== tab.id) {
       await chrome.tabs.update(previousActiveId, { active: true }).catch(() => undefined);
@@ -2076,20 +2184,6 @@ function getNetworkRequest(requestId) {
   return request;
 }
-async function waitForPage(kind, value, timeoutMs, intervalMs) {
-  const started = Date.now();
-  while (Date.now() - started < timeoutMs) {
-    let ok = false;
-    if (kind === "selector") ok = Boolean(document.querySelector(value));
-    else {
-      try { ok = Boolean(new Function("return (" + value + ");").call(globalThis)); } catch { ok = false; }
-    }
-    if (ok) return { elapsedMs: Date.now() - started };
-    await new Promise((resolve) => setTimeout(resolve, intervalMs));
-  }
-  throw new Error(`Timed out after ${timeoutMs}ms waiting for ${kind}: ${value}`);
-}
 function normalizeKey(key) {
   const table = {
     enter: "Enter",
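
The polling loop that replaces the removed `waitForPage` function can be run outside Chrome if the evaluator is injected. The sketch below follows the `page.waitFor` case in the diff; the `evaluate` parameter is an assumption standing in for `evaluateInTab`, and `buildWaitExpression` and `waitFor` are hypothetical names used only for this illustration.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Selector waits become a boolean expression; expression waits pass through unchanged.
function buildWaitExpression(kind, value) {
  return kind === "selector"
    ? `!!document.querySelector(${JSON.stringify(value)})`
    : value;
}

// `evaluate` is an injected async function (assumption) that receives the
// expression string and resolves to its value.
async function waitFor(evaluate, params) {
  const timeoutMs = params.timeoutMs || 10000;
  const intervalMs = params.intervalMs || 250;
  const expression = buildWaitExpression(params.kind, params.value);
  const started = Date.now();
  while (Date.now() - started < timeoutMs) {
    let ok = false;
    try {
      ok = Boolean(await evaluate(expression));
    } catch {
      ok = false; // an evaluation error counts as "not ready yet"
    }
    if (ok) return { elapsedMs: Date.now() - started };
    await sleep(intervalMs);
  }
  throw new Error(`Timed out after ${timeoutMs}ms waiting for ${params.kind}: ${params.value}`);
}

// Usage with a fake evaluator that succeeds on the third poll.
let calls = 0;
waitFor(async () => ++calls >= 3, { kind: "selector", value: 'a[href="/x"]', intervalMs: 5 })
  .then((result) => console.log(calls, result.elapsedMs >= 0)); // 3 true
```

`JSON.stringify` quotes the selector, so selectors containing quotes or backslashes are embedded safely in the generated expression.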

package/extensions/chrome-profile-bridge/index.ts CHANGED

@@ -500,7 +500,7 @@ class ChromeProfileBridge {
 	}
 }
-const tabActionValues = ["list", "new", "activate", "close", "version"] as const;
+const tabActionValues = ["list", "new", "activate", "close", "group", "ungroup", "version"] as const;
 const imageFormatValues = ["png", "jpeg"] as const;
 const waitForValues = ["selector", "expression"] as const;
 const CHROME_TOOL_NAMES = [
@@ -665,7 +665,7 @@ Chrome control is available through the chrome_* tools via a companion Chrome ex
 Capability model (important):
 - Interactive controls (click/type/fill/key/hover/drag/scroll/tap) use Chrome's real input layer via chrome.debugger / CDP. Events satisfy normal user-activation gates.
 - Input bypasses page CSP because it is injected at browser input layer, not page JavaScript. Chrome may show the “Pi Chrome Connector started debugging this browser” banner while attached.
-- \`chrome_evaluate\` and \`chrome_snapshot\` run in MAIN world via the **Function constructor**, which requires \`'unsafe-eval'\` in the page CSP. Pages with strict CSP (e.g. github.com, many bank/SaaS apps) will throw \`EvalError: ... 'unsafe-eval' is not an allowed source of script\` and chrome_snapshot will return empty. On those pages, drive the page with \`chrome_screenshot\` + viewport-coordinate \`chrome_click\`/\`chrome_type\`/\`chrome_key\`. \`chrome_navigate\`, \`chrome_screenshot\`, \`chrome_tab\`, and Chrome input all keep working under any CSP.
+- \`chrome_evaluate\` and \`chrome_snapshot\` run in MAIN world via **CDP \`Runtime.evaluate\`**, which is not subject to the page's Content-Security-Policy. They work even on strict-CSP pages (e.g. github.com, many bank/SaaS apps) that block \`'unsafe-eval'\`. \`chrome_navigate initScript\` likewise injects at document_start via CDP and bypasses CSP. \`chrome_screenshot\`, \`chrome_tab\`, and Chrome input also work under any CSP.
 - Input tools return structured details and support \`includeSnapshot=true\` on click/type/fill/key. Use the fresh snapshot to verify state instead of repeating blindly.
 Usage rules:
@@ -1029,20 +1029,25 @@ Usage rules:
 	pi.registerTool({
 		name: "chrome_tab",
 		label: "Chrome Tab",
-		description: "List, create, activate, close, or inspect tabs in the user's existing Chrome profile via the companion extension.",
-		promptSnippet: "List/open/activate/close existing Chrome tabs through the companion extension.",
+		description: "List, create, activate, close, group, ungroup, or inspect tabs in the user's existing Chrome profile via the companion extension.",
+		promptSnippet: "List/open/activate/close/group existing Chrome tabs through the companion extension.",
 		parameters: Type.Object({
 			action: StringEnum(tabActionValues),
 			url: Type.Optional(Type.String({ description: "URL for action=new." })),
-			targetId: Type.Optional(Type.String({ description: "Chrome tab id for activate/close." })),
+			targetId: Type.Optional(Type.String({ description: "Chrome tab id for activate/close/group/ungroup." })),
+			urlIncludes: Type.Optional(Type.String({ description: "Match the target tab by URL substring for activate/close/group/ungroup." })),
+			titleIncludes: Type.Optional(Type.String({ description: "Match the target tab by title substring for activate/close/group/ungroup." })),
+			group: Type.Optional(Type.Boolean({ description: "action=new only: pass false to open an ungrouped tab. By default every Pi-opened tab joins the window's 'Pi' tab group." })),
+			groupTitle: Type.Optional(Type.String({ description: "Tab group title for action=group (or action=new to open into a named group). Defaults to 'Pi'. Pass an empty string on action=new to opt out of grouping." })),
+			groupColor: Type.Optional(Type.String({ description: "Tab group color for action=group/new: grey, blue, red, yellow, green, pink, purple, cyan, or orange. Defaults to blue." })),
 			host: Type.Optional(Type.String()),
 			port: Type.Optional(Type.Number()),
 		}),
 		async execute(_id, params, signal): Promise<ToolTextResult> {
 			const result = await authorizedBridgeSend(`tab.${params.action}`, params, DEFAULT_TIMEOUT_MS, signal);
 			if (params.action === "list") {
-				const tabs = result as Array<{ id: number; title: string; url: string; active: boolean; windowId: number }>;
-				const text = tabs.map((tab) => `${tab.id}\t${tab.active ? "*" : " "}\t${tab.title || "(untitled)"}\t${tab.url}`).join("\n") || "No tabs.";
+				const tabs = result as Array<{ id: number; title: string; url: string; active: boolean; windowId: number; group?: { title?: string } | null }>;
+				const text = tabs.map((tab) => `${tab.id}\t${tab.active ? "*" : " "}\t${tab.group?.title ? `[${tab.group.title}] ` : ""}${tab.title || "(untitled)"}\t${tab.url}`).join("\n") || "No tabs.";
 				return { content: [{ type: "text", text }], details: { tabs } };
 			}
 			return { content: [{ type: "text", text: safeJson(result) }], details: { result: result as Json } };
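
The `action === "list"` branch above builds one tab-separated row per tab and prefixes the title with `[Group Title]` when the tab carries a group record. A minimal standalone sketch of that formatting follows, assuming only the fields the hunk reads. `TabRow`, `formatTabRow`, `formatTabList`, and the sample data are hypothetical names for illustration, not exports of the package.

```typescript
// Hypothetical shape mirroring the fields the list branch reads; not the package's own type.
interface TabRow {
  id: number;
  title: string;
  url: string;
  active: boolean;
  group?: { title?: string } | null;
}

// One row: id, active marker ("*" or a space), optional "[Group] " prefix plus title, then URL.
function formatTabRow(tab: TabRow): string {
  const prefix = tab.group?.title ? `[${tab.group.title}] ` : "";
  return `${tab.id}\t${tab.active ? "*" : " "}\t${prefix}${tab.title || "(untitled)"}\t${tab.url}`;
}

// Rows are joined with newlines; an empty list falls back to "No tabs.".
function formatTabList(tabs: TabRow[]): string {
  return tabs.map(formatTabRow).join("\n") || "No tabs.";
}

// Illustrative data only.
const sampleTabs: TabRow[] = [
  { id: 1, title: "Inbox", url: "https://example.com/mail", active: true, group: { title: "Pi" } },
  { id: 2, title: "", url: "about:blank", active: false, group: null },
];
const listing = formatTabList(sampleTabs);
```

Ungrouped tabs keep the pre-0.15.30 row layout because the prefix collapses to an empty string, so consumers that split rows on tabs should see the same number of columns either way.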

package/package.json CHANGED

@@ -1,7 +1,8 @@
 {
 	"name": "pi-chrome",
-	"version": "0.15.28",
+	"version": "0.15.30",
 	"scripts": {
+		"test": "node test-suite/unit/csp-eval.test.mjs",
 		"version": "node scripts/sync-manifest-version.js",
 		"prepublishOnly": "node scripts/sync-manifest-version.js"
 	},

package/test-suite/README.md CHANGED

@@ -125,7 +125,7 @@ Each unit challenge has a `gate` field:
 - `dom-complexity` / `frames` — Shadow DOM and iframe targeting.
 - `files` — file attachment to `<input type=file>`.
 - `observability` — console/network capture tools.
-- `csp` — strict Content Security Policy fallback where eval/snapshot may fail.
+- `csp` — strict Content Security Policy: screenshot/coordinate fallback (39) and the CDP eval/snapshot bypass that works under `script-src 'self'` without `unsafe-eval` (42).
 - `lazy-loading` — dynamic DOM readiness and wait behavior.
 - `fingerprint` — environment and stack fingerprint probes.
 - `agent-safety` — hidden honeypots and safe target selection.
@@ -175,6 +175,7 @@ The dashboard renders this from `manifest.json`. In brief:
 39. strict CSP screenshot/coordinate fallback
 40. dynamic wait/readiness
 41. explicit tab lifecycle
+42. strict CSP eval/snapshot via CDP (regression guard for the CSP bypass)
 ## Design notes

package/test-suite/challenges/42-strict-csp-evaluate.html ADDED

@@ -0,0 +1,16 @@
+<!doctype html>
+<meta charset="utf-8">
+<meta http-equiv="Content-Security-Policy" content="default-src 'self'; script-src 'self'; style-src 'self' 'unsafe-inline'; object-src 'none'; base-uri 'none'">
+<title>42 strict CSP evaluate/snapshot</title>
+<link rel="stylesheet" href="../_style.css">
+<script src="../_lib.js"></script>
+<body>
+<main>
+  <p>Goal: this page ships a strict CSP (<code>script-src 'self'</code>, no <code>unsafe-eval</code>), which blocks <code>eval</code>/<code>new Function</code>. <code>chrome_evaluate</code> and <code>chrome_snapshot</code> must still work because they run through CDP, which is not subject to page CSP.</p>
+  <p id="hint">A secret token is exposed only at <code>window.__cspToken</code> — it is never written into the DOM. Use <code>chrome_evaluate</code> to read it, type it into the field (snapshot/uid to find the field), then click Verify.</p>
+  <label for="tokenInput">Token:</label>
+  <input id="tokenInput" type="text" autocomplete="off" aria-label="csp token">
+  <button id="verify" aria-label="verify token">Verify</button>
+</main>
+<script src="42-strict-csp-evaluate.js"></script>
+</body>

package/test-suite/challenges/42-strict-csp-evaluate.js ADDED

@@ -0,0 +1,21 @@
+Challenge.init({
+  id: "strict-csp-evaluate",
+  instructions: "under strict CSP: read window.__cspToken via chrome_evaluate, type it into the field, click Verify",
+});
+// Secret available only via JS evaluation. It is intentionally NOT rendered into the DOM and
+// is defined non-enumerable, so the only way to obtain it is to evaluate window.__cspToken in
+// the page (which proves chrome_evaluate works despite script-src 'self' blocking eval).
+const token = "csp-" + Math.random().toString(36).slice(2, 10);
+Object.defineProperty(window, "__cspToken", { value: token, enumerable: false, configurable: false, writable: false });
+document.getElementById("verify").addEventListener("click", (e) => {
+  const bad = [];
+  if (!e.isTrusted) bad.push("verify click isTrusted=false (use trusted/CDP input)");
+  const val = (document.getElementById("tokenInput").value || "").trim();
+  if (val !== token) {
+    bad.push(`token mismatch: got "${val}" expected "${token}" — chrome_evaluate must read window.__cspToken under strict CSP`);
+  }
+  if (bad.length) Challenge.fail(...bad);
+  else Challenge.pass("strict CSP: chrome_evaluate read the hidden token via CDP and trusted input submitted it");
+});
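
The challenge script above hides the token by defining it as a non-enumerable property. The sketch below shows what that does and does not hide, using a plain object as a stand-in for `window`. `fakeWindow` and the variable names are assumptions made for illustration.

```typescript
// Stand-in for `window`; the real challenge defines the property on the page's global object.
const fakeWindow: Record<string, unknown> = { visible: 1 };
const cspToken = "csp-" + Math.random().toString(36).slice(2, 10);

Object.defineProperty(fakeWindow, "__cspToken", {
  value: cspToken,
  enumerable: false,
  configurable: false,
  writable: false,
});

const enumerated = Object.keys(fakeWindow);     // enumeration skips the token
const serialized = JSON.stringify(fakeWindow);  // JSON serialization skips it as well
const direct = fakeWindow["__cspToken"];        // a direct property read still returns it
```

`Object.getOwnPropertyNames` would still list the key, but that also requires running JavaScript in the page, so the token stays reachable only through evaluation rather than through anything rendered in the DOM.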

package/test-suite/manifest.json CHANGED

@@ -1510,7 +1510,7 @@
       "strict-csp"
     ],
     "notes": [
-      "chrome_snapshot/chrome_evaluate may fail on this page because script-src omits unsafe-eval; this is intentional. Use screenshot plus viewport coordinates, then read verdict from dashboard/localStorage after leaving CSP page."
+      "This challenge exercises the pure screenshot + viewport-coordinate path (no snapshot/evaluate needed). Note: as of the CDP CSP bypass, chrome_snapshot/chrome_evaluate DO work here even though script-src omits unsafe-eval (see challenge 42 strict-csp-evaluate). Read verdict from dashboard/localStorage after leaving the CSP page."
     ],
     "manualBaseline": "unverified",
     "gradeSource": "page"
@@ -1626,5 +1626,56 @@
     ],
     "manualBaseline": "unverified",
     "gradeSource": "page"
+  },
+  {
+    "id": "strict-csp-evaluate",
+    "file": "challenges/42-strict-csp-evaluate.html",
+    "category": "csp",
+    "difficulty": "L2",
+    "gate": "core",
+    "goal": "On a page with strict CSP (script-src 'self', no unsafe-eval), use chrome_evaluate + chrome_snapshot via CDP to read a JS-only secret (window.__cspToken), then submit it with trusted input.",
+    "expected": {
+      "synthetic": "FAIL",
+      "trusted": "PASS",
+      "manual": "PASS"
+    },
+    "recipe": [
+      {
+        "tool": "chrome_evaluate",
+        "params": {
+          "expression": "window.__cspToken"
+        }
+      },
+      {
+        "tool": "chrome_fill",
+        "params": {
+          "selector": "#tokenInput",
+          "value": "$RESULT_OF_chrome_evaluate",
+          "trusted": true
+        }
+      },
+      {
+        "tool": "chrome_click",
+        "params": {
+          "selector": "#verify",
+          "trusted": true
+        }
+      }
+    ],
+    "requires": {
+      "cdp": true
+    },
+    "tags": [
+      "csp",
+      "evaluate",
+      "snapshot",
+      "strict-csp",
+      "cdp-bypass"
+    ],
+    "notes": [
+      "Regression guard for the CDP CSP bypass: chrome_evaluate and chrome_snapshot must work even though script-src omits unsafe-eval (eval/new Function are blocked). The token is non-enumerable and never in the DOM, so it can only be obtained via chrome_evaluate. Runner must substitute the evaluate result into the fill value. synthetic FAIL is due to the trusted-click gate, not eval availability."
+    ],
+    "manualBaseline": "unverified",
+    "gradeSource": "page"
   }
 ]
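
The notes for `strict-csp-evaluate` say the runner must substitute the evaluate result into the fill value, but the diff does not show how a runner resolves `$RESULT_OF_chrome_evaluate`. One possible approach is sketched below, assuming a placeholder is always the whole string and names the tool whose latest result should replace it. `substituteResults`, `Params`, and the sample token are hypothetical.

```typescript
type Params = Record<string, unknown>;

// Assumed placeholder grammar: the entire value is "$RESULT_OF_<tool name>".
const PLACEHOLDER = /^\$RESULT_OF_(\w+)$/;

// Returns a copy of `params` in which each placeholder string is replaced by the matching
// entry in `results` (tool name -> latest result). Unknown placeholders are left untouched.
function substituteResults(params: Params, results: Params): Params {
  const out: Params = {};
  for (const key of Object.keys(params)) {
    const value = params[key];
    const match = typeof value === "string" ? PLACEHOLDER.exec(value) : null;
    const tool = match ? match[1] : undefined;
    out[key] =
      tool !== undefined && Object.prototype.hasOwnProperty.call(results, tool)
        ? results[tool]
        : value;
  }
  return out;
}

// Illustrative call using the fill step from the recipe and a made-up token value.
const fillParams = substituteResults(
  { selector: "#tokenInput", value: "$RESULT_OF_chrome_evaluate", trusted: true },
  { chrome_evaluate: "csp-abc123" },
);
```

Leaving unresolved placeholders in place, rather than substituting `undefined`, makes a missing earlier step visible in the page verdict as a token mismatch instead of failing silently.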

package/test-suite/unit/csp-eval.test.mjs ADDED

@@ -0,0 +1,171 @@
+// Unit harness for the CSP-bypass layer in service_worker.js.
+//
+// The real CSP bypass (CDP Runtime.evaluate not being subject to page CSP) can only be
+// proven in a browser — see challenge 39-strict-csp-fallback. These tests instead validate
+// the JS *logic* of the refactor that the bypass depends on:
+//   - evaluateInTab: wrapper-string construction, expression/statement fallback, value
+//     marker round-trip (undefined/function/symbol/bigint/Error/DOMRect), error propagation.
+//   - executeInTab: 2-phase define-then-invoke, envelope unwrap, error propagation, and that
+//     all real HELPER_FUNCS serialize+assign without a parse error.
+//   - page.waitFor: service-worker-side polling via evaluateInTab (selector + expression).
+//
+// We load the worker into a vm sandbox with mocked chrome.* APIs, then replace `cdp` with a
+// shim that evaluates the expression in a separate "page world" vm context (simulating CDP
+// Runtime.evaluate returnByValue). No browser, no network, no deps.
+import vm from "node:vm";
+import fs from "node:fs";
+import path from "node:path";
+import { fileURLToPath } from "node:url";
+const __dirname = path.dirname(fileURLToPath(import.meta.url));
+const workerPath = path.resolve(__dirname, "../../extensions/chrome-profile-bridge/browser-extension/service_worker.js");
+const src = fs.readFileSync(workerPath, "utf8");
+let failures = 0;
+let passes = 0;
+function ok(cond, msg) {
+  if (cond) { passes++; }
+  else { failures++; console.error(`  ✗ ${msg}`); }
+}
+async function throwsWith(fn, re, msg) {
+  try { await fn(); ok(false, `${msg} (expected throw)`); }
+  catch (e) { ok(re.test(String(e.message || e)), `${msg} (got: ${e.message})`); }
+}
+// ---- page world: simulates the page's MAIN world for Runtime.evaluate ----
+const pageGlobals = {
+  console, JSON, Date, Math, Promise, Object, Array, String, Number, Boolean,
+  Error, TypeError, SyntaxError, RangeError, BigInt, Symbol, structuredClone,
+  setTimeout, parseInt, parseFloat, isNaN,
+  document: {
+    title: "page title",
+    _present: new Set(),
+    querySelector(sel) { return this._present.has(sel) ? { sel } : null; },
+  },
+};
+pageGlobals.window = pageGlobals;
+pageGlobals.globalThis = pageGlobals;
+const pageWorld = vm.createContext(pageGlobals);
+// Simulate CDP Runtime.evaluate returnByValue serialization.
+function toCdpResult(v) {
+  if (v === undefined) return { result: { type: "undefined" } };
+  if (v === null) return { result: { type: "object", subtype: "null", value: null } };
+  const t = typeof v;
+  if (t === "number" || t === "string" || t === "boolean")
+    return { result: { type: t, value: v } };
+  // object/array: returnByValue deep-clones JSON-able structures
+  return { result: { type: "object", value: JSON.parse(JSON.stringify(v)) } };
+}
+// ---- worker sandbox ----
+const noop = () => {};
+const listener = { addListener: noop, removeListener: noop };
+const sandbox = {
+  console, JSON, Date, Math, Promise, Array, Object, String, Number, Boolean,
+  Error, TypeError, Map, Set, BigInt, Symbol, structuredClone,
+  setTimeout, clearTimeout,
+  setInterval: () => 0,
+  clearInterval: noop,
+  fetch: async () => { throw new Error("no network in unit test"); },
+  navigator: { userAgent: "unit-test" },
+  WebSocket: function () {},
+  chrome: {
+    runtime: { id: "unittestextension", getManifest: () => ({ version: "0.0.0" }), onInstalled: listener, onStartup: listener, lastError: null },
+    alarms: { onAlarm: listener, create: noop, clear: noop, clearAll: noop },
+    action: { onClicked: listener },
+    debugger: { sendCommand: noop, attach: async () => {}, detach: async () => {}, getTargets: (cb) => cb([]) },
+    scripting: { executeScript: async () => [{ result: undefined }] },
+    tabs: { query: async () => [], get: async () => ({}), create: async () => ({}), update: async () => ({}), remove: async () => {} },
+    windows: { update: async () => {} },
+    webNavigation: { onCommitted: listener },
+  },
+};
+sandbox.globalThis = sandbox;
+sandbox.self = sandbox;
+vm.createContext(sandbox);
+vm.runInContext(src, sandbox);
+// ---- override the page-touching primitives with the page-world shim ----
+sandbox.attachDebugger = async () => ({});
+sandbox.bringToFront = async () => {};
+sandbox.getTabByParams = async (p) => ({ id: (p && p.targetId) || 1, windowId: 1 });
+sandbox.cdp = async (_tabId, method, params) => {
+  if (method !== "Runtime.evaluate") return {};
+  try {
+    const value = await vm.runInContext(params.expression, pageWorld);
+    return toCdpResult(value);
+  } catch (e) {
+    return { exceptionDetails: { exception: { className: e.name, description: String(e.stack || e.message) }, text: "Uncaught " + String(e) } };
+  }
+};
+// Phase-2 of executeInTab: run the injected wrapper func against the page world,
+// where Phase-1 (via cdp shim above) already defined window.__piAction + helpers.
+sandbox.chrome.scripting.executeScript = async ({ func, args }) => {
+  const fn = vm.runInContext("(" + func.toString() + ")", pageWorld);
+  const result = await fn(...(args || []));
+  return [{ result }];
+};
+const { evaluateInTab, executeInTab, dispatch } = sandbox;
+async function run() {
+  // ===== evaluateInTab: primitives & objects =====
+  ok((await evaluateInTab({ expression: "2 + 2" })) === 4, "evaluate: arithmetic expression");
+  ok((await evaluateInTab({ expression: "document.title" })) === "page title", "evaluate: expression without return");
+  ok((await evaluateInTab({ expression: "'a' + 'b'" })) === "ab", "evaluate: string concat");
+  const obj = await evaluateInTab({ expression: "({a:1, b:[2,3]})" });
+  ok(obj && obj.a === 1 && obj.b[1] === 3, "evaluate: object literal round-trips");
+  // ===== value markers =====
+  ok((await evaluateInTab({ expression: "void 0" })) === undefined, "evaluate: undefined marker -> undefined");
+  ok((await evaluateInTab({ expression: "10n" })) === "10", "evaluate: bigint marker -> string");
+  ok(/^\[Function:/.test(await evaluateInTab({ expression: "(function foo(){})" })), "evaluate: function marker");
+  ok((await evaluateInTab({ expression: "Promise.resolve(42)" })) === 42, "evaluate: promise is awaited");
+  // DOMRect-like (toJSON + width/height/top) is expanded, not flattened to {}
+  const rect = await evaluateInTab({ expression: "({ x:1,y:2,width:3,height:4,top:2,right:4,bottom:6,left:1, toJSON(){return {}} })" });
+  ok(rect && rect.width === 3 && rect.bottom === 6, "evaluate: DOMRect-like expanded");
+  // ===== statement-form fallback (expression form is a SyntaxError) =====
+  // `let x=...; x` is not a valid expression, so the wrapper must retry as a statement body.
+  ok((await evaluateInTab({ expression: "let x = 5; x" })) === undefined, "evaluate: statement form falls back (no return -> undefined)");
+  ok((await evaluateInTab({ expression: "let y = 7; return y" })) === 7, "evaluate: statement form with explicit return");
+  // ===== error propagation =====
+  await throwsWith(() => evaluateInTab({ expression: "throw new Error('boom')" }), /chrome_evaluate failed[\s\S]*boom/, "evaluate: runtime error propagates");
+  // ===== executeInTab: 2-phase define + invoke =====
+  // Real HELPER_FUNCS get serialized + assigned in Phase 1; a parse error there would throw here.
+  const sum = await executeInTab({ targetId: 1 }, function add(a, b) { return a + b; }, [3, 4]);
+  ok(sum === 7, "executeInTab: action runs with args after helper injection");
+  const asyncResult = await executeInTab({ targetId: 1 }, async function asyncEcho(v) { return v * 2; }, [21]);
+  ok(asyncResult === 42, "executeInTab: async action awaited");
+  await throwsWith(
+    () => executeInTab({ targetId: 1 }, function boom() { throw new Error("action failed"); }, []),
+    /action failed/,
+    "executeInTab: thrown action error propagates via envelope",
+  );
+  // ===== page.waitFor (service-worker-side polling) =====
+  pageGlobals.document._present.add("#ready");
+  const wf = await dispatch("page.waitFor", { targetId: 1, kind: "selector", value: "#ready", timeoutMs: 1000, intervalMs: 20 });
+  ok(wf && typeof wf.elapsedMs === "number", "waitFor: selector present resolves");
+  const wfExpr = await dispatch("page.waitFor", { targetId: 1, kind: "expression", value: "1 === 1", timeoutMs: 1000, intervalMs: 20 });
+  ok(wfExpr && typeof wfExpr.elapsedMs === "number", "waitFor: truthy expression resolves");
+  await throwsWith(
+    () => dispatch("page.waitFor", { targetId: 1, kind: "selector", value: "#never", timeoutMs: 120, intervalMs: 30 }),
+    /Timed out after 120ms/,
+    "waitFor: missing selector times out",
+  );
+  console.log(`\n${passes} passed, ${failures} failed`);
+  if (failures) process.exit(1);
+}
+run().catch((e) => { console.error(e); process.exit(1); });
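
The statement-form tests above rely on an expression-then-statement fallback: the source is first compiled as an expression and, if that is a `SyntaxError`, retried as a statement body. The worker's actual wrapper string is not shown in this part of the diff, so the following is only a rough sketch of the idea. It uses the `Function` constructor in Node for illustration, whereas the worker sends its wrapper through CDP `Runtime.evaluate`. `evaluateWithFallback` is a hypothetical name.

```typescript
// Illustration only: compiles with the Function constructor instead of CDP Runtime.evaluate.
function evaluateWithFallback(source: string): unknown {
  let compiled: Function;
  try {
    // Expression form: wrap the source so its value is returned.
    // The newline keeps a trailing `//` comment from swallowing the closing parenthesis.
    compiled = new Function(`return (${source}\n);`);
  } catch (err) {
    if (!(err instanceof SyntaxError)) throw err;
    // Statement form: run the source as a function body; only an explicit `return` yields a value.
    compiled = new Function(source);
  }
  return compiled();
}
```

Under this scheme `let x = 5; x` yields `undefined` because the statement body has no `return`, while `let y = 7; return y` yields `7`, which matches the two statement-form assertions in the test file.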