@mehmoodqureshi/chrome-mcp 0.1.0

Files changed (51)
  1. package/LICENSE +21 -0
  2. package/README.md +129 -0
  3. package/dist/shared/download.d.ts +15 -0
  4. package/dist/shared/download.js +0 -0
  5. package/dist/shared/protocol.d.ts +114 -0
  6. package/dist/shared/protocol.js +55 -0
  7. package/dist/src/bridge/auth.d.ts +32 -0
  8. package/dist/src/bridge/auth.js +76 -0
  9. package/dist/src/bridge/connection.d.ts +48 -0
  10. package/dist/src/bridge/connection.js +192 -0
  11. package/dist/src/bridge/datadir.d.ts +8 -0
  12. package/dist/src/bridge/datadir.js +22 -0
  13. package/dist/src/bridge/server.d.ts +58 -0
  14. package/dist/src/bridge/server.js +178 -0
  15. package/dist/src/cli.d.ts +11 -0
  16. package/dist/src/cli.js +93 -0
  17. package/dist/src/config.d.ts +42 -0
  18. package/dist/src/config.js +188 -0
  19. package/dist/src/executor/cdp-executor.d.ts +131 -0
  20. package/dist/src/executor/cdp-executor.js +422 -0
  21. package/dist/src/executor/extension-executor.d.ts +102 -0
  22. package/dist/src/executor/extension-executor.js +124 -0
  23. package/dist/src/executor/manager.d.ts +43 -0
  24. package/dist/src/executor/manager.js +94 -0
  25. package/dist/src/executor/select.d.ts +23 -0
  26. package/dist/src/executor/select.js +53 -0
  27. package/dist/src/executor/stub-executor.d.ts +60 -0
  28. package/dist/src/executor/stub-executor.js +118 -0
  29. package/dist/src/executor/types.d.ts +192 -0
  30. package/dist/src/executor/types.js +24 -0
  31. package/dist/src/mcp/envelopes.d.ts +13 -0
  32. package/dist/src/mcp/envelopes.js +30 -0
  33. package/dist/src/mcp/helpers.d.ts +37 -0
  34. package/dist/src/mcp/helpers.js +71 -0
  35. package/dist/src/mcp/markdown-extract.d.ts +9 -0
  36. package/dist/src/mcp/markdown-extract.js +61 -0
  37. package/dist/src/mcp/server.d.ts +18 -0
  38. package/dist/src/mcp/server.js +82 -0
  39. package/dist/src/mcp/tools.d.ts +32 -0
  40. package/dist/src/mcp/tools.js +267 -0
  41. package/dist/src/mcp/validators.d.ts +32 -0
  42. package/dist/src/mcp/validators.js +104 -0
  43. package/dist/src/security/policy.d.ts +48 -0
  44. package/dist/src/security/policy.js +155 -0
  45. package/docs/BLUEPRINT.md +596 -0
  46. package/extension-dist/background.js +567 -0
  47. package/extension-dist/manifest.json +12 -0
  48. package/extension-dist/options.html +32 -0
  49. package/extension-dist/options.js +37 -0
  50. package/package.json +69 -0
  51. package/scripts/postinstall.js +50 -0
@@ -0,0 +1,596 @@
1
+ # Chrome MCP — Build-Ready Blueprint (v1)
2
+
3
+ > An MCP server that lets Claude drive a real Chrome browser. One pluggable
4
+ > `Executor` interface, two backends: an MV3 **extension** (primary, drives the
5
+ > user's real Chrome via `chrome.debugger`/CDP) and a **CDP fallback** (the
6
+ > server launches/attaches Chromium via Playwright). Distributed as an npm
7
+ > package run with `npx` plus a load-unpacked extension.
8
+ >
9
+ > This document is the spec. Every blocker and major from verification has been
10
+ > **designed away**; resolutions are called out inline as **[RESOLVED]**.
11
+
12
+ ---
13
+
14
+ ## 1. Overview & Architecture
15
+
16
+ Three "worlds" cooperate through one process-global `ExecutorManager` and one
17
+ canonical wire protocol. The MCP host (Claude Desktop / Code) speaks JSON-RPC
18
+ over **stdio**; the extension speaks the canonical frame protocol over a
19
+ **localhost WebSocket**; the CDP fallback speaks **Playwright/CDP**.
20
+
21
+ ```
22
+ ┌──────────────────────────────────────────────────────────────┐
23
+ WORLD 1 │ MCP HOST (Claude Desktop / Claude Code) │
24
+ (the model)│ spawns: npx chrome-mcp │
25
+ └───────────────┬──────────────────────────────────────────────┘
26
+ │ JSON-RPC over stdio (stdout = SACRED)
27
+ ┌───────────────▼──────────────────────────────────────────────┐
28
+ WORLD 2 │ chrome-mcp CLI (one Node process) │
29
+ (server) │ │
30
+ │ ┌────────────┐ ┌──────────────────┐ ┌─────────────────┐ │
31
+ │ │ MCP Server │──▶│ tools.ts │──▶│ ExecutorManager │ │
32
+ │ │ (stdio) │ │ dispatchToolCall│ │ (selection + │ │
33
+ │ └────────────┘ │ + policy gate │ │ self-heal) │ │
34
+ │ └──────────────────┘ └───┬─────────┬───┘ │
35
+ │ │ │ │
36
+ │ ┌───────────────▼──┐ ┌──▼────┐│
37
+ │ │ ExtensionExecutor│ │ Cdp- ││
38
+ │ │ (WS proxy) │ │ Exec- ││
39
+ │ └───────┬──────────┘ │ utor ││
40
+ │ ┌─────────────────────┐ │ └──┬────┘│
41
+ │ │ BridgeServer (ws) │◀───────────┘ sendCommand() │ │
42
+ │ │ 127.0.0.1:<port> │ id-correlated frames │ │
43
+ │ │ token gate (only │ │ │
44
+ │ │ real boundary) │ │ │
45
+ │ └──────────┬──────────┘ │ │
46
+ └──────────────│─────────────────────────────────────────│─────┘
47
+ │ ws:// loopback │
48
+ │ (extension dials IN as CLIENT) │ Playwright
49
+ ┌──────────────▼─────────────────────┐ ┌───────────────▼───────────┐
50
+ WORLD 3 │ MV3 EXTENSION (user's REAL Chrome) │ │ Chromium (server-owned │
51
+ (browser) │ background SW: WsClient + Router │ │ OR attached via │
52
+ │ CdpExecutor over chrome.debugger │ │ connectOverCDP) │
53
+ │ → real cookies / real logins │ │ FALLBACK ONLY │
54
+ └────────────────────────────────────┘ └───────────────────────────┘
55
+ ```
56
+
57
+ **Backend selection (every `ensureReady()`):** prefer a *responsive*
58
+ authenticated extension; otherwise, if `cdpFallback` is enabled, lazily
59
+ launch/attach Chromium. Selection is re-evaluated per call so a late extension
60
+ takes over from CDP and vice-versa on disconnect.
61
+
62
+ **Decisive global calls:**
63
+ - Single process-global `Executor` pointer for v1 (not a pool). Multi-session is out of scope.
64
+ - **The 256-bit per-boot token is the ONLY security boundary.** Origin checks and the loopback bind are defense-in-depth against *browser-page* attackers, not native processes. [RESOLVED — security blockers 2 & 3]
65
+ - **One canonical `protocol.ts`** imported by both server and extension. Four conflicting wire/port/token designs are collapsed into this single contract. [RESOLVED — integration blocker 1]
66
+ - **Helpers (`extract_links`, `read_as_markdown`, `fill_form`) are composed server-side** from primitives. Only `download_file` is an executor/wire method. [RESOLVED — integration blocker 2]
67
+ - **Default-deny domain policy is ON by default and gates reads too**, enforced inside the executor dispatch (both backends) AND the extension router. [RESOLVED — security major 4 & eval]
68
+
69
+ ---
70
+
71
+ ## 2. Repository Layout
72
+
73
+ Single sibling repo `chrome-mcp/` (NOT a workspace). Two build roots in one git
74
+ repo so the wire protocol version-locks between server and extension. The shared
75
+ `protocol.ts` lives at the top so both builds import the *same file*.
76
+
77
+ ```
78
+ chrome-mcp/
79
+ ├── package.json # bin: chrome-mcp -> dist/cli.js; deps sdk, ws, playwright
80
+ ├── tsconfig.json
81
+ ├── README.md
82
+ ├── .gitignore # dist/, extension-dist/, test-targets.json, *.handshake
83
+ ├── scripts/
84
+ │ └── postinstall.js # playwright install chromium (CDP fallback only); skip-guarded
85
+
86
+ ├── shared/
87
+ │ └── protocol.ts # ⭐ SINGLE SOURCE OF TRUTH for the wire contract
88
+ │ # imported by BOTH src/ and extension/ builds
89
+
90
+ ├── src/ # → dist/ via tsc
91
+ │ ├── cli.ts # npx entry: parseArgs, help/version, boot bridge+stdio, shutdown
92
+ │ ├── config.ts # CliConfig + parseArgs (single port/token/policy source)
93
+ │ ├── mcp/
94
+ │ │ ├── server.ts # createServer factory + start/stop/isRunning + logErr (clean-stdout)
95
+ │ │ ├── tools.ts # TOOL_DEFINITIONS + TOOL_HANDLERS + dispatchToolCall + drift-check
96
+ │ │ ├── validators.ts # asArgs/requireString/optional*/requireTarget guards
97
+ │ │ ├── envelopes.ts # jsonResult / imageResult / textResult
98
+ │ │ ├── helpers.ts # extract_links / read_as_markdown / fill_form composed server-side
99
+ │ │ └── markdown-extract.ts # dependency-free HTML→md reducer (injected string)
100
+ │ ├── executor/
101
+ │ │ ├── types.ts # ⭐ Executor interface + arg/result types + ExecutorError
102
+ │ │ ├── manager.ts # ExecutorManager singleton + withReadyExecutor + selection/self-heal
103
+ │ │ ├── extension-executor.ts# proxies primitives → BridgeServer.sendCommand
104
+ │ │ ├── cdp-executor.ts # Playwright connectOverCDP / launchPersistentContext (BrowserManager port)
105
+ │ │ └── page-scripts.ts # shared injected-script sources (getText/extractLinks/waitFor poll)
106
+ │ ├── bridge/
107
+ │ │ ├── server.ts # BridgeServer: ws on 127.0.0.1, token gate, id-correlation, heartbeat
108
+ │ │ ├── connection.ts # ExtensionConnection: pending-map, timeouts, reject-all-on-close
109
+ │ │ ├── auth.ts # token gen, atomic 0600 handshake.json, hashed constant-time compare
110
+ │ │ └── datadir.ts # Electron-free CHROME_MCP_DATA || ~/.chrome-mcp
111
+ │ └── security/
112
+ │ └── policy.ts # Policy type, loadPolicy, assertUrlAllowed (default-deny, gates reads)
113
+
114
+ ├── extension/ # → extension-dist/ via esbuild (load-unpacked root)
115
+ │ ├── manifest.json # MV3; keyed for stable id; minimum_chrome_version 123
116
+ │ ├── icons/{16,48,128}.png
117
+ │ ├── scripts/build-ext.mjs # esbuild background.ts (esm) + options.ts (iife) + copy manifest/html/icons
118
+ │ ├── package.json # devDeps only: typescript, esbuild, @types/chrome
119
+ │ └── src/
120
+ │ ├── sw/
121
+ │ │ ├── background.ts # SW entry: top-level listeners, ensureConnected, keepalive loop
122
+ │ │ ├── ws-client.ts # dial ws://, hello-token handshake, reconnect, ping/pong
123
+ │ │ ├── router.ts # CommandRouter: never-throw firewall, drift-assert, policy gate
124
+ │ │ ├── cdp-executor.ts # chrome.debugger Executor: ensureAttached self-heal, poll-based lifecycle
125
+ │ │ └── tabs.ts # tabs_list/select/new/close + attachedTabId persistence + validation
126
+ │ ├── options/
127
+ │ │ ├── options.html
128
+ │ │ └── options.ts # native-host pairing status + manual fallback paste; live status
129
+ │ └── nm/
130
+ │ └── trampoline.ts # (optional, v1.1) native-messaging host that reads handshake.json
131
+
132
+ ├── test/
133
+ │ ├── stub-executor.ts # in-memory Executor (canned values + forced throws)
134
+ │ ├── bridge.test.ts # node --test: token gate, id-correlation, displacement, deadline→isError
135
+ │ ├── tools.test.ts # dispatch drift-check, requireTarget, isError envelope, image block
136
+ │ ├── policy.test.ts # default-deny, read-gating, allowlist glob, token-never-logged assertion
137
+ │ ├── extension-smoke.ts # Playwright --load-extension, programmatic pair, navigate→getText
138
+ │ └── hitl/ # human-gated e2e (cloned from linkedin-mcp/test/hitl)
139
+ │ ├── index.ts runner.ts reporter.ts prompts.ts types.ts scenarios.ts
140
+
141
+ ├── test-targets.example.json # committed; test-targets.json is gitignored
142
+ └── policy.example.json # committed example allowlist
143
+ ```
144
+
145
+ ---
146
+
147
+ ## 3. The Executor Interface
148
+
149
+ The single backend-agnostic contract. Plain JSON-serializable in/out — **no
150
+ Playwright `Page`, no CDP session, no DOM handle crosses this boundary.** Both
151
+ `ExtensionExecutor` and `CdpExecutor` implement it. Helpers are composed in
152
+ `mcp/helpers.ts` from these primitives; only `download` is privileged.
153
+
154
+ ```ts
155
+ // src/executor/types.ts (and re-exported shapes shared with shared/protocol.ts)
156
+
157
+ export type BackendKind = 'extension' | 'cdp';
158
+ export type WaitUntil = 'load' | 'domcontentloaded' | 'networkidle';
159
+ export type KeyModifier = 'Alt' | 'Control' | 'Meta' | 'Shift';
160
+
161
+ /** Selector XOR ref. requireTarget() enforces exactly-one-of. Refs are minted
162
+ * by getText/extractLinks/waitFor/eval results and are PAGE-scoped: invalidated
163
+ * on that tab's navigation. ref format: `el_<tabShort>_<backendNodeId>`. */
164
+ export type Target = { selector: string; ref?: never } | { ref: string; selector?: never };
165
+
166
+ /** tabId is ALWAYS prefixed `<backend>:<sessionId>:<rawId>` so a handle from one
167
+ * backend/session can never mis-route after a fallback switch or reconnect. */
168
+ export type TabId = string;
169
+
170
+ export interface TabInfo { tabId: TabId; url: string; title: string; active: boolean; index: number; }
171
+ export interface NavResult { url: string; title: string; httpStatus?: number; }
172
+ export interface EvalResult { ok: boolean; value?: unknown; type?: string; error?: string; } // value truncated >256KB
173
+ export interface WaitResult { matched: boolean; ref?: string; waitedMs: number; }
174
+ export interface ActionOk { ok: true; }
175
+ export interface ScreenshotResult {
176
+ dataBase64: string; mimeType: 'image/png';
177
+ width: number; height: number; truncated: boolean; fullHeight?: number; // fullPage cap metadata
178
+ }
179
+ export interface DownloadResult { path: string; backend: BackendKind; bytes: number; mimeType?: string; suggestedName?: string; }
180
+
181
+ export interface ExecutorStatus {
182
+ ready: boolean;
183
+ backend: BackendKind | null;
184
+ activeTabId: TabId | null;
185
+ detail?: string; // WHY unavailable (port in use, no extension, policy, etc.)
186
+ extensionConnected: boolean;
187
+ cdpAttached: boolean;
188
+ }
189
+
190
+ export interface Executor {
191
+ readonly backend: BackendKind;
192
+ status(): ExecutorStatus;
193
+ /** idempotent lazy connect/attach + self-heal; single-flight guarded. */
194
+ ensureReady(): Promise<void>;
195
+ /** lightweight responsiveness probe (short deadline) — used by withReadyExecutor
196
+ * to detect a dead-but-not-yet-reconnected MV3 worker and fall through. */
197
+ ping(deadlineMs?: number): Promise<boolean>;
198
+ dispose(): Promise<void>; // close ONLY if we own the browser; never the user's Chrome
199
+
200
+ // --- tabs ---
201
+ tabsList(): Promise<TabInfo[]>;
202
+ tabSelect(tabId: TabId): Promise<TabInfo>;
203
+ tabNew(url?: string): Promise<TabInfo>;
204
+ tabClose(tabId: TabId): Promise<{ closed: true; tabId: TabId }>;
205
+
206
+ // --- navigation (active tab unless tabId given) ---
207
+ navigate(args: { url: string; tabId?: TabId; waitUntil?: WaitUntil }): Promise<NavResult>;
208
+ back(tabId?: TabId): Promise<NavResult>;
209
+ forward(tabId?: TabId): Promise<NavResult>;
210
+ reload(args?: { tabId?: TabId; waitUntil?: WaitUntil }): Promise<NavResult>;
211
+
212
+ // --- interaction (Target = {selector} XOR {ref}) ---
213
+ click(t: Target, opts?: { tabId?: TabId; button?: 'left'|'right'|'middle'; clickCount?: number }): Promise<ActionOk>;
214
+ type(t: Target, text: string, opts?: { tabId?: TabId; clear?: boolean; pressEnter?: boolean; keyEvents?: boolean }): Promise<ActionOk>;
215
+ fill(t: Target, value: string, opts?: { tabId?: TabId }): Promise<ActionOk>; // value-set + input/change (used by fill_form)
216
+ press(key: string, opts?: { tabId?: TabId; modifiers?: KeyModifier[] }): Promise<ActionOk>;
217
+ hover(t: Target, opts?: { tabId?: TabId }): Promise<ActionOk>;
218
+ scroll(opts: { tabId?: TabId; x?: number; y?: number; deltaX?: number; deltaY?: number; target?: Target }): Promise<ActionOk>;
219
+
220
+ // --- read (policy-gated by current tab URL) ---
221
+ getText(t?: Target, opts?: { tabId?: TabId }): Promise<{ text: string; ref?: string }>;
222
+ getHtml(t?: Target, opts?: { tabId?: TabId; outer?: boolean }): Promise<{ html: string }>;
223
+ screenshot(opts?: { tabId?: TabId; fullPage?: boolean; target?: Target }): Promise<ScreenshotResult>;
224
+ eval(expression: string, opts?: { tabId?: TabId; awaitPromise?: boolean }): Promise<EvalResult>;
225
+ waitFor(opts: { tabId?: TabId; selector?: string; textContains?: string; gone?: boolean; timeoutMs?: number }): Promise<WaitResult>;
226
+
227
+ // --- privileged, executor-owned (NOT composable) ---
228
+ download(args: { url?: string; target?: Target; tabId?: TabId; suggestedName?: string }): Promise<DownloadResult>;
229
+ }
230
+
231
+ export class ExecutorError extends Error {
232
+ constructor(
233
+ public code:
234
+ | 'NO_BACKEND' | 'EXTENSION_DISCONNECTED' | 'TIMEOUT'
235
+ | 'TAB_NOT_FOUND' | 'STALE_TAB' | 'SELECTOR_NOT_FOUND' | 'REF_EXPIRED'
236
+ | 'EVAL_FAILED' | 'LAUNCH_FAILED' | 'DETACHED' | 'TARGET_GONE'
237
+ | 'POLICY_DENIED' | 'DEVTOOLS_OPEN' | 'DOWNLOAD_FAILED' | 'BACKPRESSURE',
238
+ message: string,
239
+ ) { super(message); this.name = 'ExecutorError'; }
240
+ }
241
+ ```
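The `TabId` comment above fixes the handle format as `<backend>:<sessionId>:<rawId>`. A minimal sketch of how such handles could be minted and checked follows. The helper names `formatTabId`, `parseTabId`, and `isCurrentHandle` are hypothetical and do not appear in the spec; the sketch also assumes that neither the backend nor the session id contains a colon.

```typescript
// Hypothetical helpers illustrating the `<backend>:<sessionId>:<rawId>` format.
type BackendKind = 'extension' | 'cdp';

interface ParsedTabId { backend: BackendKind; sessionId: string; rawId: string; }

function formatTabId(backend: BackendKind, sessionId: string, rawId: string | number): string {
  return `${backend}:${sessionId}:${rawId}`;
}

/** Returns null for anything that is not a well-formed three-part handle. */
function parseTabId(tabId: string): ParsedTabId | null {
  const first = tabId.indexOf(':');
  const second = tabId.indexOf(':', first + 1);
  // Reject missing separators and empty backend / sessionId / rawId parts.
  if (first <= 0 || second <= first + 1 || second === tabId.length - 1) return null;
  const backend = tabId.slice(0, first);
  if (backend === 'extension' || backend === 'cdp') {
    return { backend, sessionId: tabId.slice(first + 1, second), rawId: tabId.slice(second + 1) };
  }
  return null;
}

/** True only when the handle was minted by the given backend and session. */
function isCurrentHandle(tabId: string, backend: BackendKind, sessionId: string): boolean {
  const parsed = parseTabId(tabId);
  return parsed !== null && parsed.backend === backend && parsed.sessionId === sessionId;
}
```

A handle that fails `isCurrentHandle` after a fallback switch or reconnect would presumably map to the `STALE_TAB` error code rather than being routed to the wrong backend.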
242
+
243
+ **`ExecutorManager`** (the `withReadyDriver` analog, `linkedin-mcp/src/driver/linkedin.ts:208`
244
+ / `linkedin-mcp/src/mcp/tools.ts:638`):
245
+
246
+ ```ts
247
+ export class ExecutorManager {
248
+ constructor(deps: { bridge: BridgeServer; policy: Policy; cdpFallback: boolean;
249
+ cdpEndpoint?: string; headless: boolean; userDataDir: string });
250
+ static getInstance(deps?: ConstructorParameters<typeof ExecutorManager>[0]): ExecutorManager;
251
+
252
+ /** Re-selects backend (extension-if-responsive else CDP), ensures operational,
253
+ * single-flight guarded by this.readying. PINGs the extension with a short
254
+ * deadline; a dead MV3 worker falls through to CDP instead of eating a 30s
255
+ * timeout. [RESOLVED — mv3 keepalive blocker] */
256
+ ensureReady(): Promise<Executor>;
257
+ peek(): { backend: BackendKind | null; ready: boolean; detail?: string };
258
+ dispose(): Promise<void>;
259
+ }
260
+ export async function withReadyExecutor(): Promise<Executor> {
261
+ return ExecutorManager.getInstance().ensureReady();
262
+ }
263
+ ```
264
+
265
+ Selection logic, decisively:
266
+ 1. If `bridge.hasActiveExtension()` **and** `extensionExecutor.ping(800ms)` resolves true → return `ExtensionExecutor`.
267
+ 2. Else if `cdpFallback` → single-flight `cdpExecutor.ensureReady()` (launch or `connectOverCDP`) → return it.
268
+ 3. Else → `throw new ExecutorError('NO_BACKEND', 'No Chrome available: open the extension and pair it, or enable CDP fallback.')`.
269
+
270
+ Mid-call disconnects are **not** migrated: the in-flight call rejects (→ `isError`),
271
+ the *next* call re-selects. Tool descriptions tell the model to retry on disconnect.
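The three selection steps can be sketched as a single function. This is an illustration under stated assumptions, not the package's implementation: `MiniExecutor`, `NoBackendError`, and `selectBackend` are hypothetical names, single-flight guarding is omitted, and the executor's `ping` is assumed to enforce its own deadline.

```typescript
type BackendKind = 'extension' | 'cdp';

// Reduced, hypothetical view of the Executor interface: only what selection needs.
interface MiniExecutor {
  backend: BackendKind;
  ping(deadlineMs: number): Promise<boolean>;
  ensureReady(): Promise<void>;
}

class NoBackendError extends Error {
  code = 'NO_BACKEND' as const;
}

async function selectBackend(deps: {
  hasActiveExtension: () => boolean;
  extension: MiniExecutor;
  cdp: MiniExecutor;
  cdpFallback: boolean;
}): Promise<MiniExecutor> {
  // Step 1: prefer a connected AND responsive extension (short 800 ms probe).
  if (deps.hasActiveExtension()) {
    const alive = await deps.extension.ping(800).catch(() => false);
    if (alive) return deps.extension;
  }
  // Step 2: otherwise fall through to CDP when the fallback is enabled.
  if (deps.cdpFallback) {
    await deps.cdp.ensureReady();
    return deps.cdp;
  }
  // Step 3: nothing usable.
  throw new NoBackendError(
    'No Chrome available: open the extension and pair it, or enable CDP fallback.',
  );
}
```

Because the function runs on every call, an extension that connects late wins the next selection without any explicit migration step.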
272
+
273
+ ---
274
+
275
+ ## 4. Wire Protocol
276
+
277
+ One canonical `shared/protocol.ts`, `v:1`, **snake_case** method names matching
278
+ the MCP primitives 1:1, JSON text frames (one object per WS message),
279
+ request/response correlated by `id`. The server is the WS **SERVER**; the
280
+ extension is the single privileged **CLIENT**. [RESOLVED — integration blocker 1]
281
+
282
+ ```ts
283
+ // shared/protocol.ts — imported VERBATIM by src/ and extension/
284
+ export const PROTOCOL_VERSION = 1 as const;
285
+
286
+ /** Methods on the wire = MCP primitives 1:1. Helpers (extract_links,
287
+ * read_as_markdown, fill_form) are NOT here — composed server-side.
288
+ * Only download_file is a wire method beyond the primitives. [RESOLVED] */
289
+ export type WireMethod =
290
+ | 'tabs_list' | 'tab_select' | 'tab_new' | 'tab_close'
291
+ | 'navigate' | 'back' | 'forward' | 'reload'
292
+ | 'click' | 'type' | 'press' | 'hover' | 'scroll'
293
+ | 'screenshot' | 'get_text' | 'get_html' | 'eval' | 'wait_for'
294
+ | 'download_file' | 'ping_probe';
295
+
296
+ export interface BaseFrame { type: string; v: 1; }
297
+
298
+ // ---- handshake ----
299
+ export interface HelloFrame extends BaseFrame { type: 'hello'; token: string; ext: { id: string; version: string; chrome: string }; }
300
+ export interface WelcomeFrame extends BaseFrame { type: 'welcome'; serverVersion: string; sessionId: string; heartbeatMs: number; }
301
+ export interface UnauthFrame extends BaseFrame { type: 'unauthorized'; reason: 'bad_token' | 'bad_version' | 'timeout'; }
302
+
303
+ // ---- command / result ----
304
+ export interface CommandFrame extends BaseFrame {
305
+ type: 'command'; id: string; method: WireMethod;
306
+ params: Record<string, unknown>; tabId?: string; timeoutMs: number;
307
+ }
308
+ export interface ResultFrame extends BaseFrame {
309
+ type: 'result'; id: string; ok: true; data: unknown; // screenshot: { dataBase64, mimeType, width, height, truncated }
310
+ }
311
+ export interface ErrorFrame extends BaseFrame {
312
+ type: 'error'; id: string; ok: false; error: { code: ExecutorErrorCode; message: string; data?: Record<string, unknown> };
313
+ }
314
+
315
+ // ---- unsolicited + heartbeat ----
316
+ export interface EventFrame extends BaseFrame {
317
+ type: 'event'; event: 'tab_created'|'tab_removed'|'tab_updated'|'detached'|'target_gone'; data: Record<string, unknown>;
318
+ }
319
+ export interface PingFrame extends BaseFrame { type: 'ping'; ts: number; }
320
+ export interface PongFrame extends BaseFrame { type: 'pong'; ts: number; }
321
+
322
+ export type ServerFrame = CommandFrame | WelcomeFrame | UnauthFrame | PingFrame;
323
+ export type ExtensionFrame = HelloFrame | ResultFrame | ErrorFrame | EventFrame | PongFrame;
324
+ export type Frame = ServerFrame | ExtensionFrame;
325
+
326
+ export type ExecutorErrorCode =
327
+ | 'NO_TARGET' | 'TARGET_GONE' | 'DETACHED' | 'DEVTOOLS_OPEN'
328
+ | 'SELECTOR_NOT_FOUND' | 'REF_EXPIRED' | 'EVAL_THREW'
329
+ | 'TIMEOUT' | 'BAD_ARGS' | 'CDP_ERROR' | 'POLICY_DENIED'
330
+ | 'DOWNLOAD_FAILED' | 'UNKNOWN_METHOD';
331
+ ```
332
+
333
+ **Auth handshake (the canonical, ONLY model):** [RESOLVED — security blocker 1, minor 7]
334
+ 1. Server, at boot, generates a **fresh 256-bit token every boot**, never persisted/reused: `crypto.randomBytes(32).toString('base64url')`. It writes `{ v, port, token, pid, ts, expectedExtensionId? }` **atomically** (tmp + `rename`) to `$CHROME_MCP_DATA/handshake.json` with mode **0600**, verifies the mode after write, and **fails closed** if it cannot. The token is **never** written to stdout or stderr or any log. A test asserts this.
335
+ 2. The extension acquires `{port, token}` from the handshake file via a **native-messaging trampoline** (`extension/src/nm/`) — the only component that can read a 0600 file the model/attacker cannot. v1 ships a **manual fallback**: the user runs `npx chrome-mcp --print-pairing` which prints a **redacted confirmation + the file path** (`handshake.json written to …; open the extension and click Pair`) and the trampoline reads it. **No token on stdout/stderr, no `/connect` HTTP endpoint.** [RESOLVED — security blocker 3]
336
+ 3. Extension dials `ws://127.0.0.1:<port>` (port read from handshake via trampoline; see §6 for re-read-on-reconnect) and sends `hello` within 5000ms.
337
+ 4. Server compares tokens by **hashing both sides to SHA-256 and `timingSafeEqual`-ing the digests** (no length precondition, no length leak). Match → `welcome`, flip to ACTIVE; mismatch / `v!==1` / timeout → send `unauthorized` then close **4401**. No `command`/`result`/`event` is processed before `welcome`.
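Steps 1 and 4 above rely only on Node built-ins, so they can be sketched directly. The function names (`generateToken`, `tokensMatch`, `writeHandshake`) and the temp-directory demo are assumptions for illustration; the real code lives in `bridge/auth.ts` and may differ. The mode check is skipped on Windows, where POSIX permission bits are not meaningful.

```typescript
import { createHash, randomBytes, timingSafeEqual } from 'node:crypto';
import { chmodSync, mkdtempSync, readFileSync, renameSync, statSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

/** Fresh 256-bit token, base64url-encoded (43 characters, no padding). */
function generateToken(): string {
  return randomBytes(32).toString('base64url');
}

/** Hash both sides so timingSafeEqual always sees equal-length buffers. */
function tokensMatch(presented: string, expected: string): boolean {
  const a = createHash('sha256').update(presented, 'utf8').digest();
  const b = createHash('sha256').update(expected, 'utf8').digest();
  return timingSafeEqual(a, b);
}

interface Handshake { v: 1; port: number; token: string; pid: number; ts: number; }

/** Write to a temp file, rename into place, then verify the mode (fail closed). */
function writeHandshake(dir: string, body: Handshake): string {
  const finalPath = join(dir, 'handshake.json');
  const tmpPath = `${finalPath}.${process.pid}.tmp`;
  writeFileSync(tmpPath, JSON.stringify(body), { mode: 0o600 });
  chmodSync(tmpPath, 0o600); // covers the case where the temp file already existed
  renameSync(tmpPath, finalPath);
  const mode = statSync(finalPath).mode & 0o777;
  if (process.platform !== 'win32' && mode !== 0o600) {
    throw new Error('handshake.json is not mode 0600; refusing to continue');
  }
  return finalPath;
}

// Demo in a throwaway directory; note that the token itself is never logged.
const demoDir = mkdtempSync(join(tmpdir(), 'chrome-mcp-demo-'));
const demoToken = generateToken();
const handshakePath = writeHandshake(demoDir, {
  v: 1, port: 38017, token: demoToken, pid: process.pid, ts: Date.now(),
});
const stored = JSON.parse(readFileSync(handshakePath, 'utf8')) as Handshake;
console.error(`handshake.json written to ${handshakePath}`);
```

Hashing before comparing is what removes the length precondition: `timingSafeEqual` throws on buffers of different length, and SHA-256 digests are always 32 bytes.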
338
+
339
+ **id correlation & timeouts:**
340
+ - `id` is a server-generated ULID per `CommandFrame`. `ExtensionConnection` holds `Map<id, { resolve, reject, timer, method, startedAt }>`.
341
+ - Per-request timeout is **method-aware**: default **30s**; `screenshot`, `wait_for`, `navigate`, `download_file` get **60s**. On timeout: reject `ExecutorError('TIMEOUT')`, delete the entry, **do NOT close the socket**.
342
+ - On socket close/error: reject **every** pending entry with `EXTENSION_DISCONNECTED`, clear the map.
343
+ - Backpressure: before send, if `ws.bufferedAmount > 8MB` reject `BACKPRESSURE` (screenshots are large; never queue unboundedly).
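The correlation rules above amount to a small pending-request table. The sketch below is a simplified stand-in for `ExtensionConnection`: the names `PendingMap`, `CodedError`, and `timeoutFor` are hypothetical, ids are passed in rather than generated as ULIDs, and the actual socket send and backpressure check are left out.

```typescript
const SLOW_METHODS = new Set(['screenshot', 'wait_for', 'navigate', 'download_file']);

/** Method-aware deadline: 60 s for the slow set, 30 s for everything else. */
function timeoutFor(method: string): number {
  return SLOW_METHODS.has(method) ? 60_000 : 30_000;
}

class CodedError extends Error {
  readonly code: string;
  constructor(code: string, message: string) {
    super(message);
    this.code = code;
  }
}

interface Pending {
  resolve: (value: unknown) => void;
  reject: (err: Error) => void;
  timer: ReturnType<typeof setTimeout>;
  method: string;
  startedAt: number;
}

class PendingMap {
  private readonly pending = new Map<string, Pending>();

  get size(): number {
    return this.pending.size;
  }

  /** Track a sent command; the promise settles on result, timeout, or close. */
  register(id: string, method: string): Promise<unknown> {
    return new Promise((resolve, reject) => {
      const timer = setTimeout(() => {
        // Timeout removes only this entry; the socket stays open.
        this.pending.delete(id);
        reject(new CodedError('TIMEOUT', `${method} timed out`));
      }, timeoutFor(method));
      this.pending.set(id, { resolve, reject, timer, method, startedAt: Date.now() });
    });
  }

  /** Called for an incoming result frame; unknown ids are ignored. */
  settle(id: string, data: unknown): boolean {
    const entry = this.pending.get(id);
    if (!entry) return false;
    clearTimeout(entry.timer);
    this.pending.delete(id);
    entry.resolve(data);
    return true;
  }

  /** Called on socket close/error: fail every in-flight command at once. */
  rejectAll(code: string, message: string): void {
    for (const entry of this.pending.values()) {
      clearTimeout(entry.timer);
      entry.reject(new CodedError(code, message));
    }
    this.pending.clear();
  }
}
```

Clearing the timers in `rejectAll` matters in practice: a leftover 60 s timer would otherwise keep the Node process alive after the connection is gone.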
344
+
345
+ **Heartbeat:** app-level `ping`/`pong` every 15s, layered over `ws` native ping/pong. Two missed pongs (>30s silence) → terminate connection → next `ensureReady` falls to CDP, with a loud stderr note.
346
+
347
+ **Single-active-connection (security-aware):** a second valid-token dial **supersedes** the first (close 4000). This is logged **loudly to stderr AND surfaced via `chrome_status`** as a *security-relevant displacement event*, not a UX nicety. The active connection is bound to the `ext.id` from the first `hello`; a displacement by a *different* id is refused without an explicit user re-pair. [RESOLVED — security minor 9]
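The displacement rule reduces to a pure decision over two extension ids. A minimal sketch, assuming the token has already been validated before this point; `DialDecision` and `decideDial` are hypothetical names, and the logging and `chrome_status` reporting the paragraph requires are left to the caller.

```typescript
type DialDecision =
  | { action: 'accept' }
  | { action: 'supersede'; closeCode: 4000 }
  | { action: 'refuse'; reason: string };

/**
 * activeExtId is the ext.id bound by the first `hello`, or null when no
 * connection is active. incomingExtId comes from the new, token-valid `hello`.
 */
function decideDial(activeExtId: string | null, incomingExtId: string): DialDecision {
  if (activeExtId === null) return { action: 'accept' };
  if (activeExtId === incomingExtId) {
    // Same extension reconnecting: close the old socket with 4000, keep the new one.
    return { action: 'supersede', closeCode: 4000 };
  }
  // A different id never displaces the bound one without an explicit re-pair.
  return { action: 'refuse', reason: 'different extension id; explicit re-pair required' };
}
```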
348
+
349
+ ---
350
+
351
+ ## 5. The Complete MCP Tool Surface
352
+
353
+ 26 tools. `readOnly` is metadata (not a JSON-Schema field) consumed by the host
354
+ and by **safe-mode** (shipped in v1, default ON). Every handler:
355
+ `withReadyExecutor()` → validate args (`requireTarget` for selector|ref) →
356
+ **`policy.assertUrlAllowed(currentTabUrl, method)`** → call executor/helper →
357
+ `jsonResult`/`imageResult`/`textResult`. `dispatchToolCall` is the single
358
+ never-throw firewall converting any thrown `Error` to `{isError:true}`.
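The never-throw firewall can be illustrated with a short sketch. The result shape follows the MCP tool-result convention (`content` blocks plus an optional `isError` flag); the handler registry as a `Map` and the helper name `errorResult` are assumptions made for this example, and validation, policy checks, and executor selection are omitted.

```typescript
interface ToolResult {
  content: Array<{ type: 'text'; text: string }>;
  isError?: boolean;
}
type ToolHandler = (args: unknown) => Promise<ToolResult>;

function jsonResult(value: unknown): ToolResult {
  return { content: [{ type: 'text', text: JSON.stringify(value) }] };
}

/** Convert anything thrown into an isError envelope instead of propagating it. */
function errorResult(err: unknown): ToolResult {
  const message = err instanceof Error ? err.message : String(err);
  return { content: [{ type: 'text', text: message }], isError: true };
}

/** Single choke point: every failure path ends in a ToolResult, never a throw. */
async function dispatchToolCall(
  handlers: Map<string, ToolHandler>,
  name: string,
  args: unknown,
): Promise<ToolResult> {
  try {
    const handler = handlers.get(name);
    if (!handler) throw new Error(`Unknown tool: ${name}`);
    return await handler(args);
  } catch (err) {
    return errorResult(err);
  }
}
```

The `await` inside the `try` is deliberate: returning the handler's promise without awaiting it would let rejections escape the `catch`.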
359
+
360
+ | Tool | Input (summary) | Returns | R/O? | Impl home |
361
+ |---|---|---|---|---|
362
+ | `tabs_list` | `{}` | `TabInfo[]` | read | executor primitive |
363
+ | `tab_select` | `{tabId}` | `TabInfo` | mutate | executor primitive |
364
+ | `tab_new` | `{url?}` | `TabInfo` | mutate | executor primitive |
365
+ | `tab_close` | `{tabId}` | `{closed,tabId}` | mutate | executor primitive |
366
+ | `navigate` | `{url, tabId?, waitUntil?}` | `NavResult` | mutate | executor primitive |
367
+ | `back` | `{tabId?}` | `NavResult` | mutate | executor primitive |
368
+ | `forward` | `{tabId?}` | `NavResult` | mutate | executor primitive |
369
+ | `reload` | `{tabId?, waitUntil?}` | `NavResult` | mutate | executor primitive |
370
+ | `click` | `{selector?\|ref?, tabId?, button?, clickCount?}` | `ActionOk` | mutate | executor primitive |
371
+ | `type` | `{selector?\|ref?, text, tabId?, clear?, pressEnter?, keyEvents?}` | `ActionOk` | mutate | executor primitive |
372
+ | `press` | `{key, modifiers?, tabId?}` | `ActionOk` | mutate | executor primitive |
373
+ | `hover` | `{selector?\|ref?, tabId?}` | `ActionOk` | mutate | executor primitive |
374
+ | `scroll` | `{x?,y?,deltaX?,deltaY?,selector?\|ref?, tabId?}` | `ActionOk` | mutate | executor primitive |
375
+ | `screenshot` | `{fullPage?, selector?\|ref?, tabId?}` | **image block** + dims/truncated | read* | executor primitive |
376
+ | `get_text` | `{selector?\|ref?, tabId?}` | `{text, ref?}` | read | executor primitive |
377
+ | `get_html` | `{selector?\|ref?, outer?, tabId?}` | `{html}` | read | executor primitive |
378
+ | `eval` | `{expression, awaitPromise?, tabId?}` | `EvalResult` | **mutate** | executor primitive |
379
+ | `wait_for` | `{selector?, textContains?, gone?, timeoutMs?, tabId?}` | `WaitResult` | read | executor primitive |
380
+ | `extract_links` | `{selector?, sameOriginOnly?, include?, exclude?, tabId?}` | `{links:[{href,text,ref}]}` | read | **server helper** (one `ex.eval` IIFE) |
381
+ | `read_as_markdown` | `{selector?, tabId?}` | raw markdown text | read | **server helper** (md-reducer IIFE) |
382
+ | `fill_form` | `{fields:{[selector]:string\|bool}, submitSelector?, tabId?}` | `{filled, submitted}` | mutate | **server helper** (seq `ex.fill`/`ex.click`) |
383
+ | `download_file` | `{url?\|(selector?\|ref?), suggestedName?, tabId?}` | `DownloadResult` | mutate | **executor `download`** (privileged) |
384
+ | `chrome_status` | `{}` | `ExecutorStatus` + displacement/heartbeat flags | read | manager (defensive cached fallback) |
385
+
386
+ Notes:
387
+ - **`screenshot` is `readOnly:true` but auto-scrolls the target into view** before capture — documented benign side effect. Returns a real `{type:'image',data,mimeType}` block (the one envelope extension over the LinkedIn repo). `read*` = read with a benign side effect.
388
+ - **`eval` is `readOnly:false`** (arbitrary JS). Safe-mode disables `eval` and the entire mutating set unless `--unsafe-enable-eval` / `--enable-mutations`. [RESOLVED — security major eval]
389
+ - **`eval`/all reads/all navigations call `policy.assertUrlAllowed(tab.currentUrl, method)`** before dispatch — reads are gated because reads are the exfil payload. [RESOLVED — security major 4]
390
+ - Validators add `optionalBoolean`, `optionalNumber(float,bounds)`, `optionalStringArray`, and `requireTarget` (exactly-one-of selector|ref) to the lifted LinkedIn set.
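The exactly-one-of rule for `selector` and `ref` is easy to get subtly wrong, so a sketch may help. This is one plausible shape for `requireTarget`, not the package's actual validator; in particular, treating empty strings as absent and rejecting non-string values up front are assumptions.

```typescript
type Target = { selector: string; ref?: never } | { ref: string; selector?: never };

function requireTarget(args: Record<string, unknown>): Target {
  const { selector, ref } = args;
  // Reject wrong types early so a bad `selector` is not silently ignored.
  for (const [key, value] of [['selector', selector], ['ref', ref]] as const) {
    if (value !== undefined && typeof value !== 'string') {
      throw new Error(`"${key}" must be a string.`);
    }
  }
  const hasSelector = typeof selector === 'string' && selector.length > 0;
  const hasRef = typeof ref === 'string' && ref.length > 0;
  // Equal flags means both present or both absent: either way, not exactly one.
  if (hasSelector === hasRef) {
    throw new Error('Provide exactly one of "selector" or "ref".');
  }
  return hasSelector ? { selector: selector as string } : { ref: ref as string };
}
```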
391
+
392
+ ---
393
+
394
+ ## 6. Chrome Extension
395
+
396
+ ### 6.1 `manifest.json` (full)
397
+
398
+ ```json
399
+ {
400
+ "manifest_version": 3,
401
+ "name": "Chrome MCP Bridge",
402
+ "version": "1.0.0",
403
+ "minimum_chrome_version": "123",
404
+ "key": "<BASE64_PUBLIC_KEY_PINS_A_DETERMINISTIC_EXTENSION_ID>",
405
+ "background": { "service_worker": "background.js", "type": "module" },
406
+ "permissions": [
407
+ "debugger", "tabs", "scripting", "activeTab", "downloads", "storage",
408
+ "alarms", "nativeMessaging"
409
+ ],
410
+ "optional_host_permissions": ["*://*/*"],
411
+ "host_permissions": [],
412
+ "options_page": "options.html",
413
+ "action": { "default_title": "Chrome MCP — click to pair / attach" },
414
+ "icons": { "16": "icons/16.png", "48": "icons/48.png", "128": "icons/128.png" }
415
+ }
416
+ ```
417
+
418
+ Decisions (each justified, narrowed for review): [RESOLVED — security major 4, mv3 minors]
419
+ - **No `<all_urls>` baked in.** `host_permissions: []`; hosts are requested on demand via `optional_host_permissions` after the user action-click grants the tab. `debugger` does not need host permissions; reading `tab.url` needs only `tabs`. This is the default-deny posture at the manifest level and eases (eventual) store review.
420
+ - `nativeMessaging` powers the pairing trampoline that reads the 0600 handshake file.
421
+ - `key` pins a deterministic extension id so the Origin pin is meaningful; the id is **public**, the token is the secret.
422
+ - `minimum_chrome_version: 123` (not 116) for predictable `alarms`/`storage.session`/`debugger` behavior the keepalive design assumes. [RESOLVED — mv3 nit]
423
+
424
+ ### 6.2 Background service worker (eviction survival)
425
+
426
+ **Posture (correct, kept):** all listeners registered **synchronously at top
427
+ level**; **zero authoritative state in SW memory** — `{wsPort, token-presence,
428
+ connState}` in `chrome.storage.local`, `attachedTabId` in
429
+ `chrome.storage.session`. On every wake (`onStartup`/`onInstalled`/`onAlarm`/
430
+ `onClicked`/`onMessage`) call `ensureConnected()` which re-hydrates and (re)dials.
431
+
432
+ **Keepalive — the real one.** [RESOLVED — mv3 blocker 1]
433
+ `chrome.alarms` (≥30s) only *wakes* the worker; an open WebSocket does **not**
434
+ extend MV3 worker lifetime. So:
435
+ - While a session is ACTIVE (after `welcome`, until socket close or N idle minutes), run a **25s `setInterval` that issues an *awaited extension API call*** (`await chrome.storage.local.get('connState')`) — an awaited extension API call resets the 30s idle timer; raw socket I/O does not.
436
+ - A **`chrome.alarms` keepalive (every 30s) is the cold-restart safety net** and the authoritative **reconnect driver**: on each alarm, if `connState !== 'connected'` and not `unauthorized`, re-dial. The fragile in-SW backoff timer is best-effort only. [RESOLVED — mv3 minor reconnect]
437
+ - The server **cannot wake a dead worker**, so the gap between death and next wake is real: `withReadyExecutor` PINGs the extension with an **800ms deadline**; a dead-but-not-yet-reconnected worker fails the probe and the call **falls through to CDP** instead of eating a 30s timeout (or a short bounded retry if CDP is disabled). [RESOLVED]
438
+
439
+ **Port re-read on reconnect (port/token consistency):** [RESOLVED — mv3 blocker 3, security minor 8]
440
+ The canonical token+port live together in `handshake.json` (one source of
441
+ truth). The extension learns both via the native-messaging trampoline. **Whenever
442
+ a WS dial FAILS** (server restarted → fresh token + possibly fresh port), the SW
443
+ re-invokes the trampoline to re-read `{port, token}` before the next dial. This
444
+ makes ephemeral ports work: the extension never holds a stale port. The single
445
+ default fixed port is **`38017`** (override `CHROME_MCP_WS_PORT`); ephemeral
446
+ port `0` is supported precisely because the extension re-reads on failure. The
447
+ port is **not** a security boundary — the token is.
448
+
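A minimal sketch of the re-read-on-failure rule, assuming a synchronous reader for brevity (a real native-messaging round trip would be asynchronous). `HandshakeCache` and `readViaTrampoline` are hypothetical names introduced for illustration.

```typescript
interface Handshake {
  port: number;
  token: string;
}

// Caches {port, token} until a dial fails, then forces a fresh trampoline read.
class HandshakeCache {
  private cached: Handshake | undefined;
  private readonly readViaTrampoline: () => Handshake;

  constructor(readViaTrampoline: () => Handshake) {
    this.readViaTrampoline = readViaTrampoline;
  }

  // Value to use for the next dial; reads through the trampoline only when needed.
  current(): Handshake {
    if (this.cached === undefined) {
      this.cached = this.readViaTrampoline();
    }
    return this.cached;
  }

  // Called whenever a WS dial fails: the server may have restarted with a new token/port.
  onDialFailed(): void {
    this.cached = undefined;
  }
}
```

Because the cache is dropped on every failed dial, a stale port survives for at most one attempt, which is what makes ephemeral port `0` workable.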
449
+ **WsClient handshake & unauthorized:** on `open`, send `hello` with the
450
+ trampoline-read token. `welcome` → ACTIVE + start keepalive interval.
451
+ `unauthorized` → **stop reconnecting, badge `AUTH`, re-read handshake on next
452
+ alarm** (server likely rotated the token). `options.ts` must **`await
453
+ chrome.storage.local.set(...)` before** sending `{type:'reconnect'}` so the SW
454
+ reads fresh config; the `unauthorized` state is cleared atomically on re-pair.
455
+
456
+ **Router (`router.ts`):** extension-side mirror of `dispatchToolCall` — one
457
+ try/catch firewall, **never throws**, always returns exactly one
458
+ `result`/`error` per `id`. Boot-time **drift assert** that every `WireMethod`
459
+ has a router case and vice-versa. **The router calls
460
+ `policy.assertUrlAllowed(currentTabUrl, method)` BEFORE any
461
+ `chrome.debugger.attach`/`navigate`/`eval`/read** — the policy gate is enforced
462
+ at *both* ends. [RESOLVED — security major 4]
463
+
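The never-throw firewall and the boot-time drift assert could look roughly like the sketch below. It assumes synchronous handlers to stay short (real handlers would be async), and `route`, `findDrift`, `Reply`, and the `assertAllowed` callback are hypothetical names, not the package's API.

```typescript
type Reply = { id: number; result: unknown } | { id: number; error: string };
type Handler = (params: unknown) => unknown;

interface WireMessage {
  id: number;
  method: string;
  params?: unknown;
}

// Report wire methods with no router case, and router cases with no wire method.
function findDrift(
  wireMethods: readonly string[],
  routerCases: readonly string[],
): { missingCase: string[]; extraCase: string[] } {
  return {
    missingCase: wireMethods.filter((m) => !routerCases.includes(m)),
    extraCase: routerCases.filter((c) => !wireMethods.includes(c)),
  };
}

// One try/catch around everything: exactly one result or error per id, never a throw.
function route(
  handlers: Map<string, Handler>,
  assertAllowed: (method: string) => void,
  msg: WireMessage,
): Reply {
  try {
    const handler = handlers.get(msg.method);
    if (handler === undefined) {
      return { id: msg.id, error: `unknown method: ${msg.method}` };
    }
    assertAllowed(msg.method); // policy gate runs before the handler touches the tab
    return { id: msg.id, result: handler(msg.params) };
  } catch (e) {
    return { id: msg.id, error: e instanceof Error ? e.message : String(e) };
  }
}
```

A `Map` is used instead of a plain object so that a method name such as `toString` cannot resolve to an inherited property.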
464
+ ### 6.3 chrome.debugger method-mapping table
465
+
466
+ `CdpExecutor` (extension side) over `chrome.debugger.sendCommand({tabId}, …,
467
+ '1.3')`. **Resolve selectors AND refs uniformly to `backendNodeId` →
468
+ `DOM.getBoxModel` quad center**; scroll-into-view first; this fixes off-screen,
469
+ zoom, and same-origin-iframe coordinate bugs. [RESOLVED — mv3 major coordinates]
470
+
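For reference, a quad returned by `DOM.getBoxModel` is a flat array of eight numbers (`x1, y1, … x4, y4`). A minimal sketch of the center computation follows; `quadCenter` and `Point` are hypothetical names.

```typescript
interface Point {
  x: number;
  y: number;
}

// Average the four corners of a CDP quad to get the click/hover point.
function quadCenter(quad: readonly number[]): Point {
  if (quad.length !== 8) {
    throw new Error(`expected a quad of 8 numbers, got ${quad.length}`);
  }
  let sumX = 0;
  let sumY = 0;
  for (let i = 0; i < 8; i += 2) {
    sumX += quad[i];
    sumY += quad[i + 1];
  }
  return { x: sumX / 4, y: sumY / 4 };
}
```

Averaging corners rather than assuming an axis-aligned rectangle keeps the point inside the element when it is rotated or skewed by CSS transforms.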
471
+ | Protocol method | CDP / chrome.* calls |
472
+ |---|---|
473
+ | `tabs_list` | `chrome.tabs.query({})` → scheme-filter (`http/https/file`; skip `chrome://`,`devtools://`,`about:blank`) |
474
+ | `tab_select` | single-flight: `detach(old)` → `chrome.debugger.attach({tabId},'1.3')` → enable `Page`,`Runtime`,`DOM` (lazy `Accessibility`) → persist `attachedTabId` |
475
+ | `tab_new` / `tab_close` | `chrome.tabs.create({url,active:false})` (auto-select) / detach-if-attached → `chrome.tabs.remove` |
476
+ | `navigate` | `Page.navigate` then **poll `Runtime.evaluate(document.readyState)`** to the requested `waitUntil`, deadline-bounded |
477
+ | `back`/`forward` | `Page.getNavigationHistory` → `Page.navigateToHistoryEntry` |
478
+ | `reload` | `Page.reload` + readiness poll |
479
+ | `click`/`hover` | resolve target → `DOM.scrollIntoViewIfNeeded` → `DOM.getBoxModel` center → `Input.dispatchMouseEvent` (`mouseMoved` for hover; `mousePressed`/`mouseReleased` for click; trusted events) |
480
+ | `type` | `DOM.focus` → if `clear` select-all+delete → `Input.insertText`; if `keyEvents` per-char `Input.dispatchKeyEvent`; `pressEnter` → key event |
481
+ | `press` | `Input.dispatchKeyEvent` with modifiers |
482
+ | `scroll` | `Input.dispatchMouseEvent` (`mouseWheel`), or element box + `Input.synthesizeScrollGesture` (experimental) |
483
+ | `screenshot` | `Emulation.setDeviceMetricsOverride` (known DPR) → `Page.captureScreenshot({format:'png', captureBeyondViewport:fullPage})`, **height-capped** with `truncated`+`fullHeight` metadata (or scroll-stitch fallback) [RESOLVED — mv3 major screenshot] |
484
+ | `get_text`/`get_html` | `Runtime.evaluate` (CSP-bypassing path; `returnByValue:true`) |
485
+ | `eval` | `Runtime.evaluate({awaitPromise})`; `exceptionDetails` → `{ok:false,error}` (NOT a tool error) |
486
+ | `wait_for` | **poll `Runtime.evaluate`** (querySelector / textContains / gone) with deadline — never wait on a future lifecycle event [RESOLVED — mv3 blocker 2] |
487
+ | `download_file` | server-side CDP fetch (see §8); extension path only if explicitly allowed |
488
+
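Several rows above (`navigate`, `reload`, `wait_for`) rely on the same deadline-bounded poll. A minimal sketch of that loop, assuming an injected `probe` that would wrap a `Runtime.evaluate` call in the real executor; `pollUntil` and `PollOptions` are hypothetical names.

```typescript
interface PollOptions {
  deadlineMs: number; // total budget for the call
  intervalMs: number; // pause between probes
}

// Re-run `probe` until it returns true or the deadline passes. Never waits on a future event.
async function pollUntil(
  probe: () => Promise<boolean>,
  opts: PollOptions,
): Promise<boolean> {
  const deadline = Date.now() + opts.deadlineMs;
  for (;;) {
    if (await probe()) return true;
    const remaining = deadline - Date.now();
    if (remaining <= 0) return false; // caller maps this to a timeout error
    const pause = Math.min(opts.intervalMs, remaining);
    await new Promise<void>((resolve) => setTimeout(resolve, pause));
  }
}

// In the extension the probe might evaluate something like
//   document.readyState === 'complete'
// or
//   document.querySelector(sel) !== null
// through chrome.debugger.sendCommand({ tabId }, 'Runtime.evaluate', { expression, returnByValue: true }).
```

Because the condition is re-evaluated from scratch on each probe, a service worker that was evicted and woken mid-wait can resume polling without having missed an event.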
489
+ **Detach handling:** `chrome.debugger.onDetach` → null attach state, emit
490
+ `detached` event; next command re-attaches. **DevTools-open is terminal, not a
491
+ loop:** on attach failure, `chrome.debugger.getTargets()` checks for
492
+ `attached:true` by another client → return `DEVTOOLS_OPEN` ('close DevTools on
493
+ this tab') and do **not** retry. attach/detach are **single-flight serialized**
494
+ so `tab_select` churn can't race. `detach-all` on session end. [RESOLVED — mv3 major one-debugger-client]
495
+
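The "terminal, not a loop" rule can be sketched as a classifier over the list returned by `chrome.debugger.getTargets()`. The `ATTACH_FAILED` code, the `retry` field, and the type names are assumptions made for this illustration; only `DEVTOOLS_OPEN` and its hint come from the text.

```typescript
// Subset of chrome.debugger.TargetInfo used here.
interface TargetInfoLike {
  tabId?: number;
  attached: boolean;
}

type AttachFailure =
  | { code: "DEVTOOLS_OPEN"; retry: false; hint: string }
  | { code: "ATTACH_FAILED"; retry: true }; // hypothetical generic code

// After attach() fails, decide whether another debugger client already owns the tab.
function classifyAttachFailure(
  tabId: number,
  targets: readonly TargetInfoLike[],
): AttachFailure {
  const heldByOther = targets.some((t) => t.tabId === tabId && t.attached);
  if (heldByOther) {
    return { code: "DEVTOOLS_OPEN", retry: false, hint: "close DevTools on this tab" };
  }
  return { code: "ATTACH_FAILED", retry: true };
}
```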
496
+ **Helpers parity:** `extract_links`/`read_as_markdown`/`fill_form` run via the
497
+ **already-attached `Runtime.evaluate`** path (CSP-bypassing, parity with the CDP
498
+ backend) — **not** `chrome.scripting` MAIN-world (which CSP can block). `fill_form`
499
+ submit click goes through `Input.dispatchMouseEvent` (trusted event). [RESOLVED — mv3 minor CSP/manifest]
500
+
501
+ ---
502
+
503
+ ## 7. CdpExecutor Fallback
504
+
505
+ `src/executor/cdp-executor.ts` — Playwright `connectOverCDP(endpoint)` (attach)
506
+ or `launchPersistentContext(userDataDir)` (self-spawn). **Lifted from
507
+ `linkedin-mcp/src/driver/browser.ts`:**
508
+ - connect-once/reuse guard, **never `cdpBrowser.close()` in attach mode** (`doConnect` ~464–509).
509
+ - single-flight launch guard `this.launching` (~291–305).
510
+ - `launchWithLockRecovery` + `inspectProfileLock`/`clearProfileLocks`/`isChromiumProcess` (~318–361): **must verify the lock owner is a Chromium process before SIGKILL**; required because Claude Desktop SIGKILLs the npx child and orphans `SingletonLock`.
511
+ - `STEALTH_INIT_SCRIPT` + `--disable-blink-features=AutomationControlled` + pinned UA **only on the self-launch path** (never attach).
512
+ - `wirePage` default timeouts.
513
+ - `findLinkedInPage` **generalized to `resolveTab(tabId?|urlPattern?)`**: poll `context.pages()`, skip `file://`/`data:`/`devtools://`/`chrome://`/`about:blank`, match by stored guid or URL regex, else first content page. `contexts()[0]` holds ALL targets over CDP, so `tabsList`/`tabSelect` filter out non-content targets.
514
+
515
+ Decisions:
516
+ - **Ownership flag `launched`**: `dispose()` closes the context only when `launched===true`; attach mode drops refs only.
517
+ - **Do NOT set `PLAYWRIGHT_BROWSERS_PATH`** (rely on the postinstall cache; respect a pre-set override). Drop the Electron `useBundledBrowsersIfPackaged` branch.
518
+ - **Self-launch profile dir must NOT be the user's real Chrome profile** (`CHROME_MCP_USERDATA` || `~/.chrome-mcp/cdp-profile`) — it would conflict with the real Chrome the extension drives.
519
+ - `screenshot` → `page.screenshot({fullPage})` → base64; `eval` → `page.evaluate` wrapped so a page throw becomes `{ok:false}` not a transport-killing throw.
520
+ - **tabId stamping**: CDP guids are prefixed `cdp:<sessionId>:<guid>`; extension ids `ext:<sessionId>:<targetId>`. A handle whose prefix doesn't match the current backend/session → `STALE_TAB` ('call tabs_list again'). [RESOLVED — mv3 major tabId / integration]
521
+
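A minimal sketch of the tabId stamping and the stale-handle check. It assumes session ids contain no `:`; `stampHandle` and `resolveHandle` are hypothetical names.

```typescript
type Backend = "cdp" | "ext";

type Resolved =
  | { ok: true; id: string }
  | { ok: false; code: "STALE_TAB"; hint: string };

// Build a handle such as "cdp:<sessionId>:<guid>" or "ext:<sessionId>:<targetId>".
function stampHandle(backend: Backend, sessionId: string, id: string): string {
  return `${backend}:${sessionId}:${id}`;
}

// Accept a handle only if it was stamped by the current backend and session.
function resolveHandle(handle: string, backend: Backend, sessionId: string): Resolved {
  const prefix = `${backend}:${sessionId}:`;
  if (!handle.startsWith(prefix) || handle.length === prefix.length) {
    return { ok: false, code: "STALE_TAB", hint: "call tabs_list again" };
  }
  return { ok: true, id: handle.slice(prefix.length) };
}
```

Checking the whole prefix, rather than parsing fields, means a handle from a previous session or from the other backend fails the same way.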
522
+ **Selection** (recap §3): extension if responsive-on-ping, else CDP. `--prefer cdp` forces CDP even when an extension is connected (testing). `--no-cdp-fallback` disables the fallback entirely.
523
+
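The selection rule can be written as a small pure function. This sketch assumes `--prefer cdp` wins over every other input (the text does not say how it combines with `--no-cdp-fallback`), and the names `selectBackend`, `SelectionInput`, and the `"none"` outcome are hypothetical.

```typescript
interface SelectionInput {
  extensionResponsive: boolean; // result of the ping probe
  preferCdp: boolean;           // --prefer cdp
  cdpFallback: boolean;         // false when --no-cdp-fallback is set
}

// "none" means no backend is usable right now; the caller would do a short bounded retry.
type Selection = "extension" | "cdp" | "none";

function selectBackend(input: SelectionInput): Selection {
  if (input.preferCdp) return "cdp"; // assumption: forced CDP ignores the fallback flag
  if (input.extensionResponsive) return "extension";
  return input.cdpFallback ? "cdp" : "none";
}
```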
524
+ ---
525
+
526
+ ## 8. Security Model
527
+
528
+ **Threat model.** `chrome.debugger` grants **total** control of the attached
529
+ Chrome: read every cookie/DOM/localStorage of every logged-in site, inject JS,
530
+ download files. The localhost WS is reachable by **any local process** (a
531
+ malicious npm postinstall, a browser-spawned helper, any unsandboxed user
532
+ process). **Therefore: the 256-bit per-boot token is the entire boundary, and
533
+ prompt-injection-to-exfil via page content is a PRIMARY, not theoretical,
534
+ threat** (this tool feeds untrusted page text to an LLM that can call `eval`).
535
+
536
+ **Mitigations — every security blocker/major resolved:**
537
+
538
+ 1. **One auth model, no alternatives.** Fresh 256-bit token **every boot**, never persisted/reused; atomic-0600 `handshake.json`; SHA-256-then-`timingSafeEqual` compare; the four other token/port/transport variants are **deleted from the spec** so they can't half-ship as dead paths. [RESOLVED — blocker 1, minor 7, minor 8]
539
+ 2. **Loopback bind only** (`127.0.0.1`, never `0.0.0.0`). Defense-in-depth.
540
+ 3. **Origin is NOT a security layer.** Documented as defense-in-depth against *browser-page* attackers only (a web page cannot forge a `chrome-extension://` Origin; a native process can). The token holds independently. We never relax token strength on Origin's account. Never reject a valid-token dev connection solely on Origin mismatch. [RESOLVED — blocker 2, mv3 major Origin]
541
+ 4. **No `/connect` HTTP endpoint.** Token bootstrap is the **native-messaging trampoline** reading the 0600 file (the model/attacker can't read it), with a manual file-path-pairing fallback. No race-able network token vendor exists. [RESOLVED — blocker 3]
542
+ 5. **Token never logged.** Never on stdout or stderr; only in the 0600 file (mode verified post-write, fail-closed). Any human-readable pairing artifact is also 0600 and short-lived. `policy.test.ts` asserts the token appears on neither stream. `logErr` redacts. [RESOLVED — major 5]
543
+ 6. **Default-deny domain policy, ON by default, gating READS too.** `Policy { allowDomains: glob[]; allowEval; allowDownloads; allowAllTabs }`. Absent config → **SAFE DEFAULT**: navigate only to `about:blank` + already-open allowlisted tabs, **eval denied cross-domain, reads (`get_text`/`get_html`/`screenshot`/`eval`) denied outside the allowlist**, downloads off. `assertUrlAllowed(currentTabUrl, method)` runs in **the executor dispatch (both backends) AND the extension router** before any attach/navigate/eval/read. `--unsafe-all-domains` (= `allowDomains:['*']`) is the loud-logged escape hatch. [RESOLVED — major 4]
544
+ 7. **Safe-mode shipped in v1 (not "future").** Default disables `eval` and the entire mutating tool set; `--enable-mutations` / `--unsafe-enable-eval` opt in. `eval`'s effective target origin (the tab's current URL) is allowlist-checked before dispatch. [RESOLVED — major eval]
545
+ 8. **Narrowed manifest:** no `<all_urls>`; `host_permissions:[]` + `optional_host_permissions`, requested on demand and granted by an explicit user click on the extension action before the first attach. [RESOLVED — major 4]
546
+ 9. **Displacement is a security event:** superseding connects are logged loudly + surfaced in `chrome_status`; the active connection is pinned to the first `hello` ext id and won't be displaced by a different id without re-pair. [RESOLVED — minor 9]
547
+ 10. **`download_file` is server-side CDP fetch into a dedicated, non-executable, server-owned dir.** `suggestedName` is sanitized (strip path separators, drop dangerous extensions or force `.download`), size-capped; **never writes into the user's real Downloads via the AI path by default.** Result `{path, backend, bytes, mimeType}` marks which filesystem `path` is on; the call resolves only on `complete`/`interrupted` (never on download-begin), no Save-As dialog. [RESOLVED — minor 10, mv3 minor download]
548
+ 11. **Kill switch:** deleting `handshake.json` and sending `SIGHUP` rotates the token and forces re-pair. **Ship load-unpacked only for v1** (`chrome.debugger` triggers heavy store review).
549
+
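Mitigation 1 (per-boot token, SHA-256 then `timingSafeEqual`) can be sketched with Node's `crypto` module. The hex encoding and the function names are assumptions; the text fixes only the token size and the comparison order.

```typescript
import { createHash, randomBytes, timingSafeEqual } from "node:crypto";

// Fresh 256-bit token, generated once per boot. Hex encoding is an assumption.
function newBootToken(): string {
  return randomBytes(32).toString("hex");
}

function sha256(value: string): Buffer {
  return createHash("sha256").update(value, "utf8").digest();
}

// Hash both sides first: digests always have equal length, which timingSafeEqual
// requires, and the comparison time does not depend on the presented token.
function tokenMatches(presented: string, expected: string): boolean {
  return timingSafeEqual(sha256(presented), sha256(expected));
}
```

Hashing before comparing also avoids leaking the expected token's length, since `timingSafeEqual` throws on buffers of different sizes.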
550
+ ---
551
+
552
+ ## 9. Build Plan
553
+
554
+ Each phase is a shippable slice with a verification slice that proves it. Order
555
+ is dependency-aware; the canonical contracts (`shared/protocol.ts`,
556
+ `executor/types.ts`, `security/policy.ts`) land first so nothing forks.
557
+
558
+ **Phase 0 — Contracts & skeleton.**
559
+ Build: `shared/protocol.ts`, `executor/types.ts`, `security/policy.ts`, `config.ts` (single port/token/policy surface), `package.json`/`tsconfig`, `postinstall.js`. *Verify:* `tsc` compiles; `policy.test.ts` proves default-deny + read-gating + glob; a unit test asserts no second copy of any protocol/port/token constant exists.
560
+
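The default-deny and read-gating behavior that `policy.test.ts` is meant to prove might reduce to something like the following. It is a sketch under assumptions: the function takes the policy as an explicit first argument (unlike the `assertUrlAllowed(currentTabUrl, method)` signature in the spec), globs support only `*`, matching is on hostname, and the `POLICY_DENIED` message prefix is hypothetical.

```typescript
interface Policy {
  allowDomains: string[];
  allowEval: boolean;
  allowDownloads: boolean;
  allowAllTabs: boolean;
}

// What an absent config would resolve to under the safe default.
const SAFE_DEFAULT: Policy = {
  allowDomains: [],
  allowEval: false,
  allowDownloads: false,
  allowAllTabs: false,
};

// Translate a '*' glob into an anchored, case-insensitive RegExp.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`, "i");
}

// Throws unless `method` may run against the tab's current URL.
function assertAllowed(policy: Policy, currentTabUrl: string, method: string): void {
  if (method === "eval" && !policy.allowEval) {
    throw new Error("POLICY_DENIED: eval is disabled");
  }
  if (method === "download_file" && !policy.allowDownloads) {
    throw new Error("POLICY_DENIED: downloads are disabled");
  }
  if (currentTabUrl === "about:blank") return;
  let host: string;
  try {
    host = new URL(currentTabUrl).hostname;
  } catch {
    throw new Error("POLICY_DENIED: unparseable URL");
  }
  if (!policy.allowDomains.some((glob) => globToRegExp(glob).test(host))) {
    throw new Error(`POLICY_DENIED: ${host} is not in allowDomains`);
  }
}
```

Note that the same check covers reads such as `get_text`, so an empty `allowDomains` blocks reading as well as navigating.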
561
+ **Phase 1 — MCP server + StubExecutor (Chrome-free).**
562
+ Build: `mcp/server.ts` (clean-stdout `logErr`), `mcp/tools.ts` (26 defs+handlers, drift-check, `dispatchToolCall`), `validators.ts`, `envelopes.ts`, `helpers.ts`, `markdown-extract.ts`, `executor/manager.ts` wired to a `StubExecutor`. *Verify:* `tools.test.ts` — drift parity, `requireTarget` exactly-one-of, `McpToolError`→`isError`, screenshot→image block, `eval`-throw→`{ok:false}`, safe-mode blocks `eval`+mutations, policy denies cross-domain read. Runs in CI (`node --test`).
563
+
564
+ **Phase 2 — Bridge + auth + token-never-logged.**
565
+ Build: `bridge/server.ts`, `bridge/connection.ts`, `bridge/auth.ts` (per-boot token, atomic 0600, SHA-256 compare), `bridge/datadir.ts`. *Verify:* `bridge.test.ts` — correct token accepted; wrong token → 4401; missing/forged Origin in dev with valid token → still accepted (Origin not a gate); id X→reply X with 3 in flight; second valid client displaces first (+ event surfaced); deadline→`isError`; **token absent from stdout AND stderr**. CI-eligible (fake WS client).
566
+
567
+ **Phase 3 — ExtensionExecutor + CdpExecutor + selection.**
568
+ Build: `executor/extension-executor.ts`, `executor/cdp-executor.ts` (BrowserManager port + lock recovery + `resolveTab`), `executor/page-scripts.ts`. Manager `ensureReady` with ping-probe fall-through + single-flight. *Verify:* extend `tools.test.ts` with a fake bridge — late extension takes over from CDP; extension ping-fail falls to CDP without a 30s stall; `STALE_TAB` on cross-backend tabId.
569
+
570
+ **Phase 4 — MV3 extension.**
571
+ Build: `manifest.json`, `sw/background.ts` (top-level listeners, keepalive interval + alarm reconnect driver, re-hydrate), `sw/ws-client.ts`, `sw/router.ts` (never-throw, drift-assert, policy gate), `sw/cdp-executor.ts` (poll-based lifecycle, box-model resolution, DevTools-terminal), `sw/tabs.ts`, `options/*`, `nm/trampoline.ts`, `build-ext.mjs`. *Verify:* `extension-smoke.ts`: Playwright `--load-extension`, programmatic pair via trampoline-equivalent, `navigate(example.com)`→`get_text` contains "Example", a >30s idle test proving keepalive keeps the socket alive, then a kill test proving CDP fall-through. Headed; behind a display flag.
572
+
573
+ **Phase 5 — Helpers, download, full policy wiring, HITL.**
574
+ Build: server-side `extract_links`/`read_as_markdown`/`fill_form` over `Runtime.evaluate`; server-side CDP-fetch `download_file` with sanitized names/size cap; policy enforced at executor + router. *Verify:* `test/hitl/` (cloned harness, Executor handle) — read steps free; mutating steps (`download_file`, `fill_form+submit`, `eval`-side-effect) behind classification + `--include-mutating` + literal `yes` + `test-targets.json` allowlist. Human verdict gate; never CI.
575
+
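The name sanitization for `download_file` could be sketched as below. The extension list is a hypothetical sample, not the package's list, and `sanitizeSuggestedName` is an illustrative name; the sketch takes the "force `.download`" option from the security model.

```typescript
// Hypothetical sample of extensions treated as dangerous.
const DANGEROUS_EXTENSIONS = new Set([
  ".exe", ".msi", ".bat", ".cmd", ".ps1", ".scr", ".sh", ".js", ".app",
]);

// Reduce an untrusted suggested filename to a safe leaf name.
function sanitizeSuggestedName(suggestedName: string): string {
  // Keep only the last path component, whichever separator was used.
  const base = suggestedName.split(/[\\/]/).pop() ?? "";
  // Drop control characters and leading dots (no hidden files, no ".." remnants).
  const cleaned = base
    .replace(/[\u0000-\u001f\u007f]/g, "")
    .replace(/^\.+/, "")
    .trim();
  const name = cleaned.length > 0 ? cleaned : "download";
  const dot = name.lastIndexOf(".");
  const ext = dot >= 0 ? name.slice(dot).toLowerCase() : "";
  // Neutralize dangerous extensions by appending ".download".
  return DANGEROUS_EXTENSIONS.has(ext) ? `${name}.download` : name;
}
```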
576
+ **Phase 6 — Packaging & docs.**
577
+ Build: `files` whitelist incl. `extension-dist/`, `--print-pairing`/`--print-extension-path`, `.mcp.json` snippet, README (banner caveat, load-unpacked-only, policy/safe-mode defaults, kill switch). *Verify:* `npx chrome-mcp` from a packed tarball boots, prints the pairing file path (no token leak), and a fresh Chrome load-unpacked round-trips one navigate→get_text.
578
+
579
+ ---
580
+
581
+ ## 10. Open Questions & Remaining Risks
582
+
583
+ **Resolved-and-closed** (were open questions; now decided): token model (per-boot ephemeral, 0600, native-messaging trampoline); port (single `38017` default, ephemeral-0 supported via re-read-on-failure); helper home (server-side compose, only `download` on the wire); error enum (one `ExecutorErrorCode`); screenshot key (`dataBase64`); ref model (page-scoped, invalidated on navigation); safe-mode + read-gating (shipped, default-on); Origin (defense-in-depth only, not a gate).
584
+
585
+ **Remaining risks (accepted for v1):**
586
+ - **The yellow "is being debugged" banner** is permanent in v1 and a social-engineering footgun; accepted with `chrome.debugger`. Documented.
587
+ - **MV3 worker death gap:** between worker eviction and the next wake, the extension is briefly undriveable; mitigated by ping-probe fall-through to CDP and the keepalive loop, but the model may see one transient `isError`+retry. Tool descriptions instruct retry.
588
+ - **Native-messaging trampoline install step** is the one manual setup beyond `npx`; the manual file-path paste is the no-native fallback. Smoother one-click pairing is a v1.1 polish.
589
+ - **`networkidle`** is approximated by a bounded idle-window poll (no native CDP event); documented as best-effort, never able to wedge a call.
590
+ - **Local-code-execution attacker** who can already read the user's 0600 files already has full access to the user session; the token cannot defend against an attacker who owns the filesystem. The policy allowlist still blocks blind exfil to arbitrary domains.
591
+ - **`captureBeyondViewport` very-tall pages**: capped + `truncated` flag; scroll-stitch is the v1.1 upgrade if full fidelity is needed.
592
+
593
+ **Genuinely open (decide before v1.1):**
594
+ - Multi-tab/multi-session concurrency (single global Executor + single attached tab today): does an agent need N tabs driven simultaneously? That breaks the singleton and requires per-cmd `tabId` everywhere on the wire.
595
+ - Web Store path: requires a `chrome.scripting`-only mode (no `chrome.debugger`) — a second executor backend behind the same interface.
596
+ - Whether `download` should ever use the extension `chrome.downloads` path (user Downloads dir) as an explicit opt-in, or remain server-fetch-only forever.