@mehmoodqureshi/chrome-mcp 0.1.0
- package/LICENSE +21 -0
- package/README.md +129 -0
- package/dist/shared/download.d.ts +15 -0
- package/dist/shared/download.js +0 -0
- package/dist/shared/protocol.d.ts +114 -0
- package/dist/shared/protocol.js +55 -0
- package/dist/src/bridge/auth.d.ts +32 -0
- package/dist/src/bridge/auth.js +76 -0
- package/dist/src/bridge/connection.d.ts +48 -0
- package/dist/src/bridge/connection.js +192 -0
- package/dist/src/bridge/datadir.d.ts +8 -0
- package/dist/src/bridge/datadir.js +22 -0
- package/dist/src/bridge/server.d.ts +58 -0
- package/dist/src/bridge/server.js +178 -0
- package/dist/src/cli.d.ts +11 -0
- package/dist/src/cli.js +93 -0
- package/dist/src/config.d.ts +42 -0
- package/dist/src/config.js +188 -0
- package/dist/src/executor/cdp-executor.d.ts +131 -0
- package/dist/src/executor/cdp-executor.js +422 -0
- package/dist/src/executor/extension-executor.d.ts +102 -0
- package/dist/src/executor/extension-executor.js +124 -0
- package/dist/src/executor/manager.d.ts +43 -0
- package/dist/src/executor/manager.js +94 -0
- package/dist/src/executor/select.d.ts +23 -0
- package/dist/src/executor/select.js +53 -0
- package/dist/src/executor/stub-executor.d.ts +60 -0
- package/dist/src/executor/stub-executor.js +118 -0
- package/dist/src/executor/types.d.ts +192 -0
- package/dist/src/executor/types.js +24 -0
- package/dist/src/mcp/envelopes.d.ts +13 -0
- package/dist/src/mcp/envelopes.js +30 -0
- package/dist/src/mcp/helpers.d.ts +37 -0
- package/dist/src/mcp/helpers.js +71 -0
- package/dist/src/mcp/markdown-extract.d.ts +9 -0
- package/dist/src/mcp/markdown-extract.js +61 -0
- package/dist/src/mcp/server.d.ts +18 -0
- package/dist/src/mcp/server.js +82 -0
- package/dist/src/mcp/tools.d.ts +32 -0
- package/dist/src/mcp/tools.js +267 -0
- package/dist/src/mcp/validators.d.ts +32 -0
- package/dist/src/mcp/validators.js +104 -0
- package/dist/src/security/policy.d.ts +48 -0
- package/dist/src/security/policy.js +155 -0
- package/docs/BLUEPRINT.md +596 -0
- package/extension-dist/background.js +567 -0
- package/extension-dist/manifest.json +12 -0
- package/extension-dist/options.html +32 -0
- package/extension-dist/options.js +37 -0
- package/package.json +69 -0
- package/scripts/postinstall.js +50 -0
@@ -0,0 +1,596 @@

# Chrome MCP — Build-Ready Blueprint (v1)

> An MCP server that lets Claude drive a real Chrome browser. One pluggable
> `Executor` interface, two backends: an MV3 **extension** (primary, drives the
> user's real Chrome via `chrome.debugger`/CDP) and a **CDP fallback** (the
> server launches/attaches Chromium via Playwright). Distributed as an npm
> package run with `npx` plus a load-unpacked extension.
>
> This document is the spec. Every blocker and major from verification has been
> **designed away**; resolutions are called out inline as **[RESOLVED]**.

---

## 1. Overview & Architecture

Three "worlds" cooperate through one process-global `ExecutorManager` and one
canonical wire protocol. The MCP host (Claude Desktop / Code) speaks JSON-RPC
over **stdio**; the extension speaks the canonical frame protocol over a
**localhost WebSocket**; the CDP fallback speaks **Playwright/CDP**.

```
           ┌──────────────────────────────────────────────────────────────┐
WORLD 1    │  MCP HOST (Claude Desktop / Claude Code)                     │
(the model)│  spawns: npx chrome-mcp                                      │
           └───────────────┬──────────────────────────────────────────────┘
                           │ JSON-RPC over stdio (stdout = SACRED)
           ┌───────────────▼──────────────────────────────────────────────┐
WORLD 2    │  chrome-mcp CLI (one Node process)                           │
(server)   │                                                              │
           │  ┌────────────┐   ┌──────────────────┐   ┌─────────────────┐ │
           │  │ MCP Server │──▶│ tools.ts         │──▶│ ExecutorManager │ │
           │  │ (stdio)    │   │ dispatchToolCall │   │ (selection +    │ │
           │  └────────────┘   │ + policy gate    │   │  self-heal)     │ │
           │                   └──────────────────┘   └───┬─────────┬───┘ │
           │                                              │         │     │
           │                              ┌───────────────▼──┐   ┌──▼────┐│
           │                              │ ExtensionExecutor│   │ Cdp-  ││
           │                              │ (WS proxy)       │   │ Exec- ││
           │                              └───────┬──────────┘   │ utor  ││
           │   ┌─────────────────────┐            │              └──┬────┘│
           │   │ BridgeServer (ws)   │◀───────────┘ sendCommand()   │     │
           │   │ 127.0.0.1:<port>    │  id-correlated frames        │     │
           │   │ token gate (only    │                              │     │
           │   │  real boundary)     │                              │     │
           │   └──────────┬──────────┘                              │     │
           └──────────────│─────────────────────────────────────────│─────┘
                          │ ws:// loopback                          │
                          │ (extension dials IN as CLIENT)          │ Playwright
           ┌──────────────▼─────────────────────┐   ┌───────────────▼───────────┐
WORLD 3    │ MV3 EXTENSION (user's REAL Chrome) │   │ Chromium (server-owned    │
(browser)  │ background SW: WsClient + Router   │   │   OR attached via         │
           │ CdpExecutor over chrome.debugger   │   │   connectOverCDP)         │
           │ → real cookies / real logins       │   │ FALLBACK ONLY             │
           └────────────────────────────────────┘   └───────────────────────────┘
```

**Backend selection (every `ensureReady()`):** prefer a *responsive*
authenticated extension; otherwise, if `cdpFallback` is enabled, lazily
launch/attach Chromium. Selection is re-evaluated on every call, so an
extension that connects late takes over from CDP, and CDP takes over again
when the extension disconnects.

**Decisive global calls:**
- Single process-global `Executor` pointer for v1 (not a pool). Multi-session is out of scope.
- **The 256-bit per-boot token is the ONLY security boundary.** Origin checks and the loopback bind are defense-in-depth against *browser-page* attackers, not native processes. [RESOLVED — security blockers 2 & 3]
- **One canonical `protocol.ts`** imported by both server and extension. Four conflicting wire/port/token designs are collapsed into this single contract. [RESOLVED — integration blocker 1]
- **Helpers (`extract_links`, `read_as_markdown`, `fill_form`) are composed server-side** from primitives. Only `download_file` is an executor/wire method. [RESOLVED — integration blocker 2]
- **Default-deny domain policy is ON by default and gates reads too**, enforced inside the executor dispatch (both backends) AND the extension router. [RESOLVED — security major 4 & eval]

---

## 2. Repository Layout

Single sibling repo `chrome-mcp/` (NOT a workspace). Two build roots in one git
repo so the wire protocol version-locks between server and extension. The shared
`protocol.ts` lives at the top so both builds import the *same file*.

```
chrome-mcp/
├── package.json                  # bin: chrome-mcp -> dist/cli.js; deps sdk, ws, playwright
├── tsconfig.json
├── README.md
├── .gitignore                    # dist/, extension-dist/, test-targets.json, *.handshake
├── scripts/
│   └── postinstall.js            # playwright install chromium (CDP fallback only); skip-guarded
│
├── shared/
│   └── protocol.ts               # ⭐ SINGLE SOURCE OF TRUTH for the wire contract
│                                 # imported by BOTH src/ and extension/ builds
│
├── src/                          # → dist/ via tsc
│   ├── cli.ts                    # npx entry: parseArgs, help/version, boot bridge+stdio, shutdown
│   ├── config.ts                 # CliConfig + parseArgs (single port/token/policy source)
│   ├── mcp/
│   │   ├── server.ts             # createServer factory + start/stop/isRunning + logErr (clean-stdout)
│   │   ├── tools.ts              # TOOL_DEFINITIONS + TOOL_HANDLERS + dispatchToolCall + drift-check
│   │   ├── validators.ts         # asArgs/requireString/optional*/requireTarget guards
│   │   ├── envelopes.ts          # jsonResult / imageResult / textResult
│   │   ├── helpers.ts            # extract_links / read_as_markdown / fill_form composed server-side
│   │   └── markdown-extract.ts   # dependency-free HTML→md reducer (injected string)
│   ├── executor/
│   │   ├── types.ts              # ⭐ Executor interface + arg/result types + ExecutorError
│   │   ├── manager.ts            # ExecutorManager singleton + withReadyExecutor + selection/self-heal
│   │   ├── extension-executor.ts # proxies primitives → BridgeServer.sendCommand
│   │   ├── cdp-executor.ts       # Playwright connectOverCDP / launchPersistentContext (BrowserManager port)
│   │   └── page-scripts.ts       # shared injected-script sources (getText/extractLinks/waitFor poll)
│   ├── bridge/
│   │   ├── server.ts             # BridgeServer: ws on 127.0.0.1, token gate, id-correlation, heartbeat
│   │   ├── connection.ts         # ExtensionConnection: pending-map, timeouts, reject-all-on-close
│   │   ├── auth.ts               # token gen, atomic 0600 handshake.json, hashed constant-time compare
│   │   └── datadir.ts            # Electron-free CHROME_MCP_DATA || ~/.chrome-mcp
│   └── security/
│       └── policy.ts             # Policy type, loadPolicy, assertUrlAllowed (default-deny, gates reads)
│
├── extension/                    # → extension-dist/ via esbuild (load-unpacked root)
│   ├── manifest.json             # MV3; keyed for stable id; minimum_chrome_version 123
│   ├── icons/{16,48,128}.png
│   ├── scripts/build-ext.mjs     # esbuild background.ts (esm) + options.ts (iife) + copy manifest/html/icons
│   ├── package.json              # devDeps only: typescript, esbuild, @types/chrome
│   └── src/
│       ├── sw/
│       │   ├── background.ts     # SW entry: top-level listeners, ensureConnected, keepalive loop
│       │   ├── ws-client.ts      # dial ws://, hello-token handshake, reconnect, ping/pong
│       │   ├── router.ts         # CommandRouter: never-throw firewall, drift-assert, policy gate
│       │   ├── cdp-executor.ts   # chrome.debugger Executor: ensureAttached self-heal, poll-based lifecycle
│       │   └── tabs.ts           # tabs_list/select/new/close + attachedTabId persistence + validation
│       ├── options/
│       │   ├── options.html
│       │   └── options.ts        # native-host pairing status + manual fallback paste; live status
│       └── nm/
│           └── trampoline.ts     # (optional, v1.1) native-messaging host that reads handshake.json
│
├── test/
│   ├── stub-executor.ts          # in-memory Executor (canned values + forced throws)
│   ├── bridge.test.ts            # node --test: token gate, id-correlation, displacement, deadline→isError
│   ├── tools.test.ts             # dispatch drift-check, requireTarget, isError envelope, image block
│   ├── policy.test.ts            # default-deny, read-gating, allowlist glob, token-never-logged assertion
│   ├── extension-smoke.ts        # Playwright --load-extension, programmatic pair, navigate→getText
│   └── hitl/                     # human-gated e2e (cloned from linkedin-mcp/test/hitl)
│       ├── index.ts runner.ts reporter.ts prompts.ts types.ts scenarios.ts
│
├── test-targets.example.json     # committed; test-targets.json is gitignored
└── policy.example.json           # committed example allowlist
```

---

## 3. The Executor Interface

The single backend-agnostic contract. Plain JSON-serializable in/out — **no
Playwright `Page`, no CDP session, no DOM handle crosses this boundary.** Both
`ExtensionExecutor` and `CdpExecutor` implement it. Helpers are composed in
`mcp/helpers.ts` from these primitives; only `download` is privileged.

```ts
// src/executor/types.ts (and re-exported shapes shared with shared/protocol.ts)

export type BackendKind = 'extension' | 'cdp';
export type WaitUntil = 'load' | 'domcontentloaded' | 'networkidle';
export type KeyModifier = 'Alt' | 'Control' | 'Meta' | 'Shift';

/** Selector XOR ref. requireTarget() enforces exactly-one-of. Refs are minted
 *  by getText/extractLinks/waitFor/eval results and are PAGE-scoped: invalidated
 *  on that tab's navigation. ref format: `el_<tabShort>_<backendNodeId>`. */
export type Target = { selector: string; ref?: never } | { ref: string; selector?: never };

/** tabId is ALWAYS prefixed `<backend>:<sessionId>:<rawId>` so a handle from one
 *  backend/session can never mis-route after a fallback switch or reconnect. */
export type TabId = string;

export interface TabInfo { tabId: TabId; url: string; title: string; active: boolean; index: number; }
export interface NavResult { url: string; title: string; httpStatus?: number; }
export interface EvalResult { ok: boolean; value?: unknown; type?: string; error?: string; } // value truncated >256KB
export interface WaitResult { matched: boolean; ref?: string; waitedMs: number; }
export interface ActionOk { ok: true; }
export interface ScreenshotResult {
  dataBase64: string; mimeType: 'image/png';
  width: number; height: number; truncated: boolean; fullHeight?: number; // fullPage cap metadata
}
export interface DownloadResult { path: string; backend: BackendKind; bytes: number; mimeType?: string; suggestedName?: string; }

export interface ExecutorStatus {
  ready: boolean;
  backend: BackendKind | null;
  activeTabId: TabId | null;
  detail?: string; // WHY unavailable (port in use, no extension, policy, etc.)
  extensionConnected: boolean;
  cdpAttached: boolean;
}

export interface Executor {
  readonly backend: BackendKind;
  status(): ExecutorStatus;
  /** idempotent lazy connect/attach + self-heal; single-flight guarded. */
  ensureReady(): Promise<void>;
  /** lightweight responsiveness probe (short deadline) — used by withReadyExecutor
   *  to detect a dead-but-not-yet-reconnected MV3 worker and fall through. */
  ping(deadlineMs?: number): Promise<boolean>;
  dispose(): Promise<void>; // close ONLY if we own the browser; never the user's Chrome

  // --- tabs ---
  tabsList(): Promise<TabInfo[]>;
  tabSelect(tabId: TabId): Promise<TabInfo>;
  tabNew(url?: string): Promise<TabInfo>;
  tabClose(tabId: TabId): Promise<{ closed: true; tabId: TabId }>;

  // --- navigation (active tab unless tabId given) ---
  navigate(args: { url: string; tabId?: TabId; waitUntil?: WaitUntil }): Promise<NavResult>;
  back(tabId?: TabId): Promise<NavResult>;
  forward(tabId?: TabId): Promise<NavResult>;
  reload(args?: { tabId?: TabId; waitUntil?: WaitUntil }): Promise<NavResult>;

  // --- interaction (Target = {selector} XOR {ref}) ---
  click(t: Target, opts?: { tabId?: TabId; button?: 'left'|'right'|'middle'; clickCount?: number }): Promise<ActionOk>;
  type(t: Target, text: string, opts?: { tabId?: TabId; clear?: boolean; pressEnter?: boolean; keyEvents?: boolean }): Promise<ActionOk>;
  fill(t: Target, value: string, opts?: { tabId?: TabId }): Promise<ActionOk>; // value-set + input/change (used by fill_form)
  press(key: string, opts?: { tabId?: TabId; modifiers?: KeyModifier[] }): Promise<ActionOk>;
  hover(t: Target, opts?: { tabId?: TabId }): Promise<ActionOk>;
  scroll(opts: { tabId?: TabId; x?: number; y?: number; deltaX?: number; deltaY?: number; target?: Target }): Promise<ActionOk>;

  // --- read (policy-gated by current tab URL) ---
  getText(t?: Target, opts?: { tabId?: TabId }): Promise<{ text: string; ref?: string }>;
  getHtml(t?: Target, opts?: { tabId?: TabId; outer?: boolean }): Promise<{ html: string }>;
  screenshot(opts?: { tabId?: TabId; fullPage?: boolean; target?: Target }): Promise<ScreenshotResult>;
  eval(expression: string, opts?: { tabId?: TabId; awaitPromise?: boolean }): Promise<EvalResult>;
  waitFor(opts: { tabId?: TabId; selector?: string; textContains?: string; gone?: boolean; timeoutMs?: number }): Promise<WaitResult>;

  // --- privileged, executor-owned (NOT composable) ---
  download(args: { url?: string; target?: Target; tabId?: TabId; suggestedName?: string }): Promise<DownloadResult>;
}

export class ExecutorError extends Error {
  constructor(
    public code:
      | 'NO_BACKEND' | 'EXTENSION_DISCONNECTED' | 'TIMEOUT'
      | 'TAB_NOT_FOUND' | 'STALE_TAB' | 'SELECTOR_NOT_FOUND' | 'REF_EXPIRED'
      | 'EVAL_FAILED' | 'LAUNCH_FAILED' | 'DETACHED' | 'TARGET_GONE'
      | 'POLICY_DENIED' | 'DEVTOOLS_OPEN' | 'DOWNLOAD_FAILED' | 'BACKPRESSURE',
    message: string,
  ) { super(message); this.name = 'ExecutorError'; }
}
```
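
The two string-level invariants in the types above (a `Target` is exactly one of selector or ref, and a `TabId` is `<backend>:<sessionId>:<rawId>`) can be checked at the tool boundary. The following is a minimal sketch, not the package's `validators.ts`: the function bodies, the `parseTabId` helper, and the error strings are assumptions made for illustration.

```typescript
// Hypothetical validator sketch; names and messages are illustrative only.
type BackendKind = 'extension' | 'cdp';
type Target = { selector: string; ref?: never } | { ref: string; selector?: never };

/** Enforce exactly-one-of selector|ref on untyped tool arguments. */
function requireTarget(args: Record<string, unknown>): Target {
  const selector = args.selector;
  const ref = args.ref;
  const hasSelector = typeof selector === 'string' && selector.length > 0;
  const hasRef = typeof ref === 'string' && ref.length > 0;
  if (hasSelector === hasRef) {
    // Both present or both missing: neither is acceptable.
    throw new Error('BAD_ARGS: provide exactly one of "selector" or "ref"');
  }
  return hasSelector ? { selector: selector as string } : { ref: ref as string };
}

/** Split `<backend>:<sessionId>:<rawId>`; the raw id may itself contain ':'. */
function parseTabId(tabId: string): { backend: BackendKind; sessionId: string; rawId: string } {
  const [backend, sessionId, ...rest] = tabId.split(':');
  const rawId = rest.join(':');
  if ((backend !== 'extension' && backend !== 'cdp') || !sessionId || rawId === '') {
    throw new Error(`TAB_NOT_FOUND: malformed tabId "${tabId}"`);
  }
  return { backend, sessionId, rawId };
}
```

Rejecting a handle whose backend or session prefix does not match the live executor is what would keep a stale `cdp:` tab id from being routed to the extension after a switch.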

**`ExecutorManager`** (the `withReadyDriver` analog, `linkedin-mcp/src/driver/linkedin.ts:208`
/ `linkedin-mcp/src/mcp/tools.ts:638`):

```ts
export class ExecutorManager {
  constructor(deps: { bridge: BridgeServer; policy: Policy; cdpFallback: boolean;
                      cdpEndpoint?: string; headless: boolean; userDataDir: string });
  static getInstance(deps?: ConstructorParameters<typeof ExecutorManager>[0]): ExecutorManager;

  /** Re-selects backend (extension-if-responsive else CDP), ensures operational,
   *  single-flight guarded by this.readying. PINGs the extension with a short
   *  deadline; a dead MV3 worker falls through to CDP instead of eating a 30s
   *  timeout. [RESOLVED — mv3 keepalive blocker] */
  ensureReady(): Promise<Executor>;
  peek(): { backend: BackendKind | null; ready: boolean; detail?: string };
  dispose(): Promise<void>;
}
export async function withReadyExecutor(): Promise<Executor> {
  return ExecutorManager.getInstance().ensureReady();
}
```

Selection logic, decisively:
1. If `bridge.hasActiveExtension()` **and** `extensionExecutor.ping(800)` (an 800 ms deadline) resolves true → return `ExtensionExecutor`.
2. Else if `cdpFallback` → single-flight `cdpExecutor.ensureReady()` (launch or `connectOverCDP`) → return it.
3. Else → `throw new ExecutorError('NO_BACKEND', 'No Chrome available: open the extension and pair it, or enable CDP fallback.')`.

Mid-call disconnects are **not** migrated: the in-flight call rejects (→ `isError`),
the *next* call re-selects. Tool descriptions tell the model to retry on disconnect.
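
The three selection steps and the single-flight guard can be sketched as below. This is an illustration under stated assumptions, not the package's `manager.ts`: the `Backend` and `SelectDeps` shapes and the `makeEnsureReady` wrapper are hypothetical, and a plain `Error` stands in for `ExecutorError`.

```typescript
type BackendKind = 'extension' | 'cdp';

interface Backend {
  readonly backend: BackendKind;
  ensureReady(): Promise<void>;
  ping(deadlineMs?: number): Promise<boolean>;
}

interface SelectDeps {
  hasActiveExtension(): boolean; // stands in for bridge.hasActiveExtension()
  extension: Backend;
  cdp: Backend;
  cdpFallback: boolean;
}

/** One selection pass: responsive extension first, then CDP, else NO_BACKEND. */
async function selectBackend(deps: SelectDeps, pingDeadlineMs = 800): Promise<Backend> {
  if (deps.hasActiveExtension()) {
    // A rejected probe counts as "not responsive".
    const alive = await deps.extension.ping(pingDeadlineMs).catch(() => false);
    if (alive) return deps.extension;
  }
  if (deps.cdpFallback) {
    await deps.cdp.ensureReady(); // launch or attach lazily
    return deps.cdp;
  }
  throw new Error('NO_BACKEND: open the extension and pair it, or enable CDP fallback.');
}

/** Single-flight wrapper: concurrent callers share one in-progress selection. */
function makeEnsureReady(deps: SelectDeps): () => Promise<Backend> {
  let readying: Promise<Backend> | null = null;
  return () => {
    if (readying !== null) return readying;
    const attempt = selectBackend(deps);
    const clear = () => { readying = null; };
    attempt.then(clear, clear); // the next call re-selects, success or failure
    readying = attempt;
    return attempt;
  };
}
```

Clearing the guard after each attempt is what makes selection per-call: a cached executor would never notice a late extension or a dropped socket.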

---

## 4. Wire Protocol

One canonical `shared/protocol.ts`, `v:1`, **snake_case** method names matching
the MCP primitives 1:1, JSON text frames (one object per WS message),
request/response correlated by `id`. The server is the WS **SERVER**; the
extension is the single privileged **CLIENT**. [RESOLVED — integration blocker 1]

```ts
// shared/protocol.ts — imported VERBATIM by src/ and extension/
export const PROTOCOL_VERSION = 1 as const;

/** Methods on the wire = MCP primitives 1:1. Helpers (extract_links,
 *  read_as_markdown, fill_form) are NOT here — composed server-side.
 *  Only download_file is a wire method beyond the primitives. [RESOLVED] */
export type WireMethod =
  | 'tabs_list' | 'tab_select' | 'tab_new' | 'tab_close'
  | 'navigate' | 'back' | 'forward' | 'reload'
  | 'click' | 'type' | 'press' | 'hover' | 'scroll'
  | 'screenshot' | 'get_text' | 'get_html' | 'eval' | 'wait_for'
  | 'download_file' | 'ping_probe';

export interface BaseFrame { type: string; v: 1; }

// ---- handshake ----
export interface HelloFrame extends BaseFrame { type: 'hello'; token: string; ext: { id: string; version: string; chrome: string }; }
export interface WelcomeFrame extends BaseFrame { type: 'welcome'; serverVersion: string; sessionId: string; heartbeatMs: number; }
export interface UnauthFrame extends BaseFrame { type: 'unauthorized'; reason: 'bad_token' | 'bad_version' | 'timeout'; }

// ---- command / result ----
export interface CommandFrame extends BaseFrame {
  type: 'command'; id: string; method: WireMethod;
  params: Record<string, unknown>; tabId?: string; timeoutMs: number;
}
export interface ResultFrame extends BaseFrame {
  type: 'result'; id: string; ok: true; data: unknown; // screenshot: { dataBase64, mimeType, width, height, truncated }
}
export interface ErrorFrame extends BaseFrame {
  type: 'error'; id: string; ok: false; error: { code: ExecutorErrorCode; message: string; data?: Record<string, unknown> };
}

// ---- unsolicited + heartbeat ----
export interface EventFrame extends BaseFrame {
  type: 'event'; event: 'tab_created'|'tab_removed'|'tab_updated'|'detached'|'target_gone'; data: Record<string, unknown>;
}
export interface PingFrame extends BaseFrame { type: 'ping'; ts: number; }
export interface PongFrame extends BaseFrame { type: 'pong'; ts: number; }

export type ServerFrame = CommandFrame | WelcomeFrame | UnauthFrame | PingFrame;
export type ExtensionFrame = HelloFrame | ResultFrame | ErrorFrame | EventFrame | PongFrame;
export type Frame = ServerFrame | ExtensionFrame;

export type ExecutorErrorCode =
  | 'NO_TARGET' | 'TARGET_GONE' | 'DETACHED' | 'DEVTOOLS_OPEN'
  | 'SELECTOR_NOT_FOUND' | 'REF_EXPIRED' | 'EVAL_THREW'
  | 'TIMEOUT' | 'BAD_ARGS' | 'CDP_ERROR' | 'POLICY_DENIED'
  | 'DOWNLOAD_FAILED' | 'UNKNOWN_METHOD';
```
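
To make the frame contract concrete, here is one way a command could be serialized and a reply validated before it is trusted. It is a sketch only: `encodeCommand` and `parseReply` are hypothetical helper names, the frame shapes are trimmed copies of the ones above, and `method` and `code` are widened to `string` to keep the example short.

```typescript
const PROTOCOL_VERSION = 1 as const;

interface CommandFrame {
  type: 'command'; v: 1; id: string; method: string;
  params: Record<string, unknown>; tabId?: string; timeoutMs: number;
}
type ReplyFrame =
  | { type: 'result'; v: 1; id: string; ok: true; data: unknown }
  | { type: 'error'; v: 1; id: string; ok: false; error: { code: string; message: string } };

/** Serialize one command as a single JSON text frame. */
function encodeCommand(id: string, method: string, params: Record<string, unknown>, timeoutMs: number): string {
  const frame: CommandFrame = { type: 'command', v: PROTOCOL_VERSION, id, method, params, timeoutMs };
  return JSON.stringify(frame);
}

/** Parse an incoming frame; anything malformed or off-version yields null. */
function parseReply(raw: string): ReplyFrame | null {
  let parsed: unknown;
  try { parsed = JSON.parse(raw); } catch { return null; }
  if (typeof parsed !== 'object' || parsed === null) return null;
  const f = parsed as Record<string, unknown>;
  const id = f.id;
  if (f.v !== PROTOCOL_VERSION || typeof id !== 'string') return null;
  if (f.type === 'result' && f.ok === true) {
    return { type: 'result', v: 1, id, ok: true, data: f.data };
  }
  const err = f.error;
  if (f.type === 'error' && f.ok === false && typeof err === 'object' && err !== null) {
    const e = err as Record<string, unknown>;
    return { type: 'error', v: 1, id, ok: false, error: { code: String(e.code), message: String(e.message) } };
  }
  return null;
}
```

Returning `null` instead of throwing keeps a malformed frame from tearing down the connection; the caller can log it and move on.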

**Auth handshake (the canonical, ONLY model):** [RESOLVED — security blocker 1, minor 7]
1. At every boot the server generates a **fresh 256-bit token**, never persisted or reused: `crypto.randomBytes(32).toString('base64url')`. It writes `{ v, port, token, pid, ts, expectedExtensionId? }` **atomically** (tmp + `rename`) to `$CHROME_MCP_DATA/handshake.json` with mode **0600**, verifies the mode after write, and **fails closed** if it cannot. The token is **never** written to stdout, stderr, or any log. A test asserts this.
2. The extension acquires `{port, token}` from the handshake file via a **native-messaging trampoline** (`extension/src/nm/`), the only component that can read a 0600 file that the model or an attacker cannot. v1 ships a **manual fallback**: the user runs `npx chrome-mcp --print-pairing`, which prints a **redacted confirmation + the file path** (`handshake.json written to …; open the extension and click Pair`), and the trampoline reads the file. **No token on stdout/stderr, no `/connect` HTTP endpoint.** [RESOLVED — security blocker 3]
3. Extension dials `ws://127.0.0.1:<port>` (port read from handshake via trampoline; see §6 for re-read-on-reconnect) and sends `hello` within 5000ms.
4. Server compares tokens by **hashing both sides to SHA-256 and `timingSafeEqual`-ing the digests** (no length precondition, no length leak). Match → `welcome`, flip to ACTIVE; mismatch / `v!==1` / timeout → send `unauthorized` then close **4401**. No `command`/`result`/`event` is processed before `welcome`.
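
Steps 1 and 4 map onto a handful of Node built-ins. The sketch below assumes Node's `node:crypto` and `node:fs` modules; the function names, the `Handshake` shape (the optional `expectedExtensionId` is omitted), the temporary directory, and port 8765 are illustrative assumptions, not the package's `auth.ts`.

```typescript
import { createHash, randomBytes, timingSafeEqual } from 'node:crypto';
import { chmodSync, mkdtempSync, readFileSync, renameSync, statSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

interface Handshake { v: 1; port: number; token: string; pid: number; ts: number; }

/** Fresh 256-bit token, regenerated on every boot. */
function generateToken(): string {
  return randomBytes(32).toString('base64url');
}

/** Compare SHA-256 digests so inputs of different lengths never throw or leak length. */
function tokensMatch(presented: string, expected: string): boolean {
  const a = createHash('sha256').update(presented).digest();
  const b = createHash('sha256').update(expected).digest();
  return timingSafeEqual(a, b);
}

/** Write handshake.json via tmp + rename, then verify the mode (POSIX only). */
function writeHandshake(dir: string, payload: Handshake): string {
  const finalPath = join(dir, 'handshake.json');
  const tmpPath = `${finalPath}.${process.pid}.tmp`;
  writeFileSync(tmpPath, JSON.stringify(payload), { mode: 0o600 });
  chmodSync(tmpPath, 0o600); // do not rely on the umask
  renameSync(tmpPath, finalPath); // atomic when both paths share a filesystem
  if (process.platform !== 'win32' && (statSync(finalPath).mode & 0o777) !== 0o600) {
    throw new Error('handshake.json is not mode 0600; failing closed');
  }
  return finalPath;
}

// Usage: a throwaway directory stands in for $CHROME_MCP_DATA; port 8765 is made up.
const dir = mkdtempSync(join(tmpdir(), 'chrome-mcp-'));
const token = generateToken();
const handshakePath = writeHandshake(dir, { v: 1, port: 8765, token, pid: process.pid, ts: Date.now() });
const stored = JSON.parse(readFileSync(handshakePath, 'utf8')) as Handshake;
```

Hashing first matters because `timingSafeEqual` throws when its two buffers differ in length; equal-length digests sidestep both the exception and the length leak.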

**id correlation & timeouts:**
- `id` is a server-generated ULID per `CommandFrame`. `ExtensionConnection` holds `Map<id, { resolve, reject, timer, method, startedAt }>`.
- Per-request timeout is **method-aware**: default **30s**; `screenshot`, `wait_for`, `navigate`, `download_file` get **60s**. On timeout: reject `ExecutorError('TIMEOUT')`, delete the entry, **do NOT close the socket**.
- On socket close/error: reject **every** pending entry with `EXTENSION_DISCONNECTED`, clear the map.
- Backpressure: before send, if `ws.bufferedAmount > 8MB` reject `BACKPRESSURE` (screenshots are large; never queue unboundedly).
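
The pending-map bookkeeping described above could look roughly like this. The `PendingCommands` class and `timeoutFor` helper are hypothetical names for illustration, plain `Error` objects stand in for `ExecutorError`, and the backpressure check is left out because it needs a live socket.

```typescript
const SLOW_METHODS = new Set(['screenshot', 'wait_for', 'navigate', 'download_file']);

/** Method-aware deadline: 60s for the slow set, 30s otherwise. */
function timeoutFor(method: string): number {
  return SLOW_METHODS.has(method) ? 60_000 : 30_000;
}

interface Pending {
  resolve(value: unknown): void;
  reject(reason: Error): void;
  timer: ReturnType<typeof setTimeout>;
  method: string;
  startedAt: number;
}

class PendingCommands {
  private readonly pending = new Map<string, Pending>();

  /** Register a command; the returned promise settles on its result/error frame. */
  track(id: string, method: string, timeoutMs: number = timeoutFor(method)): Promise<unknown> {
    return new Promise<unknown>((resolve, reject) => {
      const timer = setTimeout(() => {
        this.pending.delete(id); // drop the entry; the socket stays open
        reject(new Error(`TIMEOUT: ${method} exceeded ${timeoutMs}ms`));
      }, timeoutMs);
      this.pending.set(id, { resolve, reject, timer, method, startedAt: Date.now() });
    });
  }

  /** Route a result/error frame to its waiter; unknown or late ids are ignored. */
  settle(id: string, outcome: { ok: true; data: unknown } | { ok: false; message: string }): boolean {
    const entry = this.pending.get(id);
    if (!entry) return false;
    clearTimeout(entry.timer);
    this.pending.delete(id);
    if (outcome.ok) entry.resolve(outcome.data);
    else entry.reject(new Error(outcome.message));
    return true;
  }

  /** On socket close/error: fail every in-flight command at once. */
  rejectAll(reason = 'EXTENSION_DISCONNECTED'): void {
    this.pending.forEach((entry) => {
      clearTimeout(entry.timer);
      entry.reject(new Error(reason));
    });
    this.pending.clear();
  }

  get size(): number {
    return this.pending.size;
  }
}
```

Ignoring late frames in `settle` is what makes "timeout without closing the socket" safe: a slow reply for an already-rejected id simply finds no entry.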

**Heartbeat:** app-level `ping`/`pong` every 15s, layered over `ws` native ping/pong. Two missed pongs (>30s silence) → terminate connection → next `ensureReady` falls to CDP, with a loud stderr note.
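
The "two missed pongs" rule reduces to a small counter that is driven once per heartbeat interval. This is a sketch under the assumption that the server sends one `ping` per tick; the `HeartbeatMonitor` class and its method names are hypothetical.

```typescript
type HeartbeatAction = 'send_ping' | 'terminate';

class HeartbeatMonitor {
  private awaitingPong = false;
  private missed = 0;
  private readonly maxMissed: number;

  constructor(maxMissed = 2) {
    this.maxMissed = maxMissed;
  }

  /** Call when a `pong` frame arrives. */
  onPong(): void {
    this.awaitingPong = false;
    this.missed = 0;
  }

  /** Call once per heartbeat interval (15s in the text above). */
  onInterval(): HeartbeatAction {
    if (this.awaitingPong) {
      this.missed += 1; // the previous ping was never answered
      if (this.missed >= this.maxMissed) return 'terminate';
    }
    this.awaitingPong = true;
    return 'send_ping';
  }
}
```

With a 15s interval, a silent peer is pinged at 0s and 15s and terminated on the tick at 30s, which matches the 30s silence budget.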

**Single-active-connection (security-aware):** a second valid-token dial **supersedes** the first (close 4000). This is logged **loudly to stderr AND surfaced via `chrome_status`** as a *security-relevant displacement event*, not a UX nicety. The active connection is bound to the `ext.id` from the first `hello`; a displacement by a *different* id is refused without an explicit user re-pair. [RESOLVED — security minor 9]
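
The displacement rule is easiest to see as a pure decision function applied after the token check has already passed. The names below (`decideDial`, `reportDisplacement`, the `userRePaired` flag) are assumptions for illustration; how a re-pair is actually recorded is not specified here.

```typescript
type DialDecision =
  | { action: 'accept' }
  | { action: 'supersede'; closeOldWith: 4000 }
  | { action: 'refuse'; reason: string };

/** Decide what to do with a dial that has ALREADY presented a valid token. */
function decideDial(activeExtId: string | null, incomingExtId: string, userRePaired: boolean): DialDecision {
  if (activeExtId === null) return { action: 'accept' };
  if (incomingExtId === activeExtId || userRePaired) {
    return { action: 'supersede', closeOldWith: 4000 };
  }
  return { action: 'refuse', reason: `extension id ${incomingExtId} does not match the paired id` };
}

/** Displacement is security-relevant, so it is reported on stderr, never stdout. */
function reportDisplacement(decision: DialDecision, log: (line: string) => void = console.error): void {
  if (decision.action !== 'accept') {
    log(`[chrome-mcp] SECURITY: displacement attempt, outcome=${decision.action}`);
  }
}
```

Keeping the decision separate from the socket handling makes the refusal path straightforward to cover in a unit test.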

---

## 5. The Complete MCP Tool Surface

23 tools, listed in the table below. `readOnly` is metadata (not a JSON-Schema field) consumed by the host
and by **safe-mode** (shipped in v1, default ON). Every handler:
`withReadyExecutor()` → validate args (`requireTarget` for selector|ref) →
**`policy.assertUrlAllowed(currentTabUrl, method)`** → call executor/helper →
`jsonResult`/`imageResult`/`textResult`. `dispatchToolCall` is the single
never-throw firewall converting any thrown `Error` to `{isError:true}`.

| Tool | Input (summary) | Returns | R/O? | Impl home |
|---|---|---|---|---|
| `tabs_list` | `{}` | `TabInfo[]` | read | executor primitive |
| `tab_select` | `{tabId}` | `TabInfo` | mutate | executor primitive |
| `tab_new` | `{url?}` | `TabInfo` | mutate | executor primitive |
| `tab_close` | `{tabId}` | `{closed,tabId}` | mutate | executor primitive |
| `navigate` | `{url, tabId?, waitUntil?}` | `NavResult` | mutate | executor primitive |
| `back` | `{tabId?}` | `NavResult` | mutate | executor primitive |
| `forward` | `{tabId?}` | `NavResult` | mutate | executor primitive |
| `reload` | `{tabId?, waitUntil?}` | `NavResult` | mutate | executor primitive |
| `click` | `{selector?\|ref?, tabId?, button?, clickCount?}` | `ActionOk` | mutate | executor primitive |
| `type` | `{selector?\|ref?, text, tabId?, clear?, pressEnter?, keyEvents?}` | `ActionOk` | mutate | executor primitive |
| `press` | `{key, modifiers?, tabId?}` | `ActionOk` | mutate | executor primitive |
| `hover` | `{selector?\|ref?, tabId?}` | `ActionOk` | mutate | executor primitive |
| `scroll` | `{x?,y?,deltaX?,deltaY?,selector?\|ref?, tabId?}` | `ActionOk` | mutate | executor primitive |
| `screenshot` | `{fullPage?, selector?\|ref?, tabId?}` | **image block** + dims/truncated | read* | executor primitive |
| `get_text` | `{selector?\|ref?, tabId?}` | `{text, ref?}` | read | executor primitive |
| `get_html` | `{selector?\|ref?, outer?, tabId?}` | `{html}` | read | executor primitive |
| `eval` | `{expression, awaitPromise?, tabId?}` | `EvalResult` | **mutate** | executor primitive |
| `wait_for` | `{selector?, textContains?, gone?, timeoutMs?, tabId?}` | `WaitResult` | read | executor primitive |
| `extract_links` | `{selector?, sameOriginOnly?, include?, exclude?, tabId?}` | `{links:[{href,text,ref}]}` | read | **server helper** (one `ex.eval` IIFE) |
| `read_as_markdown` | `{selector?, tabId?}` | raw markdown text | read | **server helper** (md-reducer IIFE) |
| `fill_form` | `{fields:{[selector]:string\|bool}, submitSelector?, tabId?}` | `{filled, submitted}` | mutate | **server helper** (seq `ex.fill`/`ex.click`) |
| `download_file` | `{url?\|(selector?\|ref?), suggestedName?, tabId?}` | `DownloadResult` | mutate | **executor `download`** (privileged) |
| `chrome_status` | `{}` | `ExecutorStatus` + displacement/heartbeat flags | read | manager (defensive cached fallback) |

Notes:
- **`screenshot` is `readOnly:true` but auto-scrolls the target into view** before capture — documented benign side effect. Returns a real `{type:'image',data,mimeType}` block (the one envelope type added beyond the LinkedIn repo's set). `read*` = read with a benign side effect.
- **`eval` is `readOnly:false`** (arbitrary JS). Safe-mode disables `eval` and the entire mutating set unless `--unsafe-enable-eval` / `--enable-mutations` is passed. [RESOLVED — security major eval]
- **`eval`/all reads/all navigations call `policy.assertUrlAllowed(tab.currentUrl, method)`** before dispatch — reads are gated because reads are the exfil payload. [RESOLVED — security major 4]
- Validators add `optionalBoolean`, `optionalNumber(float,bounds)`, `optionalStringArray`, and `requireTarget` (exactly-one-of selector|ref) to the lifted LinkedIn set.
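
The handler pipeline (safe-mode check, default-deny URL gate, never-throw envelope) can be sketched as follows. Everything named here is an assumption for illustration: the `ToolSpec` and `Gate` shapes, the `hostAllowed` helper, and in particular the glob semantics, where `*.example.com` is taken to match the apex domain as well. The real `policy.ts` may differ on all of these.

```typescript
interface ToolResult { content: Array<{ type: 'text'; text: string }>; isError?: boolean; }
interface ToolSpec { readOnly: boolean; run(args: Record<string, unknown>): Promise<unknown>; }
interface Gate { safeMode: boolean; currentUrl: string; allowlist: string[]; }

/** Default-deny host check; an unparseable URL or an empty allowlist denies. */
function hostAllowed(url: string, allowlist: string[]): boolean {
  let host: string;
  try { host = new URL(url).hostname; } catch { return false; }
  return allowlist.some((pattern) =>
    pattern.startsWith('*.')
      ? host === pattern.slice(2) || host.endsWith(pattern.slice(1))
      : host === pattern);
}

/** Never-throw firewall: every failure becomes an isError envelope. */
async function dispatchToolCall(
  tools: Record<string, ToolSpec>, name: string, args: Record<string, unknown>, gate: Gate,
): Promise<ToolResult> {
  try {
    const tool = Object.prototype.hasOwnProperty.call(tools, name) ? tools[name] : undefined;
    if (!tool) throw new Error(`UNKNOWN_METHOD: ${name}`);
    if (gate.safeMode && !tool.readOnly) {
      throw new Error(`POLICY_DENIED: ${name} is disabled in safe-mode`);
    }
    if (!hostAllowed(gate.currentUrl, gate.allowlist)) {
      throw new Error(`POLICY_DENIED: ${gate.currentUrl} is not allowlisted`);
    }
    const value = await tool.run(args);
    return { content: [{ type: 'text', text: JSON.stringify(value) }] };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { content: [{ type: 'text', text: message }], isError: true };
  }
}
```

Because the gate runs for read tools too, a `get_text` on a non-allowlisted tab fails the same way a `click` would.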

---

## 6. Chrome Extension

### 6.1 `manifest.json` (full)

```json
{
  "manifest_version": 3,
  "name": "Chrome MCP Bridge",
  "version": "1.0.0",
  "minimum_chrome_version": "123",
  "key": "<BASE64_PUBLIC_KEY_PINS_A_DETERMINISTIC_EXTENSION_ID>",
  "background": { "service_worker": "background.js", "type": "module" },
  "permissions": [
    "debugger", "tabs", "scripting", "activeTab", "downloads", "storage",
    "alarms", "nativeMessaging"
  ],
  "optional_host_permissions": ["*://*/*"],
  "host_permissions": [],
  "options_page": "options.html",
  "action": { "default_title": "Chrome MCP — click to pair / attach" },
  "icons": { "16": "icons/16.png", "48": "icons/48.png", "128": "icons/128.png" }
}
```
|
|
417
|
+
|
|
418
|
+
Decisions (each justified, narrowed for review): [RESOLVED — security major 4, mv3 minors]
|
|
419
|
+
- **No `<all_urls>` baked in.** `host_permissions: []`; hosts are requested on demand via `optional_host_permissions` after the user action-click grants the tab. `debugger` does not need host permissions; reading `tab.url` needs only `tabs`. This is the default-deny posture at the manifest level and eases (eventual) store review.
|
|
420
|
+
- `nativeMessaging` powers the pairing trampoline that reads the 0600 handshake file.
|
|
421
|
+
- `key` pins a deterministic extension id so the Origin pin is meaningful; the id is **public**, the token is the secret.
|
|
422
|
+
- `minimum_chrome_version: 123` (not 116) for predictable `alarms`/`storage.session`/`debugger` behavior the keepalive design assumes. [RESOLVED — mv3 nit]
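
As a rough sketch of the on-demand host grant described above: derive a single-origin match pattern from the tab URL, then hand it to `chrome.permissions.request`. The helper names and the choice of per-origin (rather than wildcard) grants are assumptions; the requester is injected so the snippet runs outside Chrome.

```typescript
// Derives the single-origin match pattern to request for a tab.
// Returns null for non-web schemes (chrome://, devtools://, about:, ...).
function originPatternFor(tabUrl: string): string | null {
  let parsed: URL;
  try {
    parsed = new URL(tabUrl);
  } catch (e) {
    return null;
  }
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") return null;
  // Port and path are dropped; the pattern covers the whole host.
  return `${parsed.protocol}//${parsed.hostname}/*`;
}

// Same shape as the promise form of chrome.permissions.request,
// injected here so the sketch does not depend on the chrome global.
type RequestPermissions = (p: { origins: string[] }) => Promise<boolean>;

// Hypothetical helper: asks for access to exactly one origin.
async function grantTabOrigin(
  tabUrl: string,
  request: RequestPermissions,
): Promise<boolean> {
  const pattern = originPatternFor(tabUrl);
  if (pattern === null) return false;
  return request({ origins: [pattern] });
}
```

In the extension this would be called from the action-click handler, for example `grantTabOrigin(tab.url, (p) => chrome.permissions.request(p))`, because Chrome only honors permission requests made during a user gesture.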
|
|
423
|
+
|
|
424
|
+
### 6.2 Background service worker (eviction survival)
|
|
425
|
+
|
|
426
|
+
**Posture (correct, kept):** all listeners registered **synchronously at top
level**; **zero authoritative state in SW memory**: `{wsPort, token-presence,
connState}` in `chrome.storage.local`, `attachedTabId` in
`chrome.storage.session`. On every wake (`onStartup`/`onInstalled`/`onAlarm`/
`onClicked`/`onMessage`) call `ensureConnected()` which re-hydrates and (re)dials.
|
|
431
|
+
|
|
432
|
+
**Keepalive — the real one.** [RESOLVED — mv3 blocker 1]
|
|
433
|
+
`chrome.alarms` (≥30s) only *wakes* the worker; an open but idle WebSocket does **not**
extend MV3 worker lifetime (since Chrome 116, only WebSocket message traffic resets the idle timer). So:
|
|
435
|
+
- While a session is ACTIVE (after `welcome`, until socket close or N idle minutes), run a **25s `setInterval` that issues an *awaited extension API call*** (`await chrome.storage.local.get('connState')`). An awaited extension API call resets the 30s idle timer; an open socket with no message traffic does not.
|
|
436
|
+
- A **`chrome.alarms` keepalive (every 30s) is the cold-restart safety net** and the authoritative **reconnect driver**: on each alarm, if `connState !== 'connected'` and not `unauthorized`, re-dial. The fragile in-SW backoff timer is best-effort only. [RESOLVED — mv3 minor reconnect]
|
|
437
|
+
- The server **cannot wake a dead worker**, so the gap between death and next wake is real: `withReadyExecutor` PINGs the extension with an **800ms deadline**; a dead-but-not-yet-reconnected worker fails the probe and the call **falls through to CDP** instead of eating a 30s timeout (or a short bounded retry if CDP is disabled). [RESOLVED]
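
The two keepalive layers above might look roughly like this. It is a sketch under stated assumptions: the `ConnState` values other than `connected` and `unauthorized`, the `StorageLike` interface, and both function names are hypothetical; in the worker, `storage` would be `chrome.storage.local`.

```typescript
// 'connecting' and 'disconnected' are assumed state names for illustration.
type ConnState = "connected" | "connecting" | "disconnected" | "unauthorized";

// Alarm-driven reconnect rule from the text:
// re-dial unless already connected or parked in the unauthorized state.
function shouldRedial(connState: ConnState | undefined): boolean {
  return connState !== "connected" && connState !== "unauthorized";
}

// Minimal storage surface so the sketch runs outside Chrome.
interface StorageLike {
  get(key: string): Promise<Record<string, unknown>>;
}

// Starts the in-session keepalive; returns a function that stops it.
function startKeepalive(storage: StorageLike, periodMs: number = 25_000): () => void {
  const timer = setInterval(async () => {
    try {
      // The awaited extension API call is what keeps the worker alive.
      await storage.get("connState");
    } catch (e) {
      // A failed read is non-fatal; the alarm stays the reconnect driver.
    }
  }, periodMs);
  return () => clearInterval(timer);
}
```

The stop function would be called on socket close or idle expiry, leaving the `chrome.alarms` handler (which consults `shouldRedial`) as the only thing that can bring the session back.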
|
|
438
|
+
|
|
439
|
+
**Port re-read on reconnect (port/token consistency):** [RESOLVED — mv3 blocker 3, security minor 8]
|
|
440
|
+
The canonical token+port live together in `handshake.json` (one source of
truth). The extension learns both via the native-messaging trampoline. **Whenever
a WS dial FAILS** (server restarted → fresh token + possibly fresh port), the SW
re-invokes the trampoline to re-read `{port, token}` before the next dial. This
makes ephemeral ports work: the extension never holds a stale port. The single
default fixed port is **`38017`** (override `CHROME_MCP_WS_PORT`); ephemeral
port `0` is supported precisely because the extension re-reads on failure. The
port is **not** a security boundary; the token is.
|
|
448
|
+
|
|
449
|
+
**WsClient handshake & unauthorized:** on `open`, send `hello` with the
trampoline-read token. `welcome` → ACTIVE + start keepalive interval.
`unauthorized` → **stop reconnecting, badge `AUTH`, re-read handshake on next
alarm** (server likely rotated the token). `options.ts` must **`await
chrome.storage.local.set(...)` before** sending `{type:'reconnect'}` so the SW
reads fresh config; the `unauthorized` state is cleared atomically on re-pair.
|
|
455
|
+
|
|
456
|
+
**Router (`router.ts`):** extension-side mirror of `dispatchToolCall`: one
try/catch firewall, **never throws**, always returns exactly one
`result`/`error` per `id`. Boot-time **drift assert** that every `WireMethod`
has a router case and vice-versa. **The router calls
`policy.assertUrlAllowed(currentTabUrl, method)` BEFORE any
`chrome.debugger.attach`/`navigate`/`eval`/read**; the policy gate is enforced
at *both* ends. [RESOLVED — security major 4]
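
A minimal sketch of the never-throw router and the boot-time drift assert follows. The `Reply` shape, the handler signature, and passing the policy gate as a function are assumptions made for illustration; only the behavior (one reply per `id`, gate before dispatch, drift check in both directions) comes from the text.

```typescript
// Hypothetical wire shapes for the sketch.
type Reply =
  | { id: number; type: "result"; value: unknown }
  | { id: number; type: "error"; message: string };
type Handler = (params: unknown) => unknown | Promise<unknown>;
type PolicyGate = (currentTabUrl: string, method: string) => void;

// Boot-time drift assert: every wire method has a handler and vice versa.
function assertNoDrift(methods: readonly string[], handlers: Record<string, Handler>): void {
  const handled = Object.keys(handlers);
  const missing = methods.filter((m) => handled.indexOf(m) === -1);
  const extra = handled.filter((k) => methods.indexOf(k) === -1);
  if (missing.length > 0 || extra.length > 0) {
    throw new Error(`router drift: missing=[${missing.join(",")}] extra=[${extra.join(",")}]`);
  }
}

// Single try/catch firewall: always resolves to exactly one reply for msg.id.
async function route(
  handlers: Record<string, Handler>,
  assertUrlAllowed: PolicyGate,
  currentTabUrl: string,
  msg: { id: number; method: string; params: unknown },
): Promise<Reply> {
  try {
    const handler = handlers[msg.method];
    if (!handler) {
      return { id: msg.id, type: "error", message: `unknown method ${msg.method}` };
    }
    // Policy gate runs before any attach/navigate/eval/read side effect.
    assertUrlAllowed(currentTabUrl, msg.method);
    return { id: msg.id, type: "result", value: await handler(msg.params) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { id: msg.id, type: "error", message };
  }
}
```

Because a policy denial is just another thrown error inside the firewall, it surfaces to the server as an ordinary `error` reply with the same `id`.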
|
|
463
|
+
|
|
464
|
+
### 6.3 chrome.debugger method-mapping table
|
|
465
|
+
|
|
466
|
+
`CdpExecutor` (extension side) over `chrome.debugger.sendCommand({tabId}, …)`
on a target attached with protocol version `'1.3'`. **Resolve selectors AND refs
uniformly to `backendNodeId` → `DOM.getBoxModel` quad center**; scroll-into-view
first; this fixes off-screen, zoom, and same-origin-iframe coordinate bugs. [RESOLVED — mv3 major coordinates]
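
`DOM.getBoxModel` reports each box as a quad of eight numbers (four x,y corner pairs). The click point is then the average of the four corners, as in this small sketch; the function and type names are illustrative only.

```typescript
// Eight numbers: x1,y1, x2,y2, x3,y3, x4,y4 (the CDP Quad layout).
type Quad = [number, number, number, number, number, number, number, number];

// Center of a quad, computed as the mean of its four corners.
// This also holds for rotated or skewed boxes, unlike a bounding-rect midpoint.
function quadCenter(quad: Quad): { x: number; y: number } {
  const [x1, y1, x2, y2, x3, y3, x4, y4] = quad;
  return {
    x: (x1 + x2 + x3 + x4) / 4,
    y: (y1 + y2 + y3 + y4) / 4,
  };
}
```

The resulting point would be passed as `x`/`y` to `Input.dispatchMouseEvent` after `DOM.scrollIntoViewIfNeeded` has brought the node into the viewport.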
|
|
470
|
+
|
|
471
|
+
| Protocol method | CDP / chrome.* calls |
|---|---|
| `tabs_list` | `chrome.tabs.query({})` → scheme-filter (`http/https/file`; skip `chrome://`,`devtools://`,`about:blank`) |
| `tab_select` | single-flight: `detach(old)` → `chrome.debugger.attach({tabId},'1.3')` → enable `Page`,`Runtime`,`DOM` (lazy `Accessibility`) → persist `attachedTabId` |
| `tab_new` / `tab_close` | `chrome.tabs.create({url,active:false})` (auto-select) / detach-if-attached → `chrome.tabs.remove` |
| `navigate` | `Page.navigate` then **poll `Runtime.evaluate(document.readyState)`** to the requested `waitUntil`, deadline-bounded |
| `back`/`forward` | `Page.getNavigationHistory` → `Page.navigateToHistoryEntry` |
| `reload` | `Page.reload` + readiness poll |
| `click`/`hover` | resolve target → `DOM.scrollIntoViewIfNeeded` → `DOM.getBoxModel` center → `Input.dispatchMouseEvent` (pressed/released; trusted events) |
| `type` | `DOM.focus` → if `clear` select-all+delete → `Input.insertText`; if `keyEvents` per-char `Input.dispatchKeyEvent`; `pressEnter` → key event |
| `press` | `Input.dispatchKeyEvent` with modifiers |
| `scroll` | `Input.dispatchMouseEvent` (`mouseWheel`), or element box + `Input.synthesizeScrollGesture` |
| `screenshot` | `Emulation.setDeviceMetricsOverride` (known DPR) → `Page.captureScreenshot({format:'png', captureBeyondViewport:fullPage})`, **height-capped** with `truncated`+`fullHeight` metadata (or scroll-stitch fallback) [RESOLVED — mv3 major screenshot] |
| `get_text`/`get_html` | `Runtime.evaluate` (CSP-bypassing path; `returnByValue:true`) |
| `eval` | `Runtime.evaluate({awaitPromise})`; `exceptionDetails` → `{ok:false,error}` (NOT a tool error) |
| `wait_for` | **poll `Runtime.evaluate`** (querySelector / textContains / gone) with deadline; never wait on a future lifecycle event [RESOLVED — mv3 blocker 2] |
| `download_file` | server-side CDP fetch (see §8); extension path only if explicitly allowed |
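
The `eval` row's rule (a page-side exception is data, not a tool error) could be implemented along these lines. The interfaces mirror only the fields of the `Runtime.evaluate` response that the sketch reads; the success shape `{ok:true,value}` and the function name are assumptions.

```typescript
// Subset of the CDP Runtime.evaluate response used by this sketch.
interface RemoteObject {
  type: string;
  value?: unknown;
  description?: string;
}
interface ExceptionDetails {
  text: string;
  exception?: RemoteObject;
}
interface EvaluateResponse {
  result: RemoteObject;
  exceptionDetails?: ExceptionDetails;
}

// The success branch shape is an assumption; the text only fixes {ok:false,error}.
type EvalResult = { ok: true; value: unknown } | { ok: false; error: string };

// Maps a page-side throw to {ok:false,error} instead of raising.
function toEvalResult(res: EvaluateResponse): EvalResult {
  const details = res.exceptionDetails;
  if (details !== undefined) {
    // Prefer the exception's own description (e.g. "Error: boom ...") when present.
    const description = details.exception ? details.exception.description : undefined;
    return { ok: false, error: description !== undefined ? description : details.text };
  }
  return { ok: true, value: res.result.value };
}
```

Keeping this mapping in one pure function makes it straightforward to apply the same rule on the extension side and in the Playwright fallback.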
|
|
488
|
+
|
|
489
|
+
**Detach handling:** `chrome.debugger.onDetach` → null attach state, emit
`detached` event; next command re-attaches. **DevTools-open is terminal, not a
loop:** on attach failure, `chrome.debugger.getTargets()` checks for
`attached:true` by another client → return `DEVTOOLS_OPEN` ('close DevTools on
this tab') and do **not** retry. Attach/detach are **single-flight serialized**
so `tab_select` churn can't race. `detach-all` on session end. [RESOLVED — mv3 major one-debugger-client]
|
|
495
|
+
|
|
496
|
+
**Helpers parity:** `extract_links`/`read_as_markdown`/`fill_form` run via the
**already-attached `Runtime.evaluate`** path (CSP-bypassing, parity with the CDP
backend), **not** `chrome.scripting` MAIN-world (which CSP can block). The `fill_form`
submit click goes through `Input.dispatchMouseEvent` (trusted event). [RESOLVED — mv3 minor CSP/manifest]
|
|
500
|
+
|
|
501
|
+
---
|
|
502
|
+
|
|
503
|
+
## 7. CdpExecutor Fallback
|
|
504
|
+
|
|
505
|
+
`src/executor/cdp-executor.ts`: Playwright `connectOverCDP(endpoint)` (attach)
or `launchPersistentContext(userDataDir)` (self-spawn). **Lifted from
`linkedin-mcp/src/driver/browser.ts`:**
|
|
508
|
+
- connect-once/reuse guard, **never `cdpBrowser.close()` in attach mode** (`doConnect` ~464–509).
|
|
509
|
+
- single-flight launch guard `this.launching` (~291–305).
|
|
510
|
+
- `launchWithLockRecovery` + `inspectProfileLock`/`clearProfileLocks`/`isChromiumProcess` (~318–361): **must verify the lock owner is a Chromium process before SIGKILL**; required because Claude Desktop SIGKILLs the npx child and orphans `SingletonLock`.
|
|
511
|
+
- `STEALTH_INIT_SCRIPT` + `--disable-blink-features=AutomationControlled` + pinned UA **only on the self-launch path** (never attach).
|
|
512
|
+
- `wirePage` default timeouts.
|
|
513
|
+
- `findLinkedInPage` **generalized to `resolveTab(tabId?|urlPattern?)`**: poll `context.pages()`, skip `file://`/`data:`/`devtools://`/`chrome://`/`about:blank`, match by stored guid or URL regex, else first content page. `contexts()[0]` holds ALL targets over CDP, so `tabsList`/`tabSelect` must filter non-content targets.
|
|
514
|
+
|
|
515
|
+
Decisions:
|
|
516
|
+
- **Ownership flag `launched`**: `dispose()` closes the context only when `launched===true`; attach mode drops refs only.
|
|
517
|
+
- **Do NOT set `PLAYWRIGHT_BROWSERS_PATH`** (rely on the postinstall cache; respect a pre-set override). Drop the Electron `useBundledBrowsersIfPackaged` branch.
|
|
518
|
+
- **Self-launch profile dir must NOT be the user's real Chrome profile** (`CHROME_MCP_USERDATA` || `~/.chrome-mcp/cdp-profile`) — it would conflict with the real Chrome the extension drives.
|
|
519
|
+
- `screenshot` → `page.screenshot({fullPage})` → base64; `eval` → `page.evaluate` wrapped so a page throw becomes `{ok:false}` not a transport-killing throw.
|
|
520
|
+
- **tabId stamping**: CDP guids are prefixed `cdp:<sessionId>:<guid>`; extension ids `ext:<sessionId>:<targetId>`. A handle whose prefix doesn't match the current backend/session → `STALE_TAB` ('call tabs_list again'). [RESOLVED — mv3 major tabId / integration]
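
The tabId stamping rule can be illustrated with a pair of helpers. This is a sketch, not the package's code: the function names and error class are hypothetical, and it assumes `sessionId` never contains a colon.

```typescript
type Backend = "cdp" | "ext";

// Hypothetical error carrying the STALE_TAB code from the text.
class StaleTabError extends Error {
  readonly code = "STALE_TAB";
  constructor() {
    super("stale tab handle: call tabs_list again");
  }
}

// Produces handles of the form '<backend>:<sessionId>:<rawId>'.
function stampTabId(backend: Backend, sessionId: string, rawId: string): string {
  return `${backend}:${sessionId}:${rawId}`;
}

// Returns the raw id only if the handle belongs to the current backend and session.
function resolveTabId(handle: string, backend: Backend, sessionId: string): string {
  const prefix = `${backend}:${sessionId}:`;
  if (handle.indexOf(prefix) !== 0) throw new StaleTabError();
  return handle.slice(prefix.length);
}
```

A handle minted by the CDP backend therefore fails fast when the extension takes over mid-session, instead of silently addressing an unrelated tab.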
|
|
521
|
+
|
|
522
|
+
**Selection** (recap §3): extension if responsive-on-ping, else CDP. `--prefer cdp` forces CDP even when an extension is connected (testing). `--no-cdp-fallback` disables the fallback entirely.
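
The selection rule reduces to a small pure function. The sketch below is an assumption-laden illustration: the input field names are invented, and it assumes `--prefer cdp` wins even if `--no-cdp-fallback` is also passed, which the text does not specify.

```typescript
type Backend = "extension" | "cdp";

// Hypothetical inputs: the ping-probe outcome plus the two CLI flags.
interface SelectInput {
  extensionResponsive: boolean; // extension answered the ping within the deadline
  preferCdp: boolean;           // --prefer cdp
  cdpFallback: boolean;         // false when --no-cdp-fallback is set
}

// Returns the backend to use, or null when nothing is available.
function selectBackend(input: SelectInput): Backend | null {
  if (input.preferCdp) return "cdp"; // forced, even with an extension connected
  if (input.extensionResponsive) return "extension";
  return input.cdpFallback ? "cdp" : null;
}
```

A `null` result corresponds to the case where the caller has to surface an error or run the short bounded retry mentioned earlier, since no backend can take the call.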
|
|
523
|
+
|
|
524
|
+
---
|
|
525
|
+
|
|
526
|
+
## 8. Security Model
|
|
527
|
+
|
|
528
|
+
**Threat model.** `chrome.debugger` grants **total** control of the attached
Chrome: read every cookie/DOM/localStorage of every logged-in site, inject JS,
download files. The localhost WS is reachable by **any local process** (a
malicious npm postinstall, a browser-spawned helper, any unsandboxed user
process). **Therefore: the 256-bit per-boot token is the entire boundary, and
prompt-injection-to-exfil via page content is a PRIMARY, not theoretical,
threat** (this tool feeds untrusted page text to an LLM that can call `eval`).
|
|
535
|
+
|
|
536
|
+
**Mitigations — every security blocker/major resolved:**
|
|
537
|
+
|
|
538
|
+
1. **One auth model, no alternatives.** Fresh 256-bit token **every boot**, never persisted/reused; atomic-0600 `handshake.json`; SHA-256-then-`timingSafeEqual` compare; the four other token/port/transport variants are **deleted from the spec** so they can't half-ship as dead paths. [RESOLVED — blocker 1, minor 7, minor 8]
|
|
539
|
+
2. **Loopback bind only** (`127.0.0.1`, never `0.0.0.0`). Defense-in-depth.
|
|
540
|
+
3. **Origin is NOT a security layer.** Documented as defense-in-depth against *browser-page* attackers only (a web page cannot forge a `chrome-extension://` Origin; a native process can). The token holds independently. We never relax token strength on Origin's account. Never reject a valid-token dev connection solely on Origin mismatch. [RESOLVED — blocker 2, mv3 major Origin]
|
|
541
|
+
4. **No `/connect` HTTP endpoint.** Token bootstrap is the **native-messaging trampoline** reading the 0600 file (the model/attacker can't read it), with a manual file-path-pairing fallback. No race-able network token vendor exists. [RESOLVED — blocker 3]
|
|
542
|
+
5. **Token never logged.** Never on stdout or stderr; only in the 0600 file (mode verified post-write, fail-closed). Any human-readable pairing artifact is also 0600 and short-lived. `policy.test.ts` asserts the token appears on neither stream. `logErr` redacts. [RESOLVED — major 5]
|
|
543
|
+
6. **Default-deny domain policy, ON by default, gating READS too.** `Policy { allowDomains: glob[]; allowEval; allowDownloads; allowAllTabs }`. Absent config → **SAFE DEFAULT**: navigate only to `about:blank` + already-open allowlisted tabs, **eval denied cross-domain, reads (`get_text`/`get_html`/`screenshot`/`eval`) denied outside the allowlist**, downloads off. `assertUrlAllowed(currentTabUrl, method)` runs in **the executor dispatch (both backends) AND the extension router** before any attach/navigate/eval/read. `--unsafe-all-domains` (= `allowDomains:['*']`) is the loud-logged escape hatch. [RESOLVED — major 4]
|
|
544
|
+
7. **Safe-mode shipped in v1 (not "future").** Default disables `eval` and the entire mutating tool set; `--enable-mutations` / `--unsafe-enable-eval` opt in. `eval`'s effective target origin (the tab's current URL) is allowlist-checked before dispatch. [RESOLVED — major eval]
|
|
545
|
+
8. **Narrowed manifest:** no `<all_urls>`; `host_permissions:[]` + `optional_host_permissions` requested on demand after an explicit user action-click grant before first attach. [RESOLVED — major 4]
|
|
546
|
+
9. **Displacement is a security event:** superseding connects are logged loudly + surfaced in `chrome_status`; the active connection is pinned to the first `hello` ext id and won't be displaced by a different id without re-pair. [RESOLVED — minor 9]
|
|
547
|
+
10. **`download_file` is server-side CDP fetch into a dedicated, non-executable, server-owned dir.** `suggestedName` is sanitized (strip path separators, drop dangerous extensions or force `.download`) and the download is size-capped; **never writes into the user's real Downloads via the AI path by default.** In the result `{path, backend, bytes, mimeType}`, `backend` marks which filesystem `path` is on; the call resolves only on `complete`/`interrupted` (never on download-begin), with no Save-As dialog. [RESOLVED — minor 10, mv3 minor download]
|
|
548
|
+
11. **Kill switch:** deleting `handshake.json` and sending `SIGHUP` rotates the token and forces a re-pair. **Ship load-unpacked only for v1** (`chrome.debugger` triggers heavy store review).
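
Mitigation 1's token handling (fresh 256-bit token per boot, SHA-256-then-`timingSafeEqual` compare) might be sketched as follows using Node's `crypto` module. The function names and the hex encoding are assumptions; writing the 0600 `handshake.json` is left out.

```typescript
import { createHash, randomBytes, timingSafeEqual } from "node:crypto";

// 32 random bytes = 256 bits, hex-encoded (the encoding is an assumption).
function mintToken(): string {
  return randomBytes(32).toString("hex");
}

// Hashing first gives both sides a fixed 32-byte length, which
// timingSafeEqual requires, and avoids leaking the token length.
function tokenMatches(presented: string, expected: string): boolean {
  const a = createHash("sha256").update(presented, "utf8").digest();
  const b = createHash("sha256").update(expected, "utf8").digest();
  return timingSafeEqual(a, b);
}
```

Calling `timingSafeEqual` on the raw strings would throw whenever the lengths differ, so the hash step is what lets a wrong-length token be rejected without an exception.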
|
|
549
|
+
|
|
550
|
+
---
|
|
551
|
+
|
|
552
|
+
## 9. Build Plan
|
|
553
|
+
|
|
554
|
+
Each phase is a shippable slice with a verification slice that proves it. Order
is dependency-aware; the canonical contracts (`shared/protocol.ts`,
`executor/types.ts`, `security/policy.ts`) land first so nothing forks.
|
|
557
|
+
|
|
558
|
+
**Phase 0 — Contracts & skeleton.**
|
|
559
|
+
Build: `shared/protocol.ts`, `executor/types.ts`, `security/policy.ts`, `config.ts` (single port/token/policy surface), `package.json`/`tsconfig`, `postinstall.js`. *Verify:* `tsc` compiles; `policy.test.ts` proves default-deny + read-gating + glob; a unit test asserts no second copy of any protocol/port/token constant exists.
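
To make the default-deny, read-gating, and glob behavior that `policy.test.ts` is meant to prove more concrete, here is a minimal sketch. The `Policy` fields come from §8; everything else (the free-function form taking `policy` as the first argument, the `PolicyError` class, hostname-only glob matching, and the `about:blank` navigate exception) is an assumption.

```typescript
interface Policy {
  allowDomains: string[];
  allowEval: boolean;
  allowDownloads: boolean;
  allowAllTabs: boolean;
}

// Absent config: nothing allowlisted, eval and downloads off.
const SAFE_DEFAULT: Policy = {
  allowDomains: [],
  allowEval: false,
  allowDownloads: false,
  allowAllTabs: false,
};

class PolicyError extends Error {}

// Converts a hostname glob ('*.example.com', '*') to an anchored RegExp.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`, "i");
}

// Throws unless `method` may run against `url` under `policy`.
function assertUrlAllowed(policy: Policy, url: string, method: string): void {
  if (url === "about:blank" && method === "navigate") return;
  if (method === "eval" && !policy.allowEval) {
    throw new PolicyError("eval is disabled by policy");
  }
  let host: string;
  try {
    host = new URL(url).hostname;
  } catch (e) {
    throw new PolicyError(`${method} denied: unparseable URL`);
  }
  const allowed = policy.allowDomains.some((glob) => globToRegExp(glob).test(host));
  if (!allowed) throw new PolicyError(`${method} denied for ${host}`);
}
```

Reads such as `get_text` go through the same check as navigations, so an empty `allowDomains` blocks them too; `allowDomains: ['*']` reproduces the `--unsafe-all-domains` escape hatch.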
|
|
560
|
+
|
|
561
|
+
**Phase 1 — MCP server + StubExecutor (Chrome-free).**
|
|
562
|
+
Build: `mcp/server.ts` (clean-stdout `logErr`), `mcp/tools.ts` (26 defs+handlers, drift-check, `dispatchToolCall`), `validators.ts`, `envelopes.ts`, `helpers.ts`, `markdown-extract.ts`, `executor/manager.ts` wired to a `StubExecutor`. *Verify:* `tools.test.ts` — drift parity, `requireTarget` exactly-one-of, `McpToolError`→`isError`, screenshot→image block, `eval`-throw→`{ok:false}`, safe-mode blocks `eval`+mutations, policy denies cross-domain read. Runs in CI (`node --test`).
|
|
563
|
+
|
|
564
|
+
**Phase 2 — Bridge + auth + token-never-logged.**
|
|
565
|
+
Build: `bridge/server.ts`, `bridge/connection.ts`, `bridge/auth.ts` (per-boot token, atomic 0600, SHA-256 compare), `bridge/datadir.ts`. *Verify:* `bridge.test.ts` — correct token accepted; wrong token → 4401; missing/forged Origin in dev with valid token → still accepted (Origin not a gate); id X→reply X with 3 in flight; second valid client displaces first (+ event surfaced); deadline→`isError`; **token absent from stdout AND stderr**. CI-eligible (fake WS client).
|
|
566
|
+
|
|
567
|
+
**Phase 3 — ExtensionExecutor + CdpExecutor + selection.**
|
|
568
|
+
Build: `executor/extension-executor.ts`, `executor/cdp-executor.ts` (BrowserManager port + lock recovery + `resolveTab`), `executor/page-scripts.ts`. Manager `ensureReady` with ping-probe fall-through + single-flight. *Verify:* extend `tools.test.ts` with a fake bridge — late extension takes over from CDP; extension ping-fail falls to CDP without a 30s stall; `STALE_TAB` on cross-backend tabId.
|
|
569
|
+
|
|
570
|
+
**Phase 4 — MV3 extension.**
|
|
571
|
+
Build: `manifest.json`, `sw/background.ts` (top-level listeners, keepalive interval + alarm reconnect driver, re-hydrate), `sw/ws-client.ts`, `sw/router.ts` (never-throw, drift-assert, policy gate), `sw/cdp-executor.ts` (poll-based lifecycle, box-model resolution, DevTools-terminal), `sw/tabs.ts`, `options/*`, `nm/trampoline.ts`, `build-ext.mjs`. *Verify:* `extension-smoke.ts` — Playwright `--load-extension`, programmatic pair via trampoline-equivalent, `navigate(example.com)`→`get_text` contains "Example", and a >30s idle test proving keepalive keeps the socket alive then a kill test proving CDP fall-through. Headed; behind a display flag.
|
|
572
|
+
|
|
573
|
+
**Phase 5 — Helpers, download, full policy wiring, HITL.**
|
|
574
|
+
Build: server-side `extract_links`/`read_as_markdown`/`fill_form` over `Runtime.evaluate`; server-side CDP-fetch `download_file` with sanitized names/size cap; policy enforced at executor + router. *Verify:* `test/hitl/` (cloned harness, Executor handle) — read steps free; mutating steps (`download_file`, `fill_form+submit`, `eval`-side-effect) behind classification + `--include-mutating` + literal `yes` + `test-targets.json` allowlist. Human verdict gate; never CI.
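
The name sanitization and size cap for `download_file` could look roughly like this. The extension blocklist, the fallback name `download`, and the choice to append `.download` rather than drop the extension are all assumptions for illustration.

```typescript
// Illustrative blocklist only; a real list would be longer.
const DANGEROUS_EXTENSIONS = [".exe", ".bat", ".cmd", ".sh", ".msi", ".scr", ".app"];

// Reduces a server-suggested name to a safe basename.
function sanitizeDownloadName(suggestedName: string): string {
  // Keep only the final path segment (handles both / and \ separators).
  const segments = suggestedName.split(/[\\/]/);
  const base = segments[segments.length - 1] || "";
  // Strip control characters and leading dots (no hidden or '..' names).
  let name = base.replace(/[\x00-\x1f]/g, "").replace(/^\.+/, "").trim();
  if (name === "") name = "download";
  const dot = name.lastIndexOf(".");
  const ext = dot >= 0 ? name.slice(dot).toLowerCase() : "";
  // Force a neutral suffix so the file is never directly executable.
  if (DANGEROUS_EXTENSIONS.indexOf(ext) !== -1) name += ".download";
  return name;
}

// Throws when a download exceeds the configured cap.
function assertWithinSizeCap(bytes: number, capBytes: number): void {
  if (bytes > capBytes) {
    throw new Error(`download of ${bytes} bytes exceeds cap of ${capBytes}`);
  }
}
```

The sanitized name would then be joined onto the dedicated server-owned download directory, never onto a path taken from the page.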
|
|
575
|
+
|
|
576
|
+
**Phase 6 — Packaging & docs.**
|
|
577
|
+
Build: `files` whitelist incl. `extension-dist/`, `--print-pairing`/`--print-extension-path`, `.mcp.json` snippet, README (banner caveat, load-unpacked-only, policy/safe-mode defaults, kill switch). *Verify:* `npx chrome-mcp` from a packed tarball boots, prints the pairing file path (no token leak), and a fresh Chrome load-unpacked round-trips one navigate→get_text.
|
|
578
|
+
|
|
579
|
+
---
|
|
580
|
+
|
|
581
|
+
## 10. Open Questions & Remaining Risks
|
|
582
|
+
|
|
583
|
+
**Resolved-and-closed** (were open questions; now decided): token model (per-boot ephemeral, 0600, native-messaging trampoline); port (single `38017` default, ephemeral-0 supported via re-read-on-failure); helper home (server-side compose, only `download` on the wire); error enum (one `ExecutorErrorCode`); screenshot key (`dataBase64`); ref model (page-scoped, invalidated on navigation); safe-mode + read-gating (shipped, default-on); Origin (defense-in-depth only, not a gate).
|
|
584
|
+
|
|
585
|
+
**Remaining risks (accepted for v1):**
|
|
586
|
+
- **The yellow "is being debugged" banner** is permanent in v1 and a social-engineering footgun; accepted with `chrome.debugger`. Documented.
|
|
587
|
+
- **MV3 worker death gap:** between worker eviction and the next wake, the extension is briefly undriveable; mitigated by ping-probe fall-through to CDP and the keepalive loop, but the model may see one transient `isError`+retry. Tool descriptions instruct retry.
|
|
588
|
+
- **Native-messaging trampoline install step** is the one manual setup beyond `npx`; the manual file-path paste is the no-native fallback. Smoother one-click pairing is deferred to v1.1 as polish.
|
|
589
|
+
- **`networkidle`** is approximated by a bounded idle-window poll (no native CDP event); documented as best-effort, never able to wedge a call.
|
|
590
|
+
- **Local-code-execution attacker** who can already read the user's 0600 files has root-equivalent access to the user session; the token cannot defend against an attacker who already owns the filesystem. The policy allowlist still blocks blind exfil to arbitrary domains.
|
|
591
|
+
- **`captureBeyondViewport` very-tall pages**: capped + `truncated` flag; scroll-stitch is the v1.1 upgrade if full fidelity is needed.
|
|
592
|
+
|
|
593
|
+
**Genuinely open (decide before v1.1):**
|
|
594
|
+
- Multi-tab/multi-session concurrency (single global Executor + single attached tab today): does an agent need N tabs driven simultaneously? That breaks the singleton and requires per-cmd `tabId` everywhere on the wire.
|
|
595
|
+
- Web Store path: requires a `chrome.scripting`-only mode (no `chrome.debugger`) — a second executor backend behind the same interface.
|
|
596
|
+
- Whether `download` should ever use the extension `chrome.downloads` path (user Downloads dir) as an explicit opt-in, or remain server-fetch-only forever.
|