@bool01master/gemini-web-mcp 1.0.0

package/README.md ADDED
@@ -0,0 +1,363 @@
1
+ # Gemini Local MCP Bridge
2
+
3
+ This project wraps `https://gemini.google.com/` as a local MCP server by using a Chrome extension inside your real Gemini tab.
4
+
5
+ It is designed for personal local use on one machine.
6
+
7
+ ## Why this design
8
+
9
+ The direct `Playwright + logged-in Chrome profile` route is no longer reliable here:
10
+
11
+ - Google account login blocks automated browser flows
12
+ - modern Chrome restricts remote debugging on the default profile
13
+
14
+ The extension bridge avoids both problems:
15
+
16
+ - you keep using your real Chrome profile
17
+ - the extension runs inside the real Gemini tab
18
+ - the local Node process only exposes MCP and a localhost bridge
19
+
20
+ ## What works now
21
+
22
+ - local `stdio` MCP server
23
+ - local HTTP bridge on `127.0.0.1:8765`
24
+ - Chrome extension content script on `gemini.google.com`
25
+ - `gemini_bridge_status`
26
+ - `gemini_page_status`
27
+ - `gemini_run_prompt`
28
+ - optional local image upload from file paths
29
+ - best-effort mode selection by visible label
30
+ - saves structured output under `artifacts/`
31
+ - returns clean answer text for the latest Gemini reply
32
+ - saves generated images into the chosen `outputDir` (best-effort)
33
+
34
+ ## Project layout
35
+
36
+ - `src/index.js`: MCP entrypoint
37
+ - `src/server.js`: MCP tool registration
38
+ - `src/extension-bridge.js`: localhost bridge server
39
+ - `extension/`: unpacked Chrome extension
40
+
41
+ ## Install
42
+
43
+ ### Local development
44
+
45
+ ```bash
46
+ npm install
47
+ ```
48
+
49
+ ### Package-style install
50
+
51
+ The package now exposes a CLI entrypoint:
52
+
53
+ ```bash
54
+ npx -y @bool01master/gemini-web-mcp
55
+ ```
56
+
57
+ To print the extension directory after install:
58
+
59
+ ```bash
60
+ npx -y @bool01master/gemini-web-mcp --extension-path
61
+ ```
62
+
63
+ Or install the published package globally:
64
+
65
+ ```bash
66
+ npm install -g @bool01master/gemini-web-mcp
67
+ gemini-web-mcp
68
+ ```
69
+
70
+ And to print the packaged extension path:
71
+
72
+ ```bash
73
+ gemini-web-mcp --extension-path
74
+ ```
75
+
76
+ ## Run the local MCP process
77
+
78
+ ```bash
79
+ npm start
80
+ ```
81
+
82
+ Equivalent:
83
+
84
+ ```bash
85
+ npx -y @bool01master/gemini-web-mcp
86
+ ```
87
+
88
+ This starts:
89
+
90
+ - MCP over stdio
91
+ - a local bridge at `http://127.0.0.1:8765`
92
+
93
+ ## Load the extension
94
+
95
+ 1. Open `chrome://extensions`
96
+ 2. Enable `Developer mode`
97
+ 3. Click `Load unpacked`
98
+ 4. Select the extension folder.
99
+
100
+
101
+ For an installed package, print the extension folder path with:
102
+
103
+ ```bash
104
+ npx -y @bool01master/gemini-web-mcp --extension-path
105
+ ```
106
+
107
+ 5. Keep a `https://gemini.google.com/` tab open in your normal Chrome profile
108
+
109
+ The content script will automatically connect back to `http://127.0.0.1:8765`.
110
+
111
+ ## Can I install only the extension?
112
+
113
+ No.
114
+
115
+ The extension is only the browser-side half of the system. You still need the local Node MCP process because it:
116
+
117
+ - exposes the MCP tools over stdio
118
+ - runs the localhost bridge on `127.0.0.1:8765`
119
+ - receives tool calls and forwards them into the Gemini tab
120
+ - saves images and writes `result.json`
121
+
122
+ So the minimum working setup is:
123
+
124
+ 1. install the Node package
125
+ 2. run the MCP server
126
+ 3. load the unpacked extension
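The forwarding role described above can be pictured as a small request queue: the Node process queues a command, the Gemini tab picks it up, and the reply is matched back by id. The sketch below is a hypothetical illustration only. The class name `BridgeQueue`, its methods, and the command shape are assumptions, and the package's actual transport between the content script and `127.0.0.1:8765` may work differently.

```javascript
// Hypothetical sketch: an in-memory queue that pairs commands sent to the
// Gemini tab with the replies that come back. Not the package's actual code.
class BridgeQueue {
  constructor() {
    this.nextId = 1;
    this.outbox = [];         // commands waiting for the tab to pick them up
    this.pending = new Map(); // id -> { resolve, reject } for in-flight commands
  }

  // Called by an MCP tool handler; the promise settles when the tab replies.
  send(type, payload) {
    const id = this.nextId++;
    const reply = new Promise((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
    });
    this.outbox.push({ id, type, payload });
    return reply;
  }

  // Called when the tab asks for work; returns null when nothing is queued.
  takeNext() {
    return this.outbox.shift() ?? null;
  }

  // Called when the tab posts a result; returns false for unknown ids.
  complete(id, result) {
    const entry = this.pending.get(id);
    if (!entry) return false;
    this.pending.delete(id);
    if (result.ok) entry.resolve(result);
    else entry.reject(new Error(result.error ?? "unknown bridge error"));
    return true;
  }
}

const queue = new BridgeQueue();
queue.send("run_prompt", { prompt: "hello" }).then((r) => console.log(r.text));
const command = queue.takeNext(); // what the tab would receive
queue.complete(command.id, { ok: true, text: "hi" });
```

Matching replies by id keeps several tool calls from being confused with each other, which is why neither half of the system is useful on its own.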
127
+
128
+ ## First check
129
+
130
+ After `npm start` is running and the extension is loaded:
131
+
132
+ 1. Open a Gemini tab
133
+ 2. Click the extension popup
134
+ 3. Press `Refresh Status`
135
+
136
+ You should see page status JSON from the content script.
137
+
138
+ ## MCP tools
139
+
140
+ ### `gemini_bridge_status`
141
+
142
+ Shows whether the local bridge is running and whether any Gemini tabs are connected.
143
+
144
+ ### `gemini_page_status`
145
+
146
+ Asks the active Gemini tab for:
147
+
148
+ - whether the prompt box is found
149
+ - visible buttons
150
+ - file input count
151
+ - mode-like button candidates
152
+
153
+ ### `gemini_run_prompt`
154
+
155
+ Inputs:
156
+
157
+ - `prompt: string`
158
+ - `mode?: string`
159
+ - `images?: string[]`
160
+ - `outputDir?: string`
161
+ - `newChat?: boolean`
162
+ - `waitTimeoutMs?: number`
163
+ - `maxImages?: number`
164
+
165
+ Example:
166
+
167
+ ```json
168
+ {
169
+ "prompt": "Restyle this image as a minimalist poster: keep the main subject and add more whitespace.",
170
+ "mode": "Images",
171
+ "images": [
172
+ "/absolute/path/to/input.png"
173
+ ],
174
+ "outputDir": "/absolute/path/to/output-dir",
175
+ "newChat": true,
176
+ "waitTimeoutMs": 120000,
177
+ "maxImages": 4
178
+ }
179
+ ```
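An MCP client normally builds the request for you, but it can help to see how the inputs above travel over stdio: they become the `arguments` of a JSON-RPC 2.0 `tools/call` request. The sketch below is illustrative. The helper name `buildRunPromptCall` and its light validation are assumptions, and the server's own validation may differ.

```javascript
// Hypothetical helper: wraps gemini_run_prompt inputs in a JSON-RPC 2.0
// "tools/call" request, the message shape MCP clients send to a server.
function buildRunPromptCall(id, args) {
  if (typeof args.prompt !== "string" || args.prompt.length === 0) {
    throw new TypeError("prompt must be a non-empty string");
  }
  if (args.images !== undefined && !Array.isArray(args.images)) {
    throw new TypeError("images must be an array of file paths");
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "gemini_run_prompt", arguments: args },
  };
}

const request = buildRunPromptCall(1, {
  prompt: "Describe this image in one sentence.",
  images: ["/absolute/path/to/input.png"],
  newChat: true,
});

// The stdio transport carries one JSON message per line.
console.log(JSON.stringify(request));
```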
180
+
181
+ Output:
182
+
183
+ - returned answer text
184
+ - structured result JSON
185
+ - `imagePaths` for images successfully saved to disk
186
+ - `curlCommands` when the page exposes URLs that could not be saved directly
187
+ - `result.json` under `<outputDir>/<timestamp>-<slug>/`
188
+ - saved image files under the same run directory when extraction succeeds
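To make the `<outputDir>/<timestamp>-<slug>/` layout concrete, here is one way such a directory name could be derived from a prompt. This is a sketch under assumptions: the package's real timestamp format and slug rules are not documented here, and `slugify` and `runDirName` are hypothetical names.

```javascript
// Hypothetical sketch of a `<timestamp>-<slug>` run directory name.
// The real format used by the package may differ.
function slugify(text, maxLength = 40) {
  const slug = text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything non-alphanumeric
    .replace(/^-+|-+$/g, "")     // trim leading and trailing dashes
    .slice(0, maxLength)
    .replace(/-+$/g, "");        // re-trim if the cut landed on a dash
  return slug || "prompt";       // fallback for prompts with no ASCII letters or digits
}

function runDirName(prompt, now = new Date()) {
  // 2024-05-01T12:34:56.000Z -> 20240501-123456 (UTC)
  const stamp = now
    .toISOString()
    .replace(/[-:]/g, "")
    .replace("T", "-")
    .slice(0, 15);
  return `${stamp}-${slugify(prompt)}`;
}

console.log(runDirName("Make a minimalist poster!", new Date("2024-05-01T12:34:56Z")));
// prints "20240501-123456-make-a-minimalist-poster"
```

A timestamp prefix keeps runs sorted chronologically, and the slug makes a directory recognizable without opening `result.json`.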
189
+
190
+ ## MCP config example
191
+
192
+ ### Recommended for Codex: use `npx`
193
+
194
+ ```json
195
+ {
196
+ "mcpServers": {
197
+ "gemini-image": {
198
+ "type": "stdio",
199
+ "command": "npx",
200
+ "args": [
201
+ "-y",
202
+ "@bool01master/gemini-web-mcp"
203
+ ],
204
+ "env": {
205
+ "NO_PROXY": "*"
206
+ }
207
+ }
208
+ }
209
+ }
210
+ ```
211
+
212
+ This is the most reliable option for Codex because it does not depend on your global npm bin path.
213
+
214
+ ### Alternative: global install + direct command
215
+
216
+ If you have already run:
217
+
218
+ ```bash
219
+ npm install -g @bool01master/gemini-web-mcp
220
+ ```
221
+
222
+ you can also use:
223
+
224
+ ```json
225
+ {
226
+ "mcpServers": {
227
+ "gemini-image": {
228
+ "type": "stdio",
229
+ "command": "gemini-web-mcp",
230
+ "env": {
231
+ "NO_PROXY": "*"
232
+ }
233
+ }
234
+ }
235
+ }
236
+ ```
237
+
238
+ ### Development mode: run from local source checkout
239
+
240
+ If you have not installed the package and are running from a local repo checkout, use:
241
+
242
+ ```json
243
+ {
244
+ "mcpServers": {
245
+ "gemini-image": {
246
+ "type": "stdio",
247
+ "command": "node",
248
+ "args": [
249
+ "/absolute/path/to/your/local/checkout/src/index.js"
250
+ ],
251
+ "env": {
252
+ "NO_PROXY": "*"
253
+ }
254
+ }
255
+ }
256
+ }
257
+ ```
258
+
259
+ ### If Codex shows `Tools: (none)`
260
+
261
+ `Auth: Unsupported` and `Resources: (none)` are expected.
262
+
263
+ But if Codex shows `Tools: (none)`, the MCP process did not initialize correctly. Check these items:
264
+
265
+ 1. Prefer the `npx -y @bool01master/gemini-web-mcp` config above instead of `command: "gemini-web-mcp"`.
266
+ 2. Make sure port `8765` is not already occupied:
267
+
268
+ ```bash
269
+ lsof -i :8765
270
+ ```
271
+
272
+ 3. If you use global install, verify the binary is actually on PATH:
273
+
274
+ ```bash
275
+ which gemini-web-mcp
276
+ gemini-web-mcp --help
277
+ ```
278
+
279
+ 4. From a local source checkout, verify the package itself is healthy:
280
+
281
+ ```bash
282
+ npm run smoke
283
+ ```
284
+
285
+ Important: even if the extension is not loaded yet, the tool list should still appear. So `Tools: (none)` is usually a process startup problem, not a Gemini page problem.
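To check process startup without Codex, you can send the MCP handshake by hand. The sketch below prints the three newline-delimited JSON-RPC messages a client sends before it can see the tool list. The `clientInfo` values are placeholders, and the protocol version shown is one published MCP revision that the server may or may not negotiate.

```javascript
// Sketch: the newline-delimited JSON-RPC messages a client sends to reach the
// tool list. The clientInfo values are placeholders, not required names.
function handshakeLines() {
  const messages = [
    {
      jsonrpc: "2.0",
      id: 1,
      method: "initialize",
      params: {
        protocolVersion: "2024-11-05",
        capabilities: {},
        clientInfo: { name: "manual-check", version: "0.0.0" },
      },
    },
    // Notifications carry no id and expect no response.
    { jsonrpc: "2.0", method: "notifications/initialized" },
    { jsonrpc: "2.0", id: 2, method: "tools/list" },
  ];
  return messages.map((message) => JSON.stringify(message)).join("\n") + "\n";
}

process.stdout.write(handshakeLines());
```

Saved under a name of your choice (for example `handshake.js`, a hypothetical file), the output can be piped into the server: `node handshake.js | npx -y @bool01master/gemini-web-mcp`. A healthy process should answer the request with id `2` with the `gemini_*` tools, although it may exit as soon as stdin closes.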
286
+
287
+ ## Quick local checks
288
+
289
+ ```bash
290
+ npm run smoke
291
+ ```
292
+
293
+ ## Packaging and publishing
294
+
295
+ The npm package name is `@bool01master/gemini-web-mcp`, and `package.json` already sets `publishConfig.access=public`, so `npm run release` publishes it as a public scoped package by default.
296
+
297
+ Dry-run the full release flow:
298
+
299
+ ```bash
300
+ npm run release:dry-run
301
+ ```
302
+
303
+ Publish in one command:
304
+
305
+ ```bash
306
+ npm run release
307
+ ```
308
+
309
+ If your npm account requires 2FA for publish, pass the one-time password explicitly:
310
+
311
+ ```bash
312
+ npm run release -- --otp=123456
313
+ ```
314
+
315
+ You can also supply it through the `NPM_OTP` environment variable:
316
+
317
+ ```bash
318
+ NPM_OTP=123456 npm run release
319
+ ```
320
+
321
+ For CI or non-interactive publishing, use a **granular access token with bypass 2FA enabled** in your `.npmrc` / `NODE_AUTH_TOKEN`.
322
+
323
+ The release script will:
324
+
325
+ 1. run `npm run smoke`
326
+ 2. print the packaged extension path
327
+ 3. run `npm pack --dry-run`
328
+ 4. run `npm publish` (or `npm publish --dry-run`)
329
+
330
+ You can pass extra npm publish args through the script, for example:
331
+
332
+ ```bash
333
+ bash scripts/release.sh --dry-run --tag next
334
+ ```
335
+
336
+ Optional helper scripts retained from earlier experiments:
337
+
338
+ - `npm run list:profiles`
339
+ - `npm run open:profile`
340
+ - `npm run open:debug-profile`
341
+ - `npm run trace:gemini`
342
+
343
+ These are no longer the primary path. The extension bridge is the intended route.
344
+
345
+ ## Environment variables
346
+
347
+ - `GEMINI_BRIDGE_HOST`
348
+ - `GEMINI_BRIDGE_PORT`
349
+ - `GEMINI_WEB_OUTPUT_DIR`
350
+
351
+ Defaults:
352
+
353
+ ```text
354
+ GEMINI_BRIDGE_HOST=127.0.0.1
355
+ GEMINI_BRIDGE_PORT=8765
356
+ ```
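As an illustration of how these variables and defaults combine, the sketch below resolves the bridge address from an environment object. The function name `resolveBridgeConfig` and the port validation are assumptions, not the package's code. `GEMINI_WEB_OUTPUT_DIR` is left out because no default is documented for it.

```javascript
// Sketch: resolve bridge settings from the environment, falling back to the
// documented defaults. The function name and validation are assumptions.
function resolveBridgeConfig(env = process.env) {
  const host = env.GEMINI_BRIDGE_HOST || "127.0.0.1";
  const port = Number.parseInt(env.GEMINI_BRIDGE_PORT || "8765", 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new RangeError(`Invalid GEMINI_BRIDGE_PORT: ${env.GEMINI_BRIDGE_PORT}`);
  }
  return { host, port, url: `http://${host}:${port}` };
}

console.log(resolveBridgeConfig({}).url);
// prints "http://127.0.0.1:8765"
```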
357
+
358
+ ## Notes
359
+
360
+ - Mode selection is best-effort and depends on Gemini’s visible UI labels.
361
+ - Image upload is implemented through the page’s file input and upload menu, and may need selector tuning if Gemini changes its DOM.
362
+ - Detailed debugging metadata, including remote image URLs when available, is still written to `result.json`.
363
+ - This is still UI automation, just running from inside the real tab instead of controlling Chrome externally.
@@ -0,0 +1,118 @@
1
+ function arrayBufferToBase64(buffer) {
2
+ const bytes = new Uint8Array(buffer);
3
+ let binary = "";
4
+
5
+ for (const byte of bytes) {
6
+ binary += String.fromCharCode(byte);
7
+ }
8
+
9
+ return btoa(binary);
10
+ }
11
+
12
+ const ALLOWED_HOST_PATTERNS = [
13
+ /^https:\/\/gemini\.google\.com\//,
14
+ /^https:\/\/[^/]*\.googleusercontent\.com\//,
15
+ /^https:\/\/[^/]*\.usercontent\.google\.com\//,
16
+ ];
17
+
18
+ function isAllowedUrl(url) {
19
+ return ALLOWED_HOST_PATTERNS.some((pattern) => pattern.test(url));
20
+ }
21
+
22
+ chrome.runtime.onMessage.addListener((message, _sender, sendResponse) => {
23
+ if (message?.type === "bridge:fetch_image" && message.url) {
24
+ if (!isAllowedUrl(message.url)) {
25
+ sendResponse({
26
+ ok: false,
27
+ error: `URL host not in extension host_permissions: ${URL.canParse(message.url) ? new URL(message.url).hostname : "(unparseable URL)"}`,
28
+ });
29
+ return true;
30
+ }
31
+
32
+ (async () => {
33
+ try {
34
+ const response = await fetch(message.url, {
35
+ credentials: "omit",
36
+ redirect: "follow",
37
+ });
38
+
39
+ if (!response.ok) {
40
+ throw new Error(`HTTP ${response.status}`);
41
+ }
42
+
43
+ const blob = await response.blob();
44
+ const buffer = await blob.arrayBuffer();
45
+ const mimeType = blob.type || "application/octet-stream";
46
+ const dataUrl = `data:${mimeType};base64,${arrayBufferToBase64(buffer)}`;
47
+
48
+ sendResponse({ ok: true, dataUrl });
49
+ } catch (error) {
50
+ sendResponse({
51
+ ok: false,
52
+ error: error instanceof Error ? error.message : String(error),
53
+ });
54
+ }
55
+ })();
56
+
57
+ return true;
58
+ }
59
+
60
+ if (message?.type === "bridge:capture_tab") {
61
+ chrome.tabs.captureVisibleTab(
62
+ undefined,
63
+ { format: "png" },
64
+ (dataUrl) => {
65
+ if (chrome.runtime.lastError) {
66
+ sendResponse({
67
+ ok: false,
68
+ error: chrome.runtime.lastError.message,
69
+ });
70
+ return;
71
+ }
72
+
73
+ sendResponse({ ok: true, dataUrl });
74
+ },
75
+ );
76
+
77
+ return true;
78
+ }
79
+
80
+ if (message?.type === "bridge:activate_self") {
81
+ const senderTabId = _sender?.tab?.id;
82
+ const senderWindowId = _sender?.tab?.windowId;
83
+
84
+ if (!Number.isInteger(senderTabId) || !Number.isInteger(senderWindowId)) {
85
+ sendResponse({
86
+ ok: false,
87
+ error: "Missing sender tab context.",
88
+ });
89
+ return false;
90
+ }
91
+
92
+ chrome.tabs.update(senderTabId, { active: true }, () => {
93
+ if (chrome.runtime.lastError) {
94
+ sendResponse({
95
+ ok: false,
96
+ error: chrome.runtime.lastError.message,
97
+ });
98
+ return;
99
+ }
100
+
101
+ chrome.windows.update(senderWindowId, { focused: true }, () => {
102
+ if (chrome.runtime.lastError) {
103
+ sendResponse({
104
+ ok: false,
105
+ error: chrome.runtime.lastError.message,
106
+ });
107
+ return;
108
+ }
109
+
110
+ sendResponse({ ok: true });
111
+ });
112
+ });
113
+
114
+ return true;
115
+ }
116
+
117
+ return false;
118
+ });
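The `bridge:fetch_image` handler above replies with either `{ ok: true, dataUrl }` or `{ ok: false, error }`. As a hedged illustration of how a receiver could turn that reply back into bytes for saving, here is a minimal sketch for the Node side. The helper names `decodeDataUrl` and `unwrapImageReply` are hypothetical, and the package's own handling may differ.

```javascript
// Hypothetical receiver-side helpers: turn a `{ ok, dataUrl }` reply from the
// background script into a MIME type plus raw bytes (a Node.js Buffer).
function decodeDataUrl(dataUrl) {
  // Capture everything between "data:" and ";base64," as the MIME type, so
  // types with parameters (for example "image/svg+xml;charset=utf-8") match.
  const match = /^data:([^,]*?);base64,(.*)$/s.exec(dataUrl);
  if (!match) throw new Error("Not a base64 data URL");
  return { mimeType: match[1], bytes: Buffer.from(match[2], "base64") };
}

function unwrapImageReply(reply) {
  if (!reply?.ok) throw new Error(reply?.error ?? "bridge:fetch_image failed");
  return decodeDataUrl(reply.dataUrl);
}

// Simulated reply carrying four bytes, built with the same btoa encoding.
const reply = { ok: true, dataUrl: `data:image/png;base64,${btoa("\x89PNG")}` };
const { mimeType, bytes } = unwrapImageReply(reply);
console.log(mimeType, bytes.length);
// prints "image/png 4"
```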