npm - @bool01master/gemini-web-mcp - Versions diffs - 1.0.0 - Mend

@bool01master/gemini-web-mcp 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (12)

package/README.md +363 -0
package/extension/background.js +118 -0
package/extension/content.js +1475 -0
package/extension/manifest.json +33 -0
package/extension/popup.css +43 -0
package/extension/popup.html +18 -0
package/extension/popup.js +37 -0
package/package.json +38 -0
package/src/extension-bridge.js +749 -0
package/src/gemini-web-client.js +678 -0
package/src/index.js +56 -0
package/src/server.js +272 -0

package/README.md ADDED

@@ -0,0 +1,363 @@
+# Gemini Local MCP Bridge
+This project wraps `https://gemini.google.com/` as a local MCP server by using a Chrome extension inside your real Gemini tab.
+It is designed for personal local use on one machine.
+## Why this design
+The direct `Playwright + logged-in Chrome profile` route is no longer reliable here:
+- Google account login blocks automated browser flows
+- modern Chrome restricts remote debugging on the default profile
+The extension bridge avoids both problems:
+- you keep using your real Chrome profile
+- the extension runs inside the real Gemini tab
+- the local Node process only exposes MCP and a localhost bridge
+## What works now
+- local `stdio` MCP server
+- local HTTP bridge on `127.0.0.1:8765`
+- Chrome extension content script on `gemini.google.com`
+- `gemini_bridge_status`
+- `gemini_page_status`
+- `gemini_run_prompt`
+- optional local image upload from file paths
+- best-effort mode selection by visible label
+- saves structured output under `artifacts/`
+- returns clean answer text for the latest Gemini reply
+- best-effort saves generated images into the chosen `outputDir`
+## Project layout
+- `src/index.js`: MCP entrypoint
+- `src/server.js`: MCP tool registration
+- `src/extension-bridge.js`: localhost bridge server
+- `extension/`: unpacked Chrome extension
+## Install
+### Local development
+```bash
+npm install
+```
+### Package-style install
+The package now exposes a CLI entrypoint:
+```bash
+npx -y @bool01master/gemini-web-mcp
+```
+To print the extension directory after install:
+```bash
+npx -y @bool01master/gemini-web-mcp --extension-path
+```
+Or, once the package has been published or packed, install it globally:
+```bash
+npm install -g @bool01master/gemini-web-mcp
+gemini-web-mcp
+```
+And to print the packaged extension path:
+```bash
+gemini-web-mcp --extension-path
+```
+## Run the local MCP process
+```bash
+npm start
+```
+Equivalent:
+```bash
+npx -y @bool01master/gemini-web-mcp
+```
+This starts:
+- MCP over stdio
+- a local bridge at `http://127.0.0.1:8765`
+## Load the extension
+1. Open `chrome://extensions`
+2. Enable `Developer mode`
+3. Click `Load unpacked`
+4. Select the extension folder. For an installed package, print its location with:
+```bash
+npx -y @bool01master/gemini-web-mcp --extension-path
+```
+5. Keep a `https://gemini.google.com/` tab open in your normal Chrome profile
+The content script will automatically connect back to `http://127.0.0.1:8765`.
+## Can I install only the extension?
+No.
+The extension is only the browser-side half of the system. You still need the local Node MCP process because it:
+- exposes the MCP tools over stdio
+- runs the localhost bridge on `127.0.0.1:8765`
+- receives tool calls and forwards them into the Gemini tab
+- saves images and writes `result.json`
+So the minimum working setup is:
+1. install the Node package
+2. run the MCP server
+3. load the unpacked extension
+## First check
+After `npm start` is running and the extension is loaded:
+1. Open a Gemini tab
+2. Click the extension popup
+3. Press `Refresh Status`
+You should see page status JSON from the content script.
+## MCP tools
+### `gemini_bridge_status`
+Shows whether the local bridge is running and whether any Gemini tabs are connected.
+### `gemini_page_status`
+Asks the active Gemini tab for:
+- whether the prompt box is found
+- visible buttons
+- file input count
+- mode-like button candidates
+### `gemini_run_prompt`
+Inputs:
+- `prompt: string`
+- `mode?: string`
+- `images?: string[]`
+- `outputDir?: string`
+- `newChat?: boolean`
+- `waitTimeoutMs?: number`
+- `maxImages?: number`
+Example:
+```json
+{
+  "prompt": "Restyle this image as a minimalist poster, keep the main subject, and add more whitespace.",
+  "mode": "Images",
+  "images": [
+    "/absolute/path/to/input.png"
+  ],
+  "outputDir": "/absolute/path/to/output-dir",
+  "newChat": true,
+  "waitTimeoutMs": 120000,
+  "maxImages": 4
+}
+```
+Output:
+- returned answer text
+- structured result JSON
+- `imagePaths` for images successfully saved to disk
+- `curlCommands` when the page exposes URLs that could not be saved directly
+- `result.json` under `<outputDir>/<timestamp>-<slug>/`
+- saved image files under the same run directory when extraction succeeds
+## MCP config example
+### Recommended for Codex: use `npx`
+```json
+{
+  "mcpServers": {
+    "gemini-image": {
+      "type": "stdio",
+      "command": "npx",
+      "args": [
+        "-y",
+        "@bool01master/gemini-web-mcp"
+      ],
+      "env": {
+        "NO_PROXY": "*"
+      }
+    }
+  }
+}
+```
+This is the most reliable option for Codex because it does not depend on your global npm bin path.
+### Alternative: global install + direct command
+If you have already run:
+```bash
+npm install -g @bool01master/gemini-web-mcp
+```
+you can also use:
+```json
+{
+  "mcpServers": {
+    "gemini-image": {
+      "type": "stdio",
+      "command": "gemini-web-mcp",
+      "env": {
+        "NO_PROXY": "*"
+      }
+    }
+  }
+}
+```
+### Development mode: run from local source checkout
+If you have not installed the package and are running from a local repo checkout, use:
+```json
+{
+  "mcpServers": {
+    "gemini-image": {
+      "type": "stdio",
+      "command": "node",
+      "args": [
+        "/absolute/path/to/your/local/checkout/src/index.js"
+      ],
+      "env": {
+        "NO_PROXY": "*"
+      }
+    }
+  }
+}
+```
+### If Codex shows `Tools: (none)`
+`Auth: Unsupported` and `Resources: (none)` are expected.
+But if Codex shows `Tools: (none)`, the MCP process did not initialize correctly. Check these items:
+1. Prefer the `npx -y @bool01master/gemini-web-mcp` config above instead of `command: "gemini-web-mcp"`.
+2. Make sure port `8765` is not already occupied:
+   ```bash
+   lsof -i :8765
+   ```
+3. If you use global install, verify the binary is actually on PATH:
+   ```bash
+   which gemini-web-mcp
+   gemini-web-mcp --help
+   ```
+4. Verify the package itself is healthy:
+   ```bash
+   npm run smoke
+   ```
+Important: even if the extension is not loaded yet, the tool list should still appear. So `Tools: (none)` is usually a process startup problem, not a Gemini page problem.
+## Quick local checks
+```bash
+npm run smoke
+```
+## Packaging and publishing
+The npm package name is `@bool01master/gemini-web-mcp`, and `package.json` already sets `publishConfig.access=public`, so `npm run release` publishes it as a public scoped package by default.
+Dry-run the full release flow:
+```bash
+npm run release:dry-run
+```
+Publish in one command:
+```bash
+npm run release
+```
+If your npm account requires 2FA for publish, pass the one-time password explicitly:
+```bash
+npm run release -- --otp=123456
+```
+You can also export it through the environment:
+```bash
+NPM_OTP=123456 npm run release
+```
+For CI or non-interactive publishing, use a **granular access token with bypass 2FA enabled** in your `.npmrc` / `NODE_AUTH_TOKEN`.
+The release script will:
+1. run `npm run smoke`
+2. print the packaged extension path
+3. run `npm pack --dry-run`
+4. run `npm publish` (or `npm publish --dry-run`)
+You can pass extra npm publish args through the script, for example:
+```bash
+bash scripts/release.sh --dry-run --tag next
+```
+Optional helper scripts retained from earlier experiments:
+- `npm run list:profiles`
+- `npm run open:profile`
+- `npm run open:debug-profile`
+- `npm run trace:gemini`
+These are no longer the primary path. The extension bridge is the intended route.
+## Environment variables
+- `GEMINI_BRIDGE_HOST`
+- `GEMINI_BRIDGE_PORT`
+- `GEMINI_WEB_OUTPUT_DIR`
+Defaults:
+```text
+GEMINI_BRIDGE_HOST=127.0.0.1
+GEMINI_BRIDGE_PORT=8765
+```
+## Notes
+- Mode selection is best-effort and depends on Gemini’s visible UI labels.
+- Image upload is implemented through the page’s file input and upload menu, and may need selector tuning if Gemini changes its DOM.
+- Detailed debugging metadata, including remote image URLs when available, is still written to `result.json`.
+- This is still UI automation, just running from inside the real tab instead of controlling Chrome externally.
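
Reviewer sketch (not part of the package): the README lists `GEMINI_BRIDGE_HOST` and `GEMINI_BRIDGE_PORT` with defaults of `127.0.0.1` and `8765`. The snippet below shows one way such settings could be resolved in Node. The helper name `resolveBridgeConfig` and the port validation are assumptions made for illustration; the package's actual logic in `src/extension-bridge.js` may differ.

```javascript
// Hypothetical helper (not taken from the package): resolve the bridge
// settings from an environment object, falling back to the README defaults.
function resolveBridgeConfig(env = process.env) {
  const host = env.GEMINI_BRIDGE_HOST || "127.0.0.1";
  const rawPort = env.GEMINI_BRIDGE_PORT || "8765";
  const port = Number(rawPort);
  // Reject values that are not a valid TCP port (assumed behaviour).
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid GEMINI_BRIDGE_PORT: ${rawPort}`);
  }
  return { host, port, baseUrl: `http://${host}:${port}` };
}

// With no overrides, this prints the default bridge URL from the README.
console.log(resolveBridgeConfig({}).baseUrl);
```

Passing the environment as a parameter rather than reading `process.env` directly keeps the helper easy to exercise with different values.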

package/extension/background.js ADDED

@@ -0,0 +1,118 @@
+function arrayBufferToBase64(buffer) {
+  const bytes = new Uint8Array(buffer);
+  let binary = "";
+  for (const byte of bytes) {
+    binary += String.fromCharCode(byte);
+  }
+  return btoa(binary);
+}
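+// Match on the parsed hostname rather than the raw string, so a URL such as
+// "https://evil.example?.googleusercontent.com/" cannot pass the check.
+const ALLOWED_HOSTS = ["gemini.google.com"];
+const ALLOWED_HOST_SUFFIXES = [".googleusercontent.com", ".usercontent.google.com"];
+function isAllowedUrl(url) {
+  let parsed;
+  try {
+    parsed = new URL(url);
+  } catch {
+    return false; // not a parseable URL
+  }
+  if (parsed.protocol !== "https:") return false;
+  const host = parsed.hostname;
+  return ALLOWED_HOSTS.includes(host) || ALLOWED_HOST_SUFFIXES.some((s) => host.endsWith(s));
+}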
+chrome.runtime.onMessage.addListener((message, _sender, sendResponse) => {
+  if (message?.type === "bridge:fetch_image" && message.url) {
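+    if (!isAllowedUrl(message.url)) {
+      // Parse defensively: a malformed URL must not throw inside the listener.
+      let hostname = "(invalid URL)";
+      try {
+        hostname = new URL(message.url).hostname;
+      } catch {
+        // Keep the placeholder.
+      }
+      sendResponse({
+        ok: false,
+        error: `URL host not in extension host_permissions: ${hostname}`,
+      });
+      return false; // response already sent synchronously
+    }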
+    (async () => {
+      try {
+        const response = await fetch(message.url, {
+          credentials: "omit",
+          redirect: "follow",
+        });
+        if (!response.ok) {
+          throw new Error(`HTTP ${response.status}`);
+        }
+        const blob = await response.blob();
+        const buffer = await blob.arrayBuffer();
+        const mimeType = blob.type || "application/octet-stream";
+        const dataUrl = `data:${mimeType};base64,${arrayBufferToBase64(buffer)}`;
+        sendResponse({ ok: true, dataUrl });
+      } catch (error) {
+        sendResponse({
+          ok: false,
+          error: error instanceof Error ? error.message : String(error),
+        });
+      }
+    })();
+    return true;
+  }
+  if (message?.type === "bridge:capture_tab") {
+    chrome.tabs.captureVisibleTab(
+      undefined,
+      { format: "png" },
+      (dataUrl) => {
+        if (chrome.runtime.lastError) {
+          sendResponse({
+            ok: false,
+            error: chrome.runtime.lastError.message,
+          });
+          return;
+        }
+        sendResponse({ ok: true, dataUrl });
+      },
+    );
+    return true;
+  }
+  if (message?.type === "bridge:activate_self") {
+    const senderTabId = _sender?.tab?.id;
+    const senderWindowId = _sender?.tab?.windowId;
+    if (!Number.isInteger(senderTabId) || !Number.isInteger(senderWindowId)) {
+      sendResponse({
+        ok: false,
+        error: "Missing sender tab context.",
+      });
+      return false;
+    }
+    chrome.tabs.update(senderTabId, { active: true }, () => {
+      if (chrome.runtime.lastError) {
+        sendResponse({
+          ok: false,
+          error: chrome.runtime.lastError.message,
+        });
+        return;
+      }
+      chrome.windows.update(senderWindowId, { focused: true }, () => {
+        if (chrome.runtime.lastError) {
+          sendResponse({
+            ok: false,
+            error: chrome.runtime.lastError.message,
+          });
+          return;
+        }
+        sendResponse({ ok: true });
+      });
+    });
+    return true;
+  }
+  return false;
+});
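
Reviewer sketch (not part of the package): the `bridge:fetch_image` handler above replies with `{ ok: true, dataUrl }`, where `dataUrl` is a base64 data URL. The README says the Node process saves images, so something on that side presumably turns the data URL back into bytes. The helper below shows one way to do that in Node; the name `decodeDataUrl` is hypothetical, and how the payload actually travels through the localhost bridge is not shown in this part of the diff.

```javascript
// Hypothetical Node-side helper (not taken from the package): split a base64
// data URL into its MIME type and raw bytes.
function decodeDataUrl(dataUrl) {
  // Lazy match so MIME parameters such as ";charset=utf-8" stay in group 1.
  const match = /^data:([^,]*?);base64,(.*)$/s.exec(dataUrl);
  if (!match) {
    throw new Error("Expected a base64 data URL");
  }
  return { mimeType: match[1], bytes: Buffer.from(match[2], "base64") };
}

// Round-trip a small sample payload.
const sample = `data:text/plain;base64,${Buffer.from("hello").toString("base64")}`;
const decoded = decodeDataUrl(sample);
console.log(decoded.mimeType, decoded.bytes.toString("utf8")); // prints "text/plain hello"
```

The resulting `bytes` buffer could then be written to disk, for example with `fs.writeFile`, using the MIME type to choose a file extension.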