npm - chrome-ai-bridge - Versions diffs - 1.0.1 → 1.0.3 - Mend

chrome-ai-bridge 1.0.1 → 1.0.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (4)

package/README.md +134 -90
package/build/src/tools/chatgpt-web.js +88 -39
package/package.json +1 -1
package/scripts/cli.mjs +3 -3

package/README.md CHANGED

@@ -1,14 +1,29 @@
-# Chrome DevTools MCP for Extension Development
+# chrome-ai-bridge
-[![npm chrome-ai-bridge package](https://img.shields.io/npm/v/chrome-ai-bridge.svg)](https://npmjs.org/package/chrome-ai-bridge)
+[![npm](https://img.shields.io/npm/v/chrome-ai-bridge.svg)](https://npmjs.org/package/chrome-ai-bridge)
+[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE)
-> AI-powered Chrome extension development via MCP
+> Bridge between AI and Chrome Browser
-Built for: Claude Code, Cursor, VS Code Copilot, Cline, and other MCP-compatible AI tools
+MCP server enabling AI assistants to control Chrome, consult other AIs, and develop extensions.
+**Compatible with:** Claude Code, Cursor, VS Code Copilot, Cline, and other MCP clients
+---
+## What is this?
+chrome-ai-bridge is a [Model Context Protocol](https://modelcontextprotocol.io/) server that gives AI assistants:
+- **Eyes**: See what's on web pages (screenshots, DOM snapshots)
+- **Hands**: Interact with pages (click, type, navigate)
+- **Voice**: Consult other AIs (ChatGPT, Gemini) via browser
+Think of it as the bridge that connects your AI coding assistant to the browser world.
 ---
-## Quick Start (5 minutes)
+## Quick Start
 ### 1. Run the server
@@ -33,85 +48,59 @@ npx chrome-ai-bridge@latest
 ### 3. Verify it works
-Restart your AI client and ask: `"List all my Chrome extensions"`
-### Load development extensions (optional)
-```json
-{
-  "mcpServers": {
-    "chrome-ai-bridge": {
-      "command": "npx",
-      "args": [
-        "chrome-ai-bridge@latest",
-        "--loadExtensionsDir=/path/to/your/extensions"
-      ]
-    }
-  }
-}
-```
+Restart your AI client and try: `"Take a screenshot of google.com"`
 ---
-## What You Can Do
+## Key Features
-- **Extension Development**: Load, debug, and hot-reload Chrome extensions
-- **Browser Automation**: Navigate, click, fill forms, take screenshots
-- **Performance Analysis**: Trace recording and insight extraction
-- **AI Research**: Automated ChatGPT/Gemini interactions
-- **Web Store Submission**: Automated screenshot generation and submission
+### Multi-AI Consultation
----
+Ask ChatGPT or Gemini questions directly from your AI assistant:
-## Tools Reference
+```
+"Ask ChatGPT how to implement OAuth in Node.js"
+"Ask Gemini to review this architecture decision"
+```
-### Core Tools (18)
+| Feature | Description |
+|---------|-------------|
+| **Session persistence** | Conversations continue across tool calls |
+| **Auto-logging** | All Q&A saved to `docs/ask/chatgpt/` and `docs/ask/gemini/` |
+| **12 languages** | Login detection works in EN, JA, FR, DE, ES, IT, KO, ZH, PT, RU, AR |
-| Tool | Description | Key Parameters |
-|------|-------------|----------------|
-| `take_snapshot` | Get page structure with element UIDs | - |
-| `take_screenshot` | Capture page or element image | `fullPage`, `uid` |
-| `click` | Click element by UID | `uid`, `dblClick` |
-| `fill` | Fill input/textarea/select | `uid`, `value` |
-| `fill_form` | Fill multiple form elements | `elements[]` |
-| `hover` | Hover over element | `uid` |
-| `drag` | Drag element to another | `from_uid`, `to_uid` |
-| `upload_file` | Upload file through input | `uid`, `filePath` |
-| `navigate` | Go to URL, back, forward | `op`, `url` |
-| `pages` | List, select, close tabs | `op`, `pageIdx` |
-| `wait_for` | Wait for text to appear | `text`, `timeout` |
-| `handle_dialog` | Accept/dismiss dialogs | `action` |
-| `resize_page` | Change viewport size | `width`, `height` |
-| `emulate` | CPU/network throttling | `target`, `throttlingRate` |
-| `network` | List/get network requests | `op`, `url` |
-| `performance` | Start/stop/analyze traces | `op`, `insightName` |
-| `evaluate_script` | Run JavaScript in page | `function` |
-| `list_console_messages` | Get console output | - |
-### Optional Tools (2) - Web-LLM
-| Tool | Description | Key Parameters |
-|------|-------------|----------------|
-| `ask_chatgpt_web` | Ask ChatGPT via browser | `question`, `createNewChat` |
-| `ask_gemini_web` | Ask Gemini via browser | `question`, `createNewChat` |
+### Browser Automation
-**Full documentation:** [docs/reference/tools.md](docs/reference/tools.md)
+Full browser control with 20+ tools:
----
+| Category | Tools |
+|----------|-------|
+| **Snapshot** | `take_snapshot`, `take_screenshot` |
+| **Input** | `click`, `fill`, `fill_form`, `hover`, `drag`, `upload_file` |
+| **Navigation** | `navigate`, `pages`, `wait_for`, `handle_dialog` |
+| **Inspection** | `network`, `list_console_messages`, `evaluate_script` |
+| **Performance** | `performance` (start/stop/analyze traces) |
+| **Emulation** | `emulate` (CPU/network throttling), `resize_page` |
-## Plugin Architecture (v0.26.0)
+### Chrome Extension Development
-### Disable Web-LLM tools
+Build and debug Chrome extensions with AI assistance:
 ```json
 {
-  "env": {
-    "MCP_DISABLE_WEB_LLM": "true"
-  }
+  "args": ["chrome-ai-bridge@latest", "--loadExtensionsDir=/path/to/extensions"]
 }
 ```
-### Load external plugins
+| Tool | Description |
+|------|-------------|
+| `extension_popup` | Open/close extension popups |
+| `iframe_popup` | Inspect, patch, reload iframe-embedded popups |
+| `bookmarks` | Quick access to chrome://extensions, Web Store dashboard |
+### Plugin Architecture
+Extend with custom tools:
 ```json
 {
@@ -121,19 +110,18 @@ Restart your AI client and ask: `"List all my Chrome extensions"`
 }
 ```
-**Plugin interface:**
 ```typescript
+// my-plugin.js
 export default {
   id: 'my-plugin',
-  name: 'My Custom Plugin',
+  name: 'My Plugin',
   version: '1.0.0',
   async register(ctx) {
     ctx.registry.register({
       name: 'my_tool',
       description: 'Does something useful',
       schema: { /* zod schema */ },
-      async handler(input, response, context) { /* implementation */ },
+      async handler(input, response, context) { /* ... */ },
     });
   },
 };
@@ -141,9 +129,65 @@ export default {
 ---
+## Configuration
+### Environment Variables
+| Variable | Description |
+|----------|-------------|
+| `MCP_DISABLE_WEB_LLM` | Set `true` to disable ChatGPT/Gemini tools |
+| `MCP_PLUGINS` | Comma-separated list of plugin paths |
+| `MCP_ENV` | Set `development` for hot-reload mode |
+### CLI Options
+| Option | Description |
+|--------|-------------|
+| `--loadExtensionsDir` | Load Chrome extensions from directory |
+| `--headless` | Run in headless mode |
+| `--channel` | Chrome channel (stable/canary) |
+---
+## Tools Reference
+### Core Tools (18)
+| Tool | Description |
+|------|-------------|
+| `take_snapshot` | Get page structure with element UIDs |
+| `take_screenshot` | Capture page or element image |
+| `click` | Click element by UID |
+| `fill` | Fill input/textarea/select |
+| `fill_form` | Fill multiple form elements |
+| `hover` | Hover over element |
+| `drag` | Drag element to another |
+| `upload_file` | Upload file through input |
+| `navigate` | Go to URL, back, forward |
+| `pages` | List, select, close tabs |
+| `wait_for` | Wait for text to appear |
+| `handle_dialog` | Accept/dismiss dialogs |
+| `resize_page` | Change viewport size |
+| `emulate` | CPU/network throttling |
+| `network` | List/get network requests |
+| `performance` | Start/stop/analyze traces |
+| `evaluate_script` | Run JavaScript in page |
+| `list_console_messages` | Get console output |
+### Web-LLM Tools (2)
+| Tool | Description |
+|------|-------------|
+| `ask_chatgpt_web` | Ask ChatGPT via browser |
+| `ask_gemini_web` | Ask Gemini via browser |
+**Full documentation:** [docs/reference/tools.md](docs/reference/tools.md)
+---
 ## For Developers
-### Local development setup
+### Local Development
 ```bash
 git clone https://github.com/usedhonda/chrome-ai-bridge.git
@@ -151,37 +195,35 @@ cd chrome-ai-bridge
 npm install && npm run build
 ```
-Configure `~/.claude.json` to use local version:
+Configure `~/.claude.json`:
 ```json
 {
   "mcpServers": {
     "chrome-ai-bridge": {
       "command": "node",
-      "args": ["/absolute/path/to/chrome-ai-bridge/scripts/cli.mjs"]
+      "args": ["/path/to/chrome-ai-bridge/scripts/cli.mjs"]
     }
   }
 }
 ```
-### Hot-reload development
+### Hot-Reload Development
 ```json
 {
   "mcpServers": {
     "chrome-ai-bridge": {
       "command": "node",
-      "args": ["/absolute/path/to/chrome-ai-bridge/scripts/mcp-wrapper.mjs"],
-      "cwd": "/absolute/path/to/chrome-ai-bridge",
+      "args": ["/path/to/chrome-ai-bridge/scripts/mcp-wrapper.mjs"],
+      "cwd": "/path/to/chrome-ai-bridge",
       "env": { "MCP_ENV": "development" }
     }
   }
 }
 ```
-**Benefits:** Auto-rebuild on file changes, 2-5 second feedback loop.
-**See also:** [docs/dev/hot-reload.md](docs/dev/hot-reload.md)
+Auto-rebuild on file changes with 2-5 second feedback loop.
 ### Commands
@@ -192,7 +234,7 @@ npm test           # Run tests
 npm run format     # Format code
 ```
-### Project structure
+### Project Structure
 ```
 chrome-ai-bridge/
@@ -223,30 +265,32 @@ chrome-ai-bridge/
 ## Troubleshooting
-### Extension not loading
-- Verify `manifest.json` is at extension root
-- Use absolute paths in `--loadExtensionsDir`
+### MCP server not responding
-### MCP server issues
 ```bash
 npx clear-npx-cache && npx chrome-ai-bridge@latest
 ```
+### Extension not loading
+- Verify `manifest.json` exists at extension root
+- Use absolute paths in `--loadExtensionsDir`
+### ChatGPT/Gemini login issues
+- Check browser window for login prompts
+- Login detection supports 12 languages
 **More:** [docs/user/troubleshooting.md](docs/user/troubleshooting.md)
 ---
 ## Credits
-Fork of [Chrome DevTools MCP](https://github.com/ChromeDevTools/chrome-ai-bridge) by Google LLC.
-**Additions:** Extension development tools, Web Store automation, ChatGPT/Gemini integration, hot-reload workflow.
+Built on [Chrome DevTools MCP](https://github.com/anthropics/anthropic-quickstarts/tree/main/mcp-devtools) by Google LLC, with extensions for multi-AI consultation and Chrome extension development.
 ---
 ## License
 Apache-2.0
-**Version**: 0.26.1
-**Repository**: https://github.com/usedhonda/chrome-ai-bridge

package/build/src/tools/chatgpt-web.js CHANGED

@@ -267,9 +267,20 @@ export const askChatGPTWeb = defineTool({
                 }
             }
             else if (loginStatus === LoginStatus.IN_PROGRESS) {
-                // Wait a bit and retry
-                await new Promise(r => setTimeout(r, 2000));
-                const retryStatus = await getLoginStatus(page, 'chatgpt');
+                // Wait and retry with exponential backoff (login may still be processing)
+                let retryStatus = LoginStatus.IN_PROGRESS;
+                const maxRetries = 3;
+                for (let i = 0; i < maxRetries; i++) {
+                    const waitTime = 3000 + i * 2000; // 3s, 5s, 7s
+                    await new Promise(r => setTimeout(r, waitTime));
+                    retryStatus = await getLoginStatus(page, 'chatgpt');
+                    if (retryStatus === LoginStatus.LOGGED_IN) {
+                        break;
+                    }
+                    if (i < maxRetries - 1) {
+                        response.appendResponseLine(`⏳ ログイン処理中... (${i + 1}/${maxRetries})`);
+                    }
+                }
                 if (retryStatus !== LoginStatus.LOGGED_IN) {
                     response.appendResponseLine('⚠️ ログイン状態を確認できませんでした。再試行してください。');
                     return;
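
Note on the hunk above: the loop waits 3 s, 5 s, then 7 s between login checks. The delay grows by a fixed 2 s step, so the schedule is linear rather than exponential, despite the wording of the new comment. As a minimal sketch of the same pattern outside Puppeteer, the hypothetical helpers below (`backoffSchedule` and `pollUntil`, neither of which exists in the package) reproduce the schedule and stop early once the wanted status is seen. The injectable `sleep` parameter is an assumption, added only so the loop can be exercised without real waiting.

```javascript
// Hypothetical helpers for illustration; not part of chrome-ai-bridge.

// Delay before attempt i is baseMs + i * stepMs (a linear schedule).
function backoffSchedule(maxRetries = 3, baseMs = 3000, stepMs = 2000) {
  return Array.from({ length: maxRetries }, (_, i) => baseMs + i * stepMs);
}

// Waits, then calls `check`, until it returns `wanted` or the schedule
// is exhausted. Returns the last status seen and the number of checks made.
async function pollUntil(
  check,
  wanted,
  schedule,
  sleep = ms => new Promise(r => setTimeout(r, ms)),
) {
  let status;
  let attempts = 0;
  for (const waitTime of schedule) {
    await sleep(waitTime);
    status = await check();
    attempts++;
    if (status === wanted) break;
  }
  return { status, attempts };
}
```

With the defaults, `backoffSchedule()` yields `[3000, 5000, 7000]`, matching the `// 3s, 5s, 7s` comment in the hunk.
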
@@ -293,17 +304,27 @@ export const askChatGPTWeb = defineTool({
                     const currentUrl = page.url();
                     if (!currentUrl.includes(latestSession.chatId)) {
                         await navigateWithRetry(page, latestSession.url, {
-                            waitUntil: 'domcontentloaded',
+                            waitUntil: 'networkidle2', // Wait for JS to finish loading
                         });
                     }
-                    // Wait for input field to be ready (even when skipping navigation)
-                    await page
-                        .waitForSelector('.ProseMirror[contenteditable="true"]', {
-                        timeout: 5000,
-                    })
-                        .catch(() => {
-                        // Ignore timeout, will be handled later
-                    });
+                    // Wait for input field to be ready with retry
+                    let inputFieldReady = false;
+                    for (let attempt = 0; attempt < 3; attempt++) {
+                        try {
+                            await page.waitForSelector('.ProseMirror[contenteditable="true"]', { timeout: 5000 });
+                            inputFieldReady = true;
+                            break;
+                        }
+                        catch {
+                            if (attempt < 2) {
+                                response.appendResponseLine(`⏳ 入力欄を待機中... (${attempt + 1}/3)`);
+                                await new Promise(r => setTimeout(r, 2000));
+                            }
+                        }
+                    }
+                    if (!inputFieldReady) {
+                        response.appendResponseLine('⚠️ 入力欄の準備に時間がかかっています。続行を試みます...');
+                    }
                 }
                 else {
                     response.appendResponseLine('既存チャットが見つかりませんでした。新規作成します。');
@@ -342,19 +363,40 @@ export const askChatGPTWeb = defineTool({
                     await new Promise(resolve => setTimeout(resolve, 200));
                 }
             }
-            // Step 4: Send question
+            // Capture initial message counts BEFORE sending
+            // This is critical to detect if our message was actually sent
+            const initialCounts = await page.evaluate(() => {
+                const userMessages = document.querySelectorAll('[data-message-author-role="user"]');
+                const assistantMessages = document.querySelectorAll('[data-message-author-role="assistant"]');
+                return {
+                    userCount: userMessages.length,
+                    assistantCount: assistantMessages.length,
+                };
+            });
+            const initialUserMsgCount = initialCounts.userCount;
+            const initialAssistantMsgCount = initialCounts.assistantCount;
+            // Step 4: Send question with retry
             response.appendResponseLine('質問を送信中...');
-            const questionSent = await page.evaluate(questionText => {
-                const prosemirror = document.querySelector('.ProseMirror[contenteditable="true"]');
-                if (!prosemirror)
-                    return false;
-                prosemirror.innerHTML = '';
-                const p = document.createElement('p');
-                p.textContent = questionText;
-                prosemirror.appendChild(p);
-                prosemirror.dispatchEvent(new Event('input', { bubbles: true }));
-                return true;
-            }, sanitizedQuestion);
+            let questionSent = false;
+            for (let attempt = 0; attempt < 3; attempt++) {
+                questionSent = await page.evaluate(questionText => {
+                    const prosemirror = document.querySelector('.ProseMirror[contenteditable="true"]');
+                    if (!prosemirror)
+                        return false;
+                    prosemirror.innerHTML = '';
+                    const p = document.createElement('p');
+                    p.textContent = questionText;
+                    prosemirror.appendChild(p);
+                    prosemirror.dispatchEvent(new Event('input', { bubbles: true }));
+                    return true;
+                }, sanitizedQuestion);
+                if (questionSent)
+                    break;
+                if (attempt < 2) {
+                    response.appendResponseLine(`⏳ 入力欄が見つかりません。再試行中... (${attempt + 1}/3)`);
+                    await new Promise(r => setTimeout(r, 2000));
+                }
+            }
             if (!questionSent) {
                 response.appendResponseLine('❌ 入力欄が見つかりません（ページ読み込み中の可能性）');
                 return;
@@ -373,11 +415,12 @@ export const askChatGPTWeb = defineTool({
                 response.appendResponseLine('❌ 送信ボタンが見つかりません');
                 return;
             }
-            // Wait for message to actually be sent (user message appears in DOM)
-            await page.waitForFunction(() => {
+            // Wait for message to actually be sent (user message count INCREASED)
+            // This ensures we detect our NEW message, not existing ones
+            await page.waitForFunction(initialCount => {
                 const messages = document.querySelectorAll('[data-message-author-role="user"]');
-                return messages.length > 0;
-            }, { timeout: 10000 });
+                return messages.length > initialCount;
+            }, { timeout: 10000 }, initialUserMsgCount);
             response.appendResponseLine('✅ 質問送信完了');
             // Step 5: Monitor streaming with progress updates
             response.appendResponseLine('ChatGPTの回答を待機中... (10秒ごとに進捗を表示)');
@@ -390,39 +433,45 @@ export const askChatGPTWeb = defineTool({
                     await new Promise(resolve => setTimeout(resolve, 500));
                 }
                 isFirstCheck = false;
-                const status = await page.evaluate(() => {
+                const status = await page.evaluate(initialAssistantCount => {
                     // Streaming detection - check for stop button by data-testid
                     // When ChatGPT is generating, send-button becomes stop-button
                     const stopButton = document.querySelector('button[data-testid="stop-button"]');
                     const isStreaming = !!stopButton;
                     if (!isStreaming) {
-                        // Get final response
+                        // Get final response - only look at NEW messages
                         const assistantMessages = document.querySelectorAll('[data-message-author-role="assistant"]');
-                        if (assistantMessages.length === 0)
+                        // Check if we have a NEW assistant message (not old ones)
+                        if (assistantMessages.length <= initialAssistantCount) {
                             return { completed: false };
-                        const latestMessage = assistantMessages[assistantMessages.length - 1];
-                        const thinkingButton = latestMessage.querySelector('button[aria-label*="思考時間"]');
+                        }
+                        // Get the NEW message (first one after initial count)
+                        const newMessage = assistantMessages[initialAssistantCount];
+                        const thinkingButton = newMessage.querySelector('button[aria-label*="思考時間"]');
                         const thinkingTime = thinkingButton
                             ? parseInt((thinkingButton.textContent || '').match(/\d+/)?.[0] || '0')
                             : undefined;
                         return {
                             completed: true,
-                            text: latestMessage.textContent || '',
+                            text: newMessage.textContent || '',
                             thinkingTime,
                         };
                     }
-                    // Get current text
+                    // Get current text from NEW message during streaming
                     const assistantMessages = document.querySelectorAll('[data-message-author-role="assistant"]');
-                    const latestMessage = assistantMessages[assistantMessages.length - 1];
-                    const currentText = latestMessage
-                        ? latestMessage.textContent?.substring(0, 200)
+                    // Only check new messages
+                    const newMessage = assistantMessages.length > initialAssistantCount
+                        ? assistantMessages[initialAssistantCount]
+                        : null;
+                    const currentText = newMessage
+                        ? newMessage.textContent?.substring(0, 200)
                         : '';
                     return {
                         completed: false,
                         streaming: true,
                         currentText,
                     };
-                });
+                }, initialAssistantMsgCount);
                 if (status.completed) {
                     response.appendResponseLine(`\n✅ 回答完了 (所要時間: ${Math.floor((Date.now() - startTime) / 1000)}秒)`);
                     if (status.thinkingTime) {
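
Note on the hunk above: the switch from `assistantMessages[assistantMessages.length - 1]` to `assistantMessages[initialAssistantCount]` matters when a resumed chat already contains answers. A minimal sketch, with plain arrays standing in for the `querySelectorAll` results (the function names `pickLatest` and `pickFirstNew` are hypothetical, not taken from the package), shows the difference:

```javascript
// Plain arrays stand in for the NodeList of assistant messages.

// 1.0.1 behaviour: take the last message, even if it predates the question.
function pickLatest(messages) {
  return messages.length === 0 ? null : messages[messages.length - 1];
}

// 1.0.3 behaviour: take the first message past the count captured
// before the question was sent, or null if none has appeared yet.
function pickFirstNew(messages, initialCount) {
  return messages.length > initialCount ? messages[initialCount] : null;
}

const beforeReply = ['old answer'];
const afterReply = ['old answer', 'new answer'];
const initialCount = beforeReply.length;

console.log(pickLatest(beforeReply));                // stale answer from the earlier turn
console.log(pickFirstNew(beforeReply, initialCount)); // null: keep waiting
console.log(pickFirstNew(afterReply, initialCount));  // the answer to the new question
```

One consequence of indexing at the initial count: the check reads the first new assistant message only, so if the page were ever to render more than one assistant message for a single question, the later ones would not be picked up by this logic.
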

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "chrome-ai-bridge",
-  "version": "1.0.1",
+  "version": "1.0.3",
   "description": "MCP server bridging Chrome browser and AI assistants (ChatGPT, Gemini). Browser automation + AI consultation.",
   "type": "module",
   "bin": "./scripts/cli.mjs",

package/scripts/cli.mjs CHANGED

@@ -1,10 +1,10 @@
 #!/usr/bin/env node
 /**
- * CLI Entry Point for chrome-devtools-mcp-for-extension
+ * CLI Entry Point for chrome-ai-bridge
  *
  * This is the entry point when users run:
- *   npx chrome-devtools-mcp-for-extension
- *   chrome-devtools-mcp-for-extension (if globally installed)
+ *   npx chrome-ai-bridge
+ *   chrome-ai-bridge (if globally installed)
  *
  * Launches the MCP server with browser globals mock:
  * - Loads browser-globals-mock.mjs BEFORE main.js