chrome-ai-bridge 1.0.1 → 1.0.3

package/README.md CHANGED
@@ -1,14 +1,29 @@
1
- # Chrome DevTools MCP for Extension Development
1
+ # chrome-ai-bridge
2
2
 
3
- [![npm chrome-ai-bridge package](https://img.shields.io/npm/v/chrome-ai-bridge.svg)](https://npmjs.org/package/chrome-ai-bridge)
3
+ [![npm](https://img.shields.io/npm/v/chrome-ai-bridge.svg)](https://npmjs.org/package/chrome-ai-bridge)
4
+ [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE)
4
5
 
5
- > AI-powered Chrome extension development via MCP
6
+ > Bridge between AI and Chrome Browser
6
7
 
7
- Built for: Claude Code, Cursor, VS Code Copilot, Cline, and other MCP-compatible AI tools
8
+ MCP server enabling AI assistants to control Chrome, consult other AIs, and develop extensions.
9
+
10
+ **Compatible with:** Claude Code, Cursor, VS Code Copilot, Cline, and other MCP clients
11
+
12
+ ---
13
+
14
+ ## What is this?
15
+
16
+ chrome-ai-bridge is a [Model Context Protocol](https://modelcontextprotocol.io/) server that gives AI assistants:
17
+
18
+ - **Eyes**: See what's on web pages (screenshots, DOM snapshots)
19
+ - **Hands**: Interact with pages (click, type, navigate)
20
+ - **Voice**: Consult other AIs (ChatGPT, Gemini) via browser
21
+
22
+ Think of it as the bridge that connects your AI coding assistant to the browser world.
8
23
 
9
24
  ---
10
25
 
11
- ## Quick Start (5 minutes)
26
+ ## Quick Start
12
27
 
13
28
  ### 1. Run the server
14
29
 
@@ -33,85 +48,59 @@ npx chrome-ai-bridge@latest
33
48
 
34
49
  ### 3. Verify it works
35
50
 
36
- Restart your AI client and ask: `"List all my Chrome extensions"`
37
-
38
- ### Load development extensions (optional)
39
-
40
- ```json
41
- {
42
- "mcpServers": {
43
- "chrome-ai-bridge": {
44
- "command": "npx",
45
- "args": [
46
- "chrome-ai-bridge@latest",
47
- "--loadExtensionsDir=/path/to/your/extensions"
48
- ]
49
- }
50
- }
51
- }
52
- ```
51
+ Restart your AI client and try: `"Take a screenshot of google.com"`
53
52
 
54
53
  ---
55
54
 
56
- ## What You Can Do
55
+ ## Key Features
57
56
 
58
- - **Extension Development**: Load, debug, and hot-reload Chrome extensions
59
- - **Browser Automation**: Navigate, click, fill forms, take screenshots
60
- - **Performance Analysis**: Trace recording and insight extraction
61
- - **AI Research**: Automated ChatGPT/Gemini interactions
62
- - **Web Store Submission**: Automated screenshot generation and submission
57
+ ### Multi-AI Consultation
63
58
 
64
- ---
59
+ Ask ChatGPT or Gemini questions directly from your AI assistant:
65
60
 
66
- ## Tools Reference
61
+ ```
62
+ "Ask ChatGPT how to implement OAuth in Node.js"
63
+ "Ask Gemini to review this architecture decision"
64
+ ```
67
65
 
68
- ### Core Tools (18)
66
+ | Feature | Description |
67
+ |---------|-------------|
68
+ | **Session persistence** | Conversations continue across tool calls |
69
+ | **Auto-logging** | All Q&A saved to `docs/ask/chatgpt/` and `docs/ask/gemini/` |
70
+ | **12 languages** | Login detection works in EN, JA, FR, DE, ES, IT, KO, ZH, PT, RU, AR |
69
71
 
70
- | Tool | Description | Key Parameters |
71
- |------|-------------|----------------|
72
- | `take_snapshot` | Get page structure with element UIDs | - |
73
- | `take_screenshot` | Capture page or element image | `fullPage`, `uid` |
74
- | `click` | Click element by UID | `uid`, `dblClick` |
75
- | `fill` | Fill input/textarea/select | `uid`, `value` |
76
- | `fill_form` | Fill multiple form elements | `elements[]` |
77
- | `hover` | Hover over element | `uid` |
78
- | `drag` | Drag element to another | `from_uid`, `to_uid` |
79
- | `upload_file` | Upload file through input | `uid`, `filePath` |
80
- | `navigate` | Go to URL, back, forward | `op`, `url` |
81
- | `pages` | List, select, close tabs | `op`, `pageIdx` |
82
- | `wait_for` | Wait for text to appear | `text`, `timeout` |
83
- | `handle_dialog` | Accept/dismiss dialogs | `action` |
84
- | `resize_page` | Change viewport size | `width`, `height` |
85
- | `emulate` | CPU/network throttling | `target`, `throttlingRate` |
86
- | `network` | List/get network requests | `op`, `url` |
87
- | `performance` | Start/stop/analyze traces | `op`, `insightName` |
88
- | `evaluate_script` | Run JavaScript in page | `function` |
89
- | `list_console_messages` | Get console output | - |
90
-
91
- ### Optional Tools (2) - Web-LLM
92
-
93
- | Tool | Description | Key Parameters |
94
- |------|-------------|----------------|
95
- | `ask_chatgpt_web` | Ask ChatGPT via browser | `question`, `createNewChat` |
96
- | `ask_gemini_web` | Ask Gemini via browser | `question`, `createNewChat` |
72
+ ### Browser Automation
97
73
 
98
- **Full documentation:** [docs/reference/tools.md](docs/reference/tools.md)
74
+ Full browser control with 20+ tools:
99
75
 
100
- ---
76
+ | Category | Tools |
77
+ |----------|-------|
78
+ | **Snapshot** | `take_snapshot`, `take_screenshot` |
79
+ | **Input** | `click`, `fill`, `fill_form`, `hover`, `drag`, `upload_file` |
80
+ | **Navigation** | `navigate`, `pages`, `wait_for`, `handle_dialog` |
81
+ | **Inspection** | `network`, `list_console_messages`, `evaluate_script` |
82
+ | **Performance** | `performance` (start/stop/analyze traces) |
83
+ | **Emulation** | `emulate` (CPU/network throttling), `resize_page` |
101
84
 
102
- ## Plugin Architecture (v0.26.0)
85
+ ### Chrome Extension Development
103
86
 
104
- ### Disable Web-LLM tools
87
+ Build and debug Chrome extensions with AI assistance:
105
88
 
106
89
  ```json
107
90
  {
108
- "env": {
109
- "MCP_DISABLE_WEB_LLM": "true"
110
- }
91
+ "args": ["chrome-ai-bridge@latest", "--loadExtensionsDir=/path/to/extensions"]
111
92
  }
112
93
  ```
113
94
 
114
- ### Load external plugins
95
+ | Tool | Description |
96
+ |------|-------------|
97
+ | `extension_popup` | Open/close extension popups |
98
+ | `iframe_popup` | Inspect, patch, reload iframe-embedded popups |
99
+ | `bookmarks` | Quick access to chrome://extensions, Web Store dashboard |
100
+
101
+ ### Plugin Architecture
102
+
103
+ Extend with custom tools:
115
104
 
116
105
  ```json
117
106
  {
@@ -121,19 +110,18 @@ Restart your AI client and ask: `"List all my Chrome extensions"`
121
110
  }
122
111
  ```
123
112
 
124
- **Plugin interface:**
125
-
126
113
  ```typescript
114
+ // my-plugin.js
127
115
  export default {
128
116
  id: 'my-plugin',
129
- name: 'My Custom Plugin',
117
+ name: 'My Plugin',
130
118
  version: '1.0.0',
131
119
  async register(ctx) {
132
120
  ctx.registry.register({
133
121
  name: 'my_tool',
134
122
  description: 'Does something useful',
135
123
  schema: { /* zod schema */ },
136
- async handler(input, response, context) { /* implementation */ },
124
+ async handler(input, response, context) { /* ... */ },
137
125
  });
138
126
  },
139
127
  };
@@ -141,9 +129,65 @@ export default {
141
129
 
142
130
  ---
143
131
 
132
+ ## Configuration
133
+
134
+ ### Environment Variables
135
+
136
+ | Variable | Description |
137
+ |----------|-------------|
138
+ | `MCP_DISABLE_WEB_LLM` | Set `true` to disable ChatGPT/Gemini tools |
139
+ | `MCP_PLUGINS` | Comma-separated list of plugin paths |
140
+ | `MCP_ENV` | Set `development` for hot-reload mode |
141
+
142
+ ### CLI Options
143
+
144
+ | Option | Description |
145
+ |--------|-------------|
146
+ | `--loadExtensionsDir` | Load Chrome extensions from directory |
147
+ | `--headless` | Run in headless mode |
148
+ | `--channel` | Chrome channel (stable/canary) |
149
+
150
+ ---
151
+
152
+ ## Tools Reference
153
+
154
+ ### Core Tools (18)
155
+
156
+ | Tool | Description |
157
+ |------|-------------|
158
+ | `take_snapshot` | Get page structure with element UIDs |
159
+ | `take_screenshot` | Capture page or element image |
160
+ | `click` | Click element by UID |
161
+ | `fill` | Fill input/textarea/select |
162
+ | `fill_form` | Fill multiple form elements |
163
+ | `hover` | Hover over element |
164
+ | `drag` | Drag element to another |
165
+ | `upload_file` | Upload file through input |
166
+ | `navigate` | Go to URL, back, forward |
167
+ | `pages` | List, select, close tabs |
168
+ | `wait_for` | Wait for text to appear |
169
+ | `handle_dialog` | Accept/dismiss dialogs |
170
+ | `resize_page` | Change viewport size |
171
+ | `emulate` | CPU/network throttling |
172
+ | `network` | List/get network requests |
173
+ | `performance` | Start/stop/analyze traces |
174
+ | `evaluate_script` | Run JavaScript in page |
175
+ | `list_console_messages` | Get console output |
176
+
177
+ ### Web-LLM Tools (2)
178
+
179
+ | Tool | Description |
180
+ |------|-------------|
181
+ | `ask_chatgpt_web` | Ask ChatGPT via browser |
182
+ | `ask_gemini_web` | Ask Gemini via browser |
183
+
184
+ **Full documentation:** [docs/reference/tools.md](docs/reference/tools.md)
185
+
186
+ ---
187
+
144
188
  ## For Developers
145
189
 
146
- ### Local development setup
190
+ ### Local Development
147
191
 
148
192
  ```bash
149
193
  git clone https://github.com/usedhonda/chrome-ai-bridge.git
@@ -151,37 +195,35 @@ cd chrome-ai-bridge
151
195
  npm install && npm run build
152
196
  ```
153
197
 
154
- Configure `~/.claude.json` to use local version:
198
+ Configure `~/.claude.json`:
155
199
 
156
200
  ```json
157
201
  {
158
202
  "mcpServers": {
159
203
  "chrome-ai-bridge": {
160
204
  "command": "node",
161
- "args": ["/absolute/path/to/chrome-ai-bridge/scripts/cli.mjs"]
205
+ "args": ["/path/to/chrome-ai-bridge/scripts/cli.mjs"]
162
206
  }
163
207
  }
164
208
  }
165
209
  ```
166
210
 
167
- ### Hot-reload development
211
+ ### Hot-Reload Development
168
212
 
169
213
  ```json
170
214
  {
171
215
  "mcpServers": {
172
216
  "chrome-ai-bridge": {
173
217
  "command": "node",
174
- "args": ["/absolute/path/to/chrome-ai-bridge/scripts/mcp-wrapper.mjs"],
175
- "cwd": "/absolute/path/to/chrome-ai-bridge",
218
+ "args": ["/path/to/chrome-ai-bridge/scripts/mcp-wrapper.mjs"],
219
+ "cwd": "/path/to/chrome-ai-bridge",
176
220
  "env": { "MCP_ENV": "development" }
177
221
  }
178
222
  }
179
223
  }
180
224
  ```
181
225
 
182
- **Benefits:** Auto-rebuild on file changes, 2-5 second feedback loop.
183
-
184
- **See also:** [docs/dev/hot-reload.md](docs/dev/hot-reload.md)
226
+ Auto-rebuild on file changes with a 2-5 second feedback loop.
185
227
 
186
228
  ### Commands
187
229
 
@@ -192,7 +234,7 @@ npm test # Run tests
192
234
  npm run format # Format code
193
235
  ```
194
236
 
195
- ### Project structure
237
+ ### Project Structure
196
238
 
197
239
  ```
198
240
  chrome-ai-bridge/
@@ -223,30 +265,32 @@ chrome-ai-bridge/
223
265
 
224
266
  ## Troubleshooting
225
267
 
226
- ### Extension not loading
227
- - Verify `manifest.json` is at extension root
228
- - Use absolute paths in `--loadExtensionsDir`
268
+ ### MCP server not responding
229
269
 
230
- ### MCP server issues
231
270
  ```bash
232
271
  npx clear-npx-cache && npx chrome-ai-bridge@latest
233
272
  ```
234
273
 
274
+ ### Extension not loading
275
+
276
+ - Verify `manifest.json` exists at extension root
277
+ - Use absolute paths in `--loadExtensionsDir`
278
+
279
+ ### ChatGPT/Gemini login issues
280
+
281
+ - Check browser window for login prompts
282
+ - Login detection supports 12 languages
283
+
235
284
  **More:** [docs/user/troubleshooting.md](docs/user/troubleshooting.md)
236
285
 
237
286
  ---
238
287
 
239
288
  ## Credits
240
289
 
241
- Fork of [Chrome DevTools MCP](https://github.com/ChromeDevTools/chrome-ai-bridge) by Google LLC.
242
-
243
- **Additions:** Extension development tools, Web Store automation, ChatGPT/Gemini integration, hot-reload workflow.
290
+ Built on [Chrome DevTools MCP](https://github.com/ChromeDevTools/chrome-devtools-mcp) by Google LLC, with extensions for multi-AI consultation and Chrome extension development.
244
291
 
245
292
  ---
246
293
 
247
294
  ## License
248
295
 
249
296
  Apache-2.0
250
-
251
- **Version**: 0.26.1
252
- **Repository**: https://github.com/usedhonda/chrome-ai-bridge
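
The new Configuration section of the README describes `MCP_PLUGINS` as a comma-separated list of plugin paths. The sketch below shows one way such a value could be split into individual paths. `parsePluginPaths` is a hypothetical helper written for illustration; the trimming and the handling of empty entries are assumptions, not behavior confirmed by this diff.

```javascript
// Hypothetical helper, not part of chrome-ai-bridge: splits a
// comma-separated MCP_PLUGINS value into a list of plugin paths.
function parsePluginPaths(value) {
  if (!value) {
    return []; // Variable unset or empty: no plugins requested.
  }
  return value
    .split(',')
    .map(entry => entry.trim()) // Tolerate spaces around commas (assumption).
    .filter(entry => entry.length > 0); // Skip empty entries such as "a,,b".
}

// Example: read the variable from the environment when running under Node.js.
const pluginPaths = parsePluginPaths(
  typeof process !== 'undefined' ? process.env.MCP_PLUGINS : undefined,
);
console.log(pluginPaths);
```

Returning an empty array for an unset variable lets callers loop over the result without a separate null check.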
@@ -267,9 +267,20 @@ export const askChatGPTWeb = defineTool({
267
267
  }
268
268
  }
269
269
  else if (loginStatus === LoginStatus.IN_PROGRESS) {
270
- // Wait a bit and retry
271
- await new Promise(r => setTimeout(r, 2000));
272
- const retryStatus = await getLoginStatus(page, 'chatgpt');
270
+ // Wait and retry with linearly increasing delays (login may still be processing)
271
+ let retryStatus = LoginStatus.IN_PROGRESS;
272
+ const maxRetries = 3;
273
+ for (let i = 0; i < maxRetries; i++) {
274
+ const waitTime = 3000 + i * 2000; // 3s, 5s, 7s
275
+ await new Promise(r => setTimeout(r, waitTime));
276
+ retryStatus = await getLoginStatus(page, 'chatgpt');
277
+ if (retryStatus === LoginStatus.LOGGED_IN) {
278
+ break;
279
+ }
280
+ if (i < maxRetries - 1) {
281
+ response.appendResponseLine(`⏳ ログイン処理中... (${i + 1}/${maxRetries})`);
282
+ }
283
+ }
273
284
  if (retryStatus !== LoginStatus.LOGGED_IN) {
274
285
  response.appendResponseLine('⚠️ ログイン状態を確認できませんでした。再試行してください。');
275
286
  return;
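
The retry loop in this hunk waits 3 s, 5 s, then 7 s between login checks, so the delay grows by a fixed step on each attempt. The sketch below isolates that pattern without a browser. `linearBackoffDelays` and `retryWithDelays` are hypothetical names introduced here for illustration and do not appear in the package.

```javascript
// Builds the delay schedule base + i * step, e.g. 3000, 5000, 7000.
function linearBackoffDelays(maxRetries, baseMs, stepMs) {
  return Array.from({ length: maxRetries }, (_, i) => baseMs + i * stepMs);
}

// Hypothetical helper: waits for each delay in turn, then runs `check`.
// Resolves to true as soon as `check` succeeds, or false once the
// schedule is exhausted. `sleep` is injectable so tests can skip waiting.
async function retryWithDelays(
  check,
  delays,
  sleep = ms => new Promise(resolve => setTimeout(resolve, ms)),
) {
  for (const waitMs of delays) {
    await sleep(waitMs);
    if (await check()) {
      return true;
    }
  }
  return false;
}
```

With `linearBackoffDelays(3, 3000, 2000)` the worst case is 15 seconds of waiting in total, compared with a single 2-second wait in 1.0.1.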
@@ -293,17 +304,27 @@ export const askChatGPTWeb = defineTool({
293
304
  const currentUrl = page.url();
294
305
  if (!currentUrl.includes(latestSession.chatId)) {
295
306
  await navigateWithRetry(page, latestSession.url, {
296
- waitUntil: 'domcontentloaded',
307
+ waitUntil: 'networkidle2', // Wait for JS to finish loading
297
308
  });
298
309
  }
299
- // Wait for input field to be ready (even when skipping navigation)
300
- await page
301
- .waitForSelector('.ProseMirror[contenteditable="true"]', {
302
- timeout: 5000,
303
- })
304
- .catch(() => {
305
- // Ignore timeout, will be handled later
306
- });
310
+ // Wait for input field to be ready with retry
311
+ let inputFieldReady = false;
312
+ for (let attempt = 0; attempt < 3; attempt++) {
313
+ try {
314
+ await page.waitForSelector('.ProseMirror[contenteditable="true"]', { timeout: 5000 });
315
+ inputFieldReady = true;
316
+ break;
317
+ }
318
+ catch {
319
+ if (attempt < 2) {
320
+ response.appendResponseLine(`⏳ 入力欄を待機中... (${attempt + 1}/3)`);
321
+ await new Promise(r => setTimeout(r, 2000));
322
+ }
323
+ }
324
+ }
325
+ if (!inputFieldReady) {
326
+ response.appendResponseLine('⚠️ 入力欄の準備に時間がかかっています。続行を試みます...');
327
+ }
307
328
  }
308
329
  else {
309
330
  response.appendResponseLine('既存チャットが見つかりませんでした。新規作成します。');
@@ -342,19 +363,40 @@ export const askChatGPTWeb = defineTool({
342
363
  await new Promise(resolve => setTimeout(resolve, 200));
343
364
  }
344
365
  }
345
- // Step 4: Send question
366
+ // Capture initial message counts BEFORE sending
367
+ // This is critical to detect if our message was actually sent
368
+ const initialCounts = await page.evaluate(() => {
369
+ const userMessages = document.querySelectorAll('[data-message-author-role="user"]');
370
+ const assistantMessages = document.querySelectorAll('[data-message-author-role="assistant"]');
371
+ return {
372
+ userCount: userMessages.length,
373
+ assistantCount: assistantMessages.length,
374
+ };
375
+ });
376
+ const initialUserMsgCount = initialCounts.userCount;
377
+ const initialAssistantMsgCount = initialCounts.assistantCount;
378
+ // Step 4: Send question with retry
346
379
  response.appendResponseLine('質問を送信中...');
347
- const questionSent = await page.evaluate(questionText => {
348
- const prosemirror = document.querySelector('.ProseMirror[contenteditable="true"]');
349
- if (!prosemirror)
350
- return false;
351
- prosemirror.innerHTML = '';
352
- const p = document.createElement('p');
353
- p.textContent = questionText;
354
- prosemirror.appendChild(p);
355
- prosemirror.dispatchEvent(new Event('input', { bubbles: true }));
356
- return true;
357
- }, sanitizedQuestion);
380
+ let questionSent = false;
381
+ for (let attempt = 0; attempt < 3; attempt++) {
382
+ questionSent = await page.evaluate(questionText => {
383
+ const prosemirror = document.querySelector('.ProseMirror[contenteditable="true"]');
384
+ if (!prosemirror)
385
+ return false;
386
+ prosemirror.innerHTML = '';
387
+ const p = document.createElement('p');
388
+ p.textContent = questionText;
389
+ prosemirror.appendChild(p);
390
+ prosemirror.dispatchEvent(new Event('input', { bubbles: true }));
391
+ return true;
392
+ }, sanitizedQuestion);
393
+ if (questionSent)
394
+ break;
395
+ if (attempt < 2) {
396
+ response.appendResponseLine(`⏳ 入力欄が見つかりません。再試行中... (${attempt + 1}/3)`);
397
+ await new Promise(r => setTimeout(r, 2000));
398
+ }
399
+ }
358
400
  if (!questionSent) {
359
401
  response.appendResponseLine('❌ 入力欄が見つかりません(ページ読み込み中の可能性)');
360
402
  return;
@@ -373,11 +415,12 @@ export const askChatGPTWeb = defineTool({
373
415
  response.appendResponseLine('❌ 送信ボタンが見つかりません');
374
416
  return;
375
417
  }
376
- // Wait for message to actually be sent (user message appears in DOM)
377
- await page.waitForFunction(() => {
418
+ // Wait for message to actually be sent (user message count INCREASED)
419
+ // This ensures we detect our NEW message, not existing ones
420
+ await page.waitForFunction(initialCount => {
378
421
  const messages = document.querySelectorAll('[data-message-author-role="user"]');
379
- return messages.length > 0;
380
- }, { timeout: 10000 });
422
+ return messages.length > initialCount;
423
+ }, { timeout: 10000 }, initialUserMsgCount);
381
424
  response.appendResponseLine('✅ 質問送信完了');
382
425
  // Step 5: Monitor streaming with progress updates
383
426
  response.appendResponseLine('ChatGPTの回答を待機中... (10秒ごとに進捗を表示)');
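
This hunk changes the send check from "at least one user message exists" to "the user message count exceeds the count captured before sending", which avoids a false positive in chats that already contain messages. The sketch below shows the same baseline comparison as a plain polling loop. `waitForCountAbove` is a hypothetical helper, and the timeout and interval defaults are assumptions chosen for illustration.

```javascript
// Hypothetical helper: polls `getCount` until it returns a value greater
// than `baseline`. Resolves to true on success, false on timeout.
async function waitForCountAbove(
  getCount,
  baseline,
  { timeoutMs = 10000, intervalMs = 100 } = {},
) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if ((await getCount()) > baseline) {
      return true; // A new item appeared after the baseline was taken.
    }
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
  return false;
}
```

The baseline has to be captured before the action that is expected to add an item; capturing it afterwards would race with the page update.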
@@ -390,39 +433,45 @@ export const askChatGPTWeb = defineTool({
390
433
  await new Promise(resolve => setTimeout(resolve, 500));
391
434
  }
392
435
  isFirstCheck = false;
393
- const status = await page.evaluate(() => {
436
+ const status = await page.evaluate(initialAssistantCount => {
394
437
  // Streaming detection - check for stop button by data-testid
395
438
  // When ChatGPT is generating, send-button becomes stop-button
396
439
  const stopButton = document.querySelector('button[data-testid="stop-button"]');
397
440
  const isStreaming = !!stopButton;
398
441
  if (!isStreaming) {
399
- // Get final response
442
+ // Get final response - only look at NEW messages
400
443
  const assistantMessages = document.querySelectorAll('[data-message-author-role="assistant"]');
401
- if (assistantMessages.length === 0)
444
+ // Check if we have a NEW assistant message (not old ones)
445
+ if (assistantMessages.length <= initialAssistantCount) {
402
446
  return { completed: false };
403
- const latestMessage = assistantMessages[assistantMessages.length - 1];
404
- const thinkingButton = latestMessage.querySelector('button[aria-label*="思考時間"]');
447
+ }
448
+ // Get the NEW message (first one after initial count)
449
+ const newMessage = assistantMessages[initialAssistantCount];
450
+ const thinkingButton = newMessage.querySelector('button[aria-label*="思考時間"]');
405
451
  const thinkingTime = thinkingButton
406
452
  ? parseInt((thinkingButton.textContent || '').match(/\d+/)?.[0] || '0')
407
453
  : undefined;
408
454
  return {
409
455
  completed: true,
410
- text: latestMessage.textContent || '',
456
+ text: newMessage.textContent || '',
411
457
  thinkingTime,
412
458
  };
413
459
  }
414
- // Get current text
460
+ // Get current text from NEW message during streaming
415
461
  const assistantMessages = document.querySelectorAll('[data-message-author-role="assistant"]');
416
- const latestMessage = assistantMessages[assistantMessages.length - 1];
417
- const currentText = latestMessage
418
- ? latestMessage.textContent?.substring(0, 200)
462
+ // Only check new messages
463
+ const newMessage = assistantMessages.length > initialAssistantCount
464
+ ? assistantMessages[initialAssistantCount]
465
+ : null;
466
+ const currentText = newMessage
467
+ ? newMessage.textContent?.substring(0, 200)
419
468
  : '';
420
469
  return {
421
470
  completed: false,
422
471
  streaming: true,
423
472
  currentText,
424
473
  };
425
- });
474
+ }, initialAssistantMsgCount);
426
475
  if (status.completed) {
427
476
  response.appendResponseLine(`\n✅ 回答完了 (所要時間: ${Math.floor((Date.now() - startTime) / 1000)}秒)`);
428
477
  if (status.thinkingTime) {
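
The monitoring code in this hunk reads the assistant message at index `initialAssistantCount`, that is, the first message added after the baseline, instead of the last message in the list. The sketch below reproduces that selection on a plain array so the indexing is easy to check. `pickNewMessage` is a hypothetical name used only for this illustration.

```javascript
// Hypothetical helper mirroring the selection logic in the hunk above:
// returns the first item added after `initialCount`, or null if none exists.
function pickNewMessage(messages, initialCount) {
  return messages.length > initialCount ? messages[initialCount] : null;
}

// Two messages existed before sending; one reply arrived afterwards.
const history = ['old reply 1', 'old reply 2', 'new reply'];
console.log(pickNewMessage(history, 2));
console.log(pickNewMessage(history, 3));
```

One consequence of indexing by the baseline is that, if more than one assistant message appears after sending, only the first of them is read.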
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "chrome-ai-bridge",
3
- "version": "1.0.1",
3
+ "version": "1.0.3",
4
4
  "description": "MCP server bridging Chrome browser and AI assistants (ChatGPT, Gemini). Browser automation + AI consultation.",
5
5
  "type": "module",
6
6
  "bin": "./scripts/cli.mjs",
package/scripts/cli.mjs CHANGED
@@ -1,10 +1,10 @@
1
1
  #!/usr/bin/env node
2
2
  /**
3
- * CLI Entry Point for chrome-devtools-mcp-for-extension
3
+ * CLI Entry Point for chrome-ai-bridge
4
4
  *
5
5
  * This is the entry point when users run:
6
- * npx chrome-devtools-mcp-for-extension
7
- * chrome-devtools-mcp-for-extension (if globally installed)
6
+ * npx chrome-ai-bridge
7
+ * chrome-ai-bridge (if globally installed)
8
8
  *
9
9
  * Launches the MCP server with browser globals mock:
10
10
  * - Loads browser-globals-mock.mjs BEFORE main.js