chrometools-mcp 3.2.4 → 3.2.6

package/CHANGELOG.md CHANGED
@@ -2,6 +2,36 @@

  All notable changes to this project will be documented in this file.

+ ## [3.2.6] - 2026-01-28
+
+ ### Removed
+ - **getAllInteractiveElements tool** - Removed redundant tool, fully replaced by analyzePage (54 → 53 tools)
+   - `analyzePage` provides superior functionality: hierarchical tree, element registration, APOM IDs, metadata
+   - `getAllInteractiveElements` only returned flat list with CSS selectors
+   - Affected files: `index.js`, `server/tool-definitions.js`, `server/tool-schemas.js`, `server/tool-groups.js`, `README.md`
+
+ ### Fixed
+ - **analyzePage visibility detection** - Fixed critical bug where analyzePage returned tree: null with interactiveCount: 0 on Angular Material pages
+   - Changed `isVisible()` check from `offsetParent` to `offsetWidth/offsetHeight > 0`
+   - Now correctly detects elements inside `position: fixed` containers (Angular Material overlays, dialogs, selects)
+   - Handles `position: sticky` elements properly
+   - Testing on my-autotests.segmento.ru: interactiveCount increased from 0 → 329 elements
+   - Affected file: `pom/apom-tree-converter.js`
+ - **type() text corruption** - Fixed text input corruption (duplicated/swapped characters)
+   - Changed default keystroke delay from 0ms to 30ms
+   - Prevents character corruption on fast-reacting inputs (Google Search, autocomplete fields)
+   - Example: "puppeteer automation" no longer becomes "ppuuppppeetteeeerr baruotwosmeart"
+   - Affected file: `index.js:454`
+
+ ## [3.2.5] - 2026-01-28
+
+ ### Fixed
+ - **CSS selector validation** - Fixed analyzePage crash when elements have numeric IDs
+   - Added validation to skip IDs starting with digits (e.g., `id="301178"`)
+   - CSS selectors don't support IDs starting with numbers (per CSS specification)
+   - Added try-catch for invalid selector edge cases
+   - Affected file: `pom/apom-tree-converter.js`
+
  ## [3.2.4] - 2026-01-27

  ### Performance
package/README.md CHANGED
@@ -1,6 +1,29 @@
  # chrometools-mcp

- MCP server for Chrome automation using Puppeteer with persistent browser sessions.
+ > 🌐 [Русская версия README](./README.ru.md)
+
+ **AI-powered Chrome automation through natural language.** No more fighting with CSS selectors, XPath expressions, or brittle test scripts. Just tell your AI assistant what you want to do on a web page, and ChromeTools MCP makes it happen.
+
+ ## Why ChromeTools MCP?
+
+ **For AI Agents & Developers:**
+ - 🎯 **54 specialized tools** for browser automation - from simple clicks to Figma comparisons
+ - 🧠 **APOM (Agent Page Object Model)** - AI-friendly page representation (~8-10k tokens vs 15-25k for screenshots)
+ - 🔄 **Persistent browser sessions** - pages stay open between commands for iterative workflows
+ - ⚡ **Framework-aware** - handles React, Vue, Angular events and state updates automatically
+ - 📸 **Visual testing** - compare designs pixel-by-pixel with Figma integration
+ - 🎬 **Scenario recording** - record browser actions, replay them, or export as Playwright/Selenium tests
+ - 🌍 **Cross-platform** - works seamlessly on Windows, WSL, Linux, and macOS
+
+ **Perfect for:**
+ - 🤖 Building AI agents that interact with web applications
+ - 🧪 Automated testing without writing code - let AI generate tests from scenarios
+ - 🔍 Web scraping and data extraction with natural language instructions
+ - 🎨 Design validation - compare implemented UI with Figma designs
+ - 🚀 Rapid prototyping - test user flows by describing them to AI
+ - 📊 Monitoring and health checks for web applications
+
+ Stop writing brittle automation scripts. Start describing what you want in plain English.

  ## Installation

@@ -152,7 +175,7 @@ The Chrome Extension is **required** for scenario recording and other advanced f
  **Step 3:** Download and Extract the Extension

  **Option A - Download from GitHub (Recommended):**
- 1. Download the extension archive: [chrome-extension.zip](https://github.com/modelcontextprotocol/servers/raw/main/src/chrometools/chrome-extension.zip)
+ 1. Download the extension archive: [chrome-extension.zip](https://github.com/docentovich/chrometools-mcp/raw/main/chrome-extension.zip)
  2. Extract the ZIP file to a folder on your computer
  3. Remember the extraction path (you'll need it in the next step)

@@ -197,7 +220,7 @@ The Chrome Extension is **required** for scenario recording and other advanced f
  - [Chrome Extension Setup](#chrome-extension-setup)
  - [AI Optimization Features](#ai-optimization-features)- [Scenario Recorder](#scenario-recorder) - Visual UI-based recording with smart optimization
  - [Available Tools](#available-tools) - **46+ Tools Total**
- - [AI-Powered Tools](#ai-powered-tools) - smartFindElement, analyzePage, getElementDetails, getAllInteractiveElements, findElementsByText
+ - [AI-Powered Tools](#ai-powered-tools) - smartFindElement, analyzePage, getElementDetails, findElementsByText
  - [Core Tools](#1-core-tools) - ping, openBrowser
  - [Interaction Tools](#2-interaction-tools) - click, type, scrollTo, selectOption, selectFromGroup, drag, scrollHorizontal
  - [Inspection Tools](#3-inspection-tools) - getElement, getComputedCss, getBoxModel, screenshot
@@ -238,7 +261,7 @@ AI: smartFindElement("login button")
  1. **`analyzePage`** - 🔥 **USE FREQUENTLY** - Get current page state after loads, clicks, submissions (cached, use refresh:true)
  2. **`smartFindElement`** - Natural language element search with multilingual support
  3. **AI Hints** - Automatic context in all tools (page type, available actions, suggestions)
- 4. **Batch helpers** - `getAllInteractiveElements`, `findElementsByText`
+ 4. **Text search** - `findElementsByText` for finding elements by visible text

  **Performance:** 3-5x faster, 5-10x fewer requests

@@ -438,12 +461,6 @@ executeScenario({ name: "login_flow", parameters: { email: "user@test.com" } })
  getElementDetails({ id: "container_123", analyzeChildren: true, refresh: true }) // Analyze modal contents with children tree
  ```

- #### getAllInteractiveElements
- Get all clickable/fillable elements with their selectors.
- - **Parameters**:
-   - `includeHidden` (optional): Include hidden elements (default: false)
- - **Returns**: Array of all interactive elements with selectors and metadata
-
  #### findElementsByText
  Find elements by their visible text content.
  - **Parameters**:
@@ -1431,11 +1448,11 @@ Each tool definition is sent to the AI in every request, consuming context token
  | `interaction` | User interaction | `click`, `type`, `scrollTo`, `waitForElement`, `hover` (5) |
  | `inspection` | Page inspection | `getComputedCss`, `getBoxModel`, `screenshot`, `saveScreenshot` (4) |
  | `debug` | Debugging & network | `getConsoleLogs`, `listNetworkRequests`, `getNetworkRequest`, `filterNetworkRequests` (4) |
- | `advanced` | Advanced automation & AI | `executeScript`, `setStyles`, `setViewport`, `getViewport`, `navigateTo`, `smartFindElement`, `analyzePage`, `getAllInteractiveElements`, `findElementsByText` (9) |
+ | `advanced` | Advanced automation & AI | `executeScript`, `setStyles`, `setViewport`, `getViewport`, `navigateTo`, `smartFindElement`, `analyzePage`, `findElementsByText` (8) |
  | `recorder` | Scenario recording | `enableRecorder`, `executeScenario`, `listScenarios`, `searchScenarios`, `getScenarioInfo`, `deleteScenario`, `exportScenarioAsCode`, `appendScenarioToFile`, `generatePageObject` (9) |
  | `figma` | Figma integration | `getFigmaFrame`, `compareFigmaToElement`, `getFigmaSpecs`, `parseFigmaUrl`, `listFigmaPages`, `searchFigmaFrames`, `getFigmaComponents`, `getFigmaStyles`, `getFigmaColorPalette`, `convertFigmaToCode` (10) |

- **Total:** 43 tools across 7 groups
+ **Total:** 42 tools across 7 groups

  **Configuration:**

@@ -1603,7 +1620,7 @@ npx @modelcontextprotocol/inspector node index.js
  - Interaction: click, type, scrollTo, selectOption, selectFromGroup, drag, scrollHorizontal
  - Inspection: getElement, getComputedCss, getBoxModel, screenshot, saveScreenshot
  - Advanced: executeScript, getConsoleLogs, listNetworkRequests, getNetworkRequest, filterNetworkRequests, hover, setStyles, setViewport, getViewport, navigateTo, waitForElement
- - AI-Powered: smartFindElement, analyzePage, getElementDetails (with children analysis), getAllInteractiveElements, findElementsByText - Recorder: enableRecorder, executeScenario, listScenarios, searchScenarios, getScenarioInfo, deleteScenario, exportScenarioAsCode, appendScenarioToFile, generatePageObject
+ - AI-Powered: smartFindElement, analyzePage, getElementDetails (with children analysis), findElementsByText - Recorder: enableRecorder, executeScenario, listScenarios, searchScenarios, getScenarioInfo, deleteScenario, exportScenarioAsCode, appendScenarioToFile, generatePageObject
  - Figma: getFigmaFrame, compareFigmaToElement, getFigmaSpecs, parseFigmaUrl, listFigmaPages, searchFigmaFrames, getFigmaComponents, getFigmaStyles, getFigmaColorPalette, convertFigmaToCode
  - **UI Framework Detection**: Automatic detection of MUI, Ant Design, Chakra UI, Bootstrap, Vuetify, Semantic UI- **Smart Dropdown Handling**: Extracts options from both native `<select>` and custom UI framework components- **APOM (Agent Page Object Model)**: Automatic element ID assignment for reliable interaction - `analyzePage()` returns elements with unique IDs (e.g., `input_20`, `button_45`)
  - Use `id` parameter in click/type/hover/selectOption for stable targeting
package/README.ru.md CHANGED
@@ -1,8 +1,29 @@
  # chrometools-mcp

- MCP server for Chrome automation using Puppeteer with persistent browser sessions.
+ > 🌐 [English version](./README.md)

- [English version](README.md)
+ **Chrome automation through natural language for AI.** Forget about fighting with CSS selectors, XPath expressions, and brittle test scripts. Just tell your AI assistant what you want to do on a web page, and ChromeTools MCP will do it.
+
+ ## Why ChromeTools MCP?
+
+ **For AI agents and developers:**
+ - 🎯 **54 specialized tools** for browser automation - from simple clicks to Figma comparisons
+ - 🧠 **APOM (Agent Page Object Model)** - a page representation for AI (~8-10k tokens vs 15-25k for screenshots)
+ - 🔄 **Persistent browser sessions** - pages stay open between commands for iterative work
+ - ⚡ **Framework support** - automatically handles React, Vue, Angular events and state
+ - 📸 **Visual testing** - pixel-by-pixel comparison of the design against Figma mockups
+ - 🎬 **Scenario recording** - record browser actions, replay them, or export them to Playwright/Selenium
+ - 🌍 **Cross-platform** - works on Windows, WSL, Linux, and macOS
+
+ **Perfect for:**
+ - 🤖 Building AI agents that interact with web applications
+ - 🧪 Automated testing without writing code - let AI generate tests from scenarios
+ - 🔍 Web scraping and data extraction using natural language
+ - 🎨 Design validation - comparing the implemented UI with the Figma design
+ - 🚀 Rapid prototyping - testing user flows by describing them
+ - 📊 Monitoring and health checks for web applications
+
+ Stop writing brittle automation scripts. Start describing what you want in plain language.

  ## Installation

@@ -91,7 +112,7 @@ npx chrometools-mcp
  **Step 3:** Download and extract the extension

  **Option A - Download from GitHub (Recommended):**
- 1. Download the extension archive: [chrome-extension.zip](https://github.com/modelcontextprotocol/servers/raw/main/src/chrometools/chrome-extension.zip)
+ 1. Download the extension archive: [chrome-extension.zip](https://github.com/docentovich/chrometools-mcp/raw/main/chrome-extension.zip)
  2. Extract the ZIP file to a folder on your computer
  3. Remember the extraction path (you will need it in the next step)

package/index.js CHANGED
@@ -451,7 +451,7 @@ async function executeToolInternal(name, args) {
  // Use input model to handle the element appropriately
  const model = await getInputModel(element, page);
  const options = {
-   delay: validatedArgs.delay || 0,
+   delay: validatedArgs.delay !== undefined ? validatedArgs.delay : 30,
    clearFirst: validatedArgs.clearFirst !== undefined ? validatedArgs.clearFirst : true,
  };

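The added line uses an explicit `undefined` check rather than `||`, which keeps an intentional `delay: 0` usable now that the default is 30 ms. A minimal sketch of the difference follows; `resolveDelayOld`, `resolveDelayNaive`, and `resolveDelayNew` are hypothetical helper names that only mirror the expressions in the hunk above (plus one naive variant for contrast) and are not part of the package.

```javascript
// Hypothetical helpers for illustration; only the expressions come from the diff.
function resolveDelayOld(args) {
  // Removed line: `||` treats 0 as falsy, harmless while the default was 0
  return args.delay || 0;
}

function resolveDelayNaive(args) {
  // NOT in the diff: what would happen if only the default had been changed
  return args.delay || 30;
}

function resolveDelayNew(args) {
  // Added line: only a missing value falls back to the 30 ms default
  return args.delay !== undefined ? args.delay : 30;
}

console.log(resolveDelayOld({}));              // 0
console.log(resolveDelayNaive({ delay: 0 }));  // 30 (explicit 0 is lost)
console.log(resolveDelayNew({}));              // 30
console.log(resolveDelayNew({ delay: 0 }));    // 0
console.log(resolveDelayNew({ delay: 100 }));  // 100
```

Callers that relied on the old zero-delay behavior would presumably need to pass `delay: 0` explicitly after this change.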
@@ -2250,54 +2250,6 @@ Start coding now.`;
      };
    }

-   if (name === "getAllInteractiveElements") {
-     const validatedArgs = schemas.GetAllInteractiveElementsSchema.parse(args);
-     const page = await getLastOpenPage();
-
-     const elements = await page.evaluate((includeHidden, utilsCode) => {
-       eval(utilsCode);
-
-       const results = [];
-       const selector = 'button, a[href], input, select, textarea, [onclick], [role="button"], [tabindex]:not([tabindex="-1"])';
-
-       document.querySelectorAll(selector).forEach(el => {
-         const isVisible = el.offsetWidth > 0 && el.offsetHeight > 0;
-
-         if (!includeHidden && !isVisible) return;
-
-         const text = (el.textContent || el.value || el.getAttribute('aria-label') || el.placeholder || '').trim();
-
-         results.push({
-           selector: getUniqueSelectorInPage(el),
-           type: el.tagName.toLowerCase(),
-           text: text.substring(0, 100),
-           visible: isVisible,
-           attributes: {
-             id: el.id || null,
-             class: el.className || null,
-             role: el.getAttribute('role') || null,
-             type: el.type || null,
-           }
-         });
-       });
-
-       return results;
-     }, validatedArgs.includeHidden || false, elementFinderUtils);
-
-     return {
-       content: [{
-         type: 'text',
-         text: JSON.stringify({
-           count: elements.length,
-           elements,
-           hints: {
-             suggestion: 'Use these selectors directly with click, type, or other tools'
-           }
-         }, null, 2)
-       }]
-     };
-   }
-
    if (name === "findElementsByText") {
      const validatedArgs = schemas.FindElementsByTextSchema.parse(args);
      const page = await getLastOpenPage();
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "chrometools-mcp",
-   "version": "3.2.4",
+   "version": "3.2.6",
    "description": "MCP (Model Context Protocol) server for Chrome automation using Puppeteer. Persistent browser sessions, UI framework detection (MUI, Ant Design, etc.), Page Object support, visual testing, Figma comparison. Works seamlessly in WSL, Linux, macOS, and Windows.",
    "type": "module",
    "main": "index.js",
package/pom/apom-tree-converter.js CHANGED
@@ -231,13 +231,26 @@ function buildAPOMTree(interactiveOnly = true) {

    /**
     * Check if element is visible
+    * More reliable check that works with position:fixed elements (Angular Material, etc.)
     */
    function isVisible(el) {
-     if (!el.offsetParent && el !== document.body) return false;
+     // Check dimensions first (works for fixed position elements)
+     if (el.offsetWidth === 0 || el.offsetHeight === 0) return false;
+
+     // Check computed styles
      const style = window.getComputedStyle(el);
-     return style.display !== 'none' &&
-       style.visibility !== 'hidden' &&
-       style.opacity !== '0';
+     if (style.display === 'none' ||
+         style.visibility === 'hidden' ||
+         style.opacity === '0') {
+       return false;
+     }
+
+     // For body element, always consider visible if dimensions > 0
+     if (el === document.body) return true;
+
+     // Additional check: element should be in viewport or have offsetParent
+     // This handles elements inside position:fixed containers (Angular Material)
+     return el.offsetParent !== null || style.position === 'fixed' || style.position === 'sticky';
    }

    /**
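The effect of the reordered checks can be reproduced outside a browser. The sketch below is a simplified model rather than the package's code: `isVisibleOld` and `isVisibleNew` are hypothetical functions that take a plain object carrying the same fields the real `isVisible()` reads from the DOM (`offsetWidth`, `offsetHeight`, `offsetParent`, and a computed-style-like `style` object), and the `document.body` special case is left out. It relies on the fact that `offsetParent` is `null` for an element whose computed position is `fixed`.

```javascript
// Simplified, DOM-free model of the two checks. Field names mirror the DOM
// properties used in the hunk; the functions themselves are illustrative only.
function styleAllowsVisibility(style) {
  return style.display !== 'none' &&
    style.visibility !== 'hidden' &&
    style.opacity !== '0';
}

function isVisibleOld(el) {
  // Old logic: a null offsetParent rejects the element outright
  if (!el.offsetParent) return false;
  return styleAllowsVisibility(el.style);
}

function isVisibleNew(el) {
  // New logic: dimensions first, then styles, then offsetParent or fixed/sticky
  if (el.offsetWidth === 0 || el.offsetHeight === 0) return false;
  if (!styleAllowsVisibility(el.style)) return false;
  return el.offsetParent !== null ||
    el.style.position === 'fixed' ||
    el.style.position === 'sticky';
}

// A position: fixed overlay: it has a size, but its offsetParent is null
const overlay = {
  offsetWidth: 320,
  offsetHeight: 200,
  offsetParent: null,
  style: { display: 'block', visibility: 'visible', opacity: '1', position: 'fixed' },
};

console.log(isVisibleOld(overlay)); // false
console.log(isVisibleNew(overlay)); // true
```

If the tree builder skips the subtree of any element it considers invisible (an assumption, since that code is not part of this diff), rejecting the overlay container would also hide everything inside it, which would be consistent with the `interactiveCount: 0` symptom described in the changelog.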
@@ -682,9 +695,17 @@ function buildAPOMTree(interactiveOnly = true) {
     * Excludes framework-specific dynamic attributes (React, Vue, Angular)
     */
    function generateSelector(element) {
-     // Use ID if available and unique
-     if (element.id && document.querySelectorAll(`#${element.id}`).length === 1) {
-       return `#${element.id}`;
+     // Use ID if available, valid (not starting with digit), and unique
+     // CSS selectors don't support IDs starting with digits (e.g., #301178 is invalid)
+     if (element.id && !/^[0-9]/.test(element.id)) {
+       try {
+         const selector = `#${CSS.escape(element.id)}`;
+         if (document.querySelectorAll(selector).length === 1) {
+           return selector;
+         }
+       } catch (e) {
+         // Invalid selector, continue to other strategies
+       }
      }

      // Try to find stable class name (excluding framework-specific dynamic classes)
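The hunk above skips IDs that start with a digit and falls through to the other selector strategies. One alternative, shown here purely as a hypothetical sketch and not something the package does, is to fall back to an attribute selector, which takes the ID as a quoted string and therefore accepts a leading digit. `idToSelector` is an invented name, and the sketch deliberately avoids `CSS.escape` so that it runs outside a browser.

```javascript
// Hypothetical helper: build a selector string for an element ID.
function idToSelector(id) {
  if (!id) return null;
  // Plain identifiers (letter or underscore first) are safe in the short #id form
  if (/^[A-Za-z_][A-Za-z0-9_-]*$/.test(id)) return `#${id}`;
  // Everything else, including IDs that start with a digit, uses a quoted
  // attribute selector; backslashes and double quotes are escaped for the string
  const escaped = id.replace(/\\/g, '\\\\').replace(/"/g, '\\"');
  return `[id="${escaped}"]`;
}

console.log(idToSelector('email'));   // #email
console.log(idToSelector('301178'));  // [id="301178"]
```

Note that `CSS.escape` itself escapes a leading digit as a code point, so the regex guard added in the diff is a conservative choice that trades ID-based selectors for simplicity on such elements.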
File without changes
package/server/tool-definitions.js CHANGED
@@ -49,7 +49,7 @@ export const toolDefinitions = [
        id: { type: "string", description: "APOM element ID from analyzePage (e.g., 'input_20'). Either id or selector required." },
        selector: { type: "string", description: "CSS selector (e.g., '#email'). Either id or selector required." },
        text: { type: "string", description: "Text to type" },
-       delay: { type: "number", description: "Keystroke delay ms (default: 0)" },
+       delay: { type: "number", description: "Keystroke delay ms (default: 30)" },
        clearFirst: { type: "boolean", description: "Clear first (default: true)" },
      },
      required: ["text"],
@@ -503,16 +503,6 @@ export const toolDefinitions = [
      required: ["id"],
    },
  },
- {
-   name: "getAllInteractiveElements",
-   description: "Get all interactive elements with selectors. For understanding available actions.",
-   inputSchema: {
-     type: "object",
-     properties: {
-       includeHidden: { type: "boolean", description: "Include hidden (default: false)" },
-     },
-   },
- },
  {
    name: "findElementsByText",
    description: "Find elements by visible text content and get their selectors. Use this INSTEAD of executeScript when you need to find elements. Returns working selectors that can be used with click/type tools. Can optionally perform actions directly.",
package/server/tool-groups.js CHANGED
@@ -24,7 +24,6 @@ export const toolGroups = {
    'getViewport',
    'smartFindElement',
    'analyzePage',
-   'getAllInteractiveElements',
    'findElementsByText'
  ],

package/server/tool-schemas.js CHANGED
@@ -29,7 +29,7 @@ export const TypeSchema = z.object({
    id: z.string().optional().describe("APOM element ID from analyzePage (e.g., 'input_20', 'input_33'). Mutually exclusive with selector."),
    selector: z.string().optional().describe("CSS selector for input element. Mutually exclusive with id."),
    text: z.string().describe("Text to type"),
-   delay: z.number().optional().describe("Delay between keystrokes in ms (default: 0)"),
+   delay: z.number().optional().describe("Delay between keystrokes in ms (default: 30)"),
    clearFirst: z.boolean().optional().describe("Clear field before typing (default: true)"),
  }).refine(data => (data.id && !data.selector) || (!data.id && data.selector), {
    message: "Either 'id' or 'selector' must be provided, but not both"
@@ -269,10 +269,6 @@ export const GetElementDetailsSchema = z.object({
    refresh: z.boolean().optional().describe("Force refresh of cached analysis (default: false)"),
  });

- export const GetAllInteractiveElementsSchema = z.object({
-   includeHidden: z.boolean().optional().describe("Include hidden elements (default: false)"),
- });
-
  export const FindElementsByTextSchema = z.object({
    text: z.string().describe("Text to search for in elements"),
    exact: z.boolean().optional().describe("Exact match only (default: false)"),