terminator-mcp-agent 0.12.17 → 0.13.0

Files changed (2)
  1. package/README.md +163 -11
  2. package/package.json +5 -5
package/README.md CHANGED
@@ -2,14 +2,29 @@

  <!-- BADGES:START -->

- [<img alt="Install in VS Code" src="https://img.shields.io/badge/VS_Code-VS_Code?style=flat-square&label=Install%20Server&color=0098FF">](https://insiders.vscode.dev/redirect?url=vscode%3Amcp%2Finstall%3F%257B%2522terminator-mcp-agent%2522%253A%257B%2522command%2522%253A%2522npx%2522%252C%2522args%2522%253A%255B%2522-y%2522%252C%2522terminator-mcp-agent%2540latest%2522%255D%257D%257D)
- [<img alt="Install in VS Code Insiders" src="https://img.shields.io/badge/VS_Code_Insiders-VS_Code_Insiders?style=flat-square&label=Install%20Server&color=24bfa5">](https://insiders.vscode.dev/redirect?url=vscode-insiders%3Amcp%2Finstall%3F%257B%2522terminator-mcp-agent%2522%253A%257B%2522command%2522%253A%2522npx%2522%252C%2522args%2522%253A%255B%2522-y%2522%252C%2522terminator-mcp-agent%2540latest%2522%255D%257D%257D)
- [<img alt="Install in Cursor" src="https://img.shields.io/badge/Cursor-Cursor?style=flat-square&label=Install%20Server&color=22272e">](https://cursor.com/install-mcp?name=terminator-mcp-agent&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsInRlcm1pbmF0b3ItbWNwLWFnZW50QGxhdGVzdCJdfQ%3D%3D)
+ [<img alt="Install in VS Code" src="https://img.shields.io/badge/VS_Code-VS_Code?style=flat-square&label=Install%20Server&color=0098FF">](https://insiders.vscode.dev/redirect?url=vscode%3Amcp%2Finstall%3F%257B%2522terminator-mcp-agent%2522%253A%257B%2522command%2522%253A%2522npx%2522%252C%2522args%2522%253A%255B%2522-y%2522%252C%2522terminator-mcp-agent%2522%255D%257D%257D)
+ [<img alt="Install in VS Code Insiders" src="https://img.shields.io/badge/VS_Code_Insiders-VS_Code_Insiders?style=flat-square&label=Install%20Server&color=24bfa5">](https://insiders.vscode.dev/redirect?url=vscode-insiders%3Amcp%2Finstall%3F%257B%2522terminator-mcp-agent%2522%253A%257B%2522command%2522%253A%2522npx%2522%252C%2522args%2522%253A%255B%2522-y%2522%252C%2522terminator-mcp-agent%2522%255D%257D%257D)

  <!-- BADGES:END -->

  A Model Context Protocol (MCP) server that provides desktop GUI automation capabilities using the [Terminator](https://github.com/mediar-ai/terminator) library. This server enables LLMs and agentic clients to interact with Windows, macOS, and Linux applications through structured accessibility APIs—no vision models or screenshots required.

+ ## Quick Install
+
+ ### Cursor
+
+ Copy and paste this URL into your browser's address bar:
+
+ ```
+ cursor://anysphere.cursor-deeplink/mcp/install?name=terminator-mcp-agent&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsInRlcm1pbmF0b3ItbWNwLWFnZW50Il19
+ ```
+
+ Or install manually:
+
+ 1. Open Cursor Settings (`Cmd/Ctrl + ,`)
+ 2. Go to the MCP tab
+ 3. Add server with command: `npx -y terminator-mcp-agent`
+
  ### HTTP Endpoints (when running with `-t http`)

  - `GET /health`: Always returns 200 while the process is alive.
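
Note on the hunk above: the `config` query parameter in the new Cursor deeplink is base64-encoded JSON. The sketch below decodes it to show what the link would install. It assumes Node.js (for the built-in `Buffer` and `URL`), and `decodeInstallConfig` is an illustrative name, not something shipped by the package.

```javascript
// Decode the base64 `config` parameter of the Cursor deeplink added in 0.13.0.
// Assumption: runs under Node.js; `decodeInstallConfig` is a hypothetical helper.
const deeplink =
  "cursor://anysphere.cursor-deeplink/mcp/install?name=terminator-mcp-agent&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsInRlcm1pbmF0b3ItbWNwLWFnZW50Il19";

function decodeInstallConfig(link) {
  // URL parses custom schemes too, so searchParams is available here.
  const params = new URL(link).searchParams;
  const json = Buffer.from(params.get("config"), "base64").toString("utf8");
  return { name: params.get("name"), config: JSON.parse(json) };
}

const decoded = decodeInstallConfig(deeplink);
console.log(JSON.stringify(decoded.config));
// prints {"command":"npx","args":["-y","terminator-mcp-agent"]}
```

The decoded arguments no longer carry the `@latest` suffix that the removed Cursor badge encoded, which matches the change to the VS Code badge URLs in the same hunk.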
@@ -102,15 +117,15 @@ Tool call wrapper format (`workflow.json`):
  }
  ```

- **JavaScript Execution in Workflows:**
+ **Code Execution in Workflows (engine mode):**

- Execute custom JavaScript code with access to desktop automation APIs:
+ Execute custom JavaScript or Python with access to desktop automation APIs via `run_command`:

  ```yaml
  steps:
-   - tool_name: run_javascript
+   - tool_name: run_command
      arguments:
-       engine: "nodejs"
+       engine: "javascript"
        script: |
          // Access desktop automation APIs
          const elements = await desktop.locator('role:button').all();
@@ -212,6 +227,140 @@ For simpler tasks, you can record your own actions to generate a baseline workfl
  3. **Stop and Save**: Call `record_workflow` with `action: "stop"`. This returns a complete workflow JSON file containing all your recorded actions.
  4. **Refine and Parse**: The recorded workflow is a great starting point. You can then refine the selectors for robustness, add a final step to capture the UI tree, and attach an `output_parser` to extract structured data, just as you would in the iterative workflow.

+ ### Browser DOM Inspection
+
+ The `execute_browser_script` tool enables direct JavaScript execution in browser contexts, providing access to the full HTML DOM. This is particularly useful when you need information not available in the accessibility tree.
+
+ #### When to Use DOM vs Accessibility Tree
+
+ **Use Accessibility Tree (default) when:**
+ - Navigating and interacting with UI elements
+ - Working with semantic page structure
+ - Building reliable automation workflows
+ - Performance is critical (faster, cleaner data)
+
+ **Use DOM Inspection when:**
+ - Extracting data attributes, meta tags, or hidden inputs
+ - Debugging why elements aren't appearing in accessibility tree
+ - Scraping structured data from specific HTML patterns
+ - Validating complete page structure or SEO elements
+
+ #### Basic DOM Retrieval Patterns
+
+ ```javascript
+ // Get full HTML DOM (be mindful of size limits)
+ execute_browser_script({
+   selector: "role:Window|name:Google Chrome",
+   script: "document.documentElement.outerHTML"
+ })
+
+ // Get structured page information
+ execute_browser_script({
+   selector: "role:Window|name:Google Chrome",
+   script: `({
+     url: window.location.href,
+     title: document.title,
+     html: document.documentElement.outerHTML,
+     bodyText: document.body.innerText.substring(0, 1000)
+   })`
+ })
+
+ // Extract specific data (forms, hidden inputs, meta tags)
+ execute_browser_script({
+   selector: "role:Window|name:Google Chrome",
+   script: `({
+     forms: Array.from(document.forms).map(f => ({
+       id: f.id,
+       action: f.action,
+       method: f.method,
+       inputs: Array.from(f.elements).map(e => ({
+         name: e.name,
+         type: e.type,
+         value: e.type === 'password' ? '[REDACTED]' : e.value
+       }))
+     })),
+     hiddenInputs: Array.from(document.querySelectorAll('input[type="hidden"]')).map(e => ({
+       name: e.name,
+       value: e.value
+     })),
+     metaTags: Array.from(document.querySelectorAll('meta')).map(m => ({
+       name: m.name || m.property,
+       content: m.content
+     }))
+   })`
+ })
+ ```
+
+ #### Handling Large DOMs
+
+ The MCP protocol has response size limits (~30KB). For large DOMs, use truncation strategies:
+
+ ```javascript
+ execute_browser_script({
+   selector: "role:Window|name:Google Chrome",
+   script: `
+     const html = document.documentElement.outerHTML;
+     const maxLength = 30000;
+
+     ({
+       url: window.location.href,
+       title: document.title,
+       html: html.length > maxLength
+         ? html.substring(0, maxLength) + '... [truncated at ' + maxLength + ' chars]'
+         : html,
+       totalLength: html.length,
+       truncated: html.length > maxLength
+     })
+   `
+ })
+ ```
+
+ #### Advanced DOM Analysis
+
+ ```javascript
+ // Analyze page structure and extract semantic content
+ execute_browser_script({
+   selector: "role:Window|name:Google Chrome",
+   script: `
+     // Remove scripts and styles for cleaner analysis
+     const clonedDoc = document.documentElement.cloneNode(true);
+     clonedDoc.querySelectorAll('script, style, noscript').forEach(el => el.remove());
+
+     ({
+       // Page metrics
+       domElementCount: document.querySelectorAll('*').length,
+       formCount: document.forms.length,
+       linkCount: document.links.length,
+       imageCount: document.images.length,
+
+       // Semantic structure
+       headings: Array.from(document.querySelectorAll('h1,h2,h3')).map(h => ({
+         level: h.tagName,
+         text: h.innerText.substring(0, 100)
+       })),
+
+       // Clean HTML without scripts/styles
+       cleanHtml: clonedDoc.outerHTML.substring(0, 20000),
+
+       // Data extraction
+       jsonLd: Array.from(document.querySelectorAll('script[type="application/ld+json"]'))
+         .map(s => { try { return JSON.parse(s.textContent); } catch { return null; } })
+         .filter(Boolean)
+     })
+   `
+ })
+ ```
+
+ #### Important Notes
+
+ 1. **Chrome Extension Required**: The `execute_browser_script` tool requires the Terminator browser extension to be installed. See the installation workflow examples for automated setup.
+
+ 2. **Security Considerations**: Be cautious when extracting sensitive data. The examples above redact password fields and you should follow similar practices.
+
+ 3. **Performance**: DOM operations are synchronous and can be slow on large pages. Consider using specific selectors rather than traversing the entire DOM.
+
+ 4. **Error Handling**: Always wrap complex DOM operations in try-catch blocks and return meaningful error messages.
+
  ## Local Development

  To build and test the agent from the source code:
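
Note on the hunk above: the truncation pattern from the added "Handling Large DOMs" section can be factored into a small reusable function. This is a minimal sketch, not code from the package. `truncateForResponse` is a hypothetical name, and the 30000 default simply mirrors the `maxLength` used in the README example.

```javascript
// Illustrative helper (not part of terminator-mcp-agent): cap a string before
// returning it from a browser script, mirroring the README's 30000-character cutoff.
function truncateForResponse(text, maxLength = 30000) {
  const truncated = text.length > maxLength;
  return {
    text: truncated
      ? text.substring(0, maxLength) + "... [truncated at " + maxLength + " chars]"
      : text,
    totalLength: text.length, // length before truncation
    truncated,
  };
}

const small = truncateForResponse("<html></html>");
const large = truncateForResponse("x".repeat(50000), 100);
console.log(small.truncated, large.truncated, large.totalLength);
```

Note that `String.prototype.length` counts UTF-16 code units rather than bytes, so a 30000-character cutoff can exceed 30KB on pages with many non-ASCII characters. A lower limit may be safer there.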
@@ -294,11 +443,14 @@ terminator mcp run workflow.yml --url http://localhost:3000/mcp
  **Solution**: Verify JavaScript execution and API access:

  ```bash
- # Test basic JavaScript execution
- terminator mcp exec run_javascript '{"script": "return {test: true};"}'
+ # Test basic JavaScript execution via run_command engine mode
+ terminator mcp exec run_command '{"engine": "javascript", "run": "return {test: true};"}'
+
+ # Test desktop API access with node engine
+ terminator mcp exec run_command '{"engine": "node", "run": "const elements = await desktop.locator(\\\"role:button\\\").all(); return {count: elements.length};"}'

- # Test desktop API access with nodejs engine
- terminator mcp exec run_javascript '{"engine": "nodejs", "script": "const elements = await desktop.locator(\"role:button\").all(); return {count: elements.length};"}'
+ # Test Python engine
+ terminator mcp exec run_command '{"engine": "python", "run": "return {\\\"py\\\": True}"}'

  # Debug with verbose logging
  terminator mcp run workflow.yml --verbose
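
Note on the hunk above: the new commands hand-escape quotes inside a JSON argument. One way to avoid manual escaping is to generate the payload with `JSON.stringify`, as sketched below. The `engine` and `run` keys are taken from the commands in the diff. `buildRunCommandArgs` and `shellSingleQuote` are hypothetical helpers, and how the CLI itself handles escaping is not verified here.

```javascript
// Illustrative sketch: build the JSON argument for `terminator mcp exec run_command`
// with JSON.stringify instead of escaping quotes by hand.
// Both function names are hypothetical and not part of the package.
function buildRunCommandArgs(engine, code) {
  return JSON.stringify({ engine, run: code });
}

// Wrap a string in single quotes for a POSIX shell, escaping embedded single quotes.
function shellSingleQuote(value) {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

const code =
  'const elements = await desktop.locator("role:button").all(); return {count: elements.length};';
const payload = buildRunCommandArgs("node", code);
console.log("terminator mcp exec run_command " + shellSingleQuote(payload));
```

Parsing the payload back with `JSON.parse` returns the original script unchanged, which is a quick way to confirm the escaping is well-formed before passing it to the CLI.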
package/package.json CHANGED
@@ -15,10 +15,10 @@
    ],
    "name": "terminator-mcp-agent",
    "optionalDependencies": {
-     "terminator-mcp-darwin-arm64": "0.12.17",
-     "terminator-mcp-darwin-x64": "0.12.17",
-     "terminator-mcp-linux-x64-gnu": "0.12.17",
-     "terminator-mcp-win32-x64-msvc": "0.12.17"
+     "terminator-mcp-darwin-arm64": "0.13.0",
+     "terminator-mcp-darwin-x64": "0.13.0",
+     "terminator-mcp-linux-x64-gnu": "0.13.0",
+     "terminator-mcp-win32-x64-msvc": "0.13.0"
    },
    "repository": {
      "type": "git",
@@ -30,5 +30,5 @@
      "sync-version": "node ./utils/sync-version.js",
      "update-badges": "node ./utils/update-badges.js"
    },
-   "version": "0.12.17"
+   "version": "0.13.0"
  }
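
Note on the `package.json` hunks above: the four platform packages under `optionalDependencies` are bumped in lockstep with `version`. The check below is a minimal sketch of how that invariant could be verified. It is not the package's own `sync-version.js`, whose contents are not shown in this diff, and `findVersionMismatches` is a hypothetical name.

```javascript
// Hypothetical check (not the package's sync-version.js): list optional
// dependencies whose pinned version differs from the package's own version.
function findVersionMismatches(pkg) {
  const deps = pkg.optionalDependencies || {};
  return Object.entries(deps)
    .filter(([, version]) => version !== pkg.version)
    .map(([name, version]) => ({ name, version }));
}

// Values copied from the 0.13.0 side of the diff.
const pkg = {
  name: "terminator-mcp-agent",
  version: "0.13.0",
  optionalDependencies: {
    "terminator-mcp-darwin-arm64": "0.13.0",
    "terminator-mcp-darwin-x64": "0.13.0",
    "terminator-mcp-linux-x64-gnu": "0.13.0",
    "terminator-mcp-win32-x64-msvc": "0.13.0",
  },
};

console.log(findVersionMismatches(pkg).length); // prints 0
```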