terminator-mcp-agent 0.15.6 → 0.15.8

Files changed (4)
  1. package/README.md +1044 -1044
  2. package/config.js +280 -280
  3. package/index.js +194 -194
  4. package/package.json +5 -5
package/README.md CHANGED
@@ -1,1044 +1,1044 @@
1
- ## Terminator MCP Agent
2
-
3
- <!-- BADGES:START -->
4
-
5
- [<img alt="Install in VS Code" src="https://img.shields.io/badge/VS_Code-VS_Code?style=flat-square&label=Install%20Server&color=0098FF">](https://insiders.vscode.dev/redirect?url=vscode%3Amcp%2Finstall%3F%257B%2522terminator-mcp-agent%2522%253A%257B%2522command%2522%253A%2522npx%2522%252C%2522args%2522%253A%255B%2522-y%2522%252C%2522terminator-mcp-agent%2522%255D%257D%257D)
6
- [<img alt="Install in VS Code Insiders" src="https://img.shields.io/badge/VS_Code_Insiders-VS_Code_Insiders?style=flat-square&label=Install%20Server&color=24bfa5">](https://insiders.vscode.dev/redirect?url=vscode-insiders%3Amcp%2Finstall%3F%257B%2522terminator-mcp-agent%2522%253A%257B%2522command%2522%253A%2522npx%2522%252C%2522args%2522%253A%255B%2522-y%2522%252C%2522terminator-mcp-agent%2522%255D%257D%257D)
7
-
8
- <!-- BADGES:END -->
9
-
10
- A Model Context Protocol (MCP) server that provides desktop GUI automation capabilities using the [Terminator](https://github.com/mediar-ai/terminator) library. This server enables LLMs and agentic clients to interact with Windows, macOS, and Linux applications through structured accessibility APIs—no vision models or screenshots required.
11
-
12
- ## Quick Install
13
-
14
- ### Claude Code
15
-
16
- Install with a single command:
17
-
18
- ```bash
19
- claude mcp add terminator "npx -y terminator-mcp-agent" -s user
20
- ```
21
-
22
- ### Cursor
23
-
24
- Copy and paste this URL into your browser's address bar:
25
-
26
- ```
27
- cursor://anysphere.cursor-deeplink/mcp/install?name=terminator-mcp-agent&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsInRlcm1pbmF0b3ItbWNwLWFnZW50Il19
28
- ```
29
-
30
- Or install manually:
31
-
32
- 1. Open Cursor Settings (`Cmd/Ctrl + ,`)
33
- 2. Go to the MCP tab
34
- 3. Add server with command: `npx -y terminator-mcp-agent`
35
-
36
- ### HTTP Endpoints (when running with `-t http`)
37
-
38
- - `GET /health`: Always returns 200 while the process is alive.
39
- - `GET /status`: Busy-aware probe for load balancers. Returns JSON and appropriate status:
40
-   - 200 when idle: `{ "busy": false, "activeRequests": 0, "maxConcurrent": 1, "lastActivity": "<ISO-8601>" }`
41
-   - 503 when busy: `{ "busy": true, "activeRequests": 1, "maxConcurrent": 1, "lastActivity": "<ISO-8601>" }`
42
-   - Content-Type is `application/json`.
43
- - `POST /mcp`: MCP execution endpoint. Enforces single-request concurrency per machine by default.
44
-
45
- Concurrency is controlled by the `MCP_MAX_CONCURRENT` environment variable (default `1`). Only accepted `POST /mcp` requests are counted toward `activeRequests`. If the server is at capacity, new `POST /mcp` requests return 503 immediately. This 503 behavior is intentional so an Azure Load Balancer probing `GET /status` can take a busy VM out of rotation and route traffic elsewhere.
46
-
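The busy-aware behavior described above can be modeled in a few lines. This is a minimal sketch, not the agent's actual implementation: `createGate`, `tryAcquire`, and `release` are hypothetical names, and treating the server as busy whenever `activeRequests >= maxConcurrent` is an assumption drawn from the two sample payloads.

```javascript
// Hypothetical model of the documented probe behavior; not the agent's real code.
function createGate(maxConcurrent = 1) {
  let activeRequests = 0;
  let lastActivity = new Date().toISOString();

  return {
    // Mirrors GET /status: 200 when idle, 503 when at capacity.
    status() {
      const busy = activeRequests >= maxConcurrent; // assumption
      return {
        code: busy ? 503 : 200,
        body: { busy, activeRequests, maxConcurrent, lastActivity },
      };
    },
    // Mirrors POST /mcp admission: reject immediately when full.
    tryAcquire() {
      if (activeRequests >= maxConcurrent) return false;
      activeRequests += 1;
      lastActivity = new Date().toISOString();
      return true;
    },
    // Called when an accepted request finishes.
    release() {
      activeRequests = Math.max(0, activeRequests - 1);
      lastActivity = new Date().toISOString();
    },
  };
}
```

A load balancer that polls the status probe would see the 503 only while a request holds the single slot, which matches the stated intent of taking a busy VM out of rotation.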
47
- ### Getting Started
48
-
49
- The easiest way to get started is to use the one-click install buttons above for your specific editor (VS Code, Cursor, etc.).
50
-
51
- Alternatively, you can install and configure the agent from your command line.
52
-
53
- **1. Install & Configure Automatically**
54
- Run the following command and select your MCP client from the list:
55
-
56
- ```sh
57
- npx -y terminator-mcp-agent@latest --add-to-app
58
- ```
59
-
60
- **2. Manual Configuration**
61
- If you prefer, you can add the following to your MCP client's settings file:
62
-
63
- ```json
64
- {
65
- "mcpServers": {
66
- "terminator-mcp-agent": {
67
- "command": "npx",
68
- "args": ["-y", "terminator-mcp-agent@latest"]
69
- }
70
- }
71
- }
72
- ```
73
-
74
- ### Command Line Interface (CLI) Execution
75
-
76
- For automation workflows and CI/CD pipelines, you can execute workflows directly from the command line using the [Terminator CLI](../terminator-cli/README.md):
77
-
78
- **Quick Start:**
79
-
80
- ```bash
81
- # Execute a workflow file
82
- terminator mcp run workflow.yml
83
-
84
- # With verbose logging
85
- terminator mcp run workflow.yml --verbose
86
-
87
- # Dry run (validate without executing)
88
- terminator mcp run workflow.yml --dry-run
89
-
90
- # Use specific MCP server version
91
- terminator mcp run workflow.yml --command "npx -y terminator-mcp-agent@latest"
92
-
93
- # Run specific steps (requires step IDs in workflow)
94
- terminator mcp run workflow.yml --start-from "step_12" --end-at "step_13"
95
-
96
- # Run single step
97
- terminator mcp run workflow.yml --start-from "read_json" --end-at "read_json"
98
- ```
99
-
100
- **Workflow File Formats:**
101
-
102
- Direct workflow format (`workflow.yml`):
103
-
104
- ```yaml
105
- steps:
106
-   - tool_name: navigate_browser
107
-     arguments:
108
-       url: "https://example.com"
109
-   - tool_name: click_element
110
-     arguments:
111
-       selector: "role:Button|name:Submit"
112
- stop_on_error: true
113
- include_detailed_results: true
114
- ```
115
-
116
- Tool call wrapper format (`workflow.json`):
117
-
118
- ```json
119
- {
120
- "tool_name": "execute_sequence",
121
- "arguments": {
122
- "steps": [
123
- {
124
- "tool_name": "navigate_browser",
125
- "arguments": {
126
- "url": "https://example.com"
127
- }
128
- }
129
- ]
130
- }
131
- }
132
- ```
133
-
134
- **Code Execution in Workflows (engine mode):**
135
-
136
- Execute custom JavaScript or Python with access to desktop automation APIs via `run_command`.
137
-
138
- **Passing Data Between Workflow Steps:**
139
-
140
- When using `engine` mode, data automatically flows between steps:
141
-
142
- ```yaml
143
- steps:
144
- # Step 1: Return data directly (NEW - simplified!)
145
- - tool_name: run_command
146
- arguments:
147
- engine: "javascript"
148
- run: |
149
- // Get file info (example)
150
- const filePath = 'C:\\data\\report.pdf';
151
- const fileSize = 1024;
152
-
153
- console.log(`Found file: ${filePath}`);
154
-
155
- // Just return fields directly - they auto-merge into env
156
- return {
157
- status: 'success',
158
- file_path: filePath, // Becomes env.file_path
159
- file_size: fileSize // Becomes env.file_size
160
- };
161
-
162
- # Step 2: Access data automatically
163
- - tool_name: run_command
164
- arguments:
165
- engine: "javascript"
166
- run: |
167
- // env is automatically available - no setup needed!
168
- console.log(`Processing: ${env.file_path} (${env.file_size} bytes)`);
169
-
170
- // Workflow variables also auto-available
171
- console.log(`Config: ${variables.max_retries}`);
172
-
173
- // NEW: Direct variable access also works!
174
- console.log(`Processing: ${file_path} (${file_size} bytes)`);
175
- console.log(`Config: ${max_retries}`);
176
-
177
- // Continue with desktop automation
178
- const elements = await desktop.locator('role:button').all();
179
-
180
- // Return more data (auto-merges to env)
181
- return {
182
- status: 'success',
183
- file_processed: env.file_path,
184
- buttons_found: elements.length
185
- };
186
- ```
187
-
188
- **Important Notes on Data Passing:**
189
-
190
- - **NEW:** `env` and `variables` are automatically injected into all scripts
191
- - **NEW:** Non-reserved fields in return values auto-merge into env (no `set_env` wrapper needed)
192
- - **NEW:** Valid env fields are also available as individual variables (e.g., `file_path` instead of `env.file_path`)
193
- - Reserved fields that don't auto-merge: `status`, `error`, `logs`, `duration_ms`, `set_env`
194
- - Data passing only works with `engine` mode (JavaScript/Python), NOT with shell commands
195
- - Backward compatible: explicit `set_env` still works if needed
196
- - Individual variable names must be valid JavaScript identifiers (no spaces, special chars, or reserved keywords)
197
- - Watch for backslash escaping issues in Windows paths (may need double escaping)
198
- - Consider combining related operations in a single step if data passing becomes complex
199
-
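The merge rules in the notes above can be illustrated with a small helper. This is a sketch under stated assumptions, not the engine's code: `mergeResultIntoEnv` is a hypothetical name, and letting an explicit `set_env` win over top-level fields with the same key is an assumption the README does not confirm.

```javascript
// Reserved fields listed in the notes above; they never auto-merge into env.
const RESERVED = new Set(["status", "error", "logs", "duration_ms", "set_env"]);

// Hypothetical helper: returns a new env with a step's return value merged in.
function mergeResultIntoEnv(env, result) {
  const next = { ...env };
  for (const [key, value] of Object.entries(result ?? {})) {
    if (!RESERVED.has(key)) next[key] = value; // auto-merge non-reserved fields
  }
  // Backward compatibility: an explicit set_env object is still honored.
  // Assumption: set_env entries override top-level fields of the same name.
  if (result && typeof result.set_env === "object" && result.set_env !== null) {
    Object.assign(next, result.set_env);
  }
  return next;
}
```

For example, merging `{ status: 'success', file_path: '...' }` adds `file_path` to the environment and leaves `status` out.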
200
- For complete CLI documentation, see [Terminator CLI README](../terminator-cli/README.md).
201
-
202
- ### Core Workflows: From Interaction to Structured Data
203
-
204
- The Terminator MCP agent offers two primary workflows for automating desktop tasks. Both lead to the same goal: automation with >95% accuracy that runs 10000x faster than a human.
205
-
206
- #### 1. Iterative Development with `execute_sequence`
207
-
208
- This is the most powerful and flexible method. You build a workflow step-by-step, using MCP tools to inspect the UI and refine your actions.
209
-
210
- 1. **Inspect the UI**: Start by using `get_focused_window_tree` to understand the structure of your target application. This gives you the roles, names, and IDs of all elements.
211
- 2. **Build a Sequence**: Create an `execute_sequence` tool call with a series of actions (`click_element`, `type_into_element`, etc.). Use robust selectors (like `role|name` or stable `properties:AutomationId:value` selectors) whenever possible.
212
- 3. **Capture the Final State**: Ensure the last step in your sequence is an action that returns a UI tree. The `wait_for_element` tool with `include_tree: true` is perfect for this, as it captures the application's state after your automation has run.
213
- 4. **Extract Structured Data with `output_parser`**: Add the `output_parser` argument to your `execute_sequence` call. Write JavaScript code to parse the final UI tree and extract structured data. If successful, the tool result will contain a `parsed_output` field with your clean JSON data.
214
-
215
- Here is an example of an `output_parser` that extracts insurance quote data from a web page:
216
-
217
- ```yaml
218
- output_parser:
219
- ui_tree_source_step_id: capture_quotes_tree
220
- javascript_code: |
221
- // Find all quote groups with Image and Text children
222
- const results = [];
223
-
224
- function findElementsRecursively(element) {
225
- if (element.attributes && element.attributes.role === 'Group') {
226
- const children = element.children || [];
227
- const hasImage = children.some(child =>
228
- child.attributes && child.attributes.role === 'Image'
229
- );
230
- const hasText = children.some(child =>
231
- child.attributes && child.attributes.role === 'Text'
232
- );
233
-
234
- if (hasImage && hasText) {
235
- const textElements = children.filter(child =>
236
- child.attributes && child.attributes.role === 'Text' && child.attributes.name
237
- );
238
-
239
- let carrierProduct = '';
240
- let monthlyPrice = '';
241
-
242
- for (const textEl of textElements) {
243
- const text = textEl.attributes.name;
244
- if (text.includes(':')) {
245
- carrierProduct = text;
246
- }
247
- if (text.startsWith('$')) {
248
- monthlyPrice = text;
249
- }
250
- }
251
-
252
- if (carrierProduct && monthlyPrice) {
253
- results.push({
254
- carrierProduct: carrierProduct,
255
- monthlyPrice: monthlyPrice
256
- });
257
- }
258
- }
259
- }
260
-
261
- if (element.children) {
262
- for (const child of element.children) {
263
- findElementsRecursively(child);
264
- }
265
- }
266
- }
267
-
268
- findElementsRecursively(tree);
269
- return results;
270
- ```
271
-
272
- #### 2. Recording Human Actions with `record_workflow`
273
-
274
- For simpler tasks, you can record your own actions to generate a baseline workflow.
275
-
276
- 1. **Start Recording**: Call `record_workflow` with `action: "start"`.
277
- 2. **Perform the Task**: Manually perform the clicks, typing, and other interactions in the target application.
278
- 3. **Stop and Save**: Call `record_workflow` with `action: "stop"`. This returns a complete workflow JSON file containing all your recorded actions.
279
- 4. **Refine and Parse**: The recorded workflow is a great starting point. You can then refine the selectors for robustness, add a final step to capture the UI tree, and attach an `output_parser` to extract structured data, just as you would in the iterative workflow.
280
-
281
- ### Browser DOM Inspection
282
-
283
- The `execute_browser_script` tool enables direct JavaScript execution in browser contexts, providing access to the full HTML DOM. This is particularly useful when you need information not available in the accessibility tree.
284
-
285
- #### When to Use DOM vs Accessibility Tree
286
-
287
- **Use Accessibility Tree (default) when:**
288
-
289
- - Navigating and interacting with UI elements
290
- - Working with semantic page structure
291
- - Building reliable automation workflows
292
- - Performance is critical (faster, cleaner data)
293
-
294
- **Use DOM Inspection when:**
295
-
296
- - Extracting data attributes, meta tags, or hidden inputs
297
- - Debugging why elements aren't appearing in accessibility tree
298
- - Scraping structured data from specific HTML patterns
299
- - Validating complete page structure or SEO elements
300
-
301
- #### Basic DOM Retrieval Patterns
302
-
303
- ```javascript
304
- // Get full HTML DOM (be mindful of size limits)
305
- execute_browser_script({
306
- selector: "role:Window|name:Google Chrome",
307
- script: "document.documentElement.outerHTML",
308
- });
309
-
310
- // Get structured page information
311
- execute_browser_script({
312
- selector: "role:Window|name:Google Chrome",
313
- script: `({
314
- url: window.location.href,
315
- title: document.title,
316
- html: document.documentElement.outerHTML,
317
- bodyText: document.body.innerText.substring(0, 1000)
318
- })`,
319
- });
320
-
321
- // Extract specific data (forms, hidden inputs, meta tags)
322
- execute_browser_script({
323
- selector: "role:Window|name:Google Chrome",
324
- script: `({
325
- forms: Array.from(document.forms).map(f => ({
326
- id: f.id,
327
- action: f.action,
328
- method: f.method,
329
- inputs: Array.from(f.elements).map(e => ({
330
- name: e.name,
331
- type: e.type,
332
- value: e.type === 'password' ? '[REDACTED]' : e.value
333
- }))
334
- })),
335
- hiddenInputs: Array.from(document.querySelectorAll('input[type="hidden"]')).map(e => ({
336
- name: e.name,
337
- value: e.value
338
- })),
339
- metaTags: Array.from(document.querySelectorAll('meta')).map(m => ({
340
- name: m.name || m.property,
341
- content: m.content
342
- }))
343
- })`,
344
- });
345
- ```
346
-
347
- #### Handling Large DOMs
348
-
349
- The MCP protocol has response size limits (~30KB). For large DOMs, use truncation strategies:
350
-
351
- ```javascript
352
- execute_browser_script({
353
- selector: "role:Window|name:Google Chrome",
354
- script: `
355
- const html = document.documentElement.outerHTML;
356
- const maxLength = 30000;
357
-
358
- ({
359
- url: window.location.href,
360
- title: document.title,
361
- html: html.length > maxLength
362
- ? html.substring(0, maxLength) + '... [truncated at ' + maxLength + ' chars]'
363
- : html,
364
- totalLength: html.length,
365
- truncated: html.length > maxLength
366
- })
367
- `,
368
- });
369
- ```
370
-
371
- #### Advanced DOM Analysis
372
-
373
- ```javascript
374
- // Analyze page structure and extract semantic content
375
- execute_browser_script({
376
- selector: "role:Window|name:Google Chrome",
377
- script: `
378
- // Remove scripts and styles for cleaner analysis
379
- const clonedDoc = document.documentElement.cloneNode(true);
380
- clonedDoc.querySelectorAll('script, style, noscript').forEach(el => el.remove());
381
-
382
- ({
383
- // Page metrics
384
- domElementCount: document.querySelectorAll('*').length,
385
- formCount: document.forms.length,
386
- linkCount: document.links.length,
387
- imageCount: document.images.length,
388
-
389
- // Semantic structure
390
- headings: Array.from(document.querySelectorAll('h1,h2,h3')).map(h => ({
391
- level: h.tagName,
392
- text: h.innerText.substring(0, 100)
393
- })),
394
-
395
- // Clean HTML without scripts/styles
396
- cleanHtml: clonedDoc.outerHTML.substring(0, 20000),
397
-
398
- // Data extraction
399
- jsonLd: Array.from(document.querySelectorAll('script[type="application/ld+json"]'))
400
- .map(s => { try { return JSON.parse(s.textContent); } catch { return null; } })
401
- .filter(Boolean)
402
- })
403
- `,
404
- });
405
- ```
406
-
407
- #### Passing Data with Environment Variables
408
-
409
- The `execute_browser_script` tool now supports passing data through `env` and `outputs` parameters:
410
-
411
- ```javascript
412
- // Step 1: Set environment variables in JavaScript
413
- run_command({
414
- engine: "javascript",
415
- run: `
416
- return {
417
- set_env: {
418
- userName: 'John Doe',
419
- userId: '12345',
420
- apiKey: 'secret-key'
421
- }
422
- };
423
- `,
424
- });
425
-
426
- // Step 2: Use environment variables in browser script
427
- execute_browser_script({
428
- selector: "role:Window",
429
- env: {
430
- userName: "{{env.userName}}",
431
- userId: "{{env.userId}}",
432
- },
433
- script: `
434
- // Parse env if it's a JSON string (for backward compatibility)
435
- const parsedEnv = typeof env === 'string' ? JSON.parse(env) : env;
436
-
437
- // Use the data - traditional way
438
- console.log('Processing user:', parsedEnv.userName);
439
-
440
- // NEW: Direct variable access also works!
441
- console.log('Processing user:', userName); // Direct access
442
- console.log('User ID:', userId); // No env prefix needed
443
-
444
- // Fill form with data
445
- document.querySelector('#username').value = userName;
446
- document.querySelector('#userid').value = userId;
447
-
448
- // Return result and set new variables
449
- JSON.stringify({
450
- status: 'form_filled',
451
- set_env: {
452
- form_submitted: 'true',
453
- timestamp: new Date().toISOString()
454
- }
455
- });
456
- `,
457
- });
458
- ```
459
-
460
- #### Loading Scripts from Files
461
-
462
- You can load JavaScript from external files using the `script_file` parameter:
463
-
464
- ```javascript
465
- // browser_scripts/extract_data.js
466
- const parsedEnv = typeof env === "string" ? JSON.parse(env) : env;
467
- const parsedOutputs =
468
- typeof outputs === "string" ? JSON.parse(outputs) : outputs;
469
-
470
- console.log("Script loaded from file");
471
- console.log("User:", parsedEnv?.userName);
472
- console.log("Previous result:", parsedOutputs?.previousStep);
473
-
474
- // Extract and return data
475
- JSON.stringify({
476
- extractedData: {
477
- url: window.location.href,
478
- title: document.title,
479
- forms: document.forms.length,
480
- },
481
- set_env: {
482
- extraction_complete: "true",
483
- },
484
- });
485
-
486
- // In your workflow:
487
- execute_browser_script({
488
- selector: "role:Window",
489
- script_file: "browser_scripts/extract_data.js",
490
- env: {
491
- userName: "{{env.userName}}",
492
- previousStep: "{{env.previousStep}}",
493
- },
494
- });
495
- ```
496
-
497
- #### Important Notes
498
-
499
- 1. **Chrome Extension Required**: The `execute_browser_script` tool requires the Terminator browser extension to be installed. See the installation workflow examples for automated setup.
500
-
501
- 2. **Security Considerations**: Be cautious when extracting sensitive data. The examples above redact password fields and you should follow similar practices.
502
-
503
- 3. **Performance**: DOM operations are synchronous and can be slow on large pages. Consider using specific selectors rather than traversing the entire DOM.
504
-
505
- 4. **Error Handling**: Always wrap complex DOM operations in try-catch blocks and return meaningful error messages.
506
-
507
- 5. **Data Injection**: When using `env` or `outputs` parameters, they are injected as JavaScript variables at the beginning of your script. Always parse them if they might be JSON strings.
508
-
509
- ## Local Development
510
-
511
- To build and test the agent from the source code:
512
-
513
- ```sh
514
- # 1. Clone the entire Terminator repository
515
- git clone https://github.com/mediar-ai/terminator
516
-
517
- # 2. Navigate to the agent's directory
518
- cd terminator/terminator-mcp-agent
519
-
520
- # 3. Install Node.js dependencies
521
- npm install
522
-
523
- # 4. Build the Rust binary and Node.js wrapper
524
- npm run build
525
-
526
- # 5. To use your local build in your MCP client, link it globally
527
- npm install --global .
528
- ```
529
-
530
- Now, when your MCP client runs `terminator-mcp-agent`, it will use your local build instead of the published `npm` version.
531
-
532
- ---
533
-
534
- ## Troubleshooting
535
-
536
- - Make sure you have Node.js installed (v16+ recommended).
537
- - For VS Code/Insiders, ensure the CLI (`code` or `code-insiders`) is available in your PATH.
538
- - If you encounter issues, try running with elevated permissions.
539
-
540
- ### Version Compatibility Issues
541
-
542
- **Problem**: "missing field `items`" or schema mismatch errors
543
-
544
- **Solution**: Ensure you're using the latest MCP server version:
545
-
546
- ```bash
547
- # Force latest version in CLI
548
- terminator mcp run workflow.yml --command "npx -y terminator-mcp-agent@latest"
549
-
550
- # Update MCP client configuration to use @latest
551
- {
552
- "mcpServers": {
553
- "terminator-mcp-agent": {
554
- "command": "npx",
555
- "args": ["-y", "terminator-mcp-agent@latest"]
556
- }
557
- }
558
- }
559
-
560
- # Clear npm cache if needed
561
- npm cache clean --force
562
- ```
563
-
564
- ### CLI Integration Issues
565
-
566
- **Problem**: CLI commands not working or connection errors
567
-
568
- **Solution**: Test MCP connectivity step by step:
569
-
570
- ```bash
571
- # Test basic connectivity
572
- terminator mcp exec get_applications
573
-
574
- # Test with verbose logging
575
- terminator mcp run workflow.yml --verbose
576
-
577
- # Test with dry run first
578
- terminator mcp run workflow.yml --dry-run
579
-
580
- # Use HTTP connection for debugging
581
- terminator mcp run workflow.yml --url http://localhost:3000/mcp
582
- ```
583
-
584
- ### JavaScript Execution Issues
585
-
586
- **Problem**: JavaScript code fails or can't access desktop APIs
587
-
588
- **Solution**: Verify JavaScript execution and API access:
589
-
590
- ```bash
591
- # Test basic JavaScript execution via run_command engine mode
592
- terminator mcp exec run_command '{"engine": "javascript", "run": "return {test: true};"}'
593
-
594
- # Test desktop API access with node engine
595
- terminator mcp exec run_command '{"engine": "node", "run": "const elements = await desktop.locator(\\\"role:button\\\").all(); return {count: elements.length};"}'
596
-
597
- # Test Python engine
598
- terminator mcp exec run_command '{"engine": "python", "run": "return {\\\"py\\\": True}"}'
599
-
600
- # Debug with verbose logging
601
- terminator mcp run workflow.yml --verbose
602
- ```
603
-
604
- ### Workflow File Issues
605
-
606
- **Problem**: Workflow parsing errors or unexpected behavior
607
-
608
- **Solution**: Validate workflow structure:
609
-
610
- ```bash
611
- # Validate workflow syntax
612
- terminator mcp run workflow.yml --dry-run
613
-
614
- # Test with minimal workflow first
615
- echo 'steps: [{tool_name: get_applications}]' > test.yml
616
- terminator mcp run test.yml
617
-
618
- # Check both YAML and JSON formats work
619
- terminator mcp run workflow.yml # YAML
620
- terminator mcp run workflow.json # JSON
621
- ```
622
-
623
- ### Platform-Specific Issues
624
-
625
- **Windows**:
626
-
627
- - Ensure Windows UI Automation APIs are available
628
- - Run with administrator privileges if accessibility features are restricted
629
- - Check Windows Defender/antivirus isn't blocking automation
630
-
631
- **macOS**:
632
-
633
- - Grant accessibility permissions in System Preferences > Security & Privacy
634
- - Ensure the terminal/IDE has accessibility access
635
- - Check macOS version compatibility (10.14+ recommended)
636
-
637
- **Linux**:
638
-
639
- - Ensure AT-SPI (assistive technology) is enabled
640
- - Install required packages: `sudo apt-get install at-spi2-core`
641
- - Check desktop environment compatibility (GNOME, KDE, XFCE supported)
642
-
643
- ### Virtual Display Support (Headless VMs)
644
-
645
- Terminator MCP Agent includes virtual display support for running on headless VMs without requiring RDP connections. This enables scalable automation on cloud platforms like Azure, AWS, and GCP.
646
-
647
- **How It Works**:
648
-
649
- The agent automatically detects headless environments and initializes a virtual display context that Windows UI Automation APIs can interact with. This allows full UI automation capabilities even when no physical display or RDP session is active.
650
-
651
- **Activation**:
652
-
653
- Virtual display activates automatically when:
654
-
655
- - Environment variable `TERMINATOR_HEADLESS=true` is set
656
- - No console window is available (common in VM/container scenarios)
657
- - Running as a Windows service or scheduled task
658
-
659
- **Configuration**:
660
-
661
- ```bash
662
- # Enable virtual display mode
663
- export TERMINATOR_HEADLESS=true
664
-
665
- # Run the MCP agent
666
- npx -y terminator-mcp-agent
667
- ```
668
-
669
- **Use Cases**:
670
-
671
- - Running multiple automation agents on VMs without RDP overhead
672
- - CI/CD pipelines in cloud environments
673
- - Scalable automation farms on Azure/AWS/GCP
674
- - Containerized automation workloads
675
-
676
- **Requirements**:
677
-
678
- - Windows Server 2016+ or Windows 10/11
679
- - .NET Framework 4.7.2+
680
- - UI Automation APIs available (included in Windows)
681
-
682
- The virtual display manager creates a memory-based display context that satisfies Windows UI Automation requirements, enabling terminator to enumerate and interact with UI elements as if a physical display were present.
683
-
684
- ### Performance Optimization
685
-
686
- **Large UI Trees**:
687
-
688
- - Use specific selectors instead of broad element searches
689
- - Implement delays between rapid operations
690
- - Consider using `include_tree: false` for intermediate steps
691
-
692
- **JavaScript Performance**:
693
-
694
- - Use `quickjs` engine for lightweight operations
695
- - Use `nodejs` engine only when full APIs are needed
696
- - Implement `sleep()` delays in loops to prevent overwhelming the UI
697
-
698
- For additional help, see the [Terminator CLI documentation](../terminator-cli/README.md) or open an issue on GitHub.
699
-
700
- ---
701
-
702
- ## 📚 Full `execute_sequence` Reference & Sample Workflow
703
-
704
- > **Why another example?** The quick start above shows the concept, but many users asked for a fully-annotated workflow schema. The example below automates the Windows **Calculator** app—so it is 100% safe to share and does **not** reveal any private customer data. Feel free to copy-paste and adapt it to your own application.
705
-
706
- ### 1. Anatomy of an `execute_sequence` Call
707
-
708
- ```jsonc
709
- {
710
- "tool_name": "execute_sequence",
711
- "arguments": {
712
- "variables": {
713
- // 1️⃣ Re-usable inputs with type metadata
714
- "app_path": {
715
- "type": "string",
716
- "label": "Calculator EXE Path",
717
- "default": "calc.exe"
718
- },
719
- "first_number": {
720
- "type": "string",
721
- "label": "First Number",
722
- "default": "42"
723
- },
724
- "second_number": {
725
- "type": "string",
726
- "label": "Second Number",
727
- "default": "8"
728
- }
729
- },
730
- "inputs": {
731
- // 2️⃣ Concrete values for *this run*
732
- "app_path": "calc.exe",
733
- "first_number": "42",
734
- "second_number": "8"
735
- },
736
- "selectors": {
737
- // 3️⃣ Human-readable element shortcuts
738
- "calc_window": "role:Window|name:Calculator",
739
- "btn_clear": "role:Button|name:Clear",
740
- "btn_plus": "role:Button|name:Plus",
741
- "btn_equals": "role:Button|name:Equals"
742
- },
743
- "steps": [
744
- // 4️⃣ Ordered actions & control flow
745
- {
746
- "tool_name": "open_application",
747
- "arguments": { "path": "${{app_path}}" }
748
- },
749
- {
750
- "tool_name": "click_element", // 4a. Make sure the UI is reset
751
- "arguments": { "selector": "${{selectors.btn_clear}}" },
752
- "continue_on_error": true
753
- },
754
- {
755
- "group_name": "Enter First Number", // 4b. Groups improve logs
756
- "steps": [
757
- {
758
- "tool_name": "type_into_element",
759
- "arguments": {
760
- "selector": "${{selectors.calc_window}}",
761
- "text_to_type": "${{first_number}}"
762
- }
763
- }
764
- ]
765
- },
766
- {
767
- "tool_name": "click_element",
768
- "arguments": { "selector": "${{selectors.btn_plus}}" }
769
- },
770
- {
771
- "group_name": "Enter Second Number",
772
- "steps": [
773
- {
774
- "tool_name": "type_into_element",
775
- "arguments": {
776
- "selector": "${{selectors.calc_window}}",
777
- "text_to_type": "${{second_number}}"
778
- }
779
- }
780
- ]
781
- },
782
- {
783
- "tool_name": "click_element",
784
- "arguments": { "selector": "${{selectors.btn_equals}}" }
785
- },
786
- {
787
- "tool_name": "wait_for_element", // 4c. Capture final UI tree
788
- "arguments": {
789
- "selector": "${{selectors.calc_window}}",
790
- "condition": "exists",
791
- "include_tree": true,
792
- "timeout_ms": 2000
793
- }
794
- }
795
- ],
796
- "output_parser": {
797
- // 5️⃣ Turn the tree into clean JSON
798
- "javascript_code": "// Extract calculator display value\nconst results = [];\n\nfunction findElementsRecursively(element) {\n if (element.attributes && element.attributes.role === 'Text') {\n const item = {\n displayValue: element.attributes.name || ''\n };\n results.push(item);\n }\n \n if (element.children) {\n for (const child of element.children) {\n findElementsRecursively(child);\n }\n }\n}\n\nfindElementsRecursively(tree);\nreturn results;"
799
- }
800
- }
801
- }
802
- ```
803
-
804
- ### 2. Key Concepts at a Glance
805
-
806
- 1. **Variables vs. Inputs** – Declare once, override per-run. This is perfect for parameterizing CI pipelines or A/B test data.
807
- 2. **Selectors** – Give every important UI element a _nickname_. It makes long workflows readable and easy to maintain.
808
- 3. **Templating** – `${{ ... }}` (GitHub Actions-style) _or_ legacy `{{ ... }}` lets you reference **any** key inside `variables`, `inputs`, or `selectors`. Both syntaxes are supported; the engine uses Mustache-style rendering.
809
- 4. **Groups & Control Flow** – Add `group_name`, `skippable`, `if`, or `continue_on_error` to any step for advanced branching.
810
- 5. **Output Parsing** – Always end with a step that includes the UI tree, then use the JavaScript in `output_parser` to extract the data you need.
811
-
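To make the templating and variables-versus-inputs concepts concrete, here is a minimal sketch of how placeholders could be resolved. It is illustrative only: `lookup`, `renderTemplate`, and `buildContext` are hypothetical names, the real engine's rendering rules may differ, and leaving unknown keys untouched is an assumption.

```javascript
// Resolve a dotted path such as "selectors.btn_clear" against a context object.
function lookup(context, path) {
  return path.split(".").reduce(
    (node, key) => (node != null && typeof node === "object" ? node[key] : undefined),
    context
  );
}

// Hypothetical renderer for "${{ path }}" and legacy "{{ path }}" placeholders.
function renderTemplate(text, context) {
  return text.replace(/\$?\{\{\s*([\w.]+)\s*\}\}/g, (match, path) => {
    const value = lookup(context, path);
    // Assumption: unknown keys are left as-is instead of becoming "".
    return value === undefined ? match : String(value);
  });
}

// Assumption: per-run inputs override variable defaults; selectors stay namespaced.
function buildContext({ variables = {}, inputs = {}, selectors = {} }) {
  const defaults = Object.fromEntries(
    Object.entries(variables).map(([name, spec]) => [name, spec.default])
  );
  return { ...defaults, ...inputs, selectors };
}
```

With the Calculator example, a context built from its `variables`, `inputs`, and `selectors` would turn the `btn_clear` placeholder into `role:Button|name:Clear`.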
812
- ### 3. State Persistence & Partial Execution
813
-
814
- The `execute_sequence` tool supports powerful features for workflow debugging and resumption:
815
-
816
- #### Partial Execution with Step Ranges
817
-
818
- You can run specific portions of a workflow using `start_from_step` and `end_at_step` parameters:
819
-
820
- ```jsonc
821
- {
822
- "tool_name": "execute_sequence",
823
- "arguments": {
824
- "url": "file://path/to/workflow.yml",
825
- "start_from_step": "read_json_file", // Start from this step ID
826
- "end_at_step": "fill_journal_entries", // Stop after this step (inclusive)
827
- "follow_fallback": false // Don't follow fallback_id beyond end_at_step (default: false)
828
- }
829
- }
830
- ```
831
-
832
- **Examples:**
833
- - Run single step: Set both `start_from_step` and `end_at_step` to the same ID
834
- - Run step range: Set different IDs for start and end
835
- - Run from step to end: Only set `start_from_step`
836
- - Run from beginning to step: Only set `end_at_step`
837
- - Debug without fallback: Use `follow_fallback: false` to prevent jumping to troubleshooting steps when a bounded step fails
838
-
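The four range cases listed above reduce to one inclusive slice over step IDs. The sketch below is an assumption-laden illustration: `selectStepRange` is a hypothetical helper, it ignores `follow_fallback`, and throwing on an unknown ID is a choice made here, not documented behavior.

```javascript
// Hypothetical helper: pick the steps to run for a given start/end step ID.
function selectStepRange(steps, { startFromStep, endAtStep } = {}) {
  if (steps.length === 0) return [];

  const indexOfId = (id, label) => {
    const index = steps.findIndex((step) => step.id === id);
    if (index === -1) throw new Error(`${label} "${id}" not found`); // assumption
    return index;
  };

  // Missing bounds default to the first and last step.
  const start = startFromStep === undefined ? 0 : indexOfId(startFromStep, "start_from_step");
  const end = endAtStep === undefined ? steps.length - 1 : indexOfId(endAtStep, "end_at_step");
  if (start > end) throw new Error("start_from_step comes after end_at_step");

  return steps.slice(start, end + 1); // end_at_step is inclusive
}
```

Passing the same ID for both bounds yields a single-step run, which is the debugging case the list calls out.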
839
- #### Automatic State Persistence
840
-
841
- When using `file://` URLs, the workflow state (environment variables) is automatically saved to a `.workflow_state` folder:
842
-
843
- 1. **State is saved** after each step that modifies environment variables via `set_env` or has a tool result with an ID
844
- 2. **State is loaded** when starting from a specific step
845
- 3. **Location**: `.workflow_state/<workflow_hash>.json` in the workflow's directory
846
- 4. **Tool results** from all tools (not just scripts) are automatically stored as `{step_id}_result` and `{step_id}_status`
847
-
848
- This enables:
849
- - **Debugging**: Run steps individually to inspect state between executions
850
- - **Recovery**: Resume failed workflows from the last successful step
851
- - **Testing**: Test specific steps without re-running the entire workflow
852
-
853
- #### Data Passing Between Steps
854
-
855
- Steps can pass data using multiple methods:
856
-
857
- ##### 1. Tool Result Storage (NEW)
858
-
859
- ALL tools with an `id` field automatically store their results in the environment:
860
-
861
- ```yaml
862
- steps:
863
-   # Any tool with an ID stores its result
864
-   - id: check_apps
865
-     tool_name: get_applications
866
-     arguments:
867
-       include_tree: false
868
-
869
-   # Access the result in JavaScript
870
-   - tool_name: run_command
871
-     arguments:
872
-       engine: javascript
873
-       run: |
874
-         // Direct variable access - auto-injected!
875
-         const apps = check_apps_result || [];
876
-         const status = check_apps_status; // "success" or "error"
877
-         console.log(`Found ${apps[0]?.applications?.length} apps`);
878
- ```
879
-
880
- ##### 2. Script Return Values
881
-
882
- Steps can pass data using the `set_env` mechanism in `run_command` with engine mode:
883
-
884
- ```javascript
885
- // Step 12: Read and process data
886
- return {
887
- set_env: {
888
- file_path: "C:/data/input.json",
889
- journal_entries: JSON.stringify(entries),
890
- total_debit: "100.50"
891
- }
892
- };
893
-
894
- // Step 13: Use the data (NEW - simplified access!)
895
- const filePath = file_path; // Direct access, no {{env.}} needed!
896
- const entries = JSON.parse(journal_entries);
897
- const debit = total_debit;
898
- ```
899
-
900
- ### 4. Running the Workflow
901
-
902
- 1. Ensure the Terminator MCP agent is running (it will auto-start in supported editors).
903
- 2. Send the JSON above as the body of an `execute_sequence` tool call from your LLM or test harness.
904
- 3. Inspect the response: if parsing succeeds you'll see something like:
905
-
906
- ```jsonc
907
- {
908
- "parsed_output": {
909
- "displayValue": "50" // 42 + 8
910
- }
911
- }
912
- ```
913
-
914
- ### Realtime events (SSE)
915
-
916
- When running with the HTTP transport, you can subscribe to realtime workflow events at a separate endpoint outside `/mcp`:
917
-
918
- - SSE endpoint: `/events`
919
- - Emits JSON payloads for: `sequence` (start/end), `sequence_progress`, and `sequence_step` (begin/end)
920
-
921
- Example in Node.js:
922
-
923
- ```js
924
- import EventSource from "eventsource";
925
- const es = new EventSource("http://127.0.0.1:3000/events");
926
- es.onmessage = (e) => console.log("event", e.data);
927
- ```
928
-
929
- ### 5. Working with Tool Results
930
-
931
- Every tool that has an `id` field automatically stores its result for use in later steps:
932
-
933
- ```yaml
934
- steps:
935
-   # Capture browser DOM
936
-   - id: capture_dom
937
-     tool_name: execute_browser_script
938
-     arguments:
939
-       selector: "role:Window"
940
-       script: "return document.documentElement.innerHTML;"
941
-
942
-   # Validate an element exists
943
-   - id: check_button
944
-     tool_name: validate_element
945
-     arguments:
946
-       selector: "role:Button|name:Submit"
947
-
948
-   # Use both results in script
949
-   - tool_name: run_command
950
-     arguments:
951
-       engine: javascript
952
-       run: |
953
-         // All tool results are auto-injected as variables
954
-         const dom = capture_dom_result?.content || '';
955
-         const buttonExists = check_button_status === 'success';
956
-
957
-         if (buttonExists) {
958
-           const button = check_button_result[0]?.element;
959
-           console.log(`Submit button at: ${button?.bounds?.x}, ${button?.bounds?.y}`);
960
-         }
961
-
962
-         return { dom_length: dom.length, has_button: buttonExists };
963
- ```
964
-
965
- Tool results are accessible as:
966
- - `{step_id}_result`: The tool's return value (content, element info, etc.)
967
- - `{step_id}_status`: Either "success" or "error"
968
-
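The examples above read results defensively (`|| ''`, `?.`). One way to centralize that pattern is sketched below; `readStep` and the `scope` object are hypothetical stand-ins for the variables the agent injects, and the sample data is invented for illustration.

```javascript
// Hypothetical helper: looks up `{step_id}_result` and `{step_id}_status`
// in a plain object that stands in for the injected script variables.
function readStep(scope, stepId) {
  const status = scope[`${stepId}_status`];
  const result = scope[`${stepId}_result`];
  return {
    ok: status === "success",
    status: status ?? "missing", // no status recorded for this step ID
    result: result ?? null, // normalize an absent result to null
  };
}

// Invented sample scope: one successful step, one failed step with no result.
const scope = {
  check_button_status: "success",
  check_button_result: [{ element: { bounds: { x: 10, y: 20 } } }],
  capture_dom_status: "error",
};

const button = readStep(scope, "check_button");
const dom = readStep(scope, "capture_dom");
const unknown = readStep(scope, "never_ran");

console.log(button.ok, button.result[0].element.bounds);
console.log(dom.ok, dom.result);
console.log(unknown.status);
```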
969
- ### 6. Tips for Production Workflows
970
-
971
- - **Never hard-code credentials** – use environment variables or your secret manager.
972
- - **Keep workflows short** – <100 steps is ideal. Break large tasks into multiple sequences.
973
- - **Capture errors** – `continue_on_error` is useful, but also check `{step_id}_status` for tool failures.
974
- - **Version control** – Store workflow JSON in a repo and use PR reviews just like regular code.
975
- - **Use step IDs** – Give meaningful IDs to steps whose results you'll need later.
976
-
977
- ## 🔍 Troubleshooting & Debugging
978
-
979
- ### Finding MCP Server Logs
980
-
981
- MCP logs are saved to:
982
- - **Windows:** `%LOCALAPPDATA%\claude-cli-nodejs\Cache\<encoded-project-path>\mcp-logs-terminator-mcp-agent\`
983
- - **macOS/Linux:** `~/.local/share/claude-cli-nodejs/Cache/<encoded-project-path>/mcp-logs-terminator-mcp-agent/`
984
-
985
- Where `<encoded-project-path>` is your project path with special chars replaced (e.g., `C--Users-username-project`).
986
- Note: Logs are saved as `.txt` files, not `.log` files.
987
-
988
- **Read logs:**
989
- ```powershell
990
- # Windows - Find and read latest logs (run in PowerShell)
991
- Get-ChildItem (Join-Path ([Environment]::GetFolderPath('LocalApplicationData')) 'claude-cli-nodejs\Cache\*\mcp-logs-terminator-mcp-agent\*.txt') | Sort-Object LastWriteTime -Descending | Select-Object -First 1 | Get-Content -Tail 50
992
- ```
993
-
994
- ### Enable Debug Logging
995
-
996
- In your Claude MCP configuration (`claude_desktop_config.json`), add the following. Remove the `//` comments before saving, because the file must be plain JSON:
997
- ```json
998
- {
999
- "mcpServers": {
1000
- "terminator-mcp-agent": {
1001
- "command": "path/to/terminator-mcp-agent",
1002
- "env": {
1003
- "LOG_LEVEL": "debug", // or "info", "warn", "error"
1004
- "RUST_BACKTRACE": "1" // for stack traces on errors
1005
- }
1006
- }
1007
- }
1008
- }
1009
- ```
1010
-
1011
- ### Common Debug Scenarios
1012
-
1013
- | Issue | What to Look For in Logs |
1014
- |-------|--------------------------|
1015
- | Workflow failures | Search for `fallback_id` triggers and `critical_error_occurred` |
1016
- | Element not found | Look for selector resolution attempts, `find_element` timeouts |
1017
- | Browser script errors | Check for `EVAL_ERROR`, Promise rejections, JavaScript exceptions |
1018
- | Binary version issues | Startup logs show binary path and build timestamp |
1019
- | MCP connection lost | Check for panic messages, ensure binary path is correct |
1020
-
1021
- ### Fallback Mechanism
1022
-
1023
- Workflows support `fallback_id` to handle errors gracefully:
1024
- - If a step fails and has `fallback_id`, it jumps to that step instead of stopping
1025
- - Without `fallback_id`, errors may set `critical_error_occurred` and skip remaining steps
1026
- - Use `troubleshooting:` section for recovery steps only accessed via fallback
1027
-
1028
- > Need more help? Browse the examples under `examples/` in this repo or open a discussion on GitHub.
1029
-
1030
- ## Documentation
1031
-
1032
- ### Workflow Development
1033
-
1034
- - **[Workflow Output Structure](docs/WORKFLOW_OUTPUT_STRUCTURE.md)**: Detailed documentation on the expected output structure for workflows, including:
1035
- - How to structure `parsed_output` for proper CLI rendering
1036
- - Success/failure indicators and business logic validation
1037
- - Data extraction patterns and error handling
1038
- - Integration with CLI and backend systems
1039
-
1040
- ### Additional Resources
1041
-
1042
- - **[CLI Documentation](../terminator-cli/README.md)**: Command-line interface for executing workflows
1043
- - **[Examples](examples/)**: Sample workflows and use cases
1044
- - **[API Reference](https://github.com/mediar-ai/terminator#api)**: Core Terminator library documentation
1
+ ## Terminator MCP Agent
2
+
3
+ <!-- BADGES:START -->
4
+
5
+ [<img alt="Install in VS Code" src="https://img.shields.io/badge/VS_Code-VS_Code?style=flat-square&label=Install%20Server&color=0098FF">](https://insiders.vscode.dev/redirect?url=vscode%3Amcp%2Finstall%3F%257B%2522terminator-mcp-agent%2522%253A%257B%2522command%2522%253A%2522npx%2522%252C%2522args%2522%253A%255B%2522-y%2522%252C%2522terminator-mcp-agent%2522%255D%257D%257D)
6
+ [<img alt="Install in VS Code Insiders" src="https://img.shields.io/badge/VS_Code_Insiders-VS_Code_Insiders?style=flat-square&label=Install%20Server&color=24bfa5">](https://insiders.vscode.dev/redirect?url=vscode-insiders%3Amcp%2Finstall%3F%257B%2522terminator-mcp-agent%2522%253A%257B%2522command%2522%253A%2522npx%2522%252C%2522args%2522%253A%255B%2522-y%2522%252C%2522terminator-mcp-agent%2522%255D%257D%257D)
7
+
8
+ <!-- BADGES:END -->
9
+
10
+ A Model Context Protocol (MCP) server that provides desktop GUI automation capabilities using the [Terminator](https://github.com/mediar-ai/terminator) library. This server enables LLMs and agentic clients to interact with Windows, macOS, and Linux applications through structured accessibility APIs—no vision models or screenshots required.
11
+
12
+ ## Quick Install
13
+
14
+ ### Claude Code
15
+
16
+ Install with a single command:
17
+
18
+ ```bash
19
+ claude mcp add terminator "npx -y terminator-mcp-agent" -s user
20
+ ```
21
+
22
+ ### Cursor
23
+
24
+ Copy and paste this URL into your browser's address bar:
25
+
26
+ ```
27
+ cursor://anysphere.cursor-deeplink/mcp/install?name=terminator-mcp-agent&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsInRlcm1pbmF0b3ItbWNwLWFnZW50Il19
28
+ ```
29
+
30
+ Or install manually:
31
+
32
+ 1. Open Cursor Settings (`Cmd/Ctrl + ,`)
33
+ 2. Go to the MCP tab
34
+ 3. Add server with command: `npx -y terminator-mcp-agent`
35
+
36
+ ### HTTP Endpoints (when running with `-t http`)
37
+
38
+ - `GET /health`: Always returns 200 while the process is alive.
39
+ - `GET /status`: Busy-aware probe for load balancers. Returns JSON and appropriate status:
40
+ - 200 when idle: `{ "busy": false, "activeRequests": 0, "maxConcurrent": 1, "lastActivity": "<ISO-8601>" }`
41
+ - 503 when busy: `{ "busy": true, "activeRequests": 1, "maxConcurrent": 1, "lastActivity": "<ISO-8601>" }`
42
+ - Content-Type is `application/json`.
43
+ - `POST /mcp`: MCP execution endpoint. Enforces single-request concurrency per machine by default.
44
+
45
+ Concurrency is controlled by the `MCP_MAX_CONCURRENT` environment variable (default `1`). Only accepted `POST /mcp` requests are counted toward `activeRequests`. If the server is at capacity, new `POST /mcp` requests return 503 immediately. This 503 behavior is intentional so an Azure Load Balancer probing `GET /status` can take a busy VM out of rotation and route traffic elsewhere.
46
+
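A client or custom health checker can apply the same rule that the load balancer probe relies on. The sketch below interprets a `GET /status` response; `isAvailable` is a hypothetical helper, the field names come from the payloads listed above, and the host and port in the commented usage are assumptions.

```javascript
// Hypothetical helper: decides whether a server can take a new POST /mcp request,
// based on the documented /status contract (200 when idle, 503 when busy).
function isAvailable(httpStatus, body) {
  return (
    httpStatus === 200 &&
    body.busy === false &&
    body.activeRequests < body.maxConcurrent
  );
}

// Sample payloads shaped like the documented responses (timestamps invented).
const idle = { busy: false, activeRequests: 0, maxConcurrent: 1, lastActivity: "2024-01-01T00:00:00Z" };
const busy = { busy: true, activeRequests: 1, maxConcurrent: 1, lastActivity: "2024-01-01T00:00:05Z" };

console.log(isAvailable(200, idle));
console.log(isAvailable(503, busy));

// Against a live server (URL is an assumption; requires a runtime with fetch):
// const res = await fetch("http://127.0.0.1:3000/status");
// console.log(isAvailable(res.status, await res.json()));
```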
47
+ ### Getting Started
48
+
49
+ The easiest way to get started is to use the one-click install buttons above for your specific editor (VS Code, Cursor, etc.).
50
+
51
+ Alternatively, you can install and configure the agent from your command line.
52
+
53
+ **1. Install & Configure Automatically**
54
+ Run the following command and select your MCP client from the list:
55
+
56
+ ```sh
57
+ npx -y terminator-mcp-agent@latest --add-to-app
58
+ ```
59
+
60
+ **2. Manual Configuration**
61
+ If you prefer, you can add the following to your MCP client's settings file:
62
+
63
+ ```json
64
+ {
65
+ "mcpServers": {
66
+ "terminator-mcp-agent": {
67
+ "command": "npx",
68
+ "args": ["-y", "terminator-mcp-agent@latest"]
69
+ }
70
+ }
71
+ }
72
+ ```
73
+
74
+ ### Command Line Interface (CLI) Execution
75
+
76
+ For automation workflows and CI/CD pipelines, you can execute workflows directly from the command line using the [Terminator CLI](../terminator-cli/README.md):
77
+
78
+ **Quick Start:**
79
+
80
+ ```bash
81
+ # Execute a workflow file
82
+ terminator mcp run workflow.yml
83
+
84
+ # With verbose logging
85
+ terminator mcp run workflow.yml --verbose
86
+
87
+ # Dry run (validate without executing)
88
+ terminator mcp run workflow.yml --dry-run
89
+
90
+ # Use specific MCP server version
91
+ terminator mcp run workflow.yml --command "npx -y terminator-mcp-agent@latest"
92
+
93
+ # Run specific steps (requires step IDs in workflow)
94
+ terminator mcp run workflow.yml --start-from "step_12" --end-at "step_13"
95
+
96
+ # Run single step
97
+ terminator mcp run workflow.yml --start-from "read_json" --end-at "read_json"
98
+ ```
99
+
100
+ **Workflow File Formats:**
101
+
102
+ Direct workflow format (`workflow.yml`):
103
+
104
+ ```yaml
105
+ steps:
106
+   - tool_name: navigate_browser
107
+     arguments:
108
+       url: "https://example.com"
109
+   - tool_name: click_element
110
+     arguments:
111
+       selector: "role:Button|name:Submit"
112
+ stop_on_error: true
113
+ include_detailed_results: true
114
+ ```
115
+
116
+ Tool call wrapper format (`workflow.json`):
117
+
118
+ ```json
119
+ {
120
+ "tool_name": "execute_sequence",
121
+ "arguments": {
122
+ "steps": [
123
+ {
124
+ "tool_name": "navigate_browser",
125
+ "arguments": {
126
+ "url": "https://example.com"
127
+ }
128
+ }
129
+ ]
130
+ }
131
+ }
132
+ ```
133
+
134
+ **Code Execution in Workflows (engine mode):**
135
+
136
+ Execute custom JavaScript or Python with access to desktop automation APIs via `run_command`.
137
+
138
+ **Passing Data Between Workflow Steps:**
139
+
140
+ When using `engine` mode, data automatically flows between steps:
141
+
142
+ ```yaml
143
+ steps:
144
+   # Step 1: Return data directly (NEW - simplified!)
145
+   - tool_name: run_command
146
+     arguments:
147
+       engine: "javascript"
148
+       run: |
149
+         // Get file info (example)
150
+         const filePath = 'C:\\data\\report.pdf';
151
+         const fileSize = 1024;
152
+
153
+         console.log(`Found file: ${filePath}`);
154
+
155
+         // Just return fields directly - they auto-merge into env
156
+         return {
157
+           status: 'success',
158
+           file_path: filePath, // Becomes env.file_path
159
+           file_size: fileSize // Becomes env.file_size
160
+         };
161
+
162
+   # Step 2: Access data automatically
163
+   - tool_name: run_command
164
+     arguments:
165
+       engine: "javascript"
166
+       run: |
167
+         // env is automatically available - no setup needed!
168
+         console.log(`Processing: ${env.file_path} (${env.file_size} bytes)`);
169
+
170
+         // Workflow variables also auto-available
171
+         console.log(`Config: ${variables.max_retries}`);
172
+
173
+         // NEW: Direct variable access also works!
174
+         console.log(`Processing: ${file_path} (${file_size} bytes)`);
175
+         console.log(`Config: ${max_retries}`);
176
+
177
+         // Continue with desktop automation
178
+         const elements = await desktop.locator('role:button').all();
179
+
180
+         // Return more data (auto-merges to env)
181
+         return {
182
+           status: 'success',
183
+           file_processed: env.file_path,
184
+           buttons_found: elements.length
185
+         };
186
+ ```
187
+
188
+ **Important Notes on Data Passing:**
189
+
190
+ - **NEW:** `env` and `variables` are automatically injected into all scripts
191
+ - **NEW:** Non-reserved fields in return values auto-merge into env (no `set_env` wrapper needed)
192
+ - **NEW:** Valid env fields are also available as individual variables (e.g., `file_path` instead of `env.file_path`)
193
+ - Reserved fields that don't auto-merge: `status`, `error`, `logs`, `duration_ms`, `set_env`
194
+ - Data passing only works with `engine` mode (JavaScript/Python), NOT with shell commands
195
+ - Backward compatible: explicit `set_env` still works if needed
196
+ - Individual variable names must be valid JavaScript identifiers (no spaces, special chars, or reserved keywords)
197
+ - Watch for backslash escaping issues in Windows paths (may need double escaping)
198
+ - Consider combining related operations in a single step if data passing becomes complex
199
+
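The merge rules listed above can be modelled in a few lines. This is an illustrative model of the documented behaviour, not the agent's implementation: `mergeIntoEnv` is a hypothetical name, the identifier check is a simplified ASCII-only approximation that does not reject reserved keywords, and treating `set_env` entries the same as direct fields is an assumption.

```javascript
// Reserved fields named in the notes above; they do not auto-merge into env.
const RESERVED = new Set(["status", "error", "logs", "duration_ms", "set_env"]);

// Simplified identifier test (ASCII only, keywords not excluded).
const IDENTIFIER = /^[A-Za-z_$][A-Za-z0-9_$]*$/;

// Hypothetical model of how a script's return value updates env.
function mergeIntoEnv(env, returned) {
  // Assumption: explicit set_env entries are applied first, then direct fields.
  const next = { ...env, ...(returned.set_env || {}) };
  for (const [key, value] of Object.entries(returned)) {
    if (!RESERVED.has(key)) next[key] = value;
  }
  // Only keys that are valid identifiers can be exposed as bare variables.
  const directNames = Object.keys(next).filter((key) => IDENTIFIER.test(key));
  return { env: next, directNames };
}

const merged = mergeIntoEnv(
  { existing: 1 },
  { status: "success", file_path: "C:\\data\\report.pdf", "file size": 1024 }
);

console.log(merged.env);
console.log(merged.directNames);
```

In this model `file size` still lands in `env` but is not offered as a bare variable, which mirrors the note that individual variable names must be valid JavaScript identifiers.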
200
+ For complete CLI documentation, see [Terminator CLI README](../terminator-cli/README.md).
201
+
202
+ ### Core Workflows: From Interaction to Structured Data
203
+
204
+ The Terminator MCP agent offers two primary workflows for automating desktop tasks. Both paths lead to the same goal: an automation that is more than 95% accurate and 10000x faster than a human.
205
+
206
+ #### 1. Iterative Development with `execute_sequence`
207
+
208
+ This is the most powerful and flexible method. You build a workflow step-by-step, using MCP tools to inspect the UI and refine your actions.
209
+
210
+ 1. **Inspect the UI**: Start by using `get_focused_window_tree` to understand the structure of your target application. This gives you the roles, names, and IDs of all elements.
211
+ 2. **Build a Sequence**: Create an `execute_sequence` tool call with a series of actions (`click_element`, `type_into_element`, etc.). Use robust selectors (like `role|name` or stable `properties:AutomationId:value` selectors) whenever possible.
212
+ 3. **Capture the Final State**: Ensure the last step in your sequence is an action that returns a UI tree. The `wait_for_element` tool with `include_tree: true` is perfect for this, as it captures the application's state after your automation has run.
213
+ 4. **Extract Structured Data with `output_parser`**: Add the `output_parser` argument to your `execute_sequence` call. Write JavaScript code to parse the final UI tree and extract structured data. If successful, the tool result will contain a `parsed_output` field with your clean JSON data.
214
+
215
+ Here is an example of an `output_parser` that extracts insurance quote data from a web page:
216
+
217
+ ```yaml
218
+ output_parser:
219
+   ui_tree_source_step_id: capture_quotes_tree
220
+   javascript_code: |
221
+     // Find all quote groups with Image and Text children
222
+     const results = [];
223
+
224
+     function findElementsRecursively(element) {
225
+       if (element.attributes && element.attributes.role === 'Group') {
226
+         const children = element.children || [];
227
+         const hasImage = children.some(child =>
228
+           child.attributes && child.attributes.role === 'Image'
229
+         );
230
+         const hasText = children.some(child =>
231
+           child.attributes && child.attributes.role === 'Text'
232
+         );
233
+
234
+         if (hasImage && hasText) {
235
+           const textElements = children.filter(child =>
236
+             child.attributes && child.attributes.role === 'Text' && child.attributes.name
237
+           );
238
+
239
+           let carrierProduct = '';
240
+           let monthlyPrice = '';
241
+
242
+           for (const textEl of textElements) {
243
+             const text = textEl.attributes.name;
244
+             if (text.includes(':')) {
245
+               carrierProduct = text;
246
+             }
247
+             if (text.startsWith('$')) {
248
+               monthlyPrice = text;
249
+             }
250
+           }
251
+
252
+           if (carrierProduct && monthlyPrice) {
253
+             results.push({
254
+               carrierProduct: carrierProduct,
255
+               monthlyPrice: monthlyPrice
256
+             });
257
+           }
258
+         }
259
+       }
260
+
261
+       if (element.children) {
262
+         for (const child of element.children) {
263
+           findElementsRecursively(child);
264
+         }
265
+       }
266
+     }
267
+
268
+     findElementsRecursively(tree);
269
+     return results;
270
+ ```
271
+
272
+ #### 2. Recording Human Actions with `record_workflow`
273
+
274
+ For simpler tasks, you can record your own actions to generate a baseline workflow.
275
+
276
+ 1. **Start Recording**: Call `record_workflow` with `action: "start"`.
277
+ 2. **Perform the Task**: Manually perform the clicks, typing, and other interactions in the target application.
278
+ 3. **Stop and Save**: Call `record_workflow` with `action: "stop"`. This returns a complete workflow JSON file containing all your recorded actions.
279
+ 4. **Refine and Parse**: The recorded workflow is a great starting point. You can then refine the selectors for robustness, add a final step to capture the UI tree, and attach an `output_parser` to extract structured data, just as you would in the iterative workflow.
280
+
281
+ ### Browser DOM Inspection
282
+
283
+ The `execute_browser_script` tool enables direct JavaScript execution in browser contexts, providing access to the full HTML DOM. This is particularly useful when you need information not available in the accessibility tree.
284
+
285
+ #### When to Use DOM vs Accessibility Tree
286
+
287
+ **Use Accessibility Tree (default) when:**
288
+
289
+ - Navigating and interacting with UI elements
290
+ - Working with semantic page structure
291
+ - Building reliable automation workflows
292
+ - Performance is critical (faster, cleaner data)
293
+
294
+ **Use DOM Inspection when:**
295
+
296
+ - Extracting data attributes, meta tags, or hidden inputs
297
+ - Debugging why elements aren't appearing in accessibility tree
298
+ - Scraping structured data from specific HTML patterns
299
+ - Validating complete page structure or SEO elements
300
+
301
+ #### Basic DOM Retrieval Patterns
302
+
303
+ ```javascript
304
+ // Get full HTML DOM (be mindful of size limits)
305
+ execute_browser_script({
306
+ selector: "role:Window|name:Google Chrome",
307
+ script: "document.documentElement.outerHTML",
308
+ });
309
+
310
+ // Get structured page information
311
+ execute_browser_script({
312
+ selector: "role:Window|name:Google Chrome",
313
+ script: `({
314
+ url: window.location.href,
315
+ title: document.title,
316
+ html: document.documentElement.outerHTML,
317
+ bodyText: document.body.innerText.substring(0, 1000)
318
+ })`,
319
+ });
320
+
321
+ // Extract specific data (forms, hidden inputs, meta tags)
322
+ execute_browser_script({
323
+ selector: "role:Window|name:Google Chrome",
324
+ script: `({
325
+ forms: Array.from(document.forms).map(f => ({
326
+ id: f.id,
327
+ action: f.action,
328
+ method: f.method,
329
+ inputs: Array.from(f.elements).map(e => ({
330
+ name: e.name,
331
+ type: e.type,
332
+ value: e.type === 'password' ? '[REDACTED]' : e.value
333
+ }))
334
+ })),
335
+ hiddenInputs: Array.from(document.querySelectorAll('input[type="hidden"]')).map(e => ({
336
+ name: e.name,
337
+ value: e.value
338
+ })),
339
+ metaTags: Array.from(document.querySelectorAll('meta')).map(m => ({
340
+ name: m.name || m.property,
341
+ content: m.content
342
+ }))
343
+ })`,
344
+ });
345
+ ```
346
+
347
+ #### Handling Large DOMs
348
+
349
+ MCP responses are subject to size limits (~30KB). For large DOMs, use truncation strategies:
350
+
351
+ ```javascript
352
+ execute_browser_script({
353
+ selector: "role:Window|name:Google Chrome",
354
+ script: `
355
+ const html = document.documentElement.outerHTML;
356
+ const maxLength = 30000;
357
+
358
+ ({
359
+ url: window.location.href,
360
+ title: document.title,
361
+ html: html.length > maxLength
362
+ ? html.substring(0, maxLength) + '... [truncated at ' + maxLength + ' chars]'
363
+ : html,
364
+ totalLength: html.length,
365
+ truncated: html.length > maxLength
366
+ })
367
+ `,
368
+ });
369
+ ```
370
+
371
+ #### Advanced DOM Analysis
372
+
373
+ ```javascript
374
+ // Analyze page structure and extract semantic content
375
+ execute_browser_script({
376
+ selector: "role:Window|name:Google Chrome",
377
+ script: `
378
+ // Remove scripts and styles for cleaner analysis
379
+ const clonedDoc = document.documentElement.cloneNode(true);
380
+ clonedDoc.querySelectorAll('script, style, noscript').forEach(el => el.remove());
381
+
382
+ ({
383
+ // Page metrics
384
+ domElementCount: document.querySelectorAll('*').length,
385
+ formCount: document.forms.length,
386
+ linkCount: document.links.length,
387
+ imageCount: document.images.length,
388
+
389
+ // Semantic structure
390
+ headings: Array.from(document.querySelectorAll('h1,h2,h3')).map(h => ({
391
+ level: h.tagName,
392
+ text: h.innerText.substring(0, 100)
393
+ })),
394
+
395
+ // Clean HTML without scripts/styles
396
+ cleanHtml: clonedDoc.outerHTML.substring(0, 20000),
397
+
398
+ // Data extraction
399
+ jsonLd: Array.from(document.querySelectorAll('script[type="application/ld+json"]'))
400
+ .map(s => { try { return JSON.parse(s.textContent); } catch { return null; } })
401
+ .filter(Boolean)
402
+ })
403
+ `,
404
+ });
405
+ ```
406
+
407
+ #### Passing Data with Environment Variables
408
+
409
+ The `execute_browser_script` tool now supports passing data through `env` and `outputs` parameters:
410
+
411
+ ```javascript
412
+ // Step 1: Set environment variables in JavaScript
413
+ run_command({
414
+ engine: "javascript",
415
+ run: `
416
+ return {
417
+ set_env: {
418
+ userName: 'John Doe',
419
+ userId: '12345',
420
+ apiKey: 'secret-key'
421
+ }
422
+ };
423
+ `,
424
+ });
425
+
426
+ // Step 2: Use environment variables in browser script
427
+ execute_browser_script({
428
+ selector: "role:Window",
429
+ env: {
430
+ userName: "{{env.userName}}",
431
+ userId: "{{env.userId}}",
432
+ },
433
+ script: `
434
+ // Parse env if it's a JSON string (for backward compatibility)
435
+ const parsedEnv = typeof env === 'string' ? JSON.parse(env) : env;
436
+
437
+ // Use the data - traditional way
438
+ console.log('Processing user:', parsedEnv.userName);
439
+
440
+ // NEW: Direct variable access also works!
441
+ console.log('Processing user:', userName); // Direct access
442
+ console.log('User ID:', userId); // No env prefix needed
443
+
444
+ // Fill form with data
445
+ document.querySelector('#username').value = userName;
446
+ document.querySelector('#userid').value = userId;
447
+
448
+ // Return result and set new variables
449
+ JSON.stringify({
450
+ status: 'form_filled',
451
+ set_env: {
452
+ form_submitted: 'true',
453
+ timestamp: new Date().toISOString()
454
+ }
455
+ });
456
+ `,
457
+ });
458
+ ```
459
+
460
+ #### Loading Scripts from Files
461
+
462
+ You can load JavaScript from external files using the `script_file` parameter:
463
+
464
+ ```javascript
465
+ // browser_scripts/extract_data.js
466
+ const parsedEnv = typeof env === "string" ? JSON.parse(env) : env;
467
+ const parsedOutputs =
468
+ typeof outputs === "string" ? JSON.parse(outputs) : outputs;
469
+
470
+ console.log("Script loaded from file");
471
+ console.log("User:", parsedEnv?.userName);
472
+ console.log("Previous result:", parsedOutputs?.previousStep);
473
+
474
+ // Extract and return data
475
+ JSON.stringify({
476
+ extractedData: {
477
+ url: window.location.href,
478
+ title: document.title,
479
+ forms: document.forms.length,
480
+ },
481
+ set_env: {
482
+ extraction_complete: "true",
483
+ },
484
+ });
485
+
486
+ // In your workflow:
487
+ execute_browser_script({
488
+ selector: "role:Window",
489
+ script_file: "browser_scripts/extract_data.js",
490
+ env: {
491
+ userName: "{{env.userName}}",
492
+ previousStep: "{{env.previousStep}}",
493
+ },
494
+ });
495
+ ```
496
+
497
+ #### Important Notes
498
+
499
+ 1. **Chrome Extension Required**: The `execute_browser_script` tool requires the Terminator browser extension to be installed. See the installation workflow examples for automated setup.
500
+
501
+ 2. **Security Considerations**: Be cautious when extracting sensitive data. The examples above redact password fields and you should follow similar practices.
502
+
503
+ 3. **Performance**: DOM operations are synchronous and can be slow on large pages. Consider using specific selectors rather than traversing the entire DOM.
504
+
505
+ 4. **Error Handling**: Always wrap complex DOM operations in try-catch blocks and return meaningful error messages.
506
+
507
+ 5. **Data Injection**: When using `env` or `outputs` parameters, they are injected as JavaScript variables at the beginning of your script. Always parse them if they might be JSON strings.
508
+
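Note 4 can be sketched as a wrapper that always returns a JSON string, whether extraction succeeds or throws. `safeExtract` and the stub documents are hypothetical; inside `execute_browser_script` you would pass the page's real `document` instead.

```javascript
// Hypothetical wrapper: never throws, always returns a JSON string.
function safeExtract(doc) {
  try {
    return JSON.stringify({
      status: "success",
      title: doc.title,
      formCount: doc.forms.length, // throws if `forms` is missing
    });
  } catch (err) {
    // Report the failure as data so the calling step can inspect it.
    return JSON.stringify({
      status: "error",
      message: err instanceof Error ? err.message : String(err),
    });
  }
}

// Stub objects standing in for a browser `document`.
const okDoc = { title: "Example", forms: [{}, {}] };
const brokenDoc = { title: "Broken" }; // no `forms` property

console.log(safeExtract(okDoc));
console.log(safeExtract(brokenDoc));
```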
509
+ ## Local Development
510
+
511
+ To build and test the agent from the source code:
512
+
513
+ ```sh
514
+ # 1. Clone the entire Terminator repository
515
+ git clone https://github.com/mediar-ai/terminator
516
+
517
+ # 2. Navigate to the agent's directory
518
+ cd terminator/terminator-mcp-agent
519
+
520
+ # 3. Install Node.js dependencies
521
+ npm install
522
+
523
+ # 4. Build the Rust binary and Node.js wrapper
524
+ npm run build
525
+
526
+ # 5. To use your local build in your MCP client, link it globally
527
+ npm install --global .
528
+ ```
529
+
530
+ Now, when your MCP client runs `terminator-mcp-agent`, it will use your local build instead of the published `npm` version.
531
+
532
+ ---
533
+
534
+ ## Troubleshooting
535
+
536
+ - Make sure you have Node.js installed (v16+ recommended).
537
+ - For VS Code/Insiders, ensure the CLI (`code` or `code-insiders`) is available in your PATH.
538
+ - If you encounter issues, try running with elevated permissions.
539
+
540
+ ### Version Compatibility Issues
541
+
542
+ **Problem**: "missing field `items`" or schema mismatch errors
543
+
544
+ **Solution**: Ensure you're using the latest MCP server version:
545
+
546
+ ```bash
547
+ # Force latest version in CLI
548
+ terminator mcp run workflow.yml --command "npx -y terminator-mcp-agent@latest"
549
+
550
+ # Update MCP client configuration to use @latest
551
+ {
552
+ "mcpServers": {
553
+ "terminator-mcp-agent": {
554
+ "command": "npx",
555
+ "args": ["-y", "terminator-mcp-agent@latest"]
556
+ }
557
+ }
558
+ }
559
+
560
+ # Clear npm cache if needed
561
+ npm cache clean --force
562
+ ```
563
+
564
+ ### CLI Integration Issues
565
+
566
+ **Problem**: CLI commands not working or connection errors
567
+
568
+ **Solution**: Test MCP connectivity step by step:
569
+
570
+ ```bash
571
+ # Test basic connectivity
572
+ terminator mcp exec get_applications
573
+
574
+ # Test with verbose logging
575
+ terminator mcp run workflow.yml --verbose
576
+
577
+ # Test with dry run first
578
+ terminator mcp run workflow.yml --dry-run
579
+
580
+ # Use HTTP connection for debugging
581
+ terminator mcp run workflow.yml --url http://localhost:3000/mcp
582
+ ```
583
+
584
+ ### JavaScript Execution Issues
585
+
586
+ **Problem**: JavaScript code fails or can't access desktop APIs
587
+
588
+ **Solution**: Verify JavaScript execution and API access:
589
+
590
+ ```bash
591
+ # Test basic JavaScript execution via run_command engine mode
592
+ terminator mcp exec run_command '{"engine": "javascript", "run": "return {test: true};"}'
593
+
594
+ # Test desktop API access with node engine
595
+ terminator mcp exec run_command '{"engine": "node", "run": "const elements = await desktop.locator(\"role:button\").all(); return {count: elements.length};"}'
596
+
597
+ # Test Python engine
598
+ terminator mcp exec run_command '{"engine": "python", "run": "return {\"py\": True}"}'
599
+
600
+ # Debug with verbose logging
601
+ terminator mcp run workflow.yml --verbose
602
+ ```
603
+
604
+ ### Workflow File Issues
605
+
606
+ **Problem**: Workflow parsing errors or unexpected behavior
607
+
608
+ **Solution**: Validate workflow structure:
609
+
610
+ ```bash
611
+ # Validate workflow syntax
612
+ terminator mcp run workflow.yml --dry-run
613
+
614
+ # Test with minimal workflow first
615
+ echo 'steps: [{tool_name: get_applications}]' > test.yml
616
+ terminator mcp run test.yml
617
+
618
+ # Check both YAML and JSON formats work
619
+ terminator mcp run workflow.yml # YAML
620
+ terminator mcp run workflow.json # JSON
621
+ ```
622
+
623
+ ### Platform-Specific Issues
624
+
625
+ **Windows**:
626
+
627
+ - Ensure Windows UI Automation APIs are available
628
+ - Run with administrator privileges if accessibility features are restricted
629
+ - Check Windows Defender/antivirus isn't blocking automation
630
+
631
+ **macOS**:
632
+
633
+ - Grant accessibility permissions in System Preferences > Security & Privacy
634
+ - Ensure the terminal/IDE has accessibility access
635
+ - Check macOS version compatibility (10.14+ recommended)
636
+
637
+ **Linux**:
638
+
639
+ - Ensure AT-SPI (assistive technology) is enabled
640
+ - Install required packages: `sudo apt-get install at-spi2-core`
641
+ - Check desktop environment compatibility (GNOME, KDE, XFCE supported)
642
+
643
+ ### Virtual Display Support (Headless VMs)
644
+
645
+ Terminator MCP Agent includes virtual display support for running on headless VMs without requiring RDP connections. This enables scalable automation on cloud platforms like Azure, AWS, and GCP.
646
+
647
+ **How It Works**:
648
+
649
+ The agent automatically detects headless environments and initializes a virtual display context that Windows UI Automation APIs can interact with. This allows full UI automation capabilities even when no physical display or RDP session is active.
650
+
651
+ **Activation**:
652
+
653
+ Virtual display activates automatically when:
654
+
655
+ - Environment variable `TERMINATOR_HEADLESS=true` is set
656
+ - No console window is available (common in VM/container scenarios)
657
+ - Running as a Windows service or scheduled task
658
+
659
+ **Configuration**:
660
+
661
+ ```bash
662
+ # Enable virtual display mode
663
+ export TERMINATOR_HEADLESS=true   # in PowerShell: $env:TERMINATOR_HEADLESS = "true"
664
+
665
+ # Run the MCP agent
666
+ npx -y terminator-mcp-agent
667
+ ```
668
+
669
+ **Use Cases**:
670
+
671
+ - Running multiple automation agents on VMs without RDP overhead
672
+ - CI/CD pipelines in cloud environments
673
+ - Scalable automation farms on Azure/AWS/GCP
674
+ - Containerized automation workloads
675
+
676
+ **Requirements**:
677
+
678
+ - Windows Server 2016+ or Windows 10/11
679
+ - .NET Framework 4.7.2+
680
+ - UI Automation APIs available (included in Windows)
681
+
682
+ The virtual display manager creates a memory-based display context that satisfies Windows UI Automation requirements, enabling terminator to enumerate and interact with UI elements as if a physical display were present.
683
+
684
+ ### Performance Optimization
685
+
686
+ **Large UI Trees**:
687
+
688
+ - Use specific selectors instead of broad element searches
689
+ - Implement delays between rapid operations
690
+ - Consider using `include_tree: false` for intermediate steps
691
+
692
+ **JavaScript Performance**:
693
+
694
+ - Use `quickjs` engine for lightweight operations
695
+ - Use `nodejs` engine only when full APIs are needed
696
+ - Implement `sleep()` delays in loops to prevent overwhelming the UI
697
+
698
+ For additional help, see the [Terminator CLI documentation](../terminator-cli/README.md) or open an issue on GitHub.
699
+
700
+ ---
701
+
702
+ ## 📚 Full `execute_sequence` Reference & Sample Workflow
703
+
704
+ > **Why another example?** The quick start above shows the concept, but many users asked for a fully-annotated workflow schema. The example below automates the Windows **Calculator** app—so it is 100% safe to share and does **not** reveal any private customer data. Feel free to copy-paste and adapt it to your own application.
705
+
706
+ ### 1. Anatomy of an `execute_sequence` Call
707
+
708
+ ```jsonc
709
+ {
710
+ "tool_name": "execute_sequence",
711
+ "arguments": {
712
+ "variables": {
713
+ // 1️⃣ Re-usable inputs with type metadata
714
+ "app_path": {
715
+ "type": "string",
716
+ "label": "Calculator EXE Path",
717
+ "default": "calc.exe"
718
+ },
719
+ "first_number": {
720
+ "type": "string",
721
+ "label": "First Number",
722
+ "default": "42"
723
+ },
724
+ "second_number": {
725
+ "type": "string",
726
+ "label": "Second Number",
727
+ "default": "8"
728
+ }
729
+ },
730
+ "inputs": {
731
+ // 2️⃣ Concrete values for *this run*
732
+ "app_path": "calc.exe",
733
+ "first_number": "42",
734
+ "second_number": "8"
735
+ },
736
+ "selectors": {
737
+ // 3️⃣ Human-readable element shortcuts
738
+ "calc_window": "role:Window|name:Calculator",
739
+ "btn_clear": "role:Button|name:Clear",
740
+ "btn_plus": "role:Button|name:Plus",
741
+ "btn_equals": "role:Button|name:Equals"
742
+ },
743
+ "steps": [
744
+ // 4️⃣ Ordered actions & control flow
745
+ {
746
+ "tool_name": "open_application",
747
+ "arguments": { "path": "${{app_path}}" }
748
+ },
749
+ {
750
+ "tool_name": "click_element", // 4a. Make sure the UI is reset
751
+ "arguments": { "selector": "${{selectors.btn_clear}}" },
752
+ "continue_on_error": true
753
+ },
754
+ {
755
+ "group_name": "Enter First Number", // 4b. Groups improve logs
756
+ "steps": [
757
+ {
758
+ "tool_name": "type_into_element",
759
+ "arguments": {
760
+ "selector": "${{selectors.calc_window}}",
761
+ "text_to_type": "${{first_number}}"
762
+ }
763
+ }
764
+ ]
765
+ },
766
+ {
767
+ "tool_name": "click_element",
768
+ "arguments": { "selector": "${{selectors.btn_plus}}" }
769
+ },
770
+ {
771
+ "group_name": "Enter Second Number",
772
+ "steps": [
773
+ {
774
+ "tool_name": "type_into_element",
775
+ "arguments": {
776
+ "selector": "${{selectors.calc_window}}",
777
+ "text_to_type": "${{second_number}}"
778
+ }
779
+ }
780
+ ]
781
+ },
782
+ {
783
+ "tool_name": "click_element",
784
+ "arguments": { "selector": "${{selectors.btn_equals}}" }
785
+ },
786
+ {
787
+ "tool_name": "wait_for_element", // 4c. Capture final UI tree
788
+ "arguments": {
789
+ "selector": "${{selectors.calc_window}}",
790
+ "condition": "exists",
791
+ "include_tree": true,
792
+ "timeout_ms": 2000
793
+ }
794
+ }
795
+ ],
796
+ "output_parser": {
797
+ // 5️⃣ Turn the tree into clean JSON
798
+ "javascript_code": "// Extract calculator display value\nconst results = [];\n\nfunction findElementsRecursively(element) {\n if (element.attributes && element.attributes.role === 'Text') {\n const item = {\n displayValue: element.attributes.name || ''\n };\n results.push(item);\n }\n \n if (element.children) {\n for (const child of element.children) {\n findElementsRecursively(child);\n }\n }\n}\n\nfindElementsRecursively(tree);\nreturn results;"
799
+ }
800
+ }
801
+ }
802
+ ```
803
+
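For readability, the `javascript_code` string in the `output_parser` above can be unpacked into a plain function. This is a sketch that mirrors the logic of the example; the `extractTextValues` name and the `sampleTree` data are hypothetical and only illustrate the shape the script expects (`attributes.role`, `attributes.name`, `children`).

```javascript
// Readable version of the output_parser script from the example above.
// `tree` stands in for the UI tree object that the parser script receives.
function extractTextValues(tree) {
  const results = [];

  function visit(element) {
    // Collect the name of every element whose role is "Text".
    if (element.attributes && element.attributes.role === "Text") {
      results.push({ displayValue: element.attributes.name || "" });
    }
    // Recurse into children when present.
    for (const child of element.children || []) {
      visit(child);
    }
  }

  visit(tree);
  return results;
}

// Hypothetical tree, shaped only to exercise the function.
const sampleTree = {
  attributes: { role: "Window", name: "Calculator" },
  children: [
    { attributes: { role: "Button", name: "Equals" } },
    { attributes: { role: "Text", name: "Display is 50" }, children: [] },
  ],
};

console.log(extractTextValues(sampleTree));
```

The function returns an array with one entry per `Text` element, so a real application tree may yield several entries that need further filtering.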
804
+ ### 2. Key Concepts at a Glance
805
+
806
+ 1. **Variables vs. Inputs** – Declare once, override per-run. This is perfect for parameterizing CI pipelines or A/B test data.
807
+ 2. **Selectors** – Give every important UI element a _nickname_. It makes long workflows readable and easy to maintain.
808
+ 3. **Templating** – `${{ ... }}` (GitHub Actions-style) _or_ legacy `{{ ... }}` lets you reference **any** key inside `variables`, `inputs`, or `selectors`. Both syntaxes are supported; the engine uses Mustache-style rendering.
809
+ 4. **Groups & Control Flow** – Add `group_name`, `skippable`, `if`, or `continue_on_error` to any step for advanced branching.
810
+ 5. **Output Parsing** – Always end with a step that includes the UI tree (`include_tree: true`), then use `output_parser` to extract the data you need; the example above does this with `javascript_code`.
811
+
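To make the templating rule in item 3 concrete, here is a minimal sketch of how `${{ ... }}` and legacy `{{ ... }}` placeholders could be resolved against `variables`, `inputs`, and `selectors`. It is not the engine's actual implementation; `buildContext` and `renderTemplate` are hypothetical helpers, and the assumption that `inputs` override variable defaults follows from "declare once, override per-run" above.

```javascript
// Build a lookup context: variable defaults first, then per-run inputs on top.
// Selectors stay nested so they are addressed as `selectors.<name>`.
function buildContext(args) {
  const defaults = {};
  for (const [name, spec] of Object.entries(args.variables || {})) {
    if (spec && spec.default !== undefined) defaults[name] = spec.default;
  }
  return { ...defaults, ...(args.inputs || {}), selectors: args.selectors || {} };
}

// Replace ${{ path }} and {{ path }} with the value found at that dotted path.
// Unknown placeholders are left untouched so problems stay visible.
function renderTemplate(text, context) {
  return text.replace(/\$?\{\{\s*([\w.]+)\s*\}\}/g, (match, path) => {
    const value = path
      .split(".")
      .reduce((node, key) => (node != null ? node[key] : undefined), context);
    return value === undefined ? match : String(value);
  });
}

const context = buildContext({
  variables: { first_number: { type: "string", default: "42" } },
  inputs: { first_number: "7" },
  selectors: { btn_plus: "role:Button|name:Plus" },
});

console.log(renderTemplate("${{selectors.btn_plus}}", context));
console.log(renderTemplate("{{first_number}}", context));
```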
812
+ ### 3. State Persistence & Partial Execution
813
+
814
+ The `execute_sequence` tool supports powerful features for workflow debugging and resumption:
815
+
816
+ #### Partial Execution with Step Ranges
817
+
818
+ You can run specific portions of a workflow using `start_from_step` and `end_at_step` parameters:
819
+
820
+ ```jsonc
821
+ {
822
+ "tool_name": "execute_sequence",
823
+ "arguments": {
824
+ "url": "file://path/to/workflow.yml",
825
+ "start_from_step": "read_json_file", // Start from this step ID
826
+ "end_at_step": "fill_journal_entries", // Stop after this step (inclusive)
827
+ "follow_fallback": false // Don't follow fallback_id beyond end_at_step (default: false)
828
+ }
829
+ }
830
+ ```
831
+
832
+ **Examples:**
833
+ - Run single step: Set both `start_from_step` and `end_at_step` to the same ID
834
+ - Run step range: Set different IDs for start and end
835
+ - Run from step to end: Only set `start_from_step`
836
+ - Run from beginning to step: Only set `end_at_step`
837
+ - Debug without fallback: Use `follow_fallback: false` to prevent jumping to troubleshooting steps when a bounded step fails
838
+
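The combinations listed above can be generated programmatically. The sketch below builds the `execute_sequence` payload for a given step range; `buildRangeCall` is a hypothetical helper name, and only the parameter names (`url`, `start_from_step`, `end_at_step`, `follow_fallback`) come from this document.

```javascript
// Build an execute_sequence call that runs only part of a workflow.
// Omitting startFromStep runs from the beginning; omitting endAtStep runs to the end.
function buildRangeCall(url, { startFromStep, endAtStep, followFallback = false } = {}) {
  const args = { url, follow_fallback: followFallback };
  if (startFromStep) args.start_from_step = startFromStep;
  if (endAtStep) args.end_at_step = endAtStep;
  return { tool_name: "execute_sequence", arguments: args };
}

// Run a single step: start and end at the same ID.
const singleStep = buildRangeCall("file://path/to/workflow.yml", {
  startFromStep: "read_json_file",
  endAtStep: "read_json_file",
});

// Run from a step to the end of the workflow.
const fromStep = buildRangeCall("file://path/to/workflow.yml", {
  startFromStep: "fill_journal_entries",
});

console.log(JSON.stringify(singleStep, null, 2));
console.log(JSON.stringify(fromStep, null, 2));
```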
839
+ #### Automatic State Persistence
840
+
841
+ When using `file://` URLs, the workflow state (environment variables) is automatically saved to a `.workflow_state` folder:
842
+
843
+ 1. **State is saved** after each step that modifies environment variables via `set_env` or has a tool result with an ID
844
+ 2. **State is loaded** when starting from a specific step
845
+ 3. **Location**: `.workflow_state/<workflow_hash>.json` in the workflow's directory
846
+ 4. **Tool results** from all tools (not just scripts) are automatically stored as `{step_id}_result` and `{step_id}_status`
847
+
848
+ This enables:
849
+ - **Debugging**: Run steps individually to inspect state between executions
850
+ - **Recovery**: Resume failed workflows from the last successful step
851
+ - **Testing**: Test specific steps without re-running the entire workflow
852
+
853
+ #### Data Passing Between Steps
854
+
855
+ Steps can pass data using multiple methods:
856
+
857
+ ##### 1. Tool Result Storage (NEW)
858
+
859
+ ALL tools with an `id` field automatically store their results in the environment:
860
+
861
+ ```yaml
862
+ steps:
863
+ # Any tool with an ID stores its result
864
+ - id: check_apps
865
+ tool_name: get_applications
866
+ arguments:
867
+ include_tree: false
868
+
869
+ # Access the result in JavaScript
870
+ - tool_name: run_command
871
+ arguments:
872
+ engine: javascript
873
+ run: |
874
+ // Direct variable access - auto-injected!
875
+ const apps = check_apps_result || [];
876
+ const status = check_apps_status; // "success" or "error"
877
+ console.log(`Found ${apps[0]?.applications?.length} apps`);
878
+ ```
879
+
880
+ ##### 2. Script Return Values
881
+
882
+ Steps can pass data using the `set_env` mechanism in `run_command` with engine mode:
883
+
884
+ ```javascript
885
+ // Step 12 (its own run_command script): read and process data
885
+ return {
886
+ set_env: {
887
+ file_path: "C:/data/input.json",
888
+ journal_entries: JSON.stringify(entries), // `entries` is assumed to be built earlier in this script
889
+ total_debit: "100.50"
890
+ }
891
+ };
892
+
893
+ // Step 13 (a separate run_command script): use the data (NEW - simplified access!)
895
+ const filePath = file_path; // Direct access, no {{env.}} needed!
896
+ const entries = JSON.parse(journal_entries);
897
+ const debit = total_debit;
898
+ ```
899
+
900
+ ### 4. Running the Workflow
901
+
902
+ 1. Ensure the Terminator MCP agent is running (it will auto-start in supported editors).
903
+ 2. Send the JSON above as the body of an `execute_sequence` tool call from your LLM or test harness.
904
+ 3. Inspect the response: if parsing succeeds you'll see something like:
905
+
906
+ ```jsonc
907
+ {
908
+ "parsed_output": {
909
+ "displayValue": "50" // 42 + 8
910
+ }
911
+ }
912
+ ```
913
+
914
+ ### Realtime events (SSE)
915
+
916
+ When running with the HTTP transport, you can subscribe to realtime workflow events at a separate endpoint outside `/mcp`:
917
+
918
+ - SSE endpoint: `/events`
919
+ - Emits JSON payloads for: `sequence` (start/end), `sequence_progress`, and `sequence_step` (begin/end)
920
+
921
+ Example in Node.js:
922
+
923
+ ```js
924
+ import EventSource from "eventsource";
925
+ const es = new EventSource("http://127.0.0.1:3000/events");
926
+ es.onmessage = (e) => console.log("event", e.data);
927
+ ```
928
+
929
+ ### 5. Working with Tool Results
930
+
931
+ Every tool that has an `id` field automatically stores its result for use in later steps:
932
+
933
+ ```yaml
934
+ steps:
935
+ # Capture browser DOM
936
+ - id: capture_dom
937
+ tool_name: execute_browser_script
938
+ arguments:
939
+ selector: "role:Window"
940
+ script: "return document.documentElement.innerHTML;"
941
+
942
+ # Validate an element exists
943
+ - id: check_button
944
+ tool_name: validate_element
945
+ arguments:
946
+ selector: "role:Button|name:Submit"
947
+
948
+ # Use both results in script
949
+ - tool_name: run_command
950
+ arguments:
951
+ engine: javascript
952
+ run: |
953
+ // All tool results are auto-injected as variables
954
+ const dom = capture_dom_result?.content || '';
955
+ const buttonExists = check_button_status === 'success';
956
+
957
+ if (buttonExists) {
958
+ const button = check_button_result[0]?.element;
959
+ console.log(`Submit button at: ${button?.bounds?.x}, ${button?.bounds?.y}`);
960
+ }
961
+
962
+ return { dom_length: dom.length, has_button: buttonExists };
963
+ ```
964
+
965
+ Tool results are accessible as:
966
+ - `{step_id}_result`: The tool's return value (content, element info, etc.)
967
+ - `{step_id}_status`: Either "success" or "error"
968
+
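As a small illustration of the naming convention above, the sketch below derives the `{step_id}_result` and `{step_id}_status` keys for a list of step IDs and reports which steps failed. The `env` object is a stand-in for the auto-injected variables, and `readStep` and `failedSteps` are hypothetical helper names, not part of the tool.

```javascript
// Look up the stored status and result for one step ID.
function readStep(env, stepId) {
  return {
    id: stepId,
    status: env[`${stepId}_status`],
    result: env[`${stepId}_result`],
  };
}

// Return the IDs of steps whose stored status is "error".
function failedSteps(env, stepIds) {
  return stepIds.filter((id) => readStep(env, id).status === "error");
}

// Stand-in for the variables that would be injected after two steps ran.
const env = {
  capture_dom_status: "success",
  capture_dom_result: { content: "<html></html>" },
  check_button_status: "error",
};

console.log(failedSteps(env, ["capture_dom", "check_button"]));
```

Checking statuses this way complements `continue_on_error`: the workflow keeps running, and a later script can still decide how to react to the failure.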
969
+ ### 6. Tips for Production Workflows
970
+
971
+ - **Never hard-code credentials** – use environment variables or your secret manager.
972
+ - **Keep workflows short** – <100 steps is ideal. Break large tasks into multiple sequences.
973
+ - **Capture errors** – `continue_on_error` is useful, but also check `{step_id}_status` for tool failures.
974
+ - **Version control** – Store workflow JSON in a repo and use PR reviews just like regular code.
975
+ - **Use step IDs** – Give meaningful IDs to steps whose results you'll need later.
976
+
977
+ ## 🔍 Troubleshooting & Debugging
978
+
979
+ ### Finding MCP Server Logs
980
+
981
+ MCP logs are saved to:
982
+ - **Windows:** `%LOCALAPPDATA%\claude-cli-nodejs\Cache\<encoded-project-path>\mcp-logs-terminator-mcp-agent\`
983
+ - **macOS/Linux:** `~/.local/share/claude-cli-nodejs/Cache/<encoded-project-path>/mcp-logs-terminator-mcp-agent/`
984
+
985
+ Where `<encoded-project-path>` is your project path with special characters replaced (e.g., `C--Users-username-project`).
986
+ Note: Logs are saved as `.txt` files, not `.log` files.
987
+
988
+ **Read logs:**
989
+ ```powershell
990
+ # Windows - Find and read latest logs (run in PowerShell)
991
+ Get-ChildItem (Join-Path ([Environment]::GetFolderPath('LocalApplicationData')) 'claude-cli-nodejs\Cache\*\mcp-logs-terminator-mcp-agent\*.txt') | Sort-Object LastWriteTime -Descending | Select-Object -First 1 | Get-Content -Tail 50
992
+ ```
993
+
994
+ ### Enable Debug Logging
995
+
996
+ In your Claude MCP configuration (`claude_desktop_config.json`), set `LOG_LEVEL` (`debug`, `info`, `warn`, or `error`) and, for stack traces on errors, `RUST_BACKTRACE`:
997
+ ```json
998
+ {
999
+ "mcpServers": {
1000
+ "terminator-mcp-agent": {
1001
+ "command": "path/to/terminator-mcp-agent",
1002
+ "env": {
1003
+ "LOG_LEVEL": "debug",
1004
+ "RUST_BACKTRACE": "1"
1005
+ }
1006
+ }
1007
+ }
1008
+ }
1009
+ ```
1010
+
1011
+ ### Common Debug Scenarios
1012
+
1013
+ | Issue | What to Look For in Logs |
1014
+ |-------|--------------------------|
1015
+ | Workflow failures | Search for `fallback_id` triggers and `critical_error_occurred` |
1016
+ | Element not found | Look for selector resolution attempts, `find_element` timeouts |
1017
+ | Browser script errors | Check for `EVAL_ERROR`, Promise rejections, JavaScript exceptions |
1018
+ | Binary version issues | Startup logs show binary path and build timestamp |
1019
+ | MCP connection lost | Check for panic messages, ensure binary path is correct |
1020
+
1021
+ ### Fallback Mechanism
1022
+
1023
+ Workflows support `fallback_id` to handle errors gracefully:
1024
+ - If a step fails and has a `fallback_id`, execution jumps to that step instead of stopping
1024
+ - Without `fallback_id`, errors may set `critical_error_occurred` and skip remaining steps
1025
+ - Use a `troubleshooting:` section for recovery steps that are reached only via fallback
1027
+
1028
+ > Need more help? Browse the examples under `examples/` in this repo or open a discussion on GitHub.
1029
+
1030
+ ## Documentation
1031
+
1032
+ ### Workflow Development
1033
+
1034
+ - **[Workflow Output Structure](docs/WORKFLOW_OUTPUT_STRUCTURE.md)**: Detailed documentation on the expected output structure for workflows, including:
1035
+ - How to structure `parsed_output` for proper CLI rendering
1036
+ - Success/failure indicators and business logic validation
1037
+ - Data extraction patterns and error handling
1038
+ - Integration with CLI and backend systems
1039
+
1040
+ ### Additional Resources
1041
+
1042
+ - **[CLI Documentation](../terminator-cli/README.md)**: Command-line interface for executing workflows
1043
+ - **[Examples](examples/)**: Sample workflows and use cases
1044
+ - **[API Reference](https://github.com/mediar-ai/terminator#api)**: Core Terminator library documentation