@harusame64/desktop-touch-mcp 0.12.0 → 0.13.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (2)
  1. package/README.md +53 -3
  2. package/package.json +1 -1
package/README.md CHANGED
@@ -534,13 +534,63 @@ When auto guard is enabled (default), `post.perception.status` will be one of:
  | `unsafe_coordinates` | Click coordinates are outside the target window rect |
  | `needs_escalation` | Use `browser_click_element` or specify `windowTitle` |
 
- When `unsafe_coordinates` or `identity_changed` is returned, the response may include a `suggestedFix.fixId`. Pass that `fixId` to the next `mouse_click` call to approve the recovery:
+ When `unsafe_coordinates` or `identity_changed` is returned, the response may include a `suggestedFix.fixId`. Pass that `fixId` to the relevant tool call to approve the recovery:
 
  ```json
- { "name": "mouse_click", "arguments": { "fixId": "fix-..." } }
+ { "name": "mouse_click", "arguments": { "fixId": "fix-..." } }
+ { "name": "keyboard_type", "arguments": { "fixId": "fix-...", "text": "hello" } }
+ { "name": "click_element", "arguments": { "fixId": "fix-..." } }
+ { "name": "browser_click_element", "arguments": { "fixId": "fix-..." } }
  ```
 
- The fix is one-shot and expires in 15 seconds.
+ The fix is one-shot and expires in 15 seconds. The server revalidates the target process identity before executing.
+
+ ---
+
+ ## v0.13 Additions
+
+ ### Target-Identity Timeline
+
+ The server tracks a semantic timeline of what happened to each target window/tab. Recent events are included in:
+
+ - `get_history` → `recentTargetKeys`: array of 3 most recently active target keys (compact, no event bodies)
+ - `perception_read(lensId)` → `recentEvents`: up to 10 events for that lens's target, each with `tsMs`, `semantic`, `summary`
+
+ Enable the MCP resources below to browse timelines:
+
+ ```json
+ { "env": { "DESKTOP_TOUCH_PERCEPTION_RESOURCES": "1" } }
+ ```
+
+ MCP resources available when enabled:
+
+ | URI | Content |
+ |---|---|
+ | `perception://target/{targetKey}/timeline` | Semantic event timeline for a target |
+ | `perception://targets/recent` | Most recently active target keys |
+ | `perception://lens/{lensId}/summary` | Lens attention/guard state |
+
+ ### Manual Lens Eviction: FIFO → LRU
+
+ Manual lenses (created via `perception_register`) are now evicted by **least-recently-used** instead of insertion order. Using `perception_read`, `evaluatePreToolGuards`, or `buildEnvelopeFor` on a lens promotes it. The hard limit of 16 active lenses is unchanged.
+
+ ### browser_eval Structured Mode
+
+ Pass `withPerception: true` to receive a structured JSON response with `post.perception` instead of raw text:
+
+ ```json
+ { "name": "browser_eval", "arguments": { "expression": "document.title", "withPerception": true } }
+ ```
+
+ Returns `{ ok: true, result: "...", post: { perception: { status: "ok", ... } } }`.
+
+ ### mouse_drag Cross-Window Guard
+
+ `mouse_drag` now guards both start and end coordinates. Drags that cross window boundaries (or reach the desktop wallpaper) are blocked by default. To allow intentional cross-window or range-selection drags:
+
+ ```json
+ { "name": "mouse_drag", "arguments": { "startX": 100, "startY": 100, "endX": 900, "endY": 900, "allowCrossWindowDrag": true } }
+ ```
 
  ---
 
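Note on the FIFO → LRU change in the README hunk above: the practical difference is that reading a lens now protects it from eviction. The sketch below illustrates that behavior only. It is not taken from the package source; `LensRegistry`, `register`, `touch`, and `ids` are hypothetical names, and the only value borrowed from the README is the limit of 16 active lenses.

```typescript
// Hypothetical sketch of LRU eviction for a fixed-capacity lens registry.
// None of these names come from desktop-touch-mcp; they are illustrative only.
const MAX_LENSES = 16; // hard limit stated in the README

class LensRegistry<T> {
  // A Map iterates in insertion order, so its first key is the least recently used.
  private lenses = new Map<string, T>();
  private capacity: number;

  constructor(capacity: number = MAX_LENSES) {
    this.capacity = capacity;
  }

  /** Add a lens; when full, evict the least-recently-used one and return its id. */
  register(lensId: string, lens: T): string | undefined {
    let evicted: string | undefined;
    if (this.lenses.has(lensId)) {
      this.lenses.delete(lensId); // re-registering counts as a use
    } else if (this.lenses.size >= this.capacity) {
      evicted = this.lenses.keys().next().value;
      if (evicted !== undefined) this.lenses.delete(evicted);
    }
    this.lenses.set(lensId, lens);
    return evicted;
  }

  /** Read a lens and promote it to most-recently-used. */
  touch(lensId: string): T | undefined {
    if (!this.lenses.has(lensId)) return undefined;
    const lens = this.lenses.get(lensId) as T;
    this.lenses.delete(lensId); // delete + set moves the key to the end
    this.lenses.set(lensId, lens);
    return lens;
  }

  /** Lens ids ordered from least to most recently used. */
  ids(): string[] {
    return Array.from(this.lenses.keys());
  }
}

// Usage with a capacity of 2 to keep the example short.
const registry = new LensRegistry<string>(2);
registry.register("a", "lens A");
registry.register("b", "lens B");
registry.touch("a"); // "a" becomes most recently used
const evicted = registry.register("c", "lens C"); // evicts "b"
```

Under insertion-order (FIFO) eviction the last call would have dropped `"a"` instead, because it was registered first. With LRU, the intervening `touch("a")` keeps it alive, which matches the README's statement that using a lens promotes it.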
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@harusame64/desktop-touch-mcp",
- "version": "0.12.0",
+ "version": "0.13.0",
  "description": "LLM-native Windows computer-use MCP server with 56 tools for screenshots, UIA, mouse/keyboard, Chrome CDP, terminal, SmartScroll, and perception guards",
  "engines": {
  "node": ">=20.0.0"