aiden-runtime 4.1.2 → 4.1.3

@@ -54,10 +54,34 @@ Replace the clipboard with new text. Handles multi-line strings safely
  { "tool": "clipboard_write", "input": { "text": "Hello, world!" } }
  ```
 
+ ### media_sessions
+ Enumerate every Windows media session registered with the OS (Spotify,
+ YouTube in browser, VLC, etc.). One entry per app, with which one is
+ the OS-routed target for global media keys. Use this BEFORE
+ `media_transport` when controlling a specific app.
+ ```json
+ { "tool": "media_sessions", "input": {} }
+ ```
+
+ ### media_transport
+ Verified play / pause / skip against a specific GSMTC media session.
+ Targets by `AppUserModelId` substring (case-insensitive — "spotify"
+ matches `Spotify.exe`), then by track title as a softer fallback. Omit
+ `target` to act on the OS-routed current session. Returns OS-level
+ success/failure — NOT a blind keystroke like `media_key`.
+ ```json
+ { "tool": "media_transport", "input": { "action": "pause", "target": "spotify" } }
+ { "tool": "media_transport", "input": { "action": "play", "target": "spotify" } }
+ { "tool": "media_transport", "input": { "action": "next", "target": "youtube" } }
+ { "tool": "media_transport", "input": { "action": "toggle" } }
+ ```
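The targeting rule described for `media_transport` (case-insensitive `AppUserModelId` substring first, track title as a softer fallback, OS-routed current session when `target` is omitted) could be sketched roughly as follows. This is an illustration only, not the package's implementation: the `Session` shape, its field names, and `resolve_target` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    # Hypothetical shape; the real media_sessions output may differ.
    app_user_model_id: str
    title: str
    is_current: bool = False  # True for the OS-routed session


def resolve_target(sessions: list[Session], target: Optional[str]) -> Optional[Session]:
    """Pick the session a media_transport call would act on (assumed logic)."""
    if not target:
        # No target given: act on the OS-routed current session.
        return next((s for s in sessions if s.is_current), None)
    needle = target.casefold()
    # First pass: case-insensitive substring match on AppUserModelId.
    for s in sessions:
        if needle in s.app_user_model_id.casefold():
            return s
    # Softer fallback: substring match on the track title.
    for s in sessions:
        if needle in s.title.casefold():
            return s
    return None  # The caller would report a missing session here.


# Made-up sample data for illustration.
sessions = [
    Session("Spotify.exe", "Some Track"),
    Session("Chrome", "Lo-fi mix - YouTube", is_current=True),
]
```

With this sample data, `"spotify"` resolves through the `AppUserModelId` pass, while `"youtube"` only matches through the title fallback, which is why the text calls that path softer.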
+
  ### media_key
- Send a media-control key to the active media session (Spotify, YouTube
- in browser, Windows Media Player, etc.). Pair with `now_playing` to
- inspect state first.
+ Blind global media keypress (`VK_MEDIA_PLAY_PAUSE` and friends). Layer-3
+ fallback for the rare case where neither a semantic API nor GSMTC can
+ act. Prefer `media_transport` whenever the user names an app — this
+ tool returns `degraded:true` because Windows doesn't surface the SMTC
+ routing outcome to user-mode, so we can't verify any app received it.
  ```json
  { "tool": "media_key", "input": { "action": "play_pause" } }
  { "tool": "media_key", "input": { "action": "next" } }
@@ -65,6 +89,17 @@ inspect state first.
  { "tool": "media_key", "input": { "action": "stop" } }
  ```
 
+ ### app_input
+ Focus a Windows application by process name and send a SendKeys
+ sequence to it. Escape hatch when GSMTC doesn't enumerate the surface
+ ("press space in Chrome to pause this YouTube tab"). Always returns
+ `degraded:true` — SendKeys cannot verify receipt at the target window.
+ ```json
+ { "tool": "app_input", "input": { "app": "chrome", "keys": "{SPACE}" } }
+ { "tool": "app_input", "input": { "app": "notepad", "keys": "Hello{ENTER}" } }
+ { "tool": "app_input", "input": { "app": "Spotify", "keys": "^{RIGHT}" } }
+ ```
+
  ### volume_set
  Set Windows master volume to a percentage, or mute / unmute / toggle.
  ```json
@@ -126,9 +161,24 @@ turn into common requests.
  1. `os_process_list` with `name: "<substring>"` → returns matching processes
  2. If `count === 0` → tell the user honestly, suggest `app_launch`
 
- **Media control workflow:**
- 1. `now_playing` → see what's currently playing
- 2. `media_key` control it (play_pause / next / previous / stop)
+ ## Media control — strict order
+
+ 1. If the user names an app ("Spotify", "YouTube", "VLC") ALWAYS try
+ `media_transport({action, target})` first. Verified, OS-confirmed.
+ 2. If `media_transport` returns `NoSession` OR the user didn't name an app
+ — fall back to `media_key({action})`. Blind global keystroke, returns
+ `degraded:true` because Windows can't tell us if anything received it.
+ 3. If GSMTC doesn't enumerate the surface at all (e.g. a YouTube tab the
+ browser hasn't registered with SMTC) — last resort: `app_input({app,
+ keys})` to focus the window and send a keystroke directly.
+
+ Never call `media_key` and `media_transport` in the same turn — redundant.
+ First call gives you the answer; second is noise the user has to read.
+
+ Honesty contract:
+ - `media_transport` success is OS-confirmed → trail row is silent (success).
+ - `media_key` and `app_input` always report `degraded:true` → yellow trail
+ row, because neither can verify receipt at the target app.
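One possible reading of the strict order above, sketched as a small dispatcher. Everything here is an assumption made for illustration: the `call_tool` callback, the `error == "NoSession"` result shape, the reuse of `target` as the `app_input` process name, and the `KEY_ACTION` mapping (inferred only from the example payloads, where `media_transport` uses `play` / `pause` / `toggle` / `next` and `media_key` uses `play_pause` / `next` / `previous` / `stop`). It is not the package's own code.

```python
from typing import Callable, Optional

# Hypothetical tool-call signature: (tool_name, input) -> result dict.
ToolCall = Callable[[str, dict], dict]

# Assumed mapping from media_transport actions to media_key actions.
# Note that "play" and "pause" both collapse into a blind toggle.
KEY_ACTION = {"play": "play_pause", "pause": "play_pause",
              "toggle": "play_pause", "next": "next"}


def control_media(call_tool: ToolCall, action: str,
                  target: Optional[str] = None,
                  fallback_keys: Optional[str] = None) -> dict:
    """Try the verified tool first, then degrade one step at a time."""
    if target:
        # Step 1: the user named an app, so try the verified path first.
        result = call_tool("media_transport", {"action": action, "target": target})
        # Assumption: a missing session is reported as error == "NoSession".
        if result.get("error") != "NoSession":
            return result  # OS-confirmed outcome; make no further calls
        if fallback_keys is not None:
            # Step 3: the surface is not enumerated by GSMTC, so focus the
            # app and send keys directly. Reusing `target` as the process
            # name is an assumption. The result is always degraded.
            return call_tool("app_input", {"app": target, "keys": fallback_keys})
    # Step 2: blind global keystroke. The result is always degraded.
    return call_tool("media_key", {"action": KEY_ACTION.get(action, action)})
```

A caller would pass `fallback_keys` (for example `"{SPACE}"`) only when it already knows the surface is not registered with SMTC; otherwise a `NoSession` result falls through to `media_key`.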
 
  **Volume change with feedback:**
  1. `volume_set` → returns the resulting volume percent in `result`