opencode-smart-voice-notify 1.2.4 → 1.3.0

package/LICENSE CHANGED
@@ -1,21 +1,21 @@
1
- MIT License
2
-
3
- Copyright (c) 2025 MasuRii
4
-
5
- Permission is hereby granted, free of charge, to any person obtaining a copy
6
- of this software and associated documentation files (the "Software"), to deal
7
- in the Software without restriction, including without limitation the rights
8
- to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
- copies of the Software, and to permit persons to whom the Software is
10
- furnished to do so, subject to the following conditions:
11
-
12
- The above copyright notice and this permission notice shall be included in all
13
- copies or substantial portions of the Software.
14
-
15
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
- IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
- FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
- AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
- LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
- OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
- SOFTWARE.
1
+ MIT License
2
+
3
+ Copyright (c) 2026 MasuRii
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md CHANGED
@@ -3,9 +3,14 @@
3
3
 
4
4
  # OpenCode Smart Voice Notify
5
5
 
6
+ ![Coverage](https://img.shields.io/badge/coverage-86.73%25-brightgreen)
7
+ ![Version](https://img.shields.io/badge/version-1.2.5-blue)
8
+ ![License](https://img.shields.io/badge/license-MIT-green)
9
+
10
+
6
11
  > **Disclaimer**: This project is not built by the OpenCode team and is not affiliated with [OpenCode](https://opencode.ai) in any way. It is an independent community plugin.
7
12
 
8
- A smart voice notification plugin for [OpenCode](https://opencode.ai) with **multiple TTS engines** and an intelligent reminder system.
13
+ A smart voice notification plugin for [OpenCode](https://opencode.ai) with **multiple TTS engines**, native desktop notifications, and an intelligent reminder system.
9
14
 
10
15
  <img width="1456" height="720" alt="image" src="https://github.com/user-attachments/assets/52ccf357-2548-400b-a346-6362f2fc3180" />
11
16
 
@@ -15,10 +20,11 @@ A smart voice notification plugin for [OpenCode](https://opencode.ai) with **mul
15
20
  ### Smart TTS Engine Selection
16
21
  The plugin automatically tries multiple TTS engines in order, falling back if one fails:
17
22
 
18
- 1. **ElevenLabs** (Online) - High-quality, anime-like voices with natural expression
19
- 2. **Edge TTS** (Free) - Microsoft's neural voices, native Node.js implementation (no Python required)
20
- 3. **Windows SAPI** (Offline) - Built-in Windows speech synthesis
21
- 4. **Local Sound Files** (Fallback) - Plays bundled MP3 files if all TTS fails
23
+ 1. **OpenAI-Compatible** (Cloud/Self-hosted) - Any OpenAI-compatible `/v1/audio/speech` endpoint (Kokoro, LocalAI, Coqui, AllTalk, OpenAI API, etc.)
24
+ 2. **ElevenLabs** (Online) - High-quality, anime-like voices with natural expression
25
+ 3. **Edge TTS** (Free) - Microsoft's neural voices, native Node.js implementation (no Python required)
26
+ 4. **Windows SAPI** (Offline) - Built-in Windows speech synthesis
27
+ 5. **Local Sound Files** (Fallback) - Plays bundled MP3 files if all TTS fails
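The ordered fallback described in the list above can be sketched as a simple loop. This is a minimal illustration, not the plugin's implementation: the `Engine` type and `speakWithFallback` are hypothetical names, and the engines are synchronous stubs, whereas real TTS calls would be asynchronous.

```typescript
// Hypothetical engine shape; the plugin's real interfaces are not shown in this diff.
type Engine = { name: string; speak: (text: string) => boolean };

// Try each engine in order and return the name of the first one that succeeds.
function speakWithFallback(engines: Engine[], text: string): string | null {
  for (const engine of engines) {
    try {
      if (engine.speak(text)) return engine.name;
    } catch {
      // A thrown error counts as a failure; move on to the next engine.
    }
  }
  return null; // Every engine failed; a caller would play a bundled sound file here.
}

// Stub engines in the README's order: the first throws, the second reports failure.
const engines: Engine[] = [
  { name: "openai", speak: () => { throw new Error("server unreachable"); } },
  { name: "elevenlabs", speak: () => false },
  { name: "edge", speak: () => true },
  { name: "sapi", speak: () => true },
];

const usedEngine = speakWithFallback(engines, "Task complete");
console.log(usedEngine);
```

Returning `null` rather than throwing keeps the final step (local sound files) a decision for the caller.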
22
28
 
23
29
  ### Smart Notification System
24
30
  - **Sound-first mode**: Play a sound immediately, then speak a TTS reminder if user doesn't respond
@@ -27,6 +33,7 @@ The plugin automatically tries multiple TTS engines in order, falling back if on
27
33
  - **Sound-only mode**: Just play sounds, no TTS
28
34
 
29
35
  ### Intelligent Reminders
36
+ - **Granular Control**: Enable or disable notifications and reminders for specific event types (Idle, Permission, Question, Error) via configuration.
30
37
  - Delayed TTS reminders if user doesn't respond within configurable time
31
38
  - Follow-up reminders with exponential backoff
32
39
  - Automatic cancellation when user responds
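As a rough illustration of follow-up reminders with exponential backoff, the sketch below computes the wait before each reminder. The doubling multiplier and the function name `reminderDelaysSeconds` are assumptions; the README states only that backoff is exponential. The inputs mirror the `ttsReminderDelaySeconds` and `maxFollowUpReminders` settings.

```typescript
// Returns the wait (in seconds) before the first reminder and before each follow-up.
// The multiplier of 2 is an assumed value, not taken from the plugin.
function reminderDelaysSeconds(
  initialDelay: number,
  maxFollowUps: number,
  multiplier: number = 2,
): number[] {
  const delays: number[] = [initialDelay];
  for (let i = 1; i <= maxFollowUps; i++) {
    // Each follow-up waits `multiplier` times longer than the previous reminder.
    delays.push(initialDelay * Math.pow(multiplier, i));
  }
  return delays;
}

const schedule = reminderDelaysSeconds(30, 3);
console.log(schedule); // [30, 60, 120, 240]
```

In a real scheduler, each pending timer would be cleared as soon as the user responds.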
@@ -43,11 +50,15 @@ The plugin automatically tries multiple TTS engines in order, falling back if on
43
50
  - **Smart fallback**: Automatically falls back to static messages if AI is unavailable
44
51
 
45
52
  ### System Integration
53
+ - **Native Desktop Notifications**: Windows (Toast), macOS (Notification Center), and Linux (notify-send) support
46
54
  - **Native Edge TTS**: No external dependencies (Python/pip) required
47
- - Wake monitor from sleep before notifying
48
- - Auto-boost volume if too low
49
- - TUI toast notifications
50
- - Cross-platform support (Windows, macOS, Linux)
55
+ - **Focus Detection** (macOS): Suppresses notifications when terminal is focused
56
+ - **Webhook Integration**: Receive notifications on Discord or any custom webhook endpoint when tasks finish or need attention
57
+ - **Themed Sound Packs**: Use custom sound collections (e.g., Warcraft, StarCraft) by simply pointing to a directory
58
+ - **Per-Project Sounds**: Assign unique sounds to different projects for easy identification
59
+ - **Wake monitor** from sleep before notifying
60
+ - **Auto-boost volume** if too low
61
+ - **TUI toast** notifications
51
62
 
52
63
  ## Installation
53
64
 
@@ -99,62 +110,101 @@ When you first run OpenCode with this plugin installed, it will **automatically
99
110
 
100
111
  The auto-generated configuration includes all advanced settings, message arrays, and engine options, so you don't have to refer back to the documentation for available settings.
101
112
 
102
- ### Manual Configuration
103
-
104
- If you prefer to create the config manually, add a `smart-voice-notify.jsonc` file in your OpenCode config directory (`~/.config/opencode/`):
105
-
106
- ```jsonc
107
- {
108
- // ============================================================
109
- // OpenCode Smart Voice Notify - Quick Start Configuration
110
- // ============================================================
111
- // For ALL available options, see example.config.jsonc in the plugin.
112
- // The plugin auto-creates a comprehensive config on first run.
113
- // ============================================================
114
-
115
- // Master switch to enable/disable the plugin without uninstalling
116
- "enabled": true,
117
-
118
- // Notification mode: 'sound-first', 'tts-first', 'both', 'sound-only'
119
- "notificationMode": "sound-first",
120
-
121
- // TTS engine: 'elevenlabs', 'edge', 'sapi'
122
- "ttsEngine": "elevenlabs",
123
- "enableTTS": true,
124
-
125
- // ElevenLabs settings (get API key from https://elevenlabs.io/app/settings/api-keys)
126
- "elevenLabsApiKey": "YOUR_API_KEY_HERE",
127
- "elevenLabsVoiceId": "cgSgspJ2msm6clMCkdW9", // Jessica - Playful, Bright
128
-
129
- // Edge TTS settings (free, no API key required)
130
- "edgeVoice": "en-US-AnaNeural",
131
- "edgePitch": "+50Hz",
132
- "edgeRate": "+10%",
133
-
134
- // TTS reminder settings
135
- "enableTTSReminder": true,
136
- "ttsReminderDelaySeconds": 30,
137
- "enableFollowUpReminders": true,
138
- "maxFollowUpReminders": 3,
139
-
140
- // AI-generated messages (optional - requires local AI server)
141
- "enableAIMessages": false,
142
- "aiEndpoint": "http://localhost:11434/v1",
143
- "aiModel": "llama3",
144
- "aiApiKey": "",
145
- "aiFallbackToStatic": true,
146
-
147
- // General settings
148
- "wakeMonitor": true,
149
- "forceVolume": true,
150
- "volumeThreshold": 50,
151
- "enableToast": true,
152
- "enableSound": true,
153
- "debugLog": false
154
- }
155
- ```
156
-
157
- For the complete configuration with all TTS engine settings, message arrays, AI prompts, and advanced options, see [`example.config.jsonc`](./example.config.jsonc) in the plugin directory.
113
+ ### Manual Configuration
114
+
115
+ If you prefer to create the config manually, add a `smart-voice-notify.jsonc` file in your OpenCode config directory (`~/.config/opencode/`):
116
+
117
+ ```jsonc
118
+ {
119
+ // Master switch to enable/disable the plugin without uninstalling
120
+ "enabled": true,
121
+
122
+ // Notification mode: 'sound-first', 'tts-first', 'both', 'sound-only'
123
+ "notificationMode": "sound-first",
124
+
125
+ // TTS engine: 'openai', 'elevenlabs', 'edge', 'sapi'
126
+ "ttsEngine": "openai",
127
+ "enableTTS": true,
128
+
129
+ // ElevenLabs settings (get API key from https://elevenlabs.io/app/settings/api-keys)
130
+ "elevenLabsApiKey": "YOUR_API_KEY_HERE",
131
+ "elevenLabsVoiceId": "cgSgspJ2msm6clMCkdW9", // Jessica - Playful, Bright
132
+
133
+ // OpenAI-compatible TTS (Kokoro, LocalAI, OpenAI, Coqui, AllTalk, etc.)
134
+ "openaiTtsEndpoint": "http://localhost:8880",
135
+ "openaiTtsVoice": "af_heart",
136
+ "openaiTtsModel": "kokoro",
137
+
138
+ // Edge TTS settings (free, no API key required)
139
+ "edgeVoice": "en-US-AnaNeural",
140
+ "edgePitch": "+50Hz",
141
+ "edgeRate": "+10%",
142
+
143
+ // Desktop Notifications
144
+ "enableDesktopNotification": true,
145
+ "desktopNotificationTimeout": 5,
146
+ "showProjectInNotification": true,
147
+
148
+ // TTS reminder settings
149
+ "enableTTSReminder": true,
150
+ "ttsReminderDelaySeconds": 30,
151
+ "enableFollowUpReminders": true,
152
+
153
+ // Focus Detection (macOS only)
154
+ "suppressWhenFocused": true,
155
+ "alwaysNotify": false,
156
+
157
+ // AI-generated messages (optional - requires local AI server)
158
+ "enableAIMessages": false,
159
+ "aiEndpoint": "http://localhost:11434/v1",
160
+
161
+ // Webhook settings (optional - works with Discord)
162
+ "enableWebhook": false,
163
+ "webhookUrl": "",
164
+ "webhookUsername": "OpenCode Notify",
165
+
166
+ // Sound theme settings (optional)
167
+ "soundThemeDir": "", // Path to custom sound theme directory
168
+
169
+ // Per-project sounds
170
+ "perProjectSounds": false,
171
+ "projectSoundSeed": 0,
172
+
173
+ // General settings
174
+ "wakeMonitor": true,
175
+ "forceVolume": false,
176
+ "volumeThreshold": 50,
177
+ "enableToast": true,
178
+ "enableSound": true,
179
+ "debugLog": false
180
+ }
181
+ ```
182
+
183
+ For the complete configuration with all TTS engine settings, message arrays, AI prompts, and advanced options, see [`example.config.jsonc`](./example.config.jsonc) in the plugin directory.
184
+
185
+ ### OpenAI-Compatible TTS Setup (Kokoro, LocalAI, OpenAI API, etc.)
186
+
187
+ For cloud-based or self-hosted TTS using any OpenAI-compatible `/v1/audio/speech` endpoint:
188
+
189
+ ```jsonc
190
+ {
191
+ "ttsEngine": "openai",
192
+ "openaiTtsEndpoint": "http://192.168.86.43:8880", // Your TTS server
193
+ "openaiTtsVoice": "af_heart", // Server-dependent
194
+ "openaiTtsModel": "kokoro", // Server-dependent
195
+ "openaiTtsApiKey": "", // Optional, if server requires auth
196
+ "openaiTtsSpeed": 1.0 // 0.25 to 4.0
197
+ }
198
+ ```
199
+
200
+ **Supported OpenAI-Compatible TTS Servers:**
201
+ | Server | Example Endpoint | Voices |
202
+ |--------|------------------|--------|
203
+ | Kokoro | `http://localhost:8880` | `af_heart`, `af_bella`, `am_adam`, etc. |
204
+ | LocalAI | `http://localhost:8080` | Model-dependent |
205
+ | AllTalk | `http://localhost:7851` | Model-dependent |
206
+ | OpenAI | `https://api.openai.com` | `alloy`, `echo`, `fable`, `onyx`, `nova`, `shimmer` |
207
+ | Coqui | `http://localhost:5002` | Model-dependent |
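To make the endpoint settings concrete, here is a hedged sketch of how a request to an OpenAI-compatible `/v1/audio/speech` endpoint might be assembled from the config keys above. The helper `buildSpeechRequest` is hypothetical, and it assumes that `openaiTtsEndpoint` is a base URL to which the path is appended. The body fields (`model`, `voice`, `input`, `speed`) follow the OpenAI speech API.

```typescript
type OpenAiTtsConfig = {
  openaiTtsEndpoint: string;
  openaiTtsVoice: string;
  openaiTtsModel: string;
  openaiTtsApiKey?: string;
  openaiTtsSpeed?: number;
};

// Build (but do not send) the HTTP request for a speech synthesis call.
function buildSpeechRequest(config: OpenAiTtsConfig, text: string) {
  // Assumption: the configured endpoint is a base URL without the API path.
  const base = config.openaiTtsEndpoint.replace(/\/+$/, "");
  // Clamp speed to the documented 0.25 to 4.0 range.
  const speed = Math.min(4.0, Math.max(0.25, config.openaiTtsSpeed ?? 1.0));
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (config.openaiTtsApiKey) {
    // Only send credentials when the server requires them.
    headers["Authorization"] = `Bearer ${config.openaiTtsApiKey}`;
  }
  return {
    url: `${base}/v1/audio/speech`,
    method: "POST",
    headers,
    body: JSON.stringify({
      model: config.openaiTtsModel,
      voice: config.openaiTtsVoice,
      input: text,
      speed,
    }),
  };
}

const speechRequest = buildSpeechRequest(
  {
    openaiTtsEndpoint: "http://localhost:8880/",
    openaiTtsVoice: "af_heart",
    openaiTtsModel: "kokoro",
    openaiTtsSpeed: 9, // out of range on purpose; clamped to 4.0
  },
  "Task complete",
);
console.log(speechRequest.url);
```

The returned object can be passed to `fetch(speechRequest.url, speechRequest)`; the response body is the audio data.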
158
208
 
159
209
  ### AI Message Generation (Optional)
160
210
 
@@ -173,12 +223,15 @@ If you want dynamic, AI-generated notification messages instead of preset ones,
173
223
  "aiEndpoint": "http://localhost:11434/v1",
174
224
  "aiModel": "llama3",
175
225
  "aiApiKey": "",
176
- "aiFallbackToStatic": true
226
+ "aiFallbackToStatic": true,
227
+ "enableContextAwareAI": false // Set to true for personalized messages with project/task context
177
228
  }
178
229
  ```
179
230
 
180
231
  3. **The AI will generate unique messages** for each notification, which are then spoken by your TTS engine.
181
232
 
233
+ 4. **Context-Aware Messages** (optional): Enable `enableContextAwareAI` for personalized notifications that include project name, task title, and change summary (e.g., "Your work on MyProject is complete!").
234
+
182
235
  **Supported AI Servers:**
183
236
  | Server | Default Endpoint | API Key |
184
237
  |--------|-----------------|---------|
@@ -188,8 +241,94 @@ If you want dynamic, AI-generated notification messages instead of preset ones,
188
241
  | vLLM | `http://localhost:8000/v1` | Use "EMPTY" |
189
242
  | Jan.ai | `http://localhost:1337/v1` | Required |
190
243
 
244
+ ### Discord / Webhook Integration (Optional)
245
+
246
+ Receive remote notifications on Discord or any custom endpoint. This is perfect for long-running tasks when you're away from your computer.
247
+
248
+ 1. **Create a Discord Webhook**:
249
+ - In Discord, go to **Server Settings** > **Integrations** > **Webhooks**.
250
+ - Click **New Webhook**, choose a channel, and click **Copy Webhook URL**.
251
+
252
+ 2. **Enable Webhooks in your config**:
253
+ ```jsonc
254
+ {
255
+ "enableWebhook": true,
256
+ "webhookUrl": "https://discord.com/api/webhooks/...",
257
+ "webhookUsername": "OpenCode Notify",
258
+ "webhookEvents": ["idle", "permission", "error", "question"],
259
+ "webhookMentionOnPermission": true
260
+ }
261
+ ```
262
+
263
+ 3. **Features**:
264
+ - **Color-coded Embeds**: Different colors for task completion (green), permissions (orange), errors (red), and questions (blue).
265
+ - **Smart Mentions**: Automatically @everyone on Discord for urgent permission requests.
266
+ - **Rate Limiting**: Intelligent retry logic with backoff if Discord's rate limits are hit.
267
+ - **Fire-and-forget**: Webhook requests never block local sound or TTS playback.
268
+
269
+ **Supported Webhook Events:**
270
+ | Event | Trigger |
271
+ |-------|---------|
272
+ | `idle` | Agent finished working |
273
+ | `permission` | Agent needs permission for a tool |
274
+ | `error` | Agent encountered an error |
275
+ | `question` | Agent is asking you a question |
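The following sketch shows what a Discord webhook payload for these events might look like. It is illustrative only: `buildDiscordPayload` is a hypothetical helper, the embed title format is invented, and the colour codes are assumed values, since the README names the colours but not their codes. The fields `username`, `content`, `embeds`, and `allowed_mentions` are part of Discord's webhook format.

```typescript
type WebhookEvent = "idle" | "permission" | "error" | "question";

// Assumed colour codes for green, orange, red, and blue.
const EMBED_COLORS: Record<WebhookEvent, number> = {
  idle: 0x2ecc71,
  permission: 0xe67e22,
  error: 0xe74c3c,
  question: 0x3498db,
};

// Build the JSON body for a Discord webhook POST.
function buildDiscordPayload(
  event: WebhookEvent,
  message: string,
  username: string,
  mentionOnPermission: boolean,
) {
  // Only permission requests trigger an @everyone mention, and only if enabled.
  const mention = event === "permission" && mentionOnPermission;
  return {
    username,
    content: mention ? "@everyone" : "",
    allowed_mentions: { parse: mention ? ["everyone"] : [] },
    embeds: [
      { title: `OpenCode: ${event}`, description: message, color: EMBED_COLORS[event] },
    ],
  };
}

const permissionPayload = buildDiscordPayload(
  "permission",
  "Agent needs permission for a tool",
  "OpenCode Notify",
  true,
);
const idlePayload = buildDiscordPayload("idle", "Agent finished working", "OpenCode Notify", true);
console.log(JSON.stringify(permissionPayload, null, 2));
```

Setting `allowed_mentions.parse` explicitly prevents an accidental ping if message text happens to contain `@everyone`.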
276
+
277
+
278
+ ### Custom Sound Themes (Optional)
279
+
280
+ You can replace individual sound files with entire "Sound Themes" (like the classic Warcraft II or StarCraft sound packs).
281
+
282
+ 1. **Set up your theme directory**:
283
+ Create a folder (e.g., `~/.config/opencode/themes/warcraft2/`) with the following structure:
284
+ ```text
285
+ warcraft2/
286
+ ├── idle/ # Sounds for when the agent finishes
287
+ │ ├── job_done.mp3
288
+ │ └── alright.wav
289
+ ├── permission/ # Sounds for permission requests
290
+ │ ├── help.mp3
291
+ │ └── need_orders.wav
292
+ ├── error/ # Sounds for agent errors
293
+ │ └── alert.mp3
294
+ └── question/ # Sounds for agent questions
295
+ └── yes_milord.mp3
296
+ ```
297
+
298
+ 2. **Configure the theme in your config**:
299
+ ```jsonc
300
+ {
301
+ "soundThemeDir": "themes/warcraft2",
302
+ "randomizeSoundFromTheme": true
303
+ }
304
+ ```
305
+
306
+ 3. **Features**:
307
+ - **Automatic Fallback**: If a theme subdirectory or sound is missing, the plugin automatically falls back to your default sound files.
308
+ - **Randomization**: If multiple sounds are in a subdirectory, the plugin will pick one at random each time (if `randomizeSoundFromTheme` is `true`).
309
+ - **Relative Paths**: Paths are relative to your OpenCode config directory (`~/.config/opencode/`).
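The fallback and randomization rules above could be expressed roughly as follows. This is a sketch under assumptions: `pickThemeSound` is a hypothetical helper that receives the file names already listed from a theme subdirectory, and choosing the alphabetically first file when randomization is off is a guess rather than documented behaviour.

```typescript
const AUDIO_EXTENSIONS = [".mp3", ".wav"];

// Choose a sound from a theme subdirectory listing, or fall back to the default sound.
function pickThemeSound(
  files: string[],
  fallback: string,
  randomize: boolean,
  random: () => number = Math.random,
): string {
  // Ignore anything that is not an audio file, and sort for a stable order.
  const sounds = files
    .filter((f) => AUDIO_EXTENSIONS.some((ext) => f.toLowerCase().endsWith(ext)))
    .sort();
  if (sounds.length === 0) return fallback; // Missing or empty subdirectory.
  const index = randomize ? Math.floor(random() * sounds.length) : 0;
  return sounds[index] ?? fallback;
}

const idleFiles = ["job_done.mp3", "alright.wav", "notes.txt"];
const fixedChoice = pickThemeSound(idleFiles, "default.mp3", false);
const randomChoice = pickThemeSound(idleFiles, "default.mp3", true, () => 0.99);
const emptyChoice = pickThemeSound([], "default.mp3", true);
console.log(fixedChoice, randomChoice, emptyChoice);
```

Passing the random source as a parameter keeps the selection testable; production code would use the default `Math.random`.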
310
+
311
+
191
312
  ## Requirements
192
313
 
314
+ ### Platform Support Matrix
315
+
316
+ | Feature | Windows | macOS | Linux |
317
+ |---------|:---:|:---:|:---:|
318
+ | **Sound Playback** | ✅ | ✅ | ✅ |
319
+ | **TTS (Cloud/Edge)** | ✅ | ✅ | ✅ |
320
+ | **TTS (Windows SAPI)** | ✅ | ❌ | ❌ |
321
+ | **Desktop Notifications** | ✅ | ✅ | ✅ (req libnotify) |
322
+ | **Focus Detection** | ❌ | ✅ | ❌ |
323
+ | **Webhook Integration** | ✅ | ✅ | ✅ |
324
+ | **Wake Monitor** | ✅ | ✅ | ✅ (X11/Gnome) |
325
+ | **Volume Control** | ✅ | ✅ | ✅ (Pulse/ALSA) |
326
+
327
+ ### For OpenAI-Compatible TTS
328
+ - Any server implementing the `/v1/audio/speech` endpoint
329
+ - Examples: [Kokoro](https://github.com/remsky/Kokoro-FastAPI), [LocalAI](https://localai.io), [AllTalk](https://github.com/erew123/alltalk_tts), OpenAI API, etc.
330
+ - Works with both local self-hosted servers and cloud-based providers.
331
+
193
332
  ### For ElevenLabs TTS
194
333
  - ElevenLabs API key (free tier: 10,000 characters/month)
195
334
  - Internet connection
@@ -200,16 +339,48 @@ If you want dynamic, AI-generated notification messages instead of preset ones,
200
339
  ### For Windows SAPI
201
340
  - Windows OS (uses built-in System.Speech)
202
341
 
342
+ ### For Desktop Notifications
343
+ - **Windows**: Built-in (uses Toast notifications)
344
+ - **macOS**: Built-in (uses Notification Center)
345
+ - **Linux**: Requires `notify-send` (libnotify)
346
+ ```bash
347
+ # Ubuntu/Debian
348
+ sudo apt install libnotify-bin
349
+
350
+ # Fedora
351
+ sudo dnf install libnotify
352
+
353
+ # Arch Linux
354
+ sudo pacman -S libnotify
355
+ ```
356
+
203
357
  ### For Sound Playback
204
358
  - **Windows**: Built-in (uses Windows Media Player)
205
359
  - **macOS**: Built-in (`afplay`)
206
360
  - **Linux**: `paplay` or `aplay`
207
361
 
362
+ ### For Focus Detection
363
+ Focus detection suppresses sound and desktop notifications when the terminal is focused.
364
+
365
+ | Platform | Support | Notes |
366
+ |----------|---------|-------|
367
+ | **macOS** | ✅ Full | Uses AppleScript to detect frontmost application |
368
+ | **Windows** | ❌ Not supported | No reliable API available |
369
+ | **Linux** | ❌ Not supported | Varies by desktop environment |
370
+
371
+ > **Note**: On unsupported platforms, notifications are always sent (fail-open behavior). TTS reminders are never suppressed, even when focused, since users may step away after seeing the toast.
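The suppression rules described above can be summarized in a small decision function. This is a sketch, not the plugin's code: `shouldSuppress` and the `NotificationKind` names are hypothetical, and it assumes that `alwaysNotify` overrides `suppressWhenFocused`. The `"darwin"` value is what Node.js reports in `process.platform` on macOS.

```typescript
type FocusConfig = { suppressWhenFocused: boolean; alwaysNotify: boolean };
type NotificationKind = "sound" | "desktop" | "tts-reminder";

// Decide whether a notification should be skipped because the terminal is focused.
function shouldSuppress(
  kind: NotificationKind,
  platform: string,
  terminalFocused: boolean,
  config: FocusConfig,
): boolean {
  // TTS reminders are never suppressed, even when the terminal is focused.
  if (kind === "tts-reminder") return false;
  // Assumption: alwaysNotify takes precedence over suppressWhenFocused.
  if (config.alwaysNotify || !config.suppressWhenFocused) return false;
  // Fail open: focus can only be detected on macOS, so notify everywhere else.
  if (platform !== "darwin") return false;
  return terminalFocused;
}

const focusConfig: FocusConfig = { suppressWhenFocused: true, alwaysNotify: false };
const suppressedOnMac = shouldSuppress("sound", "darwin", true, focusConfig);
const suppressedOnLinux = shouldSuppress("sound", "linux", true, focusConfig);
const reminderSuppressed = shouldSuppress("tts-reminder", "darwin", true, focusConfig);
console.log(suppressedOnMac, suppressedOnLinux, reminderSuppressed);
```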
372
+
373
+ ### For Webhook Notifications
374
+ - **Discord**: Full support for Discord's webhook embed format.
375
+ - **Generic**: Works with any endpoint that accepts a POST request with a JSON body (though formatting is optimized for Discord).
376
+ - **Rate Limits**: The plugin handles HTTP 429 (Too Many Requests) automatically with retries and a 250ms queue delay.
377
+
208
378
  ## Events Handled
209
379
 
210
380
  | Event | Action |
211
381
  |-------|--------|
212
382
  | `session.idle` | Agent finished working - notify user |
383
+ | `session.error` | Agent encountered an error - alert user |
213
384
  | `permission.asked` | Permission request (SDK v1.1.1+) - alert user |
214
385
  | `permission.updated` | Permission request (SDK v1.0.x) - alert user |
215
386
  | `permission.replied` | User responded - cancel pending reminders |
@@ -247,6 +418,23 @@ To develop on this plugin locally:
247
418
  }
248
419
  ```
249
420
 
421
+ ### Testing
422
+
423
+ The plugin uses [Bun](https://bun.sh)'s built-in test runner for unit and E2E tests.
424
+
425
+ ```bash
426
+ # Run all tests
427
+ bun test
428
+
429
+ # Run tests with coverage
430
+ bun test --coverage
431
+
432
+ # Run tests in watch mode
433
+ bun test --watch
434
+ ```
435
+
436
+ For more detailed testing guidelines and mock usage examples, see [CONTRIBUTING.md](./CONTRIBUTING.md).
437
+
250
438
  ## Updating
251
439
 
252
440
  OpenCode does not automatically update plugins. To update to the latest version: