opencode-smart-voice-notify 1.3.0 → 1.3.2

package/README.md CHANGED
@@ -1,460 +1,585 @@
1
- <!-- Dynamic Header -->
2
- <img width="100%" src="https://capsule-render.vercel.app/api?type=waving&color=0:667eea,100:764ba2&height=120&section=header"/>
3
-
4
- # OpenCode Smart Voice Notify
5
-
6
- ![Coverage](https://img.shields.io/badge/coverage-86.73%25-brightgreen)
7
- ![Version](https://img.shields.io/badge/version-1.2.5-blue)
8
- ![License](https://img.shields.io/badge/license-MIT-green)
9
-
10
-
11
- > **Disclaimer**: This project is not built by the OpenCode team and is not affiliated with [OpenCode](https://opencode.ai) in any way. It is an independent community plugin.
12
-
13
- A smart voice notification plugin for [OpenCode](https://opencode.ai) with **multiple TTS engines**, native desktop notifications, and an intelligent reminder system.
14
-
15
- <img width="1456" height="720" alt="image" src="https://github.com/user-attachments/assets/52ccf357-2548-400b-a346-6362f2fc3180" />
16
-
17
-
18
- ## Features
19
-
20
- ### Smart TTS Engine Selection
21
- The plugin automatically tries multiple TTS engines in order, falling back if one fails:
22
-
23
- 1. **OpenAI-Compatible** (Cloud/Self-hosted) - Any OpenAI-compatible `/v1/audio/speech` endpoint (Kokoro, LocalAI, Coqui, AllTalk, OpenAI API, etc.)
24
- 2. **ElevenLabs** (Online) - High-quality, anime-like voices with natural expression
25
- 3. **Edge TTS** (Free) - Microsoft's neural voices, native Node.js implementation (no Python required)
26
- 4. **Windows SAPI** (Offline) - Built-in Windows speech synthesis
27
- 5. **Local Sound Files** (Fallback) - Plays bundled MP3 files if all TTS fails
28
-
29
- ### Smart Notification System
30
- - **Sound-first mode**: Play a sound immediately, then speak a TTS reminder if user doesn't respond
31
- - **TTS-first mode**: Speak immediately using TTS
32
- - **Both mode**: Play sound AND speak TTS at the same time
33
- - **Sound-only mode**: Just play sounds, no TTS
34
-
35
- ### Intelligent Reminders
36
- - **Granular Control**: Enable or disable notifications and reminders for specific event types (Idle, Permission, Question, Error) via configuration.
37
- - Delayed TTS reminders if user doesn't respond within configurable time
38
- - Follow-up reminders with exponential backoff
39
- - Automatic cancellation when user responds
40
- - Per-notification type delays (permission requests are more urgent)
41
- - **Smart Quota Handling**: Automatically falls back to free Edge TTS if ElevenLabs quota is exceeded
42
- - **Permission Batching**: Multiple simultaneous permission requests are batched into a single notification (e.g., "5 permission requests require your attention")
43
- - **Question Tool Support** (SDK v1.1.7+): Notifies when the agent asks questions and needs user input
44
-
45
- ### AI-Generated Messages
46
- - **Dynamic notifications**: Use a local AI to generate unique, contextual messages instead of preset static ones
47
- - **OpenAI-compatible**: Works with Ollama, LM Studio, LocalAI, vLLM, llama.cpp, Jan.ai, or any OpenAI-compatible endpoint
48
- - **User-hosted**: You provide your own AI endpoint - no cloud API keys required
49
- - **Custom prompts**: Configure prompts per notification type for full control over AI personality
50
- - **Smart fallback**: Automatically falls back to static messages if AI is unavailable
51
-
52
- ### System Integration
53
- - **Native Desktop Notifications**: Windows (Toast), macOS (Notification Center), and Linux (notify-send) support
54
- - **Native Edge TTS**: No external dependencies (Python/pip) required
55
- - **Focus Detection** (macOS): Suppresses notifications when terminal is focused
56
- - **Webhook Integration**: Receive notifications on Discord or any custom webhook endpoint when tasks finish or need attention
57
- - **Themed Sound Packs**: Use custom sound collections (e.g., Warcraft, StarCraft) by simply pointing to a directory
58
- - **Per-Project Sounds**: Assign unique sounds to different projects for easy identification
59
- - **Wake monitor** from sleep before notifying
60
- - **Auto-boost volume** if too low
61
- - **TUI toast** notifications
62
-
63
- ## Installation
64
-
65
- ### Option 1: From npm/Bun (Recommended)
66
-
67
- Add to your OpenCode config file (`~/.config/opencode/opencode.json`):
68
-
69
- ```json
70
- {
71
- "$schema": "https://opencode.ai/config.json",
72
- "plugin": ["opencode-smart-voice-notify@latest"]
73
- }
74
- ```
75
-
76
- > **Note**: OpenCode will automatically install the plugin using your system's package manager (npm or bun).
77
-
78
- ### Option 2: From GitHub
79
-
80
- ```json
81
- {
82
- "$schema": "https://opencode.ai/config.json",
83
- "plugin": ["github:MasuRii/opencode-smart-voice-notify"]
84
- }
85
- ```
86
-
87
- ### Option 3: Local Development
88
-
89
- 1. Clone the repository:
90
- ```bash
91
- git clone https://github.com/MasuRii/opencode-smart-voice-notify.git
92
- ```
93
-
94
- 2. Reference the local path in your config:
95
- ```json
96
- {
97
- "plugin": ["file:///path/to/opencode-smart-voice-notify"]
98
- }
99
- ```
100
-
101
- ## Configuration
102
-
103
- ### Automatic Setup
104
-
105
- When you first run OpenCode with this plugin installed, it will **automatically create**:
106
-
107
- 1. **`~/.config/opencode/smart-voice-notify.jsonc`** - A comprehensive configuration file with all available options fully documented.
108
- 2. **`~/.config/opencode/assets/*.mp3`** - Bundled notification sound files.
109
- 3. **`~/.config/opencode/logs/`** - Debug log folder (created when debug logging is enabled).
110
-
111
- The auto-generated configuration includes all advanced settings, message arrays, and engine options, so you don't have to refer back to the documentation for available settings.
112
-
113
- ### Manual Configuration
114
-
115
- If you prefer to create the config manually, add a `smart-voice-notify.jsonc` file in your OpenCode config directory (`~/.config/opencode/`):
116
-
117
- ```jsonc
118
- {
119
- // Master switch to enable/disable the plugin without uninstalling
120
- "enabled": true,
121
-
122
- // Notification mode: 'sound-first', 'tts-first', 'both', 'sound-only'
123
- "notificationMode": "sound-first",
124
-
125
- // TTS engine: 'openai', 'elevenlabs', 'edge', 'sapi'
126
- "ttsEngine": "openai",
127
- "enableTTS": true,
128
-
129
- // ElevenLabs settings (get API key from https://elevenlabs.io/app/settings/api-keys)
130
- "elevenLabsApiKey": "YOUR_API_KEY_HERE",
131
- "elevenLabsVoiceId": "cgSgspJ2msm6clMCkdW9", // Jessica - Playful, Bright
132
-
133
- // OpenAI-compatible TTS (Kokoro, LocalAI, OpenAI, Coqui, AllTalk, etc.)
134
- "openaiTtsEndpoint": "http://localhost:8880",
135
- "openaiTtsVoice": "af_heart",
136
- "openaiTtsModel": "kokoro",
137
-
138
- // Edge TTS settings (free, no API key required)
139
- "edgeVoice": "en-US-AnaNeural",
140
- "edgePitch": "+50Hz",
141
- "edgeRate": "+10%",
142
-
143
- // Desktop Notifications
144
- "enableDesktopNotification": true,
145
- "desktopNotificationTimeout": 5,
146
- "showProjectInNotification": true,
147
-
148
- // TTS reminder settings
149
- "enableTTSReminder": true,
150
- "ttsReminderDelaySeconds": 30,
151
- "enableFollowUpReminders": true,
152
-
153
- // Focus Detection (macOS only)
154
- "suppressWhenFocused": true,
155
- "alwaysNotify": false,
156
-
157
- // AI-generated messages (optional - requires local AI server)
158
- "enableAIMessages": false,
159
- "aiEndpoint": "http://localhost:11434/v1",
160
-
161
- // Webhook settings (optional - works with Discord)
162
- "enableWebhook": false,
163
- "webhookUrl": "",
164
- "webhookUsername": "OpenCode Notify",
165
-
166
- // Sound theme settings (optional)
167
- "soundThemeDir": "", // Path to custom sound theme directory
168
-
169
- // Per-project sounds
170
- "perProjectSounds": false,
171
- "projectSoundSeed": 0,
172
-
173
- // General settings
174
- "wakeMonitor": true,
175
- "forceVolume": false,
176
- "volumeThreshold": 50,
177
- "enableToast": true,
178
- "enableSound": true,
179
- "debugLog": false
180
- }
181
- ```
182
-
183
- For the complete configuration with all TTS engine settings, message arrays, AI prompts, and advanced options, see [`example.config.jsonc`](./example.config.jsonc) in the plugin directory.
184
-
185
- ### OpenAI-Compatible TTS Setup (Kokoro, LocalAI, OpenAI API, etc.)
186
-
187
- For cloud-based or self-hosted TTS using any OpenAI-compatible `/v1/audio/speech` endpoint:
188
-
189
- ```jsonc
190
- {
191
- "ttsEngine": "openai",
192
- "openaiTtsEndpoint": "http://192.168.86.43:8880", // Your TTS server
193
- "openaiTtsVoice": "af_heart", // Server-dependent
194
- "openaiTtsModel": "kokoro", // Server-dependent
195
- "openaiTtsApiKey": "", // Optional, if server requires auth
196
- "openaiTtsSpeed": 1.0 // 0.25 to 4.0
197
- }
198
- ```
199
-
200
- **Supported OpenAI-Compatible TTS Servers:**
201
- | Server | Example Endpoint | Voices |
202
- |--------|------------------|--------|
203
- | Kokoro | `http://localhost:8880` | `af_heart`, `af_bella`, `am_adam`, etc. |
204
- | LocalAI | `http://localhost:8080` | Model-dependent |
205
- | AllTalk | `http://localhost:7851` | Model-dependent |
206
- | OpenAI | `https://api.openai.com` | `alloy`, `echo`, `fable`, `onyx`, `nova`, `shimmer` |
207
- | Coqui | `http://localhost:5002` | Model-dependent |
208
-
209
- ### AI Message Generation (Optional)
210
-
211
- If you want dynamic, AI-generated notification messages instead of preset ones, you can connect to a local AI server:
212
-
213
- 1. **Install a local AI server** (e.g., [Ollama](https://ollama.ai)):
214
- ```bash
215
- # Install Ollama and pull a model
216
- ollama pull llama3
217
- ```
218
-
219
- 2. **Enable AI messages in your config**:
220
- ```jsonc
221
- {
222
- "enableAIMessages": true,
223
- "aiEndpoint": "http://localhost:11434/v1",
224
- "aiModel": "llama3",
225
- "aiApiKey": "",
226
- "aiFallbackToStatic": true,
227
- "enableContextAwareAI": false // Set to true for personalized messages with project/task context
228
- }
229
- ```
230
-
231
- 3. **The AI will generate unique messages** for each notification, which are then spoken by your TTS engine.
232
-
233
- 4. **Context-Aware Messages** (optional): Enable `enableContextAwareAI` for personalized notifications that include project name, task title, and change summary (e.g., "Your work on MyProject is complete!").
234
-
235
- **Supported AI Servers:**
236
- | Server | Default Endpoint | API Key |
237
- |--------|-----------------|---------|
238
- | Ollama | `http://localhost:11434/v1` | Not needed |
239
- | LM Studio | `http://localhost:1234/v1` | Not needed |
240
- | LocalAI | `http://localhost:8080/v1` | Not needed |
241
- | vLLM | `http://localhost:8000/v1` | Use "EMPTY" |
242
- | Jan.ai | `http://localhost:1337/v1` | Required |
243
-
244
- ### Discord / Webhook Integration (Optional)
245
-
246
- Receive remote notifications on Discord or any custom endpoint. This is perfect for long-running tasks when you're away from your computer.
247
-
248
- 1. **Create a Discord Webhook**:
249
- - In Discord, go to **Server Settings** > **Integrations** > **Webhooks**.
250
- - Click **New Webhook**, choose a channel, and click **Copy Webhook URL**.
251
-
252
- 2. **Enable Webhooks in your config**:
253
- ```jsonc
254
- {
255
- "enableWebhook": true,
256
- "webhookUrl": "https://discord.com/api/webhooks/...",
257
- "webhookUsername": "OpenCode Notify",
258
- "webhookEvents": ["idle", "permission", "error", "question"],
259
- "webhookMentionOnPermission": true
260
- }
261
- ```
262
-
263
- 3. **Features**:
264
- - **Color-coded Embeds**: Different colors for task completion (green), permissions (orange), errors (red), and questions (blue).
265
- - **Smart Mentions**: Automatically @everyone on Discord for urgent permission requests.
266
- - **Rate Limiting**: Intelligent retry logic with backoff if Discord's rate limits are hit.
267
- - **Fire-and-forget**: Webhook requests never block local sound or TTS playback.
268
-
269
- **Supported Webhook Events:**
270
- | Event | Trigger |
271
- |-------|---------|
272
- | `idle` | Agent finished working |
273
- | `permission` | Agent needs permission for a tool |
274
- | `error` | Agent encountered an error |
275
- | `question` | Agent is asking you a question |
276
-
277
-
278
- ### Custom Sound Themes (Optional)
279
-
280
- You can replace individual sound files with entire "Sound Themes" (like the classic Warcraft II or StarCraft sound packs).
281
-
282
- 1. **Set up your theme directory**:
283
- Create a folder (e.g., `~/.config/opencode/themes/warcraft2/`) with the following structure:
284
- ```text
285
- warcraft2/
286
- ├── idle/ # Sounds for when the agent finishes
287
- │ ├── job_done.mp3
288
- │ └── alright.wav
289
- ├── permission/ # Sounds for permission requests
290
- │ ├── help.mp3
291
- │ └── need_orders.wav
292
- ├── error/ # Sounds for agent errors
293
- │ └── alert.mp3
294
- └── question/ # Sounds for agent questions
295
- └── yes_milord.mp3
296
- ```
297
-
298
- 2. **Configure the theme in your config**:
299
- ```jsonc
300
- {
301
- "soundThemeDir": "themes/warcraft2",
302
- "randomizeSoundFromTheme": true
303
- }
304
- ```
305
-
306
- 3. **Features**:
307
- - **Automatic Fallback**: If a theme subdirectory or sound is missing, the plugin automatically falls back to your default sound files.
308
- - **Randomization**: If multiple sounds are in a subdirectory, the plugin will pick one at random each time (if `randomizeSoundFromTheme` is `true`).
309
- - **Relative Paths**: Paths are relative to your OpenCode config directory (`~/.config/opencode/`).
310
-
311
-
312
- ## Requirements
313
-
314
- ### Platform Support Matrix
315
-
316
- | Feature | Windows | macOS | Linux |
317
- |---------|:---:|:---:|:---:|
318
- | **Sound Playback** | ✅ | ✅ | ✅ |
319
- | **TTS (Cloud/Edge)** | ✅ | ✅ | ✅ |
320
- | **TTS (Windows SAPI)** | ✅ | ❌ | ❌ |
321
- | **Desktop Notifications** | ✅ | ✅ | ✅ (req libnotify) |
322
- | **Focus Detection** | ❌ | ✅ | ❌ |
323
- | **Webhook Integration** | ✅ | ✅ | ✅ |
324
- | **Wake Monitor** | ✅ | ✅ | ✅ (X11/Gnome) |
325
- | **Volume Control** | ✅ | | (Pulse/ALSA) |
326
-
327
- ### For OpenAI-Compatible TTS
328
- - Any server implementing the `/v1/audio/speech` endpoint
329
- - Examples: [Kokoro](https://github.com/remsky/Kokoro-FastAPI), [LocalAI](https://localai.io), [AllTalk](https://github.com/erew123/alltalk_tts), OpenAI API, etc.
330
- - Works with both local self-hosted servers and cloud-based providers.
331
-
332
- ### For ElevenLabs TTS
333
- - ElevenLabs API key (free tier: 10,000 characters/month)
334
- - Internet connection
335
-
336
- ### For Edge TTS
337
- - Internet connection (No external dependencies required)
338
-
339
- ### For Windows SAPI
340
- - Windows OS (uses built-in System.Speech)
341
-
342
- ### For Desktop Notifications
343
- - **Windows**: Built-in (uses Toast notifications)
344
- - **macOS**: Built-in (uses Notification Center)
345
- - **Linux**: Requires `notify-send` (libnotify)
346
- ```bash
347
- # Ubuntu/Debian
348
- sudo apt install libnotify-bin
349
-
350
- # Fedora
351
- sudo dnf install libnotify
352
-
353
- # Arch Linux
354
- sudo pacman -S libnotify
355
- ```
356
-
357
- ### For Sound Playback
358
- - **Windows**: Built-in (uses Windows Media Player)
359
- - **macOS**: Built-in (`afplay`)
360
- - **Linux**: `paplay` or `aplay`
361
-
362
- ### For Focus Detection
363
- Focus detection suppresses sound and desktop notifications when the terminal is focused.
364
-
365
- | Platform | Support | Notes |
366
- |----------|---------|-------|
367
- | **macOS** | ✅ Full | Uses AppleScript to detect frontmost application |
368
- | **Windows** | ❌ Not supported | No reliable API available |
369
- | **Linux** | ❌ Not supported | Varies by desktop environment |
370
-
371
- > **Note**: On unsupported platforms, notifications are always sent (fail-open behavior). TTS reminders are never suppressed, even when focused, since users may step away after seeing the toast.
372
-
373
- ### For Webhook Notifications
374
- - **Discord**: Full support for Discord's webhook embed format.
375
- - **Generic**: Works with any endpoint that accepts a POST request with a JSON body (though formatting is optimized for Discord).
376
- - **Rate Limits**: The plugin handles HTTP 429 (Too Many Requests) automatically with retries and a 250ms queue delay.
377
-
378
- ## Events Handled
379
-
380
- | Event | Action |
381
- |-------|--------|
382
- | `session.idle` | Agent finished working - notify user |
383
- | `session.error` | Agent encountered an error - alert user |
384
- | `permission.asked` | Permission request (SDK v1.1.1+) - alert user |
385
- | `permission.updated` | Permission request (SDK v1.0.x) - alert user |
386
- | `permission.replied` | User responded - cancel pending reminders |
387
- | `question.asked` | Agent asks question (SDK v1.1.7+) - notify user |
388
- | `question.replied` | User answered question - cancel pending reminders |
389
- | `question.rejected` | User dismissed question - cancel pending reminders |
390
- | `message.updated` | New user message - cancel pending reminders |
391
- | `session.created` | New session - reset state |
392
-
393
- > **Note**: The plugin supports OpenCode SDK v1.0.x, v1.1.x, and v1.1.7+ for backward compatibility.
394
-
395
- ## Development
396
-
397
- To develop on this plugin locally:
398
-
399
- 1. Clone the repository:
400
- ```bash
401
- git clone https://github.com/MasuRii/opencode-smart-voice-notify.git
402
- cd opencode-smart-voice-notify
403
- ```
404
-
405
- 2. Install dependencies:
406
- ```bash
407
- # Using Bun (recommended)
408
- bun install
409
-
410
- # Or using npm
411
- npm install
412
- ```
413
-
414
- 3. Link to your OpenCode config:
415
- ```json
416
- {
417
- "plugin": ["file:///absolute/path/to/opencode-smart-voice-notify"]
418
- }
419
- ```
420
-
421
- ### Testing
422
-
423
- The plugin uses [Bun](https://bun.sh)'s built-in test runner for unit and E2E tests.
424
-
425
- ```bash
426
- # Run all tests
427
- bun test
428
-
429
- # Run tests with coverage
430
- bun test --coverage
431
-
432
- # Run tests in watch mode
433
- bun test --watch
434
- ```
435
-
436
- For more detailed testing guidelines and mock usage examples, see [CONTRIBUTING.md](./CONTRIBUTING.md).
437
-
438
- ## Updating
439
-
440
- OpenCode does not automatically update plugins. To update to the latest version:
441
-
442
- ```bash
443
- # Clear the cached plugin
444
- rm -rf ~/.cache/opencode/node_modules/opencode-smart-voice-notify
445
-
446
- # Run OpenCode to trigger a fresh install
447
- opencode
448
- ```
449
-
450
- ## License
451
-
452
- MIT
453
-
454
- ## Support
455
-
456
- - Open an issue on [GitHub](https://github.com/MasuRii/opencode-smart-voice-notify/issues)
457
- - Check the [OpenCode docs](https://opencode.ai/docs/plugins)
458
-
459
- <!-- Dynamic Header -->
460
- <img width="100%" src="https://capsule-render.vercel.app/api?type=waving&color=0:667eea,100:764ba2&height=120&section=header"/>
1
+ <!-- Dynamic Header -->
2
+ <img width="100%" src="https://capsule-render.vercel.app/api?type=waving&color=0:667eea,100:764ba2&height=120&section=header"/>
3
+
4
+ # OpenCode Smart Voice Notify
5
+
6
+ [![npm version](https://img.shields.io/npm/v/opencode-smart-voice-notify?color=blue&logo=npm)](https://www.npmjs.com/package/opencode-smart-voice-notify)
7
+ [![npm downloads](https://img.shields.io/npm/dm/opencode-smart-voice-notify?color=blue&logo=npm)](https://www.npmjs.com/package/opencode-smart-voice-notify)
8
+ [![GitHub release](https://img.shields.io/github/v/release/MasuRii/opencode-smart-voice-notify?logo=github)](https://github.com/MasuRii/opencode-smart-voice-notify/releases)
9
+ [![CI](https://img.shields.io/github/actions/workflow/status/MasuRii/opencode-smart-voice-notify/test.yml?branch=master&logo=github&label=tests)](https://github.com/MasuRii/opencode-smart-voice-notify/actions/workflows/test.yml)
10
+ [![License](https://img.shields.io/github/license/MasuRii/opencode-smart-voice-notify?color=green)](https://github.com/MasuRii/opencode-smart-voice-notify/blob/master/LICENSE)
11
+ [![Node](https://img.shields.io/node/v/opencode-smart-voice-notify?color=brightgreen&logo=node.js)](https://nodejs.org)
12
+ [![Platform](https://img.shields.io/badge/platform-Windows%20%7C%20macOS%20%7C%20Linux-lightgrey?logo=windows-terminal)](https://github.com/MasuRii/opencode-smart-voice-notify#platform-support-matrix)
13
+
14
+
15
+ > **Disclaimer**: This project is not built by the OpenCode team and is not affiliated with [OpenCode](https://opencode.ai) in any way. It is an independent community plugin.
16
+
17
+ A smart voice notification plugin for [OpenCode](https://opencode.ai) with **multiple TTS engines**, native desktop notifications, and an intelligent reminder system.
18
+
19
+ <img width="1456" height="720" alt="image" src="https://github.com/user-attachments/assets/52ccf357-2548-400b-a346-6362f2fc3180" />
20
+
21
+
22
+ ## Features
23
+
24
+ ### Smart TTS Engine Selection
25
+ The plugin automatically tries multiple TTS engines in order, falling back if one fails:
26
+
27
+ 1. **OpenAI-Compatible** (Cloud/Self-hosted) - Any OpenAI-compatible `/v1/audio/speech` endpoint (Kokoro, LocalAI, Coqui, AllTalk, OpenAI API, etc.)
28
+ 2. **ElevenLabs** (Online) - High-quality, anime-like voices with natural expression
29
+ 3. **Edge TTS** (Free) - Microsoft's neural voices via Python CLI (recommended) or native npm fallback
30
+ 4. **Windows SAPI** (Offline) - Built-in Windows speech synthesis
31
+ 5. **macOS Say** (Offline) - Built-in macOS speech synthesis
32
+ 6. **Local Sound Files** (Fallback) - Plays bundled MP3 files if all TTS fails
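The ordered fallback described in this list can be sketched roughly as follows. This is an illustration only, not the plugin's actual code: the `Engine` type and `speakWithFallback` function are hypothetical names, and each engine is assumed to signal failure by rejecting its promise.

```typescript
// Hypothetical engine shape; the plugin's real internals may differ.
type Engine = { name: string; speak: (text: string) => Promise<void> };

// Try each engine in order and return the name of the first one that succeeds.
// Returns null when every engine fails, so the caller can play a bundled sound file.
async function speakWithFallback(engines: Engine[], text: string): Promise<string | null> {
  for (const engine of engines) {
    try {
      await engine.speak(text);
      return engine.name;
    } catch {
      // This engine failed (network error, quota, missing binary); try the next one.
    }
  }
  return null;
}
```

Keeping the engines in a plain ordered array means the preference order is just data, so a caller could move the configured `ttsEngine` to the front without changing the loop.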
33
+
34
+ ### Smart Notification System
35
+ - **Sound-first mode**: Play a sound immediately, then speak a TTS reminder if user doesn't respond
36
+ - **TTS-first mode**: Speak immediately using TTS
37
+ - **Both mode**: Play sound AND speak TTS at the same time
38
+ - **Sound-only mode**: Just play sounds, no TTS
39
+
40
+ ### Intelligent Reminders
41
+ - **Granular Control**: Enable or disable notifications and reminders for specific event types (Idle, Permission, Question, Error) via configuration.
42
+ - Delayed TTS reminders if user doesn't respond within configurable time
43
+ - Follow-up reminders with exponential backoff
44
+ - Automatic cancellation when user responds
45
+ - Per-notification type delays (permission requests are more urgent)
46
+ - **Smart Quota Handling**: Automatically falls back to free Edge TTS if ElevenLabs quota is exceeded
47
+ - **Permission Batching**: Multiple simultaneous permission requests are batched into a single notification (e.g., "5 permission requests require your attention")
48
+ - **Question Tool Support** (SDK v1.1.7+): Notifies when the agent asks questions and needs user input
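As a rough illustration of follow-up reminders with exponential backoff, the sketch below computes the wait before each reminder. The function name, its parameters, and the default multiplier of 2 are assumptions made for this example; the plugin's real configuration keys and growth factor may differ.

```typescript
// Hypothetical helper: parameter names are illustrative, not the plugin's config keys.
// Returns the seconds to wait before each reminder, measured from the previous one.
function reminderDelays(firstDelaySeconds: number, followUps: number, multiplier: number = 2): number[] {
  const delays: number[] = [];
  let delay = firstDelaySeconds;
  // One initial reminder plus `followUps` follow-ups, each waiting longer than the last.
  for (let i = 0; i <= followUps; i++) {
    delays.push(delay);
    delay = delay * multiplier;
  }
  return delays;
}
```

With a 30-second first delay and three follow-ups, `reminderDelays(30, 3)` yields `[30, 60, 120, 240]`. A user response would simply cancel whichever timer is pending.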
49
+
50
+ ### AI-Generated Messages
51
+ - **Dynamic notifications**: Use a local AI to generate unique, contextual messages instead of preset static ones
52
+ - **OpenAI-compatible**: Works with Ollama, LM Studio, LocalAI, vLLM, llama.cpp, Jan.ai, or any OpenAI-compatible endpoint
53
+ - **User-hosted**: You provide your own AI endpoint - no cloud API keys required
54
+ - **Custom prompts**: Configure prompts per notification type for full control over AI personality
55
+ - **Smart fallback**: Automatically falls back to static messages if AI is unavailable
56
+
57
+ ### System Integration
58
+ - **Native Desktop Notifications**: Windows (Toast), macOS (Notification Center), and Linux (notify-send) support
59
+ - **Native Edge TTS**: No external dependencies (Python/pip) required
60
+ - **Focus Detection** (macOS): Suppresses notifications when terminal is focused
61
+ - **Webhook Integration**: Receive notifications on Discord or any custom webhook endpoint when tasks finish or need attention
62
+ - **Themed Sound Packs**: Use custom sound collections (e.g., Warcraft, StarCraft) by simply pointing to a directory
63
+ - **Per-Project Sounds**: Assign unique sounds to different projects for easy identification
64
+ - **Wake monitor** from sleep before notifying
65
+ - **Auto-boost volume** if too low
66
+ - **TUI toast** notifications
67
+
68
+ ## Installation
69
+
70
+ ### Option 1: From npm/Bun (Recommended)
71
+
72
+ Add to your OpenCode config file (`~/.config/opencode/opencode.json`):
73
+
74
+ ```json
75
+ {
76
+ "$schema": "https://opencode.ai/config.json",
77
+ "plugin": ["opencode-smart-voice-notify@latest"]
78
+ }
79
+ ```
80
+
81
+ > **Note**: OpenCode will automatically install the plugin using your system's package manager (npm or bun).
82
+
83
+ ### Option 2: From GitHub
84
+
85
+ ```json
86
+ {
87
+ "$schema": "https://opencode.ai/config.json",
88
+ "plugin": ["github:MasuRii/opencode-smart-voice-notify"]
89
+ }
90
+ ```
91
+
92
+ ### Option 3: Local Development
93
+
94
+ 1. Clone the repository:
95
+ ```bash
96
+ git clone https://github.com/MasuRii/opencode-smart-voice-notify.git
97
+ ```
98
+
99
+ 2. Reference the local path in your config:
100
+ ```json
101
+ {
102
+ "plugin": ["file:///path/to/opencode-smart-voice-notify"]
103
+ }
104
+ ```
105
+
106
+ ## Configuration
107
+
108
+ ### Automatic Setup
109
+
110
+ When you first run OpenCode with this plugin installed, it will **automatically create**:
111
+
112
+ 1. **`~/.config/opencode/smart-voice-notify.jsonc`** - A comprehensive configuration file with all available options fully documented.
113
+ 2. **`~/.config/opencode/assets/*.mp3`** - Bundled notification sound files.
114
+ 3. **`~/.config/opencode/logs/`** - Debug log folder (created when debug logging is enabled).
115
+
116
+ The auto-generated configuration includes all advanced settings, message arrays, and engine options, so you don't have to refer back to the documentation for available settings.
117
+
118
+ ### Manual Configuration
119
+
120
+ If you prefer to create the config manually, add a `smart-voice-notify.jsonc` file in your OpenCode config directory (`~/.config/opencode/`):
121
+
122
+ ```jsonc
123
+ {
124
+ // Master switch to enable/disable the plugin without uninstalling
125
+ "enabled": true,
126
+
127
+ // Notification mode: 'sound-first', 'tts-first', 'both', 'sound-only'
128
+ "notificationMode": "sound-first",
129
+
130
+ // TTS engine: 'openai', 'elevenlabs', 'edge', 'sapi'
131
+ "ttsEngine": "openai",
132
+ "enableTTS": true,
133
+
134
+ // ElevenLabs settings (get API key from https://elevenlabs.io/app/settings/api-keys)
135
+ "elevenLabsApiKey": "YOUR_API_KEY_HERE",
136
+ "elevenLabsVoiceId": "cgSgspJ2msm6clMCkdW9", // Jessica - Playful, Bright
137
+
138
+ // OpenAI-compatible TTS (Kokoro, LocalAI, OpenAI, Coqui, AllTalk, etc.)
139
+ "openaiTtsEndpoint": "http://localhost:8880",
140
+ "openaiTtsVoice": "af_heart",
141
+ "openaiTtsModel": "kokoro",
142
+
143
+ // Edge TTS settings (free, no API key required)
144
+ "edgeVoice": "en-US-AnaNeural",
145
+ "edgePitch": "+50Hz",
146
+ "edgeRate": "+10%",
147
+
148
+ // Desktop Notifications
149
+ "enableDesktopNotification": true,
150
+ "desktopNotificationTimeout": 5,
151
+ "showProjectInNotification": true,
152
+
153
+ // TTS reminder settings
154
+ "enableTTSReminder": true,
155
+ "ttsReminderDelaySeconds": 30,
156
+ "enableFollowUpReminders": true,
157
+
158
+ // Focus Detection (macOS only)
159
+ "suppressWhenFocused": true,
160
+ "alwaysNotify": false,
161
+
162
+ // AI-generated messages (optional - requires local AI server)
163
+ "enableAIMessages": false,
164
+ "aiEndpoint": "http://localhost:11434/v1",
165
+
166
+ // Webhook settings (optional - works with Discord)
167
+ "enableWebhook": false,
168
+ "webhookUrl": "",
169
+ "webhookUsername": "OpenCode Notify",
170
+
171
+ // Sound theme settings (optional)
172
+ "soundThemeDir": "", // Path to custom sound theme directory
173
+
174
+ // Per-project sounds
175
+ "perProjectSounds": false,
176
+ "projectSoundSeed": 0,
177
+
178
+ // General settings
179
+ "wakeMonitor": true,
180
+ "forceVolume": false,
181
+ "volumeThreshold": 50,
182
+ "enableToast": true,
183
+ "enableSound": true,
184
+ "debugLog": false
185
+ }
186
+ ```
187
+
188
+ For the complete configuration with all TTS engine settings, message arrays, AI prompts, and advanced options, see [`example.config.jsonc`](./example.config.jsonc) in the plugin directory.
189
+
190
+ ### OpenAI-Compatible TTS Setup (Kokoro, LocalAI, OpenAI API, etc.)
191
+
192
+ For cloud-based or self-hosted TTS using any OpenAI-compatible `/v1/audio/speech` endpoint:
193
+
194
+ ```jsonc
195
+ {
196
+ "ttsEngine": "openai",
197
+ "openaiTtsEndpoint": "http://192.168.86.43:8880", // Your TTS server
198
+ "openaiTtsVoice": "af_heart", // Server-dependent
199
+ "openaiTtsModel": "kokoro", // Server-dependent
200
+ "openaiTtsApiKey": "", // Optional, if server requires auth
201
+ "openaiTtsSpeed": 1.0 // 0.25 to 4.0
202
+ }
203
+ ```
204
+
205
+ **Supported OpenAI-Compatible TTS Servers:**
206
+ | Server | Example Endpoint | Voices |
207
+ |--------|------------------|--------|
208
+ | Kokoro | `http://localhost:8880` | `af_heart`, `af_bella`, `am_adam`, etc. |
209
+ | LocalAI | `http://localhost:8080` | Model-dependent |
210
+ | AllTalk | `http://localhost:7851` | Model-dependent |
211
+ | OpenAI | `https://api.openai.com` | `alloy`, `echo`, `fable`, `onyx`, `nova`, `shimmer` |
212
+ | Coqui | `http://localhost:5002` | Model-dependent |
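For orientation, a request to an OpenAI-compatible speech endpoint might be assembled as in the sketch below. It assumes the plugin appends `/v1/audio/speech` to `openaiTtsEndpoint` and sends the standard `model`, `voice`, `input`, and `speed` fields; `buildSpeechRequest` and the two interfaces are hypothetical names, and the request is only built here, not sent.

```typescript
// Field names mirror the config keys shown above; the types themselves are illustrative.
interface TtsConfig {
  openaiTtsEndpoint: string;
  openaiTtsVoice: string;
  openaiTtsModel: string;
  openaiTtsApiKey?: string;
  openaiTtsSpeed?: number;
}

interface SpeechRequest {
  url: string;
  headers: { [name: string]: string };
  body: string;
}

// Hypothetical helper that assembles the HTTP request without sending it.
function buildSpeechRequest(config: TtsConfig, text: string): SpeechRequest {
  const headers: { [name: string]: string } = { "Content-Type": "application/json" };
  // Only send an Authorization header when a key is configured.
  if (config.openaiTtsApiKey) {
    headers["Authorization"] = "Bearer " + config.openaiTtsApiKey;
  }
  // Clamp speed to the documented 0.25 to 4.0 range.
  const speed = Math.min(4.0, Math.max(0.25, config.openaiTtsSpeed ?? 1.0));
  return {
    // Strip trailing slashes so the path is not doubled.
    url: config.openaiTtsEndpoint.replace(/\/+$/, "") + "/v1/audio/speech",
    headers,
    body: JSON.stringify({
      model: config.openaiTtsModel,
      voice: config.openaiTtsVoice,
      input: text,
      speed,
    }),
  };
}
```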
213
+
214
+ ### AI Message Generation (Optional)
215
+
216
+ If you want dynamic, AI-generated notification messages instead of preset ones, you can connect to a local AI server:
217
+
218
+ 1. **Install a local AI server** (e.g., [Ollama](https://ollama.ai)):
219
+ ```bash
220
+ # Install Ollama and pull a model
221
+ ollama pull llama3
222
+ ```
223
+
224
+ 2. **Enable AI messages in your config**:
225
+ ```jsonc
226
+ {
227
+ "enableAIMessages": true,
228
+ "aiEndpoint": "http://localhost:11434/v1",
229
+ "aiModel": "llama3",
230
+ "aiApiKey": "",
231
+ "aiFallbackToStatic": true,
232
+ "enableContextAwareAI": false // Set to true for personalized messages with project/task context
233
+ }
234
+ ```
235
+
236
+ 3. **The AI will generate unique messages** for each notification, which are then spoken by your TTS engine.
237
+
238
+ 4. **Context-Aware Messages** (optional): Enable `enableContextAwareAI` for personalized notifications that include project name, task title, and change summary (e.g., "Your work on MyProject is complete!").
239
+
240
+ **Supported AI Servers:**
241
+ | Server | Default Endpoint | API Key |
242
+ |--------|-----------------|---------|
243
+ | Ollama | `http://localhost:11434/v1` | Not needed |
244
+ | LM Studio | `http://localhost:1234/v1` | Not needed |
245
+ | LocalAI | `http://localhost:8080/v1` | Not needed |
246
+ | vLLM | `http://localhost:8000/v1` | Use "EMPTY" |
247
+ | Jan.ai | `http://localhost:1337/v1` | Required |
248
+
249
+ ### Discord / Webhook Integration (Optional)
250
+
251
+ Receive remote notifications on Discord or any custom endpoint. This is perfect for long-running tasks when you're away from your computer.
252
+
253
+ 1. **Create a Discord Webhook**:
254
+ - In Discord, go to **Server Settings** > **Integrations** > **Webhooks**.
255
+ - Click **New Webhook**, choose a channel, and click **Copy Webhook URL**.
256
+
257
+ 2. **Enable Webhooks in your config**:
258
+ ```jsonc
259
+ {
260
+ "enableWebhook": true,
261
+ "webhookUrl": "https://discord.com/api/webhooks/...",
262
+ "webhookUsername": "OpenCode Notify",
263
+ "webhookEvents": ["idle", "permission", "error", "question"],
264
+ "webhookMentionOnPermission": true
265
+ }
266
+ ```
267
+
268
+ 3. **Features**:
269
+ - **Color-coded Embeds**: Different colors for task completion (green), permissions (orange), errors (red), and questions (blue).
270
+ - **Smart Mentions**: Automatically mentions `@everyone` on Discord for urgent permission requests (controlled by `webhookMentionOnPermission`).
271
+ - **Rate Limiting**: Intelligent retry logic with backoff if Discord's rate limits are hit.
272
+ - **Fire-and-forget**: Webhook requests never block local sound or TTS playback.
273
+
274
+ **Supported Webhook Events:**
275
+ | Event | Trigger |
276
+ |-------|---------|
277
+ | `idle` | Agent finished working |
278
+ | `permission` | Agent needs permission for a tool |
279
+ | `error` | Agent encountered an error |
280
+ | `question` | Agent is asking you a question |
281
+
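To make the embed and mention behavior more concrete, the sketch below builds a Discord webhook body for one event. The `username`, `content`, and `embeds` fields are part of Discord's webhook format; the color values, the embed title, and the `buildDiscordPayload` name are assumptions chosen for illustration, not the plugin's real implementation.

```typescript
type WebhookEvent = "idle" | "permission" | "error" | "question";

// Illustrative colors (24-bit RGB integers); the plugin's actual palette may differ.
const EMBED_COLORS: Record<WebhookEvent, number> = {
  idle: 0x2ecc71,       // green: task completion
  permission: 0xe67e22, // orange: permission request
  error: 0xe74c3c,      // red: error
  question: 0x3498db,   // blue: question
};

interface WebhookConfig {
  webhookUsername: string;
  webhookEvents: WebhookEvent[];
  webhookMentionOnPermission: boolean;
}

// Returns the JSON body to POST, or null when the event is not enabled in webhookEvents.
function buildDiscordPayload(config: WebhookConfig, event: WebhookEvent, message: string) {
  if (!config.webhookEvents.includes(event)) return null;
  const mention = event === "permission" && config.webhookMentionOnPermission;
  return {
    username: config.webhookUsername,
    // JSON.stringify drops undefined values, so no content is sent without a mention.
    content: mention ? "@everyone" : undefined,
    embeds: [
      {
        title: `OpenCode: ${event}`, // hypothetical title format
        description: message,
        color: EMBED_COLORS[event],
      },
    ],
  };
}
```

Filtering on `webhookEvents` first means an event that is not listed produces no request at all.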
282
+
283
+ ### Custom Sound Themes (Optional)
284
+
285
+ Instead of configuring individual sound files, you can point the plugin at an entire "Sound Theme" (like the classic Warcraft II or StarCraft sound packs).
286
+
287
+ 1. **Set up your theme directory**:
288
+ Create a folder (e.g., `~/.config/opencode/themes/warcraft2/`) with the following structure:
289
+ ```text
290
+ warcraft2/
291
+ ├── idle/ # Sounds for when the agent finishes
292
+ │ ├── job_done.mp3
293
+ │ └── alright.wav
294
+ ├── permission/ # Sounds for permission requests
295
+ │ ├── help.mp3
296
+ │ └── need_orders.wav
297
+ ├── error/ # Sounds for agent errors
298
+ │ └── alert.mp3
299
+ └── question/ # Sounds for agent questions
300
+ └── yes_milord.mp3
301
+ ```
302
+
303
+ 2. **Configure the theme in your config**:
304
+ ```jsonc
305
+ {
306
+ "soundThemeDir": "themes/warcraft2",
307
+ "randomizeSoundFromTheme": true
308
+ }
309
+ ```
310
+
311
+ 3. **Features**:
312
+ - **Automatic Fallback**: If a theme subdirectory or sound is missing, the plugin automatically falls back to your default sound files.
313
+ - **Randomization**: If multiple sounds are in a subdirectory, the plugin will pick one at random each time (if `randomizeSoundFromTheme` is `true`).
314
+ - **Relative Paths**: Paths are relative to your OpenCode config directory (`~/.config/opencode/`).
315
+
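The fallback and randomization rules above could be sketched roughly as follows. This is an assumption-laden illustration: `chooseSound` and `resolveThemeSound` are hypothetical names, only `.mp3` and `.wav` files are considered because those are the extensions shown in the example tree, and picking the first file alphabetically when randomization is off is a guess rather than documented behavior.

```typescript
import { existsSync, readdirSync } from "node:fs";
import { join } from "node:path";

// Extensions shown in the example theme; other formats are not assumed here.
const AUDIO_EXT = /\.(mp3|wav)$/i;

// Pick one sound from a list of files, or return the fallback if none qualify.
function chooseSound(
  files: string[],
  randomize: boolean,
  fallback: string,
  rng: () => number = Math.random,
): string {
  const candidates = files.filter((f) => AUDIO_EXT.test(f)).sort();
  if (candidates.length === 0) return fallback;
  // Assumption: without randomization, use the first file alphabetically.
  const index = randomize ? Math.floor(rng() * candidates.length) : 0;
  return candidates[index];
}

// Look up the theme subdirectory for an event ("idle", "permission", ...).
function resolveThemeSound(
  themeDir: string,
  event: string,
  randomize: boolean,
  fallback: string,
): string {
  const dir = join(themeDir, event);
  if (!existsSync(dir)) return fallback; // missing subdirectory: use the default sound
  const files = readdirSync(dir).map((name) => join(dir, name));
  return chooseSound(files, randomize, fallback);
}
```

Passing the random source as a parameter keeps the selection logic deterministic in tests.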
316
+
317
+ ## Requirements
318
+
319
+ ### Platform Support Matrix
320
+
321
+ | Feature | Windows | macOS | Linux |
322
+ |---------|:---:|:---:|:---:|
323
+ | **Sound Playback** | ✅ | ✅ | ✅ |
324
+ | **TTS (Cloud/Edge)** | ✅ | ✅ | ✅ |
325
+ | **TTS (Windows SAPI)** | ✅ | ❌ | ❌ |
326
+ | **TTS (macOS Say)** | ❌ | ✅ | ❌ |
327
+ | **Desktop Notifications** | ✅ | ✅ | ✅ (req libnotify) |
328
+ | **Focus Detection** | ❌ | ✅ | ❌ |
329
+ | **Webhook Integration** | ✅ | ✅ | ✅ |
330
+ | **Wake Monitor** | ✅ | ✅ | ✅ (X11/Gnome) |
331
+ | **Volume Control** | ✅ | ✅ | ✅ (Pulse/ALSA) |
332
+
333
+ ### For OpenAI-Compatible TTS
334
+ - Any server implementing the `/v1/audio/speech` endpoint
335
+ - Examples: [Kokoro](https://github.com/remsky/Kokoro-FastAPI), [LocalAI](https://localai.io), [AllTalk](https://github.com/erew123/alltalk_tts), OpenAI API, etc.
336
+ - Works with both local self-hosted servers and cloud-based providers.
337
+
338
+ ### For ElevenLabs TTS
339
+ - ElevenLabs API key (free tier: 10,000 characters/month)
340
+ - Internet connection
341
+
342
+ ### For Edge TTS
343
+ - Internet connection required
344
+ - **Recommended**: Install Python edge-tts for best reliability: `pip install edge-tts`
345
+ - **Fallback**: Works without Python (uses bundled npm package), but may be less reliable
346
+ - If Edge TTS fails, automatically falls back to SAPI (Windows) or Say (macOS)
347
+
348
+ ### For Windows SAPI
349
+ - Windows OS (uses built-in System.Speech)
350
+
351
+ ### For macOS Say
352
+ - macOS (uses built-in `say` command)
353
+ - Serves as fallback when other TTS engines fail
354
+
355
+ ### For Desktop Notifications
356
+ - **Windows**: Built-in (uses Toast notifications)
357
+ - **macOS**: Built-in (uses Notification Center)
358
+ - **Linux**: Requires `notify-send` (libnotify)
359
+ ```bash
360
+ # Ubuntu/Debian
361
+ sudo apt install libnotify-bin
362
+
363
+ # Fedora
364
+ sudo dnf install libnotify
365
+
366
+ # Arch Linux
367
+ sudo pacman -S libnotify
368
+ ```
369
+
370
+ ### For Sound Playback
371
+ - **Windows**: Built-in (uses Windows Media Player)
372
+ - **macOS**: Built-in (`afplay`)
373
+ - **Linux**: `paplay` or `aplay`
374
+
375
+ ### For Focus Detection
376
+ Focus detection suppresses sound and desktop notifications when the terminal is focused.
377
+
378
+ | Platform | Support | Notes |
379
+ |----------|---------|-------|
380
+ | **macOS** | ✅ Full support | Uses AppleScript to detect frontmost application |
381
+ | **Windows** | ❌ Not supported | No reliable API available |
382
+ | **Linux** | ❌ Not supported | Varies by desktop environment |
383
+
384
+ > **Note**: On unsupported platforms, notifications are always sent (fail-open behavior). TTS reminders are never suppressed, even when focused, since users may step away after seeing the toast.
385
+
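The fail-open rule described in the note can be expressed as a small decision function. This is only a sketch: the `Channel` type and `shouldSuppress` name are hypothetical, and the platform strings follow Node's `process.platform` values (`"darwin"`, `"win32"`, `"linux"`).

```typescript
type Channel = "sound" | "desktop" | "tts-reminder";

// Decide whether a notification channel should be silenced.
function shouldSuppress(platform: string, terminalFocused: boolean, channel: Channel): boolean {
  // Focus detection is only available on macOS; elsewhere fail open and always notify.
  if (platform !== "darwin") return false;
  // TTS reminders are never suppressed, even when the terminal is focused.
  if (channel === "tts-reminder") return false;
  return terminalFocused;
}
```

For example, a focused terminal on macOS would mute the sound and the toast, while the later TTS reminder would still be spoken.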
386
+ ### For Webhook Notifications
387
+ - **Discord**: Full support for Discord's webhook embed format.
388
+ - **Generic**: Works with any endpoint that accepts a POST request with a JSON body (though formatting is optimized for Discord).
389
+ - **Rate Limits**: The plugin handles HTTP 429 (Too Many Requests) automatically with retries and a 250ms queue delay.
390
+
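One way the HTTP 429 handling could work is sketched below. It assumes the server sends a `Retry-After` header in seconds (as Discord does) and falls back to exponential backoff otherwise; using 250 ms as the backoff base, the retry count, and the names `retryDelayMs` and `postWithRetry` are all assumptions, not the plugin's documented behavior.

```typescript
// Compute how long to wait before the next attempt, in milliseconds.
function retryDelayMs(retryAfter: string | null, attempt: number, baseMs = 250): number {
  if (retryAfter !== null && retryAfter.trim() !== "") {
    const seconds = Number(retryAfter);
    // Honor the server's hint when it is a valid non-negative number of seconds.
    if (Number.isFinite(seconds) && seconds >= 0) return Math.ceil(seconds * 1000);
  }
  // No usable hint: exponential backoff (250 ms, 500 ms, 1000 ms, ...).
  return baseMs * 2 ** attempt;
}

// POST a JSON payload, retrying only on HTTP 429. Resolves to true on success.
async function postWithRetry(url: string, payload: unknown, maxRetries = 3): Promise<boolean> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const res = await fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
    });
    if (res.ok) return true;
    if (res.status !== 429 || attempt === maxRetries) return false;
    const delay = retryDelayMs(res.headers.get("Retry-After"), attempt);
    await new Promise((resolve) => setTimeout(resolve, delay));
  }
  return false;
}
```

A caller that wants fire-and-forget behavior could invoke `void postWithRetry(url, payload).catch(() => {})` so that a slow or failing webhook never delays local sound or TTS playback.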
391
+ ## Events Handled
392
+
393
+ | Event | Action |
394
+ |-------|--------|
395
+ | `session.idle` | Agent finished working - notify user |
396
+ | `session.error` | Agent encountered an error - alert user |
397
+ | `permission.asked` | Permission request (SDK v1.1.1+) - alert user |
398
+ | `permission.updated` | Permission request (SDK v1.0.x) - alert user |
399
+ | `permission.replied` | User responded - cancel pending reminders |
400
+ | `question.asked` | Agent asks question (SDK v1.1.7+) - notify user |
401
+ | `question.replied` | User answered question - cancel pending reminders |
402
+ | `question.rejected` | User dismissed question - cancel pending reminders |
403
+ | `message.updated` | New user message - cancel pending reminders |
404
+ | `session.created` | New session - reset state |
405
+
406
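+ > **Note**: For backward compatibility, the plugin handles both the legacy `permission.updated` event from OpenCode SDK v1.0.x and the newer events introduced in v1.1.1+ (`permission.asked`) and v1.1.7+ (`question.asked`).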
407
+
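The event table above amounts to a lookup from event type to one of a few actions. A minimal sketch of such a dispatcher follows; the `Action` labels and the `actionFor` function are hypothetical, and the real plugin additionally checks that `message.updated` comes from a user message, which is omitted here.

```typescript
type Action = "notify" | "cancel-reminders" | "reset-state" | "ignore";

// Event names are taken from the table; the action labels are illustrative.
const EVENT_ACTIONS = new Map<string, Action>([
  ["session.idle", "notify"],
  ["session.error", "notify"],
  ["permission.asked", "notify"],   // SDK v1.1.1+
  ["permission.updated", "notify"], // SDK v1.0.x
  ["question.asked", "notify"],     // SDK v1.1.7+
  ["permission.replied", "cancel-reminders"],
  ["question.replied", "cancel-reminders"],
  ["question.rejected", "cancel-reminders"],
  ["message.updated", "cancel-reminders"], // simplified: assumes a new user message
  ["session.created", "reset-state"],
]);

// Unknown events are ignored rather than treated as errors.
function actionFor(eventType: string): Action {
  return EVENT_ACTIONS.get(eventType) ?? "ignore";
}
```

Listing both `permission.asked` and `permission.updated` is what lets one handler serve old and new SDK versions.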
408
+ ## Development
409
+
410
+ To develop on this plugin locally:
411
+
412
+ 1. Clone the repository:
413
+ ```bash
414
+ git clone https://github.com/MasuRii/opencode-smart-voice-notify.git
415
+ cd opencode-smart-voice-notify
416
+ ```
417
+
418
+ 2. Install dependencies:
419
+ ```bash
420
+ # Using Bun (recommended)
421
+ bun install
422
+
423
+ # Or using npm
424
+ npm install
425
+ ```
426
+
427
+ 3. Link to your OpenCode config:
428
+ ```json
429
+ {
430
+ "plugin": ["file:///absolute/path/to/opencode-smart-voice-notify"]
431
+ }
432
+ ```
433
+
434
+ ### Testing
435
+
436
+ The plugin uses [Bun](https://bun.sh)'s built-in test runner for unit and E2E tests.
437
+
438
+ ```bash
439
+ # Run all tests
440
+ bun test
441
+
442
+ # Run tests with coverage
443
+ bun test --coverage
444
+
445
+ # Run tests in watch mode
446
+ bun test --watch
447
+ ```
448
+
449
+ For more detailed testing guidelines and mock usage examples, see [CONTRIBUTING.md](./CONTRIBUTING.md).
450
+
451
+ ## Updating
452
+
453
+ OpenCode does not automatically update plugins. To update to the latest version:
454
+
455
+ ```bash
456
+ # Clear the cached plugin
457
+ rm -rf ~/.cache/opencode/node_modules/opencode-smart-voice-notify
458
+
459
+ # Run OpenCode to trigger a fresh install
460
+ opencode
461
+ ```
462
+
463
+ <!--
464
+ LLM-AGENT-INSTRUCTIONS (Hidden from rendered markdown, visible to AI assistants scanning raw file)
465
+
466
+ ## LLM Instructions
467
+
468
+ > This section provides structured guidance for AI assistants (Claude, GPT, Cursor, Copilot, etc.) helping users set up and configure this plugin.
469
+
470
+ ### Quick Setup (Execute These Steps)
471
+
472
+ When a user asks you to set up this plugin, follow these steps:
473
+
474
+ 1. **Add plugin to OpenCode config** - Edit `~/.config/opencode/opencode.json`:
475
+ ```json
476
+ {
477
+ "$schema": "https://opencode.ai/config.json",
478
+ "plugin": ["opencode-smart-voice-notify@latest"]
479
+ }
480
+ ```
481
+
482
+ 2. **Run OpenCode once** - This auto-generates the configuration file at `~/.config/opencode/smart-voice-notify.jsonc`
483
+
484
+ 3. **Configure based on user's TTS preference**:
485
+ - **Free/No API key**: Use Edge TTS (default, works out of box)
486
+ - **High quality**: Use ElevenLabs (requires API key)
487
+ - **Self-hosted**: Use OpenAI-compatible endpoint (Kokoro, LocalAI, etc.)
488
+ - **Offline Windows**: Use Windows SAPI
489
+ - **Offline macOS**: Use macOS Say (built-in fallback)
490
+
491
+ ### Configuration Tasks
492
+
493
+ #### Task: Enable ElevenLabs TTS
494
+ ```jsonc
495
+ // In ~/.config/opencode/smart-voice-notify.jsonc
496
+ {
497
+ "ttsEngine": "elevenlabs",
498
+ "elevenLabsApiKey": "USER_API_KEY_HERE",
499
+ "elevenLabsVoiceId": "cgSgspJ2msm6clMCkdW9"
500
+ }
501
+ ```
502
+ **Note**: User must get API key from https://elevenlabs.io/app/settings/api-keys
503
+
504
+ #### Task: Enable OpenAI-Compatible TTS (Kokoro, LocalAI, etc.)
505
+ ```jsonc
506
+ {
507
+ "ttsEngine": "openai",
508
+ "openaiTtsEndpoint": "http://localhost:8880",
509
+ "openaiTtsVoice": "af_heart",
510
+ "openaiTtsModel": "kokoro"
511
+ }
512
+ ```
513
+ **Note**: User must have a compatible TTS server running locally or remotely.
514
+
515
+ #### Task: Enable Discord Webhook Notifications
516
+ ```jsonc
517
+ {
518
+ "enableWebhook": true,
519
+ "webhookUrl": "https://discord.com/api/webhooks/...",
520
+ "webhookEvents": ["idle", "permission", "error", "question"],
521
+ "webhookMentionOnPermission": true
522
+ }
523
+ ```
524
+
525
+ #### Task: Enable AI-Generated Messages (requires local AI server)
526
+ ```jsonc
527
+ {
528
+ "enableAIMessages": true,
529
+ "aiEndpoint": "http://localhost:11434/v1",
530
+ "aiModel": "llama3"
531
+ }
532
+ ```
533
+ **Note**: Requires Ollama, LM Studio, or similar running locally.
534
+
535
+ #### Task: Set Up Custom Sound Theme
536
+ ```jsonc
537
+ {
538
+ "soundThemeDir": "themes/warcraft2",
539
+ "randomizeSoundFromTheme": true
540
+ }
541
+ ```
542
+ **Directory structure**: `~/.config/opencode/themes/warcraft2/{idle,permission,error,question}/*.{mp3,wav}`
543
+
544
+ ### Common Issues & Fixes
545
+
546
+ | Issue | Solution |
547
+ |-------|----------|
548
+ | No sound playing | Check `"enableSound": true` and verify audio files exist in `~/.config/opencode/assets/` |
549
+ | TTS not working | Try fallback: change `"ttsEngine"` to `"edge"` (free, no API key) |
550
+ | ElevenLabs quota exceeded | Plugin auto-falls back to Edge TTS; or set `"ttsEngine": "edge"` |
551
+ | Notifications too frequent | Increase `"ttsReminderDelaySeconds"` (default: 30) |
552
+ | Want to disable completely | Set `"enabled": false` in config |
553
+ | Edge TTS not speaking | Install Python edge-tts: `pip install edge-tts` |
554
+
555
+ ### Platform-Specific Notes
556
+
557
+ - **Windows**: All features supported, SAPI as offline fallback
558
+ - **macOS**: Focus detection available, `say` command as offline fallback
559
+ - **Linux**: Requires `libnotify-bin` for desktop notifications, no offline TTS fallback
560
+
561
+ ### TTS Fallback Chain
562
+
563
+ Primary Engine → Edge TTS → Windows SAPI → macOS Say → Sound File
564
+
565
+ ### Done When
566
+
567
+ The plugin is successfully configured when:
568
+ - [ ] `~/.config/opencode/opencode.json` contains the plugin entry
569
+ - [ ] `~/.config/opencode/smart-voice-notify.jsonc` exists with user's settings
570
+ - [ ] Running `opencode` produces audio notification when agent finishes a task
571
+
572
+ END LLM-AGENT-INSTRUCTIONS
573
+ -->
574
+
575
+ ## License
576
+
577
+ MIT
578
+
579
+ ## Support
580
+
581
+ - Open an issue on [GitHub](https://github.com/MasuRii/opencode-smart-voice-notify/issues)
582
+ - Check the [OpenCode docs](https://opencode.ai/docs/plugins)
583
+
584
+ <!-- Dynamic Header -->
585
+ <img width="100%" src="https://capsule-render.vercel.app/api?type=waving&color=0:667eea,100:764ba2&height=120&section=header"/>