agentvibes 3.3.0 β†’ 3.4.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -11,7 +11,7 @@
11
11
  [![Publish](https://github.com/paulpreibisch/AgentVibes/actions/workflows/publish.yml/badge.svg)](https://github.com/paulpreibisch/AgentVibes/actions/workflows/publish.yml)
12
12
  [![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
13
13
 
14
- **Author**: Paul Preibisch ([@997Fire](https://x.com/997Fire)) | **Version**: v3.3.0
14
+ **Author**: Paul Preibisch ([@997Fire](https://x.com/997Fire)) | **Version**: v3.4.0
15
15
 
16
16
  ---
17
17
 
@@ -39,9 +39,15 @@ Whether you're coding in Claude Code, chatting in Claude Desktop, using Warp Ter
39
39
 
40
40
  ### 🎯 Key Features
41
41
 
42
+ **⚑ NEW IN v3.4.0:**
43
+ - 🎀 **Soprano TTS Provider** - Ultra-fast neural TTS with 20x CPU, 2000x GPU acceleration (thanks [@nathanchase](https://github.com/nathanchase)!)
44
+ - πŸ›‘οΈ **Security Hardening** - 9.5/10 score with comprehensive validation and timeouts
45
+ - 🌐 **Environment Intelligence** - PulseAudio tunnel auto-detection for SSH scenarios
46
+ - 🎯 **Smart Recommendations** - GPU/RAM-based provider suggestions in installer
47
+
42
48
  **✨ NEW IN v3.3.0:**
43
- - πŸ“± **AgentVibes Receiver - NEW!** - Stream TTS from voiceless servers to your phone, laptop, or local machine via encrypted SSH tunnel
44
- - 🌐 **Voiceless Server Support - NEW!** - Generate TTS on cloud servers (AWS, GCP, Azure) and play on any device with speakers
49
+ - πŸ“± **AgentVibes Receiver** - Stream TTS from voiceless servers to your phone, laptop, or local machine via encrypted SSH tunnel
50
+ - 🌐 **Voiceless Server Support** - Generate TTS on cloud servers (AWS, GCP, Azure) and play on any device with speakers
45
51
 
46
52
  **⚑ Core Features:**
47
53
  - ⚑ **One-Command Install** - Get started in 30 seconds (`npx agentvibes install`)
@@ -88,7 +94,7 @@ All 50+ Piper voices AgentVibes provides are sourced from Hugging Face's open-so
88
94
  - [πŸ“± Android/Termux](#-quick-setup-android--termux-claude-code-on-your-phone) - Run Claude Code on your phone
89
95
  - [πŸ“‹ Prerequisites](#-prerequisites) - What you actually need (Node.js + optional tools)
90
96
  - [✨ What is AgentVibes?](#-what-is-agentvibes) - Overview & key features
91
- - [πŸ“° Latest Release](#-latest-release) - v3.3.0 - Remote Audio Revolution
97
+ - [πŸ“° Latest Release](#-latest-release) - v3.4.0 - Soprano TTS & Security Hardening
92
98
  - [πŸͺŸ Windows Setup Guide for Claude Desktop](mcp-server/WINDOWS_SETUP.md) - Complete Windows installation with WSL & Python
93
99
 
94
100
  ### AgentVibes MCP (Natural Language Control)
@@ -132,24 +138,25 @@ All 50+ Piper voices AgentVibes provides are sourced from Hugging Face's open-so
132
138
 
133
139
  ## πŸ“° Latest Release
134
140
 
135
- **[Remote Audio Revolution: Voiceless Servers β†’ Phone & Mobile Playback](https://github.com/paulpreibisch/AgentVibes/releases/tag/v3.3.0)** πŸ“±πŸ”Š
141
+ **[v3.4.0 - Soprano TTS, Security Hardening & Environment Intelligence](https://github.com/paulpreibisch/AgentVibes/releases/tag/v3.4.0)** βš‘πŸ›‘οΈ
136
142
 
137
- AgentVibes v3.3.0 brings breakthrough remote audio capabilities, turning your messaging apps into voice-enabled AI assistants! **Use Case:** Install [OpenClaw](https://openclaw.ai/) on a remote server, message it via Telegram or WhatsApp from anywhere, and AgentVibes (running in Termux on your phone) plays the TTS responses through your phone speakersβ€”making it work like Siri, but powered by AgentVibes! This release enables voiceless servers to play audio remotely on phones, mobile devices, or any machine via SSH/PulseAudio tunneling. Also includes audio tracks directory structure fix and comprehensive OpenClaw skill documentation.
143
+ AgentVibes v3.4.0 introduces Soprano TTS - an 80M parameter neural provider offering 20x CPU and 2000x GPU acceleration with sub-1GB memory footprint - plus comprehensive security hardening (timeouts, bounds checking, NaN validation) achieving a 9.5/10 security score, and intelligent environment detection that recognizes PulseAudio tunnels as working audio for remote scenarios. The enhanced installer provides GPU-based provider recommendations and context-aware messaging. Special thanks to [@nathanchase](https://github.com/nathanchase) for contributing the Soprano TTS Provider integration!
138
144
 
139
145
  **Key Highlights:**
140
- - πŸ“± **Voiceless Server Support** - Generate TTS on servers without audio hardware, play on remote devices
141
- - πŸ”Š **Phone/Mobile Playback** - Audio tunnels from cloud servers to your phone or local machine via SSH
142
- - 🌐 **PulseAudio SSH Tunneling** - Automatic audio routing through SSH reverse forwarding (port 14713)
143
- - πŸ€– **Enhanced OpenClaw Integration** - Complete skill documentation with 50+ voices and remote audio setup
144
- - 🎡 **Audio Tracks Fix** - Corrected directory structure (backgrounds β†’ tracks) with proper .npmignore entries
145
- - πŸ“¦ **Package Size Optimization** - 8.3 MB unpacked, 172 files, optimized for npm distribution
146
- - πŸ›‘οΈ **Security & Quality** - Removed sensitive data from git history, SonarCloud compliance
146
+ - ⚑ **Soprano TTS Provider** - Ultra-fast neural TTS with 20x CPU, 2000x GPU acceleration (thanks @nathanchase!)
147
+ - πŸ›‘οΈ **Security Hardening** - 9.5/10 score with timeouts on system commands and comprehensive validation
148
+ - 🌐 **Environment Intelligence** - PulseAudio tunnel auto-detection for SSH + tunnel scenarios
149
+ - 🎯 **Smart Recommendations** - GPU/RAM-based provider suggestions (Soprano for CUDA, macOS Say for Apple)
150
+ - πŸ“± **Provider-Aware Voice Pages** - Soprano shows model specs, auto-selects single voice
151
+ - πŸ§ͺ **260/260 Tests Passing** - Complete suite coverage with all edge cases fixed
152
+ - 🎨 **Better UX** - Context-aware messaging ("PulseAudio Tunnel Detected!" vs "speakers")
147
153
 
148
154
  **Perfect For:**
149
- - Running AgentVibes/OpenClaw on AWS, GCP, Azure, DigitalOcean
150
- - VS Code Remote SSH development with TTS feedback
151
- - Android/Termux with audio playback on phone speakers
152
- - Any headless server β†’ local audio scenario
155
+ - GPU users wanting ultra-fast TTS (2000x real-time with CUDA)
156
+ - Low-RAM systems (<1GB memory footprint with Soprano)
157
+ - SSH sessions with PulseAudio tunnels (auto-detected)
158
+ - Production deployments requiring security hardening
159
+ - Any environment needing intelligent provider selection
153
160
 
154
161
  πŸ’‘ **Tip:** If `npx agentvibes` shows an older version or missing commands, clear your npm cache: `npm cache clean --force && npx agentvibes@latest --help`
155
162
 
@@ -1452,6 +1459,9 @@ Both do the exact same thing - MCP is more convenient, slash commands are more t
1452
1459
  - [Claude Code](https://claude.com/claude-code) - AI coding assistant
1453
1460
  - Licensed under Apache 2.0
1454
1461
 
1462
+ **Contributors:**
1463
+ - 🎀 [@nathanchase](https://github.com/nathanchase) - Soprano TTS Provider integration (PR #95) - Ultra-fast neural TTS with GPU acceleration
1464
+
1455
1465
  **Special Thanks:**
1456
1466
  - πŸ’‘ [Claude Code Hooks Mastery](https://github.com/disler/claude-code-hooks-mastery) by [@disler](https://github.com/disler) - Hooks inspiration
1457
1467
  - πŸ€– [BMAD METHOD](https://github.com/bmad-code-org/BMAD-METHOD) - Multi-agent framework with auto voice switching integration
package/RELEASE_NOTES.md CHANGED
@@ -1,4 +1,202 @@
1
- # AgentVibes Release Notes
1
+ # AgentVibes v3.4.0 Release Notes - DRAFT
2
+
3
+ ## πŸ“¦ v3.4.0 - Soprano TTS, Security Hardening & Environment Intelligence
4
+
5
+ **Release Date:** February 10, 2026
6
+
7
+ ### 🎯 Why v3.4.0?
8
+
9
+ v3.4.0 introduces **Soprano TTS** - an ultra-fast neural TTS provider with GPU acceleration, comprehensive **security hardening** across the codebase, and **intelligent environment detection** that recognizes PulseAudio tunnels for remote audio scenarios.
10
+
11
+ ### πŸš€ Key Highlights
12
+
13
+ #### ⚑ Soprano TTS Provider (NEW!)
14
+ - **80M parameter neural model** with premium female English voice
15
+ - **20x CPU speed** (vs Piper), **2000x GPU speed** with CUDA
16
+ - **3 synthesis modes**: WebUI (Gradio), API (OpenAI-compatible), CLI (fallback)
17
+ - **Auto-detection**: Checks for running Gradio server, falls back gracefully
18
+ - **<1GB memory footprint** - perfect for low-RAM systems
19
+ - **Provider-aware voice management**: Auto-selects single voice, shows model specs
20
+ - **Thanks to [@nathanchase](https://github.com/nathanchase)** for this contribution! ([see acknowledgments](#-acknowledgments))
21
+
22
+ #### πŸ›‘οΈ Security Hardening (9.5/10 Score)
23
+ - **Timeouts on system commands**: Prevents installer hangs (nvidia-smi, sysctl, meminfo)
24
+ - **Bounds checking**: Validates array access before parsing system output
25
+ - **NaN validation**: Prevents crashes from malformed memory/GPU detection
26
+ - **Case-insensitive checks**: PulseAudio tunnel detection handles TCP: and tcp:
27
+ - **Code duplication eliminated**: Extracted PulseAudio helper function (DRY)
28
+
29
+ #### 🌐 Environment Intelligence
30
+ - **PulseAudio tunnel detection**: Recognizes `PULSE_SERVER=tcp:*` as working audio
31
+ - **Context-aware messaging**:
32
+ - "🌐 PulseAudio Tunnel Detected!" for SSH + tunnel setups
33
+ - "πŸ”Š Audio Output Detected!" for local speakers
34
+ - Distinguishes local/tunnel/hybrid configurations
35
+ - **Smart environment classification**:
36
+ - DESKTOP: Local audio OR active PulseAudio tunnel
37
+ - VOICELESS: No audio AND no tunnel
38
+ - PHONE: Termux/Android devices
39
+
40
+ #### 🎀 Installer Enhancements
41
+ - **Provider-aware voice pages**: Soprano shows model specs, Piper shows 50+ voices
42
+ - **Auto-selection logic**: Soprano (1 voice) auto-selects, no manual choice needed
43
+ - **GPU-based recommendations**: "Your GPU will run Soprano 2000x faster!"
44
+ - **RAM-based suggestions**: Low memory systems see "Soprano uses <1GB" message
45
+ - **Better RAM display**: Shows "512MB" instead of "0GB" for sub-1GB systems
46
+
47
+ ### πŸ€– AI Summary
48
+
49
+ AgentVibes v3.4.0 brings Soprano TTS - an 80M parameter neural provider offering 20x CPU and 2000x GPU acceleration with sub-1GB memory footprint - plus comprehensive security hardening (timeouts, bounds checking, NaN validation) and intelligent environment detection that recognizes PulseAudio tunnels as working audio for remote scenarios. The enhanced installer provides context-aware messaging distinguishing local speakers from SSH tunnels, GPU-based provider recommendations (Soprano for CUDA users, macOS Say for Apple, Piper for versatility), and provider-specific voice pages that auto-select Soprano's single voice while showcasing model specifications. This release achieves a 9.5/10 security score through systematic defensive programming, making AgentVibes production-ready for enterprise deployments while expanding TTS provider options for diverse hardware configurations.
50
+
51
+ ---
52
+
53
+ ## ✨ New Features
54
+
55
+ ### Soprano TTS Provider
56
+ - Add Soprano TTS provider script with 3 synthesis modes (WebUI, API, CLI) (#95)
57
+ - Integrate Soprano into TTS router and provider manager
58
+ - Add soprano-gradio-synth.py helper for WebUI/SSE protocol
59
+ - Provider-aware voice selection page with model specifications
60
+ - Auto-select single Soprano voice with performance details
61
+
62
+ ### Installer Intelligence
63
+ - Add `detectSystemCapabilities()` for GPU/RAM detection
64
+ - Add `hasPulseAudioTunnel()` helper function
65
+ - Context-aware audio detection messaging (tunnel vs local)
66
+ - GPU-based provider ordering (Soprano first for CUDA users)
67
+ - RAM-based recommendations (<4GB systems see Soprano first)
68
+ - Provider-specific intro messages (Soprano vs Piper vs macOS)
69
+
70
+ ### Environment Detection
71
+ - PulseAudio tunnel recognition via PULSE_SERVER env var
72
+ - Case-insensitive TCP protocol detection
73
+ - Smart DESKTOP classification (local audio OR tunnel)
74
+ - Improved VOICELESS detection (no audio AND no tunnel)
75
+
76
+ ---
77
+
78
+ ## πŸ› Bug Fixes
79
+
80
+ ### Security Fixes
81
+ - Add 5s timeout to nvidia-smi to prevent GPU detection hangs
82
+ - Add 3s timeout to sysctl/meminfo to prevent memory detection hangs
83
+ - Add bounds checking before parsing sysctl output (macOS)
84
+ - Add bounds checking before parsing /proc/meminfo (Linux)
85
+ - Add NaN validation for parseInt() memory size parsing
86
+ - Fix case sensitivity in PULSE_SERVER detection (handles TCP: and tcp:)
87
+
88
+ ### Test Fixes
89
+ - Fix provider-manager test #90: Add soprano and ssh-remote to cleanup list
90
+ - Ensure zero-provider edge case properly simulates empty state
91
+
92
+ ### User Experience
93
+ - Fix RAM display for <1GB systems (show "512MB" not "0GB")
94
+ - Fix PulseAudio selection triggering wrong setup flow
95
+ - Separate PulseAudio tunnel setup from SSH receiver setup
96
+
97
+ ---
98
+
99
+ ## πŸ—οΈ Improvements
100
+
101
+ ### Code Quality
102
+ - Extract PulseAudio detection to helper function (DRY principle)
103
+ - Implement system capabilities caching (eliminates duplicate calls)
104
+ - Add comprehensive error handling in detectSystemCapabilities()
105
+ - Improve code comments for security-critical sections
106
+
107
+ ### Performance
108
+ - Cache system detection results (prevents duplicate nvidia-smi calls)
109
+ - Add timeouts to prevent indefinite hangs
110
+ - Optimize provider detection with early returns
111
+
112
+ ### Documentation
113
+ - Add comprehensive commit message documenting all changes
114
+ - Document security improvements (timeouts, bounds checking, NaN validation)
115
+ - Explain PulseAudio tunnel detection architecture
116
+ - Detail environment classification logic
117
+
118
+ ---
119
+
120
+ ## πŸ“Š Statistics
121
+
122
+ - **91 commits** since v3.3.0
123
+ - **817 lines added** in merge to master
124
+ - **6 files modified** in core integration
125
+ - **260 tests passing** (213 BATS + 47 Node)
126
+ - **Security score**: 7.5/10 β†’ 9.5/10
127
+ - **Test coverage**: 100% pass rate
128
+
129
+ ---
130
+
131
+ ## πŸ”§ Technical Details
132
+
133
+ ### Files Modified
134
+ - `src/installer.js`: +335 lines (security fixes, environment detection, Soprano integration)
135
+ - `test/unit/provider-manager.bats`: +4 lines (fix edge case test)
136
+ - `.claude/hooks/play-tts-soprano.sh`: +320 lines (new provider)
137
+ - `.claude/hooks/soprano-gradio-synth.py`: +139 lines (new helper)
138
+ - `.claude/hooks/provider-manager.sh`: +17 lines (Soprano support)
139
+ - `.claude/hooks/play-tts.sh`: +6 lines (route to Soprano)
140
+
141
+ ### Breaking Changes
142
+ None - all changes are backward compatible.
143
+
144
+ ### Dependencies
145
+ - **New**: `soprano-tts` (Python package, optional)
146
+ - **Recommended**: CUDA-capable GPU for 2000x speedup (optional)
147
+ - **Compatible**: Works on CPU-only systems (20x vs Piper)
148
+
149
+ ---
150
+
151
+ ## πŸŽ“ Migration Notes
152
+
153
+ ### For New Users
154
+ 1. Run `npx agentvibes install`
155
+ 2. Installer auto-detects your hardware (GPU, RAM, platform)
156
+ 3. Soprano appears as option if you have working audio
157
+ 4. Select Soprano for ultra-fast TTS with GPU acceleration
158
+
159
+ ### For Existing Users
160
+ 1. Update: `npx agentvibes update`
161
+ 2. Switch provider: `/agent-vibes:provider switch soprano`
162
+ 3. Test: `/agent-vibes:sample soprano-default`
163
+ 4. Optionally install soprano-tts: `pip install soprano-tts`
164
+
165
+ ### PulseAudio Tunnel Users
166
+ - Installer now auto-detects your tunnel configuration
167
+ - Shows "🌐 PulseAudio Tunnel Detected!" instead of "speakers"
168
+ - Provides DESKTOP mode options (Soprano, Piper, macOS Say)
169
+ - No manual configuration needed
170
+
171
+ ---
172
+
173
+ ## πŸ™ Acknowledgments
174
+
175
+ ### Special Thanks
176
+
177
+ **πŸŽ‰ [@nathanchase](https://github.com/nathanchase)** - For contributing the Soprano TTS Provider integration (PR #95)! Nathan's work brings ultra-fast neural TTS with GPU acceleration to AgentVibes, offering 20x CPU and 2000x GPU performance improvements. The comprehensive integration includes WebUI, API, and CLI synthesis modes with intelligent auto-detection and graceful fallback. Thank you for this outstanding contribution! πŸš€
178
+
179
+ ### Quality Assurance
180
+
181
+ - **Security Review**: Adversarial code review achieved 9.5/10 score
182
+ - **Testing**: All 260 tests pass (100% suite coverage)
183
+ - **Quality Gates**: All Sonar requirements validated
184
+ - **Co-Authored-By**: Claude Sonnet 4.5
185
+
186
+ ---
187
+
188
+ ## πŸ“š Additional Resources
189
+
190
+ - [Soprano TTS Documentation](https://github.com/paulpreibisch/AgentVibes/blob/master/docs/providers.md#soprano-tts)
191
+ - [PulseAudio Tunnel Setup](https://github.com/paulpreibisch/AgentVibes/blob/master/docs/SSH_REMOTE_SETUP.md)
192
+ - [Security Hardening Guide](https://github.com/paulpreibisch/AgentVibes/blob/master/docs/security-hardening-guide.md)
193
+ - [Provider Comparison](https://github.com/paulpreibisch/AgentVibes/blob/master/docs/providers.md)
194
+
195
+ ---
196
+
197
+ **Full Changelog**: https://github.com/paulpreibisch/AgentVibes/compare/v3.3.0...v3.4.0
198
+
199
+ ---
2
200
 
3
201
  ## πŸ“¦ v3.3.0 - Remote TTS, Smart Installer, OpenClaw Receiver & Cache Management
4
202
 
package/package.json CHANGED
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "$schema": "https://json.schemastore.org/package.json",
3
3
  "name": "agentvibes",
4
- "version": "3.3.0",
4
+ "version": "3.4.1",
5
5
  "description": "Now your AI Agents can finally talk back! Professional TTS voice for Claude Code, Claude Desktop (via MCP), and Clawdbot with multi-provider support.",
6
6
  "homepage": "https://agentvibes.org",
7
7
  "keywords": [