agentvibes 2.1.0 → 2.1.2

Files changed (116)
  1. package/.bmad-core/agent-teams/team-all.yaml +15 -0
  2. package/.bmad-core/agent-teams/team-fullstack.yaml +19 -0
  3. package/.bmad-core/agent-teams/team-ide-minimal.yaml +11 -0
  4. package/.bmad-core/agent-teams/team-no-ui.yaml +14 -0
  5. package/.bmad-core/agents/analyst.md +84 -0
  6. package/.bmad-core/agents/architect.md +85 -0
  7. package/.bmad-core/agents/bmad-master.md +110 -0
  8. package/.bmad-core/agents/bmad-orchestrator.md +147 -0
  9. package/.bmad-core/agents/dev.md +81 -0
  10. package/.bmad-core/agents/pm.md +84 -0
  11. package/.bmad-core/agents/po.md +79 -0
  12. package/.bmad-core/agents/qa.md +87 -0
  13. package/.bmad-core/agents/sm.md +65 -0
  14. package/.bmad-core/agents/ux-expert.md +69 -0
  15. package/.bmad-core/checklists/architect-checklist.md +440 -0
  16. package/.bmad-core/checklists/change-checklist.md +184 -0
  17. package/.bmad-core/checklists/pm-checklist.md +372 -0
  18. package/.bmad-core/checklists/po-master-checklist.md +434 -0
  19. package/.bmad-core/checklists/story-dod-checklist.md +96 -0
  20. package/.bmad-core/checklists/story-draft-checklist.md +155 -0
  21. package/.bmad-core/core-config.yaml +22 -0
  22. package/.bmad-core/data/bmad-kb.md +809 -0
  23. package/.bmad-core/data/brainstorming-techniques.md +38 -0
  24. package/.bmad-core/data/elicitation-methods.md +156 -0
  25. package/.bmad-core/data/technical-preferences.md +5 -0
  26. package/.bmad-core/data/test-levels-framework.md +148 -0
  27. package/.bmad-core/data/test-priorities-matrix.md +174 -0
  28. package/.bmad-core/enhanced-ide-development-workflow.md +248 -0
  29. package/.bmad-core/install-manifest.yaml +230 -0
  30. package/.bmad-core/tasks/advanced-elicitation.md +119 -0
  31. package/.bmad-core/tasks/apply-qa-fixes.md +150 -0
  32. package/.bmad-core/tasks/brownfield-create-epic.md +162 -0
  33. package/.bmad-core/tasks/brownfield-create-story.md +149 -0
  34. package/.bmad-core/tasks/correct-course.md +72 -0
  35. package/.bmad-core/tasks/create-brownfield-story.md +314 -0
  36. package/.bmad-core/tasks/create-deep-research-prompt.md +280 -0
  37. package/.bmad-core/tasks/create-doc.md +103 -0
  38. package/.bmad-core/tasks/create-next-story.md +114 -0
  39. package/.bmad-core/tasks/document-project.md +345 -0
  40. package/.bmad-core/tasks/execute-checklist.md +88 -0
  41. package/.bmad-core/tasks/facilitate-brainstorming-session.md +138 -0
  42. package/.bmad-core/tasks/generate-ai-frontend-prompt.md +53 -0
  43. package/.bmad-core/tasks/index-docs.md +175 -0
  44. package/.bmad-core/tasks/kb-mode-interaction.md +77 -0
  45. package/.bmad-core/tasks/nfr-assess.md +345 -0
  46. package/.bmad-core/tasks/qa-gate.md +163 -0
  47. package/.bmad-core/tasks/review-story.md +316 -0
  48. package/.bmad-core/tasks/risk-profile.md +355 -0
  49. package/.bmad-core/tasks/shard-doc.md +187 -0
  50. package/.bmad-core/tasks/test-design.md +176 -0
  51. package/.bmad-core/tasks/trace-requirements.md +266 -0
  52. package/.bmad-core/tasks/validate-next-story.md +136 -0
  53. package/.bmad-core/templates/architecture-tmpl.yaml +651 -0
  54. package/.bmad-core/templates/brainstorming-output-tmpl.yaml +156 -0
  55. package/.bmad-core/templates/brownfield-architecture-tmpl.yaml +477 -0
  56. package/.bmad-core/templates/brownfield-prd-tmpl.yaml +281 -0
  57. package/.bmad-core/templates/competitor-analysis-tmpl.yaml +307 -0
  58. package/.bmad-core/templates/front-end-architecture-tmpl.yaml +219 -0
  59. package/.bmad-core/templates/front-end-spec-tmpl.yaml +350 -0
  60. package/.bmad-core/templates/fullstack-architecture-tmpl.yaml +824 -0
  61. package/.bmad-core/templates/market-research-tmpl.yaml +253 -0
  62. package/.bmad-core/templates/prd-tmpl.yaml +203 -0
  63. package/.bmad-core/templates/project-brief-tmpl.yaml +222 -0
  64. package/.bmad-core/templates/qa-gate-tmpl.yaml +103 -0
  65. package/.bmad-core/templates/story-tmpl.yaml +138 -0
  66. package/.bmad-core/user-guide.md +577 -0
  67. package/.bmad-core/utils/bmad-doc-template.md +327 -0
  68. package/.bmad-core/utils/workflow-management.md +71 -0
  69. package/.bmad-core/workflows/brownfield-fullstack.yaml +298 -0
  70. package/.bmad-core/workflows/brownfield-service.yaml +188 -0
  71. package/.bmad-core/workflows/brownfield-ui.yaml +198 -0
  72. package/.bmad-core/workflows/greenfield-fullstack.yaml +241 -0
  73. package/.bmad-core/workflows/greenfield-service.yaml +207 -0
  74. package/.bmad-core/workflows/greenfield-ui.yaml +236 -0
  75. package/.bmad-core/working-in-the-brownfield.md +606 -0
  76. package/.claude/commands/BMad/analyst.md +88 -0
  77. package/.claude/commands/BMad/architect.md +89 -0
  78. package/.claude/commands/BMad/bmad-master.md +114 -0
  79. package/.claude/commands/BMad/bmad-orchestrator.md +151 -0
  80. package/.claude/commands/BMad/dev.md +85 -0
  81. package/.claude/commands/BMad/pm.md +88 -0
  82. package/.claude/commands/BMad/po.md +83 -0
  83. package/.claude/commands/BMad/qa.md +91 -0
  84. package/.claude/commands/BMad/sm.md +69 -0
  85. package/.claude/commands/BMad/tasks/advanced-elicitation.md +123 -0
  86. package/.claude/commands/BMad/tasks/apply-qa-fixes.md +154 -0
  87. package/.claude/commands/BMad/tasks/brownfield-create-epic.md +166 -0
  88. package/.claude/commands/BMad/tasks/brownfield-create-story.md +153 -0
  89. package/.claude/commands/BMad/tasks/correct-course.md +76 -0
  90. package/.claude/commands/BMad/tasks/create-brownfield-story.md +318 -0
  91. package/.claude/commands/BMad/tasks/create-deep-research-prompt.md +284 -0
  92. package/.claude/commands/BMad/tasks/create-doc.md +107 -0
  93. package/.claude/commands/BMad/tasks/create-next-story.md +118 -0
  94. package/.claude/commands/BMad/tasks/document-project.md +349 -0
  95. package/.claude/commands/BMad/tasks/execute-checklist.md +92 -0
  96. package/.claude/commands/BMad/tasks/facilitate-brainstorming-session.md +142 -0
  97. package/.claude/commands/BMad/tasks/generate-ai-frontend-prompt.md +57 -0
  98. package/.claude/commands/BMad/tasks/index-docs.md +179 -0
  99. package/.claude/commands/BMad/tasks/kb-mode-interaction.md +81 -0
  100. package/.claude/commands/BMad/tasks/nfr-assess.md +349 -0
  101. package/.claude/commands/BMad/tasks/qa-gate.md +167 -0
  102. package/.claude/commands/BMad/tasks/review-story.md +320 -0
  103. package/.claude/commands/BMad/tasks/risk-profile.md +359 -0
  104. package/.claude/commands/BMad/tasks/shard-doc.md +191 -0
  105. package/.claude/commands/BMad/tasks/test-design.md +180 -0
  106. package/.claude/commands/BMad/tasks/trace-requirements.md +270 -0
  107. package/.claude/commands/BMad/tasks/validate-next-story.md +140 -0
  108. package/.claude/commands/BMad/ux-expert.md +73 -0
  109. package/.claude/hooks/piper-installer.sh +2 -2
  110. package/README.md +10 -11
  111. package/docs/technical-deep-dive.md +905 -0
  112. package/linkedin/vibe-coding-and-pulseaudio.md +121 -0
  113. package/mcp-server/agentvibes.db +0 -0
  114. package/package.json +1 -1
  115. package/scripts/audio-tunnel.config +17 -0
  116. package/src/installer.js +3 -3
@@ -0,0 +1,905 @@
1
+ # How AgentVibes Works Under the Hood: A Technical Deep Dive
2
+
3
+ Two months ago, I wanted to add voice and personality to my Claude coding agents so they would speak acknowledgments and completions—making my development workflow more engaging and keeping me in flow state. Fast forward to today, and we've built an amazing working system that not only speaks with over 150 voices, but does so with distinct personalities ranging from zen masters to sarcastic companions that add a bit of sass to your coding sessions.
4
+
5
+ In this article, we take a deep dive into how AgentVibes works under the hood: the architecture, the design patterns, and the implementations that make it all possible. Best of all, **this is an open source project that is completely free**, and it can transform your coding experience with AI assistants.
6
+
7
+ ## The Big Picture: What Problem Does AgentVibes Solve?
8
+
9
+ Claude Code is an amazing AI coding assistant, but it's entirely text-based. You type a request, Claude responds with text, runs commands, and writes code. But what if Claude could *tell* you when it's starting a task? What if it could vocally confirm when it's done? What if it could do all this with personality—speaking with dry wit and sass, zen-like calmness, or whatever style fits your mood?
10
+
11
+ That's exactly what AgentVibes does. It transforms Claude Code from a silent text assistant into a voice-enabled AI companion with character and charm.
12
+
13
+ ## Architecture Overview: The Four Core Systems
14
+
15
+ AgentVibes is built on four interconnected systems:
16
+
17
+ 1. **Output Style System** - The AI's instructions for when to speak
18
+ 2. **Hook System** - The bash scripts that generate and play audio
19
+ 3. **Provider System** - The TTS engines (ElevenLabs or Piper)
20
+ 4. **MCP Server** - Natural language control interface
21
+
22
+ Let's explore each one.
23
+
24
+ ---
25
+
26
+ ## System 1: The Output Style - Teaching Claude When to Speak
27
+
28
+ ### What is an Output Style?
29
+
30
+ In Claude Code, an "output style" is essentially a set of instructions that tells the AI assistant *how* to format and present its responses. Think of it as a personality overlay that changes Claude's behavior without changing its core capabilities.
31
+
32
+ AgentVibes provides an output style called "Agent Vibes" (located at `.claude/output-styles/agent-vibes.md`). This markdown file contains detailed instructions that become part of Claude's system prompt when activated.
33
+
34
+ ### The Two-Point Protocol
35
+
36
+ The core genius of the AgentVibes output style is its **Two-Point TTS Protocol**:
37
+
38
+ **1. ACKNOWLEDGMENT** (Start of task)
39
+ When Claude receives a user command, it:
40
+ - Checks current personality/sentiment settings
41
+ - Generates a unique acknowledgment in that style
42
+ - Executes the TTS script to speak it
43
+ - Then proceeds with the actual work
44
+
45
+ **2. COMPLETION** (End of task)
46
+ After completing the task, Claude:
47
+ - Uses the same personality/sentiment as acknowledgment
48
+ - Generates a unique completion message
49
+ - Executes the TTS script again
50
+
51
+ Here's the critical part from `.claude/output-styles/agent-vibes.md`:
52
+
53
+ ```
54
+ ### 1. ACKNOWLEDGMENT (Start of task)
55
+ After receiving a user command:
56
+ 1. Check sentiment FIRST: `SENTIMENT=$(cat .claude/tts-sentiment.txt 2>/dev/null)`
57
+ 2. If no sentiment, check personality: `PERSONALITY=$(cat .claude/tts-personality.txt 2>/dev/null)`
58
+ 3. Use sentiment if set, otherwise use personality
59
+ 4. **Generate UNIQUE acknowledgment** - Use AI to create a fresh response in that style
60
+ 5. Execute TTS: `.claude/hooks/play-tts.sh "[message]" "[VoiceName]"`
61
+ 6. Proceed with work
62
+ ```
63
+
64
+ ### Why This Matters
65
+
66
+ This two-point protocol creates natural conversational flow:
67
+ - User: "Check git status"
68
+ - Claude (spoken): "I'll check that for you right away"
69
+ - Claude (text): *runs git status command*
70
+ - Claude (spoken): "Your repository is clean and up to date"
71
+
72
+ The AI doesn't just blindly execute—it *communicates* like a helpful assistant would.
73
+
74
+ ### Settings Priority System
75
+
76
+ AgentVibes has a sophisticated three-tier priority system for how Claude should speak:
77
+
78
+ **Priority 0: Language** (`.claude/tts-language.txt`)
79
+ - Controls which language TTS speaks
80
+ - Examples: "english", "spanish", "french"
81
+ - When set to non-English, ALL TTS is in that language
82
+
83
+ **Priority 1: Sentiment** (`.claude/tts-sentiment.txt`)
84
+ - Applies personality style WITHOUT changing voice
85
+ - Examples: "sarcastic", "flirty", "professional"
86
+ - Keeps your current voice but changes speaking style
87
+
88
+ **Priority 2: Personality** (`.claude/tts-personality.txt`)
89
+ - Changes BOTH voice AND speaking style
90
+ - Examples: "sarcastic" = Jessica Anne Bogart voice + dry wit
91
+ - Each personality has an assigned voice
92
+
93
+ The output style checks these in order—if language is set, speak in that language. If sentiment is set, use that style. Otherwise fall back to personality.
94
+
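The priority rules above can be sketched in a few lines of Python. This is only an illustration of the lookup order, not code from AgentVibes: the function names are hypothetical, and the `"english"` and `"normal"` fallbacks are assumptions for the case where no settings file exists.

```python
from pathlib import Path
from typing import Optional


def read_setting(claude_dir: Path, filename: str) -> Optional[str]:
    """Return the stripped contents of a settings file, or None if missing or empty."""
    path = claude_dir / filename
    if not path.is_file():
        return None
    value = path.read_text(encoding="utf-8").strip()
    return value or None


def resolve_speaking_style(claude_dir: Path) -> dict:
    """Language always applies; a sentiment, when set, wins over the personality."""
    language = read_setting(claude_dir, "tts-language.txt") or "english"  # assumed default
    sentiment = read_setting(claude_dir, "tts-sentiment.txt")
    personality = read_setting(claude_dir, "tts-personality.txt")
    return {
        "language": language,
        "style": sentiment or personality or "normal",  # "normal" is an assumed default
        # A sentiment keeps the current voice; only a personality switches it.
        "changes_voice": sentiment is None and personality is not None,
    }
```

Calling `resolve_speaking_style(Path(".claude"))` returns the language, the style to speak in, and whether the voice should change along with the style.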
95
+ ---
96
+
97
+ ## System 2: The Hook System - Where the Magic Happens
98
+
99
+ The hook system is a collection of bash scripts in `.claude/hooks/` that do the actual work of generating and playing audio. Let's trace the journey of a TTS request.
100
+
101
+ ### The Entry Point: play-tts.sh
102
+
103
+ When Claude's output style executes `.claude/hooks/play-tts.sh "Hello world" "Aria"`, here's what happens:
104
+
105
+ **File: `.claude/hooks/play-tts.sh`** (the router)
106
+
107
+ ```bash
108
+ TEXT="$1" # "Hello world"
109
+ VOICE_OVERRIDE="$2" # "Aria" (optional)
110
+
111
+ # Get active provider (elevenlabs or piper)
112
+ ACTIVE_PROVIDER=$(get_active_provider)
113
+
114
+ # Route to provider-specific implementation
115
+ case "$ACTIVE_PROVIDER" in
116
+   elevenlabs)
117
+     exec "$SCRIPT_DIR/play-tts-elevenlabs.sh" "$TEXT" "$VOICE_OVERRIDE"
118
+     ;;
119
+   piper)
120
+     exec "$SCRIPT_DIR/play-tts-piper.sh" "$TEXT" "$VOICE_OVERRIDE"
121
+     ;;
122
+ esac
123
+ ```
124
+
125
+ This script is a **provider router**. It doesn't generate audio itself—it delegates to the appropriate provider implementation. This is the provider abstraction pattern in action.
126
+
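The same routing idea can be written as a dispatch table. The sketch below is a Python illustration of the pattern, not part of AgentVibes; `PROVIDER_SCRIPTS` and `build_tts_command` are hypothetical names. Unlike the bash `case` shown above, it raises an error for an unknown provider, which is a design choice of this sketch.

```python
from pathlib import Path
from typing import List, Optional

# Provider name -> provider-specific script, mirroring the case statement above.
PROVIDER_SCRIPTS = {
    "elevenlabs": "play-tts-elevenlabs.sh",
    "piper": "play-tts-piper.sh",
}


def build_tts_command(
    provider: str, hooks_dir: Path, text: str, voice: Optional[str] = None
) -> List[str]:
    """Return the argument list for the script that handles the given provider."""
    try:
        script = PROVIDER_SCRIPTS[provider]
    except KeyError:
        raise ValueError(f"Unknown TTS provider: {provider!r}") from None
    command = [str(hooks_dir / script), text]
    if voice:
        command.append(voice)  # optional voice override
    return command
```

Adding a third provider would then mean adding one dictionary entry and one script, with no change to the callers.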
127
+ ### Provider Implementations
128
+
129
+ Each provider has its own script that handles the specifics:
130
+
131
+ **For ElevenLabs** (`.claude/hooks/play-tts-elevenlabs.sh`):
132
+ 1. Resolves voice name to voice ID (looks up "Aria" → actual voice ID)
133
+ 2. Detects current language setting (for multilingual support)
134
+ 3. Makes API call to ElevenLabs with text, voice, and language
135
+ 4. Saves audio to temp file
136
+ 5. Plays audio using system player (paplay/aplay/mpg123)
137
+ 6. Handles SSH detection and audio optimization
138
+
139
+ **For Piper** (`.claude/hooks/play-tts-piper.sh`):
140
+ 1. Resolves voice name to Piper model (e.g., "en_US-lessac-medium")
141
+ 2. Downloads voice model if not cached
142
+ 3. Runs local Piper TTS engine (no API call)
143
+ 4. Saves audio to temp file
144
+ 5. Plays audio using system player
145
+
146
+ ### The Personality Manager
147
+
148
+ One of the most interesting hooks is `personality-manager.sh`. Let's see how it works.
149
+
150
+ When you run `/agent-vibes:personality sarcastic`, this script:
151
+
152
+ ```bash
153
+ # 1. Validates personality exists
154
+ if [[ ! -f "$PERSONALITIES_DIR/${PERSONALITY}.md" ]]; then
155
+   echo "❌ Personality not found: $PERSONALITY"
156
+   exit 1
157
+ fi
158
+
159
+ # 2. Saves personality to config file
160
+ echo "$PERSONALITY" > "$PERSONALITY_FILE"
161
+
162
+ # 3. Detects active provider (ElevenLabs or Piper)
163
+ ACTIVE_PROVIDER=$(cat "$CLAUDE_DIR/tts-provider.txt")
164
+
165
+ # 4. Reads assigned voice from personality file
166
+ if [[ "$ACTIVE_PROVIDER" == "piper" ]]; then
167
+   ASSIGNED_VOICE=$(get_personality_data "$PERSONALITY" "piper_voice")
168
+ else
169
+   ASSIGNED_VOICE=$(get_personality_data "$PERSONALITY" "voice")
170
+ fi
171
+
172
+ # 5. Switches to that voice automatically
173
+ "$VOICE_MANAGER" switch "$ASSIGNED_VOICE" --silent
174
+
175
+ # 6. Plays a personality-appropriate acknowledgment
176
+ REMARK=$(pick_random_example_from_personality_file)
177
+ .claude/hooks/play-tts.sh "$REMARK"
178
+ ```
179
+
180
+ ### Personality Configuration Files
181
+
182
+ Each personality is defined in a markdown file like `.claude/personalities/sarcastic.md`:
183
+
184
+ ```markdown
185
+ ---
186
+ name: sarcastic
187
+ description: Dry wit and cutting observations
188
+ elevenlabs_voice: Jessica Anne Bogart
189
+ piper_voice: en_US-amy-medium
190
+ ---
191
+
192
+ ## AI Instructions
193
+ Use dry wit, cutting observations, and dismissive compliance. Model after
194
+ iconic sarcastic characters like Dr. House, Chandler Bing, and Miranda Priestly.
195
+
196
+ Rotate through different sarcastic approaches:
197
+ - Condescending intelligence: "Fascinating. You've discovered debugging."
198
+ - Quick zingers: "Could this build BE any slower?"
199
+ - Icy dismissiveness: "By all means, continue at a glacial pace"
200
+
201
+ ## Example Responses
202
+ - "Oh joy, another merge conflict. Just what I needed today."
203
+ - "Wow, a syntax error. I'm shocked. Shocked, I tell you."
204
+ - "Sure, I'll run that test. Right after I finish curing world hunger."
205
+ ```
206
+
207
+ **Personal note:** I've literally laughed out loud multiple times while coding with the sarcastic personality active. There's something delightfully entertaining about having your AI assistant respond with perfectly-timed sass when you ask it to debug yet another type error.
208
+
209
+ The AI reads this file and uses the "AI Instructions" section to generate unique responses in that style. The example responses are just guidance—the AI creates fresh variations each time.
210
+
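To make the file format concrete, here is a minimal Python sketch of reading the front matter of a personality file and picking the voice for a provider. It is not the project's parser (the hooks do this in bash); `parse_front_matter` and `voice_for_provider` are hypothetical names, and the key names are taken from the example file above.

```python
from typing import Dict, Optional


def parse_front_matter(markdown: str) -> Dict[str, str]:
    """Parse simple `key: value` pairs between the leading pair of --- lines."""
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of front matter
        key, separator, value = line.partition(":")
        if separator:
            fields[key.strip()] = value.strip()
    return fields


def voice_for_provider(fields: Dict[str, str], provider: str) -> Optional[str]:
    """Pick the voice field that matches the active provider."""
    key = "piper_voice" if provider == "piper" else "elevenlabs_voice"
    return fields.get(key)
```

With the `sarcastic.md` example, `voice_for_provider(fields, "piper")` gives the Piper model name and any other provider value gives the ElevenLabs voice name.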
211
+ ### Provider Manager
212
+
213
+ The provider manager (`provider-manager.sh`) handles switching between ElevenLabs and Piper:
214
+
215
+ ```bash
216
+ # Get active provider
217
+ get_active_provider() {
218
+   local provider_file=""
219
+
220
+   # Check project-local first, then global
221
+   if [[ -f ".claude/tts-provider.txt" ]]; then
222
+     provider_file=".claude/tts-provider.txt"
223
+   elif [[ -f "$HOME/.claude/tts-provider.txt" ]]; then
224
+     provider_file="$HOME/.claude/tts-provider.txt"
225
+   fi
226
+
227
+   cat "$provider_file" 2>/dev/null || echo "elevenlabs"
228
+ }
229
+
230
+ # Switch provider
231
+ switch_provider() {
232
+   local new_provider="$1"
233
+   echo "$new_provider" > "$CLAUDE_DIR/tts-provider.txt"
234
+   echo "✅ Switched to $new_provider provider"
235
+ }
236
+ ```
237
+
238
+ This allows seamless switching between paid (ElevenLabs) and free (Piper) TTS without changing any other configuration.
239
+
240
+ ---
241
+
242
+ ## System 3: The Provider System - Two Engines, One Interface
243
+
244
+ AgentVibes supports two TTS providers with the same interface:
245
+
246
+ ### ElevenLabs Provider
247
+
248
+ **Architecture:** Cloud-based API
249
+
250
+ **How it works:**
251
+ 1. Accepts text, voice name, and language code
252
+ 2. Makes HTTPS POST request to ElevenLabs API
253
+ 3. Receives MP3 audio stream
254
+ 4. Detects if running over SSH (checks `$SSH_CONNECTION`)
255
+ 5. If SSH detected, converts to OGG format (prevents audio corruption)
256
+ 6. Plays audio using local audio player
257
+
258
+ **Code snippet from `.claude/hooks/play-tts-elevenlabs.sh`:**
259
+
260
+ ```bash
261
+ # Make API request
262
+ RESPONSE=$(curl -s -X POST \
263
+   "https://api.elevenlabs.io/v1/text-to-speech/${VOICE_ID}" \
264
+   -H "xi-api-key: ${API_KEY}" \
265
+   -H "Content-Type: application/json" \
266
+   -d "{
267
+     \"text\": \"$TEXT\",
268
+     \"model_id\": \"eleven_multilingual_v2\",
269
+     \"language_code\": \"$LANGUAGE_CODE\",
270
+     \"voice_settings\": {
271
+       \"stability\": 0.5,
272
+       \"similarity_boost\": 0.75
273
+     }
274
+   }" \
275
+   --output "$AUDIO_FILE")
276
+
277
+ # SSH audio optimization
278
+ if [[ -n "$SSH_CONNECTION" ]]; then
279
+   # Convert MP3 to OGG to prevent corruption over SSH
280
+   ffmpeg -i "$AUDIO_FILE" -c:a libopus -b:a 128k "$OGG_FILE"
281
+   AUDIO_FILE="$OGG_FILE"
282
+ fi
283
+
284
+ # Play audio
285
+ paplay "$AUDIO_FILE" 2>/dev/null || aplay "$AUDIO_FILE"
286
+ ```
287
+
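One detail worth noting about the request body above: splicing `$TEXT` directly into a JSON string breaks as soon as the message contains a double quote or a backslash. A possible way around that, sketched here in Python rather than bash, is to let a JSON library do the escaping. The function name `build_tts_payload` is hypothetical, and the field names and values simply mirror the snippet above rather than anything checked against the ElevenLabs API reference.

```python
import json


def build_tts_payload(
    text: str,
    language_code: str,
    model_id: str = "eleven_multilingual_v2",
    stability: float = 0.5,
    similarity_boost: float = 0.75,
) -> str:
    """Serialize the request body; json.dumps escapes quotes and backslashes in text."""
    payload = {
        "text": text,
        "model_id": model_id,
        "language_code": language_code,
        "voice_settings": {
            "stability": stability,
            "similarity_boost": similarity_boost,
        },
    }
    return json.dumps(payload, ensure_ascii=False)
```

The resulting string can be passed to `curl -d` or any HTTP client as the request body without further quoting.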
288
+ ### Piper Provider
289
+
290
+ **Architecture:** Local neural TTS
291
+
292
+ **How it works:**
293
+ 1. Accepts text and voice model name
294
+ 2. Downloads voice model if not cached (stored in `~/.local/share/piper/`)
295
+ 3. Runs Piper engine locally (no internet required)
296
+ 4. Generates WAV audio
297
+ 5. Plays audio using local audio player
298
+
299
+ **Code snippet from `.claude/hooks/play-tts-piper.sh`:**
300
+
301
+ ```bash
302
+ # Check if voice model exists
303
+ VOICE_PATH="$HOME/.local/share/piper/voices/${VOICE}.onnx"
304
+
305
+ if [[ ! -f "$VOICE_PATH" ]]; then
306
+   # Download voice model
307
+   "$SCRIPT_DIR/piper-download-voices.sh" "$VOICE"
308
+ fi
309
+
310
+ # Generate speech locally
311
+ echo "$TEXT" | piper \
312
+   --model "$VOICE_PATH" \
313
+   --output_file "$AUDIO_FILE"
314
+
315
+ # Play audio
316
+ paplay "$AUDIO_FILE" 2>/dev/null || aplay "$AUDIO_FILE"
317
+ ```
318
+
319
+ ### Why Two Providers?
320
+
321
+ **ElevenLabs:**
322
+ - ✅ Superior voice quality
323
+ - ✅ 150+ voices with distinct characters
324
+ - ✅ Perfect multilingual support (29 languages)
325
+ - ❌ Requires API key and paid plan
326
+ - ❌ Needs internet connection
327
+ - ❌ API costs per character
328
+
329
+ **Piper:**
330
+ - ✅ Completely free
331
+ - ✅ Works offline
332
+ - ✅ No API key needed
333
+ - ✅ 50+ voices
334
+ - ❌ Moderate voice quality
335
+ - ❌ Basic multilingual support
336
+ - ❌ Requires local installation
337
+
338
+ By supporting both, AgentVibes lets users choose based on their priorities: quality vs. cost.
339
+
340
+ ---
341
+
342
+ ## System 4: The MCP Server - Natural Language Control
343
+
344
+ The Model Context Protocol (MCP) server is AgentVibes' newest feature. It exposes all AgentVibes functionality through a standardized protocol that AI assistants can use.
345
+
346
+ ### What is MCP?
347
+
348
+ MCP is a protocol that allows AI assistants to discover and use external tools. Think of it as a REST API for AI assistants: instead of manually typing commands like `/agent-vibes:switch Aria`, you can just say "Switch to Aria voice" and the AI figures out the right tool to call.
349
+
350
+ ### The MCP Server Architecture
351
+
352
+ **File: `mcp-server/server.py`** (Python implementation)
353
+
354
+ ```python
355
+ class AgentVibesServer:
356
+     """MCP Server for AgentVibes TTS functionality"""
357
+
358
+     def __init__(self):
359
+         # Find the .claude directory (where hooks live)
360
+         self.claude_dir = self._find_claude_dir()
361
+         self.hooks_dir = self.claude_dir / "hooks"
362
+
363
+     async def text_to_speech(
364
+         self,
365
+         text: str,
366
+         voice: Optional[str] = None,
367
+         personality: Optional[str] = None,
368
+         language: Optional[str] = None,
369
+     ) -> str:
370
+         """Convert text to speech using AgentVibes"""
371
+
372
+         # Temporarily set personality if specified
373
+         if personality:
374
+             await self._run_script(
375
+                 "personality-manager.sh",
376
+                 ["set", personality]
377
+             )
378
+
379
+         # Temporarily set language if specified
380
+         if language:
381
+             await self._run_script(
382
+                 "language-manager.sh",
383
+                 ["set", language]
384
+             )
385
+
386
+         # Call the TTS script
387
+         args = ["bash", str(self.hooks_dir / "play-tts.sh"), text]
388
+         if voice:
389
+             args.append(voice)
390
+
391
+         # Execute asynchronously (non-blocking)
392
+         result = await asyncio.create_subprocess_exec(
393
+             *args,
394
+             stdout=asyncio.subprocess.PIPE,
395
+             stderr=asyncio.subprocess.PIPE,
396
+         )
397
+
398
+         return "✅ Audio played successfully"
399
+ ```
400
+
401
+ ### How MCP Tools are Registered
402
+
403
+ The server registers tools that the AI can discover:
404
+
405
+ ```python
406
+ @server.list_tools()
407
+ async def list_tools() -> list[Tool]:
408
+     return [
409
+         Tool(
410
+             name="text_to_speech",
411
+             description="Speak text using AgentVibes TTS",
412
+             inputSchema={
413
+                 "type": "object",
414
+                 "properties": {
415
+                     "text": {"type": "string"},
416
+                     "voice": {"type": "string", "optional": True},
417
+                     "personality": {"type": "string", "optional": True},
418
+                     "language": {"type": "string", "optional": True},
419
+                 },
420
+             },
421
+         ),
422
+         Tool(name="switch_voice", ...),
423
+         Tool(name="list_voices", ...),
424
+         Tool(name="set_personality", ...),
425
+         # ... 20+ more tools
426
+     ]
427
+ ```
428
+
429
+ ### MCP in Action
430
+
431
+ When you say "Switch to Aria voice" in Claude Desktop with AgentVibes MCP installed:
432
+
433
+ 1. Claude receives your natural language request
434
+ 2. Claude sees the `switch_voice` tool is available
435
+ 3. Claude calls: `switch_voice(voice_name="Aria")`
436
+ 4. MCP server executes: `bash .claude/hooks/voice-manager.sh switch Aria`
437
+ 5. Voice manager saves "Aria" to `.claude/tts-voice.txt`
438
+ 6. MCP server returns: "✅ Switched to Aria voice"
439
+ 7. Claude responds to you with confirmation
440
+
441
+ You never had to know the slash command syntax or where files are stored!
442
+
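Step 4 of the sequence above, turning a tool call into a hook invocation, can be sketched as a small lookup table. This is an illustration rather than the server's actual code: `TOOL_ROUTES` and `build_hook_command` are hypothetical names, and the `personality` argument key for `set_personality` is an assumption (only `voice_name` appears in the walkthrough).

```python
from pathlib import Path
from typing import Dict, List, Tuple

# Hypothetical routing table: tool name -> (hook script, subcommand, argument key).
TOOL_ROUTES: Dict[str, Tuple[str, str, str]] = {
    "switch_voice": ("voice-manager.sh", "switch", "voice_name"),
    "set_personality": ("personality-manager.sh", "set", "personality"),  # key assumed
}


def build_hook_command(tool_name: str, arguments: Dict[str, str], hooks_dir: Path) -> List[str]:
    """Translate an MCP tool call into the bash command that would serve it."""
    if tool_name not in TOOL_ROUTES:
        raise ValueError(f"Unknown tool: {tool_name!r}")
    script, subcommand, key = TOOL_ROUTES[tool_name]
    if key not in arguments:
        raise ValueError(f"Tool {tool_name!r} requires the argument {key!r}")
    return ["bash", str(hooks_dir / script), subcommand, str(arguments[key])]
```

For the example in the list, `build_hook_command("switch_voice", {"voice_name": "Aria"}, Path(".claude/hooks"))` yields the arguments for `bash .claude/hooks/voice-manager.sh switch Aria`.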
443
+ ### Project-Specific vs Global Settings
444
+
445
+ One clever feature of the MCP server is how it handles settings:
446
+
447
+ ```python
448
+ # Determine where to save settings based on context
449
+ cwd = Path.cwd()
450
+
451
+ if (cwd / ".claude").is_dir() and cwd != self.agentvibes_root:
452
+     # Real Claude Code project with .claude directory
453
+     env["CLAUDE_PROJECT_DIR"] = str(cwd)
454
+     # Settings will be saved to project's .claude/
455
+ else:
456
+     # Claude Desktop, Warp, or non-project context
457
+     pass  # Settings will be saved to ~/.claude/
458
+ ```
459
+
460
+ This means:
461
+ - **In Claude Code projects:** Settings are project-specific (each project can have different voice/personality)
462
+ - **In Claude Desktop/Warp:** Settings are global (consistent across all conversations)
463
+
464
+ ---
465
+
466
+ ## Data Flow: Following a TTS Request From Start to Finish
467
+
468
+ Let's trace a complete request to see how all systems work together.
469
+
470
+ **Scenario:** You ask Claude Code to "Check git status" with the sarcastic personality active.
471
+
472
+ ### Step 1: Output Style Triggers Acknowledgment
473
+
474
+ Claude's output style instructions kick in:
475
+
476
+ ```
477
+ 1. Check personality setting:
478
+ - Reads .claude/tts-personality.txt → "sarcastic"
479
+
480
+ 2. Read personality configuration:
481
+ - Reads .claude/personalities/sarcastic.md
482
+ - Extracts AI instructions: "Use dry wit, cutting observations..."
483
+
484
+ 3. Generate unique acknowledgment:
485
+ - AI creates: "Oh, the excitement. Let me check that git status for you."
486
+
487
+ 4. Execute TTS:
488
+ - Calls: .claude/hooks/play-tts.sh "Oh, the excitement. Let me check that git status for you."
489
+ ```
490
+
491
+ ### Step 2: TTS Router Determines Provider
492
+
493
+ `play-tts.sh` routes the request:
494
+
495
+ ```bash
496
+ # Read active provider
497
+ ACTIVE_PROVIDER=$(cat .claude/tts-provider.txt) → "elevenlabs"
498
+
499
+ # Route to ElevenLabs implementation
500
+ exec .claude/hooks/play-tts-elevenlabs.sh "$TEXT" "$VOICE"
501
+ ```
502
+
503
+ ### Step 3: ElevenLabs Provider Generates Audio
504
+
505
+ `play-tts-elevenlabs.sh` does the heavy lifting:
506
+
507
+ ```bash
508
+ # 1. Resolve voice
509
+ VOICE_NAME="Jessica Anne Bogart" # from sarcastic.md
510
+ VOICE_ID=$(lookup_voice_id "$VOICE_NAME") → "abc123xyz789"
511
+
512
+ # 2. Detect language
513
+ LANGUAGE_CODE=$(cat .claude/tts-language.txt) → "en"
514
+
515
+ # 3. Call ElevenLabs API
516
+ curl -X POST "https://api.elevenlabs.io/v1/text-to-speech/$VOICE_ID" \
517
+   -H "xi-api-key: $API_KEY" \
518
+   -d '{"text": "Oh, the excitement. Let me check that git status for you."}' \
519
+   --output /tmp/tts_12345.mp3
520
+
521
+ # 4. Check if over SSH
522
+ if [[ -n "$SSH_CONNECTION" ]]; then
523
+   # Convert MP3 to OGG to prevent corruption
524
+   ffmpeg -i /tmp/tts_12345.mp3 /tmp/tts_12345.ogg
525
+   AUDIO_FILE=/tmp/tts_12345.ogg
526
+ fi
527
+
528
+ # 5. Play audio
529
+ paplay /tmp/tts_12345.ogg
530
+ ```
531
+
532
+ ### Step 4: Claude Proceeds With Task
533
+
534
+ Claude runs the git status command while audio plays in parallel (non-blocking).
535
+
536
+ ### Step 5: Output Style Triggers Completion
537
+
538
+ After task completes:
539
+
540
+ ```
541
+ 1. Generate completion message:
542
+ - AI creates: "Riveting. Your repository is clean. Try not to get too excited."
543
+
544
+ 2. Execute TTS:
545
+ - Calls: .claude/hooks/play-tts.sh "Riveting. Your repository is clean. Try not to get too excited."
546
+
547
+ 3. The same flow as Steps 2-3 repeats
548
+ ```
549
+
550
+ **Important note:** These aren't just hard-coded responses—AgentVibes uses AI to generate unique responses each time based on the personality instructions. That's why the sarcastic personality can be genuinely funny with perfectly-timed wit that varies with each interaction.
551
+
552
+ And if sarcasm isn't your style, AgentVibes includes 19 different personalities ranging from professional and zen to enthusiastic and grandpa—or you can simply use the normal personality for straightforward, no-nonsense responses. The choice is yours!
553
+
554
+ The entire flow takes ~2-3 seconds for acknowledgment and completion combined.
555
+
556
+ ---
557
+
558
+ ## Installation Architecture: How AgentVibes Gets Installed
559
+
560
+ When you run `npx agentvibes install --yes`, here's what happens:
561
+
562
+ ### Step 1: NPM Package Execution
563
+
564
+ ```bash
565
+ # NPM downloads AgentVibes package to cache
566
+ ~/.npm/_npx/[hash]/node_modules/agentvibes/
567
+
568
+ # NPM executes the bin script
569
+ ./bin/agent-vibes install --yes
570
+ ```
571
+
572
+ ### Step 2: Installer Script Runs
573
+
574
+ **File: `src/installer.js`**
575
+
576
+ The installer:
577
+ 1. Detects installation location (current directory or global `~/.claude/`)
578
+ 2. Creates `.claude/` directory structure
579
+ 3. Copies all files from package:
580
+ - Commands → `.claude/commands/agent-vibes/`
581
+ - Hooks → `.claude/hooks/`
582
+ - Personalities → `.claude/personalities/`
583
+ - Output styles → `.claude/output-styles/`
584
+ 4. Makes all bash scripts executable (`chmod +x`)
585
+ 5. Creates default configuration files
586
+
587
+ ### Directory Structure Created
588
+
589
+ ```
590
+ .claude/
591
+ ├── commands/
592
+ │   └── agent-vibes/
593
+ │       ├── agent-vibes.md # Main command file
594
+ │       ├── switch.md # /agent-vibes:switch
595
+ │       ├── list.md # /agent-vibes:list
596
+ │       ├── personality.md # /agent-vibes:personality
597
+ │       └── ... (50+ command files)
598
+ ├── hooks/
599
+ │   ├── play-tts.sh # Main TTS router
600
+ │   ├── play-tts-elevenlabs.sh # ElevenLabs implementation
601
+ │   ├── play-tts-piper.sh # Piper implementation
602
+ │   ├── personality-manager.sh # Personality system
603
+ │   ├── voice-manager.sh # Voice switching
604
+ │   ├── provider-manager.sh # Provider switching
605
+ │   ├── language-manager.sh # Language settings
606
+ │   └── ... (20+ hook scripts)
607
+ ├── personalities/
608
+ │   ├── normal.md
609
+ │   ├── professional.md
610
+ │   ├── sarcastic.md
611
+ │   ├── zen.md
612
+ │   └── ... (19 personality files)
613
+ ├── output-styles/
614
+ │   └── agent-vibes.md # Output style instructions
615
+ ├── tts-voice.txt # Current voice (e.g., "Aria")
616
+ ├── tts-personality.txt # Current personality (e.g., "sarcastic")
617
+ ├── tts-provider.txt # Current provider (e.g., "elevenlabs")
618
+ └── tts-language.txt # Current language (e.g., "english")
619
+ ```
620
+
621
+ ### Step 3: Post-Install (MCP Dependencies)
622
+
623
+ If installing for MCP use:
624
+
625
+ ```bash
626
+ # Install Python dependencies
627
+ cd mcp-server/
628
+ pip install -r requirements.txt
629
+ # Installs: mcp (MCP SDK), aiosqlite, etc.
630
+ ```
631
+
632
+ ---
633
+
634
+ ## Configuration Storage: Where Settings Live
635
+
636
+ AgentVibes uses simple text files for configuration. This makes it easy to understand, debug, and even manually edit.
637
+
638
+ ### Project-Local vs Global
639
+
640
+ **Project-Local** (`.claude/` in project directory):
641
+ - Used when working in a Claude Code project
642
+ - Settings are specific to that project
643
+ - Example: `/home/user/my-app/.claude/tts-voice.txt`
644
+
645
+ **Global** (`~/.claude/` in home directory):
646
+ - Used for Claude Desktop, Warp, and when no project `.claude/` exists
647
+ - Settings are shared across all sessions
648
+ - Example: `/home/user/.claude/tts-voice.txt`
649
+
650
+ ### Configuration Files
651
+
652
+ | File | Purpose | Example Value |
653
+ |------|---------|---------------|
654
+ | `tts-voice.txt` | Current voice name | `Aria` |
655
+ | `tts-personality.txt` | Current personality | `pirate` |
656
+ | `tts-sentiment.txt` | Current sentiment (optional) | `sarcastic` |
657
+ | `tts-provider.txt` | Active TTS provider | `elevenlabs` |
658
+ | `tts-language.txt` | TTS language | `spanish` |
659
+
660
+ ### Reading Configuration in Code
661
+
662
+ The hooks use a consistent pattern:
663
+
664
+ ```bash
665
+ # Check project-local first, fallback to global
666
+ get_current_voice() {
667
+ if [[ -f ".claude/tts-voice.txt" ]]; then
668
+ cat ".claude/tts-voice.txt"
669
+ elif [[ -f "$HOME/.claude/tts-voice.txt" ]]; then
670
+ cat "$HOME/.claude/tts-voice.txt"
671
+ else
672
+ echo "Aria" # Default
673
+ fi
674
+ }
675
+ ```
676
+
677
+ This ensures settings are found regardless of context.
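
The same lookup order applies to every `tts-*.txt` file, so it can be factored into one helper. The following is a minimal sketch, not the code the hooks ship with: the `get_config` name and the `CONFIG_ROOT` override are assumptions made for illustration, and unlike the function above it also treats an empty file as "not configured".

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the AgentVibes source): read one setting,
# preferring the project-local file over the global one.
# CONFIG_ROOT is an illustrative override for the project directory;
# it defaults to the current directory.
get_config() {
  local name="$1" default="$2"
  local local_file="${CONFIG_ROOT:-.}/.claude/tts-${name}.txt"
  local global_file="$HOME/.claude/tts-${name}.txt"

  # -s: the file exists and is non-empty, so a blank file falls through
  if [[ -s "$local_file" ]]; then
    cat "$local_file"
  elif [[ -s "$global_file" ]]; then
    cat "$global_file"
  else
    echo "$default"
  fi
}
```

Usage would look like `VOICE=$(get_config voice "Aria")` or `PROVIDER=$(get_config provider "elevenlabs")`, which keeps the precedence rule in one place.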
678
+
679
+ ---
680
+
681
+ ## Advanced Features Deep Dive
682
+
683
+ ### Language Learning Mode
684
+
685
+ One of AgentVibes' coolest features is language learning mode. When enabled, every TTS message plays **twice**—once in your main language, then again in your target language.
686
+
687
+ **How it works:**
688
+
689
+ The output style is modified to call TTS twice:
690
+
691
+ ```bash
692
+ # First call - main language (English)
693
+ .claude/hooks/play-tts.sh "I'll check that for you"
694
+
695
+ # Second call - target language (Spanish)
696
+ .claude/hooks/play-tts.sh "Lo verificaré para ti" "es_ES-davefx-medium"
697
+ ```
698
+
699
+ The translation happens via API (if using ElevenLabs multilingual voices) or by using language-specific Piper voices.
700
+
701
+ ### SSH Audio Optimization
702
+
703
+ AgentVibes automatically detects SSH sessions and optimizes audio:
704
+
705
+ ```bash
706
+ # Detect SSH
707
+ if [[ -n "$SSH_CONNECTION" ]]; then
708
+ IS_SSH=true
709
+ fi
710
+
711
+ if [[ "$IS_SSH" == "true" ]]; then
712
+ # Convert MP3 to OGG with Opus codec
713
+ # This prevents audio corruption over SSH tunnels
714
+ ffmpeg -i "$MP3_FILE" -c:a libopus -b:a 128k "$OGG_FILE"
715
+ AUDIO_FILE="$OGG_FILE"
716
+ fi
717
+ ```
718
+
719
+ Why? MP3 audio streamed over an SSH tunnel can arrive corrupted. The OGG/Opus format is more robust for network transmission.
720
+
721
+ ### BMAD Plugin Integration
722
+
723
+ AgentVibes can integrate with the BMAD METHOD (a multi-agent framework). When a BMAD agent activates, AgentVibes automatically switches to that agent's assigned voice.
724
+
725
+ **How it works:**
726
+
727
+ 1. BMAD agent activates (e.g., `/BMad:agents:pm` for the product manager)
728
+ 2. BMAD writes agent ID to `.bmad-agent-context` file
729
+ 3. AgentVibes output style checks this file
730
+ 4. If BMAD plugin is enabled, looks up voice in `.claude/plugins/bmad-voices.md`
731
+ 5. Automatically switches to that voice
732
+
733
+ This creates the illusion of multiple distinct AI personalities in conversations.
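
Steps 2 to 4 amount to a keyed lookup, which can be sketched as below. The file formats here are assumptions rather than the plugin's documented layout: the sketch supposes `.bmad-agent-context` holds a bare agent ID such as `pm`, and that `bmad-voices.md` contains Markdown table rows shaped like `| pm | Voice Name |`. The function name `lookup_bmad_voice` is likewise hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: both file formats are assumed, not taken from the real plugin.
#   $1: context file holding a bare agent ID (e.g. "pm")
#   $2: voices file with table rows like "| pm | Some Voice |"
# Prints the mapped voice and returns 0, or returns 1 if nothing matches.
lookup_bmad_voice() {
  local context_file="$1" voices_file="$2"
  [[ -f "$context_file" && -f "$voices_file" ]] || return 1

  local agent_id
  agent_id=$(tr -d '[:space:]' < "$context_file")
  [[ -n "$agent_id" ]] || return 1

  # Split each row on "|", trim both cells, print the voice for a matching ID.
  awk -F'|' -v id="$agent_id" '
    {
      key = $2; voice = $3
      gsub(/^[ \t]+|[ \t]+$/, "", key)
      gsub(/^[ \t]+|[ \t]+$/, "", voice)
      if (key == id && voice != "") { print voice; found = 1; exit }
    }
    END { exit found ? 0 : 1 }
  ' "$voices_file"
}
```

A caller could then switch voices only when the lookup succeeds, for example `if VOICE=$(lookup_bmad_voice .bmad-agent-context .claude/plugins/bmad-voices.md); then ...; fi`, and otherwise keep the currently configured voice.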
734
+
735
+ ---
736
+
737
+ ## Performance Considerations
738
+
739
+ ### Non-Blocking Audio Playback
740
+
741
+ TTS requests run asynchronously—Claude doesn't wait for audio to finish before continuing work:
742
+
743
+ ```bash
744
+ # Play audio in background
745
+ paplay "$AUDIO_FILE" &
746
+
747
+ # Claude continues immediately
748
+ # (runs git status, writes code, etc.)
749
+ ```
750
+
751
+ This means acknowledgment audio plays while Claude is already working on your task.
752
+
753
+ ### Audio Caching
754
+
755
+ AgentVibes saves audio files temporarily:
756
+
757
+ ```bash
758
+ AUDIO_FILE="/tmp/agentvibes_tts_${RANDOM}_${TIMESTAMP}.mp3"
759
+ ```
760
+
761
+ Files are kept for the duration of the session, allowing the `/agent-vibes:replay` command to work. Cleanup happens automatically when the terminal session ends.
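
Because the files follow a predictable naming pattern, a replay command can find the most recent one by modification time. This is a rough sketch of that idea, not the actual `/agent-vibes:replay` implementation; the `latest_tts_file` name and its directory argument are assumptions for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the newest cached TTS file in a directory
# (default /tmp), or nothing if no cached file exists.
latest_tts_file() {
  local dir="${1:-/tmp}"
  # ls -t sorts newest first; the generated names contain no whitespace,
  # so taking the first line of the listing is safe here.
  ls -t "$dir"/agentvibes_tts_*.mp3 2>/dev/null | head -n 1
}
```

A replay step could then be as small as `FILE=$(latest_tts_file) && [[ -n "$FILE" ]] && paplay "$FILE" &`.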
762
+
763
+ ### Provider Performance
764
+
765
+ **ElevenLabs:**
766
+ - API latency: ~500-1000ms
767
+ - Audio quality: Excellent (256kbps MP3)
768
+ - Bandwidth: ~32KB per second of audio (256kbps ÷ 8)
769
+
770
+ **Piper:**
771
+ - Generation latency: ~200-500ms (local)
772
+ - Audio quality: Good (22kHz WAV)
773
+ - Bandwidth: None (offline)
774
+
775
+ ### Text Length Limits
776
+
777
+ AgentVibes limits text length to prevent issues:
778
+
779
+ ```bash
780
+ # Truncate long text
781
+ if [ ${#TEXT} -gt 500 ]; then
782
+ TEXT="${TEXT:0:497}..."
783
+ fi
784
+ ```
785
+
786
+ This prevents:
787
+ - Excessive API costs (ElevenLabs charges per character)
788
+ - Slow generation (long audio takes time to produce)
789
+ - User confusion (very long TTS messages are hard to follow)
790
+
791
+ ---
792
+
793
+ ## Error Handling and Resilience
794
+
795
+ AgentVibes has multiple layers of error handling:
796
+
797
+ ### API Failure Handling
798
+
799
+ ```bash
800
+ # Try ElevenLabs API
801
+ RESPONSE=$(curl -s -X POST "$API_ENDPOINT" ...)
802
+
803
+ if [[ $? -ne 0 ]] || [[ ! -f "$AUDIO_FILE" ]]; then
804
+ echo "⚠️ TTS request failed (API error or network issue)"
805
+ exit 1
806
+ fi
807
+ ```
808
+
809
+ If the API fails, the error is logged and the hook exits, but Claude Code doesn't crash; the task continues without audio.
810
+
811
+ ### Missing Configuration Graceful Degradation
812
+
813
+ ```bash
814
+ # If no voice configured, use default
815
+ VOICE=$(cat .claude/tts-voice.txt 2>/dev/null || echo "Aria")
816
+
817
+ # If no personality configured, use normal
818
+ PERSONALITY=$(cat .claude/tts-personality.txt 2>/dev/null || echo "normal")
819
+ ```
820
+
821
+ Missing files don't cause crashes—sensible defaults are used.
822
+
823
+ ### Provider Fallback
824
+
825
+ If Piper isn't installed, AgentVibes can guide installation:
826
+
827
+ ```bash
828
+ if ! command -v piper &> /dev/null; then
829
+ echo "❌ Piper not installed"
830
+ echo " Install with: /agent-vibes:provider install piper"
831
+ exit 1
832
+ fi
833
+ ```
834
+
835
+ Clear error messages help users fix issues themselves.
836
+
837
+ ---
838
+
839
+ ## Testing and Quality Assurance
840
+
841
+ AgentVibes includes a test suite:
842
+
843
+ ```bash
844
+ # Run tests
845
+ npm test
846
+
847
+ # This executes
848
+ bats test/unit/*.bats
849
+ ```
850
+
851
+ Test files validate:
852
+ - Voice resolution (name → ID mapping)
853
+ - Personality file parsing
854
+ - Provider switching logic
855
+ - Configuration file handling
856
+
857
+ ---
858
+
859
+ ## Conclusion: The Bigger Picture
860
+
861
+ AgentVibes demonstrates several important software engineering principles:
862
+
863
+ **1. Separation of Concerns**
864
+ - Output style (when to speak) is separate from hooks (how to speak)
865
+ - Provider abstraction (ElevenLabs vs Piper) is separate from voice management
866
+ - MCP server is separate from core functionality
867
+
868
+ **2. Provider Pattern**
869
+ - Multiple TTS engines behind a single interface
870
+ - Easy to add new providers (OpenAI TTS, Google TTS, etc.)
871
+
872
+ **3. Configuration as Data**
873
+ - Simple text files instead of complex databases
874
+ - Easy to version control, debug, and manually edit
875
+
876
+ **4. Progressive Enhancement**
877
+ - Core functionality works with minimal setup
878
+ - Advanced features (MCP, BMAD, language learning) layer on top
879
+ - Graceful degradation when features aren't available
880
+
881
+ **5. User Experience First**
882
+ - Natural language control (MCP) instead of memorizing commands
883
+ - Instant feedback (acknowledgment/completion)
884
+ - Personality makes it fun, not just functional
885
+
886
+ Whether you're building your own AI integrations, designing CLI tools, or just curious about how AgentVibes works, I hope this deep dive has given you a comprehensive understanding of the architecture.
887
+
888
+ The beauty of AgentVibes isn't just that it makes Claude talk—it's that it does so with a clean, maintainable, extensible architecture that other developers can learn from and build upon.
889
+
890
+ ---
891
+
892
+ ## What's Next?
893
+
894
+ Now that you understand how AgentVibes works under the hood, you might want to:
895
+
896
+ - **Create custom personalities** - Edit `.claude/personalities/*.md` files
897
+ - **Extend the MCP server** - Add new tools in `mcp-server/server.py`
898
+ - **Build custom output styles** - Create your own instructions in `.claude/output-styles/`
899
+ - **Contribute to the project** - Submit PRs on [GitHub](https://github.com/paulpreibisch/AgentVibes)
900
+
901
+ Happy coding, and may your AI assistant always speak with personality! 🎤✨
902
+
903
+ ---
904
+
905
+ **About the Author:** Paul Preibisch is the creator of AgentVibes, an open source project that brings voice and personality to AI coding assistants. Follow the project on [GitHub](https://github.com/paulpreibisch/AgentVibes) or visit [agentvibes.org](https://www.agentvibes.org) to learn more.