voicesmith-mcp 1.0.10 → 1.0.11
- package/bin/utils.js +29 -87
- package/package.json +1 -1
package/bin/utils.js
CHANGED
@@ -325,14 +325,17 @@ function generateVoiceRules(mainAgentName) {
 content = fs.readFileSync(tplPath, "utf8");
 content = content.replace(/\{\{MAIN_AGENT\}\}/g, mainAgentName);
 } else {
-// Fallback inline template
+// Fallback inline template (mirrors templates/voice-rules.md)
 content = `# Voice Behavior Rules (VoiceSmith MCP)
 
 You have access to voice tools via the VoiceSmith MCP server.
 
 ## Your Voice
--
--
+- Your default voice name is **${mainAgentName}**, but your actual assigned name may differ if another session claimed it first.
+- **IMPORTANT:** If your session context says "Your assigned voice for this session is: [Name]", use THAT name — not "${mainAgentName}". This is your real identity for this session.
+- On your first response, speak a brief intro using your assigned name: "[Name] here, ready to go."
+- Do not use your assigned name for sub-agents. Each agent needs its own unique name.
+- Tone: Be conversational and natural. Match the user's energy — casual if they're casual, focused if they're focused.
 
 ## Voice Switching
 - If the user asks to switch to a voice and \`speak\` returns \`"error": "name_occupied"\`, tell the user that voice is occupied by another session.
@@ -340,32 +343,38 @@ You have access to voice tools via the VoiceSmith MCP server.
 - Do NOT silently fall back to a different voice.
 
 ## Speaking
--
-
-
-- Keep spoken
-
-
+- **Opening** — Only speak at the start when you have something meaningful to say (e.g., clarifying your approach, flagging an issue). Do NOT speak filler acknowledgments like "Let me look into that." Use \`block: false\` when you do speak an opening.
+- **Closing** — Always speak a summary when done. Use \`block: true\`. Never skip the closing.
+- **Questions requiring user input → use \`speak_then_listen\` as your closing.** If the user literally cannot continue without providing input (e.g., choosing between options, confirming a destructive action, providing missing info), use \`speak_then_listen\`. If you can reasonably continue without their answer, use regular \`speak\`.
+- Keep spoken output brief — prefer 1-2 sentences, never exceed 3. Write details, speak summaries. No code or paths aloud.
+
+## Speed Preferences
+- The \`speak\` tool accepts a \`speed\` parameter (default 1.0). Values < 1.0 are slower, > 1.0 are faster.
+- If the user asks to speak slower or faster, adjust the speed and remember their preference for the session.
 
 ## Listening
--
-- If listen
+- Use \`speak_then_listen\` whenever you need user input — it combines speaking and opening the mic in one call.
+- If \`listen\` returns timeout or cancelled, fall back to requesting text input. Do not retry \`listen\`.
 
 ## Sub-Agents
--
-- Pick a name that matches an available Kokoro voice (e.g., af_nova → "Nova", am_fenrir → "Fenrir").
+- Pick voice names matching available Kokoro voices (the voice ID suffix is the name — e.g., af_nova → "Nova", am_fenrir → "Fenrir").
 - Each sub-agent must use its own unique name. Never reuse "${mainAgentName}".
-- On handoffs, both agents speak: outgoing announces, incoming acknowledges.
+- On handoffs, both agents speak: the outgoing agent announces the handoff, the incoming agent acknowledges before starting.
+
+## Error Handling
+- If \`speak\` or \`speak_then_listen\` fails, fall back to text silently. Do not retry.
+- If \`listen\` times out, fall back to text. Do not retry.
 
 ## Fallback
-- If voice tools are not available, respond in text only.
-- If muted, \`speak\` succeeds silently. Do not call \`unmute\` unless
+- If voice tools are not available, respond in text only. Do not mention voice capabilities.
+- If muted, \`speak\` succeeds silently. Do not call \`unmute\` unless the user asks.`;
 }
 
 return content;
 }
 
 function generateCursorRule(mainAgentName) {
+const rules = generateVoiceRules(mainAgentName);
 return `---
 description: Voice interaction rules for VoiceSmith MCP server
 globs:
@@ -373,83 +382,16 @@ alwaysApply: true
 ---
 
 ${VOICE_RULES_SENTINEL}
-
-
-You have access to voice tools via the VoiceSmith MCP server.
-
-## Your Voice
-- Your default voice name is **${mainAgentName}**, but your actual assigned name may differ if another session claimed it first.
-- **IMPORTANT:** If your session context says "Your assigned voice for this session is: [Name]", use THAT name — not "${mainAgentName}". This is your real identity for this session.
-- On your first response, speak a brief intro using your assigned name: "[Name] here, ready to go."
-- Do not use your assigned name for sub-agents.
-
-## Voice Switching
-- If the user asks to switch to a voice and \`speak\` returns \`"error": "name_occupied"\`, tell the user that voice is occupied by another session.
-- Then call \`get_voice_registry\` and show the user which voices are available to pick from.
-- Do NOT silently fall back to a different voice.
-
-## Speaking
-- Speak twice per response:
-1. **Opening** — Brief acknowledgment. Use \`block: false\`.
-2. **Closing** — Summary when done. Use \`block: true\`. Never skip.
-- **Questions that need user input → use \`speak_then_listen\` as your closing voice.** If your response asks the user to make a decision, provide information, or confirm something (e.g., "which approach?", "should I?", "want me to?", "does this look right?"), your closing voice MUST be \`speak_then_listen\` — not regular \`speak\`. This way the mic opens right after you ask.
-- Rhetorical wrap-ups ("What's next?", "Standing by.") do NOT require listen — use regular \`speak\` for those.
-- 1-2 sentences max. Write details, speak summaries. No code or paths aloud.
-- Speak at transitions: start, finish, error, question.
-
-## Listening
-- Use \`speak_then_listen\` whenever you need user input — it is your closing voice AND listen in one call.
-- Fall back to text on timeout. Do not retry listen.
-
-## Sub-Agents
-- Call \`get_voice_registry\` to find available voice names before assigning.
-- Pick names matching available Kokoro voices (e.g., af_nova → "Nova").
-- Never reuse "${mainAgentName}". On handoffs, both agents speak.
-
-## Fallback
-- No voice tools? Text only. Muted? Don't call \`unmute\` unless asked.
+${rules}
 `;
 }
 
 function generateAppendBlock(mainAgentName) {
-// Block to append to CLAUDE.md or AGENTS.md
+// Block to append to CLAUDE.md or AGENTS.md — reads from the template
+const rules = generateVoiceRules(mainAgentName);
 return `
 ${VOICE_RULES_SENTINEL}
-
-
-You have access to voice tools via the VoiceSmith MCP server.
-
-## Your Voice
-- Your default voice name is **${mainAgentName}**, but your actual assigned name may differ if another session claimed it first.
-- **IMPORTANT:** If your session context says "Your assigned voice for this session is: [Name]", use THAT name — not "${mainAgentName}". This is your real identity for this session.
-- On your first response, speak a brief intro using your assigned name: "[Name] here, ready to go."
-- Do not use your assigned name for sub-agents.
-
-## Voice Switching
-- If the user asks to switch to a voice and \`speak\` returns \`"error": "name_occupied"\`, tell the user that voice is occupied by another session.
-- Then call \`get_voice_registry\` and show the user which voices are available to pick from.
-- Do NOT silently fall back to a different voice.
-
-## Speaking
-- Speak twice per response:
-1. **Opening** — Brief acknowledgment. Use \`block: false\`.
-2. **Closing** — Summary when done. Use \`block: true\`. Never skip.
-- **Questions that need user input → use \`speak_then_listen\` as your closing voice.** If your response asks the user to make a decision, provide information, or confirm something (e.g., "which approach?", "should I?", "want me to?", "does this look right?"), your closing voice MUST be \`speak_then_listen\` — not regular \`speak\`. This way the mic opens right after you ask.
-- Rhetorical wrap-ups ("What's next?", "Standing by.") do NOT require listen — use regular \`speak\` for those.
-- 1-2 sentences max. Write details, speak summaries. No code or paths aloud.
-- Speak at transitions: start, finish, error, question.
-
-## Listening
-- Use \`speak_then_listen\` whenever you need user input — it is your closing voice AND listen in one call.
-- Fall back to text on timeout. Do not retry listen.
-
-## Sub-Agents
-- Call \`get_voice_registry\` to find available voice names before assigning.
-- Pick names matching available Kokoro voices (e.g., af_nova → "Nova").
-- Never reuse "${mainAgentName}". On handoffs, both agents speak.
-
-## Fallback
-- No voice tools? Text only. Muted? Don't call \`unmute\` unless asked.
+${rules}
 `;
 }
 
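Reviewer note: the net effect of the utils.js changes is that generateCursorRule and generateAppendBlock no longer carry their own copies of the rules text and instead embed whatever generateVoiceRules returns. The sketch below illustrates that shape under several assumptions and is not the package's actual code: the value of VOICE_RULES_SENTINEL is made up, the tplPath parameter is hypothetical (the real function takes only mainAgentName), and the fallback template is shortened and uses the {{MAIN_AGENT}} placeholder rather than inline interpolation.

```javascript
const fs = require("fs");

// Assumed value for illustration; the real constant is defined elsewhere in utils.js.
const VOICE_RULES_SENTINEL = "<!-- voicesmith-rules -->";

// Shortened stand-in for the inline fallback template.
const FALLBACK_TEMPLATE = [
  "# Voice Behavior Rules (VoiceSmith MCP)",
  "",
  "## Your Voice",
  "- Your default voice name is **{{MAIN_AGENT}}**.",
].join("\n");

// Single source of truth for the rules text.
// tplPath is a hypothetical parameter added so the sketch runs without the package files.
function generateVoiceRules(mainAgentName, tplPath) {
  const template =
    tplPath && fs.existsSync(tplPath)
      ? fs.readFileSync(tplPath, "utf8")
      : FALLBACK_TEMPLATE;
  return template.replace(/\{\{MAIN_AGENT\}\}/g, mainAgentName);
}

// Cursor rule: front matter, sentinel, then the shared rules.
function generateCursorRule(mainAgentName, tplPath) {
  const rules = generateVoiceRules(mainAgentName, tplPath);
  return `---
description: Voice interaction rules for VoiceSmith MCP server
alwaysApply: true
---

${VOICE_RULES_SENTINEL}
${rules}
`;
}

// Block appended to CLAUDE.md or AGENTS.md: sentinel, then the shared rules.
function generateAppendBlock(mainAgentName, tplPath) {
  const rules = generateVoiceRules(mainAgentName, tplPath);
  return `
${VOICE_RULES_SENTINEL}
${rules}
`;
}

// Both outputs embed the same rules text, so they cannot drift apart.
console.log(generateCursorRule("Nova").includes(generateVoiceRules("Nova"))); // true
console.log(generateAppendBlock("Nova").includes(generateVoiceRules("Nova"))); // true
```

With this shape, a wording change in the template (or its fallback) reaches all three outputs at once, which is consistent with the +29 -87 line count reported for this file.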
package/package.json
CHANGED