npm - verbalcoding - Versions diffs - 0.2.1 → 0.2.2 - Mend

verbalcoding 0.2.1 → 0.2.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5)

package/docs/CONFIGURATION.md +32 -0
package/docs/USAGE.md +26 -1
package/docs/i18n/CONFIGURATION.ko.md +32 -0
package/docs/i18n/USAGE.ko.md +26 -1
package/package.json +1 -1

package/docs/CONFIGURATION.md CHANGED

@@ -66,6 +66,7 @@ WHISPER_CPP_BIN="whisper-cli"
 WHISPER_CPP_MODEL="./models/ggml-small-q5_1.bin"
 TTS_BACKEND="edge"
+TTS_VOICE_TYPE="korean_female"
 TTS_VOICE="ko-KR-SunHiNeural"
 TTS_RATE="+10%"
 TTS_MAX_CHARS="495"
@@ -80,6 +81,37 @@ AGENT_VERBOSE_PROGRESS="0"
 LATENCY_LOG_PATH="./.logs/latency.jsonl"
 ```
+## TTS Voice Selection
+Language presets and voice selection are separate:
+- `vc language ko|en|auto` changes STT language, progress language, and the default voice for that language.
+- Live voice commands such as “남자 한국어 목소리로 바꿔”, “여자 한국어 목소리로 바꿔”, `change voice to Korean female`, and `switch speaker to English` change only the speaker/voice type.
+- `!voice-test <text>` plays a quick sample with the currently selected backend and voice.
+Voice selection is stored in `config/tts-voices.json` by default. Override the path with `TTS_VOICE_CONFIG`. The running bridge re-reads/applies voice selection before synthesis, so voice commands take effect without a full restart.
+Default Edge catalog:
+| `TTS_VOICE_TYPE` | `TTS_VOICE` | Language |
+|---|---|---|
+| `korean_male` | `ko-KR-InJoonNeural` | Korean |
+| `korean_female` | `ko-KR-SunHiNeural` | Korean |
+| `korean_multilingual_male` | `ko-KR-HyunsuMultilingualNeural` | Korean |
+| `english_male` | `en-US-GuyNeural` | English |
+| `english_female` | `en-US-AriaNeural` | English |
+Manual persistent override:
+```bash
+TTS_BACKEND="edge"
+TTS_VOICE_TYPE="korean_male"
+TTS_VOICE="ko-KR-InJoonNeural"
+TTS_VOICE_CONFIG="config/tts-voices.json"
+```
+For OpenVoice, SpeechSwift, or Supertonic, keep the backend-specific voice/reference settings in the sections below; the same voice catalog file can still track the active voice type.
 ## MCP Server
 VerbalCoding ships a stdio MCP server so Hermes Agent or any MCP client can control the bridge through tools instead of relying on skills or free-form shell commands.
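
The catalog table added in this hunk pairs each `TTS_VOICE_TYPE` with an Edge `TTS_VOICE`. The lookup can be sketched roughly as below. This is an illustration only, not code from the package: `resolveEdgeVoice` is a hypothetical helper, and the rule that an explicit `TTS_VOICE` takes precedence over the catalog entry is an assumption drawn from the docs describing `TTS_VOICE` as optional.

```typescript
// Default Edge catalog, copied from the table added in the hunk above.
const EDGE_CATALOG = new Map<string, string>([
  ["korean_male", "ko-KR-InJoonNeural"],
  ["korean_female", "ko-KR-SunHiNeural"],
  ["korean_multilingual_male", "ko-KR-HyunsuMultilingualNeural"],
  ["english_male", "en-US-GuyNeural"],
  ["english_female", "en-US-AriaNeural"],
]);

// Hypothetical helper (not part of verbalcoding).
// Assumption: a non-empty TTS_VOICE overrides the catalog entry for TTS_VOICE_TYPE.
function resolveEdgeVoice(env: { TTS_VOICE_TYPE?: string; TTS_VOICE?: string }): string | undefined {
  if (env.TTS_VOICE) {
    return env.TTS_VOICE;
  }
  if (env.TTS_VOICE_TYPE === undefined) {
    return undefined;
  }
  // Unknown voice types yield undefined rather than a guessed fallback.
  return EDGE_CATALOG.get(env.TTS_VOICE_TYPE);
}

console.log(resolveEdgeVoice({ TTS_VOICE_TYPE: "korean_male" }));
```

Returning `undefined` for an unknown voice type leaves the fallback decision to the caller, since the diff does not say how the bridge handles that case.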

package/docs/USAGE.md CHANGED

@@ -48,7 +48,7 @@ The bot auto-joins the first configured channel name, defaulting to `일반,Gene
 | `!ping` | Basic bot check |
 | `!join` / `!leave` | Join or leave voice |
 | `!say <text>` | Speak text directly through TTS |
-| `!voice-test <text>` | Test the active TTS backend |
+| `!voice-test <text>` | Test the active TTS backend/voice |
 | `!voice-clone capture` | Save the next valid utterance as an OpenVoice reference sample |
 | `!voice-clone status` / `!voice-clone cancel` | Inspect or cancel capture |
 | `!ask <prompt>` | Send text through the same selected harness adapter as voice |
@@ -63,6 +63,31 @@ The bot auto-joins the first configured channel name, defaulting to `일반,Gene
 Voice equivalents such as “외부 모드”, “보수 모드”, “실내”, “기본 감도”, and clear stop phrases like “잠깐”, “멈춰”, “그만” are handled by the bridge. You can also say “상세 진행 켜” / “상세 진행 꺼” to toggle verbose progress by voice.
+## Changing the Voice
+`vc language ko|en|auto` changes STT language, progress language, and the matching default TTS voice together. If you only want to change the speaker/voice while the bridge is running, say it in Discord voice:
+```text
+남자 한국어 목소리로 바꿔
+여자 한국어 목소리로 바꿔
+change voice to Korean female
+switch speaker to English
+```
+The live bridge recognizes these as voice-control commands, updates `config/tts-voices.json`, updates the effective TTS env for the running process, and answers with a short confirmation such as “목소리를 Korean male로 바꿨어.” Use `!voice-test <text>` right after changing it to hear the current backend and voice.
+Built-in Edge voice types:
+| Voice type | Edge voice |
+|---|---|
+| `korean_male` | `ko-KR-InJoonNeural` |
+| `korean_female` | `ko-KR-SunHiNeural` |
+| `korean_multilingual_male` | `ko-KR-HyunsuMultilingualNeural` |
+| `english_male` | `en-US-GuyNeural` |
+| `english_female` | `en-US-AriaNeural` |
+For persistent manual config, set `TTS_BACKEND=edge`, `TTS_VOICE_TYPE=<voice-type>`, and optionally `TTS_VOICE=<edge-voice>` in `.env`, or edit `config/tts-voices.json` for custom voice catalogs.
 ## Verbose Progress Mode
 Verbose progress is off by default unless `AGENT_VERBOSE_PROGRESS=1` is set. Enable it with `!verbose on` or a voice command like “상세 진행 켜”. It can emit short progress lines such as:
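
The "Changing the Voice" section added above lists example utterances that the bridge treats as voice-control commands. The diff does not show how they are recognized, so the following is only a minimal keyword-matching sketch under stated assumptions: `parseVoiceCommand` and `VoiceRequest` are hypothetical names, and the keyword lists are inferred from the four example phrases rather than taken from the package.

```typescript
type VoiceRequest = { language: "korean" | "english"; gender?: "male" | "female" };

// Hypothetical keyword matcher; NOT the bridge's actual recognizer.
function parseVoiceCommand(utterance: string): VoiceRequest | undefined {
  const text = utterance.toLowerCase();

  // Require both a change verb and a voice/speaker noun so ordinary speech is not matched.
  const isChange = /(바꿔|change|switch)/.test(text) && /(목소리|voice|speaker)/.test(text);
  if (!isChange) {
    return undefined;
  }

  const language = /(한국어|korean)/.test(text)
    ? "korean"
    : /(영어|english)/.test(text)
      ? "english"
      : undefined;
  if (!language) {
    return undefined;
  }

  // Test "female" before "male" because "female" contains the substring "male".
  const gender = /(여자|female)/.test(text)
    ? "female"
    : /(남자|male)/.test(text)
      ? "male"
      : undefined;

  return gender ? { language, gender } : { language };
}

console.log(parseVoiceCommand("남자 한국어 목소리로 바꿔"));
console.log(parseVoiceCommand("switch speaker to English"));
```

When both parts are present, they can be joined as `${language}_${gender}` to form a key in the built-in voice type table. Which voice the bridge picks when no gender is spoken, as in `switch speaker to English`, is not stated in the diff, so the sketch leaves `gender` unset in that case.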

package/docs/i18n/CONFIGURATION.ko.md CHANGED

@@ -74,6 +74,7 @@ WHISPER_CPP_BIN="whisper-cli"
 WHISPER_CPP_MODEL="./models/ggml-small-q5_1.bin"
 TTS_BACKEND="edge"
+TTS_VOICE_TYPE="korean_female"
 TTS_VOICE="ko-KR-SunHiNeural"
 TTS_RATE="+10%"
 TTS_MAX_CHARS="495"
@@ -88,6 +89,37 @@ AGENT_VERBOSE_PROGRESS="0"
 LATENCY_LOG_PATH="./.logs/latency.jsonl"
 ```
+## TTS 목소리 선택
+언어 프리셋과 목소리 선택은 분리되어 있습니다.
+- `vc language ko|en|auto`는 STT 언어, 진행 언어, 해당 언어의 기본 목소리를 함께 바꿉니다.
+- “남자 한국어 목소리로 바꿔”, “여자 한국어 목소리로 바꿔”, `change voice to Korean female`, `switch speaker to English` 같은 실시간 음성 명령은 말하는 사람/목소리 타입만 바꿉니다.
+- `!voice-test <text>`는 현재 선택된 백엔드와 목소리로 짧은 샘플을 재생합니다.
+목소리 선택은 기본적으로 `config/tts-voices.json`에 저장됩니다. 경로는 `TTS_VOICE_CONFIG`로 바꿀 수 있습니다. 실행 중인 브릿지는 합성 직전에 목소리 선택을 다시 적용하므로, 음성 명령으로 바꾼 목소리는 전체 재시작 없이 바로 반영됩니다.
+기본 Edge 카탈로그:
+| `TTS_VOICE_TYPE` | `TTS_VOICE` | 언어 |
+|---|---|---|
+| `korean_male` | `ko-KR-InJoonNeural` | 한국어 |
+| `korean_female` | `ko-KR-SunHiNeural` | 한국어 |
+| `korean_multilingual_male` | `ko-KR-HyunsuMultilingualNeural` | 한국어 |
+| `english_male` | `en-US-GuyNeural` | 영어 |
+| `english_female` | `en-US-AriaNeural` | 영어 |
+수동 영구 override 예시:
+```bash
+TTS_BACKEND="edge"
+TTS_VOICE_TYPE="korean_male"
+TTS_VOICE="ko-KR-InJoonNeural"
+TTS_VOICE_CONFIG="config/tts-voices.json"
+```
+OpenVoice, SpeechSwift, Supertonic을 쓸 때는 아래 백엔드별 reference/voice 설정을 유지하세요. 같은 voice catalog 파일에서 현재 voice type을 추적할 수 있습니다.
 ## MCP 서버
 VerbalCoding은 stdio MCP 서버를 포함합니다. Hermes Agent 또는 MCP client는 자유 형식 shell 명령 대신 도구로 브릿지를 제어할 수 있습니다.

package/docs/i18n/USAGE.ko.md CHANGED

@@ -56,7 +56,7 @@ VERBALCODING_INSTANCE_ENV=instances/my-project.env ./run.sh
 | `!ping` | 봇 연결 기본 확인 |
 | `!join` / `!leave` | 음성 채널 입장/퇴장 |
 | `!say <text>` | 텍스트를 바로 TTS로 읽기 |
-| `!voice-test <text>` | 현재 TTS 백엔드 테스트 |
+| `!voice-test <text>` | 현재 TTS 백엔드/목소리 테스트 |
 | `!voice-clone capture` | 다음 유효 발화를 OpenVoice 기준 샘플로 저장 |
 | `!voice-clone status` / `!voice-clone cancel` | 샘플 캡처 상태 확인/취소 |
 | `!ask <prompt>` | 음성과 같은 선택된 CLI 어댑터로 텍스트 요청 보내기 |
@@ -71,6 +71,31 @@ VERBALCODING_INSTANCE_ENV=instances/my-project.env ./run.sh
 음성으로도 “외부 모드”, “보수 모드”, “실내”, “기본 감도” 같은 감도 전환과 “잠깐”, “멈춰”, “그만” 같은 명확한 중단 표현을 처리합니다. “상세 진행 켜” / “상세 진행 꺼”처럼 말해서 verbose progress도 바꿀 수 있습니다.
+## 목소리 변경
+`vc language ko|en|auto`는 STT 언어, 진행 언어, 기본 TTS 목소리를 함께 바꿉니다. 언어 전체가 아니라 말하는 사람/목소리만 바꾸고 싶다면 Discord 음성에서 이렇게 말하면 됩니다.
+```text
+남자 한국어 목소리로 바꿔
+여자 한국어 목소리로 바꿔
+change voice to Korean female
+switch speaker to English
+```
+실행 중인 브릿지는 이 발화를 제어 명령으로 인식해 `config/tts-voices.json`을 갱신하고, 현재 프로세스의 TTS 설정도 바로 바꾼 뒤 “목소리를 Korean male로 바꿨어.” 같은 짧은 확인을 말합니다. 바꾼 직후에는 `!voice-test <text>`로 현재 백엔드와 목소리를 바로 들어볼 수 있습니다.
+기본 Edge 목소리 타입:
+| 목소리 타입 | Edge voice |
+|---|---|
+| `korean_male` | `ko-KR-InJoonNeural` |
+| `korean_female` | `ko-KR-SunHiNeural` |
+| `korean_multilingual_male` | `ko-KR-HyunsuMultilingualNeural` |
+| `english_male` | `en-US-GuyNeural` |
+| `english_female` | `en-US-AriaNeural` |
+영구 수동 설정이 필요하면 `.env`에 `TTS_BACKEND=edge`, `TTS_VOICE_TYPE=<voice-type>`, 필요 시 `TTS_VOICE=<edge-voice>`를 설정하세요. 더 많은 커스텀 목소리 카탈로그는 `config/tts-voices.json`에서 관리할 수 있습니다.
 ## 자세한 진행 모드
 자세한 진행은 기본적으로 꺼져 있습니다. `.env`에 `AGENT_VERBOSE_PROGRESS=1`을 설정하거나 Discord에서 `!verbose on`, 또는 음성으로 “상세 진행 켜”라고 말해 켤 수 있습니다.

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "verbalcoding",
-  "version": "0.2.1",
+  "version": "0.2.2",
   "description": "Discord voice bridge for CLI coding agents.",
   "license": "MIT",
   "repository": {