verbalcoding 0.2.6 → 0.2.7

Files changed (38)
  1. package/README.md +5 -0
  2. package/docs/i18n/CONFIGURATION.es.md +150 -0
  3. package/docs/i18n/CONFIGURATION.fr.md +150 -0
  4. package/docs/i18n/CONFIGURATION.ja.md +150 -0
  5. package/docs/i18n/CONFIGURATION.ko.md +49 -146
  6. package/docs/i18n/CONFIGURATION.ru.md +150 -0
  7. package/docs/i18n/CONFIGURATION.zh.md +150 -0
  8. package/docs/i18n/FRESH_INSTALL.es.md +124 -0
  9. package/docs/i18n/FRESH_INSTALL.fr.md +124 -0
  10. package/docs/i18n/FRESH_INSTALL.ja.md +124 -0
  11. package/docs/i18n/FRESH_INSTALL.ko.md +37 -114
  12. package/docs/i18n/FRESH_INSTALL.ru.md +124 -0
  13. package/docs/i18n/FRESH_INSTALL.zh.md +124 -0
  14. package/docs/i18n/MULTI_INSTANCE.es.md +121 -0
  15. package/docs/i18n/MULTI_INSTANCE.fr.md +121 -0
  16. package/docs/i18n/MULTI_INSTANCE.ja.md +121 -0
  17. package/docs/i18n/MULTI_INSTANCE.ko.md +28 -86
  18. package/docs/i18n/MULTI_INSTANCE.ru.md +121 -0
  19. package/docs/i18n/MULTI_INSTANCE.zh.md +121 -0
  20. package/docs/i18n/README.es.md +50 -86
  21. package/docs/i18n/README.fr.md +50 -86
  22. package/docs/i18n/README.ja.md +50 -86
  23. package/docs/i18n/README.ko.md +41 -113
  24. package/docs/i18n/README.ru.md +50 -86
  25. package/docs/i18n/README.zh.md +50 -86
  26. package/docs/i18n/RELEASE.es.md +58 -0
  27. package/docs/i18n/RELEASE.fr.md +58 -0
  28. package/docs/i18n/RELEASE.ja.md +58 -0
  29. package/docs/i18n/RELEASE.ko.md +36 -50
  30. package/docs/i18n/RELEASE.ru.md +58 -0
  31. package/docs/i18n/RELEASE.zh.md +58 -0
  32. package/docs/i18n/USAGE.es.md +134 -0
  33. package/docs/i18n/USAGE.fr.md +134 -0
  34. package/docs/i18n/USAGE.ja.md +134 -0
  35. package/docs/i18n/USAGE.ko.md +63 -101
  36. package/docs/i18n/USAGE.ru.md +134 -0
  37. package/docs/i18n/USAGE.zh.md +134 -0
  38. package/package.json +1 -1
package/README.md CHANGED
@@ -108,6 +108,11 @@ flowchart LR
  | [Multi-Instance](docs/MULTI_INSTANCE.md) | One permanent Discord voice room per project |
  | [Release Notes](docs/RELEASE.md) | Current capabilities and pre-release checklist |
  | [한국어 문서](docs/i18n/README.ko.md) | npm 설치, 사용법, 설정, 멀티 인스턴스 한국어 가이드 |
+ | [日本語 docs](docs/i18n/README.ja.md) | npm install, usage, configuration, multi-instance guide in Japanese |
+ | [中文文档](docs/i18n/README.zh.md) | npm 安装、使用、配置和多实例中文指南 |
+ | [Español docs](docs/i18n/README.es.md) | Instalación npm, uso, configuración y multiinstancia en español |
+ | [Français docs](docs/i18n/README.fr.md) | Installation npm, utilisation, configuration et multi-instance en français |
+ | [Русская документация](docs/i18n/README.ru.md) | npm установка, использование, конфигурация и мульти-инстансы на русском |
 
  ## Tiny Command Map
 
package/docs/i18n/CONFIGURATION.es.md ADDED
@@ -0,0 +1,150 @@
+ # VerbalCoding Configuración
+
+ ## Setup Wizard
+
+ Use upstream Discord-side guides first, then return to VerbalCoding:
+
+ - Hermes Agent Discord messaging guide: <https://hermes-agent.nousresearch.com/docs/user-guide/messaging/discord>
+ - Discord official bot overview: <https://docs.discord.com/developers/bots/overview>
+ - Discord official quick start: <https://docs.discord.com/developers/quick-start/getting-started>
+
+ ```bash
+ vc setup --yes
+ # or from a clone
+ ./scripts/install.sh
+ ```
+
+ The installer asks for the Discord token, allowed users, auto-join voice channel names, transcript channel/thread, CLI harness backend, default voice language, TTS settings, and wake-word behavior. It writes `.env` with mode `0600`.
+
+ ## Supported Agent Backends
+
+ Set `AGENT_BACKEND` in `.env`.
+
+ | Backend | Default command | Notes |
+ |---|---|---|
+ | `hermes` | `hermes chat -Q -q` | Default; supports resume and verbose progress |
+ | `claude-code` / `claude` | `claude -p` | Override with `CLAUDE_COMMAND` or `AGENT_COMMAND` |
+ | `codex` | `codex exec` | Override with `CODEX_COMMAND` or `AGENT_COMMAND` |
+ | `gemini` | `gemini -p` | Override with `GEMINI_COMMAND` or `AGENT_COMMAND` |
+ | `opencode` | `opencode run` | Override with `OPENCODE_COMMAND` or `AGENT_COMMAND` |
+ | `openclaw` | `openclaw run` | Override with `OPENCLAW_COMMAND` or `AGENT_COMMAND` |
+ | `custom` | `AGENT_COMMAND` required | Prompt is appended as final argv |
+
+ Generic overrides:
+
+ ```bash
+ AGENT_BACKEND=custom
+ AGENT_LABEL="My Harness"
+ AGENT_COMMAND="my-harness run --non-interactive"
+ AGENT_TASK_TIMEOUT_MS=0
+ AGENT_CHAT_TIMEOUT_MS=45000
+ AGENT_VERBOSE_PROGRESS=0
+ UTTERANCE_IDLE_MS=4500
+ LATENCY_LOG_PATH=./.logs/latency.jsonl
+ ```
+
+ ## Example `.env`
+
+ ```bash
+ DISCORD_BOT_TOKEN="***"
+ DISCORD_ALLOWED_USERS="123456789012345678"
+ AUTO_JOIN_VOICE_CHANNELS="일반,General,general"
+ TRANSCRIPT_CHANNEL_ID="123456789012345678"
+ AGENT_BACKEND="hermes"
+ STT_ENGINE="whisper_cpp"
+ WHISPER_CPP_BIN="whisper-cli"
+ WHISPER_CPP_MODEL="./models/ggml-small-q5_1.bin"
+ TTS_BACKEND="edge"
+ TTS_VOICE_TYPE="korean_female"
+ TTS_VOICE="ko-KR-SunHiNeural"
+ TTS_RATE="+10%"
+ TTS_MAX_CHARS="495"
+ TTS_VOLUME="1.0"
+ REQUIRE_WAKE_WORD="0"
+ MIN_UTTERANCE_SECONDS="1.0"
+ UTTERANCE_IDLE_MS="4500"
+ HERMES_TASK_TIMEOUT_MS="0"
+ HERMES_CHAT_TIMEOUT_MS="45000"
+ AGENT_VERBOSE_PROGRESS="0"
+ LATENCY_LOG_PATH="./.logs/latency.jsonl"
+ ```
+
+ ## TTS Voice Selection
+
+ `vc language ko|en|auto` changes STT language, progress language, and default TTS voice. Live commands such as “남자 한국어 목소리로 바꿔”, “여자 한국어 목소리로 바꿔”, `change voice to Korean female`, and `switch speaker to English` change only the speaker/voice type.
+
+ Default Edge catalog:
+
+ | `TTS_VOICE_TYPE` | `TTS_VOICE` | Language |
+ |---|---|---|
+ | `korean_male` | `ko-KR-InJoonNeural` | Korean |
+ | `korean_female` | `ko-KR-SunHiNeural` | Korean |
+ | `korean_multilingual_male` | `ko-KR-HyunsuMultilingualNeural` | Korean |
+ | `english_male` | `en-US-GuyNeural` | English |
+ | `english_female` | `en-US-AriaNeural` | English |
+
+ Backend-specific voice options:
+
+ | Backend | Settings | Voice choices |
+ |---|---|---|
+ | Edge | `TTS_VOICE_TYPE`, `TTS_VOICE` | Built-in types plus any `edge-tts --list-voices` voice |
+ | Supertonic | `SUPERTONIC_VOICE`, `SUPERTONIC_LANGUAGE` | `M1`–`M5`, `F1`–`F5`; `ko`, `en`, `es`, `pt`, `fr` |
+ | OpenVoice | `OPENVOICE_REF_AUDIO`, `OPENVOICE_STYLE`, `OPENVOICE_LANGUAGE` | User-provided permitted reference WAV |
+ | SpeechSwift / CosyVoice | `SPEECHSWIFT_REF_AUDIO`, `SPEECHSWIFT_ENGINE`, `SPEECHSWIFT_SPEAKER`, `SPEECHSWIFT_MODEL_ID` | Reference-sample voice or backend speaker/model ID |
+
+ ## Utterance Segmentation
+
+ `UTTERANCE_IDLE_MS` controls how long the bridge waits after speech before starting STT. Default is `4500` ms.
+
+ ```bash
+ UTTERANCE_IDLE_MS="4500"
+ UTTERANCE_IDLE_MS="6000"
+ ```
+
+ ## MCP Server
+
+ ```yaml
+ mcp_servers:
+   verbalcoding:
+     command: "node"
+     args: ["/path/to/VerbalCoding/scripts/mcp-server.mjs"]
+     timeout: 120
+     connect_timeout: 30
+ ```
+
+ Tools: `status`, `doctor`, `set_auto_restart`, `set_language`, `start`, `stop`, and `restart`.
+
+ ## Optional OpenVoice TTS
+
+ ```bash
+ ./scripts/setup_openvoice.sh
+ python3 integrations/openvoice/synth.py --openvoice-dir vendor/OpenVoice --ref-audio voice-samples/user-reference.wav --text '안녕하세요. 버벌코딩 목소리 복제 테스트입니다.' --output /tmp/verbalcoding-openvoice-smoke.wav
+ ```
+
+ ```bash
+ TTS_BACKEND="openvoice"
+ OPENVOICE_REF_AUDIO="./voice-samples/user-reference.wav"
+ OPENVOICE_PROGRESS="0"
+ ```
+
+ Only clone voices you own or have permission to use. OpenVoice falls back to Edge on failure.
+
+ ## Optional Supertonic TTS
+
+ ```bash
+ ./scripts/setup_supertonic.sh
+ supertonic tts '안녕하세요. 수퍼토닉 테스트입니다.' --lang ko --voice M1 --steps 2 --speed 1.0 -o /tmp/verbalcoding-supertonic.wav
+ ```
+
+ ## Optional SpeechSwift / CosyVoice TTS
+
+ ```bash
+ brew tap soniqo/speech https://github.com/soniqo/speech-swift
+ brew install speech
+ ```
+
+ Recommended env includes `TTS_BACKEND="speechswift"`, `SPEECHSWIFT_MODE="server"`, `SPEECHSWIFT_ENGINE="cosyvoice"`, `SPEECHSWIFT_REF_AUDIO`, and `SPEECHSWIFT_SERVER_URL`. Keep Edge for quick progress prompts.
+
+ ## Operational Notes
+
+ Enable Discord Message Content intent, grant voice connect/speak permissions, authenticate the selected CLI harness separately, and avoid reading diffs/log dumps aloud.
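The "Supported Agent Backends" table in the guide above reads like a lookup with overrides. The sketch below is one way to express that lookup in bash; it is not taken from the package. The function name `resolve_agent_command` is hypothetical, and the precedence (backend-specific variable first, then `AGENT_COMMAND`, then the default) is an assumption, since the guide only says "Override with `X_COMMAND` or `AGENT_COMMAND`". Letting `AGENT_COMMAND` override `hermes` as well is also an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of VerbalCoding: prints the command for a backend.
# Assumed precedence: backend-specific override, then AGENT_COMMAND, then the default.
resolve_agent_command() {
  local backend="${1:-hermes}" specific="" default=""
  case "$backend" in
    hermes)             default="hermes chat -Q -q" ;;
    claude-code|claude) specific="${CLAUDE_COMMAND:-}";   default="claude -p" ;;
    codex)              specific="${CODEX_COMMAND:-}";    default="codex exec" ;;
    gemini)             specific="${GEMINI_COMMAND:-}";   default="gemini -p" ;;
    opencode)           specific="${OPENCODE_COMMAND:-}"; default="opencode run" ;;
    openclaw)           specific="${OPENCLAW_COMMAND:-}"; default="openclaw run" ;;
    custom)             ;;  # no default: AGENT_COMMAND must supply the command
    *) echo "unknown backend: $backend" >&2; return 2 ;;
  esac
  local cmd="${specific:-${AGENT_COMMAND:-$default}}"
  if [ -z "$cmd" ]; then
    echo "AGENT_COMMAND is required for backend '$backend'" >&2
    return 1
  fi
  printf '%s\n' "$cmd"
}
```

Calling `resolve_agent_command "$AGENT_BACKEND"` prints the command string; per the table's note for `custom`, the prompt would then be appended as the final argument.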
package/docs/i18n/CONFIGURATION.fr.md ADDED
@@ -0,0 +1,150 @@
+ # VerbalCoding Configuration
+
+ ## Setup Wizard
+
+ Use upstream Discord-side guides first, then return to VerbalCoding:
+
+ - Hermes Agent Discord messaging guide: <https://hermes-agent.nousresearch.com/docs/user-guide/messaging/discord>
+ - Discord official bot overview: <https://docs.discord.com/developers/bots/overview>
+ - Discord official quick start: <https://docs.discord.com/developers/quick-start/getting-started>
+
+ ```bash
+ vc setup --yes
+ # or from a clone
+ ./scripts/install.sh
+ ```
+
+ The installer asks for the Discord token, allowed users, auto-join voice channel names, transcript channel/thread, CLI harness backend, default voice language, TTS settings, and wake-word behavior. It writes `.env` with mode `0600`.
+
+ ## Supported Agent Backends
+
+ Set `AGENT_BACKEND` in `.env`.
+
+ | Backend | Default command | Notes |
+ |---|---|---|
+ | `hermes` | `hermes chat -Q -q` | Default; supports resume and verbose progress |
+ | `claude-code` / `claude` | `claude -p` | Override with `CLAUDE_COMMAND` or `AGENT_COMMAND` |
+ | `codex` | `codex exec` | Override with `CODEX_COMMAND` or `AGENT_COMMAND` |
+ | `gemini` | `gemini -p` | Override with `GEMINI_COMMAND` or `AGENT_COMMAND` |
+ | `opencode` | `opencode run` | Override with `OPENCODE_COMMAND` or `AGENT_COMMAND` |
+ | `openclaw` | `openclaw run` | Override with `OPENCLAW_COMMAND` or `AGENT_COMMAND` |
+ | `custom` | `AGENT_COMMAND` required | Prompt is appended as final argv |
+
+ Generic overrides:
+
+ ```bash
+ AGENT_BACKEND=custom
+ AGENT_LABEL="My Harness"
+ AGENT_COMMAND="my-harness run --non-interactive"
+ AGENT_TASK_TIMEOUT_MS=0
+ AGENT_CHAT_TIMEOUT_MS=45000
+ AGENT_VERBOSE_PROGRESS=0
+ UTTERANCE_IDLE_MS=4500
+ LATENCY_LOG_PATH=./.logs/latency.jsonl
+ ```
+
+ ## Example `.env`
+
+ ```bash
+ DISCORD_BOT_TOKEN="***"
+ DISCORD_ALLOWED_USERS="123456789012345678"
+ AUTO_JOIN_VOICE_CHANNELS="일반,General,general"
+ TRANSCRIPT_CHANNEL_ID="123456789012345678"
+ AGENT_BACKEND="hermes"
+ STT_ENGINE="whisper_cpp"
+ WHISPER_CPP_BIN="whisper-cli"
+ WHISPER_CPP_MODEL="./models/ggml-small-q5_1.bin"
+ TTS_BACKEND="edge"
+ TTS_VOICE_TYPE="korean_female"
+ TTS_VOICE="ko-KR-SunHiNeural"
+ TTS_RATE="+10%"
+ TTS_MAX_CHARS="495"
+ TTS_VOLUME="1.0"
+ REQUIRE_WAKE_WORD="0"
+ MIN_UTTERANCE_SECONDS="1.0"
+ UTTERANCE_IDLE_MS="4500"
+ HERMES_TASK_TIMEOUT_MS="0"
+ HERMES_CHAT_TIMEOUT_MS="45000"
+ AGENT_VERBOSE_PROGRESS="0"
+ LATENCY_LOG_PATH="./.logs/latency.jsonl"
+ ```
+
+ ## TTS Voice Selection
+
+ `vc language ko|en|auto` changes STT language, progress language, and default TTS voice. Live commands such as “남자 한국어 목소리로 바꿔”, “여자 한국어 목소리로 바꿔”, `change voice to Korean female`, and `switch speaker to English` change only the speaker/voice type.
+
+ Default Edge catalog:
+
+ | `TTS_VOICE_TYPE` | `TTS_VOICE` | Language |
+ |---|---|---|
+ | `korean_male` | `ko-KR-InJoonNeural` | Korean |
+ | `korean_female` | `ko-KR-SunHiNeural` | Korean |
+ | `korean_multilingual_male` | `ko-KR-HyunsuMultilingualNeural` | Korean |
+ | `english_male` | `en-US-GuyNeural` | English |
+ | `english_female` | `en-US-AriaNeural` | English |
+
+ Backend-specific voice options:
+
+ | Backend | Settings | Voice choices |
+ |---|---|---|
+ | Edge | `TTS_VOICE_TYPE`, `TTS_VOICE` | Built-in types plus any `edge-tts --list-voices` voice |
+ | Supertonic | `SUPERTONIC_VOICE`, `SUPERTONIC_LANGUAGE` | `M1`–`M5`, `F1`–`F5`; `ko`, `en`, `es`, `pt`, `fr` |
+ | OpenVoice | `OPENVOICE_REF_AUDIO`, `OPENVOICE_STYLE`, `OPENVOICE_LANGUAGE` | User-provided permitted reference WAV |
+ | SpeechSwift / CosyVoice | `SPEECHSWIFT_REF_AUDIO`, `SPEECHSWIFT_ENGINE`, `SPEECHSWIFT_SPEAKER`, `SPEECHSWIFT_MODEL_ID` | Reference-sample voice or backend speaker/model ID |
+
+ ## Utterance Segmentation
+
+ `UTTERANCE_IDLE_MS` controls how long the bridge waits after speech before starting STT. Default is `4500` ms.
+
+ ```bash
+ UTTERANCE_IDLE_MS="4500"
+ UTTERANCE_IDLE_MS="6000"
+ ```
+
+ ## MCP Server
+
+ ```yaml
+ mcp_servers:
+   verbalcoding:
+     command: "node"
+     args: ["/path/to/VerbalCoding/scripts/mcp-server.mjs"]
+     timeout: 120
+     connect_timeout: 30
+ ```
+
+ Tools: `status`, `doctor`, `set_auto_restart`, `set_language`, `start`, `stop`, and `restart`.
+
+ ## Optional OpenVoice TTS
+
+ ```bash
+ ./scripts/setup_openvoice.sh
+ python3 integrations/openvoice/synth.py --openvoice-dir vendor/OpenVoice --ref-audio voice-samples/user-reference.wav --text '안녕하세요. 버벌코딩 목소리 복제 테스트입니다.' --output /tmp/verbalcoding-openvoice-smoke.wav
+ ```
+
+ ```bash
+ TTS_BACKEND="openvoice"
+ OPENVOICE_REF_AUDIO="./voice-samples/user-reference.wav"
+ OPENVOICE_PROGRESS="0"
+ ```
+
+ Only clone voices you own or have permission to use. OpenVoice falls back to Edge on failure.
+
+ ## Optional Supertonic TTS
+
+ ```bash
+ ./scripts/setup_supertonic.sh
+ supertonic tts '안녕하세요. 수퍼토닉 테스트입니다.' --lang ko --voice M1 --steps 2 --speed 1.0 -o /tmp/verbalcoding-supertonic.wav
+ ```
+
+ ## Optional SpeechSwift / CosyVoice TTS
+
+ ```bash
+ brew tap soniqo/speech https://github.com/soniqo/speech-swift
+ brew install speech
+ ```
+
+ Recommended env includes `TTS_BACKEND="speechswift"`, `SPEECHSWIFT_MODE="server"`, `SPEECHSWIFT_ENGINE="cosyvoice"`, `SPEECHSWIFT_REF_AUDIO`, and `SPEECHSWIFT_SERVER_URL`. Keep Edge for quick progress prompts.
+
+ ## Operational Notes
+
+ Enable Discord Message Content intent, grant voice connect/speak permissions, authenticate the selected CLI harness separately, and avoid reading diffs/log dumps aloud.
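The guide above states that the installer writes `.env` with mode `0600`, which matters because the file holds the Discord bot token. A reader who edits or copies the file by hand may want to confirm the mode afterwards. The sketch below is an illustration only and not part of the package; the function names `env_file_mode` and `check_env_mode` are hypothetical. It tries the GNU `stat -c` form first and falls back to the BSD/macOS `stat -f` form.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of VerbalCoding.

# Print the octal permission bits of a file (GNU stat first, BSD/macOS stat second).
env_file_mode() {
  stat -c '%a' "$1" 2>/dev/null || stat -f '%Lp' "$1"
}

# Return 0 if the file is owner-read/write only (600), 1 otherwise, 2 if stat failed.
check_env_mode() {
  local file="${1:-.env}" mode
  mode="$(env_file_mode "$file")" || return 2
  if [ "$mode" != "600" ]; then
    echo "$file has mode $mode; expected 600" >&2
    return 1
  fi
}
```

Running `check_env_mode .env || chmod 600 .env` from the project directory would restore the expected mode if it has drifted.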
package/docs/i18n/CONFIGURATION.ja.md ADDED
@@ -0,0 +1,150 @@
+ # VerbalCoding 設定
+
+ ## Setup Wizard
+
+ Use upstream Discord-side guides first, then return to VerbalCoding:
+
+ - Hermes Agent Discord messaging guide: <https://hermes-agent.nousresearch.com/docs/user-guide/messaging/discord>
+ - Discord official bot overview: <https://docs.discord.com/developers/bots/overview>
+ - Discord official quick start: <https://docs.discord.com/developers/quick-start/getting-started>
+
+ ```bash
+ vc setup --yes
+ # or from a clone
+ ./scripts/install.sh
+ ```
+
+ The installer asks for the Discord token, allowed users, auto-join voice channel names, transcript channel/thread, CLI harness backend, default voice language, TTS settings, and wake-word behavior. It writes `.env` with mode `0600`.
+
+ ## Supported Agent Backends
+
+ Set `AGENT_BACKEND` in `.env`.
+
+ | Backend | Default command | Notes |
+ |---|---|---|
+ | `hermes` | `hermes chat -Q -q` | Default; supports resume and verbose progress |
+ | `claude-code` / `claude` | `claude -p` | Override with `CLAUDE_COMMAND` or `AGENT_COMMAND` |
+ | `codex` | `codex exec` | Override with `CODEX_COMMAND` or `AGENT_COMMAND` |
+ | `gemini` | `gemini -p` | Override with `GEMINI_COMMAND` or `AGENT_COMMAND` |
+ | `opencode` | `opencode run` | Override with `OPENCODE_COMMAND` or `AGENT_COMMAND` |
+ | `openclaw` | `openclaw run` | Override with `OPENCLAW_COMMAND` or `AGENT_COMMAND` |
+ | `custom` | `AGENT_COMMAND` required | Prompt is appended as final argv |
+
+ Generic overrides:
+
+ ```bash
+ AGENT_BACKEND=custom
+ AGENT_LABEL="My Harness"
+ AGENT_COMMAND="my-harness run --non-interactive"
+ AGENT_TASK_TIMEOUT_MS=0
+ AGENT_CHAT_TIMEOUT_MS=45000
+ AGENT_VERBOSE_PROGRESS=0
+ UTTERANCE_IDLE_MS=4500
+ LATENCY_LOG_PATH=./.logs/latency.jsonl
+ ```
+
+ ## Example `.env`
+
+ ```bash
+ DISCORD_BOT_TOKEN="***"
+ DISCORD_ALLOWED_USERS="123456789012345678"
+ AUTO_JOIN_VOICE_CHANNELS="일반,General,general"
+ TRANSCRIPT_CHANNEL_ID="123456789012345678"
+ AGENT_BACKEND="hermes"
+ STT_ENGINE="whisper_cpp"
+ WHISPER_CPP_BIN="whisper-cli"
+ WHISPER_CPP_MODEL="./models/ggml-small-q5_1.bin"
+ TTS_BACKEND="edge"
+ TTS_VOICE_TYPE="korean_female"
+ TTS_VOICE="ko-KR-SunHiNeural"
+ TTS_RATE="+10%"
+ TTS_MAX_CHARS="495"
+ TTS_VOLUME="1.0"
+ REQUIRE_WAKE_WORD="0"
+ MIN_UTTERANCE_SECONDS="1.0"
+ UTTERANCE_IDLE_MS="4500"
+ HERMES_TASK_TIMEOUT_MS="0"
+ HERMES_CHAT_TIMEOUT_MS="45000"
+ AGENT_VERBOSE_PROGRESS="0"
+ LATENCY_LOG_PATH="./.logs/latency.jsonl"
+ ```
+
+ ## TTS Voice Selection
+
+ `vc language ko|en|auto` changes STT language, progress language, and default TTS voice. Live commands such as “남자 한국어 목소리로 바꿔”, “여자 한국어 목소리로 바꿔”, `change voice to Korean female`, and `switch speaker to English` change only the speaker/voice type.
+
+ Default Edge catalog:
+
+ | `TTS_VOICE_TYPE` | `TTS_VOICE` | Language |
+ |---|---|---|
+ | `korean_male` | `ko-KR-InJoonNeural` | Korean |
+ | `korean_female` | `ko-KR-SunHiNeural` | Korean |
+ | `korean_multilingual_male` | `ko-KR-HyunsuMultilingualNeural` | Korean |
+ | `english_male` | `en-US-GuyNeural` | English |
+ | `english_female` | `en-US-AriaNeural` | English |
+
+ Backend-specific voice options:
+
+ | Backend | Settings | Voice choices |
+ |---|---|---|
+ | Edge | `TTS_VOICE_TYPE`, `TTS_VOICE` | Built-in types plus any `edge-tts --list-voices` voice |
+ | Supertonic | `SUPERTONIC_VOICE`, `SUPERTONIC_LANGUAGE` | `M1`–`M5`, `F1`–`F5`; `ko`, `en`, `es`, `pt`, `fr` |
+ | OpenVoice | `OPENVOICE_REF_AUDIO`, `OPENVOICE_STYLE`, `OPENVOICE_LANGUAGE` | User-provided permitted reference WAV |
+ | SpeechSwift / CosyVoice | `SPEECHSWIFT_REF_AUDIO`, `SPEECHSWIFT_ENGINE`, `SPEECHSWIFT_SPEAKER`, `SPEECHSWIFT_MODEL_ID` | Reference-sample voice or backend speaker/model ID |
+
+ ## Utterance Segmentation
+
+ `UTTERANCE_IDLE_MS` controls how long the bridge waits after speech before starting STT. Default is `4500` ms.
+
+ ```bash
+ UTTERANCE_IDLE_MS="4500"
+ UTTERANCE_IDLE_MS="6000"
+ ```
+
+ ## MCP Server
+
+ ```yaml
+ mcp_servers:
+   verbalcoding:
+     command: "node"
+     args: ["/path/to/VerbalCoding/scripts/mcp-server.mjs"]
+     timeout: 120
+     connect_timeout: 30
+ ```
+
+ Tools: `status`, `doctor`, `set_auto_restart`, `set_language`, `start`, `stop`, and `restart`.
+
+ ## Optional OpenVoice TTS
+
+ ```bash
+ ./scripts/setup_openvoice.sh
+ python3 integrations/openvoice/synth.py --openvoice-dir vendor/OpenVoice --ref-audio voice-samples/user-reference.wav --text '안녕하세요. 버벌코딩 목소리 복제 테스트입니다.' --output /tmp/verbalcoding-openvoice-smoke.wav
+ ```
+
+ ```bash
+ TTS_BACKEND="openvoice"
+ OPENVOICE_REF_AUDIO="./voice-samples/user-reference.wav"
+ OPENVOICE_PROGRESS="0"
+ ```
+
+ Only clone voices you own or have permission to use. OpenVoice falls back to Edge on failure.
+
+ ## Optional Supertonic TTS
+
+ ```bash
+ ./scripts/setup_supertonic.sh
+ supertonic tts '안녕하세요. 수퍼토닉 테스트입니다.' --lang ko --voice M1 --steps 2 --speed 1.0 -o /tmp/verbalcoding-supertonic.wav
+ ```
+
+ ## Optional SpeechSwift / CosyVoice TTS
+
+ ```bash
+ brew tap soniqo/speech https://github.com/soniqo/speech-swift
+ brew install speech
+ ```
+
+ Recommended env includes `TTS_BACKEND="speechswift"`, `SPEECHSWIFT_MODE="server"`, `SPEECHSWIFT_ENGINE="cosyvoice"`, `SPEECHSWIFT_REF_AUDIO`, and `SPEECHSWIFT_SERVER_URL`. Keep Edge for quick progress prompts.
+
+ ## Operational Notes
+
+ Enable Discord Message Content intent, grant voice connect/speak permissions, authenticate the selected CLI harness separately, and avoid reading diffs/log dumps aloud.
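The "Utterance Segmentation" section above gives `UTTERANCE_IDLE_MS` a default of `4500` ms and shows `6000` as an alternative. The sketch below illustrates how a wrapper script might read that setting with the documented default and reject values that are not whole milliseconds. It is not taken from the package: the function name `utterance_idle_ms` is hypothetical, and the rule that only digits are accepted is an assumption about what a sensible value looks like.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of VerbalCoding.
# Prints the effective idle window in milliseconds, using the documented default of 4500.
utterance_idle_ms() {
  local value="${UTTERANCE_IDLE_MS:-4500}"
  case "$value" in
    *[!0-9]*)
      # Assumed validation rule: digits only, no units or decimals.
      echo "UTTERANCE_IDLE_MS must be a whole number of milliseconds: '$value'" >&2
      return 1
      ;;
  esac
  printf '%s\n' "$value"
}
```

With the variable unset the function prints `4500`; with `UTTERANCE_IDLE_MS="6000"` it prints `6000`, which corresponds to waiting longer after speech before STT starts.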