verbalcoding 0.2.5 → 0.2.7
- package/README.md +5 -0
- package/app-node/cli_install.test.mjs +2 -1
- package/docs/i18n/CONFIGURATION.es.md +150 -0
- package/docs/i18n/CONFIGURATION.fr.md +150 -0
- package/docs/i18n/CONFIGURATION.ja.md +150 -0
- package/docs/i18n/CONFIGURATION.ko.md +49 -146
- package/docs/i18n/CONFIGURATION.ru.md +150 -0
- package/docs/i18n/CONFIGURATION.zh.md +150 -0
- package/docs/i18n/FRESH_INSTALL.es.md +124 -0
- package/docs/i18n/FRESH_INSTALL.fr.md +124 -0
- package/docs/i18n/FRESH_INSTALL.ja.md +124 -0
- package/docs/i18n/FRESH_INSTALL.ko.md +37 -114
- package/docs/i18n/FRESH_INSTALL.ru.md +124 -0
- package/docs/i18n/FRESH_INSTALL.zh.md +124 -0
- package/docs/i18n/MULTI_INSTANCE.es.md +121 -0
- package/docs/i18n/MULTI_INSTANCE.fr.md +121 -0
- package/docs/i18n/MULTI_INSTANCE.ja.md +121 -0
- package/docs/i18n/MULTI_INSTANCE.ko.md +28 -86
- package/docs/i18n/MULTI_INSTANCE.ru.md +121 -0
- package/docs/i18n/MULTI_INSTANCE.zh.md +121 -0
- package/docs/i18n/README.es.md +50 -86
- package/docs/i18n/README.fr.md +50 -86
- package/docs/i18n/README.ja.md +50 -86
- package/docs/i18n/README.ko.md +41 -113
- package/docs/i18n/README.ru.md +50 -86
- package/docs/i18n/README.zh.md +50 -86
- package/docs/i18n/RELEASE.es.md +58 -0
- package/docs/i18n/RELEASE.fr.md +58 -0
- package/docs/i18n/RELEASE.ja.md +58 -0
- package/docs/i18n/RELEASE.ko.md +36 -50
- package/docs/i18n/RELEASE.ru.md +58 -0
- package/docs/i18n/RELEASE.zh.md +58 -0
- package/docs/i18n/USAGE.es.md +134 -0
- package/docs/i18n/USAGE.fr.md +134 -0
- package/docs/i18n/USAGE.ja.md +134 -0
- package/docs/i18n/USAGE.ko.md +63 -101
- package/docs/i18n/USAGE.ru.md +134 -0
- package/docs/i18n/USAGE.zh.md +134 -0
- package/package.json +2 -2
- package/integrations/openvoice/__pycache__/synth.cpython-311.pyc +0 -0
|
@@ -1,48 +1,36 @@
|
|
|
1
|
-
# VerbalCoding Configuration (설정)
|
|
1
|
+
# VerbalCoding Configuration (설정)
|
|
2
2
|
|
|
3
|
-
##
|
|
3
|
+
## Setup Wizard
|
|
4
4
|
|
|
5
|
-
|
|
5
|
+
Use upstream Discord-side guides first, then return to VerbalCoding:
|
|
6
6
|
|
|
7
|
-
- Hermes Agent Discord
|
|
8
|
-
- Discord
|
|
9
|
-
- Discord
|
|
10
|
-
|
|
11
|
-
If installed via npm:
|
|
7
|
+
- Hermes Agent Discord messaging guide: <https://hermes-agent.nousresearch.com/docs/user-guide/messaging/discord>
|
|
8
|
+
- Discord official bot overview: <https://docs.discord.com/developers/bots/overview>
|
|
9
|
+
- Discord official quick start: <https://docs.discord.com/developers/quick-start/getting-started>
|
|
12
10
|
|
|
13
11
|
```bash
|
|
14
12
|
vc setup --yes
|
|
13
|
+
# or from a clone
|
|
14
|
+
./scripts/install.sh
|
|
15
15
|
```
|
|
16
16
|
|
|
17
|
-
|
|
18
|
-
|
|
19
|
-
```bash
|
|
20
|
-
./scripts/install.sh --yes
|
|
21
|
-
```
|
|
22
|
-
|
|
23
|
-
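The installer asks for the Discord token, allowed users, auto-join voice channel names, transcript channel/thread, CLI harness backend, default voice language, TTS settings, and wake-word behavior. The result is saved to `.env` with mode `0600`, and `.env` is ignored by git. On clone installs, it also links the short shell command `vc`.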
|
|
17
|
+
The installer asks for the Discord token, allowed users, auto-join voice channel names, transcript channel/thread, CLI harness backend, default voice language, TTS settings, and wake-word behavior. It writes `.env` with mode `0600`.
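
The `0600` mode mentioned above can be illustrated with a minimal Python sketch. This is not VerbalCoding's installer code: the helper name `write_private_env` and the sample values are hypothetical, and a POSIX filesystem is assumed.

```python
import os
import stat
import tempfile


def write_private_env(path, values):
    """Write KEY="value" lines to path and restrict it to owner read/write."""
    # Hypothetical helper for illustration; not part of VerbalCoding.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        for key, value in values.items():
            handle.write(f'{key}="{value}"\n')
    # The mode given to os.open only applies when the file is created,
    # so chmod also covers a file that already existed.
    os.chmod(path, 0o600)
    return stat.S_IMODE(os.stat(path).st_mode)


demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, ".env")
demo_mode = write_private_env(
    demo_path, {"AGENT_BACKEND": "hermes", "TTS_BACKEND": "edge"}
)
print(oct(demo_mode))
```

Restricting the file to its owner matters here because `.env` holds the Discord bot token.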
|
|
24
18
|
|
|
25
|
-
|
|
19
|
+
## Supported Agent Backends
|
|
26
20
|
|
|
27
|
-
|
|
28
|
-
npm link
|
|
29
|
-
```
|
|
30
|
-
|
|
31
|
-
## Supported Agent Backends
|
|
21
|
+
Set `AGENT_BACKEND` in `.env`.
|
|
32
22
|
|
|
33
|
-
|
|
34
|
-
|
|
35
|
-
| Backend | Default command | Notes |
|
|
23
|
+
| Backend | Default command | Notes |
|
|
36
24
|
|---|---|---|
|
|
37
|
-
| `hermes` | `hermes chat -Q -q` |
|
|
38
|
-
| `claude-code` / `claude` | `claude -p` | `CLAUDE_COMMAND`
|
|
39
|
-
| `codex` | `codex exec` | `CODEX_COMMAND`
|
|
40
|
-
| `gemini` | `gemini -p` | `GEMINI_COMMAND`
|
|
41
|
-
| `opencode` | `opencode run` | `OPENCODE_COMMAND`
|
|
42
|
-
| `openclaw` | `openclaw run` | `OPENCLAW_COMMAND`
|
|
43
|
-
| `custom` | `AGENT_COMMAND`
|
|
25
|
+
| `hermes` | `hermes chat -Q -q` | Default; supports resume and verbose progress |
|
|
26
|
+
| `claude-code` / `claude` | `claude -p` | Override with `CLAUDE_COMMAND` or `AGENT_COMMAND` |
|
|
27
|
+
| `codex` | `codex exec` | Override with `CODEX_COMMAND` or `AGENT_COMMAND` |
|
|
28
|
+
| `gemini` | `gemini -p` | Override with `GEMINI_COMMAND` or `AGENT_COMMAND` |
|
|
29
|
+
| `opencode` | `opencode run` | Override with `OPENCODE_COMMAND` or `AGENT_COMMAND` |
|
|
30
|
+
| `openclaw` | `openclaw run` | Override with `OPENCLAW_COMMAND` or `AGENT_COMMAND` |
|
|
31
|
+
| `custom` | `AGENT_COMMAND` required | Prompt is appended as final argv |
|
|
44
32
|
|
|
45
|
-
|
|
33
|
+
Generic overrides:
|
|
46
34
|
|
|
47
35
|
```bash
|
|
48
36
|
AGENT_BACKEND=custom
|
|
@@ -55,37 +43,23 @@ UTTERANCE_IDLE_MS=4500
|
|
|
55
43
|
LATENCY_LOG_PATH=./.logs/latency.jsonl
|
|
56
44
|
```
|
|
57
45
|
|
|
58
|
-
##
|
|
59
|
-
|
|
60
|
-
The voice bridge talks to every backend through a single adapter contract.
|
|
61
|
-
|
|
62
|
-
- `run({ text }, signal, plan)` returns the status, final answer text, backend label, elapsed time, and optional session metadata.
|
|
63
|
-
- `ask(text, signal, plan)` is a compatibility shortcut that returns only the final answer text.
|
|
64
|
-
- `capabilities` declares whether the backend supports session resume, streaming progress, and cancellation.
|
|
65
|
-
- Hermes is the reference adapter. It supports resume, verbose progress streaming, cancellation, and recovering the final answer from the Hermes session file.
|
|
66
|
-
|
|
67
|
-
New backends should implement the same contract and keep voice/STT/TTS behavior outside the adapter.
|
|
68
|
-
|
|
69
|
-
## Example `.env`
|
|
46
|
+
## Example `.env`
|
|
70
47
|
|
|
71
48
|
```bash
|
|
72
49
|
DISCORD_BOT_TOKEN="***"
|
|
73
50
|
DISCORD_ALLOWED_USERS="123456789012345678"
|
|
74
51
|
AUTO_JOIN_VOICE_CHANNELS="일반,General,general"
|
|
75
52
|
TRANSCRIPT_CHANNEL_ID="123456789012345678"
|
|
76
|
-
|
|
77
53
|
AGENT_BACKEND="hermes"
|
|
78
54
|
STT_ENGINE="whisper_cpp"
|
|
79
55
|
WHISPER_CPP_BIN="whisper-cli"
|
|
80
56
|
WHISPER_CPP_MODEL="./models/ggml-small-q5_1.bin"
|
|
81
|
-
|
|
82
57
|
TTS_BACKEND="edge"
|
|
83
58
|
TTS_VOICE_TYPE="korean_female"
|
|
84
59
|
TTS_VOICE="ko-KR-SunHiNeural"
|
|
85
60
|
TTS_RATE="+10%"
|
|
86
61
|
TTS_MAX_CHARS="495"
|
|
87
62
|
TTS_VOLUME="1.0"
|
|
88
|
-
|
|
89
63
|
REQUIRE_WAKE_WORD="0"
|
|
90
64
|
MIN_UTTERANCE_SECONDS="1.0"
|
|
91
65
|
UTTERANCE_IDLE_MS="4500"
|
|
@@ -95,60 +69,39 @@ AGENT_VERBOSE_PROGRESS="0"
|
|
|
95
69
|
LATENCY_LOG_PATH="./.logs/latency.jsonl"
|
|
96
70
|
```
|
|
97
71
|
|
|
98
|
-
## TTS
|
|
99
|
-
|
|
100
|
-
Language presets and voice selection are separate.
|
|
101
|
-
|
|
102
|
-
- `vc language ko|en|auto` changes the STT language, the progress language, and that language's default voice together.
|
|
103
|
-
- Live voice commands such as “남자 한국어 목소리로 바꿔”, “여자 한국어 목소리로 바꿔”, `change voice to Korean female`, and `switch speaker to English` change only the speaker/voice type.
|
|
104
|
-
- `!voice-test <text>` plays a short sample with the currently selected backend and voice.
|
|
72
|
+
## TTS Voice Selection
|
|
105
73
|
|
|
106
|
-
|
|
74
|
+
`vc language ko|en|auto` changes STT language, progress language, and default TTS voice. Live commands such as “남자 한국어 목소리로 바꿔”, “여자 한국어 목소리로 바꿔”, `change voice to Korean female`, and `switch speaker to English` change only the speaker/voice type.
|
|
107
75
|
|
|
108
|
-
|
|
76
|
+
Default Edge catalog:
|
|
109
77
|
|
|
110
|
-
| `TTS_VOICE_TYPE` | `TTS_VOICE` |
|
|
78
|
+
| `TTS_VOICE_TYPE` | `TTS_VOICE` | Language |
|
|
111
79
|
|---|---|---|
|
|
112
|
-
| `korean_male` | `ko-KR-InJoonNeural` |
|
|
113
|
-
| `korean_female` | `ko-KR-SunHiNeural` |
|
|
114
|
-
| `korean_multilingual_male` | `ko-KR-HyunsuMultilingualNeural` |
|
|
115
|
-
| `english_male` | `en-US-GuyNeural` |
|
|
116
|
-
| `english_female` | `en-US-AriaNeural` |
|
|
117
|
-
|
|
118
|
-
Example of a manual persistent override:
|
|
119
|
-
|
|
120
|
-
```bash
|
|
121
|
-
TTS_BACKEND="edge"
|
|
122
|
-
TTS_VOICE_TYPE="korean_male"
|
|
123
|
-
TTS_VOICE="ko-KR-InJoonNeural"
|
|
124
|
-
TTS_VOICE_CONFIG="config/tts-voices.json"
|
|
125
|
-
```
|
|
80
|
+
| `korean_male` | `ko-KR-InJoonNeural` | Korean |
|
|
81
|
+
| `korean_female` | `ko-KR-SunHiNeural` | Korean |
|
|
82
|
+
| `korean_multilingual_male` | `ko-KR-HyunsuMultilingualNeural` | Korean |
|
|
83
|
+
| `english_male` | `en-US-GuyNeural` | English |
|
|
84
|
+
| `english_female` | `en-US-AriaNeural` | English |
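
The catalog above can be read as a simple lookup from `TTS_VOICE_TYPE` to `TTS_VOICE`. The Python sketch below illustrates that reading only. The function name `resolve_edge_voice` is hypothetical, and letting an explicit `TTS_VOICE` take precedence over the type is an assumption, not documented behavior.

```python
# Voice names copied from the Default Edge catalog table.
EDGE_VOICE_CATALOG = {
    "korean_male": "ko-KR-InJoonNeural",
    "korean_female": "ko-KR-SunHiNeural",
    "korean_multilingual_male": "ko-KR-HyunsuMultilingualNeural",
    "english_male": "en-US-GuyNeural",
    "english_female": "en-US-AriaNeural",
}


def resolve_edge_voice(voice_type, explicit_voice=None):
    """Pick an Edge voice name from a voice type (hypothetical helper)."""
    # Assumption: an explicitly configured TTS_VOICE wins over TTS_VOICE_TYPE.
    if explicit_voice:
        return explicit_voice
    if voice_type not in EDGE_VOICE_CATALOG:
        raise ValueError(f"unknown TTS_VOICE_TYPE: {voice_type!r}")
    return EDGE_VOICE_CATALOG[voice_type]


print(resolve_edge_voice("korean_female"))
```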
|
|
126
85
|
|
|
127
|
-
|
|
86
|
+
Backend-specific voice options:
|
|
128
87
|
|
|
129
|
-
|
|
130
|
-
|
|
131
|
-
| Backend | Settings | Voice choices |
|
|
88
|
+
| Backend | Settings | Voice choices |
|
|
132
89
|
|---|---|---|
|
|
133
|
-
| Edge | `TTS_VOICE_TYPE`, `TTS_VOICE` |
|
|
134
|
-
| Supertonic | `SUPERTONIC_VOICE`, `SUPERTONIC_LANGUAGE` | `M1`–`M5`, `F1`–`F5`;
|
|
135
|
-
| OpenVoice | `OPENVOICE_REF_AUDIO`, `OPENVOICE_STYLE`, `OPENVOICE_LANGUAGE` |
|
|
136
|
-
| SpeechSwift / CosyVoice | `SPEECHSWIFT_REF_AUDIO`, `SPEECHSWIFT_ENGINE`, `SPEECHSWIFT_SPEAKER`, `SPEECHSWIFT_MODEL_ID` |
|
|
90
|
+
| Edge | `TTS_VOICE_TYPE`, `TTS_VOICE` | Built-in types plus any `edge-tts --list-voices` voice |
|
|
91
|
+
| Supertonic | `SUPERTONIC_VOICE`, `SUPERTONIC_LANGUAGE` | `M1`–`M5`, `F1`–`F5`; `ko`, `en`, `es`, `pt`, `fr` |
|
|
92
|
+
| OpenVoice | `OPENVOICE_REF_AUDIO`, `OPENVOICE_STYLE`, `OPENVOICE_LANGUAGE` | User-provided permitted reference WAV |
|
|
93
|
+
| SpeechSwift / CosyVoice | `SPEECHSWIFT_REF_AUDIO`, `SPEECHSWIFT_ENGINE`, `SPEECHSWIFT_SPEAKER`, `SPEECHSWIFT_MODEL_ID` | Reference-sample voice or backend speaker/model ID |
|
|
137
94
|
|
|
138
|
-
##
|
|
95
|
+
## Utterance Segmentation
|
|
139
96
|
|
|
140
|
-
`UTTERANCE_IDLE_MS
|
|
97
|
+
`UTTERANCE_IDLE_MS` controls how long the bridge waits after speech before starting STT. Default is `4500` ms.
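
A rough way to picture idle-based segmentation is to split a stream of audio packet timestamps wherever the silence exceeds the idle threshold. The Python sketch below shows that idea only. `split_utterances` is a hypothetical name, and the bridge's real buffering logic may differ.

```python
def split_utterances(packet_times_ms, idle_ms=4500):
    """Group packet timestamps into utterances separated by long gaps."""
    utterances = []
    current = []
    for stamp in sorted(packet_times_ms):
        # Assumption: a gap strictly longer than idle_ms ends the utterance.
        if current and stamp - current[-1] > idle_ms:
            utterances.append(current)
            current = []
        current.append(stamp)
    if current:
        utterances.append(current)
    return utterances


packets = [0, 1000, 2000, 7000, 7500]
print(split_utterances(packets, idle_ms=4500))
print(split_utterances(packets, idle_ms=6000))
```

With the 5000 ms pause in the sample data, the `4500` setting yields two utterances while `6000` keeps everything together, which is the trade-off between faster responses and fewer mid-sentence cuts.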
|
|
141
98
|
|
|
142
99
|
```bash
|
|
143
|
-
UTTERANCE_IDLE_MS="4500"
|
|
144
|
-
UTTERANCE_IDLE_MS="6000"
|
|
100
|
+
UTTERANCE_IDLE_MS="4500"
|
|
101
|
+
UTTERANCE_IDLE_MS="6000"
|
|
145
102
|
```
|
|
146
103
|
|
|
147
|
-
## MCP
|
|
148
|
-
|
|
149
|
-
VerbalCoding includes a stdio MCP server. Hermes Agent or another MCP client can control the bridge through tools instead of free-form shell commands.
|
|
150
|
-
|
|
151
|
-
Example Hermes configuration:
|
|
104
|
+
## MCP Server
|
|
152
105
|
|
|
153
106
|
```yaml
|
|
154
107
|
mcp_servers:
|
|
@@ -159,89 +112,39 @@ mcp_servers:
|
|
|
159
112
|
connect_timeout: 30
|
|
160
113
|
```
|
|
161
114
|
|
|
162
|
-
|
|
163
|
-
|
|
164
|
-
| Tool | Purpose |
|
|
165
|
-
|---|---|
|
|
166
|
-
| `status` | Report bridge/configuration status without secrets |
|
|
167
|
-
| `doctor` | Run doctor checks with secrets redacted |
|
|
168
|
-
| `set_auto_restart` | Turn automatic voice-bot restart on commit on or off |
|
|
169
|
-
| `set_language` | Change the STT/progress/TTS language together |
|
|
170
|
-
| `start`, `stop`, `restart` | Control the Discord voice bridge |
|
|
115
|
+
Tools: `status`, `doctor`, `set_auto_restart`, `set_language`, `start`, `stop`, and `restart`.
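
For orientation, an MCP client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request over the server's stdio, after the usual `initialize` handshake. The Python sketch below only builds such a message for the `status` tool. The helper name `build_tool_call` is hypothetical, and the empty `arguments` object is an assumption because the tool schemas are not shown here.

```python
import json


def build_tool_call(request_id, tool_name, arguments=None):
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        # Assumption: the tool accepts an empty arguments object.
        "params": {"name": tool_name, "arguments": arguments or {}},
    }


message = build_tool_call(1, "status")
# The stdio transport carries one JSON message per line.
print(json.dumps(message))
```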
|
|
171
116
|
|
|
172
|
-
##
|
|
173
|
-
|
|
174
|
-
Edge TTS is the default and the fallback. To try local voice cloning with OpenVoice V2:
|
|
117
|
+
## Optional OpenVoice TTS
|
|
175
118
|
|
|
176
119
|
```bash
|
|
177
120
|
./scripts/setup_openvoice.sh
|
|
178
|
-
# Download checkpoints_v2_0417.zip from the OpenVoice docs and extract it under vendor/OpenVoice/checkpoints_v2/.
|
|
179
|
-
mkdir -p voice-samples
|
|
180
|
-
# Place a permitted reference sample at voice-samples/user-reference.wav, or
|
|
181
|
-
# capture a sample from Discord with !voice-clone capture.
|
|
182
121
|
python3 integrations/openvoice/synth.py --openvoice-dir vendor/OpenVoice --ref-audio voice-samples/user-reference.wav --text '안녕하세요. 버벌코딩 목소리 복제 테스트입니다.' --output /tmp/verbalcoding-openvoice-smoke.wav
|
|
183
122
|
```
|
|
184
123
|
|
|
185
|
-
Then configure:
|
|
186
|
-
|
|
187
124
|
```bash
|
|
188
125
|
TTS_BACKEND="openvoice"
|
|
189
126
|
OPENVOICE_REF_AUDIO="./voice-samples/user-reference.wav"
|
|
190
127
|
OPENVOICE_PROGRESS="0"
|
|
191
128
|
```
|
|
192
129
|
|
|
193
|
-
|
|
130
|
+
Only clone voices you own or have permission to use. OpenVoice falls back to Edge on failure.
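
The fallback behavior can be pictured as a try-primary-then-fallback wrapper. The Python sketch below is an illustration under stated assumptions, not the bridge's code. `synthesize_with_fallback` and the two stand-in synthesizers are hypothetical.

```python
def synthesize_with_fallback(text, primary, fallback):
    """Try the primary synthesizer and use the fallback if it raises."""
    try:
        return primary(text), "primary"
    except Exception:
        # Assumption: any failure of the primary backend triggers the fallback.
        return fallback(text), "fallback"


def broken_openvoice(text):
    # Stand-in for a cloning backend that is not installed.
    raise RuntimeError("OpenVoice checkpoints not found")


def fake_edge(text):
    # Stand-in for the default backend; returns a label instead of audio.
    return f"edge-audio:{text}"


print(synthesize_with_fallback("hello", broken_openvoice, fake_edge))
```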
|
|
194
131
|
|
|
195
|
-
##
|
|
132
|
+
## Optional Supertonic TTS
|
|
196
133
|
|
|
197
134
|
```bash
|
|
198
135
|
./scripts/setup_supertonic.sh
|
|
199
136
|
supertonic tts '안녕하세요. 수퍼토닉 테스트입니다.' --lang ko --voice M1 --steps 2 --speed 1.0 -o /tmp/verbalcoding-supertonic.wav
|
|
200
137
|
```
|
|
201
138
|
|
|
202
|
-
|
|
203
|
-
|
|
204
|
-
```bash
|
|
205
|
-
TTS_BACKEND="supertonic"
|
|
206
|
-
SUPERTONIC_COMMAND="./.venv-supertonic/bin/supertonic"
|
|
207
|
-
SUPERTONIC_VOICE="M1"
|
|
208
|
-
SUPERTONIC_LANGUAGE="ko"
|
|
209
|
-
SUPERTONIC_STEPS="2"
|
|
210
|
-
SUPERTONIC_SPEED="1.0"
|
|
211
|
-
SUPERTONIC_PROGRESS="0"
|
|
212
|
-
```
|
|
213
|
-
|
|
214
|
-
If Supertonic is missing, fails, or times out, VerbalCoding falls back to Edge TTS.
|
|
215
|
-
|
|
216
|
-
## Optional: SpeechSwift / CosyVoice TTS
|
|
217
|
-
|
|
218
|
-
On Apple Silicon, `speech-swift` can act as a local backend for Korean voice cloning based on MLX-native CosyVoice/Qwen3-TTS.
|
|
139
|
+
## Optional SpeechSwift / CosyVoice TTS
|
|
219
140
|
|
|
220
141
|
```bash
|
|
221
142
|
brew tap soniqo/speech https://github.com/soniqo/speech-swift
|
|
222
143
|
brew install speech
|
|
223
144
|
```
|
|
224
145
|
|
|
225
|
-
|
|
226
|
-
|
|
227
|
-
```bash
|
|
228
|
-
TTS_BACKEND="speechswift"
|
|
229
|
-
SPEECHSWIFT_MODE="server"
|
|
230
|
-
SPEECHSWIFT_ENGINE="cosyvoice"
|
|
231
|
-
SPEECHSWIFT_LANGUAGE="korean"
|
|
232
|
-
SPEECHSWIFT_REF_AUDIO="./voice-samples/user-reference.wav"
|
|
233
|
-
SPEECHSWIFT_SERVER_HOST="127.0.0.1"
|
|
234
|
-
SPEECHSWIFT_SERVER_PORT="18080"
|
|
235
|
-
SPEECHSWIFT_SERVER_URL="http://127.0.0.1:18080"
|
|
236
|
-
SPEECHSWIFT_PROGRESS="0"
|
|
237
|
-
```
|
|
238
|
-
|
|
239
|
-
It is safer to keep Edge for quick progress and short backchannel prompts.
|
|
146
|
+
Recommended env includes `TTS_BACKEND="speechswift"`, `SPEECHSWIFT_MODE="server"`, `SPEECHSWIFT_ENGINE="cosyvoice"`, `SPEECHSWIFT_REF_AUDIO`, and `SPEECHSWIFT_SERVER_URL`. Keep Edge for quick progress prompts.
|
|
240
147
|
|
|
241
|
-
##
|
|
148
|
+
## Operational Notes
|
|
242
149
|
|
|
243
|
-
|
|
244
|
-
- The bot needs connect/speak permissions for the voice channel.
|
|
245
|
-
- If you use Hermes Agent, set up and authenticate Hermes normally in the default profile, for example with `hermes setup` and `hermes login`.
|
|
246
|
-
- To use Claude Code, Codex, Gemini, OpenCode, or OpenClaw, install and authenticate the corresponding CLI separately.
|
|
247
|
-
- If the CLI prints a diff or code during a timeout or signal failure, the bridge does not read it aloud and sends it only as detailed text.
|
|
150
|
+
Enable Discord Message Content intent, grant voice connect/speak permissions, authenticate the selected CLI harness separately, and avoid reading diffs/log dumps aloud.
|
|
@@ -0,0 +1,150 @@
|
|
|
1
|
+
# VerbalCoding Configuration (Конфигурация)
|
|
2
|
+
|
|
3
|
+
## Setup Wizard
|
|
4
|
+
|
|
5
|
+
Use upstream Discord-side guides first, then return to VerbalCoding:
|
|
6
|
+
|
|
7
|
+
- Hermes Agent Discord messaging guide: <https://hermes-agent.nousresearch.com/docs/user-guide/messaging/discord>
|
|
8
|
+
- Discord official bot overview: <https://docs.discord.com/developers/bots/overview>
|
|
9
|
+
- Discord official quick start: <https://docs.discord.com/developers/quick-start/getting-started>
|
|
10
|
+
|
|
11
|
+
```bash
|
|
12
|
+
vc setup --yes
|
|
13
|
+
# or from a clone
|
|
14
|
+
./scripts/install.sh
|
|
15
|
+
```
|
|
16
|
+
|
|
17
|
+
The installer asks for the Discord token, allowed users, auto-join voice channel names, transcript channel/thread, CLI harness backend, default voice language, TTS settings, and wake-word behavior. It writes `.env` with mode `0600`.
|
|
18
|
+
|
|
19
|
+
## Supported Agent Backends
|
|
20
|
+
|
|
21
|
+
Set `AGENT_BACKEND` in `.env`.
|
|
22
|
+
|
|
23
|
+
| Backend | Default command | Notes |
|
|
24
|
+
|---|---|---|
|
|
25
|
+
| `hermes` | `hermes chat -Q -q` | Default; supports resume and verbose progress |
|
|
26
|
+
| `claude-code` / `claude` | `claude -p` | Override with `CLAUDE_COMMAND` or `AGENT_COMMAND` |
|
|
27
|
+
| `codex` | `codex exec` | Override with `CODEX_COMMAND` or `AGENT_COMMAND` |
|
|
28
|
+
| `gemini` | `gemini -p` | Override with `GEMINI_COMMAND` or `AGENT_COMMAND` |
|
|
29
|
+
| `opencode` | `opencode run` | Override with `OPENCODE_COMMAND` or `AGENT_COMMAND` |
|
|
30
|
+
| `openclaw` | `openclaw run` | Override with `OPENCLAW_COMMAND` or `AGENT_COMMAND` |
|
|
31
|
+
| `custom` | `AGENT_COMMAND` required | Prompt is appended as final argv |
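
The table suggests a resolution order for the backend command. The Python sketch below shows one plausible reading of it. The precedence (backend-specific variable, then `AGENT_COMMAND`, then the default) and appending the prompt as the last argument for every backend are assumptions, since the table states the latter only for `custom`. `build_argv` is a hypothetical name.

```python
import shlex

# Default commands copied from the table above.
DEFAULT_COMMANDS = {
    "hermes": "hermes chat -Q -q",
    "claude-code": "claude -p",
    "claude": "claude -p",
    "codex": "codex exec",
    "gemini": "gemini -p",
    "opencode": "opencode run",
    "openclaw": "openclaw run",
}
SPECIFIC_OVERRIDES = {
    "claude-code": "CLAUDE_COMMAND",
    "claude": "CLAUDE_COMMAND",
    "codex": "CODEX_COMMAND",
    "gemini": "GEMINI_COMMAND",
    "opencode": "OPENCODE_COMMAND",
    "openclaw": "OPENCLAW_COMMAND",
}


def build_argv(env, prompt):
    """Resolve the backend command and append the prompt as the final argv."""
    backend = env.get("AGENT_BACKEND", "hermes")
    override_key = SPECIFIC_OVERRIDES.get(backend)
    # Assumed precedence: specific override, generic override, then default.
    command = (
        (env.get(override_key) if override_key else None)
        or env.get("AGENT_COMMAND")
        or DEFAULT_COMMANDS.get(backend)
    )
    if not command:
        raise ValueError(f"AGENT_COMMAND is required for backend {backend!r}")
    # The prompt stays a single argument, so no shell quoting is needed.
    return shlex.split(command) + [prompt]


custom_env = {
    "AGENT_BACKEND": "custom",
    "AGENT_COMMAND": "my-harness run --non-interactive",
}
print(build_argv(custom_env, "fix the failing test"))
```

Passing the prompt as its own argv element, rather than interpolating it into a shell string, avoids quoting problems with transcribed speech.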
|
|
32
|
+
|
|
33
|
+
Generic overrides:
|
|
34
|
+
|
|
35
|
+
```bash
|
|
36
|
+
AGENT_BACKEND=custom
|
|
37
|
+
AGENT_LABEL="My Harness"
|
|
38
|
+
AGENT_COMMAND="my-harness run --non-interactive"
|
|
39
|
+
AGENT_TASK_TIMEOUT_MS=0
|
|
40
|
+
AGENT_CHAT_TIMEOUT_MS=45000
|
|
41
|
+
AGENT_VERBOSE_PROGRESS=0
|
|
42
|
+
UTTERANCE_IDLE_MS=4500
|
|
43
|
+
LATENCY_LOG_PATH=./.logs/latency.jsonl
|
|
44
|
+
```
|
|
45
|
+
|
|
46
|
+
## Example `.env`
|
|
47
|
+
|
|
48
|
+
```bash
|
|
49
|
+
DISCORD_BOT_TOKEN="***"
|
|
50
|
+
DISCORD_ALLOWED_USERS="123456789012345678"
|
|
51
|
+
AUTO_JOIN_VOICE_CHANNELS="일반,General,general"
|
|
52
|
+
TRANSCRIPT_CHANNEL_ID="123456789012345678"
|
|
53
|
+
AGENT_BACKEND="hermes"
|
|
54
|
+
STT_ENGINE="whisper_cpp"
|
|
55
|
+
WHISPER_CPP_BIN="whisper-cli"
|
|
56
|
+
WHISPER_CPP_MODEL="./models/ggml-small-q5_1.bin"
|
|
57
|
+
TTS_BACKEND="edge"
|
|
58
|
+
TTS_VOICE_TYPE="korean_female"
|
|
59
|
+
TTS_VOICE="ko-KR-SunHiNeural"
|
|
60
|
+
TTS_RATE="+10%"
|
|
61
|
+
TTS_MAX_CHARS="495"
|
|
62
|
+
TTS_VOLUME="1.0"
|
|
63
|
+
REQUIRE_WAKE_WORD="0"
|
|
64
|
+
MIN_UTTERANCE_SECONDS="1.0"
|
|
65
|
+
UTTERANCE_IDLE_MS="4500"
|
|
66
|
+
HERMES_TASK_TIMEOUT_MS="0"
|
|
67
|
+
HERMES_CHAT_TIMEOUT_MS="45000"
|
|
68
|
+
AGENT_VERBOSE_PROGRESS="0"
|
|
69
|
+
LATENCY_LOG_PATH="./.logs/latency.jsonl"
|
|
70
|
+
```
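
To make the file format concrete, the Python sketch below parses `KEY="value"` lines like those in the example. It is a simplified illustration, not the loader VerbalCoding uses. `parse_env_text` is a hypothetical helper that ignores escapes, `export` prefixes, and multi-line values.

```python
def parse_env_text(text):
    """Parse simple KEY=value lines, stripping one layer of matching quotes."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


sample = '''
# voice settings
TTS_BACKEND="edge"
TTS_RATE="+10%"
UTTERANCE_IDLE_MS="4500"
AGENT_BACKEND=custom
'''
settings = parse_env_text(sample)
print(settings["TTS_BACKEND"], int(settings["UTTERANCE_IDLE_MS"]))
```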
|
|
71
|
+
|
|
72
|
+
## TTS Voice Selection
|
|
73
|
+
|
|
74
|
+
`vc language ko|en|auto` changes STT language, progress language, and default TTS voice. Live commands such as “남자 한국어 목소리로 바꿔”, “여자 한국어 목소리로 바꿔”, `change voice to Korean female`, and `switch speaker to English` change only the speaker/voice type.
|
|
75
|
+
|
|
76
|
+
Default Edge catalog:
|
|
77
|
+
|
|
78
|
+
| `TTS_VOICE_TYPE` | `TTS_VOICE` | Language |
|
|
79
|
+
|---|---|---|
|
|
80
|
+
| `korean_male` | `ko-KR-InJoonNeural` | Korean |
|
|
81
|
+
| `korean_female` | `ko-KR-SunHiNeural` | Korean |
|
|
82
|
+
| `korean_multilingual_male` | `ko-KR-HyunsuMultilingualNeural` | Korean |
|
|
83
|
+
| `english_male` | `en-US-GuyNeural` | English |
|
|
84
|
+
| `english_female` | `en-US-AriaNeural` | English |
|
|
85
|
+
|
|
86
|
+
Backend-specific voice options:
|
|
87
|
+
|
|
88
|
+
| Backend | Settings | Voice choices |
|
|
89
|
+
|---|---|---|
|
|
90
|
+
| Edge | `TTS_VOICE_TYPE`, `TTS_VOICE` | Built-in types plus any `edge-tts --list-voices` voice |
|
|
91
|
+
| Supertonic | `SUPERTONIC_VOICE`, `SUPERTONIC_LANGUAGE` | `M1`–`M5`, `F1`–`F5`; `ko`, `en`, `es`, `pt`, `fr` |
|
|
92
|
+
| OpenVoice | `OPENVOICE_REF_AUDIO`, `OPENVOICE_STYLE`, `OPENVOICE_LANGUAGE` | User-provided permitted reference WAV |
|
|
93
|
+
| SpeechSwift / CosyVoice | `SPEECHSWIFT_REF_AUDIO`, `SPEECHSWIFT_ENGINE`, `SPEECHSWIFT_SPEAKER`, `SPEECHSWIFT_MODEL_ID` | Reference-sample voice or backend speaker/model ID |
|
|
94
|
+
|
|
95
|
+
## Utterance Segmentation
|
|
96
|
+
|
|
97
|
+
`UTTERANCE_IDLE_MS` controls how long the bridge waits after speech before starting STT. Default is `4500` ms.
|
|
98
|
+
|
|
99
|
+
```bash
|
|
100
|
+
UTTERANCE_IDLE_MS="4500"
|
|
101
|
+
UTTERANCE_IDLE_MS="6000"
|
|
102
|
+
```
|
|
103
|
+
|
|
104
|
+
## MCP Server
|
|
105
|
+
|
|
106
|
+
```yaml
|
|
107
|
+
mcp_servers:
|
|
108
|
+
verbalcoding:
|
|
109
|
+
command: "node"
|
|
110
|
+
args: ["/path/to/VerbalCoding/scripts/mcp-server.mjs"]
|
|
111
|
+
timeout: 120
|
|
112
|
+
connect_timeout: 30
|
|
113
|
+
```
|
|
114
|
+
|
|
115
|
+
Tools: `status`, `doctor`, `set_auto_restart`, `set_language`, `start`, `stop`, and `restart`.
|
|
116
|
+
|
|
117
|
+
## Optional OpenVoice TTS
|
|
118
|
+
|
|
119
|
+
```bash
|
|
120
|
+
./scripts/setup_openvoice.sh
|
|
121
|
+
python3 integrations/openvoice/synth.py --openvoice-dir vendor/OpenVoice --ref-audio voice-samples/user-reference.wav --text '안녕하세요. 버벌코딩 목소리 복제 테스트입니다.' --output /tmp/verbalcoding-openvoice-smoke.wav
|
|
122
|
+
```
|
|
123
|
+
|
|
124
|
+
```bash
|
|
125
|
+
TTS_BACKEND="openvoice"
|
|
126
|
+
OPENVOICE_REF_AUDIO="./voice-samples/user-reference.wav"
|
|
127
|
+
OPENVOICE_PROGRESS="0"
|
|
128
|
+
```
|
|
129
|
+
|
|
130
|
+
Only clone voices you own or have permission to use. OpenVoice falls back to Edge on failure.
|
|
131
|
+
|
|
132
|
+
## Optional Supertonic TTS
|
|
133
|
+
|
|
134
|
+
```bash
|
|
135
|
+
./scripts/setup_supertonic.sh
|
|
136
|
+
supertonic tts '안녕하세요. 수퍼토닉 테스트입니다.' --lang ko --voice M1 --steps 2 --speed 1.0 -o /tmp/verbalcoding-supertonic.wav
|
|
137
|
+
```
|
|
138
|
+
|
|
139
|
+
## Optional SpeechSwift / CosyVoice TTS
|
|
140
|
+
|
|
141
|
+
```bash
|
|
142
|
+
brew tap soniqo/speech https://github.com/soniqo/speech-swift
|
|
143
|
+
brew install speech
|
|
144
|
+
```
|
|
145
|
+
|
|
146
|
+
Recommended env includes `TTS_BACKEND="speechswift"`, `SPEECHSWIFT_MODE="server"`, `SPEECHSWIFT_ENGINE="cosyvoice"`, `SPEECHSWIFT_REF_AUDIO`, and `SPEECHSWIFT_SERVER_URL`. Keep Edge for quick progress prompts.
|
|
147
|
+
|
|
148
|
+
## Operational Notes
|
|
149
|
+
|
|
150
|
+
Enable Discord Message Content intent, grant voice connect/speak permissions, authenticate the selected CLI harness separately, and avoid reading diffs/log dumps aloud.
|
|
@@ -0,0 +1,150 @@
|
|
|
1
|
+
# VerbalCoding Configuration (配置)
|
|
2
|
+
|
|
3
|
+
## Setup Wizard
|
|
4
|
+
|
|
5
|
+
Use upstream Discord-side guides first, then return to VerbalCoding:
|
|
6
|
+
|
|
7
|
+
- Hermes Agent Discord messaging guide: <https://hermes-agent.nousresearch.com/docs/user-guide/messaging/discord>
|
|
8
|
+
- Discord official bot overview: <https://docs.discord.com/developers/bots/overview>
|
|
9
|
+
- Discord official quick start: <https://docs.discord.com/developers/quick-start/getting-started>
|
|
10
|
+
|
|
11
|
+
```bash
|
|
12
|
+
vc setup --yes
|
|
13
|
+
# or from a clone
|
|
14
|
+
./scripts/install.sh
|
|
15
|
+
```
|
|
16
|
+
|
|
17
|
+
The installer asks for the Discord token, allowed users, auto-join voice channel names, transcript channel/thread, CLI harness backend, default voice language, TTS settings, and wake-word behavior. It writes `.env` with mode `0600`.
|
|
18
|
+
|
|
19
|
+
## Supported Agent Backends
|
|
20
|
+
|
|
21
|
+
Set `AGENT_BACKEND` in `.env`.
|
|
22
|
+
|
|
23
|
+
| Backend | Default command | Notes |
|
|
24
|
+
|---|---|---|
|
|
25
|
+
| `hermes` | `hermes chat -Q -q` | Default; supports resume and verbose progress |
|
|
26
|
+
| `claude-code` / `claude` | `claude -p` | Override with `CLAUDE_COMMAND` or `AGENT_COMMAND` |
|
|
27
|
+
| `codex` | `codex exec` | Override with `CODEX_COMMAND` or `AGENT_COMMAND` |
|
|
28
|
+
| `gemini` | `gemini -p` | Override with `GEMINI_COMMAND` or `AGENT_COMMAND` |
|
|
29
|
+
| `opencode` | `opencode run` | Override with `OPENCODE_COMMAND` or `AGENT_COMMAND` |
|
|
30
|
+
| `openclaw` | `openclaw run` | Override with `OPENCLAW_COMMAND` or `AGENT_COMMAND` |
|
|
31
|
+
| `custom` | `AGENT_COMMAND` required | Prompt is appended as final argv |
|
|
32
|
+
|
|
33
|
+
Generic overrides:
|
|
34
|
+
|
|
35
|
+
```bash
|
|
36
|
+
AGENT_BACKEND=custom
|
|
37
|
+
AGENT_LABEL="My Harness"
|
|
38
|
+
AGENT_COMMAND="my-harness run --non-interactive"
|
|
39
|
+
AGENT_TASK_TIMEOUT_MS=0
|
|
40
|
+
AGENT_CHAT_TIMEOUT_MS=45000
|
|
41
|
+
AGENT_VERBOSE_PROGRESS=0
|
|
42
|
+
UTTERANCE_IDLE_MS=4500
|
|
43
|
+
LATENCY_LOG_PATH=./.logs/latency.jsonl
|
|
44
|
+
```
|
|
45
|
+
|
|
46
|
+
## Example `.env`
|
|
47
|
+
|
|
48
|
+
```bash
|
|
49
|
+
DISCORD_BOT_TOKEN="***"
|
|
50
|
+
DISCORD_ALLOWED_USERS="123456789012345678"
|
|
51
|
+
AUTO_JOIN_VOICE_CHANNELS="일반,General,general"
|
|
52
|
+
TRANSCRIPT_CHANNEL_ID="123456789012345678"
|
|
53
|
+
AGENT_BACKEND="hermes"
|
|
54
|
+
STT_ENGINE="whisper_cpp"
|
|
55
|
+
WHISPER_CPP_BIN="whisper-cli"
|
|
56
|
+
WHISPER_CPP_MODEL="./models/ggml-small-q5_1.bin"
|
|
57
|
+
TTS_BACKEND="edge"
|
|
58
|
+
TTS_VOICE_TYPE="korean_female"
|
|
59
|
+
TTS_VOICE="ko-KR-SunHiNeural"
|
|
60
|
+
TTS_RATE="+10%"
|
|
61
|
+
TTS_MAX_CHARS="495"
|
|
62
|
+
TTS_VOLUME="1.0"
|
|
63
|
+
REQUIRE_WAKE_WORD="0"
|
|
64
|
+
MIN_UTTERANCE_SECONDS="1.0"
|
|
65
|
+
UTTERANCE_IDLE_MS="4500"
|
|
66
|
+
HERMES_TASK_TIMEOUT_MS="0"
|
|
67
|
+
HERMES_CHAT_TIMEOUT_MS="45000"
|
|
68
|
+
AGENT_VERBOSE_PROGRESS="0"
|
|
69
|
+
LATENCY_LOG_PATH="./.logs/latency.jsonl"
|
|
70
|
+
```
|
|
71
|
+
|
|
72
|
+
## TTS Voice Selection
|
|
73
|
+
|
|
74
|
+
`vc language ko|en|auto` changes STT language, progress language, and default TTS voice. Live commands such as “남자 한국어 목소리로 바꿔”, “여자 한국어 목소리로 바꿔”, `change voice to Korean female`, and `switch speaker to English` change only the speaker/voice type.
|
|
75
|
+
|
|
76
|
+
Default Edge catalog:
|
|
77
|
+
|
|
78
|
+
| `TTS_VOICE_TYPE` | `TTS_VOICE` | Language |
|
|
79
|
+
|---|---|---|
|
|
80
|
+
| `korean_male` | `ko-KR-InJoonNeural` | Korean |
|
|
81
|
+
| `korean_female` | `ko-KR-SunHiNeural` | Korean |
|
|
82
|
+
| `korean_multilingual_male` | `ko-KR-HyunsuMultilingualNeural` | Korean |
|
|
83
|
+
| `english_male` | `en-US-GuyNeural` | English |
|
|
84
|
+
| `english_female` | `en-US-AriaNeural` | English |
|
|
85
|
+
|
|
86
|
+
Backend-specific voice options:
|
|
87
|
+
|
|
88
|
+
| Backend | Settings | Voice choices |
|
|
89
|
+
|---|---|---|
|
|
90
|
+
| Edge | `TTS_VOICE_TYPE`, `TTS_VOICE` | Built-in types plus any `edge-tts --list-voices` voice |
|
|
91
|
+
| Supertonic | `SUPERTONIC_VOICE`, `SUPERTONIC_LANGUAGE` | `M1`–`M5`, `F1`–`F5`; `ko`, `en`, `es`, `pt`, `fr` |
|
|
92
|
+
| OpenVoice | `OPENVOICE_REF_AUDIO`, `OPENVOICE_STYLE`, `OPENVOICE_LANGUAGE` | User-provided permitted reference WAV |
|
|
93
|
+
| SpeechSwift / CosyVoice | `SPEECHSWIFT_REF_AUDIO`, `SPEECHSWIFT_ENGINE`, `SPEECHSWIFT_SPEAKER`, `SPEECHSWIFT_MODEL_ID` | Reference-sample voice or backend speaker/model ID |
|
|
94
|
+
|
|
95
|
+
## Utterance Segmentation
|
|
96
|
+
|
|
97
|
+
`UTTERANCE_IDLE_MS` controls how long the bridge waits after speech before starting STT. Default is `4500` ms.
|
|
98
|
+
|
|
99
|
+
```bash
|
|
100
|
+
UTTERANCE_IDLE_MS="4500"
|
|
101
|
+
UTTERANCE_IDLE_MS="6000"
|
|
102
|
+
```
|
|
103
|
+
|
|
104
|
+
## MCP Server
|
|
105
|
+
|
|
106
|
+
```yaml
|
|
107
|
+
mcp_servers:
|
|
108
|
+
verbalcoding:
|
|
109
|
+
command: "node"
|
|
110
|
+
args: ["/path/to/VerbalCoding/scripts/mcp-server.mjs"]
|
|
111
|
+
timeout: 120
|
|
112
|
+
connect_timeout: 30
|
|
113
|
+
```
|
|
114
|
+
|
|
115
|
+
Tools: `status`, `doctor`, `set_auto_restart`, `set_language`, `start`, `stop`, and `restart`.
|
|
116
|
+
|
|
117
|
+
## Optional OpenVoice TTS
|
|
118
|
+
|
|
119
|
+
```bash
|
|
120
|
+
./scripts/setup_openvoice.sh
|
|
121
|
+
python3 integrations/openvoice/synth.py --openvoice-dir vendor/OpenVoice --ref-audio voice-samples/user-reference.wav --text '안녕하세요. 버벌코딩 목소리 복제 테스트입니다.' --output /tmp/verbalcoding-openvoice-smoke.wav
|
|
122
|
+
```
|
|
123
|
+
|
|
124
|
+
```bash
|
|
125
|
+
TTS_BACKEND="openvoice"
|
|
126
|
+
OPENVOICE_REF_AUDIO="./voice-samples/user-reference.wav"
|
|
127
|
+
OPENVOICE_PROGRESS="0"
|
|
128
|
+
```
|
|
129
|
+
|
|
130
|
+
Only clone voices you own or have permission to use. OpenVoice falls back to Edge on failure.
|
|
131
|
+
|
|
132
|
+
## Optional Supertonic TTS
|
|
133
|
+
|
|
134
|
+
```bash
|
|
135
|
+
./scripts/setup_supertonic.sh
|
|
136
|
+
supertonic tts '안녕하세요. 수퍼토닉 테스트입니다.' --lang ko --voice M1 --steps 2 --speed 1.0 -o /tmp/verbalcoding-supertonic.wav
|
|
137
|
+
```
|
|
138
|
+
|
|
139
|
+
## Optional SpeechSwift / CosyVoice TTS
|
|
140
|
+
|
|
141
|
+
```bash
|
|
142
|
+
brew tap soniqo/speech https://github.com/soniqo/speech-swift
|
|
143
|
+
brew install speech
|
|
144
|
+
```
|
|
145
|
+
|
|
146
|
+
Recommended env includes `TTS_BACKEND="speechswift"`, `SPEECHSWIFT_MODE="server"`, `SPEECHSWIFT_ENGINE="cosyvoice"`, `SPEECHSWIFT_REF_AUDIO`, and `SPEECHSWIFT_SERVER_URL`. Keep Edge for quick progress prompts.
|
|
147
|
+
|
|
148
|
+
## Operational Notes
|
|
149
|
+
|
|
150
|
+
Enable Discord Message Content intent, grant voice connect/speak permissions, authenticate the selected CLI harness separately, and avoid reading diffs/log dumps aloud.
|