verbalcoding 0.2.7 → 0.2.9
- package/README.md +12 -27
- package/app-node/cli_install.test.mjs +32 -0
- package/app-node/install_config.mjs +10 -0
- package/docs/FRESH_INSTALL.md +8 -2
- package/docs/assets/figures/verbalcoding-flow.svg +45 -30
- package/docs/i18n/CONFIGURATION.es.md +138 -49
- package/docs/i18n/CONFIGURATION.fr.md +138 -49
- package/docs/i18n/CONFIGURATION.ja.md +137 -48
- package/docs/i18n/CONFIGURATION.ko.md +137 -48
- package/docs/i18n/CONFIGURATION.ru.md +138 -49
- package/docs/i18n/CONFIGURATION.zh.md +137 -48
- package/docs/i18n/FRESH_INSTALL.es.md +115 -32
- package/docs/i18n/FRESH_INSTALL.fr.md +115 -32
- package/docs/i18n/FRESH_INSTALL.ja.md +119 -36
- package/docs/i18n/FRESH_INSTALL.ko.md +120 -37
- package/docs/i18n/FRESH_INSTALL.ru.md +115 -32
- package/docs/i18n/FRESH_INSTALL.zh.md +119 -36
- package/docs/i18n/MULTI_INSTANCE.es.md +85 -26
- package/docs/i18n/MULTI_INSTANCE.fr.md +85 -26
- package/docs/i18n/MULTI_INSTANCE.ja.md +87 -29
- package/docs/i18n/MULTI_INSTANCE.ko.md +87 -29
- package/docs/i18n/MULTI_INSTANCE.ru.md +84 -26
- package/docs/i18n/MULTI_INSTANCE.zh.md +87 -29
- package/docs/i18n/README.es.md +109 -45
- package/docs/i18n/README.fr.md +109 -45
- package/docs/i18n/README.ja.md +109 -45
- package/docs/i18n/README.ko.md +108 -45
- package/docs/i18n/README.ru.md +109 -45
- package/docs/i18n/README.zh.md +108 -45
- package/docs/i18n/RELEASE.es.md +53 -37
- package/docs/i18n/RELEASE.fr.md +53 -37
- package/docs/i18n/RELEASE.ja.md +52 -36
- package/docs/i18n/RELEASE.ko.md +52 -36
- package/docs/i18n/RELEASE.ru.md +53 -37
- package/docs/i18n/RELEASE.zh.md +53 -37
- package/docs/i18n/USAGE.es.md +91 -64
- package/docs/i18n/USAGE.fr.md +91 -64
- package/docs/i18n/USAGE.ja.md +90 -63
- package/docs/i18n/USAGE.ko.md +90 -63
- package/docs/i18n/USAGE.ru.md +91 -64
- package/docs/i18n/USAGE.zh.md +90 -63
- package/package.json +1 -1
- package/scripts/bootstrap_prereqs.sh +15 -3
- package/scripts/cli.mjs +1 -1
- package/scripts/doctor.mjs +173 -8
- package/scripts/install.mjs +2 -0
|
@@ -1,36 +1,40 @@
|
|
|
1
1
|
# VerbalCoding Configuration
|
|
2
2
|
|
|
3
|
-
##
|
|
3
|
+
## Setup wizard
|
|
4
4
|
|
|
5
|
-
|
|
5
|
+
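This guide deliberately does not re-explain Discord bot/application setup from scratch. Complete the Discord-side steps with these upstream guides first, then return to VerbalCoding setup: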
|
|
6
6
|
|
|
7
|
-
- Hermes Agent Discord
|
|
8
|
-
- Discord
|
|
9
|
-
- Discord
|
|
7
|
+
- Hermes Agent Discord messaging guide: <https://hermes-agent.nousresearch.com/docs/user-guide/messaging/discord>
|
|
8
|
+
- Official Discord bot overview: <https://docs.discord.com/developers/bots/overview>
|
|
9
|
+
- Official Discord quick start: <https://docs.discord.com/developers/quick-start/getting-started>
|
|
10
10
|
|
|
11
11
|
```bash
|
|
12
|
-
vc setup --yes
|
|
13
|
-
# or from a clone
|
|
14
12
|
./scripts/install.sh
|
|
15
13
|
```
|
|
16
14
|
|
|
17
|
-
|
|
15
|
+
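The installer asks for the Discord token, allowed users, the names of voice channels to auto-join, the transcript channel/thread, the CLI driver backend, the default voice language, TTS settings, and wake-word behavior. It writes `.env` with mode `0600`; `.env` is ignored by git. It also links the short shell command `vc`.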
|
|
18
16
|
|
|
19
|
-
|
|
17
|
+
If you only need the shell command after a manual install:
|
|
20
18
|
|
|
21
|
-
|
|
19
|
+
```bash
|
|
20
|
+
npm link
|
|
21
|
+
```
|
|
22
|
+
|
|
23
|
+
## Supported agent backends
|
|
22
24
|
|
|
23
|
-
|
|
25
|
+
Set `AGENT_BACKEND` in `.env`.
|
|
26
|
+
|
|
27
|
+
| Backend | Default command | Notes |
|
|
24
28
|
|---|---|---|
|
|
25
|
-
| `hermes` | `hermes chat -Q -q` |
|
|
26
|
-
| `claude-code` / `claude` | `claude -p` |
|
|
27
|
-
| `codex` | `codex exec` |
|
|
28
|
-
| `gemini` | `gemini -p` |
|
|
29
|
-
| `opencode` | `opencode run` |
|
|
30
|
-
| `openclaw` | `openclaw run` |
|
|
31
|
-
| `custom` | `AGENT_COMMAND`
|
|
29
|
+
| `hermes` | `hermes chat -Q -q` | Default. Preserves the `.verbalcoding-session` resume behavior. |
|
|
30
|
+
| `claude-code` / `claude` | `claude -p` | Override with `CLAUDE_COMMAND` or `AGENT_COMMAND`. |
|
|
31
|
+
| `codex` | `codex exec` | Override with `CODEX_COMMAND` or `AGENT_COMMAND`. |
|
|
32
|
+
| `gemini` | `gemini -p` | Override with `GEMINI_COMMAND` or `AGENT_COMMAND`. |
|
|
33
|
+
| `opencode` | `opencode run` | Override with `OPENCODE_COMMAND` or `AGENT_COMMAND`. |
|
|
34
|
+
| `openclaw` | `openclaw run` | Override with `OPENCLAW_COMMAND` or `AGENT_COMMAND`. |
|
|
35
|
+
| `custom` | `AGENT_COMMAND` (required) | The prompt is appended as the final argv argument. |
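The backend table can be read as a lookup with overrides. The sketch below is a minimal illustration, not the package's actual code: the `resolveAgentCommand` helper is hypothetical, and the precedence it applies (backend-specific variable first, then `AGENT_COMMAND`, then the default command) is an assumption, since the table does not state which override wins.

```javascript
// Default commands and override variable names, copied from the backend table.
const BACKENDS = {
  hermes: { defaultCommand: "hermes chat -Q -q", overrideVar: null },
  "claude-code": { defaultCommand: "claude -p", overrideVar: "CLAUDE_COMMAND" },
  claude: { defaultCommand: "claude -p", overrideVar: "CLAUDE_COMMAND" },
  codex: { defaultCommand: "codex exec", overrideVar: "CODEX_COMMAND" },
  gemini: { defaultCommand: "gemini -p", overrideVar: "GEMINI_COMMAND" },
  opencode: { defaultCommand: "opencode run", overrideVar: "OPENCODE_COMMAND" },
  openclaw: { defaultCommand: "openclaw run", overrideVar: "OPENCLAW_COMMAND" },
  custom: { defaultCommand: null, overrideVar: null },
};

// Hypothetical helper: returns the command string for the configured backend.
// Assumed precedence: backend-specific variable, then AGENT_COMMAND, then default.
function resolveAgentCommand(env) {
  const name = env.AGENT_BACKEND || "hermes"; // hermes is the documented default
  if (!Object.prototype.hasOwnProperty.call(BACKENDS, name)) {
    throw new Error(`Unknown AGENT_BACKEND: ${name}`);
  }
  const backend = BACKENDS[name];
  const specific = backend.overrideVar ? env[backend.overrideVar] : undefined;
  const command = specific || env.AGENT_COMMAND || backend.defaultCommand;
  if (!command) {
    // Only the custom backend has no default command.
    throw new Error("AGENT_BACKEND=custom requires AGENT_COMMAND");
  }
  return command;
}
```

Calling `resolveAgentCommand(process.env)` would then yield the string to split into argv before the prompt is appended as the final argument.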
|
|
32
36
|
|
|
33
|
-
|
|
37
|
+
Generic override:
|
|
34
38
|
|
|
35
39
|
```bash
|
|
36
40
|
AGENT_BACKEND=custom
|
|
@@ -43,23 +47,37 @@ UTTERANCE_IDLE_MS=4500
|
|
|
43
47
|
LATENCY_LOG_PATH=./.logs/latency.jsonl
|
|
44
48
|
```
|
|
45
49
|
|
|
46
|
-
##
|
|
50
|
+
## Agent adapter contract
|
|
51
|
+
|
|
52
|
+
The voice bridge talks to every backend through a single adapter contract:
|
|
53
|
+
|
|
54
|
+
- `run({ text }, signal, plan)` returns the status, final answer text, backend label, elapsed time, and optional session metadata.
|
|
55
|
+
- `ask(text, signal, plan)` is a compatibility shortcut that returns only the final answer text.
|
|
56
|
+
- `capabilities` declares whether the backend supports session resume, streaming progress, and cancellation.
|
|
57
|
+
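- Hermes is the reference adapter: session resume, verbose progress streaming, cancellation, and recovery of the final answer from the Hermes session file.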
|
|
58
|
+
|
|
59
|
+
New backends should implement the same contract and keep voice/STT/TTS behavior outside the adapter.
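To make the contract concrete, here is a minimal sketch of an adapter with the three members listed above. It is an illustration under assumptions: the `createEchoAdapter` factory, the result field names (`status`, `text`, `backend`, `elapsedMs`), and the capability keys are hypothetical, and `signal` is assumed to be a standard `AbortSignal`. A real adapter would spawn the backend CLI instead of echoing.

```javascript
// Hypothetical echo adapter; result field names and capability keys are illustrative.
function createEchoAdapter() {
  const adapter = {
    // Declares what the backend supports (key names are assumptions).
    capabilities: { sessionResume: false, streamingProgress: false, cancel: true },

    // Returns status, final answer text, backend label, and elapsed time.
    // Optional session metadata is omitted because this backend has no sessions.
    async run({ text }, signal, plan) {
      const startedAt = Date.now();
      if (signal && signal.aborted) {
        return { status: "cancelled", text: "", backend: "echo", elapsedMs: 0 };
      }
      const answer = `echo: ${text}`;
      return { status: "ok", text: answer, backend: "echo", elapsedMs: Date.now() - startedAt };
    },

    // Compatibility shortcut: resolves to the final answer text only.
    async ask(text, signal, plan) {
      const result = await adapter.run({ text }, signal, plan);
      return result.text;
    },
  };
  return adapter;
}
```

Nothing in the sketch touches audio: speech capture, STT, and TTS stay in the bridge, which only ever sees the text going in and the text coming out.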
|
|
60
|
+
|
|
61
|
+
## `.env` example
|
|
47
62
|
|
|
48
63
|
```bash
|
|
49
64
|
DISCORD_BOT_TOKEN="***"
|
|
50
65
|
DISCORD_ALLOWED_USERS="123456789012345678"
|
|
51
66
|
AUTO_JOIN_VOICE_CHANNELS="일반,General,general"
|
|
52
67
|
TRANSCRIPT_CHANNEL_ID="123456789012345678"
|
|
68
|
+
|
|
53
69
|
AGENT_BACKEND="hermes"
|
|
54
70
|
STT_ENGINE="whisper_cpp"
|
|
55
71
|
WHISPER_CPP_BIN="whisper-cli"
|
|
56
72
|
WHISPER_CPP_MODEL="./models/ggml-small-q5_1.bin"
|
|
73
|
+
|
|
57
74
|
TTS_BACKEND="edge"
|
|
58
75
|
TTS_VOICE_TYPE="korean_female"
|
|
59
76
|
TTS_VOICE="ko-KR-SunHiNeural"
|
|
60
77
|
TTS_RATE="+10%"
|
|
61
78
|
TTS_MAX_CHARS="495"
|
|
62
79
|
TTS_VOLUME="1.0"
|
|
80
|
+
|
|
63
81
|
REQUIRE_WAKE_WORD="0"
|
|
64
82
|
MIN_UTTERANCE_SECONDS="1.0"
|
|
65
83
|
UTTERANCE_IDLE_MS="4500"
|
|
@@ -69,39 +87,60 @@ AGENT_VERBOSE_PROGRESS="0"
|
|
|
69
87
|
LATENCY_LOG_PATH="./.logs/latency.jsonl"
|
|
70
88
|
```
|
|
71
89
|
|
|
72
|
-
## TTS
|
|
90
|
+
## TTS voice selection
|
|
91
|
+
|
|
92
|
+
Language presets and voice selection are separate:
|
|
73
93
|
|
|
74
|
-
`vc language ko|en|auto`
|
|
94
|
+
- `vc language ko|en|auto` changes the STT language, the progress language, and the default voice for that language.
|
|
95
|
+
- Live voice commands such as “남자 한국어 목소리로 바꿔”, “여자 한국어 목소리로 바꿔”, `change voice to Korean female`, and `switch speaker to English` change only the speaker/voice type.
|
|
96
|
+
- `!voice-test <text>` plays a quick sample with the currently selected backend and voice.
|
|
75
97
|
|
|
76
|
-
|
|
98
|
+
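By default, the voice selection is stored in `config/tts-voices.json`. Override the path with `TTS_VOICE_CONFIG`. The running bridge re-reads and applies the voice selection before synthesis, so voice commands take effect without a full restart.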
|
|
77
99
|
|
|
78
|
-
|
|
100
|
+
Default Edge catalog:
|
|
101
|
+
|
|
102
|
+
| `TTS_VOICE_TYPE` | `TTS_VOICE` | Language |
|
|
79
103
|
|---|---|---|
|
|
80
|
-
| `korean_male` | `ko-KR-InJoonNeural` |
|
|
81
|
-
| `korean_female` | `ko-KR-SunHiNeural` |
|
|
82
|
-
| `korean_multilingual_male` | `ko-KR-HyunsuMultilingualNeural` |
|
|
83
|
-
| `english_male` | `en-US-GuyNeural` |
|
|
84
|
-
| `english_female` | `en-US-AriaNeural` |
|
|
104
|
+
| `korean_male` | `ko-KR-InJoonNeural` | Korean |
|
|
105
|
+
| `korean_female` | `ko-KR-SunHiNeural` | Korean |
|
|
106
|
+
| `korean_multilingual_male` | `ko-KR-HyunsuMultilingualNeural` | Korean |
|
|
107
|
+
| `english_male` | `en-US-GuyNeural` | English |
|
|
108
|
+
| `english_female` | `en-US-AriaNeural` | English |
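The Edge catalog is a plain mapping from voice type to voice name. A minimal sketch of the lookup follows; the `resolveEdgeVoice` helper is hypothetical, and the rule that an explicit `TTS_VOICE` takes priority over `TTS_VOICE_TYPE` is an assumption made for illustration.

```javascript
// Built-in Edge voice types, copied from the catalog table.
const EDGE_VOICES = {
  korean_male: "ko-KR-InJoonNeural",
  korean_female: "ko-KR-SunHiNeural",
  korean_multilingual_male: "ko-KR-HyunsuMultilingualNeural",
  english_male: "en-US-GuyNeural",
  english_female: "en-US-AriaNeural",
};

// Hypothetical helper: picks the Edge voice name for the given settings.
// Assumption: an explicit TTS_VOICE wins; otherwise TTS_VOICE_TYPE is mapped
// through the catalog. Returns null when neither yields a voice.
function resolveEdgeVoice(env) {
  if (env.TTS_VOICE) return env.TTS_VOICE;
  const type = env.TTS_VOICE_TYPE;
  if (type && Object.prototype.hasOwnProperty.call(EDGE_VOICES, type)) {
    return EDGE_VOICES[type];
  }
  return null;
}
```

Returning `null` for an unknown type leaves the caller free to decide on a fallback voice rather than hiding a typo in the configuration.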
|
|
109
|
+
|
|
110
|
+
Manual persistent override:
|
|
111
|
+
|
|
112
|
+
```bash
|
|
113
|
+
TTS_BACKEND="edge"
|
|
114
|
+
TTS_VOICE_TYPE="korean_male"
|
|
115
|
+
TTS_VOICE="ko-KR-InJoonNeural"
|
|
116
|
+
TTS_VOICE_CONFIG="config/tts-voices.json"
|
|
117
|
+
```
|
|
85
118
|
|
|
86
|
-
|
|
119
|
+
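For OpenVoice, SpeechSwift, or Supertonic, keep the backend-specific voice/reference settings from the sections below; the same voice catalog file can still track the currently active voice type.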
|
|
87
120
|
|
|
88
|
-
|
|
121
|
+
Backend-specific voice options:
|
|
122
|
+
|
|
123
|
+
| Backend | Settings | Voice selection |
|
|
89
124
|
|---|---|---|
|
|
90
|
-
| Edge | `TTS_VOICE_TYPE`, `TTS_VOICE` |
|
|
91
|
-
| Supertonic | `SUPERTONIC_VOICE`, `SUPERTONIC_LANGUAGE` | `M1`–`M5`, `F1`–`F5
|
|
92
|
-
| OpenVoice | `OPENVOICE_REF_AUDIO`, `OPENVOICE_STYLE`, `OPENVOICE_LANGUAGE` |
|
|
93
|
-
| SpeechSwift / CosyVoice | `SPEECHSWIFT_REF_AUDIO`, `SPEECHSWIFT_ENGINE`, `SPEECHSWIFT_SPEAKER`, `SPEECHSWIFT_MODEL_ID` |
|
|
125
|
+
| Edge | `TTS_VOICE_TYPE`, `TTS_VOICE` | The built-in types above, plus any voice returned by `edge-tts --list-voices` |
|
|
126
|
+
| Supertonic | `SUPERTONIC_VOICE`, `SUPERTONIC_LANGUAGE` | `M1`–`M5`, `F1`–`F5`; languages `ko`, `en`, `es`, `pt`, `fr` |
|
|
127
|
+
| OpenVoice | `OPENVOICE_REF_AUDIO`, `OPENVOICE_STYLE`, `OPENVOICE_LANGUAGE` | A user-supplied reference WAV that you have permission to use; style defaults to `default` |
|
|
128
|
+
| SpeechSwift / CosyVoice | `SPEECHSWIFT_REF_AUDIO`, `SPEECHSWIFT_ENGINE`, `SPEECHSWIFT_SPEAKER`, `SPEECHSWIFT_MODEL_ID` | A reference-sample voice for CosyVoice, or a speaker/model ID supported by the backend |
|
|
94
129
|
|
|
95
|
-
##
|
|
130
|
+
## Utterance segmentation
|
|
96
131
|
|
|
97
|
-
`UTTERANCE_IDLE_MS`
|
|
132
|
+
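`UTTERANCE_IDLE_MS` controls how long the bridge waits after a speech segment before deciding that the user has finished and starting STT. The default is `4500` ms, which preserves longer spoken instructions with natural pauses. Lower values make short commands feel faster but may split long dictation; higher values suit speech with thinking pauses.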
|
|
98
133
|
|
|
99
134
|
```bash
|
|
100
|
-
UTTERANCE_IDLE_MS="4500"
|
|
101
|
-
UTTERANCE_IDLE_MS="6000"
|
|
135
|
+
UTTERANCE_IDLE_MS="4500" # balanced default
|
|
136
|
+
UTTERANCE_IDLE_MS="6000" # safer for long dictation with pauses
|
|
102
137
|
```
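The trade-off behind `UTTERANCE_IDLE_MS` can be sketched with a small grouping function. This is a simplified model, not the bridge's implementation: the `segmentUtterances` name is hypothetical, and speech is represented as a list of chunk arrival times in milliseconds rather than real audio.

```javascript
// Groups speech-chunk timestamps (ms) into utterances. A new utterance starts
// whenever the silence since the previous chunk is at least idleMs.
function segmentUtterances(chunkTimesMs, idleMs) {
  const utterances = [];
  let current = [];
  for (const t of chunkTimesMs) {
    if (current.length > 0 && t - current[current.length - 1] >= idleMs) {
      utterances.push(current); // the pause was long enough: close the utterance
      current = [];
    }
    current.push(t);
  }
  if (current.length > 0) utterances.push(current);
  return utterances;
}

// One dictation with a 5-second thinking pause in the middle.
const chunks = [0, 1000, 6000, 7000];
console.log(segmentUtterances(chunks, 4500).length); // 2: the pause splits the dictation
console.log(segmentUtterances(chunks, 6000).length); // 1: the pause is tolerated
```

With the `4500` ms default the 5-second pause ends the utterance early, while `6000` ms keeps the dictation in one piece at the cost of a longer wait before STT starts.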
|
|
103
138
|
|
|
104
|
-
## MCP
|
|
139
|
+
## MCP server
|
|
140
|
+
|
|
141
|
+
VerbalCoding ships with a stdio MCP server, so Hermes Agent or any MCP client can control the bridge through tools instead of relying on skills or free-form shell commands.
|
|
142
|
+
|
|
143
|
+
Hermes configuration example:
|
|
105
144
|
|
|
106
145
|
```yaml
|
|
107
146
|
mcp_servers:
|
|
@@ -112,39 +151,89 @@ mcp_servers:
|
|
|
112
151
|
connect_timeout: 30
|
|
113
152
|
```
|
|
114
153
|
|
|
115
|
-
|
|
154
|
+
Exposed MCP tools:
|
|
155
|
+
|
|
156
|
+
| Tool | Purpose |
|
|
157
|
+
|---|---|
|
|
158
|
+
| `status` | Report bridge/configuration status without exposing secrets |
|
|
159
|
+
| `doctor` | Run the redacted doctor checks |
|
|
160
|
+
| `set_auto_restart` | Enable/disable automatic restart of the voice bot on commit |
|
|
161
|
+
| `set_language` | Update the STT/progress/TTS language together |
|
|
162
|
+
| `start`, `stop`, `restart` | Control the Discord voice bridge |
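For readers who have not used MCP directly, a client invokes one of these tools with a JSON-RPC `tools/call` request after the usual `initialize` handshake. The sketch below only builds that request object; the `buildToolCall` helper is hypothetical, and the `language` argument name for `set_language` is an assumption, since the tool's input schema is not shown here.

```javascript
// Builds a JSON-RPC 2.0 request that asks an MCP server to run a tool.
// The envelope (method "tools/call", params.name, params.arguments) follows
// the MCP specification; the tool arguments themselves are tool-specific.
function buildToolCall(id, name, args = {}) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Assumed argument name; check the schema the server returns from tools/list.
const request = buildToolCall(1, "set_language", { language: "ko" });
console.log(JSON.stringify(request));
```

In practice an MCP client library or Hermes itself produces these messages, so the sketch is mainly useful for reading server logs or debugging a misbehaving tool call.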
|
|
116
163
|
|
|
117
|
-
##
|
|
164
|
+
## Optional OpenVoice TTS
|
|
165
|
+
|
|
166
|
+
Edge TTS remains the default and the fallback. To try local voice cloning with OpenVoice V2:
|
|
118
167
|
|
|
119
168
|
```bash
|
|
120
169
|
./scripts/setup_openvoice.sh
|
|
170
|
+
# Download checkpoints_v2_0417.zip from the OpenVoice docs and extract it to vendor/OpenVoice/checkpoints_v2/
|
|
171
|
+
mkdir -p voice-samples
|
|
172
|
+
# Place a reference sample you have permission to use at voice-samples/user-reference.wav,
|
|
173
|
+
# or capture one in Discord with !voice-clone capture.
|
|
121
174
|
python3 integrations/openvoice/synth.py --openvoice-dir vendor/OpenVoice --ref-audio voice-samples/user-reference.wav --text '안녕하세요. 버벌코딩 목소리 복제 테스트입니다.' --output /tmp/verbalcoding-openvoice-smoke.wav
|
|
122
175
|
```
|
|
123
176
|
|
|
177
|
+
Then set:
|
|
178
|
+
|
|
124
179
|
```bash
|
|
125
180
|
TTS_BACKEND="openvoice"
|
|
126
181
|
OPENVOICE_REF_AUDIO="./voice-samples/user-reference.wav"
|
|
127
182
|
OPENVOICE_PROGRESS="0"
|
|
128
183
|
```
|
|
129
184
|
|
|
130
|
-
|
|
185
|
+
Only clone voices that you own or have permission to use. If OpenVoice fails or times out, VerbalCoding falls back to Edge TTS.
|
|
131
186
|
|
|
132
|
-
##
|
|
187
|
+
## Optional Supertonic TTS
|
|
133
188
|
|
|
134
189
|
```bash
|
|
135
190
|
./scripts/setup_supertonic.sh
|
|
136
191
|
supertonic tts '안녕하세요. 수퍼토닉 테스트입니다.' --lang ko --voice M1 --steps 2 --speed 1.0 -o /tmp/verbalcoding-supertonic.wav
|
|
137
192
|
```
|
|
138
193
|
|
|
139
|
-
|
|
194
|
+
Then set:
|
|
195
|
+
|
|
196
|
+
```bash
|
|
197
|
+
TTS_BACKEND="supertonic"
|
|
198
|
+
SUPERTONIC_COMMAND="./.venv-supertonic/bin/supertonic"
|
|
199
|
+
SUPERTONIC_VOICE="M1"
|
|
200
|
+
SUPERTONIC_LANGUAGE="ko"
|
|
201
|
+
SUPERTONIC_STEPS="2"
|
|
202
|
+
SUPERTONIC_SPEED="1.0"
|
|
203
|
+
SUPERTONIC_PROGRESS="0"
|
|
204
|
+
```
|
|
205
|
+
|
|
206
|
+
If Supertonic is missing, fails, or times out, VerbalCoding falls back to Edge TTS.
|
|
207
|
+
|
|
208
|
+
## Optional SpeechSwift / CosyVoice TTS
|
|
209
|
+
|
|
210
|
+
On Apple Silicon, `speech-swift` is a local backend for Korean voice cloning, built on MLX-native CosyVoice/Qwen3-TTS.
|
|
140
211
|
|
|
141
212
|
```bash
|
|
142
213
|
brew tap soniqo/speech https://github.com/soniqo/speech-swift
|
|
143
214
|
brew install speech
|
|
144
215
|
```
|
|
145
216
|
|
|
146
|
-
|
|
217
|
+
Recommended env:
|
|
218
|
+
|
|
219
|
+
```bash
|
|
220
|
+
TTS_BACKEND="speechswift"
|
|
221
|
+
SPEECHSWIFT_MODE="server"
|
|
222
|
+
SPEECHSWIFT_ENGINE="cosyvoice"
|
|
223
|
+
SPEECHSWIFT_LANGUAGE="korean"
|
|
224
|
+
SPEECHSWIFT_REF_AUDIO="./voice-samples/user-reference.wav"
|
|
225
|
+
SPEECHSWIFT_SERVER_HOST="127.0.0.1"
|
|
226
|
+
SPEECHSWIFT_SERVER_PORT="18080"
|
|
227
|
+
SPEECHSWIFT_SERVER_URL="http://127.0.0.1:18080"
|
|
228
|
+
SPEECHSWIFT_PROGRESS="0"
|
|
229
|
+
```
|
|
230
|
+
|
|
231
|
+
Keep Edge for fast progress/echo prompts.
|
|
147
232
|
|
|
148
|
-
##
|
|
233
|
+
## Operational notes
|
|
149
234
|
|
|
150
|
-
|
|
235
|
+
- The bot needs the privileged Discord Message Content intent enabled in order to use text commands.
|
|
236
|
+
- The bot needs voice channel Connect/Speak permissions.
|
|
237
|
+
- For Hermes Agent, configure/authenticate Hermes normally on the default profile (`hermes setup`, `hermes login`, etc.).
|
|
238
|
+
- For Claude Code, Codex, Gemini, OpenCode, and OpenClaw, install and authenticate each CLI separately.
|
|
239
|
+
- If a CLI emits diff/code output on a timeout or signal failure, the bridge avoids reading it aloud and sends the detailed text instead.
|
|
@@ -1,21 +1,28 @@
|
|
|
1
1
|
# Fresh install
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
This guide is for a clean public install. It avoids local assumptions and uses the installer to bootstrap as much as possible.
|
|
4
4
|
|
|
5
|
-
## 1.
|
|
5
|
+
## 1. Install the CLI
|
|
6
|
+
|
|
7
|
+
Recommended npm path:
|
|
6
8
|
|
|
7
9
|
```bash
|
|
8
10
|
npm install -g verbalcoding
|
|
9
|
-
vc setup --yes
|
|
10
11
|
```
|
|
11
12
|
|
|
12
|
-
|
|
13
|
+
Or run the published package directly:
|
|
13
14
|
|
|
14
15
|
```bash
|
|
15
16
|
npx verbalcoding setup --yes
|
|
16
17
|
```
|
|
17
18
|
|
|
18
|
-
|
|
19
|
+
If you used `npm install -g`, continue with:
|
|
20
|
+
|
|
21
|
+
```bash
|
|
22
|
+
vc setup --yes
|
|
23
|
+
```
|
|
24
|
+
|
|
25
|
+
GitHub clone path for contributors:
|
|
19
26
|
|
|
20
27
|
```bash
|
|
21
28
|
git clone https://github.com/ca1773130n/VerbalCoding.git
|
|
@@ -23,47 +30,105 @@ cd VerbalCoding
|
|
|
23
30
|
./scripts/install.sh --yes
|
|
24
31
|
```
|
|
25
32
|
|
|
26
|
-
## 2.
|
|
33
|
+
## 2. Bootstrap dependencies and run the setup wizard
|
|
27
34
|
|
|
28
|
-
|
|
35
|
+
In an npm install, do not run `./scripts/install.sh` directly; there is no repository checkout in your current directory. Use the packaged CLI wrapper instead:
|
|
29
36
|
|
|
30
|
-
|
|
37
|
+
```bash
|
|
38
|
+
vc setup --yes
|
|
39
|
+
```
|
|
40
|
+
|
|
41
|
+
`vc setup` runs the `scripts/install.sh` bundled inside the installed npm package. Use `./scripts/install.sh --yes` only when you are inside a GitHub clone:
|
|
31
42
|
|
|
32
43
|
```bash
|
|
33
|
-
|
|
34
|
-
|
|
35
|
-
|
|
36
|
-
|
|
37
|
-
|
|
44
|
+
./scripts/install.sh --yes
|
|
45
|
+
```
|
|
46
|
+
|
|
47
|
+
What this does:
|
|
48
|
+
|
|
49
|
+
- installs npm dependencies when `node_modules/` is missing,
|
|
50
|
+
- installs the short `vc` shell command with `npm link`,
|
|
51
|
+
- installs `ffmpeg`, Node/npm, and `whisper-cli` when the OS package manager supports it,
|
|
52
|
+
- downloads `models/ggml-small-q5_1.bin`,
|
|
53
|
+
- creates `.venv-tts` and installs `edge-tts` when `edge-tts` is not already on `PATH`,
|
|
54
|
+
- runs the interactive `.env` wizard.
|
|
55
|
+
|
|
56
|
+
Supported system bootstrap paths:
|
|
57
|
+
|
|
58
|
+
| OS | System dependency path |
|
|
59
|
+
|---|---|
|
|
60
|
+
| macOS | Homebrew: `brew install node ffmpeg whisper-cpp` as needed |
|
|
61
|
+
| Debian/Ubuntu | `apt-get` for Node/npm, ffmpeg, Python, and build tools; local whisper.cpp build as a fallback |
|
|
62
|
+
| Fedora/RHEL | `dnf` for Node/npm, ffmpeg, Python, and build tools; local whisper.cpp build as a fallback |
|
|
63
|
+
| Arch | `pacman` for Node/npm, ffmpeg, Python, and build tools; local whisper.cpp build as a fallback |
|
|
64
|
+
|
|
65
|
+
Useful installer variants:
|
|
66
|
+
|
|
67
|
+
```bash
|
|
68
|
+
vc setup --yes --no-wizard # dependency/bootstrap only from npm install
|
|
69
|
+
./scripts/install.sh --yes --no-wizard # dependency/bootstrap only from a clone
|
|
70
|
+
./scripts/install.sh --skip-system # do not install OS packages
|
|
71
|
+
./scripts/install.sh --skip-model # do not download the default STT model
|
|
72
|
+
./scripts/install.sh --skip-edge-tts # do not create .venv-tts
|
|
38
73
|
VERBALCODING_SKIP_CLI_LINK=1 ./scripts/install.sh --yes
|
|
39
74
|
```
|
|
40
75
|
|
|
41
|
-
|
|
76
|
+
If your OS is not supported, install these manually before re-running:
|
|
77
|
+
|
|
78
|
+
- Node.js 20+ and npm
|
|
79
|
+
- ffmpeg
|
|
80
|
+
- Python 3 with venv/pip
|
|
81
|
+
- `whisper-cli` from whisper.cpp
|
|
82
|
+
- an authenticated CLI agent backend, Hermes Agent by default
|
|
42
83
|
|
|
43
|
-
## 3.
|
|
84
|
+
## 3. Discord application setup
|
|
44
85
|
|
|
45
|
-
|
|
86
|
+
Read the original Discord bot setup guides first if this is your first bot:
|
|
46
87
|
|
|
47
|
-
-
|
|
48
|
-
-
|
|
49
|
-
-
|
|
88
|
+
- Hermes Agent Discord messaging guide: <https://hermes-agent.nousresearch.com/docs/user-guide/messaging/discord>
|
|
89
|
+
- Official Discord bot overview: <https://docs.discord.com/developers/bots/overview>
|
|
90
|
+
- Official Discord getting-started guide: <https://docs.discord.com/developers/quick-start/getting-started>
|
|
50
91
|
|
|
51
|
-
|
|
92
|
+
Those pages show how to create a Discord application, add a bot user, enable privileged intents, and invite it to a server. VerbalCoding uses the same Discord bot setup and then adds voice receive, STT, CLI agent execution, and TTS playback on top.
|
|
93
|
+
|
|
94
|
+
1. Create a Discord application and bot in the Discord Developer Portal.
|
|
95
|
+
2. Enable the privileged Message Content intent.
|
|
96
|
+
3. Copy the bot token into the installer prompt or into `.env` as `DISCORD_BOT_TOKEN`.
|
|
97
|
+
4. Generate an invite URL:
|
|
52
98
|
|
|
53
99
|
```bash
|
|
54
100
|
vc bot invite <discord-client-id>
|
|
101
|
+
# or pin it to one server:
|
|
55
102
|
vc bot invite <discord-client-id> --guild <guild-id>
|
|
56
103
|
```
|
|
57
104
|
|
|
58
|
-
|
|
105
|
+
The invite includes the bot and slash-command scopes, plus the text/voice permissions used by VerbalCoding.
|
|
106
|
+
|
|
107
|
+
## 4. Verify
|
|
59
108
|
|
|
60
109
|
```bash
|
|
61
110
|
vc doctor
|
|
62
111
|
```
|
|
63
112
|
|
|
64
|
-
`vc doctor`
|
|
113
|
+
`vc doctor` output is redacted: it reports missing tokens/commands/models without printing secret values. When repairable local prerequisites are missing (`ffmpeg`, `whisper-cli`, the default model, or the Edge TTS helper), it first re-runs the packaged bootstrap automatically. Fix any remaining `✗` items and run it again.
|
|
114
|
+
|
|
115
|
+
Expected success includes:
|
|
116
|
+
|
|
117
|
+
```text
|
|
118
|
+
✓ Node.js
|
|
119
|
+
✓ npm
|
|
120
|
+
✓ ffmpeg
|
|
121
|
+
✓ whisper-cli
|
|
122
|
+
✓ whisper.cpp model
|
|
123
|
+
✓ Discord bot token configured — [REDACTED]
|
|
124
|
+
✓ edge-tts
|
|
125
|
+
✓ hermes CLI
|
|
126
|
+
Doctor passed. Run vc start to start VerbalCoding.
|
|
127
|
+
```
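The redaction idea behind this output can be sketched as a check that records only whether a secret is present. This is an illustration, not the code in `scripts/doctor.mjs`: the `describeSecret` and `renderCheck` helpers and the exact line layout are hypothetical.

```javascript
// Hypothetical check builder: reports whether a secret is set, never its value.
function describeSecret(label, value) {
  const configured = typeof value === "string" && value.trim() !== "";
  return { ok: configured, label, detail: configured ? "[REDACTED]" : "missing" };
}

// Renders one doctor-style line with a pass/fail mark.
function renderCheck(check) {
  const mark = check.ok ? "✓" : "✗";
  return `${mark} ${check.label}: ${check.detail}`;
}

// Placeholder token for the demo; a real run would read DISCORD_BOT_TOKEN.
const line = renderCheck(describeSecret("Discord bot token configured", "placeholder-token"));
console.log(line);
```

Because the secret never enters the returned object, nothing downstream (logs, MCP tool results, bug reports) can leak it by accident.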
|
|
128
|
+
|
|
129
|
+
If the installer created a local Edge TTS helper, `.env` should contain an `EDGE_TTS_COMMAND` path pointing to `.venv-tts/bin/edge-tts`.
|
|
65
130
|
|
|
66
|
-
## 5.
|
|
131
|
+
## 5. Run the single default bot
|
|
67
132
|
|
|
68
133
|
```bash
|
|
69
134
|
vc start
|
|
@@ -71,14 +136,14 @@ vc start
|
|
|
71
136
|
./run.sh
|
|
72
137
|
```
|
|
73
138
|
|
|
74
|
-
|
|
139
|
+
Successful startup logs include:
|
|
75
140
|
|
|
76
141
|
```text
|
|
77
142
|
Logged in as <bot-name>
|
|
78
143
|
Listening in voice channel <server> / <channel>
|
|
79
144
|
```
|
|
80
145
|
|
|
81
|
-
|
|
146
|
+
In Discord:
|
|
82
147
|
|
|
83
148
|
```text
|
|
84
149
|
!ping
|
|
@@ -87,11 +152,11 @@ In Discord:
|
|
|
87
152
|
!verbose on
|
|
88
153
|
```
|
|
89
154
|
|
|
90
|
-
|
|
155
|
+
Then speak in the configured voice channel. You should see STT text, progress text when verbose mode is on, a final text answer, and hear TTS playback.
|
|
91
156
|
|
|
92
|
-
## 6.
|
|
157
|
+
## 6. One-project-per-room setup
|
|
93
158
|
|
|
94
|
-
|
|
159
|
+
For a permanent bot per project voice room, create one Discord application per project and then:
|
|
95
160
|
|
|
96
161
|
```bash
|
|
97
162
|
vc instance setup my-project
|
|
@@ -100,9 +165,11 @@ vc instance start my-project
|
|
|
100
165
|
vc instance status my-project
|
|
101
166
|
```
|
|
102
167
|
|
|
103
|
-
|
|
168
|
+
Each instance writes an ignored `instances/<name>.env` with its own token, voice channel, transcript destination, log path, Hermes session file, and optional Hermes profile.
|
|
169
|
+
|
|
170
|
+
## 7. Optional OpenVoice setup
|
|
104
171
|
|
|
105
|
-
|
|
172
|
+
OpenVoice voice cloning is optional. Keep `TTS_BACKEND=edge` for a fresh public install. To enable OpenVoice later:
|
|
106
173
|
|
|
107
174
|
```bash
|
|
108
175
|
./scripts/setup_openvoice.sh
|
|
@@ -112,13 +179,29 @@ Keep `TTS_BACKEND=edge` for a fresh install. To enable OpenVoice later:
|
|
|
112
179
|
python3 integrations/openvoice/synth.py --openvoice-dir vendor/OpenVoice --ref-audio voice-samples/user-reference.wav --text '안녕하세요. 버벌코딩 목소리 복제 테스트입니다.' --output /tmp/verbalcoding-openvoice-smoke.wav
|
|
113
180
|
```
|
|
114
181
|
|
|
115
|
-
|
|
182
|
+
Then set `TTS_BACKEND=openvoice`, run `vc doctor`, and try `!voice-test <text>` in Discord.
|
|
183
|
+
|
|
184
|
+
## 8. Clean-clone smoke test for maintainers
|
|
116
185
|
|
|
117
|
-
|
|
186
|
+
Host-only smoke test:
|
|
118
187
|
|
|
119
188
|
```bash
|
|
189
|
+
TMPDIR=$(mktemp -d)
|
|
190
|
+
git clone https://github.com/ca1773130n/VerbalCoding.git "$TMPDIR/VerbalCoding"
|
|
191
|
+
cd "$TMPDIR/VerbalCoding"
|
|
120
192
|
./scripts/install.sh --yes --no-wizard
|
|
121
193
|
npm pack --dry-run
|
|
194
|
+
cp .env.example .env
|
|
195
|
+
chmod 600 .env
|
|
122
196
|
vc doctor || true
|
|
197
|
+
```
|
|
198
|
+
|
|
199
|
+
The expected failure at this point is missing local secrets or an unauthenticated agent CLI, not leaked tokens or missing install scripts.
|
|
200
|
+
|
|
201
|
+
Docker-based Ubuntu clean-install smoke test:
|
|
202
|
+
|
|
203
|
+
```bash
|
|
123
204
|
./scripts/docker_ubuntu_smoke.sh
|
|
124
205
|
```
|
|
206
|
+
|
|
207
|
+
This runs `ubuntu:24.04`, copies the tracked repository tree into a clean container, runs `./scripts/install.sh --yes --no-wizard`, writes a test `.env` without secrets, checks `vc`, runs the Node tests, and verifies `vc doctor`. It does not connect to Discord voice; use a real Ubuntu VM or WSL2 afterwards if you need an end-to-end test with a voice channel.
|