verbalcoding 0.2.11 → 0.2.13

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (235)
  1. package/.env.example +98 -2
  2. package/README.es.md +134 -0
  3. package/README.fr.md +134 -0
  4. package/README.ja.md +134 -0
  5. package/README.ko.md +134 -0
  6. package/README.md +118 -74
  7. package/README.ru.md +134 -0
  8. package/README.zh.md +133 -0
  9. package/app-node/agent_adapters.mjs +37 -5
  10. package/app-node/agent_adapters.test.mjs +27 -1
  11. package/app-node/agent_detect.mjs +73 -0
  12. package/app-node/agent_detect.test.mjs +77 -0
  13. package/app-node/agent_routing.mjs +148 -0
  14. package/app-node/agent_routing.test.mjs +138 -0
  15. package/app-node/agent_turn.mjs +86 -0
  16. package/app-node/agent_turn.test.mjs +109 -0
  17. package/app-node/bridge_context.mjs +73 -0
  18. package/app-node/bridge_context.test.mjs +54 -0
  19. package/app-node/bridge_state.mjs +4 -0
  20. package/app-node/bridge_wireup.test.mjs +462 -0
  21. package/app-node/cli_install.test.mjs +31 -0
  22. package/app-node/cross_agent_routing.test.mjs +78 -0
  23. package/app-node/discord_command_router.mjs +204 -0
  24. package/app-node/discord_command_router.test.mjs +311 -0
  25. package/app-node/discord_voice_setup.mjs +251 -0
  26. package/app-node/discord_voice_setup.test.mjs +86 -0
  27. package/app-node/hermes_profiles.test.mjs +12 -1
  28. package/app-node/install_config.mjs +113 -3
  29. package/app-node/install_config.test.mjs +8 -0
  30. package/app-node/instance_doctor.test.mjs +9 -0
  31. package/app-node/instances.test.mjs +8 -1
  32. package/app-node/main.mjs +513 -1058
  33. package/app-node/mcp_tools.test.mjs +7 -0
  34. package/app-node/notification_handler.mjs +89 -0
  35. package/app-node/notification_handler.test.mjs +187 -0
  36. package/app-node/notify.mjs +73 -0
  37. package/app-node/notify.test.mjs +68 -0
  38. package/app-node/plan_dispatcher.mjs +215 -0
  39. package/app-node/plan_dispatcher.test.mjs +101 -0
  40. package/app-node/plan_mode.mjs +203 -0
  41. package/app-node/plan_mode.test.mjs +231 -0
  42. package/app-node/progress_handler.mjs +220 -0
  43. package/app-node/progress_handler.test.mjs +193 -0
  44. package/app-node/progress_speech.mjs +54 -32
  45. package/app-node/progress_speech.test.mjs +12 -3
  46. package/app-node/project_sessions.mjs +5 -2
  47. package/app-node/project_sessions.test.mjs +7 -0
  48. package/app-node/research_mode.mjs +282 -0
  49. package/app-node/research_mode.test.mjs +264 -0
  50. package/app-node/restart_notice.mjs +3 -0
  51. package/app-node/restart_notice.test.mjs +11 -0
  52. package/app-node/session_ontology.mjs +271 -0
  53. package/app-node/session_ontology.test.mjs +130 -0
  54. package/app-node/smart_progress.mjs +94 -0
  55. package/app-node/smart_progress.test.mjs +66 -0
  56. package/app-node/stream_sentencer.mjs +91 -0
  57. package/app-node/stream_sentencer.test.mjs +129 -0
  58. package/app-node/streaming_tts_queue.mjs +52 -0
  59. package/app-node/streaming_tts_queue.test.mjs +64 -0
  60. package/app-node/stt_whisper.mjs +24 -0
  61. package/app-node/stt_whisper.test.mjs +32 -0
  62. package/app-node/text_routing.mjs +22 -0
  63. package/app-node/text_routing.test.mjs +23 -1
  64. package/app-node/tts_backends.mjs +537 -3
  65. package/app-node/tts_backends.test.mjs +454 -0
  66. package/app-node/tts_player.mjs +164 -0
  67. package/app-node/tts_player.test.mjs +202 -0
  68. package/app-node/tts_runtime.mjs +134 -0
  69. package/app-node/tts_runtime.test.mjs +89 -0
  70. package/app-node/tts_settings.mjs +150 -3
  71. package/app-node/tts_settings.test.mjs +204 -0
  72. package/app-node/tts_voice_config.mjs +136 -2
  73. package/app-node/tts_voice_config.test.mjs +94 -0
  74. package/app-node/utterance_router.mjs +216 -0
  75. package/app-node/utterance_router.test.mjs +236 -0
  76. package/app-node/voice_autojoin.mjs +37 -0
  77. package/app-node/voice_autojoin.test.mjs +59 -0
  78. package/app-node/voice_io.mjs +272 -0
  79. package/app-node/voice_io.test.mjs +102 -0
  80. package/app-node/voice_turn_runner.mjs +449 -0
  81. package/app-node/voice_turn_runner.test.mjs +289 -0
  82. package/docs/CONFIGURATION.md +79 -96
  83. package/docs/FRESH_INSTALL.md +105 -63
  84. package/docs/HARNESSES.md +58 -0
  85. package/docs/HARNESS_AIDER.md +50 -0
  86. package/docs/HARNESS_CLAUDE.md +56 -0
  87. package/docs/HARNESS_CODEX.md +56 -0
  88. package/docs/HARNESS_CURSOR.md +45 -0
  89. package/docs/HARNESS_GEMINI.md +45 -0
  90. package/docs/HARNESS_HERMES.md +57 -0
  91. package/docs/HARNESS_OPENCLAW.md +44 -0
  92. package/docs/HARNESS_OPENCODE.md +44 -0
  93. package/docs/HERMES_VOICE.md +65 -0
  94. package/docs/MULTI_INSTANCE.md +16 -0
  95. package/docs/README.md +50 -0
  96. package/docs/RELEASE.md +42 -19
  97. package/docs/ROADMAP.md +53 -0
  98. package/docs/TROUBLESHOOTING.md +126 -0
  99. package/docs/TTS_BACKENDS.md +227 -0
  100. package/docs/USAGE.md +94 -40
  101. package/docs/assets/figures/verbalcoding-flow.svg +1 -1
  102. package/docs/i18n/AGENTS.es.md +34 -0
  103. package/docs/i18n/AGENTS.fr.md +34 -0
  104. package/docs/i18n/AGENTS.ja.md +34 -0
  105. package/docs/i18n/AGENTS.ko.md +34 -0
  106. package/docs/i18n/AGENTS.ru.md +34 -0
  107. package/docs/i18n/AGENTS.zh.md +34 -0
  108. package/docs/i18n/CONFIGURATION.es.md +25 -0
  109. package/docs/i18n/CONFIGURATION.fr.md +25 -0
  110. package/docs/i18n/CONFIGURATION.ja.md +25 -0
  111. package/docs/i18n/CONFIGURATION.ko.md +25 -0
  112. package/docs/i18n/CONFIGURATION.ru.md +25 -0
  113. package/docs/i18n/CONFIGURATION.zh.md +25 -0
  114. package/docs/i18n/FRESH_INSTALL.es.md +27 -2
  115. package/docs/i18n/FRESH_INSTALL.fr.md +27 -2
  116. package/docs/i18n/FRESH_INSTALL.ja.md +27 -2
  117. package/docs/i18n/FRESH_INSTALL.ko.md +27 -2
  118. package/docs/i18n/FRESH_INSTALL.ru.md +27 -2
  119. package/docs/i18n/FRESH_INSTALL.zh.md +27 -2
  120. package/docs/i18n/HARNESSES.es.md +58 -0
  121. package/docs/i18n/HARNESSES.fr.md +58 -0
  122. package/docs/i18n/HARNESSES.ja.md +58 -0
  123. package/docs/i18n/HARNESSES.ko.md +58 -0
  124. package/docs/i18n/HARNESSES.ru.md +58 -0
  125. package/docs/i18n/HARNESSES.zh.md +58 -0
  126. package/docs/i18n/HARNESS_AIDER.es.md +48 -0
  127. package/docs/i18n/HARNESS_AIDER.fr.md +48 -0
  128. package/docs/i18n/HARNESS_AIDER.ja.md +50 -0
  129. package/docs/i18n/HARNESS_AIDER.ko.md +50 -0
  130. package/docs/i18n/HARNESS_AIDER.ru.md +48 -0
  131. package/docs/i18n/HARNESS_AIDER.zh.md +48 -0
  132. package/docs/i18n/HARNESS_CLAUDE.es.md +55 -0
  133. package/docs/i18n/HARNESS_CLAUDE.fr.md +55 -0
  134. package/docs/i18n/HARNESS_CLAUDE.ja.md +56 -0
  135. package/docs/i18n/HARNESS_CLAUDE.ko.md +56 -0
  136. package/docs/i18n/HARNESS_CLAUDE.ru.md +55 -0
  137. package/docs/i18n/HARNESS_CLAUDE.zh.md +56 -0
  138. package/docs/i18n/HARNESS_CODEX.es.md +55 -0
  139. package/docs/i18n/HARNESS_CODEX.fr.md +55 -0
  140. package/docs/i18n/HARNESS_CODEX.ja.md +56 -0
  141. package/docs/i18n/HARNESS_CODEX.ko.md +56 -0
  142. package/docs/i18n/HARNESS_CODEX.ru.md +55 -0
  143. package/docs/i18n/HARNESS_CODEX.zh.md +56 -0
  144. package/docs/i18n/HARNESS_CURSOR.es.md +42 -0
  145. package/docs/i18n/HARNESS_CURSOR.fr.md +42 -0
  146. package/docs/i18n/HARNESS_CURSOR.ja.md +45 -0
  147. package/docs/i18n/HARNESS_CURSOR.ko.md +45 -0
  148. package/docs/i18n/HARNESS_CURSOR.ru.md +42 -0
  149. package/docs/i18n/HARNESS_CURSOR.zh.md +42 -0
  150. package/docs/i18n/HARNESS_GEMINI.es.md +44 -0
  151. package/docs/i18n/HARNESS_GEMINI.fr.md +44 -0
  152. package/docs/i18n/HARNESS_GEMINI.ja.md +45 -0
  153. package/docs/i18n/HARNESS_GEMINI.ko.md +45 -0
  154. package/docs/i18n/HARNESS_GEMINI.ru.md +44 -0
  155. package/docs/i18n/HARNESS_GEMINI.zh.md +45 -0
  156. package/docs/i18n/HARNESS_HERMES.es.md +54 -0
  157. package/docs/i18n/HARNESS_HERMES.fr.md +54 -0
  158. package/docs/i18n/HARNESS_HERMES.ja.md +57 -0
  159. package/docs/i18n/HARNESS_HERMES.ko.md +57 -0
  160. package/docs/i18n/HARNESS_HERMES.ru.md +54 -0
  161. package/docs/i18n/HARNESS_HERMES.zh.md +57 -0
  162. package/docs/i18n/HARNESS_OPENCLAW.es.md +41 -0
  163. package/docs/i18n/HARNESS_OPENCLAW.fr.md +41 -0
  164. package/docs/i18n/HARNESS_OPENCLAW.ja.md +44 -0
  165. package/docs/i18n/HARNESS_OPENCLAW.ko.md +44 -0
  166. package/docs/i18n/HARNESS_OPENCLAW.ru.md +41 -0
  167. package/docs/i18n/HARNESS_OPENCLAW.zh.md +42 -0
  168. package/docs/i18n/HARNESS_OPENCODE.es.md +41 -0
  169. package/docs/i18n/HARNESS_OPENCODE.fr.md +41 -0
  170. package/docs/i18n/HARNESS_OPENCODE.ja.md +44 -0
  171. package/docs/i18n/HARNESS_OPENCODE.ko.md +44 -0
  172. package/docs/i18n/HARNESS_OPENCODE.ru.md +41 -0
  173. package/docs/i18n/HARNESS_OPENCODE.zh.md +44 -0
  174. package/docs/i18n/HERMES_VOICE.es.md +46 -0
  175. package/docs/i18n/HERMES_VOICE.fr.md +46 -0
  176. package/docs/i18n/HERMES_VOICE.ja.md +46 -0
  177. package/docs/i18n/HERMES_VOICE.ko.md +65 -0
  178. package/docs/i18n/HERMES_VOICE.ru.md +46 -0
  179. package/docs/i18n/HERMES_VOICE.zh.md +46 -0
  180. package/docs/i18n/MULTI_INSTANCE.es.md +25 -0
  181. package/docs/i18n/MULTI_INSTANCE.fr.md +25 -0
  182. package/docs/i18n/MULTI_INSTANCE.ja.md +25 -0
  183. package/docs/i18n/MULTI_INSTANCE.ko.md +25 -0
  184. package/docs/i18n/MULTI_INSTANCE.ru.md +25 -0
  185. package/docs/i18n/MULTI_INSTANCE.zh.md +25 -0
  186. package/docs/i18n/README.es.md +20 -134
  187. package/docs/i18n/README.fr.md +20 -134
  188. package/docs/i18n/README.ja.md +20 -134
  189. package/docs/i18n/README.ko.md +20 -133
  190. package/docs/i18n/README.ru.md +20 -134
  191. package/docs/i18n/README.zh.md +20 -133
  192. package/docs/i18n/RELEASE.es.md +26 -1
  193. package/docs/i18n/RELEASE.fr.md +26 -1
  194. package/docs/i18n/RELEASE.ja.md +26 -1
  195. package/docs/i18n/RELEASE.ko.md +26 -1
  196. package/docs/i18n/RELEASE.ru.md +26 -1
  197. package/docs/i18n/RELEASE.zh.md +26 -1
  198. package/docs/i18n/TROUBLESHOOTING.es.md +39 -0
  199. package/docs/i18n/TROUBLESHOOTING.fr.md +39 -0
  200. package/docs/i18n/TROUBLESHOOTING.ja.md +39 -0
  201. package/docs/i18n/TROUBLESHOOTING.ko.md +39 -0
  202. package/docs/i18n/TROUBLESHOOTING.ru.md +39 -0
  203. package/docs/i18n/TROUBLESHOOTING.zh.md +39 -0
  204. package/docs/i18n/USAGE.es.md +25 -0
  205. package/docs/i18n/USAGE.fr.md +25 -0
  206. package/docs/i18n/USAGE.ja.md +25 -0
  207. package/docs/i18n/USAGE.ko.md +25 -0
  208. package/docs/i18n/USAGE.ru.md +25 -0
  209. package/docs/i18n/USAGE.zh.md +25 -0
  210. package/docs/superpowers/plans/2026-05-13-phase1-streaming-pipeline.md +122 -0
  211. package/docs/superpowers/plans/2026-05-13-phase10-push-notifications.md +152 -0
  212. package/docs/superpowers/plans/2026-05-13-phase2-agent-adapters.md +242 -0
  213. package/docs/superpowers/plans/2026-05-13-phase6-smart-progress.md +172 -0
  214. package/docs/superpowers/plans/2026-05-13-phase7-voice-plan-mode.md +108 -0
  215. package/docs/superpowers/plans/2026-05-14-cross-agent-voice-transfer.md +625 -0
  216. package/docs/superpowers/plans/2026-05-21-audio-overview-narrated-diffs.md +95 -0
  217. package/docs/superpowers/plans/2026-05-21-autoresearch-ontology.md +83 -0
  218. package/docs/superpowers/plans/2026-05-21-phase11-push-to-talk-wakeword-v2.md +77 -0
  219. package/docs/superpowers/plans/2026-05-21-phase12-multi-user-voice.md +147 -0
  220. package/docs/superpowers/plans/2026-05-21-phase14-verbalbench.md +136 -0
  221. package/docs/superpowers/plans/2026-05-21-phase15-phone-companion.md +72 -0
  222. package/integrations/fireredtts2/mlx_llm.py +183 -0
  223. package/integrations/fireredtts2/synth.py +156 -0
  224. package/integrations/fireredtts2/synth_mlx.py +196 -0
  225. package/integrations/mlxaudio/synth.py +74 -0
  226. package/integrations/neuttsair/synth.py +104 -0
  227. package/integrations/omnivoice/synth.py +110 -0
  228. package/package.json +7 -1
  229. package/scripts/cli.mjs +88 -3
  230. package/scripts/doctor.mjs +115 -4
  231. package/scripts/install.mjs +20 -2
  232. package/scripts/install_fireredtts2.sh +109 -0
  233. package/scripts/install_mlxaudio.sh +34 -0
  234. package/scripts/install_mossttsnano.sh +46 -0
  235. package/scripts/postinstall.mjs +34 -0
package/.env.example CHANGED
@@ -1,17 +1,44 @@
  # Copy to .env and fill local values. Do not commit .env.
+ # Preferred setup commands:
+ # vc setup token
+ # vc setup channels "General,Team Voice"
 
  DISCORD_BOT_TOKEN=""
+ DISCORD_CLIENT_ID=""
  DISCORD_ALLOWED_USERS=""
  AUTO_JOIN_VOICE_CHANNELS="일반,General,general"
  TRANSCRIPT_CHANNEL_ID=""
+ VOICE_CONNECT_TIMEOUT_MS="60000"
 
- # Agent harness: hermes, claude-code, claude, codex, gemini, opencode, openclaw, custom
+ # Agent harness: hermes, claude-code, claude, codex, gemini, opencode, openclaw, aider, cursor, custom
+ # `vc setup` auto-detects which agents are installed and lets you pick.
  AGENT_BACKEND="hermes"
  # AGENT_LABEL="My Harness"
  # AGENT_COMMAND="my-harness run --non-interactive"
+ # AIDER_COMMAND="aider --no-pretty --yes-always --message"
+ # CURSOR_COMMAND="cursor-agent --print --prompt"
  AGENT_TASK_TIMEOUT_MS="0"
  AGENT_CHAT_TIMEOUT_MS="45000"
  AGENT_VERBOSE_PROGRESS="0" # default off; toggle in Discord with !verbose on/off
+
+ # Streaming TTS pipeline: sentence-by-sentence playback while the agent is still writing.
+ # First audio plays before the agent finishes. Set to "0" to fall back to whole-reply playback.
+ STREAMING_TTS="1"
+
+ # Smart progress summarization. When SMART_PROGRESS_API_KEY is set, raw progress events get
+ # folded into a single human sentence via a small LLM (Groq OpenAI-compatible API by default).
+ # Without an API key it falls back to the existing regex categories.
+ # SMART_PROGRESS_API_KEY=""
+ # SMART_PROGRESS_BASE_URL="https://api.groq.com/openai/v1"
+ # SMART_PROGRESS_MODEL="llama-3.1-8b-instant"
+
+ # Push notification handoff for long tasks when the voice channel is empty.
+ # Provider: ntfy (free, no account, mobile apps) | pushover | noop.
+ # NOTIFY_PROVIDER="ntfy"
+ # NTFY_TOPIC="" # pick something unguessable; subscribe with the ntfy app
+ # PUSHOVER_USER=""
+ # PUSHOVER_TOKEN=""
+ NOTIFY_MIN_TASK_MS="60000" # only notify when a task ran at least this long
  LATENCY_LOG_PATH="./.logs/latency.jsonl"
  PROJECT_SESSIONS_FILE="./config/project-sessions.json"
  # Agent workflow helper: off by default. Toggle with `vc restart auto on|off`.
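The `STREAMING_TTS` comment in this hunk describes sentence-by-sentence playback while the agent is still writing. The sketch below shows one way such buffering could work. It is not the contents of the package's `stream_sentencer.mjs`: the name `createSentencer`, the callback shape, and the rule of splitting only on ASCII `.`, `!`, `?` followed by whitespace are all assumptions made for illustration.

```javascript
// Hypothetical sketch: buffer streamed agent text and hand complete
// sentences to a TTS callback as soon as they are available.
function createSentencer(onSentence) {
  let buffer = "";
  // A sentence ends at ., ! or ? followed by whitespace. Requiring the
  // whitespace avoids splitting "3.14" when a chunk ends right after "3.".
  const sentence = /^(.*?[.!?]+)\s+/s;
  return {
    // Append a streamed chunk and emit every sentence that is now complete.
    push(chunk) {
      buffer += chunk;
      let match;
      while ((match = sentence.exec(buffer))) {
        onSentence(match[1].trim());
        buffer = buffer.slice(match[0].length);
      }
    },
    // Emit whatever is left once the agent has finished writing.
    flush() {
      const rest = buffer.trim();
      buffer = "";
      if (rest) onSentence(rest);
    },
  };
}
```

A caller would pass each streamed chunk to `push` and call `flush` once at the end of the reply. Languages whose sentence terminators are not followed by whitespace would need a different boundary rule.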
@@ -24,7 +51,7 @@ VOICE_LANGUAGE="ko" # ko | en | auto; controls progress/status language
  WHISPER_CPP_LANGUAGE="ko" # ko | en | auto; auto omits forced whisper language
  STT_LANGUAGE="ko"
 
- TTS_BACKEND="edge" # edge | openvoice | speechswift | supertonic
+ TTS_BACKEND="edge" # edge | openvoice | speechswift | supertonic | omnivoice | qwen3tts | mlxaudio | fireredtts2 | mossttsnano | neuttsair
  EDGE_TTS_COMMAND="edge-tts"
  TTS_VOICE_TYPE="korean_female" # edge: korean_male | korean_female | korean_multilingual_male | english_male | english_female
  TTS_VOICE="ko-KR-SunHiNeural"
@@ -66,6 +93,33 @@ OPENVOICE_LANGUAGE="KR"
  OPENVOICE_STYLE="default"
  OPENVOICE_TIMEOUT_MS="90000"
  OPENVOICE_PROGRESS="0" # keep progress prompts fast via Edge unless set to 1
+
+ # Optional k2-fsa/OmniVoice TTS backend (600+ language zero-shot TTS / voice cloning).
+ # Recommended: create a separate env, install torch/torchaudio/soundfile/omnivoice, and keep progress prompts on Edge.
+ OMNIVOICE_PYTHON="./.venv-omnivoice/bin/python"
+ OMNIVOICE_MODEL="k2-fsa/OmniVoice"
+ OMNIVOICE_DEVICE="mps" # mps on Apple Silicon, cuda:0 on NVIDIA, cpu as fallback
+ OMNIVOICE_DTYPE="float16"
+ OMNIVOICE_REF_AUDIO="./voice-samples/user-reference.wav"
+ OMNIVOICE_REF_TEXT="" # optional transcript of the reference sample
+ OMNIVOICE_LANGUAGE="ko"
+ OMNIVOICE_SPEAKER="" # optional voice-design attributes when no ref sample is desired
+ OMNIVOICE_TIMEOUT_MS="180000"
+ OMNIVOICE_PROGRESS="0" # keep progress prompts fast via Edge unless set to 1
+
+ # Optional Qwen3-TTS CLI backend via speech-swift `audio speak`.
+ # Install speech-swift/audio separately, then set TTS_BACKEND="qwen3tts" or "qwen3".
+ QWEN3TTS_COMMAND="audio"
+ QWEN3TTS_MODE="custom" # custom | clone | design
+ QWEN3TTS_MODEL="customVoice" # customVoice for preset speakers, base/base-8bit for cloning
+ QWEN3TTS_LANGUAGE="korean"
+ QWEN3TTS_SPEAKER="sohee" # used in custom mode
+ QWEN3TTS_INSTRUCT="" # emotion/style instruction; used in custom/design modes
+ QWEN3TTS_REF_AUDIO="./voice-samples/user-reference.wav" # used in clone mode
+ QWEN3TTS_REF_TEXT="" # optional note for your reference sample
+ QWEN3TTS_STREAM="1"
+ QWEN3TTS_TIMEOUT_MS="120000"
+ QWEN3TTS_PROGRESS="0" # keep progress prompts fast via Edge unless set to 1
  REQUIRE_WAKE_WORD="0"
  MIN_UTTERANCE_SECONDS="1.0"
  # Wait for natural thinking pauses before STT. Lower for faster but more fragmented turns.
@@ -83,3 +137,45 @@ BARGE_IN_CONSERVATIVE_MIN_SECONDS="1.8"
  BARGE_IN_CONSERVATIVE_MIN_MEAN_VOLUME_DB="-27"
  BARGE_IN_CONSERVATIVE_MIN_MAX_VOLUME_DB="-12"
  MAX_DEFERRED_PROCESSING_UTTERANCES="0"
+
+
+ # Optional local TTS backends (final answers only by default; progress falls back to Edge)
+ # TTS_BACKEND=fireredtts2
+ FIREREDTTS2_COMMAND=./.local/bin/fireredtts2
+ FIREREDTTS2_PRETRAINED_DIR=pretrained_models/FireRedTTS2
+ FIREREDTTS2_DEVICE=auto
+ FIREREDTTS2_GEN_TYPE=monologue
+ FIREREDTTS2_SPEAKER=S1
+ FIREREDTTS2_PROMPT_AUDIO=voice-samples/user-reference.wav
+ FIREREDTTS2_PROMPT_TEXT=
+ FIREREDTTS2_BF16=0
+ FIREREDTTS2_TIMEOUT_MS=180000
+ FIREREDTTS2_PROGRESS=0
+
+ # TTS_BACKEND=mossttsnano
+ MOSSTTSNANO_COMMAND=python3
+ MOSSTTSNANO_SCRIPT=vendor/MOSS-TTS-Nano/infer.py
+ MOSSTTSNANO_CHECKPOINT=OpenMOSS-Team/MOSS-TTS-Nano
+ MOSSTTSNANO_AUDIO_TOKENIZER=
+ MOSSTTSNANO_MODE=continuation
+ MOSSTTSNANO_DEVICE=auto
+ MOSSTTSNANO_DTYPE=auto
+ MOSSTTSNANO_PROMPT_AUDIO=voice-samples/user-reference.wav
+ MOSSTTSNANO_PROMPT_TEXT=
+ MOSSTTSNANO_MAX_NEW_FRAMES=375
+ MOSSTTSNANO_TIMEOUT_MS=120000
+ MOSSTTSNANO_PROGRESS=0
+
+ # TTS_BACKEND=neuttsair # NeuTTS Air is English-only; progress falls back to Edge by default.
+ NEUTTSAIR_PYTHON=./.venv-neuttsair/bin/python
+ NEUTTSAIR_SCRIPT=integrations/neuttsair/synth.py
+ NEUTTSAIR_BACKBONE_REPO=neuphonic/neutts-air-q4-gguf
+ NEUTTSAIR_BACKBONE_DEVICE=mps
+ NEUTTSAIR_CODEC_REPO=neuphonic/neucodec
+ NEUTTSAIR_CODEC_DEVICE=mps
+ NEUTTSAIR_REF_AUDIO=voice-samples/user-reference.wav
+ NEUTTSAIR_REF_TEXT=
+ NEUTTSAIR_LANGUAGE=en
+ NEUTTSAIR_SAMPLE_RATE=24000
+ NEUTTSAIR_TIMEOUT_MS=120000
+ NEUTTSAIR_PROGRESS=0
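The push-notification comments in this file state two conditions for a handoff: the voice channel is empty, and the task ran for at least `NOTIFY_MIN_TASK_MS`. A minimal sketch of that gate follows. The function name `shouldNotify`, the `listenerCount` parameter, and the choice to treat an unset `NOTIFY_PROVIDER` as `noop` are assumptions for illustration; the package's `notify.mjs` and `notification_handler.mjs` may decide differently.

```javascript
// Hypothetical sketch of the notification gate described in .env.example.
// Only the environment variable names come from the file; the rest is assumed.
function shouldNotify({ env, taskDurationMs, listenerCount }) {
  // Assumption: no provider configured behaves like "noop".
  const provider = (env.NOTIFY_PROVIDER || "noop").toLowerCase();
  if (provider === "noop") return false;

  // Each provider needs its own credentials before anything can be sent.
  if (provider === "ntfy" && !env.NTFY_TOPIC) return false;
  if (provider === "pushover" && !(env.PUSHOVER_USER && env.PUSHOVER_TOKEN)) {
    return false;
  }

  // Fall back to the documented default of 60000 ms when the value is
  // missing or not a number.
  const raw = env.NOTIFY_MIN_TASK_MS;
  const parsed = Number(raw);
  const minTaskMs = raw && Number.isFinite(parsed) ? parsed : 60000;

  // Notify only for long tasks that finish while nobody is listening.
  return listenerCount === 0 && taskDurationMs >= minTaskMs;
}
```

In a Node process the `env` argument would typically be `process.env`, with `listenerCount` taken from however the bridge tracks non-bot members in the voice channel.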
package/README.es.md ADDED
@@ -0,0 +1,134 @@
+ # VerbalCoding
+
+ <p align="center"><strong>Habla con agentes de programación CLI por voz en Discord, como en una llamada.</strong></p>
+
+ <p align="center"><a href="./README.md">English</a> · <a href="./README.ko.md">한국어</a> · <a href="./README.ja.md">日本語</a> · <a href="./README.zh.md">中文</a> · <a href="./README.fr.md">Français</a> · <a href="./README.ru.md">Русский</a></p>
+
+ <p align="center">
+ <img alt="npm" src="https://img.shields.io/npm/v/verbalcoding?color=CB3837&logo=npm&logoColor=white">
+ <img alt="Node.js" src="https://img.shields.io/badge/Node.js-20%2B-339933?logo=node.js&logoColor=white">
+ <img alt="Discord" src="https://img.shields.io/badge/Discord-voice%20bridge-5865F2?logo=discord&logoColor=white">
+ <img alt="STT" src="https://img.shields.io/badge/STT-whisper.cpp-7C3AED">
+ <img alt="TTS" src="https://img.shields.io/badge/TTS-Edge%20%7C%20OpenVoice%20%7C%20SpeechSwift-0EA5E9">
+ <img alt="License" src="https://img.shields.io/github/license/ca1773130n/VerbalCoding">
+ </p>
+
+ <p align="center">
+ <img src="docs/assets/figures/verbalcoding-flow.svg" alt="VerbalCoding voice-to-agent flow" width="860">
+ </p>
+
+ ## Por qué existe
+
+ VerbalCoding convierte una sala de voz de Discord en una cabina manos libres para agentes de programación. Pides algo hablando, dejas trabajar al agente CLI y recibes una respuesta breve por voz con transcripción y eventos de progreso. Los diffs y logs quedan fuera del TTS largo.
+
+ > **¿Ya usas Hermes Agent?** Hermes ya trae soporte de canales de voz de Discord con `/voice join` / `/voice channel`: puede unirse al VC actual, transcribir con Whisper y responder por TTS. Para ese bucle básico, VerbalCoding no es obligatorio. VerbalCoding añade una capa de flujo de trabajo: enrutamiento de proyectos/sesiones, contexto compartido de voz+texto, reglas de interrupción, avisos de progreso, presets de idioma, métricas de latencia y cambio de backend CLI más allá de Hermes.
+
+ ## Qué lo hace distinto
+
+ | Capacidad | Por qué importa |
+ |---|---|
+ | Flujo tipo llamada | Habla, escucha, interrumpe y continúa en el mismo canal de voz de Discord. |
+ | Configuración guiada | `vc setup` reúne prerequisites, Discord token/client ID, voice channel, transcript target, backend y TTS settings en un solo flujo. |
+ | Bucle de voz local | Discord audio → local `whisper-cli` → selected CLI agent → TTS reply. |
+ | Elección de agente | Hermes Agent, Claude Code, Codex, Gemini CLI, OpenCode, OpenClaw, Aider, Cursor CLI o custom command. `vc setup` autodetecta lo que tienes instalado. |
+ | Enrutamiento de agente por voz | `"ask Codex what it thinks"` (un turno), `"switch to Aider"` (sticky), `"back to default"` para volver. Si el binario no está instalado, el puente ofrece fallback al agente por defecto. |
+ | Más allá de la voz integrada de Hermes | Mantiene el mismo bucle de voz en VC y añade salas de proyecto, contexto compartido con `!ask`, interrupciones afinadas, voz de progreso/estado y control de backends multiagente. |
+ | Operación real | Incluye doctor auto-fix, guía Docker UDP, latency metrics, multi-instance rooms y redacted config checks. |
+
+ ## Inicio rápido
+
+ ```bash
+ npm install -g verbalcoding@latest
+ vc setup
+ vc doctor
+ vc start
+ ```
+
+ `vc setup` es la ruta normal para personas. Mantén abierto Discord Developer Portal mientras introduces bot token, application/client ID, transcript target y voice channel names.
+
+ En automatización puedes omitir prompts y completar los datos de Discord después.
+
+ ```bash
+ vc setup --yes
+ vc setup token <bot-token> --client-id <discord-client-id>
+ vc setup channels "General,Team Voice"
+ vc doctor
+ ```
+
+ ## Discord en un minuto
+
+ 1. Crea una application y un bot en Discord Developer Portal.
+ 2. Activa Message Content privileged intent.
+ 3. Ejecuta `vc setup` y pega bot token y application/client ID.
+ 4. Introduce los nombres exactos de los voice channels para auto-join.
+ 5. Invita el bot con estos comandos.
+
+ ```bash
+ vc bot invite <discord-client-id>
+ vc bot invite <discord-client-id> --guild <guild-id>
+ ```
+
+ ## Mapa mínimo de comandos
+
+ ```bash
+ vc setup # configuración guiada: prerequisites, Discord, backend, voice
+ vc setup --yes # bootstrap/starter config no interactiva
+ vc setup token # rotar o añadir Discord bot token/client ID después
+ vc setup channels "General,Team Voice" # actualizar auto-join voice channel names
+ vc bot invite CLIENT_ID # generar Discord bot invite URL
+ vc status # mostrar configuración actual
+ vc language ko|en|auto # cambiar language preset
+ vc doctor # redacted health check y auto-fixes
+ vc start # iniciar bridge por defecto
+ vc instance setup NAME # crear project voice bot aislado
+ vc instance start NAME # ejecutar ese bot en background
+ ```
+
+ ## Más información
+
+ | Guía | Qué obtienes |
+ |---|---|
+ | [Centro de documentación](docs/i18n/README.es.md) | Índice de guías localizadas. |
+ | [Fresh Install](docs/i18n/FRESH_INSTALL.es.md) | npm/global setup, configuración de Discord y primera ejecución. |
+ | [Usage](docs/i18n/USAGE.es.md) | Comandos CLI, comandos Discord, modos de ejecución y latency. |
+ | [Uso por harness](docs/i18n/HARNESSES.es.md) | Instalación, configuración y enrutamiento por voz para Claude Code, Codex, Aider y demás. |
+ | [Voz integrada de Hermes vs VerbalCoding](docs/i18n/HERMES_VOICE.es.md) | La voz Discord que Hermes ya ofrece y la diferencia de VerbalCoding. |
+ | [Configuration](docs/i18n/CONFIGURATION.es.md) | .env, agent backends, MCP, TTS y operación. |
+ | [Troubleshooting](docs/i18n/TROUBLESHOOTING.es.md) | Docker UDP y comprobaciones de token/channel. |
+ | [Multi-Instance](docs/i18n/MULTI_INSTANCE.es.md) | Una sala de voz fija por proyecto. |
+
+ ## Requisitos
+
+ | Capa | Predeterminado |
+ |---|---|
+ | Runtime | Node.js 20+ y npm. |
+ | Audio | `ffmpeg` y local `whisper-cli`. |
+ | TTS | Edge TTS por defecto; OpenVoice, SpeechSwift/CosyVoice y Supertonic opcionales. |
+ | Discord | Bot token, Message Content intent, voice permissions y channel names coincidentes. |
+ | Agent | Al menos un CLI harness autenticado; Hermes Agent por defecto. |
+
+ ## Nota Docker / contenedores
+
+ Si los logs muestran `Cannot perform IP discovery - socket closed`, Discord voice UDP está bloqueado. En Linux Docker Compose usa:
+
+ ```yaml
+ services:
+   verbalcoding:
+     network_mode: "host"
+ ```
+
+ No combines `network_mode: "host"` con `ports:`.
+
+ ## Contribuir
+
+ ```bash
+ node --check app-node/main.mjs
+ npm test
+ bash -n run.sh scripts/install.sh scripts/bootstrap_prereqs.sh
+ npm pack --dry-run
+ vc doctor
+ ```
+
+ ## Estado
+
+ VerbalCoding apunta a publicación pública, pero todavía es temprano. Demo video/GIF, validación Linux más amplia, CI y revisión de seguridad siguen como TODO.
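The voice agent routing row in the README above names three phrases: `"ask Codex what it thinks"` routes a single turn, `"switch to Aider"` is sticky, and `"back to default"` returns to the default agent, with a fallback to the default agent when a binary is missing. The sketch below illustrates that behaviour under stated assumptions. The function `routeUtterance`, the `state` shape, the `isInstalled` callback, and the regular expressions are hypothetical and are not taken from the package's `agent_routing.mjs`.

```javascript
// Hypothetical sketch of one-turn vs. sticky voice routing.
// The agent names come from the .env.example harness list; the rest is assumed.
const KNOWN_AGENTS = [
  "hermes", "claude", "codex", "gemini",
  "opencode", "openclaw", "aider", "cursor",
];

function routeUtterance(text, state, isInstalled = () => true) {
  const said = text.trim();
  const route = (agent, sticky, prompt, fallback = false) =>
    ({ agent, sticky, prompt, fallback });

  // "back to default": clear any sticky agent.
  if (/^back to default\b/i.test(said)) {
    return route(state.defaultAgent, null, "");
  }

  // "switch to <agent>": sticky until changed again.
  const sw = said.match(/^switch to (\w+)\b/i);
  if (sw && KNOWN_AGENTS.includes(sw[1].toLowerCase())) {
    const agent = sw[1].toLowerCase();
    return isInstalled(agent)
      ? route(agent, agent, "")
      : route(state.defaultAgent, state.sticky, "", true);
  }

  // "ask <agent> <prompt>": one turn only, sticky agent unchanged.
  const ask = said.match(/^ask (\w+)\s+(.+)$/i);
  if (ask && KNOWN_AGENTS.includes(ask[1].toLowerCase())) {
    const agent = ask[1].toLowerCase();
    return isInstalled(agent)
      ? route(agent, state.sticky, ask[2])
      : route(state.defaultAgent, state.sticky, ask[2], true);
  }

  // Anything else goes to the sticky agent, or the default one.
  return route(state.sticky ?? state.defaultAgent, state.sticky, said);
}
```

The caller would store the returned `sticky` value back into its state after each turn, so a one-turn `ask` leaves the previous sticky agent in place.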
package/README.fr.md ADDED
@@ -0,0 +1,134 @@
1
+ # VerbalCoding
2
+
3
+ <p align="center"><strong>Parlez à des agents de code CLI depuis Discord vocal, comme lors d’un appel.</strong></p>
4
+
5
+ <p align="center"><a href="./README.md">English</a> · <a href="./README.ko.md">한국어</a> · <a href="./README.ja.md">日本語</a> · <a href="./README.zh.md">中文</a> · <a href="./README.es.md">Español</a> · <a href="./README.ru.md">Русский</a></p>
6
+
7
+ <p align="center">
8
+ <img alt="npm" src="https://img.shields.io/npm/v/verbalcoding?color=CB3837&logo=npm&logoColor=white">
9
+ <img alt="Node.js" src="https://img.shields.io/badge/Node.js-20%2B-339933?logo=node.js&logoColor=white">
10
+ <img alt="Discord" src="https://img.shields.io/badge/Discord-voice%20bridge-5865F2?logo=discord&logoColor=white">
11
+ <img alt="STT" src="https://img.shields.io/badge/STT-whisper.cpp-7C3AED">
12
+ <img alt="TTS" src="https://img.shields.io/badge/TTS-Edge%20%7C%20OpenVoice%20%7C%20SpeechSwift-0EA5E9">
13
+ <img alt="License" src="https://img.shields.io/github/license/ca1773130n/VerbalCoding">
14
+ </p>
15
+
16
+ <p align="center">
17
+ <img src="docs/assets/figures/verbalcoding-flow.svg" alt="VerbalCoding voice-to-agent flow" width="860">
18
+ </p>
19
+
20
+ ## Pourquoi ce projet existe
21
+
22
+ VerbalCoding transforme un salon vocal Discord en poste de pilotage mains libres pour agents de code. Dictez une demande, laissez le CLI travailler, puis recevez une réponse vocale concise avec transcription et progression. Les diffs et logs ne sont pas lus longuement par TTS.
23
+
24
+ > **Vous utilisez déjà Hermes Agent ?** Hermes prend déjà en charge les salons vocaux Discord via `/voice join` / `/voice channel` : il peut rejoindre votre VC, transcrire avec Whisper et répondre en TTS. Pour cette boucle de base, VerbalCoding n’est pas obligatoire. VerbalCoding ajoute une couche de workflow : routage projet/session, contexte voix+texte partagé, règles d’interruption, annonces de progression, préréglages de langue, métriques de latence et changement de backend CLI au-delà de Hermes.
25
+
26
+ ## Ce qui change
27
+
28
+ | Capacité | Pourquoi c’est utile |
29
+ |---|---|
30
+ | Flux type appel | Parler, écouter, interrompre et continuer dans le même salon vocal Discord. |
31
+ | Configuration guidée | `vc setup` couvre prerequisites, Discord token/client ID, voice channel, transcript target, backend et TTS settings en un seul flux. |
32
+ | Boucle vocale locale | Discord audio → local `whisper-cli` → selected CLI agent → TTS reply. |
33
+ | Choix de l’agent | Hermes Agent, Claude Code, Codex, Gemini CLI, OpenCode, OpenClaw, Aider, Cursor CLI ou custom command. `vc setup` détecte automatiquement ce qui est installé. |
34
+ | Routage d’agent par voix | `"ask Codex what it thinks"` pour un tour, `"switch to Aider"` en sticky, `"back to default"` pour revenir. Les binaires absents sont détectés et le pont propose un fallback vers l’agent par défaut. |
35
+ | Au-delà de la voix intégrée de Hermes | Garde la même boucle vocale VC, puis ajoute salons de projet, contexte partagé `!ask`, interruptions réglées, annonces progression/état et contrôle de backends multiagents. |
36
+ | Exploitation réelle | doctor auto-fix, guide Docker UDP, latency metrics, multi-instance rooms et redacted config checks inclus. |
37
+
38
+ ## Démarrage rapide
39
+
40
+ ```bash
41
+ npm install -g verbalcoding@latest
42
+ vc setup
43
+ vc doctor
44
+ vc start
45
+ ```
46
+
47
+ `vc setup` est le parcours normal pour une personne. Gardez Discord Developer Portal ouvert pendant la saisie du bot token, application/client ID, transcript target et voice channel names.
48
+
49
+ En automatisation, vous pouvez ignorer les prompts puis renseigner Discord ensuite.
50
+
51
+ ```bash
52
+ vc setup --yes
53
+ vc setup token <bot-token> --client-id <discord-client-id>
54
+ vc setup channels "General,Team Voice"
55
+ vc doctor
56
+ ```
57
+
58
+ ## Discord en une minute
59
+
60
+ 1. Créez une application et un bot dans Discord Developer Portal.
61
+ 2. Activez Message Content privileged intent.
62
+ 3. Lancez `vc setup` et collez bot token et application/client ID.
63
+ 4. Saisissez les noms exacts des voice channels à rejoindre.
64
+ 5. Invitez le bot avec ces commandes.
65
+
66
+ ```bash
67
+ vc bot invite <discord-client-id>
68
+ vc bot invite <discord-client-id> --guild <guild-id>
69
+ ```
70
+
71
+ ## Carte rapide des commandes
72
+
73
+ ```bash
74
+ vc setup # configuration guidée: prerequisites, Discord, backend, voice
75
+ vc setup --yes # bootstrap/starter config non interactive
76
+ vc setup token # modifier ou ajouter Discord bot token/client ID plus tard
77
+ vc setup channels "General,Team Voice" # mettre à jour auto-join voice channel names
78
+ vc bot invite CLIENT_ID # générer Discord bot invite URL
79
+ vc status # afficher les réglages actuels
80
+ vc language ko|en|auto # changer language preset
81
+ vc doctor # redacted health check et auto-fixes
82
+ vc start # démarrer le bridge par défaut
83
+ vc instance setup NAME # créer un project voice bot isolé
84
+ vc instance start NAME # exécuter ce bot en background
85
+ ```
86
+
87
+ ## En savoir plus
88
+
89
+ | Guide | Contents |
90
+ |---|---|
91
+ | [Documentation hub](docs/i18n/README.fr.md) | Index of localized guides. |
92
+ | [Fresh Install](docs/i18n/FRESH_INSTALL.fr.md) | npm/global setup, Discord configuration, first launch. |
93
+ | [Usage](docs/i18n/USAGE.fr.md) | CLI commands, Discord commands, run modes, latency. |
94
+ | [Usage by harness](docs/i18n/HARNESSES.fr.md) | Installation, configuration, and voice routing for Claude Code, Codex, Aider, and the others. |
95
+ | [Hermes built-in voice vs VerbalCoding](docs/i18n/HERMES_VOICE.fr.md) | The Discord voice that Hermes already provides and how VerbalCoding differs. |
96
+ | [Configuration](docs/i18n/CONFIGURATION.fr.md) | .env, agent backends, MCP, TTS, operations. |
97
+ | [Troubleshooting](docs/i18n/TROUBLESHOOTING.fr.md) | Docker UDP and token/channel checks. |
98
+ | [Multi-Instance](docs/i18n/MULTI_INSTANCE.fr.md) | One fixed voice channel per project. |
99
+
100
+ ## Requirements
101
+
102
+ | Layer | Default |
103
+ |---|---|
104
+ | Runtime | Node.js 20+ and npm. |
105
+ | Audio | `ffmpeg` and a local `whisper-cli`. |
106
+ | TTS | Edge TTS by default; OpenVoice, SpeechSwift/CosyVoice, and Supertonic are optional. |
107
+ | Discord | Bot token, Message Content intent, voice permissions, and matching channel names. |
108
+ | Agent | At least one authenticated CLI harness; Hermes Agent by default. |
109
+
110
+ ## Docker / container note
111
+
112
+ If the logs show `Cannot perform IP discovery - socket closed`, Discord voice UDP is blocked. With Docker Compose on Linux, use:
113
+
114
+ ```yaml
115
+ services:
116
+ verbalcoding:
117
+ network_mode: "host"
118
+ ```
119
+
120
+ Do not combine `network_mode: "host"` with `ports:`.
121
+
122
+ ## Contributing
123
+
124
+ ```bash
125
+ node --check app-node/main.mjs
126
+ npm test
127
+ bash -n run.sh scripts/install.sh scripts/bootstrap_prereqs.sh
128
+ npm pack --dry-run
129
+ vc doctor
130
+ ```
131
+
132
+ ## Status
133
+
134
+ VerbalCoding is aiming for a public release but is still young. A demo video/GIF, broader Linux validation, CI, and a security review remain TODO.
package/README.ja.md ADDED
@@ -0,0 +1,134 @@
1
+ # VerbalCoding
2
+
3
+ <p align="center"><strong>Work with CLI coding agents over Discord voice, as if you were on a phone call.</strong></p>
4
+
5
+ <p align="center"><a href="./README.md">English</a> · <a href="./README.ko.md">한국어</a> · <a href="./README.zh.md">中文</a> · <a href="./README.es.md">Español</a> · <a href="./README.fr.md">Français</a> · <a href="./README.ru.md">Русский</a></p>
6
+
7
+ <p align="center">
8
+ <img alt="npm" src="https://img.shields.io/npm/v/verbalcoding?color=CB3837&logo=npm&logoColor=white">
9
+ <img alt="Node.js" src="https://img.shields.io/badge/Node.js-20%2B-339933?logo=node.js&logoColor=white">
10
+ <img alt="Discord" src="https://img.shields.io/badge/Discord-voice%20bridge-5865F2?logo=discord&logoColor=white">
11
+ <img alt="STT" src="https://img.shields.io/badge/STT-whisper.cpp-7C3AED">
12
+ <img alt="TTS" src="https://img.shields.io/badge/TTS-Edge%20%7C%20OpenVoice%20%7C%20SpeechSwift-0EA5E9">
13
+ <img alt="License" src="https://img.shields.io/github/license/ca1773130n/VerbalCoding">
14
+ </p>
15
+
16
+ <p align="center">
17
+ <img src="docs/assets/figures/verbalcoding-flow.svg" alt="VerbalCoding voice-to-agent flow" width="860">
18
+ </p>
19
+
20
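+ ## Why it was built
21
+
22
+ VerbalCoding turns a Discord voice room into a hands-free control seat for coding agents. Ask by voice, let the CLI agent do the work, and get back a short spoken answer plus a text transcript. It also includes guards against reading diffs and logs aloud at length.
23
+
24
+ > **Already using Hermes Agent?** Hermes itself supports Discord voice channels through `/voice join` / `/voice channel`. It can join the current VC, transcribe with Whisper STT, and talk back with TTS. If that basic loop is all you need, VerbalCoding is not required. VerbalCoding is a workflow layer on top of it that adds project/session routing, shared voice + text context, interruption rules, spoken progress updates, language presets, latency metrics, and switching to CLI backends other than Hermes.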
25
+
26
+ ## How the experience differs
27
+
28
+ | Feature | Value |
29
+ |---|---|
30
+ | Phone-like flow | Talk, listen, interrupt, and continue in the same Discord voice channel. |
31
+ | Guided setup for humans | `vc setup` walks through prerequisites, Discord token/client ID, voice channel, transcript target, backend, and TTS settings in one flow. |
32
+ | Local voice loop | Discord audio → local `whisper-cli` → selected CLI agent → TTS response. |
33
+ | Agent choice | Supports Hermes Agent, Claude Code, Codex, Gemini CLI, OpenCode, OpenClaw, Aider, Cursor CLI, or a custom command. `vc setup` auto-detects the ones that are installed. |
34
+ | Switch agents by voice | `"ask Codex what it thinks"` routes a single turn, `"switch to Aider"` is sticky, and `"back to default"` reverts. Binaries that are not installed are detected, and a fallback to the default agent is offered. |
35
+ | Beyond Hermes built-in voice | Builds on the same VC voice loop and adds project rooms, `!ask` shared context, fine-grained interruption handling, spoken progress/status announcements, and multi-agent backend control. |
36
+ | Operations features | Includes doctor auto-fix, a Docker UDP guide, latency metrics, multi-instance rooms, and redacted config checks. |
37
+
38
+ ## Quick start
39
+
40
+ ```bash
41
+ npm install -g verbalcoding@latest
42
+ vc setup
43
+ vc doctor
44
+ vc start
45
+ ```
46
+
47
+ `vc setup` is the normal path for a human operator. Keep the Discord Developer Portal open while you enter the bot token, application/client ID, transcript target, and voice channel names.
48
+
49
+ For automation, you can skip the prompts and set the Discord values afterwards.
50
+
51
+ ```bash
52
+ vc setup --yes
53
+ vc setup token <bot-token> --client-id <discord-client-id>
54
+ vc setup channels "General,Team Voice"
55
+ vc doctor
56
+ ```
57
+
58
+ ## Discord setup in one minute
59
+
60
+ 1. Create an application and a bot in the Discord Developer Portal.
61
+ 2. Enable the Message Content privileged intent.
62
+ 3. Run `vc setup` and paste the bot token and application/client ID.
63
+ 4. Enter the exact names of the voice channels to auto-join.
64
+ 5. Invite the bot with the following commands.
65
+
66
+ ```bash
67
+ vc bot invite <discord-client-id>
68
+ vc bot invite <discord-client-id> --guild <guild-id>
69
+ ```
70
+
71
+ ## Small command table
72
+
73
+ ```bash
74
+ vc setup # guided setup: prerequisites, Discord, backend, voice
75
+ vc setup --yes # non-interactive bootstrap/starter config
76
+ vc setup token # update or add the Discord bot token and client ID later
77
+ vc setup channels "General,Team Voice" # update auto-join voice channel names
78
+ vc bot invite CLIENT_ID # generate the Discord bot invite URL
79
+ vc status # show the current settings
80
+ vc language ko|en|auto # switch the language preset
81
+ vc doctor # redacted health check and auto-fix
82
+ vc start # start the default bridge
83
+ vc instance setup NAME # create an isolated project voice bot
84
+ vc instance start NAME # run that bot in the background
85
+ ```
86
+
87
+ ## Learn more
88
+
89
+ | Guide | What you get |
90
+ |---|---|
91
+ | [Documentation hub](docs/i18n/README.ja.md) | Index of localized guides. |
92
+ | [Fresh Install](docs/i18n/FRESH_INSTALL.ja.md) | npm/global setup, Discord setup, first launch. |
93
+ | [Usage](docs/i18n/USAGE.ja.md) | CLI commands, Discord commands, run modes, latency. |
94
+ | [Usage by harness](docs/i18n/HARNESSES.ja.md) | Installation, configuration, and voice routing per backend, such as Claude Code, Codex, and Aider. |
95
+ | [Hermes built-in voice vs VerbalCoding](docs/i18n/HERMES_VOICE.ja.md) | How the Discord voice that Hermes already provides differs from VerbalCoding. |
96
+ | [Configuration](docs/i18n/CONFIGURATION.ja.md) | .env, agent backends, MCP, TTS, operations. |
97
+ | [Troubleshooting](docs/i18n/TROUBLESHOOTING.ja.md) | Docker UDP, checks for a missing token or channel. |
98
+ | [Multi-Instance](docs/i18n/MULTI_INSTANCE.ja.md) | One fixed voice room per project. |
99
+
100
+ ## Requirements
101
+
102
+ | Layer | Default |
103
+ |---|---|
104
+ | Runtime | Node.js 20+ and npm. |
105
+ | Audio | `ffmpeg` and a local `whisper-cli`. |
106
+ | TTS | Edge TTS by default. OpenVoice, SpeechSwift/CosyVoice, and Supertonic are optional. |
107
+ | Discord | Bot token, Message Content intent, voice permissions, and matching channel names. |
108
+ | Agent | At least one authenticated CLI harness. Hermes Agent by default. |
109
+
110
+ ## Docker / container note
111
+
112
+ If the logs show `Cannot perform IP discovery - socket closed`, Discord voice UDP is blocked. With Docker Compose on Linux, use:
113
+
114
+ ```yaml
115
+ services:
116
+ verbalcoding:
117
+ network_mode: "host"
118
+ ```
119
+
120
+ Do not combine `network_mode: "host"` with `ports:`.
121
+
122
+ ## Contributing
123
+
124
+ ```bash
125
+ node --check app-node/main.mjs
126
+ npm test
127
+ bash -n run.sh scripts/install.sh scripts/bootstrap_prereqs.sh
128
+ npm pack --dry-run
129
+ vc doctor
130
+ ```
131
+
132
+ ## Status
133
+
134
+ VerbalCoding is aiming for a public release but is still at an early stage. A demo video/GIF, broader Linux validation, CI, and a security review are TODO.