verbalcoding 0.2.11 → 0.2.13

Files changed (235)
  1. package/.env.example +98 -2
  2. package/README.es.md +134 -0
  3. package/README.fr.md +134 -0
  4. package/README.ja.md +134 -0
  5. package/README.ko.md +134 -0
  6. package/README.md +118 -74
  7. package/README.ru.md +134 -0
  8. package/README.zh.md +133 -0
  9. package/app-node/agent_adapters.mjs +37 -5
  10. package/app-node/agent_adapters.test.mjs +27 -1
  11. package/app-node/agent_detect.mjs +73 -0
  12. package/app-node/agent_detect.test.mjs +77 -0
  13. package/app-node/agent_routing.mjs +148 -0
  14. package/app-node/agent_routing.test.mjs +138 -0
  15. package/app-node/agent_turn.mjs +86 -0
  16. package/app-node/agent_turn.test.mjs +109 -0
  17. package/app-node/bridge_context.mjs +73 -0
  18. package/app-node/bridge_context.test.mjs +54 -0
  19. package/app-node/bridge_state.mjs +4 -0
  20. package/app-node/bridge_wireup.test.mjs +462 -0
  21. package/app-node/cli_install.test.mjs +31 -0
  22. package/app-node/cross_agent_routing.test.mjs +78 -0
  23. package/app-node/discord_command_router.mjs +204 -0
  24. package/app-node/discord_command_router.test.mjs +311 -0
  25. package/app-node/discord_voice_setup.mjs +251 -0
  26. package/app-node/discord_voice_setup.test.mjs +86 -0
  27. package/app-node/hermes_profiles.test.mjs +12 -1
  28. package/app-node/install_config.mjs +113 -3
  29. package/app-node/install_config.test.mjs +8 -0
  30. package/app-node/instance_doctor.test.mjs +9 -0
  31. package/app-node/instances.test.mjs +8 -1
  32. package/app-node/main.mjs +513 -1058
  33. package/app-node/mcp_tools.test.mjs +7 -0
  34. package/app-node/notification_handler.mjs +89 -0
  35. package/app-node/notification_handler.test.mjs +187 -0
  36. package/app-node/notify.mjs +73 -0
  37. package/app-node/notify.test.mjs +68 -0
  38. package/app-node/plan_dispatcher.mjs +215 -0
  39. package/app-node/plan_dispatcher.test.mjs +101 -0
  40. package/app-node/plan_mode.mjs +203 -0
  41. package/app-node/plan_mode.test.mjs +231 -0
  42. package/app-node/progress_handler.mjs +220 -0
  43. package/app-node/progress_handler.test.mjs +193 -0
  44. package/app-node/progress_speech.mjs +54 -32
  45. package/app-node/progress_speech.test.mjs +12 -3
  46. package/app-node/project_sessions.mjs +5 -2
  47. package/app-node/project_sessions.test.mjs +7 -0
  48. package/app-node/research_mode.mjs +282 -0
  49. package/app-node/research_mode.test.mjs +264 -0
  50. package/app-node/restart_notice.mjs +3 -0
  51. package/app-node/restart_notice.test.mjs +11 -0
  52. package/app-node/session_ontology.mjs +271 -0
  53. package/app-node/session_ontology.test.mjs +130 -0
  54. package/app-node/smart_progress.mjs +94 -0
  55. package/app-node/smart_progress.test.mjs +66 -0
  56. package/app-node/stream_sentencer.mjs +91 -0
  57. package/app-node/stream_sentencer.test.mjs +129 -0
  58. package/app-node/streaming_tts_queue.mjs +52 -0
  59. package/app-node/streaming_tts_queue.test.mjs +64 -0
  60. package/app-node/stt_whisper.mjs +24 -0
  61. package/app-node/stt_whisper.test.mjs +32 -0
  62. package/app-node/text_routing.mjs +22 -0
  63. package/app-node/text_routing.test.mjs +23 -1
  64. package/app-node/tts_backends.mjs +537 -3
  65. package/app-node/tts_backends.test.mjs +454 -0
  66. package/app-node/tts_player.mjs +164 -0
  67. package/app-node/tts_player.test.mjs +202 -0
  68. package/app-node/tts_runtime.mjs +134 -0
  69. package/app-node/tts_runtime.test.mjs +89 -0
  70. package/app-node/tts_settings.mjs +150 -3
  71. package/app-node/tts_settings.test.mjs +204 -0
  72. package/app-node/tts_voice_config.mjs +136 -2
  73. package/app-node/tts_voice_config.test.mjs +94 -0
  74. package/app-node/utterance_router.mjs +216 -0
  75. package/app-node/utterance_router.test.mjs +236 -0
  76. package/app-node/voice_autojoin.mjs +37 -0
  77. package/app-node/voice_autojoin.test.mjs +59 -0
  78. package/app-node/voice_io.mjs +272 -0
  79. package/app-node/voice_io.test.mjs +102 -0
  80. package/app-node/voice_turn_runner.mjs +449 -0
  81. package/app-node/voice_turn_runner.test.mjs +289 -0
  82. package/docs/CONFIGURATION.md +79 -96
  83. package/docs/FRESH_INSTALL.md +105 -63
  84. package/docs/HARNESSES.md +58 -0
  85. package/docs/HARNESS_AIDER.md +50 -0
  86. package/docs/HARNESS_CLAUDE.md +56 -0
  87. package/docs/HARNESS_CODEX.md +56 -0
  88. package/docs/HARNESS_CURSOR.md +45 -0
  89. package/docs/HARNESS_GEMINI.md +45 -0
  90. package/docs/HARNESS_HERMES.md +57 -0
  91. package/docs/HARNESS_OPENCLAW.md +44 -0
  92. package/docs/HARNESS_OPENCODE.md +44 -0
  93. package/docs/HERMES_VOICE.md +65 -0
  94. package/docs/MULTI_INSTANCE.md +16 -0
  95. package/docs/README.md +50 -0
  96. package/docs/RELEASE.md +42 -19
  97. package/docs/ROADMAP.md +53 -0
  98. package/docs/TROUBLESHOOTING.md +126 -0
  99. package/docs/TTS_BACKENDS.md +227 -0
  100. package/docs/USAGE.md +94 -40
  101. package/docs/assets/figures/verbalcoding-flow.svg +1 -1
  102. package/docs/i18n/AGENTS.es.md +34 -0
  103. package/docs/i18n/AGENTS.fr.md +34 -0
  104. package/docs/i18n/AGENTS.ja.md +34 -0
  105. package/docs/i18n/AGENTS.ko.md +34 -0
  106. package/docs/i18n/AGENTS.ru.md +34 -0
  107. package/docs/i18n/AGENTS.zh.md +34 -0
  108. package/docs/i18n/CONFIGURATION.es.md +25 -0
  109. package/docs/i18n/CONFIGURATION.fr.md +25 -0
  110. package/docs/i18n/CONFIGURATION.ja.md +25 -0
  111. package/docs/i18n/CONFIGURATION.ko.md +25 -0
  112. package/docs/i18n/CONFIGURATION.ru.md +25 -0
  113. package/docs/i18n/CONFIGURATION.zh.md +25 -0
  114. package/docs/i18n/FRESH_INSTALL.es.md +27 -2
  115. package/docs/i18n/FRESH_INSTALL.fr.md +27 -2
  116. package/docs/i18n/FRESH_INSTALL.ja.md +27 -2
  117. package/docs/i18n/FRESH_INSTALL.ko.md +27 -2
  118. package/docs/i18n/FRESH_INSTALL.ru.md +27 -2
  119. package/docs/i18n/FRESH_INSTALL.zh.md +27 -2
  120. package/docs/i18n/HARNESSES.es.md +58 -0
  121. package/docs/i18n/HARNESSES.fr.md +58 -0
  122. package/docs/i18n/HARNESSES.ja.md +58 -0
  123. package/docs/i18n/HARNESSES.ko.md +58 -0
  124. package/docs/i18n/HARNESSES.ru.md +58 -0
  125. package/docs/i18n/HARNESSES.zh.md +58 -0
  126. package/docs/i18n/HARNESS_AIDER.es.md +48 -0
  127. package/docs/i18n/HARNESS_AIDER.fr.md +48 -0
  128. package/docs/i18n/HARNESS_AIDER.ja.md +50 -0
  129. package/docs/i18n/HARNESS_AIDER.ko.md +50 -0
  130. package/docs/i18n/HARNESS_AIDER.ru.md +48 -0
  131. package/docs/i18n/HARNESS_AIDER.zh.md +48 -0
  132. package/docs/i18n/HARNESS_CLAUDE.es.md +55 -0
  133. package/docs/i18n/HARNESS_CLAUDE.fr.md +55 -0
  134. package/docs/i18n/HARNESS_CLAUDE.ja.md +56 -0
  135. package/docs/i18n/HARNESS_CLAUDE.ko.md +56 -0
  136. package/docs/i18n/HARNESS_CLAUDE.ru.md +55 -0
  137. package/docs/i18n/HARNESS_CLAUDE.zh.md +56 -0
  138. package/docs/i18n/HARNESS_CODEX.es.md +55 -0
  139. package/docs/i18n/HARNESS_CODEX.fr.md +55 -0
  140. package/docs/i18n/HARNESS_CODEX.ja.md +56 -0
  141. package/docs/i18n/HARNESS_CODEX.ko.md +56 -0
  142. package/docs/i18n/HARNESS_CODEX.ru.md +55 -0
  143. package/docs/i18n/HARNESS_CODEX.zh.md +56 -0
  144. package/docs/i18n/HARNESS_CURSOR.es.md +42 -0
  145. package/docs/i18n/HARNESS_CURSOR.fr.md +42 -0
  146. package/docs/i18n/HARNESS_CURSOR.ja.md +45 -0
  147. package/docs/i18n/HARNESS_CURSOR.ko.md +45 -0
  148. package/docs/i18n/HARNESS_CURSOR.ru.md +42 -0
  149. package/docs/i18n/HARNESS_CURSOR.zh.md +42 -0
  150. package/docs/i18n/HARNESS_GEMINI.es.md +44 -0
  151. package/docs/i18n/HARNESS_GEMINI.fr.md +44 -0
  152. package/docs/i18n/HARNESS_GEMINI.ja.md +45 -0
  153. package/docs/i18n/HARNESS_GEMINI.ko.md +45 -0
  154. package/docs/i18n/HARNESS_GEMINI.ru.md +44 -0
  155. package/docs/i18n/HARNESS_GEMINI.zh.md +45 -0
  156. package/docs/i18n/HARNESS_HERMES.es.md +54 -0
  157. package/docs/i18n/HARNESS_HERMES.fr.md +54 -0
  158. package/docs/i18n/HARNESS_HERMES.ja.md +57 -0
  159. package/docs/i18n/HARNESS_HERMES.ko.md +57 -0
  160. package/docs/i18n/HARNESS_HERMES.ru.md +54 -0
  161. package/docs/i18n/HARNESS_HERMES.zh.md +57 -0
  162. package/docs/i18n/HARNESS_OPENCLAW.es.md +41 -0
  163. package/docs/i18n/HARNESS_OPENCLAW.fr.md +41 -0
  164. package/docs/i18n/HARNESS_OPENCLAW.ja.md +44 -0
  165. package/docs/i18n/HARNESS_OPENCLAW.ko.md +44 -0
  166. package/docs/i18n/HARNESS_OPENCLAW.ru.md +41 -0
  167. package/docs/i18n/HARNESS_OPENCLAW.zh.md +42 -0
  168. package/docs/i18n/HARNESS_OPENCODE.es.md +41 -0
  169. package/docs/i18n/HARNESS_OPENCODE.fr.md +41 -0
  170. package/docs/i18n/HARNESS_OPENCODE.ja.md +44 -0
  171. package/docs/i18n/HARNESS_OPENCODE.ko.md +44 -0
  172. package/docs/i18n/HARNESS_OPENCODE.ru.md +41 -0
  173. package/docs/i18n/HARNESS_OPENCODE.zh.md +44 -0
  174. package/docs/i18n/HERMES_VOICE.es.md +46 -0
  175. package/docs/i18n/HERMES_VOICE.fr.md +46 -0
  176. package/docs/i18n/HERMES_VOICE.ja.md +46 -0
  177. package/docs/i18n/HERMES_VOICE.ko.md +65 -0
  178. package/docs/i18n/HERMES_VOICE.ru.md +46 -0
  179. package/docs/i18n/HERMES_VOICE.zh.md +46 -0
  180. package/docs/i18n/MULTI_INSTANCE.es.md +25 -0
  181. package/docs/i18n/MULTI_INSTANCE.fr.md +25 -0
  182. package/docs/i18n/MULTI_INSTANCE.ja.md +25 -0
  183. package/docs/i18n/MULTI_INSTANCE.ko.md +25 -0
  184. package/docs/i18n/MULTI_INSTANCE.ru.md +25 -0
  185. package/docs/i18n/MULTI_INSTANCE.zh.md +25 -0
  186. package/docs/i18n/README.es.md +20 -134
  187. package/docs/i18n/README.fr.md +20 -134
  188. package/docs/i18n/README.ja.md +20 -134
  189. package/docs/i18n/README.ko.md +20 -133
  190. package/docs/i18n/README.ru.md +20 -134
  191. package/docs/i18n/README.zh.md +20 -133
  192. package/docs/i18n/RELEASE.es.md +26 -1
  193. package/docs/i18n/RELEASE.fr.md +26 -1
  194. package/docs/i18n/RELEASE.ja.md +26 -1
  195. package/docs/i18n/RELEASE.ko.md +26 -1
  196. package/docs/i18n/RELEASE.ru.md +26 -1
  197. package/docs/i18n/RELEASE.zh.md +26 -1
  198. package/docs/i18n/TROUBLESHOOTING.es.md +39 -0
  199. package/docs/i18n/TROUBLESHOOTING.fr.md +39 -0
  200. package/docs/i18n/TROUBLESHOOTING.ja.md +39 -0
  201. package/docs/i18n/TROUBLESHOOTING.ko.md +39 -0
  202. package/docs/i18n/TROUBLESHOOTING.ru.md +39 -0
  203. package/docs/i18n/TROUBLESHOOTING.zh.md +39 -0
  204. package/docs/i18n/USAGE.es.md +25 -0
  205. package/docs/i18n/USAGE.fr.md +25 -0
  206. package/docs/i18n/USAGE.ja.md +25 -0
  207. package/docs/i18n/USAGE.ko.md +25 -0
  208. package/docs/i18n/USAGE.ru.md +25 -0
  209. package/docs/i18n/USAGE.zh.md +25 -0
  210. package/docs/superpowers/plans/2026-05-13-phase1-streaming-pipeline.md +122 -0
  211. package/docs/superpowers/plans/2026-05-13-phase10-push-notifications.md +152 -0
  212. package/docs/superpowers/plans/2026-05-13-phase2-agent-adapters.md +242 -0
  213. package/docs/superpowers/plans/2026-05-13-phase6-smart-progress.md +172 -0
  214. package/docs/superpowers/plans/2026-05-13-phase7-voice-plan-mode.md +108 -0
  215. package/docs/superpowers/plans/2026-05-14-cross-agent-voice-transfer.md +625 -0
  216. package/docs/superpowers/plans/2026-05-21-audio-overview-narrated-diffs.md +95 -0
  217. package/docs/superpowers/plans/2026-05-21-autoresearch-ontology.md +83 -0
  218. package/docs/superpowers/plans/2026-05-21-phase11-push-to-talk-wakeword-v2.md +77 -0
  219. package/docs/superpowers/plans/2026-05-21-phase12-multi-user-voice.md +147 -0
  220. package/docs/superpowers/plans/2026-05-21-phase14-verbalbench.md +136 -0
  221. package/docs/superpowers/plans/2026-05-21-phase15-phone-companion.md +72 -0
  222. package/integrations/fireredtts2/mlx_llm.py +183 -0
  223. package/integrations/fireredtts2/synth.py +156 -0
  224. package/integrations/fireredtts2/synth_mlx.py +196 -0
  225. package/integrations/mlxaudio/synth.py +74 -0
  226. package/integrations/neuttsair/synth.py +104 -0
  227. package/integrations/omnivoice/synth.py +110 -0
  228. package/package.json +7 -1
  229. package/scripts/cli.mjs +88 -3
  230. package/scripts/doctor.mjs +115 -4
  231. package/scripts/install.mjs +20 -2
  232. package/scripts/install_fireredtts2.sh +109 -0
  233. package/scripts/install_mlxaudio.sh +34 -0
  234. package/scripts/install_mossttsnano.sh +46 -0
  235. package/scripts/postinstall.mjs +34 -0
@@ -1,116 +1,142 @@
1
1
  # Fresh install
2
2
 
3
- This guide is for a clean public install. It avoids local-only assumptions and uses the installer to bootstrap as much as possible.
3
+ <!-- readme-glow-up:intro -->
4
+ <p align="center">
5
+ <a href="../README.md">README</a> ·
6
+ <a href="README.md">Docs hub</a> ·
7
+ <a href="FRESH_INSTALL.md">Fresh Install</a> ·
8
+ <a href="USAGE.md">Usage</a> ·
9
+ <a href="CONFIGURATION.md">Configuration</a> ·
10
+ <a href="TROUBLESHOOTING.md">Troubleshooting</a> ·
11
+ <a href="MULTI_INSTANCE.md">Multi-Instance</a>
12
+ </p>
4
13
 
5
- ## 1. Install the CLI
14
+ > Clean install path for humans first, automation second.
15
+ >
16
+ > Fast path: `npm install -g verbalcoding@latest → vc setup → vc doctor → vc start`
17
+ <!-- /readme-glow-up:intro -->
6
18
 
7
- Recommended npm path:
19
+ This guide is for a clean public install. It avoids local-only assumptions and uses the `vc` CLI to bootstrap as much as possible. Windows is not supported yet.
8
20
 
9
- ```bash
10
- npm install -g verbalcoding
11
- ```
21
+ ## 1. Install the CLI and run guided setup
12
22
 
13
- Or run the published package directly:
23
+ Recommended npm path for humans:
14
24
 
15
25
  ```bash
16
- npx verbalcoding setup --yes
26
+ npm install -g verbalcoding@latest
27
+ vc setup
17
28
  ```
18
29
 
19
- If you used `npm install -g`, continue with:
30
+ `vc setup` bootstraps supported local prerequisites, then asks for the Discord bot token, application/client ID, auto-join voice channel names, transcript target, agent backend, and voice/TTS settings. Keep the Discord Developer Portal open while it runs.
31
+
32
+ Automation/CI path:
20
33
 
21
34
  ```bash
35
+ npm install -g verbalcoding@latest
22
36
  vc setup --yes
37
+ vc setup token <bot-token> --client-id <discord-client-id>
38
+ vc setup channels "General,Team Voice"
23
39
  ```
24
40
 
41
+ Use `--yes` only when you need non-interactive bootstrap/starter config. It cannot stop and wait for you to create a Discord application, so token/channel setup remains a follow-up step in that mode.
42
+
25
43
  Contributor GitHub clone path:
26
44
 
27
45
  ```bash
28
46
  git clone https://github.com/ca1773130n/VerbalCoding.git
29
47
  cd VerbalCoding
30
- ./scripts/install.sh --yes
48
+ ./scripts/install.sh
31
49
  ```
32
50
 
33
- ## 2. Bootstrap dependencies and run the setup wizard
34
-
35
- For an npm install, do not run `./scripts/install.sh` directly; there is no repository checkout in your current directory. Use the packaged CLI wrapper instead:
51
+ For npm/global installs, use `vc ...` commands. Do not run `./scripts/install.sh` unless you are inside a repository clone.
36
52
 
37
- ```bash
38
- vc setup --yes
39
- ```
53
+ ## 2. What setup bootstraps
40
54
 
41
- `vc setup` runs the `scripts/install.sh` bundled inside the installed npm package. Only use `./scripts/install.sh --yes` when you are inside a GitHub clone:
55
+ `vc setup` runs the bootstrap bundled in the npm package and writes `.env`. It can install or prepare:
42
56
 
43
- ```bash
44
- ./scripts/install.sh --yes
45
- ```
46
-
47
- What this does:
48
-
49
- - installs npm dependencies when `node_modules/` is missing,
50
- - installs the short `vc` shell command with `npm link`,
51
- - installs `ffmpeg`, Node/npm, and `whisper-cli` when supported by the OS package manager,
52
- - downloads `models/ggml-small-q5_1.bin`,
53
- - creates `.venv-tts` and installs `edge-tts` when `edge-tts` is not already on `PATH`,
54
- - runs the interactive `.env` wizard.
57
+ - npm dependencies when `node_modules/` is missing,
58
+ - `ffmpeg`, Node/npm, Python venv support, build tools, and `whisper-cli` where supported,
59
+ - the default `models/ggml-small-q5_1.bin` whisper.cpp model,
60
+ - a local `.venv-tts` Edge TTS helper,
61
+ - the short `vc` shell command when running from a clone.
55
62
 
56
63
  Supported system bootstrap paths:
57
64
 
58
65
  | OS | System dependency path |
59
66
  |---|---|
60
67
  | macOS | Homebrew: `brew install node ffmpeg whisper-cpp` as needed |
61
- | Debian/Ubuntu | `apt-get` for Node/npm, ffmpeg, Python, build tools; local whisper.cpp build fallback |
62
- | Fedora/RHEL | `dnf` for Node/npm, ffmpeg, Python, build tools; local whisper.cpp build fallback |
63
- | Arch | `pacman` for Node/npm, ffmpeg, Python, build tools; local whisper.cpp build fallback |
68
+ | Debian/Ubuntu | `apt-get`; handles NodeSource npm conflicts and can locally build whisper.cpp |
69
+ | Fedora/RHEL | `dnf`; local whisper.cpp build fallback |
70
+ | Arch | `pacman`; local whisper.cpp build fallback |
71
+ | Windows | Not supported yet |
64
72
 
65
73
  Useful installer variants:
66
74
 
67
75
  ```bash
68
76
  vc setup --yes --no-wizard # dependency/bootstrap only from npm install
69
- ./scripts/install.sh --yes --no-wizard # dependency/bootstrap only from a clone
70
- ./scripts/install.sh --skip-system # do not install OS packages
71
- ./scripts/install.sh --skip-model # do not download the default STT model
72
- ./scripts/install.sh --skip-edge-tts # do not create .venv-tts
73
- VERBALCODING_SKIP_CLI_LINK=1 ./scripts/install.sh --yes
77
+ vc setup --yes --skip-system # skip OS package installation
78
+ vc setup --yes --skip-model # skip default STT model download
79
+ vc setup --yes --skip-edge-tts # skip local Edge TTS helper
80
+ ./scripts/install.sh --yes --no-wizard # clone-only non-interactive equivalent
74
81
  ```
75
82
 
76
- If your OS is unsupported, install these manually before rerunning:
77
-
78
- - Node.js 20+ and npm
79
- - ffmpeg
80
- - Python 3 with venv/pip
81
- - whisper.cpp `whisper-cli`
82
- - one authenticated CLI agent backend, Hermes Agent by default
83
+ ## 3. Discord values collected by setup
83
84
 
84
- ## 3. Discord application setup
85
-
86
- Read the upstream Discord bot setup guides first if this is your first bot:
85
+ Read the upstream Discord bot setup guides if this is your first bot:
87
86
 
88
87
  - Hermes Agent Discord messaging guide: <https://hermes-agent.nousresearch.com/docs/user-guide/messaging/discord>
89
88
  - Discord official bot overview: <https://docs.discord.com/developers/bots/overview>
90
89
  - Discord official getting started guide: <https://docs.discord.com/developers/quick-start/getting-started>
91
90
 
92
- Those pages show how to create a Discord application, add a bot user, enable privileged intents, and invite it to a server. VerbalCoding uses the same Discord bot setup, then adds voice receive, STT, CLI-agent execution, and TTS playback on top.
91
+ During `vc setup`:
93
92
 
94
- 1. Create a Discord application and bot in the Discord Developer Portal.
93
+ 1. Create a Discord application/bot in the Developer Portal.
95
94
  2. Enable the Message Content privileged intent.
96
- 3. Copy the bot token into the installer prompt or `.env` as `DISCORD_BOT_TOKEN`.
97
- 4. Generate an invite URL:
95
+ 3. Paste the bot token when asked for `DISCORD_BOT_TOKEN`.
96
+ 4. Paste the application/client ID when asked; setup can print the invite command.
97
+ 5. Enter the real voice channel names the bot should auto-join.
98
+
99
+ Invite URL helper:
98
100
 
99
101
  ```bash
100
102
  vc bot invite <discord-client-id>
101
- # or pin it to one server:
102
103
  vc bot invite <discord-client-id> --guild <guild-id>
103
104
  ```
104
105
 
105
- The invite includes bot and slash-command scopes plus text/voice permissions used by VerbalCoding.
106
+ If you skipped a value or need to rotate it later, update only that part:
107
+
108
+ ```bash
109
+ vc setup token
110
+ vc setup token <bot-token> --client-id <discord-client-id>
111
+ vc setup channels "VerbalCoding,LLM-Wiki,General"
112
+ ```
113
+
114
+ `vc setup token` updates `DISCORD_BOT_TOKEN` and optional `DISCORD_CLIENT_ID`; `vc setup channels` updates `AUTO_JOIN_VOICE_CHANNELS`. Both preserve unrelated `.env` values, set mode `0600`, and do not print secrets back.
106
115
 
107
- ## 4. Verify
116
+ ## 4. Auto-join voice channel names
117
+
118
+ Use the exact Discord voice channel names:
119
+
120
+ ```bash
121
+ vc setup channels
122
+ vc setup channels "General,Team Voice"
123
+ vc setup channel "General"
124
+ vc setup voice "General"
125
+ ```
126
+
127
+ Restart the bridge after changing channel names.
128
+
129
+ ## 5. Verify
108
130
 
109
131
  ```bash
110
132
  vc doctor
111
133
  ```
112
134
 
113
- `vc doctor` is redacted: it reports missing tokens/commands/models without printing secret values. When fixable local prerequisites are missing (`ffmpeg`, `whisper-cli`, the default model, or Edge TTS helper), it automatically reruns the packaged bootstrap first. Fix any remaining `✗` items, then rerun it.
135
+ `vc doctor` is redacted: it reports missing tokens/commands/models without printing secret values. On supported macOS/Linux installs it attempts to auto-fix installable prerequisites first, including `ffmpeg`, `whisper-cli`/model, Edge TTS helper, and Hermes CLI for the default Hermes backend. Use this opt-out if you only want diagnosis:
136
+
137
+ ```bash
138
+ VERBALCODING_DOCTOR_INSTALL_HERMES=0 vc doctor
139
+ ```
114
140
 
115
141
  Expected success includes:
116
142
 
@@ -126,9 +152,9 @@ Expected success includes:
126
152
  Doctor passed. Run vc start to start VerbalCoding.
127
153
  ```
128
154
 
129
- If the installer created a local Edge TTS helper, `.env` should contain an `EDGE_TTS_COMMAND` path pointing at `.venv-tts/bin/edge-tts`.
155
+ If `DISCORD_BOT_TOKEN` is missing, run `vc setup token`. If no configured channel is found, run `vc setup channels "<actual voice channel name>"`.
130
156
 
131
- ## 5. Run the single default bot
157
+ ## 6. Run the single default bot
132
158
 
133
159
  ```bash
134
160
  vc start
@@ -154,7 +180,25 @@ In Discord:
154
180
 
155
181
  Then speak in the configured voice channel. You should see STT text, progress text when verbose mode is on, a final text answer, and hear TTS playback.
156
182
 
157
- ## 6. Project-per-room setup
183
+ ## 7. Docker and containers
184
+
185
+ Discord text/gateway login uses TCP/WebSocket, but Discord voice also needs UDP. If `vc start` logs this, the channel was found but voice UDP discovery failed:
186
+
187
+ ```text
188
+ Cannot perform IP discovery - socket closed
189
+ ```
190
+
191
+ On Linux Docker Compose, use host networking for the service running `vc start`:
192
+
193
+ ```yaml
194
+ services:
195
+   verbalcoding:
196
+     network_mode: "host"
197
+ ```
198
+
199
+ Remove any `ports:` block from that service when using host networking. On Docker Desktop for macOS/Windows, host networking behaves differently; if UDP voice still fails, run VerbalCoding directly on the host or in a Linux VM. See [Troubleshooting](TROUBLESHOOTING.md).
200
+
201
+ ## 8. Project-per-room setup
158
202
 
159
203
  For one permanent bot per project voice room, create one Discord application per project, then:
160
204
 
@@ -167,7 +211,7 @@ vc instance status my-project
167
211
 
168
212
  Each instance writes an ignored `instances/<name>.env` with its own token, voice channel, transcript target, log path, Hermes session file, and optional Hermes profile.
169
213
 
170
- ## 7. Optional OpenVoice setup
214
+ ## 9. Optional OpenVoice setup
171
215
 
172
216
  OpenVoice voice cloning is optional. Keep `TTS_BACKEND=edge` for a fresh public install. To enable OpenVoice later:
173
217
 
@@ -181,7 +225,7 @@ python3 integrations/openvoice/synth.py --openvoice-dir vendor/OpenVoice --ref-a
181
225
 
182
226
  Then set `TTS_BACKEND=openvoice`, run `vc doctor`, and test `!voice-test <text>` in Discord.
183
227
 
184
- ## 8. Clean clone smoke test for maintainers
228
+ ## 10. Clean clone smoke test for maintainers
185
229
 
186
230
  Fast host-only smoke test:
187
231
 
@@ -196,12 +240,10 @@ chmod 600 .env
196
240
  vc doctor || true
197
241
  ```
198
242
 
199
- The expected failure at this point is missing local secrets or unauthenticated agent CLI, not leaked tokens or missing install scripts.
200
-
201
243
  Docker-based Ubuntu clean install smoke test:
202
244
 
203
245
  ```bash
204
246
  ./scripts/docker_ubuntu_smoke.sh
205
247
  ```
206
248
 
207
- This runs `ubuntu:24.04`, copies the tracked repository tree into a clean container, runs `./scripts/install.sh --yes --no-wizard`, writes a non-secret smoke `.env`, checks `vc`, runs Node tests, and verifies `vc doctor`. It does not connect to Discord voice; use a real Ubuntu VM or WSL2 after this if you need an end-to-end voice-channel test.
249
+ This validates bootstrap and doctor behavior in a clean container. It does not connect to Discord voice; use a real Linux host/VM for end-to-end voice UDP testing.
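The Fresh install changes above say that `vc setup token` and `vc setup channels` update only their own keys, preserve unrelated `.env` values, set mode `0600`, and do not print secrets back. A minimal Python sketch of that behaviour follows. It is an illustration only, not the code in `install_config.mjs`; the helper name `update_env_file` is hypothetical, and the file-mode step assumes a POSIX filesystem.

```python
import os
from pathlib import Path


def update_env_file(path, updates):
    """Set KEY=value pairs in a .env file, leaving unrelated lines untouched."""
    env_path = Path(path)
    lines = env_path.read_text().splitlines() if env_path.exists() else []
    remaining = dict(updates)
    out = []
    for line in lines:
        key = line.split("=", 1)[0].strip()
        is_assignment = "=" in line and not line.lstrip().startswith("#")
        if is_assignment and key in remaining:
            # Rewrite the first occurrence of a key we were asked to update.
            out.append(f"{key}={remaining.pop(key)}")
        else:
            # Comments, blank lines, and unrelated keys pass through unchanged.
            out.append(line)
    # Keys that were not present yet are appended at the end.
    out.extend(f"{key}={value}" for key, value in remaining.items())
    env_path.write_text("\n".join(out) + "\n")
    os.chmod(env_path, 0o600)  # owner read/write only
    return sorted(updates)  # report key names only, never the values
```

Returning only the key names lets a caller print something like "updated DISCORD_BOT_TOKEN" without echoing the token itself.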
@@ -0,0 +1,58 @@
1
+ # Coding Agent Harnesses
2
+
3
+ <p align="center">
4
+ <a href="../README.md">README</a> ·
5
+ <a href="README.md">Docs hub</a> ·
6
+ <a href="USAGE.md">Usage</a> ·
7
+ <a href="CONFIGURATION.md">Configuration</a> ·
8
+ <a href="TROUBLESHOOTING.md">Troubleshooting</a>
9
+ </p>
10
+
11
+ VerbalCoding is agent-agnostic. It drives whichever CLI coding agent you have installed by spawning it once per voice turn, feeding the transcript as a prompt, and speaking the response back. Pick **one** as your default; the cross-agent voice routing lets you reach the others mid-session.
12
+
13
+ | Harness | Default command | Session resume | Per-harness doc |
14
+ |---|---|---|---|
15
+ | Hermes Agent | `hermes chat -Q -q` | ✅ (`--resume <id>`) | [HERMES_VOICE.md](./HERMES_VOICE.md) (positioning) + [HARNESS_HERMES.md](./HARNESS_HERMES.md) |
16
+ | Claude Code | `claude -p` | ❌ | [HARNESS_CLAUDE.md](./HARNESS_CLAUDE.md) |
17
+ | Codex | `codex exec` | ❌ (output-last-message capture) | [HARNESS_CODEX.md](./HARNESS_CODEX.md) |
18
+ | Gemini CLI | `gemini -p` | ❌ | [HARNESS_GEMINI.md](./HARNESS_GEMINI.md) |
19
+ | OpenCode | `opencode run` | ❌ | [HARNESS_OPENCODE.md](./HARNESS_OPENCODE.md) |
20
+ | OpenClaw | `openclaw run` | ❌ | [HARNESS_OPENCLAW.md](./HARNESS_OPENCLAW.md) |
21
+ | Aider | `aider --no-pretty --yes-always --message` | ❌ | [HARNESS_AIDER.md](./HARNESS_AIDER.md) |
22
+ | Cursor CLI | `cursor-agent --print --prompt` | ❌ | [HARNESS_CURSOR.md](./HARNESS_CURSOR.md) |
23
+
24
+ ## Pick your default
25
+
26
+ `vc setup` auto-detects installed binaries and lets you pick. Non-interactive override:
27
+
28
+ ```bash
29
+ # .env or instance .env
30
+ AGENT_BACKEND=claude # hermes | claude | codex | gemini | opencode | openclaw | aider | cursor | custom
31
+ ```
32
+
33
+ Each harness picks up its own command from a matching env var (`HERMES_COMMAND`, `CLAUDE_COMMAND`, etc.). The shared envs `AGENT_LABEL`, `AGENT_COMMAND`, `AGENT_SESSION_FILE`, `AGENT_WORKDIR`, `AGENT_PROJECT_CONTEXT`, `AGENT_TASK_TIMEOUT_MS`, `AGENT_CHAT_TIMEOUT_MS`, `AGENT_VERBOSE_PROGRESS` override per-harness defaults when set.
34
+
35
+ ## Routing between harnesses by voice
36
+
37
+ Once configured, you can reach any **installed** harness from a voice channel without restarting:
38
+
39
+ - `"ask Codex what it thinks"` — single-turn route, next utterance returns to the default.
40
+ - `"switch to Aider"` — sticky route until you say `"back to default"`.
41
+ - Plan-mode `which_agent` slot — the agent itself proposes which backend runs the next plan.
42
+
43
+ The routing layer detects whether the binary is on `PATH` (resolving relative commands against the active project session's workdir). If not installed, the bridge asks `"Want me to use the default agent instead?"` — answer `"yes"` to fall back or `"no"` to cancel.
44
+
45
+ Aliases recognized by the parser: `claude` / `claude code`, `codex` / `코덱스`, `gemini` / `gemini cli` / `제미나이`, `opencode`, `openclaw`, `aider` / `에이더`, `cursor` / `cursor cli`, `hermes` / `헤르메스`.
46
+
47
+ ## Shared semantics
48
+
49
+ Things every harness adapter respects:
50
+
51
+ - **Voice plan mode** — `"plan it first"` → narrate a plan; edit by voice; `"approve"` to execute against the chosen harness.
52
+ - **Barge-in** — interrupting cuts the current TTS and aborts the agent task. Sticky routing survives interrupts; only single-turn routes are cleared.
53
+ - **Verbose progress** — `AGENT_VERBOSE_PROGRESS=1` (or `"상세 진행 켜"`) prints structured progress events the harness emits (file reads, web search, tool use). Smart-progress, if `SMART_PROGRESS_API_KEY` is set, summarizes these into one sentence per batch.
54
+ - **Push handoff** — `NOTIFY_PROVIDER=ntfy|pushover` plus `NOTIFY_MIN_TASK_MS` fires a push notification when a long task completes and the voice channel is empty. Debounced by body + `NOTIFY_DEBOUNCE_MS`.
55
+ - **Per-channel state** — each Discord voice channel keeps its own routing, plan-mode, and recent-utterance ring buffer.
56
+ - **Project sessions** — `!session new <name> <workdir>` binds a Discord channel to a project; per-(harness, session) adapters are cached and invalidated on rebind.
57
+
58
+ See per-harness docs for install paths, auth, and gotchas. `docs/CONFIGURATION.md` is the canonical env-var reference.
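The voice routing rules in HARNESSES.md above (single-turn `"ask <agent> ..."`, sticky `"switch to <agent>"`, and `"back to default"`) can be sketched roughly as below. This is a hedged Python illustration, not the parser in `agent_routing.mjs`: the function names are hypothetical, only the English trigger phrases are handled, and the PATH check ignores the workdir-relative resolution the docs mention. The alias table is copied from the list above.

```python
import re
import shutil

# Aliases from the doc; canonical names match the AGENT_BACKEND values.
ALIASES = {
    "claude code": "claude", "claude": "claude",
    "codex": "codex", "코덱스": "codex",
    "gemini cli": "gemini", "gemini": "gemini", "제미나이": "gemini",
    "opencode": "opencode", "openclaw": "openclaw",
    "aider": "aider", "에이더": "aider",
    "cursor cli": "cursor", "cursor": "cursor",
    "hermes": "hermes", "헤르메스": "hermes",
}
# Longest aliases first so "claude code" wins over "claude".
_ALIAS_RE = "|".join(re.escape(a) for a in sorted(ALIASES, key=len, reverse=True))
_ASK = re.compile(rf"^\s*ask\s+({_ALIAS_RE})\b", re.IGNORECASE)
_SWITCH = re.compile(rf"^\s*switch\s+to\s+({_ALIAS_RE})\b", re.IGNORECASE)
_BACK = re.compile(r"^\s*back\s+to\s+default\b", re.IGNORECASE)


def parse_route(utterance):
    """Return (kind, agent); kind is 'single', 'sticky', 'reset', or None."""
    if _BACK.search(utterance):
        return ("reset", None)
    for kind, pattern in (("single", _ASK), ("sticky", _SWITCH)):
        match = pattern.search(utterance)
        if match:
            return (kind, ALIASES[match.group(1).lower()])
    return (None, None)


def is_installed(command):
    """True when the first word of a harness command resolves on PATH."""
    return shutil.which(command.split()[0]) is not None
```

A caller would check `is_installed("codex exec")` before honouring a route and, if it returns `False`, ask whether to fall back to the default agent.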
@@ -0,0 +1,50 @@
1
+ # Aider — Harness Notes
2
+
3
+ <p align="center">
4
+ <a href="../README.md">README</a> ·
5
+ <a href="HARNESSES.md">Harnesses</a> ·
6
+ <a href="USAGE.md">Usage</a> ·
7
+ <a href="CONFIGURATION.md">Configuration</a>
8
+ </p>
9
+
10
+ Aider is a pair-programming AI CLI focused on direct edits. VerbalCoding drives it through `aider --no-pretty --yes-always --message` — the prompt is passed as the `--message` value so each voice turn becomes one non-interactive Aider run that may modify files in `AGENT_WORKDIR`.
11
+
12
+ ## Install
13
+
14
+ ```bash
15
+ pip install aider-chat
16
+ aider --version
17
+ # Confirm a single-message run works:
18
+ aider --no-pretty --yes-always --message "list the top-level files"
19
+ ```
20
+
21
+ Aider needs an API key for the model you point it at (OpenAI / Anthropic / a local server). See <https://aider.chat>.
22
+
23
+ ## Configure VerbalCoding
24
+
25
+ ```bash
26
+ # .env
27
+ AGENT_BACKEND=aider
28
+ # optional
29
+ AIDER_COMMAND="aider --no-pretty --yes-always --message" # default
30
+ AGENT_WORKDIR=/Users/you/code/your-project # where Aider should edit
31
+ AGENT_PROJECT_CONTEXT="..."
32
+ AGENT_CHAT_TIMEOUT_MS=120000 # Aider can take longer
33
+ AGENT_TASK_TIMEOUT_MS=0
34
+ ```
35
+
36
+ `--no-pretty` strips Rich-formatting box characters so the stream sentencer doesn't choke on them. `--yes-always` keeps the run non-interactive (Aider won't pause for "apply this diff?" prompts).
37
+
38
+ ## Voice phrases to switch TO Aider
39
+
40
+ - en: `"switch to Aider"`, `"ask Aider to ..."`
41
+ - ko: `"aider로 전환해줘"`, `"에이더로 전환"`
42
+
43
+ The matcher accepts `aider` and `에이더`.
44
+
45
+ ## Gotchas
46
+
47
+ - **Aider edits files.** Unlike the Claude (`claude -p`), Codex (`codex exec`), and Gemini (`gemini -p`) harnesses, Aider directly modifies the working tree as part of answering. Be deliberate about `AGENT_WORKDIR`, which is usually a project session's `workdir`.
48
+ - **Diffs in output.** Aider often emits diff-shaped text. If a turn is interrupted, the bridge speaks an "interrupted" notice and skips reading the diff aloud — check the text channel and `git status`.
49
+ - **Auth.** `OPENAI_API_KEY` / `ANTHROPIC_API_KEY` need to be in Aider's environment; instance-isolated installs typically use `instances/<project>.env`.
50
+ - **Per-channel state.** Cross-agent routing is per Discord channel; switching to Aider in one project room does not affect another.
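The Aider notes above say the prompt is passed as the `--message` value, so each voice turn is one non-interactive run. A minimal Python sketch of how such a command line could be assembled follows. It is an assumption about the general approach rather than the adapter's actual code, and `build_agent_argv` is a hypothetical name. Passing the prompt as a single argv element, with no shell involved, keeps quotes or semicolons in a transcript from being interpreted.

```python
import shlex


def build_agent_argv(command, prompt):
    """Split the configured command and append the prompt as one argument."""
    return shlex.split(command) + [prompt]


argv = build_agent_argv(
    "aider --no-pretty --yes-always --message",  # default AIDER_COMMAND from the doc
    "rename the helper and update its tests",
)
# A bridge could then run, for example:
#   subprocess.run(argv, cwd=agent_workdir, capture_output=True, text=True, timeout=120)
# where agent_workdir comes from AGENT_WORKDIR and the timeout from
# AGENT_CHAT_TIMEOUT_MS (120000 ms in the example config above).
```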
@@ -0,0 +1,56 @@
1
+ # Claude Code — Harness Notes
2
+
3
+ <p align="center">
4
+ <a href="../README.md">README</a> ·
5
+ <a href="HARNESSES.md">Harnesses</a> ·
6
+ <a href="USAGE.md">Usage</a> ·
7
+ <a href="CONFIGURATION.md">Configuration</a>
8
+ </p>
9
+
10
+ Claude Code is Anthropic's official terminal-resident coding agent. VerbalCoding drives it through `claude -p`, where each voice turn is one invocation. Claude Code does not expose a stable session-resume contract over `-p`, so each call is a fresh context — use `AGENT_PROJECT_CONTEXT` and the cross-agent handoff block to keep continuity.
11
+
12
+ ## Install
13
+
14
+ ```bash
15
+ npm install -g @anthropic-ai/claude-code
16
+ claude login
17
+ claude -p "hello" # confirm it answers
18
+ ```
19
+
20
+ ## Configure VerbalCoding
21
+
22
+ ```bash
23
+ # .env
24
+ AGENT_BACKEND=claude # alias 'claude-code' also accepted
25
+ # optional
26
+ CLAUDE_COMMAND="claude -p" # default; override e.g. to add --model, --debug
27
+ AGENT_PROJECT_CONTEXT="Working on the auth module; previous decisions: oauth=github."
28
+ AGENT_WORKDIR=/Users/you/code/your-project
29
+ AGENT_CHAT_TIMEOUT_MS=45000
30
+ AGENT_TASK_TIMEOUT_MS=0
31
+ AGENT_VERBOSE_PROGRESS=0
32
+ ```
33
+
34
+ `AGENT_SESSION_FILE` defaults to `<repo>/.agent-sessions/claude` but is **unused** by this harness, because Claude Code's `-p` is stateless. You can leave it set; it has no effect.
35
+
36
+ ## What Claude sees per turn
37
+
38
+ On every turn the adapter prepends a Discord-aware preamble (English or Korean depending on `VOICE_LANGUAGE`), the project context, and recent Discord text context, followed by the user's transcribed utterance. On a cross-agent handoff (e.g. you said `"ask Codex ..."` last turn and just spoke again), the prepended block also includes a "Recent user voice" line containing up to the last 4 utterances, plus the most recently resolved plan decisions, so Claude doesn't start cold.
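
The ordering described above can be sketched as a single prompt-assembly function. This is illustrative only: `buildTurnPrompt`, its parameter names, and every section label except "Recent user voice" are assumptions, not the adapter's real code.

```javascript
// Hypothetical sketch of per-turn prompt assembly; field names and labels
// (other than "Recent user voice") are assumptions for illustration.
function buildTurnPrompt({ preamble, projectContext, recentText = [], utterance, handoff = null }) {
  const parts = [preamble];
  if (projectContext) parts.push(`Project context: ${projectContext}`);
  if (recentText.length) parts.push(`Recent Discord text:\n${recentText.join('\n')}`);
  if (handoff) {
    // Only the last 4 utterances are carried across a backend change.
    const voices = handoff.utterances.slice(-4);
    if (voices.length) parts.push(`Recent user voice: ${voices.join(' | ')}`);
    if (handoff.decisions?.length) parts.push(`Plan decisions: ${handoff.decisions.join('; ')}`);
  }
  parts.push(utterance); // the transcribed utterance always comes last
  return parts.join('\n\n');
}
```

Passing `handoff: null` gives the same-backend case, where only the preamble, project context, and recent text precede the utterance.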
39
+
40
+ ## Verbose progress
41
+
42
+ Claude Code does not emit a standard progress stream over `-p`. `AGENT_VERBOSE_PROGRESS=1` still works — the adapter parses tool/file/web mentions out of stdout/stderr if Claude prints them — but expect coarser progress than Hermes.
43
+
44
+ ## Voice phrases to switch TO Claude Code
45
+
46
+ - en: `"switch to Claude Code"`, `"ask Claude ..."`, `"let Claude finish this"`
47
+ - ko: `"클로드로 전환"`, `"claude한테 물어봐"`
48
+
49
+ The matcher accepts both `claude` and `claude code` as aliases; strict mode (used for routing-only utterances) requires an exact alias.
50
+
51
+ ## Gotchas
52
+
53
+ - **No session resume.** A long-running pair-programming session needs the cross-agent handoff context block to carry decisions forward. The bridge does this automatically on backend changes; within the same backend, set `AGENT_PROJECT_CONTEXT` to a short summary.
54
+ - **Quoted command paths.** If `CLAUDE_COMMAND` uses a quoted absolute path (e.g. `"/Applications/Claude Code/claude" -p`), VerbalCoding's installation probe uses `shellSplit` and honors quotes correctly.
55
+ - **Auth refresh.** An expired `claude login` token surfaces as a non-zero exit; the bridge reports the failure and, if Claude was a non-default backend, the fallback prompt offers to retry on the default.
56
+ - **Patch-like output.** If Claude returns a diff and the turn is interrupted, the bridge says `"the agent was interrupted; check the text channel for files and tests"` rather than reading the diff aloud.
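
To make the "Quoted command paths" point concrete, here is a minimal quote-aware splitter. It is a sketch of the general technique, not VerbalCoding's `shellSplit`: `splitCommand` and `buildArgv` are hypothetical names, and backslash escapes are deliberately not handled.

```javascript
// Illustrative quote-aware splitter; NOT VerbalCoding's actual `shellSplit`.
// Handles single and double quotes only (no backslash escapes).
function splitCommand(command) {
  const tokens = [];
  let current = '';
  let quote = null;     // the active quote character, if any
  let hasToken = false; // true once the current token has started
  for (const ch of command) {
    if (quote) {
      if (ch === quote) quote = null;
      else current += ch;
    } else if (ch === '"' || ch === "'") {
      quote = ch;
      hasToken = true;
    } else if (/\s/.test(ch)) {
      if (hasToken) tokens.push(current);
      current = '';
      hasToken = false;
    } else {
      current += ch;
      hasToken = true;
    }
  }
  if (quote) throw new Error('unterminated quote in command');
  if (hasToken) tokens.push(current);
  return tokens;
}

// The prompt is appended as one argv entry, so it never needs shell quoting.
function buildArgv(command, prompt) {
  return [...splitCommand(command), prompt];
}
```

With this approach `"/Applications/Claude Code/claude" -p` yields two tokens, so the space inside the path does not break the executable name.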
@@ -0,0 +1,56 @@
1
+ # Codex — Harness Notes
2
+
3
+ <p align="center">
4
+ <a href="../README.md">README</a> ·
5
+ <a href="HARNESSES.md">Harnesses</a> ·
6
+ <a href="USAGE.md">Usage</a> ·
7
+ <a href="CONFIGURATION.md">Configuration</a>
8
+ </p>
9
+
10
+ Codex CLI is OpenAI's terminal coding agent. VerbalCoding drives it through `codex exec`. Because `codex exec` writes its final assistant message to the given file when `--output-last-message <path>` is passed, the adapter inserts that flag automatically with a temp path and reads the file back, so the answer survives even if stdout is noisy.
11
+
12
+ ## Install
13
+
14
+ ```bash
15
+ npm install -g @openai/codex
16
+ codex login # or set OPENAI_API_KEY for headless use
17
+ codex exec "hello"
18
+ ```
19
+
20
+ ## Configure VerbalCoding
21
+
22
+ ```bash
23
+ # .env
24
+ AGENT_BACKEND=codex
25
+ # optional
26
+ CODEX_COMMAND="codex exec" # default
27
+ AGENT_PROJECT_CONTEXT="What we're working on, what's already decided."
28
+ AGENT_WORKDIR=/Users/you/code/your-project
29
+ AGENT_CHAT_TIMEOUT_MS=45000
30
+ AGENT_TASK_TIMEOUT_MS=0
31
+ ```
32
+
33
+ `AGENT_SESSION_FILE` is unused (Codex `exec` is stateless across calls).
34
+
35
+ ## Output capture
36
+
37
+ For Codex, the adapter:
38
+
39
+ 1. Generates a temp path under `os.tmpdir()` like `verbalcoding-codex-last-<pid>-<ts>.txt`.
40
+ 2. Inserts `--output-last-message <path>` immediately before the final positional prompt arg.
41
+ 3. After the run, reads that file as the authoritative answer (preferred over `stdout`).
42
+ 4. Deletes the temp file.
43
+
44
+ This is robust to Codex emitting tool-use chatter on stdout; the spoken answer always comes from the captured file.
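
The four steps above can be sketched as follows. The helper names (`lastMessagePath`, `withOutputFlag`, `readAnswer`) are hypothetical, the stdout fallback is an assumption about what "preferred over `stdout`" implies, and `tmpDir` stands in for `os.tmpdir()`.

```javascript
// Hypothetical helpers; names and signatures are illustrative,
// not VerbalCoding's actual adapter API.

// Step 1: build a unique temp path. `tmpDir` would normally be os.tmpdir().
function lastMessagePath(tmpDir, pid = process.pid, ts = Date.now()) {
  return `${tmpDir}/verbalcoding-codex-last-${pid}-${ts}.txt`;
}

// Step 2: insert the flag immediately before the final positional prompt arg.
function withOutputFlag(argv, outPath) {
  if (argv.length === 0) throw new Error('argv must end with the prompt');
  const prompt = argv[argv.length - 1];
  return [...argv.slice(0, -1), '--output-last-message', outPath, prompt];
}

// Steps 3-4: prefer the captured file over stdout, then delete the temp file.
// Falling back to stdout when the file is missing or empty is an assumption.
async function readAnswer(outPath, stdout) {
  const { readFile, rm } = await import('node:fs/promises');
  try {
    const text = (await readFile(outPath, 'utf8')).trim();
    return text || stdout.trim();
  } catch {
    return stdout.trim();
  } finally {
    await rm(outPath, { force: true }); // force: no error if already gone
  }
}
```

Keeping steps 1 and 2 as pure functions makes the argv shape easy to unit-test without spawning Codex.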
45
+
46
+ ## Voice phrases to switch TO Codex
47
+
48
+ - en: `"switch to Codex"`, `"ask Codex what it thinks"`
49
+ - ko: `"코덱스로 전환"`, `"코덱스한테 물어봐"`
50
+
51
+ ## Gotchas
52
+
53
+ - **Long tasks.** Set `AGENT_TASK_TIMEOUT_MS=0` for codegen runs that may take minutes. The adapter respects `signal.aborted` so barge-in still cuts cleanly.
54
+ - **No session resume.** Pass context via `AGENT_PROJECT_CONTEXT` and rely on the cross-agent handoff block for continuity after a route change.
55
+ - **Patch-like output safety.** If a turn is interrupted and Codex was mid-diff, the bridge does **not** read the diff aloud — it speaks an "interrupted" notice and asks you to check the text channel.
56
+ - **Auth.** A 401 from the OpenAI backend surfaces as a non-zero exit; the bridge reports the failure and the cross-agent fallback prompt offers the default agent.
@@ -0,0 +1,45 @@
1
+ # Cursor CLI — Harness Notes
2
+
3
+ <p align="center">
4
+ <a href="../README.md">README</a> ·
5
+ <a href="HARNESSES.md">Harnesses</a> ·
6
+ <a href="USAGE.md">Usage</a> ·
7
+ <a href="CONFIGURATION.md">Configuration</a>
8
+ </p>
9
+
10
+ Cursor CLI (`cursor-agent`) is Cursor's terminal agent. VerbalCoding drives it through `cursor-agent --print --prompt`, passing the user's transcribed utterance as the prompt value. `--print` keeps the run non-interactive.
11
+
12
+ ## Install
13
+
14
+ Follow the upstream Cursor CLI install. Confirm:
15
+
16
+ ```bash
17
+ cursor-agent --print --prompt "hello"
18
+ ```
19
+
20
+ ## Configure VerbalCoding
21
+
22
+ ```bash
23
+ # .env
24
+ AGENT_BACKEND=cursor # alias 'cursor-cli' also accepted
25
+ # optional
26
+ CURSOR_COMMAND="cursor-agent --print --prompt" # default
27
+ AGENT_PROJECT_CONTEXT="..."
28
+ AGENT_WORKDIR=/Users/you/code/your-project
29
+ AGENT_CHAT_TIMEOUT_MS=45000
30
+ AGENT_TASK_TIMEOUT_MS=0
31
+ ```
32
+
33
+ ## Voice phrases to switch TO Cursor
34
+
35
+ - en: `"switch to Cursor"`, `"ask Cursor ..."`, `"switch to cursor cli"`, `"switch to cursor agent"`
36
+ - ko: `"커서로 전환"`, `"cursor한테 물어봐"`
37
+
38
+ The matcher accepts `cursor`, `cursor cli`, `cursor-cli`, `cursor agent`, and `cursor-agent`.
39
+
40
+ ## Gotchas
41
+
42
+ - **Prompt placement.** `--prompt` expects the value to follow; VerbalCoding's shell-aware argv builder places the transcribed utterance as the final positional argument, so `CURSOR_COMMAND` must end with `--prompt`.
43
+ - **Editor side-effects.** Cursor's CLI may touch local Cursor-related state files in the working directory; if that's surprising for a voice-only flow, point `AGENT_WORKDIR` at an isolated project dir.
44
+ - **No session resume.** Use `AGENT_PROJECT_CONTEXT` for cross-turn continuity, plus the cross-agent handoff block when routing back from a different harness.
45
+ - **Patch safety.** If Cursor returns a diff and the turn is interrupted, the bridge does not read the diff aloud.
@@ -0,0 +1,45 @@
1
+ # Gemini CLI — Harness Notes
2
+
3
+ <p align="center">
4
+ <a href="../README.md">README</a> ·
5
+ <a href="HARNESSES.md">Harnesses</a> ·
6
+ <a href="USAGE.md">Usage</a> ·
7
+ <a href="CONFIGURATION.md">Configuration</a>
8
+ </p>
9
+
10
+ Gemini CLI is Google's terminal coding agent. VerbalCoding drives it through `gemini -p`. Each voice turn is one invocation; there is no built-in session-resume across calls.
11
+
12
+ ## Install
13
+
14
+ Follow the upstream Gemini CLI install guide. Confirm:
15
+
16
+ ```bash
17
+ gemini -p "hello"
18
+ ```
19
+
20
+ ## Configure VerbalCoding
21
+
22
+ ```bash
23
+ # .env
24
+ AGENT_BACKEND=gemini
25
+ # optional
26
+ GEMINI_COMMAND="gemini -p" # default; add --model, --debug as needed
27
+ AGENT_PROJECT_CONTEXT="..."
28
+ AGENT_WORKDIR=/Users/you/code/your-project
29
+ AGENT_CHAT_TIMEOUT_MS=45000
30
+ AGENT_TASK_TIMEOUT_MS=0
31
+ ```
32
+
33
+ ## Voice phrases to switch TO Gemini
34
+
35
+ - en: `"switch to Gemini"`, `"ask Gemini ..."`, `"switch to Gemini CLI"`
36
+ - ko: `"제미나이로 전환"`, `"gemini한테 물어봐"`
37
+
38
+ The matcher accepts `gemini`, `gemini cli`, `gemini-cli`, and `제미나이`.
39
+
40
+ ## Gotchas
41
+
42
+ - **No session resume.** Same continuity story as Claude / Codex: rely on `AGENT_PROJECT_CONTEXT` and the cross-agent handoff block.
43
+ - **Long answers.** Gemini sometimes returns large structured responses; the stream sentencer splits them into TTS-able sentences. Code fences are stripped from speech (the text channel still gets the full answer with code).
44
+ - **API key.** If Gemini exits non-zero with an auth error, the bridge reports the message; the cross-agent fallback prompt offers the default agent if Gemini was a non-default route.
45
+ - **Verbose progress.** Gemini's stdout doesn't follow Hermes' `┊`-style preview format, so verbose progress mostly relies on the smart-progress LLM summarizer.
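
The "Long answers" behavior (code fences dropped from speech, prose split into sentences) can be approximated like this. It is a rough sketch under stated assumptions: `speakableSentences` is a hypothetical name, and the real stream sentencer works incrementally on a stream rather than on a complete string.

```javascript
// Rough approximation of "strip code fences, then split into sentences".
// The real stream sentencer is incremental; this works on a full answer.
const FENCE = '`'.repeat(3);
const FENCED_BLOCK = new RegExp(`${FENCE}[\\s\\S]*?${FENCE}`, 'g');

function speakableSentences(answer) {
  // Remove fenced code so it is never sent to TTS.
  const prose = answer.replace(FENCED_BLOCK, ' ');
  // Split after sentence-ending punctuation followed by whitespace.
  return prose
    .split(/(?<=[.!?])\s+/)
    .map((s) => s.trim())
    .filter(Boolean);
}
```

The text channel would still receive the unmodified `answer`; only the TTS path goes through this filter.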
@@ -0,0 +1,57 @@
1
+ # Hermes Agent — Harness Notes
2
+
3
+ <p align="center">
4
+ <a href="../README.md">README</a> ·
5
+ <a href="HARNESSES.md">Harnesses</a> ·
6
+ <a href="USAGE.md">Usage</a> ·
7
+ <a href="CONFIGURATION.md">Configuration</a>
8
+ </p>
9
+
10
+ Hermes Agent is VerbalCoding's default backend. It is the one harness with a real session-resume contract, so chat across turns retains context cleanly. For how VerbalCoding compares with Hermes' built-in `/voice` slash command, see [HERMES_VOICE.md](./HERMES_VOICE.md).
11
+
12
+ ## Install
13
+
14
+ Follow the upstream Hermes Agent install guide: <https://hermes-agent.nousresearch.com>.
15
+
16
+ Verify the CLI works directly first:
17
+
18
+ ```bash
19
+ hermes chat -Q -q "hello"
20
+ ```
21
+
22
+ ## Configure VerbalCoding
23
+
24
+ ```bash
25
+ # .env
26
+ AGENT_BACKEND=hermes
27
+ # optional overrides
28
+ HERMES_COMMAND="hermes chat -Q -q" # default
29
+ HERMES_HOME=/Users/you/.hermes # per-instance Hermes home
30
+ HERMES_PROJECT_CONTEXT="Project session: ..."
31
+ HERMES_TASK_TIMEOUT_MS=0 # 0 = no limit
32
+ HERMES_CHAT_TIMEOUT_MS=45000
33
+ HERMES_WORKDIR=/Users/you/code/your-project
34
+ ```
35
+
36
+ The session file lives at `<repo>/.verbalcoding-session` by default (override with `HERMES_SESSION_FILE`).
37
+
38
+ ## Session resume
39
+
40
+ Hermes is the only built-in adapter with session resume. After each successful turn the adapter writes the new `session_id` to disk and prepends `--resume <id>` to the next call. `!session reset` (or `!reset-session`) clears that file.
41
+
42
+ If a turn aborts before Hermes emits `session_id:` on stderr, the adapter also reads the Hermes session JSON at `~/.hermes/sessions/session_<id>.json` to recover the last assistant message.
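
A minimal sketch of the resume bookkeeping, assuming Hermes prints a line of the form `session_id: <id>` on stderr. `parseSessionId` and `resumeArgs` are hypothetical helpers; where the `--resume <id>` pair lands in the real argv, and how the id is written to the session file, are not shown here.

```javascript
// Hypothetical helpers. The exact stderr format beyond the `session_id:`
// prefix is an assumption.
function parseSessionId(stderr) {
  const match = /session_id:\s*(\S+)/.exec(stderr);
  return match ? match[1] : null;
}

// Extra arguments for the next call: empty on a fresh or reset session.
function resumeArgs(sessionId) {
  return sessionId ? ['--resume', sessionId] : [];
}
```

After `!session reset` the stored id would be gone, so `resumeArgs(null)` returns an empty array and the next turn starts a new session.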
43
+
44
+ ## Verbose progress
45
+
46
+ In verbose mode the adapter drops Hermes' `-Q` quiet flag so stdout streams `┊ <emoji> <tool>` previews. These get summarized into one-line progress events (file reads, web search, terminal). Without verbose, only the final boxed answer plays.
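
As an illustration of how `┊ <emoji> <tool>` previews might become progress events, the sketch below parses one stdout line at a time. The precise preview layout beyond that pattern is an assumption, and `parsePreviewLine` and `progressEvents` are hypothetical names.

```javascript
// Parses one Hermes preview line of the assumed shape "┊ <emoji> <tool...>".
// Returns null for any line that is not a preview.
function parsePreviewLine(line) {
  const match = /^\s*┊\s*(\S+)\s+(.+?)\s*$/u.exec(line);
  return match ? { emoji: match[1], tool: match[2] } : null;
}

// Collects preview events from a chunk of stdout, skipping other output.
function progressEvents(stdout) {
  return stdout.split('\n').map(parsePreviewLine).filter(Boolean);
}
```

Each event could then be condensed into a one-line spoken update, such as a file read or a web search.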
47
+
48
+ ## Voice phrases to switch TO Hermes
49
+
50
+ - en: `"switch to Hermes"`, `"ask Hermes ..."`
51
+ - ko: `"헤르메스로 전환"`, `"헤르메스한테 물어봐"`
52
+
53
+ ## Gotchas
54
+
55
+ - The TTS prefix on cross-agent handoff uses the localized label: `"Hermes says: "` / `"헤르메스: "`.
56
+ - `HERMES_HOME` is the most common per-project isolation knob; per-instance `.env` typically sets `HERMES_HOME=/Users/you/.hermes/profiles/<project>`.
57
+ - If verbose progress is on and Hermes still finishes with an empty box (timed out), the adapter scrapes the session JSON for the final assistant text before giving up.