npm - verbalcoding - Versions diffs - 0.2.11 → 0.2.13 - Mend

verbalcoding 0.2.11 → 0.2.13

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (235)

package/.env.example +98 -2
package/README.es.md +134 -0
package/README.fr.md +134 -0
package/README.ja.md +134 -0
package/README.ko.md +134 -0
package/README.md +118 -74
package/README.ru.md +134 -0
package/README.zh.md +133 -0
package/app-node/agent_adapters.mjs +37 -5
package/app-node/agent_adapters.test.mjs +27 -1
package/app-node/agent_detect.mjs +73 -0
package/app-node/agent_detect.test.mjs +77 -0
package/app-node/agent_routing.mjs +148 -0
package/app-node/agent_routing.test.mjs +138 -0
package/app-node/agent_turn.mjs +86 -0
package/app-node/agent_turn.test.mjs +109 -0
package/app-node/bridge_context.mjs +73 -0
package/app-node/bridge_context.test.mjs +54 -0
package/app-node/bridge_state.mjs +4 -0
package/app-node/bridge_wireup.test.mjs +462 -0
package/app-node/cli_install.test.mjs +31 -0
package/app-node/cross_agent_routing.test.mjs +78 -0
package/app-node/discord_command_router.mjs +204 -0
package/app-node/discord_command_router.test.mjs +311 -0
package/app-node/discord_voice_setup.mjs +251 -0
package/app-node/discord_voice_setup.test.mjs +86 -0
package/app-node/hermes_profiles.test.mjs +12 -1
package/app-node/install_config.mjs +113 -3
package/app-node/install_config.test.mjs +8 -0
package/app-node/instance_doctor.test.mjs +9 -0
package/app-node/instances.test.mjs +8 -1
package/app-node/main.mjs +513 -1058
package/app-node/mcp_tools.test.mjs +7 -0
package/app-node/notification_handler.mjs +89 -0
package/app-node/notification_handler.test.mjs +187 -0
package/app-node/notify.mjs +73 -0
package/app-node/notify.test.mjs +68 -0
package/app-node/plan_dispatcher.mjs +215 -0
package/app-node/plan_dispatcher.test.mjs +101 -0
package/app-node/plan_mode.mjs +203 -0
package/app-node/plan_mode.test.mjs +231 -0
package/app-node/progress_handler.mjs +220 -0
package/app-node/progress_handler.test.mjs +193 -0
package/app-node/progress_speech.mjs +54 -32
package/app-node/progress_speech.test.mjs +12 -3
package/app-node/project_sessions.mjs +5 -2
package/app-node/project_sessions.test.mjs +7 -0
package/app-node/research_mode.mjs +282 -0
package/app-node/research_mode.test.mjs +264 -0
package/app-node/restart_notice.mjs +3 -0
package/app-node/restart_notice.test.mjs +11 -0
package/app-node/session_ontology.mjs +271 -0
package/app-node/session_ontology.test.mjs +130 -0
package/app-node/smart_progress.mjs +94 -0
package/app-node/smart_progress.test.mjs +66 -0
package/app-node/stream_sentencer.mjs +91 -0
package/app-node/stream_sentencer.test.mjs +129 -0
package/app-node/streaming_tts_queue.mjs +52 -0
package/app-node/streaming_tts_queue.test.mjs +64 -0
package/app-node/stt_whisper.mjs +24 -0
package/app-node/stt_whisper.test.mjs +32 -0
package/app-node/text_routing.mjs +22 -0
package/app-node/text_routing.test.mjs +23 -1
package/app-node/tts_backends.mjs +537 -3
package/app-node/tts_backends.test.mjs +454 -0
package/app-node/tts_player.mjs +164 -0
package/app-node/tts_player.test.mjs +202 -0
package/app-node/tts_runtime.mjs +134 -0
package/app-node/tts_runtime.test.mjs +89 -0
package/app-node/tts_settings.mjs +150 -3
package/app-node/tts_settings.test.mjs +204 -0
package/app-node/tts_voice_config.mjs +136 -2
package/app-node/tts_voice_config.test.mjs +94 -0
package/app-node/utterance_router.mjs +216 -0
package/app-node/utterance_router.test.mjs +236 -0
package/app-node/voice_autojoin.mjs +37 -0
package/app-node/voice_autojoin.test.mjs +59 -0
package/app-node/voice_io.mjs +272 -0
package/app-node/voice_io.test.mjs +102 -0
package/app-node/voice_turn_runner.mjs +449 -0
package/app-node/voice_turn_runner.test.mjs +289 -0
package/docs/CONFIGURATION.md +79 -96
package/docs/FRESH_INSTALL.md +105 -63
package/docs/HARNESSES.md +58 -0
package/docs/HARNESS_AIDER.md +50 -0
package/docs/HARNESS_CLAUDE.md +56 -0
package/docs/HARNESS_CODEX.md +56 -0
package/docs/HARNESS_CURSOR.md +45 -0
package/docs/HARNESS_GEMINI.md +45 -0
package/docs/HARNESS_HERMES.md +57 -0
package/docs/HARNESS_OPENCLAW.md +44 -0
package/docs/HARNESS_OPENCODE.md +44 -0
package/docs/HERMES_VOICE.md +65 -0
package/docs/MULTI_INSTANCE.md +16 -0
package/docs/README.md +50 -0
package/docs/RELEASE.md +42 -19
package/docs/ROADMAP.md +53 -0
package/docs/TROUBLESHOOTING.md +126 -0
package/docs/TTS_BACKENDS.md +227 -0
package/docs/USAGE.md +94 -40
package/docs/assets/figures/verbalcoding-flow.svg +1 -1
package/docs/i18n/AGENTS.es.md +34 -0
package/docs/i18n/AGENTS.fr.md +34 -0
package/docs/i18n/AGENTS.ja.md +34 -0
package/docs/i18n/AGENTS.ko.md +34 -0
package/docs/i18n/AGENTS.ru.md +34 -0
package/docs/i18n/AGENTS.zh.md +34 -0
package/docs/i18n/CONFIGURATION.es.md +25 -0
package/docs/i18n/CONFIGURATION.fr.md +25 -0
package/docs/i18n/CONFIGURATION.ja.md +25 -0
package/docs/i18n/CONFIGURATION.ko.md +25 -0
package/docs/i18n/CONFIGURATION.ru.md +25 -0
package/docs/i18n/CONFIGURATION.zh.md +25 -0
package/docs/i18n/FRESH_INSTALL.es.md +27 -2
package/docs/i18n/FRESH_INSTALL.fr.md +27 -2
package/docs/i18n/FRESH_INSTALL.ja.md +27 -2
package/docs/i18n/FRESH_INSTALL.ko.md +27 -2
package/docs/i18n/FRESH_INSTALL.ru.md +27 -2
package/docs/i18n/FRESH_INSTALL.zh.md +27 -2
package/docs/i18n/HARNESSES.es.md +58 -0
package/docs/i18n/HARNESSES.fr.md +58 -0
package/docs/i18n/HARNESSES.ja.md +58 -0
package/docs/i18n/HARNESSES.ko.md +58 -0
package/docs/i18n/HARNESSES.ru.md +58 -0
package/docs/i18n/HARNESSES.zh.md +58 -0
package/docs/i18n/HARNESS_AIDER.es.md +48 -0
package/docs/i18n/HARNESS_AIDER.fr.md +48 -0
package/docs/i18n/HARNESS_AIDER.ja.md +50 -0
package/docs/i18n/HARNESS_AIDER.ko.md +50 -0
package/docs/i18n/HARNESS_AIDER.ru.md +48 -0
package/docs/i18n/HARNESS_AIDER.zh.md +48 -0
package/docs/i18n/HARNESS_CLAUDE.es.md +55 -0
package/docs/i18n/HARNESS_CLAUDE.fr.md +55 -0
package/docs/i18n/HARNESS_CLAUDE.ja.md +56 -0
package/docs/i18n/HARNESS_CLAUDE.ko.md +56 -0
package/docs/i18n/HARNESS_CLAUDE.ru.md +55 -0
package/docs/i18n/HARNESS_CLAUDE.zh.md +56 -0
package/docs/i18n/HARNESS_CODEX.es.md +55 -0
package/docs/i18n/HARNESS_CODEX.fr.md +55 -0
package/docs/i18n/HARNESS_CODEX.ja.md +56 -0
package/docs/i18n/HARNESS_CODEX.ko.md +56 -0
package/docs/i18n/HARNESS_CODEX.ru.md +55 -0
package/docs/i18n/HARNESS_CODEX.zh.md +56 -0
package/docs/i18n/HARNESS_CURSOR.es.md +42 -0
package/docs/i18n/HARNESS_CURSOR.fr.md +42 -0
package/docs/i18n/HARNESS_CURSOR.ja.md +45 -0
package/docs/i18n/HARNESS_CURSOR.ko.md +45 -0
package/docs/i18n/HARNESS_CURSOR.ru.md +42 -0
package/docs/i18n/HARNESS_CURSOR.zh.md +42 -0
package/docs/i18n/HARNESS_GEMINI.es.md +44 -0
package/docs/i18n/HARNESS_GEMINI.fr.md +44 -0
package/docs/i18n/HARNESS_GEMINI.ja.md +45 -0
package/docs/i18n/HARNESS_GEMINI.ko.md +45 -0
package/docs/i18n/HARNESS_GEMINI.ru.md +44 -0
package/docs/i18n/HARNESS_GEMINI.zh.md +45 -0
package/docs/i18n/HARNESS_HERMES.es.md +54 -0
package/docs/i18n/HARNESS_HERMES.fr.md +54 -0
package/docs/i18n/HARNESS_HERMES.ja.md +57 -0
package/docs/i18n/HARNESS_HERMES.ko.md +57 -0
package/docs/i18n/HARNESS_HERMES.ru.md +54 -0
package/docs/i18n/HARNESS_HERMES.zh.md +57 -0
package/docs/i18n/HARNESS_OPENCLAW.es.md +41 -0
package/docs/i18n/HARNESS_OPENCLAW.fr.md +41 -0
package/docs/i18n/HARNESS_OPENCLAW.ja.md +44 -0
package/docs/i18n/HARNESS_OPENCLAW.ko.md +44 -0
package/docs/i18n/HARNESS_OPENCLAW.ru.md +41 -0
package/docs/i18n/HARNESS_OPENCLAW.zh.md +42 -0
package/docs/i18n/HARNESS_OPENCODE.es.md +41 -0
package/docs/i18n/HARNESS_OPENCODE.fr.md +41 -0
package/docs/i18n/HARNESS_OPENCODE.ja.md +44 -0
package/docs/i18n/HARNESS_OPENCODE.ko.md +44 -0
package/docs/i18n/HARNESS_OPENCODE.ru.md +41 -0
package/docs/i18n/HARNESS_OPENCODE.zh.md +44 -0
package/docs/i18n/HERMES_VOICE.es.md +46 -0
package/docs/i18n/HERMES_VOICE.fr.md +46 -0
package/docs/i18n/HERMES_VOICE.ja.md +46 -0
package/docs/i18n/HERMES_VOICE.ko.md +65 -0
package/docs/i18n/HERMES_VOICE.ru.md +46 -0
package/docs/i18n/HERMES_VOICE.zh.md +46 -0
package/docs/i18n/MULTI_INSTANCE.es.md +25 -0
package/docs/i18n/MULTI_INSTANCE.fr.md +25 -0
package/docs/i18n/MULTI_INSTANCE.ja.md +25 -0
package/docs/i18n/MULTI_INSTANCE.ko.md +25 -0
package/docs/i18n/MULTI_INSTANCE.ru.md +25 -0
package/docs/i18n/MULTI_INSTANCE.zh.md +25 -0
package/docs/i18n/README.es.md +20 -134
package/docs/i18n/README.fr.md +20 -134
package/docs/i18n/README.ja.md +20 -134
package/docs/i18n/README.ko.md +20 -133
package/docs/i18n/README.ru.md +20 -134
package/docs/i18n/README.zh.md +20 -133
package/docs/i18n/RELEASE.es.md +26 -1
package/docs/i18n/RELEASE.fr.md +26 -1
package/docs/i18n/RELEASE.ja.md +26 -1
package/docs/i18n/RELEASE.ko.md +26 -1
package/docs/i18n/RELEASE.ru.md +26 -1
package/docs/i18n/RELEASE.zh.md +26 -1
package/docs/i18n/TROUBLESHOOTING.es.md +39 -0
package/docs/i18n/TROUBLESHOOTING.fr.md +39 -0
package/docs/i18n/TROUBLESHOOTING.ja.md +39 -0
package/docs/i18n/TROUBLESHOOTING.ko.md +39 -0
package/docs/i18n/TROUBLESHOOTING.ru.md +39 -0
package/docs/i18n/TROUBLESHOOTING.zh.md +39 -0
package/docs/i18n/USAGE.es.md +25 -0
package/docs/i18n/USAGE.fr.md +25 -0
package/docs/i18n/USAGE.ja.md +25 -0
package/docs/i18n/USAGE.ko.md +25 -0
package/docs/i18n/USAGE.ru.md +25 -0
package/docs/i18n/USAGE.zh.md +25 -0
package/docs/superpowers/plans/2026-05-13-phase1-streaming-pipeline.md +122 -0
package/docs/superpowers/plans/2026-05-13-phase10-push-notifications.md +152 -0
package/docs/superpowers/plans/2026-05-13-phase2-agent-adapters.md +242 -0
package/docs/superpowers/plans/2026-05-13-phase6-smart-progress.md +172 -0
package/docs/superpowers/plans/2026-05-13-phase7-voice-plan-mode.md +108 -0
package/docs/superpowers/plans/2026-05-14-cross-agent-voice-transfer.md +625 -0
package/docs/superpowers/plans/2026-05-21-audio-overview-narrated-diffs.md +95 -0
package/docs/superpowers/plans/2026-05-21-autoresearch-ontology.md +83 -0
package/docs/superpowers/plans/2026-05-21-phase11-push-to-talk-wakeword-v2.md +77 -0
package/docs/superpowers/plans/2026-05-21-phase12-multi-user-voice.md +147 -0
package/docs/superpowers/plans/2026-05-21-phase14-verbalbench.md +136 -0
package/docs/superpowers/plans/2026-05-21-phase15-phone-companion.md +72 -0
package/integrations/fireredtts2/mlx_llm.py +183 -0
package/integrations/fireredtts2/synth.py +156 -0
package/integrations/fireredtts2/synth_mlx.py +196 -0
package/integrations/mlxaudio/synth.py +74 -0
package/integrations/neuttsair/synth.py +104 -0
package/integrations/omnivoice/synth.py +110 -0
package/package.json +7 -1
package/scripts/cli.mjs +88 -3
package/scripts/doctor.mjs +115 -4
package/scripts/install.mjs +20 -2
package/scripts/install_fireredtts2.sh +109 -0
package/scripts/install_mlxaudio.sh +34 -0
package/scripts/install_mossttsnano.sh +46 -0
package/scripts/postinstall.mjs +34 -0

package/docs/FRESH_INSTALL.md CHANGED

@@ -1,116 +1,142 @@
 # Fresh install
-This guide is for a clean public install. It avoids local-only assumptions and uses the installer to bootstrap as much as possible.
+<!-- readme-glow-up:intro -->
+<p align="center">
+  <a href="../README.md">README</a> ·
+  <a href="README.md">Docs hub</a> ·
+  <a href="FRESH_INSTALL.md">Fresh Install</a> ·
+  <a href="USAGE.md">Usage</a> ·
+  <a href="CONFIGURATION.md">Configuration</a> ·
+  <a href="TROUBLESHOOTING.md">Troubleshooting</a> ·
+  <a href="MULTI_INSTANCE.md">Multi-Instance</a>
+</p>
-## 1. Install the CLI
+> Clean install path for humans first, automation second.
+>
+> Fast path: `npm install -g verbalcoding@latest → vc setup → vc doctor → vc start`
+<!-- /readme-glow-up:intro -->
-Recommended npm path:
+This guide is for a clean public install. It avoids local-only assumptions and uses the `vc` CLI to bootstrap as much as possible. Windows is not supported yet.
-```bash
-npm install -g verbalcoding
-```
+## 1. Install the CLI and run guided setup
-Or run the published package directly:
+Recommended npm path for humans:
 ```bash
-npx verbalcoding setup --yes
+npm install -g verbalcoding@latest
+vc setup
 ```
-If you used `npm install -g`, continue with:
+`vc setup` bootstraps supported local prerequisites, then asks for the Discord bot token, application/client ID, auto-join voice channel names, transcript target, agent backend, and voice/TTS settings. Keep the Discord Developer Portal open while it runs.
+Automation/CI path:
 ```bash
+npm install -g verbalcoding@latest
 vc setup --yes
+vc setup token <bot-token> --client-id <discord-client-id>
+vc setup channels "General,Team Voice"
 ```
+Use `--yes` only when you need non-interactive bootstrap/starter config. It cannot stop and wait for you to create a Discord application, so token/channel setup remains a follow-up step in that mode.
 Contributor GitHub clone path:
 ```bash
 git clone https://github.com/ca1773130n/VerbalCoding.git
 cd VerbalCoding
-./scripts/install.sh --yes
+./scripts/install.sh
 ```
-## 2. Bootstrap dependencies and run the setup wizard
-For an npm install, do not run `./scripts/install.sh` directly; there is no repository checkout in your current directory. Use the packaged CLI wrapper instead:
+For npm/global installs, use `vc ...` commands. Do not run `./scripts/install.sh` unless you are inside a repository clone.
-```bash
-vc setup --yes
-```
+## 2. What setup bootstraps
-`vc setup` runs the `scripts/install.sh` bundled inside the installed npm package. Only use `./scripts/install.sh --yes` when you are inside a GitHub clone:
+`vc setup` runs the bootstrap bundled in the npm package and writes `.env`. It can install or prepare:
-```bash
-./scripts/install.sh --yes
-```
-What this does:
-- installs npm dependencies when `node_modules/` is missing,
-- installs the short `vc` shell command with `npm link`,
-- installs `ffmpeg`, Node/npm, and `whisper-cli` when supported by the OS package manager,
-- downloads `models/ggml-small-q5_1.bin`,
-- creates `.venv-tts` and installs `edge-tts` when `edge-tts` is not already on `PATH`,
-- runs the interactive `.env` wizard.
+- npm dependencies when `node_modules/` is missing,
+- `ffmpeg`, Node/npm, Python venv support, build tools, and `whisper-cli` where supported,
+- the default `models/ggml-small-q5_1.bin` whisper.cpp model,
+- a local `.venv-tts` Edge TTS helper,
+- the short `vc` shell command when running from a clone.
 Supported system bootstrap paths:
 | OS | System dependency path |
 |---|---|
 | macOS | Homebrew: `brew install node ffmpeg whisper-cpp` as needed |
-| Debian/Ubuntu | `apt-get` for Node/npm, ffmpeg, Python, build tools; local whisper.cpp build fallback |
-| Fedora/RHEL | `dnf` for Node/npm, ffmpeg, Python, build tools; local whisper.cpp build fallback |
-| Arch | `pacman` for Node/npm, ffmpeg, Python, build tools; local whisper.cpp build fallback |
+| Debian/Ubuntu | `apt-get`; handles NodeSource npm conflicts and can locally build whisper.cpp |
+| Fedora/RHEL | `dnf`; local whisper.cpp build fallback |
+| Arch | `pacman`; local whisper.cpp build fallback |
+| Windows | Not supported yet |
 Useful installer variants:
 ```bash
 vc setup --yes --no-wizard                   # dependency/bootstrap only from npm install
-./scripts/install.sh --yes --no-wizard       # dependency/bootstrap only from a clone
-./scripts/install.sh --skip-system           # do not install OS packages
-./scripts/install.sh --skip-model            # do not download the default STT model
-./scripts/install.sh --skip-edge-tts         # do not create .venv-tts
-VERBALCODING_SKIP_CLI_LINK=1 ./scripts/install.sh --yes
+vc setup --yes --skip-system                 # skip OS package installation
+vc setup --yes --skip-model                  # skip default STT model download
+vc setup --yes --skip-edge-tts               # skip local Edge TTS helper
+./scripts/install.sh --yes --no-wizard       # clone-only non-interactive equivalent
 ```
-If your OS is unsupported, install these manually before rerunning:
-- Node.js 20+ and npm
-- ffmpeg
-- Python 3 with venv/pip
-- whisper.cpp `whisper-cli`
-- one authenticated CLI agent backend, Hermes Agent by default
+## 3. Discord values collected by setup
-## 3. Discord application setup
-Read the upstream Discord bot setup guides first if this is your first bot:
+Read the upstream Discord bot setup guides if this is your first bot:
 - Hermes Agent Discord messaging guide: <https://hermes-agent.nousresearch.com/docs/user-guide/messaging/discord>
 - Discord official bot overview: <https://docs.discord.com/developers/bots/overview>
 - Discord official getting started guide: <https://docs.discord.com/developers/quick-start/getting-started>
-Those pages show how to create a Discord application, add a bot user, enable privileged intents, and invite it to a server. VerbalCoding uses the same Discord bot setup, then adds voice receive, STT, CLI-agent execution, and TTS playback on top.
+During `vc setup`:
-1. Create a Discord application and bot in the Discord Developer Portal.
+1. Create a Discord application/bot in the Developer Portal.
 2. Enable the Message Content privileged intent.
-3. Copy the bot token into the installer prompt or `.env` as `DISCORD_BOT_TOKEN`.
-4. Generate an invite URL:
+3. Paste the bot token when asked for `DISCORD_BOT_TOKEN`.
+4. Paste the application/client ID when asked; setup can print the invite command.
+5. Enter the real voice channel names the bot should auto-join.
+Invite URL helper:
 ```bash
 vc bot invite <discord-client-id>
-# or pin it to one server:
 vc bot invite <discord-client-id> --guild <guild-id>
 ```
-The invite includes bot and slash-command scopes plus text/voice permissions used by VerbalCoding.
+If you skipped a value or need to rotate it later, update only that part:
+```bash
+vc setup token
+vc setup token <bot-token> --client-id <discord-client-id>
+vc setup channels "VerbalCoding,LLM-Wiki,General"
+```
+`vc setup token` updates `DISCORD_BOT_TOKEN` and optional `DISCORD_CLIENT_ID`; `vc setup channels` updates `AUTO_JOIN_VOICE_CHANNELS`. Both preserve unrelated `.env` values, set mode `0600`, and do not print secrets back.
-## 4. Verify
+## 4. Auto-join voice channel names
+Use the exact Discord voice channel names:
+```bash
+vc setup channels
+vc setup channels "General,Team Voice"
+vc setup channel "General"
+vc setup voice "General"
+```
+Restart the bridge after changing channel names.
+## 5. Verify
 ```bash
 vc doctor
 ```
-`vc doctor` is redacted: it reports missing tokens/commands/models without printing secret values. When fixable local prerequisites are missing (`ffmpeg`, `whisper-cli`, the default model, or Edge TTS helper), it automatically reruns the packaged bootstrap first. Fix any remaining `✗` items, then rerun it.
+`vc doctor` is redacted: it reports missing tokens/commands/models without printing secret values. On supported macOS/Linux installs it attempts to auto-fix installable prerequisites first, including `ffmpeg`, `whisper-cli`/model, Edge TTS helper, and Hermes CLI for the default Hermes backend. Use this opt-out if you only want diagnosis:
+```bash
+VERBALCODING_DOCTOR_INSTALL_HERMES=0 vc doctor
+```
 Expected success includes:
@@ -126,9 +152,9 @@ Expected success includes:
 Doctor passed. Run vc start to start VerbalCoding.
 ```
-If the installer created a local Edge TTS helper, `.env` should contain an `EDGE_TTS_COMMAND` path pointing at `.venv-tts/bin/edge-tts`.
+If `DISCORD_BOT_TOKEN` is missing, run `vc setup token`. If no configured channel is found, run `vc setup channels "<actual voice channel name>"`.
-## 5. Run the single default bot
+## 6. Run the single default bot
 ```bash
 vc start
@@ -154,7 +180,25 @@ In Discord:
 Then speak in the configured voice channel. You should see STT text, progress text when verbose mode is on, a final text answer, and hear TTS playback.
-## 6. Project-per-room setup
+## 7. Docker and containers
+Discord text/gateway login uses TCP/WebSocket, but Discord voice also needs UDP. If `vc start` logs this, the channel was found but voice UDP discovery failed:
+```text
+Cannot perform IP discovery - socket closed
+```
+On Linux Docker Compose, use host networking for the service running `vc start`:
+```yaml
+services:
+  verbalcoding:
+    network_mode: "host"
+```
+Remove any `ports:` block from that service when using host networking. On Docker Desktop for macOS/Windows, host networking behaves differently; if UDP voice still fails, run VerbalCoding directly on the host or in a Linux VM. See [Troubleshooting](TROUBLESHOOTING.md).
+## 8. Project-per-room setup
 For one permanent bot per project voice room, create one Discord application per project, then:
@@ -167,7 +211,7 @@ vc instance status my-project
 Each instance writes an ignored `instances/<name>.env` with its own token, voice channel, transcript target, log path, Hermes session file, and optional Hermes profile.
-## 7. Optional OpenVoice setup
+## 9. Optional OpenVoice setup
 OpenVoice voice cloning is optional. Keep `TTS_BACKEND=edge` for a fresh public install. To enable OpenVoice later:
@@ -181,7 +225,7 @@ python3 integrations/openvoice/synth.py --openvoice-dir vendor/OpenVoice --ref-a
 Then set `TTS_BACKEND=openvoice`, run `vc doctor`, and test `!voice-test <text>` in Discord.
-## 8. Clean clone smoke test for maintainers
+## 10. Clean clone smoke test for maintainers
 Fast host-only smoke test:
@@ -196,12 +240,10 @@ chmod 600 .env
 vc doctor || true
 ```
-The expected failure at this point is missing local secrets or unauthenticated agent CLI, not leaked tokens or missing install scripts.
 Docker-based Ubuntu clean install smoke test:
 ```bash
 ./scripts/docker_ubuntu_smoke.sh
 ```
-This runs `ubuntu:24.04`, copies the tracked repository tree into a clean container, runs `./scripts/install.sh --yes --no-wizard`, writes a non-secret smoke `.env`, checks `vc`, runs Node tests, and verifies `vc doctor`. It does not connect to Discord voice; use a real Ubuntu VM or WSL2 after this if you need an end-to-end voice-channel test.
+This validates bootstrap and doctor behavior in a clean container. It does not connect to Discord voice; use a real Linux host/VM for end-to-end voice UDP testing.
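
The new FRESH_INSTALL.md text says `vc setup token` and `vc setup channels` preserve unrelated `.env` values, set mode `0600`, and do not print secrets back. A minimal Python sketch of that update pattern follows. It is an illustration only: the package itself is Node, and the function name `update_env` is hypothetical, not the package's API.

```python
import os
from pathlib import Path


def update_env(path, updates):
    """Set KEY=value pairs in a dotenv file, keeping every other line as-is.

    Hypothetical helper: illustrates the documented behaviour, not the real code.
    """
    env_path = Path(path)
    lines = env_path.read_text().splitlines() if env_path.exists() else []
    remaining = dict(updates)
    out = []
    for line in lines:
        key = line.split("=", 1)[0].strip()
        is_assignment = "=" in line and not line.lstrip().startswith("#")
        if is_assignment and key in remaining:
            # Replace only the targeted key; the old value is discarded.
            out.append(f"{key}={remaining.pop(key)}")
        else:
            # Comments, blank lines, and unrelated keys pass through untouched.
            out.append(line)
    # Keys that were not present yet are appended at the end.
    out.extend(f"{k}={v}" for k, v in remaining.items())
    env_path.write_text("\n".join(out) + "\n")
    os.chmod(env_path, 0o600)  # owner read/write only
    return sorted(updates)  # report key names only, never the values
```

Returning only the key names keeps any caller that logs the result from echoing a token.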

package/docs/HARNESSES.md ADDED

@@ -0,0 +1,58 @@
+# Coding Agent Harnesses
+<p align="center">
+  <a href="../README.md">README</a> ·
+  <a href="README.md">Docs hub</a> ·
+  <a href="USAGE.md">Usage</a> ·
+  <a href="CONFIGURATION.md">Configuration</a> ·
+  <a href="TROUBLESHOOTING.md">Troubleshooting</a>
+</p>
+VerbalCoding is agent-agnostic. It drives whichever CLI coding agent you have installed by spawning it once per voice turn, feeding the transcript as a prompt, and speaking the response back. Pick **one** as your default; the cross-agent voice routing lets you reach the others mid-session.
+| Harness | Default command | Session resume | Per-harness doc |
+|---|---|---|---|
+| Hermes Agent | `hermes chat -Q -q` | ✅ (`--resume <id>`) | [HERMES_VOICE.md](./HERMES_VOICE.md) (positioning) + [HARNESS_HERMES.md](./HARNESS_HERMES.md) |
+| Claude Code | `claude -p` | ❌ | [HARNESS_CLAUDE.md](./HARNESS_CLAUDE.md) |
+| Codex | `codex exec` | ❌ (output-last-message capture) | [HARNESS_CODEX.md](./HARNESS_CODEX.md) |
+| Gemini CLI | `gemini -p` | ❌ | [HARNESS_GEMINI.md](./HARNESS_GEMINI.md) |
+| OpenCode | `opencode run` | ❌ | [HARNESS_OPENCODE.md](./HARNESS_OPENCODE.md) |
+| OpenClaw | `openclaw run` | ❌ | [HARNESS_OPENCLAW.md](./HARNESS_OPENCLAW.md) |
+| Aider | `aider --no-pretty --yes-always --message` | ❌ | [HARNESS_AIDER.md](./HARNESS_AIDER.md) |
+| Cursor CLI | `cursor-agent --print --prompt` | ❌ | [HARNESS_CURSOR.md](./HARNESS_CURSOR.md) |
+## Pick your default
+`vc setup` auto-detects installed binaries and lets you pick. Non-interactive override:
+```bash
+# .env or instance .env
+AGENT_BACKEND=claude              # hermes | claude | codex | gemini | opencode | openclaw | aider | cursor | custom
+```
+Each harness picks up its own command from a matching env var (`HERMES_COMMAND`, `CLAUDE_COMMAND`, etc.). The shared envs `AGENT_LABEL`, `AGENT_COMMAND`, `AGENT_SESSION_FILE`, `AGENT_WORKDIR`, `AGENT_PROJECT_CONTEXT`, `AGENT_TASK_TIMEOUT_MS`, `AGENT_CHAT_TIMEOUT_MS`, `AGENT_VERBOSE_PROGRESS` override per-harness defaults when set.
+## Routing between harnesses by voice
+Once configured, you can reach any **installed** harness from a voice channel without restarting:
+- `"ask Codex what it thinks"` — single-turn route, next utterance returns to the default.
+- `"switch to Aider"` — sticky route until you say `"back to default"`.
+- Plan-mode `which_agent` slot — the agent itself proposes which backend runs the next plan.
+The routing layer detects whether the binary is on `PATH` (resolving relative commands against the active project session's workdir). If not installed, the bridge asks `"Want me to use the default agent instead?"` — answer `"yes"` to fall back or `"no"` to cancel.
+Aliases recognized by the parser: `claude` / `claude code`, `codex` / `코덱스`, `gemini` / `gemini cli` / `제미나이`, `opencode`, `openclaw`, `aider` / `에이더`, `cursor` / `cursor cli`, `hermes` / `헤르메스`.
+## Shared semantics
+Things every harness adapter respects:
+- **Voice plan mode** — `"plan it first"` → narrate a plan; edit by voice; `"approve"` to execute against the chosen harness.
+- **Barge-in** — interrupting cuts the current TTS and aborts the agent task. Sticky routing survives interrupts; only single-turn routes are cleared.
+- **Verbose progress** — `AGENT_VERBOSE_PROGRESS=1` (or `"상세 진행 켜"`) prints structured progress events the harness emits (file reads, web search, tool use). Smart-progress, if `SMART_PROGRESS_API_KEY` is set, summarizes these into one sentence per batch.
+- **Push handoff** — `NOTIFY_PROVIDER=ntfy|pushover` plus `NOTIFY_MIN_TASK_MS` fires a push notification when a long task completes and the voice channel is empty. Debounced by body + `NOTIFY_DEBOUNCE_MS`.
+- **Per-channel state** — each Discord voice channel keeps its own routing, plan-mode, and recent-utterance ring buffer.
+- **Project sessions** — `!session new <name> <workdir>` binds a Discord channel to a project; per-(harness, session) adapters are cached and invalidated on rebind.
+See per-harness docs for install paths, auth, and gotchas. `docs/CONFIGURATION.md` is the canonical env-var reference.
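
HARNESSES.md describes single-turn routes (`"ask Codex ..."`), sticky routes (`"switch to Aider"`), and a `"back to default"` reset, with a fixed alias list. The Python sketch below shows one way such a parser could classify an utterance. The alias table is copied from the doc; the function names, the return shape, and the matching rules are assumptions, and only the English trigger phrases are handled (Korean phrases attach particles to the alias and would need extra handling).

```python
import re

# Alias -> backend, taken from the alias list in HARNESSES.md.
ALIASES = {
    "claude code": "claude", "claude": "claude",
    "codex": "codex", "코덱스": "codex",
    "gemini cli": "gemini", "gemini": "gemini", "제미나이": "gemini",
    "opencode": "opencode", "openclaw": "openclaw",
    "aider": "aider", "에이더": "aider",
    "cursor cli": "cursor", "cursor": "cursor",
    "hermes": "hermes", "헤르메스": "hermes",
}


def find_backend(text):
    """Return the backend for the longest alias found as a whole word."""
    for alias in sorted(ALIASES, key=len, reverse=True):
        if re.search(rf"(?<!\w){re.escape(alias)}(?!\w)", text):
            return ALIASES[alias]
    return None


def parse_route(utterance):
    """Classify an utterance as ('single' | 'sticky' | 'reset' | 'none', backend)."""
    text = utterance.lower().strip()
    if "back to default" in text:
        return ("reset", None)
    backend = find_backend(text)
    if backend is None:
        return ("none", None)
    if re.match(r"switch to\b", text):
        return ("sticky", backend)   # stays until "back to default"
    if re.match(r"ask\b", text):
        return ("single", backend)   # next utterance returns to the default
    return ("none", None)
```

Trying the longest alias first keeps `"claude code"` from being cut short to `"claude"`; both resolve to the same backend here, but the ordering matters once aliases diverge.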

package/docs/HARNESS_AIDER.md ADDED

@@ -0,0 +1,50 @@
+# Aider — Harness Notes
+<p align="center">
+  <a href="../README.md">README</a> ·
+  <a href="HARNESSES.md">Harnesses</a> ·
+  <a href="USAGE.md">Usage</a> ·
+  <a href="CONFIGURATION.md">Configuration</a>
+</p>
+Aider is a pair-programming AI CLI focused on direct edits. VerbalCoding drives it through `aider --no-pretty --yes-always --message` — the prompt is passed as the `--message` value so each voice turn becomes one non-interactive Aider run that may modify files in `AGENT_WORKDIR`.
+## Install
+```bash
+pip install aider-chat
+aider --version
+# Confirm a single-message run works:
+aider --no-pretty --yes-always --message "list the top-level files"
+```
+Aider needs an API key for the model you point it at (OpenAI / Anthropic / a local server). See <https://aider.chat>.
+## Configure VerbalCoding
+```bash
+# .env
+AGENT_BACKEND=aider
+# optional
+AIDER_COMMAND="aider --no-pretty --yes-always --message"   # default
+AGENT_WORKDIR=/Users/you/code/your-project                 # where Aider should edit
+AGENT_PROJECT_CONTEXT="..."
+AGENT_CHAT_TIMEOUT_MS=120000                               # Aider can take longer
+AGENT_TASK_TIMEOUT_MS=0
+```
+`--no-pretty` strips Rich-formatting box characters so the stream sentencer doesn't choke on them. `--yes-always` keeps the run non-interactive (Aider won't pause for "apply this diff?" prompts).
+## Voice phrases to switch TO Aider
+- en: `"switch to Aider"`, `"ask Aider to ..."`
+- ko: `"aider로 전환해줘"`, `"에이더로 전환"`
+The matcher accepts `aider` and `에이더`.
+## Gotchas
+- **Aider edits files.** Unlike Claude / Codex / Gemini under `-p`, Aider directly modifies the working tree as part of answering. Be deliberate about `AGENT_WORKDIR` — usually a project session's `workdir`.
+- **Diffs in output.** Aider often emits diff-shaped text. If a turn is interrupted, the bridge speaks an "interrupted" notice and skips reading the diff aloud — check the text channel and `git status`.
+- **Auth.** `OPENAI_API_KEY` / `ANTHROPIC_API_KEY` need to be in Aider's environment; instance-isolated installs typically use `instances/<project>.env`.
+- **Per-channel state.** Cross-agent routing is per Discord channel; switching to Aider in one project room does not affect another.
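
HARNESS_AIDER.md says the transcript is passed as the value of `--message`, so each voice turn is one non-interactive run. A small Python sketch of that argument construction follows, assuming the configured command is split with shell-style quoting and the prompt is appended as a single argument. The helper name `build_argv` is hypothetical.

```python
import shlex

# Default from HARNESS_AIDER.md; AIDER_COMMAND may override it.
DEFAULT_AIDER_COMMAND = "aider --no-pretty --yes-always --message"


def build_argv(command, prompt):
    """Split the configured command and append the prompt as one argument.

    Because the command ends in --message, the prompt becomes that flag's value.
    """
    return shlex.split(command) + [prompt]


argv = build_argv(DEFAULT_AIDER_COMMAND, "list the top-level files")
# The list could then be run without a shell, for example:
#   subprocess.run(argv, cwd=agent_workdir)   # agent_workdir is a placeholder
```

Passing a list rather than a shell string means a transcript containing quotes or `$` reaches Aider unchanged.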

package/docs/HARNESS_CLAUDE.md ADDED

@@ -0,0 +1,56 @@
+# Claude Code — Harness Notes
+<p align="center">
+  <a href="../README.md">README</a> ·
+  <a href="HARNESSES.md">Harnesses</a> ·
+  <a href="USAGE.md">Usage</a> ·
+  <a href="CONFIGURATION.md">Configuration</a>
+</p>
+Claude Code is Anthropic's official terminal-resident coding agent. VerbalCoding drives it through `claude -p`, where each voice turn is one invocation. Claude Code does not expose a stable session-resume contract over `-p`, so each call is a fresh context — use `AGENT_PROJECT_CONTEXT` and the cross-agent handoff block to keep continuity.
+## Install
+```bash
+npm install -g @anthropic-ai/claude-code
+claude login
+claude -p "hello"     # confirm it answers
+```
+## Configure VerbalCoding
+```bash
+# .env
+AGENT_BACKEND=claude              # alias 'claude-code' also accepted
+# optional
+CLAUDE_COMMAND="claude -p"        # default; override e.g. to add --model, --debug
+AGENT_PROJECT_CONTEXT="Working on the auth module; previous decisions: oauth=github."
+AGENT_WORKDIR=/Users/you/code/your-project
+AGENT_CHAT_TIMEOUT_MS=45000
+AGENT_TASK_TIMEOUT_MS=0
+AGENT_VERBOSE_PROGRESS=0
+```
+`AGENT_SESSION_FILE` defaults to `<repo>/.agent-sessions/claude` but is **unused** by this harness — Claude Code's `-p` is stateless. Leave it set; it just becomes a no-op.
+## What Claude sees per turn
+Every turn the adapter prepends a Discord-aware preamble (English or Korean depending on `VOICE_LANGUAGE`), the project context, recent Discord text context, and finally the user's transcribed utterance. On cross-agent handoff (e.g. you said `"ask Codex ..."` last turn and just spoke again), the prepended block also includes a "Recent user voice" line of up to the last 4 utterances plus the most recently resolved plan decisions, so Claude doesn't start cold.
+## Verbose progress
+Claude Code does not emit a standard progress stream over `-p`. `AGENT_VERBOSE_PROGRESS=1` still works — the adapter parses tool/file/web mentions out of stdout/stderr if Claude prints them — but expect coarser progress than Hermes.
+## Voice phrases to switch TO Claude Code
+- en: `"switch to Claude Code"`, `"ask Claude ..."`, `"let Claude finish this"`
+- ko: `"클로드로 전환"`, `"claude한테 물어봐"`
+The matcher accepts both `claude` and `claude code` as aliases; strict mode (used for routing-only utterances) requires an exact alias.
+## Gotchas
+- **No session resume.** A long-running pair-programming session needs the cross-agent handoff context block to carry decisions forward. The bridge does this automatically on backend changes; within the same backend, set `AGENT_PROJECT_CONTEXT` to a short summary.
+- **Quoted command paths.** If `CLAUDE_COMMAND` uses a quoted absolute path (e.g. `"/Applications/Claude Code/claude" -p`), VerbalCoding's installation probe uses `shellSplit` and honors quotes correctly.
+- **Auth refresh.** `claude login` token expiry surfaces as a non-zero exit; the bridge reports the failure and (if a non-default backend) the fallback prompt will offer to retry on the default.
+- **Patch-like output.** If Claude returns a diff and the turn is interrupted, the bridge says `"the agent was interrupted; check the text channel for files and tests"` rather than reading the diff aloud.
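
The "What Claude sees per turn" section in HARNESS_CLAUDE.md gives an ordering: preamble, project context, recent Discord text, then the utterance, with a handoff block of up to the last 4 utterances plus plan decisions after a backend change. The Python sketch below assembles a prompt in that order. The section labels, parameter names, and the exact position of the handoff lines are assumptions; the doc does not show the real wording.

```python
def build_prompt(utterance, preamble, project_context="", discord_context=(),
                 recent_voice=(), plan_decisions=(), handoff=False):
    """Assemble one stateless prompt in the order the harness notes describe."""
    parts = [preamble]
    if project_context:
        parts.append(f"Project context: {project_context}")
    if discord_context:
        parts.append("Recent Discord text:\n" + "\n".join(discord_context))
    if handoff:
        # Only on a backend change: carry the last 4 utterances and decisions.
        last_four = list(recent_voice)[-4:]
        if last_four:
            parts.append("Recent user voice: " + " | ".join(last_four))
        if plan_decisions:
            parts.append("Plan decisions: " + "; ".join(plan_decisions))
    parts.append(utterance)  # the transcribed utterance always comes last
    return "\n\n".join(parts)
```

Because `claude -p` is stateless, everything the agent needs for continuity has to travel inside this one string.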

package/docs/HARNESS_CODEX.md ADDED

@@ -0,0 +1,56 @@
+# Codex — Harness Notes
+<p align="center">
+  <a href="../README.md">README</a> ·
+  <a href="HARNESSES.md">Harnesses</a> ·
+  <a href="USAGE.md">Usage</a> ·
+  <a href="CONFIGURATION.md">Configuration</a>
+</p>
+Codex CLI is OpenAI's terminal coding agent. VerbalCoding drives it through `codex exec`. Because `codex exec` writes its final assistant text to a temp file when `--output-last-message <path>` is passed, the adapter inserts that flag automatically and reads the file back even if stdout is noisy.
+## Install
+```bash
+npm install -g @openai/codex
+codex login              # or set OPENAI_API_KEY for headless use
+codex exec "hello"
+```
+## Configure VerbalCoding
+```bash
+# .env
+AGENT_BACKEND=codex
+# optional
+CODEX_COMMAND="codex exec"                      # default
+AGENT_PROJECT_CONTEXT="What we're working on, what's already decided."
+AGENT_WORKDIR=/Users/you/code/your-project
+AGENT_CHAT_TIMEOUT_MS=45000
+AGENT_TASK_TIMEOUT_MS=0
+```
+`AGENT_SESSION_FILE` is unused (Codex `exec` is stateless across calls).
+## Output capture
+For Codex, the adapter:
+1. Generates a temp path under `os.tmpdir()` like `verbalcoding-codex-last-<pid>-<ts>.txt`.
+2. Inserts `--output-last-message <path>` immediately before the final positional prompt arg.
+3. After the run, reads that file as the authoritative answer (preferred over `stdout`).
+4. Deletes the temp file.
+This is robust to Codex emitting tool-use chatter on stdout; the spoken answer always comes from the captured file.
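Steps 1 and 2 above can be sketched as two small pure functions. The helper names are hypothetical, and the temp directory is passed in as a parameter (in a real adapter it would come from `os.tmpdir()`) so the sketch stays free of imports.

```javascript
// Step 1: build the temp path. The file-name shape follows the list above;
// joining with '/' is a simplification (a real adapter would use path.join).
function codexLastMessagePath(tmpDir, pid, timestamp) {
  return `${tmpDir}/verbalcoding-codex-last-${pid}-${timestamp}.txt`;
}

// Step 2: insert the flag immediately before the final positional prompt arg.
// Assumption: the prompt is always the last element of argv.
function withOutputLastMessage(argv, outPath) {
  if (argv.length === 0) throw new Error('expected a prompt as the final argument');
  const prompt = argv[argv.length - 1];
  return [...argv.slice(0, -1), '--output-last-message', outPath, prompt];
}
```

For `['exec', 'fix the bug']` this produces `['exec', '--output-last-message', <path>, 'fix the bug']`. Steps 3 and 4 (reading the file back, then deleting it) would follow the process exit and are omitted here.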
+## Voice phrases to switch TO Codex
+- en: `"switch to Codex"`, `"ask Codex what it thinks"`
+- ko: `"코덱스로 전환"`, `"코덱스한테 물어봐"`
+## Gotchas
+- **Long tasks.** Set `AGENT_TASK_TIMEOUT_MS=0` for codegen runs that may take minutes. The adapter respects `signal.aborted` so barge-in still cuts cleanly.
+- **No session resume.** Pass context via `AGENT_PROJECT_CONTEXT` and rely on the cross-agent handoff block for continuity after a route change.
+- **Patch-like output safety.** If a turn is interrupted and Codex was mid-diff, the bridge does **not** read the diff aloud — it speaks an "interrupted" notice and asks you to check the text channel.
+- **Auth.** A 401 from the OpenAI backend surfaces as a non-zero exit; the bridge reports the failure and the cross-agent fallback prompt offers the default agent.

package/docs/HARNESS_CURSOR.md ADDED

@@ -0,0 +1,45 @@
+# Cursor CLI — Harness Notes
+<p align="center">
+  <a href="../README.md">README</a> ·
+  <a href="HARNESSES.md">Harnesses</a> ·
+  <a href="USAGE.md">Usage</a> ·
+  <a href="CONFIGURATION.md">Configuration</a>
+</p>
+Cursor CLI (`cursor-agent`) is Cursor's terminal agent. VerbalCoding drives it through `cursor-agent --print --prompt`, passing the user's transcribed utterance as the prompt value. `--print` keeps the run non-interactive.
+## Install
+Follow the upstream Cursor CLI install. Confirm:
+```bash
+cursor-agent --print --prompt "hello"
+```
+## Configure VerbalCoding
+```bash
+# .env
+AGENT_BACKEND=cursor                                       # alias 'cursor-cli' also accepted
+# optional
+CURSOR_COMMAND="cursor-agent --print --prompt"             # default
+AGENT_PROJECT_CONTEXT="..."
+AGENT_WORKDIR=/Users/you/code/your-project
+AGENT_CHAT_TIMEOUT_MS=45000
+AGENT_TASK_TIMEOUT_MS=0
+```
+## Voice phrases to switch TO Cursor
+- en: `"switch to Cursor"`, `"ask Cursor ..."`, `"switch to cursor cli"`, `"switch to cursor agent"`
+- ko: `"커서로 전환"`, `"cursor한테 물어봐"`
+The matcher accepts `cursor`, `cursor cli`, `cursor-cli`, `cursor agent`, and `cursor-agent`.
+## Gotchas
+- **Prompt placement.** `--prompt` expects the value to follow; VerbalCoding's shell-aware argv builder places the transcribed utterance as the final positional argument, so `CURSOR_COMMAND` must end with `--prompt`.
+- **Editor side-effects.** Cursor's CLI may touch local cursor-related state files in the working directory; if that's surprising for a voice-only flow, point `AGENT_WORKDIR` at an isolated project dir.
+- **No session resume.** Use `AGENT_PROJECT_CONTEXT` for cross-turn continuity, plus the cross-agent handoff block when routing back from a different harness.
+- **Patch safety.** If Cursor returns a diff and the turn is interrupted, the bridge does not read the diff aloud.
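The prompt-placement gotcha above can be made concrete with a small sketch. `buildCursorArgv` is a hypothetical helper, and the guard that rejects commands not ending in `--prompt` is an illustration; the bridge itself may not validate this.

```javascript
// commandTokens is CURSOR_COMMAND after shell-aware splitting,
// e.g. ['cursor-agent', '--print', '--prompt'].
function buildCursorArgv(commandTokens, utterance) {
  const [bin, ...flags] = commandTokens;
  // The utterance is appended last, so it only becomes the value of
  // --prompt if --prompt is the final flag.
  if (flags[flags.length - 1] !== '--prompt') {
    throw new Error('CURSOR_COMMAND must end with --prompt');
  }
  return { bin, args: [...flags, utterance] };
}
```

If extra flags were placed after `--prompt`, the next flag rather than the utterance would sit in the value position, which is why the ordering matters.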

package/docs/HARNESS_GEMINI.md ADDED

@@ -0,0 +1,45 @@
+# Gemini CLI — Harness Notes
+<p align="center">
+  <a href="../README.md">README</a> ·
+  <a href="HARNESSES.md">Harnesses</a> ·
+  <a href="USAGE.md">Usage</a> ·
+  <a href="CONFIGURATION.md">Configuration</a>
+</p>
+Gemini CLI is Google's terminal coding agent. VerbalCoding drives it through `gemini -p`. Each voice turn is one invocation; there is no built-in session-resume across calls.
+## Install
+Follow the upstream Gemini CLI install guide. Confirm:
+```bash
+gemini -p "hello"
+```
+## Configure VerbalCoding
+```bash
+# .env
+AGENT_BACKEND=gemini
+# optional
+GEMINI_COMMAND="gemini -p"                  # default; add --model, --debug as needed
+AGENT_PROJECT_CONTEXT="..."
+AGENT_WORKDIR=/Users/you/code/your-project
+AGENT_CHAT_TIMEOUT_MS=45000
+AGENT_TASK_TIMEOUT_MS=0
+```
+## Voice phrases to switch TO Gemini
+- en: `"switch to Gemini"`, `"ask Gemini ..."`, `"switch to Gemini CLI"`
+- ko: `"제미나이로 전환"`, `"gemini한테 물어봐"`
+The matcher accepts `gemini`, `gemini cli`, `gemini-cli`, and `제미나이`.
+## Gotchas
+- **No session resume.** Same continuity story as Claude / Codex: rely on `AGENT_PROJECT_CONTEXT` and the cross-agent handoff block.
+- **Long answers.** Gemini sometimes returns large structured responses; the stream sentencer splits them into TTS-able sentences. Code fences are stripped from speech (the text channel still gets the full answer with code).
+- **API key.** If Gemini exits non-zero with an auth error, the bridge reports the message; the cross-agent fallback prompt offers the default agent if Gemini was a non-default route.
+- **Verbose progress.** Gemini's stdout doesn't follow Hermes' `┊`-style preview format, so verbose progress mostly relies on the smart-progress LLM summarizer.
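The "Long answers" behavior above (drop code fences from speech, then split into sentences) might look roughly like this. It is a non-streaming simplification with hypothetical names; the actual stream sentencer works incrementally and its splitting rules are not shown in these notes.

```javascript
// Build the fence marker programmatically so this example can itself
// live inside a fenced block.
const FENCE = '`'.repeat(3);
const FENCED_BLOCK = new RegExp(`${FENCE}[\\s\\S]*?${FENCE}`, 'g');

function toSpeakableSentences(answer) {
  // Remove fenced code, then collapse whitespace left behind.
  const prose = answer.replace(FENCED_BLOCK, ' ').replace(/\s+/g, ' ').trim();
  if (!prose) return [];
  // Naive split on ., !, ? (assumption: no handling of decimals or abbreviations).
  const pieces = prose.match(/[^.!?]+[.!?]+|[^.!?]+$/g) || [];
  return pieces.map((sentence) => sentence.trim()).filter(Boolean);
}
```

Only the spoken copy is filtered in this sketch; the full answer, code included, would still be posted to the text channel unchanged.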

package/docs/HARNESS_HERMES.md ADDED

@@ -0,0 +1,57 @@
+# Hermes Agent — Harness Notes
+<p align="center">
+  <a href="../README.md">README</a> ·
+  <a href="HARNESSES.md">Harnesses</a> ·
+  <a href="USAGE.md">Usage</a> ·
+  <a href="CONFIGURATION.md">Configuration</a>
+</p>
+Hermes Agent is VerbalCoding's default backend — it is the one harness with a real session-resume contract, so chat across turns retains context cleanly. For positioning vs Hermes' built-in `/voice` slash command, see [HERMES_VOICE.md](./HERMES_VOICE.md).
+## Install
+Follow the upstream Hermes Agent install guide: <https://hermes-agent.nousresearch.com>.
+Verify the CLI works directly first:
+```bash
+hermes chat -Q -q "hello"
+```
+## Configure VerbalCoding
+```bash
+# .env
+AGENT_BACKEND=hermes
+# optional overrides
+HERMES_COMMAND="hermes chat -Q -q"           # default
+HERMES_HOME=/Users/you/.hermes               # per-instance Hermes home
+HERMES_PROJECT_CONTEXT="Project session: ..."
+HERMES_TASK_TIMEOUT_MS=0                     # 0 = no limit
+HERMES_CHAT_TIMEOUT_MS=45000
+HERMES_WORKDIR=/Users/you/code/your-project
+```
+The session file lives at `<repo>/.verbalcoding-session` by default (override with `HERMES_SESSION_FILE`).
+## Session resume
+Hermes is the only built-in adapter with session resume. After each successful turn the adapter writes the new `session_id` to disk and prepends `--resume <id>` to the next call. `!session reset` (or `!reset-session`) clears that file.
+If a turn aborts before Hermes emits `session_id:` on stderr, the adapter also reads the Hermes session JSON at `~/.hermes/sessions/session_<id>.json` to recover the last assistant message.
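A rough sketch of the resume handshake described above. Both helpers are hypothetical, the `session_id: <id>` line shape is assumed from the wording here, and where exactly `--resume <id>` sits relative to the other Hermes arguments is an assumption as well.

```javascript
// Extract the session id from Hermes' stderr.
// Assumption: the id appears on its own line as "session_id: <id>".
function parseSessionId(stderrText) {
  const match = /^session_id:\s*(\S+)/m.exec(stderrText);
  return match ? match[1] : null;
}

// Prepend --resume <id> when a previous session is known.
// Assumption: `args` are the arguments that follow `hermes chat`.
function withResume(args, sessionId) {
  return sessionId ? ['--resume', sessionId, ...args] : [...args];
}
```

Between turns the adapter would persist the parsed id to the session file and read it back before the next call; after a session reset the id is absent and the call goes out without `--resume`.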
+## Verbose progress
+In verbose mode the adapter drops Hermes' `-Q` quiet flag so stdout streams `┊ <emoji> <tool>` previews. These get summarized into one-line progress events (file reads, web search, terminal). Without verbose, only the final boxed answer plays.
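Parsing the `┊ <emoji> <tool>` previews into one-line events could be sketched as below. The function names, the optional trailing detail, and the tool names used in the usage note are assumptions for illustration.

```javascript
// Parse one stdout line of the assumed form "┊ <emoji> <tool> [detail]".
function parseProgressLine(line) {
  const match = /^\s*┊\s*(\S+)\s+(\S+)\s*(.*)$/u.exec(line);
  if (!match) return null; // not a preview line
  const [, icon, tool, detail] = match;
  return { icon, tool, detail: detail.trim() };
}

// Turn a chunk of stdout into short progress strings, skipping other output.
function summarizeProgress(stdoutChunk) {
  return stdoutChunk
    .split('\n')
    .map(parseProgressLine)
    .filter(Boolean)
    .map((event) => (event.detail ? `${event.tool}: ${event.detail}` : event.tool));
}
```

For example, a chunk containing `┊ 📖 read_file src/main.mjs` followed by ordinary output would yield the single event `read_file: src/main.mjs`.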
+## Voice phrases to switch TO Hermes
+- en: `"switch to Hermes"`, `"ask Hermes ..."`
+- ko: `"헤르메스로 전환"`, `"헤르메스한테 물어봐"`
+## Gotchas
+- The TTS prefix on cross-agent handoff uses the localized label: `"Hermes says: "` / `"헤르메스: "`.
+- `HERMES_HOME` is the most common per-project isolation knob; per-instance `.env` typically sets `HERMES_HOME=/Users/you/.hermes/profiles/<project>`.
+- If verbose progress is on and Hermes still finishes with an empty answer box (for example, after a timeout), the adapter scrapes the session JSON for the final assistant text before giving up.