npm - verbalcoding - Versions diffs - 0.2.11 → 0.2.13 - Mend

verbalcoding 0.2.11 → 0.2.13

Files changed (235)

package/.env.example +98 -2
package/README.es.md +134 -0
package/README.fr.md +134 -0
package/README.ja.md +134 -0
package/README.ko.md +134 -0
package/README.md +118 -74
package/README.ru.md +134 -0
package/README.zh.md +133 -0
package/app-node/agent_adapters.mjs +37 -5
package/app-node/agent_adapters.test.mjs +27 -1
package/app-node/agent_detect.mjs +73 -0
package/app-node/agent_detect.test.mjs +77 -0
package/app-node/agent_routing.mjs +148 -0
package/app-node/agent_routing.test.mjs +138 -0
package/app-node/agent_turn.mjs +86 -0
package/app-node/agent_turn.test.mjs +109 -0
package/app-node/bridge_context.mjs +73 -0
package/app-node/bridge_context.test.mjs +54 -0
package/app-node/bridge_state.mjs +4 -0
package/app-node/bridge_wireup.test.mjs +462 -0
package/app-node/cli_install.test.mjs +31 -0
package/app-node/cross_agent_routing.test.mjs +78 -0
package/app-node/discord_command_router.mjs +204 -0
package/app-node/discord_command_router.test.mjs +311 -0
package/app-node/discord_voice_setup.mjs +251 -0
package/app-node/discord_voice_setup.test.mjs +86 -0
package/app-node/hermes_profiles.test.mjs +12 -1
package/app-node/install_config.mjs +113 -3
package/app-node/install_config.test.mjs +8 -0
package/app-node/instance_doctor.test.mjs +9 -0
package/app-node/instances.test.mjs +8 -1
package/app-node/main.mjs +513 -1058
package/app-node/mcp_tools.test.mjs +7 -0
package/app-node/notification_handler.mjs +89 -0
package/app-node/notification_handler.test.mjs +187 -0
package/app-node/notify.mjs +73 -0
package/app-node/notify.test.mjs +68 -0
package/app-node/plan_dispatcher.mjs +215 -0
package/app-node/plan_dispatcher.test.mjs +101 -0
package/app-node/plan_mode.mjs +203 -0
package/app-node/plan_mode.test.mjs +231 -0
package/app-node/progress_handler.mjs +220 -0
package/app-node/progress_handler.test.mjs +193 -0
package/app-node/progress_speech.mjs +54 -32
package/app-node/progress_speech.test.mjs +12 -3
package/app-node/project_sessions.mjs +5 -2
package/app-node/project_sessions.test.mjs +7 -0
package/app-node/research_mode.mjs +282 -0
package/app-node/research_mode.test.mjs +264 -0
package/app-node/restart_notice.mjs +3 -0
package/app-node/restart_notice.test.mjs +11 -0
package/app-node/session_ontology.mjs +271 -0
package/app-node/session_ontology.test.mjs +130 -0
package/app-node/smart_progress.mjs +94 -0
package/app-node/smart_progress.test.mjs +66 -0
package/app-node/stream_sentencer.mjs +91 -0
package/app-node/stream_sentencer.test.mjs +129 -0
package/app-node/streaming_tts_queue.mjs +52 -0
package/app-node/streaming_tts_queue.test.mjs +64 -0
package/app-node/stt_whisper.mjs +24 -0
package/app-node/stt_whisper.test.mjs +32 -0
package/app-node/text_routing.mjs +22 -0
package/app-node/text_routing.test.mjs +23 -1
package/app-node/tts_backends.mjs +537 -3
package/app-node/tts_backends.test.mjs +454 -0
package/app-node/tts_player.mjs +164 -0
package/app-node/tts_player.test.mjs +202 -0
package/app-node/tts_runtime.mjs +134 -0
package/app-node/tts_runtime.test.mjs +89 -0
package/app-node/tts_settings.mjs +150 -3
package/app-node/tts_settings.test.mjs +204 -0
package/app-node/tts_voice_config.mjs +136 -2
package/app-node/tts_voice_config.test.mjs +94 -0
package/app-node/utterance_router.mjs +216 -0
package/app-node/utterance_router.test.mjs +236 -0
package/app-node/voice_autojoin.mjs +37 -0
package/app-node/voice_autojoin.test.mjs +59 -0
package/app-node/voice_io.mjs +272 -0
package/app-node/voice_io.test.mjs +102 -0
package/app-node/voice_turn_runner.mjs +449 -0
package/app-node/voice_turn_runner.test.mjs +289 -0
package/docs/CONFIGURATION.md +79 -96
package/docs/FRESH_INSTALL.md +105 -63
package/docs/HARNESSES.md +58 -0
package/docs/HARNESS_AIDER.md +50 -0
package/docs/HARNESS_CLAUDE.md +56 -0
package/docs/HARNESS_CODEX.md +56 -0
package/docs/HARNESS_CURSOR.md +45 -0
package/docs/HARNESS_GEMINI.md +45 -0
package/docs/HARNESS_HERMES.md +57 -0
package/docs/HARNESS_OPENCLAW.md +44 -0
package/docs/HARNESS_OPENCODE.md +44 -0
package/docs/HERMES_VOICE.md +65 -0
package/docs/MULTI_INSTANCE.md +16 -0
package/docs/README.md +50 -0
package/docs/RELEASE.md +42 -19
package/docs/ROADMAP.md +53 -0
package/docs/TROUBLESHOOTING.md +126 -0
package/docs/TTS_BACKENDS.md +227 -0
package/docs/USAGE.md +94 -40
package/docs/assets/figures/verbalcoding-flow.svg +1 -1
package/docs/i18n/AGENTS.es.md +34 -0
package/docs/i18n/AGENTS.fr.md +34 -0
package/docs/i18n/AGENTS.ja.md +34 -0
package/docs/i18n/AGENTS.ko.md +34 -0
package/docs/i18n/AGENTS.ru.md +34 -0
package/docs/i18n/AGENTS.zh.md +34 -0
package/docs/i18n/CONFIGURATION.es.md +25 -0
package/docs/i18n/CONFIGURATION.fr.md +25 -0
package/docs/i18n/CONFIGURATION.ja.md +25 -0
package/docs/i18n/CONFIGURATION.ko.md +25 -0
package/docs/i18n/CONFIGURATION.ru.md +25 -0
package/docs/i18n/CONFIGURATION.zh.md +25 -0
package/docs/i18n/FRESH_INSTALL.es.md +27 -2
package/docs/i18n/FRESH_INSTALL.fr.md +27 -2
package/docs/i18n/FRESH_INSTALL.ja.md +27 -2
package/docs/i18n/FRESH_INSTALL.ko.md +27 -2
package/docs/i18n/FRESH_INSTALL.ru.md +27 -2
package/docs/i18n/FRESH_INSTALL.zh.md +27 -2
package/docs/i18n/HARNESSES.es.md +58 -0
package/docs/i18n/HARNESSES.fr.md +58 -0
package/docs/i18n/HARNESSES.ja.md +58 -0
package/docs/i18n/HARNESSES.ko.md +58 -0
package/docs/i18n/HARNESSES.ru.md +58 -0
package/docs/i18n/HARNESSES.zh.md +58 -0
package/docs/i18n/HARNESS_AIDER.es.md +48 -0
package/docs/i18n/HARNESS_AIDER.fr.md +48 -0
package/docs/i18n/HARNESS_AIDER.ja.md +50 -0
package/docs/i18n/HARNESS_AIDER.ko.md +50 -0
package/docs/i18n/HARNESS_AIDER.ru.md +48 -0
package/docs/i18n/HARNESS_AIDER.zh.md +48 -0
package/docs/i18n/HARNESS_CLAUDE.es.md +55 -0
package/docs/i18n/HARNESS_CLAUDE.fr.md +55 -0
package/docs/i18n/HARNESS_CLAUDE.ja.md +56 -0
package/docs/i18n/HARNESS_CLAUDE.ko.md +56 -0
package/docs/i18n/HARNESS_CLAUDE.ru.md +55 -0
package/docs/i18n/HARNESS_CLAUDE.zh.md +56 -0
package/docs/i18n/HARNESS_CODEX.es.md +55 -0
package/docs/i18n/HARNESS_CODEX.fr.md +55 -0
package/docs/i18n/HARNESS_CODEX.ja.md +56 -0
package/docs/i18n/HARNESS_CODEX.ko.md +56 -0
package/docs/i18n/HARNESS_CODEX.ru.md +55 -0
package/docs/i18n/HARNESS_CODEX.zh.md +56 -0
package/docs/i18n/HARNESS_CURSOR.es.md +42 -0
package/docs/i18n/HARNESS_CURSOR.fr.md +42 -0
package/docs/i18n/HARNESS_CURSOR.ja.md +45 -0
package/docs/i18n/HARNESS_CURSOR.ko.md +45 -0
package/docs/i18n/HARNESS_CURSOR.ru.md +42 -0
package/docs/i18n/HARNESS_CURSOR.zh.md +42 -0
package/docs/i18n/HARNESS_GEMINI.es.md +44 -0
package/docs/i18n/HARNESS_GEMINI.fr.md +44 -0
package/docs/i18n/HARNESS_GEMINI.ja.md +45 -0
package/docs/i18n/HARNESS_GEMINI.ko.md +45 -0
package/docs/i18n/HARNESS_GEMINI.ru.md +44 -0
package/docs/i18n/HARNESS_GEMINI.zh.md +45 -0
package/docs/i18n/HARNESS_HERMES.es.md +54 -0
package/docs/i18n/HARNESS_HERMES.fr.md +54 -0
package/docs/i18n/HARNESS_HERMES.ja.md +57 -0
package/docs/i18n/HARNESS_HERMES.ko.md +57 -0
package/docs/i18n/HARNESS_HERMES.ru.md +54 -0
package/docs/i18n/HARNESS_HERMES.zh.md +57 -0
package/docs/i18n/HARNESS_OPENCLAW.es.md +41 -0
package/docs/i18n/HARNESS_OPENCLAW.fr.md +41 -0
package/docs/i18n/HARNESS_OPENCLAW.ja.md +44 -0
package/docs/i18n/HARNESS_OPENCLAW.ko.md +44 -0
package/docs/i18n/HARNESS_OPENCLAW.ru.md +41 -0
package/docs/i18n/HARNESS_OPENCLAW.zh.md +42 -0
package/docs/i18n/HARNESS_OPENCODE.es.md +41 -0
package/docs/i18n/HARNESS_OPENCODE.fr.md +41 -0
package/docs/i18n/HARNESS_OPENCODE.ja.md +44 -0
package/docs/i18n/HARNESS_OPENCODE.ko.md +44 -0
package/docs/i18n/HARNESS_OPENCODE.ru.md +41 -0
package/docs/i18n/HARNESS_OPENCODE.zh.md +44 -0
package/docs/i18n/HERMES_VOICE.es.md +46 -0
package/docs/i18n/HERMES_VOICE.fr.md +46 -0
package/docs/i18n/HERMES_VOICE.ja.md +46 -0
package/docs/i18n/HERMES_VOICE.ko.md +65 -0
package/docs/i18n/HERMES_VOICE.ru.md +46 -0
package/docs/i18n/HERMES_VOICE.zh.md +46 -0
package/docs/i18n/MULTI_INSTANCE.es.md +25 -0
package/docs/i18n/MULTI_INSTANCE.fr.md +25 -0
package/docs/i18n/MULTI_INSTANCE.ja.md +25 -0
package/docs/i18n/MULTI_INSTANCE.ko.md +25 -0
package/docs/i18n/MULTI_INSTANCE.ru.md +25 -0
package/docs/i18n/MULTI_INSTANCE.zh.md +25 -0
package/docs/i18n/README.es.md +20 -134
package/docs/i18n/README.fr.md +20 -134
package/docs/i18n/README.ja.md +20 -134
package/docs/i18n/README.ko.md +20 -133
package/docs/i18n/README.ru.md +20 -134
package/docs/i18n/README.zh.md +20 -133
package/docs/i18n/RELEASE.es.md +26 -1
package/docs/i18n/RELEASE.fr.md +26 -1
package/docs/i18n/RELEASE.ja.md +26 -1
package/docs/i18n/RELEASE.ko.md +26 -1
package/docs/i18n/RELEASE.ru.md +26 -1
package/docs/i18n/RELEASE.zh.md +26 -1
package/docs/i18n/TROUBLESHOOTING.es.md +39 -0
package/docs/i18n/TROUBLESHOOTING.fr.md +39 -0
package/docs/i18n/TROUBLESHOOTING.ja.md +39 -0
package/docs/i18n/TROUBLESHOOTING.ko.md +39 -0
package/docs/i18n/TROUBLESHOOTING.ru.md +39 -0
package/docs/i18n/TROUBLESHOOTING.zh.md +39 -0
package/docs/i18n/USAGE.es.md +25 -0
package/docs/i18n/USAGE.fr.md +25 -0
package/docs/i18n/USAGE.ja.md +25 -0
package/docs/i18n/USAGE.ko.md +25 -0
package/docs/i18n/USAGE.ru.md +25 -0
package/docs/i18n/USAGE.zh.md +25 -0
package/docs/superpowers/plans/2026-05-13-phase1-streaming-pipeline.md +122 -0
package/docs/superpowers/plans/2026-05-13-phase10-push-notifications.md +152 -0
package/docs/superpowers/plans/2026-05-13-phase2-agent-adapters.md +242 -0
package/docs/superpowers/plans/2026-05-13-phase6-smart-progress.md +172 -0
package/docs/superpowers/plans/2026-05-13-phase7-voice-plan-mode.md +108 -0
package/docs/superpowers/plans/2026-05-14-cross-agent-voice-transfer.md +625 -0
package/docs/superpowers/plans/2026-05-21-audio-overview-narrated-diffs.md +95 -0
package/docs/superpowers/plans/2026-05-21-autoresearch-ontology.md +83 -0
package/docs/superpowers/plans/2026-05-21-phase11-push-to-talk-wakeword-v2.md +77 -0
package/docs/superpowers/plans/2026-05-21-phase12-multi-user-voice.md +147 -0
package/docs/superpowers/plans/2026-05-21-phase14-verbalbench.md +136 -0
package/docs/superpowers/plans/2026-05-21-phase15-phone-companion.md +72 -0
package/integrations/fireredtts2/mlx_llm.py +183 -0
package/integrations/fireredtts2/synth.py +156 -0
package/integrations/fireredtts2/synth_mlx.py +196 -0
package/integrations/mlxaudio/synth.py +74 -0
package/integrations/neuttsair/synth.py +104 -0
package/integrations/omnivoice/synth.py +110 -0
package/package.json +7 -1
package/scripts/cli.mjs +88 -3
package/scripts/doctor.mjs +115 -4
package/scripts/install.mjs +20 -2
package/scripts/install_fireredtts2.sh +109 -0
package/scripts/install_mlxaudio.sh +34 -0
package/scripts/install_mossttsnano.sh +46 -0
package/scripts/postinstall.mjs +34 -0

package/docs/TTS_BACKENDS.md ADDED

@@ -0,0 +1,227 @@
+# TTS backends and latency notes
+This document captures the current VerbalCoding TTS backends, the live-selection rules, and the latency caveats observed while testing on the current Mac mini.
+## Current test machine
+Observed host for these notes:
+- Machine: Mac mini, Apple M4
+- Memory: 16 GB
+- OS: macOS 26.3 / Darwin 25.3.0 arm64
+- Workload caveat: several measurements were taken while other heavy local processes or model-training jobs could be active. Treat local neural TTS timings as operational observations, not clean benchmarks.
+## Operational rule
+Edge TTS is the default safe live backend. Local neural backends are optional and should normally fall back to Edge for progress prompts unless explicitly enabled with each backend's `*_PROGRESS=1` setting.
+When a user explicitly asks to switch to a specific backend, update both `.env`:
+```bash
+TTS_BACKEND=<backend>
+TTS_VOICE_TYPE=<voice-type>
+```
+and `config/tts-voices.json`:
+```json
+{
+  "currentBackend": "<backend>",
+  "currentVoiceType": "<voice-type>"
+}
+```
+The runtime re-reads the voice config, so a backend set only in `.env` can be overridden by `config/tts-voices.json`.
+### Fallback notice
+When a non-Edge backend fails to synthesize (model missing, runtime crash, timeout, install error), the bridge silently re-routes that utterance through Edge so the user still hears a response. The first time this happens for each backend in a session, VerbalCoding posts a one-shot warning to the active Discord text channel and speaks the same message ("`<backend>` synthesis failed; using Edge for the rest of this session." / "`<backend>` 음성 생성에 실패해서 이번 세션은 Edge로 진행할게."). Subsequent failures for the same backend stay silent.
+If you see the warning, check `vc doctor` and the backend's venv/model install — the bridge will keep using Edge until the next `vc start`.
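Note (not part of the package diff): the one-shot warning described under "Fallback notice" can be sketched as a per-session set of backends that have already warned. The class and method names below are hypothetical and are not taken from `tts_backends.mjs`; only the English message text comes from the document.

```python
class FallbackNotifier:
    """Remember which backends have already produced a fallback warning this session."""

    def __init__(self):
        # Backends that have already warned; cleared only when a new session starts.
        self._warned = set()

    def on_failure(self, backend):
        """Return the warning text on the first failure of a backend, else None."""
        if backend in self._warned:
            return None  # later failures for the same backend stay silent
        self._warned.add(backend)
        return f"{backend} synthesis failed; using Edge for the rest of this session."


notifier = FallbackNotifier()
first = notifier.on_failure("fireredtts2")   # first failure: message returned
second = notifier.on_failure("fireredtts2")  # repeat failure: None
other = notifier.on_failure("omnivoice")     # a different backend warns once too
```

Creating a fresh `FallbackNotifier` at start-up would match the documented behavior that the bridge keeps using Edge until the next `vc start`.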
+## Supported backends
+| Backend | Purpose | Default path / command | Live-call suitability | Notes |
+|---|---|---|---|---|
+| `edge` | Free cloud TTS baseline | `edge-tts` | Best current default | Korean and English voices, fast enough for phone-call mode, progress cache works well. |
+| `openvoice` | Reference-sample voice cloning | `integrations/openvoice/synth.py` | Experimental | Requires permitted reference audio. Progress falls back to Edge unless `OPENVOICE_PROGRESS=1`. |
+| `speechswift` | Apple Silicon local CosyVoice / Qwen3 wrapper | `audio speak ...` | Experimental | CosyVoice is usable for demos but not as responsive as Edge; Qwen3 path is much slower. |
+| `supertonic` | Local Supertonic CLI wrapper | `supertonic tts ...` | Experimental | Supports voice IDs such as `M1`; falls back to Edge on failure. |
+| `omnivoice` | OmniVoice local reference/design voice | `.venv-omnivoice/bin/python integrations/omnivoice/synth.py` | Experimental | Startup/model load can feel hung. Keep Edge for live mode unless explicitly testing. |
+| `qwen3tts` | Qwen3 TTS via `audio` CLI | `audio speak --engine qwen3 ...` | Slow experimental | Correct backend name is `qwen3tts` / alias `qwen3`; do not use old `q13` aliases. |
+| `mlxaudio` | MLX Audio Qwen3 wrapper | `.venv-mlxaudio/bin/python integrations/mlxaudio/synth.py` | Experimental | Uses MLX Qwen3 model defaults; validate actual audible output, not only file existence. |
+| `neuttsair` | NeuTTS-Air English reference cloning | `.venv-neuttsair/bin/python integrations/neuttsair/synth.py` | Too slow for current live use | English-only in practice. Q4 GGUF lowers latency but still felt unusably slow under contention. |
+| `fireredtts2` | FireRedTTS-2 prompt-reference backend | `./.local/bin/fireredtts2` | Slow experimental | Can stall restart/final TTS long enough to feel broken. Honor explicit user selection, but report slowness instead of silently reverting. |
+| FireRedTTS-2 MLX helper | Apple Silicon FireRed LLM-port experiment | `integrations/fireredtts2/synth_mlx.py` | Not wired as canonical backend yet | Ports the FireRed LLM token generator to MLX/Metal while keeping RedCodec in Torch; intended to avoid upstream Torch Qwen hangs/slowness. |
+| `mossttsnano` | OpenMOSS / MOSS-TTS-Nano PyTorch backend | `.venv-mossttsnano/bin/python vendor/MOSS-TTS-Nano/infer.py` | Very slow experimental | On macOS use Python 3.11 venv and `--disable-wetext-processing`. |
+| `mossttsnano_mlx` | MOSS-TTS-Nano hybrid MLX port | `.venv-mossttsnano/bin/python integrations/mossttsnano_mlx/synth.py` | Active experiment, not live default | Native MLX generator, KV cache, and persistent JSON-line worker were added. Still verify audibility and tokenizer/model parity. |
+## Backend aliases
+Accepted aliases normalize to canonical backend names:
+| Alias examples | Canonical backend |
+|---|---|
+| `qwen3`, `qwen3-tts`, `qtts` | `qwen3tts` |
+| `mlx`, `mlx-audio`, `qwen3-mlx` | `mlxaudio` |
+| `neutts`, `neutts-air`, `neu tts air` | `neuttsair` |
+| `firered`, `fireredtts`, `firered-tts-2` | `fireredtts2` |
+| `moss`, `moss-tts`, `mossnano`, `openmoss` | `mossttsnano` |
+| `moss-mlx`, `mossttsnano-mlx`, `openmoss-mlx` | `mossttsnano_mlx` |
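Note (not part of the package diff): a minimal sketch of how the alias table above could be applied. The function name, the case-insensitive matching, and the whitespace collapsing are assumptions; only the alias and canonical names come from the table, and the actual normalization code in the package may differ.

```python
import re

# Canonical backend -> alias examples, copied from the table above.
ALIASES = {
    "qwen3tts": ["qwen3", "qwen3-tts", "qtts"],
    "mlxaudio": ["mlx", "mlx-audio", "qwen3-mlx"],
    "neuttsair": ["neutts", "neutts-air", "neu tts air"],
    "fireredtts2": ["firered", "fireredtts", "firered-tts-2"],
    "mossttsnano": ["moss", "moss-tts", "mossnano", "openmoss"],
    "mossttsnano_mlx": ["moss-mlx", "mossttsnano-mlx", "openmoss-mlx"],
}


def _key(name):
    """Lowercase, trim, and collapse inner whitespace (an assumed matching rule)."""
    return re.sub(r"\s+", " ", name.strip().lower())


# Flat lookup: every alias and every canonical name maps to the canonical name.
LOOKUP = {}
for canonical, aliases in ALIASES.items():
    LOOKUP[canonical] = canonical
    for alias in aliases:
        LOOKUP[_key(alias)] = canonical


def normalize_backend(name):
    """Return the canonical backend name, or None if the name is not in the table."""
    return LOOKUP.get(_key(name))
```

Backends that have no alias row (for example `edge`) are outside this sketch and would return `None` here, as would the retired `q13` alias.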
+## Observed latency
+### End-to-end voice loop log
+From `.logs/latency.jsonl`, 160 successful voice turns were available. These measure the whole Discord voice loop, not just TTS:
+| Stage | Median | P90 | Min | Max |
+|---|---:|---:|---:|---:|
+| STT | 3.81 s | 4.60 s | 0.75 s | 23.70 s |
+| Agent call | 16.90 s | 209.74 s | 5.58 s | 825.90 s |
+| TTS synth | 3.98 s | 12.77 s | 0.72 s | 760.73 s |
+| TTS playback | 19.50 s | 47.16 s | 0.99 s | 90.36 s |
+| TTS total | 23.14 s | 62.28 s | 1.90 s | 782.89 s |
+| Voice capture | 11.66 s | 30.02 s | 3.20 s | 109.10 s |
+| Utterance idle wait | 2.60 s | 4.50 s | 2.60 s | 4.54 s |
+| Total turn | 69.06 s | 289.56 s | 20.99 s | 905.24 s |
+Interpretation:
+- Long perceived latency is often not only TTS. Agent work and spoken playback length dominate many turns.
+- A high TTS-synth max indicates that local/experimental TTS can stall badly under load or on fallback paths.
+- Playback time is real audio duration, so long answers sound slow even if synthesis is fast.
+- The idle wait is intentionally a few seconds to avoid cutting off Korean phone-call utterances.
+### Local neural TTS observations
+| Backend / mode | Observed behavior on this Mac mini | Practical conclusion |
+|---|---|---|
+| Edge TTS | Usually low seconds for chunks; reliable enough for current live mode. | Keep as default/fallback. |
+| SpeechSwift CosyVoice CLI | About 6.9 s wall time for a 1.68 s Korean sample after warm-up. | Demo-capable, but sluggish for conversation. |
+| SpeechSwift audio-server | Warm short Korean requests varied around 4.5-7.7 s and sometimes hung. | Not safe as the always-on live backend yet. |
+| SpeechSwift/Qwen3 | About 62.5 s wall time, first chunk around 47.6 s in prior testing. | Too slow for live phone-call mode. |
+| NeuTTS Air | Produced valid WAVs, but felt unusably slow while the machine was under unrelated GPU/model load. | English-only experiment; use Edge for live answers. |
+| FireRedTTS-2 | Can be slow enough that restart/final TTS appears stalled. Timeout is 180 s by default. | Useful to test, but report slowness clearly. |
+| FireRedTTS-2 MLX helper | Added as an Apple Silicon experiment that moves the LLM token generator to MLX/Metal and keeps RedCodec encode/decode in Torch. | Not a production backend yet; verify dependencies, imports, generated frames, and decoded volume before wiring it to `TTS_BACKEND`. |
+| MOSS-TTS-Nano PyTorch | Works as an OpenMOSS path but is very slow on macOS. | Keep as correctness baseline, not live default. |
+| MOSS-TTS-Nano MLX | Added native generator, sampling fixes, KV cache, and persistent worker; can reduce repeated startup overhead. | Still experimental; verify audible volume and parity before live use. |
+## MOSS-TTS-Nano MLX status
+Recent implementation work added:
+- `integrations/mossttsnano_mlx/convert.py` for conversion experiments.
+- `integrations/mossttsnano_mlx/gpt2_mlx.py` for a native MLX GPT2-like generator.
+- `integrations/mossttsnano_mlx/synth.py` for the hybrid synthesis path.
+- `integrations/mossttsnano_mlx/worker.py` for a persistent JSON-line worker.
+- `MOSSTTSNANO_MLX_WORKER=1` to keep the worker hot between requests.
+- KV cache and sampling-semantics fixes in the MLX generator.
+Known caution:
+- A generated WAV is not enough. Check audibility with playback or `ffmpeg volumedetect`.
+- Near-silent or strange audio usually means model/tokenizer/audio-code parity is still wrong.
+- Keep the PyTorch MOSS path as a reference until MLX parity is proven.
+## Configuration examples
+### Safe live default
+```bash
+TTS_BACKEND=edge
+TTS_VOICE_TYPE=korean_male
+TTS_VOICE=ko-KR-InJoonNeural
+TTS_RATE=+10%
+```
+### Qwen3 TTS preset
+```bash
+TTS_BACKEND=qwen3tts
+TTS_VOICE_TYPE=korean_preset
+QWEN3TTS_COMMAND=audio
+QWEN3TTS_MODE=custom
+QWEN3TTS_MODEL=customVoice
+QWEN3TTS_LANGUAGE=korean
+QWEN3TTS_SPEAKER=sohee
+QWEN3TTS_PROGRESS=0
+```
+### NeuTTS Air English experiment
+```bash
+TTS_BACKEND=neuttsair
+TTS_VOICE_TYPE=cloned_reference
+VOICE_LANGUAGE=en
+STT_LANGUAGE=en
+WHISPER_CPP_LANGUAGE=en
+NEUTTSAIR_PYTHON=./.venv-neuttsair/bin/python
+NEUTTSAIR_SCRIPT=integrations/neuttsair/synth.py
+NEUTTSAIR_BACKBONE_REPO=neuphonic/neutts-air-q4-gguf
+NEUTTSAIR_CODEC_REPO=neuphonic/neucodec
+NEUTTSAIR_PROGRESS=0
+```
+### FireRedTTS-2 experiment
+```bash
+TTS_BACKEND=fireredtts2
+TTS_VOICE_TYPE=prompt_reference
+FIREREDTTS2_COMMAND=./.local/bin/fireredtts2
+FIREREDTTS2_PRETRAINED_DIR=./pretrained_models/FireRedTTS2
+FIREREDTTS2_PROMPT_AUDIO=./voice-samples/user-reference.wav
+FIREREDTTS2_PROGRESS=0
+```
+### MOSS-TTS-Nano PyTorch experiment
+```bash
+TTS_BACKEND=mossttsnano
+TTS_VOICE_TYPE=prompt_reference
+MOSSTTSNANO_COMMAND=./.venv-mossttsnano/bin/python
+MOSSTTSNANO_SCRIPT=vendor/MOSS-TTS-Nano/infer.py
+MOSSTTSNANO_CHECKPOINT=OpenMOSS-Team/MOSS-TTS-Nano
+MOSSTTSNANO_PROMPT_AUDIO=./voice-samples/user-reference.wav
+MOSSTTSNANO_PROGRESS=0
+```
+### MOSS-TTS-Nano MLX worker experiment
+```bash
+TTS_BACKEND=mossttsnano_mlx
+TTS_VOICE_TYPE=prompt_reference
+MOSSTTSNANO_MLX_PYTHON=./.venv-mossttsnano/bin/python
+MOSSTTSNANO_MLX_SCRIPT=integrations/mossttsnano_mlx/synth.py
+MOSSTTSNANO_MLX_WORKER=1
+MOSSTTSNANO_MLX_WORKER_SCRIPT=integrations/mossttsnano_mlx/worker.py
+MOSSTTSNANO_TORCH_DEVICE=cpu
+MOSSTTSNANO_TORCH_DTYPE=float32
+MOSSTTSNANO_PROMPT_AUDIO=./voice-samples/user-reference.wav
+MOSSTTSNANO_MLX_PROGRESS=0
+```
+## How to benchmark safely
+Use a quiet machine, short fixed text, and separate synthesis from playback:
+```bash
+vc doctor
+node --test app-node/tts_backends.test.mjs app-node/tts_settings.test.mjs app-node/tts_voice_config.test.mjs
+```
+For live logs, compare these fields in `.logs/latency.jsonl`:
+- `stt_ms`: speech-to-text time.
+- `agent_ms`: CLI agent time.
+- `tts_synth_ms`: time to synthesize audio files.
+- `tts_play_ms`: time spent playing generated audio.
+- `total_ms`: full turn time.
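Note (not part of the package diff): the fields listed above can be summarized with a short script. This is a sketch under assumptions: each line of `.logs/latency.jsonl` is taken to be one JSON object with millisecond values, P90 is computed with the nearest-rank method (the document does not say which method produced its table), and no filtering by turn status is applied.

```python
import json
import math
import statistics

# Field names as listed in the document.
FIELDS = ("stt_ms", "agent_ms", "tts_synth_ms", "tts_play_ms", "total_ms")


def nearest_rank(sorted_values, pct):
    """Nearest-rank percentile of an ascending, non-empty list."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def summarize(lines):
    """Summarize JSONL latency records into seconds per field."""
    samples = {field: [] for field in FIELDS}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        for field in FIELDS:
            value = record.get(field)
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                samples[field].append(value)
    summary = {}
    for field, values in samples.items():
        if not values:
            continue  # field never present in the input
        values.sort()
        summary[field] = {
            "median_s": statistics.median(values) / 1000,
            "p90_s": nearest_rank(values, 90) / 1000,
            "min_s": values[0] / 1000,
            "max_s": values[-1] / 1000,
        }
    return summary


# Tiny synthetic input (made-up numbers, not measurements from the package).
demo = [
    '{"stt_ms": 1000, "total_ms": 20000}',
    '{"stt_ms": 3000, "total_ms": 60000}',
    '{"stt_ms": 2000, "total_ms": 40000}',
]
demo_summary = summarize(demo)
```

To run it on a real log, pass an open file handle, for example `summarize(open(".logs/latency.jsonl", encoding="utf-8"))`.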
+When testing local neural backends, also verify:
+```bash
+ffmpeg -i output.wav -af volumedetect -f null -
+```
+A non-empty file can still be inaudible or near-silent.
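Note (not part of the package diff): the audibility check can be automated by parsing the `mean_volume` and `max_volume` lines that ffmpeg's `volumedetect` filter writes to stderr. The helper names and the -50 dB threshold below are illustrative assumptions, not values used by VerbalCoding.

```python
import re

# Matches lines such as "[Parsed_volumedetect_0 @ 0x...] max_volume: -4.1 dB".
_VOLUME_RE = re.compile(r"(mean_volume|max_volume):\s*(-?\d+(?:\.\d+)?)\s*dB")


def parse_volumedetect(stderr_text):
    """Extract mean_volume and max_volume (in dB) from ffmpeg volumedetect output."""
    return {key: float(value) for key, value in _VOLUME_RE.findall(stderr_text)}


def looks_silent(levels, max_db_floor=-50.0):
    """Treat audio as near-silent when its peak is below the (assumed) floor."""
    return levels.get("max_volume", float("-inf")) < max_db_floor


# Hand-written sample text in the shape ffmpeg prints; not output from a real run.
sample = """[Parsed_volumedetect_0 @ 0x600000] n_samples: 40320
[Parsed_volumedetect_0 @ 0x600000] mean_volume: -23.4 dB
[Parsed_volumedetect_0 @ 0x600000] max_volume: -4.1 dB
"""
levels = parse_volumedetect(sample)
quiet = parse_volumedetect("mean_volume: -88.0 dB\nmax_volume: -72.0 dB\n")
```

The text to parse is the stderr of the `ffmpeg -i output.wav -af volumedetect -f null -` command shown above. A missing `max_volume` line is treated as silent so that a failed probe is not mistaken for audible output.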

package/docs/USAGE.md CHANGED

@@ -1,45 +1,82 @@
 # VerbalCoding Usage Guide
-This page holds the operational details that used to make the README too long.
+<!-- readme-glow-up:intro -->
+<p align="center">
+  <a href="../README.md">README</a> ·
+  <a href="README.md">Docs hub</a> ·
+  <a href="FRESH_INSTALL.md">Fresh Install</a> ·
+  <a href="USAGE.md">Usage</a> ·
+  <a href="CONFIGURATION.md">Configuration</a> ·
+  <a href="TROUBLESHOOTING.md">Troubleshooting</a> ·
+  <a href="MULTI_INSTANCE.md">Multi-Instance</a>
+</p>
+> Operational command reference for the voice bridge.
+>
+> Fast path: `vc setup → vc start → speak or use !ask in Discord`
+<!-- /readme-glow-up:intro -->
+This page holds the operational details that should stay out of the README.
 ## CLI Commands
 ```bash
-vc status                    # show STT language, progress language, and TTS voice
-vc language en               # English STT + English progress/TTS voice
-vc language ko               # Korean STT + Korean progress/TTS voice
-vc language auto             # Whisper auto-detect STT + English progress/TTS voice
-vc restart auto status       # show commit-time voice-bot auto-restart setting
-vc restart auto on           # enable commit-time voice-bot auto-restart
-vc restart auto off          # disable it; this is the default
-vc bot invite CLIENT_ID      # print a Discord invite URL with required permissions
-vc instance status           # list per-instance bridge configs and process status
-vc instance setup NAME       # write instances/NAME.env and create ~/.hermes/profiles/NAME
-vc instance start NAME       # start ./run.sh instances/NAME.env detached
-vc instance stop NAME        # stop a detached instance and remove its pid file
-vc doctor                    # run the redacted doctor check
-npm run mcp                  # run the stdio MCP server
+vc setup                               # guided setup: prerequisites, Discord token, voice channels
+vc setup --yes                         # non-interactive bootstrap/starter config for automation
+vc setup --yes --no-wizard             # dependency/bootstrap only
+vc setup token                         # later update Discord bot token
+vc setup token TOKEN --client-id ID     # non-interactive token/client-id update
+vc setup channels "General,Team Voice" # later update auto-join voice channel names
+vc setup channel "General"             # alias for setup channels
+vc setup voice "General"               # alias for setup channels
+vc bot invite CLIENT_ID                 # print a Discord invite URL with required permissions
+vc status                               # show STT language, progress language, and TTS voice
+vc language en                          # English STT + English progress/TTS voice
+vc language ko                          # Korean STT + Korean progress/TTS voice
+vc language auto                        # Whisper auto-detect STT + English progress/TTS voice
+vc restart auto status                  # show commit-time voice-bot auto-restart setting
+vc restart auto on                      # enable commit-time voice-bot auto-restart
+vc restart auto off                     # disable it; this is the default
+vc instance list                        # list per-instance bridge configs
+vc instance status [NAME]               # show instance process status
+vc instance setup NAME                  # write instances/NAME.env and create ~/.hermes/profiles/NAME
+vc instance start NAME                  # start ./run.sh instances/NAME.env detached
+vc instance stop NAME                   # stop a detached instance and remove its pid file
+vc doctor                               # run the redacted doctor check and supported auto-fixes
+vc start                                # start the default bridge
+npm run mcp                             # run the stdio MCP server from a clone
 ```
-Language changes update `.env`; restart the bridge with `./run.sh` or your process manager for them to take effect.
+For npm/global installs, prefer `vc ...` commands. Use `./scripts/install.sh` only from a GitHub clone.
+`vc setup token` and `vc setup channels` are safe follow-up commands: they update `.env` in place, preserve unrelated keys, set file mode `0600`, and avoid printing secrets.
+Language changes update `.env`; restart the bridge with `vc start`, `./run.sh`, or your process manager for them to take effect.
 ## Run Modes
 Single-instance bridge:
 ```bash
+vc start
+# clone equivalent:
 ./run.sh
 ```
 Per-instance bridge using a local override env:
 ```bash
+vc instance start my-project
+# clone/debug equivalent:
 ./run.sh instances/my-project.env
-# or
 VERBALCODING_INSTANCE_ENV=instances/my-project.env ./run.sh
 ```
-The bot auto-joins the first configured channel name, defaulting to `일반,General,general`.
+The bot auto-joins the first matching configured channel name. Set it with:
+```bash
+vc setup channels "VerbalCoding,LLM-Wiki,General"
+```
 ## Discord Commands
@@ -70,6 +107,28 @@ Then use `vc bot invite CLIENT_ID` to generate the VerbalCoding-specific invite
 Voice equivalents such as “외부 모드”, “보수 모드”, “실내”, “기본 감도”, and clear stop phrases like “잠깐”, “멈춰”, “그만” are handled by the bridge. You can also say “상세 진행 켜” / “상세 진행 꺼” to toggle verbose progress by voice.
+## Cross-agent voice routing
+VerbalCoding can route a single turn (or the rest of the session) to a different installed CLI agent without restarting.
+| Voice phrase (en) | Voice phrase (ko) | Behavior |
+|---|---|---|
+| `ask Codex what it thinks` | `코덱스한테 물어봐` | Single-turn route to Codex; next utterance returns to the default. |
+| `switch to Aider` | `aider로 전환` | Sticky route — every following utterance goes to Aider. |
+| `back to default` | `기본으로 돌아가` | Restore the default agent (`AGENT_BACKEND` / `vc setup` selection). |
+| `let Claude finish this` | — | Treated as sticky route to Claude Code. |
+Recognized aliases: `hermes`, `claude` / `claude code`, `codex` / `코덱스`, `gemini` / `gemini cli` / `제미나이`, `opencode`, `openclaw`, `aider` / `에이더`, `cursor` / `cursor cli`.
+Additional behaviors:
+- **Missing-binary fallback** — if the requested backend's binary is not on `PATH` (resolved against the active project session's workdir when applicable), the bridge asks "Want me to use the default agent instead?" Answer "yes" / "예" to retry on the default; "no" / "아니오" to cancel.
+- **TTS prefix on backend change** — when the active backend changes between turns, the spoken answer is prefixed (`Codex says: …` / `코덱스: …`). No prefix on stable backends.
+- **Cross-agent context handoff** — the routed agent receives a prompt block containing the prior agent label, recent voice utterances (last 4), and the most recently resolved plan decisions, so it doesn't restart cold.
+- **Plan-mode `which_agent` slot** — plans can include a `which_agent` decision listing CLI options (e.g. `codex, aider, claude, gemini, opencode, openclaw, cursor, hermes`); the user's voice answer selects which agent executes that plan.
+- **Per-channel state** — routing is scoped per Discord channel; switching agents in one project room does not affect others.
+- **Sticky survives interrupts** — barge-in or aborted turns keep a sticky route intact; only single-turn routes are cleared.
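Note (not part of the package diff): the single-turn, sticky, per-channel, and interrupt rules above can be sketched as a small state holder. The class and method names are hypothetical, and `hermes` as the default agent is only an example; the real logic lives in `agent_routing.mjs` and may differ.

```python
class ChannelRouting:
    """Per-channel agent routing with single-turn and sticky routes."""

    def __init__(self, default_agent):
        self.default_agent = default_agent
        self._sticky = {}  # channel id -> agent for all following turns
        self._single = {}  # channel id -> agent for the next turn only

    def route_once(self, channel_id, agent):
        """'ask Codex what it thinks': route only the next turn."""
        self._single[channel_id] = agent

    def route_sticky(self, channel_id, agent):
        """'switch to Aider': route every following turn."""
        self._sticky[channel_id] = agent
        self._single.pop(channel_id, None)

    def reset(self, channel_id):
        """'back to default': drop both kinds of route for this channel."""
        self._sticky.pop(channel_id, None)
        self._single.pop(channel_id, None)

    def on_interrupt(self, channel_id):
        """Barge-in or aborted turn: clear single-turn routes, keep sticky ones."""
        self._single.pop(channel_id, None)

    def agent_for_turn(self, channel_id):
        """Pick the agent for this turn, consuming any single-turn route."""
        agent = self._single.pop(channel_id, None)
        if agent is not None:
            return agent
        return self._sticky.get(channel_id, self.default_agent)


routing = ChannelRouting("hermes")
routing.route_once("room-a", "codex")
first = routing.agent_for_turn("room-a")    # single-turn route applies once
second = routing.agent_for_turn("room-a")   # then the default returns
routing.route_sticky("room-a", "aider")
routing.on_interrupt("room-a")
third = routing.agent_for_turn("room-a")    # sticky route survives the interrupt
other_room = routing.agent_for_turn("room-b")  # other channels are unaffected
routing.reset("room-a")
fourth = routing.agent_for_turn("room-a")
```

Keying both dictionaries by channel id is what keeps a switch in one project room from affecting another.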
 ## Changing the Voice
 `vc language ko|en|auto` changes STT language, progress language, and the matching default TTS voice together. If you only want to change the speaker/voice while the bridge is running, say it in Discord voice:
@@ -81,7 +140,7 @@ change voice to Korean female
 switch speaker to English
 ```
-The live bridge recognizes these as voice-control commands, updates `config/tts-voices.json`, updates the effective TTS env for the running process, and answers with a short confirmation such as “목소리를 Korean male로 바꿨어.” Use `!voice-test <text>` right after changing it to hear the current backend and voice.
+The live bridge recognizes these as voice-control commands, updates `config/tts-voices.json`, updates the effective TTS env for the running process, and answers with a short confirmation. Use `!voice-test <text>` right after changing it to hear the current backend and voice.
 Built-in Edge voice types:
@@ -93,27 +152,12 @@ Built-in Edge voice types:
 | `english_male` | `en-US-GuyNeural` |
 | `english_female` | `en-US-AriaNeural` |
-For persistent manual config, set `TTS_BACKEND=edge`, `TTS_VOICE_TYPE=<voice-type>`, and optionally `TTS_VOICE=<edge-voice>` in `.env`, or edit `config/tts-voices.json` for custom voice catalogs.
-Backend-specific voice knobs:
-| Backend | Voice setting | Common choices |
-|---|---|---|
-| Edge | `TTS_VOICE_TYPE`, `TTS_VOICE` | `korean_male`, `korean_female`, `korean_multilingual_male`, `english_male`, `english_female`; any Edge voice from `edge-tts --list-voices` |
-| Supertonic | `SUPERTONIC_VOICE` | `M1`–`M5`, `F1`–`F5`; set `SUPERTONIC_LANGUAGE=ko|en|es|pt|fr` |
-| OpenVoice | `OPENVOICE_REF_AUDIO`, `OPENVOICE_STYLE` | a permitted reference WAV plus style such as `default` |
-| SpeechSwift / CosyVoice | `SPEECHSWIFT_REF_AUDIO`, `SPEECHSWIFT_ENGINE`, `SPEECHSWIFT_SPEAKER` | reference WAV for CosyVoice, or backend-supported speaker/model values |
-For Supertonic and local clone backends, use the backend env vars above plus `!voice-test <text>` to audition changes. Voice-command switching currently maps the built-in Edge-style voice types; richer backend catalogs can be added in `config/tts-voices.json`.
 ## Long Dictation and Pauses
-VerbalCoding waits for an idle window before sending speech to STT. The default `UTTERANCE_IDLE_MS=4500` is intentionally a bit patient so a natural pause in a long instruction does not split the sentence, start an agent turn too early, and then treat the rest as a processing-time interruption.
-If you prefer faster short commands, lower it in `.env`; if long Korean dictation is still being split, raise it:
+VerbalCoding waits for an idle window before sending speech to STT. The default `UTTERANCE_IDLE_MS=4500` is intentionally patient so a natural pause in a long instruction does not split the sentence.
 ```bash
-UTTERANCE_IDLE_MS="6000"
+UTTERANCE_IDLE_MS="6000"  # safer for long dictation with pauses
 ```
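As a rough illustration of the idle-window behavior this hunk describes, here is a minimal sketch. It assumes a collector that buffers audio chunks and flushes them once no new chunk has arrived for `UTTERANCE_IDLE_MS` milliseconds. The `UtteranceCollector` class, the `readIdleMs` helper, and the injectable timer parameters are hypothetical names for illustration; they are not taken from `app-node/`, and the real bridge may work differently.

```javascript
// Hypothetical helper: read the idle window from an env-like object,
// falling back to the documented default of 4500 ms.
function readIdleMs(env, fallback = 4500) {
  const parsed = Number(env.UTTERANCE_IDLE_MS);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

// Hypothetical collector: every new chunk restarts the idle timer, so a
// pause shorter than idleMs does not split the utterance.
class UtteranceCollector {
  constructor({ idleMs, onUtterance, setTimer = setTimeout, clearTimer = clearTimeout }) {
    this.idleMs = idleMs;
    this.onUtterance = onUtterance;
    this.setTimer = setTimer;
    this.clearTimer = clearTimer;
    this.chunks = [];
    this.timer = null;
  }

  // Buffer a chunk and restart the idle countdown.
  push(chunk) {
    this.chunks.push(chunk);
    if (this.timer !== null) this.clearTimer(this.timer);
    this.timer = this.setTimer(() => this.flush(), this.idleMs);
  }

  // Hand the buffered chunks to the callback (for example, an STT step).
  flush() {
    this.timer = null;
    if (this.chunks.length === 0) return;
    const utterance = this.chunks;
    this.chunks = [];
    this.onUtterance(utterance);
  }
}
```

In this sketch a larger idle window trades responsiveness for fewer split sentences, which matches the tuning advice in the hunk: raise the value for long dictation, lower it for short commands.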
 ## Verbose Progress Mode
@@ -128,7 +172,19 @@ Verbose progress is off by default unless `AGENT_VERBOSE_PROGRESS=1` is set. Ena
 🤖 Hermes Agent 응답 수신
 ```
-This mode asks the selected CLI harness to emit `VERBALCODING_PROGRESS: ...` lines and summarizes common tool markers from streaming stdout/stderr when available. Secret-looking fields are redacted and progress lines are removed from the final spoken answer.
+Secret-looking fields are redacted and progress lines are removed from the final spoken answer.
+## Docker / Container Run Mode
+If you run VerbalCoding in Docker and voice auto-join fails with `Cannot perform IP discovery - socket closed`, the likely issue is UDP connectivity, not channel lookup. For Linux Docker Compose:
+```yaml
+services:
+  verbalcoding:
+    network_mode: "host"
+```
+Remove `ports:` from that service. Docker Desktop for macOS/Windows has different host networking behavior; if UDP voice still fails there, run on the host or in a Linux VM. See [Troubleshooting](TROUBLESHOOTING.md).
 ## Latency Metrics
@@ -138,8 +194,6 @@ VerbalCoding writes per-turn latency records as JSONL. Default path:
 ./.logs/latency.jsonl
 ```
-Each record includes status, total time, voice capture time, utterance idle wait, STT time, agent time, TTS synthesis/playback time, chunk counts, transcript length, answer length, and audio levels where available.
 In Discord:
 ```text
@@ -154,7 +208,7 @@ The summary uses the latest 200 records: count, average, p95, max, and non-OK st
 ```bash
 node --check app-node/main.mjs
 npm test
-bash -n run.sh scripts/install.sh
+bash -n run.sh scripts/install.sh scripts/bootstrap_prereqs.sh
 vc doctor
 ```
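The Latency Metrics hunk above says the summary covers the latest 200 records and reports count, average, p95, and max. A minimal sketch of that aggregation follows. It assumes each JSONL record carries a numeric `totalMs` and a string `status`; both field names are assumptions, since the actual record schema is not shown in this diff. The `summarizeLatency` function name and the tally of non-`"ok"` records are likewise illustrative guesses, and p95 here uses the nearest-rank method.

```javascript
// Hypothetical summary over JSONL latency records.
// Assumed record shape: { "status": "ok", "totalMs": 1234 } (not confirmed by the diff).
function summarizeLatency(jsonlText, { window = 200 } = {}) {
  const records = jsonlText
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line))
    .slice(-window); // keep only the latest `window` records

  const count = records.length;
  if (count === 0) {
    return { count: 0, average: null, p95: null, max: null, nonOk: 0 };
  }

  const totals = records.map((r) => r.totalMs).sort((a, b) => a - b);
  const average = totals.reduce((sum, v) => sum + v, 0) / count;
  // Nearest-rank percentile: the smallest value with at least 95% of samples at or below it.
  const p95 = totals[Math.ceil((95 * count) / 100) - 1];
  const max = totals[count - 1];
  const nonOk = records.filter((r) => r.status !== "ok").length;

  return { count, average, p95, max, nonOk };
}
```

A caller could read `./.logs/latency.jsonl` with `fs.readFileSync(path, "utf8")` and pass the text in; the function itself stays free of file I/O so it is easy to test.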

package/docs/assets/figures/verbalcoding-flow.svg CHANGED

@@ -73,6 +73,6 @@
   <text x="594" y="194" fill="#FDE68A" text-anchor="middle" font-family="Inter, ui-sans-serif, system-ui" font-size="15" font-weight="700">Barge-in stays open while the agent is thinking or speaking</text>
   <rect x="150" y="348" width="900" height="54" rx="17" fill="#020617" stroke="#1F2937"/>
-  <text x="182" y="382" fill="#A7F3D0" font-family="SFMono-Regular, ui-monospace, monospace" font-size="18">$ vc language ko &amp;&amp; vc instance start my-project</text>
+  <text x="182" y="382" fill="#A7F3D0" font-family="SFMono-Regular, ui-monospace, monospace" font-size="18">$ vc setup → vc doctor → vc start</text>
   <text x="1045" y="382" fill="#64748B" text-anchor="end" font-family="Inter, ui-sans-serif, system-ui" font-size="15">hands-free coding call</text>
 </svg>

package/docs/i18n/AGENTS.es.md ADDED

@@ -0,0 +1,34 @@
+# Guía del repositorio (español)
+> Este fichero es un resumen en español de [`AGENTS.md`](../../AGENTS.md). Las reglas formales viven en el inglés original.
+VerbalCoding es un puente de voz Discord para agentes de codificación. El runtime es la implementación Node en `app-node/`, lanzada vía `run.sh` o el CLI `vc`.
+## Desarrollo
+- En docs y ejemplos prefiere `vc ...` sobre `npm run vc -- ...`.
+- Los secretos locales viven en `.env` o `instances/*.env`; nunca commits con tokens Discord, IDs de canal, ficheros de sesión, muestras de voz, pesos de modelo, virtualenvs, logs ni cachés.
+- Edita ficheros fuente, no artefactos generados.
+- Ejemplos públicos: usa placeholders para rutas locales, IDs de usuario, IDs Discord y tokens.
+## Verificación
+Antes de marcar un cambio como completo, ejecuta:
+```bash
+npm test
+```
+## Layout de módulos
+Detalle en [`AGENTS.md`](../../AGENTS.md). Módulos clave:
+- `main.mjs` — dispatcher Discord / voz / agente
+- `agent_routing.mjs` — enrutamiento entre agentes por voz
+- `plan_mode.mjs` — modo plan por voz (slot `which_agent`)
+- `session_ontology.mjs` — grafo tipado por canal (handoff)
+- `research_mode.mjs` — comando `"research X"`
+## Bloque gestionado
+HarnessSync sincroniza las reglas de `CLAUDE.md` dentro de `AGENTS.md`. No edites manualmente ese bloque.

package/docs/i18n/AGENTS.fr.md ADDED

@@ -0,0 +1,34 @@
+# Guide du dépôt (français)
+> Ce fichier est un résumé français de [`AGENTS.md`](../../AGENTS.md). Les règles formelles restent dans l'original anglais.
+VerbalCoding est un pont vocal Discord pour les agents de codage. Le runtime est l'implémentation Node sous `app-node/`, lancée via `run.sh` ou le CLI `vc`.
+## Développement
+- Dans les docs et exemples, préférez `vc ...` à `npm run vc -- ...`.
+- Les secrets locaux vivent dans `.env` ou `instances/*.env`; ne commitez jamais de vrais tokens Discord, IDs de salon, fichiers de session, échantillons vocaux, poids de modèle, virtualenvs, logs ni caches.
+- Modifiez les fichiers source, pas les artefacts générés.
+- Exemples publics-safe : placeholders pour chemins locaux, IDs utilisateur, IDs Discord, tokens.
+## Vérification
+Avant de signaler un changement comme terminé, exécutez:
+```bash
+npm test
+```
+## Cartographie des modules
+Détail dans [`AGENTS.md`](../../AGENTS.md). Modules clés :
+- `main.mjs` — dispatcher Discord / voix / agent
+- `agent_routing.mjs` — routage inter-agent par voix
+- `plan_mode.mjs` — mode plan vocal (slot `which_agent`)
+- `session_ontology.mjs` — graphe typé par salon (handoff)
+- `research_mode.mjs` — commande `"research X"`
+## Bloc géré
+HarnessSync synchronise les règles de `CLAUDE.md` dans `AGENTS.md`. Ne pas éditer manuellement ce bloc.

package/docs/i18n/AGENTS.ja.md ADDED

@@ -0,0 +1,34 @@
+# リポジトリガイドライン (日本語)
+> 本ファイルは [`AGENTS.md`](../../AGENTS.md) の日本語要約です。正式なルールは英語の本文を参照してください。
+VerbalCoding はコーディングエージェント向けの Discord 音声ブリッジです。実装は `app-node/` 配下の Node 実装で、`run.sh` または `vc` CLI 経由で起動します。
+## 開発
+- ドキュメント / サンプルでは `vc ...` を `npm run vc -- ...` より優先してください。
+- ローカルシークレットは `.env` または `instances/*.env` に置き、実 Discord トークン、チャンネル ID、セッションファイル、音声サンプル、モデル重み、venv、ログ、キャッシュ出力はコミットしないでください。
+- 自動生成物ではなくソースファイルを編集してください。
+- サンプルは公開しても安全な値で。ローカルパス、ユーザー ID、Discord ID、トークンはプレースホルダで。
+## 検証
+コード変更を完了とする前に Node テストを走らせてください:
+```bash
+npm test
+```
+## モジュール構成
+詳細は [`AGENTS.md`](../../AGENTS.md) を参照。主要モジュール:
+- `main.mjs` — Discord / 音声 / エージェントのディスパッチャ
+- `agent_routing.mjs` — 音声主導のクロスエージェントルーティング
+- `plan_mode.mjs` — 音声プランモード(`which_agent` スロット)
+- `session_ontology.mjs` — チャネル単位の typed graph(handoff 用)
+- `research_mode.mjs` — `"research X"` 音声コマンドパイプライン
+## 管理ブロック
+HarnessSync が `AGENTS.md` に `CLAUDE.md` のルールを同期します。当該ブロックは編集しないでください。

package/docs/i18n/AGENTS.ko.md ADDED

@@ -0,0 +1,34 @@
+# 저장소 가이드라인 (한국어)
+> 이 파일은 [`AGENTS.md`](../../AGENTS.md)의 한국어 요약입니다. 정식 규칙은 원본 영어 문서를 따라주세요.
+VerbalCoding은 코딩 에이전트용 Discord 음성 브릿지입니다. 실제 런타임은 `app-node/` 하위 Node 구현체이고, `run.sh` 또는 `vc` CLI로 실행합니다.
+## 개발
+- 문서·예제에서는 `npm run vc -- ...` 보다 `vc ...` 형태를 우선 사용합니다.
+- 로컬 비밀은 `.env` 또는 `instances/*.env`에만 두고 절대 커밋하지 마세요. 실제 Discord 토큰, 채널 ID, 세션 파일, 음성 샘플, 모델 가중치, 가상환경, 로그, 캐시 출력도 마찬가지입니다.
+- 생성물/런타임 산출물 대신 소스 파일을 수정합니다.
+- 예제는 공개 안전한 값으로 유지합니다. 로컬 경로, 사용자 ID, Discord ID, 토큰은 플레이스홀더로.
+## 검증
+코드 변경을 완료로 보고하기 전에 Node 테스트 스위트를 실행하세요:
+```bash
+npm test
+```
+## 모듈 맵
+자세한 내용은 [`AGENTS.md`](../../AGENTS.md)를 참고하세요. 핵심 모듈:
+- `main.mjs` — Discord/음성/에이전트 디스패처
+- `agent_routing.mjs` — 음성 기반 크로스 에이전트 라우팅
+- `plan_mode.mjs` — 음성 플랜 모드 (which_agent 슬롯)
+- `session_ontology.mjs` — 채널별 타입드 그래프 (cross-agent 핸드오프 컨텍스트)
+- `research_mode.mjs` — `"리서치 X"` 음성 명령 파이프라인
+## 관리되는 영역
+HarnessSync가 `AGENTS.md`에 `CLAUDE.md`의 규칙을 자동 동기화합니다. 그 블록은 손대지 마세요.

package/docs/i18n/AGENTS.ru.md ADDED

@@ -0,0 +1,34 @@
+# Руководство по репозиторию (русский)
+> Этот файл — русское резюме [`AGENTS.md`](../../AGENTS.md). Формальные правила — в оригинале на английском.
+VerbalCoding — голосовой мост Discord для кодинг-агентов. Рантайм — Node-реализация в `app-node/`, запускается через `run.sh` или CLI `vc`.
+## Разработка
+- В документации и примерах предпочитайте `vc ...`, а не `npm run vc -- ...`.
+- Локальные секреты — в `.env` или `instances/*.env`. Не коммитьте реальные Discord-токены, channel ID, session-файлы, голосовые сэмплы, веса моделей, venv, логи, кеши.
+- Правьте исходные файлы, а не сгенерированные артефакты.
+- Примеры — публично-безопасные: плейсхолдеры для локальных путей, user ID, Discord ID, токенов.
+## Проверка
+Перед тем как считать изменения готовыми, запустите:
+```bash
+npm test
+```
+## Карта модулей
+Подробности в [`AGENTS.md`](../../AGENTS.md). Ключевые модули:
+- `main.mjs` — диспетчер Discord / голос / агенты
+- `agent_routing.mjs` — голосовая маршрутизация между агентами
+- `plan_mode.mjs` — голосовой plan-mode (слот `which_agent`)
+- `session_ontology.mjs` — типизированный граф на канал (handoff)
+- `research_mode.mjs` — голосовая команда `"research X"`
+## Управляемый блок
+HarnessSync синхронизирует правила из `CLAUDE.md` в управляемый блок `AGENTS.md`. Не редактируйте этот блок вручную.

package/docs/i18n/AGENTS.zh.md ADDED

@@ -0,0 +1,34 @@
+# 仓库指南 (中文)
+> 本文是 [`AGENTS.md`](../../AGENTS.md) 的中文摘要。正式规则以英文原文为准。
+VerbalCoding 是面向编码代理的 Discord 语音桥。运行时位于 `app-node/`,通过 `run.sh` 或 `vc` CLI 启动。
+## 开发
+- 文档与示例优先使用 `vc ...` 形式,而不是 `npm run vc -- ...`。
+- 本地密钥放在 `.env` 或 `instances/*.env`,不要提交真实 Discord token、频道 ID、会话文件、语音样本、模型权重、虚拟环境、日志、缓存。
+- 修改源文件而非自动生成物。
+- 示例保持公开安全:本地路径、用户 ID、Discord ID、token 用占位符替代。
+## 验证
+报告完成前请运行 Node 测试:
+```bash
+npm test
+```
+## 模块布局
+详情见 [`AGENTS.md`](../../AGENTS.md)。核心模块:
+- `main.mjs` — Discord / 语音 / 代理调度器
+- `agent_routing.mjs` — 语音驱动的跨代理路由
+- `plan_mode.mjs` — 语音 plan 模式 (`which_agent` 槽)
+- `session_ontology.mjs` — 按频道的类型图 (用于 handoff)
+- `research_mode.mjs` — `"research X"` 语音命令流程
+## 托管区域
+HarnessSync 会把 `CLAUDE.md` 的规则同步进 `AGENTS.md` 的托管块,请勿手动修改该块。