verbalcoding 0.2.12 → 0.2.13

Files changed (169)
  1. package/.env.example +74 -4
  2. package/README.es.md +3 -1
  3. package/README.fr.md +3 -1
  4. package/README.ja.md +3 -1
  5. package/README.ko.md +4 -2
  6. package/README.md +4 -2
  7. package/README.ru.md +3 -1
  8. package/README.zh.md +3 -1
  9. package/app-node/agent_adapters.test.mjs +14 -0
  10. package/app-node/agent_routing.mjs +148 -0
  11. package/app-node/agent_routing.test.mjs +138 -0
  12. package/app-node/agent_turn.mjs +86 -0
  13. package/app-node/agent_turn.test.mjs +109 -0
  14. package/app-node/bridge_context.mjs +73 -0
  15. package/app-node/bridge_context.test.mjs +54 -0
  16. package/app-node/bridge_state.mjs +4 -0
  17. package/app-node/bridge_wireup.test.mjs +462 -0
  18. package/app-node/cli_install.test.mjs +31 -0
  19. package/app-node/cross_agent_routing.test.mjs +78 -0
  20. package/app-node/discord_command_router.mjs +204 -0
  21. package/app-node/discord_command_router.test.mjs +311 -0
  22. package/app-node/discord_voice_setup.mjs +251 -0
  23. package/app-node/discord_voice_setup.test.mjs +86 -0
  24. package/app-node/hermes_profiles.test.mjs +12 -1
  25. package/app-node/install_config.mjs +110 -3
  26. package/app-node/install_config.test.mjs +8 -0
  27. package/app-node/instance_doctor.test.mjs +9 -0
  28. package/app-node/instances.test.mjs +8 -1
  29. package/app-node/main.mjs +488 -1368
  30. package/app-node/mcp_tools.test.mjs +7 -0
  31. package/app-node/notification_handler.mjs +89 -0
  32. package/app-node/notification_handler.test.mjs +187 -0
  33. package/app-node/plan_dispatcher.mjs +215 -0
  34. package/app-node/plan_dispatcher.test.mjs +101 -0
  35. package/app-node/plan_mode.mjs +36 -7
  36. package/app-node/plan_mode.test.mjs +78 -0
  37. package/app-node/progress_handler.mjs +220 -0
  38. package/app-node/progress_handler.test.mjs +193 -0
  39. package/app-node/progress_speech.mjs +54 -32
  40. package/app-node/progress_speech.test.mjs +12 -3
  41. package/app-node/project_sessions.mjs +5 -2
  42. package/app-node/project_sessions.test.mjs +7 -0
  43. package/app-node/research_mode.mjs +282 -0
  44. package/app-node/research_mode.test.mjs +264 -0
  45. package/app-node/restart_notice.mjs +3 -0
  46. package/app-node/restart_notice.test.mjs +11 -0
  47. package/app-node/session_ontology.mjs +271 -0
  48. package/app-node/session_ontology.test.mjs +130 -0
  49. package/app-node/smart_progress.mjs +1 -1
  50. package/app-node/stream_sentencer.mjs +32 -2
  51. package/app-node/stream_sentencer.test.mjs +65 -0
  52. package/app-node/streaming_tts_queue.mjs +5 -1
  53. package/app-node/streaming_tts_queue.test.mjs +7 -1
  54. package/app-node/stt_whisper.mjs +24 -0
  55. package/app-node/stt_whisper.test.mjs +32 -0
  56. package/app-node/text_routing.mjs +4 -2
  57. package/app-node/tts_backends.mjs +537 -3
  58. package/app-node/tts_backends.test.mjs +454 -0
  59. package/app-node/tts_player.mjs +164 -0
  60. package/app-node/tts_player.test.mjs +202 -0
  61. package/app-node/tts_runtime.mjs +134 -0
  62. package/app-node/tts_runtime.test.mjs +89 -0
  63. package/app-node/tts_settings.mjs +150 -3
  64. package/app-node/tts_settings.test.mjs +204 -0
  65. package/app-node/tts_voice_config.mjs +136 -2
  66. package/app-node/tts_voice_config.test.mjs +94 -0
  67. package/app-node/utterance_router.mjs +216 -0
  68. package/app-node/utterance_router.test.mjs +236 -0
  69. package/app-node/voice_autojoin.mjs +37 -0
  70. package/app-node/voice_autojoin.test.mjs +59 -0
  71. package/app-node/voice_io.mjs +272 -0
  72. package/app-node/voice_io.test.mjs +102 -0
  73. package/app-node/voice_turn_runner.mjs +449 -0
  74. package/app-node/voice_turn_runner.test.mjs +289 -0
  75. package/docs/CONFIGURATION.md +12 -2
  76. package/docs/HARNESSES.md +58 -0
  77. package/docs/HARNESS_AIDER.md +50 -0
  78. package/docs/HARNESS_CLAUDE.md +56 -0
  79. package/docs/HARNESS_CODEX.md +56 -0
  80. package/docs/HARNESS_CURSOR.md +45 -0
  81. package/docs/HARNESS_GEMINI.md +45 -0
  82. package/docs/HARNESS_HERMES.md +57 -0
  83. package/docs/HARNESS_OPENCLAW.md +44 -0
  84. package/docs/HARNESS_OPENCODE.md +44 -0
  85. package/docs/README.md +1 -0
  86. package/docs/ROADMAP.md +20 -5
  87. package/docs/TTS_BACKENDS.md +227 -0
  88. package/docs/USAGE.md +22 -0
  89. package/docs/i18n/AGENTS.es.md +34 -0
  90. package/docs/i18n/AGENTS.fr.md +34 -0
  91. package/docs/i18n/AGENTS.ja.md +34 -0
  92. package/docs/i18n/AGENTS.ko.md +34 -0
  93. package/docs/i18n/AGENTS.ru.md +34 -0
  94. package/docs/i18n/AGENTS.zh.md +34 -0
  95. package/docs/i18n/HARNESSES.es.md +58 -0
  96. package/docs/i18n/HARNESSES.fr.md +58 -0
  97. package/docs/i18n/HARNESSES.ja.md +58 -0
  98. package/docs/i18n/HARNESSES.ko.md +58 -0
  99. package/docs/i18n/HARNESSES.ru.md +58 -0
  100. package/docs/i18n/HARNESSES.zh.md +58 -0
  101. package/docs/i18n/HARNESS_AIDER.es.md +48 -0
  102. package/docs/i18n/HARNESS_AIDER.fr.md +48 -0
  103. package/docs/i18n/HARNESS_AIDER.ja.md +50 -0
  104. package/docs/i18n/HARNESS_AIDER.ko.md +50 -0
  105. package/docs/i18n/HARNESS_AIDER.ru.md +48 -0
  106. package/docs/i18n/HARNESS_AIDER.zh.md +48 -0
  107. package/docs/i18n/HARNESS_CLAUDE.es.md +55 -0
  108. package/docs/i18n/HARNESS_CLAUDE.fr.md +55 -0
  109. package/docs/i18n/HARNESS_CLAUDE.ja.md +56 -0
  110. package/docs/i18n/HARNESS_CLAUDE.ko.md +56 -0
  111. package/docs/i18n/HARNESS_CLAUDE.ru.md +55 -0
  112. package/docs/i18n/HARNESS_CLAUDE.zh.md +56 -0
  113. package/docs/i18n/HARNESS_CODEX.es.md +55 -0
  114. package/docs/i18n/HARNESS_CODEX.fr.md +55 -0
  115. package/docs/i18n/HARNESS_CODEX.ja.md +56 -0
  116. package/docs/i18n/HARNESS_CODEX.ko.md +56 -0
  117. package/docs/i18n/HARNESS_CODEX.ru.md +55 -0
  118. package/docs/i18n/HARNESS_CODEX.zh.md +56 -0
  119. package/docs/i18n/HARNESS_CURSOR.es.md +42 -0
  120. package/docs/i18n/HARNESS_CURSOR.fr.md +42 -0
  121. package/docs/i18n/HARNESS_CURSOR.ja.md +45 -0
  122. package/docs/i18n/HARNESS_CURSOR.ko.md +45 -0
  123. package/docs/i18n/HARNESS_CURSOR.ru.md +42 -0
  124. package/docs/i18n/HARNESS_CURSOR.zh.md +42 -0
  125. package/docs/i18n/HARNESS_GEMINI.es.md +44 -0
  126. package/docs/i18n/HARNESS_GEMINI.fr.md +44 -0
  127. package/docs/i18n/HARNESS_GEMINI.ja.md +45 -0
  128. package/docs/i18n/HARNESS_GEMINI.ko.md +45 -0
  129. package/docs/i18n/HARNESS_GEMINI.ru.md +44 -0
  130. package/docs/i18n/HARNESS_GEMINI.zh.md +45 -0
  131. package/docs/i18n/HARNESS_HERMES.es.md +54 -0
  132. package/docs/i18n/HARNESS_HERMES.fr.md +54 -0
  133. package/docs/i18n/HARNESS_HERMES.ja.md +57 -0
  134. package/docs/i18n/HARNESS_HERMES.ko.md +57 -0
  135. package/docs/i18n/HARNESS_HERMES.ru.md +54 -0
  136. package/docs/i18n/HARNESS_HERMES.zh.md +57 -0
  137. package/docs/i18n/HARNESS_OPENCLAW.es.md +41 -0
  138. package/docs/i18n/HARNESS_OPENCLAW.fr.md +41 -0
  139. package/docs/i18n/HARNESS_OPENCLAW.ja.md +44 -0
  140. package/docs/i18n/HARNESS_OPENCLAW.ko.md +44 -0
  141. package/docs/i18n/HARNESS_OPENCLAW.ru.md +41 -0
  142. package/docs/i18n/HARNESS_OPENCLAW.zh.md +42 -0
  143. package/docs/i18n/HARNESS_OPENCODE.es.md +41 -0
  144. package/docs/i18n/HARNESS_OPENCODE.fr.md +41 -0
  145. package/docs/i18n/HARNESS_OPENCODE.ja.md +44 -0
  146. package/docs/i18n/HARNESS_OPENCODE.ko.md +44 -0
  147. package/docs/i18n/HARNESS_OPENCODE.ru.md +41 -0
  148. package/docs/i18n/HARNESS_OPENCODE.zh.md +44 -0
  149. package/docs/superpowers/plans/2026-05-14-cross-agent-voice-transfer.md +625 -0
  150. package/docs/superpowers/plans/2026-05-21-audio-overview-narrated-diffs.md +95 -0
  151. package/docs/superpowers/plans/2026-05-21-autoresearch-ontology.md +83 -0
  152. package/docs/superpowers/plans/2026-05-21-phase11-push-to-talk-wakeword-v2.md +77 -0
  153. package/docs/superpowers/plans/2026-05-21-phase12-multi-user-voice.md +147 -0
  154. package/docs/superpowers/plans/2026-05-21-phase14-verbalbench.md +136 -0
  155. package/docs/superpowers/plans/2026-05-21-phase15-phone-companion.md +72 -0
  156. package/integrations/fireredtts2/mlx_llm.py +183 -0
  157. package/integrations/fireredtts2/synth.py +156 -0
  158. package/integrations/fireredtts2/synth_mlx.py +196 -0
  159. package/integrations/mlxaudio/synth.py +74 -0
  160. package/integrations/neuttsair/synth.py +104 -0
  161. package/integrations/omnivoice/synth.py +110 -0
  162. package/package.json +6 -1
  163. package/scripts/cli.mjs +84 -0
  164. package/scripts/doctor.mjs +104 -4
  165. package/scripts/install.mjs +5 -1
  166. package/scripts/install_fireredtts2.sh +109 -0
  167. package/scripts/install_mlxaudio.sh +34 -0
  168. package/scripts/install_mossttsnano.sh +46 -0
  169. package/scripts/postinstall.mjs +34 -0
@@ -0,0 +1,57 @@
1
+ # Hermes Agent — Harness Notes
2
+
3
+ <p align="center">
4
+ <a href="../README.md">README</a> ·
5
+ <a href="HARNESSES.md">Harnesses</a> ·
6
+ <a href="USAGE.md">Usage</a> ·
7
+ <a href="CONFIGURATION.md">Configuration</a>
8
+ </p>
9
+
10
+ Hermes Agent is VerbalCoding's default backend — it is the one harness with a real session-resume contract, so chat across turns retains context cleanly. For positioning vs Hermes' built-in `/voice` slash command, see [HERMES_VOICE.md](./HERMES_VOICE.md).
11
+
12
+ ## Install
13
+
14
+ Follow the upstream Hermes Agent install guide: <https://hermes-agent.nousresearch.com>.
15
+
16
+ Verify the CLI works directly first:
17
+
18
+ ```bash
19
+ hermes chat -Q -q "hello"
20
+ ```
21
+
22
+ ## Configure VerbalCoding
23
+
24
+ ```bash
25
+ # .env
26
+ AGENT_BACKEND=hermes
27
+ # optional overrides
28
+ HERMES_COMMAND="hermes chat -Q -q" # default
29
+ HERMES_HOME=/Users/you/.hermes # per-instance Hermes home
30
+ HERMES_PROJECT_CONTEXT="Project session: ..."
31
+ HERMES_TASK_TIMEOUT_MS=0 # 0 = no limit
32
+ HERMES_CHAT_TIMEOUT_MS=45000
33
+ HERMES_WORKDIR=/Users/you/code/your-project
34
+ ```
35
+
36
+ The session file lives at `<repo>/.verbalcoding-session` by default (override with `HERMES_SESSION_FILE`).
37
+
38
+ ## Session resume
39
+
40
+ Hermes is the only built-in adapter with session resume. After each successful turn the adapter writes the new `session_id` to disk and prepends `--resume <id>` to the next call. `!session reset` (or `!reset-session`) clears that file.
41
+
42
+ If a turn aborts before Hermes emits `session_id:` on stderr, the adapter also reads the Hermes session JSON at `~/.hermes/sessions/session_<id>.json` to recover the last assistant message.
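The resume flow described in this section can be sketched roughly as below. This is an illustration under stated assumptions, not the adapter's actual code: `extractSessionId`, `withResume`, and `buildHermesArgs` are hypothetical names, the exact shape of the `session_id:` line on stderr is assumed, and so is the position of `--resume` relative to the other flags.

```javascript
// Hypothetical helpers; names and the stderr line format are assumptions.

// Pull a session id out of stderr text, assuming a line like "session_id: <id>".
function extractSessionId(stderrText) {
  const match = /^session_id:\s*(\S+)/m.exec(stderrText);
  return match ? match[1] : null;
}

// Prepend the resume flag when an earlier turn saved a session id.
function withResume(flags, sessionId) {
  return sessionId ? ["--resume", sessionId, ...flags] : [...flags];
}

// Assemble the argument list for one turn; the prompt goes last, after "-q".
function buildHermesArgs(prompt, sessionId) {
  return ["chat", ...withResume(["-Q", "-q"], sessionId), prompt];
}
```

Keeping the id extraction separate from the argument building means a turn that never reports an id simply leaves the next call without `--resume`.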
43
+
44
+ ## Verbose progress
45
+
46
+ In verbose mode the adapter drops Hermes' `-Q` quiet flag so stdout streams `┊ <emoji> <tool>` previews. These get summarized into one-line progress events (file reads, web search, terminal). Without verbose, only the final boxed answer plays.
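A minimal sketch of how such preview lines might be picked out of stdout follows. It assumes each preview is a single line of the form `┊ <emoji> <tool>` with optional trailing detail; `parsePreviewLine` is a hypothetical name, and the real summarizer may group or label events differently.

```javascript
// Hypothetical parser for one stdout line; returns null for non-preview lines.
function parsePreviewLine(line) {
  const trimmed = line.trim();
  if (!trimmed.startsWith("┊")) return null;
  // Drop the leading marker, then split into emoji, tool name, and the rest.
  const tokens = trimmed.slice(1).trim().split(/\s+/);
  if (tokens.length < 2) return null;
  const [emoji, tool, ...rest] = tokens;
  return { emoji, tool, detail: rest.join(" ") };
}

// Keep only the preview events from a chunk of streamed stdout.
function collectPreviewEvents(stdoutText) {
  return stdoutText
    .split("\n")
    .map(parsePreviewLine)
    .filter((event) => event !== null);
}
```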
47
+
48
+ ## Voice phrases to switch TO Hermes
49
+
50
+ - en: `"switch to Hermes"`, `"ask Hermes ..."`
51
+ - ko: `"헤르메스로 전환"`, `"헤르메스한테 물어봐"`
52
+
53
+ ## Gotchas
54
+
55
+ - The TTS prefix on cross-agent handoff uses the localized label: `"Hermes says: "` / `"헤르메스: "`.
56
+ - `HERMES_HOME` is the most common per-project isolation knob; per-instance `.env` typically sets `HERMES_HOME=/Users/you/.hermes/profiles/<project>`.
57
+ - If verbose progress is on and Hermes still finishes with an empty box (timed out), the adapter scrapes the session JSON for the final assistant text before giving up.
@@ -0,0 +1,44 @@
1
+ # OpenClaw — Harness Notes
2
+
3
+ <p align="center">
4
+ <a href="../README.md">README</a> ·
5
+ <a href="HARNESSES.md">Harnesses</a> ·
6
+ <a href="USAGE.md">Usage</a> ·
7
+ <a href="CONFIGURATION.md">Configuration</a>
8
+ </p>
9
+
10
+ OpenClaw is an open-source terminal coding agent. VerbalCoding drives it through `openclaw run`.
11
+
12
+ ## Install
13
+
14
+ Follow the upstream OpenClaw install guide. Confirm:
15
+
16
+ ```bash
17
+ openclaw run "hello"
18
+ ```
19
+
20
+ ## Configure VerbalCoding
21
+
22
+ ```bash
23
+ # .env
24
+ AGENT_BACKEND=openclaw
25
+ # optional
26
+ OPENCLAW_COMMAND="openclaw run" # default
27
+ AGENT_PROJECT_CONTEXT="..."
28
+ AGENT_WORKDIR=/Users/you/code/your-project
29
+ AGENT_CHAT_TIMEOUT_MS=45000
30
+ AGENT_TASK_TIMEOUT_MS=0
31
+ ```
32
+
33
+ ## Voice phrases to switch TO OpenClaw
34
+
35
+ - en: `"switch to OpenClaw"`, `"ask OpenClaw ..."`, `"switch to open claw"`
36
+ - ko: `"openclaw로 전환"`
37
+
38
+ The matcher accepts `openclaw` and `open claw`.
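For illustration, a matcher that accepts both spellings could look like the sketch below. The pattern and the name `mentionsOpenClaw` are assumptions made for this example; the actual router may tokenize utterances differently.

```javascript
// Hypothetical pattern: "openclaw" or "open claw", case-insensitive, as a whole word.
const OPENCLAW_PATTERN = /\bopen\s?claw\b/i;

// True when the utterance names OpenClaw in either spelling.
function mentionsOpenClaw(utterance) {
  return OPENCLAW_PATTERN.test(utterance);
}
```

Because the pattern requires the literal `claw`, an utterance that only mentions `claude` does not match.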
39
+
40
+ ## Gotchas
41
+
42
+ - **No session resume** in the default command. Add a resume flag via `OPENCLAW_COMMAND` if your build supports one.
43
+ - **Verbose progress.** Same as OpenCode — keyword-based labels unless `SMART_PROGRESS_API_KEY` is configured for the LLM summarizer.
44
+ - **Naming clash.** Both the parser alias `openclaw` and the user-facing label `OpenClaw` are distinct from `claude` / `claude code`; the strict-mode router won't conflate them.
@@ -0,0 +1,44 @@
1
+ # OpenCode — Harness Notes
2
+
3
+ <p align="center">
4
+ <a href="../README.md">README</a> ·
5
+ <a href="HARNESSES.md">Harnesses</a> ·
6
+ <a href="USAGE.md">Usage</a> ·
7
+ <a href="CONFIGURATION.md">Configuration</a>
8
+ </p>
9
+
10
+ OpenCode is an open-source terminal coding agent. VerbalCoding drives it through `opencode run`.
11
+
12
+ ## Install
13
+
14
+ Follow the upstream OpenCode install guide. Confirm:
15
+
16
+ ```bash
17
+ opencode run "hello"
18
+ ```
19
+
20
+ ## Configure VerbalCoding
21
+
22
+ ```bash
23
+ # .env
24
+ AGENT_BACKEND=opencode
25
+ # optional
26
+ OPENCODE_COMMAND="opencode run" # default
27
+ AGENT_PROJECT_CONTEXT="..."
28
+ AGENT_WORKDIR=/Users/you/code/your-project
29
+ AGENT_CHAT_TIMEOUT_MS=45000
30
+ AGENT_TASK_TIMEOUT_MS=0
31
+ ```
32
+
33
+ ## Voice phrases to switch TO OpenCode
34
+
35
+ - en: `"switch to OpenCode"`, `"ask OpenCode ..."`, `"switch to open code"`
36
+ - ko: `"opencode로 전환"`, `"오픈코드로 전환"`
37
+
38
+ The matcher accepts `opencode` and `open code`.
39
+
40
+ ## Gotchas
41
+
42
+ - **No session resume** in the default command. If your OpenCode build supports a resume flag, append it via `OPENCODE_COMMAND="opencode run --resume"` (the adapter passes the prompt as the final positional arg).
43
+ - **Model choice.** Append `--model` flags via `OPENCODE_COMMAND` if your OpenCode build expects them.
44
+ - **Verbose progress.** Whatever events OpenCode prints on stdout/stderr get keyword-matched (file reads, web search, terminal); without `SMART_PROGRESS_API_KEY` the bridge falls back to those raw labels.
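The note about the prompt being the final positional argument can be sketched as follows. This is a simplified illustration: `buildAgentArgv` is a hypothetical name, and splitting the command on whitespace is an assumption that ignores quoted arguments.

```javascript
// Hypothetical helper: turn a configured command string plus a prompt into argv.
function buildAgentArgv(command, prompt) {
  // Split the configured command on whitespace (quoted arguments are not handled).
  const tokens = command.trim().split(/\s+/).filter(Boolean);
  if (tokens.length === 0) throw new Error("empty agent command");
  // Extra flags from the command stay in place; the prompt is always appended last.
  return [...tokens, prompt];
}
```

Under this sketch, any flag appended to `OPENCODE_COMMAND` lands before the prompt, which is why flags are added to the command string rather than to the prompt itself.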
package/docs/README.md CHANGED
@@ -29,6 +29,7 @@ vc start
29
29
  | [Usage](USAGE.md) | CLI commands, Discord commands, run modes, voice changes, progress, and latency metrics. |
30
30
  | [Hermes Voice vs VerbalCoding](HERMES_VOICE.md) | What Hermes built-in Discord voice already does and what VerbalCoding adds. |
31
31
  | [Configuration](CONFIGURATION.md) | `.env`, agent backends, MCP server, TTS backends, and operational settings. |
32
+ | [TTS Backends](TTS_BACKENDS.md) | Optional local/cloud TTS backends, aliases, latency observations, and Mac mini caveats. |
32
33
  | [Troubleshooting](TROUBLESHOOTING.md) | Docker UDP, voice join failures, missing token/channel checks, and doctor behavior. |
33
34
  | [Multi-Instance](MULTI_INSTANCE.md) | One permanent Discord voice bot per project room with isolated Hermes profiles. |
34
35
  | [Release Notes](RELEASE.md) | Current capabilities, verification checklist, and pre-public-release gaps. |
package/docs/ROADMAP.md CHANGED
@@ -8,11 +8,11 @@ This roadmap covers five differentiation phases that separate VerbalCoding from
8
8
 
9
9
  | # | Phase | Status | Plan |
10
10
  |---|---|---|---|
11
- | 1 | Streaming end-to-end pipeline | designed | [phase1-streaming-pipeline.md](./superpowers/plans/2026-05-13-phase1-streaming-pipeline.md) |
12
- | 2 | Agent-agnostic adapter completion | partial designed | [phase2-agent-adapters.md](./superpowers/plans/2026-05-13-phase2-agent-adapters.md) |
13
- | 6 | Smart progress summarization | designed | [phase6-smart-progress.md](./superpowers/plans/2026-05-13-phase6-smart-progress.md) |
14
- | 7 | Voice plan mode | designed | [phase7-voice-plan-mode.md](./superpowers/plans/2026-05-13-phase7-voice-plan-mode.md) |
15
- | 10 | Push notification handoff | designed | [phase10-push-notifications.md](./superpowers/plans/2026-05-13-phase10-push-notifications.md) |
11
+ | 1 | Streaming end-to-end pipeline | shipped | [phase1-streaming-pipeline.md](./superpowers/plans/2026-05-13-phase1-streaming-pipeline.md) |
12
+ | 2 | Agent-agnostic adapter completion | shipped (incl. cross-agent voice routing) | [phase2-agent-adapters.md](./superpowers/plans/2026-05-13-phase2-agent-adapters.md), [cross-agent-voice-transfer.md](./superpowers/plans/2026-05-14-cross-agent-voice-transfer.md) |
13
+ | 6 | Smart progress summarization | shipped | [phase6-smart-progress.md](./superpowers/plans/2026-05-13-phase6-smart-progress.md) |
14
+ | 7 | Voice plan mode | shipped (incl. `which_agent` slot) | [phase7-voice-plan-mode.md](./superpowers/plans/2026-05-13-phase7-voice-plan-mode.md) |
15
+ | 10 | Push notification handoff | shipped | [phase10-push-notifications.md](./superpowers/plans/2026-05-13-phase10-push-notifications.md) |
16
16
 
17
17
  ## Sequencing rationale
18
18
 
@@ -36,3 +36,18 @@ This roadmap covers five differentiation phases that separate VerbalCoding from
36
36
  - PSTN bridge / actual phone calls (Phase 4 of the broader pitch; deferred).
37
37
  - Local-first one-flag preset (Phase 5; deferred but trivial follow-up).
38
38
  - Multi-agent in one VC with distinct voices (Phase 3; needs Phase 2 to land first).
39
+
40
+ ## What's next (2026 H2 candidates)
41
+
42
+ The differentiation push above shipped — the foundation is in. Candidate next phases, not yet planned:
43
+
44
+ | # | Candidate | Why | Status |
45
+ |---|---|---|---|
46
+ | 11 | Push-to-talk and wake-word v2 | Reduce false barge-ins in shared rooms; pair with hardware push-to-talk via a Discord overlay or a key-binding companion. | candidate |
47
+ | 12 | Multi-user voice in one VC | Each speaker resolves to a distinct routing/session; per-speaker plan-mode and decision answers. Builds on the per-channel routing state. | candidate |
48
+ | 13 | Output voice cloning per agent | Distinct voices per backend (e.g. Codex gets a different TTS voice than Claude Code); piggybacks on the existing voice-clone capture flow. | candidate |
49
+ | 14 | Latency benchmarking + regression gate | Codify the latency_metrics output into a benchmark harness + CI threshold so any regression in STT/agent/TTS stages is caught. | candidate |
50
+ | 15 | Phone-app companion (deferred) | The push-handoff notification deeplinks back to Discord today; a thin phone app or PWA could replay a redacted transcript on demand. | candidate |
51
+ | 16 | Voice-clone reference auto-detect | Detect that an OpenVoice/FireRedTTS reference sample is missing and propose `!voice-clone capture` proactively when the user selects a clone-only backend. | candidate |
52
+
53
+ These aren't sequenced yet. Phases 11/12/14 are the highest-leverage if the goal is making the bridge feel solid in shared rooms; 13/16 are quality-of-life on top of the existing voice stack.
@@ -0,0 +1,227 @@
1
+ # TTS backends and latency notes
2
+
3
+ This document captures the current VerbalCoding TTS backends, the live-selection rules, and the latency caveats observed while testing on the current Mac mini.
4
+
5
+ ## Current test machine
6
+
7
+ Observed host for these notes:
8
+
9
+ - Machine: Mac mini, Apple M4
10
+ - Memory: 16 GB
11
+ - OS: macOS 26.3 / Darwin 25.3.0 arm64
12
+ - Workload caveat: several measurements were taken while other heavy local processes or model-training jobs could be active. Treat local neural TTS timings as operational observations, not clean benchmarks.
13
+
14
+ ## Operational rule
15
+
16
+ Edge TTS is the default safe live backend. Local neural backends are optional and should normally fall back to Edge for progress prompts unless explicitly enabled with each backend's `*_PROGRESS=1` setting.
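The progress-prompt rule above can be expressed as a small decision function. This is a sketch, assuming the per-backend flag is always the upper-cased backend name followed by `_PROGRESS` (which matches the variables shown elsewhere in this document); `progressBackend` is a hypothetical name.

```javascript
// Hypothetical helper: pick the backend used for progress prompts.
function progressBackend(backend, env) {
  if (backend === "edge") return "edge";
  // e.g. "openvoice" -> "OPENVOICE_PROGRESS"
  const flag = `${backend.toUpperCase()}_PROGRESS`;
  // Only an explicit "1" opts the local backend in; anything else falls back to Edge.
  return env[flag] === "1" ? backend : "edge";
}
```

In a Node process the second argument would typically be `process.env`.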
17
+
18
+ When a user explicitly asks to switch to a specific backend, update both:
19
+
20
+ ```bash
21
+ TTS_BACKEND=<backend>
22
+ TTS_VOICE_TYPE=<voice-type>
23
+ ```
24
+
25
+ and `config/tts-voices.json`:
26
+
27
+ ```json
28
+ {
29
+ "currentBackend": "<backend>",
30
+ "currentVoiceType": "<voice-type>"
31
+ }
32
+ ```
33
+
34
+ The runtime re-reads `config/tts-voices.json`, so a change made only in `.env` can be overridden by the values stored there.
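One way to keep the two files in step is to update both from a single helper. The sketch below works on file contents as strings so that the reading and writing can be done however the caller prefers; `setEnvVar` and `switchBackend` are hypothetical names, not part of VerbalCoding.

```javascript
// Hypothetical helper: set KEY=value in .env text, replacing an existing line or appending.
function setEnvVar(envText, key, value) {
  const line = `${key}=${value}`;
  const pattern = new RegExp(`^${key}=.*$`, "m");
  if (pattern.test(envText)) return envText.replace(pattern, () => line);
  const separator = envText === "" || envText.endsWith("\n") ? "" : "\n";
  return `${envText}${separator}${line}\n`;
}

// Return updated contents for both .env and config/tts-voices.json.
function switchBackend(envText, voiceConfigText, backend, voiceType) {
  let env = setEnvVar(envText, "TTS_BACKEND", backend);
  env = setEnvVar(env, "TTS_VOICE_TYPE", voiceType);
  // Preserve any other keys already present in the voice config.
  const config = {
    ...JSON.parse(voiceConfigText),
    currentBackend: backend,
    currentVoiceType: voiceType,
  };
  return { env, voiceConfig: `${JSON.stringify(config, null, 2)}\n` };
}
```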
35
+
36
+ ### Fallback notice
37
+
38
+ When a non-Edge backend fails to synthesize (model missing, runtime crash, timeout, install error), the bridge silently re-routes that utterance through Edge so the user still hears a response. The first time this happens for each backend in a session, VerbalCoding posts a one-shot warning to the active Discord text channel and speaks the same message ("`<backend>` synthesis failed; using Edge for the rest of this session." / "`<backend>` 음성 생성에 실패해서 이번 세션은 Edge로 진행할게."). Subsequent failures for the same backend stay silent.
39
+
40
+ If you see the warning, check `vc doctor` and the backend's venv/model install — the bridge will keep using Edge until the next `vc start`.
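The once-per-backend warning described in this subsection amounts to remembering which backends have already failed during the session. A minimal sketch, assuming a `notify` callback that both posts and speaks the message (`createFallbackNotifier` is a hypothetical name):

```javascript
// Hypothetical factory: warn once per backend, stay silent on repeat failures.
function createFallbackNotifier(notify) {
  const warned = new Set(); // backends already reported in this session
  return function onSynthFailure(backend) {
    if (backend === "edge" || warned.has(backend)) return false;
    warned.add(backend);
    notify(`${backend} synthesis failed; using Edge for the rest of this session.`);
    return true; // a warning was emitted for this failure
  };
}
```

Because the set lives in memory, it resets whenever the process restarts, which is consistent with the bridge staying on Edge until the next `vc start`.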
41
+
42
+ ## Supported backends
43
+
44
+ | Backend | Purpose | Default path / command | Live-call suitability | Notes |
45
+ |---|---|---|---|---|
46
+ | `edge` | Free cloud TTS baseline | `edge-tts` | Best current default | Korean and English voices, fast enough for phone-call mode, progress cache works well. |
47
+ | `openvoice` | Reference-sample voice cloning | `integrations/openvoice/synth.py` | Experimental | Requires permitted reference audio. Progress falls back to Edge unless `OPENVOICE_PROGRESS=1`. |
48
+ | `speechswift` | Apple Silicon local CosyVoice / Qwen3 wrapper | `audio speak ...` | Experimental | CosyVoice is usable for demos but not as responsive as Edge; Qwen3 path is much slower. |
49
+ | `supertonic` | Local Supertonic CLI wrapper | `supertonic tts ...` | Experimental | Supports voice IDs such as `M1`; falls back to Edge on failure. |
50
+ | `omnivoice` | OmniVoice local reference/design voice | `.venv-omnivoice/bin/python integrations/omnivoice/synth.py` | Experimental | Startup/model load can feel hung. Keep Edge for live mode unless explicitly testing. |
51
+ | `qwen3tts` | Qwen3 TTS via `audio` CLI | `audio speak --engine qwen3 ...` | Slow experimental | Correct backend name is `qwen3tts` / alias `qwen3`; do not use old `q13` aliases. |
52
+ | `mlxaudio` | MLX Audio Qwen3 wrapper | `.venv-mlxaudio/bin/python integrations/mlxaudio/synth.py` | Experimental | Uses MLX Qwen3 model defaults; validate actual audible output, not only file existence. |
53
+ | `neuttsair` | NeuTTS-Air English reference cloning | `.venv-neuttsair/bin/python integrations/neuttsair/synth.py` | Too slow for current live use | English-only in practice. Q4 GGUF lowers latency but still felt unusably slow under contention. |
54
+ | `fireredtts2` | FireRedTTS-2 prompt-reference backend | `./.local/bin/fireredtts2` | Slow experimental | Can stall restart/final TTS long enough to feel broken. Honor explicit user selection, but report slowness instead of silently reverting. |
55
+ | FireRedTTS-2 MLX helper | Apple Silicon FireRed LLM-port experiment | `integrations/fireredtts2/synth_mlx.py` | Not wired as canonical backend yet | Ports the FireRed LLM token generator to MLX/Metal while keeping RedCodec in Torch; intended to avoid upstream Torch Qwen hangs/slowness. |
56
+ | `mossttsnano` | OpenMOSS / MOSS-TTS-Nano PyTorch backend | `.venv-mossttsnano/bin/python vendor/MOSS-TTS-Nano/infer.py` | Very slow experimental | On macOS use Python 3.11 venv and `--disable-wetext-processing`. |
57
+ | `mossttsnano_mlx` | MOSS-TTS-Nano hybrid MLX port | `.venv-mossttsnano/bin/python integrations/mossttsnano_mlx/synth.py` | Active experiment, not live default | Native MLX generator, KV cache, and persistent JSON-line worker were added. Still verify audibility and tokenizer/model parity. |
58
+
59
+ ## Backend aliases
60
+
61
+ Accepted aliases normalize to canonical backend names:
62
+
63
+ | Alias examples | Canonical backend |
64
+ |---|---|
65
+ | `qwen3`, `qwen3-tts`, `qtts` | `qwen3tts` |
66
+ | `mlx`, `mlx-audio`, `qwen3-mlx` | `mlxaudio` |
67
+ | `neutts`, `neutts-air`, `neu tts air` | `neuttsair` |
68
+ | `firered`, `fireredtts`, `firered-tts-2` | `fireredtts2` |
69
+ | `moss`, `moss-tts`, `mossnano`, `openmoss` | `mossttsnano` |
70
+ | `moss-mlx`, `mossttsnano-mlx`, `openmoss-mlx` | `mossttsnano_mlx` |
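The alias table can be read as a simple lookup. The sketch below copies the aliases listed above into a map; `normalizeBackend` is a hypothetical name, and passing unknown names through unchanged is an assumption about how unrecognized input would be handled.

```javascript
// Alias -> canonical backend, taken from the table above.
const BACKEND_ALIASES = new Map([
  ["qwen3", "qwen3tts"], ["qwen3-tts", "qwen3tts"], ["qtts", "qwen3tts"],
  ["mlx", "mlxaudio"], ["mlx-audio", "mlxaudio"], ["qwen3-mlx", "mlxaudio"],
  ["neutts", "neuttsair"], ["neutts-air", "neuttsair"], ["neu tts air", "neuttsair"],
  ["firered", "fireredtts2"], ["fireredtts", "fireredtts2"], ["firered-tts-2", "fireredtts2"],
  ["moss", "mossttsnano"], ["moss-tts", "mossttsnano"], ["mossnano", "mossttsnano"],
  ["openmoss", "mossttsnano"],
  ["moss-mlx", "mossttsnano_mlx"], ["mossttsnano-mlx", "mossttsnano_mlx"],
  ["openmoss-mlx", "mossttsnano_mlx"],
]);

// Hypothetical normalizer: case-insensitive, whitespace-trimmed alias lookup.
function normalizeBackend(name) {
  const key = name.trim().toLowerCase();
  return BACKEND_ALIASES.get(key) ?? key;
}
```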
71
+
72
+ ## Observed latency
73
+
74
+ ### End-to-end voice loop log
75
+
76
+ From `.logs/latency.jsonl`, 160 successful voice turns were available. These measure the whole Discord voice loop, not just TTS:
77
+
78
+ | Stage | Median | P90 | Min | Max |
79
+ |---|---:|---:|---:|---:|
80
+ | STT | 3.81 s | 4.60 s | 0.75 s | 23.70 s |
81
+ | Agent call | 16.90 s | 209.74 s | 5.58 s | 825.90 s |
82
+ | TTS synth | 3.98 s | 12.77 s | 0.72 s | 760.73 s |
83
+ | TTS playback | 19.50 s | 47.16 s | 0.99 s | 90.36 s |
84
+ | TTS total | 23.14 s | 62.28 s | 1.90 s | 782.89 s |
85
+ | Voice capture | 11.66 s | 30.02 s | 3.20 s | 109.10 s |
86
+ | Utterance idle wait | 2.60 s | 4.50 s | 2.60 s | 4.54 s |
87
+ | Total turn | 69.06 s | 289.56 s | 20.99 s | 905.24 s |
88
+
89
+ Interpretation:
90
+
91
+ - Long perceived latency is often not only TTS. Agent work and spoken playback length dominate many turns.
92
+ - The high TTS-synth maximum (760.73 s) indicates that local or experimental TTS can stall badly under load or on fallback paths.
93
+ - Playback time is real audio duration, so long answers sound slow even if synthesis is fast.
94
+ - The idle wait is intentionally a few seconds to avoid cutting off Korean phone-call utterances.
95
+
96
+ ### Local neural TTS observations
97
+
98
+ | Backend / mode | Observed behavior on this Mac mini | Practical conclusion |
99
+ |---|---|---|
100
+ | Edge TTS | Usually low seconds for chunks; reliable enough for current live mode. | Keep as default/fallback. |
101
+ | SpeechSwift CosyVoice CLI | About 6.9 s wall time for a 1.68 s Korean sample after warm-up. | Demo-capable, but sluggish for conversation. |
102
+ | SpeechSwift audio-server | Warm short Korean requests varied around 4.5-7.7 s and sometimes hung. | Not safe as the always-on live backend yet. |
103
+ | SpeechSwift/Qwen3 | About 62.5 s wall time, first chunk around 47.6 s in prior testing. | Too slow for live phone-call mode. |
104
+ | NeuTTS Air | Produced valid WAVs, but felt unusably slow while the machine was under unrelated GPU/model load. | English-only experiment; use Edge for live answers. |
105
+ | FireRedTTS-2 | Can be slow enough that restart/final TTS appears stalled. Timeout is 180 s by default. | Useful to test, but report slowness clearly. |
106
+ | FireRedTTS-2 MLX helper | Added as an Apple Silicon experiment that moves the LLM token generator to MLX/Metal and keeps RedCodec encode/decode in Torch. | Not a production backend yet; verify dependencies, imports, generated frames, and decoded volume before wiring it to `TTS_BACKEND`. |
107
+ | MOSS-TTS-Nano PyTorch | Works as an OpenMOSS path but is very slow on macOS. | Keep as correctness baseline, not live default. |
108
+ | MOSS-TTS-Nano MLX | Added native generator, sampling fixes, KV cache, and persistent worker; can reduce repeated startup overhead. | Still experimental; verify audible volume and parity before live use. |
109
+
110
+ ## MOSS-TTS-Nano MLX status
111
+
112
+ Recent implementation work added:
113
+
114
+ - `integrations/mossttsnano_mlx/convert.py` for conversion experiments.
115
+ - `integrations/mossttsnano_mlx/gpt2_mlx.py` for a native MLX GPT2-like generator.
116
+ - `integrations/mossttsnano_mlx/synth.py` for the hybrid synthesis path.
117
+ - `integrations/mossttsnano_mlx/worker.py` for a persistent JSON-line worker.
118
+ - `MOSSTTSNANO_MLX_WORKER=1` to keep the worker hot between requests.
119
+ - KV cache and sampling-semantics fixes in the MLX generator.
120
+
121
+ Known cautions:
122
+
123
+ - A generated WAV is not enough. Check audibility with playback or `ffmpeg volumedetect`.
124
+ - Near-silent or strange audio usually means model/tokenizer/audio-code parity is still wrong.
125
+ - Keep the PyTorch MOSS path as a reference until MLX parity is proven.
126
+
127
+ ## Configuration examples
128
+
129
+ ### Safe live default
130
+
131
+ ```bash
132
+ TTS_BACKEND=edge
133
+ TTS_VOICE_TYPE=korean_male
134
+ TTS_VOICE=ko-KR-InJoonNeural
135
+ TTS_RATE=+10%
136
+ ```
137
+
138
+ ### Qwen3 TTS preset
139
+
140
+ ```bash
141
+ TTS_BACKEND=qwen3tts
142
+ TTS_VOICE_TYPE=korean_preset
143
+ QWEN3TTS_COMMAND=audio
144
+ QWEN3TTS_MODE=custom
145
+ QWEN3TTS_MODEL=customVoice
146
+ QWEN3TTS_LANGUAGE=korean
147
+ QWEN3TTS_SPEAKER=sohee
148
+ QWEN3TTS_PROGRESS=0
149
+ ```
150
+
151
+ ### NeuTTS Air English experiment
152
+
153
+ ```bash
154
+ TTS_BACKEND=neuttsair
155
+ TTS_VOICE_TYPE=cloned_reference
156
+ VOICE_LANGUAGE=en
157
+ STT_LANGUAGE=en
158
+ WHISPER_CPP_LANGUAGE=en
159
+ NEUTTSAIR_PYTHON=./.venv-neuttsair/bin/python
160
+ NEUTTSAIR_SCRIPT=integrations/neuttsair/synth.py
161
+ NEUTTSAIR_BACKBONE_REPO=neuphonic/neutts-air-q4-gguf
162
+ NEUTTSAIR_CODEC_REPO=neuphonic/neucodec
163
+ NEUTTSAIR_PROGRESS=0
164
+ ```
165
+
166
+ ### FireRedTTS-2 experiment
167
+
168
+ ```bash
169
+ TTS_BACKEND=fireredtts2
170
+ TTS_VOICE_TYPE=prompt_reference
171
+ FIREREDTTS2_COMMAND=./.local/bin/fireredtts2
172
+ FIREREDTTS2_PRETRAINED_DIR=./pretrained_models/FireRedTTS2
173
+ FIREREDTTS2_PROMPT_AUDIO=./voice-samples/user-reference.wav
174
+ FIREREDTTS2_PROGRESS=0
175
+ ```
176
+
177
+ ### MOSS-TTS-Nano PyTorch experiment
178
+
179
+ ```bash
180
+ TTS_BACKEND=mossttsnano
181
+ TTS_VOICE_TYPE=prompt_reference
182
+ MOSSTTSNANO_COMMAND=./.venv-mossttsnano/bin/python
183
+ MOSSTTSNANO_SCRIPT=vendor/MOSS-TTS-Nano/infer.py
184
+ MOSSTTSNANO_CHECKPOINT=OpenMOSS-Team/MOSS-TTS-Nano
185
+ MOSSTTSNANO_PROMPT_AUDIO=./voice-samples/user-reference.wav
186
+ MOSSTTSNANO_PROGRESS=0
187
+ ```
188
+
189
+ ### MOSS-TTS-Nano MLX worker experiment
190
+
191
+ ```bash
192
+ TTS_BACKEND=mossttsnano_mlx
193
+ TTS_VOICE_TYPE=prompt_reference
194
+ MOSSTTSNANO_MLX_PYTHON=./.venv-mossttsnano/bin/python
195
+ MOSSTTSNANO_MLX_SCRIPT=integrations/mossttsnano_mlx/synth.py
196
+ MOSSTTSNANO_MLX_WORKER=1
197
+ MOSSTTSNANO_MLX_WORKER_SCRIPT=integrations/mossttsnano_mlx/worker.py
198
+ MOSSTTSNANO_TORCH_DEVICE=cpu
199
+ MOSSTTSNANO_TORCH_DTYPE=float32
200
+ MOSSTTSNANO_PROMPT_AUDIO=./voice-samples/user-reference.wav
201
+ MOSSTTSNANO_MLX_PROGRESS=0
202
+ ```
203
+
204
+ ## How to benchmark safely
205
+
206
+ Use a quiet machine and a short fixed text, and measure synthesis separately from playback:
207
+
208
+ ```bash
209
+ vc doctor
210
+ node --test app-node/tts_backends.test.mjs app-node/tts_settings.test.mjs app-node/tts_voice_config.test.mjs
211
+ ```
212
+
213
+ For live logs, compare these fields in `.logs/latency.jsonl`:
214
+
215
+ - `stt_ms`: speech-to-text time.
216
+ - `agent_ms`: CLI agent time.
217
+ - `tts_synth_ms`: time to synthesize audio files.
218
+ - `tts_play_ms`: time spent playing generated audio.
219
+ - `total_ms`: full turn time.
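To compare these fields across turns, a median and P90 can be computed per field. The sketch below takes the JSONL text as a string and uses the nearest-rank percentile method, which is an assumption: this document does not say how its own median and P90 figures were calculated. `percentile` and `summarizeField` are hypothetical names.

```javascript
// Nearest-rank percentile over an ascending array (assumed method).
function percentile(sortedValues, p) {
  if (sortedValues.length === 0) return null;
  const rank = Math.ceil((p / 100) * sortedValues.length);
  return sortedValues[Math.max(rank, 1) - 1];
}

// Summarize one numeric field (e.g. "tts_synth_ms") from JSONL text.
function summarizeField(jsonlText, field) {
  const values = jsonlText
    .split("\n")
    .filter((line) => line.trim() !== "")      // skip blank lines
    .map((line) => JSON.parse(line)[field])
    .filter((value) => typeof value === "number") // skip turns missing the field
    .sort((a, b) => a - b);
  return {
    count: values.length,
    median: percentile(values, 50),
    p90: percentile(values, 90),
  };
}
```

Running the same summary before and after a backend change gives a like-for-like comparison, provided the machine load is similar.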
220
+
221
+ When testing local neural backends, also verify:
222
+
223
+ ```bash
224
+ ffmpeg -i output.wav -af volumedetect -f null -
225
+ ```
226
+
227
+ A non-empty file can still be inaudible or near-silent.
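The audibility check can be automated by reading the `mean_volume` and `max_volume` values that the `volumedetect` filter reports. The sketch below parses already-captured ffmpeg output text; `parseVolumedetect` and `looksAudible` are hypothetical names, and the -50 dB threshold is an arbitrary assumption to tune by ear.

```javascript
// Extract mean/max volume (in dB) from captured ffmpeg volumedetect output.
function parseVolumedetect(outputText) {
  const read = (name) => {
    const match = new RegExp(`${name}:\\s*(-?\\d+(?:\\.\\d+)?) dB`).exec(outputText);
    return match ? Number(match[1]) : null;
  };
  return { meanDb: read("mean_volume"), maxDb: read("max_volume") };
}

// Treat a file as audible only if a mean volume was found and it exceeds the threshold.
function looksAudible(stats, minMeanDb = -50) {
  return stats.meanDb !== null && stats.meanDb > minMeanDb;
}
```

A missing or non-numeric `mean_volume` is treated as inaudible, so a failed parse never passes the check by accident.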
package/docs/USAGE.md CHANGED
@@ -107,6 +107,28 @@ Then use `vc bot invite CLIENT_ID` to generate the VerbalCoding-specific invite
107
107
 
108
108
  Voice equivalents such as “외부 모드”, “보수 모드”, “실내”, “기본 감도”, and clear stop phrases like “잠깐”, “멈춰”, “그만” are handled by the bridge. You can also say “상세 진행 켜” / “상세 진행 꺼” to toggle verbose progress by voice.
109
109
 
110
+ ## Cross-agent voice routing
111
+
112
+ VerbalCoding can route a single turn (or the rest of the session) to a different installed CLI agent without restarting.
113
+
114
+ | Voice phrase (en) | Voice phrase (ko) | Behavior |
115
+ |---|---|---|
116
+ | `ask Codex what it thinks` | `코덱스한테 물어봐` | Single-turn route to Codex; next utterance returns to the default. |
117
+ | `switch to Aider` | `aider로 전환` | Sticky route — every following utterance goes to Aider. |
118
+ | `back to default` | `기본으로 돌아가` | Restore the default agent (`AGENT_BACKEND` / `vc setup` selection). |
119
+ | `let Claude finish this` | — | Treated as a sticky route to Claude Code. |
120
+
121
+ Recognized aliases: `hermes`, `claude` / `claude code`, `codex` / `코덱스`, `gemini` / `gemini cli` / `제미나이`, `opencode`, `openclaw`, `aider` / `에이더`, `cursor` / `cursor cli`.
122
+
123
+ Additional behaviors:
124
+
125
+ - **Missing-binary fallback** — if the requested backend's binary is not on `PATH` (resolved against the active project session's workdir when applicable), the bridge asks "Want me to use the default agent instead?" Answer "yes" / "예" to retry on the default; "no" / "아니오" to cancel.
126
+ - **TTS prefix on backend change** — when the active backend changes between turns, the spoken answer is prefixed (`Codex says: …` / `코덱스: …`). No prefix on stable backends.
127
+ - **Cross-agent context handoff** — the routed agent receives a prompt block containing the prior agent label, recent voice utterances (last 4), and the most recently resolved plan decisions, so it doesn't restart cold.
128
+ - **Plan-mode `which_agent` slot** — plans can include a `which_agent` decision listing CLI options (e.g. `codex, aider, claude, gemini, opencode, openclaw, cursor, hermes`); the user's voice answer selects which agent executes that plan.
129
+ - **Per-channel state** — routing is scoped per Discord channel; switching agents in one project room does not affect others.
130
+ - **Sticky survives interrupts** — barge-in or aborted turns keep a sticky route intact; only single-turn routes are cleared.
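The single-turn, sticky, per-channel, and interrupt rules described above can be pictured as a small state holder. This is a sketch under stated assumptions: the `RoutingState` class, its method names, and the example agent and channel ids are hypothetical and do not reflect the actual `agent_routing.mjs` implementation.

```javascript
// Hypothetical per-channel routing state; names are illustrative only.
class RoutingState {
  constructor(defaultAgent) {
    this.defaultAgent = defaultAgent;
    this.channels = new Map(); // channelId -> { sticky, oneShot }
  }

  _entry(channelId) {
    if (!this.channels.has(channelId)) {
      this.channels.set(channelId, { sticky: null, oneShot: null });
    }
    return this.channels.get(channelId);
  }

  // "ask Codex ...": applies to the next turn only.
  routeOnce(channelId, agent) {
    this._entry(channelId).oneShot = agent;
  }

  // "switch to Aider": applies to every following turn in this channel.
  routeSticky(channelId, agent) {
    const entry = this._entry(channelId);
    entry.sticky = agent;
    entry.oneShot = null;
  }

  // "back to default": drop all routing state for this channel.
  resetToDefault(channelId) {
    this.channels.delete(channelId);
  }

  // Barge-in or aborted turn: clear only the single-turn route.
  onInterrupt(channelId) {
    this._entry(channelId).oneShot = null;
  }

  // Pick the agent for the next turn and consume any single-turn route.
  agentForTurn(channelId) {
    const entry = this._entry(channelId);
    const agent = entry.oneShot ?? entry.sticky ?? this.defaultAgent;
    entry.oneShot = null;
    return agent;
  }
}
```

For example, after `routeSticky("room-a", "aider")`, an interrupted `routeOnce("room-a", "codex")` falls back to `aider` on the next turn, while `"room-b"` keeps using the default agent throughout.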
131
+
110
132
  ## Changing the Voice
111
133
 
112
134
  `vc language ko|en|auto` changes STT language, progress language, and the matching default TTS voice together. If you only want to change the speaker/voice while the bridge is running, say it in Discord voice:
@@ -0,0 +1,34 @@
1
+ # Guía del repositorio (español)
2
+
3
+ > Este fichero es un resumen en español de [`AGENTS.md`](../../AGENTS.md). Las reglas formales viven en el inglés original.
4
+
5
+ VerbalCoding es un puente de voz Discord para agentes de codificación. El runtime es la implementación Node en `app-node/`, lanzada vía `run.sh` o el CLI `vc`.
6
+
7
+ ## Desarrollo
8
+
9
+ - En docs y ejemplos prefiere `vc ...` sobre `npm run vc -- ...`.
10
+ - Los secretos locales viven en `.env` o `instances/*.env`; nunca hagas commit de tokens Discord reales, IDs de canal, ficheros de sesión, muestras de voz, pesos de modelo, virtualenvs, logs ni cachés.
11
+ - Edita ficheros fuente, no artefactos generados.
12
+ - Ejemplos públicos: usa placeholders para rutas locales, IDs de usuario, IDs Discord y tokens.
13
+
14
+ ## Verificación
15
+
16
+ Antes de marcar un cambio como completo, ejecuta:
17
+
18
+ ```bash
19
+ npm test
20
+ ```
21
+
22
+ ## Layout de módulos
23
+
24
+ Detalle en [`AGENTS.md`](../../AGENTS.md). Módulos clave:
25
+
26
+ - `main.mjs` — dispatcher Discord / voz / agente
27
+ - `agent_routing.mjs` — enrutamiento entre agentes por voz
28
+ - `plan_mode.mjs` — modo plan por voz (slot `which_agent`)
29
+ - `session_ontology.mjs` — grafo tipado por canal (handoff)
30
+ - `research_mode.mjs` — comando `"research X"`
31
+
32
+ ## Bloque gestionado
33
+
34
+ HarnessSync sincroniza las reglas de `CLAUDE.md` dentro de `AGENTS.md`. No edites manualmente ese bloque.
@@ -0,0 +1,34 @@
1
+ # Guide du dépôt (français)
2
+
3
+ > Ce fichier est un résumé français de [`AGENTS.md`](../../AGENTS.md). Les règles formelles restent dans l'original anglais.
4
+
5
+ VerbalCoding est un pont vocal Discord pour les agents de codage. Le runtime est l'implémentation Node sous `app-node/`, lancée via `run.sh` ou le CLI `vc`.
6
+
7
+ ## Développement
8
+
9
+ - Dans les docs et exemples, préférez `vc ...` à `npm run vc -- ...`.
10
+ - Les secrets locaux vivent dans `.env` ou `instances/*.env`; ne commitez jamais de vrais tokens Discord, IDs de salon, fichiers de session, échantillons vocaux, poids de modèle, virtualenvs, logs ni caches.
11
+ - Modifiez les fichiers source, pas les artefacts générés.
12
+ - Exemples publics-safe : placeholders pour chemins locaux, IDs utilisateur, IDs Discord, tokens.
13
+
14
+ ## Vérification
15
+
16
+ Avant de signaler un changement comme terminé, exécutez :
17
+
18
+ ```bash
19
+ npm test
20
+ ```
21
+
22
+ ## Cartographie des modules
23
+
24
+ Détail dans [`AGENTS.md`](../../AGENTS.md). Modules clés :
25
+
26
+ - `main.mjs` — dispatcher Discord / voix / agent
27
+ - `agent_routing.mjs` — routage inter-agent par voix
28
+ - `plan_mode.mjs` — mode plan vocal (slot `which_agent`)
29
+ - `session_ontology.mjs` — graphe typé par salon (handoff)
30
+ - `research_mode.mjs` — commande `"research X"`
31
+
32
+ ## Bloc géré
33
+
34
+ HarnessSync synchronise les règles de `CLAUDE.md` dans `AGENTS.md`. Ne pas éditer manuellement ce bloc.
@@ -0,0 +1,34 @@
1
+ # リポジトリガイドライン (日本語)
2
+
3
+ > 本ファイルは [`AGENTS.md`](../../AGENTS.md) の日本語要約です。正式なルールは英語の本文を参照してください。
4
+
5
+ VerbalCoding はコーディングエージェント向けの Discord 音声ブリッジです。実装は `app-node/` 配下の Node 実装で、`run.sh` または `vc` CLI 経由で起動します。
6
+
7
+ ## 開発
8
+
9
+ - ドキュメント / サンプルでは `vc ...` を `npm run vc -- ...` より優先してください。
10
+ - ローカルシークレットは `.env` または `instances/*.env` に置き、実 Discord トークン、チャンネル ID、セッションファイル、音声サンプル、モデル重み、venv、ログ、キャッシュ出力はコミットしないでください。
11
+ - 自動生成物ではなくソースファイルを編集してください。
12
+ - サンプルは公開しても安全な値で。ローカルパス、ユーザー ID、Discord ID、トークンはプレースホルダで。
13
+
14
+ ## 検証
15
+
16
+ コード変更を完了とする前に Node テストを走らせてください:
17
+
18
+ ```bash
19
+ npm test
20
+ ```
21
+
22
+ ## モジュール構成
23
+
24
+ 詳細は [`AGENTS.md`](../../AGENTS.md) を参照。主要モジュール:
25
+
26
+ - `main.mjs` — Discord / 音声 / エージェントのディスパッチャ
27
+ - `agent_routing.mjs` — 音声主導のクロスエージェントルーティング
28
+ - `plan_mode.mjs` — 音声プランモード(`which_agent` スロット)
29
+ - `session_ontology.mjs` — チャネル単位の typed graph(handoff 用)
30
+ - `research_mode.mjs` — `"research X"` 音声コマンドパイプライン
31
+
32
+ ## 管理ブロック
33
+
34
+ HarnessSync が `AGENTS.md` に `CLAUDE.md` のルールを同期します。当該ブロックは編集しないでください。
@@ -0,0 +1,34 @@
1
+ # 저장소 가이드라인 (한국어)
2
+
3
+ > 이 파일은 [`AGENTS.md`](../../AGENTS.md)의 한국어 요약입니다. 정식 규칙은 원본 영어 문서를 따라주세요.
4
+
5
+ VerbalCoding은 코딩 에이전트용 Discord 음성 브릿지입니다. 실제 런타임은 `app-node/` 하위 Node 구현체이고, `run.sh` 또는 `vc` CLI로 실행합니다.
6
+
7
+ ## 개발
8
+
9
+ - 문서·예제에서는 `npm run vc -- ...` 보다 `vc ...` 형태를 우선 사용합니다.
10
+ - 로컬 비밀은 `.env` 또는 `instances/*.env`에만 두고 절대 커밋하지 마세요. 실제 Discord 토큰, 채널 ID, 세션 파일, 음성 샘플, 모델 가중치, 가상환경, 로그, 캐시 출력도 마찬가지입니다.
11
+ - 생성물/런타임 산출물 대신 소스 파일을 수정합니다.
12
+ - 예제는 공개 안전한 값으로 유지합니다. 로컬 경로, 사용자 ID, Discord ID, 토큰은 플레이스홀더로.
13
+
14
+ ## 검증
15
+
16
+ 코드 변경을 완료로 보고하기 전에 Node 테스트 스위트를 실행하세요:
17
+
18
+ ```bash
19
+ npm test
20
+ ```
21
+
22
+ ## 모듈 맵
23
+
24
+ 자세한 내용은 [`AGENTS.md`](../../AGENTS.md)를 참고하세요. 핵심 모듈:
25
+
26
+ - `main.mjs` — Discord/음성/에이전트 디스패처
27
+ - `agent_routing.mjs` — 음성 기반 크로스 에이전트 라우팅
28
+ - `plan_mode.mjs` — 음성 플랜 모드 (`which_agent` 슬롯)
29
+ - `session_ontology.mjs` — 채널별 타입드 그래프 (cross-agent 핸드오프 컨텍스트)
30
+ - `research_mode.mjs` — `"리서치 X"` 음성 명령 파이프라인
31
+
32
+ ## 관리되는 영역
33
+
34
+ HarnessSync가 `AGENTS.md`에 `CLAUDE.md`의 규칙을 자동 동기화합니다. 그 블록은 손대지 마세요.
@@ -0,0 +1,34 @@
1
+ # Руководство по репозиторию (русский)
2
+
3
+ > Этот файл — русское резюме [`AGENTS.md`](../../AGENTS.md). Формальные правила — в оригинале на английском.
4
+
5
+ VerbalCoding — голосовой мост Discord для кодинг-агентов. Рантайм — Node-реализация в `app-node/`, запускается через `run.sh` или CLI `vc`.
6
+
7
+ ## Разработка
8
+
9
+ - В документации и примерах предпочитайте `vc ...`, а не `npm run vc -- ...`.
10
+ - Локальные секреты — в `.env` или `instances/*.env`. Не коммитьте реальные Discord-токены, channel ID, session-файлы, голосовые сэмплы, веса моделей, venv, логи, кеши.
11
+ - Правьте исходные файлы, а не сгенерированные артефакты.
12
+ - Примеры — публично-безопасные: плейсхолдеры для локальных путей, user ID, Discord ID, токенов.
13
+
14
+ ## Проверка
15
+
16
+ Перед тем как считать изменения готовыми, запустите:
17
+
18
+ ```bash
19
+ npm test
20
+ ```
21
+
22
+ ## Карта модулей
23
+
24
+ Подробности в [`AGENTS.md`](../../AGENTS.md). Ключевые модули:
25
+
26
+ - `main.mjs` — диспетчер Discord / голос / агенты
27
+ - `agent_routing.mjs` — голосовая маршрутизация между агентами
28
+ - `plan_mode.mjs` — голосовой plan-mode (слот `which_agent`)
29
+ - `session_ontology.mjs` — типизированный граф на канал (handoff)
30
+ - `research_mode.mjs` — голосовая команда `"research X"`
31
+
32
+ ## Управляемый блок
33
+
34
+ HarnessSync синхронизирует правила из `CLAUDE.md` в управляемый блок `AGENTS.md`. Не редактируйте этот блок вручную.
@@ -0,0 +1,34 @@
1
+ # 仓库指南 (中文)
2
+
3
+ > 本文是 [`AGENTS.md`](../../AGENTS.md) 的中文摘要。正式规则以英文原文为准。
4
+
5
+ VerbalCoding 是面向编码代理的 Discord 语音桥。运行时位于 `app-node/`,通过 `run.sh` 或 `vc` CLI 启动。
6
+
7
+ ## 开发
8
+
9
+ - 文档与示例优先使用 `vc ...` 形式,而不是 `npm run vc -- ...`。
10
+ - 本地密钥放在 `.env` 或 `instances/*.env`,不要提交真实 Discord token、频道 ID、会话文件、语音样本、模型权重、虚拟环境、日志、缓存。
11
+ - 修改源文件而非自动生成物。
12
+ - 示例保持公开安全:本地路径、用户 ID、Discord ID、token 用占位符替代。
13
+
14
+ ## 验证
15
+
16
+ 报告完成前请运行 Node 测试:
17
+
18
+ ```bash
19
+ npm test
20
+ ```
21
+
22
+ ## 模块布局
23
+
24
+ 详情见 [`AGENTS.md`](../../AGENTS.md)。核心模块:
25
+
26
+ - `main.mjs` — Discord / 语音 / 代理调度器
27
+ - `agent_routing.mjs` — 语音驱动的跨代理路由
28
+ - `plan_mode.mjs` — 语音 plan 模式 (`which_agent` 槽)
29
+ - `session_ontology.mjs` — 按频道的类型图 (用于 handoff)
30
+ - `research_mode.mjs` — `"research X"` 语音命令流程
31
+
32
+ ## 托管区域
33
+
34
+ HarnessSync 会把 `CLAUDE.md` 的规则同步进 `AGENTS.md` 的托管块,请勿手动修改该块。