verbalcoding 0.2.0

Files changed (85)
  1. package/.env.example +83 -0
  2. package/LICENSE +21 -0
  3. package/README.md +157 -0
  4. package/app-node/agent_adapters.mjs +576 -0
  5. package/app-node/agent_adapters.test.mjs +455 -0
  6. package/app-node/agent_contract.mjs +45 -0
  7. package/app-node/barge_in.mjs +148 -0
  8. package/app-node/barge_in.test.mjs +179 -0
  9. package/app-node/bridge_logger.mjs +66 -0
  10. package/app-node/bridge_logger.test.mjs +73 -0
  11. package/app-node/bridge_state.mjs +104 -0
  12. package/app-node/bridge_state.test.mjs +64 -0
  13. package/app-node/cli_install.test.mjs +97 -0
  14. package/app-node/deferred_queue.mjs +12 -0
  15. package/app-node/deferred_queue.test.mjs +20 -0
  16. package/app-node/discord_invite_cli.test.mjs +31 -0
  17. package/app-node/discord_text.mjs +29 -0
  18. package/app-node/discord_text.test.mjs +32 -0
  19. package/app-node/hermes_profiles.mjs +164 -0
  20. package/app-node/hermes_profiles.test.mjs +276 -0
  21. package/app-node/install_config.mjs +263 -0
  22. package/app-node/install_config.test.mjs +205 -0
  23. package/app-node/instance_doctor.mjs +137 -0
  24. package/app-node/instance_doctor.test.mjs +128 -0
  25. package/app-node/instance_profile_lifecycle.mjs +16 -0
  26. package/app-node/instances.mjs +153 -0
  27. package/app-node/instances.test.mjs +102 -0
  28. package/app-node/language_config.mjs +73 -0
  29. package/app-node/language_config.test.mjs +51 -0
  30. package/app-node/latency_metrics.mjs +133 -0
  31. package/app-node/latency_metrics.test.mjs +71 -0
  32. package/app-node/main.mjs +1771 -0
  33. package/app-node/mcp_tools.mjs +198 -0
  34. package/app-node/mcp_tools.test.mjs +39 -0
  35. package/app-node/progress_cache.mjs +7 -0
  36. package/app-node/progress_cache.test.mjs +23 -0
  37. package/app-node/progress_speech.mjs +102 -0
  38. package/app-node/progress_speech.test.mjs +48 -0
  39. package/app-node/project_sessions.mjs +148 -0
  40. package/app-node/project_sessions.test.mjs +77 -0
  41. package/app-node/restart_notice.mjs +57 -0
  42. package/app-node/restart_notice.test.mjs +37 -0
  43. package/app-node/restart_policy.mjs +27 -0
  44. package/app-node/restart_policy.test.mjs +33 -0
  45. package/app-node/text_routing.mjs +8 -0
  46. package/app-node/text_routing.test.mjs +18 -0
  47. package/app-node/tts_backends.mjs +251 -0
  48. package/app-node/tts_backends.test.mjs +400 -0
  49. package/app-node/tts_chunks.mjs +57 -0
  50. package/app-node/tts_chunks.test.mjs +35 -0
  51. package/app-node/tts_prefetch.mjs +38 -0
  52. package/app-node/tts_prefetch.test.mjs +49 -0
  53. package/app-node/tts_settings.mjs +72 -0
  54. package/app-node/tts_settings.test.mjs +127 -0
  55. package/app-node/tts_voice_config.mjs +127 -0
  56. package/app-node/tts_voice_config.test.mjs +64 -0
  57. package/app-node/voice_clone_capture.mjs +76 -0
  58. package/app-node/voice_clone_capture.test.mjs +51 -0
  59. package/app-node/voice_messages.mjs +62 -0
  60. package/app-node/voice_messages.test.mjs +33 -0
  61. package/docs/CONFIGURATION.md +183 -0
  62. package/docs/FRESH_INSTALL.md +193 -0
  63. package/docs/MULTI_INSTANCE.md +183 -0
  64. package/docs/RELEASE.md +72 -0
  65. package/docs/USAGE.md +108 -0
  66. package/docs/assets/figures/verbalcoding-flow.svg +63 -0
  67. package/docs/i18n/README.es.md +121 -0
  68. package/docs/i18n/README.fr.md +121 -0
  69. package/docs/i18n/README.ja.md +121 -0
  70. package/docs/i18n/README.ko.md +121 -0
  71. package/docs/i18n/README.ru.md +121 -0
  72. package/docs/i18n/README.zh.md +121 -0
  73. package/package.json +58 -0
  74. package/run.sh +82 -0
  75. package/scripts/bootstrap_prereqs.sh +193 -0
  76. package/scripts/cli.mjs +369 -0
  77. package/scripts/docker_ubuntu_smoke.sh +76 -0
  78. package/scripts/doctor.mjs +134 -0
  79. package/scripts/install.mjs +108 -0
  80. package/scripts/install.sh +44 -0
  81. package/scripts/mcp-server.mjs +84 -0
  82. package/scripts/openvoice_smoke.py +34 -0
  83. package/scripts/openvoice_synth.py +103 -0
  84. package/scripts/setup_openvoice.sh +34 -0
  85. package/scripts/setup_supertonic.sh +18 -0
@@ -0,0 +1,183 @@
1
+ # VerbalCoding Configuration
2
+
3
+ ## Setup Wizard
4
+
5
+ ```bash
6
+ ./scripts/install.sh
7
+ ```
8
+
9
+ The installer asks for the Discord bot token, allowed users, auto-join voice channel names, transcript channel/thread, CLI harness backend, default voice language, TTS settings, and wake-word behavior. It writes `.env` with mode `0600`; `.env` is ignored by git. It also links the short shell command `vc`.
10
+
11
+ If you only need the shell command after manual install:
12
+
13
+ ```bash
14
+ npm link
15
+ ```
16
+
17
+ ## Supported Agent Backends
18
+
19
+ Set `AGENT_BACKEND` in `.env`.
20
+
21
+ | Backend | Default command | Notes |
22
+ |---|---|---|
23
+ | `hermes` | `hermes chat -Q -q` | Default. Preserves `.verbalcoding-session` resume behavior. |
24
+ | `claude-code` / `claude` | `claude -p` | Override with `CLAUDE_COMMAND` or `AGENT_COMMAND`. |
25
+ | `codex` | `codex exec` | Override with `CODEX_COMMAND` or `AGENT_COMMAND`. |
26
+ | `gemini` | `gemini -p` | Override with `GEMINI_COMMAND` or `AGENT_COMMAND`. |
27
+ | `opencode` | `opencode run` | Override with `OPENCODE_COMMAND` or `AGENT_COMMAND`. |
28
+ | `openclaw` | `openclaw run` | Override with `OPENCLAW_COMMAND` or `AGENT_COMMAND`. |
29
+ | `custom` | required `AGENT_COMMAND` | Prompt is appended as the final argv argument. |
30
+
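To illustrate the `custom` row above, here is a minimal sketch of how a command string and a prompt could be combined into an argv array. `buildAgentArgv` is a hypothetical helper, and the naive whitespace split is an assumption: it does not handle quoted arguments, and the package may parse `AGENT_COMMAND` differently.

```javascript
// Hypothetical helper: not taken from the package source.
// Splits a command string on whitespace and appends the prompt as one argv entry.
function buildAgentArgv(command, prompt) {
  const parts = command.trim().split(/\s+/).filter(Boolean);
  if (parts.length === 0) {
    throw new Error("AGENT_COMMAND is required for the custom backend");
  }
  // The prompt stays a single element, so spaces inside it are never re-split.
  return [...parts, prompt];
}

const argv = buildAgentArgv("my-harness run --non-interactive", "fix the failing test");
console.log(argv);
```

Passing the prompt as its own argv element, rather than concatenating it into a shell string, avoids quoting problems when the spoken request contains spaces or punctuation.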
31
+ Generic overrides:
32
+
33
+ ```bash
34
+ AGENT_BACKEND=custom
35
+ AGENT_LABEL="My Harness"
36
+ AGENT_COMMAND="my-harness run --non-interactive"
37
+ AGENT_TASK_TIMEOUT_MS=0
38
+ AGENT_CHAT_TIMEOUT_MS=45000
39
+ AGENT_VERBOSE_PROGRESS=0
40
+ UTTERANCE_IDLE_MS=2000
41
+ LATENCY_LOG_PATH=./.logs/latency.jsonl
42
+ ```
43
+
44
+ ## Agent Adapter Contract
45
+
46
+ The voice bridge talks to every backend through one adapter contract:
47
+
48
+ - `run({ text }, signal, plan)` returns status, final answer text, backend label, elapsed time, and optional session metadata.
49
+ - `ask(text, signal, plan)` is the compatibility shortcut that returns only final answer text.
50
+ - `capabilities` declares whether the backend supports session resume, streaming progress, and cancellation.
51
+ - Hermes is the reference adapter: resume, verbose progress streaming, cancellation, and final-answer recovery from Hermes session files.
52
+
53
+ New backends should implement the same contract and keep voice/STT/TTS behavior outside the adapter.
54
+
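The contract above can be sketched as a plain object. This is an illustration only: `createEchoAdapter`, the capability key names, and result fields such as `elapsedMs` are assumptions, and the real shapes are defined in `app-node/agent_contract.mjs`.

```javascript
// Hypothetical adapter sketch; field names such as `elapsedMs` are assumptions.
function createEchoAdapter(label = "Echo Harness") {
  const adapter = {
    // Declares what the bridge may rely on for this backend.
    capabilities: { sessionResume: false, streamingProgress: false, cancellation: true },

    // Full result: status, final answer text, backend label, elapsed time.
    async run({ text }, signal, plan) {
      const startedAt = Date.now();
      if (signal?.aborted) {
        return { status: "cancelled", text: "", backend: label, elapsedMs: 0 };
      }
      // A real adapter would invoke its CLI here; this one only echoes the prompt.
      const answer = `echo: ${text}`;
      return { status: "ok", text: answer, backend: label, elapsedMs: Date.now() - startedAt };
    },

    // Compatibility shortcut: only the final answer text.
    async ask(text, signal, plan) {
      const result = await adapter.run({ text }, signal, plan);
      return result.text;
    },
  };
  return adapter;
}

createEchoAdapter().ask("say hello briefly").then((answer) => console.log(answer));
```

Note that nothing in the sketch touches audio: speech recognition and playback stay in the bridge, which matches the guidance to keep voice/STT/TTS behavior outside the adapter.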
55
+ ## Example `.env`
56
+
57
+ ```bash
58
+ DISCORD_BOT_TOKEN="***"
59
+ DISCORD_ALLOWED_USERS="123456789012345678"
60
+ AUTO_JOIN_VOICE_CHANNELS="일반,General,general"
61
+ TRANSCRIPT_CHANNEL_ID="123456789012345678"
62
+
63
+ AGENT_BACKEND="hermes"
64
+ STT_ENGINE="whisper_cpp"
65
+ WHISPER_CPP_BIN="whisper-cli"
66
+ WHISPER_CPP_MODEL="./models/ggml-small-q5_1.bin"
67
+
68
+ TTS_BACKEND="edge"
69
+ TTS_VOICE="ko-KR-SunHiNeural"
70
+ TTS_RATE="+10%"
71
+ TTS_MAX_CHARS="495"
72
+ TTS_VOLUME="1.0"
73
+
74
+ REQUIRE_WAKE_WORD="0"
75
+ MIN_UTTERANCE_SECONDS="1.0"
76
+ UTTERANCE_IDLE_MS="2000"
77
+ HERMES_TASK_TIMEOUT_MS="0"
78
+ HERMES_CHAT_TIMEOUT_MS="45000"
79
+ AGENT_VERBOSE_PROGRESS="0"
80
+ LATENCY_LOG_PATH="./.logs/latency.jsonl"
81
+ ```
82
+
83
+ ## MCP Server
84
+
85
+ VerbalCoding ships a stdio MCP server so Hermes Agent or any MCP client can control the bridge through tools instead of relying on skills or free-form shell commands.
86
+
87
+ Hermes config example:
88
+
89
+ ```yaml
90
+ mcp_servers:
91
+ verbalcoding:
92
+ command: "node"
93
+ args: ["/path/to/VerbalCoding/scripts/mcp-server.mjs"]
94
+ timeout: 120
95
+ connect_timeout: 30
96
+ ```
97
+
98
+ Exposed MCP tools:
99
+
100
+ | Tool | Purpose |
101
+ |---|---|
102
+ | `status` | Report bridge/config status without secrets |
103
+ | `doctor` | Run the redacted doctor check |
104
+ | `set_auto_restart` | Enable/disable commit-time voice-bot auto-restart |
105
+ | `set_language` | Update STT/progress/TTS language together |
106
+ | `start`, `stop`, `restart` | Control the Discord voice bridge |
107
+
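For clients that speak MCP directly, a tool invocation is a JSON-RPC 2.0 `tools/call` request. The sketch below builds one for the `status` tool. The empty `arguments` object is an assumption (the tool's input schema is not shown here), and a real client must complete the MCP `initialize` handshake before sending it.

```javascript
// Builds a JSON-RPC 2.0 `tools/call` request in the shape used by MCP.
// The empty `arguments` object for `status` is an assumption; check the
// tool's schema via `tools/list` before relying on it.
function buildToolCall(id, name, args = {}) {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// The stdio transport sends one JSON message per line.
const line = JSON.stringify(buildToolCall(1, "status")) + "\n";
process.stdout.write(line);
```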
108
+ ## Optional OpenVoice TTS
109
+
110
+ Edge TTS remains the default and fallback. To try local voice cloning with OpenVoice V2:
111
+
112
+ ```bash
113
+ ./scripts/setup_openvoice.sh
114
+ # Download checkpoints_v2_0417.zip from OpenVoice docs and extract under vendor/OpenVoice/checkpoints_v2/
115
+ mkdir -p voice-samples
116
+ # Put a permitted reference sample at voice-samples/user-reference.wav,
117
+ # or capture one from Discord with !voice-clone capture.
118
+ python3 scripts/openvoice_smoke.py
119
+ ```
120
+
121
+ Then set:
122
+
123
+ ```bash
124
+ TTS_BACKEND="openvoice"
125
+ OPENVOICE_REF_AUDIO="./voice-samples/user-reference.wav"
126
+ OPENVOICE_PROGRESS="0"
127
+ ```
128
+
129
+ Only clone voices you own or have permission to use. If OpenVoice fails or times out, VerbalCoding falls back to Edge TTS.
130
+
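The fallback behavior described above can be sketched as a race between the primary backend and a timer. This is a minimal illustration, assuming each backend is an async function from text to audio; `synthesizeWithFallback` and the stub backends are hypothetical names, not the package's API.

```javascript
// Hypothetical sketch: `primary` and `fallback` are async functions (text) => audio.
async function synthesizeWithFallback(primary, fallback, text, timeoutMs) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error("TTS timed out")), timeoutMs);
  });
  try {
    // Whichever settles first wins: the primary result or the timeout rejection.
    return await Promise.race([primary(text), timeout]);
  } catch (err) {
    // Any failure or timeout of the primary backend falls through to the fallback.
    return await fallback(text);
  } finally {
    clearTimeout(timer);
  }
}

// Usage with stand-in backends (no real TTS engine is called here).
const failingOpenVoice = async () => { throw new Error("checkpoint missing"); };
const edgeStub = async (text) => `edge:${text}`;
synthesizeWithFallback(failingOpenVoice, edgeStub, "hello", 1000).then((audio) => console.log(audio));
```

The race only stops waiting for the primary backend; it does not cancel it. A production version would also need to terminate the slow synthesis process.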
131
+ ## Optional Supertonic TTS
132
+
133
+ ```bash
134
+ ./scripts/setup_supertonic.sh
135
+ supertonic tts '안녕하세요. 수퍼토닉 테스트입니다.' --lang ko --voice M1 --steps 2 --speed 1.0 -o /tmp/verbalcoding-supertonic.wav
136
+ ```
137
+
138
+ Then set:
139
+
140
+ ```bash
141
+ TTS_BACKEND="supertonic"
142
+ SUPERTONIC_COMMAND="./.venv-supertonic/bin/supertonic"
143
+ SUPERTONIC_VOICE="M1"
144
+ SUPERTONIC_LANGUAGE="ko"
145
+ SUPERTONIC_STEPS="2"
146
+ SUPERTONIC_SPEED="1.0"
147
+ SUPERTONIC_PROGRESS="0"
148
+ ```
149
+
150
+ If Supertonic is missing, fails, or times out, VerbalCoding falls back to Edge TTS.
151
+
152
+ ## Optional SpeechSwift / CosyVoice TTS
153
+
154
+ On Apple Silicon, `speech-swift` is a local backend for Korean voice cloning with MLX-native CosyVoice/Qwen3-TTS.
155
+
156
+ ```bash
157
+ brew tap soniqo/speech https://github.com/soniqo/speech-swift
158
+ brew install speech
159
+ ```
160
+
161
+ Recommended env:
162
+
163
+ ```bash
164
+ TTS_BACKEND="speechswift"
165
+ SPEECHSWIFT_MODE="server"
166
+ SPEECHSWIFT_ENGINE="cosyvoice"
167
+ SPEECHSWIFT_LANGUAGE="korean"
168
+ SPEECHSWIFT_REF_AUDIO="./voice-samples/user-reference.wav"
169
+ SPEECHSWIFT_SERVER_HOST="127.0.0.1"
170
+ SPEECHSWIFT_SERVER_PORT="18080"
171
+ SPEECHSWIFT_SERVER_URL="http://127.0.0.1:18080"
172
+ SPEECHSWIFT_PROGRESS="0"
173
+ ```
174
+
175
+ Keep Edge for quick progress/backchannel prompts.
176
+
177
+ ## Operational Notes
178
+
179
+ - The bot needs the privileged Message Content intent enabled in the Discord Developer Portal for text commands.
180
+ - The bot needs Connect and Speak permissions in the voice channel.
181
+ - For Hermes Agent, configure/authenticate Hermes normally (`hermes setup`, `hermes login`, etc.) on your default profile.
182
+ - For Claude Code, Codex, Gemini, OpenCode, or OpenClaw, install and authenticate those CLIs separately.
183
+ - If a CLI emits diff/code output on timeout or signal failure, the bridge avoids reading it aloud and sends detailed text instead.
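The last note above describes a guardrail that keeps diff or code output out of TTS. One way to sketch such a check is a small set of line-start patterns. The patterns and the `looksLikeCodeOrDiff` / `routeAgentOutput` names are assumptions for illustration, not the bridge's actual rules.

```javascript
// Hypothetical heuristic, not the bridge's actual rule set.
const DIFF_OR_CODE_MARKERS = [
  /^diff --git /m,      // git diff file header
  /^@@ .* @@/m,         // unified diff hunk header
  /^(\+\+\+|---) \S/m,  // unified diff old/new file lines
  /^```/m,              // fenced code block
];

function looksLikeCodeOrDiff(text) {
  return DIFF_OR_CODE_MARKERS.some((pattern) => pattern.test(text));
}

function routeAgentOutput(text) {
  // Code-like output goes to the transcript only; plain prose may also be spoken.
  return looksLikeCodeOrDiff(text)
    ? { speak: false, sendText: true }
    : { speak: true, sendText: true };
}

console.log(routeAgentOutput("Done. All tests pass."));
console.log(routeAgentOutput("@@ -1,2 +1,3 @@\n+const x = 1;"));
```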
@@ -0,0 +1,193 @@
1
+ # Fresh install
2
+
3
+ This guide is for a clean public install. It avoids local-only assumptions and uses the installer to bootstrap as much as possible.
4
+
5
+ ## 1. Install the CLI
6
+
7
+ Recommended npm path:
8
+
9
+ ```bash
10
+ npm install -g verbalcoding
11
+ ```
12
+
13
+ Or run the published package directly:
14
+
15
+ ```bash
16
+ npx verbalcoding setup --yes
17
+ ```
18
+
19
+ If you used `npm install -g`, continue with:
20
+
21
+ ```bash
22
+ vc setup --yes
23
+ ```
24
+
25
+ Contributor GitHub clone path:
26
+
27
+ ```bash
28
+ git clone https://github.com/ca1773130n/VerbalCoding.git
29
+ cd VerbalCoding
30
+ ./scripts/install.sh --yes
31
+ ```
32
+
33
+ ## 2. Bootstrap dependencies and run the setup wizard
34
+
35
+ The npm commands above run the same bootstrapper as the clone install. For a clone, run:
36
+
37
+ ```bash
38
+ ./scripts/install.sh --yes
39
+ ```
40
+
41
+ What this does:
42
+
43
+ - installs npm dependencies when `node_modules/` is missing,
44
+ - installs the short `vc` shell command with `npm link`,
45
+ - installs `ffmpeg`, Node/npm, and `whisper-cli` when supported by the OS package manager,
46
+ - downloads `models/ggml-small-q5_1.bin`,
47
+ - creates `.venv-tts` and installs `edge-tts` when `edge-tts` is not already on `PATH`,
48
+ - runs the interactive `.env` wizard.
49
+
50
+ Supported system bootstrap paths:
51
+
52
+ | OS | System dependency path |
53
+ |---|---|
54
+ | macOS | Homebrew: `brew install node ffmpeg whisper-cpp` as needed |
55
+ | Debian/Ubuntu | `apt-get` for Node/npm, ffmpeg, Python, build tools; local whisper.cpp build fallback |
56
+ | Fedora/RHEL | `dnf` for Node/npm, ffmpeg, Python, build tools; local whisper.cpp build fallback |
57
+ | Arch | `pacman` for Node/npm, ffmpeg, Python, build tools; local whisper.cpp build fallback |
58
+
59
+ Useful installer variants:
60
+
61
+ ```bash
62
+ vc setup --yes --no-wizard # dependency/bootstrap only from npm install
63
+ ./scripts/install.sh --yes --no-wizard # dependency/bootstrap only from a clone
64
+ ./scripts/install.sh --skip-system # do not install OS packages
65
+ ./scripts/install.sh --skip-model # do not download the default STT model
66
+ ./scripts/install.sh --skip-edge-tts # do not create .venv-tts
67
+ VERBALCODING_SKIP_CLI_LINK=1 ./scripts/install.sh --yes
68
+ ```
69
+
70
+ If your OS is unsupported, install these manually before rerunning:
71
+
72
+ - Node.js 20+ and npm
73
+ - ffmpeg
74
+ - Python 3 with venv/pip
75
+ - whisper.cpp `whisper-cli`
76
+ - one authenticated CLI agent backend, Hermes Agent by default
77
+
78
+ ## 3. Discord application setup
79
+
80
+ 1. Create a Discord application and bot in the Discord Developer Portal.
81
+ 2. Enable the Message Content privileged intent.
82
+ 3. Copy the bot token into the installer prompt or `.env` as `DISCORD_BOT_TOKEN`.
83
+ 4. Generate an invite URL:
84
+
85
+ ```bash
86
+ vc bot invite <discord-client-id>
87
+ # or pin it to one server:
88
+ vc bot invite <discord-client-id> --guild <guild-id>
89
+ ```
90
+
91
+ The invite includes bot and slash-command scopes plus text/voice permissions used by VerbalCoding.
92
+
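For readers curious what such an invite URL contains, here is a minimal sketch that builds one by hand from Discord's OAuth2 authorize endpoint. The permission set mirrors the list in the multi-instance guide (View Channel, Send Messages, Send Messages in Threads, Read Message History, Use Application Commands, Connect, Speak). Whether `vc bot invite` emits exactly this set is an assumption, and `buildInviteUrl` is a hypothetical helper.

```javascript
// Discord permission bit positions, from the Discord API documentation.
const PERMISSION_BITS = {
  VIEW_CHANNEL: 10n,
  SEND_MESSAGES: 11n,
  READ_MESSAGE_HISTORY: 16n,
  CONNECT: 20n,
  SPEAK: 21n,
  USE_APPLICATION_COMMANDS: 31n,
  SEND_MESSAGES_IN_THREADS: 38n,
};

// Hypothetical helper; the exact permission set `vc bot invite` uses is an assumption.
function buildInviteUrl(clientId, guildId) {
  let permissions = 0n;
  for (const bit of Object.values(PERMISSION_BITS)) permissions |= 1n << bit;

  const params = new URLSearchParams({
    client_id: clientId,
    scope: "bot applications.commands",
    permissions: permissions.toString(),
  });
  if (guildId) {
    // Preselect one server and lock the server picker to it.
    params.set("guild_id", guildId);
    params.set("disable_guild_select", "true");
  }
  return `https://discord.com/oauth2/authorize?${params}`;
}

console.log(buildInviteUrl("123456789012345678"));
```

`BigInt` is used because some permission bits sit above position 31, where JavaScript's 32-bit bitwise operators on plain numbers would overflow.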
93
+ ## 4. Verify
94
+
95
+ ```bash
96
+ vc doctor
97
+ ```
98
+
99
+ `vc doctor` is redacted: it reports missing tokens/commands/models without printing secret values. Fix every `✗` item, then rerun it.
100
+
101
+ Expected success includes:
102
+
103
+ ```text
104
+ ✓ Node.js
105
+ ✓ npm
106
+ ✓ ffmpeg
107
+ ✓ whisper-cli
108
+ ✓ whisper.cpp model
109
+ ✓ Discord bot token configured — [REDACTED]
110
+ ✓ edge-tts
111
+ ✓ hermes CLI
112
+ Doctor passed. Run vc start to start VerbalCoding.
113
+ ```
114
+
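The redaction idea behind the doctor output can be sketched in a few lines: report whether a secret is present, but never echo its value. `describeCheck` and its output format are assumptions for illustration and do not reproduce the real `scripts/doctor.mjs`.

```javascript
// Hypothetical sketch of redacted reporting; check names and shapes are assumptions.
function describeCheck(name, value, { secret = false } = {}) {
  const present = typeof value === "string" && value.trim() !== "";
  if (!present) return `✗ ${name} missing`;
  // Secrets are confirmed as configured, but their value never reaches the output.
  return secret ? `✓ ${name} configured: [REDACTED]` : `✓ ${name} (${value})`;
}

const env = { DISCORD_BOT_TOKEN: "example-not-a-real-token", WHISPER_CPP_BIN: "whisper-cli" };
console.log(describeCheck("Discord bot token", env.DISCORD_BOT_TOKEN, { secret: true }));
console.log(describeCheck("whisper-cli", env.WHISPER_CPP_BIN));
console.log(describeCheck("whisper.cpp model", env.WHISPER_CPP_MODEL));
```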
115
+ If the installer created a local Edge TTS helper, `.env` should contain an `EDGE_TTS_COMMAND` path pointing at `.venv-tts/bin/edge-tts`.
116
+
117
+ ## 5. Run the single default bot
118
+
119
+ ```bash
120
+ vc start
121
+ # or, from a GitHub clone:
122
+ ./run.sh
123
+ ```
124
+
125
+ Successful startup logs include:
126
+
127
+ ```text
128
+ Logged in as <bot-name>
129
+ Listening in voice channel <server> / <channel>
130
+ ```
131
+
132
+ In Discord:
133
+
134
+ ```text
135
+ !ping
136
+ !join
137
+ !ask say hello briefly
138
+ !verbose on
139
+ ```
140
+
141
+ Then speak in the configured voice channel. You should see STT text, progress text when verbose mode is on, a final text answer, and hear TTS playback.
142
+
143
+ ## 6. Project-per-room setup
144
+
145
+ For one permanent bot per project voice room, create one Discord application per project, then:
146
+
147
+ ```bash
148
+ vc instance setup my-project
149
+ vc bot invite <that-project-client-id>
150
+ vc instance start my-project
151
+ vc instance status my-project
152
+ ```
153
+
154
+ Each instance writes an ignored `instances/<name>.env` with its own token, voice channel, transcript target, log path, Hermes session file, and optional Hermes profile.
155
+
156
+ ## 7. Optional OpenVoice setup
157
+
158
+ OpenVoice voice cloning is optional. Keep `TTS_BACKEND=edge` for a fresh public install. To enable OpenVoice later:
159
+
160
+ ```bash
161
+ ./scripts/setup_openvoice.sh
162
+ # Download OpenVoice V2 checkpoints into vendor/OpenVoice/checkpoints_v2/
163
+ # Add a permitted local sample at voice-samples/user-reference.wav,
164
+ # or run the bot, say "목소리 샘플 녹음 시작해" (Korean for "start recording a voice sample"), then speak for 10-30 seconds.
165
+ python3 scripts/openvoice_smoke.py
166
+ ```
167
+
168
+ Then set `TTS_BACKEND=openvoice`, run `vc doctor`, and test `!voice-test <text>` in Discord.
169
+
170
+ ## 8. Clean clone smoke test for maintainers
171
+
172
+ Fast host-only smoke test:
173
+
174
+ ```bash
175
+ TMPDIR=$(mktemp -d)
176
+ git clone https://github.com/ca1773130n/VerbalCoding.git "$TMPDIR/VerbalCoding"
177
+ cd "$TMPDIR/VerbalCoding"
178
+ ./scripts/install.sh --yes --no-wizard
179
+ npm pack --dry-run
180
+ cp .env.example .env
181
+ chmod 600 .env
182
+ vc doctor || true
183
+ ```
184
+
185
+ The expected failure at this point is missing local secrets or unauthenticated agent CLI, not leaked tokens or missing install scripts.
186
+
187
+ Docker-based Ubuntu clean install smoke test:
188
+
189
+ ```bash
190
+ ./scripts/docker_ubuntu_smoke.sh
191
+ ```
192
+
193
+ This runs `ubuntu:24.04`, copies the tracked repository tree into a clean container, runs `./scripts/install.sh --yes --no-wizard`, writes a non-secret smoke `.env`, checks `vc`, runs Node tests, and verifies `vc doctor`. It does not connect to Discord voice; use a real Ubuntu VM or WSL2 after this if you need an end-to-end voice-channel test.
@@ -0,0 +1,183 @@
1
+ # Multi-instance VerbalCoding
2
+
3
+ VerbalCoding can run multiple independent Discord voice bridge processes. Each process is still the existing single-instance Node bridge, but it loads a different `instances/<name>.env` file and uses a different Discord bot token.
4
+
5
+ Use this when each project should permanently occupy its own Discord voice channel and write to its own transcript channel/thread.
6
+
7
+ ## Why multiple bot tokens are required
8
+
9
+ Discord voice residency is effectively one active voice connection per bot account per guild. If one bot token joins another voice channel in the same guild, it cannot also remain permanently connected to the previous channel. For simultaneous project rooms, create one Discord application/bot per project.
10
+
11
+ ## File layout
12
+
13
+ ```text
14
+ instances/
15
+ README.md
16
+ example.env
17
+ llm-wiki.env # local only, ignored by git
18
+ verbalcoding.env # local only, ignored by git
19
+ .run/instances/
20
+ llm-wiki.pid # runtime only, ignored by git
21
+ ```
22
+
23
+ Real `instances/*.env` files are ignored because they may contain Discord tokens. `instances/example.env` is the committed template.
24
+
25
+ ## Instance setup wizard
26
+
27
+ Users should not copy and manually edit env files for normal use. Run the wizard instead:
28
+
29
+ ```bash
30
+ vc instance setup llm-wiki
31
+ # or through the project setup script:
32
+ ./scripts/install.sh --instance llm-wiki
33
+ ```
34
+
35
+ The wizard prompts for the bot token, Discord Application/Client ID, voice channel, transcript target, workdir, project context, and isolated runtime paths. It writes `instances/<name>.env` with mode `0600`, backs up an existing file before overwriting it, and prints the next start/status commands.
36
+
37
+ If you enter the Discord Application/Client ID during setup, the summary also prints the invite URL for that bot. You can generate the same URL any time with:
38
+
39
+ ```bash
40
+ vc bot invite <client-id>
41
+ vc bot invite <client-id> --guild <guild-id>
42
+ ```
43
+
44
+ Discord still requires one Developer Portal application/bot per simultaneous voice room, but this avoids manually building OAuth URLs or permission integers.
45
+
46
+ ### Hermes profile isolation
47
+
48
+ Each instance gets its own Hermes home at `~/.hermes/profiles/<name>` so that
49
+ memory, MEMORY.md, SOUL.md, and learned skills do not leak across projects.
50
+
51
+ `vc instance setup <name>` automatically:
52
+
53
+ - runs `hermes profile create <name> --clone-from default` (carries API keys
54
+ and model from your current `~/.hermes`; sessions and memory start fresh),
55
+ - sets the new profile's `terminal.cwd` to the instance workdir,
56
+ - seeds `<profile>/SOUL.md` from the wizard's project-context answer,
57
+ - writes `HERMES_HOME=...` into `instances/<name>.env`.
58
+
59
+ `vc instance start <name>` self-heals: if the env points at a Hermes profile
60
+ dir that no longer exists, the start command recreates it before launching.
61
+
62
+ Instance names must match `^[a-z0-9][a-z0-9_-]{0,63}$` because Hermes uses the
63
+ name as a directory and config key.
64
+
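The naming rule above is easy to check before running the wizard. The sketch below applies the documented pattern as-is; `isValidInstanceName` is a hypothetical helper name.

```javascript
// Pattern copied from the guide: 1 to 64 characters, lowercase alphanumeric first.
const INSTANCE_NAME_PATTERN = /^[a-z0-9][a-z0-9_-]{0,63}$/;

function isValidInstanceName(name) {
  return typeof name === "string" && INSTANCE_NAME_PATTERN.test(name);
}

for (const candidate of ["llm-wiki", "my_project2", "My-Project", "-leading-dash", ""]) {
  console.log(JSON.stringify(candidate), isValidInstanceName(candidate));
}
```

Uppercase letters, a leading `-` or `_`, and empty names are all rejected, which keeps the name safe to use as a directory under `~/.hermes/profiles/`.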
65
+ ## Minimal generated instance env
66
+
67
+ ```env
68
+ INSTANCE_NAME=my-project
69
+ DISCORD_TOKEN=replace-with-bot-token
70
+ DISCORD_CLIENT_ID=123456789012345678
71
+ AUTO_JOIN_VOICE_CHANNELS=Project Room
72
+ TRANSCRIPT_CHANNEL_ID=123456789012345678
73
+ PROJECT_SESSIONS_FILE=config/project-sessions.my-project.json
74
+ BRIDGE_LOG_PATH=/tmp/verbalcoding-my-project.log
75
+ NODE_AUDIO_DEBUG_DIR=/tmp/verbalcoding-my-project-debug
76
+ HERMES_SESSION_FILE=.agent-sessions/hermes/my-project.session
77
+ HERMES_HOME=/home/you/.hermes/profiles/my-project
78
+ AGENT_LABEL=VerbalCoding · My Project
79
+ AGENT_CWD=/path/to/my-project
80
+ AGENT_PROJECT_CONTEXT=Project session: My Project
81
+ ```
82
+
83
+ Give every instance unique values for log/debug/session files. `HERMES_HOME` and the matching `~/.hermes/profiles/<name>` directory are created automatically by `vc instance setup`. `vc doctor` checks for duplicate tokens, colliding runtime paths, missing profile directories, and `terminal.cwd` mismatches between profile and instance — all without printing secrets.
84
+
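The duplicate and collision checks mentioned above can be sketched as a grouping pass over the parsed instance env files. `findCollisions` and the choice of keys are assumptions based on the generated env shown earlier; the real checks live in the package's doctor code and may differ.

```javascript
// Hypothetical sketch; `instances` maps instance name -> parsed env object.
const UNIQUE_KEYS = ["DISCORD_TOKEN", "BRIDGE_LOG_PATH", "NODE_AUDIO_DEBUG_DIR", "HERMES_SESSION_FILE"];

function findCollisions(instances) {
  const problems = [];
  for (const key of UNIQUE_KEYS) {
    const owners = new Map(); // value -> names of the instances that use it
    for (const [name, env] of Object.entries(instances)) {
      const value = env[key];
      if (!value) continue;
      owners.set(value, [...(owners.get(value) ?? []), name]);
    }
    for (const names of owners.values()) {
      // Report the key and the instance names only, never the shared value.
      if (names.length > 1) problems.push(`${key} is shared by: ${names.join(", ")}`);
    }
  }
  return problems;
}

console.log(findCollisions({
  "llm-wiki": { DISCORD_TOKEN: "token-a", BRIDGE_LOG_PATH: "/tmp/verbalcoding-shared.log" },
  verbalcoding: { DISCORD_TOKEN: "token-b", BRIDGE_LOG_PATH: "/tmp/verbalcoding-shared.log" },
}));
```

Because only key names and instance names are reported, a duplicated token is flagged without the token itself appearing in the output.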
85
+ ## Commands
86
+
87
+ ```bash
88
+ vc instance list
89
+ vc instance status
90
+ vc instance status my-project
91
+ vc instance start my-project
92
+ vc instance stop my-project
93
+ vc instance restart my-project
94
+ ```
95
+
96
+ `start` runs `./run.sh instances/<name>.env` detached and writes `.run/instances/<name>.pid`.
97
+
98
+ `stop` sends `SIGTERM`, waits up to 10 seconds, then falls back to `SIGKILL` and removes the pid file.
99
+
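The stop sequence above (SIGTERM, a grace period, then SIGKILL and cleanup) can be sketched as follows. `stopProcess` and its options are hypothetical; the 10-second default comes from the text, while the polling interval is an assumption.

```javascript
// Hypothetical sketch of a graceful stop; not the package's actual implementation.
async function stopProcess(pid, options = {}) {
  const {
    graceMs = 10_000,                                     // grace period from the guide
    pollMs = 200,                                         // polling interval (assumption)
    kill = (target, signal) => process.kill(target, signal),
    onStopped = () => {},                                 // e.g. remove the pid file
  } = options;

  // Returns false when the process no longer exists (ESRCH), true when signalled.
  const trySignal = (signal) => {
    try {
      kill(pid, signal);
      return true;
    } catch (err) {
      if (err.code === "ESRCH") return false;
      throw err;
    }
  };

  trySignal("SIGTERM");
  const deadline = Date.now() + graceMs;
  // Signal 0 delivers nothing; it only tests whether the process still exists.
  while (trySignal(0) && Date.now() < deadline) {
    await new Promise((resolve) => setTimeout(resolve, pollMs));
  }
  const forced = trySignal(0) && trySignal("SIGKILL");
  onStopped();
  return { forced };
}
```

A caller would pass the pid read from `.run/instances/<name>.pid` and delete that file in `onStopped`. The `kill` option exists only so the logic can be exercised without touching a real process.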
100
+ ## Example: two permanent voice rooms
101
+
102
+ 1. Create two Discord applications/bots:
103
+ - VerbalCoding bot
104
+ - LLM-Wiki bot
105
+
106
+ 2. Invite both to the server with text and voice permissions:
107
+ - View Channel
108
+ - Send Messages
109
+ - Send Messages in Threads
110
+ - Read Message History
111
+ - Use Application Commands
112
+ - Connect
113
+ - Speak
114
+
115
+ Use `vc bot invite <client-id>` after creating each Discord application to print the exact invite URL with those permissions.
116
+
117
+ 3. Run the setup wizard for each local instance:
118
+
119
+ ```bash
120
+ vc instance setup verbalcoding
121
+ vc instance setup llm-wiki
122
+ ```
123
+
124
+ The wizard writes ignored `instances/verbalcoding.env` and `instances/llm-wiki.env` files with mode `0600`; it also backs up an existing instance env before replacing it. Each run also creates `~/.hermes/profiles/<name>` cloned from your default Hermes home, so the two instances start with the same auth/model but accumulate independent memory and skills as they learn each project.
125
+
126
+ 4. Check config:
127
+
128
+ ```bash
129
+ vc doctor
130
+ ```
131
+
132
+ 5. Start both:
133
+
134
+ ```bash
135
+ vc instance start verbalcoding
136
+ vc instance start llm-wiki
137
+ vc instance status
138
+ ```
139
+
140
+ 6. Verify logs:
141
+
142
+ ```bash
143
+ tail -n 50 /tmp/verbalcoding-verbalcoding.log
144
+ tail -n 50 /tmp/verbalcoding-llm-wiki.log
145
+ ```
146
+
147
+ Expected log lines:
148
+
149
+ ```text
150
+ Listening in voice channel ... / VerbalCoding
151
+ Listening in voice channel ... / LLM-Wiki
152
+ ```
153
+
154
+ 7. Stop both:
155
+
156
+ ```bash
157
+ vc instance stop verbalcoding
158
+ vc instance stop llm-wiki
159
+ ```
160
+
161
+ ## Short-term single-bot text/voice binding
162
+
163
+ If you only have one bot token, use project-session voice binding instead of simultaneous multi-channel residency.
164
+
165
+ Run this in the target text channel/thread:
166
+
167
+ ```text
168
+ !session attach-voice --voice "LLM-Wiki"
169
+ ```
170
+
171
+ Behavior:
172
+
173
+ - Binds the selected voice channel to the current text channel/thread.
174
+ - If the current text channel has no project session, creates an ad-hoc isolated session.
175
+ - Voice STT results, progress updates, and final-answer text are routed to that active project transcript target.
176
+
177
+ To attach an existing named project session:
178
+
179
+ ```text
180
+ !session voice llm-wiki --voice "LLM-Wiki"
181
+ ```
182
+
183
+ This is convenient for routing, but it does not make one bot stay in two voice channels at the same time. Use multiple bot tokens/processes for simultaneous permanent residency.
@@ -0,0 +1,72 @@
1
+ # VerbalCoding release notes
2
+
3
+ ## Current release candidate
4
+
5
+ VerbalCoding is a Discord voice bridge for controlling CLI-based coding agents by voice. It is oriented toward public release: macOS / Apple Silicon is the most tested path, and Linux bootstrap support for common package managers is best-effort.
6
+
7
+ ### Included
8
+
9
+ - Discord voice receive via Node `@discordjs/voice`.
10
+ - Local Korean STT via `whisper.cpp` + Metal.
11
+ - Edge TTS playback with Korean default voice.
12
+ - Generic CLI harness adapter layer:
13
+ - Hermes Agent
14
+ - Claude Code
15
+ - Codex CLI
16
+ - Gemini CLI
17
+ - OpenCode
18
+ - OpenClaw
19
+ - custom command
20
+ - Shared voice/text session support for Hermes backend.
21
+ - Long-answer TTS chunking and responsive barge-in.
22
+ - Diff/code/log guardrails so large technical output is not read aloud.
23
+ - Normal and conservative sensitivity modes for indoor vs. noisy/outdoor use.
24
+ - Setup wizard, `.env.example`, `vc doctor` prerequisite checker, and `./scripts/install.sh --yes` bootstrap for OS packages, npm dependencies, Edge TTS helper, and the default whisper.cpp model.
25
+ - Optional verbose progress mode for text-only middle-step updates during long agent work.
26
+ - Always-on JSONL latency metrics plus `!latency` / `!metrics` summary for pipeline optimization.
27
+ - Lower default utterance idle wait (`UTTERANCE_IDLE_MS=2000`) so STT starts about 0.6s sooner after speech ends.
28
+ - Multi-instance Hermes profile isolation: `vc instance setup <name>` auto-clones a Hermes profile to `~/.hermes/profiles/<name>` with the instance workdir, seeds SOUL.md, and writes `HERMES_HOME` into the instance env so per-project memory and skills stay separate; `vc instance start` self-heals a missing profile, and `vc doctor` checks profile-dir presence and `terminal.cwd` consistency.
29
+
30
+ ### Pre-release checklist
31
+
32
+ Run from the repo root:
33
+
34
+ ```bash
35
+ ./scripts/install.sh --yes --no-wizard
36
+ ./scripts/docker_ubuntu_smoke.sh # requires Docker; validates ubuntu:24.04 clean install
37
+ node --check app-node/main.mjs app-node/agent_adapters.mjs app-node/install_config.mjs scripts/install.mjs
38
+ npm test
39
+ PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 python3 -m pytest tests/ -q || [ $? -eq 5 ] # ok when no Python tests exist
40
+ bash -n run.sh scripts/install.sh scripts/bootstrap_prereqs.sh scripts/docker_ubuntu_smoke.sh
41
+ vc doctor
42
+ git diff --check
43
+ ```
44
+
45
+ Manual smoke test:
46
+
47
+ 1. Start the bridge with `./run.sh`.
48
+ 2. Verify the log contains `Logged in as <bot-name>` (for example, `Hermes#6718`).
49
+ 3. Verify the log contains `Listening in voice channel ... / 일반` or the configured default channel.
50
+ 4. In Discord, run `!ping`.
51
+ 5. In Discord voice, say a short Korean request.
52
+ 6. Verify STT transcript, agent response, TTS playback, and barge-in behavior.
53
+
54
+ ### Known requirements
55
+
56
+ - macOS with Homebrew, or Linux with `apt`, `dnf`, or `pacman` for best-effort bootstrap.
57
+ - `ffmpeg`; installer attempts to install it.
58
+ - `whisper-cli`; installer uses Homebrew on macOS or local `vendor/whisper.cpp` build fallback on Linux.
59
+ - Default model at `models/ggml-small-q5_1.bin`; installer downloads it unless `--skip-model` is used.
60
+ - Edge TTS CLI on `PATH` or local `.venv-tts/bin/edge-tts`; installer creates the local helper when needed.
61
+ - Discord bot token in `.env`, `instances/<name>.env`, `~/.zshrc`, or runtime env.
62
+ - Selected CLI harness installed and authenticated.
63
+
64
+ ### Not for public release yet
65
+
66
+ Before public release, consider adding:
67
+
68
+ - GitHub Actions CI.
69
+ - Demo video / GIF.
70
+ - Discord bot setup screenshots.
71
+ - Broader Linux validation on real distributions beyond script-level checks.
72
+ - Security review of all logging paths.