verbalcoding 0.2.6 → 0.2.7

Files changed (38)
  1. package/README.md +5 -0
  2. package/docs/i18n/CONFIGURATION.es.md +150 -0
  3. package/docs/i18n/CONFIGURATION.fr.md +150 -0
  4. package/docs/i18n/CONFIGURATION.ja.md +150 -0
  5. package/docs/i18n/CONFIGURATION.ko.md +49 -146
  6. package/docs/i18n/CONFIGURATION.ru.md +150 -0
  7. package/docs/i18n/CONFIGURATION.zh.md +150 -0
  8. package/docs/i18n/FRESH_INSTALL.es.md +124 -0
  9. package/docs/i18n/FRESH_INSTALL.fr.md +124 -0
  10. package/docs/i18n/FRESH_INSTALL.ja.md +124 -0
  11. package/docs/i18n/FRESH_INSTALL.ko.md +37 -114
  12. package/docs/i18n/FRESH_INSTALL.ru.md +124 -0
  13. package/docs/i18n/FRESH_INSTALL.zh.md +124 -0
  14. package/docs/i18n/MULTI_INSTANCE.es.md +121 -0
  15. package/docs/i18n/MULTI_INSTANCE.fr.md +121 -0
  16. package/docs/i18n/MULTI_INSTANCE.ja.md +121 -0
  17. package/docs/i18n/MULTI_INSTANCE.ko.md +28 -86
  18. package/docs/i18n/MULTI_INSTANCE.ru.md +121 -0
  19. package/docs/i18n/MULTI_INSTANCE.zh.md +121 -0
  20. package/docs/i18n/README.es.md +50 -86
  21. package/docs/i18n/README.fr.md +50 -86
  22. package/docs/i18n/README.ja.md +50 -86
  23. package/docs/i18n/README.ko.md +41 -113
  24. package/docs/i18n/README.ru.md +50 -86
  25. package/docs/i18n/README.zh.md +50 -86
  26. package/docs/i18n/RELEASE.es.md +58 -0
  27. package/docs/i18n/RELEASE.fr.md +58 -0
  28. package/docs/i18n/RELEASE.ja.md +58 -0
  29. package/docs/i18n/RELEASE.ko.md +36 -50
  30. package/docs/i18n/RELEASE.ru.md +58 -0
  31. package/docs/i18n/RELEASE.zh.md +58 -0
  32. package/docs/i18n/USAGE.es.md +134 -0
  33. package/docs/i18n/USAGE.fr.md +134 -0
  34. package/docs/i18n/USAGE.ja.md +134 -0
  35. package/docs/i18n/USAGE.ko.md +63 -101
  36. package/docs/i18n/USAGE.ru.md +134 -0
  37. package/docs/i18n/USAGE.zh.md +134 -0
  38. package/package.json +1 -1
@@ -1,91 +1,66 @@
  # VerbalCoding
 
- <p align="center">
- <strong>Общайтесь с CLI-агентами для программирования голосом в Discord — почти как по телефону.</strong>
- </p>
-
- <p align="center">
- <a href="../../README.md">English</a> ·
- <a href="README.ko.md">한국어</a> ·
- <a href="README.ja.md">日本語</a> ·
- <a href="README.zh.md">中文</a> ·
- <a href="README.es.md">Español</a> ·
- <a href="README.fr.md">Français</a> ·
- <a href="README.ru.md">Русский</a>
- </p>
-
- <p align="center">
- <img alt="Node.js" src="https://img.shields.io/badge/Node.js-20%2B-339933?logo=node.js&logoColor=white">
- <img alt="Discord" src="https://img.shields.io/badge/Discord-voice%20bridge-5865F2?logo=discord&logoColor=white">
- <img alt="STT" src="https://img.shields.io/badge/STT-whisper.cpp-7C3AED">
- <img alt="TTS" src="https://img.shields.io/badge/TTS-Edge%20%7C%20OpenVoice%20%7C%20Supertonic%20%7C%20SpeechSwift-0EA5E9">
- </p>
-
- <p align="center">
- <img src="../assets/figures/verbalcoding-flow.svg" alt="VerbalCoding voice-to-agent flow" width="860">
- </p>
+ **Управляйте CLI-агентами для кода голосом в Discord — почти как по телефону.**
+
+ [English](../../README.md) · [한국어](README.ko.md) · [日本語](README.ja.md) · [中文](README.zh.md) · [Español](README.es.md) · [Français](README.fr.md) · [Русский](README.ru.md)
+
+ ![VerbalCoding voice-to-agent flow](../assets/figures/verbalcoding-flow.svg)
 
  ## Why
 
- VerbalCoding превращает голосовой канал Discord в hands-free панель управления агентами для разработки. Скажите задачу, дайте CLI-агенту выполнить работу и получите краткий голосовой ответ с текстовыми транскриптами, событиями прогресса и защитой от зачитывания длинного кода или логов.
+ VerbalCoding превращает голосовой канал Discord в hands-free интерфейс для coding agents. Вы произносите задачу, CLI-агент работает, а вы получаете краткий голосовой ответ, текстовую расшифровку и события прогресса.
 
- ## Возможности
+ ## Highlights
 
- | Что есть | Почему это удобно |
+ | Feature | What it means |
  |---|---|
- | Голосовое управление прежде всего | Управляйте Hermes Agent, Claude Code, Codex, Gemini CLI, OpenCode, OpenClaw или своим CLI голосом. |
- | Локальный voice loop | Голос Discord → STT `whisper.cpp` → агент → фрагментированное TTS-воспроизведение. |
- | Общий контекст голоса и текста | Голосовые реплики и `!ask` могут использовать одну и ту же поддерживаемую сессию агента. |
- | Barge-in и режимы чувствительности | Естественно перебивайте воспроизведение и переключайте normal/conservative режимы. |
- | Многоязычные voice presets | `vc language ko/en/auto` одновременно меняет STT, язык прогресса и TTS-голос. |
- | Изоляция комнат по проектам | Отдельный bot, Hermes profile, сессия, память и логи для каждого проекта. |
+ | Voice-first agent control | Hermes Agent, Claude Code, Codex, Gemini CLI, OpenCode, OpenClaw, or a custom CLI harness. |
+ | Local-first speech loop | Discord voice capture → `whisper.cpp` STT → agent → chunked TTS playback. |
+ | Shared voice + text context | Voice turns and `!ask` text commands can reuse the same supported agent session. |
+ | Barge-in and sensitivity modes | Interrupt playback naturally and switch between normal and conservative/noisy modes. |
+ | Multilingual voice presets | `vc language ko/en/auto` changes STT, progress language, and TTS voice together. |
+ | Multi-room project isolation | Run one bot per project room with isolated Hermes profiles, sessions, memory, and logs. |
 
- ## Быстрый старт
+ ## Quick Start
 
  ```bash
- git clone git@github.com:ca1773130n/VerbalCoding.git
- cd VerbalCoding
- ./scripts/install.sh
+ npm install -g verbalcoding
+ vc setup --yes
  vc doctor
- ./run.sh
+ vc start
  ```
 
- ## Как это работает
-
- ```mermaid
- flowchart LR
- A[Discord voice] --> B["@discordjs/voice"]
- B --> C[PCM cleanup + gates]
- C --> D["whisper.cpp STT"]
- D --> E["CLI agent adapter"]
- E --> F["Concise answer"]
- F --> G["Chunked TTS"]
- G --> H["Discord playback"]
+ Run without a permanent global install:
+
+ ```bash
+ npx verbalcoding setup --yes
+ vc doctor
+ vc start
  ```
 
- ## Поддерживаемые agent-бэкенды
+ Contributor clone path:
+
+ ```bash
+ git clone https://github.com/ca1773130n/VerbalCoding.git
+ cd VerbalCoding
+ ./scripts/install.sh --yes
+ vc doctor
+ ./run.sh
+ ```
 
- | Backend | Default command | Session support |
- |---|---:|---|
- | Hermes Agent | `hermes chat -Q -q` | Resume, verbose progress, cancellation, final-answer recovery |
- | Claude Code | `claude -p` | CLI session file support through adapter defaults |
- | Codex CLI | `codex exec` | CLI session file support through adapter defaults |
- | Gemini CLI | `gemini -p` | CLI session file support through adapter defaults |
- | OpenCode | `opencode run` | CLI session file support through adapter defaults |
- | OpenClaw | `openclaw run` | CLI session file support through adapter defaults |
- | Custom | `AGENT_COMMAND` | Bring your own non-interactive command |
+ `vc setup --yes` and `./scripts/install.sh --yes` bootstrap npm dependencies, `ffmpeg`, `whisper-cli`, the default whisper.cpp model, a local Edge TTS helper, and the short `vc` command where possible.
 
- ## Подробнее
+ ## Guides
 
- | Guide | What you get |
+ | Guide | Link |
  |---|---|
- | [Fresh Install](../FRESH_INSTALL.md) | Чистая установка, загрузка модели, первый запуск |
- | [Usage Guide](../USAGE.md) | CLI-команды, команды Discord, режим прогресса, метрики задержек |
- | [Configuration](../CONFIGURATION.md) | .env, agent-бэкенды, MCP, TTS и эксплуатационные заметки |
- | [Multi-Instance](../MULTI_INSTANCE.md) | Постоянная голосовая комната Discord для каждого проекта |
- | [Release Notes](../RELEASE.md) | Текущие возможности и pre-release checklist |
+ | Чистая установка | [FRESH_INSTALL.ru.md](FRESH_INSTALL.ru.md) |
+ | Руководство по использованию | [USAGE.ru.md](USAGE.ru.md) |
+ | Конфигурация | [CONFIGURATION.ru.md](CONFIGURATION.ru.md) |
+ | Мульти-инстансы | [MULTI_INSTANCE.ru.md](MULTI_INSTANCE.ru.md) |
+ | Заметки о релизе | [RELEASE.ru.md](RELEASE.ru.md) |
 
- ## Карта команд
+ ## Command map
 
  ```bash
  vc status
@@ -94,28 +69,17 @@ vc bot invite CLIENT_ID
  vc instance setup NAME
  vc instance start NAME
  vc doctor
+ vc start
  ```
 
- ## Требования
+ Discord commands:
 
- | Layer | Default |
- |---|---|
- | Runtime | Node.js 20+, npm |
- | Audio | `ffmpeg` |
- | STT | `whisper.cpp` / `whisper-cli` |
- | Discord | Bot token, Message Content intent, voice permissions |
- | Agent | At least one authenticated CLI harness, Hermes Agent by default |
- | Platform focus | macOS / Apple Silicon currently gets the most testing |
-
- ## Участие
-
- ```bash
- node --check app-node/main.mjs
- npm test
- bash -n run.sh scripts/install.sh
- vc doctor
+ ```text
+ !join !ask <prompt> !verbose on/off
+ !latency !sensitivity normal !sensitivity conservative
+ !session new <name> <workdir> [context] --voice <voice-channel>
  ```
 
- ## Статус
+ ## Requirements
 
- VerbalCoding is public-release oriented but still early. Demo video/GIF, broader Linux notes, and a formal license file are still TODOs.
+ Node.js 20+, npm, `ffmpeg`, `whisper.cpp` / `whisper-cli`, Edge TTS CLI, a Discord bot token with Message Content intent and voice permissions, and at least one authenticated CLI agent backend.
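
The new Requirements line names several binaries that must be present before the bridge can start. As a minimal sketch (not the project's `vc doctor` implementation), the helpers below check that named commands are on `PATH` and that a Node.js version string meets the 20+ requirement. The function names `missing_commands` and `node_major_ok` are hypothetical and exist only for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical preflight helpers; not part of VerbalCoding itself.

# Print every command name from the arguments that is not found on PATH.
missing_commands() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Succeed if a version string such as "v20.11.1" has a major version >= 20.
node_major_ok() {
  local version="${1#v}"        # drop a leading "v" if present
  local major="${version%%.*}"  # keep the text before the first dot
  [ "$major" -ge 20 ] 2>/dev/null
}

# Example calls (left commented so the sketch has no side effects):
# missing_commands node npm ffmpeg whisper-cli
# node_major_ok "$(node --version)" || echo "Node.js 20+ is required"
```

An empty result from `missing_commands` means every listed binary was found. The Discord token and agent authentication still have to be verified separately.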
@@ -1,91 +1,66 @@
  # VerbalCoding
 
- <p align="center">
- <strong>通过 Discord 语音像打电话一样控制 CLI 编程 Agent。</strong>
- </p>
-
- <p align="center">
- <a href="../../README.md">English</a> ·
- <a href="README.ko.md">한국어</a> ·
- <a href="README.ja.md">日本語</a> ·
- <a href="README.zh.md">中文</a> ·
- <a href="README.es.md">Español</a> ·
- <a href="README.fr.md">Français</a> ·
- <a href="README.ru.md">Русский</a>
- </p>
-
- <p align="center">
- <img alt="Node.js" src="https://img.shields.io/badge/Node.js-20%2B-339933?logo=node.js&logoColor=white">
- <img alt="Discord" src="https://img.shields.io/badge/Discord-voice%20bridge-5865F2?logo=discord&logoColor=white">
- <img alt="STT" src="https://img.shields.io/badge/STT-whisper.cpp-7C3AED">
- <img alt="TTS" src="https://img.shields.io/badge/TTS-Edge%20%7C%20OpenVoice%20%7C%20Supertonic%20%7C%20SpeechSwift-0EA5E9">
- </p>
-
- <p align="center">
- <img src="../assets/figures/verbalcoding-flow.svg" alt="VerbalCoding voice-to-agent flow" width="860">
- </p>
+ **通过 Discord 语音像打电话一样控制 CLI 编程代理。**
+
+ [English](../../README.md) · [한국어](README.ko.md) · [日本語](README.ja.md) · [中文](README.zh.md) · [Español](README.es.md) · [Français](README.fr.md) · [Русский](README.ru.md)
+
+ ![VerbalCoding voice-to-agent flow](../assets/figures/verbalcoding-flow.svg)
 
  ## Why
 
- VerbalCoding 把 Discord 语音频道变成面向编程 Agent 的免手动控制台。你可以直接说出需求,让 CLI Agent 工作,再听到简洁的语音回答;同时保留文字记录、进度事件,并避免把大段代码或日志读出来。
+ VerbalCoding 把 Discord 语音频道变成编程代理的免手控制界面。说出需求,让 CLI 代理工作,然后收到简洁的语音回复、文本转录和进度事件。
 
- ## 亮点
+ ## Highlights
 
- | 能力 | 价值 |
+ | Feature | What it means |
  |---|---|
- | 语音优先的 Agent 控制 | 用语音控制 Hermes Agent、Claude Code、Codex、Gemini CLI、OpenCode、OpenClaw 或自定义 CLI。 |
- | 本地优先语音闭环 | Discord 语音捕获 → `whisper.cpp` STT → Agent → 分段 TTS 播放。 |
- | 语音 + 文本共享上下文 | 在支持的 Agent 中,语音轮次和 `!ask` 文本命令可复用同一会话。 |
- | 打断与灵敏度模式 | 可自然打断播放,并在普通/保守灵敏度之间切换。 |
- | 多语言语音预设 | `vc language ko/en/auto` 同步切换 STT、进度语言和 TTS 声音。 |
- | 按项目隔离的多房间 | 每个项目房间使用独立 Bot、Hermes profile、会话、记忆和日志。 |
+ | Voice-first agent control | Hermes Agent, Claude Code, Codex, Gemini CLI, OpenCode, OpenClaw, or a custom CLI harness. |
+ | Local-first speech loop | Discord voice capture → `whisper.cpp` STT → agent → chunked TTS playback. |
+ | Shared voice + text context | Voice turns and `!ask` text commands can reuse the same supported agent session. |
+ | Barge-in and sensitivity modes | Interrupt playback naturally and switch between normal and conservative/noisy modes. |
+ | Multilingual voice presets | `vc language ko/en/auto` changes STT, progress language, and TTS voice together. |
+ | Multi-room project isolation | Run one bot per project room with isolated Hermes profiles, sessions, memory, and logs. |
 
- ## 快速开始
+ ## Quick Start
 
  ```bash
- git clone git@github.com:ca1773130n/VerbalCoding.git
- cd VerbalCoding
- ./scripts/install.sh
+ npm install -g verbalcoding
+ vc setup --yes
  vc doctor
- ./run.sh
+ vc start
  ```
 
- ## 工作原理
-
- ```mermaid
- flowchart LR
- A[Discord voice] --> B["@discordjs/voice"]
- B --> C[PCM cleanup + gates]
- C --> D["whisper.cpp STT"]
- D --> E["CLI agent adapter"]
- E --> F["Concise answer"]
- F --> G["Chunked TTS"]
- G --> H["Discord playback"]
+ Run without a permanent global install:
+
+ ```bash
+ npx verbalcoding setup --yes
+ vc doctor
+ vc start
  ```
 
- ## 支持的 Agent 后端
+ Contributor clone path:
+
+ ```bash
+ git clone https://github.com/ca1773130n/VerbalCoding.git
+ cd VerbalCoding
+ ./scripts/install.sh --yes
+ vc doctor
+ ./run.sh
+ ```
 
- | Backend | Default command | Session support |
- |---|---:|---|
- | Hermes Agent | `hermes chat -Q -q` | Resume, verbose progress, cancellation, final-answer recovery |
- | Claude Code | `claude -p` | CLI session file support through adapter defaults |
- | Codex CLI | `codex exec` | CLI session file support through adapter defaults |
- | Gemini CLI | `gemini -p` | CLI session file support through adapter defaults |
- | OpenCode | `opencode run` | CLI session file support through adapter defaults |
- | OpenClaw | `openclaw run` | CLI session file support through adapter defaults |
- | Custom | `AGENT_COMMAND` | Bring your own non-interactive command |
+ `vc setup --yes` and `./scripts/install.sh --yes` bootstrap npm dependencies, `ffmpeg`, `whisper-cli`, the default whisper.cpp model, a local Edge TTS helper, and the short `vc` command where possible.
 
- ## 了解更多
+ ## Guides
 
- | Guide | What you get |
+ | Guide | Link |
  |---|---|
- | [Fresh Install](../FRESH_INSTALL.md) | 干净克隆安装、模型下载、首次运行 |
- | [Usage Guide](../USAGE.md) | CLI 命令、Discord 命令、进度模式、延迟指标 |
- | [Configuration](../CONFIGURATION.md) | .env、Agent 后端、MCP、TTS 后端、运维说明 |
- | [Multi-Instance](../MULTI_INSTANCE.md) | 每个项目一个常驻 Discord 语音房间 |
- | [Release Notes](../RELEASE.md) | 当前能力与发布前检查清单 |
+ | 全新安装 | [FRESH_INSTALL.zh.md](FRESH_INSTALL.zh.md) |
+ | 使用指南 | [USAGE.zh.md](USAGE.zh.md) |
+ | 配置 | [CONFIGURATION.zh.md](CONFIGURATION.zh.md) |
+ | 多实例 | [MULTI_INSTANCE.zh.md](MULTI_INSTANCE.zh.md) |
+ | 发布说明 | [RELEASE.zh.md](RELEASE.zh.md) |
 
- ## 常用命令
+ ## Command map
 
  ```bash
  vc status
@@ -94,28 +69,17 @@ vc bot invite CLIENT_ID
  vc instance setup NAME
  vc instance start NAME
  vc doctor
+ vc start
  ```
 
- ## 要求
+ Discord commands:
 
- | Layer | Default |
- |---|---|
- | Runtime | Node.js 20+, npm |
- | Audio | `ffmpeg` |
- | STT | `whisper.cpp` / `whisper-cli` |
- | Discord | Bot token, Message Content intent, voice permissions |
- | Agent | At least one authenticated CLI harness, Hermes Agent by default |
- | Platform focus | macOS / Apple Silicon currently gets the most testing |
-
- ## 贡献
-
- ```bash
- node --check app-node/main.mjs
- npm test
- bash -n run.sh scripts/install.sh
- vc doctor
+ ```text
+ !join !ask <prompt> !verbose on/off
+ !latency !sensitivity normal !sensitivity conservative
+ !session new <name> <workdir> [context] --voice <voice-channel>
  ```
 
- ## 状态
+ ## Requirements
 
- VerbalCoding is public-release oriented but still early. Demo video/GIF, broader Linux notes, and a formal license file are still TODOs.
+ Node.js 20+, npm, `ffmpeg`, `whisper.cpp` / `whisper-cli`, Edge TTS CLI, a Discord bot token with Message Content intent and voice permissions, and at least one authenticated CLI agent backend.
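
The updated Quick Start shows two entry points: a globally installed `vc` command and `npx verbalcoding`. A wrapper script might choose between them at run time, roughly as sketched below. The helper name `pick_runner` is hypothetical. The diff only shows `npx verbalcoding setup --yes`, so using `npx verbalcoding` with other subcommands is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the launcher to use for VerbalCoding commands.
# Prefers a command already on PATH (default: vc), otherwise falls back to
# npx. Assumption: the npx binary accepts the same subcommands as vc.
pick_runner() {
  local preferred="${1:-vc}"
  if command -v "$preferred" >/dev/null 2>&1; then
    printf '%s\n' "$preferred"
  else
    printf '%s\n' "npx verbalcoding"
  fi
}

# Example call (commented out; word splitting of the result is intentional):
# $(pick_runner) setup --yes
```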
@@ -0,0 +1,58 @@
+ # VerbalCoding Notas de versión
+
+ ## Current release candidate
+
+ VerbalCoding is a Discord voice bridge for controlling CLI-based coding agents by voice. macOS / Apple Silicon is the most tested path; Linux bootstrap is best-effort for common package managers.
+
+ ## Included
+
+ - Discord voice receive via Node `@discordjs/voice`.
+ - Local Korean STT via `whisper.cpp` + Metal.
+ - Edge TTS playback with Korean default voice.
+ - Generic CLI harness adapter layer: Hermes Agent, Claude Code, Codex CLI, Gemini CLI, OpenCode, OpenClaw, or custom command.
+ - Shared voice/text session support for Hermes backend.
+ - Long-answer TTS chunking and responsive barge-in.
+ - Diff/code/log guardrails so large technical output is not read aloud.
+ - Normal and conservative sensitivity modes.
+ - Setup wizard, `.env.example`, `vc doctor`, `./scripts/install.sh --yes`, and npm install path.
+ - `npm install -g verbalcoding`, `vc setup --yes`, and `vc start`.
+ - Verbose progress mode, JSONL latency metrics, and `!latency` / `!metrics`.
+ - `UTTERANCE_IDLE_MS=4500` for long spoken instructions with natural pauses.
+ - Multi-instance Hermes profile isolation via `vc instance setup <name>` and `HERMES_HOME`.
+
+ ## Pre-release checklist
+
+ ```bash
+ ./scripts/install.sh --yes --no-wizard
+ ./scripts/docker_ubuntu_smoke.sh
+ node --check app-node/main.mjs app-node/agent_adapters.mjs app-node/install_config.mjs scripts/install.mjs
+ npm test
+ PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 python3 -m pytest tests/ -q || [ $? -eq 5 ]
+ bash -n run.sh scripts/install.sh scripts/bootstrap_prereqs.sh scripts/docker_ubuntu_smoke.sh
+ npm pack --dry-run
+ vc doctor
+ git diff --check
+ ```
+
+ Manual smoke test:
+
+ 1. Start the bridge with `vc start` or `./run.sh`.
+ 2. Verify `Logged in as <bot-name>`.
+ 3. Verify `Listening in voice channel ...`.
+ 4. In Discord, run `!ping`.
+ 5. Say a short Korean request in voice.
+ 6. Verify STT transcript, agent response, TTS playback, and barge-in.
+
+ ## Known requirements
+
+ - macOS with Homebrew, or Linux with `apt`, `dnf`, or `pacman`.
+ - `ffmpeg`.
+ - `whisper-cli`.
+ - `models/ggml-small-q5_1.bin`.
+ - Edge TTS CLI or `.venv-tts/bin/edge-tts`.
+ - Discord bot token in `.env`, `instances/<name>.env`, `~/.zshrc`, or runtime env.
+ - Selected CLI harness installed and authenticated.
+
+ ## Not for public release yet
+
+ Consider adding GitHub Actions CI, demo video/GIF, Discord bot setup screenshots, broader real Linux validation, and security review of logging paths.
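
The checklist's pytest line ends with `|| [ $? -eq 5 ]`. pytest exits with status 5 when it collects no tests, so the suffix treats "no Python tests present" as a pass while still failing on real test errors. The same idiom can be written as a small reusable function, sketched here. The name `allow_exit_code` is hypothetical and does not appear in the package.

```bash
#!/usr/bin/env bash
# Run a command and treat one additional exit status as success.
# Usage: allow_exit_code <code> <command> [args...]
allow_exit_code() {
  local allowed="$1"
  shift
  local status=0
  "$@" || status=$?   # capture the status without tripping `set -e`
  if [ "$status" -eq 0 ] || [ "$status" -eq "$allowed" ]; then
    return 0
  fi
  return "$status"
}

# Example call (commented out), equivalent to the checklist line:
# PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 allow_exit_code 5 python3 -m pytest tests/ -q
```

Any other non-zero status is passed through unchanged, so a failing test run still fails the checklist.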
@@ -0,0 +1,58 @@
+ # VerbalCoding Notes de version
+
+ ## Current release candidate
+
+ VerbalCoding is a Discord voice bridge for controlling CLI-based coding agents by voice. macOS / Apple Silicon is the most tested path; Linux bootstrap is best-effort for common package managers.
+
+ ## Included
+
+ - Discord voice receive via Node `@discordjs/voice`.
+ - Local Korean STT via `whisper.cpp` + Metal.
+ - Edge TTS playback with Korean default voice.
+ - Generic CLI harness adapter layer: Hermes Agent, Claude Code, Codex CLI, Gemini CLI, OpenCode, OpenClaw, or custom command.
+ - Shared voice/text session support for Hermes backend.
+ - Long-answer TTS chunking and responsive barge-in.
+ - Diff/code/log guardrails so large technical output is not read aloud.
+ - Normal and conservative sensitivity modes.
+ - Setup wizard, `.env.example`, `vc doctor`, `./scripts/install.sh --yes`, and npm install path.
+ - `npm install -g verbalcoding`, `vc setup --yes`, and `vc start`.
+ - Verbose progress mode, JSONL latency metrics, and `!latency` / `!metrics`.
+ - `UTTERANCE_IDLE_MS=4500` for long spoken instructions with natural pauses.
+ - Multi-instance Hermes profile isolation via `vc instance setup <name>` and `HERMES_HOME`.
+
+ ## Pre-release checklist
+
+ ```bash
+ ./scripts/install.sh --yes --no-wizard
+ ./scripts/docker_ubuntu_smoke.sh
+ node --check app-node/main.mjs app-node/agent_adapters.mjs app-node/install_config.mjs scripts/install.mjs
+ npm test
+ PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 python3 -m pytest tests/ -q || [ $? -eq 5 ]
+ bash -n run.sh scripts/install.sh scripts/bootstrap_prereqs.sh scripts/docker_ubuntu_smoke.sh
+ npm pack --dry-run
+ vc doctor
+ git diff --check
+ ```
+
+ Manual smoke test:
+
+ 1. Start the bridge with `vc start` or `./run.sh`.
+ 2. Verify `Logged in as <bot-name>`.
+ 3. Verify `Listening in voice channel ...`.
+ 4. In Discord, run `!ping`.
+ 5. Say a short Korean request in voice.
+ 6. Verify STT transcript, agent response, TTS playback, and barge-in.
+
+ ## Known requirements
+
+ - macOS with Homebrew, or Linux with `apt`, `dnf`, or `pacman`.
+ - `ffmpeg`.
+ - `whisper-cli`.
+ - `models/ggml-small-q5_1.bin`.
+ - Edge TTS CLI or `.venv-tts/bin/edge-tts`.
+ - Discord bot token in `.env`, `instances/<name>.env`, `~/.zshrc`, or runtime env.
+ - Selected CLI harness installed and authenticated.
+
+ ## Not for public release yet
+
+ Consider adding GitHub Actions CI, demo video/GIF, Discord bot setup screenshots, broader real Linux validation, and security review of logging paths.
@@ -0,0 +1,58 @@
+ # VerbalCoding リリースノート
+
+ ## Current release candidate
+
+ VerbalCoding is a Discord voice bridge for controlling CLI-based coding agents by voice. macOS / Apple Silicon is the most tested path; Linux bootstrap is best-effort for common package managers.
+
+ ## Included
+
+ - Discord voice receive via Node `@discordjs/voice`.
+ - Local Korean STT via `whisper.cpp` + Metal.
+ - Edge TTS playback with Korean default voice.
+ - Generic CLI harness adapter layer: Hermes Agent, Claude Code, Codex CLI, Gemini CLI, OpenCode, OpenClaw, or custom command.
+ - Shared voice/text session support for Hermes backend.
+ - Long-answer TTS chunking and responsive barge-in.
+ - Diff/code/log guardrails so large technical output is not read aloud.
+ - Normal and conservative sensitivity modes.
+ - Setup wizard, `.env.example`, `vc doctor`, `./scripts/install.sh --yes`, and npm install path.
+ - `npm install -g verbalcoding`, `vc setup --yes`, and `vc start`.
+ - Verbose progress mode, JSONL latency metrics, and `!latency` / `!metrics`.
+ - `UTTERANCE_IDLE_MS=4500` for long spoken instructions with natural pauses.
+ - Multi-instance Hermes profile isolation via `vc instance setup <name>` and `HERMES_HOME`.
+
+ ## Pre-release checklist
+
+ ```bash
+ ./scripts/install.sh --yes --no-wizard
+ ./scripts/docker_ubuntu_smoke.sh
+ node --check app-node/main.mjs app-node/agent_adapters.mjs app-node/install_config.mjs scripts/install.mjs
+ npm test
+ PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 python3 -m pytest tests/ -q || [ $? -eq 5 ]
+ bash -n run.sh scripts/install.sh scripts/bootstrap_prereqs.sh scripts/docker_ubuntu_smoke.sh
+ npm pack --dry-run
+ vc doctor
+ git diff --check
+ ```
+
+ Manual smoke test:
+
+ 1. Start the bridge with `vc start` or `./run.sh`.
+ 2. Verify `Logged in as <bot-name>`.
+ 3. Verify `Listening in voice channel ...`.
+ 4. In Discord, run `!ping`.
+ 5. Say a short Korean request in voice.
+ 6. Verify STT transcript, agent response, TTS playback, and barge-in.
+
+ ## Known requirements
+
+ - macOS with Homebrew, or Linux with `apt`, `dnf`, or `pacman`.
+ - `ffmpeg`.
+ - `whisper-cli`.
+ - `models/ggml-small-q5_1.bin`.
+ - Edge TTS CLI or `.venv-tts/bin/edge-tts`.
+ - Discord bot token in `.env`, `instances/<name>.env`, `~/.zshrc`, or runtime env.
+ - Selected CLI harness installed and authenticated.
+
+ ## Not for public release yet
+
+ Consider adding GitHub Actions CI, demo video/GIF, Discord bot setup screenshots, broader real Linux validation, and security review of logging paths.
@@ -1,72 +1,58 @@
  # VerbalCoding 릴리스 노트
 
- ## 현재 릴리스 후보
+ ## Current release candidate
 
- VerbalCoding은 음성으로 CLI 기반 코딩 에이전트를 제어하기 위한 Discord 음성 브릿지입니다. 공개 릴리스를 지향하며, macOS / Apple Silicon 경로가 가장 많이 테스트되어 있고, 일반적인 Linux 패키지 매니저에 대해서는 best-effort 부트스트랩을 제공합니다.
+ VerbalCoding is a Discord voice bridge for controlling CLI-based coding agents by voice. macOS / Apple Silicon is the most tested path; Linux bootstrap is best-effort for common package managers.
 
- ### 포함된 기능
+ ## Included
 
- Node `@discordjs/voice` 기반 Discord 음성 수신.
- `whisper.cpp` + Metal 기반 로컬 한국어 STT.
- 한국어 기본 음성을 사용하는 Edge TTS 재생.
- 범용 CLI 하네스 어댑터 레이어:
- Hermes Agent
- Claude Code
- Codex CLI
- Gemini CLI
- OpenCode
- OpenClaw
- custom command
- Hermes 백엔드의 음성/텍스트 공유 세션 지원.
- 답변 TTS chunking과 반응형 barge-in.
- 큰 diff/code/log 출력이 음성으로 읽히지 않도록 하는 guardrail.
- 실내와 noisy/outdoor 환경을 위한 normal/conservative 감도 모드.
- 설정 마법사, `.env.example`, `vc doctor` prerequisite checker, OS 패키지/npm 의존성/Edge TTS helper/기본 whisper.cpp 모델을 준비하는 `./scripts/install.sh --yes` 부트스트랩.
- 긴 에이전트 작업 중 텍스트 전용 중간 단계 업데이트를 위한 선택적 verbose progress mode.
- 파이프라인 최적화를 위한 JSONL latency metrics와 `!latency` / `!metrics` 요약.
- 더 여유 있는 utterance idle wait (`UTTERANCE_IDLE_MS=4500`)로 자연스러운 중간 멈춤이 있는 긴 지시가 앞부분 prompt와 무시되는 processing-time speech로 쪼개지지 않도록 개선.
- 멀티 인스턴스 Hermes 프로필 격리: `vc instance setup <name>`이 자동으로 Hermes 프로필을 `~/.hermes/profiles/<name>`에 clone하고, instance workdir을 설정하고, SOUL.md를 초기화하고, instance env에 `HERMES_HOME`을 기록합니다. `vc instance start`는 누락된 profile을 self-heal하고, `vc doctor`는 profile-dir 존재와 `terminal.cwd` 일관성을 검사합니다.
- npm 공개 패키지: `npm install -g verbalcoding`, `vc setup --yes`, `vc start` 경로 지원.
+ - Discord voice receive via Node `@discordjs/voice`.
+ - Local Korean STT via `whisper.cpp` + Metal.
+ - Edge TTS playback with Korean default voice.
+ - Generic CLI harness adapter layer: Hermes Agent, Claude Code, Codex CLI, Gemini CLI, OpenCode, OpenClaw, or custom command.
+ - Shared voice/text session support for Hermes backend.
+ - Long-answer TTS chunking and responsive barge-in.
+ - Diff/code/log guardrails so large technical output is not read aloud.
+ - Normal and conservative sensitivity modes.
+ - Setup wizard, `.env.example`, `vc doctor`, `./scripts/install.sh --yes`, and npm install path.
+ - `npm install -g verbalcoding`, `vc setup --yes`, and `vc start`.
+ - Verbose progress mode, JSONL latency metrics, and `!latency` / `!metrics`.
+ - `UTTERANCE_IDLE_MS=4500` for long spoken instructions with natural pauses.
+ - Multi-instance Hermes profile isolation via `vc instance setup <name>` and `HERMES_HOME`.
 
- ### 릴리스 전 체크리스트
-
- 저장소 루트에서 실행:
+ ## Pre-release checklist
 
  ```bash
  ./scripts/install.sh --yes --no-wizard
- ./scripts/docker_ubuntu_smoke.sh # Docker 필요; ubuntu:24.04 clean install 검증
+ ./scripts/docker_ubuntu_smoke.sh
  node --check app-node/main.mjs app-node/agent_adapters.mjs app-node/install_config.mjs scripts/install.mjs
  npm test
- PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 python3 -m pytest tests/ -q || [ $? -eq 5 ] # Python 테스트가 없으면 exit 5 허용
+ PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 python3 -m pytest tests/ -q || [ $? -eq 5 ]
  bash -n run.sh scripts/install.sh scripts/bootstrap_prereqs.sh scripts/docker_ubuntu_smoke.sh
  npm pack --dry-run
  vc doctor
  git diff --check
  ```
 
- 수동 스모크 테스트:
+ Manual smoke test:
 
- 1. `vc start` 또는 `./run.sh`로 브릿지를 시작합니다.
- 2. 로그에 `Logged in as <bot-name>`이 있는지 확인합니다.
- 3. 로그에 `Listening in voice channel ... / 일반` 또는 설정된 기본 채널이 있는지 확인합니다.
- 4. Discord에서 `!ping`을 실행합니다.
- 5. Discord 음성에서 짧은 한국어 요청을 말합니다.
- 6. STT transcript, agent response, TTS playback, barge-in 동작을 확인합니다.
+ 1. Start the bridge with `vc start` or `./run.sh`.
+ 2. Verify `Logged in as <bot-name>`.
+ 3. Verify `Listening in voice channel ...`.
+ 4. In Discord, run `!ping`.
+ 5. Say a short Korean request in voice.
+ 6. Verify STT transcript, agent response, TTS playback, and barge-in.
 
- ### 알려진 요구 사항
+ ## Known requirements
 
- macOS + Homebrew 또는 Linux + `apt`, `dnf`, `pacman` best-effort bootstrap.
- `ffmpeg`; 설치기가 설치를 시도합니다.
- `whisper-cli`; macOS에서는 Homebrew를 사용하고, Linux에서는 로컬 `vendor/whisper.cpp` 빌드 fallback을 사용합니다.
- 기본 모델 `models/ggml-small-q5_1.bin`; `--skip-model`을 쓰지 않으면 설치기가 다운로드합니다.
- PATH의 Edge TTS CLI 또는 로컬 `.venv-tts/bin/edge-tts`; 필요하면 설치기가 로컬 helper를 만듭니다.
- `.env`, `instances/<name>.env`, `~/.zshrc`, runtime env 중 하나에 Discord bot token.
- 선택한 CLI 하네스가 설치되고 인증되어 있어야 합니다.
+ - macOS with Homebrew, or Linux with `apt`, `dnf`, or `pacman`.
+ - `ffmpeg`.
+ - `whisper-cli`.
+ - `models/ggml-small-q5_1.bin`.
+ - Edge TTS CLI or `.venv-tts/bin/edge-tts`.
+ - Discord bot token in `.env`, `instances/<name>.env`, `~/.zshrc`, or runtime env.
+ - Selected CLI harness installed and authenticated.
 
- ### 아직 public release 전에 보강하면 좋은 것
+ ## Not for public release yet
 
- GitHub Actions CI.
- Demo video / GIF.
- Discord bot setup screenshots.
- 스크립트 수준 검증을 넘어 실제 여러 Linux 배포판에서 더 넓은 검증.
- 모든 logging path 보안 리뷰.
+ Consider adding GitHub Actions CI, demo video/GIF, Discord bot setup screenshots, broader real Linux validation, and security review of logging paths.
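
Steps 2 and 3 of the manual smoke test look for two startup messages. If the bridge output has been captured to a file, that part of the check could be scripted roughly as follows. This is a sketch under assumptions: the function name `startup_log_ok` and the idea of a saved log file are hypothetical, and the diff does not say where VerbalCoding writes its logs.

```bash
#!/usr/bin/env bash
# Succeed only if the given file contains both startup messages named in
# the manual smoke test. The log file path is supplied by the caller.
startup_log_ok() {
  local log_file="$1"
  grep -q 'Logged in as ' "$log_file" &&
    grep -q 'Listening in voice channel' "$log_file"
}

# Example call (commented out; the path is a placeholder):
# startup_log_ok /path/to/captured-bridge.log && echo "startup messages found"
```

The remaining steps (`!ping`, a spoken request, TTS playback, barge-in) still need a person in the Discord voice channel.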