verbalcoding 0.2.7 → 0.2.8

Files changed (44)
  1. package/README.md +12 -27
  2. package/app-node/cli_install.test.mjs +15 -0
  3. package/docs/FRESH_INSTALL.md +8 -2
  4. package/docs/assets/figures/verbalcoding-flow.svg +45 -30
  5. package/docs/i18n/CONFIGURATION.es.md +138 -49
  6. package/docs/i18n/CONFIGURATION.fr.md +138 -49
  7. package/docs/i18n/CONFIGURATION.ja.md +137 -48
  8. package/docs/i18n/CONFIGURATION.ko.md +137 -48
  9. package/docs/i18n/CONFIGURATION.ru.md +138 -49
  10. package/docs/i18n/CONFIGURATION.zh.md +137 -48
  11. package/docs/i18n/FRESH_INSTALL.es.md +115 -32
  12. package/docs/i18n/FRESH_INSTALL.fr.md +115 -32
  13. package/docs/i18n/FRESH_INSTALL.ja.md +119 -36
  14. package/docs/i18n/FRESH_INSTALL.ko.md +120 -37
  15. package/docs/i18n/FRESH_INSTALL.ru.md +115 -32
  16. package/docs/i18n/FRESH_INSTALL.zh.md +119 -36
  17. package/docs/i18n/MULTI_INSTANCE.es.md +85 -26
  18. package/docs/i18n/MULTI_INSTANCE.fr.md +85 -26
  19. package/docs/i18n/MULTI_INSTANCE.ja.md +87 -29
  20. package/docs/i18n/MULTI_INSTANCE.ko.md +87 -29
  21. package/docs/i18n/MULTI_INSTANCE.ru.md +84 -26
  22. package/docs/i18n/MULTI_INSTANCE.zh.md +87 -29
  23. package/docs/i18n/README.es.md +109 -45
  24. package/docs/i18n/README.fr.md +109 -45
  25. package/docs/i18n/README.ja.md +109 -45
  26. package/docs/i18n/README.ko.md +108 -45
  27. package/docs/i18n/README.ru.md +109 -45
  28. package/docs/i18n/README.zh.md +108 -45
  29. package/docs/i18n/RELEASE.es.md +53 -37
  30. package/docs/i18n/RELEASE.fr.md +53 -37
  31. package/docs/i18n/RELEASE.ja.md +52 -36
  32. package/docs/i18n/RELEASE.ko.md +52 -36
  33. package/docs/i18n/RELEASE.ru.md +53 -37
  34. package/docs/i18n/RELEASE.zh.md +53 -37
  35. package/docs/i18n/USAGE.es.md +91 -64
  36. package/docs/i18n/USAGE.fr.md +91 -64
  37. package/docs/i18n/USAGE.ja.md +90 -63
  38. package/docs/i18n/USAGE.ko.md +90 -63
  39. package/docs/i18n/USAGE.ru.md +91 -64
  40. package/docs/i18n/USAGE.zh.md +90 -63
  41. package/package.json +1 -1
  42. package/scripts/bootstrap_prereqs.sh +15 -3
  43. package/scripts/cli.mjs +1 -1
  44. package/scripts/doctor.mjs +114 -8
@@ -1,58 +1,74 @@
1
1
  # VerbalCoding 릴리스 노트
2
2
 
3
- ## Current release candidate
3
+ ## 현재 릴리스 후보
4
4
 
5
- VerbalCoding is a Discord voice bridge for controlling CLI-based coding agents by voice. macOS / Apple Silicon is the most tested path; Linux bootstrap is best-effort for common package managers.
5
+ VerbalCoding은 CLI 기반 코딩 에이전트를 음성으로 제어하기 위한 Discord 음성 브리지입니다. 공개 릴리스를 지향하며, macOS / Apple Silicon 경로가 가장 많이 테스트되었고 일반적인 패키지 관리자에 대한 Linux 부트스트랩은 최선 노력으로 지원됩니다.
6
6
 
7
- ## Included
7
+ ### 포함됨
8
8
 
9
- - Discord voice receive via Node `@discordjs/voice`.
10
- - Local Korean STT via `whisper.cpp` + Metal.
11
- - Edge TTS playback with Korean default voice.
12
- - Generic CLI harness adapter layer: Hermes Agent, Claude Code, Codex CLI, Gemini CLI, OpenCode, OpenClaw, or custom command.
13
- - Shared voice/text session support for Hermes backend.
14
- - Long-answer TTS chunking and responsive barge-in.
15
- - Diff/code/log guardrails so large technical output is not read aloud.
16
- - Normal and conservative sensitivity modes.
17
- - Setup wizard, `.env.example`, `vc doctor`, `./scripts/install.sh --yes`, and npm install path.
18
- - `npm install -g verbalcoding`, `vc setup --yes`, and `vc start`.
19
- - Verbose progress mode, JSONL latency metrics, and `!latency` / `!metrics`.
20
- - `UTTERANCE_IDLE_MS=4500` for long spoken instructions with natural pauses.
21
- - Multi-instance Hermes profile isolation via `vc instance setup <name>` and `HERMES_HOME`.
9
+ - Node `@discordjs/voice`를 통한 Discord 음성 수신.
10
+ - `whisper.cpp` + Metal을 통한 로컬 한국어 STT.
11
+ - 한국어 기본 음성을 사용하는 Edge TTS 재생.
12
+ - 범용 CLI 하네스 어댑터 계층:
13
+ - Hermes Agent
14
+ - Claude Code
15
+ - Codex CLI
16
+ - Gemini CLI
17
+ - OpenCode
18
+ - OpenClaw
19
+ - 커스텀 명령
20
+ - Hermes 백엔드의 공유 음성/텍스트 세션 지원.
21
+ - 긴 답변 TTS 청킹 및 반응형 끼어들기.
22
+ - 큰 기술 출력이 소리 내어 읽히지 않도록 하는 diff/code/log 안전장치.
23
+ - 실내와 시끄러운/실외 사용을 위한 일반 및 보수적 감도 모드.
24
+ - OS 패키지, npm 의존성, Edge TTS 헬퍼, 기본 whisper.cpp 모델을 위한 설정 마법사, `.env.example`, `vc doctor` 필수 조건 점검기, `./scripts/install.sh --yes` 부트스트랩.
25
+ - npm 패키지 설치 경로: `npm install -g verbalcoding`, `vc setup --yes`, `vc start`.
26
+ - 긴 에이전트 작업 중 텍스트 전용 중간 단계 업데이트를 위한 선택적 자세한 진행 모드.
27
+ - 파이프라인 최적화를 위한 항상 켜진 JSONL 지연 시간 지표와 `!latency` / `!metrics` 요약.
28
+ - 더 여유 있는 발화 유휴 대기(`UTTERANCE_IDLE_MS=4500`)로, 자연스러운 멈춤이 있는 긴 음성 지시가 부분 프롬프트와 무시되는 처리 중 발화로 나뉘지 않도록 함.
29
+ - Multi-instance Hermes 프로필 격리: `vc instance setup <name>`은 인스턴스 workdir를 가진 Hermes 프로필을 `~/.hermes/profiles/<name>`에 자동 복제하고, SOUL.md를 초기화하며, 인스턴스 env에 `HERMES_HOME`을 작성하여 프로젝트별 메모리와 skill을 분리합니다. `vc instance start`는 누락된 프로필을 자가 복구하고, `vc doctor`는 프로필 디렉터리 존재와 `terminal.cwd` 일관성을 확인합니다.
22
30
 
23
- ## Pre-release checklist
31
+ ### 사전 릴리스 체크리스트
32
+
33
+ 저장소 루트에서 실행하세요:
24
34
 
25
35
  ```bash
26
36
  ./scripts/install.sh --yes --no-wizard
27
- ./scripts/docker_ubuntu_smoke.sh
37
+ ./scripts/docker_ubuntu_smoke.sh # Docker 필요; ubuntu:24.04 깨끗한 설치 검증
28
38
  node --check app-node/main.mjs app-node/agent_adapters.mjs app-node/install_config.mjs scripts/install.mjs
29
39
  npm test
30
- PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 python3 -m pytest tests/ -q || [ $? -eq 5 ]
40
+ PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 python3 -m pytest tests/ -q || [ $? -eq 5 ] # Python 테스트가 없으면 OK
31
41
  bash -n run.sh scripts/install.sh scripts/bootstrap_prereqs.sh scripts/docker_ubuntu_smoke.sh
32
42
  npm pack --dry-run
33
43
  vc doctor
34
44
  git diff --check
35
45
  ```
36
46
 
37
- Manual smoke test:
47
+ 수동 스모크 테스트:
48
+
49
+ 1. `vc start` 또는 `./run.sh`로 브리지를 시작합니다.
50
+ 2. 로그에 `Logged in as <bot-name>`가 포함되는지 확인합니다.
51
+ 3. 로그에 `Listening in voice channel ... / 일반` 또는 설정된 기본 채널이 포함되는지 확인합니다.
52
+ 4. Discord에서 `!ping`을 실행합니다.
53
+ 5. Discord 음성에서 짧은 한국어 요청을 말합니다.
54
+ 6. STT 전사, 에이전트 응답, TTS 재생, 끼어들기 동작을 확인합니다.
38
55
 
39
- 1. Start the bridge with `vc start` or `./run.sh`.
40
- 2. Verify `Logged in as <bot-name>`.
41
- 3. Verify `Listening in voice channel ...`.
42
- 4. In Discord, run `!ping`.
43
- 5. Say a short Korean request in voice.
44
- 6. Verify STT transcript, agent response, TTS playback, and barge-in.
56
+ ### 알려진 요구 사항
45
57
 
46
- ## Known requirements
58
+ - 최선 노력 부트스트랩을 위해 Homebrew가 있는 macOS 또는 `apt`, `dnf`, `pacman`이 있는 Linux.
59
+ - `ffmpeg`; 설치 프로그램이 설치를 시도합니다.
60
+ - `whisper-cli`; 설치 프로그램은 macOS에서 Homebrew를 사용하거나 Linux에서 로컬 `vendor/whisper.cpp` 빌드 폴백을 사용합니다.
61
+ - `models/ggml-small-q5_1.bin`의 기본 모델; `--skip-model`을 사용하지 않으면 설치 프로그램이 다운로드합니다.
62
+ - `PATH`의 Edge TTS CLI 또는 로컬 `.venv-tts/bin/edge-tts`; 필요하면 설치 프로그램이 로컬 헬퍼를 만듭니다.
63
+ - `.env`, `instances/<name>.env`, `~/.zshrc` 또는 런타임 env의 Discord 봇 토큰.
64
+ - 선택한 CLI 하네스가 설치 및 인증되어 있어야 합니다.
47
65
 
48
- - macOS with Homebrew, or Linux with `apt`, `dnf`, or `pacman`.
49
- - `ffmpeg`.
50
- - `whisper-cli`.
51
- - `models/ggml-small-q5_1.bin`.
52
- - Edge TTS CLI or `.venv-tts/bin/edge-tts`.
53
- - Discord bot token in `.env`, `instances/<name>.env`, `~/.zshrc`, or runtime env.
54
- - Selected CLI harness installed and authenticated.
66
+ ### 아직 공개 릴리스용은 아님
55
67
 
56
- ## Not for public release yet
68
+ 공개 릴리스 전에 다음 추가를 고려하세요:
57
69
 
58
- Consider adding GitHub Actions CI, demo video/GIF, Discord bot setup screenshots, broader real Linux validation, and security review of logging paths.
70
+ - GitHub Actions CI.
71
+ - 데모 비디오 / GIF.
72
+ - Discord 봇 설정 스크린샷.
73
+ - 스크립트 수준 점검을 넘어 실제 배포판에서 더 넓은 Linux 검증.
74
+ - 모든 로깅 경로의 보안 검토.
@@ -1,58 +1,74 @@
1
- # VerbalCoding Заметки о релизе
1
+ # Заметки о релизе VerbalCoding
2
2
 
3
- ## Current release candidate
3
+ ## Текущий релиз-кандидат
4
4
 
5
- VerbalCoding is a Discord voice bridge for controlling CLI-based coding agents by voice. macOS / Apple Silicon is the most tested path; Linux bootstrap is best-effort for common package managers.
5
+ VerbalCoding это голосовой bridge Discord для управления CLI-агентами кодинга голосом. Он ориентирован на публичный релиз; macOS / Apple Silicon наиболее протестированный путь, а bootstrap-поддержка Linux для распространённых менеджеров пакетов предоставляется по мере возможностей.
6
6
 
7
- ## Included
7
+ ### Включено
8
8
 
9
- - Discord voice receive via Node `@discordjs/voice`.
10
- - Local Korean STT via `whisper.cpp` + Metal.
11
- - Edge TTS playback with Korean default voice.
12
- - Generic CLI harness adapter layer: Hermes Agent, Claude Code, Codex CLI, Gemini CLI, OpenCode, OpenClaw, or custom command.
13
- - Shared voice/text session support for Hermes backend.
14
- - Long-answer TTS chunking and responsive barge-in.
15
- - Diff/code/log guardrails so large technical output is not read aloud.
16
- - Normal and conservative sensitivity modes.
17
- - Setup wizard, `.env.example`, `vc doctor`, `./scripts/install.sh --yes`, and npm install path.
18
- - `npm install -g verbalcoding`, `vc setup --yes`, and `vc start`.
19
- - Verbose progress mode, JSONL latency metrics, and `!latency` / `!metrics`.
20
- - `UTTERANCE_IDLE_MS=4500` for long spoken instructions with natural pauses.
21
- - Multi-instance Hermes profile isolation via `vc instance setup <name>` and `HERMES_HOME`.
9
+ - Приём голоса Discord через Node `@discordjs/voice`.
10
+ - Локальный корейский STT через `whisper.cpp` + Metal.
11
+ - Воспроизведение Edge TTS с корейским голосом по умолчанию.
12
+ - Универсальный слой адаптеров CLI-харнесов:
13
+ - Hermes Agent
14
+ - Claude Code
15
+ - Codex CLI
16
+ - Gemini CLI
17
+ - OpenCode
18
+ - OpenClaw
19
+ - пользовательская команда
20
+ - Поддержка общей голосовой/текстовой сессии для бэкенда Hermes.
21
+ - Разбиение длинных TTS-ответов на фрагменты и отзывчивое перебивание.
22
+ - Защитные ограничения для diff/code/log, чтобы большой технический вывод не читался вслух.
23
+ - Обычный и консервативный режимы чувствительности для помещений по сравнению с шумным/уличным использованием.
24
+ - Мастер настройки, `.env.example`, проверка prerequisites через `vc doctor` и bootstrap `./scripts/install.sh --yes` для пакетов ОС, npm-зависимостей, помощника Edge TTS и стандартной модели whisper.cpp.
25
+ - Путь установки npm-пакета: `npm install -g verbalcoding`, `vc setup --yes` и `vc start`.
26
+ - Необязательный режим подробного прогресса для текстовых обновлений промежуточных шагов во время долгой работы агента.
27
+ - Постоянные JSONL-метрики задержки плюс сводка `!latency` / `!metrics` для оптимизации pipeline.
28
+ - Более терпеливое ожидание бездействия реплики (`UTTERANCE_IDLE_MS=4500`), чтобы длинные голосовые инструкции с естественными паузами не разделялись на частичный prompt плюс игнорируемую речь во время обработки.
29
+ - Изоляция профилей Hermes для нескольких экземпляров: `vc instance setup <name>` автоматически клонирует профиль Hermes в `~/.hermes/profiles/<name>` с workdir экземпляра, заполняет SOUL.md и записывает `HERMES_HOME` в env экземпляра, чтобы память и skills проектов оставались разделёнными; `vc instance start` самовосстанавливает отсутствующий профиль, а `vc doctor` проверяет наличие директории профиля и согласованность `terminal.cwd`.
22
30
 
23
- ## Pre-release checklist
31
+ ### Чеклист перед релизом
32
+
33
+ Запускайте из корня репозитория:
24
34
 
25
35
  ```bash
26
36
  ./scripts/install.sh --yes --no-wizard
27
- ./scripts/docker_ubuntu_smoke.sh
37
+ ./scripts/docker_ubuntu_smoke.sh # requires Docker; validates ubuntu:24.04 clean install
28
38
  node --check app-node/main.mjs app-node/agent_adapters.mjs app-node/install_config.mjs scripts/install.mjs
29
39
  npm test
30
- PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 python3 -m pytest tests/ -q || [ $? -eq 5 ]
40
+ PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 python3 -m pytest tests/ -q || [ $? -eq 5 ] # ok when no Python tests exist
31
41
  bash -n run.sh scripts/install.sh scripts/bootstrap_prereqs.sh scripts/docker_ubuntu_smoke.sh
32
42
  npm pack --dry-run
33
43
  vc doctor
34
44
  git diff --check
35
45
  ```
36
46
 
37
- Manual smoke test:
47
+ Ручной smoke-тест:
48
+
49
+ 1. Запустите bridge через `vc start` или `./run.sh`.
50
+ 2. Проверьте, что лог содержит `Logged in as <bot-name>`.
51
+ 3. Проверьте, что лог содержит `Listening in voice channel ... / 일반` или настроенный канал по умолчанию.
52
+ 4. В Discord выполните `!ping`.
53
+ 5. В голосе Discord произнесите короткий корейский запрос.
54
+ 6. Проверьте STT-расшифровку, ответ агента, воспроизведение TTS и поведение перебивания.
38
55
 
39
- 1. Start the bridge with `vc start` or `./run.sh`.
40
- 2. Verify `Logged in as <bot-name>`.
41
- 3. Verify `Listening in voice channel ...`.
42
- 4. In Discord, run `!ping`.
43
- 5. Say a short Korean request in voice.
44
- 6. Verify STT transcript, agent response, TTS playback, and barge-in.
56
+ ### Известные требования
45
57
 
46
- ## Known requirements
58
+ - macOS с Homebrew или Linux с `apt`, `dnf` либо `pacman` для best-effort bootstrap.
59
+ - `ffmpeg`; установщик пытается установить его.
60
+ - `whisper-cli`; установщик использует Homebrew на macOS или резервную локальную сборку `vendor/whisper.cpp` на Linux.
61
+ - Модель по умолчанию в `models/ggml-small-q5_1.bin`; установщик загружает её, если не используется `--skip-model`.
62
+ - Edge TTS CLI в `PATH` или локальный `.venv-tts/bin/edge-tts`; установщик создаёт локальный помощник при необходимости.
63
+ - Токен Discord-бота в `.env`, `instances/<name>.env`, `~/.zshrc` или runtime env.
64
+ - Выбранный CLI-харнес установлен и аутентифицирован.
47
65
 
48
- - macOS with Homebrew, or Linux with `apt`, `dnf`, or `pacman`.
49
- - `ffmpeg`.
50
- - `whisper-cli`.
51
- - `models/ggml-small-q5_1.bin`.
52
- - Edge TTS CLI or `.venv-tts/bin/edge-tts`.
53
- - Discord bot token in `.env`, `instances/<name>.env`, `~/.zshrc`, or runtime env.
54
- - Selected CLI harness installed and authenticated.
66
+ ### Пока не для публичного релиза
55
67
 
56
- ## Not for public release yet
68
+ Перед публичным релизом стоит добавить:
57
69
 
58
- Consider adding GitHub Actions CI, demo video/GIF, Discord bot setup screenshots, broader real Linux validation, and security review of logging paths.
70
+ - GitHub Actions CI.
71
+ - Демо-видео / GIF.
72
+ - Скриншоты настройки Discord-бота.
73
+ - Более широкую проверку Linux на реальных дистрибутивах сверх проверок на уровне скриптов.
74
+ - Аудит безопасности всех путей логирования.
@@ -1,58 +1,74 @@
1
- # VerbalCoding 发布说明
1
+ # VerbalCoding 发行说明
2
2
 
3
- ## Current release candidate
3
+ ## 当前候选版本
4
4
 
5
- VerbalCoding is a Discord voice bridge for controlling CLI-based coding agents by voice. macOS / Apple Silicon is the most tested path; Linux bootstrap is best-effort for common package managers.
5
+ VerbalCoding 是一个 Discord 语音桥接,用于通过语音控制基于 CLI 的编码代理。它面向公开发布,macOS / Apple Silicon 是测试最多的路径,并为常见包管理器提供尽力支持的 Linux 引导。
6
6
 
7
- ## Included
7
+ ### 已包含
8
8
 
9
- - Discord voice receive via Node `@discordjs/voice`.
10
- - Local Korean STT via `whisper.cpp` + Metal.
11
- - Edge TTS playback with Korean default voice.
12
- - Generic CLI harness adapter layer: Hermes Agent, Claude Code, Codex CLI, Gemini CLI, OpenCode, OpenClaw, or custom command.
13
- - Shared voice/text session support for Hermes backend.
14
- - Long-answer TTS chunking and responsive barge-in.
15
- - Diff/code/log guardrails so large technical output is not read aloud.
16
- - Normal and conservative sensitivity modes.
17
- - Setup wizard, `.env.example`, `vc doctor`, `./scripts/install.sh --yes`, and npm install path.
18
- - `npm install -g verbalcoding`, `vc setup --yes`, and `vc start`.
19
- - Verbose progress mode, JSONL latency metrics, and `!latency` / `!metrics`.
20
- - `UTTERANCE_IDLE_MS=4500` for long spoken instructions with natural pauses.
21
- - Multi-instance Hermes profile isolation via `vc instance setup <name>` and `HERMES_HOME`.
9
+ - 通过 Node `@discordjs/voice` 接收 Discord 语音。
10
+ - 通过 `whisper.cpp` + Metal 进行本地韩语 STT。
11
+ - 使用韩语默认声音进行 Edge TTS 播放。
12
+ - 通用 CLI 驱动适配器层:
13
+ - Hermes Agent
14
+ - Claude Code
15
+ - Codex CLI
16
+ - Gemini CLI
17
+ - OpenCode
18
+ - OpenClaw
19
+ - 自定义命令
20
+ - Hermes 后端的共享语音/文本会话支持。
21
+ - 长答案 TTS 分块和响应式插话。
22
+ - Diff/code/log 保护机制,避免大段技术输出被朗读。
23
+ - 面向室内与嘈杂/户外使用的普通和保守灵敏度模式。
24
+ - 设置向导、`.env.example`、`vc doctor` 前置条件检查器,以及用于 OS 软件包、npm 依赖、Edge TTS 辅助环境和默认 whisper.cpp 模型的 `./scripts/install.sh --yes` 引导。
25
+ - npm 包安装路径:`npm install -g verbalcoding`、`vc setup --yes` 和 `vc start`。
26
+ - 可选详细进度模式,在长时间代理工作期间提供仅文本的中间步骤更新。
27
+ - 常开 JSONL 延迟指标,以及用于流水线优化的 `!latency` / `!metrics` 摘要。
28
+ - 更耐心的发言空闲等待(`UTTERANCE_IDLE_MS=4500`),使带自然停顿的长口述指令不会被拆成部分提示加被忽略的处理期间语音。
29
+ - 多实例 Hermes profile 隔离:`vc instance setup <name>` 会自动将 Hermes profile 克隆到 `~/.hermes/profiles/<name>`,设置实例 workdir,初始化 SOUL.md,并将 `HERMES_HOME` 写入实例 env,使每项目记忆和 skills 保持分离;`vc instance start` 会自愈缺失 profile,`vc doctor` 会检查 profile 目录存在性和 `terminal.cwd` 一致性。
22
30
 
23
- ## Pre-release checklist
31
+ ### 预发布检查清单
32
+
33
+ 从仓库根目录运行:
24
34
 
25
35
  ```bash
26
36
  ./scripts/install.sh --yes --no-wizard
27
- ./scripts/docker_ubuntu_smoke.sh
37
+ ./scripts/docker_ubuntu_smoke.sh # 需要 Docker;验证 ubuntu:24.04 干净安装
28
38
  node --check app-node/main.mjs app-node/agent_adapters.mjs app-node/install_config.mjs scripts/install.mjs
29
39
  npm test
30
- PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 python3 -m pytest tests/ -q || [ $? -eq 5 ]
40
+ PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 python3 -m pytest tests/ -q || [ $? -eq 5 ] # 没有 Python 测试时也可以
31
41
  bash -n run.sh scripts/install.sh scripts/bootstrap_prereqs.sh scripts/docker_ubuntu_smoke.sh
32
42
  npm pack --dry-run
33
43
  vc doctor
34
44
  git diff --check
35
45
  ```
36
46
 
37
- Manual smoke test:
47
+ 手动冒烟测试:
48
+
49
+ 1. 使用 `vc start` 或 `./run.sh` 启动桥接。
50
+ 2. 验证日志包含 `Logged in as <bot-name>`。
51
+ 3. 验证日志包含 `Listening in voice channel ... / 일반` 或已配置的默认频道。
52
+ 4. 在 Discord 中运行 `!ping`。
53
+ 5. 在 Discord 语音中说一个简短韩语请求。
54
+ 6. 验证 STT 转写、代理响应、TTS 播放和插话行为。
38
55
 
39
- 1. Start the bridge with `vc start` or `./run.sh`.
40
- 2. Verify `Logged in as <bot-name>`.
41
- 3. Verify `Listening in voice channel ...`.
42
- 4. In Discord, run `!ping`.
43
- 5. Say a short Korean request in voice.
44
- 6. Verify STT transcript, agent response, TTS playback, and barge-in.
56
+ ### 已知要求
45
57
 
46
- ## Known requirements
58
+ - macOS + Homebrew,或带 `apt`、`dnf`、`pacman` 的 Linux(用于尽力支持的引导)。
59
+ - `ffmpeg`;安装器会尝试安装它。
60
+ - `whisper-cli`;安装器在 macOS 使用 Homebrew,在 Linux 上回退到本地 `vendor/whisper.cpp` 构建。
61
+ - 默认模型位于 `models/ggml-small-q5_1.bin`;除非使用 `--skip-model`,安装器会下载它。
62
+ - `PATH` 上的 Edge TTS CLI,或本地 `.venv-tts/bin/edge-tts`;安装器会在需要时创建本地辅助环境。
63
+ - `.env`、`instances/<name>.env`、`~/.zshrc` 或运行时 env 中的 Discord 机器人令牌。
64
+ - 已安装并认证的所选 CLI 驱动。
47
65
 
48
- - macOS with Homebrew, or Linux with `apt`, `dnf`, or `pacman`.
49
- - `ffmpeg`.
50
- - `whisper-cli`.
51
- - `models/ggml-small-q5_1.bin`.
52
- - Edge TTS CLI or `.venv-tts/bin/edge-tts`.
53
- - Discord bot token in `.env`, `instances/<name>.env`, `~/.zshrc`, or runtime env.
54
- - Selected CLI harness installed and authenticated.
66
+ ### 尚不适合公开发布
55
67
 
56
- ## Not for public release yet
68
+ 公开发布前,建议添加:
57
69
 
58
- Consider adding GitHub Actions CI, demo video/GIF, Discord bot setup screenshots, broader real Linux validation, and security review of logging paths.
70
+ - GitHub Actions CI。
71
+ - 演示视频 / GIF。
72
+ - Discord 机器人设置截图。
73
+ - 在脚本级检查之外,对真实发行版进行更广泛的 Linux 验证。
74
+ - 对所有日志路径进行安全审查。
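The `UTTERANCE_IDLE_MS=4500` item in the release notes above describes an idle wait that keeps natural pauses inside one spoken instruction. The following is a minimal sketch of that idea, not VerbalCoding's actual implementation: the function name `segmentUtterances` and the input shape (a sorted array of audio-packet timestamps in milliseconds) are assumptions made for illustration.

```javascript
// Hypothetical sketch: group audio-packet timestamps into utterances.
// A new utterance starts only when the silence since the previous packet
// is at least `idleMs` long. Names and input shape are illustrative only.
function segmentUtterances(packetTimesMs, idleMs = 4500) {
  const utterances = [];
  let current = null;
  for (const t of packetTimesMs) {
    if (current !== null && t - current.end < idleMs) {
      // Pause is shorter than the idle window: still the same utterance.
      current.end = t;
    } else {
      // First packet, or the idle window elapsed: close the old utterance.
      if (current !== null) utterances.push(current);
      current = { start: t, end: t };
    }
  }
  if (current !== null) utterances.push(current);
  return utterances;
}

// A 3-second pause (1000 -> 4000) followed by a 7.5-second pause (4500 -> 12000).
const times = [0, 500, 1000, 4000, 4500, 12000];
console.log(segmentUtterances(times, 4500).length); // 2: the 3 s pause stays inside one utterance
console.log(segmentUtterances(times, 2000).length); // 3: the 3 s pause now splits the instruction
```

With the shorter window, the same speech is cut at the 3-second pause. That is the split into a partial prompt plus leftover speech that the longer default is meant to avoid.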
@@ -1,71 +1,78 @@
1
- # VerbalCoding Guía de uso
1
+ # Guía de uso de VerbalCoding
2
2
 
3
- Operational details for Español users.
3
+ Esta página contiene los detalles operativos que antes hacían que el README fuera demasiado largo.
4
4
 
5
- ## CLI Commands
5
+ ## Comandos CLI
6
6
 
7
7
  ```bash
8
- vc status
9
- vc language en
10
- vc language ko
11
- vc language auto
12
- vc restart auto status
13
- vc restart auto on
14
- vc restart auto off
15
- vc bot invite CLIENT_ID
16
- vc instance status
17
- vc instance setup NAME
18
- vc instance start NAME
19
- vc instance stop NAME
20
- vc doctor
21
- npm run mcp
8
+ vc status # show STT language, progress language, and TTS voice
9
+ vc language en # English STT + English progress/TTS voice
10
+ vc language ko # Korean STT + Korean progress/TTS voice
11
+ vc language auto # Whisper auto-detect STT + English progress/TTS voice
12
+ vc restart auto status # show commit-time voice-bot auto-restart setting
13
+ vc restart auto on # enable commit-time voice-bot auto-restart
14
+ vc restart auto off # disable it; this is the default
15
+ vc bot invite CLIENT_ID # print a Discord invite URL with required permissions
16
+ vc instance status # list per-instance bridge configs and process status
17
+ vc instance setup NAME # write instances/NAME.env and create ~/.hermes/profiles/NAME
18
+ vc instance start NAME # start ./run.sh instances/NAME.env detached
19
+ vc instance stop NAME # stop a detached instance and remove its pid file
20
+ vc doctor # run the redacted doctor check
21
+ npm run mcp # run the stdio MCP server
22
22
  ```
23
23
 
24
- Language commands update `.env`; restart with `vc start`, `./run.sh`, or your process manager.
24
+ Los cambios de idioma actualizan `.env`; reinicia el puente con `./run.sh` o tu gestor de procesos para que surtan efecto.
25
+
26
+ ## Modos de ejecución
25
27
 
26
- ## Run Modes
28
+ Puente de instancia única:
27
29
 
28
30
  ```bash
29
- vc start
30
31
  ./run.sh
32
+ ```
33
+
34
+ Puente por instancia usando un entorno local de sobrescritura:
35
+
36
+ ```bash
31
37
  ./run.sh instances/my-project.env
38
+ # or
32
39
  VERBALCODING_INSTANCE_ENV=instances/my-project.env ./run.sh
33
40
  ```
34
41
 
35
- The bot auto-joins the first configured channel name, defaulting to `일반,General,general`.
42
+ El bot se une automáticamente al primer nombre de canal configurado, con valor predeterminado `일반,General,general`.
36
43
 
37
- ## Discord Commands
44
+ ## Comandos de Discord
38
45
 
39
- Before using commands, set up the Discord application/bot:
46
+ Antes de cablear comandos, configura la aplicación/bot de Discord usando las guías originales:
40
47
 
41
- - Hermes Agent Discord guide: <https://hermes-agent.nousresearch.com/docs/user-guide/messaging/discord>
42
- - Discord official bot docs: <https://docs.discord.com/developers/bots/overview>
48
+ - Guía de Discord de Hermes Agent: <https://hermes-agent.nousresearch.com/docs/user-guide/messaging/discord>
49
+ - Documentación oficial de bots de Discord: <https://docs.discord.com/developers/bots/overview>
43
50
 
44
- Then run `vc bot invite CLIENT_ID` for the VerbalCoding permissions.
51
+ Luego usa `vc bot invite CLIENT_ID` para generar la URL de invitación específica de VerbalCoding con permisos de texto y voz.
45
52
 
46
- | Command | Purpose |
53
+ | Comando | Propósito |
47
54
  |---|---|
48
- | `!ping` | Basic bot check |
49
- | `!join` / `!leave` | Join or leave voice |
50
- | `!say <text>` | Speak text directly through TTS |
51
- | `!voice-test <text>` | Test the active TTS backend/voice |
52
- | `!voice-clone capture` | Save the next valid utterance as an OpenVoice reference sample |
53
- | `!voice-clone status` / `!voice-clone cancel` | Inspect or cancel capture |
54
- | `!ask <prompt>` | Send text through the same harness adapter as voice |
55
- | `!session status` | Show current project/default adapter session |
56
- | `!session new <name> <workdir> [context] --voice <voice-channel>` | Create a project-scoped Hermes session |
57
- | `!session attach-voice [sessionName] --voice <voice-channel>` | Bind a text channel/thread to a voice channel |
58
- | `!session list` | List configured project sessions |
59
- | `!session reset` / `!reset-session` | Clear the current session file |
60
- | `!verbose on/off` | Toggle detailed progress updates |
61
- | `!latency` / `!metrics` | Show recent latency summary |
62
- | `!sensitivity normal/conservative` | Switch barge-in sensitivity |
63
-
64
- Voice equivalents such as “외부 모드”, “보수 모드”, “실내”, “기본 감도”, “상세 진행 켜”, and clear stop phrases like “잠깐”, “멈춰”, “그만” are handled by the bridge.
65
-
66
- ## Changing the Voice
67
-
68
- `vc language ko|en|auto` changes STT language, progress language, and the matching default TTS voice together. Live voice commands can change the speaker without restart:
55
+ | `!ping` | Comprobación básica del bot |
56
+ | `!join` / `!leave` | Unirse a voz o salir de voz |
57
+ | `!say <text>` | Decir texto directamente mediante TTS |
58
+ | `!voice-test <text>` | Probar el backend/voz TTS activo |
59
+ | `!voice-clone capture` | Guardar la siguiente emisión válida como muestra de referencia para OpenVoice |
60
+ | `!voice-clone status` / `!voice-clone cancel` | Inspeccionar o cancelar la captura |
61
+ | `!ask <prompt>` | Enviar texto mediante el mismo adaptador de arnés seleccionado que la voz |
62
+ | `!session status` | Mostrar la sesión actual de proyecto/adaptador predeterminado |
63
+ | `!session new <name> <workdir> [context] --voice <voice-channel>` | Crear una sesión Hermes con alcance de proyecto |
64
+ | `!session attach-voice [sessionName] --voice <voice-channel>` | Vincular canal/hilo de texto a un canal de voz |
65
+ | `!session list` | Listar sesiones de proyecto configuradas |
66
+ | `!session reset` / `!reset-session` | Borrar el archivo de sesión actual del proyecto/adaptador predeterminado |
67
+ | `!verbose on/off` | Alternar actualizaciones de progreso detalladas |
68
+ | `!latency` / `!metrics` | Mostrar resumen de latencia reciente |
69
+ | `!sensitivity normal/conservative` | Cambiar sensibilidad de interrupción |
70
+
71
+ Equivalentes de voz como “외부 모드”, “보수 모드”, “실내”, “기본 감도” y frases claras de parada como “잠깐”, “멈춰”, “그만” son gestionados por el puente. También puedes decir “상세 진행 켜” / “상세 진행 꺼” para alternar el progreso detallado por voz.
72
+
73
+ ## Cambiar la voz
74
+
75
+ `vc language ko|en|auto` cambia juntos el idioma STT, el idioma de progreso y la voz TTS predeterminada correspondiente. Si solo quieres cambiar el hablante/voz mientras el puente está en ejecución, dilo en la voz de Discord:
69
76
 
70
77
  ```text
71
78
  남자 한국어 목소리로 바꿔
@@ -74,9 +81,11 @@ change voice to Korean female
74
81
  switch speaker to English
75
82
  ```
76
83
 
77
- Built-in Edge types:
84
+ El puente en vivo reconoce esto como comandos de control por voz, actualiza `config/tts-voices.json`, actualiza el entorno TTS efectivo del proceso en ejecución y responde con una confirmación corta como “목소리를 Korean male로 바꿨어.” Usa `!voice-test <text>` justo después de cambiarla para escuchar el backend y la voz actuales.
85
+
86
+ Tipos de voz Edge integrados:
78
87
 
79
- | Voice type | Edge voice |
88
+ | Tipo de voz | Voz Edge |
80
89
  |---|---|
81
90
  | `korean_male` | `ko-KR-InJoonNeural` |
82
91
  | `korean_female` | `ko-KR-SunHiNeural` |
@@ -84,26 +93,32 @@ Built-in Edge types:
84
93
  | `english_male` | `en-US-GuyNeural` |
85
94
  | `english_female` | `en-US-AriaNeural` |
86
95
 
87
- Backend voice settings:
96
+ Para configuración manual persistente, define `TTS_BACKEND=edge`, `TTS_VOICE_TYPE=<voice-type>` y opcionalmente `TTS_VOICE=<edge-voice>` en `.env`, o edita `config/tts-voices.json` para catálogos de voz personalizados.
88
97
 
89
- | Backend | Voice setting | Common choices |
98
+ Controles de voz específicos de backend:
99
+
100
+ | Backend | Ajuste de voz | Opciones comunes |
90
101
  |---|---|---|
91
- | Edge | `TTS_VOICE_TYPE`, `TTS_VOICE` | Built-in types or any Edge voice from `edge-tts --list-voices` |
92
- | Supertonic | `SUPERTONIC_VOICE` | `M1`–`M5`, `F1`–`F5`; `SUPERTONIC_LANGUAGE=ko|en|es|pt|fr` |
93
- | OpenVoice | `OPENVOICE_REF_AUDIO`, `OPENVOICE_STYLE` | A permitted reference WAV plus style such as `default` |
94
- | SpeechSwift / CosyVoice | `SPEECHSWIFT_REF_AUDIO`, `SPEECHSWIFT_ENGINE`, `SPEECHSWIFT_SPEAKER` | Reference WAV or backend speaker/model values |
102
+ | Edge | `TTS_VOICE_TYPE`, `TTS_VOICE` | `korean_male`, `korean_female`, `korean_multilingual_male`, `english_male`, `english_female`; cualquier voz Edge de `edge-tts --list-voices` |
103
+ | Supertonic | `SUPERTONIC_VOICE` | `M1`–`M5`, `F1`–`F5`; define `SUPERTONIC_LANGUAGE=ko|en|es|pt|fr` |
104
+ | OpenVoice | `OPENVOICE_REF_AUDIO`, `OPENVOICE_STYLE` | una referencia WAV permitida más un estilo como `default` |
105
+ | SpeechSwift / CosyVoice | `SPEECHSWIFT_REF_AUDIO`, `SPEECHSWIFT_ENGINE`, `SPEECHSWIFT_SPEAKER` | WAV de referencia para CosyVoice, o valores de hablante/modelo admitidos por el backend |
106
+
107
+ Para Supertonic y backends locales de clonación, usa las variables de entorno del backend anteriores junto con `!voice-test <text>` para probar cambios. El cambio por comandos de voz actualmente asigna los tipos de voz integrados de estilo Edge; se pueden añadir catálogos de backend más completos en `config/tts-voices.json`.
95
108
 
96
- ## Long Dictation and Pauses
109
+ ## Dictado largo y pausas
97
110
 
98
- The default `UTTERANCE_IDLE_MS=4500` waits long enough to keep natural pauses inside one spoken instruction. Lower it for faster short commands or raise it for long dictation:
111
+ VerbalCoding espera una ventana de inactividad antes de enviar el habla a STT. El valor predeterminado `UTTERANCE_IDLE_MS=4500` es deliberadamente un poco paciente para que una pausa natural en una instrucción larga no divida la oración, inicie un turno de agente demasiado pronto y luego trate el resto como una interrupción durante el procesamiento.
112
+
113
+ Si prefieres comandos cortos más rápidos, bájalo en `.env`; si el dictado largo en coreano aún se divide, súbelo:
99
114
 
100
115
  ```bash
101
116
  UTTERANCE_IDLE_MS="6000"
102
117
  ```
103
118
 
104
- ## Verbose Progress Mode
119
+ ## Modo de progreso detallado
105
120
 
106
- Enable with `!verbose on`, `AGENT_VERBOSE_PROGRESS=1`, or “상세 진행 켜”. Progress lines look like:
121
+ El progreso detallado está desactivado por defecto salvo que `AGENT_VERBOSE_PROGRESS=1` esté definido. Habilítalo con `!verbose on` o con un comando de voz como “상세 진행 켜”. Puede emitir líneas cortas de progreso como:
107
122
 
108
123
  ```text
109
124
  🤖 Hermes Agent 호출 시작
@@ -113,18 +128,28 @@ Enable with `!verbose on`, `AGENT_VERBOSE_PROGRESS=1`, or “상세 진행 켜
113
128
  🤖 Hermes Agent 응답 수신
114
129
  ```
115
130
 
116
- Secret-looking fields are redacted and progress lines are removed from final spoken answers.
131
+ Este modo pide al arnés CLI seleccionado que emita líneas `VERBALCODING_PROGRESS: ...` y resume marcadores comunes de herramientas desde stdout/stderr en streaming cuando están disponibles. Los campos con aspecto de secreto se redactan y las líneas de progreso se eliminan de la respuesta final hablada.
132
+
133
+ ## Métricas de latencia
134
+
135
+ VerbalCoding escribe registros de latencia por turno como JSONL. Ruta predeterminada:
136
+
137
+ ```text
138
+ ./.logs/latency.jsonl
139
+ ```
117
140
 
118
- ## Latency Metrics
141
+ Cada registro incluye estado, tiempo total, tiempo de captura de voz, espera de inactividad de emisión, tiempo STT, tiempo del agente, tiempo de síntesis/reproducción TTS, conteos de fragmentos, longitud de transcripción, longitud de respuesta y niveles de audio cuando están disponibles.
119
142
 
120
- Latency records are written to `./.logs/latency.jsonl`. In Discord, run:
143
+ En Discord:
121
144
 
122
145
  ```text
123
146
  !latency
124
147
  !metrics
125
148
  ```
126
149
 
127
- ## Testing
150
+ El resumen usa los últimos 200 registros: conteo, promedio, p95, máximo y estados no OK.
151
+
152
+ ## Pruebas
128
153
 
129
154
  ```bash
130
155
  node --check app-node/main.mjs
@@ -132,3 +157,5 @@ npm test
132
157
  bash -n run.sh scripts/install.sh
133
158
  vc doctor
134
159
  ```
160
+
161
+ `vc doctor` redacta secretos intencionadamente y solo informa si los valores requeridos están configurados. También comprueba `instances/*.env` en busca de huellas de token duplicadas y rutas de ejecución en conflicto.
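The Métricas de latencia section of the usage guide above says the `!latency` / `!metrics` summary covers the last 200 records and reports count, average, p95, maximum, and non-OK statuses. The sketch below shows one way such a summary could be computed from JSONL text. It is an illustration under stated assumptions, not the package's code: the field names `status` and `total_ms`, the `"ok"` status value, and the nearest-rank p95 method are all hypothetical.

```javascript
// Hypothetical sketch of a latency summary over JSONL records.
// Assumed record shape: {"status": "ok", "total_ms": 1234}; the real
// field names in ./.logs/latency.jsonl may differ.
function summarizeLatency(jsonlText, windowSize = 200) {
  const records = jsonlText
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line))
    .slice(-windowSize); // keep only the most recent records

  const count = records.length;
  if (count === 0) {
    return { count: 0, avg: null, p95: null, max: null, nonOk: {} };
  }

  const totals = records.map((r) => r.total_ms).sort((a, b) => a - b);
  const avg = totals.reduce((sum, v) => sum + v, 0) / count;
  // Nearest-rank percentile (an assumption; other definitions exist).
  const p95 = totals[Math.max(0, Math.ceil(0.95 * count) - 1)];
  const max = totals[count - 1];

  // Count every status other than the assumed "ok" value.
  const nonOk = {};
  for (const r of records) {
    if (r.status !== "ok") nonOk[r.status] = (nonOk[r.status] || 0) + 1;
  }
  return { count, avg, p95, max, nonOk };
}

const sample = [
  '{"status":"ok","total_ms":100}',
  '{"status":"ok","total_ms":200}',
  '{"status":"error","total_ms":300}',
  '{"status":"ok","total_ms":400}',
].join("\n");
console.log(summarizeLatency(sample));
```

Taking the text as a string keeps the function free of file-system calls. A caller would read the log file and pass its contents in.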