verbalcoding 0.2.6 → 0.2.7
- package/README.md +5 -0
- package/docs/i18n/CONFIGURATION.es.md +150 -0
- package/docs/i18n/CONFIGURATION.fr.md +150 -0
- package/docs/i18n/CONFIGURATION.ja.md +150 -0
- package/docs/i18n/CONFIGURATION.ko.md +49 -146
- package/docs/i18n/CONFIGURATION.ru.md +150 -0
- package/docs/i18n/CONFIGURATION.zh.md +150 -0
- package/docs/i18n/FRESH_INSTALL.es.md +124 -0
- package/docs/i18n/FRESH_INSTALL.fr.md +124 -0
- package/docs/i18n/FRESH_INSTALL.ja.md +124 -0
- package/docs/i18n/FRESH_INSTALL.ko.md +37 -114
- package/docs/i18n/FRESH_INSTALL.ru.md +124 -0
- package/docs/i18n/FRESH_INSTALL.zh.md +124 -0
- package/docs/i18n/MULTI_INSTANCE.es.md +121 -0
- package/docs/i18n/MULTI_INSTANCE.fr.md +121 -0
- package/docs/i18n/MULTI_INSTANCE.ja.md +121 -0
- package/docs/i18n/MULTI_INSTANCE.ko.md +28 -86
- package/docs/i18n/MULTI_INSTANCE.ru.md +121 -0
- package/docs/i18n/MULTI_INSTANCE.zh.md +121 -0
- package/docs/i18n/README.es.md +50 -86
- package/docs/i18n/README.fr.md +50 -86
- package/docs/i18n/README.ja.md +50 -86
- package/docs/i18n/README.ko.md +41 -113
- package/docs/i18n/README.ru.md +50 -86
- package/docs/i18n/README.zh.md +50 -86
- package/docs/i18n/RELEASE.es.md +58 -0
- package/docs/i18n/RELEASE.fr.md +58 -0
- package/docs/i18n/RELEASE.ja.md +58 -0
- package/docs/i18n/RELEASE.ko.md +36 -50
- package/docs/i18n/RELEASE.ru.md +58 -0
- package/docs/i18n/RELEASE.zh.md +58 -0
- package/docs/i18n/USAGE.es.md +134 -0
- package/docs/i18n/USAGE.fr.md +134 -0
- package/docs/i18n/USAGE.ja.md +134 -0
- package/docs/i18n/USAGE.ko.md +63 -101
- package/docs/i18n/USAGE.ru.md +134 -0
- package/docs/i18n/USAGE.zh.md +134 -0
- package/package.json +1 -1
@@ -0,0 +1,58 @@
+# VerbalCoding Заметки о релизе
+
+## Current release candidate
+
+VerbalCoding is a Discord voice bridge for controlling CLI-based coding agents by voice. macOS / Apple Silicon is the most tested path; Linux bootstrap is best-effort for common package managers.
+
+## Included
+
+- Discord voice receive via Node `@discordjs/voice`.
+- Local Korean STT via `whisper.cpp` + Metal.
+- Edge TTS playback with Korean default voice.
+- Generic CLI harness adapter layer: Hermes Agent, Claude Code, Codex CLI, Gemini CLI, OpenCode, OpenClaw, or custom command.
+- Shared voice/text session support for Hermes backend.
+- Long-answer TTS chunking and responsive barge-in.
+- Diff/code/log guardrails so large technical output is not read aloud.
+- Normal and conservative sensitivity modes.
+- Setup wizard, `.env.example`, `vc doctor`, `./scripts/install.sh --yes`, and npm install path.
+- `npm install -g verbalcoding`, `vc setup --yes`, and `vc start`.
+- Verbose progress mode, JSONL latency metrics, and `!latency` / `!metrics`.
+- `UTTERANCE_IDLE_MS=4500` for long spoken instructions with natural pauses.
+- Multi-instance Hermes profile isolation via `vc instance setup <name>` and `HERMES_HOME`.
+
+## Pre-release checklist
+
+```bash
+./scripts/install.sh --yes --no-wizard
+./scripts/docker_ubuntu_smoke.sh
+node --check app-node/main.mjs app-node/agent_adapters.mjs app-node/install_config.mjs scripts/install.mjs
+npm test
+PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 python3 -m pytest tests/ -q || [ $? -eq 5 ]
+bash -n run.sh scripts/install.sh scripts/bootstrap_prereqs.sh scripts/docker_ubuntu_smoke.sh
+npm pack --dry-run
+vc doctor
+git diff --check
+```
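The `|| [ $? -eq 5 ]` suffix on the pytest line in the checklist above relies on pytest's exit status 5, which means no tests were collected. The sketch below isolates that idiom in a small function so its behavior is easier to see. The helper name `allow_exit_5` is hypothetical and does not come from the package.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of verbalcoding): run a command and treat
# exit status 5 (pytest: "no tests were collected") as success, while
# still failing on any other non-zero status.
allow_exit_5() {
  "$@" || [ $? -eq 5 ]
}

# Status 5 is tolerated, status 1 is still reported as a failure.
allow_exit_5 sh -c 'exit 5' && echo "status 5 tolerated"
allow_exit_5 sh -c 'exit 1' || echo "status 1 still fails"
```

Used this way, an empty `tests/` directory would not break the checklist, but a real test failure (status 1) still would.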
+
+Manual smoke test:
+
+1. Start the bridge with `vc start` or `./run.sh`.
+2. Verify `Logged in as <bot-name>`.
+3. Verify `Listening in voice channel ...`.
+4. In Discord, run `!ping`.
+5. Say a short Korean request in voice.
+6. Verify STT transcript, agent response, TTS playback, and barge-in.
+
+## Known requirements
+
+- macOS with Homebrew, or Linux with `apt`, `dnf`, or `pacman`.
+- `ffmpeg`.
+- `whisper-cli`.
+- `models/ggml-small-q5_1.bin`.
+- Edge TTS CLI or `.venv-tts/bin/edge-tts`.
+- Discord bot token in `.env`, `instances/<name>.env`, `~/.zshrc`, or runtime env.
+- Selected CLI harness installed and authenticated.
+
+## Not for public release yet
+
+Consider adding GitHub Actions CI, demo video/GIF, Discord bot setup screenshots, broader real Linux validation, and security review of logging paths.
@@ -0,0 +1,58 @@
+# VerbalCoding 发布说明
+
+## Current release candidate
+
+VerbalCoding is a Discord voice bridge for controlling CLI-based coding agents by voice. macOS / Apple Silicon is the most tested path; Linux bootstrap is best-effort for common package managers.
+
+## Included
+
+- Discord voice receive via Node `@discordjs/voice`.
+- Local Korean STT via `whisper.cpp` + Metal.
+- Edge TTS playback with Korean default voice.
+- Generic CLI harness adapter layer: Hermes Agent, Claude Code, Codex CLI, Gemini CLI, OpenCode, OpenClaw, or custom command.
+- Shared voice/text session support for Hermes backend.
+- Long-answer TTS chunking and responsive barge-in.
+- Diff/code/log guardrails so large technical output is not read aloud.
+- Normal and conservative sensitivity modes.
+- Setup wizard, `.env.example`, `vc doctor`, `./scripts/install.sh --yes`, and npm install path.
+- `npm install -g verbalcoding`, `vc setup --yes`, and `vc start`.
+- Verbose progress mode, JSONL latency metrics, and `!latency` / `!metrics`.
+- `UTTERANCE_IDLE_MS=4500` for long spoken instructions with natural pauses.
+- Multi-instance Hermes profile isolation via `vc instance setup <name>` and `HERMES_HOME`.
+
+## Pre-release checklist
+
+```bash
+./scripts/install.sh --yes --no-wizard
+./scripts/docker_ubuntu_smoke.sh
+node --check app-node/main.mjs app-node/agent_adapters.mjs app-node/install_config.mjs scripts/install.mjs
+npm test
+PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 python3 -m pytest tests/ -q || [ $? -eq 5 ]
+bash -n run.sh scripts/install.sh scripts/bootstrap_prereqs.sh scripts/docker_ubuntu_smoke.sh
+npm pack --dry-run
+vc doctor
+git diff --check
+```
+
+Manual smoke test:
+
+1. Start the bridge with `vc start` or `./run.sh`.
+2. Verify `Logged in as <bot-name>`.
+3. Verify `Listening in voice channel ...`.
+4. In Discord, run `!ping`.
+5. Say a short Korean request in voice.
+6. Verify STT transcript, agent response, TTS playback, and barge-in.
+
+## Known requirements
+
+- macOS with Homebrew, or Linux with `apt`, `dnf`, or `pacman`.
+- `ffmpeg`.
+- `whisper-cli`.
+- `models/ggml-small-q5_1.bin`.
+- Edge TTS CLI or `.venv-tts/bin/edge-tts`.
+- Discord bot token in `.env`, `instances/<name>.env`, `~/.zshrc`, or runtime env.
+- Selected CLI harness installed and authenticated.
+
+## Not for public release yet
+
+Consider adding GitHub Actions CI, demo video/GIF, Discord bot setup screenshots, broader real Linux validation, and security review of logging paths.
@@ -0,0 +1,134 @@
+# VerbalCoding Guía de uso
+
+Operational details for Español users.
+
+## CLI Commands
+
+```bash
+vc status
+vc language en
+vc language ko
+vc language auto
+vc restart auto status
+vc restart auto on
+vc restart auto off
+vc bot invite CLIENT_ID
+vc instance status
+vc instance setup NAME
+vc instance start NAME
+vc instance stop NAME
+vc doctor
+npm run mcp
+```
+
+Language commands update `.env`; restart with `vc start`, `./run.sh`, or your process manager.
+
+## Run Modes
+
+```bash
+vc start
+./run.sh
+./run.sh instances/my-project.env
+VERBALCODING_INSTANCE_ENV=instances/my-project.env ./run.sh
+```
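The run modes above select an env file either by positional argument or through `VERBALCODING_INSTANCE_ENV`. The diff does not show how `run.sh` resolves them, so the following is only a sketch of one plausible order: argument first, then the variable, then `.env`. The function name `resolve_env_file` and the precedence itself are assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical resolution order, NOT taken from run.sh:
#   1. first positional argument
#   2. VERBALCODING_INSTANCE_ENV
#   3. the default ".env"
resolve_env_file() {
  if [ -n "${1:-}" ]; then
    printf '%s\n' "$1"
  elif [ -n "${VERBALCODING_INSTANCE_ENV:-}" ]; then
    printf '%s\n' "$VERBALCODING_INSTANCE_ENV"
  else
    printf '%s\n' ".env"
  fi
}

# Prints the path passed as the first argument.
resolve_env_file instances/my-project.env
```

Whatever the real order is, keeping the lookup in a single function like this makes it easy to confirm which file a given instance will load.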
+
+The bot auto-joins the first configured channel name, defaulting to `일반,General,general`.
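One reading of the sentence above is that the configured names form an ordered, comma-separated list and the first name that exists on the server wins. The sketch below illustrates that reading only. The function `pick_channel` is hypothetical, and the bridge's actual matching logic is not shown in this diff.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: return the first configured channel name that also
# appears among the available channel names passed as extra arguments.
pick_channel() {
  local configured=$1; shift
  local name candidate
  local IFS=,                      # split the configured list on commas
  for name in $configured; do
    for candidate in "$@"; do
      if [ "$name" = "$candidate" ]; then
        printf '%s\n' "$name"
        return 0
      fi
    done
  done
  return 1                         # no configured name is available
}

# "일반" is absent here, so the next configured name, "General", is chosen.
pick_channel '일반,General,general' lobby general General
```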
+
+## Discord Commands
+
+Before using commands, set up the Discord application/bot:
+
+- Hermes Agent Discord guide: <https://hermes-agent.nousresearch.com/docs/user-guide/messaging/discord>
+- Discord official bot docs: <https://docs.discord.com/developers/bots/overview>
+
+Then run `vc bot invite CLIENT_ID` for the VerbalCoding permissions.
+
+| Command | Purpose |
+|---|---|
+| `!ping` | Basic bot check |
+| `!join` / `!leave` | Join or leave voice |
+| `!say <text>` | Speak text directly through TTS |
+| `!voice-test <text>` | Test the active TTS backend/voice |
+| `!voice-clone capture` | Save the next valid utterance as an OpenVoice reference sample |
+| `!voice-clone status` / `!voice-clone cancel` | Inspect or cancel capture |
+| `!ask <prompt>` | Send text through the same harness adapter as voice |
+| `!session status` | Show current project/default adapter session |
+| `!session new <name> <workdir> [context] --voice <voice-channel>` | Create a project-scoped Hermes session |
+| `!session attach-voice [sessionName] --voice <voice-channel>` | Bind a text channel/thread to a voice channel |
+| `!session list` | List configured project sessions |
+| `!session reset` / `!reset-session` | Clear the current session file |
+| `!verbose on/off` | Toggle detailed progress updates |
+| `!latency` / `!metrics` | Show recent latency summary |
+| `!sensitivity normal/conservative` | Switch barge-in sensitivity |
+
+Voice equivalents such as “외부 모드”, “보수 모드”, “실내”, “기본 감도”, “상세 진행 켜”, and clear stop phrases like “잠깐”, “멈춰”, “그만” are handled by the bridge.
+
+## Changing the Voice
+
+`vc language ko|en|auto` changes STT language, progress language, and the matching default TTS voice together. Live voice commands can change the speaker without restart:
+
+```text
+남자 한국어 목소리로 바꿔
+여자 한국어 목소리로 바꿔
+change voice to Korean female
+switch speaker to English
+```
+
+Built-in Edge types:
+
+| Voice type | Edge voice |
+|---|---|
+| `korean_male` | `ko-KR-InJoonNeural` |
+| `korean_female` | `ko-KR-SunHiNeural` |
+| `korean_multilingual_male` | `ko-KR-HyunsuMultilingualNeural` |
+| `english_male` | `en-US-GuyNeural` |
+| `english_female` | `en-US-AriaNeural` |
+
+Backend voice settings:
+
+| Backend | Voice setting | Common choices |
+|---|---|---|
+| Edge | `TTS_VOICE_TYPE`, `TTS_VOICE` | Built-in types or any Edge voice from `edge-tts --list-voices` |
+| Supertonic | `SUPERTONIC_VOICE` | `M1`–`M5`, `F1`–`F5`; `SUPERTONIC_LANGUAGE=ko|en|es|pt|fr` |
+| OpenVoice | `OPENVOICE_REF_AUDIO`, `OPENVOICE_STYLE` | A permitted reference WAV plus style such as `default` |
+| SpeechSwift / CosyVoice | `SPEECHSWIFT_REF_AUDIO`, `SPEECHSWIFT_ENGINE`, `SPEECHSWIFT_SPEAKER` | Reference WAV or backend speaker/model values |
+
+## Long Dictation and Pauses
+
+The default `UTTERANCE_IDLE_MS=4500` waits long enough to keep natural pauses inside one spoken instruction. Lower it for faster short commands or raise it for long dictation:
+
+```bash
+UTTERANCE_IDLE_MS="6000"
+```
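To make the effect of the idle window concrete, the sketch below groups a list of audio-activity timestamps (in milliseconds) into utterances, closing an utterance whenever the gap to the next timestamp exceeds the window. It is an illustration of the general idea only, not the bridge's segmentation code. The function `group_utterances` and the sample timestamps are invented for the example.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: group activity timestamps (ms) into utterances.
# A gap longer than the idle window closes the current utterance.
group_utterances() {
  local idle_ms=$1; shift
  local prev="" current="" t
  for t in "$@"; do
    if [ -n "$prev" ] && [ $((t - prev)) -gt "$idle_ms" ]; then
      echo "$current"              # gap exceeded the window: emit utterance
      current=""
    fi
    current="${current:+$current }$t"
    prev=$t
  done
  if [ -n "$current" ]; then
    echo "$current"                # emit the final utterance
  fi
  return 0
}

# A 3000 ms pause stays inside one utterance with a 4500 ms window,
# but splits the input into two utterances with a 2000 ms window.
group_utterances 4500 0 500 3500 4000
group_utterances 2000 0 500 3500 4000
```

The trade-off is the one the text describes: a larger window tolerates longer pauses but delays the moment a finished instruction is handed to the agent.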
+
+## Verbose Progress Mode
+
+Enable with `!verbose on`, `AGENT_VERBOSE_PROGRESS=1`, or “상세 진행 켜”. Progress lines look like:
+
+```text
+🤖 Hermes Agent 호출 시작
+📖 파일 읽기 app-node/main.mjs
+🔎 웹 검색 실행
+⌨️ 터미널 명령 실행
+🤖 Hermes Agent 응답 수신
+```
+
+Secret-looking fields are redacted and progress lines are removed from final spoken answers.
+
+## Latency Metrics
+
+Latency records are written to `./.logs/latency.jsonl`. In Discord, run:
+
+```text
+!latency
+!metrics
+```
+
+## Testing
+
+```bash
+node --check app-node/main.mjs
+npm test
+bash -n run.sh scripts/install.sh
+vc doctor
+```
@@ -0,0 +1,134 @@
+# VerbalCoding Guide d’utilisation
+
+Operational details for Français users.
+
+## CLI Commands
+
+```bash
+vc status
+vc language en
+vc language ko
+vc language auto
+vc restart auto status
+vc restart auto on
+vc restart auto off
+vc bot invite CLIENT_ID
+vc instance status
+vc instance setup NAME
+vc instance start NAME
+vc instance stop NAME
+vc doctor
+npm run mcp
+```
+
+Language commands update `.env`; restart with `vc start`, `./run.sh`, or your process manager.
+
+## Run Modes
+
+```bash
+vc start
+./run.sh
+./run.sh instances/my-project.env
+VERBALCODING_INSTANCE_ENV=instances/my-project.env ./run.sh
+```
+
+The bot auto-joins the first configured channel name, defaulting to `일반,General,general`.
+
+## Discord Commands
+
+Before using commands, set up the Discord application/bot:
+
+- Hermes Agent Discord guide: <https://hermes-agent.nousresearch.com/docs/user-guide/messaging/discord>
+- Discord official bot docs: <https://docs.discord.com/developers/bots/overview>
+
+Then run `vc bot invite CLIENT_ID` for the VerbalCoding permissions.
+
+| Command | Purpose |
+|---|---|
+| `!ping` | Basic bot check |
+| `!join` / `!leave` | Join or leave voice |
+| `!say <text>` | Speak text directly through TTS |
+| `!voice-test <text>` | Test the active TTS backend/voice |
+| `!voice-clone capture` | Save the next valid utterance as an OpenVoice reference sample |
+| `!voice-clone status` / `!voice-clone cancel` | Inspect or cancel capture |
+| `!ask <prompt>` | Send text through the same harness adapter as voice |
+| `!session status` | Show current project/default adapter session |
+| `!session new <name> <workdir> [context] --voice <voice-channel>` | Create a project-scoped Hermes session |
+| `!session attach-voice [sessionName] --voice <voice-channel>` | Bind a text channel/thread to a voice channel |
+| `!session list` | List configured project sessions |
+| `!session reset` / `!reset-session` | Clear the current session file |
+| `!verbose on/off` | Toggle detailed progress updates |
+| `!latency` / `!metrics` | Show recent latency summary |
+| `!sensitivity normal/conservative` | Switch barge-in sensitivity |
+
+Voice equivalents such as “외부 모드”, “보수 모드”, “실내”, “기본 감도”, “상세 진행 켜”, and clear stop phrases like “잠깐”, “멈춰”, “그만” are handled by the bridge.
+
+## Changing the Voice
+
+`vc language ko|en|auto` changes STT language, progress language, and the matching default TTS voice together. Live voice commands can change the speaker without restart:
+
+```text
+남자 한국어 목소리로 바꿔
+여자 한국어 목소리로 바꿔
+change voice to Korean female
+switch speaker to English
+```
+
+Built-in Edge types:
+
+| Voice type | Edge voice |
+|---|---|
+| `korean_male` | `ko-KR-InJoonNeural` |
+| `korean_female` | `ko-KR-SunHiNeural` |
+| `korean_multilingual_male` | `ko-KR-HyunsuMultilingualNeural` |
+| `english_male` | `en-US-GuyNeural` |
+| `english_female` | `en-US-AriaNeural` |
+
+Backend voice settings:
+
+| Backend | Voice setting | Common choices |
+|---|---|---|
+| Edge | `TTS_VOICE_TYPE`, `TTS_VOICE` | Built-in types or any Edge voice from `edge-tts --list-voices` |
+| Supertonic | `SUPERTONIC_VOICE` | `M1`–`M5`, `F1`–`F5`; `SUPERTONIC_LANGUAGE=ko|en|es|pt|fr` |
+| OpenVoice | `OPENVOICE_REF_AUDIO`, `OPENVOICE_STYLE` | A permitted reference WAV plus style such as `default` |
+| SpeechSwift / CosyVoice | `SPEECHSWIFT_REF_AUDIO`, `SPEECHSWIFT_ENGINE`, `SPEECHSWIFT_SPEAKER` | Reference WAV or backend speaker/model values |
+
+## Long Dictation and Pauses
+
+The default `UTTERANCE_IDLE_MS=4500` waits long enough to keep natural pauses inside one spoken instruction. Lower it for faster short commands or raise it for long dictation:
+
+```bash
+UTTERANCE_IDLE_MS="6000"
+```
+
+## Verbose Progress Mode
+
+Enable with `!verbose on`, `AGENT_VERBOSE_PROGRESS=1`, or “상세 진행 켜”. Progress lines look like:
+
+```text
+🤖 Hermes Agent 호출 시작
+📖 파일 읽기 app-node/main.mjs
+🔎 웹 검색 실행
+⌨️ 터미널 명령 실행
+🤖 Hermes Agent 응답 수신
+```
+
+Secret-looking fields are redacted and progress lines are removed from final spoken answers.
+
+## Latency Metrics
+
+Latency records are written to `./.logs/latency.jsonl`. In Discord, run:
+
+```text
+!latency
+!metrics
+```
+
+## Testing
+
+```bash
+node --check app-node/main.mjs
+npm test
+bash -n run.sh scripts/install.sh
+vc doctor
+```
@@ -0,0 +1,134 @@
+# VerbalCoding 使い方ガイド
+
+Operational details for 日本語 users.
+
+## CLI Commands
+
+```bash
+vc status
+vc language en
+vc language ko
+vc language auto
+vc restart auto status
+vc restart auto on
+vc restart auto off
+vc bot invite CLIENT_ID
+vc instance status
+vc instance setup NAME
+vc instance start NAME
+vc instance stop NAME
+vc doctor
+npm run mcp
+```
+
+Language commands update `.env`; restart with `vc start`, `./run.sh`, or your process manager.
+
+## Run Modes
+
+```bash
+vc start
+./run.sh
+./run.sh instances/my-project.env
+VERBALCODING_INSTANCE_ENV=instances/my-project.env ./run.sh
+```
+
+The bot auto-joins the first configured channel name, defaulting to `일반,General,general`.
+
+## Discord Commands
+
+Before using commands, set up the Discord application/bot:
+
+- Hermes Agent Discord guide: <https://hermes-agent.nousresearch.com/docs/user-guide/messaging/discord>
+- Discord official bot docs: <https://docs.discord.com/developers/bots/overview>
+
+Then run `vc bot invite CLIENT_ID` for the VerbalCoding permissions.
+
+| Command | Purpose |
+|---|---|
+| `!ping` | Basic bot check |
+| `!join` / `!leave` | Join or leave voice |
+| `!say <text>` | Speak text directly through TTS |
+| `!voice-test <text>` | Test the active TTS backend/voice |
+| `!voice-clone capture` | Save the next valid utterance as an OpenVoice reference sample |
+| `!voice-clone status` / `!voice-clone cancel` | Inspect or cancel capture |
+| `!ask <prompt>` | Send text through the same harness adapter as voice |
+| `!session status` | Show current project/default adapter session |
+| `!session new <name> <workdir> [context] --voice <voice-channel>` | Create a project-scoped Hermes session |
+| `!session attach-voice [sessionName] --voice <voice-channel>` | Bind a text channel/thread to a voice channel |
+| `!session list` | List configured project sessions |
+| `!session reset` / `!reset-session` | Clear the current session file |
+| `!verbose on/off` | Toggle detailed progress updates |
+| `!latency` / `!metrics` | Show recent latency summary |
+| `!sensitivity normal/conservative` | Switch barge-in sensitivity |
+
+Voice equivalents such as “외부 모드”, “보수 모드”, “실내”, “기본 감도”, “상세 진행 켜”, and clear stop phrases like “잠깐”, “멈춰”, “그만” are handled by the bridge.
+
+## Changing the Voice
+
+`vc language ko|en|auto` changes STT language, progress language, and the matching default TTS voice together. Live voice commands can change the speaker without restart:
+
+```text
+남자 한국어 목소리로 바꿔
+여자 한국어 목소리로 바꿔
+change voice to Korean female
+switch speaker to English
+```
+
+Built-in Edge types:
+
+| Voice type | Edge voice |
+|---|---|
+| `korean_male` | `ko-KR-InJoonNeural` |
+| `korean_female` | `ko-KR-SunHiNeural` |
+| `korean_multilingual_male` | `ko-KR-HyunsuMultilingualNeural` |
+| `english_male` | `en-US-GuyNeural` |
+| `english_female` | `en-US-AriaNeural` |
+
+Backend voice settings:
+
+| Backend | Voice setting | Common choices |
+|---|---|---|
+| Edge | `TTS_VOICE_TYPE`, `TTS_VOICE` | Built-in types or any Edge voice from `edge-tts --list-voices` |
+| Supertonic | `SUPERTONIC_VOICE` | `M1`–`M5`, `F1`–`F5`; `SUPERTONIC_LANGUAGE=ko|en|es|pt|fr` |
+| OpenVoice | `OPENVOICE_REF_AUDIO`, `OPENVOICE_STYLE` | A permitted reference WAV plus style such as `default` |
+| SpeechSwift / CosyVoice | `SPEECHSWIFT_REF_AUDIO`, `SPEECHSWIFT_ENGINE`, `SPEECHSWIFT_SPEAKER` | Reference WAV or backend speaker/model values |
+
+## Long Dictation and Pauses
+
+The default `UTTERANCE_IDLE_MS=4500` waits long enough to keep natural pauses inside one spoken instruction. Lower it for faster short commands or raise it for long dictation:
+
+```bash
+UTTERANCE_IDLE_MS="6000"
+```
+
+## Verbose Progress Mode
+
+Enable with `!verbose on`, `AGENT_VERBOSE_PROGRESS=1`, or “상세 진행 켜”. Progress lines look like:
+
+```text
+🤖 Hermes Agent 호출 시작
+📖 파일 읽기 app-node/main.mjs
+🔎 웹 검색 실행
+⌨️ 터미널 명령 실행
+🤖 Hermes Agent 응답 수신
+```
+
+Secret-looking fields are redacted and progress lines are removed from final spoken answers.
+
+## Latency Metrics
+
+Latency records are written to `./.logs/latency.jsonl`. In Discord, run:
+
+```text
+!latency
+!metrics
+```
+
+## Testing
+
+```bash
+node --check app-node/main.mjs
+npm test
+bash -n run.sh scripts/install.sh
+vc doctor
+```