@kajidog/mcp-tts-voicevox 0.6.1 → 0.7.0
- package/README.md +50 -15
- package/dist/index.js +1594 -173
- package/dist/index.js.map +1 -1
- package/dist/mcp-app.html +56 -56
- package/dist/stdio.js +1591 -180
- package/dist/stdio.js.map +1 -1
- package/package.json +4 -4
package/README.md
CHANGED
@@ -1,7 +1,5 @@
 # VOICEVOX TTS MCP
 
-**English** | [日本語](README.ja.md)
-
 A text-to-speech MCP server using VOICEVOX
 
 > 🎮 **[Try the Browser Demo](https://kajidog.github.io/mcp-tts-voicevox/)** — Test VoicevoxClient directly in your browser
@@ -16,9 +14,9 @@ A text-to-speech MCP server using VOICEVOX
 
 ## UI Audio Player (MCP Apps)
 
-
 
-The `
+The `voicevox_speak_player` tool uses [MCP Apps](https://github.com/modelcontextprotocol/ext-apps) to render an interactive audio player directly inside the chat. Unlike the standard `voicevox_speak` tool which plays audio on the server, **audio is played on the client side (in the browser/app)** — no audio device needed on the server.
 
 ### Features
 
@@ -26,13 +24,36 @@ The `speak_player` tool uses [MCP Apps](https://github.com/modelcontextprotocol/
 - **Play/Pause controls** — Full playback controls embedded in the conversation
 - **Multi-speaker dialogue** — Sequential playback of multiple speakers in one player with track navigation
 - **Speaker switching** — Change the voice of any segment directly from the player UI
+- **Segment editing** — Adjust speed, volume, intonation, pause length, and pre/post silence per segment
+- **Accent phrase editing** — Edit accent positions and mora pitch directly in the UI
+- **Add / delete / reorder segments** — Drag-and-drop track reordering; add new segments inline
+- **WAV export** — Save all tracks as numbered WAV files and open the output folder automatically
+- **User dictionary manager** — Add, edit, and delete VOICEVOX user dictionary words with preview playback
+- **Cross-session state restore** — Player state is persisted on the server; reopening the chat restores previous tracks
+
+Export behavior by environment:
+- `Save and open` always exports WAV files. If opening the file explorer is not supported, export still succeeds and the save path is shown in the UI.
+- `Choose output folder` uses a native directory picker on Windows/macOS. On unsupported environments, this action falls back to the default export directory.
+
+| Multi-speaker playback | Track list | Segment editing |
+|:---:|:---:|:---:|
+|  |  |  |
 
-|
+| Speaker selection | Dictionary manager | WAV export |
 |:---:|:---:|:---:|
-|  |  |  |
 
 > **Note:** `speak_player` requires a host that supports MCP Apps (e.g., Claude Desktop). In hosts without MCP Apps support, the tool is not available and `speak` (server-side playback) can be used instead.
 
+### Player MCP Tools
+
+| Tool | Description |
+|------|-------------|
+| `speak_player` | Create a new player session and display the UI. Returns `viewUUID`. |
+| `resynthesize_player` | Update all segments for an existing player (new `viewUUID` each call). |
+| `get_player_state` | Read the current player state (paginated) for AI tuning. |
+| `open_dictionary_ui` | Open the user dictionary manager UI. |
+
 ## Quick Start
 
 ### Requirements
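The export behavior added in this hunk (export always succeeds even when the folder cannot be opened, and the folder picker falls back to the default directory) can be illustrated with a small sketch. This is not the package's implementation: `resolveOutputDir`, `saveAndOpen`, the `ExportResult` shape, and the `track-N.wav` naming are all hypothetical, and the file-system and OS actions are injected as callbacks so the control flow can be read in isolation.

```typescript
// Illustrative sketch only: function names, the result shape, and the
// "track-N.wav" naming are assumptions, not the package's implementation.
interface ExportResult {
  savedPaths: string[];   // every WAV path that was written
  folderOpened: boolean;  // false when the file explorer could not be opened
}

type WriteWav = (path: string, trackIndex: number) => void;
type OpenFolder = (dir: string) => void;
type PickFolder = () => string | undefined;

// "Choose output folder": use the picker when one exists and returns a path,
// otherwise fall back to the default export directory.
function resolveOutputDir(pickFolder: PickFolder | undefined, defaultDir: string): string {
  if (pickFolder === undefined) return defaultDir;
  const picked = pickFolder();
  return picked !== undefined && picked !== "" ? picked : defaultDir;
}

// "Save and open": always write the files; a failure to open the folder
// is recorded but does not fail the export.
function saveAndOpen(
  trackCount: number,
  outDir: string,
  writeWav: WriteWav,
  openFolder: OpenFolder,
): ExportResult {
  const savedPaths: string[] = [];
  for (let i = 0; i < trackCount; i++) {
    const path = `${outDir}/track-${i + 1}.wav`;
    writeWav(path, i);
    savedPaths.push(path);
  }
  let folderOpened = true;
  try {
    openFolder(outDir);
  } catch (_err) {
    folderOpened = false; // the UI would show the save path instead
  }
  return { savedPaths, folderOpened };
}
```

In this sketch the caller decides what to display: when `folderOpened` is `false`, the returned `savedPaths` are what the README describes as the save path shown in the UI.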
@@ -108,7 +129,7 @@ Config file location:
 }
 ```
 
-> 💡 Bun
+> 💡 **If using Bun**, just replace `npx` with `bunx`:
 > ```json
 > "command": "bunx", "args": ["@kajidog/mcp-tts-voicevox"]
 > ```
@@ -144,13 +165,13 @@ This starts the VOICEVOX Engine and the MCP server (HTTP mode on port 3000).
 
 **3. Restart Claude Desktop**
 
-> **Limitations (Docker):** The Docker container has no audio device, so the `
+> **Limitations (Docker):** The Docker container has no audio device, so the `voicevox_speak` tool (server-side playback) is disabled by default. Use `voicevox_speak_player` instead — it plays audio on the client side (in Claude Desktop) and works without any audio device on the server. See [UI Audio Player](#ui-audio-player-mcp-apps) for details.
 
 ---
 
 ## MCP Tools
 
-### `
+### `voicevox_speak` — Text-to-Speech
 
 The main feature callable from Claude.
 
@@ -183,11 +204,11 @@ The main feature callable from Claude.
 
 | Tool | Description |
 |------|-------------|
-| `
-| `
-| `
-| `
-| `
+| `voicevox_speak_player` | Speak with UI audio player (disable with `--disable-tools`) |
+| `voicevox_ping` | Check VOICEVOX Engine connection |
+| `voicevox_get_speakers` | Get list of available speakers |
+| `voicevox_stop_speaker` | Stop playback and clear queue |
+| `voicevox_synthesize_file` | Generate audio file |
 
 </details>
 
@@ -237,6 +258,13 @@ export VOICEVOX_DISABLED_TOOLS=speak_player,synthesize_file
 | Variable | Description | Default |
 |----------|-------------|---------|
 | `VOICEVOX_AUTO_PLAY` | Auto-play audio in UI player | `true` |
+| `VOICEVOX_PLAYER_EXPORT_ENABLED` | Enable track export(download) from UI player (`false` to disable) | `true` |
+| `VOICEVOX_PLAYER_EXPORT_DIR` | Default output directory for exported tracks (also used as fallback when folder picker is unavailable) | `./voicevox-player-exports` |
+| `VOICEVOX_PLAYER_CACHE_DIR` | Directory for player cache files (`*.txt`) and default player state file | `./.voicevox-player-cache` |
+| `VOICEVOX_PLAYER_AUDIO_CACHE_ENABLED` | Enable persistent audio cache on disk (`false` disables disk cache writes/reads) | `true` |
+| `VOICEVOX_PLAYER_AUDIO_CACHE_TTL_DAYS` | Audio cache retention in days (`0`: disable disk cache, `-1`: no TTL cleanup) | `30` |
+| `VOICEVOX_PLAYER_AUDIO_CACHE_MAX_MB` | Audio cache size cap in MB (`0`: disable disk cache, `-1`: unlimited) | `512` |
+| `VOICEVOX_PLAYER_STATE_FILE` | Path of persisted player state JSON | `<VOICEVOX_PLAYER_CACHE_DIR>/player-state.json` |
 
 ### Server Settings
 
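The three audio-cache variables added in this hunk interact: the cache can be switched off by the boolean, by a TTL of `0`, or by a size cap of `0`, while `-1` lifts the corresponding limit. A minimal sketch of one way to read those documented semantics follows. It is not taken from the package source: `resolveAudioCachePolicy`, `AudioCachePolicy`, and the handling of unparsable values (falling back to the documented defaults) are assumptions made for illustration.

```typescript
// Hypothetical helper: names and invalid-value handling are assumptions,
// only the 0 / -1 meanings and the defaults (30, 512) come from the README.
type Env = Record<string, string | undefined>;

interface AudioCachePolicy {
  enabled: boolean;        // false if any documented "disable" setting is present
  ttlDays: number | null;  // null means no TTL cleanup (-1)
  maxMb: number | null;    // null means unlimited size (-1)
}

function parseIntOr(value: string | undefined, fallback: number): number {
  if (value === undefined || value.trim() === "") return fallback;
  const n = Number.parseInt(value, 10);
  return Number.isNaN(n) ? fallback : n;
}

function resolveAudioCachePolicy(env: Env): AudioCachePolicy {
  const flag = env["VOICEVOX_PLAYER_AUDIO_CACHE_ENABLED"];
  const ttl = parseIntOr(env["VOICEVOX_PLAYER_AUDIO_CACHE_TTL_DAYS"], 30);
  const maxMb = parseIntOr(env["VOICEVOX_PLAYER_AUDIO_CACHE_MAX_MB"], 512);

  // Any one of the three settings can disable the disk cache.
  const enabled = flag !== "false" && ttl !== 0 && maxMb !== 0;

  return {
    enabled,
    ttlDays: ttl === -1 ? null : ttl,
    maxMb: maxMb === -1 ? null : maxMb,
  };
}
```

Passing the environment in as a plain object (for example `resolveAudioCachePolicy(process.env)` under Node.js) keeps the function free of global state, which makes the three disable paths easy to check independently.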
@@ -285,6 +313,13 @@ npx @kajidog/mcp-tts-voicevox --disable-tools speak_player,synthesize_file
 | `--restrict-wait-for-end` | Restrict waitForEnd |
 | `--disable-tools <tools>` | Disable tools |
 | `--auto-play` / `--no-auto-play` | Auto-play in UI player |
+| `--player-export` / `--no-player-export` | Enable/disable track export(download) in UI player |
+| `--player-export-dir <dir>` | Default output directory for exported tracks |
+| `--player-cache-dir <dir>` | Player cache directory |
+| `--player-state-file <path>` | Persisted player state file path |
+| `--player-audio-cache` / `--no-player-audio-cache` | Enable/disable disk audio cache for player |
+| `--player-audio-cache-ttl-days <days>` | Audio cache retention days (`0`: disable, `-1`: no TTL cleanup) |
+| `--player-audio-cache-max-mb <mb>` | Audio cache size cap in MB (`0`: disable, `-1`: unlimited) |
 | `--http` | HTTP mode |
 | `--port <value>` | HTTP port |
 | `--host <value>` | HTTP host |
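For callers that launch the server programmatically, the player flags added in this hunk can be assembled from a typed options object. The sketch below only uses flag names listed in the table above; the `PlayerCliOptions` interface and `buildPlayerArgs` function are hypothetical and not part of the package. Values are emitted as separate tokens, mirroring the `--disable-tools speak_player,synthesize_file` form shown in the hunk header; whether the CLI parser accepts a negative value such as `-1` as a separate token is not confirmed by the README.

```typescript
// Hypothetical helper for composing the documented player flags.
// Only the flag names come from the README; everything else is illustrative.
interface PlayerCliOptions {
  playerExport?: boolean;
  playerExportDir?: string;
  playerCacheDir?: string;
  playerStateFile?: string;
  playerAudioCache?: boolean;
  playerAudioCacheTtlDays?: number;
  playerAudioCacheMaxMb?: number;
}

function buildPlayerArgs(opts: PlayerCliOptions): string[] {
  const args: string[] = [];

  // Boolean pairs: --name when true, --no-name when false, nothing when unset.
  const bool = (name: string, value: boolean | undefined): void => {
    if (value !== undefined) args.push(value ? `--${name}` : `--no-${name}`);
  };
  // Valued flags: emitted as two tokens, skipped when unset.
  const valued = (name: string, value: string | number | undefined): void => {
    if (value !== undefined) args.push(`--${name}`, String(value));
  };

  bool("player-export", opts.playerExport);
  valued("player-export-dir", opts.playerExportDir);
  valued("player-cache-dir", opts.playerCacheDir);
  valued("player-state-file", opts.playerStateFile);
  bool("player-audio-cache", opts.playerAudioCache);
  valued("player-audio-cache-ttl-days", opts.playerAudioCacheTtlDays);
  valued("player-audio-cache-max-mb", opts.playerAudioCacheMaxMb);

  return args;
}
```

The resulting array can be appended to `["@kajidog/mcp-tts-voicevox"]` and passed as the `args` of an `npx` or `bunx` command. Leaving an option unset emits no flag, so the server's own defaults apply.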
@@ -476,4 +511,4 @@ pnpm install
 
 ## License
 
-ISC
+ISC