@kajidog/mcp-tts-voicevox 0.6.1 → 0.7.0

package/README.md CHANGED
@@ -1,7 +1,5 @@
  # VOICEVOX TTS MCP
 
- **English** | [日本語](README.ja.md)
-
  A text-to-speech MCP server using VOICEVOX
 
  > 🎮 **[Try the Browser Demo](https://kajidog.github.io/mcp-tts-voicevox/)** — Test VoicevoxClient directly in your browser
@@ -16,9 +14,9 @@ A text-to-speech MCP server using VOICEVOX
 
  ## UI Audio Player (MCP Apps)
 
- ![Single track player](docs/images/single-player.png)
+ ![UI Audio Player](docs/images/player.png)
 
- The `speak_player` tool uses [MCP Apps](https://github.com/modelcontextprotocol/ext-apps) to render an interactive audio player directly inside the chat. Unlike the standard `speak` tool which plays audio on the server, **audio is played on the client side (in the browser/app)** — no audio device needed on the server.
+ The `voicevox_speak_player` tool uses [MCP Apps](https://github.com/modelcontextprotocol/ext-apps) to render an interactive audio player directly inside the chat. Unlike the standard `voicevox_speak` tool which plays audio on the server, **audio is played on the client side (in the browser/app)** — no audio device needed on the server.
 
  ### Features
 
@@ -26,13 +24,36 @@ The `speak_player` tool uses [MCP Apps](https://github.com/modelcontextprotocol/
  - **Play/Pause controls** — Full playback controls embedded in the conversation
  - **Multi-speaker dialogue** — Sequential playback of multiple speakers in one player with track navigation
  - **Speaker switching** — Change the voice of any segment directly from the player UI
+ - **Segment editing** — Adjust speed, volume, intonation, pause length, and pre/post silence per segment
+ - **Accent phrase editing** — Edit accent positions and mora pitch directly in the UI
+ - **Add / delete / reorder segments** — Drag-and-drop track reordering; add new segments inline
+ - **WAV export** — Save all tracks as numbered WAV files and open the output folder automatically
+ - **User dictionary manager** — Add, edit, and delete VOICEVOX user dictionary words with preview playback
+ - **Cross-session state restore** — Player state is persisted on the server; reopening the chat restores previous tracks
+
+ Export behavior by environment:
+ - `Save and open` always exports WAV files. If opening the file explorer is not supported, export still succeeds and the save path is shown in the UI.
+ - `Choose output folder` uses a native directory picker on Windows/macOS. On unsupported environments, this action falls back to the default export directory.
+
+ | Multi-speaker playback | Track list | Segment editing |
+ |:---:|:---:|:---:|
+ | ![Multi-speaker player](docs/images/multi-player.png) | ![Track list](docs/images/list-player.png) | ![Segment editing](docs/images/edit-player.png) |
 
- | Multi-speaker playback | Track list | Speaker selection |
+ | Speaker selection | Dictionary manager | WAV export |
  |:---:|:---:|:---:|
- | ![Multi-speaker player](docs/images/multi-player.png) | ![Track list](docs/images/list-player.png) | ![Speaker selection](docs/images/select-player.png) |
+ | ![Speaker selection](docs/images/select-player.png) | ![Dictionary manager](docs/images/dictionary-player.png) | ![WAV export](docs/images/export-player.png) |
 
  > **Note:** `speak_player` requires a host that supports MCP Apps (e.g., Claude Desktop). In hosts without MCP Apps support, the tool is not available and `speak` (server-side playback) can be used instead.
 
+ ### Player MCP Tools
+
+ | Tool | Description |
+ |------|-------------|
+ | `speak_player` | Create a new player session and display the UI. Returns `viewUUID`. |
+ | `resynthesize_player` | Update all segments for an existing player (new `viewUUID` each call). |
+ | `get_player_state` | Read the current player state (paginated) for AI tuning. |
+ | `open_dictionary_ui` | Open the user dictionary manager UI. |
+
  ## Quick Start
 
  ### Requirements
@@ -108,7 +129,7 @@ Config file location:
  }
  ```
 
- > 💡 Bun を使う場合は `npx` `bunx` に置き換えるだけでOK:
+ > 💡 **If using Bun**, just replace `npx` with `bunx`:
  > ```json
  > "command": "bunx", "args": ["@kajidog/mcp-tts-voicevox"]
  > ```
@@ -144,13 +165,13 @@ This starts the VOICEVOX Engine and the MCP server (HTTP mode on port 3000).
 
  **3. Restart Claude Desktop**
 
- > **Limitations (Docker):** The Docker container has no audio device, so the `speak` tool (server-side playback) is disabled by default. Use `speak_player` instead — it plays audio on the client side (in Claude Desktop) and works without any audio device on the server. See [UI Audio Player](#ui-audio-player-mcp-apps) for details.
+ > **Limitations (Docker):** The Docker container has no audio device, so the `voicevox_speak` tool (server-side playback) is disabled by default. Use `voicevox_speak_player` instead — it plays audio on the client side (in Claude Desktop) and works without any audio device on the server. See [UI Audio Player](#ui-audio-player-mcp-apps) for details.
 
  ---
 
  ## MCP Tools
 
- ### `speak` — Text-to-Speech
+ ### `voicevox_speak` — Text-to-Speech
 
  The main feature callable from Claude.
 
@@ -183,11 +204,11 @@ The main feature callable from Claude.
 
  | Tool | Description |
  |------|-------------|
- | `speak_player` | Speak with UI audio player (disable with `--disable-tools`) |
- | `ping_voicevox` | Check VOICEVOX Engine connection |
- | `get_speakers` | Get list of available speakers |
- | `stop_speaker` | Stop playback and clear queue |
- | `synthesize_file` | Generate audio file |
+ | `voicevox_speak_player` | Speak with UI audio player (disable with `--disable-tools`) |
+ | `voicevox_ping` | Check VOICEVOX Engine connection |
+ | `voicevox_get_speakers` | Get list of available speakers |
+ | `voicevox_stop_speaker` | Stop playback and clear queue |
+ | `voicevox_synthesize_file` | Generate audio file |
 
  </details>
 
@@ -237,6 +258,13 @@ export VOICEVOX_DISABLED_TOOLS=speak_player,synthesize_file
  | Variable | Description | Default |
  |----------|-------------|---------|
  | `VOICEVOX_AUTO_PLAY` | Auto-play audio in UI player | `true` |
+ | `VOICEVOX_PLAYER_EXPORT_ENABLED` | Enable track export(download) from UI player (`false` to disable) | `true` |
+ | `VOICEVOX_PLAYER_EXPORT_DIR` | Default output directory for exported tracks (also used as fallback when folder picker is unavailable) | `./voicevox-player-exports` |
+ | `VOICEVOX_PLAYER_CACHE_DIR` | Directory for player cache files (`*.txt`) and default player state file | `./.voicevox-player-cache` |
+ | `VOICEVOX_PLAYER_AUDIO_CACHE_ENABLED` | Enable persistent audio cache on disk (`false` disables disk cache writes/reads) | `true` |
+ | `VOICEVOX_PLAYER_AUDIO_CACHE_TTL_DAYS` | Audio cache retention in days (`0`: disable disk cache, `-1`: no TTL cleanup) | `30` |
+ | `VOICEVOX_PLAYER_AUDIO_CACHE_MAX_MB` | Audio cache size cap in MB (`0`: disable disk cache, `-1`: unlimited) | `512` |
+ | `VOICEVOX_PLAYER_STATE_FILE` | Path of persisted player state JSON | `<VOICEVOX_PLAYER_CACHE_DIR>/player-state.json` |
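The player variables added in this hunk can be combined as in the sketch below. The variable names, the defaults mentioned in the comments, and the meaning of `-1` come from the table rows above. The paths and numeric values are illustrative assumptions, not recommendations from the package.

```shell
# Illustrative values only; variable names and semantics come from the table above.

# Keep exports and cache in explicit directories (these paths are assumptions).
export VOICEVOX_PLAYER_EXPORT_DIR="./data/voicevox-exports"
export VOICEVOX_PLAYER_CACHE_DIR="./data/voicevox-cache"

# Keep cached audio for 7 days instead of the documented default of 30.
export VOICEVOX_PLAYER_AUDIO_CACHE_TTL_DAYS=7

# Lift the documented 512 MB default cap; per the table, -1 means unlimited.
export VOICEVOX_PLAYER_AUDIO_CACHE_MAX_MB=-1

# VOICEVOX_PLAYER_STATE_FILE is deliberately left unset here. Per the table it
# then defaults to <VOICEVOX_PLAYER_CACHE_DIR>/player-state.json.
```

Setting either the TTL or the size cap to `0` would, according to the same rows, disable the disk cache altogether.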
 
  ### Server Settings
 
@@ -285,6 +313,13 @@ npx @kajidog/mcp-tts-voicevox --disable-tools speak_player,synthesize_file
  | `--restrict-wait-for-end` | Restrict waitForEnd |
  | `--disable-tools <tools>` | Disable tools |
  | `--auto-play` / `--no-auto-play` | Auto-play in UI player |
+ | `--player-export` / `--no-player-export` | Enable/disable track export(download) in UI player |
+ | `--player-export-dir <dir>` | Default output directory for exported tracks |
+ | `--player-cache-dir <dir>` | Player cache directory |
+ | `--player-state-file <path>` | Persisted player state file path |
+ | `--player-audio-cache` / `--no-player-audio-cache` | Enable/disable disk audio cache for player |
+ | `--player-audio-cache-ttl-days <days>` | Audio cache retention days (`0`: disable, `-1`: no TTL cleanup) |
+ | `--player-audio-cache-max-mb <mb>` | Audio cache size cap in MB (`0`: disable, `-1`: unlimited) |
  | `--http` | HTTP mode |
  | `--port <value>` | HTTP port |
  | `--host <value>` | HTTP host |
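As a rough sketch of how the new player flags in this hunk might be combined on one command line, the snippet below assembles the invocation and prints it instead of starting the server. The flag names come from the table rows above and the `npx` invocation follows the hunk header. The directory names and numeric values are illustrative assumptions.

```shell
# Dry run: build the command line and print it rather than launching the server.
# Flag names come from the table above; the values are illustrative.
set -- \
  --player-export-dir ./exports \
  --player-cache-dir ./.cache/voicevox-player \
  --player-audio-cache-ttl-days 7 \
  --player-audio-cache-max-mb 256 \
  --no-auto-play

# "$*" joins the positional parameters with single spaces.
cmd="npx @kajidog/mcp-tts-voicevox $*"
echo "$cmd"
```

To actually start the server with these settings, run `npx @kajidog/mcp-tts-voicevox "$@"` in place of the final `echo`.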
@@ -476,4 +511,4 @@ pnpm install
 
  ## License
 
- ISC
+ ISC