@kajidog/mcp-tts-voicevox 0.7.3 → 0.8.1

package/README.md ADDED
@@ -0,0 +1,655 @@
1
+ # VOICEVOX TTS MCP
2
+
3
+ **English** | [日本語](README.ja.md)
4
+
5
+ A text-to-speech MCP server using VOICEVOX
6
+
7
+ > 🎮 **[Try the Browser Demo](https://kajidog.github.io/mcp-tts-voicevox/)** — Test VoicevoxClient directly in your browser
8
+
9
+ ## What You Can Do
10
+
11
+ - **Make your AI assistant speak** — Text-to-speech from MCP clients like Claude Desktop
12
+ - **UI Audio Player (MCP Apps)** — Play audio directly in the chat with an interactive player (ChatGPT / Claude Desktop / Claude Web etc.)
13
+ - **Multi-character conversations** — Switch speakers per segment in a single call
14
+ - **Smooth playback** — Queue management, immediate playback, prefetching, streaming
15
+ - **Cross-platform** — Works on Windows, macOS, Linux (including WSL)
16
+
17
+ ## UI Audio Player (MCP Apps)
18
+
19
+ ![UI Audio Player](docs/images/player.png)
20
+
21
+ The `voicevox_speak_player` tool uses [MCP Apps](https://github.com/modelcontextprotocol/ext-apps) to render an interactive audio player directly inside the chat. Unlike the standard `voicevox_speak` tool which plays audio on the server, **audio is played on the client side (in the browser/app)** — no audio device needed on the server.
22
+
23
+ ### Features
24
+
25
+ - **Client-side playback** — Audio plays in Claude Desktop's chat, not on the server. Works even over remote connections.
26
+ - **Play/Pause controls** — Full playback controls embedded in the conversation
27
+ - **Multi-speaker dialogue** — Sequential playback of multiple speakers in one player with track navigation
28
+ - **Speaker switching** — Change the voice of any segment directly from the player UI
29
+ - **Segment editing** — Adjust speed, volume, intonation, pause length, and pre/post silence per segment
30
+ - **Accent phrase editing** — Edit accent positions and mora pitch directly in the UI
31
+ - **Add / delete / reorder segments** — Drag-and-drop track reordering; add new segments inline
32
+ - **WAV export** — Save all tracks as numbered WAV files and open the output folder automatically
33
+ - **User dictionary manager** — Add, edit, and delete VOICEVOX user dictionary words with preview playback
34
+ - **Cross-session state restore** — Player state is persisted on the server; reopening the chat restores previous tracks
35
+
36
+ Export behavior by environment:
37
+ - `Save and open` always exports WAV files. If opening the file explorer is not supported, export still succeeds and the save path is shown in the UI.
38
+ - `Choose output folder` uses a native directory picker on Windows/macOS. On unsupported environments, this action falls back to the default export directory.
39
+
40
+ | Multi-speaker playback | Track list | Segment editing |
41
+ |:---:|:---:|:---:|
42
+ | ![Multi-speaker player](docs/images/multi-player.png) | ![Track list](docs/images/list-player.png) | ![Segment editing](docs/images/edit-player.png) |
43
+
44
+ | Speaker selection | Dictionary manager | WAV export |
45
+ |:---:|:---:|:---:|
46
+ | ![Speaker selection](docs/images/select-player.png) | ![Dictionary manager](docs/images/dictionary-player.png) | ![WAV export](docs/images/export-player.png) |
47
+
48
+ ### Supported Clients
49
+
50
+ | Client | Connection | Notes |
51
+ |--------|-----------|-------|
52
+ | **ChatGPT** | HTTP (remote) | Requires `VOICEVOX_PLAYER_DOMAIN` |
53
+ | **Claude Desktop** | stdio (local) | Works out of the box |
54
+ | **Claude Desktop** | HTTP (via mcp-remote) | Do not set `VOICEVOX_PLAYER_DOMAIN` |
55
+
56
+ > **Note:** `speak_player` requires a host that supports MCP Apps. In hosts without MCP Apps support, the tool is not available and `speak` (server-side playback) can be used instead.
57
+
58
+ ### Player MCP Tools
59
+
60
+ | Tool | Description |
61
+ |------|-------------|
62
+ | `speak_player` | Create a new player session and display the UI. Returns `viewUUID`. |
63
+ | `resynthesize_player` | Update all segments for an existing player (new `viewUUID` each call). |
64
+ | `get_player_state` | Read the current player state (paginated) for AI tuning. |
65
+ | `open_dictionary_ui` | Open the user dictionary manager UI. |
66
+
67
+ ## Quick Start
68
+
69
+ ### Requirements
70
+
71
+ - Node.js 20.0.0 or higher (or [Bun](https://bun.sh/)) **or Docker**
72
+ - [VOICEVOX Engine](https://voicevox.hiroshiba.jp/) (must be running; included in Docker Compose)
73
+ - ffplay (optional, recommended — not needed with Docker)
74
+
75
+ #### Installing FFplay
76
+
77
+ ffplay is a lightweight player included with FFmpeg that supports playback from stdin. When available, it automatically enables low-latency streaming playback.
78
+
79
+ > 💡 **FFplay is optional.** Without it, playback falls back to temp file-based playback (Windows: PowerShell, macOS: afplay, Linux: aplay, etc.).
80
+
81
+ - Easy setup: One-liner installation for each OS (see steps below)
82
+ - Required: `ffplay` must be in PATH (restart terminal/apps after installation)
83
+
84
+ <details>
85
+ <summary>FFplay Installation and PATH Setup</summary>
86
+
87
+ Installation examples:
88
+
89
+ - Windows (any of these)
+   - Winget: `winget install --id=Gyan.FFmpeg -e`
+   - Chocolatey: `choco install ffmpeg`
+   - Scoop: `scoop install ffmpeg`
+   - Official builds: Download from https://www.gyan.dev/ffmpeg/builds/ or https://github.com/BtbN/FFmpeg-Builds and add the `bin` folder to PATH
+
+ - macOS
+   - Homebrew: `brew install ffmpeg`
+
+ - Linux
+   - Debian/Ubuntu: `sudo apt-get update && sudo apt-get install -y ffmpeg`
+   - Fedora: `sudo dnf install -y ffmpeg`
+   - Arch: `sudo pacman -S ffmpeg`
102
+
103
+ PATH Setup:
104
+
105
+ - Windows: Add `...\ffmpeg\bin` to the `PATH` environment variable, then restart PowerShell/terminal and your editor (Claude/VS Code, etc.)
+   - Verify: `powershell -c "$env:Path"` should include the ffmpeg path
107
+ - macOS/Linux: Usually auto-detected. Check with `echo $PATH` if needed, restart shell.
108
+ - MCP clients (Claude Desktop/Code): Restart the app to reload PATH.
109
+
110
+ Verification:
111
+
112
+ ```bash
113
+ ffplay -version
114
+ ```
115
+
116
+ If version info is displayed, installation is complete. CLI/MCP will automatically detect ffplay and use stdin streaming playback.
117
+
118
+ </details>
119
+
120
+
121
+ ### 3 Steps to Get Started
122
+
123
+ **1. Start VOICEVOX Engine**
124
+
125
+ **2. Add to Claude Desktop config file**
126
+
127
+ Config file location:
128
+ - Windows: `%APPDATA%\Claude\claude_desktop_config.json`
129
+ - macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
130
+
131
+ ```json
132
+ {
+   "mcpServers": {
+     "tts-mcp": {
+       "command": "npx",
+       "args": ["-y", "@kajidog/mcp-tts-voicevox"]
+     }
+   }
+ }
140
+ ```
141
+
142
+ > 💡 **If using Bun**, just replace `npx` with `bunx`:
143
+ > ```json
144
+ > "command": "bunx", "args": ["@kajidog/mcp-tts-voicevox"]
145
+ > ```
146
+
147
+ **3. Restart Claude Desktop**
148
+
149
+ That's it! Ask Claude to "say hello" and it will speak!
150
+
151
+ ### Quick Start with Docker
152
+
153
+ You can run both the MCP server and VOICEVOX Engine with a single command using Docker Compose. No Node.js or VOICEVOX installation required.
154
+
155
+ **1. Start the containers**
156
+
157
+ ```bash
158
+ docker compose up -d
159
+ ```
160
+
161
+ This starts the VOICEVOX Engine and the MCP server (HTTP mode on port 3000).
162
+
163
+ **2. Add to Claude Desktop config file (using mcp-remote)**
164
+
165
+ ```json
166
+ {
+   "mcpServers": {
+     "tts-mcp": {
+       "command": "npx",
+       "args": ["-y", "mcp-remote", "http://localhost:3000/mcp"]
+     }
+   }
+ }
174
+ ```
175
+
176
+ **3. Restart Claude Desktop**
177
+
178
+ > **Limitations (Docker):** The Docker container has no audio device, so the `voicevox_speak` tool (server-side playback) is disabled by default. Use `voicevox_speak_player` instead — it plays audio on the client side (in Claude Desktop) and works without any audio device on the server. See [UI Audio Player](#ui-audio-player-mcp-apps) for details.
179
+
180
+ ---
181
+
182
+ ## MCP Tools
183
+
184
+ ### `voicevox_speak` — Text-to-Speech
185
+
186
+ The primary tool, callable from MCP clients such as Claude Desktop.
187
+
188
+ | Parameter | Description | Default |
189
+ |-----------|-------------|---------|
190
+ | `text` | Text to speak (multiple segments separated by newlines) | Required |
191
+ | `speaker` | Speaker ID | 1 |
192
+ | `speedScale` | Playback speed | 1.0 |
193
+ | `immediate` | Immediate playback (clears queue) | true |
194
+ | `waitForEnd` | Wait for playback completion | false |
195
+
196
+ **Examples:**
197
+
198
+ ```javascript
199
+ // Simple text
200
+ { "text": "Hello" }
201
+
202
+ // Specify speaker
203
+ { "text": "Hello", "speaker": 3 }
204
+
205
+ // Different speakers per segment
206
+ { "text": "1:Hello\n3:Nice weather today" }
207
+
208
+ // Wait for completion (synchronous processing)
209
+ { "text": "Wait for this to finish before continuing", "waitForEnd": true }
210
+ ```
211
+
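The per-segment speaker syntax shown above (`1:Hello\n3:Nice weather today`) can be pictured with a small parser. This is only a sketch: `parseSegments` is a hypothetical helper, and the assumption that a segment starts with an optional `<speakerId>:` prefix is inferred from the examples, not taken from the server's source.

```javascript
// Hypothetical helper: split `text` into one segment per line and read an
// optional "<speakerId>:" prefix. Illustrative only; the server's own parsing
// rules may differ.
function parseSegments(text, defaultSpeaker = 1) {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const match = /^(\d+):(.*)$/.exec(line);
      return match
        ? { speaker: Number(match[1]), text: match[2].trim() }
        : { speaker: defaultSpeaker, text: line };
    });
}

console.log(parseSegments("1:Hello\n3:Nice weather today"));
```

Note that a naive prefix rule like this one would also treat a line such as `10:30 is the time` as speaker 10, so a real implementation probably needs stricter handling.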
212
+ <details>
213
+ <summary>Other Tools</summary>
214
+
215
+ | Tool | Description |
216
+ |------|-------------|
217
+ | `voicevox_speak_player` | Speak with UI audio player (disable with `--disable-tools`) |
218
+ | `voicevox_ping` | Check VOICEVOX Engine connection |
219
+ | `voicevox_get_speakers` | Get list of available speakers |
220
+ | `voicevox_stop_speaker` | Stop playback and clear queue |
221
+ | `voicevox_synthesize_file` | Generate audio file |
222
+
223
+ </details>
224
+
225
+ ---
226
+
227
+ ## Configuration
228
+
229
+ <details>
230
+ <summary><b>Environment Variables</b></summary>
231
+
232
+ ### VOICEVOX Settings
233
+
234
+ | Variable | Description | Default |
235
+ |----------|-------------|---------|
236
+ | `VOICEVOX_URL` | Engine URL | `http://localhost:50021` |
237
+ | `VOICEVOX_DEFAULT_SPEAKER` | Default speaker ID | `1` |
238
+ | `VOICEVOX_DEFAULT_SPEED_SCALE` | Playback speed | `1.0` |
239
+ | `VOICEVOX_RETRY_COUNT` | Retries for failed API requests (0 disables) | `2` |
240
+ | `VOICEVOX_RETRY_DELAY_MS` | Initial retry delay in ms (exponential backoff) | `250` |
241
+
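To make the two retry settings concrete, the sketch below lists the wait before each retry. `retryDelays` is a hypothetical function, and the doubling factor is an assumption: the table only says "exponential backoff", so the actual multiplier may differ.

```javascript
// Assumption: the delay doubles after every failed attempt. The README only
// says "exponential backoff"; the real multiplier may differ.
function retryDelays(retryCount = 2, initialDelayMs = 250, factor = 2) {
  const delays = [];
  for (let attempt = 0; attempt < retryCount; attempt++) {
    delays.push(initialDelayMs * factor ** attempt);
  }
  return delays;
}

console.log(retryDelays()); // [ 250, 500 ]
console.log(retryDelays(0)); // [] (retries disabled)
```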
242
+ ### Playback Options
243
+
244
+ | Variable | Description | Default |
245
+ |----------|-------------|---------|
246
+ | `VOICEVOX_USE_STREAMING` | Streaming playback (requires `ffplay`) | `false` |
247
+ | `VOICEVOX_DEFAULT_POST_PHONEME_LENGTH` | Trailing silence per segment in seconds. Increase for a longer pause between queued segments (also protects the end of speech from being cut off with streaming playback) | engine default |
248
+ | `VOICEVOX_DEFAULT_IMMEDIATE` | Immediate playback | `true` |
249
+ | `VOICEVOX_DEFAULT_WAIT_FOR_START` | Wait for playback start | `false` |
250
+ | `VOICEVOX_DEFAULT_WAIT_FOR_END` | Wait for playback end | `false` |
251
+
252
+ ### Restriction Settings
253
+
254
+ Restrict AI from specifying certain options.
255
+
256
+ | Variable | Description |
257
+ |----------|-------------|
258
+ | `VOICEVOX_RESTRICT_IMMEDIATE` | Restrict `immediate` option |
259
+ | `VOICEVOX_RESTRICT_WAIT_FOR_START` | Restrict `waitForStart` option |
260
+ | `VOICEVOX_RESTRICT_WAIT_FOR_END` | Restrict `waitForEnd` option |
261
+
262
+ ### Disable Tools
263
+
264
+ ```bash
265
+ # Disable individual tools
266
+ export VOICEVOX_DISABLED_TOOLS=speak_player,synthesize_file
267
+
268
+ # Disable a built-in group of tools
269
+ export VOICEVOX_DISABLED_GROUPS=player
270
+
271
+ # Combine groups and individual tools
272
+ export VOICEVOX_DISABLED_GROUPS=dictionary
273
+ export VOICEVOX_DISABLED_TOOLS=synthesize_file
274
+ ```
275
+
276
+ Built-in groups for `VOICEVOX_DISABLED_GROUPS` / `--disable-groups`:
277
+
278
+ | Group | Tools |
279
+ |-------|-------|
280
+ | `player` | `speak_player`, `resynthesize_player`, `get_player_state`, `open_dictionary_ui` |
281
+ | `dictionary` | `get_accent_phrases`, `get_user_dictionary`, `add_user_dictionary_word`, `update_user_dictionary_word`, `delete_user_dictionary_word`, `add_user_dictionary_words`, `update_user_dictionary_words` |
282
+ | `file` | `synthesize_file` |
283
+ | `apps` | `speak_player`, `resynthesize_player`, `open_dictionary_ui` (MCP App UI tools) |
284
+
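The group table above can be read as a lookup from group name to tool names. The following sketch, with the hypothetical names `TOOL_GROUPS` and `resolveDisabledTools`, shows how the two comma-separated settings might combine into one set of disabled tools; how the server actually merges them is an assumption.

```javascript
// Group membership copied from the table above.
const TOOL_GROUPS = {
  player: ["speak_player", "resynthesize_player", "get_player_state", "open_dictionary_ui"],
  dictionary: [
    "get_accent_phrases",
    "get_user_dictionary",
    "add_user_dictionary_word",
    "update_user_dictionary_word",
    "delete_user_dictionary_word",
    "add_user_dictionary_words",
    "update_user_dictionary_words",
  ],
  file: ["synthesize_file"],
  apps: ["speak_player", "resynthesize_player", "open_dictionary_ui"],
};

// Hypothetical helper: union of the individually disabled tools and every
// tool belonging to a disabled group. Unknown group names are ignored here.
function resolveDisabledTools(groupsCsv = "", toolsCsv = "") {
  const split = (csv) => csv.split(",").map((s) => s.trim()).filter(Boolean);
  const disabled = new Set(split(toolsCsv));
  for (const group of split(groupsCsv)) {
    for (const tool of TOOL_GROUPS[group] ?? []) disabled.add(tool);
  }
  return [...disabled].sort();
}

console.log(resolveDisabledTools("file", "speak_player"));
```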
285
+ ### UI Player Settings
286
+
287
+ | Variable | Description | Default |
288
+ |----------|-------------|---------|
289
+ | `VOICEVOX_PLAYER_DOMAIN` | Widget domain for UI player (required for ChatGPT, e.g. `https://your-app.onrender.com`) | _(unset)_ |
290
+ | `VOICEVOX_AUTO_PLAY` | Auto-play audio in UI player | `true` |
291
+ | `VOICEVOX_PLAYER_EXPORT_ENABLED` | Enable track export (download) from UI player (`false` to disable) | `true` |
292
+ | `VOICEVOX_PLAYER_EXPORT_DIR` | Default output directory for exported tracks (also used as fallback when folder picker is unavailable) | `./voicevox-player-exports` |
293
+ | `VOICEVOX_PLAYER_CACHE_DIR` | Directory for player cache files (`*.txt`) and default player state file | `./.voicevox-player-cache` |
294
+ | `VOICEVOX_PLAYER_AUDIO_CACHE_ENABLED` | Enable persistent audio cache on disk (`false` disables disk cache writes/reads) | `true` |
295
+ | `VOICEVOX_PLAYER_AUDIO_CACHE_TTL_DAYS` | Audio cache retention in days (`0`: disable disk cache, `-1`: no TTL cleanup) | `30` |
296
+ | `VOICEVOX_PLAYER_AUDIO_CACHE_MAX_MB` | Audio cache size cap in MB (`0`: disable disk cache, `-1`: unlimited) | `512` |
297
+ | `VOICEVOX_PLAYER_STATE_FILE` | Path of persisted player state JSON | `<VOICEVOX_PLAYER_CACHE_DIR>/player-state.json` |
298
+
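The special values `0` and `-1` in the audio cache settings are easy to mix up, so here is one possible reading of the table. `diskCacheSettings` is a hypothetical function, and treating 1 MB as 1024 × 1024 bytes is an assumption.

```javascript
// Hypothetical interpretation of the three audio cache settings.
function diskCacheSettings({ enabled = true, ttlDays = 30, maxMb = 512 } = {}) {
  // Per the table, `false`, a TTL of 0, or a size cap of 0 each disable the disk cache.
  if (!enabled || ttlDays === 0 || maxMb === 0) {
    return { diskCache: false };
  }
  return {
    diskCache: true,
    // null means "no TTL cleanup" (-1 in the table).
    ttlMs: ttlDays === -1 ? null : ttlDays * 24 * 60 * 60 * 1000,
    // null means "unlimited" (-1 in the table); 1 MB = 1024 * 1024 bytes is assumed.
    maxBytes: maxMb === -1 ? null : maxMb * 1024 * 1024,
  };
}

console.log(diskCacheSettings());
console.log(diskCacheSettings({ ttlDays: -1, maxMb: -1 }));
```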
299
+ ### Server Settings
300
+
301
+ | Variable | Description | Default |
302
+ |----------|-------------|---------|
303
+ | `MCP_HTTP_MODE` | Enable HTTP mode | `false` |
304
+ | `MCP_HTTP_PORT` | HTTP port | `3000` |
305
+ | `MCP_HTTP_HOST` | HTTP host | `0.0.0.0` |
306
+ | `MCP_ALLOWED_HOSTS` | Allowed hosts (comma-separated) | `localhost,127.0.0.1,[::1]` |
307
+ | `MCP_ALLOWED_ORIGINS` | Allowed origins (comma-separated) | `http://localhost,http://127.0.0.1,...` |
308
+ | `MCP_API_KEY` | Required API key for `/mcp` (sent via `X-API-Key` or `Authorization: Bearer`) | _(unset)_ |
309
+
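As a rough illustration of the two ways a client can send the key, the sketch below reads it from either header. `extractApiKey` and `isAuthorized` are hypothetical names, not the server's API; header names are assumed to arrive lower-cased (as Node's `http` module delivers them), and letting every request through when no key is configured is also an assumption.

```javascript
// Hypothetical helper: read the key from `X-API-Key` or `Authorization: Bearer`.
function extractApiKey(headers) {
  const direct = headers["x-api-key"];
  if (typeof direct === "string" && direct.length > 0) return direct;
  const auth = headers["authorization"];
  const match = typeof auth === "string" ? /^Bearer\s+(.+)$/i.exec(auth) : null;
  return match ? match[1] : null;
}

function isAuthorized(headers, expectedKey) {
  // Assumption: with MCP_API_KEY unset, this sketch lets every request through.
  if (!expectedKey) return true;
  return extractApiKey(headers) === expectedKey;
}

console.log(isAuthorized({ "x-api-key": "secret" }, "secret"));
console.log(isAuthorized({ authorization: "Bearer secret" }, "secret"));
console.log(isAuthorized({}, "secret"));
```

A production check would likely use a constant-time comparison such as `crypto.timingSafeEqual` rather than `===`.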
310
+ </details>
311
+
312
+ <details>
313
+ <summary><b>Command Line Arguments</b></summary>
314
+
315
+ Command line arguments take priority over environment variables.
316
+ The complete, up-to-date list of options is always available via `npx @kajidog/mcp-tts-voicevox --help`.
317
+
318
+ ```bash
319
+ # Basic settings
320
+ npx @kajidog/mcp-tts-voicevox --url http://192.168.1.100:50021 --speaker 3 --speed 1.2
321
+
322
+ # HTTP mode
323
+ npx @kajidog/mcp-tts-voicevox --http --port 8080
324
+
325
+ # With restrictions
326
+ npx @kajidog/mcp-tts-voicevox --restrict-immediate --restrict-wait-for-end
327
+
328
+ # Disable individual tools
329
+ npx @kajidog/mcp-tts-voicevox --disable-tools speak_player,synthesize_file
330
+
331
+ # Disable a tool group
332
+ npx @kajidog/mcp-tts-voicevox --disable-groups player
333
+ ```
334
+
335
+ | Argument | Description |
336
+ |----------|-------------|
337
+ | `--help`, `-h` | Show help |
338
+ | `--version`, `-v` | Show version |
339
+ | `--init` | Generate `.voicevoxrc.json` with default settings |
340
+ | `--config <path>` | Path to config file |
341
+ | `--url <value>` | VOICEVOX Engine URL |
342
+ | `--speaker <value>` | Default speaker ID |
343
+ | `--speed <value>` | Playback speed |
344
+ | `--use-streaming` / `--no-use-streaming` | Streaming playback |
345
+ | `--post-phoneme-length <sec>` | Trailing silence per segment (pause between queued segments) |
346
+ | `--immediate` / `--no-immediate` | Immediate playback |
347
+ | `--wait-for-start` / `--no-wait-for-start` | Wait for start |
348
+ | `--wait-for-end` / `--no-wait-for-end` | Wait for end |
349
+ | `--restrict-immediate` | Restrict immediate |
350
+ | `--restrict-wait-for-start` | Restrict waitForStart |
351
+ | `--restrict-wait-for-end` | Restrict waitForEnd |
352
+ | `--disable-tools <tools>` | Disable tools (comma-separated tool names) |
353
+ | `--disable-groups <groups>` | Disable tool groups: `player`, `dictionary`, `file`, `apps` |
354
+ | `--auto-play` / `--no-auto-play` | Auto-play in UI player |
355
+ | `--player-export` / `--no-player-export` | Enable/disable track export (download) in UI player |
356
+ | `--player-export-dir <dir>` | Default output directory for exported tracks |
357
+ | `--player-cache-dir <dir>` | Player cache directory |
358
+ | `--player-state-file <path>` | Persisted player state file path |
359
+ | `--player-audio-cache` / `--no-player-audio-cache` | Enable/disable disk audio cache for player |
360
+ | `--player-audio-cache-ttl-days <days>` | Audio cache retention days (`0`: disable, `-1`: no TTL cleanup) |
361
+ | `--player-audio-cache-max-mb <mb>` | Audio cache size cap in MB (`0`: disable, `-1`: unlimited) |
362
+ | `--http` | HTTP mode |
363
+ | `--port <value>` | HTTP port |
364
+ | `--host <value>` | HTTP host |
365
+ | `--allowed-hosts <hosts>` | Allowed hosts (comma-separated) |
366
+ | `--allowed-origins <origins>` | Allowed origins (comma-separated) |
367
+ | `--api-key <key>` | Required API key for `/mcp` |
368
+
369
+ </details>
370
+
371
+ <details>
372
+ <summary><b>Config File (.voicevoxrc.json)</b></summary>
373
+
374
+ You can use a JSON config file instead of (or in addition to) environment variables and CLI arguments. This is useful when you have many settings to configure.
375
+
376
+ **Priority order:** CLI args > Environment variables > Config file > Defaults
377
+
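The priority order can be sketched as a layered merge in which higher-priority sources overwrite lower ones. `resolveConfig` and the sample values are hypothetical; the real loader may normalize keys or validate values along the way.

```javascript
// Hypothetical sketch of "CLI args > Environment variables > Config file > Defaults".
function resolveConfig({ cli = {}, env = {}, file = {}, defaults = {} }) {
  // Drop undefined values so an unset option never masks a lower layer.
  const defined = (layer) =>
    Object.fromEntries(Object.entries(layer).filter(([, value]) => value !== undefined));
  // Later spreads win, so the lowest-priority layer goes first.
  return { ...defined(defaults), ...defined(file), ...defined(env), ...defined(cli) };
}

const config = resolveConfig({
  defaults: { url: "http://localhost:50021", speaker: 1, port: 3000 },
  file: { speaker: 3, port: 8080 },
  env: { port: 3000 },
  cli: { speaker: 5, port: undefined },
});

console.log(config); // speaker comes from the CLI, port from the environment
```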
378
+ ### Generate a config file
379
+
380
+ ```bash
381
+ npx @kajidog/mcp-tts-voicevox --init
382
+ ```
383
+
384
+ This creates `.voicevoxrc.json` in the current directory with all default settings. Edit it as needed.
385
+
386
+ ### Use a custom config file path
387
+
388
+ ```bash
389
+ npx @kajidog/mcp-tts-voicevox --config ./my-config.json
390
+ ```
391
+
392
+ Or via environment variable:
393
+
394
+ ```bash
395
+ VOICEVOX_CONFIG=./my-config.json npx @kajidog/mcp-tts-voicevox
396
+ ```
397
+
398
+ ### Example `.voicevoxrc.json`
399
+
400
+ ```json
401
+ {
+   "url": "http://192.168.1.50:50021",
+   "speaker": 3,
+   "speed": 1.2,
+   "http": true,
+   "port": 8080,
+   "disable-tools": ["synthesize_file"],
+   "disable-groups": ["dictionary"]
+ }
410
+ ```
411
+
412
+ Keys can be written in kebab-case (`use-streaming`), camelCase (`useStreaming`), or internal key names (`defaultSpeaker`). If `.voicevoxrc.json` exists in the current directory, it is loaded automatically.
413
+
414
+ </details>
415
+
416
+ <details>
417
+ <summary><b>HTTP Mode</b></summary>
418
+
419
+ For remote connections:
420
+
421
+ **Start Server:**
422
+
423
+ ```bash
424
+ # Linux/macOS
425
+ MCP_HTTP_MODE=true MCP_HTTP_PORT=3000 npx @kajidog/mcp-tts-voicevox
426
+
427
+ # Windows PowerShell
428
+ $env:MCP_HTTP_MODE='true'; $env:MCP_HTTP_PORT='3000'; npx @kajidog/mcp-tts-voicevox
429
+ ```
430
+
431
+ **Claude Desktop Config (using mcp-remote):**
432
+
433
+ ```json
434
+ {
+   "mcpServers": {
+     "tts-mcp-proxy": {
+       "command": "npx",
+       "args": ["-y", "mcp-remote", "http://localhost:3000/mcp"]
+     }
+   }
+ }
442
+ ```
443
+
444
+ ### Per-Project Speaker Settings
445
+
446
+ With Claude Code, you can configure different default speakers per project using custom headers in `.mcp.json`:
447
+
448
+ | Header | Description |
449
+ |--------|-------------|
450
+ | `X-Voicevox-Speaker` | Default speaker ID for this project |
451
+ | `X-API-Key` | API key when `MCP_API_KEY` is configured |
452
+
453
+ **Example `.mcp.json`:**
454
+
455
+ ```json
456
+ {
+   "mcpServers": {
+     "tts": {
+       "type": "http",
+       "url": "http://localhost:3000/mcp",
+       "headers": {
+         "X-Voicevox-Speaker": "113",
+         "X-API-Key": "your-api-key"
+       }
+     }
+   }
+ }
468
+ ```
469
+
470
+ This allows each project to use a different voice character automatically.
471
+
472
+ **Priority order:**
473
+ 1. Explicit `speaker` parameter in tool call (highest)
474
+ 2. Project default from `X-Voicevox-Speaker` header
475
+ 3. Global `VOICEVOX_DEFAULT_SPEAKER` setting (lowest)
476
+
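The three-level speaker priority above could look like the following. `resolveSpeaker` is a hypothetical function, and falling through to the next level when the header is not an integer is an assumption about error handling.

```javascript
// Hypothetical sketch: tool-call parameter > X-Voicevox-Speaker header > global default.
function resolveSpeaker({ explicit, headerValue, globalDefault = 1 } = {}) {
  if (Number.isInteger(explicit)) return explicit;
  // Header values arrive as strings; a missing or non-numeric value yields NaN.
  const fromHeader = Number.parseInt(headerValue ?? "", 10);
  if (Number.isInteger(fromHeader)) return fromHeader;
  return globalDefault;
}

console.log(resolveSpeaker({ explicit: 3, headerValue: "113" })); // 3
console.log(resolveSpeaker({ headerValue: "113" })); // 113
console.log(resolveSpeaker({})); // 1
```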
477
+ </details>
478
+
479
+ <details>
480
+ <summary><b>WSL to Windows Host Connection</b></summary>
481
+
482
+ Connecting from WSL to an MCP server running on Windows:
483
+
484
+ ### 1. Get Windows Host IP from WSL
485
+
486
+ ```bash
487
+ # Method 1: From default gateway
488
+ ip route show | grep -oP 'default via \K[\d.]+'
489
+ # Usually in the format 172.x.x.1
490
+
491
+ # Method 2: From /etc/resolv.conf (WSL2)
492
+ cat /etc/resolv.conf | grep nameserver | awk '{print $2}'
493
+ ```
494
+
495
+ ### 2. Start Server on Windows
496
+
497
+ Add the WSL gateway IP to `MCP_ALLOWED_HOSTS` to allow access from WSL:
498
+
499
+ ```powershell
500
+ $env:MCP_HTTP_MODE='true'
501
+ $env:MCP_ALLOWED_HOSTS='localhost,127.0.0.1,172.29.176.1'
502
+ npx @kajidog/mcp-tts-voicevox
503
+ ```
504
+
505
+ Or with CLI arguments:
506
+
507
+ ```powershell
508
+ npx @kajidog/mcp-tts-voicevox --http --allowed-hosts "localhost,127.0.0.1,172.29.176.1"
509
+ ```
510
+
511
+ ### 3. WSL Configuration (.mcp.json)
512
+
513
+ ```json
514
+ {
+   "mcpServers": {
+     "tts": {
+       "type": "http",
+       "url": "http://172.29.176.1:3000/mcp"
+     }
+   }
+ }
522
+ ```
523
+
524
+ > ⚠️ Within WSL, `localhost` refers to WSL itself. Use the WSL gateway IP to access the Windows host.
525
+
526
+ </details>
527
+
528
+ <details>
529
+ <summary><b>Using with ChatGPT</b></summary>
530
+
531
+ To use with ChatGPT, deploy the MCP server in HTTP mode to the cloud with access to a VOICEVOX Engine.
532
+
533
+ ### 1. Deploy to the Cloud
534
+
535
+ Deploy with Docker to Render, Railway, etc. (Dockerfile included).
536
+
537
+ ### 2. Set Up VOICEVOX Engine
538
+
539
+ Run VOICEVOX Engine locally and expose it via ngrok, or deploy it alongside the MCP server.
540
+
541
+ ### 3. Configure Environment Variables
542
+
543
+ | Variable | Example | Description |
544
+ |----------|---------|-------------|
545
+ | `VOICEVOX_URL` | `https://xxxx.ngrok-free.app` | VOICEVOX Engine URL |
546
+ | `MCP_HTTP_MODE` | `true` | Enable HTTP mode |
547
+ | `MCP_ALLOWED_HOSTS` | `your-app.onrender.com` | Deployed hostname |
548
+ | `VOICEVOX_PLAYER_DOMAIN` | `https://your-app.onrender.com` | Widget domain for UI player (required for ChatGPT) |
549
+ | `VOICEVOX_DISABLED_TOOLS` | `speak` | Disable server-side playback (no audio device) |
550
+ | `VOICEVOX_PLAYER_EXPORT_ENABLED` | `false` | Disable export feature (files cannot be downloaded from cloud) |
551
+
552
+ ### 4. Add Connector in ChatGPT
553
+
554
+ Go to ChatGPT Settings → Connectors → Add MCP server URL (`https://your-app.onrender.com/mcp`).
555
+
556
+ </details>
557
+
558
+ <details>
559
+ <summary><b>Using with Claude Web</b></summary>
560
+
561
+ The basic steps are the same as ChatGPT, but the `VOICEVOX_PLAYER_DOMAIN` value is different.
562
+
563
+ Claude Web requires `ui.domain` to be a **hash-based dedicated domain**. Compute it with the following command:
564
+
565
+ ```bash
566
+ node -e "console.log(require('crypto').createHash('sha256').update('Your MCP server URL').digest('hex').slice(0,32)+'.claudemcpcontent.com')"
567
+ ```
568
+
569
+ Example: If your MCP server URL is `https://your-app.onrender.com/mcp`:
570
+
571
+ ```bash
572
+ node -e "console.log(require('crypto').createHash('sha256').update('https://your-app.onrender.com/mcp').digest('hex').slice(0,32)+'.claudemcpcontent.com')"
573
+ # Example output: 48fb73a6...claudemcpcontent.com
574
+ ```
575
+
576
+ Set this output value as `VOICEVOX_PLAYER_DOMAIN`.
577
+
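For use inside a deploy script, the one-liner above can be wrapped in a small function. `claudeWebPlayerDomain` is a hypothetical name; the hashing steps are the same as in the command (SHA-256 of the MCP server URL, first 32 hex characters, then the `.claudemcpcontent.com` suffix).

```javascript
const { createHash } = require("crypto");

// Same computation as the `node -e` one-liner, as a reusable function.
function claudeWebPlayerDomain(mcpServerUrl) {
  const digest = createHash("sha256").update(mcpServerUrl).digest("hex");
  return `${digest.slice(0, 32)}.claudemcpcontent.com`;
}

console.log(claudeWebPlayerDomain("https://your-app.onrender.com/mcp"));
```

The value depends on the exact URL string, so a trailing slash or a different path would produce a different domain.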
578
+ > **Note**: Since ChatGPT and Claude Web require different `VOICEVOX_PLAYER_DOMAIN` values, a single instance cannot serve both clients simultaneously. Deploy separate instances for each, or switch the environment variable depending on your target client.
579
+
580
+ </details>
581
+
582
+ ---
583
+
584
+ ## Troubleshooting
585
+
586
+ <details>
587
+ <summary><b>Audio is not playing</b></summary>
588
+
589
+ **1. Check if VOICEVOX Engine is running**
590
+
591
+ ```bash
592
+ curl http://localhost:50021/speakers
593
+ ```
594
+
595
+ **2. Check platform-specific playback tools**
596
+
597
+ | OS | Required Tool |
598
+ |----|---------------|
599
+ | Linux | One of `aplay`, `paplay`, `play`, `ffplay` |
600
+ | macOS | `afplay` (pre-installed) |
601
+ | Windows | PowerShell (pre-installed) |
602
+
603
+ </details>
604
+
605
+ <details>
606
+ <summary><b>Not recognized by MCP client</b></summary>
607
+
608
+ - Check package installation: `npm list -g @kajidog/mcp-tts-voicevox`
609
+ - Verify JSON syntax in config file
610
+ - Restart the client
611
+
612
+ </details>
613
+
614
+ ---
615
+
616
+ ## Package Structure
617
+
618
+ | Package | Description |
619
+ |---------|-------------|
620
+ | `@kajidog/mcp-tts-voicevox` | MCP server |
621
+ | [`@kajidog/voicevox-client`](https://www.npmjs.com/package/@kajidog/voicevox-client) | General-purpose VOICEVOX client library (can be used independently) |
622
+ | `@kajidog/player-ui` | React-based audio player UI for browser playback |
623
+
624
+ ---
625
+
626
+ <details>
627
+ <summary><b>Developer Information</b></summary>
628
+
629
+ ### Setup
630
+
631
+ ```bash
632
+ git clone https://github.com/kajidog/mcp-tts-voicevox.git
633
+ cd mcp-tts-voicevox
634
+ pnpm install
635
+ ```
636
+
637
+ ### Commands
638
+
639
+ | Command | Description |
640
+ |---------|-------------|
641
+ | `pnpm build` | Build all packages |
642
+ | `pnpm test` | Run tests |
643
+ | `pnpm lint` | Run lint |
644
+ | `pnpm dev` | Start dev server |
645
+ | `pnpm dev:stdio` | Dev with stdio mode |
646
+ | `pnpm dev:bun` | Start dev server with Bun |
647
+ | `pnpm dev:bun:http` | Start HTTP dev server with Bun |
648
+
649
+ </details>
650
+
651
+ ---
652
+
653
+ ## License
654
+
655
+ ISC