@loonylabs/tts-middleware 0.4.0 → 0.5.0
- package/README.md +372 -176
- package/dist/middleware/services/tts/providers/fish-audio-provider.d.ts +90 -0
- package/dist/middleware/services/tts/providers/fish-audio-provider.d.ts.map +1 -0
- package/dist/middleware/services/tts/providers/fish-audio-provider.js +198 -0
- package/dist/middleware/services/tts/providers/fish-audio-provider.js.map +1 -0
- package/dist/middleware/services/tts/providers/google-cloud-tts-provider.d.ts +14 -4
- package/dist/middleware/services/tts/providers/google-cloud-tts-provider.d.ts.map +1 -1
- package/dist/middleware/services/tts/providers/google-cloud-tts-provider.js +38 -16
- package/dist/middleware/services/tts/providers/google-cloud-tts-provider.js.map +1 -1
- package/dist/middleware/services/tts/providers/index.d.ts +1 -0
- package/dist/middleware/services/tts/providers/index.d.ts.map +1 -1
- package/dist/middleware/services/tts/providers/index.js +3 -1
- package/dist/middleware/services/tts/providers/index.js.map +1 -1
- package/dist/middleware/services/tts/tts.service.d.ts.map +1 -1
- package/dist/middleware/services/tts/tts.service.js +13 -0
- package/dist/middleware/services/tts/tts.service.js.map +1 -1
- package/dist/middleware/services/tts/types/common.types.d.ts +2 -1
- package/dist/middleware/services/tts/types/common.types.d.ts.map +1 -1
- package/dist/middleware/services/tts/types/common.types.js +1 -0
- package/dist/middleware/services/tts/types/common.types.js.map +1 -1
- package/dist/middleware/services/tts/types/index.d.ts +2 -2
- package/dist/middleware/services/tts/types/index.d.ts.map +1 -1
- package/dist/middleware/services/tts/types/index.js +2 -1
- package/dist/middleware/services/tts/types/index.js.map +1 -1
- package/dist/middleware/services/tts/types/provider-options.types.d.ts +91 -1
- package/dist/middleware/services/tts/types/provider-options.types.d.ts.map +1 -1
- package/dist/middleware/services/tts/types/provider-options.types.js +13 -0
- package/dist/middleware/services/tts/types/provider-options.types.js.map +1 -1
- package/dist/middleware/shared/config/tts.config.d.ts +16 -0
- package/dist/middleware/shared/config/tts.config.d.ts.map +1 -1
- package/dist/middleware/shared/config/tts.config.js +10 -0
- package/dist/middleware/shared/config/tts.config.js.map +1 -1
- package/package.json +3 -2
package/README.md
CHANGED
|
@@ -1,155 +1,146 @@
|
|
|
1
|
-
|
|
1
|
+
<div align="center">
|
|
2
2
|
|
|
3
|
-
|
|
4
|
-
[](https://opensource.org/licenses/MIT)
|
|
5
|
-
[]()
|
|
6
|
-
[]()
|
|
3
|
+
# TTS Middleware
|
|
7
4
|
|
|
8
|
-
|
|
5
|
+
*Provider-agnostic Text-to-Speech middleware with **GDPR compliance** support. Currently supports Azure Speech Services, EdenAI, Google Cloud TTS, and Fish Audio. Features EU data residency via Azure and Google Cloud, pluggable logging, character-based billing, and comprehensive error handling.*
|
|
9
6
|
|
|
10
|
-
|
|
7
|
+
<!-- Horizontal Badge Navigation Bar -->
|
|
8
|
+
[](https://www.npmjs.com/package/@loonylabs/tts-middleware)
|
|
9
|
+
[](https://www.npmjs.com/package/@loonylabs/tts-middleware)
|
|
10
|
+
[](#features)
|
|
11
|
+
[](#prerequisites)
|
|
12
|
+
[](#license)
|
|
13
|
+
[](https://github.com/loonylabs-dev/tts-middleware)
|
|
11
14
|
|
|
12
|
-
|
|
15
|
+
</div>
|
|
13
16
|
|
|
14
|
-
|
|
17
|
+
<!-- Table of Contents -->
|
|
18
|
+
<details>
|
|
19
|
+
<summary><strong>Table of Contents</strong></summary>
|
|
15
20
|
|
|
16
|
-
-
|
|
17
|
-
-
|
|
18
|
-
|
|
19
|
-
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
-
|
|
23
|
-
-
|
|
24
|
-
-
|
|
25
|
-
-
|
|
26
|
-
-
|
|
21
|
+
- [Features](#features)
|
|
22
|
+
- [Quick Start](#quick-start)
|
|
23
|
+
- [Prerequisites](#prerequisites)
|
|
24
|
+
- [Configuration](#configuration)
|
|
25
|
+
- [Providers & Models](#providers--models)
|
|
26
|
+
- [GDPR / Compliance](#gdpr--compliance)
|
|
27
|
+
- [API Reference](#api-reference)
|
|
28
|
+
- [Advanced Features](#advanced-features)
|
|
29
|
+
- [Testing](#testing)
|
|
30
|
+
- [Contributing](#contributing)
|
|
31
|
+
- [License](#license)
|
|
32
|
+
- [Links](#links)
|
|
27
33
|
|
|
28
|
-
|
|
34
|
+
</details>
|
|
29
35
|
|
|
30
|
-
|
|
36
|
+
---
|
|
31
37
|
|
|
32
|
-
|
|
33
|
-
npm install @loonylabs/tts-middleware
|
|
34
|
-
```
|
|
38
|
+
## Features
|
|
35
39
|
|
|
36
|
-
|
|
40
|
+
- **Multi-Provider Architecture**: Unified API for all TTS providers
|
|
41
|
+
- **Azure Speech Services** (MVP): Neural voices with emotion/style, EU regions
|
|
42
|
+
- **EdenAI**: Aggregator with access to Google, OpenAI, Amazon, IBM, ElevenLabs
|
|
43
|
+
- **Google Cloud TTS**: Neural2, WaveNet, Studio voices with EU data residency
|
|
44
|
+
- **Fish Audio**: S1 model with 13 languages & 64+ emotions (test/admin only)
|
|
45
|
+
- **Ready for:** OpenAI, ElevenLabs, Deepgram (interfaces prepared)
|
|
46
|
+
- **GDPR/DSGVO Compliance**: Built-in EU region support for Azure and Google Cloud
|
|
47
|
+
- **SSML Abstraction**: Auto-generates provider-specific SSML from simple JSON options
|
|
48
|
+
- **Character Billing**: Accurate character counting for cost calculation
|
|
49
|
+
- **Pluggable Logger**: Bring your own logger (Winston, Pino, etc.) or use the built-in console logger
|
|
50
|
+
- **TypeScript First**: Full type safety with comprehensive interfaces
|
|
51
|
+
- **Error Handling**: Typed error classes (InvalidConfig, QuotaExceeded, SynthesisFailed, etc.)
|
|
52
|
+
- **Zero Lock-in**: Switch providers without changing your application code
|
|
37
53
|
|
|
38
|
-
|
|
54
|
+
## Quick Start
|
|
39
55
|
|
|
40
|
-
|
|
56
|
+
### Installation
|
|
41
57
|
|
|
42
|
-
|
|
43
|
-
# Default Provider
|
|
44
|
-
TTS_DEFAULT_PROVIDER=azure
|
|
58
|
+
Install from npm:
|
|
45
59
|
|
|
46
|
-
|
|
47
|
-
|
|
48
|
-
|
|
60
|
+
```bash
|
|
61
|
+
npm install @loonylabs/tts-middleware
|
|
62
|
+
```
|
|
49
63
|
|
|
50
|
-
|
|
51
|
-
GOOGLE_APPLICATION_CREDENTIALS=./path/to/service-account.json
|
|
52
|
-
GOOGLE_CLOUD_PROJECT=your-project-id
|
|
53
|
-
GOOGLE_TTS_REGION=eu # Options: eu, europe-west3 (Frankfurt), europe-west1, etc.
|
|
64
|
+
Or install directly from GitHub:
|
|
54
65
|
|
|
55
|
-
|
|
56
|
-
|
|
66
|
+
```bash
|
|
67
|
+
npm install github:loonylabs-dev/tts-middleware
|
|
57
68
|
```
|
|
58
69
|
|
|
59
|
-
###
|
|
70
|
+
### Basic Usage
|
|
60
71
|
|
|
61
72
|
```typescript
|
|
62
73
|
import { ttsService, TTSProvider } from '@loonylabs/tts-middleware';
|
|
63
74
|
import fs from 'fs';
|
|
64
75
|
|
|
65
|
-
|
|
66
|
-
|
|
67
|
-
|
|
68
|
-
|
|
69
|
-
|
|
70
|
-
voice: { id: "en-US-JennyNeural" }, // Provider-specific voice ID
|
|
71
|
-
audio: {
|
|
72
|
-
format: "mp3",
|
|
73
|
-
speed: 1.0
|
|
74
|
-
}
|
|
75
|
-
});
|
|
76
|
-
|
|
77
|
-
// Save to file
|
|
78
|
-
fs.writeFileSync('output.mp3', response.audio);
|
|
79
|
-
|
|
80
|
-
console.log(`Generated audio: ${response.metadata.duration}ms`);
|
|
81
|
-
console.log(`Billed characters: ${response.billing.characters}`);
|
|
82
|
-
|
|
83
|
-
} catch (error) {
|
|
84
|
-
console.error("Synthesis failed:", error);
|
|
85
|
-
}
|
|
86
|
-
}
|
|
76
|
+
const response = await ttsService.synthesize({
|
|
77
|
+
text: 'Hallo Welt! Dies ist ein Test.',
|
|
78
|
+
voice: { id: 'de-DE-KatjaNeural' },
|
|
79
|
+
audio: { format: 'mp3', speed: 1.0 },
|
|
80
|
+
});
|
|
87
81
|
|
|
88
|
-
|
|
82
|
+
fs.writeFileSync('output.mp3', response.audio);
|
|
83
|
+
console.log('Characters billed:', response.billing.characters);
|
|
84
|
+
console.log('Duration:', response.metadata.duration, 'ms');
|
|
89
85
|
```
|
|
90
86
|
|
|
91
|
-
|
|
92
|
-
|
|
93
|
-
## 🛠️ Advanced Usage
|
|
94
|
-
|
|
95
|
-
### Using Provider-Specific Features (e.g., Azure Emotions)
|
|
87
|
+
<details>
|
|
88
|
+
<summary><strong>Switching Providers</strong></summary>
|
|
96
89
|
|
|
97
90
|
```typescript
|
|
98
|
-
|
|
99
|
-
|
|
91
|
+
// Azure with emotion
|
|
92
|
+
const azure = await ttsService.synthesize({
|
|
93
|
+
text: 'Great news!',
|
|
100
94
|
provider: TTSProvider.AZURE,
|
|
101
|
-
voice: { id:
|
|
102
|
-
providerOptions: {
|
|
103
|
-
emotion: "cheerful", // Azure-specific
|
|
104
|
-
style: "chat", // Azure-specific
|
|
105
|
-
styleDegree: 1.5
|
|
106
|
-
}
|
|
95
|
+
voice: { id: 'en-US-JennyNeural' },
|
|
96
|
+
providerOptions: { emotion: 'cheerful', style: 'chat' },
|
|
107
97
|
});
|
|
108
|
-
```
|
|
109
98
|
|
|
110
|
-
|
|
99
|
+
// Google Cloud TTS (EU-compliant)
|
|
100
|
+
const google = await ttsService.synthesize({
|
|
101
|
+
text: 'Hallo aus Frankfurt!',
|
|
102
|
+
provider: TTSProvider.GOOGLE,
|
|
103
|
+
voice: { id: 'de-DE-Neural2-C' },
|
|
104
|
+
providerOptions: { region: 'europe-west3' },
|
|
105
|
+
});
|
|
111
106
|
|
|
112
|
-
|
|
113
|
-
|
|
114
|
-
|
|
115
|
-
text: "Hello from Google via EdenAI",
|
|
107
|
+
// EdenAI (OpenAI voices via aggregator)
|
|
108
|
+
const edenai = await ttsService.synthesize({
|
|
109
|
+
text: 'Hello World',
|
|
116
110
|
provider: TTSProvider.EDENAI,
|
|
117
|
-
voice: { id:
|
|
118
|
-
providerOptions: {
|
|
119
|
-
|
|
120
|
-
|
|
111
|
+
voice: { id: 'en-US' },
|
|
112
|
+
providerOptions: { provider: 'openai', settings: { openai: 'en_nova' } },
|
|
113
|
+
});
|
|
114
|
+
|
|
115
|
+
// Fish Audio (test/admin only)
|
|
116
|
+
const fish = await ttsService.synthesize({
|
|
117
|
+
text: '(excited) Das ist fantastisch!',
|
|
118
|
+
provider: TTSProvider.FISH_AUDIO,
|
|
119
|
+
voice: { id: '90042f762dbf49baa2e7776d011eee6b' },
|
|
120
|
+
providerOptions: { model: 's1' },
|
|
121
121
|
});
|
|
122
122
|
```
|
|
123
123
|
|
|
124
|
-
|
|
124
|
+
</details>
|
|
125
125
|
|
|
126
|
-
|
|
126
|
+
<details>
|
|
127
|
+
<summary><strong>Using OpenAI Voices via EdenAI</strong></summary>
|
|
127
128
|
|
|
128
129
|
```typescript
|
|
129
130
|
// German with OpenAI "nova" voice (female)
|
|
130
131
|
const response = await ttsService.synthesize({
|
|
131
|
-
text:
|
|
132
|
+
text: 'Hallo Welt! Das ist ein Test.',
|
|
132
133
|
provider: TTSProvider.EDENAI,
|
|
133
|
-
voice: { id:
|
|
134
|
+
voice: { id: 'de' },
|
|
134
135
|
providerOptions: {
|
|
135
|
-
provider:
|
|
136
|
-
settings: { openai:
|
|
137
|
-
}
|
|
138
|
-
});
|
|
139
|
-
|
|
140
|
-
// English with OpenAI "onyx" voice (male, deep)
|
|
141
|
-
const response = await ttsService.synthesize({
|
|
142
|
-
text: "Hello World! This is a test.",
|
|
143
|
-
provider: TTSProvider.EDENAI,
|
|
144
|
-
voice: { id: "en" },
|
|
145
|
-
providerOptions: {
|
|
146
|
-
provider: "openai",
|
|
147
|
-
settings: { openai: "en_onyx" }
|
|
148
|
-
}
|
|
136
|
+
provider: 'openai',
|
|
137
|
+
settings: { openai: 'de_nova' },
|
|
138
|
+
},
|
|
149
139
|
});
|
|
150
140
|
```
|
|
151
141
|
|
|
152
142
|
**Available OpenAI Voices:**
|
|
143
|
+
|
|
153
144
|
| Voice | Character |
|
|
154
145
|
|-------|-----------|
|
|
155
146
|
| `alloy` | Neutral |
|
|
@@ -161,29 +152,22 @@ const response = await ttsService.synthesize({
|
|
|
161
152
|
|
|
162
153
|
Format: `{language}_{voice}` (e.g., `de_nova`, `en_alloy`, `fr_shimmer`)
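
The `{language}_{voice}` format can be built with a small helper. This is a minimal sketch, not part of the package: `openAIVoiceSetting` is a hypothetical name, and the two-letter language check is an assumption based on the `de`, `en`, and `fr` examples.

```typescript
// Voice names as listed in the OpenAI voices table of this README.
const OPENAI_VOICES = ['alloy', 'echo', 'fable', 'onyx', 'nova', 'shimmer'] as const;
type OpenAIVoice = (typeof OPENAI_VOICES)[number];

// Hypothetical helper: joins a language code and a voice name as `{language}_{voice}`.
function openAIVoiceSetting(language: string, voice: OpenAIVoice): string {
  const lang = language.trim().toLowerCase();
  // Assumption: language codes are two lowercase letters, as in the examples.
  if (!/^[a-z]{2}$/.test(lang)) {
    throw new Error(`Expected a two-letter language code, got "${language}"`);
  }
  return `${lang}_${voice}`;
}
```

The result would be passed as `providerOptions.settings.openai`, as in the example above.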
|
|
163
154
|
|
|
164
|
-
|
|
155
|
+
</details>
|
|
165
156
|
|
|
166
|
-
|
|
157
|
+
<details>
|
|
158
|
+
<summary><strong>Using Google Cloud TTS (GDPR/DSGVO-Compliant)</strong></summary>
|
|
167
159
|
|
|
168
160
|
```typescript
|
|
169
|
-
// Basic usage with EU endpoint (default)
|
|
170
|
-
const response = await ttsService.synthesize({
|
|
171
|
-
text: "Guten Tag, wie geht es Ihnen?",
|
|
172
|
-
provider: TTSProvider.GOOGLE,
|
|
173
|
-
voice: { id: "de-DE-Neural2-G" }, // G=Female, H=Male
|
|
174
|
-
audio: { format: "mp3" }
|
|
175
|
-
});
|
|
176
|
-
|
|
177
161
|
// With Frankfurt endpoint for maximum DSGVO compliance
|
|
178
162
|
const response = await ttsService.synthesize({
|
|
179
|
-
text:
|
|
163
|
+
text: 'Guten Tag, wie geht es Ihnen?',
|
|
180
164
|
provider: TTSProvider.GOOGLE,
|
|
181
|
-
voice: { id:
|
|
182
|
-
audio: { format:
|
|
165
|
+
voice: { id: 'de-DE-Neural2-G' },
|
|
166
|
+
audio: { format: 'mp3' },
|
|
183
167
|
providerOptions: {
|
|
184
|
-
region:
|
|
185
|
-
effectsProfileId: [
|
|
186
|
-
}
|
|
168
|
+
region: 'europe-west3',
|
|
169
|
+
effectsProfileId: ['headphone-class-device'],
|
|
170
|
+
},
|
|
187
171
|
});
|
|
188
172
|
```
|
|
189
173
|
|
|
@@ -194,103 +178,315 @@ const response = await ttsService.synthesize({
|
|
|
194
178
|
| Neural2 | `de-DE-Neural2-G` | `de-DE-Neural2-H` | Best value |
|
|
195
179
|
| WaveNet | `de-DE-Wavenet-G` | `de-DE-Wavenet-H` | Good |
|
|
196
180
|
| Studio | `de-DE-Studio-C` | `de-DE-Studio-B` | Premium |
|
|
197
|
-
| Chirp3-HD | `
|
|
181
|
+
| Chirp3-HD | `Aoede`, `Kore`, ... | `Fenrir`, `Puck`, ... | Newest |
|
|
198
182
|
|
|
199
|
-
>
|
|
183
|
+
</details>
|
|
200
184
|
|
|
201
|
-
|
|
185
|
+
## Prerequisites
|
|
202
186
|
|
|
203
|
-
|
|
187
|
+
<details>
|
|
188
|
+
<summary><strong>Required Dependencies</strong></summary>
|
|
204
189
|
|
|
205
|
-
|
|
190
|
+
- **Node.js** 18+
|
|
191
|
+
- **TypeScript** 5.3+
|
|
192
|
+
- Provider credentials (API keys / service accounts)
|
|
206
193
|
|
|
207
|
-
|
|
208
|
-
graph TD
|
|
209
|
-
App[Your Application] -->|synthesize()| Service[TTSService]
|
|
210
|
-
Service -->|getProvider()| Registry{Provider Registry}
|
|
194
|
+
</details>
|
|
211
195
|
|
|
212
|
-
|
|
213
|
-
Registry -->|Select| GCloud[GoogleCloudTTSProvider]
|
|
214
|
-
Registry -->|Select| Eden[EdenAIProvider]
|
|
196
|
+
## Configuration
|
|
215
197
|
|
|
216
|
-
|
|
217
|
-
|
|
218
|
-
Eden -->|REST| EdenAPI[EdenAI API]
|
|
198
|
+
<details>
|
|
199
|
+
<summary><strong>Environment Setup</strong></summary>
|
|
219
200
|
|
|
220
|
-
|
|
221
|
-
|
|
222
|
-
|
|
201
|
+
Create a `.env` file in your project root:
|
|
202
|
+
|
|
203
|
+
```env
|
|
204
|
+
# Default provider
|
|
205
|
+
TTS_DEFAULT_PROVIDER=azure
|
|
206
|
+
|
|
207
|
+
# Azure Speech Services (EU-compliant)
|
|
208
|
+
AZURE_SPEECH_KEY=your-azure-speech-key
|
|
209
|
+
AZURE_SPEECH_REGION=germanywestcentral
|
|
210
|
+
|
|
211
|
+
# EdenAI (multi-provider aggregator)
|
|
212
|
+
EDENAI_API_KEY=your-edenai-api-key
|
|
213
|
+
|
|
214
|
+
# Google Cloud TTS (EU-compliant)
|
|
215
|
+
GOOGLE_APPLICATION_CREDENTIALS=./service-account.json
|
|
216
|
+
GOOGLE_CLOUD_PROJECT=your-project-id
|
|
217
|
+
GOOGLE_TTS_REGION=eu
|
|
218
|
+
|
|
219
|
+
# Fish Audio (test/admin only - no EU data residency)
|
|
220
|
+
FISH_AUDIO_API_KEY=your-fish-audio-api-key
|
|
221
|
+
|
|
222
|
+
# Logging
|
|
223
|
+
TTS_DEBUG=false
|
|
224
|
+
LOG_LEVEL=info
|
|
223
225
|
```
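
Before the first request it can help to check that the variables for the chosen provider are present. The sketch below is an illustration only: the provider keys, the `missingEnv` helper, and the choice of which variables count as required are assumptions, not something the package exports.

```typescript
type Env = Record<string, string | undefined>;

// Assumed mapping from a provider label to the variables shown in the .env example.
const REQUIRED_ENV: Record<string, string[]> = {
  azure: ['AZURE_SPEECH_KEY', 'AZURE_SPEECH_REGION'],
  edenai: ['EDENAI_API_KEY'],
  google: ['GOOGLE_APPLICATION_CREDENTIALS'],
  fish_audio: ['FISH_AUDIO_API_KEY'],
};

// Returns the names of required variables that are unset or blank.
function missingEnv(provider: string, env: Env): string[] {
  const required = REQUIRED_ENV[provider] ?? [];
  return required.filter((name) => (env[name] ?? '').trim() === '');
}
```

In a Node.js application the second argument would typically be `process.env`; an empty result means nothing is missing for that provider.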
|
|
224
226
|
|
|
225
|
-
|
|
227
|
+
</details>
|
|
226
228
|
|
|
227
|
-
##
|
|
229
|
+
## Providers & Models
|
|
228
230
|
|
|
229
|
-
|
|
230
|
-
|----------|--------|------------|--------------|
|
|
231
|
-
| **Azure** | ✅ Stable | ✅ EU Regions | Neural Voices, Emotions, Styles, SSML |
|
|
232
|
-
| **Google Cloud** | ✅ Stable | ✅ EU Endpoints | Neural2, WaveNet, Studio, Chirp3-HD, Effects Profiles |
|
|
233
|
-
| **EdenAI** | ✅ Stable | ⚠️ No DPA | Aggregator for Google, OpenAI, Amazon, IBM |
|
|
234
|
-
| **OpenAI** | ๐ฎ Planned | โ | HD Audio, Simple API |
|
|
235
|
-
| **ElevenLabs** | ๐ฎ Planned | โ | Voice Cloning, High Expressivity |
|
|
231
|
+
### Azure Speech Services (MVP)
|
|
236
232
|
|
|
237
|
-
|
|
233
|
+
| Feature | Details |
|
|
234
|
+
|---------|---------|
|
|
235
|
+
| **Voices** | 180+ neural voices |
|
|
236
|
+
| **Languages** | 100+ locales |
|
|
237
|
+
| **Emotions** | cheerful, sad, angry, friendly, etc. |
|
|
238
|
+
| **Styles** | chat, newscast, customerservice, etc. |
|
|
239
|
+
| **Audio** | MP3, WAV, Opus |
|
|
240
|
+
| **EU Region** | germanywestcentral (Frankfurt) |
|
|
241
|
+
| **Pricing** | ~$16/1M characters |
|
|
242
|
+
|
|
243
|
+
### Google Cloud TTS
|
|
238
244
|
|
|
239
|
-
|
|
245
|
+
| Feature | Details |
|
|
246
|
+
|---------|---------|
|
|
247
|
+
| **Voices** | Neural2, WaveNet, Standard, Studio, Chirp3-HD |
|
|
248
|
+
| **Languages** | 40+ languages |
|
|
249
|
+
| **Audio** | MP3, WAV, Opus |
|
|
250
|
+
| **EU Regions** | eu, europe-west1 through europe-west9 |
|
|
251
|
+
| **Pricing** | ~$16/1M characters |
|
|
240
252
|
|
|
241
|
-
|
|
253
|
+
### EdenAI (Aggregator)
|
|
254
|
+
|
|
255
|
+
| Feature | Details |
|
|
256
|
+
|---------|---------|
|
|
257
|
+
| **Providers** | Google, OpenAI, Amazon, IBM, Microsoft, ElevenLabs |
|
|
258
|
+
| **Voices** | Depends on underlying provider |
|
|
259
|
+
| **OpenAI Voices** | alloy, echo, fable, onyx, nova, shimmer (57 languages) |
|
|
260
|
+
|
|
261
|
+
### Fish Audio (Test/Admin Only)
|
|
262
|
+
|
|
263
|
+
| Feature | Details |
|
|
264
|
+
|---------|---------|
|
|
265
|
+
| **Models** | S1 (flagship, 4B params), speech-1.6, speech-1.5 |
|
|
266
|
+
| **Languages** | 13 with auto-detection (EN, DE, FR, ES, JA, ZH, KO, AR, RU, NL, IT, PL, PT) |
|
|
267
|
+
| **Emotions** | 64+ expressions via text markers: `(excited)`, `(sad)`, `(whispering)` |
|
|
268
|
+
| **Voices** | Community library + custom voice cloning |
|
|
269
|
+
| **Audio** | MP3, WAV, PCM, Opus |
|
|
270
|
+
| **Pricing** | $15/1M UTF-8 bytes |
|
|
271
|
+
| **EU Compliance** | No data residency guarantees |
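
Fish Audio emotions are written as text markers in front of the sentence, as in `(excited) Das ist fantastisch!`. A minimal sketch of a helper that adds such a marker follows; `withEmotion` is a hypothetical name, and the rule that a marker consists of lowercase letters and spaces is an assumption drawn from the three markers listed above.

```typescript
// Hypothetical helper: prefixes the text with a Fish Audio style emotion marker.
function withEmotion(text: string, emotion: string): string {
  const marker = emotion.trim().toLowerCase();
  // Assumption: markers are plain lowercase words, e.g. "excited" or "whispering".
  if (!/^[a-z][a-z ]*$/.test(marker)) {
    throw new Error(`Unexpected emotion marker: "${emotion}"`);
  }
  return `(${marker}) ${text}`;
}
```

The returned string would be used as the `text` field of a synthesis request for the Fish Audio provider.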
|
|
272
|
+
|
|
273
|
+
## GDPR / Compliance
|
|
274
|
+
|
|
275
|
+
### Provider Compliance Overview
|
|
276
|
+
|
|
277
|
+
| Provider | DPA | GDPR | EU Data Residency | Notes |
|
|
278
|
+
|----------|-----|------|-------------------|-------|
|
|
279
|
+
| **Azure** | Yes | Yes | Yes (Frankfurt) | Recommended for EU |
|
|
280
|
+
| **Google Cloud** | Yes | Yes | Yes (EU multi-region) | Full EU endpoint support |
|
|
281
|
+
| **EdenAI** | Yes | Depends* | Depends* | Depends on underlying provider |
|
|
282
|
+
| **Fish Audio** | No | No | No | Test/admin only |
|
|
283
|
+
|
|
284
|
+
*EdenAI is an aggregator - compliance depends on the underlying provider.
|
|
285
|
+
|
|
286
|
+
## API Reference
|
|
287
|
+
|
|
288
|
+
### TTSService
|
|
242
289
|
|
|
243
290
|
```typescript
|
|
244
|
-
|
|
291
|
+
class TTSService {
|
|
292
|
+
synthesize(request: TTSSynthesizeRequest): Promise<TTSResponse>;
|
|
293
|
+
getProvider(provider: TTSProvider): BaseTTSProvider;
|
|
294
|
+
setDefaultProvider(provider: TTSProvider): void;
|
|
295
|
+
getAvailableProviders(): TTSProvider[];
|
|
296
|
+
isProviderAvailable(provider: TTSProvider): boolean;
|
|
297
|
+
}
|
|
298
|
+
```
|
|
245
299
|
|
|
246
|
-
|
|
247
|
-
setLogger(silentLogger);
|
|
300
|
+
### TTSSynthesizeRequest
|
|
248
301
|
|
|
249
|
-
|
|
250
|
-
|
|
302
|
+
```typescript
|
|
303
|
+
interface TTSSynthesizeRequest {
|
|
304
|
+
text: string;
|
|
305
|
+
provider?: TTSProvider;
|
|
306
|
+
voice: { id: string };
|
|
307
|
+
audio?: {
|
|
308
|
+
format?: 'mp3' | 'wav' | 'opus' | 'aac' | 'flac';
|
|
309
|
+
speed?: number; // 0.5 - 2.0
|
|
310
|
+
pitch?: number; // -20 to 20
|
|
311
|
+
volumeGainDb?: number; // -96 to 16
|
|
312
|
+
sampleRate?: number;
|
|
313
|
+
};
|
|
314
|
+
providerOptions?: Record<string, unknown>;
|
|
315
|
+
}
|
|
316
|
+
```
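
The ranges in the comments above can be checked before a request is sent. This is a sketch under stated assumptions: `audioRangeErrors` is a hypothetical helper, and whether the middleware itself rejects or clamps out-of-range values is not stated in this README.

```typescript
interface AudioOptions {
  speed?: number;
  pitch?: number;
  volumeGainDb?: number;
}

// Ranges copied from the comments in the TTSSynthesizeRequest interface.
const AUDIO_RANGES = {
  speed: [0.5, 2.0],
  pitch: [-20, 20],
  volumeGainDb: [-96, 16],
} as const;

// Returns one message per option that is set but outside its documented range.
function audioRangeErrors(audio: AudioOptions): string[] {
  const errors: string[] = [];
  for (const key of Object.keys(AUDIO_RANGES) as (keyof typeof AUDIO_RANGES)[]) {
    const value = audio[key];
    if (value === undefined) continue; // unset options are left to provider defaults
    const [min, max] = AUDIO_RANGES[key];
    if (!Number.isFinite(value) || value < min || value > max) {
      errors.push(`${key} must be between ${min} and ${max}, got ${value}`);
    }
  }
  return errors;
}
```

An empty array means every option that was set lies within its range.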
|
|
317
|
+
|
|
318
|
+
### TTSResponse
|
|
251
319
|
|
|
252
|
-
|
|
320
|
+
```typescript
|
|
321
|
+
interface TTSResponse {
|
|
322
|
+
audio: Buffer;
|
|
323
|
+
metadata: {
|
|
324
|
+
provider: string;
|
|
325
|
+
voice: string;
|
|
326
|
+
duration: number;
|
|
327
|
+
audioFormat: string;
|
|
328
|
+
sampleRate: number;
|
|
329
|
+
};
|
|
330
|
+
billing: {
|
|
331
|
+
characters: number;
|
|
332
|
+
tokensUsed?: number;
|
|
333
|
+
};
|
|
334
|
+
}
|
|
335
|
+
```
|
|
336
|
+
|
|
337
|
+
## Advanced Features
|
|
338
|
+
|
|
339
|
+
<details>
|
|
340
|
+
<summary><strong>Pluggable Logger</strong></summary>
|
|
341
|
+
|
|
342
|
+
Replace the default console logger with your own:
|
|
343
|
+
|
|
344
|
+
```typescript
|
|
345
|
+
import { setLogger, silentLogger, setLogLevel } from '@loonylabs/tts-middleware';
|
|
346
|
+
|
|
347
|
+
// Use Winston, Pino, or any custom logger
|
|
253
348
|
setLogger({
|
|
254
349
|
info: (msg, meta) => winston.info(msg, meta),
|
|
255
350
|
warn: (msg, meta) => winston.warn(msg, meta),
|
|
256
351
|
error: (msg, meta) => winston.error(msg, meta),
|
|
257
352
|
debug: (msg, meta) => winston.debug(msg, meta),
|
|
258
353
|
});
|
|
354
|
+
|
|
355
|
+
// Disable all logging
|
|
356
|
+
setLogger(silentLogger);
|
|
357
|
+
|
|
358
|
+
// Control log level
|
|
359
|
+
setLogLevel('warn');
|
|
259
360
|
```
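
For tests it can be convenient to capture log output in memory instead of printing it. The sketch below assumes the logger shape is the four methods used in the example above (`info`, `warn`, `error`, `debug`, each taking a message and optional metadata); `createMemoryLogger` and the `LoggerLike` type are hypothetical names, not package exports.

```typescript
type LogMeta = Record<string, unknown>;
type LogLevel = 'debug' | 'info' | 'warn' | 'error';

interface LogEntry {
  level: LogLevel;
  msg: string;
  meta?: LogMeta;
}

// Assumed logger shape, based on the setLogger example above.
interface LoggerLike {
  debug(msg: string, meta?: LogMeta): void;
  info(msg: string, meta?: LogMeta): void;
  warn(msg: string, meta?: LogMeta): void;
  error(msg: string, meta?: LogMeta): void;
}

// Builds a logger that appends every call to an in-memory array.
function createMemoryLogger(): { logger: LoggerLike; entries: LogEntry[] } {
  const entries: LogEntry[] = [];
  const record = (level: LogLevel) => (msg: string, meta?: LogMeta): void => {
    entries.push({ level, msg, meta });
  };
  return {
    logger: {
      debug: record('debug'),
      info: record('info'),
      warn: record('warn'),
      error: record('error'),
    },
    entries,
  };
}
```

The `logger` object would be handed to `setLogger`, and `entries` inspected afterwards in assertions.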
|
|
260
361
|
|
|
261
|
-
|
|
362
|
+
</details>
|
|
363
|
+
|
|
364
|
+
<details>
|
|
365
|
+
<summary><strong>Error Handling</strong></summary>
|
|
366
|
+
|
|
367
|
+
Typed error classes for precise error handling:
|
|
368
|
+
|
|
369
|
+
```typescript
|
|
370
|
+
import {
|
|
371
|
+
TTSError,
|
|
372
|
+
InvalidConfigError,
|
|
373
|
+
InvalidVoiceError,
|
|
374
|
+
QuotaExceededError,
|
|
375
|
+
ProviderUnavailableError,
|
|
376
|
+
SynthesisFailedError,
|
|
377
|
+
NetworkError,
|
|
378
|
+
} from '@loonylabs/tts-middleware';
|
|
379
|
+
|
|
380
|
+
try {
|
|
381
|
+
const result = await ttsService.synthesize({ text: 'test', voice: { id: 'en-US' } });
|
|
382
|
+
} catch (error) {
|
|
383
|
+
if (error instanceof QuotaExceededError) {
|
|
384
|
+
console.log('Rate limit hit, try again later');
|
|
385
|
+
} else if (error instanceof InvalidVoiceError) {
|
|
386
|
+
console.log('Voice not found');
|
|
387
|
+
} else if (error instanceof TTSError) {
|
|
388
|
+
console.log(`TTS Error [${error.code}]: ${error.message}`);
|
|
389
|
+
}
|
|
390
|
+
}
|
|
391
|
+
```
|
|
392
|
+
|
|
393
|
+
</details>
|
|
394
|
+
|
|
395
|
+
<details>
|
|
396
|
+
<summary><strong>Billing & Cost Calculation</strong></summary>
|
|
397
|
+
|
|
398
|
+
The middleware returns character counts for cost calculation:
|
|
399
|
+
|
|
400
|
+
```typescript
|
|
401
|
+
const PROVIDER_RATES = {
|
|
402
|
+
[TTSProvider.AZURE]: 16 / 1_000_000,
|
|
403
|
+
[TTSProvider.GOOGLE]: 16 / 1_000_000,
|
|
404
|
+
[TTSProvider.FISH_AUDIO]: 15 / 1_000_000,
|
|
405
|
+
};
|
|
406
|
+
|
|
407
|
+
const response = await ttsService.synthesize({ /* ... */ });
|
|
408
|
+
const costUSD = response.billing.characters * PROVIDER_RATES[TTSProvider.AZURE];
|
|
409
|
+
```
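
The Fish Audio price in this README is quoted per UTF-8 byte, while the Azure and Google prices are per character, so the two counts can differ for non-ASCII text. The following sketch shows one way to account for that; `estimateCostUSD`, `utf8ByteLength`, and the provider keys are hypothetical, and the rates are the approximate figures from the tables above rather than authoritative prices.

```typescript
// Approximate USD per one million units, as quoted in this README.
const USD_PER_MILLION = { azure: 16, google: 16, fishAudio: 15 } as const;
type PricedProvider = keyof typeof USD_PER_MILLION;

// Number of bytes the text occupies when encoded as UTF-8.
function utf8ByteLength(text: string): number {
  return new TextEncoder().encode(text).length;
}

// Estimates cost from the billed character count, or from UTF-8 bytes for Fish Audio.
function estimateCostUSD(provider: PricedProvider, billedCharacters: number, text: string): number {
  const units = provider === 'fishAudio' ? utf8ByteLength(text) : billedCharacters;
  return (units * USD_PER_MILLION[provider]) / 1_000_000;
}
```

Here `billedCharacters` would come from `response.billing.characters`, and `text` is the string that was sent for synthesis.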
|
|
262
410
|
|
|
263
|
-
|
|
411
|
+
</details>
|
|
264
412
|
|
|
265
|
-
|
|
413
|
+
## Architecture
|
|
414
|
+
|
|
415
|
+
```mermaid
|
|
416
|
+
graph TD
|
|
417
|
+
App[Your Application] -->|synthesize()| Service[TTSService]
|
|
418
|
+
Service -->|getProvider()| Registry{Provider Registry}
|
|
419
|
+
|
|
420
|
+
Registry -->|Select| Azure[AzureProvider]
|
|
421
|
+
Registry -->|Select| GCloud[GoogleCloudTTSProvider]
|
|
422
|
+
Registry -->|Select| Eden[EdenAIProvider]
|
|
423
|
+
Registry -->|Select| Fish[FishAudioProvider]
|
|
424
|
+
|
|
425
|
+
Azure -->|SSML/SDK| AzureAPI[Azure Speech API]
|
|
426
|
+
GCloud -->|gRPC/SDK| GoogleAPI[Google Cloud TTS API]
|
|
427
|
+
Eden -->|REST| EdenAPI[EdenAI API]
|
|
428
|
+
Fish -->|REST| FishAPI[Fish Audio API]
|
|
429
|
+
|
|
430
|
+
GoogleAPI -->|EU Endpoint| EU[eu-texttospeech.googleapis.com]
|
|
431
|
+
EdenAPI -.-> OpenAI[OpenAI TTS]
|
|
432
|
+
EdenAPI -.-> Amazon[Amazon Polly]
|
|
433
|
+
```
|
|
434
|
+
|
|
435
|
+
## Testing
|
|
266
436
|
|
|
267
437
|
```bash
|
|
268
|
-
# Run all tests
|
|
438
|
+
# Run all tests (436 tests, >90% coverage)
|
|
269
439
|
npm test
|
|
270
440
|
|
|
271
|
-
#
|
|
272
|
-
npm test
|
|
273
|
-
|
|
274
|
-
|
|
441
|
+
# Unit tests only
|
|
442
|
+
npm run test:unit
|
|
443
|
+
|
|
444
|
+
# Integration tests
|
|
445
|
+
npm run test:integration
|
|
446
|
+
|
|
447
|
+
# Coverage report
|
|
448
|
+
npm run test:coverage
|
|
275
449
|
|
|
276
|
-
# Manual
|
|
277
|
-
npx ts-node scripts/manual-test-google-cloud-tts.ts de neural2
|
|
450
|
+
# Manual test scripts
|
|
278
451
|
npx ts-node scripts/manual-test-edenai.ts
|
|
452
|
+
npx ts-node scripts/manual-test-google-cloud-tts.ts
|
|
453
|
+
npx ts-node scripts/manual-test-fish-audio.ts [en] [de]
|
|
279
454
|
|
|
280
455
|
# List available Google Cloud voices
|
|
281
456
|
npx ts-node scripts/list-google-voices.ts de-DE
|
|
282
|
-
npx ts-node scripts/list-google-voices.ts en-US
|
|
283
457
|
```
|
|
284
458
|
|
|
285
|
-
##
|
|
459
|
+
## Contributing
|
|
460
|
+
|
|
461
|
+
We welcome contributions! Please ensure:
|
|
462
|
+
|
|
463
|
+
1. **Tests:** Add tests for new features
|
|
464
|
+
2. **Linting:** Run `npm run lint` before committing
|
|
465
|
+
3. **Conventions:** Follow the existing project structure
|
|
466
|
+
|
|
467
|
+
1. Fork the repository
|
|
468
|
+
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
|
|
469
|
+
3. Commit your changes (`git commit -m 'Add some amazing feature'`)
|
|
470
|
+
4. Push to the branch (`git push origin feature/amazing-feature`)
|
|
471
|
+
5. Open a Pull Request
|
|
472
|
+
|
|
473
|
+
## License
|
|
474
|
+
|
|
475
|
+
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
|
|
476
|
+
|
|
477
|
+
## Links
|
|
478
|
+
|
|
479
|
+
- [NPM Package](https://www.npmjs.com/package/@loonylabs/tts-middleware)
|
|
480
|
+
- [Issues](https://github.com/loonylabs-dev/tts-middleware/issues)
|
|
481
|
+
- [CHANGELOG](CHANGELOG.md)
|
|
482
|
+
|
|
483
|
+
---
|
|
286
484
|
|
|
287
|
-
|
|
485
|
+
<div align="center">
|
|
288
486
|
|
|
289
|
-
|
|
290
|
-
1. **Tests:** Add tests for new features.
|
|
291
|
-
2. **Linting:** Run `npm run lint` before committing.
|
|
292
|
-
3. **Conventions:** Follow the existing project structure.
|
|
487
|
+
**Made with care by the LoonyLabs Team**
|
|
293
488
|
|
|
294
|
-
|
|
489
|
+
[](https://github.com/loonylabs-dev/tts-middleware/stargazers)
|
|
490
|
+
[](https://github.com/loonylabs-dev)
|
|
295
491
|
|
|
296
|
-
|
|
492
|
+
</div>
|