@loonylabs/tts-middleware 0.4.0 → 0.5.0

Files changed (33)
  1. package/README.md +372 -176
  2. package/dist/middleware/services/tts/providers/fish-audio-provider.d.ts +90 -0
  3. package/dist/middleware/services/tts/providers/fish-audio-provider.d.ts.map +1 -0
  4. package/dist/middleware/services/tts/providers/fish-audio-provider.js +198 -0
  5. package/dist/middleware/services/tts/providers/fish-audio-provider.js.map +1 -0
  6. package/dist/middleware/services/tts/providers/google-cloud-tts-provider.d.ts +14 -4
  7. package/dist/middleware/services/tts/providers/google-cloud-tts-provider.d.ts.map +1 -1
  8. package/dist/middleware/services/tts/providers/google-cloud-tts-provider.js +38 -16
  9. package/dist/middleware/services/tts/providers/google-cloud-tts-provider.js.map +1 -1
  10. package/dist/middleware/services/tts/providers/index.d.ts +1 -0
  11. package/dist/middleware/services/tts/providers/index.d.ts.map +1 -1
  12. package/dist/middleware/services/tts/providers/index.js +3 -1
  13. package/dist/middleware/services/tts/providers/index.js.map +1 -1
  14. package/dist/middleware/services/tts/tts.service.d.ts.map +1 -1
  15. package/dist/middleware/services/tts/tts.service.js +13 -0
  16. package/dist/middleware/services/tts/tts.service.js.map +1 -1
  17. package/dist/middleware/services/tts/types/common.types.d.ts +2 -1
  18. package/dist/middleware/services/tts/types/common.types.d.ts.map +1 -1
  19. package/dist/middleware/services/tts/types/common.types.js +1 -0
  20. package/dist/middleware/services/tts/types/common.types.js.map +1 -1
  21. package/dist/middleware/services/tts/types/index.d.ts +2 -2
  22. package/dist/middleware/services/tts/types/index.d.ts.map +1 -1
  23. package/dist/middleware/services/tts/types/index.js +2 -1
  24. package/dist/middleware/services/tts/types/index.js.map +1 -1
  25. package/dist/middleware/services/tts/types/provider-options.types.d.ts +91 -1
  26. package/dist/middleware/services/tts/types/provider-options.types.d.ts.map +1 -1
  27. package/dist/middleware/services/tts/types/provider-options.types.js +13 -0
  28. package/dist/middleware/services/tts/types/provider-options.types.js.map +1 -1
  29. package/dist/middleware/shared/config/tts.config.d.ts +16 -0
  30. package/dist/middleware/shared/config/tts.config.d.ts.map +1 -1
  31. package/dist/middleware/shared/config/tts.config.js +10 -0
  32. package/dist/middleware/shared/config/tts.config.js.map +1 -1
  33. package/package.json +3 -2
package/README.md CHANGED
@@ -1,155 +1,146 @@
1
- # @loonylabs/tts-middleware
1
+ <div align="center">
2
2
 
3
- [![npm version](https://img.shields.io/npm/v/@loonylabs/tts-middleware.svg)](https://www.npmjs.com/package/@loonylabs/tts-middleware)
4
- [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
5
- [![Build Status](https://img.shields.io/badge/build-passing-brightgreen)]()
6
- [![Coverage](https://img.shields.io/badge/coverage-94%25-brightgreen)]()
3
+ # TTS Middleware
7
4
 
8
- **Provider-agnostic Text-to-Speech (TTS) middleware infrastructure.**
5
+ *Provider-agnostic Text-to-Speech middleware with **GDPR compliance** support. Currently supports Azure Speech Services, EdenAI, Google Cloud TTS, and Fish Audio. Features EU data residency via Azure and Google Cloud, pluggable logging, character-based billing, and comprehensive error handling.*
9
6
 
10
- Build voice-enabled applications that switch seamlessly between Azure, EdenAI, OpenAI, ElevenLabs, and more without changing your application logic. Includes standardized error handling, accurate character counting for billing, and uniform audio output.
7
+ <!-- Horizontal Badge Navigation Bar -->
8
+ [![npm version](https://img.shields.io/npm/v/@loonylabs/tts-middleware.svg?style=for-the-badge&logo=npm&logoColor=white)](https://www.npmjs.com/package/@loonylabs/tts-middleware)
9
+ [![npm downloads](https://img.shields.io/npm/dm/@loonylabs/tts-middleware.svg?style=for-the-badge&logo=npm&logoColor=white)](https://www.npmjs.com/package/@loonylabs/tts-middleware)
10
+ [![TypeScript](https://img.shields.io/badge/TypeScript-5.3+-blue.svg?style=for-the-badge&logo=typescript&logoColor=white)](#features)
11
+ [![Node.js](https://img.shields.io/badge/Node.js-18+-339933?style=for-the-badge&logo=nodedotjs&logoColor=white)](#prerequisites)
12
+ [![MIT License](https://img.shields.io/badge/License-MIT-yellow.svg?style=for-the-badge&logo=opensource&logoColor=white)](#license)
13
+ [![GitHub](https://img.shields.io/badge/GitHub-Repository-181717?style=for-the-badge&logo=github&logoColor=white)](https://github.com/loonylabs-dev/tts-middleware)
11
14
 
12
- ---
15
+ </div>
13
16
 
14
- ## ✨ Key Features
17
+ <!-- Table of Contents -->
18
+ <details>
19
+ <summary><strong>Table of Contents</strong></summary>
15
20
 
16
- - **๐Ÿ”Œ Provider Agnostic:** Unified API for all TTS providers. Switch providers by changing one config parameter.
17
- - **โ˜๏ธ Multi-Provider Support:**
18
- - **Azure Speech Services:** Full support for Neural voices, emotions, and speaking styles.
19
- - **Google Cloud TTS:** Neural2, WaveNet, Studio, Chirp3-HD voices with EU regional endpoints.
20
- - **EdenAI:** Access to 6+ providers (OpenAI, Amazon, IBM, etc.) via a single aggregator API.
21
- - **Ready for:** OpenAI, ElevenLabs, Deepgram (interfaces prepared).
22
- - **๐Ÿ“ SSML Abstraction:** Auto-generates provider-specific SSML markup (e.g., for Azure prosody/styles) from simple JSON options.
23
- - **๐Ÿ’ฐ Character Counting:** Precise character counting logic for billing estimation.
24
- - **๐Ÿ›ก๏ธ Robust Error Handling:** Standardized error types (`InvalidConfigError`, `QuotaExceededError`, `NetworkError`) across all providers.
25
- - **๐Ÿ“ TypeScript First:** Fully typed request/response objects and provider options.
26
- - **๐Ÿ‡ช๐Ÿ‡บ GDPR/DSGVO Ready:** Configurable region support (e.g., Azure Germany/Europe regions).
21
+ - [Features](#features)
22
+ - [Quick Start](#quick-start)
23
+ - [Prerequisites](#prerequisites)
24
+ - [Configuration](#configuration)
25
+ - [Providers & Models](#providers--models)
26
+ - [GDPR / Compliance](#gdpr--compliance)
27
+ - [API Reference](#api-reference)
28
+ - [Advanced Features](#advanced-features)
29
+ - [Testing](#testing)
30
+ - [Contributing](#contributing)
31
+ - [License](#license)
32
+ - [Links](#links)
27
33
 
28
- ---
34
+ </details>
29
35
 
30
- ## 📦 Installation
36
+ ---
31
37
 
32
- ```bash
33
- npm install @loonylabs/tts-middleware
34
- ```
38
+ ## Features
35
39
 
36
- ## 🚀 Quick Start
40
+ - **Multi-Provider Architecture**: Unified API for all TTS providers
41
+ - **Azure Speech Services** (MVP): Neural voices with emotion/style, EU regions
42
+ - **EdenAI**: Aggregator with access to Google, OpenAI, Amazon, IBM, ElevenLabs
43
+ - **Google Cloud TTS**: Neural2, WaveNet, Studio voices with EU data residency
44
+ - **Fish Audio**: S1 model with 13 languages & 64+ emotions (test/admin only)
45
+ - **Ready for:** OpenAI, ElevenLabs, Deepgram (interfaces prepared)
46
+ - **GDPR/DSGVO Compliance**: Built-in EU region support for Azure and Google Cloud
47
+ - **SSML Abstraction**: Auto-generates provider-specific SSML from simple JSON options
48
+ - **Character Billing**: Accurate character counting for cost calculation
49
+ - **Pluggable Logger**: Bring your own logger (Winston, Pino, etc.) or use the built-in console logger
50
+ - **TypeScript First**: Full type safety with comprehensive interfaces
51
+ - **Error Handling**: Typed error classes (InvalidConfig, QuotaExceeded, SynthesisFailed, etc.)
52
+ - **Zero Lock-in**: Switch providers without changing your application code
37
53
 
38
- ### 1. Configure Environment
54
+ ## Quick Start
39
55
 
40
- Create a `.env` file in your project root:
56
+ ### Installation
41
57
 
42
- ```env
43
- # Default Provider
44
- TTS_DEFAULT_PROVIDER=azure
58
+ Install from npm:
45
59
 
46
- # Azure Speech Services
47
- AZURE_SPEECH_KEY=your_azure_key
48
- AZURE_SPEECH_REGION=germanywestcentral
60
+ ```bash
61
+ npm install @loonylabs/tts-middleware
62
+ ```
49
63
 
50
- # Google Cloud TTS (GDPR/DSGVO-compliant with EU endpoints)
51
- GOOGLE_APPLICATION_CREDENTIALS=./path/to/service-account.json
52
- GOOGLE_CLOUD_PROJECT=your-project-id
53
- GOOGLE_TTS_REGION=eu # Options: eu, europe-west3 (Frankfurt), europe-west1, etc.
64
+ Or install directly from GitHub:
54
65
 
55
- # EdenAI (Optional - no DPA available)
56
- EDENAI_API_KEY=your_edenai_key
66
+ ```bash
67
+ npm install github:loonylabs-dev/tts-middleware
57
68
  ```
58
69
 
59
- ### 2. Basic Usage
70
+ ### Basic Usage
60
71
 
61
72
  ```typescript
62
73
  import { ttsService, TTSProvider } from '@loonylabs/tts-middleware';
63
74
  import fs from 'fs';
64
75
 
65
- async function generateSpeech() {
66
- try {
67
- // Synthesize speech
68
- const response = await ttsService.synthesize({
69
- text: "Hello! This is a test of the LoonyLabs TTS middleware.",
70
- voice: { id: "en-US-JennyNeural" }, // Provider-specific voice ID
71
- audio: {
72
- format: "mp3",
73
- speed: 1.0
74
- }
75
- });
76
-
77
- // Save to file
78
- fs.writeFileSync('output.mp3', response.audio);
79
-
80
- console.log(`Generated audio: ${response.metadata.duration}ms`);
81
- console.log(`Billed characters: ${response.billing.characters}`);
82
-
83
- } catch (error) {
84
- console.error("Synthesis failed:", error);
85
- }
86
- }
76
+ const response = await ttsService.synthesize({
77
+ text: 'Hallo Welt! Dies ist ein Test.',
78
+ voice: { id: 'de-DE-KatjaNeural' },
79
+ audio: { format: 'mp3', speed: 1.0 },
80
+ });
87
81
 
88
- generateSpeech();
82
+ fs.writeFileSync('output.mp3', response.audio);
83
+ console.log('Characters billed:', response.billing.characters);
84
+ console.log('Duration:', response.metadata.duration, 'ms');
89
85
  ```
90
86
 
91
- ---
92
-
93
- ## 🛠️ Advanced Usage
94
-
95
- ### Using Provider-Specific Features (e.g., Azure Emotions)
87
+ <details>
88
+ <summary><strong>Switching Providers</strong></summary>
96
89
 
97
90
  ```typescript
98
- const response = await ttsService.synthesize({
99
- text: "I am so excited to tell you this!",
91
+ // Azure with emotion
92
+ const azure = await ttsService.synthesize({
93
+ text: 'Great news!',
100
94
  provider: TTSProvider.AZURE,
101
- voice: { id: "en-US-JennyNeural" },
102
- providerOptions: {
103
- emotion: "cheerful", // Azure-specific
104
- style: "chat", // Azure-specific
105
- styleDegree: 1.5
106
- }
95
+ voice: { id: 'en-US-JennyNeural' },
96
+ providerOptions: { emotion: 'cheerful', style: 'chat' },
107
97
  });
108
- ```
109
98
 
110
- ### Switching Providers Dynamically
99
+ // Google Cloud TTS (EU-compliant)
100
+ const google = await ttsService.synthesize({
101
+ text: 'Hallo aus Frankfurt!',
102
+ provider: TTSProvider.GOOGLE,
103
+ voice: { id: 'de-DE-Neural2-C' },
104
+ providerOptions: { region: 'europe-west3' },
105
+ });
111
106
 
112
- ```typescript
113
- // Use EdenAI to access Google's TTS engine
114
- const response = await ttsService.synthesize({
115
- text: "Hello from Google via EdenAI",
107
+ // EdenAI (OpenAI voices via aggregator)
108
+ const edenai = await ttsService.synthesize({
109
+ text: 'Hello World',
116
110
  provider: TTSProvider.EDENAI,
117
- voice: { id: "en-US" },
118
- providerOptions: {
119
- provider: "google" // Select underlying provider
120
- }
111
+ voice: { id: 'en-US' },
112
+ providerOptions: { provider: 'openai', settings: { openai: 'en_nova' } },
113
+ });
114
+
115
+ // Fish Audio (test/admin only)
116
+ const fish = await ttsService.synthesize({
117
+ text: '(excited) Das ist fantastisch!',
118
+ provider: TTSProvider.FISH_AUDIO,
119
+ voice: { id: '90042f762dbf49baa2e7776d011eee6b' },
120
+ providerOptions: { model: 's1' },
121
121
  });
122
122
  ```
123
123
 
124
- ### Using OpenAI Voices via EdenAI
124
+ </details>
125
125
 
126
- Access OpenAI's TTS voices (alloy, echo, fable, onyx, nova, shimmer) through EdenAI with specific voice selection:
126
+ <details>
127
+ <summary><strong>Using OpenAI Voices via EdenAI</strong></summary>
127
128
 
128
129
  ```typescript
129
130
  // German with OpenAI "nova" voice (female)
130
131
  const response = await ttsService.synthesize({
131
- text: "Hallo Welt! Das ist ein Test.",
132
+ text: 'Hallo Welt! Das ist ein Test.',
132
133
  provider: TTSProvider.EDENAI,
133
- voice: { id: "de" }, // Language code
134
+ voice: { id: 'de' },
134
135
  providerOptions: {
135
- provider: "openai",
136
- settings: { openai: "de_nova" } // Voice: {lang}_{voice}
137
- }
138
- });
139
-
140
- // English with OpenAI "onyx" voice (male, deep)
141
- const response = await ttsService.synthesize({
142
- text: "Hello World! This is a test.",
143
- provider: TTSProvider.EDENAI,
144
- voice: { id: "en" },
145
- providerOptions: {
146
- provider: "openai",
147
- settings: { openai: "en_onyx" }
148
- }
136
+ provider: 'openai',
137
+ settings: { openai: 'de_nova' },
138
+ },
149
139
  });
150
140
  ```
151
141
 
152
142
  **Available OpenAI Voices:**
143
+
153
144
  | Voice | Character |
154
145
  |-------|-----------|
155
146
  | `alloy` | Neutral |
@@ -161,29 +152,22 @@ const response = await ttsService.synthesize({
161
152
 
162
153
  Format: `{language}_{voice}` (e.g., `de_nova`, `en_alloy`, `fr_shimmer`)
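The `{language}_{voice}` format can be assembled with a small helper instead of hand-written strings. The following is a minimal sketch, not part of the package: `openAiVoiceSetting` is a hypothetical name, the voice list is the one given in this README, and the two-letter language check is an assumption based on the `de`, `en`, and `fr` examples.

```typescript
// Voice names as listed in this README; the helper itself is hypothetical.
const OPENAI_VOICES = ['alloy', 'echo', 'fable', 'onyx', 'nova', 'shimmer'] as const;
type OpenAIVoice = (typeof OPENAI_VOICES)[number];

// Builds the "{language}_{voice}" string, e.g. "de_nova".
// Assumption: language codes are two lowercase letters, as in the examples.
function openAiVoiceSetting(language: string, voice: OpenAIVoice): string {
  const lang = language.trim().toLowerCase();
  if (!/^[a-z]{2}$/.test(lang)) {
    throw new Error(`Expected a two-letter language code, got "${language}"`);
  }
  return `${lang}_${voice}`;
}

// Same shape as the providerOptions object used in the EdenAI examples.
const edenAiOptions = {
  provider: 'openai',
  settings: { openai: openAiVoiceSetting('de', 'nova') },
};
```

Typing the voice as a union of the six known names lets the compiler reject a misspelled voice before a request is sent.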
163
154
 
164
- ### Using Google Cloud TTS (GDPR/DSGVO-Compliant)
155
+ </details>
165
156
 
166
- Google Cloud TTS with EU regional endpoints for data residency compliance:
157
+ <details>
158
+ <summary><strong>Using Google Cloud TTS (GDPR/DSGVO-Compliant)</strong></summary>
167
159
 
168
160
  ```typescript
169
- // Basic usage with EU endpoint (default)
170
- const response = await ttsService.synthesize({
171
- text: "Guten Tag, wie geht es Ihnen?",
172
- provider: TTSProvider.GOOGLE,
173
- voice: { id: "de-DE-Neural2-G" }, // G=Female, H=Male
174
- audio: { format: "mp3" }
175
- });
176
-
177
161
  // With Frankfurt endpoint for maximum DSGVO compliance
178
162
  const response = await ttsService.synthesize({
179
- text: "Hallo Welt!",
163
+ text: 'Guten Tag, wie geht es Ihnen?',
180
164
  provider: TTSProvider.GOOGLE,
181
- voice: { id: "de-DE-Studio-C" }, // Premium Studio voice
182
- audio: { format: "mp3", speed: 1.0, pitch: 0.0 },
165
+ voice: { id: 'de-DE-Neural2-G' },
166
+ audio: { format: 'mp3' },
183
167
  providerOptions: {
184
- region: "europe-west3", // Frankfurt
185
- effectsProfileId: ["headphone-class-device"] // Audio optimization
186
- }
168
+ region: 'europe-west3',
169
+ effectsProfileId: ['headphone-class-device'],
170
+ },
187
171
  });
188
172
  ```
189
173
 
@@ -194,103 +178,315 @@ const response = await ttsService.synthesize({
194
178
  | Neural2 | `de-DE-Neural2-G` | `de-DE-Neural2-H` | Best value |
195
179
  | WaveNet | `de-DE-Wavenet-G` | `de-DE-Wavenet-H` | Good |
196
180
  | Studio | `de-DE-Studio-C` | `de-DE-Studio-B` | Premium |
197
- | Chirp3-HD | `de-DE-Chirp3-HD-Aoede`, `Kore`, ... | `de-DE-Chirp3-HD-Fenrir`, `Puck`, ... | Newest |
181
+ | Chirp3-HD | `Aoede`, `Kore`, ... | `Fenrir`, `Puck`, ... | Newest |
198
182
 
199
- > **Note:** German Neural2/WaveNet only have G and H variants (no A-F). Use `scripts/list-google-voices.ts` to query all available voices.
183
+ </details>
200
184
 
201
- ---
185
+ ## Prerequisites
202
186
 
203
- ## 🏗️ Architecture
187
+ <details>
188
+ <summary><strong>Required Dependencies</strong></summary>
204
189
 
205
- The middleware uses a singleton orchestrator pattern to manage provider instances.
190
+ - **Node.js** 18+
191
+ - **TypeScript** 5.3+
192
+ - Provider credentials (API keys / service accounts)
206
193
 
207
- ```mermaid
208
- graph TD
209
- App[Your Application] -->|synthesize()| Service[TTSService]
210
- Service -->|getProvider()| Registry{Provider Registry}
194
+ </details>
211
195
 
212
- Registry -->|Select| Azure[AzureProvider]
213
- Registry -->|Select| GCloud[GoogleCloudTTSProvider]
214
- Registry -->|Select| Eden[EdenAIProvider]
196
+ ## Configuration
215
197
 
216
- Azure -->|SSML/SDK| AzureAPI[Azure Speech API]
217
- GCloud -->|gRPC/SDK| GoogleAPI[Google Cloud TTS API]
218
- Eden -->|REST| EdenAPI[EdenAI API]
198
+ <details>
199
+ <summary><strong>Environment Setup</strong></summary>
219
200
 
220
- GoogleAPI -->|EU Endpoint| EU[eu-texttospeech.googleapis.com]
221
- EdenAPI -.-> OpenAI[OpenAI TTS]
222
- EdenAPI -.-> Amazon[Amazon Polly]
201
+ Create a `.env` file in your project root:
202
+
203
+ ```env
204
+ # Default provider
205
+ TTS_DEFAULT_PROVIDER=azure
206
+
207
+ # Azure Speech Services (EU-compliant)
208
+ AZURE_SPEECH_KEY=your-azure-speech-key
209
+ AZURE_SPEECH_REGION=germanywestcentral
210
+
211
+ # EdenAI (multi-provider aggregator)
212
+ EDENAI_API_KEY=your-edenai-api-key
213
+
214
+ # Google Cloud TTS (EU-compliant)
215
+ GOOGLE_APPLICATION_CREDENTIALS=./service-account.json
216
+ GOOGLE_CLOUD_PROJECT=your-project-id
217
+ GOOGLE_TTS_REGION=eu
218
+
219
+ # Fish Audio (test/admin only – no EU data residency)
220
+ FISH_AUDIO_API_KEY=your-fish-audio-api-key
221
+
222
+ # Logging
223
+ TTS_DEBUG=false
224
+ LOG_LEVEL=info
223
225
  ```
224
226
 
225
- ---
227
+ </details>
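A startup check can catch missing credentials before the first synthesis call. This is a sketch under assumptions: the variable names come from the `.env` example above, but which of them each provider strictly requires, and the provider keys used here, are guesses rather than documented behavior.

```typescript
// Hypothetical provider keys; the package's TTSProvider enum may use other values.
type ProviderName = 'azure' | 'google' | 'edenai' | 'fish_audio';

// Assumption: these are the variables each provider cannot run without.
const REQUIRED_ENV: Record<ProviderName, string[]> = {
  azure: ['AZURE_SPEECH_KEY', 'AZURE_SPEECH_REGION'],
  google: ['GOOGLE_APPLICATION_CREDENTIALS', 'GOOGLE_CLOUD_PROJECT'],
  edenai: ['EDENAI_API_KEY'],
  fish_audio: ['FISH_AUDIO_API_KEY'],
};

// Returns the names of required variables that are unset or blank.
function missingEnvVars(
  provider: ProviderName,
  env: Record<string, string | undefined>,
): string[] {
  return REQUIRED_ENV[provider].filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === '';
  });
}
```

In a Node.js application the second argument would typically be `process.env`; an empty result means the provider's variables are all present.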
226
228
 
227
- ## 🧩 Supported Providers
229
+ ## Providers & Models
228
230
 
229
- | Provider | Status | GDPR/DSGVO | Key Features |
230
- |----------|--------|------------|--------------|
231
- | **Azure** | ✅ Stable | ✅ EU Regions | Neural Voices, Emotions, Styles, SSML |
232
- | **Google Cloud** | ✅ Stable | ✅ EU Endpoints | Neural2, WaveNet, Studio, Chirp3-HD, Effects Profiles |
233
- | **EdenAI** | ✅ Stable | ⚠️ No DPA | Aggregator for Google, OpenAI, Amazon, IBM |
234
- | **OpenAI** | 🔮 Planned | ❌ | HD Audio, Simple API |
235
- | **ElevenLabs** | 🔮 Planned | ❌ | Voice Cloning, High Expressivity |
231
+ ### Azure Speech Services (MVP)
236
232
 
237
- ---
233
+ | Feature | Details |
234
+ |---------|---------|
235
+ | **Voices** | 180+ neural voices |
236
+ | **Languages** | 100+ locales |
237
+ | **Emotions** | cheerful, sad, angry, friendly, etc. |
238
+ | **Styles** | chat, newscast, customerservice, etc. |
239
+ | **Audio** | MP3, WAV, Opus |
240
+ | **EU Region** | germanywestcentral (Frankfurt) |
241
+ | **Pricing** | ~$16/1M characters |
242
+
243
+ ### Google Cloud TTS
238
244
 
239
- ## 🔧 Logging Configuration
245
+ | Feature | Details |
246
+ |---------|---------|
247
+ | **Voices** | Neural2, WaveNet, Standard, Studio, Chirp3-HD |
248
+ | **Languages** | 40+ languages |
249
+ | **Audio** | MP3, WAV, Opus |
250
+ | **EU Regions** | eu, europe-west1 through europe-west9 |
251
+ | **Pricing** | ~$16/1M characters |
240
252
 
241
- The middleware includes a pluggable logger interface. By default, it uses `console`, but you can replace it with any logger (Winston, Pino, etc.).
253
+ ### EdenAI (Aggregator)
254
+
255
+ | Feature | Details |
256
+ |---------|---------|
257
+ | **Providers** | Google, OpenAI, Amazon, IBM, Microsoft, ElevenLabs |
258
+ | **Voices** | Depends on underlying provider |
259
+ | **OpenAI Voices** | alloy, echo, fable, onyx, nova, shimmer (57 languages) |
260
+
261
+ ### Fish Audio (Test/Admin Only)
262
+
263
+ | Feature | Details |
264
+ |---------|---------|
265
+ | **Models** | S1 (flagship, 4B params), speech-1.6, speech-1.5 |
266
+ | **Languages** | 13 with auto-detection (EN, DE, FR, ES, JA, ZH, KO, AR, RU, NL, IT, PL, PT) |
267
+ | **Emotions** | 64+ expressions via text markers: `(excited)`, `(sad)`, `(whispering)` |
268
+ | **Voices** | Community library + custom voice cloning |
269
+ | **Audio** | MP3, WAV, PCM, Opus |
270
+ | **Pricing** | $15/1M UTF-8 bytes |
271
+ | **EU Compliance** | No data residency guarantees |
272
+
273
+ ## GDPR / Compliance
274
+
275
+ ### Provider Compliance Overview
276
+
277
+ | Provider | DPA | GDPR | EU Data Residency | Notes |
278
+ |----------|-----|------|-------------------|-------|
279
+ | **Azure** | Yes | Yes | Yes (Frankfurt) | Recommended for EU |
280
+ | **Google Cloud** | Yes | Yes | Yes (EU multi-region) | Full EU endpoint support |
281
+ | **EdenAI** | Yes | Depends* | Depends* | Depends on underlying provider |
282
+ | **Fish Audio** | No | No | No | Test/admin only |
283
+
284
+ *EdenAI is an aggregator - compliance depends on the underlying provider.
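The compliance table can be encoded as a guard that blocks non-EU providers for requests carrying personal data. The sketch below is illustrative only: the residency values mirror the table above, while the provider keys, the `assertEuResidency` name, and the opt-in flag for "depends" cases are assumptions.

```typescript
// Hypothetical provider keys mirroring the four providers in the table.
type ProviderName = 'azure' | 'google' | 'edenai' | 'fish_audio';
type Residency = 'yes' | 'depends' | 'no';

// Values copied from the "EU Data Residency" column above.
const EU_DATA_RESIDENCY: Record<ProviderName, Residency> = {
  azure: 'yes',
  google: 'yes',
  edenai: 'depends',
  fish_audio: 'no',
};

// Throws unless the provider offers EU data residency. "depends" is rejected
// by default and must be accepted explicitly by the caller.
function assertEuResidency(provider: ProviderName, allowDepends = false): void {
  const residency = EU_DATA_RESIDENCY[provider];
  const allowed = residency === 'yes' || (residency === 'depends' && allowDepends);
  if (!allowed) {
    throw new Error(`Provider "${provider}" is not cleared for EU data (residency: ${residency})`);
  }
}
```

Rejecting the "depends" case by default keeps the aggregator out of regulated paths unless someone has checked the underlying provider.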
285
+
286
+ ## API Reference
287
+
288
+ ### TTSService
242
289
 
243
290
  ```typescript
244
- import { setLogger, silentLogger, setLogLevel } from '@loonylabs/tts-middleware';
291
+ class TTSService {
292
+ synthesize(request: TTSSynthesizeRequest): Promise<TTSResponse>;
293
+ getProvider(provider: TTSProvider): BaseTTSProvider;
294
+ setDefaultProvider(provider: TTSProvider): void;
295
+ getAvailableProviders(): TTSProvider[];
296
+ isProviderAvailable(provider: TTSProvider): boolean;
297
+ }
298
+ ```
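`isProviderAvailable` lends itself to a fallback order across providers. A minimal sketch follows, assuming only the method signature listed above; `pickProvider` and the `ProviderRegistry` interface are hypothetical and defined locally so the example runs on its own.

```typescript
// Minimal slice of the TTSService surface that the helper relies on.
interface ProviderRegistry<P> {
  isProviderAvailable(provider: P): boolean;
}

// Returns the first available provider from an ordered preference list.
function pickProvider<P>(service: ProviderRegistry<P>, preferred: P[]): P {
  for (const provider of preferred) {
    if (service.isProviderAvailable(provider)) {
      return provider;
    }
  }
  throw new Error('No configured TTS provider is available');
}

// Stand-in registry for illustration: only "google" is configured here.
const fakeService: ProviderRegistry<string> = {
  isProviderAvailable: (provider) => provider === 'google',
};
const chosen = pickProvider(fakeService, ['azure', 'google', 'edenai']);
```

With the real service, the result could be passed as the `provider` field of a synthesize request.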
245
299
 
246
- // Disable all logging (useful for tests)
247
- setLogger(silentLogger);
300
+ ### TTSSynthesizeRequest
248
301
 
249
- // Set minimum log level (debug, info, warn, error)
250
- setLogLevel('warn'); // Only show warnings and errors
302
+ ```typescript
303
+ interface TTSSynthesizeRequest {
304
+ text: string;
305
+ provider?: TTSProvider;
306
+ voice: { id: string };
307
+ audio?: {
308
+ format?: 'mp3' | 'wav' | 'opus' | 'aac' | 'flac';
309
+ speed?: number; // 0.5 - 2.0
310
+ pitch?: number; // -20 to 20
311
+ volumeGainDb?: number; // -96 to 16
312
+ sampleRate?: number;
313
+ };
314
+ providerOptions?: Record<string, unknown>;
315
+ }
316
+ ```
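The ranges noted in the comments above can be checked before a request is sent. This is a sketch under the assumption that those comments describe hard limits; whether the middleware validates them itself is not stated here, and `audioRangeErrors` is a hypothetical helper.

```typescript
// Numeric audio options with documented ranges.
interface AudioOptions {
  speed?: number;
  pitch?: number;
  volumeGainDb?: number;
}

// Inclusive [min, max] bounds taken from the interface comments.
const AUDIO_RANGES: Record<keyof AudioOptions, [number, number]> = {
  speed: [0.5, 2.0],
  pitch: [-20, 20],
  volumeGainDb: [-96, 16],
};

// Returns one message per out-of-range option; an empty array means valid.
function audioRangeErrors(audio: AudioOptions): string[] {
  const errors: string[] = [];
  for (const key of Object.keys(AUDIO_RANGES) as Array<keyof AudioOptions>) {
    const value = audio[key];
    if (value === undefined) continue;
    const [min, max] = AUDIO_RANGES[key];
    if (isNaN(value) || value < min || value > max) {
      errors.push(`${key} must be between ${min} and ${max}, got ${value}`);
    }
  }
  return errors;
}
```

Collecting every violation rather than throwing on the first makes it easier to report all problems with a request in one pass.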
317
+
318
+ ### TTSResponse
251
319
 
252
- // Use custom logger (e.g., Winston)
320
+ ```typescript
321
+ interface TTSResponse {
322
+ audio: Buffer;
323
+ metadata: {
324
+ provider: string;
325
+ voice: string;
326
+ duration: number;
327
+ audioFormat: string;
328
+ sampleRate: number;
329
+ };
330
+ billing: {
331
+ characters: number;
332
+ tokensUsed?: number;
333
+ };
334
+ }
335
+ ```
336
+
337
+ ## Advanced Features
338
+
339
+ <details>
340
+ <summary><strong>Pluggable Logger</strong></summary>
341
+
342
+ Replace the default console logger with your own:
343
+
344
+ ```typescript
345
+ import { setLogger, silentLogger, setLogLevel } from '@loonylabs/tts-middleware';
346
+
347
+ // Use Winston, Pino, or any custom logger
253
348
  setLogger({
254
349
  info: (msg, meta) => winston.info(msg, meta),
255
350
  warn: (msg, meta) => winston.warn(msg, meta),
256
351
  error: (msg, meta) => winston.error(msg, meta),
257
352
  debug: (msg, meta) => winston.debug(msg, meta),
258
353
  });
354
+
355
+ // Disable all logging
356
+ setLogger(silentLogger);
357
+
358
+ // Control log level
359
+ setLogLevel('warn');
259
360
  ```
260
361
 
261
- ---
362
+ </details>
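The same four-method shape works for capturing log output in tests. A minimal sketch, assuming `setLogger` accepts any object with `debug`, `info`, `warn`, and `error` as in the Winston example; the `LoggerLike` interface and the parameter types are assumptions, not the package's exported types.

```typescript
type LogLevel = 'debug' | 'info' | 'warn' | 'error';

interface LogEntry {
  level: LogLevel;
  msg: string;
  meta?: unknown;
}

// Assumed logger shape, inferred from the setLogger example above.
interface LoggerLike {
  debug(msg: string, meta?: unknown): void;
  info(msg: string, meta?: unknown): void;
  warn(msg: string, meta?: unknown): void;
  error(msg: string, meta?: unknown): void;
}

// Creates a logger that records every call in memory for later assertions.
function createMemoryLogger(): { logger: LoggerLike; entries: LogEntry[] } {
  const entries: LogEntry[] = [];
  const record = (level: LogLevel) => (msg: string, meta?: unknown): void => {
    entries.push({ level, msg, meta });
  };
  return {
    logger: {
      debug: record('debug'),
      info: record('info'),
      warn: record('warn'),
      error: record('error'),
    },
    entries,
  };
}
```

A test would pass `logger` to `setLogger` and then inspect `entries`, which is more informative than silencing output with `silentLogger`.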
363
+
364
+ <details>
365
+ <summary><strong>Error Handling</strong></summary>
366
+
367
+ Typed error classes for precise error handling:
368
+
369
+ ```typescript
370
+ import {
371
+ TTSError,
372
+ InvalidConfigError,
373
+ InvalidVoiceError,
374
+ QuotaExceededError,
375
+ ProviderUnavailableError,
376
+ SynthesisFailedError,
377
+ NetworkError,
378
+ } from '@loonylabs/tts-middleware';
379
+
380
+ try {
381
+ const result = await ttsService.synthesize({ text: 'test', voice: { id: 'en-US' } });
382
+ } catch (error) {
383
+ if (error instanceof QuotaExceededError) {
384
+ console.log('Rate limit hit, try again later');
385
+ } else if (error instanceof InvalidVoiceError) {
386
+ console.log('Voice not found');
387
+ } else if (error instanceof TTSError) {
388
+ console.log(`TTS Error [${error.code}]: ${error.message}`);
389
+ }
390
+ }
391
+ ```
392
+
393
+ </details>
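Typed errors make it straightforward to retry only transient failures. The sketch below is one possible policy, not the package's behavior: the error classes are local stand-ins so the example runs on its own (in an application they would be imported from the package), and treating quota and network errors as retryable is an assumption, as are the backoff values.

```typescript
// Local stand-ins for the package's error classes.
class TTSError extends Error {
  constructor(message: string) {
    super(message);
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}
class QuotaExceededError extends TTSError {}
class NetworkError extends TTSError {}
class InvalidVoiceError extends TTSError {}

// Assumption: only quota and network failures are worth retrying.
function isRetryable(error: unknown): boolean {
  return error instanceof QuotaExceededError || error instanceof NetworkError;
}

// Exponential backoff: 500 ms, 1 s, 2 s, ... capped at capMs.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Runs task up to maxAttempts times, waiting between retryable failures.
async function withRetry<T>(task: () => Promise<T>, maxAttempts = 3): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (error) {
      const lastAttempt = attempt >= maxAttempts - 1;
      if (lastAttempt || !isRetryable(error)) throw error;
      await new Promise<void>((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
    }
  }
}
```

A call would look like `withRetry(() => ttsService.synthesize(request))`. Errors such as an invalid voice are rethrown immediately, since repeating the request cannot fix them.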
394
+
395
+ <details>
396
+ <summary><strong>Billing & Cost Calculation</strong></summary>
397
+
398
+ The middleware returns character counts for cost calculation:
399
+
400
+ ```typescript
401
+ const PROVIDER_RATES = {
402
+ [TTSProvider.AZURE]: 16 / 1_000_000,
403
+ [TTSProvider.GOOGLE]: 16 / 1_000_000,
404
+ [TTSProvider.FISH_AUDIO]: 15 / 1_000_000,
405
+ };
406
+
407
+ const response = await ttsService.synthesize({ /* ... */ });
408
+ const costUSD = response.billing.characters * PROVIDER_RATES[TTSProvider.AZURE];
409
+ ```
262
410
 
263
- ## 🧪 Testing
411
+ </details>
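The provider tables quote Fish Audio pricing per UTF-8 byte, while Azure and Google are quoted per character, so a single per-character multiplication can understate Fish Audio costs for non-ASCII text. The following sketch separates the two units. The rates are the approximate figures from this README, the provider keys and function names are hypothetical, and `text.length` is only a rough proxy for the character count the middleware reports in `billing.characters`.

```typescript
type BillingUnit = 'characters' | 'utf8Bytes';

interface Rate {
  usdPerMillion: number;
  unit: BillingUnit;
}

// Approximate list prices as quoted in the provider tables.
const RATES: Record<'azure' | 'google' | 'fish_audio', Rate> = {
  azure: { usdPerMillion: 16, unit: 'characters' },
  google: { usdPerMillion: 16, unit: 'characters' },
  fish_audio: { usdPerMillion: 15, unit: 'utf8Bytes' },
};

// Counts the bytes the text occupies when encoded as UTF-8.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    const next = i + 1 < text.length ? text.charCodeAt(i + 1) : 0;
    if (code < 0x80) {
      bytes += 1;
    } else if (code < 0x800) {
      bytes += 2;
    } else if (code >= 0xd800 && code <= 0xdbff && next >= 0xdc00 && next <= 0xdfff) {
      bytes += 4; // surrogate pair encodes one code point above U+FFFF
      i++;
    } else {
      bytes += 3;
    }
  }
  return bytes;
}

// Estimates cost in USD using the billing unit of the given provider.
function estimateCostUSD(text: string, provider: keyof typeof RATES): number {
  const rate = RATES[provider];
  const units = rate.unit === 'utf8Bytes' ? utf8ByteLength(text) : text.length;
  return (units * rate.usdPerMillion) / 1_000_000;
}
```

For German text the gap is visible: `Grüße` is 5 characters but 7 UTF-8 bytes, so the byte-based estimate is the larger of the two.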
264
412
 
265
- The project maintains high code coverage (>90%) with 434+ tests using Jest.
413
+ ## Architecture
414
+
415
+ ```mermaid
416
+ graph TD
417
+ App[Your Application] -->|synthesize()| Service[TTSService]
418
+ Service -->|getProvider()| Registry{Provider Registry}
419
+
420
+ Registry -->|Select| Azure[AzureProvider]
421
+ Registry -->|Select| GCloud[GoogleCloudTTSProvider]
422
+ Registry -->|Select| Eden[EdenAIProvider]
423
+ Registry -->|Select| Fish[FishAudioProvider]
424
+
425
+ Azure -->|SSML/SDK| AzureAPI[Azure Speech API]
426
+ GCloud -->|gRPC/SDK| GoogleAPI[Google Cloud TTS API]
427
+ Eden -->|REST| EdenAPI[EdenAI API]
428
+ Fish -->|REST| FishAPI[Fish Audio API]
429
+
430
+ GoogleAPI -->|EU Endpoint| EU[eu-texttospeech.googleapis.com]
431
+ EdenAPI -.-> OpenAI[OpenAI TTS]
432
+ EdenAPI -.-> Amazon[Amazon Polly]
433
+ ```
434
+
435
+ ## Testing
266
436
 
267
437
  ```bash
268
- # Run all tests
438
+ # Run all tests (436 tests, >90% coverage)
269
439
  npm test
270
440
 
271
- # Run specific provider tests
272
- npm test -- --testPathPattern="google"
273
- npm test -- --testPathPattern="azure"
274
- npm test -- --testPathPattern="edenai"
441
+ # Unit tests only
442
+ npm run test:unit
443
+
444
+ # Integration tests
445
+ npm run test:integration
446
+
447
+ # Coverage report
448
+ npm run test:coverage
275
449
 
276
- # Manual verification scripts (require .env)
277
- npx ts-node scripts/manual-test-google-cloud-tts.ts de neural2
450
+ # Manual test scripts
278
451
  npx ts-node scripts/manual-test-edenai.ts
452
+ npx ts-node scripts/manual-test-google-cloud-tts.ts
453
+ npx ts-node scripts/manual-test-fish-audio.ts [en] [de]
279
454
 
280
455
  # List available Google Cloud voices
281
456
  npx ts-node scripts/list-google-voices.ts de-DE
282
- npx ts-node scripts/list-google-voices.ts en-US
283
457
  ```
284
458
 
285
- ## 🤝 Contributing
459
+ ## Contributing
460
+
461
+ We welcome contributions! Please ensure:
462
+
463
+ 1. **Tests:** Add tests for new features
464
+ 2. **Linting:** Run `npm run lint` before committing
465
+ 3. **Conventions:** Follow the existing project structure
466
+
467
+ 1. Fork the repository
468
+ 2. Create your feature branch (`git checkout -b feature/amazing-feature`)
469
+ 3. Commit your changes (`git commit -m 'Add some amazing feature'`)
470
+ 4. Push to the branch (`git push origin feature/amazing-feature`)
471
+ 5. Open a Pull Request
472
+
473
+ ## License
474
+
475
+ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
476
+
477
+ ## Links
478
+
479
+ - [NPM Package](https://www.npmjs.com/package/@loonylabs/tts-middleware)
480
+ - [Issues](https://github.com/loonylabs-dev/tts-middleware/issues)
481
+ - [CHANGELOG](CHANGELOG.md)
482
+
483
+ ---
286
484
 
287
- See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines.
485
+ <div align="center">
288
486
 
289
- Contributions are welcome! Please ensure:
290
- 1. **Tests:** Add tests for new features.
291
- 2. **Linting:** Run `npm run lint` before committing.
292
- 3. **Conventions:** Follow the existing project structure.
487
+ **Made with care by the LoonyLabs Team**
293
488
 
294
- ## 📄 License
489
+ [![GitHub stars](https://img.shields.io/github/stars/loonylabs-dev/tts-middleware?style=social)](https://github.com/loonylabs-dev/tts-middleware/stargazers)
490
+ [![Follow on GitHub](https://img.shields.io/github/followers/loonylabs-dev?style=social&label=Follow)](https://github.com/loonylabs-dev)
295
491
 
296
- [MIT](LICENSE) © 2026 LoonyLabs Team
492
+ </div>