voice-router-dev 0.3.2 → 0.3.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/CHANGELOG.md ADDED
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.3.3] - 2026-01-08

### Added

#### Gladia Audio File Download

New `getAudioFile()` method on the Gladia adapter downloads the original audio used for a transcription:

```typescript
import fs from 'node:fs' // Node.js only - used for the file-save path below

// Download audio from a pre-recorded transcription
const result = await gladiaAdapter.getAudioFile('transcript-123')
if (result.success && result.data) {
  // Save to file (Node.js)
  const buffer = Buffer.from(await result.data.arrayBuffer())
  fs.writeFileSync('audio.mp3', buffer)

  // Or create a download URL (browser)
  const url = URL.createObjectURL(result.data)
}

// Download audio from a live/streaming session
const liveResult = await gladiaAdapter.getAudioFile('stream-456', 'streaming')
```

**Note:** This is a Gladia-specific feature. Other providers (Deepgram, AssemblyAI, Azure) do not store audio files after transcription.

New capability flag: `capabilities.getAudioFile` indicates whether a provider supports audio retrieval.
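
A minimal sketch of gating on that flag before calling the method. It assumes the adapter instance exposes its `ProviderCapabilities` object as `adapter.capabilities`; that property access is an assumption, not something this changelog confirms:

```typescript
// Only attempt the download when the provider supports it.
// `adapter.capabilities` is an assumed access path for ProviderCapabilities.
if (adapter.capabilities.getAudioFile) {
  const audio = await adapter.getAudioFile('transcript-123')
  if (audio.success && audio.data) {
    console.log('Retrieved audio blob, bytes:', audio.data.size)
  }
} else {
  console.log('This provider does not retain audio after transcription')
}
```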
#### Improved Metadata Clarity

New metadata fields for better discoverability:

```typescript
interface TranscriptMetadata {
  /** Original audio URL you provided (echoed back) - renamed from audioUrl */
  sourceAudioUrl?: string

  /** True if getAudioFile() can retrieve the audio (Gladia only) */
  audioFileAvailable?: boolean
  // ...
}
```

Usage pattern:

```typescript
const { transcripts } = await router.listTranscripts('gladia')

// Use for...of rather than forEach so `await` works inside the loop
for (const item of transcripts) {
  // What you sent
  console.log(item.data?.metadata?.sourceAudioUrl) // "https://your-bucket.s3.amazonaws.com/audio.mp3"

  // Can we download from the provider?
  if (item.data?.metadata?.audioFileAvailable) {
    const audio = await gladiaAdapter.getAudioFile(item.data.id)
    // audio.data is a Blob - the actual file stored by Gladia
  }
}
```

### Changed

- **BREAKING:** `metadata.audioUrl` renamed to `metadata.sourceAudioUrl` for clarity (see the sketch after this list)
  - This field contains the URL you originally provided, not a provider-hosted URL
- `audioFileAvailable` is now set on all provider responses (derived from `capabilities.getAudioFile`)
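
Migration is a one-line rename wherever the old field was read, as sketched here (mirrors the usage pattern shown above):

```typescript
const { transcripts } = await router.listTranscripts('gladia')

for (const item of transcripts) {
  // Before 0.3.3 (old field name):
  // console.log(item.data?.metadata?.audioUrl)

  // 0.3.3 and later (same value, clearer name):
  console.log(item.data?.metadata?.sourceAudioUrl)
}
```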
#### listTranscripts Implementation

Full `listTranscripts()` support for AssemblyAI, Gladia, Azure, and Deepgram, implemented using only the OpenAPI-generated types:

```typescript
// List recent transcripts with filtering
const { transcripts, hasMore } = await router.listTranscripts('assemblyai', {
  status: 'completed',
  date: '2026-01-07',
  limit: 50
})

// Date range filtering (Gladia)
const { transcripts } = await router.listTranscripts('gladia', {
  afterDate: '2026-01-01',
  beforeDate: '2026-01-31'
})

// Provider-specific passthrough
const { transcripts } = await router.listTranscripts('assemblyai', {
  assemblyai: { after_id: 'cursor-123' }
})

// Deepgram request history (requires projectId)
const adapter = new DeepgramAdapter()
adapter.initialize({
  apiKey: process.env.DEEPGRAM_API_KEY,
  projectId: process.env.DEEPGRAM_PROJECT_ID
})

// List requests (metadata only)
const { transcripts } = await adapter.listTranscripts({
  status: 'succeeded',
  afterDate: '2026-01-01'
})

// Get full transcript by request ID
const firstId = transcripts[0]?.data?.id
if (firstId) {
  const fullTranscript = await adapter.getTranscript(firstId)
  console.log(fullTranscript.data?.text) // Full transcript!
}
```

#### Status Enums for Filtering

New status constants with IDE autocomplete:

```typescript
import { AssemblyAIStatus, GladiaStatus, AzureStatus, DeepgramStatus } from 'voice-router-dev/constants'

await router.listTranscripts('assemblyai', {
  status: AssemblyAIStatus.completed // queued | processing | completed | error
})

await router.listTranscripts('gladia', {
  status: GladiaStatus.done // queued | processing | done | error
})

await router.listTranscripts('azure-stt', {
  status: AzureStatus.Succeeded // NotStarted | Running | Succeeded | Failed
})

// Deepgram (request history - requires projectId)
await adapter.listTranscripts({
  status: DeepgramStatus.succeeded // succeeded | failed
})
```

#### JSDoc Comments for All Constants

All constants now have JSDoc with (see the sketch after this list):
- Available values listed
- Usage examples
- Provider-specific notes
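
For illustration only, a hypothetical JSDoc block in this style; the wording and the placeholder declaration are not copied from the package source:

```typescript
/**
 * Gladia transcript status values.
 *
 * Available values: "queued" | "processing" | "done" | "error"
 *
 * @example
 * await router.listTranscripts('gladia', { status: GladiaStatus.done })
 *
 * Provider note: Gladia reports "done" where other providers report "completed" or "succeeded".
 */
declare const GladiaStatus: Record<string, string> // placeholder signature; the real const lives in voice-router-dev/constants
```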
#### Typed Response Interfaces

New exported types for full autocomplete on transcript responses:

```typescript
import type {
  TranscriptData,
  TranscriptMetadata,
  ListTranscriptsResponse
} from 'voice-router-dev';

const response: ListTranscriptsResponse = await router.listTranscripts('assemblyai', { limit: 20 });

response.transcripts.forEach(item => {
  // Full autocomplete - no `as any` casts needed!
  console.log(item.data?.id); // string
  console.log(item.data?.status); // TranscriptionStatus
  console.log(item.data?.metadata?.sourceAudioUrl); // string | undefined
  console.log(item.data?.metadata?.createdAt); // string | undefined
});
```

**Note:** These are manually maintained normalization types that unify the different provider schemas.
For raw provider types, use `result.raw` with the generic parameter:

```typescript
const result: UnifiedTranscriptResponse<'assemblyai'> = await adapter.transcribe(audio);
// result.raw is typed as AssemblyAITranscript
```

#### DeepgramSampleRate Const

New convenience const for Deepgram sample rates (these values are not defined in the OpenAPI spec):

```typescript
import { DeepgramSampleRate } from 'voice-router-dev/constants'

{ sampleRate: DeepgramSampleRate.NUMBER_16000 }
```

#### Additional Deepgram OpenAPI Re-exports

New constants re-exported directly from the OpenAPI-generated types:

```typescript
import { DeepgramIntentMode, DeepgramCallbackMethod } from 'voice-router-dev/constants'

// Intent detection mode
{ customIntentMode: DeepgramIntentMode.extended } // extended | strict

// Async callback method
{ callbackMethod: DeepgramCallbackMethod.POST } // POST | PUT
```

### Changed

- All adapter `listTranscripts()` implementations use generated API functions and types only
- Status mappings use generated enums (`TranscriptStatus`, `TranscriptionControllerListV2StatusItem`, `Status`, `ManageV1FilterStatusParameter`)
- Deepgram adapter now supports `listTranscripts()` via the request history API (metadata only)
- Deepgram `getTranscript()` now returns full transcript data from request history

### Fixed

- Gladia `listTranscripts()` now includes file metadata (see the sketch after this list):
  - `data.duration` - audio duration in seconds
  - `metadata.sourceAudioUrl` - source URL (if `audio_url` was used)
  - `metadata.filename` - original filename
  - `metadata.audioDuration` - audio duration (also in metadata)
  - `metadata.numberOfChannels` - number of audio channels

- All adapters now include `raw: item` in `listTranscripts()` responses for consistency:
  - AssemblyAI: now includes `raw` field with the original `TranscriptListItem`
  - Azure: now includes `raw` field with the original `Transcription` item
  - Added `metadata.description` to Azure list responses

- Added clarifying comments in adapters about provider limitations:
  - AssemblyAI: `audio_duration` is only available in the full `Transcript`, not in `TranscriptListItem`
  - Azure: `contentUrls` is write-only (not returned in list responses per the API docs)
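
A short sketch of reading the new Gladia list metadata; the field names come from the bullets above, and the call shape follows the earlier `listTranscripts` examples:

```typescript
const { transcripts } = await router.listTranscripts('gladia')

for (const item of transcripts) {
  console.log(item.data?.duration) // audio duration in seconds
  console.log(item.data?.metadata?.filename) // original filename
  console.log(item.data?.metadata?.numberOfChannels) // number of audio channels
}
```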
---

## [0.3.0] - 2026-01-07

### Added

#### Browser-Safe Constants Export

New `/constants` subpath export for browser, Cloudflare Workers, and edge environments:

```typescript
// Browser-safe import (no node:crypto, ws, or axios)
import { DeepgramModel, GladiaEncoding, AssemblyAIEncoding } from 'voice-router-dev/constants'

const model = DeepgramModel["nova-3"]
const encoding = GladiaEncoding["wav/pcm"]
```

The main entry point (`voice-router-dev`) still works but bundles Node.js dependencies.
Use `/constants` when you only need the enum values without the adapter classes.

#### Type-Safe Streaming Enums with Autocomplete

New const objects provide IDE autocomplete and compile-time validation for all streaming options.
All enums are derived from OpenAPI specs and stay in sync with provider APIs.

**Deepgram:**
```typescript
import { DeepgramEncoding, DeepgramModel, DeepgramRedact } from 'voice-router-dev'

await adapter.transcribeStream({
  deepgramStreaming: {
    encoding: DeepgramEncoding.linear16, // "linear16" | "flac" | "mulaw" | ...
    model: DeepgramModel["nova-3"], // "nova-3" | "nova-2" | "enhanced" | ...
    redact: [DeepgramRedact.pii], // "pii" | "pci" | "numbers"
  }
})
```

**Gladia:**
```typescript
import { GladiaEncoding, GladiaSampleRate, GladiaLanguage } from 'voice-router-dev'

await adapter.transcribeStream({
  encoding: GladiaEncoding['wav/pcm'], // "wav/pcm" | "wav/alaw" | "wav/ulaw"
  sampleRate: GladiaSampleRate.NUMBER_16000, // 8000 | 16000 | 32000 | 44100 | 48000
  language: GladiaLanguage.en, // 100+ language codes
})
```

**AssemblyAI:**
```typescript
import { AssemblyAIEncoding, AssemblyAISpeechModel, AssemblyAISampleRate } from 'voice-router-dev'

await adapter.transcribeStream({
  assemblyaiStreaming: {
    encoding: AssemblyAIEncoding.pcmS16le, // "pcm_s16le" | "pcm_mulaw"
    speechModel: AssemblyAISpeechModel.multilingual, // English or multilingual
    sampleRate: AssemblyAISampleRate.rate16000, // 8000-48000
  }
})
```

#### Type Safety Audit

All enums are either re-exported from OpenAPI-generated types or type-checked with `satisfies`:

| Enum | Source | Type Safety |
|------|--------|-------------|
| `DeepgramEncoding` | Re-exported from `ListenV1EncodingParameter` | ✅ OpenAPI |
| `DeepgramRedact` | Re-exported from `ListenV1RedactParameterOneOfItem` | ✅ OpenAPI |
| `DeepgramModel` | Manual const with `satisfies ListenV1ModelParameter` | ⚠️ Type-checked |
| `DeepgramTopicMode` | Re-exported from `SharedCustomTopicModeParameter` | ✅ OpenAPI |
| `GladiaEncoding` | Re-exported from `StreamingSupportedEncodingEnum` | ✅ OpenAPI |
| `GladiaSampleRate` | Re-exported from `StreamingSupportedSampleRateEnum` | ✅ OpenAPI |
| `GladiaBitDepth` | Re-exported from `StreamingSupportedBitDepthEnum` | ✅ OpenAPI |
| `GladiaModel` | Re-exported from `StreamingSupportedModels` | ✅ OpenAPI |
| `GladiaLanguage` | Re-exported from `TranscriptionLanguageCodeEnum` | ✅ OpenAPI |
| `AssemblyAIEncoding` | Manual const with `satisfies AudioEncoding` | ⚠️ Type-checked |
| `AssemblyAISpeechModel` | Manual const with `satisfies StreamingSpeechModel` | ⚠️ Type-checked |
| `AssemblyAISampleRate` | Manual const (no generated type exists) | ❌ Unchecked |

**Why some remain manual:**
- `DeepgramModel`: OpenAPI generates a type union, not a const object
- `AssemblyAI*`: Synced from SDK types, which are unions rather than const objects
- `AssemblyAISampleRate`: Not defined in any spec (values come from the SDK documentation)

The `satisfies` keyword ensures compile-time errors if values drift from the generated types.
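
A minimal sketch of that `satisfies` pattern; the value list and the `DeepgramModelSketch` name are shortened and illustrative, not the package's actual definition:

```typescript
import type { ListenV1ModelParameter } from 'voice-router-dev'

// `as const` keeps the literal value types; `satisfies` fails the build
// if any value stops matching the generated union.
const DeepgramModelSketch = {
  'nova-3': 'nova-3',
  'nova-2': 'nova-2',
  enhanced: 'enhanced',
} as const satisfies Record<string, ListenV1ModelParameter>
```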
#### Full Streaming Implementation for All Providers

- **Gladia**: Complete streaming with pre-processing, real-time processing (translation, NER, sentiment), post-processing (summarization, chapterization), and all WebSocket message types
- **Deepgram**: Full streaming with 30+ options including filler words, numerals, measurements, topics, intents, sentiment, entities, keyterm prompting, and VAD events
- **AssemblyAI**: v3 Universal Streaming API with end-of-turn detection tuning, VAD threshold, format turns, profanity filtering, keyterms, and dynamic configuration updates

#### New Streaming Event Callbacks

```typescript
await adapter.transcribeStream(options, {
  onTranscript: (event) => { /* interim/final transcripts */ },
  onUtterance: (utterance) => { /* complete utterances */ },
  onSpeechStart: (event) => { /* speech detected */ },
  onSpeechEnd: (event) => { /* speech ended */ },
  onTranslation: (event) => { /* real-time translation (Gladia) */ },
  onSentiment: (event) => { /* sentiment analysis (Gladia) */ },
  onEntity: (event) => { /* named entity recognition (Gladia) */ },
  onSummarization: (event) => { /* post-processing summary (Gladia) */ },
  onChapterization: (event) => { /* auto-chapters (Gladia) */ },
  onMetadata: (metadata) => { /* stream metadata */ },
  onError: (error) => { /* error handling */ },
  onClose: (code, reason) => { /* connection closed */ },
})
```

#### AssemblyAI Dynamic Configuration

```typescript
const session = await adapter.transcribeStream(options, callbacks)

// Update configuration mid-stream
session.updateConfiguration?.({
  end_of_turn_confidence_threshold: 0.8,
  vad_threshold: 0.4,
  format_turns: true,
})

// Force end-of-turn detection
session.forceEndpoint?.()
```

### Changed

- `TranscriptionModel` (batch) now uses a strict union type (no `| string` fallback); see the sketch after this list
- `DeepgramStreamingOptions.model` now uses a strict union type (no `| string` fallback)
- `AssemblyAIStreamingOptions.speechModel` now uses a strict union type
- `ProviderCapabilities` now includes `listTranscripts` and `deleteTranscript` flags
- `DeepgramStreamingOptions` now includes 30+ typed parameters from the OpenAPI spec
- `AssemblyAIStreamingOptions` now includes all v3 streaming parameters
- `GladiaStreamingOptions` now includes full pre/realtime/post processing options
- Provider-specific streaming options now have JSDoc examples for better discoverability
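
What the strict unions buy you, as a sketch. It assumes `DeepgramStreamingOptions` is importable from the package root; the bullets above name the type but not its export path:

```typescript
import { DeepgramModel } from 'voice-router-dev'
import type { DeepgramStreamingOptions } from 'voice-router-dev'

// OK: the value comes from the typed const
const model: DeepgramStreamingOptions['model'] = DeepgramModel['nova-3']

// Compile error in 0.3.0: a typo is no longer silently accepted,
// because the old `| string` fallback is gone.
// const typo: DeepgramStreamingOptions['model'] = 'nova-4'
```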
### Deprecated

Raw generated enum exports are deprecated in favor of user-friendly aliases:

| Deprecated | Use Instead |
|------------|-------------|
| `ListenV1EncodingParameter` | `DeepgramEncoding` |
| `ListenV1ModelParameter` | `DeepgramModel` |
| `ListenV1RedactParameterOneOfItem` | `DeepgramRedact` |
| `StreamingSupportedEncodingEnum` | `GladiaEncoding` |
| `StreamingSupportedSampleRateEnum` | `GladiaSampleRate` |
| `StreamingSupportedBitDepthEnum` | `GladiaBitDepth` |

---

## Migration Guide (0.2.x → 0.3.0)

### 1. Update Enum Imports

**Before (0.2.x):**
```typescript
import {
  ListenV1EncodingParameter,
  StreamingSupportedEncodingEnum
} from 'voice-router-dev'

const encoding = ListenV1EncodingParameter.linear16
const gladiaEncoding = StreamingSupportedEncodingEnum['wav/pcm']
```

**After (0.3.0):**
```typescript
import {
  DeepgramEncoding,
  GladiaEncoding
} from 'voice-router-dev'

const encoding = DeepgramEncoding.linear16
const gladiaEncoding = GladiaEncoding['wav/pcm']
```

### 2. Update Model References

**Before:**
```typescript
// String literals (still work but no autocomplete)
model: "nova-3"
```

**After:**
```typescript
import { DeepgramModel } from 'voice-router-dev'

// With autocomplete
model: DeepgramModel["nova-3"]
```

### 3. Update Streaming Options

**Before (0.2.x):**
```typescript
await adapter.transcribeStream({
  encoding: 'linear16',
  sampleRate: 16000,
})
```

**After (0.3.0):**
```typescript
await adapter.transcribeStream({
  deepgramStreaming: {
    encoding: DeepgramEncoding.linear16,
    sampleRate: 16000,
    // Now supports 30+ additional options with autocomplete
    fillerWords: true,
    smartFormat: true,
  }
})
```

### 4. New Callback Handlers

If you were only using `onTranscript`, you now have access to more granular events:

```typescript
await adapter.transcribeStream(options, {
  onTranscript: (event) => { /* still works */ },

  // New in 0.3.0:
  onSpeechStart: (event) => console.log('Speech started'),
  onSpeechEnd: (event) => console.log('Speech ended'),
  onUtterance: (utterance) => console.log('Complete utterance:', utterance.text),
})
```

---

## [0.2.8] - 2025-12-30

### Added
- Typed extended response data via the `extendedData` field (see the sketch after this list)
- Request tracking via the `requestId` field
- Type-safe provider-specific options from OpenAPI specs
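
A hypothetical sketch of reading those two fields off a transcription result; the field names come from the bullets above, but placing them directly on the unified response object is an assumption:

```typescript
const result = await adapter.transcribe(audio)

// Assumed locations of the 0.2.8 fields on the unified response object.
console.log(result.requestId) // correlate logs or support tickets with a provider request
console.log(result.extendedData) // provider-specific extras beyond the normalized fields
```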
### Changed
- Replaced `text` with `words` in SDK responses

## [0.2.5] - 2025-12-15

### Added
- Initial OpenAPI-generated types for Gladia, Deepgram, AssemblyAI
- Webhook normalization handlers
- Basic streaming support