@ai-sdk/deepgram 2.0.9 → 2.0.11

@@ -0,0 +1,190 @@
1
+ ---
2
+ title: Deepgram
3
+ description: Learn how to use the Deepgram provider for the AI SDK.
4
+ ---
5
+
6
+ # Deepgram Provider
7
+
8
+ The [Deepgram](https://deepgram.com/) provider contains transcription model support for the Deepgram transcription API.
9
+
10
+ ## Setup
11
+
12
+ The Deepgram provider is available in the `@ai-sdk/deepgram` module. You can install it with:
13
+
14
+ <Tabs items={['pnpm', 'npm', 'yarn', 'bun']}>
15
+ <Tab>
16
+ <Snippet text="pnpm add @ai-sdk/deepgram" dark />
17
+ </Tab>
18
+ <Tab>
19
+ <Snippet text="npm install @ai-sdk/deepgram" dark />
20
+ </Tab>
21
+ <Tab>
22
+ <Snippet text="yarn add @ai-sdk/deepgram" dark />
23
+ </Tab>
24
+
25
+ <Tab>
26
+ <Snippet text="bun add @ai-sdk/deepgram" dark />
27
+ </Tab>
28
+ </Tabs>
29
+
30
+ ## Provider Instance
31
+
32
+ You can import the default provider instance `deepgram` from `@ai-sdk/deepgram`:
33
+
34
+ ```ts
35
+ import { deepgram } from '@ai-sdk/deepgram';
36
+ ```
37
+
38
+ If you need a customized setup, you can import `createDeepgram` from `@ai-sdk/deepgram` and create a provider instance with your settings:
39
+
40
+ ```ts
41
+ import { createDeepgram } from '@ai-sdk/deepgram';
42
+
43
+ const deepgram = createDeepgram({
44
+ // custom settings, e.g.
45
+ fetch: customFetch,
46
+ });
47
+ ```
48
+
49
+ You can use the following optional settings to customize the Deepgram provider instance:
50
+
51
+ - **apiKey** _string_
52
+
53
+ API key that is being sent using the `Authorization` header.
54
+ It defaults to the `DEEPGRAM_API_KEY` environment variable.
55
+
56
+ - **headers** _Record&lt;string,string&gt;_
57
+
58
+ Custom headers to include in the requests.
59
+
60
+ - **fetch** _(input: RequestInfo, init?: RequestInit) => Promise&lt;Response&gt;_
61
+
62
+ Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation.
63
+ Defaults to the global `fetch` function.
64
+ You can use it as a middleware to intercept requests,
65
+ or to provide a custom fetch implementation for e.g. testing.
66
+
67
+ ## Transcription Models
68
+
69
+ You can create models that call the [Deepgram transcription API](https://developers.deepgram.com/docs/pre-recorded-audio)
70
+ using the `.transcription()` factory method.
71
+
72
+ The first argument is the model ID, e.g. `nova-3`.
73
+
74
+ ```ts
75
+ const model = deepgram.transcription('nova-3');
76
+ ```
77
+
78
+ You can also pass additional provider-specific options using the `providerOptions` argument. For example, supplying the `summarize` option will enable summaries for sections of content.
79
+
80
+ ```ts highlight="6"
81
+ import { experimental_transcribe as transcribe } from 'ai';
82
+ import { deepgram } from '@ai-sdk/deepgram';
83
+ import { readFile } from 'fs/promises';
84
+
85
+ const result = await transcribe({
86
+ model: deepgram.transcription('nova-3'),
87
+ audio: await readFile('audio.mp3'),
88
+ providerOptions: { deepgram: { summarize: 'v2' } },
89
+ });
90
+ ```
91
+
92
+ The following provider options are available:
93
+
94
+ - **language** _string_
95
+
96
+ Language code for the audio.
97
+ Supports numerous ISO-639-1 and ISO-639-3 language codes.
98
+ Optional.
99
+
100
+ - **smartFormat** _boolean_
101
+
102
+ Whether to apply smart formatting to the transcription.
103
+ Optional.
104
+
105
+ - **punctuate** _boolean_
106
+
107
+ Whether to add punctuation to the transcription.
108
+ Optional.
109
+
110
+ - **paragraphs** _boolean_
111
+
112
+ Whether to format the transcription into paragraphs.
113
+ Optional.
114
+
115
+ - **summarize** _enum | boolean_
116
+
117
+ Whether to generate a summary of the transcription.
118
+ Allowed values: `'v2'`, `false`.
119
+ Optional.
120
+
121
+ - **topics** _boolean_
122
+
123
+ Whether to detect topics in the transcription.
124
+ Optional.
125
+
126
+ - **intents** _boolean_
127
+
128
+ Whether to detect intents in the transcription.
129
+ Optional.
130
+
131
+ - **sentiment** _boolean_
132
+
133
+ Whether to perform sentiment analysis on the transcription.
134
+ Optional.
135
+
136
+ - **detectEntities** _boolean_
137
+
138
+ Whether to detect entities in the transcription.
139
+ Optional.
140
+
141
+ - **redact** _string | array of strings_
142
+
143
+ Specifies what content to redact from the transcription.
144
+ Optional.
145
+
146
+ - **replace** _string_
147
+
148
+ Replacement string for redacted content.
149
+ Optional.
150
+
151
+ - **search** _string_
152
+
153
+ Search term to find in the transcription.
154
+ Optional.
155
+
156
+ - **keyterm** _string_
157
+
158
+ Key terms to identify in the transcription.
159
+ Optional.
160
+
161
+ - **diarize** _boolean_
162
+
163
+ Whether to identify different speakers in the transcription.
164
+ Defaults to `true`.
165
+ Optional.
166
+
167
+ - **utterances** _boolean_
168
+
169
+ Whether to segment the transcription into utterances.
170
+ Optional.
171
+
172
+ - **uttSplit** _number_
173
+
174
+ Threshold for splitting utterances.
175
+ Optional.
176
+
177
+ - **fillerWords** _boolean_
178
+
179
+ Whether to include filler words (um, uh, etc.) in the transcription.
180
+ Optional.
181
+
182
+ ### Model Capabilities
183
+
184
+ | Model | Transcription | Duration | Segments | Language |
185
+ | -------------------------------------------------------------------------------------------------- | ------------------- | ------------------- | ------------------- | ------------------- |
186
+ | `nova-3` (+ [variants](https://developers.deepgram.com/docs/models-languages-overview#nova-3)) | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Cross size={18} /> |
187
+ | `nova-2` (+ [variants](https://developers.deepgram.com/docs/models-languages-overview#nova-2)) | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Cross size={18} /> |
188
+ | `nova` (+ [variants](https://developers.deepgram.com/docs/models-languages-overview#nova)) | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Cross size={18} /> |
189
+ | `enhanced` (+ [variants](https://developers.deepgram.com/docs/models-languages-overview#enhanced)) | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Cross size={18} /> |
190
+ | `base` (+ [variants](https://developers.deepgram.com/docs/models-languages-overview#base)) | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Cross size={18} /> |
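
The provider options listed in the added docs file above can be summarized as a TypeScript shape. This is a minimal sketch derived only from that list: the interface name `DeepgramTranscriptionOptions` is hypothetical and not confirmed as an export of `@ai-sdk/deepgram`, and the `redact` values are illustrative placeholders.

```ts
// Hypothetical shape mirroring the documented provider options.
// Every field is optional, as stated in the docs list.
interface DeepgramTranscriptionOptions {
  language?: string; // ISO-639-1 or ISO-639-3 code
  smartFormat?: boolean;
  punctuate?: boolean;
  paragraphs?: boolean;
  summarize?: 'v2' | false; // allowed values per the docs
  topics?: boolean;
  intents?: boolean;
  sentiment?: boolean;
  detectEntities?: boolean;
  redact?: string | string[];
  replace?: string;
  search?: string;
  keyterm?: string;
  diarize?: boolean; // documented default: true
  utterances?: boolean;
  uttSplit?: number;
  fillerWords?: boolean;
}

const options: DeepgramTranscriptionOptions = {
  language: 'en',
  smartFormat: true,
  summarize: 'v2',
  diarize: false,
  redact: ['pci', 'numbers'], // illustrative values, check Deepgram's docs
};

// Options are namespaced under the provider name, as in the docs example.
const providerOptions = { deepgram: options };
```

An object like this would take the place of `{ deepgram: { summarize: ... } }` in the `transcribe` call shown in the docs.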
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@ai-sdk/deepgram",
3
- "version": "2.0.9",
3
+ "version": "2.0.11",
4
4
  "license": "Apache-2.0",
5
5
  "sideEffects": false,
6
6
  "main": "./dist/index.js",
@@ -8,10 +8,18 @@
8
8
  "types": "./dist/index.d.ts",
9
9
  "files": [
10
10
  "dist/**/*",
11
+ "docs/**/*",
11
12
  "src",
13
+ "!src/**/*.test.ts",
14
+ "!src/**/*.test-d.ts",
15
+ "!src/**/__snapshots__",
16
+ "!src/**/__fixtures__",
12
17
  "CHANGELOG.md",
13
18
  "README.md"
14
19
  ],
20
+ "directories": {
21
+ "doc": "./docs"
22
+ },
15
23
  "exports": {
16
24
  "./package.json": "./package.json",
17
25
  ".": {
@@ -21,15 +29,15 @@
21
29
  }
22
30
  },
23
31
  "dependencies": {
24
- "@ai-sdk/provider": "3.0.4",
25
- "@ai-sdk/provider-utils": "4.0.8"
32
+ "@ai-sdk/provider": "3.0.5",
33
+ "@ai-sdk/provider-utils": "4.0.9"
26
34
  },
27
35
  "devDependencies": {
28
36
  "@types/node": "20.17.24",
29
37
  "tsup": "^8",
30
38
  "typescript": "5.6.3",
31
39
  "zod": "3.25.76",
32
- "@ai-sdk/test-server": "1.0.2",
40
+ "@ai-sdk/test-server": "1.0.3",
33
41
  "@vercel/ai-tsconfig": "0.0.0"
34
42
  },
35
43
  "peerDependencies": {
@@ -55,7 +63,7 @@
55
63
  "scripts": {
56
64
  "build": "tsup --tsconfig tsconfig.build.json",
57
65
  "build:watch": "tsup --tsconfig tsconfig.build.json --watch",
58
- "clean": "del-cli dist",
66
+ "clean": "del-cli dist docs",
59
67
  "lint": "eslint \"./**/*.ts*\"",
60
68
  "type-check": "tsc --noEmit",
61
69
  "prettier-check": "prettier --check \"./**/*.ts*\"",
@@ -1,34 +0,0 @@
1
- import { safeParseJSON } from '@ai-sdk/provider-utils';
2
- import { deepgramErrorDataSchema } from './deepgram-error';
3
- import { describe, expect, it } from 'vitest';
4
-
5
- describe('deepgramErrorDataSchema', () => {
6
- it('should parse Deepgram resource exhausted error', async () => {
7
- const error = `
8
- {"error":{"message":"{\\n \\"error\\": {\\n \\"code\\": 429,\\n \\"message\\": \\"Resource has been exhausted (e.g. check quota).\\",\\n \\"status\\": \\"RESOURCE_EXHAUSTED\\"\\n }\\n}\\n","code":429}}
9
- `;
10
-
11
- const result = await safeParseJSON({
12
- text: error,
13
- schema: deepgramErrorDataSchema,
14
- });
15
-
16
- expect(result).toStrictEqual({
17
- success: true,
18
- value: {
19
- error: {
20
- message:
21
- '{\n "error": {\n "code": 429,\n "message": "Resource has been exhausted (e.g. check quota).",\n "status": "RESOURCE_EXHAUSTED"\n }\n}\n',
22
- code: 429,
23
- },
24
- },
25
- rawValue: {
26
- error: {
27
- message:
28
- '{\n "error": {\n "code": 429,\n "message": "Resource has been exhausted (e.g. check quota).",\n "status": "RESOURCE_EXHAUSTED"\n }\n}\n',
29
- code: 429,
30
- },
31
- },
32
- });
33
- });
34
- });
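
The removed test above checks that an error payload of the form `{ error: { message, code } }` parses against `deepgramErrorDataSchema`. The following is a rough sketch of the same shape check, assuming a hand-written type guard in place of the package's schema. The names `DeepgramErrorData`, `isDeepgramErrorData`, and `parseDeepgramError` are hypothetical and are not part of `@ai-sdk/deepgram`.

```ts
// Shape asserted by the removed test: a string message and a numeric code.
interface DeepgramErrorData {
  error: { message: string; code: number };
}

// Hypothetical type guard standing in for the package's schema.
function isDeepgramErrorData(value: unknown): value is DeepgramErrorData {
  if (typeof value !== 'object' || value === null) return false;
  const error = (value as { error?: unknown }).error;
  if (typeof error !== 'object' || error === null) return false;
  const { message, code } = error as { message?: unknown; code?: unknown };
  return typeof message === 'string' && typeof code === 'number';
}

// Returns the typed payload, or undefined for invalid JSON or a wrong shape.
function parseDeepgramError(text: string): DeepgramErrorData | undefined {
  try {
    const value: unknown = JSON.parse(text);
    return isDeepgramErrorData(value) ? value : undefined;
  } catch {
    return undefined;
  }
}

// As in the test fixture, the message is itself a JSON-encoded string.
const innerMessage = JSON.stringify({
  error: {
    code: 429,
    message: 'Resource has been exhausted (e.g. check quota).',
    status: 'RESOURCE_EXHAUSTED',
  },
});
const payload = JSON.stringify({ error: { message: innerMessage, code: 429 } });

const parsed = parseDeepgramError(payload);
```

Note that the outer `message` stays a plain string after parsing. The test expects it unchanged, so the nested JSON is not decoded here either.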
@@ -1,355 +0,0 @@
1
- import { createTestServer } from '@ai-sdk/test-server/with-vitest';
2
- import { createDeepgram } from './deepgram-provider';
3
- import { DeepgramSpeechModel } from './deepgram-speech-model';
4
- import { describe, it, expect, vi } from 'vitest';
5
-
6
- vi.mock('./version', () => ({
7
- VERSION: '0.0.0-test',
8
- }));
9
-
10
- const provider = createDeepgram({ apiKey: 'test-api-key' });
11
- const model = provider.speech('aura-2-helena-en');
12
-
13
- const server = createTestServer({
14
- 'https://api.deepgram.com/v1/speak': {},
15
- });
16
-
17
- describe('doGenerate', () => {
18
- function prepareAudioResponse({
19
- headers,
20
- }: {
21
- headers?: Record<string, string>;
22
- } = {}) {
23
- const audioBuffer = new Uint8Array(100); // Mock audio data
24
- server.urls['https://api.deepgram.com/v1/speak'].response = {
25
- type: 'binary',
26
- headers: {
27
- 'content-type': 'audio/mp3',
28
- ...headers,
29
- },
30
- body: Buffer.from(audioBuffer),
31
- };
32
- return audioBuffer;
33
- }
34
-
35
- it('should pass the model and text', async () => {
36
- prepareAudioResponse();
37
-
38
- await model.doGenerate({
39
- text: 'Hello, welcome to Deepgram!',
40
- });
41
-
42
- expect(await server.calls[0].requestBodyJson).toMatchObject({
43
- text: 'Hello, welcome to Deepgram!',
44
- });
45
-
46
- const url = new URL(server.calls[0].requestUrl);
47
- expect(url.searchParams.get('model')).toBe('aura-2-helena-en');
48
- });
49
-
50
- it('should pass headers', async () => {
51
- prepareAudioResponse();
52
-
53
- const provider = createDeepgram({
54
- apiKey: 'test-api-key',
55
- headers: {
56
- 'Custom-Provider-Header': 'provider-header-value',
57
- },
58
- });
59
-
60
- await provider.speech('aura-2-helena-en').doGenerate({
61
- text: 'Hello, welcome to Deepgram!',
62
- headers: {
63
- 'Custom-Request-Header': 'request-header-value',
64
- },
65
- });
66
-
67
- expect(server.calls[0].requestHeaders).toMatchObject({
68
- authorization: 'Token test-api-key',
69
- 'content-type': 'application/json',
70
- 'custom-provider-header': 'provider-header-value',
71
- 'custom-request-header': 'request-header-value',
72
- });
73
-
74
- expect(server.calls[0].requestUserAgent).toContain(
75
- `ai-sdk/deepgram/0.0.0-test`,
76
- );
77
- });
78
-
79
- it('should pass query parameters for model', async () => {
80
- prepareAudioResponse();
81
-
82
- await model.doGenerate({
83
- text: 'Hello, welcome to Deepgram!',
84
- });
85
-
86
- const url = new URL(server.calls[0].requestUrl);
87
- expect(url.searchParams.get('model')).toBe('aura-2-helena-en');
88
- });
89
-
90
- it('should map outputFormat to encoding/container', async () => {
91
- prepareAudioResponse();
92
-
93
- await model.doGenerate({
94
- text: 'Hello, welcome to Deepgram!',
95
- outputFormat: 'wav',
96
- });
97
-
98
- const url = new URL(server.calls[0].requestUrl);
99
- expect(url.searchParams.get('container')).toBe('wav');
100
- expect(url.searchParams.get('encoding')).toBe('linear16');
101
- });
102
-
103
- it('should pass provider options', async () => {
104
- prepareAudioResponse();
105
-
106
- await model.doGenerate({
107
- text: 'Hello, welcome to Deepgram!',
108
- providerOptions: {
109
- deepgram: {
110
- encoding: 'mp3',
111
- bitRate: 48000,
112
- container: 'wav',
113
- callback: 'https://example.com/callback',
114
- callbackMethod: 'POST',
115
- mipOptOut: true,
116
- tag: 'test-tag',
117
- },
118
- },
119
- });
120
-
121
- const url = new URL(server.calls[0].requestUrl);
122
- expect(url.searchParams.get('encoding')).toBe('mp3');
123
- expect(url.searchParams.get('bit_rate')).toBe('48000');
124
- // mp3 doesn't support container, so it should be removed
125
- expect(url.searchParams.get('container')).toBeNull();
126
- expect(url.searchParams.get('callback')).toBe(
127
- 'https://example.com/callback',
128
- );
129
- expect(url.searchParams.get('callback_method')).toBe('POST');
130
- expect(url.searchParams.get('mip_opt_out')).toBe('true');
131
- expect(url.searchParams.get('tag')).toBe('test-tag');
132
- });
133
-
134
- it('should handle array tag', async () => {
135
- prepareAudioResponse();
136
-
137
- await model.doGenerate({
138
- text: 'Hello, welcome to Deepgram!',
139
- providerOptions: {
140
- deepgram: {
141
- tag: ['tag1', 'tag2'],
142
- },
143
- },
144
- });
145
-
146
- const url = new URL(server.calls[0].requestUrl);
147
- expect(url.searchParams.get('tag')).toBe('tag1,tag2');
148
- });
149
-
150
- it('should return audio data', async () => {
151
- const audio = new Uint8Array(100); // Mock audio data
152
- prepareAudioResponse({
153
- headers: {
154
- 'x-request-id': 'test-request-id',
155
- },
156
- });
157
-
158
- const result = await model.doGenerate({
159
- text: 'Hello, welcome to Deepgram!',
160
- });
161
-
162
- expect(result.audio).toStrictEqual(audio);
163
- });
164
-
165
- it('should include response data with timestamp, modelId and headers', async () => {
166
- prepareAudioResponse({
167
- headers: {
168
- 'x-request-id': 'test-request-id',
169
- },
170
- });
171
-
172
- const testDate = new Date(0);
173
- const customModel = new DeepgramSpeechModel('aura-2-helena-en', {
174
- provider: 'test-provider',
175
- url: () => 'https://api.deepgram.com/v1/speak',
176
- headers: () => ({}),
177
- _internal: {
178
- currentDate: () => testDate,
179
- },
180
- });
181
-
182
- const result = await customModel.doGenerate({
183
- text: 'Hello, welcome to Deepgram!',
184
- });
185
-
186
- expect(result.response).toMatchObject({
187
- timestamp: testDate,
188
- modelId: 'aura-2-helena-en',
189
- headers: {
190
- 'content-type': 'audio/mp3',
191
- 'x-request-id': 'test-request-id',
192
- },
193
- });
194
- });
195
-
196
- it('should warn about unsupported voice parameter', async () => {
197
- prepareAudioResponse();
198
-
199
- const result = await model.doGenerate({
200
- text: 'Hello, welcome to Deepgram!',
201
- voice: 'different-voice',
202
- });
203
-
204
- expect(result.warnings).toMatchInlineSnapshot(`
205
- [
206
- {
207
- "details": "Deepgram TTS models embed the voice in the model ID. The voice parameter "different-voice" was ignored. Use the model ID to select a voice (e.g., "aura-2-helena-en").",
208
- "feature": "voice",
209
- "type": "unsupported",
210
- },
211
- ]
212
- `);
213
- });
214
-
215
- it('should warn about unsupported speed parameter', async () => {
216
- prepareAudioResponse();
217
-
218
- const result = await model.doGenerate({
219
- text: 'Hello, welcome to Deepgram!',
220
- speed: 1.5,
221
- });
222
-
223
- expect(result.warnings).toMatchInlineSnapshot(`
224
- [
225
- {
226
- "details": "Deepgram TTS REST API does not support speed adjustment. Speed parameter was ignored.",
227
- "feature": "speed",
228
- "type": "unsupported",
229
- },
230
- ]
231
- `);
232
- });
233
-
234
- it('should warn about unsupported language parameter', async () => {
235
- prepareAudioResponse();
236
-
237
- const result = await model.doGenerate({
238
- text: 'Hello, welcome to Deepgram!',
239
- language: 'en',
240
- });
241
-
242
- expect(result.warnings).toMatchInlineSnapshot(`
243
- [
244
- {
245
- "details": "Deepgram TTS models are language-specific via the model ID. Language parameter "en" was ignored. Select a model with the appropriate language suffix (e.g., "-en" for English).",
246
- "feature": "language",
247
- "type": "unsupported",
248
- },
249
- ]
250
- `);
251
- });
252
-
253
- it('should warn about unsupported instructions parameter', async () => {
254
- prepareAudioResponse();
255
-
256
- const result = await model.doGenerate({
257
- text: 'Hello, welcome to Deepgram!',
258
- instructions: 'Speak slowly',
259
- });
260
-
261
- expect(result.warnings).toMatchInlineSnapshot(`
262
- [
263
- {
264
- "details": "Deepgram TTS REST API does not support instructions. Instructions parameter was ignored.",
265
- "feature": "instructions",
266
- "type": "unsupported",
267
- },
268
- ]
269
- `);
270
- });
271
-
272
- it('should include request body in response', async () => {
273
- prepareAudioResponse();
274
-
275
- const result = await model.doGenerate({
276
- text: 'Hello, welcome to Deepgram!',
277
- });
278
-
279
- expect(result.request?.body).toBe(
280
- JSON.stringify({ text: 'Hello, welcome to Deepgram!' }),
281
- );
282
- });
283
-
284
- it('should clean up incompatible parameters when encoding changes via providerOptions', async () => {
285
- prepareAudioResponse();
286
-
287
- // Test case 1: outputFormat sets sample_rate, encoding changed to mp3 (fixed sample rate)
288
- await model.doGenerate({
289
- text: 'Hello, welcome to Deepgram!',
290
- outputFormat: 'linear16_16000', // Sets: encoding=linear16, sample_rate=16000
291
- providerOptions: {
292
- deepgram: {
293
- encoding: 'mp3', // Changes encoding to mp3
294
- },
295
- },
296
- });
297
-
298
- const url1 = new URL(server.calls[0].requestUrl);
299
- expect(url1.searchParams.get('encoding')).toBe('mp3');
300
- expect(url1.searchParams.get('sample_rate')).toBeNull(); // Should be removed
301
-
302
- // Test case 2: outputFormat sets container for linear16, encoding changed to opus
303
- await model.doGenerate({
304
- text: 'Hello, welcome to Deepgram!',
305
- outputFormat: 'linear16_16000', // Sets: encoding=linear16, container=wav
306
- providerOptions: {
307
- deepgram: {
308
- encoding: 'opus', // Changes encoding to opus
309
- },
310
- },
311
- });
312
-
313
- const url2 = new URL(server.calls[1].requestUrl);
314
- expect(url2.searchParams.get('encoding')).toBe('opus');
315
- expect(url2.searchParams.get('container')).toBe('ogg'); // Should be ogg, not wav
316
- expect(url2.searchParams.get('sample_rate')).toBeNull(); // Should be removed
317
-
318
- // Test case 3: outputFormat sets bit_rate, encoding changed to linear16 (no bitrate support)
319
- await model.doGenerate({
320
- text: 'Hello, welcome to Deepgram!',
321
- outputFormat: 'mp3', // Sets: encoding=mp3
322
- providerOptions: {
323
- deepgram: {
324
- encoding: 'linear16', // Changes encoding to linear16
325
- bitRate: 48000, // Try to set bitrate
326
- },
327
- },
328
- });
329
-
330
- const url3 = new URL(server.calls[2].requestUrl);
331
- expect(url3.searchParams.get('encoding')).toBe('linear16');
332
- expect(url3.searchParams.get('bit_rate')).toBeNull(); // Should be removed
333
- });
334
-
335
- it('should clean up incompatible parameters when container changes encoding implicitly', async () => {
336
- prepareAudioResponse();
337
-
338
- // Test case: outputFormat sets sample_rate, container changes encoding to opus
339
- await model.doGenerate({
340
- text: 'Hello, welcome to Deepgram!',
341
- outputFormat: 'linear16_16000', // Sets: encoding=linear16, sample_rate=16000
342
- providerOptions: {
343
- deepgram: {
344
- container: 'ogg', // Changes encoding to opus implicitly
345
- },
346
- },
347
- });
348
-
349
- const callIndex = server.calls.length - 1;
350
- const url = new URL(server.calls[callIndex].requestUrl);
351
- expect(url.searchParams.get('encoding')).toBe('opus');
352
- expect(url.searchParams.get('container')).toBe('ogg');
353
- expect(url.searchParams.get('sample_rate')).toBeNull(); // Should be removed (opus has fixed sample rate)
354
- });
355
- });
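
The removed speech-model tests above describe how incompatible query parameters are dropped when the encoding changes. The sketch below reproduces only the cases those tests assert. It is not the package's implementation: `AudioParams` and `resolveAudioParams` are hypothetical names, and encodings the tests do not mention are left untouched.

```ts
// Query parameters that the removed tests inspect.
interface AudioParams {
  encoding?: string;
  container?: string;
  sample_rate?: number;
  bit_rate?: number;
}

// base: values derived from outputFormat; overrides: values from providerOptions.
function resolveAudioParams(base: AudioParams, overrides: AudioParams): AudioParams {
  const merged: AudioParams = { ...base, ...overrides };

  // An 'ogg' container without an explicit encoding implies opus.
  if (overrides.encoding === undefined && overrides.container === 'ogg') {
    merged.encoding = 'opus';
  }

  switch (merged.encoding) {
    case 'mp3':
      // Per the tests: mp3 takes no container and has a fixed sample rate.
      delete merged.container;
      delete merged.sample_rate;
      break;
    case 'opus':
      // Per the tests: opus uses the ogg container and a fixed sample rate.
      merged.container = 'ogg';
      delete merged.sample_rate;
      break;
    case 'linear16':
      // Per the tests: linear16 has no bit rate support.
      delete merged.bit_rate;
      break;
  }
  return merged;
}

// Mirrors test case 2: linear16 defaults overridden with opus.
const resolved = resolveAudioParams(
  { encoding: 'linear16', container: 'wav', sample_rate: 16000 },
  { encoding: 'opus' },
);
```

Keeping the cleanup in one function after the merge means the explicit `providerOptions` always win, and any stale values from `outputFormat` are removed in a single pass.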