workers-ai-provider 3.1.0 → 3.1.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,6 +1,6 @@
  # workers-ai-provider
 
- [Workers AI](https://developers.cloudflare.com/workers-ai/) provider for the [AI SDK](https://sdk.vercel.ai/). Use Cloudflare's models for chat, tool calling, structured output, embeddings, image generation, and [AI Search](https://developers.cloudflare.com/ai-search/).
+ [Workers AI](https://developers.cloudflare.com/workers-ai/) provider for the [AI SDK](https://sdk.vercel.ai/). Run Cloudflare's models for chat, embeddings, image generation, transcription, text-to-speech, reranking, and [AI Search](https://developers.cloudflare.com/ai-search/) — all from a single provider.
 
  ## Quick Start
 
@@ -71,13 +71,19 @@ Browse the full catalog at [developers.cloudflare.com/workers-ai/models](https:/
 
  Some good defaults:
 
- | Task       | Model                                      | Notes                       |
- | ---------- | ------------------------------------------ | --------------------------- |
- | Chat       | `@cf/meta/llama-4-scout-17b-16e-instruct`  | Fast, strong tool calling   |
- | Chat       | `@cf/meta/llama-3.3-70b-instruct-fp8-fast` | Largest Llama, best quality |
- | Reasoning  | `@cf/qwen/qwq-32b`                         | Emits `reasoning_content`   |
- | Embeddings | `@cf/baai/bge-base-en-v1.5`                | 768-dim, English            |
- | Images     | `@cf/black-forest-labs/flux-1-schnell`     | Fast image generation       |
+ | Task           | Model                                      | Notes                            |
+ | -------------- | ------------------------------------------ | -------------------------------- |
+ | Chat           | `@cf/meta/llama-4-scout-17b-16e-instruct`  | Fast, strong tool calling        |
+ | Chat           | `@cf/meta/llama-3.3-70b-instruct-fp8-fast` | Largest Llama, best quality      |
+ | Chat           | `@cf/openai/gpt-oss-120b`                  | Open-weights, strong reasoning   |
+ | Reasoning      | `@cf/qwen/qwq-32b`                         | Emits `reasoning_content`        |
+ | Embeddings     | `@cf/baai/bge-base-en-v1.5`                | 768-dim, English                 |
+ | Embeddings     | `@cf/google/embeddinggemma-300m`           | 100+ languages, by Google        |
+ | Images         | `@cf/black-forest-labs/flux-1-schnell`     | Fast image generation            |
+ | Transcription  | `@cf/openai/whisper-large-v3-turbo`        | Best accuracy, multilingual      |
+ | Transcription  | `@cf/deepgram/nova-3`                      | Fast, high accuracy              |
+ | Text-to-Speech | `@cf/deepgram/aura-2-en`                   | Context-aware, natural pacing    |
+ | Reranking      | `@cf/baai/bge-reranker-base`               | Fast document reranking          |
 
  ## Text Generation
 
@@ -169,6 +175,80 @@ const { images } = await generateImage({
  // images[0].uint8Array contains the PNG bytes
  ```
 
+ ## Transcription (Speech-to-Text)
+
+ Transcribe audio using Whisper or Deepgram Nova-3 models.
+
+ ```ts
+ import { transcribe } from "ai";
+ import { readFile } from "node:fs/promises";
+
+ const { text, segments } = await transcribe({
+   model: workersai.transcription("@cf/openai/whisper-large-v3-turbo"),
+   audio: await readFile("./audio.mp3"),
+   mediaType: "audio/mpeg",
+ });
+ ```
+
+ With language hints (Whisper only):
+
+ ```ts
+ const { text } = await transcribe({
+   model: workersai.transcription("@cf/openai/whisper-large-v3-turbo", {
+     language: "fr",
+   }),
+   audio: audioBuffer,
+   mediaType: "audio/wav",
+ });
+ ```
+
+ Deepgram Nova-3 is also supported and detects language automatically:
+
+ ```ts
+ const { text } = await transcribe({
+   model: workersai.transcription("@cf/deepgram/nova-3"),
+   audio: audioBuffer,
+   mediaType: "audio/wav",
+ });
+ ```
+
+ ## Text-to-Speech
+
+ Generate spoken audio from text using Deepgram Aura-2.
+
+ ```ts
+ import { speech } from "ai";
+
+ const { audio } = await speech({
+   model: workersai.speech("@cf/deepgram/aura-2-en"),
+   text: "Hello from Cloudflare Workers AI!",
+   voice: "asteria",
+ });
+
+ // audio is a Uint8Array of MP3 bytes
+ ```
+
+ ## Reranking
+
+ Reorder documents by relevance to a query — useful for RAG pipelines.
+
+ ```ts
+ import { rerank } from "ai";
+
+ const { results } = await rerank({
+   model: workersai.reranking("@cf/baai/bge-reranker-base"),
+   query: "What is Cloudflare Workers?",
+   documents: [
+     "Cloudflare Workers lets you run JavaScript at the edge.",
+     "A cookie is a small piece of data stored in the browser.",
+     "Workers AI runs inference on Cloudflare's global network.",
+   ],
+   topN: 2,
+ });
+
+ // results is sorted by relevance score
+ ```
+
  ## AI Search
 
  [AI Search](https://developers.cloudflare.com/ai-search/) is Cloudflare's managed RAG service. Connect your data and query it with natural language.
@@ -192,7 +272,7 @@ const { text } = await generateText({
  });
  ```
 
- Streaming works the same way -- use `streamText` instead of `generateText`.
+ Streaming works the same way: use `streamText` instead of `generateText`.
 
  > `createAutoRAG` still works but is deprecated. Use `createAISearch` instead.
 
@@ -207,18 +287,27 @@ Streaming works the same way -- use `streamText` instead of `generateText`.
  | `apiKey` | `string` | Cloudflare API token. Required with `accountId`. |
  | `gateway` | `GatewayOptions` | Optional [AI Gateway](https://developers.cloudflare.com/ai-gateway/) config. |
 
- Returns a provider with model factories for each AI SDK function:
+ Returns a provider with model factories:
 
  ```ts
- // For generateText / streamText:
+ // Chat — for generateText / streamText
  workersai(modelId);
  workersai.chat(modelId);
 
- // For embedMany / embed:
+ // Embeddings — for embedMany / embed
  workersai.textEmbedding(modelId);
 
- // For generateImage:
+ // Images — for generateImage
  workersai.image(modelId);
+
+ // Transcription — for transcribe
+ workersai.transcription(modelId, settings?);
+
+ // Text-to-Speech — for speech
+ workersai.speech(modelId);
+
+ // Reranking — for rerank
+ workersai.reranking(modelId);
  ```
 
  ### `createAISearch(options)`
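The factory list above maps each model type to its AI SDK entry point; only `transcription` takes an optional settings object. As a dependency-free sketch of that call surface (the interface and stub implementation below are hypothetical stand-ins for illustration — the real provider returns AI SDK model objects, not plain records):

```typescript
// Hypothetical stand-ins mirroring the factory signatures documented above.
type TranscriptionSettings = { language?: string };

interface WorkersAIProviderShape {
  (modelId: string): { modelId: string; kind: string };
  chat(modelId: string): { modelId: string; kind: string };
  textEmbedding(modelId: string): { modelId: string; kind: string };
  image(modelId: string): { modelId: string; kind: string };
  transcription(
    modelId: string,
    settings?: TranscriptionSettings,
  ): { modelId: string; kind: string };
  speech(modelId: string): { modelId: string; kind: string };
  reranking(modelId: string): { modelId: string; kind: string };
}

// Toy implementation that just records which factory produced the model.
const make = (kind: string) => (modelId: string) => ({ modelId, kind });

const provider: WorkersAIProviderShape = Object.assign(make("chat"), {
  chat: make("chat"),
  textEmbedding: make("embedding"),
  image: make("image"),
  transcription: (modelId: string, settings: TranscriptionSettings = {}) => ({
    modelId,
    kind: "transcription",
    language: settings.language,
  }),
  speech: make("speech"),
  reranking: make("reranking"),
});

console.log(provider("@cf/meta/llama-4-scout-17b-16e-instruct").kind); // → "chat"
console.log(
  provider.transcription("@cf/openai/whisper-large-v3-turbo", { language: "fr" }),
);
```

The `Object.assign` trick is how a value can satisfy both the call signature and the named factories on one interface.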
package/dist/index.d.ts CHANGED
@@ -14,31 +14,39 @@ type AISearchChatSettings = {
 
  /**
   * The names of the BaseAiTextGeneration models.
+  *
+  * Accepts any string at runtime, but provides autocomplete for known models.
   */
- type TextGenerationModels = Exclude<value2key<AiModels, BaseAiTextGeneration>, value2key<AiModels, BaseAiTextToImage>>;
- type ImageGenerationModels = value2key<AiModels, BaseAiTextToImage>;
+ type TextGenerationModels = Exclude<value2key<AiModels, BaseAiTextGeneration>, value2key<AiModels, BaseAiTextToImage>> | (string & {});
+ type ImageGenerationModels = value2key<AiModels, BaseAiTextToImage> | (string & {});
  /**
   * The names of the BaseAiTextToEmbeddings models.
+  *
+  * Accepts any string at runtime, but provides autocomplete for known models.
   */
- type EmbeddingModels = value2key<AiModels, BaseAiTextEmbeddings>;
+ type EmbeddingModels = value2key<AiModels, BaseAiTextEmbeddings> | (string & {});
  /**
   * Workers AI models that support speech-to-text transcription.
   *
   * Includes Whisper variants from `@cloudflare/workers-types` plus
   * Deepgram partner models that may not be in the typed interface yet.
+  * Accepts any string at runtime, but provides autocomplete for known models.
   */
- type TranscriptionModels = value2key<AiModels, BaseAiAutomaticSpeechRecognition> | "@cf/deepgram/nova-3";
+ type TranscriptionModels = value2key<AiModels, BaseAiAutomaticSpeechRecognition> | "@cf/deepgram/nova-3" | (string & {});
  /**
   * Workers AI models that support text-to-speech.
   *
   * Includes models from `@cloudflare/workers-types` plus Deepgram partner
   * models that may not be in the typed interface yet.
+  * Accepts any string at runtime, but provides autocomplete for known models.
   */
- type SpeechModels = value2key<AiModels, BaseAiTextToSpeech> | "@cf/deepgram/aura-1";
+ type SpeechModels = value2key<AiModels, BaseAiTextToSpeech> | "@cf/deepgram/aura-1" | "@cf/deepgram/aura-2-en" | "@cf/deepgram/aura-2-es" | (string & {});
  /**
   * Workers AI models that support reranking.
+  *
+  * Accepts any string at runtime, but provides autocomplete for known models.
   */
- type RerankingModels = "@cf/baai/bge-reranker-base" | "@cf/baai/bge-reranker-v2-m3";
+ type RerankingModels = "@cf/baai/bge-reranker-base" | "@cf/baai/bge-reranker-v2-m3" | (string & {});
  type value2key<T, V> = {
    [K in keyof T]: T[K] extends V ? K : never;
  }[keyof T];
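The two type-level devices in this hunk can be seen in isolation: `value2key` inverts a record (value shape → key union), and `| (string & {})` widens a literal union to accept any string without destroying editor autocomplete (a plain `| string` would collapse the whole union to `string`). A minimal, package-free sketch, using an invented toy catalog in place of `AiModels`:

```typescript
// value2key as defined in the diff: for each key K, keep K when T[K]
// extends V, else never; indexing by keyof T collects the survivors.
type value2key<T, V> = {
  [K in keyof T]: T[K] extends V ? K : never;
}[keyof T];

// A toy model catalog standing in for AiModels (names are illustrative).
interface Catalog {
  "@cf/example/chat-model": { kind: "text" };
  "@cf/example/image-model": { kind: "image" };
  "@cf/example/other-chat": { kind: "text" };
}

// Resolves to "@cf/example/chat-model" | "@cf/example/other-chat".
type ChatKeys = value2key<Catalog, { kind: "text" }>;

// `(string & {})` keeps the literals visible to autocomplete while
// still accepting arbitrary strings at the type level.
type ChatModels = ChatKeys | (string & {});

const known: ChatModels = "@cf/example/chat-model"; // autocompleted literal
const future: ChatModels = "@cf/some-vendor/not-released-yet"; // still allowed
console.log(known, future);
```

This is why every model-id type in the 3.1.2 typings gained `| (string & {})`: newly launched models type-check immediately, while known ids keep autocompleting.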