workers-ai-provider 3.0.4 → 3.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,154 +1,235 @@
1
- # ⛅️ ✨ workers-ai-provider ✨ ⛅️
1
+ # workers-ai-provider
2
2
 
3
- A custom provider that enables [Workers AI](https://ai.cloudflare.com/)'s models for the [Vercel AI SDK](https://sdk.vercel.ai/).
3
+ [Workers AI](https://developers.cloudflare.com/workers-ai/) provider for the [AI SDK](https://sdk.vercel.ai/). Use Cloudflare's models for chat, tool calling, structured output, embeddings, image generation, and [AI Search](https://developers.cloudflare.com/ai-search/).
4
4
 
5
- ## Install
5
+ ## Quick Start
6
+
7
+ ```jsonc
8
+ // wrangler.jsonc
9
+ {
10
+ "ai": { "binding": "AI" },
11
+ }
12
+ ```
13
+
14
+ ```ts
15
+ import { createWorkersAI } from "workers-ai-provider";
16
+ import { streamText } from "ai";
17
+
18
+ export default {
19
+ async fetch(req: Request, env: { AI: Ai }) {
20
+ const workersai = createWorkersAI({ binding: env.AI });
21
+
22
+ const result = streamText({
23
+ model: workersai("@cf/meta/llama-4-scout-17b-16e-instruct"),
24
+ messages: [{ role: "user", content: "Write a haiku about Cloudflare" }],
25
+ });
26
+
27
+ return result.toTextStreamResponse();
28
+ },
29
+ };
30
+ ```
6
31
 
7
32
  ```bash
8
- npm install workers-ai-provider
33
+ npm install workers-ai-provider ai
9
34
  ```
10
35
 
11
- ## Usage
36
+ ## Configuration
37
+
38
+ ### Workers binding (recommended)
12
39
 
13
- First, setup an AI binding in `wrangler.toml` in your Workers project:
40
+ Inside a Cloudflare Worker, pass the `env.AI` binding directly. No API keys needed.
14
41
 
15
- ```toml
16
- # ...
17
- [ai]
18
- binding = "AI"
19
- # ...
42
+ ```ts
43
+ const workersai = createWorkersAI({ binding: env.AI });
20
44
  ```
21
45
 
22
- ### Using Workers AI
46
+ ### REST API
23
47
 
24
- Then in your Worker, import the factory function and create a new AI provider:
48
+ Outside of Workers (Node.js, Bun, etc.), use your Cloudflare credentials:
49
+
50
+ ```ts
51
+ const workersai = createWorkersAI({
52
+ accountId: process.env.CLOUDFLARE_ACCOUNT_ID,
53
+ apiKey: process.env.CLOUDFLARE_API_TOKEN,
54
+ });
55
+ ```
56
+
57
+ ### AI Gateway
58
+
59
+ Route requests through [AI Gateway](https://developers.cloudflare.com/ai-gateway/) for caching, rate limiting, and observability:
60
+
61
+ ```ts
62
+ const workersai = createWorkersAI({
63
+ binding: env.AI,
64
+ gateway: { id: "my-gateway" },
65
+ });
66
+ ```
67
+
68
+ ## Models
69
+
70
+ Browse the full catalog at [developers.cloudflare.com/workers-ai/models](https://developers.cloudflare.com/workers-ai/models/).
71
+
72
+ Some good defaults:
73
+
74
+ | Task | Model | Notes |
75
+ | ---------- | ------------------------------------------ | --------------------------- |
76
+ | Chat | `@cf/meta/llama-4-scout-17b-16e-instruct` | Fast, strong tool calling |
77
+ | Chat | `@cf/meta/llama-3.3-70b-instruct-fp8-fast` | Largest Llama, best quality |
78
+ | Reasoning | `@cf/qwen/qwq-32b` | Emits `reasoning_content` |
79
+ | Embeddings | `@cf/baai/bge-base-en-v1.5` | 768-dim, English |
80
+ | Images | `@cf/black-forest-labs/flux-1-schnell` | Fast image generation |
81
+
82
+ ## Text Generation
83
+
84
+ ```ts
85
+ import { generateText } from "ai";
86
+
87
+ const { text } = await generateText({
88
+ model: workersai("@cf/meta/llama-3.3-70b-instruct-fp8-fast"),
89
+ prompt: "Explain Workers AI in one paragraph",
90
+ });
91
+ ```
92
+
93
+ Streaming:
25
94
 
26
95
  ```ts
27
- import { createWorkersAI } from "workers-ai-provider";
28
96
  import { streamText } from "ai";
29
97
 
30
- type Env = {
31
- AI: Ai;
32
- };
98
+ const result = streamText({
99
+ model: workersai("@cf/meta/llama-4-scout-17b-16e-instruct"),
100
+ messages: [{ role: "user", content: "Write a short story" }],
101
+ });
33
102
 
34
- export default {
35
- async fetch(req: Request, env: Env) {
36
- const workersai = createWorkersAI({ binding: env.AI });
37
- // Use the AI provider to interact with the Vercel AI SDK
38
- // Here, we generate a chat stream based on a prompt
39
- const text = await streamText({
40
- model: workersai("@cf/meta/llama-2-7b-chat-int8"),
41
- messages: [
42
- {
43
- role: "user",
44
- content: "Write an essay about hello world",
45
- },
46
- ],
47
- });
48
-
49
- return text.toTextStreamResponse({
50
- headers: {
51
- // add these headers to ensure that the
52
- // response is chunked and streamed
53
- "Content-Type": "text/x-unknown",
54
- "content-encoding": "identity",
55
- "transfer-encoding": "chunked",
56
- },
57
- });
58
- },
59
- };
103
+ for await (const chunk of result.textStream) {
104
+ process.stdout.write(chunk);
105
+ }
60
106
  ```
61
107
 
62
- You can also use your Cloudflare credentials to create the provider, for example if you want to use Cloudflare AI outside of the Worker environment. For example, here is how you can use Cloudflare AI in a Node script:
108
+ ## Tool Calling
63
109
 
64
- ```js
65
- const workersai = createWorkersAI({
66
- accountId: process.env.CLOUDFLARE_ACCOUNT_ID,
67
- apiKey: process.env.CLOUDFLARE_API_KEY
110
+ ```ts
111
+ import { generateText, stepCountIs } from "ai";
112
+ import { z } from "zod";
113
+
114
+ const { text } = await generateText({
115
+ model: workersai("@cf/meta/llama-4-scout-17b-16e-instruct"),
116
+ prompt: "What's the weather in London?",
117
+ tools: {
118
+ getWeather: {
119
+ description: "Get the current weather for a city",
120
+ inputSchema: z.object({ city: z.string() }),
121
+ execute: async ({ city }) => ({ city, temperature: 18, condition: "Cloudy" }),
122
+ },
123
+ },
124
+ stopWhen: stepCountIs(2),
68
125
  });
126
+ ```
127
+
128
+ ## Structured Output
69
129
 
70
- const text = await streamText({
71
- model: workersai("@cf/meta/llama-2-7b-chat-int8"),
72
- messages: [
73
- {
74
- role: "user",
75
- content: "Write an essay about hello world",
76
- },
77
- ],
130
+ ```ts
131
+ import { generateText, Output } from "ai";
132
+ import { z } from "zod";
133
+
134
+ const { output } = await generateText({
135
+ model: workersai("@cf/meta/llama-3.3-70b-instruct-fp8-fast"),
136
+ prompt: "Recipe for spaghetti bolognese",
137
+ output: Output.object({
138
+ schema: z.object({
139
+ name: z.string(),
140
+ ingredients: z.array(z.object({ name: z.string(), amount: z.string() })),
141
+ steps: z.array(z.string()),
142
+ }),
143
+ }),
78
144
  });
79
145
  ```
80
146
 
81
- ### Using generateText for Non-Streaming Responses
147
+ ## Embeddings
148
+
149
+ ```ts
150
+ import { embedMany } from "ai";
151
+
152
+ const { embeddings } = await embedMany({
153
+ model: workersai.textEmbedding("@cf/baai/bge-base-en-v1.5"),
154
+ values: ["sunny day at the beach", "rainy afternoon in the city"],
155
+ });
156
+ ```
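Embedding vectors are typically compared with cosine similarity (e.g. for semantic search or RAG ranking). A minimal sketch in plain TypeScript, with no Workers APIs assumed:

```typescript
// Cosine similarity between two embedding vectors:
// 1.0 means identical direction, 0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// With the `embeddings` array from embedMany above, rank the second
// value against the first:
// const score = cosineSimilarity(embeddings[0], embeddings[1]);
```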
82
157
 
83
- If you prefer to get a complete text response rather than a stream, you can use the `generateText` function:
158
+ ## Image Generation
84
159
 
85
160
  ```ts
86
- import { createWorkersAI } from "workers-ai-provider";
161
+ import { generateImage } from "ai";
162
+
163
+ const { images } = await generateImage({
164
+ model: workersai.image("@cf/black-forest-labs/flux-1-schnell"),
165
+ prompt: "A mountain landscape at sunset",
166
+ size: "1024x1024",
167
+ });
168
+
169
+ // images[0].uint8Array contains the PNG bytes
170
+ ```
171
+
172
+ ## AI Search
173
+
174
+ [AI Search](https://developers.cloudflare.com/ai-search/) is Cloudflare's managed RAG service. Connect your data and query it with natural language.
175
+
176
+ ```jsonc
177
+ // wrangler.jsonc
178
+ {
179
+ "ai_search": [{ "binding": "AI_SEARCH", "name": "my-search-index" }],
180
+ }
181
+ ```
182
+
183
+ ```ts
184
+ import { createAISearch } from "workers-ai-provider";
87
185
  import { generateText } from "ai";
88
186
 
89
- type Env = {
90
- AI: Ai;
91
- };
187
+ const aisearch = createAISearch({ binding: env.AI_SEARCH });
92
188
 
93
- export default {
94
- async fetch(req: Request, env: Env) {
95
- const workersai = createWorkersAI({ binding: env.AI });
96
-
97
- const { text } = await generateText({
98
- model: workersai("@cf/meta/llama-3.3-70b-instruct-fp8-fast"),
99
- prompt: "Write a short poem about clouds",
100
- });
101
-
102
- return new Response(JSON.stringify({ generatedText: text }), {
103
- headers: {
104
- "Content-Type": "application/json",
105
- },
106
- });
107
- },
108
- };
189
+ const { text } = await generateText({
190
+ model: aisearch(),
191
+ messages: [{ role: "user", content: "How do I set up AI Gateway?" }],
192
+ });
109
193
  ```
110
194
 
111
- ### Using AutoRAG
195
+ Streaming works the same way: use `streamText` instead of `generateText`.
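A minimal streaming sketch, assuming the `AI_SEARCH` binding from the wrangler config above:

```typescript
import { createAISearch } from "workers-ai-provider";
import { streamText } from "ai";

export default {
  async fetch(req: Request, env: { AI_SEARCH: AutoRAG }) {
    const aisearch = createAISearch({ binding: env.AI_SEARCH });

    // Same call shape as generateText, but streamed back to the client.
    const result = streamText({
      model: aisearch(),
      messages: [{ role: "user", content: "How do I set up AI Gateway?" }],
    });

    return result.toTextStreamResponse();
  },
};
```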
196
+
197
+ > `createAutoRAG` still works but is deprecated. Use `createAISearch` instead.
198
+
199
+ ## API Reference
112
200
 
113
- The provider now supports [Cloudflare's AutoRAG](https://developers.cloudflare.com/autorag/), allowing you to prompt your AutoRAG models directly from the Vercel AI SDK. Here's how to use it in your Worker:
201
+ ### `createWorkersAI(options)`
202
+
203
+ | Option | Type | Description |
204
+ | ----------- | ---------------- | ---------------------------------------------------------------------------- |
205
+ | `binding` | `Ai` | Workers AI binding (`env.AI`). Use this OR credentials. |
206
+ | `accountId` | `string` | Cloudflare account ID. Required with `apiKey`. |
207
+ | `apiKey` | `string` | Cloudflare API token. Required with `accountId`. |
208
+ | `gateway` | `GatewayOptions` | Optional [AI Gateway](https://developers.cloudflare.com/ai-gateway/) config. |
209
+
210
+ Returns a provider with model factories for each AI SDK function:
114
211
 
115
212
  ```ts
116
- import { createAutoRAG } from "workers-ai-provider";
117
- import { streamText } from "ai";
213
+ // For generateText / streamText:
214
+ workersai(modelId);
215
+ workersai.chat(modelId);
118
216
 
119
- type Env = {
120
- AI: Ai;
121
- };
217
+ // For embedMany / embed:
218
+ workersai.textEmbedding(modelId);
122
219
 
123
- export default {
124
- async fetch(req: Request, env: Env) {
125
- const autorag = createAutoRAG({ binding: env.AI.autorag('my-rag-name') });
126
-
127
- const text = await streamText({
128
- model: autorag("@cf/meta/llama-3.3-70b-instruct-fp8-fast"),
129
- messages: [
130
- {
131
- role: "user",
132
- content: "How to setup AI Gateway?",
133
- },
134
- ],
135
- });
136
-
137
- return text.toTextStreamResponse({
138
- headers: {
139
- // add these headers to ensure that the
140
- // response is chunked and streamed
141
- "Content-Type": "text/x-unknown",
142
- "content-encoding": "identity",
143
- "transfer-encoding": "chunked",
144
- },
145
- });
146
- },
147
- };
220
+ // For generateImage:
221
+ workersai.image(modelId);
148
222
  ```
149
223
 
150
- For more info, refer to the documentation of the [Vercel AI SDK](https://sdk.vercel.ai/).
224
+ ### `createAISearch(options)`
225
+
226
+ | Option | Type | Description |
227
+ | --------- | --------- | ------------------------------------ |
228
+ | `binding` | `AutoRAG` | AI Search binding (`env.AI_SEARCH`). |
151
229
 
152
- ### Credits
230
+ Returns a callable provider:
153
231
 
154
- Based on work by [Dhravya Shah](https://twitter.com/DhravyaShah) and the Workers AI team at Cloudflare.
232
+ ```ts
233
+ aisearch(); // AI Search model (shorthand)
234
+ aisearch.chat(); // AI Search model
235
+ ```