ai-retry 1.10.0 → 1.11.1

Files changed (45)
  1. package/README.md +457 -826
  2. package/dist/{retryables-M5l_6w9k.mjs → conditions-BGoANmfr.mjs} +5 -5
  3. package/dist/{retryables-CPAbu_M3.mjs → conditions-CyJOeRZK.mjs} +4 -4
  4. package/dist/create-retryable-model-BIMStLIF.mjs +676 -0
  5. package/dist/create-retryable-model-CLCFZANp.mjs +244 -0
  6. package/dist/create-retryable-model-DEQ5jciq.mjs +247 -0
  7. package/dist/embedding-model/conditions/index.d.mts +14 -0
  8. package/dist/embedding-model/conditions/index.mjs +7 -0
  9. package/dist/embedding-model/index.d.mts +14 -0
  10. package/dist/embedding-model/index.mjs +6 -0
  11. package/dist/{guards-D8UJtxDK.mjs → guards-DtZgDqE3.mjs} +6 -1
  12. package/dist/image-model/conditions/index.d.mts +4 -0
  13. package/dist/image-model/conditions/index.mjs +4 -0
  14. package/dist/image-model/index.d.mts +14 -0
  15. package/dist/image-model/index.mjs +6 -0
  16. package/dist/{index-DaJrd4dN.d.mts → index-BkvvEDSr.d.mts} +6 -4
  17. package/dist/index-D3t1Xo_U.d.mts +28 -0
  18. package/dist/index.d.mts +34 -7
  19. package/dist/index.mjs +43 -2
  20. package/dist/language-model/conditions/index.d.mts +4 -0
  21. package/dist/language-model/conditions/index.mjs +4 -0
  22. package/dist/language-model/index.d.mts +14 -0
  23. package/dist/language-model/index.mjs +6 -0
  24. package/dist/{error-CaTT-xX8.mjs → not-C9pUKPO7.mjs} +69 -38
  25. package/dist/{error-B-rjhfG_.d.mts → or-CFcJxcaL.d.mts} +36 -27
  26. package/dist/retryables/index.d.mts +54 -18
  27. package/dist/retryables/index.mjs +50 -14
  28. package/dist/telemetry-CJFJzjTr.mjs +442 -0
  29. package/dist/{types-Dik-mH20.d.mts → types-B8qg3Yzx.d.mts} +23 -10
  30. package/package.json +8 -7
  31. package/dist/create-retryable-model-D36IQyOQ.mjs +0 -1564
  32. package/dist/experimental/embedding-model/index.d.mts +0 -8
  33. package/dist/experimental/embedding-model/index.mjs +0 -19
  34. package/dist/experimental/embedding-model/retryables/index.d.mts +0 -20
  35. package/dist/experimental/embedding-model/retryables/index.mjs +0 -7
  36. package/dist/experimental/image-model/index.d.mts +0 -8
  37. package/dist/experimental/image-model/index.mjs +0 -19
  38. package/dist/experimental/image-model/retryables/index.d.mts +0 -4
  39. package/dist/experimental/image-model/retryables/index.mjs +0 -4
  40. package/dist/experimental/language-model/index.d.mts +0 -11
  41. package/dist/experimental/language-model/index.mjs +0 -19
  42. package/dist/experimental/language-model/retryables/index.d.mts +0 -4
  43. package/dist/experimental/language-model/retryables/index.mjs +0 -4
  44. package/dist/index-ewZ5T6B2.d.mts +0 -34
  45. package/dist/{parse-retry-headers-CRxgluhe.mjs → parse-retry-headers-RPSiSNjf.mjs} +0 -0
package/README.md CHANGED
@@ -11,108 +11,108 @@
11
11
 
12
12
  Automatically handle API failures, content filtering, timeouts and other errors by switching between different AI models and providers.
13
13
 
14
- `ai-retry` wraps the provided base model with a set of retry conditions (retryables). When a request fails with an error or the response is not satisfying, it iterates through the given retryables to find a suitable fallback model. It automatically tracks which models have been tried and how many attempts have been made to prevent infinite loops.
14
+ `ai-retry` wraps a base model with a list of typed retry **conditions**. When a request fails with an error, or the response is not satisfying, it walks the conditions top-down to find a suitable fallback. It tracks which models have been tried and how many attempts have been made to prevent infinite loops.
15
15
 
16
- It supports two types of retries:
16
+ Two retry shapes are supported:
17
17
 
18
- - Error-based retries: when the model throws an error (e.g. timeouts, API errors, etc.)
19
- - Result-based retries: when the model returns a successful response that needs retrying (e.g. content filtering, etc.)
18
+ - **Error-based**: the model throws (timeouts, rate limits, API errors).
19
+ - **Result-based**: the model returns a successful response that still needs retrying (content filtering, schema mismatch, etc.).
20
20
 
21
21
  ### Installation
22
22
 
23
- This library supports both AI SDK v5 and v6. The main branch reflects the latest stable version for AI SDK v6. See the [v0 branch](https://github.com/zirkelc/ai-retry/tree/v0) for the AI SDK v5 documentation.
24
-
25
- > [!WARNING]
23
+ > [!NOTE]
26
24
  > Version compatibility:
27
25
  >
28
- > - Use `ai-retry` version 0.x for AI SDK v5.
29
- > - Use `ai-retry` version 1.x for AI SDK v6.
26
+ > - `ai-retry@0.x`: AI SDK v5
27
+ > - `ai-retry@1.x`: AI SDK v6
28
+ > - `ai-retry@beta`: AI SDK v7 (beta, see the [`ai-sdk-v7` branch](https://github.com/zirkelc/ai-retry/tree/ai-sdk-v7))
30
29
 
31
30
  ```bash
32
- # AI SDK v5
33
- npm install ai-retry@0
31
+ npm install ai-retry
32
+ ```
34
33
 
35
- # AI SDK v6
36
- npm install ai-retry@1
34
+ A beta release for AI SDK v7 is available on the [`ai-sdk-v7` branch](https://github.com/zirkelc/ai-retry/tree/ai-sdk-v7). Install it with the `beta` tag:
35
+
36
+ ```bash
37
+ npm install ai-retry@beta
37
38
  ```
38
39
 
39
40
  ### Usage
40
41
 
41
- Create a retryable model by providing a base model and a list of retryables or fallback models.
42
- When an error occurs, it will evaluate each retryable in order and use the first one that indicates a retry should be attempted with a different model.
43
-
44
42
  > [!NOTE]
45
- > `ai-retry` supports language models, embedding models, and image models.
43
+ > **The condition API is the recommended way to configure retries.** Existing code keeps working:
44
+ >
45
+ > - The root `createRetryable` export and the function-style retryables (`contentFilterTriggered`, `requestTimeout`, …) are **deprecated but still functional**. Prefer `createRetryableModel` from `ai-retry/<family>-model` — it is typed for that family and resolves gateway strings for it.
46
+ > - The previously experimental `ai-retry/experimental/*` import paths were removed; the same API now ships at `ai-retry/<family>-model`.
47
+ >
48
+ > See the [migration guide](./MIGRATION.md) to move existing code to the condition API.
49
+
50
+ Create a retryable model from a base model and a list of conditions, each paired with the action to take when it matches.
46
51
 
47
52
  ```typescript
53
+ import { anthropic } from '@ai-sdk/anthropic';
48
54
  import { openai } from '@ai-sdk/openai';
49
- import { generateText, streamText } from 'ai';
50
- import { createRetryable } from 'ai-retry';
55
+ import { generateText } from 'ai';
56
+ import {
57
+ createRetryableModel,
58
+ error,
59
+ finishReason,
60
+ httpStatus,
61
+ } from 'ai-retry/language-model';
51
62
 
52
- // Create a retryable model
53
- const retryableModel = createRetryable({
54
- // Base model
55
- model: openai('gpt-4-mini'),
63
+ const retryableModel = createRetryableModel({
64
+ model: openai('gpt-4o'),
56
65
  retries: [
57
- // Retry strategies and fallbacks...
66
+ // Fall back to a different model on HTTP 529 or any "overloaded" message
67
+ httpStatus(529, 'overloaded').switch({
68
+ model: anthropic('claude-sonnet-4-0'),
69
+ }),
70
+
71
+ // Fall back when the response was content-filtered
72
+ finishReason('content-filter').switch({ model: openai('gpt-4o-mini') }),
73
+
74
+ // Retry the same model with exponential backoff on retryable errors
75
+ error.isRetryable(true).retry({ delay: 1_000, backoffFactor: 2 }),
58
76
  ],
59
77
  });
60
78
 
61
- // Use like any other AI SDK model
62
79
  const result = await generateText({
63
80
  model: retryableModel,
64
81
  prompt: 'Hello world!',
65
82
  });
66
83
 
67
84
  console.log(result.text);
68
-
69
- // Or with streaming
70
- const result = streamText({
71
- model: retryableModel,
72
- prompt: 'Write a story about a robot...',
73
- });
74
-
75
- for await (const chunk of result.textStream) {
76
- console.log(chunk.text);
77
- }
78
85
  ```
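
The `delay` and `backoffFactor` options in the example above suggest a geometric retry schedule. The following is a minimal sketch of that idea, assuming the wait before the n-th retry is `delay * backoffFactor ** (n - 1)`. The helper `backoffDelays` is hypothetical and not part of `ai-retry`; the library may compute delays differently, for instance by honoring `retry-after` headers.

```typescript
// Hypothetical helper: illustrates exponential backoff, not ai-retry's internals.
// Assumption: the first retry waits `delay` ms and each later retry multiplies by `backoffFactor`.
function backoffDelays(
  delay: number,
  backoffFactor: number,
  attempts: number,
): number[] {
  return Array.from({ length: attempts }, (_, i) => delay * backoffFactor ** i);
}

// Schedule for { delay: 1_000, backoffFactor: 2 } over three retries.
const schedule = backoffDelays(1_000, 2, 3); // [1000, 2000, 4000]
```

Under this assumption the total wait grows quickly, so it is usually worth capping the number of attempts.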
79
86
 
80
- This also works with embedding models:
87
+ This also works with embedding models and image models, each through their own entry point:
81
88
 
82
89
  ```typescript
83
90
  import { openai } from '@ai-sdk/openai';
84
91
  import { embed } from 'ai';
85
- import { createRetryable } from 'ai-retry';
92
+ import { createRetryableModel, httpStatus } from 'ai-retry/embedding-model';
86
93
 
87
- // Create a retryable model
88
- const retryableModel = createRetryable({
89
- // Base model
94
+ const retryableModel = createRetryableModel({
90
95
  model: openai.textEmbedding('text-embedding-3-large'),
91
96
  retries: [
92
- // Retry strategies and fallbacks...
97
+ httpStatus(529).switch({
98
+ model: openai.textEmbedding('text-embedding-3-small'),
99
+ }),
93
100
  ],
94
101
  });
95
102
 
96
- // Use like any other AI SDK model
97
- const result = await embed({
98
- model: retryableModel,
99
- value: 'Hello world!',
100
- });
101
-
102
- console.log(result.embedding);
103
+ const result = await embed({ model: retryableModel, value: 'Hello world!' });
103
104
  ```
104
105
 
105
- This also works with image models:
106
-
107
106
  ```typescript
107
+ import { google } from '@ai-sdk/google';
108
108
  import { openai } from '@ai-sdk/openai';
109
109
  import { generateImage } from 'ai';
110
- import { createRetryable } from 'ai-retry';
110
+ import { createRetryableModel, noImage } from 'ai-retry/image-model';
111
111
 
112
- const retryableModel = createRetryable({
112
+ const retryableModel = createRetryableModel({
113
113
  model: openai.image('dall-e-3'),
114
114
  retries: [
115
- // Retry strategies and fallbacks...
115
+ noImage().switch({ model: google.image('gemini-3-pro-image-preview') }),
116
116
  ],
117
117
  });
118
118
 
@@ -120,805 +120,463 @@ const result = await generateImage({
120
120
  model: retryableModel,
121
121
  prompt: 'A sunset over mountains',
122
122
  });
123
+ ```
124
+
125
+ #### Entry points
126
+
127
+ Pick the entry point that matches the model you pass to `createRetryableModel`. Each module exposes the helpers that make sense for that model family, already typed for it, so no manual type annotations are needed.
128
+
129
+ | Entry point | For models passed to |
130
+ | -------------------------- | -------------------------------------------------------------- |
131
+ | `ai-retry/language-model` | `generateText`, `generateObject`, `streamText`, `streamObject` |
132
+ | `ai-retry/embedding-model` | `embed`, `embedMany` |
133
+ | `ai-retry/image-model` | `generateImage` |
134
+
135
+ ```typescript
136
+ import { createRetryableModel } from 'ai-retry/language-model';
137
+ import { createRetryableModel } from 'ai-retry/image-model';
138
+ import { createRetryableModel } from 'ai-retry/embedding-model';
139
+ ```
140
+
141
+ Each entry point re-exports `createRetryableModel` plus every condition for that family. The condition helpers can also be imported from the dedicated `/conditions` subpath:
123
142
 
124
- console.log(result.images);
143
+ ```typescript
144
+ import {
145
+ error,
146
+ httpStatus,
147
+ finishReason,
148
+ } from 'ai-retry/language-model/conditions';
149
+ // or
150
+ import * as conditions from 'ai-retry/language-model/conditions';
125
151
  ```
126
152
 
127
153
  #### Vercel AI Gateway
128
154
 
129
- You can use `ai-retry` with Vercel AI Gateway by providing the model as a string. Internally, the model will be resolved with the default `gateway` [provider instance](https://ai-sdk.dev/providers/ai-sdk-providers/ai-gateway#provider-instance) from AI SDK.
155
+ You can pass a model as a string and it will be resolved through the default `gateway` [provider instance](https://ai-sdk.dev/providers/ai-sdk-providers/ai-gateway#provider-instance) from the AI SDK. Each entry point resolves strings to its own model family, so the string is typed against that family's gateway model ids.
130
156
 
131
157
  ```typescript
132
158
  import { gateway } from 'ai';
133
- import { createRetryable } from 'ai-retry';
159
+ import { createRetryableModel } from 'ai-retry/language-model';
134
160
 
135
- const retryableModel = createRetryable({
161
+ const retryableModel = createRetryableModel({
136
162
  model: 'openai/gpt-5',
137
163
  retries: ['anthropic/claude-sonnet-4'],
138
164
  });
139
165
 
140
166
  // Is the same as:
141
- const retryableModel = createRetryable({
167
+ const retryableModel2 = createRetryableModel({
142
168
  model: gateway('openai/gpt-5'),
143
169
  retries: [gateway('anthropic/claude-sonnet-4')],
144
170
  });
145
171
  ```
146
172
 
147
- By default, the `gateway` provider resolves model strings as language models. If you want to use an embedding model, you need to use the `textEmbeddingModel` method.
173
+ Embedding and image entry points accept gateway strings too, resolved against their respective families:
148
174
 
149
175
  ```typescript
150
- import { gateway } from 'ai';
151
- import { createRetryable } from 'ai-retry';
176
+ import { createRetryableModel } from 'ai-retry/embedding-model';
152
177
 
153
- const retryableModel = createRetryable({
154
- model: gateway.textEmbeddingModel('openai/text-embedding-3-large'),
178
+ const retryableEmbedding = createRetryableModel({
179
+ model: 'openai/text-embedding-3-large',
180
+ retries: ['openai/text-embedding-3-small'],
155
181
  });
156
182
  ```
157
183
 
158
- ### Retryables
159
-
160
- The objects passed to the `retries` are called retryables and control the retry behavior. We can distinguish between two types of retryables:
161
-
162
- - **Static retryables** are simply models instances (language or embedding) that will always be used when an error occurs. They are also called fallback models.
163
- - **Dynamic retryables** are functions that receive the current attempt context (error/result and previous attempts) and decide whether to retry with a different model based on custom logic.
164
-
165
- You can think of the `retries` array as a big `if-else` block, where each dynamic retryable is an `if` branch that can match a certain error/result condition, and static retryables are the `else` branches that match all other conditions. The analogy is not perfect, because the order of retryables matters because `retries` are evaluated in order until one matches:
166
-
167
184
  ```typescript
168
- import { generateText, streamText } from 'ai';
169
- import { createRetryable } from 'ai-retry';
170
-
171
- const retryableModel = createRetryable({
172
- // Base model
173
- model: openai('gpt-4'),
174
- // Retryables are evaluated top-down in order
175
- retries: [
176
- // Dynamic retryables act like if-branches:
177
- // If error.code == 429 (too many requests) happens, retry with this model
178
- (context) => {
179
- return context.current.error.statusCode === 429
180
- ? { model: azure('gpt-4-mini') } // Retry
181
- : undefined; // Skip
182
- },
185
+ import { createRetryableModel } from 'ai-retry/image-model';
183
186
 
184
- // If error.message ~= "service overloaded", retry with this model
185
- (context) => {
186
- return context.current.error.message.includes('service overloaded')
187
- ? { model: azure('gpt-4-mini') } // Retry
188
- : undefined; // Skip
189
- },
190
-
191
- // Static retryables act like else branches:
192
- // Else, always fallback to this model
193
- anthropic('claude-3-haiku-20240307'),
194
- // Same as:
195
- // { model: anthropic('claude-3-haiku-20240307'), maxAttempts: 1 }
196
- ],
187
+ const retryableImage = createRetryableModel({
188
+ model: 'google/imagen-4.0-generate-001',
189
+ retries: ['google/imagen-4.0-fast-generate-001'],
197
190
  });
198
191
  ```
199
192
 
200
- In this example, if the base model fails with code 429 or a service overloaded error, it will retry with `gpt-4-mini` on Azure. In any other error case, it will fallback to `claude-3-haiku-20240307` on Anthropic. If the order would be reversed, the static retryable would catch all errors first, and the dynamic retryable would never be reached.
193
+ ### Retries
201
194
 
202
- #### Errors vs Results
195
+ The `retries` array holds the things `ai-retry` tries, in order, when a request fails or a result needs retrying. There are two kinds:
203
196
 
204
- Dynamic retryables can be further divided based on what triggers them:
197
+ - **Fallbacks** are model instances (or gateway strings). They always match and are used as plain fallbacks.
198
+ - **Conditions** are typed predicates produced by helpers like `error()` or `httpStatus()` and finalized with a `.switch()` or `.retry()` action. They only fire when their predicate matches.
205
199
 
206
- - **Error-based retryables** handle API errors where the request throws an error (e.g., timeouts, rate limits, service unavailable, etc.)
207
- - **Result-based retryables** handle successful responses that still need retrying (e.g., content filtering, guardrails, etc.)
208
-
209
- Both types of retryables have the same interface and receive the current attempt as context. You can use the `isErrorAttempt` and `isResultAttempt` type guards to check the type of the current attempt.
200
+ You can think of `retries` as a big `if-else` chain — each condition is an `if` branch matching some error/result, and each fallback is an `else` branch matching anything left over. Order matters: the array is evaluated top-down until one matches.
210
201
 
211
202
  ```typescript
212
- import { generateText } from 'ai';
213
- import { createRetryable, isErrorAttempt, isResultAttempt } from 'ai-retry';
214
- import type { Retryable } from 'ai-retry';
215
-
216
- // Error-based retryable: handles thrown errors (e.g., timeouts, rate limits)
217
- const errorBasedRetry: Retryable = (context) => {
218
- if (isErrorAttempt(context.current)) {
219
- const { error } = context.current;
220
- // The request threw an error - e.g., network timeout, 429 rate limit
221
- console.log('Request failed with error:', error);
222
- return { model: anthropic('claude-3-haiku-20240307') };
223
- }
224
- return undefined;
225
- };
226
-
227
- // Result-based retryable: handles successful responses that need retrying
228
- const resultBasedRetry: Retryable = (context) => {
229
- if (isResultAttempt(context.current)) {
230
- const { result } = context.current;
231
- // The request succeeded, but the response indicates a problem
232
- if (result.finishReason.unified === 'content-filter') {
233
- console.log('Content was filtered, trying different model');
234
- return { model: openai('gpt-4') };
235
- }
236
- }
237
- return undefined;
238
- };
203
+ import { anthropic } from '@ai-sdk/anthropic';
204
+ import { azure } from '@ai-sdk/azure';
205
+ import { openai } from '@ai-sdk/openai';
206
+ import {
207
+ createRetryableModel,
208
+ error,
209
+ httpStatus,
210
+ } from 'ai-retry/language-model';
239
211
 
240
- const retryableModel = createRetryable({
241
- model: azure('gpt-4-mini'),
212
+ const retryableModel = createRetryableModel({
213
+ model: openai('gpt-4'),
242
214
  retries: [
243
- // Error-based: catches thrown errors like timeouts, rate limits, etc.
244
- errorBasedRetry,
215
+ // Condition: match HTTP 429 (rate limit)
216
+ httpStatus(429).switch({ model: azure('gpt-4-mini') }),
245
217
 
246
- // Result-based: catches successful responses that need retrying
247
- resultBasedRetry,
218
+ // Condition: match "overloaded" in the error message
219
+ error.message('overloaded').switch({ model: azure('gpt-4-mini') }),
220
+
221
+ // Fallback: switch to Anthropic for anything else
222
+ anthropic('claude-3-haiku-20240307'),
223
+ // Same as:
224
+ // { model: anthropic('claude-3-haiku-20240307'), maxAttempts: 1 }
248
225
  ],
249
226
  });
250
227
  ```
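
The `if-else` analogy for `retries` can be modelled in a few lines. This is a simplified sketch, not the library's implementation: the types `Attempt`, `Condition`, and `Retry` and the function `pickFallback` are hypothetical, model names are plain string labels, and the tracking of previously tried models and attempt counts is left out.

```typescript
// Simplified, hypothetical model of top-down evaluation; not ai-retry's real types.
type Attempt = { statusCode?: number; message: string };

// A condition returns a fallback model label when it matches, otherwise undefined.
type Condition = (attempt: Attempt) => string | undefined;

// A plain string stands in for a fallback model, which always matches.
type Retry = Condition | string;

function pickFallback(retries: Retry[], attempt: Attempt): string | undefined {
  for (const retry of retries) {
    // Fallbacks match unconditionally; conditions only when their predicate holds.
    const model = typeof retry === 'string' ? retry : retry(attempt);
    if (model !== undefined) return model; // first match wins
  }
  return undefined; // nothing matched
}

const retries: Retry[] = [
  (a) => (a.statusCode === 429 ? 'azure/gpt-4-mini' : undefined),
  (a) =>
    a.message.toLowerCase().includes('overloaded')
      ? 'azure/gpt-4-mini'
      : undefined,
  'anthropic/claude-3-haiku-20240307',
];

const rateLimited = pickFallback(retries, {
  statusCode: 429,
  message: 'Too many requests',
});
const unknownError = pickFallback(retries, {
  statusCode: 500,
  message: 'Internal error',
});

// A fallback placed first shadows every condition after it.
const shadowed = pickFallback(
  ['anthropic/claude-3-haiku-20240307', ...retries],
  { statusCode: 429, message: 'Too many requests' },
);
```

Here `rateLimited` resolves to the Azure label, `unknownError` falls through to the Anthropic fallback, and `shadowed` shows why a catch-all fallback belongs at the end of the array.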
251
228
 
252
- Result-based retryables apply to language models for both generate (`generateText`, `generateObject`) and streaming (`streamText`, `streamObject`) calls. For streams, the retry decision happens when the upstream `finish` part arrives and only fires if no content has been emitted yet, so behavior like `finishReason: 'content-filter'` on an otherwise empty response can still trigger a fallback. Once any content chunk has been forwarded, the stream is committed and result-based retries are skipped.
253
-
254
229
  #### Fallbacks
255
230
 
256
- If you don't need precise error matching with custom logic and just want to fallback to different models on any error, you can simply provide a list of models.
257
-
258
- > [!NOTE]
259
- > Use the object syntax `{ model: openai('gpt-4') }` if you need to provide additional options like `maxAttempts`, `delay`, etc.
231
+ A fallback is a plain model instance (or gateway string) in `retries`. It always matches, so it acts as a catch-all: when no earlier condition fired, the next fallback model is tried. Each fallback is attempted once by default; use the object form to pass options like `maxAttempts`.
260
232
 
261
233
  ```typescript
234
+ import { anthropic } from '@ai-sdk/anthropic';
262
235
  import { openai } from '@ai-sdk/openai';
263
- import { generateText, streamText } from 'ai';
264
- import { createRetryable } from 'ai-retry';
236
+ import { createRetryableModel } from 'ai-retry/language-model';
265
237
 
266
- const retryableModel = createRetryable({
267
- // Base model
268
- model: openai('gpt-4-mini'),
269
- // List of fallback models
238
+ const retryableModel = createRetryableModel({
239
+ model: openai('gpt-4o'),
270
240
  retries: [
271
- openai('gpt-3.5-turbo'), // Fallback for first error
272
- // Same as:
273
- // { model: openai('gpt-3.5-turbo'), maxAttempts: 1 },
241
+ openai('gpt-4o-mini'), // first fallback
242
+ anthropic('claude-3-haiku-20240307'), // second fallback
274
243
 
275
- anthropic('claude-3-haiku-20240307'), // Fallback for second error
276
- // Same as:
277
- // { model: anthropic('claude-3-haiku-20240307'), maxAttempts: 1 },
244
+ // Object form to pass options:
245
+ { model: anthropic('claude-3-haiku-20240307'), maxAttempts: 2 },
278
246
  ],
279
247
  });
280
248
  ```
281
249
 
282
- In this example, if the base model fails, it will retry with `gpt-3.5-turbo`. If that also fails, it will retry with `claude-3-haiku-20240307`. If that fails again, the whole retry process stops and a `RetryError` is thrown.
283
-
284
- #### Custom
285
-
286
- If you need more control over when to retry and which model to use, you can create your own custom retryable. This function is called with a context object containing the current attempt (error or result) and all previous attempts and needs to return a retry model or `undefined` to skip to the next retryable. The object you return from the retryable function is the same as the one you provide in the `retries` array.
287
-
288
- > [!NOTE]
289
- > You can return additional options like `maxAttempts`, `delay`, etc. along with the model.
250
+ Fallbacks are tried in order. Once all of them are exhausted, a `RetryError` is thrown (see [All retries failed](#all-retries-failed)).
290
251
 
291
- > [!TIP]
292
- > If you'd like the same flexibility with a typed, composable condition system, see [Experimental: Composable Conditions](#experimental-composable-conditions).
252
+ #### Conditions
293
253
 
294
- ```typescript
295
- import { anthropic } from '@ai-sdk/anthropic';
296
- import { openai } from '@ai-sdk/openai';
297
- import { APICallError } from 'ai';
298
- import { createRetryable, isErrorAttempt } from 'ai-retry';
299
- import type { Retryable } from 'ai-retry';
300
-
301
- // Custom retryable that retries on rate limit errors (429)
302
- const rateLimitRetry: Retryable = (context) => {
303
- // Only handle error attempts
304
- if (isErrorAttempt(context.current)) {
305
- // Get the error from the current attempt
306
- const { error } = context.current;
307
-
308
- // Check for rate limit error
309
- if (APICallError.isInstance(error) && error.statusCode === 429) {
310
- // Retry with a different model
311
- return { model: anthropic('claude-3-haiku-20240307') };
312
- }
313
- }
254
+ A `Condition` is a typed predicate over a `RetryContext`. The library ships two **low-level** builders (`error()` and `result()`) plus **high-level** helpers built on top of them. Every condition is finalized with one of two terminal actions, `.switch()` or `.retry()`, which turn it into a retryable.
314
255
 
315
- // Skip to next retryable
316
- return undefined;
317
- };
318
-
319
- const retryableModel = createRetryable({
320
- // Base model
321
- model: openai('gpt-4-mini'),
322
- retries: [
323
- // Use custom rate limit retryable
324
- rateLimitRetry,
256
+ ##### Universal conditions
325
257
 
326
- // Other retryables...
327
- ],
328
- });
329
- ```
258
+ These are available from all three entry points (`language-model`, `embedding-model`, `image-model`).
330
259
 
331
- In this example, if the base model fails with a 429 error, it will retry with `claude-3-haiku-20240307`. For any other error, it will skip to the next retryable (if any) or throw the original error.
260
+ | Helper | Kind | Matches when |
261
+ | ------------------------------- | ---------- | ------------------------------------------------------------------------------ |
262
+ | `error(predicate)` | low-level | The current attempt failed and `predicate(err, ctx)` returns true |
263
+ | `error.isRetryable(flag)` | low-level | `APICallError.isRetryable === flag` (default `true`) |
264
+ | `error.statusCode(...patterns)` | low-level | Numbers match the status code exactly; regex matches the stringified code |
265
+ | `error.message(...patterns)` | low-level | Substring (case-insensitive) or regex match against the error message |
266
+ | `error.isTimeout()` | low-level | `Error.name === 'TimeoutError'` (`AbortSignal.timeout()` fired) |
267
+ | `error.isAbort()` | low-level | `Error.name === 'AbortError'` (manual `controller.abort()`) |
268
+ | `httpStatus(...patterns)` | high-level | Numbers match the status code; strings match the message; regex matches either |
269
+ | `timeout()` | high-level | Alias for `error.isTimeout()` |
270
+ | `aborted()` | high-level | Alias for `error.isAbort()` |
332
271
 
333
- #### All Retries Failed
272
+ ###### `error(predicate)`
334
273
 
335
- If all retry attempts failed, a `RetryError` is thrown containing all individual errors.
336
- If no retry was attempted (e.g. because all retryables returned `undefined`), the original error is thrown directly.
274
+ Takes any predicate over the failed attempt's error. Its namespace bundles the common matchers: `isRetryable` (defaults to `true`), `statusCode` (numbers or regex), `message` (case-insensitive substring or regex), and `isTimeout` / `isAbort` (match `AbortSignal.timeout()` firing vs a manual `controller.abort()`). The pattern matchers accept any number of patterns and match if any matches.
337
275
 
338
276
  ```typescript
339
- import { RetryError } from 'ai';
277
+ import { APICallError } from 'ai';
278
+ import { error } from 'ai-retry/language-model';
340
279
 
341
- const retryableModel = createRetryable({
342
- // Base model = first attempt
343
- model: azure('gpt-4-mini'),
344
- retries: [
345
- // Fallback model 1 = Second attempt
346
- openai('gpt-3.5-turbo'),
347
- // Fallback model 2 = Third attempt
348
- anthropic('claude-3-haiku-20240307'),
349
- ],
280
+ error((e) => APICallError.isInstance(e) && e.statusCode === 418).switch({
281
+ model: fallback,
350
282
  });
351
283
 
352
- try {
353
- const result = await generateText({
354
- model: retryableModel,
355
- prompt: 'Hello world!',
356
- });
357
- } catch (error) {
358
- // RetryError is an official AI SDK error
359
- if (error instanceof RetryError) {
360
- console.error('All retry attempts failed:', error.errors);
361
- } else {
362
- console.error('Request failed:', error);
363
- }
364
- }
365
- ```
366
-
367
- Errors are tracked per unique model (provider + modelId). That means on the first error, it will retry with `gpt-3.5-turbo`. If that also fails, it will retry with `claude-3-haiku-20240307`. If that fails again, the whole retry process stops and a `RetryError` is thrown.
368
-
369
- ### Built-in Retryables
370
-
371
- There are several built-in dynamic retryables available for common use cases:
284
+ error.isRetryable().switch({ model: fallback }); // defaults to true
285
+ error.isRetryable(false).switch({ model: fallback });
372
286
 
373
- > [!TIP]
374
- > You are missing a retryable for your use case? [Open an issue](https://github.com/zirkelc/ai-retry/issues/new) and let's discuss it!
287
+ error.statusCode(503, 529).switch({ model: fallback });
288
+ error.statusCode(/^5\d\d$/).switch({ model: fallback }); // any 5xx
375
289
 
376
- > [!NOTE]
377
- > Looking for a composable alternative? See [Experimental: Composable Conditions](#experimental-composable-conditions) for a `condition().action()` API that builds on small primitives.
378
-
379
- - [`contentFilterTriggered`](./src/retryables/content-filter-triggered.ts): Content filter was triggered based on the prompt or completion.
380
- - [`requestTimeout`](./src/retryables/request-timeout.ts): Request timeout occurred.
381
- - [`requestNotRetryable`](./src/retryables/request-not-retryable.ts): Request failed with a non-retryable error.
382
- - [`retryAfterDelay`](./src/retryables/retry-after-delay.ts): Retry with delay and exponential backoff and respect `retry-after` headers.
383
- - [`serviceOverloaded`](./src/retryables/service-overloaded.ts): Response with status code 529 (service overloaded).
384
- - [`serviceUnavailable`](./src/retryables/service-unavailable.ts): Response with status code 503 (service unavailable).
385
- - [`schemaMismatch`](./src/retryables/schema-mismatch.ts): Response JSON doesn't match the expected schema from structured output modes (`Output.object()`, `Output.array()`, `Output.choice()`).
386
- - [`noImageGenerated`](./src/retryables/no-image-generated.ts): Image generation failed with `NoImageGeneratedError`.
290
+ error.message('overloaded').switch({ model: fallback }); // substring
291
+ error.message(/rate.?limit/i).switch({ model: fallback }); // regex
387
292
 
388
- #### Content Filter
293
+ error.isTimeout().switch({ model: fallback }); // AbortSignal.timeout() fired
294
+ error.isAbort().switch({ model: fallback }); // manual controller.abort()
295
+ ```
389
296
 
390
- Automatically switch to a different model when content filtering blocks your request.
297
+ ###### `httpStatus(...patterns)`
391
298
 
392
- > [!NOTE]
393
- > For streaming requests this retryable can only fire if the content filter trips before any content has been emitted. Once a text chunk flows through, the stream is committed and the fallback is skipped.
299
+ Matches an `APICallError` by status code (numbers), message substring (strings), or either (regex). Mix any combination in one call.
394
300
 
395
301
  ```typescript
396
- import { contentFilterTriggered } from 'ai-retry/retryables';
302
+ import { httpStatus } from 'ai-retry/language-model';
397
303
 
398
- const retryableModel = createRetryable({
399
- model: azure('gpt-4-mini'),
400
- retries: [
401
- contentFilterTriggered(openai('gpt-4-mini')), // Try OpenAI if Azure filters
402
- ],
403
- });
304
+ httpStatus(429).switch({ model: fallback }); // status code
305
+ httpStatus(529, 'overloaded').switch({ model: fallback }); // status or message
306
+ httpStatus(/^5\d\d$/).switch({ model: fallback }); // any 5xx
404
307
  ```
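
The matching rules described for `httpStatus` (numbers against the status code, strings against the message, regexes against either) can be sketched as a standalone predicate. This is an illustration under stated assumptions, not the library's code: `ApiError`, `Pattern`, and `matchesHttpStatus` are hypothetical names, and string matching is assumed to be case-insensitive, as the table states for `error.message`.

```typescript
// Hypothetical stand-ins; the real helper works on the AI SDK's APICallError.
type Pattern = number | string | RegExp;
type ApiError = { statusCode?: number; message: string };

function matchesHttpStatus(err: ApiError, ...patterns: Pattern[]): boolean {
  // The condition matches if any single pattern matches.
  return patterns.some((p) => {
    if (typeof p === 'number') return err.statusCode === p;
    if (typeof p === 'string') {
      // Assumption: case-insensitive substring match on the message.
      return err.message.toLowerCase().includes(p.toLowerCase());
    }
    // Regex: try the stringified status code first, then the message.
    // Avoid the `g` flag here, since it makes `test` stateful.
    return (
      (err.statusCode !== undefined && p.test(String(err.statusCode))) ||
      p.test(err.message)
    );
  });
}

const overloaded: ApiError = { statusCode: 500, message: 'Service Overloaded' };
const notFound: ApiError = { statusCode: 404, message: 'Not found' };

const byMessage = matchesHttpStatus(overloaded, 529, 'overloaded');
const byRegex = matchesHttpStatus(overloaded, /^5\d\d$/);
const noMatch = matchesHttpStatus(notFound, 429, /^5\d\d$/);
```

In this sketch `byMessage` matches through the string pattern even though the status code is not 529, which mirrors the "status or message" comment in the example above.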
405
308
 
406
- #### Request Timeout
407
-
408
- Handle timeouts by switching to potentially faster models.
409
-
410
- > [!NOTE]
411
- > You need to use an `abortSignal` with a timeout on your request.
309
+ ###### `timeout()`
412
310
 
413
- When a request times out, the `requestTimeout` retryable will automatically create a fresh abort signal for the retry attempt. This prevents the retry from immediately failing due to the already-aborted signal from the original request. If you do not provide a `timeout` value, a default of 60 seconds is used for the retry attempt.
311
+ Alias for `error.isTimeout()`. Matches `AbortSignal.timeout()` firing (`Error.name === 'TimeoutError'`). Pass a fresh `timeout` to the action so the fallback gets its own deadline.
414
312
 
415
313
  ```typescript
416
- import { requestTimeout } from 'ai-retry/retryables';
417
-
418
- const retryableModel = createRetryable({
419
- model: azure('gpt-4'),
420
- retries: [
421
- // Defaults to 60 seconds timeout for the retry attempt
422
- requestTimeout(azure('gpt-4-mini')),
314
+ import { timeout } from 'ai-retry/language-model';
423
315
 
424
- // Or specify a custom timeout for the retry attempt
425
- requestTimeout(azure('gpt-4-mini'), { timeout: 30_000 }),
426
- ],
427
- });
428
-
429
- const result = await generateText({
430
- model: retryableModel,
431
- prompt: 'Write a vegetarian lasagna recipe for 4 people.',
432
- abortSignal: AbortSignal.timeout(60_000), // Original request timeout
433
- });
316
+ timeout().switch({ model: fallback, timeout: 30_000 });
434
317
  ```
435
318
 
436
- #### Service Overloaded
319
+ ###### `aborted()`
437
320
 
438
- Handle service overload errors (status code 529) by switching to a provider.
321
+ Alias for `error.isAbort()`. Matches a manual `controller.abort()` (`Error.name === 'AbortError'`).
439
322
 
440
323
  ```typescript
441
- import { serviceOverloaded } from 'ai-retry/retryables';
324
+ import { aborted } from 'ai-retry/language-model';
442
325
 
443
- const retryableModel = createRetryable({
444
- model: anthropic('claude-sonnet-4-0'),
445
- retries: [
446
- // Retry with delay and exponential backoff
447
- serviceOverloaded(anthropic('claude-sonnet-4-0'), {
448
- delay: 5_000,
449
- backoffFactor: 2,
450
- maxAttempts: 5,
451
- }),
452
- // Or switch to a different provider
453
- serviceOverloaded(openai('gpt-4')),
454
- ],
455
- });
456
-
457
- const result = streamText({
458
- model: retryableModel,
459
- prompt: 'Write a story about a robot...',
460
- });
326
+ aborted().switch({ model: fallback });
461
327
  ```
462
328
 
463
- #### Service Unavailable
329
+ Each high-level helper is a thin wrapper around the low-level ones. For example, `httpStatus(...)` composes `error.statusCode(...)` with `error.message(...)`, and `timeout()` / `aborted()` are aliases for `error.isTimeout()` / `error.isAbort()`.
464
330
 
465
- Handle service unavailable errors (status code 503) by switching to a different provider.
331
+ ##### Language model conditions
466
332
 
467
- ```typescript
468
- import { serviceUnavailable } from 'ai-retry/retryables';
333
+ Only available from `ai-retry/language-model`. Result-based conditions inspect a successful response (see [Streaming](#streaming) for how they behave on streams).
469
334
 
470
- const retryableModel = createRetryable({
471
- model: azure('gpt-4'),
472
- retries: [
473
- serviceUnavailable(openai('gpt-4')), // Switch to OpenAI if Azure is unavailable
474
- ],
475
- });
476
- ```
335
+ | Helper | Kind | Matches when |
336
+ | --------------------------------- | ---------- | --------------------------------------------------------------------- |
337
+ | `result(predicate)` | low-level | The current attempt succeeded and `predicate(res, ctx)` returns true |
338
+ | `result.finishReason(...reasons)` | low-level | The result's `finishReason.unified` matches one of the given values |
339
+ | `finishReason(...reasons)` | high-level | Same as `result.finishReason` (re-exported for convenience) |
340
+ | `schemaInvalid()` | high-level | The result text fails JSON-schema validation against `responseFormat` |
477
341
 
478
- #### No Image Generated
342
+ ###### `result(predicate)`
479
343
 
480
- Handle image generation failures by switching to a different model.
344
+ Takes any predicate over the successful result. `result.finishReason(...reasons)` and the re-exported `finishReason(...reasons)` match the result's unified finish reason against one or more values.
481
345
 
482
346
  ```typescript
483
- import { openai } from '@ai-sdk/openai';
484
- import { google } from '@ai-sdk/google';
485
- import { generateImage } from 'ai';
486
- import { createRetryable } from 'ai-retry';
487
- import { noImageGenerated } from 'ai-retry/retryables';
347
+ import { finishReason, result } from 'ai-retry/language-model';
488
348
 
489
- const retryableModel = createRetryable({
490
- model: openai.image('dall-e-3'),
491
- retries: [
492
- noImageGenerated(google.image('gemini-3-pro-image-preview')), // Switch to Gemini if DALL-E fails to generate an image
493
- ],
494
- });
349
+ result((res) => res.usage.outputTokens.total === 0).switch({ model: fallback });
495
350
 
496
- const result = await generateImage({
497
- model: retryableModel,
498
- prompt: 'A sunset over mountains',
499
- });
351
+ finishReason('content-filter').switch({ model: fallback });
352
+ finishReason('length', 'content-filter').retry({ maxAttempts: 3 });
500
353
  ```
501
354
 
502
- #### Request Not Retryable
355
+ ###### `schemaInvalid()`
503
356
 
504
- Handle cases where the base model fails with a non-retryable error.
505
-
506
- > [!NOTE]
507
- > You can check if an error is retryable with the `isRetryable` property on an [`APICallError`](https://ai-sdk.dev/docs/reference/ai-sdk-errors/ai-api-call-error#ai_apicallerror).
357
+ Matches when the result text fails JSON-schema validation against the call's `responseFormat` (set automatically by `Output.object()`).
508
358
 
509
359
  ```typescript
510
- import { requestNotRetryable } from 'ai-retry/retryables';
360
+ import { schemaInvalid } from 'ai-retry/language-model';
511
361
 
512
- const retryable = createRetryable({
513
- model: azure('gpt-4-mini'),
514
- retries: [
515
- requestNotRetryable(openai('gpt-4')), // Switch provider if error is not retryable
516
- ],
517
- });
362
+ schemaInvalid().switch({ model: fallback });
518
363
  ```
519
364
 
520
- #### Retry After Delay
365
+ ##### Image model conditions
521
366
 
522
- If an error is retryable, such as 429 (Too Many Requests) or 503 (Service Unavailable) errors, it will be retried after a delay.
523
- The delay and exponential backoff can be configured. If the response contains a [`retry-after`](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Retry-After) header, it will be prioritized over the configured delay.
367
+ Only available from `ai-retry/image-model`.
524
368
 
525
- Note that this retryable does not accept a model parameter, it will always retry the model from the latest failed attempt.
369
+ | Helper | Kind | Matches when |
370
+ | ----------- | ---------- | --------------------------------------------- |
371
+ | `noImage()` | high-level | The image model threw `NoImageGeneratedError` |
526
372
 
527
- ```typescript
528
- import { retryAfterDelay } from 'ai-retry/retryables';
373
+ ###### `noImage()`
529
374
 
530
- const retryableModel = createRetryable({
531
- model: openai('gpt-4'), // Base model
532
- retries: [
533
- // Retry base model 3 times with fixed 2s delay
534
- retryAfterDelay({ delay: 2_000, maxAttempts: 3 }),
375
+ Matches when the image model threw `NoImageGeneratedError`.
535
376
 
536
- // Or retry with exponential backoff (2s, 4s, 8s)
537
- retryAfterDelay({ delay: 2_000, backoffFactor: 2, maxAttempts: 3 }),
377
+ ```typescript
378
+ import { noImage } from 'ai-retry/image-model';
538
379
 
539
- // Or retry only if the response contains a retry-after header
540
- retryAfterDelay({ maxAttempts: 3 }),
541
- ],
542
- });
380
+ noImage().switch({ model: fallback });
543
381
  ```
544
382
 
545
- By default, if a [`retry-after-ms`](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/provisioned-get-started#what-should--i-do-when-i-receive-a-429-response) or [`retry-after`](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Retry-After) header is present in the response, it will be prioritized over the configured delay. The delay from the header will be capped at 60 seconds for safety.
383
+ ##### Embedding model conditions
546
384
 
547
- #### Schema Mismatch
385
+ > [!NOTE]
386
+ > The `embedding-model` entry point exposes only the universal conditions — there are no embedding-specific result conditions.
548
387
 
549
- Automatically retry with a different model when the response JSON doesn't match the expected schema.
388
+ #### Actions
550
389
 
551
- This is a result-based retryable that validates the model's JSON output against the schema set by structured output modes like `Output.object()`, `Output.array()`, and `Output.choice()`.
552
- Normally, schema validation happens outside the model in `generateText`, so a schema validation error would not be seen by the retryable model. This retryable catches it early and retries with a fallback model.
390
+ Every condition exposes two terminal actions that turn it into a retryable:
553
391
 
554
- > [!NOTE]
555
- > This retryable works with `generateText` and any structured output mode that provides a schema: `Output.object()`, `Output.array()`, and `Output.choice()`.
392
+ - **`.switch({ model, ...options })`** falls back to a different model when the condition matches. Optional fields (`maxAttempts`, `delay`, `backoffFactor`, `timeout`, `options`) are the same as on a normal `Retry` object. `maxAttempts` defaults to `1`.
393
+ - **`.retry({ delay?, backoffFactor?, maxAttempts?, ... })`** retries the **current** model when the condition matches. Honors `Retry-After` and `Retry-After-Ms` response headers, capped at 60 seconds. `maxAttempts` defaults to `2` (one original attempt + one retry); values below `2` throw, since the retry budget is consumed by the original failure.
556
394
 
557
395
  ```typescript
558
- import { openai } from '@ai-sdk/openai';
559
- import { anthropic } from '@ai-sdk/anthropic';
560
- import { generateText, Output } from 'ai';
561
- import { createRetryable } from 'ai-retry';
562
- import { schemaMismatch } from 'ai-retry/retryables';
563
- import { z } from 'zod';
564
-
565
- const retryableModel = createRetryable({
566
- model: openai('gpt-4-mini'), // Weak base model
567
- retries: [
568
- // Retry with stronger model on schema mismatch
569
- schemaMismatch(openai('gpt-5')),
570
- ],
571
- });
396
+ import { error, timeout } from 'ai-retry/language-model';
572
397
 
573
- const result = await generateText({
574
- model: retryableModel,
575
- output: Output.object({
576
- schema: z.object({
577
- name: z.string(),
578
- age: z.number(),
579
- }),
580
- }),
581
- prompt: 'Generate a person with name and age.',
582
- });
398
+ // Switch on a timeout, with a fresh timeout for the fallback
399
+ timeout().switch({ model: fallback, timeout: 30_000 });
583
400
 
584
- console.log(result.object); // { name: "Alice", age: 30 }
401
+ // Retry the current model with exponential backoff, max 3 attempts
402
+ error
403
+ .isRetryable(true)
404
+ .retry({ delay: 1_000, backoffFactor: 2, maxAttempts: 3 });
585
405
  ```
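To make the header handling of `.retry(...)` concrete, here is a rough sketch of turning `Retry-After-Ms` / `Retry-After` into a capped delay. `delayFromHeaders` is a hypothetical helper, not an ai-retry export. Lower-cased header keys and preferring `retry-after-ms` when both headers are present are assumptions; only the 60-second cap comes from the text above.

```typescript
const MAX_HEADER_DELAY_MS = 60_000; // documented 60-second cap

// Parses a non-negative number, rejecting empty or non-numeric strings.
function parseNonNegative(value: string | undefined): number | undefined {
  if (value === undefined || value.trim() === '') return undefined;
  const n = Number(value);
  return Number.isFinite(n) && n >= 0 ? n : undefined;
}

// Hypothetical helper: returns a delay in milliseconds, or undefined when
// neither header carries a usable value.
function delayFromHeaders(
  headers: Record<string, string | undefined>,
  now: number = Date.now(),
): number | undefined {
  // Assumption: the millisecond header wins when both are present.
  const ms = parseNonNegative(headers['retry-after-ms']);
  if (ms !== undefined) return Math.min(ms, MAX_HEADER_DELAY_MS);

  const retryAfter = headers['retry-after'];
  const seconds = parseNonNegative(retryAfter);
  if (seconds !== undefined) return Math.min(seconds * 1000, MAX_HEADER_DELAY_MS);

  // Retry-After may also be an HTTP date.
  if (retryAfter !== undefined) {
    const date = Date.parse(retryAfter);
    if (!Number.isNaN(date)) {
      return Math.min(Math.max(date - now, 0), MAX_HEADER_DELAY_MS);
    }
  }
  return undefined;
}
```

When the function returns `undefined`, the configured `delay` / `backoffFactor` would apply instead.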
586
406
 
587
- ### Experimental: Composable Conditions
588
-
589
- > [!WARNING]
590
- > This API is experimental and may change. It is not exported from the package root; opt in via one of the per-model deep imports:
591
- >
592
- > ```ts
593
- > import { ... } from 'ai-retry/experimental/language-model';
594
- > import { ... } from 'ai-retry/experimental/image-model';
595
- > import { ... } from 'ai-retry/experimental/embedding-model';
596
- > ```
597
- >
598
- > Each entry point also re-exports `createRetryable` already typed for that model family, so you can either import everything from one path:
599
- >
600
- > ```ts
601
- > import {
602
- > createRetryable,
603
- > error,
604
- > httpStatus,
605
- > } from 'ai-retry/experimental/language-model';
606
- > ```
607
- >
608
- > or pull retryables from the dedicated `/retryables` subpath:
609
- >
610
- > ```ts
611
- > import {
612
- > error,
613
- > httpStatus,
614
- > } from 'ai-retry/experimental/language-model/retryables';
615
- > // or
616
- > import * as retryables from 'ai-retry/experimental/language-model/retryables';
617
- > ```
407
+ #### Combinators
618
408
 
619
- A `condition().action()` API for retryables. Conditions are built from small primitives (`error(fn)`, `result(fn)`), composed with `.and` / `.or` / `.not`, and turned into a `Retryable` by one of two terminal actions: `.switch({ model })` or `.retry({ delay })`. The result drops into the same `retries: [...]` array as the stable helpers, so you can mix the two styles freely.
409
+ Compose conditions with the top-level `or()`, `and()`, and `not()` helpers. Because each entry point is typed for a single model family, they infer the family from their arguments, so no type annotations or casts are needed. `or()` and `and()` are variadic.
620
410
 
621
411
  ```typescript
622
- import { anthropic } from '@ai-sdk/anthropic';
623
- import { openai } from '@ai-sdk/openai';
624
- import { generateText } from 'ai';
625
- import {
626
- createRetryable,
627
- error,
628
- finishReason,
629
- httpStatus,
630
- } from 'ai-retry/experimental/language-model';
631
-
632
- const retryableModel = createRetryable({
633
- model: openai('gpt-4'),
634
- retries: [
635
- // Switch on 529 or any "overloaded" message
636
- httpStatus(529, 'overloaded').switch({
637
- model: anthropic('claude-3-haiku-20240307'),
638
- }),
639
-
640
- // Switch when the response was content-filtered
641
- finishReason('content-filter').switch({ model: openai('gpt-4o') }),
412
+ import { and, error, httpStatus, not, or } from 'ai-retry/language-model';
642
413
 
643
- // Retry the same model with exponential backoff on retryable errors
644
- error.isRetryable(true).retry({ delay: 1_000, backoffFactor: 2 }),
645
- ],
646
- });
414
+ or(httpStatus(429), error.message('overloaded')).switch({ model: fallback });
415
+ and(httpStatus(503), error.message('temporary')).switch({ model: fallback });
416
+ not(error.isRetryable(true)).switch({ model: fallback });
647
417
  ```
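Conceptually, the combinators behave like boolean operators over predicates. The sketch below shows that idea on plain synchronous functions. It is not the library's code: the real helpers operate on condition objects and also accept async predicates, and `Failure`, `hasStatus`, and `messageIncludes` are hypothetical names used only for illustration.

```typescript
type Predicate<T> = (value: T) => boolean;

// Variadic OR: true when at least one predicate matches.
const or = <T>(...preds: Predicate<T>[]): Predicate<T> =>
  (value) => preds.some((p) => p(value));

// Variadic AND: true only when every predicate matches.
const and = <T>(...preds: Predicate<T>[]): Predicate<T> =>
  (value) => preds.every((p) => p(value));

// Negation of a single predicate.
const not = <T>(pred: Predicate<T>): Predicate<T> =>
  (value) => !pred(value);

// Hypothetical failure shape and matchers for the example.
interface Failure {
  statusCode: number;
  message: string;
}

const hasStatus = (code: number): Predicate<Failure> =>
  (f) => f.statusCode === code;
const messageIncludes = (text: string): Predicate<Failure> =>
  (f) => f.message.toLowerCase().includes(text);

const rateLimitedOrOverloaded = or(hasStatus(429), messageIncludes('overloaded'));
const temporaryOutage = and(hasStatus(503), messageIncludes('temporary'));
const notRateLimited = not(hasStatus(429));
```

Because `T` is inferred from the arguments, mixing predicates for different shapes is rejected at compile time, which is the same property the typed entry points rely on.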
648
418
 
649
- #### Picking an entry point
650
-
651
- Pick the entry point that matches the model you pass to `createRetryable`. Each module exposes the helpers that make sense for that model family already typed for it, so you don't need to add type annotations yourself.
652
-
653
- #### Low-level conditions
419
+ #### Custom predicates
654
420
 
655
- The primitive builders `error(...)` and `result(...)` take a predicate and turn it into a condition; their namespaces bundle the most common field matchers on top.
656
-
657
- | Helper | Matches when | Available in |
658
- | --------------------------------- | ------------------------------------------------------------------------------------ | ---------------------- |
659
- | `error(predicate)` | The current attempt failed and `predicate(err, ctx)` returns true | all three entry points |
660
- | `error.isRetryable(flag)` | `APICallError.isRetryable === flag` (default `true`) | all three entry points |
661
- | `error.statusCode(...patterns)` | Numbers match exactly; regex matches the stringified code (e.g. `/^5\d\d$/` for 5xx) | all three entry points |
662
- | `error.message(...patterns)` | Substring (case-insensitive) or regex match against the error message | all three entry points |
663
- | `result(predicate)` | The current attempt succeeded and `predicate(res, ctx)` returns true | `language-model` only |
664
- | `result.finishReason(...reasons)` | The result's `finishReason.unified` matches one of the given values | `language-model` only |
421
+ When the higher-level helpers don't cover the field you need, drop down to `error(predicate)` / `result(predicate)` and inspect whatever is on the error or result. The predicate receives `(err | result, ctx)` and can be `async`; `ctx` is fully typed for the entry point you imported from, so the current attempt, the model, and all previous attempts are available without manual annotations.
665
422
 
666
423
  ```typescript
424
+ import { anthropic } from '@ai-sdk/anthropic';
425
+ import { openai } from '@ai-sdk/openai';
667
426
  import { APICallError } from 'ai';
668
- import { error } from 'ai-retry/experimental/language-model';
427
+ import { createRetryableModel, error } from 'ai-retry/language-model';
669
428
 
670
- error((e) => APICallError.isInstance(e) && e.statusCode === 418).switch({
671
- model: fallback,
429
+ // OpenAI-style error code nested at data.error.code. `e` is `unknown`.
430
+ const isContentFilter = (e: unknown) => {
431
+ if (!APICallError.isInstance(e)) return false;
432
+ const data = e.data as { error?: { code?: string } } | undefined;
433
+ return data?.error?.code === 'content_filter';
434
+ };
435
+
436
+ const retryableModel = createRetryableModel({
437
+ model: openai('gpt-4o'),
438
+ retries: [
439
+ error(isContentFilter).switch({
440
+ model: anthropic('claude-3-haiku-20240307'),
441
+ }),
442
+ ],
672
443
  });
673
444
  ```
674
445
 
675
- #### High-level conditions
676
-
677
- Convenience matchers built on top of the low-level ones for the common cases. Each returns a condition that you finalize with `.switch(...)` or `.retry(...)`.
678
-
679
- | Helper | language-model | image-model | embedding-model |
680
- | -------------------------- | :------------: | :---------: | :-------------: |
681
- | `httpStatus(...patterns)` | ✓ | ✓ | ✓ |
682
- | `timeout()` | ✓ | ✓ | ✓ |
683
- | `aborted()` | ✓ | ✓ | ✓ |
684
- | `finishReason(...reasons)` | ✓ | — | — |
685
- | `schemaInvalid()` | ✓ | — | — |
686
- | `noImage()` | — | ✓ | — |
687
-
688
- What each one matches:
446
+ The predicate's second argument is the typed `RetryContext`, so a check like “only retry on the first attempt” is just `(e, ctx) => ctx.attempts.length === 1 && isContentFilter(e)`.
689
447
 
690
- | Helper | Matches when |
691
- | -------------------------- | ------------------------------------------------------------------------------------------ |
692
- | `httpStatus(...patterns)` | Numbers match the status code; strings match the message (substring); regex matches either |
693
- | `timeout()` | `Error.name === 'TimeoutError'` (`AbortSignal.timeout()` fired) |
694
- | `aborted()` | `Error.name === 'AbortError'` (manual `controller.abort()`) |
695
- | `finishReason(...reasons)` | The result's `finishReason.unified` matches one of the given values |
696
- | `schemaInvalid()` | The result text fails JSON-schema validation against the call's `responseFormat` |
697
- | `noImage()` | The image model threw `NoImageGeneratedError` |
448
+ #### All retries failed
698
449
 
699
- Each high-level helper is a thin wrapper around the low-level ones. For example, `timeout()` is roughly:
450
+ If all retry attempts fail, a `RetryError` is thrown containing all individual errors. If no retry was attempted (every retryable returned `undefined` / didn't match), the original error is re-thrown directly.
700
451
 
701
452
  ```typescript
702
- function timeout() {
703
- return error((err) => err instanceof Error && err.name === 'TimeoutError');
704
- }
705
- ```
706
-
707
- and `finishReason(...)` just delegates to `result.finishReason(...)`:
453
+ import { generateText, RetryError } from 'ai';
708
454
 
709
- ```typescript
710
- function finishReason(...reasons: Array<string>) {
711
- return result.finishReason(...reasons);
455
+ try {
456
+ const result = await generateText({
457
+ model: retryableModel,
458
+ prompt: 'Hello!',
459
+ });
460
+ } catch (err) {
461
+ if (err instanceof RetryError) {
462
+ console.error('All retry attempts failed:', err.errors);
463
+ } else {
464
+ console.error('Request failed:', err);
465
+ }
712
466
  }
713
467
  ```
714
468
 
715
- #### Actions
716
-
717
- Every condition exposes two terminal actions that turn it into a `Retryable`:
718
-
719
- - **`.switch({ model, ...options })`** falls back to a different model when the condition matches. Optional fields (`maxAttempts`, `delay`, `backoffFactor`, `timeout`, `options`) are the same as on a normal `Retry` object. `maxAttempts` defaults to `1`.
720
- - **`.retry({ delay?, backoffFactor?, maxAttempts?, ... })`** retries the current model when the condition matches. Honors `Retry-After` and `Retry-After-Ms` response headers when present, capped at 60 seconds. `maxAttempts` defaults to `2` (one original attempt + one retry); values below `2` throw, since the retry budget is consumed by the original failure.
721
-
722
- #### Combinators
723
-
724
- Compose conditions with `.and`, `.or`, `.not`:
725
-
726
- ```typescript
727
- import { error, httpStatus } from 'ai-retry/experimental/language-model';
728
-
729
- httpStatus(429).or(error.message('overloaded'));
730
- httpStatus(503).and(error.message('temporary'));
731
- error.isRetryable(true).not();
732
- ```
733
-
734
- #### Mapping from Built-in retryables
735
-
736
- Each stable retryable has an equivalent in the new shape (imports from `ai-retry/experimental/language-model` unless noted):
737
-
738
- | Built-in | Composable form |
739
- | ------------------------------------------- | ------------------------------------------------------------------------------------------------------------------- |
740
- | `contentFilterTriggered(m)` | `error(/* check e.data.error.code === 'content_filter' */).or(finishReason('content-filter')).switch({ model: m })` |
741
- | `requestTimeout(m)` | `timeout().switch({ model: m, timeout: 60_000 })` |
742
- | `requestNotRetryable(m)` | `error.isRetryable(false).switch({ model: m })` |
743
- | `schemaMismatch(m)` | `schemaInvalid().switch({ model: m })` |
744
- | `serviceOverloaded(m)` | `httpStatus(529, 'overloaded').switch({ model: m })` |
745
- | `serviceUnavailable(m)` | `error.statusCode(503).switch({ model: m })` |
746
- | `noImageGenerated(m)` | `noImage().switch({ model: m })` (from `image-model`) |
747
- | `retryAfterDelay({ delay, backoffFactor })` | `error.isRetryable(true).retry({ delay, backoffFactor })` |
748
-
749
- > [!NOTE]
750
- > `error.isRetryable(true)` matches whatever the AI SDK's `APICallError` marks retryable. By default that's status codes 408, 409, 429, and any 5xx, plus network errors and provider-specific overrides (e.g. Anthropic flips it on `error.type === 'overloaded_error'`). It picks up more cases than a manual status-code list.
469
+ Errors are tracked per unique model (`provider/modelId`). Once a model has hit its `maxAttempts`, no further retry will land on it.
751
470
 
752
471
  ### Options
753
472
 
754
- #### Disabling Retries
755
-
756
- You can disable retries entirely, which is useful for testing or specific environments. When disabled, the base model will execute directly without any retry logic.
473
+ #### Disabling retries
757
474
 
758
475
  ```typescript
759
- const retryableModel = createRetryable({
760
- model: openai('gpt-4'), // Base model
761
- retries: [
762
- /* ... */
763
- ],
764
- disabled: true, // Retries are completely disabled
765
- });
766
-
767
- // Or disable based on environment
768
- const retryableModel = createRetryable({
769
- model: openai('gpt-4'), // Base model
770
- retries: [
771
- /* ... */
772
- ],
773
- disabled: process.env.NODE_ENV === 'test', // Disable in test environment
774
- });
775
-
776
- // Or use a function for dynamic control
777
- const retryableModel = createRetryable({
778
- model: openai('gpt-4'), // Base model
476
+ const retryableModel = createRetryableModel({
477
+ model: openai('gpt-4'),
779
478
  retries: [
780
479
  /* ... */
781
480
  ],
782
- disabled: () => !featureFlags.isEnabled('ai-retries'), // Check feature flag
481
+ disabled: true, // hard off
482
+ // disabled: process.env.NODE_ENV === 'test', // env-based
483
+ // disabled: () => !featureFlags.isEnabled('ai'), // dynamic
783
484
  });
784
485
  ```
785
486
 
786
- #### Retry Delays
487
+ When disabled, the base model executes directly and no retry logic runs.
488
+
489
+ #### Retry delays
787
490
 
788
- You can delay retries with an optional exponential backoff. The delay respects abort signals, so requests can still be cancelled during the delay period.
491
+ Retry delays support optional exponential backoff and respect the request's abort signal, so a pending delay can still be cancelled.
789
492
 
790
493
  ```typescript
791
- const retryableModel = createRetryable({
494
+ import { createRetryableModel } from 'ai-retry/language-model';
495
+
496
+ const retryableModel = createRetryableModel({
792
497
  model: openai('gpt-4'),
793
498
  retries: [
794
- // Retry model 3 times with fixed 2s delay
499
+ // Retry the base model with a fixed 2s delay
795
500
  { model: openai('gpt-4'), delay: 2_000, maxAttempts: 3 },
796
501
 
797
- // Or retry with exponential backoff (2s, 4s, 8s)
502
+ // Or with exponential backoff: 2s, 4s, 8s
798
503
  { model: openai('gpt-4'), delay: 2_000, backoffFactor: 2, maxAttempts: 3 },
799
504
  ],
800
505
  });
801
-
802
- const result = await generateText({
803
- model: retryableModel,
804
- prompt: 'Write a vegetarian lasagna recipe for 4 people.',
805
- // Will be respected during delays
806
- abortSignal: AbortSignal.timeout(60_000),
807
- });
808
506
  ```
809
507
 
810
- You can also use delays with built-in retryables:
811
-
812
- ```typescript
813
- import { serviceOverloaded } from 'ai-retry/retryables';
814
-
815
- const retryableModel = createRetryable({
816
- model: openai('gpt-4'),
817
- retries: [
818
- // Wait 5 seconds before retrying on service overload
819
- serviceOverloaded(openai('gpt-4'), { maxAttempts: 3, delay: 5_000 }),
820
- ],
821
- });
822
- ```
508
+ The same `delay` / `backoffFactor` / `maxAttempts` options are accepted by `.switch({...})` and `.retry({...})`.
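As a worked illustration of the schedule, the sketch below computes the waits implied by `delay` and `backoffFactor` and shows one way a delay can honor an abort signal. Both `backoffDelays` and `sleep` are hypothetical helpers written for this example; the formula `delay * backoffFactor ** n` is inferred from the "2s, 4s, 8s" comment above rather than taken from the library source.

```typescript
// Waits before each retry: delay, delay * factor, delay * factor^2, ...
function backoffDelays(delay: number, backoffFactor: number, retries: number): number[] {
  return Array.from({ length: retries }, (_, i) => delay * backoffFactor ** i);
}

// Cancellable sleep: rejects with the signal's reason if it aborts first.
function sleep(ms: number, signal?: AbortSignal): Promise<void> {
  return new Promise((resolve, reject) => {
    if (signal?.aborted) {
      reject(signal.reason);
      return;
    }
    const onAbort = () => {
      clearTimeout(timer);
      reject(signal?.reason);
    };
    const timer = setTimeout(() => {
      signal?.removeEventListener('abort', onAbort);
      resolve();
    }, ms);
    signal?.addEventListener('abort', onAbort, { once: true });
  });
}

// Fixed 2s delay vs. exponential backoff with factor 2.
const fixed = backoffDelays(2_000, 1, 3); // [2000, 2000, 2000]
const exponential = backoffDelays(2_000, 2, 3); // [2000, 4000, 8000]
```

A retry loop would `await sleep(exponential[n], abortSignal)` before attempt `n + 1`, so cancelling the request also cancels the wait.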
823
509
 
824
510
  #### Timeouts
825
511
 
826
- When a retry specifies a `timeout` value, a fresh `AbortSignal.timeout()` is created for that retry attempt. If the original `abortSignal` is still alive, the fresh deadline is composed with it via `AbortSignal.any()` so user cancellation still works mid-retry. If the original signal is already aborted (for example it carried a request-level deadline that already fired), it is dropped so the retry runs against the fresh deadline alone.
512
+ When a retry specifies a `timeout`, a fresh `AbortSignal.timeout()` is created for that attempt. If the original `abortSignal` is still alive, the fresh deadline is composed with it via `AbortSignal.any()` so user cancellation still works. If the original signal is already aborted (a request-level deadline already fired), it is dropped so the retry runs against the fresh deadline alone.
827
513
 
828
- If the original `abortSignal` is already aborted at the time of retry and the chosen retry does **not** supply a `timeout`, ai-retry rethrows the original error rather than firing a misleading retry against the dead signal. `onError` still fires for observability, but `onRetry` is skipped. Setting `retry.timeout` is the explicit opt-in for retrying past an aborted signal.
514
+ If the original `abortSignal` is already aborted at the time of retry and the retry does **not** supply a `timeout`, `ai-retry` re-throws the original error rather than firing a misleading retry against the dead signal. `onError` still fires for observability; `onRetry` is skipped. Setting `timeout` is the explicit opt-in for retrying past an aborted signal.
829
515
 
830
516
  ```typescript
831
- const retryableModel = createRetryable({
517
+ import { createRetryableModel, timeout } from 'ai-retry/language-model';
518
+
519
+ const retryableModel = createRetryableModel({
832
520
  model: openai('gpt-4'),
833
521
  retries: [
834
- // Provide a fresh 30 second timeout for the retry
835
- {
836
- model: openai('gpt-3.5-turbo'),
837
- timeout: 30_000,
838
- },
522
+ timeout().switch({ model: openai('gpt-3.5-turbo'), timeout: 30_000 }),
839
523
  ],
840
524
  });
841
525
 
842
- // Even if the original request times out, the retry gets a fresh signal
843
- const result = await generateText({
526
+ await generateText({
844
527
  model: retryableModel,
845
528
  prompt: 'Write a story',
846
- // Original request timeout
847
529
  abortSignal: AbortSignal.timeout(60_000),
848
530
  });
849
531
  ```
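The signal handling described in this section can be approximated with standard Web APIs. The sketch below is an assumption about the shape of the logic, not the library's code, and `signalForRetry` is a hypothetical name. It needs a runtime that provides `AbortSignal.any()` (for example Node.js 20.3 or later).

```typescript
// Builds the abort signal for a retry attempt that specifies a timeout.
function signalForRetry(
  original: AbortSignal | undefined,
  timeoutMs: number,
): AbortSignal {
  // Every retry with a timeout gets its own fresh deadline.
  const fresh = AbortSignal.timeout(timeoutMs);

  // A missing or already-aborted original signal is dropped, so the retry
  // runs against the fresh deadline alone.
  if (original === undefined || original.aborted) {
    return fresh;
  }

  // Otherwise compose both, so user cancellation still aborts the retry.
  return AbortSignal.any([original, fresh]);
}
```

With this shape, a request-level `AbortSignal.timeout(60_000)` that has already fired no longer poisons the fallback attempt, while a live `AbortController` signal keeps working.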
850
532
 
851
- #### Max Attempts
533
+ #### Max attempts
852
534
 
853
- By default, each retryable will only attempt to retry once per model to avoid infinite loops. You can customize this behavior by returning a `maxAttempts` value from your retryable function. Note that the initial request with the base model is counted as the first attempt.
535
+ Each retryable attempts a model at most once by default. Use `maxAttempts` to allow more. Attempts are counted per unique model, so duplicates across multiple retryables don't get more chances than configured.
854
536
 
855
537
  ```typescript
856
- const retryableModel = createRetryable({
538
+ const retryableModel = createRetryableModel({
857
539
  model: openai('gpt-4'),
858
540
  retries: [
859
- // Try this once
860
- anthropic('claude-3-haiku-20240307'),
861
- // Try this one more time (initial + 1 retry)
862
- { model: openai('gpt-4'), maxAttempts: 2 },
863
- // Already tried, won't be retried again
864
- anthropic('claude-3-haiku-20240307'),
541
+ anthropic('claude-3-haiku-20240307'), // 1 attempt
542
+ { model: openai('gpt-4'), maxAttempts: 2 }, // 1 + 1 retry
543
+ anthropic('claude-3-haiku-20240307'), // already used
865
544
  ],
866
545
  });
867
546
  ```
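The per-model counting can be pictured as a map keyed by `provider/modelId`. The following sketch is a simplified model of that bookkeeping under stated assumptions: `ModelRef`, `RetryCandidate`, and `nextCandidate` are hypothetical names, and the real library also evaluates each retryable against the error or result before choosing one.

```typescript
interface ModelRef {
  provider: string;
  modelId: string;
}

interface RetryCandidate {
  model: ModelRef;
  maxAttempts?: number; // defaults to 1 in this sketch
}

const modelKey = (m: ModelRef): string => `${m.provider}/${m.modelId}`;

// Returns the first candidate whose model still has attempts left.
function nextCandidate(
  candidates: RetryCandidate[],
  attempts: Map<string, number>,
): RetryCandidate | undefined {
  return candidates.find(
    (c) => (attempts.get(modelKey(c.model)) ?? 0) < (c.maxAttempts ?? 1),
  );
}

// Records one attempt against a model.
function recordAttempt(attempts: Map<string, number>, model: ModelRef): void {
  const key = modelKey(model);
  attempts.set(key, (attempts.get(key) ?? 0) + 1);
}
```

Walking through the example above: the base `gpt-4` call counts as its first attempt, the Haiku fallback is used once, `gpt-4` gets its second attempt, and the duplicate Haiku entry is skipped because that model's budget is already spent.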
868
547
 
869
- The attempts are counted per unique model (provider + modelId). That means if multiple retryables return the same model, it won't be retried again once the `maxAttempts` is reached.
870
-
871
- #### Provider Options
548
+ #### Provider options
872
549
 
873
- You can override provider-specific options for each retry attempt. This is useful when you want to use different configurations for fallback models.
550
+ Override provider-specific options for a retry, completely replacing the original ones.
874
551
 
875
552
  ```typescript
876
- const retryableModel = createRetryable({
553
+ const retryableModel = createRetryableModel({
877
554
  model: openai('gpt-5'),
878
555
  retries: [
879
- // Use different provider options for the retry
880
556
  {
881
557
  model: openai('gpt-4o-2024-08-06'),
882
558
  providerOptions: {
883
- openai: {
884
- user: 'fallback-user',
885
- structuredOutputs: false,
886
- },
559
+ openai: { user: 'fallback-user', structuredOutputs: false },
887
560
  },
888
561
  },
889
562
  ],
890
563
  });
891
-
892
- // Original provider options are used for the first attempt
893
- const result = await generateText({
894
- model: retryableModel,
895
- prompt: 'Write a story',
896
- providerOptions: {
897
- openai: {
898
- user: 'primary-user',
899
- },
900
- },
901
- });
902
564
  ```
903
565
 
904
- The retry's `providerOptions` will completely replace the original ones during retry attempts. This works for all model types (language and embedding) and all operations (generate, stream, embed).
566
+ #### Call options
905
567
 
906
- #### Call Options
907
-
908
- You can override various call options when retrying requests. This is useful for adjusting parameters like temperature, max tokens, or even the prompt itself for retry attempts. Call options are specified in the `options` field of the retry object.
568
+ Override any of the call options for a retry. Useful for things like temperature, max tokens, or the prompt itself.
909
569
 
910
570
  ```typescript
911
- const retryableModel = createRetryable({
571
+ const retryableModel = createRetryableModel({
912
572
  model: openai('gpt-4'),
913
573
  retries: [
914
574
  {
915
575
  model: anthropic('claude-3-haiku'),
916
576
  options: {
917
- // Override generation parameters for more deterministic output
918
577
  temperature: 0.3,
919
578
  topP: 0.9,
920
579
  maxOutputTokens: 500,
921
- // Set a seed for reproducibility
922
580
  seed: 42,
923
581
  },
924
582
  },
@@ -926,58 +584,54 @@ const retryableModel = createRetryable({
926
584
  });
927
585
  ```
928
586
 
929
- The following options can be overridden:
930
-
931
587
  > [!NOTE]
932
588
  > Override options completely replace the original values (they are not merged). If you don't specify an option, the original value from the request is used.
933
589
 
934
- ##### Language Model Options
935
-
936
- | Option | Description |
937
- | -------------------------------------------------------------------------------------------------- | ---------------------------------------------- |
938
- | [`prompt`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-text#prompt) | Override the entire prompt for the retry |
939
- | [`temperature`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-text#temperature) | Temperature setting for controlling randomness |
940
- | [`topP`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-text#topp) | Nucleus sampling parameter |
941
- | [`topK`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-text#topk) | Top-K sampling parameter |
942
- | [`maxOutputTokens`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-text#max-output-tokens) | Maximum number of tokens to generate |
943
- | [`seed`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-text#seed) | Random seed for deterministic generation |
944
- | [`stopSequences`](https://ai-sdk.dev/docs/reference/ai-sdk-types/generate-text#stopsequences) | Stop sequences to end generation |
945
- | [`presencePenalty`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-text#presencepenalty) | Presence penalty for reducing repetition |
946
- | [`frequencyPenalty`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-text#frequencypenalty) | Frequency penalty for reducing repetition |
947
- | [`headers`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-text#headers) | Additional HTTP headers |
948
- | [`providerOptions`](https://ai-sdk.dev/docs/reference/ai-sdk-types/generate-text#provideroptions) | Provider-specific options |
590
+ ##### Language model options
949
591
 
950
- ##### Embedding Model Options
592
+ | Option | Description |
593
+ | ------------------ | ---------------------------------------------- |
594
+ | `prompt` | Override the entire prompt for the retry |
595
+ | `temperature` | Temperature setting for controlling randomness |
596
+ | `topP` | Nucleus sampling parameter |
597
+ | `topK` | Top-K sampling parameter |
598
+ | `maxOutputTokens` | Maximum number of tokens to generate |
599
+ | `seed` | Random seed for deterministic generation |
600
+ | `stopSequences` | Stop sequences to end generation |
601
+ | `presencePenalty` | Presence penalty for reducing repetition |
602
+ | `frequencyPenalty` | Frequency penalty for reducing repetition |
603
+ | `headers` | Additional HTTP headers |
604
+ | `providerOptions` | Provider-specific options |
951
605
 
952
- | Option | Description |
953
- | ---------------------------------------------------------------------------------------- | ---------------------------- |
954
- | [`values`](https://ai-sdk.dev/docs/reference/ai-sdk-core/embed#values) | Override the values to embed |
955
- | [`headers`](https://ai-sdk.dev/docs/reference/ai-sdk-core/embed#headers) | Additional HTTP headers |
956
- | [`providerOptions`](https://ai-sdk.dev/docs/reference/ai-sdk-core/embed#provideroptions) | Provider-specific options |
606
+ ##### Embedding model options
957
607
 
958
- ##### Image Model Options
608
+ | Option | Description |
609
+ | ----------------- | ---------------------------- |
610
+ | `values` | Override the values to embed |
611
+ | `headers` | Additional HTTP headers |
612
+ | `providerOptions` | Provider-specific options |
959
613
 
960
- | Option | Description |
961
- | ------------------------------------------------------------------------------------------------- | -------------------------------- |
962
- | [`n`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-image#n) | Number of images to generate |
963
- | [`size`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-image#size) | Size of generated images |
964
- | [`aspectRatio`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-image#aspectratio) | Aspect ratio of generated images |
965
- | [`seed`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-image#seed) | Random seed for reproducibility |
966
- | [`headers`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-image#headers) | Additional HTTP headers |
967
- | [`providerOptions`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-image#provideroptions) | Provider-specific options |
614
+ ##### Image model options
968
615
 
969
- #### Dynamic Call Options
616
+ | Option | Description |
617
+ | ----------------- | -------------------------------- |
618
+ | `n` | Number of images to generate |
619
+ | `size` | Size of generated images |
620
+ | `aspectRatio` | Aspect ratio of generated images |
621
+ | `seed` | Random seed for reproducibility |
622
+ | `headers` | Additional HTTP headers |
623
+ | `providerOptions` | Provider-specific options |
970
624
 
971
- You can also override call options dynamically from inside the `onRetry` callback, instead of declaring them statically on the retry object. This is useful when the override depends on something only known at runtime, like the prompt that just failed, the model that's about to be tried next, or the error that triggered the retry. The overrides apply to the upcoming retry attempt only, and can change the same fields as the static `options` on a retry. The callback may also be `async` if computing the override needs to do work (e.g. fetching a fresh credential).
625
+ #### Dynamic call options
972
626
 
973
- A common use case is sanitizing provider-scoped metadata when falling back to a different provider, for example stripping `providerOptions.azure.itemId` references from the previous prompt before retrying on OpenAI:
627
+ You can also override call options dynamically from `onRetry`, instead of declaring them statically on the retry object. This is useful when the override depends on something only known at runtime — the prompt that just failed, the model about to be tried, or the error that triggered the retry. The overrides apply to the upcoming attempt only and can change the same fields as the static `options`. The callback can be `async` if computing the override needs to do work (e.g. fetching a fresh credential).
974
628
 
975
629
  ```typescript
976
- import { createRetryable } from 'ai-retry';
977
630
  import { azure } from '@ai-sdk/azure';
978
631
  import { openai } from '@ai-sdk/openai';
632
+ import { createRetryableModel } from 'ai-retry/language-model';
979
633
 
980
- const retryableModel = createRetryable({
634
+ const retryableModel = createRetryableModel({
981
635
  model: azure('gpt-5-chat'),
982
636
  retries: [openai('gpt-5-chat')],
983
637
  onRetry: (context) => {
@@ -985,33 +639,16 @@ const retryableModel = createRetryable({
985
639
  const previous = attempts.at(-1);
986
640
 
987
641
  if (current.model.provider !== previous.model.provider) {
988
- // Strip provider-scoped metadata from the prompt before retrying on a different provider
642
+ // Strip provider-scoped metadata before retrying on a different provider
989
643
  return {
990
- options: {
991
- prompt: stripProviderMetadata(current.options.prompt),
992
- },
644
+ options: { prompt: stripProviderMetadata(current.options.prompt) },
993
645
  };
994
646
  }
995
647
  },
996
648
  });
997
649
  ```
998
650
 
999
- Inside the `onRetry` callback, `context.current.model` is the model that's about to be tried next, while `context.current.options` and `context.current.error` describe the failed attempt that triggered the retry. The previous model is available at `context.attempts.at(-1).model`.
1000
-
1001
- `onRetry` may also be `async`, which is useful if computing the override needs to do work (e.g. fetching a fresh credential):
1002
-
1003
- ```typescript
1004
- const retryableModel = createRetryable({
1005
- model: openai('gpt-4o-mini'),
1006
- retries: [anthropic('claude-sonnet-4-20250514')],
1007
- onRetry: async (context) => {
1008
- const { current } = context;
1009
-
1010
- const headers = await refreshAuthHeaders(current.model.provider);
1011
- return { options: { headers } };
1012
- },
1013
- });
1014
- ```
651
+ Inside `onRetry`, `context.current.model` is the model about to be tried next; `context.current.options` and `context.current.error` describe the failed attempt that triggered the retry. The previous model is at `context.attempts.at(-1).model`.
1015
652
 
1016
653
  **Precedence** for the upcoming retry attempt (highest to lowest):
1017
654
 
@@ -1029,10 +666,10 @@ You can use the following callbacks to log retry attempts and errors:
1029
666
  - `onFailure` is invoked when the request ultimately fails and no retry could recover it.
1030
667
 
1031
668
  ```typescript
1032
- const retryableModel = createRetryable({
1033
- model: openai('gpt-4-mini'),
669
+ const retryableModel = createRetryableModel({
670
+ model: openai('gpt-4o-mini'),
1034
671
  retries: [
1035
- /* your retryables */
672
+ /* ... */
1036
673
  ],
1037
674
  onError: (context) => {
1038
675
  console.error(
@@ -1042,7 +679,7 @@ const retryableModel = createRetryable({
1042
679
  },
1043
680
  onRetry: (context) => {
1044
681
  console.log(
1045
- `Retrying attempt ${context.attempts.length + 1} with model ${context.current.model.provider}/${context.current.model.modelId}...`,
682
+ `Retrying with ${context.current.model.provider}/${context.current.model.modelId}...`,
1046
683
  );
1047
684
  },
1048
685
  onSuccess: (context) => {
@@ -1063,7 +700,7 @@ const retryableModel = createRetryable({
1063
700
 
1064
701
  #### Reset
1065
702
 
1066
- By default, every new request starts with the base model, even if a previous request was retried with a different model. The `reset` option changes this behavior by making the last successfully retried model **sticky**, that means subsequent requests will continue using that model instead of switching back to the base model. The reset value controls how long the retry model stays sticky before resetting back to the base model.
703
+ By default, every new request starts with the base model, even if a previous request was retried with a different model. The `reset` option changes this behavior by making the last successfully retried model **sticky**: subsequent requests will continue using that model until the reset condition fires.
1067
704
 
1068
705
  | Value | Description |
1069
706
  | ------------------ | ------------------------------------------------------------ |
@@ -1071,51 +708,29 @@ By default, every new request starts with the base model, even if a previous req
1071
708
  | `after-N-requests` | Keep the retry model for the next **N** requests, then reset |
1072
709
  | `after-N-seconds` | Keep the retry model for **N** seconds, then reset |
1073
710
 
1074
- ##### Reset after each request (default)
1075
-
1076
- ```typescript
1077
- const retryableModel = createRetryable({
1078
- model: openai('gpt-4o-mini'),
1079
- retries: [anthropic('claude-sonnet-4-20250514')],
1080
- reset: 'after-request', // default: always start with the base model
1081
- });
1082
- ```
1083
-
1084
- ##### Keep the retry model for N requests
1085
-
1086
- ```typescript
1087
- const retryableModel = createRetryable({
1088
- model: openai('gpt-4o-mini'),
1089
- retries: [anthropic('claude-sonnet-4-20250514')],
1090
- reset: 'after-5-requests', // use the retry model for 5 more requests before resetting
1091
- });
1092
- ```
1093
-
1094
- ##### Keep the retry model for N seconds
1095
-
1096
711
  ```typescript
1097
- const retryableModel = createRetryable({
712
+ const retryableModel = createRetryableModel({
1098
713
  model: openai('gpt-4o-mini'),
1099
714
  retries: [anthropic('claude-sonnet-4-20250514')],
1100
- reset: 'after-30-seconds', // use the retry model for 30 seconds before resetting
715
+ reset: 'after-5-requests',
1101
716
  });
1102
717
  ```
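To make the three `reset` value shapes concrete, here is a small sketch that parses them into a structured form. `parseReset` and `ParsedReset` are hypothetical names used for illustration; the library may interpret the string quite differently internally.

```typescript
// Hypothetical parser for the documented `reset` string values.
type Reset = 'after-request' | `after-${number}-requests` | `after-${number}-seconds`;

type ParsedReset =
  | { kind: 'request' }
  | { kind: 'requests'; count: number }
  | { kind: 'seconds'; seconds: number };

function parseReset(reset: Reset): ParsedReset {
  // Default behavior: every new request starts with the base model again.
  if (reset === 'after-request') return { kind: 'request' };

  const match = /^after-(\d+)-(requests|seconds)$/.exec(reset);
  if (!match) throw new Error(`Unsupported reset value: ${reset}`);

  const n = Number(match[1]);
  return match[2] === 'requests'
    ? { kind: 'requests', count: n }
    : { kind: 'seconds', seconds: n };
}

console.log(JSON.stringify(parseReset('after-5-requests'))); // {"kind":"requests","count":5}
console.log(JSON.stringify(parseReset('after-30-seconds'))); // {"kind":"seconds","seconds":30}
```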
1103
718
 
1104
719
  ### Telemetry
1105
720
 
1106
721
  > [!NOTE]
1107
- > Experimental: Span names and attributes may change in patch versions.
722
+ > Experimental: span names and attributes may change in patch versions.
1108
723
 
1109
- `ai-retry` can emit [OpenTelemetry](https://opentelemetry.io/) spans for each request and every retry attempt. The spans are created on the active OpenTelemetry context, so they nest automatically under the AI SDK's own spans (e.g. `ai.generateText.doGenerate`) when you also enable `experimental_telemetry` on `generateText`/`streamText`. A single trace then shows the individual attempts — which model each used, why it was retried, and the backoff between them — that the SDK's own span otherwise hides.
724
+ `ai-retry` can emit [OpenTelemetry](https://opentelemetry.io/) spans for each request and every retry attempt. Spans are created on the active OpenTelemetry context, so they nest automatically under the AI SDK's own spans (e.g. `ai.generateText.doGenerate`) when you also enable `experimental_telemetry` on `generateText` / `streamText`. A single trace then shows the individual attempts — which model each used, why it was retried, and the backoff between them — that the SDK's own span otherwise hides.
1110
725
 
1111
726
  #### Setup
1112
727
 
1113
728
  Telemetry uses the optional peer dependency `@opentelemetry/api` (already present if you use the AI SDK). Register an OpenTelemetry SDK once at startup, then opt in per model:
1114
729
 
1115
730
  ```typescript
1116
- import { createRetryable } from 'ai-retry';
731
+ import { createRetryableModel } from 'ai-retry/language-model';
1117
732
 
1118
- const retryableModel = createRetryable({
733
+ const retryableModel = createRetryableModel({
1119
734
  model: openai('gpt-4o'),
1120
735
  retries: [anthropic('claude-sonnet-4-5')],
1121
736
  experimental_telemetry: { isEnabled: true },
@@ -1150,27 +765,27 @@ ai_retry.doGenerate outcome=success, attempts=2
1150
765
 
1151
766
  **Operation span** attributes:
1152
767
 
1153
- | Attribute | Description |
1154
- | --------------------------------------------------------------- | ---------------------------------------------------------------------------- |
1155
- | `ai_retry.operation` | `doGenerate`, `doStream`, or `doEmbed` |
1156
- | `ai_retry.outcome` | `success` or `failure` |
1157
- | `ai_retry.attempts` | total number of attempts |
1158
- | `ai_retry.model.start` | the model the request started with (`provider/modelId`) |
1159
- | `ai_retry.model.final` | the model that produced the final outcome |
768
+ | Attribute | Description |
769
+ | ---------------------------------------------------------------------------- | ---------------------------------------------------------------------------- |
770
+ | `ai_retry.operation` | `doGenerate`, `doStream`, or `doEmbed` |
771
+ | `ai_retry.outcome` | `success` or `failure` |
772
+ | `ai_retry.attempts` | total number of attempts |
773
+ | `ai_retry.model.start` | the model the request started with (`provider/modelId`) |
774
+ | `ai_retry.model.final` | the model that produced the final outcome |
1160
775
  | `ai_retry.error.{name,message,status,cause.name,cause.message,cause.status}` | the failing error (on failure); `status` when it carries an HTTP status code |
1161
- | `ai_retry.function.id`, `ai_retry.metadata.*` | from the telemetry settings |
776
+ | `ai_retry.function.id`, `ai_retry.metadata.*` | from the telemetry settings |
1162
777
 
1163
778
  **Attempt span** (`ai_retry.attempt`) attributes:
1164
779
 
1165
- | Attribute | Description |
1166
- | ----------------------------------------------------------------------- | ------------------------------------------------------------------------ |
1167
- | `ai_retry.attempt.number` | 1-based attempt index |
1168
- | `ai_retry.attempt.model` | model used (`provider/modelId`) |
1169
- | `ai_retry.attempt.outcome` | `success`, `retry`, or `failure` |
1170
- | `ai_retry.attempt.type` | `result` or `error` |
1171
- | `ai_retry.attempt.finish_reason` | finish reason (result attempts) |
1172
- | `ai_retry.attempt.delay_ms` | backoff scheduled before the next attempt |
1173
- | `ai_retry.attempt.timeout_ms` | timeout budget, when the retry set one |
780
+ | Attribute | Description |
781
+ | ------------------------------------------------------------------------------------ | ------------------------------------------------------------------------ |
782
+ | `ai_retry.attempt.number` | 1-based attempt index |
783
+ | `ai_retry.attempt.model` | model used (`provider/modelId`) |
784
+ | `ai_retry.attempt.outcome` | `success`, `retry`, or `failure` |
785
+ | `ai_retry.attempt.type` | `result` or `error` |
786
+ | `ai_retry.attempt.finish_reason` | finish reason (result attempts) |
787
+ | `ai_retry.attempt.delay_ms` | backoff scheduled before the next attempt |
788
+ | `ai_retry.attempt.timeout_ms` | timeout budget, when the retry set one |
1174
789
  | `ai_retry.attempt.error.{name,message,status,cause.name,cause.message,cause.status}` | the error (error attempts); `status` when it carries an HTTP status code |
1175
790
 
1176
791
  Attempt spans also carry the standard `gen_ai.request.model` / `gen_ai.provider.name` attributes so observability tools (Langfuse, etc.) recognize and render them.
@@ -1187,10 +802,32 @@ Errors during streaming requests can occur in two ways:
1187
802
  1. When the stream is initially created (e.g. network error, API error, etc.) by calling `streamText`.
1188
803
  2. While the stream is being processed (e.g. timeout, API error, etc.) by reading from the returned `result.textStream` async iterable.
1189
804
 
1190
- In the second case, errors during stream processing will not always be retried, because the stream might have already emitted some actual content and the consumer might have processed it. Retrying will be stopped as soon as the first content chunk (e.g. types of `text-delta`, `tool-call`, etc.) is emitted. The type of chunks considered as content are the same as the ones that are passed to [onChunk()](https://github.com/vercel/ai/blob/1fe4bd4144bff927f5319d9d206e782a73979ccb/packages/ai/src/generate-text/stream-text.ts#L684-L697).
805
+ In the second case, errors during stream processing will not always be retried, because the stream might have already emitted some actual content and the consumer might have processed it. Retrying stops as soon as the first content chunk (e.g. `text-delta`, `tool-call`, etc.) is emitted. The chunks considered as content are the same as the ones passed to [`onChunk()`](https://github.com/vercel/ai/blob/1fe4bd4144bff927f5319d9d206e782a73979ccb/packages/ai/src/generate-text/stream-text.ts#L684-L697).
806
+
807
+ Result-based conditions (`finishReason`, `schemaInvalid`, `result(...)`) apply to streams as well. The decision is made when the upstream `finish` part arrives and only fires if no content has been emitted yet, so a condition such as `finishReason.unified === 'content-filter'` on an otherwise empty response can still trigger a fallback. Once any content chunk has been forwarded, the stream is committed and result-based retries are skipped.
1191
808
 
1192
809
  > [!IMPORTANT]
1193
- > **Streaming limitation:** Retries and fallbacks only apply before the first content chunk is emitted. Once streaming begins delivering content, the response is committed to the current model. Mid-stream errors will propagate to the caller rather than triggering a fallback. If reliable retries are critical for your use case, consider using `generateText` instead of `streamText`.
810
+ > **Streaming limitation:** retries and fallbacks only apply before the first content chunk is emitted. Once streaming begins delivering content, the response is committed to the current model. Mid-stream errors will propagate to the caller rather than triggering a fallback. If reliable retries are critical for your use case, consider using `generateText` instead of `streamText`.
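The commit rule above can be sketched as a check over the stream parts forwarded so far. This is a simplified model under stated assumptions: `canStillRetry` is a hypothetical helper, and the content set lists only the two chunk types the text names (`text-delta`, `tool-call`), so it is incomplete by design.

```typescript
// Illustrative model of "retries stop at the first content chunk".
type StreamPart = { type: string };

// Assumed, incomplete set: only the content chunk types named in the text.
const CONTENT_TYPES = new Set(['text-delta', 'tool-call']);

function canStillRetry(forwarded: StreamPart[]): boolean {
  // A retry or fallback is only possible while nothing but preamble
  // (for example `stream-start`) has reached the consumer.
  return !forwarded.some((part) => CONTENT_TYPES.has(part.type));
}

console.log(canStillRetry([{ type: 'stream-start' }])); // true
console.log(canStillRetry([{ type: 'stream-start' }, { type: 'text-delta' }])); // false
```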
811
+
812
+ ### Deprecated: function-style retryables
813
+
814
+ The function-style helpers (`contentFilterTriggered`, `requestTimeout`, `requestNotRetryable`, `retryAfterDelay`, `schemaMismatch`, `serviceOverloaded`, `serviceUnavailable`, `noImageGenerated`) are still exported from `ai-retry/retryables` for backwards compatibility, but they are deprecated in favor of the condition API documented above.
815
+
816
+ > [!NOTE]
817
+ > Full documentation for the deprecated function-style retryables lives in the [earlier README](https://github.com/zirkelc/ai-retry/blob/v1/README.md). New code should use the condition API. See the [migration guide](./MIGRATION.md) to convert existing code.
818
+
819
+ Each function-style retryable has a one-line equivalent in the new shape (imports from `ai-retry/language-model` unless noted):
820
+
821
+ | Function-style (deprecated) | Condition API |
822
+ | ------------------------------------------- | -------------------------------------------------------------------------------------------------------------------- |
823
+ | `contentFilterTriggered(m)` | `finishReason('content-filter').switch({ model: m })` |
824
+ | `requestTimeout(m)` | `timeout().switch({ model: m, timeout: 60_000 })` |
825
+ | `requestNotRetryable(m)` | `error.isRetryable(false).switch({ model: m })` |
826
+ | `schemaMismatch(m)` | `schemaInvalid().switch({ model: m })` |
827
+ | `serviceOverloaded(m)` | `httpStatus(529).switch({ model: m })` |
828
+ | `serviceUnavailable(m)` | `httpStatus(503).switch({ model: m })` |
829
+ | `noImageGenerated(m)` | `noImage().switch({ model: m })` (from `ai-retry/image-model`) |
830
+ | `retryAfterDelay({ delay, backoffFactor })` | `error.isRetryable(true).retry({ delay, backoffFactor })` |
1194
831
 
1195
832
  #### Preamble buffering
1196
833
 
@@ -1201,13 +838,13 @@ Every stream begins with a non-content preamble (`stream-start`, then optionally
1201
838
 
1202
839
  ### API Reference
1203
840
 
1204
- #### `createRetryable(options: RetryableModelOptions): LanguageModelV3 | EmbeddingModelV3 | ImageModelV3`
841
+ #### `createRetryableModel(options): LanguageModel | EmbeddingModel | ImageModel`
1205
842
 
1206
- Creates a retryable model that works with language models, embedding models, and image models.
843
+ Imported from the per-model entry point (`ai-retry/language-model`, `ai-retry/embedding-model`, `ai-retry/image-model`). Each entry point returns a model already narrowed to that family.
1207
844
 
1208
845
  ```ts
1209
846
  interface RetryableModelOptions<
1210
- MODEL extends LanguageModelV3 | EmbeddingModelV3 | ImageModelV3,
847
+ MODEL extends LanguageModel | EmbeddingModel | ImageModel,
1211
848
  > {
1212
849
  model: MODEL;
1213
850
  retries: Array<Retryable<MODEL> | MODEL>;
@@ -1225,19 +862,26 @@ interface RetryableModelOptions<
1225
862
 
1226
863
  **Options:**
1227
864
 
1228
- - `model`: The base model to use for the initial request.
1229
- - `retries`: Array of retryables (functions, models, or retry objects) to attempt on failure.
1230
- - `disabled`: Disable all retry logic. Can be a boolean or function returning boolean. Default: `false` (retries enabled).
1231
- - `reset`: Controls when to reset back to the base model after a successful retry. Default: `after-request`.
1232
- - `experimental_telemetry`: OpenTelemetry instrumentation for retries. Off by default. See [Telemetry](#telemetry).
1233
- - `onError`: Callback invoked when an error occurs.
1234
- - `onRetry`: Callback invoked before attempting a retry. May optionally return an `OnRetryOverrides` object (or a `Promise` of one) to override `options.*` for the upcoming attempt only. See [Dynamic Call Options via `onRetry`](#dynamic-call-options-via-onretry).
1235
- - `onSuccess`: Callback invoked after a successful request. Receives the model that handled the request and all previous attempts.
1236
- - `onFailure`: Callback invoked when the request ultimately fails and no retry could recover it (no retryable matched, all retries exhausted, or the retry itself failed).
865
+ - `model`: base model used for the initial request.
866
+ - `retries`: array of conditions (`.switch(...)` / `.retry(...)` outputs), models, or retry objects to try on failure.
867
+ - `disabled`: disable all retry logic. `boolean` or `() => boolean`. Default `false`.
868
+ - `reset`: controls when to reset back to the base model after a successful retry. Default `'after-request'`.
869
+ - `experimental_telemetry`: OpenTelemetry instrumentation. See [Telemetry](#telemetry).
870
+ - `onError`: fires when an error occurs.
871
+ - `onRetry`: fires before a retry attempt. May return `OnRetryOverrides` (or a promise of one) to override `options.*` for that attempt only. See [Dynamic call options](#dynamic-call-options).
872
+ - `onSuccess`: fires after a successful request.
873
+ - `onFailure`: fires when the request ultimately fails and no retry recovered it (no condition matched, retries exhausted, or the retry itself failed).
1237
874
 
1238
- #### `Reset`
875
+ #### `createRetryable(options)` (deprecated)
876
+
877
+ ```ts
878
+ import { createRetryable } from 'ai-retry';
879
+ ```
880
+
881
+ > [!WARNING]
882
+ > Deprecated. The root `createRetryable` auto-detects the model family at runtime and resolves bare gateway strings as language models only. Prefer `createRetryableModel` from the matching per-model entry point.
1239
883
 
1240
- Controls when the sticky model resets back to the base model after a successful retry.
884
+ #### `Reset`
1241
885
 
1242
886
  ```ts
1243
887
  type Reset =
@@ -1246,77 +890,53 @@ type Reset =
1246
890
  | `after-${number}-seconds`;
1247
891
  ```
1248
892
 
1249
- - `after-request` — reset immediately after the next request (default).
1250
- - `after-N-requests` — keep the retry model for the next N requests, then reset.
1251
- - `after-N-seconds` — keep the retry model for N seconds, then reset.
1252
-
1253
- #### `Retryable`
1254
-
1255
- A `Retryable` is a function that receives a `RetryContext` with the current error or result and model and all previous attempts.
1256
- It should evaluate the error/result and decide whether to retry by returning a `Retry` or to skip by returning `undefined`.
893
+ #### `Condition<MODEL>`
1257
894
 
1258
895
  ```ts
1259
- type Retryable = (context: RetryContext) => Retry | Promise<Retry> | undefined;
1260
- ```
1261
-
1262
- #### `Retry`
1263
-
1264
- A `Retry` specifies the model to retry and optional settings. The available options depend on the model type (language model, embedding model, or image model).
1265
-
1266
- ```typescript
1267
- interface Retry {
1268
- model: LanguageModelV3 | EmbeddingModelV3 | ImageModelV3;
1269
- maxAttempts?: number; // Maximum retry attempts per model (default: 1)
1270
- delay?: number; // Delay in milliseconds before retrying
1271
- backoffFactor?: number; // Multiplier for exponential backoff
1272
- timeout?: number; // Timeout in milliseconds for the retry attempt
1273
- providerOptions?: ProviderOptions; // @deprecated - use options.providerOptions instead
1274
- options?:
1275
- | LanguageModelV3CallOptions
1276
- | EmbeddingModelV3CallOptions
1277
- | ImageModelV3CallOptions; // Call options to override for this retry
896
+ class Condition<MODEL> {
897
+ evaluate(ctx: RetryContext<MODEL>): Promise<boolean>;
898
+ switch(
899
+ target: { model: MODEL } & Omit<Retry<MODEL>, 'model'>,
900
+ ): Retryable<MODEL>;
901
+ retry(options?: Omit<Retry<MODEL>, 'model'>): Retryable<MODEL>;
1278
902
  }
1279
903
  ```
1280
904
 
1281
- #### `RetryContext`
905
+ Conditions are produced by the low-level (`error`, `result`) and high-level (`httpStatus`, `timeout`, `aborted`, `finishReason`, `schemaInvalid`, `noImage`) helpers. They can be composed with the top-level `and(...conditions)` / `or(...conditions)` / `not(condition)` helpers and finalized into a `Retryable` with `.switch()` or `.retry()`.
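The composition idea can be illustrated with a self-contained model. It is not the library's implementation: the documented `evaluate` returns a promise, while this sketch uses synchronous predicates to stay short, and `Ctx`, `hasStatus`, and `retryable` are hypothetical stand-ins for the real context and helpers.

```typescript
// Self-contained model of condition composition; not ai-retry's implementation.
type Ctx = { status?: number; isRetryable?: boolean };
type Predicate = (ctx: Ctx) => boolean;

const and = (...ps: Predicate[]): Predicate => (ctx) => ps.every((p) => p(ctx));
const or = (...ps: Predicate[]): Predicate => (ctx) => ps.some((p) => p(ctx));
const not = (p: Predicate): Predicate => (ctx) => !p(ctx);

// Hypothetical leaf predicates standing in for helpers such as `httpStatus(...)`.
const hasStatus = (code: number): Predicate => (ctx) => ctx.status === code;
const retryable: Predicate = (ctx) => ctx.isRetryable === true;

// "Overloaded or unavailable, and not flagged as retryable."
const shouldSwitch = and(or(hasStatus(529), hasStatus(503)), not(retryable));

console.log(shouldSwitch({ status: 503, isRetryable: false })); // true
console.log(shouldSwitch({ status: 503, isRetryable: true })); // false
console.log(shouldSwitch({ status: 400 })); // false
```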
1282
906
 
1283
- The `RetryContext` object contains information about the current attempt and all previous attempts.
907
+ #### `Retryable`
1284
908
 
1285
- ```typescript
1286
- interface RetryContext {
1287
- current: RetryAttempt;
1288
- attempts: Array<RetryAttempt>;
1289
- }
909
+ A `Retryable` is a function that receives a `RetryContext` and returns a `Retry` (to fire) or `undefined` (to skip).
910
+
911
+ ```ts
912
+ type Retryable<MODEL> = (
913
+ context: RetryContext<MODEL>,
914
+ ) => Retry<MODEL> | Promise<Retry<MODEL> | undefined> | undefined;
1290
915
  ```
1291
916
 
1292
- #### `SuccessContext`
917
+ The `.switch()` and `.retry()` actions return `Retryable<MODEL>` for you. Hand-written retryables are still supported when the condition helpers aren't a fit.
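A hand-written retryable might look like the following sketch. The types are trimmed local stand-ins for the documented ones so the block compiles on its own; the `fallback` descriptor and the `statusCode` property read off the error are assumptions made for illustration.

```typescript
// Local stand-ins for the documented types; real code would use ai-retry's types.
type Model = { provider: string; modelId: string };
type Attempt =
  | { type: 'error'; error: unknown; model: Model }
  | { type: 'result'; result: unknown; model: Model };
type RetryContext = { current: Attempt; attempts: Attempt[] };
type Retry = { model: Model; maxAttempts?: number; delay?: number };
type Retryable = (context: RetryContext) => Retry | undefined;

// Hypothetical model descriptors.
const primary: Model = { provider: 'example', modelId: 'primary-model' };
const fallback: Model = { provider: 'example', modelId: 'fallback-model' };

// Switch to the fallback when the failing error carries HTTP status 429.
// Reading `statusCode` off the error is an assumption about its shape.
const onRateLimit: Retryable = ({ current }) => {
  if (current.type !== 'error') return undefined; // skip result attempts
  const status = (current.error as { statusCode?: number } | null)?.statusCode;
  return status === 429 ? { model: fallback, maxAttempts: 1, delay: 1_000 } : undefined;
};

const rateLimited: Attempt = { type: 'error', error: { statusCode: 429 }, model: primary };
console.log(onRateLimit({ current: rateLimited, attempts: [] })?.model.modelId); // fallback-model
```

Returning `undefined` lets the next entry in `retries` decide, which keeps each retryable focused on one failure mode.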
1293
918
 
1294
- The `SuccessContext` object is passed to the `onSuccess` callback after a successful request.
919
+ #### `Retry`
1295
920
 
1296
- ```typescript
1297
- interface SuccessContext {
1298
- current: SuccessAttempt;
1299
- attempts: Array<RetryAttempt>;
921
+ ```ts
922
+ interface Retry<MODEL> {
923
+ model: MODEL;
924
+ maxAttempts?: number; // default: 1 for switch, 2 for retry
925
+ delay?: number; // ms before the attempt
926
+ backoffFactor?: number; // exponential multiplier
927
+ timeout?: number; // fresh AbortSignal.timeout() for this attempt
928
+ options?: RetryCallOptions<MODEL>;
1300
929
  }
1301
930
  ```
1302
931
 
1303
- #### `SuccessAttempt`
932
+ A `Retry` is the shape returned by a retryable (and accepted in static `retries: [...]` entries); it describes the next attempt.
1304
933
 
1305
- A `SuccessAttempt` represents the successful attempt with the model, result, and call options used. The `result` type depends on the model type.
934
+ #### `RetryContext`
1306
935
 
1307
- ```typescript
1308
- interface SuccessAttempt {
1309
- type: 'success';
1310
- model: LanguageModelV3 | EmbeddingModelV3 | ImageModelV3;
1311
- result:
1312
- | LanguageModelResult
1313
- | LanguageModelStream
1314
- | EmbeddingModelEmbed
1315
- | ImageModelGenerate;
1316
- options:
1317
- | LanguageModelV3CallOptions
1318
- | EmbeddingModelV3CallOptions
1319
- | ImageModelV3CallOptions;
936
+ ```ts
937
+ interface RetryContext<MODEL> {
938
+ current: RetryAttempt<MODEL>;
939
+ attempts: Array<RetryAttempt<MODEL>>;
1320
940
  }
1321
941
  ```
1322
942
 
@@ -1334,34 +954,45 @@ interface FailureContext {
1334
954
 
1335
955
  #### `RetryAttempt`
1336
956
 
1337
- A `RetryAttempt` represents a single attempt with a specific model, which can be either an error or a successful result that triggered a retry. Each attempt includes the call options that were used for that specific attempt. For retry attempts, this will reflect any overridden options from the retry configuration.
1338
-
1339
- ```typescript
1340
- // For language, embedding, and image models
1341
- type RetryAttempt =
957
+ ```ts
958
+ type RetryAttempt<MODEL> =
1342
959
  | {
1343
960
  type: 'error';
1344
961
  error: unknown;
1345
- model: LanguageModelV3 | EmbeddingModelV3 | ImageModelV3;
1346
- options:
1347
- | LanguageModelV3CallOptions
1348
- | EmbeddingModelV3CallOptions
1349
- | ImageModelV3CallOptions;
962
+ model: MODEL;
963
+ options: CallOptions<MODEL>;
1350
964
  }
1351
965
  | {
1352
966
  type: 'result';
1353
967
  result: LanguageModelResult;
1354
- model: LanguageModelV3;
1355
- options: LanguageModelV3CallOptions;
968
+ model: LanguageModel;
969
+ options: LanguageModelCallOptions;
1356
970
  };
1357
971
 
1358
- // Note: Result-based retries only apply to language models (both generate and stream paths). They do not apply to embedding or image models. For streaming, retries are only possible before any content has been emitted; once a text-delta flows through, the stream is committed.
1359
-
1360
- // Type guards for discriminating attempts
1361
972
  function isErrorAttempt(attempt: RetryAttempt): attempt is RetryErrorAttempt;
1362
973
  function isResultAttempt(attempt: RetryAttempt): attempt is RetryResultAttempt;
1363
974
  ```
1364
975
 
976
+ Result-based attempts only fire for language models (both generate and stream paths). They do not fire for embedding or image models. For streams, retries are only possible before any content has been emitted; once a content chunk flows through, the stream is committed.
977
+
978
+ `isErrorAttempt` and `isResultAttempt` are re-exported from the package root (`ai-retry`).
979
+
980
+ #### `SuccessContext`
981
+
982
+ ```ts
983
+ interface SuccessContext<MODEL> {
984
+ current: {
985
+ type: 'success';
986
+ model: MODEL;
987
+ result: Result<MODEL>;
988
+ options: CallOptions<MODEL>;
989
+ };
990
+ attempts: Array<RetryAttempt<MODEL>>;
991
+ }
992
+ ```
993
+
994
+ Passed to the `onSuccess` callback.
995
+
1365
996
  ### License
1366
997
 
1367
998
  MIT