ai-retry 1.10.0 → 1.11.1
- package/README.md +457 -826
- package/dist/{retryables-M5l_6w9k.mjs → conditions-BGoANmfr.mjs} +5 -5
- package/dist/{retryables-CPAbu_M3.mjs → conditions-CyJOeRZK.mjs} +4 -4
- package/dist/create-retryable-model-BIMStLIF.mjs +676 -0
- package/dist/create-retryable-model-CLCFZANp.mjs +244 -0
- package/dist/create-retryable-model-DEQ5jciq.mjs +247 -0
- package/dist/embedding-model/conditions/index.d.mts +14 -0
- package/dist/embedding-model/conditions/index.mjs +7 -0
- package/dist/embedding-model/index.d.mts +14 -0
- package/dist/embedding-model/index.mjs +6 -0
- package/dist/{guards-D8UJtxDK.mjs → guards-DtZgDqE3.mjs} +6 -1
- package/dist/image-model/conditions/index.d.mts +4 -0
- package/dist/image-model/conditions/index.mjs +4 -0
- package/dist/image-model/index.d.mts +14 -0
- package/dist/image-model/index.mjs +6 -0
- package/dist/{index-DaJrd4dN.d.mts → index-BkvvEDSr.d.mts} +6 -4
- package/dist/index-D3t1Xo_U.d.mts +28 -0
- package/dist/index.d.mts +34 -7
- package/dist/index.mjs +43 -2
- package/dist/language-model/conditions/index.d.mts +4 -0
- package/dist/language-model/conditions/index.mjs +4 -0
- package/dist/language-model/index.d.mts +14 -0
- package/dist/language-model/index.mjs +6 -0
- package/dist/{error-CaTT-xX8.mjs → not-C9pUKPO7.mjs} +69 -38
- package/dist/{error-B-rjhfG_.d.mts → or-CFcJxcaL.d.mts} +36 -27
- package/dist/retryables/index.d.mts +54 -18
- package/dist/retryables/index.mjs +50 -14
- package/dist/telemetry-CJFJzjTr.mjs +442 -0
- package/dist/{types-Dik-mH20.d.mts → types-B8qg3Yzx.d.mts} +23 -10
- package/package.json +8 -7
- package/dist/create-retryable-model-D36IQyOQ.mjs +0 -1564
- package/dist/experimental/embedding-model/index.d.mts +0 -8
- package/dist/experimental/embedding-model/index.mjs +0 -19
- package/dist/experimental/embedding-model/retryables/index.d.mts +0 -20
- package/dist/experimental/embedding-model/retryables/index.mjs +0 -7
- package/dist/experimental/image-model/index.d.mts +0 -8
- package/dist/experimental/image-model/index.mjs +0 -19
- package/dist/experimental/image-model/retryables/index.d.mts +0 -4
- package/dist/experimental/image-model/retryables/index.mjs +0 -4
- package/dist/experimental/language-model/index.d.mts +0 -11
- package/dist/experimental/language-model/index.mjs +0 -19
- package/dist/experimental/language-model/retryables/index.d.mts +0 -4
- package/dist/experimental/language-model/retryables/index.mjs +0 -4
- package/dist/index-ewZ5T6B2.d.mts +0 -34
- package/dist/{parse-retry-headers-CRxgluhe.mjs → parse-retry-headers-RPSiSNjf.mjs} +0 -0
package/README.md
CHANGED
|
@@ -11,108 +11,108 @@
|
|
|
11
11
|
|
|
12
12
|
Automatically handle API failures, content filtering, timeouts and other errors by switching between different AI models and providers.
|
|
13
13
|
|
|
14
|
-
`ai-retry` wraps
|
|
14
|
+
`ai-retry` wraps a base model with a list of typed retry **conditions**. When a request fails with an error, or the response is unsatisfactory, it walks the conditions top-down to find a suitable fallback. It tracks which models have been tried and how many attempts have been made to prevent infinite loops.
|
|
15
15
|
|
|
16
|
-
|
|
16
|
+
Two retry shapes are supported:
|
|
17
17
|
|
|
18
|
-
- Error-based
|
|
19
|
-
- Result-based
|
|
18
|
+
- **Error-based**: the model throws (timeouts, rate limits, API errors).
|
|
19
|
+
- **Result-based**: the model returns a successful response that still needs retrying (content filtering, schema mismatch, etc.).
|
|
20
20
|
|
|
21
21
|
### Installation
|
|
22
22
|
|
|
23
|
-
|
|
24
|
-
|
|
25
|
-
> [!WARNING]
|
|
23
|
+
> [!NOTE]
|
|
26
24
|
> Version compatibility:
|
|
27
25
|
>
|
|
28
|
-
> -
|
|
29
|
-
> -
|
|
26
|
+
> - `ai-retry@0.x` — AI SDK v5
|
|
27
|
+
> - `ai-retry@1.x` — AI SDK v6
|
|
28
|
+
> - `ai-retry@beta` — AI SDK v7 (beta, see the [`ai-sdk-v7` branch](https://github.com/zirkelc/ai-retry/tree/ai-sdk-v7))
|
|
30
29
|
|
|
31
30
|
```bash
|
|
32
|
-
|
|
33
|
-
|
|
31
|
+
npm install ai-retry
|
|
32
|
+
```
|
|
34
33
|
|
|
35
|
-
|
|
36
|
-
|
|
34
|
+
A beta release for AI SDK v7 is available on the [`ai-sdk-v7` branch](https://github.com/zirkelc/ai-retry/tree/ai-sdk-v7). Install it with the `beta` tag:
|
|
35
|
+
|
|
36
|
+
```bash
|
|
37
|
+
npm install ai-retry@beta
|
|
37
38
|
```
|
|
38
39
|
|
|
39
40
|
### Usage
|
|
40
41
|
|
|
41
|
-
Create a retryable model by providing a base model and a list of retryables or fallback models.
|
|
42
|
-
When an error occurs, it will evaluate each retryable in order and use the first one that indicates a retry should be attempted with a different model.
|
|
43
|
-
|
|
44
42
|
> [!NOTE]
|
|
45
|
-
>
|
|
43
|
+
> **The condition API is the recommended way to configure retries.** Existing code keeps working:
|
|
44
|
+
>
|
|
45
|
+
> - The root `createRetryable` export and the function-style retryables (`contentFilterTriggered`, `requestTimeout`, …) are **deprecated but still functional**. Prefer `createRetryableModel` from `ai-retry/<family>-model` — it is typed for that family and resolves gateway strings for it.
|
|
46
|
+
> - The previously experimental `ai-retry/experimental/*` import paths were removed; the same API now ships at `ai-retry/<family>-model`.
|
|
47
|
+
>
|
|
48
|
+
> See the [migration guide](./MIGRATION.md) to move existing code to the condition API.
|
|
49
|
+
|
|
50
|
+
Create a retryable model with a base model and a list of conditions plus the action to take when a condition matches.
|
|
46
51
|
|
|
47
52
|
```typescript
|
|
53
|
+
import { anthropic } from '@ai-sdk/anthropic';
|
|
48
54
|
import { openai } from '@ai-sdk/openai';
|
|
49
|
-
import { generateText
|
|
50
|
-
import {
|
|
55
|
+
import { generateText } from 'ai';
|
|
56
|
+
import {
|
|
57
|
+
createRetryableModel,
|
|
58
|
+
error,
|
|
59
|
+
finishReason,
|
|
60
|
+
httpStatus,
|
|
61
|
+
} from 'ai-retry/language-model';
|
|
51
62
|
|
|
52
|
-
|
|
53
|
-
|
|
54
|
-
// Base model
|
|
55
|
-
model: openai('gpt-4-mini'),
|
|
63
|
+
const retryableModel = createRetryableModel({
|
|
64
|
+
model: openai('gpt-4o'),
|
|
56
65
|
retries: [
|
|
57
|
-
//
|
|
66
|
+
// Fall back to a different model on HTTP 529 or any "overloaded" message
|
|
67
|
+
httpStatus(529, 'overloaded').switch({
|
|
68
|
+
model: anthropic('claude-sonnet-4-0'),
|
|
69
|
+
}),
|
|
70
|
+
|
|
71
|
+
// Fall back when the response was content-filtered
|
|
72
|
+
finishReason('content-filter').switch({ model: openai('gpt-4o-mini') }),
|
|
73
|
+
|
|
74
|
+
// Retry the same model with exponential backoff on retryable errors
|
|
75
|
+
error.isRetryable(true).retry({ delay: 1_000, backoffFactor: 2 }),
|
|
58
76
|
],
|
|
59
77
|
});
|
|
60
78
|
|
|
61
|
-
// Use like any other AI SDK model
|
|
62
79
|
const result = await generateText({
|
|
63
80
|
model: retryableModel,
|
|
64
81
|
prompt: 'Hello world!',
|
|
65
82
|
});
|
|
66
83
|
|
|
67
84
|
console.log(result.text);
|
|
68
|
-
|
|
69
|
-
// Or with streaming
|
|
70
|
-
const result = streamText({
|
|
71
|
-
model: retryableModel,
|
|
72
|
-
prompt: 'Write a story about a robot...',
|
|
73
|
-
});
|
|
74
|
-
|
|
75
|
-
for await (const chunk of result.textStream) {
|
|
76
|
-
console.log(chunk.text);
|
|
77
|
-
}
|
|
78
85
|
```
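The last entry in the example above retries the same model with `delay: 1_000` and `backoffFactor: 2`. As a rough sketch of what exponential backoff means for those two numbers, the snippet below assumes the wait before each retry is `delay * backoffFactor^n`. That formula, and the `backoffDelays` helper itself, are assumptions for illustration; this diff does not show how `ai-retry` computes the delay internally.

```typescript
// Hypothetical helper, not part of ai-retry: lists the wait before each retry
// under the assumed formula delay * backoffFactor^attemptIndex.
function backoffDelays(
  delay: number,
  backoffFactor: number,
  maxAttempts: number,
): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    delays.push(delay * backoffFactor ** attempt);
  }
  return delays;
}

// With delay: 1_000 and backoffFactor: 2, three retries would wait 1s, 2s, 4s.
const exampleDelays = backoffDelays(1_000, 2, 3); // [1000, 2000, 4000]
```

A factor of `1` keeps the delay constant, which is the plain fixed-delay case.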
|
|
79
86
|
|
|
80
|
-
This also works with embedding models:
|
|
87
|
+
This also works with embedding models and image models, each through their own entry point:
|
|
81
88
|
|
|
82
89
|
```typescript
|
|
83
90
|
import { openai } from '@ai-sdk/openai';
|
|
84
91
|
import { embed } from 'ai';
|
|
85
|
-
import {
|
|
92
|
+
import { createRetryableModel, httpStatus } from 'ai-retry/embedding-model';
|
|
86
93
|
|
|
87
|
-
|
|
88
|
-
const retryableModel = createRetryable({
|
|
89
|
-
// Base model
|
|
94
|
+
const retryableModel = createRetryableModel({
|
|
90
95
|
model: openai.textEmbedding('text-embedding-3-large'),
|
|
91
96
|
retries: [
|
|
92
|
-
|
|
97
|
+
httpStatus(529).switch({
|
|
98
|
+
model: openai.textEmbedding('text-embedding-3-small'),
|
|
99
|
+
}),
|
|
93
100
|
],
|
|
94
101
|
});
|
|
95
102
|
|
|
96
|
-
|
|
97
|
-
const result = await embed({
|
|
98
|
-
model: retryableModel,
|
|
99
|
-
value: 'Hello world!',
|
|
100
|
-
});
|
|
101
|
-
|
|
102
|
-
console.log(result.embedding);
|
|
103
|
+
const result = await embed({ model: retryableModel, value: 'Hello world!' });
|
|
103
104
|
```
|
|
104
105
|
|
|
105
|
-
This also works with image models:
|
|
106
|
-
|
|
107
106
|
```typescript
|
|
107
|
+
import { google } from '@ai-sdk/google';
|
|
108
108
|
import { openai } from '@ai-sdk/openai';
|
|
109
109
|
import { generateImage } from 'ai';
|
|
110
|
-
import {
|
|
110
|
+
import { createRetryableModel, noImage } from 'ai-retry/image-model';
|
|
111
111
|
|
|
112
|
-
const retryableModel =
|
|
112
|
+
const retryableModel = createRetryableModel({
|
|
113
113
|
model: openai.image('dall-e-3'),
|
|
114
114
|
retries: [
|
|
115
|
-
|
|
115
|
+
noImage().switch({ model: google.image('gemini-3-pro-image-preview') }),
|
|
116
116
|
],
|
|
117
117
|
});
|
|
118
118
|
|
|
@@ -120,805 +120,463 @@ const result = await generateImage({
|
|
|
120
120
|
model: retryableModel,
|
|
121
121
|
prompt: 'A sunset over mountains',
|
|
122
122
|
});
|
|
123
|
+
```
|
|
124
|
+
|
|
125
|
+
#### Entry points
|
|
126
|
+
|
|
127
|
+
Pick the entry point that matches the model you pass to `createRetryableModel`. Each module exposes the helpers that make sense for that model family, already typed for it, so no manual type annotations are needed.
|
|
128
|
+
|
|
129
|
+
| Entry point | For models passed to |
|
|
130
|
+
| -------------------------- | -------------------------------------------------------------- |
|
|
131
|
+
| `ai-retry/language-model` | `generateText`, `generateObject`, `streamText`, `streamObject` |
|
|
132
|
+
| `ai-retry/embedding-model` | `embed`, `embedMany` |
|
|
133
|
+
| `ai-retry/image-model` | `generateImage` |
|
|
134
|
+
|
|
135
|
+
```typescript
|
|
136
|
+
import { createRetryableModel } from 'ai-retry/language-model';
|
|
137
|
+
import { createRetryableModel } from 'ai-retry/image-model';
|
|
138
|
+
import { createRetryableModel } from 'ai-retry/embedding-model';
|
|
139
|
+
```
|
|
140
|
+
|
|
141
|
+
Each entry point re-exports `createRetryableModel` plus every condition for that family. The condition helpers can also be imported from the dedicated `/conditions` subpath:
|
|
123
142
|
|
|
124
|
-
|
|
143
|
+
```typescript
|
|
144
|
+
import {
|
|
145
|
+
error,
|
|
146
|
+
httpStatus,
|
|
147
|
+
finishReason,
|
|
148
|
+
} from 'ai-retry/language-model/conditions';
|
|
149
|
+
// or
|
|
150
|
+
import * as conditions from 'ai-retry/language-model/conditions';
|
|
125
151
|
```
|
|
126
152
|
|
|
127
153
|
#### Vercel AI Gateway
|
|
128
154
|
|
|
129
|
-
You can
|
|
155
|
+
You can pass a model as a string and it will be resolved through the default `gateway` [provider instance](https://ai-sdk.dev/providers/ai-sdk-providers/ai-gateway#provider-instance) from the AI SDK. Each entry point resolves strings to its own model family, so the string is typed against that family's gateway model ids.
|
|
130
156
|
|
|
131
157
|
```typescript
|
|
132
158
|
import { gateway } from 'ai';
|
|
133
|
-
import {
|
|
159
|
+
import { createRetryableModel } from 'ai-retry/language-model';
|
|
134
160
|
|
|
135
|
-
const retryableModel =
|
|
161
|
+
const retryableModel = createRetryableModel({
|
|
136
162
|
model: 'openai/gpt-5',
|
|
137
163
|
retries: ['anthropic/claude-sonnet-4'],
|
|
138
164
|
});
|
|
139
165
|
|
|
140
166
|
// Is the same as:
|
|
141
|
-
const
|
|
167
|
+
const retryableModel2 = createRetryableModel({
|
|
142
168
|
model: gateway('openai/gpt-5'),
|
|
143
169
|
retries: [gateway('anthropic/claude-sonnet-4')],
|
|
144
170
|
});
|
|
145
171
|
```
|
|
146
172
|
|
|
147
|
-
|
|
173
|
+
Embedding and image entry points accept gateway strings too, resolved against their respective families:
|
|
148
174
|
|
|
149
175
|
```typescript
|
|
150
|
-
import {
|
|
151
|
-
import { createRetryable } from 'ai-retry';
|
|
176
|
+
import { createRetryableModel } from 'ai-retry/embedding-model';
|
|
152
177
|
|
|
153
|
-
const
|
|
154
|
-
model:
|
|
178
|
+
const retryableEmbedding = createRetryableModel({
|
|
179
|
+
model: 'openai/text-embedding-3-large',
|
|
180
|
+
retries: ['openai/text-embedding-3-small'],
|
|
155
181
|
});
|
|
156
182
|
```
|
|
157
183
|
|
|
158
|
-
### Retryables
|
|
159
|
-
|
|
160
|
-
The objects passed to `retries` are called retryables and control the retry behavior. We can distinguish between two types of retryables:
|
|
161
|
-
|
|
162
|
-
- **Static retryables** are simply model instances (language or embedding) that will always be used when an error occurs. They are also called fallback models.
|
|
163
|
-
- **Dynamic retryables** are functions that receive the current attempt context (error/result and previous attempts) and decide whether to retry with a different model based on custom logic.
|
|
164
|
-
|
|
165
|
-
You can think of the `retries` array as a big `if-else` block, where each dynamic retryable is an `if` branch that can match a certain error/result condition, and static retryables are the `else` branches that match all other conditions. The analogy is not perfect, because the order of retryables matters: `retries` are evaluated in order until one matches:
|
|
166
|
-
|
|
167
184
|
```typescript
|
|
168
|
-
import {
|
|
169
|
-
import { createRetryable } from 'ai-retry';
|
|
170
|
-
|
|
171
|
-
const retryableModel = createRetryable({
|
|
172
|
-
// Base model
|
|
173
|
-
model: openai('gpt-4'),
|
|
174
|
-
// Retryables are evaluated top-down in order
|
|
175
|
-
retries: [
|
|
176
|
-
// Dynamic retryables act like if-branches:
|
|
177
|
-
// If error.code == 429 (too many requests) happens, retry with this model
|
|
178
|
-
(context) => {
|
|
179
|
-
return context.current.error.statusCode === 429
|
|
180
|
-
? { model: azure('gpt-4-mini') } // Retry
|
|
181
|
-
: undefined; // Skip
|
|
182
|
-
},
|
|
185
|
+
import { createRetryableModel } from 'ai-retry/image-model';
|
|
183
186
|
|
|
184
|
-
|
|
185
|
-
|
|
186
|
-
|
|
187
|
-
? { model: azure('gpt-4-mini') } // Retry
|
|
188
|
-
: undefined; // Skip
|
|
189
|
-
},
|
|
190
|
-
|
|
191
|
-
// Static retryables act like else branches:
|
|
192
|
-
// Else, always fallback to this model
|
|
193
|
-
anthropic('claude-3-haiku-20240307'),
|
|
194
|
-
// Same as:
|
|
195
|
-
// { model: anthropic('claude-3-haiku-20240307'), maxAttempts: 1 }
|
|
196
|
-
],
|
|
187
|
+
const retryableImage = createRetryableModel({
|
|
188
|
+
model: 'google/imagen-4.0-generate-001',
|
|
189
|
+
retries: ['google/imagen-4.0-fast-generate-001'],
|
|
197
190
|
});
|
|
198
191
|
```
|
|
199
192
|
|
|
200
|
-
|
|
193
|
+
### Retries
|
|
201
194
|
|
|
202
|
-
|
|
195
|
+
The `retries` array holds the things `ai-retry` tries, in order, when a request fails or a result needs retrying. There are two kinds:
|
|
203
196
|
|
|
204
|
-
|
|
197
|
+
- **Fallbacks** are model instances (or gateway strings). They always match and are used as plain fallbacks.
|
|
198
|
+
- **Conditions** are typed predicates produced by helpers like `error()` or `httpStatus()` and finalized with a `.switch()` or `.retry()` action. They only fire when their predicate matches.
|
|
205
199
|
|
|
206
|
-
|
|
207
|
-
- **Result-based retryables** handle successful responses that still need retrying (e.g., content filtering, guardrails, etc.)
|
|
208
|
-
|
|
209
|
-
Both types of retryables have the same interface and receive the current attempt as context. You can use the `isErrorAttempt` and `isResultAttempt` type guards to check the type of the current attempt.
|
|
200
|
+
You can think of `retries` as a big `if-else` chain — each condition is an `if` branch matching some error/result, and each fallback is an `else` branch matching anything left over. Order matters: the array is evaluated top-down until one matches.
|
|
210
201
|
|
|
211
202
|
```typescript
|
|
212
|
-
import {
|
|
213
|
-
import {
|
|
214
|
-
import
|
|
215
|
-
|
|
216
|
-
|
|
217
|
-
|
|
218
|
-
|
|
219
|
-
|
|
220
|
-
// The request threw an error - e.g., network timeout, 429 rate limit
|
|
221
|
-
console.log('Request failed with error:', error);
|
|
222
|
-
return { model: anthropic('claude-3-haiku-20240307') };
|
|
223
|
-
}
|
|
224
|
-
return undefined;
|
|
225
|
-
};
|
|
226
|
-
|
|
227
|
-
// Result-based retryable: handles successful responses that need retrying
|
|
228
|
-
const resultBasedRetry: Retryable = (context) => {
|
|
229
|
-
if (isResultAttempt(context.current)) {
|
|
230
|
-
const { result } = context.current;
|
|
231
|
-
// The request succeeded, but the response indicates a problem
|
|
232
|
-
if (result.finishReason.unified === 'content-filter') {
|
|
233
|
-
console.log('Content was filtered, trying different model');
|
|
234
|
-
return { model: openai('gpt-4') };
|
|
235
|
-
}
|
|
236
|
-
}
|
|
237
|
-
return undefined;
|
|
238
|
-
};
|
|
203
|
+
import { anthropic } from '@ai-sdk/anthropic';
|
|
204
|
+
import { azure } from '@ai-sdk/azure';
|
|
205
|
+
import { openai } from '@ai-sdk/openai';
|
|
206
|
+
import {
|
|
207
|
+
createRetryableModel,
|
|
208
|
+
error,
|
|
209
|
+
httpStatus,
|
|
210
|
+
} from 'ai-retry/language-model';
|
|
239
211
|
|
|
240
|
-
const retryableModel =
|
|
241
|
-
model:
|
|
212
|
+
const retryableModel = createRetryableModel({
|
|
213
|
+
model: openai('gpt-4'),
|
|
242
214
|
retries: [
|
|
243
|
-
//
|
|
244
|
-
|
|
215
|
+
// Condition: match HTTP 429 (rate limit)
|
|
216
|
+
httpStatus(429).switch({ model: azure('gpt-4-mini') }),
|
|
245
217
|
|
|
246
|
-
//
|
|
247
|
-
|
|
218
|
+
// Condition: match "overloaded" in the error message
|
|
219
|
+
error.message('overloaded').switch({ model: azure('gpt-4-mini') }),
|
|
220
|
+
|
|
221
|
+
// Fallback: switch to Anthropic for anything else
|
|
222
|
+
anthropic('claude-3-haiku-20240307'),
|
|
223
|
+
// Same as:
|
|
224
|
+
// { model: anthropic('claude-3-haiku-20240307'), maxAttempts: 1 }
|
|
248
225
|
],
|
|
249
226
|
});
|
|
250
227
|
```
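The `if-else` analogy above can be made concrete with a small model of the evaluation loop. This is a simplified sketch, not the library's implementation: the `Attempt` and `Entry` types, the `pickNextModel` function, and the plain-string model ids are all hypothetical stand-ins, and skipping already-tried models is an assumed simplification of the attempt tracking the README describes.

```typescript
// Simplified stand-ins for illustration; ai-retry's real types are richer.
type Attempt =
  | { kind: 'error'; statusCode?: number; message: string }
  | { kind: 'result'; finishReason: string };

type Entry =
  | { kind: 'condition'; matches: (attempt: Attempt) => boolean; model: string }
  | { kind: 'fallback'; model: string };

// Walk the entries top-down and return the model of the first one that
// applies. Fallbacks always match; models already tried are skipped.
function pickNextModel(
  retries: Entry[],
  attempt: Attempt,
  tried: Set<string>,
): string | undefined {
  for (const entry of retries) {
    if (tried.has(entry.model)) continue;
    if (entry.kind === 'fallback' || entry.matches(attempt)) return entry.model;
  }
  return undefined; // nothing matched: the caller would surface the failure
}

const retries: Entry[] = [
  {
    kind: 'condition',
    model: 'azure/gpt-4-mini',
    matches: (a) => a.kind === 'error' && a.statusCode === 429,
  },
  {
    kind: 'condition',
    model: 'azure/gpt-4-mini',
    matches: (a) =>
      a.kind === 'error' && a.message.toLowerCase().includes('overloaded'),
  },
  { kind: 'fallback', model: 'anthropic/claude-3-haiku' },
];

const rateLimited: Attempt = { kind: 'error', statusCode: 429, message: 'Too many requests' };
const filtered: Attempt = { kind: 'result', finishReason: 'content-filter' };

const afterRateLimit = pickNextModel(retries, rateLimited, new Set<string>()); // 'azure/gpt-4-mini'
const afterFilter = pickNextModel(retries, filtered, new Set<string>()); // 'anthropic/claude-3-haiku'
```

The second call shows the `else` branch: neither condition matches a successful but content-filtered result, so the plain fallback is chosen.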
|
|
251
228
|
|
|
252
|
-
Result-based retryables apply to language models for both generate (`generateText`, `generateObject`) and streaming (`streamText`, `streamObject`) calls. For streams, the retry decision happens when the upstream `finish` part arrives and only fires if no content has been emitted yet, so behavior like `finishReason: 'content-filter'` on an otherwise empty response can still trigger a fallback. Once any content chunk has been forwarded, the stream is committed and result-based retries are skipped.
|
|
253
|
-
|
|
254
229
|
#### Fallbacks
|
|
255
230
|
|
|
256
|
-
|
|
257
|
-
|
|
258
|
-
> [!NOTE]
|
|
259
|
-
> Use the object syntax `{ model: openai('gpt-4') }` if you need to provide additional options like `maxAttempts`, `delay`, etc.
|
|
231
|
+
A fallback is a plain model instance (or gateway string) in `retries`. It always matches, so it acts as a catch-all: when no earlier condition fired, the next fallback model is tried. Each fallback is attempted once by default; use the object form to pass options like `maxAttempts`.
|
|
260
232
|
|
|
261
233
|
```typescript
|
|
234
|
+
import { anthropic } from '@ai-sdk/anthropic';
|
|
262
235
|
import { openai } from '@ai-sdk/openai';
|
|
263
|
-
import {
|
|
264
|
-
import { createRetryable } from 'ai-retry';
|
|
236
|
+
import { createRetryableModel } from 'ai-retry/language-model';
|
|
265
237
|
|
|
266
|
-
const retryableModel =
|
|
267
|
-
|
|
268
|
-
model: openai('gpt-4-mini'),
|
|
269
|
-
// List of fallback models
|
|
238
|
+
const retryableModel = createRetryableModel({
|
|
239
|
+
model: openai('gpt-4o'),
|
|
270
240
|
retries: [
|
|
271
|
-
openai('gpt-
|
|
272
|
-
//
|
|
273
|
-
// { model: openai('gpt-3.5-turbo'), maxAttempts: 1 },
|
|
241
|
+
openai('gpt-4o-mini'), // first fallback
|
|
242
|
+
anthropic('claude-3-haiku-20240307'), // second fallback
|
|
274
243
|
|
|
275
|
-
|
|
276
|
-
|
|
277
|
-
// { model: anthropic('claude-3-haiku-20240307'), maxAttempts: 1 },
|
|
244
|
+
// Object form to pass options:
|
|
245
|
+
{ model: anthropic('claude-3-haiku-20240307'), maxAttempts: 2 },
|
|
278
246
|
],
|
|
279
247
|
});
|
|
280
248
|
```
|
|
281
249
|
|
|
282
|
-
|
|
283
|
-
|
|
284
|
-
#### Custom
|
|
285
|
-
|
|
286
|
-
If you need more control over when to retry and which model to use, you can create your own custom retryable. This function is called with a context object containing the current attempt (error or result) and all previous attempts and needs to return a retry model or `undefined` to skip to the next retryable. The object you return from the retryable function is the same as the one you provide in the `retries` array.
|
|
287
|
-
|
|
288
|
-
> [!NOTE]
|
|
289
|
-
> You can return additional options like `maxAttempts`, `delay`, etc. along with the model.
|
|
250
|
+
Fallbacks are tried in order. Once all of them are exhausted, a `RetryError` is thrown (see [All retries failed](#all-retries-failed)).
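To illustrate the "tried in order, then fail" behavior, here is a minimal sketch under stated assumptions. `AllAttemptsFailedError` is a local stand-in for the AI SDK's `RetryError`, `runWithFallbacks` is a hypothetical helper, and the loop is synchronous only to keep the example short; real model calls are asynchronous.

```typescript
// Local stand-in for illustration; the real RetryError comes from the AI SDK.
class AllAttemptsFailedError extends Error {
  readonly errors: unknown[];

  constructor(errors: unknown[]) {
    super(`All ${errors.length} attempts failed`);
    this.name = 'AllAttemptsFailedError';
    this.errors = errors;
  }
}

// Try the base model, then each fallback once, in order.
function runWithFallbacks<T>(models: string[], call: (model: string) => T): T {
  const errors: unknown[] = [];
  for (const model of models) {
    try {
      return call(model);
    } catch (err) {
      errors.push(err); // remember the failure and move on to the next model
    }
  }
  throw new AllAttemptsFailedError(errors);
}

// Hypothetical call that only succeeds for the last model in the list.
const answer = runWithFallbacks(
  ['gpt-4o', 'gpt-4o-mini', 'claude-3-haiku'],
  (model) => {
    if (model !== 'claude-3-haiku') throw new Error(`${model} unavailable`);
    return `answered by ${model}`;
  },
);
```

Collecting every failure in one error object lets the caller inspect what went wrong with each model instead of seeing only the last failure.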
|
|
290
251
|
|
|
291
|
-
|
|
292
|
-
> If you'd like the same flexibility with a typed, composable condition system, see [Experimental: Composable Conditions](#experimental-composable-conditions).
|
|
252
|
+
#### Conditions
|
|
293
253
|
|
|
294
|
-
|
|
295
|
-
import { anthropic } from '@ai-sdk/anthropic';
|
|
296
|
-
import { openai } from '@ai-sdk/openai';
|
|
297
|
-
import { APICallError } from 'ai';
|
|
298
|
-
import { createRetryable, isErrorAttempt } from 'ai-retry';
|
|
299
|
-
import type { Retryable } from 'ai-retry';
|
|
300
|
-
|
|
301
|
-
// Custom retryable that retries on rate limit errors (429)
|
|
302
|
-
const rateLimitRetry: Retryable = (context) => {
|
|
303
|
-
// Only handle error attempts
|
|
304
|
-
if (isErrorAttempt(context.current)) {
|
|
305
|
-
// Get the error from the current attempt
|
|
306
|
-
const { error } = context.current;
|
|
307
|
-
|
|
308
|
-
// Check for rate limit error
|
|
309
|
-
if (APICallError.isInstance(error) && error.statusCode === 429) {
|
|
310
|
-
// Retry with a different model
|
|
311
|
-
return { model: anthropic('claude-3-haiku-20240307') };
|
|
312
|
-
}
|
|
313
|
-
}
|
|
254
|
+
A `Condition` is a typed predicate over a `RetryContext`. The library ships two **low-level** builders (`error()` and `result()`) plus **high-level** helpers built on top of them. Every condition is finalized with one of two terminal actions, `.switch()` or `.retry()`, which turn it into a retryable.
|
|
314
255
|
|
|
315
|
-
|
|
316
|
-
return undefined;
|
|
317
|
-
};
|
|
318
|
-
|
|
319
|
-
const retryableModel = createRetryable({
|
|
320
|
-
// Base model
|
|
321
|
-
model: openai('gpt-4-mini'),
|
|
322
|
-
retries: [
|
|
323
|
-
// Use custom rate limit retryable
|
|
324
|
-
rateLimitRetry,
|
|
256
|
+
##### Universal conditions
|
|
325
257
|
|
|
326
|
-
|
|
327
|
-
],
|
|
328
|
-
});
|
|
329
|
-
```
|
|
258
|
+
These are available from all three entry points (`language-model`, `embedding-model`, `image-model`).
|
|
330
259
|
|
|
331
|
-
|
|
260
|
+
| Helper | Kind | Matches when |
|
|
261
|
+
| ------------------------------- | ---------- | ------------------------------------------------------------------------------ |
|
|
262
|
+
| `error(predicate)` | low-level | The current attempt failed and `predicate(err, ctx)` returns true |
|
|
263
|
+
| `error.isRetryable(flag)` | low-level | `APICallError.isRetryable === flag` (default `true`) |
|
|
264
|
+
| `error.statusCode(...patterns)` | low-level | Numbers match the status code exactly; regex matches the stringified code |
|
|
265
|
+
| `error.message(...patterns)` | low-level | Substring (case-insensitive) or regex match against the error message |
|
|
266
|
+
| `error.isTimeout()` | low-level | `Error.name === 'TimeoutError'` (`AbortSignal.timeout()` fired) |
|
|
267
|
+
| `error.isAbort()` | low-level | `Error.name === 'AbortError'` (manual `controller.abort()`) |
|
|
268
|
+
| `httpStatus(...patterns)` | high-level | Numbers match the status code; strings match the message; regex matches either |
|
|
269
|
+
| `timeout()` | high-level | Alias for `error.isTimeout()` |
|
|
270
|
+
| `aborted()` | high-level | Alias for `error.isAbort()` |
|
|
332
271
|
|
|
333
|
-
|
|
272
|
+
###### `error(predicate)`
|
|
334
273
|
|
|
335
|
-
|
|
336
|
-
If no retry was attempted (e.g. because all retryables returned `undefined`), the original error is thrown directly.
|
|
274
|
+
Takes any predicate over the failed attempt's error. Its namespace bundles the common matchers: `isRetryable` (defaults to `true`), `statusCode` (numbers or regex), `message` (case-insensitive substring or regex), and `isTimeout` / `isAbort` (match `AbortSignal.timeout()` firing vs a manual `controller.abort()`). The pattern matchers accept any number of patterns and match if any matches.
|
|
337
275
|
|
|
338
276
|
```typescript
|
|
339
|
-
import {
|
|
277
|
+
import { APICallError } from 'ai';
|
|
278
|
+
import { error } from 'ai-retry/language-model';
|
|
340
279
|
|
|
341
|
-
|
|
342
|
-
|
|
343
|
-
model: azure('gpt-4-mini'),
|
|
344
|
-
retries: [
|
|
345
|
-
// Fallback model 1 = Second attempt
|
|
346
|
-
openai('gpt-3.5-turbo'),
|
|
347
|
-
// Fallback model 2 = Third attempt
|
|
348
|
-
anthropic('claude-3-haiku-20240307'),
|
|
349
|
-
],
|
|
280
|
+
error((e) => APICallError.isInstance(e) && e.statusCode === 418).switch({
|
|
281
|
+
model: fallback,
|
|
350
282
|
});
|
|
351
283
|
|
|
352
|
-
|
|
353
|
-
|
|
354
|
-
model: retryableModel,
|
|
355
|
-
prompt: 'Hello world!',
|
|
356
|
-
});
|
|
357
|
-
} catch (error) {
|
|
358
|
-
// RetryError is an official AI SDK error
|
|
359
|
-
if (error instanceof RetryError) {
|
|
360
|
-
console.error('All retry attempts failed:', error.errors);
|
|
361
|
-
} else {
|
|
362
|
-
console.error('Request failed:', error);
|
|
363
|
-
}
|
|
364
|
-
}
|
|
365
|
-
```
|
|
366
|
-
|
|
367
|
-
Errors are tracked per unique model (provider + modelId). That means on the first error, it will retry with `gpt-3.5-turbo`. If that also fails, it will retry with `claude-3-haiku-20240307`. If that fails again, the whole retry process stops and a `RetryError` is thrown.
|
|
368
|
-
|
|
369
|
-
### Built-in Retryables
|
|
370
|
-
|
|
371
|
-
There are several built-in dynamic retryables available for common use cases:
|
|
284
|
+
error.isRetryable().switch({ model: fallback }); // defaults to true
|
|
285
|
+
error.isRetryable(false).switch({ model: fallback });
|
|
372
286
|
|
|
373
|
-
|
|
374
|
-
|
|
287
|
+
error.statusCode(503, 529).switch({ model: fallback });
|
|
288
|
+
error.statusCode(/^5\d\d$/).switch({ model: fallback }); // any 5xx
|
|
375
289
|
|
|
376
|
-
|
|
377
|
-
|
|
378
|
-
|
|
379
|
-
- [`contentFilterTriggered`](./src/retryables/content-filter-triggered.ts): Content filter was triggered based on the prompt or completion.
|
|
380
|
-
- [`requestTimeout`](./src/retryables/request-timeout.ts): Request timeout occurred.
|
|
381
|
-
- [`requestNotRetryable`](./src/retryables/request-not-retryable.ts): Request failed with a non-retryable error.
|
|
382
|
-
- [`retryAfterDelay`](./src/retryables/retry-after-delay.ts): Retry with delay and exponential backoff and respect `retry-after` headers.
|
|
383
|
-
- [`serviceOverloaded`](./src/retryables/service-overloaded.ts): Response with status code 529 (service overloaded).
|
|
384
|
-
- [`serviceUnavailable`](./src/retryables/service-unavailable.ts): Response with status code 503 (service unavailable).
|
|
385
|
-
- [`schemaMismatch`](./src/retryables/schema-mismatch.ts): Response JSON doesn't match the expected schema from structured output modes (`Output.object()`, `Output.array()`, `Output.choice()`).
|
|
386
|
-
- [`noImageGenerated`](./src/retryables/no-image-generated.ts): Image generation failed with `NoImageGeneratedError`.
|
|
290
|
+
error.message('overloaded').switch({ model: fallback }); // substring
|
|
291
|
+
error.message(/rate.?limit/i).switch({ model: fallback }); // regex
|
|
387
292
|
|
|
388
|
-
|
|
293
|
+
error.isTimeout().switch({ model: fallback }); // AbortSignal.timeout() fired
|
|
294
|
+
error.isAbort().switch({ model: fallback }); // manual controller.abort()
|
|
295
|
+
```
|
|
389
296
|
|
|
390
|
-
|
|
297
|
+
###### `httpStatus(...patterns)`
|
|
391
298
|
|
|
392
|
-
|
|
393
|
-
> For streaming requests this retryable can only fire if the content filter trips before any content has been emitted. Once a text chunk flows through, the stream is committed and the fallback is skipped.
|
|
299
|
+
Matches an `APICallError` by status code (numbers), message substring (strings), or either (regex). Mix any combination in one call.
|
|
394
300
|
|
|
395
301
|
```typescript
|
|
396
|
-
import {
|
|
302
|
+
import { httpStatus } from 'ai-retry/language-model';
|
|
397
303
|
|
|
398
|
-
|
|
399
|
-
|
|
400
|
-
|
|
401
|
-
contentFilterTriggered(openai('gpt-4-mini')), // Try OpenAI if Azure filters
|
|
402
|
-
],
|
|
403
|
-
});
|
|
304
|
+
httpStatus(429).switch({ model: fallback }); // status code
|
|
305
|
+
httpStatus(529, 'overloaded').switch({ model: fallback }); // status or message
|
|
306
|
+
httpStatus(/^5\d\d$/).switch({ model: fallback }); // any 5xx
|
|
404
307
|
```
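The matching rules for `httpStatus` (numbers against the status code, strings against the message, regex against either) can be sketched as a standalone function. This is an illustrative re-implementation under the rules stated above, not the library's code: `matchesHttpStatus` and `ApiErrorLike` are hypothetical names, and string matching is assumed to be a case-insensitive substring check as described for `error.message`.

```typescript
type Pattern = number | string | RegExp;

// Minimal shape of an API error for this sketch (the AI SDK's APICallError
// carries more fields).
interface ApiErrorLike {
  statusCode?: number;
  message: string;
}

// Returns true if any pattern matches, following the rules described above.
function matchesHttpStatus(err: ApiErrorLike, ...patterns: Pattern[]): boolean {
  const code = err.statusCode === undefined ? '' : String(err.statusCode);
  return patterns.some((pattern) => {
    if (typeof pattern === 'number') return err.statusCode === pattern;
    if (typeof pattern === 'string') {
      return err.message.toLowerCase().includes(pattern.toLowerCase());
    }
    // Regex: test against the stringified status code or the message.
    return (code !== '' && pattern.test(code)) || pattern.test(err.message);
  });
}

const overloaded: ApiErrorLike = { statusCode: 529, message: 'Service Overloaded' };

const byOtherCode = matchesHttpStatus(overloaded, 429); // false
const byMessage = matchesHttpStatus(overloaded, 429, 'overloaded'); // true (message substring)
const byRegex = matchesHttpStatus(overloaded, /^5\d\d$/); // true (status code)
```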
|
|
405
308
|
|
|
406
|
-
|
|
407
|
-
|
|
408
|
-
Handle timeouts by switching to potentially faster models.
|
|
409
|
-
|
|
410
|
-
> [!NOTE]
|
|
411
|
-
> You need to use an `abortSignal` with a timeout on your request.
|
|
309
|
+
###### `timeout()`
|
|
412
310
|
|
|
413
|
-
|
|
311
|
+
Alias for `error.isTimeout()` — matches `AbortSignal.timeout()` firing (`Error.name === 'TimeoutError'`); pass a fresh `timeout` to the action so the fallback gets its own deadline.
|
|
414
312
|
|
|
415
313
|
```typescript
|
|
416
|
-
import {
|
|
417
|
-
|
|
418
|
-
const retryableModel = createRetryable({
|
|
419
|
-
model: azure('gpt-4'),
|
|
420
|
-
retries: [
|
|
421
|
-
// Defaults to 60 seconds timeout for the retry attempt
|
|
422
|
-
requestTimeout(azure('gpt-4-mini')),
|
|
314
|
+
import { timeout } from 'ai-retry/language-model';
|
|
423
315
|
|
|
424
|
-
|
|
425
|
-
requestTimeout(azure('gpt-4-mini'), { timeout: 30_000 }),
|
|
426
|
-
],
|
|
427
|
-
});
|
|
428
|
-
|
|
429
|
-
const result = await generateText({
|
|
430
|
-
model: retryableModel,
|
|
431
|
-
prompt: 'Write a vegetarian lasagna recipe for 4 people.',
|
|
432
|
-
abortSignal: AbortSignal.timeout(60_000), // Original request timeout
|
|
433
|
-
});
|
|
316
|
+
timeout().switch({ model: fallback, timeout: 30_000 });
|
|
434
317
|
```
|
|
435
318
|
|
|
436
|
-
|
|
319
|
+
###### `aborted()`
|
|
437
320
|
|
|
438
|
-
|
|
321
|
+
Alias for `error.isAbort()` — matches a manual `controller.abort()` (`Error.name === 'AbortError'`).
|
|
439
322
|
|
|
440
323
|
```typescript
|
|
441
|
-
import {
|
|
324
|
+
import { aborted } from 'ai-retry/language-model';
|
|
442
325
|
|
|
443
|
-
|
|
444
|
-
model: anthropic('claude-sonnet-4-0'),
|
|
445
|
-
retries: [
|
|
446
|
-
// Retry with delay and exponential backoff
|
|
447
|
-
serviceOverloaded(anthropic('claude-sonnet-4-0'), {
|
|
448
|
-
delay: 5_000,
|
|
449
|
-
backoffFactor: 2,
|
|
450
|
-
maxAttempts: 5,
|
|
451
|
-
}),
|
|
452
|
-
// Or switch to a different provider
|
|
453
|
-
serviceOverloaded(openai('gpt-4')),
|
|
454
|
-
],
|
|
455
|
-
});
|
|
456
|
-
|
|
457
|
-
const result = streamText({
|
|
458
|
-
model: retryableModel,
|
|
459
|
-
prompt: 'Write a story about a robot...',
|
|
460
|
-
});
|
|
326
|
+
aborted().switch({ model: fallback });
|
|
461
327
|
```
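`timeout()` and `aborted()` differ only in which error name they look for. A minimal sketch of that distinction, using a hypothetical `classifyAbort` function and hand-built errors in place of a real aborted request:

```typescript
type AbortKind = 'timeout' | 'abort' | 'other';

// Classify an error by its `name`, mirroring the two checks described above.
function classifyAbort(err: unknown): AbortKind {
  if (typeof err !== 'object' || err === null) return 'other';
  const name = (err as { name?: unknown }).name;
  if (name === 'TimeoutError') return 'timeout';
  if (name === 'AbortError') return 'abort';
  return 'other';
}

// Hand-built errors stand in for what an aborted request would reject with.
const timedOut = new Error('deadline exceeded');
timedOut.name = 'TimeoutError';

const cancelled = new Error('cancelled by caller');
cancelled.name = 'AbortError';

const kinds = [timedOut, cancelled, new Error('boom')].map(classifyAbort);
// ['timeout', 'abort', 'other']
```

Keeping the two cases separate matters in practice: a timeout is often worth retrying on a faster model, while a manual abort usually means the caller no longer wants a response.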
|
|
462
328
|
|
|
463
|
-
|
|
329
|
+
Each high-level helper is a thin wrapper around the low-level ones. For example, `httpStatus(...)` composes `error.statusCode(...)` with `error.message(...)`, and `timeout()` / `aborted()` are aliases for `error.isTimeout()` / `error.isAbort()`.
|
|
464
330
|
|
|
465
|
-
|
|
331
|
+
##### Language model conditions
|
|
466
332
|
|
|
467
|
-
|
|
468
|
-
import { serviceUnavailable } from 'ai-retry/retryables';
|
|
333
|
+
Only available from `ai-retry/language-model`. Result-based conditions inspect a successful response (see [Streaming](#streaming) for how they behave on streams).
|
|
469
334
|
|
|
470
|
-
|
|
471
|
-
|
|
472
|
-
|
|
473
|
-
|
|
474
|
-
|
|
475
|
-
|
|
476
|
-
```
|
|
335
|
+
| Helper | Kind | Matches when |
|
|
336
|
+
| --------------------------------- | ---------- | --------------------------------------------------------------------- |
|
|
337
|
+
| `result(predicate)` | low-level | The current attempt succeeded and `predicate(res, ctx)` returns true |
|
|
338
|
+
| `result.finishReason(...reasons)` | low-level | The result's `finishReason.unified` matches one of the given values |
|
|
339
|
+
| `finishReason(...reasons)` | high-level | Same as `result.finishReason` (re-exported for convenience) |
|
|
340
|
+
| `schemaInvalid()` | high-level | The result text fails JSON-schema validation against `responseFormat` |
|
|
477
341
|
|
|
478
|
-
|
|
342
|
+
###### `result(predicate)`
|
|
479
343
|
|
|
480
|
-
|
|
344
|
+
Takes any predicate over the successful result. `result.finishReason(...reasons)` and the re-exported `finishReason(...reasons)` match the result's unified finish reason against one or more values.
|
|
481
345
|
|
|
482
346
|
```typescript
|
|
483
|
-
import {
|
|
484
|
-
import { google } from '@ai-sdk/google';
|
|
485
|
-
import { generateImage } from 'ai';
|
|
486
|
-
import { createRetryable } from 'ai-retry';
|
|
487
|
-
import { noImageGenerated } from 'ai-retry/retryables';
|
|
347
|
+
import { finishReason, result } from 'ai-retry/language-model';
|
|
488
348
|
|
|
489
|
-
|
|
490
|
-
model: openai.image('dall-e-3'),
|
|
491
|
-
retries: [
|
|
492
|
-
noImageGenerated(google.image('gemini-3-pro-image-preview')), // Switch to Gemini if DALL-E fails to generate an image
|
|
493
|
-
],
|
|
494
|
-
});
|
|
349
|
+
result((res) => res.usage.outputTokens.total === 0).switch({ model: fallback });
|
|
495
350
|
|
|
496
|
-
|
|
497
|
-
|
|
498
|
-
prompt: 'A sunset over mountains',
|
|
499
|
-
});
|
|
351
|
+
finishReason('content-filter').switch({ model: fallback });
|
|
352
|
+
finishReason('length', 'content-filter').retry({ maxAttempts: 3 });
|
|
500
353
|
```
|
|
501
354
|
|
|
502
|
-
|
|
355
|
+
###### `schemaInvalid()`
|
|
503
356
|
|
|
504
|
-
|
|
505
|
-
|
|
506
|
-
> [!NOTE]
|
|
507
|
-
> You can check if an error is retryable with the `isRetryable` property on an [`APICallError`](https://ai-sdk.dev/docs/reference/ai-sdk-errors/ai-api-call-error#ai_apicallerror).
|
|
357
|
+
Matches when the result text fails JSON-schema validation against the call's `responseFormat` (set automatically by `Output.object()`).
|
|
508
358
|
|
|
509
359
|
```typescript
|
|
510
|
-
import {
|
|
360
|
+
import { schemaInvalid } from 'ai-retry/language-model';
|
|
511
361
|
|
|
512
|
-
|
|
513
|
-
model: azure('gpt-4-mini'),
|
|
514
|
-
retries: [
|
|
515
|
-
requestNotRetryable(openai('gpt-4')), // Switch provider if error is not retryable
|
|
516
|
-
],
|
|
517
|
-
});
|
|
362
|
+
schemaInvalid().switch({ model: fallback });
|
|
518
363
|
```
|
|
519
364
|
|
|
520
|
-
|
|
365
|
+
##### Image model conditions
|
|
521
366
|
|
|
522
|
-
|
|
523
|
-
The delay and exponential backoff can be configured. If the response contains a [`retry-after`](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Retry-After) header, it will be prioritized over the configured delay.
|
|
367
|
+
Only available from `ai-retry/image-model`.
|
|
524
368
|
|
|
525
|
-
|
|
369
|
+
| Helper | Kind | Matches when |
|
|
370
|
+
| ----------- | ---------- | --------------------------------------------- |
|
|
371
|
+
| `noImage()` | high-level | The image model threw `NoImageGeneratedError` |
|
|
526
372
|
|
|
527
|
-
|
|
528
|
-
import { retryAfterDelay } from 'ai-retry/retryables';
|
|
373
|
+
###### `noImage()`
|
|
529
374
|
|
|
530
|
-
|
|
531
|
-
model: openai('gpt-4'), // Base model
|
|
532
|
-
retries: [
|
|
533
|
-
// Retry base model 3 times with fixed 2s delay
|
|
534
|
-
retryAfterDelay({ delay: 2_000, maxAttempts: 3 }),
|
|
375
|
+
Matches when the image model threw `NoImageGeneratedError`.
|
|
535
376
|
|
|
536
|
-
|
|
537
|
-
|
|
377
|
+
```typescript
|
|
378
|
+
import { noImage } from 'ai-retry/image-model';
|
|
538
379
|
|
|
539
|
-
|
|
540
|
-
retryAfterDelay({ maxAttempts: 3 }),
|
|
541
|
-
],
|
|
542
|
-
});
|
|
380
|
+
noImage().switch({ model: fallback });
|
|
543
381
|
```
|
|
544
382
|
|
|
545
|
-
|
|
383
|
+
##### Embedding model conditions
|
|
546
384
|
|
|
547
|
-
|
|
385
|
+
> [!NOTE]
|
|
386
|
+
> The `embedding-model` entry point exposes only the universal conditions — there are no embedding-specific result conditions.
|
|
548
387
|
|
|
549
|
-
|
|
388
|
+
#### Actions
|
|
550
389
|
|
|
551
|
-
|
|
552
|
-
Normally, schema validation happens outside the model in `generateText`, so a schema validation error would not be seen by the retryable model. This retryable catches it early and retries with a fallback model.
|
|
390
|
+
Every condition exposes two terminal actions that turn it into a retryable:
|
|
553
391
|
|
|
554
|
-
|
|
555
|
-
|
|
392
|
+
- **`.switch({ model, ...options })`** falls back to a different model when the condition matches. Optional fields (`maxAttempts`, `delay`, `backoffFactor`, `timeout`, `options`) are the same as on a normal `Retry` object. `maxAttempts` defaults to `1`.
|
|
393
|
+
- **`.retry({ delay?, backoffFactor?, maxAttempts?, ... })`** retries the **current** model when the condition matches. Honors `Retry-After` and `Retry-After-Ms` response headers, capped at 60 seconds. `maxAttempts` defaults to `2` (one original attempt + one retry); values below `2` throw, since the retry budget is consumed by the original failure.
|
|
556
394
|
|
|
557
395
|
```typescript
|
|
558
|
-
import {
|
|
559
|
-
import { anthropic } from '@ai-sdk/anthropic';
|
|
560
|
-
import { generateText, Output } from 'ai';
|
|
561
|
-
import { createRetryable } from 'ai-retry';
|
|
562
|
-
import { schemaMismatch } from 'ai-retry/retryables';
|
|
563
|
-
import { z } from 'zod';
|
|
564
|
-
|
|
565
|
-
const retryableModel = createRetryable({
|
|
566
|
-
model: openai('gpt-4-mini'), // Weak base model
|
|
567
|
-
retries: [
|
|
568
|
-
// Retry with stronger model on schema mismatch
|
|
569
|
-
schemaMismatch(openai('gpt-5')),
|
|
570
|
-
],
|
|
571
|
-
});
|
|
396
|
+
import { error, timeout } from 'ai-retry/language-model';
|
|
572
397
|
|
|
573
|
-
|
|
574
|
-
|
|
575
|
-
output: Output.object({
|
|
576
|
-
schema: z.object({
|
|
577
|
-
name: z.string(),
|
|
578
|
-
age: z.number(),
|
|
579
|
-
}),
|
|
580
|
-
}),
|
|
581
|
-
prompt: 'Generate a person with name and age.',
|
|
582
|
-
});
|
|
398
|
+
// Switch on a timeout, with a fresh timeout for the fallback
|
|
399
|
+
timeout().switch({ model: fallback, timeout: 30_000 });
|
|
583
400
|
|
|
584
|
-
|
|
401
|
+
// Retry the current model with exponential backoff, max 3 attempts
|
|
402
|
+
error
|
|
403
|
+
.isRetryable(true)
|
|
404
|
+
.retry({ delay: 1_000, backoffFactor: 2, maxAttempts: 3 });
|
|
585
405
|
```
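The `Retry-After` handling described for `.retry(...)` can be pictured with a small sketch. The helper names `parseNonNegative` and `resolveRetryDelay`, the choice to check `Retry-After-Ms` before `Retry-After`, and the decision to ignore HTTP-date values are assumptions made for illustration, not `ai-retry`'s actual implementation. Only the two header names and the 60-second cap come from the text above.

```typescript
// Hypothetical helpers: not part of ai-retry's public API.
const MAX_HEADER_DELAY_MS = 60_000; // the 60-second cap stated in the README

// Parses a header value into a non-negative number, or undefined if unusable.
function parseNonNegative(raw: string | undefined): number | undefined {
  if (raw === undefined || raw.trim() === '') return undefined;
  const value = Number(raw);
  return Number.isFinite(value) && value >= 0 ? value : undefined;
}

function resolveRetryDelay(
  headers: Record<string, string | undefined>,
  configuredDelayMs: number,
): number {
  // Assumption: the millisecond header is checked first.
  const ms = parseNonNegative(headers['retry-after-ms']);
  if (ms !== undefined) return Math.min(ms, MAX_HEADER_DELAY_MS);

  // Retry-After may also be an HTTP date; this sketch handles seconds only.
  const seconds = parseNonNegative(headers['retry-after']);
  if (seconds !== undefined) return Math.min(seconds * 1000, MAX_HEADER_DELAY_MS);

  // No usable header: fall back to the configured delay.
  return configuredDelayMs;
}
```

With these assumptions, a `Retry-After: 120` header resolves to 60 000 ms because of the cap, while a response without either header keeps the configured `delay`.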
|
|
586
406
|
|
|
587
|
-
|
|
588
|
-
|
|
589
|
-
> [!WARNING]
|
|
590
|
-
> This API is experimental and may change. It is not exported from the package root; opt in via one of the per-model deep imports:
|
|
591
|
-
>
|
|
592
|
-
> ```ts
|
|
593
|
-
> import { ... } from 'ai-retry/experimental/language-model';
|
|
594
|
-
> import { ... } from 'ai-retry/experimental/image-model';
|
|
595
|
-
> import { ... } from 'ai-retry/experimental/embedding-model';
|
|
596
|
-
> ```
|
|
597
|
-
>
|
|
598
|
-
> Each entry point also re-exports `createRetryable` already typed for that model family, so you can either import everything from one path:
|
|
599
|
-
>
|
|
600
|
-
> ```ts
|
|
601
|
-
> import {
|
|
602
|
-
> createRetryable,
|
|
603
|
-
> error,
|
|
604
|
-
> httpStatus,
|
|
605
|
-
> } from 'ai-retry/experimental/language-model';
|
|
606
|
-
> ```
|
|
607
|
-
>
|
|
608
|
-
> or pull retryables from the dedicated `/retryables` subpath:
|
|
609
|
-
>
|
|
610
|
-
> ```ts
|
|
611
|
-
> import {
|
|
612
|
-
> error,
|
|
613
|
-
> httpStatus,
|
|
614
|
-
> } from 'ai-retry/experimental/language-model/retryables';
|
|
615
|
-
> // or
|
|
616
|
-
> import * as retryables from 'ai-retry/experimental/language-model/retryables';
|
|
617
|
-
> ```
|
|
407
|
+
#### Combinators
|
|
618
408
|
|
|
619
|
-
|
|
409
|
+
Compose conditions with the top-level `or()`, `and()`, `not()` helpers. Because each entry point is typed for a single model family, they infer the family from their arguments — no type annotations or casts needed. `or()` and `and()` are variadic.
|
|
620
410
|
|
|
621
411
|
```typescript
|
|
622
|
-
import {
|
|
623
|
-
import { openai } from '@ai-sdk/openai';
|
|
624
|
-
import { generateText } from 'ai';
|
|
625
|
-
import {
|
|
626
|
-
createRetryable,
|
|
627
|
-
error,
|
|
628
|
-
finishReason,
|
|
629
|
-
httpStatus,
|
|
630
|
-
} from 'ai-retry/experimental/language-model';
|
|
631
|
-
|
|
632
|
-
const retryableModel = createRetryable({
|
|
633
|
-
model: openai('gpt-4'),
|
|
634
|
-
retries: [
|
|
635
|
-
// Switch on 529 or any "overloaded" message
|
|
636
|
-
httpStatus(529, 'overloaded').switch({
|
|
637
|
-
model: anthropic('claude-3-haiku-20240307'),
|
|
638
|
-
}),
|
|
639
|
-
|
|
640
|
-
// Switch when the response was content-filtered
|
|
641
|
-
finishReason('content-filter').switch({ model: openai('gpt-4o') }),
|
|
412
|
+
import { and, error, httpStatus, not, or } from 'ai-retry/language-model';
|
|
642
413
|
|
|
643
|
-
|
|
644
|
-
|
|
645
|
-
|
|
646
|
-
});
|
|
414
|
+
or(httpStatus(429), error.message('overloaded')).switch({ model: fallback });
|
|
415
|
+
and(httpStatus(503), error.message('temporary')).switch({ model: fallback });
|
|
416
|
+
not(error.isRetryable(true)).switch({ model: fallback });
|
|
647
417
|
```
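To make the combinator semantics concrete, here is a self-contained sketch that models a condition as a plain synchronous predicate. The `Condition` type, the `FakeError` shape, and the local `or` / `and` / `not` / `status` / `message` functions are simplified stand-ins written for this example. The real helpers also receive a context, may be `async`, and return objects with `.switch()` / `.retry()`.

```typescript
// Simplified stand-in for a condition: a predicate over the attempt's error.
type Condition<E> = (err: E) => boolean;

// Variadic, like the README's or() / and().
const or = <E>(...conds: Condition<E>[]): Condition<E> =>
  (err) => conds.some((c) => c(err));
const and = <E>(...conds: Condition<E>[]): Condition<E> =>
  (err) => conds.every((c) => c(err));
const not = <E>(cond: Condition<E>): Condition<E> =>
  (err) => !cond(err);

// Hypothetical error shape and matchers, for illustration only.
interface FakeError {
  statusCode: number;
  message: string;
}
const status = (code: number): Condition<FakeError> =>
  (e) => e.statusCode === code;
const message = (text: string): Condition<FakeError> =>
  (e) => e.message.toLowerCase().includes(text.toLowerCase());

const overloaded = or(status(429), message('overloaded'));
const temporary503 = and(status(503), message('temporary'));
const notRateLimited = not(status(429));
```

Here `overloaded` matches either a 429 or any message containing "overloaded", while `temporary503` requires both parts to hold.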
|
|
648
418
|
|
|
649
|
-
####
|
|
650
|
-
|
|
651
|
-
Pick the entry point that matches the model you pass to `createRetryable`. Each module exposes the helpers that make sense for that model family already typed for it, so you don't need to add type annotations yourself.
|
|
652
|
-
|
|
653
|
-
#### Low-level conditions
|
|
419
|
+
#### Custom predicates
|
|
654
420
|
|
|
655
|
-
|
|
656
|
-
|
|
657
|
-
| Helper | Matches when | Available in |
|
|
658
|
-
| --------------------------------- | ------------------------------------------------------------------------------------ | ---------------------- |
|
|
659
|
-
| `error(predicate)` | The current attempt failed and `predicate(err, ctx)` returns true | all three entry points |
|
|
660
|
-
| `error.isRetryable(flag)` | `APICallError.isRetryable === flag` (default `true`) | all three entry points |
|
|
661
|
-
| `error.statusCode(...patterns)` | Numbers match exactly; regex matches the stringified code (e.g. `/^5\d\d$/` for 5xx) | all three entry points |
|
|
662
|
-
| `error.message(...patterns)` | Substring (case-insensitive) or regex match against the error message | all three entry points |
|
|
663
|
-
| `result(predicate)` | The current attempt succeeded and `predicate(res, ctx)` returns true | `language-model` only |
|
|
664
|
-
| `result.finishReason(...reasons)` | The result's `finishReason.unified` matches one of the given values | `language-model` only |
|
|
421
|
+
When the higher-level helpers don't cover the field you need, drop down to `error(predicate)` / `result(predicate)` and inspect whatever is on the error or result. The predicate receives `(err | result, ctx)` and can be `async`; `ctx` is fully typed for the entry point you imported from, so the current attempt, the model, and all previous attempts are available without manual annotations.
|
|
665
422
|
|
|
666
423
|
```typescript
|
|
424
|
+
import { anthropic } from '@ai-sdk/anthropic';
|
|
425
|
+
import { openai } from '@ai-sdk/openai';
|
|
667
426
|
import { APICallError } from 'ai';
|
|
668
|
-
import { error } from 'ai-retry/
|
|
427
|
+
import { createRetryableModel, error } from 'ai-retry/language-model';
|
|
669
428
|
|
|
670
|
-
error
|
|
671
|
-
|
|
429
|
+
// OpenAI-style error code nested at data.error.code. `e` is `unknown`.
|
|
430
|
+
const isContentFilter = (e: unknown) => {
|
|
431
|
+
if (!APICallError.isInstance(e)) return false;
|
|
432
|
+
const data = e.data as { error?: { code?: string } } | undefined;
|
|
433
|
+
return data?.error?.code === 'content_filter';
|
|
434
|
+
};
|
|
435
|
+
|
|
436
|
+
const retryableModel = createRetryableModel({
|
|
437
|
+
model: openai('gpt-4o'),
|
|
438
|
+
retries: [
|
|
439
|
+
error(isContentFilter).switch({
|
|
440
|
+
model: anthropic('claude-3-haiku-20240307'),
|
|
441
|
+
}),
|
|
442
|
+
],
|
|
672
443
|
});
|
|
673
444
|
```
|
|
674
445
|
|
|
675
|
-
|
|
676
|
-
|
|
677
|
-
Convenience matchers built on top of the low-level ones for the common cases. Each returns a condition that you finalize with `.switch(...)` or `.retry(...)`.
|
|
678
|
-
|
|
679
|
-
| Helper | language-model | image-model | embedding-model |
|
|
680
|
-
| -------------------------- | :------------: | :---------: | :-------------: |
|
|
681
|
-
| `httpStatus(...patterns)` | ✓ | ✓ | ✓ |
|
|
682
|
-
| `timeout()` | ✓ | ✓ | ✓ |
|
|
683
|
-
| `aborted()` | ✓ | ✓ | ✓ |
|
|
684
|
-
| `finishReason(...reasons)` | ✓ | — | — |
|
|
685
|
-
| `schemaInvalid()` | ✓ | — | — |
|
|
686
|
-
| `noImage()` | — | ✓ | — |
|
|
687
|
-
|
|
688
|
-
What each one matches:
|
|
446
|
+
The predicate's second argument is the typed `RetryContext`, so a check like “only retry on the first attempt” is just `(e, ctx) => ctx.attempts.length === 1 && isContentFilter(e)`.
|
|
689
447
|
|
|
690
|
-
|
|
691
|
-
| -------------------------- | ------------------------------------------------------------------------------------------ |
|
|
692
|
-
| `httpStatus(...patterns)` | Numbers match the status code; strings match the message (substring); regex matches either |
|
|
693
|
-
| `timeout()` | `Error.name === 'TimeoutError'` (`AbortSignal.timeout()` fired) |
|
|
694
|
-
| `aborted()` | `Error.name === 'AbortError'` (manual `controller.abort()`) |
|
|
695
|
-
| `finishReason(...reasons)` | The result's `finishReason.unified` matches one of the given values |
|
|
696
|
-
| `schemaInvalid()` | The result text fails JSON-schema validation against the call's `responseFormat` |
|
|
697
|
-
| `noImage()` | The image model threw `NoImageGeneratedError` |
|
|
448
|
+
#### All retries failed
|
|
698
449
|
|
|
699
|
-
|
|
450
|
+
If all retry attempts fail, a `RetryError` is thrown containing all individual errors. If no retry was attempted (every retryable returned `undefined` / didn't match), the original error is re-thrown directly.
|
|
700
451
|
|
|
701
452
|
```typescript
|
|
702
|
-
|
|
703
|
-
return error((err) => err instanceof Error && err.name === 'TimeoutError');
|
|
704
|
-
}
|
|
705
|
-
```
|
|
706
|
-
|
|
707
|
-
and `finishReason(...)` just delegates to `result.finishReason(...)`:
|
|
453
|
+
import { generateText, RetryError } from 'ai';
|
|
708
454
|
|
|
709
|
-
|
|
710
|
-
|
|
711
|
-
|
|
455
|
+
try {
|
|
456
|
+
const result = await generateText({
|
|
457
|
+
model: retryableModel,
|
|
458
|
+
prompt: 'Hello!',
|
|
459
|
+
});
|
|
460
|
+
} catch (err) {
|
|
461
|
+
if (err instanceof RetryError) {
|
|
462
|
+
console.error('All retry attempts failed:', err.errors);
|
|
463
|
+
} else {
|
|
464
|
+
console.error('Request failed:', err);
|
|
465
|
+
}
|
|
712
466
|
}
|
|
713
467
|
```
|
|
714
468
|
|
|
715
|
-
|
|
716
|
-
|
|
717
|
-
Every condition exposes two terminal actions that turn it into a `Retryable`:
|
|
718
|
-
|
|
719
|
-
- **`.switch({ model, ...options })`** falls back to a different model when the condition matches. Optional fields (`maxAttempts`, `delay`, `backoffFactor`, `timeout`, `options`) are the same as on a normal `Retry` object. `maxAttempts` defaults to `1`.
|
|
720
|
-
- **`.retry({ delay?, backoffFactor?, maxAttempts?, ... })`** retries the current model when the condition matches. Honors `Retry-After` and `Retry-After-Ms` response headers when present, capped at 60 seconds. `maxAttempts` defaults to `2` (one original attempt + one retry); values below `2` throw, since the retry budget is consumed by the original failure.
|
|
721
|
-
|
|
722
|
-
#### Combinators
|
|
723
|
-
|
|
724
|
-
Compose conditions with `.and`, `.or`, `.not`:
|
|
725
|
-
|
|
726
|
-
```typescript
|
|
727
|
-
import { error, httpStatus } from 'ai-retry/experimental/language-model';
|
|
728
|
-
|
|
729
|
-
httpStatus(429).or(error.message('overloaded'));
|
|
730
|
-
httpStatus(503).and(error.message('temporary'));
|
|
731
|
-
error.isRetryable(true).not();
|
|
732
|
-
```
|
|
733
|
-
|
|
734
|
-
#### Mapping from Built-in retryables
|
|
735
|
-
|
|
736
|
-
Each stable retryable has an equivalent in the new shape (imports from `ai-retry/experimental/language-model` unless noted):
|
|
737
|
-
|
|
738
|
-
| Built-in | Composable form |
|
|
739
|
-
| ------------------------------------------- | ------------------------------------------------------------------------------------------------------------------- |
|
|
740
|
-
| `contentFilterTriggered(m)` | `error(/* check e.data.error.code === 'content_filter' */).or(finishReason('content-filter')).switch({ model: m })` |
|
|
741
|
-
| `requestTimeout(m)` | `timeout().switch({ model: m, timeout: 60_000 })` |
|
|
742
|
-
| `requestNotRetryable(m)` | `error.isRetryable(false).switch({ model: m })` |
|
|
743
|
-
| `schemaMismatch(m)` | `schemaInvalid().switch({ model: m })` |
|
|
744
|
-
| `serviceOverloaded(m)` | `httpStatus(529, 'overloaded').switch({ model: m })` |
|
|
745
|
-
| `serviceUnavailable(m)` | `error.statusCode(503).switch({ model: m })` |
|
|
746
|
-
| `noImageGenerated(m)` | `noImage().switch({ model: m })` (from `image-model`) |
|
|
747
|
-
| `retryAfterDelay({ delay, backoffFactor })` | `error.isRetryable(true).retry({ delay, backoffFactor })` |
|
|
748
|
-
|
|
749
|
-
> [!NOTE]
|
|
750
|
-
> `error.isRetryable(true)` matches whatever the AI SDK's `APICallError` marks retryable. By default that's status codes 408, 409, 429, and any 5xx, plus network errors and provider-specific overrides (e.g. Anthropic flips it on `error.type === 'overloaded_error'`). It picks up more cases than a manual status-code list.
|
|
469
|
+
Errors are tracked per unique model (`provider/modelId`). Once a model has hit its `maxAttempts`, no further retry will land on it.
|
|
751
470
|
|
|
752
471
|
### Options
|
|
753
472
|
|
|
754
|
-
#### Disabling
|
|
755
|
-
|
|
756
|
-
You can disable retries entirely, which is useful for testing or specific environments. When disabled, the base model will execute directly without any retry logic.
|
|
473
|
+
#### Disabling retries
|
|
757
474
|
|
|
758
475
|
```typescript
|
|
759
|
-
const retryableModel =
|
|
760
|
-
model: openai('gpt-4'),
|
|
761
|
-
retries: [
|
|
762
|
-
/* ... */
|
|
763
|
-
],
|
|
764
|
-
disabled: true, // Retries are completely disabled
|
|
765
|
-
});
|
|
766
|
-
|
|
767
|
-
// Or disable based on environment
|
|
768
|
-
const retryableModel = createRetryable({
|
|
769
|
-
model: openai('gpt-4'), // Base model
|
|
770
|
-
retries: [
|
|
771
|
-
/* ... */
|
|
772
|
-
],
|
|
773
|
-
disabled: process.env.NODE_ENV === 'test', // Disable in test environment
|
|
774
|
-
});
|
|
775
|
-
|
|
776
|
-
// Or use a function for dynamic control
|
|
777
|
-
const retryableModel = createRetryable({
|
|
778
|
-
model: openai('gpt-4'), // Base model
|
|
476
|
+
const retryableModel = createRetryableModel({
|
|
477
|
+
model: openai('gpt-4'),
|
|
779
478
|
retries: [
|
|
780
479
|
/* ... */
|
|
781
480
|
],
|
|
782
|
-
disabled:
|
|
481
|
+
disabled: true, // hard off
|
|
482
|
+
// disabled: process.env.NODE_ENV === 'test', // env-based
|
|
483
|
+
// disabled: () => !featureFlags.isEnabled('ai'), // dynamic
|
|
783
484
|
});
|
|
784
485
|
```
|
|
785
486
|
|
|
786
|
-
|
|
487
|
+
When disabled, the base model executes directly and no retry logic runs.
|
|
488
|
+
|
|
489
|
+
#### Retry delays
|
|
787
490
|
|
|
788
|
-
|
|
491
|
+
Delays support exponential backoff and respect the request's abort signal, so a pending delay can still be cancelled.
|
|
789
492
|
|
|
790
493
|
```typescript
|
|
791
|
-
|
|
494
|
+
import { createRetryableModel } from 'ai-retry/language-model';
|
|
495
|
+
|
|
496
|
+
const retryableModel = createRetryableModel({
|
|
792
497
|
model: openai('gpt-4'),
|
|
793
498
|
retries: [
|
|
794
|
-
// Retry
|
|
499
|
+
// Retry the base model with a fixed 2s delay
|
|
795
500
|
{ model: openai('gpt-4'), delay: 2_000, maxAttempts: 3 },
|
|
796
501
|
|
|
797
|
-
// Or
|
|
502
|
+
// Or with exponential backoff: 2s, 4s, 8s
|
|
798
503
|
{ model: openai('gpt-4'), delay: 2_000, backoffFactor: 2, maxAttempts: 3 },
|
|
799
504
|
],
|
|
800
505
|
});
|
|
801
|
-
|
|
802
|
-
const result = await generateText({
|
|
803
|
-
model: retryableModel,
|
|
804
|
-
prompt: 'Write a vegetarian lasagna recipe for 4 people.',
|
|
805
|
-
// Will be respected during delays
|
|
806
|
-
abortSignal: AbortSignal.timeout(60_000),
|
|
807
|
-
});
|
|
808
506
|
```
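The "2s, 4s, 8s" schedule in the comment above follows from multiplying the base delay by the backoff factor once per attempt. A minimal sketch of that arithmetic, assuming the delay before attempt `i` (zero-based) is `delay * backoffFactor ** i`; the function name `backoffDelays` is hypothetical and is not exported by `ai-retry`:

```typescript
// Hypothetical helper: lists the wait (in ms) before each retry attempt.
function backoffDelays(
  delay: number,
  backoffFactor: number = 1, // a factor of 1 means a fixed delay
  maxAttempts: number = 1,
): number[] {
  return Array.from(
    { length: maxAttempts },
    (_, attempt) => delay * backoffFactor ** attempt,
  );
}

const fixed = backoffDelays(2_000, 1, 3); // three equal waits of 2000 ms
const exponential = backoffDelays(2_000, 2, 3); // 2000, 4000, 8000 ms
```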
|
|
809
507
|
|
|
810
|
-
|
|
811
|
-
|
|
812
|
-
```typescript
|
|
813
|
-
import { serviceOverloaded } from 'ai-retry/retryables';
|
|
814
|
-
|
|
815
|
-
const retryableModel = createRetryable({
|
|
816
|
-
model: openai('gpt-4'),
|
|
817
|
-
retries: [
|
|
818
|
-
// Wait 5 seconds before retrying on service overload
|
|
819
|
-
serviceOverloaded(openai('gpt-4'), { maxAttempts: 3, delay: 5_000 }),
|
|
820
|
-
],
|
|
821
|
-
});
|
|
822
|
-
```
|
|
508
|
+
The same `delay` / `backoffFactor` / `maxAttempts` options are accepted by `.switch({...})` and `.retry({...})`.
|
|
823
509
|
|
|
824
510
|
#### Timeouts
|
|
825
511
|
|
|
826
|
-
When a retry specifies a `timeout
|
|
512
|
+
When a retry specifies a `timeout`, a fresh `AbortSignal.timeout()` is created for that attempt. If the original `abortSignal` is still alive, the fresh deadline is composed with it via `AbortSignal.any()` so user cancellation still works. If the original signal is already aborted (a request-level deadline already fired), it is dropped so the retry runs against the fresh deadline alone.
|
|
827
513
|
|
|
828
|
-
If the original `abortSignal` is already aborted at the time of retry and the
|
|
514
|
+
If the original `abortSignal` is already aborted at the time of retry and the retry does **not** supply a `timeout`, `ai-retry` re-throws the original error rather than firing a misleading retry against the dead signal. `onError` still fires for observability; `onRetry` is skipped. Setting `timeout` is the explicit opt-in for retrying past an aborted signal.
|
|
829
515
|
|
|
830
516
|
```typescript
|
|
831
|
-
|
|
517
|
+
import { createRetryableModel, timeout } from 'ai-retry/language-model';
|
|
518
|
+
|
|
519
|
+
const retryableModel = createRetryableModel({
|
|
832
520
|
model: openai('gpt-4'),
|
|
833
521
|
retries: [
|
|
834
|
-
|
|
835
|
-
{
|
|
836
|
-
model: openai('gpt-3.5-turbo'),
|
|
837
|
-
timeout: 30_000,
|
|
838
|
-
},
|
|
522
|
+
timeout().switch({ model: openai('gpt-3.5-turbo'), timeout: 30_000 }),
|
|
839
523
|
],
|
|
840
524
|
});
|
|
841
525
|
|
|
842
|
-
|
|
843
|
-
const result = await generateText({
|
|
526
|
+
await generateText({
|
|
844
527
|
model: retryableModel,
|
|
845
528
|
prompt: 'Write a story',
|
|
846
|
-
// Original request timeout
|
|
847
529
|
abortSignal: AbortSignal.timeout(60_000),
|
|
848
530
|
});
|
|
849
531
|
```
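The signal handling described in this section can be sketched with the standard `AbortSignal.timeout()` and `AbortSignal.any()` APIs. The helper name `signalForRetry` is hypothetical, and the sketch only mirrors the three cases in the prose; it is not taken from `ai-retry`'s source.

```typescript
// Hypothetical helper: picks the abort signal for one retry attempt.
function signalForRetry(
  original: AbortSignal | undefined,
  timeoutMs: number | undefined,
): AbortSignal | undefined {
  if (timeoutMs === undefined) {
    // No fresh deadline: the attempt keeps the original signal. (Per the text
    // above, the library re-throws instead of retrying if it is already aborted.)
    return original;
  }

  const fresh = AbortSignal.timeout(timeoutMs);

  if (original === undefined || original.aborted) {
    // A dead request-level signal is dropped; only the fresh deadline applies.
    return fresh;
  }

  // Both still matter: user cancellation OR the fresh deadline aborts the attempt.
  return AbortSignal.any([original, fresh]);
}
```

Composing with `AbortSignal.any()` keeps user cancellation working, since the combined signal aborts as soon as either input does.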
|
|
850
532
|
|
|
851
|
-
#### Max
|
|
533
|
+
#### Max attempts
|
|
852
534
|
|
|
853
|
-
|
|
535
|
+
Each retryable attempts a model at most once by default. Use `maxAttempts` to allow more. Attempts are counted per unique model, so duplicates across multiple retryables don't get more chances than configured.
|
|
854
536
|
|
|
855
537
|
```typescript
|
|
856
|
-
const retryableModel =
|
|
538
|
+
const retryableModel = createRetryableModel({
|
|
857
539
|
model: openai('gpt-4'),
|
|
858
540
|
retries: [
|
|
859
|
-
//
|
|
860
|
-
|
|
861
|
-
|
|
862
|
-
{ model: openai('gpt-4'), maxAttempts: 2 },
|
|
863
|
-
// Already tried, won't be retried again
|
|
864
|
-
anthropic('claude-3-haiku-20240307'),
|
|
541
|
+
anthropic('claude-3-haiku-20240307'), // 1 attempt
|
|
542
|
+
{ model: openai('gpt-4'), maxAttempts: 2 }, // 1 + 1 retry
|
|
543
|
+
anthropic('claude-3-haiku-20240307'), // already used
|
|
865
544
|
],
|
|
866
545
|
});
|
|
867
546
|
```
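The per-model counting can be illustrated with a short sketch. It assumes attempts are keyed by `provider/modelId` (as stated under "All retries failed") and that the base model's first call counts toward that model's budget, which is how the "1 + 1 retry" comment above reads. The types and the `nextCandidate` function are hypothetical, not `ai-retry` internals.

```typescript
// Hypothetical shapes for illustration.
interface ModelRef {
  provider: string;
  modelId: string;
}
interface RetryEntry {
  model: ModelRef;
  maxAttempts?: number; // defaults to 1 in this sketch
}

const keyOf = (m: ModelRef): string => `${m.provider}/${m.modelId}`;

// Returns the first entry whose model still has attempts left, if any.
function nextCandidate(
  entries: RetryEntry[],
  attemptsSoFar: ModelRef[],
): RetryEntry | undefined {
  const counts = new Map<string, number>();
  for (const m of attemptsSoFar) {
    counts.set(keyOf(m), (counts.get(keyOf(m)) ?? 0) + 1);
  }
  return entries.find(
    (e) => (counts.get(keyOf(e.model)) ?? 0) < (e.maxAttempts ?? 1),
  );
}

const gpt4: ModelRef = { provider: 'openai', modelId: 'gpt-4' };
const haiku: ModelRef = { provider: 'anthropic', modelId: 'claude-3-haiku-20240307' };
const entries: RetryEntry[] = [
  { model: haiku },
  { model: gpt4, maxAttempts: 2 },
  { model: haiku }, // duplicate: gets no extra chance
];
```

Under these assumptions, after the base `gpt-4` call fails the order is `claude-3-haiku-20240307`, then `gpt-4` once more, and then nothing is left to try.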
|
|
868
547
|
|
|
869
|
-
|
|
870
|
-
|
|
871
|
-
#### Provider Options
|
|
548
|
+
#### Provider options
|
|
872
549
|
|
|
873
|
-
|
|
550
|
+
Override provider-specific options for a retry, completely replacing the original ones.
|
|
874
551
|
|
|
875
552
|
```typescript
|
|
876
|
-
const retryableModel =
|
|
553
|
+
const retryableModel = createRetryableModel({
|
|
877
554
|
model: openai('gpt-5'),
|
|
878
555
|
retries: [
|
|
879
|
-
// Use different provider options for the retry
|
|
880
556
|
{
|
|
881
557
|
model: openai('gpt-4o-2024-08-06'),
|
|
882
558
|
providerOptions: {
|
|
883
|
-
openai: {
|
|
884
|
-
user: 'fallback-user',
|
|
885
|
-
structuredOutputs: false,
|
|
886
|
-
},
|
|
559
|
+
openai: { user: 'fallback-user', structuredOutputs: false },
|
|
887
560
|
},
|
|
888
561
|
},
|
|
889
562
|
],
|
|
890
563
|
});
|
|
891
|
-
|
|
892
|
-
// Original provider options are used for the first attempt
|
|
893
|
-
const result = await generateText({
|
|
894
|
-
model: retryableModel,
|
|
895
|
-
prompt: 'Write a story',
|
|
896
|
-
providerOptions: {
|
|
897
|
-
openai: {
|
|
898
|
-
user: 'primary-user',
|
|
899
|
-
},
|
|
900
|
-
},
|
|
901
|
-
});
|
|
902
564
|
```
|
|
903
565
|
|
|
904
|
-
|
|
566
|
+
#### Call options
|
|
905
567
|
|
|
906
|
-
|
|
907
|
-
|
|
908
|
-
You can override various call options when retrying requests. This is useful for adjusting parameters like temperature, max tokens, or even the prompt itself for retry attempts. Call options are specified in the `options` field of the retry object.
|
|
568
|
+
Override any of the call options for a retry. Useful for things like temperature, max tokens, or the prompt itself.
|
|
909
569
|
|
|
910
570
|
```typescript
|
|
911
|
-
const retryableModel =
|
|
571
|
+
const retryableModel = createRetryableModel({
|
|
912
572
|
model: openai('gpt-4'),
|
|
913
573
|
retries: [
|
|
914
574
|
{
|
|
915
575
|
model: anthropic('claude-3-haiku'),
|
|
916
576
|
options: {
|
|
917
|
-
// Override generation parameters for more deterministic output
|
|
918
577
|
temperature: 0.3,
|
|
919
578
|
topP: 0.9,
|
|
920
579
|
maxOutputTokens: 500,
|
|
921
|
-
// Set a seed for reproducibility
|
|
922
580
|
seed: 42,
|
|
923
581
|
},
|
|
924
582
|
},
|
|
@@ -926,58 +584,54 @@ const retryableModel = createRetryable({
|
|
|
926
584
|
});
|
|
927
585
|
```
|
|
928
586
|
|
|
929
|
-
The following options can be overridden:
|
|
930
|
-
|
|
931
587
|
> [!NOTE]
|
|
932
588
|
> Override options completely replace the original values (they are not merged). If you don't specify an option, the original value from the request is used.
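
One way to read "replace, not merge" is as a shallow spread: a key present in the override wins wholesale, and a key that is absent keeps the request's value. The sketch below assumes exactly that; the `CallOptions` shape, the `applyOverrides` function, and the `store` field are hypothetical and only illustrate the described behaviour.

```typescript
// Hypothetical, trimmed-down option shape for illustration.
interface CallOptions {
  temperature?: number;
  maxOutputTokens?: number;
  providerOptions?: Record<string, Record<string, unknown>>;
}

function applyOverrides(original: CallOptions, overrides: CallOptions): CallOptions {
  // Shallow spread: nested objects are replaced as a whole, never deep-merged.
  // Note that an override key explicitly set to undefined would also win here.
  return { ...original, ...overrides };
}

const requestOptions: CallOptions = {
  temperature: 0.7,
  maxOutputTokens: 1_000,
  providerOptions: { openai: { user: 'primary-user', store: true } },
};

const retryOptions = applyOverrides(requestOptions, {
  temperature: 0.3,
  providerOptions: { openai: { user: 'fallback-user' } },
});
// temperature is overridden, maxOutputTokens is kept from the request,
// and providerOptions.openai no longer contains `store`.
```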
|
|
933
589
|
|
|
934
|
-
##### Language
|
|
935
|
-
|
|
936
|
-
| Option | Description |
|
|
937
|
-
| -------------------------------------------------------------------------------------------------- | ---------------------------------------------- |
|
|
938
|
-
| [`prompt`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-text#prompt) | Override the entire prompt for the retry |
|
|
939
|
-
| [`temperature`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-text#temperature) | Temperature setting for controlling randomness |
|
|
940
|
-
| [`topP`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-text#topp) | Nucleus sampling parameter |
|
|
941
|
-
| [`topK`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-text#topk) | Top-K sampling parameter |
|
|
942
|
-
| [`maxOutputTokens`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-text#max-output-tokens) | Maximum number of tokens to generate |
|
|
943
|
-
| [`seed`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-text#seed) | Random seed for deterministic generation |
|
|
944
|
-
| [`stopSequences`](https://ai-sdk.dev/docs/reference/ai-sdk-types/generate-text#stopsequences) | Stop sequences to end generation |
|
|
945
|
-
| [`presencePenalty`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-text#presencepenalty) | Presence penalty for reducing repetition |
|
|
946
|
-
| [`frequencyPenalty`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-text#frequencypenalty) | Frequency penalty for reducing repetition |
|
|
947
|
-
| [`headers`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-text#headers) | Additional HTTP headers |
|
|
948
|
-
| [`providerOptions`](https://ai-sdk.dev/docs/reference/ai-sdk-types/generate-text#provideroptions) | Provider-specific options |
|
|
590
|
+
##### Language model options
|
|
949
591
|
|
|
950
|
-
|
|
592
|
+
| Option | Description |
|
|
593
|
+
| ------------------ | ---------------------------------------------- |
|
|
594
|
+
| `prompt` | Override the entire prompt for the retry |
|
|
595
|
+
| `temperature` | Temperature setting for controlling randomness |
|
|
596
|
+
| `topP` | Nucleus sampling parameter |
|
|
597
|
+
| `topK` | Top-K sampling parameter |
|
|
598
|
+
| `maxOutputTokens` | Maximum number of tokens to generate |
|
|
599
|
+
| `seed` | Random seed for deterministic generation |
|
|
600
|
+
| `stopSequences` | Stop sequences to end generation |
|
|
601
|
+
| `presencePenalty` | Presence penalty for reducing repetition |
|
|
602
|
+
| `frequencyPenalty` | Frequency penalty for reducing repetition |
|
|
603
|
+
| `headers` | Additional HTTP headers |
|
|
604
|
+
| `providerOptions` | Provider-specific options |
|
|
951
605
|
|
|
952
|
-
|
|
953
|
-
| ---------------------------------------------------------------------------------------- | ---------------------------- |
|
|
954
|
-
| [`values`](https://ai-sdk.dev/docs/reference/ai-sdk-core/embed#values) | Override the values to embed |
|
|
955
|
-
| [`headers`](https://ai-sdk.dev/docs/reference/ai-sdk-core/embed#headers) | Additional HTTP headers |
|
|
956
|
-
| [`providerOptions`](https://ai-sdk.dev/docs/reference/ai-sdk-core/embed#provideroptions) | Provider-specific options |
|
|
606
|
+
##### Embedding model options
|
|
957
607
|
|
|
958
|
-
|
|
608
|
+
| Option | Description |
|
|
609
|
+
| ----------------- | ---------------------------- |
|
|
610
|
+
| `values` | Override the values to embed |
|
|
611
|
+
| `headers` | Additional HTTP headers |
|
|
612
|
+
| `providerOptions` | Provider-specific options |
|
|
959
613
|
|
|
960
|
-
|
|
961
|
-
| ------------------------------------------------------------------------------------------------- | -------------------------------- |
|
|
962
|
-
| [`n`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-image#n) | Number of images to generate |
|
|
963
|
-
| [`size`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-image#size) | Size of generated images |
|
|
964
|
-
| [`aspectRatio`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-image#aspectratio) | Aspect ratio of generated images |
|
|
965
|
-
| [`seed`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-image#seed) | Random seed for reproducibility |
|
|
966
|
-
| [`headers`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-image#headers) | Additional HTTP headers |
|
|
967
|
-
| [`providerOptions`](https://ai-sdk.dev/docs/reference/ai-sdk-core/generate-image#provideroptions) | Provider-specific options |
|
|
614
|
+
##### Image model options
|
|
968
615
|
|
|
969
|
-
|
|
616
|
+
| Option | Description |
|
|
617
|
+
| ----------------- | -------------------------------- |
|
|
618
|
+
| `n` | Number of images to generate |
|
|
619
|
+
| `size` | Size of generated images |
|
|
620
|
+
| `aspectRatio` | Aspect ratio of generated images |
|
|
621
|
+
| `seed` | Random seed for reproducibility |
|
|
622
|
+
| `headers` | Additional HTTP headers |
|
|
623
|
+
| `providerOptions` | Provider-specific options |
|
|
970
624
|
|
|
971
|
-
|
|
625
|
+
#### Dynamic call options
|
|
972
626
|
|
|
973
|
-
|
|
627
|
+
You can also override call options dynamically from `onRetry`, instead of declaring them statically on the retry object. This is useful when the override depends on something only known at runtime — the prompt that just failed, the model about to be tried, or the error that triggered the retry. The overrides apply to the upcoming attempt only and can change the same fields as the static `options`. The callback can be `async` if computing the override needs to do work (e.g. fetching a fresh credential).
|
|
974
628
|
|
|
975
629
|
```typescript
|
|
976
|
-
import { createRetryable } from 'ai-retry';
|
|
977
630
|
import { azure } from '@ai-sdk/azure';
|
|
978
631
|
import { openai } from '@ai-sdk/openai';
|
|
632
|
+
import { createRetryableModel } from 'ai-retry/language-model';
|
|
979
633
|
|
|
980
|
-
const retryableModel =
|
|
634
|
+
const retryableModel = createRetryableModel({
|
|
981
635
|
model: azure('gpt-5-chat'),
|
|
982
636
|
retries: [openai('gpt-5-chat')],
|
|
983
637
|
onRetry: (context) => {
|
|
@@ -985,33 +639,16 @@ const retryableModel = createRetryable({
|
|
|
985
639
|
const previous = attempts.at(-1);
|
|
986
640
|
|
|
987
641
|
if (current.model.provider !== previous.model.provider) {
|
|
988
|
-
// Strip provider-scoped metadata
|
|
642
|
+
// Strip provider-scoped metadata before retrying on a different provider
|
|
989
643
|
return {
|
|
990
|
-
options: {
|
|
991
|
-
prompt: stripProviderMetadata(current.options.prompt),
|
|
992
|
-
},
|
|
644
|
+
options: { prompt: stripProviderMetadata(current.options.prompt) },
|
|
993
645
|
};
|
|
994
646
|
}
|
|
995
647
|
},
|
|
996
648
|
});
|
|
997
649
|
```
|
|
998
650
|
|
|
999
|
-
Inside
|
|
1000
|
-
|
|
1001
|
-
`onRetry` may also be `async`, which is useful if computing the override needs to do work (e.g. fetching a fresh credential):
|
|
1002
|
-
|
|
1003
|
-
```typescript
|
|
1004
|
-
const retryableModel = createRetryable({
|
|
1005
|
-
model: openai('gpt-4o-mini'),
|
|
1006
|
-
retries: [anthropic('claude-sonnet-4-20250514')],
|
|
1007
|
-
onRetry: async (context) => {
|
|
1008
|
-
const { current } = context;
|
|
1009
|
-
|
|
1010
|
-
const headers = await refreshAuthHeaders(current.model.provider);
|
|
1011
|
-
return { options: { headers } };
|
|
1012
|
-
},
|
|
1013
|
-
});
|
|
1014
|
-
```
|
|
651
|
+
Inside `onRetry`, `context.current.model` is the model about to be tried next; `context.current.options` and `context.current.error` describe the failed attempt that triggered the retry. The previous model is at `context.attempts.at(-1).model`.
|
|
1015
652
|
|
|
1016
653
|
**Precedence** for the upcoming retry attempt (highest to lowest):
|
|
1017
654
|
|
|
@@ -1029,10 +666,10 @@ You can use the following callbacks to log retry attempts and errors:
|
|
|
1029
666
|
- `onFailure` is invoked when the request ultimately fails and no retry could recover it.
|
|
1030
667
|
|
|
1031
668
|
```typescript
|
|
1032
|
-
const retryableModel =
|
|
1033
|
-
model: openai('gpt-
|
|
669
|
+
const retryableModel = createRetryableModel({
|
|
670
|
+
model: openai('gpt-4o-mini'),
|
|
1034
671
|
retries: [
|
|
1035
|
-
/*
|
|
672
|
+
/* ... */
|
|
1036
673
|
],
|
|
1037
674
|
onError: (context) => {
|
|
1038
675
|
console.error(
|
|
@@ -1042,7 +679,7 @@ const retryableModel = createRetryable({
|
|
|
1042
679
|
},
|
|
1043
680
|
onRetry: (context) => {
|
|
1044
681
|
console.log(
|
|
1045
|
-
`Retrying
|
|
682
|
+
`Retrying with ${context.current.model.provider}/${context.current.model.modelId}...`,
|
|
1046
683
|
);
|
|
1047
684
|
},
|
|
1048
685
|
onSuccess: (context) => {
|
|
@@ -1063,7 +700,7 @@ const retryableModel = createRetryable({
|
|
|
1063
700
|
|
|
1064
701
|
#### Reset
|
|
1065
702
|
|
|
1066
|
-
By default, every new request starts with the base model, even if a previous request was retried with a different model. The `reset` option changes this behavior by making the last successfully retried model **sticky
|
|
703
|
+
By default, every new request starts with the base model, even if a previous request was retried with a different model. The `reset` option changes this behavior by making the last successfully retried model **sticky** — subsequent requests will continue using that model until the reset condition fires.
|
|
1067
704
|
|
|
1068
705
|
| Value | Description |
|
|
1069
706
|
| ------------------ | ------------------------------------------------------------ |
|
|
@@ -1071,51 +708,29 @@ By default, every new request starts with the base model, even if a previous req
|
|
|
1071
708
|
| `after-N-requests` | Keep the retry model for the next **N** requests, then reset |
|
|
1072
709
|
| `after-N-seconds` | Keep the retry model for **N** seconds, then reset |
|
|
1073
710
|
|
|
1074
|
-
##### Reset after each request (default)
|
|
1075
|
-
|
|
1076
|
-
```typescript
|
|
1077
|
-
const retryableModel = createRetryable({
|
|
1078
|
-
model: openai('gpt-4o-mini'),
|
|
1079
|
-
retries: [anthropic('claude-sonnet-4-20250514')],
|
|
1080
|
-
reset: 'after-request', // default: always start with the base model
|
|
1081
|
-
});
|
|
1082
|
-
```
|
|
1083
|
-
|
|
1084
|
-
##### Keep the retry model for N requests
|
|
1085
|
-
|
|
1086
|
-
```typescript
|
|
1087
|
-
const retryableModel = createRetryable({
|
|
1088
|
-
model: openai('gpt-4o-mini'),
|
|
1089
|
-
retries: [anthropic('claude-sonnet-4-20250514')],
|
|
1090
|
-
reset: 'after-5-requests', // use the retry model for 5 more requests before resetting
|
|
1091
|
-
});
|
|
1092
|
-
```
|
|
1093
|
-
|
|
1094
|
-
##### Keep the retry model for N seconds
|
|
1095
|
-
|
|
1096
711
|
```typescript
|
|
1097
|
-
const retryableModel =
|
|
712
|
+
const retryableModel = createRetryableModel({
|
|
1098
713
|
model: openai('gpt-4o-mini'),
|
|
1099
714
|
retries: [anthropic('claude-sonnet-4-20250514')],
|
|
1100
|
-
reset: 'after-
|
|
715
|
+
reset: 'after-5-requests',
|
|
1101
716
|
});
|
|
1102
717
|
```
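
To make the `reset` values concrete, the following is a minimal sketch of how the documented strings could be interpreted. It is not the library's implementation: `ResetPolicy`, `parseReset`, and `isSticky` are hypothetical names, and the exact boundary behavior (for example, whether the Nth request still uses the retry model) is an assumption.

```typescript
// Illustrative only: mirrors the documented `Reset` string shapes.
type ResetPolicy =
  | { kind: 'request' }
  | { kind: 'requests'; count: number }
  | { kind: 'seconds'; seconds: number };

// Hypothetical parser for 'after-request', 'after-N-requests', 'after-N-seconds'.
function parseReset(reset: string): ResetPolicy {
  if (reset === 'after-request') return { kind: 'request' };
  const match = /^after-(\d+)-(requests|seconds)$/.exec(reset);
  if (match === null) throw new Error(`Unsupported reset value: ${reset}`);
  const n = Number(match[1]);
  return match[2] === 'requests'
    ? { kind: 'requests', count: n }
    : { kind: 'seconds', seconds: n };
}

// Hypothetical helper: should the last successful retry model still be used?
// Assumption: the model stays sticky while fewer than N requests (or N seconds)
// have passed since the retry succeeded.
function isSticky(
  policy: ResetPolicy,
  requestsSince: number,
  msSince: number,
): boolean {
  switch (policy.kind) {
    case 'request':
      return false; // default: every request starts with the base model
    case 'requests':
      return requestsSince < policy.count;
    case 'seconds':
      return msSince < policy.seconds * 1000;
  }
}
```

With `'after-5-requests'`, this sketch keeps the retry model for the first five follow-up requests and falls back to the base model afterwards.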
|
|
1103
718
|
|
|
1104
719
|
### Telemetry
|
|
1105
720
|
|
|
1106
721
|
> [!NOTE]
|
|
1107
|
-
> Experimental:
|
|
722
|
+
> Experimental: span names and attributes may change in patch versions.
|
|
1108
723
|
|
|
1109
|
-
`ai-retry` can emit [OpenTelemetry](https://opentelemetry.io/) spans for each request and every retry attempt.
|
|
724
|
+
`ai-retry` can emit [OpenTelemetry](https://opentelemetry.io/) spans for each request and every retry attempt. Spans are created on the active OpenTelemetry context, so they nest automatically under the AI SDK's own spans (e.g. `ai.generateText.doGenerate`) when you also enable `experimental_telemetry` on `generateText` / `streamText`. A single trace then shows the individual attempts — which model each used, why it was retried, and the backoff between them — that the SDK's own span otherwise hides.
|
|
1110
725
|
|
|
1111
726
|
#### Setup
|
|
1112
727
|
|
|
1113
728
|
Telemetry uses the optional peer dependency `@opentelemetry/api` (already present if you use the AI SDK). Register an OpenTelemetry SDK once at startup, then opt in per model:
|
|
1114
729
|
|
|
1115
730
|
```typescript
|
|
1116
|
-
import {
|
|
731
|
+
import { createRetryableModel } from 'ai-retry/language-model';
|
|
1117
732
|
|
|
1118
|
-
const retryableModel =
|
|
733
|
+
const retryableModel = createRetryableModel({
|
|
1119
734
|
model: openai('gpt-4o'),
|
|
1120
735
|
retries: [anthropic('claude-sonnet-4-5')],
|
|
1121
736
|
experimental_telemetry: { isEnabled: true },
|
|
@@ -1150,27 +765,27 @@ ai_retry.doGenerate outcome=success, attempts=2
|
|
|
1150
765
|
|
|
1151
766
|
**Operation span** attributes:
|
|
1152
767
|
|
|
1153
|
-
| Attribute
|
|
1154
|
-
|
|
|
1155
|
-
| `ai_retry.operation`
|
|
1156
|
-
| `ai_retry.outcome`
|
|
1157
|
-
| `ai_retry.attempts`
|
|
1158
|
-
| `ai_retry.model.start`
|
|
1159
|
-
| `ai_retry.model.final`
|
|
768
|
+
| Attribute | Description |
|
|
769
|
+
| ---------------------------------------------------------------------------- | ---------------------------------------------------------------------------- |
|
|
770
|
+
| `ai_retry.operation` | `doGenerate`, `doStream`, or `doEmbed` |
|
|
771
|
+
| `ai_retry.outcome` | `success` or `failure` |
|
|
772
|
+
| `ai_retry.attempts` | total number of attempts |
|
|
773
|
+
| `ai_retry.model.start` | the model the request started with (`provider/modelId`) |
|
|
774
|
+
| `ai_retry.model.final` | the model that produced the final outcome |
|
|
1160
775
|
| `ai_retry.error.{name,message,status,cause.name,cause.message,cause.status}` | the failing error (on failure); `status` when it carries an HTTP status code |
|
|
1161
|
-
| `ai_retry.function.id`, `ai_retry.metadata.*`
|
|
776
|
+
| `ai_retry.function.id`, `ai_retry.metadata.*` | from the telemetry settings |
|
|
1162
777
|
|
|
1163
778
|
**Attempt span** (`ai_retry.attempt`) attributes:
|
|
1164
779
|
|
|
1165
|
-
| Attribute
|
|
1166
|
-
|
|
|
1167
|
-
| `ai_retry.attempt.number`
|
|
1168
|
-
| `ai_retry.attempt.model`
|
|
1169
|
-
| `ai_retry.attempt.outcome`
|
|
1170
|
-
| `ai_retry.attempt.type`
|
|
1171
|
-
| `ai_retry.attempt.finish_reason`
|
|
1172
|
-
| `ai_retry.attempt.delay_ms`
|
|
1173
|
-
| `ai_retry.attempt.timeout_ms`
|
|
780
|
+
| Attribute | Description |
|
|
781
|
+
| ------------------------------------------------------------------------------------ | ------------------------------------------------------------------------ |
|
|
782
|
+
| `ai_retry.attempt.number` | 1-based attempt index |
|
|
783
|
+
| `ai_retry.attempt.model` | model used (`provider/modelId`) |
|
|
784
|
+
| `ai_retry.attempt.outcome` | `success`, `retry`, or `failure` |
|
|
785
|
+
| `ai_retry.attempt.type` | `result` or `error` |
|
|
786
|
+
| `ai_retry.attempt.finish_reason` | finish reason (result attempts) |
|
|
787
|
+
| `ai_retry.attempt.delay_ms` | backoff scheduled before the next attempt |
|
|
788
|
+
| `ai_retry.attempt.timeout_ms` | timeout budget, when the retry set one |
|
|
1174
789
|
| `ai_retry.attempt.error.{name,message,status,cause.name,cause.message,cause.status}` | the error (error attempts); `status` when it carries an HTTP status code |
|
|
1175
790
|
|
|
1176
791
|
Attempt spans also carry the standard `gen_ai.request.model` / `gen_ai.provider.name` attributes so observability tools (Langfuse, etc.) recognize and render them.
|
|
@@ -1187,10 +802,32 @@ Errors during streaming requests can occur in two ways:
|
|
|
1187
802
|
1. When the stream is initially created (e.g. network error, API error, etc.) by calling `streamText`.
|
|
1188
803
|
2. While the stream is being processed (e.g. timeout, API error, etc.) by reading from the returned `result.textStream` async iterable.
|
|
1189
804
|
|
|
1190
|
-
In the second case, errors during stream processing will not always be retried, because the stream might have already emitted some actual content and the consumer might have processed it. Retrying
|
|
805
|
+
In the second case, errors during stream processing will not always be retried, because the stream might have already emitted some actual content and the consumer might have processed it. Retrying stops as soon as the first content chunk (e.g. `text-delta` or `tool-call`) is emitted. The chunks that count as content are the same ones passed to [`onChunk()`](https://github.com/vercel/ai/blob/1fe4bd4144bff927f5319d9d206e782a73979ccb/packages/ai/src/generate-text/stream-text.ts#L684-L697).
|
|
806
|
+
|
|
807
|
+
Result-based conditions (`finishReason`, `schemaInvalid`, `result(...)`) apply to streams as well: the decision happens when the upstream `finish` part arrives and only fires if no content has been emitted yet, so behavior like `finishReason.unified === 'content-filter'` on an otherwise empty response can still trigger a fallback. Once any content chunk has been forwarded, the stream is committed and result-based retries are skipped.
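
The commit rule described above can be sketched as a small check over the parts seen so far. This is a simplified model rather than the library's code: `StreamPart`, `CONTENT_TYPES`, and `canStillRetry` are hypothetical names, and the set of content types is an illustrative subset, not the full list used by the AI SDK.

```typescript
// Hypothetical, simplified stream part; the real AI SDK part types are richer.
type StreamPart = { type: string };

// Assumption: an illustrative subset of the chunk types that count as content.
const CONTENT_TYPES = new Set<string>(['text-delta', 'tool-call']);

// Returns true while no content chunk has been forwarded, i.e. while an error
// or a result-based condition could still trigger a retry or fallback.
function canStillRetry(partsSoFar: StreamPart[]): boolean {
  return !partsSoFar.some((part) => CONTENT_TYPES.has(part.type));
}

// Preamble only: the stream is not committed yet.
const beforeContent = canStillRetry([{ type: 'stream-start' }]);

// A text delta was forwarded: the stream is committed to the current model.
const afterContent = canStillRetry([
  { type: 'stream-start' },
  { type: 'text-delta' },
]);
```

Here `beforeContent` is `true` and `afterContent` is `false`, which matches the rule that retries are skipped once any content chunk has been forwarded.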
|
|
1191
808
|
|
|
1192
809
|
> [!IMPORTANT]
|
|
1193
|
-
> **Streaming limitation:**
|
|
810
|
+
> **Streaming limitation:** retries and fallbacks only apply before the first content chunk is emitted. Once streaming begins delivering content, the response is committed to the current model. Mid-stream errors will propagate to the caller rather than triggering a fallback. If reliable retries are critical for your use case, consider using `generateText` instead of `streamText`.
|
|
811
|
+
|
|
812
|
+
### Deprecated: function-style retryables
|
|
813
|
+
|
|
814
|
+
The function-style helpers (`contentFilterTriggered`, `requestTimeout`, `requestNotRetryable`, `retryAfterDelay`, `schemaMismatch`, `serviceOverloaded`, `serviceUnavailable`, `noImageGenerated`) are still exported from `ai-retry/retryables` for backwards compatibility, but they are deprecated in favor of the condition API documented above.
|
|
815
|
+
|
|
816
|
+
> [!NOTE]
|
|
817
|
+
> Full documentation for the deprecated function-style retryables lives in the [earlier README](https://github.com/zirkelc/ai-retry/blob/v1/README.md). New code should use the condition API. See the [migration guide](./MIGRATION.md) to convert existing code.
|
|
818
|
+
|
|
819
|
+
Each function-style retryable has a one-line equivalent in the new shape (imports from `ai-retry/language-model` unless noted):
|
|
820
|
+
|
|
821
|
+
| Function-style (deprecated) | Condition API |
|
|
822
|
+
| ------------------------------------------- | -------------------------------------------------------------------------------------------------------------------- |
|
|
823
|
+
| `contentFilterTriggered(m)` | `finishReason('content-filter').switch({ model: m })` |
|
|
824
|
+
| `requestTimeout(m)` | `timeout().switch({ model: m, timeout: 60_000 })` |
|
|
825
|
+
| `requestNotRetryable(m)` | `error.isRetryable(false).switch({ model: m })` |
|
|
826
|
+
| `schemaMismatch(m)` | `schemaInvalid().switch({ model: m })` |
|
|
827
|
+
| `serviceOverloaded(m)` | `httpStatus(529).switch({ model: m })` |
|
|
828
|
+
| `serviceUnavailable(m)` | `httpStatus(503).switch({ model: m })` |
|
|
829
|
+
| `noImageGenerated(m)` | `noImage().switch({ model: m })` (from `ai-retry/image-model`) |
|
|
830
|
+
| `retryAfterDelay({ delay, backoffFactor })` | `error.isRetryable(true).retry({ delay, backoffFactor })` |
|
|
1194
831
|
|
|
1195
832
|
#### Preamble buffering
|
|
1196
833
|
|
|
@@ -1201,13 +838,13 @@ Every stream begins with a non-content preamble (`stream-start`, then optionally
|
|
|
1201
838
|
|
|
1202
839
|
### API Reference
|
|
1203
840
|
|
|
1204
|
-
#### `
|
|
841
|
+
#### `createRetryableModel(options): LanguageModel | EmbeddingModel | ImageModel`
|
|
1205
842
|
|
|
1206
|
-
|
|
843
|
+
Imported from the per-model entry point (`ai-retry/language-model`, `ai-retry/embedding-model`, `ai-retry/image-model`). Each entry point returns a model already narrowed to that family.
|
|
1207
844
|
|
|
1208
845
|
```ts
|
|
1209
846
|
interface RetryableModelOptions<
|
|
1210
|
-
MODEL extends
|
|
847
|
+
MODEL extends LanguageModel | EmbeddingModel | ImageModel,
|
|
1211
848
|
> {
|
|
1212
849
|
model: MODEL;
|
|
1213
850
|
retries: Array<Retryable<MODEL> | MODEL>;
|
|
@@ -1225,19 +862,26 @@ interface RetryableModelOptions<
|
|
|
1225
862
|
|
|
1226
863
|
**Options:**
|
|
1227
864
|
|
|
1228
|
-
- `model
|
|
1229
|
-
- `retries
|
|
1230
|
-
- `disabled
|
|
1231
|
-
- `reset
|
|
1232
|
-
- `experimental_telemetry
|
|
1233
|
-
- `onError
|
|
1234
|
-
- `onRetry
|
|
1235
|
-
- `onSuccess
|
|
1236
|
-
- `onFailure
|
|
865
|
+
- `model` — base model used for the initial request.
|
|
866
|
+
- `retries` — array of conditions (`.switch(...)` / `.retry(...)` outputs), models, or retry objects to try on failure.
|
|
867
|
+
- `disabled` — disable all retry logic. `boolean` or `() => boolean`. Default `false`.
|
|
868
|
+
- `reset` — controls when to reset back to the base model after a successful retry. Default `'after-request'`.
|
|
869
|
+
- `experimental_telemetry` — OpenTelemetry instrumentation. See [Telemetry](#telemetry).
|
|
870
|
+
- `onError` — fires when an error occurs.
|
|
871
|
+
- `onRetry` — fires before a retry attempt. May return `OnRetryOverrides` (or a promise of one) to override `options.*` for that attempt only. See [Dynamic call options](#dynamic-call-options).
|
|
872
|
+
- `onSuccess` — fires after a successful request.
|
|
873
|
+
- `onFailure` — fires when the request ultimately fails and no retry recovered it (no condition matched, retries exhausted, or the retry itself failed).
|
|
1237
874
|
|
|
1238
|
-
#### `
|
|
875
|
+
#### `createRetryable(options)` (deprecated)
|
|
876
|
+
|
|
877
|
+
```ts
|
|
878
|
+
import { createRetryable } from 'ai-retry';
|
|
879
|
+
```
|
|
880
|
+
|
|
881
|
+
> [!WARNING]
|
|
882
|
+
> Deprecated. The root `createRetryable` auto-detects the model family at runtime and resolves bare gateway strings as language models only. Prefer `createRetryableModel` from the matching per-model entry point.
|
|
1239
883
|
|
|
1240
|
-
|
|
884
|
+
#### `Reset`
|
|
1241
885
|
|
|
1242
886
|
```ts
|
|
1243
887
|
type Reset =
|
|
@@ -1246,77 +890,53 @@ type Reset =
|
|
|
1246
890
|
| `after-${number}-seconds`;
|
|
1247
891
|
```
|
|
1248
892
|
|
|
1249
|
-
|
|
1250
|
-
- `after-N-requests` — keep the retry model for the next N requests, then reset.
|
|
1251
|
-
- `after-N-seconds` — keep the retry model for N seconds, then reset.
|
|
1252
|
-
|
|
1253
|
-
#### `Retryable`
|
|
1254
|
-
|
|
1255
|
-
A `Retryable` is a function that receives a `RetryContext` with the current error or result and model and all previous attempts.
|
|
1256
|
-
It should evaluate the error/result and decide whether to retry by returning a `Retry` or to skip by returning `undefined`.
|
|
893
|
+
#### `Condition<MODEL>`
|
|
1257
894
|
|
|
1258
895
|
```ts
|
|
1259
|
-
|
|
1260
|
-
|
|
1261
|
-
|
|
1262
|
-
|
|
1263
|
-
|
|
1264
|
-
|
|
1265
|
-
|
|
1266
|
-
```typescript
|
|
1267
|
-
interface Retry {
|
|
1268
|
-
model: LanguageModelV3 | EmbeddingModelV3 | ImageModelV3;
|
|
1269
|
-
maxAttempts?: number; // Maximum retry attempts per model (default: 1)
|
|
1270
|
-
delay?: number; // Delay in milliseconds before retrying
|
|
1271
|
-
backoffFactor?: number; // Multiplier for exponential backoff
|
|
1272
|
-
timeout?: number; // Timeout in milliseconds for the retry attempt
|
|
1273
|
-
providerOptions?: ProviderOptions; // @deprecated - use options.providerOptions instead
|
|
1274
|
-
options?:
|
|
1275
|
-
| LanguageModelV3CallOptions
|
|
1276
|
-
| EmbeddingModelV3CallOptions
|
|
1277
|
-
| ImageModelV3CallOptions; // Call options to override for this retry
|
|
896
|
+
class Condition<MODEL> {
|
|
897
|
+
evaluate(ctx: RetryContext<MODEL>): Promise<boolean>;
|
|
898
|
+
switch(
|
|
899
|
+
target: { model: MODEL } & Omit<Retry<MODEL>, 'model'>,
|
|
900
|
+
): Retryable<MODEL>;
|
|
901
|
+
retry(options?: Omit<Retry<MODEL>, 'model'>): Retryable<MODEL>;
|
|
1278
902
|
}
|
|
1279
903
|
```
|
|
1280
904
|
|
|
1281
|
-
|
|
905
|
+
Conditions are produced by the low-level (`error`, `result`) and high-level (`httpStatus`, `timeout`, `aborted`, `finishReason`, `schemaInvalid`, `noImage`) helpers. They can be composed with the top-level `and(...conditions)` / `or(...conditions)` / `not(condition)` helpers and finalized into a `Retryable` with `.switch()` or `.retry()`.
|
|
1282
906
|
|
|
1283
|
-
|
|
907
|
+
#### `Retryable`
|
|
1284
908
|
|
|
1285
|
-
|
|
1286
|
-
|
|
1287
|
-
|
|
1288
|
-
|
|
1289
|
-
|
|
909
|
+
A `Retryable` is a function that receives a `RetryContext` and returns a `Retry` (to fire) or `undefined` (to skip).
|
|
910
|
+
|
|
911
|
+
```ts
|
|
912
|
+
type Retryable<MODEL> = (
|
|
913
|
+
context: RetryContext<MODEL>,
|
|
914
|
+
) => Retry<MODEL> | Promise<Retry<MODEL> | undefined> | undefined;
|
|
1290
915
|
```
|
|
1291
916
|
|
|
1292
|
-
|
|
917
|
+
The `.switch()` and `.retry()` actions return `Retryable<MODEL>` for you. Hand-written retryables are still supported when the condition helpers aren't a fit.
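
As a rough illustration of a hand-written retryable, the sketch below switches to a fallback model when the failed attempt carries an HTTP 429 status. To stay self-contained it uses local stand-in types (`SketchModel`, `SketchAttempt`, `SketchContext`, `SketchRetry`) instead of the package's exports, and it assumes the error exposes a numeric `statusCode` property; both are assumptions made for this example.

```typescript
// Local stand-ins for the documented shapes, so the sketch compiles on its own.
type SketchModel = { provider: string; modelId: string };
type SketchAttempt =
  | { type: 'error'; error: unknown; model: SketchModel }
  | { type: 'result'; result: unknown; model: SketchModel };
type SketchContext = { current: SketchAttempt; attempts: SketchAttempt[] };
type SketchRetry = { model: SketchModel; maxAttempts?: number; delay?: number };
type SketchRetryable = (context: SketchContext) => SketchRetry | undefined;

// Hypothetical fallback model used only in this example.
const fallbackModel: SketchModel = {
  provider: 'example',
  modelId: 'fallback-model',
};

// Return a retry description to fire, or undefined to skip.
const onRateLimit: SketchRetryable = (context) => {
  const { current } = context;
  if (current.type !== 'error') return undefined; // only handle error attempts

  const error = current.error;
  // Assumption: the error object carries a numeric `statusCode`.
  if (typeof error === 'object' && error !== null && 'statusCode' in error) {
    if ((error as { statusCode?: unknown }).statusCode === 429) {
      return { model: fallbackModel, maxAttempts: 1, delay: 1_000 };
    }
  }
  return undefined;
};
```

A function of this shape would be placed in the `retries` array alongside conditions and plain models. Returning `undefined` lets the next entry in the array decide.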
|
|
1293
918
|
|
|
1294
|
-
|
|
919
|
+
#### `Retry`
|
|
1295
920
|
|
|
1296
|
-
```
|
|
1297
|
-
interface
|
|
1298
|
-
|
|
1299
|
-
|
|
921
|
+
```ts
|
|
922
|
+
interface Retry<MODEL> {
|
|
923
|
+
model: MODEL;
|
|
924
|
+
maxAttempts?: number; // default: 1 for switch, 2 for retry
|
|
925
|
+
delay?: number; // ms before the attempt
|
|
926
|
+
backoffFactor?: number; // exponential multiplier
|
|
927
|
+
timeout?: number; // fresh AbortSignal.timeout() for this attempt
|
|
928
|
+
options?: RetryCallOptions<MODEL>;
|
|
1300
929
|
}
|
|
1301
930
|
```
|
|
1302
931
|
|
|
1303
|
-
|
|
932
|
+
`Retry` describes the next attempt. It is the shape returned by a retryable and accepted in static `retries: [...]` entries.
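
To give a feel for how `delay` and `backoffFactor` interact, here is a small worked example. It assumes a plain geometric schedule (`delay * backoffFactor ** retryIndex`); the README excerpt does not spell out the exact formula, so treat `backoffDelay` as a hypothetical helper and the schedule as an assumption rather than the library's guaranteed behavior.

```typescript
// Assumption: the wait grows geometrically with each retry on the same model.
// The library's actual schedule may differ (for example, it may add a cap).
function backoffDelay(
  delay: number,
  backoffFactor: number,
  retryIndex: number, // 0 for the first retry, 1 for the second, ...
): number {
  return delay * Math.pow(backoffFactor, retryIndex);
}

// delay = 1000 ms, backoffFactor = 2, three retries:
const backoffSchedule = [0, 1, 2].map((i) => backoffDelay(1_000, 2, i));
// backoffSchedule is [1000, 2000, 4000]
```

Under this assumption, a `backoffFactor` of `1` yields a constant delay between attempts.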
|
|
1304
933
|
|
|
1305
|
-
|
|
934
|
+
#### `RetryContext`
|
|
1306
935
|
|
|
1307
|
-
```
|
|
1308
|
-
interface
|
|
1309
|
-
|
|
1310
|
-
|
|
1311
|
-
result:
|
|
1312
|
-
| LanguageModelResult
|
|
1313
|
-
| LanguageModelStream
|
|
1314
|
-
| EmbeddingModelEmbed
|
|
1315
|
-
| ImageModelGenerate;
|
|
1316
|
-
options:
|
|
1317
|
-
| LanguageModelV3CallOptions
|
|
1318
|
-
| EmbeddingModelV3CallOptions
|
|
1319
|
-
| ImageModelV3CallOptions;
|
|
936
|
+
```ts
|
|
937
|
+
interface RetryContext<MODEL> {
|
|
938
|
+
current: RetryAttempt<MODEL>;
|
|
939
|
+
attempts: Array<RetryAttempt<MODEL>>;
|
|
1320
940
|
}
|
|
1321
941
|
```
|
|
1322
942
|
|
|
@@ -1334,34 +954,45 @@ interface FailureContext {
|
|
|
1334
954
|
|
|
1335
955
|
#### `RetryAttempt`
|
|
1336
956
|
|
|
1337
|
-
|
|
1338
|
-
|
|
1339
|
-
```typescript
|
|
1340
|
-
// For language, embedding, and image models
|
|
1341
|
-
type RetryAttempt =
|
|
957
|
+
```ts
|
|
958
|
+
type RetryAttempt<MODEL> =
|
|
1342
959
|
| {
|
|
1343
960
|
type: 'error';
|
|
1344
961
|
error: unknown;
|
|
1345
|
-
model:
|
|
1346
|
-
options:
|
|
1347
|
-
| LanguageModelV3CallOptions
|
|
1348
|
-
| EmbeddingModelV3CallOptions
|
|
1349
|
-
| ImageModelV3CallOptions;
|
|
962
|
+
model: MODEL;
|
|
963
|
+
options: CallOptions<MODEL>;
|
|
1350
964
|
}
|
|
1351
965
|
| {
|
|
1352
966
|
type: 'result';
|
|
1353
967
|
result: LanguageModelResult;
|
|
1354
|
-
model:
|
|
1355
|
-
options:
|
|
968
|
+
model: LanguageModel;
|
|
969
|
+
options: LanguageModelCallOptions;
|
|
1356
970
|
};
|
|
1357
971
|
|
|
1358
|
-
// Note: Result-based retries only apply to language models (both generate and stream paths). They do not apply to embedding or image models. For streaming, retries are only possible before any content has been emitted; once a text-delta flows through, the stream is committed.
|
|
1359
|
-
|
|
1360
|
-
// Type guards for discriminating attempts
|
|
1361
972
|
function isErrorAttempt(attempt: RetryAttempt): attempt is RetryErrorAttempt;
|
|
1362
973
|
function isResultAttempt(attempt: RetryAttempt): attempt is RetryResultAttempt;
|
|
1363
974
|
```
|
|
1364
975
|
|
|
976
|
+
Result-based attempts only fire for language models (both generate and stream paths). They do not fire for embedding or image models. For streams, retries are only possible before any content has been emitted; once a content chunk flows through, the stream is committed.
|
|
977
|
+
|
|
978
|
+
`isErrorAttempt` and `isResultAttempt` are re-exported from the package root (`ai-retry`).
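
The guards narrow the `RetryAttempt` union by its `type` discriminant. The sketch below reproduces that pattern with local stand-in types so it runs on its own; `GuardAttempt`, `isErrorSketch`, `isResultSketch`, and `summarizeAttempts` are hypothetical names, and the `finishReason` field on the result is simplified to a string for the example.

```typescript
// Local stand-in for the documented discriminated union.
type GuardAttempt =
  | { type: 'error'; error: unknown }
  | { type: 'result'; result: { finishReason: string } };

// Type guards in the same spirit as `isErrorAttempt` / `isResultAttempt`.
function isErrorSketch(
  attempt: GuardAttempt,
): attempt is Extract<GuardAttempt, { type: 'error' }> {
  return attempt.type === 'error';
}

function isResultSketch(
  attempt: GuardAttempt,
): attempt is Extract<GuardAttempt, { type: 'result' }> {
  return attempt.type === 'result';
}

// After a guard, TypeScript knows which fields are available on the attempt.
function summarizeAttempts(attempts: GuardAttempt[]): string[] {
  return attempts.map((attempt) =>
    isErrorSketch(attempt)
      ? `error: ${String(attempt.error)}`
      : `result: ${attempt.result.finishReason}`,
  );
}
```

In a callback such as `onFailure`, the same pattern can be applied to `context.attempts` to log errors and results separately.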
|
|
979
|
+
|
|
980
|
+
#### `SuccessContext`
|
|
981
|
+
|
|
982
|
+
```ts
|
|
983
|
+
interface SuccessContext<MODEL> {
|
|
984
|
+
current: {
|
|
985
|
+
type: 'success';
|
|
986
|
+
model: MODEL;
|
|
987
|
+
result: Result<MODEL>;
|
|
988
|
+
options: CallOptions<MODEL>;
|
|
989
|
+
};
|
|
990
|
+
attempts: Array<RetryAttempt<MODEL>>;
|
|
991
|
+
}
|
|
992
|
+
```
|
|
993
|
+
|
|
994
|
+
Passed to the `onSuccess` callback.
|
|
995
|
+
|
|
1365
996
|
### License
|
|
1366
997
|
|
|
1367
998
|
MIT
|