ai-lcr 0.5.3 → 0.5.4
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +30 -0
- package/README.md +2 -2
- package/README.zh-CN.md +2 -2
- package/dist/index.cjs +35 -8
- package/dist/index.d.cts +43 -1
- package/dist/index.d.ts +43 -1
- package/dist/index.js +30 -7
- package/package.json +1 -1
package/CHANGELOG.md
CHANGED
@@ -4,6 +4,36 @@ All notable changes to `ai-lcr` are documented here. The format follows
 [Keep a Changelog](https://keepachangelog.com/), and the project adheres to
 [Semantic Versioning](https://semver.org/).
 
+## [0.5.4] — 2026-06-03
+
+### Changed
+
+- **A provider 400 now fails over instead of being passed through.** Previously
+  any client error (400/422/…) was treated as the caller's fault and thrown
+  immediately, killing the request even when another provider would have served
+  it. But across OpenAI-compatible aggregators a 400 is most often
+  *provider-specific* — an unsupported parameter, a model the provider hasn't
+  listed, a stricter JSON schema — not a universally-broken request. The default
+  failover gate (`shouldFailover`) now advances to the next provider on **any**
+  failure except a deliberate caller cancellation (`AbortSignal`), which is the
+  one thing we must never re-issue elsewhere. When every provider rejects the
+  request it still throws — now surfacing the **first** (original) error rather
+  than the last fallback's, so a genuine caller bug stays debuggable. Failed
+  attempts keep their precise `ErrorKind` (`"client"` for a 400) in the
+  `CallRecord`, so a real bug is still visible.
+
+  To restore the old "client errors fail fast" behavior, pass
+  `shouldRetry: isRetryableError` to `createLCR`.
+
+### Added
+
+- **`createLCR({ shouldRetry })`.** The failover predicate is now configurable
+  from the top-level API (it previously existed only on the internal engine), so
+  callers can tune or fully override the policy above.
+- **Exported error predicates** `isRetryableError`, `isNetworkError`,
+  `isAbortError`, and `shouldFailover` — building blocks for a custom
+  `shouldRetry`.
+
 ## [0.5.3] — 2026-06-03
 
 All additions are optional and backward compatible.
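The policy in this changelog entry can be pictured with a small, self-contained sketch. It is not the library's engine: `callWithFailover`, the provider objects, and their `call` methods are hypothetical names, the local `isAbortError` is a simplified stand-in for the exported predicate, and the loop is synchronous for brevity, whereas the real router is asynchronous. It only illustrates three points from the entry: any non-abort failure advances to the next provider, a caller abort stops immediately, and exhausting the chain rethrows the first error.

```javascript
// Illustrative sketch only; this is not the ai-lcr engine.

// Simplified stand-in for the exported `isAbortError` predicate.
function isAbortError(error) {
  return typeof error?.name === "string" && error.name === "AbortError";
}

// Default gate from the changelog: fail over on anything except a caller abort.
function shouldFailover(error) {
  return !isAbortError(error);
}

// Hypothetical helper: try providers in order, remembering the first failure.
function callWithFailover(providers, shouldRetry = shouldFailover) {
  if (providers.length === 0) throw new Error("no providers configured");
  const attempts = [];
  let firstError;
  for (const provider of providers) {
    try {
      const result = provider.call();
      attempts.push({ provider: provider.label, ok: true });
      return { result, attempts };
    } catch (error) {
      if (firstError === undefined) firstError = error;
      attempts.push({ provider: provider.label, ok: false });
      // A rejected gate (for example a caller abort) stops the chain here.
      if (!shouldRetry(error)) throw error;
    }
  }
  // Every provider failed: surface the first (original) error.
  throw firstError;
}

// `statusCode` is an assumed error shape, used only for illustration.
const badRequest = Object.assign(new Error("400 Bad Request"), { statusCode: 400 });
const served = callWithFailover([
  { label: "cheap", call: () => { throw badRequest; } },
  { label: "fallback", call: () => "served by fallback" },
]);
console.log(served.result, served.attempts.length); // prints: served by fallback 2
```

Passing a stricter predicate as the second argument plays the role that `shouldRetry: isRetryableError` plays in `createLCR`: the first client error is thrown without consulting the remaining providers.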
package/README.md
CHANGED
@@ -141,7 +141,7 @@ DeepInfra carries open weights only — no first-party Claude / GPT / Gemini. Fo
 ## How it routes
 
 1. **Cheapest first.** Providers are tried in order — list them cheapest-first, or set `autoSort: true` to order them by `cost` automatically.
-2. **Fall through on failure.** On
+2. **Fall through on failure.** On any provider failure — rate limit, 5xx, timeout, a **billing cap** (402 / out-of-credit / quota), *and* a client error like a **400** — it advances to the next provider, streaming-safe. A 400 fails over on purpose: across OpenAI-compatible aggregators a 400 is usually "*this* provider won't take this request" (an unsupported param, a model it hasn't listed, a stricter schema), not a universally-broken request — so the next provider may well serve it. If every provider rejects the request it still fails, surfacing the **original** error so a genuine caller bug stays debuggable. The one failure that never fails over is a deliberate caller cancellation (`AbortSignal`). Pass `shouldRetry: isRetryableError` to `createLCR` to restore the stricter "client errors fail fast" behavior.
 3. **Recover.** After an idle window (`resetIntervalMs`, default 60s) it snaps back to the cheapest provider.
 
 ## See what happened (`onCall`)
@@ -364,7 +364,7 @@ npm run typecheck
 npm test # mocked routing/failover tests + live Kunavo tests
 ```
 
-The suite covers cheapest-first routing, failover on retryable errors
+The suite covers cheapest-first routing, failover on retryable errors *and* on a provider 400 (but *not* on a caller cancellation), surfacing the original error when the whole chain is exhausted, and a real broken-provider → Kunavo recovery. Live tests run only when `KUNAVO_API_KEY` is set in the environment; otherwise they're skipped.
 
 ## Credits
 
package/README.zh-CN.md
CHANGED
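@@ -141,7 +141,7 @@ DeepInfra carries open weights only, with no first-party Claude / GPT / Gemini. For
 ## How it routes
 
 1. **Cheapest first.** Providers are tried in order: list them cheapest-first, or set `autoSort: true` to have them sorted by `cost` automatically.
-2. **Fall through on failure.**
+2. **Fall through on failure.** On any provider failure, whether a rate limit, 5xx, timeout, an **exhausted quota** (402 / unpaid balance / insufficient credit), or a client error such as a **400**, it advances to the next provider, and it is streaming-safe. Failing over on a 400 is deliberate: in the OpenAI-compatible aggregation layer, a 400 usually means "*this* provider won't take this request" (an unsupported parameter, a model it hasn't listed, a stricter schema) rather than a broken request, so another provider may well serve it. If every provider rejects the request, it still fails and throws the **first** (original) error, keeping a genuine caller bug debuggable. The only failure that never fails over is a deliberate caller cancellation (`AbortSignal`). To restore the old "client errors fail immediately" behavior, pass `shouldRetry: isRetryableError` to `createLCR`.
 3. **Recover.** After an idle window (`resetIntervalMs`, default 60s), it automatically returns to the cheapest provider.
 
 ## Supported providers
@@ -280,7 +280,7 @@ npm run typecheck
 npm test # mocked routing / failover tests + live Kunavo tests
 ```
 
-
+The test suite covers: cheapest-first routing, failover on retryable errors and on a provider 400 (but *no* failover on a deliberate caller cancellation), throwing the original error when the whole chain is exhausted, and one real "provider failure → Kunavo recovery" run. Live tests run only when the `KUNAVO_API_KEY` environment variable is set; otherwise they are skipped.
 
 ## Credits
 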
package/dist/index.cjs
CHANGED
@@ -34,9 +34,13 @@ __export(index_exports, {
   createMediaLCR: () => createMediaLCR,
   createRunwareMediaAdapter: () => createRunwareMediaAdapter,
   formatCallRecord: () => formatCallRecord,
+  isAbortError: () => isAbortError,
+  isNetworkError: () => isNetworkError,
+  isRetryableError: () => isRetryableError,
   normalizedCents: () => normalizedCents,
   rankRoutes: () => rankRoutes,
-  referenceMegapixels: () => referenceMegapixels
+  referenceMegapixels: () => referenceMegapixels,
+  shouldFailover: () => shouldFailover
 });
 module.exports = __toCommonJS(index_exports);
 
@@ -158,6 +162,15 @@ function isRetryableError(error) {
   const { text } = errorSignals(error);
   return RETRYABLE_PATTERNS.some((p) => text.includes(p));
 }
+function isAbortError(error) {
+  const e = error;
+  if (typeof e?.name === "string" && e.name === "AbortError") return true;
+  const { text } = errorSignals(error);
+  return text.includes("operation was aborted") || text.includes("operation was canceled");
+}
+function shouldFailover(error) {
+  return !isAbortError(error);
+}
 function classifyError(error) {
   if (error instanceof EmptyCompletionError) return "empty_completion";
   const e = error;
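To see how the two added predicates behave, the sketch below re-creates them outside the bundle and feeds them a real abort reason. `errorText` is a hypothetical stand-in for the internal `errorSignals` helper, whose implementation is not part of this diff; it is assumed here to expose a lower-cased message. The `statusCode` field on the provider error is likewise an assumed shape. The example needs a runtime that supports `AbortSignal.reason`.

```javascript
// Hypothetical stand-in for the bundle's internal `errorSignals(error).text`;
// assumed to be the lower-cased error message.
function errorText(error) {
  const message = error instanceof Error ? error.message : String(error);
  return message.toLowerCase();
}

// Mirrors the added predicate: the error name is checked first, then the message.
function isAbortError(error) {
  if (typeof error?.name === "string" && error.name === "AbortError") return true;
  const text = errorText(error);
  return text.includes("operation was aborted") || text.includes("operation was canceled");
}

// Mirrors the added default gate: everything except a caller abort fails over.
function shouldFailover(error) {
  return !isAbortError(error);
}

// A caller cancellation: aborting without an explicit reason produces
// a DOMException whose name is "AbortError".
const controller = new AbortController();
controller.abort();
const abortReason = controller.signal.reason;

// A provider-side rejection (assumed error shape).
const providerError = Object.assign(
  new Error("400 Bad Request: unsupported parameter"),
  { statusCode: 400 }
);

console.log(shouldFailover(abortReason));   // false: never re-issue an aborted request
console.log(shouldFailover(providerError)); // true: the next provider may accept it
```

The message fallback matters for errors that were wrapped or re-thrown and lost their original `name`; the name check alone would miss those.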
@@ -281,7 +294,7 @@ var LcrFallbackModel = class {
     this.lastFailoverAt = Date.now();
   }
   shouldRetry(error) {
-    return (this.opts.shouldRetry ??
+    return (this.opts.shouldRetry ?? shouldFailover)(error);
   }
   // Observer callbacks are caller-supplied logging hooks: a throw from one of
   // them must NEVER turn a successful (or already-failed) request into a
@@ -314,6 +327,7 @@ var LcrFallbackModel = class {
   }
   /** Record a failed attempt onto the call's chain (no event yet). */
   recordFail(ctx, provider, attemptStart, error) {
+    if (ctx.firstError === void 0) ctx.firstError = error;
     ctx.attempts.push({
       provider: provider.label,
       ok: false,
@@ -429,7 +443,7 @@ var LcrFallbackModel = class {
       }
     }
     this.finalizeFail(ctx);
-    throw lastError;
+    throw ctx.firstError ?? lastError;
   }
   async doStream(options) {
     return this.doStreamWithCtx(options, this.startCall(options), this.startIndex(), 0);
@@ -465,7 +479,7 @@ var LcrFallbackModel = class {
         tried++;
         if (tried >= n) {
           this.finalizeFail(ctx);
-          throw error;
+          throw ctx.firstError ?? error;
         }
         idx = (idx + 1) % n;
       }
@@ -513,7 +527,7 @@ var LcrFallbackModel = class {
             const nextTried = triedBeforeServing + 1;
             if (nextTried >= n) {
               self.finalizeFail(ctx);
-              controller.error(error);
+              controller.error(ctx.firstError ?? error);
               return;
             }
             try {
@@ -1229,7 +1243,16 @@ function withDefaultCacheRead(p, ratio) {
   return { ...p, cost: { ...p.cost, cacheRead: p.cost.input * ratio } };
 }
 function createLCR(config) {
-  const {
+  const {
+    models,
+    autoSort = false,
+    resetIntervalMs,
+    onError,
+    onCost,
+    onCall,
+    shouldRetry,
+    defaultCacheReadRatio
+  } = config;
   if (defaultCacheReadRatio !== void 0 && (defaultCacheReadRatio < 0 || defaultCacheReadRatio > 1)) {
     throw new Error(
       `ai-lcr: defaultCacheReadRatio must be in [0, 1], got ${defaultCacheReadRatio}`
@@ -1243,7 +1266,7 @@ function createLCR(config) {
     }
     routed.set(
       name,
-      new LcrFallbackModel({ modelName: name, providers, resetIntervalMs, onError, onCost, onCall })
+      new LcrFallbackModel({ modelName: name, providers, resetIntervalMs, onError, onCost, onCall, shouldRetry })
     );
   }
   return (modelName) => {
@@ -1272,7 +1295,11 @@ function createLCR(config) {
   createMediaLCR,
   createRunwareMediaAdapter,
   formatCallRecord,
+  isAbortError,
+  isNetworkError,
+  isRetryableError,
   normalizedCents,
   rankRoutes,
-  referenceMegapixels
+  referenceMegapixels,
+  shouldFailover
 });
package/dist/index.d.cts
CHANGED
@@ -165,6 +165,40 @@ interface CallRecord {
      */
     emptyCompletion?: boolean;
 }
+/**
+ * A transport-level failure (provider unreachable / socket dropped / DNS /
+ * connect timeout). These carry no HTTP status, so they must be detected
+ * structurally — by Node `code` or message — or they read as non-retryable.
+ * Note: a deliberate caller cancellation (AbortError without a network code) is
+ * intentionally NOT treated as network here, so we don't "fail over" a request
+ * the caller chose to abort.
+ */
+declare function isNetworkError(error: unknown): boolean;
+/** Default switch criterion: provider down / rate-limited / overloaded / unreachable. */
+declare function isRetryableError(error: unknown): boolean;
+/**
+ * A deliberate caller cancellation (an `AbortSignal` fired by the app). This is
+ * the one failure we must NEVER fail over: re-issuing an aborted request to the
+ * next provider is the opposite of what the caller asked for. Detected by name
+ * (`fetch`/AI SDK emit an `AbortError`) and by the canonical abort message.
+ */
+declare function isAbortError(error: unknown): boolean;
+/**
+ * Default failover criterion — broader than {@link isRetryableError} on purpose.
+ * It fails over on *anything* except a deliberate caller cancellation, including
+ * a client error such as a 400. In the OpenAI-compatible aggregator world a 400
+ * is most often "THIS provider won't take this request" (an unsupported param, a
+ * model it hasn't listed, a stricter schema) rather than a universally-broken
+ * request — and the next provider may well serve it, which is the whole point of
+ * the router. When every provider rejects the request, the engine still throws
+ * (surfacing the original error), so a genuinely-bad request stays debuggable.
+ * The failed attempts keep their precise {@link ErrorKind} (`"client"` for a
+ * 400) so a real caller bug is still visible in the {@link CallRecord}.
+ *
+ * Pass a custom `shouldRetry` to opt out (e.g. `isRetryableError` to restore the
+ * stricter "client errors fail fast" behavior).
+ */
+declare function shouldFailover(error: unknown): boolean;
 /**
  * Normalize an error into a short, log-friendly class for {@link CallRecord}.
  * An HTTP status wins (e.g. "502", "429"); otherwise the first matching
@@ -589,6 +623,14 @@ interface LCRConfig {
      * you. Pair with `formatCallRecord` for a one-line log. See {@link CallRecord}.
      */
     onCall?: (record: CallRecord) => void;
+    /**
+     * Decide whether a failed attempt should fail over to the next provider.
+     * Defaults to {@link shouldFailover} — fail over on everything except a
+     * deliberate caller cancellation, so a provider-specific 400 still survives by
+     * trying the next provider. Pass {@link isRetryableError} to restore the
+     * stricter behavior where a client error (e.g. 400) fails fast.
+     */
+    shouldRetry?: (error: unknown) => boolean;
     /**
      * Fallback prompt-cache read rate, as a fraction of each leg's `input` price,
      * applied ONLY to legs whose `cost` omits an explicit `cacheRead`. So a leg
@@ -614,4 +656,4 @@ type LCRRouter = (modelName: string) => LanguageModelV3;
  */
 declare function createLCR(config: LCRConfig): LCRRouter;
 
-export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type HttpSinkOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, OFFICIAL_PRICES, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createFalMediaAdapter, createHttpSink, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };
+export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type HttpSinkOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, OFFICIAL_PRICES, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createFalMediaAdapter, createHttpSink, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, isAbortError, isNetworkError, isRetryableError, normalizedCents, rankRoutes, referenceMegapixels, shouldFailover };
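Because `shouldRetry` is typed as `(error: unknown) => boolean`, a policy between the two documented extremes can be composed by hand. The sketch below is an illustration under stated assumptions, not a recommended policy: `statusOf`, the `statusCode` / `status` error fields, `failoverExceptAuth`, and `resolveShouldRetry` are hypothetical names, and the local `isAbortError` / `shouldFailover` are simplified copies so the snippet runs without the package. It behaves like the default gate, except that authentication failures (401/403) are treated as caller-side configuration problems and fail fast.

```javascript
// Simplified local copies so the sketch runs without the package installed.
const isAbortError = (error) =>
  typeof error?.name === "string" && error.name === "AbortError";
const shouldFailover = (error) => !isAbortError(error);

// Hypothetical helper: read an HTTP status from common error shapes (assumed fields).
function statusOf(error) {
  const status = error?.statusCode ?? error?.status;
  return typeof status === "number" ? status : undefined;
}

// Custom policy: the default behavior, minus authentication failures.
function failoverExceptAuth(error) {
  if (!shouldFailover(error)) return false; // never re-issue a caller abort
  const status = statusOf(error);
  return status !== 401 && status !== 403;
}

// Mirrors the `opts.shouldRetry ?? shouldFailover` fallback shown in the bundle:
// an omitted option falls back to the default gate.
function resolveShouldRetry(opts = {}) {
  return opts.shouldRetry ?? shouldFailover;
}

const httpError = (status) =>
  Object.assign(new Error(`HTTP ${status}`), { statusCode: status });

console.log(failoverExceptAuth(httpError(400))); // true: provider-specific, try the next one
console.log(failoverExceptAuth(httpError(401))); // false: fix the credentials instead
```

With the real package, such a predicate would be passed as `createLCR({ models, shouldRetry: failoverExceptAuth })`, where `models` stands for an existing configuration.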
package/dist/index.d.ts
CHANGED
@@ -165,6 +165,40 @@ interface CallRecord {
      */
     emptyCompletion?: boolean;
 }
+/**
+ * A transport-level failure (provider unreachable / socket dropped / DNS /
+ * connect timeout). These carry no HTTP status, so they must be detected
+ * structurally — by Node `code` or message — or they read as non-retryable.
+ * Note: a deliberate caller cancellation (AbortError without a network code) is
+ * intentionally NOT treated as network here, so we don't "fail over" a request
+ * the caller chose to abort.
+ */
+declare function isNetworkError(error: unknown): boolean;
+/** Default switch criterion: provider down / rate-limited / overloaded / unreachable. */
+declare function isRetryableError(error: unknown): boolean;
+/**
+ * A deliberate caller cancellation (an `AbortSignal` fired by the app). This is
+ * the one failure we must NEVER fail over: re-issuing an aborted request to the
+ * next provider is the opposite of what the caller asked for. Detected by name
+ * (`fetch`/AI SDK emit an `AbortError`) and by the canonical abort message.
+ */
+declare function isAbortError(error: unknown): boolean;
+/**
+ * Default failover criterion — broader than {@link isRetryableError} on purpose.
+ * It fails over on *anything* except a deliberate caller cancellation, including
+ * a client error such as a 400. In the OpenAI-compatible aggregator world a 400
+ * is most often "THIS provider won't take this request" (an unsupported param, a
+ * model it hasn't listed, a stricter schema) rather than a universally-broken
+ * request — and the next provider may well serve it, which is the whole point of
+ * the router. When every provider rejects the request, the engine still throws
+ * (surfacing the original error), so a genuinely-bad request stays debuggable.
+ * The failed attempts keep their precise {@link ErrorKind} (`"client"` for a
+ * 400) so a real caller bug is still visible in the {@link CallRecord}.
+ *
+ * Pass a custom `shouldRetry` to opt out (e.g. `isRetryableError` to restore the
+ * stricter "client errors fail fast" behavior).
+ */
+declare function shouldFailover(error: unknown): boolean;
 /**
  * Normalize an error into a short, log-friendly class for {@link CallRecord}.
  * An HTTP status wins (e.g. "502", "429"); otherwise the first matching
@@ -589,6 +623,14 @@ interface LCRConfig {
      * you. Pair with `formatCallRecord` for a one-line log. See {@link CallRecord}.
      */
     onCall?: (record: CallRecord) => void;
+    /**
+     * Decide whether a failed attempt should fail over to the next provider.
+     * Defaults to {@link shouldFailover} — fail over on everything except a
+     * deliberate caller cancellation, so a provider-specific 400 still survives by
+     * trying the next provider. Pass {@link isRetryableError} to restore the
+     * stricter behavior where a client error (e.g. 400) fails fast.
+     */
+    shouldRetry?: (error: unknown) => boolean;
     /**
      * Fallback prompt-cache read rate, as a fraction of each leg's `input` price,
      * applied ONLY to legs whose `cost` omits an explicit `cacheRead`. So a leg
@@ -614,4 +656,4 @@ type LCRRouter = (modelName: string) => LanguageModelV3;
  */
 declare function createLCR(config: LCRConfig): LCRRouter;
 
-export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type HttpSinkOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, OFFICIAL_PRICES, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createFalMediaAdapter, createHttpSink, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };
+export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type HttpSinkOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, OFFICIAL_PRICES, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createFalMediaAdapter, createHttpSink, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, isAbortError, isNetworkError, isRetryableError, normalizedCents, rankRoutes, referenceMegapixels, shouldFailover };
package/dist/index.js
CHANGED
@@ -116,6 +116,15 @@ function isRetryableError(error) {
   const { text } = errorSignals(error);
   return RETRYABLE_PATTERNS.some((p) => text.includes(p));
 }
+function isAbortError(error) {
+  const e = error;
+  if (typeof e?.name === "string" && e.name === "AbortError") return true;
+  const { text } = errorSignals(error);
+  return text.includes("operation was aborted") || text.includes("operation was canceled");
+}
+function shouldFailover(error) {
+  return !isAbortError(error);
+}
 function classifyError(error) {
   if (error instanceof EmptyCompletionError) return "empty_completion";
   const e = error;
@@ -239,7 +248,7 @@ var LcrFallbackModel = class {
     this.lastFailoverAt = Date.now();
   }
   shouldRetry(error) {
-    return (this.opts.shouldRetry ??
+    return (this.opts.shouldRetry ?? shouldFailover)(error);
   }
   // Observer callbacks are caller-supplied logging hooks: a throw from one of
   // them must NEVER turn a successful (or already-failed) request into a
@@ -272,6 +281,7 @@ var LcrFallbackModel = class {
   }
   /** Record a failed attempt onto the call's chain (no event yet). */
   recordFail(ctx, provider, attemptStart, error) {
+    if (ctx.firstError === void 0) ctx.firstError = error;
     ctx.attempts.push({
       provider: provider.label,
       ok: false,
@@ -387,7 +397,7 @@ var LcrFallbackModel = class {
       }
     }
     this.finalizeFail(ctx);
-    throw lastError;
+    throw ctx.firstError ?? lastError;
   }
   async doStream(options) {
     return this.doStreamWithCtx(options, this.startCall(options), this.startIndex(), 0);
@@ -423,7 +433,7 @@ var LcrFallbackModel = class {
         tried++;
         if (tried >= n) {
           this.finalizeFail(ctx);
-          throw error;
+          throw ctx.firstError ?? error;
         }
         idx = (idx + 1) % n;
       }
@@ -471,7 +481,7 @@ var LcrFallbackModel = class {
             const nextTried = triedBeforeServing + 1;
             if (nextTried >= n) {
               self.finalizeFail(ctx);
-              controller.error(error);
+              controller.error(ctx.firstError ?? error);
               return;
             }
             try {
@@ -1187,7 +1197,16 @@ function withDefaultCacheRead(p, ratio) {
   return { ...p, cost: { ...p.cost, cacheRead: p.cost.input * ratio } };
 }
 function createLCR(config) {
-  const {
+  const {
+    models,
+    autoSort = false,
+    resetIntervalMs,
+    onError,
+    onCost,
+    onCall,
+    shouldRetry,
+    defaultCacheReadRatio
+  } = config;
   if (defaultCacheReadRatio !== void 0 && (defaultCacheReadRatio < 0 || defaultCacheReadRatio > 1)) {
     throw new Error(
       `ai-lcr: defaultCacheReadRatio must be in [0, 1], got ${defaultCacheReadRatio}`
@@ -1201,7 +1220,7 @@ function createLCR(config) {
     }
     routed.set(
       name,
-      new LcrFallbackModel({ modelName: name, providers, resetIntervalMs, onError, onCost, onCall })
+      new LcrFallbackModel({ modelName: name, providers, resetIntervalMs, onError, onCost, onCall, shouldRetry })
     );
   }
   return (modelName) => {
@@ -1229,7 +1248,11 @@ export {
   createMediaLCR,
   createRunwareMediaAdapter,
   formatCallRecord,
+  isAbortError,
+  isNetworkError,
+  isRetryableError,
   normalizedCents,
   rankRoutes,
-  referenceMegapixels
+  referenceMegapixels,
+  shouldFailover
 };
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "ai-lcr",
-  "version": "0.5.
+  "version": "0.5.4",
   "description": "Least Cost Routing for LLMs — route every model call to the cheapest available provider, fall back automatically, and track real cost. Built for the Vercel AI SDK.",
   "keywords": [
     "ai",