npm - ai-lcr - Versions diffs - 0.5.2 → 0.5.4 - Mend

ai-lcr 0.5.2 → 0.5.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (8)

package/CHANGELOG.md CHANGED

@@ -4,6 +4,57 @@ All notable changes to `ai-lcr` are documented here. The format follows
 [Keep a Changelog](https://keepachangelog.com/), and the project adheres to
 [Semantic Versioning](https://semver.org/).
+## [0.5.4] — 2026-06-03
+### Changed
+- **A provider 400 now fails over instead of being passed through.** Previously
+  any client error (400/422/…) was treated as the caller's fault and thrown
+  immediately, killing the request even when another provider would have served
+  it. But across OpenAI-compatible aggregators a 400 is most often
+  *provider-specific* — an unsupported parameter, a model the provider hasn't
+  listed, a stricter JSON schema — not a universally-broken request. The default
+  failover gate (`shouldFailover`) now advances to the next provider on **any**
+  failure except a deliberate caller cancellation (`AbortSignal`), which is the
+  one thing we must never re-issue elsewhere. When every provider rejects the
+  request it still throws — now surfacing the **first** (original) error rather
+  than the last fallback's, so a genuine caller bug stays debuggable. Failed
+  attempts keep their precise `ErrorKind` (`"client"` for a 400) in the
+  `CallRecord`, so a real bug is still visible.
+  To restore the old "client errors fail fast" behavior, pass
+  `shouldRetry: isRetryableError` to `createLCR`.
+### Added
+- **`createLCR({ shouldRetry })`.** The failover predicate is now configurable
+  from the top-level API (it previously existed only on the internal engine), so
+  callers can tune or fully override the policy above.
+- **Exported error predicates** `isRetryableError`, `isNetworkError`,
+  `isAbortError`, and `shouldFailover` — building blocks for a custom
+  `shouldRetry`.
+## [0.5.3] — 2026-06-03
+All additions are optional and backward compatible.
+### Added
+- **`defaultCacheReadRatio` — chain-wide fallback price for prompt-cache reads.**
+  ai-lcr already detects cache hits from the provider's reported usage and emits
+  `cachedInputTokens` for any provider that reports them (Anthropic, Gemini's
+  implicit cache, DeepSeek, …). But the *saving* (`cachedSavingUsd`) and the
+  cache-discounted `costUsd` were only computed when a leg set an explicit
+  `cost.cacheRead` — so a route that forgot it (e.g. a Gemini OpenRouter leg)
+  silently reported `$0` saved and billed cached tokens at the full input rate.
+  `createLCR({ defaultCacheReadRatio: 0.1 })` now supplies a fallback cache-read
+  price as a fraction of each leg's `input`, applied **only** to legs that omit
+  an explicit `cacheRead`. Most providers' cache-read price is ~0.1× input, so
+  `0.1` makes cache cost + savings "just work" across every model without each
+  route hardcoding a rate. Legs with their own `cacheRead` are untouched (set it
+  for outliers like OpenAI's ~0.5×). Unset = previous behavior. Must be in [0, 1].
 ## [0.5.0] — 2026-06-02
 All additions are optional and backward compatible.
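
The 0.5.3 entry above describes `defaultCacheReadRatio` as a fallback cache-read price derived from each leg's `input` price. The arithmetic can be sketched as below. This is a minimal illustration, not the package's code: the `Cost` interface, `applyDefaultCacheRead`, and the saving formula in `cachedSavingUsd` are assumptions made for this example (the changelog names the `cachedSavingUsd` field but does not show how it is computed), and prices are taken to be USD per 1M tokens.

```typescript
// Hypothetical cost shape (USD per 1M tokens); field names follow the changelog.
interface Cost {
  input: number;
  output: number;
  cacheRead?: number;
}

// Fill in cacheRead only when the leg omits it and a ratio is configured.
function applyDefaultCacheRead(cost: Cost, ratio?: number): Cost {
  if (ratio === undefined || cost.cacheRead !== undefined) return cost;
  if (ratio < 0 || ratio > 1) {
    throw new Error(`defaultCacheReadRatio must be in [0, 1], got ${ratio}`);
  }
  return { ...cost, cacheRead: cost.input * ratio };
}

// Assumed formula: saving = cached tokens x (input price - cache-read price).
// Without a cacheRead price, cached tokens bill at the input rate and save 0.
function cachedSavingUsd(cost: Cost, cachedInputTokens: number): number {
  const cacheRead = cost.cacheRead ?? cost.input;
  return (cachedInputTokens * (cost.input - cacheRead)) / 1_000_000;
}

const leg = applyDefaultCacheRead({ input: 0.5, output: 3 }, 0.1);
console.log(leg.cacheRead, cachedSavingUsd(leg, 1_000_000));
```

With the changelog's own example leg (`{ input: 0.5, output: 3 }`, ratio `0.1`), the fallback cache-read price is about `0.05` per 1M tokens, and a leg that already sets `cacheRead` is returned unchanged.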

package/README.md CHANGED

@@ -141,7 +141,7 @@ DeepInfra carries open weights only — no first-party Claude / GPT / Gemini. Fo
 ## How it routes
 1. **Cheapest first.** Providers are tried in order — list them cheapest-first, or set `autoSort: true` to order them by `cost` automatically.
-2. **Fall through on failure.** On a retryable error — rate limit, 5xx, timeout, or a **billing cap** (402 / out-of-credit / quota) — it advances to the next provider, streaming-safe. A caller's own bad request (e.g. 400, 422) passes through immediately.
+2. **Fall through on failure.** On any provider failure — rate limit, 5xx, timeout, a **billing cap** (402 / out-of-credit / quota), *and* a client error like a **400** — it advances to the next provider, streaming-safe. A 400 fails over on purpose: across OpenAI-compatible aggregators a 400 is usually "*this* provider won't take this request" (an unsupported param, a model it hasn't listed, a stricter schema), not a universally-broken request — so the next provider may well serve it. If every provider rejects the request it still fails, surfacing the **original** error so a genuine caller bug stays debuggable. The one failure that never fails over is a deliberate caller cancellation (`AbortSignal`). Pass `shouldRetry: isRetryableError` to `createLCR` to restore the stricter "client errors fail fast" behavior.
 3. **Recover.** After an idle window (`resetIntervalMs`, default 60s) it snaps back to the cheapest provider.
 ## See what happened (`onCall`)
@@ -364,7 +364,7 @@ npm run typecheck
 npm test          # mocked routing/failover tests + live Kunavo tests
 ```
-The suite covers cheapest-first routing, failover on retryable errors (and *not* failing over on a 400), exhausting the whole chain, and a real broken-provider → Kunavo recovery. Live tests run only when `KUNAVO_API_KEY` is set in the environment; otherwise they're skipped.
+The suite covers cheapest-first routing, failover on retryable errors *and* on a provider 400 (but *not* on a caller cancellation), surfacing the original error when the whole chain is exhausted, and a real broken-provider → Kunavo recovery. Live tests run only when `KUNAVO_API_KEY` is set in the environment; otherwise they're skipped.
 ## Credits

package/README.zh-CN.md CHANGED

@@ -141,7 +141,7 @@ DeepInfra 只承载开源权重——没有第一方 Claude / GPT / Gemini。那
 ## 它如何路由
 1. **最便宜优先。** provider 按顺序依次尝试——把它们排成最便宜优先，或设置 `autoSort: true` 让它按 `cost` 自动排序。
-2. **失败时向下穿透。** 遇到可重试的错误（限流、5xx、超时）时，前进到下一个 provider，且对流式安全。硬错误（400、401、403、422）会直接透传，不做重试。
+2. **失败时向下穿透。** 遇到任何 provider 失败——限流、5xx、超时、**额度耗尽**（402 / 欠费 / 余额不足），以及 **400** 这类 client 错误——都会前进到下一个 provider，且对流式安全。400 会 failover 是有意为之：在 OpenAI 兼容聚合层里，400 往往是"*这家* provider 不吃这个请求"（不支持的参数、它没上架这个 model、更严格的 schema），而非请求本身坏了——换一家很可能就能服务。若所有 provider 都拒绝，请求仍会失败，并抛出**第一个**（原始）错误，让真正的调用方 bug 保持可调试。唯一永远不 failover 的是调用方主动取消（`AbortSignal`）。想恢复旧的"client 错误立即失败"行为，给 `createLCR` 传 `shouldRetry: isRetryableError`。
 3. **恢复。** 在一段空闲窗口（`resetIntervalMs`，默认 60s）之后，自动回到最便宜的 provider。
 ## 支持的 provider
@@ -280,7 +280,7 @@ npm run typecheck
 npm test          # mock 的路由 / failover 测试 + 真实 Kunavo 测试
 ```
-测试套件覆盖了：最便宜优先路由、可重试错误时的 failover（以及遇到 400 时*不*做 failover）、穷尽整条链路，以及一次真实的「provider 故障 → Kunavo 恢复」。真实测试仅在环境变量 `KUNAVO_API_KEY` 设置时运行，否则跳过。
+测试套件覆盖了：最便宜优先路由、可重试错误以及 provider 400 时的 failover（但调用方主动取消时*不*做 failover）、穷尽整条链路时抛出原始错误，以及一次真实的「provider 故障 → Kunavo 恢复」。真实测试仅在环境变量 `KUNAVO_API_KEY` 设置时运行，否则跳过。
 ## 致谢

package/dist/index.cjs CHANGED

@@ -34,9 +34,13 @@ __export(index_exports, {
   createMediaLCR: () => createMediaLCR,
   createRunwareMediaAdapter: () => createRunwareMediaAdapter,
   formatCallRecord: () => formatCallRecord,
+  isAbortError: () => isAbortError,
+  isNetworkError: () => isNetworkError,
+  isRetryableError: () => isRetryableError,
   normalizedCents: () => normalizedCents,
   rankRoutes: () => rankRoutes,
-  referenceMegapixels: () => referenceMegapixels
+  referenceMegapixels: () => referenceMegapixels,
+  shouldFailover: () => shouldFailover
 });
 module.exports = __toCommonJS(index_exports);
@@ -158,6 +162,15 @@ function isRetryableError(error) {
   const { text } = errorSignals(error);
   return RETRYABLE_PATTERNS.some((p) => text.includes(p));
 }
+function isAbortError(error) {
+  const e = error;
+  if (typeof e?.name === "string" && e.name === "AbortError") return true;
+  const { text } = errorSignals(error);
+  return text.includes("operation was aborted") || text.includes("operation was canceled");
+}
+function shouldFailover(error) {
+  return !isAbortError(error);
+}
 function classifyError(error) {
   if (error instanceof EmptyCompletionError) return "empty_completion";
   const e = error;
@@ -281,7 +294,7 @@ var LcrFallbackModel = class {
     this.lastFailoverAt = Date.now();
   }
   shouldRetry(error) {
-    return (this.opts.shouldRetry ?? isRetryableError)(error);
+    return (this.opts.shouldRetry ?? shouldFailover)(error);
   }
   // Observer callbacks are caller-supplied logging hooks: a throw from one of
   // them must NEVER turn a successful (or already-failed) request into a
@@ -314,6 +327,7 @@ var LcrFallbackModel = class {
   }
   /** Record a failed attempt onto the call's chain (no event yet). */
   recordFail(ctx, provider, attemptStart, error) {
+    if (ctx.firstError === void 0) ctx.firstError = error;
     ctx.attempts.push({
       provider: provider.label,
       ok: false,
@@ -429,7 +443,7 @@ var LcrFallbackModel = class {
       }
     }
     this.finalizeFail(ctx);
-    throw lastError;
+    throw ctx.firstError ?? lastError;
   }
   async doStream(options) {
     return this.doStreamWithCtx(options, this.startCall(options), this.startIndex(), 0);
@@ -465,7 +479,7 @@ var LcrFallbackModel = class {
         tried++;
         if (tried >= n) {
           this.finalizeFail(ctx);
-          throw error;
+          throw ctx.firstError ?? error;
         }
         idx = (idx + 1) % n;
       }
@@ -513,7 +527,7 @@ var LcrFallbackModel = class {
             const nextTried = triedBeforeServing + 1;
             if (nextTried >= n) {
               self.finalizeFail(ctx);
-              controller.error(error);
+              controller.error(ctx.firstError ?? error);
               return;
             }
             try {
@@ -1224,17 +1238,35 @@ function normalize(entry) {
 function priceKey(p) {
   return p.cost ? p.cost.input + p.cost.output : Number.POSITIVE_INFINITY;
 }
+function withDefaultCacheRead(p, ratio) {
+  if (ratio === void 0 || !p.cost || p.cost.cacheRead !== void 0) return p;
+  return { ...p, cost: { ...p.cost, cacheRead: p.cost.input * ratio } };
+}
 function createLCR(config) {
-  const { models, autoSort = false, resetIntervalMs, onError, onCost, onCall } = config;
+  const {
+    models,
+    autoSort = false,
+    resetIntervalMs,
+    onError,
+    onCost,
+    onCall,
+    shouldRetry,
+    defaultCacheReadRatio
+  } = config;
+  if (defaultCacheReadRatio !== void 0 && (defaultCacheReadRatio < 0 || defaultCacheReadRatio > 1)) {
+    throw new Error(
+      `ai-lcr: defaultCacheReadRatio must be in [0, 1], got ${defaultCacheReadRatio}`
+    );
+  }
   const routed = /* @__PURE__ */ new Map();
   for (const [name, entries] of Object.entries(models)) {
-    let providers = entries.map(normalize);
+    let providers = entries.map(normalize).map((p) => withDefaultCacheRead(p, defaultCacheReadRatio));
     if (autoSort) {
       providers = [...providers].sort((a, b) => priceKey(a) - priceKey(b));
     }
     routed.set(
       name,
-      new LcrFallbackModel({ modelName: name, providers, resetIntervalMs, onError, onCost, onCall })
+      new LcrFallbackModel({ modelName: name, providers, resetIntervalMs, onError, onCost, onCall, shouldRetry })
     );
   }
   return (modelName) => {
@@ -1263,7 +1295,11 @@ function createLCR(config) {
   createMediaLCR,
   createRunwareMediaAdapter,
   formatCallRecord,
+  isAbortError,
+  isNetworkError,
+  isRetryableError,
   normalizedCents,
   rankRoutes,
-  referenceMegapixels
+  referenceMegapixels,
+  shouldFailover
 });
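
Taken together, the hunks above change two things in the engine: the default predicate becomes "fail over unless the caller aborted", and an exhausted chain throws the first recorded error instead of the last. A simplified sketch of that control flow follows. It is an assumption-laden illustration rather than the `LcrFallbackModel` implementation: `callWithFailover`, `Attempt`, and `isAbort` are hypothetical names, streaming is ignored, and rethrowing the current error when the predicate declines to fail over is a guess at behavior the diff does not show.

```typescript
// Each attempt stands in for one provider call (hypothetical shape).
type Attempt<T> = () => Promise<T>;

// Simplified abort check: name only. The real predicate also inspects messages.
function isAbort(error: unknown): boolean {
  return error instanceof Error && error.name === "AbortError";
}

async function callWithFailover<T>(
  attempts: Attempt<T>[],
  shouldRetry: (error: unknown) => boolean = (e) => !isAbort(e),
): Promise<T> {
  let firstError: unknown;
  let sawError = false;
  for (const attempt of attempts) {
    try {
      return await attempt();
    } catch (error) {
      // Remember only the first failure, mirroring `ctx.firstError` in the diff.
      if (!sawError) {
        firstError = error;
        sawError = true;
      }
      // Assumed: a non-retryable error (e.g. a caller abort) stops the chain.
      if (!shouldRetry(error)) throw error;
    }
  }
  // Chain exhausted: surface the original error, not the last fallback's.
  throw sawError ? firstError : new Error("no providers configured");
}
```

Passing a stricter predicate as the second argument plays the same role as `shouldRetry` on `createLCR`: the loop is unchanged and only the decision to advance differs.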

package/dist/index.d.cts CHANGED

@@ -165,6 +165,40 @@ interface CallRecord {
      */
     emptyCompletion?: boolean;
 }
+/**
+ * A transport-level failure (provider unreachable / socket dropped / DNS /
+ * connect timeout). These carry no HTTP status, so they must be detected
+ * structurally — by Node `code` or message — or they read as non-retryable.
+ * Note: a deliberate caller cancellation (AbortError without a network code) is
+ * intentionally NOT treated as network here, so we don't "fail over" a request
+ * the caller chose to abort.
+ */
+declare function isNetworkError(error: unknown): boolean;
+/** Default switch criterion: provider down / rate-limited / overloaded / unreachable. */
+declare function isRetryableError(error: unknown): boolean;
+/**
+ * A deliberate caller cancellation (an `AbortSignal` fired by the app). This is
+ * the one failure we must NEVER fail over: re-issuing an aborted request to the
+ * next provider is the opposite of what the caller asked for. Detected by name
+ * (`fetch`/AI SDK emit an `AbortError`) and by the canonical abort message.
+ */
+declare function isAbortError(error: unknown): boolean;
+/**
+ * Default failover criterion — broader than {@link isRetryableError} on purpose.
+ * It fails over on *anything* except a deliberate caller cancellation, including
+ * a client error such as a 400. In the OpenAI-compatible aggregator world a 400
+ * is most often "THIS provider won't take this request" (an unsupported param, a
+ * model it hasn't listed, a stricter schema) rather than a universally-broken
+ * request — and the next provider may well serve it, which is the whole point of
+ * the router. When every provider rejects the request, the engine still throws
+ * (surfacing the original error), so a genuinely-bad request stays debuggable.
+ * The failed attempts keep their precise {@link ErrorKind} (`"client"` for a
+ * 400) so a real caller bug is still visible in the {@link CallRecord}.
+ *
+ * Pass a custom `shouldRetry` to opt out (e.g. `isRetryableError` to restore the
+ * stricter "client errors fail fast" behavior).
+ */
+declare function shouldFailover(error: unknown): boolean;
 /**
  * Normalize an error into a short, log-friendly class for {@link CallRecord}.
  * An HTTP status wins (e.g. "502", "429"); otherwise the first matching
@@ -589,6 +623,29 @@ interface LCRConfig {
      * you. Pair with `formatCallRecord` for a one-line log. See {@link CallRecord}.
      */
     onCall?: (record: CallRecord) => void;
+    /**
+     * Decide whether a failed attempt should fail over to the next provider.
+     * Defaults to {@link shouldFailover} — fail over on everything except a
+     * deliberate caller cancellation, so a provider-specific 400 still survives by
+     * trying the next provider. Pass {@link isRetryableError} to restore the
+     * stricter behavior where a client error (e.g. 400) fails fast.
+     */
+    shouldRetry?: (error: unknown) => boolean;
+    /**
+     * Fallback prompt-cache read rate, as a fraction of each leg's `input` price,
+     * applied ONLY to legs whose `cost` omits an explicit `cacheRead`. So a leg
+     * priced `{ input: 0.5, output: 3 }` with `defaultCacheReadRatio: 0.1` bills
+     * its cached input tokens at `0.05`/1M and reports the resulting
+     * `cachedSavingUsd` — without every route having to hardcode `cacheRead`.
+     *
+     * Most providers' cache-read price is ~0.1× input (Anthropic, Gemini, DeepSeek);
+     * `0.1` is a sane default. Legs with their own `cacheRead` are untouched, so set
+     * it explicitly for outliers (e.g. OpenAI's ~0.5×). Unset = pre-existing
+     * behavior: cached tokens bill at the full input rate and save nothing.
+     * Caching is detected from the provider's reported usage either way; this only
+     * controls the *price* applied to it. Must be in [0, 1].
+     */
+    defaultCacheReadRatio?: number;
 }
 /** Resolve a logical model name to a routed model. */
 type LCRRouter = (modelName: string) => LanguageModelV3;
@@ -599,4 +656,4 @@ type LCRRouter = (modelName: string) => LanguageModelV3;
  */
 declare function createLCR(config: LCRConfig): LCRRouter;
-export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type HttpSinkOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, OFFICIAL_PRICES, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createFalMediaAdapter, createHttpSink, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };
+export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type HttpSinkOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, OFFICIAL_PRICES, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createFalMediaAdapter, createHttpSink, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, isAbortError, isNetworkError, isRetryableError, normalizedCents, rankRoutes, referenceMegapixels, shouldFailover };
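
The `shouldRetry` option documented above takes any `(error: unknown) => boolean`, so a caller can sit between the two shipped policies. The sketch below shows one possible custom predicate: fail over on everything except a caller cancellation and authentication failures. It is self-contained on purpose, so `isAbortError` here is a local stand-in rather than the package export, and `statusOf` is a hypothetical helper that assumes the error carries a numeric `statusCode` property.

```typescript
// Local stand-in for the exported predicate (name check only, for illustration).
const isAbortError = (error: unknown): boolean =>
  error instanceof Error && error.name === "AbortError";

// Hypothetical helper: assumes HTTP failures expose a numeric `statusCode`.
function statusOf(error: unknown): number | undefined {
  if (typeof error === "object" && error !== null && "statusCode" in error) {
    const status = (error as { statusCode?: unknown }).statusCode;
    return typeof status === "number" ? status : undefined;
  }
  return undefined;
}

// Custom policy: never re-issue an aborted request, and treat 401/403 as a
// configuration problem worth failing fast on; everything else fails over.
function failoverExceptAuth(error: unknown): boolean {
  if (isAbortError(error)) return false;
  const status = statusOf(error);
  return status !== 401 && status !== 403;
}
```

A predicate like this would be passed as `createLCR({ models, shouldRetry: failoverExceptAuth })`, with `models` being whatever routing table the application already uses.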

package/dist/index.d.ts CHANGED

@@ -165,6 +165,40 @@ interface CallRecord {
      */
     emptyCompletion?: boolean;
 }
+/**
+ * A transport-level failure (provider unreachable / socket dropped / DNS /
+ * connect timeout). These carry no HTTP status, so they must be detected
+ * structurally — by Node `code` or message — or they read as non-retryable.
+ * Note: a deliberate caller cancellation (AbortError without a network code) is
+ * intentionally NOT treated as network here, so we don't "fail over" a request
+ * the caller chose to abort.
+ */
+declare function isNetworkError(error: unknown): boolean;
+/** Default switch criterion: provider down / rate-limited / overloaded / unreachable. */
+declare function isRetryableError(error: unknown): boolean;
+/**
+ * A deliberate caller cancellation (an `AbortSignal` fired by the app). This is
+ * the one failure we must NEVER fail over: re-issuing an aborted request to the
+ * next provider is the opposite of what the caller asked for. Detected by name
+ * (`fetch`/AI SDK emit an `AbortError`) and by the canonical abort message.
+ */
+declare function isAbortError(error: unknown): boolean;
+/**
+ * Default failover criterion — broader than {@link isRetryableError} on purpose.
+ * It fails over on *anything* except a deliberate caller cancellation, including
+ * a client error such as a 400. In the OpenAI-compatible aggregator world a 400
+ * is most often "THIS provider won't take this request" (an unsupported param, a
+ * model it hasn't listed, a stricter schema) rather than a universally-broken
+ * request — and the next provider may well serve it, which is the whole point of
+ * the router. When every provider rejects the request, the engine still throws
+ * (surfacing the original error), so a genuinely-bad request stays debuggable.
+ * The failed attempts keep their precise {@link ErrorKind} (`"client"` for a
+ * 400) so a real caller bug is still visible in the {@link CallRecord}.
+ *
+ * Pass a custom `shouldRetry` to opt out (e.g. `isRetryableError` to restore the
+ * stricter "client errors fail fast" behavior).
+ */
+declare function shouldFailover(error: unknown): boolean;
 /**
  * Normalize an error into a short, log-friendly class for {@link CallRecord}.
  * An HTTP status wins (e.g. "502", "429"); otherwise the first matching
@@ -589,6 +623,29 @@ interface LCRConfig {
      * you. Pair with `formatCallRecord` for a one-line log. See {@link CallRecord}.
      */
     onCall?: (record: CallRecord) => void;
+    /**
+     * Decide whether a failed attempt should fail over to the next provider.
+     * Defaults to {@link shouldFailover} — fail over on everything except a
+     * deliberate caller cancellation, so a provider-specific 400 still survives by
+     * trying the next provider. Pass {@link isRetryableError} to restore the
+     * stricter behavior where a client error (e.g. 400) fails fast.
+     */
+    shouldRetry?: (error: unknown) => boolean;
+    /**
+     * Fallback prompt-cache read rate, as a fraction of each leg's `input` price,
+     * applied ONLY to legs whose `cost` omits an explicit `cacheRead`. So a leg
+     * priced `{ input: 0.5, output: 3 }` with `defaultCacheReadRatio: 0.1` bills
+     * its cached input tokens at `0.05`/1M and reports the resulting
+     * `cachedSavingUsd` — without every route having to hardcode `cacheRead`.
+     *
+     * Most providers' cache-read price is ~0.1× input (Anthropic, Gemini, DeepSeek);
+     * `0.1` is a sane default. Legs with their own `cacheRead` are untouched, so set
+     * it explicitly for outliers (e.g. OpenAI's ~0.5×). Unset = pre-existing
+     * behavior: cached tokens bill at the full input rate and save nothing.
+     * Caching is detected from the provider's reported usage either way; this only
+     * controls the *price* applied to it. Must be in [0, 1].
+     */
+    defaultCacheReadRatio?: number;
 }
 /** Resolve a logical model name to a routed model. */
 type LCRRouter = (modelName: string) => LanguageModelV3;
@@ -599,4 +656,4 @@ type LCRRouter = (modelName: string) => LanguageModelV3;
  */
 declare function createLCR(config: LCRConfig): LCRRouter;
-export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type HttpSinkOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, OFFICIAL_PRICES, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createFalMediaAdapter, createHttpSink, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };
+export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type HttpSinkOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, OFFICIAL_PRICES, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createFalMediaAdapter, createHttpSink, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, isAbortError, isNetworkError, isRetryableError, normalizedCents, rankRoutes, referenceMegapixels, shouldFailover };

package/dist/index.js CHANGED

@@ -116,6 +116,15 @@ function isRetryableError(error) {
   const { text } = errorSignals(error);
   return RETRYABLE_PATTERNS.some((p) => text.includes(p));
 }
+function isAbortError(error) {
+  const e = error;
+  if (typeof e?.name === "string" && e.name === "AbortError") return true;
+  const { text } = errorSignals(error);
+  return text.includes("operation was aborted") || text.includes("operation was canceled");
+}
+function shouldFailover(error) {
+  return !isAbortError(error);
+}
 function classifyError(error) {
   if (error instanceof EmptyCompletionError) return "empty_completion";
   const e = error;
@@ -239,7 +248,7 @@ var LcrFallbackModel = class {
     this.lastFailoverAt = Date.now();
   }
   shouldRetry(error) {
-    return (this.opts.shouldRetry ?? isRetryableError)(error);
+    return (this.opts.shouldRetry ?? shouldFailover)(error);
   }
   // Observer callbacks are caller-supplied logging hooks: a throw from one of
   // them must NEVER turn a successful (or already-failed) request into a
@@ -272,6 +281,7 @@ var LcrFallbackModel = class {
   }
   /** Record a failed attempt onto the call's chain (no event yet). */
   recordFail(ctx, provider, attemptStart, error) {
+    if (ctx.firstError === void 0) ctx.firstError = error;
     ctx.attempts.push({
       provider: provider.label,
       ok: false,
@@ -387,7 +397,7 @@ var LcrFallbackModel = class {
       }
     }
     this.finalizeFail(ctx);
-    throw lastError;
+    throw ctx.firstError ?? lastError;
   }
   async doStream(options) {
     return this.doStreamWithCtx(options, this.startCall(options), this.startIndex(), 0);
@@ -423,7 +433,7 @@ var LcrFallbackModel = class {
         tried++;
         if (tried >= n) {
           this.finalizeFail(ctx);
-          throw error;
+          throw ctx.firstError ?? error;
         }
         idx = (idx + 1) % n;
       }
@@ -471,7 +481,7 @@ var LcrFallbackModel = class {
             const nextTried = triedBeforeServing + 1;
             if (nextTried >= n) {
               self.finalizeFail(ctx);
-              controller.error(error);
+              controller.error(ctx.firstError ?? error);
               return;
             }
             try {
@@ -1182,17 +1192,35 @@ function normalize(entry) {
 function priceKey(p) {
   return p.cost ? p.cost.input + p.cost.output : Number.POSITIVE_INFINITY;
 }
+function withDefaultCacheRead(p, ratio) {
+  if (ratio === void 0 || !p.cost || p.cost.cacheRead !== void 0) return p;
+  return { ...p, cost: { ...p.cost, cacheRead: p.cost.input * ratio } };
+}
 function createLCR(config) {
-  const { models, autoSort = false, resetIntervalMs, onError, onCost, onCall } = config;
+  const {
+    models,
+    autoSort = false,
+    resetIntervalMs,
+    onError,
+    onCost,
+    onCall,
+    shouldRetry,
+    defaultCacheReadRatio
+  } = config;
+  if (defaultCacheReadRatio !== void 0 && (defaultCacheReadRatio < 0 || defaultCacheReadRatio > 1)) {
+    throw new Error(
+      `ai-lcr: defaultCacheReadRatio must be in [0, 1], got ${defaultCacheReadRatio}`
+    );
+  }
   const routed = /* @__PURE__ */ new Map();
   for (const [name, entries] of Object.entries(models)) {
-    let providers = entries.map(normalize);
+    let providers = entries.map(normalize).map((p) => withDefaultCacheRead(p, defaultCacheReadRatio));
     if (autoSort) {
       providers = [...providers].sort((a, b) => priceKey(a) - priceKey(b));
     }
     routed.set(
       name,
-      new LcrFallbackModel({ modelName: name, providers, resetIntervalMs, onError, onCost, onCall })
+      new LcrFallbackModel({ modelName: name, providers, resetIntervalMs, onError, onCost, onCall, shouldRetry })
     );
   }
   return (modelName) => {
@@ -1220,7 +1248,11 @@ export {
   createMediaLCR,
   createRunwareMediaAdapter,
   formatCallRecord,
+  isAbortError,
+  isNetworkError,
+  isRetryableError,
   normalizedCents,
   rankRoutes,
-  referenceMegapixels
+  referenceMegapixels,
+  shouldFailover
 };
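
The `priceKey` and `autoSort` context lines in the hunk above determine the order in which providers are tried, which is also the order failover walks through. A typed restatement of that ordering is sketched below. The `Provider` interface and the `sortCheapestFirst` name are assumptions for this example; the key (`input + output`, or positive infinity when a provider has no `cost`) is taken from the diff.

```typescript
// Hypothetical provider shape: a label plus an optional per-token cost.
interface Provider {
  label: string;
  cost?: { input: number; output: number };
}

// Sort key from the diff: unpriced providers sort last.
function priceKey(p: Provider): number {
  return p.cost ? p.cost.input + p.cost.output : Number.POSITIVE_INFINITY;
}

// Copy before sorting so the caller's array is left untouched.
function sortCheapestFirst(providers: Provider[]): Provider[] {
  return [...providers].sort((a, b) => priceKey(a) - priceKey(b));
}

const ordered = sortCheapestFirst([
  { label: "unpriced" },
  { label: "premium", cost: { input: 3, output: 15 } },
  { label: "budget", cost: { input: 0.5, output: 3 } },
]);
console.log(ordered.map((p) => p.label)); // budget, premium, unpriced
```

Because the fallback cache-read price is applied before sorting and `priceKey` reads only `input` and `output`, setting `defaultCacheReadRatio` does not change this order.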

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "ai-lcr",
-  "version": "0.5.2",
+  "version": "0.5.4",
   "description": "Least Cost Routing for LLMs — route every model call to the cheapest available provider, fall back automatically, and track real cost. Built for the Vercel AI SDK.",
   "keywords": [
     "ai",