ai-lcr 0.2.2 → 0.2.3

package/CHANGELOG.md ADDED
@@ -0,0 +1,60 @@
1
+ # Changelog
2
+
3
+ All notable changes to `ai-lcr` are documented here. The format follows
4
+ [Keep a Changelog](https://keepachangelog.com/), and the project adheres to
5
+ [Semantic Versioning](https://semver.org/).
6
+
7
+ ## [0.2.3] — 2026-06-01
8
+
9
+ Release-quality and engine-correctness pass.
10
+
11
+ ### Fixed
12
+
13
+ - **Build was red on `main`.** `media.ts` set `CallRecord.baselineUsd` but the
14
+ type never declared it, so `tsc`/`npm run build` failed while `npm test`
15
+ (which doesn't typecheck) stayed green. `baselineUsd?: number` is now part of
16
+ `CallRecord`. The text router leaves it `undefined`; the media router sets it.
17
+ - **Failover used shared mutable state across concurrent requests.** The active
18
+ provider index was an instance field used both as the per-request loop cursor
19
+ and the loop's termination check. Two requests sharing one model instance
20
+ could clobber each other's cursor mid-flight (skipped providers, wrong
21
+ termination). Each request now walks providers on a fully local cursor; the
22
+ only shared state is a "where to start next" hint, read once and written once.
23
+ - **Cheapest provider was never re-probed under sustained traffic.** The
24
+ snap-back-to-cheapest timer reset on *every* call, so with calls more frequent
25
+ than `resetIntervalMs` it never fired — one blip pinned you on the expensive
26
+ fallback indefinitely (exactly when spend is highest). The timer now measures
27
+ from the last *failover*, so the re-probe fires under load too.
28
+
29
+ ### Added
30
+
31
+ - **`classifyErrorKind(error)` and `RouteAttempt.kind`** (`"transient" | "auth"
32
+ | "billing" | "client"`). 401/403 (auth) and 402/out-of-credit (billing)
33
+ still fail over so the request survives — but they're now tagged distinctly
34
+ from transient 429/5xx, so a misconfigured key silently burning the pricey
35
+ fallback is something you can alert on instead of mistaking for healthy
36
+ routing.
37
+ - **Continuous Integration** (`.github/workflows/ci.yml`): `build` +
38
+ `typecheck` + `test` on Node 20 & 22, plus a `pack-smoke` job that installs
39
+ the actual `npm pack` tarball into a clean directory and imports it (ESM and
40
+ CJS) — catching dropped exports and broken `dist` that an in-repo test can't.
41
+ - **`prepublishOnly` gate**: `npm publish` now runs build + typecheck + test
42
+ first, so a red tree can't be published.
43
+ - **Public-export surface test** (`public-api.test.ts`): pins every runtime
44
+ export by name, so removing one fails loudly and adding one is deliberate.
45
+
46
+ ## [0.2.1] — earlier
47
+
48
+ - `onCall` correlated `CallRecord` + `formatCallRecord` one-liner for the text
49
+ router, extended to the media router (image/video).
50
+
51
+ ## [0.2.0] — earlier
52
+
53
+ - Observability: `onCall` / `CallRecord`, `formatCallRecord`.
54
+
55
+ ## [0.1.x] — earlier
56
+
57
+ - Dual ESM/CJS build. Media (image/video) least-cost routing with the Runware
58
+ and Kunavo adapters; cap-aware failover for the text router.
59
+
60
+ [0.2.3]: https://github.com/victorzhrn/ai-lcr/releases/tag/v0.2.3
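
The re-probe fix described under "Fixed" can be illustrated with a small simulation. This is a minimal sketch, not the package's source: the `StickyCursor` class, its injectable `now` clock, and `simulateBusyTraffic` are hypothetical names introduced here. It assumes two providers, a `resetIntervalMs` of 1000, and one call every 100 ms after a single failover.

```typescript
// Hypothetical, simplified model of the 0.2.3 re-probe timer (not the real class).
class StickyCursor {
  private sticky = 0; // index the next request starts at
  private lastFailoverAt: number; // stamped only when `sticky` changes
  private readonly resetIntervalMs: number;
  private readonly now: () => number; // injectable clock: an assumption, for testability

  constructor(resetIntervalMs: number, now: () => number) {
    this.resetIntervalMs = resetIntervalMs;
    this.now = now;
    this.lastFailoverAt = now();
  }

  // Snap back to the cheapest provider once the interval since the last
  // failover has elapsed, regardless of how many calls happened in between.
  startIndex(): number {
    if (this.sticky !== 0 && this.now() - this.lastFailoverAt >= this.resetIntervalMs) {
      this.sticky = 0;
    }
    return this.sticky;
  }

  // Park on the winner; a repeat win on the same provider does not touch the timer.
  settle(winIndex: number): void {
    if (winIndex === this.sticky) return;
    this.sticky = winIndex;
    this.lastFailoverAt = this.now();
  }
}

// One failover at t=0, then a call every 100 ms until t=1000.
function simulateBusyTraffic(): number[] {
  let t = 0;
  const cursor = new StickyCursor(1000, () => t);
  cursor.startIndex();
  cursor.settle(1); // cheapest failed, fallback (index 1) won
  const starts: number[] = [];
  for (t = 100; t <= 1000; t += 100) {
    const start = cursor.startIndex();
    starts.push(start);
    cursor.settle(start); // assume the starting provider succeeds
  }
  return starts;
}
```

In this sketch the first nine calls start on the fallback and the call at t=1000 starts on the cheapest provider again. Under the old behavior the changelog describes, where every call reset the timer, calls spaced 100 ms apart would never have reached the 1000 ms threshold.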
package/README.md CHANGED
@@ -156,7 +156,7 @@ Any OpenAI-compatible endpoint works — and so does any AI SDK provider package
156
156
 
157
157
  - **Model vendors' own APIs (native):** route straight to [DeepSeek](https://platform.deepseek.com), [OpenAI](https://openai.com), [Anthropic](https://anthropic.com), [Google](https://ai.google.dev), [xAI](https://x.ai), etc. via their AI SDK provider packages — no markup, full native features. See [Route to a model vendor's own API](#route-to-a-model-vendors-own-api-native-providers).
158
158
  - **Text aggregators:** [OpenRouter](https://openrouter.ai) (widest coverage, list pricing) · [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off** every model) · [TokenMart](https://thetokenmart.ai) (15–65% off, varies by model)
159
- - **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) — routing via `createMediaLCR`. Image: Kunavo + Runware + fal. Video: fal (live, via its async queue API); Kunavo's Veo poll path is implemented but unverified
159
+ - **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) — image routing available via `createMediaLCR` (Kunavo + Runware adapters); video on the roadmap
160
160
 
161
161
  ## Text model pricing
162
162
 
@@ -273,8 +273,7 @@ Two OpenAI-compatible providers, same probe, same day. Cells cover both families
273
273
  - [ ] Bundled price table for zero-config pricing (drop the manual `cost` numbers)
274
274
  - [ ] Provider-quirk middleware (transparently patch known per-provider request quirks, e.g. Kunavo's ignored `max_tokens`)
275
275
  - [ ] Feed probe results into routing automatically (auto-exclude a model from a provider that fails its probe)
276
- - [x] Image & video model routing (`createMediaLCR`): image via Kunavo + Runware + fal; **video live via fal** (async queue API)
277
- - [ ] Normalized cross-provider video price comparison + verified Kunavo/Runware video adapters
276
+ - [ ] Image & video model routing (fal.ai / Runware / Kunavo)
278
277
 
279
278
  ## Affiliate disclosure
280
279
 
package/README.zh-CN.md CHANGED
@@ -114,7 +114,7 @@ const lcr = createLCR({
114
114
 
115
115
  - **Model vendors' official APIs (native):** connect directly to [DeepSeek](https://platform.deepseek.com), [OpenAI](https://openai.com), [Anthropic](https://anthropic.com), [Google](https://ai.google.dev), [xAI](https://x.ai), and others through their AI SDK provider packages, with no markup and full native features. See the section "Route to a model vendor's official API (native providers)" above.
116
116
  - **Text aggregators:** [OpenRouter](https://openrouter.ai) (widest coverage, list pricing) · [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off every model**) · [TokenMart](https://thetokenmart.ai) (15–65% off, varies by model)
117
- - **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai), routed via `createMediaLCR`. Image: Kunavo + Runware + fal. Video: fal (live, via its async queue API); Kunavo's Veo poll path is implemented but unverified
117
+ - **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai); routing is on the roadmap
118
118
 
119
119
  ## Text model pricing
120
120
 
@@ -229,8 +229,7 @@ API_KEY=$TOKENMART_API_KEY BASE=https://api.tokenmart.ai \
229
229
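  - [ ] Bundled price table for zero-config pricing (no more hand-entered `cost` numbers)
230
230
  - [ ] Provider-quirk middleware (transparently patches known quirks, e.g. Kunavo's ignored `max_tokens`)
231
231
  - [ ] Feed probe results into routing automatically (a provider×model pair that fails its probe is dropped from the list)
232
- - [x] Image & video model routing (`createMediaLCR`): image via Kunavo + Runware + fal; **video live via fal** (async queue API)
233
- - [ ] Normalized cross-provider video price comparison + verified Kunavo/Runware video adapters
232
+ - [ ] Image & video model routing (fal.ai / Runware / Kunavo)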
234
233
 
235
234
  ## Affiliate disclosure
236
235
 
package/dist/index.cjs CHANGED
@@ -21,12 +21,11 @@ var __toCommonJS = (mod) => __copyProps(__defProp({}, "__esModule", { value: tru
21
21
  var index_exports = {};
22
22
  __export(index_exports, {
23
23
  DEFAULT_REFERENCE: () => DEFAULT_REFERENCE,
24
- FalMediaError: () => FalMediaError,
25
24
  MEDIA_PRICING: () => MEDIA_PRICING,
26
25
  cheapestRoute: () => cheapestRoute,
27
26
  classifyError: () => classifyError,
27
+ classifyErrorKind: () => classifyErrorKind,
28
28
  comparePrices: () => comparePrices,
29
- createFalMediaAdapter: () => createFalMediaAdapter,
30
29
  createKunavoMediaAdapter: () => createKunavoMediaAdapter,
31
30
  createLCR: () => createLCR,
32
31
  createMediaLCR: () => createMediaLCR,
@@ -87,6 +86,16 @@ function classifyError(error) {
87
86
  const text = (e?.message ? String(e.message) : safeStringify(error)).toLowerCase();
88
87
  return RETRYABLE_PATTERNS.find((p) => text.includes(p)) ?? "error";
89
88
  }
89
+ var AUTH_STATUS = /* @__PURE__ */ new Set([401, 403]);
90
+ var BILLING_PATTERNS = ["insufficient", "credit", "quota", "billing", "payment required"];
91
+ function classifyErrorKind(error) {
92
+ const e = error;
93
+ const status = e?.statusCode ?? e?.status;
94
+ const text = (e?.message ? String(e.message) : safeStringify(error)).toLowerCase();
95
+ if (typeof status === "number" && AUTH_STATUS.has(status)) return "auth";
96
+ if (status === 402 || BILLING_PATTERNS.some((p) => text.includes(p))) return "billing";
97
+ return isRetryableError(error) ? "transient" : "client";
98
+ }
90
99
  var callSeq = 0;
91
100
  function newCallId() {
92
101
  const c = globalThis.crypto;
@@ -103,11 +112,20 @@ var LcrFallbackModel = class {
103
112
  }
104
113
  opts;
105
114
  specificationVersion = "v3";
106
- index = 0;
107
- lastReset = Date.now();
115
+ // Cross-request *hint* for where the next request starts: after a failover we
116
+ // remember the provider that worked so we don't re-probe a dead cheap one on
117
+ // every call. This is the ONLY shared mutable state — and crucially it is read
118
+ // once per request (snapshotted into a local cursor) and written once on
119
+ // settle, never used as a per-request loop bound. The within-request iteration
120
+ // is fully local, so concurrent requests can't corrupt each other's routing.
121
+ sticky = 0;
122
+ // When `sticky` was last advanced (a failover). The re-probe timer measures
123
+ // from THIS, not from the last call — so it fires under sustained traffic too,
124
+ // instead of being pushed forward forever by a busy stream of requests.
125
+ lastFailoverAt = Date.now();
108
126
  resetIntervalMs;
109
127
  get current() {
110
- return this.opts.providers[this.index];
128
+ return this.opts.providers[this.sticky];
111
129
  }
112
130
  get modelId() {
113
131
  return this.current.model.modelId;
@@ -118,14 +136,28 @@ var LcrFallbackModel = class {
118
136
  get supportedUrls() {
119
137
  return this.current.model.supportedUrls;
120
138
  }
121
- checkReset() {
122
- if (this.index !== 0 && Date.now() - this.lastReset >= this.resetIntervalMs) {
123
- this.index = 0;
139
+ /**
140
+ * Index a new request should start at. If we're parked on a non-cheapest
141
+ * provider and it's been `resetIntervalMs` since the failover, snap back to
142
+ * the cheapest and re-probe it — this is what lets routing recover to the
143
+ * cheap source even during continuous traffic.
144
+ */
145
+ startIndex() {
146
+ if (this.sticky !== 0 && Date.now() - this.lastFailoverAt >= this.resetIntervalMs) {
147
+ this.sticky = 0;
124
148
  }
125
- this.lastReset = Date.now();
126
- }
127
- switchNext() {
128
- this.index = (this.index + 1) % this.opts.providers.length;
149
+ return this.sticky;
150
+ }
151
+ /**
152
+ * A request settled on `winIndex`. Park there so the next request skips the
153
+ * providers we just learned are down. Stamp the failover time only when the
154
+ * parked provider actually CHANGES — so a steady stream of successful calls
155
+ * on the same fallback doesn't keep pushing the re-probe timer forward.
156
+ */
157
+ settleSticky(winIndex) {
158
+ if (winIndex === this.sticky) return;
159
+ this.sticky = winIndex;
160
+ this.lastFailoverAt = Date.now();
129
161
  }
130
162
  shouldRetry(error) {
131
163
  return (this.opts.shouldRetry ?? isRetryableError)(error);
@@ -139,7 +171,8 @@ var LcrFallbackModel = class {
139
171
  provider: provider.label,
140
172
  ok: false,
141
173
  latencyMs: Date.now() - attemptStart,
142
- errorClass: classifyError(error)
174
+ errorClass: classifyError(error),
175
+ kind: classifyErrorKind(error)
143
176
  });
144
177
  }
145
178
  /** Winner settled: record the attempt, fire `onCost` (compat) + `onCall`. */
@@ -184,15 +217,18 @@ var LcrFallbackModel = class {
184
217
  });
185
218
  }
186
219
  async doGenerate(options) {
187
- this.checkReset();
188
220
  const ctx = this.startCall();
189
- const start = this.index;
221
+ const providers = this.opts.providers;
222
+ const n = providers.length;
223
+ const start = this.startIndex();
190
224
  let lastError;
191
- for (; ; ) {
192
- const provider = this.current;
225
+ for (let tried = 0; tried < n; tried++) {
226
+ const idx = (start + tried) % n;
227
+ const provider = providers[idx];
193
228
  const attemptStart = Date.now();
194
229
  try {
195
230
  const result = await provider.model.doGenerate(options);
231
+ this.settleSticky(idx);
196
232
  this.finalizeOk(ctx, provider, attemptStart, result.usage);
197
233
  return result;
198
234
  } catch (error) {
@@ -204,29 +240,30 @@ var LcrFallbackModel = class {
204
240
  }
205
241
  this.opts.onError?.(error, provider.label);
206
242
  this.recordFail(ctx, provider, attemptStart, error);
207
- this.switchNext();
208
- if (this.index === start) {
209
- this.finalizeFail(ctx);
210
- throw lastError;
211
- }
212
243
  }
213
244
  }
245
+ this.finalizeFail(ctx);
246
+ throw lastError;
214
247
  }
215
248
  async doStream(options) {
216
- this.checkReset();
217
- return this.doStreamWithCtx(options, this.startCall());
218
- }
219
- // The stream's failover recursion re-enters here with the SAME `ctx`, so a
220
- // mid-stream switch keeps appending to one CallRecord instead of starting a
221
- // fresh one. `finalizeOk`/`finalizeFail` fire exactly once per outer request.
222
- async doStreamWithCtx(options, ctx) {
249
+ return this.doStreamWithCtx(options, this.startCall(), this.startIndex(), 0);
250
+ }
251
+ // The stream's failover recursion re-enters here with the SAME `ctx` and a
252
+ // threaded-through local cursor (`idx`/`tried`), so a mid-stream switch keeps
253
+ // appending to one CallRecord and bounds itself on the local `tried` count —
254
+ // never on shared instance state. `finalizeOk`/`finalizeFail` fire exactly
255
+ // once per outer request.
256
+ async doStreamWithCtx(options, ctx, startIdx, alreadyTried) {
223
257
  const self = this;
224
- const start = this.index;
258
+ const providers = this.opts.providers;
259
+ const n = providers.length;
225
260
  let result;
226
261
  let serving;
227
262
  let servingStart;
263
+ let idx = startIdx;
264
+ let tried = alreadyTried;
228
265
  for (; ; ) {
229
- serving = this.current;
266
+ serving = providers[idx];
230
267
  servingStart = Date.now();
231
268
  try {
232
269
  result = await serving.model.doStream(options);
@@ -239,15 +276,18 @@ var LcrFallbackModel = class {
239
276
  }
240
277
  this.opts.onError?.(error, serving.label);
241
278
  this.recordFail(ctx, serving, servingStart, error);
242
- this.switchNext();
243
- if (this.index === start) {
279
+ tried++;
280
+ if (tried >= n) {
244
281
  this.finalizeFail(ctx);
245
282
  throw error;
246
283
  }
284
+ idx = (idx + 1) % n;
247
285
  }
248
286
  }
249
287
  const servingProvider = serving;
250
288
  const servingAttemptStart = servingStart;
289
+ const servingIdx = idx;
290
+ const triedBeforeServing = tried;
251
291
  let usage;
252
292
  let streamedAny = false;
253
293
  const stream = new ReadableStream({
@@ -266,20 +306,26 @@ var LcrFallbackModel = class {
266
306
  controller.enqueue(value);
267
307
  if (value.type !== "stream-start") streamedAny = true;
268
308
  }
309
+ self.settleSticky(servingIdx);
269
310
  self.finalizeOk(ctx, servingProvider, servingAttemptStart, usage);
270
311
  controller.close();
271
312
  } catch (error) {
272
313
  self.opts.onError?.(error, servingProvider.label);
273
314
  self.recordFail(ctx, servingProvider, servingAttemptStart, error);
274
315
  if (!streamedAny) {
275
- self.switchNext();
276
- if (self.index === start) {
316
+ const nextTried = triedBeforeServing + 1;
317
+ if (nextTried >= n) {
277
318
  self.finalizeFail(ctx);
278
319
  controller.error(error);
279
320
  return;
280
321
  }
281
322
  try {
282
- const next = await self.doStreamWithCtx(options, ctx);
323
+ const next = await self.doStreamWithCtx(
324
+ options,
325
+ ctx,
326
+ (servingIdx + 1) % n,
327
+ nextTried
328
+ );
283
329
  const nextReader = next.stream.getReader();
284
330
  try {
285
331
  for (; ; ) {
@@ -733,108 +779,6 @@ var RunwareMediaError = class extends Error {
733
779
  status;
734
780
  };
735
781
 
736
- // src/adapters/fal-media.ts
737
- var DEFAULT_BASE3 = "https://queue.fal.run";
738
- function extractOutputs(raw) {
739
- if (!raw || typeof raw !== "object") return [];
740
- const data = raw;
741
- const out = [];
742
- const pushUrl = (url, type) => {
743
- if (typeof url === "string" && url.length > 0) out.push({ url, type });
744
- };
745
- if (Array.isArray(data.images)) {
746
- for (const img of data.images) pushUrl(img?.url, "image");
747
- }
748
- pushUrl(data.image?.url, "image");
749
- if (Array.isArray(data.videos)) {
750
- for (const v of data.videos) pushUrl(v?.url, "video");
751
- }
752
- pushUrl(data.video?.url, "video");
753
- return out;
754
- }
755
- function createFalMediaAdapter(config) {
756
- const {
757
- apiKey,
758
- baseUrl = DEFAULT_BASE3,
759
- pollIntervalMs = 3e3,
760
- pollTimeoutMs = 3e5,
761
- fetchImpl = fetch
762
- } = config;
763
- const headers = {
764
- "content-type": "application/json",
765
- authorization: `Key ${apiKey}`
766
- };
767
- return {
768
- provider: "fal",
769
- async run(req) {
770
- const submitRes = await fetchImpl(`${baseUrl}/${req.externalId}`, {
771
- method: "POST",
772
- headers,
773
- body: JSON.stringify(req.input)
774
- });
775
- if (!submitRes.ok) {
776
- throw new FalMediaError(submitRes.status, await safeText2(submitRes));
777
- }
778
- const submit = await submitRes.json();
779
- const statusUrl = submit.status_url;
780
- const responseUrl = submit.response_url;
781
- if (!statusUrl || !responseUrl) {
782
- throw new Error(
783
- `ai-lcr: fal submit for "${req.externalId}" returned no status/response URL (keys: ${Object.keys(
784
- submit
785
- ).join(", ")})`
786
- );
787
- }
788
- const deadline = Date.now() + pollTimeoutMs;
789
- let completed = false;
790
- while (Date.now() < deadline) {
791
- const statusRes = await fetchImpl(statusUrl, { headers });
792
- if (!statusRes.ok) {
793
- throw new FalMediaError(statusRes.status, await safeText2(statusRes));
794
- }
795
- const status = String((await statusRes.json()).status ?? "");
796
- if (status === "COMPLETED") {
797
- completed = true;
798
- break;
799
- }
800
- await sleep2(pollIntervalMs);
801
- }
802
- if (!completed) {
803
- throw new Error(
804
- `ai-lcr: fal job for "${req.externalId}" timed out after ${pollTimeoutMs}ms`
805
- );
806
- }
807
- const resultRes = await fetchImpl(responseUrl, { headers });
808
- if (!resultRes.ok) {
809
- throw new FalMediaError(resultRes.status, await safeText2(resultRes));
810
- }
811
- const outputs = extractOutputs(await resultRes.json());
812
- if (outputs.length === 0) {
813
- throw new Error(`ai-lcr: fal returned no media URL for "${req.externalId}"`);
814
- }
815
- return { outputs, units: outputs.length };
816
- }
817
- };
818
- }
819
- var FalMediaError = class extends Error {
820
- constructor(status, body) {
821
- super(`fal media HTTP ${status}: ${body.slice(0, 300)}`);
822
- this.status = status;
823
- this.name = "FalMediaError";
824
- }
825
- status;
826
- };
827
- function sleep2(ms) {
828
- return new Promise((r) => setTimeout(r, ms));
829
- }
830
- async function safeText2(res) {
831
- try {
832
- return await res.text();
833
- } catch {
834
- return "<no body>";
835
- }
836
- }
837
-
838
782
  // src/index.ts
839
783
  function isLanguageModel(entry) {
840
784
  return typeof entry.doGenerate === "function";
@@ -878,12 +822,11 @@ function createLCR(config) {
878
822
  // Annotate the CommonJS export names for ESM import in node:
879
823
  0 && (module.exports = {
880
824
  DEFAULT_REFERENCE,
881
- FalMediaError,
882
825
  MEDIA_PRICING,
883
826
  cheapestRoute,
884
827
  classifyError,
828
+ classifyErrorKind,
885
829
  comparePrices,
886
- createFalMediaAdapter,
887
830
  createKunavoMediaAdapter,
888
831
  createLCR,
889
832
  createMediaLCR,
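
The local-cursor loop that `doGenerate` adopts in this diff can be reduced to a short sketch. The `Provider` interface and `walkProviders` function below are hypothetical stand-ins introduced for illustration (the real class wraps `LanguageModelV3` providers and also consults `shouldRetry`, which is omitted here). The point it tries to show is that the loop bound and cursor are locals, so two requests walking the same provider list cannot disturb each other.

```typescript
// Hypothetical minimal provider shape; not the package's ProviderEntry type.
interface Provider<T> {
  label: string;
  call: () => Promise<T>;
}

interface WalkResult<T> {
  value: T;
  winIndex: number; // index of the provider that served the request
  tried: string[]; // labels in the order they were attempted
}

// Try each provider at most once, starting at `start` and wrapping around.
// `i` and `idx` are per-call locals: nothing here reads or writes shared state.
async function walkProviders<T>(providers: Provider<T>[], start: number): Promise<WalkResult<T>> {
  const n = providers.length;
  const tried: string[] = [];
  let lastError: unknown = new Error("no providers configured");
  for (let i = 0; i < n; i++) {
    const idx = (start + i) % n;
    const provider = providers[idx];
    tried.push(provider.label);
    try {
      const value = await provider.call();
      return { value, winIndex: idx, tried };
    } catch (error) {
      lastError = error; // remember the failure and move to the next provider
    }
  }
  throw lastError; // every provider failed
}
```

A caller would snapshot the shared start hint once, pass it as `start`, and write `winIndex` back once on success, which mirrors the `startIndex()` / `settleSticky(idx)` pair in the diff above.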
package/dist/index.d.cts CHANGED
@@ -5,8 +5,10 @@ import { LanguageModelV3 } from '@ai-sdk/provider';
5
5
  *
6
6
  * A LanguageModelV3 that wraps an ordered, cheapest-first list of providers:
7
7
  * it serves from the first healthy one, switches to the next on a retryable
8
- * error (streaming-safe), and snaps back to the cheapest after an idle window.
9
- * It also computes per-call cost from each provider's price and fires `onCost`.
8
+ * error (streaming-safe), and periodically re-probes the cheapest provider
9
+ * (every `resetIntervalMs` after a failover, under load too, not only when
10
+ * idle). It also computes per-call cost from each provider's price and fires
11
+ * `onCost`.
10
12
  *
11
13
  * The switching loop is adapted from `ai-fallback` (MIT, © remorses) — its
12
14
  * streaming-safe fallback approach — reimplemented here so ai-lcr owns its core
@@ -28,6 +30,17 @@ interface CostEvent {
28
30
  /** Computed from the serving provider's `cost`; 0 if no price was given. */
29
31
  costUsd: number;
30
32
  }
33
+ /**
34
+ * Coarse error category for a failed attempt — distinct from `errorClass`
35
+ * (which is the raw status/pattern). Use it to alert: `"auth"` and `"billing"`
36
+ * mean a config/account problem masquerading as a healthy failover, the thing
37
+ * you want to page on rather than silently keep burning the pricey fallback.
38
+ * - "transient": rate limit / overload / 5xx — expected, self-healing.
39
+ * - "auth": 401 / 403 — a misconfigured or revoked key.
40
+ * - "billing": 402 / out-of-credit / quota — account needs topping up.
41
+ * - "client": a non-retryable caller error (e.g. 400 bad request).
42
+ */
43
+ type ErrorKind = "transient" | "auth" | "billing" | "client";
31
44
  /** One provider attempt within a single request. */
32
45
  interface RouteAttempt {
33
46
  /** Provider label that was tried (e.g. "tokenmart"). */
@@ -38,6 +51,8 @@ interface RouteAttempt {
38
51
  latencyMs: number;
39
52
  /** Normalized failure reason when `ok` is false (e.g. "502", "rate_limit", "timeout"). */
40
53
  errorClass?: string;
54
+ /** Coarse category of the failure when `ok` is false. See {@link ErrorKind}. */
55
+ kind?: ErrorKind;
41
56
  }
42
57
  /**
43
58
  * One settled request, with its full failover chain. Emitted exactly once per
@@ -65,10 +80,10 @@ interface CallRecord {
65
80
  /** Computed from the winner's `cost`; 0 if no price was given or the call failed. */
66
81
  costUsd: number;
67
82
  /**
68
- * What the priciest configured route would have cost for this request, so
69
- * `baselineUsd - costUsd` is the saving from routing cheapest-first. Set by
70
- * the media router (`createMediaLCR`), where every route has a known price;
71
- * omitted by the text router, which can't price a baseline per call.
83
+ * What the same request would have cost on the most expensive configured
84
+ * provider: the savings baseline (`baselineUsd - costUsd`). Set by the media
85
+ * router; the text router omits it (left undefined) until a per-call text
86
+ * baseline lands. Optional so both routers share one {@link CallRecord} shape.
72
87
  */
73
88
  baselineUsd?: number;
74
89
  }
@@ -79,6 +94,13 @@ interface CallRecord {
79
94
  * Reuses the same signals as {@link isRetryableError} — no new vocabulary.
80
95
  */
81
96
  declare function classifyError(error: unknown): string;
97
+ /**
98
+ * Categorize an error for alerting. Orthogonal to {@link isRetryableError}
99
+ * (which decides *whether* to fail over) — this decides *how alarming* the
100
+ * failover is. A run of `"auth"`/`"billing"` attempts means you're silently
101
+ * burning the pricey fallback because a key/account is broken: page on it.
102
+ */
103
+ declare function classifyErrorKind(error: unknown): ErrorKind;
82
104
 
83
105
  /**
84
106
  * Human-readable one-liner for a {@link CallRecord}.
@@ -336,53 +358,6 @@ interface RunwareMediaConfig {
336
358
  }
337
359
  declare function createRunwareMediaAdapter(config: RunwareMediaConfig): MediaAdapter;
338
360
 
339
- /**
340
- * fal media adapter — image (queue) + video (queue, async poll).
341
- *
342
- * fal serves every model through one async queue API, so a single submit→poll→
343
- * fetch-result path covers both image and video. That is the whole reason this
344
- * adapter exists: it is ai-lcr's first VIDEO-capable execution path. (The
345
- * Runware adapter is image-only; the Kunavo one's video poll loop is unverified.)
346
- *
347
- * Implementation note: ai-art's fal adapter uses the `@fal-ai/client` SDK, but
348
- * ai-lcr deliberately keeps zero provider SDKs — every adapter is raw `fetch`
349
- * with an injectable `fetchImpl` for testing (see runware-media, kunavo-media).
350
- * So this re-implements the three queue calls against fal's REST endpoints:
351
- *
352
- * 1. submit POST https://queue.fal.run/{model} → { request_id, status_url, response_url }
353
- * 2. status GET {status_url} → { status: IN_QUEUE | IN_PROGRESS | COMPLETED }
354
- * 3. result GET {response_url} → { images:[…] } | { video:{url} } | …
355
- *
356
- * We follow the `status_url` / `response_url` returned by submit rather than
357
- * rebuilding them, which sidesteps fal's sub-path quirk (a model like
358
- * `fal-ai/flux/schnell` submits to the full path but its status/result live
359
- * under the `fal-ai/flux` base).
360
- *
361
- * Auth: fal uses `Authorization: Key {FAL_KEY}` (NOT Bearer).
362
- *
363
- * Cost: fal's queue result does not carry a per-call price, so cost is left to
364
- * the router's normalized estimate (costCents stays undefined; `units` is the
365
- * output count — one image, or one clip).
366
- */
367
-
368
- interface FalMediaConfig {
369
- apiKey: string;
370
- /** Override for testing. Defaults to https://queue.fal.run. */
371
- baseUrl?: string;
372
- /** Video/job poll cadence (ms). Default 3000. */
373
- pollIntervalMs?: number;
374
- /** Max time to wait for a job before giving up (ms). Default 300000 (5m). */
375
- pollTimeoutMs?: number;
376
- /** Injected for testing; defaults to global fetch. */
377
- fetchImpl?: typeof fetch;
378
- }
379
- declare function createFalMediaAdapter(config: FalMediaConfig): MediaAdapter;
380
- /** Carries the HTTP status so the router's `isRetryableError` can classify it. */
381
- declare class FalMediaError extends Error {
382
- status: number;
383
- constructor(status: number, body: string);
384
- }
385
-
386
361
  /**
387
362
  * ai-lcr — Least Cost Routing for LLMs.
388
363
  *
@@ -436,4 +411,4 @@ type LCRRouter = (modelName: string) => LanguageModelV3;
436
411
  */
437
412
  declare function createLCR(config: LCRConfig): LCRRouter;
438
413
 
439
- export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, FalMediaError, type FormatOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, comparePrices, createFalMediaAdapter, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };
414
+ export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };
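
The new `kind` field on `RouteAttempt` is meant for alerting. The following is a minimal sketch of how a consumer might filter for the alarming categories. The `ErrorKind` and `RouteAttempt` types are redeclared locally so the example stands alone (in real use they would presumably be imported from `ai-lcr`), and `alertableAttempts` is a hypothetical helper, not part of the package.

```typescript
// Local copies of the shapes shown in the declarations above.
type ErrorKind = "transient" | "auth" | "billing" | "client";

interface RouteAttempt {
  provider: string;
  ok: boolean;
  latencyMs: number;
  errorClass?: string;
  kind?: ErrorKind;
}

// Hypothetical helper: keep only failed attempts that point at a broken key
// or account, as opposed to ordinary transient failovers.
function alertableAttempts(attempts: RouteAttempt[]): RouteAttempt[] {
  return attempts.filter(
    (a) => !a.ok && (a.kind === "auth" || a.kind === "billing"),
  );
}

// Example failover chain: the cheap provider rejects the key, the fallback serves.
const chain: RouteAttempt[] = [
  { provider: "cheap", ok: false, latencyMs: 40, errorClass: "401", kind: "auth" },
  { provider: "pricey", ok: true, latencyMs: 310 },
];

for (const attempt of alertableAttempts(chain)) {
  console.warn(`ai-lcr: ${attempt.provider} failed with kind=${attempt.kind}`);
}
```

How the attempts list is obtained from a `CallRecord` is not shown in this diff, so the sketch takes a plain array as input.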
package/dist/index.d.ts CHANGED
@@ -5,8 +5,10 @@ import { LanguageModelV3 } from '@ai-sdk/provider';
5
5
  *
6
6
  * A LanguageModelV3 that wraps an ordered, cheapest-first list of providers:
7
7
  * it serves from the first healthy one, switches to the next on a retryable
8
- * error (streaming-safe), and snaps back to the cheapest after an idle window.
9
- * It also computes per-call cost from each provider's price and fires `onCost`.
8
+ * error (streaming-safe), and periodically re-probes the cheapest provider
9
+ * (every `resetIntervalMs` after a failover, under load too, not only when
10
+ * idle). It also computes per-call cost from each provider's price and fires
11
+ * `onCost`.
10
12
  *
11
13
  * The switching loop is adapted from `ai-fallback` (MIT, © remorses) — its
12
14
  * streaming-safe fallback approach — reimplemented here so ai-lcr owns its core
@@ -28,6 +30,17 @@ interface CostEvent {
28
30
  /** Computed from the serving provider's `cost`; 0 if no price was given. */
29
31
  costUsd: number;
30
32
  }
33
+ /**
34
+ * Coarse error category for a failed attempt — distinct from `errorClass`
35
+ * (which is the raw status/pattern). Use it to alert: `"auth"` and `"billing"`
36
+ * mean a config/account problem masquerading as a healthy failover, the thing
37
+ * you want to page on rather than silently keep burning the pricey fallback.
38
+ * - "transient": rate limit / overload / 5xx — expected, self-healing.
39
+ * - "auth": 401 / 403 — a misconfigured or revoked key.
40
+ * - "billing": 402 / out-of-credit / quota — account needs topping up.
41
+ * - "client": a non-retryable caller error (e.g. 400 bad request).
42
+ */
43
+ type ErrorKind = "transient" | "auth" | "billing" | "client";
31
44
  /** One provider attempt within a single request. */
32
45
  interface RouteAttempt {
33
46
  /** Provider label that was tried (e.g. "tokenmart"). */
@@ -38,6 +51,8 @@ interface RouteAttempt {
38
51
  latencyMs: number;
39
52
  /** Normalized failure reason when `ok` is false (e.g. "502", "rate_limit", "timeout"). */
40
53
  errorClass?: string;
54
+ /** Coarse category of the failure when `ok` is false. See {@link ErrorKind}. */
55
+ kind?: ErrorKind;
41
56
  }
42
57
  /**
43
58
  * One settled request, with its full failover chain. Emitted exactly once per
@@ -65,10 +80,10 @@ interface CallRecord {
65
80
  /** Computed from the winner's `cost`; 0 if no price was given or the call failed. */
66
81
  costUsd: number;
67
82
  /**
68
- * What the priciest configured route would have cost for this request, so
69
- * `baselineUsd - costUsd` is the saving from routing cheapest-first. Set by
70
- * the media router (`createMediaLCR`), where every route has a known price;
71
- * omitted by the text router, which can't price a baseline per call.
83
+ * What the same request would have cost on the most expensive configured
84
+ * provider: the savings baseline (`baselineUsd - costUsd`). Set by the media
85
+ * router; the text router omits it (left undefined) until a per-call text
86
+ * baseline lands. Optional so both routers share one {@link CallRecord} shape.
72
87
  */
73
88
  baselineUsd?: number;
74
89
  }
@@ -79,6 +94,13 @@ interface CallRecord {
79
94
  * Reuses the same signals as {@link isRetryableError} — no new vocabulary.
80
95
  */
81
96
  declare function classifyError(error: unknown): string;
97
+ /**
98
+ * Categorize an error for alerting. Orthogonal to {@link isRetryableError}
99
+ * (which decides *whether* to fail over) — this decides *how alarming* the
100
+ * failover is. A run of `"auth"`/`"billing"` attempts means you're silently
101
+ * burning the pricey fallback because a key/account is broken: page on it.
102
+ */
103
+ declare function classifyErrorKind(error: unknown): ErrorKind;
82
104
 
83
105
  /**
84
106
  * Human-readable one-liner for a {@link CallRecord}.
@@ -336,53 +358,6 @@ interface RunwareMediaConfig {
336
358
  }
337
359
  declare function createRunwareMediaAdapter(config: RunwareMediaConfig): MediaAdapter;
338
360
 
339
- /**
340
- * fal media adapter — image (queue) + video (queue, async poll).
341
- *
342
- * fal serves every model through one async queue API, so a single submit→poll→
343
- * fetch-result path covers both image and video. That is the whole reason this
344
- * adapter exists: it is ai-lcr's first VIDEO-capable execution path. (The
345
- * Runware adapter is image-only; the Kunavo one's video poll loop is unverified.)
346
- *
347
- * Implementation note: ai-art's fal adapter uses the `@fal-ai/client` SDK, but
348
- * ai-lcr deliberately keeps zero provider SDKs — every adapter is raw `fetch`
349
- * with an injectable `fetchImpl` for testing (see runware-media, kunavo-media).
350
- * So this re-implements the three queue calls against fal's REST endpoints:
351
- *
352
- * 1. submit POST https://queue.fal.run/{model} → { request_id, status_url, response_url }
353
- * 2. status GET {status_url} → { status: IN_QUEUE | IN_PROGRESS | COMPLETED }
354
- * 3. result GET {response_url} → { images:[…] } | { video:{url} } | …
355
- *
356
- * We follow the `status_url` / `response_url` returned by submit rather than
357
- * rebuilding them, which sidesteps fal's sub-path quirk (a model like
358
- * `fal-ai/flux/schnell` submits to the full path but its status/result live
359
- * under the `fal-ai/flux` base).
360
- *
361
- * Auth: fal uses `Authorization: Key {FAL_KEY}` (NOT Bearer).
362
- *
363
- * Cost: fal's queue result does not carry a per-call price, so cost is left to
364
- * the router's normalized estimate (costCents stays undefined; `units` is the
365
- * output count — one image, or one clip).
366
- */
367
-
368
- interface FalMediaConfig {
369
- apiKey: string;
370
- /** Override for testing. Defaults to https://queue.fal.run. */
371
- baseUrl?: string;
372
- /** Video/job poll cadence (ms). Default 3000. */
373
- pollIntervalMs?: number;
374
- /** Max time to wait for a job before giving up (ms). Default 300000 (5m). */
375
- pollTimeoutMs?: number;
376
- /** Injected for testing; defaults to global fetch. */
377
- fetchImpl?: typeof fetch;
378
- }
379
- declare function createFalMediaAdapter(config: FalMediaConfig): MediaAdapter;
380
- /** Carries the HTTP status so the router's `isRetryableError` can classify it. */
381
- declare class FalMediaError extends Error {
382
- status: number;
383
- constructor(status: number, body: string);
384
- }
385
-
386
361
  /**
387
362
  * ai-lcr — Least Cost Routing for LLMs.
388
363
  *
@@ -436,4 +411,4 @@ type LCRRouter = (modelName: string) => LanguageModelV3;
436
411
  */
437
412
  declare function createLCR(config: LCRConfig): LCRRouter;
438
413
 
439
- export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, FalMediaError, type FormatOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, comparePrices, createFalMediaAdapter, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };
414
+ export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };
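The new `kind` field on `RouteAttempt` is described above as an alerting signal. A minimal sketch of how a caller might consume it follows. It assumes the types are redeclared locally so the snippet runs on its own, and `providersNeedingAttention` is a hypothetical helper, not part of ai-lcr.

```typescript
// Local mirrors of the declarations in the diff above, so the sketch is self-contained.
type ErrorKind = "transient" | "auth" | "billing" | "client";

interface RouteAttempt {
  provider: string;
  ok: boolean;
  latencyMs: number;
  errorClass?: string;
  kind?: ErrorKind;
}

// Hypothetical helper (not part of ai-lcr): returns the labels of providers whose
// failed attempts point at a broken key or account rather than a transient blip.
function providersNeedingAttention(attempts: RouteAttempt[]): string[] {
  const flagged = new Set<string>();
  for (const attempt of attempts) {
    if (!attempt.ok && (attempt.kind === "auth" || attempt.kind === "billing")) {
      flagged.add(attempt.provider);
    }
  }
  return Array.from(flagged);
}
```

A caller could run this over the attempts of each settled request and page only when the returned list is non-empty, leaving `"transient"` failovers as ordinary noise.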
package/dist/index.js CHANGED
@@ -47,6 +47,16 @@ function classifyError(error) {
47
47
  const text = (e?.message ? String(e.message) : safeStringify(error)).toLowerCase();
48
48
  return RETRYABLE_PATTERNS.find((p) => text.includes(p)) ?? "error";
49
49
  }
50
+ var AUTH_STATUS = /* @__PURE__ */ new Set([401, 403]);
51
+ var BILLING_PATTERNS = ["insufficient", "credit", "quota", "billing", "payment required"];
52
+ function classifyErrorKind(error) {
53
+ const e = error;
54
+ const status = e?.statusCode ?? e?.status;
55
+ const text = (e?.message ? String(e.message) : safeStringify(error)).toLowerCase();
56
+ if (typeof status === "number" && AUTH_STATUS.has(status)) return "auth";
57
+ if (status === 402 || BILLING_PATTERNS.some((p) => text.includes(p))) return "billing";
58
+ return isRetryableError(error) ? "transient" : "client";
59
+ }
50
60
  var callSeq = 0;
51
61
  function newCallId() {
52
62
  const c = globalThis.crypto;
@@ -63,11 +73,20 @@ var LcrFallbackModel = class {
63
73
  }
64
74
  opts;
65
75
  specificationVersion = "v3";
66
- index = 0;
67
- lastReset = Date.now();
76
+ // Cross-request *hint* for where the next request starts: after a failover we
77
+ // remember the provider that worked so we don't re-probe a dead cheap one on
78
+ // every call. This is the ONLY shared mutable state — and crucially it is read
79
+ // once per request (snapshotted into a local cursor) and written once on
80
+ // settle, never used as a per-request loop bound. The within-request iteration
81
+ // is fully local, so concurrent requests can't corrupt each other's routing.
82
+ sticky = 0;
83
+ // When `sticky` was last advanced (a failover). The re-probe timer measures
84
+ // from THIS, not from the last call — so it fires under sustained traffic too,
85
+ // instead of being pushed forward forever by a busy stream of requests.
86
+ lastFailoverAt = Date.now();
68
87
  resetIntervalMs;
69
88
  get current() {
70
- return this.opts.providers[this.index];
89
+ return this.opts.providers[this.sticky];
71
90
  }
72
91
  get modelId() {
73
92
  return this.current.model.modelId;
@@ -78,14 +97,28 @@ var LcrFallbackModel = class {
78
97
  get supportedUrls() {
79
98
  return this.current.model.supportedUrls;
80
99
  }
81
- checkReset() {
82
- if (this.index !== 0 && Date.now() - this.lastReset >= this.resetIntervalMs) {
83
- this.index = 0;
100
+ /**
101
+ * Index a new request should start at. If we're parked on a non-cheapest
102
+ * provider and it's been `resetIntervalMs` since the failover, snap back to
103
+ * the cheapest and re-probe it — this is what lets routing recover to the
104
+ * cheap source even during continuous traffic.
105
+ */
106
+ startIndex() {
107
+ if (this.sticky !== 0 && Date.now() - this.lastFailoverAt >= this.resetIntervalMs) {
108
+ this.sticky = 0;
84
109
  }
85
- this.lastReset = Date.now();
86
- }
87
- switchNext() {
88
- this.index = (this.index + 1) % this.opts.providers.length;
110
+ return this.sticky;
111
+ }
112
+ /**
113
+ * A request settled on `winIndex`. Park there so the next request skips the
114
+ * providers we just learned are down. Stamp the failover time only when the
115
+ * parked provider actually CHANGES — so a steady stream of successful calls
116
+ * on the same fallback doesn't keep pushing the re-probe timer forward.
117
+ */
118
+ settleSticky(winIndex) {
119
+ if (winIndex === this.sticky) return;
120
+ this.sticky = winIndex;
121
+ this.lastFailoverAt = Date.now();
89
122
  }
90
123
  shouldRetry(error) {
91
124
  return (this.opts.shouldRetry ?? isRetryableError)(error);
@@ -99,7 +132,8 @@ var LcrFallbackModel = class {
99
132
  provider: provider.label,
100
133
  ok: false,
101
134
  latencyMs: Date.now() - attemptStart,
102
- errorClass: classifyError(error)
135
+ errorClass: classifyError(error),
136
+ kind: classifyErrorKind(error)
103
137
  });
104
138
  }
105
139
  /** Winner settled: record the attempt, fire `onCost` (compat) + `onCall`. */
@@ -144,15 +178,18 @@ var LcrFallbackModel = class {
144
178
  });
145
179
  }
146
180
  async doGenerate(options) {
147
- this.checkReset();
148
181
  const ctx = this.startCall();
149
- const start = this.index;
182
+ const providers = this.opts.providers;
183
+ const n = providers.length;
184
+ const start = this.startIndex();
150
185
  let lastError;
151
- for (; ; ) {
152
- const provider = this.current;
186
+ for (let tried = 0; tried < n; tried++) {
187
+ const idx = (start + tried) % n;
188
+ const provider = providers[idx];
153
189
  const attemptStart = Date.now();
154
190
  try {
155
191
  const result = await provider.model.doGenerate(options);
192
+ this.settleSticky(idx);
156
193
  this.finalizeOk(ctx, provider, attemptStart, result.usage);
157
194
  return result;
158
195
  } catch (error) {
@@ -164,29 +201,30 @@ var LcrFallbackModel = class {
164
201
  }
165
202
  this.opts.onError?.(error, provider.label);
166
203
  this.recordFail(ctx, provider, attemptStart, error);
167
- this.switchNext();
168
- if (this.index === start) {
169
- this.finalizeFail(ctx);
170
- throw lastError;
171
- }
172
204
  }
173
205
  }
206
+ this.finalizeFail(ctx);
207
+ throw lastError;
174
208
  }
175
209
  async doStream(options) {
176
- this.checkReset();
177
- return this.doStreamWithCtx(options, this.startCall());
178
- }
179
- // The stream's failover recursion re-enters here with the SAME `ctx`, so a
180
- // mid-stream switch keeps appending to one CallRecord instead of starting a
181
- // fresh one. `finalizeOk`/`finalizeFail` fire exactly once per outer request.
182
- async doStreamWithCtx(options, ctx) {
210
+ return this.doStreamWithCtx(options, this.startCall(), this.startIndex(), 0);
211
+ }
212
+ // The stream's failover recursion re-enters here with the SAME `ctx` and a
213
+ // threaded-through local cursor (`idx`/`tried`), so a mid-stream switch keeps
214
+ // appending to one CallRecord and bounds itself on the local `tried` count —
215
+ // never on shared instance state. `finalizeOk`/`finalizeFail` fire exactly
216
+ // once per outer request.
217
+ async doStreamWithCtx(options, ctx, startIdx, alreadyTried) {
183
218
  const self = this;
184
- const start = this.index;
219
+ const providers = this.opts.providers;
220
+ const n = providers.length;
185
221
  let result;
186
222
  let serving;
187
223
  let servingStart;
224
+ let idx = startIdx;
225
+ let tried = alreadyTried;
188
226
  for (; ; ) {
189
- serving = this.current;
227
+ serving = providers[idx];
190
228
  servingStart = Date.now();
191
229
  try {
192
230
  result = await serving.model.doStream(options);
@@ -199,15 +237,18 @@ var LcrFallbackModel = class {
199
237
  }
200
238
  this.opts.onError?.(error, serving.label);
201
239
  this.recordFail(ctx, serving, servingStart, error);
202
- this.switchNext();
203
- if (this.index === start) {
240
+ tried++;
241
+ if (tried >= n) {
204
242
  this.finalizeFail(ctx);
205
243
  throw error;
206
244
  }
245
+ idx = (idx + 1) % n;
207
246
  }
208
247
  }
209
248
  const servingProvider = serving;
210
249
  const servingAttemptStart = servingStart;
250
+ const servingIdx = idx;
251
+ const triedBeforeServing = tried;
211
252
  let usage;
212
253
  let streamedAny = false;
213
254
  const stream = new ReadableStream({
@@ -226,20 +267,26 @@ var LcrFallbackModel = class {
226
267
  controller.enqueue(value);
227
268
  if (value.type !== "stream-start") streamedAny = true;
228
269
  }
270
+ self.settleSticky(servingIdx);
229
271
  self.finalizeOk(ctx, servingProvider, servingAttemptStart, usage);
230
272
  controller.close();
231
273
  } catch (error) {
232
274
  self.opts.onError?.(error, servingProvider.label);
233
275
  self.recordFail(ctx, servingProvider, servingAttemptStart, error);
234
276
  if (!streamedAny) {
235
- self.switchNext();
236
- if (self.index === start) {
277
+ const nextTried = triedBeforeServing + 1;
278
+ if (nextTried >= n) {
237
279
  self.finalizeFail(ctx);
238
280
  controller.error(error);
239
281
  return;
240
282
  }
241
283
  try {
242
- const next = await self.doStreamWithCtx(options, ctx);
284
+ const next = await self.doStreamWithCtx(
285
+ options,
286
+ ctx,
287
+ (servingIdx + 1) % n,
288
+ nextTried
289
+ );
243
290
  const nextReader = next.stream.getReader();
244
291
  try {
245
292
  for (; ; ) {
@@ -693,108 +740,6 @@ var RunwareMediaError = class extends Error {
693
740
  status;
694
741
  };
695
742
 
696
- // src/adapters/fal-media.ts
697
- var DEFAULT_BASE3 = "https://queue.fal.run";
698
- function extractOutputs(raw) {
699
- if (!raw || typeof raw !== "object") return [];
700
- const data = raw;
701
- const out = [];
702
- const pushUrl = (url, type) => {
703
- if (typeof url === "string" && url.length > 0) out.push({ url, type });
704
- };
705
- if (Array.isArray(data.images)) {
706
- for (const img of data.images) pushUrl(img?.url, "image");
707
- }
708
- pushUrl(data.image?.url, "image");
709
- if (Array.isArray(data.videos)) {
710
- for (const v of data.videos) pushUrl(v?.url, "video");
711
- }
712
- pushUrl(data.video?.url, "video");
713
- return out;
714
- }
715
- function createFalMediaAdapter(config) {
716
- const {
717
- apiKey,
718
- baseUrl = DEFAULT_BASE3,
719
- pollIntervalMs = 3e3,
720
- pollTimeoutMs = 3e5,
721
- fetchImpl = fetch
722
- } = config;
723
- const headers = {
724
- "content-type": "application/json",
725
- authorization: `Key ${apiKey}`
726
- };
727
- return {
728
- provider: "fal",
729
- async run(req) {
730
- const submitRes = await fetchImpl(`${baseUrl}/${req.externalId}`, {
731
- method: "POST",
732
- headers,
733
- body: JSON.stringify(req.input)
734
- });
735
- if (!submitRes.ok) {
736
- throw new FalMediaError(submitRes.status, await safeText2(submitRes));
737
- }
738
- const submit = await submitRes.json();
739
- const statusUrl = submit.status_url;
740
- const responseUrl = submit.response_url;
741
- if (!statusUrl || !responseUrl) {
742
- throw new Error(
743
- `ai-lcr: fal submit for "${req.externalId}" returned no status/response URL (keys: ${Object.keys(
744
- submit
745
- ).join(", ")})`
746
- );
747
- }
748
- const deadline = Date.now() + pollTimeoutMs;
749
- let completed = false;
750
- while (Date.now() < deadline) {
751
- const statusRes = await fetchImpl(statusUrl, { headers });
752
- if (!statusRes.ok) {
753
- throw new FalMediaError(statusRes.status, await safeText2(statusRes));
754
- }
755
- const status = String((await statusRes.json()).status ?? "");
756
- if (status === "COMPLETED") {
757
- completed = true;
758
- break;
759
- }
760
- await sleep2(pollIntervalMs);
761
- }
762
- if (!completed) {
763
- throw new Error(
764
- `ai-lcr: fal job for "${req.externalId}" timed out after ${pollTimeoutMs}ms`
765
- );
766
- }
767
- const resultRes = await fetchImpl(responseUrl, { headers });
768
- if (!resultRes.ok) {
769
- throw new FalMediaError(resultRes.status, await safeText2(resultRes));
770
- }
771
- const outputs = extractOutputs(await resultRes.json());
772
- if (outputs.length === 0) {
773
- throw new Error(`ai-lcr: fal returned no media URL for "${req.externalId}"`);
774
- }
775
- return { outputs, units: outputs.length };
776
- }
777
- };
778
- }
779
- var FalMediaError = class extends Error {
780
- constructor(status, body) {
781
- super(`fal media HTTP ${status}: ${body.slice(0, 300)}`);
782
- this.status = status;
783
- this.name = "FalMediaError";
784
- }
785
- status;
786
- };
787
- function sleep2(ms) {
788
- return new Promise((r) => setTimeout(r, ms));
789
- }
790
- async function safeText2(res) {
791
- try {
792
- return await res.text();
793
- } catch {
794
- return "<no body>";
795
- }
796
- }
797
-
798
743
  // src/index.ts
799
744
  function isLanguageModel(entry) {
800
745
  return typeof entry.doGenerate === "function";
@@ -837,12 +782,11 @@ function createLCR(config) {
837
782
  }
838
783
  export {
839
784
  DEFAULT_REFERENCE,
840
- FalMediaError,
841
785
  MEDIA_PRICING,
842
786
  cheapestRoute,
843
787
  classifyError,
788
+ classifyErrorKind,
844
789
  comparePrices,
845
- createFalMediaAdapter,
846
790
  createKunavoMediaAdapter,
847
791
  createLCR,
848
792
  createMediaLCR,
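The routing change in `dist/index.js` above comes down to two pieces: a per-request local cursor, and a shared "start here next" hint whose re-probe timer is stamped only on failover. The sketch below is a simplified, synchronous model of that shape. `StickyRouter`, its `route` method, the `isUp` callback, and the injectable `now` clock are all hypothetical and exist only for illustration; the real class is asynchronous and does much more.

```typescript
// Hypothetical, simplified model of the sticky-hint logic; not the library's class.
class StickyRouter {
  private sticky = 0;
  private lastFailoverAt: number;
  private providerCount: number;
  private resetIntervalMs: number;
  private now: () => number;

  constructor(providerCount: number, resetIntervalMs: number, now: () => number = Date.now) {
    this.providerCount = providerCount;
    this.resetIntervalMs = resetIntervalMs;
    this.now = now;
    this.lastFailoverAt = now();
  }

  // Index a new request starts at; snaps back to the cheapest provider (index 0)
  // once resetIntervalMs has passed since the last failover.
  startIndex(): number {
    if (this.sticky !== 0 && this.now() - this.lastFailoverAt >= this.resetIntervalMs) {
      this.sticky = 0;
    }
    return this.sticky;
  }

  // Park on the winning provider; stamp the timer only when the parked index changes.
  settle(winIndex: number): void {
    if (winIndex === this.sticky) return;
    this.sticky = winIndex;
    this.lastFailoverAt = this.now();
  }

  // Walk providers on a local cursor; `isUp` stands in for a real provider call.
  route(isUp: (idx: number) => boolean): number {
    const start = this.startIndex();
    for (let tried = 0; tried < this.providerCount; tried++) {
      const idx = (start + tried) % this.providerCount;
      if (isUp(idx)) {
        this.settle(idx);
        return idx;
      }
    }
    throw new Error("all providers failed");
  }
}
```

Because successful calls on the parked fallback leave `lastFailoverAt` untouched, a busy stream of requests no longer postpones the re-probe of the cheapest provider.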
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "ai-lcr",
3
- "version": "0.2.2",
3
+ "version": "0.2.3",
4
4
  "description": "Least Cost Routing for LLMs — route every model call to the cheapest available provider, fall back automatically, and track real cost. Built for the Vercel AI SDK.",
5
5
  "keywords": [
6
6
  "ai",
@@ -39,13 +39,15 @@
39
39
  "files": [
40
40
  "dist",
41
41
  "README.md",
42
- "LICENSE"
42
+ "LICENSE",
43
+ "CHANGELOG.md"
43
44
  ],
44
45
  "scripts": {
45
46
  "build": "tsup src/index.ts --format esm,cjs --dts --clean",
46
47
  "typecheck": "tsc --noEmit",
47
48
  "test": "vitest run",
48
- "test:watch": "vitest"
49
+ "test:watch": "vitest",
50
+ "prepublishOnly": "npm run build && npm run typecheck && npm test"
49
51
  },
50
52
  "peerDependencies": {
51
53
  "ai": "^6.0.0"