ai-lcr 0.2.3 → 0.2.5

package/CHANGELOG.md CHANGED
@@ -4,6 +4,53 @@ All notable changes to `ai-lcr` are documented here. The format follows
  [Keep a Changelog](https://keepachangelog.com/), and the project adheres to
  [Semantic Versioning](https://semver.org/).
 
+ ## [0.2.5] — 2026-06-01
+
+ Pre-launch failover-robustness + media-provider pass — closing cases where a
+ real provider failure slipped past the switch criterion and killed the request,
+ and making fal a live failover target.
+
+ ### Fixed
+
+ - **A network-unreachable provider didn't fail over.** `isRetryableError` only
+   matched HTTP statuses and English keywords, but a provider that's down throws
+   a `fetch` `TypeError` with *no* status — and wraps the real cause
+   (`ECONNREFUSED`/`ECONNRESET`/`ENOTFOUND`/connect-timeout, with the Node `code`)
+   in `error.cause`. Those read as a non-retryable client error, so the cheapest
+   provider going down killed the request instead of falling over — the most
+   common outage mode. The engine now walks the `cause` chain and treats Node
+   network codes / transport-failure messages as retryable. Applies to both the
+   text and media routers. New exported helper `isNetworkError`.
+ - **Non-English billing failures didn't fail over.** Out-of-credit detection was
+   English-only, but Chinese providers (e.g. Kunavo) report a failed charge as
+   `余额不足`/`账户欠费`/`扣费失败` in a 200/400 body with no billing status.
+   Those are now matched (plus `balance`/`exhausted`), so a failed charge fails
+   over and is tagged `billing` by `classifyErrorKind` for alerting.
+ - **An out-of-balance 403 was mis-tagged as `auth`.** Providers report an
+   exhausted account as 403 (e.g. fal "exhausted balance") — a top-up problem,
+   not a revoked key. `classifyErrorKind` now lets billing wording win over a
+   bare 401/403 status, so it's tagged `billing` (a plain 403 stays `auth`).
+ - **A throwing observer could fail a successful request.** `onCost`/`onCall`/
+   `onError` were invoked unguarded; a logging sink that threw (e.g. a flaky db9
+   write) turned an otherwise-successful generation into a thrown error. All
+   observer callbacks are now fire-and-forget — wrapped so a throw can never
+   affect routing or the request outcome. Applies to both routers.
+
+ ### Added
+
+ - **fal media adapter** (`createFalMediaAdapter`). fal was in the price table
+   but had no adapter, so its routes were silently skipped at runtime — now it's
+   a real cheapest-first / failover target for image models. Synchronous
+   `https://fal.run/<model>` with `Authorization: Key`, generic input pass-
+   through, HTTP-status-bearing errors (403 out-of-balance → fails over; 422 bad
+   input → doesn't). Image only; fal video (queue) is on the roadmap.
+ - **Status-page liveness probes for Runware + fal** (`website`). Both are now
+   monitored with a free, generation-free reachability probe: Runware's `ping`
+   task (→ `pong`, 0 cost) and fal's `GET /v1/account/billing` (2xx ⇒ endpoint up
+   + key valid). Generalized via a new `ReachProbe` so a "reachable" check can
+   hit a provider-specific free endpoint instead of `GET /v1/models`. Requires
+   `RUNWARE_API_KEY` and `FAL_KEY` env vars to be set.
+
  ## [0.2.3] — 2026-06-01
 
  Release-quality and engine-correctness pass.
@@ -57,4 +104,5 @@ Release-quality and engine-correctness pass.
  - Dual ESM/CJS build. Media (image/video) least-cost routing with the Runware
    and Kunavo adapters; cap-aware failover for the text router.
 
+ [0.2.5]: https://github.com/victorzhrn/ai-lcr/releases/tag/v0.2.5
  [0.2.3]: https://github.com/victorzhrn/ai-lcr/releases/tag/v0.2.3
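
The classification order described in the 0.2.5 changelog entry (billing wording wins over a bare 401/403) can be illustrated with a small standalone sketch. This is not the package's code: `kindOf` and `BILLING_WORDS` are hypothetical names, and the word list is a reduced subset chosen for illustration.

```javascript
// Reduced, illustrative subset of billing keywords (English and Chinese).
const BILLING_WORDS = ["insufficient", "balance", "exhausted", "余额", "欠费", "扣费"];

// Hypothetical classifier: billing wording is checked before the auth statuses.
function kindOf(error) {
  const status = error?.statusCode ?? error?.status;
  const text = String(error?.message ?? "").toLowerCase();
  if (status === 402 || BILLING_WORDS.some((w) => text.includes(w))) return "billing";
  if (status === 401 || status === 403) return "auth";
  return "other";
}

// A 403 whose message talks about balance is a top-up problem, not a revoked key.
const outOfBalance = Object.assign(new Error("HTTP 403: Exhausted balance"), { status: 403 });
// A 403 with no billing wording stays an auth problem.
const revokedKey = Object.assign(new Error("HTTP 403: forbidden"), { status: 403 });
// A Chinese "insufficient balance" message with no HTTP status at all.
const failedCharge = new Error("余额不足");

console.log(kindOf(outOfBalance)); // "billing"
console.log(kindOf(revokedKey));   // "auth"
console.log(kindOf(failedCharge)); // "billing"
```

Checking the wording first is what keeps an out-of-balance 403 from being reported as an auth failure; swapping the two `if` lines would reproduce the old mis-tagging.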
package/README.md CHANGED
@@ -156,7 +156,7 @@ Any OpenAI-compatible endpoint works — and so does any AI SDK provider package
 
  - **Model vendors' own APIs (native):** route straight to [DeepSeek](https://platform.deepseek.com), [OpenAI](https://openai.com), [Anthropic](https://anthropic.com), [Google](https://ai.google.dev), [xAI](https://x.ai), etc. via their AI SDK provider packages — no markup, full native features. See [Route to a model vendor's own API](#route-to-a-model-vendors-own-api-native-providers).
  - **Text aggregators:** [OpenRouter](https://openrouter.ai) (widest coverage, list pricing) · [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off** every model) · [TokenMart](https://thetokenmart.ai) (15–65% off, varies by model)
- - **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) — image routing available via `createMediaLCR` (Kunavo + Runware adapters); video on the roadmap
+ - **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) — image routing available via `createMediaLCR` (Kunavo + Runware + fal adapters); video on the roadmap
 
  ## Text model pricing
 
package/dist/index.cjs CHANGED
@@ -26,6 +26,7 @@ __export(index_exports, {
    classifyError: () => classifyError,
    classifyErrorKind: () => classifyErrorKind,
    comparePrices: () => comparePrices,
+   createFalMediaAdapter: () => createFalMediaAdapter,
    createKunavoMediaAdapter: () => createKunavoMediaAdapter,
    createLCR: () => createLCR,
    createMediaLCR: () => createMediaLCR,
@@ -57,43 +58,126 @@ var RETRYABLE_PATTERNS = [
    "504",
    "429",
    // Billing caps — a capped provider should fall over, not kill the request.
+   // Include non-English wording: Chinese providers (e.g. Kunavo) report a failed
+   // charge as "余额不足"/"账户欠费"/"扣费失败" with a 200/400 body, which no
+   // English keyword and no HTTP status would catch — so without these a billing
+   // failure would die instead of failing over, the exact opposite of what we want.
    "insufficient",
    "credit",
    "quota",
    "billing",
-   "payment required"
+   "payment required",
+   "balance",
+   "\u4F59\u989D",
+   "\u6B20\u8D39",
+   "\u6263\u8D39",
+   "\u6263\u6B3E"
  ];
+ var NETWORK_CODES = /* @__PURE__ */ new Set([
+   "ECONNREFUSED",
+   "ECONNRESET",
+   "ECONNABORTED",
+   "ENOTFOUND",
+   "EAI_AGAIN",
+   "ETIMEDOUT",
+   "EPIPE",
+   "EHOSTUNREACH",
+   "ENETUNREACH",
+   "EPROTO",
+   "UND_ERR_SOCKET",
+   "UND_ERR_CONNECT_TIMEOUT",
+   "UND_ERR_HEADERS_TIMEOUT",
+   "UND_ERR_BODY_TIMEOUT"
+ ]);
+ var NETWORK_PATTERNS = [
+   "fetch failed",
+   "failed to fetch",
+   "socket hang up",
+   "socket disconnected",
+   "econnrefused",
+   "econnreset",
+   "enotfound",
+   "etimedout",
+   "ehostunreach",
+   "enetunreach",
+   "eai_again",
+   "getaddrinfo",
+   "connect timeout",
+   "connection refused",
+   "connection reset",
+   "connection error",
+   "network error",
+   "dns"
+ ];
+ function safeStringify(value) {
+   try {
+     return JSON.stringify(value) ?? "";
+   } catch {
+     return String(value);
+   }
+ }
+ function errorSignals(error) {
+   const parts = [];
+   const codes = [];
+   const seen = /* @__PURE__ */ new Set();
+   let cur = error;
+   for (let depth = 0; depth < 6 && cur && typeof cur === "object" && !seen.has(cur); depth++) {
+     seen.add(cur);
+     const e = cur;
+     if (typeof e.message === "string") parts.push(e.message);
+     if (typeof e.name === "string") parts.push(e.name);
+     if (typeof e.code === "string") {
+       parts.push(e.code);
+       codes.push(e.code);
+     }
+     cur = e.cause;
+   }
+   if (parts.length === 0) parts.push(safeStringify(error));
+   return { text: parts.join(" ").toLowerCase(), codes };
+ }
+ function isNetworkError(error) {
+   const { text, codes } = errorSignals(error);
+   if (codes.some((c) => NETWORK_CODES.has(c))) return true;
+   return NETWORK_PATTERNS.some((p) => text.includes(p));
+ }
  function isRetryableError(error) {
    const e = error;
    const status = e?.statusCode ?? e?.status;
    if (typeof status === "number" && (RETRYABLE_STATUS.has(status) || status > 500)) {
      return true;
    }
-   const text = (e?.message ? String(e.message) : safeStringify(error)).toLowerCase();
+   if (isNetworkError(error)) return true;
+   const { text } = errorSignals(error);
    return RETRYABLE_PATTERNS.some((p) => text.includes(p));
  }
- function safeStringify(value) {
-   try {
-     return JSON.stringify(value) ?? "";
-   } catch {
-     return String(value);
-   }
- }
  function classifyError(error) {
    const e = error;
    const status = e?.statusCode ?? e?.status;
    if (typeof status === "number") return String(status);
-   const text = (e?.message ? String(e.message) : safeStringify(error)).toLowerCase();
+   if (isNetworkError(error)) return "network";
+   const { text } = errorSignals(error);
    return RETRYABLE_PATTERNS.find((p) => text.includes(p)) ?? "error";
  }
  var AUTH_STATUS = /* @__PURE__ */ new Set([401, 403]);
- var BILLING_PATTERNS = ["insufficient", "credit", "quota", "billing", "payment required"];
+ var BILLING_PATTERNS = [
+   "insufficient",
+   "credit",
+   "quota",
+   "billing",
+   "payment required",
+   "balance",
+   "exhausted",
+   "\u4F59\u989D",
+   "\u6B20\u8D39",
+   "\u6263\u8D39",
+   "\u6263\u6B3E"
+ ];
  function classifyErrorKind(error) {
    const e = error;
    const status = e?.statusCode ?? e?.status;
-   const text = (e?.message ? String(e.message) : safeStringify(error)).toLowerCase();
-   if (typeof status === "number" && AUTH_STATUS.has(status)) return "auth";
+   const { text } = errorSignals(error);
    if (status === 402 || BILLING_PATTERNS.some((p) => text.includes(p))) return "billing";
+   if (typeof status === "number" && AUTH_STATUS.has(status)) return "auth";
    return isRetryableError(error) ? "transient" : "client";
  }
  var callSeq = 0;
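
To see why the `cause` walk in the hunk above matters, here is a minimal standalone sketch of the error shape involved. It assumes a runtime that supports the `cause` option on `Error` constructors (Node 16.9 or later); `collectCodes` is a hypothetical helper written for this illustration, not part of the package.

```javascript
// Hypothetical helper: gather string `code` values along an error's `cause` chain,
// bounded by depth and guarded against cycles.
function collectCodes(error, maxDepth = 6) {
  const codes = [];
  const seen = new Set();
  let cur = error;
  for (let depth = 0; depth < maxDepth && cur && typeof cur === "object" && !seen.has(cur); depth++) {
    seen.add(cur);
    if (typeof cur.code === "string") codes.push(cur.code);
    cur = cur.cause;
  }
  return codes;
}

// Shape of a refused connection as surfaced by fetch: a TypeError with no HTTP
// status, carrying the socket-level error (and its Node `code`) in `cause`.
const socketError = Object.assign(new Error("connect ECONNREFUSED 127.0.0.1:9"), {
  code: "ECONNREFUSED",
});
const fetchError = new TypeError("fetch failed", { cause: socketError });

console.log(fetchError.status);        // undefined: nothing for a status check to match
console.log(collectCodes(fetchError)); // [ 'ECONNREFUSED' ]

// A self-referencing cause terminates because of the `seen` set.
const loop = new Error("loop");
loop.cause = loop;
console.log(collectCodes(loop));       // []
```

Looking only at the top-level error would see no status and no code here, which is the situation the changelog describes as the most common outage mode.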
@@ -162,6 +246,27 @@ var LcrFallbackModel = class {
    shouldRetry(error) {
      return (this.opts.shouldRetry ?? isRetryableError)(error);
    }
+   // Observer callbacks are caller-supplied logging hooks: a throw from one of
+   // them must NEVER turn a successful (or already-failed) request into a
+   // different outcome. Swallow anything they throw — they are fire-and-forget.
+   emitError(error, provider) {
+     try {
+       this.opts.onError?.(error, provider);
+     } catch {
+     }
+   }
+   emitCost(event) {
+     try {
+       this.opts.onCost?.(event);
+     } catch {
+     }
+   }
+   emitCall(record) {
+     try {
+       this.opts.onCall?.(record);
+     } catch {
+     }
+   }
    startCall() {
      return { id: newCallId(), attempts: [], startedAt: Date.now() };
    }
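
The guarded-observer pattern added in the hunk above can be reduced to a few lines. The following is a sketch under stated assumptions: `guard` and `generate` are hypothetical names, and `generate` is only a stand-in for a request that reports its cost before returning.

```javascript
// Hypothetical wrapper: invoke an optional observer and swallow anything it throws.
function guard(observer) {
  return (...args) => {
    try {
      observer?.(...args);
    } catch {
      // Intentionally ignored: observers are fire-and-forget.
    }
  };
}

// Stand-in for a generation step that reports cost and then returns its result.
function generate(onCost) {
  const result = { text: "ok" };
  guard(onCost)({ costUsd: 0.001 });
  return result;
}

// A logging sink that always fails.
const flakySink = () => {
  throw new Error("log write failed");
};

console.log(generate(flakySink)); // { text: 'ok' }
console.log(generate(undefined)); // { text: 'ok' }
```

Note that a `try`/`catch` around a plain call only covers synchronous throws. An observer that returns a rejected promise is outside what this sketch guards against.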
@@ -181,14 +286,14 @@ var LcrFallbackModel = class {
      const inputTokens = usage?.inputTokens?.total ?? 0;
      const outputTokens = usage?.outputTokens?.total ?? 0;
      const costUsd = provider.cost ? inputTokens / 1e6 * provider.cost.input + outputTokens / 1e6 * provider.cost.output : 0;
-     this.opts.onCost?.({
+     this.emitCost({
        model: this.opts.modelName,
        provider: provider.label,
        inputTokens,
        outputTokens,
        costUsd
      });
-     this.opts.onCall?.({
+     this.emitCall({
        id: ctx.id,
        model: this.opts.modelName,
        attempts: ctx.attempts,
@@ -203,7 +308,7 @@ var LcrFallbackModel = class {
    }
    /** Every provider failed: fire `onCall` with no winner. */
    finalizeFail(ctx) {
-     this.opts.onCall?.({
+     this.emitCall({
        id: ctx.id,
        model: this.opts.modelName,
        attempts: ctx.attempts,
@@ -238,7 +343,7 @@ var LcrFallbackModel = class {
            this.finalizeFail(ctx);
            throw error;
          }
-         this.opts.onError?.(error, provider.label);
+         this.emitError(error, provider.label);
          this.recordFail(ctx, provider, attemptStart, error);
        }
      }
@@ -274,7 +379,7 @@ var LcrFallbackModel = class {
            this.finalizeFail(ctx);
            throw error;
          }
-         this.opts.onError?.(error, serving.label);
+         this.emitError(error, serving.label);
          this.recordFail(ctx, serving, servingStart, error);
          tried++;
          if (tried >= n) {
@@ -310,7 +415,7 @@ var LcrFallbackModel = class {
              self.finalizeOk(ctx, servingProvider, servingAttemptStart, usage);
              controller.close();
            } catch (error) {
-             self.opts.onError?.(error, servingProvider.label);
+             self.emitError(error, servingProvider.label);
              self.recordFail(ctx, servingProvider, servingAttemptStart, error);
              if (!streamedAny) {
                const nextTried = triedBeforeServing + 1;
@@ -434,6 +539,24 @@ function newMediaCallId() {
  }
  function createMediaLCR(config) {
    const { registry, adapters, reference = DEFAULT_REFERENCE, onError, onCost, onCall } = config;
+   const safeError = (error, provider) => {
+     try {
+       onError?.(error, provider);
+     } catch {
+     }
+   };
+   const safeCost = (event) => {
+     try {
+       onCost?.(event);
+     } catch {
+     }
+   };
+   const safeCall = (record) => {
+     try {
+       onCall?.(record);
+     } catch {
+     }
+   };
    return async function generate(modelId, input) {
      const def = registry[modelId];
      if (!def) {
@@ -444,7 +567,7 @@ function createMediaLCR(config) {
      const startedAt = Date.now();
      const attempts = [];
      let lastErr;
-     const emitFail = () => onCall?.({
+     const emitFail = () => safeCall({
        id: newMediaCallId(),
        model: modelId,
        attempts,
@@ -466,8 +589,8 @@ function createMediaLCR(config) {
          const estimated = result.costCents === void 0;
          const costCents = estimated ? route.refCents * (result.units ?? 1) : result.costCents;
          attempts.push({ provider: route.provider, ok: true, latencyMs: Date.now() - attemptStart });
-         onCost?.({ modelId, provider: route.provider, costCents, estimated });
-         onCall?.({
+         safeCost({ modelId, provider: route.provider, costCents, estimated });
+         safeCall({
            id: newMediaCallId(),
            model: modelId,
            attempts,
@@ -489,7 +612,7 @@ function createMediaLCR(config) {
            latencyMs: Date.now() - attemptStart,
            errorClass: classifyError(err)
          });
-         onError?.(err, route.provider);
+         safeError(err, route.provider);
          if (!isRetryableError(err)) {
            emitFail();
            throw err;
@@ -779,6 +902,63 @@ var RunwareMediaError = class extends Error {
    status;
  };
 
+ // src/adapters/fal-media.ts
+ var DEFAULT_BASE3 = "https://fal.run";
+ function extractImageUrls2(body) {
+   const fromArray = (body.images ?? []).map((im) => im?.url).filter((u) => typeof u === "string" && u.length > 0);
+   if (fromArray.length > 0) return fromArray;
+   const single = body.image?.url;
+   return typeof single === "string" && single.length > 0 ? [single] : [];
+ }
+ function errorMessage2(body) {
+   if (typeof body.detail === "string") return body.detail;
+   if (Array.isArray(body.detail)) {
+     const msgs = body.detail.map((d) => d?.msg).filter(Boolean);
+     if (msgs.length > 0) return msgs.join("; ");
+   }
+   return body.error || body.message || "unknown";
+ }
+ function createFalMediaAdapter(config) {
+   const { apiKey, baseUrl = DEFAULT_BASE3, fetchImpl = fetch } = config;
+   return {
+     provider: "fal",
+     async run(req) {
+       const res = await fetchImpl(`${baseUrl}/${req.externalId}`, {
+         method: "POST",
+         headers: {
+           "content-type": "application/json",
+           authorization: `Key ${apiKey}`,
+           accept: "application/json"
+         },
+         body: JSON.stringify(req.input)
+       });
+       let body;
+       try {
+         body = await res.json();
+       } catch {
+         body = {};
+       }
+       if (!res.ok) {
+         throw new FalMediaError(res.status, errorMessage2(body));
+       }
+       const urls = extractImageUrls2(body);
+       if (urls.length === 0) {
+         throw new Error(`ai-lcr: fal returned no image URL for "${req.externalId}"`);
+       }
+       const outputs = urls.map((url) => ({ url, type: "image" }));
+       return { outputs, units: outputs.length };
+     }
+   };
+ }
+ var FalMediaError = class extends Error {
+   constructor(status, body) {
+     super(`fal media HTTP ${status}: ${body.slice(0, 300)}`);
+     this.status = status;
+     this.name = "FalMediaError";
+   }
+   status;
+ };
+
  // src/index.ts
  function isLanguageModel(entry) {
    return typeof entry.doGenerate === "function";
@@ -827,6 +1007,7 @@ function createLCR(config) {
    classifyError,
    classifyErrorKind,
    comparePrices,
+   createFalMediaAdapter,
    createKunavoMediaAdapter,
    createLCR,
    createMediaLCR,
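
The adapter in this file accepts an injectable `fetchImpl`, which makes its request path easy to exercise without network access. The sketch below shows that idea in isolation. It assumes a runtime with a global `Response` (Node 18 or later); `makeStubFetch` and `postModel` are hypothetical names, `postModel` is a simplified stand-in rather than the package's adapter, and `some-org/some-model` is a placeholder model id.

```javascript
// Hypothetical test stub: records each request and answers with a canned JSON Response.
function makeStubFetch(status, payload) {
  const calls = [];
  const stub = async (url, init) => {
    calls.push({ url, init });
    return new Response(JSON.stringify(payload), { status });
  };
  return { stub, calls };
}

// Simplified stand-in for a synchronous POST to https://fal.run/<model>.
async function postModel(fetchImpl, apiKey, modelId, input) {
  const res = await fetchImpl(`https://fal.run/${modelId}`, {
    method: "POST",
    headers: { "content-type": "application/json", authorization: `Key ${apiKey}` },
    body: JSON.stringify(input),
  });
  const body = await res.json().catch(() => ({}));
  if (!res.ok) {
    // Keep the HTTP status on the error so a router can decide whether to fail over.
    throw Object.assign(new Error(`HTTP ${res.status}: ${body.detail ?? "unknown"}`), {
      status: res.status,
    });
  }
  return (body.images ?? []).map((im) => im.url);
}

async function demo() {
  // Success path: one image URL comes back.
  const ok = makeStubFetch(200, { images: [{ url: "https://example.com/a.png" }] });
  const urls = await postModel(ok.stub, "test-key", "some-org/some-model", { prompt: "a cat" });

  // Failure path: an out-of-balance 403 surfaces as an error carrying its status.
  const broke = makeStubFetch(403, { detail: "Exhausted balance" });
  const failure = await postModel(broke.stub, "test-key", "some-org/some-model", {}).catch((e) => e);

  return { urls, auth: ok.calls[0].init.headers.authorization, failure };
}

demo().then(({ urls, auth, failure }) => {
  console.log(urls, auth, failure.status, failure.message);
});
```

The stub also records the outgoing request, so a test can confirm that the `Key` authorization scheme, not `Bearer`, was sent.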
package/dist/index.d.cts CHANGED
@@ -358,6 +358,41 @@ interface RunwareMediaConfig {
  }
  declare function createRunwareMediaAdapter(config: RunwareMediaConfig): MediaAdapter;
 
+ /**
+  * fal.ai media adapter — image generation (synchronous).
+  *
+  * fal exposes every model at `https://fal.run/<model-id>` (the synchronous API):
+  * POST the model's inputs as a flat JSON body, get the result back in the same
+  * response. This adapter passes the caller's `input` straight through, so any
+  * fal image model and any of its parameters (prompt, image_size, num_images,
+  * image_url for i2i/edit, …) work without this adapter knowing about them — it
+  * stays generic, not tied to one model family.
+  *
+  * Auth: fal uses `Authorization: Key <FAL_KEY>` (NOT a Bearer token).
+  *
+  * Errors: fal returns a proper HTTP status — 401 (bad key), 403 (insufficient
+  * balance / no permission), 422 (bad input), 429 (rate limit), 5xx. We surface
+  * the status on the thrown error so the router's `isRetryableError` can decide
+  * whether to fail over. A 403 "exhausted balance" is retryable (fall over to the
+  * next provider); a 422 bad-input is not (don't waste the fallbacks).
+  *
+  * Cost: the synchronous response does NOT carry a per-call price (fal billing is
+  * a separate account-level API), so `costCents` stays undefined and the router
+  * falls back to its normalized estimate — same contract as the Kunavo adapter.
+  *
+  * Video: fal video (e.g. veo3.1) is a long-running queue job, a different code
+  * path — out of scope here, like the Runware adapter. Image inference only.
+  */
+
+ interface FalMediaConfig {
+     apiKey: string;
+     /** Override for testing. Defaults to https://fal.run. */
+     baseUrl?: string;
+     /** Injected for testing; defaults to global fetch. */
+     fetchImpl?: typeof fetch;
+ }
+ declare function createFalMediaAdapter(config: FalMediaConfig): MediaAdapter;
+
  /**
   * ai-lcr — Least Cost Routing for LLMs.
   *
@@ -411,4 +446,4 @@ type LCRRouter = (modelName: string) => LanguageModelV3;
   */
  declare function createLCR(config: LCRConfig): LCRRouter;
 
- export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };
+ export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createFalMediaAdapter, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };
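
The "Cost" note in the doc comment for the fal adapter says the router falls back to a normalized estimate when an adapter leaves `costCents` undefined. A worked example of that fallback, written as a standalone sketch: `resolveCost` is a hypothetical helper, and `refCents` stands in for a route's normalized reference price, following the expression visible in the media router diff.

```javascript
// Hypothetical helper mirroring the estimate-versus-reported choice:
// use the adapter's price when present, otherwise reference price times units.
function resolveCost(result, refCents) {
  const estimated = result.costCents === undefined;
  const costCents = estimated ? refCents * (result.units ?? 1) : result.costCents;
  return { costCents, estimated };
}

// fal-style result: no price reported, two images produced at 3 cents reference each.
console.log(resolveCost({ units: 2 }, 3)); // { costCents: 6, estimated: true }

// A result that reports its own price is passed through unchanged.
console.log(resolveCost({ costCents: 5, units: 2 }, 3)); // { costCents: 5, estimated: false }

// With no unit count, a single unit is assumed.
console.log(resolveCost({}, 3)); // { costCents: 3, estimated: true }
```

The `estimated` flag lets a cost observer distinguish a provider-reported charge from a price-table estimate.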
package/dist/index.d.ts CHANGED
@@ -358,6 +358,41 @@ interface RunwareMediaConfig {
  }
  declare function createRunwareMediaAdapter(config: RunwareMediaConfig): MediaAdapter;
 
+ /**
+  * fal.ai media adapter — image generation (synchronous).
+  *
+  * fal exposes every model at `https://fal.run/<model-id>` (the synchronous API):
+  * POST the model's inputs as a flat JSON body, get the result back in the same
+  * response. This adapter passes the caller's `input` straight through, so any
+  * fal image model and any of its parameters (prompt, image_size, num_images,
+  * image_url for i2i/edit, …) work without this adapter knowing about them — it
+  * stays generic, not tied to one model family.
+  *
+  * Auth: fal uses `Authorization: Key <FAL_KEY>` (NOT a Bearer token).
+  *
+  * Errors: fal returns a proper HTTP status — 401 (bad key), 403 (insufficient
+  * balance / no permission), 422 (bad input), 429 (rate limit), 5xx. We surface
+  * the status on the thrown error so the router's `isRetryableError` can decide
+  * whether to fail over. A 403 "exhausted balance" is retryable (fall over to the
+  * next provider); a 422 bad-input is not (don't waste the fallbacks).
+  *
+  * Cost: the synchronous response does NOT carry a per-call price (fal billing is
+  * a separate account-level API), so `costCents` stays undefined and the router
+  * falls back to its normalized estimate — same contract as the Kunavo adapter.
+  *
+  * Video: fal video (e.g. veo3.1) is a long-running queue job, a different code
+  * path — out of scope here, like the Runware adapter. Image inference only.
+  */
+
+ interface FalMediaConfig {
+     apiKey: string;
+     /** Override for testing. Defaults to https://fal.run. */
+     baseUrl?: string;
+     /** Injected for testing; defaults to global fetch. */
+     fetchImpl?: typeof fetch;
+ }
+ declare function createFalMediaAdapter(config: FalMediaConfig): MediaAdapter;
+
  /**
   * ai-lcr — Least Cost Routing for LLMs.
   *
@@ -411,4 +446,4 @@ type LCRRouter = (modelName: string) => LanguageModelV3;
   */
  declare function createLCR(config: LCRConfig): LCRRouter;
 
- export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };
+ export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createFalMediaAdapter, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };
package/dist/index.js CHANGED
@@ -18,43 +18,126 @@ var RETRYABLE_PATTERNS = [
    "504",
    "429",
    // Billing caps — a capped provider should fall over, not kill the request.
+   // Include non-English wording: Chinese providers (e.g. Kunavo) report a failed
+   // charge as "余额不足"/"账户欠费"/"扣费失败" with a 200/400 body, which no
+   // English keyword and no HTTP status would catch — so without these a billing
+   // failure would die instead of failing over, the exact opposite of what we want.
    "insufficient",
    "credit",
    "quota",
    "billing",
-   "payment required"
+   "payment required",
+   "balance",
+   "\u4F59\u989D",
+   "\u6B20\u8D39",
+   "\u6263\u8D39",
+   "\u6263\u6B3E"
  ];
+ var NETWORK_CODES = /* @__PURE__ */ new Set([
+   "ECONNREFUSED",
+   "ECONNRESET",
+   "ECONNABORTED",
+   "ENOTFOUND",
+   "EAI_AGAIN",
+   "ETIMEDOUT",
+   "EPIPE",
+   "EHOSTUNREACH",
+   "ENETUNREACH",
+   "EPROTO",
+   "UND_ERR_SOCKET",
+   "UND_ERR_CONNECT_TIMEOUT",
+   "UND_ERR_HEADERS_TIMEOUT",
+   "UND_ERR_BODY_TIMEOUT"
+ ]);
+ var NETWORK_PATTERNS = [
+   "fetch failed",
+   "failed to fetch",
+   "socket hang up",
+   "socket disconnected",
+   "econnrefused",
+   "econnreset",
+   "enotfound",
+   "etimedout",
+   "ehostunreach",
+   "enetunreach",
+   "eai_again",
+   "getaddrinfo",
+   "connect timeout",
+   "connection refused",
+   "connection reset",
+   "connection error",
+   "network error",
+   "dns"
+ ];
+ function safeStringify(value) {
+   try {
+     return JSON.stringify(value) ?? "";
+   } catch {
+     return String(value);
+   }
+ }
+ function errorSignals(error) {
+   const parts = [];
+   const codes = [];
+   const seen = /* @__PURE__ */ new Set();
+   let cur = error;
+   for (let depth = 0; depth < 6 && cur && typeof cur === "object" && !seen.has(cur); depth++) {
+     seen.add(cur);
+     const e = cur;
+     if (typeof e.message === "string") parts.push(e.message);
+     if (typeof e.name === "string") parts.push(e.name);
+     if (typeof e.code === "string") {
+       parts.push(e.code);
+       codes.push(e.code);
+     }
+     cur = e.cause;
+   }
+   if (parts.length === 0) parts.push(safeStringify(error));
+   return { text: parts.join(" ").toLowerCase(), codes };
+ }
+ function isNetworkError(error) {
+   const { text, codes } = errorSignals(error);
+   if (codes.some((c) => NETWORK_CODES.has(c))) return true;
+   return NETWORK_PATTERNS.some((p) => text.includes(p));
+ }
  function isRetryableError(error) {
    const e = error;
    const status = e?.statusCode ?? e?.status;
    if (typeof status === "number" && (RETRYABLE_STATUS.has(status) || status > 500)) {
      return true;
    }
-   const text = (e?.message ? String(e.message) : safeStringify(error)).toLowerCase();
+   if (isNetworkError(error)) return true;
+   const { text } = errorSignals(error);
    return RETRYABLE_PATTERNS.some((p) => text.includes(p));
  }
- function safeStringify(value) {
-   try {
-     return JSON.stringify(value) ?? "";
-   } catch {
-     return String(value);
-   }
- }
  function classifyError(error) {
    const e = error;
    const status = e?.statusCode ?? e?.status;
    if (typeof status === "number") return String(status);
-   const text = (e?.message ? String(e.message) : safeStringify(error)).toLowerCase();
+   if (isNetworkError(error)) return "network";
+   const { text } = errorSignals(error);
    return RETRYABLE_PATTERNS.find((p) => text.includes(p)) ?? "error";
  }
  var AUTH_STATUS = /* @__PURE__ */ new Set([401, 403]);
- var BILLING_PATTERNS = ["insufficient", "credit", "quota", "billing", "payment required"];
+ var BILLING_PATTERNS = [
+   "insufficient",
+   "credit",
+   "quota",
+   "billing",
+   "payment required",
+   "balance",
+   "exhausted",
+   "\u4F59\u989D",
+   "\u6B20\u8D39",
+   "\u6263\u8D39",
+   "\u6263\u6B3E"
+ ];
  function classifyErrorKind(error) {
    const e = error;
    const status = e?.statusCode ?? e?.status;
-   const text = (e?.message ? String(e.message) : safeStringify(error)).toLowerCase();
-   if (typeof status === "number" && AUTH_STATUS.has(status)) return "auth";
+   const { text } = errorSignals(error);
    if (status === 402 || BILLING_PATTERNS.some((p) => text.includes(p))) return "billing";
+   if (typeof status === "number" && AUTH_STATUS.has(status)) return "auth";
    return isRetryableError(error) ? "transient" : "client";
  }
  var callSeq = 0;
@@ -123,6 +206,27 @@ var LcrFallbackModel = class {
123
206
  shouldRetry(error) {
124
207
  return (this.opts.shouldRetry ?? isRetryableError)(error);
125
208
  }
209
+ // Observer callbacks are caller-supplied logging hooks: a throw from one of
210
+ // them must NEVER turn a successful (or already-failed) request into a
211
+ // different outcome. Swallow anything they throw — they are fire-and-forget.
212
+ emitError(error, provider) {
213
+ try {
214
+ this.opts.onError?.(error, provider);
215
+ } catch {
216
+ }
217
+ }
218
+ emitCost(event) {
219
+ try {
220
+ this.opts.onCost?.(event);
221
+ } catch {
222
+ }
223
+ }
224
+ emitCall(record) {
225
+ try {
226
+ this.opts.onCall?.(record);
227
+ } catch {
228
+ }
229
+ }
126
230
  startCall() {
127
231
  return { id: newCallId(), attempts: [], startedAt: Date.now() };
128
232
  }
@@ -142,14 +246,14 @@ var LcrFallbackModel = class {
142
246
  const inputTokens = usage?.inputTokens?.total ?? 0;
143
247
  const outputTokens = usage?.outputTokens?.total ?? 0;
144
248
  const costUsd = provider.cost ? inputTokens / 1e6 * provider.cost.input + outputTokens / 1e6 * provider.cost.output : 0;
145
- this.opts.onCost?.({
249
+ this.emitCost({
146
250
  model: this.opts.modelName,
147
251
  provider: provider.label,
148
252
  inputTokens,
149
253
  outputTokens,
150
254
  costUsd
151
255
  });
152
- this.opts.onCall?.({
256
+ this.emitCall({
153
257
  id: ctx.id,
154
258
  model: this.opts.modelName,
155
259
  attempts: ctx.attempts,
@@ -164,7 +268,7 @@ var LcrFallbackModel = class {
164
268
  }
165
269
  /** Every provider failed: fire `onCall` with no winner. */
166
270
  finalizeFail(ctx) {
167
- this.opts.onCall?.({
271
+ this.emitCall({
168
272
  id: ctx.id,
169
273
  model: this.opts.modelName,
170
274
  attempts: ctx.attempts,
@@ -199,7 +303,7 @@ var LcrFallbackModel = class {
            this.finalizeFail(ctx);
            throw error;
          }
-         this.opts.onError?.(error, provider.label);
+         this.emitError(error, provider.label);
          this.recordFail(ctx, provider, attemptStart, error);
        }
      }
@@ -235,7 +339,7 @@ var LcrFallbackModel = class {
            this.finalizeFail(ctx);
            throw error;
          }
-         this.opts.onError?.(error, serving.label);
+         this.emitError(error, serving.label);
          this.recordFail(ctx, serving, servingStart, error);
          tried++;
          if (tried >= n) {
@@ -271,7 +375,7 @@ var LcrFallbackModel = class {
            self.finalizeOk(ctx, servingProvider, servingAttemptStart, usage);
            controller.close();
          } catch (error) {
-           self.opts.onError?.(error, servingProvider.label);
+           self.emitError(error, servingProvider.label);
            self.recordFail(ctx, servingProvider, servingAttemptStart, error);
            if (!streamedAny) {
              const nextTried = triedBeforeServing + 1;
@@ -395,6 +499,24 @@ function newMediaCallId() {
  }
  function createMediaLCR(config) {
    const { registry, adapters, reference = DEFAULT_REFERENCE, onError, onCost, onCall } = config;
+   const safeError = (error, provider) => {
+     try {
+       onError?.(error, provider);
+     } catch {
+     }
+   };
+   const safeCost = (event) => {
+     try {
+       onCost?.(event);
+     } catch {
+     }
+   };
+   const safeCall = (record) => {
+     try {
+       onCall?.(record);
+     } catch {
+     }
+   };
    return async function generate(modelId, input) {
      const def = registry[modelId];
      if (!def) {
@@ -405,7 +527,7 @@ function createMediaLCR(config) {
      const startedAt = Date.now();
      const attempts = [];
      let lastErr;
-     const emitFail = () => onCall?.({
+     const emitFail = () => safeCall({
        id: newMediaCallId(),
        model: modelId,
        attempts,
@@ -427,8 +549,8 @@ function createMediaLCR(config) {
          const estimated = result.costCents === void 0;
          const costCents = estimated ? route.refCents * (result.units ?? 1) : result.costCents;
          attempts.push({ provider: route.provider, ok: true, latencyMs: Date.now() - attemptStart });
-         onCost?.({ modelId, provider: route.provider, costCents, estimated });
-         onCall?.({
+         safeCost({ modelId, provider: route.provider, costCents, estimated });
+         safeCall({
            id: newMediaCallId(),
            model: modelId,
            attempts,
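The context lines of this hunk show how the media router prices a result: it uses the adapter's reported `costCents` when present, and otherwise estimates from a reference price times the unit count. A small sketch of that rule, with the hypothetical name `mediaCostCents` and made-up numbers:

```javascript
// Hypothetical standalone version of the estimated/costCents logic in the hunk.
function mediaCostCents(result, refCents) {
  const estimated = result.costCents === undefined;
  const costCents = estimated ? refCents * (result.units ?? 1) : result.costCents;
  return { costCents, estimated };
}

// Adapter reported a cost: it is used as-is.
const reported = mediaCostCents({ costCents: 7, units: 2 }, 4);
// Adapter reported only units: cost is estimated as refCents * units.
const estimatedCost = mediaCostCents({ units: 3 }, 4);
console.log(reported, estimatedCost);
```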
@@ -450,7 +572,7 @@ function createMediaLCR(config) {
            latencyMs: Date.now() - attemptStart,
            errorClass: classifyError(err)
          });
-         onError?.(err, route.provider);
+         safeError(err, route.provider);
          if (!isRetryableError(err)) {
            emitFail();
            throw err;
@@ -740,6 +862,63 @@ var RunwareMediaError = class extends Error {
    status;
  };

+ // src/adapters/fal-media.ts
+ var DEFAULT_BASE3 = "https://fal.run";
+ function extractImageUrls2(body) {
+   const fromArray = (body.images ?? []).map((im) => im?.url).filter((u) => typeof u === "string" && u.length > 0);
+   if (fromArray.length > 0) return fromArray;
+   const single = body.image?.url;
+   return typeof single === "string" && single.length > 0 ? [single] : [];
+ }
+ function errorMessage2(body) {
+   if (typeof body.detail === "string") return body.detail;
+   if (Array.isArray(body.detail)) {
+     const msgs = body.detail.map((d) => d?.msg).filter(Boolean);
+     if (msgs.length > 0) return msgs.join("; ");
+   }
+   return body.error || body.message || "unknown";
+ }
+ function createFalMediaAdapter(config) {
+   const { apiKey, baseUrl = DEFAULT_BASE3, fetchImpl = fetch } = config;
+   return {
+     provider: "fal",
+     async run(req) {
+       const res = await fetchImpl(`${baseUrl}/${req.externalId}`, {
+         method: "POST",
+         headers: {
+           "content-type": "application/json",
+           authorization: `Key ${apiKey}`,
+           accept: "application/json"
+         },
+         body: JSON.stringify(req.input)
+       });
+       let body;
+       try {
+         body = await res.json();
+       } catch {
+         body = {};
+       }
+       if (!res.ok) {
+         throw new FalMediaError(res.status, errorMessage2(body));
+       }
+       const urls = extractImageUrls2(body);
+       if (urls.length === 0) {
+         throw new Error(`ai-lcr: fal returned no image URL for "${req.externalId}"`);
+       }
+       const outputs = urls.map((url) => ({ url, type: "image" }));
+       return { outputs, units: outputs.length };
+     }
+   };
+ }
+ var FalMediaError = class extends Error {
+   constructor(status, body) {
+     super(`fal media HTTP ${status}: ${body.slice(0, 300)}`);
+     this.status = status;
+     this.name = "FalMediaError";
+   }
+   status;
+ };
+
  // src/index.ts
  function isLanguageModel(entry) {
    return typeof entry.doGenerate === "function";
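A self-contained sketch of how the two helper functions in the fal adapter hunk treat different response bodies. The helpers are re-declared here under the hypothetical names `extractImageUrls` and `errorMessage` so the example runs on its own, and the sample bodies are invented for illustration rather than captured from fal.

```javascript
// Re-declared copies of the hunk's helpers, for illustration only.
function extractImageUrls(body) {
  const fromArray = (body.images ?? [])
    .map((im) => im?.url)
    .filter((u) => typeof u === "string" && u.length > 0);
  if (fromArray.length > 0) return fromArray;
  const single = body.image?.url;
  return typeof single === "string" && single.length > 0 ? [single] : [];
}

function errorMessage(body) {
  if (typeof body.detail === "string") return body.detail;
  if (Array.isArray(body.detail)) {
    const msgs = body.detail.map((d) => d?.msg).filter(Boolean);
    if (msgs.length > 0) return msgs.join("; ");
  }
  return body.error || body.message || "unknown";
}

// Invented sample bodies: an `images` array (empty URLs are dropped),
// a single `image` object, and a body with neither.
const many = extractImageUrls({ images: [{ url: "https://example.com/a.png" }, { url: "" }] });
const one = extractImageUrls({ image: { url: "https://example.com/b.png" } });
const none = extractImageUrls({});

// A list-shaped `detail` is flattened into one message; an empty body falls back.
const validation = errorMessage({ detail: [{ msg: "prompt is required" }, { msg: "bad size" }] });
const fallback = errorMessage({});

console.log(many, one, none, validation, fallback);
```

In the adapter itself, the empty-list case is what triggers the `fal returned no image URL` error, and the flattened message is what ends up inside `FalMediaError`.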
@@ -787,6 +966,7 @@ export {
    classifyError,
    classifyErrorKind,
    comparePrices,
+   createFalMediaAdapter,
    createKunavoMediaAdapter,
    createLCR,
    createMediaLCR,
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "ai-lcr",
-   "version": "0.2.3",
+   "version": "0.2.5",
    "description": "Least Cost Routing for LLMs — route every model call to the cheapest available provider, fall back automatically, and track real cost. Built for the Vercel AI SDK.",
    "keywords": [
      "ai",