npm - ai-lcr - Versions diffs - 0.2.3 → 0.2.6 - Mend

ai-lcr 0.2.3 → 0.2.6

Files changed (8)

package/CHANGELOG.md CHANGED

@@ -4,6 +4,69 @@ All notable changes to `ai-lcr` are documented here. The format follows
 [Keep a Changelog](https://keepachangelog.com/), and the project adheres to
 [Semantic Versioning](https://semver.org/).
+## [0.2.6] — 2026-06-01
+### Changed
+- **fal media adapter now covers image *and* video** via fal's async queue API
+  (submit → poll `status_url` → fetch `response_url`), replacing the synchronous
+  image-only `fal.run` adapter shipped in 0.2.5. This is ai-lcr's first working
+  **video** execution path: the registry already priced/routed the Veo family
+  but no adapter could run it. Same house style — raw `fetch`, injectable
+  `fetchImpl`, no provider SDK; `Authorization: Key` (not Bearer); cost left to
+  the router's normalized estimate (the queue result carries no per-call price).
+  Following the submit response's `status_url`/`response_url` sidesteps fal's
+  sub-path quirk (`fal-ai/flux/schnell` submits to the full path, but status and
+  result live under the `fal-ai/flux` base). `createFalMediaAdapter`'s public
+  name is unchanged; image callers are unaffected.
+## [0.2.5] — 2026-06-01
+Pre-launch failover-robustness + media-provider pass — closing cases where a
+real provider failure slipped past the switch criterion and killed the request,
+and making fal a live failover target.
+### Fixed
+- **A network-unreachable provider didn't fail over.** `isRetryableError` only
+  matched HTTP statuses and English keywords, but a provider that's down throws
+  a `fetch` `TypeError` with *no* status — and wraps the real cause
+  (`ECONNREFUSED`/`ECONNRESET`/`ENOTFOUND`/connect-timeout, with the Node `code`)
+  in `error.cause`. Those read as a non-retryable client error, so the cheapest
+  provider going down killed the request instead of failing over, even though
+  that is the most common outage mode. The engine now walks the `cause` chain
+  and treats Node
+  network codes / transport-failure messages as retryable. Applies to both the
+  text and media routers. New exported helper `isNetworkError`.
+- **Non-English billing failures didn't fail over.** Out-of-credit detection was
+  English-only, but Chinese providers (e.g. Kunavo) report a failed charge as
+  `余额不足`/`账户欠费`/`扣费失败` in a 200/400 body with no billing status.
+  Those are now matched (plus `balance`/`exhausted`), so a failed charge fails
+  over and is tagged `billing` by `classifyErrorKind` for alerting.
+- **An out-of-balance 403 was mis-tagged as `auth`.** Providers report an
+  exhausted account as 403 (e.g. fal "exhausted balance") — a top-up problem,
+  not a revoked key. `classifyErrorKind` now lets billing wording win over a
+  bare 401/403 status, so it's tagged `billing` (a plain 403 stays `auth`).
+- **A throwing observer could fail a successful request.** `onCost`/`onCall`/
+  `onError` were invoked unguarded; a logging sink that threw (e.g. a flaky db9
+  write) turned an otherwise-successful generation into a thrown error. All
+  observer callbacks are now fire-and-forget — wrapped so a throw can never
+  affect routing or the request outcome. Applies to both routers.
+### Added
+- **fal media adapter** (`createFalMediaAdapter`). fal was in the price table
+  but had no adapter, so its routes were silently skipped at runtime — now it's
+  a real cheapest-first / failover target for image models. Synchronous
+  `https://fal.run/<model>` with `Authorization: Key`, generic input pass-
+  through, HTTP-status-bearing errors (403 out-of-balance → fails over; 422 bad
+  input → doesn't). Image only; fal video (queue) is on the roadmap.
+- **Status-page liveness probes for Runware + fal** (`website`). Both are now
+  monitored with a free, generation-free reachability probe: Runware's `ping`
+  task (→ `pong`, 0 cost) and fal's `GET /v1/account/billing` (2xx ⇒ endpoint up
+  + key valid). Generalized via a new `ReachProbe` so a "reachable" check can
+  hit a provider-specific free endpoint instead of `GET /v1/models`. Requires
+  `RUNWARE_API_KEY` and `FAL_KEY` env vars to be set.
 ## [0.2.3] — 2026-06-01
 Release-quality and engine-correctness pass.
@@ -57,4 +120,6 @@ Release-quality and engine-correctness pass.
 - Dual ESM/CJS build. Media (image/video) least-cost routing with the Runware
   and Kunavo adapters; cap-aware failover for the text router.
+[0.2.6]: https://github.com/victorzhrn/ai-lcr/releases/tag/v0.2.6
+[0.2.5]: https://github.com/victorzhrn/ai-lcr/releases/tag/v0.2.5
 [0.2.3]: https://github.com/victorzhrn/ai-lcr/releases/tag/v0.2.3
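
The network-failover fix described in the 0.2.5 entry can be illustrated with a minimal sketch. This is not the shipped code: `collectCodes` and `looksLikeNetworkError` are hypothetical names, the code set is a reduced subset, and the message-pattern matching that the real `isNetworkError` also performs is left out. It only shows why walking `error.cause` matters when `fetch` throws a status-less `TypeError`.

```typescript
// Reduced code set for illustration; the shipped list is longer.
const NETWORK_CODES = new Set(["ECONNREFUSED", "ECONNRESET", "ENOTFOUND", "ETIMEDOUT"]);

// Collect string `code` fields from an error and from its `cause` chain.
// The depth cap and the `seen` set guard against cyclic cause chains.
function collectCodes(error: unknown, maxDepth = 6): string[] {
  const codes: string[] = [];
  const seen = new Set<object>();
  let cur: unknown = error;
  for (
    let depth = 0;
    depth < maxDepth && typeof cur === "object" && cur !== null && !seen.has(cur);
    depth++
  ) {
    const e = cur as { code?: unknown; cause?: unknown };
    seen.add(e);
    if (typeof e.code === "string") codes.push(e.code);
    cur = e.cause;
  }
  return codes;
}

function looksLikeNetworkError(error: unknown): boolean {
  return collectCodes(error).some((c) => NETWORK_CODES.has(c));
}

// A provider that is down: the outer TypeError has no status, and the Node
// error code sits one level down in `cause`.
const refused = Object.assign(new Error("connect ECONNREFUSED 127.0.0.1:443"), {
  code: "ECONNREFUSED",
});
const down = Object.assign(new TypeError("fetch failed"), { cause: refused });

// A client error with an HTTP status and no network code anywhere in the chain.
const badInput = Object.assign(new Error("unprocessable"), { status: 422 });

console.log(looksLikeNetworkError(down), looksLikeNetworkError(badInput));
```

Checking only the top-level error would miss `down`, because the outer `TypeError` carries neither a status nor a code.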

package/README.md CHANGED

@@ -156,7 +156,7 @@ Any OpenAI-compatible endpoint works — and so does any AI SDK provider package
 - **Model vendors' own APIs (native):** route straight to [DeepSeek](https://platform.deepseek.com), [OpenAI](https://openai.com), [Anthropic](https://anthropic.com), [Google](https://ai.google.dev), [xAI](https://x.ai), etc. via their AI SDK provider packages — no markup, full native features. See [Route to a model vendor's own API](#route-to-a-model-vendors-own-api-native-providers).
 - **Text aggregators:** [OpenRouter](https://openrouter.ai) (widest coverage, list pricing) · [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off** every model) · [TokenMart](https://thetokenmart.ai) (15–65% off, varies by model)
-- **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) — image routing available via `createMediaLCR` (Kunavo + Runware adapters); video on the roadmap
+- **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) — routing via `createMediaLCR`. Image: Kunavo + Runware + fal. Video: fal (live, via its async queue API); Kunavo's Veo poll path is implemented but unverified
 ## Text model pricing
@@ -273,7 +273,8 @@ Two OpenAI-compatible providers, same probe, same day. Cells cover both families
 - [ ] Bundled price table for zero-config pricing (drop the manual `cost` numbers)
 - [ ] Provider-quirk middleware (transparently patch known per-provider request quirks, e.g. Kunavo's ignored `max_tokens`)
 - [ ] Feed probe results into routing automatically (auto-exclude a model from a provider that fails its probe)
-- [ ] Image & video model routing (fal.ai / Runware / Kunavo)
+- [x] Image & video model routing (`createMediaLCR`) — image via Kunavo + Runware + fal; **video live via fal** (async queue API)
+- [ ] Normalized cross-provider video price comparison + verified Kunavo/Runware video adapters
 ## Affiliate disclosure

package/README.zh-CN.md CHANGED

@@ -114,7 +114,7 @@ const lcr = createLCR({
 - **Model vendors' official APIs (native):** connect directly to [DeepSeek](https://platform.deepseek.com), [OpenAI](https://openai.com), [Anthropic](https://anthropic.com), [Google](https://ai.google.dev), [xAI](https://x.ai), etc. through their respective AI SDK provider packages, with no markup and full native features. See the section "Connect directly to a model vendor's official API (native provider)" above.
 - **Text aggregators:** [OpenRouter](https://openrouter.ai) (widest coverage, list pricing) · [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off every model**) · [TokenMart](https://thetokenmart.ai) (15–65% off, varies by model)
-- **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai); routing is on the roadmap
+- **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai); routed via `createMediaLCR`. Image: Kunavo + Runware + fal. Video: fal (live, via its async queue API); Kunavo's Veo polling path is implemented but unverified
 ## Text model pricing
@@ -229,7 +229,8 @@ API_KEY=$TOKENMART_API_KEY BASE=https://api.tokenmart.ai \
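 - [ ] Built-in price table for zero-config pricing (no more hand-entered `cost` numbers)
 - [ ] Provider-quirk middleware (transparently patch known quirks, e.g. Kunavo's ignored `max_tokens`)
 - [ ] Feed probe results into routing automatically (automatically drop a provider×model pair that fails its probe)
-- [ ] Image and video model routing (fal.ai / Runware / Kunavo)
+- [x] Image and video model routing (`createMediaLCR`): image via Kunavo + Runware + fal; **video is live via fal** (async queue API)
+- [ ] Normalized cross-provider video price comparison + verified Kunavo/Runware video adapters
 ## Affiliate disclosure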

package/dist/index.cjs CHANGED

@@ -26,6 +26,7 @@ __export(index_exports, {
   classifyError: () => classifyError,
   classifyErrorKind: () => classifyErrorKind,
   comparePrices: () => comparePrices,
+  createFalMediaAdapter: () => createFalMediaAdapter,
   createKunavoMediaAdapter: () => createKunavoMediaAdapter,
   createLCR: () => createLCR,
   createMediaLCR: () => createMediaLCR,
@@ -57,43 +58,126 @@ var RETRYABLE_PATTERNS = [
   "504",
   "429",
   // Billing caps — a capped provider should fall over, not kill the request.
+  // Include non-English wording: Chinese providers (e.g. Kunavo) report a failed
+  // charge as "余额不足"/"账户欠费"/"扣费失败" with a 200/400 body, which no
+  // English keyword and no HTTP status would catch — so without these a billing
+  // failure would die instead of failing over, the exact opposite of what we want.
   "insufficient",
   "credit",
   "quota",
   "billing",
-  "payment required"
+  "payment required",
+  "balance",
+  "\u4F59\u989D",
+  "\u6B20\u8D39",
+  "\u6263\u8D39",
+  "\u6263\u6B3E"
 ];
+var NETWORK_CODES = /* @__PURE__ */ new Set([
+  "ECONNREFUSED",
+  "ECONNRESET",
+  "ECONNABORTED",
+  "ENOTFOUND",
+  "EAI_AGAIN",
+  "ETIMEDOUT",
+  "EPIPE",
+  "EHOSTUNREACH",
+  "ENETUNREACH",
+  "EPROTO",
+  "UND_ERR_SOCKET",
+  "UND_ERR_CONNECT_TIMEOUT",
+  "UND_ERR_HEADERS_TIMEOUT",
+  "UND_ERR_BODY_TIMEOUT"
+]);
+var NETWORK_PATTERNS = [
+  "fetch failed",
+  "failed to fetch",
+  "socket hang up",
+  "socket disconnected",
+  "econnrefused",
+  "econnreset",
+  "enotfound",
+  "etimedout",
+  "ehostunreach",
+  "enetunreach",
+  "eai_again",
+  "getaddrinfo",
+  "connect timeout",
+  "connection refused",
+  "connection reset",
+  "connection error",
+  "network error",
+  "dns"
+];
+function safeStringify(value) {
+  try {
+    return JSON.stringify(value) ?? "";
+  } catch {
+    return String(value);
+  }
+}
+function errorSignals(error) {
+  const parts = [];
+  const codes = [];
+  const seen = /* @__PURE__ */ new Set();
+  let cur = error;
+  for (let depth = 0; depth < 6 && cur && typeof cur === "object" && !seen.has(cur); depth++) {
+    seen.add(cur);
+    const e = cur;
+    if (typeof e.message === "string") parts.push(e.message);
+    if (typeof e.name === "string") parts.push(e.name);
+    if (typeof e.code === "string") {
+      parts.push(e.code);
+      codes.push(e.code);
+    }
+    cur = e.cause;
+  }
+  if (parts.length === 0) parts.push(safeStringify(error));
+  return { text: parts.join(" ").toLowerCase(), codes };
+}
+function isNetworkError(error) {
+  const { text, codes } = errorSignals(error);
+  if (codes.some((c) => NETWORK_CODES.has(c))) return true;
+  return NETWORK_PATTERNS.some((p) => text.includes(p));
+}
 function isRetryableError(error) {
   const e = error;
   const status = e?.statusCode ?? e?.status;
   if (typeof status === "number" && (RETRYABLE_STATUS.has(status) || status > 500)) {
     return true;
   }
-  const text = (e?.message ? String(e.message) : safeStringify(error)).toLowerCase();
+  if (isNetworkError(error)) return true;
+  const { text } = errorSignals(error);
   return RETRYABLE_PATTERNS.some((p) => text.includes(p));
 }
-function safeStringify(value) {
-  try {
-    return JSON.stringify(value) ?? "";
-  } catch {
-    return String(value);
-  }
-}
 function classifyError(error) {
   const e = error;
   const status = e?.statusCode ?? e?.status;
   if (typeof status === "number") return String(status);
-  const text = (e?.message ? String(e.message) : safeStringify(error)).toLowerCase();
+  if (isNetworkError(error)) return "network";
+  const { text } = errorSignals(error);
   return RETRYABLE_PATTERNS.find((p) => text.includes(p)) ?? "error";
 }
 var AUTH_STATUS = /* @__PURE__ */ new Set([401, 403]);
-var BILLING_PATTERNS = ["insufficient", "credit", "quota", "billing", "payment required"];
+var BILLING_PATTERNS = [
+  "insufficient",
+  "credit",
+  "quota",
+  "billing",
+  "payment required",
+  "balance",
+  "exhausted",
+  "\u4F59\u989D",
+  "\u6B20\u8D39",
+  "\u6263\u8D39",
+  "\u6263\u6B3E"
+];
 function classifyErrorKind(error) {
   const e = error;
   const status = e?.statusCode ?? e?.status;
-  const text = (e?.message ? String(e.message) : safeStringify(error)).toLowerCase();
-  if (typeof status === "number" && AUTH_STATUS.has(status)) return "auth";
+  const { text } = errorSignals(error);
   if (status === 402 || BILLING_PATTERNS.some((p) => text.includes(p))) return "billing";
+  if (typeof status === "number" && AUTH_STATUS.has(status)) return "auth";
   return isRetryableError(error) ? "transient" : "client";
 }
 var callSeq = 0;
@@ -162,6 +246,27 @@ var LcrFallbackModel = class {
   shouldRetry(error) {
     return (this.opts.shouldRetry ?? isRetryableError)(error);
   }
+  // Observer callbacks are caller-supplied logging hooks: a throw from one of
+  // them must NEVER turn a successful (or already-failed) request into a
+  // different outcome. Swallow anything they throw — they are fire-and-forget.
+  emitError(error, provider) {
+    try {
+      this.opts.onError?.(error, provider);
+    } catch {
+    }
+  }
+  emitCost(event) {
+    try {
+      this.opts.onCost?.(event);
+    } catch {
+    }
+  }
+  emitCall(record) {
+    try {
+      this.opts.onCall?.(record);
+    } catch {
+    }
+  }
   startCall() {
     return { id: newCallId(), attempts: [], startedAt: Date.now() };
   }
@@ -181,14 +286,14 @@ var LcrFallbackModel = class {
     const inputTokens = usage?.inputTokens?.total ?? 0;
     const outputTokens = usage?.outputTokens?.total ?? 0;
     const costUsd = provider.cost ? inputTokens / 1e6 * provider.cost.input + outputTokens / 1e6 * provider.cost.output : 0;
-    this.opts.onCost?.({
+    this.emitCost({
       model: this.opts.modelName,
       provider: provider.label,
       inputTokens,
       outputTokens,
       costUsd
     });
-    this.opts.onCall?.({
+    this.emitCall({
       id: ctx.id,
       model: this.opts.modelName,
       attempts: ctx.attempts,
@@ -203,7 +308,7 @@ var LcrFallbackModel = class {
   }
   /** Every provider failed: fire `onCall` with no winner. */
   finalizeFail(ctx) {
-    this.opts.onCall?.({
+    this.emitCall({
       id: ctx.id,
       model: this.opts.modelName,
       attempts: ctx.attempts,
@@ -238,7 +343,7 @@ var LcrFallbackModel = class {
           this.finalizeFail(ctx);
           throw error;
         }
-        this.opts.onError?.(error, provider.label);
+        this.emitError(error, provider.label);
         this.recordFail(ctx, provider, attemptStart, error);
       }
     }
@@ -274,7 +379,7 @@ var LcrFallbackModel = class {
           this.finalizeFail(ctx);
           throw error;
         }
-        this.opts.onError?.(error, serving.label);
+        this.emitError(error, serving.label);
         this.recordFail(ctx, serving, servingStart, error);
         tried++;
         if (tried >= n) {
@@ -310,7 +415,7 @@ var LcrFallbackModel = class {
           self.finalizeOk(ctx, servingProvider, servingAttemptStart, usage);
           controller.close();
         } catch (error) {
-          self.opts.onError?.(error, servingProvider.label);
+          self.emitError(error, servingProvider.label);
           self.recordFail(ctx, servingProvider, servingAttemptStart, error);
           if (!streamedAny) {
             const nextTried = triedBeforeServing + 1;
@@ -434,6 +539,24 @@ function newMediaCallId() {
 }
 function createMediaLCR(config) {
   const { registry, adapters, reference = DEFAULT_REFERENCE, onError, onCost, onCall } = config;
+  const safeError = (error, provider) => {
+    try {
+      onError?.(error, provider);
+    } catch {
+    }
+  };
+  const safeCost = (event) => {
+    try {
+      onCost?.(event);
+    } catch {
+    }
+  };
+  const safeCall = (record) => {
+    try {
+      onCall?.(record);
+    } catch {
+    }
+  };
   return async function generate(modelId, input) {
     const def = registry[modelId];
     if (!def) {
@@ -444,7 +567,7 @@ function createMediaLCR(config) {
     const startedAt = Date.now();
     const attempts = [];
     let lastErr;
-    const emitFail = () => onCall?.({
+    const emitFail = () => safeCall({
       id: newMediaCallId(),
       model: modelId,
       attempts,
@@ -466,8 +589,8 @@ function createMediaLCR(config) {
         const estimated = result.costCents === void 0;
         const costCents = estimated ? route.refCents * (result.units ?? 1) : result.costCents;
         attempts.push({ provider: route.provider, ok: true, latencyMs: Date.now() - attemptStart });
-        onCost?.({ modelId, provider: route.provider, costCents, estimated });
-        onCall?.({
+        safeCost({ modelId, provider: route.provider, costCents, estimated });
+        safeCall({
           id: newMediaCallId(),
           model: modelId,
           attempts,
@@ -489,7 +612,7 @@ function createMediaLCR(config) {
           latencyMs: Date.now() - attemptStart,
           errorClass: classifyError(err)
         });
-        onError?.(err, route.provider);
+        safeError(err, route.provider);
         if (!isRetryableError(err)) {
           emitFail();
           throw err;
@@ -779,6 +902,108 @@ var RunwareMediaError = class extends Error {
   status;
 };
+// src/adapters/fal-media.ts
+var DEFAULT_BASE3 = "https://queue.fal.run";
+function extractOutputs(raw) {
+  if (!raw || typeof raw !== "object") return [];
+  const data = raw;
+  const out = [];
+  const pushUrl = (url, type) => {
+    if (typeof url === "string" && url.length > 0) out.push({ url, type });
+  };
+  if (Array.isArray(data.images)) {
+    for (const img of data.images) pushUrl(img?.url, "image");
+  }
+  pushUrl(data.image?.url, "image");
+  if (Array.isArray(data.videos)) {
+    for (const v of data.videos) pushUrl(v?.url, "video");
+  }
+  pushUrl(data.video?.url, "video");
+  return out;
+}
+function createFalMediaAdapter(config) {
+  const {
+    apiKey,
+    baseUrl = DEFAULT_BASE3,
+    pollIntervalMs = 3e3,
+    pollTimeoutMs = 3e5,
+    fetchImpl = fetch
+  } = config;
+  const headers = {
+    "content-type": "application/json",
+    authorization: `Key ${apiKey}`
+  };
+  return {
+    provider: "fal",
+    async run(req) {
+      const submitRes = await fetchImpl(`${baseUrl}/${req.externalId}`, {
+        method: "POST",
+        headers,
+        body: JSON.stringify(req.input)
+      });
+      if (!submitRes.ok) {
+        throw new FalMediaError(submitRes.status, await safeText2(submitRes));
+      }
+      const submit = await submitRes.json();
+      const statusUrl = submit.status_url;
+      const responseUrl = submit.response_url;
+      if (!statusUrl || !responseUrl) {
+        throw new Error(
+          `ai-lcr: fal submit for "${req.externalId}" returned no status/response URL (keys: ${Object.keys(
+            submit
+          ).join(", ")})`
+        );
+      }
+      const deadline = Date.now() + pollTimeoutMs;
+      let completed = false;
+      while (Date.now() < deadline) {
+        const statusRes = await fetchImpl(statusUrl, { headers });
+        if (!statusRes.ok) {
+          throw new FalMediaError(statusRes.status, await safeText2(statusRes));
+        }
+        const status = String((await statusRes.json()).status ?? "");
+        if (status === "COMPLETED") {
+          completed = true;
+          break;
+        }
+        await sleep2(pollIntervalMs);
+      }
+      if (!completed) {
+        throw new Error(
+          `ai-lcr: fal job for "${req.externalId}" timed out after ${pollTimeoutMs}ms`
+        );
+      }
+      const resultRes = await fetchImpl(responseUrl, { headers });
+      if (!resultRes.ok) {
+        throw new FalMediaError(resultRes.status, await safeText2(resultRes));
+      }
+      const outputs = extractOutputs(await resultRes.json());
+      if (outputs.length === 0) {
+        throw new Error(`ai-lcr: fal returned no media URL for "${req.externalId}"`);
+      }
+      return { outputs, units: outputs.length };
+    }
+  };
+}
+var FalMediaError = class extends Error {
+  constructor(status, body) {
+    super(`fal media HTTP ${status}: ${body.slice(0, 300)}`);
+    this.status = status;
+    this.name = "FalMediaError";
+  }
+  status;
+};
+function sleep2(ms) {
+  return new Promise((r) => setTimeout(r, ms));
+}
+async function safeText2(res) {
+  try {
+    return await res.text();
+  } catch {
+    return "<no body>";
+  }
+}
 // src/index.ts
 function isLanguageModel(entry) {
   return typeof entry.doGenerate === "function";
@@ -827,6 +1052,7 @@ function createLCR(config) {
   classifyError,
   classifyErrorKind,
   comparePrices,
+  createFalMediaAdapter,
   createKunavoMediaAdapter,
   createLCR,
   createMediaLCR,
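
The reordering inside `classifyErrorKind` in this file (billing check moved ahead of the 401/403 check) can be illustrated with a reduced sketch. `kindOf`, `SketchKind`, and `BILLING_WORDS` are hypothetical names, the keyword list is a subset, and the `transient`/`client` branch of the real function is collapsed into `"other"`. The point is only the precedence: billing wording wins over a bare auth status.

```typescript
type SketchKind = "billing" | "auth" | "other";

// Reduced keyword list for illustration; the shipped list is longer.
const BILLING_WORDS = ["insufficient", "balance", "exhausted", "余额", "欠费"];

function kindOf(error: { status?: number; message?: string }): SketchKind {
  const text = (error.message ?? "").toLowerCase();
  // Billing is checked first, so billing wording overrides a bare 401/403.
  if (error.status === 402 || BILLING_WORDS.some((w) => text.includes(w))) {
    return "billing";
  }
  if (error.status === 401 || error.status === 403) return "auth";
  return "other";
}

// An out-of-balance 403 is a top-up problem, not a revoked key.
const outOfBalance = kindOf({ status: 403, message: "Exhausted balance" });
// A plain 403 with no billing wording is still an auth failure.
const forbidden = kindOf({ status: 403, message: "Forbidden" });
// A Chinese billing message in a 200 body carries no billing status at all.
const chineseBilling = kindOf({ status: 200, message: "余额不足" });

console.log(outOfBalance, forbidden, chineseBilling);
```

With the pre-0.2.5 order (auth status first), the first case would have been tagged `auth`, which points an operator at the key rather than at the account balance.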

package/dist/index.d.cts CHANGED

@@ -358,6 +358,48 @@ interface RunwareMediaConfig {
 }
 declare function createRunwareMediaAdapter(config: RunwareMediaConfig): MediaAdapter;
+/**
+ * fal media adapter — image (queue) + video (queue, async poll).
+ *
+ * fal serves every model through one async queue API, so a single submit→poll→
+ * fetch-result path covers both image and video. That is the whole reason this
+ * adapter exists: it is ai-lcr's first VIDEO-capable execution path. (The
+ * Runware adapter is image-only; the Kunavo one's video poll loop is unverified.)
+ *
+ * Implementation note: ai-art's fal adapter uses the `@fal-ai/client` SDK, but
+ * ai-lcr deliberately keeps zero provider SDKs — every adapter is raw `fetch`
+ * with an injectable `fetchImpl` for testing (see runware-media, kunavo-media).
+ * So this re-implements the three queue calls against fal's REST endpoints:
+ *
+ *   1. submit  POST https://queue.fal.run/{model}        → { request_id, status_url, response_url }
+ *   2. status  GET  {status_url}                         → { status: IN_QUEUE | IN_PROGRESS | COMPLETED }
+ *   3. result  GET  {response_url}                        → { images:[…] } | { video:{url} } | …
+ *
+ * We follow the `status_url` / `response_url` returned by submit rather than
+ * rebuilding them, which sidesteps fal's sub-path quirk (a model like
+ * `fal-ai/flux/schnell` submits to the full path but its status/result live
+ * under the `fal-ai/flux` base).
+ *
+ * Auth: fal uses `Authorization: Key {FAL_KEY}` (NOT Bearer).
+ *
+ * Cost: fal's queue result does not carry a per-call price, so cost is left to
+ * the router's normalized estimate (costCents stays undefined; `units` is the
+ * output count — one image, or one clip).
+ */
+interface FalMediaConfig {
+    apiKey: string;
+    /** Override for testing. Defaults to https://queue.fal.run. */
+    baseUrl?: string;
+    /** Video/job poll cadence (ms). Default 3000. */
+    pollIntervalMs?: number;
+    /** Max time to wait for a job before giving up (ms). Default 300000 (5m). */
+    pollTimeoutMs?: number;
+    /** Injected for testing; defaults to global fetch. */
+    fetchImpl?: typeof fetch;
+}
+declare function createFalMediaAdapter(config: FalMediaConfig): MediaAdapter;
 /**
  * ai-lcr — Least Cost Routing for LLMs.
  *
@@ -411,4 +453,4 @@ type LCRRouter = (modelName: string) => LanguageModelV3;
  */
 declare function createLCR(config: LCRConfig): LCRRouter;
-export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };
+export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createFalMediaAdapter, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };
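
The doc comment above lists the result shapes as `{ images:[…] } | { video:{url} } | …`. A typed sketch of normalizing those shapes into one output list might look like the following. It assumes the four keys handled by the compiled adapter (`images`, `image`, `videos`, `video`); `collectOutputs`, `SketchOutput`, and `UrlHolder` are hypothetical names, and the sample URLs are placeholders.

```typescript
type SketchOutput = { url: string; type: "image" | "video" };
type UrlHolder = { url?: unknown } | null | undefined;

function collectOutputs(raw: unknown): SketchOutput[] {
  if (!raw || typeof raw !== "object") return [];
  const data = raw as {
    images?: unknown;
    image?: UrlHolder;
    videos?: unknown;
    video?: UrlHolder;
  };
  const out: SketchOutput[] = [];
  // Keep only non-empty string URLs; anything else is silently skipped.
  const push = (item: UrlHolder, type: SketchOutput["type"]): void => {
    const url = item?.url;
    if (typeof url === "string" && url.length > 0) out.push({ url, type });
  };
  if (Array.isArray(data.images)) {
    for (const img of data.images) push(img as UrlHolder, "image");
  }
  push(data.image, "image");
  if (Array.isArray(data.videos)) {
    for (const v of data.videos) push(v as UrlHolder, "video");
  }
  push(data.video, "video");
  return out;
}

// An image-style result with one usable URL and one empty entry.
const imageResult = collectOutputs({
  images: [{ url: "https://example.test/a.png" }, { url: "" }],
});
// A video-style result with a single clip.
const videoResult = collectOutputs({ video: { url: "https://example.test/clip.mp4" } });

console.log(imageResult.length, videoResult[0]?.type);
```

Because the adapter reports `units` as the output count, an empty list here corresponds to the "returned no media URL" error path rather than a zero-unit success.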

package/dist/index.d.ts CHANGED

@@ -358,6 +358,48 @@ interface RunwareMediaConfig {
 }
 declare function createRunwareMediaAdapter(config: RunwareMediaConfig): MediaAdapter;
+/**
+ * fal media adapter — image (queue) + video (queue, async poll).
+ *
+ * fal serves every model through one async queue API, so a single submit→poll→
+ * fetch-result path covers both image and video. That is the whole reason this
+ * adapter exists: it is ai-lcr's first VIDEO-capable execution path. (The
+ * Runware adapter is image-only; the Kunavo one's video poll loop is unverified.)
+ *
+ * Implementation note: ai-art's fal adapter uses the `@fal-ai/client` SDK, but
+ * ai-lcr deliberately keeps zero provider SDKs — every adapter is raw `fetch`
+ * with an injectable `fetchImpl` for testing (see runware-media, kunavo-media).
+ * So this re-implements the three queue calls against fal's REST endpoints:
+ *
+ *   1. submit  POST https://queue.fal.run/{model}        → { request_id, status_url, response_url }
+ *   2. status  GET  {status_url}                         → { status: IN_QUEUE | IN_PROGRESS | COMPLETED }
+ *   3. result  GET  {response_url}                        → { images:[…] } | { video:{url} } | …
+ *
+ * We follow the `status_url` / `response_url` returned by submit rather than
+ * rebuilding them, which sidesteps fal's sub-path quirk (a model like
+ * `fal-ai/flux/schnell` submits to the full path but its status/result live
+ * under the `fal-ai/flux` base).
+ *
+ * Auth: fal uses `Authorization: Key {FAL_KEY}` (NOT Bearer).
+ *
+ * Cost: fal's queue result does not carry a per-call price, so cost is left to
+ * the router's normalized estimate (costCents stays undefined; `units` is the
+ * output count — one image, or one clip).
+ */
+interface FalMediaConfig {
+    apiKey: string;
+    /** Override for testing. Defaults to https://queue.fal.run. */
+    baseUrl?: string;
+    /** Video/job poll cadence (ms). Default 3000. */
+    pollIntervalMs?: number;
+    /** Max time to wait for a job before giving up (ms). Default 300000 (5m). */
+    pollTimeoutMs?: number;
+    /** Injected for testing; defaults to global fetch. */
+    fetchImpl?: typeof fetch;
+}
+declare function createFalMediaAdapter(config: FalMediaConfig): MediaAdapter;
 /**
  * ai-lcr — Least Cost Routing for LLMs.
  *
@@ -411,4 +453,4 @@ type LCRRouter = (modelName: string) => LanguageModelV3;
  */
 declare function createLCR(config: LCRConfig): LCRRouter;
-export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };
+export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createFalMediaAdapter, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };

package/dist/index.js CHANGED

@@ -18,43 +18,126 @@ var RETRYABLE_PATTERNS = [
   "504",
   "429",
   // Billing caps — a capped provider should fall over, not kill the request.
+  // Include non-English wording: Chinese providers (e.g. Kunavo) report a failed
+  // charge as "余额不足"/"账户欠费"/"扣费失败" with a 200/400 body, which no
+  // English keyword and no HTTP status would catch — so without these a billing
+  // failure would die instead of failing over, the exact opposite of what we want.
   "insufficient",
   "credit",
   "quota",
   "billing",
-  "payment required"
+  "payment required",
+  "balance",
+  "\u4F59\u989D",
+  "\u6B20\u8D39",
+  "\u6263\u8D39",
+  "\u6263\u6B3E"
 ];
+var NETWORK_CODES = /* @__PURE__ */ new Set([
+  "ECONNREFUSED",
+  "ECONNRESET",
+  "ECONNABORTED",
+  "ENOTFOUND",
+  "EAI_AGAIN",
+  "ETIMEDOUT",
+  "EPIPE",
+  "EHOSTUNREACH",
+  "ENETUNREACH",
+  "EPROTO",
+  "UND_ERR_SOCKET",
+  "UND_ERR_CONNECT_TIMEOUT",
+  "UND_ERR_HEADERS_TIMEOUT",
+  "UND_ERR_BODY_TIMEOUT"
+]);
+var NETWORK_PATTERNS = [
+  "fetch failed",
+  "failed to fetch",
+  "socket hang up",
+  "socket disconnected",
+  "econnrefused",
+  "econnreset",
+  "enotfound",
+  "etimedout",
+  "ehostunreach",
+  "enetunreach",
+  "eai_again",
+  "getaddrinfo",
+  "connect timeout",
+  "connection refused",
+  "connection reset",
+  "connection error",
+  "network error",
+  "dns"
+];
+function safeStringify(value) {
+  try {
+    return JSON.stringify(value) ?? "";
+  } catch {
+    return String(value);
+  }
+}
+function errorSignals(error) {
+  const parts = [];
+  const codes = [];
+  const seen = /* @__PURE__ */ new Set();
+  let cur = error;
+  for (let depth = 0; depth < 6 && cur && typeof cur === "object" && !seen.has(cur); depth++) {
+    seen.add(cur);
+    const e = cur;
+    if (typeof e.message === "string") parts.push(e.message);
+    if (typeof e.name === "string") parts.push(e.name);
+    if (typeof e.code === "string") {
+      parts.push(e.code);
+      codes.push(e.code);
+    }
+    cur = e.cause;
+  }
+  if (parts.length === 0) parts.push(safeStringify(error));
+  return { text: parts.join(" ").toLowerCase(), codes };
+}
+function isNetworkError(error) {
+  const { text, codes } = errorSignals(error);
+  if (codes.some((c) => NETWORK_CODES.has(c))) return true;
+  return NETWORK_PATTERNS.some((p) => text.includes(p));
+}
 function isRetryableError(error) {
   const e = error;
   const status = e?.statusCode ?? e?.status;
   if (typeof status === "number" && (RETRYABLE_STATUS.has(status) || status > 500)) {
     return true;
   }
-  const text = (e?.message ? String(e.message) : safeStringify(error)).toLowerCase();
+  if (isNetworkError(error)) return true;
+  const { text } = errorSignals(error);
   return RETRYABLE_PATTERNS.some((p) => text.includes(p));
 }
-function safeStringify(value) {
-  try {
-    return JSON.stringify(value) ?? "";
-  } catch {
-    return String(value);
-  }
-}
 function classifyError(error) {
   const e = error;
   const status = e?.statusCode ?? e?.status;
   if (typeof status === "number") return String(status);
-  const text = (e?.message ? String(e.message) : safeStringify(error)).toLowerCase();
+  if (isNetworkError(error)) return "network";
+  const { text } = errorSignals(error);
   return RETRYABLE_PATTERNS.find((p) => text.includes(p)) ?? "error";
 }
 var AUTH_STATUS = /* @__PURE__ */ new Set([401, 403]);
-var BILLING_PATTERNS = ["insufficient", "credit", "quota", "billing", "payment required"];
+var BILLING_PATTERNS = [
+  "insufficient",
+  "credit",
+  "quota",
+  "billing",
+  "payment required",
+  "balance",
+  "exhausted",
+  "\u4F59\u989D",
+  "\u6B20\u8D39",
+  "\u6263\u8D39",
+  "\u6263\u6B3E"
+];
 function classifyErrorKind(error) {
   const e = error;
   const status = e?.statusCode ?? e?.status;
-  const text = (e?.message ? String(e.message) : safeStringify(error)).toLowerCase();
-  if (typeof status === "number" && AUTH_STATUS.has(status)) return "auth";
+  const { text } = errorSignals(error);
   if (status === 402 || BILLING_PATTERNS.some((p) => text.includes(p))) return "billing";
+  if (typeof status === "number" && AUTH_STATUS.has(status)) return "auth";
   return isRetryableError(error) ? "transient" : "client";
 }
 var callSeq = 0;
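
Note on the hunk above: `errorSignals` walks the `cause` chain so that a wrapped system error code still counts as a network failure. The following is a minimal standalone sketch of that idea, not the package's code. `CODES`, `collectCodes`, and `looksLikeNetworkError` are hypothetical names, and the code set is trimmed for illustration.

```javascript
// Hypothetical, trimmed-down code set; the list in the diff is longer.
const CODES = new Set(["ECONNREFUSED", "ENOTFOUND", "ETIMEDOUT"]);

// Walk error -> error.cause -> ... collecting string codes. The depth cap and
// the visited set keep a cyclic cause chain from looping forever.
function collectCodes(error, maxDepth = 6) {
  const codes = [];
  const seen = new Set();
  let cur = error;
  for (let depth = 0; depth < maxDepth && cur && typeof cur === "object" && !seen.has(cur); depth++) {
    seen.add(cur);
    if (typeof cur.code === "string") codes.push(cur.code);
    cur = cur.cause;
  }
  return codes;
}

function looksLikeNetworkError(error) {
  return collectCodes(error).some((c) => CODES.has(c));
}

// Assumed shape of a refused connection: a TypeError with no status whose
// cause carries the system error code.
const refused = new TypeError("fetch failed", {
  cause: Object.assign(new Error("connect ECONNREFUSED"), { code: "ECONNREFUSED" }),
});

// A cyclic chain with no codes: the walk stops once it revisits an object.
const a = new Error("a");
const b = new Error("b", { cause: a });
a.cause = b;

console.log(looksLikeNetworkError(refused));
console.log(looksLikeNetworkError(a));
console.log(looksLikeNetworkError(new Error("bad request")));
```

Checking only the top-level `message` would miss the first case whenever the outer error text is generic, which appears to be the gap this hunk closes.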
@@ -123,6 +206,27 @@ var LcrFallbackModel = class {
   shouldRetry(error) {
     return (this.opts.shouldRetry ?? isRetryableError)(error);
   }
+  // Observer callbacks are caller-supplied logging hooks: a throw from one of
+  // them must NEVER turn a successful (or already-failed) request into a
+  // different outcome. Swallow anything they throw — they are fire-and-forget.
+  emitError(error, provider) {
+    try {
+      this.opts.onError?.(error, provider);
+    } catch {
+    }
+  }
+  emitCost(event) {
+    try {
+      this.opts.onCost?.(event);
+    } catch {
+    }
+  }
+  emitCall(record) {
+    try {
+      this.opts.onCall?.(record);
+    } catch {
+    }
+  }
   startCall() {
     return { id: newCallId(), attempts: [], startedAt: Date.now() };
   }
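
Note on the hunk above: the `emit*` wrappers exist so that a throwing observer cannot change a request's outcome. A small standalone sketch of the same pattern, assuming a hypothetical `safeObserver` helper and a hypothetical `runRequest` function (neither is part of the package):

```javascript
// Wrap an optional observer so that anything it throws is swallowed.
function safeObserver(fn) {
  return (...args) => {
    try {
      fn?.(...args);
    } catch {
      // Deliberately ignored: observers are fire-and-forget logging hooks.
    }
  };
}

// Hypothetical request: the provider call has already succeeded when the
// (possibly buggy) onCost observer runs.
function runRequest(onCost) {
  const result = { text: "ok" };
  safeObserver(onCost)({ costUsd: 0.001 });
  return result;
}

const out = runRequest(() => {
  throw new Error("logger is broken");
});
console.log(out.text);
```

The trade-off is that observer bugs become silent; a caller who wants visibility would need to catch and report inside the callback itself.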
@@ -142,14 +246,14 @@ var LcrFallbackModel = class {
     const inputTokens = usage?.inputTokens?.total ?? 0;
     const outputTokens = usage?.outputTokens?.total ?? 0;
     const costUsd = provider.cost ? inputTokens / 1e6 * provider.cost.input + outputTokens / 1e6 * provider.cost.output : 0;
-    this.opts.onCost?.({
+    this.emitCost({
       model: this.opts.modelName,
       provider: provider.label,
       inputTokens,
       outputTokens,
       costUsd
     });
-    this.opts.onCall?.({
+    this.emitCall({
       id: ctx.id,
       model: this.opts.modelName,
       attempts: ctx.attempts,
@@ -164,7 +268,7 @@ var LcrFallbackModel = class {
   }
   /** Every provider failed: fire `onCall` with no winner. */
   finalizeFail(ctx) {
-    this.opts.onCall?.({
+    this.emitCall({
       id: ctx.id,
       model: this.opts.modelName,
       attempts: ctx.attempts,
@@ -199,7 +303,7 @@ var LcrFallbackModel = class {
           this.finalizeFail(ctx);
           throw error;
         }
-        this.opts.onError?.(error, provider.label);
+        this.emitError(error, provider.label);
         this.recordFail(ctx, provider, attemptStart, error);
       }
     }
@@ -235,7 +339,7 @@ var LcrFallbackModel = class {
           this.finalizeFail(ctx);
           throw error;
         }
-        this.opts.onError?.(error, serving.label);
+        this.emitError(error, serving.label);
         this.recordFail(ctx, serving, servingStart, error);
         tried++;
         if (tried >= n) {
@@ -271,7 +375,7 @@ var LcrFallbackModel = class {
           self.finalizeOk(ctx, servingProvider, servingAttemptStart, usage);
           controller.close();
         } catch (error) {
-          self.opts.onError?.(error, servingProvider.label);
+          self.emitError(error, servingProvider.label);
           self.recordFail(ctx, servingProvider, servingAttemptStart, error);
           if (!streamedAny) {
             const nextTried = triedBeforeServing + 1;
@@ -395,6 +499,24 @@ function newMediaCallId() {
 }
 function createMediaLCR(config) {
   const { registry, adapters, reference = DEFAULT_REFERENCE, onError, onCost, onCall } = config;
+  const safeError = (error, provider) => {
+    try {
+      onError?.(error, provider);
+    } catch {
+    }
+  };
+  const safeCost = (event) => {
+    try {
+      onCost?.(event);
+    } catch {
+    }
+  };
+  const safeCall = (record) => {
+    try {
+      onCall?.(record);
+    } catch {
+    }
+  };
   return async function generate(modelId, input) {
     const def = registry[modelId];
     if (!def) {
@@ -405,7 +527,7 @@ function createMediaLCR(config) {
     const startedAt = Date.now();
     const attempts = [];
     let lastErr;
-    const emitFail = () => onCall?.({
+    const emitFail = () => safeCall({
       id: newMediaCallId(),
       model: modelId,
       attempts,
@@ -427,8 +549,8 @@ function createMediaLCR(config) {
         const estimated = result.costCents === void 0;
         const costCents = estimated ? route.refCents * (result.units ?? 1) : result.costCents;
         attempts.push({ provider: route.provider, ok: true, latencyMs: Date.now() - attemptStart });
-        onCost?.({ modelId, provider: route.provider, costCents, estimated });
-        onCall?.({
+        safeCost({ modelId, provider: route.provider, costCents, estimated });
+        safeCall({
           id: newMediaCallId(),
           model: modelId,
           attempts,
@@ -450,7 +572,7 @@ function createMediaLCR(config) {
           latencyMs: Date.now() - attemptStart,
           errorClass: classifyError(err)
         });
-        onError?.(err, route.provider);
+        safeError(err, route.provider);
         if (!isRetryableError(err)) {
           emitFail();
           throw err;
@@ -740,6 +862,108 @@ var RunwareMediaError = class extends Error {
   status;
 };
+// src/adapters/fal-media.ts
+var DEFAULT_BASE3 = "https://queue.fal.run";
+function extractOutputs(raw) {
+  if (!raw || typeof raw !== "object") return [];
+  const data = raw;
+  const out = [];
+  const pushUrl = (url, type) => {
+    if (typeof url === "string" && url.length > 0) out.push({ url, type });
+  };
+  if (Array.isArray(data.images)) {
+    for (const img of data.images) pushUrl(img?.url, "image");
+  }
+  pushUrl(data.image?.url, "image");
+  if (Array.isArray(data.videos)) {
+    for (const v of data.videos) pushUrl(v?.url, "video");
+  }
+  pushUrl(data.video?.url, "video");
+  return out;
+}
+function createFalMediaAdapter(config) {
+  const {
+    apiKey,
+    baseUrl = DEFAULT_BASE3,
+    pollIntervalMs = 3e3,
+    pollTimeoutMs = 3e5,
+    fetchImpl = fetch
+  } = config;
+  const headers = {
+    "content-type": "application/json",
+    authorization: `Key ${apiKey}`
+  };
+  return {
+    provider: "fal",
+    async run(req) {
+      const submitRes = await fetchImpl(`${baseUrl}/${req.externalId}`, {
+        method: "POST",
+        headers,
+        body: JSON.stringify(req.input)
+      });
+      if (!submitRes.ok) {
+        throw new FalMediaError(submitRes.status, await safeText2(submitRes));
+      }
+      const submit = await submitRes.json();
+      const statusUrl = submit.status_url;
+      const responseUrl = submit.response_url;
+      if (!statusUrl || !responseUrl) {
+        throw new Error(
+          `ai-lcr: fal submit for "${req.externalId}" returned no status/response URL (keys: ${Object.keys(
+            submit
+          ).join(", ")})`
+        );
+      }
+      const deadline = Date.now() + pollTimeoutMs;
+      let completed = false;
+      while (Date.now() < deadline) {
+        const statusRes = await fetchImpl(statusUrl, { headers });
+        if (!statusRes.ok) {
+          throw new FalMediaError(statusRes.status, await safeText2(statusRes));
+        }
+        const status = String((await statusRes.json()).status ?? "");
+        if (status === "COMPLETED") {
+          completed = true;
+          break;
+        }
+        await sleep2(pollIntervalMs);
+      }
+      if (!completed) {
+        throw new Error(
+          `ai-lcr: fal job for "${req.externalId}" timed out after ${pollTimeoutMs}ms`
+        );
+      }
+      const resultRes = await fetchImpl(responseUrl, { headers });
+      if (!resultRes.ok) {
+        throw new FalMediaError(resultRes.status, await safeText2(resultRes));
+      }
+      const outputs = extractOutputs(await resultRes.json());
+      if (outputs.length === 0) {
+        throw new Error(`ai-lcr: fal returned no media URL for "${req.externalId}"`);
+      }
+      return { outputs, units: outputs.length };
+    }
+  };
+}
+var FalMediaError = class extends Error {
+  constructor(status, body) {
+    super(`fal media HTTP ${status}: ${body.slice(0, 300)}`);
+    this.status = status;
+    this.name = "FalMediaError";
+  }
+  status;
+};
+function sleep2(ms) {
+  return new Promise((r) => setTimeout(r, ms));
+}
+async function safeText2(res) {
+  try {
+    return await res.text();
+  } catch {
+    return "<no body>";
+  }
+}
 // src/index.ts
 function isLanguageModel(entry) {
   return typeof entry.doGenerate === "function";
@@ -787,6 +1011,7 @@ export {
   classifyError,
   classifyErrorKind,
   comparePrices,
+  createFalMediaAdapter,
   createKunavoMediaAdapter,
   createLCR,
   createMediaLCR,
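
Note on the fal adapter added in this file: its flow is submit, poll the returned `status_url`, then fetch the returned `response_url`. The sketch below exercises that flow against an injected fake fetch, roughly how an injectable `fetchImpl` might be used in a test. `runQueued`, `makeFakeFetch`, and every URL here are hypothetical, and the sketch omits the adapter's auth headers and HTTP error handling.

```javascript
// Minimal submit -> poll -> fetch loop against an injected fetch function.
async function runQueued(fetchImpl, submitUrl, input, { pollIntervalMs = 10, pollTimeoutMs = 1000 } = {}) {
  const submitRes = await fetchImpl(submitUrl, { method: "POST", body: JSON.stringify(input) });
  // Follow the URLs the submit response hands back instead of building them.
  const { status_url: statusUrl, response_url: responseUrl } = await submitRes.json();
  const deadline = Date.now() + pollTimeoutMs;
  while (Date.now() < deadline) {
    const { status } = await (await fetchImpl(statusUrl)).json();
    if (status === "COMPLETED") {
      return (await fetchImpl(responseUrl)).json();
    }
    await new Promise((r) => setTimeout(r, pollIntervalMs));
  }
  throw new Error(`timed out after ${pollTimeoutMs}ms`);
}

// Fake fetch with made-up URLs: reports IN_PROGRESS twice, then COMPLETED.
function makeFakeFetch() {
  let polls = 0;
  const json = (body) => ({ ok: true, json: async () => body });
  return async (url) => {
    if (url === "https://queue.example/app/sub") {
      return json({
        status_url: "https://queue.example/app/requests/1/status",
        response_url: "https://queue.example/app/requests/1",
      });
    }
    if (url.endsWith("/status")) {
      polls += 1;
      return json({ status: polls < 3 ? "IN_PROGRESS" : "COMPLETED" });
    }
    return json({ video: { url: "https://media.example/out.mp4" } });
  };
}

const demo = runQueued(makeFakeFetch(), "https://queue.example/app/sub", { prompt: "a cat" });
demo.then((result) => console.log(result.video.url));
```

Because the status and result URLs come from the submit response, the loop never has to know how the provider maps a model path to its queue endpoints.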

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "ai-lcr",
-  "version": "0.2.3",
+  "version": "0.2.6",
   "description": "Least Cost Routing for LLMs — route every model call to the cheapest available provider, fall back automatically, and track real cost. Built for the Vercel AI SDK.",
   "keywords": [
     "ai",