npm - ai-lcr - Versions diffs - 0.2.0 → 0.2.2 - Mend

ai-lcr 0.2.0 → 0.2.2

This diff compares publicly available package versions as released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their public registries.

Files changed (7)

package/README.md CHANGED

@@ -135,25 +135,6 @@ const lcr = createLCR({
 onCall: (record) => console.log(JSON.stringify(record)),
 ```
-Or ship each record to an HTTP collector with the built-in `createHttpSink` (fire-and-forget, never throws, dashboard-agnostic):
-```ts
-import { createLCR, createHttpSink } from "ai-lcr";
-import { after } from "next/server"; // serverless: don't block the response
-const lcr = createLCR({
-  models: { /* … */ },
-  onCall: createHttpSink({
-    url: `${process.env.LCR_INGEST_URL}/api/ingest`,
-    headers: { authorization: `Bearer ${process.env.LCR_INGEST_KEY}` },
-    project: process.env.LCR_PROJECT, // optional tag if one collector serves several apps
-    dispatch: after,                  // run after the response is sent (serverless-safe)
-  }),
-});
-```
-Point `url` at anything that accepts the `CallRecord` JSON — including the self-hostable companion dashboard, **[ai-lcr-dashboard](https://github.com/victorzhrn/ai-lcr-dashboard)** (Spend / Calls / Failover rate + a live failover feed). You run your own instance, so the data never leaves your infrastructure; a [db9](https://db9.ai) database can be provisioned in seconds if you don't want to stand one up yourself.
 ```ts
 interface CallRecord {
   id: string;                // correlation id, one per request
@@ -165,8 +146,7 @@ interface CallRecord {
   latencyMs: number;
   inputTokens: number;
   outputTokens: number;
-  costUsd: number;            // what the winner charged for these tokens
-  baselineUsd: number;        // what the priciest configured route would cost → savings = baselineUsd - costUsd
+  costUsd: number;
 }
 ```
@@ -176,7 +156,7 @@ Any OpenAI-compatible endpoint works — and so does any AI SDK provider package
 - **Model vendors' own APIs (native):** route straight to [DeepSeek](https://platform.deepseek.com), [OpenAI](https://openai.com), [Anthropic](https://anthropic.com), [Google](https://ai.google.dev), [xAI](https://x.ai), etc. via their AI SDK provider packages — no markup, full native features. See [Route to a model vendor's own API](#route-to-a-model-vendors-own-api-native-providers).
 - **Text aggregators:** [OpenRouter](https://openrouter.ai) (widest coverage, list pricing) · [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off** every model) · [TokenMart](https://thetokenmart.ai) (15–65% off, varies by model)
-- **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) — image routing available via `createMediaLCR` (Kunavo + Runware adapters); video on the roadmap
+- **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) — routing via `createMediaLCR`. Image: Kunavo + Runware + fal. Video: fal (live, via its async queue API); Kunavo's Veo poll path is implemented but unverified
 ## Text model pricing
@@ -293,7 +273,8 @@ Two OpenAI-compatible providers, same probe, same day. Cells cover both families
 - [ ] Bundled price table for zero-config pricing (drop the manual `cost` numbers)
 - [ ] Provider-quirk middleware (transparently patch known per-provider request quirks, e.g. Kunavo's ignored `max_tokens`)
 - [ ] Feed probe results into routing automatically (auto-exclude a model from a provider that fails its probe)
-- [ ] Image & video model routing (fal.ai / Runware / Kunavo)
+- [x] Image & video model routing (`createMediaLCR`) — image via Kunavo + Runware + fal; **video live via fal** (async queue API)
+- [ ] Normalized cross-provider video price comparison + verified Kunavo/Runware video adapters
 ## Affiliate disclosure
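This diff removes `createHttpSink` from both the README and the package exports, so callers who still want to POST each record to a collector would need to supply their own `onCall` handler. The following is a minimal sketch modeled on the removed implementation, not part of ai-lcr: `makeHttpSink`, `SinkOptions`, `FetchLike`, and `RecordLike` are hypothetical names, the collector URL is a placeholder, and the fetch function is passed in explicitly rather than read from the global scope.

```typescript
// Hypothetical user-land stand-in for the removed createHttpSink.
type RecordLike = { id: string; [key: string]: unknown };

type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string; keepalive: boolean },
) => Promise<unknown>;

interface SinkOptions {
  url: string;
  fetchImpl: FetchLike; // pass the global `fetch` on runtimes that have one
  headers?: Record<string, string>;
  project?: string; // optional tag merged into each payload
  dispatch?: (task: () => Promise<void>) => void; // e.g. a waitUntil-style wrapper
  onError?: (error: unknown) => void;
}

function makeHttpSink(options: SinkOptions): (record: RecordLike) => void {
  const { url, fetchImpl, headers, project, onError } = options;
  // Default: start the POST immediately and do not await it (fire-and-forget).
  const dispatch: (task: () => Promise<void>) => void =
    options.dispatch ?? ((task) => { void task(); });
  return (record) => {
    const payload = project ? { project, ...record } : record;
    dispatch(async () => {
      try {
        await fetchImpl(url, {
          method: "POST",
          headers: { "content-type": "application/json", ...headers },
          body: JSON.stringify(payload),
          keepalive: true,
        });
      } catch (err) {
        onError?.(err); // a failed POST is reported, never rethrown
      }
    });
  };
}

// Usage with a fake fetch that records what would have been sent.
const sent: { url: string; body: string }[] = [];
const fakeFetch: FetchLike = async (u, init) => {
  sent.push({ url: u, body: init.body });
};
const sink = makeHttpSink({
  url: "https://collector.example/api/ingest", // placeholder URL
  fetchImpl: fakeFetch,
  project: "demo",
});
sink({ id: "req_1", costUsd: 0.002 });
```

The handler returned by `makeHttpSink` has the `(record) => void` shape that the removed sink had, so it could be passed wherever an `onCall` callback is accepted.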

package/README.zh-CN.md CHANGED

@@ -114,7 +114,7 @@ const lcr = createLCR({
 - **Model vendors' own APIs (native):** connect directly to [DeepSeek](https://platform.deepseek.com), [OpenAI](https://openai.com), [Anthropic](https://anthropic.com), [Google](https://ai.google.dev), [xAI](https://x.ai), etc. through their AI SDK provider packages, with no markup and full native features. See the "Route to a model vendor's own API (native providers)" section above.
 - **Text aggregators:** [OpenRouter](https://openrouter.ai) (widest coverage, list pricing) · [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off every model**) · [TokenMart](https://thetokenmart.ai) (15–65% off, varies by model)
-- **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai); routing is on the roadmap
+- **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai); routing via `createMediaLCR`. Image: Kunavo + Runware + fal. Video: fal (live, via its async queue API); Kunavo's Veo polling path is implemented but unverified
 ## Text model pricing
@@ -229,7 +229,8 @@ API_KEY=$TOKENMART_API_KEY BASE=https://api.tokenmart.ai \
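 - [ ] Built-in price table for zero-config pricing (no more hand-entered `cost` numbers)
 - [ ] Provider-quirk middleware (transparently patch known quirks, e.g. Kunavo's ignored `max_tokens`)
 - [ ] Feed probe results into routing automatically (drop any provider×model pair that fails its probe from the list)
-- [ ] Image and video model routing (fal.ai / Runware / Kunavo)
+- [x] Image and video model routing (`createMediaLCR`): image via Kunavo + Runware + fal; **video live via fal** (async queue API)
+- [ ] Normalized cross-provider video price comparison + verified Kunavo/Runware video adapters
 ## Affiliate disclosure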

package/dist/index.cjs CHANGED

@@ -21,11 +21,12 @@ var __toCommonJS = (mod) => __copyProps(__defProp({}, "__esModule", { value: tru
 var index_exports = {};
 __export(index_exports, {
   DEFAULT_REFERENCE: () => DEFAULT_REFERENCE,
+  FalMediaError: () => FalMediaError,
   MEDIA_PRICING: () => MEDIA_PRICING,
   cheapestRoute: () => cheapestRoute,
   classifyError: () => classifyError,
   comparePrices: () => comparePrices,
-  createHttpSink: () => createHttpSink,
+  createFalMediaAdapter: () => createFalMediaAdapter,
   createKunavoMediaAdapter: () => createKunavoMediaAdapter,
   createLCR: () => createLCR,
   createMediaLCR: () => createMediaLCR,
@@ -141,20 +142,12 @@ var LcrFallbackModel = class {
       errorClass: classifyError(error)
     });
   }
-  /** Cost of one route for the given token counts; 0 if it has no price. */
-  routeCost(p, inputTokens, outputTokens) {
-    return p.cost ? inputTokens / 1e6 * p.cost.input + outputTokens / 1e6 * p.cost.output : 0;
-  }
   /** Winner settled: record the attempt, fire `onCost` (compat) + `onCall`. */
   finalizeOk(ctx, provider, attemptStart, usage) {
     ctx.attempts.push({ provider: provider.label, ok: true, latencyMs: Date.now() - attemptStart });
     const inputTokens = usage?.inputTokens?.total ?? 0;
     const outputTokens = usage?.outputTokens?.total ?? 0;
-    const costUsd = this.routeCost(provider, inputTokens, outputTokens);
-    const baselineUsd = this.opts.providers.reduce(
-      (max, p) => Math.max(max, this.routeCost(p, inputTokens, outputTokens)),
-      costUsd
-    );
+    const costUsd = provider.cost ? inputTokens / 1e6 * provider.cost.input + outputTokens / 1e6 * provider.cost.output : 0;
     this.opts.onCost?.({
       model: this.opts.modelName,
       provider: provider.label,
@@ -172,8 +165,7 @@ var LcrFallbackModel = class {
       latencyMs: Date.now() - ctx.startedAt,
       inputTokens,
       outputTokens,
-      costUsd,
-      baselineUsd
+      costUsd
     });
   }
   /** Every provider failed: fire `onCall` with no winner. */
@@ -188,8 +180,7 @@ var LcrFallbackModel = class {
       latencyMs: Date.now() - ctx.startedAt,
       inputTokens: 0,
       outputTokens: 0,
-      costUsd: 0,
-      baselineUsd: 0
+      costUsd: 0
     });
   }
   async doGenerate(options) {
@@ -345,40 +336,6 @@ function formatCallRecord(record, opts = {}) {
   return line;
 }
-// src/sink.ts
-function createHttpSink(options) {
-  const {
-    url,
-    headers,
-    project,
-    dispatch = (task) => {
-      void task();
-    },
-    fetchImpl,
-    onError
-  } = options;
-  const doFetch = fetchImpl ?? globalThis.fetch;
-  return (record) => {
-    if (!doFetch) {
-      onError?.(new Error("ai-lcr: no fetch available for createHttpSink"));
-      return;
-    }
-    const payload = project ? { project, ...record } : record;
-    dispatch(async () => {
-      try {
-        await doFetch(url, {
-          method: "POST",
-          headers: { "content-type": "application/json", ...headers },
-          body: JSON.stringify(payload),
-          keepalive: true
-        });
-      } catch (err) {
-        onError?.(err);
-      }
-    });
-  };
-}
 // src/media.ts
 var DEFAULT_REFERENCE = {
   image: { width: 1920, height: 1080 },
@@ -425,30 +382,75 @@ function comparePrices(registry, ref = DEFAULT_REFERENCE) {
     };
   });
 }
+function newMediaCallId() {
+  const c = globalThis.crypto;
+  return c?.randomUUID ? c.randomUUID() : `lcr_${Date.now().toString(36)}`;
+}
 function createMediaLCR(config) {
-  const { registry, adapters, reference = DEFAULT_REFERENCE, onError, onCost } = config;
+  const { registry, adapters, reference = DEFAULT_REFERENCE, onError, onCost, onCall } = config;
   return async function generate(modelId, input) {
     const def = registry[modelId];
     if (!def) {
       throw new Error(`ai-lcr: unknown media model "${modelId}" \u2014 add it to the registry`);
     }
     const ranked = rankRoutes(def, reference);
+    const baselineUsd = ranked.length > 0 ? Math.max(...ranked.map((r) => r.refCents)) / 100 : 0;
+    const startedAt = Date.now();
+    const attempts = [];
     let lastErr;
+    const emitFail = () => onCall?.({
+      id: newMediaCallId(),
+      model: modelId,
+      attempts,
+      winner: void 0,
+      ok: false,
+      failedOver: attempts.length > 1,
+      latencyMs: Date.now() - startedAt,
+      inputTokens: 0,
+      outputTokens: 0,
+      costUsd: 0,
+      baselineUsd
+    });
     for (const route of ranked) {
       const adapter = adapters[route.provider];
       if (!adapter) continue;
+      const attemptStart = Date.now();
       try {
         const result = await adapter.run({ externalId: route.externalId, input });
         const estimated = result.costCents === void 0;
         const costCents = estimated ? route.refCents * (result.units ?? 1) : result.costCents;
+        attempts.push({ provider: route.provider, ok: true, latencyMs: Date.now() - attemptStart });
         onCost?.({ modelId, provider: route.provider, costCents, estimated });
+        onCall?.({
+          id: newMediaCallId(),
+          model: modelId,
+          attempts,
+          winner: route.provider,
+          ok: true,
+          failedOver: attempts.length > 1,
+          latencyMs: Date.now() - startedAt,
+          inputTokens: 0,
+          outputTokens: 0,
+          costUsd: costCents / 100,
+          baselineUsd
+        });
         return { outputs: result.outputs, provider: route.provider, costCents, estimated };
       } catch (err) {
         lastErr = err;
+        attempts.push({
+          provider: route.provider,
+          ok: false,
+          latencyMs: Date.now() - attemptStart,
+          errorClass: classifyError(err)
+        });
         onError?.(err, route.provider);
-        if (!isRetryableError(err)) throw err;
+        if (!isRetryableError(err)) {
+          emitFail();
+          throw err;
+        }
       }
     }
+    emitFail();
     throw lastErr instanceof Error ? lastErr : new Error(`ai-lcr: no provider could serve media model "${modelId}"`);
   };
 }
@@ -731,6 +733,108 @@ var RunwareMediaError = class extends Error {
   status;
 };
+// src/adapters/fal-media.ts
+var DEFAULT_BASE3 = "https://queue.fal.run";
+function extractOutputs(raw) {
+  if (!raw || typeof raw !== "object") return [];
+  const data = raw;
+  const out = [];
+  const pushUrl = (url, type) => {
+    if (typeof url === "string" && url.length > 0) out.push({ url, type });
+  };
+  if (Array.isArray(data.images)) {
+    for (const img of data.images) pushUrl(img?.url, "image");
+  }
+  pushUrl(data.image?.url, "image");
+  if (Array.isArray(data.videos)) {
+    for (const v of data.videos) pushUrl(v?.url, "video");
+  }
+  pushUrl(data.video?.url, "video");
+  return out;
+}
+function createFalMediaAdapter(config) {
+  const {
+    apiKey,
+    baseUrl = DEFAULT_BASE3,
+    pollIntervalMs = 3e3,
+    pollTimeoutMs = 3e5,
+    fetchImpl = fetch
+  } = config;
+  const headers = {
+    "content-type": "application/json",
+    authorization: `Key ${apiKey}`
+  };
+  return {
+    provider: "fal",
+    async run(req) {
+      const submitRes = await fetchImpl(`${baseUrl}/${req.externalId}`, {
+        method: "POST",
+        headers,
+        body: JSON.stringify(req.input)
+      });
+      if (!submitRes.ok) {
+        throw new FalMediaError(submitRes.status, await safeText2(submitRes));
+      }
+      const submit = await submitRes.json();
+      const statusUrl = submit.status_url;
+      const responseUrl = submit.response_url;
+      if (!statusUrl || !responseUrl) {
+        throw new Error(
+          `ai-lcr: fal submit for "${req.externalId}" returned no status/response URL (keys: ${Object.keys(
+            submit
+          ).join(", ")})`
+        );
+      }
+      const deadline = Date.now() + pollTimeoutMs;
+      let completed = false;
+      while (Date.now() < deadline) {
+        const statusRes = await fetchImpl(statusUrl, { headers });
+        if (!statusRes.ok) {
+          throw new FalMediaError(statusRes.status, await safeText2(statusRes));
+        }
+        const status = String((await statusRes.json()).status ?? "");
+        if (status === "COMPLETED") {
+          completed = true;
+          break;
+        }
+        await sleep2(pollIntervalMs);
+      }
+      if (!completed) {
+        throw new Error(
+          `ai-lcr: fal job for "${req.externalId}" timed out after ${pollTimeoutMs}ms`
+        );
+      }
+      const resultRes = await fetchImpl(responseUrl, { headers });
+      if (!resultRes.ok) {
+        throw new FalMediaError(resultRes.status, await safeText2(resultRes));
+      }
+      const outputs = extractOutputs(await resultRes.json());
+      if (outputs.length === 0) {
+        throw new Error(`ai-lcr: fal returned no media URL for "${req.externalId}"`);
+      }
+      return { outputs, units: outputs.length };
+    }
+  };
+}
+var FalMediaError = class extends Error {
+  constructor(status, body) {
+    super(`fal media HTTP ${status}: ${body.slice(0, 300)}`);
+    this.status = status;
+    this.name = "FalMediaError";
+  }
+  status;
+};
+function sleep2(ms) {
+  return new Promise((r) => setTimeout(r, ms));
+}
+async function safeText2(res) {
+  try {
+    return await res.text();
+  } catch {
+    return "<no body>";
+  }
+}
 // src/index.ts
 function isLanguageModel(entry) {
   return typeof entry.doGenerate === "function";
@@ -774,11 +878,12 @@ function createLCR(config) {
 // Annotate the CommonJS export names for ESM import in node:
 0 && (module.exports = {
   DEFAULT_REFERENCE,
+  FalMediaError,
   MEDIA_PRICING,
   cheapestRoute,
   classifyError,
   comparePrices,
-  createHttpSink,
+  createFalMediaAdapter,
   createKunavoMediaAdapter,
   createLCR,
   createMediaLCR,
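The `createMediaLCR` hunk in this file tries ranked routes cheapest-first, records every attempt, and prices a baseline from the most expensive ranked route. Below is a simplified, synchronous sketch of that bookkeeping, not the library's code: `Route`, `Runner`, `Settled`, and `routeOnce` are hypothetical names, the real adapters are async, and the real router rethrows immediately on non-retryable errors, which this sketch leaves out.

```typescript
interface Route { provider: string; refCents: number } // assumed already sorted cheapest-first
interface Attempt { provider: string; ok: boolean }
type Runner = () => { units?: number; costCents?: number }; // throws on failure

interface Settled {
  winner?: string;
  ok: boolean;
  failedOver: boolean;
  attempts: Attempt[];
  costUsd: number;
  baselineUsd: number;
}

function routeOnce(ranked: Route[], runners: Record<string, Runner>): Settled {
  // Baseline: the priciest ranked route's normalized reference price, in USD.
  const baselineUsd =
    ranked.length > 0 ? Math.max(...ranked.map((r) => r.refCents)) / 100 : 0;
  const attempts: Attempt[] = [];
  for (const route of ranked) {
    const run = runners[route.provider];
    if (!run) continue; // no adapter configured for this provider
    try {
      const result = run();
      // If the provider reports no price, estimate from the reference price times units.
      const costCents = result.costCents ?? route.refCents * (result.units ?? 1);
      attempts.push({ provider: route.provider, ok: true });
      return {
        winner: route.provider,
        ok: true,
        failedOver: attempts.length > 1,
        attempts,
        costUsd: costCents / 100,
        baselineUsd,
      };
    } catch {
      attempts.push({ provider: route.provider, ok: false });
    }
  }
  // Every route failed (or none had an adapter): no winner, zero cost.
  return { ok: false, failedOver: attempts.length > 1, attempts, costUsd: 0, baselineUsd };
}

// The cheaper route fails, so the request fails over to the pricier one.
const settled = routeOnce(
  [{ provider: "cheap", refCents: 2 }, { provider: "pricey", refCents: 5 }],
  {
    cheap: () => { throw new Error("HTTP 503"); },
    pricey: () => ({ units: 1 }),
  },
);
```

In this run the winner is also the priciest route, so `baselineUsd - costUsd` comes out to zero. The saving is only positive when a cheaper route actually serves the request.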

package/dist/index.d.cts CHANGED

@@ -65,13 +65,12 @@ interface CallRecord {
     /** Computed from the winner's `cost`; 0 if no price was given or the call failed. */
     costUsd: number;
     /**
-     * What these same tokens would have cost at the **most expensive** configured
-     * provider for this model — the "if you never routed cheap" baseline. Savings
-     * = `baselineUsd - costUsd`. Equals `costUsd` (savings 0) when prices are
-     * missing or the priciest route is the one that served. Self-contained: no
-     * external price table needed.
+     * What the priciest configured route would have cost for this request, so
+     * `baselineUsd - costUsd` is the saving from routing cheapest-first. Set by
+     * the media router (`createMediaLCR`), where every route has a known price;
+     * omitted by the text router, which can't price a baseline per call.
      */
-    baselineUsd: number;
+    baselineUsd?: number;
 }
 /**
  * Normalize an error into a short, log-friendly class for {@link CallRecord}.
@@ -102,52 +101,20 @@ interface FormatOptions {
 declare function formatCallRecord(record: CallRecord, opts?: FormatOptions): string;
 /**
- * Optional HTTP sink for `onCall` — ship each {@link CallRecord} as JSON to a
- * collector (e.g. a self-hosted ai-lcr-dashboard `/api/ingest`, or any endpoint
- * that accepts the CallRecord shape).
+ * ai-lcr media routing — Least Cost Routing for image & video models.
  *
- * Fully optional and dashboard-agnostic: omit it and ai-lcr stores nothing;
- * point `url` at whatever you run. Logging must never break your app, so a
- * failed POST is swallowed by default (surface it via `onError` if you want).
+ * The text router (./index, ./fallback) is built on the AI SDK's
+ * `LanguageModelV3` and only handles token-billed chat/completion. Image and
+ * video providers are a different world: outputs are files (URLs), pricing
+ * comes in incompatible units (per-image, per-second, per-call, per-megapixel),
+ * and video is a long-running async job. This module is the parallel, self-
+ * contained media side — no `LanguageModelV3` dependency.
  *
- *   import { createLCR, createHttpSink } from "ai-lcr";
- *   import { after } from "next/server"; // serverless: don't block the response
- *
- *   const lcr = createLCR({
- *     models: { ... },
- *     onCall: createHttpSink({
- *       url: process.env.LCR_INGEST_URL + "/api/ingest",
- *       headers: { authorization: `Bearer ${process.env.LCR_INGEST_KEY}` },
- *       project: process.env.LCR_PROJECT,
- *       dispatch: after, // run after the response is sent
- *     }),
- *   });
- */
-interface HttpSinkOptions {
-    /** Where to POST each CallRecord (a collector that accepts the JSON shape). */
-    url: string;
-    /** Extra headers, e.g. `{ authorization: ` + "`Bearer ${key}`" + ` }`. */
-    headers?: Record<string, string>;
-    /** Optional tenant/project tag merged into each payload (`{ project, ...record }`). */
-    project?: string;
-    /**
-     * Wrap the dispatch so it survives a serverless function returning. On
-     * Next.js pass `after` from "next/server"; elsewhere pass a `waitUntil`-style
-     * function. Defaults to running immediately — correct for long-lived servers,
-     * but on serverless an un-awaited POST may be cut off, so pass `after`.
-     */
-    dispatch?: (task: () => void | Promise<void>) => void;
-    /** Custom fetch (tests / runtimes without a global `fetch`). */
-    fetchImpl?: typeof fetch;
-    /** Called if the POST fails. Failures are swallowed by default. */
-    onError?: (error: unknown) => void;
-}
-/**
- * Build an `onCall` handler that POSTs each {@link CallRecord} to `url`.
- * Returns a plain `(record) => void` — pass it straight to `createLCR`'s `onCall`.
+ * The core idea is the SAME as the text LCR: keep a list of providers per
+ * model, route to the cheapest healthy one, fall back on failure, report real
+ * cost. The only new problem is making prices comparable, which we solve by
+ * normalizing every provider's price to ONE reference output (see ReferenceSpec).
  */
-declare function createHttpSink(options: HttpSinkOptions): (record: CallRecord) => void;
 type MediaModality = "image" | "video";
 /**
@@ -268,6 +235,13 @@ interface MediaLCRConfig {
     reference?: ReferenceSpec;
     onError?: (error: Error, provider: string) => void;
     onCost?: (event: MediaCostEvent) => void;
+    /**
+     * One correlated {@link CallRecord} per settled request — the full failover
+     * chain, winner, latency, and cost — mirroring the text side's `onCall`, so
+     * the same dashboard sink works for image/video. Fire-and-forget; never
+     * throws. Media records carry no token counts (inputTokens/outputTokens = 0).
+     */
+    onCall?: (record: CallRecord) => void;
 }
 interface MediaRunResult {
     outputs: MediaOutput[];
@@ -275,11 +249,6 @@ interface MediaRunResult {
     costCents: number;
     estimated: boolean;
 }
-/**
- * Build a media Least Cost Router. Returns `generate(modelId, input)` which
- * tries providers cheapest-first and falls through on a retryable error —
- * exactly the text LCR's contract, for image/video.
- */
 declare function createMediaLCR(config: MediaLCRConfig): (modelId: string, input: Record<string, unknown>) => Promise<MediaRunResult>;
 /**
@@ -367,6 +336,53 @@ interface RunwareMediaConfig {
 }
 declare function createRunwareMediaAdapter(config: RunwareMediaConfig): MediaAdapter;
+/**
+ * fal media adapter — image (queue) + video (queue, async poll).
+ *
+ * fal serves every model through one async queue API, so a single submit→poll→
+ * fetch-result path covers both image and video. That is the whole reason this
+ * adapter exists: it is ai-lcr's first VIDEO-capable execution path. (The
+ * Runware adapter is image-only; the Kunavo one's video poll loop is unverified.)
+ *
+ * Implementation note: ai-art's fal adapter uses the `@fal-ai/client` SDK, but
+ * ai-lcr deliberately keeps zero provider SDKs — every adapter is raw `fetch`
+ * with an injectable `fetchImpl` for testing (see runware-media, kunavo-media).
+ * So this re-implements the three queue calls against fal's REST endpoints:
+ *
+ *   1. submit  POST https://queue.fal.run/{model}        → { request_id, status_url, response_url }
+ *   2. status  GET  {status_url}                         → { status: IN_QUEUE | IN_PROGRESS | COMPLETED }
+ *   3. result  GET  {response_url}                        → { images:[…] } | { video:{url} } | …
+ *
+ * We follow the `status_url` / `response_url` returned by submit rather than
+ * rebuilding them, which sidesteps fal's sub-path quirk (a model like
+ * `fal-ai/flux/schnell` submits to the full path but its status/result live
+ * under the `fal-ai/flux` base).
+ *
+ * Auth: fal uses `Authorization: Key {FAL_KEY}` (NOT Bearer).
+ *
+ * Cost: fal's queue result does not carry a per-call price, so cost is left to
+ * the router's normalized estimate (costCents stays undefined; `units` is the
+ * output count — one image, or one clip).
+ */
+interface FalMediaConfig {
+    apiKey: string;
+    /** Override for testing. Defaults to https://queue.fal.run. */
+    baseUrl?: string;
+    /** Video/job poll cadence (ms). Default 3000. */
+    pollIntervalMs?: number;
+    /** Max time to wait for a job before giving up (ms). Default 300000 (5m). */
+    pollTimeoutMs?: number;
+    /** Injected for testing; defaults to global fetch. */
+    fetchImpl?: typeof fetch;
+}
+declare function createFalMediaAdapter(config: FalMediaConfig): MediaAdapter;
+/** Carries the HTTP status so the router's `isRetryableError` can classify it. */
+declare class FalMediaError extends Error {
+    status: number;
+    constructor(status: number, body: string);
+}
 /**
  * ai-lcr — Least Cost Routing for LLMs.
  *
@@ -420,4 +436,4 @@ type LCRRouter = (modelName: string) => LanguageModelV3;
  */
 declare function createLCR(config: LCRConfig): LCRRouter;
-export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, type FormatOptions, type HttpSinkOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, comparePrices, createHttpSink, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };
+export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, FalMediaError, type FormatOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, comparePrices, createFalMediaAdapter, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };
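With `baselineUsd` now optional on `CallRecord`, code that reports savings has to handle records that omit it, which per the updated doc comment is every record from the text router. A small sketch of that handling, assuming prices are quoted in USD per million tokens as the text router's cost arithmetic implies. `Price`, `PricedRecord`, `tokenCostUsd`, and `savingsUsd` are illustrative names, not ai-lcr exports.

```typescript
interface Price { input: number; output: number } // USD per 1M tokens
interface PricedRecord { costUsd: number; baselineUsd?: number }

// Same arithmetic as the text router: tokens / 1e6 * price, or 0 when no price is set.
function tokenCostUsd(
  price: Price | undefined,
  inputTokens: number,
  outputTokens: number,
): number {
  return price
    ? (inputTokens / 1e6) * price.input + (outputTokens / 1e6) * price.output
    : 0;
}

// Savings are only defined when the record carries a baseline.
function savingsUsd(record: PricedRecord): number | undefined {
  return record.baselineUsd === undefined
    ? undefined
    : record.baselineUsd - record.costUsd;
}

// Text-style record: priced from tokens, no baseline, so no savings figure.
const textCost = tokenCostUsd({ input: 1, output: 2 }, 1_000_000, 500_000);
const textSavings = savingsUsd({ costUsd: textCost });

// Media-style record: baseline present, so savings can be computed.
const mediaSavings = savingsUsd({ costUsd: 0.25, baselineUsd: 0.75 });
```

Returning `undefined` rather than `0` keeps "no baseline available" distinct from "routed to the priciest provider", which a dashboard may want to display differently.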

package/dist/index.d.ts CHANGED

@@ -65,13 +65,12 @@ interface CallRecord {
     /** Computed from the winner's `cost`; 0 if no price was given or the call failed. */
     costUsd: number;
     /**
-     * What these same tokens would have cost at the **most expensive** configured
-     * provider for this model — the "if you never routed cheap" baseline. Savings
-     * = `baselineUsd - costUsd`. Equals `costUsd` (savings 0) when prices are
-     * missing or the priciest route is the one that served. Self-contained: no
-     * external price table needed.
+     * What the priciest configured route would have cost for this request, so
+     * `baselineUsd - costUsd` is the saving from routing cheapest-first. Set by
+     * the media router (`createMediaLCR`), where every route has a known price;
+     * omitted by the text router, which can't price a baseline per call.
      */
-    baselineUsd: number;
+    baselineUsd?: number;
 }
 /**
  * Normalize an error into a short, log-friendly class for {@link CallRecord}.
@@ -102,52 +101,20 @@ interface FormatOptions {
 declare function formatCallRecord(record: CallRecord, opts?: FormatOptions): string;
 /**
- * Optional HTTP sink for `onCall` — ship each {@link CallRecord} as JSON to a
- * collector (e.g. a self-hosted ai-lcr-dashboard `/api/ingest`, or any endpoint
- * that accepts the CallRecord shape).
+ * ai-lcr media routing — Least Cost Routing for image & video models.
  *
- * Fully optional and dashboard-agnostic: omit it and ai-lcr stores nothing;
- * point `url` at whatever you run. Logging must never break your app, so a
- * failed POST is swallowed by default (surface it via `onError` if you want).
+ * The text router (./index, ./fallback) is built on the AI SDK's
+ * `LanguageModelV3` and only handles token-billed chat/completion. Image and
+ * video providers are a different world: outputs are files (URLs), pricing
+ * comes in incompatible units (per-image, per-second, per-call, per-megapixel),
+ * and video is a long-running async job. This module is the parallel, self-
+ * contained media side — no `LanguageModelV3` dependency.
  *
- *   import { createLCR, createHttpSink } from "ai-lcr";
- *   import { after } from "next/server"; // serverless: don't block the response
- *
- *   const lcr = createLCR({
- *     models: { ... },
- *     onCall: createHttpSink({
- *       url: process.env.LCR_INGEST_URL + "/api/ingest",
- *       headers: { authorization: `Bearer ${process.env.LCR_INGEST_KEY}` },
- *       project: process.env.LCR_PROJECT,
- *       dispatch: after, // run after the response is sent
- *     }),
- *   });
- */
-interface HttpSinkOptions {
-    /** Where to POST each CallRecord (a collector that accepts the JSON shape). */
-    url: string;
-    /** Extra headers, e.g. `{ authorization: ` + "`Bearer ${key}`" + ` }`. */
-    headers?: Record<string, string>;
-    /** Optional tenant/project tag merged into each payload (`{ project, ...record }`). */
-    project?: string;
-    /**
-     * Wrap the dispatch so it survives a serverless function returning. On
-     * Next.js pass `after` from "next/server"; elsewhere pass a `waitUntil`-style
-     * function. Defaults to running immediately — correct for long-lived servers,
-     * but on serverless an un-awaited POST may be cut off, so pass `after`.
-     */
-    dispatch?: (task: () => void | Promise<void>) => void;
-    /** Custom fetch (tests / runtimes without a global `fetch`). */
-    fetchImpl?: typeof fetch;
-    /** Called if the POST fails. Failures are swallowed by default. */
-    onError?: (error: unknown) => void;
-}
-/**
- * Build an `onCall` handler that POSTs each {@link CallRecord} to `url`.
- * Returns a plain `(record) => void` — pass it straight to `createLCR`'s `onCall`.
+ * The core idea is the SAME as the text LCR: keep a list of providers per
+ * model, route to the cheapest healthy one, fall back on failure, report real
+ * cost. The only new problem is making prices comparable, which we solve by
+ * normalizing every provider's price to ONE reference output (see ReferenceSpec).
  */
-declare function createHttpSink(options: HttpSinkOptions): (record: CallRecord) => void;
 type MediaModality = "image" | "video";
 /**
@@ -268,6 +235,13 @@ interface MediaLCRConfig {
     reference?: ReferenceSpec;
     onError?: (error: Error, provider: string) => void;
     onCost?: (event: MediaCostEvent) => void;
+    /**
+     * One correlated {@link CallRecord} per settled request — the full failover
+     * chain, winner, latency, and cost — mirroring the text side's `onCall`, so
+     * the same dashboard sink works for image/video. Fire-and-forget; never
+     * throws. Media records carry no token counts (inputTokens/outputTokens = 0).
+     */
+    onCall?: (record: CallRecord) => void;
 }
 interface MediaRunResult {
     outputs: MediaOutput[];
@@ -275,11 +249,6 @@ interface MediaRunResult {
     costCents: number;
     estimated: boolean;
 }
-/**
- * Build a media Least Cost Router. Returns `generate(modelId, input)` which
- * tries providers cheapest-first and falls through on a retryable error —
- * exactly the text LCR's contract, for image/video.
- */
 declare function createMediaLCR(config: MediaLCRConfig): (modelId: string, input: Record<string, unknown>) => Promise<MediaRunResult>;
 /**
@@ -367,6 +336,53 @@ interface RunwareMediaConfig {
 }
 declare function createRunwareMediaAdapter(config: RunwareMediaConfig): MediaAdapter;
+/**
+ * fal media adapter — image (queue) + video (queue, async poll).
+ *
+ * fal serves every model through one async queue API, so a single submit→poll→
+ * fetch-result path covers both image and video. That is the whole reason this
+ * adapter exists: it is ai-lcr's first VIDEO-capable execution path. (The
+ * Runware adapter is image-only; the Kunavo one's video poll loop is unverified.)
+ *
+ * Implementation note: ai-art's fal adapter uses the `@fal-ai/client` SDK, but
+ * ai-lcr deliberately keeps zero provider SDKs — every adapter is raw `fetch`
+ * with an injectable `fetchImpl` for testing (see runware-media, kunavo-media).
+ * So this re-implements the three queue calls against fal's REST endpoints:
+ *
+ *   1. submit  POST https://queue.fal.run/{model}        → { request_id, status_url, response_url }
+ *   2. status  GET  {status_url}                         → { status: IN_QUEUE | IN_PROGRESS | COMPLETED }
+ *   3. result  GET  {response_url}                        → { images:[…] } | { video:{url} } | …
+ *
+ * We follow the `status_url` / `response_url` returned by submit rather than
+ * rebuilding them, which sidesteps fal's sub-path quirk (a model like
+ * `fal-ai/flux/schnell` submits to the full path but its status/result live
+ * under the `fal-ai/flux` base).
+ *
+ * Auth: fal uses `Authorization: Key {FAL_KEY}` (NOT Bearer).
+ *
+ * Cost: fal's queue result does not carry a per-call price, so cost is left to
+ * the router's normalized estimate (costCents stays undefined; `units` is the
+ * output count — one image, or one clip).
+ */
+interface FalMediaConfig {
+    apiKey: string;
+    /** Override for testing. Defaults to https://queue.fal.run. */
+    baseUrl?: string;
+    /** Video/job poll cadence (ms). Default 3000. */
+    pollIntervalMs?: number;
+    /** Max time to wait for a job before giving up (ms). Default 300000 (5m). */
+    pollTimeoutMs?: number;
+    /** Injected for testing; defaults to global fetch. */
+    fetchImpl?: typeof fetch;
+}
+declare function createFalMediaAdapter(config: FalMediaConfig): MediaAdapter;
+/** Carries the HTTP status so the router's `isRetryableError` can classify it. */
+declare class FalMediaError extends Error {
+    status: number;
+    constructor(status: number, body: string);
+}
 /**
  * ai-lcr — Least Cost Routing for LLMs.
  *
@@ -420,4 +436,4 @@ type LCRRouter = (modelName: string) => LanguageModelV3;
  */
 declare function createLCR(config: LCRConfig): LCRRouter;
-export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, type FormatOptions, type HttpSinkOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, comparePrices, createHttpSink, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };
+export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, FalMediaError, type FormatOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, comparePrices, createFalMediaAdapter, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };
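
The fal adapter comment above describes a three-call queue flow (submit, poll status, fetch result) that follows the URLs returned by submit. A minimal, self-contained sketch of that flow against a stubbed fetch might look like the following. It is not the package's implementation: `MiniResponse`, `FetchLike`, `runFalJob`, `makeFakeFetch`, and the two stub URLs are hypothetical names and values introduced only for illustration.

```typescript
// Hypothetical minimal response/fetch shapes so the sketch needs no DOM or SDK types.
type MiniResponse = { ok: boolean; status: number; json(): Promise<any> };
type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string },
) => Promise<MiniResponse>;

// Stub URLs (assumed values): what a submit response might point at.
const STATUS_URL = "https://queue.fal.run/fal-ai/flux/requests/abc/status";
const RESULT_URL = "https://queue.fal.run/fal-ai/flux/requests/abc";

// Submit -> poll -> result, following status_url / response_url from the submit body.
async function runFalJob(
  fetchImpl: FetchLike,
  apiKey: string,
  model: string,
  input: Record<string, unknown>,
  pollIntervalMs = 5,
  pollTimeoutMs = 1000,
): Promise<any> {
  // fal expects "Key <token>", not "Bearer <token>".
  const headers = { "content-type": "application/json", authorization: `Key ${apiKey}` };

  const submitRes = await fetchImpl(`https://queue.fal.run/${model}`, {
    method: "POST",
    headers,
    body: JSON.stringify(input),
  });
  if (!submitRes.ok) throw new Error(`submit failed: HTTP ${submitRes.status}`);
  const submit = await submitRes.json();

  const deadline = Date.now() + pollTimeoutMs;
  while (Date.now() < deadline) {
    const statusRes = await fetchImpl(submit.status_url, { headers });
    if (!statusRes.ok) throw new Error(`status failed: HTTP ${statusRes.status}`);
    const { status } = await statusRes.json();
    if (status === "COMPLETED") {
      const resultRes = await fetchImpl(submit.response_url, { headers });
      if (!resultRes.ok) throw new Error(`result failed: HTTP ${resultRes.status}`);
      return resultRes.json();
    }
    await new Promise((resolve) => setTimeout(resolve, pollIntervalMs));
  }
  throw new Error(`job for "${model}" timed out after ${pollTimeoutMs}ms`);
}

// Fake fetch: reports IN_QUEUE, then IN_PROGRESS, then COMPLETED, and logs each call.
function makeFakeFetch(calls: string[]): FetchLike {
  const statuses = ["IN_QUEUE", "IN_PROGRESS", "COMPLETED"];
  const reply = (body: unknown): MiniResponse => ({ ok: true, status: 200, json: async () => body });
  return async (url, init) => {
    calls.push(`${init?.method ?? "GET"} ${url}`);
    if (init?.method === "POST") {
      return reply({ request_id: "abc", status_url: STATUS_URL, response_url: RESULT_URL });
    }
    if (url === STATUS_URL) return reply({ status: statuses.shift() ?? "COMPLETED" });
    return reply({ images: [{ url: "https://example.com/out.png" }] });
  };
}
```

Passing `makeFakeFetch(calls)` as `fetchImpl` exercises the whole path without network access, which is the same reason the real `FalMediaConfig` exposes an injectable `fetchImpl`. Because the sketch never rebuilds the status or result URL from the model id, a sub-path model such as `fal-ai/flux/schnell` needs no special handling.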

package/dist/index.js CHANGED

@@ -102,20 +102,12 @@ var LcrFallbackModel = class {
       errorClass: classifyError(error)
     });
   }
-  /** Cost of one route for the given token counts; 0 if it has no price. */
-  routeCost(p, inputTokens, outputTokens) {
-    return p.cost ? inputTokens / 1e6 * p.cost.input + outputTokens / 1e6 * p.cost.output : 0;
-  }
   /** Winner settled: record the attempt, fire `onCost` (compat) + `onCall`. */
   finalizeOk(ctx, provider, attemptStart, usage) {
     ctx.attempts.push({ provider: provider.label, ok: true, latencyMs: Date.now() - attemptStart });
     const inputTokens = usage?.inputTokens?.total ?? 0;
     const outputTokens = usage?.outputTokens?.total ?? 0;
-    const costUsd = this.routeCost(provider, inputTokens, outputTokens);
-    const baselineUsd = this.opts.providers.reduce(
-      (max, p) => Math.max(max, this.routeCost(p, inputTokens, outputTokens)),
-      costUsd
-    );
+    const costUsd = provider.cost ? inputTokens / 1e6 * provider.cost.input + outputTokens / 1e6 * provider.cost.output : 0;
     this.opts.onCost?.({
       model: this.opts.modelName,
       provider: provider.label,
@@ -133,8 +125,7 @@ var LcrFallbackModel = class {
       latencyMs: Date.now() - ctx.startedAt,
       inputTokens,
       outputTokens,
-      costUsd,
-      baselineUsd
+      costUsd
     });
   }
   /** Every provider failed: fire `onCall` with no winner. */
@@ -149,8 +140,7 @@ var LcrFallbackModel = class {
       latencyMs: Date.now() - ctx.startedAt,
       inputTokens: 0,
       outputTokens: 0,
-      costUsd: 0,
-      baselineUsd: 0
+      costUsd: 0
     });
   }
   async doGenerate(options) {
@@ -306,40 +296,6 @@ function formatCallRecord(record, opts = {}) {
   return line;
 }
-// src/sink.ts
-function createHttpSink(options) {
-  const {
-    url,
-    headers,
-    project,
-    dispatch = (task) => {
-      void task();
-    },
-    fetchImpl,
-    onError
-  } = options;
-  const doFetch = fetchImpl ?? globalThis.fetch;
-  return (record) => {
-    if (!doFetch) {
-      onError?.(new Error("ai-lcr: no fetch available for createHttpSink"));
-      return;
-    }
-    const payload = project ? { project, ...record } : record;
-    dispatch(async () => {
-      try {
-        await doFetch(url, {
-          method: "POST",
-          headers: { "content-type": "application/json", ...headers },
-          body: JSON.stringify(payload),
-          keepalive: true
-        });
-      } catch (err) {
-        onError?.(err);
-      }
-    });
-  };
-}
 // src/media.ts
 var DEFAULT_REFERENCE = {
   image: { width: 1920, height: 1080 },
@@ -386,30 +342,75 @@ function comparePrices(registry, ref = DEFAULT_REFERENCE) {
     };
   });
 }
+function newMediaCallId() {
+  const c = globalThis.crypto;
+  return c?.randomUUID ? c.randomUUID() : `lcr_${Date.now().toString(36)}`;
+}
 function createMediaLCR(config) {
-  const { registry, adapters, reference = DEFAULT_REFERENCE, onError, onCost } = config;
+  const { registry, adapters, reference = DEFAULT_REFERENCE, onError, onCost, onCall } = config;
   return async function generate(modelId, input) {
     const def = registry[modelId];
     if (!def) {
       throw new Error(`ai-lcr: unknown media model "${modelId}" \u2014 add it to the registry`);
     }
     const ranked = rankRoutes(def, reference);
+    const baselineUsd = ranked.length > 0 ? Math.max(...ranked.map((r) => r.refCents)) / 100 : 0;
+    const startedAt = Date.now();
+    const attempts = [];
     let lastErr;
+    const emitFail = () => onCall?.({
+      id: newMediaCallId(),
+      model: modelId,
+      attempts,
+      winner: void 0,
+      ok: false,
+      failedOver: attempts.length > 1,
+      latencyMs: Date.now() - startedAt,
+      inputTokens: 0,
+      outputTokens: 0,
+      costUsd: 0,
+      baselineUsd
+    });
     for (const route of ranked) {
       const adapter = adapters[route.provider];
       if (!adapter) continue;
+      const attemptStart = Date.now();
       try {
         const result = await adapter.run({ externalId: route.externalId, input });
         const estimated = result.costCents === void 0;
         const costCents = estimated ? route.refCents * (result.units ?? 1) : result.costCents;
+        attempts.push({ provider: route.provider, ok: true, latencyMs: Date.now() - attemptStart });
         onCost?.({ modelId, provider: route.provider, costCents, estimated });
+        onCall?.({
+          id: newMediaCallId(),
+          model: modelId,
+          attempts,
+          winner: route.provider,
+          ok: true,
+          failedOver: attempts.length > 1,
+          latencyMs: Date.now() - startedAt,
+          inputTokens: 0,
+          outputTokens: 0,
+          costUsd: costCents / 100,
+          baselineUsd
+        });
         return { outputs: result.outputs, provider: route.provider, costCents, estimated };
       } catch (err) {
         lastErr = err;
+        attempts.push({
+          provider: route.provider,
+          ok: false,
+          latencyMs: Date.now() - attemptStart,
+          errorClass: classifyError(err)
+        });
         onError?.(err, route.provider);
-        if (!isRetryableError(err)) throw err;
+        if (!isRetryableError(err)) {
+          emitFail();
+          throw err;
+        }
       }
     }
+    emitFail();
     throw lastErr instanceof Error ? lastErr : new Error(`ai-lcr: no provider could serve media model "${modelId}"`);
   };
 }
@@ -692,6 +693,108 @@ var RunwareMediaError = class extends Error {
   status;
 };
+// src/adapters/fal-media.ts
+var DEFAULT_BASE3 = "https://queue.fal.run";
+function extractOutputs(raw) {
+  if (!raw || typeof raw !== "object") return [];
+  const data = raw;
+  const out = [];
+  const pushUrl = (url, type) => {
+    if (typeof url === "string" && url.length > 0) out.push({ url, type });
+  };
+  if (Array.isArray(data.images)) {
+    for (const img of data.images) pushUrl(img?.url, "image");
+  }
+  pushUrl(data.image?.url, "image");
+  if (Array.isArray(data.videos)) {
+    for (const v of data.videos) pushUrl(v?.url, "video");
+  }
+  pushUrl(data.video?.url, "video");
+  return out;
+}
+function createFalMediaAdapter(config) {
+  const {
+    apiKey,
+    baseUrl = DEFAULT_BASE3,
+    pollIntervalMs = 3e3,
+    pollTimeoutMs = 3e5,
+    fetchImpl = fetch
+  } = config;
+  const headers = {
+    "content-type": "application/json",
+    authorization: `Key ${apiKey}`
+  };
+  return {
+    provider: "fal",
+    async run(req) {
+      const submitRes = await fetchImpl(`${baseUrl}/${req.externalId}`, {
+        method: "POST",
+        headers,
+        body: JSON.stringify(req.input)
+      });
+      if (!submitRes.ok) {
+        throw new FalMediaError(submitRes.status, await safeText2(submitRes));
+      }
+      const submit = await submitRes.json();
+      const statusUrl = submit.status_url;
+      const responseUrl = submit.response_url;
+      if (!statusUrl || !responseUrl) {
+        throw new Error(
+          `ai-lcr: fal submit for "${req.externalId}" returned no status/response URL (keys: ${Object.keys(
+            submit
+          ).join(", ")})`
+        );
+      }
+      const deadline = Date.now() + pollTimeoutMs;
+      let completed = false;
+      while (Date.now() < deadline) {
+        const statusRes = await fetchImpl(statusUrl, { headers });
+        if (!statusRes.ok) {
+          throw new FalMediaError(statusRes.status, await safeText2(statusRes));
+        }
+        const status = String((await statusRes.json()).status ?? "");
+        if (status === "COMPLETED") {
+          completed = true;
+          break;
+        }
+        await sleep2(pollIntervalMs);
+      }
+      if (!completed) {
+        throw new Error(
+          `ai-lcr: fal job for "${req.externalId}" timed out after ${pollTimeoutMs}ms`
+        );
+      }
+      const resultRes = await fetchImpl(responseUrl, { headers });
+      if (!resultRes.ok) {
+        throw new FalMediaError(resultRes.status, await safeText2(resultRes));
+      }
+      const outputs = extractOutputs(await resultRes.json());
+      if (outputs.length === 0) {
+        throw new Error(`ai-lcr: fal returned no media URL for "${req.externalId}"`);
+      }
+      return { outputs, units: outputs.length };
+    }
+  };
+}
+var FalMediaError = class extends Error {
+  constructor(status, body) {
+    super(`fal media HTTP ${status}: ${body.slice(0, 300)}`);
+    this.status = status;
+    this.name = "FalMediaError";
+  }
+  status;
+};
+function sleep2(ms) {
+  return new Promise((r) => setTimeout(r, ms));
+}
+async function safeText2(res) {
+  try {
+    return await res.text();
+  } catch {
+    return "<no body>";
+  }
+}
 // src/index.ts
 function isLanguageModel(entry) {
   return typeof entry.doGenerate === "function";
@@ -734,11 +837,12 @@ function createLCR(config) {
 }
 export {
   DEFAULT_REFERENCE,
+  FalMediaError,
   MEDIA_PRICING,
   cheapestRoute,
   classifyError,
   comparePrices,
-  createHttpSink,
+  createFalMediaAdapter,
   createKunavoMediaAdapter,
   createLCR,
   createMediaLCR,
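
The `createMediaLCR` hunk above derives two numbers per call: a `baselineUsd` taken from the priciest ranked route, and a `costCents` that falls back to the route's reference price times `units` when the adapter reports no price (as the fal adapter does). The arithmetic can be sketched in isolation as below. `RouteSketch`, `baselineUsdOf`, `settleCost`, and the prices are hypothetical, chosen only to make the numbers easy to follow.

```typescript
// Hypothetical route shape: only the fields the cost arithmetic reads.
type RouteSketch = { provider: string; refCents: number };

// Baseline: the most expensive ranked route at the reference size, in USD.
function baselineUsdOf(ranked: RouteSketch[]): number {
  return ranked.length > 0 ? Math.max(...ranked.map((r) => r.refCents)) / 100 : 0;
}

// Winner cost: use the adapter's price if it reported one, otherwise estimate
// from the route's reference price multiplied by the number of outputs.
function settleCost(route: RouteSketch, result: { costCents?: number; units?: number }) {
  const estimated = result.costCents === undefined;
  const costCents = result.costCents ?? route.refCents * (result.units ?? 1);
  return { costCents, estimated, costUsd: costCents / 100 };
}

// Assumed prices: fal at 3 cents and runware at 5 cents per reference image.
const ranked: RouteSketch[] = [
  { provider: "fal", refCents: 3 },
  { provider: "runware", refCents: 5 },
];

const baseline = baselineUsdOf(ranked);                     // 0.05
const estimatedCall = settleCost(ranked[0], { units: 2 });  // costCents 6, estimated
const reportedCall = settleCost(ranked[1], { costCents: 4 }); // costCents 4, not estimated
```

One property worth noticing in this sketch, and in the hunk it mirrors: the baseline is a per-reference-unit figure and is not multiplied by `units`, so a call that returns several outputs can report a `costUsd` above `baselineUsd` (here 0.06 against 0.05).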

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "ai-lcr",
-  "version": "0.2.0",
+  "version": "0.2.2",
   "description": "Least Cost Routing for LLMs — route every model call to the cheapest available provider, fall back automatically, and track real cost. Built for the Vercel AI SDK.",
   "keywords": [
     "ai",