ai-lcr 0.2.0 → 0.2.2
- package/README.md +4 -23
- package/README.zh-CN.md +3 -2
- package/dist/index.cjs +156 -51
- package/dist/index.d.cts +71 -55
- package/dist/index.d.ts +71 -55
- package/dist/index.js +154 -50
- package/package.json +1 -1
package/README.md
CHANGED
@@ -135,25 +135,6 @@ const lcr = createLCR({
   onCall: (record) => console.log(JSON.stringify(record)),
 ```
 
-Or ship each record to an HTTP collector with the built-in `createHttpSink` (fire-and-forget, never throws, dashboard-agnostic):
-
-```ts
-import { createLCR, createHttpSink } from "ai-lcr";
-import { after } from "next/server"; // serverless: don't block the response
-
-const lcr = createLCR({
-  models: { /* … */ },
-  onCall: createHttpSink({
-    url: `${process.env.LCR_INGEST_URL}/api/ingest`,
-    headers: { authorization: `Bearer ${process.env.LCR_INGEST_KEY}` },
-    project: process.env.LCR_PROJECT, // optional tag if one collector serves several apps
-    dispatch: after, // run after the response is sent (serverless-safe)
-  }),
-});
-```
-
-Point `url` at anything that accepts the `CallRecord` JSON — including the self-hostable companion dashboard, **[ai-lcr-dashboard](https://github.com/victorzhrn/ai-lcr-dashboard)** (Spend / Calls / Failover rate + a live failover feed). You run your own instance, so the data never leaves your infrastructure; a [db9](https://db9.ai) database can be provisioned in seconds if you don't want to stand one up yourself.
-
 ```ts
 interface CallRecord {
   id: string; // correlation id, one per request
@@ -165,8 +146,7 @@ interface CallRecord {
   latencyMs: number;
   inputTokens: number;
   outputTokens: number;
-  costUsd: number;
-  baselineUsd: number; // what the priciest configured route would cost → savings = baselineUsd - costUsd
+  costUsd: number;
 }
 ```
 
@@ -176,7 +156,7 @@ Any OpenAI-compatible endpoint works — and so does any AI SDK provider package
 
 - **Model vendors' own APIs (native):** route straight to [DeepSeek](https://platform.deepseek.com), [OpenAI](https://openai.com), [Anthropic](https://anthropic.com), [Google](https://ai.google.dev), [xAI](https://x.ai), etc. via their AI SDK provider packages — no markup, full native features. See [Route to a model vendor's own API](#route-to-a-model-vendors-own-api-native-providers).
 - **Text aggregators:** [OpenRouter](https://openrouter.ai) (widest coverage, list pricing) · [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off** every model) · [TokenMart](https://thetokenmart.ai) (15–65% off, varies by model)
-- **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) —
+- **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) — routing via `createMediaLCR`. Image: Kunavo + Runware + fal. Video: fal (live, via its async queue API); Kunavo's Veo poll path is implemented but unverified
 
 ## Text model pricing
 
@@ -293,7 +273,8 @@ Two OpenAI-compatible providers, same probe, same day. Cells cover both families
 - [ ] Bundled price table for zero-config pricing (drop the manual `cost` numbers)
 - [ ] Provider-quirk middleware (transparently patch known per-provider request quirks, e.g. Kunavo's ignored `max_tokens`)
 - [ ] Feed probe results into routing automatically (auto-exclude a model from a provider that fails its probe)
-- [
+- [x] Image & video model routing (`createMediaLCR`) — image via Kunavo + Runware + fal; **video live via fal** (async queue API)
+- [ ] Normalized cross-provider video price comparison + verified Kunavo/Runware video adapters
 
 ## Affiliate disclosure
 
|
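Note on the README.md hunks above: they drop the `createHttpSink` section, and the 0.2.2 export list no longer includes that helper. For code that relied on it, a minimal local stand-in might look like the sketch below. It is modelled on the implementation removed from `dist/index.cjs` in this same diff; the names `createLocalHttpSink`, `LocalSinkOptions`, and `FetchLike` are hypothetical and not part of ai-lcr.

```typescript
// Hypothetical local replacement for the removed `createHttpSink`.
// Nothing here is exported by ai-lcr 0.2.2; all names are illustrative.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string; keepalive?: boolean },
) => Promise<unknown>;

interface LocalSinkOptions {
  url: string;
  headers?: Record<string, string>;
  project?: string; // optional tag merged into each payload
  dispatch?: (task: () => Promise<void>) => void; // e.g. a waitUntil-style wrapper
  fetchImpl?: FetchLike; // injected for tests
  onError?: (error: unknown) => void;
}

function createLocalHttpSink(options: LocalSinkOptions) {
  const { url, headers, project, onError } = options;
  // Default: start the task immediately and do not wait for it.
  const dispatch =
    options.dispatch ?? ((task: () => Promise<void>): void => { void task(); });
  const doFetch =
    options.fetchImpl ?? ((globalThis as { fetch?: unknown }).fetch as FetchLike | undefined);

  return (record: Record<string, unknown>): void => {
    if (!doFetch) {
      onError?.(new Error("no fetch available for the local HTTP sink"));
      return;
    }
    const payload = project ? { project, ...record } : record;
    dispatch(async () => {
      try {
        await doFetch(url, {
          method: "POST",
          headers: { "content-type": "application/json", ...headers },
          body: JSON.stringify(payload),
          keepalive: true,
        });
      } catch (err) {
        onError?.(err); // failures are reported, never thrown to the caller
      }
    });
  };
}
```

The returned function has the `(record) => void` shape that `onCall` expects. Passing a `dispatch` such as `after` from `next/server`, as the removed README example did, is what kept the POST alive on serverless; the default simply starts the request right away.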
package/README.zh-CN.md
CHANGED
|
@@ -114,7 +114,7 @@ const lcr = createLCR({
|
|
|
114
114
|
|
|
115
115
|
- **模型厂商官方 API(原生):** 通过各自的 AI SDK provider 包直连 [DeepSeek](https://platform.deepseek.com)、[OpenAI](https://openai.com)、[Anthropic](https://anthropic.com)、[Google](https://ai.google.dev)、[xAI](https://x.ai) 等——无加价,原生特性齐全。见上方「直连模型厂商官方 API(原生 provider)」一节。
|
|
116
116
|
- **文本聚合器:** [OpenRouter](https://openrouter.ai)(覆盖最广,列表定价)· [Kunavo](https://kunavo.com/?ref=victorimf)(**全模型 8 折**)· [TokenMart](https://thetokenmart.ai)(按模型 85 折–35 折不等)
|
|
117
|
-
- **图像 / 视频:** [Kunavo](https://kunavo.com/?ref=victorimf)(**8 折**)· [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) ——
|
|
117
|
+
- **图像 / 视频:** [Kunavo](https://kunavo.com/?ref=victorimf)(**8 折**)· [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) —— 通过 `createMediaLCR` 路由。图像:Kunavo + Runware + fal。视频:fal(已可用,走其异步队列 API);Kunavo 的 Veo 轮询路径已实现但未验证
|
|
118
118
|
|
|
119
119
|
## 文本模型价格
|
|
120
120
|
|
|
@@ -229,7 +229,8 @@ API_KEY=$TOKENMART_API_KEY BASE=https://api.tokenmart.ai \
|
|
|
229
229
|
- [ ] 内置价格表,实现零配置定价(省去手填 `cost` 数字)
|
|
230
230
|
- [ ] provider 怪癖中间件(透明地修补已知怪癖,如 Kunavo 被忽略的 `max_tokens`)
|
|
231
231
|
- [ ] 把 probe 结果自动接入路由(探测失败的 provider×model 自动从列表剔除)
|
|
232
|
-
- [
|
|
232
|
+
- [x] 图像与视频模型路由(`createMediaLCR`)—— 图像走 Kunavo + Runware + fal;**视频已可用,走 fal**(异步队列 API)
|
|
233
|
+
- [ ] 归一化的跨 provider 视频价格对比 + 验证 Kunavo/Runware 视频适配器
|
|
233
234
|
|
|
234
235
|
## 联盟(Affiliate)披露
|
|
235
236
|
|
package/dist/index.cjs
CHANGED
|
@@ -21,11 +21,12 @@ var __toCommonJS = (mod) => __copyProps(__defProp({}, "__esModule", { value: tru
|
|
|
21
21
|
var index_exports = {};
|
|
22
22
|
__export(index_exports, {
|
|
23
23
|
DEFAULT_REFERENCE: () => DEFAULT_REFERENCE,
|
|
24
|
+
FalMediaError: () => FalMediaError,
|
|
24
25
|
MEDIA_PRICING: () => MEDIA_PRICING,
|
|
25
26
|
cheapestRoute: () => cheapestRoute,
|
|
26
27
|
classifyError: () => classifyError,
|
|
27
28
|
comparePrices: () => comparePrices,
|
|
28
|
-
|
|
29
|
+
createFalMediaAdapter: () => createFalMediaAdapter,
|
|
29
30
|
createKunavoMediaAdapter: () => createKunavoMediaAdapter,
|
|
30
31
|
createLCR: () => createLCR,
|
|
31
32
|
createMediaLCR: () => createMediaLCR,
|
|
@@ -141,20 +142,12 @@ var LcrFallbackModel = class {
|
|
|
141
142
|
errorClass: classifyError(error)
|
|
142
143
|
});
|
|
143
144
|
}
|
|
144
|
-
/** Cost of one route for the given token counts; 0 if it has no price. */
|
|
145
|
-
routeCost(p, inputTokens, outputTokens) {
|
|
146
|
-
return p.cost ? inputTokens / 1e6 * p.cost.input + outputTokens / 1e6 * p.cost.output : 0;
|
|
147
|
-
}
|
|
148
145
|
/** Winner settled: record the attempt, fire `onCost` (compat) + `onCall`. */
|
|
149
146
|
finalizeOk(ctx, provider, attemptStart, usage) {
|
|
150
147
|
ctx.attempts.push({ provider: provider.label, ok: true, latencyMs: Date.now() - attemptStart });
|
|
151
148
|
const inputTokens = usage?.inputTokens?.total ?? 0;
|
|
152
149
|
const outputTokens = usage?.outputTokens?.total ?? 0;
|
|
153
|
-
const costUsd =
|
|
154
|
-
const baselineUsd = this.opts.providers.reduce(
|
|
155
|
-
(max, p) => Math.max(max, this.routeCost(p, inputTokens, outputTokens)),
|
|
156
|
-
costUsd
|
|
157
|
-
);
|
|
150
|
+
const costUsd = provider.cost ? inputTokens / 1e6 * provider.cost.input + outputTokens / 1e6 * provider.cost.output : 0;
|
|
158
151
|
this.opts.onCost?.({
|
|
159
152
|
model: this.opts.modelName,
|
|
160
153
|
provider: provider.label,
|
|
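Note on the hunk above: the added `costUsd` line inlines the removed `routeCost` helper. Prices are read as USD per million tokens, and a route without a `cost` is billed as 0. A small sketch of that arithmetic, with the hypothetical names `ProviderCostLike` and `textCostUsd` standing in for the library's own types:

```typescript
// Illustrative restatement of the per-call text cost formula from the hunk.
interface ProviderCostLike {
  input: number; // USD per 1M input tokens
  output: number; // USD per 1M output tokens
}

function textCostUsd(
  cost: ProviderCostLike | undefined,
  inputTokens: number,
  outputTokens: number,
): number {
  if (!cost) return 0; // no price configured for the winning route
  return (inputTokens / 1e6) * cost.input + (outputTokens / 1e6) * cost.output;
}

// 500k input tokens at $2/M plus 250k output tokens at $8/M.
const exampleCostUsd = textCostUsd({ input: 2, output: 8 }, 500_000, 250_000);
console.log(exampleCostUsd); // 3
```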
@@ -172,8 +165,7 @@ var LcrFallbackModel = class {
|
|
|
172
165
|
latencyMs: Date.now() - ctx.startedAt,
|
|
173
166
|
inputTokens,
|
|
174
167
|
outputTokens,
|
|
175
|
-
costUsd
|
|
176
|
-
baselineUsd
|
|
168
|
+
costUsd
|
|
177
169
|
});
|
|
178
170
|
}
|
|
179
171
|
/** Every provider failed: fire `onCall` with no winner. */
|
|
@@ -188,8 +180,7 @@ var LcrFallbackModel = class {
|
|
|
188
180
|
latencyMs: Date.now() - ctx.startedAt,
|
|
189
181
|
inputTokens: 0,
|
|
190
182
|
outputTokens: 0,
|
|
191
|
-
costUsd: 0
|
|
192
|
-
baselineUsd: 0
|
|
183
|
+
costUsd: 0
|
|
193
184
|
});
|
|
194
185
|
}
|
|
195
186
|
async doGenerate(options) {
|
|
@@ -345,40 +336,6 @@ function formatCallRecord(record, opts = {}) {
|
|
|
345
336
|
return line;
|
|
346
337
|
}
|
|
347
338
|
|
|
348
|
-
// src/sink.ts
|
|
349
|
-
function createHttpSink(options) {
|
|
350
|
-
const {
|
|
351
|
-
url,
|
|
352
|
-
headers,
|
|
353
|
-
project,
|
|
354
|
-
dispatch = (task) => {
|
|
355
|
-
void task();
|
|
356
|
-
},
|
|
357
|
-
fetchImpl,
|
|
358
|
-
onError
|
|
359
|
-
} = options;
|
|
360
|
-
const doFetch = fetchImpl ?? globalThis.fetch;
|
|
361
|
-
return (record) => {
|
|
362
|
-
if (!doFetch) {
|
|
363
|
-
onError?.(new Error("ai-lcr: no fetch available for createHttpSink"));
|
|
364
|
-
return;
|
|
365
|
-
}
|
|
366
|
-
const payload = project ? { project, ...record } : record;
|
|
367
|
-
dispatch(async () => {
|
|
368
|
-
try {
|
|
369
|
-
await doFetch(url, {
|
|
370
|
-
method: "POST",
|
|
371
|
-
headers: { "content-type": "application/json", ...headers },
|
|
372
|
-
body: JSON.stringify(payload),
|
|
373
|
-
keepalive: true
|
|
374
|
-
});
|
|
375
|
-
} catch (err) {
|
|
376
|
-
onError?.(err);
|
|
377
|
-
}
|
|
378
|
-
});
|
|
379
|
-
};
|
|
380
|
-
}
|
|
381
|
-
|
|
382
339
|
// src/media.ts
|
|
383
340
|
var DEFAULT_REFERENCE = {
|
|
384
341
|
image: { width: 1920, height: 1080 },
|
|
@@ -425,30 +382,75 @@ function comparePrices(registry, ref = DEFAULT_REFERENCE) {
|
|
|
425
382
|
};
|
|
426
383
|
});
|
|
427
384
|
}
|
|
385
|
+
function newMediaCallId() {
|
|
386
|
+
const c = globalThis.crypto;
|
|
387
|
+
return c?.randomUUID ? c.randomUUID() : `lcr_${Date.now().toString(36)}`;
|
|
388
|
+
}
|
|
428
389
|
function createMediaLCR(config) {
|
|
429
|
-
const { registry, adapters, reference = DEFAULT_REFERENCE, onError, onCost } = config;
|
|
390
|
+
const { registry, adapters, reference = DEFAULT_REFERENCE, onError, onCost, onCall } = config;
|
|
430
391
|
return async function generate(modelId, input) {
|
|
431
392
|
const def = registry[modelId];
|
|
432
393
|
if (!def) {
|
|
433
394
|
throw new Error(`ai-lcr: unknown media model "${modelId}" \u2014 add it to the registry`);
|
|
434
395
|
}
|
|
435
396
|
const ranked = rankRoutes(def, reference);
|
|
397
|
+
const baselineUsd = ranked.length > 0 ? Math.max(...ranked.map((r) => r.refCents)) / 100 : 0;
|
|
398
|
+
const startedAt = Date.now();
|
|
399
|
+
const attempts = [];
|
|
436
400
|
let lastErr;
|
|
401
|
+
const emitFail = () => onCall?.({
|
|
402
|
+
id: newMediaCallId(),
|
|
403
|
+
model: modelId,
|
|
404
|
+
attempts,
|
|
405
|
+
winner: void 0,
|
|
406
|
+
ok: false,
|
|
407
|
+
failedOver: attempts.length > 1,
|
|
408
|
+
latencyMs: Date.now() - startedAt,
|
|
409
|
+
inputTokens: 0,
|
|
410
|
+
outputTokens: 0,
|
|
411
|
+
costUsd: 0,
|
|
412
|
+
baselineUsd
|
|
413
|
+
});
|
|
437
414
|
for (const route of ranked) {
|
|
438
415
|
const adapter = adapters[route.provider];
|
|
439
416
|
if (!adapter) continue;
|
|
417
|
+
const attemptStart = Date.now();
|
|
440
418
|
try {
|
|
441
419
|
const result = await adapter.run({ externalId: route.externalId, input });
|
|
442
420
|
const estimated = result.costCents === void 0;
|
|
443
421
|
const costCents = estimated ? route.refCents * (result.units ?? 1) : result.costCents;
|
|
422
|
+
attempts.push({ provider: route.provider, ok: true, latencyMs: Date.now() - attemptStart });
|
|
444
423
|
onCost?.({ modelId, provider: route.provider, costCents, estimated });
|
|
424
|
+
onCall?.({
|
|
425
|
+
id: newMediaCallId(),
|
|
426
|
+
model: modelId,
|
|
427
|
+
attempts,
|
|
428
|
+
winner: route.provider,
|
|
429
|
+
ok: true,
|
|
430
|
+
failedOver: attempts.length > 1,
|
|
431
|
+
latencyMs: Date.now() - startedAt,
|
|
432
|
+
inputTokens: 0,
|
|
433
|
+
outputTokens: 0,
|
|
434
|
+
costUsd: costCents / 100,
|
|
435
|
+
baselineUsd
|
|
436
|
+
});
|
|
445
437
|
return { outputs: result.outputs, provider: route.provider, costCents, estimated };
|
|
446
438
|
} catch (err) {
|
|
447
439
|
lastErr = err;
|
|
440
|
+
attempts.push({
|
|
441
|
+
provider: route.provider,
|
|
442
|
+
ok: false,
|
|
443
|
+
latencyMs: Date.now() - attemptStart,
|
|
444
|
+
errorClass: classifyError(err)
|
|
445
|
+
});
|
|
448
446
|
onError?.(err, route.provider);
|
|
449
|
-
if (!isRetryableError(err))
|
|
447
|
+
if (!isRetryableError(err)) {
|
|
448
|
+
emitFail();
|
|
449
|
+
throw err;
|
|
450
|
+
}
|
|
450
451
|
}
|
|
451
452
|
}
|
|
453
|
+
emitFail();
|
|
452
454
|
throw lastErr instanceof Error ? lastErr : new Error(`ai-lcr: no provider could serve media model "${modelId}"`);
|
|
453
455
|
};
|
|
454
456
|
}
|
|
@@ -731,6 +733,108 @@ var RunwareMediaError = class extends Error {
|
|
|
731
733
|
status;
|
|
732
734
|
};
|
|
733
735
|
|
|
736
|
+
// src/adapters/fal-media.ts
|
|
737
|
+
var DEFAULT_BASE3 = "https://queue.fal.run";
|
|
738
|
+
function extractOutputs(raw) {
|
|
739
|
+
if (!raw || typeof raw !== "object") return [];
|
|
740
|
+
const data = raw;
|
|
741
|
+
const out = [];
|
|
742
|
+
const pushUrl = (url, type) => {
|
|
743
|
+
if (typeof url === "string" && url.length > 0) out.push({ url, type });
|
|
744
|
+
};
|
|
745
|
+
if (Array.isArray(data.images)) {
|
|
746
|
+
for (const img of data.images) pushUrl(img?.url, "image");
|
|
747
|
+
}
|
|
748
|
+
pushUrl(data.image?.url, "image");
|
|
749
|
+
if (Array.isArray(data.videos)) {
|
|
750
|
+
for (const v of data.videos) pushUrl(v?.url, "video");
|
|
751
|
+
}
|
|
752
|
+
pushUrl(data.video?.url, "video");
|
|
753
|
+
return out;
|
|
754
|
+
}
|
|
755
|
+
function createFalMediaAdapter(config) {
|
|
756
|
+
const {
|
|
757
|
+
apiKey,
|
|
758
|
+
baseUrl = DEFAULT_BASE3,
|
|
759
|
+
pollIntervalMs = 3e3,
|
|
760
|
+
pollTimeoutMs = 3e5,
|
|
761
|
+
fetchImpl = fetch
|
|
762
|
+
} = config;
|
|
763
|
+
const headers = {
|
|
764
|
+
"content-type": "application/json",
|
|
765
|
+
authorization: `Key ${apiKey}`
|
|
766
|
+
};
|
|
767
|
+
return {
|
|
768
|
+
provider: "fal",
|
|
769
|
+
async run(req) {
|
|
770
|
+
const submitRes = await fetchImpl(`${baseUrl}/${req.externalId}`, {
|
|
771
|
+
method: "POST",
|
|
772
|
+
headers,
|
|
773
|
+
body: JSON.stringify(req.input)
|
|
774
|
+
});
|
|
775
|
+
if (!submitRes.ok) {
|
|
776
|
+
throw new FalMediaError(submitRes.status, await safeText2(submitRes));
|
|
777
|
+
}
|
|
778
|
+
const submit = await submitRes.json();
|
|
779
|
+
const statusUrl = submit.status_url;
|
|
780
|
+
const responseUrl = submit.response_url;
|
|
781
|
+
if (!statusUrl || !responseUrl) {
|
|
782
|
+
throw new Error(
|
|
783
|
+
`ai-lcr: fal submit for "${req.externalId}" returned no status/response URL (keys: ${Object.keys(
|
|
784
|
+
submit
|
|
785
|
+
).join(", ")})`
|
|
786
|
+
);
|
|
787
|
+
}
|
|
788
|
+
const deadline = Date.now() + pollTimeoutMs;
|
|
789
|
+
let completed = false;
|
|
790
|
+
while (Date.now() < deadline) {
|
|
791
|
+
const statusRes = await fetchImpl(statusUrl, { headers });
|
|
792
|
+
if (!statusRes.ok) {
|
|
793
|
+
throw new FalMediaError(statusRes.status, await safeText2(statusRes));
|
|
794
|
+
}
|
|
795
|
+
const status = String((await statusRes.json()).status ?? "");
|
|
796
|
+
if (status === "COMPLETED") {
|
|
797
|
+
completed = true;
|
|
798
|
+
break;
|
|
799
|
+
}
|
|
800
|
+
await sleep2(pollIntervalMs);
|
|
801
|
+
}
|
|
802
|
+
if (!completed) {
|
|
803
|
+
throw new Error(
|
|
804
|
+
`ai-lcr: fal job for "${req.externalId}" timed out after ${pollTimeoutMs}ms`
|
|
805
|
+
);
|
|
806
|
+
}
|
|
807
|
+
const resultRes = await fetchImpl(responseUrl, { headers });
|
|
808
|
+
if (!resultRes.ok) {
|
|
809
|
+
throw new FalMediaError(resultRes.status, await safeText2(resultRes));
|
|
810
|
+
}
|
|
811
|
+
const outputs = extractOutputs(await resultRes.json());
|
|
812
|
+
if (outputs.length === 0) {
|
|
813
|
+
throw new Error(`ai-lcr: fal returned no media URL for "${req.externalId}"`);
|
|
814
|
+
}
|
|
815
|
+
return { outputs, units: outputs.length };
|
|
816
|
+
}
|
|
817
|
+
};
|
|
818
|
+
}
|
|
819
|
+
var FalMediaError = class extends Error {
|
|
820
|
+
constructor(status, body) {
|
|
821
|
+
super(`fal media HTTP ${status}: ${body.slice(0, 300)}`);
|
|
822
|
+
this.status = status;
|
|
823
|
+
this.name = "FalMediaError";
|
|
824
|
+
}
|
|
825
|
+
status;
|
|
826
|
+
};
|
|
827
|
+
function sleep2(ms) {
|
|
828
|
+
return new Promise((r) => setTimeout(r, ms));
|
|
829
|
+
}
|
|
830
|
+
async function safeText2(res) {
|
|
831
|
+
try {
|
|
832
|
+
return await res.text();
|
|
833
|
+
} catch {
|
|
834
|
+
return "<no body>";
|
|
835
|
+
}
|
|
836
|
+
}
|
|
837
|
+
|
|
734
838
|
// src/index.ts
|
|
735
839
|
function isLanguageModel(entry) {
|
|
736
840
|
return typeof entry.doGenerate === "function";
|
|
@@ -774,11 +878,12 @@ function createLCR(config) {
|
|
|
774
878
|
// Annotate the CommonJS export names for ESM import in node:
|
|
775
879
|
0 && (module.exports = {
|
|
776
880
|
DEFAULT_REFERENCE,
|
|
881
|
+
FalMediaError,
|
|
777
882
|
MEDIA_PRICING,
|
|
778
883
|
cheapestRoute,
|
|
779
884
|
classifyError,
|
|
780
885
|
comparePrices,
|
|
781
|
-
|
|
886
|
+
createFalMediaAdapter,
|
|
782
887
|
createKunavoMediaAdapter,
|
|
783
888
|
createLCR,
|
|
784
889
|
createMediaLCR,
|
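Note on the `createMediaLCR` changes in this file: the new code derives `costUsd` and `baselineUsd` from the ranked routes. The baseline is the highest normalized reference price among the ranked routes, and the cost falls back to the winner's reference price times `units` when the adapter reports no `costCents`. A hedged sketch of that bookkeeping follows; `RankedRouteLike`, `AdapterResultLike`, and `settleMediaCost` are illustrative names, and `savingsUsd` is an addition that is not a `CallRecord` field.

```typescript
// Illustrative only: mirrors the cost/baseline arithmetic shown in the diff.
interface RankedRouteLike {
  provider: string;
  refCents: number; // normalized price for one reference output, in cents
}

interface AdapterResultLike {
  costCents?: number; // real cost, if the provider reports one
  units?: number; // output count, used when the cost must be estimated
}

function settleMediaCost(
  ranked: RankedRouteLike[],
  winner: RankedRouteLike,
  result: AdapterResultLike,
) {
  const baselineUsd =
    ranked.length > 0 ? Math.max(...ranked.map((r) => r.refCents)) / 100 : 0;
  const reported = result.costCents;
  const estimated = reported === undefined;
  const costCents =
    reported === undefined ? winner.refCents * (result.units ?? 1) : reported;
  const costUsd = costCents / 100;
  return { costUsd, baselineUsd, estimated, savingsUsd: baselineUsd - costUsd };
}

const routes: RankedRouteLike[] = [
  { provider: "runware", refCents: 2 },
  { provider: "fal", refCents: 4 },
];
const settled = settleMediaCost(routes, routes[0], { units: 1 });
console.log(settled.estimated, settled.costUsd, settled.baselineUsd);
```

In the code shown in the diff, the baseline is priced for one reference output and is not multiplied by `units`, so `baselineUsd - costUsd` is easiest to interpret for single-output calls.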
package/dist/index.d.cts
CHANGED
|
@@ -65,13 +65,12 @@ interface CallRecord {
|
|
|
65
65
|
/** Computed from the winner's `cost`; 0 if no price was given or the call failed. */
|
|
66
66
|
costUsd: number;
|
|
67
67
|
/**
|
|
68
|
-
* What
|
|
69
|
-
*
|
|
70
|
-
*
|
|
71
|
-
*
|
|
72
|
-
* external price table needed.
|
|
68
|
+
* What the priciest configured route would have cost for this request, so
|
|
69
|
+
* `baselineUsd - costUsd` is the saving from routing cheapest-first. Set by
|
|
70
|
+
* the media router (`createMediaLCR`), where every route has a known price;
|
|
71
|
+
* omitted by the text router, which can't price a baseline per call.
|
|
73
72
|
*/
|
|
74
|
-
baselineUsd
|
|
73
|
+
baselineUsd?: number;
|
|
75
74
|
}
|
|
76
75
|
/**
|
|
77
76
|
* Normalize an error into a short, log-friendly class for {@link CallRecord}.
|
|
@@ -102,52 +101,20 @@ interface FormatOptions {
|
|
|
102
101
|
declare function formatCallRecord(record: CallRecord, opts?: FormatOptions): string;
|
|
103
102
|
|
|
104
103
|
/**
|
|
105
|
-
*
|
|
106
|
-
* collector (e.g. a self-hosted ai-lcr-dashboard `/api/ingest`, or any endpoint
|
|
107
|
-
* that accepts the CallRecord shape).
|
|
104
|
+
* ai-lcr media routing — Least Cost Routing for image & video models.
|
|
108
105
|
*
|
|
109
|
-
*
|
|
110
|
-
*
|
|
111
|
-
*
|
|
106
|
+
* The text router (./index, ./fallback) is built on the AI SDK's
|
|
107
|
+
* `LanguageModelV3` and only handles token-billed chat/completion. Image and
|
|
108
|
+
* video providers are a different world: outputs are files (URLs), pricing
|
|
109
|
+
* comes in incompatible units (per-image, per-second, per-call, per-megapixel),
|
|
110
|
+
* and video is a long-running async job. This module is the parallel, self-
|
|
111
|
+
* contained media side — no `LanguageModelV3` dependency.
|
|
112
112
|
*
|
|
113
|
-
*
|
|
114
|
-
*
|
|
115
|
-
*
|
|
116
|
-
*
|
|
117
|
-
* models: { ... },
|
|
118
|
-
* onCall: createHttpSink({
|
|
119
|
-
* url: process.env.LCR_INGEST_URL + "/api/ingest",
|
|
120
|
-
* headers: { authorization: `Bearer ${process.env.LCR_INGEST_KEY}` },
|
|
121
|
-
* project: process.env.LCR_PROJECT,
|
|
122
|
-
* dispatch: after, // run after the response is sent
|
|
123
|
-
* }),
|
|
124
|
-
* });
|
|
125
|
-
*/
|
|
126
|
-
|
|
127
|
-
interface HttpSinkOptions {
|
|
128
|
-
/** Where to POST each CallRecord (a collector that accepts the JSON shape). */
|
|
129
|
-
url: string;
|
|
130
|
-
/** Extra headers, e.g. `{ authorization: ` + "`Bearer ${key}`" + ` }`. */
|
|
131
|
-
headers?: Record<string, string>;
|
|
132
|
-
/** Optional tenant/project tag merged into each payload (`{ project, ...record }`). */
|
|
133
|
-
project?: string;
|
|
134
|
-
/**
|
|
135
|
-
* Wrap the dispatch so it survives a serverless function returning. On
|
|
136
|
-
* Next.js pass `after` from "next/server"; elsewhere pass a `waitUntil`-style
|
|
137
|
-
* function. Defaults to running immediately — correct for long-lived servers,
|
|
138
|
-
* but on serverless an un-awaited POST may be cut off, so pass `after`.
|
|
139
|
-
*/
|
|
140
|
-
dispatch?: (task: () => void | Promise<void>) => void;
|
|
141
|
-
/** Custom fetch (tests / runtimes without a global `fetch`). */
|
|
142
|
-
fetchImpl?: typeof fetch;
|
|
143
|
-
/** Called if the POST fails. Failures are swallowed by default. */
|
|
144
|
-
onError?: (error: unknown) => void;
|
|
145
|
-
}
|
|
146
|
-
/**
|
|
147
|
-
* Build an `onCall` handler that POSTs each {@link CallRecord} to `url`.
|
|
148
|
-
* Returns a plain `(record) => void` — pass it straight to `createLCR`'s `onCall`.
|
|
113
|
+
* The core idea is the SAME as the text LCR: keep a list of providers per
|
|
114
|
+
* model, route to the cheapest healthy one, fall back on failure, report real
|
|
115
|
+
* cost. The only new problem is making prices comparable, which we solve by
|
|
116
|
+
* normalizing every provider's price to ONE reference output (see ReferenceSpec).
|
|
149
117
|
*/
|
|
150
|
-
declare function createHttpSink(options: HttpSinkOptions): (record: CallRecord) => void;
|
|
151
118
|
|
|
152
119
|
type MediaModality = "image" | "video";
|
|
153
120
|
/**
|
|
@@ -268,6 +235,13 @@ interface MediaLCRConfig {
|
|
|
268
235
|
reference?: ReferenceSpec;
|
|
269
236
|
onError?: (error: Error, provider: string) => void;
|
|
270
237
|
onCost?: (event: MediaCostEvent) => void;
|
|
238
|
+
/**
|
|
239
|
+
* One correlated {@link CallRecord} per settled request — the full failover
|
|
240
|
+
* chain, winner, latency, and cost — mirroring the text side's `onCall`, so
|
|
241
|
+
* the same dashboard sink works for image/video. Fire-and-forget; never
|
|
242
|
+
* throws. Media records carry no token counts (inputTokens/outputTokens = 0).
|
|
243
|
+
*/
|
|
244
|
+
onCall?: (record: CallRecord) => void;
|
|
271
245
|
}
|
|
272
246
|
interface MediaRunResult {
|
|
273
247
|
outputs: MediaOutput[];
|
|
@@ -275,11 +249,6 @@ interface MediaRunResult {
|
|
|
275
249
|
costCents: number;
|
|
276
250
|
estimated: boolean;
|
|
277
251
|
}
|
|
278
|
-
/**
|
|
279
|
-
* Build a media Least Cost Router. Returns `generate(modelId, input)` which
|
|
280
|
-
* tries providers cheapest-first and falls through on a retryable error —
|
|
281
|
-
* exactly the text LCR's contract, for image/video.
|
|
282
|
-
*/
|
|
283
252
|
declare function createMediaLCR(config: MediaLCRConfig): (modelId: string, input: Record<string, unknown>) => Promise<MediaRunResult>;
|
|
284
253
|
|
|
285
254
|
/**
|
|
@@ -367,6 +336,53 @@ interface RunwareMediaConfig {
|
|
|
367
336
|
}
|
|
368
337
|
declare function createRunwareMediaAdapter(config: RunwareMediaConfig): MediaAdapter;
|
|
369
338
|
|
|
339
|
+
/**
|
|
340
|
+
* fal media adapter — image (queue) + video (queue, async poll).
|
|
341
|
+
*
|
|
342
|
+
* fal serves every model through one async queue API, so a single submit→poll→
|
|
343
|
+
* fetch-result path covers both image and video. That is the whole reason this
|
|
344
|
+
* adapter exists: it is ai-lcr's first VIDEO-capable execution path. (The
|
|
345
|
+
* Runware adapter is image-only; the Kunavo one's video poll loop is unverified.)
|
|
346
|
+
*
|
|
347
|
+
* Implementation note: ai-art's fal adapter uses the `@fal-ai/client` SDK, but
|
|
348
|
+
* ai-lcr deliberately keeps zero provider SDKs — every adapter is raw `fetch`
|
|
349
|
+
* with an injectable `fetchImpl` for testing (see runware-media, kunavo-media).
|
|
350
|
+
* So this re-implements the three queue calls against fal's REST endpoints:
|
|
351
|
+
*
|
|
352
|
+
* 1. submit POST https://queue.fal.run/{model} → { request_id, status_url, response_url }
|
|
353
|
+
* 2. status GET {status_url} → { status: IN_QUEUE | IN_PROGRESS | COMPLETED }
|
|
354
|
+
* 3. result GET {response_url} → { images:[…] } | { video:{url} } | …
|
|
355
|
+
*
|
|
356
|
+
* We follow the `status_url` / `response_url` returned by submit rather than
|
|
357
|
+
* rebuilding them, which sidesteps fal's sub-path quirk (a model like
|
|
358
|
+
* `fal-ai/flux/schnell` submits to the full path but its status/result live
|
|
359
|
+
* under the `fal-ai/flux` base).
|
|
360
|
+
*
|
|
361
|
+
* Auth: fal uses `Authorization: Key {FAL_KEY}` (NOT Bearer).
|
|
362
|
+
*
|
|
363
|
+
* Cost: fal's queue result does not carry a per-call price, so cost is left to
|
|
364
|
+
* the router's normalized estimate (costCents stays undefined; `units` is the
|
|
365
|
+
* output count — one image, or one clip).
|
|
366
|
+
*/
|
|
367
|
+
|
|
368
|
+
interface FalMediaConfig {
|
|
369
|
+
apiKey: string;
|
|
370
|
+
/** Override for testing. Defaults to https://queue.fal.run. */
|
|
371
|
+
baseUrl?: string;
|
|
372
|
+
/** Video/job poll cadence (ms). Default 3000. */
|
|
373
|
+
pollIntervalMs?: number;
|
|
374
|
+
/** Max time to wait for a job before giving up (ms). Default 300000 (5m). */
|
|
375
|
+
pollTimeoutMs?: number;
|
|
376
|
+
/** Injected for testing; defaults to global fetch. */
|
|
377
|
+
fetchImpl?: typeof fetch;
|
|
378
|
+
}
|
|
379
|
+
declare function createFalMediaAdapter(config: FalMediaConfig): MediaAdapter;
|
|
380
|
+
/** Carries the HTTP status so the router's `isRetryableError` can classify it. */
|
|
381
|
+
declare class FalMediaError extends Error {
|
|
382
|
+
status: number;
|
|
383
|
+
constructor(status: number, body: string);
|
|
384
|
+
}
|
|
385
|
+
|
|
370
386
|
/**
|
|
371
387
|
* ai-lcr — Least Cost Routing for LLMs.
|
|
372
388
|
*
|
|
@@ -420,4 +436,4 @@ type LCRRouter = (modelName: string) => LanguageModelV3;
|
|
|
420
436
|
*/
|
|
421
437
|
declare function createLCR(config: LCRConfig): LCRRouter;
|
|
422
438
|
|
|
423
|
-
export { type CallRecord, type CostEvent, DEFAULT_REFERENCE,
|
|
439
|
+
export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, FalMediaError, type FormatOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, comparePrices, createFalMediaAdapter, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };
|
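Note on the fal adapter comment above: step 3 lists several result shapes. The bundled adapter flattens them into `{ url, type }` outputs; the sketch below restates that normalization with explicit types. `FalResultLike`, `UrlHolder`, and `MediaOutputLike` are hypothetical names, and only the four shapes handled in `dist/index.cjs` are covered.

```typescript
// Typed restatement of the result normalization; type names are illustrative.
type MediaOutputLike = { url: string; type: "image" | "video" };
type UrlHolder = { url?: unknown } | null | undefined;

interface FalResultLike {
  images?: UrlHolder[];
  image?: UrlHolder;
  videos?: UrlHolder[];
  video?: UrlHolder;
}

function extractOutputs(raw: unknown): MediaOutputLike[] {
  if (!raw || typeof raw !== "object") return [];
  const data = raw as FalResultLike;
  const out: MediaOutputLike[] = [];
  // Keep only non-empty string URLs; anything else is silently skipped.
  const pushUrl = (url: unknown, type: MediaOutputLike["type"]): void => {
    if (typeof url === "string" && url.length > 0) out.push({ url, type });
  };
  if (Array.isArray(data.images)) {
    for (const img of data.images) pushUrl(img?.url, "image");
  }
  pushUrl(data.image?.url, "image");
  if (Array.isArray(data.videos)) {
    for (const v of data.videos) pushUrl(v?.url, "video");
  }
  pushUrl(data.video?.url, "video");
  return out;
}
```

An empty result from this step is what makes the adapter throw its "no media URL" error, so a provider response with an unexpected shape surfaces as a failed attempt rather than an empty success.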
package/dist/index.d.ts
CHANGED
|
@@ -65,13 +65,12 @@ interface CallRecord {
|
|
|
65
65
|
/** Computed from the winner's `cost`; 0 if no price was given or the call failed. */
|
|
66
66
|
costUsd: number;
|
|
67
67
|
/**
|
|
68
|
-
* What
|
|
69
|
-
*
|
|
70
|
-
*
|
|
71
|
-
*
|
|
72
|
-
* external price table needed.
|
|
68
|
+
* What the priciest configured route would have cost for this request, so
|
|
69
|
+
* `baselineUsd - costUsd` is the saving from routing cheapest-first. Set by
|
|
70
|
+
* the media router (`createMediaLCR`), where every route has a known price;
|
|
71
|
+
* omitted by the text router, which can't price a baseline per call.
|
|
73
72
|
*/
|
|
74
|
-
baselineUsd
|
|
73
|
+
baselineUsd?: number;
|
|
75
74
|
}
|
|
76
75
|
/**
|
|
77
76
|
* Normalize an error into a short, log-friendly class for {@link CallRecord}.
|
|
@@ -102,52 +101,20 @@ interface FormatOptions {
|
|
|
102
101
|
declare function formatCallRecord(record: CallRecord, opts?: FormatOptions): string;
|
|
103
102
|
|
|
104
103
|
/**
|
|
105
|
-
*
|
|
106
|
-
* collector (e.g. a self-hosted ai-lcr-dashboard `/api/ingest`, or any endpoint
|
|
107
|
-
* that accepts the CallRecord shape).
|
|
104
|
+
* ai-lcr media routing — Least Cost Routing for image & video models.
|
|
108
105
|
*
|
|
109
|
-
*
|
|
110
|
-
*
|
|
111
|
-
*
|
|
106
|
+
* The text router (./index, ./fallback) is built on the AI SDK's
|
|
107
|
+
* `LanguageModelV3` and only handles token-billed chat/completion. Image and
|
|
108
|
+
* video providers are a different world: outputs are files (URLs), pricing
|
|
109
|
+
* comes in incompatible units (per-image, per-second, per-call, per-megapixel),
|
|
110
|
+
* and video is a long-running async job. This module is the parallel, self-
|
|
111
|
+
* contained media side — no `LanguageModelV3` dependency.
|
|
112
112
|
*
|
|
113
|
-
*
|
|
114
|
-
*
|
|
115
|
-
*
|
|
116
|
-
*
|
|
117
|
-
* models: { ... },
|
|
118
|
-
* onCall: createHttpSink({
|
|
119
|
-
* url: process.env.LCR_INGEST_URL + "/api/ingest",
|
|
120
|
-
* headers: { authorization: `Bearer ${process.env.LCR_INGEST_KEY}` },
|
|
121
|
-
* project: process.env.LCR_PROJECT,
|
|
122
|
-
* dispatch: after, // run after the response is sent
|
|
123
|
-
* }),
|
|
124
|
-
* });
|
|
125
|
-
*/
|
|
126
|
-
|
|
127
|
-
interface HttpSinkOptions {
|
|
128
|
-
/** Where to POST each CallRecord (a collector that accepts the JSON shape). */
|
|
129
|
-
url: string;
|
|
130
|
-
/** Extra headers, e.g. `{ authorization: ` + "`Bearer ${key}`" + ` }`. */
|
|
131
|
-
headers?: Record<string, string>;
|
|
132
|
-
/** Optional tenant/project tag merged into each payload (`{ project, ...record }`). */
|
|
133
|
-
project?: string;
|
|
134
|
-
/**
|
|
135
|
-
* Wrap the dispatch so it survives a serverless function returning. On
|
|
136
|
-
* Next.js pass `after` from "next/server"; elsewhere pass a `waitUntil`-style
|
|
137
|
-
* function. Defaults to running immediately — correct for long-lived servers,
|
|
138
|
-
* but on serverless an un-awaited POST may be cut off, so pass `after`.
|
|
139
|
-
*/
|
|
140
|
-
dispatch?: (task: () => void | Promise<void>) => void;
|
|
141
|
-
/** Custom fetch (tests / runtimes without a global `fetch`). */
|
|
142
|
-
fetchImpl?: typeof fetch;
|
|
143
|
-
/** Called if the POST fails. Failures are swallowed by default. */
|
|
144
|
-
onError?: (error: unknown) => void;
|
|
145
|
-
}
|
|
146
|
-
/**
|
|
147
|
-
* Build an `onCall` handler that POSTs each {@link CallRecord} to `url`.
|
|
148
|
-
* Returns a plain `(record) => void` — pass it straight to `createLCR`'s `onCall`.
|
|
113
|
+
* The core idea is the SAME as the text LCR: keep a list of providers per
|
|
114
|
+
* model, route to the cheapest healthy one, fall back on failure, report real
|
|
115
|
+
* cost. The only new problem is making prices comparable, which we solve by
|
|
116
|
+
* normalizing every provider's price to ONE reference output (see ReferenceSpec).
|
|
149
117
|
*/
|
|
150
|
-
declare function createHttpSink(options: HttpSinkOptions): (record: CallRecord) => void;
|
|
151
118
|
|
|
152
119
|
type MediaModality = "image" | "video";
|
|
153
120
|
/**
|
|
@@ -268,6 +235,13 @@ interface MediaLCRConfig {
|
|
|
268
235
|
reference?: ReferenceSpec;
|
|
269
236
|
onError?: (error: Error, provider: string) => void;
|
|
270
237
|
onCost?: (event: MediaCostEvent) => void;
|
|
238
|
+
/**
|
|
239
|
+
* One correlated {@link CallRecord} per settled request — the full failover
|
|
240
|
+
* chain, winner, latency, and cost — mirroring the text side's `onCall`, so
|
|
241
|
+
* the same dashboard sink works for image/video. Fire-and-forget; never
|
|
242
|
+
* throws. Media records carry no token counts (inputTokens/outputTokens = 0).
|
|
243
|
+
*/
|
|
244
|
+
onCall?: (record: CallRecord) => void;
|
|
271
245
|
}
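Reviewer note: the new `onCall` hook on `MediaLCRConfig` receives one record per settled media request. The sketch below shows one way a consumer might turn such a record into a log line. The interfaces are local stand-ins whose field names are copied from the record literals in the `createMediaLCR` hunk of `dist/index.js`; they are not ai-lcr's exported `CallRecord` type. `summarizeMediaCall` and the `errorClass` value used in the test are purely illustrative.

```ts
// Local stand-in types: field names follow the record literals in this diff,
// but these are NOT imported from ai-lcr.
interface AttemptSketch {
  provider: string;
  ok: boolean;
  latencyMs: number;
  errorClass?: string;
}

interface MediaCallRecordSketch {
  id: string;
  model: string;
  attempts: AttemptSketch[];
  winner: string | undefined;
  ok: boolean;
  failedOver: boolean;
  latencyMs: number;
  inputTokens: number; // always 0 for media records, per the doc comment
  outputTokens: number; // always 0 for media records, per the doc comment
  costUsd: number;
  baselineUsd: number;
}

// Hypothetical helper: one human-readable line per settled media request.
function summarizeMediaCall(r: MediaCallRecordSketch): string {
  const chain = r.attempts
    .map((a) => `${a.provider}:${a.ok ? "ok" : a.errorClass ?? "error"}`)
    .join(" > ");
  // Savings only make sense when a provider actually served the request.
  const saved = r.ok ? r.baselineUsd - r.costUsd : 0;
  return (
    `${r.model} winner=${r.winner ?? "none"} chain=[${chain}] ` +
    `cost=$${r.costUsd.toFixed(4)} saved=$${saved.toFixed(4)}`
  );
}
```

A handler of this shape could be passed as `onCall` if its parameter type is compatible with the exported `CallRecord`; that compatibility is an assumption to check against the published typings.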
|
|
272
246
|
interface MediaRunResult {
|
|
273
247
|
outputs: MediaOutput[];
|
|
@@ -275,11 +249,6 @@ interface MediaRunResult {
|
|
|
275
249
|
costCents: number;
|
|
276
250
|
estimated: boolean;
|
|
277
251
|
}
|
|
278
|
-
/**
|
|
279
|
-
* Build a media Least Cost Router. Returns `generate(modelId, input)` which
|
|
280
|
-
* tries providers cheapest-first and falls through on a retryable error —
|
|
281
|
-
* exactly the text LCR's contract, for image/video.
|
|
282
|
-
*/
|
|
283
252
|
declare function createMediaLCR(config: MediaLCRConfig): (modelId: string, input: Record<string, unknown>) => Promise<MediaRunResult>;
|
|
284
253
|
|
|
285
254
|
/**
|
|
@@ -367,6 +336,53 @@ interface RunwareMediaConfig {
|
|
|
367
336
|
}
|
|
368
337
|
declare function createRunwareMediaAdapter(config: RunwareMediaConfig): MediaAdapter;
|
|
369
338
|
|
|
339
|
+
/**
|
|
340
|
+
* fal media adapter — image (queue) + video (queue, async poll).
|
|
341
|
+
*
|
|
342
|
+
* fal serves every model through one async queue API, so a single submit→poll→
|
|
343
|
+
* fetch-result path covers both image and video. That is the whole reason this
|
|
344
|
+
* adapter exists: it is ai-lcr's first VIDEO-capable execution path. (The
|
|
345
|
+
* Runware adapter is image-only; the Kunavo one's video poll loop is unverified.)
|
|
346
|
+
*
|
|
347
|
+
* Implementation note: ai-art's fal adapter uses the `@fal-ai/client` SDK, but
|
|
348
|
+
* ai-lcr deliberately keeps zero provider SDKs — every adapter is raw `fetch`
|
|
349
|
+
* with an injectable `fetchImpl` for testing (see runware-media, kunavo-media).
|
|
350
|
+
* So this re-implements the three queue calls against fal's REST endpoints:
|
|
351
|
+
*
|
|
352
|
+
* 1. submit POST https://queue.fal.run/{model} → { request_id, status_url, response_url }
|
|
353
|
+
* 2. status GET {status_url} → { status: IN_QUEUE | IN_PROGRESS | COMPLETED }
|
|
354
|
+
* 3. result GET {response_url} → { images:[…] } | { video:{url} } | …
|
|
355
|
+
*
|
|
356
|
+
* We follow the `status_url` / `response_url` returned by submit rather than
|
|
357
|
+
* rebuilding them, which sidesteps fal's sub-path quirk (a model like
|
|
358
|
+
* `fal-ai/flux/schnell` submits to the full path but its status/result live
|
|
359
|
+
* under the `fal-ai/flux` base).
|
|
360
|
+
*
|
|
361
|
+
* Auth: fal uses `Authorization: Key {FAL_KEY}` (NOT Bearer).
|
|
362
|
+
*
|
|
363
|
+
* Cost: fal's queue result does not carry a per-call price, so cost is left to
|
|
364
|
+
* the router's normalized estimate (costCents stays undefined; `units` is the
|
|
365
|
+
* output count — one image, or one clip).
|
|
366
|
+
*/
|
|
367
|
+
|
|
368
|
+
interface FalMediaConfig {
|
|
369
|
+
apiKey: string;
|
|
370
|
+
/** Override for testing. Defaults to https://queue.fal.run. */
|
|
371
|
+
baseUrl?: string;
|
|
372
|
+
/** Video/job poll cadence (ms). Default 3000. */
|
|
373
|
+
pollIntervalMs?: number;
|
|
374
|
+
/** Max time to wait for a job before giving up (ms). Default 300000 (5m). */
|
|
375
|
+
pollTimeoutMs?: number;
|
|
376
|
+
/** Injected for testing; defaults to global fetch. */
|
|
377
|
+
fetchImpl?: typeof fetch;
|
|
378
|
+
}
|
|
379
|
+
declare function createFalMediaAdapter(config: FalMediaConfig): MediaAdapter;
|
|
380
|
+
/** Carries the HTTP status so the router's `isRetryableError` can classify it. */
|
|
381
|
+
declare class FalMediaError extends Error {
|
|
382
|
+
status: number;
|
|
383
|
+
constructor(status: number, body: string);
|
|
384
|
+
}
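Reviewer note: the doc comment for the fal adapter describes a submit, status, result sequence with `Authorization: Key {FAL_KEY}`. The sketch below isolates the two pure pieces of that flow: building the submit request and deciding what to do with a status string. `buildFalSubmit`, `nextQueueAction`, and the stand-in types are hypothetical names for illustration; the adapter's real internals are in the `dist/index.js` hunk.

```ts
// Local stand-in type; nothing here is imported from ai-lcr or a fal SDK.
interface FalSubmitRequestSketch {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper mirroring step 1 (submit) of the doc comment.
function buildFalSubmit(
  apiKey: string,
  model: string,
  input: Record<string, unknown>,
  baseUrl: string = "https://queue.fal.run",
): FalSubmitRequestSketch {
  return {
    url: `${baseUrl}/${model}`,
    method: "POST",
    headers: {
      "content-type": "application/json",
      authorization: `Key ${apiKey}`, // fal's scheme is "Key", not "Bearer"
    },
    body: JSON.stringify(input),
  };
}

// Hypothetical helper mirroring step 2 (status): only COMPLETED ends polling;
// IN_QUEUE, IN_PROGRESS, and anything unrecognized keep the loop going.
function nextQueueAction(status: string): "fetch-result" | "poll-again" {
  return status === "COMPLETED" ? "fetch-result" : "poll-again";
}
```

Following the `status_url` and `response_url` returned by the submit call, rather than rebuilding them from the model path, is what lets the adapter avoid the sub-path quirk the doc comment mentions.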
|
|
385
|
+
|
|
370
386
|
/**
|
|
371
387
|
* ai-lcr — Least Cost Routing for LLMs.
|
|
372
388
|
*
|
|
@@ -420,4 +436,4 @@ type LCRRouter = (modelName: string) => LanguageModelV3;
|
|
|
420
436
|
*/
|
|
421
437
|
declare function createLCR(config: LCRConfig): LCRRouter;
|
|
422
438
|
|
|
423
|
-
export { type CallRecord, type CostEvent, DEFAULT_REFERENCE,
|
|
439
|
+
export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, FalMediaError, type FormatOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, comparePrices, createFalMediaAdapter, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };
|
package/dist/index.js
CHANGED
|
@@ -102,20 +102,12 @@ var LcrFallbackModel = class {
|
|
|
102
102
|
errorClass: classifyError(error)
|
|
103
103
|
});
|
|
104
104
|
}
|
|
105
|
-
/** Cost of one route for the given token counts; 0 if it has no price. */
|
|
106
|
-
routeCost(p, inputTokens, outputTokens) {
|
|
107
|
-
return p.cost ? inputTokens / 1e6 * p.cost.input + outputTokens / 1e6 * p.cost.output : 0;
|
|
108
|
-
}
|
|
109
105
|
/** Winner settled: record the attempt, fire `onCost` (compat) + `onCall`. */
|
|
110
106
|
finalizeOk(ctx, provider, attemptStart, usage) {
|
|
111
107
|
ctx.attempts.push({ provider: provider.label, ok: true, latencyMs: Date.now() - attemptStart });
|
|
112
108
|
const inputTokens = usage?.inputTokens?.total ?? 0;
|
|
113
109
|
const outputTokens = usage?.outputTokens?.total ?? 0;
|
|
114
|
-
const costUsd =
|
|
115
|
-
const baselineUsd = this.opts.providers.reduce(
|
|
116
|
-
(max, p) => Math.max(max, this.routeCost(p, inputTokens, outputTokens)),
|
|
117
|
-
costUsd
|
|
118
|
-
);
|
|
110
|
+
const costUsd = provider.cost ? inputTokens / 1e6 * provider.cost.input + outputTokens / 1e6 * provider.cost.output : 0;
|
|
119
111
|
this.opts.onCost?.({
|
|
120
112
|
model: this.opts.modelName,
|
|
121
113
|
provider: provider.label,
|
|
@@ -133,8 +125,7 @@ var LcrFallbackModel = class {
|
|
|
133
125
|
latencyMs: Date.now() - ctx.startedAt,
|
|
134
126
|
inputTokens,
|
|
135
127
|
outputTokens,
|
|
136
|
-
costUsd
|
|
137
|
-
baselineUsd
|
|
128
|
+
costUsd
|
|
138
129
|
});
|
|
139
130
|
}
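Reviewer note: in 0.2.2 the `routeCost` helper and the `baselineUsd` reduction are gone from the text path, and `costUsd` is computed inline from the winning provider only. The arithmetic can be sketched as a standalone function. `TokenPriceSketch` and `tokenCostUsd` are illustrative names, and reading `cost.input` / `cost.output` as USD per one million tokens is an assumption drawn from the `1e6` divisor.

```ts
// Assumed shape: prices in USD per one million tokens (inferred from the 1e6 divisor).
interface TokenPriceSketch {
  input: number;
  output: number;
}

// Cost of one call for the given token counts; 0 when the route has no price.
function tokenCostUsd(
  inputTokens: number,
  outputTokens: number,
  cost?: TokenPriceSketch,
): number {
  if (!cost) return 0;
  return (inputTokens / 1e6) * cost.input + (outputTokens / 1e6) * cost.output;
}
```

For example, one million input tokens at 2 and half a million output tokens at 8 come to 2 + 4 = 6 under that per-million assumption.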
|
|
140
131
|
/** Every provider failed: fire `onCall` with no winner. */
|
|
@@ -149,8 +140,7 @@ var LcrFallbackModel = class {
|
|
|
149
140
|
latencyMs: Date.now() - ctx.startedAt,
|
|
150
141
|
inputTokens: 0,
|
|
151
142
|
outputTokens: 0,
|
|
152
|
-
costUsd: 0
|
|
153
|
-
baselineUsd: 0
|
|
143
|
+
costUsd: 0
|
|
154
144
|
});
|
|
155
145
|
}
|
|
156
146
|
async doGenerate(options) {
|
|
@@ -306,40 +296,6 @@ function formatCallRecord(record, opts = {}) {
|
|
|
306
296
|
return line;
|
|
307
297
|
}
|
|
308
298
|
|
|
309
|
-
// src/sink.ts
|
|
310
|
-
function createHttpSink(options) {
|
|
311
|
-
const {
|
|
312
|
-
url,
|
|
313
|
-
headers,
|
|
314
|
-
project,
|
|
315
|
-
dispatch = (task) => {
|
|
316
|
-
void task();
|
|
317
|
-
},
|
|
318
|
-
fetchImpl,
|
|
319
|
-
onError
|
|
320
|
-
} = options;
|
|
321
|
-
const doFetch = fetchImpl ?? globalThis.fetch;
|
|
322
|
-
return (record) => {
|
|
323
|
-
if (!doFetch) {
|
|
324
|
-
onError?.(new Error("ai-lcr: no fetch available for createHttpSink"));
|
|
325
|
-
return;
|
|
326
|
-
}
|
|
327
|
-
const payload = project ? { project, ...record } : record;
|
|
328
|
-
dispatch(async () => {
|
|
329
|
-
try {
|
|
330
|
-
await doFetch(url, {
|
|
331
|
-
method: "POST",
|
|
332
|
-
headers: { "content-type": "application/json", ...headers },
|
|
333
|
-
body: JSON.stringify(payload),
|
|
334
|
-
keepalive: true
|
|
335
|
-
});
|
|
336
|
-
} catch (err) {
|
|
337
|
-
onError?.(err);
|
|
338
|
-
}
|
|
339
|
-
});
|
|
340
|
-
};
|
|
341
|
-
}
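Reviewer note: `createHttpSink` is removed in this release, so callers that relied on it would need an equivalent `onCall` handler of their own. The sketch below restates the removed logic as a userland function. It is not part of ai-lcr: `makeHttpSink`, `FetchLike`, and `SinkOptionsSketch` are hypothetical names, and unlike the removed code it requires the caller to pass a fetch-like function instead of falling back to a global.

```ts
// Minimal fetch-like signature so the sketch does not depend on DOM or Node typings.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string; keepalive?: boolean },
) => Promise<unknown>;

interface SinkOptionsSketch {
  url: string;
  fetchImpl: FetchLike;
  headers?: Record<string, string>;
  project?: string;
  dispatch?: (task: () => void | Promise<void>) => void;
  onError?: (error: unknown) => void;
}

// Hypothetical userland replacement: returns a fire-and-forget `(record) => void`.
function makeHttpSink<R extends object>(options: SinkOptionsSketch): (record: R) => void {
  const { url, fetchImpl, headers, project, onError } = options;
  // Default: start the task immediately and do not await it.
  const dispatch: (task: () => void | Promise<void>) => void =
    options.dispatch ?? ((task) => { void task(); });
  return (record) => {
    // Optional project tag goes first, then the record fields.
    const payload = project ? { project, ...record } : record;
    dispatch(async () => {
      try {
        await fetchImpl(url, {
          method: "POST",
          headers: { "content-type": "application/json", ...headers },
          body: JSON.stringify(payload),
          keepalive: true,
        });
      } catch (err) {
        onError?.(err); // failures are swallowed unless a handler is given
      }
    });
  };
}
```

On serverless platforms the removed doc comment recommended passing a `waitUntil`-style function as `dispatch`, so the POST is not cut off when the function returns; the same advice would apply to a handler like this one.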
|
|
342
|
-
|
|
343
299
|
// src/media.ts
|
|
344
300
|
var DEFAULT_REFERENCE = {
|
|
345
301
|
image: { width: 1920, height: 1080 },
|
|
@@ -386,30 +342,75 @@ function comparePrices(registry, ref = DEFAULT_REFERENCE) {
|
|
|
386
342
|
};
|
|
387
343
|
});
|
|
388
344
|
}
|
|
345
|
+
function newMediaCallId() {
|
|
346
|
+
const c = globalThis.crypto;
|
|
347
|
+
return c?.randomUUID ? c.randomUUID() : `lcr_${Date.now().toString(36)}`;
|
|
348
|
+
}
|
|
389
349
|
function createMediaLCR(config) {
|
|
390
|
-
const { registry, adapters, reference = DEFAULT_REFERENCE, onError, onCost } = config;
|
|
350
|
+
const { registry, adapters, reference = DEFAULT_REFERENCE, onError, onCost, onCall } = config;
|
|
391
351
|
return async function generate(modelId, input) {
|
|
392
352
|
const def = registry[modelId];
|
|
393
353
|
if (!def) {
|
|
394
354
|
throw new Error(`ai-lcr: unknown media model "${modelId}" \u2014 add it to the registry`);
|
|
395
355
|
}
|
|
396
356
|
const ranked = rankRoutes(def, reference);
|
|
357
|
+
const baselineUsd = ranked.length > 0 ? Math.max(...ranked.map((r) => r.refCents)) / 100 : 0;
|
|
358
|
+
const startedAt = Date.now();
|
|
359
|
+
const attempts = [];
|
|
397
360
|
let lastErr;
|
|
361
|
+
const emitFail = () => onCall?.({
|
|
362
|
+
id: newMediaCallId(),
|
|
363
|
+
model: modelId,
|
|
364
|
+
attempts,
|
|
365
|
+
winner: void 0,
|
|
366
|
+
ok: false,
|
|
367
|
+
failedOver: attempts.length > 1,
|
|
368
|
+
latencyMs: Date.now() - startedAt,
|
|
369
|
+
inputTokens: 0,
|
|
370
|
+
outputTokens: 0,
|
|
371
|
+
costUsd: 0,
|
|
372
|
+
baselineUsd
|
|
373
|
+
});
|
|
398
374
|
for (const route of ranked) {
|
|
399
375
|
const adapter = adapters[route.provider];
|
|
400
376
|
if (!adapter) continue;
|
|
377
|
+
const attemptStart = Date.now();
|
|
401
378
|
try {
|
|
402
379
|
const result = await adapter.run({ externalId: route.externalId, input });
|
|
403
380
|
const estimated = result.costCents === void 0;
|
|
404
381
|
const costCents = estimated ? route.refCents * (result.units ?? 1) : result.costCents;
|
|
382
|
+
attempts.push({ provider: route.provider, ok: true, latencyMs: Date.now() - attemptStart });
|
|
405
383
|
onCost?.({ modelId, provider: route.provider, costCents, estimated });
|
|
384
|
+
onCall?.({
|
|
385
|
+
id: newMediaCallId(),
|
|
386
|
+
model: modelId,
|
|
387
|
+
attempts,
|
|
388
|
+
winner: route.provider,
|
|
389
|
+
ok: true,
|
|
390
|
+
failedOver: attempts.length > 1,
|
|
391
|
+
latencyMs: Date.now() - startedAt,
|
|
392
|
+
inputTokens: 0,
|
|
393
|
+
outputTokens: 0,
|
|
394
|
+
costUsd: costCents / 100,
|
|
395
|
+
baselineUsd
|
|
396
|
+
});
|
|
406
397
|
return { outputs: result.outputs, provider: route.provider, costCents, estimated };
|
|
407
398
|
} catch (err) {
|
|
408
399
|
lastErr = err;
|
|
400
|
+
attempts.push({
|
|
401
|
+
provider: route.provider,
|
|
402
|
+
ok: false,
|
|
403
|
+
latencyMs: Date.now() - attemptStart,
|
|
404
|
+
errorClass: classifyError(err)
|
|
405
|
+
});
|
|
409
406
|
onError?.(err, route.provider);
|
|
410
|
-
if (!isRetryableError(err))
|
|
407
|
+
if (!isRetryableError(err)) {
|
|
408
|
+
emitFail();
|
|
409
|
+
throw err;
|
|
410
|
+
}
|
|
411
411
|
}
|
|
412
412
|
}
|
|
413
|
+
emitFail();
|
|
413
414
|
throw lastErr instanceof Error ? lastErr : new Error(`ai-lcr: no provider could serve media model "${modelId}"`);
|
|
414
415
|
};
|
|
415
416
|
}
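Reviewer note: two pieces of arithmetic in the new `createMediaLCR` body are easy to miss in the diff: the baseline is the most expensive ranked route converted from cents to USD, and a missing adapter price falls back to the route's normalized price times the reported unit count. The sketch below restates both as standalone functions. The `*Sketch` types and function names are illustrative, not ai-lcr exports.

```ts
// Local stand-ins for the ranked-route and adapter-result fields used in the hunk.
interface RankedRouteSketch {
  provider: string;
  refCents: number; // normalized price for the reference output, in cents
}

interface AdapterResultSketch {
  costCents?: number; // provider-reported price, when available
  units?: number; // output count, e.g. number of images
}

// Highest normalized price among the ranked routes, in USD; 0 when there are none.
function baselineUsdOf(ranked: RankedRouteSketch[]): number {
  return ranked.length > 0 ? Math.max(...ranked.map((r) => r.refCents)) / 100 : 0;
}

// Real cost if the adapter reported one, otherwise an estimate from the route price.
function settleCost(
  route: RankedRouteSketch,
  result: AdapterResultSketch,
): { costCents: number; estimated: boolean } {
  const estimated = result.costCents === undefined;
  const costCents = result.costCents ?? route.refCents * (result.units ?? 1);
  return { costCents, estimated };
}
```

Since the fal adapter in this release returns `units` but no `costCents`, its calls would always take the estimated branch.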
|
|
@@ -692,6 +693,108 @@ var RunwareMediaError = class extends Error {
|
|
|
692
693
|
status;
|
|
693
694
|
};
|
|
694
695
|
|
|
696
|
+
// src/adapters/fal-media.ts
|
|
697
|
+
var DEFAULT_BASE3 = "https://queue.fal.run";
|
|
698
|
+
function extractOutputs(raw) {
|
|
699
|
+
if (!raw || typeof raw !== "object") return [];
|
|
700
|
+
const data = raw;
|
|
701
|
+
const out = [];
|
|
702
|
+
const pushUrl = (url, type) => {
|
|
703
|
+
if (typeof url === "string" && url.length > 0) out.push({ url, type });
|
|
704
|
+
};
|
|
705
|
+
if (Array.isArray(data.images)) {
|
|
706
|
+
for (const img of data.images) pushUrl(img?.url, "image");
|
|
707
|
+
}
|
|
708
|
+
pushUrl(data.image?.url, "image");
|
|
709
|
+
if (Array.isArray(data.videos)) {
|
|
710
|
+
for (const v of data.videos) pushUrl(v?.url, "video");
|
|
711
|
+
}
|
|
712
|
+
pushUrl(data.video?.url, "video");
|
|
713
|
+
return out;
|
|
714
|
+
}
|
|
715
|
+
function createFalMediaAdapter(config) {
|
|
716
|
+
const {
|
|
717
|
+
apiKey,
|
|
718
|
+
baseUrl = DEFAULT_BASE3,
|
|
719
|
+
pollIntervalMs = 3e3,
|
|
720
|
+
pollTimeoutMs = 3e5,
|
|
721
|
+
fetchImpl = fetch
|
|
722
|
+
} = config;
|
|
723
|
+
const headers = {
|
|
724
|
+
"content-type": "application/json",
|
|
725
|
+
authorization: `Key ${apiKey}`
|
|
726
|
+
};
|
|
727
|
+
return {
|
|
728
|
+
provider: "fal",
|
|
729
|
+
async run(req) {
|
|
730
|
+
const submitRes = await fetchImpl(`${baseUrl}/${req.externalId}`, {
|
|
731
|
+
method: "POST",
|
|
732
|
+
headers,
|
|
733
|
+
body: JSON.stringify(req.input)
|
|
734
|
+
});
|
|
735
|
+
if (!submitRes.ok) {
|
|
736
|
+
throw new FalMediaError(submitRes.status, await safeText2(submitRes));
|
|
737
|
+
}
|
|
738
|
+
const submit = await submitRes.json();
|
|
739
|
+
const statusUrl = submit.status_url;
|
|
740
|
+
const responseUrl = submit.response_url;
|
|
741
|
+
if (!statusUrl || !responseUrl) {
|
|
742
|
+
throw new Error(
|
|
743
|
+
`ai-lcr: fal submit for "${req.externalId}" returned no status/response URL (keys: ${Object.keys(
|
|
744
|
+
submit
|
|
745
|
+
).join(", ")})`
|
|
746
|
+
);
|
|
747
|
+
}
|
|
748
|
+
const deadline = Date.now() + pollTimeoutMs;
|
|
749
|
+
let completed = false;
|
|
750
|
+
while (Date.now() < deadline) {
|
|
751
|
+
const statusRes = await fetchImpl(statusUrl, { headers });
|
|
752
|
+
if (!statusRes.ok) {
|
|
753
|
+
throw new FalMediaError(statusRes.status, await safeText2(statusRes));
|
|
754
|
+
}
|
|
755
|
+
const status = String((await statusRes.json()).status ?? "");
|
|
756
|
+
if (status === "COMPLETED") {
|
|
757
|
+
completed = true;
|
|
758
|
+
break;
|
|
759
|
+
}
|
|
760
|
+
await sleep2(pollIntervalMs);
|
|
761
|
+
}
|
|
762
|
+
if (!completed) {
|
|
763
|
+
throw new Error(
|
|
764
|
+
`ai-lcr: fal job for "${req.externalId}" timed out after ${pollTimeoutMs}ms`
|
|
765
|
+
);
|
|
766
|
+
}
|
|
767
|
+
const resultRes = await fetchImpl(responseUrl, { headers });
|
|
768
|
+
if (!resultRes.ok) {
|
|
769
|
+
throw new FalMediaError(resultRes.status, await safeText2(resultRes));
|
|
770
|
+
}
|
|
771
|
+
const outputs = extractOutputs(await resultRes.json());
|
|
772
|
+
if (outputs.length === 0) {
|
|
773
|
+
throw new Error(`ai-lcr: fal returned no media URL for "${req.externalId}"`);
|
|
774
|
+
}
|
|
775
|
+
return { outputs, units: outputs.length };
|
|
776
|
+
}
|
|
777
|
+
};
|
|
778
|
+
}
|
|
779
|
+
var FalMediaError = class extends Error {
|
|
780
|
+
constructor(status, body) {
|
|
781
|
+
super(`fal media HTTP ${status}: ${body.slice(0, 300)}`);
|
|
782
|
+
this.status = status;
|
|
783
|
+
this.name = "FalMediaError";
|
|
784
|
+
}
|
|
785
|
+
status;
|
|
786
|
+
};
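Reviewer note: `FalMediaError` exists so that a router can read an HTTP status off the thrown error. The diff does not show how `isRetryableError` uses it, so the classifier below is a hypothetical rule (429 or any 5xx counts as retryable) chosen only to illustrate the pattern; it is not ai-lcr's logic. `HttpStatusErrorSketch` and `looksRetryable` are illustrative names.

```ts
// Stand-in for an adapter error that carries the HTTP status alongside the message.
class HttpStatusErrorSketch extends Error {
  readonly status: number;
  constructor(status: number, body: string) {
    // Truncate the body so a large error page does not flood logs.
    super(`HTTP ${status}: ${body.slice(0, 300)}`);
    this.status = status;
    this.name = "HttpStatusErrorSketch";
  }
}

// Hypothetical rule, NOT ai-lcr's isRetryableError: retry on 429 and 5xx only.
// Duck-typed on `status` so any error class exposing that field is handled.
function looksRetryable(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const status = (err as { status?: unknown }).status;
  return typeof status === "number" && (status === 429 || status >= 500);
}
```

Note that the timeout and missing-URL failures in the fal adapter are thrown as plain `Error` objects without a status, so a status-based rule like this one would treat them as non-retryable; whether the real router does the same is not visible in this diff.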
|
|
787
|
+
function sleep2(ms) {
|
|
788
|
+
return new Promise((r) => setTimeout(r, ms));
|
|
789
|
+
}
|
|
790
|
+
async function safeText2(res) {
|
|
791
|
+
try {
|
|
792
|
+
return await res.text();
|
|
793
|
+
} catch {
|
|
794
|
+
return "<no body>";
|
|
795
|
+
}
|
|
796
|
+
}
|
|
797
|
+
|
|
695
798
|
// src/index.ts
|
|
696
799
|
function isLanguageModel(entry) {
|
|
697
800
|
return typeof entry.doGenerate === "function";
|
|
@@ -734,11 +837,12 @@ function createLCR(config) {
|
|
|
734
837
|
}
|
|
735
838
|
export {
|
|
736
839
|
DEFAULT_REFERENCE,
|
|
840
|
+
FalMediaError,
|
|
737
841
|
MEDIA_PRICING,
|
|
738
842
|
cheapestRoute,
|
|
739
843
|
classifyError,
|
|
740
844
|
comparePrices,
|
|
741
|
-
|
|
845
|
+
createFalMediaAdapter,
|
|
742
846
|
createKunavoMediaAdapter,
|
|
743
847
|
createLCR,
|
|
744
848
|
createMediaLCR,
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "ai-lcr",
|
|
3
|
-
"version": "0.2.
|
|
3
|
+
"version": "0.2.2",
|
|
4
4
|
"description": "Least Cost Routing for LLMs — route every model call to the cheapest available provider, fall back automatically, and track real cost. Built for the Vercel AI SDK.",
|
|
5
5
|
"keywords": [
|
|
6
6
|
"ai",
|