ai-lcr 0.2.3 → 0.2.6
- package/CHANGELOG.md +65 -0
- package/README.md +3 -2
- package/README.zh-CN.md +3 -2
- package/dist/index.cjs +249 -23
- package/dist/index.d.cts +43 -1
- package/dist/index.d.ts +43 -1
- package/dist/index.js +248 -23
- package/package.json +1 -1
package/CHANGELOG.md
CHANGED
@@ -4,6 +4,69 @@ All notable changes to `ai-lcr` are documented here. The format follows
 [Keep a Changelog](https://keepachangelog.com/), and the project adheres to
 [Semantic Versioning](https://semver.org/).
 
+## [0.2.6] — 2026-06-01
+
+### Changed
+
+- **fal media adapter now covers image *and* video** via fal's async queue API
+  (submit → poll `status_url` → fetch `response_url`), replacing the synchronous
+  image-only `fal.run` adapter shipped in 0.2.5. This is ai-lcr's first working
+  **video** execution path: the registry already priced/routed the Veo family
+  but no adapter could run it. Same house style — raw `fetch`, injectable
+  `fetchImpl`, no provider SDK; `Authorization: Key` (not Bearer); cost left to
+  the router's normalized estimate (the queue result carries no per-call price).
+  Following the submit response's `status_url`/`response_url` sidesteps fal's
+  sub-path quirk (`fal-ai/flux/schnell` submits to the full path, but status and
+  result live under the `fal-ai/flux` base). `createFalMediaAdapter`'s public
+  name is unchanged; image callers are unaffected.
+
+## [0.2.5] — 2026-06-01
+
+Pre-launch failover-robustness + media-provider pass — closing cases where a
+real provider failure slipped past the switch criterion and killed the request,
+and making fal a live failover target.
+
+### Fixed
+
+- **A network-unreachable provider didn't fail over.** `isRetryableError` only
+  matched HTTP statuses and English keywords, but a provider that's down throws
+  a `fetch` `TypeError` with *no* status — and wraps the real cause
+  (`ECONNREFUSED`/`ECONNRESET`/`ENOTFOUND`/connect-timeout, with the Node `code`)
+  in `error.cause`. Those read as a non-retryable client error, so the cheapest
+  provider going down killed the request instead of falling over — the most
+  common outage mode. The engine now walks the `cause` chain and treats Node
+  network codes / transport-failure messages as retryable. Applies to both the
+  text and media routers. New exported helper `isNetworkError`.
+- **Non-English billing failures didn't fail over.** Out-of-credit detection was
+  English-only, but Chinese providers (e.g. Kunavo) report a failed charge as
+  `余额不足`/`账户欠费`/`扣费失败` in a 200/400 body with no billing status.
+  Those are now matched (plus `balance`/`exhausted`), so a failed charge fails
+  over and is tagged `billing` by `classifyErrorKind` for alerting.
+- **An out-of-balance 403 was mis-tagged as `auth`.** Providers report an
+  exhausted account as 403 (e.g. fal "exhausted balance") — a top-up problem,
+  not a revoked key. `classifyErrorKind` now lets billing wording win over a
+  bare 401/403 status, so it's tagged `billing` (a plain 403 stays `auth`).
+- **A throwing observer could fail a successful request.** `onCost`/`onCall`/
+  `onError` were invoked unguarded; a logging sink that threw (e.g. a flaky db9
+  write) turned an otherwise-successful generation into a thrown error. All
+  observer callbacks are now fire-and-forget — wrapped so a throw can never
+  affect routing or the request outcome. Applies to both routers.
+
+### Added
+
+- **fal media adapter** (`createFalMediaAdapter`). fal was in the price table
+  but had no adapter, so its routes were silently skipped at runtime — now it's
+  a real cheapest-first / failover target for image models. Synchronous
+  `https://fal.run/<model>` with `Authorization: Key`, generic input pass-
+  through, HTTP-status-bearing errors (403 out-of-balance → fails over; 422 bad
+  input → doesn't). Image only; fal video (queue) is on the roadmap.
+- **Status-page liveness probes for Runware + fal** (`website`). Both are now
+  monitored with a free, generation-free reachability probe: Runware's `ping`
+  task (→ `pong`, 0 cost) and fal's `GET /v1/account/billing` (2xx ⇒ endpoint up
+  + key valid). Generalized via a new `ReachProbe` so a "reachable" check can
+  hit a provider-specific free endpoint instead of `GET /v1/models`. Requires
+  `RUNWARE_API_KEY` and `FAL_KEY` env vars to be set.
+
 ## [0.2.3] — 2026-06-01
 
 Release-quality and engine-correctness pass.
@@ -57,4 +120,6 @@ Release-quality and engine-correctness pass.
 - Dual ESM/CJS build. Media (image/video) least-cost routing with the Runware
   and Kunavo adapters; cap-aware failover for the text router.
 
+[0.2.6]: https://github.com/victorzhrn/ai-lcr/releases/tag/v0.2.6
+[0.2.5]: https://github.com/victorzhrn/ai-lcr/releases/tag/v0.2.5
 [0.2.3]: https://github.com/victorzhrn/ai-lcr/releases/tag/v0.2.3
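The 0.2.5 fix for out-of-balance 403s comes down to the order of two checks. The following is a minimal sketch of that ordering, not the package's code: `classifyKind` and the shortened `BILLING_WORDS` list are hypothetical names introduced for illustration, and the sample error messages are made up.

```javascript
// Hypothetical, shortened pattern list; the package's own list is longer.
const BILLING_WORDS = ["insufficient", "balance", "exhausted", "余额", "欠费"];
const AUTH_STATUS = new Set([401, 403]);

// Illustrative stand-in for the ordering described in the changelog:
// billing wording is checked before the bare 401/403 status.
function classifyKind(error) {
  const status = error?.statusCode ?? error?.status;
  const text = String(error?.message ?? "").toLowerCase();
  if (status === 402 || BILLING_WORDS.some((w) => text.includes(w))) return "billing";
  if (typeof status === "number" && AUTH_STATUS.has(status)) return "auth";
  return "other";
}

// Made-up sample errors for the three cases the changelog mentions.
const outOfBalance = Object.assign(new Error("User is locked: Exhausted balance"), { status: 403 });
const revokedKey = Object.assign(new Error("Forbidden"), { status: 403 });
const failedCharge = Object.assign(new Error("余额不足"), { status: 400 });

console.log(classifyKind(outOfBalance)); // billing
console.log(classifyKind(revokedKey)); // auth
console.log(classifyKind(failedCharge)); // billing
```

With the two checks swapped, the first case would come back as `auth`, which is the mis-tagging the changelog entry describes.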
package/README.md
CHANGED
@@ -156,7 +156,7 @@ Any OpenAI-compatible endpoint works — and so does any AI SDK provider package
 
 - **Model vendors' own APIs (native):** route straight to [DeepSeek](https://platform.deepseek.com), [OpenAI](https://openai.com), [Anthropic](https://anthropic.com), [Google](https://ai.google.dev), [xAI](https://x.ai), etc. via their AI SDK provider packages — no markup, full native features. See [Route to a model vendor's own API](#route-to-a-model-vendors-own-api-native-providers).
 - **Text aggregators:** [OpenRouter](https://openrouter.ai) (widest coverage, list pricing) · [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off** every model) · [TokenMart](https://thetokenmart.ai) (15–65% off, varies by model)
-- **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) —
+- **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) — routing via `createMediaLCR`. Image: Kunavo + Runware + fal. Video: fal (live, via its async queue API); Kunavo's Veo poll path is implemented but unverified
 
 ## Text model pricing
 
@@ -273,7 +273,8 @@ Two OpenAI-compatible providers, same probe, same day. Cells cover both families
 - [ ] Bundled price table for zero-config pricing (drop the manual `cost` numbers)
 - [ ] Provider-quirk middleware (transparently patch known per-provider request quirks, e.g. Kunavo's ignored `max_tokens`)
 - [ ] Feed probe results into routing automatically (auto-exclude a model from a provider that fails its probe)
-- [
+- [x] Image & video model routing (`createMediaLCR`) — image via Kunavo + Runware + fal; **video live via fal** (async queue API)
+- [ ] Normalized cross-provider video price comparison + verified Kunavo/Runware video adapters
 
 ## Affiliate disclosure
 
package/README.zh-CN.md
CHANGED
@@ -114,7 +114,7 @@ const lcr = createLCR({
 
 - **模型厂商官方 API(原生):** 通过各自的 AI SDK provider 包直连 [DeepSeek](https://platform.deepseek.com)、[OpenAI](https://openai.com)、[Anthropic](https://anthropic.com)、[Google](https://ai.google.dev)、[xAI](https://x.ai) 等——无加价,原生特性齐全。见上方「直连模型厂商官方 API(原生 provider)」一节。
 - **文本聚合器:** [OpenRouter](https://openrouter.ai)(覆盖最广,列表定价)· [Kunavo](https://kunavo.com/?ref=victorimf)(**全模型 8 折**)· [TokenMart](https://thetokenmart.ai)(按模型 85 折–35 折不等)
-- **图像 / 视频:** [Kunavo](https://kunavo.com/?ref=victorimf)(**8 折**)· [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) ——
+- **图像 / 视频:** [Kunavo](https://kunavo.com/?ref=victorimf)(**8 折**)· [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) —— 通过 `createMediaLCR` 路由。图像:Kunavo + Runware + fal。视频:fal(已可用,走其异步队列 API);Kunavo 的 Veo 轮询路径已实现但未验证
 
 ## 文本模型价格
 
@@ -229,7 +229,8 @@ API_KEY=$TOKENMART_API_KEY BASE=https://api.tokenmart.ai \
 - [ ] 内置价格表,实现零配置定价(省去手填 `cost` 数字)
 - [ ] provider 怪癖中间件(透明地修补已知怪癖,如 Kunavo 被忽略的 `max_tokens`)
 - [ ] 把 probe 结果自动接入路由(探测失败的 provider×model 自动从列表剔除)
-- [
+- [x] 图像与视频模型路由(`createMediaLCR`)—— 图像走 Kunavo + Runware + fal;**视频已可用,走 fal**(异步队列 API)
+- [ ] 归一化的跨 provider 视频价格对比 + 验证 Kunavo/Runware 视频适配器
 
 ## 联盟(Affiliate)披露
 
package/dist/index.cjs
CHANGED
@@ -26,6 +26,7 @@ __export(index_exports, {
   classifyError: () => classifyError,
   classifyErrorKind: () => classifyErrorKind,
   comparePrices: () => comparePrices,
+  createFalMediaAdapter: () => createFalMediaAdapter,
   createKunavoMediaAdapter: () => createKunavoMediaAdapter,
   createLCR: () => createLCR,
   createMediaLCR: () => createMediaLCR,
@@ -57,43 +58,126 @@ var RETRYABLE_PATTERNS = [
   "504",
   "429",
   // Billing caps — a capped provider should fall over, not kill the request.
+  // Include non-English wording: Chinese providers (e.g. Kunavo) report a failed
+  // charge as "余额不足"/"账户欠费"/"扣费失败" with a 200/400 body, which no
+  // English keyword and no HTTP status would catch — so without these a billing
+  // failure would die instead of failing over, the exact opposite of what we want.
   "insufficient",
   "credit",
   "quota",
   "billing",
-  "payment required"
+  "payment required",
+  "balance",
+  "\u4F59\u989D",
+  "\u6B20\u8D39",
+  "\u6263\u8D39",
+  "\u6263\u6B3E"
 ];
+var NETWORK_CODES = /* @__PURE__ */ new Set([
+  "ECONNREFUSED",
+  "ECONNRESET",
+  "ECONNABORTED",
+  "ENOTFOUND",
+  "EAI_AGAIN",
+  "ETIMEDOUT",
+  "EPIPE",
+  "EHOSTUNREACH",
+  "ENETUNREACH",
+  "EPROTO",
+  "UND_ERR_SOCKET",
+  "UND_ERR_CONNECT_TIMEOUT",
+  "UND_ERR_HEADERS_TIMEOUT",
+  "UND_ERR_BODY_TIMEOUT"
+]);
+var NETWORK_PATTERNS = [
+  "fetch failed",
+  "failed to fetch",
+  "socket hang up",
+  "socket disconnected",
+  "econnrefused",
+  "econnreset",
+  "enotfound",
+  "etimedout",
+  "ehostunreach",
+  "enetunreach",
+  "eai_again",
+  "getaddrinfo",
+  "connect timeout",
+  "connection refused",
+  "connection reset",
+  "connection error",
+  "network error",
+  "dns"
+];
+function safeStringify(value) {
+  try {
+    return JSON.stringify(value) ?? "";
+  } catch {
+    return String(value);
+  }
+}
+function errorSignals(error) {
+  const parts = [];
+  const codes = [];
+  const seen = /* @__PURE__ */ new Set();
+  let cur = error;
+  for (let depth = 0; depth < 6 && cur && typeof cur === "object" && !seen.has(cur); depth++) {
+    seen.add(cur);
+    const e = cur;
+    if (typeof e.message === "string") parts.push(e.message);
+    if (typeof e.name === "string") parts.push(e.name);
+    if (typeof e.code === "string") {
+      parts.push(e.code);
+      codes.push(e.code);
+    }
+    cur = e.cause;
+  }
+  if (parts.length === 0) parts.push(safeStringify(error));
+  return { text: parts.join(" ").toLowerCase(), codes };
+}
+function isNetworkError(error) {
+  const { text, codes } = errorSignals(error);
+  if (codes.some((c) => NETWORK_CODES.has(c))) return true;
+  return NETWORK_PATTERNS.some((p) => text.includes(p));
+}
 function isRetryableError(error) {
   const e = error;
   const status = e?.statusCode ?? e?.status;
   if (typeof status === "number" && (RETRYABLE_STATUS.has(status) || status > 500)) {
     return true;
   }
-
+  if (isNetworkError(error)) return true;
+  const { text } = errorSignals(error);
   return RETRYABLE_PATTERNS.some((p) => text.includes(p));
 }
-function safeStringify(value) {
-  try {
-    return JSON.stringify(value) ?? "";
-  } catch {
-    return String(value);
-  }
-}
 function classifyError(error) {
   const e = error;
   const status = e?.statusCode ?? e?.status;
   if (typeof status === "number") return String(status);
-
+  if (isNetworkError(error)) return "network";
+  const { text } = errorSignals(error);
   return RETRYABLE_PATTERNS.find((p) => text.includes(p)) ?? "error";
 }
 var AUTH_STATUS = /* @__PURE__ */ new Set([401, 403]);
-var BILLING_PATTERNS = [
+var BILLING_PATTERNS = [
+  "insufficient",
+  "credit",
+  "quota",
+  "billing",
+  "payment required",
+  "balance",
+  "exhausted",
+  "\u4F59\u989D",
+  "\u6B20\u8D39",
+  "\u6263\u8D39",
+  "\u6263\u6B3E"
+];
 function classifyErrorKind(error) {
   const e = error;
   const status = e?.statusCode ?? e?.status;
-  const text = (
-  if (typeof status === "number" && AUTH_STATUS.has(status)) return "auth";
+  const { text } = errorSignals(error);
   if (status === 402 || BILLING_PATTERNS.some((p) => text.includes(p))) return "billing";
+  if (typeof status === "number" && AUTH_STATUS.has(status)) return "auth";
   return isRetryableError(error) ? "transient" : "client";
 }
 var callSeq = 0;
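To make the `cause`-chain walk in the hunk above concrete, here is a small standalone sketch. It is an illustration under stated assumptions rather than the package's code: `collectCodes`, `looksLikeNetworkFailure` and the shortened `CODES` set are hypothetical names, and the error objects are built by hand to imitate the shape the changelog describes (a `TypeError` with no status whose real reason sits in `error.cause`).

```javascript
// Hypothetical subset of Node network codes, for illustration only.
const CODES = new Set(["ECONNREFUSED", "ECONNRESET", "ENOTFOUND", "ETIMEDOUT"]);

// Walk error -> error.cause -> ... and collect every string `code` found.
// The depth cap and the `seen` set both guard against cyclic cause chains.
function collectCodes(error, maxDepth = 6) {
  const codes = [];
  const seen = new Set();
  let cur = error;
  for (let depth = 0; depth < maxDepth && cur && typeof cur === "object" && !seen.has(cur); depth++) {
    seen.add(cur);
    if (typeof cur.code === "string") codes.push(cur.code);
    cur = cur.cause;
  }
  return codes;
}

function looksLikeNetworkFailure(error) {
  return collectCodes(error).some((c) => CODES.has(c));
}

// Hand-built imitation of a refused connection: the outer error has no
// status and no code; the inner cause carries the Node `code`.
const refused = Object.assign(new Error("connect ECONNREFUSED 127.0.0.1:9"), { code: "ECONNREFUSED" });
const fetchFailure = new TypeError("fetch failed", { cause: refused });

// A cyclic chain: a -> b -> a. The walk stops instead of looping forever.
const a = new Error("a");
const b = new Error("b", { cause: a });
a.cause = b;

console.log(looksLikeNetworkFailure(fetchFailure)); // true
console.log(looksLikeNetworkFailure(new Error("bad input"))); // false
console.log(collectCodes(a)); // []
```

Checking only the top-level error would miss the first case, since the outer `TypeError` carries neither a status nor a code of its own.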
@@ -162,6 +246,27 @@ var LcrFallbackModel = class {
   shouldRetry(error) {
     return (this.opts.shouldRetry ?? isRetryableError)(error);
   }
+  // Observer callbacks are caller-supplied logging hooks: a throw from one of
+  // them must NEVER turn a successful (or already-failed) request into a
+  // different outcome. Swallow anything they throw — they are fire-and-forget.
+  emitError(error, provider) {
+    try {
+      this.opts.onError?.(error, provider);
+    } catch {
+    }
+  }
+  emitCost(event) {
+    try {
+      this.opts.onCost?.(event);
+    } catch {
+    }
+  }
+  emitCall(record) {
+    try {
+      this.opts.onCall?.(record);
+    } catch {
+    }
+  }
   startCall() {
     return { id: newCallId(), attempts: [], startedAt: Date.now() };
   }
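The guarded `emit*` methods in the hunk above share one pattern, which can be sketched in isolation. This is a minimal illustration, assuming a hypothetical `guard` helper and a made-up `generateWithObserver` function; neither name comes from the package.

```javascript
// Wrap a caller-supplied observer so that anything it throws is swallowed.
// `guard` is a hypothetical helper name used only for this sketch.
function guard(observer) {
  return (...args) => {
    try {
      observer?.(...args);
    } catch {
      // Deliberately ignored: logging must not change the request outcome.
    }
  };
}

// A logging sink that records the event and then fails.
const seen = [];
const flakySink = (event) => {
  seen.push(event.provider);
  throw new Error("log write failed");
};

// Made-up stand-in for a request path that reports cost to an observer.
function generateWithObserver(onCost) {
  const result = { text: "ok" };
  guard(onCost)({ provider: "demo", costUsd: 0.001 });
  return result; // still returned even though the sink threw
}

const out = generateWithObserver(flakySink);
console.log(out.text); // ok
console.log(seen.length); // 1
```

One limit of this sketch: a synchronous `try`/`catch` only covers observers that throw synchronously. An `async` observer that rejects returns a promise the wrapper never awaits, so that rejection would not be swallowed here.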
@@ -181,14 +286,14 @@ var LcrFallbackModel = class {
     const inputTokens = usage?.inputTokens?.total ?? 0;
     const outputTokens = usage?.outputTokens?.total ?? 0;
     const costUsd = provider.cost ? inputTokens / 1e6 * provider.cost.input + outputTokens / 1e6 * provider.cost.output : 0;
-    this.
+    this.emitCost({
       model: this.opts.modelName,
       provider: provider.label,
       inputTokens,
       outputTokens,
       costUsd
     });
-    this.
+    this.emitCall({
       id: ctx.id,
       model: this.opts.modelName,
       attempts: ctx.attempts,
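The `costUsd` expression in the hunk above divides token counts by 1e6, which suggests the `cost.input` and `cost.output` numbers are prices per million tokens. A worked example under that reading follows; the standalone `costUsd` function and the prices are hypothetical illustration values, not real provider rates.

```javascript
// Per-million-token pricing, mirroring the shape of the expression above.
// The standalone function and the prices are illustration-only assumptions.
function costUsd(usage, cost) {
  if (!cost) return 0; // no price configured: cost is reported as 0
  return (usage.inputTokens / 1e6) * cost.input + (usage.outputTokens / 1e6) * cost.output;
}

const price = { input: 2, output: 8 }; // assumed: USD per 1M tokens

// 500,000 input tokens  -> 0.5  * 2 = 1
// 250,000 output tokens -> 0.25 * 8 = 2
console.log(costUsd({ inputTokens: 500000, outputTokens: 250000 }, price)); // 3
console.log(costUsd({ inputTokens: 1200, outputTokens: 300 }, undefined)); // 0
```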
@@ -203,7 +308,7 @@ var LcrFallbackModel = class {
   }
   /** Every provider failed: fire `onCall` with no winner. */
   finalizeFail(ctx) {
-    this.
+    this.emitCall({
       id: ctx.id,
       model: this.opts.modelName,
       attempts: ctx.attempts,
@@ -238,7 +343,7 @@ var LcrFallbackModel = class {
           this.finalizeFail(ctx);
           throw error;
         }
-        this.
+        this.emitError(error, provider.label);
         this.recordFail(ctx, provider, attemptStart, error);
       }
     }
@@ -274,7 +379,7 @@ var LcrFallbackModel = class {
           this.finalizeFail(ctx);
           throw error;
         }
-        this.
+        this.emitError(error, serving.label);
         this.recordFail(ctx, serving, servingStart, error);
         tried++;
         if (tried >= n) {
@@ -310,7 +415,7 @@ var LcrFallbackModel = class {
             self.finalizeOk(ctx, servingProvider, servingAttemptStart, usage);
             controller.close();
           } catch (error) {
-            self.
+            self.emitError(error, servingProvider.label);
             self.recordFail(ctx, servingProvider, servingAttemptStart, error);
             if (!streamedAny) {
               const nextTried = triedBeforeServing + 1;
@@ -434,6 +539,24 @@ function newMediaCallId() {
 }
 function createMediaLCR(config) {
   const { registry, adapters, reference = DEFAULT_REFERENCE, onError, onCost, onCall } = config;
+  const safeError = (error, provider) => {
+    try {
+      onError?.(error, provider);
+    } catch {
+    }
+  };
+  const safeCost = (event) => {
+    try {
+      onCost?.(event);
+    } catch {
+    }
+  };
+  const safeCall = (record) => {
+    try {
+      onCall?.(record);
+    } catch {
+    }
+  };
   return async function generate(modelId, input) {
     const def = registry[modelId];
     if (!def) {
@@ -444,7 +567,7 @@ function createMediaLCR(config) {
     const startedAt = Date.now();
     const attempts = [];
     let lastErr;
-    const emitFail = () =>
+    const emitFail = () => safeCall({
       id: newMediaCallId(),
       model: modelId,
       attempts,
@@ -466,8 +589,8 @@ function createMediaLCR(config) {
         const estimated = result.costCents === void 0;
         const costCents = estimated ? route.refCents * (result.units ?? 1) : result.costCents;
         attempts.push({ provider: route.provider, ok: true, latencyMs: Date.now() - attemptStart });
-
-
+        safeCost({ modelId, provider: route.provider, costCents, estimated });
+        safeCall({
           id: newMediaCallId(),
           model: modelId,
           attempts,
@@ -489,7 +612,7 @@ function createMediaLCR(config) {
           latencyMs: Date.now() - attemptStart,
           errorClass: classifyError(err)
         });
-
+        safeError(err, route.provider);
         if (!isRetryableError(err)) {
           emitFail();
           throw err;
@@ -779,6 +902,108 @@ var RunwareMediaError = class extends Error {
   status;
 };
 
+// src/adapters/fal-media.ts
+var DEFAULT_BASE3 = "https://queue.fal.run";
+function extractOutputs(raw) {
+  if (!raw || typeof raw !== "object") return [];
+  const data = raw;
+  const out = [];
+  const pushUrl = (url, type) => {
+    if (typeof url === "string" && url.length > 0) out.push({ url, type });
+  };
+  if (Array.isArray(data.images)) {
+    for (const img of data.images) pushUrl(img?.url, "image");
+  }
+  pushUrl(data.image?.url, "image");
+  if (Array.isArray(data.videos)) {
+    for (const v of data.videos) pushUrl(v?.url, "video");
+  }
+  pushUrl(data.video?.url, "video");
+  return out;
+}
+function createFalMediaAdapter(config) {
+  const {
+    apiKey,
+    baseUrl = DEFAULT_BASE3,
+    pollIntervalMs = 3e3,
+    pollTimeoutMs = 3e5,
+    fetchImpl = fetch
+  } = config;
+  const headers = {
+    "content-type": "application/json",
+    authorization: `Key ${apiKey}`
+  };
+  return {
+    provider: "fal",
+    async run(req) {
+      const submitRes = await fetchImpl(`${baseUrl}/${req.externalId}`, {
+        method: "POST",
+        headers,
+        body: JSON.stringify(req.input)
+      });
+      if (!submitRes.ok) {
+        throw new FalMediaError(submitRes.status, await safeText2(submitRes));
+      }
+      const submit = await submitRes.json();
+      const statusUrl = submit.status_url;
+      const responseUrl = submit.response_url;
+      if (!statusUrl || !responseUrl) {
+        throw new Error(
+          `ai-lcr: fal submit for "${req.externalId}" returned no status/response URL (keys: ${Object.keys(
+            submit
+          ).join(", ")})`
+        );
+      }
+      const deadline = Date.now() + pollTimeoutMs;
+      let completed = false;
+      while (Date.now() < deadline) {
+        const statusRes = await fetchImpl(statusUrl, { headers });
+        if (!statusRes.ok) {
+          throw new FalMediaError(statusRes.status, await safeText2(statusRes));
+        }
+        const status = String((await statusRes.json()).status ?? "");
+        if (status === "COMPLETED") {
+          completed = true;
+          break;
+        }
+        await sleep2(pollIntervalMs);
+      }
+      if (!completed) {
+        throw new Error(
+          `ai-lcr: fal job for "${req.externalId}" timed out after ${pollTimeoutMs}ms`
+        );
+      }
+      const resultRes = await fetchImpl(responseUrl, { headers });
+      if (!resultRes.ok) {
+        throw new FalMediaError(resultRes.status, await safeText2(resultRes));
+      }
+      const outputs = extractOutputs(await resultRes.json());
+      if (outputs.length === 0) {
+        throw new Error(`ai-lcr: fal returned no media URL for "${req.externalId}"`);
+      }
+      return { outputs, units: outputs.length };
+    }
+  };
+}
+var FalMediaError = class extends Error {
+  constructor(status, body) {
+    super(`fal media HTTP ${status}: ${body.slice(0, 300)}`);
+    this.status = status;
+    this.name = "FalMediaError";
+  }
+  status;
+};
+function sleep2(ms) {
+  return new Promise((r) => setTimeout(r, ms));
+}
+async function safeText2(res) {
+  try {
+    return await res.text();
+  } catch {
+    return "<no body>";
+  }
+}
+
 // src/index.ts
 function isLanguageModel(entry) {
   return typeof entry.doGenerate === "function";
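The adapter in the hunk above takes an injectable `fetchImpl`, which makes the submit, poll, fetch-result sequence easy to exercise without a network. The sketch below re-creates that flow in a standalone form so it can run on its own; `runQueueJob` and `makeStubFetch` are hypothetical names, and every URL and payload is made up. It is a simplified illustration, not the package's adapter.

```javascript
// Minimal sketch of a submit -> poll -> fetch-result loop against a queue API.
// It follows the URLs returned by the submit call instead of rebuilding them.
async function runQueueJob({ submitUrl, input, headers, fetchImpl, pollIntervalMs = 10, pollTimeoutMs = 1000 }) {
  const submitRes = await fetchImpl(submitUrl, { method: "POST", headers, body: JSON.stringify(input) });
  if (!submitRes.ok) throw new Error(`submit failed: HTTP ${submitRes.status}`);
  const { status_url: statusUrl, response_url: responseUrl } = await submitRes.json();
  if (!statusUrl || !responseUrl) throw new Error("submit returned no status/response URL");

  const deadline = Date.now() + pollTimeoutMs;
  while (Date.now() < deadline) {
    const statusRes = await fetchImpl(statusUrl, { headers });
    if (!statusRes.ok) throw new Error(`status failed: HTTP ${statusRes.status}`);
    const { status } = await statusRes.json();
    if (status === "COMPLETED") {
      const resultRes = await fetchImpl(responseUrl, { headers });
      if (!resultRes.ok) throw new Error(`result failed: HTTP ${resultRes.status}`);
      return resultRes.json();
    }
    await new Promise((r) => setTimeout(r, pollIntervalMs));
  }
  throw new Error(`job timed out after ${pollTimeoutMs}ms`);
}

// Stub fetch with made-up URLs and payloads, so nothing touches the network.
// The status endpoint reports IN_PROGRESS twice, then COMPLETED.
function makeStubFetch() {
  const calls = [];
  let polls = 0;
  const json = (body) => ({ ok: true, status: 200, json: async () => body });
  const fetchImpl = async (url, init = {}) => {
    calls.push(`${init.method ?? "GET"} ${url}`);
    if (init.method === "POST") {
      return json({
        request_id: "r1",
        status_url: "https://queue.example/base/requests/r1/status",
        response_url: "https://queue.example/base/requests/r1",
      });
    }
    if (url.endsWith("/status")) return json({ status: ++polls < 3 ? "IN_PROGRESS" : "COMPLETED" });
    return json({ video: { url: "https://cdn.example/clip.mp4" } });
  };
  return { fetchImpl, calls };
}

const stub = makeStubFetch();
const done = runQueueJob({
  submitUrl: "https://queue.example/base/variant",
  input: { prompt: "a red kite" },
  headers: { authorization: "Key TEST" },
  fetchImpl: stub.fetchImpl,
});
done.then((result) => console.log(result.video.url, stub.calls.length));
```

In the stub, the submit goes to `/base/variant` while the status and result URLs live under `/base`, which imitates the sub-path quirk the changelog mentions. Because the loop only ever uses the URLs handed back by submit, it never needs to know about that difference.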
@@ -827,6 +1052,7 @@ function createLCR(config) {
   classifyError,
   classifyErrorKind,
   comparePrices,
+  createFalMediaAdapter,
   createKunavoMediaAdapter,
   createLCR,
   createMediaLCR,
package/dist/index.d.cts
CHANGED
@@ -358,6 +358,48 @@ interface RunwareMediaConfig {
 }
 declare function createRunwareMediaAdapter(config: RunwareMediaConfig): MediaAdapter;
 
+/**
+ * fal media adapter — image (queue) + video (queue, async poll).
+ *
+ * fal serves every model through one async queue API, so a single submit→poll→
+ * fetch-result path covers both image and video. That is the whole reason this
+ * adapter exists: it is ai-lcr's first VIDEO-capable execution path. (The
+ * Runware adapter is image-only; the Kunavo one's video poll loop is unverified.)
+ *
+ * Implementation note: ai-art's fal adapter uses the `@fal-ai/client` SDK, but
+ * ai-lcr deliberately keeps zero provider SDKs — every adapter is raw `fetch`
+ * with an injectable `fetchImpl` for testing (see runware-media, kunavo-media).
+ * So this re-implements the three queue calls against fal's REST endpoints:
+ *
+ *   1. submit  POST https://queue.fal.run/{model}  → { request_id, status_url, response_url }
+ *   2. status  GET  {status_url}                   → { status: IN_QUEUE | IN_PROGRESS | COMPLETED }
+ *   3. result  GET  {response_url}                 → { images:[…] } | { video:{url} } | …
+ *
+ * We follow the `status_url` / `response_url` returned by submit rather than
+ * rebuilding them, which sidesteps fal's sub-path quirk (a model like
+ * `fal-ai/flux/schnell` submits to the full path but its status/result live
+ * under the `fal-ai/flux` base).
+ *
+ * Auth: fal uses `Authorization: Key {FAL_KEY}` (NOT Bearer).
+ *
+ * Cost: fal's queue result does not carry a per-call price, so cost is left to
+ * the router's normalized estimate (costCents stays undefined; `units` is the
+ * output count — one image, or one clip).
+ */
+
+interface FalMediaConfig {
+    apiKey: string;
+    /** Override for testing. Defaults to https://queue.fal.run. */
+    baseUrl?: string;
+    /** Video/job poll cadence (ms). Default 3000. */
+    pollIntervalMs?: number;
+    /** Max time to wait for a job before giving up (ms). Default 300000 (5m). */
+    pollTimeoutMs?: number;
+    /** Injected for testing; defaults to global fetch. */
+    fetchImpl?: typeof fetch;
+}
+declare function createFalMediaAdapter(config: FalMediaConfig): MediaAdapter;
+
 /**
  * ai-lcr — Least Cost Routing for LLMs.
  *
@@ -411,4 +453,4 @@ type LCRRouter = (modelName: string) => LanguageModelV3;
  */
 declare function createLCR(config: LCRConfig): LCRRouter;
 
-export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };
+export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createFalMediaAdapter, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };
package/dist/index.d.ts
CHANGED
|
@@ -358,6 +358,48 @@ interface RunwareMediaConfig {
|
|
|
358
358
|
}
|
|
359
359
|
declare function createRunwareMediaAdapter(config: RunwareMediaConfig): MediaAdapter;
|
|
360
360
|
|
|
361
|
+
/**
|
|
362
|
+
* fal media adapter — image (queue) + video (queue, async poll).
|
|
363
|
+
*
|
|
364
|
+
* fal serves every model through one async queue API, so a single submit→poll→
|
|
365
|
+
* fetch-result path covers both image and video. That is the whole reason this
|
|
366
|
+
* adapter exists: it is ai-lcr's first VIDEO-capable execution path. (The
|
|
367
|
+
* Runware adapter is image-only; the Kunavo one's video poll loop is unverified.)
|
|
368
|
+
*
|
|
369
|
+
* Implementation note: ai-art's fal adapter uses the `@fal-ai/client` SDK, but
|
|
370
|
+
* ai-lcr deliberately keeps zero provider SDKs — every adapter is raw `fetch`
|
|
371
|
+
* with an injectable `fetchImpl` for testing (see runware-media, kunavo-media).
|
|
372
|
+
* So this re-implements the three queue calls against fal's REST endpoints:
|
|
373
|
+
*
|
|
374
|
+
* 1. submit POST https://queue.fal.run/{model} → { request_id, status_url, response_url }
|
|
375
|
+
* 2. status GET {status_url} → { status: IN_QUEUE | IN_PROGRESS | COMPLETED }
|
|
376
|
+
* 3. result GET {response_url} → { images:[…] } | { video:{url} } | …
|
|
377
|
+
*
|
|
378
|
+
* We follow the `status_url` / `response_url` returned by submit rather than
|
|
379
|
+
* rebuilding them, which sidesteps fal's sub-path quirk (a model like
|
|
380
|
+
* `fal-ai/flux/schnell` submits to the full path but its status/result live
|
|
381
|
+
* under the `fal-ai/flux` base).
|
|
382
|
+
*
|
|
383
|
+
* Auth: fal uses `Authorization: Key {FAL_KEY}` (NOT Bearer).
|
|
384
|
+
*
|
|
385
|
+
* Cost: fal's queue result does not carry a per-call price, so cost is left to
|
|
386
|
+
* the router's normalized estimate (costCents stays undefined; `units` is the
|
|
387
|
+
* output count — one image, or one clip).
|
|
388
|
+
*/
|
|
389
|
+
|
|
390
|
+
interface FalMediaConfig {
|
|
391
|
+
apiKey: string;
|
|
392
|
+
/** Override for testing. Defaults to https://queue.fal.run. */
|
|
393
|
+
baseUrl?: string;
|
|
394
|
+
/** Video/job poll cadence (ms). Default 3000. */
|
|
395
|
+
pollIntervalMs?: number;
|
|
396
|
+
/** Max time to wait for a job before giving up (ms). Default 300000 (5m). */
|
|
397
|
+
pollTimeoutMs?: number;
|
|
398
|
+
/** Injected for testing; defaults to global fetch. */
|
|
399
|
+
fetchImpl?: typeof fetch;
|
|
400
|
+
}
|
|
401
|
+
declare function createFalMediaAdapter(config: FalMediaConfig): MediaAdapter;
|
|
402
|
+
|
|
361
403
|
/**
|
|
362
404
|
* ai-lcr — Least Cost Routing for LLMs.
|
|
363
405
|
*
|
|
@@ -411,4 +453,4 @@ type LCRRouter = (modelName: string) => LanguageModelV3;
|
|
|
411
453
|
*/
|
|
412
454
|
declare function createLCR(config: LCRConfig): LCRRouter;
|
|
413
455
|
|
|
414
|
-
export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };
|
|
456
|
+
export { type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaUnit, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createFalMediaAdapter, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, formatCallRecord, normalizedCents, rankRoutes, referenceMegapixels };
|
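Note on the queue flow documented above: the three calls (submit, status, result) can be exercised without touching the network by passing a stub in place of `fetch`. The sketch below is an illustration only, not the adapter's code. `makeStubFetch`, `runQueueJob`, the `stub.invalid` URLs and the `test-key` value are hypothetical names introduced for this example; only the endpoint shape, the `Authorization: Key` header and the `status_url` / `response_url` fields come from the declaration comment.

```javascript
// Hypothetical stub standing in for fal's queue API. URLs and payloads are invented.
function makeStubFetch() {
  let polls = 0;
  const reply = (body) => ({
    ok: true,
    status: 200,
    json: async () => body,
    text: async () => JSON.stringify(body),
  });
  return async (url, init = {}) => {
    if (init.method === "POST") {
      // 1. submit: returns the URLs the caller must follow
      return reply({
        request_id: "req-1",
        status_url: "https://stub.invalid/requests/req-1/status",
        response_url: "https://stub.invalid/requests/req-1",
      });
    }
    if (url.endsWith("/status")) {
      // 2. status: first poll is still running, second poll is done
      polls += 1;
      return reply({ status: polls < 2 ? "IN_PROGRESS" : "COMPLETED" });
    }
    // 3. result
    return reply({ video: { url: "https://stub.invalid/clip.mp4" } });
  };
}

// Minimal submit -> poll -> fetch loop that follows the URLs returned by submit.
async function runQueueJob(fetchImpl, model, input, opts = {}) {
  const { pollIntervalMs = 0, pollTimeoutMs = 1000 } = opts;
  const headers = {
    "content-type": "application/json",
    authorization: "Key test-key", // "Key", not "Bearer"
  };
  const submitRes = await fetchImpl(`https://queue.fal.run/${model}`, {
    method: "POST",
    headers,
    body: JSON.stringify(input),
  });
  const { status_url, response_url } = await submitRes.json();
  const deadline = Date.now() + pollTimeoutMs;
  while (Date.now() < deadline) {
    const statusRes = await fetchImpl(status_url, { headers });
    const { status } = await statusRes.json();
    if (status === "COMPLETED") {
      const resultRes = await fetchImpl(response_url, { headers });
      return resultRes.json();
    }
    await new Promise((resolve) => setTimeout(resolve, pollIntervalMs));
  }
  throw new Error(`job for "${model}" timed out after ${pollTimeoutMs}ms`);
}
```

Because the loop only ever requests the URLs handed back by the submit call, the stub can place status and result under any path it likes, which is the property the comment relies on for sub-path models.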
package/dist/index.js
CHANGED
|
@@ -18,43 +18,126 @@ var RETRYABLE_PATTERNS = [
|
|
|
18
18
|
"504",
|
|
19
19
|
"429",
|
|
20
20
|
// Billing caps — a capped provider should fall over, not kill the request.
|
|
21
|
+
// Include non-English wording: Chinese providers (e.g. Kunavo) report a failed
|
|
22
|
+
// charge as "余额不足"/"账户欠费"/"扣费失败" with a 200/400 body, which no
|
|
23
|
+
// English keyword and no HTTP status would catch — so without these a billing
|
|
24
|
+
// failure would die instead of failing over, the exact opposite of what we want.
|
|
21
25
|
"insufficient",
|
|
22
26
|
"credit",
|
|
23
27
|
"quota",
|
|
24
28
|
"billing",
|
|
25
|
-
"payment required"
|
|
29
|
+
"payment required",
|
|
30
|
+
"balance",
|
|
31
|
+
"\u4F59\u989D",
|
|
32
|
+
"\u6B20\u8D39",
|
|
33
|
+
"\u6263\u8D39",
|
|
34
|
+
"\u6263\u6B3E"
|
|
26
35
|
];
|
|
36
|
+
var NETWORK_CODES = /* @__PURE__ */ new Set([
|
|
37
|
+
"ECONNREFUSED",
|
|
38
|
+
"ECONNRESET",
|
|
39
|
+
"ECONNABORTED",
|
|
40
|
+
"ENOTFOUND",
|
|
41
|
+
"EAI_AGAIN",
|
|
42
|
+
"ETIMEDOUT",
|
|
43
|
+
"EPIPE",
|
|
44
|
+
"EHOSTUNREACH",
|
|
45
|
+
"ENETUNREACH",
|
|
46
|
+
"EPROTO",
|
|
47
|
+
"UND_ERR_SOCKET",
|
|
48
|
+
"UND_ERR_CONNECT_TIMEOUT",
|
|
49
|
+
"UND_ERR_HEADERS_TIMEOUT",
|
|
50
|
+
"UND_ERR_BODY_TIMEOUT"
|
|
51
|
+
]);
|
|
52
|
+
var NETWORK_PATTERNS = [
|
|
53
|
+
"fetch failed",
|
|
54
|
+
"failed to fetch",
|
|
55
|
+
"socket hang up",
|
|
56
|
+
"socket disconnected",
|
|
57
|
+
"econnrefused",
|
|
58
|
+
"econnreset",
|
|
59
|
+
"enotfound",
|
|
60
|
+
"etimedout",
|
|
61
|
+
"ehostunreach",
|
|
62
|
+
"enetunreach",
|
|
63
|
+
"eai_again",
|
|
64
|
+
"getaddrinfo",
|
|
65
|
+
"connect timeout",
|
|
66
|
+
"connection refused",
|
|
67
|
+
"connection reset",
|
|
68
|
+
"connection error",
|
|
69
|
+
"network error",
|
|
70
|
+
"dns"
|
|
71
|
+
];
|
|
72
|
+
function safeStringify(value) {
|
|
73
|
+
try {
|
|
74
|
+
return JSON.stringify(value) ?? "";
|
|
75
|
+
} catch {
|
|
76
|
+
return String(value);
|
|
77
|
+
}
|
|
78
|
+
}
|
|
79
|
+
function errorSignals(error) {
|
|
80
|
+
const parts = [];
|
|
81
|
+
const codes = [];
|
|
82
|
+
const seen = /* @__PURE__ */ new Set();
|
|
83
|
+
let cur = error;
|
|
84
|
+
for (let depth = 0; depth < 6 && cur && typeof cur === "object" && !seen.has(cur); depth++) {
|
|
85
|
+
seen.add(cur);
|
|
86
|
+
const e = cur;
|
|
87
|
+
if (typeof e.message === "string") parts.push(e.message);
|
|
88
|
+
if (typeof e.name === "string") parts.push(e.name);
|
|
89
|
+
if (typeof e.code === "string") {
|
|
90
|
+
parts.push(e.code);
|
|
91
|
+
codes.push(e.code);
|
|
92
|
+
}
|
|
93
|
+
cur = e.cause;
|
|
94
|
+
}
|
|
95
|
+
if (parts.length === 0) parts.push(safeStringify(error));
|
|
96
|
+
return { text: parts.join(" ").toLowerCase(), codes };
|
|
97
|
+
}
|
|
98
|
+
function isNetworkError(error) {
|
|
99
|
+
const { text, codes } = errorSignals(error);
|
|
100
|
+
if (codes.some((c) => NETWORK_CODES.has(c))) return true;
|
|
101
|
+
return NETWORK_PATTERNS.some((p) => text.includes(p));
|
|
102
|
+
}
|
|
27
103
|
function isRetryableError(error) {
|
|
28
104
|
const e = error;
|
|
29
105
|
const status = e?.statusCode ?? e?.status;
|
|
30
106
|
if (typeof status === "number" && (RETRYABLE_STATUS.has(status) || status > 500)) {
|
|
31
107
|
return true;
|
|
32
108
|
}
|
|
33
|
-
|
|
109
|
+
if (isNetworkError(error)) return true;
|
|
110
|
+
const { text } = errorSignals(error);
|
|
34
111
|
return RETRYABLE_PATTERNS.some((p) => text.includes(p));
|
|
35
112
|
}
|
|
36
|
-
function safeStringify(value) {
|
|
37
|
-
try {
|
|
38
|
-
return JSON.stringify(value) ?? "";
|
|
39
|
-
} catch {
|
|
40
|
-
return String(value);
|
|
41
|
-
}
|
|
42
|
-
}
|
|
43
113
|
function classifyError(error) {
|
|
44
114
|
const e = error;
|
|
45
115
|
const status = e?.statusCode ?? e?.status;
|
|
46
116
|
if (typeof status === "number") return String(status);
|
|
47
|
-
|
|
117
|
+
if (isNetworkError(error)) return "network";
|
|
118
|
+
const { text } = errorSignals(error);
|
|
48
119
|
return RETRYABLE_PATTERNS.find((p) => text.includes(p)) ?? "error";
|
|
49
120
|
}
|
|
50
121
|
var AUTH_STATUS = /* @__PURE__ */ new Set([401, 403]);
|
|
51
|
-
var BILLING_PATTERNS = [
|
|
122
|
+
var BILLING_PATTERNS = [
|
|
123
|
+
"insufficient",
|
|
124
|
+
"credit",
|
|
125
|
+
"quota",
|
|
126
|
+
"billing",
|
|
127
|
+
"payment required",
|
|
128
|
+
"balance",
|
|
129
|
+
"exhausted",
|
|
130
|
+
"\u4F59\u989D",
|
|
131
|
+
"\u6B20\u8D39",
|
|
132
|
+
"\u6263\u8D39",
|
|
133
|
+
"\u6263\u6B3E"
|
|
134
|
+
];
|
|
52
135
|
function classifyErrorKind(error) {
|
|
53
136
|
const e = error;
|
|
54
137
|
const status = e?.statusCode ?? e?.status;
|
|
55
|
-
const text = (
|
|
56
|
-
if (typeof status === "number" && AUTH_STATUS.has(status)) return "auth";
|
|
138
|
+
const { text } = errorSignals(error);
|
|
57
139
|
if (status === 402 || BILLING_PATTERNS.some((p) => text.includes(p))) return "billing";
|
|
140
|
+
if (typeof status === "number" && AUTH_STATUS.has(status)) return "auth";
|
|
58
141
|
return isRetryableError(error) ? "transient" : "client";
|
|
59
142
|
}
|
|
60
143
|
var callSeq = 0;
|
|
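Note on the hunk above: the new classification walks the `cause` chain so that a wrapped error (for example a `fetch failed` TypeError whose cause carries `ECONNREFUSED`) and a non-English billing message are both recognised. The following is a reduced sketch of that idea, assuming a trimmed pattern list; `collectSignals` and `classify` are hypothetical names and the three return labels are simplified, so this is not the package's `classifyErrorKind`.

```javascript
// Trimmed, illustrative lists; the package's own lists are longer.
const NETWORK_CODES = new Set(["ECONNREFUSED", "ECONNRESET", "ETIMEDOUT", "ENOTFOUND"]);
const BILLING_PATTERNS = ["insufficient", "balance", "余额", "欠费"];

// Gather message text and string codes from an error and its causes.
// The depth cap and the `seen` set guard against cyclic cause chains.
function collectSignals(error) {
  const parts = [];
  const codes = [];
  const seen = new Set();
  let cur = error;
  for (let depth = 0; depth < 6 && cur && typeof cur === "object" && !seen.has(cur); depth++) {
    seen.add(cur);
    if (typeof cur.message === "string") parts.push(cur.message);
    if (typeof cur.code === "string") {
      parts.push(cur.code);
      codes.push(cur.code);
    }
    cur = cur.cause;
  }
  return { text: parts.join(" ").toLowerCase(), codes };
}

// Simplified three-way label: network code first, then billing wording.
function classify(error) {
  const { text, codes } = collectSignals(error);
  if (codes.some((c) => NETWORK_CODES.has(c))) return "network";
  if (BILLING_PATTERNS.some((p) => text.includes(p))) return "billing";
  return "other";
}

const wrapped = Object.assign(new TypeError("fetch failed"), {
  cause: Object.assign(new Error("connect"), { code: "ECONNREFUSED" }),
});
console.log(classify(wrapped)); // "network"
console.log(classify(new Error("扣费失败: 余额不足"))); // "billing"
```

Matching on exact codes before substrings keeps the network check from depending on how a runtime words its message.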
@@ -123,6 +206,27 @@ var LcrFallbackModel = class {
|
|
|
123
206
|
shouldRetry(error) {
|
|
124
207
|
return (this.opts.shouldRetry ?? isRetryableError)(error);
|
|
125
208
|
}
|
|
209
|
+
// Observer callbacks are caller-supplied logging hooks: a throw from one of
|
|
210
|
+
// them must NEVER turn a successful (or already-failed) request into a
|
|
211
|
+
// different outcome. Swallow anything they throw — they are fire-and-forget.
|
|
212
|
+
emitError(error, provider) {
|
|
213
|
+
try {
|
|
214
|
+
this.opts.onError?.(error, provider);
|
|
215
|
+
} catch {
|
|
216
|
+
}
|
|
217
|
+
}
|
|
218
|
+
emitCost(event) {
|
|
219
|
+
try {
|
|
220
|
+
this.opts.onCost?.(event);
|
|
221
|
+
} catch {
|
|
222
|
+
}
|
|
223
|
+
}
|
|
224
|
+
emitCall(record) {
|
|
225
|
+
try {
|
|
226
|
+
this.opts.onCall?.(record);
|
|
227
|
+
} catch {
|
|
228
|
+
}
|
|
229
|
+
}
|
|
126
230
|
startCall() {
|
|
127
231
|
return { id: newCallId(), attempts: [], startedAt: Date.now() };
|
|
128
232
|
}
|
|
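Note on the hunk above: `emitError`, `emitCost` and `emitCall` all apply the same rule, namely that a throwing observer must not change the outcome of the request. A minimal sketch of that rule follows; `safeInvoke` and `completeRequest` are hypothetical helpers written for this illustration and do not exist in the package.

```javascript
// Call an optional observer and discard anything it throws.
function safeInvoke(callback, ...args) {
  try {
    callback?.(...args);
  } catch {
    // Deliberately ignored: observers are fire-and-forget.
  }
}

// The result reaches the caller even when the observer is broken or absent.
function completeRequest(result, onCall) {
  safeInvoke(onCall, { ok: true, result });
  return result;
}

const value = completeRequest("done", () => {
  throw new Error("logger broke");
});
console.log(value); // "done"
```

A synchronous `try`/`catch` of this shape does not intercept a rejected promise returned by an `async` observer, so a caller who passes an async hook still needs to handle its rejections.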
@@ -142,14 +246,14 @@ var LcrFallbackModel = class {
|
|
|
142
246
|
const inputTokens = usage?.inputTokens?.total ?? 0;
|
|
143
247
|
const outputTokens = usage?.outputTokens?.total ?? 0;
|
|
144
248
|
const costUsd = provider.cost ? inputTokens / 1e6 * provider.cost.input + outputTokens / 1e6 * provider.cost.output : 0;
|
|
145
|
-
this.
|
|
249
|
+
this.emitCost({
|
|
146
250
|
model: this.opts.modelName,
|
|
147
251
|
provider: provider.label,
|
|
148
252
|
inputTokens,
|
|
149
253
|
outputTokens,
|
|
150
254
|
costUsd
|
|
151
255
|
});
|
|
152
|
-
this.
|
|
256
|
+
this.emitCall({
|
|
153
257
|
id: ctx.id,
|
|
154
258
|
model: this.opts.modelName,
|
|
155
259
|
attempts: ctx.attempts,
|
|
@@ -164,7 +268,7 @@ var LcrFallbackModel = class {
|
|
|
164
268
|
}
|
|
165
269
|
/** Every provider failed: fire `onCall` with no winner. */
|
|
166
270
|
finalizeFail(ctx) {
|
|
167
|
-
this.
|
|
271
|
+
this.emitCall({
|
|
168
272
|
id: ctx.id,
|
|
169
273
|
model: this.opts.modelName,
|
|
170
274
|
attempts: ctx.attempts,
|
|
@@ -199,7 +303,7 @@ var LcrFallbackModel = class {
|
|
|
199
303
|
this.finalizeFail(ctx);
|
|
200
304
|
throw error;
|
|
201
305
|
}
|
|
202
|
-
this.
|
|
306
|
+
this.emitError(error, provider.label);
|
|
203
307
|
this.recordFail(ctx, provider, attemptStart, error);
|
|
204
308
|
}
|
|
205
309
|
}
|
|
@@ -235,7 +339,7 @@ var LcrFallbackModel = class {
|
|
|
235
339
|
this.finalizeFail(ctx);
|
|
236
340
|
throw error;
|
|
237
341
|
}
|
|
238
|
-
this.
|
|
342
|
+
this.emitError(error, serving.label);
|
|
239
343
|
this.recordFail(ctx, serving, servingStart, error);
|
|
240
344
|
tried++;
|
|
241
345
|
if (tried >= n) {
|
|
@@ -271,7 +375,7 @@ var LcrFallbackModel = class {
|
|
|
271
375
|
self.finalizeOk(ctx, servingProvider, servingAttemptStart, usage);
|
|
272
376
|
controller.close();
|
|
273
377
|
} catch (error) {
|
|
274
|
-
self.
|
|
378
|
+
self.emitError(error, servingProvider.label);
|
|
275
379
|
self.recordFail(ctx, servingProvider, servingAttemptStart, error);
|
|
276
380
|
if (!streamedAny) {
|
|
277
381
|
const nextTried = triedBeforeServing + 1;
|
|
@@ -395,6 +499,24 @@ function newMediaCallId() {
|
|
|
395
499
|
}
|
|
396
500
|
function createMediaLCR(config) {
|
|
397
501
|
const { registry, adapters, reference = DEFAULT_REFERENCE, onError, onCost, onCall } = config;
|
|
502
|
+
const safeError = (error, provider) => {
|
|
503
|
+
try {
|
|
504
|
+
onError?.(error, provider);
|
|
505
|
+
} catch {
|
|
506
|
+
}
|
|
507
|
+
};
|
|
508
|
+
const safeCost = (event) => {
|
|
509
|
+
try {
|
|
510
|
+
onCost?.(event);
|
|
511
|
+
} catch {
|
|
512
|
+
}
|
|
513
|
+
};
|
|
514
|
+
const safeCall = (record) => {
|
|
515
|
+
try {
|
|
516
|
+
onCall?.(record);
|
|
517
|
+
} catch {
|
|
518
|
+
}
|
|
519
|
+
};
|
|
398
520
|
return async function generate(modelId, input) {
|
|
399
521
|
const def = registry[modelId];
|
|
400
522
|
if (!def) {
|
|
@@ -405,7 +527,7 @@ function createMediaLCR(config) {
|
|
|
405
527
|
const startedAt = Date.now();
|
|
406
528
|
const attempts = [];
|
|
407
529
|
let lastErr;
|
|
408
|
-
const emitFail = () =>
|
|
530
|
+
const emitFail = () => safeCall({
|
|
409
531
|
id: newMediaCallId(),
|
|
410
532
|
model: modelId,
|
|
411
533
|
attempts,
|
|
@@ -427,8 +549,8 @@ function createMediaLCR(config) {
|
|
|
427
549
|
const estimated = result.costCents === void 0;
|
|
428
550
|
const costCents = estimated ? route.refCents * (result.units ?? 1) : result.costCents;
|
|
429
551
|
attempts.push({ provider: route.provider, ok: true, latencyMs: Date.now() - attemptStart });
|
|
430
|
-
|
|
431
|
-
|
|
552
|
+
safeCost({ modelId, provider: route.provider, costCents, estimated });
|
|
553
|
+
safeCall({
|
|
432
554
|
id: newMediaCallId(),
|
|
433
555
|
model: modelId,
|
|
434
556
|
attempts,
|
|
@@ -450,7 +572,7 @@ function createMediaLCR(config) {
|
|
|
450
572
|
latencyMs: Date.now() - attemptStart,
|
|
451
573
|
errorClass: classifyError(err)
|
|
452
574
|
});
|
|
453
|
-
|
|
575
|
+
safeError(err, route.provider);
|
|
454
576
|
if (!isRetryableError(err)) {
|
|
455
577
|
emitFail();
|
|
456
578
|
throw err;
|
|
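Note on the cost lines in the hunk above: when an adapter returns no `costCents`, the router falls back to the route's reference price multiplied by the number of units, and flags the figure as estimated. The arithmetic can be sketched as below; `resolveCost` is a hypothetical helper and the prices are invented numbers used only to show the calculation.

```javascript
// Use the adapter's reported cost when present, otherwise estimate it
// from the reference price per unit. `units` defaults to 1.
function resolveCost(result, refCents) {
  const estimated = result.costCents === undefined;
  const costCents = estimated ? refCents * (result.units ?? 1) : result.costCents;
  return { costCents, estimated };
}

// Two outputs at an assumed reference price of 4 cents each.
console.log(resolveCost({ units: 2 }, 4)); // { costCents: 8, estimated: true }
// A provider that reports its own price is taken at face value.
console.log(resolveCost({ units: 2, costCents: 5 }, 4)); // { costCents: 5, estimated: false }
```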
@@ -740,6 +862,108 @@ var RunwareMediaError = class extends Error {
|
|
|
740
862
|
status;
|
|
741
863
|
};
|
|
742
864
|
|
|
865
|
+
// src/adapters/fal-media.ts
|
|
866
|
+
var DEFAULT_BASE3 = "https://queue.fal.run";
|
|
867
|
+
function extractOutputs(raw) {
|
|
868
|
+
if (!raw || typeof raw !== "object") return [];
|
|
869
|
+
const data = raw;
|
|
870
|
+
const out = [];
|
|
871
|
+
const pushUrl = (url, type) => {
|
|
872
|
+
if (typeof url === "string" && url.length > 0) out.push({ url, type });
|
|
873
|
+
};
|
|
874
|
+
if (Array.isArray(data.images)) {
|
|
875
|
+
for (const img of data.images) pushUrl(img?.url, "image");
|
|
876
|
+
}
|
|
877
|
+
pushUrl(data.image?.url, "image");
|
|
878
|
+
if (Array.isArray(data.videos)) {
|
|
879
|
+
for (const v of data.videos) pushUrl(v?.url, "video");
|
|
880
|
+
}
|
|
881
|
+
pushUrl(data.video?.url, "video");
|
|
882
|
+
return out;
|
|
883
|
+
}
|
|
884
|
+
function createFalMediaAdapter(config) {
|
|
885
|
+
const {
|
|
886
|
+
apiKey,
|
|
887
|
+
baseUrl = DEFAULT_BASE3,
|
|
888
|
+
pollIntervalMs = 3e3,
|
|
889
|
+
pollTimeoutMs = 3e5,
|
|
890
|
+
fetchImpl = fetch
|
|
891
|
+
} = config;
|
|
892
|
+
const headers = {
|
|
893
|
+
"content-type": "application/json",
|
|
894
|
+
authorization: `Key ${apiKey}`
|
|
895
|
+
};
|
|
896
|
+
return {
|
|
897
|
+
provider: "fal",
|
|
898
|
+
async run(req) {
|
|
899
|
+
const submitRes = await fetchImpl(`${baseUrl}/${req.externalId}`, {
|
|
900
|
+
method: "POST",
|
|
901
|
+
headers,
|
|
902
|
+
body: JSON.stringify(req.input)
|
|
903
|
+
});
|
|
904
|
+
if (!submitRes.ok) {
|
|
905
|
+
throw new FalMediaError(submitRes.status, await safeText2(submitRes));
|
|
906
|
+
}
|
|
907
|
+
const submit = await submitRes.json();
|
|
908
|
+
const statusUrl = submit.status_url;
|
|
909
|
+
const responseUrl = submit.response_url;
|
|
910
|
+
if (!statusUrl || !responseUrl) {
|
|
911
|
+
throw new Error(
|
|
912
|
+
`ai-lcr: fal submit for "${req.externalId}" returned no status/response URL (keys: ${Object.keys(
|
|
913
|
+
submit
|
|
914
|
+
).join(", ")})`
|
|
915
|
+
);
|
|
916
|
+
}
|
|
917
|
+
const deadline = Date.now() + pollTimeoutMs;
|
|
918
|
+
let completed = false;
|
|
919
|
+
while (Date.now() < deadline) {
|
|
920
|
+
const statusRes = await fetchImpl(statusUrl, { headers });
|
|
921
|
+
if (!statusRes.ok) {
|
|
922
|
+
throw new FalMediaError(statusRes.status, await safeText2(statusRes));
|
|
923
|
+
}
|
|
924
|
+
const status = String((await statusRes.json()).status ?? "");
|
|
925
|
+
if (status === "COMPLETED") {
|
|
926
|
+
completed = true;
|
|
927
|
+
break;
|
|
928
|
+
}
|
|
929
|
+
await sleep2(pollIntervalMs);
|
|
930
|
+
}
|
|
931
|
+
if (!completed) {
|
|
932
|
+
throw new Error(
|
|
933
|
+
`ai-lcr: fal job for "${req.externalId}" timed out after ${pollTimeoutMs}ms`
|
|
934
|
+
);
|
|
935
|
+
}
|
|
936
|
+
const resultRes = await fetchImpl(responseUrl, { headers });
|
|
937
|
+
if (!resultRes.ok) {
|
|
938
|
+
throw new FalMediaError(resultRes.status, await safeText2(resultRes));
|
|
939
|
+
}
|
|
940
|
+
const outputs = extractOutputs(await resultRes.json());
|
|
941
|
+
if (outputs.length === 0) {
|
|
942
|
+
throw new Error(`ai-lcr: fal returned no media URL for "${req.externalId}"`);
|
|
943
|
+
}
|
|
944
|
+
return { outputs, units: outputs.length };
|
|
945
|
+
}
|
|
946
|
+
};
|
|
947
|
+
}
|
|
948
|
+
var FalMediaError = class extends Error {
|
|
949
|
+
constructor(status, body) {
|
|
950
|
+
super(`fal media HTTP ${status}: ${body.slice(0, 300)}`);
|
|
951
|
+
this.status = status;
|
|
952
|
+
this.name = "FalMediaError";
|
|
953
|
+
}
|
|
954
|
+
status;
|
|
955
|
+
};
|
|
956
|
+
function sleep2(ms) {
|
|
957
|
+
return new Promise((r) => setTimeout(r, ms));
|
|
958
|
+
}
|
|
959
|
+
async function safeText2(res) {
|
|
960
|
+
try {
|
|
961
|
+
return await res.text();
|
|
962
|
+
} catch {
|
|
963
|
+
return "<no body>";
|
|
964
|
+
}
|
|
965
|
+
}
|
|
966
|
+
|
|
743
967
|
// src/index.ts
|
|
744
968
|
function isLanguageModel(entry) {
|
|
745
969
|
return typeof entry.doGenerate === "function";
|
|
@@ -787,6 +1011,7 @@ export {
|
|
|
787
1011
|
classifyError,
|
|
788
1012
|
classifyErrorKind,
|
|
789
1013
|
comparePrices,
|
|
1014
|
+
createFalMediaAdapter,
|
|
790
1015
|
createKunavoMediaAdapter,
|
|
791
1016
|
createLCR,
|
|
792
1017
|
createMediaLCR,
|
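Note on the fal adapter added in this file: the result payload may carry `images`, `image`, `videos` or `video`, and the adapter flattens whichever are present into one list whose length becomes `units`. The sketch below restates that normalisation in standalone form so the accepted shapes are easy to see; `collectMediaUrls` is a hypothetical name and the sample payloads are invented.

```javascript
// Flatten the known result shapes into [{ url, type }].
// Entries without a non-empty string URL are skipped.
function collectMediaUrls(raw) {
  if (!raw || typeof raw !== "object") return [];
  const out = [];
  const push = (url, type) => {
    if (typeof url === "string" && url.length > 0) out.push({ url, type });
  };
  for (const img of Array.isArray(raw.images) ? raw.images : []) push(img?.url, "image");
  push(raw.image?.url, "image");
  for (const clip of Array.isArray(raw.videos) ? raw.videos : []) push(clip?.url, "video");
  push(raw.video?.url, "video");
  return out;
}

const images = collectMediaUrls({ images: [{ url: "https://stub.invalid/a.png" }, { url: "" }, null] });
console.log(images.length); // 1
const clip = collectMediaUrls({ video: { url: "https://stub.invalid/clip.mp4" } });
console.log(clip[0].type); // "video"
```

An empty list is the signal the adapter turns into its "no media URL" error, so unknown payload shapes fail loudly rather than returning zero units.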
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "ai-lcr",
|
|
3
|
-
"version": "0.2.
|
|
3
|
+
"version": "0.2.6",
|
|
4
4
|
"description": "Least Cost Routing for LLMs — route every model call to the cheapest available provider, fall back automatically, and track real cost. Built for the Vercel AI SDK.",
|
|
5
5
|
"keywords": [
|
|
6
6
|
"ai",
|