ai-lcr 0.6.0 → 0.6.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +31 -0
- package/README.md +96 -6
- package/README.zh-CN.md +75 -4
- package/dist/index.cjs +273 -2
- package/dist/index.d.cts +30 -1
- package/dist/index.d.ts +30 -1
- package/dist/index.js +271 -2
- package/package.json +1 -1
package/CHANGELOG.md
CHANGED
|
@@ -4,6 +4,37 @@ All notable changes to `ai-lcr` are documented here. The format follows
|
|
|
4
4
|
[Keep a Changelog](https://keepachangelog.com/), and the project adheres to
|
|
5
5
|
[Semantic Versioning](https://semver.org/).
|
|
6
6
|
|
|
7
|
+
## [0.6.1] — 2026-06-11
|
|
8
|
+
|
|
9
|
+
Zero-config pricing for native-maker routes. Until now every priced provider
|
|
10
|
+
needed a hand-typed `cost: { input, output }`; for a vendor's own API that number
|
|
11
|
+
is just the public list price you could look up. 0.6.1 bundles those.
|
|
12
|
+
|
|
13
|
+
### Added
|
|
14
|
+
|
|
15
|
+
- **Bundled price table (`MODEL_PRICES`).** Official first-party token prices for
|
|
16
|
+
the native makers ai-lcr documents (openai · anthropic · gemini · deepseek ·
|
|
17
|
+
xai · mistral), keyed by the bare model id you pass to that vendor's AI SDK
|
|
18
|
+
provider — USD per 1M tokens, with `cacheRead` where the maker prices it.
|
|
19
|
+
Generated from [LiteLLM's price map](https://github.com/BerriAI/litellm) (MIT)
|
|
20
|
+
via `scripts/gen-text-prices.mjs`; the generated file is committed.
|
|
21
|
+
- **`getModelPrice(modelId)`.** Look up a bundled price directly; resolves a bare
|
|
22
|
+
id or one with a leading `provider/` segment stripped.
|
|
23
|
+
- **`createLCR({ autoPrice: true })`.** Fills any provider entry that has no
|
|
24
|
+
explicit `cost` from the table, by `model.modelId`. A native-vendor route then
|
|
25
|
+
needs zero hand-typed pricing and `autoSort` can order it.
|
|
26
|
+
- **`discount` on a provider entry.** The flat-reseller knob: `{ model:
|
|
27
|
+
kunavo("…"), discount: 0.2 }` prices a −20% aggregator off the bundled list
|
|
28
|
+
price (scaling input/output/cacheRead) with no hand-typed number. Applies only
|
|
29
|
+
when `autoPrice` fills the entry; out-of-range values throw.
|
|
30
|
+
|
|
31
|
+
### Compatibility
|
|
32
|
+
|
|
33
|
+
- Fully backward compatible. `autoPrice` is **off by default** — unpriced entries
|
|
34
|
+
stay unpriced and an explicit `cost` always wins, so no existing config changes
|
|
35
|
+
behavior. The table covers native makers only; open-weights hosts (DeepInfra)
|
|
36
|
+
and breadth aggregators (OpenRouter) are still priced explicitly.
|
|
37
|
+
|
|
7
38
|
## [0.6.0] — 2026-06-10
|
|
8
39
|
|
|
9
40
|
Media billing contract v2: **rank by the reference, bill by actual usage.**
|
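The 0.6.1 changelog entry above describes three pricing rules: an explicit `cost` wins, `autoPrice` fills the rest from the bundled table, and `discount` scales the filled price. A minimal sketch of that precedence follows. Every name in it (`resolveCost`, `ProviderEntry`, `LIST_PRICES`) is hypothetical and not taken from the package, and the accepted discount range `0 <= discount < 1` is an assumption: the changelog only says out-of-range values throw.

```ts
// Hypothetical sketch: these names are illustrative, not ai-lcr's internals.
type TokenPrice = { input: number; output: number; cacheRead?: number }; // USD per 1M tokens

interface ProviderEntry {
  modelId: string;
  cost?: TokenPrice; // explicit price, always wins
  discount?: number; // fraction off list, e.g. 0.2 for 20% off
}

// Two rows copied from the bundled table in this release.
const LIST_PRICES: Record<string, TokenPrice> = {
  "claude-sonnet-4-6": { input: 3, output: 15, cacheRead: 0.3 },
  "gpt-4o-mini": { input: 0.15, output: 0.6, cacheRead: 0.075 },
};

function resolveCost(entry: ProviderEntry, autoPrice: boolean): TokenPrice | undefined {
  if (entry.cost) return entry.cost; // explicit cost wins over the table
  if (!autoPrice) return undefined; // off by default: the entry stays unpriced
  if (!Object.prototype.hasOwnProperty.call(LIST_PRICES, entry.modelId)) {
    return undefined; // not a bundled model: still unpriced
  }
  const list = LIST_PRICES[entry.modelId];
  const discount = entry.discount ?? 0;
  // Assumed valid range: 0 <= discount < 1 (the changelog does not state the bounds).
  if (!(discount >= 0 && discount < 1)) {
    throw new RangeError(`discount out of range: ${discount}`);
  }
  const factor = 1 - discount;
  // The discount scales input, output, and cacheRead alike.
  const scaled: TokenPrice = { input: list.input * factor, output: list.output * factor };
  if (list.cacheRead !== undefined) scaled.cacheRead = list.cacheRead * factor;
  return scaled;
}
```

In this sketch the discount is validated only on the path where the table fills the entry, which mirrors the statement that `discount` applies only when `autoPrice` fills the entry. Whether the real implementation validates earlier is not stated in the diff.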
package/README.md
CHANGED
|
@@ -138,6 +138,33 @@ const lcr = createLCR({
|
|
|
138
138
|
|
|
139
139
|
DeepInfra carries open weights only — no first-party Claude / GPT / Gemini. For those closed models, route through OpenRouter or a discount gateway instead.
|
|
140
140
|
|
|
141
|
+
## Zero-config pricing (`autoPrice`)
|
|
142
|
+
|
|
143
|
+
Typing `cost: { input, output }` for every provider is the tedious part. `autoPrice: true` fills any entry that has no explicit `cost` from a **bundled price table** (`MODEL_PRICES`) — official first-party rates for the native makers (OpenAI, Anthropic, Google, DeepSeek, xAI, Mistral), keyed by the bare model id you already pass to the provider:
|
|
144
|
+
|
|
145
|
+
```ts
|
|
146
|
+
const lcr = createLCR({
|
|
147
|
+
autoPrice: true, // fill missing costs from the bundled table
|
|
148
|
+
autoSort: true, // then order cheapest-first using those prices
|
|
149
|
+
models: {
|
|
150
|
+
"claude-sonnet": [
|
|
151
|
+
// Native API — price comes from the table, nothing to type.
|
|
152
|
+
{ model: anthropic("claude-sonnet-4-6"), label: "anthropic" },
|
|
153
|
+
// Flat-discount aggregator — `discount` applies on top of the list price.
|
|
154
|
+
{ model: kunavo("claude-sonnet-4-6"), label: "kunavo", discount: 0.2 }, // 20% off list
|
|
155
|
+
],
|
|
156
|
+
},
|
|
157
|
+
});
|
|
158
|
+
```
|
|
159
|
+
|
|
160
|
+
Three rules keep it predictable:
|
|
161
|
+
|
|
162
|
+
- **Off by default.** Unpriced entries stay unpriced (the pre-existing behavior) until you opt in. Once `autoPrice` is on, an **explicit `cost` always wins** over the table, so a model you already priced is never silently re-priced.
|
|
163
|
+
- **`discount` is the reseller knob.** A flat-% aggregator (Kunavo −20%) becomes `discount: 0.2` instead of a hand-typed number; it scales input, output, and `cacheRead` alike, and only applies when the table fills the entry. Variable-discount providers (TokenMart) still want explicit per-model `cost`.
|
|
164
|
+
- **Native makers only.** The table carries first-party list prices — the cheapest, most-featureful "go direct" route. Open-weights hosts (DeepInfra) and breadth aggregators (OpenRouter) aren't in it; price those explicitly.
|
|
165
|
+
|
|
166
|
+
Look a price up yourself with `getModelPrice("claude-sonnet-4-6")`. The table is generated from [LiteLLM's price map](https://github.com/BerriAI/litellm) (MIT) — refresh with `node scripts/gen-text-prices.mjs`.
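The lookup rule described for `getModelPrice` (a bare id, or an id with a leading `provider/` segment stripped) can be sketched as follows. This is an illustration under assumptions, not the package's code: `lookupPrice` and `SAMPLE_PRICES` are hypothetical names, and stripping only the first `/`-separated segment is a guess at what "leading segment" means.

```ts
// Hypothetical sketch of the lookup rule; not the package's implementation.
type ListPrice = { input: number; output: number; cacheRead?: number }; // USD per 1M tokens

// Two rows copied from the bundled table in this release.
const SAMPLE_PRICES: Record<string, ListPrice> = {
  "claude-sonnet-4-6": { input: 3, output: 15, cacheRead: 0.3 },
  "deepseek-chat": { input: 0.28, output: 0.42, cacheRead: 0.028 },
};

// Own-property check so ids like "toString" never match inherited members.
function hasPrice(id: string): boolean {
  return Object.prototype.hasOwnProperty.call(SAMPLE_PRICES, id);
}

function lookupPrice(modelId: string): ListPrice | undefined {
  if (hasPrice(modelId)) return SAMPLE_PRICES[modelId]; // bare id
  const slash = modelId.indexOf("/");
  if (slash === -1) return undefined;
  const bare = modelId.slice(slash + 1); // assumption: strip only the first segment
  return hasPrice(bare) ? SAMPLE_PRICES[bare] : undefined;
}
```

Under these assumptions, `lookupPrice("anthropic/claude-sonnet-4-6")` and `lookupPrice("claude-sonnet-4-6")` resolve to the same row, and an id the table does not carry resolves to `undefined`.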
|
|
167
|
+
|
|
141
168
|
## How it routes
|
|
142
169
|
|
|
143
170
|
1. **Cheapest first.** Providers are tried in order — list them cheapest-first, or set `autoSort: true` to order them by `cost` automatically.
|
|
@@ -185,13 +212,28 @@ interface CallRecord {
|
|
|
185
212
|
outputTokens: number;
|
|
186
213
|
cachedInputTokens?: number; // prompt-cache hits the winner read (when reported)
|
|
187
214
|
costUsd: number; // winner cost, cache-discount applied (see `cacheRead`)
|
|
188
|
-
baselineUsd?: number; //
|
|
215
|
+
baselineUsd?: number; // what the savings baseline would have charged for the SAME usage → savings = baselineUsd − costUsd
|
|
216
|
+
baselineKind?: "last-leg" | "official" | "priciest-route"; // how that baseline was derived (see below)
|
|
217
|
+
cachedSavingUsd?: number; // the provider's own prompt-cache discount — real money, but NOT a routing saving; never fold it into baselineUsd − costUsd
|
|
189
218
|
requestId?: string; // your correlation id (see below) — roll multi-step tool loops into one request
|
|
190
219
|
usageMissing?: boolean; // winner served but reported 0/0 tokens → costUsd is 0 but unknown, not free
|
|
220
|
+
emptyCompletion?: boolean; // clean response that generated NOTHING — prompt billed, zero output
|
|
221
|
+
|
|
222
|
+
// Media calls (createMediaLCR) additionally carry:
|
|
223
|
+
modality?: "image" | "video";
|
|
224
|
+
usage?: { seconds?: number; outputs?: number; megapixels?: number }; // the actual usage the bill was based on
|
|
225
|
+
officialUsd?: number; // the model maker's first-party price for this call's usage
|
|
226
|
+
estCostUsd?: number; // what the configured price table PREDICTED — on provider-reported rows, costUsd − estCostUsd is price-table drift
|
|
191
227
|
}
|
|
192
228
|
```
|
|
193
229
|
|
|
194
|
-
**Savings, not just spend.**
|
|
230
|
+
**Savings, not just spend.** `baselineUsd` is what the same call would have cost without routing, and `baselineKind` says exactly what that means so a dashboard can qualify the number instead of trusting it blindly:
|
|
231
|
+
|
|
232
|
+
- **`"last-leg"`** (text): the **last priced provider** in the chain — your always-on, list-price fallback. Deliberately *not* the most expensive leg: prompt caching can make a sticker-cheaper provider cost more on a cache-heavy call, and a max-of-chain baseline would fabricate "savings" on calls the fallback itself served.
|
|
233
|
+
- **`"official"`** (media): the model maker's **first-party API price** for the same actual usage — an 8-second clip is baselined at 8 seconds of the official rate, not a reference length.
|
|
234
|
+
- **`"priciest-route"`** (media, no official price known): the most expensive route you configured. Honest about cross-provider spread, but self-referential — not a market price.
|
|
235
|
+
|
|
236
|
+
`baselineUsd − costUsd` is the money routing saved on that call — the number a cost dashboard exists to show.
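As a rough illustration of how a consumer might total these fields, the sketch below sums `baselineUsd − costUsd` per `baselineKind` and leaves `cachedSavingUsd` out, as the field comments require. The helper names (`routingSavingUsd`, `savingsByKind`) are hypothetical, and skipping records that lack a baseline is an assumption about how a dashboard would treat them.

```ts
type BaselineKind = "last-leg" | "official" | "priciest-route";

// Only the CallRecord fields this sketch needs.
interface SavingsFields {
  costUsd: number;
  baselineUsd?: number;
  baselineKind?: BaselineKind;
  cachedSavingUsd?: number; // provider cache discount: deliberately ignored below
}

// Routing saving for one call, or undefined when no baseline was recorded.
function routingSavingUsd(r: SavingsFields): number | undefined {
  if (r.baselineUsd === undefined) return undefined;
  return r.baselineUsd - r.costUsd;
}

// Totals grouped by how the baseline was derived, so each can be qualified separately.
function savingsByKind(records: SavingsFields[]): Partial<Record<BaselineKind, number>> {
  const totals: Partial<Record<BaselineKind, number>> = {};
  for (const r of records) {
    const saved = routingSavingUsd(r);
    if (saved === undefined || r.baselineKind === undefined) continue; // assumption: skip unbaselined rows
    totals[r.baselineKind] = (totals[r.baselineKind] ?? 0) + saved;
  }
  return totals;
}
```

Grouping by kind keeps a self-referential `"priciest-route"` figure from being mixed into the same total as an `"official"` one.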
|
|
195
237
|
|
|
196
238
|
**Responsiveness, not just total time.** On streaming calls (`streamText`, `streamObject`, streaming agents), `ttftMs` is the **time to first token** — measured from the winning provider's attempt start to its first content delta. It's the metric most LLM dashboards lead with, because it's what a user feels as "how fast did it start replying". Total `latencyMs` covers the whole stream including any failover; `ttftMs` isolates the serving model's responsiveness. It's `undefined` for `generateText`/`generateObject` (no streaming → no "first" token) and for calls that failed before any content. Output throughput (tokens/sec) is then `outputTokens / ((latencyMs − ttftMs) / 1000)`.
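The throughput formula at the end of that paragraph translates directly into a small helper. The function name is hypothetical; returning `undefined` for non-streaming calls or a non-positive generation window is a choice made for this sketch, not documented behavior.

```ts
// Output tokens per second over the generation window (after the first token).
function outputTokensPerSecond(r: {
  outputTokens: number;
  latencyMs: number;
  ttftMs?: number;
}): number | undefined {
  if (r.ttftMs === undefined) return undefined; // non-streaming call: no first-token time
  const generationMs = r.latencyMs - r.ttftMs;
  if (generationMs <= 0) return undefined; // avoid dividing by zero or a negative window
  return r.outputTokens / (generationMs / 1000);
}
```

For example, 300 output tokens with `latencyMs` of 2500 and `ttftMs` of 500 leave a 2-second generation window, so the helper returns 150 tokens per second.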
|
|
197
239
|
|
|
@@ -226,13 +268,28 @@ const lcr = createLCR({
|
|
|
226
268
|
});
|
|
227
269
|
```
|
|
228
270
|
|
|
271
|
+
### The companion dashboard ([`ai-lcr-dashboard`](https://github.com/ai-lcr/ai-lcr-dashboard))
|
|
272
|
+
|
|
273
|
+
<p align="center">
|
|
274
|
+
<img src="assets/dashboard-demo.png" alt="ai-lcr-dashboard (demo data): saved vs spent over time, a price-drift alert, per-project failover health, and per-provider reliability" width="780">
|
|
275
|
+
</p>
|
|
276
|
+
|
|
277
|
+
A **self-hostable** Next.js + Postgres collector built for exactly these records — point `createHttpSink` at its `/api/ingest` and you get, across every project you tag:
|
|
278
|
+
|
|
279
|
+
- **saved vs. spent** over time, with the savings qualified by `baselineKind` and clamped per call (one mispriced row can't eat the rest);
|
|
280
|
+
- **failover health** per provider — who actually failed, who caught it, what leaked to users;
|
|
281
|
+
- **media economics** — image/video calls split out with per-unit cost ($/second of video, $/image);
|
|
282
|
+
- a **price-drift panel** — when a provider's reported bill disagrees with your configured price table by >±20%, it surfaces the route (a ~100× ratio is the classic USD-vs-cents slip). Cheapest-first routing is only as good as its price table; this is the smoke alarm.
|
|
283
|
+
|
|
284
|
+
One-click Vercel deploy (any Postgres: Neon, Supabase, RDS, local); records carry metadata only — no prompts, no outputs. The ingest contract is just the `CallRecord` JSON, so any other drain works too.
|
|
285
|
+
|
|
229
286
|
## Supported providers
|
|
230
287
|
|
|
231
288
|
Any OpenAI-compatible endpoint works — and so does any AI SDK provider package, including a model vendor's own official API.
|
|
232
289
|
|
|
233
290
|
- **Model vendors' own APIs (native):** route straight to [DeepSeek](https://platform.deepseek.com), [OpenAI](https://openai.com), [Anthropic](https://anthropic.com), [Google](https://ai.google.dev), [xAI](https://x.ai), etc. via their AI SDK provider packages — no markup, full native features. See [Route to a model vendor's own API](#route-to-a-model-vendors-own-api-native-providers).
|
|
234
291
|
- **Text aggregators:** [OpenRouter](https://openrouter.ai) (widest coverage, list pricing) · [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off** every model) · [TokenMart](https://thetokenmart.ai) (15–65% off, varies by model)
|
|
235
|
-
- **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) — routing via `createMediaLCR`. Image: Kunavo (generations + `*-edit` reference-image endpoints) + Runware + fal. Video: fal (async queue)
|
|
292
|
+
- **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) — routing via `createMediaLCR`. Image: Kunavo (generations + `*-edit` reference-image endpoints) + Runware + fal. Video: fal (async queue), Kunavo (async `POST /v1/videos` + poll, sync fallback), and Runware (async `videoInference` + `getResponse` poll) — all three on the async `submit`/`poll` path
|
|
236
293
|
|
|
237
294
|
## Text model pricing
|
|
238
295
|
|
|
@@ -347,6 +404,37 @@ Design choices worth knowing:
|
|
|
347
404
|
- **Telemetry lands once, at the terminal poll** — one `onCall` `CallRecord` with the full failover chain, threaded across both processes (not at `submit`).
|
|
348
405
|
- An adapter advertises async by implementing `submit` + `checkStatus`; image-only adapters omit them and are skipped by the async router. The bundled Kunavo, fal, and Runware adapters all implement the async path (Kunavo/Runware async is video-only; fal covers both).
|
|
349
406
|
|
|
407
|
+
### Writing your own adapter
|
|
408
|
+
|
|
409
|
+
A `MediaAdapter` is small — `run` for sync, optional `submit`/`checkStatus` for async — and the one contract that matters is **how you report what was produced**:
|
|
410
|
+
|
|
411
|
+
```ts
|
|
412
|
+
interface MediaAdapter {
|
|
413
|
+
provider: string;
|
|
414
|
+
run(req: { externalId: string; input: Record<string, unknown> }): Promise<MediaGenerateResult>;
|
|
415
|
+
submit?(req: { externalId: string; input; metadata? }): Promise<{ requestId: string }>;
|
|
416
|
+
checkStatus?(req: { externalId: string; requestId: string }): Promise<MediaStatusResult>;
|
|
417
|
+
}
|
|
418
|
+
|
|
419
|
+
// On a settled result, report:
|
|
420
|
+
{
|
|
421
|
+
outputs: [{ url, type: "image" | "video" }],
|
|
422
|
+
costCents?: number, // the provider's OWN bill, in US cents — convert if the API returns dollars (×100)!
|
|
423
|
+
usage?: { // typed actual usage — what the bill (or estimate) is based on
|
|
424
|
+
seconds?: number, // video length actually produced (per-second SKUs bill this)
|
|
425
|
+
outputs?: number, // output count — images or clips (per-image / per-call SKUs bill this)
|
|
426
|
+
megapixels?: number // total output MP (per-megapixel SKUs bill this)
|
|
427
|
+
}
|
|
428
|
+
}
|
|
429
|
+
```
|
|
430
|
+
|
|
431
|
+
Rules that keep billing honest:
|
|
432
|
+
|
|
433
|
+
- **Report dimensions in `usage`, never as a bare count.** Seconds and output count are separate, explicitly-named fields, so a per-call price can never be multiplied by a clip's duration (the classic 8× overcharge).
|
|
434
|
+
- **`costCents` is cents.** A provider that returns dollars must be converted in the adapter (see the Runware adapter). If you slip, the router's cost-outlier guard flags any bill ≥25× off the price table via `onError` — but the reported number still stands.
|
|
435
|
+
- **When you report nothing**, the router estimates: per-second SKUs read `usage.seconds`, then the input's `duration` (numbers or `"8s"`-style strings), then the 5-second reference as a last resort; per-image/per-call SKUs bill the output count.
|
|
436
|
+
- **Throw errors with an HTTP `status` property** (see `FalMediaError`/`KunavoMediaError`) so the router can classify them for failover.
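The billing rules above can be sketched as three small helpers: converting a dollar bill to cents, picking the billable seconds in the stated fallback order, and checking a bill against the 25× threshold. All names here are hypothetical and are not the package's `durationFromInput` or `billableUnits` exports. Treating the threshold as two-sided (25× too high or 25× too low) is an assumption.

```ts
interface MediaUsage { seconds?: number; outputs?: number; megapixels?: number }

const REFERENCE_SECONDS = 5; // the 5-second reference clip used as a last resort

// costCents is in US cents, so a dollar-denominated bill is multiplied by 100.
function dollarsToCents(usd: number): number {
  return usd * 100;
}

// Accepts a positive number or an "8s"-style string; anything else is undefined.
function parseDurationSeconds(value: unknown): number | undefined {
  if (typeof value === "number") {
    return Number.isFinite(value) && value > 0 ? value : undefined;
  }
  if (typeof value === "string") {
    const match = /^\s*(\d+(?:\.\d+)?)\s*s?\s*$/i.exec(value);
    if (!match) return undefined;
    const seconds = Number(match[1]);
    return seconds > 0 ? seconds : undefined;
  }
  return undefined;
}

// Fallback order: reported usage.seconds, then the input's duration, then the reference.
function billableSeconds(usage: MediaUsage | undefined, input: Record<string, unknown>): number {
  return usage?.seconds ?? parseDurationSeconds(input["duration"]) ?? REFERENCE_SECONDS;
}

// Assumed two-sided check: the bill is at least 25x above or below the estimate.
function isCostOutlier(reportedCents: number, estimatedCents: number): boolean {
  if (reportedCents <= 0 || estimatedCents <= 0) return false;
  const ratio = reportedCents / estimatedCents;
  return ratio >= 25 || ratio <= 1 / 25;
}
```

With an 8-cent estimate, a bill reported as `0.08` (dollars passed through unconverted) is 100× low and trips the check, while `dollarsToCents(0.08)` does not.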
|
|
437
|
+
|
|
350
438
|
## Vetting a provider (capability + cost probe)
|
|
351
439
|
|
|
352
440
|
A discount is worthless if the provider quietly breaks the wire protocol. `ai-lcr` ships a zero-dependency check (`scripts/check-provider.sh`, just `bash` + `curl` + `python3`) that vets the things that actually cost you money or corrupt output, **per model**:
|
|
@@ -406,11 +494,13 @@ Two OpenAI-compatible providers, same probe, same day. Cells cover both families
|
|
|
406
494
|
- [x] One correlated record per request with the full failover chain (`onCall` + `formatCallRecord`)
|
|
407
495
|
- [x] Auto cheapest-first ordering (`autoSort`) from per-provider `cost`
|
|
408
496
|
- [x] Offline capability + cost check (`scripts/check-provider.sh`) → per-model trust matrix
|
|
409
|
-
- [
|
|
497
|
+
- [x] Bundled price table for zero-config pricing (`autoPrice` + `MODEL_PRICES`) — drop the manual `cost` numbers for native-maker routes
|
|
410
498
|
- [ ] Provider-quirk middleware (transparently patch known per-provider request quirks, e.g. Kunavo's ignored `max_tokens`)
|
|
411
499
|
- [ ] Feed probe results into routing automatically (auto-exclude a model from a provider that fails its probe)
|
|
412
|
-
- [x] Image & video model routing (`createMediaLCR`) — image via Kunavo (incl. `*-edit`) + Runware + fal;
|
|
413
|
-
- [
|
|
500
|
+
- [x] Image & video model routing (`createMediaLCR`) — image via Kunavo (incl. `*-edit`) + Runware + fal; video async (`submit`/`poll`) via fal, Kunavo, and Runware
|
|
501
|
+
- [x] Settle-time billing on actual usage (0.6) — typed `usage`, duration-aware savings baseline, `estCostUsd` price-drift signal, cost-outlier guard
|
|
502
|
+
- [x] Self-hosted dashboard ([`ai-lcr-dashboard`](https://github.com/ai-lcr/ai-lcr-dashboard)) — savings, failover health, media $/unit, price-drift panel
|
|
503
|
+
- [ ] Normalized cross-provider video price comparison in the bundled table
|
|
414
504
|
|
|
415
505
|
## Affiliate disclosure
|
|
416
506
|
|
package/README.zh-CN.md
CHANGED
|
@@ -144,13 +144,56 @@ DeepInfra carries open weights only, no first-party Claude / GPT / Gemini. For
|
|
|
144
144
|
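2. **Fall through on failure.** Any provider failure (rate limits, 5xx, timeouts, **exhausted credit** (402 / arrears / insufficient balance), and client errors such as **400**) advances to the next provider, and does so stream-safely. Failing over on 400 is deliberate: behind an OpenAI-compatible aggregator, a 400 usually means "*this* provider won't take the request" (an unsupported parameter, a model it doesn't list, a stricter schema) rather than a broken request, so another provider can probably serve it. If every provider rejects the request, it still fails and throws the **first** (original) error, which keeps genuine caller bugs debuggable. The only thing that never fails over is a caller-initiated cancellation (`AbortSignal`). To restore the old fail-fast behavior on client errors, pass `shouldRetry: isRetryableError` to `createLCR`.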
|
|
145
145
|
3. **Recover.** After an idle window (`resetIntervalMs`, default 60s), routing returns to the cheapest provider automatically.
|
|
146
146
|
|
|
147
|
+
## See what happened on every call (`onCall`)
|
|
148
|
+
|
|
149
|
+
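`onError` and `onCost` fire independently and are not correlated, so reconstructing a failover after the fact is hard. `onCall` gives you **one record per request**: the full attempt chain, the provider that finally served it, why each hop failed, latency, and cost. `formatCallRecord` turns it into a single scannable log line: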
|
|
150
|
+
|
|
151
|
+
```text
|
|
152
|
+
✓ text tokenmart 412ms $0.0003
|
|
153
|
+
⚠ text tokenmart→openrouter 910ms $0.0004 ⤷ tokenmart 502
|
|
154
|
+
✗ text deepseek→tokenmart→openrouter 1240ms FAILED ⤷ deepseek 401, tokenmart 502, openrouter 429
|
|
155
|
+
```
|
|
156
|
+
|
|
157
|
+
`record` is a plain `CallRecord` object. Key fields:
|
|
158
|
+
|
|
159
|
+
```ts
|
|
160
|
+
interface CallRecord {
|
|
161
|
+
id: string; // one correlation id per request
|
|
162
|
+
model: string; // logical model name
|
|
163
|
+
attempts: { provider; ok; latencyMs; errorClass? }[];
|
|
164
|
+
winner?: string; // provider that finally served; undefined if all failed
|
|
165
|
+
ok: boolean;
|
|
166
|
+
failedOver: boolean; // more than one provider was tried
|
|
167
|
+
latencyMs: number;
|
|
168
|
+
ttftMs?: number; // streaming only: time to first token
|
|
169
|
+
inputTokens: number;
|
|
170
|
+
outputTokens: number;
|
|
171
|
+
cachedInputTokens?: number; // input tokens that hit the prompt cache
|
|
172
|
+
costUsd: number; // actual cost (cacheRead discount applied)
|
|
173
|
+
baselineUsd?: number; // price of the same usage on the savings baseline → savings = baselineUsd − costUsd
|
|
174
|
+
baselineKind?: "last-leg" | "official" | "priciest-route"; // how the baseline was derived (see below)
|
|
175
|
+
cachedSavingUsd?: number; // the provider's own cache discount: real money, but not routing's doing, so keep it out of savings
|
|
176
|
+
usageMissing?: boolean; // served successfully but reported 0/0 tokens → cost is unknown, not free
|
|
177
|
+
|
|
178
|
+
// Media calls (createMediaLCR) additionally carry:
|
|
179
|
+
modality?: "image" | "video";
|
|
180
|
+
usage?: { seconds?; outputs?; megapixels? }; // the actual usage the bill is based on
|
|
181
|
+
officialUsd?: number; // official first-party price (for this call's actual usage)
|
|
182
|
+
estCostUsd?: number; // the price table's estimate; its gap from costUsd = price-table drift
|
|
183
|
+
}
|
|
184
|
+
```
|
|
185
|
+
|
|
186
|
+
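**Computing savings honestly:** `baselineKind` says which kind of baseline `baselineUsd` is. For text it is the **list price of the fallback provider at the end of the chain** (`"last-leg"`; deliberately not the most expensive leg, because prompt caching can make a sticker-cheaper provider cost more on a cache-heavy call, and taking the maximum would fabricate "savings"). For media it is the **model maker's official first-party price** (`"official"`, computed on actual seconds), falling back to the most expensive route in your config when no official price is known (`"priciest-route"`; self-referential, it only shows the cross-provider spread).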
|
|
187
|
+
|
|
188
|
+
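**Ship it to a collector:** `createHttpSink` POSTs each record to any endpoint (on serverless, pass Next.js's `after` as `dispatch` so the request isn't cut off). The companion self-hosted dashboard [`ai-lcr-dashboard`](https://github.com/ai-lcr/ai-lcr-dashboard) (Next.js + Postgres, one-click Vercel deploy) is built for exactly these records: spend vs. savings over time, per-provider failover health, media $/second and $/image, and a **price-drift panel** that calls out any model@provider route whose reported bill deviates from the price table by more than ±20% (a ratio of about 100× is almost always a dollars-entered-as-cents slip). It stores metadata only, never prompts or outputs.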
|
|
189
|
+
|
|
147
190
|
## Supported providers
|
|
148
191
|
|
|
149
192
|
Any OpenAI-compatible endpoint works, and so does any AI SDK provider package, including a model vendor's own official API.
|
|
150
193
|
|
|
151
194
|
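- **Model vendors' official APIs (native):** connect directly to [DeepSeek](https://platform.deepseek.com), [OpenAI](https://openai.com), [Anthropic](https://anthropic.com), [Google](https://ai.google.dev), [xAI](https://x.ai), etc. through their AI SDK provider packages: no markup, full native features. See the section "Route to a model vendor's own API (native providers)" above.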
|
|
152
195
|
- **Text aggregators:** [OpenRouter](https://openrouter.ai) (widest coverage, list pricing) · [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off every model**) · [TokenMart](https://thetokenmart.ai) (15–65% off, varies by model)
|
|
153
|
-
- **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai), routed via `createMediaLCR`. Image: Kunavo + Runware + fal. Video: fal
|
|
196
|
+
- **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai), routed via `createMediaLCR`. Image: Kunavo (generations + `*-edit` reference-image endpoints) + Runware + fal. Video: fal (async queue), Kunavo (async `POST /v1/videos` + polling, with a sync fallback), and Runware (async `videoInference` + `getResponse` polling); all three are on the async `submit`/`poll` path
|
|
154
197
|
|
|
155
198
|
## Text model pricing
|
|
156
199
|
|
|
@@ -209,7 +252,9 @@ Kunavo serves Anthropic + Google. DeepSeek / OpenAI / Grok / Mistral route to
|
|
|
209
252
|
|
|
210
253
|
## Image and video routing (`createMediaLCR`)
|
|
211
254
|
|
|
212
|
-
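Image and video are a separate side of `ai-lcr` (outputs are files, pricing units are mixed, video is an async job); see [`src/media.ts`](src/media.ts). You supply a registry (per-model provider routes + unit prices) and a set of adapters, and it routes cheapest-first, fails over automatically, and, through the same `onCall` sink as the text side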
|
|
255
|
+
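Image and video are a separate side of `ai-lcr` (outputs are files, pricing units are mixed, video is an async job); see [`src/media.ts`](src/media.ts). You supply a registry (per-model provider routes + unit prices) and a set of adapters, and it routes cheapest-first, fails over automatically, and reports real cost through the same `onCall` sink as the text side.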
|
|
256
|
+
|
|
257
|
+
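Two prices, two jobs (0.6+): **ranking** uses prices normalized to a reference output (one 1080p image / one 5-second clip), so mixed pricing units compare fairly; **billing** for each call follows actual usage. On a per-second SKU, an 8-second clip is charged for 8 seconds, and the savings baseline is computed at the official price for the same 8 seconds. Adapters report typed actual usage (`usage: { seconds, outputs, megapixels }`); when the provider reports its own bill, the bill wins, and when that bill is far from the price-table estimate (the classic dollars-as-cents slip is exactly 100×), `onError` fires to prompt you to fix the price table.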
|
|
213
258
|
|
|
214
259
|
```ts
|
|
215
260
|
import { createMediaLCR, createKunavoMediaAdapter, createFalMediaAdapter } from 'ai-lcr'
|
|
@@ -261,6 +306,30 @@ if (r.done) {
|
|
|
261
306
|
- **Telemetry lands once, at the terminal poll**: one `onCall` `CallRecord` with the full failover chain, threaded across both processes (not at `submit`).
|
|
262
307
|
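- An adapter declares async support by implementing `submit` + `checkStatus`; image-only adapters omit them, and the async router skips those routes. The bundled Kunavo, fal, and Runware adapters all implement the async path (Kunavo/Runware async is video-only; fal covers both image and video).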
|
|
263
308
|
|
|
309
|
+
### Writing your own adapter
|
|
310
|
+
|
|
311
|
+
A `MediaAdapter` is small (`run` for sync, optional `submit`/`checkStatus` for async), and the one contract that matters is **how you report what was produced**:
|
|
312
|
+
|
|
313
|
+
```ts
|
|
314
|
+
// On a settled result, report:
|
|
315
|
+
{
|
|
316
|
+
outputs: [{ url, type: "image" | "video" }],
|
|
317
|
+
costCents?: number, // the provider's own bill, in US cents; multiply by 100 if the API returns dollars!
|
|
318
|
+
usage?: { // typed actual usage: what the bill (or estimate) is based on
|
|
319
|
+
seconds?: number, // video seconds actually produced (per-second SKUs bill this)
|
|
320
|
+
outputs?: number, // output count, images or clips (per-image / per-call SKUs bill this)
|
|
321
|
+
megapixels?: number // total output megapixels (per-MP SKUs bill this)
|
|
322
|
+
}
|
|
323
|
+
}
|
|
324
|
+
```
|
|
325
|
+
|
|
326
|
+
Rules that keep billing correct:
|
|
327
|
+
|
|
328
|
+
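- **Name dimensions explicitly in `usage`; never report a bare number.** Seconds and output count are two separate fields, so a flat per-call price can never be multiplied by clip length (the classic 8× overcharge).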
|
|
329
|
+
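- **`costCents` is cents.** If the API returns dollars, convert in the adapter (see the Runware adapter). If you slip, the router's cost-outlier guard fires `onError` when the bill is ≥25× off, but the reported number still stands.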
|
|
330
|
+
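- **When you report nothing**, the router estimates: per-second SKUs read `usage.seconds`, then the input's `duration` (a number or an `"8s"`-style string), and only then fall back to the 5-second reference; per-image/per-call SKUs bill the output count.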
|
|
331
|
+
- **Throw errors with an HTTP `status` property** (see `FalMediaError`/`KunavoMediaError`) so the router can classify them correctly and fail over.
|
|
332
|
+
|
|
264
333
|
## Vetting a provider (capability + cost probe)
|
|
265
334
|
|
|
266
335
|
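A discount is worthless if the provider quietly breaks the protocol. `ai-lcr` ships a zero-dependency check script (`scripts/check-provider.sh`, needing only `bash` + `curl` + `python3`) that verifies, **per model**, the things that actually cost you money or corrupt output: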
|
|
@@ -321,8 +390,10 @@ API_KEY=$INFERENCE_API_KEY BASE=https://model.service-inference.ai \
|
|
|
321
390
|
- [ ] Bundled price table for zero-config pricing (no more hand-typed `cost` numbers)
|
|
322
391
|
- [ ] Provider-quirk middleware (transparently patch known quirks, e.g. Kunavo's ignored `max_tokens`)
|
|
323
392
|
- [ ] Feed probe results into routing automatically (auto-exclude a provider×model pair that fails its probe)
|
|
324
|
-
- [x] Image & video model routing (`createMediaLCR`): image via Kunavo
|
|
325
|
-
- [
|
|
393
|
+
- [x] Image & video model routing (`createMediaLCR`): image via Kunavo (incl. `*-edit`) + Runware + fal; video async (`submit`/`poll`) via fal, Kunavo, and Runware
|
|
394
|
+
- [x] Settle-time billing on actual usage (0.6): typed `usage`, duration-aware savings baseline, `estCostUsd` price-drift signal, cost-outlier guard
|
|
395
|
+
- [x] Self-hosted dashboard ([`ai-lcr-dashboard`](https://github.com/ai-lcr/ai-lcr-dashboard)): savings, failover health, media unit cost, price-drift panel
|
|
396
|
+
- [ ] Normalized cross-provider video price comparison in the bundled table
|
|
326
397
|
|
|
327
398
|
## Affiliate disclosure
|
|
328
399
|
|
package/dist/index.cjs
CHANGED
|
@@ -22,6 +22,7 @@ var index_exports = {};
|
|
|
22
22
|
__export(index_exports, {
|
|
23
23
|
DEFAULT_REFERENCE: () => DEFAULT_REFERENCE,
|
|
24
24
|
MEDIA_PRICING: () => MEDIA_PRICING,
|
|
25
|
+
MODEL_PRICES: () => MODEL_PRICES,
|
|
25
26
|
OFFICIAL_PRICES: () => OFFICIAL_PRICES,
|
|
26
27
|
billableUnits: () => billableUnits,
|
|
27
28
|
cheapestRoute: () => cheapestRoute,
|
|
@@ -36,6 +37,7 @@ __export(index_exports, {
|
|
|
36
37
|
createRunwareMediaAdapter: () => createRunwareMediaAdapter,
|
|
37
38
|
durationFromInput: () => durationFromInput,
|
|
38
39
|
formatCallRecord: () => formatCallRecord,
|
|
40
|
+
getModelPrice: () => getModelPrice,
|
|
39
41
|
isAbortError: () => isAbortError,
|
|
40
42
|
isNetworkError: () => isNetworkError,
|
|
41
43
|
isRetryableError: () => isRetryableError,
|
|
@@ -637,6 +639,239 @@ function createHttpSink(options) {
|
|
|
637
639
|
};
|
|
638
640
|
}
|
|
639
641
|
|
|
642
|
+
// src/text-prices.ts
|
|
643
|
+
var MODEL_PRICES = {
|
|
644
|
+
"chatgpt-4o-latest": { input: 5, output: 15 },
|
|
645
|
+
"claude-3-7-sonnet-20250219": { input: 3, output: 15, cacheRead: 0.3 },
|
|
646
|
+
"claude-3-haiku-20240307": { input: 0.25, output: 1.25, cacheRead: 0.03 },
|
|
647
|
+
"claude-3-opus-20240229": { input: 15, output: 75, cacheRead: 1.5 },
|
|
648
|
+
"claude-4-opus-20250514": { input: 15, output: 75, cacheRead: 1.5 },
|
|
649
|
+
"claude-4-sonnet-20250514": { input: 3, output: 15, cacheRead: 0.3 },
|
|
650
|
+
"claude-fable-5": { input: 10, output: 50, cacheRead: 1 },
|
|
651
|
+
"claude-haiku-4-5": { input: 1, output: 5, cacheRead: 0.1 },
|
|
652
|
+
"claude-haiku-4-5-20251001": { input: 1, output: 5, cacheRead: 0.1 },
|
|
653
|
+
"claude-opus-4-1": { input: 15, output: 75, cacheRead: 1.5 },
|
|
654
|
+
"claude-opus-4-1-20250805": { input: 15, output: 75, cacheRead: 1.5 },
|
|
655
|
+
"claude-opus-4-20250514": { input: 15, output: 75, cacheRead: 1.5 },
|
|
656
|
+
"claude-opus-4-5": { input: 5, output: 25, cacheRead: 0.5 },
|
|
657
|
+
"claude-opus-4-5-20251101": { input: 5, output: 25, cacheRead: 0.5 },
|
|
658
|
+
"claude-opus-4-6": { input: 5, output: 25, cacheRead: 0.5 },
|
|
659
|
+
"claude-opus-4-6-20260205": { input: 5, output: 25, cacheRead: 0.5 },
|
|
660
|
+
"claude-opus-4-7": { input: 5, output: 25, cacheRead: 0.5 },
|
|
661
|
+
"claude-opus-4-7-20260416": { input: 5, output: 25, cacheRead: 0.5 },
|
|
662
|
+
"claude-opus-4-8": { input: 5, output: 25, cacheRead: 0.5 },
|
|
663
|
+
"claude-sonnet-4-20250514": { input: 3, output: 15, cacheRead: 0.3 },
|
|
664
|
+
"claude-sonnet-4-5": { input: 3, output: 15, cacheRead: 0.3 },
|
|
665
|
+
"claude-sonnet-4-5-20250929": { input: 3, output: 15, cacheRead: 0.3 },
|
|
666
|
+
"claude-sonnet-4-6": { input: 3, output: 15, cacheRead: 0.3 },
|
|
667
|
+
"codestral-2405": { input: 1, output: 3 },
|
|
668
|
+
"codestral-2508": { input: 0.3, output: 0.9 },
|
|
669
|
+
"codestral-latest": { input: 1, output: 3 },
|
|
670
|
+
"codestral-mamba-latest": { input: 0.25, output: 0.25 },
|
|
671
|
+
"deepseek-chat": { input: 0.28, output: 0.42, cacheRead: 0.028 },
|
|
672
|
+
"deepseek-coder": { input: 0.14, output: 0.28 },
|
|
673
|
+
"deepseek-r1": { input: 0.55, output: 2.19 },
|
|
674
|
+
"deepseek-reasoner": { input: 0.28, output: 0.42, cacheRead: 0.028 },
|
|
675
|
+
"deepseek-v3": { input: 0.27, output: 1.1, cacheRead: 0.07 },
|
|
676
|
+
"deepseek-v3.2": { input: 0.28, output: 0.4 },
|
|
677
|
+
"devstral-2512": { input: 0.4, output: 2 },
|
|
678
|
+
"devstral-latest": { input: 0.4, output: 2 },
|
|
679
|
+
"devstral-medium-2507": { input: 0.4, output: 2 },
|
|
680
|
+
"devstral-medium-latest": { input: 0.4, output: 2 },
|
|
681
|
+
"devstral-small-2505": { input: 0.1, output: 0.3 },
|
|
682
|
+
"devstral-small-2507": { input: 0.1, output: 0.3 },
|
|
683
|
+
"devstral-small-latest": { input: 0.1, output: 0.3 },
|
|
684
|
+
"gemini-2.0-flash": { input: 0.1, output: 0.4, cacheRead: 0.025 },
|
|
685
|
+
"gemini-2.0-flash-001": { input: 0.1, output: 0.4, cacheRead: 0.025 },
|
|
686
|
+
"gemini-2.0-flash-lite": { input: 0.075, output: 0.3, cacheRead: 0.01875 },
|
|
687
|
+
"gemini-2.0-flash-lite-001": { input: 0.075, output: 0.3, cacheRead: 0.01875 },
|
|
688
|
+
"gemini-2.5-computer-use-preview-10-2025": { input: 1.25, output: 10 },
|
|
689
|
+
"gemini-2.5-flash": { input: 0.3, output: 2.5, cacheRead: 0.03 },
|
|
690
|
+
"gemini-2.5-flash-lite": { input: 0.1, output: 0.4, cacheRead: 0.01 },
|
|
691
|
+
"gemini-2.5-flash-lite-preview-06-17": { input: 0.1, output: 0.4, cacheRead: 0.025 },
|
|
692
|
+
"gemini-2.5-flash-lite-preview-09-2025": { input: 0.1, output: 0.4, cacheRead: 0.01 },
|
|
693
|
+
"gemini-2.5-flash-native-audio-latest": { input: 0.3, output: 2.5 },
|
|
694
|
+
"gemini-2.5-flash-native-audio-preview-09-2025": { input: 0.3, output: 2.5 },
|
|
695
|
+
"gemini-2.5-flash-native-audio-preview-12-2025": { input: 0.3, output: 2.5 },
|
|
696
|
+
"gemini-2.5-flash-preview-09-2025": { input: 0.3, output: 2.5, cacheRead: 0.075 },
|
|
697
|
+
"gemini-2.5-pro": { input: 1.25, output: 10, cacheRead: 0.125 },
|
|
698
|
+
"gemini-2.5-pro-preview-tts": { input: 1.25, output: 10, cacheRead: 0.125 },
|
|
699
|
+
"gemini-3-flash-preview": { input: 0.5, output: 3, cacheRead: 0.05 },
|
|
700
|
+
"gemini-3-pro-preview": { input: 2, output: 12, cacheRead: 0.2 },
|
|
701
|
+
"gemini-3.1-flash-lite": { input: 0.25, output: 1.5, cacheRead: 0.025 },
|
|
702
|
+
"gemini-3.1-flash-lite-preview": { input: 0.25, output: 1.5, cacheRead: 0.025 },
|
|
703
|
+
"gemini-3.1-flash-live-preview": { input: 0.75, output: 4.5 },
|
|
704
|
+
"gemini-3.1-pro-preview": { input: 2, output: 12, cacheRead: 0.2 },
|
|
705
|
+
"gemini-3.1-pro-preview-customtools": { input: 2, output: 12, cacheRead: 0.2 },
|
|
706
|
+
"gemini-3.5-flash": { input: 1.5, output: 9, cacheRead: 0.15 },
|
|
707
|
+
"gemini-exp-1206": { input: 0.3, output: 2.5, cacheRead: 0.03 },
|
|
708
|
+
"gemini-flash-latest": { input: 0.3, output: 2.5, cacheRead: 0.075 },
|
|
709
|
+
"gemini-flash-lite-latest": { input: 0.1, output: 0.4, cacheRead: 0.025 },
|
|
710
|
+
"gemini-gemma-2-27b-it": { input: 0.35, output: 1.05 },
|
|
711
|
+
"gemini-gemma-2-9b-it": { input: 0.35, output: 1.05 },
|
|
712
|
+
"gemini-pro-latest": { input: 1.25, output: 10, cacheRead: 0.125 },
|
|
713
|
+
"gemini-robotics-er-1.5-preview": { input: 0.3, output: 2.5 },
|
|
714
|
+
"gpt-3.5-turbo": { input: 0.5, output: 1.5 },
|
|
715
|
+
"gpt-3.5-turbo-0125": { input: 0.5, output: 1.5 },
|
|
716
|
+
"gpt-3.5-turbo-1106": { input: 1, output: 2 },
|
|
717
|
+
"gpt-3.5-turbo-16k": { input: 3, output: 4 },
|
|
718
|
+
"gpt-4": { input: 30, output: 60 },
|
|
719
|
+
"gpt-4-0125-preview": { input: 10, output: 30 },
|
|
720
|
+
"gpt-4-0314": { input: 30, output: 60 },
|
|
721
|
+
"gpt-4-0613": { input: 30, output: 60 },
|
|
722
|
+
"gpt-4-1106-preview": { input: 10, output: 30 },
|
|
723
|
+
"gpt-4-turbo": { input: 10, output: 30 },
|
|
724
|
+
"gpt-4-turbo-2024-04-09": { input: 10, output: 30 },
|
|
725
|
+
"gpt-4-turbo-preview": { input: 10, output: 30 },
|
|
726
|
+
"gpt-4.1": { input: 2, output: 8, cacheRead: 0.5 },
|
|
727
|
+
"gpt-4.1-2025-04-14": { input: 2, output: 8, cacheRead: 0.5 },
|
|
728
|
+
"gpt-4.1-mini": { input: 0.4, output: 1.6, cacheRead: 0.1 },
|
|
729
|
+
"gpt-4.1-mini-2025-04-14": { input: 0.4, output: 1.6, cacheRead: 0.1 },
|
|
730
|
+
"gpt-4.1-nano": { input: 0.1, output: 0.4, cacheRead: 0.025 },
|
|
731
|
+
"gpt-4.1-nano-2025-04-14": { input: 0.1, output: 0.4, cacheRead: 0.025 },
|
|
732
|
+
"gpt-4o": { input: 2.5, output: 10, cacheRead: 1.25 },
|
|
733
|
+
"gpt-4o-2024-05-13": { input: 5, output: 15 },
|
|
734
|
+
"gpt-4o-2024-08-06": { input: 2.5, output: 10, cacheRead: 1.25 },
|
|
735
|
+
"gpt-4o-2024-11-20": { input: 2.5, output: 10, cacheRead: 1.25 },
|
|
736
|
+
"gpt-4o-audio-preview": { input: 2.5, output: 10 },
|
|
737
|
+
"gpt-4o-audio-preview-2024-12-17": { input: 2.5, output: 10 },
|
|
738
|
+
"gpt-4o-audio-preview-2025-06-03": { input: 2.5, output: 10 },
|
|
739
|
+
"gpt-4o-mini": { input: 0.15, output: 0.6, cacheRead: 0.075 },
|
|
740
|
+
"gpt-4o-mini-2024-07-18": { input: 0.15, output: 0.6, cacheRead: 0.075 },
|
|
741
|
+
"gpt-4o-mini-audio-preview": { input: 0.15, output: 0.6 },
|
|
742
|
+
"gpt-4o-mini-audio-preview-2024-12-17": { input: 0.15, output: 0.6 },
|
|
743
|
+
"gpt-4o-mini-realtime-preview": { input: 0.6, output: 2.4, cacheRead: 0.3 },
|
|
744
|
+
"gpt-4o-mini-realtime-preview-2024-12-17": { input: 0.6, output: 2.4, cacheRead: 0.3 },
|
|
745
|
+
"gpt-4o-mini-search-preview": { input: 0.15, output: 0.6, cacheRead: 0.075 },
|
|
746
|
+
"gpt-4o-mini-search-preview-2025-03-11": { input: 0.15, output: 0.6, cacheRead: 0.075 },
|
|
747
|
+
"gpt-4o-realtime-preview": { input: 5, output: 20, cacheRead: 2.5 },
|
|
748
|
+
"gpt-4o-realtime-preview-2024-12-17": { input: 5, output: 20, cacheRead: 2.5 },
|
|
749
|
+
"gpt-4o-realtime-preview-2025-06-03": { input: 5, output: 20, cacheRead: 2.5 },
|
|
750
|
+
"gpt-4o-search-preview": { input: 2.5, output: 10, cacheRead: 1.25 },
|
|
751
|
+
"gpt-4o-search-preview-2025-03-11": { input: 2.5, output: 10, cacheRead: 1.25 },
|
|
752
|
+
"gpt-5": { input: 1.25, output: 10, cacheRead: 0.125 },
|
|
753
|
+
"gpt-5-2025-08-07": { input: 1.25, output: 10, cacheRead: 0.125 },
|
|
754
|
+
"gpt-5-chat": { input: 1.25, output: 10, cacheRead: 0.125 },
|
|
755
|
+
"gpt-5-chat-latest": { input: 1.25, output: 10, cacheRead: 0.125 },
|
|
756
|
+
"gpt-5-mini": { input: 0.25, output: 2, cacheRead: 0.025 },
|
|
757
|
+
"gpt-5-mini-2025-08-07": { input: 0.25, output: 2, cacheRead: 0.025 },
|
|
758
|
+
"gpt-5-nano": { input: 0.05, output: 0.4, cacheRead: 5e-3 },
|
|
759
|
+
"gpt-5-nano-2025-08-07": { input: 0.05, output: 0.4, cacheRead: 5e-3 },
|
|
760
|
+
"gpt-5-search-api": { input: 1.25, output: 10, cacheRead: 0.125 },
|
|
761
|
+
"gpt-5-search-api-2025-10-14": { input: 1.25, output: 10, cacheRead: 0.125 },
|
|
762
|
+
"gpt-5.1": { input: 1.25, output: 10, cacheRead: 0.125 },
|
|
763
|
+
"gpt-5.1-2025-11-13": { input: 1.25, output: 10, cacheRead: 0.125 },
|
|
764
|
+
"gpt-5.1-chat-latest": { input: 1.25, output: 10, cacheRead: 0.125 },
|
|
765
|
+
"gpt-5.2": { input: 1.75, output: 14, cacheRead: 0.175 },
|
|
766
|
+
"gpt-5.2-2025-12-11": { input: 1.75, output: 14, cacheRead: 0.175 },
|
|
767
|
+
"gpt-5.2-chat-latest": { input: 1.75, output: 14, cacheRead: 0.175 },
|
|
768
|
+
"gpt-5.3-chat-latest": { input: 1.75, output: 14, cacheRead: 0.175 },
|
|
769
|
+
"gpt-5.4": { input: 2.5, output: 15, cacheRead: 0.25 },
|
|
770
|
+
"gpt-5.4-2026-03-05": { input: 2.5, output: 15, cacheRead: 0.25 },
|
|
771
|
+
"gpt-5.4-mini": { input: 0.75, output: 4.5, cacheRead: 0.075 },
|
|
772
|
+
"gpt-5.4-mini-2026-03-17": { input: 0.75, output: 4.5, cacheRead: 0.075 },
|
|
773
|
+
"gpt-5.4-nano": { input: 0.2, output: 1.25, cacheRead: 0.02 },
|
|
774
|
+
"gpt-5.4-nano-2026-03-17": { input: 0.2, output: 1.25, cacheRead: 0.02 },
|
|
775
|
+
"gpt-5.5": { input: 5, output: 30, cacheRead: 0.5 },
|
|
776
|
+
"gpt-5.5-2026-04-23": { input: 5, output: 30, cacheRead: 0.5 },
|
|
777
|
+
"gpt-audio": { input: 2.5, output: 10 },
|
|
778
|
+
"gpt-audio-1.5": { input: 2.5, output: 10 },
|
|
779
|
+
"gpt-audio-2025-08-28": { input: 2.5, output: 10 },
|
|
780
|
+
"gpt-audio-mini": { input: 0.6, output: 2.4 },
|
|
781
|
+
"gpt-audio-mini-2025-10-06": { input: 0.6, output: 2.4 },
|
|
782
|
+
"gpt-audio-mini-2025-12-15": { input: 0.6, output: 2.4 },
|
|
783
|
+
"gpt-realtime": { input: 4, output: 16, cacheRead: 0.4 },
|
|
784
|
+
"gpt-realtime-1.5": { input: 4, output: 16, cacheRead: 0.4 },
|
|
785
|
+
"gpt-realtime-2": { input: 4, output: 16, cacheRead: 0.4 },
|
|
786
|
+
"gpt-realtime-2025-08-28": { input: 4, output: 16, cacheRead: 0.4 },
|
|
787
|
+
"gpt-realtime-mini": { input: 0.6, output: 2.4 },
|
|
788
|
+
"gpt-realtime-mini-2025-10-06": { input: 0.6, output: 2.4, cacheRead: 0.06 },
|
|
789
|
+
"gpt-realtime-mini-2025-12-15": { input: 0.6, output: 2.4, cacheRead: 0.06 },
|
|
790
|
+
"grok-2": { input: 2, output: 10 },
|
|
791
|
+
"grok-2-1212": { input: 2, output: 10 },
|
|
792
|
+
"grok-2-latest": { input: 2, output: 10 },
|
|
793
|
+
"grok-2-vision": { input: 2, output: 10 },
|
|
794
|
+
"grok-2-vision-1212": { input: 2, output: 10 },
|
|
795
|
+
"grok-2-vision-latest": { input: 2, output: 10 },
|
|
796
|
+
"grok-3": { input: 3, output: 15, cacheRead: 0.75 },
|
|
797
|
+
"grok-3-beta": { input: 3, output: 15, cacheRead: 0.75 },
|
|
798
|
+
"grok-3-fast-beta": { input: 5, output: 25, cacheRead: 1.25 },
|
|
799
|
+
"grok-3-fast-latest": { input: 5, output: 25, cacheRead: 1.25 },
|
|
800
|
+
"grok-3-latest": { input: 3, output: 15, cacheRead: 0.75 },
|
|
801
|
+
"grok-3-mini": { input: 0.3, output: 0.5, cacheRead: 0.075 },
|
|
802
|
+
"grok-3-mini-beta": { input: 0.3, output: 0.5, cacheRead: 0.075 },
|
|
803
|
+
"grok-3-mini-fast": { input: 0.6, output: 4, cacheRead: 0.15 },
|
|
804
|
+
"grok-3-mini-fast-beta": { input: 0.6, output: 4, cacheRead: 0.15 },
|
|
805
|
+
"grok-3-mini-fast-latest": { input: 0.6, output: 4, cacheRead: 0.15 },
|
|
806
|
+
"grok-3-mini-latest": { input: 0.3, output: 0.5, cacheRead: 0.075 },
|
|
807
|
+
"grok-4": { input: 3, output: 15 },
|
|
808
|
+
"grok-4-0709": { input: 3, output: 15 },
|
|
809
|
+
"grok-4-1-fast": { input: 0.2, output: 0.5, cacheRead: 0.05 },
|
|
810
|
+
"grok-4-1-fast-non-reasoning": { input: 0.2, output: 0.5, cacheRead: 0.05 },
|
|
811
|
+
"grok-4-1-fast-non-reasoning-latest": { input: 0.2, output: 0.5, cacheRead: 0.05 },
|
|
812
|
+
"grok-4-1-fast-reasoning": { input: 0.2, output: 0.5, cacheRead: 0.05 },
|
|
813
|
+
"grok-4-1-fast-reasoning-latest": { input: 0.2, output: 0.5, cacheRead: 0.05 },
|
|
814
|
+
"grok-4-fast-non-reasoning": { input: 0.2, output: 0.5, cacheRead: 0.05 },
|
|
815
|
+
"grok-4-fast-reasoning": { input: 0.2, output: 0.5, cacheRead: 0.05 },
|
|
816
|
+
"grok-4-latest": { input: 3, output: 15 },
|
|
817
|
+
"grok-4.20-0309-reasoning": { input: 2, output: 6, cacheRead: 0.2 },
|
|
818
|
+
"grok-4.20-beta-0309-non-reasoning": { input: 2, output: 6, cacheRead: 0.2 },
|
|
819
|
+
"grok-4.20-beta-0309-reasoning": { input: 2, output: 6, cacheRead: 0.2 },
|
|
820
|
+
"grok-4.20-multi-agent-beta-0309": { input: 2, output: 6, cacheRead: 0.2 },
|
|
821
|
+
"grok-4.3": { input: 1.25, output: 2.5, cacheRead: 0.2 },
|
|
822
|
+
"grok-4.3-latest": { input: 1.25, output: 2.5, cacheRead: 0.2 },
|
|
823
|
+
"grok-beta": { input: 5, output: 15 },
|
|
824
|
+
"grok-code-fast": { input: 0.2, output: 1.5, cacheRead: 0.02 },
|
|
825
|
+
"grok-code-fast-1": { input: 0.2, output: 1.5, cacheRead: 0.02 },
|
|
826
|
+
"grok-code-fast-1-0825": { input: 0.2, output: 1.5, cacheRead: 0.02 },
|
|
827
|
+
"grok-vision-beta": { input: 5, output: 15 },
|
|
828
|
+
"labs-devstral-small-2512": { input: 0.1, output: 0.3 },
|
|
829
|
+
"magistral-medium-1-2-2509": { input: 2, output: 5 },
|
|
830
|
+
"magistral-medium-2506": { input: 2, output: 5 },
|
|
831
|
+
"magistral-medium-2509": { input: 2, output: 5 },
|
|
832
|
+
"magistral-medium-latest": { input: 2, output: 5 },
|
|
833
|
+
"magistral-small-1-2-2509": { input: 0.5, output: 1.5 },
|
|
834
|
+
"magistral-small-2506": { input: 0.5, output: 1.5 },
|
|
835
|
+
"magistral-small-latest": { input: 0.5, output: 1.5 },
|
|
836
|
+
"ministral-3-14b-2512": { input: 0.2, output: 0.2 },
|
|
837
|
+
"ministral-3-3b-2512": { input: 0.1, output: 0.1 },
|
|
838
|
+
"ministral-3-8b-2512": { input: 0.15, output: 0.15 },
|
|
839
|
+
"ministral-8b-2512": { input: 0.15, output: 0.15 },
|
|
840
|
+
"ministral-8b-latest": { input: 0.15, output: 0.15 },
|
|
841
|
+
"mistral-large-2402": { input: 4, output: 12 },
|
|
842
|
+
"mistral-large-2407": { input: 3, output: 9 },
|
|
843
|
+
"mistral-large-2411": { input: 2, output: 6 },
|
|
844
|
+
"mistral-large-2512": { input: 0.5, output: 1.5 },
|
|
845
|
+
"mistral-large-3": { input: 0.5, output: 1.5 },
|
|
846
|
+
"mistral-large-latest": { input: 0.5, output: 1.5 },
|
|
847
|
+
"mistral-medium": { input: 2.7, output: 8.1 },
|
|
848
|
+
"mistral-medium-2312": { input: 2.7, output: 8.1 },
|
|
849
|
+
"mistral-medium-2505": { input: 0.4, output: 2 },
|
|
850
|
+
"mistral-medium-3-1-2508": { input: 0.4, output: 2 },
|
|
851
|
+
"mistral-medium-latest": { input: 0.4, output: 2 },
|
|
852
|
+
"mistral-small": { input: 0.1, output: 0.3 },
|
|
853
|
+
"mistral-small-3-2-2506": { input: 0.06, output: 0.18 },
|
|
854
|
+
"mistral-small-latest": { input: 0.06, output: 0.18 },
|
|
855
|
+
"mistral-tiny": { input: 0.25, output: 0.25 },
|
|
856
|
+
"o1": { input: 15, output: 60, cacheRead: 7.5 },
|
|
857
|
+
"o1-2024-12-17": { input: 15, output: 60, cacheRead: 7.5 },
|
|
858
|
+
"o3": { input: 2, output: 8, cacheRead: 0.5 },
|
|
859
|
+
"o3-2025-04-16": { input: 2, output: 8, cacheRead: 0.5 },
|
|
860
|
+
"o3-mini": { input: 1.1, output: 4.4, cacheRead: 0.55 },
|
|
861
|
+
"o3-mini-2025-01-31": { input: 1.1, output: 4.4, cacheRead: 0.55 },
|
|
862
|
+
"o4-mini": { input: 1.1, output: 4.4, cacheRead: 0.275 },
|
|
863
|
+
"o4-mini-2025-04-16": { input: 1.1, output: 4.4, cacheRead: 0.275 },
|
|
864
|
+
"open-codestral-mamba": { input: 0.25, output: 0.25 },
|
|
865
|
+
"open-mistral-7b": { input: 0.25, output: 0.25 },
|
|
866
|
+
"open-mistral-nemo": { input: 0.3, output: 0.3 },
|
|
867
|
+
"open-mistral-nemo-2407": { input: 0.3, output: 0.3 },
|
|
868
|
+
"open-mixtral-8x22b": { input: 2, output: 6 },
|
|
869
|
+
"open-mixtral-8x7b": { input: 0.7, output: 0.7 },
|
|
870
|
+
"pixtral-12b-2409": { input: 0.15, output: 0.15 },
|
|
871
|
+
"pixtral-large-2411": { input: 2, output: 6 },
|
|
872
|
+
"pixtral-large-latest": { input: 2, output: 6 }
|
|
873
|
+
};
|
|
874
|
+
|
|
640
875
|
// src/media-official.ts
|
|
641
876
|
var OFFICIAL_PRICES = {
|
|
642
877
|
"alibaba/qwen-image": { unit: "image", cents: 3.5 },
|
|
@@ -1649,6 +1884,17 @@ async function safeText2(res) {
 }
 
 // src/index.ts
+function getModelPrice(modelId) {
+  if (!modelId) return void 0;
+  const direct = MODEL_PRICES[modelId];
+  if (direct) return direct;
+  const slash = modelId.indexOf("/");
+  if (slash !== -1) {
+    const bare = MODEL_PRICES[modelId.slice(slash + 1)];
+    if (bare) return bare;
+  }
+  return void 0;
+}
 function isLanguageModel(entry) {
   return typeof entry.doGenerate === "function";
 }
@@ -1659,9 +1905,25 @@ function normalize(entry) {
   return {
     model: entry.model,
     label: entry.label ?? entry.model.provider,
-    cost: entry.cost
+    cost: entry.cost,
+    discount: entry.discount
   };
 }
+function applyDiscount(cost, discount) {
+  const f = 1 - discount;
+  return {
+    input: cost.input * f,
+    output: cost.output * f,
+    ...cost.cacheRead !== void 0 ? { cacheRead: cost.cacheRead * f } : {}
+  };
+}
+function withAutoPrice(p, autoPrice) {
+  const { discount, ...rest } = p;
+  if (!autoPrice || rest.cost !== void 0) return rest;
+  const base = getModelPrice(rest.model.modelId);
+  if (!base) return rest;
+  return { ...rest, cost: discount !== void 0 ? applyDiscount(base, discount) : base };
+}
 function priceKey(p) {
   return p.cost ? p.cost.input + p.cost.output : Number.POSITIVE_INFINITY;
 }
@@ -1673,6 +1935,7 @@ function createLCR(config) {
   const {
     models,
     autoSort = false,
+    autoPrice = false,
     resetIntervalMs,
     onError,
     onCost,
@@ -1687,7 +1950,13 @@ function createLCR(config) {
   }
   const routed = /* @__PURE__ */ new Map();
   for (const [name, entries] of Object.entries(models)) {
-
+    for (const entry of entries) {
+      const d = entry.discount;
+      if (d !== void 0 && (d < 0 || d >= 1)) {
+        throw new Error(`ai-lcr: discount must be in [0, 1) for model "${name}", got ${d}`);
+      }
+    }
+    let providers = entries.map(normalize).map((p) => withAutoPrice(p, autoPrice)).map((p) => withDefaultCacheRead(p, defaultCacheReadRatio));
     if (autoSort) {
       providers = [...providers].sort((a, b) => priceKey(a) - priceKey(b));
     }
@@ -1710,6 +1979,7 @@ function createLCR(config) {
 0 && (module.exports = {
   DEFAULT_REFERENCE,
   MEDIA_PRICING,
+  MODEL_PRICES,
   OFFICIAL_PRICES,
   billableUnits,
   cheapestRoute,
@@ -1724,6 +1994,7 @@ function createLCR(config) {
   createRunwareMediaAdapter,
   durationFromInput,
   formatCallRecord,
+  getModelPrice,
   isAbortError,
   isNetworkError,
   isRetryableError,
|
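The lookup order added in `getModelPrice` can be illustrated with a small self-contained sketch. It is not the package's code: `SAMPLE_PRICES` and `lookupPrice` are hypothetical names, and the two-entry table is a stand-in that copies two rows from the bundled `MODEL_PRICES`. The sketch assumes the same rule the diff shows: try the id as given, then retry with everything up to the first `/` removed.

```typescript
// Illustrative sketch only; SAMPLE_PRICES and lookupPrice are hypothetical names.
type ProviderCost = { input: number; output: number; cacheRead?: number };

// Two rows copied from the bundled table (USD per 1M tokens).
const SAMPLE_PRICES: Record<string, ProviderCost> = {
  "claude-haiku-4-5": { input: 1, output: 5, cacheRead: 0.1 },
  "gemini-2.5-pro": { input: 1.25, output: 10, cacheRead: 0.125 },
};

// Exact id first, then the id with its leading "provider/" segment stripped.
function lookupPrice(modelId: string): ProviderCost | undefined {
  if (!modelId) return undefined;
  const direct = SAMPLE_PRICES[modelId];
  if (direct) return direct;
  const slash = modelId.indexOf("/");
  if (slash !== -1) return SAMPLE_PRICES[modelId.slice(slash + 1)];
  return undefined;
}

console.log(lookupPrice("claude-haiku-4-5"));           // bare id
console.log(lookupPrice("anthropic/claude-haiku-4-5")); // one prefix stripped
console.log(lookupPrice("router/anthropic/claude-haiku-4-5")); // two prefixes: no match
console.log(lookupPrice("not-a-model"));                // unknown id
```

Only the first segment is stripped, so an id carrying two prefixes does not resolve under this rule.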
package/dist/index.d.cts
CHANGED
|
@@ -324,6 +324,8 @@ interface HttpSinkOptions {
  */
 declare function createHttpSink(options: HttpSinkOptions): (record: CallRecord) => void;
 
+declare const MODEL_PRICES: Record<string, ProviderCost>;
+
 /**
  * ai-lcr media routing — Least Cost Routing for image & video models.
  *
@@ -858,7 +860,24 @@ type ProviderEntry = LanguageModelV3 | {
     cost?: ProviderCost;
     /** Label used in cost events / logs. Defaults to the model's provider id. */
     label?: string;
+    /**
+     * Fraction off the bundled list price (0–1) — the reseller-discount knob.
+     * Applied ONLY when `autoPrice` fills this entry from {@link MODEL_PRICES}
+     * (i.e. no explicit `cost`): a flat-discount aggregator like Kunavo (−20%)
+     * becomes `{ model: kunavo("gemini-2.5-pro"), discount: 0.2 }` with no
+     * hand-typed price. Scales input, output, and cacheRead alike. Ignored when
+     * `cost` is set, when `autoPrice` is off, or when no bundled price is found.
+     */
+    discount?: number;
 };
+/**
+ * Look up a model's bundled official list price by id. Tries the id as given,
+ * then with a leading `provider/` segment stripped (so `anthropic/claude-haiku-4-5`
+ * resolves the same as `claude-haiku-4-5`). Returns undefined for unknown models.
+ * The table ({@link MODEL_PRICES}) carries native-maker first-party rates only —
+ * see `scripts/gen-text-prices.mjs`.
+ */
+declare function getModelPrice(modelId: string): ProviderCost | undefined;
 interface LCRConfig {
     /**
      * Map of logical model name -> providers to try, cheapest-first.
@@ -867,6 +886,16 @@ interface LCRConfig {
     models: Record<string, ProviderEntry[]>;
     /** Sort each model's providers cheapest-first by `cost` before routing. */
     autoSort?: boolean;
+    /**
+     * Fill any provider entry that has no explicit `cost` from the bundled price
+     * table ({@link MODEL_PRICES}), looked up by the entry's `model.modelId`. A
+     * native-vendor route then needs zero hand-typed pricing; a flat-discount
+     * aggregator just adds `discount` (see {@link ProviderEntry}). Off by default —
+     * unpriced entries stay unpriced (the pre-existing behavior), so turning it on
+     * never silently re-prices a model you priced yourself (explicit `cost` always
+     * wins). Pairs naturally with `autoSort` and `onCost`/`onCall`.
+     */
+    autoPrice?: boolean;
     /** Idle window after which routing snaps back to the cheapest provider. Default 60s. */
     resetIntervalMs?: number;
     /** Called when a provider errors and routing falls through to the next. */
@@ -912,4 +941,4 @@ type LCRRouter = (modelName: string) => LanguageModelV3;
|
|
|
912
941
|
*/
|
|
913
942
|
declare function createLCR(config: LCRConfig): LCRRouter;
|
|
914
943
|
|
|
915
|
-
export { type BillableContext, type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type HttpSinkOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaJobHandle, type MediaJobStatus, type MediaLCR, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPollResult, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaStatusRequest, type MediaStatusResult, type MediaSubmitOptions, type MediaSubmitRequest, type MediaSubmitResult, type MediaUnit, type MediaUsage, OFFICIAL_PRICES, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, billableUnits, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createFalMediaAdapter, createHttpSink, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, durationFromInput, formatCallRecord, isAbortError, isNetworkError, isRetryableError, normalizedCents, priceCents, rankRoutes, referenceMegapixels, shouldFailover };
|
|
944
|
+
export { type BillableContext, type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type HttpSinkOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, MODEL_PRICES, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaJobHandle, type MediaJobStatus, type MediaLCR, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPollResult, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaStatusRequest, type MediaStatusResult, type MediaSubmitOptions, type MediaSubmitRequest, type MediaSubmitResult, type MediaUnit, type MediaUsage, OFFICIAL_PRICES, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, billableUnits, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createFalMediaAdapter, createHttpSink, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, durationFromInput, formatCallRecord, getModelPrice, isAbortError, isNetworkError, isRetryableError, normalizedCents, priceCents, rankRoutes, referenceMegapixels, shouldFailover };
|
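The `discount` and `autoPrice` declarations above can be read alongside a minimal sketch of how a final cost might be resolved. This is an illustration under assumptions, not the library's implementation: `Entry`, `SAMPLE_PRICES`, `assertDiscount`, and `resolveCost` are hypothetical names, and the single table row is copied from the bundled prices. The `[0, 1)` range check follows the error message added to `createLCR` in the `index.cjs` diff.

```typescript
// Illustrative sketch only; every name here except ProviderCost's shape is hypothetical.
type ProviderCost = { input: number; output: number; cacheRead?: number };
type Entry = { modelId: string; cost?: ProviderCost; discount?: number };

// One row copied from the bundled table (USD per 1M tokens).
const SAMPLE_PRICES: Record<string, ProviderCost> = {
  "gemini-2.5-pro": { input: 1.25, output: 10, cacheRead: 0.125 },
};

// Reject discounts outside [0, 1), whether or not auto-pricing is on.
function assertDiscount(d: number | undefined): void {
  if (d !== undefined && (d < 0 || d >= 1)) {
    throw new Error(`discount must be in [0, 1), got ${d}`);
  }
}

// Scale input, output, and cacheRead (when present) by the same factor.
function applyDiscount(cost: ProviderCost, discount: number): ProviderCost {
  const f = 1 - discount;
  return {
    input: cost.input * f,
    output: cost.output * f,
    ...(cost.cacheRead !== undefined ? { cacheRead: cost.cacheRead * f } : {}),
  };
}

// Explicit cost always wins; the discount only touches a table-filled price.
function resolveCost(entry: Entry, autoPrice: boolean): ProviderCost | undefined {
  assertDiscount(entry.discount);
  if (!autoPrice || entry.cost !== undefined) return entry.cost;
  const base = SAMPLE_PRICES[entry.modelId];
  if (!base) return undefined;
  return entry.discount !== undefined ? applyDiscount(base, entry.discount) : base;
}

// A 20% reseller discount on the bundled gemini-2.5-pro price.
console.log(resolveCost({ modelId: "gemini-2.5-pro", discount: 0.2 }, true));
// An explicit cost is returned as-is; the discount is ignored.
console.log(resolveCost({ modelId: "gemini-2.5-pro", cost: { input: 9, output: 9 }, discount: 0.2 }, true));
```

With a 0.2 discount the sketch yields roughly 1 / 8 / 0.1 USD per 1M tokens for input / output / cacheRead, up to floating-point rounding.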
package/dist/index.d.ts
CHANGED
|
@@ -324,6 +324,8 @@ interface HttpSinkOptions {
  */
 declare function createHttpSink(options: HttpSinkOptions): (record: CallRecord) => void;
 
+declare const MODEL_PRICES: Record<string, ProviderCost>;
+
 /**
  * ai-lcr media routing — Least Cost Routing for image & video models.
  *
@@ -858,7 +860,24 @@ type ProviderEntry = LanguageModelV3 | {
     cost?: ProviderCost;
     /** Label used in cost events / logs. Defaults to the model's provider id. */
     label?: string;
+    /**
+     * Fraction off the bundled list price (0–1) — the reseller-discount knob.
+     * Applied ONLY when `autoPrice` fills this entry from {@link MODEL_PRICES}
+     * (i.e. no explicit `cost`): a flat-discount aggregator like Kunavo (−20%)
+     * becomes `{ model: kunavo("gemini-2.5-pro"), discount: 0.2 }` with no
+     * hand-typed price. Scales input, output, and cacheRead alike. Ignored when
+     * `cost` is set, when `autoPrice` is off, or when no bundled price is found.
+     */
+    discount?: number;
 };
+/**
+ * Look up a model's bundled official list price by id. Tries the id as given,
+ * then with a leading `provider/` segment stripped (so `anthropic/claude-haiku-4-5`
+ * resolves the same as `claude-haiku-4-5`). Returns undefined for unknown models.
+ * The table ({@link MODEL_PRICES}) carries native-maker first-party rates only —
+ * see `scripts/gen-text-prices.mjs`.
+ */
+declare function getModelPrice(modelId: string): ProviderCost | undefined;
 interface LCRConfig {
     /**
      * Map of logical model name -> providers to try, cheapest-first.
@@ -867,6 +886,16 @@ interface LCRConfig {
     models: Record<string, ProviderEntry[]>;
     /** Sort each model's providers cheapest-first by `cost` before routing. */
     autoSort?: boolean;
+    /**
+     * Fill any provider entry that has no explicit `cost` from the bundled price
+     * table ({@link MODEL_PRICES}), looked up by the entry's `model.modelId`. A
+     * native-vendor route then needs zero hand-typed pricing; a flat-discount
+     * aggregator just adds `discount` (see {@link ProviderEntry}). Off by default —
+     * unpriced entries stay unpriced (the pre-existing behavior), so turning it on
+     * never silently re-prices a model you priced yourself (explicit `cost` always
+     * wins). Pairs naturally with `autoSort` and `onCost`/`onCall`.
+     */
+    autoPrice?: boolean;
     /** Idle window after which routing snaps back to the cheapest provider. Default 60s. */
     resetIntervalMs?: number;
     /** Called when a provider errors and routing falls through to the next. */
@@ -912,4 +941,4 @@ type LCRRouter = (modelName: string) => LanguageModelV3;
|
|
|
912
941
|
*/
|
|
913
942
|
declare function createLCR(config: LCRConfig): LCRRouter;
|
|
914
943
|
|
|
915
|
-
export { type BillableContext, type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type HttpSinkOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaJobHandle, type MediaJobStatus, type MediaLCR, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPollResult, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaStatusRequest, type MediaStatusResult, type MediaSubmitOptions, type MediaSubmitRequest, type MediaSubmitResult, type MediaUnit, type MediaUsage, OFFICIAL_PRICES, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, billableUnits, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createFalMediaAdapter, createHttpSink, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, durationFromInput, formatCallRecord, isAbortError, isNetworkError, isRetryableError, normalizedCents, priceCents, rankRoutes, referenceMegapixels, shouldFailover };
|
|
944
|
+
export { type BillableContext, type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type HttpSinkOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, MODEL_PRICES, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaJobHandle, type MediaJobStatus, type MediaLCR, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPollResult, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaStatusRequest, type MediaStatusResult, type MediaSubmitOptions, type MediaSubmitRequest, type MediaSubmitResult, type MediaUnit, type MediaUsage, OFFICIAL_PRICES, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, billableUnits, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createFalMediaAdapter, createHttpSink, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, durationFromInput, formatCallRecord, getModelPrice, isAbortError, isNetworkError, isRetryableError, normalizedCents, priceCents, rankRoutes, referenceMegapixels, shouldFailover };
|
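The `autoPrice` doc comment notes that it pairs with `autoSort`. A rough sketch of why: the sort key in the `index.cjs` diff is `input + output`, with unpriced entries keyed as infinity, so any entry left without a cost sorts last. The `Provider` type, the labels, and `sortCheapestFirst` below are hypothetical; the prices are rows from the bundled table.

```typescript
// Illustrative sketch only; Provider, the labels, and sortCheapestFirst are hypothetical.
type ProviderCost = { input: number; output: number; cacheRead?: number };
type Provider = { label: string; cost?: ProviderCost };

// Same key as the diff's priceKey: input + output, or +Infinity when unpriced.
function priceKey(p: Provider): number {
  return p.cost ? p.cost.input + p.cost.output : Number.POSITIVE_INFINITY;
}

// Sort a copy so the caller's array keeps its original order.
function sortCheapestFirst(providers: Provider[]): Provider[] {
  return [...providers].sort((a, b) => priceKey(a) - priceKey(b));
}

const providers: Provider[] = [
  { label: "mystery" }, // no cost, e.g. a model id missing from the table
  { label: "gpt-4o", cost: { input: 2.5, output: 10, cacheRead: 1.25 } },
  { label: "gpt-4o-mini", cost: { input: 0.15, output: 0.6, cacheRead: 0.075 } },
  { label: "gemini-2.5-pro", cost: { input: 1.25, output: 10, cacheRead: 0.125 } },
];

console.log(sortCheapestFirst(providers).map((p) => p.label));
```

Note that `cacheRead` plays no part in this key, so two routes that differ only in cache pricing compare as equal.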
package/dist/index.js
CHANGED
|
@@ -588,6 +588,239 @@ function createHttpSink(options) {
|
|
|
588
588
|
};
|
|
589
589
|
}
|
|
590
590
|
|
|
591
|
+
// src/text-prices.ts
|
|
592
|
+
var MODEL_PRICES = {
|
|
593
|
+
"chatgpt-4o-latest": { input: 5, output: 15 },
|
|
594
|
+
"claude-3-7-sonnet-20250219": { input: 3, output: 15, cacheRead: 0.3 },
|
|
595
|
+
"claude-3-haiku-20240307": { input: 0.25, output: 1.25, cacheRead: 0.03 },
|
|
596
|
+
"claude-3-opus-20240229": { input: 15, output: 75, cacheRead: 1.5 },
|
|
597
|
+
"claude-4-opus-20250514": { input: 15, output: 75, cacheRead: 1.5 },
|
|
598
|
+
"claude-4-sonnet-20250514": { input: 3, output: 15, cacheRead: 0.3 },
|
|
599
|
+
"claude-fable-5": { input: 10, output: 50, cacheRead: 1 },
|
|
600
|
+
"claude-haiku-4-5": { input: 1, output: 5, cacheRead: 0.1 },
|
|
601
|
+
"claude-haiku-4-5-20251001": { input: 1, output: 5, cacheRead: 0.1 },
|
|
602
|
+
"claude-opus-4-1": { input: 15, output: 75, cacheRead: 1.5 },
|
|
603
|
+
"claude-opus-4-1-20250805": { input: 15, output: 75, cacheRead: 1.5 },
|
|
604
|
+
"claude-opus-4-20250514": { input: 15, output: 75, cacheRead: 1.5 },
|
|
605
|
+
"claude-opus-4-5": { input: 5, output: 25, cacheRead: 0.5 },
|
|
606
|
+
"claude-opus-4-5-20251101": { input: 5, output: 25, cacheRead: 0.5 },
|
|
607
|
+
"claude-opus-4-6": { input: 5, output: 25, cacheRead: 0.5 },
|
|
608
|
+
"claude-opus-4-6-20260205": { input: 5, output: 25, cacheRead: 0.5 },
|
|
609
|
+
"claude-opus-4-7": { input: 5, output: 25, cacheRead: 0.5 },
|
|
610
|
+
"claude-opus-4-7-20260416": { input: 5, output: 25, cacheRead: 0.5 },
|
|
611
|
+
"claude-opus-4-8": { input: 5, output: 25, cacheRead: 0.5 },
|
|
612
|
+
"claude-sonnet-4-20250514": { input: 3, output: 15, cacheRead: 0.3 },
|
|
613
|
+
"claude-sonnet-4-5": { input: 3, output: 15, cacheRead: 0.3 },
|
|
614
|
+
"claude-sonnet-4-5-20250929": { input: 3, output: 15, cacheRead: 0.3 },
|
|
615
|
+
"claude-sonnet-4-6": { input: 3, output: 15, cacheRead: 0.3 },
|
|
616
|
+
"codestral-2405": { input: 1, output: 3 },
|
|
617
|
+
"codestral-2508": { input: 0.3, output: 0.9 },
|
|
618
|
+
"codestral-latest": { input: 1, output: 3 },
|
|
619
|
+
"codestral-mamba-latest": { input: 0.25, output: 0.25 },
|
|
620
|
+
"deepseek-chat": { input: 0.28, output: 0.42, cacheRead: 0.028 },
|
|
621
|
+
"deepseek-coder": { input: 0.14, output: 0.28 },
|
|
622
|
+
"deepseek-r1": { input: 0.55, output: 2.19 },
|
|
623
|
+
"deepseek-reasoner": { input: 0.28, output: 0.42, cacheRead: 0.028 },
|
|
624
|
+
"deepseek-v3": { input: 0.27, output: 1.1, cacheRead: 0.07 },
|
|
625
|
+
"deepseek-v3.2": { input: 0.28, output: 0.4 },
|
|
626
|
+
"devstral-2512": { input: 0.4, output: 2 },
|
|
627
|
+
"devstral-latest": { input: 0.4, output: 2 },
|
|
628
|
+
"devstral-medium-2507": { input: 0.4, output: 2 },
|
|
629
|
+
"devstral-medium-latest": { input: 0.4, output: 2 },
|
|
630
|
+
"devstral-small-2505": { input: 0.1, output: 0.3 },
|
|
631
|
+
"devstral-small-2507": { input: 0.1, output: 0.3 },
|
|
632
|
+
"devstral-small-latest": { input: 0.1, output: 0.3 },
|
|
633
|
+
"gemini-2.0-flash": { input: 0.1, output: 0.4, cacheRead: 0.025 },
|
|
634
|
+
"gemini-2.0-flash-001": { input: 0.1, output: 0.4, cacheRead: 0.025 },
|
|
635
|
+
"gemini-2.0-flash-lite": { input: 0.075, output: 0.3, cacheRead: 0.01875 },
|
|
636
|
+
"gemini-2.0-flash-lite-001": { input: 0.075, output: 0.3, cacheRead: 0.01875 },
|
|
637
|
+
"gemini-2.5-computer-use-preview-10-2025": { input: 1.25, output: 10 },
|
|
638
|
+
"gemini-2.5-flash": { input: 0.3, output: 2.5, cacheRead: 0.03 },
|
|
639
|
+
"gemini-2.5-flash-lite": { input: 0.1, output: 0.4, cacheRead: 0.01 },
|
|
640
|
+
"gemini-2.5-flash-lite-preview-06-17": { input: 0.1, output: 0.4, cacheRead: 0.025 },
|
|
641
|
+
"gemini-2.5-flash-lite-preview-09-2025": { input: 0.1, output: 0.4, cacheRead: 0.01 },
|
|
642
|
+
"gemini-2.5-flash-native-audio-latest": { input: 0.3, output: 2.5 },
|
|
643
|
+
"gemini-2.5-flash-native-audio-preview-09-2025": { input: 0.3, output: 2.5 },
|
|
644
|
+
"gemini-2.5-flash-native-audio-preview-12-2025": { input: 0.3, output: 2.5 },
|
|
645
|
+
"gemini-2.5-flash-preview-09-2025": { input: 0.3, output: 2.5, cacheRead: 0.075 },
|
|
646
|
+
"gemini-2.5-pro": { input: 1.25, output: 10, cacheRead: 0.125 },
|
|
647
|
+
"gemini-2.5-pro-preview-tts": { input: 1.25, output: 10, cacheRead: 0.125 },
|
|
648
|
+
"gemini-3-flash-preview": { input: 0.5, output: 3, cacheRead: 0.05 },
+  "gemini-3-pro-preview": { input: 2, output: 12, cacheRead: 0.2 },
+  "gemini-3.1-flash-lite": { input: 0.25, output: 1.5, cacheRead: 0.025 },
+  "gemini-3.1-flash-lite-preview": { input: 0.25, output: 1.5, cacheRead: 0.025 },
+  "gemini-3.1-flash-live-preview": { input: 0.75, output: 4.5 },
+  "gemini-3.1-pro-preview": { input: 2, output: 12, cacheRead: 0.2 },
+  "gemini-3.1-pro-preview-customtools": { input: 2, output: 12, cacheRead: 0.2 },
+  "gemini-3.5-flash": { input: 1.5, output: 9, cacheRead: 0.15 },
+  "gemini-exp-1206": { input: 0.3, output: 2.5, cacheRead: 0.03 },
+  "gemini-flash-latest": { input: 0.3, output: 2.5, cacheRead: 0.075 },
+  "gemini-flash-lite-latest": { input: 0.1, output: 0.4, cacheRead: 0.025 },
+  "gemini-gemma-2-27b-it": { input: 0.35, output: 1.05 },
+  "gemini-gemma-2-9b-it": { input: 0.35, output: 1.05 },
+  "gemini-pro-latest": { input: 1.25, output: 10, cacheRead: 0.125 },
+  "gemini-robotics-er-1.5-preview": { input: 0.3, output: 2.5 },
+  "gpt-3.5-turbo": { input: 0.5, output: 1.5 },
+  "gpt-3.5-turbo-0125": { input: 0.5, output: 1.5 },
+  "gpt-3.5-turbo-1106": { input: 1, output: 2 },
+  "gpt-3.5-turbo-16k": { input: 3, output: 4 },
+  "gpt-4": { input: 30, output: 60 },
+  "gpt-4-0125-preview": { input: 10, output: 30 },
+  "gpt-4-0314": { input: 30, output: 60 },
+  "gpt-4-0613": { input: 30, output: 60 },
+  "gpt-4-1106-preview": { input: 10, output: 30 },
+  "gpt-4-turbo": { input: 10, output: 30 },
+  "gpt-4-turbo-2024-04-09": { input: 10, output: 30 },
+  "gpt-4-turbo-preview": { input: 10, output: 30 },
+  "gpt-4.1": { input: 2, output: 8, cacheRead: 0.5 },
+  "gpt-4.1-2025-04-14": { input: 2, output: 8, cacheRead: 0.5 },
+  "gpt-4.1-mini": { input: 0.4, output: 1.6, cacheRead: 0.1 },
+  "gpt-4.1-mini-2025-04-14": { input: 0.4, output: 1.6, cacheRead: 0.1 },
+  "gpt-4.1-nano": { input: 0.1, output: 0.4, cacheRead: 0.025 },
+  "gpt-4.1-nano-2025-04-14": { input: 0.1, output: 0.4, cacheRead: 0.025 },
+  "gpt-4o": { input: 2.5, output: 10, cacheRead: 1.25 },
+  "gpt-4o-2024-05-13": { input: 5, output: 15 },
+  "gpt-4o-2024-08-06": { input: 2.5, output: 10, cacheRead: 1.25 },
+  "gpt-4o-2024-11-20": { input: 2.5, output: 10, cacheRead: 1.25 },
+  "gpt-4o-audio-preview": { input: 2.5, output: 10 },
+  "gpt-4o-audio-preview-2024-12-17": { input: 2.5, output: 10 },
+  "gpt-4o-audio-preview-2025-06-03": { input: 2.5, output: 10 },
+  "gpt-4o-mini": { input: 0.15, output: 0.6, cacheRead: 0.075 },
+  "gpt-4o-mini-2024-07-18": { input: 0.15, output: 0.6, cacheRead: 0.075 },
+  "gpt-4o-mini-audio-preview": { input: 0.15, output: 0.6 },
+  "gpt-4o-mini-audio-preview-2024-12-17": { input: 0.15, output: 0.6 },
+  "gpt-4o-mini-realtime-preview": { input: 0.6, output: 2.4, cacheRead: 0.3 },
+  "gpt-4o-mini-realtime-preview-2024-12-17": { input: 0.6, output: 2.4, cacheRead: 0.3 },
+  "gpt-4o-mini-search-preview": { input: 0.15, output: 0.6, cacheRead: 0.075 },
+  "gpt-4o-mini-search-preview-2025-03-11": { input: 0.15, output: 0.6, cacheRead: 0.075 },
+  "gpt-4o-realtime-preview": { input: 5, output: 20, cacheRead: 2.5 },
+  "gpt-4o-realtime-preview-2024-12-17": { input: 5, output: 20, cacheRead: 2.5 },
+  "gpt-4o-realtime-preview-2025-06-03": { input: 5, output: 20, cacheRead: 2.5 },
+  "gpt-4o-search-preview": { input: 2.5, output: 10, cacheRead: 1.25 },
+  "gpt-4o-search-preview-2025-03-11": { input: 2.5, output: 10, cacheRead: 1.25 },
+  "gpt-5": { input: 1.25, output: 10, cacheRead: 0.125 },
+  "gpt-5-2025-08-07": { input: 1.25, output: 10, cacheRead: 0.125 },
+  "gpt-5-chat": { input: 1.25, output: 10, cacheRead: 0.125 },
+  "gpt-5-chat-latest": { input: 1.25, output: 10, cacheRead: 0.125 },
+  "gpt-5-mini": { input: 0.25, output: 2, cacheRead: 0.025 },
+  "gpt-5-mini-2025-08-07": { input: 0.25, output: 2, cacheRead: 0.025 },
+  "gpt-5-nano": { input: 0.05, output: 0.4, cacheRead: 5e-3 },
+  "gpt-5-nano-2025-08-07": { input: 0.05, output: 0.4, cacheRead: 5e-3 },
+  "gpt-5-search-api": { input: 1.25, output: 10, cacheRead: 0.125 },
+  "gpt-5-search-api-2025-10-14": { input: 1.25, output: 10, cacheRead: 0.125 },
+  "gpt-5.1": { input: 1.25, output: 10, cacheRead: 0.125 },
+  "gpt-5.1-2025-11-13": { input: 1.25, output: 10, cacheRead: 0.125 },
+  "gpt-5.1-chat-latest": { input: 1.25, output: 10, cacheRead: 0.125 },
+  "gpt-5.2": { input: 1.75, output: 14, cacheRead: 0.175 },
+  "gpt-5.2-2025-12-11": { input: 1.75, output: 14, cacheRead: 0.175 },
+  "gpt-5.2-chat-latest": { input: 1.75, output: 14, cacheRead: 0.175 },
+  "gpt-5.3-chat-latest": { input: 1.75, output: 14, cacheRead: 0.175 },
+  "gpt-5.4": { input: 2.5, output: 15, cacheRead: 0.25 },
+  "gpt-5.4-2026-03-05": { input: 2.5, output: 15, cacheRead: 0.25 },
+  "gpt-5.4-mini": { input: 0.75, output: 4.5, cacheRead: 0.075 },
+  "gpt-5.4-mini-2026-03-17": { input: 0.75, output: 4.5, cacheRead: 0.075 },
+  "gpt-5.4-nano": { input: 0.2, output: 1.25, cacheRead: 0.02 },
+  "gpt-5.4-nano-2026-03-17": { input: 0.2, output: 1.25, cacheRead: 0.02 },
+  "gpt-5.5": { input: 5, output: 30, cacheRead: 0.5 },
+  "gpt-5.5-2026-04-23": { input: 5, output: 30, cacheRead: 0.5 },
+  "gpt-audio": { input: 2.5, output: 10 },
+  "gpt-audio-1.5": { input: 2.5, output: 10 },
+  "gpt-audio-2025-08-28": { input: 2.5, output: 10 },
+  "gpt-audio-mini": { input: 0.6, output: 2.4 },
+  "gpt-audio-mini-2025-10-06": { input: 0.6, output: 2.4 },
+  "gpt-audio-mini-2025-12-15": { input: 0.6, output: 2.4 },
+  "gpt-realtime": { input: 4, output: 16, cacheRead: 0.4 },
+  "gpt-realtime-1.5": { input: 4, output: 16, cacheRead: 0.4 },
+  "gpt-realtime-2": { input: 4, output: 16, cacheRead: 0.4 },
+  "gpt-realtime-2025-08-28": { input: 4, output: 16, cacheRead: 0.4 },
+  "gpt-realtime-mini": { input: 0.6, output: 2.4 },
+  "gpt-realtime-mini-2025-10-06": { input: 0.6, output: 2.4, cacheRead: 0.06 },
+  "gpt-realtime-mini-2025-12-15": { input: 0.6, output: 2.4, cacheRead: 0.06 },
+  "grok-2": { input: 2, output: 10 },
+  "grok-2-1212": { input: 2, output: 10 },
+  "grok-2-latest": { input: 2, output: 10 },
+  "grok-2-vision": { input: 2, output: 10 },
+  "grok-2-vision-1212": { input: 2, output: 10 },
+  "grok-2-vision-latest": { input: 2, output: 10 },
+  "grok-3": { input: 3, output: 15, cacheRead: 0.75 },
+  "grok-3-beta": { input: 3, output: 15, cacheRead: 0.75 },
+  "grok-3-fast-beta": { input: 5, output: 25, cacheRead: 1.25 },
+  "grok-3-fast-latest": { input: 5, output: 25, cacheRead: 1.25 },
+  "grok-3-latest": { input: 3, output: 15, cacheRead: 0.75 },
+  "grok-3-mini": { input: 0.3, output: 0.5, cacheRead: 0.075 },
+  "grok-3-mini-beta": { input: 0.3, output: 0.5, cacheRead: 0.075 },
+  "grok-3-mini-fast": { input: 0.6, output: 4, cacheRead: 0.15 },
+  "grok-3-mini-fast-beta": { input: 0.6, output: 4, cacheRead: 0.15 },
+  "grok-3-mini-fast-latest": { input: 0.6, output: 4, cacheRead: 0.15 },
+  "grok-3-mini-latest": { input: 0.3, output: 0.5, cacheRead: 0.075 },
+  "grok-4": { input: 3, output: 15 },
+  "grok-4-0709": { input: 3, output: 15 },
+  "grok-4-1-fast": { input: 0.2, output: 0.5, cacheRead: 0.05 },
+  "grok-4-1-fast-non-reasoning": { input: 0.2, output: 0.5, cacheRead: 0.05 },
+  "grok-4-1-fast-non-reasoning-latest": { input: 0.2, output: 0.5, cacheRead: 0.05 },
+  "grok-4-1-fast-reasoning": { input: 0.2, output: 0.5, cacheRead: 0.05 },
+  "grok-4-1-fast-reasoning-latest": { input: 0.2, output: 0.5, cacheRead: 0.05 },
+  "grok-4-fast-non-reasoning": { input: 0.2, output: 0.5, cacheRead: 0.05 },
+  "grok-4-fast-reasoning": { input: 0.2, output: 0.5, cacheRead: 0.05 },
+  "grok-4-latest": { input: 3, output: 15 },
+  "grok-4.20-0309-reasoning": { input: 2, output: 6, cacheRead: 0.2 },
+  "grok-4.20-beta-0309-non-reasoning": { input: 2, output: 6, cacheRead: 0.2 },
+  "grok-4.20-beta-0309-reasoning": { input: 2, output: 6, cacheRead: 0.2 },
+  "grok-4.20-multi-agent-beta-0309": { input: 2, output: 6, cacheRead: 0.2 },
+  "grok-4.3": { input: 1.25, output: 2.5, cacheRead: 0.2 },
+  "grok-4.3-latest": { input: 1.25, output: 2.5, cacheRead: 0.2 },
+  "grok-beta": { input: 5, output: 15 },
+  "grok-code-fast": { input: 0.2, output: 1.5, cacheRead: 0.02 },
+  "grok-code-fast-1": { input: 0.2, output: 1.5, cacheRead: 0.02 },
+  "grok-code-fast-1-0825": { input: 0.2, output: 1.5, cacheRead: 0.02 },
+  "grok-vision-beta": { input: 5, output: 15 },
+  "labs-devstral-small-2512": { input: 0.1, output: 0.3 },
+  "magistral-medium-1-2-2509": { input: 2, output: 5 },
+  "magistral-medium-2506": { input: 2, output: 5 },
+  "magistral-medium-2509": { input: 2, output: 5 },
+  "magistral-medium-latest": { input: 2, output: 5 },
+  "magistral-small-1-2-2509": { input: 0.5, output: 1.5 },
+  "magistral-small-2506": { input: 0.5, output: 1.5 },
+  "magistral-small-latest": { input: 0.5, output: 1.5 },
+  "ministral-3-14b-2512": { input: 0.2, output: 0.2 },
+  "ministral-3-3b-2512": { input: 0.1, output: 0.1 },
+  "ministral-3-8b-2512": { input: 0.15, output: 0.15 },
+  "ministral-8b-2512": { input: 0.15, output: 0.15 },
+  "ministral-8b-latest": { input: 0.15, output: 0.15 },
+  "mistral-large-2402": { input: 4, output: 12 },
+  "mistral-large-2407": { input: 3, output: 9 },
+  "mistral-large-2411": { input: 2, output: 6 },
+  "mistral-large-2512": { input: 0.5, output: 1.5 },
+  "mistral-large-3": { input: 0.5, output: 1.5 },
+  "mistral-large-latest": { input: 0.5, output: 1.5 },
+  "mistral-medium": { input: 2.7, output: 8.1 },
+  "mistral-medium-2312": { input: 2.7, output: 8.1 },
+  "mistral-medium-2505": { input: 0.4, output: 2 },
+  "mistral-medium-3-1-2508": { input: 0.4, output: 2 },
+  "mistral-medium-latest": { input: 0.4, output: 2 },
+  "mistral-small": { input: 0.1, output: 0.3 },
+  "mistral-small-3-2-2506": { input: 0.06, output: 0.18 },
+  "mistral-small-latest": { input: 0.06, output: 0.18 },
+  "mistral-tiny": { input: 0.25, output: 0.25 },
+  "o1": { input: 15, output: 60, cacheRead: 7.5 },
+  "o1-2024-12-17": { input: 15, output: 60, cacheRead: 7.5 },
+  "o3": { input: 2, output: 8, cacheRead: 0.5 },
+  "o3-2025-04-16": { input: 2, output: 8, cacheRead: 0.5 },
+  "o3-mini": { input: 1.1, output: 4.4, cacheRead: 0.55 },
+  "o3-mini-2025-01-31": { input: 1.1, output: 4.4, cacheRead: 0.55 },
+  "o4-mini": { input: 1.1, output: 4.4, cacheRead: 0.275 },
+  "o4-mini-2025-04-16": { input: 1.1, output: 4.4, cacheRead: 0.275 },
+  "open-codestral-mamba": { input: 0.25, output: 0.25 },
+  "open-mistral-7b": { input: 0.25, output: 0.25 },
+  "open-mistral-nemo": { input: 0.3, output: 0.3 },
+  "open-mistral-nemo-2407": { input: 0.3, output: 0.3 },
+  "open-mixtral-8x22b": { input: 2, output: 6 },
+  "open-mixtral-8x7b": { input: 0.7, output: 0.7 },
+  "pixtral-12b-2409": { input: 0.15, output: 0.15 },
+  "pixtral-large-2411": { input: 2, output: 6 },
+  "pixtral-large-latest": { input: 2, output: 6 }
+};
+
 // src/media-official.ts
 var OFFICIAL_PRICES = {
   "alibaba/qwen-image": { unit: "image", cents: 3.5 },
@@ -1600,6 +1833,17 @@ async function safeText2(res) {
 }

 // src/index.ts
+function getModelPrice(modelId) {
+  if (!modelId) return void 0;
+  const direct = MODEL_PRICES[modelId];
+  if (direct) return direct;
+  const slash = modelId.indexOf("/");
+  if (slash !== -1) {
+    const bare = MODEL_PRICES[modelId.slice(slash + 1)];
+    if (bare) return bare;
+  }
+  return void 0;
+}
 function isLanguageModel(entry) {
   return typeof entry.doGenerate === "function";
 }
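The `getModelPrice` lookup added in this hunk tries the exact id first and, failing that, retries with everything up to the first `/` removed. The sketch below restates that order as standalone code so it can be run without the package. `PRICES` is a two-entry excerpt copied from the table in this diff, and `lookupPrice` and `estimateCostUsd` are hypothetical names that are not part of ai-lcr. The cost helper assumes the table's "USD per 1M tokens" unit and ignores cache reads.

```javascript
// Excerpt of the bundled table, copied from entries in this diff (USD per 1M tokens).
const PRICES = {
  "gpt-4o-mini": { input: 0.15, output: 0.6, cacheRead: 0.075 },
  "grok-3-mini": { input: 0.3, output: 0.5, cacheRead: 0.075 },
};

// Same lookup order as getModelPrice: exact id first, then the id with
// everything up to and including the first "/" removed.
function lookupPrice(modelId) {
  if (!modelId) return undefined;
  const direct = PRICES[modelId];
  if (direct) return direct;
  const slash = modelId.indexOf("/");
  if (slash !== -1) {
    const bare = PRICES[modelId.slice(slash + 1)];
    if (bare) return bare;
  }
  return undefined;
}

// Hypothetical helper (not part of ai-lcr): USD cost of one call, ignoring cache reads.
function estimateCostUsd(price, inputTokens, outputTokens) {
  return (price.input * inputTokens + price.output * outputTokens) / 1_000_000;
}

const price = lookupPrice("openai/gpt-4o-mini");
console.log(price, estimateCostUsd(price, 12_000, 800));

// Only the first segment is stripped, so a doubly prefixed id does not resolve.
console.log(lookupPrice("a/b/gpt-4o-mini"));
```

Because only one leading segment is removed, an id such as `a/b/gpt-4o-mini` falls through to `undefined`; under `autoPrice` that entry would presumably stay unpriced unless it carries an explicit `cost`.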
@@ -1610,9 +1854,25 @@ function normalize(entry) {
   return {
     model: entry.model,
     label: entry.label ?? entry.model.provider,
-    cost: entry.cost
+    cost: entry.cost,
+    discount: entry.discount
   };
 }
+function applyDiscount(cost, discount) {
+  const f = 1 - discount;
+  return {
+    input: cost.input * f,
+    output: cost.output * f,
+    ...cost.cacheRead !== void 0 ? { cacheRead: cost.cacheRead * f } : {}
+  };
+}
+function withAutoPrice(p, autoPrice) {
+  const { discount, ...rest } = p;
+  if (!autoPrice || rest.cost !== void 0) return rest;
+  const base = getModelPrice(rest.model.modelId);
+  if (!base) return rest;
+  return { ...rest, cost: discount !== void 0 ? applyDiscount(base, discount) : base };
+}
 function priceKey(p) {
   return p.cost ? p.cost.input + p.cost.output : Number.POSITIVE_INFINITY;
 }
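As written in this hunk, `withAutoPrice` applies `discount` only when the cost comes from the bundled table: an entry with an explicit `cost` is returned unchanged (minus the `discount` field), and an unknown model id stays unpriced. The standalone sketch below mirrors the two functions to make those three cases visible. The one-entry `PRICES` object stands in for `MODEL_PRICES` and the provider objects are hypothetical; real entries wrap an AI SDK model.

```javascript
// Stand-in for the bundled table: one entry copied from this diff (USD per 1M tokens).
const PRICES = { "gpt-4o": { input: 2.5, output: 10, cacheRead: 1.25 } };

// Scale every price field by (1 - discount); cacheRead only if it is present.
function applyDiscount(cost, discount) {
  const f = 1 - discount;
  return {
    input: cost.input * f,
    output: cost.output * f,
    ...(cost.cacheRead !== undefined ? { cacheRead: cost.cacheRead * f } : {}),
  };
}

// Mirrors withAutoPrice from the hunk above, with a plain table lookup
// standing in for getModelPrice.
function withAutoPrice(p, autoPrice) {
  const { discount, ...rest } = p;
  if (!autoPrice || rest.cost !== undefined) return rest;
  const base = PRICES[rest.model.modelId];
  if (!base) return rest;
  return { ...rest, cost: discount !== undefined ? applyDiscount(base, discount) : base };
}

// Table price with a 20% discount applied.
const discounted = withAutoPrice({ model: { modelId: "gpt-4o" }, discount: 0.2 }, true);

// Explicit cost wins and is NOT discounted.
const explicit = withAutoPrice(
  { model: { modelId: "gpt-4o" }, cost: { input: 1, output: 2 }, discount: 0.2 },
  true,
);

// Unknown id: no cost is attached.
const unknown = withAutoPrice({ model: { modelId: "not-in-table" } }, true);

console.log(discounted.cost, explicit.cost, unknown.cost);
```

One consequence worth noting when reading the diff: with `autoPrice` off, `discount` is dropped without effect, since the early return happens before any table lookup.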
@@ -1624,6 +1884,7 @@ function createLCR(config) {
   const {
     models,
     autoSort = false,
+    autoPrice = false,
     resetIntervalMs,
     onError,
     onCost,
@@ -1638,7 +1899,13 @@ function createLCR(config) {
   }
   const routed = /* @__PURE__ */ new Map();
   for (const [name, entries] of Object.entries(models)) {
-
+    for (const entry of entries) {
+      const d = entry.discount;
+      if (d !== void 0 && (d < 0 || d >= 1)) {
+        throw new Error(`ai-lcr: discount must be in [0, 1) for model "${name}", got ${d}`);
+      }
+    }
+    let providers = entries.map(normalize).map((p) => withAutoPrice(p, autoPrice)).map((p) => withDefaultCacheRead(p, defaultCacheReadRatio));
     if (autoSort) {
       providers = [...providers].sort((a, b) => priceKey(a) - priceKey(b));
     }
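The two behaviours visible in this hunk, rejecting a `discount` outside `[0, 1)` and sorting providers by `input + output` when `autoSort` is on, can be tried in isolation. The sketch below is a minimal restatement, assuming plain objects in place of real provider entries. `assertValidDiscount` is a hypothetical name for the inline check, and the labels are made up.

```javascript
// Hypothetical wrapper around the inline check in createLCR:
// undefined is allowed, otherwise the value must satisfy 0 <= d < 1.
function assertValidDiscount(name, d) {
  if (d !== undefined && (d < 0 || d >= 1)) {
    throw new Error(`discount must be in [0, 1) for model "${name}", got ${d}`);
  }
}

// Same sort key as priceKey in the diff: unpriced providers sort last.
function priceKey(p) {
  return p.cost ? p.cost.input + p.cost.output : Number.POSITIVE_INFINITY;
}

// Made-up provider entries; costs are USD per 1M tokens.
const providers = [
  { label: "pricey", cost: { input: 2.5, output: 10 } },
  { label: "unpriced" },
  { label: "cheap", cost: { input: 0.15, output: 0.6 } },
];

// Copy before sorting so the caller's array order is left alone.
const sorted = [...providers].sort((a, b) => priceKey(a) - priceKey(b));
console.log(sorted.map((p) => p.label));

assertValidDiscount("fast", 0.2); // accepted
try {
  assertValidDiscount("fast", 1); // rejected: a 100% discount is out of range
} catch (err) {
  console.log(err.message);
}
```

The sort key adds input and output prices with equal weight, so it is a rough ordering rather than a per-request cost estimate; workloads dominated by input or output tokens may rank providers differently.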
@@ -1660,6 +1927,7 @@ function createLCR(config) {
 export {
   DEFAULT_REFERENCE,
   MEDIA_PRICING,
+  MODEL_PRICES,
   OFFICIAL_PRICES,
   billableUnits,
   cheapestRoute,
@@ -1674,6 +1942,7 @@ export {
   createRunwareMediaAdapter,
   durationFromInput,
   formatCallRecord,
+  getModelPrice,
   isAbortError,
   isNetworkError,
   isRetryableError,
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "ai-lcr",
-  "version": "0.6.
+  "version": "0.6.1",
   "description": "Least Cost Routing for LLMs — route every model call to the cheapest available provider, fall back automatically, and track real cost. Built for the Vercel AI SDK.",
   "keywords": [
     "ai",