ai-lcr 0.6.0 → 0.6.1

package/CHANGELOG.md CHANGED
@@ -4,6 +4,37 @@ All notable changes to `ai-lcr` are documented here. The format follows
4
4
  [Keep a Changelog](https://keepachangelog.com/), and the project adheres to
5
5
  [Semantic Versioning](https://semver.org/).
6
6
 
7
+ ## [0.6.1] — 2026-06-11
8
+
9
+ Zero-config pricing for native-maker routes. Until now every priced provider
10
+ needed a hand-typed `cost: { input, output }`; for a vendor's own API that number
11
+ is just the public list price you could look up. 0.6.1 bundles those.
12
+
13
+ ### Added
14
+
15
+ - **Bundled price table (`MODEL_PRICES`).** Official first-party token prices for
16
+ the native makers ai-lcr documents (openai · anthropic · gemini · deepseek ·
17
+ xai · mistral), keyed by the bare model id you pass to that vendor's AI SDK
18
+ provider — USD per 1M tokens, with `cacheRead` where the maker prices it.
19
+ Generated from [LiteLLM's price map](https://github.com/BerriAI/litellm) (MIT)
20
+ via `scripts/gen-text-prices.mjs`; the generated file is committed.
21
+ - **`getModelPrice(modelId)`.** Look up a bundled price directly; resolves a bare
22
+ id or one with a leading `provider/` segment stripped.
23
+ - **`createLCR({ autoPrice: true })`.** Fills any provider entry that has no
24
+ explicit `cost` from the table, by `model.modelId`. A native-vendor route then
25
+ needs zero hand-typed pricing and `autoSort` can order it.
26
+ - **`discount` on a provider entry.** The flat-reseller knob: `{ model:
27
+ kunavo("…"), discount: 0.2 }` prices a −20% aggregator off the bundled list
28
+ price (scaling input/output/cacheRead) with no hand-typed number. Applies only
29
+ when `autoPrice` fills the entry; out-of-range values throw.
30
+
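The lookup rule described for `getModelPrice` (a bare id, or an id with a leading `provider/` segment stripped) can be sketched as follows. This is a minimal illustration, not the package's implementation: `lookupPrice`, `TokenPrice`, and the two-row `PRICES` map are hypothetical names, and the two rows are copied from the bundled table shown in this diff.

```ts
// Hypothetical stand-in for the bundled table: USD per 1M tokens.
type TokenPrice = { input: number; output: number; cacheRead?: number };

const PRICES = new Map<string, TokenPrice>([
  ["claude-sonnet-4-6", { input: 3, output: 15, cacheRead: 0.3 }],
  ["gpt-4o-mini", { input: 0.15, output: 0.6, cacheRead: 0.075 }],
]);

// Assumed behavior: try the id as given, then retry with everything up to
// and including the first "/" removed.
function lookupPrice(modelId: string): TokenPrice | undefined {
  const exact = PRICES.get(modelId);
  if (exact) return exact;
  const slash = modelId.indexOf("/");
  return slash === -1 ? undefined : PRICES.get(modelId.slice(slash + 1));
}
```

Under these assumptions, `lookupPrice("anthropic/claude-sonnet-4-6")` resolves to the same row as the bare id, and an unknown id yields `undefined` rather than a guessed price.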
31
+ ### Compatibility
32
+
33
+ - Fully backward compatible. `autoPrice` is **off by default** — unpriced entries
34
+ stay unpriced and an explicit `cost` always wins, so no existing config changes
35
+ behavior. The table covers native makers only; open-weights hosts (DeepInfra)
36
+ and breadth aggregators (OpenRouter) are still priced explicitly.
37
+
7
38
  ## [0.6.0] — 2026-06-10
8
39
 
9
40
  Media billing contract v2: **rank by the reference, bill by actual usage.**
package/README.md CHANGED
@@ -138,6 +138,33 @@ const lcr = createLCR({
138
138
 
139
139
  DeepInfra carries open weights only — no first-party Claude / GPT / Gemini. For those closed models, route through OpenRouter or a discount gateway instead.
140
140
 
141
+ ## Zero-config pricing (`autoPrice`)
142
+
143
+ Typing `cost: { input, output }` for every provider is the tedious part. `autoPrice: true` fills any entry that has no explicit `cost` from a **bundled price table** (`MODEL_PRICES`) — official first-party rates for the native makers (OpenAI, Anthropic, Google, DeepSeek, xAI, Mistral), keyed by the bare model id you already pass to the provider:
144
+
145
+ ```ts
146
+ const lcr = createLCR({
147
+ autoPrice: true, // fill missing costs from the bundled table
148
+ autoSort: true, // then order cheapest-first using those prices
149
+ models: {
150
+ "claude-sonnet": [
151
+ // Native API — price comes from the table, nothing to type.
152
+ { model: anthropic("claude-sonnet-4-6"), label: "anthropic" },
153
+ // Flat-discount aggregator — `discount` applies on top of the list price.
154
+ { model: kunavo("claude-sonnet-4-6"), label: "kunavo", discount: 0.2 }, // 20% off list
155
+ ],
156
+ },
157
+ });
158
+ ```
159
+
160
+ Three rules keep it predictable:
161
+
162
+ - **Off by default.** With `autoPrice` off, unpriced entries stay unpriced (the pre-existing behavior). With it on, only entries that lack a `cost` are filled, so an already-priced model is never silently re-priced and an **explicit `cost` always wins** over the table.
163
+ - **`discount` is the reseller knob.** A flat-% aggregator (Kunavo −20%) becomes `discount: 0.2` instead of a hand-typed number; it scales input, output, and `cacheRead` alike, and only applies when the table fills the entry. Variable-discount providers (TokenMart) still want explicit per-model `cost`.
164
+ - **Native makers only.** The table carries first-party list prices — the cheapest, most-featureful "go direct" route. Open-weights hosts (DeepInfra) and breadth aggregators (OpenRouter) aren't in it; price those explicitly.
165
+
166
+ Look a price up yourself with `getModelPrice("claude-sonnet-4-6")`. The table is generated from [LiteLLM's price map](https://github.com/BerriAI/litellm) (MIT) — refresh with `node scripts/gen-text-prices.mjs`.
167
+
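The three rules above can be read as one small resolution function. The sketch below is an illustration under stated assumptions, not ai-lcr's code: `resolveCost`, `Entry`, and `table` are hypothetical names, and the accepted `discount` range (0 up to but not including 1) is assumed, since the source only says out-of-range values throw.

```ts
type Price = { input: number; output: number; cacheRead?: number };
type Entry = { modelId: string; cost?: Price; discount?: number };

// Hypothetical one-row table, copied from the bundled prices in this diff.
const table = new Map<string, Price>([
  ["claude-sonnet-4-6", { input: 3, output: 15, cacheRead: 0.3 }],
]);

function resolveCost(entry: Entry, autoPrice: boolean, prices: Map<string, Price>): Price | undefined {
  if (entry.cost) return entry.cost;      // an explicit cost always wins
  if (!autoPrice) return undefined;       // off by default: entry stays unpriced
  const list = prices.get(entry.modelId);
  if (!list) return undefined;            // not a native-maker model: price it explicitly
  const discount = entry.discount ?? 0;
  if (!(discount >= 0 && discount < 1)) { // assumed valid range
    throw new RangeError(`discount out of range: ${discount}`);
  }
  const factor = 1 - discount;            // 0.2 means 20% off list
  const scaled: Price = { input: list.input * factor, output: list.output * factor };
  if (list.cacheRead !== undefined) scaled.cacheRead = list.cacheRead * factor;
  return scaled;
}
```

With `discount: 0.2`, the Sonnet row scales to roughly 2.4 / 12 / 0.24 USD per 1M tokens. The discount is ignored whenever an explicit `cost` is present, matching the "only applies when the table fills the entry" rule.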
141
168
  ## How it routes
142
169
 
143
170
  1. **Cheapest first.** Providers are tried in order — list them cheapest-first, or set `autoSort: true` to order them by `cost` automatically.
@@ -185,13 +212,28 @@ interface CallRecord {
185
212
  outputTokens: number;
186
213
  cachedInputTokens?: number; // prompt-cache hits the winner read (when reported)
187
214
  costUsd: number; // winner cost, cache-discount applied (see `cacheRead`)
188
- baselineUsd?: number; // same usage on the priciest priced leg → savings = baselineUsd − costUsd
215
+ baselineUsd?: number; // what the savings baseline would have charged for the SAME usage → savings = baselineUsd − costUsd
216
+ baselineKind?: "last-leg" | "official" | "priciest-route"; // how that baseline was derived (see below)
217
+ cachedSavingUsd?: number; // the provider's own prompt-cache discount — real money, but NOT a routing saving; never fold it into baselineUsd − costUsd
189
218
  requestId?: string; // your correlation id (see below) — roll multi-step tool loops into one request
190
219
  usageMissing?: boolean; // winner served but reported 0/0 tokens → costUsd is 0 but unknown, not free
220
+ emptyCompletion?: boolean; // clean response that generated NOTHING — prompt billed, zero output
221
+
222
+ // Media calls (createMediaLCR) additionally carry:
223
+ modality?: "image" | "video";
224
+ usage?: { seconds?: number; outputs?: number; megapixels?: number }; // the actual usage the bill was based on
225
+ officialUsd?: number; // the model maker's first-party price for this call's usage
226
+ estCostUsd?: number; // what the configured price table PREDICTED — on provider-reported rows, costUsd − estCostUsd is price-table drift
191
227
  }
192
228
  ```
193
229
 
194
- **Savings, not just spend.** Whenever at least one provider in a chain carries a `cost`, `baselineUsd` is what the same call would have cost on the most expensive priced leg (typically your safety-net fallback). `baselineUsd − costUsd` is the money routing saved on that call: the number a cost dashboard exists to show.
230
+ **Savings, not just spend.** `baselineUsd` is what the same call would have cost without routing, and `baselineKind` says exactly what that means so a dashboard can qualify the number instead of trusting it blindly:
231
+
232
+ - **`"last-leg"`** (text): the **last priced provider** in the chain — your always-on, list-price fallback. Deliberately *not* the most expensive leg: prompt caching can make a sticker-cheaper provider cost more on a cache-heavy call, and a max-of-chain baseline would fabricate "savings" on calls the fallback itself served.
233
+ - **`"official"`** (media): the model maker's **first-party API price** for the same actual usage — an 8-second clip is baselined at 8 seconds of the official rate, not a reference length.
234
+ - **`"priciest-route"`** (media, no official price known): the most expensive route you configured. Honest about cross-provider spread, but self-referential — not a market price.
235
+
236
+ `baselineUsd − costUsd` is the money routing saved on that call — the number a cost dashboard exists to show.
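As a rough sketch of how a consumer might total these savings while keeping the three baseline kinds separate: the helper below uses only fields named in `CallRecord`, but `savingsByKind` and `SavingsRow` are hypothetical names, and two choices are assumptions rather than documented behavior: clamping each call's saving at zero, and skipping rows flagged `usageMissing`.

```ts
type BaselineKind = "last-leg" | "official" | "priciest-route";

// Subset of CallRecord fields needed for the calculation.
interface SavingsRow {
  costUsd: number;
  baselineUsd?: number;
  baselineKind?: BaselineKind;
  usageMissing?: boolean;
}

function savingsByKind(rows: SavingsRow[]): Record<BaselineKind, number> {
  const totals: Record<BaselineKind, number> = { "last-leg": 0, official: 0, "priciest-route": 0 };
  for (const row of rows) {
    // No baseline means no savings claim; unknown usage means costUsd is not trustworthy.
    if (row.baselineUsd === undefined || row.baselineKind === undefined || row.usageMissing) continue;
    // Assumed per-call clamp: a row that cost more than its baseline contributes 0, not a negative.
    totals[row.baselineKind] += Math.max(0, row.baselineUsd - row.costUsd);
  }
  return totals;
}
```

`cachedSavingUsd` is deliberately left out of the sum, in line with the field's comment that it is not a routing saving.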
195
237
 
196
238
  **Responsiveness, not just total time.** On streaming calls (`streamText`, `streamObject`, streaming agents), `ttftMs` is the **time to first token** — measured from the winning provider's attempt start to its first content delta. It's the metric most LLM dashboards lead with, because it's what a user feels as "how fast did it start replying". Total `latencyMs` covers the whole stream including any failover; `ttftMs` isolates the serving model's responsiveness. It's `undefined` for `generateText`/`generateObject` (no streaming → no "first" token) and for calls that failed before any content. Output throughput (tokens/sec) is then `outputTokens / ((latencyMs − ttftMs) / 1000)`.
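The throughput formula at the end of that paragraph, written out as a small helper. The function and interface names are hypothetical; the fields are the ones `CallRecord` defines, and returning `undefined` for a non-positive generation window is an added guard, not something the source specifies.

```ts
interface StreamTiming {
  outputTokens: number;
  latencyMs: number;
  ttftMs?: number; // undefined for non-streaming calls
}

// outputTokens / ((latencyMs − ttftMs) / 1000), guarded against missing or degenerate timings.
function outputTokensPerSecond(timing: StreamTiming): number | undefined {
  if (timing.ttftMs === undefined) return undefined;
  const generationMs = timing.latencyMs - timing.ttftMs;
  if (generationMs <= 0) return undefined;
  return timing.outputTokens / (generationMs / 1000);
}
```

For example, 300 output tokens with `latencyMs: 2500` and `ttftMs: 500` gives 300 / 2 = 150 tokens per second.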
197
239
 
@@ -226,13 +268,28 @@ const lcr = createLCR({
226
268
  });
227
269
  ```
228
270
 
271
+ ### The companion dashboard ([`ai-lcr-dashboard`](https://github.com/ai-lcr/ai-lcr-dashboard))
272
+
273
+ <p align="center">
274
+ <img src="assets/dashboard-demo.png" alt="ai-lcr-dashboard (demo data): saved vs spent over time, a price-drift alert, per-project failover health, and per-provider reliability" width="780">
275
+ </p>
276
+
277
+ A **self-hostable** Next.js + Postgres collector built for exactly these records — point `createHttpSink` at its `/api/ingest` and you get, across every project you tag:
278
+
279
+ - **saved vs. spent** over time, with the savings qualified by `baselineKind` and clamped per call (one mispriced row can't eat the rest);
280
+ - **failover health** per provider — who actually failed, who caught it, what leaked to users;
281
+ - **media economics** — image/video calls split out with per-unit cost ($/second of video, $/image);
282
+ - a **price-drift panel** — when a provider's reported bill disagrees with your configured price table by >±20%, it surfaces the route (a ~100× ratio is the classic USD-vs-cents slip). Cheapest-first routing is only as good as its price table; this is the smoke alarm.
283
+
284
+ One-click Vercel deploy (any Postgres: Neon, Supabase, RDS, local); records carry metadata only — no prompts, no outputs. The ingest contract is just the `CallRecord` JSON, so any other drain works too.
285
+
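A minimal sketch of the kind of check the price-drift panel describes, assuming drift is measured as the ratio of the reported bill (`costUsd`) to the price-table prediction (`estCostUsd`). The dashboard's actual formula is not shown in this diff; `priceDrift` and `DriftRow` are hypothetical names, and the 0.2 default mirrors the ±20% threshold quoted above.

```ts
interface DriftRow {
  costUsd: number;
  estCostUsd?: number; // what the configured price table predicted
}

function priceDrift(row: DriftRow, tolerance = 0.2): { ratio: number; flagged: boolean } | undefined {
  // Without a positive estimate there is nothing to compare against.
  if (row.estCostUsd === undefined || row.estCostUsd <= 0) return undefined;
  const ratio = row.costUsd / row.estCostUsd;
  return { ratio, flagged: Math.abs(ratio - 1) > tolerance };
}
```

A bill of $1.00 against a $0.01 estimate yields a ratio near 100, the USD-vs-cents pattern the panel calls out, while $0.11 against $0.10 stays inside the tolerance.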
229
286
  ## Supported providers
230
287
 
231
288
  Any OpenAI-compatible endpoint works — and so does any AI SDK provider package, including a model vendor's own official API.
232
289
 
233
290
  - **Model vendors' own APIs (native):** route straight to [DeepSeek](https://platform.deepseek.com), [OpenAI](https://openai.com), [Anthropic](https://anthropic.com), [Google](https://ai.google.dev), [xAI](https://x.ai), etc. via their AI SDK provider packages — no markup, full native features. See [Route to a model vendor's own API](#route-to-a-model-vendors-own-api-native-providers).
234
291
  - **Text aggregators:** [OpenRouter](https://openrouter.ai) (widest coverage, list pricing) · [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off** every model) · [TokenMart](https://thetokenmart.ai) (15–65% off, varies by model)
235
- - **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) — routing via `createMediaLCR`. Image: Kunavo (generations + `*-edit` reference-image endpoints) + Runware + fal. Video: fal (async queue) and Kunavo (async `POST /v1/videos` + poll, sync fallback) — both verified live
292
+ - **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai), routing via `createMediaLCR`. Image: Kunavo (generations + `*-edit` reference-image endpoints) + Runware + fal. Video: fal (async queue), Kunavo (async `POST /v1/videos` + poll, sync fallback), and Runware (async `videoInference` + `getResponse` poll); all three are on the async `submit`/`poll` path
236
293
 
237
294
  ## Text model pricing
238
295
 
@@ -347,6 +404,37 @@ Design choices worth knowing:
347
404
  - **Telemetry lands once, at the terminal poll** — one `onCall` `CallRecord` with the full failover chain, threaded across both processes (not at `submit`).
348
405
  - An adapter advertises async by implementing `submit` + `checkStatus`; image-only adapters omit them and are skipped by the async router. The bundled Kunavo, fal, and Runware adapters all implement the async path (Kunavo/Runware async is video-only; fal covers both).
349
406
 
407
+ ### Writing your own adapter
408
+
409
+ A `MediaAdapter` is small — `run` for sync, optional `submit`/`checkStatus` for async — and the one contract that matters is **how you report what was produced**:
410
+
411
+ ```ts
412
+ interface MediaAdapter {
413
+ provider: string;
414
+ run(req: { externalId: string; input: Record<string, unknown> }): Promise<MediaGenerateResult>;
415
+ submit?(req: { externalId: string; input; metadata? }): Promise<{ requestId: string }>;
416
+ checkStatus?(req: { externalId: string; requestId: string }): Promise<MediaStatusResult>;
417
+ }
418
+
419
+ // On a settled result, report:
420
+ {
421
+ outputs: [{ url, type: "image" | "video" }],
422
+ costCents?: number, // the provider's OWN bill, in US cents — convert if the API returns dollars (×100)!
423
+ usage?: { // typed actual usage — what the bill (or estimate) is based on
424
+ seconds?: number, // video length actually produced (per-second SKUs bill this)
425
+ outputs?: number, // output count — images or clips (per-image / per-call SKUs bill this)
426
+ megapixels?: number // total output MP (per-megapixel SKUs bill this)
427
+ }
428
+ }
429
+ ```
430
+
431
+ Rules that keep billing honest:
432
+
433
+ - **Report dimensions in `usage`, never as a bare count.** Seconds and output count are separate, explicitly-named fields, so a per-call price can never be multiplied by a clip's duration (the classic 8× overcharge).
434
+ - **`costCents` is cents.** A provider that returns dollars must be converted in the adapter (see the Runware adapter). If you slip, the router's cost-outlier guard flags any bill ≥25× off the price table via `onError` — but the reported number still stands.
435
+ - **When you report nothing**, the router estimates: per-second SKUs read `usage.seconds`, then the input's `duration` (numbers or `"8s"`-style strings), then the 5-second reference as a last resort; per-image/per-call SKUs bill the output count.
436
+ - **Throw errors with an HTTP `status` property** (see `FalMediaError`/`KunavoMediaError`) so the router can classify them for failover.
437
+
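The estimation order in the third rule (reported `usage.seconds`, then the input's `duration`, then the 5-second reference) can be sketched like this. It is an illustration only: `estimateSeconds` and `parseDuration` are hypothetical helpers, not the package's exported `durationFromInput`, and the accepted string shapes (`"8"`, `"8s"`, `"7.5s"`) are an assumption.

```ts
const REFERENCE_SECONDS = 5; // the reference clip length named in the rule above

// Accepts a positive number, or a string such as "8" or "8s" (assumed formats).
function parseDuration(value: unknown): number | undefined {
  if (typeof value === "number") {
    return Number.isFinite(value) && value > 0 ? value : undefined;
  }
  if (typeof value === "string") {
    const match = /^\s*(\d+(?:\.\d+)?)\s*s?\s*$/i.exec(value);
    if (match) {
      const seconds = Number(match[1]);
      if (seconds > 0) return seconds;
    }
  }
  return undefined;
}

function estimateSeconds(
  usage: { seconds?: number } | undefined,
  input: Record<string, unknown>,
): number {
  // 1. what the adapter reported, 2. what the caller asked for, 3. the reference length
  return usage?.seconds ?? parseDuration(input["duration"]) ?? REFERENCE_SECONDS;
}
```

Falling back to the reference last keeps the estimate tied to the actual clip whenever any duration signal exists, which is the point of billing on usage rather than on the ranking reference.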
350
438
  ## Vetting a provider (capability + cost probe)
351
439
 
352
440
  A discount is worthless if the provider quietly breaks the wire protocol. `ai-lcr` ships a zero-dependency check (`scripts/check-provider.sh`, just `bash` + `curl` + `python3`) that vets the things that actually cost you money or corrupt output, **per model**:
@@ -406,11 +494,13 @@ Two OpenAI-compatible providers, same probe, same day. Cells cover both families
406
494
  - [x] One correlated record per request with the full failover chain (`onCall` + `formatCallRecord`)
407
495
  - [x] Auto cheapest-first ordering (`autoSort`) from per-provider `cost`
408
496
  - [x] Offline capability + cost check (`scripts/check-provider.sh`) → per-model trust matrix
409
- - [ ] Bundled price table for zero-config pricing (drop the manual `cost` numbers)
497
+ - [x] Bundled price table for zero-config pricing (`autoPrice` + `MODEL_PRICES`) — drop the manual `cost` numbers for native-maker routes
410
498
  - [ ] Provider-quirk middleware (transparently patch known per-provider request quirks, e.g. Kunavo's ignored `max_tokens`)
411
499
  - [ ] Feed probe results into routing automatically (auto-exclude a model from a provider that fails its probe)
412
- - [x] Image & video model routing (`createMediaLCR`) — image via Kunavo (incl. `*-edit`) + Runware + fal; **video live via fal and Kunavo** (both verified)
413
- - [ ] Normalized cross-provider video price comparison + verified Runware video adapter
500
+ - [x] Image & video model routing (`createMediaLCR`) — image via Kunavo (incl. `*-edit`) + Runware + fal; video async (`submit`/`poll`) via fal, Kunavo, and Runware
501
+ - [x] Settle-time billing on actual usage (0.6) — typed `usage`, duration-aware savings baseline, `estCostUsd` price-drift signal, cost-outlier guard
502
+ - [x] Self-hosted dashboard ([`ai-lcr-dashboard`](https://github.com/ai-lcr/ai-lcr-dashboard)) — savings, failover health, media $/unit, price-drift panel
503
+ - [ ] Normalized cross-provider video price comparison in the bundled table
414
504
 
415
505
  ## Affiliate disclosure
416
506
 
package/README.zh-CN.md CHANGED
@@ -144,13 +144,56 @@ DeepInfra 只承载开源权重——没有第一方 Claude / GPT / Gemini。那
144
144
  2. **失败时向下穿透。** 遇到任何 provider 失败——限流、5xx、超时、**额度耗尽**(402 / 欠费 / 余额不足),以及 **400** 这类 client 错误——都会前进到下一个 provider,且对流式安全。400 会 failover 是有意为之:在 OpenAI 兼容聚合层里,400 往往是"*这家* provider 不吃这个请求"(不支持的参数、它没上架这个 model、更严格的 schema),而非请求本身坏了——换一家很可能就能服务。若所有 provider 都拒绝,请求仍会失败,并抛出**第一个**(原始)错误,让真正的调用方 bug 保持可调试。唯一永远不 failover 的是调用方主动取消(`AbortSignal`)。想恢复旧的"client 错误立即失败"行为,给 `createLCR` 传 `shouldRetry: isRetryableError`。
145
145
  3. **恢复。** 在一段空闲窗口(`resetIntervalMs`,默认 60s)之后,自动回到最便宜的 provider。
146
146
 
147
+ ## 看清每次调用发生了什么(`onCall`)
148
+
149
+ `onError`/`onCost` 各自独立触发、互不关联,事后很难还原一次 failover 的全貌。`onCall` 给你**每个请求一条记录**——完整的尝试链、最终服务者、每跳失败的原因、延迟和成本;`formatCallRecord` 把它变成一行可扫读的日志:
150
+
151
+ ```text
152
+ ✓ text tokenmart 412ms $0.0003
153
+ ⚠ text tokenmart→openrouter 910ms $0.0004 ⤷ tokenmart 502
154
+ ✗ text deepseek→tokenmart→openrouter 1240ms FAILED ⤷ deepseek 401, tokenmart 502, openrouter 429
155
+ ```
156
+
157
+ `record` 是一个纯 `CallRecord` 对象,关键字段:
158
+
159
+ ```ts
160
+ interface CallRecord {
161
+ id: string; // 每个请求一个关联 id
162
+ model: string; // 逻辑模型名
163
+ attempts: { provider; ok; latencyMs; errorClass? }[];
164
+ winner?: string; // 最终服务的 provider;全失败则为 undefined
165
+ ok: boolean;
166
+ failedOver: boolean; // 尝试了不止一家
167
+ latencyMs: number;
168
+ ttftMs?: number; // 仅流式:首 token 时间
169
+ inputTokens: number;
170
+ outputTokens: number;
171
+ cachedInputTokens?: number; // 命中 prompt 缓存的输入 token
172
+ costUsd: number; // 实际成本(已按 cacheRead 折扣)
173
+ baselineUsd?: number; // 同样用量在「节约基线」上的价格 → 节约 = baselineUsd − costUsd
174
+ baselineKind?: "last-leg" | "official" | "priciest-route"; // 基线的来源(见下)
175
+ cachedSavingUsd?: number; // provider 自己的缓存折扣——是真金白银,但不是路由的功劳,别混进节约
176
+ usageMissing?: boolean; // 服务成功但 token 报 0/0 → 成本是「未知」而非「免费」
177
+
178
+ // 媒体调用(createMediaLCR)额外携带:
179
+ modality?: "image" | "video";
180
+ usage?: { seconds?; outputs?; megapixels? }; // 账单依据的实际用量
181
+ officialUsd?: number; // 官方第一方价(按本次实际用量)
182
+ estCostUsd?: number; // 价格表的预估——与 costUsd 的差 = 价格表漂移
183
+ }
184
+ ```
185
+
186
+ **节约怎么算才诚实:** `baselineKind` 说明 `baselineUsd` 是哪种基线——文本是**链尾兜底 provider 的列表价**(`"last-leg"`,故意不取最贵的一条:prompt 缓存可能让标价更便宜的那家在缓存重的调用上反而更贵,取最大值会凭空造出"节约");媒体是**模型厂商官方第一方价**(`"official"`,按实际秒数算),查不到官方价时退化为你配置里最贵的路由(`"priciest-route"`,自我参照,仅说明跨 provider 价差)。
187
+
188
+ **送进收集器:** `createHttpSink` 把每条记录 POST 到任意 endpoint(serverless 上传 Next.js 的 `after` 作 `dispatch` 防止被掐断)。配套的自托管 dashboard [`ai-lcr-dashboard`](https://github.com/ai-lcr/ai-lcr-dashboard)(Next.js + Postgres,Vercel 一键部署)专为这些记录而建:花费 vs 节约趋势、各 provider failover 健康度、媒体 $/秒 与 $/张、以及**价格漂移面板**——某条 model@provider 路由的实报账单与价格表偏差超过 ±20% 时点名示警(约 100× 基本就是美元当美分的笔误)。只存元数据,不存 prompt 和输出。
189
+
147
190
  ## 支持的 provider
148
191
 
149
192
  任何 OpenAI 兼容的 endpoint 都可用——任何 AI SDK 的 provider 包也都可用,包括模型厂商自己的官方 API。
150
193
 
151
194
  - **模型厂商官方 API(原生):** 通过各自的 AI SDK provider 包直连 [DeepSeek](https://platform.deepseek.com)、[OpenAI](https://openai.com)、[Anthropic](https://anthropic.com)、[Google](https://ai.google.dev)、[xAI](https://x.ai) 等——无加价,原生特性齐全。见上方「直连模型厂商官方 API(原生 provider)」一节。
152
195
  - **文本聚合器:** [OpenRouter](https://openrouter.ai)(覆盖最广,列表定价)· [Kunavo](https://kunavo.com/?ref=victorimf)(**全模型 8 折**)· [TokenMart](https://thetokenmart.ai)(按模型 85 折–35 折不等)
153
- - **图像 / 视频:** [Kunavo](https://kunavo.com/?ref=victorimf)(**8 折**)· [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) —— 通过 `createMediaLCR` 路由。图像:Kunavo + Runware + fal。视频:fal(已可用,走其异步队列 API);Kunavo Veo 轮询路径已实现但未验证
196
+ - **图像 / 视频:** [Kunavo](https://kunavo.com/?ref=victorimf)(**8 折**)· [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) —— 通过 `createMediaLCR` 路由。图像:Kunavo(生成 + `*-edit` 参考图端点)+ Runware + fal。视频:fal(异步队列)、Kunavo(异步 `POST /v1/videos` + 轮询,另有同步兜底)、Runware(异步 `videoInference` + `getResponse` 轮询)——三家都在异步 `submit`/`poll` 路径上
154
197
 
155
198
  ## 文本模型价格
156
199
 
@@ -209,7 +252,9 @@ Kunavo 提供 Anthropic + Google。DeepSeek / OpenAI / Grok / Mistral 路由到
209
252
 
210
253
  ## 图像与视频路由(`createMediaLCR`)
211
254
 
212
- 图像和视频是 `ai-lcr` 独立的一侧(输出是文件、计价单位混杂、视频是异步任务)—— 见 [`src/media.ts`](src/media.ts)。你提供一个 registry(每个模型的 provider 路由 + 单位价)和一组 adapter,它就按最便宜优先路由、自动 failover,并通过与文本侧相同的 `onCall` sink 报告真实/归一化成本。
255
+ 图像和视频是 `ai-lcr` 独立的一侧(输出是文件、计价单位混杂、视频是异步任务)—— 见 [`src/media.ts`](src/media.ts)。你提供一个 registry(每个模型的 provider 路由 + 单位价)和一组 adapter,它就按最便宜优先路由、自动 failover,并通过与文本侧相同的 `onCall` sink 报告真实成本。
256
+
257
+ 两个价格、两份职责(0.6+):**排序**用归一化到参考输出(1080p 一张图 / 5 秒一段片)的价格,让混杂的计价单位可以公平比较;但每次调用的**计费**按实际用量——按秒计价的 SKU,一条 8 秒的片就按 8 秒收,节约基线也按同样的 8 秒官方价算。adapter 上报带类型的实际用量(`usage: { seconds, outputs, megapixels }`);provider 自己报了账单时以账单为准,而账单与价格表预估差距悬殊时(经典的"美元当美分"笔误正好是 100×)会触发 `onError`,提醒你修价格表。
213
258
 
214
259
  ```ts
215
260
  import { createMediaLCR, createKunavoMediaAdapter, createFalMediaAdapter } from 'ai-lcr'
@@ -261,6 +306,30 @@ if (r.done) {
261
306
  - **telemetry 只在终态轮询落一条**——一条 `onCall` `CallRecord`,带完整 failover 链,跨两个进程串起来(不是在 `submit` 时落)。
262
307
  - adapter 通过实现 `submit` + `checkStatus` 来声明支持异步;只做图像的 adapter 省略它们,异步路由会跳过这种路由。内置的 Kunavo、fal、Runware adapter 都实现了异步路径(Kunavo/Runware 异步仅视频;fal 图像视频皆可)。
263
308
 
309
+ ### 自己写 adapter
310
+
311
+ `MediaAdapter` 很小——同步用 `run`,异步可选 `submit`/`checkStatus`——唯一要紧的合同是**如何上报产出**:
312
+
313
+ ```ts
314
+ // 落定的结果上报:
315
+ {
316
+ outputs: [{ url, type: "image" | "video" }],
317
+ costCents?: number, // provider 自己的账单,单位是美分——API 返回美元的要 ×100 转换!
318
+ usage?: { // 带类型的实际用量——账单(或估算)以它为准
319
+ seconds?: number, // 实际产出的视频秒数(按秒计价的 SKU 按它计费)
320
+ outputs?: number, // 产出个数——图或片(按张 / 按次计价按它计费)
321
+ megapixels?: number // 产出总百万像素(按 MP 计价按它计费)
322
+ }
323
+ }
324
+ ```
325
+
326
+ 保证计费正确的几条规则:
327
+
328
+ - **维度在 `usage` 里显式命名,绝不报裸数字。** 秒数和产出数是两个不同的字段,按次的平价永远不可能被片长乘爆(经典的 8× 过计)。
329
+ - **`costCents` 是美分。** API 返回美元的,必须在 adapter 里转换(参考 Runware adapter)。万一失手,路由器的异常账单守卫会在偏差 ≥25× 时触发 `onError`——但上报的数字仍然作数。
330
+ - **什么都不报时**,路由器会估算:按秒 SKU 依次读 `usage.seconds` → 输入的 `duration`(数字或 `"8s"` 这类字符串)→ 最后才退到 5 秒参考;按张/按次 SKU 按产出数计。
331
+ - **抛错时带上 HTTP `status` 属性**(见 `FalMediaError`/`KunavoMediaError`),路由器才能正确分类并 failover。
332
+
264
333
  ## 给 provider 做体检(能力 + 成本探测)
265
334
 
266
335
  折扣再大,如果 provider 偷偷破坏了协议就一文不值。`ai-lcr` 自带一个零依赖的检查脚本(`scripts/check-provider.sh`,只需 `bash` + `curl` + `python3`),**逐模型**核查那些真正会让你多花钱或污染输出的点:
@@ -321,8 +390,10 @@ API_KEY=$INFERENCE_API_KEY BASE=https://model.service-inference.ai \
321
390
  - [ ] 内置价格表,实现零配置定价(省去手填 `cost` 数字)
322
391
  - [ ] provider 怪癖中间件(透明地修补已知怪癖,如 Kunavo 被忽略的 `max_tokens`)
323
392
  - [ ] 把 probe 结果自动接入路由(探测失败的 provider×model 自动从列表剔除)
324
- - [x] 图像与视频模型路由(`createMediaLCR`)—— 图像走 Kunavo + Runware + fal;**视频已可用,走 fal**(异步队列 API)
325
- - [ ] 归一化的跨 provider 视频价格对比 + 验证 Kunavo/Runware 视频适配器
393
+ - [x] 图像与视频模型路由(`createMediaLCR`)—— 图像走 Kunavo(含 `*-edit`)+ Runware + fal;视频异步(`submit`/`poll`)走 fal、Kunavo、Runware 三家
394
+ - [x] 按实际用量的结算计费(0.6)—— typed `usage`、时长感知的节约基线、`estCostUsd` 价格漂移信号、异常账单守卫
395
+ - [x] 自托管 dashboard([`ai-lcr-dashboard`](https://github.com/ai-lcr/ai-lcr-dashboard))—— 节约、failover 健康度、媒体单位成本、价格漂移面板
396
+ - [ ] 内置价格表中的归一化跨 provider 视频价格对比
326
397
 
327
398
  ## 联盟(Affiliate)披露
328
399
 
package/dist/index.cjs CHANGED
@@ -22,6 +22,7 @@ var index_exports = {};
22
22
  __export(index_exports, {
23
23
  DEFAULT_REFERENCE: () => DEFAULT_REFERENCE,
24
24
  MEDIA_PRICING: () => MEDIA_PRICING,
25
+ MODEL_PRICES: () => MODEL_PRICES,
25
26
  OFFICIAL_PRICES: () => OFFICIAL_PRICES,
26
27
  billableUnits: () => billableUnits,
27
28
  cheapestRoute: () => cheapestRoute,
@@ -36,6 +37,7 @@ __export(index_exports, {
36
37
  createRunwareMediaAdapter: () => createRunwareMediaAdapter,
37
38
  durationFromInput: () => durationFromInput,
38
39
  formatCallRecord: () => formatCallRecord,
40
+ getModelPrice: () => getModelPrice,
39
41
  isAbortError: () => isAbortError,
40
42
  isNetworkError: () => isNetworkError,
41
43
  isRetryableError: () => isRetryableError,
@@ -637,6 +639,239 @@ function createHttpSink(options) {
637
639
  };
638
640
  }
639
641
 
642
+ // src/text-prices.ts
643
+ var MODEL_PRICES = {
644
+ "chatgpt-4o-latest": { input: 5, output: 15 },
645
+ "claude-3-7-sonnet-20250219": { input: 3, output: 15, cacheRead: 0.3 },
646
+ "claude-3-haiku-20240307": { input: 0.25, output: 1.25, cacheRead: 0.03 },
647
+ "claude-3-opus-20240229": { input: 15, output: 75, cacheRead: 1.5 },
648
+ "claude-4-opus-20250514": { input: 15, output: 75, cacheRead: 1.5 },
649
+ "claude-4-sonnet-20250514": { input: 3, output: 15, cacheRead: 0.3 },
650
+ "claude-fable-5": { input: 10, output: 50, cacheRead: 1 },
651
+ "claude-haiku-4-5": { input: 1, output: 5, cacheRead: 0.1 },
652
+ "claude-haiku-4-5-20251001": { input: 1, output: 5, cacheRead: 0.1 },
653
+ "claude-opus-4-1": { input: 15, output: 75, cacheRead: 1.5 },
654
+ "claude-opus-4-1-20250805": { input: 15, output: 75, cacheRead: 1.5 },
655
+ "claude-opus-4-20250514": { input: 15, output: 75, cacheRead: 1.5 },
656
+ "claude-opus-4-5": { input: 5, output: 25, cacheRead: 0.5 },
657
+ "claude-opus-4-5-20251101": { input: 5, output: 25, cacheRead: 0.5 },
658
+ "claude-opus-4-6": { input: 5, output: 25, cacheRead: 0.5 },
659
+ "claude-opus-4-6-20260205": { input: 5, output: 25, cacheRead: 0.5 },
660
+ "claude-opus-4-7": { input: 5, output: 25, cacheRead: 0.5 },
661
+ "claude-opus-4-7-20260416": { input: 5, output: 25, cacheRead: 0.5 },
662
+ "claude-opus-4-8": { input: 5, output: 25, cacheRead: 0.5 },
663
+ "claude-sonnet-4-20250514": { input: 3, output: 15, cacheRead: 0.3 },
664
+ "claude-sonnet-4-5": { input: 3, output: 15, cacheRead: 0.3 },
665
+ "claude-sonnet-4-5-20250929": { input: 3, output: 15, cacheRead: 0.3 },
666
+ "claude-sonnet-4-6": { input: 3, output: 15, cacheRead: 0.3 },
667
+ "codestral-2405": { input: 1, output: 3 },
668
+ "codestral-2508": { input: 0.3, output: 0.9 },
669
+ "codestral-latest": { input: 1, output: 3 },
670
+ "codestral-mamba-latest": { input: 0.25, output: 0.25 },
671
+ "deepseek-chat": { input: 0.28, output: 0.42, cacheRead: 0.028 },
672
+ "deepseek-coder": { input: 0.14, output: 0.28 },
673
+ "deepseek-r1": { input: 0.55, output: 2.19 },
674
+ "deepseek-reasoner": { input: 0.28, output: 0.42, cacheRead: 0.028 },
675
+ "deepseek-v3": { input: 0.27, output: 1.1, cacheRead: 0.07 },
676
+ "deepseek-v3.2": { input: 0.28, output: 0.4 },
677
+ "devstral-2512": { input: 0.4, output: 2 },
678
+ "devstral-latest": { input: 0.4, output: 2 },
679
+ "devstral-medium-2507": { input: 0.4, output: 2 },
680
+ "devstral-medium-latest": { input: 0.4, output: 2 },
681
+ "devstral-small-2505": { input: 0.1, output: 0.3 },
682
+ "devstral-small-2507": { input: 0.1, output: 0.3 },
683
+ "devstral-small-latest": { input: 0.1, output: 0.3 },
684
+ "gemini-2.0-flash": { input: 0.1, output: 0.4, cacheRead: 0.025 },
685
+ "gemini-2.0-flash-001": { input: 0.1, output: 0.4, cacheRead: 0.025 },
686
+ "gemini-2.0-flash-lite": { input: 0.075, output: 0.3, cacheRead: 0.01875 },
687
+ "gemini-2.0-flash-lite-001": { input: 0.075, output: 0.3, cacheRead: 0.01875 },
688
+ "gemini-2.5-computer-use-preview-10-2025": { input: 1.25, output: 10 },
689
+ "gemini-2.5-flash": { input: 0.3, output: 2.5, cacheRead: 0.03 },
690
+ "gemini-2.5-flash-lite": { input: 0.1, output: 0.4, cacheRead: 0.01 },
691
+ "gemini-2.5-flash-lite-preview-06-17": { input: 0.1, output: 0.4, cacheRead: 0.025 },
692
+ "gemini-2.5-flash-lite-preview-09-2025": { input: 0.1, output: 0.4, cacheRead: 0.01 },
693
+ "gemini-2.5-flash-native-audio-latest": { input: 0.3, output: 2.5 },
694
+ "gemini-2.5-flash-native-audio-preview-09-2025": { input: 0.3, output: 2.5 },
695
+ "gemini-2.5-flash-native-audio-preview-12-2025": { input: 0.3, output: 2.5 },
696
+ "gemini-2.5-flash-preview-09-2025": { input: 0.3, output: 2.5, cacheRead: 0.075 },
697
+ "gemini-2.5-pro": { input: 1.25, output: 10, cacheRead: 0.125 },
698
+ "gemini-2.5-pro-preview-tts": { input: 1.25, output: 10, cacheRead: 0.125 },
699
+ "gemini-3-flash-preview": { input: 0.5, output: 3, cacheRead: 0.05 },
700
+ "gemini-3-pro-preview": { input: 2, output: 12, cacheRead: 0.2 },
701
+ "gemini-3.1-flash-lite": { input: 0.25, output: 1.5, cacheRead: 0.025 },
702
+ "gemini-3.1-flash-lite-preview": { input: 0.25, output: 1.5, cacheRead: 0.025 },
703
+ "gemini-3.1-flash-live-preview": { input: 0.75, output: 4.5 },
704
+ "gemini-3.1-pro-preview": { input: 2, output: 12, cacheRead: 0.2 },
705
+ "gemini-3.1-pro-preview-customtools": { input: 2, output: 12, cacheRead: 0.2 },
706
+ "gemini-3.5-flash": { input: 1.5, output: 9, cacheRead: 0.15 },
707
+ "gemini-exp-1206": { input: 0.3, output: 2.5, cacheRead: 0.03 },
708
+ "gemini-flash-latest": { input: 0.3, output: 2.5, cacheRead: 0.075 },
709
+ "gemini-flash-lite-latest": { input: 0.1, output: 0.4, cacheRead: 0.025 },
710
+ "gemini-gemma-2-27b-it": { input: 0.35, output: 1.05 },
711
+ "gemini-gemma-2-9b-it": { input: 0.35, output: 1.05 },
712
+ "gemini-pro-latest": { input: 1.25, output: 10, cacheRead: 0.125 },
713
+ "gemini-robotics-er-1.5-preview": { input: 0.3, output: 2.5 },
714
+ "gpt-3.5-turbo": { input: 0.5, output: 1.5 },
715
+ "gpt-3.5-turbo-0125": { input: 0.5, output: 1.5 },
716
+ "gpt-3.5-turbo-1106": { input: 1, output: 2 },
717
+ "gpt-3.5-turbo-16k": { input: 3, output: 4 },
718
+ "gpt-4": { input: 30, output: 60 },
719
+ "gpt-4-0125-preview": { input: 10, output: 30 },
720
+ "gpt-4-0314": { input: 30, output: 60 },
721
+ "gpt-4-0613": { input: 30, output: 60 },
722
+ "gpt-4-1106-preview": { input: 10, output: 30 },
723
+ "gpt-4-turbo": { input: 10, output: 30 },
724
+ "gpt-4-turbo-2024-04-09": { input: 10, output: 30 },
725
+ "gpt-4-turbo-preview": { input: 10, output: 30 },
726
+ "gpt-4.1": { input: 2, output: 8, cacheRead: 0.5 },
727
+ "gpt-4.1-2025-04-14": { input: 2, output: 8, cacheRead: 0.5 },
728
+ "gpt-4.1-mini": { input: 0.4, output: 1.6, cacheRead: 0.1 },
729
+ "gpt-4.1-mini-2025-04-14": { input: 0.4, output: 1.6, cacheRead: 0.1 },
730
+ "gpt-4.1-nano": { input: 0.1, output: 0.4, cacheRead: 0.025 },
731
+ "gpt-4.1-nano-2025-04-14": { input: 0.1, output: 0.4, cacheRead: 0.025 },
732
+ "gpt-4o": { input: 2.5, output: 10, cacheRead: 1.25 },
733
+ "gpt-4o-2024-05-13": { input: 5, output: 15 },
734
+ "gpt-4o-2024-08-06": { input: 2.5, output: 10, cacheRead: 1.25 },
735
+ "gpt-4o-2024-11-20": { input: 2.5, output: 10, cacheRead: 1.25 },
736
+ "gpt-4o-audio-preview": { input: 2.5, output: 10 },
737
+ "gpt-4o-audio-preview-2024-12-17": { input: 2.5, output: 10 },
738
+ "gpt-4o-audio-preview-2025-06-03": { input: 2.5, output: 10 },
739
+ "gpt-4o-mini": { input: 0.15, output: 0.6, cacheRead: 0.075 },
740
+ "gpt-4o-mini-2024-07-18": { input: 0.15, output: 0.6, cacheRead: 0.075 },
741
+ "gpt-4o-mini-audio-preview": { input: 0.15, output: 0.6 },
742
+ "gpt-4o-mini-audio-preview-2024-12-17": { input: 0.15, output: 0.6 },
743
+ "gpt-4o-mini-realtime-preview": { input: 0.6, output: 2.4, cacheRead: 0.3 },
744
+ "gpt-4o-mini-realtime-preview-2024-12-17": { input: 0.6, output: 2.4, cacheRead: 0.3 },
745
+ "gpt-4o-mini-search-preview": { input: 0.15, output: 0.6, cacheRead: 0.075 },
746
+ "gpt-4o-mini-search-preview-2025-03-11": { input: 0.15, output: 0.6, cacheRead: 0.075 },
747
+ "gpt-4o-realtime-preview": { input: 5, output: 20, cacheRead: 2.5 },
748
+ "gpt-4o-realtime-preview-2024-12-17": { input: 5, output: 20, cacheRead: 2.5 },
749
+ "gpt-4o-realtime-preview-2025-06-03": { input: 5, output: 20, cacheRead: 2.5 },
750
+ "gpt-4o-search-preview": { input: 2.5, output: 10, cacheRead: 1.25 },
751
+ "gpt-4o-search-preview-2025-03-11": { input: 2.5, output: 10, cacheRead: 1.25 },
752
+ "gpt-5": { input: 1.25, output: 10, cacheRead: 0.125 },
753
+ "gpt-5-2025-08-07": { input: 1.25, output: 10, cacheRead: 0.125 },
754
+ "gpt-5-chat": { input: 1.25, output: 10, cacheRead: 0.125 },
755
+ "gpt-5-chat-latest": { input: 1.25, output: 10, cacheRead: 0.125 },
756
+ "gpt-5-mini": { input: 0.25, output: 2, cacheRead: 0.025 },
757
+ "gpt-5-mini-2025-08-07": { input: 0.25, output: 2, cacheRead: 0.025 },
758
+ "gpt-5-nano": { input: 0.05, output: 0.4, cacheRead: 5e-3 },
759
+ "gpt-5-nano-2025-08-07": { input: 0.05, output: 0.4, cacheRead: 5e-3 },
760
+ "gpt-5-search-api": { input: 1.25, output: 10, cacheRead: 0.125 },
761
+ "gpt-5-search-api-2025-10-14": { input: 1.25, output: 10, cacheRead: 0.125 },
762
+ "gpt-5.1": { input: 1.25, output: 10, cacheRead: 0.125 },
763
+ "gpt-5.1-2025-11-13": { input: 1.25, output: 10, cacheRead: 0.125 },
764
+ "gpt-5.1-chat-latest": { input: 1.25, output: 10, cacheRead: 0.125 },
765
+ "gpt-5.2": { input: 1.75, output: 14, cacheRead: 0.175 },
766
+ "gpt-5.2-2025-12-11": { input: 1.75, output: 14, cacheRead: 0.175 },
767
+ "gpt-5.2-chat-latest": { input: 1.75, output: 14, cacheRead: 0.175 },
768
+ "gpt-5.3-chat-latest": { input: 1.75, output: 14, cacheRead: 0.175 },
769
+ "gpt-5.4": { input: 2.5, output: 15, cacheRead: 0.25 },
770
+ "gpt-5.4-2026-03-05": { input: 2.5, output: 15, cacheRead: 0.25 },
771
+ "gpt-5.4-mini": { input: 0.75, output: 4.5, cacheRead: 0.075 },
772
+ "gpt-5.4-mini-2026-03-17": { input: 0.75, output: 4.5, cacheRead: 0.075 },
773
+ "gpt-5.4-nano": { input: 0.2, output: 1.25, cacheRead: 0.02 },
774
+ "gpt-5.4-nano-2026-03-17": { input: 0.2, output: 1.25, cacheRead: 0.02 },
775
+ "gpt-5.5": { input: 5, output: 30, cacheRead: 0.5 },
776
+ "gpt-5.5-2026-04-23": { input: 5, output: 30, cacheRead: 0.5 },
777
+ "gpt-audio": { input: 2.5, output: 10 },
778
+ "gpt-audio-1.5": { input: 2.5, output: 10 },
779
+ "gpt-audio-2025-08-28": { input: 2.5, output: 10 },
780
+ "gpt-audio-mini": { input: 0.6, output: 2.4 },
781
+ "gpt-audio-mini-2025-10-06": { input: 0.6, output: 2.4 },
782
+ "gpt-audio-mini-2025-12-15": { input: 0.6, output: 2.4 },
783
+ "gpt-realtime": { input: 4, output: 16, cacheRead: 0.4 },
784
+ "gpt-realtime-1.5": { input: 4, output: 16, cacheRead: 0.4 },
785
+ "gpt-realtime-2": { input: 4, output: 16, cacheRead: 0.4 },
786
+ "gpt-realtime-2025-08-28": { input: 4, output: 16, cacheRead: 0.4 },
787
+ "gpt-realtime-mini": { input: 0.6, output: 2.4 },
788
+ "gpt-realtime-mini-2025-10-06": { input: 0.6, output: 2.4, cacheRead: 0.06 },
789
+ "gpt-realtime-mini-2025-12-15": { input: 0.6, output: 2.4, cacheRead: 0.06 },
790
+ "grok-2": { input: 2, output: 10 },
791
+ "grok-2-1212": { input: 2, output: 10 },
792
+ "grok-2-latest": { input: 2, output: 10 },
793
+ "grok-2-vision": { input: 2, output: 10 },
794
+ "grok-2-vision-1212": { input: 2, output: 10 },
795
+ "grok-2-vision-latest": { input: 2, output: 10 },
796
+ "grok-3": { input: 3, output: 15, cacheRead: 0.75 },
797
+ "grok-3-beta": { input: 3, output: 15, cacheRead: 0.75 },
798
+ "grok-3-fast-beta": { input: 5, output: 25, cacheRead: 1.25 },
799
+ "grok-3-fast-latest": { input: 5, output: 25, cacheRead: 1.25 },
800
+ "grok-3-latest": { input: 3, output: 15, cacheRead: 0.75 },
801
+ "grok-3-mini": { input: 0.3, output: 0.5, cacheRead: 0.075 },
802
+ "grok-3-mini-beta": { input: 0.3, output: 0.5, cacheRead: 0.075 },
803
+ "grok-3-mini-fast": { input: 0.6, output: 4, cacheRead: 0.15 },
804
+ "grok-3-mini-fast-beta": { input: 0.6, output: 4, cacheRead: 0.15 },
805
+ "grok-3-mini-fast-latest": { input: 0.6, output: 4, cacheRead: 0.15 },
806
+ "grok-3-mini-latest": { input: 0.3, output: 0.5, cacheRead: 0.075 },
807
+ "grok-4": { input: 3, output: 15 },
808
+ "grok-4-0709": { input: 3, output: 15 },
809
+ "grok-4-1-fast": { input: 0.2, output: 0.5, cacheRead: 0.05 },
810
+ "grok-4-1-fast-non-reasoning": { input: 0.2, output: 0.5, cacheRead: 0.05 },
811
+ "grok-4-1-fast-non-reasoning-latest": { input: 0.2, output: 0.5, cacheRead: 0.05 },
812
+ "grok-4-1-fast-reasoning": { input: 0.2, output: 0.5, cacheRead: 0.05 },
813
+ "grok-4-1-fast-reasoning-latest": { input: 0.2, output: 0.5, cacheRead: 0.05 },
814
+ "grok-4-fast-non-reasoning": { input: 0.2, output: 0.5, cacheRead: 0.05 },
815
+ "grok-4-fast-reasoning": { input: 0.2, output: 0.5, cacheRead: 0.05 },
816
+ "grok-4-latest": { input: 3, output: 15 },
817
+ "grok-4.20-0309-reasoning": { input: 2, output: 6, cacheRead: 0.2 },
818
+ "grok-4.20-beta-0309-non-reasoning": { input: 2, output: 6, cacheRead: 0.2 },
819
+ "grok-4.20-beta-0309-reasoning": { input: 2, output: 6, cacheRead: 0.2 },
820
+ "grok-4.20-multi-agent-beta-0309": { input: 2, output: 6, cacheRead: 0.2 },
821
+ "grok-4.3": { input: 1.25, output: 2.5, cacheRead: 0.2 },
822
+ "grok-4.3-latest": { input: 1.25, output: 2.5, cacheRead: 0.2 },
823
+ "grok-beta": { input: 5, output: 15 },
824
+ "grok-code-fast": { input: 0.2, output: 1.5, cacheRead: 0.02 },
825
+ "grok-code-fast-1": { input: 0.2, output: 1.5, cacheRead: 0.02 },
826
+ "grok-code-fast-1-0825": { input: 0.2, output: 1.5, cacheRead: 0.02 },
827
+ "grok-vision-beta": { input: 5, output: 15 },
828
+ "labs-devstral-small-2512": { input: 0.1, output: 0.3 },
829
+ "magistral-medium-1-2-2509": { input: 2, output: 5 },
830
+ "magistral-medium-2506": { input: 2, output: 5 },
831
+ "magistral-medium-2509": { input: 2, output: 5 },
832
+ "magistral-medium-latest": { input: 2, output: 5 },
833
+ "magistral-small-1-2-2509": { input: 0.5, output: 1.5 },
834
+ "magistral-small-2506": { input: 0.5, output: 1.5 },
835
+ "magistral-small-latest": { input: 0.5, output: 1.5 },
836
+ "ministral-3-14b-2512": { input: 0.2, output: 0.2 },
837
+ "ministral-3-3b-2512": { input: 0.1, output: 0.1 },
838
+ "ministral-3-8b-2512": { input: 0.15, output: 0.15 },
839
+ "ministral-8b-2512": { input: 0.15, output: 0.15 },
840
+ "ministral-8b-latest": { input: 0.15, output: 0.15 },
841
+ "mistral-large-2402": { input: 4, output: 12 },
842
+ "mistral-large-2407": { input: 3, output: 9 },
843
+ "mistral-large-2411": { input: 2, output: 6 },
844
+ "mistral-large-2512": { input: 0.5, output: 1.5 },
845
+ "mistral-large-3": { input: 0.5, output: 1.5 },
846
+ "mistral-large-latest": { input: 0.5, output: 1.5 },
847
+ "mistral-medium": { input: 2.7, output: 8.1 },
848
+ "mistral-medium-2312": { input: 2.7, output: 8.1 },
849
+ "mistral-medium-2505": { input: 0.4, output: 2 },
850
+ "mistral-medium-3-1-2508": { input: 0.4, output: 2 },
851
+ "mistral-medium-latest": { input: 0.4, output: 2 },
852
+ "mistral-small": { input: 0.1, output: 0.3 },
853
+ "mistral-small-3-2-2506": { input: 0.06, output: 0.18 },
854
+ "mistral-small-latest": { input: 0.06, output: 0.18 },
855
+ "mistral-tiny": { input: 0.25, output: 0.25 },
856
+ "o1": { input: 15, output: 60, cacheRead: 7.5 },
857
+ "o1-2024-12-17": { input: 15, output: 60, cacheRead: 7.5 },
858
+ "o3": { input: 2, output: 8, cacheRead: 0.5 },
859
+ "o3-2025-04-16": { input: 2, output: 8, cacheRead: 0.5 },
860
+ "o3-mini": { input: 1.1, output: 4.4, cacheRead: 0.55 },
861
+ "o3-mini-2025-01-31": { input: 1.1, output: 4.4, cacheRead: 0.55 },
862
+ "o4-mini": { input: 1.1, output: 4.4, cacheRead: 0.275 },
863
+ "o4-mini-2025-04-16": { input: 1.1, output: 4.4, cacheRead: 0.275 },
864
+ "open-codestral-mamba": { input: 0.25, output: 0.25 },
865
+ "open-mistral-7b": { input: 0.25, output: 0.25 },
866
+ "open-mistral-nemo": { input: 0.3, output: 0.3 },
867
+ "open-mistral-nemo-2407": { input: 0.3, output: 0.3 },
868
+ "open-mixtral-8x22b": { input: 2, output: 6 },
869
+ "open-mixtral-8x7b": { input: 0.7, output: 0.7 },
870
+ "pixtral-12b-2409": { input: 0.15, output: 0.15 },
871
+ "pixtral-large-2411": { input: 2, output: 6 },
872
+ "pixtral-large-latest": { input: 2, output: 6 }
873
+ };
874
+
640
875
  // src/media-official.ts
641
876
  var OFFICIAL_PRICES = {
642
877
  "alibaba/qwen-image": { unit: "image", cents: 3.5 },
@@ -1649,6 +1884,17 @@ async function safeText2(res) {
1649
1884
  }
1650
1885
 
1651
1886
  // src/index.ts
1887
+ function getModelPrice(modelId) {
1888
+ if (!modelId) return void 0;
1889
+ const direct = MODEL_PRICES[modelId];
1890
+ if (direct) return direct;
1891
+ const slash = modelId.indexOf("/");
1892
+ if (slash !== -1) {
1893
+ const bare = MODEL_PRICES[modelId.slice(slash + 1)];
1894
+ if (bare) return bare;
1895
+ }
1896
+ return void 0;
1897
+ }
1652
1898
  function isLanguageModel(entry) {
1653
1899
  return typeof entry.doGenerate === "function";
1654
1900
  }
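Reviewer sketch of the lookup added in this hunk, assuming a `Map` in place of the plain-object table. `SAMPLE_PRICES` and `lookupPrice` are hypothetical names used only here; the three rows are copied from the `MODEL_PRICES` entries in this diff (USD per 1M tokens, per the changelog). It is an illustration of the two-step resolution, not the package's implementation.

```typescript
// Hypothetical stand-in for ProviderCost; field meanings follow the changelog.
interface ProviderCost {
  input: number;      // USD per 1M input tokens
  output: number;     // USD per 1M output tokens
  cacheRead?: number; // USD per 1M cached-read tokens, where the maker prices it
}

// Hypothetical three-row excerpt of the bundled table.
const SAMPLE_PRICES = new Map<string, ProviderCost>([
  ["gpt-4o", { input: 2.5, output: 10, cacheRead: 1.25 }],
  ["claude-haiku-4-5", { input: 1, output: 5, cacheRead: 0.1 }],
  ["mistral-large-latest", { input: 0.5, output: 1.5 }],
]);

function lookupPrice(modelId: string): ProviderCost | undefined {
  if (!modelId) return undefined;
  // Step 1: try the id exactly as given.
  const direct = SAMPLE_PRICES.get(modelId);
  if (direct) return direct;
  // Step 2: strip everything up to and including the FIRST "/" and retry.
  const slash = modelId.indexOf("/");
  if (slash === -1) return undefined;
  return SAMPLE_PRICES.get(modelId.slice(slash + 1));
}
```

Only the first `/` is considered, so an id with two prefix segments (for example `a/b/gpt-4o`) does not resolve. The `Map` is a choice made for this sketch: unlike indexing a plain object, `Map.get` cannot return inherited members for ids such as `toString`.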
@@ -1659,9 +1905,25 @@ function normalize(entry) {
1659
1905
  return {
1660
1906
  model: entry.model,
1661
1907
  label: entry.label ?? entry.model.provider,
1662
- cost: entry.cost
1908
+ cost: entry.cost,
1909
+ discount: entry.discount
1663
1910
  };
1664
1911
  }
1912
+ function applyDiscount(cost, discount) {
1913
+ const f = 1 - discount;
1914
+ return {
1915
+ input: cost.input * f,
1916
+ output: cost.output * f,
1917
+ ...cost.cacheRead !== void 0 ? { cacheRead: cost.cacheRead * f } : {}
1918
+ };
1919
+ }
1920
+ function withAutoPrice(p, autoPrice) {
1921
+ const { discount, ...rest } = p;
1922
+ if (!autoPrice || rest.cost !== void 0) return rest;
1923
+ const base = getModelPrice(rest.model.modelId);
1924
+ if (!base) return rest;
1925
+ return { ...rest, cost: discount !== void 0 ? applyDiscount(base, discount) : base };
1926
+ }
1665
1927
  function priceKey(p) {
1666
1928
  return p.cost ? p.cost.input + p.cost.output : Number.POSITIVE_INFINITY;
1667
1929
  }
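Reviewer sketch of the pricing rules in this hunk: an explicit `cost` wins, `autoPrice` off leaves the entry unpriced, and `discount` scales every priced field by `1 - discount`. `SketchEntry` and `fillCost` are hypothetical names, and the list price is passed in directly instead of being looked up, so the block stays self-contained.

```typescript
interface ProviderCost {
  input: number;
  output: number;
  cacheRead?: number;
}

// Hypothetical entry shape for this sketch; the real entry also carries a model object.
interface SketchEntry {
  cost?: ProviderCost;
  discount?: number;
}

// Scale input, output and (when present) cacheRead by the same factor.
function applyDiscount(cost: ProviderCost, discount: number): ProviderCost {
  const factor = 1 - discount;
  return {
    input: cost.input * factor,
    output: cost.output * factor,
    ...(cost.cacheRead !== undefined ? { cacheRead: cost.cacheRead * factor } : {}),
  };
}

// Decide the effective cost for one entry given a (possibly missing) list price.
function fillCost(
  entry: SketchEntry,
  listPrice: ProviderCost | undefined,
  autoPrice: boolean,
): ProviderCost | undefined {
  // Explicit cost always wins; with autoPrice off nothing is filled in.
  if (!autoPrice || entry.cost !== undefined) return entry.cost;
  // No bundled price: the entry stays unpriced and the discount is ignored.
  if (!listPrice) return undefined;
  return entry.discount !== undefined ? applyDiscount(listPrice, entry.discount) : listPrice;
}
```

With the `gemini-2.5-pro` row from this diff (`input: 1.25, output: 10, cacheRead: 0.125`) and `discount: 0.2`, the factor is 0.8, giving roughly 1 / 8 / 0.1 USD per 1M tokens.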
@@ -1673,6 +1935,7 @@ function createLCR(config) {
1673
1935
  const {
1674
1936
  models,
1675
1937
  autoSort = false,
1938
+ autoPrice = false,
1676
1939
  resetIntervalMs,
1677
1940
  onError,
1678
1941
  onCost,
@@ -1687,7 +1950,13 @@ function createLCR(config) {
1687
1950
  }
1688
1951
  const routed = /* @__PURE__ */ new Map();
1689
1952
  for (const [name, entries] of Object.entries(models)) {
1690
- let providers = entries.map(normalize).map((p) => withDefaultCacheRead(p, defaultCacheReadRatio));
1953
+ for (const entry of entries) {
1954
+ const d = entry.discount;
1955
+ if (d !== void 0 && (d < 0 || d >= 1)) {
1956
+ throw new Error(`ai-lcr: discount must be in [0, 1) for model "${name}", got ${d}`);
1957
+ }
1958
+ }
1959
+ let providers = entries.map(normalize).map((p) => withAutoPrice(p, autoPrice)).map((p) => withDefaultCacheRead(p, defaultCacheReadRatio));
1691
1960
  if (autoSort) {
1692
1961
  providers = [...providers].sort((a, b) => priceKey(a) - priceKey(b));
1693
1962
  }
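Reviewer sketch of the two behaviors in this hunk: the up-front `discount` range check and the `autoSort` ordering by `input + output`, with unpriced entries sorting last. `assertDiscount`, `SketchProvider` and `sortCheapestFirst` are hypothetical names; the comparison logic and error text mirror the diff.

```typescript
interface ProviderCost {
  input: number;
  output: number;
  cacheRead?: number;
}

// Hypothetical provider shape for this sketch.
interface SketchProvider {
  label: string;
  cost?: ProviderCost;
}

// Reject discounts outside [0, 1); undefined means "no discount" and passes.
function assertDiscount(name: string, d: number | undefined): void {
  if (d !== undefined && (d < 0 || d >= 1)) {
    throw new Error(`ai-lcr: discount must be in [0, 1) for model "${name}", got ${d}`);
  }
}

// Sort key: input + output price; unpriced entries get +Infinity.
function priceKey(p: SketchProvider): number {
  return p.cost ? p.cost.input + p.cost.output : Number.POSITIVE_INFINITY;
}

// Return a new array ordered cheapest-first, leaving the input untouched.
function sortCheapestFirst(providers: SketchProvider[]): SketchProvider[] {
  return [...providers].sort((a, b) => priceKey(a) - priceKey(b));
}
```

Two details follow from the code as written: `NaN` passes the range check because both comparisons evaluate to false, and the check runs for every entry, including ones whose `discount` is later ignored because an explicit `cost` is set.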
@@ -1710,6 +1979,7 @@ function createLCR(config) {
1710
1979
  0 && (module.exports = {
1711
1980
  DEFAULT_REFERENCE,
1712
1981
  MEDIA_PRICING,
1982
+ MODEL_PRICES,
1713
1983
  OFFICIAL_PRICES,
1714
1984
  billableUnits,
1715
1985
  cheapestRoute,
@@ -1724,6 +1994,7 @@ function createLCR(config) {
1724
1994
  createRunwareMediaAdapter,
1725
1995
  durationFromInput,
1726
1996
  formatCallRecord,
1997
+ getModelPrice,
1727
1998
  isAbortError,
1728
1999
  isNetworkError,
1729
2000
  isRetryableError,
package/dist/index.d.cts CHANGED
@@ -324,6 +324,8 @@ interface HttpSinkOptions {
324
324
  */
325
325
  declare function createHttpSink(options: HttpSinkOptions): (record: CallRecord) => void;
326
326
 
327
+ declare const MODEL_PRICES: Record<string, ProviderCost>;
328
+
327
329
  /**
328
330
  * ai-lcr media routing — Least Cost Routing for image & video models.
329
331
  *
@@ -858,7 +860,24 @@ type ProviderEntry = LanguageModelV3 | {
858
860
  cost?: ProviderCost;
859
861
  /** Label used in cost events / logs. Defaults to the model's provider id. */
860
862
  label?: string;
863
+ /**
864
+ * Fraction off the bundled list price, in [0, 1); out-of-range values throw. The reseller-discount knob.
865
+ * Applied ONLY when `autoPrice` fills this entry from {@link MODEL_PRICES}
866
+ * (i.e. no explicit `cost`): a flat-discount aggregator like Kunavo (−20%)
867
+ * becomes `{ model: kunavo("gemini-2.5-pro"), discount: 0.2 }` with no
868
+ * hand-typed price. Scales input, output, and cacheRead alike. Ignored when
869
+ * `cost` is set, when `autoPrice` is off, or when no bundled price is found.
870
+ */
871
+ discount?: number;
861
872
  };
873
+ /**
874
+ * Look up a model's bundled official list price by id. Tries the id as given,
875
+ * then with a leading `provider/` segment stripped (so `anthropic/claude-haiku-4-5`
876
+ * resolves the same as `claude-haiku-4-5`). Returns undefined for unknown models.
877
+ * The table ({@link MODEL_PRICES}) carries native-maker first-party rates only —
878
+ * see `scripts/gen-text-prices.mjs`.
879
+ */
880
+ declare function getModelPrice(modelId: string): ProviderCost | undefined;
862
881
  interface LCRConfig {
863
882
  /**
864
883
  * Map of logical model name -> providers to try, cheapest-first.
@@ -867,6 +886,16 @@ interface LCRConfig {
867
886
  models: Record<string, ProviderEntry[]>;
868
887
  /** Sort each model's providers cheapest-first by `cost` before routing. */
869
888
  autoSort?: boolean;
889
+ /**
890
+ * Fill any provider entry that has no explicit `cost` from the bundled price
891
+ * table ({@link MODEL_PRICES}), looked up by the entry's `model.modelId`. A
892
+ * native-vendor route then needs zero hand-typed pricing; a flat-discount
893
+ * aggregator just adds `discount` (see {@link ProviderEntry}). Off by default,
894
+ * so unpriced entries stay unpriced (the pre-existing behavior). Turning it on
895
+ * never re-prices a model you priced yourself: an explicit `cost` always
896
+ * wins. Pairs naturally with `autoSort` and `onCost`/`onCall`.
897
+ */
898
+ autoPrice?: boolean;
870
899
  /** Idle window after which routing snaps back to the cheapest provider. Default 60s. */
871
900
  resetIntervalMs?: number;
872
901
  /** Called when a provider errors and routing falls through to the next. */
@@ -912,4 +941,4 @@ type LCRRouter = (modelName: string) => LanguageModelV3;
912
941
  */
913
942
  declare function createLCR(config: LCRConfig): LCRRouter;
914
943
 
915
- export { type BillableContext, type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type HttpSinkOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaJobHandle, type MediaJobStatus, type MediaLCR, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPollResult, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaStatusRequest, type MediaStatusResult, type MediaSubmitOptions, type MediaSubmitRequest, type MediaSubmitResult, type MediaUnit, type MediaUsage, OFFICIAL_PRICES, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, billableUnits, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createFalMediaAdapter, createHttpSink, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, durationFromInput, formatCallRecord, isAbortError, isNetworkError, isRetryableError, normalizedCents, priceCents, rankRoutes, referenceMegapixels, shouldFailover };
944
+ export { type BillableContext, type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type HttpSinkOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, MODEL_PRICES, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaJobHandle, type MediaJobStatus, type MediaLCR, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPollResult, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaStatusRequest, type MediaStatusResult, type MediaSubmitOptions, type MediaSubmitRequest, type MediaSubmitResult, type MediaUnit, type MediaUsage, OFFICIAL_PRICES, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, billableUnits, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createFalMediaAdapter, createHttpSink, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, durationFromInput, formatCallRecord, getModelPrice, isAbortError, isNetworkError, isRetryableError, normalizedCents, priceCents, rankRoutes, referenceMegapixels, shouldFailover };
package/dist/index.d.ts CHANGED
@@ -324,6 +324,8 @@ interface HttpSinkOptions {
324
324
  */
325
325
  declare function createHttpSink(options: HttpSinkOptions): (record: CallRecord) => void;
326
326
 
327
+ declare const MODEL_PRICES: Record<string, ProviderCost>;
328
+
327
329
  /**
328
330
  * ai-lcr media routing — Least Cost Routing for image & video models.
329
331
  *
@@ -858,7 +860,24 @@ type ProviderEntry = LanguageModelV3 | {
858
860
  cost?: ProviderCost;
859
861
  /** Label used in cost events / logs. Defaults to the model's provider id. */
860
862
  label?: string;
863
+ /**
864
+ * Fraction off the bundled list price, in [0, 1); out-of-range values throw. The reseller-discount knob.
865
+ * Applied ONLY when `autoPrice` fills this entry from {@link MODEL_PRICES}
866
+ * (i.e. no explicit `cost`): a flat-discount aggregator like Kunavo (−20%)
867
+ * becomes `{ model: kunavo("gemini-2.5-pro"), discount: 0.2 }` with no
868
+ * hand-typed price. Scales input, output, and cacheRead alike. Ignored when
869
+ * `cost` is set, when `autoPrice` is off, or when no bundled price is found.
870
+ */
871
+ discount?: number;
861
872
  };
873
+ /**
874
+ * Look up a model's bundled official list price by id. Tries the id as given,
875
+ * then with a leading `provider/` segment stripped (so `anthropic/claude-haiku-4-5`
876
+ * resolves the same as `claude-haiku-4-5`). Returns undefined for unknown models.
877
+ * The table ({@link MODEL_PRICES}) carries native-maker first-party rates only —
878
+ * see `scripts/gen-text-prices.mjs`.
879
+ */
880
+ declare function getModelPrice(modelId: string): ProviderCost | undefined;
862
881
  interface LCRConfig {
863
882
  /**
864
883
  * Map of logical model name -> providers to try, cheapest-first.
@@ -867,6 +886,16 @@ interface LCRConfig {
867
886
  models: Record<string, ProviderEntry[]>;
868
887
  /** Sort each model's providers cheapest-first by `cost` before routing. */
869
888
  autoSort?: boolean;
889
+ /**
890
+ * Fill any provider entry that has no explicit `cost` from the bundled price
891
+ * table ({@link MODEL_PRICES}), looked up by the entry's `model.modelId`. A
892
+ * native-vendor route then needs zero hand-typed pricing; a flat-discount
893
+ * aggregator just adds `discount` (see {@link ProviderEntry}). Off by default,
894
+ * so unpriced entries stay unpriced (the pre-existing behavior). Turning it on
895
+ * never re-prices a model you priced yourself: an explicit `cost` always
896
+ * wins. Pairs naturally with `autoSort` and `onCost`/`onCall`.
897
+ */
898
+ autoPrice?: boolean;
870
899
  /** Idle window after which routing snaps back to the cheapest provider. Default 60s. */
871
900
  resetIntervalMs?: number;
872
901
  /** Called when a provider errors and routing falls through to the next. */
@@ -912,4 +941,4 @@ type LCRRouter = (modelName: string) => LanguageModelV3;
912
941
  */
913
942
  declare function createLCR(config: LCRConfig): LCRRouter;
914
943
 
915
- export { type BillableContext, type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type HttpSinkOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaJobHandle, type MediaJobStatus, type MediaLCR, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPollResult, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaStatusRequest, type MediaStatusResult, type MediaSubmitOptions, type MediaSubmitRequest, type MediaSubmitResult, type MediaUnit, type MediaUsage, OFFICIAL_PRICES, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, billableUnits, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createFalMediaAdapter, createHttpSink, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, durationFromInput, formatCallRecord, isAbortError, isNetworkError, isRetryableError, normalizedCents, priceCents, rankRoutes, referenceMegapixels, shouldFailover };
944
+ export { type BillableContext, type CallRecord, type CostEvent, DEFAULT_REFERENCE, type ErrorKind, type FormatOptions, type HttpSinkOptions, type LCRConfig, type LCRRouter, MEDIA_PRICING, MODEL_PRICES, type MediaAdapter, type MediaCostEvent, type MediaGenerateRequest, type MediaGenerateResult, type MediaJobHandle, type MediaJobStatus, type MediaLCR, type MediaLCRConfig, type MediaModality, type MediaModelDef, type MediaOutput, type MediaPollResult, type MediaPricing, type MediaRegistry, type MediaRoute, type MediaRunResult, type MediaStatusRequest, type MediaStatusResult, type MediaSubmitOptions, type MediaSubmitRequest, type MediaSubmitResult, type MediaUnit, type MediaUsage, OFFICIAL_PRICES, type PriceComparisonRow, type ProviderCost, type ProviderEntry, type RankedRoute, type ReferenceSpec, type RouteAttempt, billableUnits, cheapestRoute, classifyError, classifyErrorKind, comparePrices, createFalMediaAdapter, createHttpSink, createKunavoMediaAdapter, createLCR, createMediaLCR, createRunwareMediaAdapter, durationFromInput, formatCallRecord, getModelPrice, isAbortError, isNetworkError, isRetryableError, normalizedCents, priceCents, rankRoutes, referenceMegapixels, shouldFailover };
package/dist/index.js CHANGED
@@ -588,6 +588,239 @@ function createHttpSink(options) {
588
588
  };
589
589
  }
590
590
 
591
+ // src/text-prices.ts
592
+ var MODEL_PRICES = {
593
+ "chatgpt-4o-latest": { input: 5, output: 15 },
594
+ "claude-3-7-sonnet-20250219": { input: 3, output: 15, cacheRead: 0.3 },
595
+ "claude-3-haiku-20240307": { input: 0.25, output: 1.25, cacheRead: 0.03 },
596
+ "claude-3-opus-20240229": { input: 15, output: 75, cacheRead: 1.5 },
597
+ "claude-4-opus-20250514": { input: 15, output: 75, cacheRead: 1.5 },
598
+ "claude-4-sonnet-20250514": { input: 3, output: 15, cacheRead: 0.3 },
599
+ "claude-fable-5": { input: 10, output: 50, cacheRead: 1 },
600
+ "claude-haiku-4-5": { input: 1, output: 5, cacheRead: 0.1 },
601
+ "claude-haiku-4-5-20251001": { input: 1, output: 5, cacheRead: 0.1 },
602
+ "claude-opus-4-1": { input: 15, output: 75, cacheRead: 1.5 },
603
+ "claude-opus-4-1-20250805": { input: 15, output: 75, cacheRead: 1.5 },
604
+ "claude-opus-4-20250514": { input: 15, output: 75, cacheRead: 1.5 },
605
+ "claude-opus-4-5": { input: 5, output: 25, cacheRead: 0.5 },
606
+ "claude-opus-4-5-20251101": { input: 5, output: 25, cacheRead: 0.5 },
607
+ "claude-opus-4-6": { input: 5, output: 25, cacheRead: 0.5 },
608
+ "claude-opus-4-6-20260205": { input: 5, output: 25, cacheRead: 0.5 },
609
+ "claude-opus-4-7": { input: 5, output: 25, cacheRead: 0.5 },
610
+ "claude-opus-4-7-20260416": { input: 5, output: 25, cacheRead: 0.5 },
611
+ "claude-opus-4-8": { input: 5, output: 25, cacheRead: 0.5 },
612
+ "claude-sonnet-4-20250514": { input: 3, output: 15, cacheRead: 0.3 },
613
+ "claude-sonnet-4-5": { input: 3, output: 15, cacheRead: 0.3 },
614
+ "claude-sonnet-4-5-20250929": { input: 3, output: 15, cacheRead: 0.3 },
615
+ "claude-sonnet-4-6": { input: 3, output: 15, cacheRead: 0.3 },
616
+ "codestral-2405": { input: 1, output: 3 },
617
+ "codestral-2508": { input: 0.3, output: 0.9 },
618
+ "codestral-latest": { input: 1, output: 3 },
619
+ "codestral-mamba-latest": { input: 0.25, output: 0.25 },
620
+ "deepseek-chat": { input: 0.28, output: 0.42, cacheRead: 0.028 },
621
+ "deepseek-coder": { input: 0.14, output: 0.28 },
622
+ "deepseek-r1": { input: 0.55, output: 2.19 },
623
+ "deepseek-reasoner": { input: 0.28, output: 0.42, cacheRead: 0.028 },
624
+ "deepseek-v3": { input: 0.27, output: 1.1, cacheRead: 0.07 },
625
+ "deepseek-v3.2": { input: 0.28, output: 0.4 },
626
+ "devstral-2512": { input: 0.4, output: 2 },
627
+ "devstral-latest": { input: 0.4, output: 2 },
628
+ "devstral-medium-2507": { input: 0.4, output: 2 },
629
+ "devstral-medium-latest": { input: 0.4, output: 2 },
630
+ "devstral-small-2505": { input: 0.1, output: 0.3 },
631
+ "devstral-small-2507": { input: 0.1, output: 0.3 },
632
+ "devstral-small-latest": { input: 0.1, output: 0.3 },
633
+ "gemini-2.0-flash": { input: 0.1, output: 0.4, cacheRead: 0.025 },
634
+ "gemini-2.0-flash-001": { input: 0.1, output: 0.4, cacheRead: 0.025 },
635
+ "gemini-2.0-flash-lite": { input: 0.075, output: 0.3, cacheRead: 0.01875 },
636
+ "gemini-2.0-flash-lite-001": { input: 0.075, output: 0.3, cacheRead: 0.01875 },
637
+ "gemini-2.5-computer-use-preview-10-2025": { input: 1.25, output: 10 },
638
+ "gemini-2.5-flash": { input: 0.3, output: 2.5, cacheRead: 0.03 },
639
+ "gemini-2.5-flash-lite": { input: 0.1, output: 0.4, cacheRead: 0.01 },
640
+ "gemini-2.5-flash-lite-preview-06-17": { input: 0.1, output: 0.4, cacheRead: 0.025 },
641
+ "gemini-2.5-flash-lite-preview-09-2025": { input: 0.1, output: 0.4, cacheRead: 0.01 },
642
+ "gemini-2.5-flash-native-audio-latest": { input: 0.3, output: 2.5 },
643
+ "gemini-2.5-flash-native-audio-preview-09-2025": { input: 0.3, output: 2.5 },
644
+ "gemini-2.5-flash-native-audio-preview-12-2025": { input: 0.3, output: 2.5 },
645
+ "gemini-2.5-flash-preview-09-2025": { input: 0.3, output: 2.5, cacheRead: 0.075 },
646
+ "gemini-2.5-pro": { input: 1.25, output: 10, cacheRead: 0.125 },
647
+ "gemini-2.5-pro-preview-tts": { input: 1.25, output: 10, cacheRead: 0.125 },
648
+ "gemini-3-flash-preview": { input: 0.5, output: 3, cacheRead: 0.05 },
649
+ "gemini-3-pro-preview": { input: 2, output: 12, cacheRead: 0.2 },
650
+ "gemini-3.1-flash-lite": { input: 0.25, output: 1.5, cacheRead: 0.025 },
651
+ "gemini-3.1-flash-lite-preview": { input: 0.25, output: 1.5, cacheRead: 0.025 },
652
+ "gemini-3.1-flash-live-preview": { input: 0.75, output: 4.5 },
653
+ "gemini-3.1-pro-preview": { input: 2, output: 12, cacheRead: 0.2 },
654
+ "gemini-3.1-pro-preview-customtools": { input: 2, output: 12, cacheRead: 0.2 },
655
+ "gemini-3.5-flash": { input: 1.5, output: 9, cacheRead: 0.15 },
656
+ "gemini-exp-1206": { input: 0.3, output: 2.5, cacheRead: 0.03 },
657
+ "gemini-flash-latest": { input: 0.3, output: 2.5, cacheRead: 0.075 },
658
+ "gemini-flash-lite-latest": { input: 0.1, output: 0.4, cacheRead: 0.025 },
659
+ "gemini-gemma-2-27b-it": { input: 0.35, output: 1.05 },
660
+ "gemini-gemma-2-9b-it": { input: 0.35, output: 1.05 },
661
+ "gemini-pro-latest": { input: 1.25, output: 10, cacheRead: 0.125 },
662
+ "gemini-robotics-er-1.5-preview": { input: 0.3, output: 2.5 },
663
+ "gpt-3.5-turbo": { input: 0.5, output: 1.5 },
664
+ "gpt-3.5-turbo-0125": { input: 0.5, output: 1.5 },
665
+ "gpt-3.5-turbo-1106": { input: 1, output: 2 },
666
+ "gpt-3.5-turbo-16k": { input: 3, output: 4 },
667
+ "gpt-4": { input: 30, output: 60 },
668
+ "gpt-4-0125-preview": { input: 10, output: 30 },
669
+ "gpt-4-0314": { input: 30, output: 60 },
670
+ "gpt-4-0613": { input: 30, output: 60 },
671
+ "gpt-4-1106-preview": { input: 10, output: 30 },
672
+ "gpt-4-turbo": { input: 10, output: 30 },
673
+ "gpt-4-turbo-2024-04-09": { input: 10, output: 30 },
674
+ "gpt-4-turbo-preview": { input: 10, output: 30 },
675
+ "gpt-4.1": { input: 2, output: 8, cacheRead: 0.5 },
676
+ "gpt-4.1-2025-04-14": { input: 2, output: 8, cacheRead: 0.5 },
677
+ "gpt-4.1-mini": { input: 0.4, output: 1.6, cacheRead: 0.1 },
678
+ "gpt-4.1-mini-2025-04-14": { input: 0.4, output: 1.6, cacheRead: 0.1 },
679
+ "gpt-4.1-nano": { input: 0.1, output: 0.4, cacheRead: 0.025 },
680
+ "gpt-4.1-nano-2025-04-14": { input: 0.1, output: 0.4, cacheRead: 0.025 },
681
+ "gpt-4o": { input: 2.5, output: 10, cacheRead: 1.25 },
682
+ "gpt-4o-2024-05-13": { input: 5, output: 15 },
683
+ "gpt-4o-2024-08-06": { input: 2.5, output: 10, cacheRead: 1.25 },
684
+ "gpt-4o-2024-11-20": { input: 2.5, output: 10, cacheRead: 1.25 },
685
+ "gpt-4o-audio-preview": { input: 2.5, output: 10 },
686
+ "gpt-4o-audio-preview-2024-12-17": { input: 2.5, output: 10 },
687
+ "gpt-4o-audio-preview-2025-06-03": { input: 2.5, output: 10 },
688
+ "gpt-4o-mini": { input: 0.15, output: 0.6, cacheRead: 0.075 },
689
+ "gpt-4o-mini-2024-07-18": { input: 0.15, output: 0.6, cacheRead: 0.075 },
690
+ "gpt-4o-mini-audio-preview": { input: 0.15, output: 0.6 },
691
+ "gpt-4o-mini-audio-preview-2024-12-17": { input: 0.15, output: 0.6 },
692
+ "gpt-4o-mini-realtime-preview": { input: 0.6, output: 2.4, cacheRead: 0.3 },
693
+ "gpt-4o-mini-realtime-preview-2024-12-17": { input: 0.6, output: 2.4, cacheRead: 0.3 },
694
+ "gpt-4o-mini-search-preview": { input: 0.15, output: 0.6, cacheRead: 0.075 },
695
+ "gpt-4o-mini-search-preview-2025-03-11": { input: 0.15, output: 0.6, cacheRead: 0.075 },
696
+ "gpt-4o-realtime-preview": { input: 5, output: 20, cacheRead: 2.5 },
697
+ "gpt-4o-realtime-preview-2024-12-17": { input: 5, output: 20, cacheRead: 2.5 },
698
+ "gpt-4o-realtime-preview-2025-06-03": { input: 5, output: 20, cacheRead: 2.5 },
699
+ "gpt-4o-search-preview": { input: 2.5, output: 10, cacheRead: 1.25 },
700
+ "gpt-4o-search-preview-2025-03-11": { input: 2.5, output: 10, cacheRead: 1.25 },
701
+ "gpt-5": { input: 1.25, output: 10, cacheRead: 0.125 },
702
+ "gpt-5-2025-08-07": { input: 1.25, output: 10, cacheRead: 0.125 },
703
+ "gpt-5-chat": { input: 1.25, output: 10, cacheRead: 0.125 },
704
+ "gpt-5-chat-latest": { input: 1.25, output: 10, cacheRead: 0.125 },
705
+ "gpt-5-mini": { input: 0.25, output: 2, cacheRead: 0.025 },
706
+ "gpt-5-mini-2025-08-07": { input: 0.25, output: 2, cacheRead: 0.025 },
707
+ "gpt-5-nano": { input: 0.05, output: 0.4, cacheRead: 5e-3 },
708
+ "gpt-5-nano-2025-08-07": { input: 0.05, output: 0.4, cacheRead: 5e-3 },
709
+ "gpt-5-search-api": { input: 1.25, output: 10, cacheRead: 0.125 },
710
+ "gpt-5-search-api-2025-10-14": { input: 1.25, output: 10, cacheRead: 0.125 },
711
+ "gpt-5.1": { input: 1.25, output: 10, cacheRead: 0.125 },
712
+ "gpt-5.1-2025-11-13": { input: 1.25, output: 10, cacheRead: 0.125 },
713
+ "gpt-5.1-chat-latest": { input: 1.25, output: 10, cacheRead: 0.125 },
714
+ "gpt-5.2": { input: 1.75, output: 14, cacheRead: 0.175 },
715
+ "gpt-5.2-2025-12-11": { input: 1.75, output: 14, cacheRead: 0.175 },
716
+ "gpt-5.2-chat-latest": { input: 1.75, output: 14, cacheRead: 0.175 },
717
+ "gpt-5.3-chat-latest": { input: 1.75, output: 14, cacheRead: 0.175 },
718
+ "gpt-5.4": { input: 2.5, output: 15, cacheRead: 0.25 },
719
+ "gpt-5.4-2026-03-05": { input: 2.5, output: 15, cacheRead: 0.25 },
720
+ "gpt-5.4-mini": { input: 0.75, output: 4.5, cacheRead: 0.075 },
721
+ "gpt-5.4-mini-2026-03-17": { input: 0.75, output: 4.5, cacheRead: 0.075 },
722
+ "gpt-5.4-nano": { input: 0.2, output: 1.25, cacheRead: 0.02 },
723
+ "gpt-5.4-nano-2026-03-17": { input: 0.2, output: 1.25, cacheRead: 0.02 },
724
+ "gpt-5.5": { input: 5, output: 30, cacheRead: 0.5 },
725
+ "gpt-5.5-2026-04-23": { input: 5, output: 30, cacheRead: 0.5 },
726
+ "gpt-audio": { input: 2.5, output: 10 },
727
+ "gpt-audio-1.5": { input: 2.5, output: 10 },
728
+ "gpt-audio-2025-08-28": { input: 2.5, output: 10 },
729
+ "gpt-audio-mini": { input: 0.6, output: 2.4 },
730
+ "gpt-audio-mini-2025-10-06": { input: 0.6, output: 2.4 },
731
+ "gpt-audio-mini-2025-12-15": { input: 0.6, output: 2.4 },
732
+ "gpt-realtime": { input: 4, output: 16, cacheRead: 0.4 },
733
+ "gpt-realtime-1.5": { input: 4, output: 16, cacheRead: 0.4 },
734
+ "gpt-realtime-2": { input: 4, output: 16, cacheRead: 0.4 },
735
+ "gpt-realtime-2025-08-28": { input: 4, output: 16, cacheRead: 0.4 },
736
+ "gpt-realtime-mini": { input: 0.6, output: 2.4 },
737
+ "gpt-realtime-mini-2025-10-06": { input: 0.6, output: 2.4, cacheRead: 0.06 },
738
+ "gpt-realtime-mini-2025-12-15": { input: 0.6, output: 2.4, cacheRead: 0.06 },
739
+ "grok-2": { input: 2, output: 10 },
740
+ "grok-2-1212": { input: 2, output: 10 },
741
+ "grok-2-latest": { input: 2, output: 10 },
742
+ "grok-2-vision": { input: 2, output: 10 },
743
+ "grok-2-vision-1212": { input: 2, output: 10 },
744
+ "grok-2-vision-latest": { input: 2, output: 10 },
745
+ "grok-3": { input: 3, output: 15, cacheRead: 0.75 },
746
+ "grok-3-beta": { input: 3, output: 15, cacheRead: 0.75 },
747
+ "grok-3-fast-beta": { input: 5, output: 25, cacheRead: 1.25 },
748
+ "grok-3-fast-latest": { input: 5, output: 25, cacheRead: 1.25 },
749
+ "grok-3-latest": { input: 3, output: 15, cacheRead: 0.75 },
750
+ "grok-3-mini": { input: 0.3, output: 0.5, cacheRead: 0.075 },
751
+ "grok-3-mini-beta": { input: 0.3, output: 0.5, cacheRead: 0.075 },
752
+ "grok-3-mini-fast": { input: 0.6, output: 4, cacheRead: 0.15 },
753
+ "grok-3-mini-fast-beta": { input: 0.6, output: 4, cacheRead: 0.15 },
754
+ "grok-3-mini-fast-latest": { input: 0.6, output: 4, cacheRead: 0.15 },
755
+ "grok-3-mini-latest": { input: 0.3, output: 0.5, cacheRead: 0.075 },
756
+ "grok-4": { input: 3, output: 15 },
757
+ "grok-4-0709": { input: 3, output: 15 },
758
+ "grok-4-1-fast": { input: 0.2, output: 0.5, cacheRead: 0.05 },
759
+ "grok-4-1-fast-non-reasoning": { input: 0.2, output: 0.5, cacheRead: 0.05 },
760
+ "grok-4-1-fast-non-reasoning-latest": { input: 0.2, output: 0.5, cacheRead: 0.05 },
761
+ "grok-4-1-fast-reasoning": { input: 0.2, output: 0.5, cacheRead: 0.05 },
762
+ "grok-4-1-fast-reasoning-latest": { input: 0.2, output: 0.5, cacheRead: 0.05 },
763
+ "grok-4-fast-non-reasoning": { input: 0.2, output: 0.5, cacheRead: 0.05 },
764
+ "grok-4-fast-reasoning": { input: 0.2, output: 0.5, cacheRead: 0.05 },
765
+ "grok-4-latest": { input: 3, output: 15 },
766
+ "grok-4.20-0309-reasoning": { input: 2, output: 6, cacheRead: 0.2 },
767
+ "grok-4.20-beta-0309-non-reasoning": { input: 2, output: 6, cacheRead: 0.2 },
768
+ "grok-4.20-beta-0309-reasoning": { input: 2, output: 6, cacheRead: 0.2 },
769
+ "grok-4.20-multi-agent-beta-0309": { input: 2, output: 6, cacheRead: 0.2 },
770
+ "grok-4.3": { input: 1.25, output: 2.5, cacheRead: 0.2 },
771
+ "grok-4.3-latest": { input: 1.25, output: 2.5, cacheRead: 0.2 },
772
+ "grok-beta": { input: 5, output: 15 },
773
+ "grok-code-fast": { input: 0.2, output: 1.5, cacheRead: 0.02 },
774
+ "grok-code-fast-1": { input: 0.2, output: 1.5, cacheRead: 0.02 },
775
+ "grok-code-fast-1-0825": { input: 0.2, output: 1.5, cacheRead: 0.02 },
776
+ "grok-vision-beta": { input: 5, output: 15 },
777
+ "labs-devstral-small-2512": { input: 0.1, output: 0.3 },
778
+ "magistral-medium-1-2-2509": { input: 2, output: 5 },
779
+ "magistral-medium-2506": { input: 2, output: 5 },
780
+ "magistral-medium-2509": { input: 2, output: 5 },
781
+ "magistral-medium-latest": { input: 2, output: 5 },
782
+ "magistral-small-1-2-2509": { input: 0.5, output: 1.5 },
783
+ "magistral-small-2506": { input: 0.5, output: 1.5 },
784
+ "magistral-small-latest": { input: 0.5, output: 1.5 },
785
+ "ministral-3-14b-2512": { input: 0.2, output: 0.2 },
786
+ "ministral-3-3b-2512": { input: 0.1, output: 0.1 },
787
+ "ministral-3-8b-2512": { input: 0.15, output: 0.15 },
788
+ "ministral-8b-2512": { input: 0.15, output: 0.15 },
789
+ "ministral-8b-latest": { input: 0.15, output: 0.15 },
790
+ "mistral-large-2402": { input: 4, output: 12 },
791
+ "mistral-large-2407": { input: 3, output: 9 },
792
+ "mistral-large-2411": { input: 2, output: 6 },
793
+ "mistral-large-2512": { input: 0.5, output: 1.5 },
794
+ "mistral-large-3": { input: 0.5, output: 1.5 },
795
+ "mistral-large-latest": { input: 0.5, output: 1.5 },
796
+ "mistral-medium": { input: 2.7, output: 8.1 },
797
+ "mistral-medium-2312": { input: 2.7, output: 8.1 },
798
+ "mistral-medium-2505": { input: 0.4, output: 2 },
799
+ "mistral-medium-3-1-2508": { input: 0.4, output: 2 },
800
+ "mistral-medium-latest": { input: 0.4, output: 2 },
801
+ "mistral-small": { input: 0.1, output: 0.3 },
802
+ "mistral-small-3-2-2506": { input: 0.06, output: 0.18 },
803
+ "mistral-small-latest": { input: 0.06, output: 0.18 },
804
+ "mistral-tiny": { input: 0.25, output: 0.25 },
805
+ "o1": { input: 15, output: 60, cacheRead: 7.5 },
806
+ "o1-2024-12-17": { input: 15, output: 60, cacheRead: 7.5 },
807
+ "o3": { input: 2, output: 8, cacheRead: 0.5 },
808
+ "o3-2025-04-16": { input: 2, output: 8, cacheRead: 0.5 },
809
+ "o3-mini": { input: 1.1, output: 4.4, cacheRead: 0.55 },
810
+ "o3-mini-2025-01-31": { input: 1.1, output: 4.4, cacheRead: 0.55 },
811
+ "o4-mini": { input: 1.1, output: 4.4, cacheRead: 0.275 },
812
+ "o4-mini-2025-04-16": { input: 1.1, output: 4.4, cacheRead: 0.275 },
813
+ "open-codestral-mamba": { input: 0.25, output: 0.25 },
814
+ "open-mistral-7b": { input: 0.25, output: 0.25 },
815
+ "open-mistral-nemo": { input: 0.3, output: 0.3 },
816
+ "open-mistral-nemo-2407": { input: 0.3, output: 0.3 },
817
+ "open-mixtral-8x22b": { input: 2, output: 6 },
818
+ "open-mixtral-8x7b": { input: 0.7, output: 0.7 },
819
+ "pixtral-12b-2409": { input: 0.15, output: 0.15 },
820
+ "pixtral-large-2411": { input: 2, output: 6 },
821
+ "pixtral-large-latest": { input: 2, output: 6 }
822
+ };
823
+
591
824
  // src/media-official.ts
592
825
  var OFFICIAL_PRICES = {
593
826
  "alibaba/qwen-image": { unit: "image", cents: 3.5 },
@@ -1600,6 +1833,17 @@ async function safeText2(res) {
1600
1833
  }
1601
1834
 
1602
1835
  // src/index.ts
1836
+ function getModelPrice(modelId) {
1837
+ if (!modelId) return void 0;
1838
+ const direct = MODEL_PRICES[modelId];
1839
+ if (direct) return direct;
1840
+ const slash = modelId.indexOf("/");
1841
+ if (slash !== -1) {
1842
+ const bare = MODEL_PRICES[modelId.slice(slash + 1)];
1843
+ if (bare) return bare;
1844
+ }
1845
+ return void 0;
1846
+ }
1603
1847
  function isLanguageModel(entry) {
1604
1848
  return typeof entry.doGenerate === "function";
1605
1849
  }
@@ -1610,9 +1854,25 @@ function normalize(entry) {
1610
1854
  return {
1611
1855
  model: entry.model,
1612
1856
  label: entry.label ?? entry.model.provider,
1613
- cost: entry.cost
1857
+ cost: entry.cost,
1858
+ discount: entry.discount
1614
1859
  };
1615
1860
  }
1861
+ function applyDiscount(cost, discount) {
1862
+ const f = 1 - discount;
1863
+ return {
1864
+ input: cost.input * f,
1865
+ output: cost.output * f,
1866
+ ...cost.cacheRead !== void 0 ? { cacheRead: cost.cacheRead * f } : {}
1867
+ };
1868
+ }
1869
+ function withAutoPrice(p, autoPrice) {
1870
+ const { discount, ...rest } = p;
1871
+ if (!autoPrice || rest.cost !== void 0) return rest;
1872
+ const base = getModelPrice(rest.model.modelId);
1873
+ if (!base) return rest;
1874
+ return { ...rest, cost: discount !== void 0 ? applyDiscount(base, discount) : base };
1875
+ }
1616
1876
  function priceKey(p) {
1617
1877
  return p.cost ? p.cost.input + p.cost.output : Number.POSITIVE_INFINITY;
1618
1878
  }
@@ -1624,6 +1884,7 @@ function createLCR(config) {
1624
1884
  const {
1625
1885
  models,
1626
1886
  autoSort = false,
1887
+ autoPrice = false,
1627
1888
  resetIntervalMs,
1628
1889
  onError,
1629
1890
  onCost,
@@ -1638,7 +1899,13 @@ function createLCR(config) {
1638
1899
  }
1639
1900
  const routed = /* @__PURE__ */ new Map();
1640
1901
  for (const [name, entries] of Object.entries(models)) {
1641
- let providers = entries.map(normalize).map((p) => withDefaultCacheRead(p, defaultCacheReadRatio));
1902
+ for (const entry of entries) {
1903
+ const d = entry.discount;
1904
+ if (d !== void 0 && (d < 0 || d >= 1)) {
1905
+ throw new Error(`ai-lcr: discount must be in [0, 1) for model "${name}", got ${d}`);
1906
+ }
1907
+ }
1908
+ let providers = entries.map(normalize).map((p) => withAutoPrice(p, autoPrice)).map((p) => withDefaultCacheRead(p, defaultCacheReadRatio));
1642
1909
  if (autoSort) {
1643
1910
  providers = [...providers].sort((a, b) => priceKey(a) - priceKey(b));
1644
1911
  }
@@ -1660,6 +1927,7 @@ function createLCR(config) {
1660
1927
  export {
1661
1928
  DEFAULT_REFERENCE,
1662
1929
  MEDIA_PRICING,
1930
+ MODEL_PRICES,
1663
1931
  OFFICIAL_PRICES,
1664
1932
  billableUnits,
1665
1933
  cheapestRoute,
@@ -1674,6 +1942,7 @@ export {
1674
1942
  createRunwareMediaAdapter,
1675
1943
  durationFromInput,
1676
1944
  formatCallRecord,
1945
+ getModelPrice,
1677
1946
  isAbortError,
1678
1947
  isNetworkError,
1679
1948
  isRetryableError,
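The hunks above add a lookup (`getModelPrice`), a scaling step (`applyDiscount`), a fill-in step (`withAutoPrice`), and a range check on `discount` inside `createLCR`. A minimal, self-contained sketch of how these pieces appear to interact follows. It does not import `ai-lcr`. `PRICES`, `lookupPrice`, `discounted`, and `resolveCost` are hypothetical names used only for illustration, the table is a three-row excerpt, and entries carry a flat `modelId` rather than the `model.modelId` the real code reads.

```javascript
// Illustrative excerpt only: three rows copied from the MODEL_PRICES table in this diff.
const PRICES = {
  "gpt-5-mini": { input: 0.25, output: 2, cacheRead: 0.025 },
  "grok-4": { input: 3, output: 15 },
  "mistral-large-latest": { input: 0.5, output: 1.5 },
};

// Same lookup order as getModelPrice: the exact id first, then the id with
// everything up to and including the first "/" removed.
function lookupPrice(modelId) {
  if (!modelId) return undefined;
  const direct = PRICES[modelId];
  if (direct) return direct;
  const slash = modelId.indexOf("/");
  return slash === -1 ? undefined : PRICES[modelId.slice(slash + 1)];
}

// Same scaling as applyDiscount: each priced field is multiplied by (1 - discount).
function discounted(cost, discount) {
  const f = 1 - discount;
  const out = { input: cost.input * f, output: cost.output * f };
  if (cost.cacheRead !== undefined) out.cacheRead = cost.cacheRead * f;
  return out;
}

// Hypothetical helper combining the createLCR range check with withAutoPrice.
function resolveCost(entry, autoPrice) {
  const d = entry.discount;
  // The range check runs whether or not autoPrice is enabled.
  if (d !== undefined && (d < 0 || d >= 1)) {
    throw new Error(`discount must be in [0, 1), got ${d}`);
  }
  // An explicit cost always wins; the discount is then ignored.
  if (!autoPrice || entry.cost !== undefined) return entry.cost;
  const base = lookupPrice(entry.modelId);
  if (!base) return undefined; // unknown model: the entry stays unpriced
  return d !== undefined ? discounted(base, d) : base;
}

const entries = [
  { label: "list", modelId: "openai/gpt-5-mini" },
  { label: "reseller", modelId: "gpt-5-mini", discount: 0.5 },
  { label: "manual", modelId: "gpt-5-mini", cost: { input: 9, output: 9 }, discount: 0.5 },
  { label: "unknown", modelId: "not-in-table" },
];
const resolved = entries.map((e) => ({ label: e.label, cost: resolveCost(e, true) }));

// Same sort key as priceKey: input + output, with unpriced entries last.
const key = (p) => (p.cost ? p.cost.input + p.cost.output : Number.POSITIVE_INFINITY);
const order = [...resolved].sort((a, b) => key(a) - key(b)).map((p) => p.label);
console.log(order);
```

Two behaviours are worth noting under these assumptions: a `discount` on an entry that already has an explicit `cost` has no effect on the price, and an entry whose model id is missing from the table stays unpriced, so `autoSort` places it last.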
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "ai-lcr",
3
- "version": "0.6.0",
3
+ "version": "0.6.1",
4
4
  "description": "Least Cost Routing for LLMs — route every model call to the cheapest available provider, fall back automatically, and track real cost. Built for the Vercel AI SDK.",
5
5
  "keywords": [
6
6
  "ai",