ai-lcr 0.6.4 → 0.7.0
- package/CHANGELOG.md +63 -0
- package/README.md +64 -3
- package/README.zh-CN.md +53 -2
- package/dist/index.cjs +78 -5
- package/dist/index.js +78 -5
- package/package.json +1 -1
package/CHANGELOG.md
CHANGED
@@ -4,6 +4,69 @@ All notable changes to `ai-lcr` are documented here. The format follows
 [Keep a Changelog](https://keepachangelog.com/), and the project adheres to
 [Semantic Versioning](https://semver.org/).
 
+## [0.7.0] — 2026-06-20
+
+The text router now records the **provider-reported actual cost** when a provider
+returns one, instead of always estimating from the price table. The table becomes
+the routing input and the drift baseline (`estCostUsd`); the recorded `costUsd` is
+the real bill wherever the provider gives it.
+
+### Why
+
+A static price table can only encode one price per model, but an aggregator
+(OpenRouter) routes a single model across many sub-providers whose prices differ
+several-fold, picking one per call — so `tokens × table` is structurally unable to
+match the bill for multi-provider models (measured: `deepseek-v4-pro` reconciled at
+~57% of the real cost, while single-provider models like Gemini/Claude/GPT matched
+at 100%). The provider's own number already accounts for which sub-provider served,
+every token kind (cache read/write, reasoning), and fees — none of which a flat
+table can track.
+
+### Added
+
+- **`costUsd` prefers the provider-reported actual cost** (text path). Read from
+  OpenRouter's `providerMetadata.openrouter.usage` —
+  `costDetails.upstreamInferenceCost` (the real upstream / BYOK model spend) when
+  present, otherwise `cost` (the credit charge) — and from an OpenAI-compatible
+  provider's `estimated_cost` on the raw usage body. Requires the caller to enable
+  usage accounting on the provider (e.g. OpenRouter `usage: { include: true }`);
+  without it, behavior is unchanged.
+- **`estCostUsd` is now set on text records** (previously media-only) — the
+  price-table prediction for the same usage. `costUsd − estCostUsd` is the
+  price-table drift signal, so a dashboard's drift panel now works for text too.
+
+### Changed
+
+- When no provider cost is reported, `costUsd` still equals the price-table
+  estimate (and `estCostUsd` equals it, so no drift is flagged) — a pure fallback,
+  fully backward-compatible. The streaming path reads the reported cost from the
+  `finish` chunk's `providerMetadata`.
+
+## [0.6.5] — 2026-06-16
+
+Bundled price table now covers the open-weights labs, not just the Western
+proprietary makers — so `autoPrice` resolves Qwen, Kimi, MiniMax, and GLM routes
+out of the box (previously they needed a hand-typed `cost`).
+
+### Added
+
+- **`MODEL_PRICES` now includes the open-weights makers** — Qwen (Alibaba /
+  `dashscope`), Kimi (Moonshot), MiniMax, and GLM (Z.ai), alongside the existing
+  DeepSeek. 55 new first-party list prices (229 → 284 entries), keyed by each
+  maker's own bare model id (`qwen-plus`, `kimi-k2.5`, `MiniMax-M2`, `glm-4.6`,
+  …). The generator's `ALLOW` set gained `dashscope` / `moonshot` / `minimax` /
+  `zai`; no existing price changed.
+
+### Notes
+
+- These are **first-party** list rates (the maker's own API). A dedicated
+  inference *host* (DeepInfra, …) is often cheaper and uses HF-style ids
+  (`Qwen/Qwen3-…`) that won't match these bare keys — for an aggregator route,
+  keep passing an explicit `cost` or `discount`. The bundled rate is the
+  `autoPrice` baseline for the maker's own provider and a reference for the rest.
+- Aggregators (deepinfra, together, fireworks, groq, openrouter) remain
+  deliberately excluded from the table — their prices drift per-model.
+
 ## [0.6.4] — 2026-06-16
 
 DX improvements that eliminate per-project boilerplate for consumers.
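The 0.7.0 entry above defines the price-table drift signal as `costUsd − estCostUsd`. A minimal sketch of how a consumer of call records might turn that into a relative drift check follows. The `CostFields` shape and the `driftRatio` / `isDrifting` helpers are hypothetical names for illustration, not part of `ai-lcr`, and the 0.2 tolerance is an assumed default.

```typescript
// Hypothetical subset of a call record: only the two cost fields from the changelog.
interface CostFields {
  costUsd: number;     // recorded cost (provider-reported when available)
  estCostUsd?: number; // price-table estimate for the same usage
}

// Relative drift of the recorded cost against the table estimate.
// Returns undefined when there is no usable estimate to compare against.
function driftRatio(rec: CostFields): number | undefined {
  if (rec.estCostUsd === undefined || rec.estCostUsd <= 0) return undefined;
  return (rec.costUsd - rec.estCostUsd) / rec.estCostUsd;
}

// Flag a record whose drift exceeds a tolerance (0.2 means plus or minus 20%; assumed).
function isDrifting(rec: CostFields, tolerance = 0.2): boolean {
  const d = driftRatio(rec);
  return d !== undefined && Math.abs(d) > tolerance;
}
```

When no provider cost is reported, the changelog says `costUsd` equals `estCostUsd`, so a check like this would yield a drift of zero and flag nothing.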
package/README.md
CHANGED
@@ -96,7 +96,7 @@ const lcr = createLCR({
 });
 ```
 
-The same pattern works for any vendor's native SDK provider — `@ai-sdk/anthropic`, `@ai-sdk/google`, `@ai-sdk/openai`, `@ai-sdk/xai`, and so on.
+The same pattern works for any vendor's native SDK provider — `@ai-sdk/anthropic`, `@ai-sdk/google`, `@ai-sdk/openai`, `@ai-sdk/xai`, and so on. `ProviderEntry` accepts `AnyLanguageModel` — a duck-typed interface (`doGenerate` + `doStream` + `provider` + `modelId`) that any AI SDK model satisfies regardless of spec version (V2 or V3), so you never need `as`-casts at the integration boundary. Native APIs are narrow (only that vendor's models) but featureful; aggregators are broad. **Official-first + aggregator-fallback** is the natural LCR shape.
 
 ## Cheapest route for open-weights models (DeepInfra)
 
@@ -138,9 +138,50 @@ const lcr = createLCR({
 
 DeepInfra carries open weights only — no first-party Claude / GPT / Gemini. For those closed models, route through OpenRouter or a discount gateway instead.
 
+## Skip the boilerplate (`DEFAULT_PROVIDERS`)
+
+Every project that routes through OpenRouter, DeepInfra, TokenMart, DeepSeek, etc. redeclares the same `baseURL` + `apiKeyEnv` pair. `DEFAULT_PROVIDERS` is a bundled dictionary — import what you need instead of copy-pasting URLs:
+
+```ts
+import { DEFAULT_PROVIDERS } from "ai-lcr";
+import { createOpenAICompatible } from "@ai-sdk/openai-compatible";
+
+// Pick the providers you use — type-safe, no hardcoded URLs.
+const deepinfra = createOpenAICompatible({
+  name: "deepinfra",
+  baseURL: DEFAULT_PROVIDERS.deepinfra.baseURL,
+  apiKey: process.env[DEFAULT_PROVIDERS.deepinfra.apiKeyEnv],
+});
+```
+
+Available providers:
+
+| Key | Base URL | Env var |
+|---|---|---|
+| `openrouter` | `https://openrouter.ai/api/v1` | `OPENROUTER_API_KEY` |
+| `deepinfra` | `https://api.deepinfra.com/v1/openai` | `DEEPINFRA_API_KEY` |
+| `tokenmart` | `https://model.service-inference.ai/v1` | `INFERENCE_API_KEY` |
+| `deepseek` | `https://api.deepseek.com` | `DEEPSEEK_API_KEY` |
+| `kunavo` | `https://api.kunavo.com/v1` | `KUNAVO_API_KEY` |
+| `runware` | `https://api.runware.ai/v1` | `RUNWARE_API_KEY` |
+| `fal` | `https://queue.fal.run` | `FAL_KEY` |
+
+A common pattern is to subset `DEFAULT_PROVIDERS` into a project-local type for compile-time safety:
+
+```ts
+import { DEFAULT_PROVIDERS } from "ai-lcr";
+
+type ProviderId = "deepinfra" | "openrouter";
+
+export const PROVIDERS = {
+  deepinfra: DEFAULT_PROVIDERS.deepinfra,
+  openrouter: DEFAULT_PROVIDERS.openrouter,
+} satisfies Record<ProviderId, { baseURL: string; apiKeyEnv: string }>;
+```
+
 ## Zero-config pricing (`autoPrice`)
 
-Typing `cost: { input, output }` for every provider is the tedious part. `autoPrice: true` fills any entry that has no explicit `cost` from a **bundled price table** (`MODEL_PRICES`) — official first-party rates for the native makers (OpenAI, Anthropic, Google, DeepSeek,
+Typing `cost: { input, output }` for every provider is the tedious part. `autoPrice: true` fills any entry that has no explicit `cost` from a **bundled price table** (`MODEL_PRICES`) — official first-party rates for the native makers (OpenAI, Anthropic, Google, xAI, Mistral, plus the open-weights labs DeepSeek, Qwen, Kimi, MiniMax, GLM), keyed by the bare model id you already pass to the provider:
 
 ```ts
 const lcr = createLCR({
@@ -161,7 +202,7 @@ Three rules keep it predictable:
 
 - **Off by default.** Unpriced entries stay unpriced (the pre-existing behavior), so turning `autoPrice` on never silently re-prices a model — and an **explicit `cost` always wins** over the table.
 - **`discount` is the reseller knob.** A flat-% aggregator (Kunavo −20%) becomes `discount: 0.2` instead of a hand-typed number; it scales input, output, and `cacheRead` alike, and only applies when the table fills the entry. Variable-discount providers (TokenMart) still want explicit per-model `cost`.
-- **Native makers only.** The table carries first-party list prices
+- **Native makers only.** The table carries first-party list prices, keyed by each maker's own bare id (`qwen-plus`, `glm-4.6`, `kimi-k2.5`, `MiniMax-M2`). It's the autoPrice baseline when you route through that maker's own API. Open-weights *hosts* (DeepInfra uses HF-style ids like `Qwen/Qwen3-…`) and breadth aggregators (OpenRouter) aren't keyed here — price those with explicit `cost` or `discount`.
 
 Look a price up yourself with `getModelPrice("claude-sonnet-4-6")`. The table is generated from [LiteLLM's price map](https://github.com/BerriAI/litellm) (MIT) — refresh with `node scripts/gen-text-prices.mjs`.
 
@@ -315,6 +356,26 @@ const lcr = createLCR({
 });
 ```
 
+### Convention-based sink (`createEnvSink`)
+
+If your app uses the standard env vars (`LCR_INGEST_URL`, `LCR_PROJECT`, `LCR_INGEST_KEY`), you don't need to wire `createHttpSink` at all — `createEnvSink` reads them for you and returns a ready-to-use `onCall` handler (or `undefined` when `LCR_INGEST_URL` is unset, so local dev stays quiet):
+
+```ts
+import { createEnvSink } from "ai-lcr";
+import { after } from "next/server";
+
+export const lcrCallSink = createEnvSink(after);
+// → use as `onCall: lcrCallSink` in createLCR
+```
+
+The only required argument is `dispatch` — a framework-specific fire-and-forget runner (Next.js: `after`, Cloudflare: `ctx.waitUntil`, plain servers: `(fn) => fn()`). Env vars:
+
+| Var | Required | Description |
+|---|---|---|
+| `LCR_INGEST_URL` | yes (no URL → sink is `undefined`) | Dashboard origin, e.g. `https://lcr.ideamarketfit.com` |
+| `LCR_PROJECT` | no | Project tag merged into each payload; falls back to `SITE_KEY` |
+| `LCR_INGEST_KEY` | no | Bearer token (only if the dashboard sets `INGEST_KEY`) |
+
 ### The companion dashboard ([`ai-lcr-dashboard`](https://github.com/ai-lcr/ai-lcr-dashboard))
 
 <p align="center">
package/README.zh-CN.md
CHANGED
@@ -96,7 +96,7 @@ const lcr = createLCR({
 });
 ```
 
-The same pattern works for any vendor's native SDK provider: `@ai-sdk/anthropic`, `@ai-sdk/google`, `@ai-sdk/openai`, `@ai-sdk/xai`
+The same pattern works for any vendor's native SDK provider: `@ai-sdk/anthropic`, `@ai-sdk/google`, `@ai-sdk/openai`, `@ai-sdk/xai`, and so on. `ProviderEntry` accepts `AnyLanguageModel`, a duck-typed interface (`doGenerate` + `doStream` + `provider` + `modelId`) that any AI SDK model satisfies whether it follows the V2 or the V3 spec, so no `as`-casts are needed at the integration boundary. Native APIs are narrow (only that vendor's own models) but feature-complete; aggregators are broad. **Official-first + aggregator-fallback** is the most natural shape for LCR.
 
 ## Cheapest route for open-weights models (DeepInfra)
 
@@ -138,6 +138,47 @@ const lcr = createLCR({
 
 DeepInfra carries open weights only, with no first-party Claude / GPT / Gemini. For those closed models, route through OpenRouter or a discount gateway.
 
+## Skip the boilerplate (`DEFAULT_PROVIDERS`)
+
+Every project that routes through OpenRouter, DeepInfra, TokenMart, DeepSeek, and so on has to redeclare the same `baseURL` + `apiKeyEnv` pair. `DEFAULT_PROVIDERS` is a bundled dictionary: import the entries you need instead of copy-pasting URLs:
+
+```ts
+import { DEFAULT_PROVIDERS } from "ai-lcr";
+import { createOpenAICompatible } from "@ai-sdk/openai-compatible";
+
+// Take what you need: type-safe, no hardcoded URLs.
+const deepinfra = createOpenAICompatible({
+  name: "deepinfra",
+  baseURL: DEFAULT_PROVIDERS.deepinfra.baseURL,
+  apiKey: process.env[DEFAULT_PROVIDERS.deepinfra.apiKeyEnv],
+});
+```
+
+Available providers:
+
+| Key | Base URL | Env var |
+|---|---|---|
+| `openrouter` | `https://openrouter.ai/api/v1` | `OPENROUTER_API_KEY` |
+| `deepinfra` | `https://api.deepinfra.com/v1/openai` | `DEEPINFRA_API_KEY` |
+| `tokenmart` | `https://model.service-inference.ai/v1` | `INFERENCE_API_KEY` |
+| `deepseek` | `https://api.deepseek.com` | `DEEPSEEK_API_KEY` |
+| `kunavo` | `https://api.kunavo.com/v1` | `KUNAVO_API_KEY` |
+| `runware` | `https://api.runware.ai/v1` | `RUNWARE_API_KEY` |
+| `fal` | `https://queue.fal.run` | `FAL_KEY` |
+
+A common pattern is to take a subset of `DEFAULT_PROVIDERS` and declare a project-level type for compile-time safety:
+
+```ts
+import { DEFAULT_PROVIDERS } from "ai-lcr";
+
+type ProviderId = "deepinfra" | "openrouter";
+
+export const PROVIDERS = {
+  deepinfra: DEFAULT_PROVIDERS.deepinfra,
+  openrouter: DEFAULT_PROVIDERS.openrouter,
+} satisfies Record<ProviderId, { baseURL: string; apiKeyEnv: string }>;
+```
+
 ## How it routes
 
 1. **Cheapest first.** Providers are tried in order: list them cheapest-first, or set `autoSort: true` to have them sorted by `cost` automatically.
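@@ -221,7 +262,17 @@ interface CallRecord {
 
 **Keeping the savings math honest:** `baselineKind` says which kind of baseline `baselineUsd` is. For text it is the **list price of the last-resort provider at the end of the chain** (`"last-leg"`; deliberately not the priciest leg, because prompt caching can make the provider with the lower list price more expensive on a cache-heavy call, and taking the maximum would conjure up "savings" from nothing). For media it is the **model maker's official first-party price** (`"official"`, computed from the actual seconds), falling back to the priciest route in your config when no official price is found (`"priciest-route"`, self-referential, showing only the cross-provider price spread).
 
-**Feeding a collector:** `createHttpSink` POSTs every record to any endpoint (on serverless, pass Next.js's `after` as `dispatch`
+**Feeding a collector:** `createHttpSink` POSTs every record to any endpoint (on serverless, pass Next.js's `after` as `dispatch` so the request is not cut off). If you use the standard env vars (`LCR_INGEST_URL`, `LCR_PROJECT`, `LCR_INGEST_KEY`), `createEnvSink` reads them all for you, in three lines:
+
+```ts
+import { createEnvSink } from "ai-lcr";
+import { after } from "next/server";
+export const lcrCallSink = createEnvSink(after);
+```
+
+When `LCR_INGEST_URL` is unset, the sink is `undefined`, so local dev stays quiet automatically. The only required argument is `dispatch`, a framework-specific fire-and-forget runner (Next.js: `after`; Cloudflare: `ctx.waitUntil`; long-running servers: `(fn) => fn()`).
+
+The companion self-hosted dashboard [`ai-lcr-dashboard`](https://github.com/ai-lcr/ai-lcr-dashboard) (Next.js + Postgres, one-click Vercel deploy) is built for these records: spend vs. savings trends, per-provider failover health, media $/second and $/image, and a **price-drift panel** that flags a model@provider route by name when its reported bill deviates from the price table by more than ±20% (a gap of about 100× is almost always a dollars-as-cents typo). It stores metadata only, never prompts or outputs.
 
 ## Supported providers
 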
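The README changes above describe `createEnvSink` only by its contract: it reads `LCR_INGEST_URL`, `LCR_PROJECT` (falling back to `SITE_KEY`) and `LCR_INGEST_KEY`, returns `undefined` when the URL is unset, and hands work to a `dispatch` runner. A rough sketch of that contract follows, under stated assumptions. `sketchEnvSink`, the injected `env` and `send` parameters, the header names, and the payload shape are all hypothetical; the real ingest path and wire format are not shown in this diff.

```typescript
// All names below are hypothetical; this is not the ai-lcr implementation.
type Task = () => Promise<void>;
type Dispatch = (task: Task) => void; // assumed shape, inferred from `(fn) => fn()`
type Env = Record<string, string | undefined>;
type Send = (url: string, headers: Record<string, string>, body: string) => Promise<void>;

function sketchEnvSink(dispatch: Dispatch, env: Env, send: Send) {
  const url = env["LCR_INGEST_URL"];
  if (!url) return undefined; // no URL: no sink, so local dev stays quiet

  const project = env["LCR_PROJECT"] ?? env["SITE_KEY"]; // documented fallback
  const key = env["LCR_INGEST_KEY"];

  return (record: Record<string, unknown>): void => {
    const headers: Record<string, string> = { "content-type": "application/json" };
    if (key) headers["authorization"] = `Bearer ${key}`; // assumed header format
    const body = JSON.stringify(project ? { ...record, project } : record);
    // Hand the network call to the framework's fire-and-forget runner.
    dispatch(() => send(url, headers, body));
  };
}
```

Injecting `env` and `send` is a choice made here to keep the sketch testable without a network; the documented API takes only `dispatch`.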
package/dist/index.cjs
CHANGED
@@ -341,6 +341,20 @@ function cacheSavingForUsage(cost, inputTokens, cacheReadTokens) {
   const cached = Math.min(Math.max(cacheReadTokens, 0), inputTokens);
   return cached / 1e6 * (cost.input - cost.cacheRead);
 }
+function reportedCost(providerMetadata, usage) {
+  const orUsage = providerMetadata?.openrouter?.usage;
+  if (orUsage) {
+    const upstream = orUsage.costDetails?.upstreamInferenceCost;
+    if (typeof upstream === "number" && upstream > 0) return upstream;
+    if (typeof orUsage.cost === "number") return orUsage.cost;
+  }
+  const raw = usage?.raw;
+  if (raw) {
+    const est = raw["estimated_cost"] ?? raw["cost"];
+    if (typeof est === "number") return est;
+  }
+  return void 0;
+}
 function requestIdFrom(options) {
   const raw = options.providerOptions?.lcr?.requestId;
   return typeof raw === "string" && raw.length > 0 ? raw : void 0;
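The precedence in `reportedCost` above, combined with the `?? estCostUsd ?? 0` fallback that `finalizeOk` applies in the next hunk, can be restated as a typed sketch. The interface names (`OpenRouterUsageLike`, `ProviderMetadataLike`, `UsageLike`) and the `recordedCost` wrapper are assumptions added for illustration; only the field names and the ordering come from the diff.

```typescript
// Hypothetical local types; only the field names are taken from the diff.
interface OpenRouterUsageLike {
  cost?: number;
  costDetails?: { upstreamInferenceCost?: number };
}
interface ProviderMetadataLike { openrouter?: { usage?: OpenRouterUsageLike } }
interface UsageLike { raw?: Record<string, unknown> }

function reportedCostSketch(meta?: ProviderMetadataLike, usage?: UsageLike): number | undefined {
  const orUsage = meta?.openrouter?.usage;
  if (orUsage) {
    const upstream = orUsage.costDetails?.upstreamInferenceCost;
    // A zero or missing upstream figure falls through to the credit charge.
    if (typeof upstream === "number" && upstream > 0) return upstream;
    if (typeof orUsage.cost === "number") return orUsage.cost;
  }
  const raw = usage?.raw;
  if (raw) {
    const est = raw["estimated_cost"] ?? raw["cost"];
    if (typeof est === "number") return est;
  }
  return undefined;
}

// Mirrors the selection in finalizeOk: reported cost, else table estimate, else 0.
function recordedCost(meta?: ProviderMetadataLike, usage?: UsageLike, estCostUsd?: number): number {
  return reportedCostSketch(meta, usage) ?? estCostUsd ?? 0;
}
```

One consequence worth noting: because the chain uses `??`, a provider-reported cost of exactly `0` is kept as the recorded cost and does not fall back to the table estimate.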
@@ -539,12 +553,13 @@ var LcrFallbackModel = class {
     return baseline;
   }
   /** Winner settled: record the attempt, fire `onCost` (compat) + `onCall`. */
-  finalizeOk(ctx, provider, attemptStart, usage, ttftMs) {
+  finalizeOk(ctx, provider, attemptStart, usage, ttftMs, providerMetadata) {
     ctx.attempts.push({ provider: provider.label, ok: true, latencyMs: Date.now() - attemptStart });
     const inputTokens = usage?.inputTokens?.total ?? 0;
     const outputTokens = usage?.outputTokens?.total ?? 0;
     const cacheReadTokens = usage?.inputTokens?.cacheRead ?? 0;
-    const
+    const estCostUsd = provider.cost ? costForUsage(provider.cost, inputTokens, outputTokens, cacheReadTokens) : void 0;
+    const costUsd = reportedCost(providerMetadata, usage) ?? estCostUsd ?? 0;
     const cachedSavingUsd = provider.cost ? cacheSavingForUsage(provider.cost, inputTokens, cacheReadTokens) : 0;
     const usageMissing = inputTokens === 0 && outputTokens === 0;
     const emptyCompletion = inputTokens > 0 && outputTokens === 0;
@@ -579,6 +594,7 @@ var LcrFallbackModel = class {
       outputTokens,
       ...cacheReadTokens > 0 ? { cachedInputTokens: cacheReadTokens } : {},
       costUsd,
+      ...estCostUsd !== void 0 ? { estCostUsd } : {},
       ...baselineUsd !== void 0 ? { baselineUsd, baselineKind: "last-leg" } : {},
       ...cachedSavingUsd > 0 ? { cachedSavingUsd } : {},
       ...ctx.requestId ? { requestId: ctx.requestId } : {},
@@ -635,7 +651,7 @@ var LcrFallbackModel = class {
         }
         this.recordProviderSuccess(idx);
         this.settleSticky(idx);
-        this.finalizeOk(ctx, provider, attemptStart, result.usage);
+        this.finalizeOk(ctx, provider, attemptStart, result.usage, void 0, result.providerMetadata);
         if (cache && cacheKey !== void 0 && ctx.settled?.cacheable) {
           this.storeCache(cacheKey, { kind: "generate", result, meta: ctx.settled.meta });
         }
@@ -767,6 +783,7 @@ var LcrFallbackModel = class {
         const servingIdx = idx;
         const servingPos = p;
         let usage;
+        let finishProviderMetadata;
         let contentStreamed = false;
         let ttftMs;
         const stream = new ReadableStream({
@@ -783,6 +800,7 @@ var LcrFallbackModel = class {
                 if (done) break;
                 if (value.type === "finish") {
                   usage = value.usage;
+                  finishProviderMetadata = value.providerMetadata;
                   const out = value.usage?.outputTokens?.total ?? 0;
                   const inp = value.usage?.inputTokens?.total ?? 0;
                   if (inp > 0 && out === 0 && !contentStreamed && servingPos + 1 < n) {
@@ -797,7 +815,7 @@ var LcrFallbackModel = class {
               }
               self.recordProviderSuccess(servingIdx);
               self.settleSticky(servingIdx);
-              self.finalizeOk(ctx, servingProvider, servingAttemptStart, usage, ttftMs);
+              self.finalizeOk(ctx, servingProvider, servingAttemptStart, usage, ttftMs, finishProviderMetadata);
               controller.close();
             } catch (error) {
               self.emitError(error, servingProvider.label);
@@ -1003,6 +1021,16 @@ var MODEL_PRICES = {
   "gemini-gemma-2-9b-it": { input: 0.35, output: 1.05 },
   "gemini-pro-latest": { input: 1.25, output: 10, cacheRead: 0.125 },
   "gemini-robotics-er-1.5-preview": { input: 0.3, output: 2.5 },
+  "glm-4-32b-0414-128k": { input: 0.1, output: 0.1 },
+  "glm-4.5": { input: 0.6, output: 2.2 },
+  "glm-4.5-air": { input: 0.2, output: 1.1 },
+  "glm-4.5-airx": { input: 1.1, output: 4.5 },
+  "glm-4.5-x": { input: 2.2, output: 8.9 },
+  "glm-4.5v": { input: 0.6, output: 1.8 },
+  "glm-4.6": { input: 0.6, output: 2.2, cacheRead: 0.11 },
+  "glm-4.7": { input: 0.6, output: 2.2, cacheRead: 0.11 },
+  "glm-5": { input: 1, output: 3.2, cacheRead: 0.2 },
+  "glm-5-code": { input: 1.2, output: 5, cacheRead: 0.3 },
   "gpt-3.5-turbo": { input: 0.5, output: 1.5 },
   "gpt-3.5-turbo-0125": { input: 0.5, output: 1.5 },
   "gpt-3.5-turbo-1106": { input: 1, output: 2 },
@@ -1117,6 +1145,18 @@ var MODEL_PRICES = {
   "grok-code-fast-1": { input: 0.2, output: 1.5, cacheRead: 0.02 },
   "grok-code-fast-1-0825": { input: 0.2, output: 1.5, cacheRead: 0.02 },
   "grok-vision-beta": { input: 5, output: 15 },
+  "kimi-k2-0711-preview": { input: 0.6, output: 2.5, cacheRead: 0.15 },
+  "kimi-k2-0905-preview": { input: 0.6, output: 2.5, cacheRead: 0.15 },
+  "kimi-k2-thinking": { input: 0.6, output: 2.5, cacheRead: 0.15 },
+  "kimi-k2-thinking-turbo": { input: 1.15, output: 8, cacheRead: 0.15 },
+  "kimi-k2-turbo-preview": { input: 1.15, output: 8, cacheRead: 0.15 },
+  "kimi-k2.5": { input: 0.6, output: 3, cacheRead: 0.1 },
+  "kimi-k2.6": { input: 0.95, output: 4, cacheRead: 0.16 },
+  "kimi-latest": { input: 2, output: 5, cacheRead: 0.15 },
+  "kimi-latest-128k": { input: 2, output: 5, cacheRead: 0.15 },
+  "kimi-latest-32k": { input: 1, output: 3, cacheRead: 0.15 },
+  "kimi-latest-8k": { input: 0.2, output: 2, cacheRead: 0.15 },
+  "kimi-thinking-preview": { input: 0.6, output: 2.5, cacheRead: 0.15 },
   "labs-devstral-small-2512": { input: 0.1, output: 0.3 },
   "magistral-medium-1-2-2509": { input: 2, output: 5 },
   "magistral-medium-2506": { input: 2, output: 5 },
@@ -1125,6 +1165,12 @@ var MODEL_PRICES = {
   "magistral-small-1-2-2509": { input: 0.5, output: 1.5 },
   "magistral-small-2506": { input: 0.5, output: 1.5 },
   "magistral-small-latest": { input: 0.5, output: 1.5 },
+  "MiniMax-M2": { input: 0.3, output: 1.2, cacheRead: 0.03 },
+  "MiniMax-M2.1": { input: 0.3, output: 1.2, cacheRead: 0.03 },
+  "MiniMax-M2.1-lightning": { input: 0.3, output: 2.4, cacheRead: 0.03 },
+  "MiniMax-M2.5": { input: 0.3, output: 1.2, cacheRead: 0.03 },
+  "MiniMax-M2.5-lightning": { input: 0.3, output: 2.4, cacheRead: 0.03 },
+  "MiniMax-M3": { input: 0.6, output: 2.4, cacheRead: 0.12 },
   "ministral-3-14b-2512": { input: 0.2, output: 0.2 },
   "ministral-3-3b-2512": { input: 0.1, output: 0.1 },
   "ministral-3-8b-2512": { input: 0.15, output: 0.15 },
@@ -1145,6 +1191,16 @@ var MODEL_PRICES = {
   "mistral-small-3-2-2506": { input: 0.06, output: 0.18 },
   "mistral-small-latest": { input: 0.06, output: 0.18 },
   "mistral-tiny": { input: 0.25, output: 0.25 },
+  "moonshot-v1-128k": { input: 2, output: 5 },
+  "moonshot-v1-128k-0430": { input: 2, output: 5 },
+  "moonshot-v1-128k-vision-preview": { input: 2, output: 5 },
+  "moonshot-v1-32k": { input: 1, output: 3 },
+  "moonshot-v1-32k-0430": { input: 1, output: 3 },
+  "moonshot-v1-32k-vision-preview": { input: 1, output: 3 },
+  "moonshot-v1-8k": { input: 0.2, output: 2 },
+  "moonshot-v1-8k-0430": { input: 0.2, output: 2 },
+  "moonshot-v1-8k-vision-preview": { input: 0.2, output: 2 },
+  "moonshot-v1-auto": { input: 2, output: 5 },
   "o1": { input: 15, output: 60, cacheRead: 7.5 },
   "o1-2024-12-17": { input: 15, output: 60, cacheRead: 7.5 },
   "o3": { input: 2, output: 8, cacheRead: 0.5 },
@@ -1161,7 +1217,24 @@ var MODEL_PRICES = {
   "open-mixtral-8x7b": { input: 0.7, output: 0.7 },
   "pixtral-12b-2409": { input: 0.15, output: 0.15 },
   "pixtral-large-2411": { input: 2, output: 6 },
-  "pixtral-large-latest": { input: 2, output: 6 }
+  "pixtral-large-latest": { input: 2, output: 6 },
+  "qwen-coder": { input: 0.3, output: 1.5 },
+  "qwen-max": { input: 1.6, output: 6.4 },
+  "qwen-plus": { input: 0.4, output: 1.2 },
+  "qwen-plus-2025-01-25": { input: 0.4, output: 1.2 },
+  "qwen-plus-2025-04-28": { input: 0.4, output: 1.2 },
+  "qwen-plus-2025-07-14": { input: 0.4, output: 1.2 },
+  "qwen-turbo": { input: 0.05, output: 0.2 },
+  "qwen-turbo-2024-11-01": { input: 0.05, output: 0.2 },
+  "qwen-turbo-2025-04-28": { input: 0.05, output: 0.2 },
+  "qwen-turbo-latest": { input: 0.05, output: 0.2 },
+  "qwen3-next-80b-a3b-instruct": { input: 0.15, output: 1.2 },
+  "qwen3-next-80b-a3b-thinking": { input: 0.15, output: 1.2 },
+  "qwen3-vl-235b-a22b-instruct": { input: 0.4, output: 1.6 },
+  "qwen3-vl-235b-a22b-thinking": { input: 0.4, output: 4 },
+  "qwen3-vl-32b-instruct": { input: 0.16, output: 0.64 },
+  "qwen3-vl-32b-thinking": { input: 0.16, output: 2.87 },
+  "qwq-plus": { input: 0.8, output: 2.4 }
 };
 
 // src/media-official.ts
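The README rules for `autoPrice` (off by default, explicit `cost` always wins, `discount` scales input, output, and `cacheRead` alike and only applies when the table fills the entry) can be illustrated against one of the entries added above. The sketch below is an assumption about how such a resolution could look, not the library's code; `resolveCost`, `Entry`, and the one-row `TABLE` are hypothetical, and only the `glm-4.6` prices are copied from the diff.

```typescript
interface Price { input: number; output: number; cacheRead?: number }

// Tiny stand-in for the bundled table; the values are the glm-4.6 row from this diff.
const TABLE: Record<string, Price | undefined> = {
  "glm-4.6": { input: 0.6, output: 2.2, cacheRead: 0.11 },
};

// Hypothetical provider entry: an explicit cost and/or a flat discount fraction.
interface Entry { modelId: string; cost?: Price; discount?: number }

function resolveCost(entry: Entry, autoPrice: boolean): Price | undefined {
  if (entry.cost) return entry.cost;   // explicit cost always wins, discount ignored
  if (!autoPrice) return undefined;    // off by default: unpriced stays unpriced
  const listed = TABLE[entry.modelId];
  if (!listed) return undefined;       // id not keyed in the table
  const k = 1 - (entry.discount ?? 0); // e.g. discount 0.2 means 20% off list
  return {
    input: listed.input * k,
    output: listed.output * k,
    ...(listed.cacheRead !== undefined ? { cacheRead: listed.cacheRead * k } : {}),
  };
}
```

An HF-style id such as the `Qwen/Qwen3-…` form mentioned in the README would miss this lookup, which is why the notes recommend an explicit `cost` for hosts and aggregators.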
package/dist/index.js
CHANGED
@@ -287,6 +287,20 @@ function cacheSavingForUsage(cost, inputTokens, cacheReadTokens) {
   const cached = Math.min(Math.max(cacheReadTokens, 0), inputTokens);
   return cached / 1e6 * (cost.input - cost.cacheRead);
 }
+function reportedCost(providerMetadata, usage) {
+  const orUsage = providerMetadata?.openrouter?.usage;
+  if (orUsage) {
+    const upstream = orUsage.costDetails?.upstreamInferenceCost;
+    if (typeof upstream === "number" && upstream > 0) return upstream;
+    if (typeof orUsage.cost === "number") return orUsage.cost;
+  }
+  const raw = usage?.raw;
+  if (raw) {
+    const est = raw["estimated_cost"] ?? raw["cost"];
+    if (typeof est === "number") return est;
+  }
+  return void 0;
+}
 function requestIdFrom(options) {
   const raw = options.providerOptions?.lcr?.requestId;
   return typeof raw === "string" && raw.length > 0 ? raw : void 0;
@@ -485,12 +499,13 @@ var LcrFallbackModel = class {
     return baseline;
   }
   /** Winner settled: record the attempt, fire `onCost` (compat) + `onCall`. */
-  finalizeOk(ctx, provider, attemptStart, usage, ttftMs) {
+  finalizeOk(ctx, provider, attemptStart, usage, ttftMs, providerMetadata) {
     ctx.attempts.push({ provider: provider.label, ok: true, latencyMs: Date.now() - attemptStart });
     const inputTokens = usage?.inputTokens?.total ?? 0;
     const outputTokens = usage?.outputTokens?.total ?? 0;
     const cacheReadTokens = usage?.inputTokens?.cacheRead ?? 0;
-    const
+    const estCostUsd = provider.cost ? costForUsage(provider.cost, inputTokens, outputTokens, cacheReadTokens) : void 0;
+    const costUsd = reportedCost(providerMetadata, usage) ?? estCostUsd ?? 0;
     const cachedSavingUsd = provider.cost ? cacheSavingForUsage(provider.cost, inputTokens, cacheReadTokens) : 0;
     const usageMissing = inputTokens === 0 && outputTokens === 0;
     const emptyCompletion = inputTokens > 0 && outputTokens === 0;
@@ -525,6 +540,7 @@ var LcrFallbackModel = class {
       outputTokens,
       ...cacheReadTokens > 0 ? { cachedInputTokens: cacheReadTokens } : {},
       costUsd,
+      ...estCostUsd !== void 0 ? { estCostUsd } : {},
       ...baselineUsd !== void 0 ? { baselineUsd, baselineKind: "last-leg" } : {},
       ...cachedSavingUsd > 0 ? { cachedSavingUsd } : {},
       ...ctx.requestId ? { requestId: ctx.requestId } : {},
@@ -581,7 +597,7 @@ var LcrFallbackModel = class {
         }
         this.recordProviderSuccess(idx);
         this.settleSticky(idx);
-        this.finalizeOk(ctx, provider, attemptStart, result.usage);
+        this.finalizeOk(ctx, provider, attemptStart, result.usage, void 0, result.providerMetadata);
         if (cache && cacheKey !== void 0 && ctx.settled?.cacheable) {
           this.storeCache(cacheKey, { kind: "generate", result, meta: ctx.settled.meta });
         }
@@ -713,6 +729,7 @@ var LcrFallbackModel = class {
         const servingIdx = idx;
         const servingPos = p;
         let usage;
+        let finishProviderMetadata;
         let contentStreamed = false;
         let ttftMs;
         const stream = new ReadableStream({
@@ -729,6 +746,7 @@ var LcrFallbackModel = class {
                 if (done) break;
                 if (value.type === "finish") {
                   usage = value.usage;
+                  finishProviderMetadata = value.providerMetadata;
                   const out = value.usage?.outputTokens?.total ?? 0;
                   const inp = value.usage?.inputTokens?.total ?? 0;
                   if (inp > 0 && out === 0 && !contentStreamed && servingPos + 1 < n) {
@@ -743,7 +761,7 @@ var LcrFallbackModel = class {
               }
               self.recordProviderSuccess(servingIdx);
               self.settleSticky(servingIdx);
-              self.finalizeOk(ctx, servingProvider, servingAttemptStart, usage, ttftMs);
+              self.finalizeOk(ctx, servingProvider, servingAttemptStart, usage, ttftMs, finishProviderMetadata);
               controller.close();
             } catch (error) {
               self.emitError(error, servingProvider.label);
@@ -949,6 +967,16 @@ var MODEL_PRICES = {
   "gemini-gemma-2-9b-it": { input: 0.35, output: 1.05 },
   "gemini-pro-latest": { input: 1.25, output: 10, cacheRead: 0.125 },
   "gemini-robotics-er-1.5-preview": { input: 0.3, output: 2.5 },
+  "glm-4-32b-0414-128k": { input: 0.1, output: 0.1 },
+  "glm-4.5": { input: 0.6, output: 2.2 },
+  "glm-4.5-air": { input: 0.2, output: 1.1 },
+  "glm-4.5-airx": { input: 1.1, output: 4.5 },
+  "glm-4.5-x": { input: 2.2, output: 8.9 },
+  "glm-4.5v": { input: 0.6, output: 1.8 },
+  "glm-4.6": { input: 0.6, output: 2.2, cacheRead: 0.11 },
+  "glm-4.7": { input: 0.6, output: 2.2, cacheRead: 0.11 },
+  "glm-5": { input: 1, output: 3.2, cacheRead: 0.2 },
+  "glm-5-code": { input: 1.2, output: 5, cacheRead: 0.3 },
   "gpt-3.5-turbo": { input: 0.5, output: 1.5 },
   "gpt-3.5-turbo-0125": { input: 0.5, output: 1.5 },
   "gpt-3.5-turbo-1106": { input: 1, output: 2 },
@@ -1063,6 +1091,18 @@ var MODEL_PRICES = {
|
|
|
1063
1091
|
"grok-code-fast-1": { input: 0.2, output: 1.5, cacheRead: 0.02 },
|
|
1064
1092
|
"grok-code-fast-1-0825": { input: 0.2, output: 1.5, cacheRead: 0.02 },
|
|
1065
1093
|
"grok-vision-beta": { input: 5, output: 15 },
|
|
1094
|
+
"kimi-k2-0711-preview": { input: 0.6, output: 2.5, cacheRead: 0.15 },
|
|
1095
|
+
"kimi-k2-0905-preview": { input: 0.6, output: 2.5, cacheRead: 0.15 },
|
|
1096
|
+
"kimi-k2-thinking": { input: 0.6, output: 2.5, cacheRead: 0.15 },
|
|
1097
|
+
"kimi-k2-thinking-turbo": { input: 1.15, output: 8, cacheRead: 0.15 },
|
|
1098
|
+
"kimi-k2-turbo-preview": { input: 1.15, output: 8, cacheRead: 0.15 },
|
|
1099
|
+
"kimi-k2.5": { input: 0.6, output: 3, cacheRead: 0.1 },
|
|
1100
|
+
"kimi-k2.6": { input: 0.95, output: 4, cacheRead: 0.16 },
|
|
1101
|
+
"kimi-latest": { input: 2, output: 5, cacheRead: 0.15 },
|
|
1102
|
+
"kimi-latest-128k": { input: 2, output: 5, cacheRead: 0.15 },
|
|
1103
|
+
"kimi-latest-32k": { input: 1, output: 3, cacheRead: 0.15 },
|
|
1104
|
+
"kimi-latest-8k": { input: 0.2, output: 2, cacheRead: 0.15 },
|
|
1105
|
+
"kimi-thinking-preview": { input: 0.6, output: 2.5, cacheRead: 0.15 },
|
|
1066
1106
|
"labs-devstral-small-2512": { input: 0.1, output: 0.3 },
|
|
1067
1107
|
"magistral-medium-1-2-2509": { input: 2, output: 5 },
|
|
1068
1108
|
"magistral-medium-2506": { input: 2, output: 5 },
|
|
@@ -1071,6 +1111,12 @@ var MODEL_PRICES = {
|
|
|
1071
1111
|
"magistral-small-1-2-2509": { input: 0.5, output: 1.5 },
|
|
1072
1112
|
"magistral-small-2506": { input: 0.5, output: 1.5 },
|
|
1073
1113
|
"magistral-small-latest": { input: 0.5, output: 1.5 },
|
|
1114
|
+
"MiniMax-M2": { input: 0.3, output: 1.2, cacheRead: 0.03 },
|
|
1115
|
+
"MiniMax-M2.1": { input: 0.3, output: 1.2, cacheRead: 0.03 },
|
|
1116
|
+
"MiniMax-M2.1-lightning": { input: 0.3, output: 2.4, cacheRead: 0.03 },
|
|
1117
|
+
"MiniMax-M2.5": { input: 0.3, output: 1.2, cacheRead: 0.03 },
|
|
1118
|
+
"MiniMax-M2.5-lightning": { input: 0.3, output: 2.4, cacheRead: 0.03 },
|
|
1119
|
+
"MiniMax-M3": { input: 0.6, output: 2.4, cacheRead: 0.12 },
|
|
1074
1120
|
"ministral-3-14b-2512": { input: 0.2, output: 0.2 },
|
|
1075
1121
|
"ministral-3-3b-2512": { input: 0.1, output: 0.1 },
|
|
1076
1122
|
"ministral-3-8b-2512": { input: 0.15, output: 0.15 },
|
|
@@ -1091,6 +1137,16 @@ var MODEL_PRICES = {
|
|
|
1091
1137
|
"mistral-small-3-2-2506": { input: 0.06, output: 0.18 },
|
|
1092
1138
|
"mistral-small-latest": { input: 0.06, output: 0.18 },
|
|
1093
1139
|
"mistral-tiny": { input: 0.25, output: 0.25 },
|
|
1140
|
+
"moonshot-v1-128k": { input: 2, output: 5 },
|
|
1141
|
+
"moonshot-v1-128k-0430": { input: 2, output: 5 },
|
|
1142
|
+
"moonshot-v1-128k-vision-preview": { input: 2, output: 5 },
|
|
1143
|
+
"moonshot-v1-32k": { input: 1, output: 3 },
|
|
1144
|
+
"moonshot-v1-32k-0430": { input: 1, output: 3 },
|
|
1145
|
+
"moonshot-v1-32k-vision-preview": { input: 1, output: 3 },
|
|
1146
|
+
"moonshot-v1-8k": { input: 0.2, output: 2 },
|
|
1147
|
+
"moonshot-v1-8k-0430": { input: 0.2, output: 2 },
|
|
1148
|
+
"moonshot-v1-8k-vision-preview": { input: 0.2, output: 2 },
|
|
1149
|
+
"moonshot-v1-auto": { input: 2, output: 5 },
|
|
1094
1150
|
"o1": { input: 15, output: 60, cacheRead: 7.5 },
|
|
1095
1151
|
"o1-2024-12-17": { input: 15, output: 60, cacheRead: 7.5 },
|
|
1096
1152
|
"o3": { input: 2, output: 8, cacheRead: 0.5 },
|
|
@@ -1107,7 +1163,24 @@ var MODEL_PRICES = {
|
|
|
1107
1163
|
"open-mixtral-8x7b": { input: 0.7, output: 0.7 },
|
|
1108
1164
|
"pixtral-12b-2409": { input: 0.15, output: 0.15 },
|
|
1109
1165
|
"pixtral-large-2411": { input: 2, output: 6 },
|
|
1110
|
-
"pixtral-large-latest": { input: 2, output: 6 }
|
|
1166
|
+
"pixtral-large-latest": { input: 2, output: 6 },
|
|
1167
|
+
"qwen-coder": { input: 0.3, output: 1.5 },
|
|
1168
|
+
"qwen-max": { input: 1.6, output: 6.4 },
|
|
1169
|
+
"qwen-plus": { input: 0.4, output: 1.2 },
|
|
1170
|
+
"qwen-plus-2025-01-25": { input: 0.4, output: 1.2 },
|
|
1171
|
+
"qwen-plus-2025-04-28": { input: 0.4, output: 1.2 },
|
|
1172
|
+
"qwen-plus-2025-07-14": { input: 0.4, output: 1.2 },
|
|
1173
|
+
"qwen-turbo": { input: 0.05, output: 0.2 },
|
|
1174
|
+
"qwen-turbo-2024-11-01": { input: 0.05, output: 0.2 },
|
|
1175
|
+
"qwen-turbo-2025-04-28": { input: 0.05, output: 0.2 },
|
|
1176
|
+
"qwen-turbo-latest": { input: 0.05, output: 0.2 },
|
|
1177
|
+
"qwen3-next-80b-a3b-instruct": { input: 0.15, output: 1.2 },
|
|
1178
|
+
"qwen3-next-80b-a3b-thinking": { input: 0.15, output: 1.2 },
|
|
1179
|
+
"qwen3-vl-235b-a22b-instruct": { input: 0.4, output: 1.6 },
|
|
1180
|
+
"qwen3-vl-235b-a22b-thinking": { input: 0.4, output: 4 },
|
|
1181
|
+
"qwen3-vl-32b-instruct": { input: 0.16, output: 0.64 },
|
|
1182
|
+
"qwen3-vl-32b-thinking": { input: 0.16, output: 2.87 },
|
|
1183
|
+
"qwq-plus": { input: 0.8, output: 2.4 }
|
|
1111
1184
|
};
|
|
1112
1185
|
|
|
1113
1186
|
// src/media-official.ts
|
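Note on the `MODEL_PRICES` additions above: a rough sketch of how a table entry could be turned into an estimate. It assumes prices are USD per one million tokens and that cached input tokens are reported under `usage.inputTokens.cacheRead`. Both are assumptions, and `estimateCostUsd` is a hypothetical helper rather than the package's own function. The two entries are copied from the diff.

```javascript
// Sketch only: estimateCostUsd is a hypothetical helper, not ai-lcr's API.
// ASSUMPTION: prices are USD per 1,000,000 tokens.
const PRICES = {
  "glm-4.6": { input: 0.6, output: 2.2, cacheRead: 0.11 },
  "qwen-turbo": { input: 0.05, output: 0.2 }
};

function estimateCostUsd(modelId, usage) {
  const price = PRICES[modelId];
  if (!price) return undefined; // unknown model: no estimate

  const inputTotal = usage?.inputTokens?.total ?? 0;
  const output = usage?.outputTokens?.total ?? 0;
  // ASSUMPTION: cached input tokens are a subset of the input total.
  const cached = Math.min(usage?.inputTokens?.cacheRead ?? 0, inputTotal);
  const uncached = inputTotal - cached;

  // Entries without a cacheRead price bill cached tokens at the input rate.
  const cacheRate = price.cacheRead ?? price.input;
  return (uncached * price.input + cached * cacheRate + output * price.output) / 1e6;
}
```

For example, 10,000 input tokens (4,000 of them cached) and 1,000 output tokens on `glm-4.6` give (6,000 × 0.6 + 4,000 × 0.11 + 1,000 × 2.2) / 1,000,000 = 0.00624 USD under these assumptions.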
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "ai-lcr",
|
|
3
|
-
"version": "0.6.4",
|
|
3
|
+
"version": "0.7.0",
|
|
4
4
|
"description": "Least Cost Routing for LLMs — route every model call to the cheapest available provider, fall back automatically, and track real cost. Built for the Vercel AI SDK.",
|
|
5
5
|
"keywords": [
|
|
6
6
|
"ai",
|