ai-lcr 0.2.2 → 0.2.5
- package/CHANGELOG.md +108 -0
- package/README.md +2 -3
- package/README.zh-CN.md +2 -3
- package/dist/index.cjs +256 -132
- package/dist/index.d.cts +48 -38
- package/dist/index.d.ts +48 -38
- package/dist/index.js +255 -131
- package/package.json +5 -3
package/CHANGELOG.md
ADDED
@@ -0,0 +1,108 @@
+# Changelog
+
+All notable changes to `ai-lcr` are documented here. The format follows
+[Keep a Changelog](https://keepachangelog.com/), and the project adheres to
+[Semantic Versioning](https://semver.org/).
+
+## [0.2.5] — 2026-06-01
+
+Pre-launch failover-robustness + media-provider pass — closing cases where a
+real provider failure slipped past the switch criterion and killed the request,
+and making fal a live failover target.
+
+### Fixed
+
+- **A network-unreachable provider didn't fail over.** `isRetryableError` only
+  matched HTTP statuses and English keywords, but a provider that's down throws
+  a `fetch` `TypeError` with *no* status — and wraps the real cause
+  (`ECONNREFUSED`/`ECONNRESET`/`ENOTFOUND`/connect-timeout, with the Node `code`)
+  in `error.cause`. Those read as a non-retryable client error, so the cheapest
+  provider going down killed the request instead of failing over — the most
+  common outage mode. The engine now walks the `cause` chain and treats Node
+  network codes / transport-failure messages as retryable. Applies to both the
+  text and media routers. New exported helper `isNetworkError`.
+- **Non-English billing failures didn't fail over.** Out-of-credit detection was
+  English-only, but Chinese providers (e.g. Kunavo) report a failed charge as
+  `余额不足`/`账户欠费`/`扣费失败` in a 200/400 body with no billing status.
+  Those are now matched (plus `balance`/`exhausted`), so a failed charge fails
+  over and is tagged `billing` by `classifyErrorKind` for alerting.
+- **An out-of-balance 403 was mis-tagged as `auth`.** Providers report an
+  exhausted account as 403 (e.g. fal "exhausted balance") — a top-up problem,
+  not a revoked key. `classifyErrorKind` now lets billing wording win over a
+  bare 401/403 status, so it's tagged `billing` (a plain 403 stays `auth`).
+- **A throwing observer could fail a successful request.** `onCost`/`onCall`/
+  `onError` were invoked unguarded; a logging sink that threw (e.g. a flaky db9
+  write) turned an otherwise-successful generation into a thrown error. All
+  observer callbacks are now fire-and-forget — wrapped so a throw can never
+  affect routing or the request outcome. Applies to both routers.
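The first three fixes in this list all concern how an error is inspected before the router decides whether to fail over and how to tag it. A minimal TypeScript sketch of that inspection, assuming an error shape with optional `status`, `code`, `message`, and `cause` fields. The names `isNetworkError` and `classifyErrorKind` come from the changelog; their bodies, the `ErrorLike` type, the codes `ETIMEDOUT` and `UND_ERR_CONNECT_TIMEOUT`, and the exact patterns are assumptions for illustration, not the package's implementation.

```typescript
// Illustrative sketch only: bodies are assumptions, not the ai-lcr source.
type ErrorKind = "transient" | "auth" | "billing" | "client";

// Hypothetical minimal error shape; real provider errors may carry more fields.
interface ErrorLike {
  status?: number;
  code?: string;
  message?: string;
  cause?: unknown;
}

// Codes named in the changelog, plus two assumed connect-timeout codes.
const NETWORK_CODES = new Set([
  "ECONNREFUSED",
  "ECONNRESET",
  "ENOTFOUND",
  "ETIMEDOUT", // assumption
  "UND_ERR_CONNECT_TIMEOUT", // assumption (undici)
]);

// Transport-failure wording; "fetch failed" is the message Node's fetch throws.
const TRANSPORT_MESSAGE = /fetch failed|socket hang up/i;

// Billing wording listed in the changelog entry.
const BILLING_MESSAGE = /余额不足|账户欠费|扣费失败|balance|exhausted/i;

function asErrorLike(value: unknown): ErrorLike | undefined {
  return typeof value === "object" && value !== null ? (value as ErrorLike) : undefined;
}

// Walk error -> error.cause -> ... looking for a network code or message.
function isNetworkError(error: unknown, maxDepth = 5): boolean {
  let current = asErrorLike(error);
  for (let depth = 0; current !== undefined && depth < maxDepth; depth++) {
    if (typeof current.code === "string" && NETWORK_CODES.has(current.code)) return true;
    if (typeof current.message === "string" && TRANSPORT_MESSAGE.test(current.message)) return true;
    current = asErrorLike(current.cause);
  }
  return false;
}

function classifyErrorKind(error: unknown): ErrorKind {
  const e = asErrorLike(error);
  const status = e?.status;
  const message = typeof e?.message === "string" ? e.message : "";
  // Billing wording wins over a bare 401/403 status.
  if (status === 402 || BILLING_MESSAGE.test(message)) return "billing";
  if (status === 401 || status === 403) return "auth";
  if (status === 429 || (status !== undefined && status >= 500)) return "transient";
  if (isNetworkError(error)) return "transient";
  return "client";
}
```

In this sketch the billing check runs before the status check, which is what lets a 403 carrying "exhausted balance" be tagged `billing` while a plain 403 stays `auth`.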
+
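The observer fix in the list above amounts to wrapping each callback so that neither a synchronous throw nor an async rejection can reach the caller. A minimal sketch of such a wrapper; `guardObserver` and `Observer` are hypothetical names chosen for illustration, and the changelog does not show how the package implements this.

```typescript
// Illustrative sketch: `guardObserver` and `Observer` are hypothetical names.
type Observer<T> = (event: T) => void | Promise<void>;

function guardObserver<T>(observer?: Observer<T>): (event: T) => void {
  return (event: T): void => {
    if (observer === undefined) return;
    try {
      // Promise.resolve also covers async observers: a rejection is swallowed
      // instead of surfacing as an unhandled rejection.
      Promise.resolve(observer(event)).catch(() => undefined);
    } catch {
      // A synchronous throw from the sink is ignored on purpose.
    }
  };
}
```

A router would call the guarded function in place of the raw `onCost`, `onCall`, or `onError` callback, so a failing logging sink cannot change the request outcome.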
+### Added
+
+- **fal media adapter** (`createFalMediaAdapter`). fal was in the price table
+  but had no adapter, so its routes were silently skipped at runtime — now it's
+  a real cheapest-first / failover target for image models. Synchronous
+  `https://fal.run/<model>` with `Authorization: Key`, generic input
+  pass-through, HTTP-status-bearing errors (403 out-of-balance → fails over; 422
+  bad input → doesn't). Image only; fal video (queue) is on the roadmap.
+- **Status-page liveness probes for Runware + fal** (`website`). Both are now
+  monitored with a free, generation-free reachability probe: Runware's `ping`
+  task (→ `pong`, 0 cost) and fal's `GET /v1/account/billing` (2xx ⇒ endpoint up
+  + key valid). Generalized via a new `ReachProbe` so a "reachable" check can
+  hit a provider-specific free endpoint instead of `GET /v1/models`. Requires
+  `RUNWARE_API_KEY` and `FAL_KEY` env vars to be set.
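The fal adapter entry above describes a synchronous call to `https://fal.run/<model>` with an `Authorization: Key` header, pass-through input, and errors that carry the HTTP status. A rough sketch of that request shape follows. Only the URL pattern, the header scheme, and the 403/422 behaviour come from the changelog; the `POST` method, the JSON content type, and every name here (`buildFalRequest`, `generateWithFal`, `FalHttpError`, `shouldFailOver`, `FetchLike`) are assumptions and not the `createFalMediaAdapter` source.

```typescript
// Illustrative sketch: not the createFalMediaAdapter source.
interface FalRequestInit {
  method: "POST"; // assumption: the changelog does not name the method
  headers: Record<string, string>;
  body: string;
}

interface FalResponseLike {
  ok: boolean;
  status: number;
  text(): Promise<string>;
  json(): Promise<unknown>;
}

// Minimal fetch shape so the sketch does not depend on DOM or Node typings.
type FetchLike = (url: string, init: FalRequestInit) => Promise<FalResponseLike>;

// Error that keeps the HTTP status so the router can decide on failover.
class FalHttpError extends Error {
  status: number;
  constructor(status: number, body: string) {
    super(`fal responded ${status}: ${body}`);
    this.name = "FalHttpError";
    this.status = status;
  }
}

function buildFalRequest(
  model: string,
  input: Record<string, unknown>,
  apiKey: string,
): { url: string; init: FalRequestInit } {
  return {
    url: `https://fal.run/${model}`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Key ${apiKey}`,
        "Content-Type": "application/json", // assumption
      },
      body: JSON.stringify(input), // generic input pass-through
    },
  };
}

// Auth, billing, rate-limit, and 5xx statuses fail over; other 4xx (e.g. 422) do not.
function shouldFailOver(status: number): boolean {
  return status === 401 || status === 402 || status === 403 || status === 429 || status >= 500;
}

async function generateWithFal(
  model: string,
  input: Record<string, unknown>,
  apiKey: string,
  fetchImpl: FetchLike,
): Promise<unknown> {
  const { url, init } = buildFalRequest(model, input, apiKey);
  const response = await fetchImpl(url, init);
  if (!response.ok) throw new FalHttpError(response.status, await response.text());
  return response.json();
}
```

Passing the fetch implementation as a parameter (the global `fetch` fits the `FetchLike` shape) keeps the sketch testable with a stub and free of network access.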
+
+## [0.2.3] — 2026-06-01
+
+Release-quality and engine-correctness pass.
+
+### Fixed
+
+- **Build was red on `main`.** `media.ts` set `CallRecord.baselineUsd` but the
+  type never declared it, so `tsc`/`npm run build` failed while `npm test`
+  (which doesn't typecheck) stayed green. `baselineUsd?: number` is now part of
+  `CallRecord`. The text router leaves it `undefined`; the media router sets it.
+- **Failover used shared mutable state across concurrent requests.** The active
+  provider index was an instance field used both as the per-request loop cursor
+  and the loop's termination check. Two requests sharing one model instance
+  could clobber each other's cursor mid-flight (skipped providers, wrong
+  termination). Each request now walks providers on a fully local cursor; the
+  only shared state is a "where to start next" hint, read once and written once.
+- **Cheapest provider was never re-probed under sustained traffic.** The
+  snap-back-to-cheapest timer reset on *every* call, so with calls more frequent
+  than `resetIntervalMs` it never fired — one blip pinned you on the expensive
+  fallback indefinitely (exactly when spend is highest). The timer now measures
+  from the last *failover*, so re-probe fires under load too.
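The two failover fixes above can be pictured together: each request iterates with a local cursor, the shared state is reduced to a start hint plus a timestamp, and that timestamp moves only when a failover happens. A simplified sketch under those assumptions; `FailoverRouter`, `Provider`, and the field names are hypothetical, every error is treated as retryable, and there is no wrap-around to cheaper providers, so it illustrates the idea and is not the engine's code.

```typescript
// Illustrative sketch: class and field names are hypothetical.
interface Provider<T> {
  name: string;
  call(): Promise<T>;
}

class FailoverRouter<T> {
  private providers: Provider<T>[]; // ordered cheapest first
  private resetIntervalMs: number;
  private now: () => number;
  private startHint = 0; // shared "where to start next" hint
  private lastFailoverAt = 0; // updated on failover only, never on plain calls

  constructor(providers: Provider<T>[], resetIntervalMs: number, now: () => number = Date.now) {
    this.providers = providers;
    this.resetIntervalMs = resetIntervalMs;
    this.now = now;
  }

  async run(): Promise<T> {
    // Re-probe the cheapest provider once the interval has elapsed since the
    // last failover, regardless of how many calls happened in between.
    const expired = this.now() - this.lastFailoverAt >= this.resetIntervalMs;
    const start = expired ? 0 : this.startHint; // shared hint read once
    let lastError: unknown = new Error("no providers configured");
    // `cursor` is local to this request, so concurrent requests cannot clobber it.
    for (let cursor = start; cursor < this.providers.length; cursor++) {
      const provider = this.providers[cursor];
      if (provider === undefined) break;
      try {
        const result = await provider.call();
        this.startHint = cursor; // shared hint written once
        if (cursor !== start) this.lastFailoverAt = this.now(); // timer restarts on failover only
        return result;
      } catch (error) {
        lastError = error; // sketch: every error counts as retryable
      }
    }
    throw lastError;
  }
}
```

Injecting `now` makes the re-probe timing testable with a fake clock instead of real delays.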
+
+### Added
+
+- **`classifyErrorKind(error)` and `RouteAttempt.kind`** (`"transient" | "auth"
+  | "billing" | "client"`). 401/403 (auth) and 402/out-of-credit (billing)
+  still fail over so the request survives — but they're now tagged distinctly
+  from transient 429/5xx, so a misconfigured key silently burning the pricey
+  fallback is something you can alert on instead of mistaking for healthy
+  routing.
+- **Continuous Integration** (`.github/workflows/ci.yml`): `build` +
+  `typecheck` + `test` on Node 20 & 22, plus a `pack-smoke` job that installs
+  the actual `npm pack` tarball into a clean directory and imports it (ESM and
+  CJS) — catching dropped exports and broken `dist` that an in-repo test can't.
+- **`prepublishOnly` gate**: `npm publish` now runs build + typecheck + test
+  first, so a red tree can't be published.
+- **Public-export surface test** (`public-api.test.ts`): pins every runtime
+  export by name, so removing one fails loudly and adding one is deliberate.
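The public-export surface test described above boils down to comparing a module's runtime export names against a pinned list. A small sketch of that comparison, assuming a hypothetical helper `diffExports`; the contents of `public-api.test.ts` are not shown in this diff, and a real test would pass the imported package namespace rather than a stand-in object.

```typescript
// Illustrative sketch: `diffExports` is a hypothetical helper, not the
// contents of public-api.test.ts.
function diffExports(
  moduleExports: Record<string, unknown>,
  pinned: readonly string[],
): { missing: string[]; unexpected: string[] } {
  const actual = Object.keys(moduleExports).sort();
  const actualSet = new Set(actual);
  const pinnedSet = new Set(pinned);
  return {
    // Pinned names that the module no longer exports (a removal).
    missing: pinned.filter((name) => !actualSet.has(name)),
    // Exports that are not pinned yet (an addition that needs a deliberate update).
    unexpected: actual.filter((name) => !pinnedSet.has(name)),
  };
}
```

A test built on this would fail whenever either array is non-empty, so removals are caught and additions require editing the pinned list.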
+
+## [0.2.1] — earlier
+
+- `onCall` correlated `CallRecord` + `formatCallRecord` one-liner for the text
+  router, extended to the media router (image/video).
+
+## [0.2.0] — earlier
+
+- Observability: `onCall` / `CallRecord`, `formatCallRecord`.
+
+## [0.1.x] — earlier
+
+- Dual ESM/CJS build. Media (image/video) least-cost routing with the Runware
+  and Kunavo adapters; cap-aware failover for the text router.
+
+[0.2.5]: https://github.com/victorzhrn/ai-lcr/releases/tag/v0.2.5
+[0.2.3]: https://github.com/victorzhrn/ai-lcr/releases/tag/v0.2.3
package/README.md
CHANGED
@@ -156,7 +156,7 @@ Any OpenAI-compatible endpoint works — and so does any AI SDK provider package
 
 - **Model vendors' own APIs (native):** route straight to [DeepSeek](https://platform.deepseek.com), [OpenAI](https://openai.com), [Anthropic](https://anthropic.com), [Google](https://ai.google.dev), [xAI](https://x.ai), etc. via their AI SDK provider packages — no markup, full native features. See [Route to a model vendor's own API](#route-to-a-model-vendors-own-api-native-providers).
 - **Text aggregators:** [OpenRouter](https://openrouter.ai) (widest coverage, list pricing) · [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off** every model) · [TokenMart](https://thetokenmart.ai) (15–65% off, varies by model)
-- **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) — routing via `createMediaLCR
+- **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) — image routing available via `createMediaLCR` (Kunavo + Runware + fal adapters); video on the roadmap
 
 ## Text model pricing
 
@@ -273,8 +273,7 @@ Two OpenAI-compatible providers, same probe, same day. Cells cover both families
 - [ ] Bundled price table for zero-config pricing (drop the manual `cost` numbers)
 - [ ] Provider-quirk middleware (transparently patch known per-provider request quirks, e.g. Kunavo's ignored `max_tokens`)
 - [ ] Feed probe results into routing automatically (auto-exclude a model from a provider that fails its probe)
-- [
-- [ ] Normalized cross-provider video price comparison + verified Kunavo/Runware video adapters
+- [ ] Image & video model routing (fal.ai / Runware / Kunavo)
 
 ## Affiliate disclosure
 
package/README.zh-CN.md
CHANGED
@@ -114,7 +114,7 @@ const lcr = createLCR({
 
 - **Model vendors' official APIs (native):** connect directly to [DeepSeek](https://platform.deepseek.com), [OpenAI](https://openai.com), [Anthropic](https://anthropic.com), [Google](https://ai.google.dev), [xAI](https://x.ai), etc. through their respective AI SDK provider packages: no markup, full native features. See the "Connect directly to a model vendor's official API (native provider)" section above.
 - **Text aggregators:** [OpenRouter](https://openrouter.ai) (widest coverage, list pricing) · [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off every model**) · [TokenMart](https://thetokenmart.ai) (15–65% off, varies by model)
-- **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) -
+- **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) - routing is on the roadmap
 
 ## Text model pricing
 
@@ -229,8 +229,7 @@ API_KEY=$TOKENMART_API_KEY BASE=https://api.tokenmart.ai \
 - [ ] Bundled price table for zero-config pricing (no more hand-entered `cost` numbers)
 - [ ] Provider-quirk middleware (transparently patch known quirks, e.g. Kunavo's ignored `max_tokens`)
 - [ ] Feed probe results into routing automatically (a provider×model pair that fails its probe is dropped from the list)
-- [
-- [ ] Normalized cross-provider video price comparison + verified Kunavo/Runware video adapters
+- [ ] Image and video model routing (fal.ai / Runware / Kunavo)
 
 ## Affiliate disclosure
 