ai-lcr 0.2.2 → 0.2.5

package/CHANGELOG.md ADDED
@@ -0,0 +1,108 @@
+ # Changelog
+
+ All notable changes to `ai-lcr` are documented here. The format follows
+ [Keep a Changelog](https://keepachangelog.com/), and the project adheres to
+ [Semantic Versioning](https://semver.org/).
+
+ ## [0.2.5] - 2026-06-01
+
+ Pre-launch failover-robustness and media-provider pass: closes cases where a
+ real provider failure slipped past the switch criterion and killed the request,
+ and makes fal a live failover target.
+
+ ### Fixed
+
+ - **A network-unreachable provider didn't fail over.** `isRetryableError` only
+ matched HTTP statuses and English keywords, but a provider that's down throws
+ a `fetch` `TypeError` with *no* status and wraps the real cause
+ (`ECONNREFUSED`/`ECONNRESET`/`ENOTFOUND`/connect-timeout, with the Node `code`)
+ in `error.cause`. Those read as a non-retryable client error, so the cheapest
+ provider going down killed the request instead of failing over, which is the
+ most common outage mode. The engine now walks the `cause` chain and treats Node
+ network codes / transport-failure messages as retryable. Applies to both the
+ text and media routers. New exported helper `isNetworkError`.
+ - **Non-English billing failures didn't fail over.** Out-of-credit detection was
+ English-only, but Chinese providers (e.g. Kunavo) report a failed charge as
+ `余额不足`/`账户欠费`/`扣费失败` (insufficient balance / account in arrears /
+ charge failed) in a 200/400 body with no billing status.
+ Those are now matched (plus `balance`/`exhausted`), so a failed charge fails
+ over and is tagged `billing` by `classifyErrorKind` for alerting.
+ - **An out-of-balance 403 was mis-tagged as `auth`.** Some providers report an
+ exhausted account as 403 (e.g. fal "exhausted balance"). That is a top-up
+ problem, not a revoked key. `classifyErrorKind` now lets billing wording win
+ over a bare 401/403 status, so it's tagged `billing` (a plain 403 stays `auth`).
+ - **A throwing observer could fail a successful request.** `onCost`/`onCall`/
+ `onError` were invoked unguarded; a logging sink that threw (e.g. a flaky db9
+ write) turned an otherwise-successful generation into a thrown error. All
+ observer callbacks are now fire-and-forget: each is wrapped so a throw can
+ never affect routing or the request outcome. Applies to both routers.
+
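The cause-chain walk described in the first fix above can be sketched roughly as follows. This is a minimal illustration, not the package's `isNetworkError`: the function name `looksLikeNetworkError`, the depth limit, the exact code set, and the message patterns are all assumptions made for the example.

```typescript
// Hypothetical sketch; not the ai-lcr implementation of isNetworkError.
// Assumed set of Node/undici transport codes that should count as retryable.
const NETWORK_CODES = new Set([
  "ECONNREFUSED",
  "ECONNRESET",
  "ENOTFOUND",
  "ETIMEDOUT",
  "UND_ERR_CONNECT_TIMEOUT",
]);

// Assumed transport-failure wording; a real list would likely be longer.
const TRANSPORT_MESSAGE = /fetch failed|socket hang up/i;

function looksLikeNetworkError(error: unknown, maxDepth = 5): boolean {
  let current: unknown = error;
  // Walk error -> error.cause -> error.cause.cause, bounded so a cyclic
  // cause chain cannot loop forever.
  for (let depth = 0; depth < maxDepth && current != null; depth++) {
    if (typeof current !== "object") return false;
    const { code, message, cause } = current as {
      code?: unknown;
      message?: unknown;
      cause?: unknown;
    };
    if (typeof code === "string" && NETWORK_CODES.has(code)) return true;
    if (typeof message === "string" && TRANSPORT_MESSAGE.test(message)) return true;
    current = cause;
  }
  return false;
}
```

The point of the walk is that the outer `TypeError` carries no status and no useful code, so a check that only inspects the top-level error would classify it as a client error.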
+ ### Added
+
+ - **fal media adapter** (`createFalMediaAdapter`). fal was in the price table
+ but had no adapter, so its routes were silently skipped at runtime. It's now
+ a real cheapest-first / failover target for image models. Synchronous
+ `https://fal.run/<model>` with `Authorization: Key`, generic input
+ pass-through, HTTP-status-bearing errors (403 out-of-balance → fails over;
+ 422 bad input → doesn't). Image only; fal video (queue) is on the roadmap.
+ - **Status-page liveness probes for Runware + fal** (`website`). Both are now
+ monitored with a free, generation-free reachability probe: Runware's `ping`
+ task (→ `pong`, 0 cost) and fal's `GET /v1/account/billing` (2xx ⇒ endpoint up
+ and key valid). Generalized via a new `ReachProbe` so a "reachable" check can
+ hit a provider-specific free endpoint instead of `GET /v1/models`. Requires
+ `RUNWARE_API_KEY` and `FAL_KEY` env vars to be set.
+
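As a rough illustration of the request shape the fal adapter entry describes (synchronous `https://fal.run/<model>`, `Authorization: Key`, input passed through unchanged, errors that carry the HTTP status), here is a small sketch. It is not `createFalMediaAdapter`: `buildFalRequest` and `HttpStatusError` are hypothetical names, and the `POST` method and JSON content type are assumptions about how such a call would typically be made.

```typescript
// Hypothetical sketch; names and request details are assumptions.
interface FalRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build (but do not send) a synchronous fal request. The input object is
// passed through untouched, so model-specific fields need no adapter changes.
function buildFalRequest(
  model: string,
  input: Record<string, unknown>,
  apiKey: string,
): FalRequest {
  return {
    url: `https://fal.run/${model}`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Key ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(input),
    },
  };
}

// An error that keeps the HTTP status, so a router can decide per status:
// per the entry above, 403 out-of-balance fails over and 422 bad input does not.
class HttpStatusError extends Error {
  status: number;
  constructor(status: number, message: string) {
    super(message);
    this.name = "HttpStatusError";
    this.status = status;
  }
}
```

A caller would pass `req.url` and `req.init` to `fetch` and throw `new HttpStatusError(res.status, bodyText)` when `res.ok` is false; keeping the status on the error is what lets the failover check distinguish the two cases.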
+ ## [0.2.3] - 2026-06-01
+
+ Release-quality and engine-correctness pass.
+
+ ### Fixed
+
+ - **Build was red on `main`.** `media.ts` set `CallRecord.baselineUsd` but the
+ type never declared it, so `tsc`/`npm run build` failed while `npm test`
+ (which doesn't typecheck) stayed green. `baselineUsd?: number` is now part of
+ `CallRecord`. The text router leaves it `undefined`; the media router sets it.
+ - **Failover used shared mutable state across concurrent requests.** The active
+ provider index was an instance field used both as the per-request loop cursor
+ and the loop's termination check. Two requests sharing one model instance
+ could clobber each other's cursor mid-flight (skipped providers, wrong
+ termination). Each request now walks providers on a fully local cursor; the
+ only shared state is a "where to start next" hint, read once and written once.
+ - **Cheapest provider was never re-probed under sustained traffic.** The
+ snap-back-to-cheapest timer reset on *every* call, so with calls more frequent
+ than `resetIntervalMs` it never fired: one blip pinned you on the expensive
+ fallback indefinitely (exactly when spend is highest). The timer now measures
+ from the last *failover*, so re-probe fires under load too.
+
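The two failover fixes above (a per-request local cursor, and a snap-back timer measured from the last failover) can be sketched together. This is an illustrative reconstruction under stated assumptions, not the engine's code: `StartHint` and `route` are hypothetical names, the injectable clock exists only to make the behaviour testable, and every error is treated as retryable to keep the sketch short.

```typescript
// Hypothetical sketch; not the ai-lcr engine.
class StartHint {
  private index = 0; // provider to start from (0 = cheapest)
  private lastFailoverAt = 0; // clock reading at the most recent failover
  private readonly resetIntervalMs: number;
  private readonly now: () => number;

  constructor(resetIntervalMs: number, now: () => number = () => Date.now()) {
    this.resetIntervalMs = resetIntervalMs;
    this.now = now;
  }

  // Read once per request. Snaps back to the cheapest provider once the
  // interval has elapsed since the last failover, regardless of call rate.
  read(): number {
    const elapsed = this.now() - this.lastFailoverAt;
    return elapsed >= this.resetIntervalMs ? 0 : this.index;
  }

  // Written once per request, after it settles on a provider. Only a real
  // failover restarts the timer; an ordinary successful call does not.
  write(index: number, failedOver: boolean): void {
    this.index = index;
    if (failedOver) this.lastFailoverAt = this.now();
  }
}

async function route<T>(
  hint: StartHint,
  providers: Array<() => Promise<T>>,
): Promise<T> {
  const start = hint.read();
  let lastError: unknown = new Error("no provider available");
  // The cursor is a local variable, so concurrent requests cannot clobber it.
  for (let cursor = start; cursor < providers.length; cursor++) {
    const call = providers[cursor]!;
    try {
      const result = await call();
      hint.write(cursor, cursor !== start);
      return result;
    } catch (error) {
      lastError = error; // sketch: every error is treated as retryable
    }
  }
  throw lastError;
}
```

Because `write` leaves `lastFailoverAt` alone on a plain success, a steady stream of calls to the fallback no longer postpones the re-probe of the cheapest provider.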
+ ### Added
+
+ - **`classifyErrorKind(error)` and `RouteAttempt.kind`**
+ (`"transient" | "auth" | "billing" | "client"`). 401/403 (auth) and
+ 402/out-of-credit (billing) still fail over so the request survives, but
+ they're now tagged distinctly from transient 429/5xx, so a misconfigured key
+ silently burning the pricey fallback is something you can alert on instead of
+ mistaking for healthy routing.
+ - **Continuous Integration** (`.github/workflows/ci.yml`): `build` +
+ `typecheck` + `test` on Node 20 & 22, plus a `pack-smoke` job that installs
+ the actual `npm pack` tarball into a clean directory and imports it (ESM and
+ CJS), catching dropped exports and broken `dist` that an in-repo test can't.
+ - **`prepublishOnly` gate**: `npm publish` now runs build + typecheck + test
+ first, so a red tree can't be published.
+ - **Public-export surface test** (`public-api.test.ts`): pins every runtime
+ export by name, so removing one fails loudly and adding one is deliberate.
+
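To make the four kinds concrete, here is a minimal sketch of a classifier that follows the rules stated in this changelog: billing wording beats a bare 401/403, 402 is billing, 429/5xx is transient, and anything else is a client error. It is not the package's `classifyErrorKind`; the name `sketchClassifyKind`, the `(status, body)` signature, and the exact wording list are assumptions, and the function is meant to be called only for calls already known to have failed.

```typescript
// Hypothetical sketch; not the ai-lcr classifyErrorKind.
type ErrorKind = "transient" | "auth" | "billing" | "client";

// Billing wording named in the changelog: three Chinese phrases
// (insufficient balance / account in arrears / charge failed) plus
// `balance` and `exhausted`.
const BILLING_WORDING = /余额不足|账户欠费|扣费失败|balance|exhausted/i;

function sketchClassifyKind(status: number, body: string): ErrorKind {
  // Billing wording wins over a bare 401/403, and also covers providers
  // that report a failed charge inside a 200/400 body.
  if (status === 402 || BILLING_WORDING.test(body)) return "billing";
  if (status === 401 || status === 403) return "auth";
  if (status === 429 || status >= 500) return "transient";
  return "client";
}

// Per the changelog, auth and billing failures still fail over (so the
// request survives); only plain client errors such as a 422 do not.
function failsOver(kind: ErrorKind): boolean {
  return kind !== "client";
}
```

Separating the tag from the failover decision is what makes alerting possible: the request still succeeds on a fallback, while the `auth` or `billing` tag records that the cheaper provider needs attention.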
+ ## [0.2.1] - earlier
+
+ - `onCall` correlated `CallRecord` + `formatCallRecord` one-liner for the text
+ router, extended to the media router (image/video).
+
+ ## [0.2.0] - earlier
+
+ - Observability: `onCall` / `CallRecord`, `formatCallRecord`.
+
+ ## [0.1.x] - earlier
+
+ - Dual ESM/CJS build. Media (image/video) least-cost routing with the Runware
+ and Kunavo adapters; cap-aware failover for the text router.
+
+ [0.2.5]: https://github.com/victorzhrn/ai-lcr/releases/tag/v0.2.5
+ [0.2.3]: https://github.com/victorzhrn/ai-lcr/releases/tag/v0.2.3
package/README.md CHANGED
@@ -156,7 +156,7 @@ Any OpenAI-compatible endpoint works — and so does any AI SDK provider package
 
  - **Model vendors' own APIs (native):** route straight to [DeepSeek](https://platform.deepseek.com), [OpenAI](https://openai.com), [Anthropic](https://anthropic.com), [Google](https://ai.google.dev), [xAI](https://x.ai), etc. via their AI SDK provider packages: no markup, full native features. See [Route to a model vendor's own API](#route-to-a-model-vendors-own-api-native-providers).
  - **Text aggregators:** [OpenRouter](https://openrouter.ai) (widest coverage, list pricing) · [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off** every model) · [TokenMart](https://thetokenmart.ai) (15–65% off, varies by model)
- - **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai): routing via `createMediaLCR`. Image: Kunavo + Runware + fal. Video: fal (live, via its async queue API); Kunavo's Veo poll path is implemented but unverified
+ - **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai): image routing available via `createMediaLCR` (Kunavo + Runware + fal adapters); video on the roadmap
 
  ## Text model pricing
 
@@ -273,8 +273,7 @@ Two OpenAI-compatible providers, same probe, same day. Cells cover both families
  - [ ] Bundled price table for zero-config pricing (drop the manual `cost` numbers)
  - [ ] Provider-quirk middleware (transparently patch known per-provider request quirks, e.g. Kunavo's ignored `max_tokens`)
  - [ ] Feed probe results into routing automatically (auto-exclude a model from a provider that fails its probe)
- - [x] Image & video model routing (`createMediaLCR`): image via Kunavo + Runware + fal; **video live via fal** (async queue API)
- - [ ] Normalized cross-provider video price comparison + verified Kunavo/Runware video adapters
+ - [ ] Image & video model routing (fal.ai / Runware / Kunavo)
 
  ## Affiliate disclosure
 
package/README.zh-CN.md CHANGED
@@ -114,7 +114,7 @@ const lcr = createLCR({
 
  - **Model vendors' official APIs (native):** connect directly to [DeepSeek](https://platform.deepseek.com), [OpenAI](https://openai.com), [Anthropic](https://anthropic.com), [Google](https://ai.google.dev), [xAI](https://x.ai), etc. through their respective AI SDK provider packages: no markup, full native features. See the "Route to a model vendor's own API (native providers)" section above.
  - **Text aggregators:** [OpenRouter](https://openrouter.ai) (widest coverage, list pricing) · [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off every model**) · [TokenMart](https://thetokenmart.ai) (15–65% off, varies by model)
- - **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai): routed via `createMediaLCR`. Image: Kunavo + Runware + fal. Video: fal (live, via its async queue API); Kunavo's Veo polling path is implemented but unverified
+ - **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai): routing is on the roadmap
 
  ## Text model pricing
 
@@ -229,8 +229,7 @@ API_KEY=$TOKENMART_API_KEY BASE=https://api.tokenmart.ai \
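  - [ ] Built-in price table for zero-config pricing (no more hand-entered `cost` numbers)
  - [ ] Provider-quirk middleware (transparently patch known quirks, e.g. Kunavo's ignored `max_tokens`)
  - [ ] Feed probe results into routing automatically (auto-remove a provider×model pair that fails its probe)
- - [x] Image and video model routing (`createMediaLCR`): image via Kunavo + Runware + fal; **video is live via fal** (async queue API)
- - [ ] Normalized cross-provider video price comparison + verified Kunavo/Runware video adapters
+ - [ ] Image and video model routing (fal.ai / Runware / Kunavo)
 
  ## Affiliate disclosure
 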