npm - ai-lcr - Versions diffs - 0.2.2 → 0.2.5 - Mend

ai-lcr 0.2.2 → 0.2.5

Files changed (8)

package/CHANGELOG.md ADDED

@@ -0,0 +1,108 @@
+# Changelog
+All notable changes to `ai-lcr` are documented here. The format follows
+[Keep a Changelog](https://keepachangelog.com/), and the project adheres to
+[Semantic Versioning](https://semver.org/).
+## [0.2.5] — 2026-06-01
+Pre-launch failover-robustness + media-provider pass — closing cases where a
+real provider failure slipped past the switch criterion and killed the request,
+and making fal a live failover target.
+### Fixed
+- **A network-unreachable provider didn't fail over.** `isRetryableError` only
+  matched HTTP statuses and English keywords, but a provider that's down throws
+  a `fetch` `TypeError` with *no* status — and wraps the real cause
+  (`ECONNREFUSED`/`ECONNRESET`/`ENOTFOUND`/connect-timeout, with the Node `code`)
+  in `error.cause`. Those read as a non-retryable client error, so the cheapest
+  provider going down killed the request instead of failing over, which is the
+  most common outage mode. The engine now walks the `cause` chain and treats Node
+  network codes / transport-failure messages as retryable. Applies to both the
+  text and media routers. New exported helper `isNetworkError`.
+- **Non-English billing failures didn't fail over.** Out-of-credit detection was
+  English-only, but Chinese providers (e.g. Kunavo) report a failed charge as
+  `余额不足`/`账户欠费`/`扣费失败` in a 200/400 body with no billing status.
+  Those are now matched (plus `balance`/`exhausted`), so a failed charge fails
+  over and is tagged `billing` by `classifyErrorKind` for alerting.
+- **An out-of-balance 403 was mis-tagged as `auth`.** Providers report an
+  exhausted account as 403 (e.g. fal "exhausted balance") — a top-up problem,
+  not a revoked key. `classifyErrorKind` now lets billing wording win over a
+  bare 401/403 status, so it's tagged `billing` (a plain 403 stays `auth`).
+- **A throwing observer could fail a successful request.** `onCost`/`onCall`/
+  `onError` were invoked unguarded; a logging sink that threw (e.g. a flaky db9
+  write) turned an otherwise-successful generation into a thrown error. All
+  observer callbacks are now fire-and-forget — wrapped so a throw can never
+  affect routing or the request outcome. Applies to both routers.
+### Added
+- **fal media adapter** (`createFalMediaAdapter`). fal was in the price table
+  but had no adapter, so its routes were silently skipped at runtime — now it's
+  a real cheapest-first / failover target for image models. Synchronous
+  `https://fal.run/<model>` with `Authorization: Key`, generic input pass-
+  through, HTTP-status-bearing errors (403 out-of-balance → fails over; 422 bad
+  input → doesn't). Image only; fal video (queue) is on the roadmap.
+- **Status-page liveness probes for Runware + fal** (`website`). Both are now
+  monitored with a free, generation-free reachability probe: Runware's `ping`
+  task (→ `pong`, 0 cost) and fal's `GET /v1/account/billing` (2xx ⇒ endpoint up
+  + key valid). Generalized via a new `ReachProbe` so a "reachable" check can
+  hit a provider-specific free endpoint instead of `GET /v1/models`. Requires
+  `RUNWARE_API_KEY` and `FAL_KEY` env vars to be set.
+## [0.2.3] — 2026-06-01
+Release-quality and engine-correctness pass.
+### Fixed
+- **Build was red on `main`.** `media.ts` set `CallRecord.baselineUsd` but the
+  type never declared it, so `tsc`/`npm run build` failed while `npm test`
+  (which doesn't typecheck) stayed green. `baselineUsd?: number` is now part of
+  `CallRecord`. The text router leaves it `undefined`; the media router sets it.
+- **Failover used shared mutable state across concurrent requests.** The active
+  provider index was an instance field used both as the per-request loop cursor
+  and the loop's termination check. Two requests sharing one model instance
+  could clobber each other's cursor mid-flight (skipped providers, wrong
+  termination). Each request now walks providers on a fully local cursor; the
+  only shared state is a "where to start next" hint, read once and written once.
+- **Cheapest provider was never re-probed under sustained traffic.** The
+  snap-back-to-cheapest timer reset on *every* call, so with calls more frequent
+  than `resetIntervalMs` it never fired — one blip pinned you on the expensive
+  fallback indefinitely (exactly when spend is highest). The timer now measures
+  from the last *failover*, so re-probe fires under load too.
+### Added
+- **`classifyErrorKind(error)` and `RouteAttempt.kind`** (`"transient" | "auth"
+  | "billing" | "client"`). 401/403 (auth) and 402/out-of-credit (billing)
+  still fail over so the request survives — but they're now tagged distinctly
+  from transient 429/5xx, so a misconfigured key silently burning the pricey
+  fallback is something you can alert on instead of mistaking for healthy
+  routing.
+- **Continuous Integration** (`.github/workflows/ci.yml`): `build` +
+  `typecheck` + `test` on Node 20 & 22, plus a `pack-smoke` job that installs
+  the actual `npm pack` tarball into a clean directory and imports it (ESM and
+  CJS) — catching dropped exports and broken `dist` that an in-repo test can't.
+- **`prepublishOnly` gate**: `npm publish` now runs build + typecheck + test
+  first, so a red tree can't be published.
+- **Public-export surface test** (`public-api.test.ts`): pins every runtime
+  export by name, so removing one fails loudly and adding one is deliberate.
+## [0.2.1] — earlier
+- `onCall` correlated `CallRecord` + `formatCallRecord` one-liner for the text
+  router, extended to the media router (image/video).
+## [0.2.0] — earlier
+- Observability: `onCall` / `CallRecord`, `formatCallRecord`.
+## [0.1.x] — earlier
+- Dual ESM/CJS build. Media (image/video) least-cost routing with the Runware
+  and Kunavo adapters; cap-aware failover for the text router.
+[0.2.5]: https://github.com/victorzhrn/ai-lcr/releases/tag/v0.2.5
+[0.2.3]: https://github.com/victorzhrn/ai-lcr/releases/tag/v0.2.3
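Note on the 0.2.5 error-classification fixes: the sketch below is one way the described behavior could look, not the package's actual code. The names `isNetworkError` and `classifyErrorKind`, the four kinds, and the Chinese billing phrases come from the changelog; the error shape (`status`, `code`, `message`, `cause`), the `fetch failed` message match, the undici timeout code, the `credit` keyword, and tagging network errors as `transient` are all assumptions made for illustration.

```typescript
// Hypothetical sketch, not the ai-lcr source.
type ErrorKind = "transient" | "auth" | "billing" | "client";

// Assumed error shape; real provider errors may carry other fields.
interface ErrorLike {
  status?: unknown;
  code?: unknown;
  message?: unknown;
  cause?: unknown;
}

const NETWORK_CODES = new Set([
  "ECONNREFUSED",
  "ECONNRESET",
  "ENOTFOUND",
  "UND_ERR_CONNECT_TIMEOUT", // assumed: undici's connect-timeout code
]);

// Phrases named in the changelog, plus "credit" as an assumed English keyword.
const BILLING_WORDS = /余额不足|账户欠费|扣费失败|balance|exhausted|credit/i;

// Walk the `cause` chain (bounded, in case it is cyclic) looking for a Node
// network code or a transport-failure message.
function isNetworkError(error: unknown, maxDepth = 5): boolean {
  let current: unknown = error;
  for (let depth = 0; depth < maxDepth; depth++) {
    if (typeof current !== "object" || current === null) return false;
    const { code, message, cause } = current as ErrorLike;
    if (typeof code === "string" && NETWORK_CODES.has(code)) return true;
    if (typeof message === "string" && /fetch failed/i.test(message)) return true;
    current = cause;
  }
  return false;
}

function classifyErrorKind(error: unknown): ErrorKind {
  const e = (typeof error === "object" && error !== null ? error : {}) as ErrorLike;
  const status = typeof e.status === "number" ? e.status : undefined;
  const message = typeof e.message === "string" ? e.message : "";
  // Billing wording is checked first so that it wins over a bare 401/403.
  if (status === 402 || BILLING_WORDS.test(message)) return "billing";
  if (status === 401 || status === 403) return "auth";
  if (status === 429 || (status !== undefined && status >= 500)) return "transient";
  // Assumption: an unreachable provider is tagged like any other transient failure.
  if (isNetworkError(error)) return "transient";
  return "client";
}
```

Under these assumptions, a 403 whose message mentions an exhausted balance is tagged `billing`, a plain 403 stays `auth`, and a `TypeError` with no status but an `ECONNREFUSED` cause is treated as a network failure.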

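Note on the failover-loop fixes (guarded observers in 0.2.5; per-request cursor and re-probe timer in 0.2.3): a minimal sketch of how those three ideas can fit together. Only the ideas and the name `resetIntervalMs` come from the changelog; `createFailover`, `safeNotify`, the `Provider` type, and the injectable `now` clock are hypothetical, and unlike the real engine this sketch fails over on every error rather than only on retryable ones.

```typescript
// Hypothetical sketch, not the ai-lcr source.
type Provider<T> = { name: string; call: () => Promise<T> };

// Fire-and-forget: neither a synchronous throw nor a rejected promise from an
// observer can affect the caller.
function safeNotify<A>(observer: ((arg: A) => unknown) | undefined, arg: A): void {
  try {
    Promise.resolve(observer?.(arg)).catch(() => {});
  } catch {
    // Swallowed on purpose: a throwing logging sink must not fail the request.
  }
}

function createFailover<T>(
  providers: Provider<T>[], // ordered cheapest first
  resetIntervalMs: number,
  onError?: (info: { provider: string; error: unknown }) => unknown,
  now: () => number = Date.now,
) {
  let startHint = 0; // shared between requests: where the next one starts
  let lastFailoverAt = 0; // shared: when the hint last moved off the cheapest

  return async function run(): Promise<T> {
    // The timer measures from the last failover, not the last call, so the
    // cheapest provider is re-probed even under sustained traffic.
    if (startHint !== 0 && now() - lastFailoverAt >= resetIntervalMs) startHint = 0;

    const start = startHint; // read the shared hint once
    let lastError: unknown = new Error("no providers configured");
    for (let i = start; i < providers.length; i++) { // cursor local to this request
      const provider = providers[i];
      try {
        const result = await provider.call();
        if (i !== start) {
          startHint = i; // record where the next request should start
          lastFailoverAt = now();
        }
        return result;
      } catch (error) {
        lastError = error;
        safeNotify(onError, { provider: provider.name, error });
      }
    }
    throw lastError;
  };
}
```

Because the loop index lives inside each `run()` call, two concurrent requests cannot move each other's position; they share only the start hint and the failover timestamp.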
package/README.md CHANGED

@@ -156,7 +156,7 @@ Any OpenAI-compatible endpoint works — and so does any AI SDK provider package
 - **Model vendors' own APIs (native):** route straight to [DeepSeek](https://platform.deepseek.com), [OpenAI](https://openai.com), [Anthropic](https://anthropic.com), [Google](https://ai.google.dev), [xAI](https://x.ai), etc. via their AI SDK provider packages — no markup, full native features. See [Route to a model vendor's own API](#route-to-a-model-vendors-own-api-native-providers).
 - **Text aggregators:** [OpenRouter](https://openrouter.ai) (widest coverage, list pricing) · [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off** every model) · [TokenMart](https://thetokenmart.ai) (15–65% off, varies by model)
-- **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) — routing via `createMediaLCR`. Image: Kunavo + Runware + fal. Video: fal (live, via its async queue API); Kunavo's Veo poll path is implemented but unverified
+- **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai) — image routing available via `createMediaLCR` (Kunavo + Runware + fal adapters); video on the roadmap
 ## Text model pricing
@@ -273,8 +273,7 @@ Two OpenAI-compatible providers, same probe, same day. Cells cover both families
 - [ ] Bundled price table for zero-config pricing (drop the manual `cost` numbers)
 - [ ] Provider-quirk middleware (transparently patch known per-provider request quirks, e.g. Kunavo's ignored `max_tokens`)
 - [ ] Feed probe results into routing automatically (auto-exclude a model from a provider that fails its probe)
-- [x] Image & video model routing (`createMediaLCR`) — image via Kunavo + Runware + fal; **video live via fal** (async queue API)
-- [ ] Normalized cross-provider video price comparison + verified Kunavo/Runware video adapters
+- [ ] Image & video model routing (fal.ai / Runware / Kunavo)
 ## Affiliate disclosure

package/README.zh-CN.md CHANGED

@@ -114,7 +114,7 @@ const lcr = createLCR({
 - **Model vendors' official APIs (native):** connect directly to [DeepSeek](https://platform.deepseek.com), [OpenAI](https://openai.com), [Anthropic](https://anthropic.com), [Google](https://ai.google.dev), [xAI](https://x.ai), etc. through their respective AI SDK provider packages: no markup, full native features. See the section "Route to a model vendor's official API (native providers)" above.
 - **Text aggregators:** [OpenRouter](https://openrouter.ai) (widest coverage, list pricing) · [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off every model**) · [TokenMart](https://thetokenmart.ai) (15–65% off, varies by model)
-- **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai), routed via `createMediaLCR`. Image: Kunavo + Runware + fal. Video: fal (available, via its async queue API); Kunavo's Veo polling path is implemented but unverified
+- **Image / video:** [Kunavo](https://kunavo.com/?ref=victorimf) (**20% off**) · [TokenMart](https://thetokenmart.ai) · [fal.ai](https://fal.ai) · [Runware](https://runware.ai); routing is on the roadmap
 ## Text model pricing
@@ -229,8 +229,7 @@ API_KEY=$TOKENMART_API_KEY BASE=https://api.tokenmart.ai \
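 - [ ] Built-in price table for zero-config pricing (no more hand-entered `cost` numbers)
 - [ ] Provider-quirk middleware (transparently patch known quirks, e.g. Kunavo's ignored `max_tokens`)
 - [ ] Feed probe results into routing automatically (a provider×model pair that fails its probe is dropped from the list automatically)
-- [x] Image and video model routing (`createMediaLCR`): image via Kunavo + Runware + fal; **video available via fal** (async queue API)
-- [ ] Normalized cross-provider video price comparison + verified Kunavo/Runware video adapters
+- [ ] Image and video model routing (fal.ai / Runware / Kunavo)
 ## Affiliate disclosure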