ai-lcr 0.5.5 → 0.5.6

package/CHANGELOG.md CHANGED
@@ -4,6 +4,94 @@ All notable changes to `ai-lcr` are documented here. The format follows
4
4
  [Keep a Changelog](https://keepachangelog.com/), and the project adheres to
5
5
  [Semantic Versioning](https://semver.org/).
6
6
 
7
+ ## [0.5.6] — 2026-06-07
8
+
9
+ All additions are optional and backward compatible. The sync `createMediaLCR`
10
+ router (the callable `generate(modelId, input)`) and every adapter's `run()` are
11
+ **unchanged** in signature and behavior.
12
+
13
+ ### Added
14
+
15
+ - **Async media routing — `submit` / `poll` for long-running (video) jobs.**
16
+ The blocking media path holds a serverless invocation open until the file is
17
+ ready: fine for an image (seconds), impossible for a minutes-long video job.
18
+ `createMediaLCR(...)` now returns a callable with two methods attached:
19
+
20
+ ```ts
21
+ const lcr = createMediaLCR({ registry, adapters })
22
+
23
+ // process A (request handler): route + enqueue, return immediately
24
+ const handle = await lcr.submit('google/veo-3-lite', { prompt, aspect_ratio: '16:9' })
25
+ await db.save(JSON.stringify(handle)) // the handle is plain JSON
26
+
27
+ // process B (cron / queue worker): poll until terminal
28
+ const r = await lcr.poll(handle)
29
+ if (r.done) use(r.outputs, r.costCents) // else keep polling r.handle
30
+ ```
31
+
32
+ - **Routing happens at `submit`** — it picks the cheapest provider whose
33
+ adapter supports async, and the returned `MediaJobHandle` carries the
34
+ not-yet-tried fallback routes (cheapest-first), the original input, and the
35
+ telemetry accumulator. The handle is **serializable on purpose**: submit and
36
+ poll typically run in different processes, so it must survive a round-trip
37
+ through a database or queue.
38
+ - **Failover happens at `poll`, not just submit.** When a provider's job fails
39
+ mid-poll (a `status:"error"`, a completed-but-empty job, or a thrown
40
+ retryable transport error such as the video-timeout `504` remap), `poll`
41
+ **re-submits to the next fallback provider** and hands back a fresh handle to
42
+ keep polling — it does not give up. A thrown error uses the standard
43
+ `isRetryableError` gate (so a caller-bug `400` on the poll endpoint doesn't
44
+ loop); a provider's own job failure always earns a fallback attempt.
45
+ - **Telemetry lands once, at the terminal poll.** The single correlated
46
+ `CallRecord` (via `onCall`) and the `onCost` event fire when the job settles
47
+ (`poll` → done/exhausted), carrying the full failover chain across both
48
+ processes — not at `submit`. The one exception: a `submit` that *no* provider
49
+ accepts settles a failed record there (there is no poll to do it).
50
+
51
+ - **`MediaAdapter.submit` / `MediaAdapter.checkStatus` (both optional).** The
52
+ adapter contract gains the async pair, shaped to match ai-art's
53
+ `ProviderAdapter` so a consumer can delegate its own async runtime to ai-lcr
54
+ with no glue:
55
+
56
+ ```ts
57
+ submit({ externalId, input, metadata? }) -> { requestId }
58
+ checkStatus({ externalId, requestId }) ->
59
+ { status: 'queued' | 'running' | 'done' | 'error', outputs?, costCents?, units?, error? }
60
+ ```
61
+
62
+ A sync-only adapter (image-only) omits both; the async router simply skips a
63
+ route whose adapter can't serve async.
64
+
65
+ - **All three bundled adapters now implement the async path:**
66
+ - **Kunavo** — `submit` → `POST /v1/videos`, `checkStatus` → `GET /v1/videos/{id}`
67
+ (video only; submitting an image id throws, since Kunavo images are sync).
68
+ `run()`'s blocking async path now reuses these internally.
69
+ - **fal** — `submit` → `POST queue.fal.run/{model}`, `checkStatus` reconstructs
70
+ the queue base from the id (the `fal-ai/flux/schnell` → `fal-ai/flux`
71
+ sub-path quirk) for cross-process polling.
72
+ - **Runware** — gains an **async video** path (`videoInference` with
73
+ `deliveryMethod:"async"`, polled via `getResponse`). Image stays on the
74
+ synchronous `run()`.
75
+
76
+ - **New exported types:** `MediaSubmitRequest`, `MediaSubmitResult`,
77
+ `MediaStatusRequest`, `MediaStatusResult`, `MediaJobStatus`,
78
+ `MediaSubmitOptions`, `MediaJobHandle`, `MediaPollResult`, and `MediaLCR` (the
79
+ callable-with-methods return type of `createMediaLCR`).
80
+
81
+ - **Live probe `scripts/check-media-async.mjs`** — exercises the real
82
+ `submit`/`poll` API across **every async provider** (kunavo · fal · runware)
83
+ whose key is present: submit → JSON round-trip the handle → poll to done →
84
+ assert the output URL fetches and cost is reported, per provider.
85
+ `PROBE_FAILOVER=1` adds a live submit-time failover case.
86
+
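The probe's "JSON round-trip the handle" step can be approximated offline. The following is a minimal sketch, not part of the released package: `survivesJsonRoundTrip` and `deepEqual` are hypothetical helpers, and the sample handle in the usage note is an invented shape, since the real `MediaJobHandle` fields are not shown in this diff.

```ts
// Hypothetical helper (not an ai-lcr export): true when a value comes back
// unchanged from JSON.stringify -> JSON.parse, i.e. it is "plain JSON".
function survivesJsonRoundTrip(value: unknown): boolean {
  let text: string | undefined
  try {
    text = JSON.stringify(value)
  } catch {
    return false // circular references and BigInt make stringify throw
  }
  if (text === undefined) return false // a bare function or undefined
  return deepEqual(JSON.parse(text), value)
}

// Structural equality restricted to JSON-safe shapes (plain objects, arrays,
// primitives). Date, Map, and class instances are deliberately rejected.
function deepEqual(a: unknown, b: unknown): boolean {
  if (a === b) return true
  if (typeof a !== 'object' || typeof b !== 'object' || a === null || b === null) return false
  if (Array.isArray(a) !== Array.isArray(b)) return false
  const protoB = Object.getPrototypeOf(b)
  if (!Array.isArray(b) && protoB !== Object.prototype && protoB !== null) return false
  const recA = a as Record<string, unknown>
  const recB = b as Record<string, unknown>
  const keysA = Object.keys(recA)
  const keysB = Object.keys(recB)
  if (keysA.length !== keysB.length) return false
  return keysB.every(k => k in recA && deepEqual(recA[k], recB[k]))
}
```

A consumer writing a custom adapter might run this against whatever its `submit` puts into the handle, for example `survivesJsonRoundTrip({ routes: [{ provider: 'fal' }], input: { prompt: 'a wave' } })`. A `Date`, a callback, or an `undefined` field would fail the check, and each of those would also be lost on a database hop.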
87
+ ### Migration
88
+
89
+ Nothing breaks. To adopt async, give your video adapters `submit`/`checkStatus`
90
+ (the bundled fal/kunavo/runware adapters already have them) and call
91
+ `lcr.submit(...)` / `lcr.poll(...)` instead of the blocking `lcr(...)`. The
92
+ blocking call still works for image and for video where holding the request open
93
+ is acceptable.
94
+
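When giving a custom adapter a `checkStatus`, most of the work is likely to be translating the provider's own job states into the four values listed in the changelog. Below is a minimal sketch under stated assumptions: the raw state names, the `RawJob` shape, and `outputs` being an array of URL strings are all invented for illustration, and the local `JobStatus` and `StatusResult` types merely mirror the shape quoted above rather than importing the real exports.

```ts
// Mirrors the status union quoted in the changelog; defined locally so the
// sketch is self-contained (the real exported type may differ).
type JobStatus = 'queued' | 'running' | 'done' | 'error'

interface StatusResult {
  status: JobStatus
  outputs?: string[] // assumption: outputs are file URLs
  error?: string
}

// Hypothetical provider payload; real providers use their own field names.
interface RawJob {
  state: string
  urls?: string[]
  message?: string
}

// Hypothetical raw states for an imaginary provider.
const RAW_TO_STATUS: Record<string, JobStatus> = {
  PENDING: 'queued',
  IN_QUEUE: 'queued',
  PROCESSING: 'running',
  SUCCEEDED: 'done',
  FAILED: 'error',
  CANCELLED: 'error',
}

function toStatusResult(raw: RawJob): StatusResult {
  const status = RAW_TO_STATUS[raw.state.toUpperCase()]
  if (status === undefined) {
    // Unknown states are surfaced as errors instead of polling forever.
    return { status: 'error', error: `unknown provider state: ${raw.state}` }
  }
  if (status === 'done') {
    const outputs = raw.urls ?? []
    // Design choice of this sketch: report a completed-but-empty job as an error.
    return outputs.length > 0
      ? { status, outputs }
      : { status: 'error', error: 'job completed with no outputs' }
  }
  if (status === 'error') {
    return { status, error: raw.message ?? 'provider reported failure' }
  }
  return { status }
}
```

Mapping unknown or empty results to `'error'` keeps the adapter honest: per the changelog, a provider's own job failure earns a fallback attempt at `poll`, so reporting the failure is what lets the router move on.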
7
95
  ## [0.5.5] — 2026-06-06
8
96
 
9
97
  Kunavo media (image + video) verified live and properly wired. The Kunavo
package/README.md CHANGED
@@ -291,11 +291,65 @@ USD per second, as of 2026-05 — verify current rates. Video billing differs by
291
291
  | Seedance Pro | $0.124 |
292
292
  | Veo 3.1 (audio-on) | $0.400 |
293
293
 
294
+ ## Image & video routing (`createMediaLCR`)
295
+
296
+ Image and video are a separate, self-contained side of `ai-lcr` (file outputs, mixed pricing units, async jobs) — see [`src/media.ts`](src/media.ts). You give it a registry (each model's provider routes + per-unit price) and a set of adapters; it routes cheapest-first, fails over, and reports real/normalized cost through the same `onCall` sink as text.
297
+
298
+ ```ts
299
+ import { createMediaLCR, createKunavoMediaAdapter, createFalMediaAdapter } from 'ai-lcr'
300
+
301
+ const lcr = createMediaLCR({
302
+ registry: {
303
+ 'google/veo-3-lite': {
304
+ id: 'google/veo-3-lite', modality: 'video',
305
+ routes: [
306
+ { provider: 'kunavo', externalId: 'veo-3-lite', pricing: { unit: 'call', cents: 16 } },
307
+ { provider: 'fal', externalId: 'fal-ai/veo3.1/lite', pricing: { unit: 'second', cents: 8 } },
308
+ ],
309
+ },
310
+ },
311
+ adapters: {
312
+ kunavo: createKunavoMediaAdapter({ apiKey: process.env.KUNAVO_API_KEY! }),
313
+ fal: createFalMediaAdapter({ apiKey: process.env.FAL_KEY! }),
314
+ },
315
+ onCall: rec => console.log(rec.winner, rec.costUsd, rec.failedOver),
316
+ })
317
+
318
+ // Sync: resolves when the file is ready (fine for images).
319
+ const { outputs, provider, costCents } = await lcr('google/veo-3-lite', { prompt: 'a wave' })
320
+ ```
321
+
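Because the two routes above are priced in different units (per call vs. per second), which one is "cheapest" depends on the clip length. The arithmetic can be sketched as follows. This illustrates the comparison only and is not `ai-lcr`'s actual normalization: `estimateCents` and `cheapestFirst` are hypothetical helpers, and only the `'call'` and `'second'` units seen in the example are modeled.

```ts
// Local stand-ins for the registry shapes used in the README example.
type Pricing = { unit: 'call' | 'second'; cents: number }
interface Route {
  provider: string
  externalId: string
  pricing: Pricing
}

// Hypothetical: estimated cost of one job of the given duration.
function estimateCents(pricing: Pricing, seconds: number): number {
  return pricing.unit === 'second' ? pricing.cents * seconds : pricing.cents
}

// Hypothetical: routes ordered by estimated cost, without mutating the input.
function cheapestFirst(routes: Route[], seconds: number): Route[] {
  return routes
    .slice()
    .sort((a, b) => estimateCents(a.pricing, seconds) - estimateCents(b.pricing, seconds))
}

const routes: Route[] = [
  { provider: 'kunavo', externalId: 'veo-3-lite', pricing: { unit: 'call', cents: 16 } },
  { provider: 'fal', externalId: 'fal-ai/veo3.1/lite', pricing: { unit: 'second', cents: 8 } },
]

// 8-second clip: kunavo = 16 cents, fal = 8 * 8 = 64 cents
console.log(cheapestFirst(routes, 8).map(r => r.provider)) // [ 'kunavo', 'fal' ]
// 1-second clip: fal = 8 cents, kunavo = 16 cents
console.log(cheapestFirst(routes, 1).map(r => r.provider)) // [ 'fal', 'kunavo' ]
```

With these example prices the break-even is a 2-second clip (16 cents either way), so the per-call route wins for anything longer.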
322
+ ### Async (`submit` / `poll`) — for long-running video
323
+
324
+ A minutes-long video job can't hold a serverless request open. `submit` routes + enqueues and returns a **plain-JSON handle**; `poll` checks it. The two run in different processes — the handle survives a database/queue hop.
325
+
326
+ ```ts
327
+ // process A — request handler: route + enqueue, return immediately
328
+ const handle = await lcr.submit('google/veo-3-lite', { prompt: 'a wave', aspect_ratio: '16:9' })
329
+ await db.jobs.put(jobId, JSON.stringify(handle))
330
+
331
+ // process B — cron / queue worker: poll until terminal
332
+ let handle = JSON.parse(await db.jobs.get(jobId))
333
+ const r = await lcr.poll(handle)
334
+ if (r.done) {
335
+ save(r.outputs, r.costCents) // settled — telemetry already emitted
336
+ } else {
337
+ await db.jobs.put(jobId, JSON.stringify(r.handle)) // keep polling r.handle
338
+ }
339
+ ```
340
+
341
+ Design choices worth knowing:
342
+
343
+ - **Routing is at `submit`** (cheapest async-capable provider); the handle carries the not-yet-tried fallbacks, so…
344
+ - **Failover is at `poll`** — a provider whose job fails mid-poll is re-submitted to the next provider automatically (a fresh `r.handle` to keep polling), rather than the request just dying.
345
+ - **Telemetry lands once, at the terminal poll** — one `onCall` `CallRecord` with the full failover chain, threaded across both processes (not at `submit`).
346
+ - An adapter advertises async by implementing `submit` + `checkStatus`; image-only adapters omit them and are skipped by the async router. The bundled Kunavo, fal, and Runware adapters all implement the async path (Kunavo/Runware async is video-only; fal covers both).
347
+
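The worker in "process B" still has to decide what to do after each `poll`: finish, or store the fresh handle and come back later. One way to structure that decision is sketched below. Everything here is illustrative and not part of `ai-lcr`: `PollResult` is a local stand-in inferred from the fields used in the example above (`done`, `outputs`, `costCents`, `handle`), and `backoffMs`, `planNextStep`, and the 5 s / 60 s delays are assumptions.

```ts
// Local stand-in for the poll result, inferred from the README example.
interface PollResult {
  done: boolean
  outputs?: unknown
  costCents?: number
  handle?: unknown
}

type NextStep =
  | { kind: 'finished'; outputs: unknown; costCents: number | undefined }
  | { kind: 'requeue'; handleJson: string; delayMs: number }

// Hypothetical schedule: exponential backoff, 5 s doubling up to a 60 s cap.
function backoffMs(attempt: number, baseMs = 5000, capMs = 60000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt))
}

// Hypothetical: turn one poll result into the worker's next action.
function planNextStep(r: PollResult, attempt: number): NextStep {
  if (r.done) {
    return { kind: 'finished', outputs: r.outputs, costCents: r.costCents }
  }
  if (r.handle === undefined) {
    throw new Error('non-terminal poll result carried no handle')
  }
  // Always persist r.handle, not the handle that was polled: after a
  // failover it points at a different provider's job.
  return { kind: 'requeue', handleJson: JSON.stringify(r.handle), delayMs: backoffMs(attempt) }
}
```

A queue worker would call `planNextStep(await lcr.poll(handle), attempt)`, then either save the outputs or write `handleJson` back and re-enqueue itself after `delayMs`. Keeping the decision in a pure function makes it testable without a live provider.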
294
348
  ## Vetting a provider (capability + cost probe)
295
349
 
296
350
  A discount is worthless if the provider quietly breaks the wire protocol. `ai-lcr` ships a zero-dependency check (`scripts/check-provider.sh`, just `bash` + `curl` + `python3`) that vets the things that actually cost you money or corrupt output, **per model**:
297
351
 
298
- > **Media providers** have their own probe: `scripts/check-kunavo-media.sh` (`bash` + `curl` + `jq`) live-tests Kunavo's image generation, `*-edit` reference endpoint, and async + sync video — the same checks used to verify the routes above. Run it before trusting a media route in production.
352
+ > **Media providers** have their own probes: `scripts/check-kunavo-media.sh` (`bash` + `curl` + `jq`) live-tests Kunavo's image generation, `*-edit` reference endpoint, and async + sync video; `scripts/check-media-async.mjs` exercises `ai-lcr`'s own `submit`/`poll` API across **every async provider** (kunavo · fal · runware) whose key is present: submit → JSON round-trip the handle → poll to done → assert the output URL fetches and cost is reported, per provider (`PROBE_FAILOVER=1` adds a live submit-time failover case). Run them before trusting a media route in production.
299
353
 
300
354
  - **tool calling** — single call and a multi-step round-trip with `content: null` (the shape every agent loop sends)
301
355
  - **`max_tokens` honored** — caps must bound output
package/README.zh-CN.md CHANGED
@@ -207,10 +207,66 @@ Kunavo serves Anthropic + Google. DeepSeek / OpenAI / Grok / Mistral route to
207
207
  | Seedance Pro | $0.124 |
208
208
  | Veo 3.1 (audio-on) | $0.400 |
209
209
 
210
+ ## Image & video routing (`createMediaLCR`)
211
+
212
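+ Image and video are a separate side of `ai-lcr` (outputs are files, pricing units are mixed, video jobs are async); see [`src/media.ts`](src/media.ts). You provide a registry (each model's provider routes + per-unit price) and a set of adapters; it routes cheapest-first, fails over automatically, and reports real/normalized cost through the same `onCall` sink as the text side.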
213
+
214
+ ```ts
215
+ import { createMediaLCR, createKunavoMediaAdapter, createFalMediaAdapter } from 'ai-lcr'
216
+
217
+ const lcr = createMediaLCR({
218
+ registry: {
219
+ 'google/veo-3-lite': {
220
+ id: 'google/veo-3-lite', modality: 'video',
221
+ routes: [
222
+ { provider: 'kunavo', externalId: 'veo-3-lite', pricing: { unit: 'call', cents: 16 } },
223
+ { provider: 'fal', externalId: 'fal-ai/veo3.1/lite', pricing: { unit: 'second', cents: 8 } },
224
+ ],
225
+ },
226
+ },
227
+ adapters: {
228
+ kunavo: createKunavoMediaAdapter({ apiKey: process.env.KUNAVO_API_KEY! }),
229
+ fal: createFalMediaAdapter({ apiKey: process.env.FAL_KEY! }),
230
+ },
231
+ onCall: rec => console.log(rec.winner, rec.costUsd, rec.failedOver),
232
+ })
233
+
234
+ // Sync: resolves only when the file is ready (fine for images).
235
+ const { outputs, provider, costCents } = await lcr('google/veo-3-lite', { prompt: 'a wave' })
236
+ ```
237
+
238
+ ### Async (`submit` / `poll`): for long-running video
239
+
240
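+ A minutes-long video job can't hold a serverless request open. `submit` handles routing + enqueueing and returns a **plain-JSON handle**; `poll` checks it. The two run in different processes, and the handle survives a round-trip through a database or queue.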
241
+
242
+ ```ts
243
+ // process A (request handler): route + enqueue, return immediately
244
+ const handle = await lcr.submit('google/veo-3-lite', { prompt: 'a wave', aspect_ratio: '16:9' })
245
+ await db.jobs.put(jobId, JSON.stringify(handle))
246
+
247
+ // process B (cron / queue worker): poll until terminal
248
+ let handle = JSON.parse(await db.jobs.get(jobId))
249
+ const r = await lcr.poll(handle)
250
+ if (r.done) {
251
+ save(r.outputs, r.costCents) // settled; one telemetry record has already been emitted
252
+ } else {
253
+ await db.jobs.put(jobId, JSON.stringify(r.handle)) // keep polling r.handle
254
+ }
255
+ ```
256
+
257
+ A few design trade-offs worth knowing:
258
+
259
+ - **Routing happens at `submit`** (it picks the cheapest async-capable provider); the handle carries the list of not-yet-tried fallbacks, so:
260
+ - **Failover happens at `poll`**: when a provider's job fails mid-poll, it is automatically re-submitted to the next provider (returning a new `r.handle` to keep polling) instead of letting the request die.
261
+ - **Telemetry lands once, at the terminal poll**: one `onCall` `CallRecord` carrying the full failover chain, threaded across both processes (not at `submit`).
262
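+ - An adapter declares async support by implementing `submit` + `checkStatus`; image-only adapters omit them, and the async router skips such routes. The bundled Kunavo, fal, and Runware adapters all implement the async path (Kunavo/Runware async is video-only; fal handles both image and video).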
263
+
210
264
  ## Vetting a provider (capability + cost probe)
211
265
 
212
266
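  A discount, however large, is worthless if the provider quietly breaks the protocol. `ai-lcr` ships a zero-dependency check script (`scripts/check-provider.sh`, needing only `bash` + `curl` + `python3`) that checks, **per model**, the things that actually cost you money or corrupt output: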
213
267
 
268
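+ > **Media providers have their own probes:** `scripts/check-kunavo-media.sh` (`bash` + `curl` + `jq`) live-tests Kunavo's image generation, `*-edit` reference-image endpoint, and async + sync video; `scripts/check-media-async.mjs` runs `ai-lcr`'s own `submit`/`poll` API **per provider** (kunavo · fal · runware, only those with a key present): submit → JSON round-trip the handle → poll to done → assert the URL can actually be fetched with a GET and cost is reported (`PROBE_FAILOVER=1` adds a live submit-time failover case). Run it once before going to production.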
269
+
214
270
  - **Tool calling**: a single call plus a multi-step round-trip with `content: null` (the shape every agent loop sends)
215
271
  - **Whether `max_tokens` is honored**: the cap must bound output length
216
272
  - **Hidden prompt injection**: send a neutral message; if the model starts responding to a system prompt it never received, the provider injected something