llm-stream-assemble 1.2.0 → 1.3.6

package/README.md CHANGED
@@ -1,28 +1,37 @@
1
1
  # llm-stream-assemble
2
2
 
3
- ![core](https://img.shields.io/badge/core-1.2.0-blue)
3
+ ![core](https://img.shields.io/badge/core-1.3.6-blue)
4
4
  ![node](https://img.shields.io/badge/node-%3E%3D18-339933)
5
5
  ![runtime deps](https://img.shields.io/badge/runtime_deps-0-brightgreen)
6
- ![tests](https://img.shields.io/badge/tests-755%2B_passing-brightgreen)
6
+ ![tests](https://img.shields.io/badge/tests-1019%2B_passing-brightgreen)
7
7
  [![ci](https://github.com/01laky/llm-stream-assemble/actions/workflows/ci.yml/badge.svg)](https://github.com/01laky/llm-stream-assemble/actions/workflows/ci.yml)
8
- ![status](https://img.shields.io/badge/status-stable_1.2.0-brightgreen)
8
+ ![status](https://img.shields.io/badge/status-stable_1.3.6-brightgreen)
9
9
 
10
10
  **One typed event model for every LLM stream** — text, tool calls, reasoning, JSON, usage, refusals, errors, and non-streaming responses.
11
11
 
12
12
  > A zero-dependency TypeScript layer for assembling **OpenAI**, **Anthropic**, **Google Gemini**, and **OpenAI-compatible** LLM streams into unified events — so you can stop hand-rolling provider parsers and keep one clean, typed event model across chat UIs, agents, proxies, and backends.
13
13
 
14
- **Status:** Stable `1.2.0`. Five built-in adapters, twelve OpenAI-compatible host presets (including **Azure OpenAI**), transforms, replay helpers, and examples are production-ready. Pin semver ranges as usual and review [CHANGELOG.md](./CHANGELOG.md) before major upgrades.
14
+ Turn provider SSE fragments into typed events, **not another `+=` loop**.
15
+
16
+ **Status:** Stable `1.3.6`. Five built-in adapters, thirteen OpenAI-compatible host presets (including **Azure OpenAI** and **Cloudflare Workers AI**), transforms, replay helpers, and examples are production-ready. Pin semver ranges as usual and review [CHANGELOG.md](./CHANGELOG.md) before major upgrades.
15
17
 
16
18
  ---
17
19
 
18
20
  ## Contents
19
21
 
22
+ - [Why not just concatenate?](#why-not-just-concatenate)
23
+ - [Edge-case showcase](#edge-case-showcase)
20
24
  - [Why use this](#why-use-this)
21
25
  - [Architecture](#architecture)
22
26
  - [Providers at a glance](#providers-at-a-glance)
23
27
  - [Install](#install)
28
+ - [First success in 30 seconds](#first-success-in-30-seconds)
24
29
  - [Quickstart](#quickstart)
30
+ - [Quick decision guide](#quick-decision-guide)
25
31
  - [Documentation](#documentation)
32
+ - [How this compares](#how-this-compares)
33
+ - [Examples](#examples)
34
+ - [Integration cookbook](#integration-cookbook)
26
35
  - [Usage guides](#usage-guides)
27
36
  - [Transforms & replay](#transforms--replay)
28
37
  - [Examples & proxy safety](#examples--proxy-safety)
@@ -31,13 +40,67 @@
31
40
 
32
41
  ---
33
42
 
43
+ ## Why not just concatenate?
44
+
45
+ Raw LLM streams look like text, but **simple string concatenation or naive `JSON.parse` per chunk fails** in production. Providers emit **protocol events**, not finished messages.
46
+
47
+ 1. **SSE mid-line splits** — TCP chunks can break `data: {"choices":[...]}\n` across reads; you need a line buffer (`parse-sse.ts`, fixtures **LSA-C**).
48
+ 2. **Tool argument fragmentation** — function parameters arrive as partial JSON across dozens of deltas; only assembly produces valid `tool_call.done` args.
49
+ 3. **Anthropic id/index ordering** — `tool_use` blocks may stream `index` before `id`; fine-grained `input_json_delta` is invalid JSON until the block ends.
50
+ 4. **Reasoning vs user text** — DeepSeek R1, Claude thinking, and OpenAI reasoning models interleave hidden reasoning that must map to `reasoning.*`, not `text.*`.
51
+ 5. **JSON mode streaming** — structured output streams as deltas; you do not receive a parsed object until completion (`json.delta` / `json.done`).
52
+ 6. **Stream lifecycle** — `[DONE]` markers, usage-only tail chunks, and incomplete streams without explicit finish need consistent terminal handling.
53
+ 7. **Mid-stream errors** — provider error payloads must not leak raw internals to browsers; use `sanitizeErrors` when proxying (**LSA-X23**).
54
+ 8. **Dual code paths** — the same `StreamEvent` union should work for `stream: true` SSE and non-stream JSON (`assembleStream` vs `assembleResponse`).
55
+
56
+ This library is the **assembly layer** between those raw bytes and your UI, agent, or proxy.
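
The first failure mode in the list above (an SSE `data:` line split across reads) can be illustrated with a minimal line buffer. This is a sketch only, not the package's `parse-sse.ts`: the `SseLineBuffer` class is a hypothetical name, it assumes chunks are already decoded to strings, and it ignores multi-line `data:` fields and other SSE field types.

```ts
// Hypothetical sketch of a line buffer; not the library's parse-sse.ts.
class SseLineBuffer {
  // Holds the trailing partial line between reads.
  private pending = "";

  /** Feed one decoded chunk; returns the `data:` payloads it completed. */
  push(chunk: string): string[] {
    this.pending += chunk;
    const payloads: string[] = [];
    let newline = this.pending.indexOf("\n");
    while (newline !== -1) {
      // Drop an optional trailing "\r" so CRLF streams behave the same.
      const line = this.pending.slice(0, newline).replace(/\r$/, "");
      this.pending = this.pending.slice(newline + 1);
      if (line.startsWith("data:")) {
        const value = line.slice(5);
        // SSE strips a single leading space after the colon.
        payloads.push(value.startsWith(" ") ? value.slice(1) : value);
      }
      newline = this.pending.indexOf("\n");
    }
    return payloads;
  }
}

// One payload arrives split mid-line across two reads.
const buffer = new SseLineBuffer();
const first = buffer.push('data: {"choices":[{"delta":{"con');
const second = buffer.push('tent":"Hi"}}]}\n\ndata: [DONE]\n');
```

Here `first` is empty because no line has ended yet, while `second` holds the completed JSON payload followed by the `[DONE]` marker. Calling `JSON.parse` on the first chunk alone would throw.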
57
+
58
+ ### Why not `text += chunk`?
59
+
60
+ The first reaction is often: “Why not `message += chunk`?” Provider streams are **protocol events**, not finished message strings.
61
+
62
+ | Failure mode | What breaks with `+=` / naive parse | This library |
63
+ | --------------------------------- | --------------------------------------------------- | ------------------------------------------------- |
64
+ | **Chunk boundaries** | SSE `data:` line split mid-payload across TCP reads | Line buffer — `parse-sse.ts` |
65
+ | **Incomplete structures** | One SSE payload ≠ one complete JSON message | Adapter per payload; assembler until `.done` |
66
+ | **State management** | Parallel tools, reasoning vs text channels | `EventAssembler` per stream |
67
+ | **Parser invalidity mid-stream** | Anthropic `input_json_delta`, partial tool args | Partial preview; valid at `.done` |
68
+ | **JSON partials** | Structured output streams as fragments | `json.*`, `tool_call.args.delta` |
69
+ | **Markdown fences in model text** | ` ```json ` split across **text tokens** | **Out of scope** — render `text.delta` in your UI |
70
+
71
+ See [Edge-case showcase](#edge-case-showcase) for concrete chunk examples.
72
+
73
+ ---
74
+
75
+ ## Edge-case showcase
76
+
77
+ Raw streams break in predictable ways. Three layers — **SSE framing**, **tool/JSON assembly**, **UI text** — fail differently:
78
+
79
+ ![Chunk assembly: SSE fragments to unified events](https://raw.githubusercontent.com/01laky/llm-stream-assemble/main/docs/img/chunk-assembly.svg)
80
+
81
+ - **SSE mid-line split** — TCP reads break `data: {...}\n` across buffers; line parser required.
82
+ - **Tool JSON partials** — args stream as `{`, `"city":`, `"Paris"}` before `tool_call.done`.
83
+ - **JSON mode** — structured output arrives as `json.delta` strings, not a parsed object.
84
+
85
+ **[Full edge-case walkthrough →](./docs/edge-cases.md)** — DIY vs `assembleStream`, fixture replay, test IDs (**LSA-C04**, **LSA-C52**, golden fixtures).
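
The tool JSON partials case above can be sketched as follows. This is an illustration under stated assumptions, not the library's `EventAssembler`: the `ToolArgFragment` shape, the `ToolCallBuffer` class, and the `get_weather` tool name are all hypothetical.

```ts
// Hypothetical fragment shape; the library's real chunk types may differ.
interface ToolArgFragment {
  index: number;
  name?: string;
  argsDelta: string;
}

class ToolCallBuffer {
  private readonly calls = new Map<number, { name: string; args: string }>();

  /** Append one argument fragment to the open call at `fragment.index`. */
  push(fragment: ToolArgFragment): void {
    const call = this.calls.get(fragment.index) ?? { name: "", args: "" };
    if (fragment.name) call.name = fragment.name;
    call.args += fragment.argsDelta;
    this.calls.set(fragment.index, call);
  }

  /** Parse only once the provider signals the call is complete. */
  finish(index: number): { name: string; args: unknown } {
    const call = this.calls.get(index);
    if (!call) throw new Error(`no open tool call at index ${index}`);
    this.calls.delete(index);
    return { name: call.name, args: JSON.parse(call.args) };
  }
}

// Fragments arrive as `{`, `"city":`, `"Paris"}`; none is valid JSON alone.
const tools = new ToolCallBuffer();
for (const argsDelta of ["{", '"city":', '"Paris"}']) {
  tools.push({ index: 0, name: "get_weather", argsDelta });
}
const done = tools.finish(0);
```

Keying the buffer by `index` is what lets parallel tool calls accumulate independently; parsing is deferred to `finish` because intermediate concatenations such as `{"city":` are not valid JSON.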
86
+
87
+ ---
88
+
34
89
  ## Why use this
35
90
 
36
91
  - **Zero runtime dependencies** — thin adapters + core assembly, no provider SDKs.
37
92
  - **Stream and non-stream parity** — same `StreamEvent` union from SSE chunks or JSON bodies.
38
- - **Provider presets, not forks** — Groq, Azure, Perplexity, xAI, and others reuse one compatible parser with dialect options.
93
+ - **Provider presets, not forks** — Groq, Azure, Cloudflare, Perplexity, xAI, and others reuse one compatible parser with dialect options.
39
94
  - **Proxy-ready transforms** — `toSSE({ sanitizeErrors: true })`, `tapEvents`, `collectStream`, fixture replay.
40
95
 
96
+ ### Performance at a glance
97
+
98
+ - **Zero runtime dependencies** — verified in CI (`pnpm verify:deps`)
99
+ - **Incremental SSE parsing** — line buffer; no full-stream re-parse
100
+ - **Single-pass O(n) assembly** — **LSA-C52** smoke test on 10k chunks
101
+ - **Bounded buffers** — `maxBufferBytes` for untrusted streams
102
+ - **Local repro:** `pnpm bench:smoke` — see [performance](./docs/performance.md)
103
+
41
104
  ---
42
105
 
43
106
  ## Architecture
@@ -58,19 +121,30 @@ Every adapter maps provider-specific fragments into the same **`StreamEvent`** u
58
121
 
59
122
  **Design constraints:** adapters never accumulate cross-chunk state beyond id/index reconciliation; assembly, buffering, and `.done` emission live in core. No HTTP client, no tool execution, no UI — just the stream layer.
60
123
 
124
+ ### Lifecycle & concurrency
125
+
126
+ - **`EventAssembler` is stateful per stream** — it buffers text, reasoning, JSON, refusals, and open tool calls until `.done` / `finish`.
127
+ - **Public APIs create a new assembler per call** — `assembleStream`, `assembleFromPayloads`, `assembleResponse`, and `createAssemblyTransform` each construct their own instance.
128
+ - **One assembler = one stream/response** — do not share an instance across concurrent requests.
129
+ - **`EventAssembler.reset()`** clears state for tests or explicit reuse after a stream completes.
130
+ - **Adapters are thin** — one payload in, `RawChunk[]` out; create **one adapter instance per request/stream** (minimal id/index map only).
131
+ - **Transforms are stateless** — `tapEvents`, `toSSE`, and `collectStream` operate on the unified event stream.
132
+
133
+ ![Stateful assembler vs stateless adapters](https://raw.githubusercontent.com/01laky/llm-stream-assemble/main/docs/img/assembler-lifecycle.svg)
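
The "one assembler = one stream" rule above can be demonstrated with a toy stand-in. The `TextAssembler` class below is hypothetical and far simpler than `EventAssembler`; it only shows why sharing a stateful buffer across concurrent streams corrupts output.

```ts
// Hypothetical stand-in for a stateful assembler: it buffers text until done().
class TextAssembler {
  private text = "";

  push(delta: string): void {
    this.text += delta;
  }

  /** Return the buffered text and clear state, similar in spirit to a reset. */
  done(): string {
    const full = this.text;
    this.text = "";
    return full;
  }
}

// Deltas from two concurrent requests, in arrival order.
type Tagged = { stream: "a" | "b"; delta: string };
const arrivals: Tagged[] = [
  { stream: "a", delta: "Hel" },
  { stream: "b", delta: "Bon" },
  { stream: "a", delta: "lo" },
  { stream: "b", delta: "jour" },
];

// Shared instance: both requests write into the same buffer.
const shared = new TextAssembler();
for (const { delta } of arrivals) shared.push(delta);
const corrupted = shared.done(); // "HelBonlojour"

// One instance per stream: each buffer sees only its own deltas.
const perStream = { a: new TextAssembler(), b: new TextAssembler() };
for (const { stream, delta } of arrivals) perStream[stream].push(delta);
const textA = perStream.a.done(); // "Hello"
const textB = perStream.b.done(); // "Bonjour"
```

Since the public entry points construct their own assembler per call, this only becomes a concern when an instance is created and held manually.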
134
+
61
135
  Diagram sources: [`docs/img/`](./docs/img/) (Mermaid `.mmd` + committed SVG). Regenerate with `pnpm diagrams:build`.
62
136
 
63
137
  ---
64
138
 
65
139
  ## Providers at a glance
66
140
 
67
- | Adapter | Provider / API | Import |
68
- | --------------------------------------- | ----------------------------------------------------------------------------------------------------------------------- | ------------------------------------------- |
69
- | `openaiChatAdapter()` | OpenAI Chat Completions | `llm-stream-assemble` |
70
- | `openaiCompatibleAdapter({ provider })` | Groq, DeepSeek, Mistral, Ollama, LM Studio, Together, Fireworks, OpenRouter, Perplexity, xAI, **Azure OpenAI**, generic | `llm-stream-assemble` |
71
- | `anthropicAdapter()` | Anthropic Messages | `llm-stream-assemble` |
72
- | `openaiResponsesAdapter()` | OpenAI Responses API | `llm-stream-assemble` |
73
- | `geminiAdapter()` | Google AI Gemini | `llm-stream-assemble` or `/adapters/gemini` |
141
+ | Adapter | Provider / API | Import |
142
+ | --------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------- |
143
+ | `openaiChatAdapter()` | OpenAI Chat Completions | `llm-stream-assemble` |
144
+ | `openaiCompatibleAdapter({ provider })` | Groq, DeepSeek, Mistral, Ollama, LM Studio, Together, Fireworks, OpenRouter, Perplexity, xAI, **Azure OpenAI**, **Cloudflare Workers AI**, generic | `llm-stream-assemble` |
145
+ | `anthropicAdapter()` | Anthropic Messages | `llm-stream-assemble` |
146
+ | `openaiResponsesAdapter()` | OpenAI Responses API | `llm-stream-assemble` |
147
+ | `geminiAdapter()` | Google AI Gemini | `llm-stream-assemble` or `/adapters/gemini` |
74
148
 
75
149
  Full feature flags and quirks: [compatibility matrix](./docs/compatibility.md).
76
150
 
@@ -87,6 +161,23 @@ pnpm add llm-stream-assemble
87
161
 
88
162
  ---
89
163
 
164
+ ## First success in 30 seconds
165
+
166
+ Minimal loop once you have a streaming `response.body` — see [Quickstart](#quickstart) for full `fetch` setup:
167
+
168
+ ```ts
169
+ import { assembleStream, openaiChatAdapter } from "llm-stream-assemble";
170
+
171
+ for await (const event of assembleStream(response.body!, openaiChatAdapter())) {
172
+ if (event.type === "text.delta") process.stdout.write(event.text);
173
+ if (event.type === "text.done") console.log("\n--- done:", event.text);
174
+ }
175
+ ```
176
+
177
+ Swap `openaiChatAdapter()` for `anthropicAdapter()`, `geminiAdapter()`, or `openaiCompatibleAdapter({ provider: "ollama" })` — [Quick decision guide](#quick-decision-guide).
178
+
179
+ ---
180
+
90
181
  ## Quickstart
91
182
 
92
183
  ```ts
@@ -99,10 +190,32 @@ for await (const event of assembleStream(response.body!, openaiChatAdapter())) {
99
190
 
100
191
  ---
101
192
 
193
+ ## Quick decision guide
194
+
195
+ Pick an adapter in ~30 seconds:
196
+
197
+ ![Quick decision guide](https://raw.githubusercontent.com/01laky/llm-stream-assemble/main/docs/img/quick-decision.svg)
198
+
199
+ - **OpenAI Chat Completions SSE** → `openaiChatAdapter()`
200
+ - **OpenAI Responses API** → `openaiResponsesAdapter()`
201
+ - **Anthropic Messages** → `anthropicAdapter()`
202
+ - **Google Gemini** → `geminiAdapter()`
203
+ - **Groq, Ollama, Azure, Cloudflare, OpenRouter, …** → `openaiCompatibleAdapter({ provider })`
204
+ - **Non-streaming JSON body** → `assembleResponse(body, adapter)`
205
+ - **React chat UI / full agent framework** → not this package — see [comparison](./docs/comparison.md)
206
+ - **XML/markdown tag parsing from model text** → out of scope — see [Non-goals](#non-goals)
207
+
208
+ ---
209
+
102
210
  ## Documentation
103
211
 
104
212
  - [Provider compatibility matrix](./docs/compatibility.md)
105
213
  - [Adapter author guide](./docs/adapter-guide.md)
214
+ - [Performance & runtime behavior](./docs/performance.md)
215
+ - [Edge-case showcase](./docs/edge-cases.md)
216
+ - [Integration cookbook](./docs/integration-cookbook.md)
217
+ - [How this compares](./docs/comparison.md)
218
+ - [FAQ](./docs/faq.md)
106
219
  - [Architecture diagrams](./docs/img/README.md)
107
220
  - [Live smoke checklist (maintainers)](./docs/live-smoke.md)
108
221
  - [Post-1.0 provider roadmap](./docs/post-1.0-provider-roadmap.md)
@@ -110,6 +223,90 @@ for await (const event of assembleStream(response.body!, openaiChatAdapter())) {
110
223
 
111
224
  ---
112
225
 
226
+ ## How this compares
227
+
228
+ | | llm-stream-assemble | Full-stack AI SDK | Provider SDK | DIY concat |
229
+ | ------------ | --------------------- | ------------------ | -------------- | ------------ |
230
+ | Scope | Stream assembly only | HTTP + UI + agents | Vendor RPC | Manual parse |
231
+ | Events | Unified `StreamEvent` | Framework types | Vendor types | Ad hoc |
232
+ | Dependencies | Zero runtime | Many | Vendor package | None |
233
+
234
+ Full matrix, when-not-to-use, and alternatives: **[docs/comparison.md](./docs/comparison.md)**.
235
+
236
+ ---
237
+
238
+ ## Examples
239
+
240
+ Curated index — full snippets live in [Usage guides](#usage-guides) and [`examples/`](./examples/README.md).
241
+
242
+ ### OpenAI Chat
243
+
244
+ ```ts
245
+ import { assembleStream, openaiChatAdapter } from "llm-stream-assemble";
246
+ // After a fetch() whose JSON request body sets `stream: true`:
247
+ for await (const event of assembleStream(response.body!, openaiChatAdapter())) {
248
+ if (event.type === "text.delta") process.stdout.write(event.text);
249
+ }
250
+ ```
251
+
252
+ → [`examples/node-fetch/openai-chat.ts`](./examples/node-fetch/openai-chat.ts)
253
+
254
+ ### Ollama (local)
255
+
256
+ ```ts
257
+ import { assembleStream, openaiCompatibleAdapter } from "llm-stream-assemble";
258
+ const adapter = openaiCompatibleAdapter({ provider: "ollama" });
259
+ for await (const event of assembleStream(response.body!, adapter)) {
260
+ if (event.type === "text.delta") process.stdout.write(event.text);
261
+ }
262
+ ```
263
+
264
+ → [`examples/node-fetch/openai-compatible.ts`](./examples/node-fetch/openai-compatible.ts) · Usage: [OpenAI-Compatible](#openai-compatible-usage)
265
+
266
+ ### Anthropic Messages
267
+
268
+ → [`examples/node-fetch/anthropic.ts`](./examples/node-fetch/anthropic.ts) · Usage: [Anthropic Messages](#anthropic-messages-usage)
269
+
270
+ ### Google Gemini
271
+
272
+ → [`examples/node-fetch/gemini.ts`](./examples/node-fetch/gemini.ts) · Usage: [Gemini](#gemini-usage)
273
+
274
+ ### Streaming JSON (structured output)
275
+
276
+ ```ts
277
+ for await (const event of assembleStream(response.body!, openaiChatAdapter({ jsonMode: true }))) {
278
+ if (event.type === "json.delta") process.stdout.write(event.delta);
279
+ if (event.type === "json.done") console.log(event.json);
280
+ }
281
+ ```
282
+
283
+ ### Tool calling
284
+
285
+ ```ts
286
+ for await (const event of assembleStream(response.body!, openaiChatAdapter())) {
287
+ if (event.type === "tool_call.args.delta") process.stdout.write(event.delta);
288
+ if (event.type === "tool_call.done") console.log(event.name, event.args);
289
+ }
290
+ ```
291
+
292
+ ### Chat UI / markdown rendering
293
+
294
+ Stream `text.delta` into your renderer — this library does **not** parse markdown/XML tags from model output (see [Non-goals](#non-goals)).
295
+
296
+ ### SSE proxy to browser
297
+
298
+ → [`examples/proxy-safety/`](./examples/proxy-safety/) — `toSSE(events, { sanitizeErrors: true })`
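
As a rough idea of what error sanitization means for a proxy, the sketch below serializes events to SSE frames and replaces error details with a generic message. It is not the library's `toSSE`: the `ProxyEvent` shape, the `toSseFrame` function, and the replacement message are assumptions made for illustration.

```ts
// Hypothetical event shapes; not the library's StreamEvent union.
type ProxyEvent =
  | { type: "text.delta"; text: string }
  | { type: "error"; message: string };

/** Serialize one event as an SSE frame, optionally hiding error details. */
function toSseFrame(event: ProxyEvent, sanitizeErrors: boolean): string {
  const safe: ProxyEvent =
    sanitizeErrors && event.type === "error"
      ? { type: "error", message: "Upstream provider error" }
      : event;
  // An SSE frame is a `data:` line terminated by a blank line.
  return `data: ${JSON.stringify(safe)}\n\n`;
}

const rawError: ProxyEvent = {
  type: "error",
  message: "401 invalid key sk-live-abc at 10.0.0.4",
};
const frame = toSseFrame(rawError, true);
const textFrame = toSseFrame({ type: "text.delta", text: "Hi" }, true);
```

With sanitization on, the browser receives an error event but none of the upstream message; text events pass through unchanged.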
299
+
300
+ ### Fixture replay
301
+
302
+ → [`examples/node-fetch/replay-fixture.ts`](./examples/node-fetch/replay-fixture.ts)
303
+
304
+ ### Integration cookbook
305
+
306
+ Wire unified events into **Hono**, **Express**, **Cloudflare Workers**, **LiteLLM**, **Next.js App Router**, AI SDK mapping, and LangChain callbacks — [`examples/integrations/`](./examples/integrations/) · **[Full cookbook →](./docs/integration-cookbook.md)**
307
+
308
+ ---
309
+
113
310
  ## Usage guides
114
311
 
115
312
  ### Core Usage
@@ -193,8 +390,9 @@ Provider presets:
193
390
  | `perplexity` | Perplexity API | Search-grounded answers; citations in `metadata.raw` |
194
391
  | `xai` | xAI Grok API | OpenAI-compatible; `reasoning_content` mapped when present |
195
392
  | `azure` | Azure OpenAI Chat Completions | Stricter preset; deployment URL + `api-key` auth; content filter metadata in `metadata.raw` |
393
+ | `cloudflare` | Cloudflare Workers AI REST | OpenAI-compatible `/v1/chat/completions`; Bearer + account id; loose preset like Groq |
196
394
 
197
- Base URL examples: Groq `https://api.groq.com/openai/v1`, DeepSeek `https://api.deepseek.com`, Mistral `https://api.mistral.ai/v1`, Ollama `http://localhost:11434/v1`, LM Studio `http://localhost:1234/v1`, Together `https://api.together.xyz/v1`, Fireworks `https://api.fireworks.ai/inference/v1`, OpenRouter `https://openrouter.ai/api/v1`, Perplexity `https://api.perplexity.ai`, xAI `https://api.x.ai/v1`, Azure OpenAI `https://{resource}.openai.azure.com/openai/deployments/{deployment}/chat/completions?api-version={version}`.
395
+ Base URL examples: Groq `https://api.groq.com/openai/v1`, DeepSeek `https://api.deepseek.com`, Mistral `https://api.mistral.ai/v1`, Ollama `http://localhost:11434/v1`, LM Studio `http://localhost:1234/v1`, Together `https://api.together.xyz/v1`, Fireworks `https://api.fireworks.ai/inference/v1`, OpenRouter `https://openrouter.ai/api/v1`, Perplexity `https://api.perplexity.ai`, xAI `https://api.x.ai/v1`, Azure OpenAI `https://{resource}.openai.azure.com/openai/deployments/{deployment}/chat/completions?api-version={version}`, Cloudflare Workers AI `https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/v1/chat/completions`.
198
396
 
199
397
  Strict vs loose configuration:
200
398
 
@@ -256,6 +454,44 @@ Use `openaiCompatibleAdapter({ provider: "azure", jsonMode: true })` when struct
256
454
 
257
455
  See `examples/node-fetch/azure-openai.ts` for a URL builder helper and `examples/proxy-safety/README.md` for server-side proxy notes.
258
456
 
457
+ ### Cloudflare Workers AI Usage
458
+
459
+ Cloudflare Workers AI exposes an OpenAI-compatible REST endpoint at `/v1/chat/completions` under your account. Use the **`cloudflare`** preset — not `generic` — when you want fixture-tested defaults for Workers AI REST (loose metadata tolerance like Groq).
460
+
461
+ ```ts
462
+ import { assembleStream, openaiCompatibleAdapter } from "llm-stream-assemble";
463
+
464
+ const accountId = process.env.CLOUDFLARE_ACCOUNT_ID!;
465
+ const url = `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/v1/chat/completions`;
466
+
467
+ const response = await fetch(url, {
468
+ method: "POST",
469
+ headers: {
470
+ Authorization: `Bearer ${process.env.CLOUDFLARE_API_TOKEN!}`,
471
+ "Content-Type": "application/json",
472
+ },
473
+ body: JSON.stringify({
474
+ model: "@cf/meta/llama-3.1-8b-instruct",
475
+ messages: [{ role: "user", content: "Hello" }],
476
+ stream: true,
477
+ stream_options: { include_usage: true },
478
+ }),
479
+ });
480
+
481
+ for await (const event of assembleStream(
482
+ response.body!,
483
+ openaiCompatibleAdapter({ provider: "cloudflare" }),
484
+ )) {
485
+ if (event.type === "text.delta") process.stdout.write(event.text);
486
+ }
487
+ ```
488
+
489
+ To receive token usage while streaming, set `stream_options: { include_usage: true }` on the request. Use `openaiCompatibleAdapter({ provider: "cloudflare", jsonMode: true })` when JSON output should map to `json.*` events.
490
+
491
+ The **`env.AI.run(model, { stream: true })`** Worker binding can return SSE bytes compatible with `assembleStream` when the model streams Chat Completions-shaped payloads — account binding and auth stay in your Worker; this library only parses the bytes.
492
+
493
+ See `examples/workers-ai/rest-chat-completions.ts` and `examples/proxy-safety/README.md` (Bearer token + account id must never reach the browser).
494
+
259
495
  ### Anthropic Messages Usage
260
496
 
261
497
  `anthropicAdapter()` parses Anthropic Messages streaming events and non-streaming responses. Create one adapter instance per request/stream.
@@ -377,17 +613,18 @@ for await (const event of assembleFromFile(
377
613
 
378
614
  ## Examples & proxy safety
379
615
 
380
- | Example | Description |
381
- | ---------------------------------------------------------------------------------------- | --------------------------------------- |
382
- | [`examples/node-fetch/openai-chat.ts`](./examples/node-fetch/openai-chat.ts) | OpenAI Chat Completions streaming |
383
- | [`examples/node-fetch/openai-compatible.ts`](./examples/node-fetch/openai-compatible.ts) | OpenAI-compatible presets |
384
- | [`examples/node-fetch/azure-openai.ts`](./examples/node-fetch/azure-openai.ts) | Azure OpenAI deployment URL + `api-key` |
385
- | [`examples/node-fetch/perplexity.ts`](./examples/node-fetch/perplexity.ts) | Perplexity streaming |
386
- | [`examples/node-fetch/xai.ts`](./examples/node-fetch/xai.ts) | xAI Grok streaming |
387
- | [`examples/node-fetch/anthropic.ts`](./examples/node-fetch/anthropic.ts) | Anthropic Messages |
388
- | [`examples/node-fetch/gemini.ts`](./examples/node-fetch/gemini.ts) | Google Gemini SSE |
389
- | [`examples/node-fetch/replay-fixture.ts`](./examples/node-fetch/replay-fixture.ts) | Local fixture replay |
390
- | [`examples/proxy-safety/`](./examples/proxy-safety/) | Proxy + browser client patterns |
616
+ | Example | Description |
617
+ | ------------------------------------------------------------------------------------------------ | ------------------------------------------------ |
618
+ | [`examples/node-fetch/openai-chat.ts`](./examples/node-fetch/openai-chat.ts) | OpenAI Chat Completions streaming |
619
+ | [`examples/node-fetch/openai-compatible.ts`](./examples/node-fetch/openai-compatible.ts) | OpenAI-compatible presets |
620
+ | [`examples/node-fetch/azure-openai.ts`](./examples/node-fetch/azure-openai.ts) | Azure OpenAI deployment URL + `api-key` |
621
+ | [`examples/workers-ai/rest-chat-completions.ts`](./examples/workers-ai/rest-chat-completions.ts) | Cloudflare Workers AI REST + `cloudflare` preset |
622
+ | [`examples/node-fetch/perplexity.ts`](./examples/node-fetch/perplexity.ts) | Perplexity streaming |
623
+ | [`examples/node-fetch/xai.ts`](./examples/node-fetch/xai.ts) | xAI Grok streaming |
624
+ | [`examples/node-fetch/anthropic.ts`](./examples/node-fetch/anthropic.ts) | Anthropic Messages |
625
+ | [`examples/node-fetch/gemini.ts`](./examples/node-fetch/gemini.ts) | Google Gemini SSE |
626
+ | [`examples/node-fetch/replay-fixture.ts`](./examples/node-fetch/replay-fixture.ts) | Local fixture replay |
627
+ | [`examples/proxy-safety/`](./examples/proxy-safety/) | Proxy + browser client patterns |
391
628
 
392
629
  Proxy safety:
393
630
 
@@ -420,6 +657,7 @@ pnpm verify
420
657
  | `pnpm verify:deps` | fail if runtime dependencies are added |
421
658
  | `pnpm release:prep` | pre-tag checks (version, CHANGELOG, dist, npm pack) |
422
659
  | `pnpm diagrams:build` | regenerate README SVGs from Mermaid sources |
660
+ | `pnpm bench:smoke` | local LSA-C52 timing script (requires build first) |
423
661
  | `pnpm test` | Vitest smoke tests |
424
662
  | `pnpm build` | tsup → ESM + CJS + declarations |
425
663
 
@@ -484,7 +484,34 @@ function createOpenAIChatLikeAdapter(options) {
484
484
  });
485
485
  }
486
486
 
487
- // src/adapters/openai-compatible.ts
487
+ // src/adapters/openai-compatible-presets.ts
488
+ var OPENAI_COMPATIBLE_PROVIDERS = [
489
+ "generic",
490
+ "openrouter",
491
+ "groq",
492
+ "deepseek",
493
+ "mistral",
494
+ "ollama",
495
+ "lmstudio",
496
+ "together",
497
+ "fireworks",
498
+ "perplexity",
499
+ "xai",
500
+ "azure",
501
+ "cloudflare"
502
+ ];
503
+ var HOST_COMPATIBLE_PRESETS = OPENAI_COMPATIBLE_PROVIDERS.filter(
504
+ (p) => p !== "generic"
505
+ );
506
+ var STRICT_COMPATIBLE_PRESETS = [
507
+ "azure"
508
+ ];
509
+ function isStrictCompatiblePreset(provider) {
510
+ return STRICT_COMPATIBLE_PRESETS.includes(provider);
511
+ }
512
+ var LOOSE_HOST_PRESETS = HOST_COMPATIBLE_PRESETS.filter(
513
+ (p) => !isStrictCompatiblePreset(p)
514
+ );
488
515
  var DEFAULT_PRESET = {
489
516
  looseErrorShape: true,
490
517
  allowMissingMetadata: true,
@@ -502,33 +529,66 @@ var PRESET_OVERRIDES = {
502
529
  reasoningFieldAliases: []
503
530
  }
504
531
  };
505
- function openaiCompatibleAdapter(options = {}) {
532
+ var PRESET_OVERRIDE_KEYS = Object.keys(PRESET_OVERRIDES);
533
+ function hasPresetOverride(provider) {
534
+ return provider in PRESET_OVERRIDES;
535
+ }
536
+ function providerPreset(provider) {
537
+ return {
538
+ ...DEFAULT_PRESET,
539
+ ...PRESET_OVERRIDES[provider]
540
+ };
541
+ }
542
+
543
+ // src/adapters/openai-compatible-resolve.ts
544
+ function resolveCompatibleAdapterConfig(options = {}) {
506
545
  const preset = providerPreset(options.provider ?? "generic");
507
- const resolvedAllowMissingMetadata = options.allowMissingMetadata ?? preset.allowMissingMetadata ?? DEFAULT_PRESET.allowMissingMetadata;
508
- const resolvedLooseErrorShape = options.looseErrorShape ?? preset.looseErrorShape ?? DEFAULT_PRESET.looseErrorShape;
509
- const resolvedUseChoicePositionFallback = options.useChoicePositionFallback ?? preset.useChoicePositionFallback ?? DEFAULT_PRESET.useChoicePositionFallback;
546
+ const allowMissingMetadata = options.allowMissingMetadata ?? preset.allowMissingMetadata ?? DEFAULT_PRESET.allowMissingMetadata;
547
+ const looseErrorShape = options.looseErrorShape ?? preset.looseErrorShape ?? DEFAULT_PRESET.looseErrorShape;
548
+ const useChoicePositionFallback = options.useChoicePositionFallback ?? preset.useChoicePositionFallback ?? DEFAULT_PRESET.useChoicePositionFallback;
549
+ return {
550
+ looseErrorShape,
551
+ allowMissingMetadata,
552
+ useChoicePositionFallback,
553
+ rejectUnrecognizedPayloads: allowMissingMetadata === false,
554
+ reasoningFieldAliases: [
555
+ ...preset.reasoningFieldAliases ?? [],
556
+ ...options.reasoningFieldAliases ?? []
557
+ ]
558
+ };
559
+ }
560
+ function compatibleProviderLabel(provider) {
561
+ return provider ?? "generic";
562
+ }
563
+
564
+ // src/adapters/openai-compatible.ts
565
+ function openaiCompatibleAdapter(options = {}) {
566
+ const resolved = resolveCompatibleAdapterConfig(options);
510
567
  return createOpenAIChatLikeAdapter({
511
568
  ...options,
512
- looseErrorShape: resolvedLooseErrorShape,
513
- allowMissingMetadata: resolvedAllowMissingMetadata,
514
- useChoicePositionFallback: resolvedUseChoicePositionFallback,
569
+ looseErrorShape: resolved.looseErrorShape,
570
+ allowMissingMetadata: resolved.allowMissingMetadata,
571
+ useChoicePositionFallback: resolved.useChoicePositionFallback,
515
572
  errorPrefix: "openaiCompatibleAdapter",
516
573
  usageInputTokenFields: ["prompt_tokens", "input_tokens"],
517
574
  usageOutputTokenFields: ["completion_tokens", "output_tokens"],
518
- rejectUnrecognizedPayloads: resolvedAllowMissingMetadata === false,
519
- reasoningFieldAliases: [
520
- ...preset.reasoningFieldAliases ?? [],
521
- ...options.reasoningFieldAliases ?? []
522
- ]
575
+ rejectUnrecognizedPayloads: resolved.rejectUnrecognizedPayloads,
576
+ reasoningFieldAliases: resolved.reasoningFieldAliases
523
577
  });
524
578
  }
525
- function providerPreset(provider) {
526
- return {
527
- ...DEFAULT_PRESET,
528
- ...PRESET_OVERRIDES[provider]
529
- };
530
- }
531
579
 
580
+ exports.DEFAULT_PRESET = DEFAULT_PRESET;
581
+ exports.HOST_COMPATIBLE_PRESETS = HOST_COMPATIBLE_PRESETS;
582
+ exports.LOOSE_HOST_PRESETS = LOOSE_HOST_PRESETS;
583
+ exports.OPENAI_COMPATIBLE_PROVIDERS = OPENAI_COMPATIBLE_PROVIDERS;
584
+ exports.PRESET_OVERRIDES = PRESET_OVERRIDES;
585
+ exports.PRESET_OVERRIDE_KEYS = PRESET_OVERRIDE_KEYS;
586
+ exports.STRICT_COMPATIBLE_PRESETS = STRICT_COMPATIBLE_PRESETS;
587
+ exports.compatibleProviderLabel = compatibleProviderLabel;
588
+ exports.hasPresetOverride = hasPresetOverride;
589
+ exports.isStrictCompatiblePreset = isStrictCompatiblePreset;
532
590
  exports.openaiCompatibleAdapter = openaiCompatibleAdapter;
591
+ exports.providerPreset = providerPreset;
592
+ exports.resolveCompatibleAdapterConfig = resolveCompatibleAdapterConfig;
533
593
  //# sourceMappingURL=openai-compatible.cjs.map
534
594
  //# sourceMappingURL=openai-compatible.cjs.map