llm-stream-assemble 1.4.1 → 1.5.5
- package/README.md +87 -9
- package/dist/adapters/cohere.cjs +594 -0
- package/dist/adapters/cohere.cjs.map +1 -0
- package/dist/adapters/cohere.d.cts +9 -0
- package/dist/adapters/cohere.d.ts +9 -0
- package/dist/adapters/cohere.js +592 -0
- package/dist/adapters/cohere.js.map +1 -0
- package/dist/adapters/gemini.cjs +40 -5
- package/dist/adapters/gemini.cjs.map +1 -1
- package/dist/adapters/gemini.d.cts +9 -1
- package/dist/adapters/gemini.d.ts +9 -1
- package/dist/adapters/gemini.js +40 -6
- package/dist/adapters/gemini.js.map +1 -1
- package/dist/index.cjs +495 -5
- package/dist/index.cjs.map +1 -1
- package/dist/index.d.cts +2 -1
- package/dist/index.d.ts +2 -1
- package/dist/index.js +495 -6
- package/dist/index.js.map +1 -1
- package/package.json +13 -3
package/README.md
@@ -1,19 +1,19 @@
 # llm-stream-assemble

-


-
 [](https://github.com/01laky/llm-stream-assemble/actions/workflows/ci.yml)
-

 **One typed event model for every LLM stream** — text, tool calls, reasoning, JSON, usage, refusals, errors, and non-streaming responses.

-> A zero-dependency TypeScript layer between raw LLM provider bytes and your app:
+> A zero-dependency TypeScript layer between raw LLM provider bytes and your app: seven built-in adapters, thirteen host presets, and a single StreamEvent model for text, tools, reasoning, JSON, and lifecycle — from Ollama to Azure to Bedrock to Cohere to Cloudflare Workers AI.

 Turn provider SSE fragments into typed events — **not another `+=` loop**.

-**Status:** Stable `1.
+**Status:** Stable `1.5.5`. Seven built-in adapters (Gemini covers **Google AI** and **Vertex AI** via `apiSurface`), thirteen OpenAI-compatible host presets (including **Azure OpenAI** and **Cloudflare Workers AI**), transforms, replay helpers, and examples are production-ready. Pin semver ranges as usual and review [CHANGELOG.md](./CHANGELOG.md) before major upgrades.

 ---

@@ -144,8 +144,9 @@ Diagram sources: [`docs/img/`](./docs/img/) (Mermaid `.mmd` + committed SVG). Re
 | `openaiCompatibleAdapter({ provider })` | Groq, DeepSeek, Mistral, Ollama, LM Studio, Together, Fireworks, OpenRouter, Perplexity, xAI, **Azure OpenAI**, **Cloudflare Workers AI**, generic | `llm-stream-assemble` |
 | `anthropicAdapter()` | Anthropic Messages | `llm-stream-assemble` |
 | `openaiResponsesAdapter()` | OpenAI Responses API | `llm-stream-assemble` |
-| `geminiAdapter()` | Google AI Gemini
+| `geminiAdapter()` | Google AI Gemini + Vertex AI (`apiSurface`) | `llm-stream-assemble` or `/adapters/gemini` |
 | `bedrockAdapter()` | AWS Bedrock Converse / ConverseStream | `llm-stream-assemble` or `/adapters/bedrock` |
+| `cohereAdapter()` | Cohere Chat v2 (`api.cohere.com/v2/chat`) | `llm-stream-assemble` or `/adapters/cohere` |

 Full feature flags and quirks: [compatibility matrix](./docs/compatibility.md).

@@ -202,6 +203,7 @@ Pick an adapter in ~30 seconds:
 - **Anthropic Messages** → `anthropicAdapter()`
 - **Google Gemini** → `geminiAdapter()`
 - **AWS Bedrock ConverseStream** → `bedrockAdapter()` (decoded JSON per event — see [Bedrock Usage](#bedrock-usage))
+- **Cohere Chat v2 SSE** → `cohereAdapter()` (not OpenAI-compatible — see [Cohere Usage](#cohere-usage))
 - **Groq, Ollama, Azure, Cloudflare, OpenRouter, …** → `openaiCompatibleAdapter({ provider })`
 - **Non-streaming JSON body** → `assembleResponse(body, adapter)`
 - **React chat UI / full agent framework** → not this package — see [comparison](./docs/comparison.md)
@@ -277,6 +279,10 @@ for await (const event of assembleStream(response.body!, adapter)) {

 → [`examples/node-fetch/bedrock.ts`](./examples/node-fetch/bedrock.ts) · Usage: [Bedrock](#bedrock-usage) · Decode helper: [`examples/bedrock/README.md`](./examples/bedrock/README.md)

+### Cohere Chat v2
+
+→ [`examples/node-fetch/cohere.ts`](./examples/node-fetch/cohere.ts) · Usage: [Cohere](#cohere-usage)
+
 ### Streaming JSON (structured output)

 ```ts
@@ -317,7 +323,7 @@ Wire unified events into **Hono**, **Express**, **Cloudflare Workers**, **LiteLL

 ### Core Usage

-The core pipeline works with any adapter that emits `RawChunk[]`, including the built-in OpenAI Chat, OpenAI-compatible, Anthropic Messages, OpenAI Responses, Google Gemini,
+The core pipeline works with any adapter that emits `RawChunk[]`, including the built-in OpenAI Chat, OpenAI-compatible, Anthropic Messages, OpenAI Responses, Google Gemini, AWS Bedrock, and Cohere adapters:

 ```ts
 import { assembleFromPayloads, type StreamAdapter } from "llm-stream-assemble";
@@ -555,7 +561,43 @@ Use `geminiAdapter({ jsonMode: true })` when structured JSON output should map t

 Subpath import: `import { geminiAdapter } from "llm-stream-assemble/adapters/gemini"`.

-Vertex AI
+#### Vertex AI Gemini
+
+Vertex uses the same `geminiAdapter()` with **`apiSurface: "vertex"`**. The adapter strips Vertex / gateway envelopes (`response`, `result`, `predictions[0]`) via **`normalizeVertexChunk()`** before mapping `candidates` and tools. Vertex HTTP streams are often **JSONL or concatenated JSON objects**, not Google AI `data:` SSE — split complete JSON strings in your app, then pass each line to `assembleFromPayloads` (see [`examples/vertex/read-chunk-stream.ts`](./examples/vertex/read-chunk-stream.ts)).
+
+```ts
+import { assembleFromPayloads, geminiAdapter } from "llm-stream-assemble";
+import { buildVertexStreamUrl } from "./examples/vertex/build-vertex-url";
+import { readVertexJsonlStrings } from "./examples/vertex/read-chunk-stream";
+
+const projectId = process.env.GOOGLE_CLOUD_PROJECT!;
+const location = process.env.VERTEX_LOCATION ?? "us-central1";
+const model = process.env.VERTEX_MODEL ?? "gemini-2.5-flash";
+const accessToken = process.env.VERTEX_ACCESS_TOKEN!; // ADC — not GOOGLE_API_KEY
+
+const response = await fetch(buildVertexStreamUrl({ projectId, location, model }), {
+  method: "POST",
+  headers: {
+    Authorization: `Bearer ${accessToken}`,
+    "Content-Type": "application/json",
+  },
+  body: JSON.stringify({
+    contents: [{ role: "user", parts: [{ text: "Hello" }] }],
+  }),
+});
+
+async function* lines() {
+  for await (const line of readVertexJsonlStrings(response.body!)) yield line;
+}
+
+for await (const event of assembleFromPayloads(lines(), geminiAdapter({ apiSurface: "vertex" }))) {
+  if (event.type === "text.delta") process.stdout.write(event.text);
+}
+```
+
+Obtain a short-lived bearer token with Application Default Credentials, e.g. `gcloud auth application-default print-access-token`, and set `VERTEX_ACCESS_TOKEN` (or pass `accessToken` in your own wrapper). Full runnable example: [`examples/node-fetch/vertex-gemini.ts`](./examples/node-fetch/vertex-gemini.ts). Live smoke: `pnpm smoke:vertex` — see [live-smoke](./docs/live-smoke.md).
+
+The Gemini **Interactions API** remains deferred; see [compatibility matrix](./docs/compatibility.md).

 ### Bedrock Usage

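Note on the Vertex AI Gemini text added in the hunk above: it leaves two steps implicit, namely splitting a body that may arrive as JSONL or as back-to-back JSON objects, and peeling off `response` / `result` / `predictions[0]` envelopes. A minimal sketch of both ideas follows. `JsonObjectSplitter` and `unwrapEnvelope` are hypothetical names introduced here for illustration; they are not exports of `llm-stream-assemble`, and the package's own `normalizeVertexChunk()` and the example helper `readVertexJsonlStrings` may behave differently.

```typescript
// Hypothetical helpers for illustration only; not part of llm-stream-assemble.

/** Splits a text stream into complete top-level JSON object strings. */
class JsonObjectSplitter {
  private current = "";
  private depth = 0;
  private inString = false;
  private escaped = false;

  /** Feed one decoded text chunk; returns every object completed by it. */
  push(chunk: string): string[] {
    const complete: string[] = [];
    for (let i = 0; i < chunk.length; i++) {
      const ch = chunk.charAt(i);
      if (this.depth === 0) {
        // Between objects: skip newlines, commas, and array brackets.
        if (ch !== "{") continue;
        this.depth = 1;
        this.current = "{";
        continue;
      }
      this.current += ch;
      if (this.inString) {
        // Braces inside string literals must not change the nesting depth.
        if (this.escaped) this.escaped = false;
        else if (ch === "\\") this.escaped = true;
        else if (ch === '"') this.inString = false;
      } else if (ch === '"') {
        this.inString = true;
      } else if (ch === "{") {
        this.depth++;
      } else if (ch === "}") {
        this.depth--;
        if (this.depth === 0) {
          complete.push(this.current);
          this.current = "";
        }
      }
    }
    return complete;
  }
}

/** Envelope keys assumed from the README text: response, result, predictions[0]. */
function unwrapEnvelope(payload: string): unknown {
  const parsed: unknown = JSON.parse(payload);
  if (typeof parsed !== "object" || parsed === null) return parsed;
  const obj = parsed as Record<string, unknown>;
  if ("candidates" in obj) return obj; // already a bare Gemini-style chunk
  const response = obj["response"];
  if (typeof response === "object" && response !== null) return response;
  const result = obj["result"];
  if (typeof result === "object" && result !== null) return result;
  const predictions = obj["predictions"];
  if (Array.isArray(predictions) && predictions.length > 0) return predictions[0];
  return obj;
}

// Usage: two objects arriving across an arbitrary chunk boundary.
const splitter = new JsonObjectSplitter();
const parts = splitter
  .push('[{"response":{"candidates":[{"x":"}"}]}},\n{"candi')
  .concat(splitter.push('dates":[]}]'));
for (const part of parts) console.log(unwrapEnvelope(part));
```

The splitter tracks string and escape state so that a `}` inside a text fragment does not end an object early, which is the usual failure mode of naive newline or brace splitting on model output.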
@@ -605,6 +647,41 @@ Subpath import: `import { bedrockAdapter } from "llm-stream-assemble/adapters/be

 Worker proxy recipe: [`examples/integrations/bedrock-worker-proxy.ts`](./examples/integrations/bedrock-worker-proxy.ts). EventStream decode helper (examples only): [`examples/bedrock/decode-event-stream.ts`](./examples/bedrock/decode-event-stream.ts).

+### Cohere Usage
+
+`cohereAdapter()` parses Cohere Chat **v2** SSE events from `https://api.cohere.com/v2/chat` and non-streaming v2 response bodies. Create one adapter instance per request/stream. Cohere is **not** OpenAI-compatible — use `cohereAdapter()`, not `openaiCompatibleAdapter()`.
+
+Core `parseSSE()` frames the HTTP body; `assembleStream` yields one JSON payload string per `data:` line to `cohereAdapter().parseChunk`.
+
+```ts
+import { assembleStream, cohereAdapter } from "llm-stream-assemble";
+
+const response = await fetch("https://api.cohere.com/v2/chat", {
+  method: "POST",
+  headers: {
+    Authorization: `Bearer ${process.env.COHERE_API_KEY}`,
+    "Content-Type": "application/json",
+  },
+  body: JSON.stringify({
+    model: "command-r-plus-08-2024",
+    messages: [{ role: "user", content: "Hello" }],
+    stream: true,
+  }),
+});
+
+for await (const event of assembleStream(response.body!, cohereAdapter())) {
+  if (event.type === "text.delta") process.stdout.write(event.text);
+  if (event.type === "reasoning.delta") process.stdout.write(event.text);
+  if (event.type === "tool_call.done") console.log(event.name, event.args);
+}
+```
+
+Use `cohereAdapter({ jsonMode: true })` when structured JSON output should map to `json.*` instead of `text.*`. **`tool-plan-delta`** events map to `reasoning.*` with `variant: "detail"`. **`citation-start`** payloads are preserved in `metadata.raw` — there are no dedicated `citation.*` unified events in 1.x. Legacy Cohere v1 endpoints are out of scope.
+
+Subpath import: `import { cohereAdapter } from "llm-stream-assemble/adapters/cohere"`.
+
+Live smoke: `pnpm smoke:cohere` — see [`docs/live-smoke.md`](./docs/live-smoke.md) for `COHERE_API_KEY`, `COHERE_MODEL`, and `COHERE_SMOKE_TOOLS`.
+
 ---

 ## Transforms & replay
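Note on the Cohere Usage text added in the hunk above: it describes two steps. SSE framing hands one JSON payload string per `data:` line to the adapter, and the adapter maps Cohere event types onto unified events (for example `tool-plan-delta` to `reasoning.*` with `variant: "detail"`). A rough sketch of that flow is below. The function names, the `UnifiedDelta` type, and the Cohere payload paths (`delta.message.content.text`, `delta.message.tool_plan`) are assumptions made for illustration; this is not the package's `parseSSE()` or `cohereAdapter()` implementation.

```typescript
// Illustrative sketch only; names and payload paths below are assumptions.

type UnifiedDelta =
  | { type: "text.delta"; text: string }
  | { type: "reasoning.delta"; text: string; variant: "detail" };

/** Returns the payload of every non-empty `data:` line in a fully buffered SSE body. */
function sseDataLines(body: string): string[] {
  const out: string[] = [];
  for (const line of body.split(/\r?\n/)) {
    if (line.slice(0, 5) !== "data:") continue; // ignore `event:` lines and blanks
    const data = line.slice(5).trim();
    if (data !== "") out.push(data);
  }
  return out;
}

/** Maps one Cohere v2 stream payload to a unified delta, or null if not handled here. */
function mapCoherePayload(payload: string): UnifiedDelta | null {
  const event = JSON.parse(payload) as {
    type?: string;
    delta?: { message?: { content?: { text?: string }; tool_plan?: string } };
  };
  const message = event.delta ? event.delta.message : undefined;
  const text = message && message.content ? message.content.text : undefined;
  const plan = message ? message.tool_plan : undefined;

  if (event.type === "content-delta" && typeof text === "string") {
    return { type: "text.delta", text };
  }
  if (event.type === "tool-plan-delta" && typeof plan === "string") {
    // Per the README text: tool plans surface as reasoning with variant "detail".
    return { type: "reasoning.delta", text: plan, variant: "detail" };
  }
  return null; // e.g. message-start, citation-start, message-end
}

// Usage with a hand-written sample body (not captured from a live API call).
const sampleBody = [
  "event: content-delta",
  'data: {"type":"content-delta","index":0,"delta":{"message":{"content":{"text":"Hi"}}}}',
  "",
  "event: tool-plan-delta",
  'data: {"type":"tool-plan-delta","delta":{"message":{"tool_plan":"Look up weather"}}}',
  "",
  'data: {"type":"message-end"}',
  "",
].join("\n");

const deltas = sseDataLines(sampleBody).map((payload) => mapCoherePayload(payload));
console.log(deltas);
```

The sketch works on a fully buffered body for brevity. A real stream arrives in arbitrary chunks, so lines have to be buffered across chunk boundaries before this kind of mapping is applied.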
@@ -676,7 +753,8 @@ for await (const event of assembleFromFile(
 | [`examples/node-fetch/perplexity.ts`](./examples/node-fetch/perplexity.ts) | Perplexity streaming |
 | [`examples/node-fetch/xai.ts`](./examples/node-fetch/xai.ts) | xAI Grok streaming |
 | [`examples/node-fetch/anthropic.ts`](./examples/node-fetch/anthropic.ts) | Anthropic Messages |
-| [`examples/node-fetch/gemini.ts`](./examples/node-fetch/gemini.ts) | Google Gemini SSE
+| [`examples/node-fetch/gemini.ts`](./examples/node-fetch/gemini.ts) | Google AI Gemini SSE |
+| [`examples/node-fetch/vertex-gemini.ts`](./examples/node-fetch/vertex-gemini.ts) | Vertex AI Gemini JSONL stream |
 | [`examples/node-fetch/bedrock.ts`](./examples/node-fetch/bedrock.ts) | AWS Bedrock ConverseStream (decoded JSON) |
 | [`examples/node-fetch/replay-fixture.ts`](./examples/node-fetch/replay-fixture.ts) | Local fixture replay |
 | [`examples/proxy-safety/`](./examples/proxy-safety/) | Proxy + browser client patterns |