@yadimon/prio-llm-router 0.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +21 -0
- package/README.md +434 -0
- package/dist/index.cjs +1058 -0
- package/dist/index.cjs.map +1 -0
- package/dist/index.d.cts +235 -0
- package/dist/index.d.ts +235 -0
- package/dist/index.js +1047 -0
- package/dist/index.js.map +1 -0
- package/package.json +98 -0
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
package/README.md
ADDED
@@ -0,0 +1,434 @@
# @yadimon/prio-llm-router

`@yadimon/prio-llm-router` is a TypeScript library for routing text generation requests through a priority-ordered chain of LLM targets.

It is built for the common "free models first, paid models later" setup:

- providers are configured once with names and API keys
- models are configured once with names, provider references, priorities, and metadata
- each request can use either an explicit chain or the implicit global priority order
- failures automatically fall through to the next configured target

The package keeps the routing logic intentionally small and predictable while reusing the Vercel AI SDK provider ecosystem for the actual provider calls.

## Features

- Priority-based fallback across multiple providers and models
- Separate provider config and model target config
- Optional source builders for source-centric setup and strict free policies
- Non-streaming text generation and optional streaming
- Built-in support for `google`, `openrouter`, `groq`, `mistral`, `cohere`, `perplexity`, `xai`, `togetherai`, `openai`, `anthropic`, `deepseek`, and generic `openai-compatible`
- Strict TypeScript types
- Hook points for attempt-level logging and telemetry
- Ready for npm publishing and GitHub CI
- Structured to support future provider key pools without changing the model-chain API

## Documentation

- [Configuration Guide](./docs/configuration.md)
- [Streaming Semantics](./docs/streaming.md)
- [Architecture Notes](./docs/architecture.md)
- [Current Free Possibilities](./docs/current-free-possibilities.md)
- [Examples](./examples/README.md)
- [Contributor Agent Notes](./AGENTS.md)

## Installation

```bash
npm install @yadimon/prio-llm-router
```

## When To Use It

This package is a good fit when:

- you want to try multiple providers in a deterministic order
- you want free models first and paid models later
- you want one stable application-facing API while provider choices evolve
- you want fallback behavior to live in one place instead of being spread across app code

It is not trying to be a universal orchestration framework. The goal is a narrow, reliable router for text calls.

## Quick Start

```ts
import { createLlmRouter } from '@yadimon/prio-llm-router';

const router = createLlmRouter({
  providers: [
    {
      name: 'openrouter-main',
      type: 'openrouter',
      auth: {
        mode: 'single',
        apiKey: process.env.OPENROUTER_API_KEY!,
      },
      appName: 'prio-llm-router-demo',
      appUrl: 'https://example.com',
    },
    {
      name: 'groq-main',
      type: 'groq',
      auth: {
        mode: 'single',
        apiKey: process.env.GROQ_API_KEY!,
      },
    },
    {
      name: 'openai-main',
      type: 'openai',
      auth: {
        mode: 'single',
        apiKey: process.env.OPENAI_API_KEY!,
      },
    },
  ],
  models: [
    {
      name: 'trinity-free',
      provider: 'openrouter-main',
      model: 'arcee-ai/trinity-large:free',
      priority: 10,
      tier: 'free',
    },
    {
      name: 'groq-oss',
      provider: 'groq-main',
      model: 'openai/gpt-oss-20b',
      priority: 20,
      tier: 'free',
    },
    {
      name: 'gpt-4.1-paid',
      provider: 'openai-main',
      model: 'gpt-4.1-mini',
      priority: 100,
      tier: 'paid',
    },
  ],
  hooks: {
    onAttemptFailure(attempt) {
      console.warn('LLM attempt failed:', attempt);
    },
  },
});

const result = await router.generateText({
  prompt: 'Summarize the advantages of priority-based model routing in 3 bullets.',
});

console.log(result.text);
console.log(result.target);
console.log(result.attempts);
```
## Basic Mental Model

There are two separate layers:

- `providers`: named credentials and transport settings
- `models`: named routing targets that point to a provider and a concrete model id

Your app sends requests to the router using model target names, not raw provider config.

There is also an additive builder layer for source-centric setup:

- `createLlmConnection(...)`
- `createLlmSource(...)`

This is the preferred path when you want to mark a source as strict `free`.

## Strict Free Sources

Strict `free` mode is intentionally narrow.

It exists only where the package can prevent paid usage from the request shape alone. Today that means:

- only `openrouter`
- only explicit model ids that end in `:free`

Example:

```ts
import {
  createLlmConnection,
  createLlmRouter,
  createLlmSource,
} from '@yadimon/prio-llm-router';

const openRouter = createLlmConnection({
  name: 'openrouter-main',
  type: 'openrouter',
  auth: {
    mode: 'single',
    apiKey: process.env.OPENROUTER_API_KEY!,
  },
  appName: 'prio-llm-router-demo',
  appUrl: 'https://example.com',
});

const router = createLlmRouter({
  sources: [
    createLlmSource(openRouter, {
      name: 'kimi-free',
      model: 'moonshotai/kimi-k2:free',
      access: 'free',
      priority: 10,
    }),
  ],
});
```
The package rejects strict `free` sources for providers whose free status depends on account plan or billing setup, such as `google`, `groq`, `mistral`, or `cohere`.

## Explicit Request Chains

If you want per-request routing, pass a chain of configured model target names:

```ts
const result = await router.generateText({
  prompt: 'Write a terse release note.',
  chain: ['trinity-free', 'groq-oss', 'gpt-4.1-paid'],
});
```

The chain values are target names from the `models` config, not raw provider names or raw model ids.

If `chain` is not provided, the router uses:

- `defaultChain` from setup if present
- otherwise all enabled model targets sorted by ascending `priority`
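The implicit ordering can be sketched in plain TypeScript. This is a sketch of the documented behavior, not the library's internal code; the field names follow the `models` config shown in this README, and the assumption that an omitted `enabled` flag means enabled is ours:

```ts
// Sketch of the implicit chain: keep enabled targets, sort ascending by
// priority (lower number = tried first), and route by target name.
interface ModelTarget {
  name: string;
  priority: number;
  enabled?: boolean; // assumed: omitted flag means enabled
}

function implicitChain(targets: ModelTarget[]): string[] {
  return targets
    .filter((t) => t.enabled !== false)
    .sort((a, b) => a.priority - b.priority)
    .map((t) => t.name);
}

const order = implicitChain([
  { name: 'gpt-4.1-paid', priority: 100 },
  { name: 'trinity-free', priority: 10 },
  { name: 'groq-oss', priority: 20, enabled: false },
]);

console.log(order); // order is ['trinity-free', 'gpt-4.1-paid']
```

So with the Quick Start config above, a request without an explicit `chain` would try the free targets before the paid one.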
## Messages Instead of Prompt

```ts
const result = await router.generateText({
  system: 'Be concise.',
  messages: [
    { role: 'user', content: [{ type: 'text', text: 'Explain fallback routing.' }] },
  ],
});
```

## Streaming With First-Chunk Fallback

For chat-style UX you can use `streamText`.

The router behavior is intentionally strict:

- before the first text chunk arrives, it may fall back to the next target
- once the first text chunk has been emitted, the model is locked in
- if the selected stream later fails, the error is surfaced and no further fallback happens

```ts
const stream = await router.streamText({
  prompt: 'Explain this system in short sentences.',
  chain: ['trinity-free', 'groq-oss', 'gpt-4.1-paid'],
  firstChunkTimeoutMs: 2500,
});

for await (const chunk of stream.textStream) {
  process.stdout.write(chunk);
}

const final = await stream.final;
console.log(final.target.name);
```

Use `firstChunkTimeoutMs` when you want "switch if nothing starts quickly enough" behavior. If you omit it, the router waits indefinitely for the first chunk of the current target.
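The underlying idea can be illustrated with a small self-contained sketch. This models the concept only, not the router's internals; `firstChunkOrTimeout` is a hypothetical helper, and in the real router the timeout triggers fallback to the next target in the chain rather than returning a marker value:

```ts
// Concept sketch: race the first chunk of an async stream against a timer.
// If the timer wins (or the stream ends without a chunk), the caller can
// move on to the next target instead of waiting.
async function firstChunkOrTimeout<T>(
  stream: AsyncIterable<T>,
  timeoutMs: number,
): Promise<{ kind: 'chunk'; value: T } | { kind: 'timeout' }> {
  const first = stream[Symbol.asyncIterator]()
    .next()
    .then((r) =>
      r.done
        ? ({ kind: 'timeout' } as const) // empty stream: treat like no chunk
        : ({ kind: 'chunk', value: r.value } as const),
    );
  const timer = new Promise<{ kind: 'timeout' }>((resolve) =>
    setTimeout(() => resolve({ kind: 'timeout' }), timeoutMs),
  );
  return Promise.race([first, timer]);
}
```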
This makes the behavior safe for chat UIs:

- no silent model switch after the answer has already started
- no mixed output from multiple models in one response
- deterministic fallback only during the "nothing has started yet" phase
## Configuration Model

### Providers

Providers are named credentials plus provider type:

```ts
{
  name: 'groq-main',
  type: 'groq',
  auth: {
    mode: 'single',
    apiKey: process.env.GROQ_API_KEY!,
  },
}
```

Today the auth mode is `single`. The type layout is intentionally future-friendly so provider key pools or key-priority strategies can be added later without changing how models reference providers.

Common provider-level fields:

- `name`
- `type`
- `auth`
- `enabled`
- `baseURL`
- `headers`

### Models

Models are named routing targets:

```ts
{
  name: 'trinity-free',
  provider: 'openrouter-main',
  model: 'arcee-ai/trinity-large:free',
  priority: 10,
  tier: 'free',
}
```

The router either:

- uses `request.chain` if provided
- uses `defaultChain` from setup if provided
- otherwise sorts enabled targets by ascending `priority`
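`defaultChain` is referenced but not shown elsewhere in this README. Assuming it is a top-level setup field holding model target names (our assumption about its placement), a setup using it might look like this sketch:

```ts
const router = createLlmRouter({
  providers: [/* ... as in Quick Start ... */],
  models: [/* ... as in Quick Start ... */],
  // Assumed placement: a top-level list of model target names, used when a
  // request does not pass an explicit `chain`.
  defaultChain: ['trinity-free', 'groq-oss', 'gpt-4.1-paid'],
});
```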
Common model-level fields:

- `name`
- `provider`
- `model`
- `enabled`
- `priority`
- `tier`
- `metadata`

## Supported Providers

- `google`
- `openrouter`
- `groq`
- `mistral`
- `cohere`
- `perplexity`
- `xai`
- `togetherai`
- `openai`
- `anthropic`
- `deepseek`
- `openai-compatible`

These built-in types focus on API-key-based providers that map cleanly to the Vercel AI SDK. For OpenAI-style gateways and proxies, use `openai-compatible`.

Use `openai-compatible` when you have an OpenAI-style endpoint that is not covered by a first-party adapter:

```ts
{
  name: 'my-proxy',
  type: 'openai-compatible',
  baseURL: 'https://my-proxy.example.com/v1',
  providerLabel: 'my-proxy',
  auth: {
    mode: 'single',
    apiKey: process.env.MY_PROXY_API_KEY!,
  },
}
```
This also covers local OpenAI-compatible runtimes such as LM Studio, Ollama, or other local gateways.

Before using this setup, make sure LM Studio's local server is running with the OpenAI-compatible API enabled.

Example for LM Studio running locally on `http://127.0.0.1:1234/v1`:

```ts
import { createLlmRouter } from '@yadimon/prio-llm-router';

const router = createLlmRouter({
  providers: [
    {
      name: 'lm-studio-local',
      type: 'openai-compatible',
      baseURL: 'http://127.0.0.1:1234/v1',
      providerLabel: 'lm-studio',
      auth: {
        mode: 'single',
        apiKey: 'lm-studio',
      },
    },
  ],
  models: [
    {
      name: 'local-qwen',
      provider: 'lm-studio-local',
      model: 'qwen2.5-7b-instruct',
      priority: 10,
    },
  ],
});

const result = await router.generateText({
  prompt: 'Describe this local LM Studio setup in one sentence.',
});

console.log(result.text);
```
Notes:

- for LM Studio, enable the OpenAI-compatible local API before using this config
- the local server still needs to expose an OpenAI-compatible HTTP API
- the package currently requires a non-empty `apiKey`, so local runtimes that ignore auth should use a dummy value such as `'lm-studio'`
- the `model` value must match the local model name exposed by your runtime

## Error Model

If every target fails, the router throws `AllModelsFailedError`.

That error includes:

- `attempts`: all failed attempts in execution order
- `cause`: the last underlying error

This makes it straightforward to log or surface detailed fallback history.
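One way to use this is a catch block keyed on the error type. The sketch below is self-contained so it can run on its own: it models the documented `attempts` shape with a local stand-in class. In an application you would import `AllModelsFailedError` from `@yadimon/prio-llm-router` instead, and the real attempt entries may carry different or additional fields:

```ts
// Local stand-in for illustration only; the real class comes from the package.
interface FailedAttempt {
  target: string;
  message: string;
}

class AllModelsFailedError extends Error {
  constructor(readonly attempts: FailedAttempt[]) {
    super(`all ${attempts.length} targets failed`);
    this.name = 'AllModelsFailedError';
  }
}

// Turn the fallback history into log lines, rethrowing unrelated errors.
function describeFallbackHistory(err: unknown): string[] {
  if (err instanceof AllModelsFailedError) {
    return err.attempts.map((a, i) => `#${i + 1} ${a.target}: ${a.message}`);
  }
  throw err;
}

const lines = describeFallbackHistory(
  new AllModelsFailedError([
    { target: 'trinity-free', message: 'rate limited' },
    { target: 'gpt-4.1-paid', message: 'timeout' },
  ]),
);

console.log(lines); // one line per failed attempt, in execution order
```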
For streaming requests:

- fallback is allowed only before the first emitted text chunk
- after the stream starts, later errors are surfaced directly
- `stream.final` resolves to the final aggregated result when the stream completes successfully

## Public API

Main exports:

- `createLlmRouter`
- `PrioLlmRouter`
- `createDefaultTextGenerationExecutor`
- `AllModelsFailedError`
- `RouterConfigurationError`

Main methods:

- `router.generateText(...)`
- `router.streamText(...)`
- `router.listProviders()`
- `router.listModels()`

## Development

```bash
npm install
npm run check
```

Repository layout:

- [src](./src)
- [tests](./tests)
- [examples](./examples)
- [docs](./docs)

## Notes

- The routing logic is deliberately separate from provider execution logic.
- OpenRouter request headers `HTTP-Referer` and `X-Title` can be set via `appUrl` and `appName`.
- Examples in this repository import from `../src/index.js` for local development. In external projects, import from `@yadimon/prio-llm-router`.