@askalf/dario 3.9.6 → 3.10.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +48 -5
- package/dist/cli.js +2 -0
- package/dist/proxy.d.ts +4 -0
- package/dist/proxy.js +70 -8
- package/package.json +2 -2
package/README.md
CHANGED
|
@@ -26,11 +26,13 @@
|
|
|
26
26
|
|
|
27
27
|
Dario runs on your machine and gives every tool you use one local URL that reaches **every LLM you use.** Point Cursor, Continue, Aider, LiteLLM, your own scripts — anything that speaks the Anthropic or OpenAI API — at `http://localhost:3456`, and dario routes each request to the right backend:
|
|
28
28
|
|
|
29
|
-
- **Claude Max / Pro subscriptions** — OAuth-backed, billed against your plan instead of API pricing. Multi-account pooling if you have more than one.
|
|
30
29
|
- **OpenAI** — your API key, routed to `api.openai.com` straight through.
|
|
31
30
|
- **Any OpenAI-compat endpoint** — OpenRouter, Groq, a local LiteLLM, Ollama's openai-compat mode, self-hosted vLLM. Set the backend's `baseUrl` once, done.
|
|
31
|
+
- **Claude Max / Pro subscriptions** — OAuth-backed, billed against your plan instead of API pricing. Multi-account pooling if you have more than one.
|
|
32
|
+
|
|
33
|
+
Your tool sees one base URL. `gpt-4o` goes to OpenAI. `llama-3-70b` goes to Groq. `claude-opus-4-6` goes to your Claude subscription. None of your tools have to know about any of it.
|
|
32
34
|
|
|
33
|
-
|
|
35
|
+
**Backends are plugins, not the product.** Dario's job is the one local endpoint your tools point at. Each backend is a swappable adapter behind it — when a provider ships, a backend entry lands, your tools don't change. That's the durable part.
|
|
34
36
|
|
|
35
37
|
**No account anywhere is required.** Single-backend Claude dario works with nothing but `dario login`. Multi-backend dario works with nothing but local config files. Nothing phones home. Zero runtime dependencies. ~2,000 lines of TypeScript.
|
|
36
38
|
|
|
@@ -41,8 +43,9 @@ Your tool sees one base URL. `gpt-4` goes to OpenAI. `claude-opus-4-6` goes to y
|
|
|
41
43
|
**Best fit:**
|
|
42
44
|
|
|
43
45
|
- **Developers using multiple LLMs across multiple tools** who are tired of juggling base URLs, API keys, and per-tool provider configs.
|
|
44
|
-
- **Claude Max or Pro subscribers** who want their subscription usable anywhere that speaks the Anthropic or OpenAI API — without paying API rates for every request.
|
|
45
46
|
- **Teams running local or hosted OpenAI-compat servers** (LiteLLM, vLLM, Ollama, Groq, OpenRouter) who want one stable local endpoint in front of them that every tool can reuse.
|
|
47
|
+
- **Anyone who wants to switch providers without reconfiguring every tool** — change the model name in your tool, dario picks a different backend, your tool keeps working.
|
|
48
|
+
- **Claude Max or Pro subscribers** who want their subscription usable anywhere that speaks the Anthropic or OpenAI API — without paying API rates for every request.
|
|
46
49
|
- **Power users running multi-agent workloads on Claude subscriptions** who want multi-account pooling with headroom-aware routing on their own machine, against their own subscriptions, without a hosted platform.
|
|
47
50
|
|
|
48
51
|
**Not a fit:**
|
|
@@ -94,6 +97,8 @@ One URL. Your tool doesn't know or care which provider is answering.
|
|
|
94
97
|
|
|
95
98
|
**Use dario if** you use more than one LLM provider, or more than one tool, or both — and you're tired of configuring each tool with a different base URL and API key per provider.
|
|
96
99
|
|
|
100
|
+
**Use dario if** you want provider independence. Switching from GPT-4o to Claude to Llama is a model-name change in your tool, not a reconfigure of every SDK and base URL you've got.
|
|
101
|
+
|
|
97
102
|
**Use dario if** you pay for Claude Max or Pro and you want that subscription reachable from every tool on your machine, without paying API rates or opening a second billing surface.
|
|
98
103
|
|
|
99
104
|
**Use dario pool mode if** you're running multi-agent workloads on Claude subscriptions and hitting per-account rate limits. Add 2–N accounts with `dario accounts add` and dario routes across them by per-account headroom, all on your machine, against your own subscriptions. See [Multi-Account Pool Mode](#multi-account-pool-mode).
|
|
@@ -133,7 +138,7 @@ Opus, Sonnet, Haiku, GPT-4o, o1, o3, o4, plus anything the configured OpenAI-com
|
|
|
133
138
|
|
|
134
139
|
## Backends
|
|
135
140
|
|
|
136
|
-
Dario's routing is organized around **backends**, each with its own auth and its own target. v3.6.0 ships two backends, with more coming.
|
|
141
|
+
Dario's routing is organized around **backends**, each with its own auth and its own target. Backends are swappable adapters — add one, your tools reach it at `localhost:3456` with whatever API shape they already speak. v3.6.0 ships two backends, with more coming.
|
|
137
142
|
|
|
138
143
|
### 1. Claude subscription backend (built in)
|
|
139
144
|
|
|
@@ -247,7 +252,7 @@ curl http://localhost:3456/analytics # per-account / per-model stats, burn ra
|
|
|
247
252
|
| `--passthrough` / `--thin` | Thin proxy for the Claude backend — OAuth swap only, no template injection | off |
|
|
248
253
|
| `--preserve-tools` / `--keep-tools` | Keep client tool schemas instead of remapping to CC's `Bash/Read/Grep/Glob/WebSearch/WebFetch`. Required for clients whose tools have fields CC doesn't (`sessionId`, custom ids, etc.) — see [Custom tool schemas](#custom-tool-schemas). Trade-off: drops the CC request fingerprint. | off |
|
|
249
254
|
| `--hybrid-tools` / `--context-inject` | Remap to CC tools **and** inject request-context values (`sessionId`, `requestId`, `channelId`, `userId`, `timestamp`) into client-declared fields CC's schema doesn't carry. Preserves the CC fingerprint while keeping custom schemas functional — see [Hybrid tool mode](#hybrid-tool-mode). Mutually exclusive with `--preserve-tools`. | off |
|
|
250
|
-
| `--model=<name>` | Force a model (`opus`, `sonnet`, `haiku
|
|
255
|
+
| `--model=<name>` | Force a model. Shortcuts (`opus`, `sonnet`, `haiku`), full IDs (`claude-opus-4-6`), or a **provider prefix** (`openai:gpt-4o`, `groq:llama-3.3-70b`, `claude:opus`, `local:qwen-coder`) to force the backend server-wide. See [Provider prefix](#provider-prefix). | passthrough |
|
|
251
256
|
| `--port=<n>` | Port to listen on | `3456` |
|
|
252
257
|
| `--host=<addr>` / `DARIO_HOST` | Bind address. Use `0.0.0.0` for LAN, or a specific IP (e.g. a Tailscale interface). When non-loopback, also set `DARIO_API_KEY`. | `127.0.0.1` |
|
|
253
258
|
| `--verbose` / `-v` | Log every request | off |
|
|
@@ -348,6 +353,44 @@ curl http://localhost:3456/v1/chat/completions \
|
|
|
348
353
|
|
|
349
354
|
All supported. Claude backend: full Anthropic SSE format plus OpenAI-SSE translation for tool_use streaming. OpenAI-compat backend: streaming body forwarded byte-for-byte.
|
|
350
355
|
|
|
356
|
+
### Provider prefix
|
|
357
|
+
|
|
358
|
+
Any request's `model` field can be written as `<provider>:<name>` to force which backend handles it, regardless of what the model name looks like. This is useful when regex-based routing (`gpt-*` → OpenAI, `claude-*` → Claude) doesn't match — for example when routing a `llama-3.3-70b` request through an OpenAI-compat backend, or when you want the same model name to go to different providers on different requests.
|
|
359
|
+
|
|
360
|
+
Recognized prefixes:
|
|
361
|
+
|
|
362
|
+
| Prefix | Backend |
|
|
363
|
+
|---|---|
|
|
364
|
+
| `openai:` | OpenAI-compat backend (the configured one) |
|
|
365
|
+
| `groq:` | OpenAI-compat backend |
|
|
366
|
+
| `openrouter:` | OpenAI-compat backend |
|
|
367
|
+
| `local:` | OpenAI-compat backend |
|
|
368
|
+
| `compat:` | OpenAI-compat backend |
|
|
369
|
+
| `claude:` | Claude subscription backend |
|
|
370
|
+
| `anthropic:` | Claude subscription backend |
|
|
371
|
+
|
|
372
|
+
Examples:
|
|
373
|
+
|
|
374
|
+
```bash
|
|
375
|
+
# Force openai backend
|
|
376
|
+
curl http://localhost:3456/v1/chat/completions \
|
|
377
|
+
-H "Authorization: Bearer dario" \
|
|
378
|
+
-d '{"model":"openai:gpt-4o","messages":[{"role":"user","content":"hi"}]}'
|
|
379
|
+
|
|
380
|
+
# Force a non-gpt model through the openai-compat backend (e.g. OpenRouter)
|
|
381
|
+
curl http://localhost:3456/v1/chat/completions \
|
|
382
|
+
-H "Authorization: Bearer dario" \
|
|
383
|
+
-d '{"model":"openrouter:meta-llama/llama-3.1-70b-instruct","messages":[...]}'
|
|
384
|
+
|
|
385
|
+
# Force Claude subscription backend — same as `opus` shortcut but explicit
|
|
386
|
+
curl http://localhost:3456/v1/messages \
|
|
387
|
+
-d '{"model":"claude:opus","max_tokens":1024,"messages":[...]}'
|
|
388
|
+
```
|
|
389
|
+
|
|
390
|
+
The prefix gets stripped before the request goes upstream — the backend only sees the bare model name. Unrecognized prefixes are ignored, so ollama-style `llama3:8b` passes through untouched.
|
|
391
|
+
|
|
392
|
+
**Server-wide override.** `dario proxy --model=openai:gpt-4o` applies the prefix to every request, regardless of what the client sends. Useful for "I want everything routed to this specific backend and model" without editing every tool's config.
|
|
393
|
+
|
|
351
394
|
### Custom tool schemas
|
|
352
395
|
|
|
353
396
|
By default, on the Claude backend, dario replaces your client's tool definitions with the real Claude Code tools (`Bash`, `Read`, `Grep`, `Glob`, `WebSearch`, `WebFetch`) and translates parameters back and forth. That's how dario looks like CC on the wire, which is what lets your request bill against your Claude subscription instead of API pricing.
|
package/dist/cli.js
CHANGED
|
@@ -371,6 +371,8 @@ async function help() {
|
|
|
371
371
|
--model=MODEL Force a model for all requests
|
|
372
372
|
Shortcuts: opus, sonnet, haiku
|
|
373
373
|
Full IDs: claude-opus-4-6, claude-sonnet-4-6
|
|
374
|
+
Provider prefix: openai:gpt-4o, groq:llama-3.3-70b,
|
|
375
|
+
claude:opus, local:qwen-coder (forces backend)
|
|
374
376
|
Default: passthrough (client decides)
|
|
375
377
|
--passthrough, --thin Thin proxy — OAuth swap only, no injection
|
|
376
378
|
--preserve-tools Forward client tool schemas unchanged
|
package/dist/proxy.d.ts
CHANGED
package/dist/proxy.js
CHANGED
|
@@ -134,6 +134,33 @@ const MODEL_ALIASES = {
|
|
|
134
134
|
'sonnet1m': 'claude-sonnet-4-6[1m]',
|
|
135
135
|
'haiku': 'claude-haiku-4-5',
|
|
136
136
|
};
|
|
137
|
+
// Provider prefix in the `model` field — `<provider>:<model>`. Forces
|
|
138
|
+
// routing regardless of model-name regex. Only recognized prefixes are
|
|
139
|
+
// parsed, so ollama-style `llama3:8b` (without a recognized prefix)
|
|
140
|
+
// passes through untouched and reaches the configured openai-compat
|
|
141
|
+
// backend as-is.
|
|
142
|
+
const PROVIDER_PREFIXES = {
|
|
143
|
+
openai: 'openai',
|
|
144
|
+
openrouter: 'openai',
|
|
145
|
+
groq: 'openai',
|
|
146
|
+
compat: 'openai',
|
|
147
|
+
local: 'openai',
|
|
148
|
+
claude: 'claude',
|
|
149
|
+
anthropic: 'claude',
|
|
150
|
+
};
|
|
151
|
+
export function parseProviderPrefix(model) {
|
|
152
|
+
const idx = model.indexOf(':');
|
|
153
|
+
if (idx <= 0)
|
|
154
|
+
return null;
|
|
155
|
+
const prefix = model.slice(0, idx).toLowerCase();
|
|
156
|
+
const provider = PROVIDER_PREFIXES[prefix];
|
|
157
|
+
if (!provider)
|
|
158
|
+
return null;
|
|
159
|
+
const stripped = model.slice(idx + 1);
|
|
160
|
+
if (!stripped)
|
|
161
|
+
return null;
|
|
162
|
+
return { provider, model: stripped };
|
|
163
|
+
}
|
|
137
164
|
// Beta prefixes that require Extra Usage to be ENABLED on the account.
|
|
138
165
|
// context-management and prompt-caching-scope are safe — billing is determined
|
|
139
166
|
// solely by the OAuth token's subscription type, not by beta flags.
|
|
@@ -390,7 +417,13 @@ export async function startProxy(opts = {}) {
|
|
|
390
417
|
}
|
|
391
418
|
}
|
|
392
419
|
const cliVersion = detectCliVersion();
|
|
393
|
-
|
|
420
|
+
// Parse --model once at startup. Supports `<provider>:<model>` to force
|
|
421
|
+
// a backend for every request (e.g. `--model=openai:gpt-4o`). Back-compat:
|
|
422
|
+
// bare names like `opus` resolve via MODEL_ALIASES.
|
|
423
|
+
const modelPrefix = opts.model ? parseProviderPrefix(opts.model) : null;
|
|
424
|
+
const cliModelRaw = modelPrefix ? modelPrefix.model : opts.model;
|
|
425
|
+
const cliProviderOverride = modelPrefix ? modelPrefix.provider : null;
|
|
426
|
+
const modelOverride = cliModelRaw ? (MODEL_ALIASES[cliModelRaw] ?? cliModelRaw) : null;
|
|
394
427
|
const identity = loadClaudeIdentity();
|
|
395
428
|
if (identity.deviceId) {
|
|
396
429
|
console.log(' Device identity: detected');
|
|
@@ -608,18 +641,47 @@ export async function startProxy(opts = {}) {
|
|
|
608
641
|
finally {
|
|
609
642
|
clearTimeout(bodyTimeout);
|
|
610
643
|
}
|
|
611
|
-
|
|
644
|
+
let body = Buffer.concat(chunks);
|
|
645
|
+
// Provider prefix (v3.10.0). If the body's model field is `<provider>:<model>`
|
|
646
|
+
// with a recognized prefix, strip the prefix and force routing regardless of
|
|
647
|
+
// regex. CLI-level `--model=<provider>:<name>` applies the same override
|
|
648
|
+
// server-wide. Rewrites the body in place once so both code paths below
|
|
649
|
+
// see the stripped model name.
|
|
650
|
+
let forcedProvider = cliProviderOverride;
|
|
651
|
+
if (body.length > 0) {
|
|
652
|
+
try {
|
|
653
|
+
const parsed = JSON.parse(body.toString());
|
|
654
|
+
const rawModel = parsed.model ?? '';
|
|
655
|
+
const prefix = parseProviderPrefix(rawModel);
|
|
656
|
+
if (prefix) {
|
|
657
|
+
forcedProvider = prefix.provider;
|
|
658
|
+
parsed.model = prefix.model;
|
|
659
|
+
body = Buffer.from(JSON.stringify(parsed));
|
|
660
|
+
if (verbose) {
|
|
661
|
+
console.log(`[dario] provider prefix: ${rawModel} → ${prefix.provider} backend with model ${prefix.model}`);
|
|
662
|
+
}
|
|
663
|
+
}
|
|
664
|
+
else if (cliProviderOverride === 'openai' && cliModelRaw) {
|
|
665
|
+
// --model=openai:<name> forces the openai backend and replaces
|
|
666
|
+
// the model name server-wide. Body gets rewritten so the openai
|
|
667
|
+
// route below sees the CLI-chosen model.
|
|
668
|
+
parsed.model = cliModelRaw;
|
|
669
|
+
body = Buffer.from(JSON.stringify(parsed));
|
|
670
|
+
}
|
|
671
|
+
}
|
|
672
|
+
catch { /* not JSON — fall through */ }
|
|
673
|
+
}
|
|
612
674
|
// Multi-provider routing (v3.6.0+). When an OpenAI-compat backend is
|
|
613
675
|
// configured and the request is on /v1/chat/completions with a
|
|
614
|
-
// GPT-family model, forward it straight
|
|
615
|
-
// instead of running it through the Claude
|
|
616
|
-
// on /v1/messages or with Claude-family models
|
|
617
|
-
// existing behavior.
|
|
618
|
-
if (openaiBackend && isOpenAI && body.length > 0) {
|
|
676
|
+
// GPT-family model (or a forced `openai:` prefix), forward it straight
|
|
677
|
+
// through to the backend instead of running it through the Claude
|
|
678
|
+
// template path. Requests on /v1/messages or with Claude-family models
|
|
679
|
+
// fall through to existing behavior.
|
|
680
|
+
if (openaiBackend && isOpenAI && forcedProvider !== 'claude' && body.length > 0) {
|
|
619
681
|
try {
|
|
620
682
|
const peek = JSON.parse(body.toString());
|
|
621
683
|
const rawModel = (peek.model || '').toString();
|
|
622
|
-
if (rawModel && isOpenAIModel(rawModel)) {
|
|
684
|
+
if (rawModel && (forcedProvider === 'openai' || isOpenAIModel(rawModel))) {
|
|
623
685
|
if (verbose) {
|
|
624
686
|
console.log(`[dario] #${requestCount} ${req.method} ${urlPath} (model: ${rawModel}) → openai backend`);
|
|
625
687
|
}
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@askalf/dario",
|
|
3
|
-
"version": "3.
|
|
3
|
+
"version": "3.10.0",
|
|
4
4
|
"description": "A local LLM router. One endpoint, every provider — Claude subscriptions, OpenAI, OpenRouter, Groq, local LiteLLM, any OpenAI-compat endpoint — your tools don't need to change.",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"bin": {
|
|
@@ -21,7 +21,7 @@
|
|
|
21
21
|
],
|
|
22
22
|
"scripts": {
|
|
23
23
|
"build": "tsc && cp src/cc-template-data.json dist/",
|
|
24
|
-
"test": "node test/issue-29-tool-translation.mjs && node test/hybrid-tools.mjs && node test/scrub-paths.mjs && node test/analytics-recording.mjs && node test/failover-429.mjs",
|
|
24
|
+
"test": "node test/issue-29-tool-translation.mjs && node test/hybrid-tools.mjs && node test/scrub-paths.mjs && node test/provider-prefix.mjs && node test/analytics-recording.mjs && node test/failover-429.mjs",
|
|
25
25
|
"audit": "npm audit --production --audit-level=high",
|
|
26
26
|
"prepublishOnly": "npm run build",
|
|
27
27
|
"start": "node dist/cli.js",
|