@askalf/dario 3.9.5 → 3.10.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -26,11 +26,13 @@
26
26
 
27
27
  Dario runs on your machine and gives every tool you use one local URL that reaches **every LLM you use.** Point Cursor, Continue, Aider, LiteLLM, your own scripts — anything that speaks the Anthropic or OpenAI API — at `http://localhost:3456`, and dario routes each request to the right backend:
28
28
 
29
- - **Claude Max / Pro subscriptions** — OAuth-backed, billed against your plan instead of API pricing. Multi-account pooling if you have more than one.
30
29
  - **OpenAI** — your API key, routed to `api.openai.com` straight through.
31
30
  - **Any OpenAI-compat endpoint** — OpenRouter, Groq, a local LiteLLM, Ollama's openai-compat mode, self-hosted vLLM. Set the backend's `baseUrl` once, done.
31
+ - **Claude Max / Pro subscriptions** — OAuth-backed, billed against your plan instead of API pricing. Multi-account pooling if you have more than one.
32
+
33
+ Your tool sees one base URL. `gpt-4o` goes to OpenAI. `llama-3-70b` goes to Groq. `claude-opus-4-6` goes to your Claude subscription. None of your tools have to know about any of it.
32
34
 
33
- Your tool sees one base URL. `gpt-4` goes to OpenAI. `claude-opus-4-6` goes to your Claude subscription. `llama-3-70b` goes to Groq. None of your tools have to know about any of it.
35
+ **Backends are plugins, not the product.** Dario's job is the one local endpoint your tools point at. Each backend is a swappable adapter behind it — when a provider ships, a backend entry lands, your tools don't change. That's the durable part.
34
36
 
35
37
  **No account anywhere is required.** Single-backend Claude dario works with nothing but `dario login`. Multi-backend dario works with nothing but local config files. Nothing phones home. Zero runtime dependencies. ~2,000 lines of TypeScript.
36
38
 
@@ -41,8 +43,9 @@ Your tool sees one base URL. `gpt-4` goes to OpenAI. `claude-opus-4-6` goes to y
41
43
  **Best fit:**
42
44
 
43
45
  - **Developers using multiple LLMs across multiple tools** who are tired of juggling base URLs, API keys, and per-tool provider configs.
44
- - **Claude Max or Pro subscribers** who want their subscription usable anywhere that speaks the Anthropic or OpenAI API — without paying API rates for every request.
45
46
  - **Teams running local or hosted OpenAI-compat servers** (LiteLLM, vLLM, Ollama, Groq, OpenRouter) who want one stable local endpoint in front of them that every tool can reuse.
47
+ - **Anyone who wants to switch providers without reconfiguring every tool** — change the model name in your tool, dario picks a different backend, your tool keeps working.
48
+ - **Claude Max or Pro subscribers** who want their subscription usable anywhere that speaks the Anthropic or OpenAI API — without paying API rates for every request.
46
49
  - **Power users running multi-agent workloads on Claude subscriptions** who want multi-account pooling with headroom-aware routing on their own machine, against their own subscriptions, without a hosted platform.
47
50
 
48
51
  **Not a fit:**
@@ -94,6 +97,8 @@ One URL. Your tool doesn't know or care which provider is answering.
94
97
 
95
98
  **Use dario if** you use more than one LLM provider, or more than one tool, or both — and you're tired of configuring each tool with a different base URL and API key per provider.
96
99
 
100
+ **Use dario if** you want provider independence. Switching from GPT-4o to Claude to Llama is a model-name change in your tool, not a reconfigure of every SDK and base URL you've got.
101
+
97
102
  **Use dario if** you pay for Claude Max or Pro and you want that subscription reachable from every tool on your machine, without paying API rates or opening a second billing surface.
98
103
 
99
104
  **Use dario pool mode if** you're running multi-agent workloads on Claude subscriptions and hitting per-account rate limits. Add 2–N accounts with `dario accounts add` and dario routes across them by per-account headroom, all on your machine, against your own subscriptions. See [Multi-Account Pool Mode](#multi-account-pool-mode).
@@ -133,7 +138,7 @@ Opus, Sonnet, Haiku, GPT-4o, o1, o3, o4, plus anything the configured OpenAI-com
133
138
 
134
139
  ## Backends
135
140
 
136
- Dario's routing is organized around **backends**, each with its own auth and its own target. v3.6.0 ships two backends, with more coming.
141
+ Dario's routing is organized around **backends**, each with its own auth and its own target. Backends are swappable adapters — add one, your tools reach it at `localhost:3456` with whatever API shape they already speak. v3.6.0 ships two backends, with more coming.
137
142
 
138
143
  ### 1. Claude subscription backend (built in)
139
144
 
@@ -247,7 +252,7 @@ curl http://localhost:3456/analytics # per-account / per-model stats, burn ra
247
252
  | `--passthrough` / `--thin` | Thin proxy for the Claude backend — OAuth swap only, no template injection | off |
248
253
  | `--preserve-tools` / `--keep-tools` | Keep client tool schemas instead of remapping to CC's `Bash/Read/Grep/Glob/WebSearch/WebFetch`. Required for clients whose tools have fields CC doesn't (`sessionId`, custom ids, etc.) — see [Custom tool schemas](#custom-tool-schemas). Trade-off: drops the CC request fingerprint. | off |
249
254
  | `--hybrid-tools` / `--context-inject` | Remap to CC tools **and** inject request-context values (`sessionId`, `requestId`, `channelId`, `userId`, `timestamp`) into client-declared fields CC's schema doesn't carry. Preserves the CC fingerprint while keeping custom schemas functional — see [Hybrid tool mode](#hybrid-tool-mode). Mutually exclusive with `--preserve-tools`. | off |
250
- | `--model=<name>` | Force a model (`opus`, `sonnet`, `haiku`, or full ID). Applies to the Claude backend. | passthrough |
255
+ | `--model=<name>` | Force a model. Shortcuts (`opus`, `sonnet`, `haiku`), full IDs (`claude-opus-4-6`), or a **provider prefix** (`openai:gpt-4o`, `groq:llama-3.3-70b`, `claude:opus`, `local:qwen-coder`) to force the backend server-wide. See [Provider prefix](#provider-prefix). | passthrough |
251
256
  | `--port=<n>` | Port to listen on | `3456` |
252
257
  | `--host=<addr>` / `DARIO_HOST` | Bind address. Use `0.0.0.0` for LAN, or a specific IP (e.g. a Tailscale interface). When non-loopback, also set `DARIO_API_KEY`. | `127.0.0.1` |
253
258
  | `--verbose` / `-v` | Log every request | off |
@@ -348,6 +353,44 @@ curl http://localhost:3456/v1/chat/completions \
348
353
 
349
354
  All supported. Claude backend: full Anthropic SSE format plus OpenAI-SSE translation for tool_use streaming. OpenAI-compat backend: streaming body forwarded byte-for-byte.
350
355
 
356
+ ### Provider prefix
357
+
358
+ Any request's `model` field can be written as `<provider>:<name>` to force which backend handles it, regardless of what the model name looks like. This is useful when regex-based routing (`gpt-*` → OpenAI, `claude-*` → Claude) doesn't match — for example when routing a `llama-3.3-70b` request through an OpenAI-compat backend, or when you want the same model name to go to different providers on different requests.
359
+
360
+ Recognized prefixes:
361
+
362
+ | Prefix | Backend |
363
+ |---|---|
364
+ | `openai:` | OpenAI-compat backend (the configured one) |
365
+ | `groq:` | OpenAI-compat backend |
366
+ | `openrouter:` | OpenAI-compat backend |
367
+ | `local:` | OpenAI-compat backend |
368
+ | `compat:` | OpenAI-compat backend |
369
+ | `claude:` | Claude subscription backend |
370
+ | `anthropic:` | Claude subscription backend |
371
+
372
+ Examples:
373
+
374
+ ```bash
375
+ # Force openai backend
376
+ curl http://localhost:3456/v1/chat/completions \
377
+ -H "Authorization: Bearer dario" \
378
+ -d '{"model":"openai:gpt-4o","messages":[{"role":"user","content":"hi"}]}'
379
+
380
+ # Force a non-gpt model through the openai-compat backend (e.g. OpenRouter)
381
+ curl http://localhost:3456/v1/chat/completions \
382
+ -H "Authorization: Bearer dario" \
383
+ -d '{"model":"openrouter:meta-llama/llama-3.1-70b-instruct","messages":[...]}'
384
+
385
+ # Force Claude subscription backend — same as `opus` shortcut but explicit
386
+ curl http://localhost:3456/v1/messages \
387
+ -d '{"model":"claude:opus","max_tokens":1024,"messages":[...]}'
388
+ ```
389
+
390
+ The prefix gets stripped before the request goes upstream — the backend only sees the bare model name. Unrecognized prefixes are ignored, so ollama-style `llama3:8b` passes through untouched.
391
+
392
+ **Server-wide override.** `dario proxy --model=openai:gpt-4o` applies the prefix to every request, regardless of what the client sends. Useful for "I want everything routed to this specific backend and model" without editing every tool's config.
393
+
351
394
  ### Custom tool schemas
352
395
 
353
396
  By default, on the Claude backend, dario replaces your client's tool definitions with the real Claude Code tools (`Bash`, `Read`, `Grep`, `Glob`, `WebSearch`, `WebFetch`) and translates parameters back and forth. That's how dario looks like CC on the wire, which is what lets your request bill against your Claude subscription instead of API pricing.
@@ -44,6 +44,14 @@ export interface ToolMapping {
44
44
  * into fields CC's schema doesn't carry. Unset in default mode.
45
45
  */
46
46
  clientFields?: string[];
47
+ /**
48
+ * Reverse-lookup priority for resolving collisions when multiple client
49
+ * tools map to the same CC tool. Higher wins. Default 10. Set lower for
50
+ * niche / lossy translations (e.g. OpenClaw's `process` action-discriminator
51
+ * tool loses most of its schema when flattened to Bash, so bash/exec
52
+ * should win the Bash reverse slot when both are declared — dario#37).
53
+ */
54
+ reverseScore?: number;
47
55
  }
48
56
  /**
49
57
  * Request context extracted once per incoming request. Source for
@@ -141,10 +141,17 @@ const TOOL_MAP = {
141
141
  // so the model upstream can't actually drive it. Kept mapped for fingerprint
142
142
  // continuity but the reverse translation is inherently lossy — clients with a
143
143
  // process-style tool should use --preserve-tools instead of --hybrid-tools.
144
+ //
145
+ // reverseScore: 1 makes sure that when a client declares BOTH `process` AND
146
+ // `exec`/`bash` (OpenClaw does — both are exported from bash-tools.ts), the
147
+ // reverse lookup picks the bash-family mapping for CC's Bash tool slot
148
+ // instead of routing CC tool calls through process's action-based shape
149
+ // and breaking every Bash call with "Unknown action" (dario#37).
144
150
  process: {
145
151
  ccTool: 'Bash',
146
152
  translateArgs: (a) => ({ command: a.action || a.cmd || '' }),
147
153
  translateBack: (a) => ({ action: a.command ?? '' }),
154
+ reverseScore: 1,
148
155
  },
149
156
  read: {
150
157
  ccTool: 'Read',
@@ -499,11 +506,22 @@ export function buildCCRequest(clientBody, billingTag, cache1h, identity, opts =
499
506
  }
500
507
  /**
501
508
  * Build the CC-name → {clientName, mapping} reverse lookup used by both
502
- * the non-streaming and streaming reverse-mappers. Two-pass construction
503
- * preserves the original identity-protection rule: when a client sent a
504
- * tool with the literal CC name (e.g. `WebSearch`), that pairing claims
505
- * the CC slot first so a later unmapped-tool fallback that also lands
506
- * on `WebSearch` can't overwrite it.
509
+ * the non-streaming and streaming reverse-mappers.
510
+ *
511
+ * Two-pass construction preserves the original identity-protection rule:
512
+ * when a client sent a tool with the literal CC name (e.g. `WebSearch`),
513
+ * that pairing claims the CC slot first so a later unmapped-tool fallback
514
+ * that also lands on `WebSearch` can't overwrite it.
515
+ *
516
+ * Within the non-identity pass, collisions are broken by `reverseScore`
517
+ * (higher wins, default 10). This matters when a client declares two
518
+ * tools that both map to the same CC tool — OpenClaw declares both
519
+ * `exec` (bash-like, score 10) and `process` (action-discriminator,
520
+ * score 1) and both map to Bash. Pre-fix, insertion-order last-wins
521
+ * routed Bash tool calls through `process`, which interpreted the
522
+ * command string as an action and returned "Unknown action" for
523
+ * every call. `process` now has reverseScore: 1 so bash/exec wins
524
+ * (dario#37).
507
525
  */
508
526
  function buildReverseLookup(toolMap) {
509
527
  const reverseMap = new Map();
@@ -514,12 +532,17 @@ function buildReverseLookup(toolMap) {
514
532
  reverseMap.set(mapping.ccTool, { clientName, mapping });
515
533
  }
516
534
  }
535
+ // Score-based collision resolution in the non-identity pass.
536
+ const scoreOf = (m) => m.reverseScore ?? 10;
517
537
  for (const [clientName, mapping] of toolMap) {
518
538
  if (clientName.toLowerCase() === mapping.ccTool.toLowerCase())
519
539
  continue;
520
540
  if (identityClaimed.has(mapping.ccTool))
521
541
  continue;
522
- reverseMap.set(mapping.ccTool, { clientName, mapping });
542
+ const existing = reverseMap.get(mapping.ccTool);
543
+ if (!existing || scoreOf(mapping) > scoreOf(existing.mapping)) {
544
+ reverseMap.set(mapping.ccTool, { clientName, mapping });
545
+ }
523
546
  }
524
547
  return reverseMap;
525
548
  }
package/dist/cli.js CHANGED
@@ -371,6 +371,8 @@ async function help() {
371
371
  --model=MODEL Force a model for all requests
372
372
  Shortcuts: opus, sonnet, haiku
373
373
  Full IDs: claude-opus-4-6, claude-sonnet-4-6
374
+ Provider prefix: openai:gpt-4o, groq:llama-3.3-70b,
375
+ claude:opus, local:qwen-coder (forces backend)
374
376
  Default: passthrough (client decides)
375
377
  --passthrough, --thin Thin proxy — OAuth swap only, no injection
376
378
  --preserve-tools Forward client tool schemas unchanged
package/dist/proxy.d.ts CHANGED
@@ -1,3 +1,7 @@
1
+ export declare function parseProviderPrefix(model: string): {
2
+ provider: 'openai' | 'claude';
3
+ model: string;
4
+ } | null;
1
5
  interface ProxyOptions {
2
6
  port?: number;
3
7
  host?: string;
package/dist/proxy.js CHANGED
@@ -134,6 +134,33 @@ const MODEL_ALIASES = {
134
134
  'sonnet1m': 'claude-sonnet-4-6[1m]',
135
135
  'haiku': 'claude-haiku-4-5',
136
136
  };
137
+ // Provider prefix in the `model` field — `<provider>:<model>`. Forces
138
+ // routing regardless of model-name regex. Only recognized prefixes are
139
+ // parsed, so ollama-style `llama3:8b` (without a recognized prefix)
140
+ // passes through untouched and reaches the configured openai-compat
141
+ // backend as-is.
142
+ const PROVIDER_PREFIXES = {
143
+ openai: 'openai',
144
+ openrouter: 'openai',
145
+ groq: 'openai',
146
+ compat: 'openai',
147
+ local: 'openai',
148
+ claude: 'claude',
149
+ anthropic: 'claude',
150
+ };
151
+ export function parseProviderPrefix(model) {
152
+ const idx = model.indexOf(':');
153
+ if (idx <= 0)
154
+ return null;
155
+ const prefix = model.slice(0, idx).toLowerCase();
156
+ const provider = PROVIDER_PREFIXES[prefix];
157
+ if (!provider)
158
+ return null;
159
+ const stripped = model.slice(idx + 1);
160
+ if (!stripped)
161
+ return null;
162
+ return { provider, model: stripped };
163
+ }
137
164
  // Beta prefixes that require Extra Usage to be ENABLED on the account.
138
165
  // context-management and prompt-caching-scope are safe — billing is determined
139
166
  // solely by the OAuth token's subscription type, not by beta flags.
@@ -390,7 +417,13 @@ export async function startProxy(opts = {}) {
390
417
  }
391
418
  }
392
419
  const cliVersion = detectCliVersion();
393
- const modelOverride = opts.model ? (MODEL_ALIASES[opts.model] ?? opts.model) : null;
420
+ // Parse --model once at startup. Supports `<provider>:<model>` to force
421
+ // a backend for every request (e.g. `--model=openai:gpt-4o`). Back-compat:
422
+ // bare names like `opus` resolve via MODEL_ALIASES.
423
+ const modelPrefix = opts.model ? parseProviderPrefix(opts.model) : null;
424
+ const cliModelRaw = modelPrefix ? modelPrefix.model : opts.model;
425
+ const cliProviderOverride = modelPrefix ? modelPrefix.provider : null;
426
+ const modelOverride = cliModelRaw ? (MODEL_ALIASES[cliModelRaw] ?? cliModelRaw) : null;
394
427
  const identity = loadClaudeIdentity();
395
428
  if (identity.deviceId) {
396
429
  console.log(' Device identity: detected');
@@ -608,18 +641,47 @@ export async function startProxy(opts = {}) {
608
641
  finally {
609
642
  clearTimeout(bodyTimeout);
610
643
  }
611
- const body = Buffer.concat(chunks);
644
+ let body = Buffer.concat(chunks);
645
+ // Provider prefix (v3.10.0). If the body's model field is `<provider>:<model>`
646
+ // with a recognized prefix, strip the prefix and force routing regardless of
647
+ // regex. CLI-level `--model=<provider>:<name>` applies the same override
648
+ // server-wide. Rewrites the body in place once so both code paths below
649
+ // see the stripped model name.
650
+ let forcedProvider = cliProviderOverride;
651
+ if (body.length > 0) {
652
+ try {
653
+ const parsed = JSON.parse(body.toString());
654
+ const rawModel = parsed.model ?? '';
655
+ const prefix = parseProviderPrefix(rawModel);
656
+ if (prefix) {
657
+ forcedProvider = prefix.provider;
658
+ parsed.model = prefix.model;
659
+ body = Buffer.from(JSON.stringify(parsed));
660
+ if (verbose) {
661
+ console.log(`[dario] provider prefix: ${rawModel} → ${prefix.provider} backend with model ${prefix.model}`);
662
+ }
663
+ }
664
+ else if (cliProviderOverride === 'openai' && cliModelRaw) {
665
+ // --model=openai:<name> forces the openai backend and replaces
666
+ // the model name server-wide. Body gets rewritten so the openai
667
+ // route below sees the CLI-chosen model.
668
+ parsed.model = cliModelRaw;
669
+ body = Buffer.from(JSON.stringify(parsed));
670
+ }
671
+ }
672
+ catch { /* not JSON — fall through */ }
673
+ }
612
674
  // Multi-provider routing (v3.6.0+). When an OpenAI-compat backend is
613
675
  // configured and the request is on /v1/chat/completions with a
614
- // GPT-family model, forward it straight through to the backend
615
- // instead of running it through the Claude template path. Requests
616
- // on /v1/messages or with Claude-family models fall through to
617
- // existing behavior.
618
- if (openaiBackend && isOpenAI && body.length > 0) {
676
+ // GPT-family model (or a forced `openai:` prefix), forward it straight
677
+ // through to the backend instead of running it through the Claude
678
+ // template path. Requests on /v1/messages or with Claude-family models
679
+ // fall through to existing behavior.
680
+ if (openaiBackend && isOpenAI && forcedProvider !== 'claude' && body.length > 0) {
619
681
  try {
620
682
  const peek = JSON.parse(body.toString());
621
683
  const rawModel = (peek.model || '').toString();
622
- if (rawModel && isOpenAIModel(rawModel)) {
684
+ if (rawModel && (forcedProvider === 'openai' || isOpenAIModel(rawModel))) {
623
685
  if (verbose) {
624
686
  console.log(`[dario] #${requestCount} ${req.method} ${urlPath} (model: ${rawModel}) → openai backend`);
625
687
  }
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@askalf/dario",
3
- "version": "3.9.5",
3
+ "version": "3.10.0",
4
4
  "description": "A local LLM router. One endpoint, every provider — Claude subscriptions, OpenAI, OpenRouter, Groq, local LiteLLM, any OpenAI-compat endpoint — your tools don't need to change.",
5
5
  "type": "module",
6
6
  "bin": {
@@ -21,7 +21,7 @@
21
21
  ],
22
22
  "scripts": {
23
23
  "build": "tsc && cp src/cc-template-data.json dist/",
24
- "test": "node test/issue-29-tool-translation.mjs && node test/hybrid-tools.mjs && node test/scrub-paths.mjs && node test/analytics-recording.mjs && node test/failover-429.mjs",
24
+ "test": "node test/issue-29-tool-translation.mjs && node test/hybrid-tools.mjs && node test/scrub-paths.mjs && node test/provider-prefix.mjs && node test/analytics-recording.mjs && node test/failover-429.mjs",
25
25
  "audit": "npm audit --production --audit-level=high",
26
26
  "prepublishOnly": "npm run build",
27
27
  "start": "node dist/cli.js",