@askalf/dario 3.16.0 → 3.19.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +171 -169
- package/dist/accounts.d.ts +2 -0
- package/dist/accounts.js +54 -4
- package/dist/cc-template-data.json +1 -0
- package/dist/cc-template.d.ts +3 -0
- package/dist/cc-template.js +95 -35
- package/dist/cli.js +19 -0
- package/dist/doctor.d.ts +43 -0
- package/dist/doctor.js +208 -0
- package/dist/live-fingerprint.d.ts +137 -0
- package/dist/live-fingerprint.js +375 -9
- package/dist/openai-backend.js +24 -3
- package/dist/proxy.js +108 -16
- package/package.json +2 -2
package/README.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
<p align="center">
|
|
2
2
|
<h1 align="center">dario</h1>
|
|
3
|
-
<p align="center"><strong>
|
|
3
|
+
<p align="center"><strong>A universal LLM router that runs on your machine.<br>One local endpoint, every provider — Anthropic, OpenAI, Groq, OpenRouter, Ollama, any OpenAI-compat URL. Your tools point here and stop caring which vendor is upstream.</strong></p>
|
|
4
4
|
</p>
|
|
5
5
|
|
|
6
6
|
<p align="center">
|
|
@@ -12,40 +12,33 @@
|
|
|
12
12
|
</p>
|
|
13
13
|
|
|
14
14
|
```bash
|
|
15
|
-
npm install -g @askalf/dario && dario
|
|
15
|
+
npm install -g @askalf/dario && dario proxy
|
|
16
16
|
```
|
|
17
17
|
|
|
18
|
-
|
|
18
|
+
One command, one local URL, every provider behind it. Point `ANTHROPIC_BASE_URL`, `OPENAI_BASE_URL`, or anything that speaks either protocol at `http://localhost:3456` and the **model name** decides where the request goes:
|
|
19
19
|
|
|
20
|
-
|
|
20
|
+
- `claude-opus-4-7`, `claude-sonnet-4-6`, `opus`, `sonnet`, `haiku` → **Anthropic** (via your Claude Max/Pro subscription, or a direct API key, your choice)
|
|
21
|
+
- `gpt-4o`, `o3-mini`, `chatgpt-4o-latest` → **OpenAI**
|
|
22
|
+
- `llama-3.3-70b`, `deepseek-v3`, anything else → **Groq**, **OpenRouter**, **local LiteLLM**, **vLLM**, **Ollama**, whichever OpenAI-compat backend you wired up
|
|
23
|
+
- Force a backend explicitly with a prefix: `openai:gpt-4o`, `groq:llama-3.3-70b`, `local:qwen-coder`, `claude:opus`
|
|
21
24
|
|
|
22
|
-
|
|
23
|
-
|
|
24
|
-
## Before and after
|
|
25
|
+
Switching providers is a **model-name change** in your tool. Not a reconfigure. Not new base URLs. Not new API keys. Not a new SDK import. **Zero runtime dependencies. ~7,600 lines of TypeScript across ~15 files. ~640 assertions across 20 test suites. [SLSA-attested](https://www.npmjs.com/package/@askalf/dario) on every release. Nothing phones home, ever.**
|
|
25
26
|
|
|
26
|
-
|
|
27
|
+
---
|
|
27
28
|
|
|
28
|
-
|
|
29
|
-
|---|---|
|
|
30
|
-
| Claude Code | subscription ✓ |
|
|
31
|
-
| Cursor | pay per token |
|
|
32
|
-
| Aider | pay per token |
|
|
33
|
-
| Continue | pay per token |
|
|
34
|
-
| Zed | pay per token |
|
|
35
|
-
| Your own scripts | pay per token |
|
|
29
|
+
## What it actually does
|
|
36
30
|
|
|
37
|
-
|
|
31
|
+
You point every tool at one URL. Dario reads each request, decides which backend owns it, and forwards the request in that backend's native protocol.
|
|
38
32
|
|
|
39
|
-
|
|
|
33
|
+
| Client speaks | Model in request | dario routes to | What happens |
|
|
40
34
|
|---|---|---|---|
|
|
41
|
-
|
|
|
42
|
-
|
|
|
43
|
-
|
|
|
44
|
-
|
|
|
45
|
-
|
|
|
46
|
-
| Your own scripts | **subscription ✓** | gpt-4o passthrough | passthrough |
|
|
35
|
+
| Anthropic Messages API | `claude-*` / `opus` / `sonnet` / `haiku` | Claude backend | OAuth swap + (optional) CC template replay → `api.anthropic.com` |
|
|
36
|
+
| Anthropic Messages API | `gpt-*`, `llama-*`, etc. | OpenAI-compat backend | Anthropic → OpenAI translation, forwarded to configured backend |
|
|
37
|
+
| OpenAI Chat Completions | `gpt-*` / `o1-*` / `o3-*` | OpenAI-compat backend | Passthrough: auth swap, body forwarded byte-for-byte |
|
|
38
|
+
| OpenAI Chat Completions | `claude-*` | Claude backend | OpenAI → Anthropic translation, then the Claude backend path |
|
|
39
|
+
| Either protocol | `<provider>:<model>` | Forced by prefix | Explicit override for ambiguous names |
|
|
47
40
|
|
|
48
|
-
The
|
|
41
|
+
The tool doesn't know. The backend doesn't know. Dario is the seam.
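
The routing table above can be sketched as a small pure function. This is an illustration only, not dario's actual code: the `routeModel` name, the `Route` shape, and the choice of fallback backend for unrecognized names are all assumptions made for the example.

```typescript
// Hypothetical sketch of model-name routing. Names and fallback behavior are assumptions.
type Route = { backend: string; model: string; forced: boolean };

const CLAUDE_ALIASES = ["opus", "sonnet", "haiku"];

function routeModel(model: string, defaultCompatBackend: string = "openai"): Route {
  // 1. An explicit "<provider>:<model>" prefix always wins.
  const sep = model.indexOf(":");
  if (sep > 0) {
    return { backend: model.slice(0, sep), model: model.slice(sep + 1), forced: true };
  }
  // 2. Claude model names and the short aliases go to the Claude backend.
  if (model.indexOf("claude-") === 0 || CLAUDE_ALIASES.indexOf(model) !== -1) {
    return { backend: "claude", model: model, forced: false };
  }
  // 3. Everything else goes to an OpenAI-compat backend (assumed fallback).
  return { backend: defaultCompatBackend, model: model, forced: false };
}
```

One design caveat for a sketch like this: some model names contain a colon of their own (Ollama tags such as `llama3:8b`), so a real router would probably treat a prefix as a provider only when it matches a configured backend name.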
|
|
49
42
|
|
|
50
43
|
---
|
|
51
44
|
|
|
@@ -55,15 +48,17 @@ The trick: every outbound request on the Claude path is rebuilt to look exactly
|
|
|
55
48
|
# Install
|
|
56
49
|
npm install -g @askalf/dario
|
|
57
50
|
|
|
58
|
-
#
|
|
59
|
-
|
|
51
|
+
# Any combination of backends:
|
|
52
|
+
|
|
53
|
+
# 1. Claude via your Claude Max / Pro subscription (uses your Claude Code
|
|
54
|
+
# OAuth if CC is installed; runs its own OAuth flow otherwise)
|
|
60
55
|
dario login
|
|
61
56
|
|
|
62
|
-
# OpenAI or any OpenAI-compat
|
|
57
|
+
# 2. OpenAI or any OpenAI-compat endpoint
|
|
63
58
|
dario backend add openai --key=sk-proj-...
|
|
64
|
-
dario backend add groq --key=gsk_...
|
|
65
|
-
dario backend add openrouter --key=sk-or-...
|
|
66
|
-
dario backend add local --key=anything
|
|
59
|
+
dario backend add groq --key=gsk_... --base-url=https://api.groq.com/openai/v1
|
|
60
|
+
dario backend add openrouter --key=sk-or-... --base-url=https://openrouter.ai/api/v1
|
|
61
|
+
dario backend add local --key=anything --base-url=http://127.0.0.1:11434/v1
|
|
67
62
|
|
|
68
63
|
# Start the proxy
|
|
69
64
|
dario proxy
|
|
@@ -75,23 +70,27 @@ export OPENAI_BASE_URL=http://localhost:3456/v1
|
|
|
75
70
|
export OPENAI_API_KEY=dario
|
|
76
71
|
```
|
|
77
72
|
|
|
78
|
-
That's it. Every tool that honors these standard env vars now reaches every backend you configured.
|
|
73
|
+
That's it. Every tool that honors these standard env vars now reaches every backend you configured. No per-tool reconfiguration. No SDK changes. One URL, one fake key, every real provider behind it.
|
|
74
|
+
|
|
75
|
+
Something broken? `dario doctor` prints a single aggregated health report — dario version, Node, platform, CC binary compat, template source + age + drift, OAuth status, pool state, configured backends. Paste that instead of screenshots when you file an issue.
|
|
79
76
|
|
|
80
77
|
---
|
|
81
78
|
|
|
82
79
|
## Why you'll install this
|
|
83
80
|
|
|
84
|
-
**You
|
|
81
|
+
**You want one URL for every provider.** Cursor, Aider, Continue, Zed, OpenHands, Claude Code, your own scripts — every tool you own has its own per-provider config. Dario collapses that into a single `localhost:3456` that speaks both Anthropic and OpenAI protocols and routes by model name. Switching providers is a model-name change in your tool, not a reconfigure of every SDK on your laptop.
|
|
85
82
|
|
|
86
|
-
**You
|
|
83
|
+
**You pay for Claude Max but only use it in Claude Code.** Cursor, Aider, Zed, Continue — they all want API keys and bill per-token while your $200/mo subscription sits idle. Dario's Claude backend routes requests from all of them through your plan by replaying the exact Claude Code wire shape (template, tools, headers, billing tag) that Anthropic's classifier expects for subscription billing. Section [Claude subscription backend](#2-claude-subscription-backend) has the full mechanics.
|
|
87
84
|
|
|
88
|
-
**You
|
|
85
|
+
**You hit rate limits on long agent runs.** Add a second / third Claude subscription with `dario accounts add work` and pool mode routes each request to whichever account has the most headroom. **Session stickiness** (v3.13.0) pins a multi-turn conversation to one account so the Anthropic prompt cache survives the run. **In-flight 429 failover** retries the same request against a different account before your client sees an error. See [Multi-account pool mode](#multi-account-pool-mode).
|
|
89
86
|
|
|
90
|
-
**You
|
|
87
|
+
**You run a coding agent that isn't Claude Code.** Cline, Roo Code, Cursor, Windsurf, Continue.dev, GitHub Copilot, OpenHands, OpenClaw, Hermes — they each ship their own tool schemas and their own validators. Dario's universal `TOOL_MAP` (**71 entries as of v3.15**) pre-maps every major coding agent's tool names to Claude Code's native set on the outbound path and rebuilds to your agent's exact expected shape on the inbound path. No `--preserve-tools`, no fingerprint loss, no validator errors. See [Agent compatibility](#agent-compatibility).
|
|
91
88
|
|
|
92
|
-
**You want
|
|
89
|
+
**You want the proxy layer off the wire entirely.** **Shim mode** (v3.12, hardened in v3.13) is an in-process `globalThis.fetch` patch injected via `NODE_OPTIONS=--require`. No HTTP hop, no port to bind, no `BASE_URL` to set. `dario shim -- claude --print "hi"` and CC thinks it's talking directly to `api.anthropic.com`. See [Shim mode](#shim-mode).
|
|
93
90
|
|
|
94
|
-
**You want to
|
|
91
|
+
**You want to share capacity with a trusted group without surveilling each other.** The **sealed-sender overflow protocol** (v3.13) uses RSA blind signatures (Chaum 1983, implemented from scratch over Node's `crypto`) so members of a trust group can lend unused Claude capacity to each other with cryptographic unlinkability. A lender verifies "this is a valid group member" without learning *which* member. Dario ships the primitive; [mux](https://github.com/askalf/mux) is the dedicated product around it (group admin, key distribution, member workflow, borrower CLI). See [Sealed-sender overflow](#sealed-sender-overflow-protocol).
|
|
92
|
+
|
|
93
|
+
**You want to actually audit the thing.** ~7,600 lines of TypeScript across ~15 files. Zero runtime dependencies (`npm ls --production` confirms). Credentials at `~/.dario/` with `0600` permissions. `127.0.0.1`-only by default. Every release [SLSA-attested](https://www.npmjs.com/package/@askalf/dario) via GitHub Actions. Nothing phones home. Small enough to read in a weekend.
|
|
95
94
|
|
|
96
95
|
---
|
|
97
96
|
|
|
@@ -100,10 +99,10 @@ That's it. Every tool that honors these standard env vars now reaches every back
|
|
|
100
99
|
**Best fit:**
|
|
101
100
|
|
|
102
101
|
- **Developers using multiple LLMs across multiple tools** tired of juggling base URLs, keys, and per-tool provider configs.
|
|
102
|
+
- **Teams running local or hosted OpenAI-compat servers** (LiteLLM, vLLM, Ollama, Groq, OpenRouter, self-hosted) who want one stable local endpoint every tool can reuse.
|
|
103
|
+
- **Anyone building AI coding tools** who wants provider independence without writing an OpenAI ↔ Anthropic translator themselves.
|
|
103
104
|
- **Claude Max / Pro subscribers** who want their subscription usable from every tool on their machine, not just Claude Code.
|
|
104
|
-
- **Teams running local or hosted OpenAI-compat servers** (LiteLLM, vLLM, Ollama, Groq, OpenRouter) who want one stable local endpoint that every tool reuses.
|
|
105
105
|
- **Power users on multi-agent workloads** who want multi-account pooling, session stickiness, and in-flight 429 failover on their own machine, against their own subscriptions.
|
|
106
|
-
- **Anyone building AI coding tools** who wants provider independence without writing an OpenAI ↔ Anthropic translator themselves.
|
|
107
106
|
|
|
108
107
|
**Not a fit if:**
|
|
109
108
|
|
|
@@ -115,29 +114,11 @@ That's it. Every tool that honors these standard env vars now reaches every back
|
|
|
115
114
|
|
|
116
115
|
## Backends
|
|
117
116
|
|
|
118
|
-
Dario's routing is organized around **backends
|
|
117
|
+
Dario's routing is organized around **backends**. Each is a swappable adapter — add one, your tools reach it through `localhost:3456` in whichever API shape they already speak. You can run zero, one, or all of them concurrently.
|
|
119
118
|
|
|
120
|
-
### 1.
|
|
119
|
+
### 1. OpenAI-compat backend
|
|
121
120
|
|
|
122
|
-
|
|
123
|
-
|
|
124
|
-
**What it does:**
|
|
125
|
-
|
|
126
|
-
- Every request is replaced with a Claude Code template before it goes upstream — 25 tool definitions, ~25KB system prompt, exact CC field order, exact beta headers, exact metadata structure. Only the conversation content is preserved. Anthropic's classifier sees what looks like a Claude Code session because, from the wire up, it *is* one — and that's what keeps your usage on subscription billing instead of Extra Usage.
|
|
127
|
-
- **Live fingerprint extraction** (v3.11.0). Dario spawns your installed `claude` binary against a loopback MITM endpoint on startup, captures its outbound request, and extracts the live template (system prompt, tools, user-agent, beta flags, and as of v3.13.0 the exact header insertion order — replayed on the wire by the shim since v3.13.0 and by the proxy since v3.16.0). Eliminates the "Anthropic ships a new CC, dario is stale for 48 hours" window. Cached at `~/.dario/cc-template.live.json` with a 24h TTL. Falls back to the bundled snapshot if CC isn't installed.
|
|
128
|
-
- **Billing tag** reconstructed using CC's own algorithm: `x-anthropic-billing-header: cc_version=<version>.<build_tag>; cc_entrypoint=cli; cch=<5-char-hex>;` where `build_tag = SHA-256(seed + chars[4,7,20] of user message + version).slice(0,3)`.
|
|
129
|
-
- **OAuth config auto-detection** from the installed CC binary. When Anthropic rotates `client_id`, authorize URL, or scopes, dario picks up the new values on the next run without needing a release.
|
|
130
|
-
- **Multi-account pool mode** — see below. Automatic when 2+ accounts are configured.
|
|
131
|
-
- **Framework scrubbing** — known fingerprint tokens (`OpenClaw`, `sessions_*` prefixes, orchestration tags) stripped from system prompt and message content before the request leaves your machine.
|
|
132
|
-
- **Bun auto-relaunch** — when Bun is installed, dario relaunches under it so the TLS fingerprint matches CC's runtime. Without Bun, dario runs on Node.js.
|
|
133
|
-
|
|
134
|
-
**Passthrough mode** (`dario proxy --passthrough`) does an OAuth swap and nothing else — no template, no identity, no scrubbing. Use it when the upstream tool already builds a Claude-Code-shaped request on its own and you just need the token auth.
|
|
135
|
-
|
|
136
|
-
**Detection scope.** The Claude backend is a per-request layer. Template replay and scrubbing are designed to be indistinguishable from Claude Code at the request level. What they *cannot* defend against is Anthropic's session-level behavioral classifier, which operates on cumulative per-OAuth aggregates (token throughput, conversation depth, streaming duration, inter-arrival timing). The practical answer to that is **pool mode** — distributing load across multiple subscriptions so no one account accumulates enough signal to trip anything. See the [FAQ entry](#faq) for the full mechanism.
|
|
137
|
-
|
|
138
|
-
### 2. OpenAI-compat backend
|
|
139
|
-
|
|
140
|
-
Any provider that speaks the OpenAI Chat Completions API. Activated by:
|
|
121
|
+
Any provider that speaks the OpenAI Chat Completions API.
|
|
141
122
|
|
|
142
123
|
```bash
|
|
143
124
|
# OpenAI itself (default base URL)
|
|
@@ -155,7 +136,7 @@ dario backend add local --key=anything --base-url=http://127.0.0.1:4000/v1
|
|
|
155
136
|
|
|
156
137
|
Credentials live at `~/.dario/backends/<name>.json` with mode `0600`.
|
|
157
138
|
|
|
158
|
-
**How it routes.**
|
|
139
|
+
**How it routes.** On `/v1/chat/completions` the request is inspected and forwarded:
|
|
159
140
|
|
|
160
141
|
| Request model | Route |
|
|
161
142
|
|---|---|
|
|
@@ -163,15 +144,38 @@ Credentials live at `~/.dario/backends/<name>.json` with mode `0600`.
|
|
|
163
144
|
| `claude-*` (or `opus` / `sonnet` / `haiku`) | Claude subscription backend |
|
|
164
145
|
| Anything else | Claude backend with OpenAI-compat translation |
|
|
165
146
|
|
|
166
|
-
|
|
147
|
+
The request body goes upstream as-is; only the `Authorization` header is swapped and the URL is pointed at `baseUrl + /chat/completions`. Streaming is forwarded byte-for-byte.
|
|
148
|
+
|
|
149
|
+
Force a backend with a **provider prefix** on the model field (`openai:gpt-4o`, `groq:llama-3.3-70b`, `claude:opus`, `local:qwen-coder`) regardless of what the model name looks like — see [Provider prefix](#provider-prefix).
|
|
167
150
|
|
|
168
|
-
|
|
151
|
+
### 2. Claude subscription backend
|
|
152
|
+
|
|
153
|
+
OAuth-backed Claude Max / Pro, billed against your plan instead of the API. Activated by `dario login`.
|
|
154
|
+
|
|
155
|
+
**What it does.** Every outbound Claude request is rebuilt to look exactly like a request Claude Code itself would make — system prompt, tool definitions, fingerprint headers, billing tag, beta flags, **even the exact header insertion order** — using a live-extracted template from your actually-installed CC binary that self-heals on every Anthropic release. Anthropic's classifier sees a CC session because, from the wire up, it *is* one. That's what keeps your usage on subscription billing instead of API overage.
|
|
156
|
+
|
|
157
|
+
**Key mechanisms:**
|
|
158
|
+
|
|
159
|
+
- **Live fingerprint extraction** (v3.11). Dario spawns your installed `claude` binary against a loopback MITM endpoint on startup, captures its outbound request, and extracts the live template (system prompt, tools, user-agent, beta flags, **header insertion order** as of v3.13, replayed on the wire by the shim since v3.13 and the proxy since v3.16). Eliminates the "Anthropic ships a new CC, dario is stale for 48 hours" window. Cached at `~/.dario/cc-template.live.json` with a 24h TTL. Falls back to the bundled snapshot if CC isn't installed.
|
|
160
|
+
- **Drift detection** (v3.17). On startup dario probes the installed `claude` binary and compares against the captured template. Mismatch triggers a forced refresh and prints a one-line warning. Users never silently sit on a stale template again.
|
|
161
|
+
- **Compat matrix** (v3.17). `SUPPORTED_CC_RANGE = { min: "1.0.0", maxTested: "2.1.104" }` is encoded in code. An installed CC outside that band prints a warning (untested, above `maxTested`) or a failure (below `min`). The check uses a zero-dep dotted-numeric comparator, with no `semver` import, per the dep policy.
|
|
162
|
+
- **Billing tag** reconstructed using CC's own algorithm: `x-anthropic-billing-header: cc_version=<version>.<build_tag>; cc_entrypoint=cli; cch=<5-char-hex>;` where `build_tag = SHA-256(seed + chars[4,7,20] of user message + version).slice(0,3)`.
|
|
163
|
+
- **OAuth config auto-detection** from the installed CC binary. When Anthropic rotates `client_id`, authorize URL, or scopes, dario picks up the new values on the next run without needing a release.
|
|
164
|
+
- **Multi-account pool mode** — see [Multi-account pool mode](#multi-account-pool-mode). Automatic when 2+ accounts are configured.
|
|
165
|
+
- **Framework scrubbing** — known fingerprint tokens (`OpenClaw`, `sessions_*` prefixes, orchestration tags) stripped from system prompt and message content before the request leaves your machine.
|
|
166
|
+
- **Atomic cache writes + cache corruption recovery** (v3.17). Template cache writes go through pid-qualified `.tmp` + `rename`, so an OS crash mid-write doesn't leave a half-written file. Unparseable cache files get quarantined to `cc-template.live.json.bad-<timestamp>` and dario self-heals on the next capture.
|
|
167
|
+
- **OAuth single-flight** (v3.17). Two concurrent refreshes for the same account alias now share one outbound `POST /oauth/token`, so the pool's background refresh timer and a user-triggered request at the same millisecond can't race and invalidate each other's refresh token.
|
|
168
|
+
- **Bun auto-relaunch** — when Bun is installed, dario relaunches under it so the TLS fingerprint matches CC's runtime. Without Bun, dario runs on Node.js.
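
The compat matrix check in the list above comes down to a dotted-numeric version comparison. A minimal sketch follows, assuming the range values quoted in the text. The function names `compareVersions` and `checkCompat` are hypothetical, and the handling of non-numeric segments is a simplification.

```typescript
// Range values are quoted from the text above. Function names are hypothetical.
const SUPPORTED_CC_RANGE = { min: "1.0.0", maxTested: "2.1.104" };

// Compare two dotted-numeric versions.
// Returns a negative number if a < b, 0 if equal, a positive number if a > b.
// Missing or non-numeric segments are treated as 0 in this sketch.
function compareVersions(a: string, b: string): number {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  const len = Math.max(pa.length, pb.length);
  for (let i = 0; i < len; i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

type Compat = "ok" | "warn" | "fail";

function checkCompat(installed: string): Compat {
  if (compareVersions(installed, SUPPORTED_CC_RANGE.min) < 0) return "fail"; // below min
  if (compareVersions(installed, SUPPORTED_CC_RANGE.maxTested) > 0) return "warn"; // untested
  return "ok";
}
```

Comparing segment by segment as numbers avoids the classic string-comparison bug where `"2.1.10"` sorts before `"2.1.9"`.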
|
|
169
|
+
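
The OAuth single-flight mechanism from the list above can be illustrated with a map of in-flight promises keyed by account alias. This is a sketch under assumptions: `refreshOnce` and the `doRefresh` callback are hypothetical names, and the real refresh would be the outbound `POST /oauth/token` call.

```typescript
// Hypothetical sketch: at most one in-flight refresh per account alias.
const inFlight = new Map<string, Promise<string>>();

function refreshOnce(alias: string, doRefresh: () => Promise<string>): Promise<string> {
  const existing = inFlight.get(alias);
  if (existing) return existing; // a refresh is already running, so share it

  const p = doRefresh();
  inFlight.set(alias, p);

  // Free the slot once the refresh settles, whether it succeeded or failed.
  const clear = () => {
    inFlight.delete(alias);
  };
  p.then(clear, clear);
  return p;
}
```

Both callers receive the same promise, so only one refresh token is ever presented upstream for a given alias at a time.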
|
|
170
|
+
**Passthrough mode** (`dario proxy --passthrough`) does an OAuth swap and nothing else — no template, no identity, no scrubbing. Use it when the upstream tool already builds a Claude-Code-shaped request on its own.
|
|
171
|
+
|
|
172
|
+
**Detection scope.** The Claude backend is a per-request layer. Template replay and scrubbing are designed to be indistinguishable from CC at the request level. What they *cannot* defend against is Anthropic's session-level behavioral classifier, which operates on cumulative per-OAuth aggregates (token throughput, conversation depth, streaming duration, inter-arrival timing). The practical answer is **pool mode** — distribute load across multiple subscriptions so no single account accumulates enough signal to trip anything.
|
|
169
173
|
|
|
170
174
|
---
|
|
171
175
|
|
|
172
176
|
## Multi-account pool mode
|
|
173
177
|
|
|
174
|
-
|
|
178
|
+
Pool mode activates automatically when `~/.dario/accounts/` contains 2+ accounts. Single-account dario is unchanged.
|
|
175
179
|
|
|
176
180
|
```bash
|
|
177
181
|
dario accounts add work
|
|
@@ -187,19 +191,19 @@ Each request picks the account with the highest headroom:
|
|
|
187
191
|
headroom = 1 - max(util_5h, util_7d)
|
|
188
192
|
```
|
|
189
193
|
|
|
190
|
-
The response's `anthropic-ratelimit-unified-*` headers are parsed back into the pool so the next selection sees fresh utilization. An account that returns a 429 is marked `rejected` and routed around until its window resets. When every account is exhausted, requests queue for up to 60 seconds waiting for headroom to reappear.
|
|
194
|
+
The response's `anthropic-ratelimit-unified-*` headers are parsed back into the pool so the next selection sees fresh utilization. An account that returns a 429 is marked `rejected` and routed around until its window resets. When every account is exhausted, requests queue for up to 60 seconds waiting for headroom to reappear. Plans can mix freely — Max and Pro accounts sit in the same pool; dario doesn't care about tier, only headroom.
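
The selection rule can be sketched directly from the headroom formula. The `Account` shape and the function names below are assumptions for illustration. Only the formula `1 - max(util_5h, util_7d)` and the skip-on-`rejected` behavior come from the text.

```typescript
// Hypothetical account shape. Field names are assumptions for this sketch.
interface Account {
  alias: string;
  util5h: number; // utilization of the 5-hour window, 0..1
  util7d: number; // utilization of the 7-day window, 0..1
  rejected: boolean; // true after a 429, until the window resets
}

function headroom(a: Account): number {
  return 1 - Math.max(a.util5h, a.util7d);
}

// Pick the non-rejected account with the highest headroom, or null if none is usable.
function selectAccount(pool: Account[]): Account | null {
  let best: Account | null = null;
  for (const a of pool) {
    if (a.rejected) continue;
    if (best === null || headroom(a) > headroom(best)) best = a;
  }
  return best;
}
```

Taking the maximum of the two windows means an account is only as available as its most constrained limit.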
|
|
191
195
|
|
|
192
|
-
### Session stickiness (v3.13
|
|
196
|
+
### Session stickiness (v3.13)
|
|
193
197
|
|
|
194
|
-
Multi-turn agent sessions
|
|
198
|
+
Multi-turn agent sessions pin to one account for the life of the conversation, so the Anthropic prompt cache isn't destroyed by account rotation between turns.
|
|
195
199
|
|
|
196
|
-
**The problem.** Claude
|
|
200
|
+
**The problem.** Claude prompt cache is scoped to `{account × cache_control key}`. When the pool rotates a long agent conversation across accounts on headroom alone, turn 1 builds a cache entry on account A, turn 2 lands on account B and reads nothing from A's cache — paying full cache-create cost again. For a long agent session that's a **5–10× token-cost multiplier** on every turn after the first.
|
|
197
201
|
|
|
198
|
-
**The fix.** Dario hashes a conversation's first user message into a 16-hex-char `stickyKey` (SHA-256 truncated, deterministic) and binds the key to whichever account `select()` would have picked on turn 1. Subsequent turns re-use that account as long as it's still healthy (not rejected, token not near expiry, headroom > 2%). On 429 failover, dario rebinds the key to the new account so the next turn doesn't re-select the exhausted one. 6h TTL, 2,000-entry cap, lazy cleanup. No client cooperation required
|
|
202
|
+
**The fix.** Dario hashes a conversation's first user message into a 16-hex-char `stickyKey` (SHA-256 truncated, deterministic) and binds the key to whichever account `select()` would have picked on turn 1. Subsequent turns re-use that account as long as it's still healthy (not rejected, token not near expiry, headroom > 2%). On 429 failover, dario rebinds the key to the new account so the next turn doesn't re-select the exhausted one. 6h TTL, 2,000-entry cap, lazy cleanup. No client cooperation required.
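
A minimal sketch of the sticky-key derivation and binding table described above. The TTL and entry cap come from the text. The `bindings` map, the `bind` / `lookup` names, and the eviction order are assumptions, and the health checks (rejected, token expiry, headroom > 2%) are left out.

```typescript
import { createHash } from "node:crypto";

const STICKY_TTL_MS = 6 * 60 * 60 * 1000; // 6h TTL, from the text
const STICKY_MAX_ENTRIES = 2000; // entry cap, from the text

// 16-hex-char key: SHA-256 of the first user message, truncated.
function stickyKey(firstUserMessage: string): string {
  return createHash("sha256").update(firstUserMessage, "utf8").digest("hex").slice(0, 16);
}

// Hypothetical binding table: stickyKey -> account alias plus bind time.
const bindings = new Map<string, { alias: string; boundAt: number }>();

function bind(key: string, alias: string, now: number): void {
  // Lazy cleanup: drop expired entries whenever a new binding is written.
  bindings.forEach((b, k) => {
    if (now - b.boundAt > STICKY_TTL_MS) bindings.delete(k);
  });
  // If still at the cap, evict the oldest insertion (assumed policy).
  if (!bindings.has(key) && bindings.size >= STICKY_MAX_ENTRIES) {
    const oldest = bindings.keys().next().value;
    if (oldest !== undefined) bindings.delete(oldest);
  }
  bindings.set(key, { alias: alias, boundAt: now });
}

function lookup(key: string, now: number): string | undefined {
  const b = bindings.get(key);
  if (!b || now - b.boundAt > STICKY_TTL_MS) return undefined;
  return b.alias;
}
```

Rebinding after a 429 failover is just another `bind` call with the new alias, which overwrites the old entry for the same key.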
|
|
199
203
|
|
|
200
|
-
### In-flight 429 failover (v3.8
|
|
204
|
+
### In-flight 429 failover (v3.8+)
|
|
201
205
|
|
|
202
|
-
When a Claude request hits a 429 mid-flight, dario retries the *same request* against a different account before the client
|
|
206
|
+
When a Claude request hits a 429 mid-flight, dario retries the *same request* against a different account before the client sees an error. The client sees one successful response; the pool sees the rejected account go cold until its window resets. Combined with session stickiness, long agent runs survive pool-level exhaustion without dropping user-facing turns.
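
The failover loop can be sketched as "try the next account until one does not answer 429". This is a simplified, synchronous illustration: `sendWithFailover`, `send`, and `markRejected` are hypothetical names, and a real proxy would do this asynchronously and would also handle the queueing behavior described earlier.

```typescript
// Hypothetical sketch: retry the same request against the next account on 429.
interface Attempt {
  alias: string;
  status: number;
}

function sendWithFailover(
  aliases: string[], // candidate accounts, best headroom first
  send: (alias: string) => number, // sends the request, returns the upstream HTTP status
  markRejected: (alias: string) => void
): Attempt | null {
  for (const alias of aliases) {
    const status = send(alias);
    if (status !== 429) return { alias: alias, status: status };
    markRejected(alias); // route around this account until its window resets
  }
  return null; // every candidate account answered 429
}
```

The client only ever observes the final attempt, which is why a mid-flight 429 on one account stays invisible to it.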
|
|
203
207
|
|
|
204
208
|
### Inspection
|
|
205
209
|
|
|
@@ -208,29 +212,33 @@ curl http://localhost:3456/accounts # per-account utilization, claim, sticky
|
|
|
208
212
|
curl http://localhost:3456/analytics # per-account / per-model stats, burn rate, exhaustion predictions
|
|
209
213
|
```
|
|
210
214
|
|
|
215
|
+
Every request carries a `billingBucket` field (`subscription` / `subscription_fallback` / `extra_usage` / `api` / `unknown`), so you can see which bucket each request billed against. A `subscriptionPercent` headline number tells you at a glance whether dario is actually routing through your subscription or silently falling back to API overage.
|
|
216
|
+
|
|
211
217
|
---
|
|
212
218
|
|
|
213
|
-
## Sealed-sender overflow protocol (v3.13
|
|
219
|
+
## Sealed-sender overflow protocol (v3.13)
|
|
214
220
|
|
|
215
221
|
Trust-group members can lend each other Claude capacity with **cryptographic unlinkability**: a lender can verify the borrower is a valid group member without learning *which* member, so no one in the pool can surveil another through borrow telemetry.
|
|
216
222
|
|
|
217
|
-
**The primitive.** RSA blind signatures (Chaum 1983), implemented from scratch on top of Node's `crypto` module using `RSA_NO_PADDING` for raw `m^e mod n` / `c^d mod n` primitives. Full-Domain Hash via MGF1-SHA256 (with counter retry) prevents multiplicative forgery. The flow: the group admin signs *blinded* tokens in a batch without seeing their real values; the member unblinds locally to obtain valid RSA-FDH signatures on random tokens the admin has never seen
|
|
223
|
+
**The primitive.** RSA blind signatures (Chaum 1983), implemented from scratch on top of Node's `crypto` module using `RSA_NO_PADDING` for raw `m^e mod n` / `c^d mod n` primitives. Full-Domain Hash via MGF1-SHA256 (with counter retry) prevents multiplicative forgery. The flow: the group admin signs *blinded* tokens in a batch without seeing their real values; the member unblinds locally to obtain valid RSA-FDH signatures on random tokens the admin has never seen. When a member spends a token with a lender, the lender verifies the signature with the group public key — it proves "some member got this signed" without identifying who.
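
The blind / sign / unblind / verify flow can be shown with textbook RSA on toy numbers. This is a sketch for intuition only and is not taken from `src/sealed-pool.ts`: the parameters are tiny (p = 61, q = 53), the Full-Domain Hash step is omitted, and all function names are assumptions.

```typescript
// Toy RSA parameters (p = 61, q = 53). Far too small for real use.
const N = 3233; // modulus
const E = 17; // public exponent
const D = 2753; // private exponent, held by the group admin

// Square-and-multiply modular exponentiation.
function modPow(base: number, exp: number, mod: number): number {
  let result = 1;
  let b = base % mod;
  let e = exp;
  while (e > 0) {
    if (e % 2 === 1) result = (result * b) % mod;
    b = (b * b) % mod;
    e = Math.floor(e / 2);
  }
  return result;
}

// Modular inverse via the extended Euclidean algorithm.
function modInverse(a: number, mod: number): number {
  let [r0, r1] = [mod, a % mod];
  let [t0, t1] = [0, 1];
  while (r1 !== 0) {
    const q = Math.floor(r0 / r1);
    [r0, r1] = [r1, r0 - q * r1];
    [t0, t1] = [t1, t0 - q * t1];
  }
  if (r0 !== 1) throw new Error("value is not invertible modulo N");
  return ((t0 % mod) + mod) % mod;
}

// Member: hide the token m behind a random blinding factor r.
const blind = (m: number, r: number): number => (m * modPow(r, E, N)) % N;
// Admin: sign the blinded value without ever seeing m.
const signBlinded = (blinded: number): number => modPow(blinded, D, N);
// Member: strip the blinding factor to get a plain signature on m.
const unblind = (s: number, r: number): number => (s * modInverse(r, N)) % N;
// Lender: check the signature against the group public key (N, E).
const verify = (m: number, sig: number): boolean => modPow(sig, E, N) === m;
```

The unblinded result equals `m^d mod n`, the same signature the admin would have produced by signing `m` directly, even though the admin only ever saw the blinded value.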
|
|
218
224
|
|
|
219
|
-
**What this is, and what it isn't.** This is **privacy between group members**, not anonymity from Anthropic. When a lender accepts a borrow, the actual upstream request still lands under the lender's attributable Claude account identity — Anthropic sees the lender as the originator, exactly as they would for any other request on that account. The cryptographic unlinkability protects group members from each other
|
|
225
|
+
**What this is, and what it isn't.** This is **privacy between group members**, not anonymity from Anthropic. When a lender accepts a borrow, the actual upstream request still lands under the lender's attributable Claude account identity — Anthropic sees the lender as the originator, exactly as they would for any other request on that account. The cryptographic unlinkability protects group members from each other.
|
|
220
226
|
|
|
221
|
-
**What's in
|
|
227
|
+
**What's in the release:**
|
|
222
228
|
|
|
223
229
|
- `src/sealed-pool.ts` — ~550 lines. `GroupAdmin` / `GroupMember` / `GroupLender` classes with quota/expiry enforcement, SHA-256-hashed double-spend set, JSON wire envelope (`{v:1, groupId, token, sig, request}`), and key export/import for distributing group credentials.
|
|
224
230
|
- `POST /v1/pool/borrow` endpoint on the proxy, gated on `~/.dario/group.json`. Positioned before `checkAuth` — the group signature *is* the authentication, so doubling it with a local API key would add nothing. Verified borrows delegate to `pool.select()` and forward upstream under the lender's account.
|
|
225
|
-
-
|
|
231
|
+
- 85 test assertions in `test/sealed-pool.mjs` covering raw RSA roundtrip, unlinkability, wrong-key / tampered-sig / wrong-group / double-spend rejection, key export/import, admin membership / quota / expiry enforcement, concurrent-borrow double-spend prevention, and end-to-end two-member unlinkability.
|
|
232
|
+
|
|
233
|
+
Full feature-parity with `/v1/messages` (streaming, inside-request 429 failover, reverse tool mapping) for borrowed requests is intentionally a follow-up — the current release ships the cryptographic primitive and a working minimal endpoint; full integration layers on top.
|
|
226
234
|
|
|
227
|
-
|
|
235
|
+
**Dedicated product.** The sealed-sender protocol has a dedicated product around it: [mux](https://github.com/askalf/mux). mux carries the group admin tooling (key generation, member roster, batch signing), the member workflow (prepare / finalize / status), the borrower CLI, and the lender daemon as a coherent surface. It uses dario as its backend — a mux lender runs a dario pool and fronts it with `/v1/pool/borrow`. Dario keeps the primitive here for anyone who wants to embed it without running the full mux flow; for peer-to-peer capacity sharing as a product, use mux.
|
|
228
236
|
|
|
229
237
|
---
|
|
230
238
|
|
|
231
239
|
## Shim mode
|
|
232
240
|
|
|
233
|
-
*Experimental, opt-in. The
|
|
241
|
+
*Experimental, opt-in. The proxy is still the default — shim mode is a second transport, not a replacement.*
|
|
234
242
|
|
|
235
243
|
Shim mode runs a child process with an **in-process `globalThis.fetch` patch** that rewrites the child's outbound requests to `api.anthropic.com/v1/messages` exactly the way the proxy would, then sends them directly from the child to Anthropic. No localhost HTTP hop. No port to bind. No `ANTHROPIC_BASE_URL` to set.
|
|
236
244
|
|
|
@@ -239,30 +247,44 @@ dario shim -- claude --print "hello"
|
|
|
239
247
|
dario shim -v -- claude --print "hello" # verbose
|
|
240
248
|
```
|
|
241
249
|
|
|
242
|
-
Under the hood: `dario shim` spawns the child with `NODE_OPTIONS=--require <dario-runtime.cjs>` and a unix socket / named pipe for telemetry. The runtime patches `globalThis.fetch` only for Anthropic messages requests, applies the same template replay the proxy does, and relays per-request events back to the parent so analytics still work. Every other fetch call
|
|
250
|
+
Under the hood: `dario shim` spawns the child with `NODE_OPTIONS=--require <dario-runtime.cjs>` and a unix socket / named pipe for telemetry. The runtime patches `globalThis.fetch` only for Anthropic messages requests, applies the same template replay the proxy does, and relays per-request events back to the parent so analytics still work. Every other fetch call is untouched and fails safe on any internal error.
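
The patch-and-fail-safe behavior described above can be sketched as a wrapper around a fetch implementation. This is an illustration under assumptions: `wrapFetch`, the simplified `Init` and `FetchLike` types, and the `rewrite` callback are hypothetical, and the real runtime also handles telemetry and template loading.

```typescript
// Minimal fetch-like types so the sketch stands alone (assumed, simplified).
type Init = { method?: string; headers?: Record<string, string>; body?: string };
type FetchLike = (url: string, init?: Init) => Promise<unknown>;

const TARGET = "https://api.anthropic.com/v1/messages";

// Wrap a fetch implementation so that only Anthropic messages requests are rewritten.
function wrapFetch(base: FetchLike, rewrite: (init: Init) => Init): FetchLike {
  return (url, init) => {
    // Every other fetch call is passed through untouched.
    if (url.indexOf(TARGET) !== 0) return base(url, init);

    let outgoing = init;
    try {
      outgoing = rewrite(init || {});
    } catch (err) {
      // Fail safe: on any internal error, send the original request unchanged.
      outgoing = init;
    }
    return base(url, outgoing);
  };
}

// Installing it in a child process would look roughly like:
//   globalThis.fetch = wrapFetch(globalThis.fetch, rewriteFromTemplate);
```

Keeping the URL check first means a bug in the rewrite logic can never affect unrelated traffic from the child process.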
|
|
243
251
|
|
|
244
|
-
**Why it matters.** Anthropic can fingerprint a proxy via TLS, headers, IP, or `BASE_URL` env. They literally cannot easily detect a `globalThis.fetch` monkey-patch from inside their own process without shipping signed-binary integrity checks against `globalThis` — and even then, the shim runs *before* CC's code loads, so it could patch the integrity check too.
|
|
252
|
+
**Why it matters.** Anthropic can fingerprint a proxy via TLS, headers, IP, or the `BASE_URL` env var. A `globalThis.fetch` monkey-patch is much harder to detect: it would take signed-binary integrity checks against `globalThis` shipped inside the CC binary, and even then the shim runs *before* CC's code loads, so it could patch the integrity check too. Of the two transports, this is the one likely to hold up longest as the classifier evolves.
|
|
245
253
|
|
|
246
|
-
**v3.13
|
|
247
|
-
|
|
248
|
-
- **Runtime detection** — `detectRuntime()` checks `globalThis.Bun` / `globalThis.Deno` / `process.versions.node` and logs a warning for non-Node runtimes. Canary for the day Anthropic ships a Bun-compiled CC.
|
|
249
|
-
- **Template mtime-based auto-reload** — long-running child processes pick up mid-session fingerprint refreshes from dario's live capture without restart.
|
|
250
|
-
- **Strict defensive `rewriteBody`** — the previous logic accepted `length >= 1` on the system array and invented `[1]`/`[2]` blocks out of thin air. Now requires exactly `length === 3` with all-text blocks; any mismatch passes through unchanged. Passthrough on an unknown shape is safer than blind replacement.
|
|
251
|
-
- **`rewriteHeaders` honors captured header order** — the live fingerprint capture now records the exact order CC emits headers on the wire, and the shim replays that order on every outbound request. Header sequence alone is a fingerprint vector; v3.13.0 removes it from the shim, and v3.16.0 closes the same gap on the proxy via the shared `orderHeadersForOutbound` helper so both transports produce an identical wire shape.
|
|
252
|
-
- **`checkVersionDrift`** — logs when the child's UA `cc_version` differs from the template's, so stale-cache windows during CC upgrades are visible in debug output.
|
|
254
|
+
**v3.13 hardening** added runtime detection (canary for the day Anthropic ships a Bun-compiled CC), template mtime-based auto-reload (long-running children pick up mid-session fingerprint refreshes without restart), strict defensive `rewriteBody` (requires exactly 3 text blocks, passes through on any mismatch instead of inventing structure), and header-order replay (honors captured CC header sequence so the shim matches CC wire-exact).
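Two of the hardening rules above, strict-shape passthrough and header-order replay, can be illustrated with a short sketch. The function names, the `Block` shape, and the idea of swapping in three template strings are assumptions made for the example; the source only states the "exactly three text blocks, otherwise pass through" rule and the captured-order rule.

```typescript
// Hypothetical shapes; dario's real template and block types may differ.
type Block = { type: string; text?: string };

// Rewrite only when the system array is exactly three text blocks.
// Any other shape is returned unchanged instead of inventing structure.
function rewriteSystemStrict(system: unknown, template: [string, string, string]): unknown {
  if (!Array.isArray(system) || system.length !== 3) return system;
  const allText = system.every(
    (b) => typeof b === "object" && b !== null && b.type === "text" && typeof b.text === "string",
  );
  if (!allText) return system;
  return template.map((text) => ({ type: "text", text }));
}

// Emit headers in the captured order first (case-insensitive match),
// then any headers the capture did not list, in their original order.
function orderHeaders(headers: Record<string, string>, captured: string[]): Array<[string, string]> {
  const names = Object.keys(headers);
  const used = new Set<string>();
  const out: Array<[string, string]> = [];
  for (const want of captured) {
    const match = names.find((n) => n.toLowerCase() === want.toLowerCase() && !used.has(n));
    if (match !== undefined) {
      used.add(match);
      out.push([match, headers[match]]);
    }
  }
  for (const n of names) {
    if (!used.has(n)) out.push([n, headers[n]]);
  }
  return out;
}
```

Returning the input by reference on a mismatch makes the passthrough case trivially checkable by the caller (`result === input`).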
|
|
253
255
|
|
|
254
256
|
**When to use shim mode:**
|
|
255
|
-
- Running a single CC instance on a locked-down machine where binding a local port is inconvenient
|
|
257
|
+
- Running a single CC instance on a locked-down machine where binding a local port is inconvenient.
|
|
256
258
|
- Wrapping one-off scripts (`dario shim -- node my-agent.js`) without setting up environment variables.
|
|
257
|
-
- Debugging a specific child process in isolation — verbose logs are scoped to that
|
|
259
|
+
- Debugging a specific child process in isolation — verbose logs are scoped to that child.
|
|
258
260
|
- You suspect Anthropic is fingerprinting your proxy traffic and you want to take the proxy off the wire.
|
|
259
261
|
|
|
260
|
-
**When to stay on the proxy** (
|
|
261
|
-
- Multi-client routing. The proxy serves every tool on the machine through one endpoint;
|
|
262
|
+
**When to stay on the proxy** (default):
|
|
263
|
+
- Multi-client routing. The proxy serves every tool on the machine through one endpoint; shim wraps one child at a time.
|
|
262
264
|
- Multi-account pool mode. Pooling across subscriptions needs a shared OAuth pool the proxy owns — a shim patch inside one child can't see pool state across other processes.
|
|
263
|
-
- Anything that isn't a Node / Bun child. The shim relies on `NODE_OPTIONS`, so
|
|
265
|
+
- Anything that isn't a Node / Bun child. The shim relies on `NODE_OPTIONS`, so Python SDKs or Go CLIs still need the proxy.
|
|
266
|
+
|
|
267
|
+
---
|
|
264
268
|
|
|
265
|
-
|
|
269
|
+
## Agent compatibility
|
|
270
|
+
|
|
271
|
+
As of **v3.18**, dario's built-in `TOOL_MAP` carries **~65 schema-verified entries** covering the tool schemas of the coding agents listed below. On the Claude backend, tool calls translate to CC's native `Bash / Read / Write / Edit / Glob / Grep / WebSearch / WebFetch` on the outbound path (keeping the subscription fingerprint intact) and are rebuilt to your agent's exact expected shape on the inbound path (so your validator is happy). No flag required.
|
|
272
|
+
|
|
273
|
+
| Agent | Covered tool names (subset) |
|
|
274
|
+
|---|---|
|
|
275
|
+
| Claude Code | default — CC's own tools |
|
|
276
|
+
| Cline / Roo Code | `execute_command`, `write_to_file`, `replace_in_file`, `apply_diff`, `list_files`, `search_files`, `read_file` |
|
|
277
|
+
| Cursor | `run_terminal_cmd`, `edit_file`, `search_replace`, `codebase_search`, `grep_search`, `file_search`, `list_dir`, `read_file` (`target_file`) |
|
|
278
|
+
| Windsurf | `run_command`, `view_file`, `write_to_file`, `replace_file_content`, `find_by_name`, `grep_search`, `list_dir`, `search_web`, `read_url_content` |
|
|
279
|
+
| Continue.dev | `builtin_run_terminal_command`, `builtin_read_file`, `builtin_create_new_file`, `builtin_edit_existing_file`, `builtin_file_glob_search`, `builtin_grep_search`, `builtin_ls` |
|
|
280
|
+
| GitHub Copilot | `run_in_terminal`, `insert_edit_into_file`, `semantic_search`, `codebase_search`, `list_dir`, `fetch_webpage` |
|
|
281
|
+
| OpenHands | `execute_bash`, `str_replace_editor` |
|
|
282
|
+
| OpenClaw | `exec`, `process`, `web_search`, `web_fetch`, `browser`, `message` |
|
|
283
|
+
| Hermes | `terminal`, `patch`, `web_extract`, `clarify` |
|
|
284
|
+
|
|
285
|
+
If your agent's tool names aren't pre-mapped, there are two escape hatches: **`--preserve-tools`** (forward your schema verbatim, lose the CC fingerprint) or **`--hybrid-tools`** (keep the fingerprint, fill request-context fields from headers). See [Custom tool schemas](#custom-tool-schemas).
|
|
286
|
+
|
|
287
|
+
The OpenAI-compat backend forwards tool definitions byte-for-byte and doesn't need any of this.
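To make the outbound/inbound translation on the Claude backend concrete, here is a two-entry sketch of what a tool map of this kind might look like. It is not an excerpt of dario's `TOOL_MAP`: the `Mapping` interface, the helper names, and the CC-side parameter names (`command`, `file_path`) are assumptions for illustration. Only the client tool names and Cursor's `target_file` parameter come from the table above.

```typescript
// Hypothetical two-entry sketch; the real TOOL_MAP is larger and may be
// structured differently. CC-side parameter names are assumed.
type Args = Record<string, unknown>;

interface Mapping {
  ccTool: string;               // CC tool the client tool is sent as
  toCC: (a: Args) => Args;      // client args -> CC args (outbound)
  fromCC: (a: Args) => Args;    // CC args -> client args (inbound)
}

const TOOL_MAP_SKETCH: Record<string, Mapping> = {
  execute_command: {
    ccTool: "Bash",
    toCC: (a) => ({ command: a.command }),
    fromCC: (a) => ({ command: a.command }),
  },
  read_file: {
    ccTool: "Read",
    toCC: (a) => ({ file_path: a.target_file }),
    fromCC: (a) => ({ target_file: a.file_path }),
  },
};

function lookup(clientTool: string): Mapping | null {
  const known = Object.prototype.hasOwnProperty.call(TOOL_MAP_SKETCH, clientTool);
  return known ? TOOL_MAP_SKETCH[clientTool] : null;
}

// Outbound: rename the tool and reshape its arguments, or return null
// when the tool is not pre-mapped.
function toOutbound(clientTool: string, args: Args): { name: string; input: Args } | null {
  const m = lookup(clientTool);
  return m === null ? null : { name: m.ccTool, input: m.toCC(args) };
}

// Inbound: rebuild the client-shaped arguments from the CC-shaped ones.
function toInbound(clientTool: string, ccInput: Args): Args | null {
  const m = lookup(clientTool);
  return m === null ? null : m.fromCC(ccInput);
}
```

A `null` result here corresponds to the unmapped case, where `--preserve-tools` or `--hybrid-tools` would come into play.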
|
|
266
288
|
|
|
267
289
|
---
|
|
268
290
|
|
|
@@ -272,6 +294,7 @@ See the [v3.12.0 release notes](https://github.com/askalf/dario/releases/tag/v3.
|
|
|
272
294
|
|---|---|
|
|
273
295
|
| `dario login` | Log in to the Claude backend (detects CC credentials or runs its own OAuth flow) |
|
|
274
296
|
| `dario proxy` | Start the local API proxy on port 3456 |
|
|
297
|
+
| `dario doctor` | Aggregated health report — dario / Node / CC binary + compat / template + drift / OAuth / pool / backends |
|
|
275
298
|
| `dario status` | Show Claude backend OAuth token health and expiry |
|
|
276
299
|
| `dario refresh` | Force an immediate Claude token refresh |
|
|
277
300
|
| `dario logout` | Delete stored Claude credentials |
|
|
@@ -289,9 +312,9 @@ See the [v3.12.0 release notes](https://github.com/askalf/dario/releases/tag/v3.
|
|
|
289
312
|
| Flag / env | Description | Default |
|
|
290
313
|
|---|---|---|
|
|
291
314
|
| `--passthrough` / `--thin` | Thin proxy for the Claude backend — OAuth swap only, no template injection | off |
|
|
292
|
-
| `--preserve-tools` / `--keep-tools` | Keep client tool schemas instead of remapping to CC's
|
|
315
|
+
| `--preserve-tools` / `--keep-tools` | Keep client tool schemas instead of remapping to CC's. Required for clients whose tools have fields CC doesn't — see [Custom tool schemas](#custom-tool-schemas). | off |
|
|
293
316
|
| `--hybrid-tools` / `--context-inject` | Remap to CC tools **and** inject request-context values (`sessionId`, `requestId`, `channelId`, `userId`, `timestamp`) into client-declared fields CC's schema doesn't carry. See [Hybrid tool mode](#hybrid-tool-mode). | off |
|
|
294
|
-
| `--model=<name>` | Force a model. Shortcuts (`opus`, `sonnet`, `haiku`), full IDs (`claude-opus-4-
|
|
317
|
+
| `--model=<name>` | Force a model. Shortcuts (`opus`, `sonnet`, `haiku`), full IDs (`claude-opus-4-7`), or a **provider prefix** (`openai:gpt-4o`, `groq:llama-3.3-70b`, `claude:opus`, `local:qwen-coder`) to force the backend server-wide. See [Provider prefix](#provider-prefix). | passthrough |
|
|
295
318
|
| `--port=<n>` | Port to listen on | `3456` |
|
|
296
319
|
| `--host=<addr>` / `DARIO_HOST` | Bind address. Use `0.0.0.0` for LAN, or a specific IP (e.g. a Tailscale interface). When non-loopback, also set `DARIO_API_KEY`. | `127.0.0.1` |
|
|
297
320
|
| `--verbose` / `-v` | Log every request | off |
|
|
@@ -316,7 +339,7 @@ client = anthropic.Anthropic(
|
|
|
316
339
|
)
|
|
317
340
|
|
|
318
341
|
msg = client.messages.create(
|
|
319
|
-
model="claude-opus-4-
|
|
342
|
+
model="claude-opus-4-7",
|
|
320
343
|
max_tokens=1024,
|
|
321
344
|
messages=[{"role": "user", "content": "Hello!"}],
|
|
322
345
|
)
|
|
@@ -339,9 +362,9 @@ msg = client.chat.completions.create(
|
|
|
339
362
|
messages=[{"role": "user", "content": "Hello!"}],
|
|
340
363
|
)
|
|
341
364
|
|
|
342
|
-
# claude-opus-4-
|
|
365
|
+
# claude-opus-4-7 routes to the Claude subscription backend — same SDK, same URL
|
|
343
366
|
claude_msg = client.chat.completions.create(
|
|
344
|
-
model="claude-opus-4-
|
|
367
|
+
model="claude-opus-4-7",
|
|
345
368
|
messages=[{"role": "user", "content": "Hello!"}],
|
|
346
369
|
)
|
|
347
370
|
```
|
|
@@ -357,7 +380,7 @@ const client = new Anthropic({
|
|
|
357
380
|
});
|
|
358
381
|
|
|
359
382
|
const msg = await client.messages.create({
|
|
360
|
-
model: "claude-opus-4-
|
|
383
|
+
model: "claude-opus-4-7",
|
|
361
384
|
max_tokens: 1024,
|
|
362
385
|
messages: [{ role: "user", content: "Hello!" }],
|
|
363
386
|
});
|
|
@@ -370,7 +393,7 @@ export OPENAI_BASE_URL=http://localhost:3456/v1
|
|
|
370
393
|
export OPENAI_API_KEY=dario
|
|
371
394
|
```
|
|
372
395
|
|
|
373
|
-
Any tool that accepts an OpenAI base URL works. Use Claude model names (`claude-opus-4-
|
|
396
|
+
Any tool that accepts an OpenAI base URL works. Use Claude model names (`claude-opus-4-7`, `opus`, `sonnet`, `haiku`) for the Claude backend, or GPT-family names for the configured OpenAI-compat backend.
|
|
374
397
|
|
|
375
398
|
### curl
|
|
376
399
|
|
|
@@ -379,7 +402,7 @@ Any tool that accepts an OpenAI base URL works. Use Claude model names (`claude-
|
|
|
379
402
|
curl http://localhost:3456/v1/messages \
|
|
380
403
|
-H "Content-Type: application/json" \
|
|
381
404
|
-H "anthropic-version: 2023-06-01" \
|
|
382
|
-
-d '{"model":"claude-opus-4-
|
|
405
|
+
-d '{"model":"claude-opus-4-7","max_tokens":1024,"messages":[{"role":"user","content":"Hello!"}]}'
|
|
383
406
|
|
|
384
407
|
# OpenAI backend via OpenAI format
|
|
385
408
|
curl http://localhost:3456/v1/chat/completions \
|
|
@@ -394,7 +417,7 @@ All supported. Claude backend: full Anthropic SSE format plus OpenAI-SSE transla
|
|
|
394
417
|
|
|
395
418
|
### Provider prefix
|
|
396
419
|
|
|
397
|
-
Any request's `model` field can be written as `<provider>:<name>` to force which backend handles it, regardless of what the model name looks like. Useful when regex-based routing (`gpt-*` → OpenAI, `claude-*` → Claude) doesn't match — for example
|
|
420
|
+
Any request's `model` field can be written as `<provider>:<name>` to force which backend handles it, regardless of what the model name looks like. Useful when regex-based routing (`gpt-*` → OpenAI, `claude-*` → Claude) doesn't match — for example routing a `llama-3.3-70b` request through OpenRouter, or making the same model name go to different providers on different requests.
|
|
398
421
|
|
|
399
422
|
Recognized prefixes:
|
|
400
423
|
|
|
@@ -408,33 +431,15 @@ Recognized prefixes:
|
|
|
408
431
|
| `claude:` | Claude subscription backend |
|
|
409
432
|
| `anthropic:` | Claude subscription backend |
|
|
410
433
|
|
|
411
|
-
The prefix gets stripped before the request goes upstream — the backend only sees the bare model name. Unrecognized prefixes are ignored, so
|
|
412
|
-
|
|
413
|
-
### Agent compatibility
|
|
414
|
-
|
|
415
|
-
As of **v3.15.0**, dario's built-in `TOOL_MAP` has **71 entries** covering the tool schemas of every major coding agent. If you're running one of these, no flag is required on the Claude backend — tool calls translate to CC's native `Bash/Read/Write/Edit/Glob/Grep/WebSearch/WebFetch` on the outbound path (so the subscription fingerprint stays intact) and rebuild to your agent's exact expected shape on the inbound path (so your validator is happy).
|
|
416
|
-
|
|
417
|
-
| Agent | Covered tool names (subset) |
|
|
418
|
-
|---|---|
|
|
419
|
-
| Claude Code | default — CC's own tools |
|
|
420
|
-
| Cline / Roo Code | `execute_command`, `write_to_file`, `replace_in_file`, `apply_diff`, `list_files`, `search_files`, `read_file` |
|
|
421
|
-
| Cursor | `run_terminal_cmd`, `edit_file`, `search_replace`, `codebase_search`, `grep_search`, `file_search`, `list_dir`, `read_file` (`target_file`) |
|
|
422
|
-
| Windsurf | `run_command`, `view_file`, `write_to_file`, `replace_file_content`, `find_by_name`, `grep_search`, `list_dir`, `search_web`, `read_url_content` |
|
|
423
|
-
| Continue.dev | `builtin_run_terminal_command`, `builtin_read_file`, `builtin_create_new_file`, `builtin_edit_existing_file`, `builtin_file_glob_search`, `builtin_grep_search`, `builtin_ls` |
|
|
424
|
-
| GitHub Copilot | `run_in_terminal`, `insert_edit_into_file`, `semantic_search`, `codebase_search`, `list_dir`, `fetch_webpage` |
|
|
425
|
-
| OpenHands | `execute_bash`, `str_replace_editor` |
|
|
426
|
-
| OpenClaw | `exec`, `process`, `web_search`, `web_fetch`, `browser`, `message` |
|
|
427
|
-
| Hermes | `terminal`, `patch`, `web_extract`, `clarify` |
|
|
428
|
-
|
|
429
|
-
If your agent's tool names aren't in this list, you've got two escape hatches below: **`--preserve-tools`** (forward your schema verbatim, lose the CC fingerprint) or **`--hybrid-tools`** (keep the fingerprint, fill request-context fields from headers). Open an issue with your agent's tool schema and we'll add a pre-mapping entry.
|
|
434
|
+
The prefix gets stripped before the request goes upstream — the backend only sees the bare model name. Unrecognized prefixes are ignored, so Ollama-style `llama3:8b` passes through untouched. `dario proxy --model=openai:gpt-4o` applies the prefix to every request server-wide.
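The strip-or-ignore rule can be sketched as a small parser. This is an illustrative sketch, not dario's routing code: `splitProviderPrefix` is a hypothetical name, and the prefix list contains only the prefixes named in this README, so it may be incomplete.

```typescript
// Prefixes mentioned in this README; the real list may contain more.
const KNOWN_PREFIXES = ["openai", "groq", "local", "claude", "anthropic"];

interface Routed {
  provider: string | null; // null means "fall back to model-name routing"
  model: string;           // bare model name to send upstream
}

// Split "<provider>:<name>" only when the prefix is recognized.
// Unrecognized prefixes (e.g. Ollama's "llama3:8b") are left intact.
function splitProviderPrefix(model: string): Routed {
  const i = model.indexOf(":");
  if (i <= 0) return { provider: null, model };
  const prefix = model.slice(0, i);
  if (KNOWN_PREFIXES.indexOf(prefix) === -1) return { provider: null, model };
  return { provider: prefix, model: model.slice(i + 1) };
}
```

For example, `splitProviderPrefix("openai:gpt-4o")` yields provider `openai` and model `gpt-4o`, while `splitProviderPrefix("llama3:8b")` returns the model name unchanged with no forced provider.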
|
|
430
435
|
|
|
431
436
|
### Custom tool schemas
|
|
432
437
|
|
|
433
|
-
By default, on the Claude backend, dario replaces your client's tool definitions with the real Claude Code tools (`Bash`, `Read`, `Write`, `Edit`, `Grep`, `Glob`, `WebSearch`, `WebFetch`) and translates parameters back and forth. That's how dario looks like CC on the wire, which is what lets your request bill against your Claude subscription instead of API pricing. For the agents listed in [Agent compatibility](#agent-compatibility)
|
|
438
|
+
By default, on the Claude backend, dario replaces your client's tool definitions with the real Claude Code tools (`Bash`, `Read`, `Write`, `Edit`, `Grep`, `Glob`, `WebSearch`, `WebFetch`) and translates parameters back and forth. That's how dario looks like CC on the wire, which is what lets your request bill against your Claude subscription instead of API pricing. For the agents listed in [Agent compatibility](#agent-compatibility), the translation is pre-mapped and runs automatically — nothing to configure.
|
|
434
439
|
|
|
435
|
-
The trade-off shows up when you're running something that *isn't* in the pre-mapped list and whose tools carry fields CC's schema doesn't have — a `sessionId`, a custom request id, a channel-bound context token, a `confidence` score the model is supposed to emit. Those fields don't survive the round trip.
|
|
440
|
+
The trade-off shows up when you're running something that *isn't* in the pre-mapped list and whose tools carry fields CC's schema doesn't have — a `sessionId`, a custom request id, a channel-bound context token, a `confidence` score the model is supposed to emit. Those fields don't survive the round trip.
|
|
436
441
|
|
|
437
|
-
Symptom: your tool calls come back looking stripped-down, or your runtime complains about a required field being absent *only when routed through dario's Claude backend
|
|
442
|
+
Symptom: your tool calls come back looking stripped-down, or your runtime complains about a required field being absent *only when routed through dario's Claude backend*.
|
|
438
443
|
|
|
439
444
|
Fix: run dario with `--preserve-tools`. That skips the CC tool remap entirely, passes your client's tool definitions through to the model unchanged, and lets the model populate every field your schema expects.
|
|
440
445
|
|
|
@@ -444,7 +449,7 @@ dario proxy --preserve-tools
|
|
|
444
449
|
|
|
445
450
|
The cost: requests no longer look like CC on the wire, so the CC subscription fingerprint is gone. On a Max/Pro plan, that means the request may be counted against your API usage rather than your subscription quota. If you're on API-key billing already, `--preserve-tools` is free; if you're using dario specifically to route against a subscription, [hybrid tool mode](#hybrid-tool-mode) below is the compromise that keeps both.
|
|
446
451
|
|
|
447
|
-
The
|
|
452
|
+
The OpenAI-compat backend is unaffected — it forwards tool definitions byte-for-byte and doesn't need this flag.
|
|
448
453
|
|
|
449
454
|
### Hybrid tool mode
|
|
450
455
|
|
|
@@ -454,7 +459,7 @@ For the very common case where the "missing" fields on your client's tool are **
|
|
|
454
459
|
dario proxy --hybrid-tools
|
|
455
460
|
```
|
|
456
461
|
|
|
457
|
-
**How it works.** On each request, dario builds a `RequestContext` from headers (`x-session-id`, `x-request-id`, `x-channel-id`, `x-user-id`) plus its own generated ids and the current timestamp. After `translateBack` produces the client-shaped tool call on the response path, any field declared on the client's tool schema whose name matches a known context field (`sessionId`/`session_id`, `requestId`/`request_id`, `channelId`/`channel_id`, `userId`/`user_id`, `timestamp`/`created_at`/`createdAt`) and isn't already populated gets filled from the context. Fields the model genuinely populated
|
|
462
|
+
**How it works.** On each request, dario builds a `RequestContext` from headers (`x-session-id`, `x-request-id`, `x-channel-id`, `x-user-id`) plus its own generated ids and the current timestamp. After `translateBack` produces the client-shaped tool call on the response path, any field declared on the client's tool schema whose name matches a known context field (`sessionId`/`session_id`, `requestId`/`request_id`, `channelId`/`channel_id`, `userId`/`user_id`, `timestamp`/`created_at`/`createdAt`) and isn't already populated gets filled from the context. Fields the model genuinely populated are never overwritten.
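The fill rule above can be sketched as follows. The `RequestContext` field names and the alias list come from the paragraph; the function name `injectContext`, the flat `declaredFields` list, and treating `undefined`, `null`, and the empty string as "not populated" are assumptions made for this example.

```typescript
interface RequestContext {
  sessionId: string;
  requestId: string;
  channelId: string;
  userId: string;
  timestamp: string;
}

// Field-name aliases listed above, mapped to the context key that fills them.
const CONTEXT_ALIASES: Record<string, keyof RequestContext> = {
  sessionId: "sessionId", session_id: "sessionId",
  requestId: "requestId", request_id: "requestId",
  channelId: "channelId", channel_id: "channelId",
  userId: "userId", user_id: "userId",
  timestamp: "timestamp", created_at: "timestamp", createdAt: "timestamp",
};

// Fill declared context fields that the model left empty. Values the
// model populated are kept as-is. "Empty" here is an assumption:
// undefined, null, or "".
function injectContext(
  input: Record<string, unknown>,
  declaredFields: string[],
  ctx: RequestContext,
): Record<string, unknown> {
  const out: Record<string, unknown> = { ...input };
  for (const field of declaredFields) {
    if (!Object.prototype.hasOwnProperty.call(CONTEXT_ALIASES, field)) continue;
    const existing = out[field];
    if (existing !== undefined && existing !== null && existing !== "") continue;
    out[field] = ctx[CONTEXT_ALIASES[field]];
  }
  return out;
}
```

The function copies the input rather than mutating it, so the translated tool call and the context-filled tool call can be compared when debugging.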
|
|
458
463
|
|
|
459
464
|
**When to use which flag:**
|
|
460
465
|
|
|
@@ -465,8 +470,6 @@ dario proxy --hybrid-tools
|
|
|
465
470
|
| Your custom fields need the model's reasoning (e.g. `confidence`, `reasoning_trace`, `tool_selection_rationale`) | `--preserve-tools` | The model has to see the real schema to populate these. Accept the fingerprint loss. |
|
|
466
471
|
| Your client's tools are already a subset of CC's `Bash/Read/Write/Edit/Grep/Glob/WebSearch/WebFetch` | *(neither)* | Default mode works as-is. |
|
|
467
472
|
|
|
468
|
-
Hybrid mode was built to resolve [#29](https://github.com/askalf/dario/issues/29) cleanly for OpenClaw-style agents whose `process` tool declares `sessionId`, after the full provider-comparison diagnostic from [@boeingchoco](https://github.com/boeingchoco) made clear that the problem wasn't fixable in the translation layer alone.
|
|
469
|
-
|
|
470
473
|
### Library mode
|
|
471
474
|
|
|
472
475
|
```typescript
|
|
@@ -492,12 +495,12 @@ curl http://localhost:3456/health
|
|
|
492
495
|
|---|---|
|
|
493
496
|
| `POST /v1/messages` | Anthropic Messages API (Claude backend) |
|
|
494
497
|
| `POST /v1/chat/completions` | OpenAI-compatible Chat API (routes by model name) |
|
|
495
|
-
| `POST /v1/pool/borrow` | Sealed-sender borrow endpoint
|
|
498
|
+
| `POST /v1/pool/borrow` | Sealed-sender borrow endpoint. Accepts group-signed tokens and forwards the request through the lender's pool. |
|
|
496
499
|
| `GET /v1/models` | Model list (Claude models — OpenAI models come from the OpenAI backend directly) |
|
|
497
500
|
| `GET /health` | Proxy health + OAuth status + request count |
|
|
498
501
|
| `GET /status` | Detailed Claude OAuth token status |
|
|
499
502
|
| `GET /accounts` | Pool snapshot including sticky binding count (pool mode only) |
|
|
500
|
-
| `GET /analytics` | Per-account / per-model stats, burn rate, exhaustion predictions
|
|
503
|
+
| `GET /analytics` | Per-account / per-model stats, burn rate, exhaustion predictions, `billingBucket` + `subscriptionPercent` per request |
|
|
501
504
|
|
|
502
505
|
---
|
|
503
506
|
|
|
@@ -507,16 +510,17 @@ Dario handles your OAuth tokens and API keys locally. Here's why you can trust i
|
|
|
507
510
|
|
|
508
511
|
| Signal | Status |
|
|
509
512
|
|---|---|
|
|
510
|
-
| **Source code** | ~
|
|
513
|
+
| **Source code** | ~7,600 lines of TypeScript across ~15 files — small enough to audit in a weekend |
|
|
511
514
|
| **Dependencies** | 0 runtime dependencies. Verify: `npm ls --production` |
|
|
512
515
|
| **npm provenance** | Every release is [SLSA-attested](https://www.npmjs.com/package/@askalf/dario) via GitHub Actions with sigstore provenance attached to the transparency log |
|
|
513
516
|
| **Security scanning** | [CodeQL](https://github.com/askalf/dario/actions/workflows/codeql.yml) runs on every push and weekly |
|
|
514
|
-
| **Test footprint** |
|
|
517
|
+
| **Test footprint** | ~640 assertions across 20 files. Full `npm test` green on every release |
|
|
515
518
|
| **Credential handling** | Tokens and API keys never logged, redacted from errors, stored with `0600` permissions |
|
|
516
519
|
| **OAuth flow** | PKCE (Proof Key for Code Exchange), no client secret |
|
|
517
520
|
| **Network scope** | Binds to `127.0.0.1` by default. `--host` allows LAN/mesh with `DARIO_API_KEY` gating. Upstream traffic goes only to the configured backend target URLs over HTTPS |
|
|
518
521
|
| **SSRF protection** | `/v1/messages` hits `api.anthropic.com` only; `/v1/chat/completions` hits the configured backend `baseUrl` only — hardcoded allowlist |
|
|
519
522
|
| **Telemetry** | None. Zero analytics, tracking, or data collection |
|
|
523
|
+
| **Atomic cache writes + corruption recovery** | v3.17 — template cache writes are pid-qualified `.tmp` + `rename`, corrupt cache files are quarantined and regenerated instead of crashing startup |
|
|
520
524
|
| **Audit trail** | [CHANGELOG.md](CHANGELOG.md) documents every release with file-level rationale |
|
|
521
525
|
|
|
522
526
|
Verify the npm tarball matches this repo:
|
|
@@ -541,11 +545,26 @@ Claude Max and Claude Pro. Any plan that lets you use Claude Code.
|
|
|
541
545
|
Should work if your plan includes Claude Code access. Not widely tested yet — open an issue with results.
|
|
542
546
|
|
|
543
547
|
**Do I need Claude Code installed?**
|
|
544
|
-
Recommended for the Claude backend, not strictly required. With CC installed, `dario login` picks up your credentials automatically, and the live fingerprint extractor reads your CC binary on every startup so the template stays current. Without CC, dario runs its own OAuth flow and falls back to the bundled template snapshot.
|
|
548
|
+
Recommended for the Claude backend, not strictly required. With CC installed, `dario login` picks up your credentials automatically, and the live fingerprint extractor reads your CC binary on every startup so the template stays current. Without CC, dario runs its own OAuth flow and falls back to the bundled template snapshot. Drift detection (v3.17) warns you if your installed CC doesn't match the captured template, so upgrade windows don't silently ship stale templates.
|
|
545
549
|
|
|
546
550
|
**Do I need Bun?**
|
|
547
551
|
Optional, recommended for Claude-backend requests. Dario auto-relaunches under Bun when available so the TLS fingerprint matches CC's runtime. Without Bun, dario runs on Node.js and works fine; the TLS fingerprint is the only difference.
|
|
548
552
|
|
|
553
|
+
**Can I use dario without a Claude subscription?**
|
|
554
|
+
Yes. Skip `dario login` and run `dario backend add openai --key=...` (or add any OpenAI-compat URL), then `dario proxy`. Claude-backend requests will return an authentication error; OpenAI-compat requests will work normally. Dario becomes a local OpenAI-compat router with no Claude involvement.
|
|
555
|
+
|
|
556
|
+
**Can I route non-OpenAI providers through dario?**
|
|
557
|
+
Yes: anything that speaks the OpenAI Chat Completions API. Groq, OpenRouter, LiteLLM, Ollama's OpenAI-compat mode, a hosted or self-run vLLM server, or any other inference endpoint that exposes `/v1/chat/completions`. Just run `dario backend add <name> --key=... --base-url=...`.
|
|
558
|
+
|
|
559
|
+
**Something's wrong. Where do I start?**
|
|
560
|
+
`dario doctor`. One command, one aggregated report — dario version, Node, platform, CC binary compat, template source + age + drift, OAuth status, pool state, backends, home dir. Exit code 1 if any check fails. Paste the output when you file an issue.
|
|
561
|
+
|
|
562
|
+
**What happens when Anthropic rotates the OAuth config?**
|
|
563
|
+
Dario auto-detects OAuth config from the installed Claude Code binary. When CC ships a new version with rotated values, dario picks them up on the next run. Cache at `~/.dario/cc-oauth-cache-v3.json`, keyed by the CC binary fingerprint.
|
|
564
|
+
|
|
565
|
+
**What happens when Anthropic changes the CC request template?**
|
|
566
|
+
Dario extracts the live request template from your installed Claude Code binary on startup — the system prompt, tool schemas, user-agent, beta flags, and header insertion order — and uses those to replay requests instead of a version pinned into dario itself. When CC ships a new version with a tweaked template, the next `dario proxy` run picks it up automatically. Drift detection (v3.17) forces a refresh when the installed CC version changes under dario.
|
|
567
|
+
|
|
549
568
|
**First time setup on a fresh Claude account.**
|
|
550
569
|
If dario is the first thing you run against a brand-new Claude account, prime the account with a few real Claude Code commands first:
|
|
551
570
|
```bash
|
|
@@ -554,46 +573,25 @@ claude --print "hello"
|
|
|
554
573
|
```
|
|
555
574
|
This establishes a session baseline. Without priming, brand-new accounts occasionally see billing classification issues on first use.
|
|
556
575
|
|
|
557
|
-
**What happens when Anthropic rotates the OAuth config?**
|
|
558
|
-
Dario auto-detects OAuth config from the installed Claude Code binary. When CC ships a new version with rotated values, dario picks them up on the next run. Cache at `~/.dario/cc-oauth-cache-v3.json`, keyed by the CC binary fingerprint.
|
|
559
|
-
|
|
560
|
-
**What happens when Anthropic changes the CC request template?**
|
|
561
|
-
Dario extracts the live request template from your installed Claude Code binary on startup — the system prompt, tool schemas, user-agent, beta flags, and as of v3.13.0 the exact header insertion order — and uses those to replay requests instead of a version pinned into dario itself. When CC ships a new version with a tweaked template, the next `dario proxy` run picks it up automatically. Fallback: the hand-curated `src/cc-template-data.json` bundled with the release.
|
|
562
|
-
|
|
563
576
|
**I'm hitting rate limits on the Claude backend. What do I do?**
|
|
564
|
-
Claude subscriptions have rolling 5-hour and 7-day usage windows. Check utilization with Claude Code's `/usage` command or the [statusline](https://code.claude.com/docs/en/statusline). For multi-agent workloads, add more accounts and let pool mode distribute the load: `dario accounts add <alias>`.
|
|
577
|
+
Claude subscriptions have rolling 5-hour and 7-day usage windows. Check utilization with Claude Code's `/usage` command or the [statusline](https://code.claude.com/docs/en/statusline). For multi-agent workloads, add more accounts and let pool mode distribute the load: `dario accounts add <alias>`. Session stickiness keeps long conversations pinned to one account so the prompt cache isn't destroyed by rotation.
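One way to picture how load distribution and session stickiness could interact is the sketch below. It is an illustration only: the `Account` shape, the headroom formula, and the sticky-binding policy are assumptions for this example and are not taken from dario's pool implementation.

```typescript
// Hypothetical account view: utilization of each window as a 0..1 fraction.
interface Account {
  alias: string;
  util5h: number;
  util7d: number;
}

// Assumption: headroom is limited by whichever window is fuller.
function headroom(a: Account): number {
  return 1 - Math.max(a.util5h, a.util7d);
}

// Prefer the account already bound to this session (keeps the prompt
// cache warm); otherwise pick the account with the most headroom and
// bind the session to it. Returns null when every account is exhausted.
function pickAccount(
  accounts: Account[],
  sticky: Map<string, string>,
  sessionId?: string,
): Account | null {
  if (sessionId !== undefined) {
    const boundAlias = sticky.get(sessionId);
    const bound = accounts.find((a) => a.alias === boundAlias);
    if (bound && headroom(bound) > 0) return bound;
  }
  let best: Account | null = null;
  for (const a of accounts) {
    if (headroom(a) <= 0) continue;
    if (best === null || headroom(a) > headroom(best)) best = a;
  }
  if (best !== null && sessionId !== undefined) sticky.set(sessionId, best.alias);
  return best;
}
```

In this sketch a sticky session only moves to another account once its bound account has no headroom left, which is the trade-off between cache reuse and even load.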
|
|
565
578
|
|
|
566
579
|
**I'm seeing `representative-claim: seven_day` in my rate-limit headers instead of `five_hour`. Am I being downgraded to API billing?**
|
|
567
580
|
|
|
568
|
-
**No.** You're still on subscription billing. Both `five_hour` and `seven_day` are the same subscription billing mode —
|
|
569
|
-
|
|
570
|
-
Here's the full picture. Every Claude Max and Pro subscription has **two rolling usage windows**:
|
|
571
|
-
|
|
572
|
-
- **5-hour window** — your short-term usage bucket. Refreshes on a rolling 5-hour schedule.
|
|
573
|
-
- **7-day window** — your longer-term usage bucket. Refreshes on a rolling 7-day schedule. Intentionally larger than the 5-hour one so you can keep working past brief bursts of heavy usage.
|
|
574
|
-
|
|
575
|
-
When Anthropic bills a request, it decides which bucket to charge it against based on your current utilization. That decision comes back in the `anthropic-ratelimit-unified-representative-claim` response header:
|
|
581
|
+
**No.** You're still on subscription billing. Both `five_hour` and `seven_day` are the same subscription billing mode — two different accounting buckets inside it.
|
|
576
582
|
|
|
577
583
|
| Claim | What it means |
|
|
578
584
|
|---|---|
|
|
579
585
|
| `five_hour` | You're well inside your 5-hour window; billing against the short-term bucket. |
|
|
580
|
-
| `seven_day` | You've exhausted (or come close to exhausting) the 5-hour window for this rolling cycle, so Anthropic is
|
|
581
|
-
| `overage` | Both subscription windows are effectively exhausted. *This* is where per-token Extra Usage charges kick in — if you've enabled Extra Usage on the account. If
|
|
582
|
-
|
|
583
|
-
**Seeing `seven_day` is a healthy state.** Your Max/Pro plan is doing exactly what it's supposed to do: letting you keep working past short bursts of heavy use by absorbing them into the larger 7-day bucket. Your subscription is not being "downgraded." When your 5-hour window rolls forward enough, the claim on new requests will go back to `five_hour` on its own.
|
|
586
|
+
| `seven_day` | You've exhausted (or come close to exhausting) the 5-hour window for this rolling cycle, so Anthropic is charging this request against the 7-day bucket. **Still subscription billing. Still your plan.** Not API pricing, not overage. |
|
|
587
|
+
| `overage` | Both subscription windows are effectively exhausted. *This* is where per-token Extra Usage charges kick in — if you've enabled Extra Usage on the account. If not, you get 429'd instead. |
|
|
584
588
|
|
|
585
|
-
|
|
589
|
+
Seeing `seven_day` is a healthy state. Your Max/Pro plan is doing exactly what it's supposed to do: letting you keep working past short bursts of heavy use by absorbing them into the larger 7-day bucket. When your 5-hour window rolls forward enough, the claim on new requests will go back to `five_hour` on its own. If you keep running into the 7-day limit, add more Claude subscriptions to the pool: each account has its own independent 5-hour and 7-day windows, and pool mode routes each request to the account with the most headroom.
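If you want to check the claim programmatically, a small helper along these lines may be enough. The header name `anthropic-ratelimit-unified-representative-claim` and the three claim values come from this README; the function name, the `Billing` labels, and the plain header map are assumptions for the sketch.

```typescript
// Labels are illustrative; "extra_usage" stands for the overage case,
// which is billed per token only if Extra Usage is enabled on the account.
type Billing = "subscription" | "extra_usage" | "unknown";

const CLAIM_HEADER = "anthropic-ratelimit-unified-representative-claim";

// Read the representative-claim header (case-insensitive) and map it to
// a coarse billing label.
function classifyClaim(headers: Record<string, string>): { claim: string | null; billing: Billing } {
  const key = Object.keys(headers).find((k) => k.toLowerCase() === CLAIM_HEADER);
  const claim = key === undefined ? null : headers[key];
  if (claim === "five_hour" || claim === "seven_day") return { claim, billing: "subscription" };
  if (claim === "overage") return { claim, billing: "extra_usage" };
  return { claim, billing: "unknown" };
}
```

Both `five_hour` and `seven_day` map to the same label here, which mirrors the point of this answer: they are two buckets inside one billing mode.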
|
|
586
590
|
|
|
587
591
|
Standalone writeup: [Discussion #32 — why you see `representative-claim: seven_day` and why it's not a downgrade](https://github.com/askalf/dario/discussions/32).
|
|
588
592
|
|
|
589
593
|
**My multi-agent workload is getting reclassified to overage even though dario template-replays per request. Why?**
|
|
590
|
-
Reclassification at high agent volume is not a per-request problem. Anthropic's classifier operates on cumulative per-OAuth-session aggregates — token throughput, conversation depth, streaming duration, inter-arrival timing, thinking-block volume. Dario's Claude backend can make each individual request indistinguishable from Claude Code and still hit this wall on a long-running agent session
|
|
591
|
-
|
|
592
|
-
**Can I route non-OpenAI providers through dario?**
|
|
593
|
-
Yes — anything that speaks the OpenAI Chat Completions API. Groq, OpenRouter, LiteLLM, vLLM, Ollama's openai-compat mode. Just `dario backend add <name> --key=... --base-url=...`.
|
|
594
|
-
|
|
595
|
-
**Does dario work with only the OpenAI backend, no Claude subscription?**
|
|
596
|
-
Yes. Skip `dario login`, just run `dario backend add openai --key=...` and `dario proxy`. Claude-backend requests will return an authentication error; OpenAI-compat requests will work normally. Dario becomes a local OpenAI-compat shim with no Claude involvement.
|
|
594
|
+
Reclassification at high agent volume is not a per-request problem. Anthropic's classifier operates on cumulative per-OAuth-session aggregates: token throughput, conversation depth, streaming duration, inter-arrival timing, and thinking-block volume. Dario's Claude backend can make each individual request indistinguishable from Claude Code and still hit this wall on a long-running agent session. [@belangertrading](https://github.com/belangertrading) contributed thorough diagnostic work in [#23](https://github.com/askalf/dario/issues/23). The practical answer at the dario layer is **pool mode**: distribute load across multiple subscriptions so that no single account accumulates enough signal to trip the classifier. See [Multi-account pool mode](#multi-account-pool-mode).
|
|
597
595
|
|
|
598
596
|
**Why "dario"?**
|
|
599
597
|
It's a name, not an acronym. Don't overthink it.
|
|
@@ -614,19 +612,20 @@ Longer-form writing on how dario works and why it works that way:
|
|
|
614
612
|
|
|
615
613
|
## Contributing
|
|
616
614
|
|
|
617
|
-
PRs welcome. The codebase is small TypeScript —
|
|
615
|
+
PRs welcome. The codebase is small TypeScript — ~7,600 lines across ~15 files:
|
|
618
616
|
|
|
619
617
|
| File | Purpose |
|
|
620
618
|
|---|---|
|
|
621
|
-
| `src/proxy.ts` | HTTP proxy server, request handler, rate governor, Claude backend dispatch |
|
|
622
|
-
| `src/cc-template.ts` | CC request template engine,
|
|
619
|
+
| `src/proxy.ts` | HTTP proxy server, request handler, rate governor, Claude backend dispatch, OpenAI-compat routing, pool failover |
|
|
620
|
+
| `src/cc-template.ts` | CC request template engine, universal `TOOL_MAP` (~65 schema-verified entries), orchestration and framework scrubbing, header-order replay |
|
|
623
621
|
| `src/cc-template-data.json` | Bundled fallback CC request template (used when live-fingerprint extraction isn't possible) |
|
|
624
622
|
| `src/cc-oauth-detect.ts` | OAuth config auto-detection from the installed CC binary |
|
|
625
|
-
| `src/live-fingerprint.ts` | Live extraction of the CC request template (system prompt, tools, user-agent, beta flags, header order) from the installed Claude Code binary |
|
|
623
|
+
| `src/live-fingerprint.ts` | Live extraction of the CC request template (system prompt, tools, user-agent, beta flags, header order) from the installed Claude Code binary, drift detection, compat matrix, atomic cache writes, corruption recovery |
|
|
624
|
+
| `src/doctor.ts` | `dario doctor` health report aggregator — dario/Node/CC/template/drift/OAuth/pool/backends |
|
|
626
625
|
| `src/oauth.ts` | Single-account token storage, PKCE flow, auto-refresh |
|
|
627
|
-
| `src/accounts.ts` | Multi-account credential storage
|
|
626
|
+
| `src/accounts.ts` | Multi-account credential storage, independent OAuth lifecycle, refresh single-flight |
|
|
628
627
|
| `src/pool.ts` | Account pool, headroom-aware routing, session stickiness, failover target selection |
|
|
629
|
-
| `src/sealed-pool.ts` |
|
|
628
|
+
| `src/sealed-pool.ts` | Sealed-sender overflow protocol — RSA blind signatures for unlinkable group pooling |
|
|
630
629
|
| `src/analytics.ts` | Rolling request history, per-account / per-model stats, burn-rate, billing bucket classification |
|
|
631
630
|
| `src/openai-backend.ts` | OpenAI-compat backend credential storage and request forwarder |
|
|
632
631
|
| `src/shim/runtime.cjs` | Hand-written CJS payload loaded into child processes via `NODE_OPTIONS=--require`; patches `globalThis.fetch` for Anthropic messages requests only |
|
|
@@ -639,7 +638,8 @@ git clone https://github.com/askalf/dario
|
|
|
639
638
|
cd dario
|
|
640
639
|
npm install
|
|
641
640
|
npm run dev # runs with tsx, no build step
|
|
642
|
-
npm test #
|
|
641
|
+
npm test # ~640 assertions across 20 suites
|
|
642
|
+
npm run e2e # live proxy + OAuth (requires a working Claude backend)
|
|
643
643
|
```
|
|
644
644
|
|
|
645
645
|
---
|
|
@@ -652,7 +652,9 @@ npm test # 376 assertions across 12 suites
|
|
|
652
652
|
| [@belangertrading](https://github.com/belangertrading) | Billing classification investigation ([#4](https://github.com/askalf/dario/issues/4)), cache_control fingerprinting ([#6](https://github.com/askalf/dario/issues/6)), billing reclassification root cause ([#7](https://github.com/askalf/dario/issues/7)), OAuth client_id discovery ([#12](https://github.com/askalf/dario/issues/12)), multi-agent session-level billing analysis ([#23](https://github.com/askalf/dario/issues/23)) |
|
|
653
653
|
| [@nathan-widjaja](https://github.com/nathan-widjaja) | README positioning rewrite structure ([#21](https://github.com/askalf/dario/issues/21)) |
|
|
654
654
|
| [@iNicholasBE](https://github.com/iNicholasBE) | macOS keychain credential detection ([#30](https://github.com/askalf/dario/pull/30)) |
|
|
655
|
-
| [@boeingchoco](https://github.com/boeingchoco) | Reverse-direction tool parameter translation ([#29](https://github.com/askalf/dario/issues/29)), SSE event-group framing regression catch (v3.7.1), provider-comparison diagnostic that surfaced the `--preserve-tools` discoverability gap (v3.8.1), motivating case for hybrid tool mode ([#33](https://github.com/askalf/dario/issues/33), v3.9.0) |
|
|
655
|
+
| [@boeingchoco](https://github.com/boeingchoco) | Reverse-direction tool parameter translation ([#29](https://github.com/askalf/dario/issues/29)), SSE event-group framing regression catch (v3.7.1), provider-comparison diagnostic that surfaced the `--preserve-tools` discoverability gap (v3.8.1), motivating case for hybrid tool mode ([#33](https://github.com/askalf/dario/issues/33), v3.9.0), OpenClaw tool-mapping root cause that drove the universal `TOOL_MAP` work ([#36](https://github.com/askalf/dario/issues/36)) |
|
|
656
|
+
| [@tetsuco](https://github.com/tetsuco) | Framework-name path corruption in scrubber ([#35](https://github.com/askalf/dario/issues/35)), OpenClaw Bash/Glob reverse-mapping collisions ([#37](https://github.com/askalf/dario/issues/37)) |
|
|
657
|
+
| [@mikelovatt](https://github.com/mikelovatt) | Silent subscription-percent drain surfaced via friendly billing buckets ([#34](https://github.com/askalf/dario/issues/34)) |
|
|
656
658
|
|
|
657
659
|
---
|
|
658
660
|
|