@askalf/dario 3.38.3 → 3.38.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,7 +1,6 @@
1
1
  <p align="center">
2
2
  <h1 align="center">dario</h1>
3
- <p align="center"><strong>Use your Claude Pro/Max subscription with Cursor, Aider, Cline, Zed, Codex CLI, the Claude Agent SDKany tool that speaks Anthropic or OpenAI.</strong></p>
4
- <p align="center">A local LLM router. One endpoint, every provider. Your Claude subscription — Pro ($20), Max 5x ($100), or Max 20x ($200) — stops sitting idle in Claude Code while you pay per-token everywhere else. Speaks both the Anthropic Messages API and the OpenAI Chat Completions API at <code>http://localhost:3456</code>.</p>
3
+ <p align="center"><strong>Your Claude Pro/Max subscription works in exactly one place: Claude Code.<br>dario makes it work everywhereat subscription pricing, not per-token API bills.</strong></p>
5
4
  </p>
6
5
 
7
6
  <p align="center">
@@ -10,267 +9,160 @@
10
9
  <a href="https://github.com/askalf/dario/actions/workflows/codeql.yml"><img src="https://github.com/askalf/dario/actions/workflows/codeql.yml/badge.svg" alt="CodeQL"></a>
11
10
  <a href="https://github.com/askalf/dario/blob/master/LICENSE"><img src="https://img.shields.io/npm/l/@askalf/dario" alt="License"></a>
12
11
  <a href="https://www.npmjs.com/package/@askalf/dario"><img src="https://img.shields.io/npm/dm/@askalf/dario" alt="Downloads"></a>
13
- </p>
14
-
15
- <p align="center">
16
12
  <a href="https://x.com/ask_alf"><img src="https://img.shields.io/badge/follow-@ask_alf-1da1f2?style=flat-square" alt="Follow on X"></a>
17
- <a href="https://askalf.org"><img src="https://img.shields.io/badge/askalf.org-platform-00ff88?style=flat-square" alt="askalf"></a>
18
13
  </p>
19
14
 
20
- <p align="center"><em>Zero runtime dependencies. <a href="https://www.npmjs.com/package/@askalf/dario">SLSA-attested</a> on every release. Nothing phones home. Independent, unofficial, third-party — see <a href="DISCLAIMER.md">DISCLAIMER.md</a>.</em></p>
21
-
22
- > **dario is the open-source wedge of [askalf](https://askalf.org)** — the AI workforce platform we're building. Dario solves the Claude subscription problem so the rest of the workforce can run on flat-rate billing. Star this repo or follow [@ask_alf](https://x.com/ask_alf) for platform updates.
15
+ <p align="center"><em>Zero runtime dependencies · <a href="https://www.npmjs.com/package/@askalf/dario">SLSA-attested</a> every release · nothing phones home · ~13k lines you can read in a weekend · independent, unofficial, third-party (<a href="DISCLAIMER.md">DISCLAIMER.md</a>)</em></p>
23
16
 
24
17
  ---
25
18
 
26
- ## 30 seconds
19
+ You're already paying $20, $100, or $200 a month for Claude. Then Cursor wants an API key. Aider wants an API key. Cline, Continue, Zed, your scripts — every one of them bills you **again**, per token, while the subscription you already bought sits idle in Claude Code.
20
+
21
+ **dario is one local endpoint that routes all of them through the Claude subscription you already pay for.** Point any Anthropic- or OpenAI-compatible tool at `http://localhost:3456` and you're done. No per-tool config, no second bill.
27
22
 
28
23
  ```bash
29
- # 1. Install
30
24
  npm install -g @askalf/dario
31
-
32
- # 2. Log in to your Claude subscription (Pro, Max 5x, or Max 20x)
33
- dario login # or `dario login --manual` for SSH / headless setups
34
-
35
- # 3. Start the local Claude API proxy
25
+ dario login # uses your existing Claude Code credentials
36
26
  dario proxy
37
-
38
- # 4. Point any Anthropic-compat tool at it
39
27
  export ANTHROPIC_BASE_URL=http://localhost:3456
40
28
  export ANTHROPIC_API_KEY=dario
41
29
  ```
42
30
 
43
- Done. Every tool that honors those env vars — Claude Code, Cursor, Aider, Cline, Roo Code, Continue.dev, Zed, Windsurf, OpenHands, OpenClaw, Hermes, Codex CLI, the [Claude Agent SDK](https://www.npmjs.com/package/@anthropic-ai/claude-agent-sdk), your own scripts — now routes through your **Claude subscription** (Pro / Max 5x / Max 20x) instead of per-token API pricing. Dario sends the same request shape Claude Code itself sends, which is the shape the subscription-billing path recognizes.
44
-
45
- Prefer Docker? `ghcr.io/askalf/dario:latest` is a multi-arch (`linux/amd64` + `linux/arm64`) image published on every release — homelab, k8s, NAS. Full guide: [`docs/docker.md`](./docs/docker.md).
46
-
47
- For OpenAI / Groq / OpenRouter / Ollama / LiteLLM / vLLM, add one backend line and reuse the same proxy:
48
-
49
- ```bash
50
- dario backend add openai --key=sk-proj-...
51
- dario backend add groq --key=gsk_... --base-url=https://api.groq.com/openai/v1
52
- dario backend add openrouter --key=sk-or-... --base-url=https://openrouter.ai/api/v1
53
- dario backend add local --key=anything --base-url=http://127.0.0.1:11434/v1
54
-
55
- export OPENAI_BASE_URL=http://localhost:3456/v1
56
- export OPENAI_API_KEY=dario
57
- ```
58
-
59
- Switching providers is a **model-name change** in your tool — `claude-opus-4-7`, `gpt-4o`, `llama-3.3-70b`, any OpenRouter / Groq / local model — not a reconfigure. Force a specific backend with a prefix: `openai:gpt-4o`, `claude:opus`, `groq:llama-3.3-70b`, `local:qwen-coder`.
60
-
61
- Something not right? `dario doctor` prints a single paste-ready health report. Paste that when you file an issue.
31
+ That's the whole setup. Every tool that honors those env vars now runs on your subscription.
62
32
 
63
33
  ---
64
34
 
65
- ## The 2026-06-15 billing cliff
66
-
67
- Starting **2026-06-15**, Anthropic splits Claude plan usage into two separate pools. The one that matters for coding agents:
68
-
69
- | Plan | New Agent-SDK / `claude -p` credit | What happens when it runs out |
70
- |---|---|---|
71
- | Pro | **$20/mo** | Per-token API pricing |
72
- | Max 5x | **$100/mo** | Per-token API pricing |
73
- | Max 20x | **$200/mo** | Per-token API pricing |
74
-
75
- A sustained Cline or Aider session burns $100 of API-rate tokens in an evening. Agentic loops blow past $200 in days. Any proxy that forwards requests in their original Agent-SDK or `claude -p` wire shape — which is most of them — puts your agentic traffic into this new credit pool instead of your subscription pool. Once it's gone, you're on metered pricing.
35
+ ## The money
76
36
 
77
- **Dario doesn't.** Every outbound request is rebuilt as **interactive Claude Code wire-shape** before it leaves your machine: headers, body key order, TLS stack, session-id lifecycle the same six static axes the live template extractor has been closing since v3.22 — plus, as of v3.38, the **temporal axis**: post-response read time correlated with response length, and 1.2–4.2s session-start latency. Toggle the behavioral layer on with `--stealth`. Anthropic's billing classifier sees an interactive CC session — static shape *and* inter-arrival distribution. Your traffic stays in the subscription pool you already pay for.
78
-
79
- | Your setup | Post-2026-06-15 billing path |
37
+ | Setup | Monthly costheavy user |
80
38
  |---|---|
81
- | Any tool Anthropic API direct | Per-token API |
82
- | Any tool proxy forwarding requests as-is | **Agent-SDK credit ($20–200/mo cap), then per-token API** |
83
- | **Any tool dario** | **Subscription pool unchanged** |
84
- | Claude Code interactive | Subscription pool — unchanged |
85
-
86
- Same install. Same `localhost:3456`. No config change needed for the cliff.
39
+ | Cursor + Anthropic API direct | **$80–$300** |
40
+ | Multi-tool heavy use (Cursor + Aider + Cline + Continue), per-token | **$200–$600+** |
41
+ | **Any of the above + dario** | **$20–$200 flat** — your existing Pro/Max plan, nothing extra |
87
42
 
88
- **Verify it's working:** `dario doctor --usage` fires one Haiku request through your OAuth and surfaces the rate-limit headers Anthropic returned. The `representative-claim` field should read `five_hour` or `seven_day` — subscription billing buckets. If it reads anything else after 2026-06-15 lands, [file an issue](https://github.com/askalf/dario/issues/new) that's what the drift detector exists to catch. Full technical breakdown + post-cliff verification procedure: [`docs/why-now-2026-06.md`](./docs/why-now-2026-06.md).
43
+ Switching providers is a model-name change, not a reconfigure: `claude-opus-4-7`, `gpt-4o`, `llama-3.3-70b`, anything on OpenRouter/Groq/Ollama. Add a backend once (`dario backend add openai --key=…`) and the same `localhost:3456` speaks OpenAI too.
89
44
 
90
45
  ---
91
46
 
92
- ## What it actually does
93
-
94
- You point every tool at one URL. Dario reads each request, decides which backend owns it, and forwards in that backend's native protocol.
95
-
96
- | Client speaks | Model in request | dario routes to | What happens |
97
- |---|---|---|---|
98
- | Anthropic Messages API | `claude-*` / `opus` / `sonnet` / `haiku` | Claude backend | OAuth swap + (optional) CC template replay → `api.anthropic.com` |
99
- | Anthropic Messages API | `gpt-*`, `llama-*`, etc. | OpenAI-compat backend | Anthropic → OpenAI translation, forwarded to configured backend |
100
- | OpenAI Chat Completions | `gpt-*` / `o1-*` / `o3-*` | OpenAI-compat backend | Passthrough: auth swap, body forwarded byte-for-byte |
101
- | OpenAI Chat Completions | `claude-*` | Claude backend | OpenAI → Anthropic translation, then the Claude backend path |
102
- | Either protocol | `<provider>:<model>` | Forced by prefix | Explicit override for ambiguous names |
103
-
104
- The tool doesn't know. The backend doesn't know. Dario is the seam.
105
-
106
- Beyond routing, the Claude backend is a **full Claude Code wire-level template** — every observable axis (bytes, headers, body key order, TLS stack, inter-request timing, session-id lifecycle, stream-consumption shape) is captured from your installed CC binary and mirrored on outbound requests so the upstream subscription-billing path is the one the request follows. See [`docs/wire-fidelity.md`](./docs/wire-fidelity.md).
47
+ ## The deadline: 2026-06-15
107
48
 
108
- **Static shape, plus behavioral shape.** v3.38 closes the temporal axis on top: post-response *think time* (delay before the next request, correlated with the previous response's output-token countreal users read before typing the next message; agent loops don't), and *session-start latency* (the first request of a new session fires after a sampled 1.2–4.2s delay, matching observed real-CC session opens, instead of at machine speed). One flag turns it on: `dario proxy --stealth`. Per-knob tuning if you need it: `--think-time-base`, `--think-time-per-token`, `--think-time-jitter`, `--session-start-min`, `--session-start-jitter`.
49
+ On **2026-06-15**, Anthropic splits Claude billing in two. Agentic traffic Agent SDK, `claude -p` headlessstops counting against your subscription and gets a small fixed monthly credit instead:
109
50
 
110
- ---
51
+ | Plan | New Agent-SDK / `claude -p` credit | After it's gone |
52
+ |---|---|---|
53
+ | Pro | $20/mo | per-token API pricing |
54
+ | Max 5x | $100/mo | per-token API pricing |
55
+ | Max 20x | $200/mo | per-token API pricing |
111
56
 
112
- ## Cost comparison
57
+ A sustained Cline or Aider session burns $100 of API-rate tokens in an evening. **Any proxy that forwards requests in their original `claude -p` / Agent-SDK shape — which is most of them — dumps your agentic traffic into that small credit bucket, then onto metered pricing.**
113
58
 
114
- Claude subscription tiers: **Pro** ($20/mo) · **Max 5x** ($100/mo) · **Max 20x** ($200/mo). Dario routes through whichever you have pick by your usage volume, not by what dario needs.
59
+ dario doesn't. Every outbound request is rebuilt into **interactive Claude Code wire-shape** before it leaves your machine — headers, body key order, TLS stack, session-id lifecycle, and (v3.38, `--stealth`) the temporal axis: response-correlated think-time and session-start latency. Anthropic's billing classifier sees an interactive Claude Code session. Your traffic stays in the subscription pool you already pay for.
115
60
 
116
- | Setup | Monthly cost (heavy single-tool user) |
61
+ | Your setup | After 2026-06-15 |
117
62
  |---|---|
118
- | Cursor + Anthropic API direct | $80–$300 |
119
- | Cursor + ChatGPT Plus | $20 + per-token overage |
120
- | **Cursor + Claude Pro/Max + dario** | **$20 (Cursor) + $20–200 (your Claude tier) flat every Claude call routes through your subscription** |
121
- | Multi-tool heavy use (Cursor + Aider + Cline + Continue) without dario | $200–$600+ |
122
- | **Same multi-tool use with dario** | **$20–200 flat — one Pro/Max subscription routes all of them** |
63
+ | Any tool Anthropic API direct | per-token API |
64
+ | Any tool proxy that forwards requests as-is | **$20–200/mo credit, then per-token** |
65
+ | **Any tool dario** | **subscription poolunchanged** |
66
+ | Claude Code, interactive | subscription pool unchanged |
123
67
 
124
- Already have **Pro + Max** stacked? Pool mode (`dario accounts add work` / `dario accounts add personal`) routes across both, with session stickiness keeping multi-turn agents pinned to one account so the prompt cache survives. Tiers mix freely dario only cares about headroom, not which plan an account is on.
68
+ Same install, same `localhost:3456`, no config change for the cliff. Verify on your own machine: `dario doctor --usage` fires one request and surfaces the rate-limit headers `representative-claim` should read `five_hour` or `seven_day` (subscription buckets). Full breakdown + post-cliff verification: [`docs/why-now-2026-06.md`](./docs/why-now-2026-06.md).
125
69
 
126
70
  ---
127
71
 
128
- ## Why you'll install this
72
+ ## Does it actually work?
129
73
 
130
- - **One URL for every provider.** Cursor, Aider, Continue, Zed, OpenHands, Claude Code, your own scripts — every tool you own has its own per-provider config. Dario collapses that into a single `localhost:3456` that speaks both Anthropic and OpenAI protocols and routes by model name.
131
- - **Your Claude subscription stops sitting idle.** Cursor, Aider, Zed, Continue all want API keys and bill per-token while your Pro / Max 5x / Max 20x plan only gets used in Claude Code. Dario routes them through your plan via Claude Code's exact wire shape.
132
- - **You hit rate limits on long agent runs.** Add a second / third Claude subscription with `dario accounts add work` and pool mode routes each request to whichever account has the most headroom. Session stickiness pins multi-turn conversations; in-flight 429 failover retries on a different account before your client sees the error. See [`docs/multi-account-pool.md`](./docs/multi-account-pool.md).
133
- - **You run a coding agent that isn't Claude Code.** Cline, Roo Code, Cursor, Windsurf, Continue.dev, GitHub Copilot, OpenHands, OpenClaw, Hermes, hands — dario's universal `TOOL_MAP` (66 schema-verified entries) pre-maps their tool names to Claude Code's native set. No flag, no validator errors. See [`docs/agent-compat.md`](./docs/agent-compat.md).
134
- - **You want the proxy off the wire entirely.** Shim mode is an in-process `globalThis.fetch` patch — no HTTP hop, no port to bind, no `BASE_URL`. `dario shim -- claude --print "hi"` and CC thinks it's talking directly to `api.anthropic.com`. See [`docs/shim.md`](./docs/shim.md).
135
- - **You want CC's behavioral constraints out of your prompt.** `dario proxy --system-prompt=partial` strips CC's Tone-and-style / Text-output / verbosity / no-comments-by-default bullets and recovers ~1.2–2.8× output capability on open-ended work — empirically without flipping subscription billing (the classifier doesn't read this slot). RLHF refusals on harmful content are unaffected (alignment is in the weights, not the prompt). See [`docs/system-prompt.md`](./docs/system-prompt.md) and the empirical writeup in [`docs/research/system-prompt.md`](./docs/research/system-prompt.md).
136
- - **You want dario reachable from inside Claude Code or any MCP client.** `dario subagent install` registers a CC sub-agent for in-session diagnostics ([`docs/sub-agent.md`](./docs/sub-agent.md)). `dario mcp` turns dario into a read-only MCP server ([`docs/mcp-server.md`](./docs/mcp-server.md)).
137
- - **You want to actually audit it.** ~13,170 lines of TypeScript across 28 files. Zero runtime dependencies. Credentials at `~/.dario/` with `0600` permissions. `127.0.0.1`-only by default. Every release [SLSA-attested](https://www.npmjs.com/package/@askalf/dario). Nothing phones home. Small enough to read in a weekend.
138
- - **You want a deep-research tool that runs at $0/mo.** [deepdive](https://github.com/askalf/deepdive) is dario's companion CLI — `npx @askalf/deepdive "your question"`, get a cited Markdown report. Replaces Perplexity Pro ($20/mo), OpenAI Deep Research ($20/mo), Gemini Deep Research ($20/mo) — all of which mark up LLM calls on top of LLM calls. The deep-research workload (50k–200k tokens per question, sustained) is exactly what Max was priced for; deepdive is what uses it for that.
74
+ Four LLMs reviewed the codebase cold, same prompt ([`reviews/PROMPT.md`](./reviews/PROMPT.md)), each signed a verdict:
139
75
 
140
- ---
141
-
142
- ## Independently reviewed (4 LLMs)
143
-
144
- Same prompt to all four ([`reviews/PROMPT.md`](./reviews/PROMPT.md)). Each reviewer signed a verdict line. Push-back triaged in [`review-feedback`](https://github.com/askalf/dario/issues?q=label%3Areview-feedback).
145
-
146
- | Reviewer | Verdict | Full review |
147
- |---|---|---|
148
- | **Grok 4** | "Adopt if the use-case fits." | [→](./reviews/grok-4-2026-04-21.md) |
149
- | **Claude Opus 4.7** | "The fingerprint-replay claim is backed by the code." | [→](./reviews/claude-opus-4-7-2026-04-21.md) |
150
- | **Gemini 2.0 Pro** | "Technically elite, zero-dependency proxy." | [→](./reviews/gemini-2-pro-2026-04-21.md) |
151
- | **GPT-5.3** | "Disciplined, intentional engineering. Not vibe-coded." | [→](./reviews/gpt-5.3-2026-04-21.md) |
152
-
153
- Highlights:
154
-
155
- > "This is not vibe-coded; it reads like production-grade infrastructure that happens to be open-source." — Grok 4
76
+ > "Not vibe-coded; it reads like production-grade infrastructure that happens to be open-source." — **Grok 4** ([full](./reviews/grok-4-2026-04-21.md))
156
77
  >
157
- > "Comments consistently cite the issue number that motivated the code — which is what scar-tissue code looks like in a project that has actual users." — Claude Opus 4.7
78
+ > "The implementation isn't just a simple header swap; it is a sophisticated request-level deepfake." — **Gemini 2.0 Pro** ([full](./reviews/gemini-2-pro-2026-04-21.md))
158
79
  >
159
- > "The implementation isn't just a simple header swap; it is a sophisticated request-level deepfake." — Gemini 2.0 Pro
80
+ > "Not 'best-effort mimicry'; it's capture-and-replay of a real client." — **GPT-5.3** ([full](./reviews/gpt-5.3-2026-04-21.md))
160
81
  >
161
- > "Not 'best-effort mimicry'; it's capture-and-replay of a real client." — GPT-5.3
82
+ > "The fingerprint-replay claim is backed by the code." — **Claude Opus 4.7** ([full](./reviews/claude-opus-4-7-2026-04-21.md))
162
83
 
163
- ---
164
-
165
- ## Who this is for
166
-
167
- **Best fit:**
168
-
169
- - Developers using multiple LLMs across multiple tools tired of juggling base URLs, keys, and per-tool provider configs.
170
- - Claude Pro / Max subscribers who want their subscription usable from every tool on their machine, not just Claude Code.
171
- - Teams running local or hosted OpenAI-compat servers (LiteLLM, vLLM, Ollama, Groq, OpenRouter, self-hosted) who want one stable local endpoint every tool can reuse.
172
- - [Claude Agent SDK](https://www.npmjs.com/package/@anthropic-ai/claude-agent-sdk) users who want OAuth-subscription routing under the SDK. Point `baseURL: 'http://localhost:3456'` and dario translates API-key calls into your Claude subscription auth — agent code stays identical.
173
- - Power users on multi-agent workloads who want multi-account pooling, session stickiness, and in-flight 429 failover on their own machine, against their own subscriptions.
174
-
175
- **Not a fit:**
176
-
177
- - You need vendor-managed production SLAs on every request. Use the provider APIs directly.
178
- - You need a hosted, multi-tenant, managed routing platform with a dashboard, team auth, and support contracts. Dario is a local, single-user tool — the [askalf platform](https://askalf.org) is the right surface for the team / fleet case.
179
- - You want a chat UI. Use claude.ai or chatgpt.com.
84
+ The mechanism: dario doesn't *guess* Claude Code's request shape — it captures it live from your installed `claude` binary on every startup, drift-detects against each upstream CC release, and replays it byte-for-byte. That's why the billing classifier can't tell the difference. Deep dive: [`docs/wire-fidelity.md`](./docs/wire-fidelity.md).
180
85
 
181
86
  ---
182
87
 
183
- ## Backends
184
-
185
- Dario's routing is organized around **backends**. Each is a swappable adapter — add one, your tools reach it through `localhost:3456` in whichever API shape they already speak. Run zero, one, or all concurrently.
186
-
187
- ### 1. OpenAI-compat backend
188
-
189
- Any provider that speaks the OpenAI Chat Completions API.
88
+ ## 30 seconds, in full
190
89
 
191
90
  ```bash
192
- dario backend add openai --key=sk-proj-...
193
- dario backend add groq --key=gsk_... --base-url=https://api.groq.com/openai/v1
194
- dario backend add openrouter --key=sk-or-... --base-url=https://openrouter.ai/api/v1
195
- dario backend add local --key=anything --base-url=http://127.0.0.1:4000/v1
196
- ```
197
-
198
- Credentials live at `~/.dario/backends/<name>.json` with mode `0600`. Body forwarded as-is, only the `Authorization` header is swapped, streaming forwarded byte-for-byte. Force a specific backend with a [provider prefix](./docs/usage.md#provider-prefix) on the model field.
199
-
200
- ### 2. Claude subscription backend
201
-
202
- OAuth-backed Claude Pro / Max 5x / Max 20x, billed against your plan instead of the API. Activated by `dario login` (or `dario login --manual` for SSH / container setups, v3.20). Any tier with Claude Code access works — see [`docs/faq.md`](./docs/faq.md).
203
-
204
- Every outbound Claude request is rebuilt to match a request Claude Code itself would make — system prompt, tool definitions, identity headers, billing tag, beta flags, header insertion order, static header values, `anthropic-beta` flag set, top-level request-body key order — using a live-extracted template from your actually-installed CC binary that self-heals on every upstream CC release.
91
+ # 1. Install
92
+ npm install -g @askalf/dario
205
93
 
206
- Key mechanisms in brief: live template extraction from your installed `claude` binary, drift detection with forced refresh on mismatch, OAuth config auto-detection (so dario picks up Anthropic-side rotations on the next run), atomic cache writes, framework scrubbing (third-party identity markers stripped from system prompt), Bun auto-relaunch (so the TLS ClientHello matches CC's runtime). `dario proxy --passthrough` does an OAuth swap and nothing else — use it when the upstream tool already builds a Claude-Code-shaped request.
94
+ # 2. Log in to your Claude subscription (Pro, Max 5x, or Max 20x)
95
+ dario login # or `dario login --manual` for SSH / headless
207
96
 
208
- What this addresses: per-request fidelity. What it can't address alone: cumulative per-OAuth-session aggregates. The v3.22 – v3.28 wire-fidelity track closed six of those axes (body order, TLS, pacing, stream-drain, session-id lifecycle, MCP/sub-agent surface — see [`docs/wire-fidelity.md`](./docs/wire-fidelity.md)); for anything left, [pool mode](./docs/multi-account-pool.md) distributes load across multiple subscriptions.
97
+ # 3. Start the local proxy
98
+ dario proxy
209
99
 
210
- ---
100
+ # 4. Point any Anthropic-compat tool at it
101
+ export ANTHROPIC_BASE_URL=http://localhost:3456
102
+ export ANTHROPIC_API_KEY=dario
103
+ ```
211
104
 
212
- ## Multi-account pool mode
105
+ Works with: Claude Code, Cursor, Aider, Cline, Roo Code, Continue.dev, Zed, Windsurf, OpenHands, OpenClaw, Hermes, Codex CLI, the [Claude Agent SDK](https://www.npmjs.com/package/@anthropic-ai/claude-agent-sdk), your own scripts.
213
106
 
214
- Pool mode activates automatically when `~/.dario/accounts/` contains 2+ accounts. Each request picks the account with the highest headroom; multi-turn agent sessions pin to one account so the Anthropic prompt cache survives; in-flight 429s retry on a different account before the client sees an error.
107
+ Add other providers and reuse the same proxy:
215
108
 
216
109
  ```bash
217
- dario accounts add work
218
- dario accounts add personal
219
- dario proxy
220
- ```
221
-
222
- Full details, headroom math, sticky-key implementation, inspection endpoints: [`docs/multi-account-pool.md`](./docs/multi-account-pool.md).
223
-
224
- ---
110
+ dario backend add openai --key=sk-proj-...
111
+ dario backend add groq --key=gsk_... --base-url=https://api.groq.com/openai/v1
112
+ dario backend add openrouter --key=sk-or-... --base-url=https://openrouter.ai/api/v1
113
+ dario backend add local --key=anything --base-url=http://127.0.0.1:11434/v1
225
114
 
226
- ## Shim mode (experimental)
115
+ export OPENAI_BASE_URL=http://localhost:3456/v1
116
+ export OPENAI_API_KEY=dario
117
+ ```
227
118
 
228
- Take the proxy off the wire entirely. `dario shim -- <child cmd>` patches `globalThis.fetch` inside the child via `NODE_OPTIONS=--require`. No localhost HTTP hop. No port to bind. No `BASE_URL`.
119
+ Force a specific backend with a model prefix: `openai:gpt-4o`, `claude:opus`, `groq:llama-3.3-70b`, `local:qwen-coder`.
229
120
 
230
- ```bash
231
- dario shim -- claude --print "hello"
232
- ```
121
+ Prefer Docker? `ghcr.io/askalf/dario:latest` — multi-arch (`amd64`+`arm64`), published every release. Guide: [`docs/docker.md`](./docs/docker.md).
233
122
 
234
- When to use it, when to stay on the proxy, hardening detail: [`docs/shim.md`](./docs/shim.md).
123
+ Something off? `dario doctor` prints one paste-ready health report.
235
124
 
236
125
  ---
237
126
 
238
- ## Agent compatibility
239
-
240
- Dario's `TOOL_MAP` (66 schema-verified entries) covers every major coding agent — Cline, Roo Code, Kilo Code, Cursor, Windsurf, Continue.dev, GitHub Copilot, OpenHands, OpenClaw, Hermes, [hands](https://github.com/askalf/hands). Tool calls translate to CC's native set on the outbound path (subscription wire shape preserved) and rebuild to your agent's exact expected shape on the inbound path.
127
+ ## What it actually does
241
128
 
242
- Text-tool clients (Cline / Kilo / Roo) and identity-detected agents (`hands`, `arnie`, Hermes) auto-flip into preserve-tools mode via system-prompt identity markers. A structural fallback catches in-house non-CC agents (3+ tools, ≥80% unmapped) and flips them too.
129
+ You point every tool at one URL. Dario reads each request, decides which backend owns it, forwards in that backend's native protocol.
243
130
 
244
- Per-tool setup (Cursor, Continue, Aider, Cline, Zed, OpenHands), the `--preserve-tools` / `--hybrid-tools` decision matrix, and the full agent table all live in [`docs/agent-compat.md`](./docs/agent-compat.md).
131
+ | Client speaks | Model | Routes to | What happens |
132
+ |---|---|---|---|
133
+ | Anthropic Messages | `claude-*` / `opus` / `sonnet` / `haiku` | Claude backend | OAuth swap + CC template replay → `api.anthropic.com` |
134
+ | Anthropic Messages | `gpt-*`, `llama-*`, … | OpenAI-compat backend | Anthropic→OpenAI translation, forwarded |
135
+ | OpenAI Chat | `gpt-*` / `o1-*` / `o3-*` | OpenAI-compat backend | Auth swap, body forwarded byte-for-byte |
136
+ | OpenAI Chat | `claude-*` | Claude backend | OpenAI→Anthropic translation, then Claude path |
137
+ | Either | `<provider>:<model>` | Forced by prefix | Explicit override |
245
138
 
246
- ### Custom tool schemas
139
+ The tool doesn't know. The backend doesn't know. Dario is the seam.
247
140
 
248
- If your agent's tool names aren't pre-mapped and its tools carry fields CC's schema doesn't have, you have two escape hatches:
141
+ ---
249
142
 
250
- - **`--preserve-tools`** — forward your schema verbatim, lose the CC wire shape (and likely the subscription-billing wire shape with it).
251
- - **`--hybrid-tools`** — keep the CC wire shape, fill request-context fields (`sessionId`, `requestId`, `channelId`, `userId`, `timestamp`) from headers on the reverse path. The compromise that keeps both.
143
+ ## Capabilities
252
144
 
253
- Full when-to-use-which decision matrix and request-context field set: [`docs/agent-compat.md#custom-tool-schemas`](./docs/agent-compat.md#custom-tool-schemas).
145
+ - **Multi-account pool.** Drop 2+ Claude accounts in `~/.dario/accounts/` and pool mode auto-activates: every request routes to the account with the most headroom, multi-turn sessions pin to one account so the prompt cache survives, in-flight 429s fail over to a peer before your client sees an error. `dario accounts add work` / `dario accounts add personal`. → [`docs/multi-account-pool.md`](./docs/multi-account-pool.md)
146
+ - **Behavioral stealth (`--stealth`).** Static wire fidelity covers *what* the request looks like; `--stealth` adds *when* it arrives — response-length-correlated think time and 1.2–4.2s session-start latency, the inter-arrival pattern real interactive sessions have and agent loops don't. → [`docs/wire-fidelity.md`](./docs/wire-fidelity.md)
147
+ - **Runs any non-Claude-Code agent.** A 66-entry schema-verified `TOOL_MAP` pre-maps Cline, Roo, Kilo, Cursor, Windsurf, Continue, Copilot, OpenHands, OpenClaw, Hermes, [hands](https://github.com/askalf/hands) tool names to CC's native set. No flag, no validator errors. → [`docs/agent-compat.md`](./docs/agent-compat.md)
148
+ - **Shim mode.** Take the proxy off the wire entirely — `dario shim -- claude --print "hi"` patches `globalThis.fetch` in-process. No HTTP hop, no port, no `BASE_URL`. → [`docs/shim.md`](./docs/shim.md)
149
+ - **Recover output capability.** `dario proxy --system-prompt=partial` strips CC's tone/verbosity/no-comments constraints for ~1.2–2.8× output on open-ended work — empirically without flipping billing (the classifier doesn't read that slot; RLHF safety is in the weights, not the prompt). → [`docs/system-prompt.md`](./docs/system-prompt.md)
150
+ - **Reachable from inside CC / any MCP client.** `dario subagent install` registers a CC sub-agent for in-session diagnostics; `dario mcp` exposes dario as a read-only MCP server. → [`docs/sub-agent.md`](./docs/sub-agent.md) · [`docs/mcp-server.md`](./docs/mcp-server.md)
254
151
 
255
152
  ---
256
153
 
257
- ## Trust and transparency
154
+ ## Trust & transparency
258
155
 
259
156
  | Signal | Status |
260
157
  |---|---|
261
- | **Source code** | ~13,170 lines of TypeScript across 28 files — small enough to audit in a weekend |
262
- | **Dependencies** | 0 runtime dependencies. Verify: `npm ls --production` |
263
- | **npm provenance** | Every release is [SLSA-attested](https://www.npmjs.com/package/@askalf/dario) via GitHub Actions with sigstore provenance attached to the transparency log |
264
- | **Security scanning** | [CodeQL](https://github.com/askalf/dario/actions/workflows/codeql.yml) on every push and weekly |
265
- | **Test footprint** | ~1,535 assertions across 57 test suites. Full `npm test` green on every release |
266
- | **Credential handling** | Tokens and API keys never logged, redacted from errors, stored with `0600` permissions. MCP server (v3.27) redacts keys at the tool boundary too — not even a `sk-…` prefix leaks. |
267
- | **OAuth flow** | PKCE, no client secret. `--manual` flow for headless setups (v3.20). |
268
- | **Network scope** | Binds to `127.0.0.1` by default. `--host` allows LAN/mesh with `DARIO_API_KEY` gating. Upstream traffic goes only to configured backend target URLs over HTTPS. |
269
- | **SSRF protection** | `/v1/messages` hits `api.anthropic.com` only; `/v1/chat/completions` hits the configured backend `baseUrl` only — hardcoded allowlist |
270
- | **Telemetry** | None. Zero analytics, tracking, or data collection. The MCP server and CC sub-agent are read-only by design. |
271
- | **Audit trail** | [CHANGELOG.md](CHANGELOG.md) documents every release with file-level rationale |
272
-
273
- Verify the npm tarball matches this repo:
158
+ | Source | ~13,170 lines of TypeScript, 28 files — auditable in a weekend |
159
+ | Dependencies | **0 runtime.** Verify: `npm ls --production` |
160
+ | Provenance | Every release [SLSA-attested](https://www.npmjs.com/package/@askalf/dario) via GitHub Actions + sigstore |
161
+ | Scanning | [CodeQL](https://github.com/askalf/dario/actions/workflows/codeql.yml) on every push and weekly |
162
+ | Tests | ~1,535 assertions across 57 suites green on every release |
163
+ | Credentials | Never logged, redacted from errors, `0600` on disk; MCP server redacts at the tool boundary |
164
+ | Network | Binds `127.0.0.1` by default; upstream only to configured backends over HTTPS; hardcoded SSRF allowlist |
165
+ | Telemetry | **None.** No analytics, no tracking, no data collection |
274
166
 
275
167
  ```bash
276
168
  npm audit signatures
@@ -280,115 +172,104 @@ cd $(npm root -g)/@askalf/dario && npm ls --production
280
172
 
281
173
  ---
282
174
 
283
- ## Commands
175
+ ## Who it's for
176
+
177
+ **Best fit:** developers juggling multiple LLM tools and per-tool API keys · Claude Pro/Max subscribers who want their plan usable everywhere, not just in Claude Code · teams running local/hosted OpenAI-compat servers who want one stable local endpoint · Agent SDK users who want OAuth-subscription routing with zero code change (`baseURL: 'http://localhost:3456'`) · power users wanting multi-account pooling + 429 failover on their own machine.
178
+
179
+ **Not a fit:** you need vendor-managed production SLAs (use the provider APIs) · you need a hosted multi-tenant platform with a dashboard and team auth (that's the [askalf platform](https://askalf.org)) · you want a chat UI (use claude.ai).
180
+
181
+ ---
284
182
 
285
- `dario login`, `dario proxy`, `dario doctor`, `dario accounts {list,add,remove}`, `dario backend {list,add,remove}`, `dario shim`, `dario mcp`, `dario subagent {install,status,remove}`, `dario usage`, `dario config`, `dario upgrade`, `dario status`, `dario refresh`, `dario logout`, `dario help`.
183
+ ## Commands
286
184
 
287
- Full flag/env reference + endpoint list: [`docs/commands.md`](./docs/commands.md).
185
+ `dario login` · `proxy` · `doctor` · `accounts {list,add,remove}` · `backend {list,add,remove}` · `shim` · `mcp` · `subagent {install,status,remove}` · `usage` · `config` · `upgrade` · `status` · `refresh` · `logout` · `help`
288
186
 
289
- SDK examples (Python / TypeScript / curl) + per-tool setup: [`docs/usage.md`](./docs/usage.md).
187
+ Full flag/env reference: [`docs/commands.md`](./docs/commands.md) · SDK examples + per-tool setup: [`docs/usage.md`](./docs/usage.md)
290
188
 
291
189
  ---
292
190
 
293
191
  ## FAQ
294
192
 
295
- The most-asked questions. Full FAQ in [`docs/faq.md`](./docs/faq.md).
296
-
297
- **Does this violate Anthropic's terms of service?**
298
- Mechanically: dario uses your existing Claude Code credentials with the same OAuth tokens CC uses. It authenticates you as you, with your subscription, through Anthropic's official endpoints. Whether any particular use complies with their current terms is between you and Anthropic — consult their terms and your subscription agreement. Independent, unofficial, third-party — see [DISCLAIMER.md](DISCLAIMER.md).
193
+ **Does this violate Anthropic's terms?**
194
+ Mechanically, dario uses your existing Claude Code OAuth tokens — it authenticates you as you, with your subscription, through Anthropic's official endpoints. Whether any particular use complies with current terms is between you and Anthropic; consult their terms and your agreement. Independent, unofficial, third-party — see [DISCLAIMER.md](DISCLAIMER.md).
299
195
 
300
196
  **Do I need Claude Code installed?**
301
- Recommended, not strictly required. With CC installed, `dario login` picks up your credentials automatically and the live template extractor reads your CC binary on every startup. Without CC, dario runs its own OAuth flow and falls back to the bundled (scrubbed) template snapshot.
197
+ Recommended, not required. With CC, `dario login` picks up credentials automatically and the live template extractor reads your binary on every startup. Without it, dario runs its own OAuth flow and falls back to the bundled (scrubbed) template snapshot.
302
198
 
303
199
  **Do I need Bun?**
304
- Optional, strongly recommended for Claude-backend requests so the TLS ClientHello matches CC's runtime. Without Bun it works fine `dario doctor` surfaces the mismatch as of v3.23 and `--strict-tls` refuses to start until resolved.
200
+ Optional, recommended Bun's TLS ClientHello matches CC's runtime. Without it dario works fine; `dario doctor` flags the mismatch and `--strict-tls` hard-fails until resolved.
305
201
 
306
202
  **Can I use dario without a Claude subscription?**
307
- Yes. Skip `dario login`, `dario backend add openai --key=...` and you have a local OpenAI-compat router with no Claude involvement.
308
-
309
- **Something's wrong. Where do I start?**
310
- `dario doctor`. One command, paste-ready report. If you're inside CC, `dario subagent install` once and ask CC to "use the dario sub-agent to run doctor."
203
+ Yes. Skip `dario login`, `dario backend add openai --key=…`, and you have a local OpenAI-compat router with no Claude involvement.
311
204
 
312
- **I used dario before, drifted to another tool, now coming back anything to redo?**
313
- No reinstall. `dario login` re-uses any existing Claude Code credentials on your machine. If you also picked up Codex CLI / OpenAI in the gap, `dario backend add openai --key=$OPENAI_API_KEY` puts both your subscription path and your OpenAI fallback on the same `localhost:3456`. Full returner walkthrough: [`docs/returning.md`](./docs/returning.md).
205
+ **`representative-claim: seven_day` in my rate-limit headersam I downgraded?**
206
+ No. `five_hour` and `seven_day` are both subscription billing different accounting buckets, same mode. `overage` is the one that flips you to per-token. [Discussion #1](https://github.com/askalf/dario/discussions/1).
314
207
 
315
- **I'm seeing `representative-claim: seven_day` in my rate-limit headers — am I being downgraded?**
316
- **No.** Both `five_hour` and `seven_day` are subscription billing different accounting buckets inside the same subscription mode. `overage` is the one that flips you to per-token. See [Discussion #1](https://github.com/askalf/dario/discussions/1) for the full rate-limit-header breakdown.
208
+ **Will the 2026-06-15 split break my dario setup?**
209
+ No see [The deadline](#the-deadline-2026-06-15) above. dario rewrites every request to interactive-CC shape before it reaches `api.anthropic.com`; the classifier sees interactive CC, not `claude -p`/Agent SDK, regardless of the local tool.
317
210
 
318
- **I heard Anthropic is moving `claude -p` and Agent SDK to a separate credit pool on 2026-06-15. Will my dario setup break?**
319
- **No.** Dario rewrites every outbound request to look like an interactive Claude Code session before it hits `api.anthropic.com` — headers, body, TLS, session-id lifecycle. The upstream billing classifier sees interactive CC, not `claude -p` or Agent SDK, regardless of which local tool originated the call. Your workloads continue billing against your existing Pro / Max 5x / Max 20x subscription pool, same as today. Naive proxies that pass `claude -p` requests through unchanged will see their users' agentic traffic shift to the new $20–200/mo credit pool and then to metered API pricing — but that's not how dario was built. See the [2026-06-15 section](#what-changes-2026-06-15-and-why-dario-doesnt) above for the full breakdown.
211
+ Full FAQ: [`docs/faq.md`](./docs/faq.md)
320
212
 
321
213
  ---
322
214
 
323
215
  ## Technical deep dives
324
216
 
325
- - [#183 — CC's system prompt is 27kB. Modifying it doesn't change your billing. Stripping its behavioral constraints recovers 1.2–2.8× output capability.](https://github.com/askalf/dario/discussions/183) — productized as `--system-prompt=partial` in v3.34.0
326
- - [#178 — Anthropic's billing classifier fingerprints `openclaw.inbound_meta.v1`](https://github.com/askalf/dario/discussions/178) — reproducing Theo Browne's finding with controlled variants
327
- - [#172Re-testing #13: the system prompt is not a fingerprint signal](https://github.com/askalf/dario/discussions/172)
328
- - [#68 — dario vs LiteLLM / OpenRouter / Kong AI Gateway (when each one wins)](https://github.com/askalf/dario/discussions/68)
217
+ - [#183 — CC's 27kB system prompt: modifying it doesn't change billing; stripping its constraints recovers 1.2–2.8× output](https://github.com/askalf/dario/discussions/183)
218
+ - [#178 — Anthropic's billing classifier fingerprints `openclaw.inbound_meta.v1`](https://github.com/askalf/dario/discussions/178)
219
+ - [#68dario vs LiteLLM / OpenRouter / Kong AI Gateway (when each wins)](https://github.com/askalf/dario/discussions/68)
329
220
  - [#14 — Template Replay: why we stopped matching signals](https://github.com/askalf/dario/discussions/14)
330
221
  - [#13 — Claude Code's "defaults" are detection signals, not optimizations](https://github.com/askalf/dario/discussions/13)
331
- - [#39Why your Claude Max usage is burning in minutes](https://github.com/askalf/dario/discussions/39)
332
- - [#9 — Why Opus feels worse through other proxies and how to fix it](https://github.com/askalf/dario/discussions/9)
333
- - [#8 — Billing tag algorithm and fingerprint analysis](https://github.com/askalf/dario/discussions/8)
334
- - [#1 — Rate limit header analysis](https://github.com/askalf/dario/discussions/1)
335
-
336
- The CHANGELOG documents every v3.22 – v3.28 wire-fidelity release with file-level rationale; each one is worth reading as a standalone post on the axis it closes.
222
+ - [#1Rate-limit header analysis](https://github.com/askalf/dario/discussions/1)
337
223
 
338
224
  ---
339
225
 
340
226
  ## Contributing
341
227
 
342
- PRs welcome. Small TypeScript codebase ~13,170 lines, 28 files, zero runtime dependencies. See [`CONTRIBUTING.md`](CONTRIBUTING.md) for the architecture overview and the file-by-file map.
228
+ PRs welcome. Small TypeScript codebase, zero runtime deps. Architecture + file-by-file map in [`CONTRIBUTING.md`](CONTRIBUTING.md).
343
229
 
344
230
  ```bash
345
- git clone https://github.com/askalf/dario
346
- cd dario
231
+ git clone https://github.com/askalf/dario && cd dario
347
232
  npm install
348
- npm run dev # runs with tsx, no build step
349
- npm test # ~1,535 assertions across 57 suites
350
- npm run e2e # live proxy + OAuth (requires a working Claude backend)
233
+ npm run dev # tsx, no build step
234
+ npm test # ~1,535 assertions, 57 suites
235
+ npm run e2e # live proxy + OAuth (needs a working Claude backend)
351
236
  ```
352
237
 
353
- ---
354
-
355
- ## Contributors
238
+ ### Contributors
356
239
 
357
240
  | Who | Contributions |
358
241
  |---|---|
359
- | [@GodsBoy](https://github.com/GodsBoy) | Proxy authentication, token redaction, error sanitization ([#2](https://github.com/askalf/dario/pull/2)) |
360
- | [@belangertrading](https://github.com/belangertrading) | Billing classification investigation ([#4](https://github.com/askalf/dario/issues/4)), cache_control fingerprinting ([#6](https://github.com/askalf/dario/issues/6)), billing reclassification root cause ([#7](https://github.com/askalf/dario/issues/7)), OAuth client_id discovery ([#12](https://github.com/askalf/dario/issues/12)), multi-agent session-level billing analysis ([#23](https://github.com/askalf/dario/issues/23)) |
242
+ | [@GodsBoy](https://github.com/GodsBoy) | Proxy auth, token redaction, error sanitization ([#2](https://github.com/askalf/dario/pull/2)) |
243
+ | [@belangertrading](https://github.com/belangertrading) | Billing-classification investigation ([#4](https://github.com/askalf/dario/issues/4), [#6](https://github.com/askalf/dario/issues/6), [#7](https://github.com/askalf/dario/issues/7), [#12](https://github.com/askalf/dario/issues/12), [#23](https://github.com/askalf/dario/issues/23)) |
361
244
  | [@iNicholasBE](https://github.com/iNicholasBE) | macOS keychain credential detection ([#30](https://github.com/askalf/dario/pull/30)) |
362
- | [@boeingchoco](https://github.com/boeingchoco) | Reverse-direction tool parameter translation ([#29](https://github.com/askalf/dario/issues/29)), SSE event-group framing regression catch (v3.7.1), provider-comparison diagnostic, motivating case for hybrid tool mode ([#33](https://github.com/askalf/dario/issues/33), v3.9.0), OpenClaw tool-mapping root cause ([#36](https://github.com/askalf/dario/issues/36)) |
363
- | [@tetsuco](https://github.com/tetsuco) | Framework-name path corruption in scrubber ([#35](https://github.com/askalf/dario/issues/35)), OpenClaw Bash/Glob reverse-mapping collisions ([#37](https://github.com/askalf/dario/issues/37)), 20x-tier capture-artifact + OAuth-scope rejection report ([#42](https://github.com/askalf/dario/issues/42)) |
364
- | [@mikelovatt](https://github.com/mikelovatt) | Silent subscription-percent drain surfaced via friendly billing buckets ([#34](https://github.com/askalf/dario/issues/34)) |
365
- | [@ringge](https://github.com/ringge) | Fingerprint-fidelity concern motivating `--no-auto-detect` for text-tool-client auto-preserve ([#40](https://github.com/askalf/dario/issues/40), v3.20.1) |
366
- | [@earlvanze](https://github.com/earlvanze) | OpenClaw tool mappings + missing CC tools ([#19](https://github.com/askalf/dario/pull/19)), OAuth manual-override escape hatch ([#47](https://github.com/askalf/dario/pull/47)), HTTPS warning for non-secure overrides ([#53](https://github.com/askalf/dario/pull/53)) |
245
+ | [@boeingchoco](https://github.com/boeingchoco) | Reverse tool-param translation ([#29](https://github.com/askalf/dario/issues/29)), SSE framing regression catch, hybrid-tool motivation ([#33](https://github.com/askalf/dario/issues/33), [#36](https://github.com/askalf/dario/issues/36)) |
246
+ | [@tetsuco](https://github.com/tetsuco) | Scrubber path corruption ([#35](https://github.com/askalf/dario/issues/35)), OpenClaw reverse-mapping collisions ([#37](https://github.com/askalf/dario/issues/37)), 20x-tier report ([#42](https://github.com/askalf/dario/issues/42)) |
247
+ | [@mikelovatt](https://github.com/mikelovatt) | Silent subscription-drain surfaced via friendly billing buckets ([#34](https://github.com/askalf/dario/issues/34)) |
248
+ | [@ringge](https://github.com/ringge) | `--no-auto-detect` for text-tool auto-preserve ([#40](https://github.com/askalf/dario/issues/40)) |
249
+ | [@earlvanze](https://github.com/earlvanze) | OpenClaw tool mappings ([#19](https://github.com/askalf/dario/pull/19)), OAuth manual override ([#47](https://github.com/askalf/dario/pull/47)), HTTPS warning ([#53](https://github.com/askalf/dario/pull/53)) |
367
250
 
368
251
  ---
369
252
 
370
- ## Disclaimers
253
+ > **dario is the open-source wedge of [askalf](https://askalf.org)** — the AI workforce platform we're building. Dario solves the Claude subscription problem so the rest of the workforce runs on flat-rate billing. Star the repo or follow [@ask_alf](https://x.com/ask_alf) for platform updates.
371
254
 
372
- **dario is an independent, unofficial, third-party project.** Not affiliated with, endorsed by, or sponsored by Anthropic, OpenAI, or any other vendor referenced in the code or documentation. Provided as-is with no warranty. You are solely responsible for compliance with your subscription's terms of service, the security of your credentials, and the content you send through the proxy. Not intended for safety-critical, regulated, or production-grade environments without your own independent review. See [DISCLAIMER.md](DISCLAIMER.md) for full text.
255
+ ## Disclaimers
373
256
 
374
- ---
257
+ **dario is an independent, unofficial, third-party project.** Not affiliated with, endorsed by, or sponsored by Anthropic, OpenAI, or any vendor referenced here. Provided as-is, no warranty. You are solely responsible for compliance with your subscription's terms, the security of your credentials, and the content you send through the proxy. Not for safety-critical, regulated, or production environments without your own review. Full text: [DISCLAIMER.md](DISCLAIMER.md).
375
258
 
376
259
  ## License
377
260
 
378
261
  MIT — see [LICENSE](LICENSE) and [DISCLAIMER.md](DISCLAIMER.md).
379
262
 
380
- ---
381
-
382
263
  ## Also by askalf
383
264
 
384
265
  | Project | What it does |
385
- |---------|-------------|
386
- | [arnie](https://github.com/askalf/arnie) | Portable IT troubleshooting companion. Networking, AD, Windows Update, package managers, log triage, hardware checks. |
387
- | [browser-bridge](https://github.com/askalf/browser-bridge) | Stealth headless Chromium in a container. CDP on 9222 — Playwright/Puppeteer/MCP-compatible. |
388
- | [claude-bridge](https://github.com/askalf/claude-bridge) | Bridge Claude Code sessions to Discord. |
389
- | [deepdive](https://github.com/askalf/deepdive) | Local research agent. Plan → search → fetch → extract → synthesize. Cited answers. |
390
- | [git-providers](https://github.com/askalf/git-providers) | Unified GitHub + GitLab + Bitbucket Cloud REST clients behind one GitProvider interface. Plus a 44-entry api-key-provider taxonomy. |
391
- | [hands](https://github.com/askalf/hands) | Cross-platform computer-use agent. Mouse, keyboard, screen. |
392
- | [install-kit](https://github.com/askalf/install-kit) | curl-pipe-bash template for self-hosted Docker apps. |
393
- | [pgflex](https://github.com/askalf/pgflex) | One Postgres API. Two modes (real PG ↔ PGlite WASM). |
394
- | [redisflex](https://github.com/askalf/redisflex) | One Redis API. Two modes (ioredis ↔ in-process). |
266
+ |---|---|
267
+ | [arnie](https://github.com/askalf/arnie) | Portable IT troubleshooting companion — networking, AD, package managers, log triage |
268
+ | [browser-bridge](https://github.com/askalf/browser-bridge) | Stealth headless Chromium in a container, CDP on 9222 |
269
+ | [claude-bridge](https://github.com/askalf/claude-bridge) | Bridge Claude Code sessions to Discord |
270
+ | [deepdive](https://github.com/askalf/deepdive) | Local research agent plan → search → fetch → synthesize, cited answers |
271
+ | [git-providers](https://github.com/askalf/git-providers) | Unified GitHub + GitLab + Bitbucket Cloud clients behind one interface |
272
+ | [hands](https://github.com/askalf/hands) | Cross-platform computer-use agent — mouse, keyboard, screen |
273
+ | [install-kit](https://github.com/askalf/install-kit) | curl-pipe-bash template for self-hosted Docker apps |
274
+ | [pgflex](https://github.com/askalf/pgflex) | One Postgres API, two modes (real PG ↔ PGlite WASM) |
275
+ | [redisflex](https://github.com/askalf/redisflex) | One Redis API, two modes (ioredis ↔ in-process) |
@@ -262,6 +262,29 @@ export declare const VALID_EFFORT_VALUES: ReadonlyArray<EffortValue>;
262
262
  * Exported for tests.
263
263
  */
264
264
  export declare function resolveEffort(flag: EffortValue | undefined, clientBody: Record<string, unknown>): string;
265
+ /**
266
+ * Returns true if the given model accepts `thinking: { type: "adaptive" }`.
267
+ *
268
+ * Empirical results (2026-05-15, live OAuth-subscription probes against
269
+ * api.anthropic.com — see dario#NNN for the probe matrix):
270
+ * claude-opus-4-7 ✓ accepts adaptive
271
+ * claude-opus-4-6 ✓ accepts adaptive
272
+ * claude-sonnet-4-6 ✓ accepts adaptive
273
+ * claude-opus-4-5 ✗ "adaptive thinking is not supported on this model"
274
+ * claude-sonnet-4-5 ✗ same
275
+ * claude-haiku-4-5 ✗ same (already gated separately by isHaiku)
276
+ *
277
+ * The split is the 4.6 minor: Anthropic added adaptive support in the 4.6
278
+ * generation. Beta header state does not affect the outcome — adaptive is
279
+ * gated per-model, server-side.
280
+ *
281
+ * Allow-list pattern, default-deny: when a future model ships and isn't
282
+ * yet listed here, dario silently OMITS the `thinking` field rather than
283
+ * 400ing. Omitting `thinking` is always accepted by the API, so the
284
+ * worst-case regression is "no thinking blocks until allow-list update"
285
+ * — never a broken request.
286
+ */
287
+ export declare function supportsAdaptiveThinking(modelId: string): boolean;
265
288
  export declare function buildCCRequest(clientBody: Record<string, unknown>, billingTag: string, cacheControl: {
266
289
  type: 'ephemeral';
267
290
  }, identity: {
@@ -907,6 +907,54 @@ export function resolveEffort(flag, clientBody) {
907
907
  }
908
908
  return flag;
909
909
  }
910
+ /**
911
+ * Returns true if the given model accepts `thinking: { type: "adaptive" }`.
912
+ *
913
+ * Empirical results (2026-05-15, live OAuth-subscription probes against
914
+ * api.anthropic.com — see dario#NNN for the probe matrix):
915
+ * claude-opus-4-7 ✓ accepts adaptive
916
+ * claude-opus-4-6 ✓ accepts adaptive
917
+ * claude-sonnet-4-6 ✓ accepts adaptive
918
+ * claude-opus-4-5 ✗ "adaptive thinking is not supported on this model"
919
+ * claude-sonnet-4-5 ✗ same
920
+ * claude-haiku-4-5 ✗ same (already gated separately by isHaiku)
921
+ *
922
+ * The split is the 4.6 minor: Anthropic added adaptive support in the 4.6
923
+ * generation. Beta header state does not affect the outcome — adaptive is
924
+ * gated per-model, server-side.
925
+ *
926
+ * Allow-list pattern, default-deny: when a future model ships and isn't
927
+ * yet listed here, dario silently OMITS the `thinking` field rather than
928
+ * 400ing. Omitting `thinking` is always accepted by the API, so the
929
+ * worst-case regression is "no thinking blocks until allow-list update"
930
+ * — never a broken request.
931
+ */
932
+ export function supportsAdaptiveThinking(modelId) {
933
+ const m = modelId.toLowerCase();
934
+ // Opus/Sonnet, major-minor form: opus-4-6+, sonnet-4-6+, opus-5-X, etc.
935
+ //
936
+ // Digit groups are bounded to {1,2} so the dated-suffix pre-4.x line
937
+ // (`claude-3-5-sonnet-20241022`, `claude-3-7-sonnet-20250219`) doesn't
938
+ // accidentally match the date as `sonnet-2024-1022` and parse year as
939
+ // major. Realistic Anthropic version numbers are 1-2 digits.
940
+ const mm = m.match(/(?:opus|sonnet)-(\d{1,2})-(\d{1,2})\b/);
941
+ if (mm) {
942
+ const major = Number(mm[1]);
943
+ const minor = Number(mm[2]);
944
+ if (major > 4)
945
+ return true; // any opus-5+ / sonnet-5+
946
+ if (major === 4 && minor >= 6)
947
+ return true; // 4-6, 4-7, …
948
+ return false; // 4-5 and older
949
+ }
950
+ // Major-only form (e.g. `opus-5`, `opus-10`). The negative lookahead
951
+ // prevents matching the `5` in `opus-5-X` (handled above), and the
952
+ // {1,2} bound prevents matching long dated suffixes.
953
+ const majorOnly = m.match(/(?:opus|sonnet)-(\d{1,2})(?!\d|-)/);
954
+ if (majorOnly && Number(majorOnly[1]) >= 5)
955
+ return true;
956
+ return false;
957
+ }
910
958
  export function buildCCRequest(clientBody, billingTag, cacheControl, identity, opts = {}) {
911
959
  const model = clientBody.model || 'claude-sonnet-4-6';
912
960
  const isHaiku = model.toLowerCase().includes('haiku');
@@ -1216,13 +1264,24 @@ export function buildCCRequest(clientBody, billingTag, cacheControl, identity, o
1216
1264
  };
1217
1265
  ccRequest.max_tokens = resolveMaxTokens(opts.maxTokens, clientBody);
1218
1266
  // Model-specific fields — order: thinking, context_management, output_config
1267
+ //
1268
+ // `thinking: {type:"adaptive"}` is a 4.6-generation feature; older
1269
+ // Opus/Sonnet 4-5 models 400 it (`"adaptive thinking is not supported
1270
+ // on this model"`). `context_management.edits[clear_thinking_*]` is
1271
+ // tied to thinking — sending it without an enabled thinking field
1272
+ // 400s too (`"clear_thinking_* strategy requires thinking to be
1273
+ // enabled or adaptive"`). So both are gated on the same model check,
1274
+ // and either both ship or neither does.
1275
+ //
1276
+ // `output_config.effort` is independent of thinking and ships for all
1277
+ // non-Haiku models. Default `'high'` matches CC 2.1.116's wire value;
1278
+ // `--effort` flag overrides; `'client'` passes through whatever the
1279
+ // client sent (or falls back to `'high'` if absent). See dario#87.
1219
1280
  if (!isHaiku) {
1220
- ccRequest.thinking = { type: 'adaptive' };
1221
- ccRequest.context_management = { edits: [{ type: 'clear_thinking_20251015', keep: 'all' }] };
1222
- // output_config.effort default is `'high'` (matches CC 2.1.116's wire
1223
- // value). `--effort` flag overrides; `'client'` passes through whatever
1224
- // the client sent (or falls back to `'high'` if the client didn't
1225
- // include an output_config). See dario#87.
1281
+ if (supportsAdaptiveThinking(model)) {
1282
+ ccRequest.thinking = { type: 'adaptive' };
1283
+ ccRequest.context_management = { edits: [{ type: 'clear_thinking_20251015', keep: 'all' }] };
1284
+ }
1226
1285
  ccRequest.output_config = { effort: resolveEffort(opts.effort, clientBody) };
1227
1286
  }
1228
1287
  ccRequest.stream = stream;
@@ -282,7 +282,7 @@ export declare function _resetInstalledVersionProbeForTest(): void;
282
282
  */
283
283
  export declare const SUPPORTED_CC_RANGE: {
284
284
  readonly min: "1.0.0";
285
- readonly maxTested: "2.1.141";
285
+ readonly maxTested: "2.1.142";
286
286
  };
287
287
  /**
288
288
  * Compare two dotted-numeric version strings. Returns negative if `a<b`,
@@ -777,7 +777,7 @@ export function _resetInstalledVersionProbeForTest() {
777
777
  */
778
778
  export const SUPPORTED_CC_RANGE = {
779
779
  min: '1.0.0',
780
- maxTested: '2.1.141',
780
+ maxTested: '2.1.142',
781
781
  };
782
782
  /**
783
783
  * Compare two dotted-numeric version strings. Returns negative if `a<b`,
package/package.json CHANGED
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "name": "@askalf/dario",
3
- "version": "3.38.3",
4
- "description": "A local LLM router. One endpoint, every providerClaude subscriptions, OpenAI, OpenRouter, Groq, local LiteLLM, any OpenAI-compat endpoint your tools don't need to change.",
3
+ "version": "3.38.5",
4
+ "description": "Use your Claude Pro/Max subscription in any toolCursor, Cline, Aider, the Agent SDK, your scripts — at subscription pricing, not per-token API bills. One local Anthropic + OpenAI-compatible endpoint.",
5
5
  "type": "module",
6
6
  "bin": {
7
7
  "dario": "./dist/cli.js"