@askalf/dario 2.7.0 → 2.7.1

package/README.md CHANGED
@@ -70,14 +70,15 @@ Opus, Sonnet, Haiku — all models, streaming, tool use. Works with Cursor, Cont
 
  Most Claude subscription proxies have a critical billing problem: **Anthropic classifies their requests as third-party and routes all usage to Extra Usage billing** — even when you have Max plan limits available. You're paying for your subscription twice.
 
- dario is the only proxy that solves this. It injects native Claude Code device identity (`metadata.user_id`) into every request, so Anthropic's billing system treats your requests exactly like Claude Code itself. Your Max plan limits work correctly.
+ dario is the only proxy that solves this. It injects native Claude Code device identity, billing classification tags, and priority routing into every request so Anthropic's billing system treats your requests exactly like Claude Code itself. Your Max plan limits work correctly, and Opus/Sonnet stay available even at high utilization.
 
  | | dario | Other proxies |
  |---|---|---|
  | **Billing classification** | Native Claude Code session | Third-party (Extra Usage) |
  | **Max plan limits** | Used correctly | Bypassed — billed separately |
  | **Device identity** | Injected automatically | Missing |
- | **Beta flags** | Match Claude Code v2.1.98 | Outdated or missing |
+ | **Priority routing** | Billing tag + service_tier auto | Missing |
+ | **Beta flags** | Match Claude Code v2.1.100 | Outdated or missing |
  | **Billable beta filtering** | Strips surprise charges | Passes everything through |
 
  <details>
@@ -91,7 +92,7 @@ dario is the only proxy that solves this. It injects native Claude Code device i
  | OpenAI API compat | **Yes** | Yes | Yes | Yes |
  | Orchestration sanitization | **Yes** | Yes | No | No |
  | Token anomaly detection | **Yes** | Yes | No | No |
- | Codebase size | ~1,200 lines | ~9,000 lines | Platform | Rust binary |
+ | Codebase size | ~1,500 lines | ~9,000 lines | Platform | Rust binary |
  | Dependencies | 1 | Many | Many | Compiled |
  | Setup | 2 commands | Config + build | Config + dashboard | Config |
 
@@ -382,6 +383,9 @@ Then run `hermes` normally — it routes through dario using your Claude subscri
  ### Direct API Mode
  - All Claude models (Opus 4.6, Sonnet 4.6, Haiku 4.5) + 1M extended context aliases (`opus1m`, `sonnet1m`)
  - **Native billing classification** — device identity metadata ensures Max plan limits work correctly
+ - **Priority routing** — billing tag injection + `service_tier: auto` activates per-model rate limits, keeping Opus/Sonnet available even at 100% overall utilization
+ - **Adaptive thinking** — matches Claude Code's `{ type: 'adaptive' }` mode for optimal reasoning
+ - **Auto CLI fallback** — if the API returns 429 and Claude Code is installed, transparently retries through `claude --print` with SSE conversion
  - **OpenAI-compatible** (`/v1/chat/completions`) — works with any OpenAI SDK or tool
  - Streaming and non-streaming (both Anthropic and OpenAI SSE formats, including tool_use streaming)
  - Tool use / function calling
@@ -493,7 +497,7 @@ Dario handles your OAuth tokens. Here's why you can trust it:
 
  | Signal | Status |
  |--------|--------|
- | **Source code** | ~1,300 lines of TypeScript — small enough to audit in one sitting |
+ | **Source code** | ~1,500 lines of TypeScript — small enough to audit in one sitting |
  | **Dependencies** | 1 production dep (`@anthropic-ai/sdk`). Verify: `npm ls --production` |
  | **npm provenance** | Every release is [SLSA attested](https://www.npmjs.com/package/@askalf/dario) via GitHub Actions |
  | **Security scanning** | [CodeQL](https://github.com/askalf/dario/actions/workflows/codeql.yml) runs on every push and weekly |
@@ -515,7 +519,7 @@ cd $(npm root -g)/@askalf/dario && npm ls --production
 
  ## Contributing
 
- PRs welcome. The codebase is ~1,300 lines of TypeScript across 4 files:
+ PRs welcome. The codebase is ~1,500 lines of TypeScript across 4 files:
 
  | File | Purpose |
  |------|---------|
@@ -536,7 +540,7 @@ npm run dev # runs with tsx (no build needed)
  | Who | Contributions |
  |-----|---------------|
  | [@GodsBoy](https://github.com/GodsBoy) | Proxy authentication, token redaction, error sanitization ([#2](https://github.com/askalf/dario/pull/2)) |
- | [@belangertrading](https://github.com/belangertrading) | Billing classification investigation — reported, tested 5 versions, confirmed fix via response header analysis ([#4](https://github.com/askalf/dario/issues/4)) |
+ | [@belangertrading](https://github.com/belangertrading) | Billing classification investigation ([#4](https://github.com/askalf/dario/issues/4)), Opus/Sonnet 429 diagnosis + CLI fallback workaround ([#6](https://github.com/askalf/dario/issues/6)) |
 
  ## Also by AskAlf
 
package/dist/proxy.js CHANGED
@@ -605,9 +605,11 @@ export async function startProxy(opts = {}) {
  }),
  };
  }
- // Enable adaptive thinking (matches Claude Code default)
- // adaptive lets the model decide when/how much to think — preferred for Opus/Sonnet 4.6
- if (!r.thinking) {
+ // Enable adaptive thinking for models that support it (Opus/Sonnet 4.6+)
+ // Haiku 4.5 does not support thinking at all
+ const modelName = (r.model || '').toLowerCase();
+ const supportsThinking = !modelName.includes('haiku');
+ if (supportsThinking && !r.thinking) {
  r.thinking = { type: 'adaptive' };
  // Ensure max_tokens is reasonable for thinking models
  const clientMax = r.max_tokens || 8192;
@@ -618,7 +620,8 @@ export async function startProxy(opts = {}) {
  r.service_tier = 'auto';
  }
  // Enable context management (matches Claude Code default)
- if (!r.context_management) {
+ // Requires thinking to be enabled — skip for models without thinking support (e.g. Haiku)
+ if (supportsThinking && !r.context_management) {
  r.context_management = { edits: [{ type: 'clear_thinking_20251015', keep: 'all' }] };
  }
  // Inject Claude Code billing header into system prompt.
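Reviewer note: the gating introduced by the two `proxy.js` hunks above can be sketched in isolation as follows. This is a minimal illustration, not the package's actual code: the `ProxyRequest` type, the helper name `applyThinkingDefaults`, and the model strings in the usage note are assumptions made for the example, and the `max_tokens` adjustment that the diff only partially shows is left out.

```typescript
// Hypothetical request shape: only the fields touched by these hunks are modeled.
interface ProxyRequest {
  model?: string;
  thinking?: { type: string };
  context_management?: { edits: Array<{ type: string; keep: string }> };
}

// Same substring check as the 2.7.1 diff: any model whose name contains
// "haiku" is treated as lacking thinking support.
function supportsThinking(model: string | undefined): boolean {
  return !(model || "").toLowerCase().includes("haiku");
}

// Hypothetical helper: returns a copy of the request with the two defaults
// applied, each one only when the model supports thinking and the client
// did not already set the field.
function applyThinkingDefaults(r: ProxyRequest): ProxyRequest {
  const out: ProxyRequest = { ...r };
  if (!supportsThinking(out.model)) {
    return out; // Haiku-style models pass through unchanged
  }
  if (!out.thinking) {
    out.thinking = { type: "adaptive" };
  }
  if (!out.context_management) {
    out.context_management = {
      edits: [{ type: "clear_thinking_20251015", keep: "all" }],
    };
  }
  return out;
}
```

With this sketch, a request for an illustrative model name such as `claude-haiku-4-5` comes back without `thinking` or `context_management`, while one for `claude-opus-4-6` gains both defaults. A substring match is a coarse test: it also excludes any future model whose name happens to contain "haiku", which seems to be the intent of the change but is worth keeping in mind when reviewing.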
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@askalf/dario",
- "version": "2.7.0",
+ "version": "2.7.1",
  "description": "Use your Claude subscription as an API. No API key needed. Local proxy for Claude Max/Pro subscriptions.",
  "type": "module",
  "bin": {