@askalf/dario 2.8.5 → 2.8.7

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (3) hide show
  1. package/README.md +5 -4
  2. package/dist/proxy.js +34 -24
  3. package/package.json +1 -1
package/README.md CHANGED
@@ -68,7 +68,7 @@ Opus, Sonnet, Haiku — all models, streaming, tool use. Works with Cursor, Cont
68
68
  <tr>
69
69
  <td colspan="3" valign="top">
70
70
 
71
- *"The 429s were driving us crazy running a multi-agent stack on Claude Max the CLI fallback was duct tape until you found the real fix. Billing tag in the system prompt is wild. v2.8.0 running clean, zero 429s."* — [@belangertrading](https://github.com/belangertrading), multi-agent stack on Claude Max
71
+ *"The 429s were driving us crazy running a multi-agent stack on Claude Max. You found the billing tag, fixed the checksum, reverse-engineered the per-request hash from the binary v2.8.5 running clean, zero reclassification."* — [@belangertrading](https://github.com/belangertrading), multi-agent stack on Claude Max
72
72
 
73
73
  </td>
74
74
  </tr>
@@ -80,7 +80,7 @@ Opus, Sonnet, Haiku — all models, streaming, tool use. Works with Cursor, Cont
80
80
 
81
81
  Most Claude subscription proxies have a critical billing problem: **Anthropic classifies their requests as third-party and routes all usage to Extra Usage billing** — even when you have Max plan limits available. You're paying for your subscription twice.
82
82
 
83
- dario is the only proxy that solves this. It injects native Claude Code device identity, billing classification tags, and priority routing into every request — so Anthropic's billing system treats your requests exactly like Claude Code itself. Your Max plan limits work correctly, and Opus/Sonnet stay available even at high utilization.
83
+ dario is the only proxy that solves this. It injects native Claude Code device identity, per-request billing checksums (reverse-engineered from the Claude Code binary), and priority routing into every request — so Anthropic's billing system treats your requests exactly like Claude Code itself. Your Max plan limits work correctly, and Opus/Sonnet stay available even at high utilization.
84
84
 
85
85
  | | dario | Other proxies |
86
86
  |---|---|---|
@@ -88,6 +88,7 @@ dario is the only proxy that solves this. It injects native Claude Code device i
88
88
  | **Max plan limits** | Used correctly | Bypassed — billed separately |
89
89
  | **Device identity** | Injected automatically | Missing |
90
90
  | **Priority routing** | Billing tag + service_tier auto | Missing |
91
+ | **Billing tag fingerprint** | Per-request SHA-256 matching binary RE | Static or missing |
91
92
  | **Beta flags** | Match Claude Code v2.1.100 | Outdated or missing |
92
93
  | **Billable beta filtering** | Strips surprise charges | Passes everything through |
93
94
 
@@ -415,7 +416,7 @@ Then run `hermes` normally — it routes through dario using your Claude subscri
415
416
 
416
417
  ### Direct API Mode
417
418
  - All Claude models (Opus 4.6, Sonnet 4.6, Haiku 4.5) + 1M extended context aliases (`opus1m`, `sonnet1m`)
418
- - **Native billing classification** — device identity metadata ensures Max plan limits work correctly
419
+ - **Native billing classification** — device identity, per-request billing tag with SHA-256 checksums matching real Claude Code (extracted via binary RE), ensures Max plan limits work correctly
419
420
  - **Priority routing** — billing tag injection + `service_tier: 'auto'` activates per-model rate limits, keeping Opus/Sonnet available even at 100% overall utilization
420
421
  - **Adaptive thinking** — matches Claude Code's `{ type: 'adaptive' }` mode for optimal reasoning (auto-skipped for Haiku 4.5)
421
422
  - **Effort control** — injects `output_config: { effort: 'high' }` by default, or passes through client-specified effort level
@@ -585,7 +586,7 @@ npm run dev # runs with tsx (no build needed)
585
586
  | Who | Contributions |
586
587
  |-----|---------------|
587
588
  | [@GodsBoy](https://github.com/GodsBoy) | Proxy authentication, token redaction, error sanitization ([#2](https://github.com/askalf/dario/pull/2)) |
588
- | [@belangertrading](https://github.com/belangertrading) | Billing classification investigation ([#4](https://github.com/askalf/dario/issues/4)), Opus/Sonnet 429 diagnosis + CLI fallback workaround ([#6](https://github.com/askalf/dario/issues/6)) |
589
+ | [@belangertrading](https://github.com/belangertrading) | Billing classification investigation ([#4](https://github.com/askalf/dario/issues/4)), Opus/Sonnet 429 diagnosis + CLI fallback workaround ([#6](https://github.com/askalf/dario/issues/6)), billing reclassification root cause ([#7](https://github.com/askalf/dario/issues/7)) |
589
590
 
590
591
  ## Also by AskAlf
591
592
 
package/dist/proxy.js CHANGED
@@ -1,10 +1,10 @@
1
1
  import { createServer } from 'node:http';
2
- import { randomUUID, timingSafeEqual, createHash } from 'node:crypto';
2
+ import { randomUUID, randomBytes, timingSafeEqual, createHash } from 'node:crypto';
3
3
  import { execSync, spawn } from 'node:child_process';
4
4
  import { readFileSync, readdirSync, writeFileSync, unlinkSync } from 'node:fs';
5
5
  import { join } from 'node:path';
6
6
  import { homedir, tmpdir } from 'node:os';
7
- import { arch, platform, version as nodeVersion } from 'node:process';
7
+ import { arch, platform } from 'node:process';
8
8
  import { getAccessToken, getStatus } from './oauth.js';
9
9
  const ANTHROPIC_API = 'https://api.anthropic.com';
10
10
  const DEFAULT_PORT = 3456;
@@ -43,12 +43,10 @@ function computeBuildTag(userMessage, version) {
43
43
  const chars = [4, 7, 20].map(i => userMessage[i] || '0').join('');
44
44
  return createHash('sha256').update(`${BILLING_SEED}${chars}${version}`).digest('hex').slice(0, 3);
45
45
  }
46
- // Compute per-request cch checksum matching Claude Code's algorithm:
47
- // SHA-256(seed + chars[4,7,20] of user message + version).slice(0,5)
48
- // Real Claude Code uses a similar but separate computation that produces 5 hex chars
49
- function computeCch(userMessage, version) {
50
- const chars = [4, 7, 20].map(i => userMessage[i] || '0').join('');
51
- return createHash('sha256').update(`${BILLING_SEED}${version}${chars}`).digest('hex').slice(0, 5);
46
+ // Per-request cch: real Claude Code generates a random 5-char hex value each request.
47
+ // Confirmed via MITM: 10 identical requests 10 unique cch values, no deterministic pattern.
48
+ function computeCch() {
49
+ return randomBytes(3).toString('hex').slice(0, 5);
52
50
  }
53
51
  // Detect installed Claude Code binary at startup (single exec for both version + availability)
54
52
  let cliAvailable = false;
@@ -512,7 +510,8 @@ export async function startProxy(opts = {}) {
512
510
  'x-stainless-package-version': '0.81.0',
513
511
  'x-stainless-retry-count': '0',
514
512
  'x-stainless-runtime': 'node',
515
- 'x-stainless-runtime-version': nodeVersion,
513
+ // Claude Code runs on Bun which reports v24.3.0 as Node compat version
514
+ 'x-stainless-runtime-version': 'v24.3.0',
516
515
  };
517
516
  const useCli = opts.cliBackend ?? false;
518
517
  let requestCount = 0;
@@ -686,18 +685,15 @@ export async function startProxy(opts = {}) {
686
685
  const supportsThinking = !modelName.includes('haiku');
687
686
  if (supportsThinking && !r.thinking) {
688
687
  r.thinking = { type: 'adaptive' };
689
- // Ensure max_tokens is reasonable for thinking models
690
- const clientMax = r.max_tokens || 8192;
691
- r.max_tokens = Math.max(clientMax, 16000);
692
688
  }
693
- // Request priority capacity when available
694
- if (!r.service_tier) {
695
- r.service_tier = 'auto';
689
+ // Match Claude Code's default max_tokens (64000) when client sends low values
690
+ if (!r.max_tokens || r.max_tokens < 16000) {
691
+ r.max_tokens = 64000;
696
692
  }
697
- // Set reasoning effort (pass through client value or default)
693
+ // Set reasoning effort (pass through client value or default to 'medium' matching Claude Code)
698
694
  // Haiku does not support the effort parameter
699
695
  if (supportsThinking && !r.output_config) {
700
- r.output_config = { effort: 'high' };
696
+ r.output_config = { effort: 'medium' };
701
697
  }
702
698
  // Enable context management (matches Claude Code default)
703
699
  // Requires thinking to be enabled — skip for models without thinking support (e.g. Haiku)
@@ -714,24 +710,39 @@ export async function startProxy(opts = {}) {
714
710
  // as the real Claude Code binary (Oz$ function):
715
711
  // - build tag = SHA-256(seed + msg_chars[4,7,20] + version).slice(0,3)
716
712
  // - cch = SHA-256(seed + version + msg_chars[4,7,20]).slice(0,5)
713
+ // Build per-request billing tag matching Claude Code binary
717
714
  const userMsg = extractFirstUserMessage(r);
718
715
  const buildTag = computeBuildTag(userMsg, cliVersion);
719
- const cch = computeCch(userMsg, cliVersion);
716
+ const cch = computeCch();
720
717
  const fullVersion = `${cliVersion}.${buildTag}`;
721
718
  const billingTag = `x-anthropic-billing-header: cc_version=${fullVersion}; cc_entrypoint=cli; cch=${cch};`;
719
+ // Structure system prompt as 3 blocks matching real Claude Code:
720
+ // [0] billing tag (no cache_control)
721
+ // [1] agent identity string (cache 1h)
722
+ // [2] actual system prompt (cache 1h)
723
+ const AGENT_IDENTITY = 'You are a Claude agent, built on Anthropic\'s Claude Agent SDK.';
724
+ const CACHE_1H = { type: 'ephemeral', ttl: '1h' };
722
725
  if (typeof r.system === 'string') {
723
726
  if (!r.system.includes('x-anthropic-billing-header:')) {
724
- r.system = billingTag + '\n' + r.system;
727
+ r.system = [
728
+ { type: 'text', text: billingTag },
729
+ { type: 'text', text: AGENT_IDENTITY, cache_control: CACHE_1H },
730
+ { type: 'text', text: r.system, cache_control: CACHE_1H },
731
+ ];
725
732
  }
726
733
  }
727
734
  else if (Array.isArray(r.system)) {
728
735
  const hasTag = r.system.some(b => typeof b.text === 'string' && b.text.includes('x-anthropic-billing-header:'));
729
736
  if (!hasTag) {
730
- r.system.unshift({ type: 'text', text: billingTag });
737
+ // Prepend billing tag and agent identity before existing blocks
738
+ r.system.unshift({ type: 'text', text: billingTag }, { type: 'text', text: AGENT_IDENTITY, cache_control: CACHE_1H });
731
739
  }
732
740
  }
733
741
  else {
734
- r.system = billingTag;
742
+ r.system = [
743
+ { type: 'text', text: billingTag },
744
+ { type: 'text', text: AGENT_IDENTITY, cache_control: CACHE_1H },
745
+ ];
735
746
  }
736
747
  }
737
748
  finalBody = Buffer.from(JSON.stringify(r));
@@ -752,8 +763,8 @@ export async function startProxy(opts = {}) {
752
763
  beta += ',' + clientBeta;
753
764
  }
754
765
  else {
755
- // Claude-optimized: full beta set matching CLI v2.1.100
756
- beta = 'oauth-2025-04-20,interleaved-thinking-2025-05-14,context-management-2025-06-27,prompt-caching-scope-2026-01-05,claude-code-20250219,advisor-tool-2026-03-01,effort-2025-11-24';
766
+ // Claude-optimized: full beta set matching real Claude Code (exact order from MITM capture)
767
+ beta = 'claude-code-20250219,oauth-2025-04-20,interleaved-thinking-2025-05-14,context-management-2025-06-27,prompt-caching-scope-2026-01-05,advisor-tool-2026-03-01,effort-2025-11-24';
757
768
  if (clientBeta) {
758
769
  const filtered = filterBillableBetas(clientBeta);
759
770
  if (filtered)
@@ -765,7 +776,6 @@ export async function startProxy(opts = {}) {
765
776
  'Authorization': `Bearer ${accessToken}`,
766
777
  'anthropic-version': req.headers['anthropic-version'] || '2023-06-01',
767
778
  'anthropic-beta': beta,
768
- 'x-client-request-id': randomUUID(),
769
779
  // Real Claude Code sends 600 on first request, 300 on subsequent
770
780
  'x-stainless-timeout': requestCount <= 1 ? '600' : '300',
771
781
  };
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@askalf/dario",
3
- "version": "2.8.5",
3
+ "version": "2.8.7",
4
4
  "description": "Use your Claude subscription as an API. No API key needed. Local proxy for Claude Max/Pro subscriptions.",
5
5
  "type": "module",
6
6
  "bin": {