npm - @askalf/dario - Versions diffs - 2.8.5 → 2.8.6 - Mend

@askalf/dario 2.8.5 → 2.8.6

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (3) hide show

package/README.md CHANGED Viewed

@@ -68,7 +68,7 @@ Opus, Sonnet, Haiku — all models, streaming, tool use. Works with Cursor, Cont
 <tr>
 <td colspan="3" valign="top">
-*"The 429s were driving us crazy running a multi-agent stack on Claude Max — the CLI fallback was duct tape until you found the real fix. Billing tag in the system prompt is wild. v2.8.0 running clean, zero 429s."* — [@belangertrading](https://github.com/belangertrading), multi-agent stack on Claude Max
+*"The 429s were driving us crazy running a multi-agent stack on Claude Max. You found the billing tag, fixed the checksum, reverse-engineered the per-request hash from the binary — v2.8.5 running clean, zero reclassification."* — [@belangertrading](https://github.com/belangertrading), multi-agent stack on Claude Max
 </td>
 </tr>
@@ -80,7 +80,7 @@ Opus, Sonnet, Haiku — all models, streaming, tool use. Works with Cursor, Cont
 Most Claude subscription proxies have a critical billing problem: **Anthropic classifies their requests as third-party and routes all usage to Extra Usage billing** — even when you have Max plan limits available. You're paying for your subscription twice.
-dario is the only proxy that solves this. It injects native Claude Code device identity, billing classification tags, and priority routing into every request — so Anthropic's billing system treats your requests exactly like Claude Code itself. Your Max plan limits work correctly, and Opus/Sonnet stay available even at high utilization.
+dario is the only proxy that solves this. It injects native Claude Code device identity, per-request billing checksums (reverse-engineered from the Claude Code binary), and priority routing into every request — so Anthropic's billing system treats your requests exactly like Claude Code itself. Your Max plan limits work correctly, and Opus/Sonnet stay available even at high utilization.
 | | dario | Other proxies |
 |---|---|---|
@@ -88,6 +88,7 @@ dario is the only proxy that solves this. It injects native Claude Code device i
 | **Max plan limits** | Used correctly | Bypassed — billed separately |
 | **Device identity** | Injected automatically | Missing |
 | **Priority routing** | Billing tag + service_tier auto | Missing |
+| **Billing tag fingerprint** | Per-request SHA-256 matching binary RE | Static or missing |
 | **Beta flags** | Match Claude Code v2.1.100 | Outdated or missing |
 | **Billable beta filtering** | Strips surprise charges | Passes everything through |
@@ -415,7 +416,7 @@ Then run `hermes` normally — it routes through dario using your Claude subscri
 ### Direct API Mode
 - All Claude models (Opus 4.6, Sonnet 4.6, Haiku 4.5) + 1M extended context aliases (`opus1m`, `sonnet1m`)
-- **Native billing classification** — device identity metadata ensures Max plan limits work correctly
+- **Native billing classification** — device identity, per-request billing tag with SHA-256 checksums matching real Claude Code (extracted via binary RE), ensures Max plan limits work correctly
 - **Priority routing** — billing tag injection + `service_tier: 'auto'` activates per-model rate limits, keeping Opus/Sonnet available even at 100% overall utilization
 - **Adaptive thinking** — matches Claude Code's `{ type: 'adaptive' }` mode for optimal reasoning (auto-skipped for Haiku 4.5)
 - **Effort control** — injects `output_config: { effort: 'high' }` by default, or passes through client-specified effort level
@@ -585,7 +586,7 @@ npm run dev   # runs with tsx (no build needed)
 | Who | Contributions |
 |-----|---------------|
 | [@GodsBoy](https://github.com/GodsBoy) | Proxy authentication, token redaction, error sanitization ([#2](https://github.com/askalf/dario/pull/2)) |
-| [@belangertrading](https://github.com/belangertrading) | Billing classification investigation ([#4](https://github.com/askalf/dario/issues/4)), Opus/Sonnet 429 diagnosis + CLI fallback workaround ([#6](https://github.com/askalf/dario/issues/6)) |
+| [@belangertrading](https://github.com/belangertrading) | Billing classification investigation ([#4](https://github.com/askalf/dario/issues/4)), Opus/Sonnet 429 diagnosis + CLI fallback workaround ([#6](https://github.com/askalf/dario/issues/6)), billing reclassification root cause ([#7](https://github.com/askalf/dario/issues/7)) |
 ## Also by AskAlf

package/dist/proxy.js CHANGED Viewed

@@ -4,7 +4,7 @@ import { execSync, spawn } from 'node:child_process';
 import { readFileSync, readdirSync, writeFileSync, unlinkSync } from 'node:fs';
 import { join } from 'node:path';
 import { homedir, tmpdir } from 'node:os';
-import { arch, platform, version as nodeVersion } from 'node:process';
+import { arch, platform } from 'node:process';
 import { getAccessToken, getStatus } from './oauth.js';
 const ANTHROPIC_API = 'https://api.anthropic.com';
 const DEFAULT_PORT = 3456;
@@ -512,7 +512,8 @@ export async function startProxy(opts = {}) {
         'x-stainless-package-version': '0.81.0',
         'x-stainless-retry-count': '0',
         'x-stainless-runtime': 'node',
-        'x-stainless-runtime-version': nodeVersion,
+        // Claude Code runs on Bun which reports v24.3.0 as Node compat version
+        'x-stainless-runtime-version': 'v24.3.0',
     };
     const useCli = opts.cliBackend ?? false;
     let requestCount = 0;
@@ -686,18 +687,15 @@ export async function startProxy(opts = {}) {
                         const supportsThinking = !modelName.includes('haiku');
                         if (supportsThinking && !r.thinking) {
                             r.thinking = { type: 'adaptive' };
-                            // Ensure max_tokens is reasonable for thinking models
-                            const clientMax = r.max_tokens || 8192;
-                            r.max_tokens = Math.max(clientMax, 16000);
                         }
-                        // Request priority capacity when available
-                        if (!r.service_tier) {
-                            r.service_tier = 'auto';
+                        // Match Claude Code's default max_tokens (64000) when client sends low values
+                        if (!r.max_tokens || r.max_tokens < 16000) {
+                            r.max_tokens = 64000;
                         }
-                        // Set reasoning effort (pass through client value or default)
+                        // Set reasoning effort (pass through client value or default to 'medium' matching Claude Code)
                         // Haiku does not support the effort parameter
                         if (supportsThinking && !r.output_config) {
-                            r.output_config = { effort: 'high' };
+                            r.output_config = { effort: 'medium' };
                         }
                         // Enable context management (matches Claude Code default)
                         // Requires thinking to be enabled — skip for models without thinking support (e.g. Haiku)
@@ -714,24 +712,39 @@ export async function startProxy(opts = {}) {
                         // as the real Claude Code binary (Oz$ function):
                         // - build tag = SHA-256(seed + msg_chars[4,7,20] + version).slice(0,3)
                         // - cch = SHA-256(seed + version + msg_chars[4,7,20]).slice(0,5)
+                        // Build per-request billing tag matching Claude Code binary
                         const userMsg = extractFirstUserMessage(r);
                         const buildTag = computeBuildTag(userMsg, cliVersion);
                         const cch = computeCch(userMsg, cliVersion);
                         const fullVersion = `${cliVersion}.${buildTag}`;
                         const billingTag = `x-anthropic-billing-header: cc_version=${fullVersion}; cc_entrypoint=cli; cch=${cch};`;
+                        // Structure system prompt as 3 blocks matching real Claude Code:
+                        // [0] billing tag (no cache_control)
+                        // [1] agent identity string (cache 1h)
+                        // [2] actual system prompt (cache 1h)
+                        const AGENT_IDENTITY = 'You are a Claude agent, built on Anthropic\'s Claude Agent SDK.';
+                        const CACHE_1H = { type: 'ephemeral', ttl: '1h' };
                         if (typeof r.system === 'string') {
                             if (!r.system.includes('x-anthropic-billing-header:')) {
-                                r.system = billingTag + '\n' + r.system;
+                                r.system = [
+                                    { type: 'text', text: billingTag },
+                                    { type: 'text', text: AGENT_IDENTITY, cache_control: CACHE_1H },
+                                    { type: 'text', text: r.system, cache_control: CACHE_1H },
+                                ];
                             }
                         }
                         else if (Array.isArray(r.system)) {
                             const hasTag = r.system.some(b => typeof b.text === 'string' && b.text.includes('x-anthropic-billing-header:'));
                             if (!hasTag) {
-                                r.system.unshift({ type: 'text', text: billingTag });
+                                // Prepend billing tag and agent identity before existing blocks
+                                r.system.unshift({ type: 'text', text: billingTag }, { type: 'text', text: AGENT_IDENTITY, cache_control: CACHE_1H });
                             }
                         }
                         else {
-                            r.system = billingTag;
+                            r.system = [
+                                { type: 'text', text: billingTag },
+                                { type: 'text', text: AGENT_IDENTITY, cache_control: CACHE_1H },
+                            ];
                         }
                     }
                     finalBody = Buffer.from(JSON.stringify(r));
@@ -752,8 +765,8 @@ export async function startProxy(opts = {}) {
                     beta += ',' + clientBeta;
             }
             else {
-                // Claude-optimized: full beta set matching CLI v2.1.100
-                beta = 'oauth-2025-04-20,interleaved-thinking-2025-05-14,context-management-2025-06-27,prompt-caching-scope-2026-01-05,claude-code-20250219,advisor-tool-2026-03-01,effort-2025-11-24';
+                // Claude-optimized: full beta set matching real Claude Code (exact order from MITM capture)
+                beta = 'claude-code-20250219,oauth-2025-04-20,interleaved-thinking-2025-05-14,context-management-2025-06-27,prompt-caching-scope-2026-01-05,advisor-tool-2026-03-01,effort-2025-11-24';
                 if (clientBeta) {
                     const filtered = filterBillableBetas(clientBeta);
                     if (filtered)

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "@askalf/dario",
-  "version": "2.8.5",
+  "version": "2.8.6",
   "description": "Use your Claude subscription as an API. No API key needed. Local proxy for Claude Max/Pro subscriptions.",
   "type": "module",
   "bin": {