@askalf/dario 2.8.0 → 2.8.2
- package/README.md +57 -12
- package/dist/cli.js +13 -22
- package/dist/oauth.js +2 -7
- package/dist/proxy.js +66 -157
- package/package.json +3 -2
package/README.md
CHANGED
@@ -60,6 +60,16 @@ Opus, Sonnet, Haiku — all models, streaming, tool use. Works with Cursor, Cont
 
 *"Highly recommended for personal, local development. Solves a massive pain point for developers by bridging Claude Max/Pro subscriptions with developer IDEs, saving substantial API costs. Modular & lean (~1100 lines), modern PKCE auth, SSRF protection, mature CI/CD pipeline with CodeQL and npm provenance attestations."*
 
+</td>
+</tr>
+<tr>
+<td colspan="3" align="center"><br/><strong>In production</strong><br/><br/></td>
+</tr>
+<tr>
+<td colspan="3" valign="top">
+
+*"The 429s were driving us crazy running a multi-agent stack on Claude Max — the CLI fallback was duct tape until you found the real fix. Billing tag in the system prompt is wild. v2.8.0 running clean, zero 429s."* — [@belangertrading](https://github.com/belangertrading), multi-agent stack on Claude Max
+
 </td>
 </tr>
 </table>
@@ -193,13 +203,23 @@ Model: claude-opus-4-6 (all requests)
 
 **Trade-offs vs direct API mode:**
 
-| | Direct API (default) | CLI Backend (`--cli`) |
-
-| Streaming |
-| Tool use
-|
-|
-|
+| | Direct API (default) | CLI Backend (`--cli`) | Passthrough (`--passthrough`) |
+|---|---|---|---|
+| Streaming | Native SSE | SSE (converted from JSON) | Native SSE |
+| Tool use | Yes | No | Yes |
+| Thinking/billing injection | Yes (Claude-optimized) | N/A | No (OAuth swap only) |
+| Latency | Low | Higher (process spawn) | Low |
+| Rate limits | Priority routing | Not affected | Standard (no priority) |
+| Opus when throttled | Auto CLI fallback | **Always works** | May return 429 |
+
+## Passthrough Mode
+
+For tools like Hermes or OpenClaw that need exact Anthropic protocol fidelity, use `--passthrough`. This does OAuth swap only — no billing tag, no thinking injection, no device identity, no extra beta flags.
+
+```bash
+dario proxy --passthrough # Thin proxy, zero injection
+dario proxy --passthrough --model=opus # Thin proxy + model override
+```
 
 ## Model Selection
 
@@ -285,7 +305,7 @@ const message = await client.messages.create({
 });
 ```
 
-### Streaming
+### Streaming
 
 ```bash
 curl http://localhost:3456/v1/messages \
@@ -351,6 +371,18 @@ Then run `hermes` normally — it routes through dario using your Claude subscri
 └──────────┘     └─────────────────┘     └──────────────────┘
 ```
 
+### Passthrough Mode (`--passthrough`)
+
+```
+┌──────────┐     ┌─────────────────┐     ┌──────────────────┐
+│ Your App │ ──> │ dario (proxy)   │ ──> │ api.anthropic.com│
+│          │     │ localhost:3456  │     │                  │
+│ sends    │     │ swaps API key   │     │ sees valid       │
+│ API      │     │ for OAuth       │     │ OAuth bearer     │
+│ request  │     │ nothing else    │     │ token            │
+└──────────┘     └─────────────────┘     └──────────────────┘
+```
+
 1. **`dario login`** — Detects your existing Claude Code credentials (`~/.claude/.credentials.json`) and starts the proxy automatically. If Claude Code isn't installed, runs a PKCE OAuth flow with a local callback server to capture the token automatically.
 
 2. **`dario proxy`** — Starts an HTTP server on localhost that implements the Anthropic Messages API. In direct mode, it swaps your API key for an OAuth bearer token. In CLI mode, it routes through the Claude Code binary.
@@ -373,6 +405,7 @@ Then run `hermes` normally — it routes through dario using your Claude subscri
 | Flag/Env | Description | Default |
 |----------|-------------|---------|
 | `--cli` | Use Claude CLI as backend (bypasses rate limits) | off |
+| `--passthrough` | Thin proxy — OAuth swap only, no injection | off |
 | `--model=MODEL` | Force a model (`opus`, `sonnet`, `haiku`, or full ID) | passthrough |
 | `--port=PORT` | Port to listen on | `3456` |
 | `--verbose` / `-v` | Log every request | off |
@@ -383,8 +416,10 @@ Then run `hermes` normally — it routes through dario using your Claude subscri
 ### Direct API Mode
 - All Claude models (Opus 4.6, Sonnet 4.6, Haiku 4.5) + 1M extended context aliases (`opus1m`, `sonnet1m`)
 - **Native billing classification** — device identity metadata ensures Max plan limits work correctly
-- **Priority routing** — billing tag injection + `service_tier: auto` activates per-model rate limits, keeping Opus/Sonnet available even at 100% overall utilization
-- **Adaptive thinking** — matches Claude Code's `{ type: 'adaptive' }` mode for optimal reasoning
+- **Priority routing** — billing tag injection + `service_tier: 'auto'` activates per-model rate limits, keeping Opus/Sonnet available even at 100% overall utilization
+- **Adaptive thinking** — matches Claude Code's `{ type: 'adaptive' }` mode for optimal reasoning (auto-skipped for Haiku 4.5)
+- **Effort control** — injects `output_config: { effort: 'high' }` by default, or passes through client-specified effort level
+- **Enriched 429 errors** — rate limit errors include utilization %, limiting window, and reset time instead of Anthropic's default `"Error"` message
 - **Auto CLI fallback** — if the API returns 429 and Claude Code is installed, transparently retries through `claude --print` with SSE conversion
 - **OpenAI-compatible** (`/v1/chat/completions`) — works with any OpenAI SDK or tool
 - Streaming and non-streaming (both Anthropic and OpenAI SSE formats, including tool_use streaming)
@@ -399,10 +434,17 @@ Then run `hermes` normally — it routes through dario using your Claude subscri
 
 ### CLI Backend Mode
 - All Claude models — including Opus when rate limited
--
+- Streaming via SSE conversion (client sends `stream: true`, CLI JSON response is converted to Anthropic or OpenAI SSE events)
+- OpenAI compatibility (translates OpenAI → Anthropic before CLI, Anthropic → OpenAI after)
 - System prompts and multi-turn conversations (via context injection)
 - Not affected by API rate limits
 
+### Passthrough Mode
+- All Claude models with native streaming and tool use
+- OAuth token swap only — no billing tag, thinking, effort, service_tier, or device identity injection
+- Minimal beta flags (`oauth-2025-04-20` + client betas only)
+- For tools like Hermes or OpenClaw that need exact Anthropic protocol fidelity
+
 ## Endpoints
 
 | Path | Description |
@@ -459,7 +501,7 @@ Recommended but not required. If Claude Code is installed and logged in, `dario
 Dario auto-refreshes tokens 30 minutes before expiry. You should never see an auth error in normal use. If something goes wrong, `dario refresh` forces an immediate refresh.
 
 **I'm getting rate limited on Opus. What do I do?**
-Use `--cli` mode: `dario proxy --cli`. This routes through the Claude Code binary, which continues working when direct API calls are rate limited. You can also enable [extra usage](https://support.claude.com/en/articles/12429409-manage-extra-usage-for-paid-claude-plans) in your Anthropic account settings to extend your limits at API rates.
+Use `--cli` mode: `dario proxy --cli`. This routes through the Claude Code binary, which continues working when direct API calls are rate limited. In default mode, dario automatically falls back to CLI when it detects a 429 (if Claude Code is installed). Rate limit errors include utilization percentages and reset times so you can see exactly when capacity returns. You can also enable [extra usage](https://support.claude.com/en/articles/12429409-manage-extra-usage-for-paid-claude-plans) in your Anthropic account settings to extend your limits at API rates.
 
 **What are the usage limits?**
 Claude subscriptions have rolling 5-hour and 7-day usage windows shared across claude.ai and Claude Code. See [Anthropic's docs](https://support.claude.com/en/articles/11647753-how-do-usage-and-length-limits-work) for details. In Claude Code, use `/usage` to check your current limits, or configure the [statusline](https://code.claude.com/docs/en/statusline) to show real-time 5h and 7d utilization percentages.
@@ -483,6 +525,9 @@ await startProxy({ port: 3456, verbose: true });
 // CLI backend mode
 await startProxy({ port: 3456, cliBackend: true, model: "opus" });
 
+// Passthrough mode (OAuth swap only, no injection)
+await startProxy({ port: 3456, passthrough: true });
+
 // Or just get a raw access token
 const token = await getAccessToken();
 
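To make the mode comparison in the README changes concrete, here is a minimal sketch of how request-body injection might differ between direct and passthrough mode. The helper name `applyInjection`, the `thinking` field name, and the Haiku check by model name are assumptions for illustration only. Billing-tag and device-identity injection are left out because this diff does not show their format.

```javascript
// Hypothetical helper, not part of dario's public API.
function applyInjection(body, { passthrough = false } = {}) {
  const out = { ...body };
  // Passthrough: OAuth swap only, so the body is forwarded unchanged.
  if (passthrough) return out;
  // Assumption: Haiku is identified by its model name.
  const supportsThinking = !/haiku/i.test(out.model ?? '');
  // Priority routing as described in the README.
  out.service_tier = 'auto';
  if (supportsThinking) {
    // Assumption: adaptive thinking travels in a `thinking` field.
    if (!out.thinking) out.thinking = { type: 'adaptive' };
    // Default effort unless the client already specified one.
    if (!out.output_config) out.output_config = { effort: 'high' };
  }
  return out;
}

const request = { model: 'claude-opus-4-6', messages: [] };
console.log(applyInjection(request));
console.log(applyInjection(request, { passthrough: true }));
```

Keeping the decision in one pure function makes it easy to see that passthrough leaves the client's request untouched, which is the property tools needing exact protocol fidelity rely on.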
package/dist/cli.js
CHANGED
@@ -9,10 +9,10 @@
  * dario refresh — Force token refresh
  * dario logout — Remove saved credentials
  */
-import {
+import { unlink } from 'node:fs/promises';
 import { join } from 'node:path';
 import { homedir } from 'node:os';
-import { startAutoOAuthFlow, getStatus, refreshTokens } from './oauth.js';
+import { startAutoOAuthFlow, getStatus, refreshTokens, loadCredentials } from './oauth.js';
 import { startProxy, sanitizeError } from './proxy.js';
 const args = process.argv.slice(2);
 const command = args[0] ?? 'proxy';
@@ -21,22 +21,14 @@ async function login() {
     console.log(' dario — Claude Login');
     console.log(' ───────────────────');
     console.log('');
-    // Check
-    const
-
-
-
-
-
-            if (expiresAt > Date.now()) {
-                console.log(' Found Claude Code credentials. Starting proxy...');
-                console.log('');
-                await proxy();
-                return;
-            }
-        }
+    // Check for existing credentials (Claude Code or dario's own)
+    const creds = await loadCredentials();
+    if (creds?.claudeAiOauth?.accessToken && creds.claudeAiOauth.expiresAt > Date.now()) {
+        console.log(' Found credentials. Starting proxy...');
+        console.log('');
+        await proxy();
+        return;
     }
-    catch { /* no Claude Code credentials, fall through to OAuth */ }
     console.log(' No Claude Code credentials found. Starting OAuth flow...');
     console.log('');
     try {
@@ -157,12 +149,11 @@ async function help() {
 `);
 }
 async function version() {
-    const { readFile } = await import('node:fs/promises');
-    const { fileURLToPath } = await import('node:url');
-    const { dirname, join } = await import('node:path');
     try {
-        const
-        const
+        const { fileURLToPath } = await import('node:url');
+        const { readFile: rf } = await import('node:fs/promises');
+        const dir = join(fileURLToPath(import.meta.url), '..', '..');
+        const pkg = JSON.parse(await rf(join(dir, 'package.json'), 'utf-8'));
         console.log(pkg.version);
     }
     catch {
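The simplified login path in this file comes down to one validity check on the loaded credentials. Below is a minimal sketch of that check, assuming the credential shape visible in the diff (`claudeAiOauth.accessToken`, and `claudeAiOauth.expiresAt` in epoch milliseconds). The function name `hasUsableToken` and the `now` parameter are hypothetical, added so the check can be exercised without reading any files.

```javascript
// Hypothetical helper mirroring the condition added to login().
function hasUsableToken(creds, now = Date.now()) {
  const oauth = creds?.claudeAiOauth;
  // Usable only if a token exists and its expiry lies in the future.
  return Boolean(oauth?.accessToken) && oauth.expiresAt > now;
}

const sample = { claudeAiOauth: { accessToken: 'example-token', expiresAt: 2000 } };
console.log(hasUsableToken(sample, 1000)); // true
console.log(hasUsableToken(sample, 3000)); // false
console.log(hasUsableToken(null, 1000));   // false
```

Optional chaining is what lets the new code drop the old `try`/`catch` wrapper: missing credentials simply evaluate to a falsy value instead of throwing.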
package/dist/oauth.js
CHANGED
@@ -5,7 +5,7 @@
  * Handles authorization, token exchange, storage, and auto-refresh.
  */
 import { randomBytes, createHash } from 'node:crypto';
-import { readFile, writeFile, mkdir,
+import { readFile, writeFile, mkdir, rename } from 'node:fs/promises';
 import { dirname, join } from 'node:path';
 import { homedir } from 'node:os';
 // Claude Code's public OAuth client (PKCE, no secret needed)
@@ -62,11 +62,6 @@ async function saveCredentials(creds) {
     const tmpPath = `${path}.tmp.${Date.now()}`;
     await writeFile(tmpPath, JSON.stringify(creds, null, 2), { mode: 0o600 });
     await rename(tmpPath, path);
-    // Set permissions (best-effort — no-op on Windows where mode is ignored)
-    try {
-        await chmod(path, 0o600);
-    }
-    catch { /* Windows ignores file modes */ }
     // Invalidate cache so next read picks up the new tokens
     credentialsCache = creds;
     credentialsCacheTime = Date.now();
@@ -222,10 +217,10 @@ async function doRefreshTokens() {
     }
     const data = await res.json();
     const tokens = {
-        ...oauth,
         accessToken: data.access_token,
         refreshToken: data.refresh_token,
         expiresAt: Date.now() + data.expires_in * 1000,
+        scopes: oauth.scopes,
     };
     await saveCredentials({ claudeAiOauth: tokens });
     return tokens;
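The refresh change replaces the `...oauth` spread with an explicit `scopes` field, so only four known fields end up in the saved credentials. Here is a minimal sketch of that construction as a pure function. `buildRefreshedTokens` and the `now` parameter are hypothetical names, the response fields follow the ones used in this hunk, and the sample values are made up.

```javascript
// Hypothetical pure version of the token object built in doRefreshTokens().
function buildRefreshedTokens(oauth, data, now = Date.now()) {
  return {
    accessToken: data.access_token,
    refreshToken: data.refresh_token,
    // expires_in is in seconds; expiresAt is stored in milliseconds.
    expiresAt: now + data.expires_in * 1000,
    // Only scopes are carried over from the previous credentials.
    scopes: oauth.scopes,
  };
}

const previous = { accessToken: 'old', scopes: ['example:scope'], extraField: 'not carried over' };
const response = { access_token: 'new', refresh_token: 'r2', expires_in: 3600 };
console.log(buildRefreshedTokens(previous, response, 0));
```

Listing the fields explicitly means any unexpected property on the old credentials object is dropped on refresh rather than silently persisted.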
package/dist/proxy.js
CHANGED
@@ -1,9 +1,9 @@
 import { createServer } from 'node:http';
 import { randomUUID, timingSafeEqual } from 'node:crypto';
 import { execSync, spawn } from 'node:child_process';
-import { readFileSync } from 'node:fs';
+import { readFileSync, readdirSync, writeFileSync, unlinkSync } from 'node:fs';
 import { join } from 'node:path';
-import { homedir } from 'node:os';
+import { homedir, tmpdir } from 'node:os';
 import { arch, platform, version as nodeVersion } from 'node:process';
 import { getAccessToken, getStatus } from './oauth.js';
 const ANTHROPIC_API = 'https://api.anthropic.com';
@@ -35,27 +35,19 @@ class Semaphore {
         next();
     }
 }
-// Detect installed Claude Code binary at startup
-
+// Detect installed Claude Code binary at startup (single exec for both version + availability)
+let cliAvailable = false;
+function detectCli() {
     try {
         const out = execSync('claude --version', { timeout: 5000, stdio: 'pipe' }).toString().trim();
-
-        return match?.[1] ?? '2.1.96';
+        cliAvailable = true;
+        return out.match(/^([\d.]+)/)?.[1] ?? '2.1.96';
     }
     catch {
+        cliAvailable = false;
         return '2.1.96';
     }
 }
-let cliAvailable = false;
-function detectCliAvailable() {
-    try {
-        execSync('claude --version', { timeout: 5000, stdio: 'pipe' });
-        return true;
-    }
-    catch {
-        return false;
-    }
-}
 /** Convert a non-streaming Messages API response to SSE event stream. */
 function jsonToSse(jsonBody) {
     try {
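The merged `detectCli()` derives the version from the start of the `claude --version` output with a single regular expression. Below is a minimal sketch of just that parsing step, with the command execution factored out so it runs anywhere. `parseCliVersion` is a hypothetical name, and the sample strings are assumed output formats, not captured from a real install.

```javascript
const FALLBACK_VERSION = '2.1.96'; // Same default the diff uses when detection fails.

// Hypothetical helper: extract a leading dotted version, else fall back.
function parseCliVersion(output) {
  return output.trim().match(/^([\d.]+)/)?.[1] ?? FALLBACK_VERSION;
}

console.log(parseCliVersion('2.1.100 (Claude Code)\n')); // 2.1.100
console.log(parseCliVersion('unexpected banner'));       // 2.1.96
```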
@@ -86,6 +78,40 @@ function jsonToSse(jsonBody) {
         return '';
     }
 }
+/** Convert CLI JSON response to OpenAI SSE format. */
+function jsonToOpenaiSse(jsonBody) {
+    try {
+        const parsed = JSON.parse(jsonBody);
+        const text = parsed.content?.find(c => c.type === 'text')?.text ?? '';
+        const ts = Math.floor(Date.now() / 1000);
+        return `data: ${JSON.stringify({ id: 'chatcmpl-dario', object: 'chat.completion.chunk', created: ts, model: 'claude', choices: [{ index: 0, delta: { content: text }, finish_reason: null }] })}\n\n` +
+            `data: ${JSON.stringify({ id: 'chatcmpl-dario', object: 'chat.completion.chunk', created: ts, model: 'claude', choices: [{ index: 0, delta: {}, finish_reason: 'stop' }] })}\n\ndata: [DONE]\n\n`;
+    }
+    catch {
+        return '';
+    }
+}
+/** Send a CLI result to the client, handling streaming/format translation. */
+function sendCliResponse(res, cliResult, clientWantsStream, isOpenAI, corsOrigin, securityHeaders) {
+    const headers = { 'Access-Control-Allow-Origin': corsOrigin, ...securityHeaders };
+    const ok = cliResult.status >= 200 && cliResult.status < 300;
+    if (ok && clientWantsStream) {
+        const sseData = isOpenAI ? jsonToOpenaiSse(cliResult.body) : jsonToSse(cliResult.body);
+        if (sseData) {
+            res.writeHead(200, { 'Content-Type': 'text/event-stream', ...headers });
+            res.end(sseData);
+            return;
+        }
+    }
+    if (ok && isOpenAI) {
+        try {
+            cliResult.body = JSON.stringify(anthropicToOpenai(JSON.parse(cliResult.body)));
+        }
+        catch { }
+    }
+    res.writeHead(cliResult.status, { 'Content-Type': cliResult.contentType, ...headers });
+    res.end(cliResult.body);
+}
 const SESSION_ID = randomUUID();
 const OS_NAME = platform === 'win32' ? 'Windows' : platform === 'darwin' ? 'MacOS' : 'Linux';
 // Claude Code device identity — required for Max plan billing classification.
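The new `jsonToOpenaiSse` helper turns one complete CLI response into a fixed three-frame OpenAI-style stream: a content chunk, a stop chunk, and the `[DONE]` sentinel. The sketch below restates the same construction compactly, then splits the result back into frames to show what a client would receive. `toOpenaiSse` and `parseSseFrames` are hypothetical names, the `ts` parameter is added only to make the output deterministic, and the sample response body is made up.

```javascript
// Compact restatement of the conversion shown in this hunk (hypothetical name).
function toOpenaiSse(jsonBody, ts = Math.floor(Date.now() / 1000)) {
  try {
    const parsed = JSON.parse(jsonBody);
    const text = parsed.content?.find((c) => c.type === 'text')?.text ?? '';
    const chunk = (delta, finish_reason) => ({
      id: 'chatcmpl-dario', object: 'chat.completion.chunk', created: ts, model: 'claude',
      choices: [{ index: 0, delta, finish_reason }],
    });
    return `data: ${JSON.stringify(chunk({ content: text }, null))}\n\n` +
      `data: ${JSON.stringify(chunk({}, 'stop'))}\n\n` +
      'data: [DONE]\n\n';
  } catch {
    return ''; // Same fallback as the original: empty string on unparsable input.
  }
}

// Hypothetical helper: split an SSE payload into its data fields.
function parseSseFrames(sse) {
  return sse.split('\n\n').filter(Boolean).map((frame) => frame.replace(/^data: /, ''));
}

const body = JSON.stringify({ content: [{ type: 'text', text: 'Hello' }] });
const frames = parseSseFrames(toOpenaiSse(body, 0));
console.log(frames.length);                                   // 3
console.log(JSON.parse(frames[0]).choices[0].delta.content);  // Hello
console.log(frames[2]);                                       // [DONE]
```

Because the CLI returns the whole answer at once, the stream carries the full text in a single content chunk. A client sees valid SSE framing but no incremental tokens.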
@@ -100,7 +126,7 @@ function loadClaudeIdentity() {
     // Also check backup files as fallback
     try {
         const backupDir = join(homedir(), '.claude', 'backups');
-        const files =
+        const files = readdirSync(backupDir);
         const backups = files
             .filter((f) => f.startsWith('.claude.json.backup.'))
             .sort()
@@ -180,28 +206,6 @@ function sanitizeMessages(body) {
         }
     }
 }
-let lastTokenSnapshot = null;
-function checkTokenAnomalies(usage, requestId) {
-    const current = {
-        inputTokens: usage.input_tokens ?? 0,
-        outputTokens: usage.output_tokens ?? 0,
-        cacheRead: usage.cache_read_input_tokens ?? 0,
-    };
-    if (lastTokenSnapshot && lastTokenSnapshot.inputTokens > 0) {
-        const growth = (current.inputTokens - lastTokenSnapshot.inputTokens) / lastTokenSnapshot.inputTokens;
-        if (growth > 0.6) {
-            const pct = Math.round(growth * 100);
-            console.warn(`[dario] TOKEN WARN ${requestId}: Input grew ${pct}% (${lastTokenSnapshot.inputTokens} → ${current.inputTokens}). Possible full replay.`);
-        }
-        if (current.outputTokens > lastTokenSnapshot.outputTokens * 2 && current.outputTokens > 2000) {
-            console.warn(`[dario] TOKEN WARN ${requestId}: Output explosion ${current.outputTokens} tokens (${Math.round(current.outputTokens / lastTokenSnapshot.outputTokens)}x previous).`);
-        }
-    }
-    lastTokenSnapshot = current;
-}
-// Extended context fallback — cooldown after 1M context failure
-let extendedContextUnavailableAt = 0;
-const EXTENDED_CONTEXT_COOLDOWN_MS = 60 * 60 * 1000; // 1 hour
 // OpenAI model names → Anthropic (fallback if client sends GPT names)
 const OPENAI_MODEL_MAP = {
     'gpt-5.4': 'claude-opus-4-6',
@@ -361,14 +365,25 @@ async function handleViaCli(body, model, verbose) {
         const historyText = history.map(m => `${m.role}: ${typeof m.content === 'string' ? m.content : JSON.stringify(m.content)}`).join('\n');
         systemPrompt = systemPrompt ? `${systemPrompt}\n\nConversation history:\n${historyText}` : `Conversation history:\n${historyText}`;
     }
+    // Write system prompt to temp file instead of passing as arg to avoid E2BIG
+    // on large conversation contexts (OS arg size limit ~2MB)
+    let systemPromptFile = null;
     if (systemPrompt) {
-
+        systemPromptFile = join(tmpdir(), `dario-sysprompt-${randomUUID()}.txt`);
+        writeFileSync(systemPromptFile, systemPrompt, { mode: 0o600 });
+        args.push('--append-system-prompt-file', systemPromptFile);
     }
     if (verbose) {
         console.log(`[dario:cli] model=${effectiveModel} prompt=${prompt.substring(0, 60)}...`);
     }
     // Spawn claude --print
     return new Promise((resolve) => {
+        // Cleanup temp file when done
+        const cleanup = () => { if (systemPromptFile)
+            try {
+                unlinkSync(systemPromptFile);
+            }
+            catch { } };
         const child = spawn('claude', args, {
             stdio: ['pipe', 'pipe', 'pipe'],
             timeout: 300_000,
@@ -383,6 +398,7 @@ async function handleViaCli(body, model, verbose) {
         child.stdin.write(prompt);
         child.stdin.end();
         child.on('close', (code) => {
+            cleanup();
             if (code !== 0 || !stdout.trim()) {
                 resolve({
                     status: 502,
@@ -410,6 +426,7 @@ async function handleViaCli(body, model, verbose) {
             resolve({ status: 200, body: JSON.stringify(response), contentType: 'application/json' });
         });
         child.on('error', (err) => {
+            cleanup();
             resolve({
                 status: 502,
                 body: JSON.stringify({ type: 'error', error: { type: 'api_error', message: 'Claude CLI not found. Install Claude Code first.' } }),
@@ -436,8 +453,7 @@ export async function startProxy(opts = {}) {
         console.error('[dario] Not authenticated. Run `dario login` first.');
         process.exit(1);
     }
-    const cliVersion =
-    cliAvailable = detectCliAvailable();
+    const cliVersion = detectCli();
     const modelOverride = opts.model ? (MODEL_ALIASES[opts.model] ?? opts.model) : null;
     const identity = loadClaudeIdentity();
     if (identity.deviceId) {
@@ -610,41 +626,7 @@ export async function startProxy(opts = {}) {
             }
             const cliResult = await handleViaCli(cliBody, modelOverride, verbose);
             requestCount++;
-
-                // Client requested streaming — convert CLI JSON to SSE
-                if (isOpenAI) {
-                    try {
-                        const parsed = JSON.parse(cliResult.body);
-                        const text = parsed.content?.find(c => c.type === 'text')?.text ?? '';
-                        const ts = Math.floor(Date.now() / 1000);
-                        let sseData = `data: ${JSON.stringify({ id: 'chatcmpl-dario', object: 'chat.completion.chunk', created: ts, model: 'claude', choices: [{ index: 0, delta: { content: text }, finish_reason: null }] })}\n\n`;
-                        sseData += `data: ${JSON.stringify({ id: 'chatcmpl-dario', object: 'chat.completion.chunk', created: ts, model: 'claude', choices: [{ index: 0, delta: {}, finish_reason: 'stop' }] })}\n\ndata: [DONE]\n\n`;
-                        res.writeHead(200, { 'Content-Type': 'text/event-stream', 'Access-Control-Allow-Origin': corsOrigin, ...SECURITY_HEADERS });
-                        res.end(sseData);
-                    }
-                    catch {
-                        res.writeHead(cliResult.status, { 'Content-Type': cliResult.contentType, 'Access-Control-Allow-Origin': corsOrigin, ...SECURITY_HEADERS });
-                        res.end(cliResult.body);
-                    }
-                }
-                else {
-                    const sseData = jsonToSse(cliResult.body);
-                    res.writeHead(200, { 'Content-Type': 'text/event-stream', 'Access-Control-Allow-Origin': corsOrigin, ...SECURITY_HEADERS });
-                    res.end(sseData);
-                }
-            }
-            else {
-                // Non-streaming or error — translate and return as JSON
-                if (isOpenAI && cliResult.status >= 200 && cliResult.status < 300) {
-                    try {
-                        const parsed = JSON.parse(cliResult.body);
-                        cliResult.body = JSON.stringify(anthropicToOpenai(parsed));
-                    }
-                    catch { /* send as-is */ }
-                }
-                res.writeHead(cliResult.status, { 'Content-Type': cliResult.contentType, 'Access-Control-Allow-Origin': corsOrigin, ...SECURITY_HEADERS });
-                res.end(cliResult.body);
-            }
+            sendCliResponse(res, cliResult, clientWantsStream, isOpenAI, corsOrigin, SECURITY_HEADERS);
             return;
         }
         // Parse body once, apply OpenAI translation, model override, and sanitization
@@ -654,10 +636,6 @@ export async function startProxy(opts = {}) {
             const parsed = JSON.parse(body.toString());
             // Strip orchestration tags from messages (Aider, Cursor, etc.)
             sanitizeMessages(parsed);
-            // Handle 1M context: strip [1m] suffix if in cooldown
-            if (modelOverride?.includes('[1m]') && extendedContextUnavailableAt > 0 && Date.now() - extendedContextUnavailableAt < EXTENDED_CONTEXT_COOLDOWN_MS) {
-                parsed.model = modelOverride.replace('[1m]', '');
-            }
             const result = isOpenAI ? openaiToAnthropic(parsed, modelOverride) : (modelOverride ? { ...parsed, model: modelOverride } : parsed);
             const r = result;
             // In passthrough mode, skip all Claude-specific injection — OAuth swap only
@@ -687,7 +665,8 @@ export async function startProxy(opts = {}) {
                     r.service_tier = 'auto';
                 }
                 // Set reasoning effort (pass through client value or default)
-
+                // Haiku does not support the effort parameter
+                if (supportsThinking && !r.output_config) {
                     r.output_config = { effort: 'high' };
                 }
                 // Enable context management (matches Claude Code default)
@@ -774,74 +753,19 @@ export async function startProxy(opts = {}) {
                 res.end(enriched);
                 return;
             }
-            // Auto-fallback: if API returns 429 and CLI is available, retry through CLI binary
-            // The CLI gets priority routing from Anthropic's server — a separate rate limit pool
-            // that continues working when the direct API quota is exhausted for expensive models.
+            // Auto-fallback: if API returns 429 and CLI is available, retry through CLI binary
             if (upstream.status === 429 && cliAvailable && !useCli) {
-                // Drain the upstream response
                 await upstream.text().catch(() => { });
                 if (verbose)
                     console.log(`[dario] #${requestCount} 429 from API — falling back to CLI`);
-                // Determine if the client requested streaming
                 let clientWantsStream = false;
-
-
-                        const p = JSON.parse(body.toString());
-                        clientWantsStream = !!p.stream;
-                    }
-                    catch { }
+                try {
+                    clientWantsStream = !!JSON.parse(body.toString()).stream;
                 }
+                catch { }
                 const cliResult = await handleViaCli(body, modelOverride, verbose);
                 requestCount++;
-
-                    if (isOpenAI) {
-                        // Translate to OpenAI format
-                        try {
-                            const parsed = JSON.parse(cliResult.body);
-                            cliResult.body = JSON.stringify(anthropicToOpenai(parsed));
-                        }
-                        catch { }
-                    }
-                    if (clientWantsStream && !isOpenAI) {
-                        // Client requested SSE streaming — convert CLI JSON to SSE events
-                        const sseData = jsonToSse(cliResult.body);
-                        res.writeHead(200, {
-                            'Content-Type': 'text/event-stream',
-                            'Access-Control-Allow-Origin': corsOrigin,
-                            ...SECURITY_HEADERS,
-                        });
-                        res.end(sseData);
-                    }
-                    else if (clientWantsStream && isOpenAI) {
-                        // OpenAI streaming — convert Anthropic JSON to OpenAI SSE
-                        try {
-                            const parsed = JSON.parse(cliResult.body);
-                            const text = parsed.content?.find(c => c.type === 'text')?.text ?? '';
-                            const ts = Math.floor(Date.now() / 1000);
-                            let sseData = `data: ${JSON.stringify({ id: 'chatcmpl-dario', object: 'chat.completion.chunk', created: ts, model: 'claude', choices: [{ index: 0, delta: { content: text }, finish_reason: null }] })}\n\n`;
-                            sseData += `data: ${JSON.stringify({ id: 'chatcmpl-dario', object: 'chat.completion.chunk', created: ts, model: 'claude', choices: [{ index: 0, delta: {}, finish_reason: 'stop' }] })}\n\ndata: [DONE]\n\n`;
-                            res.writeHead(200, {
-                                'Content-Type': 'text/event-stream',
-                                'Access-Control-Allow-Origin': corsOrigin,
-                                ...SECURITY_HEADERS,
-                            });
-                            res.end(sseData);
-                        }
-                        catch {
-                            res.writeHead(cliResult.status, { 'Content-Type': cliResult.contentType, 'Access-Control-Allow-Origin': corsOrigin, ...SECURITY_HEADERS });
-                            res.end(cliResult.body);
-                        }
-                    }
-                    else {
-                        res.writeHead(cliResult.status, { 'Content-Type': cliResult.contentType, 'Access-Control-Allow-Origin': corsOrigin, ...SECURITY_HEADERS });
-                        res.end(cliResult.body);
-                    }
-                }
-                else {
-                    // CLI also failed — return the CLI error
-                    res.writeHead(cliResult.status, { 'Content-Type': cliResult.contentType, 'Access-Control-Allow-Origin': corsOrigin, ...SECURITY_HEADERS });
-                    res.end(cliResult.body);
-                }
+                sendCliResponse(res, cliResult, clientWantsStream, isOpenAI, corsOrigin, SECURITY_HEADERS);
                 return;
             }
             // Detect streaming from content-type (reliable) or body (fallback)
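After this refactor the fallback path rests on two small decisions: whether to retry through the CLI at all, and whether the client asked for streaming. Below is a minimal sketch of both as pure functions. `shouldFallBackToCli` and `wantsStream` are hypothetical names that restate the conditions visible in this hunk.

```javascript
// Retry via CLI only on a 429, when the binary was detected, and when not already in CLI mode.
function shouldFallBackToCli(status, cliAvailable, useCli) {
  return status === 429 && cliAvailable && !useCli;
}

// Read the `stream` flag from the raw request body; malformed JSON counts as non-streaming.
function wantsStream(rawBody) {
  try {
    return !!JSON.parse(rawBody.toString()).stream;
  } catch {
    return false;
  }
}

console.log(shouldFallBackToCli(429, true, false));  // true
console.log(shouldFallBackToCli(429, false, false)); // false
console.log(wantsStream('{"stream":true}'));         // true
console.log(wantsStream('not json'));                // false
```

Everything after these two decisions is delegated to `sendCliResponse`, which is why the hunk shrinks from 74 lines to 19.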
@@ -916,21 +840,6 @@ export async function startProxy(opts = {}) {
             else {
                 // Buffer and forward
                 const responseBody = await upstream.text();
-                // Check for extended context failure — cooldown to avoid repeated failures
-                if (upstream.status === 400 && responseBody.includes('extra_usage') && modelOverride?.includes('[1m]')) {
-                    extendedContextUnavailableAt = Date.now();
-                    console.warn('[dario] 1M context requires Extra Usage — falling back to standard context for 1 hour');
-                }
-                // Token anomaly detection on non-streaming responses
-                if (upstream.status >= 200 && upstream.status < 300) {
-                    try {
-                        const parsed = JSON.parse(responseBody);
-                        const usage = parsed.usage;
-                        if (usage)
-                            checkTokenAnomalies(usage, responseHeaders['request-id'] ?? '');
-                    }
-                    catch { /* ignore parse errors */ }
-                }
                 if (isOpenAI && upstream.status >= 200 && upstream.status < 300) {
                     // Translate Anthropic response → OpenAI format
                     try {
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "@askalf/dario",
-  "version": "2.8.
+  "version": "2.8.2",
   "description": "Use your Claude subscription as an API. No API key needed. Local proxy for Claude Max/Pro subscriptions.",
   "type": "module",
   "bin": {
@@ -25,7 +25,8 @@
     "prepublishOnly": "npm run build",
     "start": "node dist/cli.js",
     "dev": "tsx src/cli.ts",
-    "e2e": "node test/e2e.mjs"
+    "e2e": "node test/e2e.mjs",
+    "compat": "node test/compat.mjs"
   },
   "keywords": [
     "claude",