@askalf/dario 4.1.1 → 4.2.0
- package/README.md +64 -4
- package/dist/cli.js +51 -2
- package/package.json +1 -1
package/README.md
CHANGED
@@ -172,15 +172,24 @@ The moment any upstream response carries `representative-claim: overage`, dario
 }
 ```
 
-The TUI
+The state surfaces in four TUI tabs simultaneously — each answers a different question a user has when their bill suddenly starts moving:
+
+| Tab | Question it answers | What it renders |
+|---|---|---|
+| **Status** | What's happening RIGHT NOW? | `⚠ HALTED` banner with triggering request, cause, live countdown to auto-resume, manual-resume hint |
+| **Hits** | Which specific request triggered it? | Pinned banner across the top + red `!` marker + red row on the triggering request in the live buffer + 503-status row for any blocked-while-halted requests |
+| **Analytics** | How often is this happening across my traffic? | New "Overage" bar in the rate-limit cluster, alongside 5h/7d — red the moment count is non-zero |
+| **Config** | How do I tune this? | Four in-place-editable fields: `overageGuard.enabled`, `.behavior` (enum-validated halt/warn), `.cooldownMs`, `.notifyOs` |
+
+Status and Hits during an active halt:
 
 ```
-┌─ dario v4
+┌─ dario v4.1 ────────────────────────[ q quit · Tab next · ? help ]──┐
 │ ▎Status▎ Config Analytics Hits Accounts Backends │
 ├─────────────────────────────────────────────────────────────────────┤
 │ Overage-guard │
 │ ⚠ HALTED overage detected 12s ago │
-│ Request: claude-opus-4-7 account=
+│ Request: claude-opus-4-7 account=work │
 │ Cause: representative-claim = overage │
 │ Auto-resume in 29m 48s │
 │ Manual resume press R here, or `dario resume` from any shell │
@@ -189,10 +198,61 @@ The TUI's Status tab pins the loud version:
 └─────────────────────────────────────────────────────────────────────┘
 ```
 
+```
+┌─ dario v4.1 ────────────────────────[ q quit · Tab next · ? help ]──┐
+│ Status Config Analytics ▎Hits▎ Accounts Backends │
+├─────────────────────────────────────────────────────────────────────┤
+│ Hits 248 buffered · live │
+│ │
+│ ⚠ HALTED overage detected at 15:54:28 on opus-4-7 acct=work │
+│ → New /v1/messages return 503 until R here, or `dario resume` │
+│ │
+│ time model in out lat st │
+│ ▎15:54:31 opus-4-7 2.1k — — 503 │
+│ 15:54:29 haiku-4-5 120 24 0.3s 200 │
+│ ! 15:54:28 opus-4-7 1.4k 216 1.2s 200 ◀ red row │
+│ 15:54:25 sonnet-4-6 1.2k 480 0.8s 200 │
+│ 15:54:20 opus-4-7 842 216 1.2s 200 │
+│ ────────────────────────────────────────────────────────────── │
+│ Selected: 15:54:31 req_011Cb52VKMBsB6z6w28NvMn │
+│ Account: work │
+│ Model: claude-opus-4-7 │
+│ Billing bucket: (halted before upstream — no claim) │
+│ Status: 503 dario_overage_guard │
+└─────────────────────────────────────────────────────────────────────┘
+```
+
+Analytics — the burn-rate view, with the new Overage bar at the bottom of the rate-limit cluster (here showing one overage hit out of 248 — which is enough to halt by default):
+
+```
+┌─ dario v4.1 ────────────────────────[ q quit · Tab next · ? help ]──┐
+│ Status Config ▎Analytics▎ Hits Accounts Backends │
+├─────────────────────────────────────────────────────────────────────┤
+│ Analytics — last 60 min │
+│ │
+│ Requests: 248 (4.1/min) │
+│ Tokens in: 142,830 │
+│ Tokens out: 38,200 │
+│ Subscription %: 99% │
+│ │
+│ Rate-limit │
+│ 5h ████░░░░░░░░░░░░░░░░░░░░░░░░ 18% │
+│ 7d ██░░░░░░░░░░░░░░░░░░░░░░░░░░ 8% │
+│ Overage █░░░░░░░░░░░░░░░░░░░░░░░░░░░ 1 req of 248 │
+│ ⮤ red — the moment count is non-zero │
+│ │
+│ Billing │
+│ subscription 247 req │
+│ extra_usage 1 req │
+└─────────────────────────────────────────────────────────────────────┘
+```
+
 Why "halt at hit #1" is the right default: subscribers should never see a single overage response during normal operation. One means something is wrong — wire-shape drift, classifier change, account misconfig — and continuing to forward requests in the same shape bleeds real money for accounts with extra-usage enabled, or returns wall-of-rejections for accounts without it. The first hit is the signal; the second through hundredth are damage.
 
 **Resume paths** — `dario resume` from any shell, `R` on the TUI Status tab, or the cooldown timer (default 30 min). **Configuration** — `~/.dario/config.json` → `overageGuard`, or CLI flags (`--overage-behavior=warn` for visibility-only, `--no-overage-guard` to disable, `--overage-cooldown=<ms>` to tune). **OS notification** — best-effort native toast (osascript / notify-send / BurntToast) plus terminal BEL as the unconditional floor. See [#288](https://github.com/askalf/dario/issues/288).
 
+**Verified end-to-end live.** [`test/overage-guard-e2e-live.mjs`](./test/overage-guard-e2e-live.mjs) patches `globalThis.fetch` to mock the upstream, starts a real dario proxy in-process, and drives the five-stage halt cycle through real HTTP: subscription request flows → upstream returns overage → guard halts → next request returns 503 with the `dario_overage_guard` body → `POST /admin/resume` clears state → requests flow again. 20/20 assertions, 3 upstream calls intercepted (the halted request short-circuited at the request handler, never touched the upstream). Run with `node test/overage-guard-e2e-live.mjs`.
+
 ---
 
 ## Does it actually work?
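The halt cycle described in the README hunk above (overage response, halt, 503 for new requests, manual or timed resume) can be sketched as a small state machine. This is a minimal illustration under stated assumptions, not dario's implementation: the `OverageGuard` class and its method names are hypothetical, as is the `'subscription'` claim value used in the usage note. Only the `overage` claim, the halt/warn behaviors, and the 30-minute default cooldown come from the README text.

```javascript
// Hypothetical sketch of a halt-on-first-overage guard; NOT dario's actual code.
const DEFAULT_COOLDOWN_MS = 30 * 60 * 1000; // README: cooldown timer defaults to 30 min

class OverageGuard {
  constructor({ behavior = 'halt', cooldownMs = DEFAULT_COOLDOWN_MS } = {}) {
    this.behavior = behavior;     // 'halt' blocks traffic, 'warn' only counts hits
    this.cooldownMs = cooldownMs;
    this.haltedAt = null;         // timestamp (ms) of the triggering response, or null
    this.overageCount = 0;
  }

  // Call with the `representative-claim` value of each upstream response.
  onUpstreamResponse(claim, now = Date.now()) {
    if (claim !== 'overage') return;
    this.overageCount += 1;
    // The first hit starts the halt; later hits do not extend it.
    if (this.behavior === 'halt' && this.haltedAt === null) this.haltedAt = now;
  }

  // True while new requests should be answered with 503 instead of forwarded.
  isHalted(now = Date.now()) {
    if (this.haltedAt === null) return false;
    if (now - this.haltedAt >= this.cooldownMs) {
      this.haltedAt = null; // cooldown elapsed: auto-resume
      return false;
    }
    return true;
  }

  // Manual resume path (`dario resume`, `R` in the TUI, or POST /admin/resume).
  resume() {
    this.haltedAt = null;
  }
}
```

A request handler built on this sketch would check `guard.isHalted()` before forwarding and short-circuit with a 503 when it returns `true`, which matches the README's note that the halted request never touched the upstream. Passing `now` explicitly keeps the cooldown logic testable without real timers.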
@@ -276,7 +336,7 @@ The tool doesn't know. The backend doesn't know. Dario is the seam.
 - **Multi-account pool.** Drop 2+ Claude accounts in `~/.dario/accounts/` and pool mode auto-activates: every request routes to the account with the most headroom, multi-turn sessions pin to one account so the prompt cache survives, in-flight 429s fail over to a peer before your client sees an error. `dario accounts add work` / `dario accounts add personal`. → [`docs/multi-account-pool.md`](./docs/multi-account-pool.md)
 - **Behavioral stealth (`--stealth`).** Static wire fidelity covers *what* the request looks like; `--stealth` adds *when* it arrives — response-length-correlated think time and 1.2–4.2s session-start latency, the inter-arrival pattern real interactive sessions have and agent loops don't. → [`docs/wire-fidelity.md`](./docs/wire-fidelity.md)
 - **Runs any non-Claude-Code agent.** A 64-entry schema-verified `TOOL_MAP` pre-maps Cline, Roo, Kilo, Cursor, Windsurf, Continue, Copilot, OpenHands, OpenClaw, Hermes, [hands](https://github.com/askalf/hands) tool names to CC's native set. No flag, no validator errors. → [`docs/agent-compat.md`](./docs/agent-compat.md)
-- **Shim mode
+- **Shim mode** *(deprecated in v4.2; removal scheduled for v5.x)*. The original "no HTTP hop" path that patched `globalThis.fetch` inside a `dario shim -- <cmd>` child process. Empirically only matches 3 of the 8 wire-shape axes the billing classifier inspects (system blocks, agent identity, header order) and falls back to total passthrough when the client sends a 1-block system — which `claude -p` and Agent-SDK both do. Use **proxy mode** for any non-CC client; that's the only mode that rebuilds every request to CC's full canonical shape. Shim emits a deprecation banner on every invocation. See [CHANGELOG v4.2.0](./CHANGELOG.md) for the side-by-side fingerprint diff that drove this call.
 - **Recover output capability.** `dario proxy --system-prompt=partial` strips CC's tone/verbosity/no-comments constraints for 1.2–2.8× more output on open-ended work — empirically without flipping billing (the classifier doesn't read that slot). [Discussion #183](https://github.com/askalf/dario/discussions/183) has the per-prompt receipts. → [`docs/system-prompt.md`](./docs/system-prompt.md)
 - **Reachable from inside CC / any MCP client.** `dario subagent install` registers a CC sub-agent for in-session diagnostics; `dario mcp` exposes dario as a read-only MCP server. → [`docs/sub-agent.md`](./docs/sub-agent.md) · [`docs/mcp-server.md`](./docs/mcp-server.md)
 - **Active overage protection (v4.1).** Halts the proxy on any `representative-claim: overage` response and returns 503 to subsequent requests until you run `dario resume` or the cooldown clears. Visibility-only mode (`--overage-behavior=warn`) for operators who want the signal without disrupting traffic. Halt state visible in TUI Status/Hits/Analytics tabs, surfaced as named SSE events, and as a best-effort native desktop notification. [#288](https://github.com/askalf/dario/issues/288).
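The README names four `overageGuard` fields (`enabled`, `behavior`, `cooldownMs`, `notifyOs`) and says `behavior` is enum-validated as halt/warn. A rough sketch of what validating that block of `~/.dario/config.json` might look like follows. The function name `normalizeOverageGuard` is hypothetical, and the defaults for `enabled` and `notifyOs` are assumptions; the README only states that halting is the default behavior and that the cooldown defaults to 30 minutes.

```javascript
// Hypothetical validator for the `overageGuard` config block; NOT dario's actual code.
// Field names come from the README. Defaults for `enabled` and `notifyOs` are assumed.
const OVERAGE_GUARD_DEFAULTS = Object.freeze({
  enabled: true,               // assumption: implied by the `--no-overage-guard` opt-out flag
  behavior: 'halt',            // README: halting on the first hit is the default
  cooldownMs: 30 * 60 * 1000,  // README: cooldown timer defaults to 30 min
  notifyOs: true,              // assumption: the README does not state a default
});

function normalizeOverageGuard(raw = {}) {
  // User-supplied fields override the defaults; missing fields fall back.
  const cfg = { ...OVERAGE_GUARD_DEFAULTS, ...raw };
  if (typeof cfg.enabled !== 'boolean') {
    throw new TypeError('overageGuard.enabled must be a boolean');
  }
  if (cfg.behavior !== 'halt' && cfg.behavior !== 'warn') {
    throw new RangeError('overageGuard.behavior must be "halt" or "warn"');
  }
  if (!Number.isInteger(cfg.cooldownMs) || cfg.cooldownMs < 0) {
    throw new RangeError('overageGuard.cooldownMs must be a non-negative integer');
  }
  if (typeof cfg.notifyOs !== 'boolean') {
    throw new TypeError('overageGuard.notifyOs must be a boolean');
  }
  return cfg;
}
```

For example, `normalizeOverageGuard({ behavior: 'warn' })` would yield the visibility-only mode with every other field at its default, which corresponds to passing `--overage-behavior=warn` on the command line.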
package/dist/cli.js
CHANGED
@@ -1053,8 +1053,13 @@ async function help() {
 dario backend add NAME --key=sk-... [--base-url=...]
 Add an OpenAI-compat backend (OpenAI, OpenRouter, Groq, etc.)
 dario backend remove N Remove an OpenAI-compat backend
-dario shim -- CMD ARGS
-
+dario shim -- CMD ARGS [DEPRECATED — removal scheduled for v5.x]
+Run CMD with dario's fetch-patch in-process.
+Only replaces 3 of 8 billing-classifier axes
+(system blocks, agent identity, header order).
+Falls back to passthrough on 1-block system
+requests (\`claude -p\`, Agent SDK). Use proxy
+mode for non-CC clients. See CHANGELOG v4.2.0.
 dario subagent install Register ~/.claude/agents/dario.md so Claude Code
 can delegate dario diagnostics / template-refresh
 operations to a named sub-agent (v3.26)
@@ -1435,6 +1440,50 @@ async function shim() {
 console.error(' dario shim --priority=below-normal -- claude (recommended on Windows when RDP\'d into the host)');
 process.exit(1);
 }
+// v4.2.0+: shim mode is DEPRECATED — set for removal in v5.x.
+//
+// The empirical reason (verified by side-by-side fingerprint diff of
+// shim's `_rewriteBody` against the proxy's `buildCCRequest` — see
+// CHANGELOG v4.2.0 entry): shim mode only replaces 3 of the 8 axes
+// Anthropic's billing classifier actually inspects (system blocks,
+// agent identity, header order). It leaves the client's JSON key
+// order, max_tokens value, metadata billing tag, and any non-CC
+// fields (temperature, top_p, service_tier) unchanged on the wire.
+// And on the most-common claude -p / Agent-SDK request shape (which
+// sends a 1-block system, not the 3-block shape shim's shape-check
+// hardcodes), shim silently falls back to total passthrough — sending
+// the client's raw body to api.anthropic.com with zero replay.
+//
+// For interactive CC (`dario shim -- claude`), this is mostly a no-op
+// because CC's own outbound already matches every axis dario would
+// synthesize. But for any non-CC client (`dario shim -- aider`,
+// `dario shim -- cline`, your own scripts), shim mode does not deliver
+// the wire fidelity the README claims.
+//
+// We warn loudly here instead of silently breaking, and point users
+// at proxy mode. Set DARIO_SHIM_NO_DEPRECATION_WARNING=1 to suppress
+// the banner for scripts that need the exit-code semantics but have
+// already migrated their understanding.
+if (process.env['DARIO_SHIM_NO_DEPRECATION_WARNING'] !== '1') {
+console.error('');
+console.error('[dario] ⚠ DEPRECATION: `dario shim` is deprecated in v4.2 and scheduled for removal in v5.x.');
+console.error('[dario]');
+console.error('[dario] Shim mode only matches a subset of the wire-shape axes Anthropic\'s billing classifier');
+console.error('[dario] inspects. Specifically, it does not normalize JSON key order, max_tokens, metadata');
+console.error('[dario] billing-tag, or non-CC body fields. On `claude -p` / Agent-SDK style requests (1-block');
+console.error('[dario] system), shim falls back to total passthrough — the client\'s raw body reaches the');
+console.error('[dario] upstream unchanged.');
+console.error('[dario]');
+console.error('[dario] Use proxy mode instead, which rebuilds every request to CC\'s exact wire shape:');
+console.error('[dario]');
+console.error('[dario] Terminal 1: dario proxy');
+console.error('[dario] Terminal 2: ANTHROPIC_BASE_URL=http://localhost:3456 \\');
+console.error('[dario] ANTHROPIC_API_KEY=dario \\');
+console.error('[dario] ' + childArgs.join(' '));
+console.error('[dario]');
+console.error('[dario] To suppress this banner: DARIO_SHIM_NO_DEPRECATION_WARNING=1');
+console.error('');
+}
 const { runShim } = await import('./shim/host.js');
 try {
 const result = await runShim({
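The banner logic in the hunk above comes down to two steps: an environment-variable gate, and rendering the equivalent proxy-mode command from the child arguments. The sketch below isolates both so they can be reasoned about separately. All three function names are hypothetical and do not appear in `cli.js`. The quoting helper is an added assumption: the shipped code joins `childArgs` with plain spaces, so an argument containing a space would not survive copy-paste from the banner.

```javascript
// Hypothetical helpers mirroring the banner logic; names are illustrative, NOT from cli.js.

// The banner is suppressed only when the variable is exactly the string '1'.
function shouldShowShimDeprecation(env) {
  return env['DARIO_SHIM_NO_DEPRECATION_WARNING'] !== '1';
}

// POSIX-style single-quote escaping (an assumption, not in the shipped banner),
// so arguments containing spaces or quotes can be pasted back into a shell.
function shellQuote(arg) {
  if (/^[A-Za-z0-9_\/:=.,@%+-]+$/.test(arg)) return arg; // safe as a bare word
  return "'" + arg.replace(/'/g, "'\\''") + "'";
}

// Builds the one-line proxy-mode equivalent that the banner suggests.
function proxyCommandFor(childArgs) {
  return [
    'ANTHROPIC_BASE_URL=http://localhost:3456',
    'ANTHROPIC_API_KEY=dario',
    ...childArgs.map(shellQuote),
  ].join(' ');
}
```

Note that values such as `true` or `yes` would not suppress the banner under this gate, since the comparison is against the literal string `'1'`.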
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "@askalf/dario",
-"version": "4.
+"version": "4.2.0",
 "description": "Use your Claude Pro/Max subscription in any tool — Cursor, Cline, Aider, the Agent SDK, your scripts — at subscription pricing, not per-token API bills. One local Anthropic + OpenAI-compatible endpoint.",
 "type": "module",
 "bin": {