@askalf/dario 3.37.20 → 3.38.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (7)

package/README.md CHANGED

@@ -62,6 +62,33 @@ Something not right? `dario doctor` prints a single paste-ready health report. P
 ---
+## The 2026-06-15 billing cliff
+Starting **2026-06-15**, Anthropic splits Claude plan usage into two separate pools. The one that matters for coding agents:
+| Plan | New Agent-SDK / `claude -p` credit | What happens when it runs out |
+|---|---|---|
+| Pro | **$20/mo** | Per-token API pricing |
+| Max 5x | **$100/mo** | Per-token API pricing |
+| Max 20x | **$200/mo** | Per-token API pricing |
+A sustained Cline or Aider session burns $100 of API-rate tokens in an evening. Agentic loops blow past $200 in days. Any proxy that forwards requests in their original Agent-SDK or `claude -p` wire shape — which is most of them — puts your agentic traffic into this new credit pool instead of your subscription pool. Once it's gone, you're on metered pricing.
+**Dario doesn't.** Every outbound request is rebuilt as **interactive Claude Code wire-shape** before it leaves your machine: headers, body key order, TLS stack, session-id lifecycle — the same six axes the live template extractor has been closing since v3.22. Anthropic's billing classifier sees an interactive CC session. Your traffic stays in the subscription pool you already pay for.
+| Your setup | Post-2026-06-15 billing path |
+|---|---|
+| Any tool → Anthropic API direct | Per-token API |
+| Any tool → proxy forwarding requests as-is | **Agent-SDK credit ($20–200/mo cap), then per-token API** |
+| **Any tool → dario** | **Subscription pool — unchanged** |
+| Claude Code interactive | Subscription pool — unchanged |
+Same install. Same `localhost:3456`. No config change needed for the cliff.
+**Verify it's working:** `dario doctor --usage` fires one Haiku request through your OAuth and surfaces the rate-limit headers Anthropic returned. The `representative-claim` field should read `five_hour` or `seven_day` — subscription billing buckets. If it reads anything else after 2026-06-15 lands, [file an issue](https://github.com/askalf/dario/issues/new) — that's what the drift detector exists to catch. Full technical breakdown + post-cliff verification procedure: [`docs/why-now-2026-06.md`](./docs/why-now-2026-06.md).
+---
 ## What it actually does
 You point every tool at one URL. Dario reads each request, decides which backend owns it, and forwards in that backend's native protocol.
@@ -96,29 +123,6 @@ Already have **Pro + Max** stacked? Pool mode (`dario accounts add work` / `dari
 ---
-## What changes 2026-06-15 (and why dario doesn't)
-Anthropic announced that starting **2026-06-15**, Claude Agent SDK and `claude -p` (Claude Code headless mode) usage will no longer count toward Claude plan usage limits. Eligible plans get a separate fixed monthly credit instead — **$20/mo on Pro, $100/mo on Max 5x, $200/mo on Max 20x**. Once exhausted, those calls go to metered API pricing.
-For autonomous workloads — long-running coding agents, background scripts, anything that runs unattended — that credit pool is small. A single sustained Cline or Aider session can chew through $100 in an evening; agentic workloads blow past $200 in days.
-**Dario is unaffected by this split.** The Claude backend was designed from day one to send requests as **interactive Claude Code wire-shape** — full template replay from your installed CC binary (headers, body key order, TLS stack, session-id lifecycle). The upstream billing path sees a Claude Code interactive session, not a `claude -p` invocation or an Agent SDK call, regardless of what tool actually originated the request locally. That's the whole point of the wire-fidelity work in [`docs/wire-fidelity.md`](./docs/wire-fidelity.md) — and it was already shipping in production before the 2026-06-15 announcement.
-What this means concretely:
-| Setup | Pre-2026-06-15 | Post-2026-06-15 |
-|---|---|---|
-| Cline / Aider / Cursor → Anthropic API direct with API key | Per-token API billing | Per-token API billing (unchanged) |
-| Cline / Aider / Cursor → naive proxy that passes `claude -p` / Agent SDK calls through unchanged | Subscription pool | **Separate $20–200/mo credit pool, then per-token API** |
-| **Cline / Aider / Cursor → dario** | **Subscription pool** | **Subscription pool (unchanged — dario rewrites as interactive CC)** |
-| Claude Code itself, used interactively | Subscription pool | Subscription pool (unchanged) |
-If you're coming from a proxy that doesn't replay the full CC wire shape, your agentic workloads are about to land in the smaller credit bucket on 2026-06-15. Dario keeps them on the subscription pool you already pay for.
-Same install. Same `localhost:3456`. No config change needed for the cliff. Full technical breakdown — including the two diagnostic checks you can run after 2026-06-15 lands to verify the classification on your own setup — is in [`docs/why-now-2026-06.md`](./docs/why-now-2026-06.md).
----
 ## Why you'll install this
 - **One URL for every provider.** Cursor, Aider, Continue, Zed, OpenHands, Claude Code, your own scripts — every tool you own has its own per-provider config. Dario collapses that into a single `localhost:3456` that speaks both Anthropic and OpenAI protocols and routes by model name.
@@ -378,7 +382,6 @@ MIT — see [LICENSE](LICENSE) and [DISCLAIMER.md](DISCLAIMER.md).
 | Project | What it does |
 |---------|-------------|
 | [arnie](https://github.com/askalf/arnie) | Portable IT troubleshooting companion. Networking, AD, Windows Update, package managers, log triage, hardware checks. |
-| [brio](https://github.com/askalf/brio) | Capability layer for AI workloads — semantic cache, cost tiering, policy. Sits in front of any Anthropic-compat endpoint. |
 | [browser-bridge](https://github.com/askalf/browser-bridge) | Stealth headless Chromium in a container. CDP on 9222 — Playwright/Puppeteer/MCP-compatible. |
 | [claude-bridge](https://github.com/askalf/claude-bridge) | Bridge Claude Code sessions to Discord. |
 | [deepdive](https://github.com/askalf/deepdive) | Local research agent. Plan → search → fetch → extract → synthesize. Cited answers. |

package/dist/cli.js CHANGED

@@ -251,6 +251,16 @@ async function proxy() {
     // calc lives in src/pacing.ts; the flags just feed it.
     const pacingMinMs = parsePositiveIntFlag('--pace-min=');
     const pacingJitterMs = parsePositiveIntFlag('--pace-jitter=');
+    // --think-time-* / --session-start-* — behavioral smoothing extension.
+    // Closes the temporal axis the wire-fidelity work doesn't touch:
+    // response-length-correlated read time between requests, and per-
+    // session opening latency. All defaults 0 = off (opt-in).
+    const thinkTimeBaseMs = parsePositiveIntFlag('--think-time-base=');
+    const thinkTimePerTokenMs = parsePositiveIntFlag('--think-time-per-token=');
+    const thinkTimeJitterMs = parsePositiveIntFlag('--think-time-jitter=');
+    const thinkTimeMaxMs = parsePositiveIntFlag('--think-time-max=');
+    const sessionStartMinMs = parsePositiveIntFlag('--session-start-min=');
+    const sessionStartJitterMs = parsePositiveIntFlag('--session-start-jitter=');
     // --drain-on-close (v3.25, direction #5). When set, a client
     // disconnect no longer aborts the upstream SSE — dario keeps
     // draining the stream to EOF so Anthropic sees the CC-shaped
@@ -389,7 +399,7 @@ async function proxy() {
         console.error(`[dario] Override (not recommended): pass --unsafe-no-auth if you have out-of-band network controls and accept the risk.`);
         process.exit(1);
     }
-    await startProxy({ port, host, verbose, verboseBodies, model, passthrough, preserveTools, hybridTools, mergeTools, noAutoDetect, strictTls, pacingMinMs, pacingJitterMs, drainOnClose, sessionIdleRotateMs, sessionRotateJitterMs, sessionMaxAgeMs, sessionPerClient, preserveOrchestrationTags, noLiveCapture, strictTemplate, maxConcurrent, maxQueued, queueTimeoutMs, effort, maxTokens, logFile, passthroughBetas, systemPrompt });
+    await startProxy({ port, host, verbose, verboseBodies, model, passthrough, preserveTools, hybridTools, mergeTools, noAutoDetect, strictTls, pacingMinMs, pacingJitterMs, thinkTimeBaseMs, thinkTimePerTokenMs, thinkTimeJitterMs, thinkTimeMaxMs, sessionStartMinMs, sessionStartJitterMs, drainOnClose, sessionIdleRotateMs, sessionRotateJitterMs, sessionMaxAgeMs, sessionPerClient, preserveOrchestrationTags, noLiveCapture, strictTemplate, maxConcurrent, maxQueued, queueTimeoutMs, effort, maxTokens, logFile, passthroughBetas, systemPrompt });
 }
 /**
  * Parse `--system-prompt=<verbatim|partial|aggressive|filepath>` (or the
@@ -1013,6 +1023,37 @@ async function help() {
                              Default: 0 (off). Set to e.g. 300 to hide
                              the floor from long-run inter-arrival
                              statistics. (v3.24)
+    --think-time-base=MS     Post-response "think time" base — constant
+                             ms added before the next request fires.
+                             Models the wall-clock pause between an
+                             interactive CC user reading a response and
+                             typing the next message. Default: 0 (off).
+                             Env: DARIO_THINK_TIME_BASE_MS.
+    --think-time-per-token=MS
+                             Additional ms per output token of the
+                             previous response (linear). e.g. 5 → a
+                             1000-token response adds 5s of read time
+                             before the next request. Default: 0.
+                             Env: DARIO_THINK_TIME_PER_TOKEN_MS.
+    --think-time-jitter=MS   Max uniform-random jitter on top of
+                             base+perToken*tokens. Hides the formula
+                             from long-run inter-arrival statistics.
+                             Default: 0.
+                             Env: DARIO_THINK_TIME_JITTER_MS.
+    --think-time-max=MS      Upper bound on think time so a 50k-token
+                             response doesn't pause for minutes.
+                             Default: 30000 (30s).
+                             Env: DARIO_THINK_TIME_MAX_MS.
+    --session-start-min=MS   Floor on session-start delay — applied to
+                             the first request only (lastResponseTime
+                             === 0). Real CC sessions open with seconds
+                             of startup latency, not microseconds.
+                             Default: 0 (off).
+                             Env: DARIO_SESSION_START_MIN_MS.
+    --session-start-jitter=MS
+                             Max uniform-random jitter on session-start
+                             delay. Default: 0.
+                             Env: DARIO_SESSION_START_JITTER_MS.
     --drain-on-close         When the client disconnects mid-stream,
                              keep consuming the upstream SSE to EOF
                              so Anthropic sees the same read-to-

package/dist/pacing.d.ts CHANGED

@@ -60,3 +60,76 @@ export declare function resolvePacingConfig(explicit?: {
     minGapMs?: number;
     jitterMs?: number;
 }, env?: NodeJS.ProcessEnv): PacingConfig;
+/**
+ * Post-response "think time" simulation (behavioral smoothing extension).
+ *
+ * Inter-request `computePacingDelay` enforces a floor on the wall-clock
+ * distance between two outbound requests. Think time models the
+ * orthogonal axis: how long a real interactive Claude Code user would
+ * spend reading a response before sending the next message. Without it,
+ * agentic loops fire the next request as fast as the client can stamp
+ * one out, which creates an inter-arrival distribution that's
+ * structurally absent in real interactive sessions (read-then-type has
+ * variance correlated with response length; agent loops don't).
+ *
+ *   delay = baseMs + perTokenMs * lastResponseTokens + U(0, jitterMs)
+ *
+ * Then clamped to [0, maxMs] and reduced by elapsed time since the
+ * response completed (so a slow downstream consumer doesn't double-pay).
+ *
+ * `lastResponseTime === 0` returns 0 — there's no response to read on
+ * the first request of a session. Session-start jitter is a separate
+ * function (`computeSessionStartDelay`) since it has different semantics.
+ */
+export interface ThinkTimeConfig {
+    /** Constant ms added to every think-time sample, regardless of tokens. */
+    baseMs: number;
+    /** Additional ms per output token of the previous response (linear). */
+    perTokenMs: number;
+    /** Max uniform-random jitter (ms) added on top. */
+    jitterMs: number;
+    /** Upper bound on think time. Prevents pathological pauses on very long responses. */
+    maxMs: number;
+}
+export declare function computeThinkTimeDelay(now: number, lastResponseTime: number, lastResponseTokens: number, cfg: ThinkTimeConfig, rng?: () => number): number;
+/**
+ * Resolve a ThinkTimeConfig from explicit options, env vars, and
+ * defaults. All defaults are 0 — feature is opt-in. `maxMs` defaults to
+ * 30000 (30s) when any think-time knob is enabled and the user hasn't
+ * set their own cap; on a fully-disabled config the cap doesn't matter
+ * since the short-circuit above returns 0 first.
+ */
+export declare function resolveThinkTimeConfig(explicit?: {
+    baseMs?: number;
+    perTokenMs?: number;
+    jitterMs?: number;
+    maxMs?: number;
+}, env?: NodeJS.ProcessEnv): ThinkTimeConfig;
+/**
+ * Session-start delay (behavioral smoothing extension).
+ *
+ * Every new single-account session — first request after startup, first
+ * request after a session-id rotation — currently fires at machine
+ * speed (lastRequestTime resets to 0, computePacingDelay returns 0).
+ * Every session opens with an identical zero-delay first request, which
+ * is a detectable signal on long-run traffic statistics. Real CC users
+ * open a new session by opening the binary and typing a prompt — that's
+ * seconds of latency, not microseconds.
+ *
+ *   delay = minMs + U(0, jitterMs)
+ *
+ * Returns the sampled delay directly (no elapsed-time check — this is a
+ * one-shot delay applied to the first request of a session, before any
+ * upstream call has happened).
+ */
+export interface SessionStartConfig {
+    /** Constant ms floor for session-start delay. */
+    minMs: number;
+    /** Max uniform-random jitter (ms) added on top. */
+    jitterMs: number;
+}
+export declare function computeSessionStartDelay(cfg: SessionStartConfig, rng?: () => number): number;
+export declare function resolveSessionStartConfig(explicit?: {
+    minMs?: number;
+    jitterMs?: number;
+}, env?: NodeJS.ProcessEnv): SessionStartConfig;
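
The think-time formula documented above can be checked with a small worked example. This is a sketch only: the function body mirrors `computeThinkTimeDelay` as it appears in the `pacing.js` hunk of this diff, while the config values, timestamps, and the fixed `rng` are illustrative assumptions rather than anything the package ships.

```javascript
// Sketch mirroring computeThinkTimeDelay from the pacing.js hunk in this diff.
function computeThinkTimeDelay(now, lastResponseTime, lastResponseTokens, cfg, rng = Math.random) {
  if (lastResponseTime <= 0) return 0; // first request of a session: nothing to read yet
  const base = Math.max(0, cfg.baseMs);
  const perToken = Math.max(0, cfg.perTokenMs);
  const jitter = Math.max(0, cfg.jitterMs);
  const max = Math.max(0, cfg.maxMs);
  const tokens = Math.max(0, lastResponseTokens);
  if (base === 0 && perToken === 0 && jitter === 0) return 0; // feature is off
  const jitterAdd = jitter > 0 ? Math.floor(rng() * jitter) : 0;
  let target = base + perToken * tokens + jitterAdd;
  if (max > 0 && target > max) target = max; // cap very long responses
  const elapsed = now - lastResponseTime;
  return elapsed >= target ? 0 : target - elapsed; // only wait the remainder
}

// ASSUMPTION: illustrative knob values and a deterministic rng for the example.
const cfg = { baseMs: 1000, perTokenMs: 5, jitterMs: 2000, maxMs: 30000 };
const fixedRng = () => 0.5;

// 1000-token response finished at t=10000, next request arrives at t=12000:
// target = 1000 + 5*1000 + floor(0.5*2000) = 7000, minus 2000 already elapsed.
const shortWait = computeThinkTimeDelay(12000, 10000, 1000, cfg, fixedRng);

// 50k-token response: the raw target of 252000 is clamped to maxMs.
const cappedWait = computeThinkTimeDelay(10000, 10000, 50000, cfg, fixedRng);

// No previous response (lastResponseTime === 0) means no think time.
const firstRequest = computeThinkTimeDelay(12000, 0, 1000, cfg, fixedRng);

// The client was already slower than the target, so no extra wait is added.
const lateArrival = computeThinkTimeDelay(20000, 10000, 1000, cfg, fixedRng);

console.log({ shortWait, cappedWait, firstRequest, lateArrival });
```

The last case shows the "doesn't double-pay" behaviour from the doc comment: time the client already spent counts against the target.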

package/dist/pacing.js CHANGED

@@ -76,3 +76,51 @@ function pickNonNegativeInt(...candidates) {
     }
     return undefined;
 }
+export function computeThinkTimeDelay(now, lastResponseTime, lastResponseTokens, cfg, rng = Math.random) {
+    if (lastResponseTime <= 0)
+        return 0;
+    const base = Math.max(0, cfg.baseMs);
+    const perToken = Math.max(0, cfg.perTokenMs);
+    const jitter = Math.max(0, cfg.jitterMs);
+    const max = Math.max(0, cfg.maxMs);
+    const tokens = Math.max(0, lastResponseTokens);
+    // Short-circuit when all knobs are zero — avoids unnecessary rng calls
+    // and the elapsed-time math on the hot path when think time is off.
+    if (base === 0 && perToken === 0 && jitter === 0)
+        return 0;
+    const jitterAdd = jitter > 0 ? Math.floor(rng() * jitter) : 0;
+    let target = base + perToken * tokens + jitterAdd;
+    if (max > 0 && target > max)
+        target = max;
+    const elapsed = now - lastResponseTime;
+    if (elapsed >= target)
+        return 0;
+    return target - elapsed;
+}
+/**
+ * Resolve a ThinkTimeConfig from explicit options, env vars, and
+ * defaults. All defaults are 0 — feature is opt-in. `maxMs` defaults to
+ * 30000 (30s) when any think-time knob is enabled and the user hasn't
+ * set their own cap; on a fully-disabled config the cap doesn't matter
+ * since the short-circuit above returns 0 first.
+ */
+export function resolveThinkTimeConfig(explicit = {}, env = process.env) {
+    const base = pickNonNegativeInt(explicit.baseMs, env.DARIO_THINK_TIME_BASE_MS) ?? 0;
+    const perToken = pickNonNegativeInt(explicit.perTokenMs, env.DARIO_THINK_TIME_PER_TOKEN_MS) ?? 0;
+    const jitter = pickNonNegativeInt(explicit.jitterMs, env.DARIO_THINK_TIME_JITTER_MS) ?? 0;
+    const max = pickNonNegativeInt(explicit.maxMs, env.DARIO_THINK_TIME_MAX_MS) ?? 30000;
+    return { baseMs: base, perTokenMs: perToken, jitterMs: jitter, maxMs: max };
+}
+export function computeSessionStartDelay(cfg, rng = Math.random) {
+    const min = Math.max(0, cfg.minMs);
+    const jitter = Math.max(0, cfg.jitterMs);
+    if (min === 0 && jitter === 0)
+        return 0;
+    const jitterAdd = jitter > 0 ? Math.floor(rng() * jitter) : 0;
+    return min + jitterAdd;
+}
+export function resolveSessionStartConfig(explicit = {}, env = process.env) {
+    const min = pickNonNegativeInt(explicit.minMs, env.DARIO_SESSION_START_MIN_MS) ?? 0;
+    const jitter = pickNonNegativeInt(explicit.jitterMs, env.DARIO_SESSION_START_JITTER_MS) ?? 0;
+    return { minMs: min, jitterMs: jitter };
+}
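
To see how the session-start knobs resolve and what range the sampled delay can take, here is a minimal sketch. `resolveSessionStartConfig` and `computeSessionStartDelay` are copied from the hunk above. The body of `pickNonNegativeInt` is not shown in this diff, so the version below is a hypothetical stand-in that assumes "first candidate that parses as a non-negative integer wins"; the env values are illustrative.

```javascript
// ASSUMPTION: stand-in for pickNonNegativeInt, whose real body is elided in this diff.
// Returns the first candidate that parses as a non-negative integer, else undefined.
function pickNonNegativeInt(...candidates) {
  for (const c of candidates) {
    if (c === undefined || c === null || c === '') continue;
    const n = Number(c);
    if (Number.isInteger(n) && n >= 0) return n;
  }
  return undefined;
}

// Copied from the pacing.js hunk above.
function resolveSessionStartConfig(explicit = {}, env = process.env) {
  const min = pickNonNegativeInt(explicit.minMs, env.DARIO_SESSION_START_MIN_MS) ?? 0;
  const jitter = pickNonNegativeInt(explicit.jitterMs, env.DARIO_SESSION_START_JITTER_MS) ?? 0;
  return { minMs: min, jitterMs: jitter };
}

function computeSessionStartDelay(cfg, rng = Math.random) {
  const min = Math.max(0, cfg.minMs);
  const jitter = Math.max(0, cfg.jitterMs);
  if (min === 0 && jitter === 0) return 0;
  const jitterAdd = jitter > 0 ? Math.floor(rng() * jitter) : 0;
  return min + jitterAdd;
}

// Env-only configuration (illustrative values).
const fromEnv = resolveSessionStartConfig({}, {
  DARIO_SESSION_START_MIN_MS: '2000',
  DARIO_SESSION_START_JITTER_MS: '3000',
});
// An explicit option is listed before the env var, so under the stand-in it wins.
const fromFlag = resolveSessionStartConfig({ minMs: 500 }, { DARIO_SESSION_START_MIN_MS: '2000' });
// Nothing set: both knobs default to 0, which disables the delay.
const off = resolveSessionStartConfig({}, {});

const lowest = computeSessionStartDelay(fromEnv, () => 0);         // rng at its minimum
const highest = computeSessionStartDelay(fromEnv, () => 0.999999); // rng just below 1
const disabled = computeSessionStartDelay(off);

console.log({ fromEnv, fromFlag, off, lowest, highest, disabled });
```

Because the jitter term is `Math.floor(rng() * jitter)` and `rng()` is below 1, the sampled delay falls in `[minMs, minMs + jitterMs - 1]`; it never reaches `minMs + jitterMs` exactly.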

package/dist/proxy.d.ts CHANGED

@@ -62,6 +62,12 @@ interface ProxyOptions {
     strictTls?: boolean;
     pacingMinMs?: number;
     pacingJitterMs?: number;
+    thinkTimeBaseMs?: number;
+    thinkTimePerTokenMs?: number;
+    thinkTimeJitterMs?: number;
+    thinkTimeMaxMs?: number;
+    sessionStartMinMs?: number;
+    sessionStartJitterMs?: number;
     drainOnClose?: boolean;
     sessionIdleRotateMs?: number;
     sessionRotateJitterMs?: number;

package/dist/proxy.js CHANGED

@@ -717,14 +717,39 @@ export async function startProxy(opts = {}) {
     // 500ms floor keeps the default behavior identical to v3.23; `--pace-min`
     // and `--pace-jitter` let callers tune the distribution. Pure calc lives
     // in src/pacing.ts so the edge cases are unit-tested without timers.
-    const { computePacingDelay, resolvePacingConfig } = await import('./pacing.js');
+    const { computePacingDelay, resolvePacingConfig, computeThinkTimeDelay, resolveThinkTimeConfig, computeSessionStartDelay, resolveSessionStartConfig, } = await import('./pacing.js');
     let lastRequestTime = 0;
+    // Behavioral smoothing state: when the last response *completed* and
+    // how many output tokens it had. Used by computeThinkTimeDelay to
+    // model human read-time before the next request. Distinct from
+    // lastRequestTime (which tracks when the last request *started* and
+    // feeds the inter-request floor).
+    let lastResponseTime = 0;
+    let lastResponseTokens = 0;
     const pacingCfg = resolvePacingConfig({
         minGapMs: opts.pacingMinMs,
         jitterMs: opts.pacingJitterMs,
     });
+    const thinkTimeCfg = resolveThinkTimeConfig({
+        baseMs: opts.thinkTimeBaseMs,
+        perTokenMs: opts.thinkTimePerTokenMs,
+        jitterMs: opts.thinkTimeJitterMs,
+        maxMs: opts.thinkTimeMaxMs,
+    });
+    const sessionStartCfg = resolveSessionStartConfig({
+        minMs: opts.sessionStartMinMs,
+        jitterMs: opts.sessionStartJitterMs,
+    });
+    const thinkTimeEnabled = thinkTimeCfg.baseMs > 0 || thinkTimeCfg.perTokenMs > 0 || thinkTimeCfg.jitterMs > 0;
+    const sessionStartEnabled = sessionStartCfg.minMs > 0 || sessionStartCfg.jitterMs > 0;
     if (verbose) {
         console.log(`[dario] pacing: min=${pacingCfg.minGapMs}ms jitter=${pacingCfg.jitterMs}ms`);
+        if (thinkTimeEnabled) {
+            console.log(`[dario] think-time: base=${thinkTimeCfg.baseMs}ms perToken=${thinkTimeCfg.perTokenMs}ms jitter=${thinkTimeCfg.jitterMs}ms max=${thinkTimeCfg.maxMs}ms`);
+        }
+        if (sessionStartEnabled) {
+            console.log(`[dario] session-start: min=${sessionStartCfg.minMs}ms jitter=${sessionStartCfg.jitterMs}ms`);
+        }
     }
     // Stream-consumption replay (v3.25, direction #5). When on, a client
     // disconnect no longer aborts the upstream fetch — we keep consuming
@@ -1287,10 +1312,29 @@ export async function startProxy(opts = {}) {
                 }
             }
             // Rate governor — prevent inhuman request cadence. See src/pacing.ts
-            // for the pure delay calculator (floor + uniform jitter).
-            const pacingDelay = computePacingDelay(Date.now(), lastRequestTime, pacingCfg);
-            if (pacingDelay > 0) {
-                await new Promise(r => setTimeout(r, pacingDelay));
+            // for the pure delay calculators. Three layers, all defaults preserve
+            // v3.37.20 behaviour:
+            //   1. pacingDelay      — floor on inter-request distance (always on,
+            //                         500ms default since v3.24).
+            //   2. thinkTimeDelay   — post-response read-time, proportional to
+            //                         the previous response's output tokens.
+            //                         Opt-in via --think-time-* flags.
+            //   3. sessionStartDelay — one-shot startup latency on the first
+            //                          request of a session (lastResponseTime===0).
+            //                          Opt-in via --session-start-* flags.
+            // We take the max because each layer enforces an independent floor
+            // — waiting longer satisfies all of them, so we never need to sum.
+            const nowForPacing = Date.now();
+            const pacingDelay = computePacingDelay(nowForPacing, lastRequestTime, pacingCfg);
+            const thinkDelay = thinkTimeEnabled
+                ? computeThinkTimeDelay(nowForPacing, lastResponseTime, lastResponseTokens, thinkTimeCfg)
+                : 0;
+            const sessionStartDelay = (sessionStartEnabled && lastResponseTime === 0 && lastRequestTime === 0)
+                ? computeSessionStartDelay(sessionStartCfg)
+                : 0;
+            const totalDelay = Math.max(pacingDelay, thinkDelay, sessionStartDelay);
+            if (totalDelay > 0) {
+                await new Promise(r => setTimeout(r, totalDelay));
             }
             lastRequestTime = Date.now();
             // Session ID: pool mode uses the per-account identity.sessionId (stable
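
The "take the max, never sum" rule in the hunk above can be illustrated in isolation. This is a sketch under assumptions: `combineDelays` is a hypothetical helper name introduced here (the hunk inlines the `Math.max` call), and the delay values are made-up inputs rather than outputs of the real calculators.

```javascript
// Hypothetical helper wrapping the Math.max call from the hunk above.
function combineDelays({ pacingDelay, thinkDelay, sessionStartDelay }) {
  // Each layer is an independent floor, so the longest wait satisfies all of them.
  return Math.max(pacingDelay, thinkDelay, sessionStartDelay);
}

// Mid-session: the think-time wait already covers the 500ms pacing floor.
const midSession = combineDelays({ pacingDelay: 500, thinkDelay: 5000, sessionStartDelay: 0 });

// First request of a session: only the session-start layer contributes.
const firstRequest = combineDelays({ pacingDelay: 0, thinkDelay: 0, sessionStartDelay: 3200 });

// Both opt-in layers off: behaviour reduces to the pacing floor alone.
const defaultsOnly = combineDelays({ pacingDelay: 500, thinkDelay: 0, sessionStartDelay: 0 });

console.log({ midSession, firstRequest, defaultsOnly });
```

Summing would make the mid-session wait 5500ms; with the max it is 5000ms, because 5000ms of waiting also satisfies the 500ms floor.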
@@ -1817,6 +1861,15 @@ export async function startProxy(opts = {}) {
                         console.error('[dario] Stream error:', sanitizeError(err));
                 }
                 res.end();
+                // Stamp the response-completion timestamp + token count so the
+                // next request's think-time delay can model human read time.
+                // Only on 2xx — error responses don't represent content the user
+                // would read, and using their (often zero) output_tokens would
+                // pin think time to baseMs+jitter on the next request needlessly.
+                if (upstream.status >= 200 && upstream.status < 300) {
+                    lastResponseTime = Date.now();
+                    lastResponseTokens = streamOutputTokens;
+                }
                 if (analytics && poolAccount) {
                     analytics.record({
                         timestamp: Date.now(), account: poolAccount.alias, model: requestModel,
@@ -1867,6 +1920,14 @@ export async function startProxy(opts = {}) {
                     bufferedUsage = Analytics.parseUsage(parsed);
                 }
                 catch { /* malformed body — log without usage */ }
+                // Stamp response-completion state for the next request's think-time
+                // delay. Same 2xx-only rule as the streaming path. Falls back to 0
+                // tokens when the body wasn't JSON or had no usage block — base +
+                // jitter still apply but the per-token component is 0.
+                if (upstream.status >= 200 && upstream.status < 300) {
+                    lastResponseTime = Date.now();
+                    lastResponseTokens = bufferedUsage?.outputTokens ?? 0;
+                }
                 if (analytics && poolAccount && bufferedUsage) {
                     try {
                         analytics.record({
@@ -1960,10 +2021,19 @@ export async function startProxy(opts = {}) {
                 const res = await fetch(`http://${displayHost}:${port}/health`);
                 const body = await res.json();
                 if (body && (body.status === 'ok' || body.status === 'degraded')) {
+                    // The /health endpoint's `oauth` field is a status enum
+                    // ('healthy' | 'expired' | 'broken' | 'none') — not a token
+                    // and not any kind of credential. CodeQL's clear-text-logging
+                    // heuristic flags any logged field whose key contains "oauth",
+                    // so we whitelist by allow-list rather than disable the rule.
+                    const allowedOauthStatuses = new Set(['healthy', 'expired', 'broken', 'none', 'degraded']);
+                    const rawOauth = typeof body.oauth === 'string' ? body.oauth : '';
+                    const oauthStatusLabel = allowedOauthStatuses.has(rawOauth) ? rawOauth : 'unknown';
+                    const requestsServed = typeof body.requests === 'number' ? body.requests : 0;
                     console.log('');
                     console.log(`  dario — already running on http://${displayHost}:${port}`);
                     console.log('');
-                    console.log(`  OAuth: ${body.oauth ?? 'unknown'}  |  requests served: ${body.requests ?? 0}`);
+                    console.log(`  OAuth: ${oauthStatusLabel}  |  requests served: ${requestsServed}`);
                     console.log('');
                     console.log('  Usage:');
                     console.log(`    ANTHROPIC_BASE_URL=http://${displayHost}:${port}`);

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@askalf/dario",
-  "version": "3.37.20",
+  "version": "3.38.1",
   "description": "A local LLM router. One endpoint, every provider — Claude subscriptions, OpenAI, OpenRouter, Groq, local LiteLLM, any OpenAI-compat endpoint — your tools don't need to change.",
   "type": "module",
   "bin": {