npm - @blockrun/clawrouter - Versions diffs - 0.12.64 → 0.12.66 - Mend

@blockrun/clawrouter 0.12.64 → 0.12.66

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (19) hide show

package/README.md +55 -55
package/dist/cli.js +70 -25
package/dist/cli.js.map +1 -1
package/dist/index.js +77 -27
package/dist/index.js.map +1 -1
package/docs/anthropic-cost-savings.md +90 -85
package/docs/architecture.md +12 -12
package/docs/{blog-openclaw-cost-overruns.md → clawrouter-cuts-llm-api-costs-500x.md} +27 -27
package/docs/clawrouter-vs-openrouter-llm-routing-comparison.md +280 -0
package/docs/configuration.md +2 -2
package/docs/image-generation.md +39 -39
package/docs/{blog-benchmark-2026-03.md → llm-router-benchmark-46-models-sub-1ms-routing.md} +61 -64
package/docs/routing-profiles.md +6 -6
package/docs/{technical-routing-2026-03.md → smart-llm-router-14-dimension-classifier.md} +29 -28
package/docs/worker-network.md +438 -347
package/package.json +1 -1
package/scripts/reinstall.sh +31 -6
package/scripts/update.sh +6 -1
package/docs/vs-openrouter.md +0 -157

package/docs/anthropic-cost-savings.md CHANGED Viewed

@@ -1,6 +1,6 @@
 # Stop Overpaying for Claude: How ClawRouter Cuts Your Anthropic Bill by 70%
-*You love Claude. Your wallet doesn't. Here's how to keep frontier-quality answers — at a fraction of the cost.*
+_You love Claude. Your wallet doesn't. Here's how to keep frontier-quality answers — at a fraction of the cost._
 ---
@@ -87,17 +87,17 @@ ClawRouter scores every prompt against 14 dimensions in <1ms and routes it to th
 From real production data across 20,000+ paying user requests:
-| Model | % of Requests | Price (input/output per M) |
-|---|---|---|
-| gemini-2.5-flash-lite | 34.5% | $0.10 / $0.40 |
-| **claude-sonnet-4.6** | **22.7%** | **$3.00 / $15.00** |
-| kimi-k2.5 | 16.2% | $0.60 / $3.00 |
-| minimax-m2.5 | 6.5% | $0.30 / $1.20 |
-| grok-code-fast | 6.1% | $0.20 / $1.50 |
-| claude-haiku-4.5 | 2.7% | $1.00 / $5.00 |
-| nvidia/gpt-oss-120b | 2.1% | FREE |
-| grok-reasoning | 2.9% | $0.20 / $0.50 |
-| Others | 6.3% | varies |
+| Model                 | % of Requests | Price (input/output per M) |
+| --------------------- | ------------- | -------------------------- |
+| gemini-2.5-flash-lite | 34.5%         | $0.10 / $0.40              |
+| **claude-sonnet-4.6** | **22.7%**     | **$3.00 / $15.00**         |
+| kimi-k2.5             | 16.2%         | $0.60 / $3.00              |
+| minimax-m2.5          | 6.5%          | $0.30 / $1.20              |
+| grok-code-fast        | 6.1%          | $0.20 / $1.50              |
+| claude-haiku-4.5      | 2.7%          | $1.00 / $5.00              |
+| nvidia/gpt-oss-120b   | 2.1%          | FREE                       |
+| grok-reasoning        | 2.9%          | $0.20 / $0.50              |
+| Others                | 6.3%          | varies                     |
 **Result:** 77% of requests go to models that cost 5-150x less than Sonnet. Only the ~23% that genuinely need Claude still go to Claude.
@@ -107,11 +107,11 @@ Even when a request does go to Claude, ClawRouter reduces the tokens you pay for
 **How it works:**
-| Compression Layer | What It Does | Savings |
-|---|---|---|
-| **Deduplication** | Removes duplicate messages in conversation history | 2-5% |
-| **Whitespace normalization** | Strips excess whitespace, trailing spaces, empty lines | 3-8% |
-| **JSON compaction** | Minifies JSON in tool calls and results | 2-4% |
+| Compression Layer            | What It Does                                           | Savings |
+| ---------------------------- | ------------------------------------------------------ | ------- |
+| **Deduplication**            | Removes duplicate messages in conversation history     | 2-5%    |
+| **Whitespace normalization** | Strips excess whitespace, trailing spaces, empty lines | 3-8%    |
+| **JSON compaction**          | Minifies JSON in tool calls and results                | 2-4%    |
 These three layers are **enabled by default** and are completely safe — they don't change semantic meaning. The compression triggers automatically on requests larger than 180KB (common in agent workflows and long conversations).
@@ -126,6 +126,7 @@ This matters most on expensive models. If you're sending a 50K-token agent conve
 ClawRouter caches responses locally. If your app sends the same request within 10 minutes, you get an instant response at **zero cost** — no API call, no tokens billed.
 This is more common than you'd think:
 - **Retry logic** — Your app retries on timeout. Without dedup, you pay twice. With ClawRouter, the retry resolves from cache instantly.
 - **Redundant requests** — Multiple users or processes asking the same thing? One API call, multiple responses.
 - **Agent loops** — Agentic frameworks often re-query with identical context. Cache catches these.
@@ -146,44 +147,44 @@ The deduplicator also catches in-flight duplicates: if two identical requests ar
 ### Direct Anthropic API
-| Approach | Input (10M tokens) | Output (5M tokens) | Monthly Total |
-|---|---|---|---|
-| All Claude Sonnet | $30.00 | $75.00 | **$105.00** |
-| All Claude Opus | $50.00 | $125.00 | **$175.00** |
+| Approach          | Input (10M tokens) | Output (5M tokens) | Monthly Total |
+| ----------------- | ------------------ | ------------------ | ------------- |
+| All Claude Sonnet | $30.00             | $75.00             | **$105.00**   |
+| All Claude Opus   | $50.00             | $125.00            | **$175.00**   |
 ### ClawRouter (real paying-user distribution)
-| Tier | % Requests | Routed To | Cost |
-|---|---|---|---|
-| Cheap models | 34.5% | gemini-flash-lite | $0.76 |
-| Mid-tier | 16.2% | kimi-k2.5 | $2.43 |
-| **Claude (complex)** | **22.7%** | **claude-sonnet-4.6** | **$17.44** |
-| Code models | 6.1% | grok-code-fast | $0.52 |
-| Reasoning | 2.9% | grok-reasoning | $0.03 |
-| Haiku | 2.7% | claude-haiku-4.5 | $0.76 |
-| Free | 2.1% | nvidia/gpt-oss-120b | $0.00 |
-| Other | 12.8% | various | $1.18 |
-| **Subtotal (routing)** | | | **$23.12** |
-| Token compression (~10%) | | | **-$2.31** |
-| Cache hits (~5% est.) | | | **-$1.16** |
-| **Final Total** | | | **~$19.65** |
+| Tier                     | % Requests | Routed To             | Cost        |
+| ------------------------ | ---------- | --------------------- | ----------- |
+| Cheap models             | 34.5%      | gemini-flash-lite     | $0.76       |
+| Mid-tier                 | 16.2%      | kimi-k2.5             | $2.43       |
+| **Claude (complex)**     | **22.7%**  | **claude-sonnet-4.6** | **$17.44**  |
+| Code models              | 6.1%       | grok-code-fast        | $0.52       |
+| Reasoning                | 2.9%       | grok-reasoning        | $0.03       |
+| Haiku                    | 2.7%       | claude-haiku-4.5      | $0.76       |
+| Free                     | 2.1%       | nvidia/gpt-oss-120b   | $0.00       |
+| Other                    | 12.8%      | various               | $1.18       |
+| **Subtotal (routing)**   |            |                       | **$23.12**  |
+| Token compression (~10%) |            |                       | **-$2.31**  |
+| Cache hits (~5% est.)    |            |                       | **-$1.16**  |
+| **Final Total**          |            |                       | **~$19.65** |
 ### The Bottom Line
-| Approach | Monthly Cost | Savings |
-|---|---|---|
-| Direct Claude Sonnet | $105.00 | — |
-| Direct Claude Opus | $175.00 | — |
-| **ClawRouter** | **~$20** | **~81% vs Sonnet, ~89% vs Opus** |
+| Approach             | Monthly Cost | Savings                          |
+| -------------------- | ------------ | -------------------------------- |
+| Direct Claude Sonnet | $105.00      | —                                |
+| Direct Claude Opus   | $175.00      | —                                |
+| **ClawRouter**       | **~$20**     | **~81% vs Sonnet, ~89% vs Opus** |
 Breaking down where the savings come from:
-| Savings Source | Estimated Impact | How |
-|---|---|---|
-| **Smart routing** | ~68% cost reduction | 77% of requests → cheaper models |
-| **Token compression** | ~7-15% on remaining cost | Fewer tokens billed per request |
-| **Response cache** | ~3-5% additional | Repeat requests cost $0 |
-| **Request dedup** | Prevents overcharges | Retries don't double-bill |
+| Savings Source        | Estimated Impact         | How                              |
+| --------------------- | ------------------------ | -------------------------------- |
+| **Smart routing**     | ~68% cost reduction      | 77% of requests → cheaper models |
+| **Token compression** | ~7-15% on remaining cost | Fewer tokens billed per request  |
+| **Response cache**    | ~3-5% additional         | Repeat requests cost $0          |
+| **Request dedup**     | Prevents overcharges     | Retries don't double-bill        |
 ---
@@ -191,22 +192,22 @@ Breaking down where the savings come from:
 ClawRouter runs a weighted scoring algorithm on every prompt — entirely locally, in under 1 millisecond, zero external API calls.
-| Dimension | Weight | Detects |
-|---|---|---|
-| Reasoning Markers | 0.18 | "prove," "step by step," "analyze" |
-| Code Presence | 0.15 | `function`, `class`, `import`, code blocks |
-| Multi-Step Patterns | 0.12 | "first...then," numbered steps |
-| Technical Terms | 0.10 | Domain-specific vocabulary |
-| Token Count | 0.08 | Short vs. long context |
-| Question Complexity | 0.05 | Nested or compound questions |
-| Creative Markers | 0.05 | Creative writing indicators |
-| Constraint Count | 0.04 | "max," "minimum," "at most" |
-| Imperative Verbs | 0.03 | "create," "generate," "build" |
-| Output Format | 0.03 | JSON, YAML, table, markdown |
-| Simple Indicators | 0.02 | "what is," "define," "translate" |
-| Reference Complexity | 0.02 | "the code above," "the docs" |
-| Domain Specificity | 0.02 | Quantum, genomics, etc. |
-| Negation Complexity | 0.01 | "don't," "never," "avoid" |
+| Dimension            | Weight | Detects                                    |
+| -------------------- | ------ | ------------------------------------------ |
+| Reasoning Markers    | 0.18   | "prove," "step by step," "analyze"         |
+| Code Presence        | 0.15   | `function`, `class`, `import`, code blocks |
+| Multi-Step Patterns  | 0.12   | "first...then," numbered steps             |
+| Technical Terms      | 0.10   | Domain-specific vocabulary                 |
+| Token Count          | 0.08   | Short vs. long context                     |
+| Question Complexity  | 0.05   | Nested or compound questions               |
+| Creative Markers     | 0.05   | Creative writing indicators                |
+| Constraint Count     | 0.04   | "max," "minimum," "at most"                |
+| Imperative Verbs     | 0.03   | "create," "generate," "build"              |
+| Output Format        | 0.03   | JSON, YAML, table, markdown                |
+| Simple Indicators    | 0.02   | "what is," "define," "translate"           |
+| Reference Complexity | 0.02   | "the code above," "the docs"               |
+| Domain Specificity   | 0.02   | Quantum, genomics, etc.                    |
+| Negation Complexity  | 0.01   | "don't," "never," "avoid"                  |
 The weighted score maps to four tiers:
@@ -234,6 +235,7 @@ Starts a local proxy on port 8402. Auto-generates a crypto wallet. Done.
 ### Step 2: Update Your Code
 **Python** — change 2 lines:
 ```python
 from openai import OpenAI
@@ -249,6 +251,7 @@ response = client.chat.completions.create(
 ```
 **TypeScript** — same idea:
 ```typescript
 import OpenAI from "openai";
@@ -258,12 +261,13 @@ const client = new OpenAI({
 });
 const response = await client.chat.completions.create({
-  model: "blockrun/auto",  // or "eco" for max savings, "premium" for best quality
+  model: "blockrun/auto", // or "eco" for max savings, "premium" for best quality
   messages: [{ role: "user", content: "Your prompt here" }],
 });
 ```
 **Routing profiles:**
 - `blockrun/auto` — Balanced cost/quality (default)
 - `blockrun/eco` — Maximum savings (free tier aggressively)
 - `blockrun/premium` — Best quality (Opus/Sonnet/GPT-5)
@@ -305,17 +309,17 @@ $ /stats 7
 ## Why ClawRouter Instead of OpenRouter?
-| | ClawRouter | OpenRouter |
-|---|---|---|
-| **Smart routing** | Automatic — 14-dimension scorer picks the model | Manual — you pick the model |
-| **Token optimization** | Built-in compression (7-15% savings) | None |
-| **Response caching** | Local cache, repeat requests = $0 | None |
-| **Request dedup** | Retries don't double-bill | None |
-| **Routing latency** | <1ms (local, on your machine) | Additional network hop |
-| **Payments** | Non-custodial USDC on Base (your wallet, your keys) | Prepaid credit balance (custodial) |
-| **Free tier** | NVIDIA GPT-OSS-120B (always available) | No free models |
-| **API keys** | Zero — proxy handles all auth | You manage keys per provider |
-| **Algorithm** | Open-source, MIT license, modify it yourself | Proprietary |
+|                        | ClawRouter                                          | OpenRouter                         |
+| ---------------------- | --------------------------------------------------- | ---------------------------------- |
+| **Smart routing**      | Automatic — 14-dimension scorer picks the model     | Manual — you pick the model        |
+| **Token optimization** | Built-in compression (7-15% savings)                | None                               |
+| **Response caching**   | Local cache, repeat requests = $0                   | None                               |
+| **Request dedup**      | Retries don't double-bill                           | None                               |
+| **Routing latency**    | <1ms (local, on your machine)                       | Additional network hop             |
+| **Payments**           | Non-custodial USDC on Base (your wallet, your keys) | Prepaid credit balance (custodial) |
+| **Free tier**          | NVIDIA GPT-OSS-120B (always available)              | No free models                     |
+| **API keys**           | Zero — proxy handles all auth                       | You manage keys per provider       |
+| **Algorithm**          | Open-source, MIT license, modify it yourself        | Proprietary                        |
 The fundamental difference: **OpenRouter is a model marketplace where you choose.** ClawRouter is an intelligent proxy that **chooses for you**, compresses your tokens, caches your responses, and pays per-request with crypto from your own wallet.
@@ -323,16 +327,16 @@ The fundamental difference: **OpenRouter is a model marketplace where you choose
 ## TL;DR
-| What | Details |
-|---|---|
-| **Problem** | You pay Claude $3-25/M tokens on every request, but ~70% don't need Claude |
-| **Solution** | ClawRouter auto-routes + compresses + caches |
-| **Savings** | ~81% vs Sonnet, ~89% vs Opus |
-| **How** | Routing (68%) + token compression (7-15%) + caching (3-5%) |
-| **Code change** | 2 lines (base_url + model name) |
-| **Setup time** | 3 minutes |
-| **Quality tradeoff** | None — complex tasks still go to Claude |
-| **Open source** | MIT license, local proxy, non-custodial payments |
+| What                 | Details                                                                    |
+| -------------------- | -------------------------------------------------------------------------- |
+| **Problem**          | You pay Claude $3-25/M tokens on every request, but ~70% don't need Claude |
+| **Solution**         | ClawRouter auto-routes + compresses + caches                               |
+| **Savings**          | ~81% vs Sonnet, ~89% vs Opus                                               |
+| **How**              | Routing (68%) + token compression (7-15%) + caching (3-5%)                 |
+| **Code change**      | 2 lines (base_url + model name)                                            |
+| **Setup time**       | 3 minutes                                                                  |
+| **Quality tradeoff** | None — complex tasks still go to Claude                                    |
+| **Open source**      | MIT license, local proxy, non-custodial payments                           |
 ```bash
 # Start saving now:
@@ -340,10 +344,11 @@ npx @blockrun/clawrouter
 ```
 **Links:**
 - [ClawRouter on GitHub](https://github.com/blockrunai/ClawRouter) — MIT License
 - [BlockRun](https://blockrun.ai) — AI model marketplace
 - [x402 Protocol](https://www.x402.org/) — Per-request crypto payments for AI
 ---
-*Cost data based on real production traffic from paying users across 20,000+ requests, March 2026. Savings vary by workload — agent-heavy and long-context workloads see larger compression benefits. ClawRouter is open-source and part of the BlockRun ecosystem.*
+_Cost data based on real production traffic from paying users across 20,000+ requests, March 2026. Savings vary by workload — agent-heavy and long-context workloads see larger compression benefits. ClawRouter is open-source and part of the BlockRun ecosystem._

package/docs/architecture.md CHANGED Viewed

@@ -378,8 +378,8 @@ const solanaAccount = await deriveSlip10Ed25519Key(mnemonic, "m/44'/501'/0'/0'")
 // Build SPL Token USDC transfer instruction
 const transaction = buildSolanaPaymentTransaction({
   from: solanaAddress,
-  to: payTo,            // base58 recipient
-  mint: USDC_SOLANA,   // EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v
+  to: payTo, // base58 recipient
+  mint: USDC_SOLANA, // EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v
   amount: BigInt(5000), // 0.005 USDC (6 decimals)
 });
@@ -547,13 +547,13 @@ src/
 ### Key Files
-| File                  | Purpose                                                     |
-| --------------------- | ----------------------------------------------------------- |
-| `proxy.ts`            | Core request handling, SSE simulation, fallback chain       |
-| `wallet.ts`           | BIP-39 mnemonic generation, EVM + Solana (SLIP-10) derivation |
-| `router/rules.ts`     | 15-dimension weighted scorer, 9-language keyword sets       |
-| `x402.ts`             | EIP-712 typed data signing, payment header formatting       |
-| `balance.ts`          | USDC balance via Base RPC (EVM), caching, thresholds        |
-| `solana-balance.ts`   | USDC balance via Solana RPC (SPL Token), caching, retries   |
-| `payment-preauth.ts`  | Pre-authorization cache (EVM; skipped for Solana)           |
-| `dedup.ts`            | SHA-256 hashing, 30s response cache                         |
+| File                 | Purpose                                                       |
+| -------------------- | ------------------------------------------------------------- |
+| `proxy.ts`           | Core request handling, SSE simulation, fallback chain         |
+| `wallet.ts`          | BIP-39 mnemonic generation, EVM + Solana (SLIP-10) derivation |
+| `router/rules.ts`    | 15-dimension weighted scorer, 9-language keyword sets         |
+| `x402.ts`            | EIP-712 typed data signing, payment header formatting         |
+| `balance.ts`         | USDC balance via Base RPC (EVM), caching, thresholds          |
+| `solana-balance.ts`  | USDC balance via Solana RPC (SPL Token), caching, retries     |
+| `payment-preauth.ts` | Pre-authorization cache (EVM; skipped for Solana)             |
+| `dedup.ts`           | SHA-256 hashing, 30s response cache                           |

package/docs/{blog-openclaw-cost-overruns.md → clawrouter-cuts-llm-api-costs-500x.md} RENAMED Viewed

@@ -1,6 +1,6 @@
 # The Most AI-Agent-Native Router for OpenClaw
-> *OpenClaw is one of the best AI agent frameworks available. Its LLM abstraction layer is not.*
+> _OpenClaw is one of the best AI agent frameworks available. Its LLM abstraction layer is not._
 ---
@@ -10,9 +10,9 @@
 From [openclaw/openclaw#3181](https://github.com/openclaw/openclaw/issues/3181):
-> *"We ended up at $248/day before we caught it. Heartbeat on Opus 4.6 with a large context. The dedup fix reduced trigger rate, but there's nothing bounding the run itself."*
+> _"We ended up at $248/day before we caught it. Heartbeat on Opus 4.6 with a large context. The dedup fix reduced trigger rate, but there's nothing bounding the run itself."_
-> *"11.3M input tokens in 1 hour on claude-opus-4-6 (128K context), ~$20/hour."*
+> _"11.3M input tokens in 1 hour on claude-opus-4-6 (128K context), ~$20/hour."_
 Both users ended up disabling heartbeat entirely. The workaround: `heartbeat.every: "0"` — turning off the feature to avoid burning money.
@@ -62,15 +62,15 @@ Agents are the worst offenders for context bloat. Tool call results are verbose.
 ClawRouter compresses every request through 7 layers before it hits the wire:
-| Layer | What it does | Saves |
-|-------|-------------|-------|
-| Deduplication | Removes repeated messages (retries, echoes) | Variable |
-| Whitespace | Strips excessive whitespace from all content | 2–8% |
-| Dictionary | Replaces common phrases with short codes | 5–15% |
-| Path shortening | Codebook for repeated file paths in tool results | 3–10% |
-| JSON compaction | Removes whitespace from embedded JSON | 5–12% |
-| **Observation compression** | **Summarizes tool results to key information** | **Up to 97%** |
-| Dynamic codebook | Learns repetitions in the actual conversation | 3–15% |
+| Layer                       | What it does                                     | Saves         |
+| --------------------------- | ------------------------------------------------ | ------------- |
+| Deduplication               | Removes repeated messages (retries, echoes)      | Variable      |
+| Whitespace                  | Strips excessive whitespace from all content     | 2–8%          |
+| Dictionary                  | Replaces common phrases with short codes         | 5–15%         |
+| Path shortening             | Codebook for repeated file paths in tool results | 3–10%         |
+| JSON compaction             | Removes whitespace from embedded JSON            | 5–12%         |
+| **Observation compression** | **Summarizes tool results to key information**   | **Up to 97%** |
+| Dynamic codebook            | Learns repetitions in the actual conversation    | 3–15%         |
 Layer 6 is the big one. Tool results — file reads, API responses, shell output — can be 10KB+ each. The actual useful signal is often 200–300 chars. ClawRouter extracts errors, status lines, key JSON fields, and compresses the rest. Same model intelligence, 97% fewer tokens on the bulk.
@@ -153,20 +153,20 @@ There is no monthly invoice. There is no 3am email. There is a wallet balance, a
 <p align="center"><img src="assets/blockrun-clawrouter-vs-openclaw-standalone-comparison-production-safety.png" alt="Architecting for production safety — OpenClaw standalone vs OpenClaw + ClawRouter comparison across cost, context, error handling, and budgeting" width="720"></p>
-| Problem | OpenClaw alone | OpenClaw + ClawRouter |
-|---------|---------------|----------------------|
-| Heartbeat cost overrun | No per-run cap | Tier routing → 50–500× cheaper model |
-| Large context | Full context every call | 7-layer compression, 15–40% reduction |
-| Tool result bloat | Raw output forwarded | Observation compression, up to 97% |
-| Rate limit contaminates profile | All models penalized (#49834) | Per-model 60s cooldown, others unaffected |
-| Empty / degraded 200 response | Passed through to agent (#49902) | Detected, triggers model fallback |
-| Short-burst 429 failover | Immediate failover to next model | 200ms retry first, failover only if needed |
-| MiniMax 520 failure | Silent drop / retry storm | Classified as server_error, retried correctly |
-| Z.ai 1311 (billing) | Treated as rate_limit, retried | Classified as billing, stopped immediately |
-| Mid-task model switch | Model can change mid-session | Session pinning, consistent model per task |
-| Monthly billing surprise | Possible | Wallet-based, stops when empty |
-| Per-session cost ceiling | None | `maxCostPerRun` — graceful or strict cap |
-| Cost visibility | None | `/stats` with per-provider error counts |
+| Problem                         | OpenClaw alone                   | OpenClaw + ClawRouter                         |
+| ------------------------------- | -------------------------------- | --------------------------------------------- |
+| Heartbeat cost overrun          | No per-run cap                   | Tier routing → 50–500× cheaper model          |
+| Large context                   | Full context every call          | 7-layer compression, 15–40% reduction         |
+| Tool result bloat               | Raw output forwarded             | Observation compression, up to 97%            |
+| Rate limit contaminates profile | All models penalized (#49834)    | Per-model 60s cooldown, others unaffected     |
+| Empty / degraded 200 response   | Passed through to agent (#49902) | Detected, triggers model fallback             |
+| Short-burst 429 failover        | Immediate failover to next model | 200ms retry first, failover only if needed    |
+| MiniMax 520 failure             | Silent drop / retry storm        | Classified as server_error, retried correctly |
+| Z.ai 1311 (billing)             | Treated as rate_limit, retried   | Classified as billing, stopped immediately    |
+| Mid-task model switch           | Model can change mid-session     | Session pinning, consistent model per task    |
+| Monthly billing surprise        | Possible                         | Wallet-based, stops when empty                |
+| Per-session cost ceiling        | None                             | `maxCostPerRun` — graceful or strict cap      |
+| Cost visibility                 | None                             | `/stats` with per-provider error counts       |
 ---
@@ -194,4 +194,4 @@ That's what ClawRouter is for.
 ---
-*[github.com/BlockRunAI/ClawRouter](https://github.com/BlockRunAI/ClawRouter) · [blockrun.ai](https://blockrun.ai) · `npm install -g @blockrun/clawrouter`*
+_[github.com/BlockRunAI/ClawRouter](https://github.com/BlockRunAI/ClawRouter) · [blockrun.ai](https://blockrun.ai) · `npm install -g @blockrun/clawrouter`_