@blockrun/clawrouter 0.12.64 → 0.12.66

@@ -1,6 +1,6 @@
  # Stop Overpaying for Claude: How ClawRouter Cuts Your Anthropic Bill by 70%
 
- *You love Claude. Your wallet doesn't. Here's how to keep frontier-quality answers — at a fraction of the cost.*
+ _You love Claude. Your wallet doesn't. Here's how to keep frontier-quality answers — at a fraction of the cost._
 
  ---
 
@@ -87,17 +87,17 @@ ClawRouter scores every prompt against 14 dimensions in <1ms and routes it to th
 
  From real production data across 20,000+ paying user requests:
 
- | Model | % of Requests | Price (input/output per M) |
- |---|---|---|
- | gemini-2.5-flash-lite | 34.5% | $0.10 / $0.40 |
- | **claude-sonnet-4.6** | **22.7%** | **$3.00 / $15.00** |
- | kimi-k2.5 | 16.2% | $0.60 / $3.00 |
- | minimax-m2.5 | 6.5% | $0.30 / $1.20 |
- | grok-code-fast | 6.1% | $0.20 / $1.50 |
- | claude-haiku-4.5 | 2.7% | $1.00 / $5.00 |
- | nvidia/gpt-oss-120b | 2.1% | FREE |
- | grok-reasoning | 2.9% | $0.20 / $0.50 |
- | Others | 6.3% | varies |
+ | Model | % of Requests | Price (input/output per M) |
+ | --------------------- | ------------- | -------------------------- |
+ | gemini-2.5-flash-lite | 34.5% | $0.10 / $0.40 |
+ | **claude-sonnet-4.6** | **22.7%** | **$3.00 / $15.00** |
+ | kimi-k2.5 | 16.2% | $0.60 / $3.00 |
+ | minimax-m2.5 | 6.5% | $0.30 / $1.20 |
+ | grok-code-fast | 6.1% | $0.20 / $1.50 |
+ | claude-haiku-4.5 | 2.7% | $1.00 / $5.00 |
+ | nvidia/gpt-oss-120b | 2.1% | FREE |
+ | grok-reasoning | 2.9% | $0.20 / $0.50 |
+ | Others | 6.3% | varies |
 
  **Result:** 77% of requests go to models that cost 5-150x less than Sonnet. Only the ~23% that genuinely need Claude still go to Claude.
 
@@ -107,11 +107,11 @@ Even when a request does go to Claude, ClawRouter reduces the tokens you pay for
 
  **How it works:**
 
- | Compression Layer | What It Does | Savings |
- |---|---|---|
- | **Deduplication** | Removes duplicate messages in conversation history | 2-5% |
- | **Whitespace normalization** | Strips excess whitespace, trailing spaces, empty lines | 3-8% |
- | **JSON compaction** | Minifies JSON in tool calls and results | 2-4% |
+ | Compression Layer | What It Does | Savings |
+ | ---------------------------- | ------------------------------------------------------ | ------- |
+ | **Deduplication** | Removes duplicate messages in conversation history | 2-5% |
+ | **Whitespace normalization** | Strips excess whitespace, trailing spaces, empty lines | 3-8% |
+ | **JSON compaction** | Minifies JSON in tool calls and results | 2-4% |
 
  These three layers are **enabled by default** and are completely safe — they don't change semantic meaning. The compression triggers automatically on requests larger than 180KB (common in agent workflows and long conversations).
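
The three default layers can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not ClawRouter's implementation: the function names, the `{"role", "content"}` message shape, and the exact normalization rules are all hypothetical.

```python
import json

def dedupe_messages(messages):
    """Drop messages that exactly repeat an earlier (role, content) pair."""
    seen = set()
    kept = []
    for msg in messages:
        key = (msg["role"], msg["content"])
        if key not in seen:
            seen.add(key)
            kept.append(msg)
    return kept

def normalize_whitespace(text):
    """Strip trailing spaces and collapse runs of empty lines into one."""
    out = []
    for line in text.split("\n"):
        line = line.rstrip()
        if line == "" and out and out[-1] == "":
            continue  # skip the second and later blank lines in a run
        out.append(line)
    return "\n".join(out).strip()

def compact_json(text):
    """Minify text that parses as JSON; return anything else unchanged."""
    try:
        parsed = json.loads(text)
    except ValueError:
        return text
    return json.dumps(parsed, separators=(",", ":"), ensure_ascii=False)

def compress(messages):
    """Apply the three layers in order: dedup, JSON compaction, whitespace."""
    result = []
    for msg in dedupe_messages(messages):
        content = normalize_whitespace(compact_json(msg["content"]))
        result.append({**msg, "content": content})
    return result
```

Leading indentation is left alone in this sketch, since it can carry meaning in code and YAML; only trailing spaces and repeated blank lines are removed.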
 
@@ -126,6 +126,7 @@ This matters most on expensive models. If you're sending a 50K-token agent conve
  ClawRouter caches responses locally. If your app sends the same request within 10 minutes, you get an instant response at **zero cost** — no API call, no tokens billed.
 
  This is more common than you'd think:
+
  - **Retry logic** — Your app retries on timeout. Without dedup, you pay twice. With ClawRouter, the retry resolves from cache instantly.
  - **Redundant requests** — Multiple users or processes asking the same thing? One API call, multiple responses.
  - **Agent loops** — Agentic frameworks often re-query with identical context. Cache catches these.
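
A time-bounded response cache of the kind described above might look like the following sketch. It assumes requests are JSON-serializable dictionaries and keys them by a SHA-256 hash of a canonical serialization; the `ResponseCache` class and its method names are hypothetical, and only the 10-minute window comes from the text.

```python
import hashlib
import json
import time

CACHE_TTL_SECONDS = 600  # "within 10 minutes", per the text above

class ResponseCache:
    """Hypothetical in-memory cache keyed by a hash of the request body."""

    def __init__(self, ttl=CACHE_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._entries = {}  # key -> (expires_at, response)

    @staticmethod
    def key_for(request):
        # Sorting keys makes the hash independent of dict ordering.
        canonical = json.dumps(request, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get_or_call(self, request, call_upstream):
        key = self.key_for(request)
        hit = self._entries.get(key)
        if hit is not None and hit[0] > self._clock():
            return hit[1]  # cache hit: no upstream call, nothing billed
        response = call_upstream(request)
        self._entries[key] = (self._clock() + self._ttl, response)
        return response
```

Passing the clock in as a parameter keeps the expiry logic testable without sleeping.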
@@ -146,44 +147,44 @@ The deduplicator also catches in-flight duplicates: if two identical requests ar
 
  ### Direct Anthropic API
 
- | Approach | Input (10M tokens) | Output (5M tokens) | Monthly Total |
- |---|---|---|---|
- | All Claude Sonnet | $30.00 | $75.00 | **$105.00** |
- | All Claude Opus | $50.00 | $125.00 | **$175.00** |
+ | Approach | Input (10M tokens) | Output (5M tokens) | Monthly Total |
+ | ----------------- | ------------------ | ------------------ | ------------- |
+ | All Claude Sonnet | $30.00 | $75.00 | **$105.00** |
+ | All Claude Opus | $50.00 | $125.00 | **$175.00** |
 
  ### ClawRouter (real paying-user distribution)
 
- | Tier | % Requests | Routed To | Cost |
- |---|---|---|---|
- | Cheap models | 34.5% | gemini-flash-lite | $0.76 |
- | Mid-tier | 16.2% | kimi-k2.5 | $2.43 |
- | **Claude (complex)** | **22.7%** | **claude-sonnet-4.6** | **$17.44** |
- | Code models | 6.1% | grok-code-fast | $0.52 |
- | Reasoning | 2.9% | grok-reasoning | $0.03 |
- | Haiku | 2.7% | claude-haiku-4.5 | $0.76 |
- | Free | 2.1% | nvidia/gpt-oss-120b | $0.00 |
- | Other | 12.8% | various | $1.18 |
- | **Subtotal (routing)** | | | **$23.12** |
- | Token compression (~10%) | | | **-$2.31** |
- | Cache hits (~5% est.) | | | **-$1.16** |
- | **Final Total** | | | **~$19.65** |
+ | Tier | % Requests | Routed To | Cost |
+ | ------------------------ | ---------- | --------------------- | ----------- |
+ | Cheap models | 34.5% | gemini-flash-lite | $0.76 |
+ | Mid-tier | 16.2% | kimi-k2.5 | $2.43 |
+ | **Claude (complex)** | **22.7%** | **claude-sonnet-4.6** | **$17.44** |
+ | Code models | 6.1% | grok-code-fast | $0.52 |
+ | Reasoning | 2.9% | grok-reasoning | $0.03 |
+ | Haiku | 2.7% | claude-haiku-4.5 | $0.76 |
+ | Free | 2.1% | nvidia/gpt-oss-120b | $0.00 |
+ | Other | 12.8% | various | $1.18 |
+ | **Subtotal (routing)** | | | **$23.12** |
+ | Token compression (~10%) | | | **-$2.31** |
+ | Cache hits (~5% est.) | | | **-$1.16** |
+ | **Final Total** | | | **~$19.65** |
 
  ### The Bottom Line
 
- | Approach | Monthly Cost | Savings |
- |---|---|---|
- | Direct Claude Sonnet | $105.00 | — |
- | Direct Claude Opus | $175.00 | — |
- | **ClawRouter** | **~$20** | **~81% vs Sonnet, ~89% vs Opus** |
+ | Approach | Monthly Cost | Savings |
+ | -------------------- | ------------ | -------------------------------- |
+ | Direct Claude Sonnet | $105.00 | — |
+ | Direct Claude Opus | $175.00 | — |
+ | **ClawRouter** | **~$20** | **~81% vs Sonnet, ~89% vs Opus** |
 
  Breaking down where the savings come from:
 
- | Savings Source | Estimated Impact | How |
- |---|---|---|
- | **Smart routing** | ~68% cost reduction | 77% of requests → cheaper models |
- | **Token compression** | ~7-15% on remaining cost | Fewer tokens billed per request |
- | **Response cache** | ~3-5% additional | Repeat requests cost $0 |
- | **Request dedup** | Prevents overcharges | Retries don't double-bill |
+ | Savings Source | Estimated Impact | How |
+ | --------------------- | ------------------------ | -------------------------------- |
+ | **Smart routing** | ~68% cost reduction | 77% of requests → cheaper models |
+ | **Token compression** | ~7-15% on remaining cost | Fewer tokens billed per request |
+ | **Response cache** | ~3-5% additional | Repeat requests cost $0 |
+ | **Request dedup** | Prevents overcharges | Retries don't double-bill |
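
The arithmetic behind these tables can be re-derived in a few lines. The sketch below only recomputes figures already shown; the Opus price of $5 / $25 per million tokens is inferred from the $50.00 / $125.00 row rather than stated directly, and the variable names are illustrative.

```python
# Prices in USD per million tokens (input, output).
# Sonnet is from the routing table; Opus is inferred from the direct-API row.
PRICES = {"sonnet": (3.00, 15.00), "opus": (5.00, 25.00)}
INPUT_M, OUTPUT_M = 10, 5  # monthly volume in millions of tokens

def direct_cost(model):
    """Monthly cost in dollars if every request goes to one model."""
    input_price, output_price = PRICES[model]
    return INPUT_M * input_price + OUTPUT_M * output_price

# Per-tier costs copied from the ClawRouter table, in cents to avoid float drift.
ROUTED_CENTS = [76, 243, 1744, 52, 3, 76, 0, 118]
subtotal_cents = sum(ROUTED_CENTS)            # routing subtotal
final_cents = subtotal_cents - 231 - 116      # minus compression and cache hits

def savings_pct(baseline_dollars):
    """Percentage saved relative to a direct-API baseline."""
    return 100 * (1 - (final_cents / 100) / baseline_dollars)

print(direct_cost("sonnet"), direct_cost("opus"))
print(subtotal_cents / 100, final_cents / 100)
print(round(savings_pct(105.0)), round(savings_pct(175.0)))
```

The per-tier costs sum to the $23.12 subtotal, and the final ~$19.65 works out to roughly 81% below the Sonnet baseline and 89% below the Opus baseline, matching the table.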
 
  ---
 
@@ -191,22 +192,22 @@ Breaking down where the savings come from:
 
  ClawRouter runs a weighted scoring algorithm on every prompt — entirely locally, in under 1 millisecond, zero external API calls.
 
- | Dimension | Weight | Detects |
- |---|---|---|
- | Reasoning Markers | 0.18 | "prove," "step by step," "analyze" |
- | Code Presence | 0.15 | `function`, `class`, `import`, code blocks |
- | Multi-Step Patterns | 0.12 | "first...then," numbered steps |
- | Technical Terms | 0.10 | Domain-specific vocabulary |
- | Token Count | 0.08 | Short vs. long context |
- | Question Complexity | 0.05 | Nested or compound questions |
- | Creative Markers | 0.05 | Creative writing indicators |
- | Constraint Count | 0.04 | "max," "minimum," "at most" |
- | Imperative Verbs | 0.03 | "create," "generate," "build" |
- | Output Format | 0.03 | JSON, YAML, table, markdown |
- | Simple Indicators | 0.02 | "what is," "define," "translate" |
- | Reference Complexity | 0.02 | "the code above," "the docs" |
- | Domain Specificity | 0.02 | Quantum, genomics, etc. |
- | Negation Complexity | 0.01 | "don't," "never," "avoid" |
+ | Dimension | Weight | Detects |
+ | -------------------- | ------ | ------------------------------------------ |
+ | Reasoning Markers | 0.18 | "prove," "step by step," "analyze" |
+ | Code Presence | 0.15 | `function`, `class`, `import`, code blocks |
+ | Multi-Step Patterns | 0.12 | "first...then," numbered steps |
+ | Technical Terms | 0.10 | Domain-specific vocabulary |
+ | Token Count | 0.08 | Short vs. long context |
+ | Question Complexity | 0.05 | Nested or compound questions |
+ | Creative Markers | 0.05 | Creative writing indicators |
+ | Constraint Count | 0.04 | "max," "minimum," "at most" |
+ | Imperative Verbs | 0.03 | "create," "generate," "build" |
+ | Output Format | 0.03 | JSON, YAML, table, markdown |
+ | Simple Indicators | 0.02 | "what is," "define," "translate" |
+ | Reference Complexity | 0.02 | "the code above," "the docs" |
+ | Domain Specificity | 0.02 | Quantum, genomics, etc. |
+ | Negation Complexity | 0.01 | "don't," "never," "avoid" |
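
A weighted scorer over these dimensions can be sketched as a plain weighted sum. The weights below are copied from the table (as listed, they total 0.90); the per-dimension signal values, the `keyword_signal` detector, and the snake_case dimension names are hypothetical stand-ins, since the actual detectors are not shown here.

```python
# Weights copied from the table above; everything else is an illustrative assumption.
WEIGHTS = {
    "reasoning_markers": 0.18,
    "code_presence": 0.15,
    "multi_step_patterns": 0.12,
    "technical_terms": 0.10,
    "token_count": 0.08,
    "question_complexity": 0.05,
    "creative_markers": 0.05,
    "constraint_count": 0.04,
    "imperative_verbs": 0.03,
    "output_format": 0.03,
    "simple_indicators": 0.02,
    "reference_complexity": 0.02,
    "domain_specificity": 0.02,
    "negation_complexity": 0.01,
}

def weighted_score(signals):
    """Sum weight * signal over all dimensions; missing signals count as 0."""
    return sum(weight * signals.get(name, 0.0) for name, weight in WEIGHTS.items())

def keyword_signal(prompt, keywords):
    """Hypothetical detector: 1.0 if any keyword appears in the prompt, else 0.0."""
    text = prompt.lower()
    return 1.0 if any(keyword in text for keyword in keywords) else 0.0

prompt = "Prove step by step that the function terminates"
signals = {
    "reasoning_markers": keyword_signal(prompt, ("prove", "step by step", "analyze")),
    "code_presence": keyword_signal(prompt, ("function", "class", "import")),
}
score = weighted_score(signals)  # 0.18 + 0.15 for this prompt
```

Because the score is a pure function of string matching and arithmetic, it needs no network call, which is consistent with the local sub-millisecond claim above.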
 
  The weighted score maps to four tiers:
 
@@ -234,6 +235,7 @@ Starts a local proxy on port 8402. Auto-generates a crypto wallet. Done.
  ### Step 2: Update Your Code
 
  **Python** — change 2 lines:
+
  ```python
  from openai import OpenAI
 
@@ -249,6 +251,7 @@ response = client.chat.completions.create(
  ```
 
  **TypeScript** — same idea:
+
  ```typescript
  import OpenAI from "openai";
 
@@ -258,12 +261,13 @@ const client = new OpenAI({
  });
 
  const response = await client.chat.completions.create({
- model: "blockrun/auto", // or "eco" for max savings, "premium" for best quality
+ model: "blockrun/auto", // or "eco" for max savings, "premium" for best quality
  messages: [{ role: "user", content: "Your prompt here" }],
  });
  ```
 
  **Routing profiles:**
+
  - `blockrun/auto` — Balanced cost/quality (default)
  - `blockrun/eco` — Maximum savings (free tier aggressively)
  - `blockrun/premium` — Best quality (Opus/Sonnet/GPT-5)
@@ -305,17 +309,17 @@ $ /stats 7
 
  ## Why ClawRouter Instead of OpenRouter?
 
- | | ClawRouter | OpenRouter |
- |---|---|---|
- | **Smart routing** | Automatic — 14-dimension scorer picks the model | Manual — you pick the model |
- | **Token optimization** | Built-in compression (7-15% savings) | None |
- | **Response caching** | Local cache, repeat requests = $0 | None |
- | **Request dedup** | Retries don't double-bill | None |
- | **Routing latency** | <1ms (local, on your machine) | Additional network hop |
- | **Payments** | Non-custodial USDC on Base (your wallet, your keys) | Prepaid credit balance (custodial) |
- | **Free tier** | NVIDIA GPT-OSS-120B (always available) | No free models |
- | **API keys** | Zero — proxy handles all auth | You manage keys per provider |
- | **Algorithm** | Open-source, MIT license, modify it yourself | Proprietary |
+ | | ClawRouter | OpenRouter |
+ | ---------------------- | --------------------------------------------------- | ---------------------------------- |
+ | **Smart routing** | Automatic — 14-dimension scorer picks the model | Manual — you pick the model |
+ | **Token optimization** | Built-in compression (7-15% savings) | None |
+ | **Response caching** | Local cache, repeat requests = $0 | None |
+ | **Request dedup** | Retries don't double-bill | None |
+ | **Routing latency** | <1ms (local, on your machine) | Additional network hop |
+ | **Payments** | Non-custodial USDC on Base (your wallet, your keys) | Prepaid credit balance (custodial) |
+ | **Free tier** | NVIDIA GPT-OSS-120B (always available) | No free models |
+ | **API keys** | Zero — proxy handles all auth | You manage keys per provider |
+ | **Algorithm** | Open-source, MIT license, modify it yourself | Proprietary |
 
  The fundamental difference: **OpenRouter is a model marketplace where you choose.** ClawRouter is an intelligent proxy that **chooses for you**, compresses your tokens, caches your responses, and pays per-request with crypto from your own wallet.
 
@@ -323,16 +327,16 @@ The fundamental difference: **OpenRouter is a model marketplace where you choose
 
  ## TL;DR
 
- | What | Details |
- |---|---|
- | **Problem** | You pay Claude $3-25/M tokens on every request, but ~70% don't need Claude |
- | **Solution** | ClawRouter auto-routes + compresses + caches |
- | **Savings** | ~81% vs Sonnet, ~89% vs Opus |
- | **How** | Routing (68%) + token compression (7-15%) + caching (3-5%) |
- | **Code change** | 2 lines (base_url + model name) |
- | **Setup time** | 3 minutes |
- | **Quality tradeoff** | None — complex tasks still go to Claude |
- | **Open source** | MIT license, local proxy, non-custodial payments |
+ | What | Details |
+ | -------------------- | -------------------------------------------------------------------------- |
+ | **Problem** | You pay Claude $3-25/M tokens on every request, but ~70% don't need Claude |
+ | **Solution** | ClawRouter auto-routes + compresses + caches |
+ | **Savings** | ~81% vs Sonnet, ~89% vs Opus |
+ | **How** | Routing (68%) + token compression (7-15%) + caching (3-5%) |
+ | **Code change** | 2 lines (base_url + model name) |
+ | **Setup time** | 3 minutes |
+ | **Quality tradeoff** | None — complex tasks still go to Claude |
+ | **Open source** | MIT license, local proxy, non-custodial payments |
 
  ```bash
  # Start saving now:
@@ -340,10 +344,11 @@ npx @blockrun/clawrouter
  ```
 
  **Links:**
+
  - [ClawRouter on GitHub](https://github.com/blockrunai/ClawRouter) — MIT License
  - [BlockRun](https://blockrun.ai) — AI model marketplace
  - [x402 Protocol](https://www.x402.org/) — Per-request crypto payments for AI
 
  ---
 
- *Cost data based on real production traffic from paying users across 20,000+ requests, March 2026. Savings vary by workload — agent-heavy and long-context workloads see larger compression benefits. ClawRouter is open-source and part of the BlockRun ecosystem.*
+ _Cost data based on real production traffic from paying users across 20,000+ requests, March 2026. Savings vary by workload — agent-heavy and long-context workloads see larger compression benefits. ClawRouter is open-source and part of the BlockRun ecosystem._
@@ -378,8 +378,8 @@ const solanaAccount = await deriveSlip10Ed25519Key(mnemonic, "m/44'/501'/0'/0'")
  // Build SPL Token USDC transfer instruction
  const transaction = buildSolanaPaymentTransaction({
  from: solanaAddress,
- to: payTo, // base58 recipient
- mint: USDC_SOLANA, // EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v
+ to: payTo, // base58 recipient
+ mint: USDC_SOLANA, // EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v
  amount: BigInt(5000), // 0.005 USDC (6 decimals)
  });
 
@@ -547,13 +547,13 @@ src/
 
  ### Key Files
 
- | File | Purpose |
- | --------------------- | ----------------------------------------------------------- |
- | `proxy.ts` | Core request handling, SSE simulation, fallback chain |
- | `wallet.ts` | BIP-39 mnemonic generation, EVM + Solana (SLIP-10) derivation |
- | `router/rules.ts` | 15-dimension weighted scorer, 9-language keyword sets |
- | `x402.ts` | EIP-712 typed data signing, payment header formatting |
- | `balance.ts` | USDC balance via Base RPC (EVM), caching, thresholds |
- | `solana-balance.ts` | USDC balance via Solana RPC (SPL Token), caching, retries |
- | `payment-preauth.ts` | Pre-authorization cache (EVM; skipped for Solana) |
- | `dedup.ts` | SHA-256 hashing, 30s response cache |
+ | File | Purpose |
+ | -------------------- | ------------------------------------------------------------- |
+ | `proxy.ts` | Core request handling, SSE simulation, fallback chain |
+ | `wallet.ts` | BIP-39 mnemonic generation, EVM + Solana (SLIP-10) derivation |
+ | `router/rules.ts` | 15-dimension weighted scorer, 9-language keyword sets |
+ | `x402.ts` | EIP-712 typed data signing, payment header formatting |
+ | `balance.ts` | USDC balance via Base RPC (EVM), caching, thresholds |
+ | `solana-balance.ts` | USDC balance via Solana RPC (SPL Token), caching, retries |
+ | `payment-preauth.ts` | Pre-authorization cache (EVM; skipped for Solana) |
+ | `dedup.ts` | SHA-256 hashing, 30s response cache |
@@ -1,6 +1,6 @@
  # The Most AI-Agent-Native Router for OpenClaw
 
- > *OpenClaw is one of the best AI agent frameworks available. Its LLM abstraction layer is not.*
+ > _OpenClaw is one of the best AI agent frameworks available. Its LLM abstraction layer is not._
 
  ---
 
@@ -10,9 +10,9 @@
 
  From [openclaw/openclaw#3181](https://github.com/openclaw/openclaw/issues/3181):
 
- > *"We ended up at $248/day before we caught it. Heartbeat on Opus 4.6 with a large context. The dedup fix reduced trigger rate, but there's nothing bounding the run itself."*
+ > _"We ended up at $248/day before we caught it. Heartbeat on Opus 4.6 with a large context. The dedup fix reduced trigger rate, but there's nothing bounding the run itself."_
 
- > *"11.3M input tokens in 1 hour on claude-opus-4-6 (128K context), ~$20/hour."*
+ > _"11.3M input tokens in 1 hour on claude-opus-4-6 (128K context), ~$20/hour."_
 
  Both users ended up disabling heartbeat entirely. The workaround: `heartbeat.every: "0"` — turning off the feature to avoid burning money.
 
@@ -62,15 +62,15 @@ Agents are the worst offenders for context bloat. Tool call results are verbose.
 
  ClawRouter compresses every request through 7 layers before it hits the wire:
 
- | Layer | What it does | Saves |
- |-------|-------------|-------|
- | Deduplication | Removes repeated messages (retries, echoes) | Variable |
- | Whitespace | Strips excessive whitespace from all content | 2–8% |
- | Dictionary | Replaces common phrases with short codes | 5–15% |
- | Path shortening | Codebook for repeated file paths in tool results | 3–10% |
- | JSON compaction | Removes whitespace from embedded JSON | 5–12% |
- | **Observation compression** | **Summarizes tool results to key information** | **Up to 97%** |
- | Dynamic codebook | Learns repetitions in the actual conversation | 3–15% |
+ | Layer | What it does | Saves |
+ | --------------------------- | ------------------------------------------------ | ------------- |
+ | Deduplication | Removes repeated messages (retries, echoes) | Variable |
+ | Whitespace | Strips excessive whitespace from all content | 2–8% |
+ | Dictionary | Replaces common phrases with short codes | 5–15% |
+ | Path shortening | Codebook for repeated file paths in tool results | 3–10% |
+ | JSON compaction | Removes whitespace from embedded JSON | 5–12% |
+ | **Observation compression** | **Summarizes tool results to key information** | **Up to 97%** |
+ | Dynamic codebook | Learns repetitions in the actual conversation | 3–15% |
 
  Layer 6 is the big one. Tool results — file reads, API responses, shell output — can be 10KB+ each. The actual useful signal is often 200–300 chars. ClawRouter extracts errors, status lines, key JSON fields, and compresses the rest. Same model intelligence, 97% fewer tokens on the bulk.
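
To make the idea of observation compression concrete, here is a deliberately crude Python sketch that keeps only lines resembling errors or status output and replaces the rest with a count. The `SIGNAL` pattern, the limits, and the function name are all assumptions for illustration; the selection rules ClawRouter actually uses are not shown in this text.

```python
import re

# Hypothetical pattern for "signal" lines worth keeping in a tool result.
SIGNAL = re.compile(r"error|exception|failed|warning|status|exit code", re.IGNORECASE)

def compress_observation(output, max_lines=20, max_line_len=200):
    """Keep signal lines (truncated) and summarize everything else as a count."""
    lines = output.splitlines()
    kept = [line.strip()[:max_line_len] for line in lines if SIGNAL.search(line)]
    kept = kept[:max_lines]
    dropped = len(lines) - len(kept)
    if dropped > 0:
        kept.append(f"[{dropped} lines omitted]")
    return "\n".join(kept)
```

A heuristic like this is lossy by design, so it suits bulky tool output where only the outcome matters and is a poor fit for content the model must read in full.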
 
@@ -153,20 +153,20 @@ There is no monthly invoice. There is no 3am email. There is a wallet balance, a
 
  <p align="center"><img src="assets/blockrun-clawrouter-vs-openclaw-standalone-comparison-production-safety.png" alt="Architecting for production safety — OpenClaw standalone vs OpenClaw + ClawRouter comparison across cost, context, error handling, and budgeting" width="720"></p>
 
- | Problem | OpenClaw alone | OpenClaw + ClawRouter |
- |---------|---------------|----------------------|
- | Heartbeat cost overrun | No per-run cap | Tier routing → 50–500× cheaper model |
- | Large context | Full context every call | 7-layer compression, 15–40% reduction |
- | Tool result bloat | Raw output forwarded | Observation compression, up to 97% |
- | Rate limit contaminates profile | All models penalized (#49834) | Per-model 60s cooldown, others unaffected |
- | Empty / degraded 200 response | Passed through to agent (#49902) | Detected, triggers model fallback |
- | Short-burst 429 failover | Immediate failover to next model | 200ms retry first, failover only if needed |
- | MiniMax 520 failure | Silent drop / retry storm | Classified as server_error, retried correctly |
- | Z.ai 1311 (billing) | Treated as rate_limit, retried | Classified as billing, stopped immediately |
- | Mid-task model switch | Model can change mid-session | Session pinning, consistent model per task |
- | Monthly billing surprise | Possible | Wallet-based, stops when empty |
- | Per-session cost ceiling | None | `maxCostPerRun` — graceful or strict cap |
- | Cost visibility | None | `/stats` with per-provider error counts |
+ | Problem | OpenClaw alone | OpenClaw + ClawRouter |
+ | ------------------------------- | -------------------------------- | --------------------------------------------- |
+ | Heartbeat cost overrun | No per-run cap | Tier routing → 50–500× cheaper model |
+ | Large context | Full context every call | 7-layer compression, 15–40% reduction |
+ | Tool result bloat | Raw output forwarded | Observation compression, up to 97% |
+ | Rate limit contaminates profile | All models penalized (#49834) | Per-model 60s cooldown, others unaffected |
+ | Empty / degraded 200 response | Passed through to agent (#49902) | Detected, triggers model fallback |
+ | Short-burst 429 failover | Immediate failover to next model | 200ms retry first, failover only if needed |
+ | MiniMax 520 failure | Silent drop / retry storm | Classified as server_error, retried correctly |
+ | Z.ai 1311 (billing) | Treated as rate_limit, retried | Classified as billing, stopped immediately |
+ | Mid-task model switch | Model can change mid-session | Session pinning, consistent model per task |
+ | Monthly billing surprise | Possible | Wallet-based, stops when empty |
+ | Per-session cost ceiling | None | `maxCostPerRun` — graceful or strict cap |
+ | Cost visibility | None | `/stats` with per-provider error counts |
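
Two rows of this table, the 200ms retry before failover and the per-model 60s cooldown, can be combined into one small control loop. The sketch below is a hypothetical Python rendering of that behavior: `RateLimited`, `call_with_fallback`, and the `send` callback are invented for illustration, and only the 200ms and 60s figures come from the table.

```python
import time

COOLDOWN_SECONDS = 60.0   # per-model cooldown from the table
SHORT_RETRY_DELAY = 0.2   # 200ms retry before failing over

class RateLimited(Exception):
    """Hypothetical error raised by `send` when a provider returns 429."""

def call_with_fallback(models, send, cooldown_until,
                       now=time.monotonic, sleep=time.sleep):
    """Try each model in order; retry once on 429, then cool it down and move on."""
    for model in models:
        if cooldown_until.get(model, 0.0) > now():
            continue  # this model is cooling down; others are unaffected
        for attempt in range(2):
            try:
                return model, send(model)
            except RateLimited:
                if attempt == 0:
                    sleep(SHORT_RETRY_DELAY)  # short burst: retry the same model
                else:
                    cooldown_until[model] = now() + COOLDOWN_SECONDS
    raise RuntimeError("all models are rate limited or cooling down")
```

Keeping the cooldown in a per-model dictionary is what stops one provider's rate limit from penalizing the rest of the chain.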
 
  ---
 
@@ -194,4 +194,4 @@ That's what ClawRouter is for.
 
  ---
 
- *[github.com/BlockRunAI/ClawRouter](https://github.com/BlockRunAI/ClawRouter) · [blockrun.ai](https://blockrun.ai) · `npm install -g @blockrun/clawrouter`*
+ _[github.com/BlockRunAI/ClawRouter](https://github.com/BlockRunAI/ClawRouter) · [blockrun.ai](https://blockrun.ai) · `npm install -g @blockrun/clawrouter`_