@blockrun/clawrouter 0.12.63 → 0.12.65
- package/README.md +55 -55
- package/dist/cli.js +50 -14
- package/dist/cli.js.map +1 -1
- package/dist/index.js +57 -16
- package/dist/index.js.map +1 -1
- package/docs/anthropic-cost-savings.md +90 -85
- package/docs/architecture.md +12 -12
- package/docs/{blog-openclaw-cost-overruns.md → clawrouter-cuts-llm-api-costs-500x.md} +27 -27
- package/docs/clawrouter-vs-openrouter-llm-routing-comparison.md +280 -0
- package/docs/configuration.md +2 -2
- package/docs/image-generation.md +39 -39
- package/docs/{blog-benchmark-2026-03.md → llm-router-benchmark-46-models-sub-1ms-routing.md} +61 -64
- package/docs/routing-profiles.md +6 -6
- package/docs/{technical-routing-2026-03.md → smart-llm-router-14-dimension-classifier.md} +29 -28
- package/docs/worker-network.md +438 -347
- package/package.json +3 -2
- package/scripts/reinstall.sh +31 -6
- package/scripts/update.sh +6 -1
- package/docs/assets/blockrun-248-day-cost-overrun-problem.png +0 -0
- package/docs/assets/blockrun-clawrouter-7-layer-token-compression-openclaw.png +0 -0
- package/docs/assets/blockrun-clawrouter-observation-compression-97-percent-token-savings.png +0 -0
- package/docs/assets/blockrun-clawrouter-openclaw-agentic-proxy-architecture.png +0 -0
- package/docs/assets/blockrun-clawrouter-openclaw-automatic-tier-routing-model-selection.png +0 -0
- package/docs/assets/blockrun-clawrouter-openclaw-error-classification-retry-storm-prevention.png +0 -0
- package/docs/assets/blockrun-clawrouter-openclaw-session-memory-journaling-vs-context-compounding.png +0 -0
- package/docs/assets/blockrun-clawrouter-vs-openclaw-standalone-comparison-production-safety.png +0 -0
- package/docs/assets/blockrun-clawrouter-x402-usdc-micropayment-wallet-budget-control.png +0 -0
- package/docs/assets/blockrun-openclaw-inference-layer-blind-spots.png +0 -0
- package/docs/plans/2026-02-03-smart-routing-design.md +0 -267
- package/docs/plans/2026-02-13-e2e-docker-deployment.md +0 -1260
- package/docs/plans/2026-02-28-worker-network.md +0 -947
- package/docs/plans/2026-03-18-error-classification.md +0 -574
- package/docs/plans/2026-03-19-exclude-models.md +0 -538
- package/docs/vs-openrouter.md +0 -157
@@ -1,6 +1,6 @@
 # Stop Overpaying for Claude: How ClawRouter Cuts Your Anthropic Bill by 70%

-
+_You love Claude. Your wallet doesn't. Here's how to keep frontier-quality answers — at a fraction of the cost._

 ---

@@ -87,17 +87,17 @@ ClawRouter scores every prompt against 14 dimensions in <1ms and routes it to th

 From real production data across 20,000+ paying user requests:

-| Model
-
-| gemini-2.5-flash-lite | 34.5%
-| **claude-sonnet-4.6** | **22.7%**
-| kimi-k2.5
-| minimax-m2.5
-| grok-code-fast
-| claude-haiku-4.5
-| nvidia/gpt-oss-120b
-| grok-reasoning
-| Others
+| Model | % of Requests | Price (input/output per M) |
+| --------------------- | ------------- | -------------------------- |
+| gemini-2.5-flash-lite | 34.5% | $0.10 / $0.40 |
+| **claude-sonnet-4.6** | **22.7%** | **$3.00 / $15.00** |
+| kimi-k2.5 | 16.2% | $0.60 / $3.00 |
+| minimax-m2.5 | 6.5% | $0.30 / $1.20 |
+| grok-code-fast | 6.1% | $0.20 / $1.50 |
+| claude-haiku-4.5 | 2.7% | $1.00 / $5.00 |
+| nvidia/gpt-oss-120b | 2.1% | FREE |
+| grok-reasoning | 2.9% | $0.20 / $0.50 |
+| Others | 6.3% | varies |

 **Result:** 77% of requests go to models that cost 5-150x less than Sonnet. Only the ~23% that genuinely need Claude still go to Claude.

@@ -107,11 +107,11 @@ Even when a request does go to Claude, ClawRouter reduces the tokens you pay for

 **How it works:**

-| Compression Layer
-
-| **Deduplication**
-| **Whitespace normalization** | Strips excess whitespace, trailing spaces, empty lines | 3-8%
-| **JSON compaction**
+| Compression Layer | What It Does | Savings |
+| ---------------------------- | ------------------------------------------------------ | ------- |
+| **Deduplication** | Removes duplicate messages in conversation history | 2-5% |
+| **Whitespace normalization** | Strips excess whitespace, trailing spaces, empty lines | 3-8% |
+| **JSON compaction** | Minifies JSON in tool calls and results | 2-4% |

 These three layers are **enabled by default** and are completely safe — they don't change semantic meaning. The compression triggers automatically on requests larger than 180KB (common in agent workflows and long conversations).

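The three default layers in the table above could look roughly like the following TypeScript sketch. The `ChatMessage` shape, the function names, and the reading of "180KB" as 180 × 1024 characters of serialized JSON are assumptions made for illustration; the package's actual compression code is not part of this diff.

```typescript
// Hypothetical message shape; the real ClawRouter types are not shown here.
interface ChatMessage {
  role: string;
  content: string;
}

// Assumption: "larger than 180KB" is measured on the serialized request.
const COMPRESSION_THRESHOLD = 180 * 1024;

// Deduplication: drop messages that exactly repeat an earlier role/content pair.
function dedupeMessages(messages: ChatMessage[]): ChatMessage[] {
  const seen = new Set<string>();
  const kept: ChatMessage[] = [];
  for (const message of messages) {
    const key = JSON.stringify([message.role, message.content]);
    if (!seen.has(key)) {
      seen.add(key);
      kept.push(message);
    }
  }
  return kept;
}

// Whitespace normalization: strip trailing whitespace and drop empty lines.
// Leading indentation is kept so code snippets stay intact.
function normalizeWhitespace(text: string): string {
  return text
    .split("\n")
    .map((line) => line.replace(/[ \t]+$/, ""))
    .filter((line) => line.length > 0)
    .join("\n");
}

// JSON compaction: if the text parses as JSON, re-serialize it without indentation.
function compactJson(text: string): string {
  try {
    return JSON.stringify(JSON.parse(text));
  } catch {
    return text; // not JSON, leave untouched
  }
}

// Apply all three layers, but only when the request exceeds the threshold.
function compressMessages(
  messages: ChatMessage[],
  threshold: number = COMPRESSION_THRESHOLD,
): ChatMessage[] {
  if (JSON.stringify(messages).length <= threshold) {
    return messages;
  }
  return dedupeMessages(messages).map((message) => ({
    role: message.role,
    content: normalizeWhitespace(compactJson(message.content)),
  }));
}
```

Passing a threshold of `0` forces compression on small inputs, which is convenient for trying the sketch on a short conversation.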
@@ -126,6 +126,7 @@ This matters most on expensive models. If you're sending a 50K-token agent conve
 ClawRouter caches responses locally. If your app sends the same request within 10 minutes, you get an instant response at **zero cost** — no API call, no tokens billed.

 This is more common than you'd think:
+
 - **Retry logic** — Your app retries on timeout. Without dedup, you pay twice. With ClawRouter, the retry resolves from cache instantly.
 - **Redundant requests** — Multiple users or processes asking the same thing? One API call, multiple responses.
 - **Agent loops** — Agentic frameworks often re-query with identical context. Cache catches these.
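The caching behavior described in this hunk can be approximated with a small time-to-live cache. This is a minimal sketch under stated assumptions: the class name, the `keyFor` helper, and the injected clock are hypothetical, and the 10-minute default simply mirrors the figure quoted in the text.

```typescript
// Minimal TTL cache sketch; names and shapes are illustrative assumptions.
class ResponseCache<T> {
  private readonly entries = new Map<string, { value: T; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number = 10 * 60 * 1000, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Identical request bodies serialize to the same key. Note that JSON.stringify
  // is sensitive to property order; a real implementation would likely
  // canonicalize and hash the body instead.
  static keyFor(body: unknown): string {
    return JSON.stringify(body);
  }

  // Returns the cached value, or undefined if missing or expired.
  get(key: string): T | undefined {
    const hit = this.entries.get(key);
    if (hit === undefined) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(key);
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: T): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Wraps an upstream call so that repeats inside the TTL never reach the API.
async function cachedCall<T>(
  cache: ResponseCache<T>,
  body: unknown,
  callApi: () => Promise<T>,
): Promise<T> {
  const key = ResponseCache.keyFor(body);
  const cached = cache.get(key);
  if (cached !== undefined) return cached;
  const fresh = await callApi();
  cache.set(key, fresh);
  return fresh;
}
```

Injecting the clock keeps expiry testable without waiting for real time to pass.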
@@ -146,44 +147,44 @@ The deduplicator also catches in-flight duplicates: if two identical requests ar

 ### Direct Anthropic API

-| Approach
-
-| All Claude Sonnet | $30.00
-| All Claude Opus
+| Approach | Input (10M tokens) | Output (5M tokens) | Monthly Total |
+| ----------------- | ------------------ | ------------------ | ------------- |
+| All Claude Sonnet | $30.00 | $75.00 | **$105.00** |
+| All Claude Opus | $50.00 | $125.00 | **$175.00** |

 ### ClawRouter (real paying-user distribution)

-| Tier
-
-| Cheap models
-| Mid-tier
-| **Claude (complex)**
-| Code models
-| Reasoning
-| Haiku
-| Free
-| Other
-| **Subtotal (routing)**
-| Token compression (~10%) |
-| Cache hits (~5% est.)
-| **Final Total**
+| Tier | % Requests | Routed To | Cost |
+| ------------------------ | ---------- | --------------------- | ----------- |
+| Cheap models | 34.5% | gemini-flash-lite | $0.76 |
+| Mid-tier | 16.2% | kimi-k2.5 | $2.43 |
+| **Claude (complex)** | **22.7%** | **claude-sonnet-4.6** | **$17.44** |
+| Code models | 6.1% | grok-code-fast | $0.52 |
+| Reasoning | 2.9% | grok-reasoning | $0.03 |
+| Haiku | 2.7% | claude-haiku-4.5 | $0.76 |
+| Free | 2.1% | nvidia/gpt-oss-120b | $0.00 |
+| Other | 12.8% | various | $1.18 |
+| **Subtotal (routing)** | | | **$23.12** |
+| Token compression (~10%) | | | **-$2.31** |
+| Cache hits (~5% est.) | | | **-$1.16** |
+| **Final Total** | | | **~$19.65** |

 ### The Bottom Line

-| Approach
-
-| Direct Claude Sonnet | $105.00
-| Direct Claude Opus
-| **ClawRouter**
+| Approach | Monthly Cost | Savings |
+| -------------------- | ------------ | -------------------------------- |
+| Direct Claude Sonnet | $105.00 | — |
+| Direct Claude Opus | $175.00 | — |
+| **ClawRouter** | **~$20** | **~81% vs Sonnet, ~89% vs Opus** |

 Breaking down where the savings come from:

-| Savings Source
-
-| **Smart routing**
-| **Token compression** | ~7-15% on remaining cost | Fewer tokens billed per request
-| **Response cache**
-| **Request dedup**
+| Savings Source | Estimated Impact | How |
+| --------------------- | ------------------------ | -------------------------------- |
+| **Smart routing** | ~68% cost reduction | 77% of requests → cheaper models |
+| **Token compression** | ~7-15% on remaining cost | Fewer tokens billed per request |
+| **Response cache** | ~3-5% additional | Repeat requests cost $0 |
+| **Request dedup** | Prevents overcharges | Retries don't double-bill |

 ---

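The totals in the added tables can be re-derived with a few lines of arithmetic. The sketch below works in cents to avoid floating-point drift. The per-tier costs are copied from the table as given, the Opus price of $5/$25 per million tokens is inferred from the $50.00 and $125.00 cells, and both percentage discounts are applied to the routing subtotal because that is what reproduces the table's figures. All function and variable names are illustrative.

```typescript
// Direct API cost in cents for the stated volume: 10M input + 5M output tokens.
function directCostCents(inputPricePerM: number, outputPricePerM: number): number {
  const inputMillions = 10;
  const outputMillions = 5;
  return Math.round((inputMillions * inputPricePerM + outputMillions * outputPricePerM) * 100);
}

const sonnetCents = directCostCents(3.0, 15.0); // $3 / $15 per M, from the price table
const opusCents = directCostCents(5.0, 25.0); // inferred from the $50.00 / $125.00 cells

// Per-tier routed costs in cents, copied from the "Cost" column:
// cheap, mid-tier, Claude, code, reasoning, Haiku, free, other.
const tierCostsCents = [76, 243, 1744, 52, 3, 76, 0, 118];
const routingSubtotalCents = tierCostsCents.reduce((sum, cost) => sum + cost, 0);

// Both discounts are taken off the routing subtotal, matching the table rows.
const compressionSavingCents = Math.round(routingSubtotalCents * 0.1);
const cacheSavingCents = Math.round(routingSubtotalCents * 0.05);
const finalTotalCents = routingSubtotalCents - compressionSavingCents - cacheSavingCents;

// Percentage saved against a direct-API baseline, rounded to a whole percent.
function savingsPercent(baselineCents: number): number {
  return Math.round((1 - finalTotalCents / baselineCents) * 100);
}

console.log(`Sonnet direct: $${(sonnetCents / 100).toFixed(2)}`);
console.log(`Opus direct: $${(opusCents / 100).toFixed(2)}`);
console.log(`Routing subtotal: $${(routingSubtotalCents / 100).toFixed(2)}`);
console.log(`Final total: $${(finalTotalCents / 100).toFixed(2)}`);
console.log(`Savings: ${savingsPercent(sonnetCents)}% vs Sonnet, ${savingsPercent(opusCents)}% vs Opus`);
```

Run as written, this reproduces the $105.00, $175.00, $23.12, and ~$19.65 figures and the ~81% and ~89% savings shown in the tables.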
@@ -191,22 +192,22 @@ Breaking down where the savings come from:

 ClawRouter runs a weighted scoring algorithm on every prompt — entirely locally, in under 1 millisecond, zero external API calls.

-| Dimension
-
-| Reasoning Markers
-| Code Presence
-| Multi-Step Patterns
-| Technical Terms
-| Token Count
-| Question Complexity
-| Creative Markers
-| Constraint Count
-| Imperative Verbs
-| Output Format
-| Simple Indicators
-| Reference Complexity | 0.02
-| Domain Specificity
-| Negation Complexity
+| Dimension | Weight | Detects |
+| -------------------- | ------ | ------------------------------------------ |
+| Reasoning Markers | 0.18 | "prove," "step by step," "analyze" |
+| Code Presence | 0.15 | `function`, `class`, `import`, code blocks |
+| Multi-Step Patterns | 0.12 | "first...then," numbered steps |
+| Technical Terms | 0.10 | Domain-specific vocabulary |
+| Token Count | 0.08 | Short vs. long context |
+| Question Complexity | 0.05 | Nested or compound questions |
+| Creative Markers | 0.05 | Creative writing indicators |
+| Constraint Count | 0.04 | "max," "minimum," "at most" |
+| Imperative Verbs | 0.03 | "create," "generate," "build" |
+| Output Format | 0.03 | JSON, YAML, table, markdown |
+| Simple Indicators | 0.02 | "what is," "define," "translate" |
+| Reference Complexity | 0.02 | "the code above," "the docs" |
+| Domain Specificity | 0.02 | Quantum, genomics, etc. |
+| Negation Complexity | 0.01 | "don't," "never," "avoid" |

 The weighted score maps to four tiers:

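A weighted scorer over the dimensions in this table might be sketched as follows. The weights are copied from the table; everything else is an assumption for illustration: the identifier names, the 0-or-1 signal per dimension, and the case-insensitive substring matching. Only a handful of dimensions get keyword lists here, and the tier thresholds are not visible in this hunk, so the sketch stops at the score. Note that the 14 weights as listed sum to 0.90, and the sketch does not normalize them.

```typescript
// Weights copied from the table in this hunk (14 dimensions).
const WEIGHTS: Record<string, number> = {
  reasoningMarkers: 0.18,
  codePresence: 0.15,
  multiStepPatterns: 0.12,
  technicalTerms: 0.1,
  tokenCount: 0.08,
  questionComplexity: 0.05,
  creativeMarkers: 0.05,
  constraintCount: 0.04,
  imperativeVerbs: 0.03,
  outputFormat: 0.03,
  simpleIndicators: 0.02,
  referenceComplexity: 0.02,
  domainSpecificity: 0.02,
  negationComplexity: 0.01,
};

// Keyword lists taken from the "Detects" column for a few dimensions.
// The matching rule itself is a guess, not ClawRouter's actual logic.
const KEYWORDS: Record<string, string[]> = {
  reasoningMarkers: ["prove", "step by step", "analyze"],
  codePresence: ["function", "class", "import"],
  constraintCount: ["max", "minimum", "at most"],
  imperativeVerbs: ["create", "generate", "build"],
  negationComplexity: ["don't", "never", "avoid"],
};

// Produce a 0/1 signal per dimension that has a keyword list.
function extractSignals(prompt: string): Record<string, number> {
  const lower = prompt.toLowerCase();
  const signals: Record<string, number> = {};
  for (const [dimension, words] of Object.entries(KEYWORDS)) {
    signals[dimension] = words.some((word) => lower.includes(word)) ? 1 : 0;
  }
  return signals;
}

// Weighted sum over all dimensions; dimensions without a signal count as 0.
function weightedScore(signals: Record<string, number>): number {
  let score = 0;
  for (const [dimension, weight] of Object.entries(WEIGHTS)) {
    score += weight * (signals[dimension] ?? 0);
  }
  return score;
}

const examplePrompt = "Prove step by step that this function never loops";
console.log(weightedScore(extractSignals(examplePrompt)).toFixed(2));
```

For the example prompt, the reasoning, code, and negation dimensions fire, so the score is 0.18 + 0.15 + 0.01.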
@@ -234,6 +235,7 @@ Starts a local proxy on port 8402. Auto-generates a crypto wallet. Done.
 ### Step 2: Update Your Code

 **Python** — change 2 lines:
+
 ```python
 from openai import OpenAI

@@ -249,6 +251,7 @@ response = client.chat.completions.create(
 ```

 **TypeScript** — same idea:
+
 ```typescript
 import OpenAI from "openai";

@@ -258,12 +261,13 @@ const client = new OpenAI({
 });

 const response = await client.chat.completions.create({
-  model: "blockrun/auto",
+  model: "blockrun/auto", // or "eco" for max savings, "premium" for best quality
   messages: [{ role: "user", content: "Your prompt here" }],
 });
 ```

 **Routing profiles:**
+
 - `blockrun/auto` — Balanced cost/quality (default)
 - `blockrun/eco` — Maximum savings (free tier aggressively)
 - `blockrun/premium` — Best quality (Opus/Sonnet/GPT-5)
@@ -305,17 +309,17 @@ $ /stats 7

 ## Why ClawRouter Instead of OpenRouter?

-|
-
-| **Smart routing**
-| **Token optimization** | Built-in compression (7-15% savings)
-| **Response caching**
-| **Request dedup**
-| **Routing latency**
-| **Payments**
-| **Free tier**
-| **API keys**
-| **Algorithm**
+| | ClawRouter | OpenRouter |
+| ---------------------- | --------------------------------------------------- | ---------------------------------- |
+| **Smart routing** | Automatic — 14-dimension scorer picks the model | Manual — you pick the model |
+| **Token optimization** | Built-in compression (7-15% savings) | None |
+| **Response caching** | Local cache, repeat requests = $0 | None |
+| **Request dedup** | Retries don't double-bill | None |
+| **Routing latency** | <1ms (local, on your machine) | Additional network hop |
+| **Payments** | Non-custodial USDC on Base (your wallet, your keys) | Prepaid credit balance (custodial) |
+| **Free tier** | NVIDIA GPT-OSS-120B (always available) | No free models |
+| **API keys** | Zero — proxy handles all auth | You manage keys per provider |
+| **Algorithm** | Open-source, MIT license, modify it yourself | Proprietary |

 The fundamental difference: **OpenRouter is a model marketplace where you choose.** ClawRouter is an intelligent proxy that **chooses for you**, compresses your tokens, caches your responses, and pays per-request with crypto from your own wallet.

@@ -323,16 +327,16 @@ The fundamental difference: **OpenRouter is a model marketplace where you choose

 ## TL;DR

-| What
-
-| **Problem**
-| **Solution**
-| **Savings**
-| **How**
-| **Code change**
-| **Setup time**
-| **Quality tradeoff** | None — complex tasks still go to Claude
-| **Open source**
+| What | Details |
+| -------------------- | -------------------------------------------------------------------------- |
+| **Problem** | You pay Claude $3-25/M tokens on every request, but ~70% don't need Claude |
+| **Solution** | ClawRouter auto-routes + compresses + caches |
+| **Savings** | ~81% vs Sonnet, ~89% vs Opus |
+| **How** | Routing (68%) + token compression (7-15%) + caching (3-5%) |
+| **Code change** | 2 lines (base_url + model name) |
+| **Setup time** | 3 minutes |
+| **Quality tradeoff** | None — complex tasks still go to Claude |
+| **Open source** | MIT license, local proxy, non-custodial payments |

 ```bash
 # Start saving now:
@@ -340,10 +344,11 @@ npx @blockrun/clawrouter
 ```

 **Links:**
+
 - [ClawRouter on GitHub](https://github.com/blockrunai/ClawRouter) — MIT License
 - [BlockRun](https://blockrun.ai) — AI model marketplace
 - [x402 Protocol](https://www.x402.org/) — Per-request crypto payments for AI

 ---

-
+_Cost data based on real production traffic from paying users across 20,000+ requests, March 2026. Savings vary by workload — agent-heavy and long-context workloads see larger compression benefits. ClawRouter is open-source and part of the BlockRun ecosystem._
package/docs/architecture.md CHANGED
@@ -378,8 +378,8 @@ const solanaAccount = await deriveSlip10Ed25519Key(mnemonic, "m/44'/501'/0'/0'")
 // Build SPL Token USDC transfer instruction
 const transaction = buildSolanaPaymentTransaction({
   from: solanaAddress,
-  to: payTo,
-  mint: USDC_SOLANA,
+  to: payTo, // base58 recipient
+  mint: USDC_SOLANA, // EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v
   amount: BigInt(5000), // 0.005 USDC (6 decimals)
 });

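The `amount: BigInt(5000)` line relies on USDC using 6 decimal places, so 0.005 USDC is 5000 base units. A small conversion helper makes that relationship explicit. The helper name `usdcToBaseUnits` and its string-based input are assumptions for illustration and are not part of the package.

```typescript
// USDC uses 6 decimal places, as noted in the snippet's own comment.
const USDC_DECIMALS = 6;

// Converts a decimal string such as "0.005" into integer base units.
// Parsing the string directly avoids floating-point rounding errors.
function usdcToBaseUnits(amount: string): bigint {
  const match = /^(\d+)(?:\.(\d{1,6}))?$/.exec(amount);
  if (match === null) {
    throw new Error(`Invalid USDC amount: ${amount}`);
  }
  const whole = match[1] ?? "0";
  // Right-pad the fractional part to exactly 6 digits, e.g. "005" -> "005000".
  const fraction = (match[2] ?? "").padEnd(USDC_DECIMALS, "0");
  return BigInt(whole) * BigInt(10 ** USDC_DECIMALS) + BigInt(fraction);
}

console.log(usdcToBaseUnits("0.005").toString());
```

Amounts with more than six fractional digits are rejected rather than silently rounded, since a payment amount should never change without the caller noticing.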
@@ -547,13 +547,13 @@ src/

 ### Key Files

-| File
-|
-| `proxy.ts`
-| `wallet.ts`
-| `router/rules.ts`
-| `x402.ts`
-| `balance.ts`
-| `solana-balance.ts`
-| `payment-preauth.ts`
-| `dedup.ts`
+| File | Purpose |
+| -------------------- | ------------------------------------------------------------- |
+| `proxy.ts` | Core request handling, SSE simulation, fallback chain |
+| `wallet.ts` | BIP-39 mnemonic generation, EVM + Solana (SLIP-10) derivation |
+| `router/rules.ts` | 15-dimension weighted scorer, 9-language keyword sets |
+| `x402.ts` | EIP-712 typed data signing, payment header formatting |
+| `balance.ts` | USDC balance via Base RPC (EVM), caching, thresholds |
+| `solana-balance.ts` | USDC balance via Solana RPC (SPL Token), caching, retries |
+| `payment-preauth.ts` | Pre-authorization cache (EVM; skipped for Solana) |
+| `dedup.ts` | SHA-256 hashing, 30s response cache |
@@ -1,6 +1,6 @@
 # The Most AI-Agent-Native Router for OpenClaw

->
+> _OpenClaw is one of the best AI agent frameworks available. Its LLM abstraction layer is not._

 ---

@@ -10,9 +10,9 @@

 From [openclaw/openclaw#3181](https://github.com/openclaw/openclaw/issues/3181):

->
+> _"We ended up at $248/day before we caught it. Heartbeat on Opus 4.6 with a large context. The dedup fix reduced trigger rate, but there's nothing bounding the run itself."_

->
+> _"11.3M input tokens in 1 hour on claude-opus-4-6 (128K context), ~$20/hour."_

 Both users ended up disabling heartbeat entirely. The workaround: `heartbeat.every: "0"` — turning off the feature to avoid burning money.

@@ -62,15 +62,15 @@ Agents are the worst offenders for context bloat. Tool call results are verbose.

 ClawRouter compresses every request through 7 layers before it hits the wire:

-| Layer
-
-| Deduplication
-| Whitespace
-| Dictionary
-| Path shortening
-| JSON compaction
-| **Observation compression** | **Summarizes tool results to key information**
-| Dynamic codebook
+| Layer | What it does | Saves |
+| --------------------------- | ------------------------------------------------ | ------------- |
+| Deduplication | Removes repeated messages (retries, echoes) | Variable |
+| Whitespace | Strips excessive whitespace from all content | 2–8% |
+| Dictionary | Replaces common phrases with short codes | 5–15% |
+| Path shortening | Codebook for repeated file paths in tool results | 3–10% |
+| JSON compaction | Removes whitespace from embedded JSON | 5–12% |
+| **Observation compression** | **Summarizes tool results to key information** | **Up to 97%** |
+| Dynamic codebook | Learns repetitions in the actual conversation | 3–15% |

 Layer 6 is the big one. Tool results — file reads, API responses, shell output — can be 10KB+ each. The actual useful signal is often 200–300 chars. ClawRouter extracts errors, status lines, key JSON fields, and compresses the rest. Same model intelligence, 97% fewer tokens on the bulk.

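To make the idea behind layer 6 concrete, here is a deliberately simple sketch of observation compression: keep lines that look like errors or status output and replace everything else with a count. The regular expression, the 300-character budget (taken from the "200–300 chars" remark above), and the function name are all assumptions. The package's real extraction rules, including its handling of key JSON fields, are not shown in this diff.

```typescript
// Lines matching this pattern are treated as signal. The pattern is a guess,
// not ClawRouter's actual rule set.
const SIGNAL_PATTERN = /\b(error|failed|exception|warning|status|exit code)\b/i;

// Keep signal lines up to a character budget and summarize the rest as a count.
function compressObservation(toolOutput: string, maxChars: number = 300): string {
  if (toolOutput.length <= maxChars) {
    return toolOutput; // already small, nothing to gain
  }
  const lines = toolOutput.split("\n");
  const kept: string[] = [];
  let used = 0;
  for (const line of lines) {
    if (!SIGNAL_PATTERN.test(line)) continue;
    if (used + line.length + 1 > maxChars) break;
    kept.push(line);
    used += line.length + 1;
  }
  kept.push(`[${lines.length - kept.length} lines omitted]`);
  return kept.join("\n");
}

// Example: 200 routine lines, one error in the middle, one exit status at the end.
const noisyOutput = Array.from(
  { length: 200 },
  (_, i) => `processing file ${i} of 200 without problems`,
);
noisyOutput.splice(100, 0, "ERROR: disk full");
noisyOutput.push("exit code 1");
console.log(compressObservation(noisyOutput.join("\n")));
```

A purely pattern-based filter like this one can drop context a model needs, which is presumably why the described layer also extracts key JSON fields rather than relying on line matching alone.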
@@ -153,20 +153,20 @@ There is no monthly invoice. There is no 3am email. There is a wallet balance, a

 <p align="center"><img src="assets/blockrun-clawrouter-vs-openclaw-standalone-comparison-production-safety.png" alt="Architecting for production safety — OpenClaw standalone vs OpenClaw + ClawRouter comparison across cost, context, error handling, and budgeting" width="720"></p>

-| Problem
-
-| Heartbeat cost overrun
-| Large context
-| Tool result bloat
-| Rate limit contaminates profile | All models penalized (#49834)
-| Empty / degraded 200 response
-| Short-burst 429 failover
-| MiniMax 520 failure
-| Z.ai 1311 (billing)
-| Mid-task model switch
-| Monthly billing surprise
-| Per-session cost ceiling
-| Cost visibility
+| Problem | OpenClaw alone | OpenClaw + ClawRouter |
+| ------------------------------- | -------------------------------- | --------------------------------------------- |
+| Heartbeat cost overrun | No per-run cap | Tier routing → 50–500× cheaper model |
+| Large context | Full context every call | 7-layer compression, 15–40% reduction |
+| Tool result bloat | Raw output forwarded | Observation compression, up to 97% |
+| Rate limit contaminates profile | All models penalized (#49834) | Per-model 60s cooldown, others unaffected |
+| Empty / degraded 200 response | Passed through to agent (#49902) | Detected, triggers model fallback |
+| Short-burst 429 failover | Immediate failover to next model | 200ms retry first, failover only if needed |
+| MiniMax 520 failure | Silent drop / retry storm | Classified as server_error, retried correctly |
+| Z.ai 1311 (billing) | Treated as rate_limit, retried | Classified as billing, stopped immediately |
+| Mid-task model switch | Model can change mid-session | Session pinning, consistent model per task |
+| Monthly billing surprise | Possible | Wallet-based, stops when empty |
+| Per-session cost ceiling | None | `maxCostPerRun` — graceful or strict cap |
+| Cost visibility | None | `/stats` with per-provider error counts |

 ---

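Several rows of this table describe error classification and per-model cooldowns. The sketch below shows one way those rules could be expressed. The category labels `rate_limit`, `server_error`, and `billing` and the 200ms and 60s figures come from the table; the `other` category, the action names, the function signature, and the assumption that provider code 1311 takes priority over the HTTP status are all hypothetical.

```typescript
type ErrorCategory = "rate_limit" | "server_error" | "billing" | "other";
type ErrorAction = "retry_then_failover" | "retry" | "stop" | "failover";

interface ClassifiedError {
  category: ErrorCategory;
  action: ErrorAction;
  retryDelayMs?: number;
}

// providerCode is a provider-specific error code, such as the 1311 in the table.
function classifyError(httpStatus: number, providerCode?: number): ClassifiedError {
  // Billing errors are checked first so they are never retried as rate limits.
  if (providerCode === 1311) {
    return { category: "billing", action: "stop" };
  }
  if (httpStatus === 429) {
    // Short retry first, then fail over only if the retry also fails.
    return { category: "rate_limit", action: "retry_then_failover", retryDelayMs: 200 };
  }
  if (httpStatus >= 500 && httpStatus <= 599) {
    return { category: "server_error", action: "retry" };
  }
  return { category: "other", action: "failover" };
}

// Per-model cooldown: a rate-limited model is skipped for 60s, others are unaffected.
class ModelCooldown {
  private readonly blockedUntil = new Map<string, number>();
  private readonly cooldownMs: number;

  constructor(cooldownMs: number = 60000) {
    this.cooldownMs = cooldownMs;
  }

  mark(model: string, nowMs: number): void {
    this.blockedUntil.set(model, nowMs + this.cooldownMs);
  }

  isAvailable(model: string, nowMs: number): boolean {
    return (this.blockedUntil.get(model) ?? 0) <= nowMs;
  }
}
```

Keeping the cooldown keyed by model name is what prevents one provider's rate limit from penalizing every other model in the fallback chain.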
@@ -194,4 +194,4 @@ That's what ClawRouter is for.

 ---

-
+_[github.com/BlockRunAI/ClawRouter](https://github.com/BlockRunAI/ClawRouter) · [blockrun.ai](https://blockrun.ai) · `npm install -g @blockrun/clawrouter`_