@blockrun/clawrouter 0.12.64 → 0.12.65

@@ -0,0 +1,280 @@
1
+ # We Read 100 OpenClaw Issues About OpenRouter. Here's What We Built Instead.
2
+
3
+ > _OpenRouter is the most popular LLM aggregator. It's also the source of the most frustration in OpenClaw's issue tracker._
4
+
5
+ ---
6
+
7
+ ## The Data
8
+
9
+ We searched OpenClaw's GitHub issues for "openrouter" and read every result. 100 issues. Open and closed. Filed by users who ran into the same structural problems over and over:
10
+
11
+ | Category | Issue Count | Representative Issues |
12
+ | ------------------------------- | ----------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
13
+ | **Broken fallback / failover** | ~20 | [#22136](https://github.com/openclaw/openclaw/issues/22136), [#45663](https://github.com/openclaw/openclaw/issues/45663), [#50389](https://github.com/openclaw/openclaw/issues/50389), [#49079](https://github.com/openclaw/openclaw/issues/49079) |
14
+ | **Model ID mangling** | ~15 | [#49379](https://github.com/openclaw/openclaw/issues/49379), [#50711](https://github.com/openclaw/openclaw/issues/50711), [#25665](https://github.com/openclaw/openclaw/issues/25665), [#2373](https://github.com/openclaw/openclaw/issues/2373) |
15
+ | **Authentication / 401 errors** | ~8 | [#51056](https://github.com/openclaw/openclaw/issues/51056), [#34830](https://github.com/openclaw/openclaw/issues/34830), [#26960](https://github.com/openclaw/openclaw/issues/26960) |
16
+ | **Cost / billing opacity** | ~6 | [#25371](https://github.com/openclaw/openclaw/issues/25371), [#50738](https://github.com/openclaw/openclaw/issues/50738), [#38248](https://github.com/openclaw/openclaw/issues/38248) |
17
+ | **Routing opacity** | ~5 | [#7006](https://github.com/openclaw/openclaw/issues/7006), [#35842](https://github.com/openclaw/openclaw/issues/35842) |
18
+ | **Missing feature parity** | ~10 | [#46255](https://github.com/openclaw/openclaw/issues/46255), [#50485](https://github.com/openclaw/openclaw/issues/50485), [#30850](https://github.com/openclaw/openclaw/issues/30850) |
19
+ | **Rate limit / key exhaustion** | ~4 | [#8615](https://github.com/openclaw/openclaw/issues/8615), [#48729](https://github.com/openclaw/openclaw/issues/48729) |
20
+ | **Model catalog staleness** | ~5 | [#10687](https://github.com/openclaw/openclaw/issues/10687), [#30152](https://github.com/openclaw/openclaw/issues/30152) |
21
+
22
+ These aren't edge cases. They're structural consequences of how OpenRouter works: a middleman that adds latency, mangles model IDs, obscures routing decisions, and introduces its own failure modes on top of the providers it aggregates.
23
+
24
+ ---
25
+
26
+ ## 1. Broken Fallback — The #1 Pain Point
27
+
28
+ From [#45663](https://github.com/openclaw/openclaw/issues/45663):
29
+
30
+ > _"Provider returned error from OpenRouter does not trigger model failover."_
31
+
32
+ From [#50389](https://github.com/openclaw/openclaw/issues/50389):
33
+
34
+ > _"Rate limit errors surfaced to user instead of auto-failover."_
35
+
36
+ When OpenRouter returns a 429 or provider error, OpenClaw's failover logic often doesn't recognize it as retriable. The user sees a raw error. The agent stops. ~20 issues document variations of this: HTTP 529 (Anthropic overloaded) not triggering fallback ([#49079](https://github.com/openclaw/openclaw/issues/49079)), invalid model IDs causing 400 instead of failover ([#50017](https://github.com/openclaw/openclaw/issues/50017)), timeouts in cron sessions with no recovery ([#49597](https://github.com/openclaw/openclaw/issues/49597)).
37
+
38
+ ### How ClawRouter Solves This
39
+
40
+ ClawRouter maintains fallback chains up to 8 models deep per routing tier. When a model fails:
41
+
42
+ 1. **200ms retry** — short-burst rate limits often recover in milliseconds
43
+ 2. **Next model** — if retry fails, move to the next model in the chain
44
+ 3. **Per-model isolation** — one provider's failure doesn't poison the others
45
+ 4. **All-failed summary** — if every model in the chain fails, you get a structured error listing every attempt and failure reason
46
+
47
+ ```
48
+ [ClawRouter] Trying model 1/6: google/gemini-2.5-flash
49
+ [ClawRouter] Model google/gemini-2.5-flash returned 429, retrying in 200ms...
50
+ [ClawRouter] Retry failed, trying model 2/6: deepseek/deepseek-chat
51
+ [ClawRouter] Success with model: deepseek/deepseek-chat
52
+ ```
53
+
54
+ No silent failures. No raw 429s surfaced to the agent.
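The retry-then-advance loop above can be sketched in a few lines. This is a minimal illustration, not ClawRouter's actual implementation; `callModel` and the chain contents are hypothetical stand-ins:

```typescript
// Sketch of a fallback chain with one short retry per model.
// Any throw from callModel counts as a failure (429, 529, timeout, ...).
type Attempt = { model: string; error: string };

async function completeWithFallback(
  chain: string[],
  callModel: (model: string) => Promise<string>,
  retryDelayMs = 200,
): Promise<string> {
  const attempts: Attempt[] = [];
  for (const model of chain) {
    for (let tryNum = 0; tryNum < 2; tryNum++) {
      try {
        return await callModel(model); // success: stop here
      } catch (err) {
        attempts.push({ model, error: String(err) });
        // one short retry before advancing: burst rate limits often recover
        if (tryNum === 0) await new Promise((r) => setTimeout(r, retryDelayMs));
      }
    }
    // per-model isolation: this model's failures don't block the next one
  }
  // all-failed summary: a structured error listing every attempt
  throw new Error("All models failed: " + JSON.stringify(attempts));
}
```

The key property is that the caller only ever sees a result or a structured all-failed error, never a raw provider status code.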
55
+
56
+ ---
57
+
58
+ ## 2. Model ID Mangling — Death by Prefix
59
+
60
+ From [#25665](https://github.com/openclaw/openclaw/issues/25665):
61
+
62
+ > _"Model config defaults to `openrouter/openrouter/auto` (double prefix)."_
63
+
64
+ From [#50711](https://github.com/openclaw/openclaw/issues/50711):
65
+
66
+ > _"Control UI model picker strips `openrouter/` prefix."_
67
+
68
+ OpenRouter uses nested model IDs: `openrouter/deepseek/deepseek-v3.2`. OpenClaw's UI, Discord bot, and web gateway all handle these differently. Some add the prefix. Some strip it. Some double it. Fifteen issues trace back to model ID confusion.
69
+
70
+ ### How ClawRouter Solves This
71
+
72
+ ClawRouter uses clean aliases. You say `sonnet` and get `anthropic/claude-sonnet-4-6`. You say `flash` and get `google/gemini-2.5-flash`. No nested prefixes. No double-prefix bugs.
73
+
74
+ ```
75
+ // resolveModelAlias() handles all normalization
76
+ "sonnet" → "anthropic/claude-sonnet-4-6"
77
+ "opus" → "anthropic/claude-opus-4-6"
78
+ "flash" → "google/gemini-2.5-flash"
79
+ "grok" → "xai/grok-4-0314"
80
+ "deepseek" → "deepseek/deepseek-chat"
81
+ ```
82
+
83
+ One canonical format. No mangling. No UI inconsistency.
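A normalization pass along these lines removes both failure modes at once. The function name follows the `resolveModelAlias()` mentioned above, but the alias table is abbreviated and the double-prefix handling is a sketch of the idea, not the shipped code:

```typescript
// Abbreviated alias table: short names map to one canonical provider/model ID.
const ALIASES: Record<string, string> = {
  sonnet: "anthropic/claude-sonnet-4-6",
  flash: "google/gemini-2.5-flash",
  deepseek: "deepseek/deepseek-chat",
};

function resolveModelAlias(input: string): string {
  // Collapse any number of stray "openrouter/" prefixes first,
  // so "openrouter/openrouter/x" and "x" resolve identically.
  let id = input;
  while (id.startsWith("openrouter/")) id = id.slice("openrouter/".length);
  // Then map short aliases to the canonical form; unknown IDs pass through.
  return ALIASES[id] ?? id;
}
```

Because every entry point funnels through one function, the UI, Discord bot, and gateway can no longer disagree about what a model ID looks like.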
84
+
85
+ ---
86
+
87
+ ## 3. API Key Hell — 401s, Leakage, and Rotation
88
+
89
+ From [#51056](https://github.com/openclaw/openclaw/issues/51056):
90
+
91
+ > _"OpenRouter fails with '401 Missing Authentication header' despite valid key."_
92
+
93
+ From [#8615](https://github.com/openclaw/openclaw/issues/8615):
94
+
95
+ > _"Feature request: native multi-API-key support with load balancing and fallback."_
96
+
97
+ API keys are the root cause of an entire category of failures. Keys expire. Keys leak into LLM context (every provider sees every other provider's keys in the serialized request). Keys hit rate limits that can't be load-balanced. Eight issues document auth failures alone.
98
+
99
+ ### How ClawRouter Solves This
100
+
101
+ ClawRouter has no API keys. Zero.
102
+
103
+ Payment happens via [x402](https://x402.org/) — a cryptographic micropayment protocol. Your agent generates a wallet on first run (BIP-44 derivation, both EVM and Solana). Each request is signed with the wallet's private key. USDC moves per request.
104
+
105
+ ```
106
+ No keys to leak.
107
+ No keys to rotate.
108
+ No keys to rate-limit.
109
+ No keys to expire.
110
+ ```
111
+
112
+ The wallet is the identity. The signature is the authentication. Nothing to configure, nothing to paste into a config file, nothing for the LLM to accidentally serialize.
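The sign-per-request model can be illustrated with a deliberately simplified stand-in. Node's built-in Ed25519 keys stand in for the real wallet here; the actual x402 flow uses chain-specific signatures and settles USDC, which this sketch does not attempt:

```typescript
import { generateKeyPairSync, sign, verify } from "node:crypto";

// A keypair generated once on first run (stand-in for BIP-44 wallet derivation).
const { publicKey, privateKey } = generateKeyPairSync("ed25519");

// Each request body is signed; the signature *is* the authentication.
function signRequest(body: string): string {
  return sign(null, Buffer.from(body), privateKey).toString("base64");
}

// The gateway verifies against the wallet's public key. No shared secret,
// nothing that can leak into serialized LLM context.
function verifyRequest(body: string, signatureB64: string): boolean {
  return verify(null, Buffer.from(body), publicKey, Buffer.from(signatureB64, "base64"));
}
```

Note what is absent: there is no bearer token to store, so a tampered body simply fails verification rather than exposing a reusable credential.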
113
+
114
+ ---
115
+
116
+ ## 4. Cost and Billing Opacity — Surprise Bills
117
+
118
+ From [#25371](https://github.com/openclaw/openclaw/issues/25371):
119
+
120
+ > _"OpenRouter 402 billing error misclassified as 'Context overflow', triggering auto-compaction that drains remaining credits faster."_
121
+
122
+ From [#7006](https://github.com/openclaw/openclaw/issues/7006):
123
+
124
+ > _"`openrouter/auto` doesn't expose which model was actually used or its cost."_
125
+
126
+ When OpenRouter runs out of credits, it returns a 402 that OpenClaw misreads as a context overflow. OpenClaw then auto-compacts the context and retries — on the same empty balance. Each retry charges the compaction cost. Credits drain faster. The agent burns money trying to fix a billing error it doesn't understand.
127
+
128
+ ### How ClawRouter Solves This
129
+
130
+ **Per-request cost visibility.** Every response includes cost headers:
131
+
132
+ ```
133
+ x-clawrouter-cost: 0.0034
134
+ x-clawrouter-savings: 82%
135
+ x-clawrouter-model: google/gemini-2.5-flash
136
+ ```
137
+
138
+ **Per-request USDC payments.** No prepaid balance to drain. Each request shows its price before you pay. When the wallet is empty, requests don't fail — they fall back to the free tier (NVIDIA GPT-OSS-120B).
139
+
140
+ **Budget guard.** `maxCostPerRun` caps per-session spending. Two modes: `graceful` (downgrade to cheaper models) or `strict` (hard stop). A runaway heartbeat quietly burning $248/day is structurally impossible.
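In sketch form, with hypothetical function names but the two documented modes:

```typescript
type BudgetMode = "graceful" | "strict";

// Sketch: enforce maxCostPerRun before each request. Strict mode refuses the
// request outright; graceful mode downgrades to a cheaper model so the
// session can still finish under the cap. Names here are illustrative.
function applyBudgetGuard(
  spentUsd: number,
  maxCostPerRun: number,
  mode: BudgetMode,
  requestedModel: string,
  cheapModel = "google/gemini-2.5-flash",
): string {
  if (spentUsd < maxCostPerRun) return requestedModel; // under budget: proceed
  if (mode === "strict") {
    // mirrors the documented 429 with X-ClawRouter-Cost-Cap-Exceeded
    throw new Error("429: cost cap exceeded");
  }
  return cheapModel; // graceful downgrade
}
```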
141
+
142
+ **Usage logging.** Every request logs to `~/.openclaw/blockrun/logs/usage-YYYY-MM-DD.jsonl` with model, tier, cost, baseline cost, savings, and latency. `/stats` shows the breakdown.
143
+
144
+ ---
145
+
146
+ ## 5. Routing Opacity — "Which Model Did I Just Pay For?"
147
+
148
+ From [#7006](https://github.com/openclaw/openclaw/issues/7006):
149
+
150
+ > _"No visibility into which model `openrouter/auto` actually uses."_
151
+
152
+ From [#35842](https://github.com/openclaw/openclaw/issues/35842):
153
+
154
+ > _"Need explicit Claude Sonnet default instead of auto-routing."_
155
+
156
+ When you use `openrouter/auto`, you don't know what model served your request. You can't debug quality regressions. You can't understand cost spikes. You're paying for a black box.
157
+
158
+ ### How ClawRouter Solves This
159
+
160
+ ClawRouter's routing is 100% local, open-source, and transparent.
161
+
162
+ **14-dimension weighted classifier** runs locally in <1ms. It scores every request across: token count, code presence, reasoning markers, technical terms, multi-step patterns, question complexity, tool signals, and more.
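In shape, the classifier is a plain weighted sum over locally computed features. The dimensions, weights, and thresholds below are illustrative placeholders (4 of the 14 dimensions), not ClawRouter's tuned values:

```typescript
// Each scorer returns 0..1; weights sum to 1. Everything runs locally.
type Scorer = (prompt: string) => number;

const DIMENSIONS: Array<{ weight: number; score: Scorer }> = [
  { weight: 0.3, score: (p) => Math.min(p.length / 4000, 1) },                  // token-count proxy
  { weight: 0.3, score: (p) => (/\bfunction\b|\bclass\b|[{};]/.test(p) ? 1 : 0) }, // code presence
  { weight: 0.2, score: (p) => (/step by step|prove|derive/i.test(p) ? 1 : 0) },   // reasoning markers
  { weight: 0.2, score: (p) => (/\b(first|then|finally)\b/i.test(p) ? 1 : 0) },    // multi-step patterns
];

function classifyTier(prompt: string): "SIMPLE" | "MEDIUM" | "COMPLEX" {
  const score = DIMENSIONS.reduce((acc, d) => acc + d.weight * d.score(prompt), 0);
  return score < 0.25 ? "SIMPLE" : score < 0.6 ? "MEDIUM" : "COMPLEX";
}
```

Because the whole computation is a handful of regex tests and a dot product, sub-millisecond local classification is unremarkable rather than a stretch claim.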
163
+
164
+ **Debug headers on every response:**
165
+
166
+ ```
167
+ x-clawrouter-profile: auto
168
+ x-clawrouter-tier: MEDIUM
169
+ x-clawrouter-model: moonshot/kimi-k2.5
170
+ x-clawrouter-confidence: 0.87
171
+ x-clawrouter-reasoning: "Code task with moderate complexity"
172
+ ```
173
+
174
+ **SSE debug comments** in streaming responses show the routing decision inline. You always know which model, why it was selected, and how confident the classifier was.
175
+
176
+ **Four routing profiles** give you explicit control:
177
+
178
+ | Profile | Behavior | Savings |
179
+ | --------- | ----------------------- | ------- |
180
+ | `auto` | Balanced quality + cost | 74–100% |
181
+ | `eco` | Cheapest possible | 95–100% |
182
+ | `premium` | Best quality always | 0% |
183
+ | `free` | NVIDIA GPT-OSS only | 100% |
184
+
185
+ No black box. No mystery routing. Full visibility, full control.
186
+
187
+ ---
188
+
189
+ ## 6. Missing Feature Parity — Images, Tools, Caching
190
+
191
+ From [#46255](https://github.com/openclaw/openclaw/issues/46255):
192
+
193
+ > _"Images not passed to OpenRouter models."_
194
+
195
+ From [#47707](https://github.com/openclaw/openclaw/issues/47707):
196
+
197
+ > _"Mistral models fail with strict tool call ID requirements."_
198
+
199
+ OpenRouter doesn't always pass through provider-specific features correctly. Image payloads get dropped. Cache retention headers get ignored. Tool call ID formats cause silent failures with strict providers.
200
+
201
+ ### How ClawRouter Solves This
202
+
203
+ **Vision auto-detection.** When `image_url` content parts are detected, ClawRouter automatically filters the fallback chain to vision-capable models only. No images dropped.
204
+
205
+ **Tool calling validation.** Every model has a `toolCalling` flag. When tools are present in the request, ClawRouter forces agentic routing tiers and excludes models without tool support. No silent tool call failures.
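Both checks reduce to capability filtering over the fallback chain. The `toolCalling` flag comes from the description above; the `vision` flag and the surrounding types are an assumed sketch, not ClawRouter's actual model metadata:

```typescript
// Sketch: drop chain members that can't satisfy the request's requirements
// before routing, so images and tool calls never reach an incapable model.
interface ModelInfo {
  id: string;
  toolCalling: boolean; // flag from the doc
  vision: boolean;      // assumed flag for image_url support
}

function filterChain(
  chain: ModelInfo[],
  needs: { vision: boolean; tools: boolean },
): string[] {
  return chain
    .filter((m) => (!needs.vision || m.vision) && (!needs.tools || m.toolCalling))
    .map((m) => m.id);
}
```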
206
+
207
+ **Direct provider routing.** ClawRouter routes through BlockRun's API directly to providers — not through a second aggregator. One hop, not two. Provider-specific features work because there's no middleman translating them.
208
+
209
+ ---
210
+
211
+ ## 7. Model Catalog Staleness — "Where's the New Model?"
212
+
213
+ From [#10687](https://github.com/openclaw/openclaw/issues/10687):
214
+
215
+ > _"Need fully dynamic model discovery."_
216
+
217
+ From [#30152](https://github.com/openclaw/openclaw/issues/30152):
218
+
219
+ > _"Allowlist silently drops models not in catalog."_
220
+
221
+ When new models launch, OpenRouter's catalog lags. Users configure a model that exists at the provider but isn't in the catalog. The request fails silently or gets rerouted.
222
+
223
+ ### How ClawRouter Solves This
224
+
225
+ ClawRouter maintains a curated catalog of 46+ models across 8 providers, updated with each release. Delisted models have automatic redirect aliases:
226
+
227
+ ```
228
+ // Delisted models redirect automatically
229
+ "xai/grok-code-fast-1" → "deepseek/deepseek-chat"
230
+ "google/gemini-2.0-pro" → "google/gemini-3.1-pro"
231
+ ```
232
+
233
+ No silent drops. No stale catalog. Models are benchmarked for speed, quality, and tool support before inclusion.
234
+
235
+ ---
236
+
237
+ ## The Full Comparison
238
+
239
+ | | OpenRouter | ClawRouter |
240
+ | ------------------- | -------------------------------- | ---------------------------------------------- |
241
+ | **Authentication** | API key (leak risk) | Wallet signature (no keys) |
242
+ | **Payment** | Prepaid balance (custodial) | Per-request USDC (non-custodial) |
243
+ | **Routing** | Server-side black box | Local 14-dim classifier, <1ms |
244
+ | **Fallback** | Often broken (20+ issues) | 8-deep chains, per-model isolation |
245
+ | **Model IDs** | Nested prefixes, mangling bugs | Clean aliases, single format |
246
+ | **Cost visibility** | None per-request | Headers + JSONL logs + `/stats` |
247
+ | **Empty wallet** | Request fails | Auto-fallback to free tier |
248
+ | **Rate limits** | Per-key, shared | Per-wallet, independent |
249
+ | **Vision support** | Images sometimes dropped | Auto-detected, vision-only fallback |
250
+ | **Tool calling** | Silent failures with some models | Flag-based filtering, guaranteed support |
251
+ | **Model catalog** | Laggy, silent drops | Curated 46+ models, redirect aliases |
252
+ | **Budget control** | Monthly invoice | Per-session cap (`maxCostPerRun`) |
253
+ | **Setup** | Create account, paste key | Agent generates wallet, auto-configured |
254
+ | **Average cost** | $25/M tokens (Opus direct) | $2.05/M tokens (auto-routed) = **92% savings** |
255
+
256
+ ---
257
+
258
+ ## Getting Started
259
+
260
+ ```bash
261
+ # Install
262
+ npm install -g @blockrun/clawrouter
263
+
264
+ # Start (auto-configures OpenClaw)
265
+ clawrouter
266
+
267
+ # Check your wallet
268
+ # /wallet
269
+
270
+ # View routing stats
271
+ # /stats
272
+ ```
273
+
274
+ ClawRouter auto-injects itself into `~/.openclaw/openclaw.json` as a provider on startup. Your existing tools, sessions, and extensions are unchanged.
275
+
276
+ Load a wallet with USDC on Base or Solana, pick a routing profile, and run.
277
+
278
+ ---
279
+
280
+ _[github.com/BlockRunAI/ClawRouter](https://github.com/BlockRunAI/ClawRouter) · [blockrun.ai](https://blockrun.ai) · `npm install -g @blockrun/clawrouter`_
@@ -316,7 +316,7 @@ plugins:
316
316
  config:
317
317
  # Maximum spend per session/run in USD.
318
318
  # Default: disabled (no limit)
319
- maxCostPerRun: 0.50 # $0.50 per session
319
+ maxCostPerRun: 0.50 # $0.50 per session
320
320
 
321
321
  # How to enforce the budget cap. Default: graceful
322
322
  #
@@ -326,7 +326,7 @@ plugins:
326
326
  #
327
327
  # strict: immediately returns 429 (X-ClawRouter-Cost-Cap-Exceeded: 1) once
328
328
  # the session spend reaches the cap. Use when you need a hard budget ceiling.
329
- maxCostPerRunMode: graceful # or: strict
329
+ maxCostPerRunMode: graceful # or: strict
330
330
 
331
331
  # Note: image generation endpoints (/v1/images/generations) bypass maxCostPerRun.
332
332
  # Their cost is charged via x402 micropayment directly and is not tracked per-session.
@@ -51,13 +51,13 @@ The returned URL is a publicly hosted image, ready to use in Telegram, Discord,
51
51
 
52
52
  ## Models & Pricing
53
53
 
54
- | Model ID | Shorthand | Price | Max Size | Provider |
55
- | -------------------------- | --------------- | ----------- | ---------- | ----------------- |
56
- | `google/nano-banana` | `nano-banana` | $0.05/image | 1024×1024 | Google Gemini Flash |
57
- | `google/nano-banana-pro` | `banana-pro` | $0.10/image | 4096×4096 | Google Gemini Pro |
58
- | `openai/dall-e-3` | `dall-e-3` | $0.04/image | 1792×1024 | OpenAI DALL-E 3 |
59
- | `openai/gpt-image-1` | `gpt-image` | $0.02/image | 1536×1024 | OpenAI GPT Image |
60
- | `black-forest/flux-1.1-pro`| `flux` | $0.04/image | 1024×1024 | Black Forest Labs |
54
+ | Model ID | Shorthand | Price | Max Size | Provider |
55
+ | --------------------------- | ------------- | ----------- | --------- | ------------------- |
56
+ | `google/nano-banana` | `nano-banana` | $0.05/image | 1024×1024 | Google Gemini Flash |
57
+ | `google/nano-banana-pro` | `banana-pro` | $0.10/image | 4096×4096 | Google Gemini Pro |
58
+ | `openai/dall-e-3` | `dall-e-3` | $0.04/image | 1792×1024 | OpenAI DALL-E 3 |
59
+ | `openai/gpt-image-1` | `gpt-image` | $0.02/image | 1536×1024 | OpenAI GPT Image |
60
+ | `black-forest/flux-1.1-pro` | `flux` | $0.04/image | 1024×1024 | Black Forest Labs |
61
61
 
62
62
  Default model: `google/nano-banana`.
63
63
 
@@ -71,20 +71,20 @@ OpenAI-compatible endpoint. Route via ClawRouter proxy (`http://localhost:8402`)
71
71
 
72
72
  **Request body:**
73
73
 
74
- | Field | Type | Required | Description |
75
- | -------- | -------- | -------- | ------------------------------------------------ |
76
- | `model` | `string` | Yes | Model ID (see table above) |
77
- | `prompt` | `string` | Yes | Text description of the image to generate |
78
- | `size` | `string` | No | Image dimensions, e.g. `"1024x1024"` (default) |
79
- | `n` | `number` | No | Number of images (default: `1`) |
74
+ | Field | Type | Required | Description |
75
+ | -------- | -------- | -------- | ---------------------------------------------- |
76
+ | `model` | `string` | Yes | Model ID (see table above) |
77
+ | `prompt` | `string` | Yes | Text description of the image to generate |
78
+ | `size` | `string` | No | Image dimensions, e.g. `"1024x1024"` (default) |
79
+ | `n` | `number` | No | Number of images (default: `1`) |
80
80
 
81
81
  **Response:**
82
82
 
83
83
  ```typescript
84
84
  {
85
- created: number; // Unix timestamp
85
+ created: number; // Unix timestamp
86
86
  data: Array<{
87
- url: string; // Publicly hosted image URL
87
+ url: string; // Publicly hosted image URL
88
88
  revised_prompt?: string; // Model's rewritten prompt (dall-e-3 only)
89
89
  }>;
90
90
  }
@@ -96,22 +96,22 @@ Edit an existing image using AI. Route via ClawRouter proxy (`http://localhost:8
96
96
 
97
97
  **Request body:**
98
98
 
99
- | Field | Type | Required | Description |
100
- | -------- | -------- | -------- | -------------------------------------------------------------- |
101
- | `model` | `string` | No | Model ID (default: `openai/gpt-image-1`) |
102
- | `prompt` | `string` | Yes | Text description of the edit to apply |
103
- | `image` | `string` | Yes | Source image — see **Image input formats** below |
104
- | `mask` | `string` | No | Mask image (white = area to edit) — same formats as `image` |
105
- | `size` | `string` | No | Output dimensions, e.g. `"1024x1024"` (default) |
99
+ | Field | Type | Required | Description |
100
+ | -------- | -------- | -------- | ----------------------------------------------------------- |
101
+ | `model` | `string` | No | Model ID (default: `openai/gpt-image-1`) |
102
+ | `prompt` | `string` | Yes | Text description of the edit to apply |
103
+ | `image` | `string` | Yes | Source image — see **Image input formats** below |
104
+ | `mask` | `string` | No | Mask image (white = area to edit) — same formats as `image` |
105
+ | `size` | `string` | No | Output dimensions, e.g. `"1024x1024"` (default) |
106
106
 
107
107
  **Image input formats** — the `image` and `mask` fields accept any of:
108
108
 
109
- | Format | Example | Description |
110
- | ------------------- | ------------------------------------ | ---------------------------------------------- |
111
- | Local file path | `"/Users/me/photo.png"` | Absolute path — ClawRouter reads the file |
112
- | Home-relative path | `"~/photo.png"` | Expands `~` to home directory |
113
- | HTTP/HTTPS URL | `"https://example.com/photo.png"` | ClawRouter downloads the image automatically |
114
- | Base64 data URI | `"data:image/png;base64,iVBOR..."` | Passed through directly (no conversion needed) |
109
+ | Format | Example | Description |
110
+ | ------------------ | ---------------------------------- | ---------------------------------------------- |
111
+ | Local file path | `"/Users/me/photo.png"` | Absolute path — ClawRouter reads the file |
112
+ | Home-relative path | `"~/photo.png"` | Expands `~` to home directory |
113
+ | HTTP/HTTPS URL | `"https://example.com/photo.png"` | ClawRouter downloads the image automatically |
114
+ | Base64 data URI | `"data:image/png;base64,iVBOR..."` | Passed through directly (no conversion needed) |
115
115
 
116
116
  Supported image formats: **PNG**, **JPG/JPEG**, **WebP**.
117
117
 
@@ -119,9 +119,9 @@ Supported image formats: **PNG**, **JPG/JPEG**, **WebP**.
119
119
 
120
120
  ```typescript
121
121
  {
122
- created: number; // Unix timestamp
122
+ created: number; // Unix timestamp
123
123
  data: Array<{
124
- url: string; // Locally cached image URL (http://localhost:8402/images/...)
124
+ url: string; // Locally cached image URL (http://localhost:8402/images/...)
125
125
  revised_prompt?: string; // Model's rewritten prompt
126
126
  }>;
127
127
  }
@@ -171,7 +171,7 @@ const response = await fetch("http://localhost:8402/v1/images/generations", {
171
171
  }),
172
172
  });
173
173
 
174
- const result = await response.json() as {
174
+ const result = (await response.json()) as {
175
175
  created: number;
176
176
  data: Array<{ url: string; revised_prompt?: string }>;
177
177
  };
@@ -206,7 +206,7 @@ print(image_url)
206
206
  import OpenAI from "openai";
207
207
 
208
208
  const client = new OpenAI({
209
- apiKey: "blockrun", // any non-empty string
209
+ apiKey: "blockrun", // any non-empty string
210
210
  baseURL: "http://localhost:8402/v1",
211
211
  });
212
212
 
@@ -352,12 +352,12 @@ When using ClawRouter with OpenClaw, generate and edit images directly from any
352
352
  /img2img --image /tmp/portrait.png --size 1536x1024 add a hat
353
353
  ```
354
354
 
355
- | Flag | Default | Description |
356
- | --------- | -------------- | ------------------------------------- |
357
- | `--image` | _(required)_ | Local image file path (supports `~/`) |
358
- | `--mask` | _(none)_ | Mask image (white = area to edit) |
359
- | `--model` | `gpt-image-1` | Model to use |
360
- | `--size` | `1024x1024` | Output size |
355
+ | Flag | Default | Description |
356
+ | --------- | ------------- | ------------------------------------- |
357
+ | `--image` | _(required)_ | Local image file path (supports `~/`) |
358
+ | `--mask` | _(none)_ | Mask image (white = area to edit) |
359
+ | `--model` | `gpt-image-1` | Model to use |
360
+ | `--size` | `1024x1024` | Output size |
361
361
 
362
362
  ### Model shorthands
363
363
 
@@ -366,7 +366,7 @@ When using ClawRouter with OpenClaw, generate and edit images directly from any
366
366
  | `nano-banana` | `google/nano-banana` |
367
367
  | `banana-pro` | `google/nano-banana-pro` |
368
368
  | `dall-e-3` | `openai/dall-e-3` |
369
- | `gpt-image` | `openai/gpt-image-1` |
369
+ | `gpt-image` | `openai/gpt-image-1` |
370
370
  | `flux` | `black-forest/flux-1.1-pro` |
371
371
 
372
372
  ---
@@ -1,6 +1,6 @@
1
1
  # We Benchmarked 39 AI Models Through Our Payment Gateway. Here's What We Found.
2
2
 
3
- *March 16, 2026 | BlockRun Engineering*
3
+ _March 16, 2026 | BlockRun Engineering_
4
4
 
5
5
  Last week we ran every model on BlockRun through a real-world latency benchmark — 39 models, same prompts, same payment pipeline, same hardware. No cherry-picked results. No synthetic lab conditions. Just cold, hard numbers from production infrastructure.
6
6
 
@@ -18,47 +18,47 @@ We sent 2 coding prompts per model (256 max tokens, non-streaming) and measured
18
18
 
19
19
  ### Speed Rankings (End-to-End Latency Through BlockRun)
20
20
 
21
- | # | Model | Latency | Tok/s | $/1M in | $/1M out |
22
- |---|-------|---------|-------|---------|----------|
23
- | 1 | xai/grok-4-fast-non-reasoning | 1,143ms | 224 | $0.20 | $0.50 |
24
- | 2 | xai/grok-3-mini | 1,202ms | 215 | $0.30 | $0.50 |
25
- | 3 | google/gemini-2.5-flash | 1,238ms | 208 | $0.15 | $0.60 |
26
- | 4 | xai/grok-3 | 1,244ms | 207 | $3.00 | $15.00 |
27
- | 5 | xai/grok-4-1-fast-non-reasoning | 1,244ms | 206 | $0.20 | $0.50 |
28
- | 6 | nvidia/gpt-oss-120b | 1,252ms | 204 | FREE | FREE |
29
- | 7 | minimax/minimax-m2.5 | 1,278ms | 202 | $0.30 | $1.10 |
30
- | 8 | google/gemini-2.5-pro | 1,294ms | 198 | $1.25 | $10.00 |
31
- | 9 | xai/grok-4-fast-reasoning | 1,298ms | 198 | $0.20 | $0.50 |
32
- | 10 | xai/grok-4-0709 | 1,348ms | 190 | $0.20 | $1.50 |
33
- | 11 | google/gemini-3-pro-preview | 1,352ms | 190 | $1.25 | $10.00 |
34
- | 12 | google/gemini-2.5-flash-lite | 1,353ms | 193 | $0.10 | $0.40 |
35
- | 13 | google/gemini-3-flash-preview | 1,398ms | 183 | $0.15 | $0.60 |
36
- | 14 | deepseek/deepseek-chat | 1,431ms | 179 | $0.27 | $1.10 |
37
- | 15 | deepseek/deepseek-reasoner | 1,454ms | 183 | $0.55 | $2.19 |
38
- | 16 | xai/grok-4-1-fast-reasoning | 1,454ms | 176 | $0.20 | $0.50 |
39
- | 17 | google/gemini-3.1-pro | 1,609ms | 167 | $1.25 | $10.00 |
40
- | 18 | moonshot/kimi-k2.5 | 1,646ms | 156 | $0.60 | $3.00 |
41
- | 19 | anthropic/claude-sonnet-4.6 | 2,110ms | 121 | $3.00 | $15.00 |
42
- | 20 | anthropic/claude-opus-4.6 | 2,139ms | 120 | $15.00 | $75.00 |
43
- | 21 | openai/o3-mini | 2,260ms | 114 | $1.10 | $4.40 |
44
- | 22 | openai/gpt-5-mini | 2,264ms | 114 | $1.10 | $4.40 |
45
- | 23 | anthropic/claude-haiku-4.5 | 2,305ms | 141 | $0.80 | $4.00 |
46
- | 24 | openai/o4-mini | 2,328ms | 111 | $1.10 | $4.40 |
47
- | 25 | openai/gpt-4.1-mini | 2,340ms | 109 | $0.40 | $1.60 |
48
- | 26 | openai/o1 | 2,562ms | 100 | $15.00 | $60.00 |
49
- | 27 | openai/gpt-4.1-nano | 2,640ms | 97 | $0.10 | $0.40 |
50
- | 28 | openai/o1-mini | 2,746ms | 93 | $1.10 | $4.40 |
51
- | 29 | openai/gpt-4o-mini | 2,764ms | 93 | $0.15 | $0.60 |
52
- | 30 | openai/o3 | 2,862ms | 90 | $2.00 | $8.00 |
53
- | 31 | openai/gpt-5-nano | 3,187ms | 81 | $0.50 | $2.00 |
54
- | 32 | openai/gpt-5.2-pro | 3,546ms | 73 | $2.50 | $10.00 |
55
- | 33 | openai/gpt-4o | 5,378ms | 48 | $2.50 | $10.00 |
56
- | 34 | openai/gpt-4.1 | 5,477ms | 47 | $2.00 | $8.00 |
57
- | 35 | openai/gpt-5.3 | 5,910ms | 43 | $2.50 | $10.00 |
58
- | 36 | openai/gpt-5.4 | 6,213ms | 41 | $2.50 | $15.00 |
59
- | 37 | openai/gpt-5.2 | 6,507ms | 40 | $2.50 | $10.00 |
60
- | 38 | openai/gpt-5.4-pro | 6,671ms | 40 | $2.50 | $15.00 |
61
- | 39 | openai/gpt-5.3-codex | 7,935ms | 32 | $2.50 | $10.00 |
21
+ | # | Model | Latency | Tok/s | $/1M in | $/1M out |
22
+ | --- | ------------------------------- | ------- | ----- | ------- | -------- |
23
+ | 1 | xai/grok-4-fast-non-reasoning | 1,143ms | 224 | $0.20 | $0.50 |
24
+ | 2 | xai/grok-3-mini | 1,202ms | 215 | $0.30 | $0.50 |
25
+ | 3 | google/gemini-2.5-flash | 1,238ms | 208 | $0.15 | $0.60 |
26
+ | 4 | xai/grok-3 | 1,244ms | 207 | $3.00 | $15.00 |
27
+ | 5 | xai/grok-4-1-fast-non-reasoning | 1,244ms | 206 | $0.20 | $0.50 |
28
+ | 6 | nvidia/gpt-oss-120b | 1,252ms | 204 | FREE | FREE |
29
+ | 7 | minimax/minimax-m2.5 | 1,278ms | 202 | $0.30 | $1.10 |
30
+ | 8 | google/gemini-2.5-pro | 1,294ms | 198 | $1.25 | $10.00 |
31
+ | 9 | xai/grok-4-fast-reasoning | 1,298ms | 198 | $0.20 | $0.50 |
32
+ | 10 | xai/grok-4-0709 | 1,348ms | 190 | $0.20 | $1.50 |
33
+ | 11 | google/gemini-3-pro-preview | 1,352ms | 190 | $1.25 | $10.00 |
34
+ | 12 | google/gemini-2.5-flash-lite | 1,353ms | 193 | $0.10 | $0.40 |
35
+ | 13 | google/gemini-3-flash-preview | 1,398ms | 183 | $0.15 | $0.60 |
36
+ | 14 | deepseek/deepseek-chat | 1,431ms | 179 | $0.27 | $1.10 |
37
+ | 15 | deepseek/deepseek-reasoner | 1,454ms | 183 | $0.55 | $2.19 |
38
+ | 16 | xai/grok-4-1-fast-reasoning | 1,454ms | 176 | $0.20 | $0.50 |
39
+ | 17 | google/gemini-3.1-pro | 1,609ms | 167 | $1.25 | $10.00 |
40
+ | 18 | moonshot/kimi-k2.5 | 1,646ms | 156 | $0.60 | $3.00 |
41
+ | 19 | anthropic/claude-sonnet-4.6 | 2,110ms | 121 | $3.00 | $15.00 |
42
+ | 20 | anthropic/claude-opus-4.6 | 2,139ms | 120 | $15.00 | $75.00 |
43
+ | 21 | openai/o3-mini | 2,260ms | 114 | $1.10 | $4.40 |
44
+ | 22 | openai/gpt-5-mini | 2,264ms | 114 | $1.10 | $4.40 |
45
+ | 23 | anthropic/claude-haiku-4.5 | 2,305ms | 141 | $0.80 | $4.00 |
46
+ | 24 | openai/o4-mini | 2,328ms | 111 | $1.10 | $4.40 |
47
+ | 25 | openai/gpt-4.1-mini | 2,340ms | 109 | $0.40 | $1.60 |
48
+ | 26 | openai/o1 | 2,562ms | 100 | $15.00 | $60.00 |
49
+ | 27 | openai/gpt-4.1-nano | 2,640ms | 97 | $0.10 | $0.40 |
50
+ | 28 | openai/o1-mini | 2,746ms | 93 | $1.10 | $4.40 |
51
+ | 29 | openai/gpt-4o-mini | 2,764ms | 93 | $0.15 | $0.60 |
52
+ | 30 | openai/o3 | 2,862ms | 90 | $2.00 | $8.00 |
53
+ | 31 | openai/gpt-5-nano | 3,187ms | 81 | $0.50 | $2.00 |
54
+ | 32 | openai/gpt-5.2-pro | 3,546ms | 73 | $2.50 | $10.00 |
55
+ | 33 | openai/gpt-4o | 5,378ms | 48 | $2.50 | $10.00 |
56
+ | 34 | openai/gpt-4.1 | 5,477ms | 47 | $2.00 | $8.00 |
57
+ | 35 | openai/gpt-5.3 | 5,910ms | 43 | $2.50 | $10.00 |
58
+ | 36 | openai/gpt-5.4 | 6,213ms | 41 | $2.50 | $15.00 |
59
+ | 37 | openai/gpt-5.2 | 6,507ms | 40 | $2.50 | $10.00 |
60
+ | 38 | openai/gpt-5.4-pro | 6,671ms | 40 | $2.50 | $15.00 |
61
+ | 39 | openai/gpt-5.3-codex | 7,935ms | 32 | $2.50 | $10.00 |
62
62
 
63
63
  ## Three Things That Surprised Us
64
64
 
@@ -86,21 +86,21 @@ OpenAI's "mini" and "nano" variants are faster (2.2-3.2s range) but still 2x slo
86
86
 
87
87
  We cross-referenced our latency data with quality scores from [Artificial Analysis](https://artificialanalysis.ai/leaderboards/models) (Intelligence Index v4.0):
88
88
 
89
- | Model | BlockRun Latency | Intelligence Index | Price Tier |
90
- |-------|-----------------|-------------------|-----------|
91
- | Gemini 3.1 Pro | 1,609ms | 57 | $1.25/$10 |
92
- | GPT-5.4 | 6,213ms | 57 | $2.50/$15 |
93
- | GPT-5.3 Codex | 7,935ms | 54 | $2.50/$10 |
94
- | Claude Opus 4.6 | 2,139ms | 53 | $15/$75 |
95
- | Claude Sonnet 4.6 | 2,110ms | 52 | $3/$15 |
96
- | Kimi K2.5 | 1,646ms | 47 | $0.60/$3 |
97
- | Gemini 3 Flash Preview | 1,398ms | 46 | $0.15/$0.60 |
98
- | Grok 4 | 1,348ms | 41 | $0.20/$1.50 |
99
- | Grok 4.1 Fast | 1,244ms | 41 | $0.20/$0.50 |
100
- | DeepSeek V3 | 1,431ms | 32 | $0.27/$1.10 |
101
- | Grok 3 | 1,244ms | 32 | $3/$15 |
102
- | Grok 4 Fast | 1,143ms | 23 | $0.20/$0.50 |
103
- | Gemini 2.5 Flash | 1,238ms | 20 | $0.15/$0.60 |
89
+ | Model | BlockRun Latency | Intelligence Index | Price Tier |
90
+ | ---------------------- | ---------------- | ------------------ | ----------- |
91
+ | Gemini 3.1 Pro | 1,609ms | 57 | $1.25/$10 |
92
+ | GPT-5.4 | 6,213ms | 57 | $2.50/$15 |
93
+ | GPT-5.3 Codex | 7,935ms | 54 | $2.50/$10 |
94
+ | Claude Opus 4.6 | 2,139ms | 53 | $15/$75 |
95
+ | Claude Sonnet 4.6 | 2,110ms | 52 | $3/$15 |
96
+ | Kimi K2.5 | 1,646ms | 47 | $0.60/$3 |
97
+ | Gemini 3 Flash Preview | 1,398ms | 46 | $0.15/$0.60 |
98
+ | Grok 4 | 1,348ms | 41 | $0.20/$1.50 |
99
+ | Grok 4.1 Fast | 1,244ms | 41 | $0.20/$0.50 |
100
+ | DeepSeek V3 | 1,431ms | 32 | $0.27/$1.10 |
101
+ | Grok 3 | 1,244ms | 32 | $3/$15 |
102
+ | Grok 4 Fast | 1,143ms | 23 | $0.20/$0.50 |
103
+ | Gemini 2.5 Flash | 1,238ms | 20 | $0.15/$0.60 |
104
104
 
105
105
  **Gemini 3.1 Pro** is the standout: highest intelligence score (57) at just 1.6 seconds. GPT-5.4 matches its intelligence but takes **4x longer**.
106
106
 
@@ -131,7 +131,7 @@ Raw benchmark data: [benchmark-results.json](https://github.com/BlockRunAI/ClawR
131
131
 
132
132
  ---
133
133
 
134
- *BlockRun is the x402 micropayment gateway for AI. One wallet, 39+ models, pay-per-request with USDC. [Get started](https://blockrun.ai)*
134
+ _BlockRun is the x402 micropayment gateway for AI. One wallet, 39+ models, pay-per-request with USDC. [Get started](https://blockrun.ai)_
135
135
 
136
136
  ---
137
137
 
@@ -144,18 +144,14 @@ Raw benchmark data: [benchmark-results.json](https://github.com/BlockRunAI/ClawR
144
144
  The fastest model (Grok 4 Fast) was 7x faster than the slowest (GPT-5.3 Codex). Here's the full breakdown:
145
145
 
146
146
  **2/** Top 5 fastest (end-to-end latency):
147
+
147
148
  1. xai/grok-4-fast — 1,143ms
148
149
  2. xai/grok-3-mini — 1,202ms
149
150
  3. google/gemini-2.5-flash — 1,238ms
150
151
  4. xai/grok-3 — 1,244ms
151
152
  5. nvidia/gpt-oss-120b — 1,252ms (FREE)
152
153
 
153
- **3/** Bottom 5 (all OpenAI):
154
- 35. openai/gpt-5.3 — 5,910ms
155
- 36. openai/gpt-5.4 — 6,213ms
156
- 37. openai/gpt-5.2 — 6,507ms
157
- 38. openai/gpt-5.4-pro — 6,671ms
158
- 39. openai/gpt-5.3-codex — 7,935ms
154
+ **3/** Bottom 5 (all OpenAI): 35. openai/gpt-5.3 — 5,910ms 36. openai/gpt-5.4 — 6,213ms 37. openai/gpt-5.2 — 6,507ms 38. openai/gpt-5.4-pro — 6,671ms 39. openai/gpt-5.3-codex — 7,935ms
159
155
 
160
156
  Every OpenAI 5.x model: 5-8 seconds. Every Grok/Gemini model: ~1.2 seconds.
161
157
 
@@ -166,6 +162,7 @@ We tried routing all requests to the fastest models. Users complained the "fast"
166
162
  Lesson: you need to balance speed, quality, AND cost.
167
163
 
168
164
  **5/** The efficiency frontier winners:
165
+
169
166
  - Best overall: Gemini 3.1 Pro (IQ 57, 1.6s, $1.25/M)
170
167
  - Best budget: Gemini 2.5 Flash (IQ 20, 1.2s, $0.15/M)
171
168
  - Best reasoning: Claude Opus 4.6 (IQ 53, 2.1s, $15/M)
@@ -19,12 +19,12 @@ Use `blockrun/eco` for maximum cost savings.
19
19
 
20
20
  Use `blockrun/auto` for the best quality/price balance.
21
21
 
22
- | Tier | Primary Model | Input | Output |
23
- | --------- | ----------------------------- | ----- | ------ |
24
- | SIMPLE | moonshot/kimi-k2.5 | $0.60 | $3.00 |
25
- | MEDIUM | xai/grok-code-fast-1 | $0.20 | $1.50 |
26
- | COMPLEX | google/gemini-3.1-pro | $2.00 | $12.00 |
27
- | REASONING | xai/grok-4-1-fast-reasoning | $0.20 | $0.50 |
22
+ | Tier | Primary Model | Input | Output |
23
+ | --------- | --------------------------- | ----- | ------ |
24
+ | SIMPLE | moonshot/kimi-k2.5 | $0.60 | $3.00 |
25
+ | MEDIUM | xai/grok-code-fast-1 | $0.20 | $1.50 |
26
+ | COMPLEX | google/gemini-3.1-pro | $2.00 | $12.00 |
27
+ | REASONING | xai/grok-4-1-fast-reasoning | $0.20 | $0.50 |
28
28
 
29
29
  ---
30
30