@blockrun/clawrouter 0.12.81 → 0.12.83

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,276 @@
# 11 Free AI Models, Zero Cost: How BlockRun Gives Developers Top-Tier LLMs for Nothing

## The Cost Problem Nobody Talks About

It's 2026. Large language models are table stakes for developers. But here's the uncomfortable truth — **the models you can afford aren't good enough, and the good ones aren't affordable.**

Claude Opus 4 runs $15/$75 per million input/output tokens. GPT-4o sits at $2.50/$10. Even the "cheap" models add up fast. For indie developers, students, and early-stage startups, $50–$200/month in API costs is real money — especially when half of it goes to throwaway experiments, prompt iterations, and dead-end debugging sessions.

You're not just paying for intelligence. You're paying for every mistake, every retry, every discarded attempt.

**What if you had 11 high-quality LLMs — completely free, unlimited calls, 128K context — and could use them right now?**

BlockRun's answer: just take them.

---

## The Lineup: 11 Models, $0.00

Through [ClawRouter](https://github.com/blockrunai/ClawRouter) — BlockRun's local AI routing proxy — you get zero-cost access to the following:

| Model | Parameters | Context | Reasoning | Best For |
|-------|-----------|---------|-----------|----------|
| **GPT-OSS 120B** | 120B | 128K | — | General chat, summaries, formatting |
| **GPT-OSS 20B** | 20B | 128K | — | Fast lightweight tasks |
| **Nemotron Ultra 253B** | 253B | 131K | ✅ | Complex reasoning, math, analysis |
| **Nemotron 3 Super 120B** | 120B | 131K | ✅ | Balanced reasoning + general |
| **Nemotron Super 49B** | 49B | 131K | ✅ | Quick reasoning, low latency |
| **DeepSeek V3.2** | — | 128K | ✅ | Code generation, technical reasoning |
| **Mistral Large 675B** | 675B | 128K | ✅ | Multilingual, long-form, complex instructions |
| **Qwen3 Coder 480B** | 480B | 128K | — | Professional code generation |
| **Devstral 2 123B** | 123B | 128K | — | Developer tooling, code review |
| **GLM-4.7** | — | 128K | ✅ | Chinese-English bilingual reasoning |
| **Llama 4 Maverick** | — | 128K | ✅ | Meta's latest open-source all-rounder |

**Price: $0.00 per million tokens. Input free. Output free. No hidden fees. No daily caps. No trial period.**

This isn't "free for your first 1,000 requests." It's not "free but rate-limited to uselessness." It's production-grade, unlimited, genuinely free inference.
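The lineup is small enough to hard-code. Below is a minimal Python catalog of the free tier; the `free/...` IDs are hypothetical (patterned on the `free/nemotron-ultra-253b` name used in the quickstart later in this post), while the context sizes and reasoning flags come straight from the table above:

```python
# The 11 free models as a lookup table. IDs are hypothetical, patterned on
# "free/nemotron-ultra-253b" from the quickstart; context windows and
# reasoning flags mirror the table above.
FREE_MODELS = {
    "free/gpt-oss-120b":          {"context": 128_000, "reasoning": False},
    "free/gpt-oss-20b":           {"context": 128_000, "reasoning": False},
    "free/nemotron-ultra-253b":   {"context": 131_000, "reasoning": True},
    "free/nemotron-3-super-120b": {"context": 131_000, "reasoning": True},
    "free/nemotron-super-49b":    {"context": 131_000, "reasoning": True},
    "free/deepseek-v3.2":         {"context": 128_000, "reasoning": True},
    "free/mistral-large-675b":    {"context": 128_000, "reasoning": True},
    "free/qwen3-coder-480b":      {"context": 128_000, "reasoning": False},
    "free/devstral-2-123b":       {"context": 128_000, "reasoning": False},
    "free/glm-4.7":               {"context": 128_000, "reasoning": True},
    "free/llama-4-maverick":      {"context": 128_000, "reasoning": True},
}

reasoning = [m for m, info in FREE_MODELS.items() if info["reasoning"]]
print(len(FREE_MODELS), "free models,", len(reasoning), "reasoning-capable")
# → 11 free models, 7 reasoning-capable
```

A table like this is handy for picking a model programmatically instead of memorizing names.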

---

## Why Free?

BlockRun's business model is simple: **make the best models accessible, charge only for the premium ones.**

The 11 free models are BlockRun's foundation tier. They cover the vast majority of everyday developer tasks — chat, coding, translation, summarization, lightweight reasoning — without costing a cent. When you need heavier firepower (Claude Opus 4, GPT-4o, o3), BlockRun charges per call via [x402 micropayments](https://www.x402.org/). No subscriptions, no monthly minimums — just pay for what you use, only when you need to.

The free tier isn't a loss leader. It's the product. BlockRun believes baseline AI capability should be accessible to every developer, regardless of budget. The premium tier exists for tasks that genuinely demand it.

---

## Not Just Free: How Smart Routing Squeezes Every Dollar

ClawRouter's value proposition isn't just "here are free models." It's **intelligent routing** — automatically selecting the right model for each request based on prompt complexity.

### The Four-Tier Architecture

ClawRouter classifies every incoming request into one of four complexity tiers:

| Tier | Typical Tasks | ECO Route (Cheapest) | AUTO Route (Balanced) |
|------|--------------|---------------------|----------------------|
| **SIMPLE** | Formatting, translation, Q&A | 🆓 GPT-OSS 120B (FREE) | GPT-4o Mini |
| **MEDIUM** | Summaries, analysis, general coding | 🆓 DeepSeek V3.2 (FREE) | DeepSeek V3.2 |
| **COMPLEX** | Architecture, complex code | 🆓 Nemotron Ultra 253B (FREE) | Claude Sonnet 4 |
| **REASONING** | Mathematical proofs, multi-step logic | DeepSeek R1 | Claude Opus 4 |

Look at the ECO column. **Three out of four tiers route to free models.** Unless you're doing the hardest reasoning tasks, your daily work costs nothing.
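Conceptually, ECO routing is a classifier plus a lookup into the ECO column above. Here is a toy sketch; the keyword heuristic and model IDs are illustrative assumptions, not ClawRouter's actual classifier:

```python
# Toy version of complexity-tiered routing. ClawRouter's real classifier is
# more sophisticated; this only illustrates the tier -> model lookup from
# the table above. Model IDs are hypothetical.
ECO_ROUTES = {
    "SIMPLE":    "free/gpt-oss-120b",         # free
    "MEDIUM":    "free/deepseek-v3.2",        # free
    "COMPLEX":   "free/nemotron-ultra-253b",  # free
    "REASONING": "deepseek-r1",               # paid
}

def classify(prompt: str) -> str:
    """Crude keyword stand-in for ClawRouter's complexity classifier."""
    p = prompt.lower()
    if any(k in p for k in ("prove", "step by step", "theorem")):
        return "REASONING"
    if any(k in p for k in ("architecture", "design a", "refactor")):
        return "COMPLEX"
    if any(k in p for k in ("implement", "summarize", "analyze")):
        return "MEDIUM"
    return "SIMPLE"

print(ECO_ROUTES[classify("Translate this sentence to French")])
# → free/gpt-oss-120b
```

The point of the design: only requests that genuinely need paid capability ever hit a paid route.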

### Real-World Cost Comparison

Assume 100 requests per day, distributed roughly as:

- 40% SIMPLE (chat, translation, formatting)
- 30% MEDIUM (coding, analysis)
- 20% COMPLEX (architecture, deep debugging)
- 10% REASONING (math, formal logic)

| Approach | Estimated Monthly Cost |
|----------|----------------------|
| Pure Claude Opus 4 | ~$75–150 |
| Pure GPT-4o | ~$15–30 |
| ClawRouter AUTO mode | ~$5–10 |
| ClawRouter ECO mode | ~$1–3 |
| Manual free model selection | **$0** |

**ECO mode saves 92%+ compared to Claude Opus alone.**
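A quick back-of-envelope check of these numbers. The per-request token volume (roughly 500 input and 500 output tokens) and the budget reasoning-model price are stated assumptions for illustration, not measured figures:

```python
# Rough monthly-cost model: 100 requests/day for 30 days, assuming each
# request averages ~500 input + 500 output tokens (an assumption).
REQS = 100 * 30
TOK = 500 / 1_000_000  # tokens per request, in millions

def monthly(price_in: float, price_out: float) -> float:
    return REQS * TOK * (price_in + price_out)

opus  = monthly(15.00, 75.00)  # Claude Opus 4 at $15/$75 per M tokens
gpt4o = monthly(2.50, 10.00)   # GPT-4o at $2.50/$10 per M tokens
# ECO mode pays only for the ~10% REASONING slice; $0.50/$2.00 per M is a
# hypothetical price for a budget reasoning model.
eco = 0.10 * monthly(0.50, 2.00)

print(f"Opus ${opus:.2f}, GPT-4o ${gpt4o:.2f}, ECO ${eco:.2f}")
# → Opus $135.00, GPT-4o $18.75, ECO $0.38
```

With these assumptions the all-Opus figure lands inside the ~$75–150 band above, and ECO's saving clears the quoted 92%; real workloads will vary with token volume.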

---

## Deep Dive: What Each Free Model Does Best

### GPT-OSS 120B / 20B — The Workhorse

GPT-OSS is BlockRun's default general-purpose free model. The 120B version is ClawRouter's **default SIMPLE-tier model** in ECO mode and the **ultimate fallback** when wallet balance runs low. It handles conversation, text generation, and summarization reliably.

The 20B variant trades capability for speed — noticeably faster responses for tasks that don't need the bigger model's muscle.

**Best for:** Daily conversation, text summaries, reformatting, translation, quick answers.

### Nemotron Ultra 253B — The Free Flagship

253 billion parameters. Reasoning capability. A 131K context window. Nemotron Ultra is the **single strongest free model on BlockRun** — and it's the default when you type `/model free` in ClawRouter.

This is the model you reach for when the task is genuinely hard but you don't want to pay for it. Complex analysis, multi-step planning, mathematical reasoning — Nemotron Ultra handles them with surprising competence for a zero-cost option.

**Best for:** Complex reasoning, math, logic, deep analysis, planning. If you remember one free model name, remember this one.

### Nemotron 3 Super 120B / Nemotron Super 49B — The Gradient

The Nemotron family gives you three reasoning-capable models at different scales (253B / 120B / 49B). This gradient lets you match firepower to task difficulty. The 49B version is noticeably faster, making it ideal for development workflows where you're iterating rapidly and don't need maximum capability on every call.

**Best for:** When you need reasoning but want faster responses than Ultra 253B.

### DeepSeek V3.2 — The Developer's Weapon

DeepSeek has consistently punched above its weight on coding benchmarks. V3.2 adds reasoning capability on top of already strong code generation. It's ClawRouter's **MEDIUM-tier primary in ECO mode** — the model that handles your everyday coding tasks for free.

**Best for:** Code generation and completion, code review and refactoring, technical design, debugging and error analysis.

### Mistral Large 675B — The Largest Free Model

At 675 billion parameters, Mistral Large is the **biggest model in the free lineup by parameter count.** Mistral has always excelled at multilingual tasks, with particular strength in European languages (French, German, Spanish). Reasoning-capable and formidable on long-form content.

**Best for:** Multilingual content, long document analysis, complex instruction following, cross-language translation.

### Qwen3 Coder 480B — Brute-Force Code Generation

Alibaba's Qwen team built this 480B model specifically for code. When your task is "write a lot of correct code," raw parameter count matters — and 480B parameters dedicated to code generation produce noticeably more complete and accurate output than smaller generalist models.

**Best for:** Large-scale code generation, complex algorithm implementation, multi-file changes, codebase-level understanding.

### Devstral 2 123B — Mistral's Developer Edition

Devstral is the developer-optimized variant of Mistral, fine-tuned for code comprehension, technical documentation, and API design. Think of it as Mistral Large's more focused sibling.

**Best for:** Code understanding, technical documentation, API design, developer tooling.

### GLM-4.7 — The Chinese-English Bridge

Zhipu AI's GLM-4.7 shines in Chinese-language scenarios while maintaining strong English capability. Reasoning-capable. If your users, documentation, or codebase involves Chinese, this model deserves your attention.

**Best for:** Chinese content generation, Chinese-English translation, reasoning in Chinese context, applications targeting Chinese-speaking users.

### Llama 4 Maverick — Meta's Latest

Meta's newest open-source model represents the current state of the art in open LLMs. Reasoning-capable, well-balanced across benchmarks, and backed by Meta's massive training infrastructure.

**Best for:** General-purpose tasks where you want the most recent open-source capabilities.

---

## Get Started in 5 Minutes

### Option 1: Via ClawRouter (Recommended)

```bash
# Install
npm install -g @blockrun/clawrouter

# Start the local proxy
clawrouter start
```

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:4402/v1",
    api_key="your-blockrun-key"
)

# Pick a specific free model
response = client.chat.completions.create(
    model="free/nemotron-ultra-253b",
    messages=[{"role": "user", "content": "Explain quantum entanglement"}]
)

# Or let ClawRouter's smart routing pick a model automatically
response = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Hello world"}]
)
```

### Option 2: Switch Models in Claude Code

If you're using Claude Code, one command switches you to any free model:

```
/model free          → Nemotron Ultra 253B (strongest free)
/model deepseek-free → DeepSeek V3.2
/model mistral-free  → Mistral Large 675B
/model glm-free      → GLM-4.7
/model llama-free    → Llama 4 Maverick
```

Seamless. No config changes. No restarts.

---

## The Honest Limitations

Free models aren't a silver bullet. Here's what you need to know:

### 1. No Verified Tool Calling

None of these 11 models has **structured function calling (tool use) enabled.** If your application depends on tool calling, you need a paid model (GPT-4o, Claude Sonnet, etc.).

### 2. Reasoning Has a Ceiling

Seven of the models are marked reasoning-capable, and they handle most tasks well. But on the hardest problems — competition-level math, formal proofs, deep multi-step planning — they don't match Claude Opus 4 or o3. That's why ClawRouter's REASONING tier doesn't use free models.

### 3. Context Is Large, Not Largest

A 128K–131K context window is generous for most tasks, but if you're processing entire books or massive codebases, you may need Claude's 1M context or Gemini's 2M window.

---

## Best Practices: Maximizing Free Models

### Strategy 1: Match Model to Task

Don't use one model for everything. Route by task type:

```
Quick chat, formatting → GPT-OSS 120B (fastest)
Code generation        → DeepSeek V3.2 or Qwen3 Coder 480B
Reasoning required     → Nemotron Ultra 253B
Chinese content        → GLM-4.7
Multilingual work      → Mistral Large 675B
Latest open-source     → Llama 4 Maverick
```
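The same routing table as code, for scripts that call the API directly. Model IDs are hypothetical, patterned on the `free/` naming from the quickstart:

```python
# Manual task-type dispatch over the free lineup. Model IDs are hypothetical,
# patterned on the "free/nemotron-ultra-253b" example from the quickstart.
MODEL_FOR_TASK = {
    "chat":         "free/gpt-oss-120b",
    "code":         "free/deepseek-v3.2",
    "reasoning":    "free/nemotron-ultra-253b",
    "chinese":      "free/glm-4.7",
    "multilingual": "free/mistral-large-675b",
    "general":      "free/llama-4-maverick",
}

def pick_model(task: str) -> str:
    # Unknown task types fall back to the general-purpose workhorse.
    return MODEL_FOR_TASK.get(task, "free/gpt-oss-120b")

print(pick_model("code"))  # → free/deepseek-v3.2
```

Pass the result as the `model` parameter in your `chat.completions.create` call.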

### Strategy 2: Free for 80%, Paid for 20%

Use ECO mode for the bulk of daily tasks — it's free. Reserve paid models (Claude Opus, GPT-4o) for the 20% that genuinely requires top-tier capability: production-critical reasoning, tool calling, agentic workflows. Monthly AI spend drops to single digits.

### Strategy 3: Prototype Free, Ship Paid

During development, iterate freely — prompt engineering, edge case testing, architecture exploration — all on free models. Once you've nailed the approach, switch to a paid model for final quality assurance and production deployment.
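One low-friction way to implement this split is to pick the model from an environment flag, so development traffic stays on the free tier and production flips to a paid model. The variable name and model IDs here are illustrative, not ClawRouter settings:

```python
import os

def model_for_env(env: str = "") -> str:
    """Return a model ID for the current deployment environment.

    APP_ENV and both model IDs are illustrative assumptions, not
    ClawRouter configuration.
    """
    env = env or os.environ.get("APP_ENV", "development")
    # Dev iterations ride the free tier; production pays for top-tier quality.
    return "claude-opus-4" if env == "production" else "free/nemotron-ultra-253b"

print(model_for_env("development"))  # → free/nemotron-ultra-253b
print(model_for_env("production"))   # → claude-opus-4
```

Because ClawRouter keeps one OpenAI-compatible endpoint, the switch is just a different `model` string; no other code changes.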

---

## The Bigger Picture: What This Means for AI Access

Look at the cost trajectory over the past three years:

- **2023:** GPT-4 dominates alone at $30/$60 per M tokens
- **2024:** Open-source models surge, prices halve repeatedly
- **2025:** DeepSeek, Qwen push top-tier inference below $1/M
- **2026:** BlockRun offers 11 free models through a single API

**Eleven free models isn't just a product feature — it's a signal.** Baseline AI capability is becoming infrastructure. Like internet bandwidth before it, the cost of "good enough" AI inference is converging toward zero.

BlockRun and ClawRouter exist to be the **routing layer** in this transition: not locked to any single provider, not bound to any single model, always giving developers the lowest-cost path to the right capability.

Today it's 11 free models. Tomorrow it could be 50. Prices will only drop. Capabilities will only improve.

**The one constant: your code doesn't need to change.**

---

## Start Now

```bash
npm install -g @blockrun/clawrouter
clawrouter start
```

Point your `base_url` to `http://localhost:4402/v1`. That's the whole setup.

Eleven free models. 128K context. Unlimited calls. Zero cost.

Go build something.

---

*Based on ClawRouter v0.12.81. Model availability may change with future releases. For the latest information, visit [blockrun.ai](https://blockrun.ai).*
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "@blockrun/clawrouter",
-  "version": "0.12.81",
+  "version": "0.12.83",
   "description": "Smart LLM router — save 85% on inference costs. 55+ models (11 free), one wallet, x402 micropayments.",
   "type": "module",
   "main": "dist/index.js",
@@ -257,9 +257,26 @@ if (fs.existsSync(configPath)) {
 "

 # 6. Install plugin (config is ready, but no allow list yet to avoid validation error)
+# Back up OpenClaw credentials (channels, WhatsApp/Telegram state) before plugin install
+CREDS_DIR="$HOME/.openclaw/credentials"
+CREDS_BACKUP=""
+if [ -d "$CREDS_DIR" ] && [ "$(ls -A "$CREDS_DIR" 2>/dev/null)" ]; then
+  CREDS_BACKUP="$(mktemp -d)/openclaw-credentials-backup"
+  cp -a "$CREDS_DIR" "$CREDS_BACKUP"
+  echo " ✓ Backed up OpenClaw credentials"
+fi
+
 echo "→ Installing ClawRouter..."
 openclaw plugins install @blockrun/clawrouter

+# Restore credentials after plugin install (always restore to preserve user's channels)
+if [ -n "$CREDS_BACKUP" ] && [ -d "$CREDS_BACKUP" ]; then
+  mkdir -p "$CREDS_DIR"
+  cp -a "$CREDS_BACKUP/"* "$CREDS_DIR/"
+  echo " ✓ Restored OpenClaw credentials (channels preserved)"
+  rm -rf "$(dirname "$CREDS_BACKUP")"
+fi
+
 # 6.1. Verify installation (check dist/ files exist)
 echo "→ Verifying installation..."
 DIST_PATH="$PLUGIN_DIR/dist/index.js"
package/scripts/update.sh CHANGED
@@ -170,9 +170,26 @@ try {
 "

 # ── Step 4: Install latest version ─────────────────────────────
+# Back up OpenClaw credentials (channels, WhatsApp/Telegram state) before plugin install
+CREDS_DIR="$HOME/.openclaw/credentials"
+CREDS_BACKUP=""
+if [ -d "$CREDS_DIR" ] && [ "$(ls -A "$CREDS_DIR" 2>/dev/null)" ]; then
+  CREDS_BACKUP="$(mktemp -d)/openclaw-credentials-backup"
+  cp -a "$CREDS_DIR" "$CREDS_BACKUP"
+  echo " ✓ Backed up OpenClaw credentials"
+fi
+
 echo "→ Installing latest ClawRouter..."
 openclaw plugins install @blockrun/clawrouter

+# Restore credentials after plugin install (always restore to preserve user's channels)
+if [ -n "$CREDS_BACKUP" ] && [ -d "$CREDS_BACKUP" ]; then
+  mkdir -p "$CREDS_DIR"
+  cp -a "$CREDS_BACKUP/"* "$CREDS_DIR/"
+  echo " ✓ Restored OpenClaw credentials (channels preserved)"
+  rm -rf "$(dirname "$CREDS_BACKUP")"
+fi
+
 # ── Step 4b: Ensure all dependencies are installed ────────────
 # openclaw's plugin installer may skip native/optional deps like @solana/kit.
 # Run npm install in the plugin directory to fill any gaps.