@blockrun/clawrouter 0.12.81 → 0.12.83
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +17 -6
- package/dist/cli.js +29 -8
- package/dist/cli.js.map +1 -1
- package/docs/11-free-ai-models-zero-cost-blockrun.md +276 -0
- package/package.json +1 -1
- package/scripts/reinstall.sh +17 -0
- package/scripts/update.sh +17 -0
package/docs/11-free-ai-models-zero-cost-blockrun.md ADDED

@@ -0,0 +1,276 @@
# 11 Free AI Models, Zero Cost: How BlockRun Gives Developers Top-Tier LLMs for Nothing

## The Cost Problem Nobody Talks About

It's 2026. Large language models are table stakes for developers. But here's the uncomfortable truth — **the models you can afford aren't good enough, and the good ones aren't affordable.**

Claude Opus 4 runs $15/$75 per million tokens. GPT-4o sits at $2.50/$10. Even the "cheap" models add up fast. For indie developers, students, and early-stage startups, $50–$200/month in API costs is real money — especially when half of it goes to throwaway experiments, prompt iterations, and dead-end debugging sessions.

You're not just paying for intelligence. You're paying for every mistake, every retry, every discarded attempt.

**What if you had 11 high-quality LLMs — completely free, unlimited calls, 128K context — and could use them right now?**

BlockRun's answer: just take them.

---

## The Lineup: 11 Models, $0.00

Through [ClawRouter](https://github.com/blockrunai/ClawRouter) — BlockRun's local AI routing proxy — you get zero-cost access to the following:

| Model | Parameters | Context | Reasoning | Best For |
|-------|-----------|---------|-----------|----------|
| **GPT-OSS 120B** | 120B | 128K | — | General chat, summaries, formatting |
| **GPT-OSS 20B** | 20B | 128K | — | Fast lightweight tasks |
| **Nemotron Ultra 253B** | 253B | 131K | ✅ | Complex reasoning, math, analysis |
| **Nemotron 3 Super 120B** | 120B | 131K | ✅ | Balanced reasoning + general |
| **Nemotron Super 49B** | 49B | 131K | ✅ | Quick reasoning, low latency |
| **DeepSeek V3.2** | — | 128K | ✅ | Code generation, technical reasoning |
| **Mistral Large 675B** | 675B | 128K | ✅ | Multilingual, long-form, complex instructions |
| **Qwen3 Coder 480B** | 480B | 128K | — | Professional code generation |
| **Devstral 2 123B** | 123B | 128K | — | Developer tooling, code review |
| **GLM-4.7** | — | 128K | ✅ | Chinese-English bilingual reasoning |
| **Llama 4 Maverick** | — | 128K | ✅ | Meta's latest open-source all-rounder |

**Price: $0.00 per million tokens. Input free. Output free. No hidden fees. No daily caps. No trial period.**

This isn't "free for your first 1,000 requests." It's not "free but rate-limited to uselessness." It's production-grade, unlimited, genuinely free inference.

---

## Why Free?

BlockRun's business model is simple: **make the best models accessible, charge only for the premium ones.**

The 11 free models are BlockRun's foundation tier. They cover the vast majority of everyday developer tasks — chat, coding, translation, summarization, lightweight reasoning — without costing a cent. When you need heavier firepower (Claude Opus 4, GPT-4o, o3), BlockRun charges per-call via [x402 micropayments](https://www.x402.org/). No subscriptions, no monthly minimums — just pay for what you use, only when you need to.

The free tier isn't a loss leader. It's the product. BlockRun believes baseline AI capability should be accessible to every developer, regardless of budget. The premium tier exists for tasks that genuinely demand it.

---

## Not Just Free: How Smart Routing Squeezes Every Dollar

ClawRouter's value proposition isn't just "here are free models." It's **intelligent routing** — automatically selecting the right model for each request based on prompt complexity.

### The Four-Tier Architecture

ClawRouter classifies every incoming request into one of four complexity tiers:

| Tier | Typical Tasks | ECO Route (Cheapest) | AUTO Route (Balanced) |
|------|--------------|---------------------|----------------------|
| **SIMPLE** | Formatting, translation, Q&A | 🆓 GPT-OSS 120B (FREE) | GPT-4o Mini |
| **MEDIUM** | Summaries, analysis, general coding | 🆓 DeepSeek V3.2 (FREE) | DeepSeek V3.2 |
| **COMPLEX** | Architecture, complex code | 🆓 Nemotron Ultra 253B (FREE) | Claude Sonnet 4 |
| **REASONING** | Mathematical proofs, multi-step logic | DeepSeek R1 | Claude Opus 4 |

Look at the ECO column. **Three out of four tiers route to free models.** Unless you're doing the hardest reasoning tasks, your daily work costs nothing.
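The tier-plus-route idea can be sketched in a few lines of Python. This is purely illustrative: ClawRouter's real classifier runs inside the proxy, the keyword heuristics below are invented stand-ins, and every model ID except `free/nemotron-ultra-253b` (which appears in this doc's own examples) is an assumed name.

```python
# Hypothetical sketch of ECO-mode tier routing. The heuristics and most
# model IDs (all except "free/nemotron-ultra-253b") are assumptions.

ECO_ROUTES = {
    "SIMPLE": "free/gpt-oss-120b",         # assumed ID
    "MEDIUM": "free/deepseek-v3.2",        # assumed ID
    "COMPLEX": "free/nemotron-ultra-253b",
    "REASONING": "deepseek-r1",            # the one non-free ECO route
}

def classify(prompt: str) -> str:
    """Toy complexity classifier standing in for the real one."""
    p = prompt.lower()
    if any(k in p for k in ("prove", "theorem", "formal logic")):
        return "REASONING"
    if any(k in p for k in ("architecture", "refactor", "debug")):
        return "COMPLEX"
    if any(k in p for k in ("summarize", "analyze", "implement")) or len(prompt) > 500:
        return "MEDIUM"
    return "SIMPLE"

def eco_model(prompt: str) -> str:
    """Cheapest route for the classified tier."""
    return ECO_ROUTES[classify(prompt)]

print(eco_model("Translate this sentence to French"))  # free/gpt-oss-120b
print(eco_model("Summarize this design doc"))          # free/deepseek-v3.2
```

The point of the sketch: model choice happens per request, outside your application logic, so callers never hard-code a model name.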

### Real-World Cost Comparison

Assume 100 requests per day, distributed roughly as:

- 40% SIMPLE (chat, translation, formatting)
- 30% MEDIUM (coding, analysis)
- 20% COMPLEX (architecture, deep debugging)
- 10% REASONING (math, formal logic)

| Approach | Estimated Monthly Cost |
|----------|----------------------|
| Pure Claude Opus 4 | ~$75–150 |
| Pure GPT-4o | ~$15–30 |
| ClawRouter AUTO mode | ~$5–10 |
| ClawRouter ECO mode | ~$1–3 |
| Manual free model selection | **$0** |

**ECO mode saves 92%+ compared to Claude Opus alone.**
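These figures can be sanity-checked with back-of-envelope arithmetic. The sketch below assumes 500 input and 500 output tokens per request and a 30-day month; the DeepSeek R1 prices are illustrative placeholders, not published rates.

```python
# Back-of-envelope cost model. Assumptions (not from ClawRouter): 500 input
# + 500 output tokens per request, 30-day month; prices in $ per M tokens.

REQUESTS_PER_DAY = 100
DAYS = 30
IN_TOK, OUT_TOK = 500, 500

def monthly_cost(in_price: float, out_price: float, share: float = 1.0) -> float:
    """Monthly cost of sending `share` of all traffic to one model."""
    requests = REQUESTS_PER_DAY * DAYS * share
    return requests * (IN_TOK * in_price + OUT_TOK * out_price) / 1_000_000

opus_all = monthly_cost(15, 75)  # every request on Claude Opus 4
# ECO: 90% of traffic on free tiers, 10% REASONING on a cheap paid model
# (the $0.5/$2.2 DeepSeek R1 prices are illustrative placeholders)
eco = monthly_cost(0, 0, share=0.9) + monthly_cost(0.5, 2.2, share=0.1)

print(f"Pure Opus: ~${opus_all:.0f}/month")
print(f"ECO mode:  ~${eco:.2f}/month")
print(f"Savings:   {100 * (1 - eco / opus_all):.0f}%")
```

Under these assumptions pure Opus lands around $135/month, squarely inside the ~$75–150 row above, and ECO comes out well past the 92% savings mark.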

---

## Deep Dive: What Each Free Model Does Best

### GPT-OSS 120B / 20B — The Workhorse

GPT-OSS is BlockRun's default general-purpose free model. The 120B version is ClawRouter's **default SIMPLE-tier model** in ECO mode and the **ultimate fallback** when wallet balance runs low. It handles conversation, text generation, and summarization with reliable consistency.

The 20B variant trades capability for speed — noticeably faster responses for tasks that don't need the bigger model's muscle.

**Best for:** Daily conversation, text summaries, reformatting, translation, quick answers.

### Nemotron Ultra 253B — The Free Flagship

253 billion parameters. Reasoning capability. 131K context window. Nemotron Ultra is the **single strongest free model on BlockRun** — and it's the default when you type `/model free` in ClawRouter.

This is the model you reach for when the task is genuinely hard but you don't want to pay for it. Complex analysis, multi-step planning, mathematical reasoning — Nemotron Ultra handles them with surprising competence for a zero-cost option.

**Best for:** Complex reasoning, math, logic, deep analysis, planning. If you remember one free model name, remember this one.

### Nemotron 3 Super 120B / Nemotron Super 49B — The Gradient

The Nemotron family gives you three reasoning-capable models at different scales (253B / 120B / 49B). This gradient lets you match firepower to task difficulty. The 49B version is noticeably faster, making it ideal for development workflows where you're iterating rapidly and don't need maximum capability on every call.

**Best for:** When you need reasoning but want faster responses than Ultra 253B.

### DeepSeek V3.2 — The Developer's Weapon

DeepSeek has consistently punched above its weight on coding benchmarks. V3.2 adds reasoning capability on top of already strong code generation. It's ClawRouter's **MEDIUM-tier primary in ECO mode** — the model that handles your everyday coding tasks for free.

**Best for:** Code generation and completion, code review and refactoring, technical design, debugging and error analysis.

### Mistral Large 675B — The Largest Free Model

At 675 billion parameters, Mistral Large is the **biggest model in the free lineup by parameter count.** Mistral has always excelled at multilingual tasks, with particular strength in European languages (French, German, Spanish). Reasoning-capable and formidable on long-form content.

**Best for:** Multilingual content, long document analysis, complex instruction following, cross-language translation.

### Qwen3 Coder 480B — Brute-Force Code Generation

Alibaba's Qwen team built this 480B model specifically for code. When your task is "write a lot of correct code," raw parameter count matters — and 480B parameters dedicated to code generation produces noticeably more complete and accurate output than smaller generalist models.

**Best for:** Large-scale code generation, complex algorithm implementation, multi-file changes, codebase-level understanding.

### Devstral 2 123B — Mistral's Developer Edition

Devstral is the developer-optimized variant of Mistral, fine-tuned for code comprehension, technical documentation, and API design. Think of it as Mistral Large's more focused sibling.

**Best for:** Code understanding, technical documentation, API design, developer tooling.

### GLM-4.7 — The Chinese-English Bridge

Zhipu AI's GLM-4.7 shines in Chinese-language scenarios while maintaining strong English capability. Reasoning-capable. If your users, documentation, or codebase involves Chinese, this model deserves your attention.

**Best for:** Chinese content generation, Chinese-English translation, reasoning in Chinese context, applications targeting Chinese-speaking users.

### Llama 4 Maverick — Meta's Latest

Meta's newest open-source model represents the current state of the art in open LLMs. Reasoning-capable, well-balanced across benchmarks, and backed by Meta's massive training infrastructure.

**Best for:** General-purpose tasks where you want the most recent open-source capabilities.

---

## Get Started in 5 Minutes

### Option 1: Via ClawRouter (Recommended)

```bash
# Install
npm install -g @blockrun/clawrouter

# Start the local proxy
clawrouter start
```

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:4402/v1",
    api_key="your-blockrun-key"
)

# Pick a specific free model
response = client.chat.completions.create(
    model="free/nemotron-ultra-253b",
    messages=[{"role": "user", "content": "Explain quantum entanglement"}]
)

# Or let ClawRouter's automatic routing pick a model for you
response = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Hello world"}]
)
```

### Option 2: Switch Models in Claude Code

If you're using Claude Code, one command switches you to any free model:

```
/model free          → Nemotron Ultra 253B (strongest free)
/model deepseek-free → DeepSeek V3.2
/model mistral-free  → Mistral Large 675B
/model glm-free      → GLM-4.7
/model llama-free    → Llama 4 Maverick
```

Seamless. No config changes. No restarts.

---

## The Honest Limitations

Free models aren't a silver bullet. Here's what you need to know:

### 1. No Verified Tool Calling

None of these 11 models have **structured function calling (tool use) enabled.** If your application depends on tool calling, you need a paid model (GPT-4o, Claude Sonnet, etc.).

### 2. Reasoning Has a Ceiling

Seven models are marked reasoning-capable, and they handle most tasks well. But on the hardest problems — competition-level math, formal proofs, deep multi-step planning — they don't match Claude Opus 4 or o3. That's why ClawRouter's REASONING tier doesn't use free models.

### 3. Context Is Large, Not Largest

128K–131K context is generous for most tasks, but if you're processing entire books or massive codebases, you may need Claude's 1M context or Gemini's 2M window.

---

## Best Practices: Maximizing Free Models

### Strategy 1: Match Model to Task

Don't use one model for everything. Route by task type:

```
Quick chat, formatting → GPT-OSS 120B (or 20B for speed)
Code generation        → DeepSeek V3.2 or Qwen3 Coder 480B
Reasoning required     → Nemotron Ultra 253B
Chinese content        → GLM-4.7
Multilingual work      → Mistral Large 675B
Latest open-source     → Llama 4 Maverick
```
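In application code, this mapping can be a small lookup helper in front of the OpenAI client. Everything here is a sketch: only `free/nemotron-ultra-253b` is confirmed by this doc's examples; the other model IDs assume the same naming pattern.

```python
# Sketch: pick a free model by task type before calling the local proxy.
# All model IDs except "free/nemotron-ultra-253b" are assumed names.

TASK_MODELS = {
    "chat": "free/gpt-oss-120b",
    "code": "free/deepseek-v3.2",
    "code-heavy": "free/qwen3-coder-480b",
    "reasoning": "free/nemotron-ultra-253b",
    "chinese": "free/glm-4.7",
    "multilingual": "free/mistral-large-675b",
}

def model_for(task: str) -> str:
    # Fall back to the general-purpose workhorse for unknown task types
    return TASK_MODELS.get(task, "free/gpt-oss-120b")

# Usage with the OpenAI SDK pointed at the local proxy:
#   client.chat.completions.create(model=model_for("code"), messages=[...])
print(model_for("code"))     # free/deepseek-v3.2
print(model_for("unknown"))  # free/gpt-oss-120b
```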

### Strategy 2: Free for 80%, Paid for 20%

Use ECO mode for the bulk of daily tasks — it's free. Reserve paid models (Claude Opus, GPT-4o) for the 20% that genuinely requires top-tier capability: production-critical reasoning, tool calling, agentic workflows. Monthly AI spend drops to single digits.

### Strategy 3: Prototype Free, Ship Paid

During development, iterate freely — prompt engineering, edge case testing, architecture exploration — all on free models. Once you've nailed the approach, switch to a paid model for final quality assurance and production deployment.
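One lightweight way to wire up that dev/prod split is an environment switch. The `APP_ENV` convention and both model names here are illustrative assumptions, not ClawRouter configuration.

```python
import os

# Sketch: route dev traffic to a free model, production to a paid one.
# "claude-sonnet-4" and the APP_ENV convention are illustrative assumptions.
def pick_model() -> str:
    if os.environ.get("APP_ENV") == "production":
        return "claude-sonnet-4"         # paid: final QA and production
    return "free/nemotron-ultra-253b"    # free: iteration and prototyping

os.environ["APP_ENV"] = "development"
print(pick_model())  # free/nemotron-ultra-253b
```

Because ClawRouter exposes one OpenAI-compatible endpoint, only the model string changes between environments; the rest of the call site stays identical.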

---

## The Bigger Picture: What This Means for AI Access

Look at the cost trajectory over the past three years:

- **2023:** GPT-4 dominates alone at $30/$60 per M tokens
- **2024:** Open-source models surge, prices halve repeatedly
- **2025:** DeepSeek, Qwen push top-tier inference below $1/M
- **2026:** BlockRun offers 11 free models through a single API

**Eleven free models isn't just a product feature — it's a signal.** Baseline AI capability is becoming infrastructure. Like internet bandwidth before it, the cost of "good enough" AI inference is converging toward zero.

BlockRun and ClawRouter exist to be the **routing layer** in this transition: not locked to any single provider, not bound to any single model, always giving developers the lowest-cost path to the right capability.

Today it's 11 free models. Tomorrow it could be 50. Prices will only drop. Capabilities will only improve.

**The one constant: your code doesn't need to change.**

---

## Start Now

```bash
npm install -g @blockrun/clawrouter
clawrouter start
```

Point your `base_url` to `http://localhost:4402/v1`. That's the whole setup.

Eleven free models. 128K context. Unlimited calls. Zero cost.

Go build something.

---

*Based on ClawRouter v0.12.81. Model availability may change with future releases. For the latest information, visit [blockrun.ai](https://blockrun.ai).*

package/package.json CHANGED

package/scripts/reinstall.sh CHANGED

@@ -257,9 +257,26 @@ if (fs.existsSync(configPath)) {
 "
 
 # 6. Install plugin (config is ready, but no allow list yet to avoid validation error)
+# Back up OpenClaw credentials (channels, WhatsApp/Telegram state) before plugin install
+CREDS_DIR="$HOME/.openclaw/credentials"
+CREDS_BACKUP=""
+if [ -d "$CREDS_DIR" ] && [ "$(ls -A "$CREDS_DIR" 2>/dev/null)" ]; then
+  CREDS_BACKUP="$(mktemp -d)/openclaw-credentials-backup"
+  cp -a "$CREDS_DIR" "$CREDS_BACKUP"
+  echo " ✓ Backed up OpenClaw credentials"
+fi
+
 echo "→ Installing ClawRouter..."
 openclaw plugins install @blockrun/clawrouter
 
+# Restore credentials after plugin install (always restore to preserve user's channels)
+if [ -n "$CREDS_BACKUP" ] && [ -d "$CREDS_BACKUP" ]; then
+  mkdir -p "$CREDS_DIR"
+  cp -a "$CREDS_BACKUP/"* "$CREDS_DIR/"
+  echo " ✓ Restored OpenClaw credentials (channels preserved)"
+  rm -rf "$(dirname "$CREDS_BACKUP")"
+fi
+
 # 6.1. Verify installation (check dist/ files exist)
 echo "→ Verifying installation..."
 DIST_PATH="$PLUGIN_DIR/dist/index.js"
package/scripts/update.sh CHANGED

@@ -170,9 +170,26 @@ try {
 "
 
 # ── Step 4: Install latest version ─────────────────────────────
+# Back up OpenClaw credentials (channels, WhatsApp/Telegram state) before plugin install
+CREDS_DIR="$HOME/.openclaw/credentials"
+CREDS_BACKUP=""
+if [ -d "$CREDS_DIR" ] && [ "$(ls -A "$CREDS_DIR" 2>/dev/null)" ]; then
+  CREDS_BACKUP="$(mktemp -d)/openclaw-credentials-backup"
+  cp -a "$CREDS_DIR" "$CREDS_BACKUP"
+  echo " ✓ Backed up OpenClaw credentials"
+fi
+
 echo "→ Installing latest ClawRouter..."
 openclaw plugins install @blockrun/clawrouter
 
+# Restore credentials after plugin install (always restore to preserve user's channels)
+if [ -n "$CREDS_BACKUP" ] && [ -d "$CREDS_BACKUP" ]; then
+  mkdir -p "$CREDS_DIR"
+  cp -a "$CREDS_BACKUP/"* "$CREDS_DIR/"
+  echo " ✓ Restored OpenClaw credentials (channels preserved)"
+  rm -rf "$(dirname "$CREDS_BACKUP")"
+fi
+
 # ── Step 4b: Ensure all dependencies are installed ────────────
 # openclaw's plugin installer may skip native/optional deps like @solana/kit.
 # Run npm install in the plugin directory to fill any gaps.