@blockrun/llm 1.8.0 → 1.8.1

Files changed (2)
  1. package/README.md +17 -5
  2. package/package.json +1 -1
package/README.md CHANGED
@@ -103,7 +103,7 @@ const client = new LLMClient();
  // Auto-routes to cheapest capable model
  const result = await client.smartChat('What is 2+2?');
  console.log(result.response); // '4'
- console.log(result.model); // 'nvidia/kimi-k2.5' (cheap, fast)
+ console.log(result.model); // 'moonshot/kimi-k2.5' (cheap, fast)
  console.log(`Saved ${(result.routing.savings * 100).toFixed(0)}%`); // 'Saved 78%'
 
  // Complex reasoning task -> routes to reasoning model
@@ -144,7 +144,7 @@ The classifier runs in <1ms, 100% locally, and routes to one of four tiers:
 
  | Tier | Example Tasks | Auto Profile Model |
  |------|---------------|-------------------|
- | SIMPLE | "What is 2+2?", definitions | nvidia/kimi-k2.5 |
+ | SIMPLE | "What is 2+2?", definitions | moonshot/kimi-k2.5 |
  | MEDIUM | Code snippets, explanations | xai/grok-code-fast-1 |
  | COMPLEX | Architecture, long documents | google/gemini-3.1-pro |
  | REASONING | Proofs, multi-step reasoning | xai/grok-4-1-fast-reasoning |
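The four-tier routing shown in the table above is described in the README as a sub-millisecond, fully local classifier. As a rough illustration only (the SDK's real classifier is internal and not shown in this diff), a hypothetical keyword-and-length heuristic over the same four tier names could look like:

```typescript
// Illustrative sketch, NOT the SDK's actual classifier.
// Maps a prompt to one of the four tiers named in the README table.
type Tier = 'SIMPLE' | 'MEDIUM' | 'COMPLEX' | 'REASONING';

function classifyPrompt(prompt: string): Tier {
  const words = prompt.trim().split(/\s+/).length;
  // Proof/derivation language -> reasoning model
  if (/\b(prove|proof|derive|step[- ]by[- ]step)\b/i.test(prompt)) return 'REASONING';
  // Long documents or architecture discussions -> complex tier
  if (words > 300) return 'COMPLEX';
  // Code markers -> medium tier (code snippets, explanations)
  if (/```|\bfunction\b|\bclass\b|\bcode\b/.test(prompt)) return 'MEDIUM';
  // Short factual questions fall through to the cheapest tier
  return 'SIMPLE';
}

console.log(classifyPrompt('What is 2+2?'));                      // 'SIMPLE'
console.log(classifyPrompt('Prove that sqrt(2) is irrational.')); // 'REASONING'
```

A purely lexical check like this is one way to stay within the <1ms, no-network budget the README claims; the actual tier boundaries and signals used by the SDK may differ.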
@@ -236,11 +236,23 @@ The classifier runs in <1ms, 100% locally, and routes to one of four tiers:
  | `minimax/minimax-m2.7` | $0.30/M | $1.20/M |
  | `minimax/minimax-m2.5` | $0.30/M | $1.20/M |
 
- ### NVIDIA (Free & Hosted)
+ ### NVIDIA (Free) + Moonshot
+
+ Free tier refreshed 2026-04-21: retired the Nemotron family, `mistral-large-3-675b`,
+ `devstral-2-123b`, and paid `nvidia/kimi-k2.5`. The backend auto-redirects the
+ old IDs; the recommended replacements are listed below.
+
  | Model | Input Price | Output Price | Notes |
  |-------|-------------|--------------|-------|
- | `nvidia/gpt-oss-120b` | **FREE** | **FREE** | OpenAI open-weight 120B (Apache 2.0) |
- | `nvidia/kimi-k2.5` | $0.60/M | $3.00/M | Moonshot 1T MoE with vision |
+ | `nvidia/qwen3-next-80b-a3b-thinking` | **FREE** | **FREE** | Reasoning flagship 116 tok/s, thinking mode |
+ | `nvidia/mistral-small-4-119b` | **FREE** | **FREE** | Fastest free chat 114 tok/s |
+ | `nvidia/glm-4.7` | **FREE** | **FREE** | GLM-4.7 with thinking — 237 tok/s |
+ | `nvidia/llama-4-maverick` | **FREE** | **FREE** | Llama 4 Maverick MoE |
+ | `nvidia/qwen3-coder-480b` | **FREE** | **FREE** | Coding-optimised 480B MoE |
+ | `nvidia/deepseek-v3.2` | **FREE** | **FREE** | DeepSeek V3.2 hosted |
+ | `nvidia/gpt-oss-120b` | **FREE** | **FREE** | OpenAI open-weight 120B — 123 tok/s |
+ | `nvidia/gpt-oss-20b` | **FREE** | **FREE** | OpenAI open-weight 20B — 155 tok/s |
+ | `moonshot/kimi-k2.5` | $0.60/M | $3.00/M | Direct from Moonshot — replaces `nvidia/kimi-k2.5` |
 
  ### E2E Verified Models
 
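The savings figure the README prints (`result.routing.savings`) follows directly from per-million-token prices like those in the table above. A hedged sketch of that arithmetic, where the baseline model's prices and the token counts are hypothetical (the routed prices are `moonshot/kimi-k2.5`'s from the table; the SDK's actual baseline and accounting are not shown in this diff):

```typescript
// USD per 1M tokens
interface Price { input: number; output: number }

// Cost of one request given token counts for prompt and completion
function requestCost(p: Price, inTok: number, outTok: number): number {
  return (p.input * inTok + p.output * outTok) / 1_000_000;
}

const routed: Price = { input: 0.60, output: 3.00 };    // moonshot/kimi-k2.5 (from table)
const baseline: Price = { input: 3.00, output: 15.00 }; // hypothetical premium default

// 1000 prompt tokens, 500 completion tokens (illustrative)
const savings = 1 - requestCost(routed, 1000, 500) / requestCost(baseline, 1000, 500);
console.log(`Saved ${(savings * 100).toFixed(0)}%`); // 'Saved 80%'
```

With these made-up numbers the routed request costs $0.0021 versus $0.0105 at the baseline, i.e. 80% saved; the README's "Saved 78%" would correspond to a different baseline or token mix.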
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@blockrun/llm",
- "version": "1.8.0",
+ "version": "1.8.1",
  "type": "module",
  "description": "BlockRun SDK - Pay-per-request AI (LLM, Image, Video, Music) via x402 on Base and Solana",
  "main": "dist/index.cjs",