@blockrun/llm 1.15.0 → 2.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -2,7 +2,7 @@
  
  > **@blockrun/llm** is a TypeScript/Node.js SDK for accessing 41+ large language models (GPT-5, Claude, Gemini, Grok, DeepSeek, Kimi, and more) with automatic pay-per-request USDC micropayments via the x402 protocol. No API keys required — your wallet signature is your authentication. Supports **streaming**, smart routing, Base and Solana chains.
  >
- > 🆓 **Includes 9 fully-free NVIDIA-hosted models** — DeepSeek V4 Pro/Flash (1M context), Nemotron Nano Omni (vision), Qwen3, Llama 4, GLM-4.7, Mistral. Zero USDC, no rate-limit gimmicks. Use `routingProfile: 'free'` or call any `nvidia/*` model directly.
+ > 🆓 **Includes 8 fully-free NVIDIA-hosted models** (6 visible in `/v1/models`, 2 hidden but directly callable) — DeepSeek V4 Flash (1M context), Nemotron Nano Omni (vision), Qwen3, Llama 4, Mistral, plus the gpt-oss pair. Zero USDC, no rate-limit gimmicks. Use `routingProfile: 'free'` or call any `nvidia/*` model directly.
  
  [![npm](https://img.shields.io/npm/v/@blockrun/llm.svg)](https://www.npmjs.com/package/@blockrun/llm)
  [![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)
@@ -74,7 +74,7 @@ console.log(result.response); // '4'
  
  > Need V4-Pro-class reasoning? Use the paid `deepseek/deepseek-v4-pro` ($0.50/$1.00 with the 75% promo through 2026-05-31) — `nvidia/deepseek-v4-pro` is currently hidden because NVIDIA's NIM deployment is hung; backend MODEL_REDIRECTS forwards calls to V4 Flash.
  
- > Note: `nvidia/gpt-oss-120b` and `nvidia/gpt-oss-20b` were retired 2026-04-28 NVIDIA's free build.nvidia.com tier reserves the right to use prompts/outputs for service improvement, which conflicts with our data-privacy policy.
+ > Privacy note: `nvidia/gpt-oss-120b` and `nvidia/gpt-oss-20b` are hidden from `/v1/models` because NVIDIA's free build.nvidia.com tier reserves the right to use prompts/outputs for service improvement. Direct calls by full model ID still work — opt in only when your data isn't sensitive.
  
  ## Quick Start (Solana)
  
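The redirect behavior the note above describes (hidden `nvidia/deepseek-v4-pro` forwarded to V4 Flash) amounts to a lookup before dispatch. A minimal sketch — the function name and map shape here are hypothetical, not the backend's actual source; only the one redirect documented above is included:

```typescript
// Hypothetical sketch of the MODEL_REDIRECTS forwarding described above.
const MODEL_REDIRECTS: Record<string, string> = {
  // nvidia/deepseek-v4-pro is hidden while NVIDIA's NIM deployment is hung
  'nvidia/deepseek-v4-pro': 'nvidia/deepseek-v4-flash',
};

// Resolve a requested model ID to the one actually served.
function resolveModel(requested: string): string {
  return MODEL_REDIRECTS[requested] ?? requested;
}
```

Unredirected IDs pass through unchanged, so callers never need to know whether a redirect is active.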
@@ -146,13 +146,33 @@ console.log(`Saved ${(result.routing.savings * 100).toFixed(0)}%`); // 'Saved 78
  // Complex reasoning task -> routes to reasoning model
  const complex = await client.smartChat('Prove the Riemann hypothesis step by step');
  console.log(complex.model); // 'xai/grok-4-1-fast-reasoning'
+ 
+ // Inspect the fallback chain SmartChat will walk on transient errors.
+ console.log(complex.routing.fallbacks); // ['anthropic/claude-opus-4.7', ...]
+ ```
+ 
+ ### Automatic Fallback on Transient Errors
+ 
+ `smartChat()` populates a tier-specific fallback chain, and `chat()` /
+ `chatCompletion()` walk it automatically when the primary model returns a
+ transient error — timeouts, network failures, or 5xx responses
+ (502/503/504/522/524). 4xx errors and `PaymentError` propagate immediately
+ so wallet/auth issues surface fast.
+ 
+ ```typescript
+ // Manually pass a fallback chain to chat() / chatCompletion()
+ const reply = await client.chat('nvidia/deepseek-v4-flash', 'hello', {
+   fallbackModels: ['nvidia/llama-4-maverick', 'nvidia/mistral-small-4-119b'],
+ });
+ // If deepseek-v4-flash times out, the SDK retries against the next model
+ // and logs each hop to stderr: "[@blockrun/llm] <from> -> <to> (...)".
  ```
  
  ### Routing Profiles
  
  | Profile | Description | Best For |
  |---------|-------------|----------|
- | `free` | NVIDIA free tier — smart-routes across 9 models (DeepSeek V4 Pro/Flash, Nemotron Nano Omni, Qwen3, GLM-4.7, Llama 4, Mistral) | Zero-cost testing, dev, prod |
+ | `free` | NVIDIA free tier — smart-routes across 8 models (DeepSeek V4 Flash, Nemotron Nano Omni, Qwen3, Llama 4, Mistral, plus 2 hidden gpt-oss) | Zero-cost testing, dev, prod |
  | `eco` | Cheapest models per tier (DeepSeek, xAI) | Cost-sensitive production |
  | `auto` | Best balance of cost/quality (default) | General use |
  | `premium` | Top-tier models (OpenAI, Anthropic) | Quality-critical tasks |
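The fallback semantics documented in the new section (retry on timeouts, network failures, and 502/503/504/522/524; let 4xx and `PaymentError` propagate) can be sketched as a standalone loop. This is an illustration of the described behavior, not the SDK's source — the `HttpError` stand-in and `chatWithFallback` name are invented for the sketch:

```typescript
// Statuses the chain retries on, per the README's fallback section.
const TRANSIENT_STATUSES = new Set([502, 503, 504, 522, 524]);

class PaymentError extends Error {} // stand-in for the SDK's error type
class HttpError extends Error {
  constructor(public status: number) { super(`HTTP ${status}`); }
}

function isTransient(err: unknown): boolean {
  if (err instanceof PaymentError) return false; // wallet issues surface fast
  if (err instanceof HttpError) return TRANSIENT_STATUSES.has(err.status);
  return true; // timeouts / network failures
}

// Walk primary + fallbacks, hopping only on transient errors.
async function chatWithFallback(
  models: string[],
  call: (model: string) => Promise<string>,
): Promise<string> {
  let lastErr: unknown;
  for (const model of models) {
    try {
      return await call(model);
    } catch (err) {
      if (!isTransient(err)) throw err; // 4xx / PaymentError propagate
      lastErr = err;                    // otherwise try the next model
    }
  }
  throw lastErr; // chain exhausted
}
```

Note that a 404 or 401 fails the whole call immediately, while a 503 on the primary silently lands on the first fallback.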
@@ -163,7 +183,7 @@ const result = await client.smartChat(
    'Write production-grade async TypeScript code',
    { routingProfile: 'premium' }
  );
- console.log(result.model); // 'anthropic/claude-opus-4.5'
+ console.log(result.model); // 'anthropic/claude-opus-4.7'
  ```
  
  ### How ClawRouter Works
@@ -230,14 +250,15 @@ Released 2026-04-23 — first fully retrained base since GPT-4.5. 1M context, 12
  | `openai/o4-mini` | $1.10/M | $4.40/M |
  
  ### Anthropic Claude
- | Model | Input Price | Output Price |
- |-------|-------------|--------------|
- | `anthropic/claude-opus-4.6` | $5.00/M | $25.00/M |
- | `anthropic/claude-opus-4.5` | $5.00/M | $25.00/M |
- | `anthropic/claude-opus-4` | $15.00/M | $75.00/M |
- | `anthropic/claude-sonnet-4.6` | $3.00/M | $15.00/M |
- | `anthropic/claude-sonnet-4` | $3.00/M | $15.00/M |
- | `anthropic/claude-haiku-4.5` | $1.00/M | $5.00/M |
+ | Model | Input Price | Output Price | Context | Notes |
+ |-------|-------------|--------------|---------|-------|
+ | `anthropic/claude-opus-4.7` | $5.00/M | $25.00/M | **1M** | Flagship — agentic coding + adaptive thinking, 128K output |
+ | `anthropic/claude-opus-4.6` | $5.00/M | $25.00/M | 200K | Hidden but still callable — kept as in-family hot-swap fallback |
+ | `anthropic/claude-opus-4.5` | $5.00/M | $25.00/M | 200K | |
+ | `anthropic/claude-opus-4` | $15.00/M | $75.00/M | 200K | |
+ | `anthropic/claude-sonnet-4.6` | $3.00/M | $15.00/M | 200K | Best for reasoning/instructions |
+ | `anthropic/claude-sonnet-4` | $3.00/M | $15.00/M | 200K | |
+ | `anthropic/claude-haiku-4.5` | $1.00/M | $5.00/M | 200K | |
  
  ### Google Gemini
  | Model | Input Price | Output Price |
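The per-million-token prices in the tables above translate to per-request cost with simple arithmetic. A hedged sketch (the function name is invented for illustration, and prices are hard-coded from this README's table rather than fetched from the API):

```typescript
// Estimate USD cost of one request from the table's $/M-token prices.
function requestCostUSD(
  inputTokens: number,
  outputTokens: number,
  inputPerM: number,  // e.g. 5.00 for anthropic/claude-opus-4.7
  outputPerM: number, // e.g. 25.00 for anthropic/claude-opus-4.7
): number {
  return (inputTokens / 1e6) * inputPerM + (outputTokens / 1e6) * outputPerM;
}

// A 10K-in / 2K-out call to claude-opus-4.7:
// (0.01 * $5.00) + (0.002 * $25.00) = $0.10
const example = requestCostUSD(10_000, 2_000, 5.0, 25.0);
```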
@@ -334,7 +355,6 @@ All models below have been tested end-to-end via the TypeScript SDK (Feb 2026):
  | `openai/gpt-image-2` | $0.06-0.12/image (reasoning-driven, multilingual text rendering, character consistency) |
  | `google/nano-banana` | $0.05/image |
  | `google/nano-banana-pro` | $0.10-0.15/image |
- | `black-forest/flux-1.1-pro` | $0.04/image |
  | `xai/grok-imagine-image` | $0.02/image |
  | `xai/grok-imagine-image-pro` | $0.07/image |
  | `zai/cogview-4` | $0.015/image |
@@ -507,48 +527,6 @@ const result = await client.imageEdit(
  console.log(result.data[0].url);
  ```
  
- ## Testnet Usage
- 
- For development and testing without real USDC, use the testnet:
- 
- ```typescript
- import { testnetClient } from '@blockrun/llm';
- 
- // Create testnet client (uses Base Sepolia)
- const client = testnetClient({ privateKey: '0x...' });
- 
- // Chat with testnet model
- const response = await client.chat('openai/gpt-oss-20b', 'Hello!');
- console.log(response);
- 
- // Check if client is on testnet
- console.log(client.isTestnet()); // true
- ```
- 
- ### Testnet Setup
- 
- 1. Get testnet ETH from [Alchemy Base Sepolia Faucet](https://www.alchemy.com/faucets/base-sepolia)
- 2. Get testnet USDC from [Circle USDC Faucet](https://faucet.circle.com/)
- 3. Set your wallet key: `export BASE_CHAIN_WALLET_KEY=0x...`
- 
- ### Available Testnet Models
- 
- - `openai/gpt-oss-20b` - $0.001/request (flat price)
- - `openai/gpt-oss-120b` - $0.002/request (flat price)
- 
- ### Manual Testnet Configuration
- 
- ```typescript
- import { LLMClient } from '@blockrun/llm';
- 
- // Or configure manually
- const client = new LLMClient({
-   privateKey: '0x...',
-   apiUrl: 'https://testnet.blockrun.ai/api'
- });
- const response = await client.chat('openai/gpt-oss-20b', 'Hello!');
- ```
- 
  ## Usage Examples
  
  ### Simple Chat
@@ -782,7 +760,7 @@ Works on both `LLMClient` (Base) and `SolanaLLMClient`.
  
  ## Exa Web Search (Powered by Exa)
  
- Access [Exa](https://exa.ai)'s neural web search via x402. No API keys needed — pay-per-request via Solana USDC. Available on `SolanaLLMClient` only.
+ Access [Exa](https://exa.ai)'s neural web search via x402. No API keys needed — pay-per-request. Available on **`LLMClient` (Base USDC)** and `SolanaLLMClient` (Solana USDC). Use Base as the primary path; the Solana gateway is awaiting `EXA_API_KEY` provisioning.
  
  | Method | Description | Price |
  |---|---|---|
@@ -793,9 +771,9 @@ Access [Exa](https://exa.ai)'s neural web search via x402. No API keys needed
  | `exa(path, body)` | Generic proxy for any Exa endpoint | varies |
  
  ```typescript
- import { SolanaLLMClient } from '@blockrun/llm';
+ import { LLMClient } from '@blockrun/llm';
  
- const client = new SolanaLLMClient();
+ const client = new LLMClient();
  
  // Neural web search ($0.01/request)
  const results = await client.exaSearch("latest AI safety research", { numResults: 5 });
@@ -806,10 +784,6 @@ const similar = await client.exaFindSimilar("https://openai.com/research/gpt-4",
  
  // Extract content from URLs ($0.002/URL)
  const content = await client.exaContents(["https://arxiv.org/abs/2303.08774"]);
- const rich = await client.exaContents(
-   ["https://example.com/page1", "https://example.com/page2"],
-   { text: true, highlights: true }
- );
  
  // AI-generated answer from live web ($0.01/request)
  const answer = await client.exaAnswer("What is the current state of AI safety research?");
@@ -818,7 +792,7 @@ const answer = await client.exaAnswer("What is the current state of AI safety re
  const custom = await client.exa("search", { query: "transformer architecture", numResults: 5 });
  ```
  
- `SolanaLLMClient` only Exa endpoints are on `sol.blockrun.ai`.
+ Same surface on `SolanaLLMClient` once Solana-side `EXA_API_KEY` is provisioned.
  
  ## Configuration
  
@@ -955,7 +929,6 @@ Full TypeScript support with exported types:
  import {
    LLMClient,
    OpenAI,
-   testnetClient,
    type ChatMessage,
    type ChatResponse,
    type ChatOptions,