@blockrun/llm 1.15.0 → 2.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +37 -64
- package/dist/index.cjs +364 -218
- package/dist/index.d.cts +127 -74
- package/dist/index.d.ts +127 -74
- package/dist/index.js +365 -218
- package/package.json +1 -1
package/README.md
CHANGED
@@ -2,7 +2,7 @@
 
 > **@blockrun/llm** is a TypeScript/Node.js SDK for accessing 41+ large language models (GPT-5, Claude, Gemini, Grok, DeepSeek, Kimi, and more) with automatic pay-per-request USDC micropayments via the x402 protocol. No API keys required — your wallet signature is your authentication. Supports **streaming**, smart routing, Base and Solana chains.
 >
-> 🆓 **Includes
+> 🆓 **Includes 8 fully-free NVIDIA-hosted models** (6 visible in `/v1/models`, 2 hidden but directly callable) — DeepSeek V4 Flash (1M context), Nemotron Nano Omni (vision), Qwen3, Llama 4, Mistral, plus the gpt-oss pair. Zero USDC, no rate-limit gimmicks. Use `routingProfile: 'free'` or call any `nvidia/*` model directly.
 
 [](https://www.npmjs.com/package/@blockrun/llm)
 [](LICENSE)
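The added free-tier note distinguishes visible and hidden free models. As a minimal sketch (the helper name and listing shape are hypothetical, not part of the SDK), the callable free set is the `nvidia/`-prefixed models from the listing plus the two hidden gpt-oss IDs:

```typescript
// Hypothetical helper, not part of @blockrun/llm: derives the callable
// free set from a /v1/models-style listing. The visible free models carry
// the `nvidia/` prefix; the two gpt-oss models never appear in the listing
// but remain directly callable by full ID.
const HIDDEN_FREE = ['nvidia/gpt-oss-120b', 'nvidia/gpt-oss-20b'];

function callableFreeModels(visibleIds: string[]): string[] {
  const visibleFree = visibleIds.filter((id) => id.startsWith('nvidia/'));
  return [...visibleFree, ...HIDDEN_FREE];
}

// Paid models are filtered out; hidden free models are appended.
const free = callableFreeModels(['nvidia/deepseek-v4-flash', 'openai/gpt-5.2']);
// free includes the visible flash model plus both hidden gpt-oss IDs
```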
@@ -74,7 +74,7 @@ console.log(result.response); // '4'
 
 > Need V4-Pro-class reasoning? Use the paid `deepseek/deepseek-v4-pro` ($0.50/$1.00 with the 75% promo through 2026-05-31) — `nvidia/deepseek-v4-pro` is currently hidden because NVIDIA's NIM deployment is hung; backend MODEL_REDIRECTS forwards calls to V4 Flash.
 
->
+> Privacy note: `nvidia/gpt-oss-120b` and `nvidia/gpt-oss-20b` are hidden from `/v1/models` because NVIDIA's free build.nvidia.com tier reserves the right to use prompts/outputs for service improvement. Direct calls by full model ID still work — opt in only when your data isn't sensitive.
 
 ## Quick Start (Solana)
 
@@ -146,13 +146,33 @@ console.log(`Saved ${(result.routing.savings * 100).toFixed(0)}%`); // 'Saved 78
 // Complex reasoning task -> routes to reasoning model
 const complex = await client.smartChat('Prove the Riemann hypothesis step by step');
 console.log(complex.model); // 'xai/grok-4-1-fast-reasoning'
+
+// Inspect the fallback chain SmartChat will walk on transient errors.
+console.log(complex.routing.fallbacks); // ['anthropic/claude-opus-4.7', ...]
+```
+
+### Automatic Fallback on Transient Errors
+
+`smartChat()` populates a tier-specific fallback chain and `chat()` /
+`chatCompletion()` walk it automatically when the primary model returns a
+transient error — timeouts, network failures, or 5xx responses (502/503/504/
+522/524). 4xx errors and `PaymentError` propagate immediately so wallet /
+auth issues surface fast.
+
+```typescript
+// Manually pass a fallback chain to chat() / chatCompletion()
+const reply = await client.chat('nvidia/deepseek-v4-flash', 'hello', {
+  fallbackModels: ['nvidia/llama-4-maverick', 'nvidia/mistral-small-4-119b'],
+});
+// If deepseek-v4-flash times out, the SDK retries against the next model
+// and logs each hop to stderr: "[@blockrun/llm] <from> -> <to> (...)".
 ```
 
 ### Routing Profiles
 
 | Profile | Description | Best For |
 |---------|-------------|----------|
-| `free` | NVIDIA free tier — smart-routes across
+| `free` | NVIDIA free tier — smart-routes across 8 models (DeepSeek V4 Flash, Nemotron Nano Omni, Qwen3, Llama 4, Mistral, plus 2 hidden gpt-oss) | Zero-cost testing, dev, prod |
 | `eco` | Cheapest models per tier (DeepSeek, xAI) | Cost-sensitive production |
 | `auto` | Best balance of cost/quality (default) | General use |
 | `premium` | Top-tier models (OpenAI, Anthropic) | Quality-critical tasks |
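The transient-vs-fatal split described in the new fallback section can be sketched as a pure function. This is an illustration of the documented policy, not the SDK's actual internals; the function and callback names are hypothetical:

```typescript
// Illustrative sketch of the documented fallback policy: transient
// statuses advance to the next model in the chain; anything else
// (4xx, payment/auth failures) propagates immediately.
const TRANSIENT_STATUSES = new Set([502, 503, 504, 522, 524]);

function resolveModel(
  chain: string[],
  statusOf: (model: string) => number,
): string {
  for (const model of chain) {
    const status = statusOf(model);
    if (status === 200) return model; // success: stop walking the chain
    if (!TRANSIENT_STATUSES.has(status)) {
      // 4xx and payment errors surface fast instead of retrying
      throw new Error(`non-transient ${status} from ${model}`);
    }
    // transient: fall through to the next model in the chain
  }
  throw new Error('all models in the fallback chain failed');
}

// A 503 from the primary walks to the fallback model; a 402 would throw.
const picked = resolveModel(
  ['nvidia/deepseek-v4-flash', 'nvidia/llama-4-maverick'],
  (m) => (m === 'nvidia/deepseek-v4-flash' ? 503 : 200),
);
// picked === 'nvidia/llama-4-maverick'
```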
@@ -163,7 +183,7 @@ const result = await client.smartChat(
   'Write production-grade async TypeScript code',
   { routingProfile: 'premium' }
 );
-console.log(result.model); // 'anthropic/claude-opus-4.
+console.log(result.model); // 'anthropic/claude-opus-4.7'
 ```
 
 ### How ClawRouter Works
@@ -230,14 +250,15 @@ Released 2026-04-23 — first fully retrained base since GPT-4.5. 1M context, 12
 | `openai/o4-mini` | $1.10/M | $4.40/M |
 
 ### Anthropic Claude
-| Model | Input Price | Output Price |
-
-| `anthropic/claude-opus-4.
-| `anthropic/claude-opus-4.
-| `anthropic/claude-opus-4` | $
-| `anthropic/claude-
-| `anthropic/claude-sonnet-4` | $3.00/M | $15.00/M |
-| `anthropic/claude-
+| Model | Input Price | Output Price | Context | Notes |
+|-------|-------------|--------------|---------|-------|
+| `anthropic/claude-opus-4.7` | $5.00/M | $25.00/M | **1M** | Flagship — agentic coding + adaptive thinking, 128K output |
+| `anthropic/claude-opus-4.6` | $5.00/M | $25.00/M | 200K | Hidden but still callable — kept as in-family hot-swap fallback |
+| `anthropic/claude-opus-4.5` | $5.00/M | $25.00/M | 200K | |
+| `anthropic/claude-opus-4` | $15.00/M | $75.00/M | 200K | |
+| `anthropic/claude-sonnet-4.6` | $3.00/M | $15.00/M | 200K | Best for reasoning/instructions |
+| `anthropic/claude-sonnet-4` | $3.00/M | $15.00/M | 200K | |
+| `anthropic/claude-haiku-4.5` | $1.00/M | $5.00/M | 200K | |
 
 ### Google Gemini
 | Model | Input Price | Output Price |
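Per-request cost from the price tables above is plain token arithmetic. A quick sketch (helper name hypothetical; prices are USD per million tokens, as listed):

```typescript
// Hypothetical helper: cost of one request given the per-million-token
// prices from the tables above.
function requestCostUsd(
  inputTokens: number,
  outputTokens: number,
  inputPerM: number,
  outputPerM: number,
): number {
  return (inputTokens * inputPerM + outputTokens * outputPerM) / 1_000_000;
}

// claude-sonnet-4 ($3.00/M in, $15.00/M out), 10K prompt + 2K completion:
// 10_000 * 3 / 1e6 + 2_000 * 15 / 1e6 = 0.03 + 0.03 = 0.06 USD
const cost = requestCostUsd(10_000, 2_000, 3.0, 15.0);
```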
@@ -334,7 +355,6 @@ All models below have been tested end-to-end via the TypeScript SDK (Feb 2026):
 | `openai/gpt-image-2` | $0.06-0.12/image (reasoning-driven, multilingual text rendering, character consistency) |
 | `google/nano-banana` | $0.05/image |
 | `google/nano-banana-pro` | $0.10-0.15/image |
-| `black-forest/flux-1.1-pro` | $0.04/image |
 | `xai/grok-imagine-image` | $0.02/image |
 | `xai/grok-imagine-image-pro` | $0.07/image |
 | `zai/cogview-4` | $0.015/image |
@@ -507,48 +527,6 @@ const result = await client.imageEdit(
 console.log(result.data[0].url);
 ```
 
-## Testnet Usage
-
-For development and testing without real USDC, use the testnet:
-
-```typescript
-import { testnetClient } from '@blockrun/llm';
-
-// Create testnet client (uses Base Sepolia)
-const client = testnetClient({ privateKey: '0x...' });
-
-// Chat with testnet model
-const response = await client.chat('openai/gpt-oss-20b', 'Hello!');
-console.log(response);
-
-// Check if client is on testnet
-console.log(client.isTestnet()); // true
-```
-
-### Testnet Setup
-
-1. Get testnet ETH from [Alchemy Base Sepolia Faucet](https://www.alchemy.com/faucets/base-sepolia)
-2. Get testnet USDC from [Circle USDC Faucet](https://faucet.circle.com/)
-3. Set your wallet key: `export BASE_CHAIN_WALLET_KEY=0x...`
-
-### Available Testnet Models
-
-- `openai/gpt-oss-20b` - $0.001/request (flat price)
-- `openai/gpt-oss-120b` - $0.002/request (flat price)
-
-### Manual Testnet Configuration
-
-```typescript
-import { LLMClient } from '@blockrun/llm';
-
-// Or configure manually
-const client = new LLMClient({
-  privateKey: '0x...',
-  apiUrl: 'https://testnet.blockrun.ai/api'
-});
-const response = await client.chat('openai/gpt-oss-20b', 'Hello!');
-```
-
 ## Usage Examples
 
 ### Simple Chat
@@ -782,7 +760,7 @@ Works on both `LLMClient` (Base) and `SolanaLLMClient`.
 
 ## Exa Web Search (Powered by Exa)
 
-Access [Exa](https://exa.ai)'s neural web search via x402. No API keys needed — pay-per-request
+Access [Exa](https://exa.ai)'s neural web search via x402. No API keys needed — pay-per-request. Available on **`LLMClient` (Base USDC)** and `SolanaLLMClient` (Solana USDC). Use Base as the primary path; the Solana gateway is awaiting `EXA_API_KEY` provisioning.
 
 | Method | Description | Price |
 |---|---|---|
@@ -793,9 +771,9 @@ Access [Exa](https://exa.ai)'s neural web search via x402. No API keys needed
 | `exa(path, body)` | Generic proxy for any Exa endpoint | varies |
 
 ```typescript
-import {
+import { LLMClient } from '@blockrun/llm';
 
-const client = new
+const client = new LLMClient();
 
 // Neural web search ($0.01/request)
 const results = await client.exaSearch("latest AI safety research", { numResults: 5 });
@@ -806,10 +784,6 @@ const similar = await client.exaFindSimilar("https://openai.com/research/gpt-4",
 
 // Extract content from URLs ($0.002/URL)
 const content = await client.exaContents(["https://arxiv.org/abs/2303.08774"]);
-const rich = await client.exaContents(
-  ["https://example.com/page1", "https://example.com/page2"],
-  { text: true, highlights: true }
-);
 
 // AI-generated answer from live web ($0.01/request)
 const answer = await client.exaAnswer("What is the current state of AI safety research?");
@@ -818,7 +792,7 @@ const answer = await client.exaAnswer("What is the current state of AI safety re
 const custom = await client.exa("search", { query: "transformer architecture", numResults: 5 });
 ```
 
-`SolanaLLMClient`
+Same surface on `SolanaLLMClient` once Solana-side `EXA_API_KEY` is provisioned.
 
 ## Configuration
 
@@ -955,7 +929,6 @@ Full TypeScript support with exported types:
 import {
   LLMClient,
   OpenAI,
-  testnetClient,
   type ChatMessage,
   type ChatResponse,
   type ChatOptions,