npm - @blockrun/clawrouter - Versions diffs - 0.4.7 → 0.5.0 - Mend

@blockrun/clawrouter 0.4.7 → 0.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (6)

package/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 BlockRunAI
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

package/README.md CHANGED

@@ -1,8 +1,6 @@
-<div align="center">
-# ClawRouter
+![ClawRouter Banner](assets/banner.png)
-**Save 78% on LLM costs. Automatically.**
+<div align="center">
 Route every request to the cheapest model that can handle it.
 One wallet, 30+ models, zero API keys.
@@ -28,7 +26,7 @@ One wallet, 30+ models, zero API keys.
 ## Why ClawRouter?
-- **100% local routing** — 14-dimension weighted scoring runs on your machine in <1ms
+- **100% local routing** — 15-dimension weighted scoring runs on your machine in <1ms
 - **Zero external calls** — no API calls for routing decisions, ever
 - **30+ models** — OpenAI, Anthropic, Google, DeepSeek, xAI, Moonshot through one wallet
 - **x402 micropayments** — pay per request with USDC on Base, no API keys
@@ -94,14 +92,14 @@ Request → Weighted Scorer (14 dimensions)
 No external classifier calls. Ambiguous queries default to the MEDIUM tier (DeepSeek/GPT-4o-mini) — fast, cheap, and good enough for most tasks.
-### 14-Dimension Weighted Scoring
+### 15-Dimension Weighted Scoring
 | Dimension            | Weight | What It Detects                          |
 | -------------------- | ------ | ---------------------------------------- |
 | Reasoning markers    | 0.18   | "prove", "theorem", "step by step"       |
 | Code presence        | 0.15   | "function", "async", "import", "```"     |
-| Simple indicators    | 0.12   | "what is", "define", "translate"         |
 | Multi-step patterns  | 0.12   | "first...then", "step 1", numbered lists |
+| **Agentic task**     | 0.10   | "run", "test", "fix", "deploy", "edit"   |
 | Technical terms      | 0.10   | "algorithm", "kubernetes", "distributed" |
 | Token count          | 0.08   | short (<50) vs long (>500) prompts       |
 | Creative markers     | 0.05   | "story", "poem", "brainstorm"            |
@@ -109,6 +107,7 @@ No external classifier calls. Ambiguous queries default to the MEDIUM tier (Deep
 | Constraint count     | 0.04   | "at most", "O(n)", "maximum"             |
 | Imperative verbs     | 0.03   | "build", "create", "implement"           |
 | Output format        | 0.03   | "json", "yaml", "schema"                 |
+| Simple indicators    | 0.02   | "what is", "define", "translate"         |
 | Domain specificity   | 0.02   | "quantum", "fpga", "genomics"            |
 | Reference complexity | 0.02   | "the docs", "the api", "above"           |
 | Negation complexity  | 0.01   | "don't", "avoid", "without"              |
@@ -131,15 +130,93 @@ Mixed-language prompts are supported — keywords from all languages are checked
 ### Tier → Model Mapping
-| Tier      | Primary Model     | Cost/M | Savings vs Opus |
-| --------- | ----------------- | ------ | --------------- |
-| SIMPLE    | gemini-2.5-flash  | $0.60  | **99.2%**       |
-| MEDIUM    | deepseek-chat     | $0.42  | **99.4%**       |
-| COMPLEX   | claude-opus-4     | $75.00 | baseline        |
-| REASONING | deepseek-reasoner | $0.42  | **99.4%**       |
+| Tier      | Primary Model          | Cost/M | Savings vs Opus |
+| --------- | ---------------------- | ------ | --------------- |
+| SIMPLE    | gemini-2.5-flash       | $0.60  | **99.2%**       |
+| MEDIUM    | grok-code-fast-1       | $1.50  | **98.0%**       |
+| COMPLEX   | gemini-2.5-pro         | $10.00 | **86.7%**       |
+| REASONING | grok-4-fast-reasoning  | $0.50  | **99.3%**       |
 Special rule: 2+ reasoning markers → REASONING at 0.97 confidence.
+### Agentic Auto-Detection
+ClawRouter automatically detects multi-step agentic tasks and routes to models optimized for autonomous execution:
+```
+"what is 2+2"                    → gemini-flash (standard)
+"build the project then run tests" → kimi-k2.5 (auto-agentic)
+"fix the bug and make sure it works" → kimi-k2.5 (auto-agentic)
+```
+**How it works:**
+- Detects agentic keywords: file ops ("read", "edit"), execution ("run", "test", "deploy"), iteration ("fix", "debug", "verify")
+- Threshold: 2+ signals triggers auto-switch to agentic tiers
+- No config needed — works automatically
+**Agentic tier models** (optimized for multi-step autonomy):
+| Tier      | Agentic Model        | Why                                    |
+| --------- | -------------------- | -------------------------------------- |
+| SIMPLE    | claude-haiku-4.5     | Fast + reliable tool use              |
+| MEDIUM    | kimi-k2.5            | 200+ tool chains, 76% cheaper         |
+| COMPLEX   | claude-sonnet-4      | Best balance for complex tasks        |
+| REASONING | kimi-k2.5            | Extended reasoning + execution        |
+You can also force agentic mode via config:
+```yaml
+# openclaw.yaml
+plugins:
+  - id: "@blockrun/clawrouter"
+    config:
+      routing:
+        overrides:
+          agenticMode: true  # Always use agentic tiers
+```
+### Tool Detection (v0.5)
+When your request includes a `tools` array (function calling), ClawRouter automatically switches to agentic tiers:
+```typescript
+// Request with tools → auto-agentic mode
+{
+  model: "blockrun/auto",
+  messages: [{ role: "user", content: "Check the weather" }],
+  tools: [{ type: "function", function: { name: "get_weather", ... } }]
+}
+// → Routes to claude-haiku-4.5 (excellent tool use)
+// → Instead of gemini-flash (may produce malformed tool calls)
+```
+**Why this matters:** Some models (like `deepseek-reasoner`) are optimized for chain-of-thought reasoning but can generate malformed tool calls. Tool detection ensures requests with functions go to models proven to handle tool use correctly.
+### Context-Length-Aware Routing (v0.5)
+ClawRouter automatically filters out models that can't handle your context size:
+```
+150K token request:
+  Full chain: [grok-4-fast (131K), deepseek (128K), kimi (262K), gemini (1M)]
+  Filtered:   [kimi (262K), gemini (1M)]
+  → Skips models that would fail with "context too long" errors
+```
+This prevents wasted API calls and faster fallback to capable models.
+### Session Persistence (v0.5)
+For multi-turn conversations, ClawRouter pins the model to prevent mid-task switching:
+```
+Turn 1: "Build a React component" → claude-sonnet-4
+Turn 2: "Add dark mode support"   → claude-sonnet-4 (pinned)
+Turn 3: "Now add tests"           → claude-sonnet-4 (pinned)
+```
+Sessions are identified by conversation ID and persist for 1 hour of inactivity.
 ### Cost Savings (Real Numbers)
 | Tier                | % of Traffic | Cost/M      |
@@ -179,8 +256,13 @@ Compared to **$75/M** for Claude Opus = **96% savings** on a typical workload.
 | **xAI**           |           |            |         |           |
 | grok-3            | $3.00     | $15.00     | 131K    |    \*     |
 | grok-3-mini       | $0.30     | $0.50      | 131K    |           |
+| grok-4-fast-reasoning | $0.20 | $0.50      | 131K    |    \*     |
+| grok-4-fast       | $0.20     | $0.50      | 131K    |           |
+| grok-code-fast-1  | $0.20     | $1.50      | 131K    |           |
 | **Moonshot**      |           |            |         |           |
-| kimi-k2.5         | $0.50     | $2.40      | 128K    |    \*     |
+| kimi-k2.5         | $0.50     | $2.40      | 262K    |    \*     |
+| **NVIDIA**        |           |            |         |           |
+| gpt-oss-120b      | **FREE**  | **FREE**   | 128K    |           |
 Full list: [`src/models.ts`](src/models.ts)
@@ -446,6 +528,38 @@ console.log(decision);
 ---
+## Cost Tracking with /stats (v0.5)
+Track your savings in real-time:
+```bash
+# In any OpenClaw conversation
+/stats
+```
+Output:
+```
+╔════════════════════════════════════════════════════════════╗
+║              ClawRouter Usage Statistics                   ║
+╠════════════════════════════════════════════════════════════╣
+║  Period: last 7 days                                      ║
+║  Total Requests: 442                                      ║
+║  Total Cost: $1.73                                       ║
+║  Baseline Cost (Opus): $20.13                            ║
+║  💰 Total Saved: $18.40 (91.4%)                            ║
+╠════════════════════════════════════════════════════════════╣
+║  Routing by Tier:                                          ║
+║    SIMPLE     ███████████           55.0% (243)            ║
+║    MEDIUM     ██████                30.8% (136)            ║
+║    COMPLEX    █                      7.2% (32)             ║
+║    REASONING  █                      7.0% (31)             ║
+╚════════════════════════════════════════════════════════════╝
+```
+Stats are stored locally at `~/.openclaw/blockrun/logs/` and aggregated on demand.
+---
 ## Why Not OpenRouter / LiteLLM?
 They're built for developers. ClawRouter is built for **agents**.
@@ -468,7 +582,7 @@ Agents shouldn't need a human to paste API keys. They should generate a wallet,
 ### Quick Checklist
 ```bash
-# 1. Check your version (should be 0.3.21+)
+# 1. Check your version (should be 0.5.0+)
 cat ~/.openclaw/extensions/clawrouter/package.json | grep version
 # 2. Check proxy is running
@@ -477,6 +591,9 @@ curl http://localhost:8402/health
 # 3. Watch routing in action
 openclaw logs --follow
 # Should see: gemini-2.5-flash $0.0012 (saved 99%)
+# 4. View cost savings
+/stats
 ```
 ### "Unknown model: blockrun/auto" or "Unknown model: auto"
@@ -586,14 +703,19 @@ BLOCKRUN_WALLET_KEY=0x... npx tsx test-e2e.ts
 ## Roadmap
-- [x] Smart routing — 14-dimension weighted scoring, 4-tier model selection
+- [x] Smart routing — 15-dimension weighted scoring, 4-tier model selection
 - [x] x402 payments — per-request USDC micropayments, non-custodial
 - [x] Response dedup — prevents double-charge on retries
 - [x] Payment pre-auth — skips 402 round trip
 - [x] SSE heartbeat — prevents upstream timeouts
+- [x] Agentic auto-detect — auto-switch to agentic models for multi-step tasks
+- [x] Tool detection — auto-switch to agentic mode when tools array present
+- [x] Context-aware routing — filter out models that can't handle context size
+- [x] Session persistence — pin model for multi-turn conversations
+- [x] Cost tracking — /stats command with savings dashboard
 - [ ] Cascade routing — try cheap model first, escalate on low quality
 - [ ] Spend controls — daily/monthly budgets
-- [ ] Analytics dashboard — cost tracking at blockrun.ai
+- [ ] Remote analytics — cost tracking at blockrun.ai
 ---
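The "2+ signals" threshold described in the Agentic Auto-Detection section of the README diff above can be illustrated with a small sketch. This is not the package's implementation: the keyword list is only the handful of examples the README quotes (the real `agenticTaskKeywords` list is not shown in this diff), and `countAgenticSignals` and `isAgenticPrompt` are hypothetical names introduced for illustration.

```typescript
// Hypothetical keyword list, copied from the examples quoted in the README;
// the package's real agenticTaskKeywords list is not shown in this diff.
const AGENTIC_KEYWORDS: string[] = [
  "read", "edit", "run", "test", "deploy", "fix", "debug", "verify",
];

// Count how many distinct keywords occur as whole words in the prompt,
// allowing a simple "s", "ed" or "ing" suffix (an assumption of this sketch).
function countAgenticSignals(
  prompt: string,
  keywords: string[] = AGENTIC_KEYWORDS,
): number {
  const text = prompt.toLowerCase();
  return keywords.filter((kw) =>
    new RegExp(`\\b${kw}(s|ed|ing)?\\b`).test(text),
  ).length;
}

// The README states a threshold of 2+ signals for switching to agentic tiers.
function isAgenticPrompt(prompt: string, threshold: number = 2): boolean {
  return countAgenticSignals(prompt) >= threshold;
}

console.log(isAgenticPrompt("what is 2+2")); // false
console.log(isAgenticPrompt("build the project then run tests")); // true
```

With this reduced list, "fix the bug and make sure it works" scores only one signal, while the README shows it routed as auto-agentic, so the shipped keyword list is presumably broader than the quoted examples.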

package/dist/index.d.ts CHANGED

@@ -168,6 +168,7 @@ type ScoringConfig = {
     referenceKeywords: string[];
     negationKeywords: string[];
     domainSpecificKeywords: string[];
+    agenticTaskKeywords: string[];
     dimensionWeights: Record<string, number>;
     tierBoundaries: {
         simpleMedium: number;
@@ -188,12 +189,20 @@ type OverridesConfig = {
     maxTokensForceComplex: number;
     structuredOutputMinTier: Tier;
     ambiguousDefaultTier: Tier;
+    /**
+     * When enabled, prefer models optimized for agentic workflows.
+     * Agentic models continue autonomously with multi-step tasks
+     * instead of stopping and waiting for user input.
+     */
+    agenticMode?: boolean;
 };
 type RoutingConfig = {
     version: string;
     classifier: ClassifierConfig;
     scoring: ScoringConfig;
     tiers: Record<Tier, TierConfig>;
+    /** Tier configs for agentic mode - models that excel at multi-step tasks */
+    agenticTiers?: Record<Tier, TierConfig>;
     overrides: OverridesConfig;
 };
@@ -208,6 +217,21 @@ type ModelPricing = {
     inputPrice: number;
     outputPrice: number;
 };
+/**
+ * Get the ordered fallback chain for a tier: [primary, ...fallbacks].
+ */
+declare function getFallbackChain(tier: Tier, tierConfigs: Record<Tier, TierConfig>): string[];
+/**
+ * Get the fallback chain filtered by context length.
+ * Only returns models that can handle the estimated total context.
+ *
+ * @param tier - The tier to get fallback chain for
+ * @param tierConfigs - Tier configurations
+ * @param estimatedTotalTokens - Estimated total context (input + output)
+ * @param getContextWindow - Function to get context window for a model ID
+ * @returns Filtered list of models that can handle the context
+ */
+declare function getFallbackChainFiltered(tier: Tier, tierConfigs: Record<Tier, TierConfig>, estimatedTotalTokens: number, getContextWindow: (modelId: string) => number | undefined): string[];
 /**
  * Default Routing Config
@@ -340,6 +364,82 @@ declare class BalanceMonitor {
     private buildInfo;
 }
+/**
+ * Session Persistence Store
+ *
+ * Tracks model selections per session to prevent model switching mid-task.
+ * When a session is active, the router will continue using the same model
+ * instead of re-routing each request.
+ */
+type SessionEntry = {
+    model: string;
+    tier: string;
+    createdAt: number;
+    lastUsedAt: number;
+    requestCount: number;
+};
+type SessionConfig = {
+    /** Enable session persistence (default: false) */
+    enabled: boolean;
+    /** Session timeout in ms (default: 30 minutes) */
+    timeoutMs: number;
+    /** Header name for session ID (default: X-Session-ID) */
+    headerName: string;
+};
+declare const DEFAULT_SESSION_CONFIG: SessionConfig;
+/**
+ * Session persistence store for maintaining model selections.
+ */
+declare class SessionStore {
+    private sessions;
+    private config;
+    private cleanupInterval;
+    constructor(config?: Partial<SessionConfig>);
+    /**
+     * Get the pinned model for a session, if any.
+     */
+    getSession(sessionId: string): SessionEntry | undefined;
+    /**
+     * Pin a model to a session.
+     */
+    setSession(sessionId: string, model: string, tier: string): void;
+    /**
+     * Touch a session to extend its timeout.
+     */
+    touchSession(sessionId: string): void;
+    /**
+     * Clear a specific session.
+     */
+    clearSession(sessionId: string): void;
+    /**
+     * Clear all sessions.
+     */
+    clearAll(): void;
+    /**
+     * Get session stats for debugging.
+     */
+    getStats(): {
+        count: number;
+        sessions: Array<{
+            id: string;
+            model: string;
+            age: number;
+        }>;
+    };
+    /**
+     * Clean up expired sessions.
+     */
+    private cleanup;
+    /**
+     * Stop the cleanup interval.
+     */
+    close(): void;
+}
+/**
+ * Generate a session ID from request headers or create a default.
+ */
+declare function getSessionId(headers: Record<string, string | string[] | undefined>, headerName?: string): string | undefined;
 /**
  * Local x402 Proxy Server
  *
@@ -388,6 +488,11 @@ type ProxyOptions = {
     requestTimeoutMs?: number;
     /** Skip balance checks (for testing only). Default: false */
     skipBalanceCheck?: boolean;
+    /**
+     * Session persistence config. When enabled, maintains model selection
+     * across requests within a session to prevent mid-task model switching.
+     */
+    sessionConfig?: Partial<SessionConfig>;
     onReady?: (port: number) => void;
     onError?: (error: Error) => void;
     onPayment?: (info: {
@@ -441,6 +546,16 @@ declare const blockrunProvider: ProviderPlugin;
  * they set their own markup when reselling to end users (Phase 2).
  */
+/**
+ * Model aliases for convenient shorthand access.
+ * Users can type `/model claude` instead of `/model blockrun/anthropic/claude-sonnet-4`.
+ */
+declare const MODEL_ALIASES: Record<string, string>;
+/**
+ * Resolve a model alias to its full model ID.
+ * Returns the original model if not an alias.
+ */
+declare function resolveModelAlias(model: string): string;
 type BlockRunModel = {
     id: string;
     name: string;
@@ -450,6 +565,8 @@ type BlockRunModel = {
     maxOutput: number;
     reasoning?: boolean;
     vision?: boolean;
+    /** Models optimized for agentic workflows (multi-step autonomous tasks) */
+    agentic?: boolean;
 };
 declare const BLOCKRUN_MODELS: BlockRunModel[];
 /**
@@ -462,6 +579,21 @@ declare const OPENCLAW_MODELS: ModelDefinitionConfig[];
  * @param baseUrl - The proxy's local base URL (e.g., "http://127.0.0.1:12345")
  */
 declare function buildProviderModels(baseUrl: string): ModelProviderConfig;
+/**
+ * Check if a model is optimized for agentic workflows.
+ * Agentic models continue autonomously with multi-step tasks
+ * instead of stopping and waiting for user input.
+ */
+declare function isAgenticModel(modelId: string): boolean;
+/**
+ * Get all agentic-capable models.
+ */
+declare function getAgenticModels(): string[];
+/**
+ * Get context window size for a model.
+ * Returns undefined if model not found.
+ */
+declare function getModelContextWindow(modelId: string): number | undefined;
 /**
  * Usage Logger
@@ -475,7 +607,10 @@ declare function buildProviderModels(baseUrl: string): ModelProviderConfig;
 type UsageEntry = {
     timestamp: string;
     model: string;
+    tier: string;
     cost: number;
+    baselineCost: number;
+    savings: number;
     latencyMs: number;
 };
 /**
@@ -674,6 +809,58 @@ declare function fetchWithRetry(fetchFn: (url: string, init?: RequestInit) => Pr
  */
 declare function isRetryable(errorOrResponse: Error | Response, config?: Partial<RetryConfig>): boolean;
+/**
+ * Usage Statistics Aggregator
+ *
+ * Reads usage log files and aggregates statistics for terminal display.
+ * Supports filtering by date range and provides multiple aggregation views.
+ */
+type DailyStats = {
+    date: string;
+    totalRequests: number;
+    totalCost: number;
+    totalBaselineCost: number;
+    totalSavings: number;
+    avgLatencyMs: number;
+    byTier: Record<string, {
+        count: number;
+        cost: number;
+    }>;
+    byModel: Record<string, {
+        count: number;
+        cost: number;
+    }>;
+};
+type AggregatedStats = {
+    period: string;
+    totalRequests: number;
+    totalCost: number;
+    totalBaselineCost: number;
+    totalSavings: number;
+    savingsPercentage: number;
+    avgLatencyMs: number;
+    avgCostPerRequest: number;
+    byTier: Record<string, {
+        count: number;
+        cost: number;
+        percentage: number;
+    }>;
+    byModel: Record<string, {
+        count: number;
+        cost: number;
+        percentage: number;
+    }>;
+    dailyBreakdown: DailyStats[];
+};
+/**
+ * Get aggregated statistics for the last N days.
+ */
+declare function getStats(days?: number): Promise<AggregatedStats>;
+/**
+ * Format stats as ASCII table for terminal display.
+ */
+declare function formatStatsAscii(stats: AggregatedStats): string;
 /**
  * @blockrun/clawrouter
  *
@@ -695,4 +882,4 @@ declare function isRetryable(errorOrResponse: Error | Response, config?: Partial
 declare const plugin: OpenClawPluginDefinition;
-export { BALANCE_THRESHOLDS, BLOCKRUN_MODELS, type BalanceInfo, BalanceMonitor, type CachedPaymentParams, type CachedResponse, DEFAULT_RETRY_CONFIG, DEFAULT_ROUTING_CONFIG, EmptyWalletError, InsufficientFundsError, type InsufficientFundsInfo, type LowBalanceInfo, OPENCLAW_MODELS, PaymentCache, type PaymentFetchResult, type PreAuthParams, type ProxyHandle, type ProxyOptions, RequestDeduplicator, type RetryConfig, type RoutingConfig, type RoutingDecision, RpcError, type SufficiencyResult, type Tier, type UsageEntry, blockrunProvider, buildProviderModels, createPaymentFetch, plugin as default, fetchWithRetry, getProxyPort, isBalanceError, isEmptyWalletError, isInsufficientFundsError, isRetryable, isRpcError, logUsage, route, startProxy };
+export { type AggregatedStats, BALANCE_THRESHOLDS, BLOCKRUN_MODELS, type BalanceInfo, BalanceMonitor, type CachedPaymentParams, type CachedResponse, DEFAULT_RETRY_CONFIG, DEFAULT_ROUTING_CONFIG, DEFAULT_SESSION_CONFIG, type DailyStats, EmptyWalletError, InsufficientFundsError, type InsufficientFundsInfo, type LowBalanceInfo, MODEL_ALIASES, OPENCLAW_MODELS, PaymentCache, type PaymentFetchResult, type PreAuthParams, type ProxyHandle, type ProxyOptions, RequestDeduplicator, type RetryConfig, type RoutingConfig, type RoutingDecision, RpcError, type SessionConfig, type SessionEntry, SessionStore, type SufficiencyResult, type Tier, type UsageEntry, blockrunProvider, buildProviderModels, createPaymentFetch, plugin as default, fetchWithRetry, formatStatsAscii, getAgenticModels, getFallbackChain, getFallbackChainFiltered, getModelContextWindow, getProxyPort, getSessionId, getStats, isAgenticModel, isBalanceError, isEmptyWalletError, isInsufficientFundsError, isRetryable, isRpcError, logUsage, resolveModelAlias, route, startProxy };
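
The declaration of `getFallbackChainFiltered` above only gives a signature, so the following is a minimal sketch of what context-length filtering could look like, not the package's code. It works on an already-built chain rather than on `TierConfig` (whose shape is not shown in this diff); `filterChainByContext` is a hypothetical name, the short model labels and rounded window sizes come from the README's 150K-token example, and keeping models with an unknown window is an assumption of the sketch.

```typescript
// Same lookup shape as the getContextWindow parameter in the declaration above.
type ContextWindowLookup = (modelId: string) => number | undefined;

// Hypothetical helper: keep only models whose context window can hold the
// estimated total tokens (input + output).
function filterChainByContext(
  chain: string[],
  estimatedTotalTokens: number,
  getContextWindow: ContextWindowLookup,
): string[] {
  return chain.filter((modelId) => {
    const contextWindow = getContextWindow(modelId);
    // Assumption: a model with an unknown window is kept rather than dropped.
    return contextWindow === undefined || contextWindow >= estimatedTotalTokens;
  });
}

// Labels and approximate sizes taken from the README example (131K, 128K, 262K, 1M).
const WINDOWS: Record<string, number> = {
  "grok-4-fast": 131_000,
  deepseek: 128_000,
  kimi: 262_000,
  gemini: 1_000_000,
};

const fullChain = ["grok-4-fast", "deepseek", "kimi", "gemini"];
const filtered = filterChainByContext(fullChain, 150_000, (id) => WINDOWS[id]);
console.log(filtered); // [ 'kimi', 'gemini' ]
```

The filter preserves the original order, so the first surviving model is still the preferred one. What the package does when every model is filtered out is not stated in the diff; this sketch simply returns an empty array in that case.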