tokenfirewall 1.0.2 → 2.0.0
- package/README.md +219 -474
- package/dist/core/pricingRegistry.js +56 -5
- package/dist/index.d.ts +32 -0
- package/dist/index.js +66 -12
- package/dist/interceptors/fetchInterceptor.d.ts +5 -0
- package/dist/interceptors/fetchInterceptor.js +278 -27
- package/dist/introspection/contextRegistry.d.ts +5 -0
- package/dist/introspection/contextRegistry.js +58 -6
- package/dist/logger.d.ts +5 -0
- package/dist/logger.js +10 -0
- package/dist/router/errorDetector.d.ts +45 -0
- package/dist/router/errorDetector.js +170 -0
- package/dist/router/modelRouter.d.ts +33 -0
- package/dist/router/modelRouter.js +111 -0
- package/dist/router/routingStrategies.d.ts +16 -0
- package/dist/router/routingStrategies.js +243 -0
- package/dist/router/types.d.ts +65 -0
- package/dist/router/types.js +5 -0
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -1,25 +1,33 @@
|
|
|
1
1
|
# TokenFirewall
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
Enterprise-grade LLM cost enforcement middleware for Node.js with automatic budget protection, intelligent model routing, and comprehensive multi-provider support.
|
|
4
4
|
|
|
5
|
-
[](https://www.npmjs.com/package/tokenfirewall)
|
|
6
|
-
[](https://www.npmjs.com/package/tokenfirewall)
|
|
6
|
+
[](https://www.npmjs.com/package/tokenfirewall)
|
|
7
|
+
[](https://opensource.org/licenses/MIT)
|
|
8
|
+
[](https://www.typescriptlang.org/)
|
|
8
9
|
|
|
9
10
|
## Overview
|
|
10
11
|
|
|
11
|
-
TokenFirewall is a production-ready middleware that automatically tracks and enforces budget limits for Large Language Model (LLM) API calls. It provides transparent cost monitoring, prevents budget overruns, and supports multiple providers through a unified interface.
|
|
12
|
+
TokenFirewall is a production-ready middleware that automatically tracks and enforces budget limits for Large Language Model (LLM) API calls. It provides transparent cost monitoring, prevents budget overruns, routes requests between models with automatic failover, and supports multiple providers through a unified interface.
|
|
12
13
|
|
|
13
14
|
### Key Features
|
|
14
15
|
|
|
15
|
-
-
|
|
16
|
-
-
|
|
17
|
-
-
|
|
18
|
-
-
|
|
19
|
-
-
|
|
20
|
-
-
|
|
21
|
-
- **Production Ready** -
|
|
22
|
-
-
|
|
16
|
+
- **Never Exceed Your Budget** - Automatically blocks API calls when spending limits are reached, preventing surprise bills
|
|
17
|
+
- **Zero Code Changes Required** - Drop-in middleware that works with any LLM API without modifying your existing code
|
|
18
|
+
- **Automatic Failover** - Intelligent router switches to backup models when primary fails, keeping your app running
|
|
19
|
+
- **Real-time Cost Tracking** - See exactly how much each API call costs based on actual token usage
|
|
20
|
+
- **Multi-Provider Support** - Works with OpenAI, Anthropic, Gemini, Grok, Kimi, and any custom LLM provider
|
|
21
|
+
- **Custom Model Support** - Register your own models with custom pricing and context limits at runtime
|
|
22
|
+
- **Production Ready** - Battle-tested with comprehensive error handling and edge case coverage
|
|
23
|
+
- **TypeScript Native** - Full type safety with included definitions
|
|
24
|
+
|
|
25
|
+
### What's New in v2.0.0
|
|
26
|
+
|
|
27
|
+
- **Intelligent Router** - Automatic failover to backup models when API calls fail
|
|
28
|
+
- **40+ Latest Models** - GPT-5, Claude 4.5, Gemini 3, with accurate 2026 pricing
|
|
29
|
+
- **Dynamic Registration** - Add custom models and pricing at runtime
|
|
30
|
+
- **Production Hardened** - Comprehensive validation, error handling, and edge case coverage
|
|
23
31
|
|
|
24
32
|
---
|
|
25
33
|
|
|
@@ -29,18 +37,13 @@ TokenFirewall is a production-ready middleware that automatically tracks and enf
|
|
|
29
37
|
- [Quick Start](#quick-start)
|
|
30
38
|
- [Core Concepts](#core-concepts)
|
|
31
39
|
- [API Reference](#api-reference)
|
|
32
|
-
|
|
33
|
-
|
|
34
|
-
- [Model Discovery](#model-discovery)
|
|
35
|
-
- [Custom Providers](#custom-providers)
|
|
36
|
-
- [Budget Persistence](#budget-persistence)
|
|
40
|
+
- [Intelligent Model Router](#intelligent-model-router)
|
|
41
|
+
- [Dynamic Model Registration](#dynamic-model-registration)
|
|
37
42
|
- [Supported Providers](#supported-providers)
|
|
38
|
-
- [Use Cases](#use-cases)
|
|
39
43
|
- [Examples](#examples)
|
|
40
44
|
- [TypeScript Support](#typescript-support)
|
|
41
45
|
- [Error Handling](#error-handling)
|
|
42
46
|
- [Best Practices](#best-practices)
|
|
43
|
-
- [Contributing](#contributing)
|
|
44
47
|
- [License](#license)
|
|
45
48
|
|
|
46
49
|
---
|
|
@@ -71,7 +74,7 @@ createBudgetGuard({
|
|
|
71
74
|
// Step 2: Patch global fetch
|
|
72
75
|
patchGlobalFetch();
|
|
73
76
|
|
|
74
|
-
// Step 3: Use any LLM API normally
|
|
77
|
+
// Step 3: Use any LLM API normally
|
|
75
78
|
const response = await fetch("https://api.openai.com/v1/chat/completions", {
|
|
76
79
|
method: "POST",
|
|
77
80
|
headers: {
|
|
@@ -93,7 +96,7 @@ const response = await fetch("https://api.openai.com/v1/chat/completions", {
|
|
|
93
96
|
|
|
94
97
|
### Budget Guard
|
|
95
98
|
|
|
96
|
-
The Budget Guard
|
|
99
|
+
The Budget Guard tracks spending and enforces limits in two modes:
|
|
97
100
|
|
|
98
101
|
- **Block Mode** (`mode: "block"`): Throws an error when budget is exceeded, preventing the API call
|
|
99
102
|
- **Warn Mode** (`mode: "warn"`): Logs a warning but allows the API call to proceed
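
The difference between the two modes can be sketched roughly as follows. This is an illustration only, not TokenFirewall's internal code: `checkBudget` and its parameters are hypothetical names, and the assumption that the check uses the projected spend is inferred from the error message listed later in this README.

```javascript
// Hypothetical illustration of block vs. warn semantics; not the library's internals.
function checkBudget({ totalSpent, monthlyLimit, mode = "block" }, nextCost) {
  const projected = totalSpent + nextCost;
  if (projected <= monthlyLimit) {
    return { allowed: true, warned: false };
  }
  const message =
    `Budget exceeded! Would spend $${projected.toFixed(2)} ` +
    `of $${monthlyLimit.toFixed(2)} limit`;
  if (mode === "block") {
    // Block mode: refuse the call by throwing.
    throw new Error(message);
  }
  // Warn mode: report the overrun but let the call proceed.
  console.warn(message);
  return { allowed: true, warned: true };
}
```

In this sketch, block mode surfaces the overrun as an exception the caller must handle, while warn mode only logs, which is why warn mode suits development environments.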
|
|
@@ -126,17 +129,12 @@ Creates and configures a budget guard instance.
|
|
|
126
129
|
|
|
127
130
|
**Parameters:**
|
|
128
131
|
|
|
129
|
-
|
|
130
|
-
|
|
131
|
-
|
|
132
|
-
|
|
133
|
-
|
|
134
|
-
|
|
135
|
-
**Returns:** `BudgetManager` - The budget manager instance
|
|
136
|
-
|
|
137
|
-
**Throws:**
|
|
138
|
-
- `Error` if `monthlyLimit` is not a positive number
|
|
139
|
-
- `Error` if `mode` is not "block" or "warn"
|
|
132
|
+
```typescript
|
|
133
|
+
interface BudgetGuardOptions {
|
|
134
|
+
monthlyLimit: number; // Maximum spending limit in USD
|
|
135
|
+
mode?: "block" | "warn"; // Enforcement mode (default: "block")
|
|
136
|
+
}
|
|
137
|
+
```
|
|
140
138
|
|
|
141
139
|
**Example:**
|
|
142
140
|
|
|
@@ -144,32 +142,25 @@ Creates and configures a budget guard instance.
|
|
|
144
142
|
const { createBudgetGuard } = require("tokenfirewall");
|
|
145
143
|
|
|
146
144
|
// Block mode - strict enforcement
|
|
147
|
-
|
|
145
|
+
createBudgetGuard({
|
|
148
146
|
monthlyLimit: 100,
|
|
149
147
|
mode: "block"
|
|
150
148
|
});
|
|
151
149
|
|
|
152
150
|
// Warn mode - soft limits
|
|
153
|
-
|
|
151
|
+
createBudgetGuard({
|
|
154
152
|
monthlyLimit: 500,
|
|
155
153
|
mode: "warn"
|
|
156
154
|
});
|
|
157
155
|
```
|
|
158
156
|
|
|
159
|
-
**Notes:**
|
|
160
|
-
- Calling `createBudgetGuard()` multiple times will replace the existing guard
|
|
161
|
-
- A warning is logged when overwriting an existing guard
|
|
162
|
-
- The guard is global and applies to all subsequent API calls
|
|
163
|
-
|
|
164
157
|
---
|
|
165
158
|
|
|
166
159
|
#### `getBudgetStatus()`
|
|
167
160
|
|
|
168
161
|
Retrieves the current budget status and usage statistics.
|
|
169
162
|
|
|
170
|
-
**
|
|
171
|
-
|
|
172
|
-
**Returns:** `BudgetStatus | null`
|
|
163
|
+
**Returns:**
|
|
173
164
|
|
|
174
165
|
```typescript
|
|
175
166
|
interface BudgetStatus {
|
|
@@ -186,152 +177,61 @@ interface BudgetStatus {
|
|
|
186
177
|
const { getBudgetStatus } = require("tokenfirewall");
|
|
187
178
|
|
|
188
179
|
const status = getBudgetStatus();
|
|
189
|
-
|
|
190
180
|
if (status) {
|
|
191
181
|
console.log(`Spent: $${status.totalSpent.toFixed(2)}`);
|
|
192
182
|
console.log(`Remaining: $${status.remaining.toFixed(2)}`);
|
|
193
183
|
console.log(`Usage: ${status.percentageUsed.toFixed(1)}%`);
|
|
194
|
-
|
|
195
|
-
// Alert if over 80%
|
|
196
|
-
if (status.percentageUsed > 80) {
|
|
197
|
-
console.warn("⚠️ Budget usage is high!");
|
|
198
|
-
}
|
|
199
184
|
}
|
|
200
185
|
```
|
|
201
186
|
|
|
202
|
-
**Returns `null` if:**
|
|
203
|
-
- No budget guard has been created
|
|
204
|
-
- Budget guard was not initialized
|
|
205
|
-
|
|
206
187
|
---
|
|
207
188
|
|
|
208
189
|
#### `resetBudget()`
|
|
209
190
|
|
|
210
|
-
Resets the budget tracking to zero
|
|
211
|
-
|
|
212
|
-
**Parameters:** None
|
|
213
|
-
|
|
214
|
-
**Returns:** `void`
|
|
215
|
-
|
|
216
|
-
**Example:**
|
|
191
|
+
Resets the budget tracking to zero.
|
|
217
192
|
|
|
218
193
|
```javascript
|
|
219
|
-
const { resetBudget
|
|
194
|
+
const { resetBudget } = require("tokenfirewall");
|
|
220
195
|
|
|
221
196
|
// Reset at the start of each month
|
|
222
|
-
|
|
223
|
-
resetBudget();
|
|
224
|
-
console.log("Budget reset for new month");
|
|
225
|
-
|
|
226
|
-
const status = getBudgetStatus();
|
|
227
|
-
console.log(`New budget: $${status.limit}`);
|
|
228
|
-
}
|
|
229
|
-
|
|
230
|
-
// Schedule monthly reset
|
|
231
|
-
const cron = require("node-cron");
|
|
232
|
-
cron.schedule("0 0 1 * *", monthlyReset); // First day of month
|
|
197
|
+
resetBudget();
|
|
233
198
|
```
|
|
234
199
|
|
|
235
|
-
**Use Cases:**
|
|
236
|
-
- Monthly budget resets
|
|
237
|
-
- Testing and development
|
|
238
|
-
- Per-session budgets
|
|
239
|
-
- Tenant-specific resets
|
|
240
|
-
|
|
241
200
|
---
|
|
242
201
|
|
|
243
|
-
|
|
244
|
-
|
|
245
|
-
#### `patchGlobalFetch()`
|
|
246
|
-
|
|
247
|
-
Patches the global `fetch` function to intercept and track LLM API calls.
|
|
202
|
+
#### `exportBudgetState()` / `importBudgetState(state)`
|
|
248
203
|
|
|
249
|
-
|
|
250
|
-
|
|
251
|
-
**Returns:** `void`
|
|
252
|
-
|
|
253
|
-
**Example:**
|
|
204
|
+
Save and restore budget state for persistence.
|
|
254
205
|
|
|
255
206
|
```javascript
|
|
256
|
-
const {
|
|
207
|
+
const { exportBudgetState, importBudgetState } = require("tokenfirewall");
|
|
208
|
+
const fs = require("fs");
|
|
257
209
|
|
|
258
|
-
//
|
|
259
|
-
|
|
210
|
+
// Export state
|
|
211
|
+
const state = exportBudgetState();
|
|
212
|
+
fs.writeFileSync("budget.json", JSON.stringify(state));
|
|
260
213
|
|
|
261
|
-
//
|
|
262
|
-
|
|
263
|
-
|
|
214
|
+
// Import state
|
|
215
|
+
const savedState = JSON.parse(fs.readFileSync("budget.json", "utf8"));
|
|
216
|
+
importBudgetState(savedState);
|
|
264
217
|
```
|
|
265
218
|
|
|
266
|
-
**Behavior:**
|
|
267
|
-
- Intercepts all `fetch` calls globally
|
|
268
|
-
- Only processes LLM API responses (non-LLM calls are ignored)
|
|
269
|
-
- Automatically detects provider from response format
|
|
270
|
-
- Calculates costs and tracks against budget
|
|
271
|
-
- Logs usage information to console
|
|
272
|
-
- Can be called multiple times safely (idempotent)
|
|
273
|
-
|
|
274
|
-
**Important Notes:**
|
|
275
|
-
- Must be called AFTER `createBudgetGuard()`
|
|
276
|
-
- Works with official SDKs that use `fetch` internally
|
|
277
|
-
- Does not affect non-LLM HTTP requests
|
|
278
|
-
- Minimal performance overhead
|
|
279
|
-
|
|
280
219
|
---
|
|
281
220
|
|
|
282
|
-
|
|
283
|
-
|
|
284
|
-
Restores the original `fetch` function, disabling interception.
|
|
285
|
-
|
|
286
|
-
**Parameters:** None
|
|
221
|
+
### Interception
|
|
287
222
|
|
|
288
|
-
|
|
223
|
+
#### `patchGlobalFetch()`
|
|
289
224
|
|
|
290
|
-
|
|
225
|
+
Patches the global `fetch` function to intercept and track LLM API calls.
|
|
291
226
|
|
|
292
227
|
```javascript
|
|
293
|
-
const { patchGlobalFetch
|
|
228
|
+
const { patchGlobalFetch } = require("tokenfirewall");
|
|
294
229
|
|
|
295
|
-
// Enable tracking
|
|
296
230
|
patchGlobalFetch();
|
|
297
231
|
|
|
298
|
-
//
|
|
299
|
-
|
|
300
|
-
// Disable tracking
|
|
301
|
-
unpatchGlobalFetch();
|
|
302
|
-
|
|
303
|
-
// Subsequent calls are not tracked
|
|
304
|
-
```
|
|
305
|
-
|
|
306
|
-
**Use Cases:**
|
|
307
|
-
- Temporarily disable tracking
|
|
308
|
-
- Testing specific scenarios
|
|
309
|
-
- Cleanup in test suites
|
|
310
|
-
|
|
311
|
-
---
|
|
312
|
-
|
|
313
|
-
#### `patchProvider(providerName)`
|
|
314
|
-
|
|
315
|
-
Patches a specific provider SDK (currently placeholder - most providers work via fetch interception).
|
|
316
|
-
|
|
317
|
-
**Parameters:**
|
|
318
|
-
|
|
319
|
-
| Parameter | Type | Required | Description |
|
|
320
|
-
|-----------|------|----------|-------------|
|
|
321
|
-
| `providerName` | `string` | Yes | Provider name ("openai", "anthropic", etc.) |
|
|
322
|
-
|
|
323
|
-
**Returns:** `void`
|
|
324
|
-
|
|
325
|
-
**Example:**
|
|
326
|
-
|
|
327
|
-
```javascript
|
|
328
|
-
const { patchProvider } = require("tokenfirewall");
|
|
329
|
-
|
|
330
|
-
patchProvider("openai");
|
|
232
|
+
// All subsequent fetch calls are intercepted
|
|
331
233
|
```
|
|
332
234
|
|
|
333
|
-
**Note:** Most providers work automatically with `patchGlobalFetch()`. This function is reserved for future provider-specific integrations.
|
|
334
|
-
|
|
335
235
|
---
|
|
336
236
|
|
|
337
237
|
### Model Discovery
|
|
@@ -342,21 +242,12 @@ Lists available models from a provider with context limits and budget informatio
|
|
|
342
242
|
|
|
343
243
|
**Parameters:**
|
|
344
244
|
|
|
345
|
-
| Parameter | Type | Required | Description |
|
|
346
|
-
|-----------|------|----------|-------------|
|
|
347
|
-
| `options` | `ListModelsOptions` | Yes | Discovery options |
|
|
348
|
-
| `options.provider` | `string` | Yes | Provider name ("openai", "gemini", "grok", "kimi") |
|
|
349
|
-
| `options.apiKey` | `string` | Yes | Provider API key |
|
|
350
|
-
| `options.baseURL` | `string` | No | Custom API endpoint URL |
|
|
351
|
-
| `options.includeBudgetUsage` | `boolean` | No | Include current budget usage % (default: false) |
|
|
352
|
-
|
|
353
|
-
**Returns:** `Promise<ModelInfo[]>`
|
|
354
|
-
|
|
355
245
|
```typescript
|
|
356
|
-
interface
|
|
357
|
-
|
|
358
|
-
|
|
359
|
-
|
|
246
|
+
interface ListModelsOptions {
|
|
247
|
+
provider: string; // Provider name
|
|
248
|
+
apiKey: string; // Provider API key
|
|
249
|
+
baseURL?: string; // Custom API endpoint
|
|
250
|
+
includeBudgetUsage?: boolean; // Include budget usage %
|
|
360
251
|
}
|
|
361
252
|
```
|
|
362
253
|
|
|
@@ -365,7 +256,6 @@ interface ModelInfo {
|
|
|
365
256
|
```javascript
|
|
366
257
|
const { listModels } = require("tokenfirewall");
|
|
367
258
|
|
|
368
|
-
// Discover OpenAI models
|
|
369
259
|
const models = await listModels({
|
|
370
260
|
provider: "openai",
|
|
371
261
|
apiKey: process.env.OPENAI_API_KEY,
|
|
@@ -373,305 +263,156 @@ const models = await listModels({
|
|
|
373
263
|
});
|
|
374
264
|
|
|
375
265
|
models.forEach(model => {
|
|
376
|
-
console.log(
|
|
377
|
-
if (model.contextLimit) {
|
|
378
|
-
console.log(` Context: ${model.contextLimit.toLocaleString()} tokens`);
|
|
379
|
-
}
|
|
380
|
-
if (model.budgetUsagePercentage !== undefined) {
|
|
381
|
-
console.log(` Budget Used: ${model.budgetUsagePercentage.toFixed(2)}%`);
|
|
382
|
-
}
|
|
266
|
+
console.log(`${model.model}: ${model.contextLimit} tokens`);
|
|
383
267
|
});
|
|
384
|
-
|
|
385
|
-
// Find models with large context windows
|
|
386
|
-
const largeContext = models.filter(m => m.contextLimit && m.contextLimit > 100000);
|
|
387
268
|
```
|
|
388
269
|
|
|
389
|
-
**Supported Providers:**
|
|
390
|
-
- `"openai"` - Fetches from OpenAI API
|
|
391
|
-
- `"gemini"` - Fetches from Google Gemini API
|
|
392
|
-
- `"grok"` - Fetches from X.AI API
|
|
393
|
-
- `"kimi"` - Fetches from Moonshot AI API
|
|
394
|
-
- `"anthropic"` - Returns static list (no API endpoint available)
|
|
395
|
-
|
|
396
|
-
**Error Handling:**
|
|
397
|
-
- Returns empty array if API call fails
|
|
398
|
-
- Logs warning on errors
|
|
399
|
-
- Has 10-second timeout to prevent hanging
|
|
400
|
-
|
|
401
270
|
---
|
|
402
271
|
|
|
403
|
-
|
|
404
|
-
|
|
405
|
-
Lower-level model discovery function with manual budget manager injection.
|
|
406
|
-
|
|
407
|
-
**Parameters:**
|
|
408
|
-
|
|
409
|
-
| Parameter | Type | Required | Description |
|
|
410
|
-
|-----------|------|----------|-------------|
|
|
411
|
-
| `options` | `ListModelsOptions` | Yes | Discovery options (same as `listModels`) |
|
|
412
|
-
| `options.budgetManager` | `BudgetManager` | No | Manual budget manager instance |
|
|
413
|
-
|
|
414
|
-
**Returns:** `Promise<ModelInfo[]>`
|
|
415
|
-
|
|
416
|
-
**Example:**
|
|
417
|
-
|
|
418
|
-
```javascript
|
|
419
|
-
const { listAvailableModels, createBudgetGuard } = require("tokenfirewall");
|
|
420
|
-
|
|
421
|
-
const manager = createBudgetGuard({ monthlyLimit: 100, mode: "warn" });
|
|
272
|
+
## Intelligent Model Router
|
|
422
273
|
|
|
423
|
-
|
|
424
|
-
provider: "openai",
|
|
425
|
-
apiKey: process.env.OPENAI_API_KEY,
|
|
426
|
-
budgetManager: manager,
|
|
427
|
-
includeBudgetUsage: true
|
|
428
|
-
});
|
|
429
|
-
```
|
|
274
|
+
The Model Router provides automatic retry and model switching on failures.
|
|
430
275
|
|
|
431
|
-
|
|
276
|
+
### `createModelRouter(options)`
|
|
432
277
|
|
|
433
|
-
|
|
434
|
-
|
|
435
|
-
### Custom Providers
|
|
436
|
-
|
|
437
|
-
#### `registerAdapter(adapter)`
|
|
438
|
-
|
|
439
|
-
Registers a custom provider adapter for tracking non-standard LLM APIs.
|
|
278
|
+
Creates and configures an intelligent model router.
|
|
440
279
|
|
|
441
280
|
**Parameters:**
|
|
442
281
|
|
|
443
|
-
| Parameter | Type | Required | Description |
|
|
444
|
-
|-----------|------|----------|-------------|
|
|
445
|
-
| `adapter` | `ProviderAdapter` | Yes | Custom adapter implementation |
|
|
446
|
-
|
|
447
282
|
```typescript
|
|
448
|
-
interface
|
|
449
|
-
|
|
450
|
-
|
|
451
|
-
|
|
452
|
-
}
|
|
453
|
-
|
|
454
|
-
interface NormalizedUsage {
|
|
455
|
-
provider: string; // Provider name
|
|
456
|
-
model: string; // Model identifier
|
|
457
|
-
inputTokens: number; // Input/prompt tokens
|
|
458
|
-
outputTokens: number; // Output/completion tokens
|
|
459
|
-
totalTokens: number; // Total tokens
|
|
283
|
+
interface ModelRouterOptions {
|
|
284
|
+
strategy: "fallback" | "context" | "cost"; // Routing strategy
|
|
285
|
+
fallbackMap?: Record<string, string[]>; // Fallback model map
|
|
286
|
+
maxRetries?: number; // Max retry attempts (default: 1)
|
|
460
287
|
}
|
|
461
288
|
```
|
|
462
289
|
|
|
463
290
|
**Example:**
|
|
464
291
|
|
|
465
292
|
```javascript
|
|
466
|
-
const {
|
|
467
|
-
|
|
468
|
-
//
|
|
469
|
-
|
|
470
|
-
|
|
471
|
-
|
|
472
|
-
|
|
473
|
-
|
|
474
|
-
typeof response === "object" &&
|
|
475
|
-
response.model &&
|
|
476
|
-
response.prompt_eval_count !== undefined;
|
|
293
|
+
const { createModelRouter, patchGlobalFetch } = require("tokenfirewall");
|
|
294
|
+
|
|
295
|
+
// Fallback strategy - use predefined fallback models
|
|
296
|
+
createModelRouter({
|
|
297
|
+
strategy: "fallback",
|
|
298
|
+
fallbackMap: {
|
|
299
|
+
"gpt-4o": ["gpt-4o-mini", "gpt-3.5-turbo"],
|
|
300
|
+
"claude-3-5-sonnet-20241022": ["claude-3-5-haiku-20241022"]
|
|
477
301
|
},
|
|
478
|
-
|
|
479
|
-
normalize: (response) => {
|
|
480
|
-
return {
|
|
481
|
-
provider: "ollama",
|
|
482
|
-
model: response.model,
|
|
483
|
-
inputTokens: response.prompt_eval_count || 0,
|
|
484
|
-
outputTokens: response.eval_count || 0,
|
|
485
|
-
totalTokens: (response.prompt_eval_count || 0) + (response.eval_count || 0)
|
|
486
|
-
};
|
|
487
|
-
}
|
|
302
|
+
maxRetries: 2
|
|
488
303
|
});
|
|
489
304
|
|
|
490
|
-
|
|
491
|
-
|
|
492
|
-
|
|
493
|
-
body: JSON.stringify({ model: "llama3.2", prompt: "Hello" })
|
|
494
|
-
});
|
|
305
|
+
patchGlobalFetch();
|
|
306
|
+
|
|
307
|
+
// API calls will automatically retry with fallback models on failure
|
|
495
308
|
```
|
|
496
309
|
|
|
497
|
-
|
|
498
|
-
- Adapter name must be a non-empty string
|
|
499
|
-
- `detect()` must return boolean
|
|
500
|
-
- `normalize()` must return valid `NormalizedUsage` object
|
|
501
|
-
- Adapters are checked in registration order (first match wins)
|
|
310
|
+
### Routing Strategies
|
|
502
311
|
|
|
503
|
-
|
|
312
|
+
**1. Fallback Strategy** - Uses predefined fallback map
|
|
313
|
+
- Tries models in the order listed in `fallbackMap`
|
|
314
|
+
- Best for: Known model preferences, production resilience
|
|
504
315
|
|
|
505
|
-
|
|
316
|
+
**2. Context Strategy** - Upgrades to larger context window
|
|
317
|
+
- Only triggers on context overflow errors
|
|
318
|
+
- Selects a model with a larger context window from the same provider
|
|
319
|
+
- Best for: Handling variable input sizes
|
|
506
320
|
|
|
507
|
-
|
|
321
|
+
**3. Cost Strategy** - Switches to cheaper model
|
|
322
|
+
- Selects a cheaper model from the same provider
|
|
323
|
+
- Best for: Cost optimization, rate limit handling
|
|
508
324
|
|
|
509
|
-
|
|
325
|
+
### Error Detection
|
|
510
326
|
|
|
511
|
-
|
|
512
|
-
|
|
513
|
-
|
|
514
|
-
|
|
515
|
-
|
|
327
|
+
The router automatically detects and classifies failures:
|
|
328
|
+
- `rate_limit` - HTTP 429 or rate limit errors
|
|
329
|
+
- `context_overflow` - Context length exceeded errors
|
|
330
|
+
- `model_unavailable` - HTTP 404 or model not found
|
|
331
|
+
- `access_denied` - HTTP 403 or unauthorized
|
|
332
|
+
- `unknown` - Other errors
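
A classifier for the failure types listed above could look roughly like this. It is a sketch under stated assumptions: `classifyFailure` is a hypothetical function, and the message substrings it matches are guesses at typical provider error text, not the patterns the library actually uses.

```javascript
// Hypothetical classifier mirroring the failure types above; matching rules are assumptions.
function classifyFailure(status, message = "") {
  const text = message.toLowerCase();
  if (status === 429 || text.includes("rate limit")) return "rate_limit";
  if (text.includes("context length") || text.includes("maximum context")) {
    return "context_overflow";
  }
  if (status === 404 || text.includes("model not found")) return "model_unavailable";
  if (status === 403 || text.includes("unauthorized")) return "access_denied";
  return "unknown";
}

console.log(classifyFailure(429));             // rate_limit
console.log(classifyFailure(500, "upstream")); // unknown
```

Checking the context overflow text before the status-based rules matters in this sketch because providers often report overflow with a generic 400 status.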
|
|
516
333
|
|
|
517
|
-
|
|
518
|
-
interface ModelPricing {
|
|
519
|
-
input: number; // Cost per 1M input tokens (USD)
|
|
520
|
-
output: number; // Cost per 1M output tokens (USD)
|
|
521
|
-
}
|
|
522
|
-
```
|
|
334
|
+
### `disableModelRouter()`
|
|
523
335
|
|
|
524
|
-
|
|
336
|
+
Disables the model router.
|
|
525
337
|
|
|
526
338
|
```javascript
|
|
527
|
-
const {
|
|
528
|
-
|
|
529
|
-
// Register pricing for custom model
|
|
530
|
-
registerPricing("ollama", "llama3.2", {
|
|
531
|
-
input: 0.0, // Free (self-hosted)
|
|
532
|
-
output: 0.0
|
|
533
|
-
});
|
|
339
|
+
const { disableModelRouter } = require("tokenfirewall");
|
|
534
340
|
|
|
535
|
-
|
|
536
|
-
registerPricing("openai", "gpt-5", {
|
|
537
|
-
input: 5.0, // $5 per 1M input tokens
|
|
538
|
-
output: 15.0 // $15 per 1M output tokens
|
|
539
|
-
});
|
|
540
|
-
|
|
541
|
-
// Override existing pricing
|
|
542
|
-
registerPricing("openai", "gpt-4o", {
|
|
543
|
-
input: 2.0, // Custom pricing
|
|
544
|
-
output: 8.0
|
|
545
|
-
});
|
|
341
|
+
disableModelRouter();
|
|
546
342
|
```
|
|
547
343
|
|
|
548
|
-
|
|
549
|
-
- Provider and model must be non-empty strings
|
|
550
|
-
- Input and output prices must be non-negative numbers
|
|
551
|
-
- Prices cannot be NaN or Infinity
|
|
344
|
+
---
|
|
552
345
|
|
|
553
|
-
|
|
554
|
-
TokenFirewall includes default pricing for:
|
|
555
|
-
- OpenAI (GPT-4o, GPT-4o-mini, GPT-4-turbo, GPT-3.5-turbo)
|
|
556
|
-
- Anthropic (Claude 3.5 Sonnet, Claude 3.5 Haiku, Claude 3 Opus)
|
|
557
|
-
- Gemini (Gemini 2.0 Flash, Gemini 1.5 Pro, Gemini 1.5 Flash)
|
|
558
|
-
- Grok (Grok-beta, Grok-2, Llama models)
|
|
559
|
-
- Kimi (Moonshot v1 models)
|
|
346
|
+
## Dynamic Model Registration
|
|
560
347
|
|
|
561
|
-
|
|
348
|
+
Register models with pricing and context limits at runtime.
|
|
562
349
|
|
|
563
|
-
|
|
350
|
+
### `registerModels(provider, models)`
|
|
564
351
|
|
|
565
|
-
|
|
352
|
+
Bulk register models for a provider.
|
|
566
353
|
|
|
567
354
|
**Parameters:**
|
|
568
355
|
|
|
569
|
-
|
|
570
|
-
|
|
571
|
-
|
|
572
|
-
|
|
573
|
-
|
|
356
|
+
```typescript
|
|
357
|
+
interface ModelConfig {
|
|
358
|
+
name: string; // Model identifier
|
|
359
|
+
contextLimit?: number; // Context window size in tokens
|
|
360
|
+
pricing?: { // Pricing per 1M tokens (USD)
|
|
361
|
+
input: number;
|
|
362
|
+
output: number;
|
|
363
|
+
};
|
|
364
|
+
}
|
|
365
|
+
```
|
|
574
366
|
|
|
575
367
|
**Example:**
|
|
576
368
|
|
|
577
369
|
```javascript
|
|
578
|
-
const {
|
|
579
|
-
|
|
580
|
-
// Register
|
|
581
|
-
|
|
370
|
+
const { registerModels, createModelRouter } = require("tokenfirewall");
|
|
371
|
+
|
|
372
|
+
// Register custom models
|
|
373
|
+
registerModels("my-provider", [
|
|
374
|
+
{
|
|
375
|
+
name: "my-large-model",
|
|
376
|
+
contextLimit: 200000,
|
|
377
|
+
pricing: { input: 5.0, output: 15.0 }
|
|
378
|
+
},
|
|
379
|
+
{
|
|
380
|
+
name: "my-small-model",
|
|
381
|
+
contextLimit: 50000,
|
|
382
|
+
pricing: { input: 1.0, output: 3.0 }
|
|
383
|
+
}
|
|
384
|
+
]);
|
|
582
385
|
|
|
583
|
-
//
|
|
584
|
-
|
|
386
|
+
// Router will use dynamically registered models
|
|
387
|
+
createModelRouter({
|
|
388
|
+
strategy: "cost",
|
|
389
|
+
maxRetries: 2
|
|
390
|
+
});
|
|
585
391
|
```
|
|
586
392
|
|
|
587
|
-
|
|
588
|
-
- Provider and model must be non-empty strings
|
|
589
|
-
- Context limit must be a positive number
|
|
590
|
-
- Cannot be NaN or Infinity
|
|
591
|
-
|
|
592
|
-
---
|
|
593
|
-
|
|
594
|
-
### Budget Persistence
|
|
393
|
+
### `registerPricing(provider, model, pricing)`
|
|
595
394
|
|
|
596
|
-
|
|
597
|
-
|
|
598
|
-
Exports the current budget state for persistence.
|
|
599
|
-
|
|
600
|
-
**Parameters:** None
|
|
601
|
-
|
|
602
|
-
**Returns:** `{ totalSpent: number; limit: number; mode: string } | null`
|
|
603
|
-
|
|
604
|
-
**Example:**
|
|
395
|
+
Register custom pricing for a specific model.
|
|
605
396
|
|
|
606
397
|
```javascript
|
|
607
|
-
const {
|
|
608
|
-
const fs = require("fs");
|
|
609
|
-
|
|
610
|
-
// Export state
|
|
611
|
-
const state = exportBudgetState();
|
|
398
|
+
const { registerPricing } = require("tokenfirewall");
|
|
612
399
|
|
|
613
|
-
|
|
614
|
-
//
|
|
615
|
-
|
|
616
|
-
|
|
617
|
-
// Or save to database
|
|
618
|
-
await db.budgets.update({ id: "main" }, state);
|
|
619
|
-
|
|
620
|
-
// Or save to Redis
|
|
621
|
-
await redis.set("budget:state", JSON.stringify(state));
|
|
622
|
-
}
|
|
400
|
+
registerPricing("openai", "gpt-5", {
|
|
401
|
+
input: 5.0, // $5 per 1M input tokens
|
|
402
|
+
output: 15.0 // $15 per 1M output tokens
|
|
403
|
+
});
|
|
623
404
|
```
|
|
624
405
|
|
|
625
|
-
|
|
626
|
-
- No budget guard has been created
|
|
627
|
-
|
|
628
|
-
---
|
|
629
|
-
|
|
630
|
-
#### `importBudgetState(state)`
|
|
631
|
-
|
|
632
|
-
Imports and restores a previously saved budget state.
|
|
633
|
-
|
|
634
|
-
**Parameters:**
|
|
635
|
-
|
|
636
|
-
| Parameter | Type | Required | Description |
|
|
637
|
-
|-----------|------|----------|-------------|
|
|
638
|
-
| `state` | `{ totalSpent: number }` | Yes | Saved budget state |
|
|
639
|
-
|
|
640
|
-
**Returns:** `void`
|
|
406
|
+
### `registerContextLimit(provider, model, contextLimit)`
|
|
641
407
|
|
|
642
|
-
|
|
643
|
-
- `Error` if no budget guard exists
|
|
644
|
-
- `Error` if `totalSpent` is not a valid number
|
|
645
|
-
- `Error` if `totalSpent` is negative
|
|
646
|
-
|
|
647
|
-
**Example:**
|
|
408
|
+
Register custom context window limit.
|
|
648
409
|
|
|
649
410
|
```javascript
|
|
650
|
-
const {
|
|
651
|
-
const fs = require("fs");
|
|
652
|
-
|
|
653
|
-
// Create budget guard first
|
|
654
|
-
createBudgetGuard({ monthlyLimit: 100, mode: "block" });
|
|
655
|
-
|
|
656
|
-
// Load from file
|
|
657
|
-
if (fs.existsSync("budget-state.json")) {
|
|
658
|
-
const state = JSON.parse(fs.readFileSync("budget-state.json", "utf8"));
|
|
659
|
-
importBudgetState(state);
|
|
660
|
-
console.log("Budget state restored");
|
|
661
|
-
}
|
|
411
|
+
const { registerContextLimit } = require("tokenfirewall");
|
|
662
412
|
|
|
663
|
-
|
|
664
|
-
const state = await db.budgets.findOne({ id: "main" });
|
|
665
|
-
if (state) {
|
|
666
|
-
importBudgetState(state);
|
|
667
|
-
}
|
|
413
|
+
registerContextLimit("openai", "gpt-5", 256000);
|
|
668
414
|
```
|
|
669
415
|
|
|
670
|
-
**Validation:**
|
|
671
|
-
- Validates `totalSpent` is a valid number
|
|
672
|
-
- Rejects negative values
|
|
673
|
-
- Warns if imported value is suspiciously large (>10x limit)
|
|
674
|
-
|
|
675
416
|
---
|
|
676
417
|
|
|
677
418
|
## Supported Providers
|
|
@@ -680,41 +421,43 @@ TokenFirewall includes built-in support for:
|
|
|
680
421
|
|
|
681
422
|
| Provider | Models | Pricing | Discovery |
|
|
682
423
|
|----------|--------|---------|-----------|
|
|
683
|
-
| **OpenAI** | GPT-
|
|
684
|
-
| **Anthropic** | Claude
|
|
685
|
-
| **Google Gemini** | Gemini
|
|
686
|
-
| **Grok (X.AI)** | Grok
|
|
687
|
-
| **Kimi (Moonshot)** | Moonshot v1 (8k, 32k, 128k) |
|
|
688
|
-
| **
|
|
689
|
-
|
|
690
|
-
|
|
691
|
-
|
|
692
|
-
|
|
693
|
-
|
|
694
|
-
|
|
695
|
-
|
|
696
|
-
-
|
|
697
|
-
-
|
|
698
|
-
|
|
699
|
-
|
|
700
|
-
-
|
|
701
|
-
|
|
702
|
-
|
|
703
|
-
|
|
704
|
-
|
|
705
|
-
-
|
|
706
|
-
|
|
707
|
-
|
|
708
|
-
|
|
709
|
-
|
|
710
|
-
-
|
|
711
|
-
-
|
|
712
|
-
|
|
713
|
-
|
|
714
|
-
|
|
715
|
-
-
|
|
716
|
-
-
|
|
717
|
-
-
|
|
424
|
+
| **OpenAI** | GPT-5, GPT-5-mini, GPT-4.1, GPT-4o, o1, gpt-image-1 | Included | API |
|
|
425
|
+
| **Anthropic** | Claude 4.5 (Opus, Sonnet, Haiku), Claude 4, Claude 3.5 | Included | Static |
|
|
426
|
+
| **Google Gemini** | Gemini 3, Gemini 3.1, Gemini 2.5, Nano Banana | Included | API |
|
|
427
|
+
| **Grok (X.AI)** | Grok 3, Grok 2, Grok Vision | Included | API |
|
|
428
|
+
| **Kimi (Moonshot)** | Moonshot v1 (8k, 32k, 128k) | Included | API |
|
|
429
|
+
| **Meta** | Llama 3.3, Llama 3.1 | Included | Static |
|
|
430
|
+
| **Mistral** | Mistral Large, Mixtral | Included | Static |
|
|
431
|
+
| **Cohere** | Command R+, Command R | Included | Static |
|
|
432
|
+
| **Custom** | Any LLM API | Register | Custom |
|
|
433
|
+
|
|
434
|
+
### Pricing (Per 1M Tokens)
|
|
435
|
+
|
|
436
|
+
**OpenAI:**
|
|
437
|
+
- GPT-5: $5.00 / $15.00
|
|
438
|
+
- GPT-5-mini: $1.50 / $5.00
|
|
439
|
+
- GPT-4.1: $3.00 / $12.00
|
|
440
|
+
- GPT-4o: $2.50 / $10.00
|
|
441
|
+
- o1: $6.00 / $18.00
|
|
442
|
+
|
|
443
|
+
**Anthropic:**
|
|
444
|
+
- Claude Opus 4.5: $17.00 / $85.00
|
|
445
|
+
- Claude Sonnet 4.5: $4.00 / $20.00
|
|
446
|
+
- Claude Haiku 4.5: $1.20 / $6.00
|
|
447
|
+
|
|
448
|
+
**Gemini:**
|
|
449
|
+
- Gemini 3 Pro: $3.50 / $14.00
|
|
450
|
+
- Gemini 3 Flash: $0.35 / $1.50
|
|
451
|
+
- Gemini 2.5 Pro: $2.50 / $10.00
|
|
452
|
+
- Nano Banana: $0.05 / $0.20
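
Because prices are quoted per 1M tokens, the cost of a single call follows from its token counts. The sketch below shows the arithmetic using the GPT-4o prices listed above; `estimateCost` is a hypothetical helper, not a TokenFirewall export, and the token counts are made-up example values.

```javascript
// Hypothetical helper: cost in USD for one call, given per-1M-token prices.
function estimateCost(pricing, inputTokens, outputTokens) {
  const PER_MILLION = 1_000_000;
  return (
    (inputTokens / PER_MILLION) * pricing.input +
    (outputTokens / PER_MILLION) * pricing.output
  );
}

// GPT-4o prices from the list above: $2.50 input / $10.00 output per 1M tokens.
const gpt4o = { input: 2.5, output: 10.0 };

// Example call: 1,200 input tokens and 300 output tokens.
const cost = estimateCost(gpt4o, 1200, 300);
console.log(`$${cost.toFixed(4)}`); // $0.0060
```

Here the input side contributes $0.003 and the output side another $0.003, which illustrates how a short completion can cost as much as a much longer prompt when output tokens are priced higher.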
|
|
453
|
+
|
|
454
|
+
### Context Limits
|
|
455
|
+
|
|
456
|
+
- GPT-5: 256K tokens
|
|
457
|
+
- GPT-4.1: 200K tokens
|
|
458
|
+
- Claude 4.5: 200K tokens
|
|
459
|
+
- Gemini 3 Pro: 2M tokens
|
|
460
|
+
- o1: 200K tokens
|
|
718
461
|
|
|
719
462
|
---
|
|
720
463
|
|
|
@@ -727,6 +470,8 @@ See the [`examples/`](./examples) directory for complete, runnable examples:
|
|
|
727
470
|
3. **[Budget Persistence](./examples/3-budget-persistence.js)** - Save and restore state
|
|
728
471
|
4. **[Custom Provider](./examples/4-custom-provider.js)** - Add your own LLM provider
|
|
729
472
|
5. **[Model Discovery](./examples/5-model-discovery.js)** - Find and compare models
|
|
473
|
+
6. **[Intelligent Routing](./examples/6-intelligent-routing.js)** - Automatic retry and fallback
|
|
474
|
+
7. **[Dynamic Models](./examples/7-dynamic-models.js)** - Register models at runtime
|
|
730
475
|
|
|
731
476
|
---
|
|
732
477
|
|
|
@@ -739,11 +484,13 @@ import {
|
|
|
739
484
|
createBudgetGuard,
|
|
740
485
|
patchGlobalFetch,
|
|
741
486
|
getBudgetStatus,
|
|
487
|
+
createModelRouter,
|
|
488
|
+
registerModels,
|
|
742
489
|
BudgetGuardOptions,
|
|
743
490
|
BudgetStatus,
|
|
744
491
|
ModelInfo,
|
|
745
|
-
|
|
746
|
-
|
|
492
|
+
ModelRouterOptions,
|
|
493
|
+
ModelConfig
|
|
747
494
|
} from "tokenfirewall";
|
|
748
495
|
|
|
749
496
|
// Full type safety
|
|
@@ -769,14 +516,12 @@ try {
|
|
|
769
516
|
const response = await fetch(/* ... */);
|
|
770
517
|
} catch (error) {
|
|
771
518
|
if (error.message.includes("TokenFirewall: Budget exceeded")) {
|
|
772
|
-
// Budget limit reached
|
|
773
519
|
console.error("Monthly budget exhausted");
|
|
774
|
-
//
|
|
775
|
-
} else if (error.message.includes("TokenFirewall:
|
|
776
|
-
|
|
777
|
-
|
|
520
|
+
// Handle budget limit
|
|
521
|
+
} else if (error.message.includes("TokenFirewall Router: Max routing retries exceeded")) {
|
|
522
|
+
console.error("All fallback models failed");
|
|
523
|
+
// Handle routing failure
|
|
778
524
|
} else {
|
|
779
|
-
// Other errors (network, API, etc.)
|
|
780
525
|
console.error("API error:", error.message);
|
|
781
526
|
}
|
|
782
527
|
}
|
|
@@ -788,15 +533,15 @@ try {
|
|
|
788
533
|
|---------------|-------|----------|
|
|
789
534
|
| `Budget exceeded! Would spend $X of $Y limit` | Budget limit reached | Increase limit or wait for reset |
|
|
790
535
|
| `monthlyLimit must be a valid number` | Invalid budget configuration | Provide a positive number |
|
|
791
|
-
| `
|
|
536
|
+
| `Max routing retries exceeded` | All fallback models failed | Check API status or fallback map |
|
|
792
537
|
| `No pricing found for model "X"` | Unknown model | Register custom pricing |
|
|
793
|
-
| `Cannot import budget state - no budget guard exists` | Import before create | Call `createBudgetGuard()` first |
|
|
794
538
|
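The string checks from the error-handling example can be gathered into one place so that callers branch on a category rather than on message text. The following is a minimal sketch under that assumption. The two marker strings are the ones used in the example above, while `classifyError` and its category names are hypothetical and not part of the package.

```javascript
// Message fragments taken from the error-handling example in this README.
const BUDGET_MARKER = "TokenFirewall: Budget exceeded";
const ROUTING_MARKER = "TokenFirewall Router: Max routing retries exceeded";

// Hypothetical helper: map a thrown value to a coarse category.
function classifyError(error) {
  // Non-Error values (strings, objects) are stringified before matching.
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes(BUDGET_MARKER)) return "budget";
  if (message.includes(ROUTING_MARKER)) return "routing";
  return "other";
}

console.log(classifyError(new Error(BUDGET_MARKER)));  // prints budget
console.log(classifyError(new Error(ROUTING_MARKER))); // prints routing
console.log(classifyError(new Error("socket hang up"))); // prints other
```

Matching on message substrings is brittle if the wording changes between releases, so keeping the markers in one module limits the number of places that need updating.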
|
|
795
539
|
---
|
|
796
540
|
|
|
797
541
|
## Best Practices
|
|
798
542
|
|
|
799
543
|
### 1. Initialize Early
|
|
544
|
+
|
|
800
545
|
```javascript
|
|
801
546
|
// At application startup
|
|
802
547
|
createBudgetGuard({ monthlyLimit: 100, mode: "block" });
|
|
@@ -804,12 +549,14 @@ patchGlobalFetch();
|
|
|
804
549
|
```
|
|
805
550
|
|
|
806
551
|
### 2. Use Warn Mode in Development
|
|
552
|
+
|
|
807
553
|
```javascript
|
|
808
554
|
const mode = process.env.NODE_ENV === "production" ? "block" : "warn";
|
|
809
555
|
createBudgetGuard({ monthlyLimit: 100, mode });
|
|
810
556
|
```
|
|
811
557
|
|
|
812
558
|
### 3. Persist Budget State
|
|
559
|
+
|
|
813
560
|
```javascript
|
|
814
561
|
// Save on exit
|
|
815
562
|
process.on("beforeExit", () => {
|
|
@@ -819,29 +566,39 @@ process.on("beforeExit", () => {
|
|
|
819
566
|
```
|
|
820
567
|
|
|
821
568
|
### 4. Monitor Usage
|
|
569
|
+
|
|
822
570
|
```javascript
|
|
823
571
|
// Alert at 80% usage
|
|
824
572
|
const status = getBudgetStatus();
|
|
825
573
|
if (status && status.percentageUsed > 80) {
|
|
826
|
-
await sendAlert("Budget usage high
|
|
574
|
+
await sendAlert("Budget usage high");
|
|
827
575
|
}
|
|
828
576
|
```
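A single 80% threshold can be extended to several alert tiers. The sketch below is one possible shape, assuming the status object exposes `percentageUsed` as in the snippet above and may be absent before a guard is created. The `alertLevel` helper and the 80/95 thresholds are illustrative assumptions, not library defaults.

```javascript
// Hypothetical helper: turn a budget status into an alert level.
// `status` mirrors the shape used above: an object with `percentageUsed`,
// or a null/undefined value when no budget guard exists yet.
function alertLevel(status, warnAt = 80, criticalAt = 95) {
  if (!status) return "unknown";
  if (status.percentageUsed >= criticalAt) return "critical";
  if (status.percentageUsed >= warnAt) return "warning";
  return "ok";
}

// Stand-in status objects; in an application these would come from getBudgetStatus().
console.log(alertLevel({ percentageUsed: 42 })); // prints ok
console.log(alertLevel({ percentageUsed: 85 })); // prints warning
console.log(alertLevel({ percentageUsed: 97 })); // prints critical
console.log(alertLevel(null));                   // prints unknown
```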
|
|
829
577
|
|
|
830
|
-
### 5.
|
|
578
|
+
### 5. Use Router for Resilience
|
|
579
|
+
|
|
831
580
|
```javascript
|
|
832
|
-
//
|
|
833
|
-
|
|
834
|
-
|
|
835
|
-
|
|
836
|
-
|
|
581
|
+
// Automatic fallback on failures
|
|
582
|
+
createModelRouter({
|
|
583
|
+
strategy: "fallback",
|
|
584
|
+
fallbackMap: {
|
|
585
|
+
"gpt-4o": ["gpt-4o-mini", "gpt-3.5-turbo"]
|
|
586
|
+
},
|
|
587
|
+
maxRetries: 2
|
|
837
588
|
});
|
|
838
589
|
```
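To make the configuration above easier to reason about, the sketch below lists the models that a fallback chain would attempt for a given starting model. It is only a mental model: it assumes fallbacks are tried in the order listed and that `maxRetries` caps the number of fallback attempts after the initial request. The router's actual behaviour may differ, and `attemptOrder` is a hypothetical helper rather than a package export.

```javascript
// Hypothetical helper: which models would be tried, and in what order?
// Assumption: the primary model is tried first, then at most `maxRetries`
// entries from its fallback list, in the order they are listed.
function attemptOrder(model, fallbackMap, maxRetries) {
  const fallbacks = fallbackMap[model] ?? [];
  return [model, ...fallbacks.slice(0, maxRetries)];
}

// Same map as in the configuration above.
const fallbackMap = { "gpt-4o": ["gpt-4o-mini", "gpt-3.5-turbo"] };

console.log(attemptOrder("gpt-4o", fallbackMap, 2));
// prints [ 'gpt-4o', 'gpt-4o-mini', 'gpt-3.5-turbo' ]

console.log(attemptOrder("gpt-4o", fallbackMap, 1));
// prints [ 'gpt-4o', 'gpt-4o-mini' ]
```

Under these assumptions, a model with no entry in the map yields a single-element list, which means a failure on that model has nowhere to fall back to.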
|
|
839
590
|
|
|
840
|
-
|
|
841
|
-
|
|
842
|
-
## Contributing
|
|
591
|
+
### 6. Register Models Dynamically
|
|
843
592
|
|
|
844
|
-
|
|
593
|
+
```javascript
|
|
594
|
+
// Discover and register models from API
|
|
595
|
+
const models = await discoverModels(apiKey);
|
|
596
|
+
registerModels("provider", models.map(m => ({
|
|
597
|
+
name: m.id,
|
|
598
|
+
contextLimit: m.context_window,
|
|
599
|
+
pricing: { input: m.input_price, output: m.output_price }
|
|
600
|
+
})));
|
|
601
|
+
```
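Records fetched from a provider may be incomplete, so it can help to validate them before registration. The sketch below shows one way to do that, assuming the raw records carry the `id`, `context_window`, `input_price`, and `output_price` fields used in the snippet above. `toModelConfigs` is a hypothetical helper, and its output is meant to be passed to `registerModels` by the caller.

```javascript
// Hypothetical helper: keep only records with all required fields and
// convert them to the { name, contextLimit, pricing } shape shown above.
function toModelConfigs(rawModels) {
  return rawModels
    .filter(
      (m) =>
        typeof m.id === "string" &&
        Number.isFinite(m.context_window) &&
        Number.isFinite(m.input_price) &&
        Number.isFinite(m.output_price)
    )
    .map((m) => ({
      name: m.id,
      contextLimit: m.context_window,
      pricing: { input: m.input_price, output: m.output_price },
    }));
}

// Made-up sample data for illustration only.
const raw = [
  { id: "model-a", context_window: 128000, input_price: 1.0, output_price: 2.0 },
  { id: "model-b", context_window: 32000 }, // missing prices, so it is skipped
];

console.log(toModelConfigs(raw).map((m) => m.name)); // prints [ 'model-a' ]
```

Skipping incomplete records silently is a design choice; logging them instead makes gaps in pricing data easier to spot.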
|
|
845
602
|
|
|
846
603
|
---
|
|
847
604
|
|
|
@@ -856,20 +613,8 @@ MIT © [Ruthwik](https://github.com/Ruthwik000)
|
|
|
856
613
|
- **GitHub:** https://github.com/Ruthwik000/tokenfirewall
|
|
857
614
|
- **npm:** https://www.npmjs.com/package/tokenfirewall
|
|
858
615
|
- **Issues:** https://github.com/Ruthwik000/tokenfirewall/issues
|
|
859
|
-
- **Documentation:** [API.md](./API.md)
|
|
860
616
|
- **Changelog:** [CHANGELOG.md](./CHANGELOG.md)
|
|
861
617
|
|
|
862
618
|
---
|
|
863
619
|
|
|
864
|
-
|
|
865
|
-
|
|
866
|
-
If you find TokenFirewall useful, please:
|
|
867
|
-
- ⭐ Star the repository
|
|
868
|
-
- 🐛 Report bugs and issues
|
|
869
|
-
- 💡 Suggest new features
|
|
870
|
-
- 📖 Improve documentation
|
|
871
|
-
- 🔀 Submit pull requests
|
|
872
|
-
|
|
873
|
-
---
|
|
874
|
-
|
|
875
|
-
**Built with ❤️ for the AI developer community**
|
|
620
|
+
Built with ❤️ for the AI developer community.
|