tokenfirewall 1.0.1 → 1.0.2

Files changed (2)
  1. package/README.md +710 -147
  2. package/package.json +1 -1
package/README.md CHANGED
@@ -1,16 +1,49 @@
- # tokenfirewall
+ # TokenFirewall

- Production-grade LLM cost enforcement middleware for Node.js with automatic tracking, budget management, and model discovery.
+ > Enterprise-grade LLM cost enforcement middleware for Node.js with automatic budget protection, multi-provider support, and intelligent cost tracking.

- ## Features
+ [![npm version](https://img.shields.io/npm/v/tokenfirewall.svg)](https://www.npmjs.com/package/tokenfirewall)
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+ [![TypeScript](https://img.shields.io/badge/TypeScript-Ready-blue.svg)](https://www.typescriptlang.org/)

- - **Multi-provider support**: OpenAI, Anthropic, Gemini, Grok, Kimi
- - **Budget enforcement**: Warn or block when limits exceeded
- - **Automatic cost tracking**: Real-time usage monitoring
- - **Model discovery**: List available models with context limits
- - **Context intelligence**: Budget-aware model selection
- - **Extensible**: Add custom providers easily
- - **Type-safe**: Full TypeScript support
+ ## Overview
+
+ TokenFirewall is a production-ready middleware that automatically tracks and enforces budget limits for Large Language Model (LLM) API calls. It provides transparent cost monitoring, prevents budget overruns, and supports multiple providers through a unified interface.
+
+ ### Key Features
+
+ - **Automatic Budget Enforcement** - Block or warn when spending limits are exceeded
+ - **Real-time Cost Tracking** - Automatic calculation based on actual token usage
+ - **Multi-Provider Support** - Works with OpenAI, Anthropic, Gemini, Grok, Kimi, and custom providers
+ - **Model Discovery** - List available models with context limits and pricing
+ - **Budget Persistence** - Save and restore budget state across restarts
+ - **Zero Configuration** - Works out-of-the-box with sensible defaults
+ - **Production Ready** - Comprehensive error handling and validation
+ - **TypeScript Native** - Full type definitions included
+
+ ---
+
+ ## Table of Contents
+
+ - [Installation](#installation)
+ - [Quick Start](#quick-start)
+ - [Core Concepts](#core-concepts)
+ - [API Reference](#api-reference)
+ - [Budget Management](#budget-management)
+ - [Interception](#interception)
+ - [Model Discovery](#model-discovery)
+ - [Custom Providers](#custom-providers)
+ - [Budget Persistence](#budget-persistence)
+ - [Supported Providers](#supported-providers)
+ - [Use Cases](#use-cases)
+ - [Examples](#examples)
+ - [TypeScript Support](#typescript-support)
+ - [Error Handling](#error-handling)
+ - [Best Practices](#best-practices)
+ - [Contributing](#contributing)
+ - [License](#license)
+
+ ---

  ## Installation

@@ -18,21 +51,27 @@ Production-grade LLM cost enforcement middleware for Node.js with automatic trac
  npm install tokenfirewall
  ```

+ **Requirements:**
+ - Node.js >= 16.0.0
+ - TypeScript >= 5.0.0 (for TypeScript projects)
+
+ ---
+
  ## Quick Start

  ```javascript
  const { createBudgetGuard, patchGlobalFetch } = require("tokenfirewall");

- // Set budget limit
+ // Step 1: Set up budget protection
  createBudgetGuard({
  monthlyLimit: 100, // $100 USD
- mode: "block" // or "warn"
+ mode: "block" // Throw error when exceeded
  });

- // Enable tracking
+ // Step 2: Patch global fetch
  patchGlobalFetch();

- // Use any LLM API - tokenfirewall tracks everything
+ // Step 3: Use any LLM API normally - tokenfirewall handles the rest
  const response = await fetch("https://api.openai.com/v1/chat/completions", {
  method: "POST",
  headers: {
@@ -40,273 +79,797 @@ const response = await fetch("https://api.openai.com/v1/chat/completions", {
  "Content-Type": "application/json"
  },
  body: JSON.stringify({
- model: "gpt-4o",
+ model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Hello!" }]
  })
  });
+
+ // Costs are automatically tracked and logged
  ```

+ ---
+
+ ## Core Concepts
+
+ ### Budget Guard
+
+ The Budget Guard is the core component that tracks spending and enforces limits. It operates in two modes:
+
+ - **Block Mode** (`mode: "block"`): Throws an error when budget is exceeded, preventing the API call
+ - **Warn Mode** (`mode: "warn"`): Logs a warning but allows the API call to proceed
+
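The two modes above amount to a simple check before each charge is recorded. A minimal sketch of the idea (`makeGuard` and `charge` are hypothetical names for illustration, not TokenFirewall's actual internals):

```javascript
// Sketch of block vs. warn enforcement (illustrative only;
// `makeGuard` and `charge` are hypothetical names).
function makeGuard(monthlyLimit, mode) {
  let totalSpent = 0;
  return {
    charge(cost) {
      if (totalSpent + cost > monthlyLimit) {
        if (mode === "block") {
          // Block mode: refuse the call outright
          throw new Error("Budget exceeded");
        }
        // Warn mode: log, but let the call proceed
        console.warn("Budget exceeded");
      }
      totalSpent += cost;
      return totalSpent;
    }
  };
}

const blocking = makeGuard(1, "block");
blocking.charge(0.8); // under the limit: fine
let blocked = false;
try { blocking.charge(0.5); } catch { blocked = true; } // would exceed: throws

const warning = makeGuard(1, "warn");
warning.charge(0.8);
const spent = warning.charge(0.5); // warns, but still accumulates
```

The practical consequence: in block mode an over-budget call never reaches the provider, while warn mode only surfaces the overage in logs.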
+ ### Automatic Interception
+
+ TokenFirewall intercepts HTTP requests at the `fetch` level, automatically:
+ 1. Detecting LLM API responses
+ 2. Extracting token usage information
+ 3. Calculating costs based on provider pricing
+ 4. Tracking against your budget
+ 5. Logging usage details
+
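The steps above can be sketched as a thin wrapper around `fetch` (an illustration of the approach, not the library's actual implementation; the `usage` field names follow OpenAI's response shape):

```javascript
// Illustrative fetch wrapper: inspect the JSON body for a usage block
// and report token counts to a callback. Not TokenFirewall's real code.
function wrapFetch(realFetch, onUsage) {
  return async (...args) => {
    const res = await realFetch(...args);
    try {
      // Clone so the caller can still consume the body
      const body = await res.clone().json();
      if (body && body.usage) {
        onUsage({
          inputTokens: body.usage.prompt_tokens ?? 0,
          outputTokens: body.usage.completion_tokens ?? 0
        });
      }
    } catch {
      // Not JSON / not an LLM response: pass through untouched
    }
    return res;
  };
}

// Demo with a stubbed fetch (no network):
const fakeFetch = async () => ({
  clone: () => ({
    json: async () => ({ usage: { prompt_tokens: 10, completion_tokens: 5 } })
  })
});
let seen = null;
const tracked = wrapFetch(fakeFetch, usage => { seen = usage; });
```

Cloning the response is the key design point: the wrapper can read the body for usage data without consuming the stream the application expects.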
+ ### Provider Adapters
+
+ Each LLM provider has a dedicated adapter that:
+ - Detects provider-specific response formats
+ - Normalizes token usage data
+ - Applies correct pricing models
+
+ ---
+
  ## API Reference

  ### Budget Management

  #### `createBudgetGuard(options)`

- Initialize budget protection.
+ Creates and configures a budget guard instance.
+
+ **Parameters:**
+
+ | Parameter | Type | Required | Description |
+ |-----------|------|----------|-------------|
+ | `options` | `BudgetGuardOptions` | Yes | Budget configuration object |
+ | `options.monthlyLimit` | `number` | Yes | Maximum spending limit in USD |
+ | `options.mode` | `"block" \| "warn"` | No | Enforcement mode (default: `"block"`) |
+
+ **Returns:** `BudgetManager` - The budget manager instance
+
+ **Throws:**
+ - `Error` if `monthlyLimit` is not a positive number
+ - `Error` if `mode` is not "block" or "warn"
+
+ **Example:**

  ```javascript
- createBudgetGuard({
- monthlyLimit: 100, // Required: monthly budget in USD
- mode: "block" // Optional: "block" (default) or "warn"
+ const { createBudgetGuard } = require("tokenfirewall");
+
+ // Block mode - strict enforcement
+ const strictGuard = createBudgetGuard({
+ monthlyLimit: 100,
+ mode: "block"
+ });
+
+ // Warn mode - soft limits (replaces the guard above)
+ const softGuard = createBudgetGuard({
+ monthlyLimit: 500,
+ mode: "warn"
  });
  ```

+ **Notes:**
+ - Calling `createBudgetGuard()` multiple times will replace the existing guard
+ - A warning is logged when overwriting an existing guard
+ - The guard is global and applies to all subsequent API calls
+
+ ---
+
  #### `getBudgetStatus()`

- Get current budget information.
+ Retrieves the current budget status and usage statistics.
+
+ **Parameters:** None
+
+ **Returns:** `BudgetStatus | null`
+
+ ```typescript
+ interface BudgetStatus {
+ totalSpent: number; // Total amount spent in USD
+ limit: number; // Monthly limit in USD
+ remaining: number; // Remaining budget in USD
+ percentageUsed: number; // Percentage of budget used (0-100)
+ }
+ ```
+
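The fields above presumably relate by simple arithmetic: `remaining = limit - totalSpent` and `percentageUsed = totalSpent / limit * 100` (an assumption inferred from the interface shape, not confirmed library code):

```javascript
// Presumed arithmetic behind BudgetStatus (illustrative only).
function toStatus(totalSpent, limit) {
  return {
    totalSpent,
    limit,
    remaining: limit - totalSpent,
    percentageUsed: (totalSpent / limit) * 100
  };
}

const s = toStatus(45.23, 100);
// s.remaining ≈ 54.77, s.percentageUsed ≈ 45.23
```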
+ **Example:**

  ```javascript
+ const { getBudgetStatus } = require("tokenfirewall");
+
  const status = getBudgetStatus();
- // {
- // totalSpent: 45.23,
- // limit: 100,
- // remaining: 54.77,
- // percentageUsed: 45.23
- // }
+
+ if (status) {
+ console.log(`Spent: $${status.totalSpent.toFixed(2)}`);
+ console.log(`Remaining: $${status.remaining.toFixed(2)}`);
+ console.log(`Usage: ${status.percentageUsed.toFixed(1)}%`);
+
+ // Alert if over 80%
+ if (status.percentageUsed > 80) {
+ console.warn("⚠️ Budget usage is high!");
+ }
+ }
  ```

+ **Returns `null` if:**
+ - No budget guard has been created
+ - Budget guard was not initialized
+
+ ---
+
  #### `resetBudget()`

- Reset budget tracking (useful for monthly resets).
+ Resets the budget tracking to zero, clearing all accumulated costs.
+
+ **Parameters:** None
+
+ **Returns:** `void`
+
+ **Example:**

  ```javascript
- resetBudget();
+ const { resetBudget, getBudgetStatus } = require("tokenfirewall");
+
+ // Reset at the start of each month
+ function monthlyReset() {
+ resetBudget();
+ console.log("Budget reset for new month");
+
+ const status = getBudgetStatus();
+ console.log(`New budget: $${status.limit}`);
+ }
+
+ // Schedule monthly reset
+ const cron = require("node-cron");
+ cron.schedule("0 0 1 * *", monthlyReset); // First day of month
  ```

+ **Use Cases:**
+ - Monthly budget resets
+ - Testing and development
+ - Per-session budgets
+ - Tenant-specific resets
+
+ ---
+
  ### Interception

  #### `patchGlobalFetch()`

- Intercept all fetch calls to track LLM usage.
+ Patches the global `fetch` function to intercept and track LLM API calls.
+
+ **Parameters:** None
+
+ **Returns:** `void`
+
+ **Example:**
+
+ ```javascript
+ const { patchGlobalFetch } = require("tokenfirewall");
+
+ // Patch once at application startup
+ patchGlobalFetch();
+
+ // All subsequent fetch calls are intercepted
+ await fetch("https://api.openai.com/v1/chat/completions", { /* ... */ });
+ await fetch("https://api.anthropic.com/v1/messages", { /* ... */ });
+ ```
+
+ **Behavior:**
+ - Intercepts all `fetch` calls globally
+ - Only processes LLM API responses (non-LLM calls are ignored)
+ - Automatically detects provider from response format
+ - Calculates costs and tracks against budget
+ - Logs usage information to console
+ - Can be called multiple times safely (idempotent)
+
+ **Important Notes:**
+ - Must be called AFTER `createBudgetGuard()`
+ - Works with official SDKs that use `fetch` internally
+ - Does not affect non-LLM HTTP requests
+ - Minimal performance overhead
+
+ ---
+
+ #### `unpatchGlobalFetch()`
+
+ Restores the original `fetch` function, disabling interception.
+
+ **Parameters:** None
+
+ **Returns:** `void`
+
+ **Example:**

  ```javascript
+ const { patchGlobalFetch, unpatchGlobalFetch } = require("tokenfirewall");
+
+ // Enable tracking
  patchGlobalFetch();
+
+ // ... make some API calls ...
+
+ // Disable tracking
+ unpatchGlobalFetch();
+
+ // Subsequent calls are not tracked
  ```

+ **Use Cases:**
+ - Temporarily disable tracking
+ - Testing specific scenarios
+ - Cleanup in test suites
+
+ ---
+
  #### `patchProvider(providerName)`

- Patch specific provider SDK (most use fetch internally).
+ Patches a specific provider SDK (currently a placeholder; most providers work via fetch interception).
+
+ **Parameters:**
+
+ | Parameter | Type | Required | Description |
+ |-----------|------|----------|-------------|
+ | `providerName` | `string` | Yes | Provider name ("openai", "anthropic", etc.) |
+
+ **Returns:** `void`
+
+ **Example:**

  ```javascript
+ const { patchProvider } = require("tokenfirewall");
+
  patchProvider("openai");
  ```

+ **Note:** Most providers work automatically with `patchGlobalFetch()`. This function is reserved for future provider-specific integrations.
+
+ ---
+
  ### Model Discovery

+ #### `listModels(options)`
+
+ Lists available models from a provider with context limits and budget information.
+
+ **Parameters:**
+
+ | Parameter | Type | Required | Description |
+ |-----------|------|----------|-------------|
+ | `options` | `ListModelsOptions` | Yes | Discovery options |
+ | `options.provider` | `string` | Yes | Provider name ("openai", "gemini", "grok", "kimi") |
+ | `options.apiKey` | `string` | Yes | Provider API key |
+ | `options.baseURL` | `string` | No | Custom API endpoint URL |
+ | `options.includeBudgetUsage` | `boolean` | No | Include current budget usage % (default: false) |
+
+ **Returns:** `Promise<ModelInfo[]>`
+
+ ```typescript
+ interface ModelInfo {
+ model: string; // Model identifier
+ contextLimit?: number; // Context window size in tokens
+ budgetUsagePercentage?: number; // Current budget usage (if requested)
+ }
+ ```
+
+ **Example:**
+
+ ```javascript
+ const { listModels } = require("tokenfirewall");
+
+ // Discover OpenAI models
+ const models = await listModels({
+ provider: "openai",
+ apiKey: process.env.OPENAI_API_KEY,
+ includeBudgetUsage: true
+ });
+
+ models.forEach(model => {
+ console.log(`Model: ${model.model}`);
+ if (model.contextLimit) {
+ console.log(` Context: ${model.contextLimit.toLocaleString()} tokens`);
+ }
+ if (model.budgetUsagePercentage !== undefined) {
+ console.log(` Budget Used: ${model.budgetUsagePercentage.toFixed(2)}%`);
+ }
+ });
+
+ // Find models with large context windows
+ const largeContext = models.filter(m => m.contextLimit && m.contextLimit > 100000);
+ ```
+
+ **Supported Providers:**
+ - `"openai"` - Fetches from OpenAI API
+ - `"gemini"` - Fetches from Google Gemini API
+ - `"grok"` - Fetches from X.AI API
+ - `"kimi"` - Fetches from Moonshot AI API
+ - `"anthropic"` - Returns static list (no API endpoint available)
+
+ **Error Handling:**
+ - Returns empty array if API call fails
+ - Logs warning on errors
+ - Has 10-second timeout to prevent hanging
+
+ ---
+
  #### `listAvailableModels(options)`

- Discover available models with context limits and budget usage.
+ Lower-level model discovery function with manual budget manager injection.
+
+ **Parameters:**
+
+ | Parameter | Type | Required | Description |
+ |-----------|------|----------|-------------|
+ | `options` | `ListModelsOptions` | Yes | Discovery options (same as `listModels`) |
+ | `options.budgetManager` | `BudgetManager` | No | Manual budget manager instance |
+
+ **Returns:** `Promise<ModelInfo[]>`
+
+ **Example:**

  ```javascript
+ const { listAvailableModels, createBudgetGuard } = require("tokenfirewall");
+
+ const manager = createBudgetGuard({ monthlyLimit: 100, mode: "warn" });
+
  const models = await listAvailableModels({
  provider: "openai",
  apiKey: process.env.OPENAI_API_KEY,
- includeBudgetUsage: true // Optional
+ budgetManager: manager,
+ includeBudgetUsage: true
  });
-
- // Returns:
- // [
- // {
- // model: "gpt-4o",
- // contextLimit: 128000,
- // budgetUsagePercentage: 32.4
- // }
- // ]
  ```

- ### Extensibility
+ **Note:** Use `listModels()` instead - it automatically passes the global budget manager.
+
+ ---
+
+ ### Custom Providers

  #### `registerAdapter(adapter)`

- Add custom LLM provider.
+ Registers a custom provider adapter for tracking non-standard LLM APIs.
+
+ **Parameters:**
+
+ | Parameter | Type | Required | Description |
+ |-----------|------|----------|-------------|
+ | `adapter` | `ProviderAdapter` | Yes | Custom adapter implementation |
+
+ ```typescript
+ interface ProviderAdapter {
+ name: string; // Unique provider name
+ detect: (response: unknown) => boolean; // Detect if response is from this provider
+ normalize: (response: unknown, request?: unknown) => NormalizedUsage; // Extract token usage
+ }
+
+ interface NormalizedUsage {
+ provider: string; // Provider name
+ model: string; // Model identifier
+ inputTokens: number; // Input/prompt tokens
+ outputTokens: number; // Output/completion tokens
+ totalTokens: number; // Total tokens
+ }
+ ```
+
+ **Example:**

  ```javascript
+ const { registerAdapter } = require("tokenfirewall");
+
+ // Register Ollama (self-hosted) adapter
  registerAdapter({
- name: "custom",
- detect: (response) => /* detection logic */,
- normalize: (response) => /* normalization logic */
+ name: "ollama",
+
+ detect: (response) => {
+ return Boolean(response &&
+ typeof response === "object" &&
+ response.model &&
+ response.prompt_eval_count !== undefined);
+ },
+
+ normalize: (response) => {
+ return {
+ provider: "ollama",
+ model: response.model,
+ inputTokens: response.prompt_eval_count || 0,
+ outputTokens: response.eval_count || 0,
+ totalTokens: (response.prompt_eval_count || 0) + (response.eval_count || 0)
+ };
+ }
+ });
+
+ // Now Ollama calls are tracked
+ const response = await fetch("http://localhost:11434/api/generate", {
+ method: "POST",
+ body: JSON.stringify({ model: "llama3.2", prompt: "Hello" })
  });
  ```

+ **Validation:**
+ - Adapter name must be a non-empty string
+ - `detect()` must return boolean
+ - `normalize()` must return valid `NormalizedUsage` object
+ - Adapters are checked in registration order (first match wins)
+
+ ---
+
505
  #### `registerPricing(provider, model, pricing)`
142
506
 
143
- Add custom pricing (per 1M tokens).
507
+ Registers custom pricing for a provider and model.
508
+
509
+ **Parameters:**
510
+
511
+ | Parameter | Type | Required | Description |
512
+ |-----------|------|----------|-------------|
513
+ | `provider` | `string` | Yes | Provider name |
514
+ | `model` | `string` | Yes | Model identifier |
515
+ | `pricing` | `ModelPricing` | Yes | Pricing configuration |
516
+
517
+ ```typescript
518
+ interface ModelPricing {
519
+ input: number; // Cost per 1M input tokens (USD)
520
+ output: number; // Cost per 1M output tokens (USD)
521
+ }
522
+ ```
523
+
524
+ **Example:**
144
525
 
145
526
  ```javascript
146
- registerPricing("custom", "model-name", {
147
- input: 0.001,
148
- output: 0.002
527
+ const { registerPricing } = require("tokenfirewall");
528
+
529
+ // Register pricing for custom model
530
+ registerPricing("ollama", "llama3.2", {
531
+ input: 0.0, // Free (self-hosted)
532
+ output: 0.0
533
+ });
534
+
535
+ // Register pricing for new OpenAI model
536
+ registerPricing("openai", "gpt-5", {
537
+ input: 5.0, // $5 per 1M input tokens
538
+ output: 15.0 // $15 per 1M output tokens
539
+ });
540
+
541
+ // Override existing pricing
542
+ registerPricing("openai", "gpt-4o", {
543
+ input: 2.0, // Custom pricing
544
+ output: 8.0
149
545
  });
150
546
  ```
151
547
 
548
+ **Validation:**
549
+ - Provider and model must be non-empty strings
550
+ - Input and output prices must be non-negative numbers
551
+ - Prices cannot be NaN or Infinity
552
+
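With per-1M-token prices, the cost of a call follows the standard formula `cost = inputTokens / 1e6 * input + outputTokens / 1e6 * output`. A worked sketch of the arithmetic (illustrative only, not library code):

```javascript
// Illustrative cost arithmetic for per-1M-token pricing.
function costOf(usage, pricing) {
  return (usage.inputTokens / 1e6) * pricing.input +
         (usage.outputTokens / 1e6) * pricing.output;
}

// 2,000 input and 1,000 output tokens at $5 / $15 per 1M tokens:
const cost = costOf(
  { inputTokens: 2000, outputTokens: 1000 },
  { input: 5.0, output: 15.0 }
);
// cost = 0.002 * 5 + 0.001 * 15 = $0.025
```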
+ **Default Pricing:**
+ TokenFirewall includes default pricing for:
+ - OpenAI (GPT-4o, GPT-4o-mini, GPT-4-turbo, GPT-3.5-turbo)
+ - Anthropic (Claude 3.5 Sonnet, Claude 3.5 Haiku, Claude 3 Opus)
+ - Gemini (Gemini 2.0 Flash, Gemini 1.5 Pro, Gemini 1.5 Flash)
+ - Grok (Grok-beta, Grok-2, Llama models)
+ - Kimi (Moonshot v1 models)
+
+ ---
+
  #### `registerContextLimit(provider, model, contextLimit)`

- Add custom context limit.
+ Registers a custom context window limit for a model.
+
+ **Parameters:**
+
+ | Parameter | Type | Required | Description |
+ |-----------|------|----------|-------------|
+ | `provider` | `string` | Yes | Provider name |
+ | `model` | `string` | Yes | Model identifier |
+ | `contextLimit` | `number` | Yes | Context window size in tokens |
+
+ **Example:**

  ```javascript
- registerContextLimit("custom", "model-name", 131072);
+ const { registerContextLimit } = require("tokenfirewall");
+
+ // Register context limit for custom model
+ registerContextLimit("ollama", "llama3.2", 8192);
+
+ // Register for new model
+ registerContextLimit("openai", "gpt-5", 256000);
  ```

- ## Supported Providers
+ **Validation:**
+ - Provider and model must be non-empty strings
+ - Context limit must be a positive number
+ - Cannot be NaN or Infinity

- | Provider | Models | Context Limits |
- |----------|--------|----------------|
- | OpenAI | gpt-4o, gpt-4o-mini, gpt-3.5-turbo | 16K - 128K |
- | Anthropic | claude-3-5-sonnet, claude-3-5-haiku | 200K |
- | Gemini | gemini-2.5-pro, gemini-2.5-flash | 1M - 2M |
- | Grok | grok-beta, llama-3.3-70b | 131K |
- | Kimi | moonshot-v1-8k/32k/128k | 8K - 128K |
+ ---

- ## Usage Examples
+ ### Budget Persistence

- ### Basic Usage
+ #### `exportBudgetState()`

- ```javascript
- const { createBudgetGuard, patchGlobalFetch } = require("tokenfirewall");
+ Exports the current budget state for persistence.

- createBudgetGuard({ monthlyLimit: 100, mode: "block" });
- patchGlobalFetch();
+ **Parameters:** None
+
+ **Returns:** `{ totalSpent: number; limit: number; mode: string } | null`
+
+ **Example:**

- // Make LLM calls as usual - automatically tracked
+ ```javascript
+ const { exportBudgetState } = require("tokenfirewall");
+ const fs = require("fs");
+
+ // Export state
+ const state = exportBudgetState();
+
+ if (state) {
+ // Save to file
+ fs.writeFileSync("budget-state.json", JSON.stringify(state, null, 2));
+
+ // Or save to a database (note: `await` requires an async context)
+ await db.budgets.update({ id: "main" }, state);
+
+ // Or save to Redis
+ await redis.set("budget:state", JSON.stringify(state));
+ }
  ```

- ### Budget-Aware Model Selection
+ **Returns `null` if:**
+ - No budget guard has been created
+
+ ---
+
+ #### `importBudgetState(state)`
+
+ Imports and restores a previously saved budget state.
+
+ **Parameters:**
+
+ | Parameter | Type | Required | Description |
+ |-----------|------|----------|-------------|
+ | `state` | `{ totalSpent: number }` | Yes | Saved budget state |
+
+ **Returns:** `void`
+
+ **Throws:**
+ - `Error` if no budget guard exists
+ - `Error` if `totalSpent` is not a valid number
+ - `Error` if `totalSpent` is negative
+
+ **Example:**

  ```javascript
- const { listAvailableModels, getBudgetStatus } = require("tokenfirewall");
+ const { importBudgetState, createBudgetGuard } = require("tokenfirewall");
+ const fs = require("fs");

- const models = await listAvailableModels({
- provider: "openai",
- apiKey: process.env.OPENAI_API_KEY,
- includeBudgetUsage: true
- });
+ // Create budget guard first
+ createBudgetGuard({ monthlyLimit: 100, mode: "block" });

- const status = getBudgetStatus();
- if (status.remaining < 10) {
- console.log("Low budget - use cheaper models");
- const cheapModels = models.filter(m => m.model.includes("mini"));
+ // Load from file
+ if (fs.existsSync("budget-state.json")) {
+ const state = JSON.parse(fs.readFileSync("budget-state.json", "utf8"));
+ importBudgetState(state);
+ console.log("Budget state restored");
+ }
+
+ // Or load from a database (note: `await` requires an async context)
+ const state = await db.budgets.findOne({ id: "main" });
+ if (state) {
+ importBudgetState(state);
  }
  ```

- ### Context-Aware Routing
+ **Validation:**
+ - Validates `totalSpent` is a valid number
+ - Rejects negative values
+ - Warns if imported value is suspiciously large (>10x limit)
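The exported shape is plain JSON, so a round-trip plus the validation rules above can be sketched without the library (`validateState` is a hypothetical helper mirroring the documented checks, not library code):

```javascript
// Round-trip the persisted shape and apply the documented checks
// (hypothetical helper; mirrors the rules above, not library code).
function validateState(state, limit) {
  if (typeof state.totalSpent !== "number" || Number.isNaN(state.totalSpent)) {
    throw new Error("totalSpent is not a valid number");
  }
  if (state.totalSpent < 0) {
    throw new Error("totalSpent is negative");
  }
  if (state.totalSpent > 10 * limit) {
    console.warn("Imported totalSpent is suspiciously large");
  }
  return state;
}

const saved = JSON.stringify({ totalSpent: 42.5, limit: 100, mode: "block" });
const restored = validateState(JSON.parse(saved), 100);
// restored.totalSpent === 42.5
```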

- ```javascript
- const { listAvailableModels } = require("tokenfirewall");
+ ---

- const models = await listAvailableModels({
- provider: "openai",
- apiKey: process.env.OPENAI_API_KEY
- });
+ ## Supported Providers

- // Find model with sufficient context
- const suitable = models.find(m =>
- m.contextLimit && m.contextLimit >= promptTokens * 1.5
- );
- ```
+ TokenFirewall includes built-in support for:

- ### Custom Provider
+ | Provider | Models | Pricing | Discovery |
+ |----------|--------|---------|-----------|
+ | **OpenAI** | GPT-4o, GPT-4o-mini, GPT-4-turbo, GPT-3.5-turbo | ✅ Included | ✅ API |
+ | **Anthropic** | Claude 3.5 Sonnet, Claude 3.5 Haiku, Claude 3 Opus | ✅ Included | ✅ Static |
+ | **Google Gemini** | Gemini 2.0 Flash, Gemini 1.5 Pro, Gemini 1.5 Flash | ✅ Included | ✅ API |
+ | **Grok (X.AI)** | Grok-beta, Grok-2, Llama 3.x models | ✅ Included | ✅ API |
+ | **Kimi (Moonshot)** | Moonshot v1 (8k, 32k, 128k) | ✅ Included | ✅ API |
+ | **Custom** | Any LLM API | ⚙️ Register | ⚙️ Custom |

- ```javascript
- const { registerAdapter, registerPricing } = require("tokenfirewall");
+ ---

- // Add Ollama support
- registerAdapter({
- name: "ollama",
- detect: (response) => response?.model && response?.prompt_eval_count !== undefined,
- normalize: (response) => ({
- provider: "ollama",
- model: response.model,
- inputTokens: response.prompt_eval_count || 0,
- outputTokens: response.eval_count || 0,
- totalTokens: (response.prompt_eval_count || 0) + (response.eval_count || 0)
- })
- });
+ ## Use Cases

- registerPricing("ollama", "llama3.2", { input: 0, output: 0 });
- ```
+ ### 1. Production Applications
+ - Prevent unexpected cost spikes
+ - Enforce spending limits per tenant/user
+ - Track costs across multiple providers
+
+ ### 2. Development & Testing
+ - Limit test suite costs
+ - Prevent accidental expensive calls
+ - Safe experimentation with new models
+
+ ### 3. Multi-Tenant SaaS
+ - Per-customer budget limits
+ - Tiered pricing enforcement
+ - Usage-based billing
+
+ ### 4. AI Agent Systems
+ - Prevent runaway agent loops
+ - Budget-aware task planning
+ - Cost-optimized model selection
+
+ ### 5. Internal Tools
+ - Department-level budgets
+ - Employee usage tracking
+ - Cost allocation and reporting
+
+ ---
+
+ ## Examples
+
+ See the [`examples/`](./examples) directory for complete, runnable examples:
+
+ 1. **[Basic Usage](./examples/1-basic-usage.js)** - Core functionality and budget protection
+ 2. **[Multiple Providers](./examples/2-multiple-providers.js)** - Unified tracking across providers
+ 3. **[Budget Persistence](./examples/3-budget-persistence.js)** - Save and restore state
+ 4. **[Custom Provider](./examples/4-custom-provider.js)** - Add your own LLM provider
+ 5. **[Model Discovery](./examples/5-model-discovery.js)** - Find and compare models
+
+ ---
 
  ## TypeScript Support

- Full type definitions included:
+ TokenFirewall is written in TypeScript and includes full type definitions.

  ```typescript
- import {
+ import {
  createBudgetGuard,
- listAvailableModels,
+ patchGlobalFetch,
+ getBudgetStatus,
  BudgetGuardOptions,
+ BudgetStatus,
  ModelInfo,
- ListModelsOptions
+ ProviderAdapter,
+ ModelPricing
  } from "tokenfirewall";

+ // Full type safety
  const options: BudgetGuardOptions = {
  monthlyLimit: 100,
  mode: "block"
  };

- const models: ModelInfo[] = await listAvailableModels({
- provider: "openai",
- apiKey: process.env.OPENAI_API_KEY!
- });
+ createBudgetGuard(options);
+ patchGlobalFetch();
+
+ const status: BudgetStatus | null = getBudgetStatus();
  ```

- ## Architecture
+ ---

+ ## Error Handling
+
+ TokenFirewall provides clear, actionable error messages:
+
+ ```javascript
+ try {
+ const response = await fetch(/* ... */);
+ } catch (error) {
+ if (error.message.includes("TokenFirewall: Budget exceeded")) {
+ // Budget limit reached
+ console.error("Monthly budget exhausted");
+ // Notify user, upgrade prompt, etc.
+ } else if (error.message.includes("TokenFirewall: Cost must be")) {
+ // Invalid cost calculation (should not happen in normal use)
+ console.error("Internal error:", error.message);
+ } else {
+ // Other errors (network, API, etc.)
+ console.error("API error:", error.message);
+ }
+ }
  ```
- tokenfirewall/
- ├── core/ # Provider-agnostic logic
- ├── adapters/ # Provider-specific normalization
- ├── interceptors/ # Request/response capture
- ├── introspection/ # Model discovery
- └── registry/ # Adapter management
+
+ **Common Errors:**
+
+ | Error Message | Cause | Solution |
+ |---------------|-------|----------|
+ | `Budget exceeded! Would spend $X of $Y limit` | Budget limit reached | Increase limit or wait for reset |
+ | `monthlyLimit must be a valid number` | Invalid budget configuration | Provide a positive number |
+ | `Cost must be a valid number` | Internal error | Report as bug |
+ | `No pricing found for model "X"` | Unknown model | Register custom pricing |
+ | `Cannot import budget state - no budget guard exists` | Import before create | Call `createBudgetGuard()` first |
+
+ ---
+
+ ## Best Practices
+
+ ### 1. Initialize Early
+ ```javascript
+ // At application startup
+ createBudgetGuard({ monthlyLimit: 100, mode: "block" });
+ patchGlobalFetch();
  ```

- Adding a new provider requires only creating an adapter file - no core changes needed.
+ ### 2. Use Warn Mode in Development
+ ```javascript
+ const mode = process.env.NODE_ENV === "production" ? "block" : "warn";
+ createBudgetGuard({ monthlyLimit: 100, mode });
+ ```

- ## Examples
+ ### 3. Persist Budget State
+ ```javascript
+ // Save on exit
+ process.on("beforeExit", () => {
+ const state = exportBudgetState();
+ if (state) saveToDatabase(state);
+ });
+ ```

- See the `examples/` directory for complete working examples:
+ ### 4. Monitor Usage
+ ```javascript
+ // Alert at 80% usage
+ const status = getBudgetStatus();
+ if (status && status.percentageUsed > 80) {
+ await sendAlert("Budget usage high!");
+ }
+ ```

- - `basic-usage.js` - Simple OpenAI example
- - `multiple-providers.js` - Track multiple providers
- - `with-sdk.js` - Use with official SDKs
- - `model-discovery.js` - Model discovery
- - `context-aware-routing.js` - Intelligent routing
- - `custom-provider.js` - Add custom provider
- - `gemini-complete-demo.js` - Complete Gemini demo
+ ### 5. Reset Monthly
+ ```javascript
+ // Automated monthly reset
+ const cron = require("node-cron");
+ cron.schedule("0 0 1 * *", () => {
+ resetBudget();
+ console.log("Budget reset for new month");
+ });
+ ```

- ## Best Practices
+ ---

- 1. **Set realistic budgets**: Start with a conservative limit
- 2. **Use warn mode in development**: Switch to block in production
- 3. **Reset monthly**: Automate budget resets with cron
- 4. **Cache model lists**: Model availability doesn't change often
- 5. **Monitor logs**: Review structured JSON output regularly
+ ## Contributing

- ## Limitations
+ Contributions are welcome! Please see [CONTRIBUTING.md](./CONTRIBUTING.md) for guidelines.

- - In-memory tracking only (no persistence in V1)
- - No streaming support yet
- - Context limits are static (not from provider APIs)
- - Budget tracking is local only (not provider-side billing)
+ ---

  ## License

- MIT
+ MIT © [Ruthwik](https://github.com/Ruthwik000)

- ## Contributing
+ ---
+
+ ## Links

- Contributions welcome! Please open an issue or PR.
+ - **GitHub:** https://github.com/Ruthwik000/tokenfirewall
+ - **npm:** https://www.npmjs.com/package/tokenfirewall
+ - **Issues:** https://github.com/Ruthwik000/tokenfirewall/issues
+ - **Documentation:** [API.md](./API.md)
+ - **Changelog:** [CHANGELOG.md](./CHANGELOG.md)
+
+ ---

  ## Support

- For issues and questions, please open a GitHub issue.
+ If you find TokenFirewall useful, please:
+ - ⭐ Star the repository
+ - 🐛 Report bugs and issues
+ - 💡 Suggest new features
+ - 📖 Improve documentation
+ - 🔀 Submit pull requests
+
+ ---
+
+ **Built with ❤️ for the AI developer community**
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "tokenfirewall",
- "version": "1.0.1",
+ "version": "1.0.2",
  "description": "Scalable, adapter-driven LLM cost enforcement middleware for Node.js with model discovery and context intelligence",
  "main": "dist/index.js",
  "types": "dist/index.d.ts",