@salimassili/ai-costguard 1.2.0 → 2.0.0

Files changed (75)
  1. package/CHANGELOG.md +53 -0
  2. package/LICENSE +21 -0
  3. package/README.md +281 -103
  4. package/benchmarks/run.mjs +229 -0
  5. package/dist/cli.d.ts +50 -0
  6. package/dist/cli.d.ts.map +1 -0
  7. package/dist/cli.js +178 -0
  8. package/dist/cli.js.map +1 -0
  9. package/dist/core/CostGuard.d.ts +3 -4
  10. package/dist/core/CostGuard.d.ts.map +1 -1
  11. package/dist/core/CostGuard.js +1 -2
  12. package/dist/core/CostGuard.js.map +1 -1
  13. package/dist/core/GuardCore.d.ts +93 -13
  14. package/dist/core/GuardCore.d.ts.map +1 -1
  15. package/dist/core/GuardCore.js +372 -158
  16. package/dist/core/GuardCore.js.map +1 -1
  17. package/dist/core/GuardFree.d.ts +42 -18
  18. package/dist/core/GuardFree.d.ts.map +1 -1
  19. package/dist/core/GuardFree.js +95 -140
  20. package/dist/core/GuardFree.js.map +1 -1
  21. package/dist/core/GuardPro.d.ts +85 -5
  22. package/dist/core/GuardPro.d.ts.map +1 -1
  23. package/dist/core/GuardPro.js +216 -121
  24. package/dist/core/GuardPro.js.map +1 -1
  25. package/dist/core/event-log.d.ts +37 -0
  26. package/dist/core/event-log.d.ts.map +1 -0
  27. package/dist/core/event-log.js +49 -0
  28. package/dist/core/event-log.js.map +1 -0
  29. package/dist/core/events.d.ts +20 -0
  30. package/dist/core/events.d.ts.map +1 -0
  31. package/dist/core/events.js +46 -0
  32. package/dist/core/events.js.map +1 -0
  33. package/dist/core/similarity.d.ts +13 -0
  34. package/dist/core/similarity.d.ts.map +1 -0
  35. package/dist/core/similarity.js +51 -0
  36. package/dist/core/similarity.js.map +1 -0
  37. package/dist/core/tokenizer.d.ts +18 -0
  38. package/dist/core/tokenizer.d.ts.map +1 -0
  39. package/dist/core/tokenizer.js +137 -0
  40. package/dist/core/tokenizer.js.map +1 -0
  41. package/dist/core/types.d.ts +153 -5
  42. package/dist/core/types.d.ts.map +1 -1
  43. package/dist/core/types.js +0 -3
  44. package/dist/core/types.js.map +1 -1
  45. package/dist/core/webhooks.d.ts +15 -0
  46. package/dist/core/webhooks.d.ts.map +1 -0
  47. package/dist/core/webhooks.js +58 -0
  48. package/dist/core/webhooks.js.map +1 -0
  49. package/dist/dashboard.d.ts +73 -0
  50. package/dist/dashboard.d.ts.map +1 -0
  51. package/dist/dashboard.js +201 -0
  52. package/dist/dashboard.js.map +1 -0
  53. package/dist/index.d.ts +3 -4
  54. package/dist/index.d.ts.map +1 -1
  55. package/dist/index.js +1 -2
  56. package/dist/index.js.map +1 -1
  57. package/dist/pricing/index.d.ts +19 -2
  58. package/dist/pricing/index.d.ts.map +1 -1
  59. package/dist/pricing/index.js +93 -13
  60. package/dist/pricing/index.js.map +1 -1
  61. package/dist/pro.d.ts +3 -0
  62. package/dist/pro.d.ts.map +1 -0
  63. package/dist/pro.js +2 -0
  64. package/dist/pro.js.map +1 -0
  65. package/docs/BENCHMARKS.md +51 -0
  66. package/docs/DASHBOARD.md +61 -0
  67. package/docs/INTEGRATIONS.md +153 -0
  68. package/examples/integrations/anthropic-workflow-budget.mjs +36 -0
  69. package/examples/integrations/ci-budget-check.mjs +32 -0
  70. package/examples/integrations/crewai-budget-gate.mjs +31 -0
  71. package/examples/integrations/langchain-retry-storm.mjs +32 -0
  72. package/examples/integrations/mastra-agent.mjs +41 -0
  73. package/examples/integrations/openai-agent-loop.mjs +44 -0
  74. package/examples/integrations/vercel-ai-chatbot.mjs +29 -0
  75. package/package.json +35 -7
package/CHANGELOG.md ADDED
@@ -0,0 +1,53 @@
+ # Changelog
+
+ ## 2.0.0 - Unreleased
+
+ ### Changed
+
+ - Moved Redis-backed `GuardPro` exports to `@salimassili/ai-costguard/pro` so the root import stays lightweight.
+ - Removed fake local license enforcement from `GuardPro`; `licenseKey` and `validateLicense()` are compatibility helpers only.
+ - Unknown models now block by default unless runtime pricing, guard pricing overrides, or explicit fallback pricing is configured.
+ - Guard proxy now checks known AI SDK method paths instead of charging every function call on the wrapped client.
+ - Loop detection now requires repeated similar prompts in the same scope before blocking.
+ - Retry detection now requires stronger retry/failure signals to reduce false positives.
+ - Prompt and retry histories are scoped and TTL-bound.
+
+ ### Added
+
+ - `guardFunction()` for Vercel AI SDK, LangChain, Mastra-style, CrewAI launcher, and other function-style integrations.
+ - Local JSONL event logging and `ai-costguard dashboard` / `aifw dashboard` for local-only visibility.
+ - Mocked runnable integration examples for OpenAI, Anthropic, Vercel AI SDK, LangChain, Mastra, CrewAI, and CI checks.
+ - Local benchmark script and benchmark documentation.
+ - Structured `GuardError.code` and `GuardError.metadata`.
+ - Scoped accounting fields for attempted, allowed, blocked, and reconciled actual cost.
+ - CLI custom pricing flags for private/custom models.
+ - `/pricing` package subpath export.
+ - Repository smoke checks for examples, templates, package exports, and stale claims.
+
+ ### Removed
+
+ - Active root docs and templates for unimplemented proxy/dashboard/SaaS features.
+ - Unused postinstall helper, stale ESLint config, and stale npm ignore file.
+
+ ## 1.2.0 - 2026-05-28
+
+ ### Changed
+
+ - Rebuilt the package around a strict ESM TypeScript core.
+ - Replaced the old character-count token heuristic with an inline BPE-style estimator.
+ - Replaced exact prompt matching with character trigram cosine similarity loop detection at the default `0.85` threshold.
+ - Reworked `GuardPro` with pooled Redis connections, TTL-based spend windows, and local fallback when Redis is unavailable.
+ - Rewrote the README to describe only shipped behavior.
+
+ ### Added
+
+ - `guard.on('block' | 'allow' | 'cost', callback)` event hooks.
+ - Optional Slack and Discord block webhooks with exponential backoff and silent failure.
+ - `aifw check --budget --model --tokens --max-steps` CLI for CI budget checks.
+ - Stale pricing warnings for entries older than 30 days.
+ - Node-native unit and integration tests for GuardCore, GuardFree, GuardPro, middleware, pricing, token estimation, webhooks, and CLI behavior.
+
+ ### Removed
+
+ - Removed stale Jest configuration and CommonJS-era test setup.
+ - Removed README claims about dashboards, hosted monitoring, and proxy features that are not shipped in this package.
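Note on the changelog above: the 1.2.0 entry describes loop detection based on character trigram cosine similarity with a default `0.85` threshold, and the 2.0.0 entry says blocking now requires repeated similar prompts. The package's own implementation (`dist/core/similarity.js`) is not shown in this part of the diff, so the following is only a minimal TypeScript sketch of the general technique. The names `trigramCounts`, `cosineSimilarity`, and `looksLikeLoop`, the normalization step, and the `minRepeats` default are assumptions made for illustration, not the package's API.

```typescript
// Illustrative sketch only. Function names and normalization choices are
// hypothetical and are not taken from the package's similarity.js.

/** Count overlapping character trigrams in a lowercased, whitespace-collapsed string. */
function trigramCounts(text: string): Map<string, number> {
  const normalized = text.toLowerCase().replace(/\s+/g, ' ').trim();
  const counts = new Map<string, number>();
  for (let i = 0; i + 3 <= normalized.length; i++) {
    const gram = normalized.slice(i, i + 3);
    counts.set(gram, (counts.get(gram) || 0) + 1);
  }
  return counts;
}

/** Cosine similarity of the two trigram count vectors; 0 when either string has no trigrams. */
function cosineSimilarity(a: string, b: string): number {
  const va = trigramCounts(a);
  const vb = trigramCounts(b);
  let dot = 0;
  let normA = 0;
  let normB = 0;
  va.forEach((count, gram) => {
    normA += count * count;
    dot += count * (vb.get(gram) || 0);
  });
  vb.forEach((count) => {
    normB += count * count;
  });
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

/** True when at least `minRepeats` earlier prompts score at or above `threshold` against `prompt`. */
function looksLikeLoop(
  prompt: string,
  history: string[],
  threshold = 0.85,
  minRepeats = 2,
): boolean {
  const similar = history.filter((past) => cosineSimilarity(prompt, past) >= threshold);
  return similar.length >= minRepeats;
}
```

In this sketch, near-duplicates that differ only in case, spacing, or a trailing character score close to 1, while unrelated prompts share few trigrams and score near 0. A real guard would also scope and expire the `history` array, which the 2.0.0 entry describes as scoped and TTL-bound.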
package/LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2026 Salim Assili
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
package/README.md CHANGED
@@ -1,19 +1,8 @@
  # AI CostGuard

- AI CostGuard is a small TypeScript library that wraps OpenAI-like clients and blocks requests before they run when local safety checks predict unsafe AI API spend.
+ AI CostGuard is a local-first runtime safety layer for AI agents that prevents runaway costs, loops, retries, and budget explosions before API calls execute. It wraps OpenAI-compatible clients and function-style SDK calls, estimates request cost locally, blocks budget overruns, detects repeated prompts, emits structured events, and exposes CLI checks plus a local dashboard.

- It is ESM-only, targets Node.js 18+, and is built with `tsc`.
-
- ## What Works Today
-
- - `guard()` wraps a client with budget, loop, and retry protection.
- - `GuardError` is thrown when a request is blocked.
- - Budget blocking estimates request cost before the API call.
- - Loop detection blocks repeated prompts within the current process.
- - Retry detection blocks repeated failure/retry prompts within the current process.
- - `middleware()` adds the same local checks to web request flows.
- - `getPricing()` returns known built-in model pricing.
- - `registerPricing()` and `listPricing()` let you manage runtime pricing entries.
+ It is local-first. It does not include a SaaS control plane, cloud dashboard, proxy gateway, telemetry service, billing reconciliation service, or hard security boundary.

  ## Install

@@ -21,79 +10,248 @@ It is ESM-only, targets Node.js 18+, and is built with `tsc`.
  npm install @salimassili/ai-costguard
  ```

- ## Basic Usage
+ ## Quick Start

  ```ts
  import OpenAI from 'openai';
  import { guard, GuardError } from '@salimassili/ai-costguard';

- const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
- const ai = guard(client, { budget: 1 });
+ const openai = guard(new OpenAI({ apiKey: process.env.OPENAI_API_KEY }), {
+   budget: 5,
+   maxSteps: 50,
+   scope: { projectId: 'my-app' },
+ });

  try {
-   const response = await ai.chat.completions.create({
+   const response = await openai.chat.completions.create({
      model: 'gpt-4o-mini',
-     messages: [{ role: 'user', content: 'Write a short project summary.' }],
-     max_tokens: 200
+     messages: [{ role: 'user', content: 'Write a short summary.' }],
+     max_tokens: 200,
    });

-   console.log(response);
+   console.log(response.choices[0]?.message?.content);
  } catch (error) {
    if (error instanceof GuardError) {
-     console.error('AI request blocked:', error.message, error.context);
+     console.error(error.code, error.message, error.context);
    } else {
      throw error;
    }
  }
  ```

- ## How `guard()` Works
+ ## What It Guards
+
+ By default AI CostGuard evaluates these SDK method paths:

- `guard(client, config)` returns a `Proxy` around your client. When code calls a method such as `client.chat.completions.create(...)`, CostGuard:
+ - `chat.completions.create`
+ - `completions.create`
+ - `responses.create`
+ - `messages.create`

- 1. Reads the request model, messages, and `max_tokens`.
- 2. Estimates input tokens from message length and combines them with the requested output limit.
- 3. Looks up pricing for the model.
- 4. Estimates the request cost.
- 5. Blocks the call with `GuardError` if the local budget would be exceeded.
- 6. Blocks repeated prompts that look like loops.
- 7. Blocks repeated prompts that look like retry storms.
- 8. Lets the original client method run when checks pass.
+ Other client methods are passed through without cost checks. To protect a custom client method:

- The free guard state is process-local. Separate Node.js processes do not share budget state.
+ ```ts
+ const client = guard(customClient, {
+   budget: 2,
+   guardedMethods: ['agent.run'],
+   pricingOverrides: [
+     {
+       model: 'internal-model',
+       inputPer1kTokens: 0.001,
+       outputPer1kTokens: 0.002,
+       lastUpdated: '2026-06-07',
+       source: 'internal pricing sheet',
+     },
+   ],
+ });
+ ```

- ## Budget Blocking
+ For function-style SDKs such as Vercel AI SDK adapters, LangChain wrappers, or agent runners:

  ```ts
- import { guard } from '@salimassili/ai-costguard';
+ import { guardFunction } from '@salimassili/ai-costguard';

- const ai = guard(openai, { budget: 0.25 });
+ const guardedGenerateText = guardFunction(generateTextAdapter, {
+   budget: 1,
+   scope: { projectId: 'chatbot' },
+ });

- await ai.chat.completions.create({
+ await guardedGenerateText({
    model: 'gpt-4o-mini',
-   messages: [{ role: 'user', content: 'Hello' }],
-   max_tokens: 100
+   prompt: 'Answer the user in one paragraph.',
+   max_tokens: 200,
  });
  ```

- When the estimated cumulative spend in the current process would exceed `budget`, CostGuard throws `GuardError` before calling the underlying AI client.
+ ## Decisions And Errors
+
+ Blocked requests throw `GuardError` before the provider method is called.

- ## Loop And Retry Detection
+ ```ts
+ try {
+   await openai.chat.completions.create(request);
+ } catch (error) {
+   if (error instanceof GuardError) {
+     console.log(error.code);
+     console.log(error.metadata);
+   }
+ }
+ ```

- CostGuard keeps a short in-memory history of recent prompts for the wrapped client:
+ Current runtime block codes:

- - Loop detection blocks repeated prompt hashes.
- - Retry detection blocks repeated prompts containing retry/failure language such as `retry`, `again`, `repeat`, `error`, `fail`, or `timeout`.
+ - `UNKNOWN_MODEL`
+ - `BUDGET_EXCEEDED`
+ - `MAX_STEPS_EXCEEDED`
+ - `LOOP_DETECTED`
+ - `RETRY_STORM_DETECTED`

- These checks are intentionally local and lightweight.
+ `INVALID_LICENSE` remains in the exported type for compatibility with older callers, but the current Pro helper does not enforce local license checks.

- ## Middleware
+ ## Configuration

  ```ts
- import express from 'express';
- import { middleware, GuardError } from '@salimassili/ai-costguard';
+ guard(client, {
+   budget: 10,
+   maxSteps: 100,
+   behaviorAnalysis: true,
+   maxHistory: 32,
+   historyTtlMs: 5 * 60 * 1000,
+   loopSimilarityThreshold: 0.85,
+   loopMinRepeats: 2,
+   retryThreshold: 2,
+   scope: {
+     projectId: 'production-api',
+     userId: 'optional-user',
+     sessionId: 'optional-agent-run',
+   },
+   guardedMethods: ['chat.completions.create', 'responses.create'],
+   pricingOverrides: [],
+   webhooks: {
+     slack: process.env.SLACK_WEBHOOK,
+     discord: process.env.DISCORD_WEBHOOK,
+     retries: 2,
+     timeoutMs: 1500,
+   },
+   eventLogPath: '.ai-costguard/events.jsonl',
+   eventLogPrompt: 'none',
+ });
+ ```
+
+ `scope` isolates budgets and behavior history. If no scope is supplied, the guard uses one process-local default scope.
+
+ ## Accounting Semantics

- const app = express();
+ AI CostGuard is a pre-call estimator, not a billing ledger.
+
+ - `attemptedCost`: estimated cost of every guarded attempt, including blocked attempts.
+ - `totalCost`: estimated cost of allowed calls.
+ - `blockedCost`: estimated cost stopped before provider execution.
+ - `actualCost`: provider-reported usage cost when the response includes recognizable `usage` fields.
+
+ Budget decisions use estimated allowed cost. Actual usage is recorded for observability but does not rewrite earlier decisions.
+
+ ## Pricing
+
+ Known model pricing comes from built-in registry entries, runtime registrations, or per-guard overrides. Unknown models are blocked by default.
+
+ ```ts
+ import { registerPricing } from '@salimassili/ai-costguard';
+
+ registerPricing([
+   {
+     model: 'my-company-model',
+     inputPer1kTokens: 0.001,
+     outputPer1kTokens: 0.002,
+     lastUpdated: '2026-06-07',
+     source: 'internal',
+   },
+ ]);
+ ```
+
+ If you intentionally want fallback pricing for unknown models:
+
+ ```ts
+ guard(client, {
+   budget: 5,
+   unknownModelPolicy: 'fallback',
+   unknownModelPricing: {
+     model: 'fallback',
+     inputPer1kTokens: 0.001,
+     outputPer1kTokens: 0.002,
+     lastUpdated: '2026-06-07',
+     source: 'application fallback',
+   },
+ });
+ ```
+
+ Pricing changes frequently. Verify provider pricing before production use and override entries when needed.
+
+ ## Events
+
+ ```ts
+ const unsubscribe = openai.on('block', (event) => {
+   console.log(event.code, event.reason, event.context.estimatedCost);
+ });
+
+ unsubscribe();
+ ```
+
+ Supported events are `cost`, `allow`, and `block`. Handler errors are swallowed so observability code cannot change guard decisions.
+
+ ## Local Dashboard
+
+ Opt into a local JSONL event log:
+
+ ```ts
+ const openai = guard(client, {
+   budget: 5,
+   eventLogPath: '.ai-costguard/events.jsonl',
+ });
+ ```
+
+ Start the local-only dashboard:
+
+ ```bash
+ ai-costguard dashboard --events .ai-costguard/events.jsonl --budget 5
+ ```
+
+ For one-off package execution:
+
+ ```bash
+ npx @salimassili/ai-costguard dashboard --events .ai-costguard/events.jsonl --budget 5
+ ```
+
+ If the package is installed locally, `npx ai-costguard dashboard` also works. The dashboard binds to `127.0.0.1` by default and reads only local event files.
+
+ For CI or terminal output:
+
+ ```bash
+ ai-costguard dashboard --events .ai-costguard/events.jsonl --budget 5 --once --json
+ ```
+
+ See `docs/DASHBOARD.md`.
+
+ ## Integrations
+
+ Runnable mocked examples are included for:
+
+ - OpenAI SDK agent loop protection
+ - Anthropic SDK workflow budget guard
+ - Vercel AI SDK chatbot budget cap
+ - LangChain retry-storm prevention
+ - Mastra-style agent runner protection
+ - CrewAI launch/budget gate
+ - CI budget checks
+
+ See `docs/INTEGRATIONS.md` and `examples/integrations`.
+
+ ## Express Middleware
+
+ The middleware attaches a manual checker. It does not automatically parse or inspect every route.
+
+ ```ts
+ import { middleware, GuardError } from '@salimassili/ai-costguard';

  app.use(middleware({ budget: 2 }));

@@ -101,101 +259,121 @@ app.post('/chat', async (req, res, next) => {
    try {
      req.localSafety.check({
        model: 'gpt-4o-mini',
-       tokens: 1000,
-       estimatedCost: 0.001,
+       tokens: 500,
+       inputTokens: 100,
+       outputTokens: 400,
+       estimatedCost: 0.0003,
        timestamp: Date.now(),
-       prompt: req.body?.prompt ?? ''
+       prompt: String(req.body?.prompt ?? ''),
      });

      res.json({ ok: true });
    } catch (error) {
      if (error instanceof GuardError) {
-       res.status(402).json({ error: error.message, context: error.context });
+       res.status(403).json({ code: error.code, reason: error.message });
        return;
      }
-
      next(error);
    }
  });
  ```

- ## Pricing
+ ## Optional Redis / Pro Helper

- ```ts
- import { getPricing, listPricing, registerPricing } from '@salimassili/ai-costguard';
+ Redis-backed shared spend tracking is isolated behind a subpath import:

- console.log(getPricing('gpt-4o-mini'));
+ ```ts
+ import { GuardPro } from '@salimassili/ai-costguard/pro';

- registerPricing([
-   {
-     model: 'custom-model',
-     inputPer1kTokens: 0.001,
-     outputPer1kTokens: 0.002,
-     lastUpdated: '2026-05-21',
-     source: 'internal'
-   }
- ]);
+ const pro = new GuardPro({
+   redisUrl: process.env.REDIS_URL ?? '',
+   budget: 25,
+   windowSeconds: 86400,
+ });

- console.log(listPricing());
+ await pro.checkAndCharge('production', 0.0042);
+ await pro.shutdown();
  ```

- `getPricing(model)` returns an exact match when available, then falls back to simple fuzzy matching. Unknown models return `undefined`.
+ `ioredis` is an optional dependency and is not loaded by the root import.

- ## Pro Features (Coming Soon)
+ `licenseKey` is accepted as a deprecated compatibility field only. AI CostGuard does not enforce commercial licenses locally, and `validateLicense()` is a format sanity helper, not security.

- > These features are under active development and not yet available:
- > - Distributed Redis-backed budget state
- > - Real Slack/Discord webhook alerts
- > - Multi-instance coordination
- > - Production license validation
+ ## CLI

- ## API
+ ```bash
+ aifw check --budget 1 --model gpt-4o-mini --input-tokens 500 --tokens 1000 --max-steps 5
+ ```

- ### `guard(client, config)`
+ The package also installs an `ai-costguard` bin alias:

- Wraps an OpenAI-like client.
+ ```bash
+ ai-costguard check --budget 1 --model gpt-4o-mini --tokens 1000 --max-steps 5
+ ai-costguard dashboard --events .ai-costguard/events.jsonl --budget 5
+ ```

- ```ts
- guard(client, { budget: 10 });
+ For custom models:
+
+ ```bash
+ aifw check --budget 1 --model internal-model --tokens 1000 --input-price-per-1k 0.001 --output-price-per-1k 0.002
  ```

- ### `GuardError`
+ Exit codes:

- Thrown when CostGuard blocks a request.
+ - `0`: projected cost is within budget
+ - `1`: projected cost exceeds budget
+ - `2`: usage/config error

- ```ts
- try {
-   await ai.chat.completions.create(params);
- } catch (error) {
-   if (error instanceof GuardError) {
-     console.log(error.context);
-   }
- }
- ```
+ ## Benchmarks
+
+ Run local benchmarks:

- ### `middleware(config)`
+ ```bash
+ npm run build
+ npm run benchmark
+ ```

- Creates request middleware with local budget, loop, and retry checks.
+ The script reports runtime overhead, approximate heap delta, false-positive scenarios, loop detection behavior, and cost-estimation boundaries. Results are local measurements, not universal guarantees. See `docs/BENCHMARKS.md`.

- ### `getPricing(model, overrides?)`
+ Latest local benchmark in this repo on Node `v24.14.1` / Windows measured `0.020691 ms` added per mocked guarded call over `5000` iterations. Re-run on your target runtime before using this number in performance-sensitive claims.

- Returns pricing for a model from overrides, runtime registrations, or built-in entries.
+ ## Why Not 50 Lines Of Code?

- ### `registerPricing(entries)`
+ A simple homemade budget check can stop one request after one counter crosses one number. AI CostGuard packages the parts that usually become messy once agents enter production:

- Registers or replaces runtime pricing entries by model name.
+ - Provider pricing registry with runtime overrides and unknown-model blocking.
+ - Structured `GuardError` codes and metadata for API responses.
+ - Scoped budget and behavior state per project, user, or session.
+ - TTL-bounded prompt history.
+ - Loop and retry-storm detection.
+ - Estimated, attempted, blocked, and actual usage accounting.
+ - Method filtering so non-AI SDK calls are not charged.
+ - Event hooks, best-effort webhooks, JSONL event logs, and local dashboard visibility.
+ - CI budget checks and runnable integration examples.

- ### `listPricing()`
+ ## Development

- Returns built-in and runtime pricing entries, deduplicated by model name.
+ ```bash
+ npm ci
+ npm run build
+ npm run typecheck
+ npm test
+ npm run smoke
+ npm run benchmark
+ npm audit --omit=dev
+ npm pack --dry-run
+ ```

  ## Limitations

- - Free guard state is stored in memory only.
- - Budget checks are estimates, not billing records.
- - Token estimation is approximate.
- - Pricing entries are static until the package or runtime registry is updated.
- - The library does not include dashboards, analytics, governance workflows, or hosted services.
+ - Token counting is approximate and dependency-free.
+ - Pricing entries can become stale; override them for production.
+ - The free guard is process-local.
+ - Loop detection uses character trigram similarity, not embeddings.
+ - Retry detection is heuristic.
+ - Webhooks are best-effort and never affect enforcement.
+ - The dashboard reads local JSONL logs only; it is not a hosted analytics product.
+ - Provider usage reconciliation only works when responses expose recognizable `usage` fields.

  ## License