@salimassili/ai-costguard 1.1.9 → 2.0.0
- package/CHANGELOG.md +53 -0
- package/LICENSE +21 -0
- package/README.md +294 -147
- package/benchmarks/run.mjs +229 -0
- package/dist/cli.d.ts +50 -0
- package/dist/cli.d.ts.map +1 -0
- package/dist/cli.js +178 -0
- package/dist/cli.js.map +1 -0
- package/dist/core/CostGuard.d.ts +5 -12
- package/dist/core/CostGuard.d.ts.map +1 -1
- package/dist/core/CostGuard.js +2 -11
- package/dist/core/CostGuard.js.map +1 -1
- package/dist/core/GuardCore.d.ts +93 -16
- package/dist/core/GuardCore.d.ts.map +1 -1
- package/dist/core/GuardCore.js +371 -166
- package/dist/core/GuardCore.js.map +1 -1
- package/dist/core/GuardFree.d.ts +42 -22
- package/dist/core/GuardFree.d.ts.map +1 -1
- package/dist/core/GuardFree.js +95 -150
- package/dist/core/GuardFree.js.map +1 -1
- package/dist/core/GuardPro.d.ts +85 -87
- package/dist/core/GuardPro.d.ts.map +1 -1
- package/dist/core/GuardPro.js +229 -235
- package/dist/core/GuardPro.js.map +1 -1
- package/dist/core/event-log.d.ts +37 -0
- package/dist/core/event-log.d.ts.map +1 -0
- package/dist/core/event-log.js +49 -0
- package/dist/core/event-log.js.map +1 -0
- package/dist/core/events.d.ts +20 -0
- package/dist/core/events.d.ts.map +1 -0
- package/dist/core/events.js +46 -0
- package/dist/core/events.js.map +1 -0
- package/dist/core/similarity.d.ts +13 -0
- package/dist/core/similarity.d.ts.map +1 -0
- package/dist/core/similarity.js +51 -0
- package/dist/core/similarity.js.map +1 -0
- package/dist/core/tokenizer.d.ts +18 -0
- package/dist/core/tokenizer.d.ts.map +1 -0
- package/dist/core/tokenizer.js +137 -0
- package/dist/core/tokenizer.js.map +1 -0
- package/dist/core/types.d.ts +153 -5
- package/dist/core/types.d.ts.map +1 -1
- package/dist/core/types.js +0 -3
- package/dist/core/types.js.map +1 -1
- package/dist/core/webhooks.d.ts +15 -0
- package/dist/core/webhooks.d.ts.map +1 -0
- package/dist/core/webhooks.js +58 -0
- package/dist/core/webhooks.js.map +1 -0
- package/dist/dashboard.d.ts +73 -0
- package/dist/dashboard.d.ts.map +1 -0
- package/dist/dashboard.js +201 -0
- package/dist/dashboard.js.map +1 -0
- package/dist/index.d.ts +5 -15
- package/dist/index.d.ts.map +1 -1
- package/dist/index.js +2 -14
- package/dist/index.js.map +1 -1
- package/dist/pricing/index.d.ts +28 -0
- package/dist/pricing/index.d.ts.map +1 -0
- package/dist/pricing/index.js +183 -0
- package/dist/pricing/index.js.map +1 -0
- package/dist/pro.d.ts +3 -0
- package/dist/pro.d.ts.map +1 -0
- package/dist/pro.js +2 -0
- package/dist/pro.js.map +1 -0
- package/docs/BENCHMARKS.md +51 -0
- package/docs/DASHBOARD.md +61 -0
- package/docs/INTEGRATIONS.md +153 -0
- package/examples/integrations/anthropic-workflow-budget.mjs +36 -0
- package/examples/integrations/ci-budget-check.mjs +32 -0
- package/examples/integrations/crewai-budget-gate.mjs +31 -0
- package/examples/integrations/langchain-retry-storm.mjs +32 -0
- package/examples/integrations/mastra-agent.mjs +41 -0
- package/examples/integrations/openai-agent-loop.mjs +44 -0
- package/examples/integrations/vercel-ai-chatbot.mjs +29 -0
- package/package.json +44 -16
- package/dist/core/AgentFailureKernel.d.ts +0 -52
- package/dist/core/AgentFailureKernel.d.ts.map +0 -1
- package/dist/core/AgentFailureKernel.js +0 -357
- package/dist/core/AgentFailureKernel.js.map +0 -1
package/CHANGELOG.md
ADDED
@@ -0,0 +1,53 @@
+# Changelog
+
+## 2.0.0 - Unreleased
+
+### Changed
+
+- Moved Redis-backed `GuardPro` exports to `@salimassili/ai-costguard/pro` so the root import stays lightweight.
+- Removed fake local license enforcement from `GuardPro`; `licenseKey` and `validateLicense()` are compatibility helpers only.
+- Unknown models now block by default unless runtime pricing, guard pricing overrides, or explicit fallback pricing is configured.
+- Guard proxy now checks known AI SDK method paths instead of charging every function call on the wrapped client.
+- Loop detection now requires repeated similar prompts in the same scope before blocking.
+- Retry detection now requires stronger retry/failure signals to reduce false positives.
+- Prompt and retry histories are scoped and TTL-bound.
+
+### Added
+
+- `guardFunction()` for Vercel AI SDK, LangChain, Mastra-style, CrewAI launcher, and other function-style integrations.
+- Local JSONL event logging and `ai-costguard dashboard` / `aifw dashboard` for local-only visibility.
+- Mocked runnable integration examples for OpenAI, Anthropic, Vercel AI SDK, LangChain, Mastra, CrewAI, and CI checks.
+- Local benchmark script and benchmark documentation.
+- Structured `GuardError.code` and `GuardError.metadata`.
+- Scoped accounting fields for attempted, allowed, blocked, and reconciled actual cost.
+- CLI custom pricing flags for private/custom models.
+- `/pricing` package subpath export.
+- Repository smoke checks for examples, templates, package exports, and stale claims.
+
+### Removed
+
+- Active root docs and templates for unimplemented proxy/dashboard/SaaS features.
+- Unused postinstall helper, stale ESLint config, and stale npm ignore file.
+
+## 1.2.0 - 2026-05-28
+
+### Changed
+
+- Rebuilt the package around a strict ESM TypeScript core.
+- Replaced the old character-count token heuristic with an inline BPE-style estimator.
+- Replaced exact prompt matching with character trigram cosine similarity loop detection at the default `0.85` threshold.
+- Reworked `GuardPro` with pooled Redis connections, TTL-based spend windows, and local fallback when Redis is unavailable.
+- Rewrote the README to describe only shipped behavior.
+
+### Added
+
+- `guard.on('block' | 'allow' | 'cost', callback)` event hooks.
+- Optional Slack and Discord block webhooks with exponential backoff and silent failure.
+- `aifw check --budget --model --tokens --max-steps` CLI for CI budget checks.
+- Stale pricing warnings for entries older than 30 days.
+- Node-native unit and integration tests for GuardCore, GuardFree, GuardPro, middleware, pricing, token estimation, webhooks, and CLI behavior.
+
+### Removed
+
+- Removed stale Jest configuration and CommonJS-era test setup.
+- Removed README claims about dashboards, hosted monitoring, and proxy features that are not shipped in this package.
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2026 Salim Assili
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
package/README.md
CHANGED
@@ -1,233 +1,380 @@
-# AI CostGuard
+# AI CostGuard
 
-
+AI CostGuard is a local-first runtime safety layer for AI agents that prevents runaway costs, loops, retries, and budget explosions before API calls execute. It wraps OpenAI-compatible clients and function-style SDK calls, estimates request cost locally, blocks budget overruns, detects repeated prompts, emits structured events, and exposes CLI checks plus a local dashboard.
 
-
+It is local-first. It does not include a SaaS control plane, cloud dashboard, proxy gateway, telemetry service, billing reconciliation service, or hard security boundary.
 
-
+## Install
 
-
+```bash
+npm install @salimassili/ai-costguard
+```
 
-
+## Quick Start
 
-
+```ts
+import OpenAI from 'openai';
+import { guard, GuardError } from '@salimassili/ai-costguard';
 
-
+const openai = guard(new OpenAI({ apiKey: process.env.OPENAI_API_KEY }), {
+  budget: 5,
+  maxSteps: 50,
+  scope: { projectId: 'my-app' },
+});
 
-
+try {
+  const response = await openai.chat.completions.create({
+    model: 'gpt-4o-mini',
+    messages: [{ role: 'user', content: 'Write a short summary.' }],
+    max_tokens: 200,
+  });
+
+  console.log(response.choices[0]?.message?.content);
+} catch (error) {
+  if (error instanceof GuardError) {
+    console.error(error.code, error.message, error.context);
+  } else {
+    throw error;
+  }
+}
+```
 
-
+## What It Guards
 
-
+By default AI CostGuard evaluates these SDK method paths:
 
-
+- `chat.completions.create`
+- `completions.create`
+- `responses.create`
+- `messages.create`
 
-
-import { guard } from '@salimassili/ai-costguard';
+Other client methods are passed through without cost checks. To protect a custom client method:
 
-
+```ts
+const client = guard(customClient, {
+  budget: 2,
+  guardedMethods: ['agent.run'],
+  pricingOverrides: [
+    {
+      model: 'internal-model',
+      inputPer1kTokens: 0.001,
+      outputPer1kTokens: 0.002,
+      lastUpdated: '2026-06-07',
+      source: 'internal pricing sheet',
+    },
+  ],
+});
 ```
 
-
-- Hard budget limits
-- Infinite loops (2x repetition)
-- Retry storms (1x failure pattern)
-- Token explosions (20x spikes)
+For function-style SDKs such as Vercel AI SDK adapters, LangChain wrappers, or agent runners:
 
-
-
-🚨 SMOKE DETECTOR: Infinite loop → saved $18.42
-```
+```ts
+import { guardFunction } from '@salimassili/ai-costguard';
 
-
+const guardedGenerateText = guardFunction(generateTextAdapter, {
+  budget: 1,
+  scope: { projectId: 'chatbot' },
+});
 
-
+await guardedGenerateText({
+  model: 'gpt-4o-mini',
+  prompt: 'Answer the user in one paragraph.',
+  max_tokens: 200,
+});
+```
 
-##
+## Decisions And Errors
 
-
+Blocked requests throw `GuardError` before the provider method is called.
 
 ```ts
-
+try {
+  await openai.chat.completions.create(request);
+} catch (error) {
+  if (error instanceof GuardError) {
+    console.log(error.code);
+    console.log(error.metadata);
+  }
+}
+```
 
-
-slack: { webhook: "webhook_url", channel: "#alerts" }
-});
+Current runtime block codes:
 
-
-
-
-
+- `UNKNOWN_MODEL`
+- `BUDGET_EXCEEDED`
+- `MAX_STEPS_EXCEEDED`
+- `LOOP_DETECTED`
+- `RETRY_STORM_DETECTED`
 
-
-```
+`INVALID_LICENSE` remains in the exported type for compatibility with older callers, but the current Pro helper does not enforce local license checks.
 
-
-- Hard global budget limits (cross-instance)
-- Instant emergency shutdown
-- Global usage caps
-- Real-time spend tracking
-- Slack/Discord panic alerts
-- Distributed coordination (2-second sync)
-- Simple policy rules
+## Configuration
 
-
+```ts
+guard(client, {
+  budget: 10,
+  maxSteps: 100,
+  behaviorAnalysis: true,
+  maxHistory: 32,
+  historyTtlMs: 5 * 60 * 1000,
+  loopSimilarityThreshold: 0.85,
+  loopMinRepeats: 2,
+  retryThreshold: 2,
+  scope: {
+    projectId: 'production-api',
+    userId: 'optional-user',
+    sessionId: 'optional-agent-run',
+  },
+  guardedMethods: ['chat.completions.create', 'responses.create'],
+  pricingOverrides: [],
+  webhooks: {
+    slack: process.env.SLACK_WEBHOOK,
+    discord: process.env.DISCORD_WEBHOOK,
+    retries: 2,
+    timeoutMs: 1500,
+  },
+  eventLogPath: '.ai-costguard/events.jsonl',
+  eventLogPrompt: 'none',
+});
+```
 
-
+`scope` isolates budgets and behavior history. If no scope is supplied, the guard uses one process-local default scope.
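Reviewer note: the scope isolation described in the line above can be pictured with a minimal sketch. This is not the package's implementation; `scopeKey`, `stateFor`, `recordPrompt`, and the `ScopedState` shape are hypothetical names chosen for illustration. The sketch assumes that each distinct combination of `projectId`, `userId`, and `sessionId` gets its own spend counter and its own TTL-bounded prompt history.

```typescript
// Hypothetical sketch of per-scope state; none of these names come from ai-costguard.
interface Scope {
  projectId?: string;
  userId?: string;
  sessionId?: string;
}

interface ScopedState {
  spent: number;
  history: { prompt: string; at: number }[];
}

const states: Record<string, ScopedState> = {};

// Missing fields collapse to '*', so "no scope" maps to one shared default key.
function scopeKey(scope?: Scope): string {
  const s = scope ?? {};
  return [s.projectId, s.userId, s.sessionId].map((part) => part ?? '*').join('/');
}

function stateFor(scope?: Scope): ScopedState {
  const key = scopeKey(scope);
  if (!states[key]) {
    states[key] = { spent: 0, history: [] };
  }
  return states[key];
}

// Drops entries older than ttlMs, appends the new prompt, and caps the history length.
function recordPrompt(
  scope: Scope | undefined,
  prompt: string,
  now: number,
  ttlMs: number,
  maxHistory: number,
): ScopedState {
  const state = stateFor(scope);
  state.history = state.history.filter((entry) => now - entry.at < ttlMs);
  state.history.push({ prompt, at: now });
  if (state.history.length > maxHistory) {
    state.history = state.history.slice(-maxHistory);
  }
  return state;
}
```

Under these assumptions, two guards configured with different `projectId` values never see each other's prompts, while two guards with no scope share the single default entry.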
 
-
+## Accounting Semantics
 
-
+AI CostGuard is a pre-call estimator, not a billing ledger.
 
-
+- `attemptedCost`: estimated cost of every guarded attempt, including blocked attempts.
+- `totalCost`: estimated cost of allowed calls.
+- `blockedCost`: estimated cost stopped before provider execution.
+- `actualCost`: provider-reported usage cost when the response includes recognizable `usage` fields.
 
-
+Budget decisions use estimated allowed cost. Actual usage is recorded for observability but does not rewrite earlier decisions.
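Reviewer note: the four accounting fields listed above can be related to each other with a small sketch. The field names come from the README; the functions `attempt` and `reconcile`, and the rule that a call is allowed while the running estimate stays at or below the budget, are assumptions made for illustration only.

```typescript
// Sketch of the accounting fields; attempt() and reconcile() are hypothetical helpers.
interface Accounting {
  attemptedCost: number; // every guarded attempt, allowed or blocked
  totalCost: number; // estimated cost of allowed calls
  blockedCost: number; // estimated cost stopped before execution
  actualCost: number; // provider-reported cost, when available
}

function newAccounting(): Accounting {
  return { attemptedCost: 0, totalCost: 0, blockedCost: 0, actualCost: 0 };
}

// Decides on the estimate alone: actualCost is never consulted here.
// Assumption: a call that lands exactly on the budget is still allowed.
function attempt(acc: Accounting, estimatedCost: number, budget: number): boolean {
  acc.attemptedCost += estimatedCost;
  const allowed = acc.totalCost + estimatedCost <= budget;
  if (allowed) {
    acc.totalCost += estimatedCost;
  } else {
    acc.blockedCost += estimatedCost;
  }
  return allowed;
}

// Records provider-reported usage for observability without touching earlier decisions.
function reconcile(acc: Accounting, providerCost: number): void {
  acc.actualCost += providerCost;
}
```

With a budget of `1`, two attempts estimated at `0.5` are allowed and a third at `0.25` is blocked, so `attemptedCost` always equals `totalCost + blockedCost`.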
 
-##
+## Pricing
 
-
-npm install @salimassili/ai-costguard
-```
+Known model pricing comes from built-in registry entries, runtime registrations, or per-guard overrides. Unknown models are blocked by default.
 
-### Development (FREE)
 ```ts
-import {
-
+import { registerPricing } from '@salimassili/ai-costguard';
+
+registerPricing([
+  {
+    model: 'my-company-model',
+    inputPer1kTokens: 0.001,
+    outputPer1kTokens: 0.002,
+    lastUpdated: '2026-06-07',
+    source: 'internal',
+  },
+]);
+```
 
-
+If you intentionally want fallback pricing for unknown models:
 
-
-
-
-
+```ts
+guard(client, {
+  budget: 5,
+  unknownModelPolicy: 'fallback',
+  unknownModelPricing: {
+    model: 'fallback',
+    inputPer1kTokens: 0.001,
+    outputPer1kTokens: 0.002,
+    lastUpdated: '2026-06-07',
+    source: 'application fallback',
+  },
 });
 ```
 
-
-```ts
-import { getProGuard } from '@salimassili/ai-costguard';
+Pricing changes frequently. Verify provider pricing before production use and override entries when needed.
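Reviewer note: the pricing entries above are expressed per 1,000 tokens, so a pre-call estimate is plain arithmetic. The sketch below shows one way that lookup and estimate could work. `resolvePricing` and `estimateCost` are hypothetical helpers, not exports of the package, and returning `undefined` for an unknown model with no fallback is an assumed stand-in for the documented "blocked by default" behavior.

```typescript
// Field names mirror the README entries; the helper functions are illustrative only.
interface PricingEntry {
  model: string;
  inputPer1kTokens: number;
  outputPer1kTokens: number;
}

// Returns the matching entry, the fallback if one was supplied, or undefined
// (which a caller would treat as "block: unknown model").
function resolvePricing(
  model: string,
  registry: PricingEntry[],
  fallback?: PricingEntry,
): PricingEntry | undefined {
  const found = registry.find((entry) => entry.model === model);
  return found ?? fallback;
}

// cost = input/1000 * inputPrice + output/1000 * outputPrice
function estimateCost(pricing: PricingEntry, inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1000) * pricing.inputPer1kTokens +
    (outputTokens / 1000) * pricing.outputPer1kTokens
  );
}

const registry: PricingEntry[] = [
  { model: 'my-company-model', inputPer1kTokens: 0.001, outputPer1kTokens: 0.002 },
];

const known = resolvePricing('my-company-model', registry);
const estimate = known ? estimateCost(known, 500, 200) : undefined;
```

For the entry shown, 500 input tokens and 200 output tokens come to roughly `0.0009`, subject to floating-point rounding.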
 
-
-
-
+## Events
+
+```ts
+const unsubscribe = openai.on('block', (event) => {
+  console.log(event.code, event.reason, event.context.estimatedCost);
 });
 
-
-
-
+unsubscribe();
+```
+
+Supported events are `cost`, `allow`, and `block`. Handler errors are swallowed so observability code cannot change guard decisions.
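Reviewer note: "handler errors are swallowed" is the key property in the line above. A minimal emitter with that behavior might look like the sketch below. `SafeEmitter` is a hypothetical class, not the package's internal event code; only the three event names and the unsubscribe-function return value are taken from the README.

```typescript
// Hypothetical emitter sketch; only the event names come from the README.
type GuardEventName = 'cost' | 'allow' | 'block';
type Handler = (event: unknown) => void;

class SafeEmitter {
  private handlers: Record<GuardEventName, Handler[]> = { cost: [], allow: [], block: [] };

  // Registers a handler and returns a function that removes it again.
  on(name: GuardEventName, handler: Handler): () => void {
    this.handlers[name].push(handler);
    return () => {
      this.handlers[name] = this.handlers[name].filter((h) => h !== handler);
    };
  }

  // Calls every handler; a throwing handler is ignored so it cannot
  // interrupt the guard decision that triggered the event.
  emit(name: GuardEventName, event: unknown): void {
    for (const handler of this.handlers[name].slice()) {
      try {
        handler(event);
      } catch {
        // Intentionally swallowed.
      }
    }
  }
}
```

The design choice is that enforcement happens before `emit` is called and never depends on its outcome, so a broken logging callback degrades visibility rather than safety.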
+
+## Local Dashboard
+
+Opt into a local JSONL event log:
+
+```ts
+const openai = guard(client, {
+  budget: 5,
+  eventLogPath: '.ai-costguard/events.jsonl',
 });
+```
+
+Start the local-only dashboard:
 
-
+```bash
+ai-costguard dashboard --events .ai-costguard/events.jsonl --budget 5
+```
+
+For one-off package execution:
+
+```bash
+npx @salimassili/ai-costguard dashboard --events .ai-costguard/events.jsonl --budget 5
 ```
 
-
+If the package is installed locally, `npx ai-costguard dashboard` also works. The dashboard binds to `127.0.0.1` by default and reads only local event files.
+
+For CI or terminal output:
 
-
+```bash
+ai-costguard dashboard --events .ai-costguard/events.jsonl --budget 5 --once --json
+```
 
-
-- **Founders using OpenAI/Claude in production** - Need sleep at night
-- **Teams running autonomous workflows** - Multiple instances, high risk
-- **Companies spending $1k+/month on AI APIs** - High exposure
+See `docs/DASHBOARD.md`.
 
-
+## Integrations
 
-
+Runnable mocked examples are included for:
 
-
-
+- OpenAI SDK agent loop protection
+- Anthropic SDK workflow budget guard
+- Vercel AI SDK chatbot budget cap
+- LangChain retry-storm prevention
+- Mastra-style agent runner protection
+- CrewAI launch/budget gate
+- CI budget checks
 
-
+See `docs/INTEGRATIONS.md` and `examples/integrations`.
 
-
+## Express Middleware
 
-
+The middleware attaches a manual checker. It does not automatically parse or inspect every route.
 
-
-
-
-
-
-
-
-
-
-
+```ts
+import { middleware, GuardError } from '@salimassili/ai-costguard';
+
+app.use(middleware({ budget: 2 }));
+
+app.post('/chat', async (req, res, next) => {
+  try {
+    req.localSafety.check({
+      model: 'gpt-4o-mini',
+      tokens: 500,
+      inputTokens: 100,
+      outputTokens: 400,
+      estimatedCost: 0.0003,
+      timestamp: Date.now(),
+      prompt: String(req.body?.prompt ?? ''),
+    });
+
+    res.json({ ok: true });
+  } catch (error) {
+    if (error instanceof GuardError) {
+      res.status(403).json({ code: error.code, reason: error.message });
+      return;
+    }
+    next(error);
+  }
+});
+```
 
-
+## Optional Redis / Pro Helper
 
-
+Redis-backed shared spend tracking is isolated behind a subpath import:
 
-
-
-- **Proxy-based** - Zero integration overhead
-- **OpenAI + Anthropic** - Major providers supported
-- **Minimal dependencies** - Lightweight, secure
-- **Serverless-friendly** - Works anywhere
-- **Production-ready** - Battle tested
+```ts
+import { GuardPro } from '@salimassili/ai-costguard/pro';
 
-
+const pro = new GuardPro({
+  redisUrl: process.env.REDIS_URL ?? '',
+  budget: 25,
+  windowSeconds: 86400,
+});
 
-
+await pro.checkAndCharge('production', 0.0042);
+await pro.shutdown();
+```
 
-
-❌ No analytics
-❌ No enterprise complexity
-❌ No governance systems
-❌ No features nobody pays for
+`ioredis` is an optional dependency and is not loaded by the root import.
 
-
+`licenseKey` is accepted as a deprecated compatibility field only. AI CostGuard does not enforce commercial licenses locally, and `validateLicense()` is a format sanity helper, not security.
 
-
+## CLI
 
-
+```bash
+aifw check --budget 1 --model gpt-4o-mini --input-tokens 500 --tokens 1000 --max-steps 5
+```
 
-
+The package also installs an `ai-costguard` bin alias:
 
-
+```bash
+ai-costguard check --budget 1 --model gpt-4o-mini --tokens 1000 --max-steps 5
+ai-costguard dashboard --events .ai-costguard/events.jsonl --budget 5
+```
 
-
+For custom models:
 
-
+```bash
+aifw check --budget 1 --model internal-model --tokens 1000 --input-price-per-1k 0.001 --output-price-per-1k 0.002
+```
 
-
-"AI agent processing customer feedback entered infinite retry loop. Free version on each server thought everything was fine. Cost exploded to $15,000/hour. Circuit breaker killed all instances in 8 seconds. Company saved $12,000."
+Exit codes:
 
-
-
+- `0`: projected cost is within budget
+- `1`: projected cost exceeds budget
+- `2`: usage/config error
 
-
-"Autonomous workflow system went into runaway execution chain. 12 servers burning $4,000/hour each. Circuit breaker detected anomaly across all instances and emergency shutdown. Prevented $50,000 disaster."
+## Benchmarks
 
-
+Run local benchmarks:
 
-
+```bash
+npm run build
+npm run benchmark
+```
 
-
+The script reports runtime overhead, approximate heap delta, false-positive scenarios, loop detection behavior, and cost-estimation boundaries. Results are local measurements, not universal guarantees. See `docs/BENCHMARKS.md`.
 
-
-✅ **Trust** - Saves companies thousands, proves value instantly
-✅ **Integration** - One-line installation, zero overhead
-✅ **Emergency response** - Panic button works in seconds
-✅ **Distributed coordination** - Handles multi-instance disasters
+Latest local benchmark in this repo on Node `v24.14.1` / Windows measured `0.020691 ms` added per mocked guarded call over `5000` iterations. Re-run on your target runtime before using this number in performance-sensitive claims.
 
-
+## Why Not 50 Lines Of Code?
 
-
+A simple homemade budget check can stop one request after one counter crosses one number. AI CostGuard packages the parts that usually become messy once agents enter production:
 
-
+- Provider pricing registry with runtime overrides and unknown-model blocking.
+- Structured `GuardError` codes and metadata for API responses.
+- Scoped budget and behavior state per project, user, or session.
+- TTL-bounded prompt history.
+- Loop and retry-storm detection.
+- Estimated, attempted, blocked, and actual usage accounting.
+- Method filtering so non-AI SDK calls are not charged.
+- Event hooks, best-effort webhooks, JSONL event logs, and local dashboard visibility.
+- CI budget checks and runnable integration examples.
 
-
-2. **Production incidents create urgency** - Real disasters create immediate need
-3. **Paid version becomes unavoidable** - Can't safely run AI in production without it
-4. **Emergency alerts prove value** - Slack notifications save thousands
-5. **Trust creates lock-in** - Reliable circuit breaker becomes infrastructure
+## Development
 
-
+```bash
+npm ci
+npm run build
+npm run typecheck
+npm test
+npm run smoke
+npm run benchmark
+npm audit --omit=dev
+npm pack --dry-run
+```
 
-##
+## Limitations
 
-
+- Token counting is approximate and dependency-free.
+- Pricing entries can become stale; override them for production.
+- The free guard is process-local.
+- Loop detection uses character trigram similarity, not embeddings.
+- Retry detection is heuristic.
+- Webhooks are best-effort and never affect enforcement.
+- The dashboard reads local JSONL logs only; it is not a hosted analytics product.
+- Provider usage reconciliation only works when responses expose recognizable `usage` fields.
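Reviewer note: the limitations above mention character trigram similarity, and the changelog names cosine similarity with a default `0.85` threshold. A generic version of that technique is sketched below to show what it can and cannot catch. This is a textbook formulation, not the code in `dist/core/similarity.js`; the normalization step (lowercasing and collapsing whitespace) and the function names are assumptions.

```typescript
// Generic character-trigram cosine similarity; not the package's own implementation.
function trigramCounts(text: string): Map<string, number> {
  // Assumed normalization: lowercase and collapse runs of whitespace.
  const normalized = text.toLowerCase().replace(/\s+/g, ' ').trim();
  const counts = new Map<string, number>();
  for (let i = 0; i + 3 <= normalized.length; i++) {
    const gram = normalized.slice(i, i + 3);
    counts.set(gram, (counts.get(gram) ?? 0) + 1);
  }
  return counts;
}

// Cosine of the angle between the two trigram count vectors, in [0, 1].
function trigramCosine(a: string, b: string): number {
  const countsA = trigramCounts(a);
  const countsB = trigramCounts(b);
  let dot = 0;
  let normA = 0;
  let normB = 0;
  countsA.forEach((count, gram) => {
    normA += count * count;
    dot += count * (countsB.get(gram) ?? 0);
  });
  countsB.forEach((count) => {
    normB += count * count;
  });
  if (normA === 0 || normB === 0) {
    return 0; // strings shorter than three characters have no trigrams
  }
  return dot / Math.sqrt(normA * normB);
}

// A caller would compare the score against a threshold such as 0.85.
function looksLikeRepeat(a: string, b: string, threshold: number): boolean {
  return trigramCosine(a, b) >= threshold;
}
```

Because the measure is purely lexical, a reworded prompt with the same meaning can score low, which is the trade-off the "not embeddings" limitation points at.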
 
-
+## License
 
-MIT
+MIT