npm - budget-agent - Versions diffs - 0.4.7 → 0.4.9 - Mend

budget-agent 0.4.7 → 0.4.9

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (2)

package/README.md +449 -51
package/package.json +29 -19

package/README.md CHANGED

@@ -1,8 +1,8 @@
 # budget-agent
-Stop runaway LLM agents from burning your API credits. Set hard limits on cost, tokens, steps, and wall time. The SDK blocks each call before and after it hits your provider -- so you never overspend.
+Stop runaway AI agents from burning through your API credits. Track cost, tokens, runtime, and steps. Enforce hard budget limits for OpenAI, Anthropic, LangGraph, LangChain, OpenRouter, CrewAI, Mastra, AutoGen, and any LLM workflow.
-Works with **OpenAI**, **Anthropic**, **OpenRouter**, **Ollama**, **Together AI**, **Fireworks**, and any OpenAI-compatible API.
+budget-agent helps developers **track AI agent costs**, **enforce token limits**, **set spending caps**, **monitor LLM usage**, and **prevent runaway OpenAI, Anthropic, and OpenRouter agents** from exceeding budget. Works with every provider. Zero vendor lock-in.
 ## Install
@@ -10,7 +10,7 @@ Works with **OpenAI**, **Anthropic**, **OpenRouter**, **Ollama**, **Together AI*
 npm install budget-agent
 ```
-## Usage
+## Quick start
 ```ts
 import { AgentBudget, BudgetError } from 'budget-agent';
@@ -18,10 +18,10 @@ import { AgentBudget, BudgetError } from 'budget-agent';
 const agent = new AgentBudget({
   apiKey: process.env.OPENROUTER_API_KEY,
   limits: {
-    maxCostUSD:   0.10,
-    maxSteps:     15,
+    maxCostUSD:     0.10,
+    maxSteps:       15,
     maxTotalTokens: 50_000,
-    maxWallTimeMs: 30_000,
+    maxWallTimeMs:  30_000,
   },
 });
@@ -33,25 +33,54 @@ const response = await agent.step({
 console.log(agent.getUsage());
 ```
-## How it works
+---
-Every `step()` call runs two budget checks:
+## Prevent runaway AI agents
-1. **Before the API call** -- estimates cost and blocks if you'd go over budget.
-2. **After the API call** -- records actual tokens/cost and blocks if a limit was hit. The step rolls back so you can retry cleanly.
+Agent loops multiply LLM costs across every step. Without guardrails, a single loop can burn through your entire API budget in seconds. budget-agent blocks each call before and after it hits your provider -- so you never overspend.
 ```ts
+const agent = new AgentBudget({
+  apiKey: key,
+  limits: { maxCostUSD: 0.05, maxSteps: 10 },
+});
 try {
-  await agent.step({ model, messages });
+  while (true) {
+    const res = await agent.step({ model, messages });
+    messages.push(res.choices[0].message);
+    messages.push({ role: 'user', content: 'Continue.' });
+  }
 } catch (err) {
   if (err instanceof BudgetError) {
-    console.log(err.exceeded.reason);  // 'cost' | 'steps' | 'totalTokens' | 'wallTime'
-    console.log(err.exceeded.usage);   // full usage snapshot at cutoff
+    console.log('Agent stopped:', err.exceeded.reason);
   }
 }
 ```
-## Limits
+---
+## Track LLM costs in production
+Get real-time visibility into every API call. See cost per step, total spend, token breakdown, and wall time.
+```ts
+const usage = agent.getUsage();
+// {
+//   steps: 12,
+//   totalCostUSD: 0.0847,
+//   totalInputTokens: 24300,
+//   totalOutputTokens: 8200,
+//   elapsedMs: 45200,
+//   stepHistory: [...]
+// }
+agent.summary(); // formatted table in console
+```
+---
+## Set hard budget caps
 Every limit is optional. Set only what you need.
@@ -66,9 +95,56 @@ limits: {
 }
 ```
-## Custom executor (any provider)
+---
+## Runtime limits for AI agents
+Kill agents that run too long. Set wall time limits to prevent infinite loops from consuming compute and money.
+```ts
+const agent = new AgentBudget({
+  apiKey: key,
+  limits: {
+    maxWallTimeMs: 30_000,  // 30 second hard stop
+    maxCostUSD: 1.00,
+  },
+});
+```
+---
+## Token usage tracking
+Track input tokens, output tokens, and total tokens across every step. Know exactly where your budget goes.
+```ts
+agent.on('step:end', (e) => {
+  console.log(`Step ${e.stepIndex}: ${e.inputTokens} in / ${e.outputTokens} out / $${e.costUSD}`);
+});
+```
+---
+## Agent guardrails
+Pre-flight checks estimate output cost before the API call. Post-step checks record actual spend. If a limit is hit, the step rolls back and you can retry cleanly.
+```ts
+try {
+  await agent.step({ model, messages });
+} catch (err) {
+  if (err instanceof BudgetError) {
+    err.exceeded.reason;  // 'cost' | 'steps' | 'totalTokens' | 'wallTime'
+    err.exceeded.usage;   // full snapshot at cutoff
+  }
+}
+```
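The guardrail flow described in this hunk (estimate before the call, record actual spend after it, roll back when a limit is hit) can be illustrated with a minimal sketch. It is not part of the package diff and not budget-agent's implementation; `BudgetTracker`, `LimitExceeded`, `preflight`, and `record` are hypothetical names chosen for illustration, and only cost and step limits are modeled.

```typescript
// Hypothetical sketch; none of these names come from budget-agent itself.
type Reason = 'cost' | 'steps';

interface Usage {
  steps: number;
  costUSD: number;
}

class LimitExceeded extends Error {
  constructor(public readonly reason: Reason, public readonly usage: Usage) {
    super(`limit exceeded: ${reason}`);
  }
}

class BudgetTracker {
  private usage: Usage = { steps: 0, costUSD: 0 };

  constructor(private readonly maxCostUSD: number, private readonly maxSteps: number) {}

  // Pre-flight: call before the request; throws without changing any state.
  preflight(estimatedCostUSD: number): void {
    if (this.usage.steps + 1 > this.maxSteps) {
      throw new LimitExceeded('steps', { ...this.usage });
    }
    if (this.usage.costUSD + estimatedCostUSD > this.maxCostUSD) {
      throw new LimitExceeded('cost', { ...this.usage });
    }
  }

  // Post-step: call after the request with the provider-reported cost.
  record(actualCostUSD: number): void {
    const snapshot = { ...this.usage };
    this.usage = { steps: snapshot.steps + 1, costUSD: snapshot.costUSD + actualCostUSD };
    if (this.usage.costUSD > this.maxCostUSD) {
      const atCutoff = { ...this.usage };
      this.usage = snapshot; // roll back so the caller can retry from a clean state
      throw new LimitExceeded('cost', atCutoff);
    }
  }

  getUsage(): Usage {
    return { ...this.usage };
  }
}
```

In this sketch the error carries the usage at the moment of cutoff, while the tracker itself returns to the pre-step state, which is one way to read "the step rolls back and you can retry cleanly".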
+---
-Use any LLM provider with a custom executor:
+## OpenAI cost tracking
+Use budget-agent with the OpenAI SDK to track GPT-5.5 costs in real time.
 ```ts
 import { AgentBudget } from 'budget-agent';
@@ -78,7 +154,7 @@ const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
 const agent = new AgentBudget({
   apiKey: process.env.OPENAI_API_KEY,
-  limits: { maxCostUSD: 0.10 },
+  limits: { maxCostUSD: 0.50 },
   executor: async (request) => {
     const completion = await openai.chat.completions.create({
       model: request.model,
@@ -100,23 +176,352 @@ const agent = new AgentBudget({
 });
 ```
-Works with OpenAI, Anthropic, Ollama, Together AI, Fireworks, LocalAI, or any OpenAI-compatible endpoint.
+---
+## Anthropic budget limits
+Set spending caps on Claude Opus, Sonnet, and Haiku. Track token usage and enforce cost limits for Anthropic models.
+```ts
+const agent = new AgentBudget({
+  apiKey: process.env.ANTHROPIC_API_KEY,
+  limits: { maxCostUSD: 0.25, maxSteps: 20 },
+  executor: async (request) => {
+    const response = await fetch('https://api.anthropic.com/v1/messages', {
+      method: 'POST',
+      headers: {
+        'x-api-key': process.env.ANTHROPIC_API_KEY!,
+        'anthropic-version': '2023-06-01',
+        'content-type': 'application/json',
+      },
+      body: JSON.stringify({
+        model: request.model,
+        messages: request.messages,
+        max_tokens: 1024,
+      }),
+    });
+    const data = await response.json();
+    return {
+      model: data.model,
+      usage: {
+        prompt_tokens: data.usage?.input_tokens ?? 0,
+        completion_tokens: data.usage?.output_tokens ?? 0,
+        total_tokens: (data.usage?.input_tokens ?? 0) + (data.usage?.output_tokens ?? 0),
+      },
+      choices: [{
+        message: { role: 'assistant', content: data.content?.[0]?.text ?? '' },
+        finish_reason: 'stop',
+      }],
+    };
+  },
+});
+```
+---
+## LangGraph budget control
+Add budget limits to LangGraph agent graphs. Prevent infinite loops and control cost per execution.
+```ts
+import { AgentBudget, BudgetError } from 'budget-agent';
+const agent = new AgentBudget({
+  apiKey: process.env.OPENROUTER_API_KEY,
+  limits: { maxCostUSD: 0.20, maxSteps: 50 },
+});
+// Use inside a LangGraph node
+async function agentNode(state) {
+  const response = await agent.step({
+    model: 'anthropic/claude-sonnet-4-5',
+    messages: state.messages,
+  });
+  return { messages: [...state.messages, response.choices[0].message] };
+}
+```
+---
+## LangChain cost monitoring
+Track costs for LangChain chains and agents. Set token limits and spending caps.
+```ts
+import { AgentBudget } from 'budget-agent';
+const agent = new AgentBudget({
+  apiKey: process.env.OPENROUTER_API_KEY,
+  limits: { maxCostUSD: 0.15, maxTotalTokens: 100_000 },
+});
+// Wrap any LangChain call
+const response = await agent.step({
+  model: 'openai/gpt-5.5',
+  messages: [{ role: 'user', content: prompt }],
+});
+```
+---
+## OpenRouter spend caps
+budget-agent fetches live pricing from OpenRouter. No hardcoded price tables. If OpenRouter adds a model, it works automatically.
+```ts
+const agent = new AgentBudget({
+  apiKey: process.env.OPENROUTER_API_KEY,
+  limits: { maxCostUSD: 0.10 },
+});
+// Pricing is fetched and cached automatically
+const response = await agent.step({
+  model: 'anthropic/claude-sonnet-4-5',
+  messages: [{ role: 'user', content: 'Hello' }],
+});
+```
+---
+## Ollama agent limits
+Set budget limits for local Ollama models. Track token usage even for self-hosted inference.
+```ts
+const agent = new AgentBudget({
+  apiKey: 'ollama',
+  limits: { maxSteps: 100, maxWallTimeMs: 60_000 },
+  baseUrl: 'http://localhost:11434/v1',
+  executor: async (request) => {
+    const res = await fetch('http://localhost:11434/api/chat', {
+      method: 'POST',
+      body: JSON.stringify({ model: request.model, messages: request.messages }),
+    });
+    const data = await res.json();
+    return {
+      model: data.model,
+      usage: data.usage ?? { prompt_tokens: 0, completion_tokens: 0, total_tokens: 0 },
+      choices: data.messages?.map((m) => ({
+        message: { role: m.role, content: m.content },
+        finish_reason: 'stop',
+      })) ?? [],
+    };
+  },
+});
+```
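The Ollama executor added in this hunk reads `data.usage` and `data.messages`. Ollama's native `/api/chat` endpoint streams by default, and with `stream: false` in the request body it returns a single `message` object plus token counts in `prompt_eval_count` and `eval_count`. The sketch below is not part of the package diff; it shows one way to map that response onto the OpenAI-style shape the README's executors return. `toOpenAIShape` and the two interfaces are hypothetical names.

```typescript
// Subset of the non-streaming /api/chat response that matters here.
interface OllamaChatResponse {
  model: string;
  message?: { role: string; content: string };
  prompt_eval_count?: number; // input tokens
  eval_count?: number;        // output tokens
}

// OpenAI-style result, mirroring the shape used by the README's executors.
interface ChatResult {
  model: string;
  usage: { prompt_tokens: number; completion_tokens: number; total_tokens: number };
  choices: { message: { role: string; content: string }; finish_reason: string }[];
}

// Hypothetical helper: convert one Ollama response into the OpenAI-style shape.
function toOpenAIShape(data: OllamaChatResponse): ChatResult {
  const promptTokens = data.prompt_eval_count ?? 0;
  const completionTokens = data.eval_count ?? 0;
  return {
    model: data.model,
    usage: {
      prompt_tokens: promptTokens,
      completion_tokens: completionTokens,
      total_tokens: promptTokens + completionTokens,
    },
    choices: data.message
      ? [{ message: { role: data.message.role, content: data.message.content }, finish_reason: 'stop' }]
      : [],
  };
}
```

An executor built on this would send `{ model, messages, stream: false }` and pass the parsed JSON through `toOpenAIShape`, so token limits see real counts instead of zeros.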
+---
+## CrewAI budget enforcement
+Add cost limits to CrewAI agent crews. Prevent multi-agent systems from running up bills.
+```ts
+import { AgentBudget } from 'budget-agent';
+const agent = new AgentBudget({
+  apiKey: process.env.OPENROUTER_API_KEY,
+  limits: { maxCostUSD: 1.00, maxSteps: 100 },
+});
+// Use in CrewAI task execution
+const response = await agent.step({
+  model: 'anthropic/claude-sonnet-4-5',
+  messages: [{ role: 'user', content: taskDescription }],
+});
+```
+---
+## Mastra agent limits
+Set budget limits for Mastra agents. Track cost and tokens across agent workflows.
+```ts
+import { AgentBudget } from 'budget-agent';
+const agent = new AgentBudget({
+  apiKey: process.env.OPENROUTER_API_KEY,
+  limits: { maxCostUSD: 0.50, maxSteps: 30 },
+});
+```
+---
+## AutoGen cost control
+Add budget limits to AutoGen multi-agent conversations. Prevent agent loops from exceeding budget.
+```ts
+import { AgentBudget } from 'budget-agent';
+const agent = new AgentBudget({
+  apiKey: process.env.OPENROUTER_API_KEY,
+  limits: { maxCostUSD: 0.25, maxSteps: 20 },
+});
+```
+---
+## LLM observability
+Subscribe to lifecycle events for full visibility into agent behavior.
+```ts
+agent.on('step:start', (e) => console.log('Step', e.stepIndex, 'started'));
+agent.on('step:token', (e) => process.stdout.write(e.token));
+agent.on('step:end', (e) => console.log(`Step cost: $${e.costUSD}`));
+agent.on('budget:exceeded', (e) => console.log('Limit hit:', e.exceeded.reason));
+agent.on('budget:warning', (e) => console.log(`Warning: ${e.pctConsumed * 100}% consumed`));
+agent.on('model:downgraded', (e) => console.log(`Downgraded: ${e.from} → ${e.to}`));
+```
+---
+## Adaptive model routing
+Downgrade to cheaper models as budget depletes. Automatic fallback chains.
+```ts
+const agent = new AgentBudget({
+  apiKey: key,
+  limits: { maxCostUSD: 5.00 },
+  adaptiveRouting: {
+    fallbackChain: [
+      'anthropic/claude-opus-4.8-fast',
+      'openai/gpt-5.5',
+      'openrouter/free',
+    ],
+    thresholds: [0.4, 0.75],
+  },
+});
+```
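The README does not spell out how `thresholds` map onto `fallbackChain`. One plausible reading, assumed here, is that each threshold is a fraction of the budget consumed, and crossing the i-th threshold moves to the next model in the chain. The sketch below is not part of the package diff and `pickModel` is a hypothetical helper, not budget-agent's API.

```typescript
// Hypothetical helper: choose a model from the chain based on budget consumed.
// Assumption: thresholds are fractions in [0, 1]; each one crossed advances the chain by one.
function pickModel(
  chain: string[],
  thresholds: number[],
  spentUSD: number,
  maxCostUSD: number,
): string {
  // Treat a non-positive budget as fully consumed.
  const consumed = maxCostUSD > 0 ? spentUSD / maxCostUSD : 1;
  let index = 0;
  for (const threshold of thresholds) {
    if (consumed >= threshold) index++;
  }
  // Never run past the cheapest (last) model in the chain.
  return chain[Math.min(index, chain.length - 1)];
}
```

Under that assumption, with the configuration above and a 5.00 USD cap, spend below 2.00 USD stays on the first model, spend from 2.00 USD up to 3.75 USD uses the second, and anything beyond uses the last.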
+---
-## Features
+## Circuit breaker
+Detect repetition or stagnation and halt the agent before it burns through credits.
+```ts
+const agent = new AgentBudget({
+  apiKey: key,
+  limits: { maxCostUSD: 1.00 },
+  circuitBreaker: {
+    repetitionWindow: 3,
+    repetitionThreshold: 0.85,
+    stagnationWindow: 4,
+    stagnationMinLength: 50,
+  },
+});
+```
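The README names the circuit breaker's knobs but not its similarity measure. As a rough illustration of repetition detection, and not a description of the package's algorithm, the sketch below assumes Jaccard similarity over lowercase word sets and trips when every consecutive pair of outputs in the last `window` steps scores at or above `threshold`. It is not part of the package diff; `jaccard` and `isRepeating` are hypothetical names.

```typescript
// Jaccard similarity over lowercase word sets (an assumed metric, for illustration only).
function jaccard(a: string, b: string): number {
  const setA = new Set(a.toLowerCase().split(/\s+/).filter(Boolean));
  const setB = new Set(b.toLowerCase().split(/\s+/).filter(Boolean));
  if (setA.size === 0 && setB.size === 0) return 1; // two empty outputs count as identical
  let shared = 0;
  setA.forEach((word) => {
    if (setB.has(word)) shared++;
  });
  return shared / (setA.size + setB.size - shared);
}

// True when each of the last `window` outputs closely matches the one before it.
function isRepeating(outputs: string[], window: number, threshold: number): boolean {
  if (window < 2 || outputs.length < window) return false;
  const recent = outputs.slice(-window);
  for (let i = 1; i < recent.length; i++) {
    if (jaccard(recent[i - 1], recent[i]) < threshold) return false;
  }
  return true;
}
```

With `repetitionWindow: 3` and `repetitionThreshold: 0.85`, three near-identical assistant replies in a row would trip this check, while a single repeated reply would not.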
-- **Cost limits** -- hard stop at a USD ceiling across all steps
-- **Token limits** -- cap input, output, or total tokens
-- **Step limits** -- max number of LLM calls
-- **Wall time limits** -- kill agents that run too long
-- **Pre-flight checks** -- estimate cost before spending money
-- **Rollback on exceed** -- step rolls back so retry stays clean
-- **Adaptive routing** -- auto-downgrade to cheaper models as budget depletes
-- **Circuit breaker** -- detect repetition or stagnation, halt the agent
-- **Auto-compress** -- truncate message history when tokens exceed threshold
-- **Checkpoints** -- save and resume agent state across restarts
-- **Streaming** -- set `stream: true`, listen for `step:token` events
-- **Rate-limit retry** -- automatic 429 retry with exponential backoff
-- **OpenTelemetry** -- optional tracing spans
+---
+## Auto-compress messages
+Truncate message history with an LLM summary when token count exceeds a threshold.
+```ts
+const agent = new AgentBudget({
+  apiKey: key,
+  limits: { maxTotalTokens: 100_000 },
+  autoCompress: {
+    thresholdTokens: 80_000,
+    keepLastN: 4,
+  },
+});
+```
+---
+## Checkpoints
+Save and resume agent state across restarts.
+```ts
+const agent = new AgentBudget({
+  apiKey: key,
+  limits: { maxCostUSD: 0.50 },
+  checkpoint: { enabled: true, path: './agent-state.json' },
+});
+// Resume later
+const resumed = await AgentBudget.resume(options);
+```
+---
+## Warning thresholds
+Get notified before hitting limits.
+```ts
+const agent = new AgentBudget({
+  limits: { maxCostUSD: 0.10 },
+  warningThreshold: 0.5,
+});
+agent.on('budget:warning', (e) => {
+  console.log(`${e.pctConsumed * 100}% of ${e.reason} budget consumed`);
+});
+```
+---
+## budget-agent vs LangSmith
+LangSmith is an observability platform. budget-agent is a runtime enforcement layer. LangSmith shows you what happened. budget-agent stops it from happening.
+| | budget-agent | LangSmith |
+|---|---|---|
+| Runtime enforcement | Yes | No |
+| Pre-flight cost estimation | Yes | No |
+| Budget limits | Hard stops | Soft alerts |
+| Pricing | Free, self-hosted | Paid SaaS |
+| Provider lock-in | None | LangChain ecosystem |
+---
+## budget-agent vs Helicone
+Helicone is a proxy for LLM cost tracking. budget-agent is an SDK that enforces limits at runtime. Helicone tracks after the fact. budget-agent blocks before spend happens.
+| | budget-agent | Helicone |
+|---|---|---|
+| Runtime enforcement | Yes | No |
+| Pre-flight checks | Yes | No |
+| Self-hosted | Yes | Cloud only |
+| Free tier | Yes | Limited |
+---
+## budget-agent vs Langfuse
+Langfuse is an LLM observability tool. budget-agent is a budget enforcement SDK. Langfuse gives you dashboards. budget-agent gives you hard limits.
+| | budget-agent | Langfuse |
+|---|---|---|
+| Runtime enforcement | Yes | No |
+| Pre-flight cost estimation | Yes | No |
+| Budget limits | Hard stops | Observability only |
+| Self-hosted | Yes | Yes |
+| Free | Yes | Yes (self-hosted) |
+---
+## budget-agent vs OpenAI Usage Dashboard
+OpenAI's dashboard shows usage after the fact. budget-agent prevents overspend in real time.
+| | budget-agent | OpenAI Dashboard |
+|---|---|---|
+| Real-time enforcement | Yes | No |
+| Pre-flight checks | Yes | No |
+| Multi-provider | Yes | OpenAI only |
+| Agent loop protection | Yes | No |
+---
 ## API
@@ -143,17 +548,9 @@ Works with OpenAI, Anthropic, Ollama, Together AI, Fireworks, LocalAI, or any Op
 One LLM call. Checks limits before and after. Throws `BudgetError` on exceed.
-```ts
-const response = await agent.step({
-  model: 'anthropic/claude-sonnet-4-5',
-  messages: [{ role: 'user', content: 'Hi' }],
-  stream: true,
-});
-```
 ### `agent.getUsage()`
-Returns current usage: `steps`, `totalInputTokens`, `totalOutputTokens`, `totalCostUSD`, `elapsedMs`, `stepHistory`.
+Returns: `steps`, `totalInputTokens`, `totalOutputTokens`, `totalCostUSD`, `elapsedMs`, `stepHistory`.
 ### `agent.reset()`
@@ -161,18 +558,19 @@ Reset all counters.
 ### `agent.refreshPricing()`
-Force re-fetch model prices from OpenRouter.
+Force re-fetch model prices.
-## Events
+### `agent.summary()`
-```ts
-agent.on('step:start', (e) => {});
-agent.on('step:token', (e) => {});
-agent.on('step:end', (e) => {});
-agent.on('budget:exceeded', (e) => {});
-agent.on('budget:warning', (e) => {});
-agent.on('model:downgraded', (e) => {});
-```
+Formatted usage table in console.
+### `agent.loadCheckpoint()` / `agent.clearCheckpoint()`
+Load or clear persisted state.
+### `AgentBudget.resume(options, checkpointPath?)`
+Create a new agent pre-loaded with checkpoint state.
 ## License

package/package.json CHANGED

@@ -1,7 +1,7 @@
 {
   "name": "budget-agent",
-  "version": "0.4.7",
-  "description": "Control LLM agent costs with real-time token, cost, and step tracking. Set budget limits, enforce spend caps, and prevent runaway agents. Works with OpenAI, Anthropic, OpenRouter, Ollama, and any provider.",
+  "version": "0.4.9",
+  "description": "Track AI agent costs, tokens, runtime and spending. Prevent runaway OpenAI, Anthropic, LangGraph and OpenRouter agents from exceeding budget.",
   "type": "module",
   "main": "./dist/index.js",
   "types": "./dist/index.d.ts",
@@ -29,31 +29,41 @@
     "test:legacy": "tsx test-integration.ts"
   },
   "keywords": [
-    "llm",
+    "ai",
     "agent",
-    "budget",
-    "cost-control",
-    "token-limit",
-    "openrouter",
+    "llm",
     "openai",
     "anthropic",
-    "llm-cost",
+    "langgraph",
+    "langchain",
+    "openrouter",
+    "ollama",
+    "crewai",
+    "mastra",
+    "autogen",
+    "cost-tracking",
+    "budget",
+    "token-tracking",
     "agent-budget",
+    "ai-cost",
+    "ai-observability",
+    "agent-monitoring",
+    "guardrails",
+    "runtime-limits",
+    "token-limits",
     "spending-limit",
-    "rate-limit",
+    "cost-control",
+    "llm-cost",
+    "agent-guardrails",
+    "runaway-agent",
+    "budget-enforcement",
     "circuit-breaker",
     "checkpoint",
-    "token-tracker",
-    "cost-tracker",
-    "llm-agent",
-    "ai-agent",
-    "prompt-cost",
-    "usage-tracking",
-    "budget-enforcement",
-    "ollama",
-    "gpt-4",
+    "gpt-5.5",
     "claude",
-    "llm-proxy"
+    "langsmith",
+    "langfuse",
+    "helicone"
   ],
   "license": "MIT",
   "repository": {