npm - budget-agent - Versions diffs - 0.4.6 → 0.4.7 - Mend

budget-agent 0.4.6 → 0.4.7

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (2)

package/README.md +61 -137
package/package.json +1 -1

package/README.md CHANGED

@@ -1,8 +1,8 @@
 # budget-agent
-Control LLM agent costs with real-time token, cost, and step tracking. Set budget limits, enforce spend caps, and prevent runaway agents from burning through your API credits.
+Stop runaway LLM agents from burning your API credits. Set hard limits on cost, tokens, steps, and wall time. The SDK blocks each call before and after it hits your provider -- so you never overspend.
-**Works with OpenAI, Anthropic, OpenRouter, Ollama, Together AI, Fireworks, and any OpenAI-compatible endpoint.**
+Works with **OpenAI**, **Anthropic**, **OpenRouter**, **Ollama**, **Together AI**, **Fireworks**, and any OpenAI-compatible API.
 ## Install
@@ -10,114 +10,63 @@ Control LLM agent costs with real-time token, cost, and step tracking. Set budge
 npm install budget-agent
 ```
-## Quick start
+## Usage
 ```ts
-import { AgentBudget } from 'budget-agent';
+import { AgentBudget, BudgetError } from 'budget-agent';
 const agent = new AgentBudget({
   apiKey: process.env.OPENROUTER_API_KEY,
-  limits: { maxCostUSD: 0.05, maxSteps: 10 },
+  limits: {
+    maxCostUSD:   0.10,
+    maxSteps:     15,
+    maxTotalTokens: 50_000,
+    maxWallTimeMs: 30_000,
+  },
 });
 const response = await agent.step({
-  model: 'anthropic/claude-opus-4.8-fast',
+  model: 'anthropic/claude-sonnet-4-5',
   messages: [{ role: 'user', content: 'Hello' }],
 });
 console.log(agent.getUsage());
-// { steps: 1, totalCostUSD: 0.000015, totalInputTokens: 12, ... }
 ```
-## Why use this
-LLM API calls cost money. Agent loops multiply that cost across every step. Without guardrails, a single runaway agent can burn through your credits in seconds.
+## How it works
-This SDK sits between your agent and the LLM provider. It tracks every call, checks your budget before each one, and stops the agent when it hits a limit. No provider is bundled. No model is defaulted. You bring everything.
+Every `step()` call runs two budget checks:
-## Budget limits
-Set limits on cost, tokens, steps, and wall time. Every limit is optional.
-```ts
-const agent = new AgentBudget({
-  apiKey: key,
-  limits: {
-    maxCostUSD:      0.05,   // total USD before abort
-    maxSteps:        10,     // total LLM calls before abort
-    maxInputTokens:  50000,  // total input tokens
-    maxOutputTokens: 10000,  // total output tokens
-    maxTotalTokens:  60000,  // input + output combined
-    maxWallTimeMs:   60000,  // 60 seconds wall clock
-  },
-});
-```
-### How enforcement works
-Each `step()` runs two checks:
-1. **Pre-flight** -- before the API call. Estimates output cost and catches over-budget calls before spending money.
-2. **Post-step** -- after recording real token/cost data. If a limit is exceeded, the step is rolled back from the tracker so you can retry without a stale balance.
+1. **Before the API call** -- estimates cost and blocks if you'd go over budget.
+2. **After the API call** -- records actual tokens/cost and blocks if a limit was hit. The step rolls back so you can retry cleanly.
 ```ts
 try {
   await agent.step({ model, messages });
 } catch (err) {
   if (err instanceof BudgetError) {
-    console.log(err.exceeded.reason); // 'cost' | 'steps' | 'wallTime' | ...
+    console.log(err.exceeded.reason);  // 'cost' | 'steps' | 'totalTokens' | 'wallTime'
+    console.log(err.exceeded.usage);   // full usage snapshot at cutoff
   }
 }
 ```
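
The two checks described in the new "How it works" text can be illustrated with a small standalone tracker. This is a sketch of the general pattern, not budget-agent's actual implementation: `BudgetTracker`, `LimitExceeded`, `preflight`, and `record` are hypothetical names, and the sketch assumes a limit counts as exceeded only when usage goes strictly above it.

```ts
// Hypothetical sketch: none of these names come from budget-agent itself.
type LimitReason = 'cost' | 'steps';

class LimitExceeded extends Error {
  readonly reason: LimitReason;

  constructor(reason: LimitReason) {
    super(`budget exceeded: ${reason}`);
    this.name = 'LimitExceeded';
    this.reason = reason;
  }
}

class BudgetTracker {
  private steps = 0;
  private costUSD = 0;
  private readonly maxCostUSD: number;
  private readonly maxSteps: number;

  constructor(maxCostUSD: number, maxSteps: number) {
    this.maxCostUSD = maxCostUSD;
    this.maxSteps = maxSteps;
  }

  // Check 1: runs before the provider call, using an estimated cost.
  // Nothing is recorded here, so a blocked call costs nothing.
  preflight(estimatedCostUSD: number): void {
    if (this.steps + 1 > this.maxSteps) throw new LimitExceeded('steps');
    if (this.costUSD + estimatedCostUSD > this.maxCostUSD) {
      throw new LimitExceeded('cost');
    }
  }

  // Check 2: runs after the provider call, using the real cost.
  record(actualCostUSD: number): void {
    const before = { steps: this.steps, costUSD: this.costUSD };
    this.steps += 1;
    this.costUSD += actualCostUSD;
    if (this.costUSD > this.maxCostUSD) {
      // Roll the step back so a retry starts from the previous balance.
      this.steps = before.steps;
      this.costUSD = before.costUSD;
      throw new LimitExceeded('cost');
    }
  }

  usage(): { steps: number; costUSD: number } {
    return { steps: this.steps, costUSD: this.costUSD };
  }
}
```

In a real agent loop the caller would run `preflight` with an estimate, await the provider call, and then pass the billed cost to `record`.
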
-### Custom callback instead of abort
-```ts
-const agent = new AgentBudget({
-  apiKey: key,
-  limits: { maxCostUSD: 0.01 },
-  onExceeded: (usage) => {
-    console.log(`Over budget: $${usage.totalCostUSD}`);
-    // Log, alert, switch models -- never throws
-  },
-});
-```
-### Warning thresholds
-Get notified before hitting limits:
-```ts
-const agent = new AgentBudget({
-  limits: { maxCostUSD: 0.10 },
-  warningThreshold: 0.5, // fire 'budget:warning' at 50% consumption
-});
-agent.on('budget:warning', (e) => {
-  // { reason: 'cost', pctConsumed: 0.51, remaining: 0.049 }
-});
-```
-## Adaptive model routing
+## Limits
-Downgrade to cheaper models as budget depletes:
+Every limit is optional. Set only what you need.
 ```ts
-const agent = new AgentBudget({
-  apiKey: key,
-  limits: { maxCostUSD: 5.00 },
-  adaptiveRouting: {
-    fallbackChain: [
-      'anthropic/claude-opus-4.8-fast', // best model
-      'openai/gpt-4o',                  // mid-tier
-      'openrouter/free',                // emergency fallback
-    ],
-    thresholds: [0.4, 0.75], // downgrade at 40% and 75% budget consumed
-  },
-});
+limits: {
+  maxCostUSD:      0.05,   // total USD across all steps
+  maxSteps:        10,     // total LLM calls
+  maxInputTokens:  50000,  // input tokens only
+  maxOutputTokens: 10000,  // output tokens only
+  maxTotalTokens:  60000,  // input + output combined
+  maxWallTimeMs:   60000,  // wall clock time in ms
+}
 ```
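
Because every limit is optional, a check over the `limits` object has to skip fields that are not set. The sketch below shows one way to do that. The field names mirror the README's `limits` object and the fields it lists for `getUsage()`; the function `firstExceeded`, the check order, the strict `>` comparison, and the `'inputTokens'` and `'outputTokens'` reason strings are assumptions made for illustration.

```ts
// Field names follow the README's `limits` block; all are optional.
interface Limits {
  maxCostUSD?: number;
  maxSteps?: number;
  maxInputTokens?: number;
  maxOutputTokens?: number;
  maxTotalTokens?: number;
  maxWallTimeMs?: number;
}

// Field names follow what the README lists for getUsage() (stepHistory omitted).
interface Usage {
  steps: number;
  totalInputTokens: number;
  totalOutputTokens: number;
  totalCostUSD: number;
  elapsedMs: number;
}

// 'inputTokens' and 'outputTokens' are hypothetical reason strings.
type ExceededReason =
  | 'cost'
  | 'steps'
  | 'inputTokens'
  | 'outputTokens'
  | 'totalTokens'
  | 'wallTime';

// Returns the first limit that usage has gone over, or null if none has.
function firstExceeded(usage: Usage, limits: Limits): ExceededReason | null {
  const totalTokens = usage.totalInputTokens + usage.totalOutputTokens;
  const checks: Array<[ExceededReason, number, number | undefined]> = [
    ['cost', usage.totalCostUSD, limits.maxCostUSD],
    ['steps', usage.steps, limits.maxSteps],
    ['inputTokens', usage.totalInputTokens, limits.maxInputTokens],
    ['outputTokens', usage.totalOutputTokens, limits.maxOutputTokens],
    ['totalTokens', totalTokens, limits.maxTotalTokens],
    ['wallTime', usage.elapsedMs, limits.maxWallTimeMs],
  ];
  for (const [reason, used, max] of checks) {
    // An unset limit never triggers.
    if (max !== undefined && used > max) return reason;
  }
  return null;
}
```
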
-## Bring your own executor
+## Custom executor (any provider)
 Use any LLM provider with a custom executor:
@@ -151,20 +100,23 @@ const agent = new AgentBudget({
 });
 ```
-Works with OpenAI, Anthropic, Ollama, Together AI, Fireworks, LocalAI, or any OpenAI-compatible API.
+Works with OpenAI, Anthropic, Ollama, Together AI, Fireworks, LocalAI, or any OpenAI-compatible endpoint.
 ## Features
-- **Budget enforcement** -- cost, tokens, steps, wall time limits checked before and after every LLM call
-- **Adaptive routing** -- automatic model downgrade as budget depletes
-- **Circuit breaker** -- detect repetition or stagnation and halt the agent
-- **Auto-compress** -- truncate message history with LLM summary when tokens exceed threshold
+- **Cost limits** -- hard stop at a USD ceiling across all steps
+- **Token limits** -- cap input, output, or total tokens
+- **Step limits** -- max number of LLM calls
+- **Wall time limits** -- kill agents that run too long
+- **Pre-flight checks** -- estimate cost before spending money
+- **Rollback on exceed** -- step rolls back so retry stays clean
+- **Adaptive routing** -- auto-downgrade to cheaper models as budget depletes
+- **Circuit breaker** -- detect repetition or stagnation, halt the agent
+- **Auto-compress** -- truncate message history when tokens exceed threshold
 - **Checkpoints** -- save and resume agent state across restarts
-- **Streaming** -- set `stream: true` and listen for `step:token` events
-- **Events** -- subscribe to `step:start`, `step:end`, `step:token`, `budget:exceeded`, and more
-- **Pricing cache** -- model pricing fetched from OpenRouter with configurable TTL
+- **Streaming** -- set `stream: true`, listen for `step:token` events
 - **Rate-limit retry** -- automatic 429 retry with exponential backoff
-- **OpenTelemetry** -- optional tracing spans via `telemetry: { enabled: true }`
+- **OpenTelemetry** -- optional tracing spans
 ## API
@@ -172,82 +124,54 @@ Works with OpenAI, Anthropic, Ollama, Together AI, Fireworks, LocalAI, or any Op
 | Option | Type | Default | Description |
 |--------|------|---------|-------------|
-| `apiKey` | `string` | -- | Your provider API key |
-| `limits` | `object` | -- | Budget limits (cost, tokens, steps, wall time) |
-| `executor` | `function` | -- | Custom API executor (replaces built-in fetch) |
+| `apiKey` | `string` | required | Your provider API key |
+| `limits` | `object` | required | Budget limits (cost, tokens, steps, wall time) |
+| `executor` | `function` | -- | Custom API executor |
 | `baseUrl` | `string` | `https://openrouter.ai/api/v1` | API base URL |
 | `defaultHeaders` | `object` | -- | Extra HTTP headers |
 | `autoCompress` | `object` | -- | Auto-compress messages at token threshold |
-| `circuitBreaker` | `object` | -- | Detect repetition/stagnation loops |
-| `adaptiveRouting` | `object` | -- | Downgrade model tiers as budget depletes |
+| `circuitBreaker` | `object` | -- | Detect repetition/stagnation |
+| `adaptiveRouting` | `object` | -- | Downgrade models as budget depletes |
 | `checkpoint` | `object` | -- | Persist and resume agent state |
-| `onExceeded` | `'abort' \| function` | `'abort'` | Strategy when budget exceeded |
+| `onExceeded` | `'abort' \| function` | `'abort'` | Strategy when limit hit |
 | `onEvent` | `function` | -- | Global event listener |
-| `warningThreshold` | `number` | `0.75` | Fraction of limit that triggers warning |
-| `pricingCacheTTLMs` | `number` | `300000` | Pricing cache TTL in ms |
+| `warningThreshold` | `number` | `0.75` | Warning at this fraction of any limit |
+| `pricingCacheTTLMs` | `number` | `300000` | Pricing cache TTL |
 | `telemetry` | `object` | -- | Enable OpenTelemetry spans |
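
The `warningThreshold` row describes a fraction of any limit. A minimal sketch of that arithmetic follows, assuming the warning triggers once consumption reaches the threshold. `checkWarning` is a hypothetical helper; the payload fields `reason`, `pctConsumed`, and `remaining` are borrowed from the 0.4.6 README example that this release removes, so they may not match the current event shape.

```ts
// Payload shape taken from the removed 0.4.6 README example; may be out of date.
interface BudgetWarning {
  reason: string;
  pctConsumed: number;
  remaining: number;
}

// Hypothetical helper: returns a warning once `used` reaches `threshold * max`.
function checkWarning(
  reason: string,
  used: number,
  max: number,
  threshold = 0.75, // default listed in the options table
): BudgetWarning | null {
  if (max <= 0) return null; // avoid dividing by zero on a degenerate limit
  const pctConsumed = used / max;
  if (pctConsumed < threshold) return null;
  return { reason, pctConsumed, remaining: Math.max(0, max - used) };
}
```

With `maxSteps: 8` and the default threshold, this helper stays silent through step 5 and returns a warning at step 6, where consumption reaches 0.75.
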
 ### `agent.step(request)`
-Make one LLM call. Checks limits before and after. Throws `BudgetError` if exceeded.
+One LLM call. Checks limits before and after. Throws `BudgetError` on exceed.
 ```ts
 const response = await agent.step({
-  model: 'anthropic/claude-opus-4.8-fast',
+  model: 'anthropic/claude-sonnet-4-5',
   messages: [{ role: 'user', content: 'Hi' }],
-  stream: true, // optional -- emit step:token events
+  stream: true,
 });
 ```
 ### `agent.getUsage()`
-Returns current usage snapshot:
-```ts
-{
-  steps: number;
-  totalInputTokens: number;
-  totalOutputTokens: number;
-  totalCostUSD: number;
-  elapsedMs: number;
-  stepHistory: StepUsage[];
-}
-```
-### `agent.summary()`
-Prints a formatted table to console and returns the usage snapshot.
+Returns current usage: `steps`, `totalInputTokens`, `totalOutputTokens`, `totalCostUSD`, `elapsedMs`, `stepHistory`.
 ### `agent.reset()`
-Reset all usage counters.
-### `agent.compressMessages(messages, keepLastN?)`
-Manually compress a message array via LLM summary.
+Reset all counters.
-### `agent.loadCheckpoint()` / `agent.clearCheckpoint()`
+### `agent.refreshPricing()`
-Load or clear persisted checkpoint state.
-### `AgentBudget.resume(options, checkpointPath?)`
-Static factory. Creates a new agent pre-loaded with checkpoint state.
+Force re-fetch model prices from OpenRouter.
 ## Events
 ```ts
-agent.on('step:start', (e) => console.log('Step', e.stepIndex, 'started'));
-agent.on('step:token', (e) => process.stdout.write(e.token));
-agent.on('step:end', (e) => console.log('Step cost:', e.costUSD));
-agent.on('budget:exceeded', (e) => console.log('Limit hit:', e.exceeded.reason));
-agent.on('model:downgraded', (e) => console.log('Downgraded to', e.to));
-```
-## Testing
-```bash
-npm test
+agent.on('step:start', (e) => {});
+agent.on('step:token', (e) => {});
+agent.on('step:end', (e) => {});
+agent.on('budget:exceeded', (e) => {});
+agent.on('budget:warning', (e) => {});
+agent.on('model:downgraded', (e) => {});
 ```
 ## License

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "budget-agent",
-  "version": "0.4.6",
+  "version": "0.4.7",
   "description": "Control LLM agent costs with real-time token, cost, and step tracking. Set budget limits, enforce spend caps, and prevent runaway agents. Works with OpenAI, Anthropic, OpenRouter, Ollama, and any provider.",
   "type": "module",
   "main": "./dist/index.js",