ai-sdk-rate-limiter 0.1.0 → 0.3.0

Files changed (8)

package/README.md CHANGED

@@ -77,7 +77,7 @@ const limiter = createRateLimiter({
       daily:   50,
       monthly: 500,
     },
-    onExceeded: 'throw', // or 'queue' — wait until the period resets
+    onExceeded: 'throw', // 'throw' | 'queue' | 'fallback'
   },
   // Queue behavior
@@ -169,6 +169,49 @@ Costs are based on **actual token counts** from API responses — not estimates.
 ---
+## Budget fallback routing
+When a budget limit is hit, you can transparently reroute to a cheaper model instead of throwing an error. Pass a `fallback` option to `wrap()`:
+```typescript
+const limiter = createRateLimiter({
+  cost: {
+    budget: { daily: 10 },
+    onExceeded: 'fallback',  // reroute to fallback instead of throwing
+  },
+  on: {
+    budgetHit: ({ model, currentCostUsd, limitUsd, period }) =>
+      console.warn(`${model} ${period} budget hit ($${currentCostUsd} of $${limitUsd})`),
+  },
+})
+const model = limiter.wrap(
+  openai('gpt-4o'),                     // primary model
+  { fallback: openai('gpt-4o-mini') },  // used when budget is exceeded
+)
+// Under budget  → uses gpt-4o normally
+// Over $10/day  → silently switches to gpt-4o-mini, no code changes needed
+const result = await generateText({ model, prompt })
+```
+**How it works:**
+1. The budget is checked before every request against total rolling spend
+2. When exceeded, `BudgetExceededError` is caught inside `wrap()` before it reaches your code
+3. The request is re-executed against the fallback model, bypassing the budget pre-check
+4. Fallback usage is tracked under the fallback model's ID in `getCostReport()`
+**Behavior matrix:**
+| `onExceeded` | `fallback` configured | Outcome |
+|---|---|---|
+| `'throw'` | any | Throws `BudgetExceededError` |
+| `'fallback'` | yes | Transparently uses fallback model |
+| `'fallback'` | no | Throws `BudgetExceededError` |
+| `'queue'` | any | Queues until period resets |
+---
 ## Backpressure — know before you send
 Check estimated wait time before committing to a request. Useful for showing loading states or shedding load gracefully.
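
Note on the hunk above: the behavior matrix added to the README can be read as a small decision function. The sketch below encodes only the four documented rows. The type and function names are hypothetical and are not exports of `ai-sdk-rate-limiter`; the sketch models the documentation, not the bundled implementation.

```typescript
// Sketch only: hypothetical names, modeled on the README table rather than on the package's code.
type OnExceeded = 'throw' | 'queue' | 'fallback'
type BudgetOutcome = 'throw' | 'use-fallback' | 'queue'

// Maps (onExceeded, fallback configured?) to the outcome listed in the behavior matrix.
function documentedOutcome(onExceeded: OnExceeded, hasFallback: boolean): BudgetOutcome {
  if (onExceeded === 'queue') return 'queue' // queues regardless of fallback
  if (onExceeded === 'fallback' && hasFallback) return 'use-fallback' // reroute to the fallback model
  return 'throw' // 'throw', or 'fallback' with no fallback model configured
}
```

The practical consequence of the third row is that `onExceeded: 'fallback'` without a `fallback` passed to `wrap()` behaves like `'throw'`, so callers should still be prepared to handle `BudgetExceededError`.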
@@ -209,7 +252,7 @@ limiter.off('queued', handler)
 | `dequeued` | Request leaves the queue | `model`, `waitedMs`, `priority` |
 | `retrying` | A failed request is about to retry | `model`, `attempt`, `maxAttempts`, `delayMs`, `error` |
 | `rateLimited` | Limit hit (local or remote 429) | `model`, `source`, `limitType`, `resetAt` |
-| `budgetHit` | Cost budget exceeded | `model`, `currentCostUsd`, `limitUsd`, `period` |
+| `budgetHit` | Cost budget exceeded | `model`, `currentCostUsd`, `limitUsd`, `period`, `usingFallback` |
 | `dropped` | Request rejected (queue full or timeout) | `model`, `reason` |
 | `completed` | Request finished successfully | `model`, `inputTokens`, `outputTokens`, `costUsd`, `latencyMs` |
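
Note on the hunk above: the new `usingFallback` field lets a `budgetHit` handler say whether the request is being rerouted. A minimal sketch of such a handler follows. The `BudgetHitEvent` interface is a hypothetical local type that mirrors the payload columns in the table (with `period` assumed to be a string); it is not imported from the package.

```typescript
// Hypothetical local type mirroring the documented payload fields; not a package export.
interface BudgetHitEvent {
  model: string
  currentCostUsd: number
  limitUsd: number
  period: string // assumption: a label such as 'daily' or 'monthly'
  usingFallback: boolean
}

// Formats a one-line log message from a budgetHit payload.
function describeBudgetHit(e: BudgetHitEvent): string {
  const action = e.usingFallback ? 'rerouting to fallback' : 'no fallback in use'
  return `${e.model} ${e.period} budget hit ($${e.currentCostUsd.toFixed(2)} of $${e.limitUsd.toFixed(2)}), ${action}`
}

const message = describeBudgetHit({
  model: 'gpt-4o',
  currentCostUsd: 10.02,
  limitUsd: 10,
  period: 'daily',
  usingFallback: true,
})
console.log(message)
// gpt-4o daily budget hit ($10.02 of $10.00), rerouting to fallback
```

A function like this could plausibly be passed through the `on: { budgetHit: ... }` config key shown in the README example, wrapped in `console.warn`.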
@@ -304,7 +347,7 @@ const model = req.user.plan === 'paid'
 ## Built-in model registry
-Limits and pricing are built-in for every major model. These defaults are Tier 1 (most conservative) — override with your actual tier limits.
+Limits and pricing are built-in for every major model across 6 providers. Defaults are conservative (free/Tier 1) — override with your actual plan limits.
 **OpenAI**
@@ -332,13 +375,54 @@ Limits and pricing are built-in for every major model. These defaults are Tier 1
 | gemini-1.5-pro | 2 | 32,000 | $1.25 | $5.00 |
 | gemini-1.5-flash | 15 | 1,000,000 | $0.075 | $0.30 |
+**Groq** (free tier defaults — on-demand tier is 6,000 RPM / 200k TPM)
+| Model | RPM | ITPM | Input $/M | Output $/M |
+|---|---|---|---|---|
+| llama-3.3-70b-versatile | 30 | 6,000 | $0.59 | $0.79 |
+| llama-3.1-8b-instant | 30 | 20,000 | $0.05 | $0.08 |
+| mixtral-8x7b-32768 | 30 | 5,000 | $0.24 | $0.24 |
+| gemma2-9b-it | 30 | 15,000 | $0.20 | $0.20 |
+| deepseek-r1-distill-llama-70b | 30 | 6,000 | $0.75 | $0.99 |
+**Mistral**
+| Model | RPM | ITPM | Input $/M | Output $/M |
+|---|---|---|---|---|
+| mistral-large-latest | 500 | 100,000 | $2.00 | $6.00 |
+| mistral-small-latest | 500 | 100,000 | $0.10 | $0.30 |
+| codestral-latest | 500 | 100,000 | $0.30 | $0.90 |
+| open-mistral-nemo | 500 | 100,000 | $0.15 | $0.15 |
+| pixtral-large-latest | 500 | 100,000 | $2.00 | $6.00 |
+**Cohere** (trial tier defaults — production tier is 10,000+ RPM)
+| Model | RPM | ITPM | Input $/M | Output $/M |
+|---|---|---|---|---|
+| command-r-plus | 20 | 100,000 | $2.50 | $10.00 |
+| command-r | 20 | 100,000 | $0.15 | $0.60 |
+| command | 20 | 100,000 | $0.50 | $1.50 |
+| command-light | 20 | 100,000 | $0.15 | $0.60 |
 Unknown models fall back to 60 RPM / 100k ITPM with no cost tracking. You can inspect or extend the registry:
 ```typescript
-import { OPENAI_MODELS, ANTHROPIC_MODELS, resolveModelLimits, isKnownModel } from 'ai-sdk-rate-limiter'
+import {
+  OPENAI_MODELS,
+  ANTHROPIC_MODELS,
+  GOOGLE_MODELS,
+  GROQ_MODELS,
+  MISTRAL_MODELS,
+  COHERE_MODELS,
+  resolveModelLimits,
+  isKnownModel,
+} from 'ai-sdk-rate-limiter'
+console.log(GROQ_MODELS['llama-3.3-70b-versatile'])
+// { rpm: 30, itpm: 6000, rpd: 1000, inputPricePerMillion: 0.59, ... }
-console.log(OPENAI_MODELS['gpt-4o'])
-// { rpm: 500, itpm: 30000, otpm: 30000, inputPricePerMillion: 2.5, ... }
+console.log(isKnownModel('llama-3.3-70b-versatile', 'groq'))
+// true
 console.log(isKnownModel('my-fine-tune', 'openai'))
 // false → will use fallback limits

package/dist/index.cjs CHANGED

@@ -317,7 +317,7 @@ var CostTracker = class {
     ];
     for (const { limit, current, period } of checks) {
       if (limit !== void 0 && current + estimatedCostUsd > limit) {
-        if (onExceeded === "throw") {
+        if (onExceeded === "throw" || onExceeded === "fallback") {
           throw new BudgetExceededError(model, current, limit, period);
         }
         return false;
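
Note on the hunk above: with this change, `'fallback'` takes the same throwing path as `'throw'`, which is what allows a caller further up to catch the error and reroute. The sketch below is a simplified stand-in for the check loop, not the package's `CostTracker`. The `BudgetCheck` shape and the final `return true` are assumptions, since the surrounding code is not part of the diff.

```typescript
// Sketch only: a simplified stand-in for the budget check shown in the hunk.
type OnExceeded = 'throw' | 'queue' | 'fallback'

class BudgetExceededError extends Error {
  model: string
  currentCostUsd: number
  limitUsd: number
  period: string
  constructor(model: string, currentCostUsd: number, limitUsd: number, period: string) {
    super(`${model} exceeded its ${period} budget ($${currentCostUsd} of $${limitUsd})`)
    this.name = 'BudgetExceededError'
    this.model = model
    this.currentCostUsd = currentCostUsd
    this.limitUsd = limitUsd
    this.period = period
  }
}

// Assumed shape: one entry per budget period; limit is undefined when that period is unconfigured.
interface BudgetCheck {
  limit: number | undefined
  current: number
  period: string
}

function checkBudget(
  model: string,
  estimatedCostUsd: number,
  checks: BudgetCheck[],
  onExceeded: OnExceeded,
): boolean {
  for (const { limit, current, period } of checks) {
    if (limit !== undefined && current + estimatedCostUsd > limit) {
      // 'fallback' throws just like 'throw' so that a wrapper can catch and reroute.
      if (onExceeded === 'throw' || onExceeded === 'fallback') {
        throw new BudgetExceededError(model, current, limit, period)
      }
      return false // 'queue': report "over budget" without throwing
    }
  }
  return true // assumption: true means every configured period has headroom
}
```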
@@ -873,6 +873,341 @@ var GOOGLE_MODELS = {
   }
 };
+// src/registry/groq.ts
+var GROQ_MODELS = {
+  // -------------------------------------------------------------------------
+  // Llama 3.3 family
+  // -------------------------------------------------------------------------
+  "llama-3.3-70b-versatile": {
+    rpm: 30,
+    itpm: 6e3,
+    otpm: 6e3,
+    rpd: 1e3,
+    inputPricePerMillion: 0.59,
+    outputPricePerMillion: 0.79
+  },
+  "llama-3.3-70b-specdec": {
+    rpm: 30,
+    itpm: 6e3,
+    otpm: 6e3,
+    rpd: 1e3,
+    inputPricePerMillion: 0.59,
+    outputPricePerMillion: 0.99
+  },
+  // -------------------------------------------------------------------------
+  // Llama 3.1 family
+  // -------------------------------------------------------------------------
+  "llama-3.1-8b-instant": {
+    rpm: 30,
+    itpm: 2e4,
+    otpm: 2e4,
+    rpd: 14400,
+    inputPricePerMillion: 0.05,
+    outputPricePerMillion: 0.08
+  },
+  "llama-3.1-70b-versatile": {
+    rpm: 30,
+    itpm: 6e3,
+    otpm: 6e3,
+    rpd: 1e3,
+    inputPricePerMillion: 0.59,
+    outputPricePerMillion: 0.79
+  },
+  // -------------------------------------------------------------------------
+  // Llama 3 family
+  // -------------------------------------------------------------------------
+  "llama3-70b-8192": {
+    rpm: 30,
+    itpm: 6e3,
+    otpm: 6e3,
+    rpd: 14400,
+    inputPricePerMillion: 0.59,
+    outputPricePerMillion: 0.79
+  },
+  "llama3-8b-8192": {
+    rpm: 30,
+    itpm: 3e4,
+    otpm: 3e4,
+    rpd: 14400,
+    inputPricePerMillion: 0.05,
+    outputPricePerMillion: 0.08
+  },
+  "llama-guard-3-8b": {
+    rpm: 30,
+    itpm: 15e3,
+    otpm: 15e3,
+    rpd: 14400,
+    inputPricePerMillion: 0.2,
+    outputPricePerMillion: 0.2
+  },
+  // -------------------------------------------------------------------------
+  // Mixtral family
+  // -------------------------------------------------------------------------
+  "mixtral-8x7b-32768": {
+    rpm: 30,
+    itpm: 5e3,
+    otpm: 5e3,
+    rpd: 14400,
+    inputPricePerMillion: 0.24,
+    outputPricePerMillion: 0.24
+  },
+  // -------------------------------------------------------------------------
+  // Gemma family
+  // -------------------------------------------------------------------------
+  "gemma2-9b-it": {
+    rpm: 30,
+    itpm: 15e3,
+    otpm: 15e3,
+    rpd: 14400,
+    inputPricePerMillion: 0.2,
+    outputPricePerMillion: 0.2
+  },
+  "gemma-7b-it": {
+    rpm: 30,
+    itpm: 15e3,
+    otpm: 15e3,
+    rpd: 14400,
+    inputPricePerMillion: 0.07,
+    outputPricePerMillion: 0.07
+  },
+  // -------------------------------------------------------------------------
+  // Deepseek family
+  // -------------------------------------------------------------------------
+  "deepseek-r1-distill-llama-70b": {
+    rpm: 30,
+    itpm: 6e3,
+    otpm: 6e3,
+    rpd: 1e3,
+    inputPricePerMillion: 0.75,
+    outputPricePerMillion: 0.99
+  },
+  "deepseek-r1-distill-qwen-32b": {
+    rpm: 30,
+    itpm: 6e3,
+    otpm: 6e3,
+    rpd: 1e3,
+    inputPricePerMillion: 0.69,
+    outputPricePerMillion: 0.69
+  }
+};
+// src/registry/mistral.ts
+var MISTRAL_MODELS = {
+  // -------------------------------------------------------------------------
+  // Mistral Large — frontier model
+  // -------------------------------------------------------------------------
+  "mistral-large-latest": {
+    rpm: 500,
+    itpm: 1e5,
+    otpm: 1e5,
+    inputPricePerMillion: 2,
+    outputPricePerMillion: 6
+  },
+  "mistral-large-2411": {
+    rpm: 500,
+    itpm: 1e5,
+    otpm: 1e5,
+    inputPricePerMillion: 2,
+    outputPricePerMillion: 6
+  },
+  "mistral-large-2407": {
+    rpm: 500,
+    itpm: 1e5,
+    otpm: 1e5,
+    inputPricePerMillion: 2,
+    outputPricePerMillion: 6
+  },
+  // -------------------------------------------------------------------------
+  // Mistral Small — efficient, low-cost
+  // -------------------------------------------------------------------------
+  "mistral-small-latest": {
+    rpm: 500,
+    itpm: 1e5,
+    otpm: 1e5,
+    inputPricePerMillion: 0.1,
+    outputPricePerMillion: 0.3
+  },
+  "mistral-small-2409": {
+    rpm: 500,
+    itpm: 1e5,
+    otpm: 1e5,
+    inputPricePerMillion: 0.1,
+    outputPricePerMillion: 0.3
+  },
+  // -------------------------------------------------------------------------
+  // Pixtral Large — multimodal
+  // -------------------------------------------------------------------------
+  "pixtral-large-latest": {
+    rpm: 500,
+    itpm: 1e5,
+    otpm: 1e5,
+    inputPricePerMillion: 2,
+    outputPricePerMillion: 6
+  },
+  "pixtral-large-2411": {
+    rpm: 500,
+    itpm: 1e5,
+    otpm: 1e5,
+    inputPricePerMillion: 2,
+    outputPricePerMillion: 6
+  },
+  "pixtral-12b": {
+    rpm: 500,
+    itpm: 1e5,
+    otpm: 1e5,
+    inputPricePerMillion: 0.15,
+    outputPricePerMillion: 0.15
+  },
+  "pixtral-12b-2409": {
+    rpm: 500,
+    itpm: 1e5,
+    otpm: 1e5,
+    inputPricePerMillion: 0.15,
+    outputPricePerMillion: 0.15
+  },
+  // -------------------------------------------------------------------------
+  // Codestral — code-optimized
+  // -------------------------------------------------------------------------
+  "codestral-latest": {
+    rpm: 500,
+    itpm: 1e5,
+    otpm: 1e5,
+    inputPricePerMillion: 0.3,
+    outputPricePerMillion: 0.9
+  },
+  "codestral-2501": {
+    rpm: 500,
+    itpm: 1e5,
+    otpm: 1e5,
+    inputPricePerMillion: 0.3,
+    outputPricePerMillion: 0.9
+  },
+  // -------------------------------------------------------------------------
+  // Open models (free / self-hosted weights available)
+  // -------------------------------------------------------------------------
+  "open-mistral-nemo": {
+    rpm: 500,
+    itpm: 1e5,
+    otpm: 1e5,
+    inputPricePerMillion: 0.15,
+    outputPricePerMillion: 0.15
+  },
+  "open-mixtral-8x22b": {
+    rpm: 500,
+    itpm: 1e5,
+    otpm: 1e5,
+    inputPricePerMillion: 2,
+    outputPricePerMillion: 6
+  },
+  "open-mixtral-8x7b": {
+    rpm: 500,
+    itpm: 1e5,
+    otpm: 1e5,
+    inputPricePerMillion: 0.7,
+    outputPricePerMillion: 0.7
+  },
+  "open-mistral-7b": {
+    rpm: 500,
+    itpm: 1e5,
+    otpm: 1e5,
+    inputPricePerMillion: 0.25,
+    outputPricePerMillion: 0.25
+  },
+  // -------------------------------------------------------------------------
+  // Mistral Embed — embedding only (no RPM-based generation limits)
+  // -------------------------------------------------------------------------
+  "mistral-embed": {
+    rpm: 500,
+    itpm: 1e5,
+    otpm: 0,
+    inputPricePerMillion: 0.1,
+    outputPricePerMillion: 0
+  }
+};
+// src/registry/cohere.ts
+var COHERE_MODELS = {
+  // -------------------------------------------------------------------------
+  // Command R+ — highest capability
+  // -------------------------------------------------------------------------
+  "command-r-plus": {
+    rpm: 20,
+    itpm: 1e5,
+    otpm: 1e5,
+    inputPricePerMillion: 2.5,
+    outputPricePerMillion: 10
+  },
+  "command-r-plus-08-2024": {
+    rpm: 20,
+    itpm: 1e5,
+    otpm: 1e5,
+    inputPricePerMillion: 2.5,
+    outputPricePerMillion: 10
+  },
+  "command-r-plus-04-2024": {
+    rpm: 20,
+    itpm: 1e5,
+    otpm: 1e5,
+    inputPricePerMillion: 2.5,
+    outputPricePerMillion: 10
+  },
+  // -------------------------------------------------------------------------
+  // Command R — balanced, RAG-optimized
+  // -------------------------------------------------------------------------
+  "command-r": {
+    rpm: 20,
+    itpm: 1e5,
+    otpm: 1e5,
+    inputPricePerMillion: 0.15,
+    outputPricePerMillion: 0.6
+  },
+  "command-r-08-2024": {
+    rpm: 20,
+    itpm: 1e5,
+    otpm: 1e5,
+    inputPricePerMillion: 0.15,
+    outputPricePerMillion: 0.6
+  },
+  "command-r-03-2024": {
+    rpm: 20,
+    itpm: 1e5,
+    otpm: 1e5,
+    inputPricePerMillion: 0.15,
+    outputPricePerMillion: 0.6
+  },
+  // -------------------------------------------------------------------------
+  // Command — legacy general-purpose
+  // -------------------------------------------------------------------------
+  "command": {
+    rpm: 20,
+    itpm: 1e5,
+    otpm: 1e5,
+    inputPricePerMillion: 0.5,
+    outputPricePerMillion: 1.5
+  },
+  "command-nightly": {
+    rpm: 20,
+    itpm: 1e5,
+    otpm: 1e5,
+    inputPricePerMillion: 0.5,
+    outputPricePerMillion: 1.5
+  },
+  "command-light": {
+    rpm: 20,
+    itpm: 1e5,
+    otpm: 1e5,
+    inputPricePerMillion: 0.15,
+    outputPricePerMillion: 0.6
+  },
+  "command-light-nightly": {
+    rpm: 20,
+    itpm: 1e5,
+    otpm: 1e5,
+    inputPricePerMillion: 0.15,
+    outputPricePerMillion: 0.6
+  }
+};
 // src/registry/index.ts
 var FALLBACK_LIMITS = {
   rpm: 60,
@@ -919,7 +1254,22 @@ function getFromRegistry(modelId, provider) {
     const stripped = modelId.replace(/^(google|vertex)\//, "");
     if (GOOGLE_MODELS[stripped]) return GOOGLE_MODELS[stripped];
   }
-  return OPENAI_MODELS[modelId] ?? ANTHROPIC_MODELS[modelId] ?? GOOGLE_MODELS[modelId];
+  if (provider === "groq") {
+    if (GROQ_MODELS[modelId]) return GROQ_MODELS[modelId];
+    const stripped = modelId.replace(/^groq\//, "");
+    if (GROQ_MODELS[stripped]) return GROQ_MODELS[stripped];
+  }
+  if (provider === "mistral") {
+    if (MISTRAL_MODELS[modelId]) return MISTRAL_MODELS[modelId];
+    const stripped = modelId.replace(/^mistral\//, "");
+    if (MISTRAL_MODELS[stripped]) return MISTRAL_MODELS[stripped];
+  }
+  if (provider === "cohere") {
+    if (COHERE_MODELS[modelId]) return COHERE_MODELS[modelId];
+    const stripped = modelId.replace(/^cohere\//, "");
+    if (COHERE_MODELS[stripped]) return COHERE_MODELS[stripped];
+  }
+  return OPENAI_MODELS[modelId] ?? ANTHROPIC_MODELS[modelId] ?? GOOGLE_MODELS[modelId] ?? GROQ_MODELS[modelId] ?? MISTRAL_MODELS[modelId] ?? COHERE_MODELS[modelId];
 }
 function isKnownModel(modelId, provider) {
   return getFromRegistry(modelId, normalizeProvider(provider)) !== void 0;
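
Note on the hunk above: each new provider branch tries the model ID as given, then retries with a `provider/` prefix stripped, and finally falls through to a provider-agnostic search across all registries. The sketch below is a table-driven variant of that lookup, not the bundled code. The two-entry registries, `lookup`, and `resolveLimits` are hypothetical, and it handles only a plain `provider/` prefix (the Google branch in the diff also strips `vertex/`). The fallback values follow the README's "60 RPM / 100k ITPM".

```typescript
// Sketch only: tiny stand-in registries and hypothetical function names.
interface ModelLimits {
  rpm: number
  itpm: number
  inputPricePerMillion?: number
  outputPricePerMillion?: number
}

const GROQ: Record<string, ModelLimits> = {
  'llama-3.3-70b-versatile': { rpm: 30, itpm: 6000, inputPricePerMillion: 0.59, outputPricePerMillion: 0.79 },
}
const MISTRAL: Record<string, ModelLimits> = {
  'mistral-small-latest': { rpm: 500, itpm: 100000, inputPricePerMillion: 0.1, outputPricePerMillion: 0.3 },
}
const REGISTRIES: Record<string, Record<string, ModelLimits>> = { groq: GROQ, mistral: MISTRAL }

function lookup(modelId: string, provider?: string): ModelLimits | undefined {
  if (provider !== undefined) {
    const table = REGISTRIES[provider]
    if (table) {
      if (table[modelId]) return table[modelId]
      // Retry with a leading "provider/" removed, e.g. "groq/llama-..." -> "llama-...".
      const prefix = `${provider}/`
      const stripped = modelId.startsWith(prefix) ? modelId.slice(prefix.length) : modelId
      if (table[stripped]) return table[stripped]
    }
  }
  // Provider-agnostic pass: the first registry that knows the ID wins.
  for (const table of Object.values(REGISTRIES)) {
    if (table[modelId]) return table[modelId]
  }
  return undefined
}

// Unknown models get generic limits and no pricing, so no cost can be tracked for them.
const FALLBACK: ModelLimits = { rpm: 60, itpm: 100000 }
function resolveLimits(modelId: string, provider?: string): ModelLimits {
  return lookup(modelId, provider) ?? FALLBACK
}
```

One consequence of the final provider-agnostic pass is that an ID shared by two registries resolves to whichever registry is searched first, so passing the provider explicitly is the safer call.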
@@ -970,7 +1320,7 @@ var Pipeline = class {
     const estimatedInput = estimateInputTokens(prompt);
     const startMs = Date.now();
     const key = `${provider}:${modelId}`;
-    if (this.config.cost?.budget) {
+    if (this.config.cost?.budget && !opts.skipBudgetCheck) {
       const estimatedCost = this.costTracker.estimateCost(
         estimatedInput,
         500,
@@ -978,12 +1328,26 @@ var Pipeline = class {
         limits.inputPricePerMillion,
         limits.outputPricePerMillion
       );
-      this.costTracker.checkBudget(
-        modelId,
-        estimatedCost,
-        this.config.cost.budget,
-        this.config.cost.onExceeded ?? "throw"
-      );
+      try {
+        this.costTracker.checkBudget(
+          modelId,
+          estimatedCost,
+          this.config.cost.budget,
+          this.config.cost.onExceeded ?? "throw"
+        );
+      } catch (err) {
+        if (err instanceof BudgetExceededError) {
+          this.emitter.emit("budgetHit", {
+            model: err.model,
+            provider,
+            currentCostUsd: err.currentCostUsd,
+            limitUsd: err.limitUsd,
+            period: err.period,
+            usingFallback: false
+          });
+        }
+        throw err;
+      }
     }
     await this.engine.acquire(key, {
       limits,
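
Note on the hunk above: the budget pre-check runs on an estimate, not on actual usage. `estimateCost` receives the estimated input tokens and a literal `500`, which appears to be a placeholder output-token count, though the diff does not confirm that. The formula itself is not shown either, so the sketch below assumes simple linear per-million pricing; `estimateCostUsd` is a hypothetical name.

```typescript
// Assumption: cost is linear in token counts at the registry's per-million prices.
function estimateCostUsd(
  inputTokens: number,
  outputTokens: number,
  inputPricePerMillion: number,
  outputPricePerMillion: number,
): number {
  return (inputTokens / 1e6) * inputPricePerMillion + (outputTokens / 1e6) * outputPricePerMillion
}

// 1,000 input tokens plus a 500-token output placeholder,
// at the llama-3.3-70b-versatile prices from the Groq registry ($0.59 in, $0.79 out).
const estimate = estimateCostUsd(1000, 500, 0.59, 0.79)
console.log(estimate.toFixed(6))
// 0.000985
```

Under this assumption, a request is rejected (or rerouted) when `current + estimate` exceeds the limit, so a response much longer than the placeholder can still push actual spend past the budget.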
@@ -1122,7 +1486,8 @@ function getPerRequestOptions(params, queueTimeout) {
   return {
     priority: raw?.priority ?? "normal",
     timeoutMs: raw?.timeout ?? queueTimeout,
-    metadata: raw?.metadata ?? {}
+    metadata: raw?.metadata ?? {},
+    skipBudgetCheck: raw?._skipBudgetCheck ?? false
   };
 }
 function extractTokenUsage(usage) {
@@ -1138,7 +1503,7 @@ function createMiddleware(pipeline, queueTimeout) {
     // wrapGenerate — non-streaming
     // -----------------------------------------------------------------------
     async wrapGenerate({ doGenerate, params, model }) {
-      const { priority, timeoutMs } = getPerRequestOptions(params, queueTimeout);
+      const { priority, timeoutMs, skipBudgetCheck } = getPerRequestOptions(params, queueTimeout);
       const modelId = model.modelId;
       const provider = model.provider;
       const startMs = Date.now();
@@ -1151,6 +1516,7 @@ function createMiddleware(pipeline, queueTimeout) {
           streaming: false,
           priority,
           timeoutMs,
+          skipBudgetCheck,
           onUsage: () => {
           }
         }
@@ -1165,7 +1531,7 @@ function createMiddleware(pipeline, queueTimeout) {
     // wrapStream — streaming
     // -----------------------------------------------------------------------
     async wrapStream({ doStream, params, model }) {
-      const { priority, timeoutMs } = getPerRequestOptions(params, queueTimeout);
+      const { priority, timeoutMs, skipBudgetCheck } = getPerRequestOptions(params, queueTimeout);
       const modelId = model.modelId;
       const provider = model.provider;
       const startMs = Date.now();
@@ -1178,6 +1544,7 @@ function createMiddleware(pipeline, queueTimeout) {
           streaming: true,
           priority,
           timeoutMs,
+          skipBudgetCheck,
           onUsage: () => {
           }
         }
@@ -1204,26 +1571,71 @@ function createMiddleware(pipeline, queueTimeout) {
 function wrapModel(model, middleware, overrides) {
   const providerId = overrides?.providerId ?? model.provider;
   const modelId = overrides?.modelId ?? model.modelId;
+  const fallbackModel = overrides?.fallback;
   return {
     specificationVersion: "v4",
     provider: providerId,
     modelId,
     supportedUrls: model["supportedUrls"],
     async doGenerate(params) {
-      return middleware.wrapGenerate({
-        doGenerate: () => model.doGenerate(params),
-        doStream: () => model.doStream(params),
-        params,
-        model
-      });
+      try {
+        return await middleware.wrapGenerate({
+          doGenerate: () => model.doGenerate(params),
+          doStream: () => model.doStream(params),
+          params,
+          model
+        });
+      } catch (err) {
+        if (err instanceof BudgetExceededError && fallbackModel) {
+          const fallbackParams = {
+            ...params,
+            providerOptions: {
+              ...params.providerOptions,
+              rateLimiter: {
+                ...params.providerOptions?.["rateLimiter"] ?? {},
+                _skipBudgetCheck: true
+              }
+            }
+          };
+          return middleware.wrapGenerate({
+            doGenerate: () => fallbackModel.doGenerate(fallbackParams),
+            doStream: () => fallbackModel.doStream(fallbackParams),
+            params: fallbackParams,
+            model: fallbackModel
+          });
+        }
+        throw err;
+      }
     },
     async doStream(params) {
-      return middleware.wrapStream({
-        doGenerate: () => model.doGenerate(params),
-        doStream: () => model.doStream(params),
-        params,
-        model
-      });
+      try {
+        return await middleware.wrapStream({
+          doGenerate: () => model.doGenerate(params),
+          doStream: () => model.doStream(params),
+          params,
+          model
+        });
+      } catch (err) {
+        if (err instanceof BudgetExceededError && fallbackModel) {
+          const fallbackParams = {
+            ...params,
+            providerOptions: {
+              ...params.providerOptions,
+              rateLimiter: {
+                ...params.providerOptions?.["rateLimiter"] ?? {},
+                _skipBudgetCheck: true
+              }
+            }
+          };
+          return middleware.wrapStream({
+            doGenerate: () => fallbackModel.doGenerate(fallbackParams),
+            doStream: () => fallbackModel.doStream(fallbackParams),
+            params: fallbackParams,
+            model: fallbackModel
+          });
+        }
+        throw err;
+      }
     }
   };
 }
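
Note on the hunk above: `doGenerate` and `doStream` build the same `fallbackParams` object, merging `_skipBudgetCheck: true` into `providerOptions.rateLimiter` while preserving everything else. The sketch below factors that merge into one helper. `CallParams` and `withSkipBudgetCheck` are hypothetical names, and the parameter type is a deliberately minimal assumption, not the AI SDK's real call-options type.

```typescript
// Hypothetical minimal shape; the real call params carry many more fields.
interface CallParams {
  providerOptions?: Record<string, Record<string, unknown>>
  [key: string]: unknown
}

// Returns a copy of params with rateLimiter._skipBudgetCheck set, leaving the input untouched.
function withSkipBudgetCheck(params: CallParams): CallParams {
  return {
    ...params,
    providerOptions: {
      ...params.providerOptions, // keep other providers' options
      rateLimiter: {
        ...(params.providerOptions?.['rateLimiter'] ?? {}), // keep priority, timeout, metadata
        _skipBudgetCheck: true,
      },
    },
  }
}

const original: CallParams = {
  prompt: 'hello',
  providerOptions: { rateLimiter: { priority: 'high' }, openai: { user: 'u1' } },
}
const fallbackParams = withSkipBudgetCheck(original)
console.log(fallbackParams.providerOptions?.['rateLimiter'])
// { priority: 'high', _skipBudgetCheck: true }
```

As shown in the hunk, the catch branch is guarded by `err instanceof BudgetExceededError && fallbackModel`. How that guard interacts with `onExceeded: 'throw'` when a fallback is also configured is not visible in this excerpt.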
@@ -1260,7 +1672,10 @@ function createRateLimiter(config = {}) {
 exports.ANTHROPIC_MODELS = ANTHROPIC_MODELS;
 exports.BudgetExceededError = BudgetExceededError;
+exports.COHERE_MODELS = COHERE_MODELS;
 exports.GOOGLE_MODELS = GOOGLE_MODELS;
+exports.GROQ_MODELS = GROQ_MODELS;
+exports.MISTRAL_MODELS = MISTRAL_MODELS;
 exports.OPENAI_MODELS = OPENAI_MODELS;
 exports.QueueFullError = QueueFullError;
 exports.QueueTimeoutError = QueueTimeoutError;