@ai-sdk/gateway 4.0.0-beta.6 → 4.0.0-beta.61

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -29,7 +29,7 @@ For most use cases, you can use the AI Gateway directly with a model string:
29
29
  import { generateText } from 'ai';
30
30
 
31
31
  const { text } = await generateText({
32
- model: 'openai/gpt-5',
32
+ model: 'openai/gpt-5.4',
33
33
  prompt: 'Hello world',
34
34
  });
35
35
  ```
@@ -39,7 +39,7 @@ const { text } = await generateText({
39
39
  import { generateText, gateway } from 'ai';
40
40
 
41
41
  const { text } = await generateText({
42
- model: gateway('openai/gpt-5'),
42
+ model: gateway('openai/gpt-5.4'),
43
43
  prompt: 'Hello world',
44
44
  });
45
45
  ```
@@ -80,7 +80,7 @@ You can use the following optional settings to customize the AI Gateway provider
80
80
 
81
81
  - **baseURL** _string_
82
82
 
83
- Use a different URL prefix for API calls. The default prefix is `https://ai-gateway.vercel.sh/v3/ai`.
83
+ Use a different URL prefix for API calls. The default prefix is `https://ai-gateway.vercel.sh/v4/ai`.
84
84
 
85
85
  - **apiKey** _string_
86
86
 
@@ -159,6 +159,8 @@ You can connect your own provider credentials to use with Vercel AI Gateway. Thi
159
159
 
160
160
  To set up BYOK, add your provider credentials in your Vercel team's AI Gateway settings. Once configured, AI Gateway automatically uses your credentials. No code changes are needed.
161
161
 
162
+ For providers like Azure where you can use custom deployment names, you can configure model mappings to map gateway model slugs to your deployment names. See [model mappings](https://vercel.com/docs/ai-gateway/byok#model-mappings) for details.
163
+
162
164
  Learn more in the [BYOK documentation](https://vercel.com/docs/ai-gateway/byok).
163
165
 
164
166
  ## Language Models
@@ -169,13 +171,41 @@ You can create language models using a provider instance. The first argument is
169
171
  import { generateText } from 'ai';
170
172
 
171
173
  const { text } = await generateText({
172
- model: 'openai/gpt-5',
174
+ model: 'openai/gpt-5.4',
173
175
  prompt: 'Explain quantum computing in simple terms',
174
176
  });
175
177
  ```
176
178
 
177
179
  AI Gateway language models can also be used in the `streamText` function and support structured data generation with [`Output`](/docs/reference/ai-sdk-core/output) (see [AI SDK Core](/docs/ai-sdk-core)).
178
180
 
181
+ ## Reranking Models
182
+
183
+ You can create reranking models using the `rerankingModel` method on the provider instance:
184
+
185
+ ```ts
186
+ import { rerank } from 'ai';
187
+ import { gateway } from '@ai-sdk/gateway';
188
+
189
+ const { ranking } = await rerank({
190
+ model: gateway.rerankingModel('cohere/rerank-v3.5'),
191
+ query: 'What is the capital of France?',
192
+ documents: [
193
+ 'Paris is the capital of France.',
194
+ 'Berlin is the capital of Germany.',
195
+ 'Madrid is the capital of Spain.',
196
+ ],
197
+ topN: 2,
198
+ });
199
+
200
+ console.log(ranking);
201
+ // [
202
+ // { originalIndex: 0, score: 0.89, document: 'Paris is the capital of France.' },
203
+ // { originalIndex: 2, score: 0.15, document: 'Madrid is the capital of Spain.' },
204
+ // ]
205
+ ```
206
+
207
+ Reranking models are useful for improving search results in retrieval-augmented generation (RAG) pipelines by re-scoring candidate documents after an initial retrieval step.
208
+
179
209
  ## Available Models
180
210
 
181
211
  The AI Gateway supports models from OpenAI, Anthropic, Google, Meta, xAI, Mistral, DeepSeek, Amazon Bedrock, Cohere, Perplexity, Alibaba, and other providers.
@@ -215,7 +245,7 @@ availableModels.models.forEach(model => {
215
245
 
216
246
  // Use any discovered model with plain string
217
247
  const { text } = await generateText({
218
- model: availableModels.models[0].id, // e.g., 'openai/gpt-4o'
248
+ model: availableModels.models[0].id, // e.g., 'openai/gpt-5.4'
219
249
  prompt: 'Hello world',
220
250
  });
221
251
  ```
@@ -238,6 +268,86 @@ The `getCredits()` method returns your team's credit information based on the au
238
268
  - **balance** _number_ - Your team's current available credit balance
239
269
  - **total_used** _number_ - Total credits consumed by your team
240
270
 
271
+ ## Generation Lookup
272
+
273
+ Look up detailed information about a specific generation by its ID, including cost, token usage, latency, and provider details. Generation IDs are available in `providerMetadata.gateway.generationId` on both `generateText` and `streamText` responses.
274
+
275
+ When streaming, the generation ID is injected on the first content chunk, so you can capture it early in the stream without waiting for completion. This is especially useful in cases where a network interruption or mid-stream error could prevent you from receiving the final response — since the gateway records the final status server-side, you can use the generation ID to look up the results (including cost, token usage, and finish reason) later via `getGenerationInfo()`.
276
+
277
+ ```ts
278
+ import { gateway, generateText } from 'ai';
279
+
280
+ // Make a request
281
+ const result = await generateText({
282
+ model: gateway('anthropic/claude-sonnet-4'),
283
+ prompt: 'Explain quantum entanglement briefly',
284
+ });
285
+
286
+ // Get the generation ID from provider metadata
287
+ const generationId = result.providerMetadata?.gateway?.generationId;
288
+
289
+ // Look up detailed generation info
290
+ const generation = await gateway.getGenerationInfo({ id: generationId });
291
+
292
+ console.log(`Model: ${generation.model}`);
293
+ console.log(`Cost: $${generation.totalCost.toFixed(6)}`);
294
+ console.log(`Latency: ${generation.latency}ms`);
295
+ console.log(`Prompt tokens: ${generation.promptTokens}`);
296
+ console.log(`Completion tokens: ${generation.completionTokens}`);
297
+ ```
298
+
299
+ With `streamText`, you can capture the generation ID from the first chunk via `fullStream`:
300
+
301
+ ```ts
302
+ import { gateway, streamText } from 'ai';
303
+
304
+ const result = streamText({
305
+ model: gateway('anthropic/claude-sonnet-4'),
306
+ prompt: 'Explain quantum entanglement briefly',
307
+ });
308
+
309
+ let generationId: string | undefined;
310
+
311
+ for await (const part of result.fullStream) {
312
+ if (!generationId && part.providerMetadata?.gateway?.generationId) {
313
+ generationId = part.providerMetadata.gateway.generationId as string;
314
+ console.log(`Generation ID (early): ${generationId}`);
315
+ }
316
+ }
317
+
318
+ // Look up cost and usage after the stream completes
319
+ if (generationId) {
320
+ const generation = await gateway.getGenerationInfo({ id: generationId });
321
+ console.log(`Cost: $${generation.totalCost.toFixed(6)}`);
322
+ console.log(`Finish reason: ${generation.finishReason}`);
323
+ }
324
+ ```
325
+
326
+ The `getGenerationInfo()` method accepts:
327
+
328
+ - **id** _string_ - The generation ID to look up (format: `gen_<ulid>`, required)
329
+
330
+ It returns a `GatewayGenerationInfo` object with the following fields:
331
+
332
+ - **id** _string_ - The generation ID
333
+ - **totalCost** _number_ - Total cost in USD
334
+ - **upstreamInferenceCost** _number_ - Upstream inference cost in USD (relevant for BYOK)
335
+ - **usage** _number_ - Usage cost in USD (same as totalCost)
336
+ - **createdAt** _string_ - ISO 8601 timestamp when the generation was created
337
+ - **model** _string_ - Model identifier used
338
+ - **isByok** _boolean_ - Whether Bring Your Own Key credentials were used
339
+ - **providerName** _string_ - The provider that served this generation
340
+ - **streamed** _boolean_ - Whether streaming was used
341
+ - **finishReason** _string_ - Finish reason (e.g. `'stop'`)
342
+ - **latency** _number_ - Time to first token in milliseconds
343
+ - **generationTime** _number_ - Total generation time in milliseconds
344
+ - **promptTokens** _number_ - Number of prompt tokens
345
+ - **completionTokens** _number_ - Number of completion tokens
346
+ - **reasoningTokens** _number_ - Reasoning tokens used (if applicable)
347
+ - **cachedTokens** _number_ - Cached tokens used (if applicable)
348
+ - **cacheCreationTokens** _number_ - Cache creation input tokens
349
+ - **billableWebSearchCalls** _number_ - Number of billable web search calls
350
+
241
351
  ## Examples
242
352
 
243
353
  ### Basic Text Generation
@@ -246,7 +356,7 @@ The `getCredits()` method returns your team's credit information based on the au
246
356
  import { generateText } from 'ai';
247
357
 
248
358
  const { text } = await generateText({
249
- model: 'anthropic/claude-sonnet-4',
359
+ model: 'anthropic/claude-sonnet-4.6',
250
360
  prompt: 'Write a haiku about programming',
251
361
  });
252
362
 
@@ -259,7 +369,7 @@ console.log(text);
259
369
  import { streamText } from 'ai';
260
370
 
261
371
  const { textStream } = await streamText({
262
- model: 'openai/gpt-5',
372
+ model: 'openai/gpt-5.4',
263
373
  prompt: 'Explain the benefits of serverless architecture',
264
374
  });
265
375
 
@@ -297,13 +407,13 @@ const { text } = await generateText({
297
407
  Some providers offer tools that are executed by the provider itself, such as [OpenAI's web search tool](/providers/ai-sdk-providers/openai#web-search-tool). To use these tools through AI Gateway, import the provider to access the tool definitions:
298
408
 
299
409
  ```ts
300
- import { generateText, stepCountIs } from 'ai';
410
+ import { generateText, stepCountIs } from 'ai';
301
411
  import { openai } from '@ai-sdk/openai';
302
412
 
303
413
  const result = await generateText({
304
- model: 'openai/gpt-5-mini',
414
+ model: 'openai/gpt-5.4-mini',
305
415
  prompt: 'What is the Vercel AI Gateway?',
306
- stopWhen: stepCountIs(10),
416
+ stopWhen: stepCountIs(10),
307
417
  tools: {
308
418
  web_search: openai.tools.webSearch({}),
309
419
  },
@@ -330,7 +440,7 @@ The Perplexity Search tool enables models to search the web using [Perplexity's
330
440
  import { gateway, generateText } from 'ai';
331
441
 
332
442
  const result = await generateText({
333
- model: 'openai/gpt-5-nano',
443
+ model: 'openai/gpt-5.4-nano',
334
444
  prompt: 'Search for news about AI regulations in January 2025.',
335
445
  tools: {
336
446
  perplexity_search: gateway.tools.perplexitySearch(),
@@ -348,7 +458,7 @@ You can also configure the search with optional parameters:
348
458
  import { gateway, generateText } from 'ai';
349
459
 
350
460
  const result = await generateText({
351
- model: 'openai/gpt-5-nano',
461
+ model: 'openai/gpt-5.4-nano',
352
462
  prompt:
353
463
  'Search for news about AI regulations from the first week of January 2025.',
354
464
  tools: {
@@ -402,7 +512,7 @@ The tool works with both `generateText` and `streamText`:
402
512
  import { gateway, streamText } from 'ai';
403
513
 
404
514
  const result = streamText({
405
- model: 'openai/gpt-5-nano',
515
+ model: 'openai/gpt-5.4-nano',
406
516
  prompt: 'Search for the latest news about AI regulations.',
407
517
  tools: {
408
518
  perplexity_search: gateway.tools.perplexitySearch(),
@@ -432,7 +542,7 @@ The Parallel Search tool enables models to search the web using [Parallel AI's S
432
542
  import { gateway, generateText } from 'ai';
433
543
 
434
544
  const result = await generateText({
435
- model: 'openai/gpt-5-nano',
545
+ model: 'openai/gpt-5.4-nano',
436
546
  prompt: 'Research the latest developments in quantum computing.',
437
547
  tools: {
438
548
  parallel_search: gateway.tools.parallelSearch(),
@@ -450,7 +560,7 @@ You can also configure the search with optional parameters:
450
560
  import { gateway, generateText } from 'ai';
451
561
 
452
562
  const result = await generateText({
453
- model: 'openai/gpt-5-nano',
563
+ model: 'openai/gpt-5.4-nano',
454
564
  prompt: 'Find detailed information about TypeScript 5.0 features.',
455
565
  tools: {
456
566
  parallel_search: gateway.tools.parallelSearch({
@@ -511,7 +621,7 @@ The tool works with both `generateText` and `streamText`:
511
621
  import { gateway, streamText } from 'ai';
512
622
 
513
623
  const result = streamText({
514
- model: 'openai/gpt-5-nano',
624
+ model: 'openai/gpt-5.4-nano',
515
625
  prompt: 'Research the latest AI safety guidelines.',
516
626
  tools: {
517
627
  parallel_search: gateway.tools.parallelSearch(),
@@ -533,22 +643,24 @@ for await (const part of result.fullStream) {
533
643
  }
534
644
  ```
535
645
 
536
- ### Usage Tracking with User and Tags
646
+ ### Custom Reporting
537
647
 
538
- Track usage per end-user and categorize requests with tags:
648
+ Track usage per end-user and categorize requests with tags, then query the data through the reporting API.
649
+
650
+ #### Usage Tracking with User and Tags
539
651
 
540
652
  ```ts
541
- import type { GatewayLanguageModelOptions } from '@ai-sdk/gateway';
653
+ import type { GatewayProviderOptions } from '@ai-sdk/gateway';
542
654
  import { generateText } from 'ai';
543
655
 
544
656
  const { text } = await generateText({
545
- model: 'openai/gpt-5',
657
+ model: 'openai/gpt-5.4',
546
658
  prompt: 'Summarize this document...',
547
659
  providerOptions: {
548
660
  gateway: {
549
661
  user: 'user-abc-123', // Track usage for this specific end-user
550
662
  tags: ['document-summary', 'premium-feature'], // Categorize for reporting
551
- } satisfies GatewayLanguageModelOptions,
663
+ } satisfies GatewayProviderOptions,
552
664
  },
553
665
  });
554
666
  ```
@@ -559,6 +671,77 @@ This allows you to:
559
671
  - Filter and analyze spending by feature or use case using tags
560
672
  - Track which users or features are driving the most AI usage
561
673
 
674
+ #### Querying Spend Reports
675
+
676
+ Use the `getSpendReport()` method to query usage data programmatically. The reporting API is only available for Vercel Pro and Enterprise plans. For pricing, see the [Custom Reporting docs](https://vercel.com/docs/ai-gateway/capabilities/custom-reporting).
677
+
678
+ ```ts
679
+ import { gateway } from 'ai';
680
+
681
+ const report = await gateway.getSpendReport({
682
+ startDate: '2026-03-01',
683
+ endDate: '2026-03-25',
684
+ groupBy: 'model',
685
+ });
686
+
687
+ for (const row of report.results) {
688
+ console.log(`${row.model}: $${row.totalCost.toFixed(4)}`);
689
+ }
690
+ ```
691
+
692
+ The `getSpendReport()` method accepts the following parameters:
693
+
694
+ - **startDate** _string_ - Start date in `YYYY-MM-DD` format (inclusive, required)
695
+ - **endDate** _string_ - End date in `YYYY-MM-DD` format (inclusive, required)
696
+ - **groupBy** _string_ - Aggregation dimension: `'day'` (default), `'user'`, `'model'`, `'tag'`, `'provider'`, or `'credential_type'`
697
+ - **datePart** _string_ - Time granularity when `groupBy` is `'day'`: `'day'` or `'hour'`
698
+ - **userId** _string_ - Filter to a specific user
699
+ - **model** _string_ - Filter to a specific model (e.g. `'anthropic/claude-sonnet-4.5'`)
700
+ - **provider** _string_ - Filter to a specific provider (e.g. `'anthropic'`)
701
+ - **credentialType** _string_ - Filter by `'byok'` or `'system'` credentials
702
+ - **tags** _string[]_ - Filter to requests matching these tags
703
+
704
+ Each row in `results` contains a grouping field (matching your `groupBy` choice) and metrics:
705
+
706
+ - **totalCost** _number_ - Total cost in USD
707
+ - **marketCost** _number_ - Market cost in USD
708
+ - **inputTokens** _number_ - Number of input tokens
709
+ - **outputTokens** _number_ - Number of output tokens
710
+ - **cachedInputTokens** _number_ - Number of cached input tokens
711
+ - **cacheCreationInputTokens** _number_ - Number of cache creation input tokens
712
+ - **reasoningTokens** _number_ - Number of reasoning tokens
713
+ - **requestCount** _number_ - Number of requests
714
+
715
+ You can combine tracking and querying to analyze spend by tags you defined:
716
+
717
+ ```ts
718
+ import type { GatewayProviderOptions } from '@ai-sdk/gateway';
719
+ import { gateway, streamText } from 'ai';
720
+
721
+ // 1. Make requests with tags
722
+ const result = streamText({
723
+ model: gateway('anthropic/claude-haiku-4.5'),
724
+ prompt: "Summarize this quarter's results",
725
+ providerOptions: {
726
+ gateway: {
727
+ tags: ['team:finance', 'feature:summaries'],
728
+ } satisfies GatewayProviderOptions,
729
+ },
730
+ });
731
+
732
+ // 2. Later, query spend filtered by those tags
733
+ const report = await gateway.getSpendReport({
734
+ startDate: '2026-03-01',
735
+ endDate: '2026-03-31',
736
+ groupBy: 'tag',
737
+ tags: ['team:finance'],
738
+ });
739
+
740
+ for (const row of report.results) {
741
+ console.log(`${row.tag}: $${row.totalCost.toFixed(4)} (${row.requestCount} requests)`);
742
+ }
743
+ ```
744
+
562
745
  ## Provider Options
563
746
 
564
747
  The AI Gateway provider accepts provider options that control routing behavior and provider-specific configurations.
@@ -568,17 +751,17 @@ The AI Gateway provider accepts provider options that control routing behavior a
568
751
  You can use the `gateway` key in `providerOptions` to control how AI Gateway routes requests:
569
752
 
570
753
  ```ts
571
- import type { GatewayLanguageModelOptions } from '@ai-sdk/gateway';
754
+ import type { GatewayProviderOptions } from '@ai-sdk/gateway';
572
755
  import { generateText } from 'ai';
573
756
 
574
757
  const { text } = await generateText({
575
- model: 'anthropic/claude-sonnet-4',
758
+ model: 'anthropic/claude-sonnet-4.6',
576
759
  prompt: 'Explain quantum computing',
577
760
  providerOptions: {
578
761
  gateway: {
579
762
  order: ['vertex', 'anthropic'], // Try Vertex AI first, then Anthropic
580
763
  only: ['vertex', 'anthropic'], // Only use these providers
581
- } satisfies GatewayLanguageModelOptions,
764
+ } satisfies GatewayProviderOptions,
582
765
  },
583
766
  });
584
767
  ```
@@ -597,11 +780,25 @@ The following gateway provider options are available:
597
780
 
598
781
  Example: `only: ['anthropic', 'vertex']` will only allow routing to Anthropic or Vertex AI.
599
782
 
783
+ - **sort** _'cost' | 'ttft' | 'tps'_
784
+
785
+ Sorts available providers by a performance or cost metric before routing. The gateway will try the best-scoring provider first and fall back through the rest in sorted order. If unspecified, providers are ordered using the gateway's default system ranking.
786
+
787
+ - `'cost'` — lowest cost first
788
+ - `'ttft'` — lowest time-to-first-token first
789
+ - `'tps'` — highest tokens-per-second first
790
+
791
+ When combined with `order`, the user-specified providers are promoted to the front while remaining providers follow the sorted order.
792
+
793
+ Example: `sort: 'ttft'` will route to the provider with the fastest time-to-first-token.
794
+
795
+ When `sort` is active, the response's `providerMetadata.gateway.routing.sort` object contains the sort option used, the resulting execution order, per-provider metric values, and any providers that were deprioritized.
796
+
600
797
  - **models** _string[]_
601
798
 
602
799
  Specifies fallback models to use when the primary model fails or is unavailable. The gateway will try the primary model first (specified in the `model` parameter), then try each model in this array in order until one succeeds.
603
800
 
604
- Example: `models: ['openai/gpt-5-nano', 'gemini-2.0-flash']` will try the fallback models in order if the primary model fails.
801
+ Example: `models: ['openai/gpt-5.4-nano', 'gemini-3-flash-preview']` will try the fallback models in order if the primary model fails.
605
802
 
606
803
  - **user** _string_
607
804
 
@@ -621,15 +818,30 @@ The following gateway provider options are available:
621
818
 
622
819
  Each provider can have multiple credentials (tried in order). The structure is a record where keys are provider slugs and values are arrays of credential objects.
623
820
 
821
+ Each credential can optionally include a `modelMappings` array to map AI Gateway model slugs to your deployment names (for example, custom Azure deployment names). If a BYOK request fails, the gateway falls back to system credentials using the default model name.
822
+
624
823
  Examples:
625
824
 
626
825
  - Single provider: `byok: { 'anthropic': [{ apiKey: 'sk-ant-...' }] }`
627
826
  - Multiple credentials: `byok: { 'vertex': [{ project: 'proj-1', googleCredentials: { privateKey: '...', clientEmail: '...' } }, { project: 'proj-2', googleCredentials: { privateKey: '...', clientEmail: '...' } }] }`
628
827
  - Multiple providers: `byok: { 'anthropic': [{ apiKey: '...' }], 'bedrock': [{ accessKeyId: '...', secretAccessKey: '...' }] }`
828
+ - With model mappings: `byok: { 'azure': [{ apiKey: '...', resourceName: '...', modelMappings: [{ gatewayModelSlug: 'openai/gpt-5.4-nano', customModelId: 'my-deployment' }] }] }`
629
829
 
630
830
  - **zeroDataRetention** _boolean_
631
831
 
632
- Restricts routing requests to providers that have zero data retention policies.
832
+ Restricts routing to providers that have zero data retention agreements with Vercel for AI Gateway. When using BYOK credentials, this filter is not applied. If BYOK credentials fail and the request falls back to system credentials, only providers with zero data retention agreements will be used. If there are no providers available for the model with zero data retention, the request will fail. Request-level ZDR is only available for Vercel Pro and Enterprise plans.
833
+
834
+ - **disallowPromptTraining** _boolean_
835
+
836
+ Restricts routing to providers that have agreements with Vercel for AI Gateway to not use prompts for model training. When using BYOK credentials, this filter is not applied. If BYOK credentials fail and the request falls back to system credentials, only providers that do not train on prompt data will be used. If there are no providers available for the model that disallow prompt training, the request will fail.
837
+
838
+ - **hipaaCompliant** _boolean_
839
+
840
+ Restricts routing to models and tools from providers that have signed a BAA with Vercel for the use of AI Gateway (requires Vercel HIPAA BAA add on). BYOK credentials are skipped when `hipaaCompliant` is set to `true` to ensure that requests are only routed to providers that support HIPAA compliance.
841
+
842
+ - **quotaEntityId** _string_
843
+
844
+ The unique identifier for the entity against which quota is tracked. Used for quota management and enforcement purposes.
633
845
 
634
846
  - **providerTimeouts** _object_
635
847
 
@@ -642,17 +854,17 @@ The following gateway provider options are available:
642
854
  You can combine these options to have fine-grained control over routing and tracking:
643
855
 
644
856
  ```ts
645
- import type { GatewayLanguageModelOptions } from '@ai-sdk/gateway';
857
+ import type { GatewayProviderOptions } from '@ai-sdk/gateway';
646
858
  import { generateText } from 'ai';
647
859
 
648
860
  const { text } = await generateText({
649
- model: 'anthropic/claude-sonnet-4',
861
+ model: 'anthropic/claude-sonnet-4.6',
650
862
  prompt: 'Write a haiku about programming',
651
863
  providerOptions: {
652
864
  gateway: {
653
865
  order: ['vertex'], // Prefer Vertex AI
654
866
  only: ['anthropic', 'vertex'], // Only allow these providers
655
- } satisfies GatewayLanguageModelOptions,
867
+ } satisfies GatewayProviderOptions,
656
868
  },
657
869
  });
658
870
  ```
@@ -662,43 +874,98 @@ const { text } = await generateText({
662
874
  The `models` option enables automatic fallback to alternative models when the primary model fails:
663
875
 
664
876
  ```ts
665
- import type { GatewayLanguageModelOptions } from '@ai-sdk/gateway';
877
+ import type { GatewayProviderOptions } from '@ai-sdk/gateway';
666
878
  import { generateText } from 'ai';
667
879
 
668
880
  const { text } = await generateText({
669
- model: 'openai/gpt-4o', // Primary model
881
+ model: 'openai/gpt-5.4', // Primary model
670
882
  prompt: 'Write a TypeScript haiku',
671
883
  providerOptions: {
672
884
  gateway: {
673
- models: ['openai/gpt-5-nano', 'gemini-2.0-flash'], // Fallback models
674
- } satisfies GatewayLanguageModelOptions,
885
+ models: ['openai/gpt-5.4-nano', 'gemini-3-flash-preview'], // Fallback models
886
+ } satisfies GatewayProviderOptions,
675
887
  },
676
888
  });
677
889
 
678
890
  // This will:
679
- // 1. Try openai/gpt-4o first
680
- // 2. If it fails, try openai/gpt-5-nano
681
- // 3. If that fails, try gemini-2.0-flash
891
+ // 1. Try openai/gpt-5.4 first
892
+ // 2. If it fails, try openai/gpt-5.4-nano
893
+ // 3. If that fails, try gemini-3-flash-preview
682
894
  // 4. Return the result from the first model that succeeds
683
895
  ```
684
896
 
685
897
  #### Zero Data Retention Example
686
898
 
687
- Set `zeroDataRetention` to true to ensure requests are only routed to providers
688
- that have zero data retention policies. When `zeroDataRetention` is `false` or not
689
- specified, there is no enforcement of restricting routing.
899
+ Set `zeroDataRetention` to true to route requests to providers that have zero data retention agreements with Vercel for AI Gateway. When using BYOK credentials, this filter is not applied. If BYOK credentials fail and the request falls back to system credentials, only providers with zero data retention agreements will be used. If there are no providers available for the model with zero data retention, the request will fail. When `zeroDataRetention` is `false` or not specified, there is no enforcement of restricting routing. Request-level ZDR is only available for Vercel Pro and Enterprise plans.
690
900
 
691
901
  ```ts
692
- import type { GatewayLanguageModelOptions } from '@ai-sdk/gateway';
902
+ import type { GatewayProviderOptions } from '@ai-sdk/gateway';
693
903
  import { generateText } from 'ai';
694
904
 
695
905
  const { text } = await generateText({
696
- model: 'anthropic/claude-sonnet-4.5',
906
+ model: 'anthropic/claude-sonnet-4.6',
697
907
  prompt: 'Analyze this sensitive document...',
698
908
  providerOptions: {
699
909
  gateway: {
700
910
  zeroDataRetention: true,
701
- } satisfies GatewayLanguageModelOptions,
911
+ } satisfies GatewayProviderOptions,
912
+ },
913
+ });
914
+ ```
915
+
916
+ #### Disallow Prompt Training Example
917
+
918
+ Set `disallowPromptTraining` to true to route requests to providers that have agreements with Vercel for AI Gateway to not use prompts for model training. When using BYOK credentials, this filter is not applied. If BYOK credentials fail and the request falls back to system credentials, only providers that do not train on prompt data will be used. If there are no providers available for the model that disallow prompt training, the request will fail. When `disallowPromptTraining` is `false` or not specified, there is no enforcement of restricting routing.
919
+
920
+ ```ts
921
+ import type { GatewayProviderOptions } from '@ai-sdk/gateway';
922
+ import { generateText } from 'ai';
923
+
924
+ const { text } = await generateText({
925
+ model: 'anthropic/claude-sonnet-4.6',
926
+ prompt: 'Analyze this proprietary business data...',
927
+ providerOptions: {
928
+ gateway: {
929
+ disallowPromptTraining: true,
930
+ } satisfies GatewayProviderOptions,
931
+ },
932
+ });
933
+ ```
934
+
935
+ #### HIPAA Compliance Example
936
+
937
+ Set `hipaaCompliant` to true to route requests only to models or tools by providers that have signed a BAA with Vercel for the use of AI Gateway. If the model or tool does not have a HIPAA-compliant provider, the request will fail. When `hipaaCompliant` is `false` or not specified, there is no enforcement of restricting routing. BYOK credentials are skipped when `hipaaCompliant` is set to `true` to ensure that requests are only routed to providers that support HIPAA compliance.
938
+
939
+ ```ts
940
+ import type { GatewayProviderOptions } from '@ai-sdk/gateway';
941
+ import { generateText } from 'ai';
942
+
943
+ const { text } = await generateText({
944
+ model: 'anthropic/claude-sonnet-4.6',
945
+ prompt: 'Analyze this patient data...',
946
+ providerOptions: {
947
+ gateway: {
948
+ hipaaCompliant: true,
949
+ } satisfies GatewayProviderOptions,
950
+ },
951
+ });
952
+ ```
953
+
954
+ #### Quota Entity ID Example
955
+
956
+ Set `quotaEntityId` to track and enforce quota against a specific entity. This is useful for multi-tenant applications where you need to manage quota at the entity level (e.g., per organization or team).
957
+
958
+ ```ts
959
+ import type { GatewayProviderOptions } from '@ai-sdk/gateway';
960
+ import { generateText } from 'ai';
961
+
962
+ const { text } = await generateText({
963
+ model: 'anthropic/claude-sonnet-4.6',
964
+ prompt: 'Summarize this report...',
965
+ providerOptions: {
966
+ gateway: {
967
+ quotaEntityId: 'org-123',
968
+ } satisfies GatewayProviderOptions,
702
969
  },
703
970
  });
704
971
  ```
@@ -709,16 +976,16 @@ When using provider-specific options through AI Gateway, use the actual provider
709
976
 
710
977
  ```ts
711
978
  import type { AnthropicLanguageModelOptions } from '@ai-sdk/anthropic';
712
- import type { GatewayLanguageModelOptions } from '@ai-sdk/gateway';
979
+ import type { GatewayProviderOptions } from '@ai-sdk/gateway';
713
980
  import { generateText } from 'ai';
714
981
 
715
982
  const { text } = await generateText({
716
- model: 'anthropic/claude-sonnet-4',
983
+ model: 'anthropic/claude-sonnet-4.6',
717
984
  prompt: 'Explain quantum computing',
718
985
  providerOptions: {
719
986
  gateway: {
720
987
  order: ['vertex', 'anthropic'],
721
- } satisfies GatewayLanguageModelOptions,
988
+ } satisfies GatewayProviderOptions,
722
989
  anthropic: {
723
990
  thinking: { type: 'enabled', budgetTokens: 12000 },
724
991
  } satisfies AnthropicLanguageModelOptions,
package/package.json CHANGED
@@ -1,11 +1,11 @@
1
1
  {
2
2
  "name": "@ai-sdk/gateway",
3
3
  "private": false,
4
- "version": "4.0.0-beta.6",
4
+ "version": "4.0.0-beta.61",
5
+ "type": "module",
5
6
  "license": "Apache-2.0",
6
7
  "sideEffects": false,
7
8
  "main": "./dist/index.js",
8
- "module": "./dist/index.mjs",
9
9
  "types": "./dist/index.d.ts",
10
10
  "files": [
11
11
  "dist/**/*",
@@ -25,14 +25,14 @@
25
25
  "./package.json": "./package.json",
26
26
  ".": {
27
27
  "types": "./dist/index.d.ts",
28
- "import": "./dist/index.mjs",
29
- "require": "./dist/index.js"
28
+ "import": "./dist/index.js",
29
+ "default": "./dist/index.js"
30
30
  }
31
31
  },
32
32
  "dependencies": {
33
- "@vercel/oidc": "3.1.0",
34
- "@ai-sdk/provider": "4.0.0-beta.0",
35
- "@ai-sdk/provider-utils": "5.0.0-beta.1"
33
+ "@vercel/oidc": "3.2.0",
34
+ "@ai-sdk/provider": "4.0.0-beta.12",
35
+ "@ai-sdk/provider-utils": "5.0.0-beta.25"
36
36
  },
37
37
  "devDependencies": {
38
38
  "@types/node": "18.15.11",
@@ -40,7 +40,7 @@
40
40
  "tsx": "4.19.2",
41
41
  "typescript": "5.8.3",
42
42
  "zod": "3.25.76",
43
- "@ai-sdk/test-server": "2.0.0-beta.0",
43
+ "@ai-sdk/test-server": "2.0.0-beta.1",
44
44
  "@vercel/ai-tsconfig": "0.0.0"
45
45
  },
46
46
  "peerDependencies": {
@@ -68,9 +68,7 @@
68
68
  "build:watch": "pnpm clean && tsup --watch",
69
69
  "clean": "del-cli dist docs *.tsbuildinfo",
70
70
  "generate-model-settings": "tsx scripts/generate-model-settings.ts",
71
- "lint": "eslint \"./**/*.ts*\"",
72
71
  "type-check": "tsc --build",
73
- "prettier-check": "prettier --check \"./**/*.ts*\"",
74
72
  "test": "pnpm test:node && pnpm test:edge",
75
73
  "test:update": "pnpm test:node -u",
76
74
  "test:watch": "vitest --config vitest.node.config.js",
@@ -13,7 +13,6 @@ import {
13
13
  InferSchema,
14
14
  lazySchema,
15
15
  safeValidateTypes,
16
- validateTypes,
17
16
  zodSchema,
18
17
  } from '@ai-sdk/provider-utils';
19
18
 
@@ -37,7 +37,6 @@ export class GatewayAuthenticationError extends GatewayError {
37
37
  static createContextualError({
38
38
  apiKeyProvided,
39
39
  oidcTokenProvided,
40
- message = 'Authentication failed',
41
40
  statusCode = 401,
42
41
  cause,
43
42
  generationId,
@@ -2,6 +2,6 @@ import type { FetchFunction, Resolvable } from '@ai-sdk/provider-utils';
2
2
 
3
3
  export type GatewayConfig = {
4
4
  baseURL: string;
5
- headers: () => Resolvable<Record<string, string | undefined>>;
5
+ headers?: Resolvable<Record<string, string | undefined>>;
6
6
  fetch?: FetchFunction;
7
7
  };