npm - @ai-sdk/google - Versions diffs - 4.0.0-beta.7 → 4.0.0-beta.82 - Mend

@ai-sdk/google 4.0.0-beta.7 → 4.0.0-beta.82

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (71)

package/CHANGELOG.md +614 -5
package/README.md +6 -4
package/dist/index.d.ts +301 -50
package/dist/index.js +5410 -639
package/dist/index.js.map +1 -1
package/dist/internal/index.d.ts +100 -26
package/dist/internal/index.js +1653 -451
package/dist/internal/index.js.map +1 -1
package/docs/{15-google-generative-ai.mdx → 15-google.mdx} +784 -69
package/package.json +16 -17
package/src/{convert-google-generative-ai-usage.ts → convert-google-usage.ts} +13 -5
package/src/convert-json-schema-to-openapi-schema.ts +1 -1
package/src/convert-to-google-messages.ts +647 -0
package/src/{google-generative-ai-embedding-options.ts → google-embedding-model-options.ts} +9 -2
package/src/{google-generative-ai-embedding-model.ts → google-embedding-model.ts} +31 -18
package/src/google-error.ts +1 -1
package/src/google-files.ts +225 -0
package/src/google-image-model-options.ts +35 -0
package/src/{google-generative-ai-image-model.ts → google-image-model.ts} +116 -65
package/src/{google-generative-ai-image-settings.ts → google-image-settings.ts} +2 -2
package/src/google-json-accumulator.ts +371 -0
package/src/{google-generative-ai-options.ts → google-language-model-options.ts} +50 -5
package/src/{google-generative-ai-language-model.ts → google-language-model.ts} +701 -219
package/src/google-prepare-tools.ts +72 -12
package/src/google-prompt.ts +86 -0
package/src/google-provider.ts +157 -53
package/src/google-speech-api.ts +36 -0
package/src/google-speech-model-options.ts +48 -0
package/src/google-speech-model.ts +311 -0
package/src/google-video-model-options.ts +43 -0
package/src/{google-generative-ai-video-model.ts → google-video-model.ts} +25 -60
package/src/{google-generative-ai-video-settings.ts → google-video-settings.ts} +2 -1
package/src/index.ts +40 -9
package/src/interactions/build-google-interactions-stream-transform.ts +818 -0
package/src/interactions/cancel-google-interaction.ts +60 -0
package/src/interactions/convert-google-interactions-usage.ts +47 -0
package/src/interactions/convert-to-google-interactions-input.ts +557 -0
package/src/interactions/extract-google-interactions-sources.ts +252 -0
package/src/interactions/google-interactions-agent.ts +15 -0
package/src/interactions/google-interactions-api.ts +530 -0
package/src/interactions/google-interactions-language-model-options.ts +262 -0
package/src/interactions/google-interactions-language-model.ts +776 -0
package/src/interactions/google-interactions-prompt.ts +582 -0
package/src/interactions/google-interactions-provider-metadata.ts +23 -0
package/src/interactions/map-google-interactions-finish-reason.ts +31 -0
package/src/interactions/parse-google-interactions-outputs.ts +252 -0
package/src/interactions/poll-google-interactions.ts +129 -0
package/src/interactions/prepare-google-interactions-tools.ts +245 -0
package/src/interactions/stream-google-interactions.ts +242 -0
package/src/interactions/synthesize-google-interactions-agent-stream.ts +185 -0
package/src/internal/index.ts +3 -2
package/src/{map-google-generative-ai-finish-reason.ts → map-google-finish-reason.ts} +3 -3
package/src/realtime/google-realtime-event-mapper.ts +383 -0
package/src/realtime/google-realtime-model-options.ts +3 -0
package/src/realtime/google-realtime-model.ts +160 -0
package/src/realtime/index.ts +2 -0
package/src/tool/code-execution.ts +2 -2
package/src/tool/enterprise-web-search.ts +9 -3
package/src/tool/file-search.ts +5 -7
package/src/tool/google-maps.ts +3 -2
package/src/tool/google-search.ts +11 -12
package/src/tool/url-context.ts +4 -2
package/src/tool/vertex-rag-store.ts +9 -6
package/dist/index.d.mts +0 -376
package/dist/index.mjs +0 -2517
package/dist/index.mjs.map +0 -1
package/dist/internal/index.d.mts +0 -284
package/dist/internal/index.mjs +0 -1706
package/dist/internal/index.mjs.map +0 -1
package/src/convert-to-google-generative-ai-messages.ts +0 -239
package/src/google-generative-ai-prompt.ts +0 -38

package/docs/{15-google-generative-ai.mdx → 15-google.mdx} RENAMED

@@ -1,12 +1,12 @@
 ---
-title: Google Generative AI
-description: Learn how to use Google Generative AI Provider.
+title: Google
+description: Learn how to use Google Provider.
 ---
-# Google Generative AI Provider
+# Google Provider
-The [Google Generative AI](https://ai.google.dev) provider contains language and embedding model support for
-the [Google Generative AI](https://ai.google.dev/api/rest) APIs.
+The [Google](https://ai.google.dev) provider contains language and embedding model support for
+the [Google](https://ai.google.dev/api/rest) APIs.
 ## Setup
@@ -36,17 +36,17 @@ You can import the default provider instance `google` from `@ai-sdk/google`:
 import { google } from '@ai-sdk/google';
 ```
-If you need a customized setup, you can import `createGoogleGenerativeAI` from `@ai-sdk/google` and create a provider instance with your settings:
+If you need a customized setup, you can import `createGoogle` from `@ai-sdk/google` and create a provider instance with your settings:
 ```ts
-import { createGoogleGenerativeAI } from '@ai-sdk/google';
+import { createGoogle } from '@ai-sdk/google';
-const google = createGoogleGenerativeAI({
+const google = createGoogle({
   // custom settings
 });
 ```
-You can use the following optional settings to customize the Google Generative AI provider instance:
+You can use the following optional settings to customize the Google provider instance:
 - **baseURL** _string_
@@ -89,7 +89,7 @@ The models support tool calls and some have multi-modal capabilities.
 const model = google('gemini-2.5-flash');
 ```
-You can use Google Generative AI language models to generate text with the `generateText` function:
+You can use Google language models to generate text with the `generateText` function:
 ```ts
 import { google } from '@ai-sdk/google';
@@ -101,11 +101,11 @@ const { text } = await generateText({
 });
 ```
-Google Generative AI language models can also be used in the `streamText` function
+Google language models can also be used in the `streamText` function
 and support structured data generation with [`Output`](/docs/reference/ai-sdk-core/output)
 (see [AI SDK Core](/docs/ai-sdk-core)).
-Google Generative AI also supports some model specific settings that are not part of the [standard call settings](/docs/ai-sdk-core/settings).
+Google also supports some model specific settings that are not part of the [standard call settings](/docs/ai-sdk-core/settings).
 You can pass them as an options argument:
 ```ts
@@ -128,7 +128,7 @@ await generateText({
 });
 ```
-The following optional provider options are available for Google Generative AI models:
+The following optional provider options are available for Google models:
 - **cachedContent** _string_
@@ -141,7 +141,7 @@ The following optional provider options are available for Google Generative AI m
   This is useful when the JSON Schema contains elements that are
   not supported by the OpenAPI schema version that
-  Google Generative AI uses. You can use this to disable
+  Google uses. You can use this to disable
   structured outputs if you need to.
   See [Troubleshooting: Schema Limitations](#schema-limitations) for more details.
@@ -149,11 +149,9 @@ The following optional provider options are available for Google Generative AI m
 - **safetySettings** _Array\<\{ category: string; threshold: string \}\>_
   Optional. Safety settings for the model.
   - **category** _string_
     The category of the safety setting. Can be one of the following:
     - `HARM_CATEGORY_UNSPECIFIED`
     - `HARM_CATEGORY_HATE_SPEECH`
     - `HARM_CATEGORY_DANGEROUS_CONTENT`
@@ -164,7 +162,6 @@ The following optional provider options are available for Google Generative AI m
   - **threshold** _string_
     The threshold of the safety setting. Can be one of the following:
     - `HARM_BLOCK_THRESHOLD_UNSPECIFIED`
     - `BLOCK_LOW_AND_ABOVE`
     - `BLOCK_MEDIUM_AND_ABOVE`
@@ -177,8 +174,7 @@ The following optional provider options are available for Google Generative AI m
 - **thinkingConfig** _\{ thinkingLevel?: 'minimal' | 'low' | 'medium' | 'high'; thinkingBudget?: number; includeThoughts?: boolean \}_
-  Optional. Configuration for the model's thinking process. Only supported by specific [Google Generative AI models](https://ai.google.dev/gemini-api/docs/thinking).
+  Optional. Configuration for the model's thinking process. Only supported by specific [Google models](https://ai.google.dev/gemini-api/docs/thinking).
   - **thinkingLevel** _'minimal' | 'low' | 'medium' | 'high'_
     Optional. Controls the thinking depth for Gemini 3 models. Gemini 3.1 Pro supports 'low', 'medium', and 'high', Gemini 3 Pro supports 'low' and 'high', while Gemini 3 Flash supports all four levels: 'minimal', 'low', 'medium', and 'high'. Only supported by Gemini 3 models.
@@ -186,7 +182,7 @@ The following optional provider options are available for Google Generative AI m
   - **thinkingBudget** _number_
     Optional. Gives the model guidance on the number of thinking tokens it can use when generating a response. Setting it to 0 disables thinking, if the model supports it.
-    For more information about the possible value ranges for each model see [Google Generative AI thinking documentation](https://ai.google.dev/gemini-api/docs/thinking#set-budget).
+    For more information about the possible value ranges for each model see [Google thinking documentation](https://ai.google.dev/gemini-api/docs/thinking#set-budget).
     <Note>
       This option is for Gemini 2.5 models. Gemini 3 models should use
@@ -199,12 +195,10 @@ The following optional provider options are available for Google Generative AI m
 - **imageConfig** _\{ aspectRatio?: string, imageSize?: string \}_
-  Optional. Configuration for the models image generation. Only supported by specific [Google Generative AI models](https://ai.google.dev/gemini-api/docs/image-generation).
+  Optional. Configuration for the models image generation. Only supported by specific [Google models](https://ai.google.dev/gemini-api/docs/image-generation).
   - **aspectRatio** _string_
     Model defaults to generate 1:1 squares, or to matching the output image size to that of your input image. Can be one of the following:
     - 1:1
     - 2:3
     - 3:2
@@ -219,7 +213,6 @@ The following optional provider options are available for Google Generative AI m
   - **imageSize** _string_
     Controls the output image resolution. Defaults to 1K. Can be one of the following:
     - 1K
     - 2K
     - 4K
@@ -232,7 +225,6 @@ The following optional provider options are available for Google Generative AI m
 - **mediaResolution** _string_
   Optional. If specified, the media resolution specified will be used. Can be one of the following:
   - `MEDIA_RESOLUTION_UNSPECIFIED`
   - `MEDIA_RESOLUTION_LOW`
   - `MEDIA_RESOLUTION_MEDIUM`
@@ -245,6 +237,18 @@ The following optional provider options are available for Google Generative AI m
   Optional. Defines labels used in billing reports. Available on Vertex AI only.
   See [Google Cloud labels documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/add-labels-to-api-calls).
+- **serviceTier** _'standard' | 'flex' | 'priority'_
+  Optional. The service tier to use for the request.
+  Set to `'flex'` for 50% cheaper processing at the cost of increased latency.
+  Set to `'priority'` for ultra-low latency at a 75-100% price premium over `'standard'`.
+  Because Priority can be gracefully downgraded to Standard under load, the
+  tier the request actually ran on is surfaced on
+  `result.providerMetadata.google.serviceTier`. See
+  [Priority inference](https://ai.google.dev/gemini-api/docs/priority-inference)
+  and [Flex inference](https://ai.google.dev/gemini-api/docs/flex-inference).
 - **threshold** _string_
   Optional. Standalone threshold setting that can be used independently of `safetySettings`.
@@ -252,7 +256,7 @@ The following optional provider options are available for Google Generative AI m
 ### Thinking
-The Gemini 2.5 and Gemini 3 series models use an internal "thinking process" that significantly improves their reasoning and multi-step planning abilities, making them highly effective for complex tasks such as coding, advanced mathematics, and data analysis. For more information see [Google Generative AI thinking documentation](https://ai.google.dev/gemini-api/docs/thinking).
+The Gemini 2.5 and Gemini 3 series models use an internal "thinking process" that significantly improves their reasoning and multi-step planning abilities, making them highly effective for complex tasks such as coding, advanced mathematics, and data analysis. For more information see [Google thinking documentation](https://ai.google.dev/gemini-api/docs/thinking).
 #### Gemini 3 Models
@@ -312,7 +316,7 @@ console.log(reasoning); // Reasoning summary
 ### File Inputs
-The Google Generative AI provider supports file inputs, e.g. PDF files.
+The Google provider supports file inputs, e.g. PDF files.
 ```ts
 import { google } from '@ai-sdk/google';
@@ -378,7 +382,7 @@ See [File Parts](/docs/foundations/prompts#file-parts) for details on how to use
 ### Cached Content
-Google Generative AI supports both explicit and implicit caching to help reduce costs on repetitive content.
+Google supports both explicit and implicit caching to help reduce costs on repetitive content.
 #### Implicit Caching
@@ -510,7 +514,7 @@ the model has access to the latest information using Google Search.
 ```ts highlight="8,17-20"
 import { google } from '@ai-sdk/google';
-import { GoogleGenerativeAIProviderMetadata } from '@ai-sdk/google';
+import { GoogleProviderMetadata } from '@ai-sdk/google';
 import { generateText } from 'ai';
 const { text, sources, providerMetadata } = await generateText({
@@ -525,9 +529,7 @@ const { text, sources, providerMetadata } = await generateText({
 // access the grounding metadata. Casting to the provider metadata type
 // is optional but provides autocomplete and type safety.
-const metadata = providerMetadata?.google as
-  | GoogleGenerativeAIProviderMetadata
-  | undefined;
+const metadata = providerMetadata?.google as GoogleProviderMetadata | undefined;
 const groundingMetadata = metadata?.groundingMetadata;
 const safetyRatings = metadata?.safetyRatings;
 ```
@@ -537,14 +539,12 @@ The `googleSearch` tool accepts the following optional configuration options:
 - **searchTypes** _object_
   Enables specific search types. Both can be combined.
   - `webSearch`: Enable web search grounding (pass `{}` to enable). This is the default.
   - `imageSearch`: Enable [image search grounding](https://ai.google.dev/gemini-api/docs/image-generation#image-search) (pass `{}` to enable).
 - **timeRangeFilter** _object_
   Restricts search results to a specific time range. Both `startTime` and `endTime` are required.
   - `startTime`: Start time in ISO 8601 format (e.g. `'2025-01-01T00:00:00Z'`).
   - `endTime`: End time in ISO 8601 format (e.g. `'2025-12-31T23:59:59Z'`).
@@ -563,12 +563,10 @@ When Google Search grounding is enabled, the model will include sources in the r
 Additionally, the grounding metadata includes detailed information about how search results were used to ground the model's response. Here are the available fields:
 - **`webSearchQueries`** (`string[] | null`)
   - Array of search queries used to retrieve information
   - Example: `["What's the weather in Chicago this weekend?"]`
 - **`searchEntryPoint`** (`{ renderedContent: string } | null`)
   - Contains the main search result content used as an entry point
   - The `renderedContent` field contains the formatted content
@@ -619,10 +617,10 @@ the model has access to a compliance-focused web index designed for highly-regul
 </Note>
 ```ts
-import { createVertex } from '@ai-sdk/google-vertex';
+import { createGoogleVertex } from '@ai-sdk/google-vertex';
 import { generateText } from 'ai';
-const vertex = createVertex({
+const vertex = createGoogleVertex({
   project: 'my-project',
   location: 'us-central1',
 });
@@ -686,9 +684,7 @@ const { text, sources, providerMetadata } = await generateText({
   },
 });
-const metadata = providerMetadata?.google as
-  | GoogleGenerativeAIProviderMetadata
-  | undefined;
+const metadata = providerMetadata?.google as GoogleProviderMetadata | undefined;
 const groundingMetadata = metadata?.groundingMetadata;
 const urlContextMetadata = metadata?.urlContextMetadata;
 ```
@@ -696,7 +692,6 @@ const urlContextMetadata = metadata?.urlContextMetadata;
 The URL context metadata includes detailed information about how the model used the URL context to generate the response. Here are the available fields:
 - **`urlMetadata`** (`{ retrievedUrl: string; urlRetrievalStatus: string; }[] | null`)
   - Array of URL context metadata
   - Each object includes:
     - **`retrievedUrl`**: The URL of the context
@@ -708,7 +703,7 @@ Example response:
 {
   "urlMetadata": [
     {
-      "retrievedUrl": "https://ai-sdk.dev/providers/ai-sdk-providers/google-generative-ai",
+      "retrievedUrl": "https://ai-sdk.dev/providers/ai-sdk-providers/google",
       "urlRetrievalStatus": "URL_RETRIEVAL_STATUS_SUCCESS"
     }
   ]
@@ -722,8 +717,8 @@ With the URL context tool, you will also get the `groundingMetadata`.
     "groundingChunks": [
         {
             "web": {
-                "uri": "https://ai-sdk.dev/providers/ai-sdk-providers/google-generative-ai",
-                "title": "Google Generative AI - AI SDK Providers"
+                "uri": "https://ai-sdk.dev/providers/ai-sdk-providers/google",
+                "title": "Google - AI SDK Providers"
             }
         }
     ],
@@ -760,7 +755,7 @@ import { generateText } from 'ai';
 const { text, sources, providerMetadata } = await generateText({
   model: google('gemini-2.5-flash'),
-  prompt: `Based on this context: https://ai-sdk.dev/providers/ai-sdk-providers/google-generative-ai, tell me how to use Gemini with AI SDK.
+  prompt: `Based on this context: https://ai-sdk.dev/providers/ai-sdk-providers/google, tell me how to use Gemini with AI SDK.
     Also, provide the latest news about AI SDK V5.`,
   tools: {
     google_search: google.tools.googleSearch({}),
@@ -768,9 +763,7 @@ const { text, sources, providerMetadata } = await generateText({
   },
 });
-const metadata = providerMetadata?.google as
-  | GoogleGenerativeAIProviderMetadata
-  | undefined;
+const metadata = providerMetadata?.google as GoogleProviderMetadata | undefined;
 const groundingMetadata = metadata?.groundingMetadata;
 const urlContextMetadata = metadata?.urlContextMetadata;
 ```
@@ -782,7 +775,7 @@ the model has access to Google Maps data for location-aware responses. This enab
 ```ts highlight="7-16"
 import { google, type GoogleLanguageModelOptions } from '@ai-sdk/google';
-import { GoogleGenerativeAIProviderMetadata } from '@ai-sdk/google';
+import { GoogleProviderMetadata } from '@ai-sdk/google';
 import { generateText } from 'ai';
 const { text, sources, providerMetadata } = await generateText({
@@ -801,9 +794,7 @@ const { text, sources, providerMetadata } = await generateText({
     'What are the best Italian restaurants within a 15-minute walk from here?',
 });
-const metadata = providerMetadata?.google as
-  | GoogleGenerativeAIProviderMetadata
-  | undefined;
+const metadata = providerMetadata?.google as GoogleProviderMetadata | undefined;
 const groundingMetadata = metadata?.groundingMetadata;
 ```
@@ -842,11 +833,11 @@ This enables the model to provide answers based on your specific data sources an
 </Note>
 ```ts highlight="8,17-20"
-import { createVertex } from '@ai-sdk/google-vertex';
-import { GoogleGenerativeAIProviderMetadata } from '@ai-sdk/google';
+import { createGoogleVertex } from '@ai-sdk/google-vertex';
+import { GoogleProviderMetadata } from '@ai-sdk/google';
 import { generateText } from 'ai';
-const vertex = createVertex({
+const vertex = createGoogleVertex({
   project: 'my-project',
   location: 'us-central1',
 });
@@ -866,9 +857,7 @@ const { text, sources, providerMetadata } = await generateText({
 // access the grounding metadata. Casting to the provider metadata type
 // is optional but provides autocomplete and type safety.
-const metadata = providerMetadata?.google as
-  | GoogleGenerativeAIProviderMetadata
-  | undefined;
+const metadata = providerMetadata?.google as GoogleProviderMetadata | undefined;
 const groundingMetadata = metadata?.groundingMetadata;
 const safetyRatings = metadata?.safetyRatings;
 ```
@@ -878,7 +867,6 @@ When RAG Engine Grounding is enabled, the model will include sources from your R
 Additionally, the grounding metadata includes detailed information about how RAG results were used to ground the model's response. Here are the available fields:
 - **`groundingChunks`** (Array of chunk objects | null)
   - Contains the retrieved context chunks from your RAG corpus
   - Each chunk includes:
     - **`retrievedContext`**: Information about the retrieved context
@@ -887,7 +875,6 @@ Additionally, the grounding metadata includes detailed information about how RAG
       - `text`: The actual text content of the chunk
 - **`groundingSupports`** (Array of support objects | null)
   - Contains details about how specific response parts are supported by RAG results
   - Each support object includes:
     - **`segment`**: Information about the grounded text segment
@@ -931,12 +918,10 @@ Example response:
 The `vertexRagStore` tool accepts the following configuration options:
 - **`ragCorpus`** (`string`, required)
   - The RagCorpus resource name in the format: `projects/{project}/locations/{location}/ragCorpora/{rag_corpus}`
   - This identifies your specific RAG corpus to search against
 - **`topK`** (`number`, optional)
   - The number of top contexts to retrieve from your RAG corpus
   - Defaults to the corpus configuration if not specified
@@ -1051,7 +1036,7 @@ const { output } = await generateText({
 });
 ```
-The following Zod features are known to not work with Google Generative AI:
+The following Zod features are known to not work with Google:
 - `z.union`
 - `z.record`
@@ -1060,6 +1045,7 @@ The following Zod features are known to not work with Google Generative AI:
 | Model                                 | Image Input         | Object Generation   | Tool Usage          | Tool Streaming      | Google Search       | URL Context         |
 | ------------------------------------- | ------------------- | ------------------- | ------------------- | ------------------- | ------------------- | ------------------- |
+| `gemini-3.5-flash`                    | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
 | `gemini-3.1-pro-preview`              | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
 | `gemini-3.1-flash-image-preview`      | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
 | `gemini-3.1-flash-lite-preview`       | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
@@ -1079,6 +1065,635 @@ The following Zod features are known to not work with Google Generative AI:
   available provider model ID as a string if needed.
 </Note>
+## Realtime Models
+<Note type="warning">Realtime is an experimental feature.</Note>
+You can create models that call the [Gemini Live API](https://ai.google.dev/gemini-api/docs/live)
+using the `.experimental_realtime()` factory method.
+```ts
+import { google } from '@ai-sdk/google';
+const model = google.experimental_realtime('gemini-3.1-flash-live-preview');
+```
+Realtime sessions run in the browser and require a short-lived token created on
+your server with `google.experimental_realtime.getToken()`:
+```ts
+const token = await google.experimental_realtime.getToken({
+  model: 'gemini-3.1-flash-live-preview',
+});
+```
+Google realtime models may require provider-specific audio formats, depending
+on the model and modality. See [Realtime](/docs/ai-sdk-core/realtime) for the
+complete setup and tool calling pattern.
+## Interactions API
+The [Gemini Interactions API](https://ai.google.dev/gemini-api/docs/interactions)
+(`POST /v1beta/interactions`) is a separate Google endpoint with server-side
+state, unified content blocks, first-class built-in tools, agent presets,
+managed agents that run in a sandboxed Linux environment, and native
+multimodal image output. It is reached via the `google.interactions(...)`
+factory:
+```ts
+import { google } from '@ai-sdk/google';
+import { generateText } from 'ai';
+const { text } = await generateText({
+  model: google.interactions('gemini-2.5-flash'),
+  prompt: 'Hello, how are you?',
+});
+```
+`google.interactions(...)` accepts a model ID string (e.g.
+`'gemini-2.5-flash'`, `'gemini-3-pro-preview'`), `{ agent: <name> }` to use
+a Gemini [agent preset](#agent-presets), or `{ managedAgent: <name> }` to
+invoke a [managed agent](#managed-agents) you created on Google's side.
+The returned model can be passed to `generateText` and `streamText` like
+any other AI SDK language model.
+<Note>
+  Use `google(...)` for the standard `:generateContent` /
+  `:streamGenerateContent` endpoints, and `google.interactions(...)` for the new
+  Interactions endpoint. Pick one per model instance — they target different
+  request bodies and SSE event vocabularies.
+</Note>
+### Provider Options
+The Interactions model reads its options from the shared
+`providerOptions.google.*` namespace. Validate them with the
+`GoogleLanguageModelInteractionsOptions` type:
+```ts
+import {
+  google,
+  type GoogleLanguageModelInteractionsOptions,
+} from '@ai-sdk/google';
+import { generateText } from 'ai';
+await generateText({
+  model: google.interactions('gemini-2.5-flash'),
+  prompt: 'What color is the sky in one word?',
+  providerOptions: {
+    google: {
+      serviceTier: 'priority',
+    } satisfies GoogleLanguageModelInteractionsOptions,
+  },
+});
+```
+The following optional provider options are available:
+- **previousInteractionId** _string_
+  Server-side interaction id from a prior turn. When set, the server pulls
+  prior context from its own state and only the new user message is sent on
+  the wire. Pair with the default `store: true` to chain stateful
+  conversations. See [Stateful chaining](#stateful-chaining).
+- **store** _boolean_
+  Whether the server should persist the interaction. Defaults to `true`.
+  Set to `false` for stateless multi-turn conversations where the full
+  message history is re-sent on every turn.
+- **agent** _string_
+  Name of a Gemini agent preset (e.g. `'deep-research-pro-preview-12-2025'`).
+      <Note>
+  Prefer the factory form `google.interactions({ agent: '...' })` over
+  setting `agent` in provider options — the factory is type-checked
+  against the supported agent names.
+  </Note>
+- **agentConfig** _object_
+  Per-agent configuration. Currently supports `{ type: 'dynamic' }` and
+  `{ type: 'deep-research', thinkingSummaries?, visualization?, collaborativePlanning? }`.
+- **thinkingLevel** _'minimal' | 'low' | 'medium' | 'high'_
+  Controls reasoning depth for thinking-enabled models. Mapped onto the
+  Interactions request's `thinking_level`.
+- **thinkingSummaries** _'auto' | 'none'_
+  Whether the model returns synthesized thought summaries on reasoning
+  parts. Defaults to the API default.
+- **responseFormat** _Array\<\{ type: 'text' | 'image' | 'audio'; mimeType?: string; schema?: unknown; aspectRatio?: string; imageSize?: '1K' \| '2K' \| '4K' \| '512' \}\>_
+  Output-format entries that map directly to the API's `response_format`
+  array. Use this for fine-grained control over image, audio, or non-JSON
+  text outputs (e.g. `aspectRatio` and `imageSize` for image generation).
+  The AI SDK call-level `responseFormat: { type: 'json', schema }` still
+  drives JSON-mode automatically and prepends a matching text entry;
+  entries listed here are appended.
+  `aspectRatio` accepts `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`,
+  `9:16`, `16:9`, `21:9`, `1:8`, `8:1`, `1:4`, `4:1`.
+- **imageConfig** _\{ aspectRatio?: string; imageSize?: '1K' | '2K' | '4K' | '512' \}_ (deprecated)
+  Use **responseFormat** with a `{ type: 'image', ... }` entry instead.
+  Retained for backwards compatibility; the SDK translates `imageConfig`
+  into a matching `response_format` image entry and emits a warning when
+  set. Ignored when `responseFormat` already supplies an image entry.
+- **mediaResolution** _'low' | 'medium' | 'high' | 'ultra_high'_
+  Media resolution applied to image inputs / outputs.
+- **serviceTier** _'flex' | 'standard' | 'priority'_
+  Service tier for the request. Mirrored back on
+  `result.providerMetadata.google.serviceTier` for observability.
+- **systemInstruction** _string_
+  Alternative to the AI SDK `system` message. If both are set, the AI SDK
+  `system` message wins and a warning is emitted.
+- **background** _boolean_
+  Run the interaction in the background. Required for agents whose
+  server-side workflow cannot complete within a single request/response;
+  rejected by agents that only support synchronous calls. When `true`,
+  the POST returns a non-terminal status and the SDK polls
+  `GET /interactions/{id}` until the work completes.
+- **environment** _string \| object_
+  Sandbox environment configuration for [managed agents](#managed-agents).
+  Pass `'remote'` to provision a fresh sandbox, an `environment_id`
+  string to reuse an existing one, or an object of the form
+  `{ type: 'remote', sources?, network? }` to preload files and/or
+  constrain outbound traffic. Only applies to agent calls.
+- **pollingTimeoutMs** _number_
+  Maximum time, in milliseconds, to poll a background interaction before
+  giving up. Defaults to 30 minutes (1,800,000 ms). Long-running agents
+  may need longer.
+### Provider Metadata
+`result.providerMetadata.google` (typed via `GoogleInteractionsProviderMetadata`)
+exposes:
+- **interactionId** _string_
+  Server-side interaction id. Pass this back as `previousInteractionId` on
+  the next turn to chain.
+- **serviceTier** _string_
+  Service tier the request actually ran on.
+- **signature** _string_
+  Per-block signature hash, set by the SDK on output reasoning and
+  tool-call parts. Round-tripped automatically on the next turn.
+### Stateful chaining
+With the default `store: true`, the server retains the prior turn so the
+next request only needs to send the new user message and the
+`previousInteractionId`:
+```ts
+import {
+  google,
+  type GoogleLanguageModelInteractionsOptions,
+} from '@ai-sdk/google';
+import { generateText } from 'ai';
+const turn1 = await generateText({
+  model: google.interactions('gemini-2.5-flash'),
+  prompt: 'What are the three largest cities in Spain?',
+});
+const interactionId = turn1.providerMetadata?.google?.interactionId as
+  | string
+  | undefined;
+const turn2 = await generateText({
+  model: google.interactions('gemini-2.5-flash'),
+  prompt: 'What is the most famous landmark in the second one?',
+  providerOptions: {
+    google: {
+      previousInteractionId: interactionId,
+    } satisfies GoogleLanguageModelInteractionsOptions,
+  },
+});
+```
+For stateless multi-turn conversations, set `store: false` and re-send the
+full message history on every turn (no `previousInteractionId`):
+```ts
+import {
+  google,
+  type GoogleLanguageModelInteractionsOptions,
+} from '@ai-sdk/google';
+import { generateText, type ModelMessage } from 'ai';
+const messages: Array<ModelMessage> = [
+  { role: 'user', content: 'What are the three largest cities in Spain?' },
+];
+const turn1 = await generateText({
+  model: google.interactions('gemini-2.5-flash'),
+  messages,
+  providerOptions: {
+    google: { store: false } satisfies GoogleLanguageModelInteractionsOptions,
+  },
+});
+messages.push(...turn1.responseMessages);
+messages.push({
+  role: 'user',
+  content: 'What is the most famous landmark in the second one?',
+});
+const turn2 = await generateText({
+  model: google.interactions('gemini-2.5-flash'),
+  messages,
+  providerOptions: {
+    google: { store: false } satisfies GoogleLanguageModelInteractionsOptions,
+  },
+});
+```
+### Built-in Tools
+The Interactions API ships a built-in tool catalog. The provider-defined
+tools under `google.tools.*` map onto Interactions tool descriptors:
+| AI SDK tool                           | Interactions tool type | Notes                                     |
+| ------------------------------------- | ---------------------- | ----------------------------------------- |
+| `google.tools.googleSearch`           | `google_search`        | Web / image search grounding.             |
+| `google.tools.codeExecution`          | `code_execution`       | Server-side Python execution.             |
+| `google.tools.urlContext`             | `url_context`          | Fetch URLs referenced in the prompt.      |
+| `google.tools.fileSearch`             | `file_search`          | Retrieval from File Search stores.        |
+| `google.tools.googleMaps`             | `google_maps`          | Maps grounding for nearby-places queries. |
+| _provider tool_ `google.computer_use` | `computer_use`         | Computer use (browser environment).       |
+| _provider tool_ `google.mcp_server`   | `mcp_server`           | Remote MCP server passthrough.            |
+| _provider tool_ `google.retrieval`    | `retrieval`            | Vertex AI Search retrieval.               |
+Function tools (`type: 'function'`) defined with the AI SDK `tool(...)`
+helper are translated to Interactions `function` tool descriptors. Other
+tool kinds emit a warning and are dropped.
+```ts
+import { google } from '@ai-sdk/google';
+import { generateText } from 'ai';
+const { text, sources } = await generateText({
+  model: google.interactions('gemini-2.5-flash'),
+  tools: {
+    google_search: google.tools.googleSearch({}),
+  },
+  prompt:
+    "What's a notable AI development from this past week? " +
+    'Include the date for each item you mention.',
+});
+```
+Function tools work the same way as on the standard provider:
+```ts
+import { google } from '@ai-sdk/google';
+import { generateText, stepCountIs, tool } from 'ai';
+import { z } from 'zod';
+const weatherTool = tool({
+  description: 'Get the weather for a city.',
+  inputSchema: z.object({ city: z.string() }),
+  execute: async ({ city }) => `It is sunny in ${city}.`,
+});
+const { text, toolCalls } = await generateText({
+  model: google.interactions('gemini-2.5-flash'),
+  tools: { getWeather: weatherTool },
+  stopWhen: stepCountIs(5),
+  prompt: 'What is the weather in San Francisco right now?',
+});
+```
+### Image output via Interactions
+Add a `{ type: 'image' }` entry to `responseFormat` on a Gemini
+image-capable model to get images as `LanguageModelV4FilePart` files in
+the response. No tool wrapping is required, and the entry doubles as the
+place to set `aspectRatio`, `imageSize`, and `mimeType`.
+```ts
+import {
+  google,
+  type GoogleLanguageModelInteractionsOptions,
+} from '@ai-sdk/google';
+import { generateText } from 'ai';
+const result = await generateText({
+  model: google.interactions('gemini-3-pro-image-preview'),
+  prompt: 'Generate an image of a comic cat in a spaceship.',
+  providerOptions: {
+    google: {
+      responseFormat: [{ type: 'image' }],
+    } satisfies GoogleLanguageModelInteractionsOptions,
+  },
+});
+for (const file of result.files) {
+  if (file.mediaType.startsWith('image/')) {
+    // file.uint8Array | file.base64 | file.mediaType
+  }
+}
+```
+To control aspect ratio, image size, or output mime type, add those
+fields to the same image entry:
+```ts
+import {
+  google,
+  type GoogleLanguageModelInteractionsOptions,
+} from '@ai-sdk/google';
+import { generateText } from 'ai';
+const result = await generateText({
+  model: google.interactions('gemini-3-pro-image-preview'),
+  prompt: 'Generate a high-quality landscape photo of mountains at sunset.',
+  providerOptions: {
+    google: {
+      responseFormat: [
+        {
+          type: 'image',
+          aspectRatio: '16:9',
+          imageSize: '4K',
+        },
+      ],
+    } satisfies GoogleLanguageModelInteractionsOptions,
+  },
+});
+```
+For multimodal output, list one entry per modality. The model returns
+text in `result.text` and the accompanying image(s) in `result.files`:
+```ts
+import {
+  google,
+  type GoogleLanguageModelInteractionsOptions,
+} from '@ai-sdk/google';
+import { generateText } from 'ai';
+const result = await generateText({
+  model: google.interactions('gemini-2.5-flash-image'),
+  prompt:
+    'Tell me a three sentence bedtime story about a unicorn, accompanied by a suitable illustration.',
+  providerOptions: {
+    google: {
+      responseFormat: [
+        { type: 'text' },
+        { type: 'image', aspectRatio: '16:9' },
+      ],
+    } satisfies GoogleLanguageModelInteractionsOptions,
+  },
+});
+console.log(result.text);
+const images = result.files.filter(file => file.mediaType.startsWith('image/'));
+// images[0].uint8Array | images[0].base64 | images[0].mediaType
+```
+Iterative image editing pairs naturally with stateful chaining — keep
+`previousInteractionId` set across turns and the model edits its prior
+output:
+```ts
+import {
+  google,
+  type GoogleLanguageModelInteractionsOptions,
+} from '@ai-sdk/google';
+import { generateText } from 'ai';
+const model = google.interactions('gemini-3-pro-image-preview');
+const turn1 = await generateText({
+  model,
+  prompt: 'Generate an image of a comic cat in a spaceship.',
+  providerOptions: {
+    google: {
+      responseFormat: [{ type: 'image' }],
+    } satisfies GoogleLanguageModelInteractionsOptions,
+  },
+});
+const interactionId = turn1.providerMetadata?.google?.interactionId as
+  | string
+  | undefined;
+const turn2 = await generateText({
+  model,
+  prompt: 'now make the cat red',
+  providerOptions: {
+    google: {
+      responseFormat: [{ type: 'image' }],
+      previousInteractionId: interactionId,
+    } satisfies GoogleLanguageModelInteractionsOptions,
+  },
+});
+```
+### Agent presets
+Pass `{ agent: <name> }` to target a Gemini agent preset. The factory
+type-checks the agent name against the supported set:
+```ts
+import {
+  google,
+  type GoogleLanguageModelInteractionsOptions,
+} from '@ai-sdk/google';
+import { generateText } from 'ai';
+const result = await generateText({
+  model: google.interactions({
+    agent: 'deep-research-pro-preview-12-2025',
+  }),
+  prompt:
+    'Briefly summarize the most-cited papers on retrieval-augmented generation since 2024 (2-3 sentences).',
+  providerOptions: {
+    google: {
+      background: true,
+    } satisfies GoogleLanguageModelInteractionsOptions,
+  },
+});
+```
+Whether an agent runs synchronously or in the background depends on the
+agent. Long-running presets (such as the `deep-research-*` family)
+require `background: true` — the POST returns a non-terminal status and
+the SDK polls `GET /interactions/{id}` internally until the interaction
+completes. Other agents accept synchronous calls only and will reject
+`background: true`. Set the flag explicitly via
+`providerOptions.google.background`.
+The default polling timeout is 30 minutes; raise it via
+`pollingTimeoutMs` for slower agents:
+```ts
+import {
+  google,
+  type GoogleLanguageModelInteractionsOptions,
+} from '@ai-sdk/google';
+import { generateText } from 'ai';
+await generateText({
+  model: google.interactions({ agent: 'deep-research-max-preview-04-2026' }),
+  prompt: 'Produce a long-form research brief on ...',
+  providerOptions: {
+    google: {
+      background: true,
+      pollingTimeoutMs: 60 * 60 * 1000, // 1 hour
+    } satisfies GoogleLanguageModelInteractionsOptions,
+  },
+});
+```
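To make the timeout behaviour concrete, here is a rough sketch of a poll-with-deadline loop. It is not the SDK's internal implementation: `pollUntilDone`, the `InteractionStatus` values, and the `intervalMs` option are all assumptions made for illustration, and `fetchStatus` stands in for the `GET /interactions/{id}` request.

```typescript
// Hypothetical sketch of a poll-with-deadline loop; not the SDK's internals.
// The status names are assumed for illustration.
type InteractionStatus = 'in_progress' | 'completed' | 'failed';

async function pollUntilDone(
  fetchStatus: () => Promise<InteractionStatus>, // stand-in for GET /interactions/{id}
  options: { timeoutMs: number; intervalMs: number },
): Promise<InteractionStatus> {
  const deadline = Date.now() + options.timeoutMs;
  while (true) {
    const status = await fetchStatus();
    // Any status other than 'in_progress' is treated as terminal.
    if (status !== 'in_progress') return status;
    // Give up if the next poll would start after the deadline.
    if (Date.now() + options.intervalMs > deadline) {
      throw new Error(`Polling timed out after ${options.timeoutMs} ms`);
    }
    await new Promise<void>(resolve => setTimeout(resolve, options.intervalMs));
  }
}
```

In this sketch a larger `timeoutMs` only extends the deadline; it does not change how often the status is checked.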
+Agent calls can also be chained through `previousInteractionId`, just like model-id calls.
+### Managed Agents
+[Managed agents](https://ai.google.dev/gemini-api/docs/agents) run inside a
+sandboxed Linux environment provisioned per interaction. Pass the `environment`
+provider option to control how the sandbox is set up; the option is only
+accepted on agent calls.
+The simplest form provisions a fresh sandbox:
+```ts
+import {
+  google,
+  type GoogleLanguageModelInteractionsOptions,
+} from '@ai-sdk/google';
+import { generateText } from 'ai';
+const result = await generateText({
+  model: google.interactions({ agent: 'antigravity-preview-05-2026' }),
+  prompt: 'What is 2 + 2?',
+  providerOptions: {
+    google: {
+      environment: 'remote',
+    } satisfies GoogleLanguageModelInteractionsOptions,
+  },
+});
+```
+`environment` accepts three shapes:
+- `'remote'` — provision a fresh sandbox for this call.
+- any other string — an `environment_id` to reuse, forking the previous
+  sandbox so its filesystem and installed packages persist.
+- an object — provision a fresh sandbox and optionally preload `sources`
+  and/or constrain outbound traffic via `network`:
+```ts
+import {
+  google,
+  type GoogleLanguageModelInteractionsOptions,
+} from '@ai-sdk/google';
+import { generateText } from 'ai';
+await generateText({
+  model: google.interactions({ agent: 'antigravity-preview-05-2026' }),
+  prompt:
+    'Read the file at /data/note.txt and tell me exactly what it contains.',
+  providerOptions: {
+    google: {
+      environment: {
+        type: 'remote',
+        sources: [
+          {
+            type: 'inline',
+            content: 'hello from the AI SDK example\n',
+            target: '/data/note.txt',
+          },
+        ],
+      },
+    } satisfies GoogleLanguageModelInteractionsOptions,
+  },
+});
+```
+Three source types are supported: `inline` (write a string into the
+sandbox at `target`), `repository` (clone a git repository — pass the
+URL as `source`), and `gcs` (mount a Google Cloud Storage prefix).
+The `network` field accepts the string `'disabled'` to block all
+outbound traffic, or an object with an `allowlist` array whose entries
+each carry a `domain` plus an optional `transform` array of header
+objects to inject into matching requests.
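The three `environment` shapes can be told apart with ordinary type narrowing. The sketch below uses local types that mirror the prose above. `EnvironmentOption`, `EnvironmentKind`, and `classifyEnvironment` are hypothetical names, and the element types of `sources` and `network` are left as `unknown` because their exact fields are not spelled out here.

```typescript
// Local types mirroring the prose above (hypothetical, not SDK exports).
// Field shapes inside `sources` and `network` are left as `unknown`.
type EnvironmentOption =
  | string
  | { type: 'remote'; sources?: unknown[]; network?: unknown };

type EnvironmentKind =
  | { kind: 'fresh' }
  | { kind: 'reuse'; environmentId: string }
  | { kind: 'fresh-configured'; sourceCount: number; restrictsNetwork: boolean };

function classifyEnvironment(env: EnvironmentOption): EnvironmentKind {
  // The literal 'remote' provisions a fresh sandbox.
  if (env === 'remote') return { kind: 'fresh' };
  // Any other string is treated as an environment id to reuse.
  if (typeof env === 'string') return { kind: 'reuse', environmentId: env };
  // The object form provisions a fresh sandbox with optional preloads and network rules.
  return {
    kind: 'fresh-configured',
    sourceCount: env.sources?.length ?? 0,
    restrictsNetwork: env.network !== undefined,
  };
}
```

A check like this could be useful for logging which sandbox mode a call will use before the request is sent.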
+#### Custom managed agents
+For user-defined agents that you created on Google's side via the
+Gemini API's `/v1beta/agents` endpoint, pass the agent's name through the dedicated
+`managedAgent` factory shape instead of `agent` (which only accepts
+known preset names):
+```ts
+import {
+  google,
+  type GoogleLanguageModelInteractionsOptions,
+} from '@ai-sdk/google';
+import { generateText } from 'ai';
+const result = await generateText({
+  model: google.interactions({ managedAgent: 'my-custom-agent' }),
+  prompt: 'Hello!',
+  providerOptions: {
+    google: {
+      environment: 'remote',
+    } satisfies GoogleLanguageModelInteractionsOptions,
+  },
+});
+```
+### Streaming
+`streamText` is supported. The stream's `finish` part exposes
+`interactionId` on `providerMetadata.google` so callers can chain.
+```ts
+import { google } from '@ai-sdk/google';
+import { streamText } from 'ai';
+const result = streamText({
+  model: google.interactions('gemini-2.5-flash'),
+  prompt: 'Hello, how are you?',
+});
+for await (const textPart of result.textStream) {
+  process.stdout.write(textPart);
+}
+const googleMetadata = (await result.providerMetadata)?.google;
+console.log('Interaction id:', googleMetadata?.interactionId);
+```
 ## Gemma Models
 You can use [Gemma models](https://deepmind.google/models/gemma/) with the Google Generative AI API.
@@ -1111,12 +1726,12 @@ using the `.embedding()` factory method.
 const model = google.embedding('gemini-embedding-001');
 ```
-The Google Generative AI provider sends API calls to the right endpoint based on the type of embedding:
+The Google provider sends API calls to the right endpoint based on the type of embedding:
 - **Single embeddings**: When embedding a single value with `embed()`, the provider uses the single `:embedContent` endpoint, which typically has higher rate limits compared to the batch endpoint.
 - **Batch embeddings**: When embedding multiple values with `embedMany()` or multiple values in `embed()`, the provider uses the `:batchEmbedContents` endpoint.
-Google Generative AI embedding models support additional settings. You can pass them as an options argument:
+Google embedding models support additional settings. You can pass them as an options argument:
 ```ts
 import { google, type GoogleEmbeddingModelOptions } from '@ai-sdk/google';
@@ -1158,7 +1773,7 @@ const { embeddings } = await embedMany({
 });
 ```
-The following optional provider options are available for Google Generative AI embedding models:
+The following optional provider options are available for Google embedding models:
 - **outputDimensionality**: _number_
@@ -1167,7 +1782,6 @@ The following optional provider options are available for Google Generative AI e
 - **taskType**: _string_
   Optional. Specifies the task type for generating embeddings. Supported task types include:
   - `SEMANTIC_SIMILARITY`: Optimized for text similarity.
   - `CLASSIFICATION`: Optimized for text classification.
   - `CLUSTERING`: Optimized for clustering texts based on similarity.
@@ -1179,13 +1793,14 @@ The following optional provider options are available for Google Generative AI e
 - **content**: _array_
-  Optional. Per-value multimodal content parts for embedding non-text content (images, video, PDF, audio). Each entry corresponds to the embedding value at the same index — its parts are merged with the text value in the request. Use `null` for entries that are text-only. The array length must match the number of values being embedded. Each non-null entry is an array of parts, where each part can be either `{ text: string }` or `{ inlineData: { mimeType: string, data: string } }`. Supported by `gemini-embedding-2-preview`.
+  Optional. Per-value multimodal content parts for embedding non-text content (images, video, PDF, audio). Each entry corresponds to the embedding value at the same index — its parts are merged with the text value in the request. Use `null` for entries that are text-only. The array length must match the number of values being embedded. Each non-null entry is an array of parts, where each part can be `{ text: string }`, `{ inlineData: { mimeType: string, data: string } }` for inline base64 data, or `{ fileData: { fileUri: string, mimeType: string } }` to reference remote content via HTTP URL or Google Cloud Storage URI (`gs://...`). Supported by `gemini-embedding-2-preview`.
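Because `content` must line up index by index with the values being embedded, it can help to build it from a sparse mapping. This is a small sketch under that assumption; `EmbeddingPart` and `buildContent` are hypothetical local names, with the part shapes copied from the description above.

```typescript
// Part shapes as described above; the type name itself is hypothetical.
type EmbeddingPart =
  | { text: string }
  | { inlineData: { mimeType: string; data: string } }
  | { fileData: { fileUri: string; mimeType: string } };

// Builds a `content` array of exactly `valueCount` entries, using `null`
// for every index that has no extra multimodal parts (text-only values).
function buildContent(
  valueCount: number,
  partsByIndex: Record<number, EmbeddingPart[]>,
): Array<EmbeddingPart[] | null> {
  return Array.from({ length: valueCount }, (_, i) => partsByIndex[i] ?? null);
}
```

For three values where only the second has an attached image, `buildContent(3, { 1: [...] })` yields `null` at indices 0 and 2 and the parts array at index 1.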
 ### Model Capabilities
 | Model                        | Default Dimensions | Custom Dimensions   | Multimodal          |
 | ---------------------------- | ------------------ | ------------------- | ------------------- |
 | `gemini-embedding-001`       | 3072               | <Check size={18} /> | <Cross size={18} /> |
+| `gemini-embedding-2`         | 3072               | <Check size={18} /> | <Check size={18} /> |
 | `gemini-embedding-2-preview` | 3072               | <Check size={18} /> | <Check size={18} /> |
 ## Image Models
@@ -1309,6 +1924,29 @@ const { image } = await generateImage({
   details.
 </Note>
+#### Google Search Grounding
+Gemini image models support [Google Search grounding](#google-search) through `providerOptions.google.googleSearch`. The value matches the args of `google.tools.googleSearch(...)`; pass `{}` to enable with defaults, or `{ searchTypes: { imageSearch: {} } }` to ground on reference photos.
+```ts
+import { google } from '@ai-sdk/google';
+import { generateImage } from 'ai';
+const result = await generateImage({
+  model: google.image('gemini-3.1-flash-image-preview'),
+  prompt:
+    'Search for live footage of the 2026 Super Bowl halftime show artist, then generate a close-up in space.',
+  providerOptions: {
+    google: {
+      googleSearch: { searchTypes: { imageSearch: {} } },
+    },
+  },
+});
+// Grounding metadata is forwarded onto the image result:
+console.log(result.providerMetadata?.google?.groundingMetadata);
+```
 #### Gemini Image Model Capabilities
 | Model                            | Image Generation    | Image Editing       | Aspect Ratios                                       |
@@ -1323,3 +1961,80 @@ const { image } = await generateImage({
   2K, 4K via `providerOptions.google.imageConfig.imageSize`), and Google Search
   grounding.
 </Note>
+## Speech Models
+You can create models that call the [Gemini text-to-speech API](https://ai.google.dev/gemini-api/docs/speech-generation)
+using the `.speech()` factory method.
+The first argument is the model id, e.g. `gemini-2.5-flash-preview-tts`.
+```ts
+const model = google.speech('gemini-2.5-flash-preview-tts');
+```
+The `voice` argument can be set to one of Gemini's [30 prebuilt voices](https://ai.google.dev/gemini-api/docs/speech-generation#voices),
+e.g. `Kore`, `Puck`, `Zephyr`, or `Charon`. Voice names are case-sensitive. If omitted, `voice` defaults to `Kore`.
+```ts highlight="6"
+import { generateSpeech } from 'ai';
+import { google } from '@ai-sdk/google';
+const result = await generateSpeech({
+  model: google.speech('gemini-2.5-flash-preview-tts'),
+  text: 'Hello, world!',
+  voice: 'Kore', // Gemini voice name
+});
+```
+By default, the generated audio is returned as a playable WAV file (`result.audio.mediaType` is
+`audio/wav`). Set `outputFormat: 'pcm'` to receive the raw signed 16-bit little-endian mono PCM
+bytes instead; the sample rate is reported in `result.providerMetadata.google.sampleRate`.
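If you request `outputFormat: 'pcm'` and later need a playable file, the raw bytes can be wrapped in a standard 44-byte WAV header. The sketch below is an illustration only and not the provider's own WAV encoder; `pcmToWav` is a hypothetical helper, and the sample rate is expected to come from `result.providerMetadata.google.sampleRate`.

```typescript
// Hypothetical helper: wraps raw signed 16-bit little-endian mono PCM
// in a minimal 44-byte RIFF/WAVE header.
function pcmToWav(pcm: Uint8Array, sampleRate: number): Uint8Array {
  const channels = 1;
  const bytesPerSample = 2;
  const header = new DataView(new ArrayBuffer(44));
  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      header.setUint8(offset + i, text.charCodeAt(i));
    }
  };
  writeAscii(0, 'RIFF');
  header.setUint32(4, 36 + pcm.length, true); // file size minus the first 8 bytes
  writeAscii(8, 'WAVE');
  writeAscii(12, 'fmt ');
  header.setUint32(16, 16, true); // fmt chunk size
  header.setUint16(20, 1, true); // audio format 1 = uncompressed PCM
  header.setUint16(22, channels, true);
  header.setUint32(24, sampleRate, true);
  header.setUint32(28, sampleRate * channels * bytesPerSample, true); // byte rate
  header.setUint16(32, channels * bytesPerSample, true); // block align
  header.setUint16(34, 8 * bytesPerSample, true); // bits per sample
  writeAscii(36, 'data');
  header.setUint32(40, pcm.length, true); // data chunk size
  const wav = new Uint8Array(44 + pcm.length);
  wav.set(new Uint8Array(header.buffer), 0);
  wav.set(pcm, 44);
  return wav;
}
```

The little-endian flag (`true`) on every multi-byte write matches the byte order the WAV container expects.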
+Gemini honors natural-language style direction. The `instructions` argument is prepended to the
+spoken text, so `instructions: 'Say cheerfully'` with `text: 'Hello'` speaks `Say cheerfully: Hello`.
+### Multi-speaker audio
+For multi-speaker dialogue, pass a `multiSpeakerVoiceConfig` through `providerOptions`. Each speaker
+name must match a name used in the input text. When set, it overrides the top-level `voice`.
+```ts highlight="8-23"
+import { generateSpeech } from 'ai';
+import { google, type GoogleSpeechModelOptions } from '@ai-sdk/google';
+const result = await generateSpeech({
+  model: google.speech('gemini-2.5-flash-preview-tts'),
+  text: 'Joe: How are you? Jane: Doing great, thanks!',
+  providerOptions: {
+    google: {
+      multiSpeakerVoiceConfig: {
+        speakerVoiceConfigs: [
+          {
+            speaker: 'Joe',
+            voiceConfig: { prebuiltVoiceConfig: { voiceName: 'Kore' } },
+          },
+          {
+            speaker: 'Jane',
+            voiceConfig: { prebuiltVoiceConfig: { voiceName: 'Puck' } },
+          },
+        ],
+      },
+    } satisfies GoogleSpeechModelOptions,
+  },
+});
+```
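Since each configured speaker name has to appear in the input text, a quick pre-flight check can catch typos before the request is sent. This is a hypothetical helper, not an SDK feature, and it assumes turns are written in the `Name: line` style used in the example above.

```typescript
// Hypothetical pre-flight check; assumes each turn is written as "Name: ...".
// Returns the configured speakers that never appear in the text.
function findMissingSpeakers(text: string, speakers: string[]): string[] {
  return speakers.filter(name => !text.includes(`${name}:`));
}
```

An empty result means every configured speaker has at least one labelled turn. The match is case-sensitive, so `joe` would not satisfy a speaker named `Joe`.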
+<Note>
+  Gemini TTS models do not support the `speed` or `language` options; passing
+  them adds a warning to `result.warnings`. Language is detected automatically
+  from the input text.
+</Note>
+### Model Capabilities
+| Model                          | Multi-speaker       | Style via instructions |
+| ------------------------------ | ------------------- | ---------------------- |
+| `gemini-2.5-flash-preview-tts` | <Check size={18} /> | <Check size={18} />    |
+| `gemini-2.5-pro-preview-tts`   | <Check size={18} /> | <Check size={18} />    |
+| `gemini-3.1-flash-tts-preview` | <Check size={18} /> | <Check size={18} />    |