@ai-sdk/google 3.0.79 → 3.0.81

@@ -248,8 +248,14 @@ The following optional provider options are available for Google Generative AI m
248
248
  - **serviceTier** _'standard' | 'flex' | 'priority'_
249
249
 
250
250
  Optional. The service tier to use for the request.
251
- Set to 'flex' for 50% cheaper processing at the cost of increased latency.
252
- Set to 'priority' for ultra-low latency at a 75-100% price premium over 'standard'.
251
+ Set to `'flex'` for 50% cheaper processing at the cost of increased latency.
252
+ Set to `'priority'` for ultra-low latency at a 75-100% price premium over `'standard'`.
253
+
254
+ Because Priority can be gracefully downgraded to Standard under load, the
255
+ tier the request actually ran on is surfaced on
256
+ `result.providerMetadata.google.serviceTier`. See
257
+ [Priority inference](https://ai.google.dev/gemini-api/docs/priority-inference)
258
+ and [Flex inference](https://ai.google.dev/gemini-api/docs/flex-inference).
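
A minimal sketch of how the downgrade described above might be detected from the result. It assumes the tier reported on `providerMetadata.google.serviceTier` uses the same strings as the `serviceTier` option, which this diff does not confirm. `wasTierChanged` and `ResultLike` are hypothetical names for illustration, not part of `@ai-sdk/google`.

```typescript
// Hypothetical helper: none of these names come from @ai-sdk/google.
type ServiceTier = 'standard' | 'flex' | 'priority';

// Minimal structural type for the part of the result that is read here.
type ResultLike = {
  providerMetadata?: { google?: { serviceTier?: unknown } };
};

// Returns true when a tier was reported and it differs from the requested one.
// Assumption: the reported value uses the same strings as the option.
function wasTierChanged(requested: ServiceTier, result: ResultLike): boolean {
  const actual = result.providerMetadata?.google?.serviceTier;
  return typeof actual === 'string' && actual !== requested;
}

// Stand-in result objects; a real call would pass the generateText result.
const downgraded = wasTierChanged('priority', {
  providerMetadata: { google: { serviceTier: 'standard' } },
});
const unchanged = wasTierChanged('priority', {
  providerMetadata: { google: { serviceTier: 'priority' } },
});
const unreported = wasTierChanged('priority', {});

console.log({ downgraded, unchanged, unreported });
```

Treating a missing value as "not changed" is a design choice of this sketch; callers that need certainty may prefer to handle the unreported case separately.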
253
259
 
254
260
  - **threshold** _string_
255
261
 
@@ -1090,8 +1096,9 @@ The following Zod features are known to not work with Google Generative AI:
1090
1096
 
1091
1097
  The [Gemini Interactions API](https://ai.google.dev/gemini-api/docs/interactions)
1092
1098
  (`POST /v1beta/interactions`) is a separate Google endpoint with server-side
1093
- state, unified content blocks, first-class built-in tools, agent presets, and
1094
- native multimodal image output. It is reached via the `google.interactions(...)`
1099
+ state, unified content blocks, first-class built-in tools, agent presets,
1100
+ managed agents that run in a sandboxed Linux environment, and native
1101
+ multimodal image output. It is reached via the `google.interactions(...)`
1095
1102
  factory:
1096
1103
 
1097
1104
  ```ts
@@ -1104,10 +1111,12 @@ const { text } = await generateText({
1104
1111
  });
1105
1112
  ```
1106
1113
 
1107
- `google.interactions(...)` accepts either a model ID string (e.g.
1108
- `'gemini-2.5-flash'`, `'gemini-3-pro-preview'`) or `{ agent: <name> }` to use
1109
- a Gemini [agent preset](#agent-presets). The returned model can be passed to
1110
- `generateText` and `streamText` like any other AI SDK language model.
1114
+ `google.interactions(...)` accepts a model ID string (e.g.
1115
+ `'gemini-2.5-flash'`, `'gemini-3-pro-preview'`), `{ agent: <name> }` to use
1116
+ a Gemini [agent preset](#agent-presets), or `{ managedAgent: <name> }` to
1117
+ invoke a [managed agent](#managed-agents) you created on Google's side.
1118
+ The returned model can be passed to `generateText` and `streamText` like
1119
+ any other AI SDK language model.
1111
1120
 
1112
1121
  <Note>
1113
1122
  Use `google(...)` for the standard `:generateContent` /
@@ -1213,6 +1222,22 @@ The following optional provider options are available:
1213
1222
  Alternative to the AI SDK `system` message. If both are set, the AI SDK
1214
1223
  `system` message wins and a warning is emitted.
1215
1224
 
1225
+ - **background** _boolean_
1226
+
1227
+ Run the interaction in the background. Required for agents whose
1228
+ server-side workflow cannot complete within a single request/response;
1229
+ rejected by agents that only support synchronous calls. When `true`,
1230
+ the POST returns a non-terminal status and the SDK polls
1231
+ `GET /interactions/{id}` until the work completes.
1232
+
1233
+ - **environment** _string \| object_
1234
+
1235
+ Sandbox environment configuration for [managed agents](#managed-agents).
1236
+ Pass `'remote'` to provision a fresh sandbox, an `environment_id`
1237
+ string to reuse an existing one, or an object of the form
1238
+ `{ type: 'remote', sources?, network? }` to preload files and/or
1239
+ constrain outbound traffic. Only applies to agent calls.
1240
+
1216
1241
  - **pollingTimeoutMs** _number_
1217
1242
 
1218
1243
  Maximum time, in milliseconds, to poll a background interaction (agent
@@ -1466,7 +1491,10 @@ Pass `{ agent: <name> }` to target a Gemini agent preset. The factory
1466
1491
  type-checks the agent name against the supported set:
1467
1492
 
1468
1493
  ```ts
1469
- import { google } from '@ai-sdk/google';
1494
+ import {
1495
+ google,
1496
+ type GoogleLanguageModelInteractionsOptions,
1497
+ } from '@ai-sdk/google';
1470
1498
  import { generateText } from 'ai';
1471
1499
 
1472
1500
  const result = await generateText({
@@ -1475,28 +1503,143 @@ const result = await generateText({
1475
1503
  }),
1476
1504
  prompt:
1477
1505
  'Briefly summarize the most-cited papers on retrieval-augmented generation since 2024 (2-3 sentences).',
1506
+ providerOptions: {
1507
+ google: {
1508
+ background: true,
1509
+ } satisfies GoogleLanguageModelInteractionsOptions,
1510
+ },
1478
1511
  });
1479
1512
  ```
1480
1513
 
1481
- Agent calls run with `background: true` on the wire and the SDK polls the
1482
- `GET /interactions/{id}` endpoint internally until the interaction
1483
- completes. The default polling timeout is 30 minutes; raise it via
1514
+ Whether an agent runs synchronously or in the background depends on the
1515
+ agent. Long-running presets (such as the `deep-research-*` family)
1516
+ require `background: true`; the POST then returns a non-terminal status and
1517
+ the SDK polls `GET /interactions/{id}` internally until the interaction
1518
+ completes. Other agents accept synchronous calls only and will reject
1519
+ `background: true`. Set the flag explicitly via
1520
+ `providerOptions.google.background`.
1521
+
1522
+ The default polling timeout is 30 minutes; raise it via
1484
1523
  `pollingTimeoutMs` for slower agents:
1485
1524
 
1486
1525
  ```ts
1526
+ import {
1527
+ google,
1528
+ type GoogleLanguageModelInteractionsOptions,
1529
+ } from '@ai-sdk/google';
1530
+ import { generateText } from 'ai';
1531
+
1487
1532
  await generateText({
1488
1533
  model: google.interactions({ agent: 'deep-research-max-preview-04-2026' }),
1489
1534
  prompt: 'Produce a long-form research brief on ...',
1490
1535
  providerOptions: {
1491
1536
  google: {
1537
+ background: true,
1492
1538
  pollingTimeoutMs: 60 * 60 * 1000, // 1 hour
1493
- },
1539
+ } satisfies GoogleLanguageModelInteractionsOptions,
1494
1540
  },
1495
1541
  });
1496
1542
  ```
1497
1543
 
1498
1544
  Agents also chain through `previousInteractionId` like model-id calls.
1499
1545
 
1546
+ ### Managed Agents
1547
+
1548
+ [Managed agents](https://ai.google.dev/gemini-api/docs/agents) run inside a
1549
+ sandboxed Linux environment provisioned per interaction. Pass the `environment`
1550
+ provider option to control how the sandbox is set up; the option is only
1551
+ accepted on agent calls.
1552
+
1553
+ The simplest form provisions a fresh sandbox:
1554
+
1555
+ ```ts
1556
+ import {
1557
+ google,
1558
+ type GoogleLanguageModelInteractionsOptions,
1559
+ } from '@ai-sdk/google';
1560
+ import { generateText } from 'ai';
1561
+
1562
+ const result = await generateText({
1563
+ model: google.interactions({ agent: 'antigravity-preview-05-2026' }),
1564
+ prompt: 'What is 2 + 2?',
1565
+ providerOptions: {
1566
+ google: {
1567
+ environment: 'remote',
1568
+ } satisfies GoogleLanguageModelInteractionsOptions,
1569
+ },
1570
+ });
1571
+ ```
1572
+
1573
+ `environment` accepts three shapes:
1574
+
1575
+ - `'remote'` — provision a fresh sandbox for this call.
1576
+ - any other string — an `environment_id` to reuse, forking the previous
1577
+ sandbox so its filesystem and installed packages persist.
1578
+ - an object — provision a fresh sandbox and optionally preload `sources`
1579
+ and/or constrain outbound traffic via `network`:
1580
+
1581
+ ```ts
1582
+ import {
1583
+ google,
1584
+ type GoogleLanguageModelInteractionsOptions,
1585
+ } from '@ai-sdk/google';
1586
+ import { generateText } from 'ai';
1587
+
1588
+ await generateText({
1589
+ model: google.interactions({ agent: 'antigravity-preview-05-2026' }),
1590
+ prompt:
1591
+ 'Read the file at /data/note.txt and tell me exactly what it contains.',
1592
+ providerOptions: {
1593
+ google: {
1594
+ environment: {
1595
+ type: 'remote',
1596
+ sources: [
1597
+ {
1598
+ type: 'inline',
1599
+ content: 'hello from the AI SDK example\n',
1600
+ target: '/data/note.txt',
1601
+ },
1602
+ ],
1603
+ },
1604
+ } satisfies GoogleLanguageModelInteractionsOptions,
1605
+ },
1606
+ });
1607
+ ```
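
The three `environment` shapes listed above can be told apart with a small helper. This is an illustrative sketch only: the local types mirror the prose rather than the SDK's exported types, and `planEnvironment`, `EnvironmentPlan`, and the `env_123` id are hypothetical.

```typescript
// Illustrative local types; they mirror the prose, not the SDK's exported types.
type EnvironmentObject = {
  type: 'remote';
  sources?: unknown[];
  network?: unknown;
};
type Environment = string | EnvironmentObject;

type EnvironmentPlan =
  | { kind: 'fresh' }
  | { kind: 'reuse'; environmentId: string }
  | { kind: 'fresh-configured'; hasSources: boolean; hasNetwork: boolean };

// Classifies an `environment` value into one of the three documented shapes.
function planEnvironment(environment: Environment): EnvironmentPlan {
  if (typeof environment === 'string') {
    // 'remote' provisions a fresh sandbox; any other string is an environment id.
    return environment === 'remote'
      ? { kind: 'fresh' }
      : { kind: 'reuse', environmentId: environment };
  }
  // Object form: a fresh sandbox with optional sources and network settings.
  return {
    kind: 'fresh-configured',
    hasSources: (environment.sources?.length ?? 0) > 0,
    hasNetwork: environment.network !== undefined,
  };
}

const freshPlan = planEnvironment('remote');
const reusePlan = planEnvironment('env_123'); // hypothetical environment id
const configuredPlan = planEnvironment({
  type: 'remote',
  sources: [{ type: 'inline', content: 'hi\n', target: '/data/note.txt' }],
});

console.log(freshPlan, reusePlan, configuredPlan);
```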
1608
+
1609
+ Three source types are supported: `inline` (write a string into the
1610
+ sandbox at `target`), `repository` (clone a git repository — pass the
1611
+ URL as `source`), and `gcs` (mount a Google Cloud Storage prefix).
1612
+
1613
+ The `network` field accepts the string `'disabled'` to block all
1614
+ outbound traffic, or an object with an `allowlist` array whose entries
1615
+ each carry a `domain` plus an optional `transform` array of header
1616
+ objects to inject into matching requests.
1617
+
1618
+ #### Custom managed agents
1619
+
1620
+ For user-defined agents that you created on Google's side via the
1621
+ Gemini API's `/v1beta/agents` endpoint, pass the agent's name through the dedicated
1622
+ `managedAgent` factory shape instead of `agent` (which only accepts
1623
+ known preset names):
1624
+
1625
+ ```ts
1626
+ import {
1627
+ google,
1628
+ type GoogleLanguageModelInteractionsOptions,
1629
+ } from '@ai-sdk/google';
1630
+ import { generateText } from 'ai';
1631
+
1632
+ const result = await generateText({
1633
+ model: google.interactions({ managedAgent: 'my-custom-agent' }),
1634
+ prompt: 'Hello!',
1635
+ providerOptions: {
1636
+ google: {
1637
+ environment: 'remote',
1638
+ } satisfies GoogleLanguageModelInteractionsOptions,
1639
+ },
1640
+ });
1641
+ ```
1642
+
1500
1643
  ### Streaming
1501
1644
 
1502
1645
  `streamText` is supported and consumes the seven Interactions SSE event
@@ -1522,22 +1665,6 @@ const googleMetadata = (await result.providerMetadata)?.google;
1522
1665
  console.log('Interaction id:', googleMetadata?.interactionId);
1523
1666
  ```
1524
1667
 
1525
- ### Runnable Examples
1526
-
1527
- Paired `generateText` + `streamText` examples live under:
1528
-
1529
- - `examples/ai-functions/src/generate-text/google/interactions-*.ts`
1530
- - `examples/ai-functions/src/stream-text/google/interactions-*.ts`
1531
-
1532
- Notable examples: `interactions-basic`, `interactions-multi-turn-stateful`,
1533
- `interactions-multi-turn-stateless`, `interactions-tool-call`,
1534
- `interactions-google-search`, `interactions-image-output`,
1535
- `interactions-image-output-modify`, `interactions-image-base64`,
1536
- `interactions-image-reference`, `interactions-image-url`,
1537
- `interactions-pdf`, `interactions-structured-output`,
1538
- `interactions-service-tier`, `interactions-agent-single-turn`, and
1539
- `interactions-agent-multi-turn`.
1540
-
1541
1668
  ## Gemma Models
1542
1669
 
1543
1670
  You can use [Gemma models](https://deepmind.google/models/gemma/) with the Google Generative AI API.
@@ -1768,6 +1895,29 @@ const { image } = await generateImage({
1768
1895
  details.
1769
1896
  </Note>
1770
1897
 
1898
+ #### Google Search Grounding
1899
+
1900
+ Gemini image models support [Google Search grounding](#google-search) through `providerOptions.google.googleSearch`. The value matches the args of `google.tools.googleSearch(...)`; pass `{}` to enable with defaults, or `{ searchTypes: { imageSearch: {} } }` to ground on reference photos.
1901
+
1902
+ ```ts
1903
+ import { google } from '@ai-sdk/google';
1904
+ import { generateImage } from 'ai';
1905
+
1906
+ const result = await generateImage({
1907
+ model: google.image('gemini-3.1-flash-image-preview'),
1908
+ prompt:
1909
+ 'Search for live footage of the 2026 Super Bowl halftime show artist, then generate a close-up in space.',
1910
+ providerOptions: {
1911
+ google: {
1912
+ googleSearch: { searchTypes: { imageSearch: {} } },
1913
+ },
1914
+ },
1915
+ });
1916
+
1917
+ // Grounding metadata is forwarded onto the image result:
1918
+ console.log(result.providerMetadata?.google?.groundingMetadata);
1919
+ ```
1920
+
1771
1921
  #### Gemini Image Model Capabilities
1772
1922
 
1773
1923
  | Model | Image Generation | Image Editing | Aspect Ratios |
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@ai-sdk/google",
3
- "version": "3.0.79",
3
+ "version": "3.0.81",
4
4
  "license": "Apache-2.0",
5
5
  "sideEffects": false,
6
6
  "main": "./dist/index.js",
@@ -36,8 +36,8 @@
36
36
  }
37
37
  },
38
38
  "dependencies": {
39
- "@ai-sdk/provider": "3.0.10",
40
- "@ai-sdk/provider-utils": "4.0.27"
39
+ "@ai-sdk/provider-utils": "4.0.28",
40
+ "@ai-sdk/provider": "3.0.10"
41
41
  },
42
42
  "devDependencies": {
43
43
  "@types/node": "20.17.24",
@@ -1,6 +1,7 @@
1
1
  import {
2
2
  UnsupportedFunctionalityError,
3
3
  type LanguageModelV3Prompt,
4
+ type SharedV3Warning,
4
5
  } from '@ai-sdk/provider';
5
6
  import { convertToBase64 } from '@ai-sdk/provider-utils';
6
7
  import type {
@@ -10,6 +11,48 @@ import type {
10
11
  GoogleGenerativeAIPrompt,
11
12
  } from './google-generative-ai-prompt';
12
13
 
14
+ /**
15
+ * Sentinel value Google documents for replaying functionCall parts whose
16
+ * original thoughtSignature is not available to the client.
17
+ *
18
+ * Gemini 3 models reject `functionCall` parts that lack a `thoughtSignature`
19
+ * with HTTP 400 "Function call is missing a thought_signature in functionCall
20
+ * parts." Sending this sentinel string in place of the missing signature
21
+ * makes Gemini skip the validator and continue the turn.
22
+ *
23
+ * See https://ai.google.dev/gemini-api/docs/thought-signatures.
24
+ */
25
+ export const SKIP_THOUGHT_SIGNATURE_VALIDATOR =
26
+ 'skip_thought_signature_validator';
27
+
28
+ type GoogleProviderOptions = {
29
+ thought?: unknown;
30
+ thoughtSignature?: unknown;
31
+ serverToolCallId?: unknown;
32
+ serverToolType?: unknown;
33
+ };
34
+
35
+ function getGoogleProviderOptions(
36
+ providerOptions: Record<string, GoogleProviderOptions> | undefined,
37
+ providerOptionsName: string,
38
+ ): GoogleProviderOptions | undefined {
39
+ const namespaces = [
40
+ providerOptionsName,
41
+ 'google',
42
+ 'googleVertex',
43
+ 'vertex',
44
+ ].filter((namespace, index, allNamespaces) => {
45
+ return allNamespaces.indexOf(namespace) === index;
46
+ });
47
+
48
+ for (const namespace of namespaces) {
49
+ const options = providerOptions?.[namespace];
50
+ if (options != null) {
51
+ return options;
52
+ }
53
+ }
54
+ }
55
+
13
56
  const dataUrlRegex = /^data:([^;,]+);base64,(.+)$/s;
14
57
 
15
58
  function parseBase64DataUrl(
@@ -168,17 +211,41 @@ export function convertToGoogleGenerativeAIMessages(
168
211
  prompt: LanguageModelV3Prompt,
169
212
  options?: {
170
213
  isGemmaModel?: boolean;
214
+ /**
215
+ * Whether the target model is in the Gemini 3 family. Gemini 3 enforces a
216
+ * `thoughtSignature` on every replayed `functionCall` part; when one is
217
+ * missing we inject the documented `skip_thought_signature_validator`
218
+ * sentinel and emit a warning via `onWarning` so the developer can find
219
+ * and fix the upstream serialization that lost the signature.
220
+ */
221
+ isGemini3Model?: boolean;
171
222
  providerOptionsName?: string;
172
223
  supportsFunctionResponseParts?: boolean;
224
+ /**
225
+ * Called once for the request when a Gemini 3 `functionCall` part is
226
+ * about to be sent without a `thoughtSignature` and the sentinel is
227
+ * injected.
228
+ */
229
+ onWarning?: (warning: SharedV3Warning) => void;
173
230
  },
174
231
  ): GoogleGenerativeAIPrompt {
175
232
  const systemInstructionParts: Array<{ text: string }> = [];
176
233
  const contents: Array<GoogleGenerativeAIContent> = [];
177
234
  let systemMessagesAllowed = true;
178
235
  const isGemmaModel = options?.isGemmaModel ?? false;
236
+ const isGemini3Model = options?.isGemini3Model ?? false;
179
237
  const providerOptionsName = options?.providerOptionsName ?? 'google';
180
238
  const supportsFunctionResponseParts =
181
239
  options?.supportsFunctionResponseParts ?? true;
240
+ const onWarning = options?.onWarning;
241
+
242
+ let sentinelInjected = false;
243
+ const missingSignatureToolNames: string[] = [];
244
+ const injectSkipSignature = (toolName: string) => {
245
+ missingSignatureToolNames.push(toolName);
246
+ sentinelInjected = true;
247
+ return SKIP_THOUGHT_SIGNATURE_VALIDATOR;
248
+ };
182
249
 
183
250
  for (const { role, content } of prompt) {
184
251
  switch (role) {
@@ -243,11 +310,10 @@ export function convertToGoogleGenerativeAIMessages(
243
310
  role: 'model',
244
311
  parts: content
245
312
  .map(part => {
246
- const providerOpts =
247
- part.providerOptions?.[providerOptionsName] ??
248
- (providerOptionsName !== 'google'
249
- ? part.providerOptions?.google
250
- : part.providerOptions?.vertex);
313
+ const providerOpts = getGoogleProviderOptions(
314
+ part.providerOptions,
315
+ providerOptionsName,
316
+ );
251
317
  const thoughtSignature =
252
318
  providerOpts?.thoughtSignature != null
253
319
  ? String(providerOpts.thoughtSignature)
@@ -303,6 +369,16 @@ export function convertToGoogleGenerativeAIMessages(
303
369
  ? String(providerOpts.serverToolType)
304
370
  : undefined;
305
371
 
372
+ // For Gemini 3, every replayed functionCall part must carry a
373
+ // thoughtSignature or the API returns HTTP 400. If the upstream
374
+ // serialization layer dropped the signature, inject the
375
+ // documented sentinel so the request still succeeds.
376
+ const effectiveThoughtSignature =
377
+ thoughtSignature ??
378
+ (isGemini3Model
379
+ ? injectSkipSignature(part.toolName)
380
+ : undefined);
381
+
306
382
  if (serverToolCallId && serverToolType) {
307
383
  return {
308
384
  toolCall: {
@@ -313,7 +389,7 @@ export function convertToGoogleGenerativeAIMessages(
313
389
  : part.input,
314
390
  id: serverToolCallId,
315
391
  },
316
- thoughtSignature,
392
+ thoughtSignature: effectiveThoughtSignature,
317
393
  };
318
394
  }
319
395
 
@@ -325,7 +401,7 @@ export function convertToGoogleGenerativeAIMessages(
325
401
  name: part.toolName,
326
402
  args: part.input,
327
403
  },
328
- thoughtSignature,
404
+ thoughtSignature: effectiveThoughtSignature,
329
405
  };
330
406
  }
331
407
 
@@ -371,11 +447,10 @@ export function convertToGoogleGenerativeAIMessages(
371
447
  continue;
372
448
  }
373
449
 
374
- const partProviderOpts =
375
- part.providerOptions?.[providerOptionsName] ??
376
- (providerOptionsName !== 'google'
377
- ? part.providerOptions?.google
378
- : part.providerOptions?.vertex);
450
+ const partProviderOpts = getGoogleProviderOptions(
451
+ part.providerOptions,
452
+ providerOptionsName,
453
+ );
379
454
  const serverToolCallId =
380
455
  partProviderOpts?.serverToolCallId != null
381
456
  ? String(partProviderOpts.serverToolCallId)
@@ -465,6 +540,23 @@ export function convertToGoogleGenerativeAIMessages(
465
540
  contents[0].parts.unshift({ text: systemText + '\n\n' });
466
541
  }
467
542
 
543
+ if (sentinelInjected && onWarning != null) {
544
+ const uniqueToolNames = Array.from(new Set(missingSignatureToolNames));
545
+ onWarning({
546
+ type: 'other',
547
+ message:
548
+ `Replayed ${missingSignatureToolNames.length} \`functionCall\` part(s) ` +
549
+ `for a Gemini 3 model without a \`thoughtSignature\` ` +
550
+ `(tools: ${uniqueToolNames.map(name => `\`${name}\``).join(', ')}). ` +
551
+ `Injected the documented \`skip_thought_signature_validator\` sentinel ` +
552
+ `to keep the request from failing with HTTP 400. ` +
553
+ `The likely cause is application code that drops ` +
554
+ '`providerOptions.google.thoughtSignature` when persisting or ' +
555
+ 'serializing assistant tool-call messages. ' +
556
+ 'See https://ai.google.dev/gemini-api/docs/thought-signatures.',
557
+ });
558
+ }
559
+
468
560
  return {
469
561
  systemInstruction:
470
562
  systemInstructionParts.length > 0 && !isGemmaModel
@@ -193,14 +193,17 @@ export class GoogleGenerativeAILanguageModel implements LanguageModelV3 {
193
193
  : googleOptions?.serviceTier;
194
194
 
195
195
  const isGemmaModel = this.modelId.toLowerCase().startsWith('gemma-');
196
- const supportsFunctionResponseParts = this.modelId.startsWith('gemini-3');
196
+ const isGemini3Model = /^gemini-3[.-]/.test(this.modelId);
197
+ const supportsFunctionResponseParts = isGemini3Model;
197
198
 
198
199
  const { contents, systemInstruction } = convertToGoogleGenerativeAIMessages(
199
200
  prompt,
200
201
  {
201
202
  isGemmaModel,
203
+ isGemini3Model,
202
204
  providerOptionsName,
203
205
  supportsFunctionResponseParts,
206
+ onWarning: warning => warnings.push(warning),
204
207
  },
205
208
  );
206
209
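
The last hunk replaces `startsWith('gemini-3')` with the pattern `/^gemini-3[.-]/`. The sketch below compares the two checks on a few model IDs. The IDs `gemini-30-example` and the bare `gemini-3` are hypothetical inputs chosen to show where the checks diverge; the helper names are local to this example.

```typescript
// New check from the diff: "gemini-3" must be followed by "." or "-".
const matchesGemini3Pattern = (modelId: string): boolean =>
  /^gemini-3[.-]/.test(modelId);

// Previous check from the diff: a plain prefix match.
const hasGemini3Prefix = (modelId: string): boolean =>
  modelId.startsWith('gemini-3');

const sampleIds = [
  'gemini-3-pro-preview',
  'gemini-3.1-flash-image-preview',
  'gemini-2.5-flash',
  'gemini-30-example', // hypothetical ID, not a real model
  'gemini-3', // hypothetical bare ID with no suffix
];

// One row per ID showing the result of each check.
const comparison = sampleIds.map(modelId => ({
  modelId,
  pattern: matchesGemini3Pattern(modelId),
  prefix: hasGemini3Prefix(modelId),
}));

console.table(comparison);
```

The two checks agree on the first three IDs. The prefix check also accepts the last two, which the pattern rejects.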