@proxyvn/genai 1.40.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md ADDED
@@ -0,0 +1,609 @@
+ # Google Gen AI SDK for TypeScript and JavaScript
+
+ [![NPM Downloads](https://img.shields.io/npm/dw/%40google%2Fgenai)](https://www.npmjs.com/package/@google/genai)
+ [![Node Current](https://img.shields.io/node/v/%40google%2Fgenai)](https://www.npmjs.com/package/@google/genai)
+
+ ----------------------
+ **Documentation:** https://googleapis.github.io/js-genai/
+
+ ----------------------
+
+ The Google Gen AI JavaScript SDK is designed for
+ TypeScript and JavaScript developers to build applications powered by Gemini. The SDK
+ supports both the [Gemini Developer API](https://ai.google.dev/gemini-api/docs)
+ and [Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/overview).
+
+ The Google Gen AI SDK is designed to work with Gemini 2.0+ features.
+
+ > [!CAUTION]
+ > **API Key Security:** Avoid exposing API keys in client-side code.
+ > Use server-side implementations in production environments.
+
+ ## Code Generation
+
+ Generative models are often unaware of recent API and SDK updates and may suggest outdated or legacy code.
+
+ We recommend using our Code Generation instructions [`codegen_instructions.md`](https://raw.githubusercontent.com/googleapis/js-genai/refs/heads/main/codegen_instructions.md) when generating Google Gen AI SDK code to guide your model towards using the more recent SDK features. Copy and paste the instructions into your development environment to provide the model with the necessary context.
+
+ ## Prerequisites
+
+ 1. Node.js version 20 or later
+
+ ### Additional requirements for Vertex AI users (excluding Vertex AI Studio)
+ 1. [Select](https://console.cloud.google.com/project) or [create](https://cloud.google.com/resource-manager/docs/creating-managing-projects#creating_a_project) a Google Cloud project.
+ 1. [Enable billing for your project](https://cloud.google.com/billing/docs/how-to/modify-project).
+ 1. [Enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).
+ 1. [Configure authentication](https://cloud.google.com/docs/authentication) for your project.
+     * [Install the gcloud CLI](https://cloud.google.com/sdk/docs/install).
+     * [Initialize the gcloud CLI](https://cloud.google.com/sdk/docs/initializing).
+     * Create local authentication credentials for your user account:
+
+       ```sh
+       gcloud auth application-default login
+       ```
+ Accepted authentication options are listed in the [GoogleAuthOptions](https://github.com/googleapis/google-auth-library-nodejs/blob/3ae120d0a45c95e36c59c9ac8286483938781f30/src/auth/googleauth.ts#L87) interface of the google-auth-library-nodejs GitHub repo.
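+
+ For example, Vertex AI users can pass any of those options to the client via
+ the `googleAuthOptions` field. A minimal sketch, assuming you want to
+ authenticate with a service account key file (the path is a placeholder):
+
+ ```typescript
+ import { GoogleGenAI } from '@google/genai';
+
+ const ai = new GoogleGenAI({
+   vertexai: true,
+   project: 'your_project',
+   location: 'your_location',
+   // Any GoogleAuthOptions field can be set here; keyFilename is one example.
+   googleAuthOptions: {keyFilename: '/path/to/service-account.json'},
+ });
+ ```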
+
+ ## Installation
+
+ To install the SDK, run the following command:
+
+ ```shell
+ npm install @google/genai
+ ```
+
+ ## Quickstart
+
+ The simplest way to get started is to use an API key from
+ [Google AI Studio](https://aistudio.google.com/apikey):
+
+ ```typescript
+ import {GoogleGenAI} from '@google/genai';
+ const GEMINI_API_KEY = process.env.GEMINI_API_KEY;
+
+ const ai = new GoogleGenAI({apiKey: GEMINI_API_KEY});
+
+ async function main() {
+   const response = await ai.models.generateContent({
+     model: 'gemini-2.5-flash',
+     contents: 'Why is the sky blue?',
+   });
+   console.log(response.text);
+ }
+
+ main();
+ ```
+
+ ## Initialization
+
+ The Google Gen AI SDK provides support for both the
+ [Google AI Studio](https://ai.google.dev/gemini-api/docs) and
+ [Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/overview)
+ implementations of the Gemini API.
+
+ ### Gemini Developer API
+
+ For server-side applications, initialize using an API key, which can
+ be acquired from [Google AI Studio](https://aistudio.google.com/apikey):
+
+ ```typescript
+ import { GoogleGenAI } from '@google/genai';
+ const ai = new GoogleGenAI({apiKey: 'GEMINI_API_KEY'});
+ ```
+
+ #### Browser
+
+ > [!CAUTION]
+ > **API Key Security:** Avoid exposing API keys in client-side code.
+ > Use server-side implementations in production environments.
+
+ In the browser, the initialization code is identical:
+
+ ```typescript
+ import { GoogleGenAI } from '@google/genai';
+ const ai = new GoogleGenAI({apiKey: 'GEMINI_API_KEY'});
+ ```
+
+ ### Vertex AI
+
+ Sample code for Vertex AI initialization:
+
+ ```typescript
+ import { GoogleGenAI } from '@google/genai';
+
+ const ai = new GoogleGenAI({
+   vertexai: true,
+   project: 'your_project',
+   location: 'your_location',
+ });
+ ```
+
+ ### (Optional, Node.js only) Using environment variables
+
+ For Node.js environments, you can create a client by configuring the necessary
+ environment variables. The configuration steps depend on whether
+ you're using the Gemini Developer API or the Gemini API in Vertex AI.
+
+ **Gemini Developer API:** Set `GOOGLE_API_KEY` as shown below:
+
+ ```bash
+ export GOOGLE_API_KEY='your-api-key'
+ ```
+
+ **Gemini API on Vertex AI:** Set `GOOGLE_GENAI_USE_VERTEXAI`,
+ `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION`, as shown below:
+
+ ```bash
+ export GOOGLE_GENAI_USE_VERTEXAI=true
+ export GOOGLE_CLOUD_PROJECT='your-project-id'
+ export GOOGLE_CLOUD_LOCATION='us-central1'
+ ```
+
+ ```typescript
+ import {GoogleGenAI} from '@google/genai';
+
+ const ai = new GoogleGenAI();
+ ```
+
+ ## API Selection
+
+ By default, the SDK uses the beta API endpoints provided by Google to support
+ preview features in the APIs. The stable API endpoints can be selected by
+ setting the API version to `v1`.
+
+ To set the API version, use `apiVersion`. For example, to set the API version to
+ `v1` for Vertex AI:
+
+ ```typescript
+ const ai = new GoogleGenAI({
+   vertexai: true,
+   project: 'your_project',
+   location: 'your_location',
+   apiVersion: 'v1'
+ });
+ ```
+
+ To set the API version to `v1alpha` for the Gemini Developer API:
+
+ ```typescript
+ const ai = new GoogleGenAI({
+   apiKey: 'GEMINI_API_KEY',
+   apiVersion: 'v1alpha'
+ });
+ ```
+
+ ## GoogleGenAI overview
+
+ All API features are accessed through an instance of the `GoogleGenAI` class.
+ The submodules bundle together related API methods:
+
+ - [`ai.models`](https://googleapis.github.io/js-genai/release_docs/classes/models.Models.html):
+   Use `models` to query models (`generateContent`, `generateImages`, ...), or
+   examine their metadata.
+ - [`ai.caches`](https://googleapis.github.io/js-genai/release_docs/classes/caches.Caches.html):
+   Create and manage `caches` to reduce costs when repeatedly using the same
+   large prompt prefix.
+ - [`ai.chats`](https://googleapis.github.io/js-genai/release_docs/classes/chats.Chats.html):
+   Create local stateful `chat` objects to simplify multi-turn interactions;
+   see the sketch after this list.
+ - [`ai.files`](https://googleapis.github.io/js-genai/release_docs/classes/files.Files.html):
+   Upload `files` to the API and reference them in your prompts.
+   This reduces bandwidth if you use a file many times, and handles files too
+   large to fit inline with your prompt.
+ - [`ai.live`](https://googleapis.github.io/js-genai/release_docs/classes/live.Live.html):
+   Start a `live` session for real-time interaction that allows text + audio +
+   video input, and text or audio output.
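+
+ For example, a `chat` object resends the accumulated history with every turn.
+ A minimal sketch, using the same model and API key setup as the Quickstart:
+
+ ```typescript
+ import {GoogleGenAI} from '@google/genai';
+
+ const ai = new GoogleGenAI({apiKey: process.env.GEMINI_API_KEY});
+
+ async function main() {
+   // Each sendMessage call automatically includes the prior turns.
+   const chat = ai.chats.create({model: 'gemini-2.5-flash'});
+   const first = await chat.sendMessage({message: 'Hi, my name is Amir.'});
+   console.log(first.text);
+   const second = await chat.sendMessage({message: 'What is my name?'});
+   console.log(second.text);
+ }
+
+ main();
+ ```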
+
+ ## Samples
+
+ More samples can be found in the
+ [GitHub samples directory](https://github.com/googleapis/js-genai/tree/main/sdk-samples).
+
+ ### Streaming
+
+ For quicker, more responsive API interactions, use the `generateContentStream`
+ method, which yields chunks as they're generated:
+
+ ```typescript
+ import {GoogleGenAI} from '@google/genai';
+ const GEMINI_API_KEY = process.env.GEMINI_API_KEY;
+
+ const ai = new GoogleGenAI({apiKey: GEMINI_API_KEY});
+
+ async function main() {
+   const response = await ai.models.generateContentStream({
+     model: 'gemini-2.5-flash',
+     contents: 'Write a 100-word poem.',
+   });
+   for await (const chunk of response) {
+     console.log(chunk.text);
+   }
+ }
+
+ main();
+ ```
+
+ ### Function Calling
+
+ To let Gemini interact with external systems, you can provide
+ `functionDeclaration` objects as `tools`. Using these tools is a four-step
+ process:
+
+ 1. **Declare the function name, description, and parametersJsonSchema**
+ 2. **Call `generateContent` with function calling enabled**
+ 3. **Use the returned `FunctionCall` parameters to call your actual function**
+ 4. **Send the result back to the model (with history, easier in `ai.chats`)
+    as a `FunctionResponse`**
+
+ The example below covers steps 1 and 2; steps 3 and 4 are sketched after it.
+
+ ```typescript
+ import {GoogleGenAI, FunctionCallingConfigMode, FunctionDeclaration, Type} from '@google/genai';
+ const GEMINI_API_KEY = process.env.GEMINI_API_KEY;
+
+ async function main() {
+   const controlLightDeclaration: FunctionDeclaration = {
+     name: 'controlLight',
+     parametersJsonSchema: {
+       type: 'object',
+       properties: {
+         brightness: {
+           type: 'number',
+         },
+         colorTemperature: {
+           type: 'string',
+         },
+       },
+       required: ['brightness', 'colorTemperature'],
+     },
+   };
+
+   const ai = new GoogleGenAI({apiKey: GEMINI_API_KEY});
+   const response = await ai.models.generateContent({
+     model: 'gemini-2.5-flash',
+     contents: 'Dim the lights so the room feels cozy and warm.',
+     config: {
+       toolConfig: {
+         functionCallingConfig: {
+           // Force it to call any function
+           mode: FunctionCallingConfigMode.ANY,
+           allowedFunctionNames: ['controlLight'],
+         }
+       },
+       tools: [{functionDeclarations: [controlLightDeclaration]}]
+     }
+   });
+
+   console.log(response.functionCalls);
+ }
+
+ main();
+ ```
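+
+ Steps 3 and 4 are not shown above. Continuing inside `main()`, a minimal
+ sketch of executing the returned call and sending a `FunctionResponse` back
+ (the `controlLight` implementation is a hypothetical stand-in):
+
+ ```typescript
+ // Hypothetical implementation of the declared function.
+ function controlLight(args: {brightness: number; colorTemperature: string}) {
+   return {status: `brightness ${args.brightness}, ${args.colorTemperature}`};
+ }
+
+ const call = response.functionCalls?.[0];
+ if (call) {
+   // Step 3: call your actual function with the returned arguments.
+   const result = controlLight(
+       call.args as {brightness: number; colorTemperature: string});
+
+   // Step 4: send the result back as a FunctionResponse part along with the
+   // history, so the model can produce a final user-facing answer.
+   const followUp = await ai.models.generateContent({
+     model: 'gemini-2.5-flash',
+     contents: [
+       {role: 'user', parts: [{text: 'Dim the lights so the room feels cozy and warm.'}]},
+       {role: 'model', parts: [{functionCall: call}]},
+       {role: 'user', parts: [{functionResponse: {name: call.name, response: result}}]},
+     ],
+     config: {tools: [{functionDeclarations: [controlLightDeclaration]}]},
+   });
+   console.log(followUp.text);
+ }
+ ```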
+
+ #### Model Context Protocol (MCP) support (experimental)
+
+ Built-in [MCP](https://modelcontextprotocol.io/introduction) support is an
+ experimental feature. You can pass a local MCP server as a tool directly.
+
+ ```javascript
+ import { GoogleGenAI, FunctionCallingConfigMode, mcpToTool } from '@google/genai';
+ import { Client } from "@modelcontextprotocol/sdk/client/index.js";
+ import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";
+
+ // Create server parameters for stdio connection
+ const serverParams = new StdioClientTransport({
+   command: "npx", // Executable
+   args: ["-y", "@philschmid/weather-mcp"] // MCP Server
+ });
+
+ const client = new Client(
+   {
+     name: "example-client",
+     version: "1.0.0"
+   }
+ );
+
+ // Configure the client
+ const ai = new GoogleGenAI({});
+
+ // Initialize the connection between client and server
+ await client.connect(serverParams);
+
+ // Send request to the model with MCP tools
+ const response = await ai.models.generateContent({
+   model: "gemini-2.5-flash",
+   contents: `What is the weather in London in ${new Date().toLocaleDateString()}?`,
+   config: {
+     tools: [mcpToTool(client)], // uses the session, will automatically call the tool using automatic function calling
+   },
+ });
+ console.log(response.text);
+
+ // Close the connection
+ await client.close();
+ ```
+
+ ### Generate Content
+
+ #### How to structure the `contents` argument for `generateContent`
+
+ The SDK allows you to specify the following types in the `contents` parameter:
+
+ #### Content
+
+ - `Content`: The SDK will wrap the singular `Content` instance in an array which
+   contains only the given content instance
+ - `Content[]`: No transformation happens
+
+ #### Part
+
+ Parts will be aggregated into a single `Content` with role 'user'.
+
+ - `Part | string`: The SDK will wrap the `string` or `Part` in a `Content`
+   instance with role 'user'.
+ - `Part[] | string[]`: The SDK will wrap the full provided list into a single
+   `Content` with role 'user'.
+
+ **_NOTE:_** This doesn't apply to `FunctionCall` and `FunctionResponse` parts.
+ If you are specifying those, you need to explicitly provide the full
+ `Content[]` structure, making it explicit which parts are 'spoken' by the model
+ or the user; the SDK will throw an exception if you pass them as bare parts.
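+
+ For instance, the following calls are all normalized to the same `Content[]`
+ structure (a sketch reusing the `ai` client from the Quickstart):
+
+ ```typescript
+ // A bare string...
+ await ai.models.generateContent({
+   model: 'gemini-2.5-flash',
+   contents: 'Why is the sky blue?',
+ });
+
+ // ...a single Part...
+ await ai.models.generateContent({
+   model: 'gemini-2.5-flash',
+   contents: {text: 'Why is the sky blue?'},
+ });
+
+ // ...and an explicit Content[] all produce the same request.
+ await ai.models.generateContent({
+   model: 'gemini-2.5-flash',
+   contents: [{role: 'user', parts: [{text: 'Why is the sky blue?'}]}],
+ });
+ ```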
+
+ ## Error Handling
+
+ To handle errors raised by the API, the SDK provides the [ApiError](https://github.com/googleapis/js-genai/blob/main/src/errors.ts) class.
+
+ ```typescript
+ import {GoogleGenAI} from '@google/genai';
+ const GEMINI_API_KEY = process.env.GEMINI_API_KEY;
+
+ const ai = new GoogleGenAI({apiKey: GEMINI_API_KEY});
+
+ async function main() {
+   await ai.models.generateContent({
+     model: 'non-existent-model',
+     contents: 'Write a 100-word poem.',
+   }).catch((e) => {
+     console.error('error name: ', e.name);
+     console.error('error message: ', e.message);
+     console.error('error status: ', e.status);
+   });
+ }
+
+ main();
+ ```
+
+ ## Interactions (Preview)
+
+ > **Warning:** The Interactions API is in **Beta**. This is a preview of an
+ > experimental feature. Features and schemas are subject to **breaking changes**.
+
+ The Interactions API is a unified interface for interacting with Gemini models
+ and agents. It simplifies state management, tool orchestration, and long-running
+ tasks.
+
+ See the [documentation site](https://ai.google.dev/gemini-api/docs/interactions)
+ for more details.
+
+ ### Basic Interaction
+
+ ```typescript
+ const interaction = await ai.interactions.create({
+   model: 'gemini-2.5-flash',
+   input: 'Hello, how are you?',
+ });
+ console.debug(interaction);
+ ```
+
+ ### Stateful Conversation
+
+ The Interactions API supports server-side state management. You can continue a
+ conversation by referencing the `previous_interaction_id`.
+
+ ```typescript
+ // 1. First turn
+ const interaction1 = await ai.interactions.create({
+   model: 'gemini-2.5-flash',
+   input: 'Hi, my name is Amir.',
+ });
+ console.debug(interaction1);
+
+ // 2. Second turn (passing previous_interaction_id)
+ const interaction2 = await ai.interactions.create({
+   model: 'gemini-2.5-flash',
+   input: 'What is my name?',
+   previous_interaction_id: interaction1.id,
+ });
+ console.debug(interaction2);
+ ```
+
+ ### Agents (Deep Research)
+
+ You can use specialized agents like `deep-research-pro-preview-12-2025` for
+ complex tasks.
+
+ ```typescript
+ function sleep(ms: number): Promise<void> {
+   return new Promise(resolve => setTimeout(resolve, ms));
+ }
+
+ // 1. Start the Deep Research Agent
+ const initialInteraction = await ai.interactions.create({
+   input:
+     'Research the history of the Google TPUs with a focus on 2025 and 2026.',
+   agent: 'deep-research-pro-preview-12-2025',
+   background: true,
+ });
+
+ console.log(`Research started. Interaction ID: ${initialInteraction.id}`);
+
+ // 2. Poll for results
+ while (true) {
+   const interaction = await ai.interactions.get(initialInteraction.id);
+   console.log(`Status: ${interaction.status}`);
+
+   if (interaction.status === 'completed') {
+     console.debug('\nFinal Report:\n', interaction.outputs);
+     break;
+   } else if (['failed', 'cancelled'].includes(interaction.status)) {
+     console.log(`Failed with status: ${interaction.status}`);
+     break;
+   }
+
+   await sleep(10000); // Sleep for 10 seconds
+ }
+ ```
+
+ ### Multimodal Input
+
+ You can provide multimodal data (text, images, audio, etc.) in the input list.
+
+ ```typescript
+ // Assuming you have a base64-encoded image string
+ // const base64Image = ...;
+
+ const interaction = await ai.interactions.create({
+   model: 'gemini-2.5-flash',
+   input: [
+     { type: 'text', text: 'Describe the image.' },
+     { type: 'image', data: base64Image, mime_type: 'image/png' },
+   ],
+ });
+
+ console.debug(interaction);
+ ```
+
+ ### Function Calling
+
+ You can define custom functions for the model to use. The Interactions API
+ handles the tool selection, and you provide the execution result back to the
+ model.
+
+ ```typescript
+ // 1. Define the tool
+ const getWeather = (location: string) => {
+   /* Gets the weather for a given location. */
+   return `The weather in ${location} is sunny.`;
+ };
+
+ // 2. Send the request with tools
+ let interaction = await ai.interactions.create({
+   model: 'gemini-2.5-flash',
+   input: 'What is the weather in Mountain View, CA?',
+   tools: [
+     {
+       type: 'function',
+       name: 'get_weather',
+       description: 'Gets the weather for a given location.',
+       parameters: {
+         type: 'object',
+         properties: {
+           location: {
+             type: 'string',
+             description: 'The city and state, e.g. San Francisco, CA',
+           },
+         },
+         required: ['location'],
+       },
+     },
+   ],
+ });
+
+ // 3. Handle the tool call
+ for (const output of interaction.outputs!) {
+   if (output.type === 'function_call') {
+     console.log(
+         `Tool Call: ${output.name}(${JSON.stringify(output.arguments)})`);
+
+     // Execute your actual function here
+     // Note: ensure arguments match your function signature
+     const result = getWeather(output.arguments.location as string);
+
+     // Send result back to the model
+     interaction = await ai.interactions.create({
+       model: 'gemini-2.5-flash',
+       previous_interaction_id: interaction.id,
+       input: [
+         {
+           type: 'function_result',
+           name: output.name,
+           call_id: output.id,
+           result: result,
+         },
+       ],
+     });
+
+     console.debug(`Response: ${JSON.stringify(interaction)}`);
+   }
+ }
+ ```
+
+ ### Built-in Tools
+
+ You can also use Google's built-in tools, such as **Google Search** or **Code
+ Execution**.
+
+ #### Grounding with Google Search
+
+ ```typescript
+ const interaction = await ai.interactions.create({
+   model: 'gemini-2.5-flash',
+   input: 'Who won the last Super Bowl?',
+   tools: [{ type: 'google_search' }],
+ });
+
+ console.debug(interaction);
+ ```
+
+ #### Code Execution
+
+ ```typescript
+ const interaction = await ai.interactions.create({
+   model: 'gemini-2.5-flash',
+   input: 'Calculate the 50th Fibonacci number.',
+   tools: [{ type: 'code_execution' }],
+ });
+
+ console.debug(interaction);
+ ```
+
+ ### Multimodal Output
+
+ The Interactions API can generate multimodal outputs, such as images. You must
+ specify the `response_modalities`.
+
+ ```typescript
+ import * as fs from 'fs';
+
+ const interaction = await ai.interactions.create({
+   model: 'gemini-3-pro-image-preview',
+   input: 'Generate an image of a futuristic city.',
+   response_modalities: ['image'],
+ });
+
+ for (const output of interaction.outputs!) {
+   if (output.type === 'image') {
+     console.log(`Generated image with mime_type: ${output.mime_type}`);
+     // Save the image
+     fs.writeFileSync(
+         'generated_city.png', Buffer.from(output.data!, 'base64'));
+   }
+ }
+ ```
+
+ ## How is this different from the other Google AI SDKs?
+
+ This SDK (`@google/genai`) is Google DeepMind’s "vanilla" SDK for its generative
+ AI offerings, and is where Google DeepMind adds new AI features.
+
+ Models hosted either on the [Vertex AI platform](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/overview) or the [Gemini Developer platform](https://ai.google.dev/gemini-api/docs) are accessible through this SDK.
+
+ Other SDKs may offer additional AI frameworks on top of this SDK, or may
+ target specific project environments (like Firebase).
+
+ The `@google/generative_language` and `@google-cloud/vertexai` SDKs are previous
+ iterations of this SDK and are no longer receiving new Gemini 2.0+ features.