ak-gemini 2.0.0 → 2.0.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/GUIDE.md ADDED
@@ -0,0 +1,994 @@
1
+ # ak-gemini — Integration Guide
2
+
3
+ > A practical guide for rapidly adding AI capabilities to any Node.js codebase using `ak-gemini`.
4
+ > Covers every class, common patterns, best practices, and observability hooks.
5
+
6
+ ```sh
7
+ npm install ak-gemini
8
+ ```
9
+
10
+ **Requirements**: Node.js 18+, a `GEMINI_API_KEY` env var (or Vertex AI credentials).
11
+
12
+ ---
13
+
14
+ ## Table of Contents
15
+
16
+ 1. [Core Concepts](#core-concepts)
17
+ 2. [Authentication](#authentication)
18
+ 3. [Class Selection Guide](#class-selection-guide)
19
+ 4. [Message — Stateless AI Calls](#message--stateless-ai-calls)
20
+ 5. [Chat — Multi-Turn Conversations](#chat--multi-turn-conversations)
21
+ 6. [Transformer — Structured JSON Transformation](#transformer--structured-json-transformation)
22
+ 7. [ToolAgent — Agent with Custom Tools](#toolagent--agent-with-custom-tools)
23
+ 8. [CodeAgent — Agent That Writes and Runs Code](#codeagent--agent-that-writes-and-runs-code)
24
+ 9. [RagAgent — Document & Data Q&A](#ragagent--document--data-qa)
25
+ 10. [Observability & Usage Tracking](#observability--usage-tracking)
26
+ 11. [Thinking Configuration](#thinking-configuration)
27
+ 12. [Error Handling & Retries](#error-handling--retries)
28
+ 13. [Performance Tips](#performance-tips)
29
+ 14. [Common Integration Patterns](#common-integration-patterns)
30
+ 15. [Quick Reference](#quick-reference)
31
+
32
+ ---
33
+
34
+ ## Core Concepts
35
+
36
+ Every class in ak-gemini extends `BaseGemini`, which handles:
37
+
38
+ - **Authentication** — Gemini API key or Vertex AI service account
39
+ - **Chat sessions** — Managed conversation state with the model
40
+ - **Token tracking** — Input/output token counts after every call
41
+ - **Cost estimation** — Dollar estimates before sending
42
+ - **Few-shot seeding** — Inject example pairs to guide the model
43
+ - **Thinking config** — Control the model's internal reasoning budget
44
+ - **Safety settings** — Harassment and dangerous content filters (relaxed by default)
45
+
46
+ ```javascript
47
+ import { Transformer, Chat, Message, ToolAgent, CodeAgent, RagAgent } from 'ak-gemini';
48
+ // or
49
+ import AI from 'ak-gemini';
50
+ const t = new AI.Transformer({ ... });
51
+ ```
52
+
53
+ The default model is `gemini-2.5-flash`. Override with `modelName`:
54
+
55
+ ```javascript
56
+ new Chat({ modelName: 'gemini-2.5-pro' });
57
+ ```
58
+
59
+ ---
60
+
61
+ ## Authentication
62
+
63
+ ### Gemini API (default)
64
+
65
+ ```javascript
66
+ // Option 1: Environment variable (recommended)
67
+ // Set GEMINI_API_KEY in your .env or shell
68
+ new Chat();
69
+
70
+ // Option 2: Explicit key
71
+ new Chat({ apiKey: 'your-key' });
72
+ ```
73
+
74
+ ### Vertex AI
75
+
76
+ ```javascript
77
+ new Chat({
78
+ vertexai: true,
79
+ project: 'my-gcp-project', // or GOOGLE_CLOUD_PROJECT env var
80
+ location: 'us-central1', // or GOOGLE_CLOUD_LOCATION env var
81
+ labels: { app: 'myapp', env: 'prod' } // billing labels (Vertex AI only)
82
+ });
83
+ ```
84
+
85
+ Vertex AI uses Application Default Credentials. Run `gcloud auth application-default login` locally, or use a service account in production.
86
+
87
+ ---
88
+
89
+ ## Class Selection Guide
90
+
91
+ | I want to... | Use | Method |
92
+ |---|---|---|
93
+ | Get a one-off AI response (no history) | `Message` | `send()` |
94
+ | Have a back-and-forth conversation | `Chat` | `send()` |
95
+ | Transform JSON with examples + validation | `Transformer` | `send()` |
96
+ | Give the AI tools to call (APIs, DB, etc.) | `ToolAgent` | `chat()` / `stream()` |
97
+ | Let the AI write and run JavaScript | `CodeAgent` | `chat()` / `stream()` |
98
+ | Q&A over documents, files, or data | `RagAgent` | `chat()` / `stream()` |
99
+
100
+ **Rule of thumb**: Start with `Message` for the simplest integration. Move to `Chat` if you need history. Use `Transformer` when you need structured JSON output with validation. Use agents when the AI needs to take action.
101
+
102
+ ---
103
+
104
+ ## Message — Stateless AI Calls
105
+
106
+ The simplest class. Each `send()` call is independent — no conversation history is maintained. Ideal for classification, extraction, summarization, and any fire-and-forget AI call.
107
+
108
+ ```javascript
109
+ import { Message } from 'ak-gemini';
110
+
111
+ const msg = new Message({
112
+ systemPrompt: 'You are a sentiment classifier. Respond with: positive, negative, or neutral.'
113
+ });
114
+
115
+ const result = await msg.send('I love this product!');
116
+ console.log(result.text); // "positive"
117
+ console.log(result.usage); // { promptTokens, responseTokens, totalTokens, ... }
118
+ ```
119
+
120
+ ### Structured Output (JSON)
121
+
122
+ Force the model to return valid JSON matching a schema:
123
+
124
+ ```javascript
125
+ const extractor = new Message({
126
+ systemPrompt: 'Extract structured data from the input text.',
127
+ responseMimeType: 'application/json',
128
+ responseSchema: {
129
+ type: 'object',
130
+ properties: {
131
+ people: { type: 'array', items: { type: 'string' } },
132
+ places: { type: 'array', items: { type: 'string' } },
133
+ sentiment: { type: 'string', enum: ['positive', 'negative', 'neutral'] }
134
+ },
135
+ required: ['people', 'places', 'sentiment']
136
+ }
137
+ });
138
+
139
+ const result = await extractor.send('Alice and Bob visited Paris. They had a wonderful time.');
140
+ console.log(result.data);
141
+ // { people: ['Alice', 'Bob'], places: ['Paris'], sentiment: 'positive' }
142
+ ```
143
+
144
+ Key difference from `Chat`: `result.data` contains the parsed JSON object. `result.text` contains the raw string.
145
+
146
+ ### When to Use Message
147
+
148
+ - Classification, tagging, or labeling
149
+ - Entity extraction
150
+ - Summarization
151
+ - Any call where previous context doesn't matter
152
+ - High-throughput pipelines where you process items independently
153
+
154
+ ---
155
+
156
+ ## Chat — Multi-Turn Conversations
157
+
158
+ Maintains conversation history across calls. The model remembers what was said earlier.
159
+
160
+ ```javascript
161
+ import { Chat } from 'ak-gemini';
162
+
163
+ const chat = new Chat({
164
+ systemPrompt: 'You are a helpful coding assistant.'
165
+ });
166
+
167
+ const r1 = await chat.send('What is a closure in JavaScript?');
168
+ console.log(r1.text);
169
+
170
+ const r2 = await chat.send('Can you give me an example?');
171
+ // The model remembers the closure topic from r1
172
+ console.log(r2.text);
173
+ ```
174
+
175
+ ### History Management
176
+
177
+ ```javascript
178
+ // Get conversation history
179
+ const history = chat.getHistory();
180
+
181
+ // Clear and start fresh (preserves system prompt)
182
+ await chat.clearHistory();
183
+ ```
184
+
185
+ ### When to Use Chat
186
+
187
+ - Interactive assistants and chatbots
188
+ - Multi-step reasoning where later questions depend on earlier answers
189
+ - Tutoring or coaching interactions
190
+ - Any scenario where context carries across messages
191
+
192
+ ---
193
+
194
+ ## Transformer — Structured JSON Transformation
195
+
196
+ The power tool for data pipelines. Show it examples of input → output mappings, then send new inputs. Includes validation, retry, and AI-powered error correction.
197
+
198
+ ```javascript
199
+ import { Transformer } from 'ak-gemini';
200
+
201
+ const t = new Transformer({
202
+ systemPrompt: 'Transform user profiles into marketing segments.',
203
+ sourceKey: 'INPUT', // key for input data in examples
204
+ targetKey: 'OUTPUT', // key for output data in examples
205
+ maxRetries: 3, // retry on validation failure
206
+ retryDelay: 1000, // ms between retries
207
+ });
208
+
209
+ // Seed with examples
210
+ await t.seed([
211
+ {
212
+ INPUT: { age: 25, spending: 'high', interests: ['tech', 'gaming'] },
213
+ OUTPUT: { segment: 'young-affluent-tech', confidence: 0.9, tags: ['early-adopter'] }
214
+ },
215
+ {
216
+ INPUT: { age: 55, spending: 'medium', interests: ['gardening', 'cooking'] },
217
+ OUTPUT: { segment: 'mature-lifestyle', confidence: 0.85, tags: ['home-focused'] }
218
+ }
219
+ ]);
220
+
221
+ // Transform new data
222
+ const result = await t.send({ age: 30, spending: 'low', interests: ['books', 'hiking'] });
223
+ // result → { segment: '...', confidence: ..., tags: [...] }
224
+ ```
225
+
226
+ ### Validation
227
+
228
+ Pass an async validator as the third argument to `send()`. If it throws, the Transformer retries with the error message fed back to the model:
229
+
230
+ ```javascript
231
+ const result = await t.send(
232
+ { age: 30, spending: 'low' },
233
+ {}, // options
234
+ async (output) => {
235
+ if (!output.segment) throw new Error('Missing segment field');
236
+ if (output.confidence < 0 || output.confidence > 1) {
237
+ throw new Error('Confidence must be between 0 and 1');
238
+ }
239
+ return output; // return the validated (or modified) output
240
+ }
241
+ );
242
+ ```
243
+
244
+ Or set a global validator in the constructor:
245
+
246
+ ```javascript
247
+ const t = new Transformer({
248
+ asyncValidator: async (output) => {
249
+ if (!output.id) throw new Error('Missing id');
250
+ return output;
251
+ }
252
+ });
253
+ ```
254
+
255
+ ### Self-Healing with `rebuild()`
256
+
257
+ When downstream code fails, feed the error back to the AI:
258
+
259
+ ```javascript
260
+ try {
261
+ await processPayload(result);
262
+ } catch (err) {
263
+ const fixed = await t.rebuild(result, err.message);
264
+ await processPayload(fixed); // try again with AI-corrected payload
265
+ }
266
+ ```
267
+
268
+ ### Loading Examples from a File
269
+
270
+ ```javascript
271
+ const t = new Transformer({
272
+ examplesFile: './training-data.json'
273
+ // JSON array of { INPUT: ..., OUTPUT: ... } objects
274
+ });
275
+ await t.seed(); // loads from file automatically
276
+ ```
277
+
278
+ ### Stateless Sends
279
+
280
+ Send without affecting the conversation history (useful for parallel processing):
281
+
282
+ ```javascript
283
+ const result = await t.send(payload, { stateless: true });
284
+ ```
285
+
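Because stateless sends never touch the shared history, they can safely run in parallel. Here is a sketch of the fan-out pattern (the `t` below is a stub that returns a canned segment so the example runs offline; in real code it would be the seeded Transformer from above):

```javascript
// Stub standing in for a seeded Transformer, so this sketch runs offline.
// A real instance would call the model; the stub just echoes a segment.
const t = {
  async send(payload, opts = {}) {
    return { segment: payload.spending === 'high' ? 'premium' : 'standard' };
  },
};

const payloads = [
  { age: 22, spending: 'high' },
  { age: 41, spending: 'low' },
  { age: 35, spending: 'low' },
];

// With { stateless: true } each call is independent, so completion order
// and concurrency cannot corrupt the shared conversation history.
const results = await Promise.all(
  payloads.map((p) => t.send(p, { stateless: true }))
);
```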
286
+ ### When to Use Transformer
287
+
288
+ - ETL pipelines — transform data between formats
289
+ - API response normalization
290
+ - Content enrichment (add tags, categories, scores)
291
+ - Any structured data transformation where you can provide examples
292
+ - Batch processing with validation guarantees
293
+
294
+ ---
295
+
296
+ ## ToolAgent — Agent with Custom Tools
297
+
298
+ Give the model tools (functions) it can call. You define what tools exist and how to execute them. The agent handles the conversation loop — sending messages, receiving tool calls, executing them, feeding results back, until the model produces a final text answer.
299
+
300
+ ```javascript
301
+ import { ToolAgent } from 'ak-gemini';
302
+
303
+ const agent = new ToolAgent({
304
+ systemPrompt: 'You are a database assistant.',
305
+ tools: [
306
+ {
307
+ name: 'query_db',
308
+ description: 'Execute a read-only SQL query against the users database',
309
+ parametersJsonSchema: {
310
+ type: 'object',
311
+ properties: {
312
+ sql: { type: 'string', description: 'The SQL query to execute' }
313
+ },
314
+ required: ['sql']
315
+ }
316
+ },
317
+ {
318
+ name: 'send_email',
319
+ description: 'Send an email notification',
320
+ parametersJsonSchema: {
321
+ type: 'object',
322
+ properties: {
323
+ to: { type: 'string' },
324
+ subject: { type: 'string' },
325
+ body: { type: 'string' }
326
+ },
327
+ required: ['to', 'subject', 'body']
328
+ }
329
+ }
330
+ ],
331
+ toolExecutor: async (toolName, args) => {
332
+ switch (toolName) {
333
+ case 'query_db':
334
+ return await db.query(args.sql);
335
+ case 'send_email':
336
+ await mailer.send(args);
337
+ return { sent: true };
338
+ }
339
+ },
340
+ maxToolRounds: 10 // safety limit on tool-use loop iterations
341
+ });
342
+
343
+ const result = await agent.chat('How many users signed up this week? Email the count to admin@co.com');
344
+ console.log(result.text); // "There were 47 new signups this week. I've sent the email."
345
+ console.log(result.toolCalls); // [{ name: 'query_db', args: {...}, result: [...] }, { name: 'send_email', ... }]
346
+ ```
347
+
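Conceptually, the loop the agent runs for you looks like the sketch below. The model and executor are stubbed so the example runs offline; the real ToolAgent manages this loop (plus history, usage tracking, and safety limits) internally:

```javascript
// Stand-in for Gemini: first requests a tool call, then produces a final
// text answer once it sees a tool result in the history.
function fakeModel(messages) {
  const hasToolResult = messages.some((m) => m.role === 'tool');
  return hasToolResult
    ? { text: '47 users signed up this week.' }
    : { toolCall: { name: 'query_db', args: { sql: 'SELECT COUNT(*) FROM users' } } };
}

// The conversation loop: send, execute any requested tool, feed the
// result back, repeat until the model returns plain text.
async function runToolLoop(model, executor, userMessage, maxToolRounds = 10) {
  const messages = [{ role: 'user', content: userMessage }];
  for (let round = 0; round < maxToolRounds; round++) {
    const reply = model(messages);
    if (!reply.toolCall) return reply.text;    // final text answer
    const { name, args } = reply.toolCall;
    const result = await executor(name, args); // your toolExecutor
    messages.push({ role: 'tool', name, content: result });
  }
  throw new Error('maxToolRounds exceeded');
}

const answer = await runToolLoop(
  fakeModel,
  async (name, args) => ({ count: 47 }),
  'How many users signed up this week?'
);
```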
348
+ ### Streaming
349
+
350
+ Stream the agent's output in real time — useful for showing progress in a UI:
351
+
352
+ ```javascript
353
+ for await (const event of agent.stream('Find the top 5 users by spend')) {
354
+ switch (event.type) {
355
+ case 'text': process.stdout.write(event.text); break;
356
+ case 'tool_call': console.log(`\nCalling ${event.toolName}...`); break;
357
+ case 'tool_result': console.log(`Result:`, event.result); break;
358
+ case 'done': console.log('\nUsage:', event.usage); break;
359
+ }
360
+ }
361
+ ```
362
+
363
+ ### Execution Gating
364
+
365
+ Control which tool calls are allowed at runtime:
366
+
367
+ ```javascript
368
+ const agent = new ToolAgent({
369
+ tools: [...],
370
+ toolExecutor: myExecutor,
371
+ onBeforeExecution: async (toolName, args) => {
372
+ if (toolName === 'delete_user') {
373
+ console.log('Blocked dangerous tool call');
374
+ return false; // deny execution
375
+ }
376
+ return true; // allow
377
+ },
378
+ onToolCall: (toolName, args) => {
379
+ // Notification callback — fires on every tool call (logging, metrics, etc.)
380
+ metrics.increment(`tool_call.${toolName}`);
381
+ }
382
+ });
383
+ ```
384
+
385
+ ### Stopping an Agent
386
+
387
+ Cancel mid-execution from a callback or externally:
388
+
389
+ ```javascript
390
+ // From a callback
391
+ onBeforeExecution: async (toolName, args) => {
392
+ if (shouldStop) {
393
+ agent.stop(); // stop after this round
394
+ return false;
395
+ }
396
+ return true;
397
+ }
398
+
399
+ // Externally (e.g., user cancel button, timeout)
400
+ setTimeout(() => agent.stop(), 60_000);
401
+ const result = await agent.chat('Do some work');
402
+ // result includes warning: "Agent was stopped"
403
+ ```
404
+
405
+ ### When to Use ToolAgent
406
+
407
+ - AI that needs to call APIs, query databases, or interact with external systems
408
+ - Workflow automation — the AI orchestrates a sequence of operations
409
+ - Research assistants that fetch and synthesize data from multiple sources
410
+ - Any scenario where you want the model to decide *which* tools to use and *when*
411
+
412
+ ---
413
+
414
+ ## CodeAgent — Agent That Writes and Runs Code
415
+
416
+ Instead of calling tools one by one, the model writes complete JavaScript scripts and executes them in a child process. This is powerful for tasks that require complex logic, file manipulation, or multi-step computation.
417
+
418
+ ```javascript
419
+ import { CodeAgent } from 'ak-gemini';
420
+
421
+ const agent = new CodeAgent({
422
+ workingDirectory: '/path/to/project',
423
+ importantFiles: ['package.json', 'src/config.js'], // injected into system prompt
424
+ timeout: 30_000, // per-execution timeout
425
+ maxRounds: 10, // max code execution cycles
426
+ keepArtifacts: true, // keep script files on disk after execution
427
+ });
428
+
429
+ const result = await agent.chat('Find all files larger than 1MB and list them sorted by size');
430
+ console.log(result.text); // Agent's summary
431
+ console.log(result.codeExecutions); // [{ code, output, stderr, exitCode, purpose }]
432
+ ```
433
+
434
+ ### How It Works
435
+
436
+ 1. On `init()`, the agent scans the working directory and gathers codebase context (file tree, package.json, key files, importantFiles)
437
+ 2. This context is injected into the system prompt so the model understands the project
438
+ 3. The model writes JavaScript using an internal `execute_code` tool
439
+ 4. Code is saved to a `.mjs` file and run in a Node.js child process that inherits `process.env`
440
+ 5. stdout/stderr feeds back to the model
441
+ 6. The model decides if more work is needed (up to `maxRounds` cycles)
442
+
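Steps 3-5 above can be sketched in a few lines: persist the model-written script, run it with Node in a child process, and capture its output. The helper below is our own simplified illustration (file naming and cleanup included), not the library's internal implementation:

```javascript
import { writeFileSync, unlinkSync } from 'node:fs';
import { execFileSync } from 'node:child_process';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Save generated code to a .mjs file, run it in a Node child process
// (which inherits process.env, like CodeAgent's does), and capture stdout.
function runGeneratedScript(code) {
  const file = join(tmpdir(), `agent-demo-${Date.now()}.mjs`);
  writeFileSync(file, code);
  try {
    const stdout = execFileSync(process.execPath, [file], { encoding: 'utf-8' });
    return { stdout, exitCode: 0 };
  } finally {
    unlinkSync(file); // CodeAgent keeps these files when keepArtifacts: true
  }
}

// A trivial "model-written" script; its stdout would be fed back to the model.
const out = runGeneratedScript(`console.log(2 + 2);`);
```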
443
+ ### Streaming
444
+
445
+ ```javascript
446
+ for await (const event of agent.stream('Refactor the auth module to use async/await')) {
447
+ switch (event.type) {
448
+ case 'text': process.stdout.write(event.text); break;
449
+ case 'code': console.log('\n--- Executing code ---'); break;
450
+ case 'output': console.log(event.stdout); break;
451
+ case 'done': console.log('\nDone!', event.usage); break;
452
+ }
453
+ }
454
+ ```
455
+
456
+ ### Execution Gating & Notifications
457
+
458
+ ```javascript
459
+ const agent = new CodeAgent({
460
+ workingDirectory: '.',
461
+ onBeforeExecution: async (code) => {
462
+ // Review code before it runs
463
+ if (code.includes('rm -rf')) return false; // deny
464
+ return true;
465
+ },
466
+ onCodeExecution: (code, output) => {
467
+ // Log every execution for audit
468
+ logger.info({ code: code.slice(0, 200), exitCode: output.exitCode });
469
+ }
470
+ });
471
+ ```
472
+
473
+ ### Retrieving Scripts
474
+
475
+ Get all scripts the agent wrote across all interactions:
476
+
477
+ ```javascript
478
+ const scripts = agent.dump();
479
+ // [{ fileName: 'agent-read-config.mjs', purpose: 'read-config', script: '...', filePath: '/path/...' }]
480
+ ```
481
+
482
+ ### When to Use CodeAgent
483
+
484
+ - File system operations — reading, writing, transforming files
485
+ - Data analysis — processing CSV, JSON, or log files
486
+ - Codebase exploration — finding patterns, counting occurrences, generating reports
487
+ - Prototyping — quickly testing ideas by having the AI write and run code
488
+ - Any task where the AI needs more flexibility than predefined tools provide
489
+
490
+ ---
491
+
492
+ ## RagAgent — Document & Data Q&A
493
+
494
+ Load documents and data into the model's context for grounded Q&A. Supports three input types that can be used together:
495
+
496
+ | Input Type | Option | What It Does |
497
+ |---|---|---|
498
+ | **Remote files** | `remoteFiles` | Uploaded via Google Files API — for PDFs, images, audio, video |
499
+ | **Local files** | `localFiles` | Read from disk as UTF-8 text — for md, json, csv, yaml, txt |
500
+ | **Local data** | `localData` | In-memory objects serialized as JSON |
501
+
502
+ ```javascript
503
+ import { RagAgent } from 'ak-gemini';
504
+
505
+ const agent = new RagAgent({
506
+ // Text files read directly from disk (fast, no upload)
507
+ localFiles: ['./docs/api-reference.md', './docs/architecture.md'],
508
+
509
+ // In-memory data
510
+ localData: [
511
+ { name: 'users', data: await db.query('SELECT * FROM users LIMIT 100') },
512
+ { name: 'config', data: JSON.parse(await fs.readFile('./config.json', 'utf-8')) },
513
+ ],
514
+
515
+ // Binary/media files uploaded via Files API
516
+ remoteFiles: ['./diagrams/architecture.png', './reports/q4.pdf'],
517
+ });
518
+
519
+ const result = await agent.chat('What authentication method does the API use?');
520
+ console.log(result.text); // Grounded answer citing the api-reference.md
521
+ ```
522
+
523
+ ### Dynamic Context
524
+
525
+ Add more context after initialization (each call reinitializes the session):
526
+
527
+ ```javascript
528
+ await agent.addLocalFiles(['./new-doc.md']);
529
+ await agent.addLocalData([{ name: 'metrics', data: { uptime: 99.9 } }]);
530
+ await agent.addRemoteFiles(['./new-chart.png']);
531
+ ```
532
+
533
+ ### Inspecting Context
534
+
535
+ ```javascript
536
+ const ctx = agent.getContext();
537
+ // {
538
+ // remoteFiles: [{ name, displayName, mimeType, sizeBytes, uri, originalPath }],
539
+ // localFiles: [{ name, path, size }],
540
+ // localData: [{ name, type }]
541
+ // }
542
+ ```
543
+
544
+ ### Streaming
545
+
546
+ ```javascript
547
+ for await (const event of agent.stream('Summarize the architecture document')) {
548
+ if (event.type === 'text') process.stdout.write(event.text);
549
+ if (event.type === 'done') console.log('\nUsage:', event.usage);
550
+ }
551
+ ```
552
+
553
+ ### When to Use RagAgent
554
+
555
+ - Documentation Q&A — let users ask questions about your docs
556
+ - Data exploration — load database results or CSV exports and ask questions
557
+ - Code review — load source files and ask about patterns, bugs, or architecture
558
+ - Report analysis — load PDF reports and extract insights
559
+ - Any scenario where the AI needs to answer questions grounded in specific data
560
+
561
+ ### Choosing Input Types
562
+
563
+ | Data | Use |
564
+ |---|---|
565
+ | Plain text files (md, txt, json, csv, yaml) | `localFiles` — fastest, no API upload |
566
+ | In-memory objects, DB results, API responses | `localData` — serialized as JSON |
567
+ | PDFs, images, audio, video | `remoteFiles` — uploaded via Files API |
568
+
569
+ Prefer `localFiles` and `localData` when possible — they skip the upload step and initialize faster.
570
+
571
+ ---
572
+
573
+ ## Observability & Usage Tracking
574
+
575
+ Every class provides consistent observability hooks.
576
+
577
+ ### Token Usage
578
+
579
+ After every API call, get detailed token counts:
580
+
581
+ ```javascript
582
+ const usage = instance.getLastUsage();
583
+ // {
584
+ // promptTokens: 1250, // input tokens (cumulative across retries)
585
+ // responseTokens: 340, // output tokens (cumulative across retries)
586
+ // totalTokens: 1590, // total (cumulative)
587
+ // attempts: 1, // 1 = first try, 2+ = retries needed
588
+ // modelVersion: 'gemini-2.5-flash-001', // actual model that responded
589
+ // requestedModel: 'gemini-2.5-flash', // model you requested
590
+ // timestamp: 1710000000000
591
+ // }
592
+ ```
593
+
594
+ ### Cost Estimation
595
+
596
+ Estimate cost *before* sending:
597
+
598
+ ```javascript
599
+ const estimate = await instance.estimateCost('What is the meaning of life?');
600
+ // {
601
+ // inputTokens: 8,
602
+ // model: 'gemini-2.5-flash',
603
+ // pricing: { input: 0.15, output: 0.60 }, // per million tokens
604
+ // estimatedInputCost: 0.0000012,
605
+ // note: 'Output cost depends on response length'
606
+ // }
607
+ ```
608
+
609
+ Or just get the token count:
610
+
611
+ ```javascript
612
+ const { inputTokens } = await instance.estimate('some payload');
613
+ ```
614
+
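The arithmetic behind `estimatedInputCost` is easy to reproduce. A quick sketch (the helper function is ours, not part of the library) that matches the numbers above:

```javascript
// cost = tokens / 1,000,000 * price per million tokens
function inputCost(inputTokens, pricePerMillion) {
  return (inputTokens / 1_000_000) * pricePerMillion;
}

// 8 input tokens at $0.15 per million tokens:
const cost = inputCost(8, 0.15);
// ≈ $0.0000012, matching estimatedInputCost above
```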
615
+ ### Logging
616
+
617
+ All classes use [pino](https://github.com/pinojs/pino) for structured logging. Control the level:
618
+
619
+ ```javascript
620
+ // Per-instance
621
+ new Chat({ logLevel: 'debug' });
622
+
623
+ // Via environment
624
+ // LOG_LEVEL=debug node app.js
625
+
626
+ // Via NODE_ENV (dev → debug, test → warn, prod → info)
627
+ ```
628
+
629
+ ### Agent Callbacks
630
+
631
+ ToolAgent and CodeAgent provide execution callbacks for building audit trails, metrics, and approval flows:
632
+
633
+ ```javascript
634
+ // ToolAgent
635
+ new ToolAgent({
636
+ onToolCall: (toolName, args) => {
637
+ // Fires on every tool call — use for logging, metrics
638
+ logger.info({ event: 'tool_call', tool: toolName, args });
639
+ },
640
+ onBeforeExecution: async (toolName, args) => {
641
+ // Fires before execution — return false to deny
642
+ // Use for approval flows, safety checks, rate limiting
643
+ return !blocklist.includes(toolName);
644
+ }
645
+ });
646
+
647
+ // CodeAgent
648
+ new CodeAgent({
649
+ onCodeExecution: (code, output) => {
650
+ // Fires after every code execution
651
+ logger.info({ event: 'code_exec', exitCode: output.exitCode, lines: code.split('\n').length });
652
+ },
653
+ onBeforeExecution: async (code) => {
654
+ // Review code before execution
655
+ if (code.includes('process.exit')) return false;
656
+ return true;
657
+ }
658
+ });
659
+ ```
660
+
661
+ ### Billing Labels (Vertex AI)
662
+
663
+ Tag API calls for cost attribution:
664
+
665
+ ```javascript
666
+ // Constructor-level (applies to all calls)
667
+ new Transformer({
668
+ vertexai: true,
669
+ project: 'my-project',
670
+ labels: { app: 'etl-pipeline', env: 'prod', team: 'data' }
671
+ });
672
+
673
+ // Per-message override
674
+ await transformer.send(payload, { labels: { job_id: 'abc123' } });
675
+ ```
676
+
677
+ ---
678
+
679
+ ## Thinking Configuration
680
+
681
+ Models like `gemini-2.5-flash` and `gemini-2.5-pro` support thinking — internal reasoning before answering. Control the budget:
682
+
683
+ ```javascript
684
+ // Disable thinking (default — fastest, cheapest)
685
+ new Chat({ thinkingConfig: { thinkingBudget: 0 } });
686
+
687
+ // Automatic thinking budget (model decides)
688
+ new Chat({ thinkingConfig: { thinkingBudget: -1 } });
689
+
690
+ // Fixed budget (in tokens)
691
+ new Chat({ thinkingConfig: { thinkingBudget: 2048 } });
692
+
693
+ // Use ThinkingLevel enum
694
+ import { ThinkingLevel } from 'ak-gemini';
695
+ new Chat({ thinkingConfig: { thinkingLevel: ThinkingLevel.LOW } });
696
+ ```
697
+
698
+ **When to enable thinking**: Complex reasoning, math, multi-step logic, code generation. **When to disable**: Simple classification, extraction, or chat where speed matters.
699
+
700
+ ---
701
+
702
+ ## Error Handling & Retries
703
+
704
+ ### Transformer Retries
705
+
706
+ The Transformer has built-in retry with exponential backoff when validation fails:
707
+
708
+ ```javascript
709
+ const t = new Transformer({
710
+ maxRetries: 3, // default: 3
711
+ retryDelay: 1000 // default: 1000ms, doubles each retry
712
+ });
713
+ ```
714
+
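Under those defaults the wait doubles with each attempt. A small sketch of the documented backoff schedule (this mirrors the described behavior, not the library's internal code):

```javascript
// Delay before retry N (1-based), doubling from the base retryDelay.
function backoffDelay(retryDelay, attempt) {
  return retryDelay * 2 ** (attempt - 1);
}

const schedule = [1, 2, 3].map((n) => backoffDelay(1000, n));
// → [1000, 2000, 4000] ms before retries 1, 2, and 3
```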
715
+ Each retry feeds the validation error back to the model, giving it a chance to self-correct. The `usage` object reports cumulative tokens across all attempts:
716
+
717
+ ```javascript
718
+ const result = await t.send(payload, {}, validator);
719
+ const usage = t.getLastUsage();
720
+ console.log(usage.attempts); // 2 = needed one retry
721
+ ```
722
+
723
+ ### Rate Limiting (429 Errors)
724
+
725
+ The Gemini API returns 429 when rate limited. ak-gemini does not auto-retry 429s — handle them in your application layer:
726
+
727
+ ```javascript
728
+ async function sendWithBackoff(instance, payload, maxRetries = 3) {
729
+ for (let i = 0; i < maxRetries; i++) {
730
+ try {
731
+ return await instance.send(payload);
732
+ } catch (err) {
733
+ if (err.status === 429 && i < maxRetries - 1) {
734
+ await new Promise(r => setTimeout(r, 2 ** i * 1000));
735
+ continue;
736
+ }
737
+ throw err;
738
+ }
739
+ }
740
+ }
741
+ ```
742
+
743
+ ### CodeAgent Failure Limits
744
+
745
+ CodeAgent tracks consecutive failed executions. After `maxRetries` (default: 3) consecutive failures, the model summarizes what went wrong and asks for guidance:
746
+
747
+ ```javascript
748
+ new CodeAgent({
749
+ maxRetries: 5, // allow more failures before stopping
750
+ });
751
+ ```
752
+
753
+ ---
754
+
755
+ ## Performance Tips
756
+
757
+ ### Reuse Instances
758
+
759
+ Each instance maintains a chat session. Creating a new instance for every request re-sends the system prompt each time, wasting tokens. Reuse instances when possible:
760
+
761
+ ```javascript
762
+ // Bad — creates a new session every call
763
+ app.post('/classify', async (req, res) => {
764
+ const msg = new Message({ systemPrompt: '...' }); // new instance every request!
765
+ const result = await msg.send(req.body.text);
766
+ res.json(result);
767
+ });
768
+
769
+ // Good — reuse the instance
770
+ const classifier = new Message({ systemPrompt: '...' });
771
+ app.post('/classify', async (req, res) => {
772
+ const result = await classifier.send(req.body.text);
773
+ res.json(result);
774
+ });
775
+ ```
776
+
777
+ ### Choose the Right Model
778
+
779
+ | Model | Speed | Cost | Best For |
780
+ |---|---|---|---|
781
+ | `gemini-2.0-flash-lite` | Fastest | Cheapest | Classification, extraction, simple tasks |
782
+ | `gemini-2.0-flash` | Fast | Low | General purpose, good quality |
783
+ | `gemini-2.5-flash` | Medium | Low | Best balance of speed and quality |
784
+ | `gemini-2.5-pro` | Slow | High | Complex reasoning, code, analysis |
785
+
786
+ ### Use `Message` for Stateless Workloads
787
+
788
+ `Message` uses `generateContent()` under the hood — no chat session overhead. For pipelines processing thousands of items independently, `Message` is the right choice.
789
+
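For large batches you will usually want to cap how many requests are in flight at once rather than fire everything simultaneously. A minimal concurrency pool (our own sketch; the classifier is stubbed so the example runs offline, but in real code `fn` would call a shared `Message` instance):

```javascript
// Process items with at most `limit` calls in flight at a time,
// preserving input order in the results array.
async function mapWithConcurrency(items, limit, fn) {
  const results = new Array(items.length);
  let next = 0;
  async function worker() {
    while (next < items.length) {
      const i = next++; // synchronous claim, safe in single-threaded JS
      results[i] = await fn(items[i]);
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, worker)
  );
  return results;
}

// Stub standing in for msg.send(); each call is independent, so
// completion order doesn't matter.
const classify = async (text) => ({
  text: text.includes('love') ? 'positive' : 'neutral',
});

const labels = await mapWithConcurrency(
  ['I love it', 'it is fine', 'love this'],
  2,
  classify
);
```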
790
+ ### Use `localFiles` / `localData` over `remoteFiles`
791
+
792
+ For text-based content, `localFiles` and `localData` skip the Files API upload entirely. They're faster to initialize and don't require network calls for the file upload step.
793
+
794
+ ### Disable Thinking for Simple Tasks
795
+
796
+ Thinking tokens cost money and add latency. For classification, extraction, or simple formatting tasks, keep `thinkingBudget: 0` (the default).
797
+
798
+ ---
799
+
800
+ ## Common Integration Patterns
801
+
802
+ ### Pattern: API Endpoint Classifier
803
+
804
+ ```javascript
805
+ import { Message } from 'ak-gemini';
806
+
807
+ const classifier = new Message({
808
+ modelName: 'gemini-2.0-flash-lite', // fast + cheap
809
+ systemPrompt: 'Classify support tickets. Respond with exactly one of: billing, technical, account, other.',
810
+ });
811
+
812
+ app.post('/api/classify-ticket', async (req, res) => {
813
+ const result = await classifier.send(req.body.text);
814
+ res.json({ category: result.text.trim().toLowerCase() });
815
+ });
816
+ ```
817
+
818
+ ### Pattern: ETL Pipeline with Validation
819
+
820
+ ```javascript
821
+ import { Transformer } from 'ak-gemini';
822
+
823
+ const normalizer = new Transformer({
824
+ sourceKey: 'RAW',
825
+ targetKey: 'NORMALIZED',
826
+ maxRetries: 3,
827
+ asyncValidator: async (output) => {
828
+ if (!output.email?.includes('@')) throw new Error('Invalid email');
829
+ if (!output.name?.trim()) throw new Error('Name is required');
830
+ return output;
831
+ }
832
+ });
833
+
834
+ await normalizer.seed([
835
+ { RAW: { nm: 'alice', mail: 'alice@co.com' }, NORMALIZED: { name: 'Alice', email: 'alice@co.com' } },
836
+ ]);
837
+
838
+ for (const record of rawRecords) {
839
+ const clean = await normalizer.send(record, { stateless: true });
840
+ await db.insert('users', clean);
841
+ }
842
+ ```
843
+
844
+ ### Pattern: Conversational Assistant with Tools
845
+
846
+ ```javascript
847
+ import { ToolAgent } from 'ak-gemini';
848
+
849
+ const assistant = new ToolAgent({
850
+ systemPrompt: `You are a customer support agent for Acme Corp.
851
+ You can look up orders and issue refunds. Always confirm before issuing refunds.`,
852
+ tools: [
853
+ {
854
+ name: 'lookup_order',
855
+ description: 'Look up an order by ID or customer email',
856
+ parametersJsonSchema: {
857
+ type: 'object',
858
+ properties: {
859
+ order_id: { type: 'string' },
860
+ email: { type: 'string' }
861
+ }
862
+ }
863
+ },
864
+ {
865
+ name: 'issue_refund',
866
+ description: 'Issue a refund for an order',
867
+ parametersJsonSchema: {
868
+ type: 'object',
869
+ properties: {
870
+ order_id: { type: 'string' },
871
+ amount: { type: 'number' },
872
+ reason: { type: 'string' }
873
+ },
874
+ required: ['order_id', 'amount', 'reason']
875
+ }
876
+ }
877
+ ],
878
+ toolExecutor: async (toolName, args) => {
879
+ if (toolName === 'lookup_order') return await orderService.lookup(args);
880
+ if (toolName === 'issue_refund') return await orderService.refund(args);
881
+ },
882
+ onBeforeExecution: async (toolName, args) => {
883
+ // Only allow refunds under $100 without human approval
884
+ if (toolName === 'issue_refund' && args.amount > 100) {
885
+ return false;
886
+ }
887
+ return true;
888
+ }
889
+ });
890
+
891
+ // In a chat endpoint
892
+ const result = await assistant.chat(userMessage);
893
+ ```
894
+
895
+ ### Pattern: Document Q&A Service
896
+
897
+ ```javascript
898
+ import { RagAgent } from 'ak-gemini';
899
+
900
+ const docs = new RagAgent({
901
+ localFiles: [
902
+ './docs/getting-started.md',
903
+ './docs/api-reference.md',
904
+ './docs/faq.md',
905
+ ],
906
+ systemPrompt: 'You are a documentation assistant. Answer questions based on the docs. If the answer is not in the docs, say so.',
907
+ });
908
+
909
+ app.post('/api/ask', async (req, res) => {
910
+ const result = await docs.chat(req.body.question);
911
+ res.json({ answer: result.text, usage: result.usage });
912
+ });
913
+ ```
914
+
915
+ ### Pattern: Data-Grounded Analysis
916
+
917
+ ```javascript
918
+ import { RagAgent } from 'ak-gemini';
919
+
920
+ const analyst = new RagAgent({
921
+ modelName: 'gemini-2.5-pro', // use a smarter model for analysis
922
+ localData: [
923
+ { name: 'sales_q4', data: await db.query('SELECT * FROM sales WHERE quarter = 4') },
924
+ { name: 'targets', data: await db.query('SELECT * FROM quarterly_targets') },
925
+ ],
926
+ systemPrompt: 'You are a business analyst. Analyze the provided data and answer questions with specific numbers.',
927
+ });
928
+
929
+ const result = await analyst.chat('Which regions missed their Q4 targets? By how much?');
930
+ ```
931
+
932
+ ### Pattern: Few-Shot Any Class
933
+
934
+ Every class supports `seed()` for few-shot learning — not just Transformer:
935
+
936
+ ```javascript
937
+ import { Chat } from 'ak-gemini';
938
+
939
+ const chat = new Chat({ systemPrompt: 'You are a SQL expert.' });
940
+ await chat.seed([
941
+ { PROMPT: 'Get all users', ANSWER: 'SELECT * FROM users;' },
942
+ { PROMPT: 'Count orders by status', ANSWER: 'SELECT status, COUNT(*) FROM orders GROUP BY status;' },
943
+ ]);
944
+
945
+ const result = await chat.send('Find users who signed up in the last 7 days');
946
+ // Model follows the SQL-only response pattern from the examples
947
+ ```
948
+
949
+ ---
950
+
951
+ ## Quick Reference
952
+
953
+ ### Imports
954
+
955
+ ```javascript
956
+ // Named exports
957
+ import { Transformer, Chat, Message, ToolAgent, CodeAgent, RagAgent, BaseGemini, log } from 'ak-gemini';
958
+ import { extractJSON, attemptJSONRecovery } from 'ak-gemini';
959
+ import { ThinkingLevel, HarmCategory, HarmBlockThreshold } from 'ak-gemini';
960
+
961
+ // Default export (namespace)
962
+ import AI from 'ak-gemini';
963
+
964
+ // CommonJS
965
+ const { Transformer, Chat } = require('ak-gemini');
966
+ ```
967
+
968
+ ### Constructor Options (All Classes)
969
+
970
+ | Option | Type | Default |
971
+ |---|---|---|
972
+ | `modelName` | string | `'gemini-2.5-flash'` |
973
+ | `systemPrompt` | string \| null \| false | varies by class |
974
+ | `apiKey` | string | `GEMINI_API_KEY` env var |
975
+ | `vertexai` | boolean | `false` |
976
+ | `project` | string | `GOOGLE_CLOUD_PROJECT` env var |
977
+ | `location` | string | `'global'` |
978
+ | `chatConfig` | object | `{ temperature: 0.7, topP: 0.95, topK: 64 }` |
979
+ | `thinkingConfig` | object | `{ thinkingBudget: 0 }` |
980
+ | `maxOutputTokens` | number \| null | `50000` |
981
+ | `logLevel` | string | based on `NODE_ENV` |
982
+ | `labels` | object | `{}` (Vertex AI only) |
983
+
984
+ ### Methods Available on All Classes
985
+
986
+ | Method | Returns | Description |
987
+ |---|---|---|
988
+ | `init(force?)` | `Promise<void>` | Initialize chat session |
989
+ | `seed(examples, opts?)` | `Promise<Array>` | Add few-shot examples |
990
+ | `getHistory()` | `Array` | Get conversation history |
991
+ | `clearHistory()` | `Promise<void>` | Clear conversation history |
992
+ | `getLastUsage()` | `UsageData \| null` | Token usage from last call |
993
+ | `estimate(payload)` | `Promise<{ inputTokens }>` | Estimate input tokens |
994
+ | `estimateCost(payload)` | `Promise<object>` | Estimate cost in dollars |