ak-claude 0.0.1

package/GUIDE.md ADDED
# ak-claude --- Integration Guide

> A practical guide for rapidly adding AI capabilities to any Node.js codebase using `ak-claude`.
> Covers every class, common patterns, best practices, and observability hooks.

```sh
npm install ak-claude
```

**Requirements**: Node.js 18+. Auth via Vertex AI (`vertexai: true`) or an `ANTHROPIC_API_KEY` / `CLAUDE_API_KEY` env var.

---

## Table of Contents

1. [Core Concepts](#core-concepts)
2. [Authentication](#authentication)
3. [Class Selection Guide](#class-selection-guide)
4. [Message --- Stateless AI Calls](#message--stateless-ai-calls)
5. [Chat --- Multi-Turn Conversations](#chat--multi-turn-conversations)
6. [Transformer --- Structured JSON Transformation](#transformer--structured-json-transformation)
7. [ToolAgent --- Agent with Custom Tools](#toolagent--agent-with-custom-tools)
8. [CodeAgent --- Agent That Writes and Runs Code](#codeagent--agent-that-writes-and-runs-code)
9. [RagAgent --- Document & Data Q&A](#ragagent--document--data-qa)
10. [AgentQuery --- Autonomous Agent via Claude Agent SDK](#agentquery--autonomous-agent-via-claude-agent-sdk)
11. [Web Search](#web-search)
12. [Prompt Caching](#prompt-caching)
13. [Observability & Usage Tracking](#observability--usage-tracking)
14. [Thinking Configuration](#thinking-configuration)
15. [Error Handling & Retries](#error-handling--retries)
16. [Performance Tips](#performance-tips)
17. [Common Integration Patterns](#common-integration-patterns)
18. [Quick Reference](#quick-reference)

---

## Core Concepts

Every class in ak-claude extends `BaseClaude`, which handles:

- **Authentication** --- Vertex AI via Application Default Credentials (`vertexai: true`) or Anthropic API key
- **Message history** --- Managed conversation state as a plain array (Claude's Messages API is stateless; ak-claude manages history for you)
- **Token tracking** --- Input/output token counts after every call, including cache metrics
- **Cost estimation** --- Dollar estimates before sending
- **Few-shot seeding** --- Inject example pairs to guide the model
- **Extended thinking** --- Control the model's internal reasoning budget
- **Web search** --- Built-in server-managed web search tool
- **Prompt caching** --- Mark system prompts for Anthropic's ephemeral caching

```javascript
import { Transformer, Chat, Message, ToolAgent, CodeAgent, RagAgent, AgentQuery } from 'ak-claude';
// or
import AI from 'ak-claude';
const t = new AI.Transformer({ ... });
```

The default model is `claude-sonnet-4-6`. Override with `modelName`:

```javascript
new Chat({ modelName: 'claude-opus-4-6' });
```

---

## Authentication

ak-claude supports two authentication methods: **Vertex AI** (GCP) and **direct API key**.

### Vertex AI (recommended for GCP deployments)

```javascript
// Uses Application Default Credentials — no API key needed
// Auth via: gcloud auth application-default login
new Chat({ vertexai: true });

// With explicit project and region
new Chat({
  vertexai: true,
  vertexProjectId: 'my-gcp-project', // or GOOGLE_CLOUD_PROJECT env var
  vertexRegion: 'us-central1' // or GOOGLE_CLOUD_LOCATION env var (default: 'us-east5')
});
```

When `vertexai: true`, the Anthropic client is created lazily using `@anthropic-ai/vertex-sdk` (included as a dependency). No API key is required — authentication flows through Google Cloud's Application Default Credentials (ADC). This is ideal for server deployments on GCP, CI/CD pipelines with service accounts, or local development with `gcloud auth`.

### API Key (direct Anthropic API)

```javascript
// Option 1: Environment variable
// Set ANTHROPIC_API_KEY or CLAUDE_API_KEY in your .env or shell
new Chat();

// Option 2: Explicit key
new Chat({ apiKey: 'your-key' });
```

ak-claude checks for keys in this order:
1. `options.apiKey` (constructor argument)
2. `ANTHROPIC_API_KEY` environment variable
3. `CLAUDE_API_KEY` environment variable

If `vertexai` is not set and no key is found, the constructor throws immediately.
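
Equivalently, the lookup reduces to a nullish-coalescing chain --- an illustrative sketch of the documented order, not the library's actual source:

```javascript
// Mirrors the documented key resolution order and throws, like the
// constructor, when neither vertexai nor any key is configured.
function resolveApiKey(options = {}) {
  const key = options.apiKey
    ?? process.env.ANTHROPIC_API_KEY
    ?? process.env.CLAUDE_API_KEY;
  if (!options.vertexai && !key) {
    throw new Error('No API key found and vertexai is not enabled');
  }
  return key;
}
```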

### Rate Limit Retries

The Anthropic SDK handles 429 (rate limit) errors automatically. Control the number of retries:

```javascript
new Chat({
  maxRetries: 5 // default: 5, passed to the Anthropic SDK client
});
```

---

## Class Selection Guide

| I want to... | Use | Method |
|---|---|---|
| Get a one-off AI response (no history) | `Message` | `send()` |
| Have a back-and-forth conversation | `Chat` | `send()` |
| Transform JSON with examples + validation | `Transformer` | `send()` |
| Give the AI tools to call (APIs, DB, etc.) | `ToolAgent` | `chat()` / `stream()` |
| Let the AI write and run JavaScript | `CodeAgent` | `chat()` / `stream()` |
| Q&A over documents, files, or data | `RagAgent` | `chat()` / `stream()` |
| Launch an autonomous Claude Code agent | `AgentQuery` | `run()` / `resume()` |

**Rule of thumb**: Start with `Message` for the simplest integration. Move to `Chat` if you need history. Use `Transformer` when you need structured JSON output with validation. Use agents when the AI needs to take action. Use `AgentQuery` when you want the full Claude Code agent with built-in file and shell tools.

---

## Message --- Stateless AI Calls

The simplest class. Each `send()` call is independent --- no conversation history is maintained. Ideal for classification, extraction, summarization, and any fire-and-forget AI call.

```javascript
import { Message } from 'ak-claude';

const msg = new Message({
  systemPrompt: 'You are a sentiment classifier. Respond with: positive, negative, or neutral.'
});

const result = await msg.send('I love this product!');
console.log(result.text); // "positive"
console.log(result.usage); // { promptTokens, responseTokens, totalTokens, ... }
```

### Structured Output (JSON Schema)

Force the model to return valid JSON matching a schema using Claude's native structured outputs:

```javascript
const extractor = new Message({
  systemPrompt: 'Extract structured data from the input text.',
  responseSchema: {
    type: 'object',
    properties: {
      people: { type: 'array', items: { type: 'string' } },
      places: { type: 'array', items: { type: 'string' } },
      sentiment: { type: 'string', enum: ['positive', 'negative', 'neutral'] }
    },
    required: ['people', 'places', 'sentiment']
  }
});

const result = await extractor.send('Alice and Bob visited Paris. They had a wonderful time.');
console.log(result.data);
// { people: ['Alice', 'Bob'], places: ['Paris'], sentiment: 'positive' }
```

When `responseSchema` is provided, the API guarantees valid JSON matching your schema via `output_config`. The parsed object is available as `result.data`; the raw string is `result.text`.

### Fallback JSON Mode

If you do not need schema guarantees, use `responseFormat: 'json'` for a system-prompt-based approach:

```javascript
const jsonMsg = new Message({
  systemPrompt: 'Extract entities from text.',
  responseFormat: 'json'
});

const result = await jsonMsg.send('Alice works at Acme Corp in New York.');
console.log(result.data); // { entities: [...] } — best-effort JSON extraction
```
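
If you ever need to salvage JSON from raw model text yourself (outside `result.data`), a defensive extractor helps. This is an illustrative helper, not part of ak-claude:

```javascript
// Best-effort JSON extraction from model text: strip markdown fences,
// try a bare parse, then fall back to the first {...} span.
function extractJson(text) {
  const stripped = text.replace(/```(?:json)?/g, '').trim();
  try {
    return JSON.parse(stripped);
  } catch {
    const match = stripped.match(/\{[\s\S]*\}/);
    if (match) return JSON.parse(match[0]);
    throw new Error('No JSON object found in model output');
  }
}
```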

### When to Use Message

- Classification, tagging, or labeling
- Entity extraction
- Summarization
- Any call where previous context does not matter
- High-throughput pipelines where you process items independently

---

## Chat --- Multi-Turn Conversations

Maintains conversation history across calls. The model remembers what was said earlier.

```javascript
import { Chat } from 'ak-claude';

const chat = new Chat({
  systemPrompt: 'You are a helpful coding assistant.'
});

const r1 = await chat.send('What is a closure in JavaScript?');
console.log(r1.text);

const r2 = await chat.send('Can you give me an example?');
// The model remembers the closure topic from r1
console.log(r2.text);
```

### History Management

```javascript
// Get conversation history
const history = chat.getHistory();

// Get simplified text-only history
const curated = chat.getHistory(true);

// Clear and start fresh (preserves system prompt)
await chat.clearHistory();
```

### When to Use Chat

- Interactive assistants and chatbots
- Multi-step reasoning where later questions depend on earlier answers
- Tutoring or coaching interactions
- Any scenario where context carries across messages

---

## Transformer --- Structured JSON Transformation

The power tool for data pipelines. Show it examples of input -> output mappings, then send new inputs. Includes validation, retry, and AI-powered error correction.

```javascript
import { Transformer } from 'ak-claude';

const t = new Transformer({
  systemPrompt: 'Transform user profiles into marketing segments.',
  sourceKey: 'INPUT', // key for input data in examples
  targetKey: 'OUTPUT', // key for output data in examples
  maxRetries: 3, // retry on validation failure
  retryDelay: 1000, // ms between retries
});

// Seed with examples
await t.seed([
  {
    INPUT: { age: 25, spending: 'high', interests: ['tech', 'gaming'] },
    OUTPUT: { segment: 'young-affluent-tech', confidence: 0.9, tags: ['early-adopter'] }
  },
  {
    INPUT: { age: 55, spending: 'medium', interests: ['gardening', 'cooking'] },
    OUTPUT: { segment: 'mature-lifestyle', confidence: 0.85, tags: ['home-focused'] }
  }
]);

// Transform new data
const result = await t.send({ age: 30, spending: 'low', interests: ['books', 'hiking'] });
// result -> { segment: '...', confidence: ..., tags: [...] }
```

### Validation

Pass an async validator as the third argument to `send()`. If it throws, the Transformer retries with the error message fed back to the model:

```javascript
const result = await t.send(
  { age: 30, spending: 'low' },
  {}, // options
  async (output) => {
    if (!output.segment) throw new Error('Missing segment field');
    if (output.confidence < 0 || output.confidence > 1) {
      throw new Error('Confidence must be between 0 and 1');
    }
    return output; // return the validated (or modified) output
  }
);
```

Or set a global validator in the constructor:

```javascript
const t = new Transformer({
  asyncValidator: async (output) => {
    if (!output.id) throw new Error('Missing id');
    return output;
  }
});
```

### Self-Healing with `rebuild()`

When downstream code fails, feed the error back to the AI:

```javascript
try {
  await processPayload(result);
} catch (err) {
  const fixed = await t.rebuild(result, err.message);
  await processPayload(fixed); // try again with AI-corrected payload
}
```

### Loading Examples from a File

```javascript
const t = new Transformer({
  examplesFile: './training-data.json'
  // JSON array of { INPUT: ..., OUTPUT: ... } objects
});
await t.seed(); // loads from file automatically
```

### Stateless Sends

Send without affecting the conversation history (useful for parallel processing):

```javascript
const result = await t.send(payload, { stateless: true });
```
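
Stateless sends make parallel fan-out safe. To stay under rate limits, cap concurrency with a small helper --- a generic utility, shown here with the documented `stateless` option in the usage comment:

```javascript
// Run an async mapper over items with at most `limit` calls in flight,
// preserving input order in the results array.
async function mapWithConcurrency(items, limit, fn) {
  const results = new Array(items.length);
  let next = 0;
  async function worker() {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  }
  await Promise.all(Array.from({ length: Math.min(limit, items.length) }, worker));
  return results;
}

// e.g. const segments = await mapWithConcurrency(payloads, 4, (p) => t.send(p, { stateless: true }));
```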

### History Management

```javascript
// Clear conversation history but preserve seeded examples
await t.clearHistory();

// Full reset including seeded examples
await t.reset();

// Update system prompt
await t.updateSystemPrompt('New instructions for the model.');
```

### When to Use Transformer

- ETL pipelines --- transform data between formats
- API response normalization
- Content enrichment (add tags, categories, scores)
- Any structured data transformation where you can provide examples
- Batch processing with validation guarantees

---

## ToolAgent --- Agent with Custom Tools

Give the model tools (functions) it can call. You define what tools exist and how to execute them. The agent handles the conversation loop --- sending messages, receiving tool calls, executing them, feeding results back, until the model produces a final text answer.
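
Conceptually, the loop ToolAgent automates looks like this --- a simplified, generic sketch of the tool-use pattern, not the library's actual implementation (`callModel` stands in for a raw Messages API call):

```javascript
// Generic tool-use loop: call the model, execute any requested tools,
// feed results back, and repeat until the model answers in plain text.
async function runToolLoop(callModel, toolExecutor, messages, maxToolRounds = 10) {
  for (let round = 0; round < maxToolRounds; round++) {
    const reply = await callModel(messages);
    const toolCalls = reply.content.filter((b) => b.type === 'tool_use');
    if (toolCalls.length === 0) {
      // No tool calls left: return the final text answer
      return reply.content.find((b) => b.type === 'text')?.text ?? '';
    }
    messages.push({ role: 'assistant', content: reply.content });
    const results = [];
    for (const call of toolCalls) {
      const result = await toolExecutor(call.name, call.input);
      results.push({ type: 'tool_result', tool_use_id: call.id, content: JSON.stringify(result) });
    }
    messages.push({ role: 'user', content: results });
  }
  throw new Error('maxToolRounds exceeded');
}
```

In practice you never write this loop yourself --- `chat()` runs it for you.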

```javascript
import { ToolAgent } from 'ak-claude';

const agent = new ToolAgent({
  systemPrompt: 'You are a database assistant.',
  tools: [
    {
      name: 'query_db',
      description: 'Execute a read-only SQL query against the users database',
      input_schema: {
        type: 'object',
        properties: {
          sql: { type: 'string', description: 'The SQL query to execute' }
        },
        required: ['sql']
      }
    },
    {
      name: 'send_email',
      description: 'Send an email notification',
      input_schema: {
        type: 'object',
        properties: {
          to: { type: 'string' },
          subject: { type: 'string' },
          body: { type: 'string' }
        },
        required: ['to', 'subject', 'body']
      }
    }
  ],
  toolExecutor: async (toolName, args) => {
    switch (toolName) {
      case 'query_db':
        return await db.query(args.sql);
      case 'send_email':
        await mailer.send(args);
        return { sent: true };
      default:
        throw new Error(`Unknown tool: ${toolName}`);
    }
  },
  maxToolRounds: 10 // safety limit on tool-use loop iterations
});

const result = await agent.chat('How many users signed up this week? Email the count to admin@co.com');
console.log(result.text); // "There were 47 new signups this week. I've sent the email."
console.log(result.toolCalls); // [{ name: 'query_db', args: {...}, result: [...] }, { name: 'send_email', ... }]
```

### Tool Schema Compatibility

ToolAgent accepts tool declarations in Claude's native format (`input_schema`) as well as Gemini-compatible formats (`inputSchema` or `parametersJsonSchema`). They are auto-mapped internally:

```javascript
// Claude format (native)
{ name: 'my_tool', description: '...', input_schema: { type: 'object', properties: { ... } } }

// Gemini-compatible format (auto-mapped)
{ name: 'my_tool', description: '...', parametersJsonSchema: { type: 'object', properties: { ... } } }
```
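
A sketch of that normalization (illustrative --- the library's internal mapping may differ in details):

```javascript
// Normalize a tool declaration to Claude's native shape, accepting
// input_schema (native), inputSchema, or parametersJsonSchema.
function normalizeTool(tool) {
  const schema = tool.input_schema ?? tool.inputSchema ?? tool.parametersJsonSchema;
  return { name: tool.name, description: tool.description, input_schema: schema };
}
```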

### Tool Choice

Control how the model selects tools:

```javascript
const agent = new ToolAgent({
  tools: [...],
  toolExecutor: myExecutor,
  toolChoice: { type: 'auto' }, // default: model decides
  // toolChoice: { type: 'any' }, // model must use a tool
  // toolChoice: { type: 'tool', name: 'query_db' }, // force a specific tool
  // toolChoice: { type: 'none' }, // disable tool use
  disableParallelToolUse: true, // force sequential tool calls
});
```

### Streaming

Stream the agent's output in real time --- useful for showing progress in a UI:

```javascript
for await (const event of agent.stream('Find the top 5 users by spend')) {
  switch (event.type) {
    case 'text': process.stdout.write(event.text); break;
    case 'tool_call': console.log(`\nCalling ${event.toolName}...`); break;
    case 'tool_result': console.log(`Result:`, event.result); break;
    case 'done': console.log('\nUsage:', event.usage); break;
  }
}
```

### Execution Gating

Control which tool calls are allowed at runtime:

```javascript
const agent = new ToolAgent({
  tools: [...],
  toolExecutor: myExecutor,
  onBeforeExecution: async (toolName, args) => {
    if (toolName === 'delete_user') {
      console.log('Blocked dangerous tool call');
      return false; // deny execution
    }
    return true; // allow
  },
  onToolCall: (toolName, args) => {
    // Notification callback --- fires on every tool call (logging, metrics, etc.)
    metrics.increment(`tool_call.${toolName}`);
  }
});
```

### Stopping an Agent

Cancel mid-execution from a callback or externally:

```javascript
// From a callback
onBeforeExecution: async (toolName, args) => {
  if (shouldStop) {
    agent.stop(); // stop after this round
    return false;
  }
  return true;
}

// Externally (e.g., user cancel button, timeout)
setTimeout(() => agent.stop(), 60_000);
const result = await agent.chat('Do some work');
// result includes warning: "Agent was stopped"
```

### When to Use ToolAgent

- AI that needs to call APIs, query databases, or interact with external systems
- Workflow automation --- the AI orchestrates a sequence of operations
- Research assistants that fetch and synthesize data from multiple sources
- Any scenario where you want the model to decide *which* tools to use and *when*

---

## CodeAgent --- Agent That Writes and Runs Code

Instead of calling tools one by one, the model writes complete JavaScript scripts and executes them in a child process. This is powerful for tasks that require complex logic, file manipulation, or multi-step computation.

```javascript
import { CodeAgent } from 'ak-claude';

const agent = new CodeAgent({
  workingDirectory: '/path/to/project',
  importantFiles: ['package.json', 'src/config.js'], // injected into system prompt
  timeout: 30_000, // per-execution timeout
  maxRounds: 10, // max code execution cycles
  keepArtifacts: true, // keep script files on disk after execution
});

const result = await agent.chat('Find all files larger than 1MB and list them sorted by size');
console.log(result.text); // Agent's summary
console.log(result.codeExecutions); // [{ code, output, stderr, exitCode, purpose }]
```

### How It Works

1. On `init()`, the agent scans the working directory and gathers codebase context (file tree via `git ls-files`, package.json dependencies, importantFiles contents)
2. This context is injected into the system prompt so the model understands the project
3. The model writes JavaScript using an internal `execute_code` tool with a descriptive `purpose` slug
4. Code is saved to a `.mjs` file and run in a Node.js child process that inherits `process.env`
5. stdout/stderr feeds back to the model
6. The model decides if more work is needed (up to `maxRounds` cycles)

Scripts are written to `writeDir` (default: `{workingDirectory}/tmp`) with descriptive names like `agent-read-config-1710000000.mjs`.

### Streaming

```javascript
for await (const event of agent.stream('Refactor the auth module to use async/await')) {
  switch (event.type) {
    case 'text': process.stdout.write(event.text); break;
    case 'code': console.log('\n--- Executing code ---'); break;
    case 'output': console.log(event.stdout); break;
    case 'done': console.log('\nDone!', event.usage); break;
  }
}
```

### Execution Gating & Notifications

```javascript
const agent = new CodeAgent({
  workingDirectory: '.',
  onBeforeExecution: async (code) => {
    // Review code before it runs
    if (code.includes('rm -rf')) return false; // deny
    return true;
  },
  onCodeExecution: (code, output) => {
    // Log every execution for audit
    logger.info({ code: code.slice(0, 200), exitCode: output.exitCode });
  }
});
```

### Retrieving Scripts

Get all scripts the agent wrote across all interactions:

```javascript
const scripts = agent.dump();
// [{ fileName: 'agent-read-config.mjs', purpose: 'read-config', script: '...', filePath: '/path/...' }]
```

### Code Style Options

```javascript
new CodeAgent({
  comments: true, // instruct model to write JSDoc comments (default: false, saves tokens)
});
```

### When to Use CodeAgent

- File system operations --- reading, writing, transforming files
- Data analysis --- processing CSV, JSON, or log files
- Codebase exploration --- finding patterns, counting occurrences, generating reports
- Prototyping --- quickly testing ideas by having the AI write and run code
- Any task where the AI needs more flexibility than predefined tools provide

---

## RagAgent --- Document & Data Q&A

Load documents and data into the model's context for grounded Q&A. Supports three input types that can be used together:

| Input Type | Option | What It Does |
|---|---|---|
| **Local files** | `localFiles` | Read from disk as UTF-8 text --- for md, json, csv, yaml, txt |
| **Local data** | `localData` | In-memory objects serialized as JSON |
| **Media files** | `mediaFiles` | Base64-encoded images and PDFs for Claude's vision |

```javascript
import { RagAgent } from 'ak-claude';

const agent = new RagAgent({
  // Text files read directly from disk
  localFiles: ['./docs/api-reference.md', './docs/architecture.md'],

  // In-memory data
  localData: [
    { name: 'users', data: await db.query('SELECT * FROM users LIMIT 100') },
    { name: 'config', data: JSON.parse(await fs.readFile('./config.json', 'utf-8')) },
  ],

  // Images and PDFs (base64 encoded, supports Claude vision)
  mediaFiles: ['./diagrams/architecture.png', './reports/q4.pdf'],
});

const result = await agent.chat('What authentication method does the API use?');
console.log(result.text); // Grounded answer citing the api-reference.md
```

### Citations

Enable Claude's built-in citations feature for source attribution:

```javascript
const agent = new RagAgent({
  localFiles: ['./docs/api.md'],
  mediaFiles: ['./reports/q4.pdf'],
  enableCitations: true, // enables document-level citations
});

const result = await agent.chat('What were Q4 revenue numbers?');
console.log(result.text);
console.log(result.citations);
// [{ type: 'char_location', cited_text: '...', document_title: 'q4.pdf', ... }]
```

When `enableCitations` is true, all context sources (local files, local data, media files) are injected as `document` content blocks with `citations: { enabled: true }`, allowing Claude to return structured citation metadata.
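
For reference, a text source with citations enabled becomes a block shaped roughly like this (the standard Messages API `document` block; the exact blocks ak-claude builds may differ):

```javascript
// Shape of a citable text document block in Anthropic's Messages API.
const documentBlock = {
  type: 'document',
  source: { type: 'text', media_type: 'text/plain', data: '...file contents...' },
  title: 'api.md', // surfaces as document_title in citations
  citations: { enabled: true },
};
```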

### Dynamic Context

Add more context after initialization (each triggers a reinit):

```javascript
await agent.addLocalFiles(['./new-doc.md']);
await agent.addLocalData([{ name: 'metrics', data: { uptime: 99.9 } }]);
await agent.addMediaFiles(['./new-chart.png']);
```

### Inspecting Context

```javascript
const ctx = agent.getContext();
// {
//   localFiles: [{ name, path, size }],
//   localData: [{ name, type }],
//   mediaFiles: [{ path, name, ext }]
// }
```

### Streaming

```javascript
for await (const event of agent.stream('Summarize the architecture document')) {
  if (event.type === 'text') process.stdout.write(event.text);
  if (event.type === 'done') console.log('\nUsage:', event.usage);
}
```

### When to Use RagAgent

- Documentation Q&A --- let users ask questions about your docs
- Data exploration --- load database results or CSV exports and ask questions
- Code review --- load source files and ask about patterns, bugs, or architecture
- Report analysis --- load PDF reports and extract insights
- Any scenario where the AI needs to answer questions grounded in specific data

### Choosing Input Types

| Data | Use |
|---|---|
| Plain text files (md, txt, json, csv, yaml) | `localFiles` --- read as UTF-8, fastest |
| In-memory objects, DB results, API responses | `localData` --- serialized as JSON |
| Images (png, jpg, gif, webp) | `mediaFiles` --- base64 encoded for vision |
| PDFs | `mediaFiles` --- base64 encoded as document blocks |

Prefer `localFiles` and `localData` when possible --- they are fastest to initialize and have no size overhead beyond the text itself.

---

## AgentQuery --- Autonomous Agent via Claude Agent SDK

AgentQuery wraps the `@anthropic-ai/claude-agent-sdk` to launch a full autonomous Claude Code agent with built-in tools for file operations, shell commands, and code search. Unlike the other classes, AgentQuery does **not** extend BaseClaude --- it uses a completely different SDK with its own execution model.

```javascript
import { AgentQuery } from 'ak-claude';

const agent = new AgentQuery({
  cwd: '/path/to/project',
  allowedTools: ['Read', 'Glob', 'Grep', 'Bash'],
  maxTurns: 20,
  maxBudgetUsd: 0.50,
});

for await (const msg of agent.run('Find all TODO comments and summarize them')) {
  if (msg.type === 'assistant') {
    console.log(msg.message.content);
  }
  if (msg.type === 'result') {
    console.log('Done:', msg.result);
    console.log('Cost:', msg.total_cost_usd);
  }
}
```

### Built-in Tools

The Claude Agent SDK provides tools that the agent can use autonomously:

- **Read** --- Read file contents
- **Write** --- Write files
- **Edit** --- Edit files with diffs
- **Bash** --- Execute shell commands
- **Glob** --- Find files by pattern
- **Grep** --- Search file contents
- **NotebookEdit** --- Edit Jupyter notebooks

Control access with `allowedTools` and `disallowedTools`:

```javascript
// Restrict to read-only operations
new AgentQuery({
  allowedTools: ['Read', 'Glob', 'Grep'],
});

// Allow everything except destructive operations
new AgentQuery({
  disallowedTools: ['Write', 'Edit', 'Bash'],
});
```

### Session Resumption

Resume a previous agent session to continue where it left off:

```javascript
const agent = new AgentQuery({ cwd: '/my/project' });

// First run
for await (const msg of agent.run('Analyze the codebase structure')) {
  // ... process messages
}

const sessionId = agent.lastSessionId;

// Later: resume with follow-up
for await (const msg of agent.resume(sessionId, 'Now refactor the auth module')) {
  // ... process messages
}
```

### MCP Server Configuration

Connect external MCP (Model Context Protocol) servers for additional tool capabilities:

```javascript
new AgentQuery({
  mcpServers: {
    database: {
      command: 'npx',
      args: ['-y', '@modelcontextprotocol/server-postgres', 'postgresql://localhost/mydb'],
    }
  }
});
```

### Budget and Turn Limits

```javascript
new AgentQuery({
  maxTurns: 30, // max agentic turns before stopping
  maxBudgetUsd: 1.00, // hard spending cap in USD
});
```

### When to Use AgentQuery

- Complex multi-file codebase tasks (refactoring, analysis, migration)
- Tasks that need shell access, file editing, and code search combined
- Autonomous workflows where you want Claude Code's full capabilities
- Situations where you want budget controls and turn limits

---

## Web Search

Ground model responses in real-time web search results. Available on **all classes** via `enableWebSearch` --- Claude's server-managed web search tool is injected automatically.

### Basic Usage

```javascript
import { Chat } from 'ak-claude';

const chat = new Chat({
  enableWebSearch: true
});

const result = await chat.send('What happened in tech news today?');
console.log(result.text); // Response grounded in current search results
```

### Web Search Configuration

```javascript
const chat = new Chat({
  enableWebSearch: true,
  webSearchConfig: {
    max_uses: 5, // max searches per request
    allowed_domains: ['docs.python.org'], // only search these domains
    blocked_domains: ['reddit.com'], // never search these domains
  }
});
```

Note that Anthropic's web search tool accepts `allowed_domains` *or* `blocked_domains`, not both in the same request --- pick one.

### Web Search with ToolAgent

Web search works alongside user-defined tools --- both are merged into the tools array automatically:

```javascript
const agent = new ToolAgent({
  enableWebSearch: true,
  tools: [
    {
      name: 'save_result',
      description: 'Save a research result',
      input_schema: {
        type: 'object',
        properties: { title: { type: 'string' }, summary: { type: 'string' } },
        required: ['title', 'summary']
      }
    }
  ],
  toolExecutor: async (name, args) => {
    if (name === 'save_result') return await db.insert(args);
  }
});

// The agent can search the web AND call your tools
const result = await agent.chat('Research the latest AI safety developments and save the key findings');
```

### When to Use Web Search

- Questions about current events, recent news, or real-time data
- Fact-checking or verification tasks
- Research assistants that need up-to-date information
- Any scenario where the model's training data cutoff is a limitation

---

## Prompt Caching

Mark system prompts for Anthropic's ephemeral prompt caching to reduce costs when making many API calls with the same large context. Cached tokens are billed at a reduced rate.

### When Prompt Caching Helps

- **Large system prompts** reused across many calls
- **RagAgent** with the same document set serving many queries
- **ToolAgent** with many tool definitions
- Any scenario with high token count in repeated context

### Enable Caching

```javascript
import { Chat } from 'ak-claude';

const chat = new Chat({
  systemPrompt: veryLongSystemPrompt, // e.g., 10,000+ tokens
  cacheSystemPrompt: true // adds cache_control: { type: 'ephemeral' }
});

// First call creates the cache (pay cache creation cost)
const r1 = await chat.send('Hello');

// Subsequent calls within the cache TTL use cached tokens at reduced cost
const r2 = await chat.send('Tell me more');
```

When `cacheSystemPrompt: true`, the system prompt is sent as a content block array with `cache_control: { type: 'ephemeral' }` attached. Anthropic automatically manages the cache lifetime.
887
+
888
+ ### Monitoring Cache Usage
889
+
890
+ Cache token metrics are included in usage data:
891
+
892
+ ```javascript
893
+ const usage = chat.getLastUsage();
894
+ console.log(usage.cacheCreationTokens); // tokens used to create the cache (first call)
895
+ console.log(usage.cacheReadTokens); // tokens read from cache (subsequent calls)
896
+ ```
897
+
898
+ ### Cost Savings
899
+
900
+ Cached input tokens are billed at a discount compared to regular input tokens. The exact savings depend on the model --- check [Anthropic's pricing page](https://docs.anthropic.com/en/docs/about-claude/models) for current rates. The trade-off is a small cache creation cost on the first call.
901
+
902
+ **Rule of thumb**: Caching pays off when you make many calls with the same large context (system prompt + seeded examples) within the cache TTL (typically 5 minutes for ephemeral caches).
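
As back-of-the-envelope arithmetic for that rule of thumb --- assuming $3 per million input tokens, a 1.25x cache-write premium, and a 90% cache-read discount (all assumed rates; verify against current pricing):

```javascript
// Break-even sketch under ASSUMED pricing: $3/M input tokens,
// cache writes billed at 1.25x, cache reads at 0.1x.
const RATE = 3 / 1e6; // dollars per input token

function uncachedCost(contextTokens, calls) {
  return contextTokens * RATE * calls;
}

function cachedCost(contextTokens, calls) {
  const write = contextTokens * RATE * 1.25;              // first call writes the cache
  const reads = contextTokens * RATE * 0.1 * (calls - 1); // later calls read it
  return write + reads;
}

// With a 10,000-token system prompt, caching already wins at the second call:
console.log(uncachedCost(10_000, 2).toFixed(4)); // 0.0600
console.log(cachedCost(10_000, 2).toFixed(4));   // 0.0405
```

A single call never benefits: you pay the write premium and make no cached reads.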

---

## Observability & Usage Tracking

Every class provides consistent observability hooks.

### Token Usage

After every API call, get detailed token counts:

```javascript
const usage = instance.getLastUsage();
// {
//   promptTokens: 1250,      // input tokens (cumulative across retries)
//   responseTokens: 340,     // output tokens (cumulative across retries)
//   totalTokens: 1590,       // total (cumulative)
//   cacheCreationTokens: 0,  // tokens used to create cache
//   cacheReadTokens: 0,      // tokens read from cache
//   attempts: 1,             // 1 = first try, 2+ = retries needed
//   modelVersion: 'claude-sonnet-4-6-20250514', // actual model that responded
//   requestedModel: 'claude-sonnet-4-6',        // model you requested
//   stopReason: 'end_turn',  // 'end_turn', 'tool_use', 'max_tokens'
//   timestamp: 1710000000000
// }
```
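
These usage objects are easy to aggregate for per-request cost reporting. A minimal tracker sketch (a hypothetical helper, not part of ak-claude):

```javascript
// Hypothetical helper: sums usage objects returned by getLastUsage().
function createUsageTracker() {
  const totals = { promptTokens: 0, responseTokens: 0, totalTokens: 0, calls: 0 };
  return {
    record(usage) {
      if (!usage) return; // getLastUsage() returns null before the first call
      totals.promptTokens += usage.promptTokens;
      totals.responseTokens += usage.responseTokens;
      totals.totalTokens += usage.totalTokens;
      totals.calls += 1;
    },
    snapshot: () => ({ ...totals }),
  };
}

const tracker = createUsageTracker();
tracker.record({ promptTokens: 1250, responseTokens: 340, totalTokens: 1590 });
tracker.record({ promptTokens: 200, responseTokens: 50, totalTokens: 250 });
console.log(tracker.snapshot().totalTokens); // 1840
```

Call `tracker.record(instance.getLastUsage())` after each send to keep running totals per request or per tenant.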

### Cost Estimation

Estimate cost *before* sending using Claude's token counting API:

```javascript
const estimate = await instance.estimateCost('What is the meaning of life?');
// {
//   inputTokens: 8,
//   model: 'claude-sonnet-4-6',
//   pricing: { input: 3.00, output: 15.00 }, // per million tokens
//   estimatedInputCost: 0.000024,
//   note: 'Cost is for input tokens only; output cost depends on response length'
// }
```

Or just get the token count:

```javascript
const { inputTokens } = await instance.estimate('some payload');
```

### Logging

All classes use [pino](https://github.com/pinojs/pino) for structured logging. Control the level:

```javascript
// Per-instance
new Chat({ logLevel: 'debug' });

// Via environment: LOG_LEVEL=debug node app.js

// Via NODE_ENV (dev -> debug, test -> warn, prod -> error)

// Silence all logging
new Chat({ logLevel: 'none' });
```
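
The `NODE_ENV` mapping can be sketched as a small resolver (illustrative only; the `'info'` fallback is an assumption, not documented behavior):

```javascript
// dev -> debug, test -> warn, prod -> error, per the mapping above.
function defaultLogLevel(nodeEnv) {
  switch (nodeEnv) {
    case 'development': return 'debug';
    case 'test':        return 'warn';
    case 'production':  return 'error';
    default:            return 'info'; // assumed fallback
  }
}

console.log(defaultLogLevel('test')); // warn
```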

### Agent Callbacks

ToolAgent and CodeAgent provide execution callbacks for building audit trails, metrics, and approval flows:

```javascript
// ToolAgent
new ToolAgent({
  onToolCall: (toolName, args) => {
    // Fires on every tool call --- use for logging, metrics
    logger.info({ event: 'tool_call', tool: toolName, args });
  },
  onBeforeExecution: async (toolName, args) => {
    // Fires before execution --- return false to deny
    // Use for approval flows, safety checks, rate limiting
    return !blocklist.includes(toolName);
  }
});

// CodeAgent
new CodeAgent({
  onCodeExecution: (code, output) => {
    // Fires after every code execution
    logger.info({ event: 'code_exec', exitCode: output.exitCode, lines: code.split('\n').length });
  },
  onBeforeExecution: async (code) => {
    // Review code before execution
    if (code.includes('process.exit')) return false;
    return true;
  }
});
```

---

## Thinking Configuration

Claude models support extended thinking --- internal reasoning before answering. This produces higher-quality results for complex tasks at the cost of additional tokens and latency.

```javascript
// Enable thinking with a token budget
new Chat({
  thinking: {
    type: 'enabled',
    budget_tokens: 4096
  }
});
```

**Important constraints** when thinking is enabled:

- `temperature` is ignored (internally set to 1)
- `top_p` and `top_k` are not supported
- `budget_tokens` must be less than `maxTokens`

```javascript
// Example: complex reasoning task
const reasoner = new Chat({
  systemPrompt: 'You are an expert mathematician.',
  thinking: { type: 'enabled', budget_tokens: 8192 },
  maxTokens: 16384, // must be greater than budget_tokens
});

const result = await reasoner.send('Prove that the square root of 2 is irrational.');
console.log(result.text); // Detailed proof with internal reasoning
```
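
A small guard can enforce these constraints before constructing an instance (an illustrative helper, not part of ak-claude):

```javascript
// Throws if a thinking config violates the constraints listed above.
function assertThinkingConfig({ thinking, maxTokens }) {
  if (!thinking) return; // thinking disabled: nothing to check
  if (thinking.type !== 'enabled') throw new Error("thinking.type must be 'enabled'");
  if (thinking.budget_tokens >= maxTokens) {
    throw new Error(`budget_tokens (${thinking.budget_tokens}) must be less than maxTokens (${maxTokens})`);
  }
}

assertThinkingConfig({ thinking: { type: 'enabled', budget_tokens: 8192 }, maxTokens: 16384 }); // ok
```

Failing fast here is cheaper than letting the API reject the request after a round trip.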

**When to enable thinking**: Complex reasoning, math, multi-step logic, code generation, nuanced analysis. **When to skip it**: Simple classification, extraction, or chat where speed matters.

---

## Error Handling & Retries

### SDK-Level Rate Limit Retries

The Anthropic SDK automatically retries on 429 (rate limit) and certain 5xx errors. Control the retry count:

```javascript
new Chat({
  maxRetries: 5 // default: 5, with exponential backoff
});
```

This is handled entirely by the `@anthropic-ai/sdk` client --- you do not need to implement retry logic for rate limits.

### Transformer Validation Retries

The Transformer has its own retry mechanism for validation failures. This is separate from SDK-level retries:

```javascript
const t = new Transformer({
  maxRetries: 3,   // default: 3 validation retries
  retryDelay: 1000 // default: 1000ms, doubles each retry (exponential backoff)
});
```

Each retry feeds the validation error back to the model via `rebuild()`, giving it a chance to self-correct. The `usage` object reports cumulative tokens across all attempts:

```javascript
const result = await t.send(payload, {}, validator);
const usage = t.getLastUsage();
console.log(usage.attempts); // 2 = needed one retry
```
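
With the documented doubling, the wait before retry *n* is `retryDelay * 2^(n-1)`:

```javascript
// Backoff arithmetic for the retryDelay option: base delay, doubling per retry.
function retryDelayMs(baseDelay, attempt) {
  return baseDelay * 2 ** (attempt - 1);
}

console.log([1, 2, 3].map(n => retryDelayMs(1000, n))); // [ 1000, 2000, 4000 ]
```

So the default settings (`maxRetries: 3`, `retryDelay: 1000`) add at most 7 seconds of waiting on top of the API calls themselves.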

### CodeAgent Failure Limits

CodeAgent tracks consecutive failed code executions. After `maxRetries` (default: 3) consecutive failures, the model is instructed to stop executing code and summarize what went wrong:

```javascript
new CodeAgent({
  maxRetries: 5, // allow more failures before stopping
});
```

### General Error Handling

```javascript
try {
  const result = await chat.send('Hello');
} catch (err) {
  if (err.status === 400) {
    console.error('Bad request:', err.message);
  } else if (err.status === 401) {
    console.error('Invalid API key');
  } else if (err.status === 429) {
    // Normally handled by SDK retries, but thrown once maxRetries is exhausted
    console.error('Rate limited after all retries');
  } else {
    console.error('Unexpected error:', err.message);
  }
}
```

---

## Performance Tips

### Reuse Instances

Each instance carries its setup: system prompt, seeded examples, and conversation history. Creating a new instance for every request repeats that work. Reuse instances when possible:

```javascript
// Bad --- creates a new instance every call
app.post('/classify', async (req, res) => {
  const msg = new Message({ systemPrompt: '...' }); // new instance every request!
  const result = await msg.send(req.body.text);
  res.json(result);
});

// Good --- reuse the instance
const classifier = new Message({ systemPrompt: '...' });
app.post('/classify', async (req, res) => {
  const result = await classifier.send(req.body.text);
  res.json(result);
});
```

### Choose the Right Model

| Model | Speed | Cost | Best For |
|---|---|---|---|
| `claude-haiku-4-5-20251001` | Fastest | Cheapest | Classification, extraction, simple tasks |
| `claude-sonnet-4-6` | Fast | Medium | General purpose, good quality (default) |
| `claude-opus-4-6` | Slower | Highest | Complex reasoning, code, deep analysis |

### Use `Message` for Stateless Workloads

`Message` sends each request independently --- no history accumulation. For pipelines processing thousands of items independently, `Message` is the right choice: it avoids the growing token cost of conversation history.
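
The difference is easy to quantify. Cumulative input tokens over `n` items, assuming an `s`-token system prompt and `m`-token items (replies ignored for simplicity; purely illustrative arithmetic):

```javascript
// Message: every call sends only the system prompt plus the current item.
function statelessTotal(s, m, n) {
  return n * (s + m);
}

// Chat: call i re-sends the previous i-1 items as conversation history.
function statefulTotal(s, m, n) {
  let total = 0;
  for (let i = 1; i <= n; i++) total += s + i * m;
  return total;
}

console.log(statelessTotal(500, 200, 100)); // 70000
console.log(statefulTotal(500, 200, 100));  // 1060000 --- history dominates
```

History grows quadratically with item count, which is why batch pipelines should stay stateless.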

### Use `localFiles` / `localData` over `mediaFiles`

For text-based content, `localFiles` and `localData` are injected as plain text --- no base64 encoding overhead. They initialize faster and use fewer tokens than `mediaFiles`.

### Enable Prompt Caching for Large System Prompts

If your system prompt is large (1,000+ tokens) and you make many calls, `cacheSystemPrompt: true` reduces input costs on subsequent calls:

```javascript
const classifier = new Message({
  systemPrompt: veryLongInstructions,
  cacheSystemPrompt: true
});
```

### Disable Thinking for Simple Tasks

Extended thinking tokens cost money and add latency. For classification, extraction, or simple formatting tasks, omit the `thinking` option entirely (the default).

### Use Stateless Sends for Parallel Processing

When using Transformer for batch processing, `stateless: true` prevents history from growing:

```javascript
const results = await Promise.all(
  records.map(r => transformer.send(r, { stateless: true }))
);
```

---

## Common Integration Patterns

### Pattern: API Endpoint Classifier

```javascript
import { Message } from 'ak-claude';

const classifier = new Message({
  modelName: 'claude-haiku-4-5-20251001', // fast + cheap
  systemPrompt: 'Classify support tickets. Respond with exactly one of: billing, technical, account, other.',
});

app.post('/api/classify-ticket', async (req, res) => {
  const result = await classifier.send(req.body.text);
  res.json({ category: result.text.trim().toLowerCase() });
});
```

### Pattern: ETL Pipeline with Validation

```javascript
import { Transformer } from 'ak-claude';

const normalizer = new Transformer({
  sourceKey: 'RAW',
  targetKey: 'NORMALIZED',
  maxRetries: 3,
  asyncValidator: async (output) => {
    if (!output.email?.includes('@')) throw new Error('Invalid email');
    if (!output.name?.trim()) throw new Error('Name is required');
    return output;
  }
});

await normalizer.seed([
  { RAW: { nm: 'alice', mail: 'alice@co.com' }, NORMALIZED: { name: 'Alice', email: 'alice@co.com' } },
]);

for (const record of rawRecords) {
  const clean = await normalizer.send(record, { stateless: true });
  await db.insert('users', clean);
}
```

### Pattern: Conversational Assistant with Tools

```javascript
import { ToolAgent } from 'ak-claude';

const assistant = new ToolAgent({
  systemPrompt: `You are a customer support agent for Acme Corp.
You can look up orders and issue refunds. Always confirm before issuing refunds.`,
  tools: [
    {
      name: 'lookup_order',
      description: 'Look up an order by ID or customer email',
      input_schema: {
        type: 'object',
        properties: {
          order_id: { type: 'string' },
          email: { type: 'string' }
        }
      }
    },
    {
      name: 'issue_refund',
      description: 'Issue a refund for an order',
      input_schema: {
        type: 'object',
        properties: {
          order_id: { type: 'string' },
          amount: { type: 'number' },
          reason: { type: 'string' }
        },
        required: ['order_id', 'amount', 'reason']
      }
    }
  ],
  toolExecutor: async (toolName, args) => {
    if (toolName === 'lookup_order') return await orderService.lookup(args);
    if (toolName === 'issue_refund') return await orderService.refund(args);
  },
  onBeforeExecution: async (toolName, args) => {
    // Only allow refunds under $100 without human approval
    if (toolName === 'issue_refund' && args.amount > 100) {
      return false;
    }
    return true;
  }
});

// In a chat endpoint
const result = await assistant.chat(userMessage);
```

### Pattern: Code Analysis Agent

```javascript
import { CodeAgent } from 'ak-claude';

const analyst = new CodeAgent({
  workingDirectory: '/path/to/project',
  importantFiles: ['package.json', 'tsconfig.json'],
  maxRounds: 15,
  timeout: 60_000,
  onCodeExecution: (code, output) => {
    console.log(`[CodeAgent] exitCode=${output.exitCode}, stdout=${output.stdout.length} chars`);
  }
});

const result = await analyst.chat('Find all unused exports in this project and list them.');
console.log(result.text);
console.log(`Executed ${result.codeExecutions.length} scripts`);
```

### Pattern: Document Q&A Service

```javascript
import { RagAgent } from 'ak-claude';

const docs = new RagAgent({
  localFiles: [
    './docs/getting-started.md',
    './docs/api-reference.md',
    './docs/faq.md',
  ],
  enableCitations: true,
  systemPrompt: 'You are a documentation assistant. Answer questions based on the docs. If the answer is not in the docs, say so.',
});

app.post('/api/ask', async (req, res) => {
  const result = await docs.chat(req.body.question);
  res.json({ answer: result.text, citations: result.citations, usage: result.usage });
});
```

### Pattern: Provider Abstraction Layer

ak-claude and ak-gemini share a compatible API surface. You can build a provider-agnostic wrapper:

```javascript
// ai-provider.js
import { Chat as ClaudeChat, Transformer as ClaudeTransformer } from 'ak-claude';
import { Chat as GeminiChat, Transformer as GeminiTransformer } from 'ak-gemini';

const PROVIDER = process.env.AI_PROVIDER || 'claude';

export function createChat(opts) {
  if (PROVIDER === 'gemini') {
    return new GeminiChat({ modelName: 'gemini-2.5-flash', ...opts });
  }
  return new ClaudeChat({ modelName: 'claude-sonnet-4-6', ...opts });
}

export function createTransformer(opts) {
  if (PROVIDER === 'gemini') {
    return new GeminiTransformer({ modelName: 'gemini-2.5-flash', ...opts });
  }
  return new ClaudeTransformer({ modelName: 'claude-sonnet-4-6', ...opts });
}

// Usage --- works with either provider
const chat = createChat({ systemPrompt: 'You are a helpful assistant.' });
const result = await chat.send('Hello!');
console.log(result.text);
console.log(result.usage); // same shape: { promptTokens, responseTokens, totalTokens, ... }
```

Both libraries share these APIs: `send()`, `chat()`, `stream()`, `seed()`, `getHistory()`, `clearHistory()`, `getLastUsage()`, `estimate()`, `estimateCost()`.

### Pattern: Few-Shot Any Class

Every class (except Message) supports `seed()` for few-shot learning --- not just Transformer:

```javascript
import { Chat } from 'ak-claude';

const chat = new Chat({ systemPrompt: 'You are a SQL expert.' });
await chat.seed([
  { PROMPT: 'Get all users', ANSWER: 'SELECT * FROM users;' },
  { PROMPT: 'Count orders by status', ANSWER: 'SELECT status, COUNT(*) FROM orders GROUP BY status;' },
]);

const result = await chat.send('Find users who signed up in the last 7 days');
// Model follows the SQL-only response pattern from the examples
```

### Pattern: Data-Grounded Analysis

```javascript
import { RagAgent } from 'ak-claude';

const analyst = new RagAgent({
  modelName: 'claude-opus-4-6', // use a smarter model for analysis
  localData: [
    { name: 'sales_q4', data: await db.query('SELECT * FROM sales WHERE quarter = 4') },
    { name: 'targets', data: await db.query('SELECT * FROM quarterly_targets') },
  ],
  systemPrompt: 'You are a business analyst. Analyze the provided data and answer questions with specific numbers.',
});

const result = await analyst.chat('Which regions missed their Q4 targets? By how much?');
```

---

## Quick Reference

### Imports

```javascript
// Named exports
import { Transformer, Chat, Message, ToolAgent, CodeAgent, RagAgent, AgentQuery, BaseClaude, log } from 'ak-claude';
import { extractJSON, attemptJSONRecovery } from 'ak-claude';

// Default export (namespace)
import AI from 'ak-claude';

// CommonJS
const { Transformer, Chat } = require('ak-claude');
```

### Constructor Options (All Classes)

| Option | Type | Default |
|---|---|---|
| `modelName` | string | `'claude-sonnet-4-6'` |
| `systemPrompt` | string \| null \| false | varies by class |
| `apiKey` | string | `ANTHROPIC_API_KEY` or `CLAUDE_API_KEY` env var (not needed with `vertexai`) |
| `vertexai` | boolean | `false` (set `true` for Vertex AI auth via ADC) |
| `vertexProjectId` | string | `GOOGLE_CLOUD_PROJECT` env var |
| `vertexRegion` | string | `'us-east5'` or `GOOGLE_CLOUD_LOCATION` env var |
| `maxTokens` | number | `8192` |
| `temperature` | number | `0.7` (ignored with thinking) |
| `topP` | number | `0.95` (ignored with thinking) |
| `topK` | number \| undefined | `undefined` |
| `thinking` | `{ type: 'enabled', budget_tokens: number }` \| null | `null` |
| `cacheSystemPrompt` | boolean | `false` |
| `enableWebSearch` | boolean | `false` |
| `webSearchConfig` | `{ max_uses?, allowed_domains?, blocked_domains? }` | `{}` |
| `maxRetries` | number | `5` (SDK-level 429 retries) |
| `healthCheck` | boolean | `false` |
| `logLevel` | string | based on `NODE_ENV` |

### Methods Available on All Classes

| Method | Returns | Description |
|---|---|---|
| `init(force?)` | `Promise<void>` | Initialize instance |
| `seed(examples, opts?)` | `Promise<Array>` | Add few-shot examples |
| `getHistory(curated?)` | `Array` | Get conversation history |
| `clearHistory()` | `Promise<void>` | Clear conversation history |
| `getLastUsage()` | `UsageData \| null` | Token usage from last call |
| `estimate(payload)` | `Promise<{ inputTokens }>` | Estimate input tokens |
| `estimateCost(payload)` | `Promise<object>` | Estimate cost in dollars |

### Class-Specific Methods

| Class | Method | Returns |
|---|---|---|
| `Message` | `send(payload, opts?)` | `{ text, data?, usage }` |
| `Chat` | `send(message, opts?)` | `{ text, usage }` |
| `Transformer` | `send(payload, opts?, validator?)` | `Object` (transformed JSON) |
| `Transformer` | `rawSend(payload)` | `Object` (no validation) |
| `Transformer` | `rebuild(payload, error)` | `Object` (AI-corrected) |
| `Transformer` | `reset()` | `Promise<void>` |
| `Transformer` | `updateSystemPrompt(prompt)` | `Promise<void>` |
| `ToolAgent` | `chat(message, opts?)` | `{ text, toolCalls, usage }` |
| `ToolAgent` | `stream(message, opts?)` | `AsyncGenerator<AgentStreamEvent>` |
| `ToolAgent` | `stop()` | `void` |
| `CodeAgent` | `chat(message, opts?)` | `{ text, codeExecutions, usage }` |
| `CodeAgent` | `stream(message, opts?)` | `AsyncGenerator<CodeAgentStreamEvent>` |
| `CodeAgent` | `dump()` | `Array<{ fileName, purpose, script, filePath }>` |
| `CodeAgent` | `stop()` | `void` |
| `RagAgent` | `chat(message, opts?)` | `{ text, citations?, usage }` |
| `RagAgent` | `stream(message, opts?)` | `AsyncGenerator<RagStreamEvent>` |
| `RagAgent` | `addLocalFiles(paths)` | `Promise<void>` |
| `RagAgent` | `addLocalData(entries)` | `Promise<void>` |
| `RagAgent` | `addMediaFiles(paths)` | `Promise<void>` |
| `RagAgent` | `getContext()` | `{ localFiles, localData, mediaFiles }` |
| `AgentQuery` | `run(prompt, opts?)` | `AsyncGenerator<message>` |
| `AgentQuery` | `resume(sessionId, prompt, opts?)` | `AsyncGenerator<message>` |
| `AgentQuery` | `lastSessionId` | `string \| null` (getter) |

### Model Pricing (per million tokens)

| Model | Input | Output |
|---|---|---|
| `claude-haiku-4-5-20251001` | $0.80 | $4.00 |
| `claude-sonnet-4-6` | $3.00 | $15.00 |
| `claude-opus-4-6` | $15.00 | $75.00 |
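
Using the rates above (assumed current --- verify against Anthropic's pricing page before relying on them), per-request cost works out as:

```javascript
// Dollar cost of one request from the per-million-token rates in the table.
const PRICING = {
  'claude-haiku-4-5-20251001': { input: 0.80, output: 4.00 },
  'claude-sonnet-4-6':         { input: 3.00, output: 15.00 },
  'claude-opus-4-6':           { input: 15.00, output: 75.00 },
};

function requestCostUSD(model, inputTokens, outputTokens) {
  const p = PRICING[model];
  if (!p) throw new Error(`Unknown model: ${model}`);
  return (inputTokens * p.input + outputTokens * p.output) / 1e6;
}

console.log(requestCostUSD('claude-sonnet-4-6', 1250, 340)); // 0.00885
```

Pair this with `getLastUsage()` token counts to log actual spend per call.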