@botpress/zai 2.1.20 → 2.2.0

package/CLAUDE.md ADDED
@@ -0,0 +1,696 @@
1
+ # Zai Library - Technical Documentation for Claude
2
+
3
+ ## Overview
4
+
5
+ **Zai** (Zui AI) is an LLM utility library built on top of Zod schemas (@bpinternal/zui) and the Botpress API (@botpress/cognitive). It provides a type-safe, production-ready abstraction layer for common AI operations with built-in features like active learning, automatic chunking, retries, and usage tracking.
6
+
7
+ **Main Entry Point**: `src/index.ts`
8
+ **Build Output**: `dist/` directory
9
+ **Package Manager**: pnpm
10
+
11
+ ## Core Architecture
12
+
13
+ ### 1. Main Classes
14
+
15
+ #### `Zai` Class (src/zai.ts)
16
+
17
+ The primary interface users interact with. Key responsibilities:
18
+
19
+ - Configuration management (model selection, namespace, active learning)
20
+ - Client wrapper around `@botpress/cognitive`
21
+ - Provides chainable API through `with()` and `learn()` methods
22
+ - Manages tokenizer initialization (WASM-based)
23
+ - Delegates to operation-specific implementations
24
+
25
+ **Key Properties**:
26
+
27
+ - `client: Cognitive` - Wrapped Botpress cognitive client
28
+ - `Model: Models` - Model identifier (e.g., 'best', 'fast', or specific model)
29
+ - `adapter: Adapter` - Storage adapter for active learning (TableAdapter or MemoryAdapter)
30
+ - `namespace: string` - Namespace for organizing tasks (default: 'zai')
31
+ - `activeLearning: ActiveLearning` - Active learning configuration
32
+
33
+ **Key Methods**:
34
+
35
+ - `with(options)` - Creates new Zai instance with merged config (for chaining)
36
+ - `learn(taskId)` - Enables active learning for specific task
37
+ - `callModel(props)` - Internal method to invoke cognitive API
38
+ - `getTokenizer()` - Lazy-loads WASM tokenizer with retry logic
39
+
40
+ #### `ZaiContext` Class (src/context.ts)
41
+
42
+ Request execution context that tracks a single operation's lifecycle:
43
+
44
+ - Wraps Cognitive client with event listeners
45
+ - Tracks usage metrics (tokens, cost, latency, requests)
46
+ - Manages AbortController for cancellation
47
+ - Emits progress events during execution
48
+ - Handles retry logic and error recovery
49
+
50
+ **Key Features**:
51
+
52
+ - Clones cognitive client per operation for isolation
53
+ - Automatic retry with error feedback to LLM (up to maxRetries)
54
+ - Injects metadata (integrationName, promptCategory, promptSource)
55
+ - Real-time usage tracking via event emitters
56
+
57
+ #### `Response` Class (src/response.ts)
58
+
59
+ Promise-like wrapper that adds observability and control:
60
+
61
+ - Implements `PromiseLike` interface for `await` compatibility
62
+ - Event emitter for progress/complete/error events
63
+ - Dual value system: simplified value (for await) and full result
64
+ - Signal binding for external abort control
65
+ - Result caching with elapsed time tracking
66
+
67
+ **Simplification Pattern**:
68
+
69
+ ```typescript
70
+ // Full result
71
+ const full = await response.result() // { output, usage, elapsed }
72
+
73
+ // Simplified (default await)
74
+ const simple = await response // Just the value (e.g., boolean for check)
75
+ ```
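To make the dual value system concrete, here is a minimal sketch of a promise-like wrapper that resolves to a simplified value on `await` while exposing the full result through `result()`. The class name `SimplifiedResponse` and its constructor arguments are hypothetical and chosen for illustration; the real `Response` class also carries usage data, events, and abort control, all omitted here.

```typescript
// Hypothetical sketch, not the library's Response class.
type FullResult<T> = { output: T; elapsed: number }

class SimplifiedResponse<T, S> implements PromiseLike<S> {
  private readonly promise: Promise<T>
  private readonly simplify: (output: T) => S
  private readonly startedAt: number

  constructor(promise: Promise<T>, simplify: (output: T) => S) {
    this.promise = promise
    this.simplify = simplify
    this.startedAt = Date.now()
  }

  // `await response` calls then(), so awaiting yields the simplified value.
  then<R1 = S, R2 = never>(
    onfulfilled?: ((value: S) => R1 | PromiseLike<R1>) | null,
    onrejected?: ((reason: any) => R2 | PromiseLike<R2>) | null
  ): PromiseLike<R1 | R2> {
    return this.promise.then((output) => this.simplify(output)).then(onfulfilled, onrejected)
  }

  // result() keeps the full output and adds the elapsed time in milliseconds.
  result(): Promise<FullResult<T>> {
    return this.promise.then((output) => ({ output, elapsed: Date.now() - this.startedAt }))
  }
}

// Usage: a check-like operation whose full output simplifies to a boolean.
const checkResponse = new SimplifiedResponse(
  Promise.resolve({ value: true, explanation: 'The text mentions a refund.' }),
  (output) => output.value
)
```

Because the wrapper only implements `then()`, it works with `await` without subclassing `Promise`, which leaves room for extra methods such as `result()`.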
76
+
77
+ #### `EventEmitter` Class (src/emitter.ts)
78
+
79
+ Lightweight typed event emitter used throughout the library:
80
+
81
+ - Type-safe event dispatch and subscription
82
+ - Supports `on()`, `once()`, `off()`, `emit()`, `clear()`
83
+ - No external dependencies
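As an illustration of the emitter described above, the following sketch shows one way a dependency-free typed emitter with `on()`, `once()`, `off()`, `emit()`, and `clear()` could look. The class name `TypedEmitter` and the event names in the usage part are assumptions made for this example; the implementation in `src/emitter.ts` may be structured differently.

```typescript
// Hypothetical sketch of a typed emitter; the library's own class may differ.
type Listener<T> = (payload: T) => void

class TypedEmitter<Events extends Record<string, unknown>> {
  // Stored loosely; the generic method signatures keep call sites type-safe.
  private listeners = new Map<keyof Events, Array<Listener<any>>>()

  on<K extends keyof Events>(event: K, listener: Listener<Events[K]>): void {
    this.listeners.set(event, [...(this.listeners.get(event) ?? []), listener])
  }

  once<K extends keyof Events>(event: K, listener: Listener<Events[K]>): void {
    const wrapper: Listener<Events[K]> = (payload) => {
      this.off(event, wrapper)
      listener(payload)
    }
    this.on(event, wrapper)
  }

  off<K extends keyof Events>(event: K, listener: Listener<Events[K]>): void {
    const remaining = (this.listeners.get(event) ?? []).filter((l) => l !== listener)
    this.listeners.set(event, remaining)
  }

  emit<K extends keyof Events>(event: K, payload: Events[K]): void {
    // on() and off() replace the array, so this snapshot is safe to iterate.
    for (const listener of this.listeners.get(event) ?? []) listener(payload)
  }

  clear(): void {
    this.listeners.clear()
  }
}

// Usage with two hypothetical events.
type DemoEvents = { progress: number; complete: string }
const emitter = new TypedEmitter<DemoEvents>()
const seen: number[] = []
const onProgress = (percent: number) => {
  seen.push(percent)
}
let completions = 0

emitter.on('progress', onProgress)
emitter.once('complete', () => {
  completions += 1
})
emitter.emit('progress', 0.5)
emitter.emit('complete', 'done')
emitter.emit('complete', 'done again') // ignored: the once() listener is gone
emitter.off('progress', onProgress)
emitter.emit('progress', 1) // ignored: the listener was removed
```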
84
+
85
+ ### 2. Adapters (Active Learning Storage)
86
+
87
+ #### `Adapter` Abstract Class (src/adapters/adapter.ts)
88
+
89
+ Defines interface for storing and retrieving learning examples:
90
+
91
+ - `getExamples<TInput, TOutput>(props)` - Retrieve similar examples
92
+ - `saveExample<TInput, TOutput>(props)` - Store new examples
93
+
94
+ #### `TableAdapter` (src/adapters/botpress-table.ts)
95
+
96
+ Botpress Table API implementation for persistent storage:
97
+
98
+ - Creates/validates table schema on first use
99
+ - Stores examples with metadata (cost, tokens, model, latency)
100
+ - Supports similarity search via table search API
101
+ - Schema includes: taskType, taskId, key, input, output, explanation, metadata, status, feedback
102
+ - Only retrieves 'approved' status examples
103
+ - Handles schema validation and migration checking
104
+
105
+ **Table Schema**:
106
+
107
+ ```typescript
108
+ {
108
+   taskType: string // e.g., 'zai.extract'
109
+   taskId: string // e.g., 'zai/sentiment-analysis'
110
+   key: string // Hash of input + taskId + taskType + instructions
111
+   instructions: string
112
+   input: Record // Searchable
113
+   output: Record
114
+   explanation: string | null
115
+   metadata: {
116
+     model: string
117
+     cost: { input, output }
118
+     latency: number
119
+     tokens: { input, output }
120
+   }
121
+   status: 'pending' | 'rejected' | 'approved'
122
+   feedback: { rating, comment } | null
123
+ }
125
+ ```
126
+
127
+ #### `MemoryAdapter` (src/adapters/memory.ts)
128
+
129
+ No-op implementation for when active learning is disabled:
130
+
131
+ - Returns empty examples array
132
+ - Does not persist anything
133
+
134
+ ### 3. Operations
135
+
136
+ All operations follow a similar pattern:
137
+
138
+ 1. Parse and validate options using Zod schemas
139
+ 2. Create `ZaiContext` for the operation
140
+ 3. Execute async operation function
141
+ 4. Wrap in `Response` with simplification function
142
+ 5. Optionally save examples to adapter
143
+
144
+ #### Extract Operation (src/operations/extract.ts)
145
+
146
+ **Purpose**: Extract structured data from unstructured text using Zod schemas
147
+
148
+ **Key Features**:
149
+
150
+ - Supports objects, arrays of objects, and primitive types
151
+ - Automatic schema wrapping for non-objects
152
+ - Multi-chunk processing for large inputs (parallel with p-limit)
153
+ - Recursive merging of chunked results
154
+ - JSON repair and parsing (json5, jsonrepair)
155
+ - Few-shot learning with examples
156
+ - Strict/non-strict mode
157
+
158
+ **Special Markers**:
159
+
160
+ - `■json_start■` / `■json_end■` - JSON boundaries
161
+ - `■NO_MORE_ELEMENT■` - Signals completion for arrays
162
+ - `■ZERO_ELEMENTS■` - Empty array indicator
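A short sketch of how output delimited by these markers could be parsed. The helper name `extractJsonBlocks` is hypothetical, and it uses plain `JSON.parse` for brevity, whereas the library is described above as relying on `json5` and `jsonrepair` for more tolerant parsing.

```typescript
// Hypothetical helper: collect every JSON payload found between the markers.
const JSON_START = '■json_start■'
const JSON_END = '■json_end■'

function extractJsonBlocks(output: string): unknown[] {
  const results: unknown[] = []
  let cursor = 0
  while (true) {
    const start = output.indexOf(JSON_START, cursor)
    if (start === -1) break
    const bodyStart = start + JSON_START.length
    const end = output.indexOf(JSON_END, bodyStart)
    // An unterminated block is ignored in this sketch.
    if (end === -1) break
    results.push(JSON.parse(output.slice(bodyStart, end).trim()))
    cursor = end + JSON_END.length
  }
  return results
}

const modelOutput = [
  '■json_start■{"name":"Ann","age":30}■json_end■',
  '■json_start■{"name":"Bob","age":41}■json_end■',
  '■NO_MORE_ELEMENT■',
].join('\n')
const blocks = extractJsonBlocks(modelOutput)
```

In a fuller version, each parsed block would then be validated against the target schema before being merged.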
163
+
164
+ **Chunking Strategy**:
165
+
166
+ 1. If input exceeds chunkLength, split into chunks
167
+ 2. Process chunks in parallel (max 10 concurrent)
168
+ 3. Recursively merge results into final schema
169
+ 4. Handles conflicting data by taking most frequent/reasonable value
170
+
171
+ **Example Flow**:
172
+
173
+ ```typescript
174
+ zai.extract(text, z.object({ name: z.string(), age: z.number() }))
175
+ → Context creation
176
+ → Tokenize + chunk if needed
177
+ → Generate prompt with examples
178
+ → LLM extraction with JSON markers
179
+ → Parse + validate with schema
180
+ → Save example (if learning enabled)
181
+ → Return via Response wrapper
182
+ ```
183
+
184
+ #### Check Operation (src/operations/check.ts)
185
+
186
+ **Purpose**: Boolean condition verification with explanation
187
+
188
+ **Return Type**: `{ value: boolean, explanation: string }` (simplified to `boolean`)
189
+
190
+ **Markers**: `■TRUE■`, `■FALSE■`, `■END■`
191
+
192
+ **Handling Ambiguity**: If both TRUE and FALSE appear, uses the last occurrence
193
+
194
+ **Example Storage**: Stores boolean output with explanation for future reference
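The ambiguity rule can be sketched as follows. The function name `parseCheckOutput` is hypothetical, and the way the explanation is recovered (everything before `■END■` with the markers stripped) is an assumption made for this example rather than confirmed library behavior.

```typescript
// Hypothetical parser for check output; a sketch under the stated assumptions.
const TRUE_MARKER = '■TRUE■'
const FALSE_MARKER = '■FALSE■'
const END_MARKER = '■END■'

function parseCheckOutput(output: string): { value: boolean; explanation: string } {
  const endIndex = output.indexOf(END_MARKER)
  const answer = endIndex === -1 ? output : output.slice(0, endIndex)

  const lastTrue = answer.lastIndexOf(TRUE_MARKER)
  const lastFalse = answer.lastIndexOf(FALSE_MARKER)
  if (lastTrue === -1 && lastFalse === -1) {
    throw new Error('Expected ■TRUE■ or ■FALSE■ in the model output')
  }

  // When both markers appear, the one that occurs last wins.
  const value = lastTrue > lastFalse
  const explanation = answer.split(TRUE_MARKER).join('').split(FALSE_MARKER).join('').trim()
  return { value, explanation }
}

const ambiguous = parseCheckOutput(
  'The text mentions a refund. ■TRUE■ On reflection it only mentions an exchange. ■FALSE■ ■END■'
)
```

Throwing when no marker is present fits the retry mechanism described later in this document, where transform errors are fed back to the LLM.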
195
+
196
+ #### Label Operation (src/operations/label.ts)
197
+
198
+ **Purpose**: Multi-label classification with confidence levels
199
+
200
+ **Labels**:
201
+
202
+ - `ABSOLUTELY_NOT` (confidence: 1, value: false)
203
+ - `PROBABLY_NOT` (confidence: 0.5, value: false)
204
+ - `AMBIGUOUS` (confidence: 0, value: false)
205
+ - `PROBABLY_YES` (confidence: 0.5, value: true)
206
+ - `ABSOLUTELY_YES` (confidence: 1, value: true)
207
+
208
+ **Return Type**:
209
+
210
+ ```typescript
211
+ Record<
212
+   LabelKey,
213
+   {
214
+     explanation: string
215
+     value: boolean
216
+     confidence: number
217
+   }
218
+ >
219
+ ```
220
+
221
+ Simplified to `Record<LabelKey, boolean>`
222
+
223
+ **Format**: `■label:【explanation】:LABEL_VALUE■`
224
+
225
+ **Chunking**: For large inputs, processes in chunks and merges with OR logic (any true → true)
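The output format, the label table, and the OR merge can be combined in a small sketch. The function names `parseLabels` and `mergeChunks` and the regular expression are hypothetical; only the label-to-confidence mapping and the marker format come from the description above.

```typescript
// Hypothetical parser and merger for label output; not the library's code.
type LabelValue = 'ABSOLUTELY_NOT' | 'PROBABLY_NOT' | 'AMBIGUOUS' | 'PROBABLY_YES' | 'ABSOLUTELY_YES'
type LabelResult = { explanation: string; value: boolean; confidence: number }

// Mapping taken from the label table above.
const LABELS: Record<LabelValue, { value: boolean; confidence: number }> = {
  ABSOLUTELY_NOT: { value: false, confidence: 1 },
  PROBABLY_NOT: { value: false, confidence: 0.5 },
  AMBIGUOUS: { value: false, confidence: 0 },
  PROBABLY_YES: { value: true, confidence: 0.5 },
  ABSOLUTELY_YES: { value: true, confidence: 1 },
}

function isLabelValue(label: string): label is LabelValue {
  return Object.prototype.hasOwnProperty.call(LABELS, label)
}

function parseLabels(output: string): Record<string, LabelResult> {
  const results: Record<string, LabelResult> = {}
  // Matches ■label:【explanation】:LABEL_VALUE (the closing ■ may be shared).
  const pattern = /■([^:■]+):【([^】]*)】:([A-Z_]+)/g
  let match: RegExpExecArray | null
  while ((match = pattern.exec(output)) !== null) {
    const key = match[1].trim()
    const label = match[3]
    if (isLabelValue(label)) {
      results[key] = { explanation: match[2].trim(), ...LABELS[label] }
    }
  }
  return results
}

function mergeChunks(chunks: Array<Record<string, LabelResult>>): Record<string, LabelResult> {
  const merged: Record<string, LabelResult> = {}
  for (const chunk of chunks) {
    for (const key of Object.keys(chunk)) {
      const existing = merged[key]
      // OR logic: once any chunk says true, the label stays true.
      if (!existing || (!existing.value && chunk[key].value)) merged[key] = chunk[key]
    }
  }
  return merged
}

const firstChunk = parseLabels('■is_spam:【No promotional language】:PROBABLY_NOT■')
const secondChunk = parseLabels('■is_spam:【Mentions a prize and a link】:ABSOLUTELY_YES■')
const combined = mergeChunks([firstChunk, secondChunk])
```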
226
+
227
+ #### Rewrite Operation (src/operations/rewrite.ts)
228
+
229
+ **Purpose**: Transform text according to instructions
230
+
231
+ **Use Cases**:
232
+
233
+ - Translation
234
+ - Tone adjustment
235
+ - Format conversion
236
+ - Content modification
237
+
238
+ **Markers**: `■START■`, `■END■`
239
+
240
+ **Length Control**: Optionally enforces token length limits
241
+
242
+ **Examples**: Supports custom examples for format learning
243
+
244
+ #### Filter Operation (src/operations/filter.ts)
245
+
246
+ **Purpose**: Filter array elements based on natural language condition
247
+
248
+ **Strategy**:
249
+
250
+ - Chunks arrays (max 50 items per chunk, max tokens per chunk)
251
+ - Processes chunks in parallel (max 10 concurrent)
252
+ - Returns filtered subset
253
+
254
+ **Format**: `■0:true■1:false■2:true` (indices with boolean decisions)
255
+
256
+ **Token Budget Allocation**:
257
+
258
+ - 50% for examples
259
+ - 25% for condition
260
+ - Remainder for input array
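A minimal sketch of how the indexed decision format could be parsed and applied to one chunk. The names `parseFilterOutput` and `applyFilter` are hypothetical, and defaulting missing indices to `false` is an assumption made for this example.

```typescript
// Hypothetical parser for filter output such as "■0:true■1:false■2:true".
function parseFilterOutput(output: string, itemCount: number): boolean[] {
  // Assumption: an index the model skipped is treated as "filtered out".
  const decisions: boolean[] = new Array<boolean>(itemCount).fill(false)
  const pattern = /■(\d+):(true|false)/g
  let match: RegExpExecArray | null
  while ((match = pattern.exec(output)) !== null) {
    const index = Number(match[1])
    if (index < itemCount) decisions[index] = match[2] === 'true'
  }
  return decisions
}

function applyFilter<T>(items: T[], decisions: boolean[]): T[] {
  return items.filter((_, index) => decisions[index] === true)
}

const fruits = ['apple', 'rock', 'banana']
const decisions = parseFilterOutput('■0:true■1:false■2:true', fruits.length)
const kept = applyFilter(fruits, decisions)
```

Since each chunk numbers its items from zero, results from parallel chunks can be concatenated in chunk order to rebuild the filtered array.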
261
+
262
+ #### Text Operation (src/operations/text.ts)
263
+
264
+ **Purpose**: Generate text content based on prompt
265
+
266
+ **Features**:
267
+
268
+ - Direct text generation
269
+ - Length constraints with enforcement
270
+ - Token-to-word approximation table for short texts
271
+ - Higher temperature (0.7) for creativity
272
+
273
+ **Simplest Operation**: No complex parsing, just prompt → text
274
+
275
+ #### Summarize Operation (src/operations/summarize.ts)
276
+
277
+ **Purpose**: Summarize documents of any length to target length
278
+
279
+ **Strategies**:
280
+
281
+ 1. **Sliding Window**: For moderate documents
282
+
283
+ - Iteratively processes overlapping windows
284
+ - Updates summary incrementally
285
+ - Final pass ensures target length
286
+
287
+ 2. **Merge Sort**: For very large documents
288
+ - Recursively splits into sub-chunks
289
+ - Summarizes each independently (parallel)
290
+ - Merges summaries bottom-up
291
+
292
+ **Options**:
293
+
294
+ - `length`: Target token count
295
+ - `intermediateFactor`: Allows intermediate summaries to be longer (default: 4x)
296
+ - `sliding.window`: Window size for sliding strategy
297
+ - `sliding.overlap`: Overlap between windows
298
+ - `prompt`: What to focus on
299
+ - `format`: Output formatting instructions
300
+
301
+ **Markers**: `■START■`, `■END■`
302
+
303
+ ### 4. Utilities
304
+
305
+ #### src/utils.ts
306
+
307
+ - `stringify(input, beautify)` - Converts any input to string (handles null/undefined)
308
+ - `fastHash(str)` - Simple 32-bit hash for cache keys
309
+ - `takeUntilTokens(arr, tokens, count)` - Takes items until token budget exhausted
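The two helpers `fastHash` and `takeUntilTokens` can be approximated as shown below. This is a sketch under assumptions: the hash is a common 32-bit string hash and may not match the algorithm in `src/utils.ts`, the hexadecimal return format is invented for the example, and `count` is assumed to be a function returning the token cost of one item.

```typescript
// Hypothetical 32-bit string hash; the library's algorithm may differ.
function fastHash(str: string): string {
  let hash = 0
  for (let i = 0; i < str.length; i++) {
    hash = (hash << 5) - hash + str.charCodeAt(i)
    hash |= 0 // keep the value within signed 32-bit range
  }
  return (hash >>> 0).toString(16)
}

// Takes items in order until adding the next one would exceed the budget.
function takeUntilTokens<T>(items: T[], maxTokens: number, count: (item: T) => number): T[] {
  const taken: T[] = []
  let used = 0
  for (const item of items) {
    const cost = count(item)
    if (used + cost > maxTokens) break
    taken.push(item)
    used += cost
  }
  return taken
}

// Usage: string length stands in for a real token count.
const cacheKey = fastHash('zai.check' + 'zai/my-task' + 'some input' + 'some instructions')
const withinBudget = takeUntilTokens(['aa', 'bbb', 'c'], 5, (s) => s.length)
```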
310
+
311
+ #### src/tokenizer.ts
312
+
313
+ - Lazy-loads `@bpinternal/thicktoken` WASM tokenizer
314
+ - Retry logic for WASM initialization race conditions
315
+ - Singleton pattern for tokenizer instance
316
+
317
+ #### src/operations/constants.ts
318
+
319
+ - `PROMPT_INPUT_BUFFER = 1048` - Safety buffer for input token calculations
320
+ - `PROMPT_OUTPUT_BUFFER = 512` - Safety buffer for output token calculations
321
+
322
+ #### src/operations/errors.ts
323
+
324
+ - `JsonParsingError` - Specialized error for JSON parsing failures
325
+ - Formats Zod validation errors in human-readable way
326
+ - Shows JSON excerpt and specific validation issues
327
+
328
+ ## Token Budget Management
329
+
330
+ All operations carefully manage token budgets to stay within model limits:
331
+
332
+ ```typescript
333
+ const PROMPT_COMPONENT = model.input.maxTokens - PROMPT_INPUT_BUFFER
334
+
335
+ // Typical allocation strategy:
336
+ {
337
+   input: 50% of PROMPT_COMPONENT,
338
+   condition/instruction: 20% of PROMPT_COMPONENT,
339
+   examples: 30% of PROMPT_COMPONENT,
340
+ }
341
+ ```
342
+
343
+ Chunking triggers when:
344
+
345
+ - Input exceeds configured `chunkLength`
346
+ - Calculated budget exceeded
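The allocation above is written as pseudocode; a runnable approximation might look like the following. The function names `allocateBudget` and `needsChunking` are hypothetical, and the 50/20/30 split is the "typical" strategy quoted above, which individual operations may vary.

```typescript
// Hypothetical budget helper based on the typical allocation described above.
const PROMPT_INPUT_BUFFER = 1048 // value listed under src/operations/constants.ts

type Budget = { input: number; instruction: number; examples: number }

function allocateBudget(maxInputTokens: number): Budget {
  const promptComponent = Math.max(maxInputTokens - PROMPT_INPUT_BUFFER, 0)
  // Integer percentages avoid floating-point drift in the split.
  const share = (percent: number) => Math.floor((promptComponent * percent) / 100)
  return { input: share(50), instruction: share(20), examples: share(30) }
}

// Chunking triggers when either the configured length or the budget is exceeded.
function needsChunking(inputTokens: number, chunkLength: number, budget: Budget): boolean {
  return inputTokens > chunkLength || inputTokens > budget.input
}

// Usage with a hypothetical model that accepts 11,048 input tokens.
const budget = allocateBudget(11048)
```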
347
+
348
+ ## Active Learning Flow
349
+
350
+ When enabled (`activeLearning.enable = true`):
351
+
352
+ 1. **Task Execution**:
353
+
354
+ - Generate unique key: `fastHash(taskType + taskId + input + instructions)`
355
+ - Check adapter for exact match (cache hit)
356
+ - If no match, retrieve similar examples via `adapter.getExamples()`
357
+ - Execute LLM operation with examples as few-shot learning
358
+ - Save result to adapter if not aborted
359
+
360
+ 2. **Example Retrieval**:
361
+
362
+ - Adapter searches by similarity (semantic search via Table API)
363
+ - Only returns 'approved' status examples
364
+ - Limited to top 10 results
365
+ - Filtered by token budget (takeUntilTokens)
366
+
367
+ 3. **Example Format**:
368
+
369
+ - Each operation formats examples differently
370
+ - Generally: User message (input + context) → Assistant message (expected output)
371
+ - Includes metadata for tracking cost/performance
372
+
373
+ 4. **Learning Curve**:
374
+ - First calls: No stored examples yet (falls back to default examples, if the operation has any)
375
+ - Subsequent calls: Uses approved examples as guidance
376
+ - Improves format consistency and accuracy over time
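The example retrieval rules listed in step 2 (approved only, top 10, token budget) can be sketched as a pure function. Everything here is illustrative: the `StoredExample` shape, the `selectExamples` name, and the word-counting stand-in for the tokenizer are assumptions, and in the real library the similarity search is delegated to the Table API.

```typescript
// Hypothetical selection logic mirroring the retrieval rules described above.
type ExampleStatus = 'pending' | 'rejected' | 'approved'
type StoredExample = { key: string; input: string; output: string; status: ExampleStatus; similarity: number }

// Stand-in tokenizer: counts whitespace-separated words instead of real tokens.
const countTokens = (text: string): number => text.split(/\s+/).filter(Boolean).length

function selectExamples(candidates: StoredExample[], tokenBudget: number, maxResults = 10): StoredExample[] {
  const ranked = candidates
    .filter((example) => example.status === 'approved')
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, maxResults)

  const selected: StoredExample[] = []
  let used = 0
  for (const example of ranked) {
    const cost = countTokens(example.input) + countTokens(example.output)
    if (used + cost > tokenBudget) break
    selected.push(example)
    used += cost
  }
  return selected
}

const candidates: StoredExample[] = [
  { key: 'a', input: 'great product', output: 'positive', status: 'approved', similarity: 0.7 },
  { key: 'b', input: 'terrible support experience', output: 'negative', status: 'approved', similarity: 0.9 },
  { key: 'c', input: 'it is fine', output: 'neutral', status: 'pending', similarity: 0.95 },
]
const picked = selectExamples(candidates, 6)
```

With a budget of 6 words only example `b` fits, because the pending example is excluded and adding `a` would exceed the budget.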
377
+
378
+ ## Error Handling
379
+
380
+ ### Retry Mechanism (ZaiContext)
381
+
382
+ ```typescript
383
+ const maxRetries = 3 // default
384
+ for (let attempt = 0; attempt <= maxRetries; attempt++) {
385
+   try {
386
+     const response = await cognitive.generateContent(...)
387
+     return transform(response)
388
+   } catch (error) {
389
+     if (attempt === maxRetries) throw error
390
+     // Feed the error back as a user message so the LLM can correct its output
391
+     messages.push({ role: 'user', content: ERROR_PARSING_OUTPUT })
392
+   }
393
+ }
394
+ ```
395
+
396
+ ### Transform Errors
397
+
398
+ - Operations throw errors in the transform function when the output is invalid
399
+ - Error message fed back to LLM with context
400
+ - Common issues: missing markers, invalid JSON, wrong format
401
+
402
+ ### Abort Handling
403
+
404
+ - All operations check `ctx.controller.signal.throwIfAborted()`
405
+ - Examples not saved if aborted
406
+ - Clean abort via Response.abort() or signal binding
407
+
408
+ ## Usage Tracking
409
+
410
+ ### Metrics Collected (Usage type)
411
+
412
+ ```typescript
413
+ {
414
+   requests: {
415
+     requests: number // Total requests initiated
416
+     errors: number // Failed requests
417
+     responses: number // Successful responses
418
+     cached: number // Cached responses (no tokens used)
419
+     percentage: number // Completion percentage
420
+   },
421
+   cost: {
422
+     input: number // USD cost for input tokens
423
+     output: number // USD cost for output tokens
424
+     total: number // Total cost
425
+   },
426
+   tokens: {
427
+     input: number // Input tokens consumed
428
+     output: number // Output tokens generated
429
+     total: number // Total tokens
430
+   }
431
+ }
432
+ ```
433
+
434
+ ### Access Patterns
435
+
436
+ ```typescript
437
+ // During execution (progress events)
438
+ response.on('progress', (usage) => {
439
+   console.log(usage.tokens.total)
440
+ })
441
+
442
+ // After completion
443
+ const { output, usage, elapsed } = await response.result()
444
+ ```
445
+
446
+ ### Metadata Stored with Examples
447
+
448
+ ```typescript
449
+ {
450
+   model: string // Model used
451
+   cost: {
452
+     input, output
453
+   }
454
+   latency: number // ms
455
+   tokens: {
456
+     input, output
457
+   }
458
+ }
459
+ ```
460
+
461
+ ## Configuration
462
+
463
+ ### ZaiConfig
464
+
465
+ ```typescript
466
+ {
467
+   client: BotpressClientLike | Cognitive // Required
468
+   userId?: string // For tracking/attribution
469
+   modelId?: Models // 'best' | 'fast' | 'provider:model'
470
+   activeLearning?: {
471
+     enable: boolean
472
+     tableName: string // Must match /^[A-Za-z0-9_/-]{1,100}Table$/
473
+     taskId: string // Must match /^[A-Za-z0-9_/-]{1,100}$/
474
+   }
475
+   namespace?: string // Default: 'zai'
476
+ }
477
+ ```
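The two patterns quoted in the config comments can be checked directly. The sketch below uses a hypothetical `validateActiveLearning` helper and invented error messages; the library itself is described as validating options with Zod schemas, so this only illustrates what the regular expressions accept.

```typescript
// Patterns copied from the ZaiConfig comments above.
const TABLE_NAME_PATTERN = /^[A-Za-z0-9_/-]{1,100}Table$/
const TASK_ID_PATTERN = /^[A-Za-z0-9_/-]{1,100}$/

type ActiveLearningConfig = { enable: boolean; tableName: string; taskId: string }

// Hypothetical validator returning a list of human-readable problems.
function validateActiveLearning(config: ActiveLearningConfig): string[] {
  const errors: string[] = []
  if (!TABLE_NAME_PATTERN.test(config.tableName)) {
    errors.push(`Invalid tableName "${config.tableName}": it must end with "Table"`)
  }
  if (!TASK_ID_PATTERN.test(config.taskId)) {
    errors.push(`Invalid taskId "${config.taskId}": only letters, digits, "_", "/" and "-" are allowed`)
  }
  return errors
}

const validConfig = validateActiveLearning({
  enable: true,
  tableName: 'ActiveLearningTable',
  taskId: 'zai/sentiment-analysis',
})
const invalidConfig = validateActiveLearning({ enable: true, tableName: 'examples', taskId: 'bad id!' })
```

Note that the table name pattern requires at least one character before the `Table` suffix, so the bare name `Table` is rejected.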
478
+
479
+ ### Model Selection
480
+
481
+ - `'best'` - Best available model (default)
482
+ - `'fast'` - Fastest/cheapest model
483
+ - `'provider:model'` - Specific model (e.g., 'openai:gpt-4')
484
+
485
+ Model details fetched lazily via `cognitive.getModelDetails()`
486
+
487
+ ## Testing
488
+
489
+ Test files located in `e2e/` directory:
490
+
491
+ - Uses Vitest framework
492
+ - Real API calls to Botpress (requires .env with credentials)
493
+ - Snapshot testing for validation
494
+ - Includes active learning tests with table cleanup
495
+
496
+ **Key Test Utilities** (e2e/utils.ts):
497
+
498
+ - `getCachedClient()` - Reuses cognitive client across tests
499
+ - `getZai()` - Creates Zai instance
500
+ - `getClient()` - Gets raw Botpress client
501
+ - Loads `BotpressDocumentation` for large document tests
502
+
503
+ ## Build System
504
+
505
+ - **TypeScript**: Compiled with tsup (types) and custom esbuild script (build.ts)
506
+ - **Type Generation**: `tsup` generates .d.ts files
507
+ - **Neutral Build**: `ts-node -T ./build.ts` for platform-neutral JS
508
+ - **Size Limit**: Max 50 kB (enforced by size-limit)
509
+ - **Peer Dependencies**: @bpinternal/thicktoken, @bpinternal/zui
510
+
511
+ ## Extension Points
512
+
513
+ ### Adding New Operations
514
+
515
+ 1. **Create operation file**: `src/operations/my-operation.ts`
516
+
517
+ 2. **Declare module augmentation**:
518
+
519
+ ```typescript
520
+ declare module '@botpress/zai' {
521
+   interface Zai {
522
+     myOperation(input: T, options?: Options): Response<Output, Simplified>
523
+   }
524
+ }
525
+ ```
526
+
527
+ 3. **Implement operation function**:
528
+
529
+ ```typescript
530
+ const myOperation = async (input: T, options: Options, ctx: ZaiContext): Promise<Output> => {
531
+   // Implementation
532
+ }
533
+ ```
534
+
535
+ 4. **Add prototype method**:
536
+
537
+ ```typescript
538
+ Zai.prototype.myOperation = function (input, options) {
539
+   const context = new ZaiContext({
540
+     client: this.client,
541
+     modelId: this.Model,
542
+     taskId: this.taskId,
543
+     taskType: 'zai.myOperation',
544
+     adapter: this.adapter,
545
+   })
546
+
547
+   return new Response(context, myOperation(input, options, context), simplify)
548
+ }
549
+ ```
550
+
551
+ 5. **Import in src/index.ts**: `import './operations/my-operation'`
552
+
553
+ ### Custom Adapters
554
+
555
+ Implement `Adapter` abstract class:
556
+
557
+ ```typescript
558
+ export class MyAdapter extends Adapter {
559
+   async getExamples<TInput, TOutput>(props: GetExamplesProps<TInput>) {
560
+     // Return array of { key, input, output, explanation?, similarity }
561
+   }
562
+
563
+   async saveExample<TInput, TOutput>(props: SaveExampleProps<TInput, TOutput>) {
564
+     // Persist example
565
+   }
566
+ }
567
+ ```
568
+
569
+ ## Dependencies
570
+
571
+ ### Runtime
572
+
573
+ - `@botpress/cognitive` (0.1.50) - Core LLM client
574
+ - `json5` (^2.2.3) - Relaxed JSON parsing
575
+ - `jsonrepair` (^3.10.0) - Fix malformed JSON
576
+ - `lodash-es` (^4.17.21) - Utilities (chunk, isArray, clamp)
577
+ - `p-limit` (^7.2.0) - Concurrency control
578
+
579
+ ### Peer Dependencies
580
+
581
+ - `@bpinternal/thicktoken` (^1.0.0) - WASM tokenizer
582
+ - `@bpinternal/zui` (^1.2.2) - Zod wrapper with transforms
583
+
584
+ ### Dev Dependencies
585
+
586
+ - `@botpress/client` (workspace) - Botpress API client
587
+ - `@botpress/common` (workspace) - Shared utilities
588
+ - `@botpress/vai` (workspace) - Validation utilities
589
+ - `tsup`, `esbuild` - Build tools
590
+ - `vitest` - Testing framework
591
+
592
+ ## Common Patterns
593
+
594
+ ### Chaining Configuration
595
+
596
+ ```typescript
597
+ const result = await zai.with({ modelId: 'fast' }).learn('my-task').extract(text, schema)
598
+ ```
599
+
600
+ ### Abort Control
601
+
602
+ ```typescript
603
+ const controller = new AbortController()
604
+ const response = zai.check(text, condition)
605
+ response.bindSignal(controller.signal)
606
+
607
+ setTimeout(() => controller.abort(), 5000)
608
+ ```
609
+
610
+ ### Progress Tracking
611
+
612
+ ```typescript
613
+ const response = zai.summarize(longDoc)
614
+ response.on('progress', (usage) => {
615
+   console.log(`Progress: ${usage.requests.percentage * 100}%`)
616
+ })
617
+ const summary = await response
618
+ ```
619
+
620
+ ### Detailed Results
621
+
622
+ ```typescript
623
+ const { output, usage, elapsed } = await zai.extract(text, schema).result()
624
+ console.log(`Took ${elapsed}ms, used ${usage.tokens.total} tokens, cost $${usage.cost.total}`)
625
+ ```
626
+
627
+ ## Debugging Tips
628
+
629
+ 1. **Enable request logging**:
630
+
631
+ ```typescript
632
+ cognitive.on('request', (req) => console.log(req.input))
633
+ cognitive.on('response', (req, res) => console.log(res.output))
634
+ ```
635
+
636
+ 2. **Check token counts**:
637
+
638
+ ```typescript
639
+ const tokenizer = await getTokenizer()
640
+ console.log(tokenizer.count(text))
641
+ ```
642
+
643
+ 3. **Inspect examples**:
644
+
645
+ ```typescript
646
+ const examples = await adapter.getExamples({ taskType, taskId, input })
647
+ console.log(examples)
648
+ ```
649
+
650
+ 4. **Monitor retries**: Watch for multiple requests in usage stats
651
+
652
+ ```typescript
653
+ const { usage } = await response.result()
654
+ if (usage.requests.requests > usage.requests.responses) {
655
+   console.warn('Retries occurred')
656
+ }
657
+ ```
658
+
659
+ ## Performance Considerations
660
+
661
+ - **Chunking**: Use smaller chunks for better parallelization, larger for better context
662
+ - **Concurrency**: Limited to 10 parallel operations (p-limit)
663
+ - **Caching**: Active learning provides cache via exact key matches
664
+ - **Token Estimation**: Tokenizer used for accurate counting, not char-based estimation
665
+ - **Model Selection**: The 'fast' model is significantly cheaper but produces lower-quality output
666
+
667
+ ## Security Notes
668
+
669
+ - Input validation via Zod schemas
670
+ - No arbitrary code execution
671
+ - Table names/taskIds validated with regex
672
+ - Frozen table schema prevents accidental modifications
673
+ - No sensitive data in default table tags
674
+
675
+ ## Known Issues & Limitations
676
+
677
+ 1. **WASM Loading**: Tokenizer requires retry logic due to race condition
678
+ 2. **Table Search**: Limited to 1024 characters for search query
679
+ 3. **Chunk Merging**: May lose information if chunks have conflicting data
680
+ 4. **Max Retries**: Fixed at 3, not configurable per operation
681
+ 5. **Concurrency**: Fixed at 10 parallel operations
682
+ 6. **Schema Changes**: Table adapter doesn't auto-migrate schemas
683
+
684
+ ## Future Enhancement Ideas
685
+
686
+ - Configurable retry strategies
687
+ - Custom similarity functions for example retrieval
688
+ - Streaming support for long-running operations
689
+ - Cache layer beyond exact match
690
+ - Multi-model fallback strategies
691
+ - Cost optimization recommendations
692
+ - Token usage prediction before execution
693
+
694
+ ---
695
+
696
+ **Last Updated**: Based on codebase analysis at commit `7d073b6de` on branch `sp/zai-fix-empty-arr`