@ax-llm/ax 19.0.15 → 19.0.17

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@ax-llm/ax",
3
- "version": "19.0.15",
3
+ "version": "19.0.17",
4
4
  "type": "module",
5
5
  "description": "The best library to work with LLMs",
6
6
  "repository": {
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  name: ax-agent
3
3
  description: This skill helps an LLM generate correct AxAgent code using @ax-llm/ax. Use when the user asks about agent(), child agents, namespaced functions, discovery mode, shared fields, llmQuery(...), or RLM code execution.
4
- version: "19.0.15"
4
+ version: "19.0.17"
5
5
  ---
6
6
 
7
7
  # AxAgent Codegen Rules (@ax-llm/ax)
@@ -11,11 +11,42 @@ Use this skill to generate `AxAgent` code. Prefer short, modern, copyable patter
11
11
  ## Use These Defaults
12
12
 
13
13
  - Use `agent(...)`, not `new AxAgent(...)`.
14
+ - Prefer `fn(...)` for host-side function definitions instead of hand-writing JSON Schema objects.
14
15
  - Prefer namespaced functions such as `utils.search(...)` or `kb.find(...)`.
15
16
  - Assume the child-agent module is `agents` unless `agentIdentity.namespace` is set.
16
17
  - If `functions.discovery` is `true`, discover callables from modules before using them.
17
18
  - In stdout-mode RLM, use one observable `console.log(...)` step per non-final actor turn.
18
- - For long RLM tasks, prefer `contextManagement.actionReplay: 'adaptive'` plus `stateSummary` so prior exploratory turns are summarized and live runtime state stays visible.
19
+ - For long RLM tasks, prefer `contextPolicy: { preset: 'adaptive' }` so older successful turns collapse into checkpoint summaries while live runtime state stays visible.
20
+
21
+ ## Mental Model
22
+
23
+ Treat `AxAgent` as a long-running JavaScript REPL that the actor steers over multiple turns, not as a fresh script generator on every turn.
24
+
25
+ - Successful code leaves variables, functions, imports, and computed values available in the runtime session.
26
+ - The actor should continue from existing runtime state instead of recreating prior work.
27
+ - `Action Log`, `Live Runtime State`, and checkpoint summaries only control what the actor can see again in the prompt.
28
+ - Rebuild state only after an explicit runtime restart notice or when you intentionally need to overwrite a value.
29
+
30
+ ## Context Policy Presets
31
+
32
+ Use these meanings consistently when writing or explaining `contextPolicy.preset`:
33
+
34
+ - `full`: Keep prior actions fully replayed. Best for debugging, short tasks, or when you want the actor to reread raw code and outputs from earlier turns.
35
+ - `adaptive`: Keep runtime state visible, keep recent or dependency-relevant actions in full, and collapse older successful work into a `Checkpoint Summary` when context grows. This is the default recommendation for long multi-turn tasks.
36
+ - `lean`: Most aggressive compression. Keep `Live Runtime State`, checkpoint older successful work, and summarize replay-pruned successful turns instead of showing their full code blocks. Use when token pressure matters more than raw replay detail.
37
+
38
+ Practical rules:
39
+
40
+ - Start with `adaptive` for most long RLM tasks.
41
+ - Use `lean` only when the task can mostly continue from current runtime state plus compact summaries.
42
+ - Use `full` when you are debugging the actor loop itself or need the exact prior code and output in the prompt.
43
+
44
+ Important:
45
+
46
+ - `contextPolicy` controls prompt replay and compression, not runtime persistence.
47
+ - A value created by successful actor code still exists in the runtime session even if the earlier turn is later shown only as a summary or checkpoint.
48
+ - Used discovery docs are replay artifacts too: `adaptive` and `lean` can hide old `listModuleFunctions(...)` / `getFunctionDefinitions(...)` output after the actor successfully uses the discovered callable.
49
+ - Reliability-first defaults now prefer "summarize first, delete only when clearly safe" instead of aggressively pruning older evidence as soon as context grows.
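As a quick illustration, here is a minimal sketch of picking a preset per task profile. The signature string, `contextFields`, and the omitted `functions` config are placeholders; the full adaptive configuration appears later in this skill.

```typescript
import { AxJSRuntime, agent } from '@ax-llm/ax';

// Debugging the actor loop: keep raw prior code and outputs fully replayed.
const debugAgent = agent('query:string -> answer:string', {
  contextFields: ['query'],
  runtime: new AxJSRuntime(),
  contextPolicy: { preset: 'full' },
});

// Long multi-turn task: collapse older successful work into checkpoint summaries.
const longTaskAgent = agent('query:string -> answer:string', {
  contextFields: ['query'],
  runtime: new AxJSRuntime(),
  contextPolicy: { preset: 'adaptive' },
});
```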
19
50
 
20
51
  ## Critical Rules
21
52
 
@@ -24,10 +55,11 @@ Use this skill to generate `AxAgent` code. Prefer short, modern, copyable patter
24
55
  - If `functions.discovery` is `true`, call `listModuleFunctions(...)` first, then `getFunctionDefinitions(...)`, then call only discovered functions.
25
56
  - In stdout-mode RLM, non-final turns must emit exactly one `console.log(...)` and stop immediately after it.
26
57
  - Never combine `console.log(...)` with `final(...)` or `ask_clarification(...)` in the same actor turn.
58
+ - If a host-side `AxAgentFunction` needs to end the current actor turn, use `extra.protocol.final(...)` or `extra.protocol.askClarification(...)`.
27
59
  - If a child agent needs parent inputs such as `audience`, use `fields.shared` or `fields.globallyShared`.
28
60
  - `llmQuery(...)` failures may come back as `[ERROR] ...`; do not assume success.
29
- - If `contextManagement.stateSummary.enabled` is on, rely on the `Live Runtime State` block for current variables instead of re-reading old action log code.
30
- - If `contextManagement.actionReplay` is `'adaptive'` or `'minimal'`, assume older successful turns may be summarized or omitted.
61
+ - If `contextPolicy.state.summary` is on, rely on the `Live Runtime State` block for current variables instead of re-reading old action log code.
62
+ - If `contextPolicy.preset` is `'adaptive'` or `'lean'`, assume older successful turns may be replaced by a `Checkpoint Summary` and that replay-pruned successful turns may appear as compact summaries instead of full code blocks.
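For the stdout-mode turn rules above, here is a minimal sketch of a non-final turn followed by a final turn. The `utils.search(...)` call and its `query` argument are placeholders for any namespaced function, and passing a plain string to `final(...)` is an assumption about the output contract.

```javascript
// Non-final turn: exactly one observable console.log(...), then stop.
const results = await utils.search({ query: 'refund policy' });
console.log(results.length);
```

```javascript
// Final turn: call final(...) without console.log(...), reusing runtime state from the prior turn.
final(`Found ${results.length} matching policies.`);
```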
31
63
 
32
64
  ## Canonical Pattern
33
65
 
@@ -119,43 +151,34 @@ Rules:
119
151
  ## Tool Functions And Namespaces
120
152
 
121
153
  ```typescript
122
- import type { AxAgentFunction } from '@ax-llm/ax';
154
+ import { f, fn } from '@ax-llm/ax';
155
+
156
+ const tools = [
157
+ fn('findSnippets')
158
+ .description('Find handbook snippets by topic')
159
+ .namespace('kb')
160
+ .arg('topic', f.string('Topic keyword'))
161
+ .returns(f.string('Matching snippet').array())
162
+ .example({
163
+ title: 'Find severity guidance',
164
+ code: 'await kb.findSnippets({ topic: "severity" });',
165
+ })
166
+ .handler(async ({ topic }) => [])
167
+ .build(),
168
+ ];
123
169
 
124
- const tools: AxAgentFunction[] = [
125
- {
126
- name: 'findSnippets',
127
- namespace: 'kb',
128
- description: 'Find handbook snippets by topic',
129
- parameters: {
130
- type: 'object',
131
- properties: {
132
- topic: { type: 'string', description: 'Topic keyword' },
133
- },
134
- required: ['topic'],
135
- },
136
- returns: {
137
- type: 'array',
138
- items: { type: 'string' },
139
- },
140
- examples: [
170
+ const analyst = agent('query:string -> answer:string', {
171
+ functions: {
172
+ local: [
141
173
  {
142
- title: 'Find severity guidance',
143
- code: 'await kb.findSnippets({ topic: "severity" });',
174
+ namespace: 'kb',
175
+ title: 'Knowledge Base',
176
+ selectionCriteria: 'Use for handbook and documentation lookups.',
177
+ description: 'Handbook and documentation search helpers.',
178
+ functions: tools.map(({ namespace: _namespace, ...tool }) => tool),
144
179
  },
145
180
  ],
146
- func: async ({ topic }) => [],
147
181
  },
148
- ];
149
-
150
- const analyst = agent('query:string -> answer:string', {
151
- namespaces: [
152
- {
153
- name: 'kb',
154
- title: 'Knowledge Base',
155
- description: 'Handbook and documentation search helpers.',
156
- },
157
- ],
158
- functions: { local: tools },
159
182
  contextFields: [],
160
183
  });
161
184
  ```
@@ -172,6 +195,44 @@ Rules:
172
195
  - Default function namespace is `utils` when no namespace is set.
173
196
  - Use the runtime call shape `await <namespace>.<name>({...})`.
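For example, with the `kb.findSnippets` tool defined above, a single non-final actor turn using this call shape might look like the following sketch (the topic value is illustrative):

```javascript
// Runtime call shape: await <namespace>.<name>({ ... }), one observable step per turn.
const snippets = await kb.findSnippets({ topic: 'severity' });
console.log(snippets);
```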
174
197
 
198
+ ## Host-Side Completion From Functions
199
+
200
+ Use this pattern when the actor should call a namespaced function, but the host-side function implementation should decide to end the turn:
201
+
202
+ ```typescript
203
+ import { f, fn } from '@ax-llm/ax';
204
+
205
+ const workflowTools = [
206
+ fn('finishReply')
207
+ .description('Complete the actor turn with the final reply text')
208
+ .namespace('workflow')
209
+ .arg('reply', f.string('Final reply text'))
210
+ .returns(f.string('Final reply text'))
211
+ .handler(async ({ reply }, extra) => {
212
+ extra?.protocol?.final(reply);
213
+ return reply;
214
+ })
215
+ .build(),
216
+ fn('askForOrderId')
217
+ .description('Complete the actor turn by requesting clarification')
218
+ .namespace('workflow')
219
+ .arg('question', f.string('Clarification question'))
220
+ .returns(f.string('Clarification question'))
221
+ .handler(async ({ question }, extra) => {
222
+ extra?.protocol?.askClarification(question);
223
+ return question;
224
+ })
225
+ .build(),
226
+ ];
227
+ ```
228
+
229
+ Rules:
230
+
231
+ - `extra.protocol` is only available when the function call comes from an active AxAgent actor runtime session.
232
+ - Use `extra.protocol.final(...)` or `extra.protocol.askClarification(...)` only inside host-side function handlers.
233
+ - Inside actor-authored JavaScript, keep using the runtime globals `final(...)` and `ask_clarification(...)`.
234
+ - Do not model these protocol completions as normal registered tool functions or discovery entries.
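To wire this up, a sketch of registering the `workflowTools` above on an agent; the signature string and `contextFields` are placeholders, and `functions: { local: ... }` follows the registration shape shown elsewhere in this skill.

```typescript
import { AxJSRuntime, agent } from '@ax-llm/ax';

// Sketch: host-side completion tools are registered like any other local functions.
const supportAgent = agent('ticket:string -> reply:string', {
  contextFields: ['ticket'],
  runtime: new AxJSRuntime(),
  functions: { local: workflowTools },
});
```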
235
+
175
236
  ## Discovery Mode
176
237
 
177
238
  Enable discovery mode when you want the actor to discover modules and fetch callable definitions on demand:
@@ -199,7 +260,8 @@ Discovery APIs:
199
260
 
200
261
  Both return Markdown.
201
262
 
202
- - `listModuleFunctions(...)` only lists modules that actually have callable entries. Namespace metadata from `namespaces` only enriches those callable-backed modules.
263
+ - `listModuleFunctions(...)` only lists modules that actually have callable entries.
264
+ - Grouped modules render in the Actor prompt as `<namespace> - <selection criteria>` when `selectionCriteria` is provided.
203
265
  - If a requested module does not exist, `listModuleFunctions(...)` returns a per-module markdown error without failing the whole call.
204
266
  - `getFunctionDefinitions(...)` may include argument comments from schema descriptions and fenced code examples from `AxAgentFunction.examples`.
205
267
 
@@ -234,20 +296,88 @@ Do not:
234
296
  - Do not dump large pre-known tool definitions into actor code when discovery mode is enabled.
235
297
  - Do not use `Promise.all(...)` to fan out discovery calls across modules or definitions.
236
298
  - Do not convert discovery markdown into JSON before logging or using it.
299
+ - Do not assume used discovery docs stay visible in later prompts under `adaptive` or `lean`; call `listModuleFunctions(...)` or `getFunctionDefinitions(...)` again when you need to re-open them.
237
300
 
238
301
  ## RLM Actor Code Rules
239
302
 
240
303
  Use these rules when generating actor JavaScript for RLM in stdout mode:
241
304
 
242
305
  - Treat each actor turn as exactly one observable step.
243
- - If you need to inspect a value, compute it, `console.log(...)` it, and stop immediately after that `console.log(...)`.
244
- - On the next turn, read the logged result from `Action Log` before writing more code that depends on it.
306
+ - Inspect what already exists before recomputing it. If a prior turn successfully created a value, prefer reusing that runtime value.
307
+ - If you need to inspect a value, compute it or read it, `console.log(...)` it, and stop immediately after that `console.log(...)`.
308
+ - On the next turn, continue from the existing runtime state and use the logged result from `Action Log` only as evidence for what happened.
245
309
  - If the prompt contains `Live Runtime State`, treat it as the canonical view of current variables.
246
310
  - Errors from child-agent or tool calls appear in `Action Log`; inspect them and fix the code on the next turn.
247
311
  - Non-final turns should contain exactly one `console.log(...)`.
248
312
  - Final turns should call `final(...)` or `ask_clarification(...)` without `console.log(...)`.
249
313
  - Do not write a complete multi-step program in one actor turn.
250
- - Do not assume older successful turns remain fully replayed; adaptive replay may compress them to `[SUMMARY]: ...`.
314
+ - Do not re-declare or recompute values just because older turns are summarized; only rebuild after an explicit runtime restart or when you intentionally want a new value.
315
+ - Do not assume older successful turns remain fully replayed; adaptive or lean policies may collapse them into a `Checkpoint Summary` block or compact action summaries.
316
+
317
+ Small reuse example:
318
+
319
+ Turn 1:
320
+
321
+ ```javascript
322
+ const customers = await kb.findCustomers({ segment: 'active' });
323
+ console.log(customers.length);
324
+ ```
325
+
326
+ Turn 2:
327
+
328
+ ```javascript
329
+ const topCustomers = customers.slice(0, 3);
330
+ console.log(topCustomers);
331
+ ```
332
+
333
+ Reason: turn 2 reuses `customers` from the persistent runtime. `Live Runtime State` or summaries may change how turn 1 is shown in the prompt, but they do not remove the value from the runtime session.
334
+
335
+ ## RLM Test Harness
336
+
337
+ Use `agent.test(code, contextFieldValues?, options?)` when the user wants to validate JavaScript snippets against the actual AxAgent runtime environment without running the full Actor/Responder loop.
338
+
339
+ ```typescript
340
+ import { AxJSRuntime, agent, f, fn } from '@ax-llm/ax';
341
+
342
+ const runtime = new AxJSRuntime();
343
+
344
+ const tools = [
345
+ fn('sum')
346
+ .description('Return the sum of the provided numeric values')
347
+ .namespace('math')
348
+ .arg('values', f.number('Value to add').array())
349
+ .returns(f.number('Sum of all values'))
350
+ .handler(async ({ values }) =>
351
+ values.reduce((total, value) => total + value, 0)
352
+ )
353
+ .build(),
354
+ ];
355
+
356
+ const harness = agent('query:string -> answer:string', {
357
+ contextFields: ['query'],
358
+ runtime,
359
+ functions: { local: tools },
360
+ contextPolicy: { preset: 'adaptive' },
361
+ });
362
+
363
+ const output = await harness.test(
364
+ 'console.log(await math.sum({ values: [3, 5, 8] }))',
365
+ { query: 'sum the values' }
366
+ );
367
+
368
+ console.log(output);
369
+ ```
370
+
371
+ Rules:
372
+
373
+ - `test(...)` creates a fresh runtime session per call.
374
+ - It exposes the same runtime globals the actor would see for configured `contextFields`: `inputs`, non-colliding top-level aliases, namespaced functions, child agents, and `llmQuery`.
375
+ - In `AxJSRuntime`, do not rely on calling `inspect_runtime()` from inside `test(...)` snippets yet; prefer checking runtime globals directly inside the snippet.
376
+ - It returns the formatted runtime output string.
377
+ - It throws on runtime failures instead of returning LLM-style error strings.
378
+ - Do not call `final(...)` or `ask_clarification(...)` inside `test(...)` snippets.
379
+ - Pass only `contextFields` values to `test(...)`; it is not a general way to inject arbitrary non-context inputs.
380
+ - If the snippet uses `llmQuery(...)`, provide an AI service through the agent config or `options.ai`.
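Following the rule about checking runtime globals directly, a small sketch of a `test(...)` snippet that inspects what the runtime exposes, reusing the `harness` from the example above (the snippet body is illustrative):

```typescript
// Log which context-field globals the snippet can see, instead of calling inspect_runtime().
const globalsReport = await harness.test(
  'console.log(Object.keys(inputs))',
  { query: 'sum the values' }
);
console.log(globalsReport);
```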
251
381
 
252
382
  ## RLM Adaptive Replay
253
383
 
@@ -260,15 +390,31 @@ const analyst = agent(
260
390
  contextFields: ['context'],
261
391
  runtime: new AxJSRuntime(),
262
392
  maxTurns: 10,
263
- contextManagement: {
264
- actionReplay: 'adaptive',
265
- recentFullActions: 1,
266
- successSummarization: true,
267
- stateSummary: { enabled: true, maxEntries: 6 },
268
- stateInspection: { contextThreshold: 2_000 },
269
- errorPruning: true,
270
- hindsightEvaluation: true,
271
- pruneRank: 2,
393
+ contextPolicy: {
394
+ preset: 'adaptive',
395
+ summarizerOptions: {
396
+ model: 'summary-model',
397
+ modelConfig: { temperature: 0.2, maxTokens: 180 },
398
+ },
399
+ state: {
400
+ summary: true,
401
+ inspect: true,
402
+ inspectThresholdChars: 8_000,
403
+ maxEntries: 6,
404
+ maxChars: 1_200,
405
+ },
406
+ checkpoints: {
407
+ enabled: true,
408
+ triggerChars: 12_000,
409
+ },
410
+ expert: {
411
+ pruneErrors: true,
412
+ rankPruning: { enabled: true, minRank: 2 },
413
+ tombstones: {
414
+ model: 'summary-model',
415
+ modelConfig: { maxTokens: 80 },
416
+ },
417
+ },
272
418
  },
273
419
  }
274
420
  );
@@ -276,12 +422,18 @@ const analyst = agent(
276
422
 
277
423
  Rules:
278
424
 
279
- - Use `actionReplay: 'adaptive'` when the task needs runtime state across many turns but old exploratory code should not keep bloating the prompt.
280
- - Use `recentFullActions` to keep the newest one or two turns verbatim while older successful turns collapse to summaries.
281
- - Use `successSummarization: true` for explicit, compact summaries of older successful turns.
282
- - Use `stateSummary.enabled` to inject a compact `Live Runtime State` block into the actor prompt.
283
- - Use `actionReplay: 'minimal'` only when you want aggressively compressed history and can rely mostly on current runtime state.
284
- - Keep `stateInspection.contextThreshold` on so the actor is reminded to call `inspect_runtime()` when context grows.
425
+ - Use `preset: 'full'` when the actor should keep seeing raw prior code and outputs with minimal compression.
426
+ - Use `preset: 'adaptive'` when the task needs runtime state across many turns but older successful work should collapse into checkpoint summaries while important recent steps can still stay fully replayed.
427
+ - Use `preset: 'lean'` when you want more aggressive compression and can rely mostly on current runtime state plus checkpoint summaries and compact action summaries.
428
+ - Use `state.summary` to inject a compact `Live Runtime State` block into the actor prompt. The block is structured and provenance-aware: variables are rendered with compact type/size/preview metadata, and when Ax can infer it, a short source suffix like `from t3 via db.search` is included. Combine `maxEntries` with `maxChars` so large runtime objects do not dominate the prompt.
429
+ - Use `state.inspect` with `inspectThresholdChars` so the actor is reminded to call `inspect_runtime()` when replayed action history starts getting large.
430
+ - `adaptive` and `lean` hide used discovery docs by default; set `contextPolicy.pruneUsedDocs: false` if you want to keep replaying them.
431
+ - `full` keeps used discovery docs by default; set `contextPolicy.pruneUsedDocs: true` if you want the same cleanup there.
432
+ - Use `summarizerOptions` to tune the internal checkpoint-summary AxGen program.
433
+ - If you configure `expert.tombstones`, treat the object form as options for the internal tombstone-summary AxGen program.
434
+ - Internal checkpoint and tombstone summarizers are stateless helpers: `functions` are not allowed, `maxSteps` is forced to `1`, and `mem` is not propagated.
435
+ - Built-in `adaptive` and `lean` presets no longer enable destructive rank pruning by default. Opt in with `expert.rankPruning` only when you want lower-value successful turns deleted instead of summarized.
436
+ - If you want a quick local demo of the rendered `Live Runtime State` block, run [`src/examples/rlm-live-runtime-state.ts`](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-live-runtime-state.ts).
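As a small sketch of the discovery-doc bullets above, here is how overriding the default `pruneUsedDocs` behavior in either direction could look; the signature string and `contextFields` are placeholders, and runtime and functions config are omitted.

```typescript
import { agent } from '@ax-llm/ax';

// Keep replaying used discovery docs even under the adaptive preset.
const keepDocs = agent('query:string -> answer:string', {
  contextFields: ['query'],
  contextPolicy: { preset: 'adaptive', pruneUsedDocs: false },
});

// Apply the same cleanup while otherwise keeping full replay.
const cleanDocs = agent('query:string -> answer:string', {
  contextFields: ['query'],
  contextPolicy: { preset: 'full', pruneUsedDocs: true },
});
```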
285
437
 
286
438
  Good pattern:
287
439
 
@@ -455,7 +607,7 @@ agentIdentity?: {
455
607
  maxRuntimeChars?: number;
456
608
  maxBatchedLlmQueryConcurrency?: number;
457
609
  maxTurns?: number;
458
- contextManagement?: AxContextManagementConfig;
610
+ contextPolicy?: AxContextPolicyConfig;
459
611
  actorFields?: string[];
460
612
  actorCallback?: (result: Record<string, unknown>) => void | Promise<void>;
461
613
  inputUpdateCallback?: (currentInputs: Record<string, unknown>) => Promise<Record<string, unknown> | undefined> | Record<string, unknown> | undefined;
@@ -468,6 +620,23 @@ agentIdentity?: {
468
620
  }
469
621
  ```
470
622
 
623
+ ## Examples
624
+
625
+ Fetch these for full working code:
626
+
627
+ - [Agent](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/agent.ts) — basic agent
628
+ - [Functions](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/function.ts) — function validation
629
+ - [Food Search](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/food-search.ts) — API tools
630
+ - [Smart Home](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/smart-home.ts) — state management
631
+ - [RLM](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm.ts) — RLM basic
632
+ - [RLM Long Task](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-long-task.ts) — RLM context policy
633
+ - [RLM Discovery](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-discovery.ts) — discovery mode
634
+ - [RLM Shared Fields](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-shared-fields.ts) — shared fields
635
+ - [RLM Adaptive Replay](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-adaptive-replay.ts) — adaptive replay
636
+ - [RLM Live Runtime State](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-live-runtime-state.ts) — structured runtime-state rendering
637
+ - [Customer Support](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/customer-support.ts) — classification agent
638
+ - [Abort Patterns](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/abort-patterns.ts) — abort handling
639
+
471
640
  ## Do Not Generate
472
641
 
473
642
  - Do not use `new AxAgent(...)` for new code unless explicitly required.
@@ -0,0 +1,245 @@
1
+ ---
2
+ name: ax-ai
3
+ description: This skill helps an LLM generate correct AI provider setup and configuration code using @ax-llm/ax. Use when the user asks about ai(), providers, models, presets, embeddings, extended thinking, context caching, or mentions OpenAI/Anthropic/Google/Azure/Groq/DeepSeek/Mistral/Cohere/Together/Ollama/HuggingFace/Reka/OpenRouter with @ax-llm/ax.
4
+ version: "19.0.17"
5
+ ---
6
+
7
+ # AI Provider Codegen Rules (@ax-llm/ax)
8
+
9
+ Use this skill to generate AI provider setup, configuration, and chat code. Prefer short, modern, copyable patterns. Do not write tutorial prose unless the user explicitly asks for explanation.
10
+
11
+ ## Quick Setup
12
+
13
+ ```typescript
14
+ import { ai } from '@ax-llm/ax';
15
+
16
+ const openai = ai({ name: 'openai', apiKey: 'sk-...' });
17
+ const claude = ai({ name: 'anthropic', apiKey: 'sk-ant-...' });
18
+ const gemini = ai({ name: 'google-gemini', apiKey: 'AIza...' });
19
+ const azure = ai({ name: 'azure-openai', apiKey: 'your-key', resourceName: 'your-resource', deploymentName: 'gpt-4' });
20
+ const groq = ai({ name: 'groq', apiKey: 'gsk_...' });
21
+ const deepseek = ai({ name: 'deepseek', apiKey: 'sk-...' });
22
+ const mistral = ai({ name: 'mistral', apiKey: 'your-key' });
23
+ const cohere = ai({ name: 'cohere', apiKey: 'your-key' });
24
+ const together = ai({ name: 'together', apiKey: 'your-key' });
25
+ const openrouter = ai({ name: 'openrouter', apiKey: 'your-key' });
26
+ const ollama = ai({ name: 'ollama', url: 'http://localhost:11434' });
27
+ const hf = ai({ name: 'huggingface', apiKey: 'hf_...' });
28
+ const reka = ai({ name: 'reka', apiKey: 'your-key' });
29
+ const grok = ai({ name: 'x-grok', apiKey: 'your-key' });
30
+ ```
31
+
32
+ ## Model Presets
33
+
34
+ ```typescript
35
+ import { ai, AxAIGoogleGeminiModel } from '@ax-llm/ax';
36
+
37
+ const gemini = ai({
38
+ name: 'google-gemini',
39
+ apiKey: process.env.GOOGLE_APIKEY!,
40
+ config: { model: 'simple' },
41
+ models: [
42
+ { key: 'tiny', model: AxAIGoogleGeminiModel.Gemini20FlashLite, description: 'Fast + cheap', config: { maxTokens: 1024, temperature: 0.3 } },
43
+ { key: 'simple', model: AxAIGoogleGeminiModel.Gemini20Flash, description: 'Balanced', config: { temperature: 0.6 } },
44
+ ],
45
+ });
46
+
47
+ await gemini.chat({ model: 'tiny', chatPrompt: [{ role: 'user', content: 'Hi' }] });
48
+ ```
49
+
50
+ ## Chat
51
+
52
+ ```typescript
53
+ const res = await llm.chat({
54
+ chatPrompt: [
55
+ { role: 'system', content: 'You are concise.' },
56
+ { role: 'user', content: 'Write a haiku about the ocean.' },
57
+ ],
58
+ });
59
+ console.log(res.results[0]?.content);
60
+ ```
61
+
62
+ ## Common Options
63
+
64
+ - `stream` (boolean): enable SSE; true by default
65
+ - `thinkingTokenBudget`: `'minimal'` | `'low'` | `'medium'` | `'high'` | `'highest'` | `'none'`
66
+ - `showThoughts`: include thoughts in output
67
+ - `functionCallMode`: `'auto'` | `'native'` | `'prompt'`
68
+ - `debug`, `logger`, `tracer`, `rateLimiter`, `timeout`
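A short sketch of passing these as per-request options in the second argument to `chat(...)`, mirroring the extended-thinking example below; treating `stream` and `functionCallMode` as per-request options here is an assumption.

```typescript
const res = await llm.chat(
  { chatPrompt: [{ role: 'user', content: 'List three sorting algorithms.' }] },
  // Per-request options: disable streaming, force prompt-based function calling, keep thinking low.
  { stream: false, functionCallMode: 'prompt', thinkingTokenBudget: 'low' }
);
console.log(res.results[0]?.content);
```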
69
+
70
+ ## Extended Thinking
71
+
72
+ ```typescript
73
+ import { ai, AxAIAnthropicModel } from '@ax-llm/ax';
74
+
75
+ const claude = ai({
76
+ name: 'anthropic',
77
+ apiKey: process.env.ANTHROPIC_APIKEY!,
78
+ config: { model: AxAIAnthropicModel.Claude46Opus },
79
+ });
80
+
81
+ const res = await claude.chat(
82
+ { chatPrompt: [{ role: 'user', content: 'Solve step by step...' }] },
83
+ { thinkingTokenBudget: 'medium', showThoughts: true },
84
+ );
85
+ console.log(res.results[0]?.thought);
86
+ console.log(res.results[0]?.content);
87
+ ```
88
+
89
+ ### Budget Levels
90
+
91
+ | Level | Anthropic (tokens) | Gemini (tokens) |
92
+ |---|---|---|
93
+ | `'none'` | disabled | minimal |
94
+ | `'minimal'` | 1,024 | 200 |
95
+ | `'low'` | 5,000 | 800 |
96
+ | `'medium'` | 10,000 | 5,000 |
97
+ | `'high'` | 20,000 | 10,000 |
98
+ | `'highest'` | 32,000 | 24,500 |
99
+
100
+ ### Anthropic Model-Specific Behavior
101
+
102
+ - Opus 4.6: adaptive thinking, effort levels
103
+ - Opus 4.5: budget_tokens + effort levels (capped at `'high'`)
104
+ - Other thinking models: budget tokens only
105
+
106
+ ### Custom Thinking Levels
107
+
108
+ ```typescript
109
+ const claude = ai({
110
+ name: 'anthropic',
111
+ apiKey: '...',
112
+ config: {
113
+ model: AxAIAnthropicModel.Claude46Opus,
114
+ thinkingTokenBudgetLevels: {
115
+ minimal: 2048,
116
+ low: 8000,
117
+ medium: 16000,
118
+ high: 25000,
119
+ highest: 40000,
120
+ },
121
+ effortLevelMapping: {
122
+ minimal: 'low',
123
+ low: 'medium',
124
+ medium: 'high',
125
+ high: 'high',
126
+ highest: 'max',
127
+ },
128
+ },
129
+ });
130
+ ```
131
+
132
+ ## Embeddings
133
+
134
+ ```typescript
135
+ const { embeddings } = await llm.embed({
136
+ texts: ['hello', 'world'],
137
+ embedModel: 'text-embedding-005',
138
+ });
139
+ ```
140
+
141
+ ## Context Caching
142
+
143
+ ```typescript
144
+ const result = await gen.forward(llm, { code, language }, {
145
+ mem,
146
+ sessionId: 'code-review-session',
147
+ contextCache: {
148
+ ttlSeconds: 3600,
149
+ cacheBreakpoint: 'after-examples',
150
+ },
151
+ });
152
+ ```
153
+
154
+ Breakpoint values: `'system'` | `'after-functions'` | `'after-examples'`
155
+
156
+ Provider behavior:
157
+
158
+ - Google Gemini: explicit caching with cache resource ID, auto TTL refresh
159
+ - Anthropic: implicit via `cache_control` markers
160
+
161
+ ### External Registry (serverless)
162
+
163
+ ```typescript
164
+ const registry: AxContextCacheRegistry = {
165
+ get: async (key) => { /* redis.get */ },
166
+ set: async (key, entry) => { /* redis.set */ },
167
+ };
168
+ ```
169
+
170
+ ## AWS Bedrock
171
+
172
+ ```typescript
173
+ import { AxAIBedrock, AxAIBedrockModel } from '@ax-llm/ax-ai-aws-bedrock';
174
+
175
+ const bedrock = new AxAIBedrock({
176
+ region: 'us-east-2',
177
+ fallbackRegions: ['us-west-2'],
178
+ config: { model: AxAIBedrockModel.ClaudeSonnet4 },
179
+ });
180
+ ```
181
+
182
+ ## Vercel AI SDK Integration
183
+
184
+ ```typescript
185
+ import { ai } from '@ax-llm/ax';
186
+ import { AxAIProvider } from '@ax-llm/ax-ai-sdk-provider';
187
+ import { generateText } from 'ai';
188
+
189
+ const axAI = ai({ name: 'openai', apiKey: process.env.OPENAI_APIKEY! });
190
+ const model = new AxAIProvider(axAI);
191
+ const result = await generateText({
192
+ model,
193
+ messages: [{ role: 'user', content: 'Hello!' }],
194
+ });
195
+ ```
196
+
197
+ ## MCP + AxJSRuntime
198
+
199
+ ```typescript
200
+ import { AxMCPClient } from '@ax-llm/ax';
201
+ import { axCreateMCPStdioTransport } from '@ax-llm/ax-tools';
202
+
203
+ const transport = axCreateMCPStdioTransport({
204
+ command: 'npx',
205
+ args: ['-y', '@anthropic/mcp-server-filesystem'],
206
+ });
207
+ const client = new AxMCPClient(transport);
208
+ ```
209
+
210
+ ## Critical Rules
211
+
212
+ - Use the `ai()` factory for all providers.
213
+ - Provider names: `'openai'`, `'anthropic'`, `'google-gemini'`, `'azure-openai'`, `'mistral'`, `'groq'`, `'cohere'`, `'together'`, `'deepseek'`, `'ollama'`, `'huggingface'`, `'openrouter'`, `'reka'`, `'x-grok'`
214
+ - Thinking constraints on Anthropic: `temperature` and `topK` are ignored; `topP` is only sent if it is >= 0.95.
215
+ - Bedrock uses `new AxAIBedrock()`, not `ai()`.
216
+ - Vercel AI SDK uses `AxAIProvider` wrapper.
217
+
218
+ ## Examples
219
+
220
+ Fetch these for full working code:
221
+
222
+ - [Embeddings](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/embed.ts) — embedding generation
223
+ - [Anthropic Thinking](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/anthropic-thinking-function.ts) — extended thinking with functions
224
+ - [Anthropic Thinking Separation](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/anthropic-thinking-separation.ts) — thinking separation
225
+ - [Anthropic Web Search](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/anthropic-web-search.ts) — Anthropic web search
226
+ - [OpenAI Web Search](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/openai-web-search.ts) — OpenAI web search
227
+ - [OpenAI Responses](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/openai-responses.ts) — OpenAI responses API
228
+ - [o3 Reasoning](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/reasoning-o3-example.ts) — o3 reasoning
229
+ - [Gemini Context Cache](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/gemini-context-cache.ts) — Gemini context caching
230
+ - [Gemini Files](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/gemini-file-support.ts) — Gemini file handling
231
+ - [Grok Live Search](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/grok-live-search.ts) — Grok live search
232
+ - [OpenRouter](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/openrouter.ts) — OpenRouter provider
233
+ - [Vertex AI Auth](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/vertex-auth-example.ts) — Vertex AI authentication
234
+ - [MCP Stdio](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/mcp-client-memory.ts) — MCP stdio transport
235
+ - [MCP HTTP](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/mcp-client-pipedream.ts) — MCP HTTP transport
236
+ - [Telemetry](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/telemetry.ts) — OpenTelemetry tracing
237
+ - [Multi-Modal](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/multi-modal.ts) — image handling
238
+
239
+ ## Do Not Generate
240
+
241
+ - Do not use `new AxAIOpenAI(...)` or similar class constructors for standard providers; use `ai()`.
242
+ - Do not hardcode provider class names when `ai({ name: ... })` covers the provider.
243
+ - Do not mix `thinkingTokenBudget` with explicit `temperature` on Anthropic thinking models.
244
+ - Do not use `ai()` for AWS Bedrock; use `new AxAIBedrock()`.
245
+ - Do not omit `resourceName` and `deploymentName` for Azure OpenAI.