@ax-llm/ax 20.0.2 → 21.0.2

package/skills/ax-ai.md CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  name: ax-ai
3
3
  description: This skill helps an LLM generate correct AI provider setup and configuration code using @ax-llm/ax. Use when the user asks about ai(), providers, models, presets, embeddings, extended thinking, context caching, or mentions OpenAI/Anthropic/Google/Azure/Groq/DeepSeek/Mistral/Cohere/Together/Ollama/HuggingFace/Reka/OpenRouter with @ax-llm/ax.
4
- version: "20.0.2"
4
+ version: "21.0.2"
5
5
  ---
6
6
 
7
7
  # AI Provider Codegen Rules (@ax-llm/ax)
@@ -26,7 +26,7 @@ const openrouter = ai({ name: 'openrouter', apiKey: 'your-key' });
26
26
  const ollama = ai({ name: 'ollama', url: 'http://localhost:11434' });
27
27
  const hf = ai({ name: 'huggingface', apiKey: 'hf_...' });
28
28
  const reka = ai({ name: 'reka', apiKey: 'your-key' });
29
- const grok = ai({ name: 'x-grok', apiKey: 'your-key' });
29
+ const grok = ai({ name: 'grok', apiKey: 'your-key' });
30
30
  ```
31
31
 
32
32
  ## Model Presets
@@ -47,6 +47,25 @@ const gemini = ai({
47
47
  await gemini.chat({ model: 'tiny', chatPrompt: [{ role: 'user', content: 'Hi' }] });
48
48
  ```
49
49
 
50
+ ## Model Catalog
51
+
52
+ ```typescript
53
+ import { axGetSupportedAIModels } from '@ax-llm/ax';
54
+
55
+ const providers = axGetSupportedAIModels();
56
+ const openai = providers.find((provider) => provider.name === 'openai');
57
+ console.log(openai?.models[0]?.promptTokenCostPer1M);
58
+
59
+ const textProviders = axGetSupportedAIModels({ type: 'text' });
60
+ const embeddingProviders = axGetSupportedAIModels({ type: 'embeddings' });
61
+ ```
62
+
63
+ Use `axGetSupportedAIModels()` to build provider/model selectors before creating an `ai(...)` instance. It returns bundled static metadata: provider names, display names, default models, raw `AxModelInfo` pricing/details, model type (`'text'`, `'embeddings'`, `'code'`, or `'audio'`), and normalized capability flags for thinking, thoughts, structured outputs, audio, temperature, and top-p support. Provider groups and models are sorted cheapest to most expensive based on bundled input + output token pricing; unpriced models sort last.
64
+
65
+ Filter with `{ type: 'all' | 'text' | 'embeddings' | 'code' | 'audio' }` or an array of those values. The `'text'` filter includes code-capable models; use `'code'` to show only code-first models.
66
+
67
+ Dynamic providers such as Azure OpenAI deployments, OpenRouter, Ollama, and Hugging Face are marked with `isDynamic: true` and may have an empty or static-limited model list.
68
+
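As an illustration of the ordering rule described in the Model Catalog text above (cheapest first by input + output price, unpriced models last), here is a minimal standalone sketch. It does not call `@ax-llm/ax` and is not the library's implementation. The `CatalogModel` type, the `completionTokenCostPer1M` field name, and both helper functions are assumptions made for this example; only `promptTokenCostPer1M` appears in the skill text.

```typescript
// Local stand-in for one catalog entry. Hypothetical type: only
// `promptTokenCostPer1M` is named in the skill text; the output-price
// field name below is an assumption.
type CatalogModel = {
  name: string;
  promptTokenCostPer1M?: number;
  completionTokenCostPer1M?: number; // assumed field name
};

// Combined input + output price, or undefined when the model has no pricing.
function combinedCost(model: CatalogModel): number | undefined {
  const input = model.promptTokenCostPer1M;
  const output = model.completionTokenCostPer1M;
  if (input === undefined && output === undefined) return undefined;
  return (input ?? 0) + (output ?? 0);
}

// Cheapest first; unpriced models sort last. Returns a new array.
function sortByCost(models: readonly CatalogModel[]): CatalogModel[] {
  return [...models].sort((a, b) => {
    const costA = combinedCost(a);
    const costB = combinedCost(b);
    if (costA === undefined && costB === undefined) return 0;
    if (costA === undefined) return 1;
    if (costB === undefined) return -1;
    return costA - costB;
  });
}

const sorted = sortByCost([
  { name: 'large', promptTokenCostPer1M: 5, completionTokenCostPer1M: 15 },
  { name: 'unpriced' },
  { name: 'small', promptTokenCostPer1M: 0.5, completionTokenCostPer1M: 1.5 },
]);
console.log(sorted.map((m) => m.name)); // [ 'small', 'large', 'unpriced' ]
```

A selector UI built on the catalog would not normally need this, since the text states the returned list is already sorted; the sketch only makes the rule concrete.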
50
69
  ## Chat
51
70
 
52
71
  ```typescript
@@ -210,7 +229,7 @@ const client = new AxMCPClient(transport);
210
229
  ## Critical Rules
211
230
 
212
231
  - Use `ai()` factory for all providers.
213
- - Provider names: `'openai'`, `'anthropic'`, `'google-gemini'`, `'azure-openai'`, `'mistral'`, `'groq'`, `'cohere'`, `'together'`, `'deepseek'`, `'ollama'`, `'huggingface'`, `'openrouter'`, `'reka'`, `'x-grok'`
232
+ - Provider names: `'openai'`, `'anthropic'`, `'google-gemini'`, `'azure-openai'`, `'mistral'`, `'groq'`, `'cohere'`, `'together'`, `'deepseek'`, `'ollama'`, `'huggingface'`, `'openrouter'`, `'reka'`, `'grok'`
214
233
  - Thinking constraints on Anthropic: `temperature` and `topK` are ignored; `topP` only sent if >= 0.95.
215
234
  - Bedrock uses `new AxAIBedrock()`, not `ai()`.
216
235
  - Vercel AI SDK uses `AxAIProvider` wrapper.
@@ -0,0 +1,251 @@
1
+ ---
2
+ name: ax-audio
3
+ description: This skill helps an LLM generate correct conversational audio I/O code with @ax-llm/ax. Use when the user asks about .chat() audio input, audio output, OpenAI gpt-audio or realtime models, Gemini Live native audio, Grok Voice Agent models, voices, formats, transcripts, or how audio fits with signatures and structured outputs.
4
+ version: "21.0.2"
5
+ ---
6
+
7
+ # Audio I/O Codegen Rules (@ax-llm/ax)
8
+
9
+ Use this skill for bounded-turn conversational audio through `.chat()`. Prefer short, modern, copyable examples. Do not model generated audio as a DSPy signature output field.
10
+
11
+ ## Core Rule
12
+
13
+ Audio output is returned on `AxChatResponseResult.audio`, not in signature fields.
14
+
15
+ Signatures should keep text fields text-shaped:
16
+
17
+ ```typescript
18
+ const result = await llm.chat({
19
+   chatPrompt: [{ role: 'user', content: 'Say hello out loud.' }],
20
+   modelConfig: {
21
+     audio: { output: { enabled: true } },
22
+   },
23
+ });
24
+
25
+ console.log(result.results[0]?.content);
26
+ console.log(result.results[0]?.audio?.data);
27
+ console.log(result.results[0]?.audio?.transcript);
28
+ ```
29
+
30
+ Do not write signatures like `question:string -> audio:audio`. Use `.chat()` for conversational audio and use `audio.data` for the generated bytes.
31
+
32
+ ## Config Shape
33
+
34
+ ```typescript
35
+ type AxChatAudioConfig = {
36
+   input?: {
37
+     format?: 'wav' | 'mp3' | 'flac' | 'opus' | 'aac' | 'pcm16' | 'pcm' | 'ogg';
38
+     mimeType?: string;
39
+     sampleRate?: number;
40
+     channels?: number;
41
+   };
42
+   output?: {
43
+     enabled?: boolean;
44
+     voice?: string | { id: string };
45
+     format?: 'wav' | 'mp3' | 'flac' | 'opus' | 'aac' | 'pcm16' | 'pcm' | 'ogg';
46
+     sampleRate?: number;
47
+     channels?: number;
48
+     includeTranscript?: boolean;
49
+   };
50
+   live?: {
51
+     turnTimeoutMs?: number;
52
+     enableAffectiveDialog?: boolean;
53
+     proactiveAudio?: boolean;
54
+   };
55
+ };
56
+ ```
57
+
58
+ ## OpenAI Defaults
59
+
60
+ Use `axAIOpenAIAudioDefaultConfig()` for OpenAI request-based audio chat:
61
+
62
+ - model: `gpt-audio-mini`
63
+ - output enabled
64
+ - voice: `alloy`
65
+ - output format: `wav`
66
+ - transcript enabled
67
+ - streaming disabled by default
68
+ - audio input formats: `wav`, `mp3`
69
+ - audio output formats: `wav`, `mp3`, `flac`, `opus`, `aac`, `pcm16`
70
+
71
+ ```typescript
72
+ import { ai, axAIOpenAIAudioDefaultConfig } from '@ax-llm/ax';
73
+
74
+ const openai = ai({
75
+   name: 'openai',
76
+   apiKey: process.env.OPENAI_APIKEY!,
77
+   config: axAIOpenAIAudioDefaultConfig(),
78
+ });
79
+
80
+ const res = await openai.chat({
81
+   chatPrompt: [
82
+     {
83
+       role: 'user',
84
+       content: [
85
+         { type: 'text', text: 'What is in this recording?' },
86
+         { type: 'audio', data: base64Wav, format: 'wav' },
87
+       ],
88
+     },
89
+   ],
90
+ });
91
+
92
+ console.log(res.results[0]?.content);
93
+ console.log(res.results[0]?.audio?.data);
94
+ ```
95
+
96
+ Use `axAIOpenAIRealtimeDefaultConfig()` for OpenAI realtime speech-to-speech:
97
+
98
+ - model: `gpt-realtime-2`
99
+ - output enabled
100
+ - voice: `marin`
101
+ - output format: `pcm16`
102
+ - input default: `audio/pcm`, mono, 24000 Hz
103
+ - turn timeout: `30000`
104
+ - streaming disabled by default
105
+
106
+ Use `axAIOpenAIRealtimeTranscriptionDefaultConfig()` for realtime transcript deltas:
107
+
108
+ - model: `gpt-realtime-whisper`
109
+ - input default: `audio/pcm`, mono, 24000 Hz
110
+ - output audio disabled; transcript text is returned on `content`
111
+
112
+ Realtime models use a one-turn WebSocket call under `.chat()`. In Node, pass a WebSocket constructor through request options:
113
+
114
+ ```typescript
115
+ import WebSocket from 'ws';
116
+ import { ai, axAIOpenAIRealtimeDefaultConfig } from '@ax-llm/ax';
117
+
118
+ const openai = ai({
119
+   name: 'openai',
120
+   apiKey: process.env.OPENAI_APIKEY!,
121
+   config: axAIOpenAIRealtimeDefaultConfig(),
122
+ });
123
+
124
+ const stream = await openai.chat(
125
+   {
126
+     chatPrompt: [{ role: 'user', content: 'Say hello out loud.' }],
127
+   },
128
+   { stream: true, webSocket: WebSocket }
129
+ );
130
+ ```
131
+
132
+ For follow-up turns, keep the assistant audio reference in history:
133
+
134
+ ```typescript
135
+ await openai.chat({
136
+   chatPrompt: [
137
+     { role: 'assistant', audio: { id: previousAudioId } },
138
+     { role: 'user', content: 'Repeat that more slowly.' },
139
+   ],
140
+ });
141
+ ```
142
+
143
+ ## Gemini Live Defaults
144
+
145
+ Use `axAIGoogleGeminiLiveAudioDefaultConfig()` for Gemini native audio:
146
+
147
+ - model: `gemini-2.5-flash-native-audio-preview-12-2025`
148
+ - output enabled
149
+ - voice: `Kore`
150
+ - output format: `pcm16`
151
+ - output sample rate: `24000`
152
+ - input default: `audio/pcm;rate=16000`, mono
153
+ - transcript enabled
154
+ - turn timeout: `30000`
155
+ - streaming disabled by default
156
+
157
+ ```typescript
158
+ import { ai, axAIGoogleGeminiLiveAudioDefaultConfig } from '@ax-llm/ax';
159
+
160
+ const gemini = ai({
161
+   name: 'google-gemini',
162
+   apiKey: process.env.GOOGLE_APIKEY!,
163
+   config: axAIGoogleGeminiLiveAudioDefaultConfig(),
164
+ });
165
+
166
+ const res = await gemini.chat({
167
+   chatPrompt: [
168
+     {
169
+       role: 'user',
170
+       content: [
171
+         { type: 'text', text: 'Answer this spoken question.' },
172
+         {
173
+           type: 'audio',
174
+           data: base64Pcm16,
175
+           format: 'pcm16',
176
+           sampleRate: 16000,
177
+           channels: 1,
178
+         },
179
+       ],
180
+     },
181
+   ],
182
+ });
183
+
184
+ console.log(res.results[0]?.content);
185
+ console.log(res.results[0]?.audio?.data);
186
+ ```
187
+
188
+ Gemini Live uses a one-turn WebSocket call under `.chat()`. It expects PCM input for native audio turns; use `format: 'pcm16'` or `mimeType: 'audio/pcm;rate=16000'`.
189
+
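The PCM requirement above means browser-style float samples need converting before they are sent. The following is a minimal sketch of that conversion, assuming the `data` field takes base64-encoded little-endian 16-bit PCM (the `base64Pcm16` variable name in the example suggests this, but the skill text does not state it). `floatToPcm16Base64` is a hypothetical helper, not part of `@ax-llm/ax`, and it does not resample: the input is assumed to already be mono at 16000 Hz.

```typescript
import { Buffer } from 'node:buffer';

// Hypothetical helper: converts float samples in [-1, 1] to
// little-endian signed 16-bit PCM and returns it base64-encoded.
function floatToPcm16Base64(samples: Float32Array): string {
  const pcm = Buffer.alloc(samples.length * 2);
  for (let i = 0; i < samples.length; i++) {
    // Clamp out-of-range input so it cannot overflow the 16-bit range.
    const clamped = Math.max(-1, Math.min(1, samples[i] ?? 0));
    // Scale asymmetrically: -1 maps to -32768, +1 maps to 32767.
    const scaled = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    pcm.writeInt16LE(Math.round(scaled), i * 2);
  }
  return pcm.toString('base64');
}

// Three samples: silence, full-scale positive, full-scale negative.
const base64Pcm16 = floatToPcm16Base64(new Float32Array([0, 1, -1]));
console.log(base64Pcm16.length > 0);
```

The result could then be passed as `data` with `format: 'pcm16'`, `sampleRate: 16000`, and `channels: 1`, as in the Gemini example above.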
190
+ ## Grok Voice Defaults
191
+
192
+ Use `axAIGrokVoiceDefaultConfig()` for xAI Grok Voice Agent:
193
+
194
+ - model: `grok-voice-think-fast-1.0`
195
+ - output enabled
196
+ - voice: `eve`
197
+ - output format: `pcm16`
198
+ - output sample rate: `24000`
199
+ - input default: `audio/pcm`, mono, 24000 Hz
200
+ - transcript enabled
201
+ - turn timeout: `30000`
202
+ - streaming disabled by default
203
+
204
+ ```typescript
205
+ import WebSocket from 'ws';
206
+ import { ai, axAIGrokVoiceDefaultConfig } from '@ax-llm/ax';
207
+
208
+ const grok = ai({
209
+   name: 'grok',
210
+   apiKey: process.env.GROK_API_KEY!,
211
+   config: axAIGrokVoiceDefaultConfig(),
212
+ });
213
+
214
+ const res = await grok.chat(
215
+   {
216
+     chatPrompt: [{ role: 'user', content: 'Say hello out loud.' }],
217
+   },
218
+   { webSocket: WebSocket }
219
+ );
220
+
221
+ console.log(res.results[0]?.content);
222
+ console.log(res.results[0]?.audio?.data);
223
+ ```
224
+
225
+ Grok Voice uses a one-turn WebSocket call under `.chat()`. It expects PCM input for spoken input turns; use `format: 'pcm16'` or `mimeType: 'audio/pcm'`.
226
+
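Raw `pcm16` output has no container header, so most players will not open it directly. As a sketch of one way to make the default 24000 Hz mono output playable, the hypothetical helper below prepends a standard 44-byte WAV header. It assumes `audio.data` arrives as base64-encoded little-endian PCM, which the skill text does not confirm; `pcm16ToWav` is not part of `@ax-llm/ax`.

```typescript
import { Buffer } from 'node:buffer';

// Hypothetical helper: wraps raw little-endian 16-bit PCM in a WAV container.
function pcm16ToWav(pcm: Buffer, sampleRate = 24000, channels = 1): Buffer {
  const bytesPerSample = 2;
  const blockAlign = channels * bytesPerSample;
  const header = Buffer.alloc(44);
  header.write('RIFF', 0, 'ascii');
  header.writeUInt32LE(36 + pcm.length, 4); // file size minus the first 8 bytes
  header.write('WAVE', 8, 'ascii');
  header.write('fmt ', 12, 'ascii');
  header.writeUInt32LE(16, 16); // size of the fmt chunk
  header.writeUInt16LE(1, 20); // audio format 1 = uncompressed PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(sampleRate * blockAlign, 28); // bytes per second
  header.writeUInt16LE(blockAlign, 32);
  header.writeUInt16LE(16, 34); // bits per sample
  header.write('data', 36, 'ascii');
  header.writeUInt32LE(pcm.length, 40);
  return Buffer.concat([header, pcm]);
}

// Example: four bytes (two samples) of silence.
const wav = pcm16ToWav(Buffer.alloc(4));
console.log(wav.length); // 48
```

With a real response, the call might look like `pcm16ToWav(Buffer.from(audioData, 'base64'))`, where `audioData` stands for the string read from `res.results[0]?.audio?.data`.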
227
+ ## Streaming Audio
228
+
229
+ OpenAI audio chat, OpenAI Realtime, Gemini Live, and Grok Voice all default to non-streaming, but each can stream deltas when you pass `{ stream: true }`.
230
+
231
+ ```typescript
232
+ const stream = await llm.chat(
233
+   {
234
+     chatPrompt: [{ role: 'user', content: 'Say hello.' }],
235
+   },
236
+   { stream: true }
237
+ );
238
+
239
+ for await (const chunk of stream) {
240
+   const audio = chunk.results[0]?.audio;
241
+   if (audio?.isDelta) {
242
+     playAudioChunk(audio.data);
243
+   }
244
+ }
245
+ ```
246
+
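When the deltas should be saved rather than played as they arrive, the loop above can collect them instead. The sketch below assumes each delta's `data` is a base64 string (not stated in the skill text). The `AudioDelta` and `StreamChunk` types are local stand-ins covering only the fields used in the loop above, and `collectAudio` is a hypothetical helper.

```typescript
import { Buffer } from 'node:buffer';

// Local stand-ins for the streamed shape; not the library's types.
type AudioDelta = { data?: string; isDelta?: boolean };
type StreamChunk = { results: { audio?: AudioDelta }[] };

// Hypothetical helper: gathers every audio delta into one byte buffer.
async function collectAudio(
  stream: AsyncIterable<StreamChunk>
): Promise<Buffer> {
  const parts: Buffer[] = [];
  for await (const chunk of stream) {
    const audio = chunk.results[0]?.audio;
    if (audio?.isDelta && audio.data) {
      // Decode each chunk on its own: joining base64 strings first would
      // break whenever a chunk ends with '=' padding.
      parts.push(Buffer.from(audio.data, 'base64'));
    }
  }
  return Buffer.concat(parts);
}
```

Decoding per chunk and concatenating bytes is the design choice that matters here; the rest mirrors the playback loop.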
247
+ ## Structured Outputs
248
+
249
+ Do not combine audio output with structured response formats. Audio chat may return a text transcript in `content`, but generated audio bytes live at `result.results[0].audio`.
250
+
251
+ For structured extraction from speech, use a text-only or transcription step first, then pass the transcript into `ax(...)` or `flow(...)`.
package/skills/ax-flow.md CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  name: ax-flow
3
3
  description: This skill helps an LLM generate correct AxFlow workflow code using @ax-llm/ax. Use when the user asks about flow(), AxFlow, workflow orchestration, parallel execution, DAG workflows, conditional routing, map/reduce patterns, or multi-node AI pipelines.
4
- version: "20.0.2"
4
+ version: "21.0.2"
5
5
  ---
6
6
 
7
7
  # AxFlow Codegen Rules (@ax-llm/ax)
@@ -361,6 +361,17 @@ wf.setDemos([{ programId: 'root.summarizer', traces: [] }]);
361
361
  wf.applyOptimization(optimizedProgram);
362
362
  ```
363
363
 
364
+ ## Chat Logs
365
+
366
+ `AxFlow.getChatLog()` returns a flat `readonly AxChatLogEntry[]` after `forward()`. Each child-node entry is tagged with `entry.name` so callers can filter by node:
367
+
368
+ ```typescript
369
+ const log = wf.getChatLog();
370
+ for (const entry of log) {
371
+   console.log(entry.name, entry.model);
372
+ }
373
+ ```
374
+
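Because the log is flat, per-node views have to be built by the caller. A minimal sketch of grouping entries by `entry.name` follows. It uses a local `ChatLogEntry` stand-in with only the `name` and `model` fields from the snippet above, and `groupByNode` plus the `'(unnamed)'` fallback key are hypothetical choices for this example, not library behavior.

```typescript
// Local stand-in for the two fields used above; not the library's type.
type ChatLogEntry = { name?: string; model: string };

// Hypothetical helper: buckets a flat log by node name, keeping entry order.
function groupByNode<T extends { name?: string }>(
  log: readonly T[]
): Map<string, T[]> {
  const groups = new Map<string, T[]>();
  for (const entry of log) {
    // Entries without a name share a fallback key (an assumption of this sketch).
    const key = entry.name ?? '(unnamed)';
    const bucket = groups.get(key);
    if (bucket) {
      bucket.push(entry);
    } else {
      groups.set(key, [entry]);
    }
  }
  return groups;
}

const sampleLog: ChatLogEntry[] = [
  { name: 'summarizer', model: 'model-a' },
  { name: 'classifier', model: 'model-b' },
  { name: 'summarizer', model: 'model-a' },
  { model: 'model-c' },
];
const byNode = groupByNode(sampleLog);
console.log(byNode.get('summarizer')?.length); // 2
```

In real use, `sampleLog` would be replaced by the array returned from `wf.getChatLog()`.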
364
375
  ## Error Handling
365
376
 
366
377
  ```typescript
package/skills/ax-gen.md CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  name: ax-gen
3
3
  description: This skill helps an LLM generate correct AxGen code using @ax-llm/ax. Use when the user asks about ax(), AxGen, generators, forward(), streamingForward(), assertions, field processors, step hooks, self-tuning, or structured outputs.
4
- version: "20.0.2"
4
+ version: "21.0.2"
5
5
  ---
6
6
 
7
7
  # AxGen Codegen Rules (@ax-llm/ax)
@@ -382,6 +382,7 @@ type AxChatLogMessage =
382
382
  | { role: 'tool'; name: string; content: string };
383
383
 
384
384
  type AxChatLogEntry = {
385
+ name?: string;
385
386
  model: string;
386
387
  messages: AxChatLogMessage[];
387
388
  modelUsage?: AxProgramUsage;
@@ -400,7 +401,7 @@ console.log(usage[0]?.tokens?.promptTokens);
400
401
  gen.resetUsage();
401
402
  ```
402
403
 
403
- > For `AxAgent`, both `getChatLog()` and `getUsage()` return `{ actor: ..., responder: ... }` see `ax-agent` skill.
404
+ `AxAgent` and `AxFlow` also return flat `AxChatLogEntry[]` logs; composite programs set `entry.name` so callers can filter by node/stage.
404
405
 
405
406
  ## Examples
406
407
 
package/skills/ax-gepa.md CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  name: ax-gepa
3
3
  description: This skill helps an LLM generate correct AxGEPA optimization code using @ax-llm/ax. Use when the user asks about AxGEPA, GEPA, Pareto optimization, multi-objective prompt tuning, reflective prompt evolution, validationExamples, maxMetricCalls, or optimizing a generator, flow, or agent tree.
4
- version: "20.0.2"
4
+ version: "21.0.2"
5
5
  ---
6
6
 
7
7
  # AxGEPA Codegen Rules (@ax-llm/ax)
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  name: ax-learn
3
3
  description: This skill helps an LLM generate correct AxLearn code using @ax-llm/ax. Use when the user asks about self-improving agents, trace-backed learning, feedback-aware updates, or AxLearn modes.
4
- version: "20.0.2"
4
+ version: "21.0.2"
5
5
  ---
6
6
 
7
7
  # AxLearn Codegen Rules (@ax-llm/ax)
package/skills/ax-llm.md CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
- name: ax
2
+ name: ax-llm
3
3
  description: This skill helps with using the @ax-llm/ax TypeScript library for building LLM applications. Use when the user asks about ax(), ai(), f(), s(), agent(), flow(), AxGen, AxAgent, AxFlow, signatures, streaming, or mentions @ax-llm/ax.
4
- version: "20.0.2"
4
+ version: "21.0.2"
5
5
  ---
6
6
 
7
7
  # Ax Library (@ax-llm/ax) Quick Reference
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  name: ax-signature
3
3
  description: This skill helps an LLM generate correct DSPy signature code using @ax-llm/ax. Use when the user asks about signatures, s(), f(), field types, string syntax, fluent builder API, validation constraints, or type-safe inputs/outputs.
4
- version: "20.0.2"
4
+ version: "21.0.2"
5
5
  ---
6
6
 
7
7
  # Ax Signature Reference