npm - @ax-llm/ax - Versions diffs - 20.0.2 → 21.0.2 - Mend

@ax-llm/ax 20.0.2 → 21.0.2


Files changed (20)

package/README.md +4 -9
package/index.cjs +621 -707
package/index.cjs.map +1 -1
package/index.d.cts +5657 -5226
package/index.d.ts +5657 -5226
package/index.global.js +623 -709
package/index.global.js.map +1 -1
package/index.js +621 -707
package/index.js.map +1 -1
package/package.json +1 -1
package/skills/ax-agent-optimize.md +1 -1
package/skills/ax-agent.md +369 -133
package/skills/ax-ai.md +22 -3
package/skills/ax-audio.md +251 -0
package/skills/ax-flow.md +12 -1
package/skills/ax-gen.md +3 -2
package/skills/ax-gepa.md +1 -1
package/skills/ax-learn.md +1 -1
package/skills/ax-llm.md +2 -2
package/skills/ax-signature.md +1 -1

package/skills/ax-ai.md CHANGED

@@ -1,7 +1,7 @@
 ---
 name: ax-ai
 description: This skill helps an LLM generate correct AI provider setup and configuration code using @ax-llm/ax. Use when the user asks about ai(), providers, models, presets, embeddings, extended thinking, context caching, or mentions OpenAI/Anthropic/Google/Azure/Groq/DeepSeek/Mistral/Cohere/Together/Ollama/HuggingFace/Reka/OpenRouter with @ax-llm/ax.
-version: "20.0.2"
+version: "21.0.2"
 ---
 # AI Provider Codegen Rules (@ax-llm/ax)
@@ -26,7 +26,7 @@ const openrouter = ai({ name: 'openrouter', apiKey: 'your-key' });
 const ollama = ai({ name: 'ollama', url: 'http://localhost:11434' });
 const hf = ai({ name: 'huggingface', apiKey: 'hf_...' });
 const reka = ai({ name: 'reka', apiKey: 'your-key' });
-const grok = ai({ name: 'x-grok', apiKey: 'your-key' });
+const grok = ai({ name: 'grok', apiKey: 'your-key' });
 ```
 ## Model Presets
@@ -47,6 +47,25 @@ const gemini = ai({
 await gemini.chat({ model: 'tiny', chatPrompt: [{ role: 'user', content: 'Hi' }] });
 ```
+## Model Catalog
+```typescript
+import { axGetSupportedAIModels } from '@ax-llm/ax';
+const providers = axGetSupportedAIModels();
+const openai = providers.find((provider) => provider.name === 'openai');
+console.log(openai?.models[0]?.promptTokenCostPer1M);
+const textProviders = axGetSupportedAIModels({ type: 'text' });
+const embeddingProviders = axGetSupportedAIModels({ type: 'embeddings' });
+```
+Use `axGetSupportedAIModels()` to build provider/model selectors before creating an `ai(...)` instance. It returns bundled static metadata: provider names, display names, default models, raw `AxModelInfo` pricing/details, model type (`'text'`, `'embeddings'`, `'code'`, or `'audio'`), and normalized capability flags for thinking, thoughts, structured outputs, audio, temperature, and top-p support. Provider groups and models are sorted cheapest to most expensive based on bundled input + output token pricing; unpriced models sort last.
+Filter with `{ type: 'all' | 'text' | 'embeddings' | 'code' | 'audio' }` or an array of those values. The `'text'` filter includes code-capable models; use `'code'` to show only code-first models.
+Dynamic providers such as Azure OpenAI deployments, OpenRouter, Ollama, and Hugging Face are marked with `isDynamic: true` and may have an empty or static-limited model list.
 ## Chat
 ```typescript
@@ -210,7 +229,7 @@ const client = new AxMCPClient(transport);
 ## Critical Rules
 - Use `ai()` factory for all providers.
-- Provider names: `'openai'`, `'anthropic'`, `'google-gemini'`, `'azure-openai'`, `'mistral'`, `'groq'`, `'cohere'`, `'together'`, `'deepseek'`, `'ollama'`, `'huggingface'`, `'openrouter'`, `'reka'`, `'x-grok'`
+- Provider names: `'openai'`, `'anthropic'`, `'google-gemini'`, `'azure-openai'`, `'mistral'`, `'groq'`, `'cohere'`, `'together'`, `'deepseek'`, `'ollama'`, `'huggingface'`, `'openrouter'`, `'reka'`, `'grok'`
 - Thinking constraints on Anthropic: `temperature` and `topK` are ignored; `topP` only sent if >= 0.95.
 - Bedrock uses `new AxAIBedrock()`, not `ai()`.
 - Vercel AI SDK uses `AxAIProvider` wrapper.
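
A minimal sketch, not part of the package diff, of the ordering that the new Model Catalog text describes: cheapest first by input plus output price, with unpriced models last. The `ModelInfo` type is a local stand-in; only `promptTokenCostPer1M` appears in the diff, and `completionTokenCostPer1M` is an assumed name for the output price. Treating a model with only one of the two prices as unpriced is also an assumption.

```typescript
// Hypothetical local shape; the real AxModelInfo ships with @ax-llm/ax.
// completionTokenCostPer1M is an assumed field name for the output-token price.
type ModelInfo = {
  name: string;
  promptTokenCostPer1M?: number;
  completionTokenCostPer1M?: number;
};

// Combined input + output price, or undefined when either price is missing.
function combinedCost(m: ModelInfo): number | undefined {
  if (
    m.promptTokenCostPer1M === undefined ||
    m.completionTokenCostPer1M === undefined
  ) {
    return undefined;
  }
  return m.promptTokenCostPer1M + m.completionTokenCostPer1M;
}

// Returns a sorted copy: cheapest first, models without pricing last.
function sortByCost(models: readonly ModelInfo[]): ModelInfo[] {
  return [...models].sort((a, b) => {
    const ca = combinedCost(a);
    const cb = combinedCost(b);
    if (ca === undefined && cb === undefined) return 0;
    if (ca === undefined) return 1;
    if (cb === undefined) return -1;
    return ca - cb;
  });
}
```

Since `axGetSupportedAIModels()` is described as already returning sorted results, a helper like this would only matter when merging the catalog with models from a dynamic provider.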

package/skills/ax-audio.md ADDED

@@ -0,0 +1,251 @@
+---
+name: ax-audio
+description: This skill helps an LLM generate correct conversational audio I/O code with @ax-llm/ax. Use when the user asks about .chat() audio input, audio output, OpenAI gpt-audio or realtime models, Gemini Live native audio, Grok Voice Agent models, voices, formats, transcripts, or how audio fits with signatures and structured outputs.
+version: "21.0.2"
+---
+# Audio I/O Codegen Rules (@ax-llm/ax)
+Use this skill for bounded-turn conversational audio through `.chat()`. Prefer short, modern, copyable examples. Do not model generated audio as a DSPy signature output field.
+## Core Rule
+Audio output is returned on `AxChatResponseResult.audio`, not in signature fields.
+Signatures should keep text fields text-shaped:
+```typescript
+const result = await llm.chat({
+  chatPrompt: [{ role: 'user', content: 'Say hello out loud.' }],
+  modelConfig: {
+    audio: { output: { enabled: true } },
+  },
+});
+console.log(result.results[0]?.content);
+console.log(result.results[0]?.audio?.data);
+console.log(result.results[0]?.audio?.transcript);
+```
+Do not write signatures like `question:string -> audio:audio`. Use `.chat()` for conversational audio and use `audio.data` for the generated bytes.
+## Config Shape
+```typescript
+type AxChatAudioConfig = {
+  input?: {
+    format?: 'wav' | 'mp3' | 'flac' | 'opus' | 'aac' | 'pcm16' | 'pcm' | 'ogg';
+    mimeType?: string;
+    sampleRate?: number;
+    channels?: number;
+  };
+  output?: {
+    enabled?: boolean;
+    voice?: string | { id: string };
+    format?: 'wav' | 'mp3' | 'flac' | 'opus' | 'aac' | 'pcm16' | 'pcm' | 'ogg';
+    sampleRate?: number;
+    channels?: number;
+    includeTranscript?: boolean;
+  };
+  live?: {
+    turnTimeoutMs?: number;
+    enableAffectiveDialog?: boolean;
+    proactiveAudio?: boolean;
+  };
+};
+```
+## OpenAI Defaults
+Use `axAIOpenAIAudioDefaultConfig()` for OpenAI request-based audio chat:
+- model: `gpt-audio-mini`
+- output enabled
+- voice: `alloy`
+- output format: `wav`
+- transcript enabled
+- streaming disabled by default
+- audio input formats: `wav`, `mp3`
+- audio output formats: `wav`, `mp3`, `flac`, `opus`, `aac`, `pcm16`
+```typescript
+import { ai, axAIOpenAIAudioDefaultConfig } from '@ax-llm/ax';
+const openai = ai({
+  name: 'openai',
+  apiKey: process.env.OPENAI_APIKEY!,
+  config: axAIOpenAIAudioDefaultConfig(),
+});
+const res = await openai.chat({
+  chatPrompt: [
+    {
+      role: 'user',
+      content: [
+        { type: 'text', text: 'What is in this recording?' },
+        { type: 'audio', data: base64Wav, format: 'wav' },
+      ],
+    },
+  ],
+});
+console.log(res.results[0]?.content);
+console.log(res.results[0]?.audio?.data);
+```
+Use `axAIOpenAIRealtimeDefaultConfig()` for OpenAI realtime speech-to-speech:
+- model: `gpt-realtime-2`
+- output enabled
+- voice: `marin`
+- output format: `pcm16`
+- input default: `audio/pcm`, mono, 24000 Hz
+- turn timeout: `30000`
+- streaming disabled by default
+Use `axAIOpenAIRealtimeTranscriptionDefaultConfig()` for realtime transcript deltas:
+- model: `gpt-realtime-whisper`
+- input default: `audio/pcm`, mono, 24000 Hz
+- output audio disabled; transcript text is returned on `content`
+Realtime models use a one-turn WebSocket call under `.chat()`. In Node, pass a WebSocket constructor through request options:
+```typescript
+import WebSocket from 'ws';
+import { ai, axAIOpenAIRealtimeDefaultConfig } from '@ax-llm/ax';
+const openai = ai({
+  name: 'openai',
+  apiKey: process.env.OPENAI_APIKEY!,
+  config: axAIOpenAIRealtimeDefaultConfig(),
+});
+const stream = await openai.chat(
+  {
+    chatPrompt: [{ role: 'user', content: 'Say hello out loud.' }],
+  },
+  { stream: true, webSocket: WebSocket }
+);
+```
+For follow-up turns, keep the assistant audio reference in history:
+```typescript
+await openai.chat({
+  chatPrompt: [
+    { role: 'assistant', audio: { id: previousAudioId } },
+    { role: 'user', content: 'Repeat that more slowly.' },
+  ],
+});
+```
+## Gemini Live Defaults
+Use `axAIGoogleGeminiLiveAudioDefaultConfig()` for Gemini native audio:
+- model: `gemini-2.5-flash-native-audio-preview-12-2025`
+- output enabled
+- voice: `Kore`
+- output format: `pcm16`
+- output sample rate: `24000`
+- input default: `audio/pcm;rate=16000`, mono
+- transcript enabled
+- turn timeout: `30000`
+- streaming disabled by default
+```typescript
+import { ai, axAIGoogleGeminiLiveAudioDefaultConfig } from '@ax-llm/ax';
+const gemini = ai({
+  name: 'google-gemini',
+  apiKey: process.env.GOOGLE_APIKEY!,
+  config: axAIGoogleGeminiLiveAudioDefaultConfig(),
+});
+const res = await gemini.chat({
+  chatPrompt: [
+    {
+      role: 'user',
+      content: [
+        { type: 'text', text: 'Answer this spoken question.' },
+        {
+          type: 'audio',
+          data: base64Pcm16,
+          format: 'pcm16',
+          sampleRate: 16000,
+          channels: 1,
+        },
+      ],
+    },
+  ],
+});
+console.log(res.results[0]?.content);
+console.log(res.results[0]?.audio?.data);
+```
+Gemini Live uses a one-turn WebSocket call under `.chat()`. It expects PCM input for native audio turns; use `format: 'pcm16'` or `mimeType: 'audio/pcm;rate=16000'`.
+## Grok Voice Defaults
+Use `axAIGrokVoiceDefaultConfig()` for xAI Grok Voice Agent:
+- model: `grok-voice-think-fast-1.0`
+- output enabled
+- voice: `eve`
+- output format: `pcm16`
+- output sample rate: `24000`
+- input default: `audio/pcm`, mono, 24000 Hz
+- transcript enabled
+- turn timeout: `30000`
+- streaming disabled by default
+```typescript
+import WebSocket from 'ws';
+import { ai, axAIGrokVoiceDefaultConfig } from '@ax-llm/ax';
+const grok = ai({
+  name: 'grok',
+  apiKey: process.env.GROK_API_KEY!,
+  config: axAIGrokVoiceDefaultConfig(),
+});
+const res = await grok.chat(
+  {
+    chatPrompt: [{ role: 'user', content: 'Say hello out loud.' }],
+  },
+  { webSocket: WebSocket }
+);
+console.log(res.results[0]?.content);
+console.log(res.results[0]?.audio?.data);
+```
+Grok Voice uses a one-turn WebSocket call under `.chat()`. It expects PCM input for spoken input turns; use `format: 'pcm16'` or `mimeType: 'audio/pcm'`.
+## Streaming Audio
+OpenAI audio chat, OpenAI Realtime, Gemini Live, and Grok Voice all default to non-streaming, but each can stream deltas when you pass `{ stream: true }`.
+```typescript
+const stream = await llm.chat(
+  {
+    chatPrompt: [{ role: 'user', content: 'Say hello.' }],
+  },
+  { stream: true }
+);
+for await (const chunk of stream) {
+  const audio = chunk.results[0]?.audio;
+  if (audio?.isDelta) {
+    playAudioChunk(audio.data);
+  }
+}
+```
+## Structured Outputs
+Do not combine audio output with structured response formats. Audio chat may return a text transcript in `content`, but generated audio bytes live at `result.results[0].audio`.
+For structured extraction from speech, use a text-only or transcription step first, then pass the transcript into `ax(...)` or `flow(...)`.
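
A minimal sketch, not part of the package diff, of what a caller might do with the `pcm16` output that the realtime, Gemini Live, and Grok Voice defaults produce: join streamed chunks and wrap the raw samples in a WAV header so ordinary players can open them. Both helpers are hypothetical and take already-decoded bytes. The diff does not state how `audio.data` is encoded, so any base64 decoding step is left to the caller. The 24000 Hz mono default mirrors the output sample rate listed above.

```typescript
// Concatenate streamed audio chunks (already decoded to bytes) into one buffer.
function concatChunks(chunks: readonly Uint8Array[]): Uint8Array {
  const total = chunks.reduce((sum, c) => sum + c.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const c of chunks) {
    out.set(c, offset);
    offset += c.length;
  }
  return out;
}

// Wrap raw little-endian PCM16 samples in a 44-byte RIFF/WAVE header.
function pcm16ToWav(
  pcm: Uint8Array,
  sampleRate = 24000,
  channels = 1
): Uint8Array {
  const bytesPerSample = 2;
  const blockAlign = channels * bytesPerSample;
  const byteRate = sampleRate * blockAlign;
  const out = new Uint8Array(44 + pcm.length);
  const view = new DataView(out.buffer);
  const writeTag = (offset: number, tag: string): void => {
    for (let i = 0; i < tag.length; i++) out[offset + i] = tag.charCodeAt(i);
  };
  writeTag(0, 'RIFF');
  view.setUint32(4, 36 + pcm.length, true); // RIFF chunk size
  writeTag(8, 'WAVE');
  writeTag(12, 'fmt ');
  view.setUint32(16, 16, true); // fmt chunk size for PCM
  view.setUint16(20, 1, true); // audio format 1 = linear PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, byteRate, true);
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, 16, true); // bits per sample
  writeTag(36, 'data');
  view.setUint32(40, pcm.length, true); // data chunk size
  out.set(pcm, 44);
  return out;
}
```

For the OpenAI request-based default (`wav` output) no wrapping should be needed; this only applies when the configured output format is raw PCM.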

package/skills/ax-flow.md CHANGED

@@ -1,7 +1,7 @@
 ---
 name: ax-flow
 description: This skill helps an LLM generate correct AxFlow workflow code using @ax-llm/ax. Use when the user asks about flow(), AxFlow, workflow orchestration, parallel execution, DAG workflows, conditional routing, map/reduce patterns, or multi-node AI pipelines.
-version: "20.0.2"
+version: "21.0.2"
 ---
 # AxFlow Codegen Rules (@ax-llm/ax)
@@ -361,6 +361,17 @@ wf.setDemos([{ programId: 'root.summarizer', traces: [] }]);
 wf.applyOptimization(optimizedProgram);
 ```
+## Chat Logs
+`AxFlow.getChatLog()` returns a flat `readonly AxChatLogEntry[]` after `forward()`. Each child-node entry is tagged with `entry.name` so callers can filter by node:
+```typescript
+const log = wf.getChatLog();
+for (const entry of log) {
+  console.log(entry.name, entry.model);
+}
+```
 ## Error Handling
 ```typescript

package/skills/ax-gen.md CHANGED

@@ -1,7 +1,7 @@
 ---
 name: ax-gen
 description: This skill helps an LLM generate correct AxGen code using @ax-llm/ax. Use when the user asks about ax(), AxGen, generators, forward(), streamingForward(), assertions, field processors, step hooks, self-tuning, or structured outputs.
-version: "20.0.2"
+version: "21.0.2"
 ---
 # AxGen Codegen Rules (@ax-llm/ax)
@@ -382,6 +382,7 @@ type AxChatLogMessage =
   | { role: 'tool'; name: string; content: string };
 type AxChatLogEntry = {
+  name?: string;
   model: string;
   messages: AxChatLogMessage[];
   modelUsage?: AxProgramUsage;
@@ -400,7 +401,7 @@ console.log(usage[0]?.tokens?.promptTokens);
 gen.resetUsage();
 ```
-> For `AxAgent`, both `getChatLog()` and `getUsage()` return `{ actor: ..., responder: ... }` — see `ax-agent` skill.
+`AxAgent` and `AxFlow` also return flat `AxChatLogEntry[]` logs; composite programs set `entry.name` so callers can filter by node/stage.
 ## Examples
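
A minimal sketch, not part of the package diff, of how a caller might use the new optional `name` field on flat chat logs to split entries by node or stage. `ChatLogEntry` is a local stand-in for the `AxChatLogEntry` shape shown in the hunk above, trimmed to the fields used here; `groupByNode` and the `'(unnamed)'` bucket are hypothetical.

```typescript
// Local mirror of the fields shown in the diff; the real type ships with @ax-llm/ax.
type ChatLogEntry = {
  name?: string;
  model: string;
  messages: unknown[];
};

// Group a flat log by entry.name; entries without a name share one bucket.
function groupByNode(
  log: readonly ChatLogEntry[]
): Map<string, ChatLogEntry[]> {
  const groups = new Map<string, ChatLogEntry[]>();
  for (const entry of log) {
    const key = entry.name ?? '(unnamed)';
    const bucket = groups.get(key);
    if (bucket) {
      bucket.push(entry);
    } else {
      groups.set(key, [entry]);
    }
  }
  return groups;
}
```

Under these assumptions, the result of `wf.getChatLog()` could be passed straight in, and `groups.get('summarizer')` would hold only the entries tagged with that name.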

package/skills/ax-gepa.md CHANGED

@@ -1,7 +1,7 @@
 ---
 name: ax-gepa
 description: This skill helps an LLM generate correct AxGEPA optimization code using @ax-llm/ax. Use when the user asks about AxGEPA, GEPA, Pareto optimization, multi-objective prompt tuning, reflective prompt evolution, validationExamples, maxMetricCalls, or optimizing a generator, flow, or agent tree.
-version: "20.0.2"
+version: "21.0.2"
 ---
 # AxGEPA Codegen Rules (@ax-llm/ax)

package/skills/ax-learn.md CHANGED

@@ -1,7 +1,7 @@
 ---
 name: ax-learn
 description: This skill helps an LLM generate correct AxLearn code using @ax-llm/ax. Use when the user asks about self-improving agents, trace-backed learning, feedback-aware updates, or AxLearn modes.
-version: "20.0.2"
+version: "21.0.2"
 ---
 # AxLearn Codegen Rules (@ax-llm/ax)

package/skills/ax-llm.md CHANGED

@@ -1,7 +1,7 @@
 ---
-name: ax
+name: ax-llm
 description: This skill helps with using the @ax-llm/ax TypeScript library for building LLM applications. Use when the user asks about ax(), ai(), f(), s(), agent(), flow(), AxGen, AxAgent, AxFlow, signatures, streaming, or mentions @ax-llm/ax.
-version: "20.0.2"
+version: "21.0.2"
 ---
 # Ax Library (@ax-llm/ax) Quick Reference

package/skills/ax-signature.md CHANGED

@@ -1,7 +1,7 @@
 ---
 name: ax-signature
 description: This skill helps an LLM generate correct DSPy signature code using @ax-llm/ax. Use when the user asks about signatures, s(), f(), field types, string syntax, fluent builder API, validation constraints, or type-safe inputs/outputs.
-version: "20.0.2"
+version: "21.0.2"
 ---
 # Ax Signature Reference