npm - @ax-llm/ax - Versions diffs - 21.0.12 → 21.0.13 - Mend

@ax-llm/ax 21.0.12 → 21.0.13

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (22)

package/index.cjs +358 -223
package/index.cjs.map +1 -1
package/index.d.cts +4484 -4282
package/index.d.ts +4484 -4282
package/index.global.js +358 -223
package/index.global.js.map +1 -1
package/index.js +358 -223
package/index.js.map +1 -1
package/package.json +1 -1
package/skills/ax-agent-memory-skills.md +52 -3
package/skills/ax-agent-observability.md +2 -2
package/skills/ax-agent-optimize.md +22 -27
package/skills/ax-agent-rlm.md +30 -43
package/skills/ax-agent.md +46 -11
package/skills/ax-ai.md +38 -7
package/skills/ax-audio.md +155 -33
package/skills/ax-flow.md +1 -1
package/skills/ax-gen.md +1 -1
package/skills/ax-gepa.md +1 -1
package/skills/ax-learn.md +1 -1
package/skills/ax-llm.md +1 -1
package/skills/ax-signature.md +13 -8

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@ax-llm/ax",
-  "version": "21.0.12",
+  "version": "21.0.13",
   "type": "module",
   "description": "The best library to work with LLMs",
   "repository": {

package/skills/ax-agent-memory-skills.md CHANGED

@@ -1,16 +1,17 @@
 ---
 name: ax-agent-memory-skills
-description: This skill helps an LLM generate correct AxAgent memory retrieval and dynamic skill-loading code using @ax-llm/ax. Use when the user asks about onMemoriesSearch, recall(...), inputs.memories, onLoadedMemories, onUsedMemories, onSkillsSearch, discover({ skills }), onLoadedSkills, onUsedSkills, preloaded skills, loaded memory/skill IDs, or carrying memories across forward() calls.
-version: "21.0.12"
+description: This skill helps an LLM generate correct AxAgent memory retrieval, context-map, and dynamic skill-loading code using @ax-llm/ax. Use when the user asks about contextMap, AxAgentContextMap, onMemoriesSearch, recall(...), inputs.memories, onLoadedMemories, onUsedMemories, onSkillsSearch, discover({ skills }), onLoadedSkills, onUsedSkills, preloaded skills, loaded memory/skill IDs, or carrying memories across forward() calls.
+version: "21.0.13"
 ---
 # AxAgent Memory And Skills Rules (@ax-llm/ax)
-Use this skill when an agent needs to retrieve task-relevant memories or load skill guides into the executor prompt on demand. For ordinary agent setup use `ax-agent`. For RLM runtime policy use `ax-agent-rlm`. For callbacks and telemetry use `ax-agent-observability`.
+Use this skill when an agent needs a persistent context map, task-relevant memory retrieval, or skill guides loaded into the executor prompt on demand. For ordinary agent setup use `ax-agent`. For RLM runtime policy use `ax-agent-rlm`. For callbacks and telemetry use `ax-agent-observability`.
 ## Use These Defaults
 - Use `onMemoriesSearch` when the agent should pull relevant context from an external store instead of stuffing everything into the prompt upfront.
+- Use `contextMap` when repeated runs inspect the same long external context and should accumulate a small orientation cache automatically.
 - Use `onSkillsSearch` when the agent should load usage guides, runbooks, or domain conventions into the executor prompt on demand.
 - `recall(...)` is available to distiller and executor stages when `onMemoriesSearch` is set.
 - `discover({ skills })` is available to the executor when `onSkillsSearch` is set.
@@ -19,6 +20,53 @@ Use this skill when an agent needs to retrieve task-relevant memories or load sk
 - Use `onUsedMemories` / `onUsedSkills` to track what the actor says it actually relied on.
 - Child agents do not inherit memory or skills search callbacks; wire them explicitly on every agent that needs the capability.
+## Context Map
+Use `contextMap` when repeated runs ask different questions over the same long context, document set, or repository. The map is prompt-resident orientation knowledge: structure, concepts, constants, parsing schema, reusable aggregate results, and concrete error patterns. It is not a task-specific answer cache.
+Runnable example: [`src/examples/rlm-context-map.ts`](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-context-map.ts) demonstrates one update, `onUpdate` snapshot persistence, finite evolve, and frozen map reuse.
+When `contextMap` is configured:
+- Ax injects the current map into the distiller prompt.
+- Ax updates the map once after each successful completed `forward(...)`.
+- By default the map evolves forever. For a finite warmup, create the map with `{ infiniteEvolve: false, evolveSteps: N }`; after `N` successful updates it is still injected but no longer updated.
+- Failed runs, aborts, and clarification requests do not update the map.
+- Use `onUpdate` to persist `result.map.snapshot()` outside the agent.
+```typescript
+import { agent, AxAgentContextMap } from '@ax-llm/ax';
+const map = new AxAgentContextMap(savedSnapshot, {
+  maxChars: 4000,
+  infiniteEvolve: false,
+  evolveSteps: 10,
+});
+const myAgent = agent('context:string, query:string -> answer:string', {
+  contextFields: ['context'],
+  contextMap: {
+    map,
+    onUpdate: ({ map }) => saveSnapshot(map.snapshot()),
+  },
+});
+```
+Types:
+```typescript
+type AxAgentContextMapConfig = {
+  map?: AxAgentContextMap | AxAgentContextMapSnapshot | string;
+  onUpdate?: (result: AxAgentContextMapUpdateResult) => void | Promise<void>;
+};
+type AxAgentContextMapOptions = {
+  maxChars?: number;
+  infiniteEvolve?: boolean;
+  evolveSteps?: number;
+};
+```
 ## Memory Search
 Use `onMemoriesSearch` when the agent needs to pull task-relevant context such as user preferences, prior decisions, project facts, or past conversations from an external store (vector DB, BM25, KV). The actor decides what to load, when, and how much.
@@ -268,6 +316,7 @@ onUsedSkills?: (
   usedSkills: readonly AxAgentUsedSkill[]
 ) => void | Promise<void>;
+contextMap?: AxAgentContextMapConfig;
 skills?: readonly AxAgentSkillResult[];
 ```
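
The update rules added in the Context Map hunk above (one update per successful run, no update on failed runs, aborts, or clarification requests, and a finite warmup that freezes the map after `N` updates while still injecting it) can be illustrated with a small toy model. This is only a sketch of the semantics as the skill text describes them, not the library's `AxAgentContextMap` implementation; `ToyContextMap`, `RunOutcome`, `afterRun`, and `inject` are hypothetical names invented for this illustration.

```typescript
// Toy model of the documented contextMap semantics.
// Hypothetical names throughout; this is NOT the @ax-llm/ax implementation.
type RunOutcome = 'success' | 'failure' | 'abort' | 'clarification';

interface ToyEvolveOptions {
  infiniteEvolve: boolean;
  evolveSteps: number;
}

class ToyContextMap {
  private text: string;
  private updates = 0;
  private readonly options: ToyEvolveOptions;

  constructor(initialText: string, options: ToyEvolveOptions) {
    this.text = initialText;
    this.options = options;
  }

  // The map is always available for injection, even once it has stopped evolving.
  inject(): string {
    return this.text;
  }

  // Applies the proposed map text only for successful runs that are still
  // inside the evolve window. Returns true when an update was applied.
  afterRun(outcome: RunOutcome, proposedText: string): boolean {
    if (outcome !== 'success') return false;
    const frozen =
      !this.options.infiniteEvolve && this.updates >= this.options.evolveSteps;
    if (frozen) return false;
    this.text = proposedText;
    this.updates += 1;
    return true;
  }

  // Plain data that a caller could persist outside the agent.
  snapshot(): { text: string; updates: number } {
    return { text: this.text, updates: this.updates };
  }
}
```

With `{ infiniteEvolve: false, evolveSteps: 2 }`, the third successful run in this model leaves the map unchanged, which mirrors the "still injected but no longer updated" wording in the hunk.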

package/skills/ax-agent-observability.md CHANGED

@@ -1,7 +1,7 @@
 ---
 name: ax-agent-observability
 description: This skill helps an LLM generate correct AxAgent observability code using @ax-llm/ax. Use when the user asks about actorTurnCallback, executorTurnCallback, onContextEvent, agentStatusCallback, onFunctionCall, reportSuccess, reportFailure, getChatLog(), getUsage(), resetUsage(), debug traces, progress updates, or telemetry for AxAgent runs.
-version: "21.0.12"
+version: "21.0.13"
 ---
 # AxAgent Observability Rules (@ax-llm/ax)
@@ -66,7 +66,7 @@ Important:
 - `result` is the raw runtime result before Ax applies type-aware serialization and budget-proportional truncation.
 - `thought` is optional and only appears when the underlying `AxGen` call had `showThoughts` enabled and the provider actually returned a thought field.
 - `actionLogEntryCount` and `guidanceLogEntryCount` reflect the live log sizes after the turn is processed, including resumed runs.
-- `actorTurnCallback` fires for the root agent and for recursive child agents that run actor turns.
+- `actorTurnCallback` fires for the configured agent instance. Child agents passed through `functions: [...]` should define their own callback if you need their internal actor turns; use `onFunctionCall` on the parent to observe the parent-side child-agent invocation.
 Good pattern:

package/skills/ax-agent-optimize.md CHANGED

@@ -1,7 +1,7 @@
 ---
 name: ax-agent-optimize
-description: This skill helps an LLM generate correct AxAgent tuning and evaluation code using @ax-llm/ax. Use when the user asks about agent.optimize(...), judgeOptions, eval datasets, optimization targets, saved optimizedProgram artifacts, or recursive optimization guidance.
-version: "21.0.12"
+description: This skill helps an LLM generate correct AxAgent tuning and evaluation code using @ax-llm/ax. Use when the user asks about agent.optimize(...), judgeOptions, eval datasets, optimization targets, saved optimizedProgram artifacts, or agent optimization guidance.
+version: "21.0.13"
 ---
 # AxAgent Optimize Codegen Rules (@ax-llm/ax)
@@ -13,7 +13,7 @@ Your job is to help the model choose a good optimization setup for the user's ac
 - If the user wants better tool use, prefer action-aware tasks and either a deterministic metric or the built-in judge depending on how objective the scoring is.
 - If the user wants better wording only, responder optimization may be enough.
 - If the user wants reusable improvements, include artifact save/load.
-- If the user wants cost or recursion behavior improved, make the eval tasks expose those tradeoffs explicitly.
+- If the user wants cost, tool-use, or child-agent delegation behavior improved, make the eval tasks expose those tradeoffs explicitly.
 ## Use These Defaults
@@ -32,7 +32,7 @@ Your job is to help the model choose a good optimization setup for the user's ac
 - For first examples, pass a plain task array instead of splitting into `train` and `validation` unless the user already has a holdout set.
 - GEPA-backed `agent.optimize(...)` now optimizes generic components exposed by the selected target programs; `target: 'actor'` only tunes actor components, `target: 'responder'` only tunes responder components, and `target: 'all'` broadens the component set.
 - `result.optimizedProgram.componentMap` is the canonical saved artifact for agent GEPA runs. It may include actor instructions, descriptions, tool descriptions/names, templates, or runtime primitives depending on what the selected target exposes.
-- When recursive behavior matters, keep `mode: 'advanced'` on the agent and tune against realistic `recursionOptions`.
+- When child-agent delegation matters, expose the child agents as named functions and tune against realistic call/no-call tasks.
 ## Decision Guide
@@ -41,7 +41,7 @@ Pick the optimization shape from the user's need:
 - "Make the agent use tools correctly" -> keep the default actor target and use `expectedActions` and `forbiddenActions`.
 - "Make final answers read better" -> consider `target: 'responder'`, but only if the task is not mostly tool-selection or clarification behavior.
 - "Make the whole agent better" -> use the default actor target first; only broaden target selection when the user clearly wants that extra scope.
-- "Tune recursive delegation" -> keep `mode: 'advanced'` and use tasks that actually exercise recursion depth, fan-out, and termination choices.
+- "Tune child-agent delegation" -> use tasks that exercise when to call the child agent, when to call normal tools, and when to answer directly.
 - "Compare before and after" -> include a held-out task plus artifact save/load and replay.
 Choose task design carefully:
@@ -58,8 +58,8 @@ Optimization works much better when the agent and dataset remove avoidable ambig
 - Prefer typed tool outputs over free-form JSON blobs so the actor can rely on exact field names.
 - Tell the actor the exact tool fields it may use when payload shape matters.
 - Explicitly ban invented fields if the model has any reason to guess hidden IDs or alternate key names.
-- If recursive children only see explicit `llmQuery(..., context)` payloads, say that directly in the actor prompt.
-- For recursive synthesis, tell the agent what the narrowed context should look like before delegation.
+- If a child agent needs parent values, declare those fields in the child signature and pass them explicitly at the call site.
+- For specialist synthesis, tell the agent what narrowed context should be passed to the child agent.
 - Keep `maxSubAgentCalls` small in examples unless the user is explicitly testing broad fan-out behavior.
 - Use canonical, unambiguous task wording so the model does not burn turns asking for fake clarification.
 - In JS-runtime agents, require raw runnable JavaScript only. Ban `javascript:` prefixes, mixed prose/code, and multi-snippet turns.
@@ -68,14 +68,14 @@ Good pattern:
 - tool schema says exactly what fields exist
 - task names the exact entity to look up
-- actor prompt says which fields to extract before delegation
-- metric or judge penalizes unnecessary recursion and tool misuse
+- actor prompt says which fields to extract before calling a child agent
+- metric or judge penalizes unnecessary child-agent calls and tool misuse
 Bad pattern:
 - tool returns `json` with an underspecified shape
 - task uses overloaded names like `Atlas` without clarifying whether that is a project, team, or account
-- recursive child is expected to infer hidden parent state that was never passed in context
+- child agent is expected to infer hidden parent state that was never passed in its call arguments
 - code agent is allowed to mix natural language with JavaScript in the same turn
 ## Metric vs Judge
@@ -149,7 +149,7 @@ const assistant = agent('query:string -> answer:string', {
   judgeAI,
   contextFields: [],
   runtime: new AxJSRuntime(),
-  functions: { local: tools },
+  functions: tools,
   contextPolicy: { preset: 'checkpointed', budget: 'balanced' },
   judgeOptions: {
     description: 'Prefer correct tool use over polished wording.',
@@ -222,7 +222,7 @@ const result = await assistant.optimize(tasks, {
       score += 0.4;
     }
-    if ((prediction.recursiveStats?.recursiveCallCount ?? 0) === 0) {
+    if (prediction.turnCount <= 3) {
       score += 0.2;
     }
@@ -234,7 +234,7 @@ const result = await assistant.optimize(tasks, {
 Use this pattern when:
 - the task has a known correct answer or exact action pattern
-- recursion cost or tool count must be measured explicitly
+- tool count, child-agent calls, or turn count must be measured explicitly
 - you want repeatable, low-variance optimization runs
 ## Built-In Judge Pattern
@@ -247,7 +247,7 @@ const result = await assistant.optimize(tasks, {
   judgeOptions: {
     model: AxAIGoogleGeminiModel.Gemini3Pro,
     description:
-      'Be strict about unnecessary delegation, weak clarifications, and incorrect tool choices.',
+      'Be strict about unnecessary child-agent calls, weak clarifications, and incorrect tool choices.',
   },
   maxMetricCalls: 12,
 });
@@ -306,7 +306,6 @@ Use this pattern when:
 - Use `expectedActions` and `forbiddenActions` when tool correctness matters.
 - `judgeOptions` mirrors normal forward options and supports extra judge guidance through `description`.
 - The built-in judge scores from the full agent run, not just the final reply. It can see completion type, clarification payload, final output, action log, normalized function calls, tool errors, and turn count.
-- For recursive advanced-mode evals, the built-in judge can also see `recursiveTrace` and `recursiveStats`.
 - If the user provides a custom `metric`, that overrides the built-in judge path.
 - If the user provides an LLM-based custom metric, keep the output schema as small as possible and prefer a direct numeric score.
@@ -326,16 +325,13 @@ Decision rules:
 - For final outcomes in custom metrics, expect `prediction.completionType === 'final'` and populated `prediction.output`.
 - `target: 'responder'` still works, but clarification-heavy tasks are usually low-signal for responder optimization.
-## Recursive Optimization Notes
+## Delegation Optimization Notes
-- Recursive-slot artifacts require an agent configured for recursive advanced mode.
-- Keep `mode: 'advanced'` top-level; child recursion behavior still follows `recursionOptions`.
-- When recursive behavior matters, tune against the same `maxDepth` and tool/discovery structure you expect in production.
-- Use recursive traces and recursive stats when the user wants to diagnose where token or delegation cost is coming from.
-- For recursion-efficiency tuning, prefer a deterministic metric unless the user specifically needs a qualitative LLM review of decomposition quality.
-- Tell the actor that recursive children only see passed context, not parent globals or prior tool results.
-- For synthesis-style recursive tasks, specify the desired delegation pattern explicitly, for example "use at most one focused delegated child analysis after narrowing the tool output in JS."
-- Penalize over-decomposition directly in the metric or judge prompt.
+- Prefer explicit child agents in `functions: [...]` for specialist delegation. Their calls appear as normal function-call records.
+- When delegation behavior matters, tune against the same child-agent/tool structure you expect in production.
+- Tell the actor which fields to pass to the child agent and which tasks should stay local.
+- For synthesis-style tasks, specify the desired delegation pattern explicitly, for example "call `team.writer(...)` only after narrowing tool output in JS."
+- Penalize unnecessary child-agent calls directly in the metric or judge prompt.
 - If one training task keeps collapsing to zero, inspect that task first instead of adding more optimizer rounds. Most failures come from task ambiguity, weak tool schemas, or vague delegation guidance rather than GEPA itself.
 ## Artifacts And Replay
@@ -350,7 +346,6 @@ Decision rules:
 - [RLM Agent Optimize](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-agent-optimize.ts) — Gemini office-assistant tuning with save/load
 - [AxAgent GEPA Component Optimization](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/axagent-gepa-optimization.ts) — compact support-agent GEPA run with deterministic metric and artifact replay
-- [RLM Agent Recursive Optimize](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-agent-recursive-optimize.ts) — recursive-slot optimization artifacts
 ## Do Not Generate
@@ -358,6 +353,6 @@ Decision rules:
 - Do not recommend responder-only optimization by default for clarification-heavy workflows.
 - Do not omit artifact save/load steps when the user asks for reusable optimized configurations.
 - Do not introduce a dedicated judge class or helper abstraction in new agent-optimize examples; prefer the built-in judge path or a plain typed `AxGen`.
-- Do not rely on vague `json` tool returns when the agent must reason about specific fields across recursive steps.
-- Do not leave recursive child context implicit. If the child needs a fact, pass it explicitly.
+- Do not rely on vague `json` tool returns when the agent must reason about specific fields across tool or child-agent calls.
+- Do not leave child-agent inputs implicit. If the child needs a fact, pass it explicitly.
 - Do not let code-generation agents mix prose and JavaScript if the user is optimizing runtime behavior.
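
The ax-agent-optimize hunks above replace recursion statistics with turn count and child-agent calls as the things a deterministic metric should measure. A minimal sketch of such a scoring function follows, assuming a simplified prediction shape. Only `completionType` and `turnCount` appear in the skill text; `ToyPrediction`, `ToyTask`, `calledFunctions`, and the individual weights are assumptions made up for this illustration, and the real prediction object passed to `metric` may look different.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface ToyPrediction {
  completionType: string; // the skill text expects 'final' for final outcomes
  turnCount: number;
  calledFunctions: string[]; // assumed: qualified names of functions the run called
}

interface ToyTask {
  expectedActions: string[];
  forbiddenActions: string[];
}

// Deterministic score in [0, 1]: rewards finishing, calling the expected
// functions, and staying within a small turn budget; penalizes forbidden calls.
function scoreRun(prediction: ToyPrediction, task: ToyTask): number {
  if (prediction.completionType !== 'final') return 0;

  let score = 0.4; // the run reached a final answer
  const called = new Set(prediction.calledFunctions);

  if (task.expectedActions.every((name) => called.has(name))) {
    score += 0.4;
  }
  if (prediction.turnCount <= 3) {
    score += 0.2;
  }
  if (task.forbiddenActions.some((name) => called.has(name))) {
    score -= 0.5; // e.g. an unnecessary child-agent call
  }

  return Math.max(0, Math.min(1, score));
}
```

Listing a child agent such as `team.writer` under `forbiddenActions` for tasks that should stay local is one way to penalize unnecessary delegation without involving an LLM judge.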

package/skills/ax-agent-rlm.md CHANGED

@@ -1,12 +1,12 @@
 ---
 name: ax-agent-rlm
-description: This skill helps an LLM generate correct AxAgent RLM/runtime code using @ax-llm/ax. Use when the user asks about RLM code execution, AxJSRuntime, contextFields, contextPolicy, liveRuntimeState, promptLevel, stage prompt controls, executorModelPolicy, maxRuntimeChars, agent.test(...), llmQuery(...), mode: 'advanced', recursionOptions, or long-running agent runtime behavior.
-version: "21.0.12"
+description: This skill helps an LLM generate correct AxAgent RLM/runtime code using @ax-llm/ax. Use when the user asks about RLM code execution, AxJSRuntime, contextFields, contextPolicy, liveRuntimeState, promptLevel, stage prompt controls, executorModelPolicy, maxRuntimeChars, agent.test(...), llmQuery(...), recursionOptions, or long-running agent runtime behavior.
+version: "21.0.13"
 ---
 # AxAgent RLM Runtime Rules (@ax-llm/ax)
-Use this skill for code-runtime agents and recursive/delegated runtime behavior. For ordinary agent setup, child agents, tool namespaces, clarification, and `bubbleErrors`, use `ax-agent`. For callbacks and logs, use `ax-agent-observability`. For memories and skill loading, use `ax-agent-memory-skills`.
+Use this skill for code-runtime agents and `llmQuery(...)` semantic-helper behavior. For ordinary agent setup, child agents, tool namespaces, clarification, and `bubbleErrors`, use `ax-agent`. For callbacks and logs, use `ax-agent-observability`. For memories and skill loading, use `ax-agent-memory-skills`.
 ## Use These Defaults
@@ -14,11 +14,13 @@ Use this skill for code-runtime agents and recursive/delegated runtime behavior.
 - In stdout-mode RLM, use one observable `console.log(...)` step per non-final actor turn.
 - Default to `contextPolicy: { preset: 'checkpointed', budget: 'balanced' }` for most RLM tasks.
 - Prefer `contextPolicy: { preset: 'adaptive', budget: 'balanced' }` when older successful turns should collapse sooner while live runtime state stays visible.
+- Use `contextMap` for recurring long-context corpora when the distiller should start future runs with a small persisted orientation cache.
 - Prefer `promptLevel: 'default'` for normal use.
 - Use `promptLevel: 'detailed'` when you want extra anti-pattern examples and tighter teaching scaffolding in the actor prompt.
 - Prefer `executorModelPolicy` when the actor may need to upgrade after repeated error turns or discovery in specific namespaces without also upgrading the responder.
-- Prefer `mode: 'simple'` unless recursive child agents materially improve the task.
-- Prefer `maxSubAgentCalls` only when advanced recursion is enabled or the user needs explicit delegation limits.
+- Use explicit child agents in `functions: [...]` when the task needs specialist agents with their own tools/runtime.
+- Use `llmQuery(...)` only for focused semantic questions over narrowed context; it does not spawn a tool-using child AxAgent.
+- Prefer `maxSubAgentCalls` only when you need an explicit cap on `llmQuery(...)` sub-query usage.
 ## Mental Model
@@ -131,14 +133,14 @@ Practical rule:
 Use these top-level controls consistently:
-- `mode`: controls whether `llmQuery(...)` stays simple or delegates to recursive child agents in advanced mode.
-- `recursionOptions.maxDepth`: limits recursive `llmQuery(...)` depth.
-- `recursionOptions.ai`: routes recursive `llmQuery(...)` sub-agent calls to a different AI service than the parent run.
-- `maxSubAgentCalls`: shared delegated-call budget across the whole run, including recursive children. Default is `100`.
+- `recursionOptions.ai`: routes `llmQuery(...)` sub-query calls to a different AI service than the parent run.
+- `recursionOptions.model`, `modelConfig`, and other forward options: tune the AxGen call used by `llmQuery(...)`.
+- `maxSubAgentCalls`: shared `llmQuery(...)` sub-query budget across the whole run. Default is `100`.
 - `maxBatchedLlmQueryConcurrency`: caps batched `llmQuery([...])` concurrency.
 - `maxRuntimeChars`: runtime/output truncation ceiling for console logs, tool results, and interpreter output replay. The effective limit is computed dynamically each turn based on remaining context budget.
 - `summarizerOptions`: default model/options for the internal checkpoint summarizer.
 - `contextPolicy`: replay/checkpointing/compression policy.
+- `contextMap`: optional persistent orientation cache injected into the distiller and updated once after each successful run. `AxAgentContextMap` evolves indefinitely by default; use `{ infiniteEvolve: false, evolveSteps: N }` on the map object for finite warmup followed by reuse.
 - `contextOptions`: distiller-stage forward options.
 - `executorOptions`: executor-stage forward options such as `description`, `model`, `modelConfig`, `thinkingTokenBudget`, and `showThoughts`.
 - `executorModelPolicy`: executor-only model override rules based on consecutive error turns or discovery fetches from listed namespaces.
@@ -151,9 +153,8 @@ Canonical shape:
 const researchAgent = agent('query:string -> answer:string', {
   contextFields: ['query'],
   runtime,
-  mode: 'advanced',
   recursionOptions: {
-    maxDepth: 2,
+    model: 'gpt-5.4-mini',
   },
   maxRuntimeChars: 3000,
   summarizerOptions: {
@@ -187,16 +188,14 @@ const researchAgent = agent('query:string -> answer:string', {
 Semantics:
-- `mode` stays top-level; there is no `recursionOptions.mode`.
 - `maxRuntimeChars` sets the truncation ceiling and is separate from `contextPolicy.budget`.
 - `summarizerOptions` tunes only the internal checkpoint summarizer. It does not change actor or responder model selection.
 - `executorModelPolicy` only switches the actor model. It does not change `responderOptions.model`.
-- Recursive child agents can inherit `executorModelPolicy`; use a child override only when that child needs different routing behavior.
-- Recursive child calls use `recursionOptions.ai` when set, otherwise they fall back to the parent `.forward(ai, ...)` service.
+- `llmQuery(...)` uses `recursionOptions.ai` when set, otherwise it falls back to the parent `.forward(ai, ...)` service.
+- `recursionOptions` configures the AxGen semantic sub-query used by `llmQuery(...)`; it does not create a child AxAgent and cannot give the sub-query tools.
 - `executorModelPolicy` entries are ordered from weaker to stronger. If multiple rules match, the last matching entry wins.
 - If one entry defines `namespaces`, any successful `discover(...)` function-definition fetch from one of those namespaces marks the rule as matched starting on the next actor turn.
-- Do not add `mode: 'advanced'` just because recursion exists as a feature. Add it only when delegated children need their own tool/discovery/runtime loop.
-- Do not add `recursionOptions` if the user does not need recursive delegation.
+- Do not add `recursionOptions` unless the user needs different model/options for `llmQuery(...)`.
 ## Dynamic Output Truncation
@@ -365,20 +364,12 @@ Available forms:
 Rules:
 - `llmQuery(...)` forwards only the explicit `context` argument.
-- Parent inputs are not automatically available to `llmQuery(...)` children.
-- In `mode: 'simple'`, `llmQuery(...)` is a direct semantic helper.
-- In `mode: 'advanced'`, `llmQuery(...)` delegates a focused subtask to a child `AxAgent` with its own runtime and action log while recursion depth remains.
-- In advanced mode, no parent `contextFields` are auto-inserted into recursive children. Only explicit `llmQuery(..., context)` payload is available there.
-- If `context` is a plain object, safe keys are exposed as child runtime globals and the full payload is also available as `context`.
-- In advanced mode, use `llmQuery(...)` to offload discovery-heavy, tool-heavy, or multi-turn semantic branches so the parent action log stays smaller and more focused.
-- In advanced mode, use batched `llmQuery([...])` only for independent subtasks. Use serial calls when later work depends on earlier results.
-- In advanced mode with discovery enabled, prefer putting noisy tool discovery, `discover(...)`, and branch-specific tool chatter inside delegated child calls when those branches are independent or semantically distinct.
-- In advanced mode, pass compact named object context to children instead of huge raw parent payloads.
-- In advanced mode, do not assume child-created variables, discovered docs, or action-log history come back to the parent. Only the child return value comes back.
-- In advanced mode, if a child calls `askClarification(...)`, that clarification bubbles up and ends the top-level run.
-- In advanced mode, recursion is depth-limited: `maxDepth: 0` makes top-level `llmQuery(...)` simple; `maxDepth: 1` makes top-level `llmQuery(...)` advanced and child `llmQuery(...)` simple.
-- In advanced mode, batched delegated children are cancelled when a sibling child asks for clarification or aborts, so use batched form only when branches are truly independent.
-- `maxSubAgentCalls` is a shared budget across the whole top-level run, including recursive children.
+- Parent inputs, runtime variables, tool results, and discovered docs are not automatically available to `llmQuery(...)`; include any needed facts in `context`.
+- `llmQuery(...)` is a direct semantic helper backed by an AxGen sub-query. It does not create a child AxAgent, does not run a JS runtime, and does not have access to tools or discovery.
+- Use batched `llmQuery([...])` only for independent semantic questions. Use serial calls when later work depends on earlier results.
+- Pass compact named object context instead of huge raw parent payloads.
+- Do not assume anything other than the returned string comes back from `llmQuery(...)`.
+- `maxSubAgentCalls` is a shared budget for `llmQuery(...)` sub-queries across the top-level run.
 - Single-call `llmQuery(...)` may return `[ERROR] ...` on non-abort failures.
 - Batched `llmQuery([...])` returns per-item `[ERROR] ...`.
 - If a result starts with `[ERROR]`, inspect or branch on it instead of assuming success.
@@ -394,7 +385,7 @@ if (summary.startsWith('[ERROR]')) {
 }
 ```
-Advanced recursive discovery example:
+Parallel semantic review example:
 ```javascript
 const narrowedIncidents = incidents.map((incident) => ({
@@ -436,23 +427,20 @@ Delegation decision guide:
 - **JS-only**: deterministic logic such as filter, sort, count, regex, or date math -> do it inline.
 - **Single-shot semantic**: needs LLM reasoning but no tools or multi-step exploration -> single `llmQuery(...)` with narrow context.
-- **Full delegation**: needs its own discovery, tool calls, or more than two turns of exploratory work -> `llmQuery(...)` as child agent.
-- **Parallel fan-out**: two or more independent subtasks each qualifying for delegation -> batched `llmQuery([...])`.
+- **Specialist/tool delegation**: needs its own tools, discovery, runtime, or reusable role -> create a child `agent(...)` and pass it in `functions: [...]`.
+- **Parallel semantic fan-out**: two or more independent semantic-only subtasks -> batched `llmQuery([...])`.
 Context handling:
-- In advanced mode, the `context` object is injected into the child's JS runtime as named globals. It does not go into the child's LLM prompt as raw data.
-- The child prompt sees only a compact metadata summary of the delegated context.
-- The child actor explores the delegated context with code, the same way the parent explores `inputs.*`.
 - Always narrow with JS before delegating. Never pass raw `inputs.*`.
 - Name context keys semantically, e.g. `{ emails: filtered, rubric: 'classify-urgency' }`.
-- Estimate total sub-agent calls before fanning out. `maxSubAgentCalls` is shared across all recursion levels.
+- Estimate total sub-query calls before fanning out. `maxSubAgentCalls` is shared across the run.
 Patterns:
-- Fan-Out / Fan-In: JS narrows into categories -> `llmQuery([...])` fans out per category -> JS or one more `llmQuery(...)` merges results.
+- Fan-Out / Fan-In: JS narrows into categories -> `llmQuery([...])` fans out per category -> JS or one more `llmQuery(...)` merges semantic results.
 - Pipeline: serial `llmQuery(...)` calls where each depends on the prior result.
-- Scout-then-Execute: first child explores, parent processes with JS, second child acts.
+- Specialist tool use: call child agents or tools via their namespaced function globals, e.g. `await team.writer({ draft })`.
 ## Examples
@@ -460,8 +448,7 @@ Fetch these for full working code:
 - [RLM](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm.ts) - RLM basic
 - [RLM Long Task](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-long-task.ts) - RLM context policy
-- [RLM Discovery](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-discovery.ts) - advanced recursive `llmQuery(...)` plus discovery-heavy delegated subtasks
-- [RLM Shared Fields](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-shared-fields.ts) - shared fields
+- [RLM Discovery](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-discovery.ts) - discovery mode, grouped tools, child agents as functions, and semantic `llmQuery(...)`
 - [RLM Adaptive Replay](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-adaptive-replay.ts) - adaptive replay
 - [RLM Live Runtime State](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-live-runtime-state.ts) - structured runtime-state rendering
 - [RLM Clarification Resume](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-clarification-resume.ts) - clarification exception plus `getState()` / `setState(...)`
@@ -472,7 +459,7 @@ Fetch these for full working code:
 - Do not combine `console.log(...)` with `final(...)`.
 - Do not assume old successful turns stay fully replayed under adaptive/checkpointed/lean policies.
 - Do not rebuild runtime state just because a prior turn was summarized.
-- Do not add `mode: 'advanced'` unless delegated children need their own tool/discovery/runtime loop.
-- Do not assume parent inputs are available in `llmQuery(...)` children unless passed in `context`.
+- Do not describe `llmQuery(...)` as spawning a tool-using child AxAgent.
+- Do not assume parent inputs are available to `llmQuery(...)` unless passed in `context`.
 - Do not ignore `[ERROR] ...` results from `llmQuery(...)`.
 - Do not grant `AxJSRuntime` permissions unless the user asked for the capability.

package/skills/ax-agent.md CHANGED

@@ -1,7 +1,7 @@
 ---
 name: ax-agent
 description: This skill helps an LLM generate correct core AxAgent code using @ax-llm/ax. Use when the user asks about agent(), child agents, namespaced functions, discovery mode, clarification, bubbleErrors, host-side final/clarification protocol, or ordinary agent runtime behavior. For RLM/code-runtime work use ax-agent-rlm; for callbacks and telemetry use ax-agent-observability; for recall/memory/skill loading use ax-agent-memory-skills; for agent.optimize(...) use ax-agent-optimize.
-version: "21.0.12"
+version: "21.0.13"
 ---
 # AxAgent Codegen Rules (@ax-llm/ax)
@@ -25,19 +25,18 @@ Your job is to choose the smallest correct `AxAgent` shape for the user's needs:
 - Prefer namespaced functions such as `utils.search(...)` or `kb.find(...)`.
 - Pass child agents directly in `functions: [...]`. They land under their `agentIdentity.namespace` (or `utils` if unset), exactly like a `fn()` tool.
 - If discovery is enabled, call `discover(...)` before using callables whose docs are not already in the prompt.
-- Prefer `mode: 'simple'` unless recursive child agents materially improve the task.
-- Add `mode: 'advanced'`, `recursionOptions`, or `maxSubAgentCalls` only when delegated children need their own runtime, tools, or discovery loop.
+- Use explicit child agents in `functions: [...]` for specialist delegation; do not model that as recursive `llmQuery(...)`.
 - Add `bubbleErrors` only for fatal infrastructure errors that should abort `.forward()`.
 ## Decision Guide
 Map user intent to agent shape before writing code:
-- "Use tools and answer" -> plain `agent(...)` with local functions, no recursion, no extra observability.
+- "Use tools and answer" -> plain `agent(...)` with local functions, no extra observability.
 - "Need child agents with distinct responsibilities" -> add child agents to the parent's `functions: [...]` list and set each child's `agentIdentity.namespace` when you want a specific runtime call site such as `team.writer(...)`.
 - "Need tool discovery because names/schemas are not stable" -> enable discovery and generate discovery-first actor code.
-- "Need certain errors to escape the agent loop" -> add `bubbleErrors` with error classes; those errors propagate through function handlers, actor code, and `llmQuery(...)` sub-agents to `.forward()`.
-- "Inspect large context with code", "RLM", "`llmQuery(...)`", or "recursive delegation" -> use `ax-agent-rlm`.
+- "Need certain errors to escape the agent loop" -> add `bubbleErrors` with error classes; those errors propagate through function handlers, actor code, and `llmQuery(...)` sub-queries to `.forward()`.
+- "Inspect large context with code", "RLM", or "`llmQuery(...)`" -> use `ax-agent-rlm`.
 - "Need debugging, traces, progress updates, tool-call logs, chat logs, or usage" -> use `ax-agent-observability`.
 - "Need memories, recall, dynamic skill guides, `discover({ skills })`, or loaded/used tracking" -> use `ax-agent-memory-skills`.
@@ -51,6 +50,7 @@ Map user intent to agent shape before writing code:
 - When resuming after clarification, prefer `error.getState()` from the thrown `AxAgentClarificationError`, then call `agent.setState(savedState)` before the next `forward(...)`.
 - Errors listed in `bubbleErrors` bypass actor-loop catch blocks and propagate directly to the caller of `.forward()`.
 - Child agents receive only the arguments the actor passes. Pass parent fields explicitly via `inputs.<field>` or use `inputUpdateCallback` when many calls need the same value.
+- Audio input fields are transcribed before agent planner/executor/responder stages by default; internal agent stages receive text transcripts, not base64 audio.
 ## Canonical Pattern
@@ -80,6 +80,42 @@ const result = await assistant.forward(llm, { query: 'What is TypeScript?' });
 console.log(result.answer);
 ```
+## Audio Inputs And Speech Outputs
+Agents can accept audio inputs and return scripted speech artifacts. The runtime transcribes audio input fields before internal stages run, then synthesizes `:audio` outputs after the final structured response is selected.
+```typescript
+const voiceAgent = agent(
+  'recording:audio, question:string -> speech:audio, summary:string',
+  {
+    agentIdentity: {
+      name: 'Voice Assistant',
+      description: 'Answers spoken requests',
+    },
+    contextFields: [],
+  }
+);
+const result = await voiceAgent.forward(
+  llm,
+  {
+    recording: { data: base64Wav, format: 'wav' },
+    question: 'What should I do next?',
+  },
+  {
+    speech: {
+      transcribe: { model: 'gpt-4o-mini-transcribe' },
+      speak: { voice: 'alloy', format: 'mp3' },
+    },
+  }
+);
+console.log(result.summary);
+console.log(result.speech.data);
+```
+Use direct `ax(...)` or `.chat()` if the model should receive native audio instead of a transcript-first agent pipeline.
 ## Child Agents As Tools
 Child agents are passed in the parent's `functions` list. There is no separate `agents` option for new code. Each child agent's `agentIdentity.namespace` (or `utils`, the default) determines where it lands in the JS runtime:
@@ -333,7 +369,7 @@ State notes:
 ## Bubble Errors
-Use `bubbleErrors` when certain exceptions thrown inside function handlers or `llmQuery(...)` sub-agent calls should propagate all the way out to `.forward()` instead of being caught by the actor loop and returned as `[ERROR]` strings.
+Use `bubbleErrors` when certain exceptions thrown inside function handlers or `llmQuery(...)` sub-query calls should propagate all the way out to `.forward()` instead of being caught by the actor loop and returned as `[ERROR]` strings.
 ```typescript
 import { agent, f, fn } from '@ax-llm/ax';
@@ -366,8 +402,7 @@ const myAgent = agent('query:string -> answer:string', {
 Rules:
 - `bubbleErrors` takes an array of Error constructor classes, checked via `instanceof`.
-- A matching error thrown inside a function handler, during actor code execution, or inside a nested `llmQuery(...)` child agent propagates immediately to `.forward()`.
-- The same `bubbleErrors` list is automatically propagated to recursive child agents created for advanced-mode `llmQuery(...)` calls.
+- A matching error thrown inside a function handler, during actor code execution, or inside an `llmQuery(...)` sub-query propagates immediately to `.forward()`.
 - Use `bubbleErrors` for fatal infrastructure errors such as DB down, auth failure, or quota exceeded.
 - Do not use `bubbleErrors` for expected recoverable errors; let those return as `[ERROR] ...` strings so the actor can handle them.
 - `AxAgentClarificationError` and `AxAIServiceAbortedError` always bubble up unconditionally.
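Review note: the `instanceof` rule in this list can be illustrated with a self-contained model. This is a sketch of the described behavior, not Ax's internal code. `shouldBubble`, `runToolCall`, and the two error classes are hypothetical names, and the `[ERROR] ...` formatting is taken from the rules above.

```typescript
// Illustrative model only; this is NOT how @ax-llm/ax is implemented.
type ErrorClass = new (...args: any[]) => Error;

class DatabaseDownError extends Error {}
class NotFoundError extends Error {}

// True when err is an instance of any listed class (the instanceof check).
function shouldBubble(err: unknown, bubbleErrors: ErrorClass[]): boolean {
  return bubbleErrors.some((cls) => err instanceof cls);
}

// Hypothetical stand-in for the actor loop's handling of one tool call.
function runToolCall(tool: () => string, bubbleErrors: ErrorClass[]): string {
  try {
    return tool();
  } catch (err) {
    if (shouldBubble(err, bubbleErrors)) throw err; // escapes to the caller
    const message = err instanceof Error ? err.message : String(err);
    return `[ERROR] ${message}`; // recoverable: the actor sees a string
  }
}

const bubble: ErrorClass[] = [DatabaseDownError];
const recovered = runToolCall(() => {
  throw new NotFoundError('no such record');
}, bubble);
console.log(recovered);
```

In this model a `NotFoundError` comes back as a string the actor can react to, while a `DatabaseDownError` is rethrown to whoever called `runToolCall`, which mirrors the fatal versus recoverable split the rules describe.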
@@ -503,7 +538,7 @@ Use these method groups as the compact AxAgent surface map:
 - Running: `forward(ai, values, options?)` and `streamingForward(ai, values, options?)`.
 - Forward-time agent options: `skills`, `onUsedMemories`, and `onUsedSkills`; use `ax-agent-memory-skills` for details.
-- State and control: `getState()`, `setState(state?)`, `stop()`, `getSignature()`, `setSignature(signature)`, `getFunction()`, `getId()`, and `setId(id)`.
+- State and control: `getState()`, `setState(state?)`, `getContextMap()`, `setContextMap(map?)`, `stop()`, `getSignature()`, `setSignature(signature)`, `getFunction()`, `getId()`, and `setId(id)`. Context-map evolve policy lives on `AxAgentContextMap` (`infiniteEvolve`, `evolveSteps`, `maxChars`), not on the agent config. See [`src/examples/rlm-context-map.ts`](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-context-map.ts) for persistence and finite-evolve usage.
 - Observability: `getChatLog()`, `getUsage()`, `getStagedUsage()`, `resetUsage()`, and `getTraces()`; use `ax-agent-observability` for details.
 - Demos and tuning: `setDemos(...)`, `namedPrograms()`, `namedProgramInstances()`, `optimize(...)`, `applyOptimization(...)`, `getOptimizableComponents()`, and `applyOptimizedComponents(...)`; use `ax-agent-optimize` for tuning details.
@@ -516,7 +551,7 @@ Rules:
 ## Tuning Hand-off
-When the user wants `agent.optimize(...)`, judge configuration, eval datasets, saved optimization artifacts, or recursive optimization guidance, use `ax-agent-optimize`.
+When the user wants `agent.optimize(...)`, judge configuration, eval datasets, saved optimization artifacts, or optimization guidance, use `ax-agent-optimize`.
 Keep this skill focused on building and running agents. For tuning work:

package/skills/ax-ai.md CHANGED

@@ -1,7 +1,7 @@
 ---
 name: ax-ai
-description: This skill helps an LLM generate correct AI provider setup and configuration code using @ax-llm/ax. Use when the user asks about ai(), providers, models, presets, embeddings, extended thinking, context caching, or mentions OpenAI/Anthropic/Google/Azure/Groq/DeepSeek/Mistral/Cohere/Together/Ollama/HuggingFace/Reka/OpenRouter with @ax-llm/ax.
-version: "21.0.12"
+description: This skill helps an LLM generate correct AI provider setup and configuration code using @ax-llm/ax. Use when the user asks about ai(), providers, models, presets, embeddings, batch audio with ai.transcribe() or ai.speak(), extended thinking, context caching, or mentions OpenAI/Anthropic/Google/Azure/Groq/DeepSeek/Mistral/Cohere/Together/Ollama/HuggingFace/Reka/OpenRouter with @ax-llm/ax.
+version: "21.0.13"
 ---
 # AI Provider Codegen Rules (@ax-llm/ax)
@@ -78,6 +78,30 @@ const res = await llm.chat({
 console.log(res.results[0]?.content);
 ```
+## Batch Audio
+Use `ai.transcribe(...)` for batch speech-to-text and `ai.speak(...)` for batch text-to-speech. These are separate from conversational `.chat()` audio config.
+```typescript
+const transcript = await llm.transcribe({
+  audio: { data: base64Wav, format: 'wav' },
+  model: 'gpt-4o-mini-transcribe',
+  language: 'en',
+});
+const speech = await llm.speak({
+  text: transcript.text,
+  model: 'gpt-4o-mini-tts',
+  voice: 'alloy',
+  format: 'mp3',
+});
+console.log(transcript.text);
+console.log(speech.data);
+```
+Providers without the requested audio endpoint throw `AxMediaNotSupportedError`. Use `speech` forward options for signature audio artifacts and `modelConfig.audio` for conversational chat audio.
 ## Common Options
 - `stream` (boolean): enable SSE; true by default
@@ -117,14 +141,21 @@ import { ai, AxAIDeepSeekModel } from '@ax-llm/ax';
 const deepseek = ai({
   name: 'deepseek',
   apiKey: process.env.DEEPSEEK_APIKEY!,
-  config: { model: AxAIDeepSeekModel.DeepSeekV4Pro },
+  config: { model: AxAIDeepSeekModel.DeepSeekV4Flash },
 });
 ```
-DeepSeek V4 thinking models support tools, but reject the `tool_choice`
-request parameter. Ax omits forced/auto tool choice for `deepseek-v4-pro`,
-`deepseek-v4-flash`, and `deepseek-reasoner` while still sending tool
-definitions.
+DeepSeek's current API models are `deepseek-v4-flash` and `deepseek-v4-pro`.
+The deprecated `deepseek-chat` and `deepseek-reasoner` aliases are retained for
+compatibility until DeepSeek removes them on 2026-07-24.
+DeepSeek V4 supports thinking mode. Ax sends `thinking: { type: "disabled" }`
+by default to preserve non-thinking behavior, and enables it when
+`thinkingTokenBudget` is set. Ax maps lower budget levels to DeepSeek's `high`
+effort and maps `highest` to `max`. DeepSeek V4 thinking models support tools,
+but reject the `tool_choice` request parameter, so Ax omits forced/auto tool
+choice for `deepseek-v4-pro`, `deepseek-v4-flash`, and `deepseek-reasoner`
+while still sending tool definitions.
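Review note: the budget-to-effort mapping described in the added paragraph can be sketched as a small pure function. This is an assumption-laden illustration, not Ax's code. `toDeepSeekThinking` is a hypothetical name, the list of budget levels is assumed, and the `enabled` and `effort` field names are guesses at the request shape; only `thinking: { type: "disabled" }`, `high`, and `max` come from the text above.

```typescript
// Sketch of the mapping described above; names and the enabled-state shape
// are assumptions, not @ax-llm/ax internals or DeepSeek's documented schema.
type BudgetLevel = 'minimal' | 'low' | 'medium' | 'high' | 'highest';

type DeepSeekThinking =
  | { type: 'disabled' }
  | { type: 'enabled'; effort: 'high' | 'max' };

function toDeepSeekThinking(budget?: BudgetLevel): DeepSeekThinking {
  // No budget set: keep non-thinking behavior.
  if (budget === undefined) return { type: 'disabled' };
  // Only the top level maps to max; every lower level maps to high.
  return { type: 'enabled', effort: budget === 'highest' ? 'max' : 'high' };
}

console.log(JSON.stringify(toDeepSeekThinking()));
console.log(JSON.stringify(toDeepSeekThinking('low')));
console.log(JSON.stringify(toDeepSeekThinking('highest')));
```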
 ## Extended Thinking