npm - @ax-llm/ax - Versions diffs - 19.0.45 → 20.0.0 - Mend

@ax-llm/ax 19.0.45 → 20.0.0

Files changed (19)

package/README.md +2 -2
package/index.cjs +821 -779
package/index.cjs.map +1 -1
package/index.d.cts +7040 -6777
package/index.d.ts +7040 -6777
package/index.global.js +808 -766
package/index.global.js.map +1 -1
package/index.js +818 -776
package/index.js.map +1 -1
package/package.json +1 -15
package/skills/ax-agent-optimize.md +41 -16
package/skills/ax-agent.md +1 -1
package/skills/ax-ai.md +1 -1
package/skills/ax-flow.md +1 -1
package/skills/ax-gen.md +1 -1
package/skills/ax-gepa.md +18 -12
package/skills/ax-learn.md +1 -1
package/skills/ax-llm.md +1 -1
package/skills/ax-signature.md +1 -1

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@ax-llm/ax",
-  "version": "19.0.45",
+  "version": "20.0.0",
   "type": "module",
   "description": "The best library to work with LLMs",
   "repository": {
@@ -25,20 +25,6 @@
       "optional": true
     }
   },
-  "ava": {
-    "failFast": true,
-    "timeout": "180s",
-    "concurrency": 1,
-    "extensions": {
-      "ts": "module"
-    },
-    "nodeArguments": [
-      "--import=tsimp"
-    ],
-    "files": [
-      "!dist/**/*"
-    ]
-  },
   "tsd": {
     "directory": "./"
   },

package/skills/ax-agent-optimize.md CHANGED

@@ -1,7 +1,7 @@
 ---
 name: ax-agent-optimize
 description: This skill helps an LLM generate correct AxAgent tuning and evaluation code using @ax-llm/ax. Use when the user asks about agent.optimize(...), judgeOptions, eval datasets, optimization targets, saved optimizedProgram artifacts, or recursive optimization guidance.
-version: "19.0.45"
+version: "20.0.0"
 ---
 # AxAgent Optimize Codegen Rules (@ax-llm/ax)
@@ -18,21 +18,27 @@ Your job is to help the model choose a good optimization setup for the user's ac
 ## Use These Defaults
 - Use `agent.optimize(...)` only after the agent is already configured and runnable.
-- Prefer a deterministic custom `metric` when success is easy to score from the prediction and task record.
-- Prefer the built-in judge path for open-ended assistant tasks: `judgeAI` plus `judgeOptions`.
+- Prefer the built-in judge path first for normal agent tuning. Most users should start with tasks that include `input` and `criteria`, then let `agent.optimize(...)` use its default actor target and judge-based metric.
+- Prefer a deterministic custom `metric` only when success is easy to score from the prediction and task record.
+- Add `judgeAI` plus `judgeOptions` when the judge should run on a stronger or separate model than the agent runtime model.
 - Only reach for a plain typed `AxGen` evaluator when the user needs LLM-as-judge behavior outside the built-in `agent.optimize(...)` flow.
-- Default optimize target is `root.actor`; use `target: 'responder'` or explicit program IDs only when the user clearly asks for that.
+- Default optimize target is the actor path; do not surface `target` unless the user clearly wants responder-only tuning or explicit program IDs.
 - Use eval-safe tools or in-memory mocks because optimization replays tasks many times.
 - Prefer precise tool return schemas such as `f.object(...)` over vague `f.json(...)` whenever the agent must reason about returned fields.
 - Prefer task wording with canonical entity names like "the Atlas project" instead of ambiguous labels like "Atlas" when ambiguity could trigger pointless clarification.
-- Save `result.optimizedProgram`, then restore with `new AxOptimizedProgramImpl(...)` and `agent.applyOptimization(...)`.
+- Save artifacts with `axSerializeOptimizedProgram(result.optimizedProgram!)`, then restore with `axDeserializeOptimizedProgram(saved)` and `agent.applyOptimization(...)`.
+- For browser-safe persistence, let the caller store the serialized JSON anywhere they want such as localStorage, IndexedDB, or a backend.
+- If `bootstrap` is enabled, bootstrapped demos are persisted inside `result.optimizedProgram.demos`; raw failed traces are not saved in v1.
+- For first examples, pass a plain task array instead of splitting into `train` and `validation` unless the user already has a holdout set.
+- GEPA-backed `agent.optimize(...)` now optimizes generic components exposed by the selected target programs; `target: 'actor'` only tunes actor components, `target: 'responder'` only tunes responder components, and `target: 'all'` broadens the component set.
+- `result.optimizedProgram.componentMap` is the canonical saved artifact for agent GEPA runs. It may include actor instructions, descriptions, tool descriptions/names, templates, or runtime primitives depending on what the selected target exposes.
 - When recursive behavior matters, keep `mode: 'advanced'` on the agent and tune against realistic `recursionOptions`.
 ## Decision Guide
 Pick the optimization shape from the user's need:
-- "Make the agent use tools correctly" -> optimize `root.actor` with `expectedActions` and `forbiddenActions`.
+- "Make the agent use tools correctly" -> keep the default actor target and use `expectedActions` and `forbiddenActions`.
 - "Make final answers read better" -> consider `target: 'responder'`, but only if the task is not mostly tool-selection or clarification behavior.
 - "Make the whole agent better" -> use the default actor target first; only broaden target selection when the user clearly wants that extra scope.
 - "Tune recursive delegation" -> keep `mode: 'advanced'` and use tasks that actually exercise recursion depth, fan-out, and termination choices.
@@ -101,12 +107,13 @@ Important:
 import {
   AxAIGoogleGeminiModel,
   AxJSRuntime,
-  AxOptimizedProgramImpl,
   axDefaultOptimizerLogger,
   agent,
   ai,
   f,
   fn,
+  axDeserializeOptimizedProgram,
+  axSerializeOptimizedProgram,
 } from '@ax-llm/ax';
 const tools = [
@@ -159,22 +166,38 @@ const tasks = [
 ];
 const result = await assistant.optimize(tasks, {
-  target: 'actor',
   maxMetricCalls: 12,
   verbose: true,
-  optimizerLogger: axDefaultOptimizerLogger,
-  onProgress: (progress) => {
-    console.log(
-      `round ${progress.round}/${progress.totalRounds} current=${progress.currentScore} best=${progress.bestScore}`
-    );
-  },
 });
-const saved = JSON.stringify(result.optimizedProgram, null, 2);
-const restored = new AxOptimizedProgramImpl(JSON.parse(saved));
+const saved = axSerializeOptimizedProgram(result.optimizedProgram!);
+const restored = axDeserializeOptimizedProgram(saved);
 assistant.applyOptimization(restored);
 ```
+## Minimal Normal-User Pattern
+Start here unless the user clearly needs a hand-built scorer:
+```typescript
+const tasks = [
+  {
+    input: { query: 'Send an email to Jim saying good morning.' },
+    criteria: 'Use the email tool and send the message to Jim.',
+    expectedActions: ['email.sendEmail'],
+  },
+];
+const result = await assistant.optimize(tasks);
+assistant.applyOptimization(result.optimizedProgram!);
+```
+- `target` defaults to actor optimization.
+- `metric` defaults to the built-in LLM judge.
+- `judgeAI` is optional; if omitted, the agent falls back to its configured judge model or runtime model.
+- `bootstrap: true` is a good next step for tool-heavy agents when you want GEPA to start from successful traces from the provided tasks.
+- The one thing users still need is realistic task records with clear `criteria`.
 ## Deterministic Metric Pattern
 Use this when the task has crisp correctness and cost/behavior tradeoffs:
@@ -319,12 +342,14 @@ Decision rules:
 - Save `result.optimizedProgram` if the user wants portable artifacts.
 - Restore artifacts with `new AxOptimizedProgramImpl(...)`, then call `agent.applyOptimization(...)`.
+- Preserve the full optimized program when saving GEPA artifacts; `componentMap` reapplies the learned strings.
 - For demonstrations, use fresh eval-safe tool state for baseline, optimize, and restored replay so side effects do not leak across phases.
 - If the user wants to show improvement, run a held-out task before optimization, then replay it on a freshly restored optimized agent.
 ## Examples
 - [RLM Agent Optimize](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-agent-optimize.ts) — Gemini office-assistant tuning with save/load
+- [AxAgent GEPA Component Optimization](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/axagent-gepa-optimization.ts) — compact support-agent GEPA run with deterministic metric and artifact replay
 - [RLM Agent Recursive Optimize](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-agent-recursive-optimize.ts) — recursive-slot optimization artifacts
 ## Do Not Generate
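
Note on the persistence guidance added in this file: the new bullets say to save with `axSerializeOptimizedProgram(...)` and let the caller store the serialized JSON anywhere (localStorage, IndexedDB, or a backend). The sketch below illustrates only that storage-agnostic round trip. It does not import `@ax-llm/ax`; `ArtifactStore`, `MemoryStore`, `saveArtifact`, `loadArtifact`, and the sample component key are hypothetical names invented for illustration, and the serialized value is assumed to be JSON-compatible (the diff does not show its exact type).

```typescript
// Hypothetical storage contract: a localStorage wrapper, an IndexedDB
// wrapper, or a backend client could each satisfy it.
interface ArtifactStore {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

// In-memory implementation so the sketch runs anywhere.
class MemoryStore implements ArtifactStore {
  private readonly data = new Map<string, string>();

  get(key: string): string | undefined {
    return this.data.get(key);
  }

  set(key: string, value: string): void {
    this.data.set(key, value);
  }
}

// Assumption: `serialized` is whatever the serializer returned and is
// JSON-compatible. JSON.stringify round-trips both strings and objects.
function saveArtifact(store: ArtifactStore, key: string, serialized: unknown): void {
  store.set(key, JSON.stringify(serialized));
}

// Returns undefined when nothing has been saved under `key`.
function loadArtifact(store: ArtifactStore, key: string): unknown {
  const raw = store.get(key);
  return raw === undefined ? undefined : JSON.parse(raw);
}

// Stand-in for serializer output; the key and text are made up.
const fakeSerialized = {
  componentMap: { 'example-component-key': 'Prefer the email tool.' },
};

const store = new MemoryStore();
saveArtifact(store, 'assistant-optimization', fakeSerialized);
const loaded = loadArtifact(store, 'assistant-optimization');
```

In real use, `loaded` would presumably be handed to `axDeserializeOptimizedProgram(...)` and the result passed to `agent.applyOptimization(...)`, as the changed lines above describe.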

package/skills/ax-agent.md CHANGED

@@ -1,7 +1,7 @@
 ---
 name: ax-agent
 description: This skill helps an LLM generate correct AxAgent code using @ax-llm/ax. Use when the user asks about agent(), child agents, namespaced functions, discovery mode, shared fields, llmQuery(...), RLM code execution, recursionOptions, or agent runtime behavior. For tuning and eval with agent.optimize(...), use ax-agent-optimize.
-version: "19.0.45"
+version: "20.0.0"
 ---
 # AxAgent Codegen Rules (@ax-llm/ax)

package/skills/ax-ai.md CHANGED

@@ -1,7 +1,7 @@
 ---
 name: ax-ai
 description: This skill helps an LLM generate correct AI provider setup and configuration code using @ax-llm/ax. Use when the user asks about ai(), providers, models, presets, embeddings, extended thinking, context caching, or mentions OpenAI/Anthropic/Google/Azure/Groq/DeepSeek/Mistral/Cohere/Together/Ollama/HuggingFace/Reka/OpenRouter with @ax-llm/ax.
-version: "19.0.45"
+version: "20.0.0"
 ---
 # AI Provider Codegen Rules (@ax-llm/ax)

package/skills/ax-flow.md CHANGED

@@ -1,7 +1,7 @@
 ---
 name: ax-flow
 description: This skill helps an LLM generate correct AxFlow workflow code using @ax-llm/ax. Use when the user asks about flow(), AxFlow, workflow orchestration, parallel execution, DAG workflows, conditional routing, map/reduce patterns, or multi-node AI pipelines.
-version: "19.0.45"
+version: "20.0.0"
 ---
 # AxFlow Codegen Rules (@ax-llm/ax)

package/skills/ax-gen.md CHANGED

@@ -1,7 +1,7 @@
 ---
 name: ax-gen
 description: This skill helps an LLM generate correct AxGen code using @ax-llm/ax. Use when the user asks about ax(), AxGen, generators, forward(), streamingForward(), assertions, field processors, step hooks, self-tuning, or structured outputs.
-version: "19.0.45"
+version: "20.0.0"
 ---
 # AxGen Codegen Rules (@ax-llm/ax)

package/skills/ax-gepa.md CHANGED

@@ -1,7 +1,7 @@
 ---
 name: ax-gepa
 description: This skill helps an LLM generate correct AxGEPA optimization code using @ax-llm/ax. Use when the user asks about AxGEPA, GEPA, Pareto optimization, multi-objective prompt tuning, reflective prompt evolution, validationExamples, maxMetricCalls, or optimizing a generator, flow, or agent tree.
-version: "19.0.45"
+version: "20.0.0"
 ---
 # AxGEPA Codegen Rules (@ax-llm/ax)
@@ -17,20 +17,22 @@ Use this skill to generate direct `AxGEPA` optimization code. Prefer short, mode
 - Always set `maxMetricCalls` to bound optimizer cost.
 - Use scalar metrics for one objective and object metrics for Pareto optimization.
 - Apply results with `program.applyOptimization(result.optimizedProgram!)`.
-- For tree-wide runs, expect `optimizedProgram.instructionMap`.
+- For tree-wide runs, expect `optimizedProgram.componentMap`.
+- Persist artifacts with `axSerializeOptimizedProgram(...)` and restore them with `axDeserializeOptimizedProgram(...)` so the same flow works in browsers and Node.
 ## Critical Rules
-- `AxGEPA.compile()` works for a single generator and for tree-aware roots such as flows or agents with registered instruction-bearing descendants.
+- `AxGEPA.compile()` works for a single generator and for tree-aware roots such as flows or agents with registered optimizable descendants.
 - There is no separate flow-only GEPA optimizer. Use `AxGEPA` for flows too.
 - The metric may return either `number` or `Record<string, number>`.
 - Keep metrics deterministic and cheap by default.
 - Avoid extra LLM calls inside the metric unless the user explicitly wants judge-based evaluation.
 - If the user needs LLM-as-judge scoring for a non-agent GEPA run, prefer a plain typed `AxGen` evaluator instead of writing a custom judge abstraction.
 - `maxMetricCalls` must be large enough to cover the initial validation pass over `validationExamples`.
-- GEPA optimizes instructions. If a tree has no instruction-bearing nodes, optimization will fail.
+- GEPA optimizes generic string components exposed by `getOptimizableComponents()`. If a tree exposes no components, optimization will fail.
 - Use held-out validation examples for selection. Do not reuse the training set as `validationExamples`.
 - `result.optimizedProgram` is the easy-to-apply best candidate. `result.paretoFront` is the full trade-off set for multi-objective runs.
+- `bootstrap: true` can seed GEPA with demos collected from successful runs on the provided training tasks.
 ## Metric Selection
@@ -39,12 +41,12 @@ Choose the evaluation path deliberately:
 - Prefer a deterministic metric when correctness can be read directly from `prediction` and `example`.
 - Prefer a deterministic metric when cost, latency, recursion depth, or tool count matters.
 - Use a plain typed `AxGen` evaluator only when the task is genuinely qualitative and hard to score exactly.
-- For `agent.optimize(...)`, prefer the built-in judge path instead of manually wrapping a judge metric.
+- For `agent.optimize(...)`, prefer the built-in judge path instead of manually wrapping a judge metric. Normal agent users usually do not need to set `target` or `metric` at all.
 Rule of thumb:
 - `AxGEPA` on `AxGen` or flow: use a metric first, optionally a plain typed `AxGen` evaluator if needed.
-- `agent.optimize(...)`: use custom `metric` for crisp scoring, otherwise `judgeAI` plus `judgeOptions`.
+- `agent.optimize(...)`: use custom `metric` for crisp scoring, otherwise let the built-in judge handle scoring. Add `judgeAI` plus `judgeOptions` only when you want a stronger or separate judge model.
 ## Canonical Scalar Pattern
@@ -169,7 +171,7 @@ for (const point of result.paretoFront) {
 }
 wf.applyOptimization(result.optimizedProgram!);
-console.log(result.optimizedProgram?.instructionMap);
+console.log(result.optimizedProgram?.componentMap);
 ```
 ## Metric Patterns
@@ -209,9 +211,9 @@ const loaded = JSON.parse(saved);
 program.applyOptimization(loaded);
 ```
-- Single-target runs usually populate both `optimizedProgram.instruction` and `optimizedProgram.instructionMap`.
-- Tree-wide runs rely on `instructionMap`, keyed by full program ID.
-- Pareto points expose candidate configs under `point.configuration.instructionMap`.
+- Single-target runs usually populate both `optimizedProgram.instruction` and `optimizedProgram.componentMap`.
+- Tree-wide runs rely on `componentMap`, keyed by full component key.
+- Pareto points expose candidate configs under `point.configuration.componentMap`.
 ## Useful Options
@@ -244,13 +246,16 @@ const optimizer = new AxGEPA({
 - Size `maxMetricCalls` for at least one full validation pass plus several rounds.
 - If the user wants a strict budget, say so explicitly and set `maxMetricCalls`.
 - For expensive trees, start with `auto: 'light'` or fewer `numTrials`, then scale up.
+- GEPA selects among exposed components using measured accept/reject history, not LLM-generated numeric scores. The LLM proposes component text; metrics decide whether to keep it.
+- Function/tool trace reflection is keyed by stable component IDs where available, so function renames do not break saved candidate maps.
 ## Troubleshooting
 - Error about `maxMetricCalls` being too small: increase it until the initial validation pass fits.
 - Empty or poor Pareto front: verify the metric returns numbers for every example.
-- No tree optimization effect: ensure child programs are registered under the root and have instructions to mutate.
-- Saved optimization applies only partly: use `program.applyOptimization(...)`, not just `setInstruction(...)`, so `instructionMap` reaches the full tree.
+- No tree optimization effect: ensure child programs are registered under the root and expose optimizable components.
+- Saved optimization applies only partly: use `program.applyOptimization(...)`, not just `setInstruction(...)`, so `componentMap` reaches the full tree.
+- Agent target seems too broad: when using `agent.optimize(...)`, set `target: 'actor'`, `'responder'`, `'all'`, or explicit program IDs. The wrapper filters GEPA components to the selected target.
 ## Good Example Targets
@@ -258,3 +263,4 @@ const optimizer = new AxGEPA({
 - `/Users/vr/src/ax/src/examples/gepa-flow.ts`
 - `/Users/vr/src/ax/src/examples/gepa-train-inference.ts`
 - `/Users/vr/src/ax/src/examples/gepa-quality-vs-speed-optimization.ts`
+- `/Users/vr/src/ax/src/examples/axagent-gepa-optimization.ts`
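
Note on the budgeting rules in this file: the skill text says `maxMetricCalls` must cover the initial validation pass over `validationExamples` and should be sized for "at least one full validation pass plus several rounds". A rough way to turn that into a number is sketched below. The helper `estimateMaxMetricCalls` and the `callsPerRound` input are hypothetical and not part of `@ax-llm/ax`; the actual per-round cost depends on optimizer settings that this diff does not show, so treat the result as a lower-bound estimate under those assumptions.

```typescript
// Hypothetical inputs for a back-of-the-envelope budget.
interface BudgetInputs {
  validationExamples: number; // size of the held-out validation set
  rounds: number; // how many optimization rounds to allow for
  callsPerRound: number; // ASSUMED metric calls consumed per round
}

// One full validation pass plus an assumed fixed cost per round.
function estimateMaxMetricCalls(inputs: BudgetInputs): number {
  const { validationExamples, rounds, callsPerRound } = inputs;
  for (const [name, value] of Object.entries(inputs)) {
    if (!Number.isInteger(value) || value < 0) {
      throw new RangeError(`${name} must be a non-negative integer`);
    }
  }
  return validationExamples + rounds * callsPerRound;
}

// Example: 8 validation examples, 4 rounds, 6 assumed calls per round.
const budget = estimateMaxMetricCalls({
  validationExamples: 8,
  rounds: 4,
  callsPerRound: 6,
});
console.log(budget); // 8 + 4 * 6 = 32
```

If the optimizer reports that `maxMetricCalls` is too small, the troubleshooting bullet above still applies: raise the value until the initial validation pass fits.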

package/skills/ax-learn.md CHANGED

@@ -1,7 +1,7 @@
 ---
 name: ax-learn
 description: This skill helps an LLM generate correct AxLearn code using @ax-llm/ax. Use when the user asks about self-improving agents, trace-backed learning, feedback-aware updates, or AxLearn modes.
-version: "19.0.45"
+version: "20.0.0"
 ---
 # AxLearn Codegen Rules (@ax-llm/ax)

package/skills/ax-llm.md CHANGED

@@ -1,7 +1,7 @@
 ---
 name: ax
 description: This skill helps with using the @ax-llm/ax TypeScript library for building LLM applications. Use when the user asks about ax(), ai(), f(), s(), agent(), flow(), AxGen, AxAgent, AxFlow, signatures, streaming, or mentions @ax-llm/ax.
-version: "19.0.45"
+version: "20.0.0"
 ---
 # Ax Library (@ax-llm/ax) Quick Reference

package/skills/ax-signature.md CHANGED

@@ -1,7 +1,7 @@
 ---
 name: ax-signature
 description: This skill helps an LLM generate correct DSPy signature code using @ax-llm/ax. Use when the user asks about signatures, s(), f(), field types, string syntax, fluent builder API, validation constraints, or type-safe inputs/outputs.
-version: "19.0.45"
+version: "20.0.0"
 ---
 # Ax Signature Reference