@ax-llm/ax 21.0.13 → 22.0.0
- package/README.md +3 -4
- package/index.cjs +462 -491
- package/index.cjs.map +1 -1
- package/index.d.cts +759 -3061
- package/index.d.ts +759 -3061
- package/index.global.js +460 -489
- package/index.global.js.map +1 -1
- package/index.js +462 -491
- package/index.js.map +1 -1
- package/package.json +1 -1
- package/skills/ax-agent-memory-skills.md +1 -1
- package/skills/ax-agent-observability.md +4 -4
- package/skills/ax-agent-optimize.md +1 -1
- package/skills/ax-agent-rlm.md +28 -8
- package/skills/ax-agent.md +27 -7
- package/skills/ax-ai.md +28 -12
- package/skills/ax-audio.md +1 -1
- package/skills/ax-flow.md +13 -5
- package/skills/ax-gen.md +35 -18
- package/skills/ax-gepa.md +1 -1
- package/skills/ax-llm.md +17 -9
- package/skills/ax-refine.md +81 -0
- package/skills/ax-signature.md +1 -1
- package/skills/ax-learn.md +0 -268
@@ -0,0 +1,81 @@
+---
+name: ax-refine
+description: Use this skill when writing or reviewing Ax bestOfN/refine code, reward functions, thresholds, native sample selection, serial attempts, generated advice, and attempt diagnostics.
+version: "22.0.0"
+---
+
+# Ax Refine And BestOfN
+
+Use `bestOfN(...)` when you can score complete outputs independently. Use `refine(...)` when failed rounds should produce feedback that changes the next attempt.
+
+## Breaking Migration
+
+Treat this as a breaking API change:
+
+- Do not generate `addAssert(...)` or `addStreamingAssert(...)`; they are removed.
+- Use schema validation for shape and field validity.
+- Use `bestOfN(...)` for complete-candidate selection.
+- Use `refine(...)` for retry rounds with generated feedback.
+- Use `addStreamingGuard(...)` only for fail-fast streaming safety.
+
+## APIs
+
+```typescript
+import { bestOfN, refine } from '@ax-llm/ax';
+
+const selected = bestOfN(program, {
+  n: 4,
+  threshold: 0.8,
+  rewardFn: ({ input, prediction, traces, chatLog }) => score(prediction),
+});
+
+const improved = refine(program, {
+  rounds: 3,
+  samplesPerRound: 2,
+  threshold: 0.85,
+  rewardDescription: 'Prefer complete, grounded, concise answers.',
+  rewardFn: ({ prediction }) => score(prediction),
+});
+```
+
+Rules:
+
+- `forward(...)` returns the selected prediction.
+- `streamingForward(...)` is unsupported; score complete outputs instead.
+- `getUsage()` aggregates usage across attempts.
+- `getTraces()` and `getChatLog()` return the selected attempt's diagnostics.
+- `getAttempts()` returns all attempt metadata, including reward, errors, and advice application.
+
+## Reward Functions
+
+Reward functions return a number. Higher is better. A `threshold` marks a good-enough candidate and can stop serial attempts early.
+
+```typescript
+const rewardFn = ({ prediction }) => {
+  const exact = prediction.answer === 'Paris' ? 1 : 0;
+  const concise = prediction.answer.length < 80 ? 0.2 : 0;
+  return exact + concise;
+};
+```
+
+Use serial strategy when the reward needs traces, chat logs, tools, or full flow behavior.
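
As a rough illustration of how a reward and a `threshold` relate, the sketch below types the reward function from the diff and checks its score against a threshold. It does not call `@ax-llm/ax`. The `Prediction` and `RewardArgs` types and the `meetsThreshold` helper are hypothetical names introduced here, and the `>=` comparison is an assumption about how "good enough" is decided.

```typescript
// Hypothetical local types; the reward context shown in the APIs section
// also carries input, traces, and chatLog.
type Prediction = { answer: string };
type RewardArgs = { prediction: Prediction };

// Same scoring as the diff's example: 1 for the exact answer,
// plus a 0.2 bonus when the answer is shorter than 80 characters.
const rewardFn = ({ prediction }: RewardArgs): number => {
  const exact = prediction.answer === 'Paris' ? 1 : 0;
  const concise = prediction.answer.length < 80 ? 0.2 : 0;
  return exact + concise;
};

// Assumption: a candidate is good enough when reward >= threshold.
const meetsThreshold = (reward: number, threshold: number): boolean =>
  reward >= threshold;

const good = rewardFn({ prediction: { answer: 'Paris' } });
const bad = rewardFn({ prediction: { answer: 'Lyon' } });

console.log(good, meetsThreshold(good, 0.8));
console.log(bad, meetsThreshold(bad, 0.8));
```

Because the brevity bonus alone scores well below 0.8, a short but wrong answer cannot pass the threshold in this sketch; only the exact answer does.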
+
+## Strategies
+
+- `strategy: "auto"` uses native samples for `AxGen` and serial attempts for composite programs.
+- `strategy: "native-samples"` uses `sampleCount` and a reward-backed `resultPicker`; candidate context includes outputs, not full per-candidate traces.
+- `strategy: "serial"` runs isolated full-program attempts with fresh memory/session IDs.
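
To make the serial strategy more concrete, here is a minimal local model of it: attempts run one at a time, each with its own session ID, and the loop stops early once a reward reaches the threshold. This is a sketch under stated assumptions, not the Ax implementation. `serialBestOfN`, `Attempt`, and the `attempt-N` session IDs are hypothetical, and the program is modeled as a synchronous function for brevity where a real program call would be asynchronous.

```typescript
// Hypothetical record of one attempt; not the shape returned by getAttempts().
type Attempt<T> = { sessionId: string; prediction: T; reward: number };

function serialBestOfN<T>(
  run: (sessionId: string) => T,
  rewardFn: (prediction: T) => number,
  n: number,
  threshold?: number
): { best: Attempt<T>; attempts: Attempt<T>[] } {
  if (n < 1) throw new Error('n must be at least 1');
  const attempts: Attempt<T>[] = [];
  for (let i = 0; i < n; i++) {
    // A fresh ID per attempt models isolated memory/session state.
    const sessionId = `attempt-${i}`;
    const prediction = run(sessionId);
    const reward = rewardFn(prediction);
    attempts.push({ sessionId, prediction, reward });
    // Assumption: reaching the threshold ends the loop early.
    if (threshold !== undefined && reward >= threshold) break;
  }
  // Keep the highest-reward attempt; ties go to the earliest one.
  const best = attempts.reduce((a, b) => (b.reward > a.reward ? b : a));
  return { best, attempts };
}

// Usage with a fake program that returns canned answers in order.
const outputs = ['Lyon', 'Paris', 'Marseille'];
let call = 0;
const result = serialBestOfN(
  () => outputs[call++],
  (p) => (p === 'Paris' ? 1 : 0),
  3,
  0.8
);
console.log(result.best.prediction, result.attempts.length);
```

In this run the second attempt meets the threshold, so the third canned answer is never requested.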
+
+## Refine Advice
+
+`refine(...)` generates advice after a below-threshold round. Advice is appended temporarily to matching `kind: "instruction"` components exposed by `getOptimizableComponents()` and applied through `applyOptimizedComponents()`.
+
+Rules:
+
+- Original instruction values are restored in `finally`, on success and error.
+- Programs without instruction components continue as best-of-N rounds and mark `adviceApplied: false`.
+- Do not add DSPy-style `hint_` signature fields; Ax uses instruction-component advice.
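
The restore-in-`finally` rule can be pictured with a small local model. The sketch below appends advice to an instruction value for the duration of one attempt and restores the original afterwards, whether the attempt returns or throws. `InstructionComponent` and `withTemporaryAdvice` are hypothetical names, and the separator used when appending advice is an assumption; the real component shape and wiring through `getOptimizableComponents()` may differ.

```typescript
// Hypothetical minimal shape of an instruction component.
type InstructionComponent = { kind: 'instruction'; value: string };

function withTemporaryAdvice<T>(
  component: InstructionComponent,
  advice: string,
  attempt: () => T
): T {
  const original = component.value;
  // Assumption: advice is appended after a blank line.
  component.value = `${original}\n\n${advice}`;
  try {
    return attempt();
  } finally {
    // Runs on success and on error, so the advice never leaks out.
    component.value = original;
  }
}

const component: InstructionComponent = {
  kind: 'instruction',
  value: 'Answer briefly.',
};

// During the attempt the instruction carries the advice.
const seenDuringAttempt = withTemporaryAdvice(
  component,
  'Advice: cite a source.',
  () => component.value
);

console.log(seenDuringAttempt);
console.log(component.value);
```

After the call, `component.value` is back to the original instruction, which is why later rounds or unrelated callers do not see stale advice.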
+
+## Streaming
+
+Do not use `refine(...)` for streaming. For partial-output safety, use `addStreamingGuard(fieldName, fn, message?)` on `AxGen`. Guards fail fast with `AxStreamingGuardError`; they do not retry, refine, or feed correction text back to the model.
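
The fail-fast behavior described here can be sketched without the library. In the local model below, a guard inspects the accumulated partial output after every chunk and the consumer throws as soon as the guard rejects it, with no retry and no feedback to the model. `GuardError`, `Guard`, and `consumeWithGuard` are hypothetical stand-ins; in particular, the boolean-returning guard signature is an assumption and may not match the `fn` that `addStreamingGuard` expects.

```typescript
// Hypothetical stand-in for AxStreamingGuardError.
class GuardError extends Error {}

// Assumption: a guard returns true while the partial output is acceptable.
type Guard = (partial: string) => boolean;

function consumeWithGuard(
  chunks: string[],
  guard: Guard,
  message = 'streaming guard failed'
): string {
  let partial = '';
  for (const chunk of chunks) {
    partial += chunk;
    // Fail fast: stop at the first rejected partial output.
    if (!guard(partial)) throw new GuardError(message);
  }
  return partial;
}

const noSecrets: Guard = (partial) => partial.indexOf('SECRET') === -1;

const safe = consumeWithGuard(['Hello, ', 'world'], noSecrets);
console.log(safe);

let failure = '';
try {
  consumeWithGuard(['Token: ', 'SECRET', ' (never read)'], noSecrets, 'secret leaked');
} catch (err) {
  failure = (err as Error).message;
}
console.log(failure);
```

The third chunk in the failing case is never appended, which mirrors the intent of a guard: stop the stream rather than try to repair it.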
package/skills/ax-signature.md CHANGED
@@ -1,7 +1,7 @@
 ---
 name: ax-signature
 description: This skill helps an LLM generate correct DSPy signature code using @ax-llm/ax. Use when the user asks about signatures, s(), f(), field types, string syntax, fluent builder API, validation constraints, or type-safe inputs/outputs.
-version: "
+version: "22.0.0"
 ---

 # Ax Signature Reference
package/skills/ax-learn.md DELETED
@@ -1,268 +0,0 @@
----
-name: ax-learn
-description: This skill helps an LLM generate correct AxLearn code using @ax-llm/ax. Use when the user asks about self-improving agents, trace-backed learning, feedback-aware updates, or AxLearn modes.
-version: "21.0.13"
----
-
-# AxLearn Codegen Rules (@ax-llm/ax)
-
-Use this skill to generate `AxLearn` code that matches the current API.
-
-## Core Model
-
-- `AxLearn` wraps an `AxGen`.
-- `teacher` is for judging, synthesis, and reflection.
-- `runtimeAI` is the model being improved.
-- `forward()` and `streamingForward()` are inference-time APIs and auto-log traces when tracing is enabled.
-- `optimize()` is offline learning.
-- `applyUpdate()` is a bounded update API for `continuous` and `playbook` modes.
-- `ready()` should be awaited before assuming checkpoints have been restored.
-- `improvement` is the score delta from the previous/restored state.
-
-## Required Inputs
-
-- Always provide `name`.
-- Always provide `storage`.
-- Always provide `teacher`.
-- Always provide `runtimeAI` if you call `optimize()` or `applyUpdate()`.
-
-## Modes
-
-- `batch`: offline prompt learning only.
-- `continuous`: offline optimization plus bounded feedback-aware `applyUpdate(...)`.
-- `playbook`: structured context/playbook learning plus `applyUpdate(...)`.
-
-## Preferred Construction
-
-```typescript
-import {
-  AxLearn,
-  ax,
-  ai,
-  type AxCheckpoint,
-  type AxStorage,
-  type AxTrace,
-} from '@ax-llm/ax';
-
-const storage: AxStorage = {
-  save: async (_name, _item) => {
-    // persist trace/checkpoint
-  },
-  load: async (_name, _query) => {
-    // return traces/checkpoints
-    return [];
-  },
-};
-
-const teacher = ai({
-  name: 'openai',
-  apiKey: process.env.OPENAI_APIKEY!,
-});
-
-const runtimeAI = ai({
-  name: 'openai',
-  apiKey: process.env.OPENAI_APIKEY!,
-});
-
-const gen = ax(`
-  customerQuery:string "User message" ->
-  supportReply:string "Agent reply"
-`);
-
-const agent = new AxLearn(gen, {
-  name: 'support-bot-v1',
-  storage,
-  teacher,
-  runtimeAI,
-  mode: 'continuous',
-  budget: 12,
-  examples: [
-    {
-      customerQuery: 'Where is my order?',
-      supportReply: 'Your order is in transit and should arrive in 2 days.',
-    },
-    {
-      customerQuery: 'I need a refund.',
-      supportReply: 'I can help with that. Please share your order number.',
-    },
-  ],
-  generateExamples: false,
-});
-
-await agent.ready();
-```
-
-## Runtime Pattern
-
-```typescript
-const prediction = await agent.forward(runtimeAI, {
-  customerQuery: 'My package is late.',
-});
-
-const traces = await agent.getTraces({ limit: 1 });
-if (traces[0]) {
-  await agent.addFeedback(traces[0].id, {
-    score: 0,
-    label: 'needs-empathy',
-    comment: 'Acknowledge the frustration more directly.',
-  });
-}
-```
-
-## Offline Optimization
-
-```typescript
-const result = await agent.optimize({
-  // Optional overrides
-  budget: 20,
-});
-
-console.log(result.mode);
-console.log(result.score);
-console.log(result.improvement);
-console.log(result.checkpointVersion);
-```
-
-`result.improvement` is the gain relative to the prior/restored score.
-
-## Continuous Update
-
-Use `applyUpdate(...)` only in `continuous` or `playbook` mode.
-
-- In `continuous` mode, `example` may be input-only.
-- `prediction` is the observed runtime output being critiqued.
-- If `example` includes expected output fields, that expected-output row stays eligible for scored optimization.
-- The observed `prediction` row is feedback/reflection context, not a scored train/validation row by itself.
-- Feedback-bearing scored examples should stay in the training pool when non-feedback rows can fill validation.
-- In `playbook` mode, `getInstruction()` returns the active composed prompt.
-
-```typescript
-const update = await agent.applyUpdate({
-  example: {
-    customerQuery: 'My package is late.',
-  },
-  prediction,
-  feedback: {
-    score: 0,
-    label: 'needs-empathy',
-    comment: 'Acknowledge the frustration more directly.',
-  },
-});
-```
-
-## Playbook Mode
-
-- Use `mode: 'playbook'` when the learned artifact should be structured guidance, not just an instruction tweak.
-- Playbook checkpoints restore through `ready()`.
-- `applyUpdate(...)` in playbook mode performs an online structured update.
-- `getInstruction()` should be treated as the active composed runtime prompt, even before optimization if the base prompt lives in the signature description.
-- `artifact.playbookSummary` should match the persisted checkpoint `state.artifactSummary`.
-
-## How Learning Data Is Used
-
-- `examples` and usable traces become scored optimization rows.
-- Feedback stored with `addFeedback(...)` becomes reflection feedback for later optimization.
-- In continuous updates, `example + prediction + feedback` is used as an observed feedback event.
-- Input-only update examples are useful for reflection, but they are not promoted into scored examples unless expected outputs are present.
-
-## Important Options
-
-```typescript
-const agent = new AxLearn(gen, {
-  name: 'agent-id',
-  storage,
-  teacher,
-  runtimeAI,
-  mode: 'batch', // 'batch' | 'continuous' | 'playbook'
-  budget: 20,
-  metric: async ({ prediction, example }) => {
-    return prediction.supportReply === example.supportReply ? 1 : 0;
-  },
-  criteria: 'accuracy and tone',
-  judgeOptions: {},
-  examples: [],
-  useTraces: true,
-  generateExamples: false,
-  synthCount: 20,
-  validationSplit: 0.2,
-  continuousOptions: {
-    feedbackWindowSize: 25,
-    maxRecentTraces: 100,
-    updateBudget: 4,
-  },
-  playbookOptions: {
-    maxEpochs: 2,
-  },
-  onTrace: (trace) => {
-    console.log(trace.id);
-  },
-  onProgress: (progress) => {
-    console.log(progress.round, progress.score);
-  },
-});
-```
-
-## Result Shape
-
-```typescript
-type AxLearnResult = {
-  mode: 'batch' | 'continuous' | 'playbook';
-  score: number;
-  improvement: number;
-  checkpointVersion: number;
-  stats: {
-    trainingExamples: number;
-    validationExamples: number;
-    feedbackExamples: number;
-    durationMs: number;
-    mode: 'batch' | 'continuous' | 'playbook';
-  };
-  state?: {
-    mode: 'batch' | 'continuous' | 'playbook';
-    instruction?: string;
-    baseInstruction?: string;
-    score?: number;
-    continuous?: {
-      feedbackTraceCount?: number;
-      lastUpdateAt?: string;
-    };
-    playbook?: Record<string, unknown>;
-    artifactSummary?: Record<string, unknown>;
-  };
-  artifact?: {
-    playbook?: Record<string, unknown>;
-    playbookSummary?: {
-      feedbackEvents: number;
-      historyBatches: number;
-      bulletCount: number;
-      updatedAt?: string;
-    };
-    lastUpdateAt?: string;
-    feedbackExamples?: number;
-  };
-};
-```
-
-## Storage Notes
-
-- `AxStorage.save(name, item)` receives either a trace or checkpoint.
-- `AxStorage.load(name, query)` should return arrays of traces or checkpoints.
-- Checkpoints may be returned unsorted. `AxLearn` restores the newest one client-side.
-
-## Do This
-
-- Use `runtimeAI` explicitly.
-- Await `ready()` before relying on restored state.
-- Run `optimize()` off the hot path.
-- Use `continuous` mode when you want bounded feedback-aware updates.
-- Use `playbook` mode when you want persistent structured guidance.
-- Pass the real observed model output as `prediction` in `applyUpdate(...)`.
-- Treat `getInstruction()` in playbook mode as the live composed prompt, not just the raw base instruction.
-
-## Avoid This
-
-- Do not assume `teacher` is the optimized runtime model.
-- Do not call `applyUpdate()` in `batch` mode.
-- Do not claim feedback affects learning unless you are storing it with `addFeedback(...)` or passing it to `applyUpdate(...)`.
-- Do not assume checkpoints load synchronously in the constructor.
-- Do not treat `prediction` as the gold answer in continuous updates.