@ax-llm/ax 19.0.16 → 19.0.17

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
---
name: ax-learn
description: This skill helps an LLM generate correct AxLearn code using @ax-llm/ax. Use when the user asks about self-improving agents, trace-backed learning, feedback-aware updates, or AxLearn modes.
version: "19.0.17"
---

# AxLearn Codegen Rules (@ax-llm/ax)

Use this skill to generate `AxLearn` code that matches the current API.

## Core Model

- `AxLearn` wraps an `AxGen`.
- `teacher` is the model used for judging, synthesis, and reflection.
- `runtimeAI` is the model being improved.
- `forward()` and `streamingForward()` are inference-time APIs that auto-log traces when tracing is enabled.
- `optimize()` runs offline learning.
- `applyUpdate()` is a bounded update API for `continuous` and `playbook` modes.
- `ready()` should be awaited before assuming checkpoints have been restored.
- `improvement` is the score delta from the previous/restored state.

## Required Inputs

- Always provide `name`.
- Always provide `storage`.
- Always provide `teacher`.
- Always provide `runtimeAI` if you call `optimize()` or `applyUpdate()`.

## Modes

- `batch`: offline prompt learning only.
- `continuous`: offline optimization plus bounded, feedback-aware `applyUpdate(...)`.
- `playbook`: structured context/playbook learning plus `applyUpdate(...)`.
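
A minimal guard can encode the mode rules above before attempting an update (a sketch; `AxLearnMode` and `assertUpdateAllowed` are illustrative names, not part of the library API):

```typescript
type AxLearnMode = 'batch' | 'continuous' | 'playbook';

// applyUpdate(...) is only valid in 'continuous' and 'playbook' modes;
// 'batch' mode supports offline optimize() only.
function assertUpdateAllowed(mode: AxLearnMode): void {
  if (mode === 'batch') {
    throw new Error(
      "applyUpdate() is not available in 'batch' mode; use optimize() instead."
    );
  }
}
```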

## Preferred Construction

```typescript
import {
  AxLearn,
  ax,
  ai,
  type AxCheckpoint,
  type AxStorage,
  type AxTrace,
} from '@ax-llm/ax';

const storage: AxStorage = {
  save: async (_name, _item) => {
    // persist trace/checkpoint
  },
  load: async (_name, _query) => {
    // return traces/checkpoints
    return [];
  },
};

const teacher = ai({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY!,
});

const runtimeAI = ai({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY!,
});

const gen = ax(`
  customerQuery:string "User message" ->
  supportReply:string "Agent reply"
`);

const agent = new AxLearn(gen, {
  name: 'support-bot-v1',
  storage,
  teacher,
  runtimeAI,
  mode: 'continuous',
  budget: 12,
  examples: [
    {
      customerQuery: 'Where is my order?',
      supportReply: 'Your order is in transit and should arrive in 2 days.',
    },
    {
      customerQuery: 'I need a refund.',
      supportReply: 'I can help with that. Please share your order number.',
    },
  ],
  generateExamples: false,
});

await agent.ready();
```

## Runtime Pattern

```typescript
const prediction = await agent.forward(runtimeAI, {
  customerQuery: 'My package is late.',
});

const traces = await agent.getTraces({ limit: 1 });
if (traces[0]) {
  await agent.addFeedback(traces[0].id, {
    score: 0,
    label: 'needs-empathy',
    comment: 'Acknowledge the frustration more directly.',
  });
}
```

## Offline Optimization

```typescript
const result = await agent.optimize({
  // Optional overrides
  budget: 20,
});

console.log(result.mode);
console.log(result.score);
console.log(result.improvement);
console.log(result.checkpointVersion);
```

`result.improvement` is the gain relative to the prior/restored score.
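
Concretely, with illustrative numbers:

```typescript
// improvement = current score minus the prior/restored score
const restoredScore = 0.62; // score carried in the restored checkpoint (illustrative)
const currentScore = 0.71; // score after this optimize() run (illustrative)
const improvement = currentScore - restoredScore; // close to 0.09, modulo float rounding
```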

## Continuous Update

Use `applyUpdate(...)` only in `continuous` or `playbook` mode.

- In `continuous` mode, `example` may be input-only.
- `prediction` is the observed runtime output being critiqued.
- If `example` includes expected output fields, that row remains eligible for scored optimization.
- The observed `prediction` is feedback/reflection context, not a scored train/validation row by itself.
- Feedback-bearing scored examples stay in the training pool whenever non-feedback rows can fill the validation set.
- In `playbook` mode, `getInstruction()` returns the active composed prompt.

```typescript
const update = await agent.applyUpdate({
  example: {
    customerQuery: 'My package is late.',
  },
  prediction,
  feedback: {
    score: 0,
    label: 'needs-empathy',
    comment: 'Acknowledge the frustration more directly.',
  },
});
```

## Playbook Mode

- Use `mode: 'playbook'` when the learned artifact should be structured guidance, not just an instruction tweak.
- Playbook checkpoints restore through `ready()`.
- `applyUpdate(...)` in playbook mode performs an online structured update.
- Treat `getInstruction()` as the active composed runtime prompt, even before optimization, if the base prompt lives in the signature description.
- `artifact.playbookSummary` should match the persisted checkpoint's `state.artifactSummary`.

## How Learning Data Is Used

- `examples` and usable traces become scored optimization rows.
- Feedback stored with `addFeedback(...)` becomes reflection feedback for later optimization.
- In continuous updates, `example + prediction + feedback` is treated as an observed feedback event.
- Input-only update examples are useful for reflection, but they are not promoted into scored examples unless expected outputs are present.
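
The validation-split behavior described above can be pictured with a small helper (a sketch of the policy only, not the library's internal code; `splitRows` and the `hasFeedback` flag are illustrative):

```typescript
type Row = { id: string; hasFeedback: boolean };

// Fill validation from non-feedback rows first, so feedback-bearing
// scored rows stay in the training pool whenever possible.
function splitRows<T extends Row>(rows: T[], validationSplit: number) {
  const validationSize = Math.floor(rows.length * validationSplit);
  const nonFeedback = rows.filter((r) => !r.hasFeedback);
  const withFeedback = rows.filter((r) => r.hasFeedback);
  const validation = nonFeedback.slice(0, validationSize);
  const train = [...withFeedback, ...nonFeedback.slice(validationSize)];
  return { train, validation };
}
```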

## Important Options

```typescript
const agent = new AxLearn(gen, {
  name: 'agent-id',
  storage,
  teacher,
  runtimeAI,
  mode: 'batch', // 'batch' | 'continuous' | 'playbook'
  budget: 20,
  metric: async ({ prediction, example }) => {
    return prediction.supportReply === example.supportReply ? 1 : 0;
  },
  criteria: 'accuracy and tone',
  judgeOptions: {},
  examples: [],
  useTraces: true,
  generateExamples: false,
  synthCount: 20,
  validationSplit: 0.2,
  continuousOptions: {
    feedbackWindowSize: 25,
    maxRecentTraces: 100,
    updateBudget: 4,
  },
  playbookOptions: {
    maxEpochs: 2,
  },
  onTrace: (trace) => {
    console.log(trace.id);
  },
  onProgress: (progress) => {
    console.log(progress.round, progress.score);
  },
});
```

## Result Shape

```typescript
type AxLearnResult = {
  mode: 'batch' | 'continuous' | 'playbook';
  score: number;
  improvement: number;
  checkpointVersion: number;
  stats: {
    trainingExamples: number;
    validationExamples: number;
    feedbackExamples: number;
    durationMs: number;
    mode: 'batch' | 'continuous' | 'playbook';
  };
  state?: {
    mode: 'batch' | 'continuous' | 'playbook';
    instruction?: string;
    baseInstruction?: string;
    score?: number;
    continuous?: {
      feedbackTraceCount?: number;
      lastUpdateAt?: string;
    };
    playbook?: Record<string, unknown>;
    artifactSummary?: Record<string, unknown>;
  };
  artifact?: {
    playbook?: Record<string, unknown>;
    playbookSummary?: {
      feedbackEvents: number;
      historyBatches: number;
      bulletCount: number;
      updatedAt?: string;
    };
    lastUpdateAt?: string;
    feedbackExamples?: number;
  };
};
```

## Storage Notes

- `AxStorage.save(name, item)` receives either a trace or a checkpoint.
- `AxStorage.load(name, query)` should return an array of traces or checkpoints.
- Checkpoints may be returned unsorted; `AxLearn` restores the newest one client-side.
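
A minimal in-memory store matching this contract might look like the following (a sketch: the class name, the `kind` discriminator, and the `query` shape are assumptions for illustration; the real item and query types come from `@ax-llm/ax`):

```typescript
type StoredItem = { kind: 'trace' | 'checkpoint' } & Record<string, unknown>;

// In-memory AxStorage-shaped store, keyed by agent name. Items are
// returned unsorted; per the notes above, AxLearn picks the newest
// checkpoint client-side.
class MemoryStorage {
  private items = new Map<string, StoredItem[]>();

  async save(name: string, item: StoredItem): Promise<void> {
    const list = this.items.get(name) ?? [];
    list.push(item);
    this.items.set(name, list);
  }

  async load(
    name: string,
    query?: { kind?: 'trace' | 'checkpoint' }
  ): Promise<StoredItem[]> {
    const list = this.items.get(name) ?? [];
    return query?.kind ? list.filter((i) => i.kind === query.kind) : [...list];
  }
}
```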

## Do This

- Use `runtimeAI` explicitly.
- Await `ready()` before relying on restored state.
- Run `optimize()` off the hot path.
- Use `continuous` mode when you want bounded, feedback-aware updates.
- Use `playbook` mode when you want persistent structured guidance.
- Pass the real observed model output as `prediction` in `applyUpdate(...)`.
- Treat `getInstruction()` in playbook mode as the live composed prompt, not just the raw base instruction.

## Avoid This

- Do not assume `teacher` is the optimized runtime model.
- Do not call `applyUpdate()` in `batch` mode.
- Do not claim feedback affects learning unless you store it with `addFeedback(...)` or pass it to `applyUpdate(...)`.
- Do not assume checkpoints load synchronously in the constructor.
- Do not treat `prediction` as the gold answer in continuous updates.