dialai 1.2.0 → 1.4.0

@@ -0,0 +1,1100 @@
1
+ # Enriched Transitions
2
+
3
+ ## Spec Weaknesses Found (from original draft)
4
+
5
+ 1. **Code sample typos throughout** — `tnsName` for `transName`, `inace` for `interface`, `constrMessage` for `const userMessage`, `systemMes` for `systemMessage`, `resu.reasoning` for `result.reasoning`, `OpenAIToolDefinitionl` for `OpenAIToolDefinition[] | null`, `detions` for `definitions`, `rseModelId` for `parseModelId`, `stringstring` for `string, string`. Ralph would copy these verbatim and waste time debugging.
6
+ 2. **Phase 8 tests are numbered paragraphs, not testable code** — no file locations, no assertion details, no permutation matrix for the callLlm/callLlmWithTools decision tree.
7
+ 3. **Missing: `classifyArbitration` type signature update** — its `currentStateTransitions` parameter is typed `Record<string, string> | undefined`. After this change, callers pass `Record<string, string | TransitionDefinition>`. The function itself reads `currentStateTransitions[transitionName]` to get a toState string. This is a type error after enrichment unless `classifyArbitration` is updated or receives only pre-extracted data.
8
+ 4. **Missing: export declarations** — `TransitionDefinition` must be exported from `types.ts`. `ToolCallResult`, `NoToolCallError`, `callLlmWithTools` must be exported from `llm.ts`. The spec never says this.
9
+ 5. **Missing: `submitProposal` stores `metaJson` from opts only** — line 708 in `api.ts` uses `metaJson` (the destructured opts value), not `finalMetaJson`. Strategy-returned `metaJson` would never reach the proposal without changing this line.
10
+ 6. **Duplication of `callLlm` in `callLlmWithTools`** — 80% of the code is identical (fetch, audit, error handling). Spec acknowledges this in Failure Rule 3 but gives no guidance. Decision: extract a shared `callLlmRaw` helper or accept duplication. This must be decided upfront.
11
+ 7. **Silent fallthrough on HTTP errors** — if `callLlmWithTools` gets a 500, the spec catches it and falls through to make a *second* HTTP call via `callLlm`. This doubles cost on transient failures. The fallback should only trigger on `NoToolCallError`, not on HTTP/network errors.
12
+ 8. **`executeProposerLlm` existing test uses `transitions: { approve: "approved", reject: "rejected" }`** — after this change, `ProposerContext.transitions` is `Record<string, TransitionDefinition>`. That existing test breaks unless updated.
13
+
14
+ ## Executive Summary
15
+
16
+ Change `StateDefinition.transitions` from `Record<string, string>` to `Record<string, string | TransitionDefinition>` so that each transition can carry a description and parameter schema alongside its target state. When a transition has parameters, DIAL builds OpenAI-compatible tool definitions and uses native function calling via a new `callLlmWithTools` function. The shorthand form (`"closed"`) remains valid and is normalized to `{ target: "closed" }` internally. `callLlm` is unchanged.
17
+
18
+ ## Objective
19
+
20
+ Let transitions carry enough metadata to become tool calls, so that `executeProposerLlm` can convert them into native OpenAI `tools` automatically without a separate `tools` field or any manual construction by the consumer.
21
+
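As a concrete illustration, a single state can mix both forms. This is a hypothetical machine fragment (the state, transition names, and prompt are invented for this example):

```typescript
// Hypothetical state definition mixing both transition forms.
// "close" uses the string shorthand; "uber_ride" carries tool metadata.
const rideState = {
  prompt: "Decide how to respond to the rider's last message.",
  transitions: {
    close: "closed", // shorthand, normalized to { target: "closed" }
    uber_ride: {
      target: "riding",
      description: "Book an Uber to the requested destination",
      parameters: {
        type: "object",
        properties: { destination: { type: "string" } },
        required: ["destination"],
      },
    },
  },
};
```

After `normalizeMachine`, both entries are plain `TransitionDefinition` objects; only `uber_ride` produces a tool, because only it carries a description and parameters.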
22
+ ## In Scope
23
+
24
+ - New `TransitionDefinition` type with `target`, `description`, and `parameters`
25
+ - `StateDefinition.transitions` accepts both `string` (shorthand) and `TransitionDefinition` (enriched)
26
+ - `normalizeMachine` converts all shorthand transitions to `TransitionDefinition`
27
+ - `validateMachine` validates enriched transitions (target state exists, parameters is valid object)
28
+ - Internal DIAL code updated to read `.target` from normalized transitions
29
+ - `ProposerContext.transitions` changes to `Record<string, TransitionDefinition>` (normalized form)
30
+ - New `callLlmWithTools` function with dedicated `ToolCallResult` return type
31
+ - New `NoToolCallError` class thrown when model responds with text instead of a tool call
32
+ - `callLlm` is **completely unchanged**
33
+ - `executeProposerLlm` builds tools from enriched transitions and tries `callLlmWithTools` first
34
+ - `executeProposerLlm` catches `NoToolCallError` and falls back to `callLlm` text path
35
+ - `modelId[tools=no]` flag to skip tool attempt for models known not to support tools
36
+ - `ProposerStrategyResult` gets optional `metaJson` field (tool arguments)
37
+ - `submitProposal` merges strategy-returned `metaJson` into the proposal
38
+ - `classifyArbitration` parameter type updated for enriched transitions
39
+ - Existing `executeProposerLlm` test updated for new `ProposerContext.transitions` shape
40
+ - Unit tests for normalization, tool building, tool calling, text fallback, and opt-out
41
+
42
+ ## Out of Scope
43
+
44
+ - Parallel tool calls (only first `tool_call` used)
45
+ - Streaming / SSE tool call responses
46
+ - Any changes to `callLlm`
47
+ - Webhook tool execution
48
+ - `tool_choice` configuration beyond the `"auto"` default
49
+ - Extracting shared HTTP/audit logic from `callLlm` and `callLlmWithTools` (accepted duplication for now)
50
+
51
+ ## Assumptions and Constraints
52
+
53
+ - The shorthand form `uber_ride: "closed"` must keep working everywhere. Backward compatibility is mandatory.
54
+ - `normalizeMachine` is the single normalization point. After normalization, all runtime code sees `TransitionDefinition` objects.
55
+ - Consumer `strategyFn` implementations that access `ctx.transitions[name]` will get a `TransitionDefinition` object instead of a string. This is a **breaking change** to `ProposerContext`. Migration: `ctx.transitions[name]` becomes `ctx.transitions[name].target`.
56
+ - `callLlm` must not be modified in any way.
57
+ - `callLlmWithTools` is a separate function with its own clean return type. No union types.
58
+ - This spec depends on the operational metrics spec (`proposal-metadata-update.md`) for `latencyMsec`, `numInputTokens`, `numOutputTokens` on `ProposerStrategyResult`.
59
+ - HTTP/network errors from `callLlmWithTools` should **not** fall through to the text path. Only `NoToolCallError` triggers the fallback. This avoids doubling API cost on transient failures.
60
+ - Code duplication between `callLlm` and `callLlmWithTools` is accepted. Do not refactor shared logic.
61
+
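The `ProposerContext` migration in practice, as a minimal sketch (the context value is stubbed locally; only the `.target` access is the point):

```typescript
// Migration sketch for consumer strategyFn code that reads transitions.
// TransitionDefinition is restated here so the snippet stands alone.
interface TransitionDefinition {
  target: string;
  description?: string;
  parameters?: Record<string, unknown>;
}

const ctx = {
  transitions: {
    approve: { target: "approved" },
    reject: { target: "rejected" },
  } as Record<string, TransitionDefinition>,
};

// Before this change: ctx.transitions["approve"] was the string "approved".
// After: it is a TransitionDefinition, so read .target instead.
const toState = ctx.transitions["approve"].target;
```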
62
+ ## Files to Modify
63
+
64
+ | File | Action |
65
+ |------|--------|
66
+ | `src/dialai/types.ts` | Add `TransitionDefinition`, update `StateDefinition.transitions`, update `ProposerContext.transitions`, add `metaJson` to `ProposerStrategyResult` |
67
+ | `src/dialai/utils.ts` | Update `normalizeMachine` to normalize transitions, update `validateMachine` to handle both forms |
68
+ | `src/dialai/api.ts` | Update `buildProposerContext`, `submitProposal`, `executeTransition`, `classifyArbitration` to read `.target` |
69
+ | `src/dialai/llm.ts` | Add `callLlmWithTools`, `NoToolCallError`, `ToolCallResult`, `buildToolsFromTransitions`, `parseModelId`; update `executeProposerLlm` and `assembleProposerPrompt` |
70
+ | `src/dialai/utils.test.ts` | Add normalization tests for shorthand, enriched, and mixed transitions |
71
+ | `src/dialai/llm.test.ts` | Update existing `executeProposerLlm` test; add `callLlmWithTools` and tool-path tests |
72
+ | `src/dialai/llm-audit.test.ts` | Add audit test for `callLlmWithTools` |
73
+ | `tests/unit/submit-proposal.test.ts` | Add metaJson merging tests |
74
+ | `tests/unit/machine-validation.test.ts` | Add enriched transition validation tests |
75
+ | `tests/unit/execute-transition.test.ts` | Verify existing tests still pass with normalized transitions |
76
+
77
+ ## Files to Read (do not modify)
78
+
79
+ | File | Why |
80
+ |------|-----|
81
+ | `src/dialai/store.ts` | Understand `Proposal` storage and `appendLlmAuditEntry` |
82
+ | `src/dialai/strategies.ts` | Check if any built-in strategies access `transitions` directly |
83
+
84
+ ## Implementation Plan
85
+
86
+ ### Phase 1: Types
87
+
88
+ **`types.ts` — new `TransitionDefinition` (export it):**
89
+
90
+ ```typescript
91
+ /** A transition with optional tool-calling metadata. */
92
+ export interface TransitionDefinition {
93
+ /** Target state this transition leads to */
94
+ target: string;
95
+ /** Description of what this transition/tool does (used as tool description in LLM calls) */
96
+ description?: string;
97
+ /** JSON Schema for the transition's parameters (used as tool parameters in LLM calls) */
98
+ parameters?: Record<string, unknown>;
99
+ }
100
+ ```
101
+
102
+ **`types.ts` — `StateDefinition.transitions`:**
103
+
104
+ ```typescript
105
+ transitions?: Record<string, string | TransitionDefinition>;
106
+ ```
107
+
108
+ **`types.ts` — `ProposerContext.transitions`:**
109
+
110
+ ```typescript
111
+ /** Normalized transitions — always TransitionDefinition after normalization */
112
+ transitions: Record<string, TransitionDefinition>;
113
+ ```
114
+
115
+ **`types.ts` — `ProposerStrategyResult.metaJson`:**
116
+
117
+ ```typescript
118
+ /** Structured metadata — tool arguments land here */
119
+ metaJson?: Record<string, unknown>;
120
+ ```
121
+
122
+ **Validate:** `npm run typecheck` — expect type errors in files that read `transitions` as strings. These are fixed in Phase 3.
123
+
124
+ ### Phase 2: Normalization and Validation
125
+
126
+ **`utils.ts` — `normalizeMachine`:**
127
+
128
+ Add transition normalization after the existing `defaultState` migration:
129
+
130
+ ```typescript
131
+ // Normalize transitions: string shorthand -> TransitionDefinition
132
+ if (normalized.states) {
133
+ const normalizedStates: Record<string, StateDefinition> = {};
134
+ for (const [stateName, stateDef] of Object.entries(normalized.states)) {
135
+ if (stateDef?.transitions) {
136
+ const normalizedTransitions: Record<string, TransitionDefinition> = {};
137
+ for (const [transName, value] of Object.entries(stateDef.transitions)) {
138
+ if (typeof value === "string") {
139
+ normalizedTransitions[transName] = { target: value };
140
+ } else {
141
+ normalizedTransitions[transName] = value;
142
+ }
143
+ }
144
+ normalizedStates[stateName] = {
145
+ ...stateDef,
146
+ transitions: normalizedTransitions,
147
+ };
148
+ } else {
149
+ normalizedStates[stateName] = stateDef;
150
+ }
151
+ }
152
+ normalized.states = normalizedStates;
153
+ }
154
+ ```
155
+
156
+ Import `TransitionDefinition` and `StateDefinition` from `types.js` in `utils.ts`.
157
+
158
+ **`utils.ts` — `validateMachine`:**
159
+
160
+ Update the transition validation loop to handle both forms:
161
+
162
+ ```typescript
163
+ for (const [transitionName, value] of Object.entries(transitions)) {
164
+ const targetState = typeof value === "string" ? value : value.target;
165
+ if (!(targetState in states)) {
166
+ throw new Error(
167
+ `Invalid machine definition: transition "${transitionName}" in state `
168
+ + `"${stateName}" points to non-existent state "${targetState}"`
169
+ );
170
+ }
171
+ }
172
+ ```
173
+
174
+ **Validate:** `npm run typecheck` after Phase 2. Some errors will remain in api.ts and llm.ts (fixed in Phase 3).
175
+
176
+ ### Phase 3: Internal Code Migration
177
+
178
+ Every place that reads a transition value expecting a string must read `.target` instead.
179
+
180
+ **`api.ts` — `buildProposerContext`:**
181
+
182
+ ```typescript
183
+ function buildProposerContext(session: Session): ProposerContext {
184
+ const currentStateDef = session.machine.states[session.currentState];
185
+ const rawTransitions = currentStateDef?.transitions ?? {};
186
+
187
+ // Normalize for context (should already be normalized, but be safe)
188
+ const transitions: Record<string, TransitionDefinition> = {};
189
+ for (const [name, value] of Object.entries(rawTransitions)) {
190
+ transitions[name] = typeof value === "string" ? { target: value } : value;
191
+ }
192
+
193
+ return {
194
+ sessionId: session.sessionId,
195
+ currentState: session.currentState,
196
+ prompt: currentStateDef?.prompt ?? "",
197
+ transitions,
198
+ history: [...session.history],
199
+ metaJson: session.metaJson,
200
+ };
201
+ }
202
+ ```
203
+
204
+ **`api.ts` — `submitProposal` line 696:**
205
+
206
+ ```typescript
207
+ // Before:
208
+ finalToState = currentStateDef.transitions[finalTransitionName];
209
+
210
+ // After:
211
+ const transitionDef = currentStateDef.transitions[finalTransitionName];
212
+ finalToState = typeof transitionDef === "string" ? transitionDef : transitionDef.target;
213
+ ```
214
+
215
+ **`api.ts` — `submitProposal` metaJson merging:**
216
+
217
+ Change line 708 from `metaJson,` to `metaJson: metaJson ?? finalMetaJson,` and add `let finalMetaJson: Record<string, unknown> | undefined;` alongside the other `final*` declarations. Inside the strategy invocation block, add `finalMetaJson = result.metaJson;`.
218
+
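The resulting precedence can be sketched in isolation. The helper below is hypothetical (the real logic stays inline in `submitProposal`); it only shows that an explicitly passed `opts.metaJson` wins over the strategy-returned value:

```typescript
// Hypothetical standalone version of the metaJson merge performed inline
// in submitProposal: the opts value takes precedence over the strategy result.
function resolveMetaJson(
  optsMetaJson: Record<string, unknown> | undefined,
  strategyMetaJson: Record<string, unknown> | undefined,
): Record<string, unknown> | undefined {
  return optsMetaJson ?? strategyMetaJson;
}
```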
219
+ **`api.ts` — `executeTransition` line 1079:**
220
+
221
+ ```typescript
222
+ // Before:
223
+ const expectedToState = currentStateDef.transitions[transitionName];
224
+
225
+ // After:
226
+ const transitionDef = currentStateDef.transitions[transitionName];
227
+ const expectedToState = typeof transitionDef === "string" ? transitionDef : transitionDef.target;
228
+ ```
229
+
230
+ **`api.ts` — `classifyArbitration`:**
231
+
232
+ Update parameter type from `Record<string, string> | undefined` to `Record<string, string | TransitionDefinition> | undefined`. Update the two lines that read transition values:
233
+
234
+ ```typescript
235
+ // Line 788 — existence check is unchanged (truthy check on object or string both work)
236
+ if (!currentStateTransitions?.[transitionName]) { ... }
237
+
238
+ // Line 797 — extract target
239
+ const raw = currentStateTransitions[transitionName];
240
+ const toState = typeof raw === "string" ? raw : raw.target;
241
+ return { type: "humanOverride", transitionName, toState };
242
+ ```
243
+
244
+ The `ArbitrationPath` type's `humanOverride` variant has `toState: string`, which is unchanged.
245
+
246
+ **`llm.ts` — `assembleProposerPrompt`:**
247
+
248
+ ```typescript
249
+ // Before:
250
+ const transitions = Object.entries(ctx.transitions)
251
+ .map(([name, target]) => ` - "${name}" -> "${target}"`)
252
+ .join("\n");
253
+
254
+ // After:
255
+ const transitions = Object.entries(ctx.transitions)
256
+ .map(([name, def]) => ` - "${name}" -> "${def.target}"`)
257
+ .join("\n");
258
+ ```
259
+
260
+ **Validate:** `npm run typecheck` — should now pass. `npm test` — existing tests may fail due to the `ProposerContext.transitions` shape change in `llm.test.ts`; that test is updated in Phase 7.
261
+
262
+ ### Phase 4: Tool Building Helper
263
+
264
+ **`llm.ts` — `OpenAIToolDefinition` type (internal) and `buildToolsFromTransitions` helper:**
265
+
266
+ ```typescript
267
+ interface OpenAIToolDefinition {
268
+ type: "function";
269
+ function: {
270
+ name: string;
271
+ description: string;
272
+ parameters: Record<string, unknown>;
273
+ };
274
+ }
275
+
276
+ /**
277
+ * Builds OpenAI-compatible tool definitions from enriched transitions.
278
+ * Returns null if no transitions have descriptions or parameters.
279
+ */
280
+ function buildToolsFromTransitions(
281
+ transitions: Record<string, TransitionDefinition>,
282
+ ): OpenAIToolDefinition[] | null {
283
+ const tools: OpenAIToolDefinition[] = [];
284
+
285
+ for (const [name, def] of Object.entries(transitions)) {
286
+ if (def.description || def.parameters) {
287
+ tools.push({
288
+ type: "function",
289
+ function: {
290
+ name,
291
+ description: def.description ?? name,
292
+ parameters: def.parameters ?? { type: "object", properties: {} },
293
+ },
294
+ });
295
+ }
296
+ }
297
+
298
+ return tools.length > 0 ? tools : null;
299
+ }
300
+ ```
301
+
302
+ Key behavior: only transitions with `description` or `parameters` become tools. Plain `{ target: "closed" }` transitions produce no tools. If `buildToolsFromTransitions` returns `null`, there are no enriched transitions and the text path is used.
303
+
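That gating behavior can be checked standalone. The function body below is copied from the Phase 4 definition above so the snippet runs on its own; the transition names are invented:

```typescript
// Self-contained check of the gating behavior described above.
interface TransitionDefinition {
  target: string;
  description?: string;
  parameters?: Record<string, unknown>;
}
interface OpenAIToolDefinition {
  type: "function";
  function: { name: string; description: string; parameters: Record<string, unknown> };
}

function buildToolsFromTransitions(
  transitions: Record<string, TransitionDefinition>,
): OpenAIToolDefinition[] | null {
  const tools: OpenAIToolDefinition[] = [];
  for (const [name, def] of Object.entries(transitions)) {
    if (def.description || def.parameters) {
      tools.push({
        type: "function",
        function: {
          name,
          description: def.description ?? name,
          parameters: def.parameters ?? { type: "object", properties: {} },
        },
      });
    }
  }
  return tools.length > 0 ? tools : null;
}

// Plain transitions produce no tools at all:
const none = buildToolsFromTransitions({ close: { target: "closed" } });
// Mixed sets produce tools only for the enriched entries:
const some = buildToolsFromTransitions({
  close: { target: "closed" },
  book: { target: "booked", description: "Book a ride" },
});
```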
304
+ Export `buildToolsFromTransitions` so the tool-building logic can be unit-tested directly; `OpenAIToolDefinition` itself stays internal to `llm.ts`.
305
+
306
+ ### Phase 5: `callLlmWithTools`
307
+
308
+ **`llm.ts` — `ToolCallResult` (export it):**
309
+
310
+ ```typescript
311
+ /** Result from a successful tool-calling LLM request. */
312
+ export interface ToolCallResult {
313
+ /** The function the model chose to call */
314
+ name: string;
315
+ /** Parsed arguments object */
316
+ arguments: Record<string, unknown>;
317
+ /** Text content from the model alongside the tool call, if any */
318
+ reasoning: string;
319
+ /** Token usage */
320
+ usage?: {
321
+ prompt_tokens?: number;
322
+ completion_tokens?: number;
323
+ };
324
+ }
325
+ ```
326
+
327
+ **`llm.ts` — `NoToolCallError` (export it):**
328
+
329
+ ```typescript
330
+ /** Thrown when callLlmWithTools gets a response without tool_calls. */
331
+ export class NoToolCallError extends Error {
332
+ /** The text content the model returned instead */
333
+ content: string;
334
+ usage?: { prompt_tokens?: number; completion_tokens?: number };
335
+
336
+ constructor(
337
+ content: string,
338
+ usage?: { prompt_tokens?: number; completion_tokens?: number },
339
+ ) {
340
+ super("Model did not return a tool call");
341
+ this.name = "NoToolCallError";
342
+ this.content = content;
343
+ this.usage = usage;
344
+ }
345
+ }
346
+ ```
347
+
348
+ **`llm.ts` — `callLlmWithTools` (export it):**
349
+
350
+ Follows the same structure as `callLlm` (fetch, audit, error handling) but adds `tools` and `tool_choice` to the request body and parses `tool_calls` from the response. Code is intentionally duplicated from `callLlm` — do not refactor shared logic.
351
+
352
+ The function:
353
+ 1. Throws `Error` if no API token
354
+ 2. Sends POST with `tools` and `tool_choice: "auto"` in the body
355
+ 3. On HTTP error: throws `Error` (not `NoToolCallError`)
356
+ 4. On success with `tool_calls`: returns `ToolCallResult`
357
+ 5. On success without `tool_calls`: throws `NoToolCallError` with the text content
358
+ 6. Writes audit entry in `finally` block (identical pattern to `callLlm`)
359
+
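A sketch of the shape this function could take. The parameter list, endpoint URL, and token handling here are assumptions made so the snippet is self-contained; the real implementation mirrors `callLlm`'s existing fetch and audit plumbing (the audit `finally` block is omitted below):

```typescript
// Illustrative sketch of callLlmWithTools against an OpenAI-compatible
// /chat/completions endpoint. apiUrl/apiToken as parameters are an
// assumption; the real function reads them the same way callLlm does.
interface ToolCallResult {
  name: string;
  arguments: Record<string, unknown>;
  reasoning: string;
  usage?: { prompt_tokens?: number; completion_tokens?: number };
}

class NoToolCallError extends Error {
  constructor(
    public content: string,
    public usage?: { prompt_tokens?: number; completion_tokens?: number },
  ) {
    super("Model did not return a tool call");
    this.name = "NoToolCallError";
  }
}

async function callLlmWithTools(
  modelId: string,
  systemMessage: string,
  userMessage: string,
  tools: unknown[],
  apiUrl: string,
  apiToken: string,
): Promise<ToolCallResult> {
  if (!apiToken) throw new Error("Missing LLM API token");

  const response = await fetch(apiUrl, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiToken}`,
    },
    body: JSON.stringify({
      model: modelId,
      messages: [
        { role: "system", content: systemMessage },
        { role: "user", content: userMessage },
      ],
      tools,
      tool_choice: "auto",
    }),
  });

  // HTTP failures throw a plain Error so callers never fall back on them.
  if (!response.ok) {
    throw new Error(`LLM API error (${response.status}): ${await response.text()}`);
  }

  const data = await response.json();
  const message = data.choices?.[0]?.message ?? {};
  const toolCall = message.tool_calls?.[0];

  // A 200 response with a missing or empty tool_calls array is the only
  // condition that raises NoToolCallError.
  if (!toolCall) {
    throw new NoToolCallError(message.content ?? "", data.usage);
  }

  let args: Record<string, unknown> = {};
  try {
    args = JSON.parse(toolCall.function.arguments) as Record<string, unknown>;
  } catch {
    // Unparseable tool arguments degrade to an empty object.
  }

  return {
    name: toolCall.function.name,
    arguments: args,
    reasoning: message.content ?? "",
    usage: data.usage,
  };
}
```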
360
+ ### Phase 6: `executeProposerLlm` with Try/Fallback
361
+
362
+ **`llm.ts` — `parseModelId`:**
363
+
364
+ ```typescript
365
+ function parseModelId(raw: string): { modelId: string; useTools: boolean } {
366
+ const match = raw.match(/^(.+?)(?:\[(.+)\])?$/);
367
+ const modelId = match?.[1] ?? raw;
368
+ const flags = match?.[2] ?? "";
369
+ const useTools = !flags.includes("tools=no");
370
+ return { modelId, useTools };
371
+ }
372
+ ```
373
+
374
+ **`llm.ts` — updated `executeProposerLlm`:**
375
+
376
+ ```typescript
377
+ export async function executeProposerLlm(
378
+ contextFn: (ctx: ProposerContext) => Promise<string>,
379
+ modelId: string,
380
+ ctx: ProposerContext,
381
+ auditContext?: LlmAuditContext
382
+ ): Promise<ProposerStrategyResult> {
383
+ const { modelId: actualModelId, useTools } = parseModelId(modelId);
384
+ const context = await contextFn(ctx);
385
+ const start = Date.now();
386
+
387
+ // Attempt tool calling if transitions are enriched and model not opted out
388
+ const tools = useTools ? buildToolsFromTransitions(ctx.transitions) : null;
389
+
390
+ if (tools) {
391
+ const systemMessage = "You are a function-calling AI assistant. "
392
+ + "Use the provided tools to respond to the user's request.";
393
+ const userMessage = ctx.prompt ? `${ctx.prompt}\n\n${context}` : context;
394
+
395
+ try {
396
+ const result = await callLlmWithTools(
397
+ actualModelId, systemMessage, userMessage, tools, auditContext,
398
+ );
399
+ const latencyMsec = Date.now() - start;
400
+
401
+ const transitionDef = ctx.transitions[result.name];
402
+ if (!transitionDef) {
403
+ throw new Error(
404
+ `Tool call "${result.name}" does not match any transition from `
405
+ + `state "${ctx.currentState}". Available: `
406
+ + `${Object.keys(ctx.transitions).join(", ")}`,
407
+ );
408
+ }
409
+
410
+ return {
411
+ transitionName: result.name,
412
+ toState: transitionDef.target,
413
+ reasoning: result.reasoning,
414
+ metaJson: Object.keys(result.arguments).length > 0 ? result.arguments : undefined,
415
+ latencyMsec,
416
+ numInputTokens: result.usage?.prompt_tokens,
417
+ numOutputTokens: result.usage?.completion_tokens,
418
+ };
419
+ } catch (e) {
420
+ if (e instanceof NoToolCallError) {
421
+ // Model returned text instead of tool call.
422
+ // Try parsing it as DIAL JSON before falling through to text path.
423
+ try {
424
+ const parsed = JSON.parse(e.content) as Record<string, unknown>;
425
+ if (parsed.transitionName && parsed.toState) {
426
+ return {
427
+ transitionName: parsed.transitionName as string,
428
+ toState: parsed.toState as string,
429
+ reasoning: (parsed.reasoning as string) ?? "",
430
+ metaJson: parsed.metaJson as Record<string, unknown> | undefined,
431
+ latencyMsec: Date.now() - start,
432
+ numInputTokens: e.usage?.prompt_tokens,
433
+ numOutputTokens: e.usage?.completion_tokens,
434
+ };
435
+ }
436
+ } catch { /* not parseable, fall through to text path */ }
437
+ // Fall through to text path below
438
+ } else {
439
+ // HTTP errors, network errors, invalid tool name, etc.
440
+ // Re-throw — do NOT fall through to text path for these.
441
+ throw e;
442
+ }
443
+ }
444
+ }
445
+
446
+ // ---- TEXT PATH (existing behavior) ----
447
+ const systemMessage = "You are a decision-making specialist in a state machine. "
448
+ + "You must choose the best transition based on the context provided. "
449
+ + "Respond only with valid JSON.";
450
+ const userMessage = assembleProposerPrompt(ctx, context);
451
+
452
+ const result = await callLlm(actualModelId, systemMessage, userMessage, auditContext);
453
+ const latencyMsec = Date.now() - start;
454
+
455
+ try {
456
+ const parsed = JSON.parse(result.content) as Record<string, unknown>;
457
+ if (!parsed.transitionName || !parsed.toState) {
458
+ throw new Error("Missing required fields in LLM response");
459
+ }
460
+ return {
461
+ transitionName: parsed.transitionName as string,
462
+ toState: parsed.toState as string,
463
+ reasoning: (parsed.reasoning as string) ?? "",
464
+ metaJson: parsed.metaJson as Record<string, unknown> | undefined,
465
+ latencyMsec,
466
+ numInputTokens: result.usage?.prompt_tokens,
467
+ numOutputTokens: result.usage?.completion_tokens,
468
+ };
469
+ } catch {
470
+ throw new Error(`Failed to parse LLM proposer response: ${result.content}`);
471
+ }
472
+ }
473
+ ```
474
+
475
+ **Fallback sequence (corrected from original):**
476
+
477
+ 1. Transitions have `description`/`parameters` + model not opted out -> call `callLlmWithTools`
478
+ 2. Model returns `tool_calls` -> map to `{ transitionName, metaJson }` -> done
479
+ 3. Model returns text (`NoToolCallError`) -> try parsing as DIAL JSON -> if valid, done
480
+ 4. Not parseable -> fall through to fresh `callLlm` call with text-mode prompt -> done
481
+ 5. HTTP/network error from `callLlmWithTools` -> **re-throw** (do NOT fall through)
482
+ 6. `[tools=no]` or no enriched transitions -> skip to text path directly
483
+
484
+ ### Phase 7: Update Existing Test
485
+
486
+ **`src/dialai/llm.test.ts` — fix existing `executeProposerLlm` test:**
487
+
488
+ Change `transitions: { approve: "approved", reject: "rejected" }` to `transitions: { approve: { target: "approved" }, reject: { target: "rejected" } }` to match the new `ProposerContext.transitions` type.
489
+
490
+ **Validate:** `npm run typecheck && npm test`
491
+
492
+ ### Phase 8: Tests
493
+
494
+ All tests below are organized by the decision tree that `executeProposerLlm` and `callLlmWithTools` follow. Each test name includes the path it exercises.
495
+
496
+ #### Test file: `src/dialai/utils.test.ts` (normalization and validation)
497
+
498
+ **Test 1: `normalizeMachine` converts shorthand transitions to TransitionDefinition**
499
+
500
+ ```
501
+ Input: machine with transitions: { close: "closed" }
502
+ Assert: after normalizeMachine, transitions.close is { target: "closed" }
503
+ Assert: transitions.close.description is undefined
504
+ Assert: transitions.close.parameters is undefined
505
+ ```
506
+
507
+ **Test 2: `normalizeMachine` preserves enriched transitions unchanged**
508
+
509
+ ```
510
+ Input: machine with transitions: { uber_ride: { target: "closed", description: "Book an Uber", parameters: { type: "object", properties: { destination: { type: "string" } } } } }
511
+ Assert: after normalizeMachine, transitions.uber_ride.target is "closed"
512
+ Assert: transitions.uber_ride.description is "Book an Uber"
513
+ Assert: transitions.uber_ride.parameters deep-equals the input parameters
514
+ ```
515
+
516
+ **Test 3: `normalizeMachine` handles mixed shorthand and enriched in one state**
517
+
518
+ ```
519
+ Input: machine with transitions: { close: "closed", uber_ride: { target: "riding", description: "Book ride" } }
520
+ Assert: transitions.close is { target: "closed" }
521
+ Assert: transitions.uber_ride is { target: "riding", description: "Book ride" }
522
+ ```
523
+
524
+ **Test 4: `normalizeMachine` handles states with no transitions**
525
+
526
+ ```
527
+ Input: machine with state "terminal" that has no transitions field
528
+ Assert: normalizeMachine does not throw
529
+ Assert: state "terminal" has no transitions field (or empty)
530
+ ```
531
+
532
+ **Test 5: `validateMachine` accepts enriched transition pointing to valid state**
533
+
534
+ ```
535
+ Input: machine with transitions: { go: { target: "done" } }, states has "done"
536
+ Assert: validateMachine does not throw
537
+ ```
538
+
539
+ **Test 6: `validateMachine` rejects enriched transition pointing to nonexistent state**
540
+
541
+ ```
542
+ Input: machine with transitions: { go: { target: "nonexistent" } }
543
+ Assert: validateMachine throws "non-existent state"
544
+ ```
545
+
546
+ #### Test file: `src/dialai/llm.test.ts` (tool building, callLlmWithTools, executeProposerLlm paths)
547
+
548
+ **Decision tree for `executeProposerLlm`:**
549
+
550
+ ```
551
+ Has enriched transitions (description/parameters)?
552
+ NO -> TEXT PATH (callLlm)
553
+ YES -> Is [tools=no] set?
554
+ YES -> TEXT PATH (callLlm)
555
+ NO -> TOOL PATH (callLlmWithTools)
556
+ -> Model returns tool_calls?
557
+ YES -> tool name matches a transition?
558
+ YES -> RETURN (with metaJson from arguments)
559
+ NO -> THROW (invalid tool name)
560
+ NO -> NoToolCallError
561
+ -> Text content is valid DIAL JSON?
562
+ YES -> RETURN (parsed from text)
563
+ NO -> TEXT PATH (callLlm, fresh call)
564
+ -> HTTP/network error?
565
+ -> RE-THROW (do not fall through)
566
+ ```
567
+
568
+ ##### Group: `buildToolsFromTransitions`
569
+
570
+ **Test 7: returns OpenAI tool array for enriched transitions**
571
+
572
+ ```
573
+ Input: { book: { target: "booked", description: "Book a ride", parameters: { type: "object", properties: { dest: { type: "string" } } } } }
574
+ Assert: returns array of length 1
575
+ Assert: result[0].type is "function"
576
+ Assert: result[0].function.name is "book"
577
+ Assert: result[0].function.description is "Book a ride"
578
+ Assert: result[0].function.parameters deep-equals the input parameters
579
+ ```
580
+
581
+ **Test 8: returns null for plain transitions (no description, no parameters)**
582
+
583
+ ```
584
+ Input: { close: { target: "closed" }, reopen: { target: "open" } }
585
+ Assert: returns null
586
+ ```
587
+
588
+ **Test 9: returns tools only for enriched transitions in a mixed set**
589
+
590
+ ```
591
+ Input: { close: { target: "closed" }, book: { target: "booked", description: "Book" } }
592
+ Assert: returns array of length 1 (only "book")
593
+ Assert: result[0].function.name is "book"
594
+ ```
595
+
596
+ **Test 10: uses transition name as description when description is omitted but parameters present**
597
+
598
+ ```
599
+ Input: { book: { target: "booked", parameters: { type: "object", properties: {} } } }
600
+ Assert: result[0].function.description is "book"
601
+ ```
602
+
603
+ ##### Group: `callLlmWithTools`
604
+
605
+ **Test 11: callLlmWithTools returns ToolCallResult when model uses a tool**
606
+
607
+ ```
608
+ Mock fetch: returns 200 with { choices: [{ message: { tool_calls: [{ function: { name: "book", arguments: '{"dest":"airport"}' } }], content: "reasoning text" } }], usage: { prompt_tokens: 50, completion_tokens: 20 } }
609
+ Assert: result.name is "book"
610
+ Assert: result.arguments deep-equals { dest: "airport" }
611
+ Assert: result.reasoning is "reasoning text"
612
+ Assert: result.usage.prompt_tokens is 50
613
+ Assert: result.usage.completion_tokens is 20
614
+ ```
615
+
616
+ **Test 12: callLlmWithTools throws NoToolCallError when model returns text only**
617
+
618
+ ```
619
+ Mock fetch: returns 200 with { choices: [{ message: { content: "I chose to close" } }], usage: { prompt_tokens: 30, completion_tokens: 10 } }
620
+ Assert: throws NoToolCallError
621
+ Assert: error.content is "I chose to close"
622
+ Assert: error.usage.prompt_tokens is 30
623
+ Assert: error.usage.completion_tokens is 10
624
+ ```
625
+
626
+ **Test 13: callLlmWithTools throws NoToolCallError when tool_calls is empty array**
627
+
628
+ ```
629
+ Mock fetch: returns 200 with { choices: [{ message: { tool_calls: [], content: "no tools" } }] }
630
+ Assert: throws NoToolCallError
631
+ Assert: error.content is "no tools"
632
+ ```
633
+
634
+ **Test 14: callLlmWithTools throws Error (not NoToolCallError) on HTTP 500**
635
+
636
+ ```
637
+ Mock fetch: returns 500 with body "Internal Server Error"
638
+ Assert: throws Error with message containing "LLM API error (500)"
639
+ Assert: error is NOT instanceof NoToolCallError
640
+ ```
641
+
642
+ **Test 15: callLlmWithTools throws Error on network failure**
643
+
644
+ ```
645
+ Mock fetch: rejects with Error("ECONNREFUSED")
646
+ Assert: throws Error with message "ECONNREFUSED"
647
+ Assert: error is NOT instanceof NoToolCallError
648
+ ```
649
+
650
+ **Test 16: callLlmWithTools handles unparseable tool arguments gracefully**
651
+
652
+ ```
653
+ Mock fetch: returns 200 with { choices: [{ message: { tool_calls: [{ function: { name: "book", arguments: "not json" } }] } }] }
654
+ Assert: result.name is "book"
655
+ Assert: result.arguments deep-equals {} (empty object fallback)
656
+ ```

**Test 17: callLlmWithTools handles null content alongside tool call**

```
Mock fetch: returns 200 with { choices: [{ message: { tool_calls: [{ function: { name: "book", arguments: '{}' } }], content: null } }] }
Assert: result.reasoning is "" (empty string, not null)
```

**Test 18: callLlmWithTools handles missing usage field**

```
Mock fetch: returns 200 with { choices: [{ message: { tool_calls: [{ function: { name: "book", arguments: '{}' } }] } }] } (no usage field)
Assert: result.usage is undefined
```

**Test 19: callLlmWithTools includes tools in request body**

```
Mock fetch: capture request body
Assert: body.tools is the array passed to callLlmWithTools
Assert: body.tool_choice is "auto"
```

##### Group: `callLlmWithTools` audit

**Test 20: callLlmWithTools writes audit entry on success**

```
Mock fetch: returns 200 with tool_calls
Assert: store.getLlmAuditEntries() has 1 entry
Assert: entry.requestBody contains "tools" key
Assert: entry.error is null
Assert: entry.responseStatus is 200
```

**Test 21: callLlmWithTools writes audit entry on NoToolCallError**

```
Mock fetch: returns 200 with text only (no tool_calls)
Catch the NoToolCallError (it's expected)
Assert: store.getLlmAuditEntries() has 1 entry
Assert: entry.error is null (NoToolCallError is not an "error" from the HTTP perspective)
Assert: entry.responseStatus is 200
```

**Test 22: callLlmWithTools writes audit entry on HTTP error**

```
Mock fetch: returns 500
Catch the Error
Assert: store.getLlmAuditEntries() has 1 entry
Assert: entry.error contains "LLM API error (500)"
```

**Test 23: callLlmWithTools redacts Authorization header in audit**

```
Mock fetch: returns 200 with tool_calls
Assert: audit entry requestHeaders.Authorization is "[REDACTED]"
```
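The redaction in Test 23 only needs to touch the audit copy of the headers, never the live request. A sketch, assuming the audit entry stores a plain string record (`redactHeaders` is an illustrative name):

```typescript
// Sketch (hypothetical helper): copy request headers for the audit entry,
// masking the Authorization value. The outgoing request keeps the real token;
// only the stored copy reads "[REDACTED]" (Test 23).
function redactHeaders(headers: Record<string, string>): Record<string, string> {
  const copy: Record<string, string> = { ...headers };
  if ("Authorization" in copy) {
    copy.Authorization = "[REDACTED]";
  }
  return copy;
}
```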

##### Group: `callLlm` unchanged

**Test 24: existing callLlm tests pass without modification**

No new test needed. The existing tests in `llm.test.ts` and `llm-audit.test.ts` must continue to pass unchanged. Verify by running `npm test`.

##### Group: `executeProposerLlm` — TOOL PATH: model returns tool_calls

**Test 25: enriched transitions + model returns tool_call -> returns transitionName and metaJson**

```
Setup:
  ctx.transitions = {
    book_ride: { target: "riding", description: "Book a ride", parameters: { type: "object", properties: { destination: { type: "string" } } } },
    cancel: { target: "cancelled", description: "Cancel the request" }
  }
  modelId = "test-model" (no [tools=no])
Mock fetch: returns 200 with tool_calls: [{ function: { name: "book_ride", arguments: '{"destination":"airport"}' } }]
  usage: { prompt_tokens: 100, completion_tokens: 30 }

Assert: result.transitionName is "book_ride"
Assert: result.toState is "riding"
Assert: result.metaJson deep-equals { destination: "airport" }
Assert: result.latencyMsec is a number >= 0
Assert: result.numInputTokens is 100
Assert: result.numOutputTokens is 30
Assert: fetch was called exactly once (no fallback to callLlm)
```

**Test 26: enriched transitions + model returns tool_call with empty arguments -> metaJson is undefined**

```
Setup: same as Test 25 but tool_call arguments is '{}'
Assert: result.metaJson is undefined (empty args not stored)
Assert: result.transitionName is "book_ride"
```

**Test 27: enriched transitions + model returns tool_call with unknown tool name -> throws**

```
Setup: ctx.transitions has "book_ride" and "cancel"
Mock fetch: returns tool_calls with name: "nonexistent_tool"
Assert: throws Error containing "does not match any transition"
Assert: fetch was called exactly once (no fallback)
```

##### Group: `executeProposerLlm` — TOOL PATH: NoToolCallError with parseable DIAL JSON

**Test 28: enriched transitions + NoToolCallError + valid DIAL JSON -> returns parsed result**

```
Setup: ctx.transitions = { book: { target: "booked", description: "Book" } }
Mock fetch: returns 200 with NO tool_calls, content is '{"transitionName":"book","toState":"booked","reasoning":"chose book"}'
  usage: { prompt_tokens: 40, completion_tokens: 15 }

Assert: result.transitionName is "book"
Assert: result.toState is "booked"
Assert: result.reasoning is "chose book"
Assert: result.numInputTokens is 40
Assert: result.numOutputTokens is 15
Assert: fetch was called exactly once (no second callLlm call)
```

**Test 29: enriched transitions + NoToolCallError + valid DIAL JSON with metaJson -> metaJson preserved**

```
Mock fetch: returns content '{"transitionName":"book","toState":"booked","reasoning":"ok","metaJson":{"key":"val"}}'
Assert: result.metaJson deep-equals { key: "val" }
```
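Tests 28-31 hinge on one question: does the text content of a `NoToolCallError` parse into a complete DIAL result, or not? A sketch of that salvage check, assuming a complete result needs at least `transitionName` and `toState` (the helper name `tryParseDialResult` is hypothetical):

```typescript
// Sketch (hypothetical helper): attempt to salvage a DIAL JSON result from
// the text content of a NoToolCallError. Returns null when the text is not
// JSON (Test 30) or is missing a required field (Test 31), signalling that
// executeProposerLlm should fall back to the callLlm text path.
interface DialResult {
  transitionName: string;
  toState: string;
  reasoning?: string;
  metaJson?: Record<string, unknown>;
}

function tryParseDialResult(content: string): DialResult | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(content);
  } catch {
    return null; // prose like "I think you should book" -> full fallback
  }
  if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) {
    return null;
  }
  const obj = parsed as Record<string, unknown>;
  // Partial JSON (e.g. missing toState) also triggers the fallback.
  if (typeof obj.transitionName !== "string" || typeof obj.toState !== "string") {
    return null;
  }
  return obj as unknown as DialResult;
}
```

A non-null return means no second fetch happens (Tests 28-29); a null return means `executeProposerLlm` issues the one and only fallback `callLlm` call (Tests 30-31).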

##### Group: `executeProposerLlm` — TOOL PATH: NoToolCallError with unparseable text -> full fallback

**Test 30: enriched transitions + NoToolCallError + unparseable text -> falls back to callLlm**

```
Setup: ctx.transitions = { book: { target: "booked", description: "Book" } }
Mock fetch:
  - First call: returns 200 with NO tool_calls, content is "I think you should book" (not JSON)
  - Second call: returns 200 with content '{"transitionName":"book","toState":"booked","reasoning":"fallback"}'

Assert: result.transitionName is "book"
Assert: result.toState is "booked"
Assert: result.reasoning is "fallback"
Assert: fetch was called exactly twice (one for callLlmWithTools, one for callLlm)
```

**Test 31: enriched transitions + NoToolCallError + partial DIAL JSON (missing toState) -> falls back to callLlm**

```
Mock fetch:
  - First call: returns content '{"transitionName":"book"}' (missing toState)
  - Second call: returns content '{"transitionName":"book","toState":"booked","reasoning":"ok"}'

Assert: fetch was called exactly twice
Assert: result.toState is "booked"
```

##### Group: `executeProposerLlm` — TOOL PATH: HTTP error -> re-throw (no fallback)

**Test 32: enriched transitions + HTTP 500 from callLlmWithTools -> re-throws, does not fall back**

```
Setup: ctx.transitions = { book: { target: "booked", description: "Book" } }
Mock fetch: returns 500

Assert: throws Error containing "LLM API error (500)"
Assert: fetch was called exactly once (no second call)
```

**Test 33: enriched transitions + network error from callLlmWithTools -> re-throws, does not fall back**

```
Mock fetch: rejects with Error("ECONNREFUSED")
Assert: throws Error("ECONNREFUSED")
Assert: fetch was called exactly once
```

##### Group: `executeProposerLlm` — TEXT PATH: [tools=no] opt-out

**Test 34: modelId is "test-model[tools=no]" + enriched transitions -> skips tool path, uses callLlm**

```
Setup: ctx.transitions = { book: { target: "booked", description: "Book" } }
  modelId = "test-model[tools=no]"
Mock fetch: returns 200 with content '{"transitionName":"book","toState":"booked","reasoning":"text path"}'

Assert: result.transitionName is "book"
Assert: result.reasoning is "text path"
Assert: fetch was called exactly once
Assert: request body does NOT contain "tools" key (text path, not tool path)
Assert: request body model is "test-model" (brackets stripped)
```

##### Group: `executeProposerLlm` — TEXT PATH: no enriched transitions

**Test 35: plain transitions only (no description/parameters) -> uses callLlm directly**

```
Setup: ctx.transitions = { close: { target: "closed" }, reopen: { target: "open" } }
  modelId = "test-model" (no opt-out)
Mock fetch: returns 200 with content '{"transitionName":"close","toState":"closed","reasoning":"done"}'

Assert: result.transitionName is "close"
Assert: result.toState is "closed"
Assert: fetch was called exactly once
Assert: request body does NOT contain "tools" key
```

**Test 36: text path returns metaJson from LLM response when present**

```
Setup: plain transitions
Mock fetch: returns content '{"transitionName":"close","toState":"closed","reasoning":"done","metaJson":{"key":"val"}}'

Assert: result.metaJson deep-equals { key: "val" }
```

**Test 37: text path returns undefined metaJson when LLM response omits it**

```
Setup: plain transitions
Mock fetch: returns content '{"transitionName":"close","toState":"closed","reasoning":"done"}'

Assert: result.metaJson is undefined
```

##### Group: `parseModelId`

**Test 38: parses plain model ID**

```
Input: "openai/gpt-4"
Assert: { modelId: "openai/gpt-4", useTools: true }
```

**Test 39: parses model ID with [tools=no]**

```
Input: "openai/gpt-4[tools=no]"
Assert: { modelId: "openai/gpt-4", useTools: false }
```

**Test 40: parses model ID with unknown flags (useTools defaults true)**

```
Input: "openai/gpt-4[streaming=yes]"
Assert: { modelId: "openai/gpt-4", useTools: true }
```
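Tests 38-40 pin down the full parsing contract for `parseModelId`. A sketch consistent with those three assertions — the real implementation may differ, but only `tools=no` may change behaviour; unknown bracket flags are stripped and ignored:

```typescript
// Sketch: strip a trailing [key=value,...] flag block from the model ID.
// Only tools=no flips useTools to false (Test 39); unknown flags are
// stripped but leave useTools at its default of true (Test 40).
function parseModelId(raw: string): { modelId: string; useTools: boolean } {
  const match = raw.match(/^(.*)\[([^\]]*)\]$/);
  if (!match) {
    return { modelId: raw, useTools: true }; // plain ID (Test 38)
  }
  const [, modelId, flags] = match;
  const useTools = !flags.split(",").includes("tools=no");
  return { modelId, useTools };
}
```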

#### Test file: `tests/unit/submit-proposal.test.ts` (metaJson merging)

**Test 41: strategy-returned metaJson flows to proposal**

```
Setup: register proposer with strategyFn that returns { transitionName: "t", toState: "s", reasoning: "r", metaJson: { key: "from-strategy" } }
Call: submitProposal({ sessionId, specialistId }) — no transitionName, no metaJson in opts
Assert: stored proposal.metaJson deep-equals { key: "from-strategy" }
```

**Test 42: caller-provided metaJson takes precedence over strategy metaJson**

```
Setup: same strategyFn returning metaJson: { key: "from-strategy" }
Call: submitProposal({ sessionId, specialistId, metaJson: { key: "from-caller" } })
Assert: stored proposal.metaJson deep-equals { key: "from-caller" }
```

**Test 43: proposal works when neither caller nor strategy provides metaJson**

```
Setup: strategyFn returns { transitionName: "t", toState: "s", reasoning: "r" } (no metaJson)
Call: submitProposal({ sessionId, specialistId })
Assert: stored proposal.metaJson is undefined
Assert: no crash
```
932
+ ```
933
+
934
+ #### Test file: `tests/unit/execute-transition.test.ts` (backward compat)
935
+
936
+ **Test 44: existing execute-transition tests pass without modification**
937
+
938
+ No new test needed. Existing tests use string transitions which are now normalized by `normalizeMachine` called in `createSession`. Verify by running `npm test`.
939
+
940
+ #### Test file: `tests/unit/machine-validation.test.ts`
941
+
942
+ **Test 45: validateMachine accepts machine with mixed shorthand and enriched transitions**
943
+
944
+ ```
945
+ Input: machine with state "a" having transitions: { go: "b", ride: { target: "b", description: "Take a ride" } }
946
+ Assert: validateMachine does not throw
947
+ ```
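Test 45's mixed-form machine is exactly what the normalization step must flatten. A sketch of the transition half of `normalizeMachine` (types simplified; the real `StateDefinition` carries more fields, and the helper name `normalizeTransitions` is illustrative):

```typescript
// Sketch: convert shorthand string transitions to TransitionDefinition
// objects so all runtime code can read .target uniformly after createSession.
interface TransitionDefinition {
  target: string;
  description?: string;
  parameters?: Record<string, unknown>;
}

type RawTransitions = Record<string, string | TransitionDefinition>;

function normalizeTransitions(raw: RawTransitions): Record<string, TransitionDefinition> {
  const out: Record<string, TransitionDefinition> = {};
  for (const [name, def] of Object.entries(raw)) {
    // String shorthand { go: "b" } becomes { go: { target: "b" } };
    // enriched objects pass through unchanged.
    out[name] = typeof def === "string" ? { target: def } : def;
  }
  return out;
}
```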

#### Summary: Path coverage matrix

| # | Enriched? | [tools=no]? | Tool path taken? | Model response | Fallback? | Result |
|---|-----------|-------------|------------------|----------------|-----------|--------|
| 25 | Yes | No | Yes | tool_calls with valid name | No | transitionName + metaJson |
| 26 | Yes | No | Yes | tool_calls with empty args | No | transitionName, metaJson undefined |
| 27 | Yes | No | Yes | tool_calls with unknown name | No | Throws |
| 28 | Yes | No | Yes | Text (valid DIAL JSON) | No | Parsed from text |
| 29 | Yes | No | Yes | Text (valid DIAL JSON + metaJson) | No | Parsed with metaJson |
| 30 | Yes | No | Yes | Text (not JSON) | Yes (callLlm) | From second call |
| 31 | Yes | No | Yes | Text (partial JSON) | Yes (callLlm) | From second call |
| 32 | Yes | No | Yes | HTTP 500 | No (re-throw) | Throws |
| 33 | Yes | No | Yes | Network error | No (re-throw) | Throws |
| 34 | Yes | Yes | No | Text (JSON) | N/A | From callLlm directly |
| 35 | No | No | No | Text (JSON) | N/A | From callLlm directly |
| 36 | No | No | No | Text (JSON + metaJson) | N/A | metaJson preserved |
| 37 | No | No | No | Text (JSON, no metaJson) | N/A | metaJson undefined |
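The matrix rows collapse into one small decision tree. As a cross-check, here is an executable encoding of it — `plannedCalls` is a hypothetical pure function used only to state which LLM calls each row should produce, not code from the spec's deliverables:

```typescript
// Sketch: executable encoding of the path coverage matrix. Given the inputs
// that drive executeProposerLlm's branching, return the sequence of LLM
// calls that should be issued.
type ToolOutcome = "tool_call" | "parseable_text" | "unparseable_text" | "http_error";

function plannedCalls(
  enriched: boolean,   // transitions carry description/parameters
  useTools: boolean,   // false when modelId carries [tools=no]
  outcome: ToolOutcome,
): string[] {
  if (!enriched || !useTools) {
    return ["callLlm"]; // rows 34-37: text path only, tools never attempted
  }
  if (outcome === "unparseable_text") {
    return ["callLlmWithTools", "callLlm"]; // rows 30-31: NoToolCallError fallback
  }
  // rows 25-29: handled from the first call; rows 32-33: the first call
  // throws and the error re-throws with no fallback.
  return ["callLlmWithTools"];
}
```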

## Acceptance Criteria

### Functional

- `TransitionDefinition` type exists with `target`, optional `description`, optional `parameters`
- `TransitionDefinition` is exported from `types.ts`
- `StateDefinition.transitions` accepts `Record<string, string | TransitionDefinition>`
- `normalizeMachine` converts all string transitions to `{ target: string }`
- `validateMachine` validates both shorthand and enriched transitions
- `ProposerContext.transitions` is `Record<string, TransitionDefinition>` (normalized)
- `buildToolsFromTransitions` builds OpenAI tools from enriched transitions and returns `null` for plain transitions
- `callLlmWithTools` sends tools in the API request, returns `ToolCallResult` on success, and throws `NoToolCallError` on a text response
- `callLlmWithTools` throws `Error` (not `NoToolCallError`) on HTTP errors
- `callLlmWithTools` writes an audit entry for every call
- `callLlm` is completely unchanged (no diff in the `callLlm` function)
- `executeProposerLlm` tries `callLlmWithTools` when enriched transitions are present and the model has not opted out
- `executeProposerLlm` maps `ToolCallResult.name` to `transitionName` and `.arguments` to `metaJson`, and validates that the transition exists
- `executeProposerLlm` falls back to the `callLlm` text path only on `NoToolCallError` (not on HTTP errors)
- `executeProposerLlm` re-throws HTTP/network errors from `callLlmWithTools` without fallback
- `modelId[tools=no]` skips the tool attempt
- `ProposerStrategyResult.metaJson` carries tool arguments into the proposal
- `submitProposal` merges `metaJson` from the strategy result; caller-provided `metaJson` takes precedence
- `classifyArbitration` parameter type updated for enriched transitions

### Quality

- No `any` types introduced (except in JSON parsing where unavoidable)
- All existing tests pass unchanged (except the one `executeProposerLlm` test updated for the new `ProposerContext.transitions` shape)
- `npm run typecheck` passes
- `npm run lint` passes

### Operational

- `npm run build` produces a clean build
- `npm run ci` passes end to end
- Existing machine definitions using string transitions work without modification

## Validation and Tests

- Run: `npm run typecheck`
- Run: `npm run lint`
- Run: `npm test`
- Run: `npm run build`
- Run: `npm run ci`
- Verify: all commands exit with code 0
- Verify: a machine with `{ close: "closed" }` still works through `runSession`
- Verify: a machine with `{ uber_ride: { target: "closed", description: "...", parameters: {...} } }` triggers native tool calling

## Failure and Recovery Rules

1. Run `npm run typecheck` after Phases 1-3. Run `npm test` after each subsequent phase.
2. If the `transitions` type change causes widespread type errors, ensure `normalizeMachine` runs before any runtime access. The defensive `typeof` checks in `api.ts` handle unnormalized input.
3. If the `callLlmWithTools` implementation feels like too much copy-paste from `callLlm`, that is expected. Do not refactor the shared logic; the spec explicitly accepts the duplication.
4. If existing machines break, the normalization in Phase 2 is incomplete. Check that `createSession` calls `normalizeMachine` and that `normalizeMachine` processes transitions.
5. The existing `executeProposerLlm` test in `llm.test.ts` will break after Phase 1 because `ProposerContext.transitions` changes type. Fix it in Phase 7 before running tests.
6. Do not declare completion while any acceptance criterion is unmet.

## Completion Signal

Output exactly `COMPLETE` only when:
- All acceptance criteria are met
- `npm run ci` passes
- No blocking errors remain
- Existing machine definitions with string transitions work without modification

## Ralph Prompt Draft

```
Implement enriched transitions for DIAL.

Spec location: .claude/specs/enriched-transitions.md

Read the spec thoroughly before starting. It contains exact code changes, a decision
tree for executeProposerLlm, and 45 specific tests with expected inputs and assertions.

Constraints:
- callLlm must not be modified in any way
- String shorthand { close: "closed" } must keep working everywhere
- normalizeMachine converts shorthand to { target: "closed" } at creation time
- After normalization all runtime code sees TransitionDefinition objects
- Defensive typeof checks in api.ts for unnormalized input
- HTTP/network errors from callLlmWithTools must re-throw, NOT fall through to text path
- Only NoToolCallError triggers the text-path fallback
- Code duplication between callLlm and callLlmWithTools is accepted; do not refactor

Required deliverables:
- TransitionDefinition type in types.ts (exported)
- Updated StateDefinition.transitions type
- Updated ProposerContext.transitions to Record<string, TransitionDefinition>
- ProposerStrategyResult gains optional metaJson field
- normalizeMachine converts string transitions
- validateMachine handles both forms
- All internal .transitions[name] reads updated to use .target
- classifyArbitration parameter type updated
- assembleProposerPrompt updated to read .target
- buildToolsFromTransitions helper (returns null for plain transitions)
- callLlmWithTools function with ToolCallResult return type (exported)
- NoToolCallError class with content and usage fields (exported)
- parseModelId for [tools=no] opt-out
- executeProposerLlm try/fallback: callLlmWithTools -> NoToolCallError -> callLlm text path
- submitProposal merges metaJson from strategy result (caller-provided takes precedence)
- Update existing executeProposerLlm test for new transitions shape
- 45 tests covering all paths per the spec's test section

Acceptance criteria:
- Existing machines with string transitions work without modification
- Enriched transitions with description/parameters trigger native tool calling
- callLlm is completely unchanged (zero diff)
- callLlmWithTools has a clean ToolCallResult return type (no union)
- NoToolCallError thrown when model responds with text instead of tool_calls
- HTTP errors from callLlmWithTools re-throw without fallback
- executeProposerLlm falls back to text path only on NoToolCallError
- modelId[tools=no] skips tool attempt entirely
- Audit entries capture tool requests and responses
- submitProposal merges strategy metaJson; caller-provided metaJson wins
- npm run ci passes

Execution rules:
1. Start with types and normalization (Phase 1-2). Run typecheck.
2. Update internal code to use .target (Phase 3). Run typecheck.
3. Add buildToolsFromTransitions helper (Phase 4). Run typecheck.
4. Implement callLlmWithTools and NoToolCallError (Phase 5). Run tests.
5. Update executeProposerLlm with try/fallback (Phase 6). Run tests.
6. Fix existing executeProposerLlm test for new transitions shape (Phase 7). Run tests.
7. Write all unit tests per the spec's Phase 8 section (45 tests). Run full ci.
8. If blocked after repeated attempts, report the blocker and smallest needed decision.
9. Do not claim completion until every acceptance criterion is satisfied.

Output exactly COMPLETE when all criteria are met.
```

## Open Questions

None. All resolved.