@openexpertise/authoring 0.1.3 → 0.1.4

@@ -17,17 +17,224 @@ Call the `structured_output` tool with:
  - No leading `/`. No `..` traversal. No drive letters.
  - The writer rejects invalid paths.
 
- ## YAML rules
-
- - Top-level `name` matches the analysis `name`.
- - `state.schema` mirrors `state_fields[]` from the analysis.
- - For each node from `node_sketches[]`, emit the right shape:
-   - `tool`: `{ id, kind: tool, impl: ./tools/<id>.mjs, phase?, reads?, writes? }`
-   - `agent`: `{ id, kind: agent, prompt: ./prompts/<id>.md, schema: {...}, reads?, writes?, for_each? }`
-   - `cli-agent`: `{ id, kind: cli-agent, provider: claude-code|codex|gemini, prompt: "...", reads?, writes?, output_format?, schema?, timeout_ms? }` (inline prompt is required for cli-agent in V1)
- - `edges` form a connected DAG aligned with `phases`.
- - Conditional edges use `when: '<expression>'` (e.g. `when: 'length($.findings) > 0'`).
- - `for_each` blocks use `{ source: '$.<state_field>' }`.
+ ## YAML rules — read carefully, this is where most synthesis fails
+
+ ### Rule 0 — Top-level shape (MOST CRITICAL)
+
+ The YAML has EXACTLY these top-level keys, in this order:
+
+ ```yaml
+ name: <slug>
+ description: '...'
+ version: '0.1.0'
+ state:
+   schema: { ... }
+ phases: [...] # optional
+ graph: # REQUIRED — nodes + edges nest UNDER graph
+   nodes: [...]
+   edges: [...]
+ runtime: # optional
+   concurrency: 4
+ ```
+
+ ⚠️ MOST COMMON MISTAKE: putting `nodes:` and `edges:` at the TOP level. They MUST be nested under `graph:`. The schema rejects any top-level field that isn't in the list above.
+
+ ⚠️ Other rejected top-level keys: `prompts:`, `tools:`, `vars:`, `config:`, `env:`. Anything you'd want to put there belongs inside a node or inside `state.schema`.
+
+ ### Rule 1 — Quote any string with YAML-significant characters
+
+ The most common failure mode is unquoted strings that contain `:`, `#`, `{`, `}`, `[`, `]`, `&`, `*`, `!`, `|`, `>`, `%`, `@`, leading whitespace, or that start with `-` or `?`. These break YAML parsing.
+
+ ALWAYS double-quote any description, prompt, or value that contains those characters.
+
+ ❌ WRONG: `description: List of types to scan: sql_injection, hardcoded_secrets`
+ ✅ RIGHT: `description: "List of types to scan: sql_injection, hardcoded_secrets"`
+
+ ❌ WRONG: `summary: Scan #python files for issues`
+ ✅ RIGHT: `summary: "Scan #python files for issues"`
+
+ When in doubt, quote. Over-quoting is harmless. Under-quoting breaks the parse.
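
As a rough illustration of Rule 1, the sketch below double-quotes a scalar only when it contains one of the characters listed above. The helper name `quoteYamlScalar` is hypothetical and not part of `@openexpertise/authoring`; it leans on `JSON.stringify`, whose output for ordinary strings is also a valid YAML double-quoted scalar.

```javascript
// Hypothetical helper, not part of the package: quote a YAML scalar when needed.
const SIGNIFICANT = /[:#{}\[\]&*!|>%@]/

function quoteYamlScalar(value) {
  const risky =
    SIGNIFICANT.test(value) || // contains a YAML-significant character
    /^\s/.test(value) ||       // leading whitespace
    /^[-?]/.test(value)        // starts with "-" or "?"
  // JSON.stringify yields a double-quoted string whose escapes YAML also accepts.
  return risky ? JSON.stringify(value) : value
}

console.log(`description: ${quoteYamlScalar('List of types to scan: sql_injection, hardcoded_secrets')}`)
console.log(`summary: ${quoteYamlScalar('Scan #python files for issues')}`)
console.log(`name: ${quoteYamlScalar('code-scanner')}`)
```

Quoting only when needed keeps slugs readable; following the rule literally and quoting everything would be just as valid.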
+
+ ### Rule 2 — Use the EXACT shape for each node `kind`
+
+ Each `kind` has specific required fields. Mixing them up (e.g., putting `reads:` on a `dataset` node) breaks schema validation.
+
+ #### `kind: tool` — deterministic JavaScript
+
+ ```yaml
+ - id: <slug>
+   kind: tool
+   impl: ./tools/<id>.mjs
+   phase: <phase-id> # optional
+   reads: [<state-field>] # optional
+   writes: [<state-field>] # optional
+ ```
+
+ #### `kind: agent` — LLM call with structured output
+
+ ```yaml
+ - id: <slug>
+   kind: agent
+   prompt: ./prompts/<id>.md
+   schema:
+     type: object
+     properties: { ... }
+   phase: <phase-id> # optional
+   reads: [<state-field>] # optional
+   writes: [<state-field>] # optional
+   for_each: { source: $.<state-field> } # optional
+   model: <override> # optional
+ ```
+
+ #### `kind: cli-agent` — delegate to Claude Code / Codex / Gemini
+
+ ```yaml
+ - id: <slug>
+   kind: cli-agent
+   provider: claude-code # REQUIRED — one of: claude-code | codex | gemini
+   prompt: '<INLINE prompt — file paths NOT supported for cli-agent in V1>'
+   phase: <phase-id> # optional
+   reads: [<state-field>] # optional
+   writes: [<state-field>] # optional
+   output_format: text # optional — text (default) or json
+   schema: { ... } # REQUIRED if output_format: json
+   timeout_ms: 600000 # optional — default 600000 ms
+   for_each: { source: $.<state-field> } # optional
+ ```
+
+ #### `kind: dataset` — EXTERNAL TABULAR data loader
+
+ ⚠️ When to choose `dataset` vs `tool`:
+
+ - Use `kind: dataset` ONLY when you need to load **tabular rows** from JSON/JSONL/CSV/Parquet/SQLite, an HTTP endpoint returning a JSON array, or an MCP resource.
+ - For ANY other "load something from disk" need — read a single file as text, load a Python source, fetch one config blob, etc. — use `kind: tool` with `readFileSync` in the .mjs stub.
+ - `format:` is one of `json | jsonl | csv | parquet`. There is NO `text` / `txt` / `yaml` / `markdown` format. If you reach for one of those, switch to a `tool` node instead.
+
+ Three more constraints:
+
+ 1. Dataset nodes load from external sources. They do NOT have a `reads:` field.
+ 2. `writes:` MUST contain **EXACTLY ONE** field. The loaded rows go into that single field. If you need to populate multiple state fields, add a follow-up `tool` node that reads the dataset output and splits it.
+ 3. Dataset outputs are ARRAYS of objects. If you want a scalar (a single string, a single object), use a `tool` node.
+
+ ```yaml
+ - id: <slug>
+   kind: dataset
+   source: # REQUIRED — exactly one of these shapes:
+     # File:
+     type: file
+     uri: ./fixtures/data.json
+     format: json # optional — json (default) | jsonl | csv | parquet
+   phase: <phase-id> # optional
+   writes: [<the_single_field>] # ⚠️ EXACTLY ONE field, not zero, not two
+ ```
+
+ Other `source.type` variants:
+
+ ```yaml
+ source:
+   type: sqlite
+   uri: ./db.sqlite
+   query: 'SELECT * FROM table'
+ ```
+
+ ```yaml
+ source:
+   type: http
+   url: https://api.example.com/path
+   method: GET # optional
+   body: {} # optional
+ ```
+
+ ```yaml
+ source:
+   type: mcp-resource
+   server: <server-name-declared-in-mcp.json>
+   uri: <resource-uri>
+ ```
+
+ #### `kind: skill` — invoke a SKILL.md package
+
+ ```yaml
+ - id: <slug>
+   kind: skill
+   impl: ./skills/<dir-containing-SKILL.md>
+   inputs: {} # optional
+   phase: <phase-id> # optional
+   reads: [<state-field>] # optional
+   writes: [<state-field>] # optional
+ ```
+
+ #### `kind: experience` — nested OpenExpertise experience
+
+ ```yaml
+ - id: <slug>
+   kind: experience
+   impl: ./sub-experience-dir
+   args: {} # optional
+   state_scope: isolated # optional — isolated (default) | shared
+   phase: <phase-id> # optional
+   reads: [<state-field>] # optional
+   writes: [<state-field>] # optional
+ ```
+
+ ### Rule 3 — Edges are minimal
+
+ Edges have EXACTLY these fields: `from`, `to`, and optionally `when`. Do NOT add `description:`, `label:`, or any other property — the schema rejects extra properties.
+
+ ```yaml
+ edges:
+   - { from: <node-id>, to: <node-id> }
+   - { from: <node-id>, to: <node-id>, when: '<expression>' }
+ ```
+
+ Conditional `when:` uses a JSONPath-ish expression in single quotes, e.g.:
+
+ - `when: '$.findings.length > 0'`
+ - `when: '$.is_duplicate == true'`
+ - `when: '$.score >= 0.5'`
+
+ `edges` form a connected DAG aligned with `phases`.
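
One way to sanity-check Rule 3 before emitting YAML is sketched below: it rejects edges that carry extra properties and uses Kahn's algorithm to confirm the graph is acyclic. `checkEdges` and the sample node ids are hypothetical; the package's own validation goes through its schema, and this sketch does not check connectivity or phase alignment.

```javascript
// Hypothetical pre-flight check for Rule 3; not the package's validator.
const ALLOWED_EDGE_KEYS = new Set(['from', 'to', 'when'])

function checkEdges(nodeIds, edges) {
  const errors = []
  const indegree = new Map(nodeIds.map((id) => [id, 0]))
  const next = new Map(nodeIds.map((id) => [id, []]))

  for (const edge of edges) {
    const extra = Object.keys(edge).filter((key) => !ALLOWED_EDGE_KEYS.has(key))
    if (extra.length > 0) errors.push(`edge ${edge.from} -> ${edge.to}: extra keys ${extra.join(', ')}`)
    if (!indegree.has(edge.from) || !indegree.has(edge.to)) {
      errors.push(`edge ${edge.from} -> ${edge.to}: unknown node id`)
      continue
    }
    next.get(edge.from).push(edge.to)
    indegree.set(edge.to, indegree.get(edge.to) + 1)
  }

  // Kahn's algorithm: repeatedly remove nodes that have no incoming edges left.
  const queue = nodeIds.filter((id) => indegree.get(id) === 0)
  let visited = 0
  while (queue.length > 0) {
    const id = queue.shift()
    visited += 1
    for (const target of next.get(id)) {
      indegree.set(target, indegree.get(target) - 1)
      if (indegree.get(target) === 0) queue.push(target)
    }
  }
  // Any node never reached with indegree 0 sits on a cycle.
  if (visited !== nodeIds.length) errors.push('graph contains a cycle')
  return errors
}

const nodes = ['load', 'scan', 'report']
console.log(checkEdges(nodes, [
  { from: 'load', to: 'scan' },
  { from: 'scan', to: 'report', when: '$.findings.length > 0' },
]))
console.log(checkEdges(nodes, [
  { from: 'load', to: 'scan', label: 'first' },
  { from: 'scan', to: 'load' },
]))
```

The first call returns an empty list; the second reports the extra `label` key and the `load`/`scan` cycle.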
+
+ ### Rule 4 — `state.schema` shape
+
+ Each field under `state.schema` is a JSON-schema-ish fragment. Optionally add `merge:` for fan-out collectors:
+
+ ```yaml
+ state:
+   schema:
+     raw_findings:
+       type: array
+       items: { type: object }
+       merge: array_append # accumulates across for_each iterations
+     final_score:
+       type: number
+       merge: last_wins # default
+     run_id:
+       type: string
+       merge: set_once # write-once
+ ```
+
+ ⚠️ `type` MUST be one of: `string`, `number`, `boolean`, `object`, `array`, `null`. The schema does NOT allow `integer`, `int`, `float`, `bigint`, `date`, `datetime`, or any other variant — use `number` for any numeric (whole or fractional) and `string` for ISO-formatted dates.
+
+ `merge:` is one of: `array_append`, `last_wins` (default), `set_once`. Use `array_append` on collector fields written by `for_each` nodes.
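
The three `merge:` strategies can be pictured with the sketch below. It is only an interpretation of the comments above (`array_append` accumulates, `last_wins` overwrites, `set_once` is write-once); the function `applyMerge`, the choice to concatenate incoming arrays, and the choice to silently keep the first `set_once` value are assumptions, not documented runtime behavior.

```javascript
// Hypothetical model of the merge strategies; the real runtime may differ.
function applyMerge(strategy, current, incoming) {
  switch (strategy) {
    case 'array_append':
      // Assumption: each write contributes items that are appended in order.
      return [...(current ?? []), ...(Array.isArray(incoming) ? incoming : [incoming])]
    case 'set_once':
      // Assumption: a later write is ignored rather than raising an error.
      return current === undefined ? incoming : current
    case 'last_wins':
    default:
      return incoming
  }
}

// Two for_each iterations writing to the same collector field.
let rawFindings
for (const batch of [[{ id: 1 }], [{ id: 2 }, { id: 3 }]]) {
  rawFindings = applyMerge('array_append', rawFindings, batch)
}
console.log(rawFindings.length) // 3
console.log(applyMerge('last_wins', 0.2, 0.9)) // 0.9
console.log(applyMerge('set_once', 'run-a', 'run-b')) // run-a
```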
+
+ ### Rule 5 — Self-check before returning
+
+ Mentally walk these checks before emitting the YAML:
+
+ 1. ✓ `nodes:` and `edges:` are nested UNDER `graph:` — not at top level
+ 2. ✓ No top-level keys other than `name`, `description`, `version`, `state`, `phases`, `graph`, `runtime`
+ 3. ✓ Every string containing `:`, `#`, `{`, `}`, `[`, `]`, `&`, `*`, `!`, `|`, `>` is double-quoted
+ 4. ✓ Every `dataset` node has a `source: {type: ..., ...}` field (NOT `reads:`)
+ 5. ✓ Every `cli-agent` node has a `provider:` field (one of claude-code | codex | gemini)
+ 6. ✓ Every edge has ONLY `from`, `to`, and optionally `when` — no `description:`, no `label:`
+ 7. ✓ Every node has a unique `id`
+ 8. ✓ Every `reads:` and `writes:` field references something declared in `state.schema`
+ 9. ✓ `phases:` items have `id` (and optionally `title`) only — no `description:`
+ 10. ✓ Every `state.schema` field's `type:` is one of `string | number | boolean | object | array | null` (NOT `integer`, `int`, `float`, etc.)
+ 11. ✓ Every `kind: dataset` node has `writes:` with EXACTLY ONE field AND a real tabular `source.format` (`json | jsonl | csv | parquet`) — for non-tabular file loads (a single Python source, a config blob), use `kind: tool` with `readFileSync`
+
+ If any check fails, fix before responding.
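
A few of these checks are mechanical enough to sketch in code. The function below takes an already-parsed experience object and covers checks 1, 2, 4, 10 and the single-`writes` part of 11; `selfCheck` and the sample object are hypothetical, and schema validation in the package itself remains the authority.

```javascript
// Hypothetical subset of the Rule 5 checklist, run on a parsed YAML object.
const TOP_LEVEL_KEYS = ['name', 'description', 'version', 'state', 'phases', 'graph', 'runtime']
const STATE_TYPES = ['string', 'number', 'boolean', 'object', 'array', 'null']

function selfCheck(experience) {
  const problems = []
  // Checks 1 and 2: only the allowed top-level keys, with nodes/edges under graph.
  for (const key of Object.keys(experience)) {
    if (!TOP_LEVEL_KEYS.includes(key)) problems.push(`unexpected top-level key: ${key}`)
  }
  if (!Array.isArray(experience.graph?.nodes)) problems.push('graph.nodes is missing')
  if (!Array.isArray(experience.graph?.edges)) problems.push('graph.edges is missing')

  // Check 10: state field types come from the closed list.
  for (const [field, spec] of Object.entries(experience.state?.schema ?? {})) {
    if (!STATE_TYPES.includes(spec.type)) problems.push(`state.schema.${field}: bad type ${spec.type}`)
  }
  // Checks 4 and 11: dataset nodes need source, no reads, exactly one writes entry.
  for (const node of experience.graph?.nodes ?? []) {
    if (node.kind !== 'dataset') continue
    if (!node.source) problems.push(`${node.id}: dataset node needs source`)
    if ('reads' in node) problems.push(`${node.id}: dataset node must not have reads`)
    if (node.writes?.length !== 1) problems.push(`${node.id}: dataset writes must have exactly one field`)
  }
  return problems
}

console.log(selfCheck({
  name: 'demo',
  state: { schema: { count: { type: 'integer' } } },
  nodes: [],
  graph: { nodes: [{ id: 'rows', kind: 'dataset', reads: ['count'], writes: [] }], edges: [] },
}))
```

The sample object deliberately breaks several rules at once, so the returned list names the stray top-level `nodes`, the `integer` type, and the malformed dataset node.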
 
  ## Tool stub conventions
 
@@ -38,7 +245,35 @@ Each `tools/<id>.mjs` MUST:
  - Use `import {readFileSync} from 'node:fs'` etc., not bare specifiers.
  - Include a `// TODO:` comment marking the integration point the user needs to fill in (API call, file path, etc.).
 
- Make stubs runnable as-is with fixture data so `oe run` doesn't crash before the user wires anything up.
+ ### Defensive tool stubs `oe run` MUST succeed before any user wiring
+
+ This is critical. The first thing a user does after `oe ultra` finishes is run `oe run .` to verify the scaffold is alive. If a stub crashes because a state field is undefined, the user thinks the scaffold is broken.
+
+ Rules every stub must follow:
+
+ 1. **NEVER destructure state without a default.** `const { source_file } = bundle.state ?? {}` is wrong if `source_file` itself is undefined — `bundle.state.source_file` will still be undefined and crash on further use.
+ 2. **Default every consumed value:** `const sourceFile = bundle.state?.source_file ?? './fixtures/sample.txt'`.
+ 3. **The FIRST node in the graph must produce ALL the state fields downstream nodes need.** It is YOUR job to pick a reasonable default (a relative fixture path, a sample object, an empty array) for any state field that isn't populated by `args:` on some upstream node.
+ 4. **Generate a `fixtures/` directory with realistic sample data** if any tool reads from disk. Reference relative paths from the tool. The user replaces the fixture content with real data later.
+
+ Example of a defensive stub:
+
+ ```js
+ import { readFileSync } from 'node:fs'
+ import { resolve, dirname } from 'node:path'
+ import { fileURLToPath } from 'node:url'
+
+ const HERE = dirname(fileURLToPath(import.meta.url))
+
+ export default async function loadSource(bundle) {
+   // TODO: replace with real source-of-truth (e.g., bundle.state.source_file)
+   const filePath = bundle?.state?.source_file ?? resolve(HERE, '..', 'fixtures', 'sample.py')
+   const sourceText = readFileSync(filePath, 'utf8')
+   return { state_delta: { source_text: sourceText } }
+ }
+ ```
+
+ Note: defaults to a fixture; never crashes; the `// TODO:` marker tells the user what to swap.
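
The difference between rules 1 and 2 is easy to see in isolation. The snippet below uses a hypothetical empty `bundle` to show that guarding the state object alone still leaves the field undefined, while a per-field default does not; the fixture path is only a placeholder.

```javascript
// Hypothetical bundle as a first node might receive it: no state populated yet.
const bundle = { state: {} }

// Guarding only the container: the field itself is still undefined.
const { source_file } = bundle.state ?? {}
console.log(source_file) // undefined

// Defaulting the consumed value itself (rule 2).
const sourceFile = bundle.state?.source_file ?? './fixtures/sample.txt'
console.log(sourceFile) // ./fixtures/sample.txt

// A default inside the destructuring pattern is equally safe.
const { source_file: safeFile = './fixtures/sample.txt' } = bundle.state ?? {}
console.log(safeFile) // ./fixtures/sample.txt
```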
 
  ## Prompt conventions
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@openexpertise/authoring",
-   "version": "0.1.3",
+   "version": "0.1.4",
    "description": "Schema-aware authoring helpers — validate, scaffold, and edit experience.yaml files programmatically.",
    "keywords": [
      "openexpertise",
@@ -42,8 +42,8 @@
    },
    "dependencies": {
      "ajv": "^8.17.0",
-     "@openexpertise/core": "0.1.3",
-     "@openexpertise/schema": "0.1.3"
+     "@openexpertise/core": "0.1.4",
+     "@openexpertise/schema": "0.1.4"
    },
    "scripts": {
      "build": "tsc -b && node -e \"require('node:fs').cpSync('src/prompts','dist/prompts',{recursive:true})\"",