npm - @openexpertise/authoring - Versions diffs - 0.1.3 → 0.1.4 - Mend

@openexpertise/authoring 0.1.3 → 0.1.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (3) hide show

package/dist/prompts/synthesizer.md +247 -12
package/package.json +3 -3
package/src/prompts/synthesizer.md +247 -12

package/dist/prompts/synthesizer.md CHANGED Viewed

@@ -17,17 +17,224 @@ Call the `structured_output` tool with:
 - No leading `/`. No `..` traversal. No drive letters.
 - The writer rejects invalid paths.
-## YAML rules
-- Top-level `name` matches the analysis `name`.
-- `state.schema` mirrors `state_fields[]` from the analysis.
-- For each node from `node_sketches[]`, emit the right shape:
-  - `tool`: `{ id, kind: tool, impl: ./tools/<id>.mjs, phase?, reads?, writes? }`
-  - `agent`: `{ id, kind: agent, prompt: ./prompts/<id>.md, schema: {...}, reads?, writes?, for_each? }`
-  - `cli-agent`: `{ id, kind: cli-agent, provider: claude-code|codex|gemini, prompt: "...", reads?, writes?, output_format?, schema?, timeout_ms? }` (inline prompt is required for cli-agent in V1)
-- `edges` form a connected DAG aligned with `phases`.
-- Conditional edges use `when: '<expression>'` (e.g. `when: 'length($.findings) > 0'`).
-- `for_each` blocks use `{ source: '$.<state_field>' }`.
+## YAML rules — read carefully, this is where most synthesis fails
+### Rule 0 — Top-level shape (MOST CRITICAL)
+The YAML has EXACTLY these top-level keys, in this order:
+```yaml
+name: <slug>
+description: '...'
+version: '0.1.0'
+state:
+  schema: { ... }
+phases: [...] # optional
+graph: # REQUIRED — nodes + edges nest UNDER graph
+  nodes: [...]
+  edges: [...]
+runtime: # optional
+  concurrency: 4
+```
+⚠️ MOST COMMON MISTAKE: putting `nodes:` and `edges:` at the TOP level. They MUST be nested under `graph:`. The schema rejects any top-level field that isn't in the list above.
+⚠️ Other rejected top-level keys: `prompts:`, `tools:`, `vars:`, `config:`, `env:`. Anything you'd want to put there belongs inside a node or inside `state.schema`.
+### Rule 1 — Quote any string with YAML-significant characters
+The most common failure mode is unquoted strings that contain `:`, `#`, `{`, `}`, `[`, `]`, `&`, `*`, `!`, `|`, `>`, `%`, `@`, leading whitespace, or that start with `-` or `?`. These break YAML parsing.
+ALWAYS double-quote any description, prompt, or value that contains those characters.
+❌ WRONG: `description: List of types to scan: sql_injection, hardcoded_secrets`
+✅ RIGHT: `description: "List of types to scan: sql_injection, hardcoded_secrets"`
+❌ WRONG: `summary: Scan #python files for issues`
+✅ RIGHT: `summary: "Scan #python files for issues"`
+When in doubt, quote. Over-quoting is harmless. Under-quoting breaks the parse.
+### Rule 2 — Use the EXACT shape for each node `kind`
+Each `kind` has specific required fields. Mixing them up (e.g., putting `reads:` on a `dataset` node) breaks schema validation.
+#### `kind: tool` — deterministic JavaScript
+```yaml
+- id: <slug>
+  kind: tool
+  impl: ./tools/<id>.mjs
+  phase: <phase-id> # optional
+  reads: [<state-field>] # optional
+  writes: [<state-field>] # optional
+```
+#### `kind: agent` — LLM call with structured output
+```yaml
+- id: <slug>
+  kind: agent
+  prompt: ./prompts/<id>.md
+  schema:
+    type: object
+    properties: { ... }
+  phase: <phase-id> # optional
+  reads: [<state-field>] # optional
+  writes: [<state-field>] # optional
+  for_each: { source: $.<state-field> } # optional
+  model: <override> # optional
+```
+#### `kind: cli-agent` — delegate to Claude Code / Codex / Gemini
+```yaml
+- id: <slug>
+  kind: cli-agent
+  provider: claude-code # REQUIRED — one of: claude-code | codex | gemini
+  prompt: '<INLINE prompt — file paths NOT supported for cli-agent in V1>'
+  phase: <phase-id> # optional
+  reads: [<state-field>] # optional
+  writes: [<state-field>] # optional
+  output_format: text # optional — text (default) or json
+  schema: { ... } # REQUIRED if output_format: json
+  timeout_ms: 600000 # optional — default 600000 ms
+  for_each: { source: $.<state-field> } # optional
+```
+#### `kind: dataset` — EXTERNAL TABULAR data loader
+⚠️ When to choose `dataset` vs `tool`:
+- Use `kind: dataset` ONLY when you need to load **tabular rows** from JSON/JSONL/CSV/Parquet/SQLite, an HTTP endpoint returning a JSON array, or an MCP resource.
+- For ANY other "load something from disk" need — read a single file as text, load a Python source, fetch one config blob, etc. — use `kind: tool` with `readFileSync` in the .mjs stub.
+- `format:` is one of `json | jsonl | csv | parquet`. There is NO `text` / `txt` / `yaml` / `markdown` format. If you reach for one of those, switch to a `tool` node instead.
+Three more constraints:
+1. Dataset nodes load from external sources. They do NOT have a `reads:` field.
+2. `writes:` MUST contain **EXACTLY ONE** field. The loaded rows go into that single field. If you need to populate multiple state fields, add a follow-up `tool` node that reads the dataset output and splits it.
+3. Dataset outputs are ARRAYS of objects. If you want a scalar (a single string, a single object), use a `tool` node.
+```yaml
+- id: <slug>
+  kind: dataset
+  source: # REQUIRED — exactly one of these shapes:
+    # File:
+    type: file
+    uri: ./fixtures/data.json
+    format: json # optional — json (default) | jsonl | csv | parquet
+  phase: <phase-id> # optional
+  writes: [<the_single_field>] # ⚠️ EXACTLY ONE field, not zero, not two
+```
+Other `source.type` variants:
+```yaml
+source:
+  type: sqlite
+  uri: ./db.sqlite
+  query: 'SELECT * FROM table'
+```
+```yaml
+source:
+  type: http
+  url: https://api.example.com/path
+  method: GET # optional
+  body: {} # optional
+```
+```yaml
+source:
+  type: mcp-resource
+  server: <server-name-declared-in-mcp.json>
+  uri: <resource-uri>
+```
+#### `kind: skill` — invoke a SKILL.md package
+```yaml
+- id: <slug>
+  kind: skill
+  impl: ./skills/<dir-containing-SKILL.md>
+  inputs: {} # optional
+  phase: <phase-id> # optional
+  reads: [<state-field>] # optional
+  writes: [<state-field>] # optional
+```
+#### `kind: experience` — nested OpenExpertise experience
+```yaml
+- id: <slug>
+  kind: experience
+  impl: ./sub-experience-dir
+  args: {} # optional
+  state_scope: isolated # optional — isolated (default) | shared
+  phase: <phase-id> # optional
+  reads: [<state-field>] # optional
+  writes: [<state-field>] # optional
+```
+### Rule 3 — Edges are minimal
+Edges have EXACTLY these fields: `from`, `to`, and optionally `when`. Do NOT add `description:`, `label:`, or any other property — the schema rejects extra properties.
+```yaml
+edges:
+  - { from: <node-id>, to: <node-id> }
+  - { from: <node-id>, to: <node-id>, when: '<expression>' }
+```
+Conditional `when:` uses a JSONPath-ish expression in single quotes, e.g.:
+- `when: '$.findings.length > 0'`
+- `when: '$.is_duplicate == true'`
+- `when: '$.score >= 0.5'`
+`edges` form a connected DAG aligned with `phases`.
+### Rule 4 — `state.schema` shape
+Each field under `state.schema` is a JSON-schema-ish fragment. Optionally add `merge:` for fan-out collectors:
+```yaml
+state:
+  schema:
+    raw_findings:
+      type: array
+      items: { type: object }
+      merge: array_append # accumulates across for_each iterations
+    final_score:
+      type: number
+      merge: last_wins # default
+    run_id:
+      type: string
+      merge: set_once # write-once
+```
+⚠️ `type` MUST be one of: `string`, `number`, `boolean`, `object`, `array`, `null`. The schema does NOT allow `integer`, `int`, `float`, `bigint`, `date`, `datetime`, or any other variant — use `number` for any numeric (whole or fractional) and `string` for ISO-formatted dates.
+`merge:` is one of: `array_append`, `last_wins` (default), `set_once`. Use `array_append` on collector fields written by `for_each` nodes.
+### Rule 5 — Self-check before returning
+Mentally walk these checks before emitting the YAML:
+1. ✓ `nodes:` and `edges:` are nested UNDER `graph:` — not at top level
+2. ✓ No top-level keys other than `name`, `description`, `version`, `state`, `phases`, `graph`, `runtime`
+3. ✓ Every string containing `:`, `#`, `{`, `}`, `[`, `]`, `&`, `*`, `!`, `|`, `>` is double-quoted
+4. ✓ Every `dataset` node has a `source: {type: ..., ...}` field (NOT `reads:`)
+5. ✓ Every `cli-agent` node has a `provider:` field (one of claude-code | codex | gemini)
+6. ✓ Every edge has ONLY `from`, `to`, and optionally `when` — no `description:`, no `label:`
+7. ✓ Every node has a unique `id`
+8. ✓ Every `reads:` and `writes:` field references something declared in `state.schema`
+9. ✓ `phases:` items have `id` (and optionally `title`) only — no `description:`
+10. ✓ Every `state.schema` field's `type:` is one of `string | number | boolean | object | array | null` (NOT `integer`, `int`, `float`, etc.)
+11. ✓ Every `kind: dataset` node has `writes:` with EXACTLY ONE field AND a real tabular `source.format` (`json | jsonl | csv | parquet`) — for non-tabular file loads (a single Python source, a config blob), use `kind: tool` with `readFileSync`
+If any check fails, fix before responding.
 ## Tool stub conventions
@@ -38,7 +245,35 @@ Each `tools/<id>.mjs` MUST:
 - Use `import {readFileSync} from 'node:fs'` etc., not bare specifiers.
 - Include a `// TODO:` comment marking the integration point the user needs to fill in (API call, file path, etc.).
-Make stubs runnable as-is with fixture data so `oe run` doesn't crash before the user wires anything up.
+### Defensive tool stubs — `oe run` MUST succeed before any user wiring
+This is critical. The first thing a user does after `oe ultra` finishes is run `oe run .` to verify the scaffold is alive. If a stub crashes because a state field is undefined, the user thinks the scaffold is broken.
+Rules every stub must follow:
+1. **NEVER destructure state without a default.** `const { source_file } = bundle.state ?? {}` is wrong if `source_file` itself is undefined — `bundle.state.source_file` will still be undefined and crash on further use.
+2. **Default every consumed value:** `const sourceFile = bundle.state?.source_file ?? './fixtures/sample.txt'`.
+3. **The FIRST node in the graph must produce ALL the state fields downstream nodes need.** It is YOUR job to pick a reasonable default (a relative fixture path, a sample object, an empty array) for any state field that isn't populated by `args:` on some upstream node.
+4. **Generate a `fixtures/` directory with realistic sample data** if any tool reads from disk. Reference relative paths from the tool. The user replaces the fixture content with real data later.
+Example of a defensive stub:
+```js
+import { readFileSync } from 'node:fs'
+import { resolve, dirname } from 'node:path'
+import { fileURLToPath } from 'node:url'
+const HERE = dirname(fileURLToPath(import.meta.url))
+export default async function loadSource(bundle) {
+  // TODO: replace with real source-of-truth (e.g., bundle.state.source_file)
+  const filePath = bundle?.state?.source_file ?? resolve(HERE, '..', 'fixtures', 'sample.py')
+  const sourceText = readFileSync(filePath, 'utf8')
+  return { state_delta: { source_text: sourceText } }
+}
+```
+Note: defaults to a fixture; never crashes; the `// TODO:` marker tells the user what to swap.
 ## Prompt conventions

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "@openexpertise/authoring",
-  "version": "0.1.3",
+  "version": "0.1.4",
   "description": "Schema-aware authoring helpers — validate, scaffold, and edit experience.yaml files programmatically.",
   "keywords": [
     "openexpertise",
@@ -42,8 +42,8 @@
   },
   "dependencies": {
     "ajv": "^8.17.0",
-    "@openexpertise/core": "0.1.3",
-    "@openexpertise/schema": "0.1.3"
+    "@openexpertise/core": "0.1.4",
+    "@openexpertise/schema": "0.1.4"
   },
   "scripts": {
     "build": "tsc -b && node -e \"require('node:fs').cpSync('src/prompts','dist/prompts',{recursive:true})\"",

package/src/prompts/synthesizer.md CHANGED Viewed

@@ -17,17 +17,224 @@ Call the `structured_output` tool with:
 - No leading `/`. No `..` traversal. No drive letters.
 - The writer rejects invalid paths.
-## YAML rules
-- Top-level `name` matches the analysis `name`.
-- `state.schema` mirrors `state_fields[]` from the analysis.
-- For each node from `node_sketches[]`, emit the right shape:
-  - `tool`: `{ id, kind: tool, impl: ./tools/<id>.mjs, phase?, reads?, writes? }`
-  - `agent`: `{ id, kind: agent, prompt: ./prompts/<id>.md, schema: {...}, reads?, writes?, for_each? }`
-  - `cli-agent`: `{ id, kind: cli-agent, provider: claude-code|codex|gemini, prompt: "...", reads?, writes?, output_format?, schema?, timeout_ms? }` (inline prompt is required for cli-agent in V1)
-- `edges` form a connected DAG aligned with `phases`.
-- Conditional edges use `when: '<expression>'` (e.g. `when: 'length($.findings) > 0'`).
-- `for_each` blocks use `{ source: '$.<state_field>' }`.
+## YAML rules — read carefully, this is where most synthesis fails
+### Rule 0 — Top-level shape (MOST CRITICAL)
+The YAML has EXACTLY these top-level keys, in this order:
+```yaml
+name: <slug>
+description: '...'
+version: '0.1.0'
+state:
+  schema: { ... }
+phases: [...] # optional
+graph: # REQUIRED — nodes + edges nest UNDER graph
+  nodes: [...]
+  edges: [...]
+runtime: # optional
+  concurrency: 4
+```
+⚠️ MOST COMMON MISTAKE: putting `nodes:` and `edges:` at the TOP level. They MUST be nested under `graph:`. The schema rejects any top-level field that isn't in the list above.
+⚠️ Other rejected top-level keys: `prompts:`, `tools:`, `vars:`, `config:`, `env:`. Anything you'd want to put there belongs inside a node or inside `state.schema`.
+### Rule 1 — Quote any string with YAML-significant characters
+The most common failure mode is unquoted strings that contain `:`, `#`, `{`, `}`, `[`, `]`, `&`, `*`, `!`, `|`, `>`, `%`, `@`, leading whitespace, or that start with `-` or `?`. These break YAML parsing.
+ALWAYS double-quote any description, prompt, or value that contains those characters.
+❌ WRONG: `description: List of types to scan: sql_injection, hardcoded_secrets`
+✅ RIGHT: `description: "List of types to scan: sql_injection, hardcoded_secrets"`
+❌ WRONG: `summary: Scan #python files for issues`
+✅ RIGHT: `summary: "Scan #python files for issues"`
+When in doubt, quote. Over-quoting is harmless. Under-quoting breaks the parse.
+### Rule 2 — Use the EXACT shape for each node `kind`
+Each `kind` has specific required fields. Mixing them up (e.g., putting `reads:` on a `dataset` node) breaks schema validation.
+#### `kind: tool` — deterministic JavaScript
+```yaml
+- id: <slug>
+  kind: tool
+  impl: ./tools/<id>.mjs
+  phase: <phase-id> # optional
+  reads: [<state-field>] # optional
+  writes: [<state-field>] # optional
+```
+#### `kind: agent` — LLM call with structured output
+```yaml
+- id: <slug>
+  kind: agent
+  prompt: ./prompts/<id>.md
+  schema:
+    type: object
+    properties: { ... }
+  phase: <phase-id> # optional
+  reads: [<state-field>] # optional
+  writes: [<state-field>] # optional
+  for_each: { source: $.<state-field> } # optional
+  model: <override> # optional
+```
+#### `kind: cli-agent` — delegate to Claude Code / Codex / Gemini
+```yaml
+- id: <slug>
+  kind: cli-agent
+  provider: claude-code # REQUIRED — one of: claude-code | codex | gemini
+  prompt: '<INLINE prompt — file paths NOT supported for cli-agent in V1>'
+  phase: <phase-id> # optional
+  reads: [<state-field>] # optional
+  writes: [<state-field>] # optional
+  output_format: text # optional — text (default) or json
+  schema: { ... } # REQUIRED if output_format: json
+  timeout_ms: 600000 # optional — default 600000 ms
+  for_each: { source: $.<state-field> } # optional
+```
+#### `kind: dataset` — EXTERNAL TABULAR data loader
+⚠️ When to choose `dataset` vs `tool`:
+- Use `kind: dataset` ONLY when you need to load **tabular rows** from JSON/JSONL/CSV/Parquet/SQLite, an HTTP endpoint returning a JSON array, or an MCP resource.
+- For ANY other "load something from disk" need — read a single file as text, load a Python source, fetch one config blob, etc. — use `kind: tool` with `readFileSync` in the .mjs stub.
+- `format:` is one of `json | jsonl | csv | parquet`. There is NO `text` / `txt` / `yaml` / `markdown` format. If you reach for one of those, switch to a `tool` node instead.
+Three more constraints:
+1. Dataset nodes load from external sources. They do NOT have a `reads:` field.
+2. `writes:` MUST contain **EXACTLY ONE** field. The loaded rows go into that single field. If you need to populate multiple state fields, add a follow-up `tool` node that reads the dataset output and splits it.
+3. Dataset outputs are ARRAYS of objects. If you want a scalar (a single string, a single object), use a `tool` node.
+```yaml
+- id: <slug>
+  kind: dataset
+  source: # REQUIRED — exactly one of these shapes:
+    # File:
+    type: file
+    uri: ./fixtures/data.json
+    format: json # optional — json (default) | jsonl | csv | parquet
+  phase: <phase-id> # optional
+  writes: [<the_single_field>] # ⚠️ EXACTLY ONE field, not zero, not two
+```
+Other `source.type` variants:
+```yaml
+source:
+  type: sqlite
+  uri: ./db.sqlite
+  query: 'SELECT * FROM table'
+```
+```yaml
+source:
+  type: http
+  url: https://api.example.com/path
+  method: GET # optional
+  body: {} # optional
+```
+```yaml
+source:
+  type: mcp-resource
+  server: <server-name-declared-in-mcp.json>
+  uri: <resource-uri>
+```
+#### `kind: skill` — invoke a SKILL.md package
+```yaml
+- id: <slug>
+  kind: skill
+  impl: ./skills/<dir-containing-SKILL.md>
+  inputs: {} # optional
+  phase: <phase-id> # optional
+  reads: [<state-field>] # optional
+  writes: [<state-field>] # optional
+```
+#### `kind: experience` — nested OpenExpertise experience
+```yaml
+- id: <slug>
+  kind: experience
+  impl: ./sub-experience-dir
+  args: {} # optional
+  state_scope: isolated # optional — isolated (default) | shared
+  phase: <phase-id> # optional
+  reads: [<state-field>] # optional
+  writes: [<state-field>] # optional
+```
+### Rule 3 — Edges are minimal
+Edges have EXACTLY these fields: `from`, `to`, and optionally `when`. Do NOT add `description:`, `label:`, or any other property — the schema rejects extra properties.
+```yaml
+edges:
+  - { from: <node-id>, to: <node-id> }
+  - { from: <node-id>, to: <node-id>, when: '<expression>' }
+```
+Conditional `when:` uses a JSONPath-ish expression in single quotes, e.g.:
+- `when: '$.findings.length > 0'`
+- `when: '$.is_duplicate == true'`
+- `when: '$.score >= 0.5'`
+`edges` form a connected DAG aligned with `phases`.
+### Rule 4 — `state.schema` shape
+Each field under `state.schema` is a JSON-schema-ish fragment. Optionally add `merge:` for fan-out collectors:
+```yaml
+state:
+  schema:
+    raw_findings:
+      type: array
+      items: { type: object }
+      merge: array_append # accumulates across for_each iterations
+    final_score:
+      type: number
+      merge: last_wins # default
+    run_id:
+      type: string
+      merge: set_once # write-once
+```
+⚠️ `type` MUST be one of: `string`, `number`, `boolean`, `object`, `array`, `null`. The schema does NOT allow `integer`, `int`, `float`, `bigint`, `date`, `datetime`, or any other variant — use `number` for any numeric (whole or fractional) and `string` for ISO-formatted dates.
+`merge:` is one of: `array_append`, `last_wins` (default), `set_once`. Use `array_append` on collector fields written by `for_each` nodes.
+### Rule 5 — Self-check before returning
+Mentally walk these checks before emitting the YAML:
+1. ✓ `nodes:` and `edges:` are nested UNDER `graph:` — not at top level
+2. ✓ No top-level keys other than `name`, `description`, `version`, `state`, `phases`, `graph`, `runtime`
+3. ✓ Every string containing `:`, `#`, `{`, `}`, `[`, `]`, `&`, `*`, `!`, `|`, `>` is double-quoted
+4. ✓ Every `dataset` node has a `source: {type: ..., ...}` field (NOT `reads:`)
+5. ✓ Every `cli-agent` node has a `provider:` field (one of claude-code | codex | gemini)
+6. ✓ Every edge has ONLY `from`, `to`, and optionally `when` — no `description:`, no `label:`
+7. ✓ Every node has a unique `id`
+8. ✓ Every `reads:` and `writes:` field references something declared in `state.schema`
+9. ✓ `phases:` items have `id` (and optionally `title`) only — no `description:`
+10. ✓ Every `state.schema` field's `type:` is one of `string | number | boolean | object | array | null` (NOT `integer`, `int`, `float`, etc.)
+11. ✓ Every `kind: dataset` node has `writes:` with EXACTLY ONE field AND a real tabular `source.format` (`json | jsonl | csv | parquet`) — for non-tabular file loads (a single Python source, a config blob), use `kind: tool` with `readFileSync`
+If any check fails, fix before responding.
 ## Tool stub conventions
@@ -38,7 +245,35 @@ Each `tools/<id>.mjs` MUST:
 - Use `import {readFileSync} from 'node:fs'` etc., not bare specifiers.
 - Include a `// TODO:` comment marking the integration point the user needs to fill in (API call, file path, etc.).
-Make stubs runnable as-is with fixture data so `oe run` doesn't crash before the user wires anything up.
+### Defensive tool stubs — `oe run` MUST succeed before any user wiring
+This is critical. The first thing a user does after `oe ultra` finishes is run `oe run .` to verify the scaffold is alive. If a stub crashes because a state field is undefined, the user thinks the scaffold is broken.
+Rules every stub must follow:
+1. **NEVER destructure state without a default.** `const { source_file } = bundle.state ?? {}` is wrong if `source_file` itself is undefined — `bundle.state.source_file` will still be undefined and crash on further use.
+2. **Default every consumed value:** `const sourceFile = bundle.state?.source_file ?? './fixtures/sample.txt'`.
+3. **The FIRST node in the graph must produce ALL the state fields downstream nodes need.** It is YOUR job to pick a reasonable default (a relative fixture path, a sample object, an empty array) for any state field that isn't populated by `args:` on some upstream node.
+4. **Generate a `fixtures/` directory with realistic sample data** if any tool reads from disk. Reference relative paths from the tool. The user replaces the fixture content with real data later.
+Example of a defensive stub:
+```js
+import { readFileSync } from 'node:fs'
+import { resolve, dirname } from 'node:path'
+import { fileURLToPath } from 'node:url'
+const HERE = dirname(fileURLToPath(import.meta.url))
+export default async function loadSource(bundle) {
+  // TODO: replace with real source-of-truth (e.g., bundle.state.source_file)
+  const filePath = bundle?.state?.source_file ?? resolve(HERE, '..', 'fixtures', 'sample.py')
+  const sourceText = readFileSync(filePath, 'utf8')
+  return { state_delta: { source_text: sourceText } }
+}
+```
+Note: defaults to a fixture; never crashes; the `// TODO:` marker tells the user what to swap.
 ## Prompt conventions