npm - @openplaybooks/converge - Versions diffs - 0.2.0 - Mend

@openplaybooks/converge 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (20)

package/LICENSE +21 -0
package/README.md +131 -0
package/dist/index.js +212278 -0
package/package.json +54 -0
package/skills/converge-control/SKILL.md +208 -0
package/skills/converge-control/reference/cli.md +128 -0
package/skills/converge-control/reference/events.md +165 -0
package/skills/converge-control/troubleshooting/playbook.md +367 -0
package/skills/converge-development/SKILL.md +303 -0
package/skills/converge-development/reference/framework-map.md +294 -0
package/skills/converge-development/reference/observability.md +132 -0
package/skills/converge-development/troubleshooting/playbook.md +213 -0
package/skills/converge-planning/SKILL.md +302 -0
package/skills/converge-planning/references/anti-patterns.md +35 -0
package/skills/converge-planning/references/model.md +317 -0
package/skills/converge-planning/references/patterns.md +169 -0
package/skills/converge-planning/references/phases.md +168 -0
package/skills/converge-planning/references/schema.md +313 -0
package/skills/converge-planning/references/static-dynamic.md +38 -0
package/skills/converge-planning/references/tests.md +91 -0

package/skills/converge-control/troubleshooting/playbook.md ADDED

@@ -0,0 +1,367 @@
+# Converge troubleshooting playbook
+Symptom-indexed fixes for the run-blockers we know how to solve. Each entry is **symptom → root cause → fix recipe → verification**.
+If your symptom isn't in this file, **STOP** and surface to the user with: failing node ID, exact event lines, the check that failed, what you've tried, and a proposed fix. Don't improvise patches on novel symptoms.
+## Quick index
+1. [Previous run cancelled — node status unclear](#1-previous-run-cancelled--node-status-unclear)
+2. [Stale `outputs:` paths after workflow moved files](#2-stale-outputs-paths-after-workflow-moved-files)
+3. [Stale `inputs:` blocking a node that should be ready](#3-stale-inputs-blocking-a-node-that-should-be-ready)
+4. [Missing seed sub-template directory](#4-missing-seed-sub-template-directory)
+5. [Foreign playbook hijacks `converge run`](#5-foreign-playbook-hijacks-converge-run)
+6. [Secondary playbook fails after main one finishes](#6-secondary-playbook-fails-after-main-one-finishes)
+7. [Pre-existing typecheck/build errors in vendored code](#7-pre-existing-typecheckbuild-errors-in-vendored-code)
+8. [Verification task expects browser/server E2E inside an AI spawn](#8-verification-task-expects-browserserver-e2e-inside-an-ai-spawn)
+9. [Mixed-shape task: file-creation + tree-wide cleanup in one task](#9-mixed-shape-task-file-creation--tree-wide-cleanup-in-one-task)
+10. [Cycle detected in DAG](#10-cycle-detected-in-dag)
+11. [Frontier unresolved — seed spawned no children](#11-frontier-unresolved--seed-spawned-no-children)
+12. [Fingerprint mismatch cascade — all downstream re-executes](#12-fingerprint-mismatch-cascade--all-downstream-re-executes)
+---
+## 1. Previous run cancelled — node status unclear
+**Symptom:**
+```
+RUN_CANCELLED <playbook>
+```
+Or the run process was killed and you're unsure what completed.
+**Root cause:** The previous run was interrupted (SIGTERM, crash, reboot) without completing all nodes.
+**Fix:** Re-run. The runner reads `runstate.json` — completed nodes carry forward, incomplete nodes execute fresh. No special flags needed.
+```bash
+converge run --playbook=<name>
+```
+To retry only the nodes that failed (not those that were cancelled):
+```bash
+converge run --playbook=<name> --select 'result:error+'
+```
+Do **not** use `--full-refresh` — it ignores the previous runstate and re-executes everything.
+**Verification:** The run proceeds without re-executing completed nodes, and `NODE_COMPLETE cached` events appear for previously completed work.
+---
+## 2. Stale `outputs:` paths after workflow moved files
+**Symptom:**
+```
+CHECK_FAIL <nodeId> <checkId>
+  Task output not created: <path>
+```
+The path in the error points to a location that's empty on disk, but the file actually exists at a different location. A common case: a `split` task declares output at `lib/screens/X/widgets/foo.dart`, a follow-up `lift` task moves it to `lib/widgets/foo.dart`, and the split task's check fails on re-validation because the file moved.
+**Root cause:** TASK.md frontmatter declares an `outputs:` path that's correct at generation time but stale after later steps move the file.
+**Fix recipe:**
+1. **Fix the template** so future spawns handle the moved file:
+   ```yaml
+   # In the template TASK.md — make checks tolerate the moved location:
+   checks:
+     - id: widget-exists
+       cmd: "bash -c 'test -f {{widgetPath}} || test -f lib/widgets/$(basename {{widgetPath}})'"
+   ```
+   Or drop the brittle `outputs:` entry entirely if the check is sufficient.
+2. **Regenerate already-spawned nodes.** For each affected spawned node directory under `.converge/inventory/<playbook>/spawned/`, re-render from the fixed template with the node's existing `vars:`.
+3. **Re-compile and re-run:**
+   ```bash
+   converge run --playbook=<name> --dry
+   converge run --playbook=<name> --select 'result:error+'
+   ```
+**Verification:** `CHECK_FAIL` doesn't recur for the fixed node. Node completes on next attempt.
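The tolerant check in step 1 can also be factored into a small shell helper, which may be easier to read than an inline `bash -c` string. This is a minimal sketch, not part of the package: the function name `output_present` and the idea of passing the fallback directory as a second argument are assumptions made for illustration.

```shell
# output_present PATH FALLBACK_DIR   (hypothetical helper, not a converge command)
# Succeeds when PATH exists, or when a file with the same basename
# exists in FALLBACK_DIR (the place a later task may have moved it to).
output_present() {
  [ -f "$1" ] || [ -f "$2/$(basename "$1")" ]
}

# Example, using the paths from the symptom description:
#   output_present lib/screens/X/widgets/foo.dart lib/widgets && echo "found"
```

The helper only checks existence; it does not say which of the two locations matched, so it fits a pass/fail gate rather than a report.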
+---
+## 3. Stale `inputs:` blocking a node that should be ready
+**Symptom:**
+```
+INPUT_MISSING <nodeId> <path>
+```
+A node can't start because its declared `inputs:` file doesn't exist. The file was produced but later moved by a downstream task.
+**Root cause:** The `inputs:` path references a file that existed when the DAG was compiled but was moved or renamed.
+**Fix recipe:**
+1. **Fix the TASK.md** — drop the brittle input or make it conditional:
+   ```yaml
+   # Instead of:
+   inputs:
+     - "{{localWidgetPath}}"
+   # Use a check that tolerates the moved location.
+   ```
+2. **Regenerate affected spawned nodes** from the fixed template.
+3. **Re-compile and re-run:**
+   ```bash
+   converge run --playbook=<name> --dry
+   converge run --playbook=<name> --select 'result:error+'
+   ```
+**Verification:** Node moves past the input gate. `INPUT_MISSING` doesn't recur.
+---
+## 4. Missing seed sub-template directory
+**Symptom:**
+```
+NODE_FAIL <seedParentId> seed script import failed: <path>/seed.js
+```
+The seed.js exists and parses, but its `run()` references a sub-template (e.g. `tasks/subtask/TASK.md`) that's not on disk.
+**Root cause:** When migrating a playbook, sub-template directories were missed in the copy.
+**Fix recipe:**
+1. Find a known-good source that has the sub-template:
+   ```bash
+   find <source-playbook>/seeds/ -type d -name "subtask"
+   ```
+2. Copy into the target playbook:
+   ```bash
+   cp -r <source>/seeds/<name>/tasks/<step>/tasks/subtask \
+         <target-playbook>/seeds/<name>/tasks/<step>/tasks/subtask
+   ```
+3. Re-compile and re-run:
+   ```bash
+   converge run --playbook=<name> --dry
+   converge run --playbook=<name> --select 'result:error+'
+   ```
+**Verification:** Seed spawns children successfully. `SEED_SPAWN` event appears in the stream.
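Before copying individual sub-templates by hand, it can help to list every directory that exists in the source playbook but not in the target. The sketch below uses only `find` and `sort`; the function name `missing_dirs` is hypothetical, and it assumes both arguments are playbook root directories with the same layout.

```shell
# missing_dirs SRC TGT   (hypothetical helper, not a converge command)
# Prints each directory (relative to SRC) that exists under SRC but not under TGT.
missing_dirs() {
  md_src="$1"
  md_tgt="$2"
  (cd "$md_src" && find . -type d) | sort | while IFS= read -r md_dir; do
    [ -d "$md_tgt/$md_dir" ] || printf '%s\n' "${md_dir#./}"
  done
}

# Example:
#   missing_dirs <source-playbook> <target-playbook>
```

An empty result suggests the directory structure was copied completely; it does not compare file contents.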
+---
+## 5. Foreign playbook hijacks `converge run`
+**Symptom:** Run completes the intended playbook, then starts running tasks from a different playbook. The other playbook fails because it expects setup that hasn't happened.
+**Root cause:** `.converge/playbooks/` contains more than one playbook. A bare `converge run` may pick a different one than intended.
+**Fix:** Name the playbook explicitly with `--playbook` on every command:
+```bash
+converge run --playbook=default
+converge list --playbook=default
+```
+If the other playbook is genuinely unwanted, remove it (after confirming with the user):
+```bash
+rm -rf .converge/playbooks/<unwanted>
+```
+**Verification:** `converge run` only starts nodes from the intended playbook.
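A quick pre-flight count of playbook directories can flag this situation before a bare `converge run` is attempted. This is a sketch under the assumption that each playbook is a subdirectory of `.converge/playbooks/`; the function name `count_playbooks` is hypothetical.

```shell
# count_playbooks [DIR]   (hypothetical helper, not a converge command)
# Prints the number of subdirectories in DIR (default: .converge/playbooks).
count_playbooks() {
  cp_dir="${1:-.converge/playbooks}"
  cp_n=0
  for cp_entry in "$cp_dir"/*/; do
    # An unmatched glob stays literal and is not a directory, so it is skipped.
    [ -d "$cp_entry" ] && cp_n=$((cp_n + 1))
  done
  printf '%s\n' "$cp_n"
}

# Example:
#   [ "$(count_playbooks)" -gt 1 ] && echo "multiple playbooks: pass --playbook explicitly"
```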
+---
+## 6. Secondary playbook fails after main one finishes
+**Symptom:** The primary playbook completes, then a secondary playbook starts and fails immediately on setup issues.
+**Root cause:** Same as #5 — multiple playbooks present, auto-discovery picks the wrong one.
+**Fix:** Same as #5: name the playbook explicitly with `--playbook` on every command.
+**Verification:** Primary playbook completes cleanly. No secondary playbook nodes appear.
+---
+## 7. Pre-existing typecheck/build errors in vendored code
+**Symptom:** A `typecheck` or `build` check fails identically across many nodes. The failing file isn't something the AI wrote — it was already in the repo before the run started.
+**Root cause:** The playbook's typecheck check is all-or-nothing. Any pre-existing error fails every node with that check.
+**Fix recipe:**
+1. **Identify the offending files:**
+   ```bash
+   pnpm typecheck 2>&1 | grep "error TS" | head -20
+   ```
+2. **Decide:** does the playbook need these files? If yes, fix the types. If no (vestigial vendored code), delete them.
+3. **Delete and clean imports:**
+   ```bash
+   rm <offending-file>
+   # Clean imports referencing the deleted file
+   pnpm typecheck 2>&1 | grep -c "error TS"
+   ```
+   Repeat until count is 0.
+4. **Re-run** — previously blocked nodes will pass.
+**Verification:** `pnpm typecheck` exits 0. Next run shows `CHECK_PASS` for the typecheck check.
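To see which files account for most of the errors in step 1, the typecheck output can be grouped by file. The sketch below assumes plain `tsc` output of the form `path(line,col): error TSxxxx: message`, which is what `tsc` emits when its output is piped; prefixed output from a workspace-wide run would need extra handling. The function name `ts_errors_by_file` is hypothetical.

```shell
# ts_errors_by_file   (hypothetical helper)
# Reads typecheck output on stdin and prints "count path", most errors first.
ts_errors_by_file() {
  grep 'error TS' | sed 's/(.*//' | sort | uniq -c | sort -rn
}

# Example:
#   pnpm typecheck 2>&1 | ts_errors_by_file | head -20
```

Files at the top of the list are the first candidates for the fix-or-delete decision in step 2.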
+---
+## 8. Verification task expects browser/server E2E inside an AI spawn
+**Symptom:** A task says "spin up `pnpm dev`, curl `localhost:N`, exercise pages, write a JSON report." The AI tries — runs `pnpm dev &`, curls, sometimes runs aggressive cleanups like `pkill -f "node"` (which can kill the runner itself). Times out or deadlocks.
+**Root cause:** AI spawns are designed for file edits + short shell commands, not multi-process choreography. No port management, no headless browser, no reliable long-lived server lifecycle.
+**Fix recipe — restructure the task:**
+1. **Drop the `allPassed === true` gate.** Replace with a "report file exists + has expected schema" check:
+   ```yaml
+   checks:
+     - id: report-written
+       cmd: "test -f e2e-verify.json && node -e \"const r=JSON.parse(require('fs').readFileSync('e2e-verify.json','utf8'));process.exit(Array.isArray(r.scenarios)&&r.scenarios.length>0?0:1)\""
+   ```
+2. **Reframe the task body** as "scaffold the report file, leave verdicts for human review."
+3. **If you genuinely need automated E2E**, split into two tasks:
+   - Task A: spawn `pnpm dev`, write pid/port file, exit.
+   - Task B (depends on A): read pid/port, hit endpoints, kill pid, write report.
+**Verification:** Task passes its relaxed gate cleanly. No `pkill -f "node"` in any task body.
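The Task A / Task B split in step 3 depends on stopping the server by its recorded pid rather than by a broad `pkill`. A minimal sketch of that pattern follows; the function names `start_bg` and `stop_bg` and the pid-file path are assumptions, and a dev server that forks child processes may need more than a single `kill`.

```shell
# start_bg PIDFILE CMD [ARGS...]   (hypothetical helper)
# Starts CMD in the background, detached from this shell's output,
# and records its pid in PIDFILE.
start_bg() {
  sb_pidfile="$1"
  shift
  "$@" >/dev/null 2>&1 &
  printf '%s\n' "$!" > "$sb_pidfile"
}

# stop_bg PIDFILE   (hypothetical helper)
# Kills only the recorded pid, then removes PIDFILE.
stop_bg() {
  sb_pid=$(cat "$1") || return 1
  kill "$sb_pid" 2>/dev/null
  rm -f "$1"
}

# Example:
#   start_bg .dev.pid pnpm dev      # Task A
#   stop_bg .dev.pid                # end of Task B
```

Because `stop_bg` targets one pid, it cannot take down the runner the way `pkill -f "node"` can.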
+---
+## 9. Mixed-shape task: file-creation + tree-wide cleanup in one task
+**Symptom:** A single node takes many attempts to converge. The check list contains both "new file X exists" (`test -f some/path.ts`) AND "no occurrences of pattern Y in src/" (`grep -r 'badPattern' src`). Each attempt scrubs a few files but new ones keep being found.
+**Root cause:** Existence and negation checks converge at different rates. Existence flips false→true once when the file is written. Negation drains chunk-by-chunk over many edits.
+**Fix recipe — split into creator + cleanup, two sibling nodes:**
+```yaml
+# Before (one node, slow):
+- id: 009-converge-event-stream
+  outputs:
+    - src/app/api/events/route.ts
+  checks:
+    - id: route-exists
+      cmd: "test -f src/app/api/events/route.ts"
+    - id: no-legacy-websocket
+      cmd: "test -z \"$(grep -rl 'useWebSocket' src 2>/dev/null)\""
+# After (two nodes, fast):
+- id: 009-converge-event-stream
+  outputs:
+    - src/app/api/events/route.ts
+  checks:
+    - id: route-exists
+      cmd: "test -f src/app/api/events/route.ts"
+- id: 009b-purge-legacy-websocket
+  depends_on: [009-converge-event-stream]
+  checks:
+    - id: no-legacy-websocket
+      cmd: "test -z \"$(grep -rl 'useWebSocket' src 2>/dev/null)\""
+```
+**Verification:** Each single-shape node converges in 1-2 attempts. No multi-attempt thrashing.
+---
+## 10. Cycle detected in DAG
+**Symptom:**
+```
+CYCLE_DETECTED [id1 → id2 → id3 → id1]
+```
+Compile fails. The DAG has a circular dependency.
+**Root cause:** `depends_on` edges form a cycle. Usually happens when two tasks each declare the other as a dependency, or a chain loops back.
+**Fix recipe:**
+1. Trace the cycle shown in the error.
+2. Identify which edge is incorrect — which task does NOT actually need to depend on the other.
+3. Remove or fix the `depends_on` entry in the offending TASK.md or playbook.yml.
+4. Re-validate the graph:
+   ```bash
+   converge run --playbook=<name> --dry
+   ```
+**Verification:** Dry run succeeds. No cycle error is reported.
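When the edge list is long, the standard `tsort` utility offers an independent sanity check: it reports a loop on stderr if the input pairs are cyclic. The sketch below assumes you have written the `depends_on` edges out by hand as `dependency node` pairs, one per line; the function name `has_cycle` is hypothetical and this is not how converge itself detects cycles.

```shell
# has_cycle   (hypothetical helper)
# Reads "dependency node" pairs on stdin; succeeds if tsort reports a loop.
has_cycle() {
  # Capture only stderr; the sorted output itself is discarded.
  hc_err=$(tsort 2>&1 >/dev/null) || true
  [ -n "$hc_err" ]
}

# Example:
#   printf '%s\n' 'id1 id2' 'id2 id3' 'id3 id1' | has_cycle && echo "cycle"
```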
+---
+## 11. Frontier unresolved — seed spawned no children
+**Symptom:**
+```
+FRONTIER_UNRESOLVED <nodeId>
+```
+A seed parent declared with `from_seed` and an upstream catalog was expected to spawn children, but the DAG shows zero child nodes.
+**Root cause:** Either (a) the catalog file is empty/missing, or (b) the seed script errored silently, or (c) the catalog format changed and the seed didn't match any entries.
+**Fix recipe:**
+1. Check the catalog file exists and has entries:
+   ```bash
+   jq 'length' <catalog-path>  # or equivalent
+   ```
+2. Run the seed script manually to see errors:
+   ```bash
+   node <playbook>/seeds/<name>/index.js
+   ```
+3. Fix the catalog or seed script.
+4. Re-validate and re-run:
+   ```bash
+   converge run --playbook=<name> --dry
+   converge run --playbook=<name> --select 'result:error+'
+   ```
+**Verification:** Dry run succeeds. `SEED_SPAWN` events appear during run showing the expected child count.
+---
+## 12. Fingerprint mismatch cascade — all downstream re-executes
+**Symptom:** An incremental run (`--select 'state:modified+'`) re-executes far more nodes than expected. Nodes that shouldn't have changed show `NODE_COMPLETE fresh` instead of `cached`.
+**Root cause:** A node's fingerprint changed unexpectedly — often because a TASK.md was touched (even whitespace), a `vars:` value changed, or the manifest hash differs due to a re-compile that produced a different DAG structure.
+**Fix recipe:**
+1. Check what actually changed:
+   ```bash
+   diff <(jq -S . .converge/journal/<playbook>/manifest.prev.json) <(jq -S . .converge/journal/<playbook>/manifest.json)
+   ```
+2. If the diff is noise (whitespace, key ordering), the fingerprint computation is too broad. This is a framework issue — surface to the user.
+3. If the diff is real (a `depends_on` edge changed, a `vars:` value updated), the cascade is correct behavior. Let it run.
+**Verification:** After a clean run, the next `--select 'state:modified+'` should show all `cached` (zero `fresh`).
+---
+## When NONE of these match
+If your symptom isn't covered above:
+1. **Read the node forensics:**
+   ```bash
+   ls .converge/journal/<playbook>/tasks/<nodeId>/
+   cat .converge/journal/<playbook>/tasks/<nodeId>/FEEDBACK.md
+   cat .converge/journal/<playbook>/tasks/<nodeId>/LEARN.md
+   ```
+2. **Check the event stream** around the failure:
+   ```bash
+   grep -E 'NODE_FAIL|CHECK_FAIL|ERROR' .converge/journal/<playbook>/events.jsonl | tail -20
+   ```
+3. **Surface to the user** with: failing node ID, exact event lines, what you've tried, your hypothesis, and a proposed fix.
+4. Wait for approval before applying any patch.

package/skills/converge-development/SKILL.md ADDED

@@ -0,0 +1,303 @@
+---
+name: converge-development
+description: Use when the user wants to develop, debug, or improve the converge framework itself — running an example as a test bed, running the self-improvement loop, observing framework behavior, diagnosing framework bugs, and editing source under packages/. Triggers on phrases like "debug converge", "fix the framework", "run the self-improvement loop", "autonomous framework improvement", "why does the runner do X", "improve the journal", "add a feature to the CLI", "use this example to find bugs in converge".
+---
+# Converge Development — observe-diagnose-fix the framework itself
+## Purpose
+Use a real example playbook as a test bed. Run it. Watch what the framework does internally — not just the stdout event stream, but the target directory, runstate, and per-attempt forensics the runner writes to disk. When the framework misbehaves (crashes, corrupts state, fails to retry, mishandles a provider response), trace the symptom to the package and module responsible, patch `packages/**` source, rebuild, and re-run the example to verify.
+This skill is **only** for changes to framework source under `packages/` or for running framework-improvement playbooks that target `packages/`. It is the framework-developer counterpart to `converge-control` (which babysits a *user's* playbook and treats the framework as a black box).
+## Two modes
+- **Interactive:** reproduce a named framework bug, patch `packages/**`, rebuild, verify.
+- **Autonomous:** run `self-improvement-loop` for bounded framework hardening, then use its artifacts as the evidence trail.
+## When to invoke
+Trigger on user requests like:
+- "Debug converge using <example>" / "Use this example to find bugs in the framework"
+- "Why does the DAG runner <do X>?" / "Why is the execution <doing Y>?"
+- "Fix the framework — <symptom>" / "There's a bug in the manifest/target/seed/CLI"
+- "Improve <subsystem>" / "Add a feature to the CLI" / "Refactor a DAG action"
+- "Run the self-improvement loop" / "Autonomously improve the framework"
+- "Profile / instrument / add logging to <module>"
+Do **not** invoke for:
+- Running a *user's* playbook to completion → **`converge-control`**
+- Fixing a stuck user playbook (stale outputs, stall, foreign-playbook hijack) → **`converge-control`**
+- Designing a new playbook or setting up `.converge/` from scratch → **`converge-planning`**
+If the symptom is purely user-shape (the playbook author made a mistake), route to `converge-control`. If the symptom is framework-shape (the runner mishandles a *valid* user playbook), continue here.
+## Autonomous mode: self-improvement-loop
+Run bounded framework hardening with:
+```bash
+converge run --playbook=self-improvement-loop --select improve+
+```
+Use only these surfaces unless debugging the playbook itself:
+- source: `.converge/playbooks/self-improvement-loop/README.md`, `tasks/improve/TASK.md`, `tasks/improve/seeds/epoch.seed.js`, `scripts/*.mjs`;
+- evidence: `.converge/artifacts/self-improvement-loop/{journal.md,metrics.jsonl,backlog.jsonl,touched-files.jsonl,convergence.md,epochs/<NNN>/}`.
+Keep epochs maintainer-grade: clean non-artifact start, real observations before selection, one evidence-backed framework change, patch manifest from `git diff`, mapped regression commands, command-backed `verify/result.json`, and stop rather than repeat low-value cleanup.
+If the loop exposes a clear framework bug, use the interactive dev loop below for the patch and let the playbook verify the epoch.
+## The dev loop
+Eight steps, in order. Stay in this loop until the example passes cleanly or you hit a structural decision that needs the user.
+### 1. Pick a test bed
+If the user named a test fixture or example in the trigger phrase, use it. The smallest one that exercises the suspected subsystem is best — see the fixture→subsystem table at the bottom of `reference/framework-map.md`.
+**Test fixtures** (under `tests/`) are the primary dev-loop test beds — they're small, fast, and have corresponding vitest runners. Prefer these for most framework debugging:
+| Subsystem | Fixture |
+|-----------|---------|
+| Navigator / convergence | `tests/test-simple-run` |
+| Compile / discovery / manifest | `tests/test-compile-discover` |
+| Multi-provider / agentfn routing | `tests/test-mixed-model` |
+| Seed / dynamic spawn | `tests/test-seeding`, `tests/test-queue-pattern` |
+| Gap detection (input/output) | `tests/test-gap-blocked-input`, `tests/test-gap-missing-output` |
+| Buggy-check relaxation | `tests/test-buggy-check` |
+| Loop detection | `tests/test-loop-detection` |
+| Multi-attempt convergence | `tests/test-multi-attempt` |
+| Crash-safe resume | `tests/test-resume` |
+**Full examples** (under `examples/`) are heavier multi-phase projects. Use when debugging end-to-end behavior that doesn't surface in a single fixture.
+### 2. Build current state
+```bash
+cd <repo-root>
+pnpm build
+```
+Confirm it exits cleanly. **If the build is already broken, that *is* the first bug**: skip to step 5 with the build error as the symptom.
+For faster iteration when changes are scoped to one package:
+```bash
+pnpm --filter @openplaybooks/converge-core build && pnpm --filter @openplaybooks/converge build
+```
+### 3. Run the test bed & monitor
+From the test fixture or example directory:
+```bash
+cd tests/<fixture-name>
+node <repo-root>/packages/cli/dist/index.js playbook validate default
+node <repo-root>/packages/cli/dist/index.js run --playbook=default --dry
+node <repo-root>/packages/cli/dist/index.js run --playbook=default
+```
+If the fixture uses a non-default playbook name, swap `default` for the actual name.
+Common flags for debugging:
+| Flag | Use |
+|---|---|
+| `--force` | Force-run a task even if completed/cached |
+| `--select <expr>` | Run only matching tasks (`--select '02-something+'` = task + descendants) |
+| `--dry` | Plan only — show what would execute without running |
+| `--full-refresh` | Ignore fingerprints, re-execute everything |
+| `--verbose, -v` | Verbose output |
+Arm a Monitor on the event stream:
+```bash
+tail -f .converge/journal/<playbook>/events.jsonl | grep -E '(NODE_START|NODE_COMPLETE|NODE_FAIL|CHECK_FAIL|ERROR)'
+```
+Then — and this is what makes this skill different from `converge-control` — also read the *internal* state:
+```bash
+# DAG state after run
+cat .converge/journal/<playbook>/runstate.json
+# Per-task forensics
+ls .converge/journal/<playbook>/tasks/<taskId>/
+cat .converge/journal/<playbook>/tasks/<taskId>/FEEDBACK.md
+cat .converge/journal/<playbook>/tasks/<taskId>/LEARN.md
+```
+Full observability surface: **`reference/observability.md`**.
+### 4. Classify the symptom
+| Symptom shape | Class | Action |
+|---|---|---|
+| Example completes cleanly, no anomalies | none | nothing to fix; ask the user what they wanted to investigate |
+| Stale paths, missing inputs from user playbook | user-shape | wrong skill; route to **`converge-control`** |
+| DAG runner crashes / unhandled exception during execution | framework | continue to step 5 |
+| Runstate corruption (node status flip-flops, fingerprint mismatch cascade) | framework | continue to step 5 |
+| Seed spawn fails despite valid `seeds/index.js` | framework | continue to step 5 |
+| agentfn provider throws on a valid response | framework | continue to step 5 |
+| Node retries without progress (same CHECK_FAIL across attempts) | framework | continue to step 5 |
+| Fingerprint caching broken (unchanged node re-executed unnecessarily) | framework | continue to step 5 |
+| CLI arg parsing / exit code wrong | framework | continue to step 5 |
+### 5. Diagnose
+Open **`reference/framework-map.md`**. Find the subsystem that owns the symptom. Read the source files listed there. Form a hypothesis.
+Then check **`troubleshooting/playbook.md`** for a matching past entry. If found → apply the recipe.
+If the diagnosis is straightforward and confined to one file, proceed. If it crosses package boundaries (e.g. `core/navigator` ↔ an agentfn provider, or `core/journal` ↔ `cli/commands-clean`), **STOP and surface the hypothesis to the user before editing**. Same escalation pattern as `converge-control`.
+### 6. Edit + rebuild
+Patch `packages/**`. Then rebuild — the CLI runs from `dist/`, not source:
+```bash
+# whole monorepo (safe default)
+pnpm build
+# or single package (faster when the change is scoped)
+pnpm --filter @openplaybooks/<package-name> build
+```
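Since the CLI runs from `dist/`, a forgotten rebuild is easy to miss. One way to catch it is to compare modification times before re-running. This is a sketch only: the function name `dist_is_stale` is hypothetical, and it assumes the built file's mtime reflects the last build.

```shell
# dist_is_stale SRC_DIR DIST_FILE   (hypothetical helper)
# Succeeds when DIST_FILE is missing or any file under SRC_DIR is newer than it.
dist_is_stale() {
  ds_src="$1"
  ds_dist="$2"
  [ -f "$ds_dist" ] || return 0
  [ -n "$(find "$ds_src" -type f -newer "$ds_dist" | head -n 1)" ]
}

# Example:
#   dist_is_stale packages/cli/src packages/cli/dist/index.js && pnpm build
```

A change in another package that the CLI bundles would not be noticed by checking a single source directory, so the whole-monorepo `pnpm build` remains the safe default.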
+### 7. Verify
+Clear target state from the failing run (so you're testing the fix, not a stale runstate):
+```bash
+# Remove runtime state for a clean re-run
+rm -rf tests/<fixture>/.converge/journal
+rm -rf tests/<fixture>/.converge/inventory
+# Also clean output files the fixture may have produced
+rm -f tests/<fixture>/*.txt
+```
+Or use the CLI for targeted cleanup:
+```bash
+node packages/cli/dist/index.js clean --select '*' --dir=tests/<fixture>
+```
+Re-run from step 3. Confirm:
+- Original symptom is gone.
+- No new symptoms appeared.
+- The run exits 0 cleanly.
+**Run the existing vitest suite** for the subsystem you touched:
+```bash
+# Run tests for the fixture you're using
+npx vitest run tests/<fixture-related>.test.ts
+# Or run all tests (slower, use for hot-path changes)
+npx vitest run tests/
+```
+If no vitest runner exists for the fixture, create one (see `tests/compile-discover.test.ts` for the pattern — compile + run + verify outputs).
+If the symptom returns or a new one shows up → loop back to step 5.
+### 8. Record the recipe
+Append a new entry to **`troubleshooting/playbook.md`** in the format established there: **Symptom** / **Root cause** / **Fix** / **Verification** / **Files touched**. Skip if the fix was a one-off typo. The point is to grow institutional memory so the *next* invocation of this skill recognizes the symptom faster.
+## Hard rules — STOP and re-route
+- **Don't edit framework source without first reproducing the bug against an example.** No speculative fixes. The reproducible run is also the verification baseline for step 7.
+- **Don't skip `pnpm build` between source edit and re-run.** The CLI binary runs from `packages/cli/dist/index.js`, not source. Edits to `packages/**/src/*.ts` have zero effect until rebuilt.
+- **Don't `--full-refresh` the example mid-debug.** That ignores fingerprints and can mask caching bugs. Use `rm -rf .converge/journal/<playbook> .converge/inventory/<playbook>` to clear state for a clean re-run.
+- **Don't bundle unrelated improvements.** One bug, one patch (CLAUDE.md §3 — surgical changes). If you notice adjacent dead code or a refactor opportunity, mention it to the user; don't ship it in the diagnostic fix.
+- **Don't run `pnpm test` as a gate for every edit.** Too slow for the dev loop. But if your fix touches a hot path — `core/src/dag/`, `core/src/manifest/`, `core/src/journal/` — flag that to the user and suggest *they* run `pnpm test` before commit.
+- **Don't leave `console.log` debugging in the source.** If you added logging to diagnose, remove it before declaring the fix done.
+- **Apply known recipes; ask before novel ones.** If `troubleshooting/playbook.md` has a matching entry → apply and continue. If it doesn't, and the diagnosis crosses package boundaries → STOP, state hypothesis, wait for approval.
+- **Use current terminology.** Runtime state lives under `.converge/journal/<playbook>/`, spawned-task inventory under `.converge/inventory/<playbook>/`, and outputs under `.converge/artifacts/<playbook>/`. Use `runstate.json`, not `checkpoint.json`. Use `DAG node`, not `epic`. Use `fingerprint caching`, not `resume checkpoint`.
+## Testing
+### Running tests
+```bash
+# All root-level integration tests
+npx vitest run tests/
+# Single test file (fast feedback)
+npx vitest run tests/playbook-compile.test.ts
+# Watch mode (re-run on file changes)
+npx vitest tests/playbook-compile.test.ts
+# Per-package unit tests
+pnpm --filter @openplaybooks/converge-core test
+pnpm --filter @openplaybooks/agentfn test
+# Full monorepo test suite
+pnpm test
+```
+### Test file anatomy
+Root-level tests live in `tests/*.test.ts`. They follow a pattern:
+```ts
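+import { spawnSync } from "node:child_process";
+import { existsSync, readFileSync } from "node:fs";
+import { resolve } from "node:path";
+import { expect } from "vitest";
+// REPO_ROOT, PROJECT_DIR, and manifestPath are assumed to be defined earlier in the test file.
+// 1. Spawn converge CLI with spawnSync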
+const CLI = resolve(__dirname, "..", "packages/cli/dist/index.js");
+const result = spawnSync("node", [CLI, "run", "--dir=<dir>"], {
+  cwd: REPO_ROOT, encoding: "utf-8",
+  stdio: ["ignore", "pipe", "pipe"],
+});
+// 2. Verify outputs on disk
+expect(existsSync(resolve(PROJECT_DIR, "EXPECTED_OUTPUT.txt"))).toBe(true);
+// 3. Verify journal/manifest state
+const manifest = JSON.parse(readFileSync(manifestPath, "utf-8"));
+expect(manifest.nodes["task-id"]).toBeDefined();
+```
+**Key conventions:**
+- Fixtures live under `tests/test-<name>/` with full `.converge/` structure
+- Clean the journal before the tests run (`beforeAll`), and clean outputs afterwards
+- Use `describe.skip` + binary check for tests requiring external CLIs (claude, codex)
+- `vitest.config.ts` has `fileParallelism: false` — tests run serially, safe to share fixture dirs
+- For compile-only tests, use the parameterized pattern from `tests/playbook-compile.test.ts`
+- For DAG structure tests, use the pattern from `tests/playbook-dag.test.ts`
+- For seed/structure tests (no AI needed), use the pattern from `tests/playbook-seeds.test.ts`
+### When to add tests
+- **Always** when fixing a bug that manifested in a specific fixture — add a regression test
+- **Always** when adding a new config schema field (`ai:`, new frontmatter key) — add a compile test
+- **Optionally** when the fix is a comment, error message, or logging change
+- **Never** skip adding a test for a bug that can reproduce deterministically
+## Hand-off
+| Situation | Hand off to |
+|---|---|
+| User wants to *run* a user playbook (not develop the framework) | **`converge-control`** |
+| User wants bounded autonomous framework improvement | run `self-improvement-loop` here, then use its artifacts as evidence |
+| User wants to design a new playbook | **`converge-planning`** |
+| Bug is in the user's example/playbook (TASK.md typo, missing input, wrong path) | **the user** — surface it, don't patch the framework around bad user data |
+| Fix touches a hot path and needs full test coverage before merge | **the user** — flag the path, suggest `pnpm test` |
+## File map
+```
+SKILL.md                         (this file — entry point and dev loop)
+reference/
+  framework-map.md               (subsystem → packages/ location → symptoms → reproducer)
+  observability.md               (what to read on disk during a run)
+troubleshooting/
+  playbook.md                    (symptom → root cause → fix recipes; grows over time)
+```
+Load **one** file per gap. Return here between.