@openplaybooks/converge 0.2.0

@@ -0,0 +1,367 @@
1
+ # Converge troubleshooting playbook
2
+
3
+ Symptom-indexed fixes for the run-blockers we know how to solve. Each entry is **symptom → root cause → fix recipe → verification**.
4
+
5
+ If your symptom isn't in this file, **STOP** and surface it to the user with: failing node ID, exact event lines, the check that failed, what you've tried, and a proposed fix. Don't improvise patches on novel symptoms.
6
+
7
+ ## Quick index
8
+
9
+ 1. [Previous run cancelled — node status unclear](#1-previous-run-cancelled--node-status-unclear)
10
+ 2. [Stale `outputs:` paths after workflow moved files](#2-stale-outputs-paths-after-workflow-moved-files)
11
+ 3. [Stale `inputs:` blocking a node that should be ready](#3-stale-inputs-blocking-a-node-that-should-be-ready)
12
+ 4. [Missing seed sub-template directory](#4-missing-seed-sub-template-directory)
13
+ 5. [Foreign playbook hijacks `converge run`](#5-foreign-playbook-hijacks-converge-run)
14
+ 6. [Secondary playbook fails after main one finishes](#6-secondary-playbook-fails-after-main-one-finishes)
15
+ 7. [Pre-existing typecheck/build errors in vendored code](#7-pre-existing-typecheckbuild-errors-in-vendored-code)
16
+ 8. [Verification task expects browser/server E2E inside an AI spawn](#8-verification-task-expects-browserserver-e2e-inside-an-ai-spawn)
17
+ 9. [Mixed-shape task: file-creation + tree-wide cleanup in one task](#9-mixed-shape-task-file-creation--tree-wide-cleanup-in-one-task)
18
+ 10. [Cycle detected in DAG](#10-cycle-detected-in-dag)
19
+ 11. [Frontier unresolved — seed spawned no children](#11-frontier-unresolved--seed-spawned-no-children)
20
+ 12. [Fingerprint mismatch cascade — all downstream re-executes](#12-fingerprint-mismatch-cascade--all-downstream-re-executes)
21
+
22
+ ---
23
+
24
+ ## 1. Previous run cancelled — node status unclear
25
+
26
+ **Symptom:**
27
+ ```
28
+ RUN_CANCELLED <playbook>
29
+ ```
30
+ Or the run process was killed and you're unsure what completed.
31
+
32
+ **Root cause:** The previous run was interrupted (SIGTERM, crash, reboot) without completing all nodes.
33
+
34
+ **Fix:** Re-run. The runner reads `runstate.json` — completed nodes carry forward, incomplete nodes execute fresh. No special flags needed.
35
+
36
+ ```bash
37
+ converge run --playbook=<name>
38
+ ```
39
+
40
+ To retry only the nodes that failed (not those that were cancelled):
41
+
42
+ ```bash
43
+ converge run --playbook=<name> --select 'result:error+'
44
+ ```
45
+
46
+ Do **not** use `--full-refresh` — it ignores the previous runstate and re-executes everything.
47
+
48
+ **Verification:** The run proceeds without re-executing completed nodes. `NODE_COMPLETE cached` events appear for previously completed work.
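
As a rough illustration of the carry-forward behavior described in this entry, the sketch below partitions nodes by their recorded status. The `NodeStatus` values and the flat id-to-status map are assumptions made for the example; the real `runstate.json` schema is not shown in this playbook and may differ.

```typescript
// Hypothetical status values; the real runstate.json may use other names.
type NodeStatus = "complete" | "failed" | "cancelled" | "pending";

interface ResumePlan {
  cached: string[]; // carried forward, not re-executed
  toRun: string[];  // executed fresh on the next run
}

// Completed nodes carry forward; everything else runs again.
function planResume(runstate: Record<string, NodeStatus>): ResumePlan {
  const plan: ResumePlan = { cached: [], toRun: [] };
  for (const [nodeId, status] of Object.entries(runstate)) {
    if (status === "complete") {
      plan.cached.push(nodeId);
    } else {
      plan.toRun.push(nodeId);
    }
  }
  return plan;
}

// Example: a run that was interrupted while 02-build was in flight.
const examplePlan = planResume({
  "01-setup": "complete",
  "02-build": "cancelled",
  "03-test": "pending",
});
```

Here `examplePlan.cached` holds only `01-setup`, which matches the expectation that a plain re-run needs no special flags.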
49
+
50
+ ---
51
+
52
+ ## 2. Stale `outputs:` paths after workflow moved files
53
+
54
+ **Symptom:**
55
+ ```
56
+ CHECK_FAIL <nodeId> <checkId>
57
+ Task output not created: <path>
58
+ ```
59
+ The path in the error points to a location that's empty on disk, but the file actually exists at a different location. A common case: a `split` task declares its output at `lib/screens/X/widgets/foo.dart`, a follow-up `lift` task moves it to `lib/widgets/foo.dart`, and the split task's check then fails on re-validation because the file has moved.
60
+
61
+ **Root cause:** TASK.md frontmatter declares an `outputs:` path that's correct at generation time but stale after later steps move the file.
62
+
63
+ **Fix recipe:**
64
+
65
+ 1. **Fix the template** so future spawns handle the moved file:
66
+ ```yaml
67
+ # In the template TASK.md — make checks tolerate the moved location:
68
+ checks:
69
+ - id: widget-exists
70
+ cmd: "bash -c 'test -f {{widgetPath}} || test -f lib/widgets/$(basename {{widgetPath}})'"
71
+ ```
72
+ Or drop the brittle `outputs:` entry entirely if the check is sufficient.
73
+
74
+ 2. **Regenerate already-spawned nodes.** For each affected spawned node directory under `.converge/inventory/<playbook>/spawned/`, re-render from the fixed template with the node's existing `vars:`.
75
+
76
+ 3. **Re-compile and re-run:**
77
+ ```bash
78
+ converge run --playbook=<name> --dry
79
+ converge run --playbook=<name> --select 'result:error+'
80
+ ```
81
+
82
+ **Verification:** `CHECK_FAIL` doesn't recur for the fixed node. Node completes on next attempt.
83
+
84
+ ---
85
+
86
+ ## 3. Stale `inputs:` blocking a node that should be ready
87
+
88
+ **Symptom:**
89
+ ```
90
+ INPUT_MISSING <nodeId> <path>
91
+ ```
92
+ A node can't start because its declared `inputs:` file doesn't exist. The file was produced but later moved by a downstream task.
93
+
94
+ **Root cause:** The `inputs:` path references a file that existed when the DAG was compiled but was moved or renamed.
95
+
96
+ **Fix recipe:**
97
+
98
+ 1. **Fix the TASK.md** — drop the brittle input or make it conditional:
99
+ ```yaml
100
+ # Instead of:
101
+ inputs:
102
+ - "{{localWidgetPath}}"
103
+ # Use a check that tolerates the moved location.
104
+ ```
105
+
106
+ 2. **Regenerate affected spawned nodes** from the fixed template.
107
+
108
+ 3. **Re-compile and re-run:**
109
+ ```bash
110
+ converge run --playbook=<name> --dry
111
+ converge run --playbook=<name> --select 'result:error+'
112
+ ```
113
+
114
+ **Verification:** Node moves past the input gate. `INPUT_MISSING` doesn't recur.
115
+
116
+ ---
117
+
118
+ ## 4. Missing seed sub-template directory
119
+
120
+ **Symptom:**
121
+ ```
122
+ NODE_FAIL <seedParentId> seed script import failed: <path>/seed.js
123
+ ```
124
+ The seed.js exists and parses, but its `run()` references a sub-template (e.g. `tasks/subtask/TASK.md`) that's not on disk.
125
+
126
+ **Root cause:** When migrating a playbook, sub-template directories were missed in the copy.
127
+
128
+ **Fix recipe:**
129
+
130
+ 1. Find a known-good source that has the sub-template:
131
+ ```bash
132
+ find <source-playbook>/seeds/ -type d -name "subtask"
133
+ ```
134
+
135
+ 2. Copy into the target playbook:
136
+ ```bash
137
+ cp -r <source>/seeds/<name>/tasks/<step>/tasks/subtask \
138
+ <target-playbook>/seeds/<name>/tasks/<step>/tasks/subtask
139
+ ```
140
+
141
+ 3. Re-compile and re-run:
142
+ ```bash
143
+ converge run --playbook=<name> --dry
144
+ converge run --playbook=<name> --select 'result:error+'
145
+ ```
146
+
147
+ **Verification:** Seed spawns children successfully. `SEED_SPAWN` event appears in the stream.
148
+
149
+ ---
150
+
151
+ ## 5. Foreign playbook hijacks `converge run`
152
+
153
+ **Symptom:** Run completes the intended playbook, then starts running tasks from a different playbook. The other playbook fails because it expects setup that hasn't happened.
154
+
155
+ **Root cause:** `.converge/playbooks/` contains more than one playbook. A bare `converge run` may pick a different one than intended.
156
+
157
+ **Fix:** Name the playbook explicitly with `--playbook=<name>` on every command:
158
+
159
+ ```bash
160
+ converge run --playbook=default
161
+ converge list --playbook=default
162
+ ```
163
+
164
+ If the other playbook is genuinely unwanted, remove it (after confirming with the user):
165
+
166
+ ```bash
167
+ rm -rf .converge/playbooks/<unwanted>
168
+ ```
169
+
170
+ **Verification:** `converge run` only starts nodes from the intended playbook.
171
+
172
+ ---
173
+
174
+ ## 6. Secondary playbook fails after main one finishes
175
+
176
+ **Symptom:** The primary playbook completes, then a secondary playbook starts and fails immediately on setup issues.
177
+
178
+ **Root cause:** Same as #5 — multiple playbooks present, auto-discovery picks the wrong one.
179
+
180
+ **Fix:** Same as #5: name the playbook explicitly with `--playbook=<name>` on every command.
181
+
182
+ **Verification:** Primary playbook completes cleanly. No secondary playbook nodes appear.
183
+
184
+ ---
185
+
186
+ ## 7. Pre-existing typecheck/build errors in vendored code
187
+
188
+ **Symptom:** A `typecheck` or `build` check fails identically across many nodes. The failing file isn't something the AI wrote — it was already in the repo before the run started.
189
+
190
+ **Root cause:** The playbook's typecheck check is all-or-nothing. Any pre-existing error fails every node with that check.
191
+
192
+ **Fix recipe:**
193
+
194
+ 1. **Identify the offending files:**
195
+ ```bash
196
+ pnpm typecheck 2>&1 | grep "error TS" | head -20
197
+ ```
198
+
199
+ 2. **Decide:** are these files the playbook needs? If yes, fix the types. If no (vestigial vendored code), delete them.
200
+
201
+ 3. **Delete and clean imports:**
202
+ ```bash
203
+ rm <offending-file>
204
+ # Clean imports referencing the deleted file
205
+ pnpm typecheck 2>&1 | grep -c "error TS"
206
+ ```
207
+ Repeat until count is 0.
208
+
209
+ 4. **Re-run** — previously blocked nodes will pass.
210
+
211
+ **Verification:** `pnpm typecheck` exits 0. Next run shows `CHECK_PASS` for the typecheck check.
212
+
213
+ ---
214
+
215
+ ## 8. Verification task expects browser/server E2E inside an AI spawn
216
+
217
+ **Symptom:** A task says "spin up `pnpm dev`, curl `localhost:N`, exercise pages, write a JSON report." The AI tries — runs `pnpm dev &`, curls, sometimes runs aggressive cleanups like `pkill -f "node"` (which can kill the runner itself). Times out or deadlocks.
218
+
219
+ **Root cause:** AI spawns are designed for file edits + short shell commands, not multi-process choreography. No port management, no headless browser, no reliable long-lived server lifecycle.
220
+
221
+ **Fix recipe — restructure the task:**
222
+
223
+ 1. **Drop the `allPassed === true` gate.** Replace with a "report file exists + has expected schema" check:
224
+ ```yaml
225
+ checks:
226
+ - id: report-written
227
+ cmd: "test -f e2e-verify.json && node -e \"const r=JSON.parse(require('fs').readFileSync('e2e-verify.json','utf8'));process.exit(Array.isArray(r.scenarios)&&r.scenarios.length>0?0:1)\""
228
+ ```
229
+
230
+ 2. **Reframe the task body** as "scaffold the report file, leave verdicts for human review."
231
+
232
+ 3. **If you genuinely need automated E2E**, split into two tasks:
233
+ - Task A: spawn `pnpm dev`, write pid/port file, exit.
234
+ - Task B (depends on A): read pid/port, hit endpoints, kill pid, write report.
235
+
236
+ **Verification:** Task passes its relaxed gate cleanly. No `pkill -f "node"` in any task body.
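
The inline `node -e` check in step 1 is dense, so the same rule is restated here as a small function. This is an illustrative sketch, not code from the framework, and the name `reportHasScenarios` is hypothetical.

```typescript
// Returns true when the text is valid JSON with a non-empty `scenarios` array.
// Mirrors the relaxed gate: the report exists and has the expected shape,
// without requiring any particular verdict inside it.
function reportHasScenarios(jsonText: string): boolean {
  try {
    const report: unknown = JSON.parse(jsonText);
    if (typeof report !== "object" || report === null) {
      return false;
    }
    const scenarios = (report as { scenarios?: unknown }).scenarios;
    return Array.isArray(scenarios) && scenarios.length > 0;
  } catch {
    return false; // not valid JSON
  }
}
```

A report such as `{"scenarios":[{"name":"home"}]}` passes, while an empty `scenarios` array or a file holding only `{"allPassed":true}` does not.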
237
+
238
+ ---
239
+
240
+ ## 9. Mixed-shape task: file-creation + tree-wide cleanup in one task
241
+
242
+ **Symptom:** A single node takes many attempts to converge. The check list contains both "new file X exists" (`test -f some/path.ts`) AND "no occurrences of pattern Y in src/" (`grep -r 'badPattern' src`). Each attempt scrubs a few files but new ones keep being found.
243
+
244
+ **Root cause:** Existence and negation checks converge at different rates. Existence flips false→true once when the file is written. Negation drains chunk-by-chunk over many edits.
245
+
246
+ **Fix recipe — split into creator + cleanup, two sibling nodes:**
247
+
248
+ ```yaml
249
+ # Before (one node, slow):
250
+ - id: 009-converge-event-stream
251
+ outputs:
252
+ - src/app/api/events/route.ts
253
+ checks:
254
+ - id: route-exists
255
+ cmd: "test -f src/app/api/events/route.ts"
256
+ - id: no-legacy-websocket
257
+ cmd: "test -z \"$(grep -rl 'useWebSocket' src 2>/dev/null)\""
258
+
259
+ # After (two nodes, fast):
260
+ - id: 009-converge-event-stream
261
+ outputs:
262
+ - src/app/api/events/route.ts
263
+ checks:
264
+ - id: route-exists
265
+ cmd: "test -f src/app/api/events/route.ts"
266
+
267
+ - id: 009b-purge-legacy-websocket
268
+ depends_on: [009-converge-event-stream]
269
+ checks:
270
+ - id: no-legacy-websocket
271
+ cmd: "test -z \"$(grep -rl 'useWebSocket' src 2>/dev/null)\""
272
+ ```
273
+
274
+ **Verification:** Each single-shape node converges in 1-2 attempts. No multi-attempt thrashing.
275
+
276
+ ---
277
+
278
+ ## 10. Cycle detected in DAG
279
+
280
+ **Symptom:**
281
+ ```
282
+ CYCLE_DETECTED [id1 → id2 → id3 → id1]
283
+ ```
284
+ Compile fails. The DAG has a circular dependency.
285
+
286
+ **Root cause:** `depends_on` edges form a cycle. Usually happens when two tasks each declare the other as a dependency, or a chain loops back.
287
+
288
+ **Fix recipe:**
289
+
290
+ 1. Trace the cycle shown in the error.
291
+ 2. Identify which edge is incorrect — which task does NOT actually need to depend on the other.
292
+ 3. Remove or fix the `depends_on` entry in the offending TASK.md or playbook.yml.
293
+ 4. Re-validate the graph:
294
+ ```bash
295
+ converge run --playbook=<name> --dry
296
+ ```
297
+
298
+ **Verification:** Dry run succeeds. No cycle error is reported.
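
When the reported cycle is long, it can help to trace it outside the runner. The sketch below is a generic depth-first search over `depends_on` edges; it is not the framework's own detector, and the `Graph` shape (node id mapped to the ids it depends on) is an assumption for illustration.

```typescript
// nodeId -> ids listed in its depends_on (hypothetical in-memory shape).
type Graph = Record<string, string[]>;

// Returns one cycle as a path that starts and ends on the same id, or null.
function findCycle(graph: Graph): string[] | null {
  const state = new Map<string, "visiting" | "done">();
  const path: string[] = [];

  const visit = (id: string): string[] | null => {
    const seen = state.get(id);
    if (seen === "done") return null;
    if (seen === "visiting") {
      // Back edge: the cycle is the current path from the first visit of id.
      return path.slice(path.indexOf(id)).concat(id);
    }
    state.set(id, "visiting");
    path.push(id);
    for (const dep of graph[id] ?? []) {
      const cycle = visit(dep);
      if (cycle) return cycle;
    }
    path.pop();
    state.set(id, "done");
    return null;
  };

  for (const id of Object.keys(graph)) {
    const cycle = visit(id);
    if (cycle) return cycle;
  }
  return null;
}

const exampleCycle = findCycle({ id1: ["id2"], id2: ["id3"], id3: ["id1"] });
```

For the three-node loop above, `exampleCycle` is `["id1", "id2", "id3", "id1"]`, the same shape as the path printed in the `CYCLE_DETECTED` line.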
299
+
300
+ ---
301
+
302
+ ## 11. Frontier unresolved — seed spawned no children
303
+
304
+ **Symptom:**
305
+ ```
306
+ FRONTIER_UNRESOLVED <nodeId>
307
+ ```
308
+ A seed parent declared with `from_seed` and an upstream catalog was expected to spawn children, but the DAG shows zero child nodes.
309
+
310
+ **Root cause:** Either (a) the catalog file is empty/missing, or (b) the seed script errored silently, or (c) the catalog format changed and the seed didn't match any entries.
311
+
312
+ **Fix recipe:**
313
+
314
+ 1. Check the catalog file exists and has entries:
315
+ ```bash
316
+ cat <catalog-path> | jq 'length' # or equivalent
317
+ ```
318
+ 2. Run the seed script manually to see errors:
319
+ ```bash
320
+ node <playbook>/seeds/<name>/index.js
321
+ ```
322
+ 3. Fix the catalog or seed script.
323
+ 4. Re-validate and re-run:
324
+ ```bash
325
+ converge run --playbook=<name> --dry
326
+ converge run --playbook=<name> --select 'result:error+'
327
+ ```
328
+
329
+ **Verification:** Dry run succeeds. `SEED_SPAWN` events appear during run showing the expected child count.
330
+
331
+ ---
332
+
333
+ ## 12. Fingerprint mismatch cascade — all downstream re-executes
334
+
335
+ **Symptom:** An incremental run (`--select 'state:modified+'`) re-executes far more nodes than expected. Nodes that shouldn't have changed show `NODE_COMPLETE fresh` instead of `cached`.
336
+
337
+ **Root cause:** A node's fingerprint changed unexpectedly — often because a TASK.md was touched (even whitespace), a `vars:` value changed, or the manifest hash differs due to a re-compile that produced a different DAG structure.
338
+
339
+ **Fix recipe:**
340
+
341
+ 1. Check what actually changed:
342
+ ```bash
343
+ diff <(jq -S . .converge/journal/<playbook>/manifest.prev.json) <(jq -S . .converge/journal/<playbook>/manifest.json)
344
+ ```
345
+ 2. If the diff is noise (whitespace, key ordering), the fingerprint computation is too broad. This is a framework issue — surface to the user.
346
+ 3. If the diff is real (a `depends_on` edge changed, a `vars:` value updated), the cascade is correct behavior. Let it run.
347
+
348
+ **Verification:** After a clean run, the next `--select 'state:modified+'` should show all `cached` (zero `fresh`).
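
To see why a single touched TASK.md can re-execute a whole subtree, consider a toy fingerprint that mixes each node's own content with the fingerprints of its dependencies. Everything in this sketch is assumed for illustration: the FNV-1a hash, the `NodeSpec` shape, and the rule for combining upstream fingerprints are stand-ins, not the framework's actual computation.

```typescript
// FNV-1a 32-bit hash; a stand-in for whatever hash the framework really uses.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

interface NodeSpec {
  id: string;
  body: string;        // stands in for the TASK.md content and vars
  dependsOn: string[];
}

// Assumes `nodes` is already in topological order (dependencies first).
function fingerprints(nodes: NodeSpec[]): Map<string, string> {
  const result = new Map<string, string>();
  for (const node of nodes) {
    const upstream = node.dependsOn.map((dep) => result.get(dep) ?? "missing");
    result.set(node.id, fnv1a([node.body].concat(upstream).join("\n")));
  }
  return result;
}

const baseNodes: NodeSpec[] = [
  { id: "01-a", body: "write file", dependsOn: [] },
  { id: "02-b", body: "use file", dependsOn: ["01-a"] },
  { id: "03-c", body: "independent", dependsOn: [] },
];
// Same DAG, but 01-a gained a trailing space.
const touchedNodes = baseNodes.map((n) =>
  n.id === "01-a" ? { ...n, body: n.body + " " } : n
);

const before = fingerprints(baseNodes);
const after = fingerprints(touchedNodes);
// Ids whose fingerprint differs between the two runs.
const changedIds = baseNodes
  .map((n) => n.id)
  .filter((id) => before.get(id) !== after.get(id));
```

Under these assumptions the whitespace edit to `01-a` also changes `02-b`, while the unrelated `03-c` keeps its fingerprint. That is the cascade pattern to look for in the manifest diff from step 1.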
349
+
350
+ ---
351
+
352
+ ## When NONE of these match
353
+
354
+ If your symptom isn't covered above:
355
+
356
+ 1. **Read the node forensics:**
357
+ ```bash
358
+ ls .converge/journal/<playbook>/tasks/<nodeId>/
359
+ cat .converge/journal/<playbook>/tasks/<nodeId>/FEEDBACK.md
360
+ cat .converge/journal/<playbook>/tasks/<nodeId>/LEARN.md
361
+ ```
362
+ 2. **Check the event stream** around the failure:
363
+ ```bash
364
+ grep -E "NODE_FAIL|CHECK_FAIL|ERROR" .converge/journal/<playbook>/events.jsonl | tail -20
365
+ ```
366
+ 3. **Surface to the user** with: failing node ID, exact event lines, what you've tried, your hypothesis, and a proposed fix.
367
+ 4. Wait for approval before applying any patch.
@@ -0,0 +1,303 @@
1
+ ---
2
+ name: converge-development
3
+ description: Use when the user wants to develop, debug, or improve the converge framework itself — running an example as a test bed, running the self-improvement loop, observing framework behavior, diagnosing framework bugs, and editing source under packages/. Triggers on phrases like "debug converge", "fix the framework", "run the self-improvement loop", "autonomous framework improvement", "why does the runner do X", "improve the journal", "add a feature to the CLI", "use this example to find bugs in converge".
4
+ ---
5
+
6
+ # Converge Development — observe-diagnose-fix the framework itself
7
+
8
+ ## Purpose
9
+
10
+ Use a real example playbook as a test bed. Run it. Watch what the framework does internally — not just the stdout event stream, but the target directory, runstate, and per-attempt forensics the runner writes to disk. When the framework misbehaves (crashes, corrupts state, fails to retry, mishandles a provider response), trace the symptom to the package and module responsible, patch `packages/**` source, rebuild, and re-run the example to verify.
11
+
12
+ This skill is **only** for changes to framework source under `packages/` or for running framework-improvement playbooks that target `packages/`. It is the framework-developer counterpart to `converge-control` (which babysits a *user's* playbook and treats the framework as a black box).
13
+
14
+ ## Two modes
15
+
16
+ - **Interactive:** reproduce a named framework bug, patch `packages/**`, rebuild, verify.
17
+ - **Autonomous:** run `self-improvement-loop` for bounded framework hardening, then use its artifacts as the evidence trail.
18
+
19
+ ## When to invoke
20
+
21
+ Trigger on user requests like:
22
+
23
+ - "Debug converge using <example>" / "Use this example to find bugs in the framework"
24
+ - "Why does the DAG runner <do X>?" / "Why is the execution <doing Y>?"
25
+ - "Fix the framework — <symptom>" / "There's a bug in the manifest/target/seed/CLI"
26
+ - "Improve <subsystem>" / "Add a feature to the CLI" / "Refactor a DAG action"
27
+ - "Run the self-improvement loop" / "Autonomously improve the framework"
28
+ - "Profile / instrument / add logging to <module>"
29
+
30
+ Do **not** invoke for:
31
+
32
+ - Running a *user's* playbook to completion → **`converge-control`**
33
+ - Fixing a stuck user playbook (stale outputs, stall, foreign-playbook hijack) → **`converge-control`**
34
+ - Designing a new playbook or setting up `.converge/` from scratch → **`converge-planning`**
35
+
36
+ If the symptom is purely user-shape (the playbook author made a mistake), route to `converge-control`. If the symptom is framework-shape (the runner mishandles a *valid* user playbook), continue here.
37
+
38
+ ## Autonomous mode: self-improvement-loop
39
+
40
+ Run bounded framework hardening with:
41
+
42
+ ```bash
43
+ converge run --playbook=self-improvement-loop --select improve+
44
+ ```
45
+
46
+ Use only these surfaces unless debugging the playbook itself:
47
+
48
+ - source: `.converge/playbooks/self-improvement-loop/README.md`, `tasks/improve/TASK.md`, `tasks/improve/seeds/epoch.seed.js`, `scripts/*.mjs`;
49
+ - evidence: `.converge/artifacts/self-improvement-loop/{journal.md,metrics.jsonl,backlog.jsonl,touched-files.jsonl,convergence.md,epochs/<NNN>/}`.
50
+
51
+ Keep epochs maintainer-grade: clean non-artifact start, real observations before selection, one evidence-backed framework change, patch manifest from `git diff`, mapped regression commands, command-backed `verify/result.json`, and stop rather than repeat low-value cleanup.
52
+
53
+ If the loop exposes a clear framework bug, use the interactive dev loop below for the patch and let the playbook verify the epoch.
54
+
55
+ ## The dev loop
56
+
57
+ Eight steps, in order. Stay in this loop until the example passes cleanly or you hit a structural decision that needs the user.
58
+
59
+ ### 1. Pick a test bed
60
+
61
+ If the user named a test fixture or example in the trigger phrase, use it. Otherwise, pick the smallest one that exercises the suspected subsystem; see the fixture→subsystem table at the bottom of `reference/framework-map.md`.
62
+
63
+ **Test fixtures** (under `tests/`) are the primary dev-loop test beds — they're small, fast, and have corresponding vitest runners. Prefer these for most framework debugging:
64
+
65
+ | Subsystem | Fixture |
66
+ |-----------|---------|
67
+ | Navigator / convergence | `tests/test-simple-run` |
68
+ | Compile / discovery / manifest | `tests/test-compile-discover` |
69
+ | Multi-provider / agentfn routing | `tests/test-mixed-model` |
70
+ | Seed / dynamic spawn | `tests/test-seeding`, `tests/test-queue-pattern` |
71
+ | Gap detection (input/output) | `tests/test-gap-blocked-input`, `tests/test-gap-missing-output` |
72
+ | Buggy-check relaxation | `tests/test-buggy-check` |
73
+ | Loop detection | `tests/test-loop-detection` |
74
+ | Multi-attempt convergence | `tests/test-multi-attempt` |
75
+ | Crash-safe resume | `tests/test-resume` |
76
+
77
+ **Full examples** (under `examples/`) are heavier multi-phase projects. Use when debugging end-to-end behavior that doesn't surface in a single fixture.
78
+
79
+ ### 2. Build current state
80
+
81
+ ```bash
82
+ cd <repo-root>
83
+ pnpm build
84
+ ```
85
+
86
+ Confirm it exits cleanly. **If the build is already broken, that *is* the first bug**: skip to step 5 with the build error as the symptom.
87
+
88
+ For faster iteration when changes are scoped to specific packages, build only those:
89
+
90
+ ```bash
91
+ pnpm --filter @openplaybooks/converge-core build && pnpm --filter @openplaybooks/converge build
92
+ ```
93
+
94
+ ### 3. Run the test bed & monitor
95
+
96
+ From the test fixture or example directory:
97
+
98
+ ```bash
99
+ cd tests/<fixture-name>
100
+ node <repo-root>/packages/cli/dist/index.js playbook validate default
101
+ node <repo-root>/packages/cli/dist/index.js run --playbook=default --dry
102
+ node <repo-root>/packages/cli/dist/index.js run --playbook=default
103
+ ```
104
+
105
+ If the fixture uses a non-default playbook name, swap `default` for the actual name.
106
+
107
+ Common flags for debugging:
108
+
109
+ | Flag | Use |
110
+ |---|---|
111
+ | `--force` | Force-run a task even if completed/cached |
112
+ | `--select <expr>` | Run only matching tasks (`--select '02-something+'` = task + descendants) |
113
+ | `--dry` | Plan only — show what would execute without running |
114
+ | `--full-refresh` | Ignore fingerprints, re-execute everything |
115
+ | `--verbose, -v` | Verbose output |
116
+
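
The trailing `+` in a selector such as `'02-something+'` means "this task plus its descendants". A minimal sketch of that expansion over `depends_on` edges follows. It is not the CLI's selector implementation: the function name and the `DependsOn` map are hypothetical, and prefixed forms like `result:error+` are out of scope here.

```typescript
// nodeId -> ids listed in its depends_on (hypothetical in-memory shape).
type DependsOn = Record<string, string[]>;

// Returns the root plus every node that transitively depends on it, sorted by id.
function selectWithDescendants(root: string, dependsOn: DependsOn): string[] {
  const selected = new Set<string>([root]);
  let grew = true;
  // Keep sweeping until a full pass adds nothing new.
  while (grew) {
    grew = false;
    for (const [id, deps] of Object.entries(dependsOn)) {
      if (!selected.has(id) && deps.some((dep) => selected.has(dep))) {
        selected.add(id);
        grew = true;
      }
    }
  }
  return Array.from(selected).sort();
}

const exampleSelection = selectWithDescendants("02-something", {
  "01-setup": [],
  "02-something": ["01-setup"],
  "03-child": ["02-something"],
  "04-grandchild": ["03-child"],
  "05-sibling": ["01-setup"],
});
```

In this example `exampleSelection` contains `02-something`, `03-child`, and `04-grandchild`; the upstream `01-setup` and the unrelated `05-sibling` are left out.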
117
+ Arm a Monitor on the event stream:
118
+
119
+ ```bash
120
+ tail -f .converge/journal/<playbook>/events.jsonl | grep -E '(NODE_START|NODE_COMPLETE|NODE_FAIL|CHECK_FAIL|ERROR)'
121
+ ```
122
+
123
+ Then — and this is what makes this skill different from `converge-control` — also read the *internal* state:
124
+
125
+ ```bash
126
+ # DAG state after run
127
+ cat .converge/journal/<playbook>/runstate.json
128
+
129
+ # Per-task forensics
130
+ ls .converge/journal/<playbook>/tasks/<taskId>/
131
+ cat .converge/journal/<playbook>/tasks/<taskId>/FEEDBACK.md
132
+ cat .converge/journal/<playbook>/tasks/<taskId>/LEARN.md
133
+ ```
134
+
135
+ Full observability surface: **`reference/observability.md`**.
136
+
137
+ ### 4. Classify the symptom
138
+
139
+ | Symptom shape | Class | Action |
140
+ |---|---|---|
141
+ | Example completes cleanly, no anomalies | none | nothing to fix; ask the user what they wanted to investigate |
142
+ | Stale paths, missing inputs from user playbook | user-shape | wrong skill; route to **`converge-control`** |
143
+ | DAG runner crashes / unhandled exception during execution | framework | continue to step 5 |
144
+ | Runstate corruption (node status flip-flops, fingerprint mismatch cascade) | framework | continue to step 5 |
145
+ | Seed spawn fails despite valid `seeds/index.js` | framework | continue to step 5 |
146
+ | agentfn provider throws on a valid response | framework | continue to step 5 |
147
+ | Node retries without progress (same CHECK_FAIL across attempts) | framework | continue to step 5 |
148
+ | Fingerprint caching broken (unchanged node re-executed unnecessarily) | framework | continue to step 5 |
149
+ | CLI arg parsing / exit code wrong | framework | continue to step 5 |
150
+
151
+ ### 5. Diagnose
152
+
153
+ Open **`reference/framework-map.md`**. Find the subsystem that owns the symptom. Read the source files listed there. Form a hypothesis.
154
+
155
+ Then check **`troubleshooting/playbook.md`** for a matching past entry. If found → apply the recipe.
156
+
157
+ If the diagnosis is straightforward and confined to one file, proceed. If it crosses package boundaries (e.g. `core/navigator` ↔ an agentfn provider, or `core/journal` ↔ `cli/commands-clean`), **STOP and surface the hypothesis to the user before editing**. Same escalation pattern as `converge-control`.
158
+
159
+ ### 6. Edit + rebuild
160
+
161
+ Patch `packages/**`. Then rebuild — the CLI runs from `dist/`, not source:
162
+
163
+ ```bash
164
+ # whole monorepo (safe default)
165
+ pnpm build
166
+
167
+ # or single package (faster when the change is scoped)
168
+ pnpm --filter @openplaybooks/<package-name> build
169
+ ```
170
+
171
+ ### 7. Verify
172
+
173
+ Clear target state from the failing run (so you're testing the fix, not a stale runstate):
174
+
175
+ ```bash
176
+ # Remove runtime state for a clean re-run
177
+ rm -rf tests/<fixture>/.converge/journal
178
+ rm -rf tests/<fixture>/.converge/inventory
179
+ # Also clean output files the fixture may have produced
180
+ rm -f tests/<fixture>/*.txt
181
+ ```
182
+
183
+ Or use the CLI for targeted cleanup:
184
+
185
+ ```bash
186
+ node packages/cli/dist/index.js clean --select '*' --dir=tests/<fixture>
187
+ ```
188
+
189
+ Re-run from step 3. Confirm:
190
+
191
+ - Original symptom is gone.
192
+ - No new symptoms appeared.
193
+ - The run exits 0 cleanly.
194
+
195
+ **Run the existing vitest suite** for the subsystem you touched:
196
+
197
+ ```bash
198
+ # Run tests for the fixture you're using
199
+ npx vitest run tests/<fixture-related>.test.ts
200
+
201
+ # Or run all tests (slower, use for hot-path changes)
202
+ npx vitest run tests/
203
+ ```
204
+
205
+ If no vitest runner exists for the fixture, create one (see `tests/compile-discover.test.ts` for the pattern — compile + run + verify outputs).
206
+
207
+ If the symptom returns or a new one shows up → loop back to step 5.
208
+
209
+ ### 8. Record the recipe
210
+
211
+ Append a new entry to **`troubleshooting/playbook.md`** in the format established there: **Symptom** / **Root cause** / **Fix** / **Verification** / **Files touched**. Skip if the fix was a one-off typo. The point is to grow institutional memory so the *next* invocation of this skill recognizes the symptom faster.
212
+
213
+ ## Hard rules — STOP and re-route
214
+
215
+ - **Don't edit framework source without first reproducing the bug against an example.** No speculative fixes. The reproducible run is also the verification baseline for step 7.
216
+ - **Don't skip `pnpm build` between source edit and re-run.** The CLI binary runs from `packages/cli/dist/index.js`, not source. Edits to `packages/**/src/*.ts` have zero effect until rebuilt.
217
+ - **Don't `--full-refresh` the example mid-debug.** That ignores fingerprints and can mask caching bugs. Use `rm -rf .converge/journal/<playbook> .converge/inventory/<playbook>` to clear state for a clean re-run.
218
+ - **Don't bundle unrelated improvements.** One bug, one patch (CLAUDE.md §3 — surgical changes). If you notice adjacent dead code or a refactor opportunity, mention it to the user; don't ship it in the diagnostic fix.
219
+ - **Don't run `pnpm test` as a gate for every edit.** Too slow for the dev loop. But if your fix touches a hot path — `core/src/dag/`, `core/src/manifest/`, `core/src/journal/` — flag that to the user and suggest *they* run `pnpm test` before commit.
220
+ - **Don't leave `console.log` debugging in the source.** If you added logging to diagnose, remove it before declaring the fix done.
221
+ - **Apply known recipes; ask before novel ones.** If `troubleshooting/playbook.md` has a matching entry → apply and continue. If it doesn't, and the diagnosis crosses package boundaries → STOP, state hypothesis, wait for approval.
222
+ - **Use current terminology.** Runtime state lives under `.converge/journal/<playbook>/`, spawned-task inventory under `.converge/inventory/<playbook>/`, and outputs under `.converge/artifacts/<playbook>/`. Use `runstate.json`, not `checkpoint.json`. Use `DAG node`, not `epic`. Use `fingerprint caching`, not `resume checkpoint`.
223
+
224
+ ## Testing
225
+
226
+ ### Running tests
227
+
228
+ ```bash
229
+ # All root-level integration tests
230
+ npx vitest run tests/
231
+
232
+ # Single test file (fast feedback)
233
+ npx vitest run tests/playbook-compile.test.ts
234
+
235
+ # Watch mode (re-run on file changes)
236
+ npx vitest tests/playbook-compile.test.ts
237
+
238
+ # Per-package unit tests
239
+ pnpm --filter @openplaybooks/converge-core test
240
+ pnpm --filter @openplaybooks/agentfn test
241
+
242
+ # Full monorepo test suite
243
+ pnpm test
244
+ ```
245
+
246
+ ### Test file anatomy
247
+
248
+ Root-level tests live in `tests/*.test.ts`. They follow a pattern:
249
+
250
+ ```ts
251
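+ import { spawnSync } from "node:child_process";
+ import { existsSync, readFileSync } from "node:fs";
+ import { resolve } from "node:path";
+ import { expect } from "vitest";
+
+ // REPO_ROOT, PROJECT_DIR, and manifestPath are placeholders that each test defines for its fixture.
+
+ // 1. Spawn converge CLI with spawnSync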
252
+ const CLI = resolve(__dirname, "..", "packages/cli/dist/index.js");
253
+ const result = spawnSync("node", [CLI, "run", "--dir=<dir>"], {
254
+ cwd: REPO_ROOT, encoding: "utf-8",
255
+ stdio: ["ignore", "pipe", "pipe"],
256
+ });
257
+
258
+ // 2. Verify outputs on disk
259
+ expect(existsSync(resolve(PROJECT_DIR, "EXPECTED_OUTPUT.txt"))).toBe(true);
260
+
261
+ // 3. Verify journal/manifest state
262
+ const manifest = JSON.parse(readFileSync(manifestPath, "utf-8"));
263
+ expect(manifest.nodes["task-id"]).toBeDefined();
264
+ ```
265
+
266
+ **Key conventions:**
267
+ - Fixtures live under `tests/test-<name>/` with full `.converge/` structure
268
+ - Clean journal before each test (`beforeAll`), clean outputs after
269
+ - Use `describe.skip` + binary check for tests requiring external CLIs (claude, codex)
270
+ - `vitest.config.ts` has `fileParallelism: false` — tests run serially, safe to share fixture dirs
271
+ - For compile-only tests, use the parameterized pattern from `tests/playbook-compile.test.ts`
272
+ - For DAG structure tests, use the pattern from `tests/playbook-dag.test.ts`
273
+ - For seed/structure tests (no AI needed), use the pattern from `tests/playbook-seeds.test.ts`
274
+
275
+ ### When to add tests
276
+
277
+ - **Always** when fixing a bug that manifested in a specific fixture — add a regression test
278
+ - **Always** when adding a new config schema field (`ai:`, new frontmatter key) — add a compile test
279
+ - **Optionally** when the fix is a comment, error message, or logging change
280
+ - **Never** skip adding a test for a bug that reproduces deterministically
281
+
282
+ ## Hand-off
283
+
284
+ | Situation | Hand off to |
285
+ |---|---|
286
+ | User wants to *run* a user playbook (not develop the framework) | **`converge-control`** |
287
+ | User wants bounded autonomous framework improvement | run `self-improvement-loop` here, then use its artifacts as evidence |
288
+ | User wants to design a new playbook | **`converge-planning`** |
289
+ | Bug is in the user's example/playbook (TASK.md typo, missing input, wrong path) | **the user** — surface it, don't patch the framework around bad user data |
290
+ | Fix touches a hot path and needs full test coverage before merge | **the user** — flag the path, suggest `pnpm test` |
291
+
292
+ ## File map
293
+
294
+ ```
295
+ SKILL.md (this file — entry point and dev loop)
296
+ reference/
297
+ framework-map.md (subsystem → packages/ location → symptoms → reproducer)
298
+ observability.md (what to read on disk during a run)
299
+ troubleshooting/
300
+ playbook.md (symptom → root cause → fix recipes; grows over time)
301
+ ```
302
+
303
+ Load **one** file per gap. Return here between.