@openplaybooks/converge 0.2.0

@@ -0,0 +1,294 @@
1
+ # Framework map — where things live
2
+
3
+ A subsystem→location→symptom cheat sheet for diagnosing framework bugs. Use it in step 5 of the dev loop.
4
+
5
+ Repo root: `/Users/minh/Documents/converge`. All paths below are relative to root unless noted.
6
+
7
+ ## Monorepo layout
8
+
9
+ ```
10
+ packages/
11
+ core/ framework engine (navigator, gap detection, journal, checkpoint, planning, …)
12
+ cli/ `converge` command — arg parsing, subcommands, output formatting
13
+ agentfn/ unified agent function — single callable across all AI providers
14
+ claudefn/ Claude provider — spawns `claude` CLI programmatically
15
+ acpfn/ ACP provider — wraps Agent Client Protocol SDK
16
+ kimifn/ Kimi provider
17
+ qwenfn/ Qwen provider
18
+ geminifn/ Gemini provider
19
+ openfn/ Opencode AI provider
20
+ navigator/ generic graph-driven state machine / convergence loop
21
+ codets/ code-generation utilities (fluent TS/JSX/MD emitter)
22
+ project-root/ canonical project-root resolver (finds nearest `.converge/`)
23
+ provider-benchmark/ deep journal analysis for comparing AI providers
24
+ swebench/ SWE-bench Lite evaluation runner
25
+ tbench/ terminal-bench evaluation runner
26
+ studio/ (reserved)
27
+ ```
28
+
29
+ The CLI binary is `packages/cli/dist/index.js`. The runtime entry from the binary is `packages/cli/src/main.ts` → individual `commands-*.ts` files.
30
+
31
+ ## Subsystem → location → symptoms
32
+
33
+ ### Navigator (convergence engine)
34
+ - **Source:** `packages/core/src/navigator/` — `core/navigator.ts`, `core/actions/`, `repair/strategies/`, `repair/agent-runner.ts`
35
+ - **Key files:**
36
+ - `packages/core/src/navigator/repair/agent-runner.ts` — runs AI agents, resolves AI config, emits `AGENT_START/COMPLETE/FAILED` events
37
+ - `packages/core/src/navigator/repair/strategies/task-run.ts` — primary task execution strategy (builds prompt, calls `runAgent`)
38
+ - `packages/core/src/navigator/repair/strategies/seed-script-repair.ts` — seed script auto-repair
39
+ - `packages/core/src/navigator/repair/strategy-catalog.ts` — maps gap kinds to fix strategies
40
+ - **Symptoms:**
41
+ - Node stuck in `buffered` / `executing` status across iterations
42
+ - Action phases fire out of order (preflight skipped, response duplicated)
43
+ - Stall detection misfires (declares stall when progress is visible, or fails to detect repeating failures)
44
+ - Navigate iterates without progress (gap unchanged across iterations)
45
+ - Per-task `agent:` field ignored — all tasks use default provider
46
+ - **Reproduce against:** `tests/test-simple-run` (smallest), `tests/test-loop-detection` (stall), `tests/test-mixed-model` (provider routing)
47
+ - **Watch:** stdout `🤖 AI Provider:` lines, per-task `events.jsonl`, per-attempt `logs/events.jsonl`
48
+
49
+ ### Task discovery & resolution (TASK.md → Unit → DAG)
50
+ - **Source:** `packages/core/src/task/discovery/static-children.ts` (folder-scan for `\d{2,3}-` prefixed subdirectories), `packages/core/src/task/unit/factories.ts` (Unit.fromPath), `packages/core/src/task/unit/unit.ts` (Unit class)
51
+ - **Also:** `packages/core/src/task/unit/resolve.ts` — `resolveAgent`, `resolvePrompt`, `resolveTaskAI`, `resolveSkill`
52
+ - **Also:** `packages/core/src/task/unit/find-gaps.ts` — gap detection from Unit state; `packages/core/src/task/unit/fix-gaps.ts` — gap resolution
53
+ - **Symptoms:**
54
+ - Children not discovered despite valid prefix subdirectories
55
+ - `ai:` block in TASK.md ignored (provider falls back to default)
56
+ - Sort order wrong (prefix parsing broken)
57
+ - Gaps double-counted or missing
58
+ - **Reproduce against:** `tests/test-compile-discover` (child discovery), `tests/test-mixed-model` (ai: block), `tests/playbook-compile.test.ts` (compile suite)
59
+ - **Watch:** compile output (`Compiled default: N nodes`), manifest.json `parent_map`
60
+
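The prefix scan and the "sort order wrong" symptom can be illustrated with a small sketch. It assumes the scanner keeps only directory names that start with `\d{2,3}-` and orders them by the numeric value of the prefix; `parsePrefix` and `sortChildren` are hypothetical names for illustration, not the functions in `static-children.ts`, and the real ordering rule may differ.

```typescript
// Hypothetical helpers for illustration; not the actual static-children.ts API.
const PREFIX_RE = /^(\d{2,3})-/;

/** Returns the numeric prefix of a directory name, or null if it has none. */
function parsePrefix(dirName: string): number | null {
  const digits = PREFIX_RE.exec(dirName)?.[1];
  return digits === undefined ? null : Number.parseInt(digits, 10);
}

/** Keeps only prefixed names; orders by numeric prefix, then by name for ties. */
function sortChildren(dirNames: string[]): string[] {
  return dirNames
    .map((name) => ({ name, prefix: parsePrefix(name) }))
    .filter((entry): entry is { name: string; prefix: number } => entry.prefix !== null)
    .sort((a, b) => a.prefix - b.prefix || a.name.localeCompare(b.name))
    .map((entry) => entry.name);
}

// "notes" and "1-bad" are dropped; 10 sorts before 99 despite "099" < "10" as strings.
console.log(sortChildren(["099-late", "notes", "10-early", "1-bad"]));
```

If the observed order matches a plain string sort instead (here `099-late` first), that points at prefix parsing rather than discovery.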
61
+ ### TASK.md parsing & schema
62
+ - **Source:** `packages/core/src/config/task-md-definition.ts` — `parseTaskMd`, `parseTaskMdString`, `parseFrontmatterToTaskMdDef`, `mapTaskMdToTaskDefinition`, `RESERVED_KEYS`
63
+ - **Also:** `packages/core/src/config/task-definition.ts` — `TaskDefinition` interface, `TaskAIConfig`, builder
64
+ - **Also:** `packages/core/src/config/declarative-loader.ts` — playbook loading, `resolveTaskDef`, `loadTaskFile`
65
+ - **Also:** `packages/core/src/task/playbook/loader.ts` — playbook check parsing and `scripts/`-path validation
66
+ - **Symptoms:**
67
+ - Frontmatter field silently ignored (not in `RESERVED_KEYS`, falls through to `vars`)
68
+ - `ai:` block parsed but not mapped (missing from `mapTaskMdToTaskDefinition`)
69
+ - Legacy `type: test` or `.test.md` content still appears in a playbook and now fails hard
70
+ - **Reproduce against:** `tests/test-mixed-model` (ai: block), `tests/playbook-compile.test.ts` (compile)
71
+ - **Watch:** compile manifest `nodes[].agent` field, `parseTaskMdString` return shape
72
+
73
+ ### DAG & manifest
74
+ - **Source:** `packages/core/src/dag/` — `dag-node.ts` (DagNode), `task-dag.ts` (TaskDag), `dag-tree.ts` (execution tree)
75
+ - **Also:** `packages/core/src/manifest/` — `types.ts` (Manifest, ManifestNode, RunState), `writer.ts`, `reader.ts`
76
+ - **Also:** `packages/cli/src/commands-compile.ts` — compile command, manifest/runstate writing
77
+ - **Symptoms:**
78
+ - Wrong node count after compile
79
+ - Parent-child relationships incorrect in manifest
80
+ - Run fails with "No compiled manifest found"
81
+ - Frontier count wrong
82
+ - **Reproduce against:** `tests/test-compile-discover`, `tests/playbook-dag.test.ts`
83
+ - **Watch:** manifest.json (`nodes`, `parent_map`, `child_map`), runstate.json
84
+
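When parent-child relationships look wrong, a quick consistency check between `parent_map` and `child_map` can narrow things down. The sketch below assumes `parent_map` maps a child id to its parent id and `child_map` maps a parent id to an array of child ids; those shapes are a guess from the field names, so check `packages/core/src/manifest/types.ts` before relying on it. `findMapMismatches` is a hypothetical helper.

```typescript
// Assumed shapes, not confirmed against manifest/types.ts.
interface ManifestMaps {
  parent_map: Record<string, string>;  // child id -> parent id (assumption)
  child_map: Record<string, string[]>; // parent id -> child ids (assumption)
}

/** Returns one message per disagreement between the two maps. */
function findMapMismatches(m: ManifestMaps): string[] {
  const problems: string[] = [];
  // Every parent_map entry should be mirrored in child_map.
  for (const [child, parent] of Object.entries(m.parent_map)) {
    if (!(m.child_map[parent] ?? []).includes(child)) {
      problems.push(`${child}: parent_map says ${parent}, but child_map[${parent}] omits it`);
    }
  }
  // Every child_map entry should be mirrored in parent_map.
  for (const [parent, children] of Object.entries(m.child_map)) {
    for (const child of children) {
      if (m.parent_map[child] !== parent) {
        problems.push(`${child}: listed under child_map[${parent}], but parent_map disagrees`);
      }
    }
  }
  return problems;
}

console.log(findMapMismatches({ parent_map: { b: "a" }, child_map: { a: ["b"] } }));
```

An empty result means the two maps agree with each other; it says nothing about whether they match the folder structure on disk.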
85
+ ### AI config resolution
86
+ - **Source:** `packages/core/src/ai/factory.ts` — `resolveAIConfig`, `listAIProviders`, `createAIFactory`, multi-provider config
87
+ - **Symptoms:**
88
+ - `provider:` field in project.yaml ignored
89
+ - Multi-provider config falls back to default even when task specifies `agent:`
90
+ - `preferredProvider` not passed through from task metadata
91
+ - **Reproduce against:** `tests/test-mixed-model`
92
+ - **Watch:** stdout `🤖 AI Provider:` lines, `AI config type:`, `Providers:` debug lines
93
+
94
+ ### Executor (task execution within navigator)
95
+ - **Source:** `packages/core/src/navigator/core/actions/execution/run-executor.ts`, `packages/core/src/navigator/core/actions/execution/`
96
+ - **Symptoms:**
97
+ - Task spawn fails / process never starts
98
+ - Wrong exit code interpretation (success treated as failure or vice versa)
99
+ - Hangs after spawn (no event stream, no timeout)
100
+ - Execution skips seed-spawned children
101
+ - Skill symlink setup/teardown fails silently
102
+ - **Reproduce against:** `examples/hello-world` for single-task path, any example with seed for parallel spawn
103
+ - **Watch:** process spawn lines in stdout, per-attempt `logs/events.jsonl`
104
+
105
+ ### Journal
106
+ - **Source:** `packages/core/src/journal/`
107
+ - **Key file:** `packages/core/src/journal/structure.ts` (path layout and file-type mapping)
108
+ - **Symptoms:**
109
+ - Materialized TASK.md missing or stale in journal
110
+ - Attempt log files not written
111
+ - `events.jsonl` truncated mid-write
112
+ - Status file (`status.json`) corrupted or stale
113
+ - Gap snapshot (`gaps.yml`) missing
114
+ - **Reproduce against:** any example; `examples/hello-world` makes the file set easiest to inspect
115
+ - **Watch:** `journal/<playbook>/tasks/<taskId>/` and `attempts/<n>/`
116
+
117
+ ### Checkpoint
118
+ - **Source:** `packages/core/src/checkpoint/`
119
+ - **Symptoms:**
120
+ - Resume fails after a clean kill
121
+ - Parent stays `seeded` while all children show complete
122
+ - Status flip-flops between iterations
123
+ - `progress.completedChildren` doesn't match disk reality
124
+ - Checkpoint write fails silently (partial write, missing fields)
125
+ - **Reproduce against:** `tests/test-resume`, fixtures with seed children (e.g. `tests/test-seeding`)
126
+ - **Watch:** `journal/<playbook>/runstate.json`, `journal/<playbook>/tasks/<taskId>/status.json`
127
+
128
+ ### Seed (dynamic child spawning)
129
+ - **Source:** `packages/core/src/executor/seed-executor.ts` — `ctx.spawn()` implementation, script resolution, staged writes
130
+ - **Also:** `packages/core/src/navigator/repair/strategies/seed-script-repair.ts` — auto-repair of broken seed scripts
131
+ - **Symptoms:**
132
+ - Seed script runs but children don't appear in tree
133
+ - Children spawn but parent rollup never fires
134
+ - Seed spawns duplicate tasks across iterations
135
+ - Seed script not found (path resolution wrong)
136
+ - Seed repair fires on transient errors (429, 5xx)
137
+ - **Reproduce against:** `tests/test-seeding` (basic), `tests/test-queue-pattern` (incremental do-while), `tests/test-financial-deep-research` (multi-level)
138
+ - **Watch:** `converge list`, `journal/<playbook>/runstate.json`, and `inventory/<playbook>/tasks.jsonl`
139
+
140
+ ### Test infrastructure
141
+ - **Source:** `tests/*.test.ts` (vitest, root-level integration tests), `tests/test-*/` (fixture directories), `packages/*/tests/` (per-package unit tests)
142
+ - **Config:** `/vitest.config.ts` (root, `fileParallelism: false`), `packages/*/vitest.config.ts` (per-package)
143
+ - **Key fixtures:**
144
+ - `tests/test-simple-run` — basic single-task run
145
+ - `tests/test-compile-discover` — compile + run separation, child discovery
146
+ - `tests/test-mixed-model` — multi-provider AI routing
147
+ - `tests/test-gap-blocked-input` — dependency backoff, input gaps
148
+ - `tests/test-gap-missing-output` — output gap detection
149
+ - `tests/test-buggy-check` — buggy check relaxation
150
+ - `tests/test-loop-detection` — tool-call loop detection
151
+ - `tests/test-multi-attempt` — multi-attempt convergence
152
+ - `tests/test-queue-pattern` — incremental do-while seed
153
+ - `tests/test-seeding` — recursive seed spawning
154
+ - `tests/test-financial-deep-research` — named non-default playbook
155
+ - **Test patterns:**
156
+ 1. **Compile tests** — `converge playbook validate <name>` or `converge run --playbook=<name> --dry`, verify structure and manifest shape
157
+ 2. **DAG tests** — verify `depends_on`, `depended_on_by`, `child_map`, content hashes
158
+ 3. **Integration tests** — `converge run --playbook=<name>`, check outputs on disk
159
+ 4. **Structure tests** — verify TASK.md frontmatter, seed.js exports, playbook YAML
160
+ - **Running:** `npx vitest run tests/` (all), `npx vitest run tests/<file>` (specific file), `npx vitest` (watch mode)
161
+ - **Adding a test:** create a test fixture under `tests/test-<name>/` with `.converge/project.yaml` + `playbooks/default/` structure, then write a `.test.ts` file that compiles/runs and verifies expected outputs
162
+
163
+ ### Gap detection
164
+ - **Source:** `packages/core/src/task/gap/`
165
+ - **Key types:** `packages/core/src/task/gap/types.ts` (`GapKind`, `GapType`, `CompactGap`)
166
+ - **Key logic:** `packages/core/src/task/gap/detector.ts` (`GapDetector`, `ConvergenceAnalyzer`)
167
+ - **Gap kinds:** `plan`, `seed`, `seed-script`, `blocker`, `output`, `check-failed`, `corrupted`, `systemic`, `user-question`, `insufficient-evidence`, `contradictory-finding`, `untested-hypothesis`
168
+ - **Symptoms:**
169
+ - Gap persists across waves despite valid outputs on disk
170
+ - Wrong gap kind assigned (e.g. `seed-script` gap on a hand-written task)
171
+ - Gap score doesn't improve between waves (stall trigger)
172
+ - `detect-gaps` action returns empty when gaps clearly exist
173
+ - Input gaps not traced to upstream task outputs
174
+ - **Reproduce against:** `tests/test-buggy-check`, `tests/test-gap-blocked-input`, `tests/test-gap-missing-output`
175
+ - **Watch:** per-task `gaps.yml`, per-attempt gap events in `logs/events.jsonl`
176
+
177
+ ### Validation / checks
178
+ - **Source:** `packages/core/src/validation/`, `packages/core/src/task/checks/`
179
+ - **Also:** `packages/core/src/task/playbook/loader.ts` — explicit `cmd` checks + `scripts/` reference extraction
180
+ - **Symptoms:**
181
+ - Check passes when output is wrong / fails when output is right
182
+ - Check predicate evaluates against stale state
183
+ - Check error message uninformative
184
+ - check command points at a missing `scripts/...` helper
185
+ - legacy `type: test` / `.test.md` authoring still present
186
+ - **Reproduce against:** `tests/test-buggy-check` (check behavior), `packages/core/tests/config/playbook-loader-checks.test.ts`
187
+ - **Watch:** per-attempt `CHECK.md`, navigator `verify` action output
188
+
189
+ ### Planning / synthesis / orchestrator
190
+ - **Source:** `packages/core/src/planning/`, `packages/core/src/synthesis/`, `packages/core/src/orchestrator/`
191
+ - **Symptoms:**
192
+ - Wrong task chosen as next-task
193
+ - Phase transitions out of order
194
+ - Synthesis step produces empty / malformed output
195
+ - `plan` gap appears but re-planning produces invalid plan
196
+ - **Reproduce against:** multi-phase examples
197
+ - **Watch:** plan-related navigator actions in stdout, per-task `plan.md` in journal
198
+
199
+ ### Storage / artifacts
200
+ - **Source:** `packages/core/src/storage/`, plus on-disk `.converge/artifacts/<playbook>/`
201
+ - **Symptoms:**
202
+ - Artifact path mismatch (task writes to one path, reader expects another)
203
+ - Artifact missing despite task showing complete
204
+ - Artifact overwritten across iterations when it shouldn't be
205
+ - **Watch:** `.converge/artifacts/<playbook>/...`
206
+
207
+ ### Hooks
208
+ - **Source:** `packages/core/src/hooks/`
209
+ - **Symptoms:**
210
+ - Lifecycle hook never fires
211
+ - Hook fires twice
212
+ - Hook exception silently swallowed
213
+ - **Watch:** stdout for hook log lines, per-attempt `events.jsonl` for hook events
214
+
215
+ ### agentfn / AI providers
216
+ - **Source:** `packages/agentfn/src/` (unified interface + skill management + compose)
217
+ - **Also:** `packages/{claudefn,acpfn,kimifn,qwenfn,geminifn,openfn}/` (individual provider clients)
218
+ - **Symptoms:**
219
+ - Provider throws on a valid response shape
220
+ - Retry loop doesn't kick in on `Overloaded` / 429 / 5xx
221
+ - Retry loop *over*-retries on a permanent error (400, 401, 403)
222
+ - Token / cost accounting wrong
223
+ - Skill symlinks land in wrong directory
224
+ - Provider selection (`provider:` field) ignored
225
+ - **Reproduce against:** whichever example the user has the provider configured for (check `.converge/project.yaml`)
226
+ - **Watch:** stdout for `Overloaded`, `API Error`, `429`, `5xx` retry messages; per-attempt `logs/events.jsonl` for provider call records
227
+
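The two retry symptoms above (no retry on `Overloaded` / 429 / 5xx, over-retry on 400 / 401 / 403) both come down to how an error is classified. A minimal sketch of that split follows, assuming the provider surfaces an HTTP status and a message string; `classifyProviderError` is a hypothetical name and the actual agentfn retry logic may use different inputs.

```typescript
type ErrorClass = "transient" | "permanent" | "unknown";

/**
 * Hypothetical classifier: transient errors are worth retrying,
 * permanent ones are not. Anything else is left for the caller to decide.
 */
function classifyProviderError(status: number | undefined, message = ""): ErrorClass {
  // Rate limits and server-side failures usually clear on their own.
  if (status === 429 || (status !== undefined && status >= 500 && status <= 599)) {
    return "transient";
  }
  // Some providers report overload in the message rather than the status.
  if (/overloaded/i.test(message)) return "transient";
  // Bad request or auth failures will not succeed on retry.
  if (status === 400 || status === 401 || status === 403) return "permanent";
  return "unknown";
}

console.log(classifyProviderError(429), classifyProviderError(401));
```

When tracing a retry bug, compare the status and message recorded in the per-attempt `logs/events.jsonl` against whichever classification the provider code actually applies.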
228
+ ### CLI
229
+ - **Source:** `packages/cli/src/`
230
+ - `main.ts` — entry, arg parsing, command dispatch
231
+ - `commands-run.ts` — `run` command
232
+ - `commands-build.ts` — `build` command
233
+ - `commands-clean.ts` — `clean` command
234
+ - `commands-test.ts` — `test` command
235
+ - `commands-compile.ts` — `compile` command
236
+ - `commands-reset.ts` — `reset` command
237
+ - `commands-list.ts`, `commands-tree.ts` — list/status display
238
+ - `commands-inspect.ts` — task/session inspection
239
+ - `commands-metrics.ts` — cost metrics
240
+ - `commands-gantt.ts`, `commands-graph.ts`, `commands-journal.ts` — visualization
241
+ - `commands-validate.ts` — `verify` command
242
+ - `commands-seed.ts` — `seed` command
243
+ - `commands-playbook.ts` — playbook management
244
+ - `commands-deps.ts` — dependency management
245
+ - `autonomous-run.ts` — autonomous run loop
246
+ - `dag-run.ts` — DAG-based run
247
+ - `run-event-stream.ts` — event stream handling
248
+ - **Symptoms:**
249
+ - Wrong arg parsing / unrecognized flag
250
+ - Path-form scoping picks wrong playbook
251
+ - Exit code wrong (0 on failure, non-zero on success)
252
+ - Output formatting broken
253
+ - `--select` expression doesn't match expected tasks
254
+ - `clean --select` deletes wrong or no tasks
255
+ - **Reproduce against:** any example; pick the smallest that exposes the subcommand under test
256
+ - **Watch:** the actual CLI command's output
257
+
258
+ ## How to use this map
259
+
260
+ 1. Match the symptom to a subsystem row above.
261
+ 2. Open the listed source files.
262
+ 3. Trace the call path from the symptom backwards.
263
+ 4. Cross-reference with `troubleshooting/playbook.md` — if the symptom is recorded, follow the recipe.
264
+ 5. If the diagnosis crosses subsystem rows (e.g. navigator ↔ seed, or gap detection ↔ agentfn), surface the hypothesis to the user before editing.
265
+
266
+ ## Test fixture → subsystem mapping
267
+
268
+ Use these when picking a test bed (dev loop step 1). All paths under `tests/`:
269
+
270
+ | Fixture | Exercises |
271
+ |---------|-----------|
272
+ | `test-simple-run` | Basic task execution, single-attempt convergence |
273
+ | `test-compile-discover` | Compile/run separation, static child discovery, manifest+runtime |
274
+ | `test-mixed-model` | Multi-provider AI routing, `ai:` block, per-task provider/model config, agentfn dispatch |
275
+ | `test-buggy-check` | Buggy-check relaxation, `BUGGY_CHECK.md`, check patching |
276
+ | `test-gap-blocked-input` | DependencyBackoffStrategy, input gap detection, producer→consumer |
277
+ | `test-gap-missing-output` | Output gap detection, TaskRunStrategy re-execution |
278
+ | `test-loop-detection` | Tool-call loop detection, LEARN.md augmentation |
279
+ | `test-multi-attempt` | Multi-attempt convergence, sequential check gates |
280
+ | `test-resume` | Crash-safe resume, incremental file creation |
281
+ | `test-seeding` | Recursive seed spawning (3 levels), `ctx.spawn()` |
282
+ | `test-seed-repair` | SeedScriptRepairStrategy, broken seed auto-fix |
283
+ | `test-queue-pattern` | Incremental do-while drain, discovery, convergence |
284
+ | `test-financial-deep-research` | Named non-default playbook, multi-level seed structure |
286
+
287
+ ## Self-improvement-loop playbook
288
+
289
+ - **Run:** `converge run --playbook=self-improvement-loop --select improve+`
290
+ - **Source:** `.converge/playbooks/self-improvement-loop/` (`README.md`, `tasks/improve/TASK.md`, `tasks/improve/seeds/epoch.seed.js`, `scripts/*.mjs`)
291
+ - **Evidence:** `.converge/artifacts/self-improvement-loop/` (`journal.md`, `metrics.jsonl`, `backlog.jsonl`, `touched-files.jsonl`, `epochs/<NNN>/verify/result.json`)
292
+ - **Gate failures:** dirty start → clean non-artifact diff; selection quality → `metrics.jsonl`/`touched-files.jsonl`; patch mismatch → manifest vs non-artifact `git diff`; weak verification → changed subsystem tests.
293
+
294
+ Full examples (heavier, multi-phase) live under `examples/`.
@@ -0,0 +1,132 @@
1
+ # Observability — what to read on disk during a run
2
+
3
+ The stdout event stream tells you *what* the framework is doing. The journal and inventory tell you *why*. When debugging the framework itself, you need both.
4
+
5
+ All paths are relative to the example directory (e.g. `/Users/minh/Documents/converge/examples/hello-world`).
6
+
7
+ ## Top-level layout (per playbook run)
8
+
9
+ ```
10
+ .converge/
11
+ ├── project.yaml project + provider config
12
+ ├── playbooks/<playbook>/ playbook source
13
+ │ ├── playbook.yml
14
+ │ └── tasks/<task>/TASK.md
15
+ ├── journal/<playbook>/ runtime state — read this for diagnosis
16
+ │ ├── manifest.json compiled DAG (nodes, parent_map, child_map)
17
+ │ ├── runstate.json execution state (node status, attempts, fingerprints)
18
+ │ ├── events.jsonl playbook-level event stream
19
+ │ └── tasks/<taskId>/
20
+ │ ├── status.json task status (pending → in_progress → complete/failed)
21
+ │ ├── gaps.yml current gap snapshot
22
+ │ ├── summary.md human-readable status
23
+ │ ├── plan.md plan output (for containers)
24
+ │ ├── seed.json seed spawn record (for seed tasks)
25
+ │ ├── FEEDBACK.md latest attempt feedback
26
+ │ ├── LEARN.md accumulated learning across attempts
27
+ │ └── attempts/<n>/
28
+ │ ├── TASK.md materialized snapshot at attempt time
29
+ │ ├── CHECK.md check predicate output
30
+ │ └── logs/
31
+ │ ├── events.jsonl per-attempt event log (most detailed)
32
+ │ └── log.log raw AI session transcript
33
+ ├── inventory/<playbook>/ runtime ledger for spawned tasks
34
+ │ ├── tasks.jsonl flat task inventory
35
+ │ └── spawned/<taskId>/TASK.md rendered spawned task definitions
36
+ └── artifacts/<playbook>/ task outputs (the actual work product)
37
+ ```
38
+
39
+ ## What each file tells you
40
+
41
+ ### `journal/<playbook>/tasks/<taskId>/status.json`
42
+ Lightweight task status file. Faster to read than checkpoint for quick status checks. Contains status enum and timestamps.
43
+
44
+ ### `journal/<playbook>/tasks/<taskId>/gaps.yml`
45
+ Current gap snapshot for this task. Lists all unresolved gaps with kind, type, severity, and description. Compare across waves to see if gaps are resolving.
46
+
47
+ ### `journal/<playbook>/tasks/<taskId>/attempts/<n>/TASK.md`
48
+ Snapshot of TASK.md as the runner saw it at attempt n. **If this differs from the source `playbooks/<playbook>/tasks/<task>/TASK.md`, the runner is using a stale materialized copy** — known failure mode. Compare:
49
+
50
+ ```bash
51
+ diff .converge/playbooks/<playbook>/tasks/<task>/TASK.md \
52
+ .converge/journal/<playbook>/tasks/<taskId>/attempts/<n>/TASK.md
53
+ ```
54
+
55
+ ### `journal/<playbook>/tasks/<taskId>/attempts/<n>/CHECK.md`
56
+ What the check predicate evaluated. Use to see the gap between expected and actual output.
57
+
58
+ ### `journal/<playbook>/tasks/<taskId>/attempts/<n>/logs/events.jsonl`
59
+ The most detailed per-attempt event stream. Includes: spawn, hook fires, agentfn provider calls, check evaluation, gap detection, repair attempts. **Primary diagnostic source for navigator and provider bugs.**
60
+
61
+ ```bash
62
+ # pretty-print one attempt's events
63
+ jq . .converge/journal/<playbook>/tasks/<taskId>/attempts/<n>/logs/events.jsonl | less
64
+ ```
65
+
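If an `events.jsonl` file was truncated mid-write, `jq` stops at the first malformed line. A tolerant reader can help in that case. The sketch below is a generic JSON Lines parser that records which lines failed instead of aborting; `parseJsonl` is a hypothetical helper, and it makes no assumption about the fields inside each event.

```typescript
interface JsonlResult {
  records: unknown[];  // successfully parsed lines, in file order
  badLines: number[];  // 1-based line numbers that failed to parse
}

/** Parses JSON Lines text, skipping blank lines and collecting parse failures. */
function parseJsonl(text: string): JsonlResult {
  const records: unknown[] = [];
  const badLines: number[] = [];
  text.split("\n").forEach((line, index) => {
    if (line.trim() === "") return;
    try {
      records.push(JSON.parse(line));
    } catch {
      badLines.push(index + 1);
    }
  });
  return { records, badLines };
}

// The third line is cut off mid-object, as a partial write would leave it.
console.log(parseJsonl('{"a":1}\n{"b":2}\n{"c":'));
```

A single bad line at the very end of the file is consistent with a write that was interrupted; bad lines in the middle suggest something else, such as interleaved writers.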
66
+ ### `artifacts/<playbook>/...`
67
+ What the task actually produced. **Always cross-check checkpoint status against artifacts on disk** — checkpoint can lie; the artifact is ground truth.
68
+
69
+ ## Useful tail commands during a run
70
+
71
+ ```bash
72
+ # Playbook-level stream plus every per-attempt event stream, interleaved in one terminal
73
+ tail -f .converge/journal/<playbook>/events.jsonl &
74
+ find .converge/journal/<playbook>/tasks -path '*/logs/events.jsonl' -exec tail -f {} +
75
+
76
+ # Latest attempt's detailed events for a specific task
77
+ TASK_DIR=".converge/journal/<playbook>/tasks/<taskId>"
78
+ LATEST=$(ls -t "$TASK_DIR/attempts/" 2>/dev/null | head -1)
79
+ tail -f "$TASK_DIR/attempts/$LATEST/logs/events.jsonl"
80
+
81
+ # Quick status scan across all tasks
82
+ find .converge/journal/<playbook>/tasks -name "status.json" -exec sh -c 'echo "$1 -> $(jq -r .status "$1")"' _ {} \;
83
+ ```
84
+
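Note that `ls -t` in the snippet above picks the most recently modified attempt directory, which is usually but not necessarily the highest-numbered one. If attempt directories are plain integers, as the `attempts/<n>/` layout suggests (an assumption), selecting by number is a possible alternative. `latestAttempt` below is a hypothetical helper that takes the directory names already read from disk.

```typescript
/**
 * Returns the numerically largest all-digit name, or null if there is none.
 * Non-numeric entries (stray files, "logs", etc.) are ignored.
 */
function latestAttempt(dirNames: string[]): string | null {
  let best: string | null = null;
  let bestN = -1;
  for (const name of dirNames) {
    if (!/^\d+$/.test(name)) continue;
    const n = Number.parseInt(name, 10);
    if (n > bestN) {
      bestN = n;
      best = name;
    }
  }
  return best;
}

// Numeric comparison puts "10" after "2"; a string sort would not.
console.log(latestAttempt(["1", "2", "10", "logs"]));
```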
85
+ ## CLI introspection commands
86
+
87
+ These are also disk readers, just packaged. Use them as a faster path than reading raw JSON:
88
+
89
+ ```bash
90
+ CLI="node /Users/minh/Documents/converge/packages/cli/dist/index.js"
91
+
92
+ # Task list with status
93
+ $CLI list --playbook <name>
94
+
95
+ # Dependency graph
96
+ $CLI show graph --playbook <name>
97
+
98
+ # Inspect a specific task
99
+ $CLI inspect --playbook <name> --task <task-id>
100
+
101
+ # Show cost metrics
102
+ $CLI show metrics --playbook <name>
103
+
104
+ # Verify config and structure
105
+ $CLI playbook validate <name>
106
+
107
+ # Show journal for a playbook
108
+ $CLI show journal --playbook <name>
109
+ ```
110
+
111
+ ## Self-improvement-loop artifacts
112
+
113
+ Read `.converge/artifacts/self-improvement-loop/` as the autonomous loop's evidence trail: `journal.md`, `metrics.jsonl`, `backlog.jsonl`, `touched-files.jsonl`, `convergence.md`, and `epochs/<NNN>/verify/result.json`.
114
+
115
+ ```bash
116
+ latest=$(ls -1 .converge/artifacts/self-improvement-loop/epochs | sort -n | tail -1)
117
+ jq . .converge/artifacts/self-improvement-loop/epochs/$latest/verify/result.json
118
+ jq -c . .converge/artifacts/self-improvement-loop/metrics.jsonl
119
+ jq -r .file .converge/artifacts/self-improvement-loop/touched-files.jsonl | sort | uniq -c | sort -rn
120
+ ```
121
+
122
+ If a gate fails, read the matching script in `.converge/playbooks/self-improvement-loop/scripts/`; do not weaken checks or hand-edit evidence to pass.
123
+
124
+ ## Adding temporary diagnostic logging
125
+
126
+ When the on-disk surface isn't enough, add `console.log` in the relevant `packages/core/src/<subsystem>/` file. Then:
127
+
128
+ 1. `pnpm --filter @openplaybooks/converge-core build` (faster than full `pnpm build`).
129
+ 2. `rm -rf .converge/journal/<playbook> .converge/inventory/<playbook>` to clear runtime state.
130
+ 3. Re-run the example.
131
+
132
+ **Remove the `console.log` before declaring the fix done.** Don't ship debugging output. If the module already has a real logger, prefer that over raw `console.log`.
@@ -0,0 +1,213 @@
1
+ # Framework troubleshooting playbook
2
+
3
+ Symptom → root cause → fix recipes for **framework bugs** (under `packages/`). Distinct from `converge-control`'s playbook, which covers user-playbook bugs.
4
+
5
+ **This file grows.** Append new entries each time the dev loop fixes a novel framework bug (step 8 of `SKILL.md`). Never delete an entry unless the bug class no longer applies (e.g. the subsystem was rewritten and the symptom is impossible).
6
+
7
+ ## Entry format
8
+
9
+ Each entry follows this structure. Copy it when adding a new one:
10
+
11
+ ```markdown
12
+ ## <symptom in one sentence, as the user / operator would describe it>
13
+
14
+ **Symptom**
15
+ - What you observe in stdout / journal / checkpoint / artifacts
16
+ - Exact log lines if available
17
+
18
+ **Root cause**
19
+ - The actual code-level reason. Cite file paths.
20
+
21
+ **Fix**
22
+ - The patch applied. Cite file paths and what changed (one or two sentences, not a diff).
23
+
24
+ **Verification**
25
+ - Exact commands run after the fix to confirm
26
+ - What you expected to see in the output
27
+
28
+ **Files touched**
29
+ - List of `packages/**` files modified
30
+ ```
31
+
32
+ ---
33
+
34
+ ## Skill symlinks land in the monorepo root's `.claude/skills/` instead of the example's
35
+
36
+ **Symptom**
37
+ - After running an example like `examples/autonomous-pentest`, the example's skills appear as symlinks under `<repo-root>/.claude/skills/` instead of `examples/<name>/.claude/skills/`.
38
+ - Console line during the run: `🔗 Creating skill junctions in: /Users/.../converge/.claude/skills`. The path is the monorepo root, not the example dir.
39
+ - Symlinks point to `../../examples/<name>/.converge/skills/<skill>` — clearly the framework knew about the example's skills but resolved the install location to the wrong root.
40
+ - Symlinks persist after the run (cleanup also runs against the wrong dir).
41
+
42
+ **Root cause**
43
+ The framework had **six** `findProjectRoot` implementations across packages, each using a different heuristic. The buggy ones either:
44
+ 1. Walked up looking for `.claude/` first (`agentfn.ts:36`, `compose.ts:26`) — climbed past the example to the monorepo root if the example had `.converge/` but no `.claude/`.
45
+ 2. Walked up looking for `pnpm-workspace.yaml`/`package.json` (`agentfn/skills.ts:417`, `kimifn/skills.ts:19`, `geminifn/skills.ts:19`, `qwenfn/skills.ts:19`) — every example lacks both files but the monorepo root has them, so the walk always landed at the monorepo root.
46
+
47
+ In `agentfn.ts`, the `agentfn()` function computed the *correct* `symlinkTarget` on line 183, then the legacy-skills branch at lines 218–222 silently *overwrote* it using the bad `_findProjectRoot` from `agentfn/skills.ts`.
48
+
49
+ **Fix**
50
+ Established a single canonical rule: **project root = nearest ancestor (or self) containing `.converge/`**. Period. No `.claude/` preference. No workspace markers. No `process.cwd()` escape hatches.
51
+
52
+ Implemented as a small `findConvergeRoot(startDir)` helper in the new `packages/project-root` package, imported by each fn package that needs it. Migrated every call site:
53
+
54
+ - `agentfn/agentfn.ts` — deleted local `findProjectRoot`, imported `findConvergeRoot`. Rewrote the legacy-skills branch to reuse the already-computed `projectRoot` instead of re-resolving.
55
+ - `agentfn/compose.ts` — same treatment.
56
+ - `agentfn/skills.ts` — deleted `_findProjectRoot`. `_getConvergeDir`, `listSkills`, and `legacyGetSkillPath` now use `findConvergeRoot`.
57
+ - `kimifn/src/skills.ts`, `geminifn/src/skills.ts`, `qwenfn/src/skills.ts` — deleted local `findProjectRoot`, imported `findConvergeRoot`. `getSkillsDir` and `getAgentsDir` retain the `<root>/skills/` and `<root>/agents/` path shape; only the root-finding logic changed.
58
+ - `core/src/client/converge-client.ts` — removed the `process.env.CONVERGE_PROJECT_DIR ?? process.cwd()` fallback. The SDK now hard-fails with a clear error if `CONVERGE_PROJECT_DIR` (or `parsed.projectDir` from `CONVERGE_CONTEXT_JSON`) is missing. No silent guesses.
59
+
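The canonical rule (nearest ancestor, or self, containing `.converge/`) can be sketched as an upward walk. This is an illustration of the rule only, not the code in `packages/project-root`: the optional `hasMarker` parameter and the `null` return for "not found" are assumptions added here so the walk can be exercised without touching the disk.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

/** Default marker test: does `dir` contain a `.converge/` directory? */
function hasConvergeDir(dir: string): boolean {
  try {
    return fs.statSync(path.join(dir, ".converge")).isDirectory();
  } catch {
    return false;
  }
}

/**
 * Returns the nearest ancestor of `startDir` (or `startDir` itself) for which
 * `hasMarker` is true, or null when no ancestor qualifies.
 * The second parameter is an assumption made for testability.
 */
function findConvergeRoot(
  startDir: string,
  hasMarker: (dir: string) => boolean = hasConvergeDir,
): string | null {
  let dir = path.resolve(startDir);
  for (;;) {
    if (hasMarker(dir)) return dir;
    const parent = path.dirname(dir);
    if (parent === dir) return null; // reached the filesystem root
    dir = parent;
  }
}

console.log(findConvergeRoot(process.cwd()));
```

Because the walk stops at the first match, a nested example with its own `.converge/` resolves to the example directory even when the monorepo root also has one.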
60
+ **Verification**
61
+ 1. New package's own tests pass:
62
+ ```bash
63
+ cd /Users/minh/Documents/converge/packages/project-root && pnpm test
64
+ # → 9 passed
65
+ ```
66
+ 2. Full monorepo build clean:
67
+ ```bash
68
+ cd /Users/minh/Documents/converge && pnpm build
69
+ ```
70
+ 3. End-to-end check of the resolved path with a synthetic monorepo + nested example:
71
+ ```bash
72
+ # Tempdir with .converge/ at outer level AND inside examples/demo/.
73
+ # findConvergeRoot from inside examples/demo/tasks/recon must return
74
+ # examples/demo, NOT the outer monorepo root. Verified.
75
+ ```
76
+ 4. Per-package test counts unchanged from baseline (zero regressions).
77
+ 5. Manual: re-run any example that has `.converge/skills/`. Watch the `🔗 Creating skill junctions in:` line — path must end in `examples/<name>/.claude/skills`, not `<monorepo>/.claude/skills`.
78
+
79
+ **Files touched**
80
+ - `packages/project-root/` (new package: `package.json`, `tsconfig.json`, `vitest.config.ts`, `src/index.ts`, `tests/find-converge-root.test.ts`)
81
+ - `packages/agentfn/package.json` (added dep)
82
+ - `packages/agentfn/vitest.config.ts` (added alias)
83
+ - `packages/agentfn/src/agentfn.ts`
84
+ - `packages/agentfn/src/compose.ts`
85
+ - `packages/agentfn/src/skills.ts`
86
+ - `packages/kimifn/package.json` (added dep)
87
+ - `packages/kimifn/src/skills.ts`
88
+ - `packages/geminifn/package.json` (added dep)
89
+ - `packages/geminifn/src/skills.ts`
90
+ - `packages/qwenfn/package.json` (added dep)
91
+ - `packages/qwenfn/src/skills.ts`
92
+ - `packages/core/src/client/converge-client.ts`
93
+
94
+ ---
95
+
96
+ ## Shared seed script path not found despite file existing at project root
97
+
98
+ *(Historical note: originally diagnosed for the WBS subsystem, which has since been replaced by the seed/navigator architecture. The path-resolution pattern is still relevant.)*
99
+
100
+ **Symptom**
101
+ - A TASK.md or seed template references a shared script at the project root.
102
+ - Run fails with a "script not found" error.
103
+ - The script exists at `<projectDir>/scripts/foo.js` but the runner only looks under the journal task directory.
104
+ - Error shows a resolved path inside `journal/<playbook>/tasks/...` instead of the project root.
105
+
106
+ **Root cause**
107
+ - The seed script executor resolved the script path only against the task materialization directory. That works for co-located scripts (e.g. `./seeds/index.js` next to the TASK.md) but breaks for shared scripts at the project root.
108
+
109
+ **Fix**
110
+ - Two-step resolution in the seed executor (steps 1 and 2), plus a clearer failure message (step 3):
111
+ 1. Try task-dir-relative first (preserves the co-located convention).
112
+ 2. If that doesn't exist, fall back to project-dir-relative.
113
+ 3. The "not found" error message now lists *both* candidate paths so debugging is unambiguous.
114
+ - Mirror the same project-dir fallback in the repair path (`packages/core/src/navigator/core/actions/repair/strategy-seed-script-repair.ts`) so the repair pipeline can also locate shared scripts.
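
The resolution order described in this fix can be sketched roughly as follows. This is a minimal illustration, not the framework's actual code: `resolveSeedScript`, its parameters, and the injectable `exists` check are hypothetical names introduced here, and the real executor may structure this differently.

```typescript
import { existsSync } from "node:fs";
import { resolve } from "node:path";

// Hypothetical helper (not the framework's API): resolve a seed script
// reference against the task directory first, then the project directory.
export function resolveSeedScript(
  scriptRef: string,
  taskDir: string,
  projectDir: string,
  // Injectable for testing; defaults to a real filesystem check.
  exists: (p: string) => boolean = existsSync,
): string {
  // Candidates in priority order: co-located script first, shared script second.
  const candidates = [resolve(taskDir, scriptRef), resolve(projectDir, scriptRef)];

  const found = candidates.find((p) => exists(p));
  if (found !== undefined) return found;

  // List every candidate so the failing lookup is unambiguous.
  throw new Error(
    `Seed script not found: ${scriptRef}\nTried:\n` +
      candidates.map((p) => `  - ${p}`).join("\n"),
  );
}
```

Keeping the task-dir candidate first means an existing co-located script always wins over a shared script with the same relative path, which is what preserves the older convention.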
115
+
116
+ **Verification**
117
+ ```bash
118
+ cd /Users/minh/Documents/converge
119
+ pnpm --filter @openplaybooks/converge-core build && pnpm --filter @openplaybooks/converge build
120
+ cd examples/test-seeding
121
+ node /Users/minh/Documents/converge/packages/cli/dist/index.js clean --select '*'
122
+ node /Users/minh/Documents/converge/packages/cli/dist/index.js run
123
+ # expect: seed script found and executed, no "script not found" errors
124
+ ```
125
+
126
+ **Files touched**
127
+ - `packages/core/src/navigator/core/actions/resolution/resolve-seed.ts` (seed script resolution)
128
+ - `packages/core/src/navigator/core/actions/repair/strategy-seed-script-repair.ts` (repair path)
129
+
130
+ ---
131
+
132
+ ## Transient remote errors (429, 5xx, network) trigger seed script repair, wasting tokens
133
+
134
+ **Symptom**
135
+ - A seed script runs, hits a transient downstream failure (rate limit, quota exhausted, 5xx, network reset), and exits non-zero. Stdout shows the error, e.g.:
136
+ ```
137
+ google.genai.errors.ClientError: 429 RESOURCE_EXHAUSTED. {'error': {'message': 'Your prepayment credits are depleted...'}}
138
+ ```
139
+ - Runner classifies this as a script bug and triggers repair, calling AI to "fix" the script:
140
+ ```
141
+ → Strategy: seed script error - triggering AI repair
142
+ [seed-script-repair] Calling AI to fix script (attempt 1)...
143
+ ```
144
+ - AI rewrites a script that wasn't broken, costing tokens for no benefit. Even if the rewrite is syntactically valid, the next run hits the same 429.
145
+
146
+ **Root cause**
147
+ - The seed repair path had no precondition to detect transient/remote failures. Every non-success exit path fed into AI repair.
148
+
149
+ **Fix**
150
+ - Added a transient-error precondition:
151
+ - `TRANSIENT_REMOTE_PATTERNS` regex list matches 429, 5xx (502/503/504), `RESOURCE_EXHAUSTED`, `quota`, `rate-limit`, `Overloaded`, `ECONNRESET`/`ECONNREFUSED`/`ETIMEDOUT`/`ENOTFOUND`, "credits depleted", etc.
152
+ - `isTransientRemoteError(error)` checks `error.name + message + stack` against those patterns.
153
+ - Before the AI-repair branch, the executor logs a `skip-transient` fact and returns early when transient. The script's normal retry loop (1/2, 2/2) still applies; we just don't rewrite the script.
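
A pre-filter of this kind might look like the sketch below. The names `TRANSIENT_REMOTE_PATTERNS` and `isTransientRemoteError` come from the description above, but the specific regexes are an illustrative subset assumed for this example; the real list in `strategy-seed-script-repair.ts` may differ.

```typescript
// Illustrative subset only; the actual pattern list may be longer or stricter.
const TRANSIENT_REMOTE_PATTERNS: RegExp[] = [
  // HTTP status codes. The lookarounds skip numbers that are part of a
  // "file:line:col" stack frame such as "seed.js:429:13".
  /(?<![:.\d])(?:429|502|503|504)(?![:\d])/,
  /RESOURCE_EXHAUSTED/i,
  /\bquota\b/i,
  /rate[-_ ]?limit/i,
  /overloaded/i,
  /\b(?:ECONNRESET|ECONNREFUSED|ETIMEDOUT|ENOTFOUND)\b/,
  /credits (?:are )?depleted/i,
];

// Returns true when the error text looks like a transient remote failure
// rather than a bug in the seed script itself.
export function isTransientRemoteError(error: unknown): boolean {
  const haystack =
    error instanceof Error
      ? `${error.name} ${error.message} ${error.stack ?? ""}`
      : String(error);
  return TRANSIENT_REMOTE_PATTERNS.some((re) => re.test(haystack));
}
```

Because the stack is part of the searched text, a bare `\b429\b` would also match a line number in a stack frame. The lookarounds in the first pattern are one way to reduce that false positive; whether the real implementation guards against it is not stated here.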
154
+
155
+ **Verification**
156
+ ```bash
157
+ cd /Users/minh/Documents/converge
158
+ pnpm --filter @openplaybooks/converge-core build && pnpm --filter @openplaybooks/converge build
159
+ cd examples/test-seeding
160
+ # Use a quota-depleted API key to force a 429 (a merely invalid key may be rejected with an auth error instead of a 429)
161
+ GEMINI_API_KEY=invalid node /Users/minh/Documents/converge/packages/cli/dist/index.js clean --select '*'
162
+ GEMINI_API_KEY=invalid node /Users/minh/Documents/converge/packages/cli/dist/index.js run
163
+ # expect: log line showing transient error detected and AI repair skipped
164
+ # expect: NO "[seed-script-repair] Calling AI to fix script" line
165
+ ```
166
+
167
+ **Files touched**
168
+ - `packages/core/src/navigator/core/actions/repair/strategy-seed-script-repair.ts` (transient error pre-filter)
169
+
170
+ ---
171
+
172
+ ## `seedConfigGap is not defined` crash in resolve-seed action
173
+
174
+ **Symptom**
175
+ - Running a playbook with a seed task that has a seed gap.
176
+ - Crash with `ReferenceError: seedConfigGap is not defined` at `resolveSeed`.
177
+ - Stack trace points to `resolve-seed.ts` (or the bundled `dist/index.js` equivalent).
178
+
179
+ **Root cause**
180
+ - `packages/core/src/navigator/core/actions/resolution/resolve-seed.ts` line 37 references `seedConfigGap.id` but the variable captured from the gap search (line 19) is named `seedGap`. The variable name `seedConfigGap` was never declared — a simple typo.
181
+
182
+ **Fix**
183
+ - Rename `seedConfigGap` → `seedGap` on the `gapId:` line in `resolve-seed.ts`.
184
+
185
+ **Verification**
186
+ - Run `examples/test-seed-repair`. The run should no longer crash with `seedConfigGap is not defined`; instead it should proceed to seed execution and (if the seed has a deliberate error) trigger repair.
187
+
188
+ **Files touched**
189
+ - `packages/core/src/navigator/core/actions/resolution/resolve-seed.ts`
190
+
191
+ ---
192
+
193
+ ## SeedScriptRepairStrategy can't find the seed file because `scriptPath` points to the task directory, not the script
194
+
195
+ **Symptom**
196
+ - Seed script execution fails with a runtime error (e.g., `ReferenceError`).
197
+ - Self-healing triggers seed-script-repair.
198
+ - Repair strategy fails: `Seed script not found at <task-dir>` (non-retryable).
199
+ - The seed script exists at `<seeds-dir>/<name>.seed.js` but the repair strategy only looks under the task directory.
200
+
201
+ **Root cause**
202
+ - `packages/core/src/executor/seed-executor.ts` "Strategy 4" (general seed error handler, ~line 1096) sets `scriptPath: this.taskFilePath` without trying to extract the actual script path from the error.
203
+ - The `extractSeedScriptPathFromError` method already knows how to parse the error format `"Seed script import failed: <path>\n<cause>"`, but it was only called in Strategy 2 (missing file), not Strategy 4 (general error).
204
+ - `packages/core/src/navigator/repair/strategies/seed-script-repair.ts` then searches for `seed.js`/`seedData.ts`/`seed/index.js` under the task directory, which doesn't contain the script (it's in `../seeds/`).
205
+
206
+ **Fix**
207
+ - In the Strategy 4 gap creation block, call `this.extractSeedScriptPathFromError(error)` first and use that as `scriptPath`, falling back to `this.taskFilePath` if extraction fails.
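
The extract-then-fall-back logic can be sketched as below, assuming the error format quoted above (`"Seed script import failed: <path>\n<cause>"`). `extractSeedScriptPath` and `scriptPathForRepair` are hypothetical stand-ins written for this example; they are not the actual `extractSeedScriptPathFromError` method in `seed-executor.ts`.

```typescript
const IMPORT_FAILED_PREFIX = "Seed script import failed: ";

// Hypothetical stand-in for the executor's path extraction: returns the
// script path from the first line of the message, or undefined if the
// message does not follow the expected format.
export function extractSeedScriptPath(error: unknown): string | undefined {
  const message = error instanceof Error ? error.message : String(error);
  const firstLine = message.split("\n", 1)[0] ?? "";
  if (!firstLine.startsWith(IMPORT_FAILED_PREFIX)) return undefined;

  const scriptPath = firstLine.slice(IMPORT_FAILED_PREFIX.length).trim();
  return scriptPath.length > 0 ? scriptPath : undefined;
}

// Prefer the path parsed from the error; fall back to the task file path
// only when extraction fails.
export function scriptPathForRepair(error: unknown, taskFilePath: string): string {
  return extractSeedScriptPath(error) ?? taskFilePath;
}
```

With this shape, an error that carries the real script location steers the repair strategy to `../seeds/` directly, while errors in any other format keep the previous task-directory behaviour.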
208
+
209
+ **Verification**
210
+ - Run `examples/test-seed-repair`. The repair strategy should find the seed script, call AI to fix it, self-test should pass, and the fixed script should re-run successfully.
211
+
212
+ **Files touched**
213
+ - `packages/core/src/executor/seed-executor.ts`