@martintrojer/mu 0.3.1

@@ -0,0 +1,542 @@
1
+ # Roadmap
2
+
3
+ What's coming after [0.1.0](../CHANGELOG.md), with full design
4
+ rationale per item. This is the **single forward-looking doc**: if
5
+ a feature isn't listed here, it isn't planned. If it's listed but
6
+ unbuilt, see its promotion criteria for what would move it.
7
+
8
+ For canonical terms, see [VOCABULARY.md](VOCABULARY.md). For
9
+ pillars that must not bend, see [VISION.md](VISION.md). For module
10
+ layout and data flow, see [ARCHITECTURE.md](ARCHITECTURE.md).
11
+
12
+ ---
13
+
14
+ ## Promotion criteria (the only bar)
15
+
16
+ A roadmap item earns implementation when **all three** are true:
17
+
18
+ 1. **Proven friction.** A real user (us, internal users, early
19
+ adopters) hits the missing feature in a real workflow at least
20
+ twice. "Imagined polish" doesn't count.
21
+ 2. **No pillar refactor.** The addition fits the current substrate
22
+ without bending any of the load-bearing pillars (see
23
+ [VISION.md](VISION.md)).
24
+ 3. **Bounded scope.** The addition fits in **<300 LOC** or has a
25
+ clear smaller subset that does.
26
+
27
+ If an item drops below the bar (criterion 1 no longer holds after
28
+ real use), it moves to the bottom or is removed. We don't keep
29
+ phantom plans alive.
30
+
31
+ **Exception: data-loss footguns.** A change that fixes a default
32
+ that silently destroys user artifacts (uncommitted output, scratch
33
+ logs, benchmark results, etc.) ships on the **first** occurrence,
34
+ not the second. The cost of waiting for criterion 1 is "lose more
35
+ stuff"; that's the wrong cost to optimise. Document the friction
36
+ in the commit message instead.
37
+
38
+ **Polish doesn't count as promotion.** Bug fixes, ergonomic
39
+ improvements, error-message wording, doc tightening, and similar
40
+ "the existing thing works better" changes don't need promotion
41
+ criteria — they just need to be small and to ship clean (typecheck
42
+ + lint + tests + build). Polish is the dividend the project earns
43
+ by refusing the things on this roadmap. Don't wait for occurrence
44
+ #2 to fix a typo, tighten an error message, or truncate a runaway
45
+ table column.
46
+
47
+ ---
48
+
49
+ ## Anti-feature pledges (still in force; reinforced by an internal critique)
50
+
51
+ We will NOT do any of the following until each one earns its way back via the criteria
52
+ above. Each pledge is a specific accumulation a prior internal
53
+ multi-agent runtime made and mu chose not to inherit; an internal
54
+ critique made the case sharply (TL;DR: that runtime's breadth had
55
+ hidden state, lifecycle bugs, unclear ownership of truth, and high
56
+ model-facing tool entropy).
57
+
58
+ - Add a configuration file. All config is CLI flags or env vars.
59
+ - Add a daemon, watcher, or background process beyond what tmux /
60
+ SQLite give us.
61
+ - Add abstractions that exist for "future flexibility" with no
62
+ current consumer (a prior internal LLM-runtime's `RunContext`
63
+ trait was the cautionary tale).
64
+ - Add wrappers around wrappers (stream-of-streams wrappers we've
65
+ seen before — `TextStream`/`TextState`/`StreamResult` shapes —
66
+ are the cautionary tale).
67
+ - Generate code, embed a JS engine, or use any macro/decorator
68
+ pattern beyond TypeScript itself. (Council: "A workflow DSL that
69
+ becomes 'programming the runtime' is a liability.")
70
+ - Ship a template/definition system for agent roles. Spawn flags +
71
+ the orchestrator's first message are the only "definition."
72
+ - Add a render layer beyond `cli-table3` + `picocolors`.
73
+ - Bundle pi. The pi extension is the only anticipated future
74
+ caller; even that is required to be a thin facade over the SDK
75
+ (see [§ Pi extension and the three rules](#pi-extension-and-the-three-rules)
76
+ below).
77
+ - Add a plugin runtime, a web UI, an RPC layer, a chat or docs
78
+ integration, a memory system, or a workflow engine. (These are
79
+ the kinds of accumulated subsystems the council critique flagged
80
+ as costing more than they pay for. mu has none and intends to
81
+ keep it that way.)
82
+
83
+ ---
84
+
85
+ ## Possible — small additions with an obvious shape
86
+
87
+ These have a clear design but haven't yet hit criterion 1 (proven
88
+ friction in ≥2 real workflows). They earn implementation when real
89
+ use surfaces them.
90
+
91
+ The section heading is deliberately "Possible," not "Next." "Next"
92
+ implies it's coming. "Possible" doesn't. Items below ship if and
93
+ when they earn it.
94
+
95
+ ### Pi extension and the three rules
96
+
97
+ The pi extension is the first "polish" tier — LLM-facing UX
98
+ (typed `mu_*` tools, HUD widget, wakeups) that wraps the same core
99
+ operations the CLI already exposes. Bundled in the same npm
100
+ package; pi is a peer dep.
101
+
102
+ The pi extension is **the only anticipated future caller**. If and
103
+ when it lands, three rules stay non-negotiable:
104
+
105
+ 1. **The DB is canonical.** All state in `<state-dir>/mu.db`.
106
+ Extension reads/writes it through the same modules the CLI uses.
107
+ No extension-only state.
108
+ 2. **Every operation works from the CLI.** No tool registered in
109
+ the extension has logic that doesn't exist in the CLI. The
110
+ extension is a typed/integrated facade.
111
+ 3. **The skill teaches the CLI.** Pi sessions without the extension
112
+ still get a working mu by following [the bundled
113
+ skill](../skills/mu/SKILL.md).
114
+
115
+ If those three rules hold, mu stays driveable from a shell forever
116
+ and the extension stays thin.
117
+
118
+ ### `mu adopt <pane-id> [--name <agent>]` — SHIPPED in v0.2 (`e20af89`)
119
+
120
+ Reconciliation surfaces orphan panes; `mu adopt` formally registers
121
+ one of them as a managed agent. Promotion was triggered by the
122
+ multi-agent dogfood pattern (orchestrator runs in a pane outside
123
+ the `mu-<ws>` session and wants to be claimable as a worker).
124
+
125
+ ### Heterogeneous CLI status detection (claude, codex, ...)
126
+
127
+ mu is a pi orchestrator today, BUT v0.2 added a Braille-spinner
128
+ fallback (`f68838f`) that catches every TUI wrapper using
129
+ standard spinner glyphs (U+2800–U+28FF). pi-meta + solo are now
130
+ covered without a per-CLI detector. Other vanilla TUIs (claude,
131
+ codex) inherit the same fallback.
132
+
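As a minimal sketch of the spinner fallback described above: the only fact taken from the text is the glyph range U+2800 to U+28FF. The helper name `looksBusy` and its line-array input are assumptions made for illustration, not the shape of the shipped detector.

```typescript
// U+2800..U+28FF is the Braille Patterns block named in the text.
const BRAILLE_SPINNER = /[\u2800-\u28FF]/;

// Hypothetical helper: true when any captured pane line shows a Braille glyph.
function looksBusy(lines: string[]): boolean {
  return lines.some((line) => BRAILLE_SPINNER.test(line));
}
```

Because the check keys on the glyph range rather than on any CLI's wording, a new TUI wrapper that uses a standard spinner needs no detector of its own.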
133
+ For patterns the spinner fallback misses (e.g. permission
134
+ prompts), a per-CLI `Detector` registry keyed by CLI name (~50
135
+ LOC per CLI) is the obvious shape. Promote when a real
136
+ misclassification of a specific prompt surfaces.
137
+
138
+ Pattern sketch (ported from a prior internal multi-agent runtime's
139
+ per-CLI detector — kept here for whoever picks it up):
140
+
141
+ | CLI | Busy patterns | Permission patterns |
142
+ | -------- | ------------------------------------------ | --------------------------------------------------------- |
143
+ | Claude | `to interrupt`, `\(.*[↑↓].*tokens\)` | `Allow once`, `Allow for this session`, `Esc to cancel` |
144
+ | Codex | `esc to interrupt)`, `to cancel` | `enter to confirm`, `enter to submit \| esc to cancel` |
145
+ | Pi | (well-known mu-defined marker) | (well-known mu-defined marker) — shipped |
146
+
147
+ Critical subtleties any new detector must keep:
148
+
149
+ - **Tail-window extraction**: take last ~100 lines, strip trailing
150
+ blanks, then take last ~20. Prevents stale scrollback
151
+ false-positives. Already implemented for pi in `src/detect.ts`;
152
+ the registry version factors this out.
153
+ - **Permission detection uses a narrower window than busy
154
+ detection** — prevents already-answered prompts triggering
155
+ re-detection.
156
+ - **Permission patterns override busy** — if a permission prompt
157
+ is visible, agent is `NeedsPermission`, not `Busy`.
158
+
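The three subtleties above can be sketched together. This is an illustration under stated assumptions, not the code in `src/detect.ts`: the `Detector` shape, the function names, and the 8-line permission window are hypothetical; the Claude patterns are copied from the table.

```typescript
// Hypothetical names throughout; only the patterns come from the table above.
type AgentStatus = "Busy" | "NeedsPermission" | "Idle";

interface Detector {
  busy: RegExp[];
  permission: RegExp[];
}

// Take the last `outer` lines, strip trailing blank lines, then keep the last `inner`.
function tailWindow(capture: string, outer = 100, inner = 20): string[] {
  const lines = capture.split(/\r?\n/).slice(-outer);
  while (lines.length > 0 && (lines[lines.length - 1] ?? "").trim() === "") {
    lines.pop();
  }
  return lines.slice(-inner);
}

function classify(capture: string, d: Detector, permissionLines = 8): AgentStatus {
  const busyWindow = tailWindow(capture);
  // Narrower window for permission prompts (8 lines is an assumed value).
  const permWindow = busyWindow.slice(-permissionLines);
  const hit = (lines: string[], patterns: RegExp[]) =>
    lines.some((line) => patterns.some((re) => re.test(line)));
  // Permission is checked first, so it overrides busy.
  if (hit(permWindow, d.permission)) return "NeedsPermission";
  if (hit(busyWindow, d.busy)) return "Busy";
  return "Idle";
}

const claude: Detector = {
  busy: [/to interrupt/, /\(.*[↑↓].*tokens\)/],
  permission: [/Allow once/, /Allow for this session/, /Esc to cancel/],
};
```

For example, `classify("esc to interrupt\nAllow once\n", claude)` yields `NeedsPermission`, while a capture whose only `Allow once` line sits more than 20 non-blank lines above the bottom yields `Idle`.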
159
+ ### `tasks_v` enriched view
160
+
161
+ ```sql
162
+ CREATE VIEW tasks_v AS
163
+ SELECT t.*,
164
+ GROUP_CONCAT(n.content, char(10) || '---' || char(10)) AS notes,
165
+ COUNT(n.id) AS note_count,
166
+ MAX(n.created_at) AS last_note_at
167
+ FROM tasks t
168
+ LEFT JOIN task_notes n ON n.task_id = t.id
169
+ GROUP BY t.id;
170
+ ```
171
+
172
+ Earns when `mu sql` queries against tasks + notes start getting
173
+ verbose for a second consumer.
174
+
175
+ ---
176
+
177
+ ## Snapshots + undo
178
+
179
+ Theme: every destructive action becomes recoverable.
180
+
181
+ ### `snapshots` table + auto-snapshot before mutation — SHIPPED in v0.2 (schema v4; tables carried into v5, and unchanged in v6/v7)
182
+
183
+ `captureSnapshot()` runs at the top of every destructive verb
184
+ (workstream destroy, agent close, task close/reject/defer/release/
185
+ delete, workspace free). Whole-DB copy via
186
+ `VACUUM INTO` (synchronous, FK-page-level atomic). Files land in
187
+ `<dirname(db-path)>/snapshots/<id>.db`; one row per capture in:
188
+
189
+ ```sql
190
+ CREATE TABLE snapshots (
191
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
192
+ workstream TEXT, -- nullable: destroy spans all
193
+ label TEXT NOT NULL, -- operation name + args
194
+ db_path TEXT NOT NULL, -- abs path to .db file
195
+ schema_version INTEGER NOT NULL, -- for restore-time version check
196
+ created_at TEXT NOT NULL
197
+ );
198
+ ```
199
+
200
+ GC is opportunistic and runs in-hook (<14 days OR <100 rows). NO FK on
201
+ `workstream` — destroying a workstream must NOT cascade-delete
202
+ its pre-destroy snapshot.
203
+
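The retention rule is stated tersely, so the sketch below fixes one reading of it as an assumption: a snapshot is kept if it is younger than 14 days or is among the newest 100 rows, and only rows failing both tests are pruned. The `SnapshotRow` shape and the function name are hypothetical, and the shipped GC may apply the thresholds differently.

```typescript
// Hypothetical row shape; createdAt is an ISO-8601 timestamp string.
interface SnapshotRow {
  id: number;
  createdAt: string;
}

const MAX_AGE_DAYS = 14;
const MAX_ROWS = 100;

// Returns ids eligible for deletion under the assumed reading:
// prune only rows that are BOTH older than 14 days AND outside the newest 100.
function snapshotsToPrune(rows: SnapshotRow[], now: Date): number[] {
  const cutoff = now.getTime() - MAX_AGE_DAYS * 24 * 60 * 60 * 1000;
  const newestFirst = [...rows].sort((a, b) => b.id - a.id);
  return newestFirst
    .filter((row, rank) => rank >= MAX_ROWS && Date.parse(row.createdAt) < cutoff)
    .map((row) => row.id);
}
```

A caller would delete both the returned rows and their `db_path` files; that step is omitted here.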
204
+ ### `mu undo` + `mu snapshot {list,show}` — SHIPPED in v0.2 (snap_undo_verb)
205
+
206
+ Three verbs on top of the snapshots substrate:
207
+
208
+ - **`mu undo [--yes] [--to <id>]`** — top-level. Restores latest
209
+ snapshot (or the one named by `--to`). Dry-run by default;
210
+ `--yes` commits. Post-restore reconciles every workstream
211
+ (best-effort per workstream, errors swallowed) and reports
212
+ ghosts pruned + orphans surfaced.
213
+ - **`mu snapshot list [-n N] [--json]`** — newest-first table.
214
+ - **`mu snapshot show <id> [--json]`** — full row metadata.
215
+
216
+ Design decisions held to:
217
+
218
+ - **No `mu redo`.** Verbs have side-effects (tmux kill, git worktree
219
+ remove) that aren't replayable. Each restore captures a
220
+ pre-restore snapshot first, so a second `mu undo` rolls forward
221
+ to that one. Verified end-to-end. Promote `mu redo` only if real
222
+ use surfaces a need.
223
+ - **Cross-version restores rejected** (snapshot.schema_version <
224
+ CURRENT_SCHEMA_VERSION); migrations are forward-only. Maps to
225
+ `SnapshotVersionMismatchError` (exit 4).
226
+ - **Tmux state is NOT rolled back.** Restore + reconcile prunes
227
+ ghost rows; orphan panes surface in next `mu agent list`.
228
+ Documented honestly in the verb's stdout.
229
+
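The roll-forward behaviour and the version check can be modelled in memory. This is a sketch under assumptions, not the shipped verb: `SnapshotStore`, `SnapshotRecord`, and the value `7` for `CURRENT_SCHEMA_VERSION` are hypothetical, a string stands in for the copied `.db` file, and the dry-run default and post-restore reconcile are left out.

```typescript
const CURRENT_SCHEMA_VERSION = 7; // assumed value for the example

class SnapshotVersionMismatchError extends Error {
  readonly exitCode = 4; // exit code named in the text
  constructor(snapshotVersion: number, currentVersion: number) {
    super(`snapshot schema v${snapshotVersion} is older than current v${currentVersion}`);
    this.name = "SnapshotVersionMismatchError";
  }
}

interface SnapshotRecord {
  id: number;
  label: string;
  schemaVersion: number;
  contents: string; // stands in for the copied .db file
}

class SnapshotStore {
  contents = "";
  snapshots: SnapshotRecord[] = [];
  private nextId = 1;

  capture(label: string, schemaVersion = CURRENT_SCHEMA_VERSION): SnapshotRecord {
    const snap: SnapshotRecord = { id: this.nextId++, label, schemaVersion, contents: this.contents };
    this.snapshots.push(snap);
    return snap;
  }

  // Restore the latest snapshot, or the one named by `to`.
  undo(to?: number): void {
    const target =
      to === undefined
        ? this.snapshots[this.snapshots.length - 1]
        : this.snapshots.find((s) => s.id === to);
    if (target === undefined) throw new Error("no snapshot to restore");
    if (target.schemaVersion < CURRENT_SCHEMA_VERSION) {
      throw new SnapshotVersionMismatchError(target.schemaVersion, CURRENT_SCHEMA_VERSION);
    }
    // Capture the pre-restore state first, so a second undo rolls forward to it.
    this.capture(`pre-restore of #${target.id}`);
    this.contents = target.contents;
  }
}
```

Calling `undo()` twice in a row returns the store to where it started, which is why a separate redo verb is not needed in this model.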
230
+ Destructive verbs that already auto-snapshot now also advertise
231
+ undo in their `Next:` blocks (`mu task delete`, `mu workstream
232
+ destroy --yes`, etc.). Closes `snap_destroy_safety`.
233
+
234
+ ---
235
+
236
+ ## Stretch
237
+
238
+ Items that meet criterion 2 (no pillar bend) and 3 (small) but
239
+ haven't yet hit criterion 1 (proven friction). They stay parked until
240
+ real use surfaces them.
241
+
242
+ ### `task_artifacts` — generalized "this task produced X"
243
+
244
+ ```sql
245
+ CREATE TABLE task_artifacts (
246
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
247
+ task_id TEXT NOT NULL REFERENCES tasks(local_id) ON DELETE CASCADE,
248
+ kind TEXT NOT NULL, -- pr|file|url|commit|image
249
+ ref TEXT NOT NULL,
250
+ label TEXT,
251
+ created_at TEXT NOT NULL
252
+ );
253
+ ```
254
+
255
+ `mu task artifact add <task> --kind pr <url>`. Surfaces in `mu
256
+ task show` and a future `tasks_v` enriched view.
257
+
258
+ ### Other parked items
259
+
260
+ | Item | Source / origin |
261
+ | --- | --- |
262
+ | `CancelScope` for long-running ops — Ctrl-C handling that cooperatively cancels in-flight tmux/exec calls | prior-art pattern (workflows) |
263
+ | `mu.step()` replay cache for `mu run` — re-running a partially-failed script skips already-completed steps | prior-art pattern (workflows; `SqliteWorkflowStore` shape) |
264
+ | `init_tracing(config)` + RAII guard — NDJSON to `<state-dir>/logs/`, MINUTELY rotation, last 100 files | prior-art pattern (tracing) |
265
+ | Subscription-based wakeups — `mu log --tail` polls SQLite once per second; SQLite update hooks (via better-sqlite3) or fs.watch on the WAL would drop latency. | internal critique gap |
266
+
267
+ ### Schema normalization — SHIPPED in v0.2 (schema v5)
268
+
269
+ `tasks.id INTEGER PK + (workstream_id, local_id) UNIQUE` shipped
270
+ as the universal substrate-wide pattern, not just on tasks. See
271
+ [docs/ARCHITECTURE.md § Surrogate-PK + SDK-boundary discipline](ARCHITECTURE.md#surrogate-pk--sdk-boundary-discipline-load-bearing).
272
+ Two operators both running `mu task add design` in different
273
+ workstreams just works; same for agents.
274
+
275
+ Post-v5 evolution: schema v6 added the cross-workstream archive
276
+ tables (`archives`, `archived_tasks`, `archived_edges`,
277
+ `archived_notes`, `archived_events`); schema v7 dropped the
278
+ unused `approvals` table. The surrogate-PK shape is unchanged.
279
+
280
+ ---
281
+
282
+ ## Explicitly rejected
283
+
284
+ These were considered and turned down, with the reason. Listed so
285
+ we don't rediscover the same ideas every quarter.
286
+
287
+ ### JavaScript DSL (`mu run` / `mu eval` / `mu repl`)
288
+
289
+ Why it's tempting: atomicity-as-syntax, forward refs as a parser
290
+ feature, LLMs reliably emit structured code.
291
+
292
+ Why we rejected (twice — first as a Lisp like the prior runtime
293
+ used, then as JS-via-`vm`):
294
+
295
+ - The gap a DSL fills is "compose multiple verbs into one
296
+ transactional script." `--json` on every read verb plus typed
297
+ verbs that accept evidence arguments cover that without a
298
+ sandbox, codegen, `.d.ts` shipping, or a parallel typed surface
299
+ to maintain.
300
+ - **Independent corroboration from an internal critique**: five
301
+ orthogonal reviewers (architect, engineer, model-UX,
302
+ thin-harness advocate, operator) all flagged DSL/workflow
303
+ language as the worst maintenance liability of the prior
304
+ internal runtime. "A workflow DSL that becomes 'programming
305
+ the runtime' is a liability."
306
+ - The `vm` sandbox would have to be maintained against Node's
307
+ security model forever; a non-trivial commitment for a feature
308
+ with no proven friction.
309
+ - bash composition over `mu --json | jq` covers what real users
310
+ do.
311
+
312
+ What the DSL would have provided, and what ships instead:
313
+
314
+ | Original DSL feature | Shipped substitute |
315
+ | --------------------------------------------- | ------------------------------------------------------- |
316
+ | `mu run script.ts` (transactional script) | `bash + jq + --json`; SDK in-proc for typed callers |
317
+ | `mu eval` | `mu sql` for raw queries; `bash -c` for actions |
318
+ | `mu repl` | `node` + `import("mu-agent")` for in-proc exploration |
319
+ | `mu.create / spawn / claim / send / ...` | `mu task add / agent spawn / task claim / agent send` |
320
+ | `mu.ready()` / `mu.parallelTracks()` | `mu task next -n 0 --json` / bare `mu --json` / `mu state --json` |
321
+ | Forward refs via deferred string IDs | Add tasks in topological order, or use `mu task block` after-the-fact |
322
+ | Atomic transactions wrapping a script | Per-verb transactions in the SDK; idempotent verbs |
323
+ | `mu.step()` replay cache | Not built; if needed, build on top of `agent_logs` event seq |
324
+
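For the forward-refs row, "add tasks in topological order" can be computed before any `mu task add` call is made. The sketch below uses Kahn's algorithm on a hypothetical `TaskSpec` shape; it only produces the order and does not assume any particular mu flags for declaring blockers.

```typescript
// Hypothetical input shape: each task lists the ids that block it.
interface TaskSpec {
  id: string;
  blockedBy: string[];
}

// Kahn's algorithm: emit a task only after all of its blockers have been emitted.
function topologicalOrder(tasks: TaskSpec[]): string[] {
  const pending = new Map<string, number>(); // id -> unmet blocker count
  const dependents = new Map<string, string[]>(); // blocker id -> tasks waiting on it
  for (const task of tasks) {
    pending.set(task.id, task.blockedBy.length);
    for (const blocker of task.blockedBy) {
      dependents.set(blocker, [...(dependents.get(blocker) ?? []), task.id]);
    }
  }
  const ready = tasks.filter((t) => t.blockedBy.length === 0).map((t) => t.id);
  const order: string[] = [];
  while (ready.length > 0) {
    const id = ready.shift() as string;
    order.push(id);
    for (const waiting of dependents.get(id) ?? []) {
      const left = (pending.get(waiting) ?? 0) - 1;
      pending.set(waiting, left);
      if (left === 0) ready.push(waiting);
    }
  }
  if (order.length !== tasks.length) {
    throw new Error("cycle or unknown blocker in task graph");
  }
  return order;
}
```

Adding tasks in the returned order means every blocker already exists when a task that names it is created, so no deferred string IDs are needed.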
325
+ Re-earn requires repeated friction reports of "I keep writing the
326
+ same bash" that bash + jq + `--json` couldn't fix.
327
+
328
+ ### `defineOperation()` registry framework
329
+
330
+ The only consumer that motivated this was the JS DSL's `.d.ts`
331
+ autocomplete. With the DSL rejected, no consumer remains. The pi
332
+ extension, if/when it ships, can share types directly via
333
+ `src/index.ts` SDK exports without a registry layer. Classic case
334
+ of an abstraction with one anticipated consumer.
335
+
336
+ ### Markdown agent-definition discovery
337
+
338
+ Spawn already accepts `--cli` / `--command` / `--workspace` /
339
+ `--role` directly; an orchestrator's first message + spawn flags
340
+ ARE the agent's "definition." The `agents/` directory and a
341
+ `docs/AGENT_FORMAT.md` were considered and dropped.
342
+
343
+ Earn back if real friction surfaces ("I'm copy-pasting the same
344
+ role doc into five spawn invocations every day, twice a week").
345
+
346
+ ### Build mu as a pure pi extension (no CLI)
347
+
348
+ Why it's tempting: simpler distribution, one install, full access
349
+ to pi's `ExtensionAPI` for HUD and events.
350
+
351
+ Why rejected:
352
+
353
+ - Children spawned by mu can't drive mu without re-loading the
354
+ extension.
355
+ - Humans can't `mu agent list` from a shell to debug.
356
+ - Recursion requires special plumbing.
357
+ - Couples mu to pi's release cycle and extension API.
358
+ - Throws away the "any process can drive this" property.
359
+
360
+ ### Build mu as a library that pi imports (no standalone CLI)
361
+
362
+ Why it's tempting: zero subprocess overhead.
363
+
364
+ Why rejected:
365
+
366
+ - Multiple pi instances would each load the library and fight over
367
+ the DB.
368
+ - A standalone CLI on `$PATH` is the cleanest "shared resource"
369
+ model.
370
+ - The library/CLI split is well-trodden — every good tool ships
371
+ both, and the CLI is canonical.
372
+
373
+ ### Two binaries: `mu-agents` and `mu-tasks`
374
+
375
+ Why it's tempting: cleaner separation of concerns.
376
+
377
+ Why rejected:
378
+
379
+ - Agent ↔ task integration (claim, owner field, agent_logs about
380
+ tasks) needs them in one transactional surface.
381
+ - One install, one mental model, one `mu doctor`.
382
+ - A prior internal precedent of separating task-graph and
383
+ agent-runtime crates created awkward join logic; mu collapsing
384
+ them is a feature.
385
+
386
+ ### `TaskSurface` adapter abstraction with multiple backends
387
+
388
+ Sync to GitHub Issues / Linear / Asana. Why it's tempting:
389
+ composability, "bring your own work tracker."
390
+
391
+ Why rejected:
392
+
393
+ - mu without a built-in task graph is just a fancier agent runner
394
+ — the killer features (parallel tracks, claim, ROI
395
+ prioritization) require a graph.
396
+ - Adapter complexity for systems most users don't have.
397
+ - Round-tripping inverts the model: mu's task graph is local and
398
+ authoritative.
399
+ - If wanted: a separate companion package, not core.
400
+
401
+ ### Cross-machine state sync
402
+
403
+ Local-first SQLite. Layer something like syncthing on top if you
404
+ want it. Multi-machine sync would force a server, conflict
405
+ resolution, identity, auth — every one of those breaks the "zero
406
+ ops" pledge.
407
+
408
+ ### HTTP API on top of the SQLite registry
409
+
410
+ mu is a CLI; if you need RPC, write it. The schema is small and
411
+ stable enough.
412
+
413
+ ### A "hosted" mu
414
+
415
+ Zero ops, no accounts. Your machine is the deployment.
416
+
417
+ ### Plugin system / web UI / RPC / chat & docs integrations / memory system / workflow engine
418
+
419
+ Not "rejected one at a time" — rejected as a class. An internal
420
+ critique established that the prior internal runtime's accumulation
421
+ of these adjacent product identities was its central design
422
+ failure: "hidden state, lifecycle bugs, unclear ownership of
423
+ truth, and high model-facing tool entropy."
424
+
425
+ mu's anti-feature pledges (no plugin runtime, no codegen, no
426
+ daemon, no web UI, no chat integration, no memory system, no
427
+ workflow engine) are specifically the accumulations of that prior
428
+ internal runtime that mu chose not to inherit. Each one is
429
+ provable as the absence of a subsystem mu was tempted to copy.
430
+
431
+ ### Anthropomorphic builtin agent names (`alice`, `bob`)
432
+
433
+ Use role-based names (`worker-1`, `reviewer-1`). See
434
+ [VOCABULARY.md §"Naming conventions"](VOCABULARY.md#agent-names-prefer-role-n-not-human-names).
435
+
436
+ ---
437
+
438
+ ## Open questions
439
+
440
+ These were live during initial design and remain partly unresolved.
441
+ Listed so we don't pretend they're settled.
442
+
443
+ - **`agents.cli` as TEXT vs enum.** Went with TEXT (originally for
444
+ heterogeneous-CLI forward-compat). Today the only meaningful
445
+ value is `pi`. We're keeping it TEXT — if multi-CLI re-earns its
446
+ way back, the column doesn't need a schema migration.
447
+ - **Composite `(workstream, local_id)` PK on tasks.** Resolved in
448
+ schema v5: `tasks.id INTEGER PK + (workstream_id, local_id) UNIQUE`
449
+ lets two workstreams both have a `design` task (see Schema normalization above).
450
+ - **Capability tags on operations.** The `defineOperation()`
451
+ registry that would have carried these is rejected. The role
452
+ flag on agents is stored but unenforced. The internal critique
453
+ flagged "capability-gated mutations" as part of the minimal
454
+ core; for now mu's only authorization surface is "the agent ran
455
+ the verb." Earn capability enforcement when an agent actually
456
+ does damage.
457
+ - **Per-workstream config.** Resisted (the anti-feature pledge).
458
+ "This workstream uses one pi binary, that one uses another" is
459
+ a real gap that env vars don't solve cleanly. Revisit when the
460
+ second user hits it.
461
+ - **Subscription-based wakeups.** `mu log --tail` polls SQLite
462
+ once per second. Real subscriptions (SQLite update hooks via
463
+ better-sqlite3, or fs.watch on the WAL) would drop latency at
464
+ the cost of more machinery. Not worth it until someone hits
465
+ the cliff.
466
+
467
+ ---
468
+
469
+ ## Operational lessons we're stealing (reference for implementers)
470
+
471
+ Each of these is a real failure mode pi-subagents or a prior
472
+ internal multi-agent runtime has already fixed. Listed here so
473
+ when one of the items above is picked up, the implementer doesn't
474
+ have to rediscover the lesson.
475
+
476
+ ### From pi-subagents (`src/runs/shared/`)
477
+
478
+ | File | Lesson |
479
+ | -------------------------- | ----------------------------------------------------------------- |
480
+ | `frontmatter.ts` | Agent-frontmatter parser: 28 lines, handles CRLF, quoted values, kebab-case. Port verbatim. |
481
+ | `long-running-guard.ts` | Mutating-bash detection via regex + unquoted-redirection scanner. Don't trust tool names; scan command bodies. |
482
+ | `long-running-guard.ts` | Mutating-failure burst detection: rolling window, consecutive vs same-path failures, escalation threshold. |
483
+ | `completion-guard.ts` | Expected-mutation detection from task prose, not agent role. Strips framework-injected lines before checking. |
484
+ | `model-fallback.ts` | Curated regex list of retryable failures (rate limit, 429, quota, 502/503/504). Don't waste a fallback on auth errors. |
485
+ | `model-fallback.ts` | `splitThinkingSuffix` always splits on **last** colon — preserves `provider/model:high`. |
486
+ | `single-output.ts` | Three cases for output files: agent wrote it, agent didn't, file unreadable. `captureSingleOutputSnapshot` before run to disambiguate. |
487
+ | `worktree.ts` | `node_modules` symlinking + tracking as synthetic-path. Generic across VCS. |
488
+ | `worktree.ts` | Per-task `cwd:` conflict detection. Best-effort rollback on hook failure. |
489
+ | `result-watcher.ts` | `fs.watch` with mandatory polling fallback on `EMFILE`/`ENOSPC`. `unref()` timers. Coalescer for rapid rename events. |
490
+ | `pi-args.ts` | Long tasks → temp file + `@path` argv. System prompt via `mode: 0o600` temp file. Identity env vars passed down. |
491
+ | `extension/doctor.ts` | `lineFromCheck(label, fn)` wrapper turns thrown errors into `failed — <text>` lines so one broken probe doesn't break the report. |
492
+
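The two `model-fallback.ts` lessons in the table can be sketched as follows. Neither function is copied from pi-subagents: the regexes are an assumed rendering of "rate limit, 429, quota, 502/503/504", and the return shape of `splitThinkingSuffix` is hypothetical. The split follows one reading of the table row, namely that cutting at the last colon keeps any earlier colons inside the model id.

```typescript
// Assumed patterns for the retryable failures listed in the table.
const RETRYABLE: RegExp[] = [/rate.?limit/i, /\b429\b/, /quota/i, /\b50[234]\b/];

// Auth errors match none of the patterns, so no fallback is spent on them.
function isRetryableFailure(message: string): boolean {
  return RETRYABLE.some((re) => re.test(message));
}

// Split on the LAST colon so colons earlier in the model id survive.
function splitThinkingSuffix(spec: string): { model: string; thinking?: string } {
  const idx = spec.lastIndexOf(":");
  if (idx === -1) return { model: spec };
  return { model: spec.slice(0, idx), thinking: spec.slice(idx + 1) };
}
```

A real implementation would probably also check that the suffix is a known thinking level before treating it as one; that check is not shown here.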
493
+ ### From a prior internal multi-agent runtime
494
+
495
+ | Topic | Lesson |
496
+ | ----------------------------------------------- | ------------------------------------------------------------ |
497
+ | shell-escape | `shell_escape` via single-quote wrapping. |
498
+ | granular workspace-free results | A `WorkspaceFreeResult` with independent `committed`/`submitted`/`commitError`/`submitError`. |
499
+ | submit guard | `timeout -k 5s {N}s sh -c 'exec jf submit --draft </dev/null'` to prevent hanging on TTY prompts. |
500
+ | per-CLI detector | Per-CLI Detector trait + pattern registry. Tail-window + narrow-window distinction. (deferred; pi only today.) |
501
+ | lifecycle state machine | Side-effect-free lifecycle state machine: `(state, event) → outcome`. Single point for tracing. Distinguishes manual `Free` from inferred idle. |
502
+ | read-list reconciliation | "Reality wins": every `list()` queries the substrate, prunes ghosts, adopts orphans. **Implemented (`src/reconcile.ts`).** |
503
+ | parallel-tracks | Parallel-tracks union-find with diamond-merge. **Implemented (`src/tracks.ts`).** |
504
+ | built-in graph views | Built-in views: `ready`, `blocked`, `goals`. **Implemented.** |
505
+ | pane-title-as-identity | Pane-title-as-identity for the claim protocol. **Implemented.** |
506
+ | lisp DSL (rejected for mu, ideas not adopted) | Atomic transactions are per-verb in the SDK; idempotent re-imports work via `INSERT OR IGNORE` + idempotent verbs; forward-ref checking handled at task-add time. JS DSL also rejected (above). |
507
+ | notes model | Append-only, FILES/DECISION/VERIFIED conventions. **Implemented.** |
508
+
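The shell-escape row names a standard POSIX technique, sketched here in TypeScript. The function name `shellEscape` is an assumption and this is not the prior runtime's code.

```typescript
// Inside single quotes a POSIX shell treats every character literally,
// so the only character needing care is the single quote itself:
// close the quoted run, emit an escaped quote, then reopen.
function shellEscape(arg: string): string {
  return "'" + arg.replace(/'/g, "'\\''") + "'";
}
```

For instance, `shellEscape("it's")` produces `'it'\''s'`, which a POSIX shell reads back as the single word `it's`.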
509
+ ---
510
+
511
+ ## Documents still to write
512
+
513
+ Meta-docs the project will need eventually:
514
+
515
+ - **CONTRIBUTING.md** — once external PRs land. Contains the LOC
516
+ caps, the lint rules, the "no traits with zero implementors"
517
+ rule, the test-first conventions.
518
+ - **MIGRATIONS.md** — the v3→v4 in-process migration framework + the
519
+ one-shot v4→v5 script have shipped (`src/migrations.ts`) and been
520
+ retired. Capturing the operator-facing contract for future schema
521
+ bumps in one place is still useful; leave as a follow-up.
522
+
523
+ ---
524
+
525
+ ## How to use this roadmap
526
+
527
+ If you're starting work on an item:
528
+
529
+ 1. **Confirm it still meets the three promotion criteria.** Note
530
+ the second real-use occurrence; cite the friction.
531
+ 2. **Open a focused PR per item.** One typed verb per commit, one
532
+ schema change per commit.
533
+ 3. **Update [VOCABULARY.md](VOCABULARY.md) first** if you introduce
534
+ a new concept or rename an existing one.
535
+ 4. **Add a [CHANGELOG.md](../CHANGELOG.md) entry** under the
536
+ upcoming version.
537
+
538
+ If you're considering adding a new entry to this file:
539
+
540
+ - Read AGENTS.md §"What NOT to do" first.
541
+ - Provide a concrete promotion-criteria assessment.
542
+ - Match the format of existing entries.