@exaudeus/workrail 3.66.0 → 3.68.0

Files changed (150)
  1. package/dist/application/services/compiler/template-registry.js +10 -1
  2. package/dist/application/validation.js +1 -1
  3. package/dist/cli/commands/worktrain-init.js +1 -1
  4. package/dist/console/standalone-console.js +4 -1
  5. package/dist/console-ui/assets/{index-BynU38Vu.js → index-CyzltI6D.js} +1 -1
  6. package/dist/console-ui/index.html +1 -1
  7. package/dist/coordinators/modes/full-pipeline.js +4 -4
  8. package/dist/coordinators/modes/implement-shared.js +5 -5
  9. package/dist/coordinators/modes/implement.js +4 -4
  10. package/dist/coordinators/pr-review.js +4 -4
  11. package/dist/daemon/workflow-runner.d.ts +1 -0
  12. package/dist/daemon/workflow-runner.js +1 -0
  13. package/dist/infrastructure/storage/schema-validating-workflow-storage.d.ts +21 -2
  14. package/dist/infrastructure/storage/schema-validating-workflow-storage.js +48 -0
  15. package/dist/manifest.json +41 -41
  16. package/dist/mcp/handlers/v2-workflow.js +24 -7
  17. package/dist/mcp/output-schemas.d.ts +36 -0
  18. package/dist/mcp/output-schemas.js +11 -1
  19. package/dist/mcp/workflow-protocol-contracts.js +2 -2
  20. package/dist/v2/projections/session-metrics.d.ts +1 -1
  21. package/dist/v2/projections/session-metrics.js +16 -35
  22. package/dist/v2/usecases/console-routes.d.ts +2 -2
  23. package/docs/authoring-v2.md +4 -4
  24. package/docs/changelog-recent.md +3 -3
  25. package/docs/configuration.md +1 -1
  26. package/docs/design/adaptive-coordinator-context-candidates.md +1 -1
  27. package/docs/design/adaptive-coordinator-context.md +1 -1
  28. package/docs/design/adaptive-coordinator-routing-candidates.md +18 -18
  29. package/docs/design/adaptive-coordinator-routing-review.md +1 -1
  30. package/docs/design/adaptive-coordinator-routing.md +34 -34
  31. package/docs/design/agent-cascade-protocol.md +2 -2
  32. package/docs/design/console-daemon-separation-discovery.md +323 -0
  33. package/docs/design/context-assembly-design-candidates.md +1 -1
  34. package/docs/design/context-assembly-implementation-plan.md +1 -1
  35. package/docs/design/context-assembly-layer.md +2 -2
  36. package/docs/design/context-assembly-review-findings.md +1 -1
  37. package/docs/design/coordinator-access-audit.md +293 -0
  38. package/docs/design/coordinator-architecture-audit.md +62 -0
  39. package/docs/design/coordinator-error-handling-audit.md +240 -0
  40. package/docs/design/coordinator-testability-audit.md +426 -0
  41. package/docs/design/daemon-architecture-discovery.md +1 -1
  42. package/docs/design/daemon-console-separation-discovery.md +242 -0
  43. package/docs/design/daemon-memory-audit.md +203 -0
  44. package/docs/design/design-candidates-console-daemon-separation.md +256 -0
  45. package/docs/design/design-candidates-discovery-loop-fix.md +141 -0
  46. package/docs/design/design-review-findings-console-daemon-separation.md +106 -0
  47. package/docs/design/design-review-findings-discovery-loop-fix.md +81 -0
  48. package/docs/design/discovery-loop-fix-candidates.md +161 -0
  49. package/docs/design/discovery-loop-fix-design-review.md +106 -0
  50. package/docs/design/discovery-loop-fix-validation.md +258 -0
  51. package/docs/design/discovery-loop-investigation-A.md +188 -0
  52. package/docs/design/discovery-loop-investigation-B.md +287 -0
  53. package/docs/design/exploration-workflow-candidates.md +205 -0
  54. package/docs/design/exploration-workflow-design-review.md +166 -0
  55. package/docs/design/exploration-workflow-discovery.md +443 -0
  56. package/docs/design/ide-context-files-candidates.md +231 -0
  57. package/docs/design/ide-context-files-design-review.md +85 -0
  58. package/docs/design/ide-context-files.md +615 -0
  59. package/docs/design/implementation-plan-discovery-loop-fix.md +199 -0
  60. package/docs/design/implementation-plan-queue-poll-rotation.md +102 -0
  61. package/docs/design/in-process-http-audit.md +190 -0
  62. package/docs/design/layer3b-ghost-nodes-design-candidates.md +2 -2
  63. package/docs/design/loadSessionNotes-candidates.md +108 -0
  64. package/docs/design/loadSessionNotes-test-coverage-discovery.md +297 -0
  65. package/docs/design/loadSessionNotes-test-coverage-session4.md +209 -0
  66. package/docs/design/loadSessionNotes-test-coverage-v3.md +321 -0
  67. package/docs/design/probe-session-design-candidates.md +261 -0
  68. package/docs/design/probe-session-phase0.md +490 -0
  69. package/docs/design/routines-guide.md +7 -7
  70. package/docs/design/session-metrics-attribution-candidates.md +250 -0
  71. package/docs/design/session-metrics-attribution-design-review.md +115 -0
  72. package/docs/design/session-metrics-attribution-discovery.md +319 -0
  73. package/docs/design/session-metrics-candidates.md +227 -0
  74. package/docs/design/session-metrics-design-review.md +104 -0
  75. package/docs/design/session-metrics-discovery.md +454 -0
  76. package/docs/design/spawn-session-debug.md +202 -0
  77. package/docs/design/trigger-validator-candidates.md +214 -0
  78. package/docs/design/trigger-validator-review.md +109 -0
  79. package/docs/design/trigger-validator-shaping-phase0.md +239 -0
  80. package/docs/design/trigger-validator.md +454 -0
  81. package/docs/design/v2-core-design-locks.md +2 -2
  82. package/docs/design/workflow-extension-points.md +15 -15
  83. package/docs/design/workflow-id-validation-at-startup.md +1 -1
  84. package/docs/design/workflow-id-validation-implementation-plan.md +2 -2
  85. package/docs/design/workflow-trigger-lifecycle-audit.md +175 -0
  86. package/docs/design/worktrain-task-queue-candidates.md +5 -5
  87. package/docs/design/worktrain-task-queue.md +4 -4
  88. package/docs/discovery/coordinator-script-design.md +1 -1
  89. package/docs/discovery/coordinator-ux-discovery.md +3 -3
  90. package/docs/discovery/simulation-report.md +1 -1
  91. package/docs/discovery/workflow-modernization-discovery.md +326 -0
  92. package/docs/discovery/workflow-selection-for-discovery-tasks.md +33 -33
  93. package/docs/discovery/worktrain-status-briefing.md +1 -1
  94. package/docs/discovery/wr-discovery-goal-reframing.md +1 -1
  95. package/docs/docker.md +1 -1
  96. package/docs/ideas/backlog.md +227 -0
  97. package/docs/ideas/third-party-workflow-setup-design-thinking.md +1 -1
  98. package/docs/integrations/claude-code.md +5 -5
  99. package/docs/integrations/firebender.md +1 -1
  100. package/docs/plans/agentic-orchestration-roadmap.md +2 -2
  101. package/docs/plans/mr-review-workflow-redesign.md +9 -9
  102. package/docs/plans/ui-ux-workflow-design-candidates.md +4 -4
  103. package/docs/plans/ui-ux-workflow-discovery.md +2 -2
  104. package/docs/plans/workflow-categories-candidates.md +8 -8
  105. package/docs/plans/workflow-categories-discovery.md +4 -4
  106. package/docs/plans/workflow-modernization-design.md +430 -0
  107. package/docs/plans/workflow-staleness-detection-candidates.md +11 -11
  108. package/docs/plans/workflow-staleness-detection-review.md +4 -4
  109. package/docs/plans/workflow-staleness-detection.md +9 -9
  110. package/docs/plans/workrail-platform-vision.md +3 -3
  111. package/docs/reference/agent-context-cleaner-snippet.md +1 -1
  112. package/docs/reference/agent-context-guidance.md +4 -4
  113. package/docs/reference/context-optimization.md +2 -2
  114. package/docs/roadmap/now-next-later.md +2 -2
  115. package/docs/roadmap/open-work-inventory.md +16 -16
  116. package/docs/workflows.md +31 -31
  117. package/package.json +1 -1
  118. package/spec/workflow-tags.json +47 -47
  119. package/workflows/adaptive-ticket-creation.json +16 -16
  120. package/workflows/architecture-scalability-audit.json +22 -22
  121. package/workflows/bug-investigation.agentic.v2.json +3 -3
  122. package/workflows/classify-task-workflow.json +1 -1
  123. package/workflows/coding-task-workflow-agentic.json +6 -6
  124. package/workflows/cross-platform-code-conversion.v2.json +8 -8
  125. package/workflows/document-creation-workflow.json +8 -8
  126. package/workflows/documentation-update-workflow.json +8 -8
  127. package/workflows/intelligent-test-case-generation.json +2 -2
  128. package/workflows/learner-centered-course-workflow.json +2 -2
  129. package/workflows/mr-review-workflow.agentic.v2.json +4 -4
  130. package/workflows/personal-learning-materials-creation-branched.json +8 -8
  131. package/workflows/presentation-creation.json +5 -5
  132. package/workflows/production-readiness-audit.json +1 -1
  133. package/workflows/relocation-workflow-us.json +31 -31
  134. package/workflows/routines/context-gathering.json +1 -1
  135. package/workflows/routines/design-review.json +1 -1
  136. package/workflows/routines/execution-simulation.json +1 -1
  137. package/workflows/routines/feature-implementation.json +3 -3
  138. package/workflows/routines/final-verification.json +1 -1
  139. package/workflows/routines/hypothesis-challenge.json +1 -1
  140. package/workflows/routines/ideation.json +1 -1
  141. package/workflows/routines/parallel-work-partitioning.json +3 -3
  142. package/workflows/routines/philosophy-alignment.json +2 -2
  143. package/workflows/routines/plan-analysis.json +1 -1
  144. package/workflows/routines/plan-generation.json +1 -1
  145. package/workflows/routines/tension-driven-design.json +6 -6
  146. package/workflows/scoped-documentation-workflow.json +26 -26
  147. package/workflows/ui-ux-design-workflow.json +14 -14
  148. package/workflows/workflow-diagnose-environment.json +1 -1
  149. package/workflows/workflow-for-workflows.json +32 -77
  150. package/workflows/workflow-for-workflows.v2.json +0 -788
@@ -0,0 +1,426 @@
1
+ # Coordinator Testability Audit
2
+
3
+ Generated: 2026-04-19
4
+
5
+ ## Context
6
+
7
+ **Motivating bug:** The `awaitSessions` HTTP bug shipped to production. The real implementation
8
+ polled an HTTP console endpoint instead of reading from the session store. The unit-test fake
9
+ always returned a success `AwaitResult`, so the wrong implementation path was invisible to tests.
10
+
11
+ **Audit question:** For each dependency function in `AdaptiveCoordinatorDeps`, does the test fake
12
+ exist? Does it simulate realistic failure modes? Could the real implementation fail in a way the
13
+ fake would NOT catch?
14
+
15
+ **Scope:**
16
+ - `tests/unit/adaptive-implement.test.ts`
17
+ - `tests/unit/adaptive-full-pipeline.test.ts`
18
+ - `tests/unit/route-task.test.ts`
19
+ - `src/coordinators/adaptive-pipeline.ts` -- `AdaptiveCoordinatorDeps` full interface
20
+ - `src/coordinators/pr-review.ts` -- `CoordinatorDeps` (parent interface)
21
+ - `src/trigger/trigger-listener.ts` -- real production wiring (lines 640-752)
22
+ - `src/cli-worktrain.ts` -- real `awaitSessions` wiring (lines 1336-1372)
23
+
24
+ **Principle under audit:** "Prefer fakes over mocks -- tests should validate behavior with
25
+ realistic substitutes." (CLAUDE.md)
26
+
27
+ ---
28
+
29
+ ## Interface Summary
30
+
31
+ `AdaptiveCoordinatorDeps` extends `CoordinatorDeps` with 5 additional deps.
32
+ Total: 21 named deps plus one optional (`contextAssembler`), counting the rows in the two tables below.
33
+
34
+ **Inherited from `CoordinatorDeps`:**
35
+
36
+ | Dep | Type |
37
+ |-----|------|
38
+ | `spawnSession` | async, returns `Result<string, string>` |
39
+ | `awaitSessions` | async, returns `AwaitResult` |
40
+ | `getAgentResult` | async, returns `{ recapMarkdown, artifacts }` |
41
+ | `listOpenPRs` | async, returns `PrSummary[]` |
42
+ | `mergePR` | async, returns `Result<void, string>` |
43
+ | `writeFile` | async, returns `void` |
44
+ | `readFile` | async, returns `string` (throws on ENOENT) |
45
+ | `appendFile` | async, returns `void` |
46
+ | `mkdir` | async, returns `string \| undefined` |
47
+ | `stderr` | sync, void |
48
+ | `now` | sync, returns `number` |
49
+ | `port` | value |
50
+ | `homedir` | sync, returns `string` |
51
+ | `joinPath` | sync, returns `string` |
52
+ | `nowIso` | sync, returns `string` |
53
+ | `generateId` | sync, returns `string` |
54
+ | `contextAssembler` | optional, `ContextAssembler` |
55
+
56
+ **Added by `AdaptiveCoordinatorDeps`:**
57
+
58
+ | Dep | Type |
59
+ |-----|------|
60
+ | `fileExists` | sync, returns `boolean` |
61
+ | `archiveFile` | async, returns `void` |
62
+ | `pollForPR` | async, returns `string \| null` |
63
+ | `postToOutbox` | async, returns `void` |
64
+ | `pollOutboxAck` | async, returns `'acked' \| 'timeout'` |
65
+
66
+ ---
67
+
68
+ ## Per-Dep Analysis
69
+
70
+ ### 1. `awaitSessions`
71
+
72
+ **Fake default:** Always returns `makeSuccessAwait(handles[0])` -- outcome is always `'success'`.
73
+
74
+ **Fake failure coverage:** `makeFailedAwait()` and `makeTimeoutAwait()` helpers exist and are used
75
+ in explicit test overrides (e.g., "escalates when coding session times out"). These cover the
76
+ `outcome: 'failed'` and `outcome: 'timeout'` contract branches.
77
+
78
+ **Real implementation (cli-worktrain.ts:1336):** Calls `executeWorktrainAwaitCommand` which makes
79
+ HTTP requests to the daemon console port. If the HTTP call fails or the result is unparseable,
80
+ `resolvedResult` stays null and the function returns all sessions as `outcome: 'failed'`.
81
+
82
+ **Gap:** The fake does not simulate the scenario where `awaitSessions` returns all-failed because
83
+ the daemon is unreachable (port unavailable). This is different from a session failing: it means
84
+ the coordinator cannot determine any session status. While `makeFailedAwait` covers the contract
85
+ shape, it does not represent the CAUSE: the coordinator trusts `awaitSessions` to correctly
86
+ reflect session state, but the real impl can return all-failed even when sessions succeeded.
87
+
88
+ **The awaitSessions bug:** The bug was that the real impl polled HTTP instead of reading the
89
+ session store. A fake that validates contract shape cannot catch this. However, a fake that
90
+ simulates port-unavailable failure (returning all-failed unconditionally when `port` is set to 0
91
+ or -1) would have surfaced this as a test gap: the coordinator would escalate, and the test author
92
+ would ask why.
93
+
94
+ **Missing scenarios:**
95
+ - `awaitSessions` returns all-failed because daemon is unreachable (simulated via `port = 0`)
96
+ - `awaitSessions` called with an empty handles array
97
+
98
+ **Severity: HIGH**
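
A minimal sketch of the port-aware fake described above, assuming a hypothetical `AwaitResult` shape (the audit names the three outcomes but not the full type) and a hypothetical helper name `makePortAwareAwaitSessions`. It reports every handle as failed when the port is 0 or negative, which models "daemon unreachable" rather than "session failed".

```typescript
// Hypothetical shapes: the audit names the outcomes but not the full AwaitResult type.
type SessionOutcome = 'success' | 'failed' | 'timeout';

interface AwaitResult {
  results: { handle: string; outcome: SessionOutcome }[];
}

// Fake awaitSessions that models "daemon unreachable" when port <= 0:
// every handle is reported as failed, whatever the session actually did.
function makePortAwareAwaitSessions(port: number) {
  return async (handles: readonly string[]): Promise<AwaitResult> => {
    const outcome: SessionOutcome = port > 0 ? 'success' : 'failed';
    return { results: handles.map((handle) => ({ handle, outcome })) };
  };
}
```

A test would pass `makePortAwareAwaitSessions(0)` as `awaitSessions` and assert that the coordinator returns an escalated outcome instead of throwing. The empty-handles scenario can be covered with the same fake by passing `[]`.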
99
+
100
+ ---
101
+
102
+ ### 2. `getAgentResult`
103
+
104
+ **Fake default:** Always returns `{ recapMarkdown: 'APPROVE -- no findings. LGTM.', artifacts: [] }`.
105
+
106
+ **Fake failure coverage:** None in the default fake. Tests that need a specific result override
107
+ `getAgentResult` inline, but no test simulates a network failure.
108
+
109
+ **Real implementation (cli-worktrain.ts:1374):** Makes two raw `globalThis.fetch()` calls (session
110
+ detail, then node detail). Neither call is wrapped in try/catch in `implement-shared.ts` or
111
+ `full-pipeline.ts`. A network error throws an unhandled exception, crashing the coordinator.
112
+
113
+ **Gap 1 (success bias):** Default fake never returns `{ recapMarkdown: null, artifacts: [] }`,
114
+ which is the real impl's fallback on HTTP failure. The `null` recap path IS tested in
115
+ `adaptive-full-pipeline.test.ts` but only via explicit override, not as the default.
116
+
117
+ **Gap 2 (throw injection):** The real impl throws on network error (fetch rejects). The callers
118
+ in `implement-shared.ts` do not have try/catch around `getAgentResult`. If this throws, the
119
+ coordinator crashes rather than returning `{ kind: 'escalated' }`. No test exercises this.
120
+
121
+ **Missing scenarios:**
122
+ - `getAgentResult` returns `{ recapMarkdown: null, artifacts: [] }` as the default (should escalate gracefully)
123
+ - `getAgentResult` throws a network error (coordinator should escalate, not crash)
124
+
125
+ **Severity: HIGH**
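
The two missing scenarios can be sketched as two fakes plus a guarded call site. Everything here is illustrative: `AgentResult`, `StepOutcome`, and `readAgentResult` are hypothetical names, and the real call sites in `implement-shared.ts` may be structured differently.

```typescript
// Hypothetical shapes for illustration; the real types live in the coordinator source.
interface AgentResult {
  recapMarkdown: string | null;
  artifacts: unknown[];
}

type StepOutcome =
  | { kind: 'ok'; result: AgentResult }
  | { kind: 'escalated'; reason: string };

// Fake modelling a rejected fetch (network error).
const throwingGetAgentResult = async (_handle: string): Promise<AgentResult> => {
  throw new TypeError('fetch failed');
};

// Fake modelling the fallback value on HTTP failure.
const nullRecapGetAgentResult = async (_handle: string): Promise<AgentResult> => ({
  recapMarkdown: null,
  artifacts: [],
});

// Guarded call site: a throw and a null recap both become an escalation.
async function readAgentResult(
  getAgentResult: (handle: string) => Promise<AgentResult>,
  handle: string,
): Promise<StepOutcome> {
  try {
    const result = await getAgentResult(handle);
    if (result.recapMarkdown === null) {
      return { kind: 'escalated', reason: 'agent result unavailable' };
    }
    return { kind: 'ok', result };
  } catch (e) {
    const message = e instanceof Error ? e.message : String(e);
    return { kind: 'escalated', reason: `getAgentResult threw: ${message}` };
  }
}
```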
126
+
127
+ ---
128
+
129
+ ### 3. `spawnSession`
130
+
131
+ **Fake default:** Returns `ok(nextHandle())` -- always succeeds.
132
+
133
+ **Fake failure coverage:** GOOD. Multiple tests use `vi.fn().mockResolvedValue(err('...'))` and
134
+ workflow-specific failure injection. The `err()` path is well-tested for all spawn points
135
+ (coding, UX gate, review, fix loop).
136
+
137
+ **Real implementation:** Makes an HTTP POST to the daemon console. Returns `err()` on HTTP failure.
138
+ Does not throw -- uses Result type consistently.
139
+
140
+ **Gap:** None significant. Zombie handle detection (empty string handle) is NOT explicitly tested
141
+ in the adaptive tests (it is tested in `pr-review.ts` context), but the structural coverage is good.
142
+
143
+ **Missing scenarios:**
144
+ - `spawnSession` returns `ok('')` (empty handle / zombie detection) for adaptive modes
145
+
146
+ **Severity: LOW**
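
For the zombie-handle scenario, a small sketch of the check a test would exercise. The `Result`, `ok`, and `err` definitions below are local stand-ins (the audit does not say which Result library the project uses), and `validateHandle` is a hypothetical name.

```typescript
// Local stand-ins for the project's Result helpers (assumed, not the real ones).
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };
const ok = <T>(value: T): Result<T, never> => ({ ok: true, value });
const err = <E>(error: E): Result<never, E> => ({ ok: false, error });

// Treat an empty or whitespace-only handle as a spawn failure.
function validateHandle(spawned: Result<string, string>): Result<string, string> {
  if (!spawned.ok) return spawned;
  return spawned.value.trim() === ''
    ? err('spawnSession returned an empty handle')
    : spawned;
}
```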
147
+
148
+ ---
149
+
150
+ ### 4. `pollForPR`
151
+
152
+ **Fake default:** Returns `'https://github.com/org/repo/pull/42'` -- always finds a PR.
153
+
154
+ **Fake failure coverage:** `null` return IS tested via explicit override (`pollForPR: vi.fn().mockResolvedValue(null)`
155
+ in "escalates when no PR is found after coding session"). So the null path has a test.
156
+
157
+ **Real implementation (trigger-listener.ts:665):** Shells `gh pr list` every 30 seconds until
158
+ timeout. Can fail silently if `gh` is not installed, not authenticated, or the branch pattern
159
+ has no match. Returns null after timeout.
160
+
161
+ **Gap:** The default fake always returns a PR URL, so any regression that makes the real impl
162
+ always return null would be masked in all other tests. The `gh` CLI failure mode (throws, then
163
+ continues polling) is completely untested.
164
+
165
+ **Missing scenarios:**
166
+ - Default fake returning null (to catch regressions earlier -- currently only one explicit test)
167
+ - `gh` CLI throws on every poll but eventually times out
168
+
169
+ **Severity: MEDIUM**
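
One way to make the null path a default is a second factory for failure-path tests. This is a sketch only: `PrDeps` is trimmed to the single dep under discussion, the `pollForPR` parameter list is assumed, and `makeFailurePathDeps` is a hypothetical name.

```typescript
// Trimmed, hypothetical slice of the deps interface; the parameters are assumed.
interface PrDeps {
  pollForPR: (branchPattern: string, timeoutMs: number) => Promise<string | null>;
}

// Happy-path default, mirroring the current fake.
function makeFakeDeps(overrides: Partial<PrDeps> = {}): PrDeps {
  return {
    pollForPR: async () => 'https://github.com/org/repo/pull/42',
    ...overrides,
  };
}

// Failure-path default: no PR is ever found unless a test overrides it.
function makeFailurePathDeps(overrides: Partial<PrDeps> = {}): PrDeps {
  return makeFakeDeps({ pollForPR: async () => null, ...overrides });
}
```

Tests that assert escalation behavior would start from `makeFailurePathDeps()`, so a regression that makes the real implementation always return null is mirrored by more than one test.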
170
+
171
+ ---
172
+
173
+ ### 5. `postToOutbox`
174
+
175
+ **Fake default:** Returns `undefined` (vi.fn().mockResolvedValue(undefined)) -- always succeeds.
176
+
177
+ **Fake failure coverage:** Tests verify that `postToOutbox` WAS CALLED with the right arguments,
178
+ but no test verifies coordinator behavior when `postToOutbox` throws.
179
+
180
+ **Real implementation (trigger-listener.ts:694):** Writes a JSON line to `~/.workrail/outbox.jsonl`
181
+ using `fs.promises.appendFile`. Can fail on disk full, missing directory, or permission error.
182
+
183
+ **Gap:** `postToOutbox` is called at critical escalation decision points (fix loop exhausted,
184
+ human review required, do-not-merge). If it throws, the coordinator crashes at that point and
185
+ never returns a `PipelineOutcome`. Callers do not wrap calls in try/catch.
186
+
187
+ **Missing scenarios:**
188
+ - `postToOutbox` throws an error (coordinator should continue and return escalated outcome)
189
+
190
+ **Severity: MEDIUM**
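
A sketch of the behavior the missing test would pin down: an outbox write failure is logged and the escalated outcome is still returned. The `escalate` helper, the single-string `postToOutbox` signature, and the `PipelineOutcome` shape are assumptions for illustration.

```typescript
// Hypothetical outcome shape; the audit only shows `{ kind: 'escalated' }`.
type PipelineOutcome = { kind: 'escalated'; reason: string };

// Fake modelling a full disk on append.
const diskFullPostToOutbox = async (_message: string): Promise<void> => {
  throw Object.assign(new Error('ENOSPC: no space left on device'), { code: 'ENOSPC' });
};

// Notification is best-effort: a failed outbox write must not hide the outcome.
async function escalate(
  postToOutbox: (message: string) => Promise<void>,
  stderr: (line: string) => void,
  reason: string,
): Promise<PipelineOutcome> {
  try {
    await postToOutbox(reason);
  } catch (e) {
    const message = e instanceof Error ? e.message : String(e);
    stderr(`outbox write failed: ${message}`);
  }
  return { kind: 'escalated', reason };
}
```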
191
+
192
+ ---
193
+
194
+ ### 6. `pollOutboxAck`
195
+
196
+ **Fake default:** Returns `'acked'` -- always acknowledged immediately.
197
+
198
+ **Fake failure coverage:** NONE. No test in either file exercises the `'timeout'` return value.
199
+
200
+ **Real implementation (trigger-listener.ts:707):** Polls `inbox-cursor.json` every 5 minutes
201
+ for up to 24 hours. The `'timeout'` path is the most likely real-world outcome because users
202
+ do not always ack notifications promptly.
203
+
204
+ **Gap:** The UX gate escalation path on `'timeout'` is completely untested. The `'timeout'`
205
+ branch exists in `full-pipeline.ts` but no test triggers it. Any regression in that branch
206
+ (e.g., forgetting to escalate, forgetting to post another outbox message) would be invisible.
207
+
208
+ The UX gate is triggered by goals containing: 'ui', 'screen', 'component', 'design', 'ux',
209
+ 'frontend' -- a common set of keywords, not an edge case.
210
+
211
+ **Missing scenarios:**
212
+ - `pollOutboxAck` returns `'timeout'` -- coordinator should escalate gracefully
213
+
214
+ **Severity: HIGH**
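
The untested branch can be illustrated with a timeout fake and a reduced gate function. `runUxGate` and `GateOutcome` are hypothetical; the real branch in `full-pipeline.ts` likely does more (for example, posting a follow-up outbox message).

```typescript
type AckResult = 'acked' | 'timeout';

// Hypothetical outcome shape for the gate step.
type GateOutcome = { kind: 'proceed' } | { kind: 'escalated'; reason: string };

// Fake modelling the likely real-world case: nobody acks in time.
const timedOutAck = async (): Promise<AckResult> => 'timeout';

// Reduced gate: 'acked' proceeds, 'timeout' escalates.
async function runUxGate(pollOutboxAck: () => Promise<AckResult>): Promise<GateOutcome> {
  const ack = await pollOutboxAck();
  return ack === 'acked'
    ? { kind: 'proceed' }
    : { kind: 'escalated', reason: 'UX gate acknowledgement timed out' };
}
```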
215
+
216
+ ---
217
+
218
+ ### 7. `archiveFile`
219
+
220
+ **Fake default:** Returns `undefined` -- always succeeds.
221
+
222
+ **Fake failure coverage:** GOOD. One test explicitly tests `archiveFile` throwing:
223
+ "logs a warning if archiveFile throws but does not change the outcome". The coordinator
224
+ uses a try/catch wrapper around `archiveFile` in a finally block. This is the only dep
225
+ with proper throw-handling coverage.
226
+
227
+ **Real implementation:** `fs.promises.rename(src, dest)` -- can fail on cross-device rename,
228
+ missing dest directory, or permission error.
229
+
230
+ **Gap:** None. Well-tested.
231
+
232
+ **Severity: LOW**
233
+
234
+ ---
235
+
236
+ ### 8. `fileExists`
237
+
238
+ **Fake default:** Returns `false` (in makeFakeDeps). `route-task.test.ts` uses explicit
239
+ `noPitch` and `hasPitch` fakes.
240
+
241
+ **Fake failure coverage:** Good for routing tests. The `fileExists` dep is sync and pure;
242
+ it does not have network failure modes.
243
+
244
+ **Real implementation:** `fs.existsSync(p)` -- cannot throw in normal operation.
245
+
246
+ **Gap:** None significant.
247
+
248
+ **Severity: LOW**
249
+
250
+ ---
251
+
252
+ ### 9. `mergePR`
253
+
254
+ **Fake default:** Returns `ok(undefined)` -- always succeeds.
255
+
256
+ **Fake failure coverage:** None for IMPLEMENT/FULL mode tests. `mergePR` is called by
257
+ `runPrReviewCoordinator` (pr-review.ts), not by IMPLEMENT/FULL mode logic. The IMPLEMENT
258
+ and FULL pipeline tests include `mergePR` in their fakes, but it is never called by the
259
+ code under test.
260
+
261
+ **Real implementation:** Shells `gh pr merge --squash`. Can fail on merge conflict,
262
+ required CI not passing, or branch protection rule violation.
263
+
264
+ **Gap:** Low severity because `mergePR` is not called in IMPLEMENT/FULL mode. The fake
265
+ is structurally correct but its presence in `makeFakeDeps()` is misleading -- it implies
266
+ IMPLEMENT/FULL mode merges PRs, which it does not (merging is delegated to the review
267
+ coordinator). The misleading presence could cause future confusion.
268
+
269
+ **Missing scenarios:** N/A for IMPLEMENT/FULL tests. Reviewed separately in pr-review tests.
270
+
271
+ **Severity: LOW (structural confusion only)**
272
+
273
+ ---
274
+
275
+ ### 10. `listOpenPRs`
276
+
277
+ Same analysis as `mergePR`: present in fakes but not called by IMPLEMENT/FULL mode logic.
278
+
279
+ **Severity: LOW (structural confusion only)**
280
+
281
+ ---
282
+
283
+ ### 11. `writeFile`
284
+
285
+ **Fake default:** Returns `undefined`. Called by `runAdaptivePipeline` for routing log writes
286
+ and by `writeReport` in pr-review mode. Both callers wrap in try/catch -- routing log failure
287
+ is explicitly non-fatal.
288
+
289
+ **Gap:** None significant for IMPLEMENT/FULL. The try/catch wrapper means failures are already
290
+ safe.
291
+
292
+ **Severity: LOW**
293
+
294
+ ---
295
+
296
+ ### 12. `readFile`
297
+
298
+ **Fake default:** Throws `Object.assign(new Error('ENOENT'), { code: 'ENOENT' })` -- simulates
299
+ missing file. This is the most realistic default of any fake in the suite: it forces callers
300
+ to handle ENOENT, which is the most common real-world readFile failure.
301
+
302
+ **Gap:** None. Well-designed default.
303
+
304
+ **Severity: LOW**
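
As an illustration of why this default is useful, the sketch below pairs the ENOENT-throwing fake with a caller that must handle it. `readIfPresent` is a hypothetical helper, not a function from the codebase.

```typescript
// Same shape as the suite's default: an Error carrying code 'ENOENT'.
const enoentReadFile = async (_path: string): Promise<string> => {
  throw Object.assign(new Error('ENOENT'), { code: 'ENOENT' });
};

// Returns null for a missing file and rethrows anything else.
async function readIfPresent(
  readFile: (path: string) => Promise<string>,
  path: string,
): Promise<string | null> {
  try {
    return await readFile(path);
  } catch (e) {
    const code = typeof e === 'object' && e !== null ? (e as { code?: unknown }).code : undefined;
    if (code === 'ENOENT') return null;
    throw e;
  }
}
```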
305
+
306
+ ---
307
+
308
+ ### 13. `contextAssembler` (optional)
309
+
310
+ **Fake:** Absent from all `makeFakeDeps()` in adaptive tests. `contextAssembler` is an optional
311
+ field and its absence means context assembly never runs in unit tests.
312
+
313
+ **Real implementation:** Assembles git diff and prior session notes. Can fail if the git command
314
+ fails or the session store is unreachable.
315
+
316
+ **Gap:** Context assembly failures are completely invisible to unit tests. If a regression in
317
+ context assembly caused `spawnSession` to receive malformed context and crash, unit tests would
318
+ not catch it. However, this is intentional -- the optional nature of the dep means the coordinator
319
+ is designed to work without it.
320
+
321
+ **Severity: LOW (by design, but worth documenting)**
322
+
323
+ ---
324
+
325
+ ### 14. `stderr`, `now`, `port`, `homedir`, `joinPath`, `nowIso`, `generateId`
326
+
327
+ These are synchronous utility functions with trivial or no failure modes.
328
+
329
+ - `stderr`: vi.fn() -- never throws in real impl
330
+ - `now`: vi.fn().mockReturnValue(Date.now()) -- realistic
331
+ - `port`: hardcoded 3456 -- does not simulate "port 0 = daemon not running"
332
+ - `homedir`: returns '/home/test' -- realistic enough for path construction
333
+ - `joinPath`: uses string concatenation -- realistic
334
+ - `nowIso`: returns ISO string -- realistic
335
+ - `generateId`: returns random string -- realistic
336
+
337
+ **Gap for `port`:** The `port` value in fakes is always 3456 (a valid port). A fake with `port = 0`
338
+ or `port = -1` combined with a failure-simulating `awaitSessions` would represent "daemon not
339
+ running" more realistically. This is the test scenario that would have caught the awaitSessions
340
+ HTTP bug.
341
+
342
+ **Severity: LOW (port gap is only meaningful when combined with awaitSessions)**
343
+
344
+ ---
345
+
346
+ ## vi.mock() Audit
347
+
348
+ **Finding:** Zero `vi.mock()` calls in any of the three test files. All dependencies are
349
+ injected via `makeFakeDeps()` or explicit object literals. This correctly follows the
350
+ "prefer fakes over mocks" principle.
351
+
352
+ The `vi.fn()` calls within `makeFakeDeps()` are Vitest spy instances on the fake's own methods,
353
+ not module-level mocks. This is the correct pattern.
354
+
355
+ ---
356
+
357
+ ## Missing Test Scenarios Summary
358
+
359
+ | Gap | Dep | Severity | File |
360
+ |-----|-----|----------|------|
361
+ | Fake never simulates daemon-unreachable (all-failed when port unavailable) | `awaitSessions` | HIGH | both |
362
+ | Fake never throws network error; callers lack try/catch | `getAgentResult` | HIGH | both |
363
+ | `pollOutboxAck` `'timeout'` path never exercised | `pollOutboxAck` | HIGH | `adaptive-full-pipeline.test.ts` |
364
+ | `postToOutbox` throw not tested; callers lack try/catch | `postToOutbox` | MEDIUM | both |
365
+ | Default always returns PR URL; only one explicit null test | `pollForPR` | MEDIUM | both |
366
+ | Empty-handle zombie detection not tested in adaptive modes | `spawnSession` | LOW | both |
367
+ | `port = 3456` always valid; never simulates "daemon not running" | `port` | LOW | both |
368
+ | `contextAssembler` absent from all fakes | `contextAssembler` | LOW | both |
369
+ | `mergePR` / `listOpenPRs` in fakes but never called by IMPLEMENT/FULL | `mergePR`, `listOpenPRs` | LOW (misleading) | both |
370
+
371
+ ---
372
+
373
+ ## Severity Rankings
374
+
375
+ ### HIGH (must fix -- these are the awaitSessions class of gap)
376
+
377
+ 1. **`awaitSessions` daemon-unreachable scenario** -- The exact class of bug that motivated this
378
+ audit. Add a test that injects `awaitSessions` returning all-failed to simulate an unreachable
379
+ daemon port. Verify the coordinator escalates gracefully rather than crashing.
380
+
381
+ 2. **`getAgentResult` throw injection** -- The real impl uses raw `fetch()` without try/catch in
382
+ callers. Add a test where `getAgentResult` throws a `TypeError: fetch failed`. The coordinator
383
+ should catch this and return `{ kind: 'escalated' }`. Currently it would crash.
384
+ Fix also requires adding try/catch in `implement-shared.ts` around `getAgentResult` calls.
385
+
386
+ 3. **`pollOutboxAck` timeout path** -- Add a test for the UX gate in `full-pipeline.ts` where
387
+ `pollOutboxAck` returns `'timeout'`. Verify the coordinator escalates and does not hang.
388
+ This is the most common real-world outcome of the UX gate.
389
+
390
+ ### MEDIUM (should fix)
391
+
392
+ 4. **`postToOutbox` throw injection** -- Add a test where `postToOutbox` throws. The coordinator
393
+ calls it at critical decision points without try/catch; a throw currently crashes the pipeline.
394
+
395
+ 5. **`pollForPR` null as default** -- The null path is covered by one explicit test, but the
396
+ default fake always returns a URL. Consider making null the default in a second
397
+ `makeFakeDeps` variant used for failure-path tests, to catch regressions earlier.
398
+
399
+ ### LOW (address in cleanup)
400
+
401
+ 6. **`spawnSession` zombie handle** -- Add one test where `spawnSession` returns `ok('')` (empty
402
+ handle) for IMPLEMENT/FULL modes and verify the coordinator escalates.
403
+
404
+ 7. **`mergePR` / `listOpenPRs` in `makeFakeDeps`** -- These deps are not called by IMPLEMENT/FULL
405
+ mode. Remove them from `makeFakeDeps` to reduce noise, or add a comment explaining they are
406
+ inherited from `CoordinatorDeps` for REVIEW_ONLY/QUICK_REVIEW modes.
407
+
408
+ 8. **`contextAssembler` smoke test** -- Add at least one test that injects a minimal
409
+ `contextAssembler` fake to verify context threading in IMPLEMENT/FULL modes.
410
+
411
+ ---
412
+
413
+ ## Architectural Note
414
+
415
+ The awaitSessions HTTP bug was an implementation-path bug, not an interface-contract bug. No
416
+ unit-test fake can catch "the real implementation chose the wrong data source." What a better
417
+ fake CAN do is simulate the *outcome* of that wrong choice (daemon unreachable = all-failed)
418
+ so the coordinator's escalation path for that outcome is exercised. The gap was not "bad fake"
419
+ but "untested escalation branch for a failure mode that occurs in production."
420
+
421
+ The correct fix operates at two levels:
422
+ 1. **Fake level:** Simulate transport failures (port unavailable, network error) to exercise
423
+ escalation paths.
424
+ 2. **Implementation level:** Wrap `getAgentResult`, `pollForPR`, and `postToOutbox` calls in
425
+ try/catch in the mode files, so transport errors return `{ kind: 'escalated' }` rather than
426
+ crashing the coordinator.
@@ -96,7 +96,7 @@ etc.) to call its MCP tools over a transport (stdio or HTTP). The process entry
96
96
  | **Team lead** | Get consistent, enforced process on every MR without training reviewers | Reviews are ad-hoc; agents drift and skip steps |
97
97
  | **Platform/infra engineer** | Deploy WorkRail as a service on cloud infrastructure | WorkRail is a local tool that exits when the terminal closes |
98
98
  | **Workflow author** | Write a workflow once, have it run identically in both manual and autonomous mode | Today: manual mode only; would need to rewrite for autonomous mode if it existed separately |
99
- | **WorkRail itself (self-improvement)** | Run `workflow-for-workflows` to author new workflows autonomously | Cannot initiate its own workflows; must be driven by a human |
99
+ | **WorkRail itself (self-improvement)** | Run `wr.workflow-for-workflows` to author new workflows autonomously | Cannot initiate its own workflows; must be driven by a human |
100
100
 
101
101
  ### Core tension
102
102
 
@@ -0,0 +1,242 @@
1
+ # Daemon Console Separation -- Architectural Discovery
2
+
3
+ ## About This Document
4
+
5
+ This is a **human-readable artifact** capturing the discovery process and findings. It is NOT execution memory -- the workflow's durable state lives in WorkRail session notes and context variables. This doc is for the owner to read when reviewing the recommendation or handing work to a coding agent.
6
+
7
+ ---
8
+
9
+ ## Context / Ask
10
+
11
+ The owner wants STRICT separation between the three WorkTrain/WorkRail systems: the MCP server, the daemon, and the console. Currently the daemon starts an embedded console server (`src/trigger/daemon-console.ts`) that holds live references to daemon internals. This creates a port conflict with the standalone console and prevents the standalone console from being the single independently-runnable console process.
12
+
13
+ **Stated goal (solution-statement):** Split by port -- standalone console on 3456 reads filesystem, daemon control endpoints move to port 3200.
14
+
15
+ **Reframed problem:** The daemon and standalone console fight over port 3456 because the daemon embeds a console that duplicates the standalone console's role while adding live daemon object wiring -- yet the browser UI only needs dispatch access, not the full daemon wiring.
16
+
17
+ ---
18
+
19
+ ## Path Recommendation
20
+
21
+ **Path: `landscape_first`**
22
+
23
+ The reframed problem is well-understood from source reading. The dominant need now is comparing the viable architectural approaches (eliminate daemon-console entirely, split-by-port, proxy approach) against the actual constraints in the codebase. We already have enough problem grounding from Step 1 -- the landscape of options is the open question.
24
+
25
+ *Why not `design_first`*: The problem is already well-scoped. We know what's wrong. We need to know which fix is least disruptive and most maintainable.
26
+
27
+ *Why not `full_spectrum`*: The goal was a solution-statement but we already reframed it in Step 1. We have clear success criteria. No further reframing is needed.
28
+
29
+ ---
30
+
31
+ ## Constraints / Anti-goals
32
+
33
+ **Constraints:**
34
+ - The MCP server (`src/mcp/`) must be touched as little as possible -- it is used in production by people other than the owner
35
+ - The standalone console must remain independently runnable (no daemon dependency)
36
+ - The daemon's control operations (dispatch, steer, poll) must remain HTTP-accessible
37
+ - The browser UI dispatch button must continue to work
38
+
39
+ **Anti-goals:**
40
+ - Do not add auth or multi-user features in this change
41
+ - Do not move the webhook receiver (port 3200) off its current role
42
+ - Do not break existing `worktrain console` command behavior
43
+ - Do not require changes to the frontend's Vite dev proxy config unless absolutely necessary
44
+
45
+ ---
46
+
47
+ ## Landscape Packet
48
+
49
+ ### Current State Summary
50
+
51
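+ Three separate server entry points are involved. The first two compete for port 3456 and a single lock file (`~/.workrail/daemon-console.lock`); the third is a legacy path that writes its own `dashboard.lock`: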
52
+
53
+ 1. **`src/console/standalone-console.ts`** (`worktrain console`) -- already filesystem-only, no daemon coupling. Calls `mountConsoleRoutes()` with no daemon objects (all optional params omitted). This is the CORRECT target state.
54
+
55
+ 2. **`src/trigger/daemon-console.ts`** (started by `worktrain daemon`) -- starts another Express server on port 3456 that calls `mountConsoleRoutes()` with live daemon objects. Competes directly with the standalone console. Is the source of the coupling problem.
56
+
57
+ 3. **`src/mcp/server.ts` / HttpServer** -- legacy MCP server console, writes `dashboard.lock`. Mostly retired but still present in some code paths.
58
+
59
+ ### Actual Cross-Boundary Imports (the violations)
60
+
61
+ **Violation 1: `src/v2/usecases/console-routes.ts` imports from `src/daemon/`**
62
+ ```
63
+ import type { SteerRegistry } from '../../daemon/workflow-runner.js';
64
+ import { runWorkflow } from '../../daemon/workflow-runner.js';
65
+ ```
66
+ This puts daemon types, and a runtime dependency on `runWorkflow`, into the shared console route layer. `v2/usecases/` is supposed to be the shared middle layer used by both the standalone console and the daemon. The daemon bleeding into it is a layering violation.
67
+
68
+ **Violation 2: `src/trigger/daemon-console.ts` imports from `src/mcp/types.js`**
69
+ ```
70
+ import type { V2ToolContext } from '../mcp/types.js';
71
+ ```
72
+ The trigger system importing from the MCP system is a soft violation (type-only import, not a hard runtime dependency), but it creates invisible coupling between `src/trigger/` and `src/mcp/`.
73
+
74
+ **Violation 3: `src/v2/usecases/console-routes.ts` imports from `src/trigger/`**
75
+ ```
76
+ import type { TriggerRouter } from '../../trigger/trigger-router.js';
77
+ import type { PollingScheduler } from '../../trigger/polling-scheduler.js';
78
+ ```
79
+ The shared v2/usecases layer imports from the trigger system. The layering rule should be: v2/usecases knows nothing about trigger internals.
80
+
81
+ ### How the Port/Lock System Works
82
+
83
+ - `daemon-console.lock` is the single file read by all CLI commands (`worktrain spawn`, `worktrain trigger poll`, `worktrain await`) to discover the running console port
84
+ - The standalone console writes this file when it starts
85
+ - The daemon-console ALSO writes this file when it starts (they compete)
86
+ - `worktrain spawn` calls `POST /api/v2/auto/dispatch` against the discovered port
87
+ - `worktrain trigger poll` calls `POST /api/v2/triggers/:id/poll` against the discovered port
88
+ - `src/mcp/handlers/session.ts` `handleOpenDashboard()` reads the lock file to construct the dashboard URL
89
+
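The discovery flow above can be sketched roughly as follows. The lock file's on-disk format is not shown in this document, so the JSON shape `{ "port": number }` and the helper name `discoverConsolePort` are assumptions made purely for illustration, not the actual CLI implementation.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Assumed shape of the lock file; the real format is not documented here.
interface ConsoleLock {
  port: number;
}

// Hypothetical helper: returns the port recorded in the lock file, or the
// caller's default when the file is missing, unreadable, or malformed.
function discoverConsolePort(lockPath: string, defaultPort: number): number {
  try {
    const parsed = JSON.parse(fs.readFileSync(lockPath, 'utf8')) as Partial<ConsoleLock>;
    const port = parsed.port;
    const valid = typeof port === 'number' && Number.isInteger(port) && port > 0 && port <= 65535;
    return valid ? (port as number) : defaultPort;
  } catch {
    return defaultPort;
  }
}

// Whichever process wrote the lock file last decides where the request goes.
const lockPath = path.join(os.homedir(), '.workrail', 'daemon-console.lock');
const port = discoverConsolePort(lockPath, 3456);
console.log(`POST http://localhost:${port}/api/v2/auto/dispatch`);
```

Because both the standalone console and the daemon-console write the same file, a helper like this cannot tell which process it is about to talk to; that is the ownership ambiguity described later in this document.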
90
+ ### Vite Dev Proxy
91
+
92
+ `console/vite.config.ts` proxies `/api` to `http://localhost:3456`. The built frontend uses relative URLs. This means:
93
+ - In development: all frontend API calls go to port 3456 (via Vite proxy)
94
+ - In production: all frontend API calls go to the same origin (port 3456) via relative paths
95
+ - **There is no mechanism for the frontend to reach port 3200 today**
96
+
97
+ ### What Endpoints Live Where
98
+
99
+ **Port 3456 (daemon-console / standalone-console) via `mountConsoleRoutes()`:**
100
+ - `GET /api/v2/sessions` -- session list (filesystem read)
101
+ - `GET /api/v2/sessions/:id` -- session detail (filesystem read)
102
+ - `GET /api/v2/sessions/:id/nodes/:nodeId` -- node detail (filesystem read)
103
+ - `GET /api/v2/sessions/:id/events` -- per-session SSE (reads daemon event log file)
104
+ - `GET /api/v2/workspace/events` -- workspace SSE (watches sessions dir)
105
+ - `GET /api/v2/worktrees` -- worktree list (git commands + filesystem)
106
+ - `GET /api/v2/workflows` -- workflow catalog (filesystem)
107
+ - `GET /api/v2/triggers` -- trigger list (requires `TriggerRouter` injection; returns [] without it)
108
+ - `POST /api/v2/auto/dispatch` -- dispatch workflow (requires `V2ToolContext` + optional `TriggerRouter`)
109
+ - `POST /api/v2/triggers/:id/poll` -- force poll (requires `PollingScheduler` injection; returns 503 without it)
110
+ - `POST /api/v2/sessions/:id/steer` -- inject text into running session (requires `SteerRegistry`; returns 503 without it)
111
+
112
+ **Port 3200 (trigger-listener) via `createTriggerApp()`:**
113
+ - `GET /health` -- health check
114
+ - `POST /webhook/:triggerId` -- webhook receiver
115
+
116
+ ### Browser Frontend API Usage
117
+
118
+ Only two control endpoints are called from the browser:
119
+ - `POST /api/v2/auto/dispatch` -- called from `DispatchPane.tsx`
120
+ - `GET /api/v2/triggers` -- called from `DispatchPane.tsx` (trigger list display)
121
+ - All others are read-only data display
122
+
123
+ **Steer and poll have ZERO frontend callers.** They are programmatic coordinator APIs only.
124
+
125
+ ### CLI Commands and Their Console Port Usage
126
+
127
+ | Command | What it calls | Port used |
128
+ |---------|--------------|-----------|
129
+ | `worktrain spawn` | `POST /api/v2/auto/dispatch` | Discovers from `daemon-console.lock`, default 3456 |
130
+ | `worktrain trigger poll <id>` | `POST /api/v2/triggers/:id/poll` | Discovers from `daemon-console.lock`, default 3200 |
131
+ | `worktrain await` | Reads lock file for URL construction | daemon-console.lock port |
132
+ | `src/mcp/handlers/session.ts` | Reads lock file for URL construction | daemon-console.lock port |
133
+
134
+ ### Existing Approaches / Precedents
135
+
136
+ The codebase already has a clean precedent: the standalone console (`standalone-console.ts`) calls `mountConsoleRoutes()` with all optional daemon params omitted. The endpoints that need daemon objects degrade gracefully when those objects are not injected: `steer` and `poll` return `503 Service Unavailable`, and `GET /api/v2/triggers` returns `[]`. This is the "degradation without error" pattern already designed in.
137
+
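A minimal, framework-free sketch of that degradation pattern is shown below. The `SteerRegistry` interface here is a simplified stand-in (the real type lives in `src/daemon/workflow-runner.js` and is not reproduced in this document), and `handleSteer`, `ControlDeps`, and `HttpResult` are hypothetical names.

```typescript
// Simplified stand-in for the daemon's steer registry; the real interface may differ.
interface SteerRegistry {
  steer(sessionId: string, text: string): void;
}

// Optional daemon objects. The standalone console passes none of them.
interface ControlDeps {
  steerRegistry?: SteerRegistry;
}

interface HttpResult {
  status: number;
  body: unknown;
}

// Hypothetical handler: answers 503 instead of throwing when the daemon
// object was not injected, and delegates to it when it was.
function handleSteer(deps: ControlDeps, sessionId: string, text: string): HttpResult {
  if (!deps.steerRegistry) {
    return { status: 503, body: { error: 'steer unavailable: no daemon attached' } };
  }
  deps.steerRegistry.steer(sessionId, text);
  return { status: 200, body: { ok: true } };
}

// Standalone console: no daemon objects, so the endpoint degrades.
console.log(handleSteer({}, 'session-1', 'hello').status);
```

The design choice is that absence of a dependency is an expected state rather than an error path, which is what lets one route-mounting function serve both processes.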
138
+ ### Option Categories
139
+
140
+ Three architectural approaches are viable (see Candidate Directions section).
141
+
142
+ ### Contradictions
143
+
144
+ - The comment in `daemon-console.ts` says it is "designed to be called from the daemon startup path so the console stays up as long as the daemon process runs" -- but if the standalone console is running, the daemon's console conflicts with it.
145
+ - `worktrain-trigger-poll.ts` has `DEFAULT_POLL_PORT = 3200` as a "spec requirement" but then says "in practice, the daemon console writes daemon-console.lock (port 3456)" -- the spec says 3200 but reality is 3456. This inconsistency suggests the spec and implementation diverged.
146
+ - `v2/usecases/console-routes.ts` is supposed to be a shared middle layer but it imports from `src/daemon/workflow-runner.js`. This is an upward dependency from the shared layer to the application layer.
147
+
148
+ ### Evidence Gaps
149
+
150
+ - It is unknown whether the owner intends to ever add steer/poll UI controls in the browser
151
+ - It is unknown whether the daemon is expected to run simultaneously with `worktrain console` (they compete for port 3456 today -- does the owner expect one to take priority?)
152
+
153
+ ---
154
+
155
+ ## Problem Frame Packet
156
+
157
+ ### Users / Stakeholders
158
+
159
+ - **Primary user:** Project owner -- a single developer who runs daemon + console locally
160
+ - **Secondary users:** None for daemon/console. External users only interact with the MCP server.
161
+ - **Affected indirectly:** External MCP users if MCP server is destabilized by changes
162
+
163
+ ### Jobs, Goals, and Outcomes
164
+
165
+ - Run `worktrain console` as a standalone process that works whether or not the daemon is running
166
+ - Run the daemon without it conflicting with the standalone console
167
+ - Browser dispatch button (`DispatchPane`) works when the daemon is running
168
+ - Coordinator scripts can call steer and poll APIs when the daemon is running
169
+ - `worktrain spawn`, `worktrain trigger poll`, `worktrain await` CLI commands continue to work
170
+
171
+ ### Pains / Tensions / Constraints
172
+
173
+ 1. **Port conflict pain:** Daemon starts on port 3456, preventing `worktrain console` from binding. Owner must choose: run daemon OR run console. Can't run both simultaneously today.
174
+
175
+ 2. **Layering violation pain:** `src/v2/usecases/console-routes.ts` -- intended as a shared layer -- imports from `src/daemon/` and `src/trigger/`. This means any change to daemon or trigger internals could break the console routes, and the console routes cannot be reasoned about independently.
176
+
177
+ 3. **Ownership ambiguity:** Both `daemon-console.ts` and `standalone-console.ts` write the same lock file and bind the same port. Neither "owns" the console definitively. CLI tools read whichever wrote last.
178
+
179
+ 4. **Dispatch coupling tension:** The dispatch button requires a live `V2ToolContext` (for `executeStartWorkflow`) and optionally a `TriggerRouter` (for queue serialization). These objects only exist inside the daemon process. The standalone console cannot dispatch autonomously without them.
180
+
181
+ ### Success Criteria
182
+
183
+ 1. `worktrain console` binds port 3456 and serves sessions regardless of whether the daemon is running
184
+ 2. When the daemon is running AND the standalone console is running, both processes work without port conflict
185
+ 3. `POST /api/v2/auto/dispatch` from the browser UI succeeds when the daemon is running
186
+ 4. `POST /api/v2/sessions/:id/steer` and `POST /api/v2/triggers/:id/poll` remain functional for coordinator scripts
187
+ 5. `worktrain spawn` and `worktrain trigger poll` continue to work
188
+ 6. No imports cross from `src/v2/usecases/` into `src/daemon/` or `src/trigger/`
189
+ 7. `npx vitest run` passes
190
+
191
+ ### Assumptions
192
+
193
+ - The owner wants daemon and standalone console to coexist simultaneously (not be mutually exclusive)
194
+ - Steer and poll will remain coordinator-script-only APIs (no browser UI for them)
195
+ - The browser frontend will NOT be rewritten to support dual-port API calls
196
+
197
+ ### Reframes / HMW Questions
198
+
199
+ 1. **HMW eliminate the daemon's embedded console entirely?** Instead of the daemon starting its own console, the standalone console could be the only console. The daemon adds its control surface (dispatch, steer, poll) to port 3200 or a new dedicated port (3201). The standalone console proxies dispatch requests to the daemon port, or the dispatch button in the browser is disabled when the daemon port is unreachable.
200
+
201
+ 2. **HMW make dispatch work without the daemon holding objects?** Instead of `mountConsoleRoutes()` holding a live `TriggerRouter` reference, the dispatch endpoint could make an HTTP POST to the trigger-listener on port 3200 (`/webhook/:triggerId`) or a new `/dispatch` endpoint on 3200. This decouples the console from the daemon's object graph.
202
+
203
+ 3. **HMW minimize the change surface?** The cleanest path might be: (a) remove the daemon's console startup from `cli-worktrain.ts`, (b) move the three control endpoints (`dispatch`, `steer`, `poll`) OUT of `mountConsoleRoutes()` and into a separate `mountDaemonControlRoutes()`, (c) the daemon mounts both on port 3200 (alongside `/webhook`), (d) the standalone console mounts only `mountConsoleRoutes()`, (e) CLI tools discover which port has control endpoints via separate lock files.
204
+
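The split proposed in reframe 3 could look roughly like the sketch below. It uses a tiny `RouteTable` class as a stand-in for an HTTP router, and both mount functions are simplified: the real `mountConsoleRoutes()` takes different parameters, and `mountDaemonControlRoutes()` does not exist yet. In this sketch the daemon mounts only the control routes next to `/webhook`; whether it should also mount the read-only routes is left open.

```typescript
type Method = 'GET' | 'POST';

// Minimal stand-in for an HTTP router: it only records what gets mounted.
class RouteTable {
  readonly routes: string[] = [];
  add(method: Method, path: string): void {
    this.routes.push(`${method} ${path}`);
  }
}

// Read-only, filesystem-backed routes (abbreviated): safe for the standalone console.
function mountConsoleRoutes(table: RouteTable): void {
  table.add('GET', '/api/v2/sessions');
  table.add('GET', '/api/v2/sessions/:id');
  table.add('GET', '/api/v2/workflows');
}

// Hypothetical: the three control routes that need live daemon objects.
function mountDaemonControlRoutes(table: RouteTable): void {
  table.add('POST', '/api/v2/auto/dispatch');
  table.add('POST', '/api/v2/sessions/:id/steer');
  table.add('POST', '/api/v2/triggers/:id/poll');
}

// Standalone console process (port 3456): read-only routes only.
const standalone = new RouteTable();
mountConsoleRoutes(standalone);

// Daemon process (port 3200): webhook receiver plus control routes.
const daemon = new RouteTable();
daemon.add('POST', '/webhook/:triggerId');
mountDaemonControlRoutes(daemon);

console.log(standalone.routes);
console.log(daemon.routes);
```

With this shape, the shared route layer no longer needs to import anything from `src/daemon/` or `src/trigger/`, at the cost of the browser dispatch button needing some way to reach the daemon's port.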
205
+ ### What Would Make This Framing Wrong
206
+
207
+ - If the owner DOES want to run daemon and standalone console as mutually exclusive alternatives (not simultaneously), then port conflict is not a bug but a feature -- and the real problem is just the import layering violations
208
+ - If the owner wants to add steer/poll browser UI, the "steer and poll are API-only" assumption is wrong and the design needs to account for cross-port browser calls
209
+ - If the owner wants NO changes to `worktrain spawn` and `worktrain trigger poll` behavior, any solution that moves dispatch/poll to a different port must preserve the lock file discovery mechanism exactly
210
+
211
+ ---
212
+
213
+ ## Candidate Directions
214
+
215
+ *(To be populated in Phase 1)*
216
+
217
+ ---
218
+
219
+ ## Challenge Notes
220
+
221
+ *(To be populated in Phase 2)*
222
+
223
+ ---
224
+
225
+ ## Resolution Notes
226
+
227
+ *(To be populated in Phase 2)*
228
+
229
+ ---
230
+
231
+ ## Decision Log
232
+
233
+ | Date | Decision | Rationale |
234
+ |------|----------|-----------|
235
+ | 2026-04-21 | Goal reclassified as solution_statement | "Split by port" names a specific approach, not the problem |
236
+ | 2026-04-21 | Path = landscape_first | Problem is well-understood; option landscape is the open question |
237
+
238
+ ---
239
+
240
+ ## Final Summary
241
+
242
+ *(To be populated at end of workflow)*