@exaudeus/workrail 3.67.0 → 3.68.0

Files changed (140)
  1. package/dist/application/services/compiler/template-registry.js +10 -1
  2. package/dist/cli/commands/worktrain-init.js +1 -1
  3. package/dist/console-ui/assets/{index-tOl8Vowf.js → index-CyzltI6D.js} +1 -1
  4. package/dist/console-ui/index.html +1 -1
  5. package/dist/coordinators/modes/full-pipeline.js +4 -4
  6. package/dist/coordinators/modes/implement-shared.js +5 -5
  7. package/dist/coordinators/modes/implement.js +4 -4
  8. package/dist/coordinators/pr-review.js +4 -4
  9. package/dist/daemon/workflow-runner.d.ts +1 -0
  10. package/dist/daemon/workflow-runner.js +1 -0
  11. package/dist/manifest.json +25 -25
  12. package/dist/mcp/handlers/v2-workflow.js +1 -1
  13. package/dist/mcp/workflow-protocol-contracts.js +2 -2
  14. package/docs/authoring-v2.md +4 -4
  15. package/docs/changelog-recent.md +3 -3
  16. package/docs/configuration.md +1 -1
  17. package/docs/design/adaptive-coordinator-context-candidates.md +1 -1
  18. package/docs/design/adaptive-coordinator-context.md +1 -1
  19. package/docs/design/adaptive-coordinator-routing-candidates.md +18 -18
  20. package/docs/design/adaptive-coordinator-routing-review.md +1 -1
  21. package/docs/design/adaptive-coordinator-routing.md +34 -34
  22. package/docs/design/agent-cascade-protocol.md +2 -2
  23. package/docs/design/console-daemon-separation-discovery.md +323 -0
  24. package/docs/design/context-assembly-design-candidates.md +1 -1
  25. package/docs/design/context-assembly-implementation-plan.md +1 -1
  26. package/docs/design/context-assembly-layer.md +2 -2
  27. package/docs/design/context-assembly-review-findings.md +1 -1
  28. package/docs/design/coordinator-access-audit.md +293 -0
  29. package/docs/design/coordinator-architecture-audit.md +62 -0
  30. package/docs/design/coordinator-error-handling-audit.md +240 -0
  31. package/docs/design/coordinator-testability-audit.md +426 -0
  32. package/docs/design/daemon-architecture-discovery.md +1 -1
  33. package/docs/design/daemon-console-separation-discovery.md +242 -0
  34. package/docs/design/daemon-memory-audit.md +203 -0
  35. package/docs/design/design-candidates-console-daemon-separation.md +256 -0
  36. package/docs/design/design-candidates-discovery-loop-fix.md +141 -0
  37. package/docs/design/design-review-findings-console-daemon-separation.md +106 -0
  38. package/docs/design/design-review-findings-discovery-loop-fix.md +81 -0
  39. package/docs/design/discovery-loop-fix-candidates.md +161 -0
  40. package/docs/design/discovery-loop-fix-design-review.md +106 -0
  41. package/docs/design/discovery-loop-fix-validation.md +258 -0
  42. package/docs/design/discovery-loop-investigation-A.md +188 -0
  43. package/docs/design/discovery-loop-investigation-B.md +287 -0
  44. package/docs/design/exploration-workflow-candidates.md +205 -0
  45. package/docs/design/exploration-workflow-design-review.md +166 -0
  46. package/docs/design/exploration-workflow-discovery.md +443 -0
  47. package/docs/design/ide-context-files-candidates.md +231 -0
  48. package/docs/design/ide-context-files-design-review.md +85 -0
  49. package/docs/design/ide-context-files.md +615 -0
  50. package/docs/design/implementation-plan-discovery-loop-fix.md +199 -0
  51. package/docs/design/implementation-plan-queue-poll-rotation.md +102 -0
  52. package/docs/design/in-process-http-audit.md +190 -0
  53. package/docs/design/layer3b-ghost-nodes-design-candidates.md +2 -2
  54. package/docs/design/loadSessionNotes-candidates.md +108 -0
  55. package/docs/design/loadSessionNotes-test-coverage-discovery.md +297 -0
  56. package/docs/design/loadSessionNotes-test-coverage-session4.md +209 -0
  57. package/docs/design/loadSessionNotes-test-coverage-v3.md +321 -0
  58. package/docs/design/probe-session-design-candidates.md +261 -0
  59. package/docs/design/probe-session-phase0.md +490 -0
  60. package/docs/design/routines-guide.md +7 -7
  61. package/docs/design/session-metrics-attribution-candidates.md +250 -0
  62. package/docs/design/session-metrics-attribution-design-review.md +115 -0
  63. package/docs/design/session-metrics-attribution-discovery.md +319 -0
  64. package/docs/design/session-metrics-candidates.md +227 -0
  65. package/docs/design/session-metrics-design-review.md +104 -0
  66. package/docs/design/session-metrics-discovery.md +454 -0
  67. package/docs/design/spawn-session-debug.md +202 -0
  68. package/docs/design/trigger-validator-candidates.md +214 -0
  69. package/docs/design/trigger-validator-review.md +109 -0
  70. package/docs/design/trigger-validator-shaping-phase0.md +239 -0
  71. package/docs/design/trigger-validator.md +454 -0
  72. package/docs/design/v2-core-design-locks.md +2 -2
  73. package/docs/design/workflow-extension-points.md +15 -15
  74. package/docs/design/workflow-id-validation-at-startup.md +1 -1
  75. package/docs/design/workflow-id-validation-implementation-plan.md +2 -2
  76. package/docs/design/workflow-trigger-lifecycle-audit.md +175 -0
  77. package/docs/design/worktrain-task-queue-candidates.md +5 -5
  78. package/docs/design/worktrain-task-queue.md +4 -4
  79. package/docs/discovery/coordinator-script-design.md +1 -1
  80. package/docs/discovery/coordinator-ux-discovery.md +3 -3
  81. package/docs/discovery/simulation-report.md +1 -1
  82. package/docs/discovery/workflow-modernization-discovery.md +326 -0
  83. package/docs/discovery/workflow-selection-for-discovery-tasks.md +33 -33
  84. package/docs/discovery/worktrain-status-briefing.md +1 -1
  85. package/docs/discovery/wr-discovery-goal-reframing.md +1 -1
  86. package/docs/docker.md +1 -1
  87. package/docs/ideas/backlog.md +227 -0
  88. package/docs/ideas/third-party-workflow-setup-design-thinking.md +1 -1
  89. package/docs/integrations/claude-code.md +5 -5
  90. package/docs/integrations/firebender.md +1 -1
  91. package/docs/plans/agentic-orchestration-roadmap.md +2 -2
  92. package/docs/plans/mr-review-workflow-redesign.md +9 -9
  93. package/docs/plans/ui-ux-workflow-design-candidates.md +4 -4
  94. package/docs/plans/ui-ux-workflow-discovery.md +2 -2
  95. package/docs/plans/workflow-categories-candidates.md +8 -8
  96. package/docs/plans/workflow-categories-discovery.md +4 -4
  97. package/docs/plans/workflow-modernization-design.md +430 -0
  98. package/docs/plans/workflow-staleness-detection-candidates.md +11 -11
  99. package/docs/plans/workflow-staleness-detection-review.md +4 -4
  100. package/docs/plans/workflow-staleness-detection.md +9 -9
  101. package/docs/plans/workrail-platform-vision.md +3 -3
  102. package/docs/reference/agent-context-cleaner-snippet.md +1 -1
  103. package/docs/reference/agent-context-guidance.md +4 -4
  104. package/docs/reference/context-optimization.md +2 -2
  105. package/docs/roadmap/now-next-later.md +2 -2
  106. package/docs/roadmap/open-work-inventory.md +16 -16
  107. package/docs/workflows.md +31 -31
  108. package/package.json +1 -1
  109. package/spec/workflow-tags.json +47 -47
  110. package/workflows/adaptive-ticket-creation.json +16 -16
  111. package/workflows/architecture-scalability-audit.json +22 -22
  112. package/workflows/bug-investigation.agentic.v2.json +3 -3
  113. package/workflows/classify-task-workflow.json +1 -1
  114. package/workflows/coding-task-workflow-agentic.json +6 -6
  115. package/workflows/cross-platform-code-conversion.v2.json +8 -8
  116. package/workflows/document-creation-workflow.json +8 -8
  117. package/workflows/documentation-update-workflow.json +8 -8
  118. package/workflows/intelligent-test-case-generation.json +2 -2
  119. package/workflows/learner-centered-course-workflow.json +2 -2
  120. package/workflows/mr-review-workflow.agentic.v2.json +4 -4
  121. package/workflows/personal-learning-materials-creation-branched.json +8 -8
  122. package/workflows/presentation-creation.json +5 -5
  123. package/workflows/production-readiness-audit.json +1 -1
  124. package/workflows/relocation-workflow-us.json +31 -31
  125. package/workflows/routines/context-gathering.json +1 -1
  126. package/workflows/routines/design-review.json +1 -1
  127. package/workflows/routines/execution-simulation.json +1 -1
  128. package/workflows/routines/feature-implementation.json +3 -3
  129. package/workflows/routines/final-verification.json +1 -1
  130. package/workflows/routines/hypothesis-challenge.json +1 -1
  131. package/workflows/routines/ideation.json +1 -1
  132. package/workflows/routines/parallel-work-partitioning.json +3 -3
  133. package/workflows/routines/philosophy-alignment.json +2 -2
  134. package/workflows/routines/plan-analysis.json +1 -1
  135. package/workflows/routines/plan-generation.json +1 -1
  136. package/workflows/routines/tension-driven-design.json +6 -6
  137. package/workflows/scoped-documentation-workflow.json +26 -26
  138. package/workflows/ui-ux-design-workflow.json +14 -14
  139. package/workflows/workflow-diagnose-environment.json +1 -1
  140. package/workflows/workflow-for-workflows.json +1 -1
@@ -10,9 +10,9 @@

  ## Context / Ask

- A daemon session was dispatched using `coding-task-workflow-agentic` with a goal that said "Discovery only -- Do NOT write any code". The session ran 11 advances, produced good design candidate notes, stopped at event 74 with no `run_completed`, and the later advances had no note output (likely conditional skips).
+ A daemon session was dispatched using `wr.coding-task` with a goal that said "Discovery only -- Do NOT write any code". The session ran 11 advances, produced good design candidate notes, stopped at event 74 with no `run_completed`, and the later advances had no note output (likely conditional skips).

- The question: for a discovery-only task (no code, just a design document), should we use `coding-task-workflow-agentic` or `wr.discovery`? And can `coding-task-workflow-agentic` be trusted to stay in discovery mode when the goal explicitly says no code?
+ The question: for a discovery-only task (no code, just a design document), should we use `wr.coding-task` or `wr.discovery`? And can `wr.coding-task` be trusted to stay in discovery mode when the goal explicitly says no code?

  ---

@@ -42,11 +42,11 @@ The question: for a discovery-only task (no code, just a design document), shoul

  ### Current state summary

- `coding-task-workflow-agentic` (lean v2, v1.1.0) is a full implementation lifecycle workflow. Its `about` field says: "Use this to implement a software feature or task." Its preconditions include "A deterministic validation path exists (tests, build, or an explicit verification strategy)." It explicitly describes what it produces: `implementation_plan.md`, `spec.md`, code slices, and a PR-ready handoff with commit JSON.
+ `wr.coding-task` (lean v2, v1.1.0) is a full implementation lifecycle workflow. Its `about` field says: "Use this to implement a software feature or task." Its preconditions include "A deterministic validation path exists (tests, build, or an explicit verification strategy)." It explicitly describes what it produces: `implementation_plan.md`, `spec.md`, code slices, and a PR-ready handoff with commit JSON.

  `wr.discovery` (v3.1.0) is a structured thinking/design workflow. Its `about` field says: "Use this to explore and think through a problem end-to-end." Its metaGuidance explicitly states: "Boundary: this workflow can end with a recommendation memo, prototype or test plan, or a research-informed direction. It should not implement production code."

- ### Step structure analysis: coding-task-workflow-agentic
+ ### Step structure analysis: wr.coding-task

  | Step | Condition | Discovery-relevant? |
  |------|-----------|---------------------|
@@ -67,7 +67,7 @@ The question: for a discovery-only task (no code, just a design document), shoul

  For Medium/Large tasks, the workflow runs the full design pipeline (phases 0-4) which produces `design-candidates.md` -- but it then continues directly into implementation (phases 6-7). There is no early exit after design.

- **Does coding-task-workflow-agentic have a "discovery only" mode?** No. It has no `runCondition` or context variable that would stop before implementation when a goal says "no code". The only escape hatch would be the agent choosing to stop itself based on the goal text -- which is an honor-system trust, not a structural guarantee.
+ **Does wr.coding-task have a "discovery only" mode?** No. It has no `runCondition` or context variable that would stop before implementation when a goal says "no code". The only escape hatch would be the agent choosing to stop itself based on the goal text -- which is an honor-system trust, not a structural guarantee.

  ### What phases run for Small vs Medium/Large

@@ -101,15 +101,15 @@ It explicitly cannot produce production code. It always ends with a design docum

  ### Option categories

- 1. **Use wr.discovery** for discovery tasks, `coding-task-workflow-agentic` for implementation tasks
- 2. **Use coding-task-workflow-agentic for everything**, trusting the agent to stop early when goal says "no code"
- 3. **Add a discovery-mode flag** to `coding-task-workflow-agentic` via a `runCondition` on phases 6-7
+ 1. **Use wr.discovery** for discovery tasks, `wr.coding-task` for implementation tasks
+ 2. **Use wr.coding-task for everything**, trusting the agent to stop early when goal says "no code"
+ 3. **Add a discovery-mode flag** to `wr.coding-task` via a `runCondition` on phases 6-7
  4. **Use separate triggers** in triggers.yml with different `workflowId` per task type

  ### Contradictions / disagreements

- - The daemon session with `coding-task-workflow-agentic` produced "good design candidates notes" -- so the workflow does good design work even though it is intended for implementation. The design pipeline (phases 1-4) is legitimate and high quality.
- - The risk is not that `coding-task-workflow-agentic` does bad design work. The risk is that (a) it might not stop before phase-6 reliably, and (b) it carries implementation framing (slices, spec, PR handoff) that pollutes a pure discovery context.
+ - The daemon session with `wr.coding-task` produced "good design candidates notes" -- so the workflow does good design work even though it is intended for implementation. The design pipeline (phases 1-4) is legitimate and high quality.
+ - The risk is not that `wr.coding-task` does bad design work. The risk is that (a) it might not stop before phase-6 reliably, and (b) it carries implementation framing (slices, spec, PR handoff) that pollutes a pure discovery context.

  ### Evidence gaps

@@ -131,12 +131,12 @@ It explicitly cannot produce production code. It always ends with a design docum

  - Dispatch a session that produces a design document and nothing else
  - Know with certainty that no code will be written, regardless of agent judgment
- - Get a high-quality, structured design output comparable to what coding-task-workflow-agentic's design phases produce
+ - Get a high-quality, structured design output comparable to what wr.coding-task's design phases produce

  ### Pains / tensions / constraints

  - The daemon currently has ONE `workflowId` in triggers.yml -- no per-task routing
- - `coding-task-workflow-agentic` is trusted for design quality but is not structurally bounded to stop before code
+ - `wr.coding-task` is trusted for design quality but is not structurally bounded to stop before code
  - `wr.discovery` is structurally bounded to no-code but may produce different design output depth

  ### Success criteria
@@ -148,12 +148,12 @@ It explicitly cannot produce production code. It always ends with a design docum
  ### Assumptions

  - The daemon reads `workflowId` directly from triggers.yml and cannot dynamically select based on goal text
- - `wr.discovery` produces design candidates comparable in quality to what phases 1-4 of `coding-task-workflow-agentic` produce
+ - `wr.discovery` produces design candidates comparable in quality to what phases 1-4 of `wr.coding-task` produce
  - triggers.yml supports multiple trigger entries with different `workflowId` values

  ### Reframes / HMW questions

- - HMW: How might we route discovery tasks to `wr.discovery` and implementation tasks to `coding-task-workflow-agentic` at the dispatcher level instead of relying on agent judgment?
+ - HMW: How might we route discovery tasks to `wr.discovery` and implementation tasks to `wr.coding-task` at the dispatcher level instead of relying on agent judgment?
  - HMW: How might we make "discovery only" a structural guarantee rather than a goal-text instruction?

  ### What would make this framing wrong
@@ -183,15 +183,15 @@ Configure a second trigger entry in triggers.yml with `workflowId: wr.discovery`

  **Why it fits:** Structural guarantee. `wr.discovery` was explicitly designed for this use case. Its metaGuidance says "should not implement production code."

- **Strongest evidence for it:** The session incident shows the risk of relying on honor-system stop behavior in `coding-task-workflow-agentic`. Structural routing removes the risk entirely.
+ **Strongest evidence for it:** The session incident shows the risk of relying on honor-system stop behavior in `wr.coding-task`. Structural routing removes the risk entirely.

- **Strongest risk against it:** triggers.yml currently supports one trigger per session. If it cannot support multiple triggers with per-task routing, this requires daemon work. Also, `wr.discovery` produces a recommendation memo/design doc, not the same `design-candidates.md` artifact shape that `coding-task-workflow-agentic` phases 1-4 produce.
+ **Strongest risk against it:** triggers.yml currently supports one trigger per session. If it cannot support multiple triggers with per-task routing, this requires daemon work. Also, `wr.discovery` produces a recommendation memo/design doc, not the same `design-candidates.md` artifact shape that `wr.coding-task` phases 1-4 produce.

  **When it should win:** Always, for any task where the desired output is a design document and there is no intent to implement code in the same session.

  ---

- ### Direction B: Trust coding-task-workflow-agentic with honor-system stop
+ ### Direction B: Trust wr.coding-task with honor-system stop

  Keep triggers.yml as-is. Rely on the goal text ("Discovery only -- Do NOT write any code") to instruct the agent to stop before phase-6.

@@ -205,27 +205,27 @@ Keep triggers.yml as-is. Rely on the goal text ("Discovery only -- Do NOT write

  ---

- ### Direction C: Add discoveryMode flag to coding-task-workflow-agentic
+ ### Direction C: Add discoveryMode flag to wr.coding-task

- Modify `coding-task-workflow-agentic` to support a `discoveryMode` context variable. Add `runCondition: { var: "discoveryMode", not_equals: true }` to phases 6 and 7. Pass `discoveryMode: true` via the goal or a trigger-level context override.
+ Modify `wr.coding-task` to support a `discoveryMode` context variable. Add `runCondition: { var: "discoveryMode", not_equals: true }` to phases 6 and 7. Pass `discoveryMode: true` via the goal or a trigger-level context override.

- **Why it fits:** Preserves the high-quality design pipeline of `coding-task-workflow-agentic` while adding a structural stop before implementation.
+ **Why it fits:** Preserves the high-quality design pipeline of `wr.coding-task` while adding a structural stop before implementation.

- **Strongest evidence for it:** The design phases (1-4) of `coding-task-workflow-agentic` are well-designed and familiar. Reusing them avoids duplication.
+ **Strongest evidence for it:** The design phases (1-4) of `wr.coding-task` are well-designed and familiar. Reusing them avoids duplication.

  **Strongest risk against it:** This requires modifying a core workflow file. It adds complexity to a workflow that was designed for a different purpose. It creates a hybrid that does neither thing cleanly. And triggers.yml still only has one trigger, so the `discoveryMode` value must come from somewhere (goal text parse? trigger-level context?).

- **When it should win:** If modifying `wr.discovery` or the daemon is unavailable, and modifying `coding-task-workflow-agentic` is cheap and acceptable.
+ **When it should win:** If modifying `wr.discovery` or the daemon is unavailable, and modifying `wr.coding-task` is cheap and acceptable.

  ---

  ## Challenge Notes

- **Against Direction A (wr.discovery):** The design output format differs. `coding-task-workflow-agentic` produces `design-candidates.md` via the `tension-driven-design` routine, followed by a `design-review-findings.md` and a full `implementation_plan.md`. `wr.discovery` produces a design doc with Candidate Directions and a recommendation. For a technical question about workflow architecture, the `wr.discovery` output (a recommendation memo) is actually _more_ appropriate than `implementation_plan.md`. The format difference is not a disadvantage.
+ **Against Direction A (wr.discovery):** The design output format differs. `wr.coding-task` produces `design-candidates.md` via the `tension-driven-design` routine, followed by a `design-review-findings.md` and a full `implementation_plan.md`. `wr.discovery` produces a design doc with Candidate Directions and a recommendation. For a technical question about workflow architecture, the `wr.discovery` output (a recommendation memo) is actually _more_ appropriate than `implementation_plan.md`. The format difference is not a disadvantage.

  **Against Direction B:** The incident already showed the risk. The session stopped at event 74 with no `run_completed`. We do not know if it stopped intentionally or by timeout/connection drop. If it stopped by timeout, the next session might not stop in the same place. Structural guarantees are always preferred over honor-system constraints when the downside (code written to a wrong branch) is recoverable but costly.

- **Against Direction C:** Modifying `coding-task-workflow-agentic` for a use case it was not designed for violates the "make illegal states unrepresentable" principle. It is better to use the right tool than to add a mode switch to the wrong tool.
+ **Against Direction C:** Modifying `wr.coding-task` for a use case it was not designed for violates the "make illegal states unrepresentable" principle. It is better to use the right tool than to add a mode switch to the wrong tool.

  ---
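
Reviewer sketch (not part of the package diff): one way the `runCondition` proposed in Direction C could gate phases 6 and 7. The `RunCondition` and `StepContext` types and the `shouldRun` helper are hypothetical illustrations and are not WorkRail's actual evaluator; only the `{ var: "discoveryMode", not_equals: true }` shape is taken from the text above.

```typescript
// Hypothetical shape, modeled on the runCondition quoted in the document.
type RunCondition = { var: string; not_equals: unknown };
type StepContext = Record<string, unknown>;

// A step runs unless the named context variable strictly equals the excluded value.
// Steps without a condition always run.
function shouldRun(cond: RunCondition | undefined, ctx: StepContext): boolean {
  if (cond === undefined) return true;
  return ctx[cond.var] !== cond.not_equals;
}

// The condition Direction C would attach to phases 6 and 7.
const implementationGate: RunCondition = { var: "discoveryMode", not_equals: true };

// With discoveryMode set, the implementation phases are skipped.
const discoveryRun = shouldRun(implementationGate, { discoveryMode: true });

// With the variable unset, the phases run as they do today.
const normalRun = shouldRun(implementationGate, {});
```

Under this assumed semantics an unset variable leaves the workflow's current behavior unchanged, which is why the value still has to be supplied from somewhere, as the "Strongest risk against it" paragraph notes.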
 
@@ -256,7 +256,7 @@ Modify `coding-task-workflow-agentic` to support a `discoveryMode` context varia
  #### Recommendation

  For a discovery-only task (no code, just a design document):
- - **Use `wr.discovery`**, not `coding-task-workflow-agentic`
+ - **Use `wr.discovery`**, not `wr.coding-task`
  - Add a second trigger entry to `triggers.yml` with a unique `id` and `workflowId: wr.discovery`
  - The daemon's trigger-store.ts and trigger-router.ts already support multiple triggers with different workflowIds -- no code change required

@@ -266,7 +266,7 @@ For a discovery-only task (no code, just a design document):
  triggers:
  - id: test-task
  provider: generic
- workflowId: coding-task-workflow-agentic
+ workflowId: wr.coding-task
  workspacePath: /Users/etienneb/git/personal/workrail
  goal: "Add the evidenceFrom field to AssessmentDimension..."
  concurrencyMode: parallel
@@ -287,13 +287,13 @@ triggers:

  The caller must send the correct `triggerId` (`discovery-task` vs `test-task`) when firing the webhook.

- #### Why coding-task-workflow-agentic cannot be trusted in discovery mode
+ #### Why wr.coding-task cannot be trusted in discovery mode

- `coding-task-workflow-agentic` has no structural stop before phase-6 (Implement Slice-by-Slice). For Small tasks, phase-5 (Small Task Fast Path) explicitly requires writing code. For Medium/Large tasks, the design pipeline (phases 0-4) produces good design work, then phase-6 writes code. The only protection against code-writing is the agent choosing to stop based on goal text -- an honor-system constraint that can fail under context window pressure.
+ `wr.coding-task` has no structural stop before phase-6 (Implement Slice-by-Slice). For Small tasks, phase-5 (Small Task Fast Path) explicitly requires writing code. For Medium/Large tasks, the design pipeline (phases 0-4) produces good design work, then phase-6 writes code. The only protection against code-writing is the agent choosing to stop based on goal text -- an honor-system constraint that can fail under context window pressure.

  The prior session stopped at event 74 (likely after phase-4, before phase-6) -- but we cannot confirm whether this was agent judgment or a connection drop. With `wr.discovery`, the question is irrelevant: there are no phases 6-7 to reach.

- #### What phases coding-task-workflow-agentic skips for Small tasks
+ #### What phases wr.coding-task skips for Small tasks

  - Skips: phase-1a (hypothesis), phase-1b (design), phase-1c (challenge), phase-2 (design review), phase-3 (plan), phase-3b (spec), phase-4 (plan audit), phase-6 (implementation), phase-7 (verification)
  - Runs: phase-0 (classify) and phase-5 (Small Task Fast Path -- **writes code**)
@@ -302,24 +302,24 @@ For Medium/Large tasks, all phases run in sequence, including phase-6 (implement

  #### Would wr.discovery have been a better choice?

- Yes, without qualification. `wr.discovery` was designed for exactly this use case. Its metaGuidance states: "should not implement production code." All paths end with a recommendation memo, prototype spec, or research plan. It uses the same `tension-driven-design` routine as `coding-task-workflow-agentic` phases 1b, so design quality is equivalent.
+ Yes, without qualification. `wr.discovery` was designed for exactly this use case. Its metaGuidance states: "should not implement production code." All paths end with a recommendation memo, prototype spec, or research plan. It uses the same `tension-driven-design` routine as `wr.coding-task` phases 1b, so design quality is equivalent.

  #### How to configure triggers.yml for discovery vs implementation

- - **Implementation tasks**: `workflowId: coding-task-workflow-agentic` -- use the existing `test-task` trigger or rename it
+ - **Implementation tasks**: `workflowId: wr.coding-task` -- use the existing `test-task` trigger or rename it
  - **Discovery tasks**: `workflowId: wr.discovery` -- add a new trigger entry (e.g., `id: discovery-task`)
  - Route by sending the correct `triggerId` in the webhook

  #### Workflow selection strategy when the daemon has ONE workflowId configured

- The current `test-task` trigger always dispatches to `coding-task-workflow-agentic`. For discovery tasks, either:
+ The current `test-task` trigger always dispatches to `wr.coding-task`. For discovery tasks, either:
  1. Add a second trigger entry (preferred -- structural routing, zero code change)
  2. Temporarily change the trigger's `workflowId` to `wr.discovery` for discovery sessions, then change it back (workable but manual and error-prone)
  3. Use console AUTO dispatch and set `workflowId: wr.discovery` explicitly in the dispatch request (for console-dispatched sessions only)

  Option 1 is the right answer.

- ### Strongest alternative: Direction C (add discoveryMode flag to coding-task-workflow-agentic)
+ ### Strongest alternative: Direction C (add discoveryMode flag to wr.coding-task)

  If the two-trigger routing were unavailable (it is not), adding `runCondition: { var: "discoveryMode", not_equals: true }` to phases 6-7 would also provide structural enforcement. Loses: workflow cleanliness, YAGNI compliance, reversibility. Not recommended when Direction A is available.
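
Reviewer sketch (not part of the package diff): the two-trigger routing preferred in this hunk, reduced to a lookup from `triggerId` to `workflowId`. The `TriggerEntry` interface and `resolveWorkflowId` function are hypothetical and do not reproduce the daemon's trigger-store.ts or trigger-router.ts; the `id` and `workflowId` field names and the `test-task` / `discovery-task` ids come from the document.

```typescript
// Hypothetical in-memory form of two triggers.yml entries.
interface TriggerEntry {
  id: string;
  workflowId: string;
}

const triggers: TriggerEntry[] = [
  { id: "test-task", workflowId: "wr.coding-task" },
  { id: "discovery-task", workflowId: "wr.discovery" },
];

// Resolve the workflow for an incoming webhook by its triggerId.
// Unknown ids fail loudly instead of falling back to a default workflow,
// so a mistyped id can never reach the implementation workflow by accident.
function resolveWorkflowId(triggerId: string, entries: TriggerEntry[]): string {
  const match = entries.find((entry) => entry.id === triggerId);
  if (match === undefined) {
    throw new Error(`Unknown triggerId: ${triggerId}`);
  }
  return match.workflowId;
}

const discoveryWorkflow = resolveWorkflowId("discovery-task", triggers);
const implementationWorkflow = resolveWorkflowId("test-task", triggers);
```

The point of the sketch is that the workflow choice is fixed by configuration at dispatch time, so the no-code guarantee does not depend on the agent reading the goal text.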
 
@@ -301,7 +301,7 @@ ACTIVE (2 sessions)
  Discovery: what data exists today that a 'worktrain status' plain-English briefing command could use
  Step: phase-3-synthesize Running 22 min

- ● coding-task-workflow-agentic
+ wr.coding-task
  Implement GitHub polling adapter for Issues/PRs without requiring webhooks
  Step: phase-2-implement Running 8 min ⚠ no activity for 18 min

@@ -296,7 +296,7 @@ C2 is more structurally correct -- a mandatory separate step enforces that goal

  ### Next actions

- These findings are the input to Phase 2: the `workflow-for-workflows` workflow will design the implementation based on this diagnosis.
+ These findings are the input to Phase 2: the `wr.workflow-for-workflows` workflow will design the implementation based on this diagnosis.

  1. The wfw workflow should receive: the full diagnosis (Phase 0 is the root cause), the specific changes needed (5 changes listed above), the priority order (Phase 0 goalType classification is highest priority), and the decision to implement the C1+C3 hybrid, not C2.
  2. After wfw produces the improved workflow, write it to `workflows/wr.discovery.json`.
package/docs/docker.md CHANGED
@@ -70,7 +70,7 @@ echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | docker run -
  echo '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"workflow_list","arguments":{}}}' | docker run --rm -i workrail-mcp

  # Test getting a specific workflow
- echo '{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"workflow_get","arguments":{"id":"coding-task-workflow-agentic","mode":"metadata"}}}' | docker run --rm -i workrail-mcp
+ echo '{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"workflow_get","arguments":{"id":"wr.coding-task","mode":"metadata"}}}' | docker run --rm -i workrail-mcp
  ```

  ## Custom Workflows
@@ -7577,3 +7577,230 @@ Discovery session `ecf359d7` running: 77 turns, 11 step advances (active, making
  **Priority:** High. Every daemon crash currently wastes all in-flight work and waits up to 56 min before retrying. With even basic resume (step > 0 → resume, step = 0 → discard + fast re-dispatch), we'd recover most of the lost work and reduce retry latency from 56 min to < 5 min.

  **Depends on:** Conversation history persistence (for high-quality resume context).
+
+ ---
+
+ ## Current state update (Apr 23, 2026)
+
+ **npm version: v3.66.0** | Daemon: stopped (intentionally, undergoing MCP reconnect) | MCP: reconnecting to updated binary
+
+ ---
+
+ ### What shipped in this session (Apr 22-23, 2026)
+
+ This was a major session covering daemon/console separation, metrics infrastructure, and workflow stability fixes.
+
+ **Architecture -- daemon/console/MCP separation:**
+ - ✅ **Delete daemon-console.ts** (#753) -- daemon no longer bundles an embedded console; `worktrain console` is now the sole console entry point
+ - ✅ **Remove dead steer/poll endpoints** (#755) -- deleted `worktrain trigger poll` CLI and the steer/poll HTTP endpoints that were only used by the deleted daemon-console
+ - ✅ **Wire workflow catalog into standalone console** (#783, open) -- `worktrain console` Workflows tab now works without the MCP server running; `EnhancedMultiSourceWorkflowStorage` constructed directly in `standalone-console.ts`
+
+ **Metrics infrastructure (6-step sequence, all merged):**
+ - ✅ **timestampMs on events** (#768, #772) -- `DomainEventEnvelopeV1Schema` now has required `timestampMs`; backfill script at `scripts/backfill-timestamps.ts`
+ - ✅ **`run_completed` event** (#773) -- emitted on successful session completion with `startGitSha`, `endGitSha`, `agentCommitShas`, `captureConfidence`, `durationMs`
+ - ✅ **Authoring docs: metrics_* keys** (#767) -- `metricsProfile` field and SHA accumulation convention documented in `docs/authoring-v2.md`
+ - ✅ **`projectSessionMetricsV2` projection** (#771) -- pure projection reading `run_completed` + `context_set metrics_*` keys, wired into `ConsoleSessionSummary`
+ - ✅ **Console metrics display** (#777) -- `SessionMetricsSection` in session detail view; `GET /api/v2/sessions/:id/diff-summary` endpoint
+ - ✅ **`stats-summary.json` writer** (#769) -- `~/.workrail/data/stats-summary.json` aggregated from `execution-stats.jsonl`, written post-session and every 30s heartbeat
+
+ **Engine improvements:**
+ - ✅ **Execution time tracking** (#756) -- `execution-stats.jsonl` per session in finally block
+ - ✅ **Worktree orphan leak fix** (#756) -- sidecar deletion deferred to `maybeRunDelivery()` for worktree sessions
7609
+ - ✅ **assertNever for ReviewSeverity** (#756)
7610
+ - ✅ **Crash recovery phase A** (#759) -- `clearQueueIssueSidecars()` fixes 56-min re-dispatch block; sidecar preservation for sessions with progress
7611
+ - ✅ **Conversation history persistence** (#762) -- `<sessionId>-conversation.jsonl` per daemon session, append-only delta flush at each turn
7612
+ - ✅ **queue-poll.jsonl rotation** (#761) -- 10 MB size cap with `.1` backup
7613
+ - ✅ **Remove WorkTrain-owned label writes** (#765) -- `worktrain:in-progress`, `worktrain:generated` labels removed; deduplication now purely internal (sidecar + dispatchingIssues + session scan)
7614
+ - ✅ **metricsProfile footer injection** (#779) -- engine injects `metrics_*` accumulation footers based on `metricsProfile` workflow field; all 35 bundled workflows assigned profiles
7615
+
7616
+ **Workflow namespace:**
7617
+ - ✅ **Rename all bundled workflows to `wr.*`** (#782, open) -- `coding-task-workflow-agentic` → `wr.coding-task`, `mr-review-workflow-agentic` → `wr.mr-review`, etc. Prevents local project source from shadowing bundled workflows on version mismatch.
7618
+
7619
+ ---
7620
+
7621
+ ### Open PRs (waiting for WorkRail MCP review before merge)
7622
+
7623
+ | PR | Title | Status |
7624
+ |---|---|---|
7625
+ | #782 | Rename all bundled workflows to `wr.*` namespace | CI passing, needs `wr.mr-review` |
7626
+ | #783 | Wire workflow catalog into standalone console | CI pending, needs `wr.mr-review` |
7627
+
7628
+ **Do not merge #782 or #783 without running `wr.mr-review` on each.** The MCP needs to reconnect to the updated 3.66.0 binary first.
7629
+
7630
+ ---
7631
+
7632
+ ### Active bugs (investigated, not yet fixed)
7633
+
7634
+ 1. **`additionalProperties: false` not enforced in Ajv** -- `src/application/validation.ts` uses `strict: false`, making schema's `additionalProperties` advisory only. A workflow with an unknown field passes `validate:registry`. Discovery+shaping in progress (agent running). **High priority -- fix before next release.**
7635
+
7636
+ 2. **`wr.mr-review` NOT_FOUND from MCP** -- `list_workflows` finds it but `start_workflow` returns NOT_FOUND. Root cause: MCP process is still running old 3.60.0 binary (global npm was stale). Fixed by `npm update -g @exaudeus/workrail` (done). Requires MCP reconnect to take effect.
7637
+
7638
+ 3. **User's `wr.discovery` VALIDATION_ERROR** -- stale `npx` cache pre-3.11.2. Fix: `npm cache clean --force && npx @exaudeus/workrail`. No code change needed.
7639
+
7640
+ ---
7641
+
7642
+ ### Known gaps (not yet started)
7643
+
7644
+ - **Phase B crash recovery** -- actual agent loop restart after crash (not just sidecar preservation). Blocked on conversation history being tested end-to-end. See "Autonomous crash recovery" entry above.
7645
+ - **`workrail cleanup` command** -- removes dead managed sources, old sessions. Still needed.
7646
+ - **console-routes.ts dispatch coupling** -- `POST /api/v2/auto/dispatch` still imports `runWorkflow` from `src/daemon/`. See backlog entry.
7647
+ - **`wr.*` list/get inconsistency** -- user-source `wr.*` copies appear in list but execution uses bundled. Low priority.
7648
+
7649
+ ---
7650
+
7651
+ ### Current system state (for next engineer picking this up)
7652
+
7653
+ **Daemon:** Stopped intentionally. Unload: `launchctl bootout gui/$(id -u) ~/Library/LaunchAgents/io.worktrain.daemon.plist`
7654
+ **To restart daemon:** `launchctl bootstrap gui/$(id -u) ~/Library/LaunchAgents/io.worktrain.daemon.plist`
7655
+ **MCP server:** Reconnecting -- run `/mcp` in Claude Code to get fresh 3.66.0 process
7656
+ **Global npm:** Updated to 3.66.0 (`npm update -g @exaudeus/workrail`)
7657
+ **Local build:** Built from main at 3.66.0 (`npm run build` done)
7658
+ **triggers.yml:** Must update `workflowId` values to new `wr.*` IDs after #782 merges (e.g. `coding-task-workflow-agentic` → `wr.coding-task`)
7659
+
7660
+ **Immediate next actions:**
7661
+ 1. Reconnect MCP (`/mcp` in Claude Code)
7662
+ 2. Run `wr.mr-review` on PR #782 (rename) and PR #783 (console fix)
7663
+ 3. Merge both PRs
7664
+ 4. Wait for validation fix shaping to complete, then code and ship it
7665
+ 5. Update `triggers.yml` with new `wr.*` workflow IDs
7666
+ 6. Restart daemon and monitor first pipeline run with new IDs
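
Step 5 above could be sketched as a small text rewrite. This is an illustration, not the project's tooling: `renameWorkflowIds` is a hypothetical helper, and it assumes each trigger in `triggers.yml` carries its `workflowId:` key on a single line. The rename pairs are the ones named in this changelog.

```typescript
// Old-to-new workflow IDs taken from the notes above; extend as needed.
const RENAMES: ReadonlyMap<string, string> = new Map([
  ["coding-task-workflow-agentic", "wr.coding-task"],
  ["mr-review-workflow-agentic", "wr.mr-review"],
]);

// Hypothetical helper: rewrites `workflowId: <old>` lines in YAML text.
// Assumes one `workflowId:` key per line, optionally quoted, optionally
// the first key of a list item ("- workflowId: ...").
function renameWorkflowIds(yamlText: string): string {
  const pattern = /^([ \t]*(?:-[ \t]+)?workflowId:[ \t]*)(["']?)([^"'\s#]+)\2/gm;
  return yamlText.replace(
    pattern,
    (match: string, prefix: string, quote: string, id: string) => {
      const next = RENAMES.get(id);
      // Unknown IDs (user or project workflows) are left untouched.
      return next === undefined ? match : `${prefix}${quote}${next}${quote}`;
    },
  );
}
```

A regex keeps the sketch dependency-free and preserves comments and formatting; a real migration would likely be safer with a YAML parser.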
7667
+
7668
+ ---
7669
+
7670
+ ## Daemon agent loop stall detection (Apr 23, 2026)
7671
+
7672
+ **The problem:** When a subagent (workrail-executor) stalls with no progress for 600s, the stream watchdog kills it with "Agent stalled: no progress for 600s (stream watchdog did not recover)". The daemon has no equivalent mechanism. A daemon session that stops making LLM API calls (e.g. waiting on a hung tool, a network issue with no timeout, or a silent deadlock) will spin until the wall-clock timeout fires -- which can be up to 55-65 minutes. No indication to the operator, no early abort, no event emitted.
7673
+
7674
+ **What we want:** The daemon's `AgentLoop` should detect when no LLM turn starts within a configurable window (e.g. 120s) and abort the session with a `'stuck'` result. This is different from the existing `repeated_tool_call` / `no_progress` stuck detection, which watches for behavioral loops. This is a liveness check: if the loop simply isn't making any API calls at all, something is frozen.
7675
+
7676
+ **Design sketch:**
7677
+ - In `src/daemon/agent-loop.ts`, add a per-turn heartbeat timer that resets each time an LLM call starts
7678
+ - If the timer fires (120s with no new turn), call `agent.abort()` and emit `agent_stuck` with `reason: 'no_llm_turn'`
7679
+ - Configurable via `agentConfig.stallTimeoutSeconds` in `triggers.yml` (default 120s)
7680
+ - Distinct from wall-clock timeout (`maxSessionMinutes`) which covers the full session
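
A minimal sketch of the heartbeat timer described in the design sketch above, under stated assumptions: `StallWatchdog`, `Schedule`, and `markTurnStarted` are hypothetical names, not existing WorkRail code. In the real loop, the `onStall` callback would presumably call `agent.abort()` and emit `agent_stuck` with `reason: 'no_llm_turn'`.

```typescript
type StallReason = "no_llm_turn";
type Cancel = () => void;
// Injectable scheduler so the watchdog can be tested without real timers.
type Schedule = (fn: () => void, ms: number) => Cancel;

const realSchedule: Schedule = (fn, ms) => {
  const handle = setTimeout(fn, ms);
  return () => clearTimeout(handle);
};

class StallWatchdog {
  private cancel: Cancel | undefined;
  private readonly timeoutMs: number;
  private readonly onStall: (reason: StallReason) => void;
  private readonly schedule: Schedule;

  constructor(
    timeoutMs: number,
    onStall: (reason: StallReason) => void,
    schedule: Schedule = realSchedule,
  ) {
    this.timeoutMs = timeoutMs;
    this.onStall = onStall;
    this.schedule = schedule;
  }

  // Call each time an LLM turn starts; restarts the countdown.
  markTurnStarted(): void {
    this.stop();
    this.cancel = this.schedule(() => {
      this.cancel = undefined;
      this.onStall("no_llm_turn");
    }, this.timeoutMs);
  }

  // Call when the session ends so no timer outlives the loop.
  stop(): void {
    if (this.cancel !== undefined) {
      this.cancel();
      this.cancel = undefined;
    }
  }
}
```

The timer is armed only by turn starts, so a hung tool call or a request with no network timeout lets it fire, while the wall-clock limit (`maxSessionMinutes`) stays a separate mechanism.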
7681
+
7682
+ **Where to look:**
7683
+ - `src/daemon/agent-loop.ts` -- `_runLoop()`, where LLM calls are made
7684
+ - `src/daemon/workflow-runner.ts` -- existing stuck detection and abort logic
7685
+ - `src/daemon/daemon-events.ts` -- `AgentStuckEvent` already has `reason` union (add `'no_llm_turn'`)
7686
+
7687
+ **Priority:** Medium. The wall-clock timeout provides a safety net, but 55 minutes is a long time to wait for a frozen session. A 2-minute liveness check would dramatically improve operator experience.
7688
+
7689
+ ---
7690
+
7691
+ ## Versioned workflow schema validation (Apr 23, 2026)
7692
+
7693
+ ### The problem
7694
+
7695
+ WorkRail validates workflow files against the schema bundled in the currently-running MCP binary. This creates a bidirectional version mismatch problem:
7696
+
7697
+ **New binary, old workflow:** Binary's schema has tightened validation (new required field, removed enum value) → old workflow fails → silently dropped from registry.
7698
+
7699
+ **Old binary, new workflow:** Workflow has new fields the old schema doesn't know about → `additionalProperties: false` rejects it → silently dropped from registry.
7700
+
7701
+ Both directions cause the same symptom: workflow disappears from `list_workflows` with no explanation. This is what we hit Apr 22-23: local `workflows/` directory (v3.66.0 with `metricsProfile`) was loaded by an old global npm binary (v3.60.0) whose schema didn't know `metricsProfile`, causing the entire registry to appear empty.
7702
+
7703
+ The `wr.*` rename solves the specific case of bundled workflows being shadowed by local project files. But the version mismatch problem affects any non-bundled workflow (user, managed source, project) when the binary and the workflow file are at different schema versions.
7704
+
7705
+ ### The right long-term fix: versioned schema validation (like Android Room migrations)
7706
+
7707
+ **Model:** Each workflow declares `"schemaVersion": 2` (integer). The binary ships validator copies for every schema version it supports. When loading a workflow, pick the validator matching the declared version -- not the current one.
7708
+
7709
+ ```json
7710
+ { "schemaVersion": 2, "id": "my-workflow", ... }
7711
+ ```
7712
+
7713
+ **Load-time logic:**
7714
+ 1. Read `schemaVersion` from the workflow file (default to 1 if absent -- legacy workflows)
7715
+ 2. If `schemaVersion === current`: validate against current schema directly
7716
+ 3. If `schemaVersion < current` (binary newer): validate against the declared schema version (workflow is valid for its era)
7717
+ 4. If `schemaVersion > current` (binary too old): load leniently with warnings -- binary doesn't know this schema version, so `additionalProperties: false` doesn't apply
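
The four load-time steps above can be sketched as a single dispatch function. This is a sketch under assumptions, not the planned `validation.ts` change: `loadWorkflow`, `Validator`, and `LoadResult` are hypothetical names, and each validator is modeled as a plain function returning error messages rather than a compiled JSON Schema.

```typescript
// A validator returns a list of error messages; an empty list means valid.
type Validator = (workflow: Record<string, unknown>) => string[];

type LoadResult =
  | { kind: "valid"; schemaVersion: number; warnings: string[] }
  | { kind: "invalid"; errors: string[] };

function loadWorkflow(
  workflow: Record<string, unknown>,
  validators: ReadonlyMap<number, Validator>,
  currentVersion: number,
): LoadResult {
  // Step 1: legacy workflows without schemaVersion default to 1.
  const declared = workflow["schemaVersion"];
  const schemaVersion = declared === undefined ? 1 : declared;
  if (
    typeof schemaVersion !== "number" ||
    !Number.isInteger(schemaVersion) ||
    schemaVersion < 1
  ) {
    return { kind: "invalid", errors: ["schemaVersion must be a positive integer"] };
  }

  // Step 4: the binary is older than the workflow, so load leniently with a warning.
  if (schemaVersion > currentVersion) {
    return {
      kind: "valid",
      schemaVersion,
      warnings: [
        `workflow declares schemaVersion ${schemaVersion}, binary supports up to ${currentVersion}`,
      ],
    };
  }

  // Steps 2 and 3: validate against the declared version, not the newest one.
  const validate = validators.get(schemaVersion);
  if (validate === undefined) {
    return { kind: "invalid", errors: [`no validator bundled for schemaVersion ${schemaVersion}`] };
  }
  const errors = validate(workflow);
  return errors.length > 0
    ? { kind: "invalid", errors }
    : { kind: "valid", schemaVersion, warnings: [] };
}
```

Returning warnings alongside a valid result keeps the too-old-binary case visible to the caller instead of silently dropping the workflow.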
7718
+
7719
+ This gives you full schema freedom going forward. You can add required fields, tighten enums, rename things -- without breaking workflows written for older schema versions.
7720
+
7721
+ ### Why NOT migrations (yet)
7722
+
7723
+ A migration chain (v1→v2→v3 transform functions like Room) is the logical extension but adds complexity:
7724
+ - Migration functions on JSON documents with free-form prose are harder to write correctly than SQL schema migrations
7725
+ - The forward direction (binary too old for workflow) can't be migrated -- you'd have to downgrade the workflow, which is lossy
7726
+ - Requires process discipline: every schema change must increment the version AND write a migration. Easy to forget.
7727
+ - Migration chain length grows over time -- a v1 workflow loading against a v10 binary runs 9 migrations
7728
+
7729
+ **Recommended phased approach:**
7730
+ - **Phase 1 (ship first):** Schema version dispatch without migrations. Keep old schema files (`spec/workflow.schema.v1.json`, `v2.json`, etc.). Validate each workflow against its declared version. No migration functions yet. Simple, low-risk, and covers the backward direction.
7731
+ - **Phase 2 (add when needed):** Add migration functions when you actually need to make a breaking schema change that would invalidate old workflows. Not before.
7732
+
7733
+ ### Gaps and known issues with this approach
7734
+
7735
+ 1. **Forward direction still requires leniency.** When a workflow declares `schemaVersion: 5` and the binary only knows up to `schemaVersion: 4`, the only option is lenient loading with warnings. This is the `additionalProperties: true` approach, scoped to the mismatch case. This is acceptable -- if the binary is too old, it can still try to run the workflow with unknown fields ignored.
7736
+
7737
+ 2. **Schema version vs. authoring spec version.** WorkRail already has `validatedAgainstSpecVersion` on workflows (authoring spec -- style/quality). `schemaVersion` is separate (structural validity). Two version numbers with similar names need clear documentation.
7738
+
7739
+ 3. **External author burden.** When a new schema version ships, teams using managed sources need to know what changed and whether their workflows need updating. A changelog per schema version is required.
7740
+
7741
+ 4. **Default for legacy workflows.** Workflows without `schemaVersion` should default to `1` (oldest supported), not current. This means they get validated against the oldest schema -- which is lenient and permissive -- rather than the current strict one. Acceptable tradeoff.
7742
+
7743
+ 5. **`workflow-for-workflows` should stamp `schemaVersion`.** When authoring or modernizing a workflow, `wfw` should set `schemaVersion` to the current version automatically. This keeps the version accurate without requiring manual maintenance.
7744
+
7745
+ ### What's already in place
7746
+
7747
+ - `validatedAgainstSpecVersion` field exists on workflows (authoring spec version, different concept)
7748
+ - `workflow.schema.json` has a `$id` with a version string (`v0.3.0`) but it's decorative -- not used at runtime
7749
+ - Validation warnings in `list_workflows` (PR #787) give users visibility when their workflow is silently dropped -- this is the interim fix until versioned validation ships
7750
+
7751
+ ### Files to change (Phase 1)
7752
+
7753
+ - `spec/workflow.schema.json` -- add `schemaVersion` as an optional integer field (default 1 if absent)
7754
+ - `spec/workflow.schema.v1.json` -- snapshot of the current schema as "v1" (baseline)
7755
+ - `src/application/validation.ts` -- version dispatch: load the right schema based on `schemaVersion`
7756
+ - `src/types/workflow-definition.ts` -- add `readonly schemaVersion?: number` to `WorkflowDefinition`
7757
+ - `workflow-for-workflows.json` -- add step that stamps `schemaVersion` on the authored workflow
7758
+ - All bundled workflows -- add `"schemaVersion": 1` (once Phase 1 ships, bump to whatever the current version is)
7759
+
7760
+ ### Priority
7761
+
7762
+ Medium-High. The `wr.*` rename (PR #782) is the immediate fix. This is the permanent architectural solution that prevents the problem for all workflow sources, not just bundled ones. Should ship after the rename stabilizes.
7763
+
7764
+ **Implementation note (Apr 23):** Start with v1 = current schema. A git history audit is running to check whether any breaking changes have already been shipped. If none found: all existing workflows are valid against the current schema, v1 = today, no reconstruction needed. If breaking changes are found: snapshot the pre-break schema as v1, declare current as v2 (or higher), and existing workflows without `schemaVersion` default to v1. **Do not ship schema versioning until the audit completes and this determination is made.**
7765
+
7766
+ **Audit result (Apr 23):** Exactly one breaking change found -- commit `b3212b45` (Apr 5, 2026) restructured `assessmentConsequenceTrigger` (`dimensionId`/`equalsLevel` → `anyEqualsLevel`). This affected only the `assessmentConsequences` feature, which was introduced 4 days earlier (Apr 1). The bundled workflows that used it were migrated atomically in the same commit. No external workflows could have adopted this feature in that 4-day window. The other 14 schema changes in history are additive or loosening, so they are fully backward compatible.
7767
+
7768
+ **Decision: v1 = current schema. No historical reconstruction needed.** The one breaking change was fully contained within the bundled workflow corpus at the time it shipped.
7769
+
7770
+ ---
7771
+
7772
+ ## Consider rewriting WorkRail engine in Kotlin (Apr 23, 2026)
7773
+
7774
+ ### The argument
7775
+
7776
+ WorkRail's coding philosophy demands "make illegal states unrepresentable" and "type safety as the first line of defense." TypeScript is structurally at odds with this: the compiler is advisory, not enforcing. `as unknown as`, `any`, and type assertion casts are always one line away. In a codebase where autonomous agents write and merge code without deep human review, the compiler is the reviewer -- and TypeScript's escape hatches make it too easy for an agent to paper over a real design problem with a cast.
7777
+
7778
+ Evidence from today's work: the `RunCompletedDataExpected` intermediate interface and the `as unknown as` cast in `session-metrics.ts` both existed for weeks. TypeScript didn't prevent them. A stricter compiler -- one where bypass requires genuine effort -- raises the bar the agent has to clear before code is valid.
7779
+
7780
+ ### What Kotlin actually buys
7781
+
7782
+ - **Sealed classes** -- a non-exhaustive `when` over a sealed type is a compile error, not a runtime `assertNever` pattern that convention must enforce
7783
+ - **No easy escape hatch** -- `as` in Kotlin throws at runtime on type mismatch; there's no equivalent of `as unknown as` that silently lies to the compiler
7784
+ - **Null safety by default** -- `String` vs `String?` is a language distinction, not a `strict: true` compiler flag that can be turned off
7785
+ - **Value classes and data classes** -- less boilerplate for domain types, stronger invariants
7786
+
7787
+ ### What TypeScript + current tooling already covers
7788
+
7789
+ - Zod at boundaries provides runtime validation that Kotlin's type system would provide at compile time -- this gap is smaller than it looks
7790
+ - `neverthrow` gives Result types
7791
+ - Discriminated unions + `assertNever` give exhaustiveness -- but enforced by convention, not the compiler
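
To make the convention-versus-compiler point concrete, here is a minimal sketch of the `assertNever` pattern and the cast that defeats it. `ReviewSeverity` is named in the notes above, but its members here and the `blocksMerge` function are hypothetical.

```typescript
// Hypothetical members; the real ReviewSeverity union may differ.
type ReviewSeverity = "critical" | "major" | "minor";

// Compiles only when every variant has been handled before the call site.
function assertNever(value: never): never {
  throw new Error(`Unhandled variant: ${JSON.stringify(value)}`);
}

function blocksMerge(severity: ReviewSeverity): boolean {
  switch (severity) {
    case "critical":
    case "major":
      return true;
    case "minor":
      return false;
    default:
      // If a new member is added to ReviewSeverity, this line stops
      // type-checking. Nothing forces an author to write this branch.
      return assertNever(severity);
  }
}

// The escape hatch: this type-checks, but the value is not a ReviewSeverity.
const smuggled = "catastrophic" as unknown as ReviewSeverity;

// Returns the error message when blocksMerge throws, undefined otherwise.
function tryBlocksMerge(severity: ReviewSeverity): string | undefined {
  try {
    blocksMerge(severity);
    return undefined;
  } catch (err) {
    return err instanceof Error ? err.message : String(err);
  }
}
```

The compiler flags a missing case only when the `default: assertNever(...)` branch is present, and the double cast moves the failure from compile time to the runtime throw.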
7792
+
7793
+ ### Real costs
7794
+
7795
+ - JVM startup latency for an MCP server that starts/stops frequently -- mitigable with GraalVM native image, but adds build complexity
7796
+ - Full rewrite of `src/` -- months of work, not weeks
7797
+ - Console stays TypeScript/React regardless
7798
+ - The Kotlin MCP SDK exists but the ecosystem tooling (npm, Node.js file I/O patterns) needs reimplementation
7799
+
7800
+ ### The honest tradeoff
7801
+
7802
+ Convention drift is a recurring tax. Migration is a one-time cost. In a codebase driven heavily by autonomous agents, the compiler is the last line of defense against accumulated drift. TypeScript's permissiveness means that defense has holes.
7803
+
7804
+ This is not urgent -- the current codebase is working well. But if autonomous agent usage grows and human review per-PR decreases further, the compiler enforcement gap becomes more important, not less.
7805
+
7806
+ **Priority:** Low / long-term. Worth revisiting when the agent is writing the majority of new code. Requires a concrete spike: rewrite one module (e.g. `src/v2/durable-core/domain/`) in Kotlin and measure the real friction before committing to a full migration.
@@ -1460,7 +1460,7 @@
1460
1460
  Available Workflows
1461
1461
 
1462
1462
  ## workrail (built-in)
1463
- - coding-task-workflow-agentic
1463
+ - wr.coding-task
1464
1464
 
1465
1465
  ## personal/workrail (repo root)
1466
1466
  - team-code-review
@@ -144,7 +144,7 @@ agent = Agent(
144
144
  subagent_type="workrail-executor",
145
145
  description="Execute context gathering",
146
146
  prompt="""
147
- Start the routine-context-gathering workflow.
147
+ Start the wr.routine-context-gathering workflow.
148
148
 
149
149
  Workspace: /path/to/project
150
150
  Focus: COMPLETENESS
@@ -156,7 +156,7 @@ agent = Agent(
156
156
  Or from the main agent in Claude Code:
157
157
 
158
158
  ```
159
- Please use the workrail-executor agent to run the bug-investigation-agentic workflow
159
+ Please use the workrail-executor agent to run the wr.bug-investigation workflow
160
160
  ```
161
161
 
162
162
  ---
@@ -273,15 +273,15 @@ Later repositories override earlier ones with the same workflow ID.
273
273
  ### Running a workflow directly
274
274
 
275
275
  ```
276
- > Use the bug-investigation-agentic workflow to investigate the cache expiration issue
276
+ > Use the wr.bug-investigation workflow to investigate the cache expiration issue
277
277
  ```
278
278
 
279
279
  ### Delegating to workrail-executor
280
280
 
281
281
  ```
282
282
  > Spawn two workrail-executor agents in parallel:
283
- > 1. One running routine-context-gathering with focus=COMPLETENESS
284
- > 2. One running routine-context-gathering with focus=DEPTH
283
+ > 1. One running wr.routine-context-gathering with focus=COMPLETENESS
284
+ > 2. One running wr.routine-context-gathering with focus=DEPTH
285
285
  ```
286
286
 
287
287
  ### Resuming a checkpointed workflow
@@ -101,7 +101,7 @@ Run the diagnostic workflow to test your configuration:
101
101
 
102
102
  ```bash
103
103
  # In Firebender, ask the main agent:
104
- "Run the workflow-diagnose-environment workflow"
104
+ "Run the wr.diagnose-environment workflow"
105
105
  ```
106
106
 
107
107
  This will:
@@ -22,7 +22,7 @@ The rollout is structured in **3 Phased Tiers**, gated by feature flags, ensurin
22
22
  * Creation of `bug-investigation.agentic.json` with manual delegation instructions.
23
23
  * Implementation of the "Delegate or Proxy" prompt pattern directly in the JSON.
24
24
  3. **The Diagnostic Suite:**
25
- * `workflow-diagnose-environment.json`: Agent-driven wizard to probe capabilities and generate config.
25
+ * `wr.diagnose-environment.json`: Agent-driven wizard to probe capabilities and generate config.
26
26
  * `docs/integrations/firebender.md`: Documentation on tool whitelisting constraints.
27
27
 
28
28
  **User Experience:**
@@ -83,7 +83,7 @@ The rollout is structured in **3 Phased Tiers**, gated by feature flags, ensurin
83
83
  **Why it matters:**
84
84
  * Keeps the primary step prompt user-voiced while still allowing start/resume-only guidance.
85
85
  * Makes current runtime-owned supplement behavior explicit and eventually authorable.
86
- * Gives workflow-for-workflows and future linting a real schema surface instead of relying on hidden server policy.
86
+ * Gives wr.workflow-for-workflows and future linting a real schema surface instead of relying on hidden server policy.
87
87
 
88
88
  **Constraints:**
89
89
  * Should be a **narrow, typed feature**, not arbitrary extra prompt sludge.
@@ -299,11 +299,11 @@ The redesign currently references a few routines conceptually, but it should mak
299
299
 
300
300
  High-value candidates include:
301
301
 
302
- - `routine-context-gathering`
303
- - `routine-hypothesis-challenge`
304
- - `routine-execution-simulation`
305
- - `routine-philosophy-alignment`
306
- - `routine-final-verification`
302
+ - `wr.routine-context-gathering`
303
+ - `wr.routine-hypothesis-challenge`
304
+ - `wr.routine-execution-simulation`
305
+ - `wr.routine-philosophy-alignment`
306
+ - `wr.routine-final-verification`
307
307
 
308
308
  These should be treated as current reusable building blocks, not future ideas.
309
309
 
@@ -786,9 +786,9 @@ The workflow should further strengthen:
786
786
 
787
787
  This phase should explicitly consider use of:
788
788
 
789
- - `routine-hypothesis-challenge` for adversarial reviewer challenge
790
- - `routine-execution-simulation` when runtime behavior or branch-sensitive behavior is material
791
- - `routine-philosophy-alignment` when policy-context is important enough to affect recommendation quality
789
+ - `wr.routine-hypothesis-challenge` for adversarial reviewer challenge
790
+ - `wr.routine-execution-simulation` when runtime behavior or branch-sensitive behavior is material
791
+ - `wr.routine-philosophy-alignment` when policy-context is important enough to affect recommendation quality
792
792
 
793
793
  ## Phase 4: Contradiction, Gap, and Boundary Resolution Loop
794
794
 
@@ -822,7 +822,7 @@ The current final validation idea remains useful, but it should explicitly valid
822
822
 
823
823
  Final validation should also ensure the handoff reflects uncertainty honestly instead of over-stating confidence.
824
824
 
825
- The current WorkRail routine catalog suggests the redesign should strongly consider `routine-final-verification` as either:
825
+ The current WorkRail routine catalog suggests the redesign should strongly consider `wr.routine-final-verification` as either:
826
826
 
827
827
  - a delegated verifier
828
828
  - an injected routine template
@@ -58,7 +58,7 @@
58
58
  - **Tensions resolved**: all 6 failure categories; forces alternatives at hypothesis stage; evidence-based per-dimension findings
59
59
  - **Tensions accepted**: inherent visual limitations; spec not mockup
60
60
  - **Failure mode**: reviewer families produce generic UX advice not tied to actual design context
61
- - **Repo pattern**: directly adapts `production-readiness-audit.json` structure; auditComplexity branching from `adaptive-ticket-creation.json`
61
+ - **Repo pattern**: directly adapts `wr.production-readiness-audit.json` structure; auditComplexity branching from `wr.adaptive-ticket-creation.json`
62
62
  - **Gains**: comprehensive, structured freedom, all failure categories covered
63
63
  - **Losses**: heavier than minimal for simple tasks (mitigated by Simple fast path)
64
64
  - **Scope**: best-fit for feature-level and screen-level design work
@@ -73,7 +73,7 @@
73
73
  - **Tensions resolved**: single-solution anchoring; forces genuine exploration
74
74
  - **Tensions accepted**: UX laws/accessibility not explicitly enforced
75
75
  - **Failure mode**: 3 directions are superficially different (same IA, different metaphors)
76
- - **Repo pattern**: adapts `architecture-scalability-audit.json` dimension-declaration
76
+ - **Repo pattern**: adapts `wr.architecture-scalability-audit.json` dimension-declaration
77
77
  - **Gains**: best for exploring solution space; documents tradeoffs
78
78
  - **Losses**: lighter on UX law enforcement; accessibility second-class
79
79
  - **Scope**: best as a mechanism within B rather than a standalone workflow
@@ -90,11 +90,11 @@
90
90
  - **Tensions resolved**: turns agent UX knowledge into structured application; fully evidence-based
91
91
  - **Tensions accepted**: doesn't help with design-from-scratch; requires existing design as input
92
92
  - **Failure mode**: agent audits what's in the spec but misses implicit design assumptions not stated
93
- - **Repo pattern**: directly adapts `architecture-scalability-audit.json`
93
+ - **Repo pattern**: directly adapts `wr.architecture-scalability-audit.json`
94
94
  - **Gains**: actionable per-dimension findings with references; complements B
95
95
  - **Losses**: review only, not creation
96
96
  - **Scope**: best-fit as standalone for design review; or used after B to audit the output
97
- - **Philosophy**: all principles satisfied; mirrors architecture-scalability-audit exactly
97
+ - **Philosophy**: all principles satisfied; mirrors wr.architecture-scalability-audit exactly
98
98
 
99
99
  ## Comparison and Recommendation
100
100