@exaudeus/workrail 3.39.0 → 3.41.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (97)
  1. package/dist/cli/commands/init.js +0 -3
  2. package/dist/cli-worktrain.js +58 -26
  3. package/dist/cli.js +0 -18
  4. package/dist/config/app-config.d.ts +0 -16
  5. package/dist/config/app-config.js +0 -14
  6. package/dist/config/config-file.js +0 -3
  7. package/dist/console-ui/assets/index-CQt4UhPB.js +28 -0
  8. package/dist/console-ui/assets/index-DGj8EsFR.css +1 -0
  9. package/dist/console-ui/index.html +2 -2
  10. package/dist/coordinators/pr-review.d.ts +23 -1
  11. package/dist/coordinators/pr-review.js +224 -5
  12. package/dist/daemon/daemon-events.d.ts +9 -1
  13. package/dist/daemon/soul-template.d.ts +2 -2
  14. package/dist/daemon/soul-template.js +11 -1
  15. package/dist/daemon/workflow-runner.d.ts +17 -3
  16. package/dist/daemon/workflow-runner.js +401 -28
  17. package/dist/di/container.js +1 -25
  18. package/dist/di/tokens.d.ts +0 -3
  19. package/dist/di/tokens.js +0 -3
  20. package/dist/engine/engine-factory.js +0 -1
  21. package/dist/infrastructure/console-defaults.d.ts +1 -0
  22. package/dist/infrastructure/console-defaults.js +4 -0
  23. package/dist/infrastructure/session/index.d.ts +0 -1
  24. package/dist/infrastructure/session/index.js +1 -3
  25. package/dist/manifest.json +124 -124
  26. package/dist/mcp/handlers/session.d.ts +1 -0
  27. package/dist/mcp/handlers/session.js +61 -13
  28. package/dist/mcp/output-schemas.d.ts +10 -10
  29. package/dist/mcp/server.js +1 -18
  30. package/dist/mcp/tools.d.ts +12 -12
  31. package/dist/mcp/transports/http-entry.js +0 -2
  32. package/dist/mcp/transports/stdio-entry.js +1 -2
  33. package/dist/mcp/types.d.ts +0 -2
  34. package/dist/trigger/daemon-console.d.ts +2 -0
  35. package/dist/trigger/daemon-console.js +1 -1
  36. package/dist/trigger/trigger-listener.d.ts +2 -0
  37. package/dist/trigger/trigger-listener.js +3 -1
  38. package/dist/trigger/trigger-router.d.ts +4 -3
  39. package/dist/trigger/trigger-router.js +13 -5
  40. package/dist/trigger/trigger-store.js +17 -4
  41. package/dist/types/workflow-source.d.ts +0 -1
  42. package/dist/types/workflow-source.js +3 -6
  43. package/dist/types/workflow.d.ts +1 -1
  44. package/dist/types/workflow.js +1 -2
  45. package/dist/v2/durable-core/domain/artifact-contract-validator.js +66 -0
  46. package/dist/v2/durable-core/schemas/artifacts/coordinator-signal.d.ts +25 -0
  47. package/dist/v2/durable-core/schemas/artifacts/coordinator-signal.js +31 -0
  48. package/dist/v2/durable-core/schemas/artifacts/index.d.ts +3 -1
  49. package/dist/v2/durable-core/schemas/artifacts/index.js +14 -1
  50. package/dist/v2/durable-core/schemas/artifacts/review-verdict.d.ts +41 -0
  51. package/dist/v2/durable-core/schemas/artifacts/review-verdict.js +30 -0
  52. package/dist/v2/durable-core/schemas/export-bundle/index.d.ts +236 -236
  53. package/dist/v2/durable-core/schemas/session/events.d.ts +50 -50
  54. package/dist/v2/durable-core/schemas/session/gaps.d.ts +2 -2
  55. package/dist/v2/durable-core/schemas/session/manifest.d.ts +4 -4
  56. package/dist/v2/durable-core/schemas/session/outputs.d.ts +8 -8
  57. package/dist/v2/usecases/console-routes.d.ts +2 -1
  58. package/dist/v2/usecases/console-routes.js +207 -5
  59. package/dist/v2/usecases/console-service.js +14 -0
  60. package/dist/v2/usecases/console-types.d.ts +1 -0
  61. package/docs/authoring.md +16 -16
  62. package/docs/design/coordinator-artifact-protocol-design-candidates.md +155 -0
  63. package/docs/design/coordinator-artifact-protocol-design-review.md +103 -0
  64. package/docs/design/coordinator-artifact-protocol-implementation-plan.md +259 -0
  65. package/docs/design/coordinator-message-queue-drain-plan.md +241 -0
  66. package/docs/design/coordinator-message-queue-drain-review.md +120 -0
  67. package/docs/design/coordinator-message-queue-drain.md +289 -0
  68. package/docs/design/shaping-workflow-external-research.md +119 -0
  69. package/docs/discovery/late-bound-goals-impl-plan.md +147 -0
  70. package/docs/discovery/late-bound-goals-review.md +82 -0
  71. package/docs/discovery/late-bound-goals.md +118 -0
  72. package/docs/discovery/steer-endpoint-design-candidates.md +288 -0
  73. package/docs/discovery/steer-endpoint-design-review-findings.md +104 -0
  74. package/docs/discovery/steer-endpoint-implementation-plan.md +284 -0
  75. package/docs/ideas/backlog.md +447 -97
  76. package/docs/ideas/design-candidates-console-session-tree-impl.md +64 -0
  77. package/docs/ideas/design-candidates-session-tree-view.md +196 -0
  78. package/docs/ideas/design-review-findings-console-session-tree-impl.md +75 -0
  79. package/docs/ideas/design-review-findings-session-tree-view.md +88 -0
  80. package/docs/ideas/implementation_plan_session_tree_view.md +238 -0
  81. package/package.json +2 -1
  82. package/spec/authoring-spec.json +16 -16
  83. package/spec/shape.schema.json +178 -0
  84. package/spec/workflow-tags.json +232 -47
  85. package/workflows/coding-task-workflow-agentic.json +491 -480
  86. package/workflows/mr-review-workflow.agentic.v2.json +5 -1
  87. package/workflows/wr.shaping.json +182 -0
  88. package/dist/console-ui/assets/index-3oXZ_A9m.js +0 -28
  89. package/dist/console-ui/assets/index-8dh0Psu-.css +0 -1
  90. package/dist/infrastructure/session/DashboardHeartbeat.d.ts +0 -8
  91. package/dist/infrastructure/session/DashboardHeartbeat.js +0 -39
  92. package/dist/infrastructure/session/DashboardLockRelease.d.ts +0 -2
  93. package/dist/infrastructure/session/DashboardLockRelease.js +0 -29
  94. package/dist/infrastructure/session/HttpServer.d.ts +0 -60
  95. package/dist/infrastructure/session/HttpServer.js +0 -912
  96. package/workflows/coding-task-workflow-agentic.lean.v2.json +0 -648
  97. package/workflows/coding-task-workflow-agentic.v2.json +0 -324
@@ -5700,136 +5700,486 @@ Tested empirically today. This is what actually works, not what's specced.
 
  ---
 
- ### Self-healing daemon: detect internal failures, kill, diagnose, fix, reboot, resume (Apr 18, 2026)
+ ### Autonomous feature development: scope → breakdown → parallel execution → merge (Apr 18, 2026)
 
- **The problem:** today if WorkRail's MCP connection drops or the daemon's internal tooling fails, agents continue running without enforcement -- producing unverified output that looks correct but bypassed all workflow gates. The user has no way to know this happened until they manually inspect session completion events.
+ **The vision:** give WorkTrain a feature scope -- from a vague idea to a fully groomed ticket -- and it figures out the rest. Discovery if needed, design if needed, breakdown into parallel slices, execution across worktrees, context management across agents, bringing it all back together.
 
- **What happened today:** WorkRail MCP went down mid-session across ~10 concurrent agents. All sessions show INCOMPLETE (no `run_completed` event). Agents produced PRs, reviews, and merges -- two PRs landed on main -- without any confirmed workflow completion. Required manual audit after the fact.
+ **The four pillars the user cares about:**
+ 1. **Autonomy** -- WorkTrain takes a scope and figures out the work breakdown without hand-holding
+ 2. **Quality** -- comes FROM autonomy + workflow enforcement + coordination. Each slice goes through the right phases.
+ 3. **Throughput** -- parallel slices across worktrees simultaneously. N agents working while you focus elsewhere.
+ 4. **Visibility** -- one coherent work unit you can track at a glance, not N unrelated sessions in a flat list.
 
- **What WorkTrain needs:**
+ **The pipeline for a scope:**
 
- **1. Detect its own tooling failures**
- - Monitor whether `complete_step` / `continue_workflow` tool calls are succeeding or timing out
- - Detect when the WorkRail session store becomes unreachable
- - Detect when the MCP connection (for agents that use MCP-mode) is lost
- - Distinguish: "agent is thinking" vs "agent is stuck" vs "agent's tools are broken"
+ ```
+ Input: "add GitHub polling support" (any level of definition -- idea to full spec)
+
+ ├── [if vague] ideation + spec authoring → output: BRD / acceptance criteria
+ ├── classify-task → taskComplexity, hasUI, touchesArchitecture, taskMaturity
+ ├── [if Medium/Large] discovery → context bundle, invariants, candidate files
+ ├── [if touchesArchitecture] design → candidates, review, selected approach
+ ├── breakdown → parallel slices with dependency graph
+ │   ├── Slice 1: types + schema (worktree A)
+ │   ├── Slice 2: polling adapter (worktree B, depends: 1)
+ │   ├── Slice 3: scheduler integration (worktree C, depends: 2)
+ │   └── Slice 4: tests (worktree D, depends: 1-3)
+ ├── [parallel execution] each slice: implement → review → (fix if needed) → approved
+ ├── [serial integration] merge slices in dependency order, verify after each
+ └── [final] integration test → PR created → notification to user
+ ```
+
+ **Context management across agents:**
+ - Coordinator maintains a "work unit manifest": current phase, slice status, shared invariants, decisions made in design phase
+ - Each spawned agent receives a context bundle: relevant portion of the manifest + files it needs + decisions from upstream phases
+ - Agents don't rediscover what the coordinator already knows
+ - After each agent completes, its findings update the manifest (new invariants found, scope changes, follow-up tickets)
+
+ **Worktree coordination:**
+ - Each slice gets its own worktree (already done via `--isolation worktree`)
+ - Coordinator tracks which files each slice touches -- detects conflicts before they happen
+ - Independent slices run in parallel; dependent slices queue automatically
+ - Merge order follows the dependency graph, not wall-clock completion time
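The conflict detection described above can be sketched in a few lines. This is an illustrative sketch only, not WorkTrain's actual implementation -- `Slice` and `detectConflicts` are hypothetical names; it assumes the coordinator already knows which files each slice is expected to touch.

```typescript
// Hypothetical sketch: flag slice pairs that touch the same file, so the
// coordinator can serialize them instead of letting worktrees collide at merge.
interface Slice {
  id: number;
  touches: string[]; // files the slice is expected to modify
}

// Returns pairs of slice ids that share at least one touched file.
function detectConflicts(slices: Slice[]): Array<[number, number]> {
  const conflicts: Array<[number, number]> = [];
  for (let i = 0; i < slices.length; i++) {
    for (let j = i + 1; j < slices.length; j++) {
      const shared = slices[i].touches.filter((f) => slices[j].touches.includes(f));
      if (shared.length > 0) conflicts.push([slices[i].id, slices[j].id]);
    }
  }
  return conflicts;
}
```

In practice the touched-file sets would come from the breakdown phase (predicted) and be refined from actual worktree diffs as slices progress.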
+
+ **Knowing when to spawn a new main agent:**
+ - When a slice is too large or discovers unexpected scope, it requests a breakdown from the coordinator
+ - When a review finds a Critical finding, the coordinator spawns a dedicated fix agent with the finding + relevant context
+ - When integration reveals a regression, coordinator spawns an investigation agent before retrying the merge
+
+ **The coordinator's job (what stays in scripts, not LLM):**
+ - Maintain the manifest (JSON file, append-only)
+ - Compute the dependency graph
+ - Decide parallelism vs serialization
+ - Route: clean → merge, minor findings → fix agent, critical → escalate
+ - Track worktrees, detect conflicts
+ - Sequence the merge order
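The "compute the dependency graph / decide parallelism / sequence the merge order" bullets are plain graph scheduling, which is exactly why they belong in a script rather than an LLM. A minimal sketch, with hypothetical names (`DepGraph`, `scheduleWaves`), using the four-slice example from the pipeline above:

```typescript
// Illustrative sketch: group slices into parallel "waves" -- every slice in a
// wave has all its dependencies satisfied by earlier waves. Flattening the
// waves gives the serial merge order; slices within a wave can run in parallel.
type DepGraph = Record<number, number[]>; // slice id -> ids it depends on

function scheduleWaves(graph: DepGraph): number[][] {
  const done = new Set<number>();
  const all = Object.keys(graph).map(Number);
  const waves: number[][] = [];
  while (done.size < all.length) {
    const wave = all.filter(
      (id) => !done.has(id) && graph[id].every((d) => done.has(d))
    );
    if (wave.length === 0) throw new Error("dependency cycle in slice graph");
    wave.forEach((id) => done.add(id));
    waves.push(wave);
  }
  return waves;
}
```

For the GitHub-polling example (Slice 2 depends on 1, Slice 3 on 2, Slice 4 on 1-3) this yields four serial waves; a graph with two independent slices would put them in the same wave.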
+
+ **What requires LLM cognition:**
+ - Discovery (what are the invariants, which files matter)
+ - Design (which approach, what tradeoffs)
+ - Implementation (write the code)
+ - Review (is this correct and complete)
+ - Breakdown (what are the right slice boundaries)
+
+ **The minimum viable version:**
+ A coordinator that handles a Medium/Small scoped task (already classified, no need for ideation or design). Takes 2-4 parallel slices, runs them, reviews each, merges when clean. No escalation handling in v1 -- if anything fails, notify the user.
+
+ This is the thing that makes WorkTrain feel like a senior engineer taking ownership of a task, not a tool you have to supervise step by step.
+
+ ---
+
+ ### Coordinator design decision: MVP-first, generalize after (Apr 18, 2026)
+
+ **Decision:** Build the first coordinator as a PR review-specific script. Generalize to a reusable coordinator framework after proving it works end-to-end.
+
+ **Rationale:** Three discovery runs all converged on the architecture (TypeScript script, `CoordinatorDeps` interface, 2-call HTTP for notes). The risk is over-engineering for hypothetical pipelines before validating the real one. PR review is the highest-value first use case with a clear success criterion.
+
+ **The generic coordinator architecture is already designed** (see `docs/discovery/coordinator-script-design.md`). The `CoordinatorDeps` interface and `AgentResult` bridge type make migration to a generic coordinator trivial -- the PR review script uses these types, so generalizing is additive, not a rewrite.
+
+ **Migration path:** once the PR review coordinator is proven in production, extract the routing logic (`parseFindings`, `routeByFindings`) and the `CoordinatorDeps` interface into `src/coordinators/base.ts`. The PR review coordinator becomes one implementation of the base pattern.
+
+ ---
+
+ ### Architecture decisions from Apr 17-18 sessions (to record before files are cleaned up)
+
+ **Decision 1: Structured output + tool calls can coexist (Apr 18)**
+ Validated empirically via integration test. The beta API (`client.beta.messages.create()`) supports both JSON schema enforcement AND tool calls in the same request. Schema enforcement applies at `end_turn` only. Bedrock is more consistent than the direct Anthropic API for system-prompt fallback behavior. This opens a future path for replacing `complete_step` with structured output, but `complete_step` remains the chosen primitive for now.
+
+ **Decision 2: `complete_step` is the preferred daemon workflow-control primitive (Apr 18)**
+ PR #569 merged. The daemon holds the continueToken in a closure; the LLM calls `complete_step(notes)` and never handles the token directly. Structured output (`beta.messages.create` with JSON schema) was evaluated as an alternative and deferred -- it's a viable migration path for a future version but adds API complexity today. Follow-up: track a structured output migration as a future improvement, not a current priority.
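The closure pattern in Decision 2 is simple enough to sketch. This is a hedged illustration, not the daemon's actual code (the real implementation lives in `dist/daemon/`; `makeCompleteStep` and `AdvanceFn` are invented names):

```typescript
// Sketch of the Decision 2 pattern: the daemon captures the continueToken in a
// closure and exposes only complete_step(notes) to the LLM. The token is
// rotated on every advance and never appears in the model's context.
type AdvanceFn = (token: string, notes: string) => string; // returns next token

function makeCompleteStep(initialToken: string, advance: AdvanceFn) {
  let token = initialToken; // held in the closure, invisible to the LLM
  return (notes: string): void => {
    token = advance(token, notes); // engine records the step and rotates the token
  };
}
```

The design choice this illustrates: token custody stays entirely on the daemon side, so a confused or adversarial model turn cannot replay or leak a continueToken.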
+
+ **Decision 3: AgentLoop error handling contract -- FatalToolError (Apr 16)**
+ `FatalToolError` subclass selected for distinguishing recoverable from non-recoverable tool failures in the AgentLoop. The contract: user-facing tools (Bash, Read, Write) catch failures and return `isError: true` in the tool_result (the loop continues, the LLM can retry). Coordination tools with unrecoverable failures (session store corruption, token decode failure) throw `FatalToolError` -- `_executeTools` instanceof-checks this and kills the session rather than surfacing a confusing error to the LLM. This contract is part of the AgentLoop architecture and must be followed by any new tool implementations.
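The Decision 3 contract, reduced to its skeleton (a sketch with hypothetical helper names -- the real `_executeTools` has more machinery, but the branching is the contract):

```typescript
// Sketch of the AgentLoop error contract: recoverable tool failures become
// isError tool_results the LLM can react to; FatalToolError escapes the
// try/catch and kills the session at the loop level.
class FatalToolError extends Error {}

interface ToolResult { isError: boolean; content: string; }

function executeTool(run: () => string): ToolResult {
  try {
    return { isError: false, content: run() };
  } catch (err) {
    if (err instanceof FatalToolError) throw err; // unrecoverable: kill the session
    return { isError: true, content: String(err) }; // recoverable: surface to the LLM
  }
}
```

New tool implementations choose their side of the contract by what they throw: ordinary `Error` for "the LLM should see this and retry", `FatalToolError` for "the session cannot safely continue".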
+
+ **Decision 4: Use `wr.discovery` for discovery-only tasks, not `coding-task-workflow-agentic` (Apr 17)**
+ Discovered from a broken session: `coding-task-workflow-agentic` dispatched with "do discovery only, no code" ran 11 step advances then stopped without `run_completed`. The workflow's implementation phases fired even with explicit instructions not to code. Lesson: when a trigger or coordinator wants pure discovery/research, use `wr.discovery` as the workflowId. `coding-task-workflow-agentic` should only be dispatched when implementation is the actual goal.
+
+ **Decision 5: Bug -- MCP server EPIPE crash (Apr 18)**
+ Root cause confirmed with 15 production crash log entries: `process.stderr` is missing an `'error'` event handler in `registerFatalHandlers()`. When an MCP client disconnects, Node.js emits `EPIPE` on stderr, which crashes the process with an unhandled error. `process.stdout` already has equivalent protection via `wireStdoutShutdown()`. Fix: mirror the stdout protection for stderr. One-line fix being implemented in PR `fix/mcp-stderr-epipe-crash`.
+
+ ---
+
+ ### worktrain status → console integration (Apr 18, 2026)
+
+ The `worktrain status` CLI command is Phase 1. Phase 2: the same data and rendering lives inside the console as the default landing view when you open it -- not the sessions list, the overview. Same `StatusDataPacket` type, two surfaces. The console overview replaces the need to run a CLI command; it auto-refreshes and stays live.
+
+ ---
+
+ ### WorkTrain as a native macOS app (Apr 18, 2026)
+
+ Long-term vision: WorkTrain becomes a full native Mac app -- not just a CLI + web console, but a proper macOS application with a menubar icon, system notifications, windows, and native UX.
+
+ **What this unlocks:**
+ - Always-on menubar presence showing daemon status at a glance
+ - Native macOS notifications (already built via osascript -- the app version would use the UserNotifications framework directly)
+ - The `worktrain status` overview as a native window, not a browser tab
+ - Message queue and inbox as a native interface (type a message from anywhere on your Mac, not just the terminal)
+ - Background daemon management -- start/stop/restart from the menubar without a terminal
+ - Deep system integration: file system events, calendar, Contacts, native share sheet
5819
+
5820
+ **Tech stack options:**
5821
+ - Swift/SwiftUI: full native, best macOS integration, steeper learning curve from TypeScript
5822
+ - Electron + existing console UI: fastest path, same TypeScript codebase, but heavy
5823
+ - Tauri: Rust core + existing web frontend, lighter than Electron, good macOS support
5824
+ - React Native macOS: reuses React knowledge, not quite native feel
5716
5825
 
5717
- **2. Kill the agent cleanly on detected failure**
5718
- - When internal tooling is detected broken, stop the agent immediately -- do NOT let it continue without enforcement
5719
- - Retain the full conversation history and step notes up to the point of failure
5720
- - Mark the session as `interrupted_tooling_failure` (distinct from `error` or `timeout`)
5721
- - Write the failure event to the daemon event log with the exact cause
5826
+ **Recommended path:** Tauri wrapping the existing console UI. The console is already a React/Vite app. Tauri gives native menubar, notifications, and system APIs without rewriting the frontend. The WorkTrain daemon stays as a separate process managed by the app.
5722
5827
 
5723
- **3. Self-diagnose**
5724
- - Run a lightweight health check: can we reach the session store? Can we decode the continueToken? Is the WorkRail engine responding?
5725
- - Identify the root cause: MCP disconnect? Session store corruption? Token decode failure? Port conflict?
5726
- - Distinguish recoverable (restart and resume) from non-recoverable (session data corrupted, must restart from scratch)
5828
+ **This is a post-v1 platform decision** -- not a near-term priority, but worth designing toward. Don't make architectural decisions that would make the Tauri wrapper hard later.
5727
5829
 
5728
- **4. Fix and reboot**
5729
- - For recoverable failures: restart the WorkRail engine in-process, re-register tools, verify health before resuming
5730
- - For MCP-mode failures: reconnect without killing the parent session
5731
- - For port conflicts: clear the lock and rebind
5732
- - All of this happens automatically, without user intervention
5830
+ ---
5831
+
5832
+ ### Long-running sessions: stay open across agent handoffs (Apr 18, 2026)
5833
+
5834
+ **The problem:** today when an MR review session completes, it writes its findings and exits. If the findings require fixes, a new fix agent starts from scratch with no shared context. When the fix is done, a new re-review agent also starts from scratch. Three sessions that are logically one unit of work are isolated from each other.
5835
+
5836
+ **The vision:** a session can stay open and wait -- dormant but alive -- while another agent does work. When that work completes, the waiting session resumes with full context continuity.
5837
+
5838
+ **The MR review example:**
5839
+
5840
+ ```
5841
+ [MR review session] finds: 2 critical, 3 minor
5842
+ → stays open, waiting for fixes
5843
+
5844
+ [Fix agent session] addresses all 5 findings
5845
+ → completes, signals "fixes ready"
5846
+
5847
+ [MR review session resumes] re-reads the diff, re-evaluates
5848
+ → all 5 verified fixed, 0 new findings
5849
+ → completes with APPROVE verdict
5850
+ ```
5851
+
5852
+ The same session that found the issues verifies the fixes. No context reconstruction. No risk of re-review missing something the original reviewer knew.
5853
+
5854
+ **Other use cases for waiting sessions:**
5855
+
5856
+ - **Architecture review waiting for approval:** architect session identifies a design gap, waits for the human to decide on direction, resumes when the decision is recorded
5857
+ - **Discovery session waiting for data:** a research session identifies that it needs a specific file or API response, signals "blocked on: fetch X", waits for a retrieval agent to deliver it, resumes with the data injected
5858
+ - **Coordinator waiting on child completion:** instead of a coordinator script polling `worktrain await`, the coordinator session can yield and be resumed by the daemon when child sessions complete -- same session, same context, no polling overhead
5859
+ - **Spec authoring waiting for stakeholder input:** a spec session writes a draft, flags "needs: human review of acceptance criteria", waits, resumes when the human adds a comment
5860
+ - **Integration test waiting for deployment:** a test coordination session waits for a deploy to complete before running integration tests
5861
+
5862
+ **The key insight: the LLM doesn't experience waiting.**
5733
5863
 
5734
- **5. Resume with context**
5735
- - Resume the session from the last confirmed `complete_step` / `advance_recorded` event
5736
- - Inject a `<context>` block into the resumed session: "Your previous session was interrupted due to [reason]. Here is what you completed before the interruption: [last 3 step notes]. Resume from where you left off."
5737
- - The agent never knows the interruption happened -- it just continues
5738
- - If resume fails (token expired, session corrupted), escalate to the user with full context on what was completed and what wasn't
5864
+ LLMs have no concept of time. Between one turn and the next, zero time passes from the agent's perspective. This means "waiting" is not a thing that happens to the agent -- it just doesn't receive its next turn until the coordinator has something to give it.
5739
5865
 
5740
- **6. Audit trail**
5741
- - Every self-heal event is recorded in `~/.workrail/events/daemon/YYYY-MM-DD.jsonl` as `tooling_failure_detected`, `self_heal_started`, `self_heal_succeeded` / `self_heal_failed`
5742
- - The console shows a `⚠️ self-healed` badge on sessions that were interrupted and resumed
5743
- - The user can query: "which sessions were interrupted today?" and get the full list with causes
5866
+ The session is paused at the engine level (DAG holds at a node, no new turns issued). The agent submitted its output and simply hasn't received a response yet. When the coordinator is ready -- fix agent completed, human reviewed, deployment finished -- it advances the session with a turn that contains the new context. From the agent's perspective: it submitted findings and immediately received "here are the fixes, verify them."
5867
+
5868
+ **No `wait_for` primitive needed at the workflow level.** The coordinator is the timing mechanism. This is the coordinator's job: know when each session is ready for its next input, and deliver that input at the right time.
5869
+
5870
+ ```
5871
+ Coordinator logic:
5872
+
5873
+ 1. Advance review session to "findings complete" node
5874
+ 2. Read findings from session output
5875
+ 3. Spawn fix agent with those findings
5876
+ 4. Wait for fix agent to complete (worktrain await)
5877
+ 5. Inject fix summary into review session's next turn
5878
+ 6. Advance review session: "Here are the fixes. Verify them."
5879
+ → LLM receives this as the natural next step, no time gap perceived
5880
+ ```
5881
+
5882
+ **Why this is more powerful than re-running a fresh session:**
5883
+
5884
+ - **Context continuity:** the reviewer remembers what it found, why it flagged it, what invariants it was checking. A fresh session has to re-discover all of that.
5885
+ - **Relational memory:** "does this fix address the root cause I identified, or just the symptom?" -- only the original session knows the root cause reasoning.
5886
+ - **Efficiency:** no redundant context gathering. The resumed session picks up exactly where it left off.
5887
+ - **The agent doesn't know it's coordinating:** from the agent's view, it's a continuous workflow. The coordinator manages the timing externally.
5744
5888
 
5745
5889
  **Implementation path:**
5746
- - Phase 1: detection only -- log `tooling_failure_detected` when a session produces output without `run_completed`. Surface in console and status command.
5747
- - Phase 2: kill on detection -- stop agents immediately when tooling failure detected. No more unverified output reaching main.
5748
- - Phase 3: auto-resume -- restart and resume for recoverable failures.
5749
- - Phase 4: full self-heal loop -- diagnose, fix, reboot, resume automatically.
5750
5890
 
5751
- **The invariant that must hold:** no output from a WorkTrain agent should be acted on (committed, merged, posted) unless its session has a confirmed `run_completed` event. This is the enforcement guarantee. Self-healing is what makes that guarantee survivable.
5891
+ - Phase 1: coordinator scripts withhold `complete_step` advancement until the condition is met. This already works today -- the coordinator just doesn't advance the session until the fix agent is done.
5892
+ - Phase 2: the coordinator passes structured context when advancing: `complete_step(session, { injectedContext: fixSummary })`. The session receives it as part of the next step's prompt.
5893
+ - Phase 3: declarative pipelines -- workflow JSON declares that step N waits for an external condition before proceeding. The coordinator reads this and manages the timing automatically. No hand-coded coordinator script needed for common patterns.
5752
5894
 
5753
5895
  ---
5754
5896
 
5755
- ### CRITICAL architectural clarity: three systems, one shared engine (permanent reference)
5897
+ ### Coordinatable workflow steps: confirmation points the coordinator can satisfy (needs discovery, Apr 18, 2026)
5898
+
5899
+ ⚠️ **Needs discovery before implementation. The questions below are open, not answered.**
5900
+
5901
+ **The insight:** workflows already have `requireConfirmation: true` on certain steps -- these are natural coordination points. Right now they pause for a human. The idea is to make them also pausable-for-a-coordinator, so a coordinator (or another agent) can be the one that responds instead of a human.
5902
+
5903
+ **The vision:**
5904
+ A workflow reaches a `requireConfirmation` step. In MCP mode (human-driven), it behaves exactly as today -- pauses and waits. In daemon/coordinator mode, instead of blocking forever, the coordinator can:
5905
+ - Inject a synthesized answer based on external work it just did ("architecture review found X, proceed with approach A")
5906
+ - Spawn another agent to generate the answer and inject its output
5907
+ - Ask a discovery agent to weigh in and forward the result
5908
+ - Simply forward a human's message from the message queue
5909
+
5910
+ The original session never knows whether a human or a coordinator satisfied the confirmation. It just receives the next turn with context.
5911
+
5912
+ **Why this is powerful:**
5913
+ Today the coordinator is external to the workflow -- it orchestrates sessions from outside. This makes the workflow itself coordinatable from within, so multi-agent collaboration can be declared in the workflow spec rather than bolted on in coordinator scripts.
5756
5914
 
5757
- This is the most important architectural fact about this codebase. Every agent and contributor must understand this before touching anything.
5915
+ **What's unknown and needs discovery:**
5916
+ 1. **Mechanism:** is this an enriched `requireConfirmation` (add a `coordinatable: true` flag?), a new step type (`requireCoordinatorInput`?), or something at the engine level? Tradeoffs between each.
5917
+ 2. **What gets injected:** always a structured decision ("proceed/revise/abort + findings"), or also data injection ("here are the file contents", "here's what the API returned")? How does the step receive it -- as a new tool call result, as a steer, as part of the step prompt?
5918
+ 3. **Coordinator discovery:** how does the coordinator know a step is waiting for it vs waiting for a human? Does it poll the session state? Does the session emit a `coordinator_gate_pending` event? (This connects to the `waitForCoordinator` spec in this backlog.)
5919
+ 4. **Timeout/fallback:** if the coordinator never responds, what happens? Fall back to human? Error? Configurable?
5920
+ 5. **MCP invariant:** must behave identically to today in MCP/human-driven mode. The coordinator path is additive, not a behavior change for existing users.
5758
5921
 
5759
- **WorkRail/WorkTrain is three separate systems sharing one engine:**
5922
+ **Relationship to other specs:**
5923
+ - "Long-running sessions: stay open across agent handoffs" -- the session pauses at the confirmation point, coordinator acts, session resumes
5924
+ - "POST /api/v2/sessions/:id/steer" -- this might be the injection mechanism
5925
+ - `signal_coordinator` tool -- the session might signal the coordinator instead of blocking
5926
+ - `waitForCoordinator` step flag (already in this backlog) -- same underlying need, different framing
5927
+ - "Coordinator review mode: self-healing vs comment-and-wait" -- confirmation points are where that routing decision gets expressed
5760
5928
 
5929
+ ---
5930
+
5931
+ ## Architecture Decision: Three-Workflow Pipeline (Apr 18, 2026)
5932
+
5933
+ ### Decision
5934
+
5935
+ The canonical WorkRail workflow pipeline for new features is:
5936
+
5937
+ ```
5938
+ wr.discovery (optional) → wr.shaping (optional) → coding-task-workflow-agentic
5761
5939
  ```
5762
- Shared core
5763
- ┌─────────────────────────────┐
5764
- │ WorkRail engine │
5765
- │ src/v2/durable-core/ │
5766
- │ ~/.workrail/data/sessions/ │
5767
- │ workflow registry │
5768
- └────────┬──────────┬─────────┘
5769
- │ │ │
5770
- ┌──────────────▼─┐ ┌─────▼──────┐ ┌▼─────────────┐
5771
- WorkRail MCP │ │ WorkTrain │ │ WorkRail │
5772
- │ Server │ │ Daemon │ │ Console │
5773
- workrail start │ │ worktrain │ │ worktrain │
5774
- │ src/mcp/ │ │ daemon │ │ console │
5775
- │ │ │ src/daemon/│ │ src/console/
5776
- │ Claude Code │ │ src/trigger│ │ │
5777
- connects here │ │ │ │ Shows BOTH │
5778
- via stdio │ │ autonomous │ │ MCP + daemon
5779
- │ │ │ agent loop │ │ sessions │
5780
- └────────────────┘ └────────────┘ └──────────────┘
5940
+
5941
+ Each workflow is independently useful. The pipeline is an optional chain, not a required sequence.
5942
+
5943
+ ### Rationale
5944
+
5945
+ **wr.discovery** produces a direction -- what problem is worth solving. Output: structured discovery notes at `.workrail/discovery/`.
5946
+
5947
+ **wr.shaping** produces a bounded pitch -- what specifically to build and explicitly NOT build, at a product level. Output: `.workrail/current-pitch.md`. Faithful Shape Up methodology. Tech-agnostic. No code-level content.
5948
+
5949
+ **coding-task-workflow-agentic** produces running code -- engineering approach, sliced implementation, verification. When pitch.md exists (Phase 0.5), it skips design ideation and translates the pitch directly into an engineering approach. The pitch's no-gos and appetite are binding constraints.
5950
+
5951
+ ### No TechSpec workflow needed
5952
+
5953
+ The coding workflow already does everything a TechSpec workflow would do: Phase 1b generates design candidates, Phase 1c selects and challenges the approach, Phase 3 writes the spec and implementation plan. Adding a separate TechSpec workflow would duplicate this and create a question of which is canonical. The coding workflow is the engineering planning layer.
5954
+
5955
+ **The split that matters is product vs engineering:**
5956
+ - Product decisions (what to build, for whom, within what time) → wr.shaping
5957
+ - Engineering decisions (how to build it, which interfaces, which tests) → coding workflow
5958
+
5959
+ ### When to skip shaping
5960
+
5961
+ - Task is small, concrete, and clearly scoped → go straight to coding workflow
5962
+ - Discovery already produced a bounded, implementable direction
5963
+ - You have a pre-written ticket or spec that already defines what to build
5964
+
5965
+ ### Faithful Shape Up constraint
5966
+
5967
+ wr.shaping is tech-agnostic. A pitch for a Kotlin Android app and a pitch for a Python API service look structurally identical. No file paths, no function signatures, no implementation details. This makes pitches usable by human engineering teams at companies using Shape Up, not just WorkRail's coding workflow.
5968
+
5969
+ ### Phase 0.5 mechanics
5970
+
5971
+ When `coding-task-workflow-agentic` finds `.workrail/current-pitch.md`:
5972
+ 1. Reads all five pitch sections (Problem, Appetite, Solution/Elements, Rabbit Holes, No-Gos)
5973
+ 2. Sets `shapedInputDetected=true`
5974
+ 3. Skips phases 1a-1c (hypothesis, design generation, challenge-and-select)
5975
+ 4. Phase 1d translates pitch elements/invariants/no-gos into an engineering approach
5976
+ 5. Plan audit (Phase 4) checks for drift against the pitch
5977
+ 6. Appetite is a hard ceiling -- oversized engineering work becomes follow-up tickets
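
The detection in step 1-2 can be sketched as a small check. This is an illustrative sketch, not the shipped implementation -- `detectShapedInput`, `PitchDetection`, and the heading-based section heuristic are assumptions; only the pitch path and the five section names come from the notes above.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical sketch of the Phase 0.5 check. Only the pitch path and the
// five section names come from the workflow notes; the names and the
// heading heuristic are illustrative.
const REQUIRED_SECTIONS = ["Problem", "Appetite", "Solution", "Rabbit Holes", "No-Gos"];

interface PitchDetection {
  shapedInputDetected: boolean;
  missingSections: string[];
}

function detectShapedInput(repoRoot: string): PitchDetection {
  const pitchPath = path.join(repoRoot, ".workrail", "current-pitch.md");
  if (!fs.existsSync(pitchPath)) {
    return { shapedInputDetected: false, missingSections: [...REQUIRED_SECTIONS] };
  }
  const lines = fs.readFileSync(pitchPath, "utf8").split("\n");
  // A section counts as present when some markdown heading mentions it.
  const missing = REQUIRED_SECTIONS.filter(
    (section) => !lines.some((l) => l.startsWith("#") && l.includes(section))
  );
  // Treat the pitch as binding only when all five sections are present.
  return { shapedInputDetected: missing.length === 0, missingSections: missing };
}
```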
+
+
+ ---
+
+ ## Idea: `context-gather` Step Type (Apr 19, 2026)
+
+ ### Problem
+
+ Phase 0.5 in the coding workflow currently looks for a shaped pitch by checking a local path. This doesn't handle: coordinator-injected context, manually written docs (GDoc, Confluence, Notion), Glean-indexed artifacts, or URLs embedded in the task description. The search logic is duplicated if other workflows need the same document.
+
+ ### Proposed primitive
+
+ A new engine-level step type `context-gather` that resolves a named context artifact from ordered sources:
+
+ ```json
+ {
+   "type": "context-gather",
+   "id": "gather-pitch",
+   "contextType": "shaped-pitch",
+   "outputVar": "shapedInput",
+   "optional": true,
+   "sources": ["coordinator-injected", "local-paths", "task-url", "glean"]
+ }
  ```
 
- **WorkRail MCP Server:** `workrail start` → stdio MCP server. Claude Code connects here. Provides `start_workflow`, `complete_step`, `list_workflows` etc. as MCP tools. Source: `src/mcp/`. Must be bulletproof -- a crash kills all Claude Code workflow tools.
+ **Source resolution order (stops at first hit):**
+ 1. `coordinator-injected` -- coordinator already attached context of this type to the session (most common in autonomous mode)
+ 2. `local-paths` -- check `.workrail/current-pitch.md`, `pitch.md`, `PRD.md`, `.workrail/pitches/` (most recent)
+ 3. `task-url` -- extract any URL from the task description and fetch via WebFetch or matching MCP (GDoc, Confluence, Notion)
+ 4. `glean` -- search Glean for recent docs matching the task keywords and `contextType`; opt-in only (risk of false positives silently constraining wrong scope)
+
+ If `optional: true` and no source resolves: `outputVar = null`, workflow continues normally.
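
The resolution loop above can be sketched as follows, assuming each source is an async lookup that returns the artifact text or null. The four source names and the stop-at-first-hit rule come from the list above; `gatherContext`, `SourceLookup`, and the signature are hypothetical.

```typescript
// Illustrative first-hit resolver for a `context-gather` step. The source
// names and stop-at-first-hit rule come from the notes; the function and
// type names are made up.
type SourceName = "coordinator-injected" | "local-paths" | "task-url" | "glean";
type SourceLookup = (contextType: string) => Promise<string | null>;

async function gatherContext(
  contextType: string,
  sources: readonly SourceName[],
  lookups: Record<SourceName, SourceLookup>,
  optional: boolean
): Promise<string | null> {
  for (const name of sources) {
    const hit = await lookups[name](contextType);
    if (hit !== null) return hit; // stops at first hit; later sources never run
  }
  if (!optional) {
    throw new Error(`required context "${contextType}" not found in any source`);
  }
  return null; // optional: outputVar = null, workflow continues normally
}
```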
+
+ ### Why engine-level, not a routine
+
+ - Coordinator intercept requires the engine to check "has this type already been provided?" before running any search -- a routine can't express that
+ - `contextType` is a declared intent multiple workflows can share (`wr.shaping`, `coding-task-workflow`, `wr.discovery`) without duplicating resolver logic
+ - New sources (Linear, Jira, Notion) get added to the engine once, immediately available to all workflows
+
+ ### Relationship to existing work
+
+ - Replaces/supersedes Phase 0.5's current local-path check in `coding-task-workflow-agentic`
+ - Coordinator PR-review flow would inject `shaped-pitch` context before spawning the coding session
+ - Any workflow that needs "find the spec/pitch/PRD for this task" uses the same step type
+
+ ### Open questions
+
+ - How does the coordinator inject context into a session? Via a session variable set before `start_workflow`, or a new `inject_context` call?
+ - How does `task-url` distinguish a GDoc URL from a Confluence URL from a Notion URL? MCP routing by domain?
+ - What is the `contextType` vocabulary? Start with `shaped-pitch` -- what else? (`discovery-notes`, `design-spec`, `api-contract`?)
+ - Glean false-positive risk: wrong document fed as shaped input silently constrains wrong scope. Needs confidence threshold or explicit user confirmation when Glean is the only hit.
+
+
+ ---
+
+ ## Completed (Apr 19, 2026)
 
- **WorkTrain Daemon:** `worktrain daemon` → autonomous agent runner. Drives sessions without human involvement. Calls the WorkRail engine **directly in-process** -- does NOT go through the MCP server. Source: `src/daemon/`, `src/trigger/`.
+ ### wr.shaping -- Faithful Shape Up shaping workflow
 
- **WorkRail Console:** `worktrain console` → unified read-only session viewer. Shows sessions from **both** the MCP server and the daemon (they share the same session store). Requires neither the MCP server nor the daemon to be running.
+ Created `workflows/wr.shaping.json`. Faithful Shape Up methodology, tech-agnostic, produces `.workrail/current-pitch.md` only. Nine steps: ingest → frame gate → diverge (6 shapes, Verbalized Sampling) → converge → breadboard + elements → rabbit holes + no-gos → draft/critique loop → approval gate → write pitch.md. Two human gates with autonomous fallback. Appetite is calendar-time only (xs/s/m/l/xl). No code-level content -- a pitch for a Kotlin app and a pitch for a Python service look structurally identical.
 
- **Rules that follow from this:**
- - MCP server code must never import from `src/daemon/` or `src/trigger/`
- - Daemon code must never depend on the MCP server being alive
- - Console code reads the session store directly -- no IPC with MCP or daemon needed
- - These are separate processes. A crash in one does not affect the others.
+ ### coding-task-workflow-agentic -- Upstream context Phase 0.5
+
+ Added Phase 0.5 "Locate Upstream Context" to `coding-task-workflow-agentic.json`. Format-agnostic: the agent uses whatever tools are available (repo search, WebFetch, Confluence/Notion/Glean MCPs, etc.) to find any upstream document -- pitch, PRD, BRD, RFC, design doc, user story, Jira epic, etc. Sets `upstreamSpecDetected` + `solutionFixed` flags. When `solutionFixed=true`, design ideation phases (1a-1c) are skipped and Phase 1d translates upstream constraints directly into an engineering approach. Plan audit (Phase 4) checks for drift against `upstreamBoundaries` whenever an upstream document was found.
+
+ Also consolidated from three workflow variants to one canonical file.
+
+
+ ---
+
+ ## Current state update (Apr 19, 2026)
+
+ **npm version: v3.40.0**
+
+ ### What shipped since v3.36.0 (Apr 18 -- Apr 19)
+
+ - ✅ **`wr.shaping`** -- faithful Shape Up shaping workflow (9 steps, two human gates with autonomous fallback)
+ - ✅ **`coding-task-workflow-agentic` Phase 0.5** -- upstream context detection; skips design phases when solution is pre-specified. Three-workflow pipeline: discovery → shaping → coding.
+ - ✅ **Coding workflow consolidated** -- from three variants (lean, full, lean.v2) to one canonical file.
+ - ✅ **HttpServer removed from MCP server** (#601) -- pure stdio. MCP server can no longer accidentally start an HTTP server.
+ - ✅ **Late-bound goals** (#604) -- `goalTemplate: "{{$.goal}}"` defaults for webhook-driven sessions. Goals can come from the payload, not just the static trigger definition.
+ - ✅ **Coordinator message queue drain** (#606) -- `pr-review` coordinator reads `~/.workrail/message-queue.jsonl` before each spawn cycle. `worktrain tell stop`, `skip-pr <n>`, `add-pr <n>` work.
+ - ✅ **Notifications shipped** -- `NotificationService` implemented, wired into `TriggerRouter` via `trigger-listener.ts`. `WORKTRAIN_NOTIFY_MACOS=true` and `WORKTRAIN_NOTIFY_WEBHOOK=<url>` in `~/.workrail/config.json`.
+ - ✅ **`worktrain run pr-review`** -- fully wired coordinator command. `spawnSession` → `awaitSessions` → `getAgentResult` (session-wide artifact aggregation) → `parseFindingsFromNotes` → route by severity.
+ - ✅ **`wr.review_verdict` artifact path** -- end-to-end wired: `mr-review-workflow.agentic.v2.json` phase-6 emits it, `artifact-contract-validator.ts` validates it at `continue_workflow` time, coordinator reads it with keyword-scan fallback.
+ - ✅ **`worktrain logs` / `worktrain health`** -- structured daemon log tailing and per-session health summary. `worktrain status <id>` deprecated in favor of `worktrain health <id>`.
+ - ✅ **`signal_coordinator` tool** -- agent can emit structured mid-session signals (`progress`, `finding`, `data_needed`, `approval_needed`, `blocked`) without advancing the step.
+ - ✅ **`ChildWorkflowRunResult` + `assertNever`** -- spawn_agent delivery_failed bug fixed. The `delivery_failed` impossible state is compile-time excluded.
+ - ✅ **`lastStepArtifacts` on `WorkflowRunSuccess`** -- `onComplete` callback forwards artifacts alongside notes. Coordinator can read typed artifacts from the result without a separate HTTP call.
+ - ✅ **`steerRegistry` + POST `/sessions/:id/steer`** -- coordinator injection endpoint wired in daemon console. Running sessions register a steer callback; coordinators can inject mid-session messages via HTTP.
+ - ✅ **GitHub polling adapters** -- `github_issues_poll` and `github_prs_poll` providers fully implemented alongside existing `gitlab_poll`.
+ - ✅ **Knowledge graph spike** -- `src/knowledge-graph/` module: DuckDB in-memory + ts-morph indexer + two validation queries. NOT yet wired to an MCP tool (ts-morph in devDependencies).
+ - ✅ **`worktrain daemon --install`** -- launchd plist creation, load, verify. Daemon survives MCP server reconnects.
+ - ✅ **Performance sweep** -- April 2026 sweep identified the 10 highest-leverage fixes, filed as issues #248-257. Not yet merged.
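
The `ChildWorkflowRunResult` + `assertNever` item is worth a sketch for contributors unfamiliar with the pattern. The union below is illustrative, not the real `ChildWorkflowRunResult` variants:

```typescript
// Minimal exhaustiveness-guard sketch. The two-variant union is made up;
// the real ChildWorkflowRunResult has its own variants.
type ChildRunResult =
  | { kind: "completed"; notes: string }
  | { kind: "failed"; error: string };

function assertNever(value: never): never {
  throw new Error(`unreachable variant: ${JSON.stringify(value)}`);
}

function describeRun(result: ChildRunResult): string {
  switch (result.kind) {
    case "completed":
      return `done: ${result.notes}`;
    case "failed":
      return `failed: ${result.error}`;
    default:
      // Adding a new variant to ChildRunResult makes this call stop
      // compiling until the switch handles it -- impossible states like
      // the old delivery_failed path are excluded at compile time.
      return assertNever(result);
  }
}
```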
+
+ ### Accurate limitations (as of v3.40.0)
+
+ 1. **Console session tree UI not built** -- `parentSessionId` is stored in the `session_created` event and in `WorkflowRunSuccess`. Console `RunLineageDag` shows the per-session step DAG only. Cross-session parent-child tree is data-only. PRs #607 (tree view) and #608 (steer endpoint) are OPEN.
+ 2. **Daemon tool set is minimal** -- agent has: `complete_step`, `continue_workflow` (deprecated), `Bash`, `Read`, `Write`, `report_issue`, `spawn_agent`, `signal_coordinator`. No `Glob`, `Grep`, or `Edit`. Read/Write are thin wrappers.
+ 3. **`worktrain tell` messages only drained by coordinator** -- `drainMessageQueue` is called by `runPrReviewCoordinator`, not by the daemon loop. A running autonomous session cannot receive mid-run injections from `worktrain tell`. The `steerRegistry` HTTP endpoint is the mid-session channel.
+ 4. **Knowledge graph not wired** -- module exists, ts-morph must move to dependencies before an MCP tool can be built.
+ 5. **`spawn_agent` return missing `artifacts`** -- returns `{ childSessionId, outcome, notes }` only. Typed artifacts from the child session are not surfaced to the parent agent. `lastStepArtifacts` on `WorkflowRunSuccess` exists but spawn_agent doesn't return it.
+ 6. **`worktrain inbox --watch` stub** -- `--watch` flag prints "not yet implemented" and exits.
+ 7. **Artifact store not built** -- agents still dump markdown/files directly into the repo. `~/.workrail/artifacts/` directory structure not created.
+ 8. **Performance issues not fixed** -- issues #248-257 filed from April sweep. `continue_workflow` triggers 6+ event log scans, full session rebuild per `/api/v2/sessions` request, N+1 workflow fetches, no caching.
+ 9. **No auto-commit** -- agents can write code but do not commit, push, or open PRs autonomously.
+ 10. **Assessment gates not battle-tested** -- end-to-end flow with `outputContract: required: true` not validated in production use.
+
+ ### Open PRs to merge
+
+ - **#607** `feat(console): add session tree view for coordinator sessions` -- cross-session parent-child hierarchy in console. Blocked on: `parentSessionId` data is in the store but console routes need to surface it.
+ - **#608** `feat(console): add POST /api/v2/sessions/:sessionId/steer for coordinator injection` -- NOTE: this endpoint is already implemented in `daemon-console.ts` via `steerRegistry`. PR #608 may be adding this to the MCP server console separately. Check before merging.
+ - **#610** `feat(workflows): add wr.shaping` -- the shaping workflow. Ready to merge.
+ - **#587** `fix(mcp): add assertNever exhaustiveness guard to TriggerRouter` -- likely already applied in the codebase (ChildWorkflowRunResult assertNever is live). May be a duplicate or different scope. Check.
+
+ ### Next priorities (groomed Apr 19)
+
+ 1. **Merge #610 (wr.shaping)** -- ready. Workflow is implemented and in the branch.
+ 2. **Merge #587 (TriggerRouter assertNever)** -- quick fix, check if still relevant.
+ 3. **Review and merge #607 + #608** -- console tree view and steer endpoint. Verify #608 doesn't duplicate what's already live in daemon-console.ts.
+ 4. **Performance fixes** -- issues #248-257. Pick highest-leverage first: SessionIndex (#248) and console projection cache (#249) eliminate most of the repeated scans.
+ 5. **Daemon tool set: add Glob + Grep** -- agents routinely need to search files. `Read` + `Bash` grep is slow and lossy. Native `Glob` and `Grep` tools would make coding sessions more reliable.
+ 6. **`spawn_agent` artifacts gap** -- add `artifacts?: readonly unknown[]` to the return value. `lastStepArtifacts` is already on `WorkflowRunSuccess`; wiring it through is ~30 LOC.
+ 7. **Knowledge graph wiring** -- move `ts-morph` and `@duckdb/node-api` to dependencies, add `query_knowledge_graph` MCP tool.
+ 8. **Artifact store foundation** -- `~/.workrail/artifacts/` directory, write path in `complete_step`.
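
Priority 6 is small enough to sketch. The surrounding types are reconstructed assumptions -- only `lastStepArtifacts`, `childSessionId`, `outcome`, `notes`, and the proposed `artifacts` field are named in the notes above:

```typescript
// Hedged sketch of the spawn_agent artifacts wiring. Field names from the
// notes are kept; the shapes of both types are illustrative.
interface WorkflowRunSuccess {
  sessionId: string;
  notes: string;
  lastStepArtifacts?: readonly unknown[];
}

interface SpawnAgentResult {
  childSessionId: string;
  outcome: "success" | "failure";
  notes: string;
  artifacts?: readonly unknown[]; // the proposed addition
}

function toSpawnAgentResult(run: WorkflowRunSuccess): SpawnAgentResult {
  return {
    childSessionId: run.sessionId,
    outcome: "success",
    notes: run.notes,
    // Forward the typed artifacts instead of dropping them on the floor.
    artifacts: run.lastStepArtifacts,
  };
}
```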
 
  ---
 
- ### CRITICAL architectural clarity: three systems, one shared engine (permanent reference)
+ ### wr.shaping workflow: shape messy problems into implementation-ready specs (needs authoring, Apr 18, 2026)
+
+ **Status:** Design complete. Ready to author as a WorkRail workflow JSON.
+
+ **Design docs:**
+ - `docs/design/shaping-workflow-discovery.md` -- WorkRail-internal discovery findings
+ - `docs/design/shaping-workflow-external-research.md` -- External research synthesis (Shape Up, LLM failure modes, artifact schema)
+
+ **The gap this fills:** WorkRail has `wr.discovery` (divergent) and `coding-task-workflow-agentic` (convergent). Shaping is the missing middle -- converting messy discovery output into a bounded, implementation-ready spec without mid-implementation rabbit holes.
 
- This is the most important architectural fact about this codebase. Every agent and contributor must understand this before touching anything.
+ **The 11-step skeleton (see design doc for full detail):**
+ 1. ingest_and_extract -- extract problem frames, forces, open questions
+ 2. **frame_gate** -- MANDATORY HUMAN GATE: confirm problem + appetite
+ 3. diverge_solution_shapes -- 4 parallel rough shapes with varied framings
+ 4. converge_pick -- SEPARATE JUDGE (different model/prompt): pick best shape
+ 5. breadboard_and_elements -- fat-marker breadboard + Interface/Invariant/Exclusion classification
+ 6. rabbit_holes_nogos -- adversarial: risks, mitigations, no-gos, assumptions
+ 7. context_pack_build -- file globs, reuse_utilities, conventions, do-not-touch boundaries
+ 8. example_map_and_gherkin -- Given/When/Then acceptance criteria + verification commands
+ 9. draft_pitch -- self-refine ×2, SEPARATE CRITIC (obfuscated authorship)
+ 10. **approval_gate** -- MANDATORY HUMAN GATE: approve, edit, or restart
+ 11. finalize_and_handoff -- schema validation, emit shape.json + pitch.md
 
- **WorkRail/WorkTrain is three separate systems sharing one engine:**
+ **The single most important design decision:** generator and critic run on structurally different prompts (ideally different model families). CoT and self-reflection alone do NOT mitigate anchoring or self-preference bias (Lou & Sun 2025; Panickssery et al. 2024).
+
+ **Output artifact:** `shape.json` -- contains problem story, appetite (multi-dimensional: calendar + tokens + turns + files), breadboard, elements, context_pack (file boundaries + reuse_utilities), Gherkin acceptance criteria, rabbit holes, no-gos, decomposition with walking skeleton, assumptions_log, build_readiness_score.
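
Transcribing that field list into a rough TypeScript shape; every name, nesting choice, and type here is a guess pending the real schema in the design doc:

```typescript
// Assumed shape.json structure, transcribed from the field list in the
// notes above. All names and types are placeholders for the real schema.
interface ShapeArtifact {
  problem_story: string;
  appetite: {
    calendar: "xs" | "s" | "m" | "l" | "xl"; // multi-dimensional appetite
    tokens: number;
    turns: number;
    files: number;
  };
  breadboard: string;
  elements: Array<{ kind: "interface" | "invariant" | "exclusion"; text: string }>;
  context_pack: { file_boundaries: string[]; reuse_utilities: string[] };
  acceptance_criteria: string[]; // Given/When/Then lines
  rabbit_holes: string[];
  no_gos: string[];
  decomposition: { walking_skeleton: string; slices: string[] };
  assumptions_log: string[];
  build_readiness_score: number;
}
```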
+
+ **Key insight for AI implementers:** LLMs need MORE explicit specs than humans on interfaces/invariants/file boundaries (no tacit knowledge, no scope-shame), but LESS explicit than junior humans on standard patterns. The dominant failure mode is confident architectural divergence -- working code that reinvents an existing utility. Context Pack (Step 7) directly prevents this.
+
+ **Next action:** author `wr.shaping` as a WorkRail workflow JSON using workflow-for-workflows, then update `coding-task-workflow-agentic` Phase 0 to detect and consume `shape.json` when present.
+
+ ---
+
+ ## Coordinator architecture: separation of concerns (Apr 19, 2026)
+
+ **Decision: defer knowledge graph implementation until the context assembly layer is designed.**
+
+ ### The god class problem
+
+ `src/coordinators/pr-review.ts` is already ~500 LOC doing: session dispatch, result aggregation, finding classification, merge routing, message queue drain, and outbox writes. Adding knowledge graph queries, context bundle assembly, upstream doc fetching, and prior session lookups would make it a god class.
+
+ "Coordinator" is not a class or a script -- it is a **layer** that orchestrates across multiple concerns. Those concerns need to be separated before we add more to them.
+
+ ### The right layering
 
  ```
-                 Shared core
-       ┌─────────────────────────────┐
-       │ WorkRail engine             │
-       │ src/v2/durable-core/        │
-       │ ~/.workrail/data/sessions/  │
-       │ workflow registry           │
-       └────────┬──────────┬─────────┘
-                │          │         │
- ┌──────────────▼─┐ ┌─────▼──────┐ ┌▼─────────────┐
- │ WorkRail MCP   │ │ WorkTrain  │ │ WorkRail     │
- │ Server         │ │ Daemon     │ │ Console      │
- │ workrail start │ │ worktrain  │ │ worktrain    │
- │ src/mcp/       │ │ daemon     │ │ console      │
- │                │ │ src/daemon/│ │ src/console/ │
- │ Claude Code    │ │ src/trigger│ │              │
- │ connects here  │ │            │ │ Shows BOTH   │
- │ via stdio      │ │ autonomous │ │ MCP + daemon │
- │                │ │ agent loop │ │ sessions     │
- └────────────────┘ └────────────┘ └──────────────┘
+ Trigger layer        src/trigger/          receives events, validates, enqueues
+ Dispatch layer       (TBD)                 decides which workflow + what goal
+ Context assembly     (TBD)                 gathers and packages context before spawning
+ Orchestration layer  src/coordinators/     spawns, awaits, routes, retries, escalates
+ Delivery layer       src/trigger/delivery  posts results back to origin systems
  ```
 
- **WorkRail MCP Server:** `workrail start` → stdio MCP server. Claude Code connects here. Provides `start_workflow`, `complete_step`, `list_workflows` etc. as MCP tools. Source: `src/mcp/`. Must be bulletproof.
+ **Context assembly** is the missing layer. Before dispatching a coding session, something needs to:
+ - Run `buildIndex()` and query "what imports the file being changed"
+ - Find the upstream pitch/PRD/BRD for the task
+ - Pull relevant prior session notes
+ - Package everything as a structured context bundle
+
+ This is NOT the orchestration script's job. The orchestration script should call `assembleContext(task, workspace)` and receive a bundle -- it should not know how that bundle was gathered.
+
+ ### Why the knowledge graph belongs in context assembly, not in the daemon
+
+ Two options were considered:
+ - **Daemon tool** (`makeQueryKnowledgeGraphTool` in `workflow-runner.ts`) -- agent queries mid-session on demand
+ - **Coordinator pre-fetch** -- coordinator runs queries before spawning, injects answers as context
+
+ The coordinator pre-fetch is better for known patterns (e.g. "what imports the file being changed" before a coding task). The agent doesn't need to know the graph exists -- it just gets the relevant facts as context. This also avoids adding `ts-morph` + DuckDB to the production build.
+
+ The daemon tool approach is only better for ad-hoc mid-session queries the agent discovers dynamically. That's a secondary use case for v1.
+
+ ### What to build before the knowledge graph
 
- **WorkTrain Daemon:** `worktrain daemon` → autonomous agent runner. Drives sessions without human involvement. Calls the WorkRail engine **directly in-process**. Does NOT go through the MCP server. Source: `src/daemon/`, `src/trigger/`.
+ 1. **Design the `ContextAssembler` abstraction** -- takes task description + workspace + trigger metadata, returns a structured context bundle. The knowledge graph is one of several sources (alongside upstream docs, prior session notes, repo state).
+ 2. **Refactor `pr-review.ts`** to use a `ContextAssembler` for the bits that fit there.
+ 3. **Then** implement knowledge graph as a `ContextAssembler` plugin -- not as a coordinator script addition and not as a daemon tool.
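
Step 1 could look something like this sketch. `ContextBundle`, `ContextSource`, and the merge behavior are proposals; only the `assembleContext(task, workspace)` call shape and the three example sources (upstream docs, prior session notes, knowledge graph facts) come from the notes above.

```typescript
// Proposed ContextAssembler sketch. The orchestration layer calls
// assembleContext() and never learns how the bundle was gathered; the
// knowledge graph is just one pluggable ContextSource.
interface ContextBundle {
  upstreamDocs: string[];       // pitch / PRD / BRD text, when found
  priorSessionNotes: string[];
  graphFacts: string[];         // e.g. "x.ts imports the file being changed"
}

interface ContextSource {
  name: string;
  gather(task: string, workspace: string): Promise<Partial<ContextBundle>>;
}

class ContextAssembler {
  constructor(private readonly sources: readonly ContextSource[]) {}

  async assembleContext(task: string, workspace: string): Promise<ContextBundle> {
    const bundle: ContextBundle = { upstreamDocs: [], priorSessionNotes: [], graphFacts: [] };
    // Each source contributes a partial bundle; the assembler merges them.
    for (const source of this.sources) {
      const part = await source.gather(task, workspace);
      bundle.upstreamDocs.push(...(part.upstreamDocs ?? []));
      bundle.priorSessionNotes.push(...(part.priorSessionNotes ?? []));
      bundle.graphFacts.push(...(part.graphFacts ?? []));
    }
    return bundle;
  }
}
```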
 
- **WorkRail Console:** `worktrain console` → unified read-only session viewer. Shows sessions from **both** the MCP server and the daemon (shared session store). Requires neither to be running.
+ ### Anti-pattern to avoid
 
- **Rules:**
- - MCP server code must never import from `src/daemon/` or `src/trigger/`
- - Daemon code must never depend on the MCP server being alive
- - Console reads the session store directly -- no IPC with either needed
- - These are separate processes. A crash in one does not affect the others.
+ Adding knowledge graph calls directly into `pr-review.ts` or any other coordinator script. That immediately creates the god class we're trying to avoid and couples the orchestration layer to a specific context source.