@exaudeus/workrail 3.27.0 → 3.29.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (160)
  1. package/dist/console/assets/{index-FtTaDku8.js → index-BZ6HkxGf.js} +1 -1
  2. package/dist/console/index.html +1 -1
  3. package/dist/manifest.json +3 -3
  4. package/docs/README.md +57 -0
  5. package/docs/adrs/001-hybrid-storage-backend.md +38 -0
  6. package/docs/adrs/002-four-layer-context-classification.md +38 -0
  7. package/docs/adrs/003-checkpoint-trigger-strategy.md +35 -0
  8. package/docs/adrs/004-opt-in-encryption-strategy.md +36 -0
  9. package/docs/adrs/005-agent-first-workflow-execution-tokens.md +105 -0
  10. package/docs/adrs/006-append-only-session-run-event-log.md +76 -0
  11. package/docs/adrs/007-resume-and-checkpoint-only-sessions.md +51 -0
  12. package/docs/adrs/008-blocked-nodes-architectural-upgrade.md +178 -0
  13. package/docs/adrs/009-bridge-mode-single-instance-mcp.md +195 -0
  14. package/docs/adrs/010-release-pipeline.md +89 -0
  15. package/docs/architecture/README.md +7 -0
  16. package/docs/architecture/refactor-audit.md +364 -0
  17. package/docs/authoring-v2.md +527 -0
  18. package/docs/authoring.md +873 -0
  19. package/docs/changelog-recent.md +201 -0
  20. package/docs/configuration.md +505 -0
  21. package/docs/ctc-mcp-proposal.md +518 -0
  22. package/docs/design/README.md +22 -0
  23. package/docs/design/agent-cascade-protocol.md +96 -0
  24. package/docs/design/autonomous-console-design-candidates.md +253 -0
  25. package/docs/design/autonomous-console-design-review.md +111 -0
  26. package/docs/design/autonomous-platform-mvp-discovery.md +525 -0
  27. package/docs/design/claude-code-source-deep-dive.md +713 -0
  28. package/docs/design/console-cyberpunk-ui-discovery.md +504 -0
  29. package/docs/design/console-execution-trace-candidates-final.md +160 -0
  30. package/docs/design/console-execution-trace-candidates.md +211 -0
  31. package/docs/design/console-execution-trace-design-candidates-v2.md +113 -0
  32. package/docs/design/console-execution-trace-design-review.md +74 -0
  33. package/docs/design/console-execution-trace-discovery.md +394 -0
  34. package/docs/design/console-execution-trace-final-review.md +77 -0
  35. package/docs/design/console-execution-trace-review.md +92 -0
  36. package/docs/design/console-performance-discovery.md +415 -0
  37. package/docs/design/console-ui-backlog.md +280 -0
  38. package/docs/design/daemon-architecture-discovery.md +853 -0
  39. package/docs/design/daemon-design-candidates.md +318 -0
  40. package/docs/design/daemon-design-review-findings.md +119 -0
  41. package/docs/design/daemon-engine-design-candidates.md +210 -0
  42. package/docs/design/daemon-engine-design-review.md +131 -0
  43. package/docs/design/daemon-execution-engine-discovery.md +280 -0
  44. package/docs/design/daemon-gap-analysis.md +554 -0
  45. package/docs/design/daemon-owns-console-plan.md +168 -0
  46. package/docs/design/daemon-owns-console-review.md +91 -0
  47. package/docs/design/daemon-owns-console.md +195 -0
  48. package/docs/design/data-model-erd.md +11 -0
  49. package/docs/design/design-candidates-consolidate-dev-staleness.md +98 -0
  50. package/docs/design/design-candidates-walk-cache-depth-limit.md +80 -0
  51. package/docs/design/design-review-consolidate-dev-staleness.md +54 -0
  52. package/docs/design/design-review-walk-cache-depth-limit.md +48 -0
  53. package/docs/design/implementation-plan-consolidate-dev-staleness.md +142 -0
  54. package/docs/design/implementation-plan-walk-cache-depth-limit.md +141 -0
  55. package/docs/design/layer3b-ghost-nodes-design-candidates.md +229 -0
  56. package/docs/design/layer3b-ghost-nodes-design-review.md +93 -0
  57. package/docs/design/layer3b-ghost-nodes-implementation-plan.md +219 -0
  58. package/docs/design/list-workflows-latency-fix-plan.md +128 -0
  59. package/docs/design/list-workflows-latency-fix-review.md +55 -0
  60. package/docs/design/list-workflows-latency-fix.md +109 -0
  61. package/docs/design/native-context-management-api.md +11 -0
  62. package/docs/design/performance-sweep-2026-04.md +96 -0
  63. package/docs/design/routines-guide.md +219 -0
  64. package/docs/design/sequence-diagrams.md +11 -0
  65. package/docs/design/subagent-design-principles.md +220 -0
  66. package/docs/design/temporal-patterns-design-candidates.md +312 -0
  67. package/docs/design/temporal-patterns-design-review-findings.md +163 -0
  68. package/docs/design/test-isolation-from-config-file.md +335 -0
  69. package/docs/design/v2-core-design-locks.md +2746 -0
  70. package/docs/design/v2-lock-registry.json +734 -0
  71. package/docs/design/workflow-authoring-v2.md +1044 -0
  72. package/docs/design/workflow-docs-spec.md +218 -0
  73. package/docs/design/workflow-extension-points.md +687 -0
  74. package/docs/design/workrail-auto-trigger-system.md +359 -0
  75. package/docs/design/workrail-config-file-discovery.md +513 -0
  76. package/docs/docker.md +110 -0
  77. package/docs/generated/v2-lock-closure-plan.md +26 -0
  78. package/docs/generated/v2-lock-coverage.json +797 -0
  79. package/docs/generated/v2-lock-coverage.md +177 -0
  80. package/docs/ideas/backlog.md +3927 -0
  81. package/docs/ideas/design-candidates-mcp-resilience.md +208 -0
  82. package/docs/ideas/design-review-findings-mcp-resilience.md +119 -0
  83. package/docs/ideas/implementation_plan.md +249 -0
  84. package/docs/ideas/third-party-workflow-setup-design-thinking.md +1948 -0
  85. package/docs/implementation/02-architecture.md +316 -0
  86. package/docs/implementation/04-testing-strategy.md +124 -0
  87. package/docs/implementation/09-simple-workflow-guide.md +835 -0
  88. package/docs/implementation/13-advanced-validation-guide.md +874 -0
  89. package/docs/implementation/README.md +21 -0
  90. package/docs/integrations/claude-code.md +300 -0
  91. package/docs/integrations/firebender.md +315 -0
  92. package/docs/migration/v0.1.0.md +147 -0
  93. package/docs/naming-conventions.md +45 -0
  94. package/docs/planning/README.md +104 -0
  95. package/docs/planning/github-ticketing-playbook.md +195 -0
  96. package/docs/plans/README.md +24 -0
  97. package/docs/plans/agent-managed-ticketing-design.md +605 -0
  98. package/docs/plans/agentic-orchestration-roadmap.md +112 -0
  99. package/docs/plans/assessment-gates-engine-handoff.md +536 -0
  100. package/docs/plans/content-coherence-and-references.md +151 -0
  101. package/docs/plans/library-extraction-plan.md +340 -0
  102. package/docs/plans/mr-review-workflow-redesign.md +1451 -0
  103. package/docs/plans/native-context-management-epic.md +11 -0
  104. package/docs/plans/perf-fixes-design-candidates.md +225 -0
  105. package/docs/plans/perf-fixes-design-review-findings.md +61 -0
  106. package/docs/plans/perf-fixes-new-issues-candidates.md +264 -0
  107. package/docs/plans/perf-fixes-new-issues-review.md +110 -0
  108. package/docs/plans/prompt-fragments.md +53 -0
  109. package/docs/plans/ui-ux-workflow-design-candidates.md +120 -0
  110. package/docs/plans/ui-ux-workflow-discovery.md +100 -0
  111. package/docs/plans/ui-ux-workflow-review.md +48 -0
  112. package/docs/plans/v2-followup-enhancements.md +587 -0
  113. package/docs/plans/workflow-categories-candidates.md +105 -0
  114. package/docs/plans/workflow-categories-discovery.md +110 -0
  115. package/docs/plans/workflow-categories-review.md +51 -0
  116. package/docs/plans/workflow-discovery-model-candidates.md +94 -0
  117. package/docs/plans/workflow-discovery-model-discovery.md +74 -0
  118. package/docs/plans/workflow-discovery-model-review.md +48 -0
  119. package/docs/plans/workflow-source-setup-phase-1.md +245 -0
  120. package/docs/plans/workflow-source-setup-phase-2.md +361 -0
  121. package/docs/plans/workflow-staleness-detection-candidates.md +104 -0
  122. package/docs/plans/workflow-staleness-detection-review.md +58 -0
  123. package/docs/plans/workflow-staleness-detection.md +80 -0
  124. package/docs/plans/workflow-v2-design.md +69 -0
  125. package/docs/plans/workflow-v2-roadmap.md +74 -0
  126. package/docs/plans/workflow-validation-design.md +98 -0
  127. package/docs/plans/workflow-validation-roadmap.md +108 -0
  128. package/docs/plans/workrail-platform-vision.md +420 -0
  129. package/docs/reference/agent-context-cleaner-snippet.md +94 -0
  130. package/docs/reference/agent-context-guidance.md +140 -0
  131. package/docs/reference/context-optimization.md +284 -0
  132. package/docs/reference/example-workflow-repository-template/.github/workflows/validate.yml +125 -0
  133. package/docs/reference/example-workflow-repository-template/README.md +268 -0
  134. package/docs/reference/example-workflow-repository-template/workflows/example-workflow.json +80 -0
  135. package/docs/reference/external-workflow-repositories.md +916 -0
  136. package/docs/reference/feature-flags-architecture.md +472 -0
  137. package/docs/reference/feature-flags.md +349 -0
  138. package/docs/reference/god-tier-workflow-validation.md +272 -0
  139. package/docs/reference/loop-optimization.md +209 -0
  140. package/docs/reference/loop-validation.md +176 -0
  141. package/docs/reference/loops.md +465 -0
  142. package/docs/reference/mcp-platform-constraints.md +59 -0
  143. package/docs/reference/recovery.md +88 -0
  144. package/docs/reference/releases.md +177 -0
  145. package/docs/reference/troubleshooting.md +105 -0
  146. package/docs/reference/workflow-execution-contract.md +998 -0
  147. package/docs/roadmap/README.md +22 -0
  148. package/docs/roadmap/legacy-planning-status.md +103 -0
  149. package/docs/roadmap/now-next-later.md +70 -0
  150. package/docs/roadmap/open-work-inventory.md +389 -0
  151. package/docs/tickets/README.md +39 -0
  152. package/docs/tickets/next-up.md +76 -0
  153. package/docs/workflow-management.md +317 -0
  154. package/docs/workflow-templates.md +423 -0
  155. package/docs/workflow-validation.md +184 -0
  156. package/docs/workflows.md +254 -0
  157. package/package.json +3 -1
  158. package/spec/authoring-spec.json +61 -16
  159. package/workflows/workflow-for-workflows.json +252 -93
  160. package/workflows/workflow-for-workflows.v2.json +188 -77
@@ -0,0 +1,853 @@
# WorkRail Daemon Architecture: First-Principles Discovery

> Design discovery for the WorkRail autonomous execution daemon architecture.
> Generated: 2026-04-14.
>
> **Artifact strategy:** This document is a human-readable reference. Execution truth
> is recorded in WorkRail session notes and context variables -- not in this file.
> Session notes are always authoritative.

---

## Landscape Packet

### Current-state summary

WorkRail today is exclusively reactive: it waits for an external agent (Claude Code, Cursor,
etc.) to call its MCP tools over a transport (stdio or HTTP). The process entry point is
`src/mcp-server.ts`, which starts either `startStdioServer` or `startHttpServer` based on
`WORKRAIL_TRANSPORT`.
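The dispatch described above presumably reduces to a small switch on the environment variable. A minimal sketch, assuming the function names cited in the text (`startStdioServer`, `startHttpServer`); the stub bodies and `resolveTransport` helper are illustrative, not WorkRail's actual code:

```typescript
// Sketch of the transport dispatch in `src/mcp-server.ts` as described above.
// The server-start functions are stubs; only the dispatch shape is the point.
type Transport = "stdio" | "http";

async function startStdioServer(): Promise<string> {
  return "stdio server started"; // placeholder for the real stdio transport
}

async function startHttpServer(): Promise<string> {
  return "http server started"; // placeholder for the real HTTP transport
}

// Resolve the transport from the environment, defaulting to stdio.
function resolveTransport(env: Record<string, string | undefined>): Transport {
  return env.WORKRAIL_TRANSPORT === "http" ? "http" : "stdio";
}

async function main(env: Record<string, string | undefined>): Promise<string> {
  const transport = resolveTransport(env);
  return transport === "http" ? startHttpServer() : startStdioServer();
}
```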
**What already exists for autonomous execution:**

| Component | File | Status |
|-----------|------|--------|
| In-process engine library | `src/engine/engine-factory.ts` | Built -- wraps handlers directly |
| Engine interface | `src/engine/types.ts` | Built -- `WorkRailEngine` with bot-as-orchestrator doc comment |
| Core handler: start | `src/mcp/handlers/v2-execution/start.ts` | Built -- `executeStartWorkflow` |
| Core handler: continue | `src/mcp/handlers/v2-execution/continue-advance.ts` | Built -- `handleAdvanceIntent` |
| Core handler: checkpoint | `src/mcp/handlers/v2-checkpoint.ts` | Built |
| Durable session store | `src/v2/infra/local/` | Built -- append-only event log |
| HMAC token protocol | `src/v2/durable-core/tokens/` | Built -- cryptographic enforcement |
| DAG visualization | `console/` | Built -- passive (read-only) |
| SSE infrastructure | existing HTTP server | Built -- `/api/v2/workspace/events` |
| Trigger system | -- | **Missing** |
| Agent loop (LLM caller) | -- | **Missing** |
| Tool executor (Bash/Read/Write) | -- | **Missing for daemon** |
| Cross-repo routing | -- | **Missing** |
| Evidence collection hooks | -- | **Missing** |
| REST control plane | -- | **Missing** |
| Console live view | -- | **Missing** |

### Existing approaches and precedents

**pi-mono (35k stars, MIT, @mariozechner/pi-ai):**
- `agentLoop(prompts, context, config, signal?)` -- clean loop over unified LLM API
- `BeforeToolCallResult` / `AfterToolCallResult` -- hooks for observation and gating
- `ToolExecutionMode` -- sequential vs parallel tool execution
- `mom` package -- Slack bot as simplest daemon reference: "message received -> run agent -> respond"
- `coding-agent` package -- `SessionManager`, `AgentSession`, skill loading from directory

**OpenClaw (357k stars, MIT, TypeScript):**
- `AcpSessionStore` -- in-memory session management (WorkRail's disk-persisted store is superior)
- `SpawnAcpParams` -- minimal interface for spawning autonomous task sessions
- Task flow chaining -- `createTaskFlowForTask` / `linkTaskToFlowById`
- Policy system -- `isAcpEnabledByPolicy(cfg)` for daemon feature flags

**Claude Code (leaked source):**
- `sessionRunner.ts` -- programmatic session initiation (analogous to what daemon needs)
- `PreToolUse` / `PostToolUse` hooks -- evidence collection integration points
- Compaction hooks -- `executePreCompactHooks` for injecting WorkRail notes into session memory

### Hard constraints from the world

1. **DI container singleton:** `engineActive` guard -- one engine per process. Must evolve for concurrency.
2. **Anthropic API key required:** Daemon makes direct LLM API calls, not routed through Claude Code.
3. **Filesystem layout:** `WorkRailEngine.startWorkflow()` uses `process.cwd()` as workspace path. Daemon must pass explicit paths per session.
4. **Token protocol immutability:** HMAC tokens are cryptographically bound -- they cannot be reconstructed or forged. The daemon must store and manage tokens durably between agent loop iterations.
5. **Workflow backward compatibility:** Every existing workflow must run in autonomous mode unchanged. The daemon cannot require new workflow fields.
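Constraint 4 implies the daemon needs a crash-safe home for the latest continue token of each session. A minimal file-backed sketch; `ContinueTokenStore` and its methods are hypothetical names invented here, not WorkRail APIs:

```typescript
import { existsSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical sketch: HMAC continue tokens cannot be re-derived, so the daemon
// persists the newest token per session and reloads it after a restart.
class ContinueTokenStore {
  constructor(private readonly filePath: string) {}

  private load(): Record<string, string> {
    if (!existsSync(this.filePath)) return {};
    return JSON.parse(readFileSync(this.filePath, "utf8"));
  }

  // Overwrite the stored token for a session after each successful step.
  put(sessionId: string, continueToken: string): void {
    const all = this.load();
    all[sessionId] = continueToken;
    writeFileSync(this.filePath, JSON.stringify(all));
  }

  // Returns undefined if the session has no stored token yet.
  get(sessionId: string): string | undefined {
    return this.load()[sessionId];
  }
}

// Usage: simulate a crash by constructing a fresh store over the same file.
const dir = mkdtempSync(join(tmpdir(), "workrail-tokens-"));
const file = join(dir, "tokens.json");
new ContinueTokenStore(file).put("sess-1", "tok.abc");
const recovered = new ContinueTokenStore(file).get("sess-1");
```

A real implementation would likely live next to the append-only session store rather than in a separate JSON file, but the recover-after-restart property is the same.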
### Notable contradictions

1. **"MCP server and daemon can run simultaneously in the same process" (backlog) vs. `engineActive` guard (code):** The current singleton guard explicitly prevents this. The backlog's claim is aspirational; the code enforces single-engine. Resolution needed before multi-mode operation.
2. **"WorkRail calls MCP tools internally" (backlog) vs. engine-factory.ts pattern (code):** The backlog says the daemon calls `start_workflow` and `continue_workflow` MCP tools internally. The engine-factory shows it calls the underlying handlers directly -- no MCP tool layer. These are two descriptions of the same intent but at different abstraction levels. The handlers-directly path is more efficient and already built.
3. **Cross-repo isolation vs. single DI container:** If multiple sessions run concurrently and each session's tool calls are routed to different repos, the DI container's session store must remain repo-agnostic. Currently it is (the store uses session IDs, not workspace paths). But if the daemon injects workspace-specific tool executors per session, those executors must not bleed between sessions sharing the same engine instance.

### Evidence gaps

1. **pi-mono is not yet integrated** -- the agent loop layer is the biggest missing piece. pi-mono's `agentLoop` is the cleanest reference but is an external dependency. Whether WorkRail should use pi-mono or build its own agent loop is an open question.
2. **`engineActive` guard resolution path not designed** -- the exact mechanism for sharing one engine between MCP server and daemon is not specified.
3. **Evidence collection hook architecture** -- how `BeforeToolCall` hooks wire into the continue-token gate is not designed. The backlog mentions it but there is no code.
4. **Cloud session store** -- `LocalDataDirV2` is the only `DataDir` implementation. A cloud-backed store (S3, Postgres) would be needed for true cloud deployment. This is a port swap (DI-injectable) but the port contract for remote stores has not been designed.

### Precedent count: 3 (pi-mono, OpenClaw, Claude Code)
### Contradiction count: 3
### Evidence gap count: 4

---

## Problem Frame Packet

### Primary users / stakeholders

| User | Job to be done | Pain today |
|------|---------------|------------|
| **Individual developer (e.g., Zillow Mercury Mobile)** | Run autonomous MR review overnight without sitting at keyboard | Has to keep Claude Code open and manually initiate each review |
| **Team lead** | Get consistent, enforced process on every MR without training reviewers | Reviews are ad-hoc; agents drift and skip steps |
| **Platform/infra engineer** | Deploy WorkRail as a service on cloud infrastructure | WorkRail is a local tool that exits when the terminal closes |
| **Workflow author** | Write a workflow once, have it run identically in both manual and autonomous mode | Today: manual mode only; would need to rewrite for autonomous mode if it existed separately |
| **WorkRail itself (self-improvement)** | Run `workflow-for-workflows` to author new workflows autonomously | Cannot initiate its own workflows; must be driven by a human |

### Core tension

**The daemon is not just a new entry point -- it is a different trust model.**

When a human drives Claude Code, the human is the ultimate arbiter of what the agent does. They can interrupt, redirect, or reject. When the daemon drives itself, the cryptographic enforcement of the token protocol and the immutable session log become the primary trust mechanism. The architecture must make the enforcement stronger, not weaker, when humans are not in the loop.

This creates a design tension:
- **Speed and simplicity** favor Option A (direct engine, tight coupling, fast)
- **Auditability and control** favor the REST control plane (humans can inspect, pause, override)
- **Portability and distribution** favor Option B (MCP client, process boundary, deployable separately)

The 12-month answer must satisfy all three. That is why Option D (composite) is necessary.

### Jobs and success criteria

**For the daemon to be considered successful at 12 months:**

1. `workrail daemon start` runs without Claude Code, without an IDE, without a human at the keyboard
2. A GitLab MR webhook triggers `mr-review-workflow`, which runs to completion and posts findings as a comment -- zero human interaction
3. Every step of the autonomous session is visible in the console live view (audit trail, not just completion status)
4. If the daemon crashes mid-session, `workrail daemon resume <sessionId>` continues from the last checkpoint
5. Cross-repo: a workflow that reads from `android` and `ios` repos runs correctly on a developer's machine with both repos cloned
6. A workflow author cannot tell whether their workflow ran in manual mode (Claude Code) or autonomous mode (daemon) from the workflow definition alone
### Assumptions being treated as facts (framing risks)

1. **"The daemon should share the same process as the MCP server"** -- This is convenient but not obviously correct. A separate daemon process avoids the `engineActive` singleton problem entirely. The backlog says same-process; this should be a deliberate decision, not an assumption.

2. **"pi-mono is the right agent loop library"** -- pi-mono has 35k stars and clean TypeScript abstractions. But WorkRail's licensing, bundle size, and maintenance burden preferences are not stated. The daemon might be better served by a minimal direct Anthropic SDK integration than by adopting a 35k-star monorepo as a dependency.

3. **"Cross-repo execution is a 12-month must-have"** -- The backlog says "post-MVP, must-have before WorkRail can be called a general-purpose platform." This is a judgment call about timeline, not a technical constraint. A daemon that only handles single-repo workflows is still enormously useful.

4. **"The REST control plane is the right interface for console live view"** -- The console already has SSE infrastructure. The question is whether live session events from the daemon flow through the same SSE pipe or through a separate polling endpoint. This is a UI/API design question, not an architecture question.

### Tensions and HMW questions

**Tension 1: Single-process convenience vs. multi-session concurrency**
The `engineActive` guard exists because the DI container is a global singleton. A single-process model is simpler to deploy but requires the guard to evolve. A multi-process model eliminates the guard problem but adds IPC complexity.

HMW: How might we allow the MCP server and daemon to share one engine instance while ensuring concurrent sessions do not interfere?

**Tension 2: Freestanding vs. best-in-class agent loop**
WorkRail's value proposition is enforcement + durability + observability. The agent loop (LLM calling + tool execution) is commodity infrastructure. Using pi-mono's `agentLoop` gets a battle-tested implementation immediately but adds a dependency. Building in-house takes time but stays lean.

HMW: How might we get the benefits of a clean agent loop abstraction without coupling WorkRail to a specific third-party library?

**Tension 3: Autonomous trust vs. human control**
The more autonomous the daemon, the more important the console control plane becomes. But building the console live view adds scope. Deferring the live view means operators are blind to autonomous sessions.

HMW: How might we deliver meaningful human oversight of autonomous sessions with the minimum new console scope in the MVP?

### Framing risk count: 4
### Tension count: 3 (mapped to 3 HMW questions)
### Success criteria count: 6

---

## Candidate Generation Expectations

This is a `landscape_first` path. Candidate directions must:

1. **Be grounded in the actual landscape** -- not free invention. Each candidate must map
   to a specific precedent, constraint, or code pattern identified in the landscape packet.
   Candidates invented from scratch without landscape grounding will be rejected in synthesis.
2. **Cover the process-boundary dimension explicitly** -- the synthesis step identified
   the single-process vs. separate-process choice as the real open question. Every candidate
   must take a stance on this dimension.

3. **Not cluster around only the recommended option** -- even though Option D (composite)
   is the early favorite, the candidate set must include at least one candidate that
   challenges the composite direction (pure Option A with no REST layer, or pure Option B
   with a clean process boundary). The challenge must be genuinely argued, not strawmanned.

4. **Address the `engineActive` constraint explicitly** -- any candidate that places the
   daemon in the same process as the MCP server must state how it resolves the singleton
   guard. Any candidate that uses separate processes must state how session state is shared.

5. **Four candidates target:** A (pure direct engine, same process), B (MCP client,
   separate process), D-same (composite, same process), D-separate (composite, separate
   process). These four are the natural spread and map directly to the landscape.

---

## Candidate Directions

### Candidate 1: Minimal Sequential Daemon (simplest possible)

**One-sentence summary:** A new `src/daemon/entry.ts` calls `createWorkRailEngine()`, runs one
session at a time via a trigger listener, drives the agent loop with a direct Anthropic SDK
call per step, and exits after each session completes.

**Concrete shape:**
- `src/daemon/entry.ts` -- `workrailDaemon(config: DaemonConfig): Promise<void>`
- `src/daemon/trigger/gitlab-webhook.ts` -- HTTP listener, parses MR events, returns
  `{ workflowId: string; goal: string; context: Record<string, string> }`
- `src/daemon/agent-loop/step-runner.ts` -- takes `PendingStep`, calls Anthropic SDK
  `messages.create()` with the step prompt, collects tool calls, executes them via a
  `ToolExecutor`, returns `{ notesMarkdown: string; context: Record<string, unknown> }`
- `src/daemon/tool-executor/local.ts` -- implements `Bash`, `Read`, `Write` as child
  process calls; returns `ToolCallResult[]`
- Session queue: `DaemonSessionQueue` -- a simple async FIFO queue; only one session runs
  at a time; `engineActive` guard is never violated because queue ensures mutual exclusion
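The mutual-exclusion property of the session queue can be sketched in a few lines. This is a hypothetical illustration, assuming the `DaemonSessionQueue` name and the trigger-event shape described above; none of it is existing WorkRail code:

```typescript
// Hypothetical sketch of `DaemonSessionQueue`: an async FIFO that runs at most
// one session at a time, so the single-engine guard can never be violated.
// Error handling is omitted for brevity.
interface TriggerEvent {
  workflowId: string;
  goal: string;
  context: Record<string, string>;
}

class DaemonSessionQueue {
  private chain: Promise<void> = Promise.resolve();

  constructor(private readonly runSession: (e: TriggerEvent) => Promise<void>) {}

  // Append a session to the FIFO; it starts only after all prior sessions finish.
  enqueue(event: TriggerEvent): Promise<void> {
    this.chain = this.chain.then(() => this.runSession(event));
    return this.chain;
  }
}

// Usage: two trigger events run strictly in order, never concurrently.
const order: string[] = [];
const queue = new DaemonSessionQueue(async (e) => {
  order.push(`start:${e.workflowId}`);
  await new Promise((r) => setTimeout(r, 10)); // stand-in for a real session
  order.push(`end:${e.workflowId}`);
});
queue.enqueue({ workflowId: "mr-review-1", goal: "review MR 1", context: {} });
const done = queue.enqueue({ workflowId: "mr-review-2", goal: "review MR 2", context: {} });
```

The unbounded-growth failure mode noted below falls directly out of this shape: `chain` only ever gets longer, so a production version would need backpressure or a queue-depth limit.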

**Process boundary:** Same process as MCP server. The `engineActive` guard is satisfied by
the queue (only one engine in use at a time). No relaxation of the guard needed.
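The `ToolExecutor` from the concrete shape above might look like the following. A sketch under stated assumptions: the `ToolCall` / `ToolCallResult` shapes and the `LocalToolExecutor` name are invented here for illustration, and paths are resolved against an explicit workspace directory per constraint 3 (never `process.cwd()`):

```typescript
import { execFileSync } from "node:child_process";
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical tool-call shapes; real WorkRail types may differ.
type ToolCall =
  | { tool: "Bash"; command: string }
  | { tool: "Read"; path: string }
  | { tool: "Write"; path: string; content: string };

interface ToolCallResult {
  tool: string;
  output: string;
}

class LocalToolExecutor {
  // Explicit workspace per session, matching the filesystem-layout constraint.
  constructor(private readonly workspaceDir: string) {}

  execute(call: ToolCall): ToolCallResult {
    if (call.tool === "Bash") {
      // Run the command in the session's workspace via a child process.
      const output = execFileSync("sh", ["-c", call.command], {
        cwd: this.workspaceDir,
        encoding: "utf8",
      });
      return { tool: "Bash", output };
    }
    if (call.tool === "Read") {
      return { tool: "Read", output: readFileSync(join(this.workspaceDir, call.path), "utf8") };
    }
    writeFileSync(join(this.workspaceDir, call.path), call.content);
    return { tool: "Write", output: "ok" };
  }
}

// Usage: write a file, then read it back through the executor.
const ws = mkdtempSync(join(tmpdir(), "workrail-ws-"));
const executor = new LocalToolExecutor(ws);
executor.execute({ tool: "Write", path: "note.txt", content: "hello" });
const read = executor.execute({ tool: "Read", path: "note.txt" });
```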

**Tensions resolved:** Simplicity; single-process deployment; zero workflow changes.
**Tensions accepted:** No concurrent sessions; no live view; no REST control plane.
**Failure mode:** Session throughput bottleneck -- if sessions are slow (30-60 min each),
the queue grows unbounded and new trigger events are delayed.
**Relation to existing patterns:** Directly adapts `engine-factory.ts`. The engine library
was built for this exact use case.
**Gain:** Ships fast, proves the agent loop concept, zero architectural risk.
**Give up:** No concurrency, no live view, no human override mid-session.
**Impact surface:** Only `src/daemon/` -- no changes to MCP server, engine, or console.
**Scope judgment:** Best-fit for MVP. Too narrow for 12-month platform vision.
**Philosophy:** Honors YAGNI, DI (engine injected), errors-as-data. Conflicts with nothing.
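The step-runner in this candidate can be sketched with the LLM call injected as a port, matching the DI philosophy noted above (a real daemon would wire an Anthropic SDK `messages.create()` call behind the port). All names and shapes here (`PendingStep`, `LlmPort`, `runStep`, etc.) are hypothetical illustrations, not WorkRail code:

```typescript
// Hypothetical step-runner sketch: one LLM turn per step, sequential tool
// execution, and a notes/context result shape as described in the bullets.
interface PendingStep {
  title: string;
  prompt: string;
}

interface LlmTurn {
  text: string;
  toolCalls: { tool: string; input: string }[];
}

interface StepOutput {
  notesMarkdown: string;
  context: Record<string, unknown>;
}

type LlmPort = (prompt: string) => Promise<LlmTurn>;
type ToolPort = (tool: string, input: string) => Promise<string>;

async function runStep(step: PendingStep, llm: LlmPort, tools: ToolPort): Promise<StepOutput> {
  const turn = await llm(step.prompt);
  const toolResults: string[] = [];
  for (const call of turn.toolCalls) {
    // Sequential tool execution (cf. pi-mono's ToolExecutionMode).
    toolResults.push(await tools(call.tool, call.input));
  }
  return {
    notesMarkdown: `## ${step.title}\n\n${turn.text}`,
    context: { toolResults },
  };
}

// Usage with stub ports; a production daemon would inject real LLM/tool calls.
const result = runStep(
  { title: "Inspect diff", prompt: "Summarize the MR diff" },
  async () => ({ text: "Diff looks small.", toolCalls: [{ tool: "Bash", input: "git diff --stat" }] }),
  async (tool, input) => `${tool} ran: ${input}`
);
```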

---

### Candidate 2: Pure MCP Client Daemon (clean process boundary)

**One-sentence summary:** A separate `workrail-daemon` process connects to the WorkRail
MCP server over HTTP and calls `start_workflow` / `continue_workflow` as a regular MCP
client, with no direct engine access.

**Concrete shape:**
- `packages/daemon/` -- separate package in the monorepo (or separate repo)
- `src/mcp/client.ts` -- a minimal MCP client over HTTP: `call(toolName, input)` returns
  the tool response. Wraps `fetch` with a JSON-RPC envelope.
- `packages/daemon/src/trigger/` -- same trigger listener as Candidate 1
- `packages/daemon/src/agent-loop/step-runner.ts` -- same structure as Candidate 1, but
  calls `mcpClient.call('continue_workflow', { continueToken, output })` instead of the
  engine directly
- `packages/daemon/src/tool-executor/local.ts` -- same as Candidate 1
- Deployment: `docker-compose.yml` with two services: `workrail-mcp` and `workrail-daemon`
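The minimal client above reduces to building a JSON-RPC 2.0 `tools/call` envelope and posting it. A sketch with the HTTP transport injected (in practice a thin wrapper around `fetch`); the class and helper names are illustrative assumptions, though the `tools/call` method with `{ name, arguments }` params follows the MCP convention:

```typescript
// Hypothetical minimal MCP-over-HTTP client sketch.
let nextId = 1;

function buildToolCallEnvelope(toolName: string, input: unknown) {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name: toolName, arguments: input },
  };
}

type HttpPost = (jsonBody: string) => Promise<string>;

class McpHttpClient {
  // The transport is injected so the client is testable without a server.
  constructor(private readonly post: HttpPost) {}

  async call(toolName: string, input: unknown): Promise<unknown> {
    const raw = await this.post(JSON.stringify(buildToolCallEnvelope(toolName, input)));
    const body = JSON.parse(raw) as { result?: unknown; error?: { message: string } };
    if (body.error) throw new Error(body.error.message); // JSON-RPC errors become exceptions
    return body.result;
  }
}

// Usage with a stub transport that echoes a canned result.
const client = new McpHttpClient(async () => JSON.stringify({ result: { status: "ok" } }));
const pending = client.call("continue_workflow", { continueToken: "tok", output: "done" });
const envelope = buildToolCallEnvelope("continue_workflow", { continueToken: "tok" });
```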

**Process boundary:** Separate process. The `engineActive` guard problem disappears --
the daemon never touches the DI container.

**Tensions resolved:** Clean process boundary; independent scaling; crash isolation;
no `engineActive` concern; deployable anywhere as separate container.
**Tensions accepted:** JSON-RPC round-trip on every `continue_workflow` call (adds ~5-10ms
per step, negligible for long-running steps); requires MCP server to be running first;
two-process deployment for local dev.
**Failure mode:** MCP server is a single point of failure for both human (Claude Code)
and autonomous (daemon) sessions. If the MCP server crashes, both stop.
**Relation to existing patterns:** Departs from `engine-factory.ts` -- does not use it.
Follows the MCP protocol contract instead.
**Gain:** Maximum decoupling; daemon code has no import from WorkRail's internal handlers;
daemon can be written in any language.
**Give up:** Two-process deployment friction for individuals; the HTTP transport adds latency
overhead on the hot path.
**Impact surface:** Requires MCP server to expose all necessary tools over HTTP (already
does for `http` transport mode).
**Scope judgment:** Best-fit for 18-24 month distributed cloud. Too broad for 12-month MVP.
**Philosophy:** Honors DI (full process boundary is the ultimate DI). Mild conflict with
YAGNI (the process boundary adds complexity not yet justified by scale requirements).

---

### Candidate 3: Composite Same-Process (recommended)

**One-sentence summary:** A `src/daemon/` module calls `createWorkRailEngine()` directly,
the `engineActive` guard is relaxed to allow the MCP server and daemon to share one engine
instance, concurrent sessions are managed by a `DaemonSessionManager` that runs each
session in its own async chain, and a thin REST/SSE control plane exposes session status
for the console.

**Concrete shape:**
- `src/engine/engine-factory.ts` -- relax the `engineActive` guard: instead of a boolean,
  use an `EngineRefCount: number`. When `> 0`, the container is active. `createWorkRailEngine`
  now requires an explicit `EngineHandle` release pattern. OR: expose a single shared
  engine instance for the process that both MCP server and daemon use.
- `src/daemon/session-manager.ts` -- `DaemonSessionManager`: tracks active sessions by
  `sessionId -> { continueToken, status: 'running' | 'paused' | 'complete' | 'failed' }`.
  Each session runs as an independent `Promise` chain. No concurrency between steps of
  the same session (HMAC token protocol enforces this); concurrency across sessions is
  safe because the session store is append-only and session-scoped.
- `src/daemon/agent-loop/step-runner.ts` -- same as Candidate 1 but concurrent-safe
  (no shared mutable state between sessions).
- `src/daemon/trigger/` -- webhook + cron + CLI trigger listeners.
- `src/daemon/tool-executor/local.ts` -- `Bash`, `Read`, `Write` plus `BashInRepo`,
  `ReadRepo` for cross-repo routing.
- REST control plane additions to existing HTTP server:
  - `GET /api/v2/sessions/:id/daemon-status` -- `{ status, currentStepTitle, startedAt }`
  - `POST /api/v2/sessions/:id/pause` -- sets `status: 'paused'`, daemon loop waits
  - `POST /api/v2/sessions/:id/resume` -- unpauses
  - `DELETE /api/v2/sessions/:id` -- cancels active session (aborts current LLM call)
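The ref-count relaxation proposed in the first bullet can be made concrete in a few lines. A hypothetical sketch (the registry and handle names are invented here, not existing WorkRail code): each acquirer holds a handle, and the container reports inactive only when every handle has been released:

```typescript
// Hypothetical sketch: the boolean `engineActive` guard becomes a ref count.
// Each acquirer (MCP server, daemon) gets an independent release handle.
class EngineRegistry {
  private refCount = 0;

  get active(): boolean {
    return this.refCount > 0;
  }

  acquire(): { release: () => void } {
    this.refCount++;
    let released = false;
    return {
      release: () => {
        // Idempotent release keeps a double-release from corrupting the count.
        if (released) return;
        released = true;
        this.refCount--;
      },
    };
  }
}

// Usage: MCP server and daemon hold independent handles to one registry.
const registry = new EngineRegistry();
const mcpHandle = registry.acquire();
const daemonHandle = registry.acquire();
mcpHandle.release();
const stillActive = registry.active; // daemon still holds a handle
daemonHandle.release();
const inactive = !registry.active;
```

Handing out a release closure rather than exposing a decrement method is what makes the illegal state (more releases than acquires) unrepresentable, per the Philosophy note below.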
291
+
292
**Process boundary:** Same process. MCP server and daemon share one DI container and one
engine instance.

**Tensions resolved:** Single deployment artifact; zero workflow changes; concurrent sessions
(multiple sessions run simultaneously); human oversight (live view via REST/SSE); correct
enforcement (HMAC tokens apply to daemon sessions identically to manual sessions).
**Tensions accepted:** The `engineActive` guard must be relaxed (requires a code change and
verification that concurrent handler calls are safe). The shared DI context means a
container bug affects both the MCP server and the daemon.
**Failure mode:** If concurrent `executeStartWorkflow` / `executeContinueWorkflow` calls
over the same DI context have a race condition (e.g., in the keyring load path or snapshot
store), concurrent sessions could corrupt each other. Must be verified.
**Relation to existing patterns:** Directly adapts `engine-factory.ts`. The `V2Dependencies`
struct is already designed to be built once and shared across calls -- no session-specific
state lives in `V2Dependencies`.
**Gain:** Single process, single deployment, concurrent sessions, live view, human control.
**Give up:** The `engineActive` guard relaxation needs careful design; a shared process means
a shared failure domain.
**Impact surface:** `engine-factory.ts` guard change; new `src/daemon/` module; minor
additions to existing HTTP server routes.
**Scope judgment:** Best fit for the 12-month vision. The composite is not broader than needed;
each component solves a concrete known requirement.
**Philosophy:** Honors DI (engine injected, agent loop port injected), errors-as-data, and
immutability (the session store is append-only, no mutation under concurrency). The guard
relaxation honors "make illegal states unrepresentable" -- a ref count is more precise
than a boolean. Conflicts with nothing.

---

321
### Candidate 4: Composite Separate-Process (cloud-native path)

**One-sentence summary:** The daemon runs as a separate process that uses `createWorkRailEngine()`
with a shared `dataDir` (pointing to the same `~/.workrail/v2` directory), enabling
an independent process lifecycle while sharing durable session state through the filesystem.

**Concrete shape:**
- `packages/daemon/` -- separate entry point, runs `workrail-daemon` as its own process
- Uses `createWorkRailEngine({ dataDir: sharedDataDir })` -- the daemon gets its own DI
  container instance (no `engineActive` guard conflict) but reads/writes the same session
  store on disk
- The `engineActive` guard is NOT relaxed -- each process has exactly one engine, so the
  guard works correctly
- Session store file locking: the append-only event log already uses file-level locking
  (`withHealthySessionLock`). Two processes writing to the same session store is safe only
  if they coordinate via the lock protocol.
- REST control plane: the daemon process exposes its own HTTP port (e.g., 3101) for
  status/pause/resume. The console proxies to this port.
- `src/daemon/agent-loop/`, `src/daemon/trigger/` -- same as Candidate 3

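The lock-coordination idea can be illustrated with a plain exclusive-create lockfile. This is a generic sketch of cross-process advisory locking, not WorkRail's actual `withHealthySessionLock` implementation, whose details are not shown in this document:

```typescript
// Hedged sketch: an exclusive lockfile serializes writers across processes.
// 'wx' fails with EEXIST if another process already holds the lock.
import { mkdtempSync, openSync, closeSync, unlinkSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

function withSessionLock<T>(lockPath: string, fn: () => T): T {
  let fd: number;
  try {
    fd = openSync(lockPath, 'wx'); // exclusive create = acquire
  } catch {
    throw new Error(`session locked: ${lockPath}`);
  }
  try {
    return fn(); // critical section: e.g., append to the session event log
  } finally {
    closeSync(fd);
    unlinkSync(lockPath); // release
  }
}
```

Because the lock lives on the shared filesystem, the coordination works between two processes on one machine -- and stops working without a shared volume, which is exactly the single-machine limitation noted below.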
341
**Process boundary:** Separate process. Shared durable state via filesystem. The MCP
server and daemon are independent processes that both use the same `~/.workrail/v2`
directory as their shared state store.

**Tensions resolved:** The `engineActive` guard is never an issue; independent crash recovery
(a daemon crash does not affect the MCP server); cloud-native (processes map to containers);
concurrent sessions (each daemon process handles multiple sessions via its own session
manager).
**Tensions accepted:** Two-process deployment for local dev; the shared filesystem is a
coordination mechanism that only works on a single machine (not distributed cloud without
a shared volume); the REST control plane for the daemon is a new HTTP server, not a reuse
of the existing one.
**Failure mode:** Two processes writing to the same session store via file locks could
produce lock contention under high load. The lock protocol (`withHealthySessionLock`) is
designed for this, but it has not been tested with two concurrent processes.
**Relation to existing patterns:** Adapts `engine-factory.ts` (uses the library correctly,
one engine per process). Departs from the single-process assumption in `mcp-server.ts`.
**Gain:** Clean process boundary, no guard relaxation, independent scaling, natural
path to cloud (replace the filesystem with a remote store, same code).
**Give up:** Two-process deployment; filesystem-based coordination limits deployment to a
single machine; more complex local dev setup.
**Impact surface:** New `packages/daemon/` package; new HTTP server in the daemon process;
console proxy to the daemon port.
**Scope judgment:** Best fit for the 18-month cloud target. Slightly too broad for the
12-month local-first focus.
**Philosophy:** Perfectly honors all principles -- one engine per process, the `engineActive`
guard is correct, no guard relaxation needed. The cleanest architectural expression of
"dependency injection for boundaries." Conflicts slightly with YAGNI (the process
separation adds complexity before it is strictly needed).

371
---

## Challenge Notes

### C3 safety question: resolved

The critical concern for Candidate 3 was whether concurrent calls to
`executeStartWorkflow` / `executeContinueWorkflow` over the same `V2Dependencies` struct
are safe. Analysis of `engine-factory.ts` and the handler code resolves this:

- `V2Dependencies` is a stateless struct: no fields are mutated per-call. All mutable
  state lives in the session store (an append-only event log with per-session file locks).
- `withHealthySessionLock(sessionId, ...)` serializes writes per session. Different
  sessions have different IDs -- their locks do not compete.
- The dedup key system (`advance_recorded:sessionId:nodeId:attemptId`) prevents
  double-advances even if two calls race on the same session.
- The keyring is loaded once during `createWorkRailEngine()` and its value is read-only
  after that. Concurrent token signing uses the same keyring but with per-call random
  entropy -- safe.

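The dedup-key mechanism can be sketched as follows. The key format comes from the bullet above; the in-memory store is a stand-in for the real append-only log:

```typescript
// Sketch: a write is recorded at most once per (session, node, attempt)
// triple, so two racing advances cannot both apply.
class DedupStore {
  private recorded = new Set<string>();

  // Returns true if this call won the race and the advance was applied.
  recordAdvance(sessionId: string, nodeId: string, attemptId: string): boolean {
    const key = `advance_recorded:${sessionId}:${nodeId}:${attemptId}`;
    if (this.recorded.has(key)) return false; // duplicate: no-op
    this.recorded.add(key);
    return true;
  }
}
```

The losing call degrades to a no-op rather than a corruption, which is why a race on the same session is harmless even before the per-session lock is considered.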
391
**Conclusion:** Concurrent sessions are safe today. The `engineActive` guard is not
protecting against a concurrent-call race condition -- it is protecting against two
separate `createWorkRailEngine()` calls creating two independent DI container instances
(which would have separate keystores, separate session stores, etc.). The solution for
Candidate 3 is to expose one shared engine instance for the process, not to relax
the guard to allow two instances.

### Strongest counter-argument against C3

The REST control plane is scope that is not required for correctness. A sequential daemon
(Candidate 1) with a FIFO session queue ships faster and proves the core unknowns:
(a) does the agent loop correctly drive a workflow step?
(b) does the trigger system work?
(c) does the daemon produce the right `notesMarkdown` output for `continueWorkflow`?

If the primary uncertainty is the agent loop (not the deployment architecture), C1 is
the better MVP choice. The REST control plane can follow in the next release.

### What would tip the decision to Candidate 1

If the 3-month target is "prove autonomous execution works" (not "ship the 12-month
platform"), Candidate 1 is correct. Sequential sessions with a queue are acceptable for
the MR review use case -- MRs are submitted hours apart, not milliseconds apart. A queue
delay of one session is unnoticeable.

### What would tip the decision to Candidate 4

If cloud deployment is committed (not tentative) within 12 months -- meaning there is
a production system that needs the daemon on a server, not just a local machine -- then
Candidate 4's separate-process model is justified. The two-process local dev friction
is a real cost but acceptable for a deployed service.

---

425
## Resolution Notes

### Recommendation: Candidate 3 (Composite Same-Process)

**Rationale:**
1. The safety concern is resolved -- `V2Dependencies` is already concurrency-safe.
2. A single deployment artifact is a hard requirement for the developer experience goal
   (`workrail start` brings up everything).
3. The `engineActive` guard change is simpler than it appeared: the solution is to expose
   a single shared engine instance, not to allow two separate instances.
4. The REST control plane reuses existing HTTP server infrastructure -- it is not new
   architectural surface.
5. The 12-month success criteria require concurrent sessions (multiple MR reviews
   simultaneously) and human oversight (live view). Only C3 satisfies both within a
   single process.

**Pivot conditions:**
- If cloud deployment is committed within 12 months: migrate from C3 to C4 by extracting
  the daemon into a separate process. The code in `src/daemon/` is identical -- only the
  process entry point changes. The upgrade is a one-time extraction, not a rewrite.
- If the 3-month goal is "prove the agent loop": start with C1, expand to C3 after
  proof. The FIFO queue in C1 is a subset of C3's session manager.

**Implementation order for C3:**
1. Agent loop (`src/daemon/agent-loop/`) with the direct Anthropic SDK -- the riskiest unknown
2. Single trigger (GitLab MR webhook) -- second most uncertain
3. Tool executor (Bash, Read, Write) -- mechanical, low risk
4. Shared engine instance (the relaxed `engineActive` guard design) -- well understood after
   the safety analysis
5. REST control plane additions -- incremental to the existing HTTP server
6. Console live view integration -- last, depends on the REST control plane

---
458

## Context / Ask

The specific question: what architectural form should the WorkRail autonomous execution
engine take for the 12-month vision?

**Three candidate architectures were proposed, plus an open option:**
- **A) Direct engine caller** -- the daemon imports and calls the `executeStartWorkflow` /
  `executeContinueWorkflow` handler functions directly (tight, fast, internal)
- **B) MCP client** -- the daemon connects to WorkRail's MCP server over the wire as a
  client (clean, decoupled, deployable anywhere)
- **C) Self-referential workflow** -- WorkRail becomes fully self-referential; workflows
  spawn other workflows autonomously using existing subagent delegation
- **D) Something else**

**The 12-month vision (from backlog.md):**
- WorkRail is a freestanding autonomous agent platform
- WorkRail drives itself -- the daemon calls WorkRail's own MCP tools internally
- Cross-repo execution: sessions can span multiple repos
- The autonomous engine must work without Claude Code, without any IDE
- Every workflow written for Claude Code works in autonomous mode with zero changes

---

482
## Path Recommendation: `landscape_first`

The code already contains a concrete answer. `engine-factory.ts` and `engine/types.ts`
reveal that an in-process library API already exists and was explicitly designed with
autonomous execution in mind ("Future direction: bot-as-orchestrator"). The dominant need
is reading that design and reasoning from it -- not open-ended reframing.

---

## Constraints / Anti-goals

**Hard constraints (from backlog.md and the codebase):**
- Single process: the DI container is a global singleton -- one `WorkRailEngine` instance
  per process at a time (enforced by the `engineActive` guard)
- No duplicate session logic -- the existing session engine is the canonical implementation
- Zero workflow changes -- existing workflows must run in autonomous mode unchanged
- Freestanding -- no Claude Code, no IDE dependency
- `npx -y @exaudeus/workrail` must still work for all non-daemon users

**Anti-goals:**
- Do not duplicate the session store, token protocol, or step sequencer
- Do not require a running MCP server as a prerequisite for autonomous execution
- Do not introduce MCP transport overhead on the hot path (token round-trips, JSON-RPC
  serialization) when the daemon and engine are co-located
- Do not build something that only works locally -- cloud deployment is a 12-month goal

---

## What the Code Actually Reveals

### engine-factory.ts is already the daemon API

`createWorkRailEngine()` returns a `WorkRailEngine` that wraps `executeStartWorkflow` and
`executeContinueWorkflow` directly -- the same handlers the MCP server calls. The factory:

- Initializes the DI container in `library` mode (no signal handlers, no HTTP server,
  no MCP transport)
- Builds the full `V2Dependencies` object (gate, sessionStore, snapshotStore, keyring,
  tokenCodecPorts, etc.)
- Exposes `startWorkflow`, `continueWorkflow`, `checkpointWorkflow`, `listWorkflows`

523
The doc comment says explicitly:
> "Future direction: bot-as-orchestrator -- the caller reads `agentRole` + `prompt` from
> each step, constructs its own system prompts enriched with domain context, manages the
> agent lifecycle independently, and feeds output back."

**This is Option A (Direct engine caller) -- already partially built.**

### The MCP server is a thin wrapper over the same handlers

`start.ts` and `continue-advance.ts` show that the MCP server handlers are also thin
wrappers over `executeStartWorkflow` / `executeContinueWorkflow`. The MCP server adds:
- JSON-RPC serialization/deserialization
- MCP transport (stdio or HTTP)
- `workspaceResolver` and `directoryListing` ports (MCP-specific workspace resolution)
- `sessionSummaryProvider` (for the `resume_session` tool)

None of these are needed by an autonomous daemon. The daemon knows exactly which workflow
to run and where the workspace is -- it does not need workspace discovery.

### What's missing from engine-factory.ts for the daemon

The existing `WorkRailEngine` was scoped as "transport replacement" -- it drives the step
loop the same way an MCP agent would. What's needed for true autonomous execution:

1. **Agent loop** -- something to read `pending.prompt`, send it to the LLM (direct
   Anthropic API call), get back a `continueWorkflow` call with notes + context
2. **Tool execution** -- `Bash`, `Read`, `Write`, and domain tools (`BashInRepo`, etc.)
3. **Trigger system** -- webhooks, cron, CLI, REST to initiate a workflow session
4. **Cross-repo routing** -- workspace manifest resolution, repo provisioning
5. **Evidence collection** -- `BeforeToolCall` / `AfterToolCall` hooks to observe agent
   tool use and gate continue tokens on required evidence

---

## Candidate Analysis

### Option A: Direct Engine Caller (the `engine-factory.ts` pattern)

**Architecture:**
```
src/daemon/
├── trigger/        -- GitLab webhook, cron, CLI, REST
├── agent-loop/     -- LLM call layer (pi-mono agentLoop or direct Anthropic SDK)
├── tool-executor/  -- Bash, Read, Write + scoped cross-repo tools
└── entry.ts        -- daemon process entry point
```

The daemon imports `createWorkRailEngine()`, calls `engine.startWorkflow()`, reads
`response.pending.prompt`, sends it to the LLM API, gets tool calls + notes back, calls
`engine.continueWorkflow()` with the notes. Repeat until `isComplete`.

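The loop described above can be sketched with both the engine and the LLM stubbed. The response field names (`pending.prompt`, `isComplete`, `continueToken`) are taken from this document, not verified against the real `WorkRailEngine` API:

```typescript
// Sketch of the Option A core loop: drive the engine until the workflow completes.
interface StepResponse {
  isComplete: boolean;
  pending?: { prompt: string };
  continueToken?: string;
}

interface EngineLike {
  startWorkflow(workflowId: string): Promise<StepResponse>;
  continueWorkflow(token: string, notesMarkdown: string): Promise<StepResponse>;
}

type LlmCall = (prompt: string) => Promise<string>; // returns notesMarkdown

async function runSession(engine: EngineLike, llm: LlmCall, workflowId: string): Promise<number> {
  let steps = 0;
  let response = await engine.startWorkflow(workflowId);
  while (!response.isComplete && response.pending && response.continueToken) {
    const notes = await llm(response.pending.prompt); // the agent does the step's work
    response = await engine.continueWorkflow(response.continueToken, notes);
    steps += 1;
  }
  return steps;
}
```

The engine remains the enforcer here: the daemon cannot skip a step, because each `continueWorkflow` call requires the token the engine issued for the previous step.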
574
**Cross-repo execution:** The daemon controls the tool executor -- `BashInRepo(repo,
command)` routes to a provisioned workspace. The engine sees opaque context variables;
the daemon translates them to routed tool calls.

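A minimal sketch of the routing idea, assuming a flat repo-name-to-path manifest (the manifest shape and the path rewriting are invented here for illustration; only the `BashInRepo` name comes from this document):

```typescript
// Hypothetical repo-scoped tool routing: the daemon rewrites the call into a
// cwd-scoped shell invocation; the engine never sees the filesystem layout.
type WorkspaceManifest = Record<string, string>; // repo name -> absolute path

function bashInRepo(manifest: WorkspaceManifest, repo: string, command: string): string {
  const root = manifest[repo];
  if (!root) throw new Error(`repo not provisioned: ${repo}`);
  return `cd ${root} && ${command}`;
}
```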
578
**Cloud vs local:** The engine uses `LocalDataDirV2` by default. For cloud, swap the
`dataDir` config or inject a remote-backed `SessionEventLogAppendStorePort`. The engine
is fully DI-injectable -- no filesystem assumptions in the session logic itself.

**WorkRail drives itself:** The daemon calls `engine.startWorkflow()` and
`engine.continueWorkflow()` -- the same tokens, the same session store, the same
enforcement. The daemon IS the agent; the engine enforces the workflow on the daemon.

**Developer experience (autonomous MR review):** One process, one command, zero
configuration beyond a Claude API key + a GitLab token. No MCP server running, no Claude
Code open. The daemon receives the GitLab webhook, starts the `mr-review-workflow`, runs
the agent loop, posts results.

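The trigger step of that flow can be sketched as a pure mapping from webhook payload to a workflow start request. The payload fields (`object_kind`, `object_attributes`) follow GitLab's merge request webhook format, and the workflow id is a placeholder taken from the paragraph above:

```typescript
// Hedged sketch of the trigger step only: decide whether a GitLab webhook
// event should start an MR review session.
interface StartRequest {
  workflowId: string;
  context: { projectId: number; mrIid: number };
}

function mapMrWebhook(payload: any): StartRequest | null {
  if (payload?.object_kind !== 'merge_request') return null; // ignore other events
  if (payload.object_attributes?.action !== 'open') return null; // only newly opened MRs
  return {
    workflowId: 'mr-review-workflow',
    context: {
      projectId: payload.project?.id,
      mrIid: payload.object_attributes?.iid,
    },
  };
}
```

Keeping the mapping pure (payload in, start request or `null` out) keeps the trigger listener trivially testable, separate from the session manager that actually starts the run.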
591
**Risks:**
- The `engineActive` singleton guard means one active engine per process. Multi-session
  concurrency requires the guard to evolve (or run sessions sequentially, which is fine
  for v1).
- The daemon and MCP server share the same DI container -- running both simultaneously
  in the same process requires care. The factory already notes "MCP server and daemon can
  run simultaneously in the same process" but the singleton guard currently blocks this.
  Solution: the guard needs relaxing for multi-session daemon mode (tracked separately).

---

### Option B: MCP Client (daemon connects to the MCP server over the wire)

**Architecture:**
```
Process 1: WorkRail MCP Server (existing)
Process 2: Daemon --> HTTP/stdio --> MCP Server
```

The daemon connects to the running MCP server and calls `start_workflow` and
`continue_workflow` as a client.

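What every step costs under this option can be made concrete by sketching the JSON-RPC envelope a single `continue_workflow` call must serialize, transmit, and parse. `tools/call` is MCP's standard tool-invocation method; the argument field names are assumptions from this document:

```typescript
// Hedged sketch of the per-step wire payload under Option B. Every workflow
// step pays this serialize/transmit/parse round trip.
const request = {
  jsonrpc: '2.0' as const,
  id: 42,
  method: 'tools/call',
  params: {
    name: 'continue_workflow',
    arguments: { continueToken: 'hmac-token', notesMarkdown: '## step notes' },
  },
};

// Serialization overhead paid on the hot path for every step:
const wireBytes = new TextEncoder().encode(JSON.stringify(request)).length;
```

Under Option A the same step is a direct in-process function call -- no envelope, no serialization, no transport latency.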
613
**Cross-repo execution:** Same as A -- the daemon controls the tool executor.

**Cloud vs local:** Clean process boundary -- deploy daemon and MCP server as separate
containers. The MCP server is the stateful component; the daemon is stateless between
sessions.

**WorkRail drives itself:** Yes -- the daemon calls the same MCP tools that Claude Code
calls. The token protocol and enforcement are identical.

**Developer experience:** Worse for MVP. Requires:
1. A running MCP server (separate process)
2. A stable HTTP address or process handle for the daemon to connect to
3. Authentication between daemon and MCP server
4. Dealing with JSON-RPC over HTTP round-trip latency on every `continue_workflow` call

For a developer who just wants autonomous MR review: "install WorkRail, start two
processes, configure them to talk to each other" -- this is friction that Option A
eliminates entirely.

**When Option B makes sense:** When the daemon is deployed on separate infrastructure
from the MCP server (e.g., a cloud worker that connects to a central WorkRail instance).
This is a valid 18-to-24-month architecture but is premature for the 12-month horizon,
where the primary deployment is local or simple cloud.

**Verdict: Option B is architecturally correct for distributed cloud but adds friction
that does not pay off in the 12-month horizon.**

---
641

### Option C: Self-Referential Workflow

**Architecture:** A "daemon workflow" running inside WorkRail that uses the existing
`mcp__nested-subagent__Task` delegation to spawn subagent sessions for each autonomous
task.

**Cross-repo execution:** The coordinator workflow passes workspace paths via context
variables. Subagent workflows receive them as input. This is already how the existing
subagent protocol works.

**Cloud vs local:** The coordinator session runs wherever WorkRail runs. Subagents run
in the same process.

**WorkRail drives itself:** Yes -- workflows spawn workflows. The `wr.discovery` workflow
(this session) is already an example of this pattern in manual mode.

**Developer experience:** Interesting but not right for this use case:
1. The coordinator workflow itself needs an agent driving it (a human or... another
   daemon). This is turtles all the way down.
2. Triggers (GitLab webhooks, cron) do not map cleanly to workflow steps. A workflow is
   a thing you start with a goal; a trigger system is a thing that decides when to start.
3. The `mcp__nested-subagent__Task` delegation is a subagent protocol, not a task
   dispatch queue. It does not handle webhook payloads, credential management, or
   concurrent session scheduling.

**What Option C is actually good for:** Autonomous orchestration within a single session
(a coordinator delegates subtasks to parallel subagent sessions). This is already working
and is the right pattern for within-session parallelism. It is NOT the right pattern for
the entry-point / trigger layer.

**Verdict: Option C is the right answer for intra-session parallelism but is the wrong
answer for the daemon entry point.**

---
676

### Option D: Composite -- Direct Engine + Thin HTTP API

The 12-month architecture that actually satisfies all constraints is a composite:

```
WorkRail Process
├── Core Engine (shared DI singletons)
│   ├── Session store, snapshot store, keyring
│   ├── Token protocol
│   └── Workflow registry
│
├── MCP Server (existing -- Claude Code integration, no changes)
│   └── stdio or HTTP transport
│
├── Daemon Entry (new -- src/daemon/)
│   ├── Trigger listener (webhooks, cron, CLI)
│   ├── Agent loop (pi-mono agentLoop / direct Anthropic SDK)
│   ├── Tool executor (Bash, Read, Write, BashInRepo, ReadRepo)
│   └── Calls createWorkRailEngine() -- same handlers as MCP server
│
└── REST Control Plane (new -- for console live view + external control)
    ├── GET /api/v2/sessions/:id/status (live polling by console)
    ├── POST /api/v2/sessions/:id/pause (human pause mid-session)
    ├── POST /api/v2/sessions/:id/resume (human resume)
    └── SSE /api/v2/workspace/events (already exists -- extend for daemon events)
```

The daemon calls the engine directly (the Option A pattern) for the core loop. The REST
control plane exposes session state for the console and for external control (Option B's
"decoupled" benefit without the process boundary overhead on the hot path).

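The control-plane additions can be sketched against a minimal structural type rather than Express itself, so the sketch stays self-contained. The route paths come from this document; the handler bodies and the `SessionControl` interface are illustrative stubs:

```typescript
// Sketch of the REST control-plane route registration. In the real server
// these handlers would be mounted on the existing Express app.
type Handler = (
  req: { params: { id: string } },
  res: { json: (body: unknown) => void },
) => void;

interface AppLike {
  get(path: string, h: Handler): void;
  post(path: string, h: Handler): void;
  delete(path: string, h: Handler): void;
}

interface SessionControl {
  statusOf(id: string): { status: string } | undefined;
  pause(id: string): void;
  resume(id: string): void;
  cancel(id: string): void;
}

function registerDaemonRoutes(app: AppLike, sessions: SessionControl): void {
  app.get('/api/v2/sessions/:id/status', (req, res) => {
    res.json(sessions.statusOf(req.params.id) ?? { status: 'unknown' });
  });
  app.post('/api/v2/sessions/:id/pause', (req, res) => {
    sessions.pause(req.params.id);
    res.json({ ok: true });
  });
  app.post('/api/v2/sessions/:id/resume', (req, res) => {
    sessions.resume(req.params.id);
    res.json({ ok: true });
  });
  app.delete('/api/v2/sessions/:id', (req, res) => {
    sessions.cancel(req.params.id);
    res.json({ ok: true });
  });
}
```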
708
This is already the direction the backlog describes:
> "The single-process model: The daemon entry point is a new src/daemon/ module that
> imports and calls the same handlers as the MCP server -- executeStartWorkflow,
> executeContinueWorkflow -- directly, without HTTP overhead."

---

## Resolution: Option D (Composite) is the Answer

**The 12-month architecture is not A, B, or C in isolation. It is:**
- **Option A** for the core execution loop (direct engine calls, no transport overhead)
- **Option C** for intra-session orchestration (workflows spawn subagent workflows)
- **A thin REST/SSE control plane** for human visibility and external control
  (the "Option B benefit" without requiring a separate MCP server process)

**Why this beats pure Option B for 12 months:**
- No inter-process transport on the hot path (the majority of daemon interactions)
- Single deployment artifact -- `workrail daemon` starts everything
- The REST control plane satisfies "deployable anywhere" without requiring a second MCP
  server process

**Why this beats pure Option A:**
- The REST control plane enables the console live view (currently missing)
- External systems (CI pipelines, Slack bots, other services) can interact with running
  sessions without embedding the WorkRail engine
- Future option: split daemon and MCP server into separate processes when scale demands
  it, by pointing the daemon at the REST API instead of the direct engine -- zero
  workflow changes required

**The key architectural invariant:** The daemon never bypasses WorkRail's session engine.
It calls `engine.startWorkflow()` and `engine.continueWorkflow()` -- the exact same
code path as Claude Code's MCP calls. The enforcement guarantee is cryptographically
identical.

---
743

## Decision Log (Updated After Challenge)

| Decision | Rationale |
|----------|-----------|
| Direct engine (Option A) for core loop | Engine factory already exists, no transport overhead, same handlers as the MCP server. `V2Dependencies` is stateless -- concurrent calls verified safe. |
| Not pure Option B (MCP client) | Process boundary adds friction for the MVP (two processes, HTTP overhead per step). No payoff until distributed cloud deployment. Departs from the `engine-factory.ts` pattern. |
| Not pure Option C (self-referential) | Trigger/dispatch layer does not map to workflow steps. Something still has to drive the coordinator. Right pattern for intra-session parallelism, wrong for the entry point. |
| Not pure Option A (sequential, no live view) | Satisfies the 3-month proof of concept but not the 12-month platform. No concurrent sessions, no human override. |
| Composite same-process (C3) selected | Only candidate that satisfies all 12-month criteria: single deployment + concurrent sessions + live view + human override + correct enforcement. |
| REST control plane via existing Express server | `http-entry.ts` uses Express -- adding REST routes is `listener.app.get(...)`. Not a new server, not new architectural surface. |
| `engineActive` guard: shared instance, not relaxed | The guard prevents two `initializeContainer()` calls (which would reset DI registrations). Fix: a process-level `initializeWorkRailProcess()` called once; both MCP server and daemon use the resulting engine instance. The guard's purpose (prevent two containers) is preserved; its implementation is changed. |
| Challenge found 3 real risks, none blocking | (1) Hanging agent loops need `AbortController` timeouts -- design requirement, not blocker. (2) The process-level init function is a real, non-trivial change -- design constraint, recorded. (3) Cross-repo is not needed for the MR review MVP -- scope decision, not blocker. |

757
---

## Open Questions

1. **Session concurrency:** The `engineActive` guard allows one engine per process. The
   daemon needs to handle multiple concurrent sessions. Options: (a) a session queue
   (process one at a time), (b) relax the guard to allow multiple concurrent
   `WorkRailEngine` instances with isolated DI containers, (c) per-session process
   workers. For v1, a simple session queue is sufficient.

2. **Cross-repo tool routing:** The `BashInRepo(repo, command)` pattern requires a
   workspace manifest resolver. Where does this live -- in the daemon's tool executor or
   in the engine itself? Recommendation: the tool executor (the engine stays agnostic
   about filesystem layout; the daemon injects scoped tool implementations).

3. **Credential management:** The daemon needs a Claude API key plus GitLab/GitHub and
   Jira tokens. Where are these stored? Options: environment variables (simplest), a
   WorkRail keyring extension (same HMAC infrastructure), or an external secret manager.
   For v1, environment variables with a typed `DaemonConfig` struct.

4. **Evidence collection hook:** The `BeforeToolCall` / `AfterToolCall` hooks that gate
   continue tokens on observed evidence need integration with the agent loop.
   The pi-mono `agentLoop` has `BeforeToolCallResult` and `AfterToolCallResult` -- these
   are the right integration points.

5. **Single-process constraint:** The backlog says "MCP server and daemon can run
   simultaneously in the same process" -- but the current `engineActive` guard prevents
   this (two engines cannot coexist). The guard exists because the DI container is a
   global singleton. Either: (a) the daemon shares the single engine instance with the
   MCP server, or (b) they use separate data directories (separate DI contexts). Option
   (a) is cleaner but requires the engine's `startWorkflow` / `continueWorkflow` to be
   concurrency-safe (they already are, since each call is a separate async chain over
   immutable session events). The guard should be relaxed to allow the MCP server and
   daemon to share one engine instance.

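Option (a) from question 5 can be sketched as a process-level initializer. `initializeWorkRailProcess` is the name proposed elsewhere in this document; the engine type and factory are stubbed here:

```typescript
// Sketch: one process-level init, one engine, shared by both entry points.
interface WorkRailEngineLike {
  startWorkflow(workflowId: string): Promise<unknown>;
}

let containerCount = 0; // stands in for the engineActive guard's concern
let shared: WorkRailEngineLike | null = null;

// Stand-in for createWorkRailEngine(); building a second DI container is
// the illegal state the guard exists to prevent.
function createEngineOnce(): WorkRailEngineLike {
  containerCount += 1;
  return { async startWorkflow() { return {}; } };
}

// Both the MCP server entry and the daemon entry call this; only the first
// call builds the container, so the guard's invariant (one container per
// process) is preserved without a boolean flag.
function initializeWorkRailProcess(): WorkRailEngineLike {
  if (!shared) shared = createEngineOnce();
  return shared;
}
```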
792
---

## Final Summary (Updated After Full Review Cycle)

The daemon should be **Candidate 3: Composite Same-Process with C1 safety defaults**.

**Architecture:**
```
WorkRail Process (single)
├── Core (shared)
│   ├── Session store, snapshot store, keyring (all DI singletons)
│   ├── HMAC token protocol
│   └── Workflow registry
│
├── MCP Server entry (existing, unchanged)
│   └── Claude Code / Cursor call start_workflow, continue_workflow externally
│
└── Daemon entry (new -- src/daemon/)
    ├── DaemonSessionManager(config: { maxConcurrentSessions: 1 })
    │   └── FIFO session queue for v1 (C1 safety, no engineActive guard change needed)
    ├── AgentLoopPort (injected -- Anthropic SDK or pi-mono behind a port)
    ├── ToolExecutorPort (injected -- Bash, Read, Write + optional repo routing)
    ├── TriggerListener (GitLab MR webhook for v1)
    └── REST additions to existing Express server
        ├── GET /api/v2/sessions/:id/daemon-status
        ├── POST /api/v2/sessions/:id/pause
        ├── POST /api/v2/sessions/:id/resume
        └── DELETE /api/v2/sessions/:id (cancel)
```

**V1 implementation order:**
1. Agent loop (`src/daemon/agent-loop/`) -- riskiest unknown; direct Anthropic SDK + AbortSignal
2. Single trigger (GitLab MR webhook) -- second most uncertain
3. Tool executor (Bash, Read, Write) -- mechanical
4. Design `SharedEngineContext` / `initializeWorkRailProcess()` -- design only in v1, enable in v1.5
5. REST control plane additions (4 routes on the existing Express server)
6. Console live view integration

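The AbortSignal requirement from step 1 can be sketched as a generic timeout wrapper around any LLM call; the assumption (not shown in this document) is that the real Anthropic SDK call would receive the same signal through its request options:

```typescript
// Sketch: bound every agent-loop LLM call so a hanging request cannot
// stall the daemon's session queue forever.
function callWithTimeout<T>(
  fn: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  return fn(controller.signal).finally(() => clearTimeout(timer));
}
```

The same controller also gives `DELETE /api/v2/sessions/:id` its "aborts the current LLM call" semantics: cancellation just calls `controller.abort()` early.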
830
**v1.5 additions (after v1 is proven):**
- Enable `maxConcurrentSessions: N` (requires the SharedEngineContext pattern)
- Cross-repo tool routing (BashInRepo, ReadRepo)
- Evidence collection hooks (BeforeToolCall / AfterToolCall)

**Confidence: high.** Grounded in code analysis (`V2Dependencies` is concurrency-safe, the
Express server is extensible, `engine-factory.ts` already partially implements this
pattern). No RED findings in design review. Two ORANGE constraints (AbortSignal in the
step-runner, the SharedEngineContext interface) are implementation requirements, not
blockers.

**Strongest alternative: Candidate 1 (Sequential Daemon)**
Valid if the 3-month goal is "prove autonomous execution works" rather than "ship the
12-month platform." C1 is a strict subset of C3 -- expand to C3 after proof.

**Pivot condition:** If cloud deployment is committed within 12 months, extract the daemon
into a separate process (Candidate 4). The `src/daemon/` code is identical; only the
process entry point changes.

**Residual risks:**
1. Agent loop multi-turn tool-call coordination is the riskiest unknown -- it has never
   been built in WorkRail. Study pi-mono's `agentLoop` before designing.
2. The `mcp-server.ts` refactor scope for `initializeWorkRailProcess()` is uncertain --
   it needs a code spike first.
3. The first real use case may be cross-repo (a full-stack MR) -- if so, the
   ToolExecutorPort repo parameter becomes a v1 requirement, not v2. Confirm the MVP
   workflow target before finalizing the tool executor design.