@exaudeus/workrail 3.27.0 → 3.29.0

Files changed (160)
  1. package/dist/console/assets/{index-FtTaDku8.js → index-BZ6HkxGf.js} +1 -1
  2. package/dist/console/index.html +1 -1
  3. package/dist/manifest.json +3 -3
  4. package/docs/README.md +57 -0
  5. package/docs/adrs/001-hybrid-storage-backend.md +38 -0
  6. package/docs/adrs/002-four-layer-context-classification.md +38 -0
  7. package/docs/adrs/003-checkpoint-trigger-strategy.md +35 -0
  8. package/docs/adrs/004-opt-in-encryption-strategy.md +36 -0
  9. package/docs/adrs/005-agent-first-workflow-execution-tokens.md +105 -0
  10. package/docs/adrs/006-append-only-session-run-event-log.md +76 -0
  11. package/docs/adrs/007-resume-and-checkpoint-only-sessions.md +51 -0
  12. package/docs/adrs/008-blocked-nodes-architectural-upgrade.md +178 -0
  13. package/docs/adrs/009-bridge-mode-single-instance-mcp.md +195 -0
  14. package/docs/adrs/010-release-pipeline.md +89 -0
  15. package/docs/architecture/README.md +7 -0
  16. package/docs/architecture/refactor-audit.md +364 -0
  17. package/docs/authoring-v2.md +527 -0
  18. package/docs/authoring.md +873 -0
  19. package/docs/changelog-recent.md +201 -0
  20. package/docs/configuration.md +505 -0
  21. package/docs/ctc-mcp-proposal.md +518 -0
  22. package/docs/design/README.md +22 -0
  23. package/docs/design/agent-cascade-protocol.md +96 -0
  24. package/docs/design/autonomous-console-design-candidates.md +253 -0
  25. package/docs/design/autonomous-console-design-review.md +111 -0
  26. package/docs/design/autonomous-platform-mvp-discovery.md +525 -0
  27. package/docs/design/claude-code-source-deep-dive.md +713 -0
  28. package/docs/design/console-cyberpunk-ui-discovery.md +504 -0
  29. package/docs/design/console-execution-trace-candidates-final.md +160 -0
  30. package/docs/design/console-execution-trace-candidates.md +211 -0
  31. package/docs/design/console-execution-trace-design-candidates-v2.md +113 -0
  32. package/docs/design/console-execution-trace-design-review.md +74 -0
  33. package/docs/design/console-execution-trace-discovery.md +394 -0
  34. package/docs/design/console-execution-trace-final-review.md +77 -0
  35. package/docs/design/console-execution-trace-review.md +92 -0
  36. package/docs/design/console-performance-discovery.md +415 -0
  37. package/docs/design/console-ui-backlog.md +280 -0
  38. package/docs/design/daemon-architecture-discovery.md +853 -0
  39. package/docs/design/daemon-design-candidates.md +318 -0
  40. package/docs/design/daemon-design-review-findings.md +119 -0
  41. package/docs/design/daemon-engine-design-candidates.md +210 -0
  42. package/docs/design/daemon-engine-design-review.md +131 -0
  43. package/docs/design/daemon-execution-engine-discovery.md +280 -0
  44. package/docs/design/daemon-gap-analysis.md +554 -0
  45. package/docs/design/daemon-owns-console-plan.md +168 -0
  46. package/docs/design/daemon-owns-console-review.md +91 -0
  47. package/docs/design/daemon-owns-console.md +195 -0
  48. package/docs/design/data-model-erd.md +11 -0
  49. package/docs/design/design-candidates-consolidate-dev-staleness.md +98 -0
  50. package/docs/design/design-candidates-walk-cache-depth-limit.md +80 -0
  51. package/docs/design/design-review-consolidate-dev-staleness.md +54 -0
  52. package/docs/design/design-review-walk-cache-depth-limit.md +48 -0
  53. package/docs/design/implementation-plan-consolidate-dev-staleness.md +142 -0
  54. package/docs/design/implementation-plan-walk-cache-depth-limit.md +141 -0
  55. package/docs/design/layer3b-ghost-nodes-design-candidates.md +229 -0
  56. package/docs/design/layer3b-ghost-nodes-design-review.md +93 -0
  57. package/docs/design/layer3b-ghost-nodes-implementation-plan.md +219 -0
  58. package/docs/design/list-workflows-latency-fix-plan.md +128 -0
  59. package/docs/design/list-workflows-latency-fix-review.md +55 -0
  60. package/docs/design/list-workflows-latency-fix.md +109 -0
  61. package/docs/design/native-context-management-api.md +11 -0
  62. package/docs/design/performance-sweep-2026-04.md +96 -0
  63. package/docs/design/routines-guide.md +219 -0
  64. package/docs/design/sequence-diagrams.md +11 -0
  65. package/docs/design/subagent-design-principles.md +220 -0
  66. package/docs/design/temporal-patterns-design-candidates.md +312 -0
  67. package/docs/design/temporal-patterns-design-review-findings.md +163 -0
  68. package/docs/design/test-isolation-from-config-file.md +335 -0
  69. package/docs/design/v2-core-design-locks.md +2746 -0
  70. package/docs/design/v2-lock-registry.json +734 -0
  71. package/docs/design/workflow-authoring-v2.md +1044 -0
  72. package/docs/design/workflow-docs-spec.md +218 -0
  73. package/docs/design/workflow-extension-points.md +687 -0
  74. package/docs/design/workrail-auto-trigger-system.md +359 -0
  75. package/docs/design/workrail-config-file-discovery.md +513 -0
  76. package/docs/docker.md +110 -0
  77. package/docs/generated/v2-lock-closure-plan.md +26 -0
  78. package/docs/generated/v2-lock-coverage.json +797 -0
  79. package/docs/generated/v2-lock-coverage.md +177 -0
  80. package/docs/ideas/backlog.md +3927 -0
  81. package/docs/ideas/design-candidates-mcp-resilience.md +208 -0
  82. package/docs/ideas/design-review-findings-mcp-resilience.md +119 -0
  83. package/docs/ideas/implementation_plan.md +249 -0
  84. package/docs/ideas/third-party-workflow-setup-design-thinking.md +1948 -0
  85. package/docs/implementation/02-architecture.md +316 -0
  86. package/docs/implementation/04-testing-strategy.md +124 -0
  87. package/docs/implementation/09-simple-workflow-guide.md +835 -0
  88. package/docs/implementation/13-advanced-validation-guide.md +874 -0
  89. package/docs/implementation/README.md +21 -0
  90. package/docs/integrations/claude-code.md +300 -0
  91. package/docs/integrations/firebender.md +315 -0
  92. package/docs/migration/v0.1.0.md +147 -0
  93. package/docs/naming-conventions.md +45 -0
  94. package/docs/planning/README.md +104 -0
  95. package/docs/planning/github-ticketing-playbook.md +195 -0
  96. package/docs/plans/README.md +24 -0
  97. package/docs/plans/agent-managed-ticketing-design.md +605 -0
  98. package/docs/plans/agentic-orchestration-roadmap.md +112 -0
  99. package/docs/plans/assessment-gates-engine-handoff.md +536 -0
  100. package/docs/plans/content-coherence-and-references.md +151 -0
  101. package/docs/plans/library-extraction-plan.md +340 -0
  102. package/docs/plans/mr-review-workflow-redesign.md +1451 -0
  103. package/docs/plans/native-context-management-epic.md +11 -0
  104. package/docs/plans/perf-fixes-design-candidates.md +225 -0
  105. package/docs/plans/perf-fixes-design-review-findings.md +61 -0
  106. package/docs/plans/perf-fixes-new-issues-candidates.md +264 -0
  107. package/docs/plans/perf-fixes-new-issues-review.md +110 -0
  108. package/docs/plans/prompt-fragments.md +53 -0
  109. package/docs/plans/ui-ux-workflow-design-candidates.md +120 -0
  110. package/docs/plans/ui-ux-workflow-discovery.md +100 -0
  111. package/docs/plans/ui-ux-workflow-review.md +48 -0
  112. package/docs/plans/v2-followup-enhancements.md +587 -0
  113. package/docs/plans/workflow-categories-candidates.md +105 -0
  114. package/docs/plans/workflow-categories-discovery.md +110 -0
  115. package/docs/plans/workflow-categories-review.md +51 -0
  116. package/docs/plans/workflow-discovery-model-candidates.md +94 -0
  117. package/docs/plans/workflow-discovery-model-discovery.md +74 -0
  118. package/docs/plans/workflow-discovery-model-review.md +48 -0
  119. package/docs/plans/workflow-source-setup-phase-1.md +245 -0
  120. package/docs/plans/workflow-source-setup-phase-2.md +361 -0
  121. package/docs/plans/workflow-staleness-detection-candidates.md +104 -0
  122. package/docs/plans/workflow-staleness-detection-review.md +58 -0
  123. package/docs/plans/workflow-staleness-detection.md +80 -0
  124. package/docs/plans/workflow-v2-design.md +69 -0
  125. package/docs/plans/workflow-v2-roadmap.md +74 -0
  126. package/docs/plans/workflow-validation-design.md +98 -0
  127. package/docs/plans/workflow-validation-roadmap.md +108 -0
  128. package/docs/plans/workrail-platform-vision.md +420 -0
  129. package/docs/reference/agent-context-cleaner-snippet.md +94 -0
  130. package/docs/reference/agent-context-guidance.md +140 -0
  131. package/docs/reference/context-optimization.md +284 -0
  132. package/docs/reference/example-workflow-repository-template/.github/workflows/validate.yml +125 -0
  133. package/docs/reference/example-workflow-repository-template/README.md +268 -0
  134. package/docs/reference/example-workflow-repository-template/workflows/example-workflow.json +80 -0
  135. package/docs/reference/external-workflow-repositories.md +916 -0
  136. package/docs/reference/feature-flags-architecture.md +472 -0
  137. package/docs/reference/feature-flags.md +349 -0
  138. package/docs/reference/god-tier-workflow-validation.md +272 -0
  139. package/docs/reference/loop-optimization.md +209 -0
  140. package/docs/reference/loop-validation.md +176 -0
  141. package/docs/reference/loops.md +465 -0
  142. package/docs/reference/mcp-platform-constraints.md +59 -0
  143. package/docs/reference/recovery.md +88 -0
  144. package/docs/reference/releases.md +177 -0
  145. package/docs/reference/troubleshooting.md +105 -0
  146. package/docs/reference/workflow-execution-contract.md +998 -0
  147. package/docs/roadmap/README.md +22 -0
  148. package/docs/roadmap/legacy-planning-status.md +103 -0
  149. package/docs/roadmap/now-next-later.md +70 -0
  150. package/docs/roadmap/open-work-inventory.md +389 -0
  151. package/docs/tickets/README.md +39 -0
  152. package/docs/tickets/next-up.md +76 -0
  153. package/docs/workflow-management.md +317 -0
  154. package/docs/workflow-templates.md +423 -0
  155. package/docs/workflow-validation.md +184 -0
  156. package/docs/workflows.md +254 -0
  157. package/package.json +3 -1
  158. package/spec/authoring-spec.json +61 -16
  159. package/workflows/workflow-for-workflows.json +252 -93
  160. package/workflows/workflow-for-workflows.v2.json +188 -77
@@ -0,0 +1,253 @@
1
+ # Design Candidates: WorkRail Autonomous Console Live View
2
+
3
+ > Raw investigative material for main-agent synthesis. Not a final decision.
4
+ > Generated: 2026-04-14 as part of wr.discovery workflow.
5
+
6
+ ---
7
+
8
+ ## Problem Understanding
9
+
10
+ ### Core Tensions
11
+
12
+ **Tension 1: Ephemeral daemon state vs. durable event log**
13
+ The daemon is a running process -- inherently ephemeral. Its liveness can change at any moment (crash, pause, restart). The console's architectural invariant is that the event log is the source of truth. An in-memory registry that tracks liveness as process state will drift from the event log on any disruption. Resolution direction: derive liveness from the event log via timestamped heartbeat context events; use the registry only for ephemeral process handles (AbortController).
14
+
15
+ **Tension 2: Read-only console architecture vs. write operations needed for control**
16
+ The console server (`console-routes.ts`, `ConsoleService`) is stateless and read-only by design. All existing routes are GETs. The control actions (pause/resume/cancel) require write operations -- specifically, writes to the DaemonRegistry (in-process state), not to the event log. The event log must remain writable only by the daemon. The question is whether the console's read-only invariant means "never write anything" or "never write to the event log."
17
+
18
+ **Tension 3: Cooperative pause semantics vs. user expectation of immediate effect**
19
+ "Pause" in a cooperative model means "stop before the next step." If the current step is a 10-minute LLM API call, the button produces no visible effect for up to 10 minutes. This creates a trust gap. The design must communicate the difference between "pause command received" and "session is paused" with an explicit intermediate state.
20
+
21
+ **Tension 4: Primary job (asynchronous verification) vs. secondary job (live control)**
22
+ The user research reframe: autonomous mode is an asynchronous verification problem, not a real-time monitoring problem. Users are not at the console when the daemon runs. They need "what happened while I was away," not "watch it happen live." The MVP risks over-indexing on the live monitoring UX (real-time tool call streaming, live badge) at the expense of the post-execution verification UX (session summary, confidence indicator, batch review).
23
+
24
+ ### Likely Seam
25
+
26
+ **Primary:** `ConsoleService.projectSessionSummary()` in `console-service.ts` -- already reads `context_set` events via `projectRunContextV2()`. Heartbeat detection is 5 lines here. Additive, pure, no new ports required.
27
+
28
+ **Secondary:** `mountConsoleRoutes()` in `console-routes.ts` -- already the function that mounts all console routes. Control endpoints mount here. Same pattern as the existing session/workflow/worktree routes.
29
+
30
+ **Frontend:** `SessionCard` and `SessionDetail` in `console/src/views/` -- render `ConsoleSessionSummary` and `ConsoleSessionDetail` props. Both already handle the status/health badge rendering pattern that `[ LIVE ]` follows.
31
+
32
+ ### What Makes This Hard
33
+
34
+ 1. **The heartbeat frequency problem:** If the daemon only emits a heartbeat on each `continue_workflow` advance, long steps (10-minute LLM calls) leave a 10-minute gap. The 60-second detection window requires the daemon to emit heartbeats within a step -- via a timer, not just at step transitions.
35
+
36
+ 2. **The `in_progress` ambiguity:** An `in_progress` session could be (a) human at keyboard, (b) daemon running, (c) human stepped away, (d) daemon crashed. The naive approach of adding a new `ConsoleSessionStatus` variant `'autonomous'` breaks existing consumers. The correct approach is an additive `isAutonomous: boolean` alongside the existing status.
37
+
38
+ 3. **The session lock + pause interaction:** Pausing does not release the session lock. The daemon holds the lock while executing a step. Pause only prevents the next `continue_workflow` call. A developer who tries to "pause" by releasing the lock early would break the gate invariant (`ExecutionSessionGateV2` tracks re-entrance via `activeSessions`).
39
+
40
+ 4. **The control endpoint authorization gap:** Any process with localhost access can cancel any session. For MVP (single-user localhost), this is acceptable. For production multi-user deployments, this is a security hole. The design must explicitly accept this limitation for MVP.
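
Item 1 above implies a heartbeat driven by a timer rather than by step transitions. A minimal sketch of that loop, assuming a hypothetical `emitHeartbeat` callback (standing in for whatever appends the `daemon_heartbeat` context event) and a hypothetical injectable `Scheduler`; neither name comes from the WorkRail codebase:

```typescript
// Hypothetical timer abstraction so the heartbeat loop can be driven by a fake in tests.
// A production implementation would delegate to the platform's setInterval/clearInterval.
interface Scheduler {
  setInterval(fn: () => void, ms: number): unknown;
  clearInterval(handle: unknown): void;
}

// Emits one heartbeat immediately (session start), then one per interval,
// independent of step boundaries. Returns an idempotent stop function.
function startHeartbeat(
  emitHeartbeat: (isoTimestamp: string) => void, // assumed to append the context event
  scheduler: Scheduler,
  now: () => Date,
  intervalMs: number = 30_000,
): () => void {
  emitHeartbeat(now().toISOString());
  const handle = scheduler.setInterval(() => {
    emitHeartbeat(now().toISOString());
  }, intervalMs);

  let stopped = false;
  return () => {
    if (stopped) return; // calling stop twice must not clear twice
    stopped = true;
    scheduler.clearInterval(handle);
  };
}
```

Injecting the scheduler and clock keeps the function deterministic under test; the daemon would call the returned stop function when the session ends or is cancelled.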
41
+
42
+ ---
43
+
44
+ ## Philosophy Constraints
45
+
46
+ **Relevant principles (from `/Users/etienneb/CLAUDE.md`):**
47
+
48
+ - **Errors are data** -- all new `DaemonRegistry` methods and POST route handlers must return `ResultAsync<T, E>` rather than throw exceptions. Pattern already established: `ConsoleService` uses `neverthrow` throughout.
49
+ - **Make illegal states unrepresentable** -- `DaemonEntry.status` must be a discriminated union `'running' | 'pausing' | 'paused' | 'cancelling'`. A `'paused'` session that then receives a `resume` command must transition through `'running'` -- not stay in `'paused'`.
50
+ - **Immutability by default** -- `DaemonEntry` should be immutable; updates create new entries. The registry's `set(sessionId, newEntry)` replaces entries, never mutates them in place.
51
+ - **Validate at boundaries, trust inside** -- the POST endpoint validates: session exists, is registered in DaemonRegistry, has status compatible with the requested action. Inside the registry method, no re-validation.
52
+ - **YAGNI with discipline** -- do not add trigger system, task flow chaining, or multi-model routing to MVP. These are explicitly `Later` items.
53
+ - **Type safety as first line of defense** -- the new `daemon-status-changed` SSE event type must be added to the SSE event union in `useWorkspaceEvents()` in a type-safe way. The frontend currently parses `msg.type` as `string` -- this should grow to a typed union or at minimum be handled exhaustively.
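
The type-safety principle above can be sketched as a parser that turns the raw SSE payload into a typed union, plus an exhaustive handler. This is an illustration only: the `'change'` variant is a hypothetical stand-in for the existing event types, and `parseWorkspaceEvent` / `describeEvent` are invented names, not functions from `useWorkspaceEvents()`.

```typescript
type DaemonEntryStatus = 'running' | 'pausing' | 'paused' | 'cancelling';

type WorkspaceEvent =
  | { readonly type: 'change' } // hypothetical stand-in for the existing variants
  | {
      readonly type: 'daemon-status-changed';
      readonly sessionId: string;
      readonly status: DaemonEntryStatus;
    };

const DAEMON_STATUSES: readonly string[] = ['running', 'pausing', 'paused', 'cancelling'];

function isDaemonEntryStatus(value: unknown): value is DaemonEntryStatus {
  return typeof value === 'string' && DAEMON_STATUSES.indexOf(value) !== -1;
}

// Validates at the boundary: malformed or unknown messages become null.
function parseWorkspaceEvent(raw: string): WorkspaceEvent | null {
  let msg: unknown;
  try {
    msg = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof msg !== 'object' || msg === null) return null;
  const rec = msg as Record<string, unknown>;
  const sessionId = rec['sessionId'];
  const status = rec['status'];
  switch (rec['type']) {
    case 'change':
      return { type: 'change' };
    case 'daemon-status-changed':
      if (typeof sessionId !== 'string' || !isDaemonEntryStatus(status)) return null;
      return { type: 'daemon-status-changed', sessionId, status };
    default:
      return null;
  }
}

// Exhaustive handling: adding a union variant without a case fails to compile.
function describeEvent(event: WorkspaceEvent): string {
  switch (event.type) {
    case 'change':
      return 'refetch';
    case 'daemon-status-changed':
      return `session ${event.sessionId} is ${event.status}`;
    default: {
      const unreachable: never = event;
      return unreachable;
    }
  }
}
```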
54
+
55
+ **Philosophy conflicts:**
56
+
57
+ 1. **Immutability vs. registry state management:** The DaemonRegistry is inherently stateful and mutable. Resolved by: confine all mutation to the registry's own methods; callers receive `Readonly<DaemonEntry>` only.
58
+ 2. **Read-only console invariant vs. control endpoints:** The console is read-only today by convention, not by architectural constraint. The POST control endpoints write to in-process DaemonRegistry state (not the event log). This is a bounded and acceptable departure, provided the invariant is restated as: the console's API must be read-only with respect to durable state.
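
The resolution of conflict 1 can be sketched as a registry that owns a private `Map`, replaces entries wholesale, and only ever hands out `Readonly<DaemonEntry>`. The entry below is deliberately reduced to three fields, `withStatus` is a hypothetical method name, and boolean returns stand in for the `ResultAsync` values the real methods would return:

```typescript
type DaemonEntryStatus = 'running' | 'pausing' | 'paused' | 'cancelling';

// Simplified entry for illustration; the proposed DaemonEntry carries more fields.
interface DaemonEntry {
  readonly sessionId: string;
  readonly startedAtMs: number;
  readonly status: DaemonEntryStatus;
}

class DaemonRegistry {
  // All mutation is confined to this map; callers never see it.
  private readonly entries = new Map<string, DaemonEntry>();

  register(entry: DaemonEntry): void {
    this.entries.set(entry.sessionId, Object.freeze({ ...entry }));
  }

  get(sessionId: string): Readonly<DaemonEntry> | undefined {
    return this.entries.get(sessionId);
  }

  // Replaces the stored entry with a new frozen object; never mutates in place.
  // Returns false when the session is not registered.
  withStatus(sessionId: string, status: DaemonEntryStatus): boolean {
    const current = this.entries.get(sessionId);
    if (current === undefined) return false;
    this.entries.set(sessionId, Object.freeze({ ...current, status }));
    return true;
  }

  deregister(sessionId: string): boolean {
    return this.entries.delete(sessionId);
  }
}
```

Because each update stores a fresh object, a reference obtained before the update keeps reporting the old status, which makes stale reads visible rather than silently changing under the caller.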
59
+
60
+ ---
61
+
62
+ ## Impact Surface
63
+
64
+ **Files that must stay consistent:**
65
+ - `src/v2/usecases/console-service.ts` -- primary change target (add `isAutonomous`, `lastHeartbeatMs` to projection)
66
+ - `src/v2/usecases/console-types.ts` -- `ConsoleSessionSummary` type change (add fields)
67
+ - `src/v2/usecases/console-routes.ts` -- add POST endpoints, new SSE event type
68
+ - `console/src/api/types.ts` -- frontend mirror of `ConsoleSessionSummary` (add `isAutonomous`)
69
+ - `console/src/views/SessionList.tsx` -- add `[ LIVE ]` badge to `SessionCard`
70
+ - `console/src/views/SessionDetail.tsx` -- add `AutonomousControlStrip`
71
+ - `console/src/api/hooks.ts` -- add `daemon-status-changed` SSE handling + POST mutation hooks
72
+
73
+ **Nearby consumers that must stay consistent:**
74
+ - `useSessionListRepository.ts` -- wraps `useSessionList()`, reads sessions; no change required if `isAutonomous` is additive
75
+ - `useSessionDetailViewModel.ts` -- wraps session detail; must pass `isAutonomous` and `daemonStatus` to `SessionDetail`
76
+ - `session-list-reducer.ts` -- filters and sorts sessions; `isAutonomous` should be filterable (add to filter options)
77
+ - `console-types.ts` backend and `api/types.ts` frontend must stay in sync (existing convention, no automatic codegen)
78
+
79
+ ---
80
+
81
+ ## Candidates
82
+
83
+ ### Candidate 1: Visibility Only (Simplest Possible)
84
+
85
+ **Summary:** The daemon writes `context_set` events (`is_autonomous: "true"`, `daemon_heartbeat: "<ISO timestamp>"`) at session start and every 30 seconds. `ConsoleService.projectSessionSummary()` reads these via the existing `projectRunContextV2()` call. `ConsoleSessionSummary` gains two new fields: `isAutonomous: boolean` and `lastHeartbeatMs: number | null`. `SessionCard` shows a pulsing amber `[ LIVE ]` dot when `isAutonomous && status === 'in_progress' && lastHeartbeatMs !== null && Date.now() - lastHeartbeatMs < 60_000`. No new routes. No new ports. No DaemonRegistry.
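
The badge condition in the summary above is a pure predicate over the two new fields. A minimal sketch, assuming a reduced input type (`SessionLivenessInput`) and hypothetical helper names `parseHeartbeatMs` and `isLive`; the clock is passed in rather than read from `Date.now()` so the function stays testable:

```typescript
// Reduced view of ConsoleSessionSummary: only the fields the badge needs.
interface SessionLivenessInput {
  readonly status: string;
  readonly isAutonomous: boolean;
  readonly lastHeartbeatMs: number | null;
}

const LIVE_WINDOW_MS = 60_000;

// Converts the ISO timestamp stored in the `daemon_heartbeat` context value
// to epoch milliseconds; missing or unparseable values become null.
function parseHeartbeatMs(isoTimestamp: string | undefined): number | null {
  if (isoTimestamp === undefined) return null;
  const ms = Date.parse(isoTimestamp);
  return isNaN(ms) ? null : ms;
}

// True only while the most recent heartbeat is younger than the detection window.
function isLive(session: SessionLivenessInput, nowMs: number): boolean {
  return (
    session.isAutonomous &&
    session.status === 'in_progress' &&
    session.lastHeartbeatMs !== null &&
    nowMs - session.lastHeartbeatMs < LIVE_WINDOW_MS
  );
}
```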
86
+
87
+ **Tensions resolved:** Event-log-as-source-of-truth (fully -- liveness from heartbeat events only). Crash-safe (yes -- no heartbeat in 60s = badge disappears). Minimum new surface (best in class -- 3 files changed, ~40 lines).
88
+
89
+ **Tensions accepted:** No pause/cancel control (safety net absent). No post-execution confidence indicator (post-execution verification <30s partially met via existing session detail view). No real-time step progress beyond the existing 5s poll.
90
+
91
+ **Boundary solved at:** `ConsoleService.projectSessionSummary()` -- the single function that builds the session summary DTO. This is already the right place for all projection-level decisions about session status.
92
+
93
+ **Why this boundary is best fit:** It follows the exact pattern used by `deriveSessionTitle()` (reads context events), `projectRunStatusSignalsV2()` (reads status events), and `extractGitBranch()` (reads observation events). No new plumbing required.
94
+
95
+ **Failure mode:** The 30-second heartbeat interval plus the 60-second detection window means the badge can stay stale for roughly 30-60 seconds after a crash: the badge clears 60s after the last heartbeat, and that heartbeat precedes the crash by at most 30s. If the daemon crashes immediately after a heartbeat, users see `[ LIVE ]` for up to 60 more seconds; if it crashes at the 29-second mark, for about 31 more seconds. This is the advertised crash-safety window; document it explicitly.
96
+
97
+ **Repo-pattern relationship:** Follows exactly -- pure event-log projection, no new infrastructure.
98
+
99
+ **Gains:** Zero infrastructure risk. Zero new routes. Clean event-log derivation. Safe to ship as Phase 1 with no breaking changes.
100
+
101
+ **Losses:** No pause/cancel. No post-execution confidence indicator. No batch "autonomous sessions" filter.
102
+
103
+ **Scope judgment:** Too narrow as a standalone MVP (users need the safety net), but correct as Phase 1 of a phased delivery.
104
+
105
+ **Philosophy fit:** Honors immutability, errors-as-data (no new error paths), YAGNI, validate-at-boundaries. No conflicts.
106
+
107
+ ---
108
+
109
+ ### Candidate 2: Visibility + Control (Follow Existing Pattern)
110
+
111
+ **Summary:** Builds directly on Candidate 1. Adds a `DaemonRegistry` class with `Map<SessionId, DaemonEntry>` where `DaemonEntry = { readonly sessionId: SessionId; readonly workflowId: string | null; readonly goal: string | null; readonly startedAtMs: number; readonly abortController: AbortController; readonly pauseFlag: { paused: boolean }; readonly status: 'running' | 'pausing' | 'paused' | 'cancelling' }`. Adds three POST routes to `mountConsoleRoutes()`: `POST /api/v2/sessions/:id/pause`, `POST /api/v2/sessions/:id/resume`, `POST /api/v2/sessions/:id/cancel`. Adds `AutonomousControlStrip` React component to `SessionDetail.tsx`. Adds `useDaemonControl(sessionId)` hook in `console/src/hooks/`. Extends the SSE event union with `{type: "daemon-status-changed", sessionId: string, status: DaemonEntryStatus}` broadcast when registry status changes.
112
+
113
+ **DaemonEntry status transitions:**
114
+ - `running` → `pausing` (on POST /pause)
115
+ - `pausing` → `paused` (when daemon cooperative check fires)
116
+ - `paused` → `running` (on POST /resume)
117
+ - `running | pausing | paused` → `cancelling` (on POST /cancel)
118
+ - any → deregistered (when daemon calls `daemonRegistry.deregister(sessionId)`)
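
The transition list above can be encoded as a total function that returns errors as data. A sketch under stated assumptions: `ControlCommand`, `pause_observed` (the daemon's cooperative check firing), and `transition` are hypothetical names, and a plain result object stands in for `neverthrow`. Deregistration removes the entry rather than changing its status, so it is not modeled here.

```typescript
type DaemonEntryStatus = 'running' | 'pausing' | 'paused' | 'cancelling';

// 'pause_observed' is a hypothetical name for "the daemon's cooperative check fired".
type ControlCommand = 'pause' | 'pause_observed' | 'resume' | 'cancel';

type TransitionResult =
  | { readonly ok: true; readonly next: DaemonEntryStatus }
  | { readonly ok: false; readonly error: string };

function accept(next: DaemonEntryStatus): TransitionResult {
  return { ok: true, next };
}

function reject(current: DaemonEntryStatus, command: ControlCommand): TransitionResult {
  return { ok: false, error: `cannot apply '${command}' while '${current}'` };
}

// Encodes exactly the listed transitions; anything else is an error value.
function transition(current: DaemonEntryStatus, command: ControlCommand): TransitionResult {
  switch (command) {
    case 'pause':
      return current === 'running' ? accept('pausing') : reject(current, command);
    case 'pause_observed':
      return current === 'pausing' ? accept('paused') : reject(current, command);
    case 'resume':
      return current === 'paused' ? accept('running') : reject(current, command);
    case 'cancel':
      return current === 'cancelling' ? reject(current, command) : accept('cancelling');
  }
}
```

A POST handler would call `transition` after validating that the session is registered, then store the `next` status only when `ok` is true.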
119
+
120
+ **Tensions resolved:** All 5 criteria met. Liveness from heartbeat (event-log-as-source-of-truth). Crash-safe for liveness. Control actions for safety net. Post-execution verification via existing DAG + session detail.
121
+
122
+ **Tensions accepted:** DaemonRegistry is lost on server restart -- control actions for sessions started before the restart are unavailable. This is explicitly acceptable: a restarted server has no in-flight calls to cancel; those sessions' daemons are also dead or disconnected.
123
+
124
+ **Boundary solved at:** Two boundaries: (1) `ConsoleService.projectSessionSummary()` for liveness detection, (2) `mountConsoleRoutes()` for control endpoint mounting. Both are existing extension points.
125
+
126
+ **Why these boundaries are best fit:** Same reason as Candidate 1 for liveness. For control: `mountConsoleRoutes()` already accepts `consoleService`, `workflowService`, and optional `timingRingBuffer` -- adding `daemonRegistry?: DaemonRegistry` follows the same optional-dependency pattern.
127
+
128
+ **Failure mode:** A user presses `[ PAUSE ]`. The POST succeeds (HTTP 200). The daemon has a 10-minute LLM call in flight. The UI must show `Pausing after current step...` for up to 10 minutes before transitioning to `Paused`. If the UI shows `Paused` immediately after the POST (optimistic update without waiting for `daemon-status-changed` event), users may try to interact with a session that is not actually paused. Resolution: optimistic update to `pausing` state immediately, then `paused` on `daemon-status-changed` SSE event.
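
The resolution above amounts to a small UI reducer: a successful POST moves the strip only to `pausing`, and only the SSE event can move it to `paused`. A sketch with hypothetical action names; the `Running` and `Cancelling...` labels are assumptions, while the other two labels come from the failure-mode description:

```typescript
type DaemonEntryStatus = 'running' | 'pausing' | 'paused' | 'cancelling';

type ControlStripAction =
  | { readonly type: 'pause-post-succeeded' } // hypothetical: HTTP 200 from POST /pause
  | { readonly type: 'daemon-status-changed'; readonly status: DaemonEntryStatus };

function controlStripReducer(
  state: DaemonEntryStatus,
  action: ControlStripAction,
): DaemonEntryStatus {
  switch (action.type) {
    case 'pause-post-succeeded':
      // Optimistic, but only as far as the intermediate state.
      return state === 'running' ? 'pausing' : state;
    case 'daemon-status-changed':
      // The server-side status is authoritative.
      return action.status;
  }
}

function controlStripLabel(state: DaemonEntryStatus): string {
  switch (state) {
    case 'running':
      return 'Running';
    case 'pausing':
      return 'Pausing after current step...';
    case 'paused':
      return 'Paused';
    case 'cancelling':
      return 'Cancelling...';
  }
}
```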
129
+
130
+ **Repo-pattern relationship:** Adapts -- follows `mountConsoleRoutes()` pattern for route mounting, `ConsoleServicePorts` pattern for optional dependencies, `usePerfToolCalls()` pattern for mutation hooks.
131
+
132
+ **Gains:** Full MVP feature set. Clean architecture. In-process DaemonRegistry avoids IPC complexity. Phased delivery (C1 first, C2 as increment).
133
+
134
+ **Losses:** More files changed (10 vs. 3). `DaemonEntry` mutability requires careful ownership (only `DaemonRegistry` mutates it). Registry is ephemeral -- acknowledged limitation.
135
+
136
+ **Scope judgment:** Best-fit for MVP. Grounded in existing patterns. Not over-engineered.
137
+
138
+ **Philosophy fit:** Honors errors-as-data (`ResultAsync` on all registry methods), make-illegal-states-unrepresentable (discriminated union for status), immutability (entries are `Readonly<DaemonEntry>`), validate-at-boundaries (POST routes validate before calling registry). Minor tension with YAGNI (DaemonRegistry is new infrastructure), but justified by user need for the safety net.
139
+
140
+ ---
141
+
142
+ ### Candidate 3: Reframe -- Autonomous History Tab
143
+
144
+ **Summary:** Reframes the MVP as post-execution verification, not live monitoring. Does NOT add a live badge, control buttons, or DaemonRegistry in Phase 1. Instead, adds an `Autonomous` filter option to the existing `StatusFilterOptions` in the session list. Sessions with `isAutonomous: true` (derived from context events) are filterable from the existing filter chips. Each autonomous session card shows an additional "confidence indicator" chip: `green` (complete), `yellow` (complete_with_gaps or blocked), `red` (dormant). Users can filter to all their autonomous sessions in one click and assess batch outcomes at a glance.
145
+
146
+ **New type additions:** `confidenceSignal: 'green' | 'yellow' | 'red' | null` on `ConsoleSessionSummary` (derived from `status + isAutonomous`). Zero backend changes beyond the heartbeat projection from Candidate 1.
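
The derivation of `confidenceSignal` from `status + isAutonomous` can be sketched as a pure function. The status strings are the ones named in the summary; returning `null` for non-autonomous sessions and for statuses without a verdict (such as `in_progress`) is an assumption, as are the function and type names:

```typescript
type ConfidenceSignal = 'green' | 'yellow' | 'red' | null;

function deriveConfidenceSignal(status: string, isAutonomous: boolean): ConfidenceSignal {
  if (!isAutonomous) return null; // assumption: chip only applies to autonomous sessions
  switch (status) {
    case 'complete':
      return 'green';
    case 'complete_with_gaps':
    case 'blocked':
      return 'yellow';
    case 'dormant':
      return 'red';
    default:
      return null; // assumption: no verdict yet (e.g. in_progress)
  }
}

// Reduced session shape for illustrating the "Autonomous" filter chip.
interface SessionRow {
  readonly sessionId: string;
  readonly status: string;
  readonly isAutonomous: boolean;
}

function filterAutonomous(sessions: readonly SessionRow[]): SessionRow[] {
  return sessions.filter((s) => s.isAutonomous);
}
```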
147
+
148
+ **Tensions resolved:** Event-log-as-source-of-truth (fully). Post-execution verification <30s (best in class -- filter + confidence chip). Minimum new surface. Crash-safe.
149
+
150
+ **Tensions accepted:** No live badge (users cannot tell if a session is currently running). No pause/cancel control. Live monitoring is entirely absent.
151
+
152
+ **Boundary solved at:** `session-list-reducer.ts` for filtering + `ConsoleSessionSummary` for `confidenceSignal` derivation. Both are already the canonical places for session list processing.
153
+
154
+ **Why this boundary is best fit:** It directly serves the primary user job (batch post-execution verification) at the lowest possible surface area.
155
+
156
+ **Failure mode:** Users who want to intervene mid-session have no mechanism. If an autonomous session goes wrong (infinite loop, unauthorized action), users must wait for it to complete or manually kill the daemon process. This is the explicitly accepted limitation: "safety net absent."
157
+
158
+ **Repo-pattern relationship:** Follows -- the status filter chip pattern already exists in `SessionList.tsx`; adding a new filter option follows the existing `statusFilterOptions` array pattern exactly.
159
+
160
+ **Gains:** Serves primary user job directly. Zero new infrastructure. Clean.
161
+
162
+ **Losses:** No live monitoring. No control actions. Users who want a safety net must build it themselves (kill the process).
163
+
164
+ **Scope judgment:** Too narrow as a standalone MVP -- misses the users' expressed need for a safety net. Strong as a Phase 1 before Candidate 2, or as an alternative framing if user research shows the safety net is not a real need.
165
+
166
+ **Philosophy fit:** Perfect philosophy fit. YAGNI honored. No new infrastructure. Honors event-log source of truth.
167
+
168
+ ---
169
+
170
+ ### Candidate 4: Log-Based Control Signals (Architectural Departure)
171
+
172
+ **Summary:** Eliminates the in-process `DaemonRegistry` for control actions. Instead, the console POST endpoints write new control signal events directly to the session event log: `{kind: 'control_signal_appended', data: {signal: 'pause' | 'resume' | 'cancel', requestedAtMs: number}}`. The daemon reads the event log tail at each step boundary to detect control signals. No DaemonRegistry. No in-process state. Control signals are durable, survive server restarts, appear in the execution trace, and cannot be lost.
173
+
174
+ **New event type:** `CONTROL_SIGNAL_APPENDED` added to the domain event schema. New projection: `projectControlSignalsV2(events): Result<{pendingSignal: 'pause' | 'resume' | 'cancel' | null}>`.
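
One way the daemon-side read could look, as a sketch only. It departs from the `projectControlSignalsV2` signature above by adding a cursor: the daemon is assumed to remember how many events it has already processed, and the latest control signal past that cursor wins. The cursor, the `'other'` event variant, and the name `projectPendingControlSignal` are all assumptions; the `Result` wrapper is omitted for brevity.

```typescript
type ControlSignal = 'pause' | 'resume' | 'cancel';

type SessionEvent =
  | {
      readonly kind: 'control_signal_appended';
      readonly data: { readonly signal: ControlSignal; readonly requestedAtMs: number };
    }
  | { readonly kind: 'other' }; // stand-in for every other domain event kind

interface PendingSignal {
  readonly pendingSignal: ControlSignal | null;
  readonly cursor: number; // pass back in on the next step boundary
}

// Scans only events the daemon has not seen yet, so a signal is acted on at most once.
function projectPendingControlSignal(
  events: readonly SessionEvent[],
  cursor: number,
): PendingSignal {
  let pendingSignal: ControlSignal | null = null;
  for (const event of events.slice(Math.max(cursor, 0))) {
    if (event.kind === 'control_signal_appended') {
      pendingSignal = event.data.signal; // later signals override earlier ones
    }
  }
  return { pendingSignal, cursor: events.length };
}
```

Advancing the cursor on every read is what gives the at-most-once behavior in this sketch; whether a later `resume` should be allowed to override an earlier `cancel` is a policy question the design would still need to settle.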
175
+
176
+ **Tensions resolved:** Event-log-as-source-of-truth (strongest -- control signals ARE in the event log). Crash-safe (complete -- signals survive server restart). No ephemeral state.
177
+
178
+ **Tensions accepted:** Console writes to the event log -- this is a fundamental violation of the console's read-only invariant. The console's read-only constraint exists for a reason: no accidental session corruption, no race conditions between readers. Writing control signals from the console API breaks this invariant.
179
+
180
+ **Boundary solved at:** The domain event schema (`durable-core/schemas/session/`). Control signals become first-class domain events.
181
+
182
+ **Why this boundary is problematic:** `console-routes.ts` carries the comment "GET-only (invariant: Console is read-only)", and Candidate 4 violates that invariant. The risk is broader than a single technical concern: it changes the security model, the test assumptions, and the ownership model of who can write to sessions.
183
+
184
+ **Failure mode:** A race condition between the console writing a control signal and the daemon reading it could cause the daemon to act on a stale signal (e.g., re-pausing after a resume was already processed). The daemon would need robust idempotent signal processing to handle this.
185
+
186
+ **Repo-pattern relationship:** Significant departure -- introduces console-to-event-log writes, which the entire console architecture explicitly prohibits.
187
+
188
+ **Gains:** Perfect source-of-truth alignment. Durable signals. No ephemeral state. Control signals visible in audit trail and DAG.
189
+
190
+ **Losses:** Breaks read-only console invariant. Requires new domain event type in durable-core schema. Requires new projection. Higher blast radius if the signal processing has bugs.
191
+
192
+ **Scope judgment:** Too broad for MVP -- introduces architectural changes to durable-core. Potentially correct for a future version where control signals should be auditable.
193
+
194
+ **Philosophy fit:** Honors event-log-as-source-of-truth but violates make-illegal-states-unrepresentable (now the console CAN write to sessions -- that's a new legal state) and validate-at-boundaries (the console API is now a write boundary to a previously read-only system).
195
+
196
+ ---
197
+
198
+ ## Comparison and Recommendation
199
+
200
+ | Criterion | C1 (Visibility) | C2 (Visibility + Control) | C3 (History Reframe) | C4 (Log Control) |
201
+ |-----------|:--------------:|:------------------------:|:--------------------:|:----------------:|
202
+ | Event-log source of truth | YES | YES | YES | YES (strongest) |
203
+ | Crash-safe | YES | YES (partial*) | YES | YES |
204
+ | Minimum new surface | BEST | GOOD | GOOD | WORST |
205
+ | Post-execution verification <30s | PARTIAL | PARTIAL | BEST | PARTIAL |
206
+ | Pause/cancel safety net | NO | YES | NO | YES (durable) |
207
+ | Philosophy fit | PERFECT | GOOD | PERFECT | FAIR |
208
+
209
+ \* C2 crash-safe qualification: liveness is crash-safe via heartbeat; registry is ephemeral for control actions (acceptable -- registry is not source of truth for liveness, only for AbortController handles)
210
+
211
+ **Recommendation: Candidate 2, delivered as C1 then C2.**
212
+
213
+ **Rationale:**
214
+ 1. C2 satisfies all 5 decision criteria. C1 and C3 miss the safety net.
215
+ 2. C2 follows existing repo patterns: `mountConsoleRoutes()` optional-dependency pattern, `ConsoleServicePorts` pattern, `usePerfToolCalls()` pattern for mutation hooks.
216
+ 3. C4 is architecturally superior in theory but practically wrong -- it breaks the console's read-only invariant, which is the load-bearing architectural constraint that prevents console bugs from corrupting sessions.
217
+ 4. C3's insight (autonomous filter + confidence chip) should be adopted INTO C2, not as a separate candidate. Add `confidenceSignal` to `ConsoleSessionSummary` and an autonomous filter option alongside the live badge and control strip.
218
+
219
+ **C3 insight absorbed:** Add `confidenceSignal: 'green' | 'yellow' | 'red' | null` to `ConsoleSessionSummary`. Derive from `isAutonomous && status`. Add `statusFilter: 'autonomous'` option to the session list filter chips. This serves the primary job (post-execution verification) at minimal cost.
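The derivation described above can be sketched as a small pure function. This is only an illustration: the `SessionStatus` values and the colour assigned to each status are assumptions made for the example, not taken from the WorkRail source, and `deriveConfidenceSignal` is a hypothetical name.

```typescript
// Hypothetical status values; the real ConsoleSessionSummary status union may differ.
type SessionStatus = 'in_progress' | 'complete' | 'blocked' | 'dormant';

type ConfidenceSignal = 'green' | 'yellow' | 'red' | null;

// Assumed mapping from status to signal colour (illustrative only).
const SIGNAL_BY_STATUS: Record<SessionStatus, 'green' | 'yellow' | 'red'> = {
  complete: 'green',
  in_progress: 'yellow',
  blocked: 'red',
  dormant: 'red',
};

// Non-autonomous sessions carry no confidence signal (null).
function deriveConfidenceSignal(
  isAutonomous: boolean,
  status: SessionStatus,
): ConfidenceSignal {
  return isAutonomous ? SIGNAL_BY_STATUS[status] : null;
}
```

Keeping the mapping in a typed lookup table means that adding a status value without assigning it a colour fails at compile time.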
220
+
221
+ ---
222
+
223
+ ## Self-Critique
224
+
225
+ ### Strongest counter-argument against C2
226
+
227
+ The DaemonRegistry is a violation of WorkRail's "event log as source of truth" principle for the control action state. If the registry says a session is `pausing` but the session's event log has no `paused` signal, the system state is inconsistent. The counter-argument: this inconsistency only exists transiently (between when the user presses pause and when the daemon checks the pause flag). The event log does not need to represent transient control states -- it needs to represent permanent execution states. The daemon's cooperative pause check is designed to consume-and-clear the pause flag without logging it; the absence of a `pause_requested` event in the log is intentional.
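A minimal sketch of the consume-and-clear behaviour, assuming the pause flag is keyed by session ID and held only in process memory. `PauseFlags`, `requestPause`, and `consumePauseRequest` are hypothetical names, not the actual DaemonRegistry API.

```typescript
// In-process pause flags; nothing here is written to the event log.
class PauseFlags {
  private readonly requested = new Set<string>();

  // Called by the console's POST /pause handler (assumed wiring).
  requestPause(sessionId: string): void {
    this.requested.add(sessionId);
  }

  // Called by the daemon between steps. Reading the flag also clears it,
  // so a single pause request is observed at most once.
  consumePauseRequest(sessionId: string): boolean {
    return this.requested.delete(sessionId);
  }
}
```

Because `Set.prototype.delete` reports whether the element was present, the check and the clear happen in one call and the daemon never sees a stale flag on a later step.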
228
+
229
+ ### Narrower option that could still work
230
+
231
+ Candidate 1 (visibility only) + Candidate 3 (autonomous filter). This gives users post-execution verification without any write operations on the console. If user research shows that the safety net (pause/cancel) is not a real need in practice, this combination is sufficient and has the smallest possible surface area.
232
+
233
+ ### Broader option that could be justified
234
+
235
+ Candidate 4 (log-based control signals) would be justified if: (a) multi-process deployment is a real requirement (daemon and console server in separate containers), and (b) auditability of control actions is required (e.g., compliance use case where "who paused this session and when" must be in the session record). Neither condition is present in the MVP.
236
+
237
+ ### Assumption that would invalidate this design
238
+
239
+ The daemon and console server MUST be in the same Node.js process for C2's in-process DaemonRegistry to work. This was verified: `HttpServer` source confirms console + MCP run in the same process. But if a future deployment scenario separates them (e.g., console UI served from a CDN, console API as a microservice, daemon as a separate container), C2's DaemonRegistry would need to be replaced with a socket-backed or log-based implementation.
240
+
241
+ ---
242
+
243
+ ## Open Questions for the Main Agent
244
+
245
+ 1. **Heartbeat frequency:** Should the daemon emit a heartbeat event at step START only, or also on a 30-second timer within a step? The timer approach requires the daemon to have a separate non-blocking timer running alongside the LLM API call. Is that complexity justified for the 60-second detection window?
246
+
247
+ 2. **Confidence signal derivation:** C3's `confidenceSignal` (green/yellow/red) is derived from `status`. But the primary users (team lead reviewing PRs overnight) may want per-session evidence quality signals, not just completion status. Is `status → confidence` sufficient, or should there be a richer derivation?
248
+
249
+ 3. **The `pausing` UX state:** When a user presses pause and the daemon is mid-step, the console should show `Pausing after current step...`. Should this be a badge variant, a tooltip, or a visible status row in `AutonomousControlStrip`? The answer is a UX decision, not an architecture decision.
250
+
251
+ 4. **`POST` endpoint authentication:** The MVP explicitly accepts "any localhost process can cancel any session." Should there be a CSRF token or a session-ID-in-request-body check to prevent accidental misfire from browser extensions or other localhost services?
252
+
253
+ 5. **The console read-only invariant:** Should `console-routes.ts` have an explicit comment documenting that the POST control endpoints are the ONLY exception to the read-only invariant, and that they write ONLY to in-process state, never to the event log? This would prevent future developers from adding additional console write routes without understanding the constraint.
@@ -0,0 +1,111 @@
1
+ # Design Review Findings: WorkRail Autonomous Console Live View
2
+
3
+ > Concise, actionable findings for main-agent synthesis. Companion to `autonomous-console-design-candidates.md`.
4
+ > Generated: 2026-04-14 as part of wr.discovery workflow.
5
+
6
+ ---
7
+
8
+ ## Tradeoff Review
9
+
10
+ | Tradeoff | Acceptable under expected conditions? | What makes it unacceptable |
11
+ |----------|--------------------------------------|---------------------------|
12
+ | DaemonRegistry `lastHeartbeatMs` is ephemeral (lost on restart) | Yes -- liveness requires both `is_autonomous` (event log, durable) AND `lastHeartbeatMs < 60s` (registry, ephemeral). Post-restart: registry empty → lastHeartbeatMs null → no LIVE badge → correct behavior (daemon is also dead) | Daemon and console server run in separate processes (separate containers); daemon can survive a console restart |
13
+ | Cancelled sessions become dormant after 1 hour (no explicit `cancelled` status) | Acceptable for MVP single-user. LIVE badge disappears within 60s of cancel. Session shows `dormant` after 1h. | Multi-user reporting requirements; users need explicit `cancelled` status for filtering |
14
+ | LIVE badge is best-effort (users can spoof `is_autonomous` via context_set) | Yes for localhost single-user MVP | Multi-user/multi-tenant deployment where badge is a trust signal |
15
+ | Autonomous mode requires HTTP transport (not STDIO) | Yes -- the daemon requires the console and a persistent process. STDIO mode unchanged for human-driven sessions | WorkRail deprecates HTTP mode or autonomous users prefer STDIO; unlikely |
16
+ | Heartbeat timer cannot write to event log (session lock) | Resolved by hybrid approach -- heartbeat freshness stored in registry, not event log | N/A -- already resolved |
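The hybrid liveness rule in the first row can be sketched as follows. The signature is an assumption made for illustration (the document names `isSessionLive()` but does not show its parameters), and `LIVE_WINDOW_MS` is a hypothetical constant name for the 60-second window.

```typescript
// 60-second detection window, per the tradeoff table above.
const LIVE_WINDOW_MS = 60 * 1000;

// A session is live only if it is autonomous (durable fact from the event log)
// AND the registry holds a heartbeat newer than the window (ephemeral fact).
function isSessionLive(
  isAutonomous: boolean,
  lastHeartbeatMs: number | null, // null when the registry has no entry, e.g. after a restart
  nowMs: number,
): boolean {
  if (!isAutonomous || lastHeartbeatMs === null) return false;
  return nowMs - lastHeartbeatMs < LIVE_WINDOW_MS;
}
```

Passing `nowMs` in rather than reading the clock inside the function keeps the check deterministic under test.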
17
+
18
+ ---
19
+
20
+ ## Failure Mode Review
21
+
22
+ | Failure Mode | Design handling | Missing mitigation | Risk level |
23
+ |--------------|-----------------|--------------------|------------|
24
+ | LLM call exceeds 60s with no tool calls -- LIVE badge disappears | Accepted -- badge reappears after step completes and next heartbeat fires | Softer indicator for `is_autonomous + in_progress + last_heartbeat < 10min` (not 60s) for UX clarity | LOW |
25
+ | Daemon crash -- session in_progress for 1 hour | LIVE badge disappears in 60s; dormant after 1h | Shorter dormancy threshold for autonomous sessions (5-10 min vs. 1 hour) | MEDIUM |
26
+ | Pause → POST succeeds but the daemon checks the pause flag only after a long step completes | Badge transitions to `pausing`; user sees intermediate state | UI must show `Pausing after current step...` not `Paused` immediately | MEDIUM (UX trust) |
27
+ | Daemon orphan on crash leaves session stuck in_progress | Mitigated by dormancy. No corruption, just display ambiguity | Configure autonomous dormancy threshold separately from human-session threshold | MEDIUM (UX) |
28
+
29
+ **Highest-risk failure mode:** Daemon crash creating 1-hour ambiguity window. Mitigation: shorten dormancy threshold for autonomous sessions (configurable via `WORKRAIL_AUTONOMOUS_DORMANCY_MS` env var, default 5 minutes vs. the 1-hour default for human sessions).
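One way the env-var override could be read is sketched below. The function name and the fallback rules (invalid or non-positive values fall back to the default) are assumptions; the environment is passed in as a plain record so the sketch does not depend on Node typings.

```typescript
// Default of 5 minutes, per the mitigation described above.
const DEFAULT_AUTONOMOUS_DORMANCY_MS = 5 * 60 * 1000;

// Hypothetical helper: resolve the threshold from an env-like record.
function resolveAutonomousDormancyMs(
  env: Record<string, string | undefined>,
): number {
  const raw = env['WORKRAIL_AUTONOMOUS_DORMANCY_MS'];
  const parsed = raw === undefined ? NaN : Number(raw);
  // Assumed policy: ignore unset, non-numeric, zero, or negative values.
  return Number.isFinite(parsed) && parsed > 0
    ? parsed
    : DEFAULT_AUTONOMOUS_DORMANCY_MS;
}
```

In a Node.js process this would typically be called once at startup with `process.env`.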
30
+
31
+ ---
32
+
33
+ ## Runner-Up / Simpler Alternative Review
34
+
35
+ **Runner-up (C3 Autonomous History Reframe) strengths absorbed:**
36
+ - `confidenceSignal: 'green' | 'yellow' | 'red' | null` on `ConsoleSessionSummary` -- derives from `isAutonomous + status` -- directly serves primary job (post-execution verification)
37
+ - `statusFilter: 'autonomous'` option in session list filter chips -- one-click batch review
38
+
39
+ **Simpler variant (C1 + C3 only, no DaemonRegistry) analysis:**
40
+ - Fails the "pause/cancel as safety net" acceptance criterion
41
+ - Without human control plane, WorkRail autonomous mode is indistinguishable from any other black-box autonomous agent
42
+ - The control plane is the product differentiator -- it cannot be deferred past MVP
43
+
44
+ **Hybrid DaemonRegistry implementation:** Module-level Map in `console-routes.ts` (like `sseClients`) is marginally simpler but harder to test. Proper class is ~10 more lines but injectable and testable. Class is the right choice.
45
+
46
+ ---
47
+
48
+ ## Philosophy Alignment
49
+
50
+ | Principle | Satisfied? | Notes |
51
+ |-----------|-----------|-------|
52
+ | Errors are data | YES | All registry methods and POST routes use `ResultAsync`/`.match()` |
53
+ | Make illegal states unrepresentable | YES | `DaemonEntry.status` is a 4-value closed union |
54
+ | Immutability by default | YES (with tension) | Registry state is mutable but confined behind class methods; callers receive `Readonly<DaemonEntry>` |
55
+ | Validate at boundaries | YES | POST endpoints validate before calling registry |
56
+ | YAGNI with discipline | YES | No trigger system, chaining, or multi-model routing in MVP |
57
+ | Type safety as first line | YES | Typed new fields, typed SSE event union |
58
+ | Compose with small pure functions | YES | `projectIsAutonomous()`, `isSessionLive()` are separate testable functions |
59
+ | Determinism | TENSION (acceptable) | `lastHeartbeatMs` is real-time clock state; `is_autonomous` in event log is deterministic |
60
+
61
+ **Tension that matters:** The hybrid liveness design (event log for `is_autonomous`, registry for `lastHeartbeatMs`) has a determinism tension. The event log path is fully deterministic; the registry path is not. This is explicitly accepted because the session lock prevents the fully-deterministic event log approach from working for freshness signals.
62
+
63
+ ---
64
+
65
+ ## Findings
66
+
67
+ ### RED (blocking)
68
+
69
+ None. No finding requires blocking the selected direction.
70
+
71
+ ### ORANGE (revise before implementation)
72
+
73
+ **ORANGE-1: Autonomous dormancy threshold must be configurable and set to 5 minutes by default**
74
+
75
+ The 1-hour default dormancy threshold (`DORMANCY_THRESHOLD_MS`) was designed for human-driven sessions. Autonomous sessions should have a much shorter threshold (5-10 minutes): a daemon that crashes should show `dormant` within minutes, not an hour. Revise: add an `AUTONOMOUS_DORMANCY_THRESHOLD_MS` constant (default: 5 minutes) alongside the existing `DORMANCY_THRESHOLD_MS`. When `isAutonomous` is true, mark the session `dormant` once `nowMs - lastModifiedMs > AUTONOMOUS_DORMANCY_THRESHOLD_MS`; otherwise keep comparing against `DORMANCY_THRESHOLD_MS`.
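A minimal sketch of the threshold selection, assuming dormancy is decided from `lastModifiedMs` alone. The constant names follow the finding; `isDormant` is a hypothetical helper, not the actual `projectSessionSummary()` code.

```typescript
const DORMANCY_THRESHOLD_MS = 60 * 60 * 1000; // existing 1-hour default
const AUTONOMOUS_DORMANCY_THRESHOLD_MS = 5 * 60 * 1000; // proposed 5-minute default

// Pick the threshold by session kind, then apply the same comparison.
function isDormant(
  isAutonomous: boolean,
  lastModifiedMs: number,
  nowMs: number,
): boolean {
  const thresholdMs = isAutonomous
    ? AUTONOMOUS_DORMANCY_THRESHOLD_MS
    : DORMANCY_THRESHOLD_MS;
  return nowMs - lastModifiedMs > thresholdMs;
}
```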
76
+
77
+ **ORANGE-2: Pause UX must show intermediate `pausing` state, not immediate `paused`**
78
+
79
+ If the POST /pause route returns 200 before the daemon acknowledges pause (which it will, since acknowledgment requires the current step to complete), the frontend must show `Pausing after current step...` as an intermediate state. Optimistic UI must transition to `pausing` status (not `paused`) immediately after the POST. Transition to `paused` only on receipt of `daemon-status-changed` SSE event with `status: "paused"`. Without this, users will think their pause button was ignored during long steps.
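The intended transitions can be sketched as a pure state function, separate from any React code. The state names come from the review; the event shapes (`pause-post-accepted`, and a `daemon-status-changed` event carrying a `status` string) are assumptions made for the example.

```typescript
type ControlState = 'idle' | 'pausing' | 'paused' | 'cancelling';

// Hypothetical event shapes for illustration.
type ControlEvent =
  | { kind: 'pause-post-accepted' }
  | { kind: 'daemon-status-changed'; status: string };

function nextControlState(
  state: ControlState,
  event: ControlEvent,
): ControlState {
  if (event.kind === 'pause-post-accepted') {
    // Optimistic update: the POST returned, but the daemon has not yet acknowledged.
    return state === 'idle' ? 'pausing' : state;
  }
  // Only the daemon's acknowledgment over SSE moves pausing to paused.
  return state === 'pausing' && event.status === 'paused' ? 'paused' : state;
}
```

A component could feed this function from both its POST handler and its SSE subscription, so the `pausing` label stays visible for as long as the current step keeps running.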
80
+
81
+ ### YELLOW (monitor)
82
+
83
+ **YELLOW-1: LIVE badge false negative during long LLM calls**
84
+
85
+ The 60-second window is tight for steps with long LLM responses and no tool calls. Users may see the LIVE badge disappear and reappear during normal operation. Monitor for user confusion. If this causes issues, widen the detection window to 3-5 minutes (at the cost of slower crash detection). The widening can be done with a constant change.
86
+
87
+ **YELLOW-2: Registry-event-log divergence in multi-process deployments**
88
+
89
+ If a future deployment separates the daemon from the console server, the DaemonRegistry will be empty in the console server while the daemon is running. This will cause the LIVE badge to not show (false negative) and control actions to fail. Monitor for deployment scenarios that separate these processes.
90
+
91
+ ---
92
+
93
+ ## Recommended Revisions
94
+
95
+ 1. **Add `AUTONOMOUS_DORMANCY_THRESHOLD_MS` to `console-service.ts`** (default 5 minutes). Use it when `isAutonomous === true` for dormant detection instead of `DORMANCY_THRESHOLD_MS`. Impact: ~5 lines in `projectSessionSummary()`.
96
+
97
+ 2. **Implement `pausing` as an intermediate status in `AutonomousControlStrip`**. The component's local state tracks `'idle' | 'pausing' | 'paused' | 'cancelling'`. Optimistic update on POST → `pausing`. Transition to `paused` on `daemon-status-changed` SSE event. Impact: ~20 lines in `AutonomousControlStrip.tsx`.
98
+
99
+ 3. **Absorb C3 features: add `confidenceSignal` to `ConsoleSessionSummary` and `statusFilter: 'autonomous'` to session list filter chips**. These directly serve the primary user job (post-execution verification) at near-zero cost.
100
+
101
+ 4. **Add ORANGE-1 and ORANGE-2 revisions to the implementation plan** before coding begins. Each is a small change (roughly 5-20 lines).
102
+
103
+ ---
104
+
105
+ ## Residual Concerns
106
+
107
+ 1. **The heartbeat interval is implicit.** The daemon emits heartbeats "at each tool call result boundary" -- but this is a behavioral contract between the daemon and the console, not enforced by the schema. If the daemon implementation misses a heartbeat, the LIVE badge degrades silently. Consider making the heartbeat frequency a documented invariant in the daemon's implementation spec.
108
+
109
+ 2. **The `DaemonEntry` type is in the backend.** The frontend's `ConsoleSessionSummary` has `isAutonomous: boolean` and the SSE event carries `status: DaemonEntryStatus` -- but the frontend has no way to validate that the status values it receives match the backend enum. A shared type definition or string literal union in `api/types.ts` would enforce this without a code generation step.
110
+
111
+ 3. **Control endpoint idempotency is unspecified.** Should POST /pause on an already-paused session return 200 (idempotent) or 409 (conflict)? The same question applies to POST /cancel on an already-cancelling session. The implementation should define and document these edge cases before coding. Recommendation: 200 for idempotent same-state requests; 409 for impossible transitions (e.g., resuming a cancelling session).
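The idempotency recommendation in the last concern could be captured in a single decision function. This is a sketch under assumptions: the four `DaemonStatus` values are guesses (the review only says the union has four values), and treating every non-cancel action on a cancelling session as impossible goes slightly beyond the one example given.

```typescript
// Hypothetical status and action unions; the real DaemonEntryStatus may differ.
type DaemonStatus = 'running' | 'pausing' | 'paused' | 'cancelling';
type ControlAction = 'pause' | 'resume' | 'cancel';

// 200 for valid or same-state (idempotent) requests; 409 for impossible transitions.
function controlResponseCode(
  status: DaemonStatus,
  action: ControlAction,
): 200 | 409 {
  // Assumed rule: once cancelling, only a repeated cancel is accepted.
  if (status === 'cancelling' && action !== 'cancel') return 409;
  return 200;
}
```

Declaring the status union once in a shared module such as `api/types.ts`, as the second concern suggests, would let the frontend and backend compile against the same literals.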