@exaudeus/workrail 3.36.0 → 3.37.0

Files changed (44)
  1. package/dist/config/config-file.js +2 -0
  2. package/dist/console-ui/assets/{index-n8cJrS4v.js → index-o-p__sHJ.js} +1 -1
  3. package/dist/console-ui/index.html +1 -1
  4. package/dist/daemon/workflow-runner.d.ts +1 -0
  5. package/dist/daemon/workflow-runner.js +3 -6
  6. package/dist/manifest.json +23 -15
  7. package/dist/trigger/notification-service.d.ts +42 -0
  8. package/dist/trigger/notification-service.js +164 -0
  9. package/dist/trigger/trigger-listener.js +7 -1
  10. package/dist/trigger/trigger-router.d.ts +3 -1
  11. package/dist/trigger/trigger-router.js +4 -1
  12. package/docs/design/agent-behavior-patterns-discovery.md +312 -0
  13. package/docs/design/agent-engine-communication-discovery.md +390 -0
  14. package/docs/design/agent-loop-architecture-alternatives-discovery.md +531 -0
  15. package/docs/design/agent-loop-error-handling-contract.md +238 -0
  16. package/docs/design/complete-step-approach-validation-discovery.md +344 -0
  17. package/docs/design/daemon-stuck-detection-discovery.md +174 -0
  18. package/docs/design/mcp-server-disconnect-discovery.md +245 -0
  19. package/docs/design/mcp-server-epipe-crash.md +198 -0
  20. package/docs/design/notification-design-candidates.md +131 -0
  21. package/docs/design/notification-design-review.md +84 -0
  22. package/docs/design/notification-implementation-plan.md +181 -0
  23. package/docs/design/spawn-agent-failure-modes.md +161 -0
  24. package/docs/design/spawn-agent-result-handling-implementation-plan.md +186 -0
  25. package/docs/design/stdio-simplification-design-candidates.md +341 -0
  26. package/docs/design/stdio-simplification-design-review.md +93 -0
  27. package/docs/design/stdio-simplification-implementation-plan.md +317 -0
  28. package/docs/design/structured-output-tools-coexist-findings.md +288 -0
  29. package/docs/discovery/coordinator-script-design.md +745 -0
  30. package/docs/discovery/coordinator-ux-discovery.md +471 -0
  31. package/docs/discovery/spawn-agent-failure-modes.md +309 -0
  32. package/docs/discovery/workflow-selection-for-discovery-tasks.md +336 -0
  33. package/docs/discovery/worktrain-status-briefing.md +325 -0
  34. package/docs/discovery/worktrain-status-design-candidates.md +202 -0
  35. package/docs/discovery/worktrain-status-design-review-findings.md +86 -0
  36. package/docs/ideas/backlog.md +608 -0
  37. package/docs/ideas/daemon-structured-output-vs-tool-calls.md +344 -0
  38. package/docs/ideas/design-candidates-backlog-consolidation.md +85 -0
  39. package/docs/ideas/design-review-findings-backlog-consolidation.md +39 -0
  40. package/docs/ideas/implementation_plan_backlog_consolidation.md +117 -0
  41. package/docs/plans/authoring-doc-staleness-enforcement-candidates.md +251 -0
  42. package/docs/plans/authoring-doc-staleness-enforcement-review.md +99 -0
  43. package/docs/plans/authoring-doc-staleness-enforcement.md +463 -0
  44. package/package.json +1 -1
@@ -0,0 +1,325 @@
1
+ # WorkTrain Status Briefing -- Discovery
2
+
3
+ ## Artifact Strategy
4
+
5
+ This document is a human-readable record of the discovery. It is NOT execution truth.
6
+
7
+ - **Execution truth lives in:** WorkRail session notes and context variables (survive chat rewinds)
8
+ - **This doc is for:** reading, sharing, reviewing -- a narrative artifact
9
+ - **Do not rely on this doc** for workflow resumption -- use WorkRail session state instead
10
+
11
+ **Capabilities confirmed:**
12
+ - File system access: available (Read, Glob, Grep, Bash)
13
+ - Delegation (WorkRail Executor subagents): available via nested subagent tool
14
+ - Web browsing: not probed (not needed -- all sources are local files)
15
+
16
+ ---
17
+
18
+ ## Context / Ask
19
+
20
+ **Stated goal (original):** Discovery: what data exists today that a 'worktrain status' plain-English briefing command could use -- and what's the gap between available data and what a user needs to feel informed?
21
+
22
+ **User context:** The user wants WorkTrain to answer 'what are you doing and why' in plain language, the way they can already ask Claude Code. The aim is to replace the chat interface with something WorkTrain can answer autonomously.
23
+
24
+ **Reframed problem:** WorkTrain cannot explain its own current state and intent in plain language, forcing users to either tolerate opacity or switch to an interactive chat interface to understand what is happening and why.
25
+
26
+ ---
27
+
28
+ ## Path Recommendation
29
+
30
+ **Recommended path: `full_spectrum`**
31
+
32
+ **Rationale:** The stated goal is a solution statement (a CLI command), but the task is primarily empirical -- we need to read real data files to determine whether data poverty or rendering is the binding constraint. `design_first` alone would miss the empirical grounding. `landscape_first` alone would not resolve the solution-vs-problem ambiguity. `full_spectrum` combines landscape grounding (what data actually exists) with reframing (is a pull command even the right mechanism, how does this relate to `worktrain talk`).
33
+
34
+ ---
35
+
36
+ ## Constraints / Anti-goals
37
+
38
+ **Core constraints:**
39
+ - Must work with data sources that exist today (no schema changes as a prerequisite for MVP)
40
+ - Must be implementable by a single developer as a CLI subcommand
41
+ - Must not require a running LLM for the minimum viable version (plain-text rendering only)
42
+
43
+ **Anti-goals:**
44
+ - Do not design a full observability platform -- this is a user-facing briefing, not a debug tool
45
+ - Do not conflate `worktrain status` with `worktrain talk` until the relationship is explicitly resolved
46
+ - Do not build notification/push infrastructure as part of this feature
47
+
48
+ ---
49
+
50
+ ## Landscape Packet
51
+
52
+ ### Existing CLI Commands (`src/cli-worktrain.ts`)
53
+
54
+ | Command | What it does | Data source |
55
+ |---------|-------------|-------------|
56
+ | `init` | Guided setup wizard | Creates files |
57
+ | `tell <msg>` | Queue a message for the daemon | Writes `~/.workrail/message-queue.jsonl` |
58
+ | `inbox` | Read daemon messages | Reads/marks `~/.workrail/outbox.jsonl` |
59
+ | `spawn` | Start a workflow session non-interactively | HTTP POST `/api/v2/auto/dispatch` |
60
+ | `await` | Block until sessions complete | HTTP GET `/api/v2/sessions/:id` (polling) |
61
+ | `console` | Start the console UI HTTP server | Serves `dist/console-ui/` |
62
+ | `daemon` | Start/manage the daemon | launchd plist + daemon process |
63
+ | `logs` | Display daemon event log (follow mode) | `~/.workrail/events/daemon/<date>.jsonl` |
64
+ | `status <sessionId>` | Health summary for a session | `~/.workrail/events/daemon/<today>.jsonl` |
65
+
66
+ **Key finding:** A `status` command already exists -- but it requires a session ID and only reports mechanical health metrics (LLM turn count, failure rate, stuck detection), not a human-readable "what are you doing and why" briefing. There is no command that lists all active sessions with plain-English descriptions.
67
+
68
+ ### HTTP API (`src/v2/usecases/console-routes.ts`)
69
+
70
+ | Route | Returns |
71
+ |-------|---------|
72
+ | GET `/api/v2/sessions` | Session list (via ConsoleService) |
73
+ | GET `/api/v2/sessions/:id` | Full session detail with DAG |
74
+ | GET `/api/v2/sessions/:id/nodes/:nodeId` | Node detail |
75
+ | GET `/api/v2/workflows` | Workflow catalog |
76
+ | GET `/api/v2/worktrees` | Git worktrees with session counts |
77
+ | GET `/api/v2/triggers` | Registered triggers |
78
+ | POST `/api/v2/auto/dispatch` | Fire-and-forget session start |
79
+ | GET `/api/v2/workspace/events` | SSE change stream |
80
+
81
+ **Key finding:** The HTTP API exposes DAG position and workflow catalog, but does NOT expose goal text, trigger metadata, queue state, event history, or session health metrics. A briefing command reading only the API would lack the "what" (goal) and "why" (trigger reason).
82
+
83
+ ### Daemon Event Log (`~/.workrail/events/daemon/`)
84
+
85
+ One JSONL file per day. Schema of key event types:
86
+
87
+ **`session_started`**: `{ kind, sessionId, workflowId, workspacePath, ts }` -- identifies which workflow is running.
88
+
89
+ **`trigger_fired`**: `{ kind, triggerId, workflowId, ts }` -- identifies what caused the session to start.
90
+
91
+ **`tool_called`**: `{ kind, sessionId, toolName, summary, ts }` -- last tool call (useful for "stuck" detection and last activity).
92
+
93
+ **`step_advanced`**: `{ kind, sessionId, ts }` -- step count increment, but NO step name.
94
+
95
+ **What's present:** workflow ID, trigger ID, tool names/summaries, timestamps.
96
+
97
+ **What's absent:** goal text, step names, plain-language descriptions, estimated time remaining, human-readable status.
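The four event shapes listed above could be modelled as a discriminated union keyed on `kind`. The sketch below is built only from the field lists in this section: the type names, the `Timestamp` alias, and the helpers `parseDaemonEventLine` and `lastToolCall` are hypothetical, and the real log may carry more event kinds and fields.

```typescript
// Assumption: the format of `ts` is not stated in the document.
type Timestamp = string | number;

// Hypothetical union covering only the four event kinds described above.
type DaemonEvent =
  | { kind: "session_started"; sessionId: string; workflowId: string; workspacePath: string; ts: Timestamp }
  | { kind: "trigger_fired"; triggerId: string; workflowId: string; ts: Timestamp }
  | { kind: "tool_called"; sessionId: string; toolName: string; summary: string; ts: Timestamp }
  | { kind: "step_advanced"; sessionId: string; ts: Timestamp };

type ToolCalledEvent = Extract<DaemonEvent, { kind: "tool_called" }>;

const KNOWN_KINDS = new Set(["session_started", "trigger_fired", "tool_called", "step_advanced"]);

// Parses one JSONL line; returns null for blank lines, malformed JSON, or unknown kinds.
// Only `kind` is checked here; a real reader would validate the remaining fields too.
function parseDaemonEventLine(line: string): DaemonEvent | null {
  const trimmed = line.trim();
  if (trimmed === "") return null;
  let parsed: unknown;
  try {
    parsed = JSON.parse(trimmed);
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const kind = (parsed as { kind?: unknown }).kind;
  if (typeof kind !== "string" || !KNOWN_KINDS.has(kind)) return null;
  return parsed as DaemonEvent;
}

// Returns the most recent tool call for a session, assuming lines are in chronological order.
function lastToolCall(lines: readonly string[], sessionId: string): ToolCalledEvent | null {
  let last: ToolCalledEvent | null = null;
  for (const line of lines) {
    const ev = parseDaemonEventLine(line);
    if (ev !== null && ev.kind === "tool_called" && ev.sessionId === sessionId) {
      last = ev;
    }
  }
  return last;
}
```

A reader like this gives "last activity" for a session but, as noted above, no goal text and no step name.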
98
+
99
+ ### Session Manifests (`~/.workrail/data/sessions/<id>/`)
100
+
101
+ Each session directory contains:
102
+ - `manifest.jsonl` -- segment metadata and snapshot pointers only (no semantic data)
103
+ - `events/` -- JSONL files with the actual event log per segment
104
+
105
+ **Goal text found:** in the `context_set` event (`data.context.goal`), which appears early in each session's event log. Quality is high -- full sentences like:
106
+ > "Discovery (user experience angle): what should the first coordinator script template look like from the user's perspective -- how does someone invoke it, what do they see, what does it produce?"
107
+
108
+ **Step names found:** ONLY in snapshot files (`data.enginePayload.engineState.pending.step.stepId`), not in the event log. Example: `"phase-0-reframe"`.
109
+
110
+ **What's present:** goal text (in `context_set`), workflow ID (in `run_started`), observations (repo root, branch, etc.), step name (in snapshots).
111
+
112
+ **What's absent:** step index / total step count, trigger provenance (who/what started the session), estimated time remaining.
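Pulling the goal out of a session's event log might look like the sketch below. It assumes each line of a segment file is a JSON object with a `kind` discriminator and the goal at `data.context.goal`; only the `data.context.goal` path and the `context_set` name come from this document, while the `kind` field name and the helper names `extractGoal` and `readGoal` are assumptions.

```typescript
// Goal is modelled as a small union so "no goal recorded" is explicit rather than an empty string.
type Goal = { kind: "set"; value: string } | { kind: "not_set" };

// Reads the goal from a single parsed event, or undefined if the event does not carry one.
function readGoal(ev: unknown): string | undefined {
  if (typeof ev !== "object" || ev === null) return undefined;
  // Assumption: events use a `kind` field to name the event type.
  const e = ev as { kind?: unknown; data?: { context?: { goal?: unknown } } };
  if (e.kind !== "context_set") return undefined;
  const goal = e.data?.context?.goal;
  return typeof goal === "string" && goal.trim() !== "" ? goal : undefined;
}

// Scans JSONL text and returns the first goal found, skipping blank or malformed lines.
function extractGoal(jsonlText: string): Goal {
  for (const line of jsonlText.split("\n")) {
    if (line.trim() === "") continue;
    let ev: unknown;
    try {
      ev = JSON.parse(line);
    } catch {
      continue;
    }
    const goal = readGoal(ev);
    if (goal !== undefined) return { kind: "set", value: goal };
  }
  return { kind: "not_set" };
}
```

Taking the first match fits the observation that `context_set` appears early in each log; a reader that wanted the most recent goal would keep scanning and return the last match instead.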
113
+
114
+ ### Queue and Outbox Files
115
+
116
+ - `~/.workrail/message-queue.jsonl` -- **does not exist**
117
+ - `~/.workrail/outbox.jsonl` -- **does not exist**
118
+
119
+ The `tell` and `inbox` commands reference these files, but neither is present in the current runtime. The push/notification path that Assumption 1 raised as a possible alternative does not exist in practice.
120
+
121
+ ### Backlog Design Specs (`docs/ideas/backlog.md`)
122
+
123
+ Three highly relevant sections found:
124
+
125
+ **"Live status briefings" (Apr 15, 2026):** Full spec for `worktrain status --workspace <name>`. Describes a `build-status-briefing` routine (not a full workflow -- a single fast step) that reads: active sessions from session store (step, duration), queue state from `queue.jsonl`, recent completions from merge audit log, blocked items, milestone dependencies. Sample output shows 3 active sessions with plain-English descriptions, queue top-5, recently completed items, blocked/waiting items, upcoming milestones. **Key insight:** the spec says sessions need a brief "plain English description" maintained separately -- either extracted from goal text or generated when enqueued.
126
+
127
+ **"WorkTrain analytics" (Apr 15, 2026):** Analytics dashboard spec -- volume stats, time saved estimates, quality metrics. A different feature from status; concerns aggregated historical data, not live state.
128
+
129
+ **"Interactive ideation" (Apr 15, 2026):** `worktrain talk` spec. A conversational loop workflow that starts with a synthesized context bundle (session outcomes, open PRs, backlog items, in-flight agent state). The spec explicitly says: "This is also what the `worktrain talk` session uses as its opening context -- before any conversation, WorkTrain gives itself a briefing on the current state so it can answer questions accurately." Status IS the context bundle for talk.
130
+
131
+ ### Assumption Resolution
132
+
133
+ **Assumption 1 (pull vs push):** Resolved -- `message-queue.jsonl` and `outbox.jsonl` do not exist. There is no push infrastructure. A pull command is currently the only viable path.
134
+
135
+ **Assumption 2 (data exists vs data poverty):** Partially resolved. The goal text IS there (high quality, in `context_set` events). But step names require reading snapshots (not just events), step counts are not tracked anywhere, and trigger provenance is only in daemon event logs (not session events). This is a **medium-effort rendering problem** that also needs some targeted data-gap filling.
136
+
137
+ **Assumption 3 (status vs talk):** Resolved. The backlog spec explicitly states that status IS the opening context bundle for `worktrain talk`. They are the same data, different surfaces: status is read-only plain text; talk is interactive with LLM. Building status first is the right path -- it produces the context bundle that talk will consume.
138
+
139
+ ### Contradictions Found
140
+
141
+ 1. The existing `worktrain status <sessionId>` command reads daemon event logs, but new session data lives in the per-session event store (`~/.workrail/data/sessions/`). These are two different storage systems. A unified briefing needs to read from both -- or just from the session store, which has more data.
142
+
143
+ 2. The step name is only in snapshots, not event logs. The existing `status` command (which reads event logs) cannot currently report which step a session is on.
144
+
145
+ ### Evidence Gaps
146
+
147
+ - The ConsoleService internals (`src/v2/domain/console-service.ts` or similar) were not audited -- the actual shape of session detail returned by the HTTP API is unknown.
148
+ - Queue infrastructure (`queue.jsonl`, the trigger queue) was not fully audited.
149
+ - Snapshot read performance (reading snapshot files for step names) is unknown.
150
+
151
+ ---
152
+
153
+ ## Problem Frame Packet
154
+
155
+ ### Users / Stakeholders
156
+
157
+ **Primary user:** A developer running WorkTrain autonomously -- has 1-10 sessions active at any given time, may check in after an hour or two away. Needs to quickly answer: "what is happening, is it going well, what's next?"
158
+
159
+ **Secondary user:** The developer at the moment of starting a new session -- wants confirmation that WorkTrain understood the goal and is executing the right workflow.
160
+
161
+ **Tertiary (future):** The `worktrain talk` conversational interface -- needs a pre-built context bundle to start an informed conversation without re-reading every session.
162
+
163
+ ### Jobs / Outcomes
164
+
165
+ 1. **Ambient awareness:** Without actively monitoring, understand what WorkTrain is working on at a glance (< 10 seconds).
166
+ 2. **Intervention triage:** Quickly identify whether any session is stuck, failing, or needs human input -- vs running fine and needing nothing.
167
+ 3. **Session grounding:** Before asking a follow-up question or queuing next work, get oriented on what's already in flight.
168
+ 4. **Context seeding:** Provide WorkTrain's own conversational sessions with a pre-built briefing so talk starts informed.
169
+
170
+ ### Pains / Tensions
171
+
172
+ **Pain 1 -- Identity opacity:** Session IDs like `sess_bumu5ljx...` are meaningless. The user cannot tell which session is which without opening the console.
173
+
174
+ **Pain 2 -- State opacity:** The existing `worktrain status <sessionId>` reports health metrics (LLM turns, tool call counts) not semantic state ("I am on step 4 of 8, writing integration tests").
175
+
176
+ **Pain 3 -- No aggregate view:** There is no command that lists all running sessions with anything more descriptive than their IDs.
177
+
178
+ **Pain 4 -- Two storage systems:** Daemon event log vs per-session store. Bridging them requires either reading two systems or accepting one system's incomplete picture.
179
+
180
+ **Tension 1 -- Freshness vs complexity:** Getting step names requires reading snapshot files (content-addressed, not indexed). This is safe but adds read complexity. Without it, step names are absent.
181
+
182
+ **Tension 2 -- Pull vs completeness:** A pull command captures state at a moment in time. If a session finishes between the user looking away and running status, it disappears from the output. Recent completions need a separate read.
183
+
184
+ ### Success Criteria
185
+
186
+ 1. A user with 3 active sessions reads `worktrain status` and correctly names what each session is working on -- without opening the console.
187
+ 2. A user can identify a stuck session from status output (no recent tool calls, long elapsed time).
188
+ 3. Output is valid and informative with zero active sessions (graceful empty state).
189
+ 4. Output correctly reflects the current step name for sessions that have advanced past step 1.
190
+ 5. A developer implements the command reading only today's data sources without schema migration.
191
+
192
+ ### HMW Questions
193
+
194
+ - **HMW:** How might we surface goal text from `context_set` events without requiring the user to know the session event schema?
195
+ - **HMW:** How might we provide step context ("step 4 of 8") when step count is not tracked -- by reading the workflow definition for total step count and the snapshot for current step?
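The second question could be approached as in the sketch below, assuming the workflow definition can be flattened into an ordered list of step IDs. That flattening is an assumption: the document does not describe the workflow definition format, and workflows with loops or branches would make "N of M" approximate at best. The names `stepPosition` and `formatStep` are hypothetical.

```typescript
// Position of the pending step within a workflow, or "unknown" when it cannot be located.
type StepPosition =
  | { kind: "known"; index: number; total: number }
  | { kind: "unknown" };

// orderedStepIds: hypothetical flattened step list taken from the workflow definition.
// pendingStepId: the current step ID as read from the session snapshot.
function stepPosition(orderedStepIds: readonly string[], pendingStepId: string): StepPosition {
  const i = orderedStepIds.indexOf(pendingStepId);
  if (i === -1) return { kind: "unknown" };
  return { kind: "known", index: i + 1, total: orderedStepIds.length };
}

// Renders the step name, adding "(step N of M)" only when the position is known.
function formatStep(pendingStepId: string, pos: StepPosition): string {
  return pos.kind === "known"
    ? `${pendingStepId} (step ${pos.index} of ${pos.total})`
    : pendingStepId;
}
```

Falling back to the bare step name keeps the output valid when the step ID is missing from the definition, for example after a workflow has been edited mid-session.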
196
+
197
+ ### Primary Framing Risk
198
+
199
+ **The framing assumes the `context_set` goal text is always the right "what and why" for a session.** But the goal field is set once at session start and never updated. If a session's direction changed mid-run (e.g., it pivoted based on discoveries), the goal text may be stale or misleading. If a significant fraction of sessions have stale goals, the status briefing becomes unreliable exactly when the user most needs it (complex, multi-step sessions). **Evidence to watch:** sessions with many `context_set` updates vs sessions where goal is set once and never revised.
200
+
201
+ ---
202
+
203
+ ## Candidate Directions
204
+
205
+ ### Generation Expectations (before candidates are produced)
206
+
207
+ This is a `full_spectrum` pass. Candidates must:
208
+ 1. **Reflect landscape constraints** -- not invent new data sources; only use what exists today (session store, daemon event log, HTTP API, workflow catalog, snapshot files)
209
+ 2. **Span implementation depth** -- at least one minimal/fast candidate (what can be done in a day), at least one fuller candidate (complete per the backlog spec)
210
+ 3. **Address the storage system choice** -- each candidate must explicitly state which storage it reads from (session store vs daemon log vs HTTP API) and accept the tradeoffs
211
+ 4. **Treat status-as-talk-bundle as a design constraint** -- at least one candidate must show how the status output feeds into `worktrain talk`
212
+ 5. **Do NOT require new infrastructure** -- no new event emission, no queue files, no schema changes as a prerequisite
213
+
214
+ *(Candidates to be populated by injected routine)*
215
+
216
+ ---
217
+
218
+ ## Challenge Notes
219
+
220
+ ### Assumption 1: A pull command is the right mechanism
221
+ - Might be wrong: push/notification may already serve the 'feel informed' goal
222
+ - Evidence needed: outbox/message-queue usage patterns, event log completions without user awareness
223
+
224
+ ### Assumption 2: Data exists -- this is a rendering problem
225
+ - Highest-risk assumption: event logs record mechanical facts, not semantic intent
226
+ - Evidence needed: direct inspection of session manifest 'goal' field quality
227
+
228
+ ### Assumption 3: Status and talk are architecturally distinct
229
+ - Might be wrong: `worktrain talk` may already cover 'what are you working on' as a use case
230
+ - Evidence needed: backlog.md spec for both features
231
+
232
+ ---
233
+
234
+ ## Resolution Notes
235
+
236
+ *(To be populated)*
237
+
238
+ ---
239
+
240
+ ## Decision Log
241
+
242
+ ### Selected Direction: Candidate A (CLI formatter over SessionSummaryProviderPort)
243
+
244
+ **Why it won:**
245
+ 1. The data assembly is already done in `HealthySessionSummary` -- `sessionTitle` (goal), `pendingStepId` (step name), `lastModifiedMs` (stuck detection). No new projections needed.
246
+ 2. No daemon required -- works file-only, unlike Candidate B.
247
+ 3. Best-fit scope -- shippable in 1 day, satisfies 4/5 success criteria immediately.
248
+ 4. Correct architecture -- uses v2 projection layer, neverthrow, DI. Resolves the philosophy conflict (existing `status` command uses daemon log; new command uses v2 layer).
249
+ 5. The typed `StatusBriefingV1` intermediate type costs < 30 minutes but enables future `worktrain talk` integration without duplication.
250
+
251
+ **Why Candidate B lost:**
252
+ - Requires running daemon (usability regression for core 'check what's happening' use case)
253
+ - The 'reuse by talk' benefit is speculative -- talk doesn't exist yet
254
+ - 2-3 days implementation vs 1 day, same MVP output
255
+
256
+ **Why Candidate C was rejected:**
257
+ - Perpetuates daemon-log-reading pattern (wrong architecture)
258
+ - No goal text in output without changing the existing command's data source
259
+ - 'Architectural fixes over patches' principle directly violated
260
+
261
+ ### Challenge outcome
262
+ The one genuine technical risk (port accessibility from CLI context) was investigated. `LocalSessionSummaryProviderV2` requires 4 ports (directoryListing, dataDir, sessionStore, snapshotStore). All are local file I/O adapters, instantiatable without the DI container. A small factory function `createStandaloneSessionSummaryProvider(dataDir: string)` resolves this cleanly. Challenge failed to kill Candidate A.
263
+
264
+ ### Switch triggers
265
+ - If `worktrain talk` is prioritized in the next sprint: add the HTTP route as a parallel PR (Candidate B shape), reuse `buildStatusBriefing()` from both sides
266
+ - If step count ('of N') is a hard requirement: add workflow catalog read to `buildStatusBriefing()` (still Candidate A, small scope addition)
267
+ - If the port factory wiring turns out to require > 2 hours: consider calling the HTTP API instead (Candidate B shape for CLI, removing daemon dependency by starting the server if not running)
268
+
269
+ ---
270
+
271
+ ## Final Summary
272
+
273
+ ### Recommendation: Candidate A -- CLI formatter over SessionSummaryProviderPort
274
+
275
+ **Confidence: HIGH**
276
+
277
+ #### What to build (Sprint 1 -- ~1 day)
278
+
279
+ 1. **New file:** `src/v2/projections/status-briefing.ts`
280
+ - Types: `StatusBriefingV1`, `ActiveSessionBriefing` (with discriminated union for goal: `{ kind: 'set'; value: string } | { kind: 'not_set' }`)
281
+ - Pure function: `buildStatusBriefing(summaries: HealthySessionSummary[]): StatusBriefingV1` -- no I/O, fully testable
282
+ - Stuck detection: `isStuck = (now - lastModifiedMs) > STUCK_THRESHOLD_MS` (suggest 15 min)
283
+
284
+ 2. **New CLI subcommand:** `worktrain status` (no positional arg required)
285
+ - Wires `LocalSessionSummaryProviderV2` via a small `createStandaloneSessionSummaryProvider(dataDir)` factory
286
+ - Calls `buildStatusBriefing()` with loaded summaries
287
+ - Formats to terminal output: one block per active session (goal, step, elapsed time, stuck warning)
288
+
289
+ 3. **Rename existing command:** `worktrain status <sessionId>` → `worktrain health <sessionId>`
290
+ - Prevents naming confusion
291
+ - Part of the same PR
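A minimal sketch of the pure projection in item 1 follows. The `HealthySessionSummaryLike` interface is a stand-in containing only the fields this document names, not the real type from `resume-ranking.ts`, and the field layout of `StatusBriefingV1` and `ActiveSessionBriefing` is assumed. The sketch also departs from the signature quoted above by taking `nowMs` as a second parameter, so that stuck detection stays free of clock reads.

```typescript
// Stand-in for HealthySessionSummary; nullability of the fields is an assumption.
interface HealthySessionSummaryLike {
  sessionId: string;
  sessionTitle: string | null;
  pendingStepId: string | null;
  isComplete: boolean;
  lastModifiedMs: number;
}

type Goal = { kind: "set"; value: string } | { kind: "not_set" };

// Hypothetical shapes for the briefing types named in the plan.
interface ActiveSessionBriefing {
  sessionId: string;
  goal: Goal;
  stepId: string | null;
  idleMs: number;
  isStuck: boolean;
}

interface StatusBriefingV1 {
  generatedAtMs: number;
  active: ActiveSessionBriefing[];
}

// 15 minutes, the threshold suggested in the plan.
const STUCK_THRESHOLD_MS = 15 * 60 * 1000;

// Pure projection: no I/O and no clock access; `nowMs` is supplied by the caller.
function buildStatusBriefing(
  summaries: readonly HealthySessionSummaryLike[],
  nowMs: number,
): StatusBriefingV1 {
  const active = summaries
    .filter((s) => !s.isComplete)
    .map((s): ActiveSessionBriefing => {
      const idleMs = nowMs - s.lastModifiedMs;
      const goal: Goal = s.sessionTitle
        ? { kind: "set", value: s.sessionTitle }
        : { kind: "not_set" };
      return {
        sessionId: s.sessionId,
        goal,
        stepId: s.pendingStepId,
        idleMs,
        isStuck: idleMs > STUCK_THRESHOLD_MS,
      };
    });
  return { generatedAtMs: nowMs, active };
}
```

Passing the clock in makes the stuck threshold testable with fixed timestamps; the CLI subcommand would call it with `Date.now()`.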
292
+
293
+ #### What the output looks like (sketch)
294
+
295
+ ```
296
+ WorkTrain [18 Apr 2026, 14:32]
297
+
298
+ ACTIVE (2 sessions)
299
+
300
+ ● wr.discovery
301
+ Discovery: what data exists today that a 'worktrain status' plain-English briefing command could use
302
+ Step: phase-3-synthesize Running 22 min
303
+
304
+ ● coding-task-workflow-agentic
305
+ Implement GitHub polling adapter for Issues/PRs without requiring webhooks
306
+ Step: phase-2-implement Running 28 min ⚠ no activity for 18 min
307
+
308
+ No items in queue. Use `worktrain logs` to see recent completions.
309
+ ```
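A terminal formatter producing roughly the layout sketched above might look like the following. The `SessionLine` shape is hypothetical: in particular `workflowId` and `runningMs` are assumed to be available, which this document does not confirm for `HealthySessionSummary`. The empty-state wording is also an assumption, added to cover the zero-sessions success criterion.

```typescript
// Hypothetical per-session input for the formatter.
interface SessionLine {
  workflowId: string;
  goal: string | null;
  stepId: string | null;
  runningMs: number;
  idleMs: number;
  isStuck: boolean;
}

// Whole minutes, rounded down.
function minutes(ms: number): number {
  return Math.floor(ms / 60000);
}

// Renders the briefing as plain text; `header` is the pre-formatted title line.
function formatBriefing(sessions: readonly SessionLine[], header: string): string {
  if (sessions.length === 0) {
    return `${header}\n\nNo active sessions.`;
  }
  const blocks = sessions.map((s) => {
    const warning = s.isStuck ? `  ⚠ no activity for ${minutes(s.idleMs)} min` : "";
    return [
      `● ${s.workflowId}`,
      `  ${s.goal ?? "(no goal recorded)"}`,
      `  Step: ${s.stepId ?? "(unknown)"}  Running ${minutes(s.runningMs)} min${warning}`,
    ].join("\n");
  });
  const noun = sessions.length === 1 ? "session" : "sessions";
  return [header, "", `ACTIVE (${sessions.length} ${noun})`, "", blocks.join("\n\n")].join("\n");
}
```

Keeping the formatter separate from the projection means the same briefing data could later be serialized as JSON for another consumer without touching the text layout.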
310
+
311
+ #### What's deferred (Sprint 2)
312
+
313
+ - **Step count ('of N'):** Read workflow catalog to add 'step 3 of 8' -- 1-2 hours
314
+ - **Recently completed:** Scan daemon event log for `session_completed` events from last 24h -- 2-3 hours
315
+ - **HTTP API route:** Add `GET /api/v2/status/briefing` returning `StatusBriefingV1` -- needed when `worktrain talk` is prioritized
316
+
317
+ #### Status vs Talk relationship (resolved)
318
+
319
+ `worktrain status` is the read-only text rendering of `StatusBriefingV1`. `worktrain talk` will use `StatusBriefingV1` as its opening context bundle. Status is built first; talk consumes the same type. They are not separate features -- they are two surfaces of the same data structure.
320
+
321
+ #### Residual risks
322
+
323
+ 1. **Port factory complexity:** `createStandaloneSessionSummaryProvider(dataDir)` must instantiate 4 local adapters. Low risk but unverified. Verify transitive deps at sprint start.
324
+ 2. **`worktrain talk` timeline:** If talk is imminent (< 2 sprints), add the HTTP API route in Sprint 1 alongside the CLI command. User decision.
325
+ 3. **Recently completed gap:** Most visible difference between MVP and the backlog vision. Users checking status after sessions complete will see an empty active list with no context about what just finished.
@@ -0,0 +1,202 @@
1
+ # WorkTrain Status Briefing -- Design Candidates
2
+
3
+ > Raw investigative material for main-agent synthesis. Not a final decision.
4
+
5
+ ---
6
+
7
+ ## Problem Understanding
8
+
9
+ ### Core tensions
10
+
11
+ **Tension 1: Existing daemon-log status vs v2 projection layer**
12
+
13
+ The existing `worktrain status <sessionId>` command reads `~/.workrail/events/daemon/<today>.jsonl` directly (string parsing, mechanical metrics only). The v2 architecture uses per-session event stores with typed pure-function projections. A new aggregate status command has two options: (a) extend the daemon-log reader (consistent with existing command, but less data and wrong architecture), or (b) use the v2 session store (more data, correct architecture, requires resolving port wiring from CLI context). Option (b) is clearly correct per the architecture, but it introduces a temporary inconsistency -- two `status` commands reading from different sources.
14
+
15
+ **Tension 2: Completeness vs read complexity**
16
+
17
+ A complete briefing needs: goal (from `context_set` event), step name (from snapshot file), step count (from workflow catalog), stuck signal (from lastModifiedMs). Each requires a different read operation. A simpler briefing reads only session events (goal + workflow ID). The v2 projection layer resolves most of this: `HealthySessionSummary` in `resume-ranking.ts` already contains `pendingStepId`, `sessionTitle`, `isComplete`, and `lastModifiedMs` -- the assembly is done.
18
+
19
+ **Tension 3: Pull-command completeness vs stateless design**
20
+
21
+ The backlog spec shows a 'RECENTLY COMPLETED' section as a key output element. A pull command reading current session state misses sessions that completed between status runs. No persistent completion index exists. Accepting a point-in-time view (active sessions only) satisfies 4/5 success criteria; adding recently-completed requires either a new completion index or reading the daemon log for `session_completed` events.
22
+
23
+ ### Likely seam
24
+
25
+ `SessionSummaryProviderPortV2` + `HealthySessionSummary` in `src/v2/projections/resume-ranking.ts`. This is the correct aggregation boundary -- it already loads all sessions, health-checks them, and returns typed summaries. A new briefing command adds a formatter layer on top of this port, not a new data path.
26
+
27
+ ### What makes this hard
28
+
29
+ A junior developer would: read daemon event log (wrong source, no goal text), produce session IDs instead of goal text, skip step names (miss the snapshot read), and miss that `HealthySessionSummary.sessionTitle` and `HealthySessionSummary.pendingStepId` already exist. The actual complexity is recognizing that the data assembly problem is solved -- only the formatter is missing.
30
+
31
+ ---
32
+
33
+ ## Philosophy Constraints
34
+
35
+ **Principles that apply directly:**
36
+ - **Errors are data (neverthrow):** individual session load failures must be `Result<ActiveSessionBriefing, LoadError>` not exceptions that abort the whole briefing
37
+ - **Ports/adapters DI:** session store access must be injected as a port, not hard-coded file reads
38
+ - **Pure functions for projection:** `buildStatusBriefing(summaries)` is a pure function, testable without I/O
39
+ - **Explicit domain types over primitives:** return type is `StatusBriefingV1`, not `string`
40
+ - **YAGNI with discipline:** do not add step-count tracking, completion indexing, time estimates before they are needed
41
+ - **Validate at boundaries, trust inside:** session loading validates at the port boundary; the formatter trusts `HealthySessionSummary` fields
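The "errors are data" principle can be illustrated with the sketch below, which partitions per-session load results so that one failed session does not abort the whole briefing. The local `Result` type is a simplified stand-in for neverthrow's `Result`, used here so the example has no dependencies; the fields of `LoadError` and `ActiveSessionBriefing` and the helper `collectBriefings` are hypothetical.

```typescript
// Simplified stand-in for neverthrow's Result; not the library's actual type.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

// Hypothetical field layouts; the document names these types but not their fields.
interface LoadError {
  sessionId: string;
  reason: string;
}

interface ActiveSessionBriefing {
  sessionId: string;
  goal: string;
}

// Splits load results into successes and failures instead of throwing on the first error.
function collectBriefings(
  results: readonly Result<ActiveSessionBriefing, LoadError>[],
): { sessions: ActiveSessionBriefing[]; failures: LoadError[] } {
  const sessions: ActiveSessionBriefing[] = [];
  const failures: LoadError[] = [];
  for (const r of results) {
    if (r.ok) {
      sessions.push(r.value);
    } else {
      failures.push(r.error);
    }
  }
  return { sessions, failures };
}
```

The caller can then render the healthy sessions and append a short note such as "1 session could not be read", which keeps the briefing useful even when a snapshot file is corrupt.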
42
+
43
+ **Conflicts:**
44
+ - The existing `status <sessionId>` command violates the 'use v2 projection layer' architectural direction. A new `status` command should NOT perpetuate this pattern.
45
+
46
+ ---
47
+
48
+ ## Impact Surface
49
+
50
+ - `src/cli-worktrain.ts` -- new subcommand added here
51
+ - `src/v2/projections/resume-ranking.ts` -- `HealthySessionSummary` type consumed (read-only)
52
+ - `src/v2/ports/session-summary-provider.port.ts` -- port consumed by new CLI command
53
+ - Future `worktrain talk` -- will consume the same `StatusBriefingV1` type or the same port
54
+ - `src/v2/usecases/console-routes.ts` -- not changed for MVP; may add status route later
55
+ - Existing `worktrain status <sessionId>` -- not changed; two status commands coexist (inconsistency noted)
56
+
57
+ ---
58
+
59
+ ## Candidates
60
+
61
+ ### Candidate A: Minimal -- CLI formatter over `SessionSummaryProviderPort`
62
+
63
+ **Summary:** Add a `worktrain status` CLI subcommand (no session ID required) that calls `SessionSummaryProviderPortV2.loadAll()`, filters to active (non-complete) sessions, and formats each `HealthySessionSummary` into 3-4 lines: goal, step, duration, stuck warning if applicable. Defines a typed `StatusBriefingV1` intermediate (pure data structure) even though no HTTP route exists yet.
64
+
65
+ **Tensions resolved:**
66
+ - Goal visibility: `HealthySessionSummary.sessionTitle` (derived from `context_set` goal) already present
67
+ - Step context: `HealthySessionSummary.pendingStepId` already present
68
+ - Health signal: `lastModifiedMs` available for stuck detection (> N minutes without update)
69
+ - Aggregate view: `loadAll()` returns all sessions, no ID required
70
+ - Architecture consistency: v2 projection layer, neverthrow, DI
71
+
72
+ **Tensions accepted:**
73
+ - No 'recently completed' section (only active sessions; completions require daemon log read or completion index)
74
+ - No queue state (queue.jsonl does not exist)
75
+ - No step count -- only step name (e.g., "phase-3-implement"), not "step 4 of 8"
76
+ - Status-talk reuse is partial: both will call the same port, but no shared HTTP contract yet
77
+
78
+ **Boundary:** New `buildStatusBriefing(summaries: HealthySessionSummary[]): StatusBriefingV1` pure function in `src/v2/projections/status-briefing.ts`. New `worktrain status` subcommand in `src/cli-worktrain.ts` that wires the port and calls the formatter.
79
+
80
+ **Why this boundary is best fit:** The data assembly is already done in the projection layer. The briefing function is a pure, testable formatter -- not a new data path. Minimal surface area, minimal risk.
81
+
82
+ **Failure mode:** `SessionSummaryProviderPortV2` may not be available from CLI context (only wired in the HTTP server's DI container). Mitigation: expose a `createStandaloneSessionSummaryProvider(dataDir)` factory that wires the port without a full server context.
83
+
84
+ **Repo pattern:** Follows `src/v2/projections/` pattern (pure projection functions). The port-wiring from CLI context is a new pattern (existing CLI commands either call HTTP API or read files directly). Small, well-bounded new pattern.
85
+
86
+ **Gains:** Shippable in < 1 day. Satisfies 4/5 success criteria. Typed return type enables future talk integration. No daemon required.
87
+
88
+ **Losses:** No recently-completed section. No queue. Step count absent.
89
+
90
+ **Scope judgment:** Best-fit for MVP.
91
+
92
+ **Philosophy fit:** Honors errors-as-data, pure functions, DI, explicit domain types. Honors YAGNI (no speculative infrastructure). Resolves the stated philosophy conflict by using v2 layer instead of daemon log.
93
+
94
+ ---
95
+
96
+ ### Candidate B: `GET /api/v2/status/briefing` HTTP endpoint + CLI consumer
97
+
98
+ **Summary:** Add `GET /api/v2/status/briefing` returning a typed `StatusBriefingV1` JSON object, consumed by both a CLI `worktrain status` formatter and later by `worktrain talk` as its opening context bundle. The HTTP route calls `ConsoleService.getStatusBriefing()`, which calls the same `SessionSummaryProviderPortV2` as Candidate A. The CLI subcommand calls the HTTP route (like `spawn`/`await`).
99
+
100
+ **Tensions resolved:**
101
+ - Status-talk reuse: `StatusBriefingV1` is the canonical shared contract; talk imports from the same endpoint
102
+ - Architecture layering: CLI talks to HTTP API, matching the `spawn`/`await` pattern
103
+ - Multiple consumers: web console can display a status widget without a separate data path
104
+
105
+ **Tensions accepted:**
106
+ - Requires a running daemon/console server (HTTP API unavailable if daemon is not running)
107
+ - More moving parts for the same MVP output (route + service method + DTO + CLI command)
108
+ - The 'reuse' benefit is speculative -- `worktrain talk` doesn't exist yet
109
+
110
+ **Boundary:** `StatusBriefingV1` in `src/v2/usecases/console-types.ts`, new `ConsoleService.getStatusBriefing()` method, new HTTP route, new CLI subcommand.
111
+
112
+ **Why this boundary is best fit:** If the canonical delivery mechanism is the HTTP API and multiple consumers exist, the API boundary is the right seam. But today there is only one consumer (the CLI), so this is premature optimization.
113
+
114
+ **Failure mode:** `worktrain status` becomes useless if the console server is not running. Candidate A works without a daemon; Candidate B does not. This is a usability regression for the core "check what's happening" use case.
115
+
116
+ **Repo pattern:** Follows `spawn`/`await` (CLI calls HTTP). Adds a new service method and route (standard pattern). Slightly too broad for the current moment.
117
+
118
+ **Gains:** Clean consumer contract for talk. Multiple consumers share one data path. Web console gets a status widget 'for free'.
119
+
120
+ **Losses:** Daemon dependency. Higher implementation cost (2-3 days vs 1 day). Premature given talk doesn't exist.
121
+
122
+ **Scope judgment:** Slightly too broad for MVP; right for the medium-term architecture if talk is imminent.
123
+
124
+ **Philosophy fit:** Honors explicit domain types, DI. The daemon dependency is a mild violation of 'validate at boundaries, trust inside' -- it introduces a runtime availability assumption that Candidate A avoids.
125
+
126
+ ---
127
+
128
+ ### Candidate C: Extend existing `worktrain status <sessionId>` with `--all` flag
129
+
130
+ **Summary:** Add `--all` flag to the existing `status` command that iterates sessions and renders health summaries for each.
131
+
132
+ **Tensions resolved:**
133
+ - Aggregate view (with `--all` flag)
134
+ - Consistency with existing command (same command name)
135
+
136
+ **Tensions accepted:**
137
+ - **Critically:** The existing command reads daemon event logs (mechanical metrics: LLM turns, tool call counts, failure rate). Adding `--all` to it would produce a list of sessions WITHOUT goal text and WITHOUT step names. The output would be "here are your N sessions: each ran X tool calls, Y LLM turns" -- useful for ops debugging, not for 'what are you doing and why'.
138
+ - The two-storage-system contradiction is not resolved; it's perpetuated.
139
+
140
+ **Failure mode:** Either (a) the output is inferior (no goal text) -- solving the wrong problem, or (b) the implementation is changed to read from v2 store instead -- which is functionally the same as Candidate A but with worse ergonomics (extending a misaligned command rather than a clean new one).
141
+
142
+ **Scope judgment:** Too narrow. Perpetuates the daemon-log-reading pattern that should not be extended.
143
+
144
+ **Philosophy fit:** Conflicts with 'architectural fixes over patches' -- extends a symptom location rather than the correct seam.
145
+
146
+ ---
147
+
148
+ ## Comparison and Recommendation
149
+
150
+ ### Matrix
151
+
152
+ | Criterion | A (CLI+Port) | B (HTTP route) | C (extend existing) |
153
+ |-----------|-------------|---------------|---------------------|
154
+ | Goal text in output | YES | YES | NO (without arch change) |
155
+ | Step name in output | YES | YES | NO (without arch change) |
156
+ | Aggregate view | YES | YES | YES |
157
+ | Status-talk reuse | PARTIAL | FULL | NO |
158
+ | Daemon required | NO | YES | NO |
159
+ | Implementation cost | 1 day | 2-3 days | 1 day (wrong output) |
160
+ | Architecture correct | YES | YES | NO |
161
+ | MVP satisfies user | 4/5 criteria | 4/5 criteria | 0/5 criteria |
162
+
163
+ ### Recommendation: Candidate A + typed `StatusBriefingV1` return type
164
+
165
+ **Candidate A** wins on best-fit scope, minimum viable implementation, daemon-independent operation, and correctness (goal text + step name in output). The one addition from B: define a typed `StatusBriefingV1` return type even though the HTTP route is not built yet. This adds < 30 minutes to the implementation and gives `worktrain talk` a named contract to consume without requiring HTTP.
166
+
167
+ **Concrete implementation plan:**
168
+ 1. New file: `src/v2/projections/status-briefing.ts` -- pure `buildStatusBriefing(summaries: HealthySessionSummary[]): StatusBriefingV1` function
169
+ 2. New types: `StatusBriefingV1`, `ActiveSessionBriefing` in the same file
170
+ 3. New CLI subcommand `status` (no positional arg) in `src/cli-worktrain.ts` -- note: renames the existing positional-arg `status <sessionId>` command to `health <sessionId>` to avoid ambiguity, or adds `status` as an alias
171
+ 4. Port wiring: `createStandaloneSessionSummaryProvider(dataDir: string)` factory if the port is not accessible from CLI context
172
+
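Steps 1 and 2 of the plan above could look roughly like the following. This is a sketch, not the package's implementation: the field lists of `HealthySessionSummary`, `ActiveSessionBriefing`, and `StatusBriefingV1` are assumptions (the design names only `sessionTitle`, `pendingStepId`, `lastModifiedMs`, and an `isComplete` filter), `formatBriefingLines` is a hypothetical formatter, and placing the active-session filter inside `buildStatusBriefing` rather than in the CLI command is also an assumption.

```typescript
// Hypothetical subset of HealthySessionSummary; sessionId and workflowId are assumed fields.
interface HealthySessionSummary {
  readonly sessionId: string;
  readonly workflowId: string;
  readonly sessionTitle: string | null; // derived from the context_set goal
  readonly pendingStepId: string | null;
  readonly lastModifiedMs: number;
  readonly isComplete: boolean;
}

// Assumed shape of one entry in the briefing.
interface ActiveSessionBriefing {
  readonly sessionId: string;
  readonly workflowId: string;
  readonly goal: string | null;
  readonly step: string | null;
  readonly lastModifiedMs: number;
}

// Assumed shape of the typed intermediate; a version tag leaves room for a later V2.
interface StatusBriefingV1 {
  readonly version: 1;
  readonly activeSessions: readonly ActiveSessionBriefing[];
}

// Pure assembly: no I/O, no clock reads, input is not mutated.
function buildStatusBriefing(summaries: readonly HealthySessionSummary[]): StatusBriefingV1 {
  const activeSessions = summaries
    .filter((s) => !s.isComplete)
    .map((s): ActiveSessionBriefing => ({
      sessionId: s.sessionId,
      workflowId: s.workflowId,
      goal: s.sessionTitle,
      step: s.pendingStepId,
      lastModifiedMs: s.lastModifiedMs,
    }));
  return { version: 1, activeSessions };
}

// Hypothetical formatter, kept separate from assembly.
// Falls back to workflowId when no goal was set for the session.
function formatBriefingLines(briefing: StatusBriefingV1): string[] {
  return briefing.activeSessions.map(
    (s) => `${s.goal ?? s.workflowId} [${s.step ?? "no pending step"}]`,
  );
}
```

Keeping the formatter apart from the assembly function is what would let a later consumer such as `worktrain talk` reuse the typed value without parsing CLI output.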
173
+ ---
174
+
175
+ ## Self-Critique
176
+
177
+ ### Strongest argument against this recommendation
178
+
179
+ Candidate B is architecturally cleaner for multi-consumer scenarios. If `worktrain talk` is planned for the next sprint, building the HTTP API route now avoids a refactor. The extra 1-2 days of implementation (2-3 days vs 1 day) are cheap compared to duplicated port-wiring if the CLI and talk both wire the provider independently.
180
+
181
+ ### Narrower option that lost
182
+
183
+ Candidate A without the typed return type (format directly to strings in the CLI command). Lost because: `worktrain talk` would need to re-implement the assembly logic, violating DRY. The typed intermediate is < 30 minutes of extra work and high future value.
184
+
185
+ ### Broader option and what evidence would justify it
186
+
187
+ Candidate B. Evidence required: backlog prioritization showing `worktrain talk` is the immediate next milestone (< 2 weeks), or confirmation that the web console UI needs a live status widget in the same timeframe.
188
+
189
+ ### Pivot conditions
190
+
191
+ - If `SessionSummaryProviderPortV2` cannot be wired from CLI context without significant boilerplate -- consider Candidate B (the HTTP server already wires it, so calling the API from CLI avoids duplicating DI logic)
192
+ - If the user confirms talk is imminent -- add the HTTP route now as part of the same PR
193
+ - If step count ('of N total steps') is a hard requirement -- add workflow catalog read to `buildStatusBriefing()` (still Candidate A shape, small scope addition)
194
+
195
+ ---
196
+
197
+ ## Open Questions for Main Agent
198
+
199
+ 1. Is `SessionSummaryProviderPortV2` accessible from a standalone CLI context, or is it only wired in the HTTP server's DI container? (determines whether a factory function is needed)
200
+ 2. Should the existing `worktrain status <sessionId>` be renamed to `worktrain health <sessionId>` to avoid command name collision, or should the new aggregate command use a different name (`worktrain status --all` or `worktrain ls`)?
201
+ 3. How important is the 'RECENTLY COMPLETED' section for the initial release? If important, a daemon-log scan for `session_completed` events from the last 24 hours is a feasible addition to the briefing.
202
+ 4. Is `worktrain talk` planned for the next sprint? If yes, Candidate B is worth the extra 1-2 days.
@@ -0,0 +1,86 @@
1
+ # WorkTrain Status Briefing -- Design Review Findings
2
+
3
+ > Raw review material for main-agent synthesis. Selected direction: Candidate A (CLI formatter over SessionSummaryProviderPort).
4
+
5
+ ---
6
+
7
+ ## Tradeoff Review
8
+
9
+ | Tradeoff | Accepted? | When it fails | Mitigation |
10
+ |----------|-----------|---------------|------------|
11
+ | No 'recently completed' section | YES | User checks status minutes after session completes; sees empty list with no context | Suggest `worktrain logs` in status output footer |
12
+ | No queue state | YES | User has 0 active sessions and is uncertain whether anything is pending | N/A -- queue.jsonl does not exist; nothing to read |
13
+ | No step count ('of N') | YES | User sees step name but has no sense of % complete on long tasks | Fast follow: read workflow catalog for total step count |
14
+ | Two status command variants coexist | CONDITIONAL | User types `worktrain status` expecting session ID prompt | Rename existing command to `worktrain health <id>` as part of same PR |
15
+
16
+ ---
17
+
18
+ ## Failure Mode Review
19
+
20
+ | Failure Mode | Handled? | Risk | Notes |
21
+ |-------------|----------|------|-------|
22
+ | Port wiring from CLI context | YES (factory pattern) | LOW | `createStandaloneSessionSummaryProvider(dataDir)` instantiates 4 concrete adapters without DI container |
23
+ | sessionTitle null for sessions without goal | YES (formatter fallback) | LOW | Fall back to workflowId as display name |
24
+ | Sessions from multi-day spans not visible | YES | LOW | `enumerateSessionsByRecency` scans full sessions directory, not today-only; filter `isComplete = false` handles active-only |
25
+
26
+ **Highest-risk failure mode:** Port wiring (factory pattern). Risk is LOW but unverified. Recommend verifying adapter transitive deps during implementation sprint planning.
27
+
28
+ ---
29
+
30
+ ## Runner-Up / Simpler Alternative Review
31
+
32
+ **Runner-up (Candidate B):** Nothing to borrow beyond the `StatusBriefingV1` typed intermediate, which is already included in Candidate A's recommendation.
33
+
34
+ **Simpler variant (format directly to strings):** Would satisfy the same success criteria as the selected design. Rejected because the typed intermediate (`StatusBriefingV1`) costs < 30 minutes and prevents duplication when `worktrain talk` is implemented.
35
+
36
+ **Even simpler variant (call HTTP API from CLI):** Fails on daemon-required criterion. Worse than selected design.
37
+
38
+ **Conclusion:** Selected design is the minimum useful shape. No hybrid improvements needed.
39
+
40
+ ---
41
+
42
+ ## Philosophy Alignment
43
+
44
+ | Principle | Status | Notes |
45
+ |-----------|--------|-------|
46
+ | Errors are data (neverthrow) | SATISFIED | `buildStatusBriefing()` returns `Result<StatusBriefingV1, BriefingError>`; individual session failures skipped gracefully |
47
+ | Explicit domain types over primitives | SATISFIED | `StatusBriefingV1`, `ActiveSessionBriefing` -- not ad-hoc strings |
48
+ | Compose small pure functions | SATISFIED | `buildStatusBriefing()` pure, formatter separate, port wiring in CLI |
49
+ | Validate at boundaries, trust inside | SATISFIED | Validation at port boundary; pure function trusts `HealthySessionSummary` |
50
+ | DI for boundaries | SATISFIED | Port injected into CLI command |
51
+ | Immutability by default | SATISFIED | Reading append-only event store |
52
+ | Architectural fixes over patches | SATISFIED | Uses v2 projection layer; does not extend daemon-log-reading pattern |
53
+ | YAGNI with discipline | SATISFIED | No HTTP route, no queue, no completion index; `StatusBriefingV1` type is a minimal seam not speculative infrastructure |
54
+ | Make illegal states unrepresentable | MILD TENSION | `goal: string \| null` -- null is not illegal but a discriminated union would be more explicit |
55
+ | Type safety as first line of defense | MILD TENSION | Null `sessionTitle` requires defensive null handling; type system enforces it but the caller must pattern-match |
56
+
57
+ ---
58
+
59
+ ## Findings
60
+
61
+ ### Yellow
62
+
63
+ **Y1: Goal field type could be more explicit**
64
+ `ActiveSessionBriefing.goal: string | null` is acceptable but `{ kind: 'set'; value: string } | { kind: 'not_set' }` would be more explicit per the 'make illegal states unrepresentable' principle. Low severity -- the null is handled either way, but the discriminated union makes the null case named and intentional.
65
+
66
+ **Y2: Command naming collision risk**
67
+ Existing `worktrain status <sessionId>` and new `worktrain status` (no args) would be disambiguated by commander.js argument presence, but help text and documentation will be confusing. Renaming the existing command to `worktrain health <id>` is a 5-minute change that removes this confusion entirely. Recommend including in the same PR.
68
+
69
+ **Y3: Step count is absent**
70
+ 'Step 4 of 8' is significantly more useful than 'phase-3-implement' for understanding how far along a session is. The workflow catalog is queryable. This should be added as a fast follow (not MVP blocker) since it requires one additional read operation per active session.
71
+
72
+ ---
73
+
74
+ ## Recommended Revisions
75
+
76
+ 1. **Include:** Rename `worktrain status <id>` to `worktrain health <id>` in the same PR as the new `worktrain status` aggregate command. (5 minutes, prevents naming confusion)
77
+ 2. **Consider:** Use `{ kind: 'set'; value: string } | { kind: 'not_set' }` for the goal field in `ActiveSessionBriefing`. (15 minutes, clearer semantics)
78
+ 3. **Fast follow:** Add workflow catalog read to `buildStatusBriefing()` to include step count ('step 3 of 8'). (1-2 hours, significant UX improvement)
79
+
80
+ ---
81
+
82
+ ## Residual Concerns
83
+
84
+ 1. **`worktrain talk` timeline unclear:** If talk is within 2 sprints, the HTTP API route (Candidate B) should be built now rather than after the fact. This is context-dependent -- the user should confirm.
85
+ 2. **Port factory complexity unverified:** The `createStandaloneSessionSummaryProvider(dataDir)` factory is conceptually straightforward but not implemented. Transitive adapter dependencies could add complexity. Low risk but should be verified at sprint start.
86
+ 3. **Recently completed sessions:** The MVP has no 'recently completed' section. This is the most user-visible gap in the backlog spec's vision. A daemon-log scan for `session_completed` events from the last 24 hours is a feasible addition in Sprint 2.