npm - @exaudeus/workrail - Versions diffs - 3.36.0 → 3.37.0 - Mend

@exaudeus/workrail 3.36.0 → 3.37.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (44) hide show

package/dist/config/config-file.js +2 -0
package/dist/console-ui/assets/{index-n8cJrS4v.js → index-o-p__sHJ.js} +1 -1
package/dist/console-ui/index.html +1 -1
package/dist/daemon/workflow-runner.d.ts +1 -0
package/dist/daemon/workflow-runner.js +3 -6
package/dist/manifest.json +23 -15
package/dist/trigger/notification-service.d.ts +42 -0
package/dist/trigger/notification-service.js +164 -0
package/dist/trigger/trigger-listener.js +7 -1
package/dist/trigger/trigger-router.d.ts +3 -1
package/dist/trigger/trigger-router.js +4 -1
package/docs/design/agent-behavior-patterns-discovery.md +312 -0
package/docs/design/agent-engine-communication-discovery.md +390 -0
package/docs/design/agent-loop-architecture-alternatives-discovery.md +531 -0
package/docs/design/agent-loop-error-handling-contract.md +238 -0
package/docs/design/complete-step-approach-validation-discovery.md +344 -0
package/docs/design/daemon-stuck-detection-discovery.md +174 -0
package/docs/design/mcp-server-disconnect-discovery.md +245 -0
package/docs/design/mcp-server-epipe-crash.md +198 -0
package/docs/design/notification-design-candidates.md +131 -0
package/docs/design/notification-design-review.md +84 -0
package/docs/design/notification-implementation-plan.md +181 -0
package/docs/design/spawn-agent-failure-modes.md +161 -0
package/docs/design/spawn-agent-result-handling-implementation-plan.md +186 -0
package/docs/design/stdio-simplification-design-candidates.md +341 -0
package/docs/design/stdio-simplification-design-review.md +93 -0
package/docs/design/stdio-simplification-implementation-plan.md +317 -0
package/docs/design/structured-output-tools-coexist-findings.md +288 -0
package/docs/discovery/coordinator-script-design.md +745 -0
package/docs/discovery/coordinator-ux-discovery.md +471 -0
package/docs/discovery/spawn-agent-failure-modes.md +309 -0
package/docs/discovery/workflow-selection-for-discovery-tasks.md +336 -0
package/docs/discovery/worktrain-status-briefing.md +325 -0
package/docs/discovery/worktrain-status-design-candidates.md +202 -0
package/docs/discovery/worktrain-status-design-review-findings.md +86 -0
package/docs/ideas/backlog.md +608 -0
package/docs/ideas/daemon-structured-output-vs-tool-calls.md +344 -0
package/docs/ideas/design-candidates-backlog-consolidation.md +85 -0
package/docs/ideas/design-review-findings-backlog-consolidation.md +39 -0
package/docs/ideas/implementation_plan_backlog_consolidation.md +117 -0
package/docs/plans/authoring-doc-staleness-enforcement-candidates.md +251 -0
package/docs/plans/authoring-doc-staleness-enforcement-review.md +99 -0
package/docs/plans/authoring-doc-staleness-enforcement.md +463 -0
package/package.json +1 -1

package/docs/discovery/worktrain-status-briefing.md ADDED Viewed

@@ -0,0 +1,325 @@
+# WorkTrain Status Briefing -- Discovery
+## Artifact Strategy
+This document is a human-readable record of the discovery. It is NOT execution truth.
+- **Execution truth lives in:** WorkRail session notes and context variables (survive chat rewinds)
+- **This doc is for:** reading, sharing, reviewing -- a narrative artifact
+- **Do not rely on this doc** for workflow resumption -- use WorkRail session state instead
+**Capabilities confirmed:**
+- File system access: available (Read, Glob, Grep, Bash)
+- Delegation (WorkRail Executor subagents): available via nested subagent tool
+- Web browsing: not probed (not needed -- all sources are local files)
+---
+## Context / Ask
+**Stated goal (original):** Discovery: what data exists today that a 'worktrain status' plain-English briefing command could use -- and what's the gap between available data and what a user needs to feel informed?
+**User context:** The user wants WorkTrain to be able to answer 'what are you doing and why' in plain language, like they can ask Claude Code. This is about replacing the chat interface with something WorkTrain can answer autonomously.
+**Reframed problem:** WorkTrain cannot explain its own current state and intent in plain language, forcing users to either tolerate opacity or switch to an interactive chat interface to understand what is happening and why.
+---
+## Path Recommendation
+**Recommended path: `full_spectrum`**
+**Rationale:** The stated goal is a solution statement (a CLI command), but the task is primarily empirical -- we need to read real data files to determine whether data poverty or rendering is the binding constraint. `design_first` alone would miss the empirical grounding. `landscape_first` alone would not resolve the solution-vs-problem ambiguity. `full_spectrum` combines landscape grounding (what data actually exists) with reframing (is a pull command even the right mechanism, how does this relate to `worktrain talk`).
+---
+## Constraints / Anti-goals
+**Core constraints:**
+- Must work with data sources that exist today (no schema changes as a prerequisite for MVP)
+- Must be implementable by a single developer as a CLI subcommand
+- Must not require a running LLM for the minimum viable version (plain-text rendering only)
+**Anti-goals:**
+- Do not design a full observability platform -- this is a user-facing briefing, not a debug tool
+- Do not conflate `worktrain status` with `worktrain talk` until the relationship is explicitly resolved
+- Do not build notification/push infrastructure as part of this feature
+---
+## Landscape Packet
+### Existing CLI Commands (`src/cli-worktrain.ts`)
+| Command | What it does | Data source |
+|---------|-------------|-------------|
+| `init` | Guided setup wizard | Creates files |
+| `tell <msg>` | Queue a message for the daemon | Writes `~/.workrail/message-queue.jsonl` |
+| `inbox` | Read daemon messages | Reads/marks `~/.workrail/outbox.jsonl` |
+| `spawn` | Start a workflow session non-interactively | HTTP POST `/api/v2/auto/dispatch` |
+| `await` | Block until sessions complete | HTTP GET `/api/v2/sessions/:id` (polling) |
+| `console` | Start the console UI HTTP server | Serves `dist/console-ui/` |
+| `daemon` | Start/manage the daemon | launchd plist + daemon process |
+| `logs` | Display daemon event log (follow mode) | `~/.workrail/events/daemon/<date>.jsonl` |
+| `status <sessionId>` | Health summary for a session | `~/.workrail/events/daemon/<today>.jsonl` |
+**Key finding:** A `status` command already exists -- but it requires a session ID and only reports mechanical health metrics (LLM turn count, failure rate, stuck detection), not a human-readable "what are you doing and why" briefing. There is no command that lists all active sessions with plain-English descriptions.
+### HTTP API (`src/v2/usecases/console-routes.ts`)
+| Route | Returns |
+|-------|---------|
+| GET `/api/v2/sessions` | Session list (via ConsoleService) |
+| GET `/api/v2/sessions/:id` | Full session detail with DAG |
+| GET `/api/v2/sessions/:id/nodes/:nodeId` | Node detail |
+| GET `/api/v2/workflows` | Workflow catalog |
+| GET `/api/v2/worktrees` | Git worktrees with session counts |
+| GET `/api/v2/triggers` | Registered triggers |
+| POST `/api/v2/auto/dispatch` | Fire-and-forget session start |
+| GET `/api/v2/workspace/events` | SSE change stream |
+**Key finding:** The HTTP API exposes DAG position and workflow catalog, but does NOT expose goal text, trigger metadata, queue state, event history, or session health metrics. A briefing command reading only the API would lack the "what" (goal) and "why" (trigger reason).
+### Daemon Event Log (`~/.workrail/events/daemon/`)
+One JSONL file per day. Schema of key event types:
+**`session_started`**: `{ kind, sessionId, workflowId, workspacePath, ts }` -- identifies which workflow is running.
+**`trigger_fired`**: `{ kind, triggerId, workflowId, ts }` -- identifies what caused the session to start.
+**`tool_called`**: `{ kind, sessionId, toolName, summary, ts }` -- last tool call (useful for "stuck" detection and last activity).
+**`step_advanced`**: `{ kind, sessionId, ts }` -- step count increment, but NO step name.
+**What's present:** workflow ID, trigger ID, tool names/summaries, timestamps.
+**What's absent:** goal text, step names, plain-language descriptions, estimated time remaining, human-readable status.
+### Session Manifests (`~/.workrail/data/sessions/<id>/`)
+Each session directory contains:
+- `manifest.jsonl` -- segment metadata and snapshot pointers only (no semantic data)
+- `events/` -- JSONL files with the actual event log per segment
+**Goal text found:** in the `context_set` event (`data.context.goal`), which appears early in each session's event log. Quality is high -- full sentences like:
+> "Discovery (user experience angle): what should the first coordinator script template look like from the user's perspective -- how does someone invoke it, what do they see, what does it produce?"
+**Step names found:** ONLY in snapshot files (`data.enginePayload.engineState.pending.step.stepId`), not in the event log. Example: `"phase-0-reframe"`.
+**What's present:** goal text (in `context_set`), workflow ID (in `run_started`), observations (repo root, branch, etc.), step name (in snapshots).
+**What's absent:** step index / total step count, trigger provenance (who/what started the session), estimated time remaining.
+### Queue and Outbox Files
+- `~/.workrail/message-queue.jsonl` -- **does not exist**
+- `~/.workrail/outbox.jsonl` -- **does not exist**
+The `tell` and `inbox` commands reference these files but they are not present in the current runtime. The push/notification path Assumption 1 worried about does not exist in practice.
+### Backlog Design Specs (`docs/ideas/backlog.md`)
+Three highly relevant sections found:
+**"Live status briefings" (Apr 15, 2026):** Full spec for `worktrain status --workspace <name>`. Describes a `build-status-briefing` routine (not a full workflow -- a single fast step) that reads: active sessions from session store (step, duration), queue state from `queue.jsonl`, recent completions from merge audit log, blocked items, milestone dependencies. Sample output shows 3 active sessions with plain-English descriptions, queue top-5, recently completed items, blocked/waiting items, upcoming milestones. **Key insight:** the spec says sessions need a brief "plain English description" maintained separately -- either extracted from goal text or generated when enqueued.
+**"WorkTrain analytics" (Apr 15, 2026):** Analytics dashboard spec -- volume stats, time saved estimates, quality metrics. A different feature from status; concerns aggregated historical data, not live state.
+**"Interactive ideation" (Apr 15, 2026):** `worktrain talk` spec. A conversational loop workflow that starts with a synthesized context bundle (session outcomes, open PRs, backlog items, in-flight agent state). The spec explicitly says: "This is also what the `worktrain talk` session uses as its opening context -- before any conversation, WorkTrain gives itself a briefing on the current state so it can answer questions accurately." Status IS the context bundle for talk.
+### Assumption Resolution
+**Assumption 1 (pull vs push):** Resolved -- `message-queue.jsonl` and `outbox.jsonl` do not exist. There is no push infrastructure. A pull command is currently the only viable path.
+**Assumption 2 (data exists vs data poverty):** Partially resolved. The goal text IS there (high quality, in `context_set` events). But step names require reading snapshots (not just events), step counts are not tracked anywhere, and trigger provenance is only in daemon event logs (not session events). This is a **medium effort rendering problem** with some targeted data gap filling needed.
+**Assumption 3 (status vs talk):** Resolved. The backlog spec explicitly states that status IS the opening context bundle for `worktrain talk`. They are the same data, different surfaces: status is read-only plain text; talk is interactive with LLM. Building status first is the right path -- it produces the context bundle that talk will consume.
+### Contradictions Found
+1. The existing `worktrain status <sessionId>` command reads daemon event logs, but new session data lives in the per-session event store (`~/.workrail/data/sessions/`). These are two different storage systems. A unified briefing needs to read from both -- or just from the session store, which has more data.
+2. The step name is only in snapshots, not event logs. The existing `status` command (which reads event logs) cannot currently report which step a session is on.
+### Evidence Gaps
+- The ConsoleService internals (`src/v2/domain/console-service.ts` or similar) were not audited -- the actual shape of session detail returned by the HTTP API is unknown.
+- Queue infrastructure (queue.jsonl, the trigger queue) was not fully audited.
+- Snapshot read performance (reading snapshot files for step names) is unknown.
+---
+## Problem Frame Packet
+### Users / Stakeholders
+**Primary user:** A developer running WorkTrain autonomously -- has 1-10 sessions active at any given time, may check in after an hour or two away. Needs to quickly answer: "what is happening, is it going well, what's next?"
+**Secondary user:** The developer at the moment of starting a new session -- wants confirmation that WorkTrain understood the goal and is executing the right workflow.
+**Tertiary (future):** The `worktrain talk` conversational interface -- needs a pre-built context bundle to start an informed conversation without re-reading every session.
+### Jobs / Outcomes
+1. **Ambient awareness:** Without actively monitoring, understand what WorkTrain is working on at a glance (< 10 seconds).
+2. **Intervention triage:** Quickly identify whether any session is stuck, failing, or needs human input -- vs running fine and needing nothing.
+3. **Session grounding:** Before asking a follow-up question or queuing next work, get oriented on what's already in flight.
+4. **Context seeding:** Provide WorkTrain's own conversational sessions with a pre-built briefing so talk starts informed.
+### Pains / Tensions
+**Pain 1 -- Identity opacity:** Session IDs like `sess_bumu5ljx...` are meaningless. The user cannot tell which session is which without opening the console.
+**Pain 2 -- State opacity:** The existing `worktrain status <sessionId>` reports health metrics (LLM turns, tool call counts) not semantic state ("I am on step 4 of 8, writing integration tests").
+**Pain 3 -- No aggregate view:** There is no command that says "here are all your running sessions" with anything except IDs.
+**Pain 4 -- Two storage systems:** Daemon event log vs per-session store. Bridging them requires either reading two systems or accepting one system's incomplete picture.
+**Tension 1 -- Freshness vs complexity:** Getting step names requires reading snapshot files (content-addressed, not indexed). This is safe but adds read complexity. Without it, step names are absent.
+**Tension 2 -- Pull vs completeness:** A pull command captures state at a moment in time. If a session finishes between the user looking away and running status, it disappears from the output. Recent completions need a separate read.
+### Success Criteria
+1. A user with 3 active sessions reads `worktrain status` and correctly names what each session is working on -- without opening the console.
+2. A user can identify a stuck session from status output (no recent tool calls, long elapsed time).
+3. Output is valid and informative with zero active sessions (graceful empty state).
+4. Output correctly reflects the current step name for sessions that have advanced past step 1.
+5. A developer implements the command reading only today's data sources without schema migration.
+### HMW Questions
+- **HMW:** How might we surface goal text from `context_set` events without requiring the user to know the session event schema?
+- **HMW:** How might we provide step context ("step 4 of 8") when step count is not tracked -- by reading the workflow definition for total step count and the snapshot for current step?
+### Primary Framing Risk
+**The framing assumes the `context_set` goal text is always the right "what and why" for a session.** But the goal field is set once at session start and never updated. If a session's direction changed mid-run (e.g., it pivoted based on discoveries), the goal text may be stale or misleading. If a significant fraction of sessions have stale goals, the status briefing becomes unreliable exactly when the user most needs it (complex, multi-step sessions). **Evidence to watch:** sessions with many `context_set` updates vs sessions where goal is set once and never revised.
+---
+## Candidate Directions
+### Generation Expectations (before candidates are produced)
+This is a `full_spectrum` pass. Candidates must:
+1. **Reflect landscape constraints** -- not invent new data sources; only use what exists today (session store, daemon event log, HTTP API, workflow catalog, snapshot files)
+2. **Span implementation depth** -- at least one minimal/fast candidate (what can be done in a day), at least one fuller candidate (complete per the backlog spec)
+3. **Address the storage system choice** -- each candidate must explicitly state which storage it reads from (session store vs daemon log vs HTTP API) and accept the tradeoffs
+4. **Treat status-as-talk-bundle as a design constraint** -- at least one candidate must show how the status output feeds into `worktrain talk`
+5. **Do NOT require new infrastructure** -- no new event emission, no queue files, no schema changes as a prerequisite
+*(Candidates to be populated by injected routine)*
+---
+## Challenge Notes
+### Assumption 1: A pull command is the right mechanism
+- Might be wrong: push/notification may already serve the 'feel informed' goal
+- Evidence needed: outbox/message-queue usage patterns, event log completions without user awareness
+### Assumption 2: Data exists -- this is a rendering problem
+- Highest-risk assumption: event logs record mechanical facts, not semantic intent
+- Evidence needed: direct inspection of session manifest 'goal' field quality
+### Assumption 3: Status and talk are architecturally distinct
+- Might be wrong: `worktrain talk` may already cover 'what are you working on' as a use case
+- Evidence needed: backlog.md spec for both features
+---
+## Resolution Notes
+*(To be populated)*
+---
+## Decision Log
+### Selected Direction: Candidate A (CLI formatter over SessionSummaryProviderPort)
+**Why it won:**
+1. The data assembly is already done in `HealthySessionSummary` -- `sessionTitle` (goal), `pendingStepId` (step name), `lastModifiedMs` (stuck detection). No new projections needed.
+2. No daemon required -- works file-only, unlike Candidate B.
+3. Best-fit scope -- shippable in 1 day, satisfies 4/5 success criteria immediately.
+4. Correct architecture -- uses v2 projection layer, neverthrow, DI. Resolves the philosophy conflict (existing `status` command uses daemon log; new command uses v2 layer).
+5. The typed `StatusBriefingV1` intermediate type costs < 30 minutes but enables future `worktrain talk` integration without duplication.
+**Why Candidate B lost:**
+- Requires running daemon (usability regression for core 'check what's happening' use case)
+- The 'reuse by talk' benefit is speculative -- talk doesn't exist yet
+- 2-3 days implementation vs 1 day, same MVP output
+**Why Candidate C was rejected:**
+- Perpetuates daemon-log-reading pattern (wrong architecture)
+- No goal text in output without changing the existing command's data source
+- 'Architectural fixes over patches' principle directly violated
+### Challenge outcome
+The one genuine technical risk (port accessibility from CLI context) was investigated. `LocalSessionSummaryProviderV2` requires 4 ports (directoryListing, dataDir, sessionStore, snapshotStore). All are local file I/O adapters, instantiatable without the DI container. A small factory function `createStandaloneSessionSummaryProvider(dataDir: string)` resolves this cleanly. Challenge failed to kill Candidate A.
+### Switch triggers
+- If `worktrain talk` is prioritized in the next sprint: add the HTTP route as a parallel PR (Candidate B shape), reuse `buildStatusBriefing()` from both sides
+- If step count ('of N') is a hard requirement: add workflow catalog read to `buildStatusBriefing()` (still Candidate A, small scope addition)
+- If the port factory wiring turns out to require > 2 hours: consider calling the HTTP API instead (Candidate B shape for CLI, removing daemon dependency by starting the server if not running)
+---
+## Final Summary
+### Recommendation: Candidate A -- CLI formatter over SessionSummaryProviderPort
+**Confidence: HIGH**
+#### What to build (Sprint 1 -- ~1 day)
+1. **New file:** `src/v2/projections/status-briefing.ts`
+   - Types: `StatusBriefingV1`, `ActiveSessionBriefing` (with discriminated union for goal: `{ kind: 'set'; value: string } | { kind: 'not_set' }`)
+   - Pure function: `buildStatusBriefing(summaries: HealthySessionSummary[]): StatusBriefingV1` -- no I/O, fully testable
+   - Stuck detection: `isStuck = (now - lastModifiedMs) > STUCK_THRESHOLD_MS` (suggest 15 min)
+2. **New CLI subcommand:** `worktrain status` (no positional arg required)
+   - Wires `LocalSessionSummaryProviderV2` via a small `createStandaloneSessionSummaryProvider(dataDir)` factory
+   - Calls `buildStatusBriefing()` with loaded summaries
+   - Formats to terminal output: one block per active session (goal, step, elapsed time, stuck warning)
+3. **Rename existing command:** `worktrain status <sessionId>` → `worktrain health <sessionId>`
+   - Prevents naming confusion
+   - Part of the same PR
+#### What the output looks like (sketch)
+```
+WorkTrain  [18 Apr 2026, 14:32]
+ACTIVE (2 sessions)
+  ● wr.discovery
+    Discovery: what data exists today that a 'worktrain status' plain-English briefing command could use
+    Step: phase-3-synthesize   Running 22 min
+  ● coding-task-workflow-agentic
+    Implement GitHub polling adapter for Issues/PRs without requiring webhooks
+    Step: phase-2-implement   Running 8 min   ⚠ no activity for 18 min
+No items in queue.  Use `worktrain logs` to see recent completions.
+```
+#### What's deferred (Sprint 2)
+- **Step count ('of N'):** Read workflow catalog to add 'step 3 of 8' -- 1-2 hours
+- **Recently completed:** Scan daemon event log for `session_completed` events from last 24h -- 2-3 hours
+- **HTTP API route:** Add `GET /api/v2/status/briefing` returning `StatusBriefingV1` -- needed when `worktrain talk` is prioritized
+#### Status vs Talk relationship (resolved)
+`worktrain status` is the read-only text rendering of `StatusBriefingV1`. `worktrain talk` will use `StatusBriefingV1` as its opening context bundle. Status is built first; talk consumes the same type. They are not separate features -- they are two surfaces of the same data structure.
+#### Residual risks
+1. **Port factory complexity:** `createStandaloneSessionSummaryProvider(dataDir)` must instantiate 4 local adapters. Low risk but unverified. Verify transitive deps at sprint start.
+2. **`worktrain talk` timeline:** If talk is imminent (< 2 sprints), add the HTTP API route in Sprint 1 alongside the CLI command. User decision.
+3. **Recently completed gap:** Most visible difference between MVP and the backlog vision. Users checking status after sessions complete will see an empty active list with no context about what just finished.

package/docs/discovery/worktrain-status-design-candidates.md ADDED Viewed

@@ -0,0 +1,202 @@
+# WorkTrain Status Briefing -- Design Candidates
+> Raw investigative material for main-agent synthesis. Not a final decision.
+---
+## Problem Understanding
+### Core tensions
+**Tension 1: Existing daemon-log status vs v2 projection layer**
+The existing `worktrain status <sessionId>` command reads `~/.workrail/events/daemon/<today>.jsonl` directly (string parsing, mechanical metrics only). The v2 architecture uses per-session event stores with typed pure-function projections. A new aggregate status command has two options: (a) extend the daemon-log reader (consistent with existing command, but less data and wrong architecture), or (b) use the v2 session store (more data, correct architecture, requires resolving port wiring from CLI context). Option (b) is clearly correct per the architecture, but it introduces a temporary inconsistency -- two `status` commands reading from different sources.
+**Tension 2: Completeness vs read complexity**
+A complete briefing needs: goal (from `context_set` event), step name (from snapshot file), step count (from workflow catalog), stuck signal (from lastModifiedMs). Each requires a different read operation. A simpler briefing reads only session events (goal + workflow ID). The v2 projection layer resolves most of this: `HealthySessionSummary` in `resume-ranking.ts` already contains `pendingStepId`, `sessionTitle`, `isComplete`, and `lastModifiedMs` -- the assembly is done.
+**Tension 3: Pull-command completeness vs stateless design**
+The backlog spec shows a 'RECENTLY COMPLETED' section as a key output element. A pull command reading current session state misses sessions that completed between status runs. No persistent completion index exists. Accepting a point-in-time view (active sessions only) satisfies 4/5 success criteria; adding recently-completed requires either a new completion index or reading daemon log for session_completed events.
+### Likely seam
+`SessionSummaryProviderPortV2` + `HealthySessionSummary` in `src/v2/projections/resume-ranking.ts`. This is the correct aggregation boundary -- it already loads all sessions, health-checks them, and returns typed summaries. A new briefing command adds a formatter layer on top of this port, not a new data path.
+### What makes this hard
+A junior developer would: read daemon event log (wrong source, no goal text), produce session IDs instead of goal text, skip step names (miss the snapshot read), and miss that `HealthySessionSummary.sessionTitle` and `HealthySessionSummary.pendingStepId` already exist. The actual complexity is recognizing that the data assembly problem is solved -- only the formatter is missing.
+---
+## Philosophy Constraints
+**Principles that apply directly:**
+- **Errors are data (neverthrow):** individual session load failures must be `Result<ActiveSessionBriefing, LoadError>` not exceptions that abort the whole briefing
+- **Ports/adapters DI:** session store access must be injected as a port, not hard-coded file reads
+- **Pure functions for projection:** `buildStatusBriefing(summaries)` is a pure function, testable without I/O
+- **Explicit domain types over primitives:** return type is `StatusBriefingV1`, not `string`
+- **YAGNI with discipline:** do not add step-count tracking, completion indexing, time estimates before they are needed
+- **Validate at boundaries, trust inside:** session loading validates at the port boundary; the formatter trusts `HealthySessionSummary` fields
+**Conflicts:**
+- The existing `status <sessionId>` command violates the 'use v2 projection layer' architectural direction. A new `status` command should NOT perpetuate this pattern.
+---
+## Impact Surface
+- `src/cli-worktrain.ts` -- new subcommand added here
+- `src/v2/projections/resume-ranking.ts` -- `HealthySessionSummary` type consumed (read-only)
+- `src/v2/ports/session-summary-provider.port.ts` -- port consumed by new CLI command
+- Future `worktrain talk` -- will consume the same `StatusBriefingV1` type or the same port
+- `src/v2/usecases/console-routes.ts` -- not changed for MVP; may add status route later
+- Existing `worktrain status <sessionId>` -- not changed; two status commands coexist (inconsistency noted)
+---
+## Candidates
+### Candidate A: Minimal -- CLI formatter over `SessionSummaryProviderPort`
+**Summary:** Add a `worktrain status` CLI subcommand (no session ID required) that calls `SessionSummaryProviderPortV2.loadAll()`, filters to active (non-complete) sessions, and formats each `HealthySessionSummary` into 3-4 lines: goal, step, duration, stuck warning if applicable. Defines a typed `StatusBriefingV1` intermediate (pure data structure) even though no HTTP route exists yet.
+**Tensions resolved:**
+- Goal visibility: `HealthySessionSummary.sessionTitle` (derived from `context_set` goal) already present
+- Step context: `HealthySessionSummary.pendingStepId` already present
+- Health signal: `lastModifiedMs` available for stuck detection (> N minutes without update)
+- Aggregate view: `loadAll()` returns all sessions, no ID required
+- Architecture consistency: v2 projection layer, neverthrow, DI
+**Tensions accepted:**
+- No 'recently completed' section (only active sessions; completions require daemon log read or completion index)
+- No queue state (queue.jsonl does not exist)
+- No step count -- only step name (e.g., "phase-3-implement"), not "step 4 of 8"
+- Status-talk reuse is partial: both will call the same port, but no shared HTTP contract yet
+**Boundary:** New `buildStatusBriefing(summaries: HealthySessionSummary[]): StatusBriefingV1` pure function in `src/v2/projections/status-briefing.ts`. New `worktrain status` subcommand in `src/cli-worktrain.ts` that wires the port and calls the formatter.
+**Why this boundary is best fit:** The data assembly is already done in the projection layer. The briefing function is a pure, testable formatter -- not a new data path. Minimal surface area, minimal risk.
+**Failure mode:** `SessionSummaryProviderPortV2` may not be available from CLI context (only wired in the HTTP server's DI container). Mitigation: expose a `createStandaloneSessionSummaryProvider(dataDir)` factory that wires the port without a full server context.
+**Repo pattern:** Follows `src/v2/projections/` pattern (pure projection functions). The port-wiring from CLI context is a new pattern (existing CLI commands either call HTTP API or read files directly). Small, well-bounded new pattern.
+**Gains:** Shippable in < 1 day. Satisfies 4/5 success criteria. Typed return type enables future talk integration. No daemon required.
+**Losses:** No recently-completed section. No queue. Step count absent.
+**Scope judgment:** Best-fit for MVP.
+**Philosophy fit:** Honors errors-as-data, pure functions, DI, explicit domain types. Honors YAGNI (no speculative infrastructure). Resolves the stated philosophy conflict by using v2 layer instead of daemon log.
+---
+### Candidate B: `GET /api/v2/status/briefing` HTTP endpoint + CLI consumer
+**Summary:** Add `GET /api/v2/status/briefing` returning a typed `StatusBriefingV1` JSON object, consumed by both a CLI `worktrain status` formatter and later by `worktrain talk` as its opening context bundle. The HTTP route calls `ConsoleService.getStatusBriefing()` which calls the same `SessionSummaryProviderPort`. The CLI subcommand calls the HTTP route (like `spawn`/`await`).
+**Tensions resolved:**
+- Status-talk reuse: `StatusBriefingV1` is the canonical shared contract; talk imports from the same endpoint
+- Architecture layering: CLI talks to HTTP API, matching the `spawn`/`await` pattern
+- Multiple consumers: web console can display a status widget without a separate data path
+**Tensions accepted:**
+- Requires a running daemon/console server (HTTP API unavailable if daemon is not running)
+- More moving parts for the same MVP output (route + service method + DTO + CLI command)
+- The 'reuse' benefit is speculative -- `worktrain talk` doesn't exist yet
+**Boundary:** `StatusBriefingV1` in `src/v2/usecases/console-types.ts`, new `ConsoleService.getStatusBriefing()` method, new HTTP route, new CLI subcommand.
+**Why this boundary is best fit:** If the canonical delivery mechanism is the HTTP API and multiple consumers exist, the API boundary is the right seam. But today there is only one consumer (the CLI), so this is premature optimization.
+**Failure mode:** `worktrain status` becomes useless if the console server is not running. Candidate A works without a daemon; Candidate B does not. This is a usability regression for the core "check what's happening" use case.
+**Repo pattern:** Follows `spawn`/`await` (CLI calls HTTP). Adds a new service method and route (standard pattern). Slightly too broad for the current moment.
+**Gains:** Clean consumer contract for talk. Multiple consumers share one data path. Web console gets a status widget 'for free'.
+**Losses:** Daemon dependency. Higher implementation cost (2-3 days vs 1 day). Premature given talk doesn't exist.
+**Scope judgment:** Slightly too broad for MVP; right for the medium-term architecture if talk is imminent.
+**Philosophy fit:** Honors explicit domain types, DI. The daemon dependency is a mild violation of 'validate at boundaries, trust inside' -- it introduces a runtime availability assumption that Candidate A avoids.
+---
+### Candidate C: Extend existing `worktrain status <sessionId>` with `--all` flag
+**Summary:** Add `--all` flag to the existing `status` command that iterates sessions and renders health summaries for each.
+**Tensions resolved:**
+- Aggregate view (with --all flag)
+- Consistency with existing command (same command name)
+**Tensions accepted:**
+- **Critically:** The existing command reads daemon event logs (mechanical metrics: LLM turns, tool call counts, failure rate). Adding `--all` to it would produce a list of sessions WITHOUT goal text and WITHOUT step names. The output would be "here are your N sessions: each ran X tool calls, Y LLM turns" -- useful for ops debugging, not for 'what are you doing and why'.
+- The two-storage-system contradiction is not resolved; it's perpetuated.
+**Failure mode:** Either (a) the output is inferior (no goal text) -- solving the wrong problem, or (b) the implementation is changed to read from v2 store instead -- which is functionally the same as Candidate A but with worse ergonomics (extending a misaligned command rather than a clean new one).
+**Scope judgment:** Too narrow. Perpetuates the daemon-log-reading pattern that should not be extended.
+**Philosophy fit:** Conflicts with 'architectural fixes over patches' -- extends a symptom location rather than the correct seam.
+---
+## Comparison and Recommendation
+### Matrix
+| Criterion | A (CLI+Port) | B (HTTP route) | C (extend existing) |
+|-----------|-------------|---------------|---------------------|
+| Goal text in output | YES | YES | NO (without arch change) |
+| Step name in output | YES | YES | NO (without arch change) |
+| Aggregate view | YES | YES | YES |
+| Status-talk reuse | PARTIAL | FULL | NO |
+| Daemon required | NO | YES | NO |
+| Implementation cost | 1 day | 2-3 days | 1 day (wrong output) |
+| Architecture correct | YES | YES | NO |
+| MVP satisfies user | 4/5 criteria | 4/5 criteria | 0/5 criteria |
+### Recommendation: Candidate A + typed `StatusBriefingV1` type
+**Candidate A** wins on best-fit scope, minimum viable implementation, daemon-independent operation, and correctness (goal text + step name in output). The one addition from B: define a typed `StatusBriefingV1` return type even though the HTTP route is not built yet. This adds < 30 minutes to the implementation and gives `worktrain talk` a named contract to consume without requiring HTTP.
+**Concrete implementation plan:**
+1. New file: `src/v2/projections/status-briefing.ts` -- pure `buildStatusBriefing(summaries: HealthySessionSummary[]): StatusBriefingV1` function
+2. New types: `StatusBriefingV1`, `ActiveSessionBriefing` in the same file
+3. New CLI subcommand `status` (no positional arg) in `src/cli-worktrain.ts` -- note: renames the existing positional-arg `status <sessionId>` command to `health <sessionId>` to avoid ambiguity, or adds `status` as an alias
+4. Port wiring: `createStandaloneSessionSummaryProvider(dataDir: string)` factory if the port is not accessible from CLI context
+---
+## Self-Critique
+### Strongest argument against this recommendation
+Candidate B is architecturally cleaner for multi-consumer scenarios. If `worktrain talk` is planned for the next sprint, building the HTTP API route now avoids a refactor. The extra day of implementation is cheap compared to duplicated port-wiring if the CLI and talk both wire the provider independently.
+### Narrower option that lost
+Candidate A without the typed return type (format directly to strings in the CLI command). Lost because: `worktrain talk` would need to re-implement the assembly logic, violating DRY. The typed intermediate is < 30 minutes of extra work and high future value.
+### Broader option and what evidence would justify it
+Candidate B. Evidence required: backlog prioritization showing `worktrain talk` is the immediate next milestone (< 2 weeks), or confirmation that the web console UI needs a live status widget in the same timeframe.
+### Pivot conditions
+- If `SessionSummaryProviderPortV2` cannot be wired from CLI context without significant boilerplate -- consider Candidate B (the HTTP server already wires it, so calling the API from CLI avoids duplicating DI logic)
+- If the user confirms talk is imminent -- add the HTTP route now as part of the same PR
+- If step count ('of N total steps') is a hard requirement -- add workflow catalog read to `buildStatusBriefing()` (still Candidate A shape, small scope addition)
+---
+## Open Questions for Main Agent
+1. Is `SessionSummaryProviderPortV2` accessible from a standalone CLI context, or is it only wired in the HTTP server's DI container? (determines whether a factory function is needed)
+2. Should the existing `worktrain status <sessionId>` be renamed to `worktrain health <sessionId>` to avoid command name collision, or should the new aggregate command use a different name (`worktrain status --all` or `worktrain ls`)?
+3. How important is the 'RECENTLY COMPLETED' section for the initial release? If important, a daemon-log scan for `session_completed` events from the last 24 hours is a feasible addition to the briefing.
+4. Is `worktrain talk` planned for the next sprint? If yes, Candidate B is worth the extra day.

package/docs/discovery/worktrain-status-design-review-findings.md ADDED Viewed

@@ -0,0 +1,86 @@
+# WorkTrain Status Briefing -- Design Review Findings
+> Raw review material for main-agent synthesis. Selected direction: Candidate A (CLI formatter over SessionSummaryProviderPort).
+---
+## Tradeoff Review
+| Tradeoff | Accepted? | When it fails | Mitigation |
+|----------|-----------|---------------|------------|
+| No 'recently completed' section | YES | User checks status minutes after session completes; sees empty list with no context | Suggest `worktrain logs` in status output footer |
+| No queue state | YES | User has 0 active sessions and is uncertain whether anything is pending | N/A -- queue.jsonl does not exist; nothing to read |
+| No step count ('of N') | YES | User sees step name but has no sense of % complete on long tasks | Fast follow: read workflow catalog for total step count |
+| Two status command variants coexist | CONDITIONAL | User types `worktrain status` expecting session ID prompt | Rename existing command to `worktrain health <id>` as part of same PR |
+---
+## Failure Mode Review
+| Failure Mode | Handled? | Risk | Notes |
+|-------------|----------|------|-------|
+| Port wiring from CLI context | YES (factory pattern) | LOW | `createStandaloneSessionSummaryProvider(dataDir)` instantiates 4 concrete adapters without DI container |
+| sessionTitle null for sessions without goal | YES (formatter fallback) | LOW | Fall back to workflowId as display name |
+| Sessions from multi-day spans not visible | YES | LOW | `enumerateSessionsByRecency` scans full sessions directory, not today-only; filter `isComplete = false` handles active-only |
+**Highest-risk failure mode:** Port wiring (factory pattern). Risk is LOW but unverified. Recommend verifying adapter transitive deps during implementation sprint planning.
+---
+## Runner-Up / Simpler Alternative Review
+**Runner-up (Candidate B):** Nothing to borrow beyond the `StatusBriefingV1` typed intermediate, which is already included in Candidate A's recommendation.
+**Simpler variant (format directly to strings):** Would satisfy all success criteria identically. Rejected because the typed intermediate (`StatusBriefingV1`) costs < 30 minutes and prevents duplication when `worktrain talk` is implemented.
+**Even simpler variant (call HTTP API from CLI):** Fails on daemon-required criterion. Worse than selected design.
+**Conclusion:** Selected design is the minimum useful shape. No hybrid improvements needed.
+---
+## Philosophy Alignment
+| Principle | Status | Notes |
+|-----------|--------|-------|
+| Errors are data (neverthrow) | SATISFIED | `buildStatusBriefing()` returns `Result<StatusBriefingV1, BriefingError>`; individual session failures skipped gracefully |
+| Explicit domain types over primitives | SATISFIED | `StatusBriefingV1`, `ActiveSessionBriefing` -- not ad-hoc strings |
+| Compose small pure functions | SATISFIED | `buildStatusBriefing()` pure, formatter separate, port wiring in CLI |
+| Validate at boundaries, trust inside | SATISFIED | Validation at port boundary; pure function trusts `HealthySessionSummary` |
+| DI for boundaries | SATISFIED | Port injected into CLI command |
+| Immutability by default | SATISFIED | Reading append-only event store |
+| Architectural fixes over patches | SATISFIED | Uses v2 projection layer; does not extend daemon-log-reading pattern |
+| YAGNI with discipline | SATISFIED | No HTTP route, no queue, no completion index; `StatusBriefingV1` type is a minimal seam not speculative infrastructure |
+| Make illegal states unrepresentable | MILD TENSION | `goal: string \| null` -- null is not illegal but a discriminated union would be more explicit |
+| Type safety as first line of defense | MILD TENSION | Null `sessionTitle` requires defensive null handling; type system enforces it but the caller must pattern-match |
+---
+## Findings
+### Yellow
+**Y1: Goal field type could be more explicit**
+`ActiveSessionBriefing.goal: string | null` is acceptable but `{ kind: 'set'; value: string } | { kind: 'not_set' }` would be more explicit per the 'make illegal states unrepresentable' principle. Low severity -- the null is handled either way, but the discriminated union makes the null case named and intentional.
+**Y2: Command naming collision risk**
+Existing `worktrain status <sessionId>` and new `worktrain status` (no args) would be disambiguated by commander.js argument presence, but help text and documentation will be confusing. Renaming the existing command to `worktrain health <id>` is a 5-minute change that removes this confusion entirely. Recommend including in the same PR.
+**Y3: Step count is absent**
+'Step 4 of 8' is significantly more useful than 'phase-3-implement' for understanding how far along a session is. The workflow catalog is queryable. This should be added as a fast follow (not MVP blocker) since it requires one additional read operation per active session.
+---
+## Recommended Revisions
+1. **Include:** Rename `worktrain status <id>` to `worktrain health <id>` in the same PR as the new `worktrain status` aggregate command. (5 minutes, prevents naming confusion)
+2. **Consider:** Use `{ kind: 'set'; value: string } | { kind: 'not_set' }` for the goal field in `ActiveSessionBriefing`. (15 minutes, clearer semantics)
+3. **Fast follow:** Add workflow catalog read to `buildStatusBriefing()` to include step count ('step 3 of 8'). (1-2 hours, significant UX improvement)
+---
+## Residual Concerns
+1. **`worktrain talk` timeline unclear:** If talk is within 2 sprints, the HTTP API route (Candidate B) should be built now rather than after the fact. This is context-dependent -- the user should confirm.
+2. **Port factory complexity unverified:** The `createStandaloneSessionSummaryProvider(dataDir)` factory is conceptually straightforward but not implemented. Transitive adapter dependencies could add complexity. Low risk but should be verified at sprint start.
+3. **Recently completed sessions:** The MVP has no 'recently completed' section. This is the most user-visible gap in the backlog spec's vision. A daemon-log scan for `session_completed` events from the last 24 hours is a feasible addition in Sprint 2.