npm - @exaudeus/workrail - Versions diffs - 3.40.0 → 3.41.0 - Mend

@exaudeus/workrail 3.40.0 → 3.41.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (75)

package/dist/cli/commands/init.js +0 -3
package/dist/cli-worktrain.js +8 -0
package/dist/cli.js +0 -18
package/dist/config/app-config.d.ts +0 -16
package/dist/config/app-config.js +0 -14
package/dist/config/config-file.js +0 -3
package/dist/console-ui/assets/index-CQt4UhPB.js +28 -0
package/dist/console-ui/assets/index-DGj8EsFR.css +1 -0
package/dist/console-ui/index.html +2 -2
package/dist/coordinators/pr-review.d.ts +17 -0
package/dist/coordinators/pr-review.js +164 -0
package/dist/daemon/daemon-events.d.ts +9 -1
package/dist/daemon/soul-template.d.ts +2 -2
package/dist/daemon/soul-template.js +11 -1
package/dist/daemon/workflow-runner.d.ts +14 -1
package/dist/daemon/workflow-runner.js +395 -25
package/dist/di/container.js +1 -25
package/dist/di/tokens.d.ts +0 -3
package/dist/di/tokens.js +0 -3
package/dist/engine/engine-factory.js +0 -1
package/dist/infrastructure/console-defaults.d.ts +1 -0
package/dist/infrastructure/console-defaults.js +4 -0
package/dist/infrastructure/session/index.d.ts +0 -1
package/dist/infrastructure/session/index.js +1 -3
package/dist/manifest.json +87 -103
package/dist/mcp/handlers/session.d.ts +1 -0
package/dist/mcp/handlers/session.js +61 -13
package/dist/mcp/server.js +1 -18
package/dist/mcp/transports/http-entry.js +0 -2
package/dist/mcp/transports/stdio-entry.js +1 -2
package/dist/mcp/types.d.ts +0 -2
package/dist/trigger/daemon-console.d.ts +2 -0
package/dist/trigger/daemon-console.js +1 -1
package/dist/trigger/trigger-listener.d.ts +2 -0
package/dist/trigger/trigger-listener.js +3 -1
package/dist/trigger/trigger-router.d.ts +4 -3
package/dist/trigger/trigger-router.js +4 -3
package/dist/trigger/trigger-store.js +17 -4
package/dist/v2/usecases/console-routes.d.ts +2 -1
package/dist/v2/usecases/console-routes.js +29 -5
package/dist/v2/usecases/console-service.js +14 -0
package/dist/v2/usecases/console-types.d.ts +1 -0
package/docs/authoring.md +16 -16
package/docs/design/coordinator-message-queue-drain-plan.md +241 -0
package/docs/design/coordinator-message-queue-drain-review.md +120 -0
package/docs/design/coordinator-message-queue-drain.md +289 -0
package/docs/design/shaping-workflow-external-research.md +119 -0
package/docs/discovery/late-bound-goals-impl-plan.md +147 -0
package/docs/discovery/late-bound-goals-review.md +82 -0
package/docs/discovery/late-bound-goals.md +118 -0
package/docs/discovery/steer-endpoint-design-candidates.md +288 -0
package/docs/discovery/steer-endpoint-design-review-findings.md +104 -0
package/docs/discovery/steer-endpoint-implementation-plan.md +284 -0
package/docs/ideas/backlog.md +292 -0
package/docs/ideas/design-candidates-console-session-tree-impl.md +64 -0
package/docs/ideas/design-candidates-session-tree-view.md +196 -0
package/docs/ideas/design-review-findings-console-session-tree-impl.md +75 -0
package/docs/ideas/design-review-findings-session-tree-view.md +88 -0
package/docs/ideas/implementation_plan_session_tree_view.md +238 -0
package/package.json +2 -1
package/spec/authoring-spec.json +16 -16
package/spec/shape.schema.json +178 -0
package/spec/workflow-tags.json +232 -47
package/workflows/coding-task-workflow-agentic.json +491 -480
package/workflows/wr.shaping.json +182 -0
package/dist/console-ui/assets/index-8dh0Psu-.css +0 -1
package/dist/console-ui/assets/index-CXWCAonr.js +0 -28
package/dist/infrastructure/session/DashboardHeartbeat.d.ts +0 -8
package/dist/infrastructure/session/DashboardHeartbeat.js +0 -39
package/dist/infrastructure/session/DashboardLockRelease.d.ts +0 -2
package/dist/infrastructure/session/DashboardLockRelease.js +0 -29
package/dist/infrastructure/session/HttpServer.d.ts +0 -60
package/dist/infrastructure/session/HttpServer.js +0 -912
package/workflows/coding-task-workflow-agentic.lean.v2.json +0 -648
package/workflows/coding-task-workflow-agentic.v2.json +0 -324

package/docs/discovery/steer-endpoint-implementation-plan.md ADDED

@@ -0,0 +1,284 @@
+# Implementation Plan: POST /api/v2/sessions/:sessionId/steer
+**Branch:** `feat/session-steer-endpoint`
+**Confidence:** High
+**PR count:** 1
+---
+## Problem Statement
+A coordinator script needs to inject text into an agent's next turn during a running daemon session.
+The `steer()` mechanism in `AgentLoop` already delivers injected text. The gap is a bridge between
+an HTTP endpoint and the closure-scoped `pendingSteerText` variable in `runWorkflow()`.
+Additionally, the existing `pendingSteerText: string | null` is a single-value field that silently
+drops coordinator steers when overwritten by `onAdvance()`. This must be fixed first (R1 finding).
+---
+## Acceptance Criteria
+1. `POST /api/v2/sessions/:sessionId/steer` with body `{ "text": "..." }` returns HTTP 200
+   `{ "success": true }` when the sessionId belongs to an active daemon session.
+2. The injected text is delivered to the agent via `agent.steer()` on the next `turn_end` event,
+   concatenated after the step text from `onAdvance()`.
+3. Returns HTTP 404 `{ "success": false, "error": "Session not found or not a daemon session" }`
+   when the sessionId is not in the registry.
+4. Returns HTTP 503 `{ "success": false, "error": "Steer not available..." }` in standalone console
+   mode (no steerRegistry injected).
+5. Returns HTTP 400 for missing or non-string `text` body.
+6. Multiple calls to the endpoint between `turn_end` events: all injected texts are delivered in the
+   same steer message, joined with `\n\n`.
+7. After the session completes, calling the endpoint returns 404.
+8. The existing `pendingSteerText` variable is replaced by `pendingSteerParts: string[]` and
+   `onAdvance()` behavior is unchanged from the caller's perspective (step advance still works).
+---
+## Non-Goals
+- Auth token on the endpoint (v1 is localhost-only, network binding is the security layer)
+- `waitForCoordinator` blocking gate mechanism (Phase 2B, separate task)
+- `wr.coordinator_signal` artifact schema (Phase A, separate task)
+- MCP-mode injection (deferred to v2)
+- Crash recovery for in-flight steers (in-memory only, v1 known limitation)
+- Structured request body beyond `{ text: string }` (v2 concern)
+---
+## Philosophy-Driven Constraints
+- **DI for boundaries**: `SteerRegistry` must be injected into `mountConsoleRoutes()` and
+  `runWorkflow()`. No module-level singletons.
+- **Errors as data**: HTTP responses use `{ success: bool, error?: string }` shape. No thrown
+  exceptions at the route level.
+- **Validate at boundaries**: 400 for invalid body, 503 for disabled, 404 for not-found -- all
+  checked before touching the registry.
+- **YAGNI**: Only what's listed in acceptance criteria. No speculative extension points.
+- **Explicit domain types**: Named type alias `SteerRegistry` (not raw `Map<string, fn>` literal).
+---
+## Invariants
+1. `pendingSteerParts` is only mutated in two places: `onAdvance()` (push step text) and the steer
+   callback registered in the `SteerRegistry` (push coordinator text). No other writer.
+2. `pendingSteerParts` is only read and drained in the `turn_end` subscriber. Single reader.
+3. JavaScript single-threaded event loop: no race between push and drain.
+4. The steer callback is registered after `workrailSessionId` is decoded from the continueToken,
+   and deregistered in `runWorkflow()`'s `finally` block. No stale entries possible.
+5. The endpoint is only active when `steerRegistry` is provided to `mountConsoleRoutes()`.
+   The standalone console does not provide it.
+6. The `steerRegistry` param is optional on all functions. No existing callers are broken.
+---
+## Selected Approach
+**Hybrid (type alias + parameter injection):**
+- Named type alias `export type SteerRegistry = Map<string, (text: string) => void>` in
+  `src/daemon/workflow-runner.ts`.
+- `pendingSteerText: string | null` replaced by `const pendingSteerParts: string[] = []`.
+- `onAdvance()` uses `pendingSteerParts.push(stepText)`.
+- turn_end subscriber drains with `const parts = pendingSteerParts.splice(0)` and calls
+  `agent.steer(buildUserMessage(parts.join('\n\n')))` if `parts.length > 0`.
+- `runWorkflow()` gains optional `steerRegistry?: SteerRegistry` param. After workrailSessionId
+  is decoded, calls `steerRegistry?.set(workrailSessionId, (text) => pendingSteerParts.push(text))`.
+  In `finally`: `steerRegistry?.delete(workrailSessionId)`.
+- `mountConsoleRoutes()` gains optional `steerRegistry?: SteerRegistry` param after `triggerRouter`.
+- `POST /api/v2/sessions/:sessionId/steer` endpoint added in `console-routes.ts`.
+- `TriggerRouter` constructor gains optional `steerRegistry?: SteerRegistry`; passes to
+  `runWorkflowFn()` calls in `route()` and `dispatch()`.
+- `RunWorkflowFn` type in `trigger-router.ts` extended with optional 6th param.
+**Runner-Up:** C2 (SteerRegistry class). Loses only because it adds a new file for 3 trivial operations.
+Use it if the registry gains additional methods or needs isolated unit tests.
+---
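The queue-and-registry approach selected above can be sketched in isolation. This is a minimal illustration, not the package's code: `createSessionSketch` and its returned methods are hypothetical names, and the agent loop is reduced to a function that returns the text it would pass to `agent.steer()`.

```typescript
// Named alias as given in the plan.
type SteerRegistry = Map<string, (text: string) => void>;

// Hypothetical stand-in for the closure inside runWorkflow().
function createSessionSketch(sessionId: string, steerRegistry?: SteerRegistry) {
  // Replaces the single-value pendingSteerText: nothing is overwritten.
  const pendingSteerParts: string[] = [];

  // Writer 1 of 2: the coordinator callback registered under the session id.
  steerRegistry?.set(sessionId, (text: string) => {
    pendingSteerParts.push(text);
  });

  return {
    // Writer 2 of 2: step text pushed on advance.
    onAdvance(stepText: string): void {
      pendingSteerParts.push(stepText);
    },
    // Single reader: drains the queue and returns the joined steer text,
    // or null when there is nothing to deliver.
    onTurnEnd(isComplete: boolean): string | null {
      if (isComplete) return null;
      const parts = pendingSteerParts.splice(0); // empties the array in place
      return parts.length > 0 ? parts.join("\n\n") : null;
    },
    // Mirrors the `finally` block: no stale registry entry survives.
    dispose(): void {
      steerRegistry?.delete(sessionId);
    },
  };
}
```

Because `splice(0)` empties the same array that both writers push to, the closures never hold a stale reference, which is the point made in the risk register.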
+## Vertical Slices
+### Slice 1: Fix `pendingSteerText` -> `pendingSteerParts` (R1 prerequisite)
+**Files:** `src/daemon/workflow-runner.ts` only.
+**Change:**
+- Rename `pendingSteerText: string | null` to `const pendingSteerParts: string[] = []`.
+- Update `onAdvance()`: `pendingSteerText = stepText` -> `pendingSteerParts.push(stepText)`.
+- Update turn_end subscriber drain:
+  - Before: `if (pendingSteerText !== null && !isComplete) { ... }`
+  - After: `if (!isComplete) { const parts = pendingSteerParts.splice(0); if (parts.length > 0) { agent.steer(buildUserMessage(parts.join('\n\n'))); } }`
+- Note: the `isComplete` guard now wraps the `splice(0)` -- when the session is not complete the drain always happens, and `agent.steer()` is called only if `parts` is non-empty.
+**Acceptance:** Existing behavior unchanged. Session still receives step text on each advance.
+No coordinator injection yet. Build+type-check passes. Existing tests pass.
+**Risk:** Low. Pure refactor, no behavior change from external perspective.
+---
+### Slice 2: SteerRegistry type alias + runWorkflow() registration
+**Files:** `src/daemon/workflow-runner.ts`.
+**Change:**
+- Add: `export type SteerRegistry = Map<string, (text: string) => void>;`
+- Add optional param to `runWorkflow()`: `steerRegistry?: SteerRegistry`
+- After `workrailSessionId` is decoded (line ~2190): register callback:
+  ```typescript
+  if (steerRegistry && workrailSessionId) {
+    steerRegistry.set(workrailSessionId, (text: string) => { pendingSteerParts.push(text); });
+  }
+  ```
+- In `finally` block: `if (steerRegistry && workrailSessionId) { steerRegistry.delete(workrailSessionId); }`
+- Add code comment on the `set()` call documenting the registration gap.
+**Acceptance:** `runWorkflow()` compiles with new optional param. Existing callers unchanged.
+Manual test: if a steerRegistry Map is passed and a callback is registered/called during a session,
+text is pushed to `pendingSteerParts`.
+**Risk:** Low. Additive change, no behavior change when `steerRegistry` is undefined.
+---
+### Slice 3: TriggerRouter wiring
+**Files:** `src/trigger/trigger-router.ts`.
+**Change:**
+- Import `SteerRegistry` from `workflow-runner.js`.
+- Extend `RunWorkflowFn` type with optional 6th param:
+  `steerRegistry?: SteerRegistry` after `emitter?`.
+- Add `private readonly steerRegistry?: SteerRegistry` to `TriggerRouter`.
+- Add `steerRegistry?: SteerRegistry` to TriggerRouter constructor params.
+- Assign in constructor: `this.steerRegistry = steerRegistry`.
+- Update `route()` call: `this.runWorkflowFn(workflowTrigger, this.ctx, this.apiKey, undefined, this.emitter, this.steerRegistry)`
+- Update `dispatch()` call: same.
+**Acceptance:** TriggerRouter compiles. `runWorkflow` in production TriggerRouter path passes
+steerRegistry to the agent loop. Existing trigger tests unaffected (registry is optional).
+**Risk:** Low. Additive param. All existing test calls to `TriggerRouter` pass `undefined` or
+omit the param.
+---
+### Slice 4: HTTP endpoint in console-routes.ts
+**Files:** `src/v2/usecases/console-routes.ts`.
+**Change:**
+- Import `SteerRegistry` from `../../daemon/workflow-runner.js`.
+- Add `steerRegistry?: SteerRegistry` to `mountConsoleRoutes()` param list (after `triggerRouter`).
+- Add endpoint after the `POST /api/v2/auto/dispatch` block:
+```typescript
+// POST /api/v2/sessions/:sessionId/steer
+// Injects text into a running daemon session's next agent turn.
+// Daemon-only: requires steerRegistry to be provided at server startup.
+// Auth: localhost-only (127.0.0.1 binding). No token auth in v1.
+// TODO(v2): Add token auth before any multi-user or remote deployment.
+app.post('/api/v2/sessions/:sessionId/steer', express.json(), (req: Request, res: Response) => {
+  if (!steerRegistry) {
+    res.status(503).json({ success: false, error: 'Steer not available (not a daemon context).' });
+    return;
+  }
+  const { sessionId } = req.params;
+  const body = req.body as { text?: unknown };
+  const text = typeof body.text === 'string' ? body.text.trim() : '';
+  if (!text) {
+    res.status(400).json({ success: false, error: 'text is required and must be a non-empty string.' });
+    return;
+  }
+  const callback = steerRegistry.get(sessionId);
+  if (!callback) {
+    res.status(404).json({ success: false, error: 'Session not found or not a daemon session.' });
+    return;
+  }
+  callback(text);
+  res.json({ success: true });
+});
+```
+**Acceptance:** All 4 HTTP response cases work correctly (200, 400, 404, 503). Session receives
+injected text on next turn_end. Standalone console returns 503.
+**Risk:** Low. New endpoint, no changes to existing routes.
+---
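The check order in the handler above (503, then 400, then 404, then 200) can be exercised without Express by factoring it into a pure function. This is only a sketch: `decideSteer` and `SteerOutcome` are hypothetical names that do not appear in the plan, and the tolerance for a missing body is an addition of this sketch rather than something the plan specifies.

```typescript
type SteerRegistry = Map<string, (text: string) => void>;

// Hypothetical result shape: HTTP status plus the "errors as data" body.
interface SteerOutcome {
  status: 200 | 400 | 404 | 503;
  body: { success: boolean; error?: string };
}

function decideSteer(
  steerRegistry: SteerRegistry | undefined,
  sessionId: string,
  rawBody: unknown,
): SteerOutcome {
  // 503: standalone console, no registry injected.
  if (!steerRegistry) {
    return { status: 503, body: { success: false, error: "Steer not available (not a daemon context)." } };
  }
  // 400: text missing, not a string, or blank after trimming.
  const candidate = (rawBody as { text?: unknown } | null | undefined)?.text;
  const text = typeof candidate === "string" ? candidate.trim() : "";
  if (!text) {
    return { status: 400, body: { success: false, error: "text is required and must be a non-empty string." } };
  }
  // 404: no running daemon session registered under this id.
  const callback = steerRegistry.get(sessionId);
  if (!callback) {
    return { status: 404, body: { success: false, error: "Session not found or not a daemon session." } };
  }
  // 200: hand the text to the session's queue.
  callback(text);
  return { status: 200, body: { success: true } };
}
```

A route handler could then reduce to calling `decideSteer(steerRegistry, req.params.sessionId, req.body)` and writing the returned status and body to the response.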
+### Slice 5: Daemon wiring in daemon-console.ts
+**Files:** `src/trigger/daemon-console.ts`.
+**Change:**
+- Import `SteerRegistry` from `../daemon/workflow-runner.js`.
+- Before constructing `TriggerRouter`: `const steerRegistry: SteerRegistry = new Map();`
+- Pass to `TriggerRouter` constructor: `new TriggerRouter(index, ctx, apiKey, runWorkflow, execFn, ..., steerRegistry)`
+- Pass to `mountConsoleRoutes()`: add `steerRegistry` as the last argument (after `triggerRouter`).
+- Also update the direct `runWorkflow()` call in `console-routes.ts` `POST /auto/dispatch` path
+  (when `triggerRouter` is absent): pass `steerRegistry` as 6th arg.
+**Acceptance:** End-to-end: daemon starts, `POST /auto/dispatch` creates a session, coordinator
+calls `POST /sessions/:id/steer`, agent receives injected text on next turn.
+**Risk:** Medium. This is the wiring step that connects all slices. Most likely source of missed
+call sites.
+---
+## Test Design
+### Unit tests (workflow-runner.ts)
+- Test that `onAdvance()` pushes to `pendingSteerParts`.
+- Test that turn_end subscriber joins and steers when `pendingSteerParts.length > 0`.
+- Test that multiple pushes (simulate both `onAdvance` and steer callback) produce joined text.
+- Test that `steerRegistry.set()` is called after workrailSessionId decoded.
+- Test that `steerRegistry.delete()` is called in finally (mock registry, verify delete).
+### Integration test (console-routes.ts)
+- Mock `steerRegistry` with a Map. POST to endpoint with valid body -> 200, callback called.
+- POST with empty body -> 400.
+- POST with unknown sessionId -> 404.
+- POST without steerRegistry injected -> 503.
+### Regression: existing tests
+- All existing `runWorkflow()` tests must pass unchanged (optional param, default undefined).
+- All existing `mountConsoleRoutes()` tests must pass unchanged.
+- All existing TriggerRouter tests must pass unchanged.
+---
+## Risk Register
+| Risk | Likelihood | Impact | Mitigation |
+|---|---|---|---|
+| Missed call site for steerRegistry in dispatch/route | Medium | High | Search for all `runWorkflowFn(` calls in trigger-router.ts before submitting |
+| `pendingSteerParts.splice(0)` stale closure ref | Low | High | splice(0) mutates in-place; closure over array variable (not array contents) is safe |
+| Registration gap causes 404 for early steers | Very low | Low | Document with code comment; coordinator retries on 404 |
+| `mountConsoleRoutes` callers not updated | Low | Medium | Only 3 callers: daemon-console.ts, standalone-console.ts (no change), console-routes.ts |
+---
+## PR Packaging Strategy
+Single PR on branch `feat/session-steer-endpoint`. All 5 slices together. The R1 fix (Slice 1) is
+small enough that it doesn't need its own PR. The endpoint is only usable when all slices are present.
+PR title: `feat(console): add POST /api/v2/sessions/:sessionId/steer for coordinator injection`
+---
+## Philosophy Alignment Per Slice
+| Slice | Principle | Status |
+|---|---|---|
+| S1 pendingSteerParts | Immutability by default | Tension (mutable array) -- acceptable, mutation bounded |
+| S1 pendingSteerParts | Compose with small pure functions | Satisfied -- drain is one expression |
+| S2 SteerRegistry type | Explicit domain types | Satisfied -- named alias |
+| S2 runWorkflow registration | DI for boundaries | Satisfied -- injected, not global |
+| S3 TriggerRouter | Make illegal states unrepresentable | Satisfied -- optional param can't be confused with required |
+| S4 HTTP endpoint | Validate at boundaries | Satisfied -- 400/503/404 before touching registry |
+| S4 HTTP endpoint | Errors as data | Satisfied -- { success: bool } shape |
+| S5 daemon wiring | YAGNI | Satisfied -- single Map, no extra abstraction |

package/docs/ideas/backlog.md CHANGED

@@ -5891,3 +5891,295 @@ Coordinator logic:
 - Phase 1: coordinator scripts withhold `complete_step` advancement until the condition is met. This already works today -- the coordinator just doesn't advance the session until the fix agent is done.
 - Phase 2: the coordinator passes structured context when advancing: `complete_step(session, { injectedContext: fixSummary })`. The session receives it as part of the next step's prompt.
 - Phase 3: declarative pipelines -- workflow JSON declares that step N waits for an external condition before proceeding. The coordinator reads this and manages the timing automatically. No hand-coded coordinator script needed for common patterns.
+---
+### Coordinatable workflow steps: confirmation points the coordinator can satisfy (needs discovery, Apr 18, 2026)
+⚠️ **Needs discovery before implementation. The questions below are open, not answered.**
+**The insight:** workflows already have `requireConfirmation: true` on certain steps -- these are natural coordination points. Right now they pause for a human. The idea is to make them also pausable-for-a-coordinator, so a coordinator (or another agent) can be the one that responds instead of a human.
+**The vision:**
+A workflow reaches a `requireConfirmation` step. In MCP mode (human-driven), it behaves exactly as today -- pauses and waits. In daemon/coordinator mode, instead of blocking forever, the coordinator can:
+- Inject a synthesized answer based on external work it just did ("architecture review found X, proceed with approach A")
+- Spawn another agent to generate the answer and inject its output
+- Ask a discovery agent to weigh in and forward the result
+- Simply forward a human's message from the message queue
+The original session never knows whether a human or a coordinator satisfied the confirmation. It just receives the next turn with context.
+**Why this is powerful:**
+Today the coordinator is external to the workflow -- it orchestrates sessions from outside. This makes the workflow itself coordinatable from within, so multi-agent collaboration can be declared in the workflow spec rather than bolted on in coordinator scripts.
+**What's unknown and needs discovery:**
+1. **Mechanism:** is this an enriched `requireConfirmation` (add a `coordinatable: true` flag?), a new step type (`requireCoordinatorInput`?), or something at the engine level? Tradeoffs between each.
+2. **What gets injected:** always a structured decision ("proceed/revise/abort + findings"), or also data injection ("here are the file contents", "here's what the API returned")? How does the step receive it -- as a new tool call result, as a steer, as part of the step prompt?
+3. **Coordinator discovery:** how does the coordinator know a step is waiting for it vs waiting for a human? Does it poll the session state? Does the session emit a `coordinator_gate_pending` event? (This connects to the `waitForCoordinator` spec in this backlog.)
+4. **Timeout/fallback:** if the coordinator never responds, what happens? Fall back to human? Error? Configurable?
+5. **MCP invariant:** must behave identically to today in MCP/human-driven mode. The coordinator path is additive, not a behavior change for existing users.
+**Relationship to other specs:**
+- "Long-running sessions: stay open across agent handoffs" -- the session pauses at the confirmation point, coordinator acts, session resumes
+- "POST /api/v2/sessions/:id/steer" -- this might be the injection mechanism
+- `signal_coordinator` tool -- the session might signal the coordinator instead of blocking
+- `waitForCoordinator` step flag (already in this backlog) -- same underlying need, different framing
+- "Coordinator review mode: self-healing vs comment-and-wait" -- confirmation points are where that routing decision gets expressed
+---
+## Architecture Decision: Three-Workflow Pipeline (Apr 18, 2026)
+### Decision
+The canonical WorkRail workflow pipeline for new features is:
+```
+wr.discovery (optional) → wr.shaping (optional) → coding-task-workflow-agentic
+```
+Each workflow is independently useful. The pipeline is an optional chain, not a required sequence.
+### Rationale
+**wr.discovery** produces a direction -- what problem is worth solving. Output: structured discovery notes at `.workrail/discovery/`.
+**wr.shaping** produces a bounded pitch -- what specifically to build and explicitly NOT build, at a product level. Output: `.workrail/current-pitch.md`. Faithful Shape Up methodology. Tech-agnostic. No code-level content.
+**coding-task-workflow-agentic** produces running code -- engineering approach, sliced implementation, verification. When pitch.md exists (Phase 0.5), it skips design ideation and translates the pitch directly into an engineering approach. The pitch's no-gos and appetite are binding constraints.
+### No TechSpec workflow needed
+The coding workflow already does everything a TechSpec workflow would do: Phase 1b generates design candidates, Phase 1c selects and challenges the approach, Phase 3 writes the spec and implementation plan. Adding a separate TechSpec workflow would duplicate this and create a question of which is canonical. The coding workflow is the engineering planning layer.
+**The split that matters is product vs engineering:**
+- Product decisions (what to build, for whom, within what time) → wr.shaping
+- Engineering decisions (how to build it, which interfaces, which tests) → coding workflow
+### When to skip shaping
+- Task is small, concrete, and clearly scoped → go straight to coding workflow
+- Discovery already produced a bounded, implementable direction
+- You have a pre-written ticket or spec that already defines what to build
+### Faithful Shape Up constraint
+wr.shaping is tech-agnostic. A pitch for a Kotlin Android app and a pitch for a Python API service look structurally identical. No file paths, no function signatures, no implementation details. This makes pitches usable by human engineering teams at companies using Shape Up, not just WorkRail's coding workflow.
+### Phase 0.5 mechanics
+When `coding-task-workflow-agentic` finds `.workrail/current-pitch.md`:
+1. Reads all five pitch sections (Problem, Appetite, Solution/Elements, Rabbit Holes, No-Gos)
+2. Sets `shapedInputDetected=true`
+3. Skips phases 1a-1c (hypothesis, design generation, challenge-and-select)
+4. Phase 1d translates pitch elements/invariants/no-gos into an engineering approach
+5. Plan audit (Phase 4) checks for drift against the pitch
+6. Appetite is a hard ceiling -- oversized engineering work becomes follow-up tickets
+---
+## Idea: `context-gather` Step Type (Apr 19, 2026)
+### Problem
+Phase 0.5 in the coding workflow currently looks for a shaped pitch by checking a local path. This doesn't handle: coordinator-injected context, manually written docs (GDoc, Confluence, Notion), Glean-indexed artifacts, or URLs embedded in the task description. The search logic is duplicated if other workflows need the same document.
+### Proposed primitive
+A new engine-level step type `context-gather` that resolves a named context artifact from ordered sources:
+```json
+{
+  "type": "context-gather",
+  "id": "gather-pitch",
+  "contextType": "shaped-pitch",
+  "outputVar": "shapedInput",
+  "optional": true,
+  "sources": ["coordinator-injected", "local-paths", "task-url", "glean"]
+}
+```
+**Source resolution order (stops at first hit):**
+1. `coordinator-injected` -- coordinator already attached context of this type to the session (most common in autonomous mode)
+2. `local-paths` -- check `.workrail/current-pitch.md`, `pitch.md`, `PRD.md`, `.workrail/pitches/` (most recent)
+3. `task-url` -- extract any URL from the task description and fetch via WebFetch or matching MCP (GDoc, Confluence, Notion)
+4. `glean` -- search Glean for recent docs matching the task keywords and `contextType`; opt-in only (risk of false positives silently constraining wrong scope)
+If `optional: true` and no source resolves: `outputVar = null`, workflow continues normally.
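The "stops at first hit" resolution described above could look roughly like the following. Everything here is illustrative: `gatherContext`, `SourceName`, and `Resolver` are hypothetical names, resolvers are synchronous for brevity, and throwing when a required context is unresolved is an assumption, since the backlog entry only specifies the `optional: true` case.

```typescript
type SourceName = "coordinator-injected" | "local-paths" | "task-url" | "glean";

// Hypothetical resolver: returns the artifact text, or null on a miss.
type Resolver = () => string | null;

function gatherContext(
  sources: SourceName[],
  resolvers: Partial<Record<SourceName, Resolver>>,
  optional: boolean,
): string | null {
  for (const source of sources) {
    // Sources are tried in declared order; a source without a resolver is a miss.
    const hit = resolvers[source]?.() ?? null;
    if (hit !== null) return hit; // first hit wins, later sources are never queried
  }
  if (optional) return null; // outputVar = null, workflow continues normally
  // Assumption: the behavior for required context is not specified in the idea.
  throw new Error("No source resolved the required context");
}
```

Stopping at the first hit is what lets a coordinator-injected artifact take precedence over any search, and it keeps the opt-in `glean` source from being consulted when a cheaper source has already answered.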
+### Why engine-level, not a routine
+- Coordinator intercept requires the engine to check "has this type already been provided?" before running any search -- a routine can't express that
+- `contextType` is a declared intent multiple workflows can share (`wr.shaping`, `coding-task-workflow`, `wr.discovery`) without duplicating resolver logic
+- New sources (Linear, Jira, Notion) get added to the engine once, immediately available to all workflows
+### Relationship to existing work
+- Replaces/supersedes Phase 0.5's current local-path check in `coding-task-workflow-agentic`
+- Coordinator PR-review flow would inject `shaped-pitch` context before spawning the coding session
+- Any workflow that needs "find the spec/pitch/PRD for this task" uses the same step type
+### Open questions
+- How does the coordinator inject context into a session? Via a session variable set before `start_workflow`, or a new `inject_context` call?
+- How does `task-url` distinguish a GDoc URL from a Confluence URL from a Notion URL? MCP routing by domain?
+- What is the `contextType` vocabulary? Start with `shaped-pitch` -- what else? (`discovery-notes`, `design-spec`, `api-contract`?)
+- Glean false-positive risk: wrong document fed as shaped input silently constrains wrong scope. Needs confidence threshold or explicit user confirmation when Glean is the only hit.
+---
+## Completed (Apr 19, 2026)
+### wr.shaping -- Faithful Shape Up shaping workflow
+Created `workflows/wr.shaping.json`. Faithful Shape Up methodology, tech-agnostic, produces `.workrail/current-pitch.md` only. Nine steps: ingest → frame gate → diverge (6 shapes, Verbalized Sampling) → converge → breadboard + elements → rabbit holes + no-gos → draft/critique loop → approval gate → write pitch.md. Two human gates with autonomous fallback. Appetite is calendar-time only (xs/s/m/l/xl). No code-level content -- a pitch for a Kotlin app and a pitch for a Python service look structurally identical.
+### coding-task-workflow-agentic -- Upstream context Phase 0.5
+Added Phase 0.5 "Locate Upstream Context" to `coding-task-workflow-agentic.json`. Format-agnostic: the agent uses whatever tools are available (repo search, WebFetch, Confluence/Notion/Glean MCPs, etc.) to find any upstream document -- pitch, PRD, BRD, RFC, design doc, user story, Jira epic, etc. Sets `upstreamSpecDetected` + `solutionFixed` flags. When `solutionFixed=true`, design ideation phases (1a-1c) are skipped and Phase 1d translates upstream constraints directly into an engineering approach. Plan audit (Phase 4) checks for drift against `upstreamBoundaries` whenever an upstream document was found.
+Also consolidated from three workflow variants to one canonical file.
+---
+## Current state update (Apr 19, 2026)
+**npm version: v3.40.0**
+### What shipped since v3.36.0 (Apr 18 -- Apr 19)
+- ✅ **`wr.shaping`** -- faithful Shape Up shaping workflow (9 steps, two human gates with autonomous fallback)
+- ✅ **`coding-task-workflow-agentic` Phase 0.5** -- upstream context detection; skips design phases when solution is pre-specified. Three-workflow pipeline: discovery → shaping → coding.
+- ✅ **Coding workflow consolidated** -- from three variants (lean, full, lean.v2) to one canonical file.
+- ✅ **HttpServer removed from MCP server** (#601) -- pure stdio. MCP server can no longer accidentally start an HTTP server.
+- ✅ **Late-bound goals** (#604) -- `goalTemplate: "{{$.goal}}"` defaults for webhook-driven sessions. Goals can come from the payload, not just the static trigger definition.
+- ✅ **Coordinator message queue drain** (#606) -- `pr-review` coordinator reads `~/.workrail/message-queue.jsonl` before each spawn cycle. `worktrain tell stop`, `skip-pr <n>`, `add-pr <n>` work.
+- ✅ **Notifications shipped** -- `NotificationService` implemented, wired into `TriggerRouter` via `trigger-listener.ts`. `WORKTRAIN_NOTIFY_MACOS=true` and `WORKTRAIN_NOTIFY_WEBHOOK=<url>` in `~/.workrail/config.json`.
+- ✅ **`worktrain run pr-review`** -- fully wired coordinator command. `spawnSession` → `awaitSessions` → `getAgentResult` (session-wide artifact aggregation) → `parseFindingsFromNotes` → route by severity.
+- ✅ **`wr.review_verdict` artifact path** -- end-to-end wired: `mr-review-workflow.agentic.v2.json` phase-6 emits it, `artifact-contract-validator.ts` validates it at `continue_workflow` time, coordinator reads it with keyword-scan fallback.
+- ✅ **`worktrain logs` / `worktrain health`** -- structured daemon log tailing and per-session health summary. `worktrain status <id>` deprecated in favor of `worktrain health <id>`.
+- ✅ **`signal_coordinator` tool** -- agent can emit structured mid-session signals (`progress`, `finding`, `data_needed`, `approval_needed`, `blocked`) without advancing the step.
+- ✅ **`ChildWorkflowRunResult` + `assertNever`** -- spawn_agent delivery_failed bug fixed. `delivery_failed` impossible state is compile-time excluded.
+- ✅ **`lastStepArtifacts` on `WorkflowRunSuccess`** -- `onComplete` callback forwards artifacts alongside notes. Coordinator can read typed artifacts from result without a separate HTTP call.
+- ✅ **`steerRegistry` + POST `/sessions/:id/steer`** -- coordinator injection endpoint wired in daemon console. Running sessions register a steer callback; coordinators can inject mid-session messages via HTTP.
+- ✅ **GitHub polling adapters** -- `github_issues_poll` and `github_prs_poll` providers fully implemented alongside existing `gitlab_poll`.
+- ✅ **Knowledge graph spike** -- `src/knowledge-graph/` module: DuckDB in-memory + ts-morph indexer + two validation queries. NOT yet wired to an MCP tool (ts-morph in devDependencies).
+- ✅ **`worktrain daemon --install`** -- launchd plist creation, load, verify. Daemon survives MCP server reconnects.
+- ✅ **Performance sweep** -- April 2026 sweep identified 10 highest-leverage fixes, filed as issues #248-257. Not yet merged.
+### Accurate limitations (as of v3.40.0)
+1. **Console session tree UI not built** -- `parentSessionId` is stored in the `session_created` event and in `WorkflowRunSuccess`. Console `RunLineageDag` shows the per-session step DAG only. Cross-session parent-child tree is data-only. PRs #607 (tree view) and #608 (steer endpoint) are OPEN.
+2. **Daemon tool set is minimal** -- agent has: `complete_step`, `continue_workflow` (deprecated), `Bash`, `Read`, `Write`, `report_issue`, `spawn_agent`, `signal_coordinator`. No `Glob`, `Grep`, or `Edit`. Read/Write are thin wrappers.
+3. **`worktrain tell` messages only drained by coordinator** -- `drainMessageQueue` is called by `runPrReviewCoordinator`, not by the daemon loop. A running autonomous session cannot receive mid-run injections from `worktrain tell`. The `steerRegistry` HTTP endpoint is the mid-session channel.
+4. **Knowledge graph not wired** -- module exists, ts-morph must move to dependencies before an MCP tool can be built.
+5. **`spawn_agent` return missing `artifacts`** -- returns `{ childSessionId, outcome, notes }` only. Typed artifacts from the child session are not surfaced to the parent agent. `lastStepArtifacts` exists on `WorkflowRunSuccess`, but `spawn_agent` does not return it.
+6. **`worktrain inbox --watch` stub** -- `--watch` flag prints "not yet implemented" and exits.
+7. **Artifact store not built** -- agents still dump markdown/files directly into the repo. `~/.workrail/artifacts/` directory structure not created.
+8. **Performance issues not fixed** -- issues #248-257 filed from the April sweep. Known hot spots: `continue_workflow` triggers 6+ event log scans; every `/api/v2/sessions` request does a full session rebuild; workflow fetches are N+1; nothing is cached.
+9. **No auto-commit** -- agents can write code but do not commit, push, or open PRs autonomously.
+10. **Assessment gates not battle-tested** -- end-to-end flow with `outputContract: required: true` not validated in production use.
+### Open PRs to merge
+- **#607** `feat(console): add session tree view for coordinator sessions` -- cross-session parent-child hierarchy in console. Blocked on: `parentSessionId` data is in store but console routes need to surface it.
+- **#608** `feat(console): add POST /api/v2/sessions/:sessionId/steer for coordinator injection` -- NOTE: this endpoint is already implemented in `daemon-console.ts` via `steerRegistry`. PR #608 may be adding this to the MCP server console separately. Check before merging.
+- **#610** `feat(workflows): add wr.shaping` -- the shaping workflow. Ready to merge.
+- **#587** `fix(mcp): add assertNever exhaustiveness guard to TriggerRouter` -- likely already applied in codebase (ChildWorkflowRunResult assertNever is live). May be a duplicate or different scope. Check.
+### Next priorities (groomed Apr 19)
+1. **Merge #610 (wr.shaping)** -- ready. Workflow is implemented and in the branch.
+2. **Merge #587 (TriggerRouter assertNever)** -- quick fix, check if still relevant.
+3. **Review and merge #607 + #608** -- console tree view and steer endpoint. Verify #608 doesn't duplicate what's already live in daemon-console.ts.
+4. **Performance fixes** -- issues #248-257. Pick highest-leverage first: SessionIndex (#248) and console projection cache (#249) eliminate most of the repeated scans.
+5. **Daemon tool set: add Glob + Grep** -- agents routinely need to search files. `Read` + `Bash` grep is slow and lossy. Native `Glob` and `Grep` tools would make coding sessions more reliable.
+6. **`spawn_agent` artifacts gap** -- add `artifacts?: readonly unknown[]` to the return value. `lastStepArtifacts` is already on `WorkflowRunSuccess`; wiring it through is ~30 LOC.
+7. **Knowledge graph wiring** -- move `ts-morph` and `@duckdb/node-api` to dependencies, add `query_knowledge_graph` MCP tool.
+8. **Artifact store foundation** -- `~/.workrail/artifacts/` directory, write path in `complete_step`.
+---
+### wr.shaping workflow: shape messy problems into implementation-ready specs (needs authoring, Apr 18, 2026)
+**Status:** Design complete. Ready to author as a WorkRail workflow JSON.
+**Design docs:**
+- `docs/design/shaping-workflow-discovery.md` -- WorkRail-internal discovery findings
+- `docs/design/shaping-workflow-external-research.md` -- External research synthesis (Shape Up, LLM failure modes, artifact schema)
+**The gap this fills:** WorkRail has `wr.discovery` (divergent) and `coding-task-workflow-agentic` (convergent). Shaping is the missing middle -- converting messy discovery output into a bounded, implementation-ready spec without mid-implementation rabbit holes.
+**The 11-step skeleton (see design doc for full detail):**
+1. ingest_and_extract -- extract problem frames, forces, open questions
+2. **frame_gate** -- MANDATORY HUMAN GATE: confirm problem + appetite
+3. diverge_solution_shapes -- 4 parallel rough shapes with varied framings
+4. converge_pick -- SEPARATE JUDGE (different model/prompt): pick best shape
+5. breadboard_and_elements -- fat-marker breadboard + Interface/Invariant/Exclusion classification
+6. rabbit_holes_nogos -- adversarial: risks, mitigations, no-gos, assumptions
+7. context_pack_build -- file globs, reuse_utilities, conventions, do-not-touch boundaries
+8. example_map_and_gherkin -- Given/When/Then acceptance criteria + verification commands
+9. draft_pitch -- self-refine ×2, SEPARATE CRITIC (obfuscated authorship)
+10. **approval_gate** -- MANDATORY HUMAN GATE: approve, edit, or restart
+11. finalize_and_handoff -- schema validation, emit shape.json + pitch.md
+**The single most important design decision:** generator and critic run on structurally different prompts (ideally different model families). CoT and self-reflection alone do NOT mitigate anchoring or self-preference bias (Lou & Sun 2025; Panickssery et al. 2024).
+**Output artifact:** `shape.json` -- contains problem story, appetite (multi-dimensional: calendar + tokens + turns + files), breadboard, elements, context_pack (file boundaries + reuse_utilities), Gherkin acceptance criteria, rabbit holes, no-gos, decomposition with walking skeleton, assumptions_log, build_readiness_score.
+**Key insight for AI implementers:** LLMs need MORE explicit specs than humans on interfaces/invariants/file boundaries (no tacit knowledge, no scope-shame), but LESS explicit than junior humans on standard patterns. The dominant failure mode is confident architectural divergence -- working code that reinvents an existing utility. Context Pack (Step 7) directly prevents this.
+**Next action:** author `wr.shaping` as a WorkRail workflow JSON using workflow-for-workflows, then update `coding-task-workflow-agentic` Phase 0 to detect and consume `shape.json` when present.
+---
+## Coordinator architecture: separation of concerns (Apr 19, 2026)
+**Decision: defer knowledge graph implementation until the context assembly layer is designed.**
+### The god class problem
+`src/coordinators/pr-review.ts` is already ~500 LOC doing: session dispatch, result aggregation, finding classification, merge routing, message queue drain, and outbox writes. Adding knowledge graph queries, context bundle assembly, upstream doc fetching, and prior session lookups would make it a god class.
+"Coordinator" is not a class or a script -- it is a **layer** that orchestrates across multiple concerns. Those concerns need to be separated before we add more to them.
+### The right layering
+```
+Trigger layer         src/trigger/          receives events, validates, enqueues
+Dispatch layer        (TBD)                 decides which workflow + what goal
+Context assembly      (TBD)                 gathers and packages context before spawning
+Orchestration layer   src/coordinators/     spawns, awaits, routes, retries, escalates
+Delivery layer        src/trigger/delivery  posts results back to origin systems
+```
+**Context assembly** is the missing layer. Before dispatching a coding session, something needs to:
+- Run `buildIndex()` and query "what imports the file being changed"
+- Find the upstream pitch/PRD/BRD for the task
+- Pull relevant prior session notes
+- Package everything as a structured context bundle
+This is NOT the orchestration script's job. The orchestration script should call `assembleContext(task, workspace)` and receive a bundle -- it should not know how that bundle was gathered.
+### Why the knowledge graph belongs in context assembly, not in the daemon
+Two options were considered:
+- **Daemon tool** (`makeQueryKnowledgeGraphTool` in `workflow-runner.ts`) -- agent queries mid-session on demand
+- **Coordinator pre-fetch** -- coordinator runs queries before spawning, injects answers as context
+The coordinator pre-fetch is better for known patterns (e.g. "what imports the file being changed" before a coding task). The agent doesn't need to know the graph exists -- it just gets the relevant facts as context. This also avoids adding `ts-morph` + DuckDB to the production build.
+The daemon tool approach is only better for ad-hoc mid-session queries the agent discovers dynamically. That's a secondary use case for v1.
+### What to build before the knowledge graph
+1. **Design the `ContextAssembler` abstraction** -- takes task description + workspace + trigger metadata, returns a structured context bundle. The knowledge graph is one of several sources (alongside upstream docs, prior session notes, repo state).
+2. **Refactor `pr-review.ts`** to use a `ContextAssembler` for the bits that fit there.
+3. **Then** implement knowledge graph as a `ContextAssembler` plugin -- not as a coordinator script addition and not as a daemon tool.
+### Anti-pattern to avoid
+Adding knowledge graph calls directly into `pr-review.ts` or any other coordinator script. That immediately creates the god class we're trying to avoid and couples the orchestration layer to a specific context source.
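
The `ContextAssembler` idea from the coordinator architecture notes above could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the names `TaskDescription`, `ContextFragment`, `ContextBundle`, `ContextSource`, and `fakeImportGraph` are all hypothetical, and the third `sources` parameter on `assembleContext` is an assumption added here to show how the knowledge graph could plug in as one source among several.

```typescript
// All shapes below are hypothetical; the source only names
// `assembleContext(task, workspace)` and a "structured context bundle".
interface TaskDescription {
  readonly goal: string;
  readonly changedFiles: readonly string[];
}

interface ContextFragment {
  readonly source: string;  // which source produced this fragment
  readonly summary: string; // text to inject into the agent's context
}

interface ContextBundle {
  readonly task: TaskDescription;
  readonly fragments: readonly ContextFragment[];
}

// One pluggable source of context (knowledge graph, upstream docs, prior notes).
interface ContextSource {
  readonly name: string;
  gather(task: TaskDescription, workspace: string): Promise<readonly string[]>;
}

// The orchestration layer calls this and never learns how fragments were gathered.
async function assembleContext(
  task: TaskDescription,
  workspace: string,
  sources: readonly ContextSource[],
): Promise<ContextBundle> {
  const perSource = await Promise.all(
    sources.map(async (s) => {
      const summaries = await s.gather(task, workspace);
      return summaries.map((summary): ContextFragment => ({ source: s.name, summary }));
    }),
  );
  // Flatten the per-source arrays into a single fragment list.
  const fragments = ([] as ContextFragment[]).concat(...perSource);
  return { task, fragments };
}

// Stand-in for a knowledge-graph plugin; a real one would query the index.
const fakeImportGraph: ContextSource = {
  name: "knowledge-graph",
  gather: async (task) =>
    task.changedFiles.map((f) => `importers of ${f}: (lookup not implemented in this sketch)`),
};
```

With this shape, a coordinator such as `pr-review.ts` would depend only on `assembleContext` and the bundle type, so adding or removing a context source would not touch orchestration code. Note that `Promise.all` makes one failing source fail the whole bundle; a real assembler might prefer to degrade gracefully.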

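For the `spawn_agent` artifacts gap listed in the backlog above (limitation 5 and priority 6), the wiring might look something like the sketch below. The variant names on `ChildWorkflowRunResult`, the `SpawnAgentReturn` interface, and the `toSpawnAgentReturn` helper are assumptions for illustration; the real types in `workflow-runner.d.ts` may differ. Only `childSessionId`, `outcome`, `notes`, `lastStepArtifacts`, and the proposed `artifacts?: readonly unknown[]` field come from the notes.

```typescript
// Hypothetical reduced union; the real ChildWorkflowRunResult may have other variants.
type ChildWorkflowRunResult =
  | { readonly kind: "success"; readonly notes: string; readonly lastStepArtifacts: readonly unknown[] }
  | { readonly kind: "error"; readonly message: string }
  | { readonly kind: "timeout" };

// Return shape from the notes, plus the proposed optional `artifacts` field.
interface SpawnAgentReturn {
  readonly childSessionId: string;
  readonly outcome: "success" | "error" | "timeout";
  readonly notes: string;
  readonly artifacts?: readonly unknown[];
}

// Exhaustiveness guard: if a new variant is added to the union and not handled,
// the call below stops type-checking because `result` is no longer `never`.
function assertNever(value: never): never {
  throw new Error(`Unhandled variant: ${JSON.stringify(value)}`);
}

function toSpawnAgentReturn(
  childSessionId: string,
  result: ChildWorkflowRunResult,
): SpawnAgentReturn {
  switch (result.kind) {
    case "success":
      // Forward the child's typed artifacts to the parent agent.
      return {
        childSessionId,
        outcome: "success",
        notes: result.notes,
        artifacts: result.lastStepArtifacts,
      };
    case "error":
      return { childSessionId, outcome: "error", notes: result.message };
    case "timeout":
      return { childSessionId, outcome: "timeout", notes: "child session timed out" };
    default:
      return assertNever(result);
  }
}
```

Because the union has no `delivery_failed` member, a branch for it cannot be written without a type error, which is the compile-time exclusion the notes describe.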
package/docs/ideas/design-candidates-console-session-tree-impl.md ADDED

@@ -0,0 +1,64 @@
+# Design Candidates: Console Session Tree Implementation (Phase 3)
+*2026-04-18 -- This document covers only the remaining Slice 5 (SessionTreeView UI component)*
+*Phase 1 and Phase 2 artifacts: see design-candidates-session-tree-view.md and design-review-findings-session-tree-view.md*
+## Problem Understanding
+Slices 1-4 are implemented. The remaining work is Slice 5: add a SessionTreeView rendering path to SessionList.tsx.
+**Tensions:**
+- Expand toggle vs card navigation: two click targets on the same logical row. Resolved by a flex row with separate button elements.
+- Per-coordinator expand state vs pure component: expand state lives in useState (UI state, not business logic -- correct placement).
+- Auto-expand for in_progress: requires checking status in state initialization.
+**Likely seam:** SessionList.tsx (presenter) + session-list-use-cases.ts (pure function buildSessionTree, already built).
+**What makes it hard:** The expand toggle must be keyboard-navigable separately from the card AND must not trigger card navigation on click.
+## Philosophy Constraints
+- Pure presenter: no business logic in the component
+- Immutability: expand state is a ReadonlyMap or regular Map in useState
+- Functional/declarative: map SessionTreeNode[] to JSX
+- Compose with small functions: SessionTreeView as a named function, separate from SessionList
+## Impact Surface
+- SessionList.tsx: adding viewMode branch
+- session-list-use-cases.ts: already has buildSessionTree exported
+- session-list-reducer.ts: already has viewMode + view_mode_changed
+## Candidates
+### Candidate A: SessionTreeView inline in SessionList.tsx (only candidate)
+**Summary:** A `SessionTreeView` function component in SessionList.tsx takes `SessionTreeNode[]`, initializes expand state as `Map<string, boolean>` (auto-expand in_progress), and renders a flex row with [expand-toggle, coordinator-card] and children in a TreeLine wrapper below when expanded.
+**Tensions resolved:** Expand/navigate separation (separate button elements). Accepts: expand state resets on navigation (transient UI state is acceptable).
+**Boundary:** SessionList.tsx presenter layer.
+**Failure mode:** Expand toggle accidentally triggers card navigation. Fixed by: expand toggle button is outside the coordinator ConsoleCard, not nested inside it.
+**Repo pattern:** Follows SessionGroup component pattern in SessionList.tsx exactly.
+**Gains:** Simple, pure, testable in isolation. **Loses:** Expand state resets when navigating away (transient).
+**Scope:** Best-fit.
+**Philosophy:** All principles honored.
+## Comparison and Recommendation
+Single candidate; no comparison needed. Candidate A is the correct approach.
+## Self-Critique
+Strongest counter-argument: expand state should be in the reducer (durable within page session). Counter-counter: expand state is UI state, not domain state. Reducer is for interaction state that needs to persist across renders (search, filter, sort, pagination). Expand state for individual coordinator rows is more like accordion state -- local useState is correct.
+Pivot condition: if user feedback shows expand state loss is disruptive, move to reducer with `expanded_coordinators: ReadonlySet<string>` field.
+## Open Questions for the Main Agent
+None. Implementation is fully specified in docs/ideas/design-candidates-session-tree-view.md and the Phase 2 design spec.
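
The expand-state handling described for Candidate A can be sketched as two pure helpers that a `SessionTreeView` component could pass to `useState`. This is an illustrative sketch under assumptions: the reduced `SessionTreeNode` shape, the status values other than `in_progress`, and the helper names `initialExpandState` and `toggleExpanded` are hypothetical and not taken from the codebase.

```typescript
// Hypothetical reduced node shape; the real SessionTreeNode likely carries more fields.
interface SessionTreeNode {
  readonly sessionId: string;
  readonly status: "in_progress" | "complete" | "blocked";
  readonly children: readonly SessionTreeNode[];
}

type ExpandState = ReadonlyMap<string, boolean>;

// Usable as a lazy useState initializer: coordinators that are
// in_progress start expanded, everything else starts collapsed.
function initialExpandState(nodes: readonly SessionTreeNode[]): ExpandState {
  return new Map(
    nodes.map((n): [string, boolean] => [n.sessionId, n.status === "in_progress"]),
  );
}

// Returns a new map instead of mutating, so React sees a changed reference.
function toggleExpanded(state: ExpandState, sessionId: string): ExpandState {
  const next = new Map(state);
  next.set(sessionId, !(state.get(sessionId) ?? false));
  return next;
}
```

In the component, the toggle button's `onClick` would call something like `setExpanded((s) => toggleExpanded(s, node.sessionId))`. Keeping that button as a sibling of the coordinator card, not a child, is what prevents the click from also triggering card navigation.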