@exaudeus/workrail 3.44.0 → 3.46.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (40) hide show
  1. package/dist/cli/commands/index.d.ts +1 -0
  2. package/dist/cli/commands/index.js +3 -1
  3. package/dist/cli/commands/worktrain-pipeline.d.ts +17 -0
  4. package/dist/cli/commands/worktrain-pipeline.js +121 -0
  5. package/dist/console-ui/assets/{index-Bi38ITiQ.js → index-BQFhoMcY.js} +1 -1
  6. package/dist/console-ui/index.html +1 -1
  7. package/dist/coordinators/adaptive-pipeline.d.ts +57 -0
  8. package/dist/coordinators/adaptive-pipeline.js +104 -0
  9. package/dist/coordinators/modes/full-pipeline.d.ts +4 -0
  10. package/dist/coordinators/modes/full-pipeline.js +256 -0
  11. package/dist/coordinators/modes/implement-shared.d.ts +4 -0
  12. package/dist/coordinators/modes/implement-shared.js +201 -0
  13. package/dist/coordinators/modes/implement.d.ts +3 -0
  14. package/dist/coordinators/modes/implement.js +108 -0
  15. package/dist/coordinators/modes/quick-review.d.ts +3 -0
  16. package/dist/coordinators/modes/quick-review.js +37 -0
  17. package/dist/coordinators/modes/review-only.d.ts +2 -0
  18. package/dist/coordinators/modes/review-only.js +28 -0
  19. package/dist/coordinators/routing/route-task.d.ts +21 -0
  20. package/dist/coordinators/routing/route-task.js +55 -0
  21. package/dist/daemon/workflow-runner.d.ts +12 -2
  22. package/dist/daemon/workflow-runner.js +96 -13
  23. package/dist/manifest.json +101 -29
  24. package/dist/mcp/output-schemas.d.ts +16 -16
  25. package/dist/trigger/notification-service.d.ts +1 -1
  26. package/dist/trigger/notification-service.js +4 -0
  27. package/dist/trigger/trigger-router.d.ts +3 -0
  28. package/dist/trigger/trigger-router.js +17 -0
  29. package/dist/trigger/types.d.ts +2 -0
  30. package/dist/v2/durable-core/schemas/artifacts/discovery-handoff.d.ts +29 -0
  31. package/dist/v2/durable-core/schemas/artifacts/discovery-handoff.js +26 -0
  32. package/dist/v2/durable-core/schemas/artifacts/index.d.ts +2 -1
  33. package/dist/v2/durable-core/schemas/artifacts/index.js +7 -1
  34. package/dist/v2/durable-core/schemas/compiled-workflow/index.d.ts +8 -8
  35. package/dist/v2/usecases/console-routes.js +3 -0
  36. package/docs/design/design-candidates-stuck-escalation.md +183 -0
  37. package/docs/design/design-review-findings-stuck-escalation.md +93 -0
  38. package/docs/design/implementation-plan-stuck-escalation.md +172 -0
  39. package/docs/ideas/backlog.md +86 -0
  40. package/package.json +1 -1
@@ -0,0 +1,183 @@
1
+ # Design Candidates: WorkTrain Stuck-Escalation
2
+
3
+ *Generated: 2026-04-19 | Pitch: .workrail/current-pitch.md*
4
+
5
+ ---
6
+
7
+ ## Problem Understanding
8
+
9
+ ### Core Tensions
10
+
11
+ 1. **Stuck vs timeout conflation**: When `repeated_tool_call` fires, the session
12
+ currently runs until wall-clock or max-turns timeout. The result is
13
+ `_tag: 'timeout'`, which is indistinguishable from a legitimate slow session.
14
+ Automated routing requires a distinct discriminant.
15
+
16
+ 2. **Abort vs notify-only independence**: Outbox notification and `agent.abort()`
17
+ are two separate effects. `notify_only` policy suppresses the abort but must
18
+ not suppress the outbox write. These effects must not be coupled.
19
+
20
+ 3. **ChildWorkflowRunResult atomic update**: The `as ChildWorkflowRunResult` cast
21
+ at line 2172 in `makeSpawnAgentTool` suppresses any compile-time error from a
22
+ missing union update. Only the `assertNever(childResult)` at line 2212 catches
23
+ the omission -- at runtime, crashing the parent session.
24
+
25
+ 4. **no_progress false-positive risk**: The no_progress heuristic fires on
26
+ legitimate research workflows that spend many turns reading before advancing.
27
+ It must be opt-in (default: false) to avoid breaking existing sessions.
28
+
29
+ ### Likely Seam
30
+
31
+ The `turn_end` subscriber in `runWorkflow()` is the correct location. All
32
+ required state (lastNToolCalls, stepAdvanceCount, timeoutReason, issueSummaries)
33
+ is available there as closure variables. Detection fires at the right moment
34
+ (after each turn, synchronously before next step injection).
35
+
36
+ ### What Makes This Hard
37
+
38
+ - The `as ChildWorkflowRunResult` cast is a type-safety trap: it silences
39
+ TypeScript while leaving a runtime crash. Only careful reading of the pitch
40
+ reveals the issue.
41
+ - `buildOutcome()` in notification-service.ts has return type
42
+ `NotificationPayload['outcome']`. Adding 'stuck' to WorkflowRunResult causes
43
+ a compile error there unless the outcome union is also widened.
44
+
45
+ ---
46
+
47
+ ## Philosophy Constraints
48
+
49
+ From CLAUDE.md:
50
+
51
+ - **Make illegal states unrepresentable**: the stuck discriminant prevents
52
+ conflating stuck with timeout at the type level.
53
+ - **Exhaustiveness everywhere**: assertNever guards in trigger-router and
54
+ makeSpawnAgentTool enforce this -- adding stuck arm is required.
55
+ - **Errors are data**: WorkflowRunResult is a Result type; WorkflowRunStuck is
56
+ a new variant, not an exception.
57
+ - **Type safety as first line of defense**: ChildWorkflowRunResult update in
58
+ same commit restores the compile-time invariant that the cast broke.
59
+ - **Fire-and-forget for side effects**: outbox write uses void + catch, same
60
+ as DaemonEventEmitter and issue recording.
61
+
62
+ No conflicts between stated philosophy and repo patterns.
63
+
64
+ ---
65
+
66
+ ## Impact Surface
67
+
68
+ Paths that must stay consistent when WorkflowRunResult gains a new variant:
69
+
70
+ 1. `makeSpawnAgentTool` -- `assertNever(childResult)` at line 2212; requires
71
+ ChildWorkflowRunResult update and a new `stuck` arm in the result mapping.
72
+ 2. `trigger-router.ts` `route()` -- exhaustive if-else chain ending in
73
+ `assertNever(result)` at line ~689.
74
+ 3. `trigger-router.ts` `dispatch()` -- same exhaustive chain at line ~770.
75
+ 4. `notification-service.ts` `buildNotificationBody()` -- exhaustive switch.
76
+ 5. `notification-service.ts` `buildDetail()` -- exhaustive switch.
77
+ 6. `notification-service.ts` `buildOutcome()` -- return type
78
+ `NotificationPayload['outcome']`; 'stuck' must be added to that union.
79
+ 7. `NotificationPayload.outcome` union -- currently
80
+ `'success' | 'error' | 'timeout' | 'delivery_failed'`; must add `'stuck'`.
81
+
82
+ ---
83
+
84
+ ## Candidates
85
+
86
+ ### Candidate A: New `_tag: 'stuck'` discriminated union variant (SELECTED)
87
+
88
+ **Summary**: Add `WorkflowRunStuck` interface with `_tag: 'stuck'`, wire abort
89
+ in turn_end subscriber after Signal 1 and Signal 2 emitter calls, return stuck
90
+ result before timeout check, update both `WorkflowRunResult` and
91
+ `ChildWorkflowRunResult` unions atomically, add `writeStuckOutboxEntry` helper.
92
+
93
+ **Tensions resolved**:
94
+ - Stuck/timeout conflation: separate discriminant, separate return path.
95
+ - Abort/notify independence: outbox write fires before the abort gate check.
96
+ - ChildWorkflowRunResult crash: atomic update with assertNever arm added.
97
+ - no_progress false-positive: gated by `noProgressAbortEnabled: false` default.
98
+
99
+ **Boundary solved at**: `turn_end` subscriber (detection + abort), result
100
+ construction (return), 4 files for propagation to callers.
101
+
102
+ **Why best-fit boundary**: The turn_end subscriber is the only location with
103
+ access to all required state. The result construction is the canonical output
104
+ boundary for runWorkflow(). Propagation to callers follows the existing
105
+ WorkflowRunResult variant fan-out pattern.
106
+
107
+ **Failure mode**: Forgetting to update `NotificationPayload.outcome` union --
108
+ caught by `npm run build` (TypeScript compile error in `buildOutcome()`).
109
+
110
+ **Repo-pattern relationship**: Mirrors `timeoutReason` flag pattern exactly.
111
+ Mirrors `WorkflowRunTimeout` interface field shape. Follows assertNever guard
112
+ pattern already established in trigger-router and makeSpawnAgentTool.
113
+
114
+ **Gains**: Distinct routing for stuck sessions, type-safe callers, clean
115
+ separation of abort and notification effects.
116
+
117
+ **Losses**: One more variant in the union (minor cognitive load increase).
118
+
119
+ **Scope judgment**: Best-fit. 4 files, mechanical wiring, all design resolved.
120
+
121
+ **Philosophy fit**: Honors all relevant CLAUDE.md principles. No conflicts.
122
+
123
+ ---
124
+
125
+ ### Candidate B: Extend `WorkflowRunTimeout.reason` with stuck sub-values
126
+
127
+ **Summary**: Add `'stuck_repeated_tool_call' | 'stuck_no_progress'` to
128
+ `WorkflowRunTimeout.reason` -- reuse the timeout discriminant.
129
+
130
+ **Tensions resolved**: None of the core ones. Stuck and timeout still share
131
+ `_tag: 'timeout'`, requiring callers to inspect reason to distinguish them.
132
+
133
+ **Failure mode**: Violates make-illegal-states-unrepresentable. Callers using
134
+ `result._tag === 'timeout'` would silently handle stuck sessions as timeouts.
135
+
136
+ **Repo-pattern relationship**: Departs from the exhaustiveness-everywhere
137
+ pattern. The assertNever guard pattern exists precisely to avoid this.
138
+
139
+ **Scope judgment**: Too narrow -- preserves the routing problem this pitch
140
+ exists to solve.
141
+
142
+ **Rejected because**: Violates philosophy, does not resolve the core tension,
143
+ and the pitch explicitly rejects conflating stuck with timeout.
144
+
145
+ ---
146
+
147
+ ## Comparison and Recommendation
148
+
149
+ Candidate A is the only viable candidate. All analysis converges.
150
+
151
+ The core recommendation is to implement Candidate A exactly as specified in
152
+ `.workrail/current-pitch.md`, with one addition not noted in the pitch:
153
+ update `NotificationPayload.outcome` union to include `'stuck'` (required for
154
+ `buildOutcome()` to compile).
155
+
156
+ ---
157
+
158
+ ## Self-Critique
159
+
160
+ **Strongest counter-argument**: Adding a 5th variant to WorkflowRunResult
161
+ increases cognitive load for callers. Counter: assertNever guards make missing
162
+ cases compile errors, which is the correct safeguard. The complexity cost is
163
+ paid once (at implementation) and enforced automatically.
164
+
165
+ **Narrower option that lost**: Update only WorkflowRunResult, skip
166
+ ChildWorkflowRunResult. Lost because: runtime crash in makeSpawnAgentTool
167
+ when a child hits stuck-abort. The cast at line 2172 provides no protection.
168
+
169
+ **Broader option not justified**: Adding `onStuck:` hook to TriggerDefinition.
170
+ Explicitly deferred per pitch No-Gos. Would require trigger-store.ts parser
171
+ changes -- outside the 4-file scope.
172
+
173
+ **Pivot condition**: If `assertNever(childResult)` were removed in favor of a
174
+ logged fallback, ChildWorkflowRunResult update would be less critical. It is
175
+ not removed, so the atomic update is required.
176
+
177
+ ---
178
+
179
+ ## Open Questions for the Main Agent
180
+
181
+ None. All design decisions are resolved in the pitch. The only implementation
182
+ detail requiring attention is the `NotificationPayload.outcome` union widening
183
+ (add 'stuck') -- verify this compiles before finalizing.
@@ -0,0 +1,93 @@
1
+ # Design Review Findings: WorkTrain Stuck-Escalation
2
+
3
+ *Generated: 2026-04-19 | Pitch: .workrail/current-pitch.md*
4
+
5
+ ---
6
+
7
+ ## Tradeoff Review
8
+
9
+ | Tradeoff | Status | Condition for Failure |
10
+ |----------|--------|-----------------------|
11
+ | One more union variant in WorkflowRunResult | Acceptable | All callers use assertNever guards -- compile error enforces handling |
12
+ | ChildWorkflowRunResult atomic update relies on discipline | Managed | Fails only if commit is split; mitigated by single-PR implementation and compile-time test |
13
+ | NotificationPayload.outcome union widening (gap, not tradeoff) | Resolved | Add 'stuck' to outcome union; caught by npm run build |
14
+
15
+ ---
16
+
17
+ ## Failure Mode Review
18
+
19
+ | Failure Mode | Severity | Design Handling | Missing Mitigation |
20
+ |--------------|----------|-----------------|--------------------|
21
+ | ChildWorkflowRunResult not updated | High | Atomic commit, compile-time assignability test | None beyond discipline |
22
+ | stuckReason / timeoutReason race | Low | First-writer-wins guard; max_turns early return prevents race | None needed |
23
+ | writeStuckOutboxEntry fails | Low | Fire-and-forget, console.warn on error | None -- intentional |
24
+ | no_progress fires on research workflow | Low | noProgressAbortEnabled defaults to false | None needed |
25
+ | NotificationPayload.outcome compile error | Medium | Add 'stuck' to union | None -- caught at build |
26
+
27
+ ---
28
+
29
+ ## Runner-Up / Simpler Alternative Review
30
+
31
+ - **Candidate B** (extend WorkflowRunTimeout.reason): No elements worth borrowing.
32
+ Does not resolve the core routing tension.
33
+ - **Skip ChildWorkflowRunResult**: Not acceptable -- runtime crash in parent session.
34
+ - **Skip sessionStartMs**: Not recommended -- pitch explicitly adds it for Signal 5 follow-up
35
+ to avoid future restructuring.
36
+ - **Inline outbox write**: Works but reduces turn_end subscriber readability. Not worth it.
37
+
38
+ No hybrid opportunities identified.
39
+
40
+ ---
41
+
42
+ ## Philosophy Alignment
43
+
44
+ | Principle | Status |
45
+ |-----------|--------|
46
+ | Make illegal states unrepresentable | Satisfied |
47
+ | Exhaustiveness everywhere | Satisfied |
48
+ | Errors are data | Satisfied |
49
+ | Immutability by default | Satisfied |
50
+ | Type safety as first line of defense | Under tension (pre-existing cast; improved but not fully resolved) |
51
+ | Fire-and-forget for side effects | Satisfied |
52
+
53
+ ---
54
+
55
+ ## Findings
56
+
57
+ ### Yellow: NotificationPayload.outcome union widening not specified in pitch
58
+
59
+ The pitch states 'buildOutcome() returns result._tag directly -- no change needed'.
60
+ However, the return type annotation `NotificationPayload['outcome']` will cause a
61
+ TypeScript compile error when 'stuck' is added to WorkflowRunResult but not to the
62
+ outcome union. **Resolution**: add `'stuck'` to `NotificationPayload.outcome` union
63
+ in notification-service.ts. This is a mechanical fix, not a design change.
64
+
65
+ ### Yellow: Pre-existing `as ChildWorkflowRunResult` cast at line 2172
66
+
67
+ The cast suppresses TypeScript's compile-time check that would otherwise catch a
68
+ missing ChildWorkflowRunResult update. This PR updates the union and adds a
69
+ compile-time assignability test to partially compensate. Removing the cast is
70
+ out of scope. **Residual concern**: future union additions must be caught by the
71
+ test rather than the compiler.
72
+
73
+ ---
74
+
75
+ ## Recommended Revisions
76
+
77
+ 1. Add `'stuck'` to `NotificationPayload.outcome` union (not in pitch, required for compile).
78
+ 2. Add compile-time assignability test for `ChildWorkflowRunResult` in the test file.
79
+ 3. Document the `as ChildWorkflowRunResult` cast issue in a code comment at line 2172
80
+ (or verify existing comment is sufficient).
81
+
82
+ ---
83
+
84
+ ## Residual Concerns
85
+
86
+ - The `as ChildWorkflowRunResult` cast remains. Future contributors adding a new
87
+ WorkflowRunResult variant may forget to update ChildWorkflowRunResult. The
88
+ compile-time test in the stuck-escalation test file partially mitigates this,
89
+ but only for the stuck variant. A broader structural fix (removing the cast)
90
+ is a follow-up.
91
+ - Webhook consumers reading `outcome: 'stuck'` must handle the new value.
92
+ This is a new feature, not a breaking change, but operators consuming the
93
+ webhook should be aware.
@@ -0,0 +1,172 @@
1
+ # Implementation Plan: WorkTrain Stuck-Escalation
2
+
3
+ *2026-04-19 | Pitch: .workrail/current-pitch.md*
4
+
5
+ ---
6
+
7
+ ## 1. Problem Statement
8
+
9
+ When a WorkTrain daemon session enters a `repeated_tool_call` loop, the session
10
+ currently burns turns until wall-clock or max-turn timeout. The result is
11
+ `_tag: 'timeout'`, indistinguishable from a legitimate slow session. Automated
12
+ routing is impossible without string-parsing.
13
+
14
+ ---
15
+
16
+ ## 2. Acceptance Criteria
17
+
18
+ 1. `WorkflowRunStuck` interface exported from `workflow-runner.ts` with fields:
19
+ `_tag: 'stuck'`, `workflowId`, `reason`, `message`, `stopReason`, `issueSummaries?`
20
+ 2. `WorkflowRunResult` union includes `WorkflowRunStuck`.
21
+ 3. `ChildWorkflowRunResult` union includes `WorkflowRunStuck` (SAME COMMIT as #2).
22
+ 4. `WorkflowTrigger.agentConfig` has `stuckAbortPolicy?` and `noProgressAbortEnabled?`.
23
+ 5. `TriggerDefinition.agentConfig` has the same two fields.
24
+ 6. When `repeated_tool_call` fires and `stuckAbortPolicy !== 'notify_only'`:
25
+ outbox entry written, `agent.abort()` called, `stuckReason = 'repeated_tool_call'`.
26
+ 7. When `notify_only` is set: outbox written, abort NOT called.
27
+ 8. When `noProgressAbortEnabled: true` and `no_progress` fires with `stuckAbortPolicy !== 'notify_only'`:
28
+ same abort + outbox write.
29
+ 9. Return path returns `{ _tag: 'stuck', ... }` before `timeoutReason` check.
30
+ 10. `trigger-router.ts` `route()` and `dispatch()` handle `stuck` without assertNever fallthrough.
31
+ 11. `notification-service.ts` `buildNotificationBody()` and `buildDetail()` handle `stuck`.
32
+ 12. `NotificationPayload.outcome` union includes `'stuck'`.
33
+ 13. `makeSpawnAgentTool` handles `stuck` child result, returns `outcome: 'stuck'`.
34
+ 14. All 6 test cases in `workflow-runner-stuck-escalation.test.ts` pass.
35
+ 15. `npm run build` clean. `npx vitest run` no regressions.
36
+
37
+ ---
38
+
39
+ ## 3. Non-Goals
40
+
41
+ - No `onStuck:` hook in TriggerDefinition (follow-up)
42
+ - No console live panel stuck indicator
43
+ - No `worktrain logs` formatting changes
44
+ - No automatic retry on stuck
45
+ - No Signal 5 (wall-clock at 80%) wiring
46
+ - No new heuristics beyond Signal 1 and 2
47
+ - No changes to `src/mcp/`
48
+ - No `trigger-store.ts` parser changes
49
+
50
+ ---
51
+
52
+ ## 4. Philosophy-Driven Constraints
53
+
54
+ - All new fields `readonly`
55
+ - `issueSummaries` spread to new readonly array when included in return value
56
+ - `writeStuckOutboxEntry` is fire-and-forget (void + catch)
57
+ - `stuckReason` flag: first-writer-wins (same as `timeoutReason`)
58
+ - Outbox write and abort are independent effects (write before abort gate check)
59
+
60
+ ---
61
+
62
+ ## 5. Invariants
63
+
64
+ - **I1**: `ChildWorkflowRunResult` and `WorkflowRunResult` updates ship in the same commit.
65
+ - **I2**: `stuckReason` is checked BEFORE `timeoutReason` in the return path.
66
+ - **I3**: Outbox write fires regardless of `stuckAbortPolicy`.
67
+ - **I4**: `no_progress` never aborts unless `noProgressAbortEnabled: true`.
68
+ - **I5**: `repeated_tool_call` abort fires on the same turn as detection.
69
+ - **I6**: First writer wins on `stuckReason` (guard: `stuckReason === null && timeoutReason === null`).
70
+
71
+ ---
72
+
73
+ ## 6. Selected Approach
74
+
75
+ New `_tag: 'stuck'` discriminated union variant. Wire abort in `turn_end` subscriber
76
+ after Signal 1 and Signal 2 emitter calls. Return stuck result before `timeoutReason`
77
+ check. Update both union types atomically. Add `writeStuckOutboxEntry` module-level
78
+ helper. Propagate to trigger-router, notification-service, makeSpawnAgentTool.
79
+
80
+ **Runner-up rejected**: Extend `WorkflowRunTimeout.reason` -- violates make-illegal-states-unrepresentable.
81
+
82
+ ---
83
+
84
+ ## 7. Vertical Slices
85
+
86
+ ### Slice 1: Core types (workflow-runner.ts)
87
+ - Add `WorkflowRunStuck` interface after `WorkflowRunTimeout`
88
+ - Add to `WorkflowRunResult` union
89
+ - Add to `ChildWorkflowRunResult` union (ATOMIC with above)
90
+ - Add `stuckAbortPolicy?` and `noProgressAbortEnabled?` to `WorkflowTrigger.agentConfig`
91
+ - **Done when**: `npm run build` clean after this slice
92
+
93
+ ### Slice 2: TriggerDefinition.agentConfig (types.ts)
94
+ - Add `stuckAbortPolicy?` and `noProgressAbortEnabled?` after `maxTurns`
95
+ - **Done when**: `npm run build` clean
96
+
97
+ ### Slice 3: Runtime wiring (workflow-runner.ts)
98
+ - Add `sessionStartMs` constant after `maxTurns` resolution
99
+ - Add `stuckReason` flag after `timeoutReason` flag
100
+ - Add `writeStuckOutboxEntry` module-level helper
101
+ - Wire abort after Signal 1 emitter call in `turn_end`
102
+ - Wire abort after Signal 2 emitter call in `turn_end`
103
+ - Add stuck return path before `timeoutReason` check
104
+ - Update `makeSpawnAgentTool` resultObj type + add `stuck` arm before `assertNever`
105
+ - **Done when**: `npm run build` clean
106
+
107
+ ### Slice 4: Caller propagation (trigger-router.ts, notification-service.ts)
108
+ - Add `stuck` arm in `route()` exhaustive chain
109
+ - Add `stuck` arm in `dispatch()` exhaustive chain
110
+ - Add `'stuck'` to `NotificationPayload.outcome` union
111
+ - Add `stuck` case in `buildNotificationBody()`
112
+ - Add `stuck` case in `buildDetail()`
113
+ - **Done when**: `npm run build` clean
114
+
115
+ ### Slice 5: Tests
116
+ - Write `tests/unit/workflow-runner-stuck-escalation.test.ts` with 6 test cases
117
+ - **Done when**: all 6 tests pass, no regressions
118
+
119
+ ---
120
+
121
+ ## 8. Test Design
122
+
123
+ File: `tests/unit/workflow-runner-stuck-escalation.test.ts`
124
+
125
+ Pattern: replicate turn_end subscriber logic (same as workflow-runner-stuck-detection.test.ts).
126
+
127
+ **Test 1**: `stuckAbortPolicy: 'abort'` default -- repeated_tool_call fires, stuckReason set, abort called, would return _tag:'stuck'
128
+ **Test 2**: `stuckAbortPolicy: 'notify_only'` -- abort NOT called, emitter still fires
129
+ **Test 3**: `noProgressAbortEnabled: false` default -- no_progress does NOT set stuckReason
130
+ **Test 4**: `noProgressAbortEnabled: true` -- no_progress sets stuckReason = 'no_progress', abort called
131
+ **Test 5**: Compile-time assignability test: `WorkflowRunStuck` is assignable to `ChildWorkflowRunResult`
132
+ **Test 6**: trigger-router exhaustive switch handles 'stuck' (import trigger-router, verify no assertNever path hit)
133
+
134
+ ---
135
+
136
+ ## 9. Risk Register
137
+
138
+ | Risk | Likelihood | Severity | Mitigation |
139
+ |------|------------|----------|------------|
140
+ | ChildWorkflowRunResult not updated atomically | Low | High | Single-PR, Slice 1 includes both updates, Test 5 catches gap |
141
+ | NotificationPayload.outcome union gap | Low | Medium | Slice 4 adds 'stuck'; build catches it |
142
+ | stuckReason/timeoutReason race | Low | Low | Guard condition (both null check) |
143
+ | writeStuckOutboxEntry silent failure | Low | Low | Fire-and-forget with console.warn |
144
+
145
+ ---
146
+
147
+ ## 10. PR Packaging Strategy
148
+
149
+ Single PR: `feat/stuck-escalation`
150
+ Single atomic commit with all 4 source files + test file.
151
+ PR title: `feat(daemon): WorkflowRunStuck result variant with abort and outbox notification`
152
+
153
+ ---
154
+
155
+ ## 11. Philosophy Alignment Per Slice
156
+
157
+ | Slice | Principle | Status |
158
+ |-------|-----------|--------|
159
+ | Slice 1 (types) | Make illegal states unrepresentable | Satisfied |
160
+ | Slice 1 (types) | Exhaustiveness everywhere | Satisfied |
161
+ | Slice 1 (types) | Type safety as first line of defense | Satisfied (ChildWorkflowRunResult updated) |
162
+ | Slice 3 (runtime) | Errors are data | Satisfied |
163
+ | Slice 3 (runtime) | Determinism over cleverness | Satisfied (simple flag) |
164
+ | Slice 3 (runtime) | Fire-and-forget side effects | Satisfied (outbox write) |
165
+ | Slice 4 (callers) | Exhaustiveness everywhere | Satisfied (all assertNever guards updated) |
166
+ | Slice 5 (tests) | Prefer fakes over mocks | Satisfied (replicate subscriber logic, not vi.mock) |
167
+
168
+ ---
169
+
170
+ **unresolvedUnknownCount**: 0
171
+ **planConfidenceBand**: High
172
+ **estimatedPRCount**: 1
@@ -6395,3 +6395,89 @@ When a post-implementation MR review finds a UI/UX finding (wrong affordance, mi
6395
6395
  ### Priority
6396
6396
 
6397
6397
  Design this as part of the adaptive coordinator (#3). The `touchesUI` flag belongs on the classification output alongside `taskComplexity` and `maturity`. The UI detection logic and the design workflow insertion are both coordinator-level concerns, not engine-level.
6398
+
6399
+ ---
6400
+
6401
+ ## Current state update (Apr 20, 2026)
6402
+
6403
+ **npm version: v3.45.0**
6404
+
6405
+ ### What shipped in this session (Apr 19-20, 2026)
6406
+
6407
+ All five top-priority autonomous pipeline items shipped:
6408
+
6409
+ - ✅ **#1 -- Worktree isolation + auto-commit** (PR #630) -- Each WorkTrain coding session now runs in an isolated git worktree (`~/.workrail/worktrees/<sessionId>`). `trigger.workspacePath` is never mutated; all tool factories receive `sessionWorkspacePath`. Crash recovery sidecar persists `worktreePath` for orphan cleanup. `delivery-action.ts` asserts HEAD branch before push. `test-task` trigger: `branchStrategy: worktree`, `autoCommit: true`, `autoOpenPR: true`.
6410
+
6411
+ - ✅ **#2 -- Stuck detection escalation** (PR #636) -- New `WorkflowRunResult._tag: 'stuck'` discriminant. When `repeated_tool_call` heuristic fires and `stuckAbortPolicy !== 'notify_only'` (default: `'abort'`), daemon aborts the session immediately instead of burning the 30-min wall clock. Writes structured entry to `~/.workrail/outbox.jsonl`. `stuckAbortPolicy` and `noProgressAbortEnabled` configurable per trigger in `agentConfig`. `ChildWorkflowRunResult` updated atomically.
6412
+
6413
+ - ✅ **#3 -- Adaptive pipeline coordinator** (PR #639) -- `worktrain run pipeline --issue N --workspace path` routes tasks to the right pipeline via pure static routing:
6414
+ - dep-bump + PR number → QUICK_REVIEW (delegates to `runPrReviewCoordinator`)
6415
+ - PR/MR number → REVIEW_ONLY
6416
+ - `current-pitch.md` exists → IMPLEMENT (coding + PR + review + merge)
6417
+ - Default → FULL (discovery → shaping → coding → PR → review → merge)
6418
+ - Fix loop cap: 2 iterations max. Escalating audit chain for Critical findings. UX gate for UI-touching tasks. 6 hardcoded timeout constants. Pitch archived after IMPLEMENT/FULL completes.
6419
+
6420
+ - ✅ **#4 -- GitHub issue queue poll trigger** (PR #637) -- New `github_queue_poll` trigger provider. Polls GitHub issues matching `GitHubQueueConfig` (assignee-based MVP, `label`/`mention`/`query` typed but `not_implemented`). Maturity inference from 3 deterministic heuristics. Idempotency check (conservative: parse errors = active). JSONL decision log at `~/.workrail/queue-poll.jsonl`. `maxTotalConcurrentSessions` cap. Bot identity config (`botName`, `botEmail`).
6421
+
6422
+ - ✅ **#5 -- Context assembly layer** (PR #624, shipped earlier) -- `ContextAssembler` injects git diff summary + prior session notes before turn 1. Feeds into coordinator pre-dispatch.
6423
+
6424
+ - ✅ **Performance sweep** (all 10 issues #248-257 -- already confirmed complete)
6425
+ - ✅ **Console session tree** (PR #607 -- parentSessionId rendered in UI)
6426
+ - ✅ **Daemon file-nav tools** (PR #619) -- Glob, Grep, Edit + upgraded Read/Write with staleness guard
6427
+ - ✅ **`spawn_agent` artifacts** (PR #613) -- `lastStepArtifacts` surfaced through spawn_agent return
6428
+ - ✅ **`wr.shaping` workflow** (PR #610) -- faithful Shape Up shaping, 9 steps
6429
+ - ✅ **Coding workflow Phase 0.5** (PR #610) -- upstream context detection, three-workflow pipeline
6430
+
6431
+ ### WorkTrain current capabilities (v3.45.0)
6432
+
6433
+ **Autonomous workflow execution -- confirmed working:**
6434
+ - `worktrain run pipeline --issue N` routes to the right pipeline and runs it end-to-end
6435
+ - `worktrain run pr-review` autonomous PR review with structured verdicts and auto-merge
6436
+ - Coding sessions run in isolated worktrees, auto-commit, auto-open PR
6437
+ - Sessions abort when stuck (instead of burning 30-min wall clock)
6438
+ - GitHub issue queue polling: assign issue to `worktrain-etienneb` → daemon picks it up automatically
6439
+ - All sessions start with git diff + prior session notes injected (ContextAssembler)
6440
+ - Daemon file-nav tools: Glob, Grep, Edit, Read (paginated), Write (staleness guard)
6441
+ - Escalating audit chain: Critical findings → prod audit → re-review → escalate if still Critical
6442
+ - Fix loop: minor findings → max 2 fix iterations before escalation
6443
+
6444
+ **WorkTrain agent tool set (v3.45.0):**
6445
+ `complete_step`, `continue_workflow` (deprecated), `Bash`, `Read`, `Write`, `Glob`, `Grep`, `Edit`, `report_issue`, `spawn_agent`, `signal_coordinator`
6446
+
6447
+ **Trigger system:**
6448
+ - Generic webhook, GitLab MR polling, GitHub Issues polling, GitHub PR polling
6449
+ - **NEW: `github_queue_poll`** -- assignee-based issue queue with maturity inference
6450
+ - `branchStrategy: worktree` -- isolated worktree per session
6451
+ - `autoCommit: true` / `autoOpenPR: true` -- full delivery pipeline
6452
+ - `stuckAbortPolicy: 'abort' | 'notify_only'`
6453
+ - `goalTemplate`, `referenceUrls`, `contextMapping`, `agentConfig`
6454
+
6455
+ ### Accurate limitations (v3.45.0)
6456
+
6457
+ 1. **`dispatchAdaptivePipeline()` not yet connected** -- `TriggerRouter.dispatchAdaptivePipeline()` exists but `polling-scheduler.ts` still calls `router.dispatch()`. Queue poll sessions run as generic sessions, not routed through the adaptive coordinator. Cross-PR gap documented with TODO.
6458
+
6459
+ 2. **`findingCategory` not on review-verdict** -- Audit chain always dispatches `production-readiness-audit` for Critical findings regardless of finding type. `findingCategory` field on `findings[]` items needs to be added to `wr.review_verdict` schema as a follow-up so architecture findings can route to `architecture-scalability-audit` correctly.
6460
+
6461
+ 3. **Bot account setup required before first queue run** -- `worktrain-etienneb` GitHub account must be created, PAT generated with `repo:read` scope, stored as `WORKTRAIN_BOT_TOKEN`, and added as repo collaborator. Commit identity: `worktrain-etienneb@users.noreply.github.com`. Without this, `github_queue_poll` trigger has no bot identity.
6462
+
6463
+ 4. **No auto-merge setting in `worktrain init`** -- Auto-merge policy is hardcoded in the coordinator. Should be a `~/.workrail/config.json` setting exposed during `worktrain init`.
6464
+
6465
+ 5. **Grooming loop not built** -- Three open design decisions must be settled before building (human-ack boundary, compute budget, priority signal source). Deferred until Level 1 usage data exists.
6466
+
6467
+ 6. **Knowledge graph not wired** -- `src/knowledge-graph/` module exists (DuckDB + ts-morph), `ts-morph` in devDependencies. No daemon tool yet. Architecture decision: belongs in context assembly layer, not as a daemon tool.
6468
+
6469
+ 7. **`worktrain inbox --watch` stub** -- Prints "not yet implemented." The outbox mechanism exists; just needs a polling loop.
6470
+
6471
+ 8. **Artifact store not built** -- Agents dump markdown in the repo. `~/.workrail/artifacts/` not created.
6472
+
6473
+ ### Next priorities (groomed Apr 20)
6474
+
6475
+ 1. **Connect `dispatchAdaptivePipeline()`** -- Wire `polling-scheduler.ts` to call `TriggerRouter.dispatchAdaptivePipeline()` when `context.taskCandidate` is present. Small change, unlocks the full autonomous queue → pipeline connection.
6476
+
6477
+ 2. **`findingCategory` on review-verdict schema** -- Add `findingCategory: 'correctness' | 'security' | 'architecture' | 'ux' | 'performance' | 'testing'` to `findings[]` in `ReviewVerdictArtifactV1Schema`. Update `mr-review-workflow-agentic` final step to emit it. Unlocks correct audit routing.
6478
+
6479
+ 3. **Bot account setup + `worktrain init` overhaul** -- Create `worktrain-etienneb`, add `worktrain daemon --check` command (API key + git fetch dry run), expose auto-merge policy in `worktrain init`.
6480
+
6481
+ 4. **Level 1 usage: run WorkTrain on its own backlog** -- Create `worktrain:ready` issues for the top 10 ready tasks, assign to `worktrain-etienneb`, observe one full queue → pipeline run. Collect data on misclassifications and weak PRs before designing the grooming loop.
6482
+
6483
+ 5. **`worktrain inbox --watch`** -- Close the notification loop. Outbox exists, just needs the polling implementation.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@exaudeus/workrail",
3
- "version": "3.44.0",
3
+ "version": "3.46.0",
4
4
  "description": "Step-by-step workflow enforcement for AI agents via MCP",
5
5
  "license": "MIT",
6
6
  "repository": {