@exaudeus/workrail 3.41.0 → 3.43.0

Files changed (63)
  1. package/dist/cli-worktrain.js +40 -11
  2. package/dist/console-ui/assets/{index-CQt4UhPB.js → index-Sb57DW4B.js} +1 -1
  3. package/dist/console-ui/index.html +1 -1
  4. package/dist/context-assembly/deps.d.ts +8 -0
  5. package/dist/context-assembly/deps.js +2 -0
  6. package/dist/context-assembly/index.d.ts +6 -0
  7. package/dist/context-assembly/index.js +50 -0
  8. package/dist/context-assembly/infra.d.ts +3 -0
  9. package/dist/context-assembly/infra.js +154 -0
  10. package/dist/context-assembly/types.d.ts +30 -0
  11. package/dist/context-assembly/types.js +2 -0
  12. package/dist/coordinators/pr-review.d.ts +3 -1
  13. package/dist/coordinators/pr-review.js +25 -4
  14. package/dist/daemon/workflow-runner.d.ts +11 -1
  15. package/dist/daemon/workflow-runner.js +82 -9
  16. package/dist/domain/execution/state.d.ts +6 -6
  17. package/dist/manifest.json +76 -44
  18. package/dist/mcp/handlers/v2-workflow.d.ts +2 -2
  19. package/dist/mcp/output-schemas.d.ts +234 -234
  20. package/dist/mcp/tools.d.ts +2 -2
  21. package/dist/mcp/v2/tools.d.ts +24 -24
  22. package/dist/trigger/delivery-action.d.ts +2 -0
  23. package/dist/trigger/delivery-action.js +24 -0
  24. package/dist/trigger/trigger-router.js +24 -1
  25. package/dist/trigger/trigger-store.js +42 -0
  26. package/dist/trigger/types.d.ts +3 -0
  27. package/dist/v2/durable-core/schemas/artifacts/assessment.d.ts +2 -2
  28. package/dist/v2/durable-core/schemas/artifacts/coordinator-signal.d.ts +2 -2
  29. package/dist/v2/durable-core/schemas/artifacts/loop-control.d.ts +6 -6
  30. package/dist/v2/durable-core/schemas/artifacts/review-verdict.d.ts +6 -6
  31. package/dist/v2/durable-core/schemas/compiled-workflow/index.d.ts +56 -56
  32. package/dist/v2/durable-core/schemas/execution-snapshot/blocked-snapshot.d.ts +83 -83
  33. package/dist/v2/durable-core/schemas/execution-snapshot/execution-snapshot.v1.d.ts +1024 -1024
  34. package/dist/v2/durable-core/schemas/export-bundle/index.d.ts +2336 -2336
  35. package/dist/v2/durable-core/schemas/session/dag-topology.d.ts +6 -6
  36. package/dist/v2/durable-core/schemas/session/events.d.ts +339 -339
  37. package/dist/v2/durable-core/schemas/session/gaps.d.ts +30 -30
  38. package/dist/v2/durable-core/schemas/session/manifest.d.ts +6 -6
  39. package/dist/v2/durable-core/schemas/session/outputs.d.ts +8 -8
  40. package/dist/v2/durable-core/schemas/session/validation-event.d.ts +3 -3
  41. package/docs/design/adaptive-coordinator-context-candidates.md +265 -0
  42. package/docs/design/adaptive-coordinator-context-review.md +101 -0
  43. package/docs/design/adaptive-coordinator-context.md +504 -0
  44. package/docs/design/adaptive-coordinator-routing-candidates.md +340 -0
  45. package/docs/design/adaptive-coordinator-routing-design-review.md +135 -0
  46. package/docs/design/adaptive-coordinator-routing-review.md +156 -0
  47. package/docs/design/adaptive-coordinator-routing.md +660 -0
  48. package/docs/design/context-assembly-design-candidates.md +199 -0
  49. package/docs/design/context-assembly-implementation-plan.md +211 -0
  50. package/docs/design/context-assembly-layer-design-review.md +110 -0
  51. package/docs/design/context-assembly-layer.md +622 -0
  52. package/docs/design/context-assembly-review-findings.md +112 -0
  53. package/docs/design/stuck-escalation-candidates.md +176 -0
  54. package/docs/design/stuck-escalation-design-review.md +70 -0
  55. package/docs/design/stuck-escalation.md +326 -0
  56. package/docs/design/worktrain-task-queue-candidates.md +252 -0
  57. package/docs/design/worktrain-task-queue-design-review.md +109 -0
  58. package/docs/design/worktrain-task-queue.md +443 -0
  59. package/docs/design/worktree-review-findings-candidates.md +101 -0
  60. package/docs/design/worktree-review-findings-design-review.md +65 -0
  61. package/docs/design/worktree-review-findings-implementation-plan.md +153 -0
  62. package/docs/ideas/backlog.md +212 -0
  63. package/package.json +3 -3
@@ -0,0 +1,101 @@
1
+ # Worktree Review Findings - Design Candidates
2
+
3
+ ## Problem Understanding
4
+
5
+ ### Core Tensions
6
+
7
+ 1. **Cleanup location vs result completeness**: `runWorkflow()` knows when a session succeeds; `trigger-router` knows when delivery completes. Cleanup must happen after delivery, but the result type must carry enough context for delivery to work -- hence `sessionWorkspacePath` in `WorkflowRunSuccess`. The existing architecture already threads this context; the bug is that runWorkflow() also tries to clean up before returning, racing with the delivery.
8
+
9
+ 2. **Crash-safety vs orphan-free**: Worktree path must be persisted before any crash could make it untracked. The `if (startContinueToken)` guard on the second `persistTokens()` call means that if `startContinueToken` is falsy at worktree creation time, the sidecar never records the worktree path -- creating an untracked orphan if the process crashes.
10
+
11
+ 3. **Type safety vs path coupling**: `sessionId` is currently extracted via `result.sessionWorkspacePath.split('/').at(-1)` -- a fragile string operation that couples branch-naming convention (UUID in path) to the calling code. Threading sessionId as a typed field on `WorkflowRunSuccess` eliminates this coupling.
12
+
13
+ 4. **Fail-fast validation vs runtime discovery**: Validating `branchPrefix`/`baseBranch` at parse time (trigger-store) produces a clear config error. Waiting until worktree creation produces a cryptic `git checkout` error deep in the session setup.
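To make tension 3 concrete, here is a minimal sketch of why deriving the session ID from the path is fragile. `sessionIdFromPath` is a hypothetical helper that mirrors the `.split('/').at(-1)` call-site logic; it is not a function in the codebase.

```typescript
// Hypothetical helper mirroring the current call-site logic:
// take the last '/'-separated segment of the worktree path.
function sessionIdFromPath(sessionWorkspacePath: string): string {
  const parts = sessionWorkspacePath.split('/');
  return parts[parts.length - 1] ?? '';
}

// Works only while the path ends in the session ID.
const fromCleanPath = sessionIdFromPath('/worktrees/abc-session');

// A trailing slash (or any change to the path layout) silently yields
// an empty or unrelated string instead of a type error.
const fromTrailingSlash = sessionIdFromPath('/worktrees/abc-session/');
```

A typed `sessionId` field removes this dependency on the path layout entirely.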
14
+
15
+ ### What Makes This Hard
16
+
17
+ The key insight for the CRITICAL bug: the cleanup code at trigger-router.ts lines 365-377 is inside `maybeRunDelivery()`, but `maybeRunDelivery()` returns early (line 293) when `autoCommit !== true`. This means worktree sessions with `autoCommit: false` would accumulate orphan worktrees if the runWorkflow() cleanup is removed without a compensating change. The review accepts this -- startup recovery (24h threshold) handles the edge case.
18
+
19
+ For Minor 1, the `persistTokens()` function already handles `worktreePath?: string` (omits the field when undefined). The guard `if (startContinueToken)` was added to avoid writing a blank token to the sidecar, but it incorrectly prevents worktreePath from being persisted when the token is falsy. The fix must decouple the worktreePath persistence from the token presence check.
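The following sketch illustrates why the sidecar must always record `worktreePath`: a recovery pass can only reap worktrees it knows about. The `SidecarEntry` shape, the `updatedAtMs` field, and `worktreesToReap` are assumptions made for illustration; only the 24h threshold comes from the text above, and the real `runStartupRecovery` may work differently.

```typescript
// 24h threshold, as described for startup recovery.
const ORPHAN_THRESHOLD_MS = 24 * 60 * 60 * 1000;

// Hypothetical sidecar shape; field names are illustrative only.
interface SidecarEntry {
  readonly sessionId: string;
  readonly worktreePath?: string;
  readonly updatedAtMs: number;
}

// Returns the worktree paths that are old enough to be removed.
function worktreesToReap(entries: readonly SidecarEntry[], nowMs: number): string[] {
  const paths: string[] = [];
  for (const entry of entries) {
    // An entry without worktreePath cannot be reaped: this is the
    // untracked orphan that the `if (startContinueToken)` guard can create.
    if (entry.worktreePath === undefined) continue;
    if (nowMs - entry.updatedAtMs >= ORPHAN_THRESHOLD_MS) {
      paths.push(entry.worktreePath);
    }
  }
  return paths;
}
```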
20
+
21
+ ## Philosophy Constraints
22
+
23
+ From CLAUDE.md:
24
+ - **Architectural fixes over patches**: Move cleanup to the correct layer (trigger-router), not patch runWorkflow().
25
+ - **Errors are data**: Use `TriggerStoreError` with `kind: 'invalid_field_value'` for validation failures.
26
+ - **Make illegal states unrepresentable**: `sessionId?: string` on `WorkflowRunSuccess` makes path-parsing unnecessary.
27
+ - **Explicit domain types**: typed sessionId instead of stringly-typed split.
28
+ - **Validate at boundaries**: branchPrefix/baseBranch validation belongs at parse time, not at worktree creation.
29
+ - **Document 'why'**: JSDoc on makeSpawnAgentTool must explain the architectural reason for branchStrategy:'none'.
30
+
31
+ No philosophy conflicts detected.
32
+
33
+ ## Impact Surface
34
+
35
+ - **WorkflowRunSuccess interface**: Adding optional `sessionId?: string` is additive. Immediate-complete path (line 3062) must also be updated to include sessionId when applicable.
36
+ - **trigger-router.ts maybeRunDelivery()**: Line 321 changes from `.split('/').at(-1)` to `result.sessionId`. No interface contract changes for callers of TriggerRouter.
37
+ - **trigger-store.ts**: New validation added before existing branchStrategy/baseBranch/branchPrefix are assembled into the trigger. No changes to the TriggerDefinition shape.
38
+ - **spawn_agent tool**: JSDoc addition only -- no behavior change, no callers affected.
39
+ - **persistTokens()**: No signature change. Guard removal makes the second call unconditional.
40
+
41
+ ## Candidates
42
+
43
+ ### Candidate A: Follow Review Verbatim (Recommended)
44
+
45
+ **Summary**: Apply all 7 findings exactly as specified, accepting that non-autoCommit worktree sessions (a rare/unlikely combination) have worktrees cleaned up by runStartupRecovery after 24h.
46
+
47
+ **Tensions resolved**:
48
+ - CRITICAL: delivery no longer races with worktree removal
49
+ - Minor 2: sessionId no longer requires path parsing
50
+ - Minor 3: validation catches bad git chars at daemon startup
51
+
52
+ **Tensions accepted**:
53
+ - Non-autoCommit worktree sessions accumulate for up to 24h before startup recovery cleans them
54
+
55
+ **Boundary**: runWorkflow() owns session execution; trigger-router owns delivery lifecycle including post-delivery cleanup.
56
+
57
+ **Failure mode**: If a worktree session has autoCommit=false (unusual -- why use worktree isolation without autoCommit?), the worktree persists for 24h. Acceptable given startup recovery already handles this.
58
+
59
+ **Repo-pattern relationship**: Follows. The `sessionWorkspacePath` threading pattern, `TriggerStoreError` validation, and startup recovery cleanup are all existing patterns.
60
+
61
+ **Gains**: Minimal diff, matches review intent exactly, no new abstractions.
62
+
63
+ **Losses**: Minor 24h worktree leak for non-autoCommit sessions.
64
+
65
+ **Scope**: Best-fit.
66
+
67
+ **Philosophy**: Honors architectural fixes over patches, errors-as-data, explicit domain types, validate at boundaries.
68
+
69
+ ### Candidate B: Move Cleanup to Queue Callback
70
+
71
+ **Summary**: Move worktree cleanup out of `maybeRunDelivery()` to the queue callback that orchestrates `runWorkflow()` + `maybeRunDelivery()`, so cleanup always runs regardless of autoCommit.
72
+
73
+ **Tensions resolved**: Worktree leak for non-autoCommit sessions eliminated.
74
+
75
+ **Tensions accepted**: More invasive change, modifies both trigger-router internals and cleanup location.
76
+
77
+ **Failure mode**: Cleanup logic now in two places (maybeRunDelivery for autoCommit=true sessions, queue callback for all). Harder to reason about.
78
+
79
+ **Scope**: Too broad. Review doesn't ask for this, and it changes the cleanup location the review identifies as correct.
80
+
81
+ **Philosophy conflict**: YAGNI with discipline -- adding complexity without evidence the non-autoCommit+worktree combination is a real use case.
82
+
83
+ ## Comparison and Recommendation
84
+
85
+ **Recommendation: Candidate A**
86
+
87
+ The review is the upstream spec. It explicitly says "The cleanup in `maybeRunDelivery()` (in trigger-router) is the architecturally correct location and should be the sole success-path removal." Candidate A follows this exactly. The 24h cleanup window for the edge case is handled by an existing mechanism (runStartupRecovery).
88
+
89
+ ## Self-Critique
90
+
91
+ **Strongest counter-argument**: Moving cleanup out of runWorkflow() creates a window where the process crashes between runWorkflow() returning and maybeRunDelivery() cleaning up -- leaving an orphan. But startup recovery already handles this case, and the review explicitly accepts this tradeoff.
92
+
93
+ **Pivot condition**: If evidence emerges that branchStrategy='worktree' without autoCommit is a common pattern, Candidate B becomes justified.
94
+
95
+ **Invalidating assumption**: The recommendation fails if the review misidentified the cleanup location in trigger-router as correct. However, the comment at lines 355-357 of trigger-router.ts is the author's own documentation of the invariant, which makes the review's position self-consistent.
96
+
97
+ ## Open Questions for Main Agent
98
+
99
+ 1. When implementing Minor 1: should the second `persistTokens()` call use `startContinueToken ?? ''` (write empty string) or `currentContinueToken` (same value at that point)? Both work since startup recovery handles malformed sidecars. Prefer `startContinueToken ?? ''` to be explicit about the fallback.
100
+
101
+ 2. The immediate-complete path at line 3062 returns `{ _tag: 'success', workflowId: trigger.workflowId, stopReason: 'stop' }` without `sessionWorkspacePath`. Should it also include `sessionId` and `sessionWorkspacePath`? Yes -- if a single-step workflow with branchStrategy='worktree' completes immediately, delivery still needs to run from the worktree.
@@ -0,0 +1,65 @@
1
+ # Worktree Review Findings - Design Review
2
+
3
+ ## Tradeoff Review
4
+
5
+ | Tradeoff | Acceptance Criteria Impact | Hidden Assumptions | Verdict |
+ |---|---|---|---|
+ | 24h orphan window for non-autoCommit worktree sessions | None -- startup recovery handles this | Daemon restarts at least once per 24h | Acceptable |
+ | Empty string token fallback in persistTokens() | None -- sidecar still tracks worktreePath for orphan recovery | startContinueToken is always set before worktree creation (verified in code flow) | Acceptable |
+ | sessionId absent for spawn_agent child sessions | None -- children never use branchStrategy:'worktree' | No caller reads WorkflowRunSuccess.sessionId except the one being updated | Acceptable |
10
+
11
+ ## Failure Mode Review
12
+
13
+ | Failure Mode | Handled By | Missing Mitigation | Risk |
+ |---|---|---|---|
+ | Crash after runWorkflow() returns, before maybeRunDelivery() cleans up | Startup recovery (24h) | None needed | Low |
+ | maybeRunDelivery() fails partway | Cleanup runs regardless of deliveryResult._tag | None | Low |
+ | startContinueToken genuinely undefined at worktree creation | persistTokens() still writes worktreePath; sidecar cleaned on next start | None | Low (theoretical only) |
+ | Regex rejects valid but unusual git branch name | Fail-fast with clear config error | None -- review specifies this regex | Low |
19
+
20
+ ## Runner-Up / Simpler Alternative Review
21
+
22
+ - Runner-up (cleanup in queue callback): not worth borrowing -- review explicitly identifies maybeRunDelivery() as the correct cleanup location.
23
+ - Simpler variants (skip Minor 2 or Minor 3): not acceptable -- each finding has a specific correctness justification, not just cosmetic preference.
24
+ - No hybrid opportunities identified.
25
+
26
+ ## Philosophy Alignment
27
+
28
+ All 7 fixes align with CLAUDE.md principles:
29
+ - Architectural fix: cleanup moved to correct layer
30
+ - Errors-as-data: TriggerStoreError for validation
31
+ - Make illegal states unrepresentable: sessionId as typed field
32
+ - Validate at boundaries: branchPrefix/baseBranch at parse time
33
+ - Document 'why': JSDoc on makeSpawnAgentTool
34
+ - YAGNI: only the 7 specified fixes implemented
35
+
36
+ No philosophy conflicts.
37
+
38
+ ## Findings
39
+
40
+ ### Yellow: Immediate-Complete Path Missing sessionWorkspacePath/sessionId
41
+
42
+ The review asks to fix both the success path AND the immediate-complete path for the CRITICAL bug (remove worktree cleanup). But the current immediate-complete return at line 3062 also lacks `sessionWorkspacePath` and `sessionId` spreading. Without these, a single-step workflow with branchStrategy='worktree' would return success with no delivery context, and maybeRunDelivery() would use trigger.workspacePath (wrong directory) for delivery.
43
+
44
+ **Severity**: Yellow. The review mentions fixing both paths for cleanup removal, but doesn't explicitly call out the missing return fields. However, omitting them would make the cleanup fix incomplete for the immediate-complete case.
45
+
46
+ **Recommended fix**: Add the same spreading pattern used in the main success return to the immediate-complete return:
47
+ ```typescript
+ return {
+   _tag: 'success',
+   workflowId: trigger.workflowId,
+   stopReason: 'stop',
+   ...(sessionWorktreePath !== undefined ? { sessionWorkspacePath: sessionWorktreePath } : {}),
+   ...(sessionWorktreePath !== undefined ? { sessionId } : {}),
+ };
+ ```
56
+
57
+ ## Recommended Revisions
58
+
59
+ 1. **Apply Yellow finding**: Add sessionWorkspacePath and sessionId to the immediate-complete return at line 3062 when sessionWorktreePath is defined.
60
+ 2. All 7 original review findings: apply as specified.
61
+
62
+ ## Residual Concerns
63
+
64
+ - The 24h orphan window for non-autoCommit worktree sessions is accepted. If this pattern becomes common in production, consider adding explicit cleanup in the queue callback.
65
+ - The regex for branchPrefix/baseBranch is slightly narrower than git's full rules. This is intentional (clear config errors > cryptic git failures) and matches the review spec.
@@ -0,0 +1,153 @@
1
+ # Worktree Review Findings - Implementation Plan
2
+
3
+ ## Problem Statement
4
+
5
+ PR #630 (`feat/worktree-auto-commit`) has 7 MR review findings (1 critical, 2 major, 4 minor) that must be resolved before merge. The critical bug causes delivery to fail with "not a git repository" because `runWorkflow()` deletes the worktree before `maybeRunDelivery()` runs.
6
+
7
+ ## Acceptance Criteria
8
+
9
+ 1. `runWorkflow()` does NOT remove the worktree on the success path or immediate-complete path.
10
+ 2. `makeSpawnAgentTool()` has a JSDoc comment documenting that child sessions always use `branchStrategy: 'none'`.
11
+ 3. `WorkflowRunSuccess` has a `readonly sessionId?: string` field.
12
+ 4. `runWorkflow()` sets `sessionId` in the success return when `branchStrategy === 'worktree'`.
13
+ 5. `trigger-router.ts` reads `result.sessionId` instead of `result.sessionWorkspacePath.split('/').at(-1)`.
14
+ 6. `trigger-store.ts` validates `branchPrefix` and `baseBranch` against `/^[a-zA-Z0-9._/-]+$/` and rejects values starting with `-`.
15
+ 7. `tests/unit/trigger-router.test.ts` has a test verifying delivery uses the worktree path.
16
+ 8. `npm run build` compiles clean.
17
+ 9. `npx vitest run` shows no regressions.
18
+ 10. `persistTokens()` is called unconditionally after worktree creation (not gated on `startContinueToken`).
19
+ 11. Immediate-complete path return includes `sessionWorkspacePath` and `sessionId` when `sessionWorktreePath !== undefined`.
20
+
21
+ ## Non-Goals
22
+
23
+ - Do NOT touch `src/mcp/` in any way.
24
+ - Do NOT change delivery logic in `delivery-action.ts`.
25
+ - Do NOT change the cleanup location in `maybeRunDelivery()` (lines 365-377 in trigger-router.ts) -- this is correct.
26
+ - Do NOT add new abstractions or dependencies.
27
+ - Do NOT change workflow definitions or schema files.
28
+
29
+ ## Philosophy-Driven Constraints
30
+
31
+ - Use `TriggerStoreError` with `kind: 'invalid_field_value'` for validation errors (errors-as-data).
32
+ - `WorkflowRunSuccess.sessionId` must be `readonly` (immutability by default).
33
+ - JSDoc must explain WHY, not just what (document 'why' principle).
34
+ - Validation must happen at the boundary (trigger-store parse time), not at worktree creation time.
35
+ - Architectural fix: cleanup moves to the correct layer, not patched at the symptom.
36
+
37
+ ## Invariants
38
+
39
+ 1. Worktree must exist until `maybeRunDelivery()` completes; `runWorkflow()` must NOT remove it on any success path.
40
+ 2. `persistTokens()` must always record `worktreePath` immediately after worktree creation (not conditional on token presence).
41
+ 3. The `sessionId` field on `WorkflowRunSuccess` must never require path parsing at the call site.
42
+ 4. `branchPrefix` and `baseBranch` must be validated before use (fail-fast at daemon startup).
43
+
44
+ ## Selected Approach
45
+
46
+ Follow review verbatim, with one additional fix: the immediate-complete return path (line 3062) must also include `sessionWorkspacePath` and `sessionId` when a worktree was created (this was missing and discovered during design review).
47
+
48
+ ## Vertical Slices
49
+
50
+ ### Slice 1: CRITICAL -- Remove Premature Worktree Removal
51
+ **File**: `src/daemon/workflow-runner.ts`
52
+ **Changes**:
53
+ - Remove the `if (sessionWorktreePath)` cleanup block at lines 3049-3058 (immediate-complete path).
54
+ - Add `sessionWorkspacePath` and `sessionId` spread to the immediate-complete return at line 3062.
55
+ - Remove the `// ---- Remove worktree on success ----` comment and `if (sessionWorktreePath)` block at lines 3502-3514 (success path).
56
+
57
+ **Done when**: `runWorkflow()` returns without any `execFileAsync('git', ['-C', ..., 'worktree', 'remove', ...])` calls on the success path. The cleanup in `maybeRunDelivery()` (`trigger-router.ts` lines 365-377, documented by the comment at lines 355-357) remains the sole worktree removal on the success path.
58
+
59
+ ### Slice 2: MAJOR -- JSDoc on makeSpawnAgentTool
60
+ **File**: `src/daemon/workflow-runner.ts`
61
+ **Changes**:
62
+ - Add a JSDoc comment block immediately before `export function makeSpawnAgentTool(` (line 2009).
63
+ - Content: "Child sessions spawned by this tool always have `branchStrategy: 'none'` -- they operate in the parent's workspace without their own worktree or feature branch. Coordinators that need isolated child sessions should dispatch them via `TriggerRouter.dispatch()` instead."
64
+
65
+ **Done when**: JSDoc is present and describes the branchStrategy limitation.
66
+
67
+ ### Slice 3: Minor 1 -- Unconditional persistTokens After Worktree Creation
68
+ **File**: `src/daemon/workflow-runner.ts`
69
+ **Changes**:
70
+ - Remove the `if (startContinueToken)` guard from the second `persistTokens()` call (lines 3020-3022).
71
+ - Replace with an unconditional call: `await persistTokens(sessionId, startContinueToken ?? currentContinueToken, startCheckpointToken, sessionWorktreePath);`
72
+
73
+ **Done when**: `persistTokens()` is called unconditionally after worktree creation, ensuring `worktreePath` is always written to the sidecar.
74
+
75
+ ### Slice 4: Minor 2 -- Thread sessionId Through WorkflowRunSuccess
76
+ **Files**: `src/daemon/workflow-runner.ts`, `src/trigger/trigger-router.ts`
77
+ **Changes in workflow-runner.ts**:
78
+ - Add `readonly sessionId?: string` to `WorkflowRunSuccess` interface (after `sessionWorkspacePath`).
79
+ - In the main success return (line 3526), add `...(sessionWorktreePath !== undefined ? { sessionId } : {})` (where `sessionId` is the process-local UUID already in scope).
80
+ - In the immediate-complete return (line 3062), add `...(sessionWorktreePath !== undefined ? { sessionId } : {})` alongside `sessionWorkspacePath`.
81
+
82
+ **Changes in trigger-router.ts**:
83
+ - Line 321: Replace `result.sessionWorkspacePath.split('/').at(-1) ?? ''` with `result.sessionId ?? ''`.
84
+
85
+ **Done when**: `WorkflowRunSuccess.sessionId` is set when `branchStrategy === 'worktree'` and trigger-router reads it directly without path manipulation.
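A minimal sketch of Slice 4, assuming a trimmed-down `WorkflowRunSuccess` (the real interface has more fields). `buildSuccess` and `sessionIdForDelivery` are hypothetical helpers used only to show the conditional spread and the simplified call site; they are not functions in the codebase.

```typescript
// Simplified shape for illustration; the real interface has more fields.
interface WorkflowRunSuccess {
  readonly _tag: 'success';
  readonly workflowId: string;
  readonly stopReason: string;
  readonly sessionWorkspacePath?: string;
  readonly sessionId?: string;
}

// Hypothetical helper: both optional fields are set only when a worktree exists.
function buildSuccess(
  workflowId: string,
  sessionId: string,
  sessionWorktreePath: string | undefined,
): WorkflowRunSuccess {
  return {
    _tag: 'success',
    workflowId,
    stopReason: 'stop',
    ...(sessionWorktreePath !== undefined
      ? { sessionWorkspacePath: sessionWorktreePath, sessionId }
      : {}),
  };
}

// Call site: read the typed field instead of parsing the path.
function sessionIdForDelivery(result: WorkflowRunSuccess): string {
  return result.sessionId ?? '';
}
```

Because both fields are gated on the same condition, a result never carries a `sessionId` without the matching `sessionWorkspacePath`.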
86
+
87
+ ### Slice 5: Minor 3 -- Validate git-safe chars for branchPrefix/baseBranch
88
+ **File**: `src/trigger/trigger-store.ts`
89
+ **Changes**:
90
+ - After lines 867-868 where `baseBranch` and `branchPrefix` are extracted, add regex validation.
91
+ - For each non-undefined value, check `/^[a-zA-Z0-9._/-]+$/` and that it does not start with `-`.
92
+ - Return `err({ kind: 'invalid_field_value', field: '...', triggerId: rawId })` on failure.
93
+
94
+ **Done when**: A trigger with `branchPrefix: '--bad'` or `baseBranch: '-main'` fails at parse time with `kind: 'invalid_field_value'`.
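A minimal sketch of the parse-time check in Slice 5. The `Result` type and `validateGitRefField` are hypothetical stand-ins, not the actual helpers in `trigger-store.ts`; only the regex, the leading `-` rule, and the `invalid_field_value` kind come from the plan.

```typescript
// Hypothetical, simplified error and result shapes for illustration only.
type TriggerStoreError = {
  kind: 'invalid_field_value';
  field: string;
  triggerId: string;
};
type Result<T> = { ok: true; value: T } | { ok: false; error: TriggerStoreError };

// Character allowlist taken from the plan.
const GIT_SAFE = /^[a-zA-Z0-9._/-]+$/;

function validateGitRefField(
  value: string | undefined,
  field: 'branchPrefix' | 'baseBranch',
  triggerId: string,
): Result<string | undefined> {
  // Absent fields are allowed; only present values are checked.
  if (value === undefined) return { ok: true, value };
  // A leading '-' could be interpreted by git as a command-line option.
  if (!GIT_SAFE.test(value) || value.charAt(0) === '-') {
    return { ok: false, error: { kind: 'invalid_field_value', field, triggerId } };
  }
  return { ok: true, value };
}
```

Note that the allowlist alone would accept `--bad`, since `-` is a permitted character; the separate leading-character check is what rejects it.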
95
+
96
+ ### Slice 6: Minor 4 -- Add End-to-End Delivery Test for branchStrategy:worktree
97
+ **File**: `tests/unit/trigger-router.test.ts`
98
+ **Changes**:
99
+ - Add a test in the `describe('delivery wiring (autoCommit)')` block.
100
+ - The test creates a `WorkflowRunSuccess` with `sessionWorkspacePath: '/worktrees/test-session-id'` and valid `lastStepNotes`.
101
+ - Stubs `runWorkflowFn` to return this success result.
102
+ - Verifies the first git call uses `/worktrees/test-session-id` as the working directory (not trigger.workspacePath).
103
+
104
+ **Done when**: Test passes and verifies `execFn` is called with the worktree path.
105
+
106
+ ## Test Design
107
+
108
+ ### Existing Tests to Verify Unchanged
109
+ - `tests/unit/trigger-router.test.ts` -- all existing tests must still pass.
110
+ - `tests/unit/trigger-store.test.ts` -- all existing validation tests must still pass.
111
+
112
+ ### New Test (Slice 6)
113
+ ```
+ describe('delivery wiring (autoCommit)')
+   it('uses sessionWorkspacePath as working directory when runWorkflow returns a worktree session')
+     - trigger: { autoCommit: true, branchStrategy: 'worktree', workspacePath: '/workspace' }
+     - runWorkflowFn returns: { _tag: 'success', sessionWorkspacePath: '/worktrees/abc-session', lastStepNotes: VALID_HANDOFF_NOTES }
+     - fakeExec: vi.fn().mockResolvedValue(...)
+     - assertion: fakeExec called; first git add call uses cwd '/worktrees/abc-session'
+ ```
121
+
122
+ ## Risk Register
123
+
124
+ | Risk | Likelihood | Impact | Mitigation |
+ |---|---|---|---|
+ | `startContinueToken` is undefined in practice when branchStrategy='worktree' | Very Low | Low | persistTokens writes '' as fallback; startup recovery handles it |
+ | Removing cleanup breaks non-autoCommit worktree sessions | Low | Low | Startup recovery reaps after 24h; combination is unusual |
+ | `sessionId` field name collision with WorkRail server sessionId | Low | Low | Field is optional; no ambiguity since it's typed on the interface |
129
+
130
+ ## PR Packaging Strategy
131
+
132
+ All changes on existing branch `feat/worktree-auto-commit`. Single PR #630.
133
+
134
+ Commit message: `fix(daemon): address worktree review findings -- move success cleanup, document spawn_agent limitation, thread sessionId, validate git-safe chars`
135
+
136
+ ## Philosophy Alignment
137
+
138
+ | Principle | Slice | Status |
+ |---|---|---|
+ | Architectural fixes over patches | Slice 1 | Satisfied -- cleanup moved to correct layer |
+ | Errors are data | Slice 5 | Satisfied -- TriggerStoreError returned |
+ | Make illegal states unrepresentable | Slice 4 | Satisfied -- typed sessionId, no path-parsing |
+ | Validate at boundaries | Slice 5 | Satisfied -- parse-time validation |
+ | Document 'why' | Slice 2 | Satisfied -- JSDoc explains architectural reason |
+ | Immutability by default | Slice 4 | Satisfied -- readonly field added |
+ | YAGNI | All | Satisfied -- no new abstractions |
147
+
148
+ ## Open Questions
149
+
150
+ None. All questions resolved during design.
151
+
152
+ ## Unresolved Unknown Count: 0
153
+ ## Plan Confidence Band: High
@@ -6183,3 +6183,215 @@ The daemon tool approach is only better for ad-hoc mid-session queries the agent
6183
6183
  ### Anti-pattern to avoid
6184
6184
 
6185
6185
  Adding knowledge graph calls directly into `pr-review.ts` or any other coordinator script. That immediately creates the god class we're trying to avoid and couples the orchestration layer to a specific context source.
6186
+
6187
+ ---
6188
+
6189
+ ## Scheduled tasks (Apr 19, 2026)
6190
+
6191
+ **The idea:** WorkTrain runs tasks on a schedule -- not triggered by an external event, but by time. "Every Monday morning, run the code health scan." "Every night at 2am, check for new GitHub issues and triage them." "First of the month, run the production readiness audit."
6192
+
6193
+ ### Why this matters for the autonomous pipeline vision
6194
+
6195
+ The full autonomous pipeline (prioritize → discover → shape → implement → test → PR → review → fix → merge) needs a way to start without a human pushing a button. Scheduled tasks are the trigger layer for proactive, time-driven work. Without them, WorkTrain is purely reactive -- it only acts when a webhook fires or a human dispatches it.
6196
+
6197
+ ### What exists today
6198
+
6199
+ The trigger system (`src/trigger/`) supports `generic` (webhook) and polling providers (`gitlab_poll`, `github_issues_poll`, `github_prs_poll`). There is no native cron/schedule provider. The workaround today is OS crontab calling `curl` to fire a webhook.
6200
+
6201
+ ### What to build
6202
+
6203
+ A `schedule` provider in triggers.yml:
6204
+
6205
+ ```yaml
+ triggers:
+   - id: weekly-code-health
+     provider: schedule
+     cron: "0 9 * * 1"  # every Monday at 9am
+     workflowId: architecture-scalability-audit
+     workspacePath: /path/to/repo
+     goal: "Run weekly code health scan -- identify coupling violations, complexity hotspots, and performance anti-patterns introduced this week"
+
+   - id: nightly-issue-triage
+     provider: schedule
+     cron: "0 2 * * *"  # every night at 2am
+     workflowId: wr.discovery
+     workspacePath: /path/to/repo
+     goal: "Review open GitHub issues created in the last 24 hours and triage them: classify severity, identify duplicates, suggest which to prioritize"
+
+   - id: backlog-next-task
+     provider: schedule
+     cron: "0 8 * * 1-5"  # weekday mornings at 8am
+     workflowId: coding-task-workflow-agentic
+     workspacePath: /path/to/repo
+     goal: "Pick the highest-priority unstarted task from docs/ideas/backlog.md and implement it"
+ ```
6228
+
6229
+ ### Key design decisions
6230
+
6231
+ - **Cron syntax**: standard 5-field cron (`min hour dom month dow`). Parsed by `node-cron` or equivalent -- already a pattern in the codebase (backlog mentions cron).
6232
+ - **Timezone**: configurable per trigger, defaults to system timezone. Important for "weekday morning" schedules that need to fire in the user's timezone.
6233
+ - **Missed runs**: if the daemon was down when a scheduled run should have fired, it does NOT catch up on missed runs by default. "Run at 9am Monday" means "run the next time 9am Monday arrives." Optional `catchUp: true` flag for cases where missing a run should be recovered.
6234
+ - **Overlap prevention**: if a scheduled run fires while the previous run is still active, it should be skipped (not queued). A `coding-task` that takes 2 hours should not spawn a second instance at the next cron tick.
6235
+ - **Manual trigger**: `worktrain run schedule <trigger-id>` to fire a scheduled trigger immediately without waiting for the cron time. Useful for testing.
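The missed-run and overlap rules above can be sketched as a single pure decision function. Everything here is hypothetical: the `ScheduleState` shape, the `Decision` names, and the `toleranceMs` grace window are assumptions, and `dueAt` is assumed to be the most recent scheduled fire time as computed by whichever cron library is chosen.

```typescript
// Hypothetical per-trigger state; field names are illustrative.
interface ScheduleState {
  readonly lastFiredAt?: number; // epoch ms of the last run that was started
  readonly running: boolean;     // true while the previous run is still active
}

type Decision = 'fire' | 'skip_overlap' | 'skip_missed' | 'skip_already_fired';

// dueAt: most recent scheduled fire time at or before now (epoch ms).
// toleranceMs: assumed grace window before a slot counts as missed.
function decide(
  state: ScheduleState,
  dueAt: number,
  now: number,
  catchUp: boolean,
  toleranceMs: number = 60000,
): Decision {
  // This slot was already handled.
  if (state.lastFiredAt !== undefined && state.lastFiredAt >= dueAt) {
    return 'skip_already_fired';
  }
  // The slot is stale (for example, the daemon was down): fire only with catchUp.
  if (now - dueAt > toleranceMs && !catchUp) return 'skip_missed';
  // The previous run is still active: skip rather than queue.
  if (state.running) return 'skip_overlap';
  return 'fire';
}
```

Keeping the decision pure (no timers, no I/O) would make each rule testable without a running daemon.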
6236
+
6237
+ ### Integration with the autonomous pipeline
6238
+
6239
+ Scheduled tasks are the entry point for fully autonomous work:
6240
+ - "Every weekday morning, pick the next backlog item and run the full pipeline" -- this is how WorkTrain improves WorkTrain without any human input.
6241
+ - "Every time a PR is opened, run the MR review pipeline" -- this is github_prs_poll, already exists.
6242
+ - "Every Monday, run the architecture audit and file GitHub issues for findings" -- new scheduled capability.
6243
+
6244
+ ### Implementation notes
6245
+
6246
+ - The `PollingScheduler` in `src/trigger/polling-scheduler.ts` already runs time-based loops for GitLab/GitHub polling. The schedule provider would be a similar loop, using cron expression matching instead of API polling.
6247
+ - `node-cron` or `croner` npm package for cron expression parsing and next-fire-time calculation. Lightweight, no daemon dependencies.
6248
+ - Scheduled triggers have no webhook payload -- `contextMapping` is empty, `goalTemplate` uses only static text or env vars.
6249
+ - The schedule state (last-fired-at per trigger) persists to `~/.workrail/schedule-state.json` so the daemon can detect missed runs on restart.
6250
+
6251
+ ---
6252
+
6253
+ ## Autonomous grooming loop + workOnAll mode (Apr 19, 2026)
6254
+
6255
+ ### The vision
6256
+
6257
+ WorkTrain eventually finds and executes its own work without any human seeding the queue. This is the full autonomous loop: raw backlog idea → groomed issue → discovered/shaped spec → implemented PR → reviewed → merged. Zero human input required once configured.
6258
+
6259
+ ### Three autonomy levels
6260
+
6261
+ **Level 0 -- Opt-in queue (current design)**
6262
+ Human adds `worktrain` label to specific issues. WorkTrain works those issues only. Safe, predictable, explicit.
6263
+
6264
+ **Level 1 -- workOnAll mode**
6265
+ Config flag `workOnAll: true` in `~/.workrail/config.json`. WorkTrain looks at ALL open issues, infers which ones are actionable, picks the highest-priority one. Human escape hatch: `worktrain:skip` label blocks WorkTrain from touching a specific issue. Status labels (`worktrain:in-progress`, `worktrain:done`) are coordinator-managed for observability. No human-set maturity labels needed -- coordinator infers from content.
6266
+
6267
+ **Level 2 -- Fully proactive**
6268
+ WorkTrain also surfaces work it found itself: failing CI, Dependabot alerts, backlog items with no issue, patterns in git history suggesting missing tests or docs. Creates its own work items, runs them, closes the loop.
6269
+
6270
+ ### The grooming loop (scheduled, e.g. nightly)
6271
+
6272
+ Runs on a cron trigger. Responsibilities:
6273
+ 1. Read `docs/ideas/backlog.md`, `docs/roadmap/now-next-later.md`, open GitHub issues
6274
+ 2. Reconcile: close issues that are already done (PR merged), update priorities based on what shipped recently, flag duplicate or obsolete items
6275
+ 3. For each ungroomed `worktrain` issue (or all issues in workOnAll mode): infer maturity -- does it have a linked spec? acceptance criteria? concrete implementation plan?
6276
+ 4. For high-value `idea`-level items: autonomously run `wr.discovery` → `wr.shaping` → update or create issue with pitch attached, set `worktrain:specced`
6277
+ 5. Backlog → issue promotion: when a backlog item crosses a readiness threshold (has enough context to act on), create a GitHub issue from it
6278
+
6279
+ ### Maturity inference (no human-set labels required in Level 1+)
6280
+
6281
+ The coordinator reads issue content and infers:
6282
+ - Linked pitch/PRD/spec URL → `ready` or `specced`
6283
+ - Has acceptance criteria or concrete implementation plan → `specced` or `ready`
6284
+ - Vague/exploratory language → `idea`
6285
+ - Has open PR or recent branch activity → skip (already in flight)
6286
+
6287
+ The `worktrain:idea/specced/ready` taxonomy is the coordinator's internal model, not something humans set. In Level 1+ the coordinator manages it automatically.
6288
+
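The mapping above could be sketched as a pure function. This is only an illustration of the rules: the real coordinator is expected to rely on LLM judgment, the regexes are placeholder heuristics, and the tie-break (spec link plus criteria means `ready`, either one alone means `specced`) is an assumption the notes do not settle. `IssueSignals` and `inferMaturity` are hypothetical names.

```typescript
type Maturity = "idea" | "specced" | "ready";

// Hypothetical, pre-extracted signals about an issue.
interface IssueSignals {
  body: string;
  hasOpenPr: boolean;
  hasRecentBranchActivity: boolean;
}

// Returns the inferred maturity, or "skip" when the issue is already in flight.
function inferMaturity(issue: IssueSignals): Maturity | "skip" {
  if (issue.hasOpenPr || issue.hasRecentBranchActivity) return "skip";

  // Placeholder heuristics standing in for LLM judgment.
  const hasSpecLink = /https?:\/\/\S*(pitch|prd|spec)/i.test(issue.body);
  const hasCriteria = /acceptance criteria|implementation plan/i.test(issue.body);

  // Assumed tie-break: both signals -> ready, one signal -> specced.
  if (hasSpecLink && hasCriteria) return "ready";
  if (hasSpecLink || hasCriteria) return "specced";
  return "idea";
}
```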
6289
+ ### workOnAll config
6290
+
6291
+ ```jsonc
6292
+ // ~/.workrail/config.json
6293
+ {
6294
+ "workOnAll": true,
6295
+ "workOnAllExclusions": ["needs-design", "blocked-external", "wontfix"],
6296
+ "maxConcurrentSelf": 2
6297
+ }
6298
+ ```
6299
+
6300
+ `maxConcurrentSelf` caps how many autonomous self-improvement sessions run simultaneously -- important so WorkTrain doesn't try to implement 10 things at once and create merge conflicts.
6301
+
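A sketch of how a coordinator might apply this config when choosing the next issue. It assumes `workOnAllExclusions` holds label names, that the coordinator-managed status labels also exclude an issue, and that each issue carries a numeric `priority` (higher is more urgent). `Issue` and `pickNextIssue` are hypothetical.

```typescript
interface WorkOnAllConfig {
  workOnAll: boolean;
  workOnAllExclusions: string[]; // assumed to be label names
  maxConcurrentSelf: number;
}

// Hypothetical issue shape; `priority` is an assumed pre-computed score.
interface Issue {
  number: number;
  labels: string[];
  priority: number;
}

// Returns the highest-priority eligible issue, or undefined when the
// concurrency cap is reached or nothing is eligible.
function pickNextIssue(
  config: WorkOnAllConfig,
  issues: Issue[],
  activeSelfSessions: number,
): Issue | undefined {
  if (activeSelfSessions >= config.maxConcurrentSelf) return undefined;

  const blocked = new Set([
    "worktrain:skip", // human escape hatch
    "worktrain:in-progress", // assumed: status labels also exclude
    "worktrain:done",
    ...config.workOnAllExclusions,
  ]);

  const candidates = issues.filter((issue) => {
    if (issue.labels.some((label) => blocked.has(label))) return false;
    // Level 0 requires the opt-in label; workOnAll considers everything else.
    return config.workOnAll || issue.labels.includes("worktrain");
  });

  candidates.sort((a, b) => b.priority - a.priority);
  return candidates[0];
}
```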
6302
+ ### Design notes
6303
+
6304
+ - The grooming loop and the work loop are **separate triggers** with separate schedules. Grooming runs more frequently (nightly or post-merge). Work loop runs on demand or weekly.
6305
+ - The grooming loop requires LLM judgment ("is this ready?") -- it's a `wr.discovery`-style session on the backlog, not a deterministic script. This is a feature, not a limitation.
6306
+ - `worktrain:skip` is the only label humans need to set in Level 1+ -- it's the explicit "not this one" override.
6307
+ - Auto-PR-from-backlog requires careful scope: WorkTrain should create draft PRs for its own discoveries, not automatically push to open issues on other people's repos.
6308
+
6309
+ ### Priority
6310
+
6311
+ This is the long-term autonomous vision. Implement in order:
6312
+ 1. Level 0 (current, task queue PR #4)
6313
+ 2. workOnAll config flag (small addition to the coordinator, after #4 ships)
6314
+ 3. Maturity inference (replace label-based routing with content inference)
6315
+ 4. Grooming loop (scheduled cron trigger, wr.discovery session on backlog)
6316
+ 5. Level 2 proactive work (post-grooming, after proving the loop works)
6317
+
6318
+ ---
6319
+
6320
+ ## Escalating review gates based on finding severity (Apr 19, 2026)
6321
+
6322
+ **The idea:** when an MR review returns a Critical finding post-implementation, the review is not over -- it triggers a deeper audit chain before merge is allowed.
6323
+
6324
+ ### Current state
6325
+
6326
+ `worktrain run pr-review` routes by severity: `clean` → merge, `minor` → fix-agent loop, `blocking` → escalate to human. But "blocking" is binary -- a single Critical finding and a trivially incorrect comment are treated identically (both block, neither gets more scrutiny).
6327
+
6328
+ ### The right behavior
6329
+
6330
+ When the original review returns a Critical finding, or the re-review after a fix round still returns one, the escalation chain is:
6331
+ 1. **Another full MR review** -- confirm the Critical is real, not a false positive from the reviewer
6332
+ 2. **Production readiness audit** (`production-readiness-audit` workflow) -- a Critical finding often implies a runtime risk. Check for error handling gaps, security exposure, missing observability.
6333
+ 3. **Architecture audit** (`architecture-scalability-audit`) -- if the Critical is architectural (wrong abstraction, tight coupling, violates invariants), run a targeted audit on the affected modules.
6334
+
6335
+ Not all Criticals warrant all three. The coordinator should route based on the finding's `category` field (from `wr.review_verdict`):
6336
+ - `correctness` / `security` → always trigger prod audit
6337
+ - `architecture` / `design` → trigger arch audit
6338
+ - All → trigger re-review
6339
+
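The category routing above could be expressed roughly as follows. The `severity` field name, the lowercase `"critical"` value, and the `Finding` shape are assumptions, since the notes themselves flag that `findings[].category` may not be defined yet. `followUpsFor` is a hypothetical helper.

```typescript
type FollowUp =
  | "re-review"
  | "production-readiness-audit"
  | "architecture-scalability-audit";

// Assumed finding shape; the real wr.review_verdict schema may differ.
interface Finding {
  severity: string;
  category: string;
}

// Maps the Critical findings of a review to the follow-up steps to run.
function followUpsFor(findings: Finding[]): FollowUp[] {
  const criticals = findings.filter((f) => f.severity === "critical");
  if (criticals.length === 0) return [];

  const categories = new Set(criticals.map((f) => f.category));
  const steps: FollowUp[] = ["re-review"]; // every Critical gets a re-review

  if (categories.has("correctness") || categories.has("security")) {
    steps.push("production-readiness-audit");
  }
  if (categories.has("architecture") || categories.has("design")) {
    steps.push("architecture-scalability-audit");
  }
  return steps;
}
```

The prod audit is pushed before the arch audit so that a sequential runner would execute them in that order.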
6340
+ ### Auto-merge policy interaction
6341
+
6342
+ A PR that triggered the escalating audit chain should NEVER auto-merge, even if the final re-review comes back clean. The human should approve it explicitly after seeing the audit trail. This is a hard rule, not a setting.
6343
+
6344
+ ### Implementation notes
6345
+
6346
+ - The escalation logic belongs in the `IMPLEMENT` and `REVIEW_ONLY` mode coordinators (part of the adaptive pipeline coordinator work).
6347
+ - The `findings[].category` field on `wr.review_verdict` needs to be defined if it is not already -- check `src/v2/durable-core/schemas/artifacts/review-verdict.ts`.
6348
+ - The audit chain runs sequentially (prod then arch), not in parallel -- each audit's output informs the next.
6349
+ - All audit session IDs should be linked to the same parent work unit so the console session tree shows the full chain.
6350
+
6351
+ ### Priority
6352
+
6353
+ Design this alongside the adaptive pipeline coordinator (#3). The coordinator needs to know about this escalation policy before its routing logic is finalized -- the `IMPLEMENT` mode's post-review handling is incomplete without it.
6354
+
6355
+ ---
6356
+
6357
+ ## UX/UI impact detection and design workflow integration (Apr 19, 2026)
6358
+
6359
+ **The idea:** When the adaptive pipeline coordinator classifies a task, it should detect whether the task touches user-facing surfaces (UI components, user flows, API contracts that clients consume) and automatically insert a `ui-ux-design-workflow` run before implementation.
6360
+
6361
+ ### Why this matters
6362
+
6363
+ Coding tasks that touch UI get implemented without a design pass today. The agent writes functional code but often produces interfaces that are technically correct but experientially wrong -- wrong information hierarchy, wrong affordances, missing error states, missing loading states, wrong copy. A `ui-ux-design-workflow` run before coding forces the "multiple design directions before converging" discipline that prevents the single-solution trap.
6364
+
6365
+ ### Detection signals (what marks a task as UX-impactful)
6366
+
6367
+ The coordinator should classify a task as `touchesUI: true` when any of:
6368
+ - Issue title or body mentions: component, screen, page, modal, dialog, button, form, flow, onboarding, dashboard, table, list, navigation, UX, UI, design, user-facing, frontend, console, web
6369
+ - Affected files (from git diff or knowledge graph) include: `console/src/`, `*.tsx`, `*.css`, `web/`, `views/`
6370
+ - The task has a `ui` or `frontend` label
6371
+ - The upstream spec (pitch/PRD) explicitly calls out visual or interaction design requirements
6372
+
6373
+ False positives (running design workflow unnecessarily) are cheaper than false negatives (shipping bad UX). Default to `touchesUI: true` when signals are ambiguous and the task is `complexity: Medium` or larger.
6374
+
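A rough sketch of the deterministic signals above. `TaskInput` and its fields are hypothetical, `specRequiresDesign` is assumed to be decided upstream, and the "default to true when ambiguous" rule is left out because it needs a definition of ambiguity the notes do not give.

```typescript
// Hypothetical classifier input.
interface TaskInput {
  title: string;
  body: string;
  labels: string[];
  affectedFiles: string[];
  specRequiresDesign: boolean; // assumed to be determined upstream
}

// Keyword list taken from the detection signals; matched as whole words.
const UI_KEYWORDS =
  /\b(component|screen|page|modal|dialog|button|form|flow|onboarding|dashboard|table|list|navigation|ux|ui|design|user-facing|frontend|console|web)\b/i;

// Matches console/src/, web/, views/ path segments and *.tsx / *.css files.
const UI_PATH = /(^|\/)(console\/src|web|views)\/|\.(tsx|css)$/;

// True when any single signal fires.
function touchesUI(task: TaskInput): boolean {
  return (
    UI_KEYWORDS.test(`${task.title}\n${task.body}`) ||
    task.affectedFiles.some((file) => UI_PATH.test(file)) ||
    task.labels.some((label) => label === "ui" || label === "frontend") ||
    task.specRequiresDesign
  );
}
```

Generic keywords such as `list`, `table`, and `form` will fire often, which matches the stated preference for false positives over false negatives.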
6375
+ ### Pipeline integration
6376
+
6377
+ When `touchesUI: true`, the `IMPLEMENT` pipeline becomes:
6378
+
6379
+ ```
6380
+ coding-task-classify → ui-ux-design-workflow → coding-task-workflow-agentic → PR → review → merge
6381
+ ```
6382
+
6383
+ The `ui-ux-design-workflow` output (a design spec with chosen direction, information architecture, component breakdown, error states) feeds into Phase 0.5 of `coding-task-workflow-agentic` as the upstream spec. The coding agent then implements against a concrete design spec, not ad-hoc intuition.
6384
+
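As a sketch, the conditional insertion could look like this. The step names come from the pipeline above; the `touchesUI: false` variant (the same pipeline without the design step) is an assumption, and `implementPipeline` is a hypothetical helper.

```typescript
// Builds the ordered IMPLEMENT pipeline, inserting the design workflow
// between classification and coding only when the task touches UI.
function implementPipeline(touchesUI: boolean): string[] {
  const steps = ["coding-task-classify"];
  if (touchesUI) {
    steps.push("ui-ux-design-workflow");
  }
  steps.push("coding-task-workflow-agentic", "PR", "review", "merge");
  return steps;
}
```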
6385
+ ### Relationship to escalating review gates
6386
+
6387
+ When a post-implementation MR review surfaces a UI/UX finding (wrong affordance, missing state, confusing flow), the escalation should include a targeted `ui-ux-design-workflow` audit pass, not just a code review. UX regressions need design eyes, not just code eyes.
6388
+
6389
+ ### Open design questions
6390
+
6391
+ - **Who reviews the design spec before coding starts?** If the UX design workflow runs autonomously at 2am and coding starts immediately after, there is no human review of the design direction. This is fine for small UI tweaks; it's wrong for new user flows. The coordinator needs a complexity gate: `complexity: Large AND touchesUI: true` → require human ack on the design spec before coding.
6392
+ - **Design spec format:** `ui-ux-design-workflow` currently produces a markdown design document. Does the coding workflow reliably consume this as an upstream spec via Phase 0.5? Verify before relying on the automated handoff.
6393
+ - **Console-specific workflows:** WorkRail's console is a React/TypeScript SPA. Consider a `worktrain:console` label or file-path heuristic that routes to a console-specific design workflow variant.
6394
+
6395
+ ### Priority
6396
+
6397
+ Design this as part of the adaptive coordinator (#3). The `touchesUI` flag belongs on the classification output alongside `taskComplexity` and `maturity`. The UI detection logic and the design workflow insertion are both coordinator-level concerns, not engine-level.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@exaudeus/workrail",
3
- "version": "3.41.0",
3
+ "version": "3.43.0",
4
4
  "description": "Step-by-step workflow enforcement for AI agents via MCP",
5
5
  "license": "MIT",
6
6
  "repository": {
@@ -54,8 +54,8 @@
54
54
  "preinstall": "node -e \"const v=parseInt(process.versions.node.split('.')[0],10); if(v<20){console.error('WorkRail requires Node.js >=20. Current: '+process.versions.node+'\\nPlease upgrade: https://nodejs.org/'); process.exit(1);}\"",
55
55
  "dev:mcp": "pkill -f \"$(pwd)/dist/mcp-server.js\" 2>/dev/null; sleep 0.5; WORKRAIL_TRANSPORT=http WORKRAIL_ENABLE_SESSION_TOOLS=true node dist/mcp-server.js",
56
56
  "dev:mcp:watch": "pkill -f \"$(pwd)/dist/mcp-server.js\" 2>/dev/null; sleep 0.5; WORKRAIL_TRANSPORT=http WORKRAIL_ENABLE_SESSION_TOOLS=true nodemon --watch dist --ext js --delay 2 --exec 'node dist/mcp-server.js'",
57
- "web:dev": "npm run build && WORKRAIL_ENABLE_SESSION_TOOLS=true node dist/mcp-server.js",
58
- "web:ci": "WORKRAIL_ENABLE_SESSION_TOOLS=true node dist/mcp-server.js",
57
+ "web:dev": "npm run build && node dist/cli-worktrain.js console",
58
+ "web:ci": "node dist/cli-worktrain.js console",
59
59
  "web:typecheck": "tsc -p tsconfig.web.json",
60
60
  "typecheck": "tsc --noEmit",
61
61
  "test": "vitest",