@exaudeus/workrail 3.36.0 → 3.37.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (45)
  1. package/dist/config/config-file.js +2 -0
  2. package/dist/console-ui/assets/{index-n8cJrS4v.js → index-t8Wi304z.js} +1 -1
  3. package/dist/console-ui/index.html +1 -1
  4. package/dist/daemon/workflow-runner.d.ts +1 -0
  5. package/dist/daemon/workflow-runner.js +3 -6
  6. package/dist/infrastructure/session/SessionManager.js +17 -4
  7. package/dist/manifest.json +25 -17
  8. package/dist/trigger/notification-service.d.ts +42 -0
  9. package/dist/trigger/notification-service.js +164 -0
  10. package/dist/trigger/trigger-listener.js +7 -1
  11. package/dist/trigger/trigger-router.d.ts +3 -1
  12. package/dist/trigger/trigger-router.js +4 -1
  13. package/docs/design/agent-behavior-patterns-discovery.md +312 -0
  14. package/docs/design/agent-engine-communication-discovery.md +390 -0
  15. package/docs/design/agent-loop-architecture-alternatives-discovery.md +531 -0
  16. package/docs/design/agent-loop-error-handling-contract.md +238 -0
  17. package/docs/design/complete-step-approach-validation-discovery.md +344 -0
  18. package/docs/design/daemon-stuck-detection-discovery.md +174 -0
  19. package/docs/design/mcp-server-disconnect-discovery.md +245 -0
  20. package/docs/design/mcp-server-epipe-crash.md +198 -0
  21. package/docs/design/notification-design-candidates.md +131 -0
  22. package/docs/design/notification-design-review.md +84 -0
  23. package/docs/design/notification-implementation-plan.md +181 -0
  24. package/docs/design/spawn-agent-failure-modes.md +161 -0
  25. package/docs/design/spawn-agent-result-handling-implementation-plan.md +186 -0
  26. package/docs/design/stdio-simplification-design-candidates.md +341 -0
  27. package/docs/design/stdio-simplification-design-review.md +93 -0
  28. package/docs/design/stdio-simplification-implementation-plan.md +317 -0
  29. package/docs/design/structured-output-tools-coexist-findings.md +288 -0
  30. package/docs/discovery/coordinator-script-design.md +745 -0
  31. package/docs/discovery/coordinator-ux-discovery.md +471 -0
  32. package/docs/discovery/spawn-agent-failure-modes.md +309 -0
  33. package/docs/discovery/workflow-selection-for-discovery-tasks.md +336 -0
  34. package/docs/discovery/worktrain-status-briefing.md +325 -0
  35. package/docs/discovery/worktrain-status-design-candidates.md +202 -0
  36. package/docs/discovery/worktrain-status-design-review-findings.md +86 -0
  37. package/docs/ideas/backlog.md +608 -0
  38. package/docs/ideas/daemon-structured-output-vs-tool-calls.md +344 -0
  39. package/docs/ideas/design-candidates-backlog-consolidation.md +85 -0
  40. package/docs/ideas/design-review-findings-backlog-consolidation.md +39 -0
  41. package/docs/ideas/implementation_plan_backlog_consolidation.md +117 -0
  42. package/docs/plans/authoring-doc-staleness-enforcement-candidates.md +251 -0
  43. package/docs/plans/authoring-doc-staleness-enforcement-review.md +99 -0
  44. package/docs/plans/authoring-doc-staleness-enforcement.md +463 -0
  45. package/package.json +1 -1
@@ -0,0 +1,309 @@
+ # Discovery: spawn_agent Failure Modes and Multi-Agent Coordination

+ ## Context / Ask

+ **Original goal:** Discovery (precedent + failure modes angle): what can we learn from how spawn_agent actually works today, what failure modes exist, and what do competitors/references do for multi-agent coordination?

+ **Problem statement:** WorkTrain needs a coordinator that can spawn fix agents, await their results, and act on outcomes -- but the failure modes of that coordination loop are not yet cataloged, and no precedent has been reviewed to validate the design choices.

+ **Desired outcome:** A failure mode catalog with recommended mitigations, a "minimum viable robustness" checklist, and concrete precedent from reference architectures that directly applies to WorkTrain's coordinator design.

+ ## Path Recommendation

+ **Chosen path:** `landscape_first`

+ **Rationale:** The dominant need is grounding -- understanding how spawn_agent actually works in the current codebase, what the existing error paths look like, and what reference architectures did for similar coordination problems. Reframing is secondary (the problem is already well-scoped). Design work comes after this grounding, not before.

+ Alternative paths considered:
+ - `full_spectrum`: Would add a reframing step, but the problem is already well-framed. The original goal is problem-shaped, not solution-shaped.
+ - `design_first`: Wrong here -- the risk is not solving the wrong problem, it is shipping a coordinator with unhandled failure modes.

+ ## Constraints / Anti-goals

+ **Core constraints:**
+ - WorkTrain's first real run must not embarrass the project -- robustness over cleverness
+ - The coordinator must work with spawn_agent as it exists today, not a hypothetical redesign
+ - Depth limiting must account for real timeout/hang scenarios, not just recursion counts

+ **Anti-goals:**
+ - Do not over-engineer for theoretical failure modes that have no evidence in the codebase
+ - Do not adopt reference architecture patterns wholesale -- extract the principle, not the mechanism
+ - Do not build a new event bus, message queue, or distributed coordination layer

+ **Primary uncertainty:** Whether the current timeout path in worktrain-await.ts actually terminates cleanly when a spawned session hangs at max_turns.

+ **Known approaches to multi-agent coordination (to evaluate):**
+ - OpenClaw nexus-core: referenced in backlog deep-dive
+ - pi-mono: referenced in backlog
+ - Semaphore-based depth limiting (current WorkRail approach)
+ - Polling loop with not_awaited outcome (current worktrain-await approach)

+ ## Artifact Strategy

+ This document is a **human-readable artifact** -- it is for people to read and reference. It is NOT execution memory.

+ - Execution truth lives in WorkRail step notes and context variables.
+ - If a chat rewind occurs, the durable notes/context survive; this file may not.
+ - This file is updated at each research step for readability, but workflow state does not depend on it.

+ ## Capability Status

+ - **Delegation (subagent spawning):** Available via `mcp__nested-subagent__Task`
+ - **Web browsing:** Not available (WebFetch tool not active); all research is from the codebase and checked-in docs only

+ ## Landscape Packet

+ ### spawn_agent mechanics (makeSpawnAgentTool)

+ **Function:** `makeSpawnAgentTool` in `src/daemon/workflow-runner.ts:1415-1591`

+ **Four error paths:**
+ 1. **Depth limit exceeded (pre-spawn):** Synchronous check `currentDepth >= maxDepth`. Returns `{outcome: 'error', childSessionId: null}`. No child created. Fail-fast.
+ 2. **Child session start failure:** `executeStartWorkflow()` returns `Err`. Returns `{outcome: 'error', childSessionId: null}`.
+ 3. **Token decode failure (silent):** `parseContinueTokenOrFail()` fails -- logs a console warning, but the child session still runs with `childSessionId: null`. Zombie risk: the session runs but the coordinator cannot trace it.
+ 4. **WorkflowRunResult variants:** `success` -> `'success'`; `error` -> `'error'`; `timeout` -> `'timeout'`; `delivery_failed` -> **`'success'`** (silent bug: work done, notification failed, treated as success).

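The `delivery_failed` mapping is the highest-consequence path above. A minimal sketch of the result-to-outcome mapping, with the fix recommended later in this document applied; the names are illustrative, not the actual `makeSpawnAgentTool` internals:

```typescript
// Hypothetical simplification of the WorkflowRunResult -> spawn outcome mapping.
type WorkflowRunResult = 'success' | 'error' | 'timeout' | 'delivery_failed';
type SpawnOutcome = 'success' | 'error' | 'timeout';

function mapRunResult(result: WorkflowRunResult): SpawnOutcome {
  switch (result) {
    case 'success':
      return 'success';
    case 'timeout':
      return 'timeout';
    case 'delivery_failed':
      // Current code returns 'success' here (the silent bug described above);
      // the recommended fix is to surface it as an error instead.
      return 'error';
    case 'error':
      return 'error';
  }
}
```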
+ **Depth limit enforcement:**
+ - Depth is passed as a closure parameter, not a global semaphore or counter.
+ - Each tree path enforces independently. Siblings at the same depth do NOT share a pool.
+ - Default `maxDepth = 3` (line 2207). Root sessions start at depth 0 (line 2206).
+ - Enforcement is per-tree-path, not global.

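The per-tree-path behavior can be sketched as follows (hypothetical helper; the real check is inline in `makeSpawnAgentTool`):

```typescript
// Depth travels down each spawn chain as a plain parameter, so sibling
// spawns at the same depth never contend for a shared pool or counter.
const MAX_DEPTH = 3; // default maxDepth

function canSpawn(currentDepth: number, maxDepth: number = MAX_DEPTH): boolean {
  // Fail-fast, pre-spawn check: no child session is created on refusal.
  return currentDepth < maxDepth;
}

// A root session (depth 0) may spawn a child at depth 1, which may spawn
// at depth 2, and so on; each branch of the tree counts independently.
```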
+ **Semaphore bypass:**
+ - `dispatch()` uses a global Semaphore for concurrency limiting.
+ - `makeSpawnAgentTool` calls `runWorkflow()` directly -- **bypasses the semaphore entirely**.
+ - Reason (in code comment): dispatch() is fire-and-forget; calling it from inside a running session would deadlock.
+ - Child sessions pass `undefined` for `daemonRegistry` -- invisible to `worktrain status` and the console live-session heartbeat.
+ - Consequence: a single root session can spawn multiple children that are all untracked by daemon tooling.

+ **Not used in practice:** A session store search found no spawn_agent calls in 3,278 daemon events (Apr 17-18, 2026). Analysis is code-only, not runtime-observed.

+ ### worktrain-await.ts

+ **Poll interval:** 3000ms (3 seconds). Configurable via `opts.pollInterval` for tests.

+ **Default timeout:** 30 minutes (1,800,000ms). Accepts duration strings like `"30m"`, `"1h"`, `"90s"`.

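A sketch of duration-string parsing in the `"30m"` / `"1h"` / `"90s"` form; the accepted grammar here is an assumption for illustration, not the actual worktrain-await parser:

```typescript
// Parse "90s" / "30m" / "1h" style duration strings into milliseconds.
function parseDuration(input: string): number {
  const match = /^(\d+)(s|m|h)$/.exec(input.trim());
  if (!match) throw new Error(`invalid duration: ${input}`);
  const value = Number(match[1]);
  const unitMs = { s: 1_000, m: 60_000, h: 3_600_000 }[match[2] as 's' | 'm' | 'h'];
  return value * unitMs;
}
```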
+ **Timeout handling:** The timeout is checked once per loop iteration (not per session poll). When the timeout fires:
+ - All remaining pending sessions are marked `{outcome: 'timeout', status: null}`.
+ - The loop breaks immediately.
+ - Exit code 1.
+ - The coordinator receives `'timeout'` -- then what? See the failure mode catalog.

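A simplified sketch of the await loop showing the once-per-iteration timeout check (hypothetical shape; the real implementation also handles modes and records per-session durations):

```typescript
// Simplified await loop: the timeout is evaluated once per iteration, before
// the serial polls, which is what enables the race described below.
type Outcome = { outcome: 'success' | 'timeout'; status: string | null };

async function awaitSessions(
  poll: (id: string) => Promise<string | null>, // terminal status, or null if still running
  ids: string[],
  timeoutMs: number,
  pollIntervalMs = 3_000,
): Promise<Map<string, Outcome>> {
  const results = new Map<string, Outcome>();
  const start = Date.now();
  let pending = [...ids];
  while (pending.length > 0) {
    if (Date.now() - start >= timeoutMs) {
      // All still-pending sessions are marked timeout in one sweep.
      for (const id of pending) results.set(id, { outcome: 'timeout', status: null });
      break;
    }
    const stillPending: string[] = [];
    for (const id of pending) {
      const status = await poll(id); // serial polls; a slow network widens the race window
      if (status !== null) results.set(id, { outcome: 'success', status });
      else stillPending.push(id);
    }
    pending = stillPending;
    if (pending.length > 0) await new Promise((r) => setTimeout(r, pollIntervalMs));
  }
  return results;
}
```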
+ **`not_awaited` outcome:** Only fires with `--mode any`. When the first session returns `success`, all other pending sessions are marked `not_awaited` with `status: null`. They were still running and healthy -- we just stopped waiting. Exit code 0.

+ **Race conditions identified:**
+ 1. **No atomic timeout per session (high-risk):** The timeout check runs once per loop, not per poll call. If 10 sessions are polled serially and the network is slow, sessions 6-10 may time out before being polled that round. The durationMs recorded reflects when the check ran, not when each session was last polled.
+ 2. **Concurrent timeout + poll result (low-risk):** Network delay between polling session A and session B could push the next iteration past the timeout boundary. The window is negligible at a 3s poll / 30m timeout.
+ 3. **`--mode any` early exit with stale status (by design):** Session B may have completed 1ms after session A but gets marked `not_awaited`. This is intentional for latency optimization.

+ ### Reference architectures

+ **OpenClaw (nexus-core)** -- Interactive session-coordination system (NOT batch/DAG)
+ - `AcpSessionStore`: In-memory, 5k sessions, 24h TTL, LRU eviction. Not durable.
+ - `SessionActorQueue`: Serializes messages per session to prevent concurrent modification.
+ - `SpawnAcpParams`: Minimal spawn API (task, label, agentId, resumeSessionId, cwd, mode).
+ - Task flow chaining: workflow A completion auto-triggers workflow B via `linkTaskToFlowById`.
+ - **Transferable:** Session actor queue pattern (serialization per session).
+ - **Not transferable:** In-memory store (violates WorkRail's durability guarantee).

+ **pi-mono** -- Library of coordination primitives (not a system)
+ - `agentLoop()` returns `EventStream<AgentEvent, AgentMessage[]>` -- handles multi-turn without context degradation.
+ - `BeforeToolCallResult`: Can block a tool call with a reason.
+ - `AfterToolCallResult`: Can override tool result content.
+ - `ChannelQueue` (KeyedAsyncQueue): Serializes messages per channel.
+ - **Transferable:** Tool call hooks pattern (block/override tool calls from the coordinator level).
+ - **Not transferable:** The entire library (WorkRail has its own agent loop).

+ **Claude Code (closest analog)** -- Interactive IDE agent with a coordinator/subagent model
+ - Coordinator holds tokens; subagents report via a durable store (not context).
+ - `PreToolUse` / `PostToolUse` hooks for evidence collection.
+ - Three compaction tiers: session memory > full compaction > microcompaction.
+ - **Transferable:** State-via-store pattern (WorkRail already does this). Evidence collection hooks.

+ **LangGraph** -- Batch/DAG pipeline (LOW comparability)
+ - Time-travel checkpointing (`fork` source) -- useful for WorkRail's rewind feature.
+ - Interrupt mechanism: node re-runs from scratch on resume (requires idempotency) -- NOT how WorkRail works.
+ - **Not transferable:** Core interrupt/resume model.
+ - **Partially transferable:** Checkpoint fork pattern.

+ **Temporal.io** -- Event-sourced, code-defined workflows (MEDIUM comparability)
+ - Worker polling vs webhook push model.
+ - Workflow versioning via `patched()`.
+ - **Transferable:** Crash recovery patterns, namespace isolation for multi-tenant use.

+ **Assumption revision:** Assumption 2 (reference architectures are batch/DAG systems) was partially wrong. OpenClaw and Claude Code are interactive session-coordination systems, making them more comparable than expected. This strengthens the transferability of their patterns.

+ ## Problem Frame Packet

+ **Reframed problem:** What is the minimum coordinator design that handles the real failure modes in spawn_agent today, informed by precedent, without over-engineering for failures not yet observed?

+ **Primary stakeholders:**
+ - Etienne (WorkTrain developer and primary user) -- needs the coordinator to work on the first real run; robustness over cleverness
+ - WorkTrain session initiators running autonomous fix pipelines -- need predictable outcomes and visible failures

+ **Core tension:** The coordinator must be robust enough to handle failure modes, but WorkTrain's explicit anti-goal is "don't over-engineer." The minimum viable robustness point is: handle failures that would silently corrupt the coordinator's state or leave orphaned sessions. Adding observability, atomic timeouts, and session tracking beyond that is premature optimization.

+ **Framing risks:**
+ 1. The coordinator doesn't exist yet -- all failure mode analysis covers infrastructure (spawn_agent + worktrain-await). The real design decisions happen in the coordinator layer, which is the missing piece. We may be analyzing the wrong layer.
+ 2. spawn_agent has never been used in practice (0 calls in 3,278 daemon events) -- all identified failure modes are theoretical. A real run may surface completely different issues.
+ 3. The "first real run must not embarrass" constraint could push toward over-engineering -- the right balance is a coordinator that fails loudly (not silently) and stops cleanly.

+ **HMW questions:**
+ - How might we design a coordinator that degrades gracefully when a spawned session times out, without requiring the coordinator to know why it timed out?
+ - How might we make spawn_agent's zombie risk (silent token decode failure) visible without requiring daemon tooling changes?

+ **Challenged assumptions (updated after landscape research):**
+ 1. spawn_agent mechanics are the right research focus -- partially confirmed: the coordinator protocol level is where the real design decisions happen, but spawn_agent has real silent failure modes that need fixing.
+ 2. Competitor/reference architectures are batch/DAG systems -- WRONG: OpenClaw and Claude Code are interactive session-coordination systems. Comparability is higher than assumed.
+ 3. Depth=3 is the right safety boundary -- confirmed: the real question is timeout robustness, not depth arithmetic. A depth-1 coordinator can hang if the spawned session hangs.

+ ## Candidate Directions

+ ### Candidate Generation Expectations

+ This is a `landscape_first` + `THOROUGH` pass. Candidates must:

+ 1. **Anchor to landscape precedents.** Each candidate must reference at least one observed precedent (OpenClaw, pi-mono, Claude Code, the worktrain-await design, or spawn_agent behavior) -- not free invention.
+ 2. **Cover the failure mode space.** The 5 decision criteria must each be addressed by at least one candidate. No criterion can be unaddressed across the whole set.
+ 3. **Spread across the simplicity-completeness axis.** At least one candidate at each pole: a minimal wrapper that adds almost nothing, and a structured handoff protocol that addresses all 5 criteria.
+ 4. **THOROUGH push:** If the first spread feels clustered around the middle, add one more candidate that is either maximally simple (borderline too simple) or takes a structurally different approach to the termination guarantee problem.
+ 5. **No invented infrastructure.** Candidates must not require new daemon tooling, event buses, or message queues. They must work with spawn_agent and worktrain-await as they exist today.

+ ### Candidates

+ *To be populated in the candidate-generation step.*

+ ## Challenge Notes

+ *To be populated after research.*

+ ## Resolution Notes

+ ### Recommendation

+ **v1 coordinator design: 5 components, no new infrastructure, all justified by evidence.**

+ 1. **Infrastructure fix:** Change `delivery_failed` to return `outcome: 'error'` (not `'success'`) in `makeSpawnAgentTool` (~line 1580 of `src/daemon/workflow-runner.ts`). This is the highest-consequence silent failure. One-line change. **This is a hard blocker -- the coordinator must not ship without it.**

+ 2. **Hardcoded child session timeout:** Pass `agentConfig: { maxSessionMinutes: 15 }` in all spawn triggers. No LLM arithmetic. The hardcoded value is conservative but correct under uncertainty (being too conservative is recoverable; having no timeout is not).

+ 3. **Coordinator rule -- null childSessionId:** After spawn, if `childSessionId === null` with any outcome, treat it as an error. This catches the token decode failure zombie case (separate from the delivery_failed fix).

+ 4. **Coordinator rule -- go/no-go time check:** Before spawning, if remaining session time < 20 minutes, do not spawn. Return an error with reason "insufficient session time remaining." Prevents coordinator death in edge cases without LLM arithmetic.

+ 5. **Layer D traceability:** Record a spawn result JSON block in step notes BEFORE acting: `{ childSessionId, outcome, notes (truncated), spawnedAtEpochMs, durationMs }`. Step notes ARE injected into subsequent steps (the MAX_SESSION_RECAP_NOTES=3 mechanism is confirmed in code). This enables observability on the first real run.

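Rules 3 and 4, plus the Layer D record from component 5, can be sketched as coordinator-side guards. These are hypothetical helper names for illustration; the real coordinator is workflow-driven rather than a standalone module:

```typescript
// Sketch of the v1 coordinator guards around a single spawn.
interface SpawnResult {
  outcome: 'success' | 'error' | 'timeout';
  childSessionId: string | null;
}

const MIN_REMAINING_MINUTES = 20; // go/no-go threshold before spawning

function preSpawnCheck(remainingSessionMinutes: number): { ok: boolean; reason?: string } {
  if (remainingSessionMinutes < MIN_REMAINING_MINUTES) {
    return { ok: false, reason: 'insufficient session time remaining' };
  }
  return { ok: true };
}

function interpretSpawn(result: SpawnResult): 'success' | 'error' | 'timeout' {
  // A null childSessionId is an error regardless of the reported outcome,
  // which catches the token-decode zombie case.
  if (result.childSessionId === null) return 'error';
  return result.outcome;
}

function spawnRecord(result: SpawnResult, spawnedAtEpochMs: number): string {
  // Layer D traceability: written to step notes BEFORE acting on the result.
  return JSON.stringify({
    childSessionId: result.childSessionId,
    outcome: result.outcome,
    spawnedAtEpochMs,
    durationMs: Date.now() - spawnedAtEpochMs,
  });
}
```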
+ ### Strongest Alternative (Runner-Up)

+ **B+C+D composition:** Coordinator-owned timeout budget (Layer B) + CoordinatorSpawnResult discriminated type (Layer C) + notes-as-retry-ledger (Layer D). Correct for v2 after empirical data. Loses for v1 because Layer B has silent failure modes (LLM arithmetic) and Layer C is a prompt-level workaround for a one-line infrastructure bug.

+ ### Confidence Band

+ **HIGH.** The direction is grounded in code reading, challenged by an adversarial reviewer, and confirmed by design review. The challenger's false positive (Layer D notes not readable) was refuted by code evidence.

+ ### Residual Risks

+ 1. **Empirical validation of the 15-min timeout.** The value is a heuristic. Real fix agents may need more or less time. Revisit after 3 real runs.

+ 2. **delivery_failed frequency unknown.** Zero spawn_agent calls in 3,278 daemon events. The delivery_failed -> success bug is a real code path but may never fire in typical usage. The infra fix is still correct architecture, but its urgency is not yet validated empirically.

+ ### Constraints on Selected Direction

+ - **Single spawn per coordinator session.** Multi-spawn (diagnose + fix + verify) requires Layer B (dynamic budget) and is a v2 concern. This constraint must be explicit in the coordinator design spec.
+ - **Infra fix is a hard blocker.** Do not ship the coordinator without the delivery_failed -> error fix.

+ ## Decision Log

+ | Date | Decision | Rationale |
+ |------|----------|-----------|
+ | 2026-04-18 | Path: landscape_first | Dominant need is grounding in current code and precedent, not reframing |
+ | 2026-04-18 | v1 coordinator: infra fix + hardcoded timeout + notes traceability | spawn_agent unused in practice (0 calls). B+C+D composition over-engineered for v1. Layer C is a prompt-level workaround for a one-line infrastructure bug. Layer B relies on LLM arithmetic with a silent failure mode. Layer D (notes) does work (notes are injected) but retry is premature. v2 can add dynamic budgeting and result type mapping after observing real usage. |
+ | 2026-04-18 | Runner-up: B+C+D composition | Correct for v2+ after empirical data from real runs. Not justified for v1 on theoretical grounds alone. |

+ ## Final Summary

+ ### Selected Path
+ `landscape_first` -- grounding in current spawn_agent code and reference architecture comparisons. The problem was already well-framed; no reframing step needed.

+ ### Problem Framing
+ WorkTrain needs a coordinator that can spawn fix agents, await results, and act on outcomes. The failure modes of that coordination loop are not yet cataloged. The real design seam is the coordinator layer (which doesn't exist yet), not spawn_agent infrastructure (which does).

+ ### Landscape Takeaways
+ - spawn_agent bypasses the global semaphore (direct runWorkflow call, not dispatch). Child sessions are invisible to daemon tooling.
+ - `delivery_failed` is explicitly mapped to `outcome: 'success'` in makeSpawnAgentTool -- a silent failure the coordinator must guard against.
+ - Token decode failure proceeds with `childSessionId: null` and `outcome: 'success'` -- a separate zombie case.
+ - Reference architectures (OpenClaw, Claude Code) are interactive session-coordination systems, not batch/DAG pipelines. They are more comparable than initially assumed.
+ - Most transferable patterns: pi-mono tool call hooks, OpenClaw session actor queue, Claude Code state-via-store model (WorkRail already uses this).
+ - spawn_agent has never been used in practice (0 calls in 3,278 daemon events). All analysis is code-only.

+ ### Chosen Direction
+ **v1: Infrastructure fix + hardcoded timeout + 4 coordinator rules**

+ 1. Fix `delivery_failed -> 'error'` in `makeSpawnAgentTool` (hard blocker -- must ship with the coordinator)
+ 2. Hardcode `agentConfig: { maxSessionMinutes: 15 }` in all spawn calls
+ 3. Coordinator rule: `childSessionId === null` with any outcome = error
+ 4. Coordinator rule: go/no-go check -- if < 20 min of session time remain, do not spawn
+ 5. Layer D: record spawn result JSON in step notes before acting

+ **Key constraint:** Single spawn per coordinator session. Multi-spawn requires Layer B (v2).

+ ### Strongest Alternative (Runner-Up)
+ B+C+D composition: coordinator-owned timeout budget (Layer B) + CoordinatorSpawnResult type mapping (Layer C) + notes-as-retry-ledger (Layer D). Loses for v1 because Layer B has silent failure modes (LLM arithmetic) and Layer C is a workaround for a one-line infrastructure bug.

+ ### Why It Won
+ - The infrastructure fix addresses the root cause at the correct abstraction layer (not a prompt-level workaround)
+ - The hardcoded timeout eliminates the silent failure mode of LLM arithmetic
+ - No speculative abstractions -- every component is justified by an identified failure mode
+ - Adversarial review validated 3 of 4 components; the false positive on Layer D was refuted by code evidence

+ ### Confidence Band
+ HIGH. Direction grounded, challenged, reviewed, confirmed. Remaining gaps are empirical.

+ ### Failure Mode Catalog

+ | # | Failure Mode | Mechanism | Severity | Mitigation | Status |
+ |---|-------------|-----------|----------|------------|--------|
+ | FM1 | delivery_failed treated as success | makeSpawnAgentTool maps delivery_failed -> 'success' | HIGH | Fix in infrastructure: return outcome: 'error' | Required (hard blocker) |
+ | FM2 | Zombie session (null childSessionId) | Token decode failure proceeds silently with childSessionId: null and outcome: 'success' | HIGH | Coordinator rule: treat null childSessionId as error | Required |
+ | FM3 | Spawned session hangs at max_turns | No bounded timeout on spawned session | HIGH | Hardcode maxSessionMinutes: 15 | Required |
+ | FM4 | Coordinator dies waiting for child | Nested timeout: coordinator and child have the same timeout budget | HIGH | Go/no-go check: < 20 min remaining = don't spawn | Required |
+ | FM5 | Fix agent introduces new bug | Coordinator has no way to verify fix quality | MEDIUM | Out of scope for v1; requires verification spawn | Accepted / v2 |
+ | FM6 | Concurrent coordinators on same repo | spawn_agent bypasses semaphore | MEDIUM | WorkRail session queue prevents races at a higher level | Already handled |
+ | FM7 | worktrain-await race condition | Timeout checked once per loop, not per poll | LOW | Negligible at 15-min sessions / 3s poll | Accepted (file as separate bug) |
+ | FM8 | Context compaction strips retry state | Context variables may be compacted | MEDIUM | Layer D: step notes are durable (MAX_SESSION_RECAP_NOTES=3 injects prior notes) | Mitigated |

+ ### Minimum Viable Robustness Checklist

+ A pre-ship reviewer can use this to verify coordinator v1 is ready:

+ - [ ] **Infrastructure fix landed:** `makeSpawnAgentTool` returns `outcome: 'error'` for `delivery_failed` (not `'success'`). Check `src/daemon/workflow-runner.ts` ~line 1580.
+ - [ ] **Hardcoded timeout set:** All spawn triggers in the coordinator pass `agentConfig: { maxSessionMinutes: 15 }`. No dynamic calculation.
+ - [ ] **Null childSessionId check present:** The coordinator explicitly checks `childSessionId !== null` before treating the outcome as success. If null, it treats the result as an error.
+ - [ ] **Go/no-go check present:** The coordinator checks remaining session time before spawning. If < 20 minutes, it returns an error without spawning.
+ - [ ] **Spawn record written to notes:** Before acting on the spawn result, the coordinator writes a JSON record to step notes: `{ childSessionId, outcome, elapsedMs }`.
+ - [ ] **Single-spawn constraint documented:** The coordinator design spec explicitly states this coordinator makes one spawn per session. Multi-spawn is not supported.
+ - [ ] **Real run performed:** At least 1 real coordinator session run before declaring v1 stable. Timeout values revisited after.

+ ### Reference Architecture Precedents Applied

+ | System | Pattern | Applied In |
+ |--------|---------|------------|
+ | OpenClaw | Session actor queue: serialize messages per session | WorkRail's DaemonSessionManager already does this |
+ | pi-mono | Tool call hooks: BeforeToolCallResult / AfterToolCallResult | Evidence gating pattern (v2 concern) |
+ | Claude Code | State-via-store: subagents report to a durable store, not context | Already in WorkRail's design; the coordinator uses the session store, not context |
+ | LangGraph | Time-travel checkpointing | WorkRail's checkpoint/rewind feature (existing) |
+ | nexus-core | Knowledge injection before each LLM call | WorkRail's session recap (MAX_SESSION_RECAP_NOTES) does this |

+ ### Next Actions

+ 1. **Now:** Fix `delivery_failed -> 'error'` in `makeSpawnAgentTool`. This is the infrastructure fix that unblocks coordinator design.
+ 2. **Now:** Design the coordinator workflow with the 4 coordinator rules above. Use `design-candidates-spawn-agent.md` as the design spec.
+ 3. **After the first 3 real runs:** Revisit the 15-min timeout value. File the worktrain-await race condition as a separate bug.
+ 4. **v2:** Add Layer B (dynamic timeout budgeting at the infrastructure level, not LLM arithmetic) and multi-spawn support.

+ ### Residual Risks

+ 1. The 15-min timeout value is a heuristic with no empirical validation. It may be too conservative or too liberal.
+ 2. delivery_failed frequency is unknown in practice (0 production spawn calls). The fix is correct architecture regardless.

@@ -0,0 +1,336 @@
1
+ # Workflow Selection for Discovery-Only Tasks
2
+
3
+ **Status:** Discovery in progress
4
+ **Session:** wr.discovery
5
+ **Date:** 2026-04-17
6
+
7
+ **Artifact strategy:** This document is for human reading. Execution truth (context variables, step notes) lives in WorkRail session state, not here. This doc is updated at each phase but is not the primary memory -- it can be reconstructed from notes if lost.
8
+
9
+ ---
10
+
11
+ ## Context / Ask
12
+
13
+ A daemon session was dispatched using `coding-task-workflow-agentic` with a goal that said "Discovery only -- Do NOT write any code". The session ran 11 advances, produced good design candidate notes, stopped at event 74 with no `run_completed`, and the later advances had no note output (likely conditional skips).
14
+
15
+ The question: for a discovery-only task (no code, just a design document), should we use `coding-task-workflow-agentic` or `wr.discovery`? And can `coding-task-workflow-agentic` be trusted to stay in discovery mode when the goal explicitly says no code?
16
+
17
+ ---
18
+
19
+ ## Path Recommendation
20
+
21
+ **Path:** `landscape_first`
22
+
23
+ **Rationale:** The dominant need here is to understand the current structure of two specific workflows and compare their fitness for a known task class (discovery-only). The answer is primarily a landscape/comparison problem, not an ambiguous framing problem. `landscape_first` is the right fit. `full_spectrum` is not needed because we are not uncertain about what the problem is -- we have a concrete incident and two concrete artifacts. `design_first` would be appropriate only if we suspected the stated problem was the wrong problem, and we do not.
24
+
25
+ ---
26
+
27
+ ## Constraints / Anti-goals
28
+
29
+ **Constraints:**
30
+ - We have two concrete workflow JSON files to analyze
31
+ - We have a concrete triggers.yml with one `workflowId` configured
32
+ - The daemon session behavior is a real observed incident, not a hypothesis
33
+
34
+ **Anti-goals:**
35
+ - Do not redesign either workflow
36
+ - Do not recommend changes to workflow step content
37
+ - Do not propose a new workflow; only decide which existing one to use
38
+
39
+ ---
40
+
41
+ ## Landscape Packet
42
+
43
### Current state summary

`coding-task-workflow-agentic` (lean v2, v1.1.0) is a full implementation lifecycle workflow. Its `about` field says: "Use this to implement a software feature or task." Its preconditions include "A deterministic validation path exists (tests, build, or an explicit verification strategy)." It explicitly describes what it produces: `implementation_plan.md`, `spec.md`, code slices, and a PR-ready handoff with commit JSON.

`wr.discovery` (v3.1.0) is a structured thinking/design workflow. Its `about` field says: "Use this to explore and think through a problem end-to-end." Its metaGuidance explicitly states: "Boundary: this workflow can end with a recommendation memo, prototype or test plan, or a research-informed direction. It should not implement production code."

### Step structure analysis: coding-task-workflow-agentic

| Step | Condition | Discovery-relevant? |
|------|-----------|---------------------|
| phase-0: Understand & Classify | always runs | Yes -- classifies complexity/rigor |
| phase-1a: State Hypothesis | `taskComplexity != Small AND rigorMode != QUICK` | Yes |
| phase-1b-design-quick: Lightweight Design | `taskComplexity != Small AND rigorMode == QUICK` | Yes |
| phase-1b-design-deep: Tension-Driven Design | `taskComplexity != Small AND rigorMode != QUICK` | Yes |
| phase-1c: Challenge and Select | `taskComplexity != Small` | Yes |
| phase-2: Design Review loop | `taskComplexity != Small` | Yes |
| phase-3: Slice, Plan, and Test Design | `taskComplexity != Small` | Implementation planning |
| phase-3b: Spec (Observable Behavior) | `taskComplexity != Small AND (Large OR High risk)` | Implementation planning |
| phase-4: Plan Audit loop | `taskComplexity != Small AND rigorMode != QUICK` | Implementation planning |
| phase-5: Small Task Fast Path | `taskComplexity == Small` | Implementation (code required) |
| phase-6: Implement Slice-by-Slice loop | `taskComplexity != Small` | **Code writing** |
| phase-7: Final Verification loop | `taskComplexity != Small` | **Code verification** |

**Key finding:** For a task classified as `Small`, the workflow skips phases 1a, 1b, 1c, 2, 3, 3b, 4, 6, 7 and runs only phase-0 and phase-5. Phase-5 (Small Task Fast Path) **explicitly requires writing code** and producing a handoff JSON block with `filesChanged`. There is no "Small + discovery only" path.

For Medium/Large tasks, the workflow runs the full design pipeline (phases 0-4), which produces `design-candidates.md` -- but it then continues directly into implementation (phases 6-7). There is no early exit after design.

**Does coding-task-workflow-agentic have a "discovery only" mode?** No. It has no `runCondition` or context variable that would stop before implementation when the goal says "no code". The only escape hatch would be the agent choosing to stop itself based on the goal text -- an honor-system constraint, not a structural guarantee.

### What phases run for Small vs Medium/Large

**Small task path:**
- phase-0 (classify)
- phase-5 (fast path -- writes code, produces commit JSON)
- All other phases are skipped by their `runCondition` checks on `taskComplexity`
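
The skip logic above can be sketched as a toy evaluator. This is illustrative only -- the real engine evaluates `runCondition` expressions from the workflow JSON, and the phase list here is abbreviated; the `rigorMode` value `"DEEP"` is a placeholder for any non-QUICK mode:

```typescript
// Toy re-implementation of the phase-skip logic from the table above.
type Ctx = { taskComplexity: "Small" | "Medium" | "Large"; rigorMode: "QUICK" | "DEEP" };

const phases: Array<{ id: string; runs: (c: Ctx) => boolean }> = [
  { id: "phase-0", runs: () => true },                                                // always runs
  { id: "phase-1a", runs: c => c.taskComplexity !== "Small" && c.rigorMode !== "QUICK" },
  { id: "phase-5", runs: c => c.taskComplexity === "Small" },                         // writes code
  { id: "phase-6", runs: c => c.taskComplexity !== "Small" },                         // writes code
];

// Which phases run for a given classification.
const ranFor = (c: Ctx) => phases.filter(p => p.runs(c)).map(p => p.id);

// A Small classification runs only phase-0 and phase-5: no design phases,
// straight to the code-writing fast path.
console.log(ranFor({ taskComplexity: "Small", rigorMode: "QUICK" }));
```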

**Medium/Large task path:**
- phase-0 (classify)
- phase-1a/1b/1c (design candidates)
- phase-2 (design review loop)
- phase-3 (implementation plan)
- phase-3b (spec, if Large or High risk)
- phase-4 (plan audit loop)
- phase-6 (implement slice loop -- **writes code**)
- phase-7 (final verification loop)

The daemon session ran 11 advances and stopped at event 74. Given the step structure, for a Medium/Large non-QUICK classification, 11 advances would likely cover phases 0-4 (design + planning), stopping before phase-6 (implementation). This means the session exhausted the design pipeline but never reached code-writing -- not because the workflow has a discovery mode, but because the agent stopped before phase-6, possibly because:
1. The goal text said "no code" and the agent respected it
2. A loop condition evaluation or `requireConfirmation` gate paused/stopped execution
3. The session timed out or the MCP connection dropped before the loop started

The "no note output on later advances" is consistent with conditional steps being skipped (e.g., phase-3b skipped because not Large/High-risk, or loop steps stopping early).

### wr.discovery landscape

`wr.discovery` runs: path selection -> capability setup -> landscape understanding -> problem framing -> re-triage (conditional) -> synthesis -> candidate generation -> challenge/selection -> direction review loop -> uncertainty resolution (direct recommendation / research loop / prototype loop) -> final validation -> handoff.

It explicitly cannot produce production code. It always ends with a design document, recommendation memo, or prototype spec. There is no implementation path in the workflow.

### Option categories

1. **Use wr.discovery** for discovery tasks, `coding-task-workflow-agentic` for implementation tasks
2. **Use coding-task-workflow-agentic for everything**, trusting the agent to stop early when the goal says "no code"
3. **Add a discovery-mode flag** to `coding-task-workflow-agentic` via a `runCondition` on phases 6-7
4. **Use separate triggers** in triggers.yml with different `workflowId` per task type

### Contradictions / disagreements

- The daemon session with `coding-task-workflow-agentic` produced "good design candidates notes" -- so the workflow does good design work even though it is intended for implementation. The design pipeline (phases 1-4) is legitimate and high quality.
- The risk is not that `coding-task-workflow-agentic` does bad design work. The risk is that (a) it might not stop before phase-6 reliably, and (b) it carries implementation framing (slices, spec, PR handoff) that pollutes a pure discovery context.

### Evidence gaps

- We do not know the exact event log from the stopped daemon session -- we cannot confirm whether it stopped naturally or by connection drop
- We do not know whether the agent in that session reached phase-6 or stopped before it
- We cannot test "honor system" reliability without more session data

---

## Problem Frame Packet

### Users / stakeholders

- Daemon dispatcher: needs to select the right `workflowId` in triggers.yml
- Agent executing the session: needs structural guarantees, not honor-system constraints
- Developer (you): needs the design document output to be pure and trustworthy

### Jobs / goals / outcomes

- Dispatch a session that produces a design document and nothing else
- Know with certainty that no code will be written, regardless of agent judgment
- Get a high-quality, structured design output comparable to what coding-task-workflow-agentic's design phases produce

### Pains / tensions / constraints

- The daemon currently has ONE `workflowId` in triggers.yml -- no per-task routing
- `coding-task-workflow-agentic` is trusted for design quality but is not structurally bounded to stop before code
- `wr.discovery` is structurally bounded to no-code but may produce design output of different depth

### Success criteria

1. A discovery-only task produces only a design document, never code or a PR
2. The selection is structural (a wrong `workflowId` cannot accidentally write code), not honor-system
3. The design quality is not degraded by switching to `wr.discovery`

### Assumptions

- The daemon reads `workflowId` directly from triggers.yml and cannot dynamically select based on goal text
- `wr.discovery` produces design candidates comparable in quality to what phases 1-4 of `coding-task-workflow-agentic` produce
- triggers.yml supports multiple trigger entries with different `workflowId` values

### Reframes / HMW questions

- How might we route discovery tasks to `wr.discovery` and implementation tasks to `coding-task-workflow-agentic` at the dispatcher level instead of relying on agent judgment?
- How might we make "discovery only" a structural guarantee rather than a goal-text instruction?

### What would make this framing wrong

- If the daemon cannot support multiple triggers, option 4 (separate triggers) is blocked
- If `wr.discovery` produces materially weaker design output for technical workflow questions, the quality tradeoff matters

---

## Candidate Generation Expectations (landscape_first)

Because this is a `landscape_first` path, the candidate set must:
- Clearly reflect the landscape findings (the actual step structure of both workflows, the triggers.yml constraint)
- Not invent options that contradict what was observed in the workflow files
- Include at least one option that uses existing structure without any modification
- Include a runner-up option that feels like a real alternative, not just a straw man

The three candidates (A, B, C) below were derived directly from the landscape analysis, not from free invention.

---

## Candidate Directions

### Direction A: Use wr.discovery for discovery tasks (structural routing)

Configure a second trigger entry in triggers.yml with `workflowId: wr.discovery` for discovery-only goals. The structural guarantee is that `wr.discovery` cannot write code -- it does not have those steps. The daemon would need to support routing (two triggers, each with a matching rule or explicit goal flag).

**Why it fits:** Structural guarantee. `wr.discovery` was explicitly designed for this use case. Its metaGuidance says "should not implement production code."

**Strongest evidence for it:** The session incident shows the risk of relying on honor-system stop behavior in `coding-task-workflow-agentic`. Structural routing removes the risk entirely.

**Strongest risk against it:** triggers.yml currently contains a single trigger entry. If the daemon cannot support multiple triggers with per-task routing, this requires daemon work. Also, `wr.discovery` produces a recommendation memo/design doc, not the same `design-candidates.md` artifact shape that `coding-task-workflow-agentic` phases 1-4 produce.

**When it should win:** Always, for any task where the desired output is a design document and there is no intent to implement code in the same session.

---

### Direction B: Trust coding-task-workflow-agentic with honor-system stop

Keep triggers.yml as-is. Rely on the goal text ("Discovery only -- Do NOT write any code") to instruct the agent to stop before phase-6.

**Why it fits (weakly):** The prior session actually did produce design candidates and apparently stopped before code. It worked once.

**Strongest evidence for it:** The 11-advance session with good design notes suggests the agent did respect the goal text.

**Strongest risk against it:** The workflow has no structural stop before phase-6. A future session could classify the task differently, run through phases 0-4 faster, and reach phase-6 before the session ends. Phase-6 will attempt to implement code. The only protection is the agent re-reading the goal and choosing not to implement -- which is fragile under long sessions, context window pressure, or agent model changes.

**When it should win:** Never for production use. Acceptable as a short-term workaround only.

---

### Direction C: Add discoveryMode flag to coding-task-workflow-agentic

Modify `coding-task-workflow-agentic` to support a `discoveryMode` context variable. Add `runCondition: { var: "discoveryMode", not_equals: true }` to phases 6 and 7. Pass `discoveryMode: true` via the goal or a trigger-level context override.
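
A minimal sketch of what that might look like in the workflow JSON. The `and` combinator and the exact condition schema are assumptions about the engine's syntax; the `taskComplexity` clause is carried over from the existing phase-6 condition in the step table:

```json
{
  "id": "phase-6",
  "title": "Implement Slice-by-Slice",
  "runCondition": {
    "and": [
      { "var": "taskComplexity", "not_equals": "Small" },
      { "var": "discoveryMode", "not_equals": true }
    ]
  }
}
```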

**Why it fits:** Preserves the high-quality design pipeline of `coding-task-workflow-agentic` while adding a structural stop before implementation.

**Strongest evidence for it:** The design phases (1-4) of `coding-task-workflow-agentic` are well-designed and familiar. Reusing them avoids duplication.

**Strongest risk against it:** This requires modifying a core workflow file. It adds complexity to a workflow that was designed for a different purpose. It creates a hybrid that does neither thing cleanly. And triggers.yml still only has one trigger, so the `discoveryMode` value must come from somewhere (goal-text parsing? trigger-level context?).

**When it should win:** If modifying `wr.discovery` or the daemon is not an option, and modifying `coding-task-workflow-agentic` is cheap and acceptable.

---

## Challenge Notes

**Against Direction A (wr.discovery):** The design output format differs. `coding-task-workflow-agentic` produces `design-candidates.md` via the `tension-driven-design` routine, followed by `design-review-findings.md` and a full `implementation_plan.md`. `wr.discovery` produces a design doc with Candidate Directions and a recommendation. For a technical question about workflow architecture, the `wr.discovery` output (a recommendation memo) is actually _more_ appropriate than `implementation_plan.md`. The format difference is not a disadvantage.

**Against Direction B:** The incident already showed the risk. The session stopped at event 74 with no `run_completed`. We do not know if it stopped intentionally or by timeout/connection drop. If it stopped by timeout, the next session might not stop in the same place. Structural guarantees are always preferred over honor-system constraints when the downside (code written to a wrong branch) is recoverable but costly.

**Against Direction C:** Modifying `coding-task-workflow-agentic` for a use case it was not designed for violates the "make illegal states unrepresentable" principle. It is better to use the right tool than to add a mode switch to the wrong tool.

---

## Resolution Notes

**Direction A wins.** `wr.discovery` is the right workflow for discovery-only tasks. The structural guarantee -- no implementation steps exist in the workflow -- is strictly better than an honor-system stop. The triggers.yml configuration needs to evolve to support per-task workflow routing.

---

## Decision Log

| Decision | Rationale |
|----------|-----------|
| `landscape_first` path chosen | We have two concrete artifacts to compare; this is a comparison/routing question, not an ambiguous framing problem |
| Direction A selected | Structural guarantees are always preferred over honor-system constraints; `wr.discovery` was built for this |
| Direction B rejected | Honor-system stop is fragile; the incident confirmed the risk |
| Direction C rejected | Adding a mode switch to the wrong tool is worse than using the right tool |
| Multiple triggers confirmed | Read `src/trigger/trigger-store.ts` and `src/trigger/trigger-router.ts`. `loadTriggerConfig()` loads all entries; `buildTriggerIndex()` maps by unique `id`; `route()` dispatches by `triggerId`. A second trigger entry with `workflowId: wr.discovery` works today with zero code changes. |
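
For illustration, the routing behavior described in that last row can be approximated as follows. This is a simplified re-sketch, not the actual `trigger-router.ts` code -- the real entries carry more fields (workspace, goal, agent config) and richer error handling:

```typescript
// Minimal model of a triggers.yml entry: each trigger id carries its own workflowId.
interface TriggerEntry {
  id: string;
  workflowId: string;
}

// Index all configured triggers by their unique id, as loadTriggerConfig() +
// buildTriggerIndex() are described to do.
function buildTriggerIndex(entries: TriggerEntry[]): Map<string, TriggerEntry> {
  const index = new Map<string, TriggerEntry>();
  for (const entry of entries) {
    if (index.has(entry.id)) throw new Error(`duplicate trigger id: ${entry.id}`);
    index.set(entry.id, entry);
  }
  return index;
}

// Dispatch by triggerId: selecting the trigger selects the workflow.
function route(index: Map<string, TriggerEntry>, triggerId: string): string {
  const entry = index.get(triggerId);
  if (!entry) throw new Error(`unknown trigger: ${triggerId}`);
  return entry.workflowId;
}

const index = buildTriggerIndex([
  { id: "test-task", workflowId: "coding-task-workflow-agentic" },
  { id: "discovery-task", workflowId: "wr.discovery" },
]);
console.log(route(index, "discovery-task"));
```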

---

## Final Summary

### Selected direction: Direction A -- use wr.discovery for discovery-only tasks

**Confidence band: High**

#### Recommendation

For a discovery-only task (no code, just a design document):
- **Use `wr.discovery`**, not `coding-task-workflow-agentic`
- Add a second trigger entry to `triggers.yml` with a unique `id` and `workflowId: wr.discovery`
- The daemon's trigger-store.ts and trigger-router.ts already support multiple triggers with different `workflowId` values -- no code change required

#### Example triggers.yml configuration

```yaml
triggers:
  - id: test-task
    provider: generic
    workflowId: coding-task-workflow-agentic
    workspacePath: /Users/etienneb/git/personal/workrail
    goal: "Add the evidenceFrom field to AssessmentDimension..."
    concurrencyMode: parallel
    autoCommit: false
    agentConfig:
      maxSessionMinutes: 60

  - id: discovery-task
    provider: generic
    workflowId: wr.discovery
    workspacePath: /Users/etienneb/git/personal/workrail
    goal: "Discovery only: ..."
    concurrencyMode: parallel
    autoCommit: false
    agentConfig:
      maxSessionMinutes: 60
```

The caller must send the correct `triggerId` (`discovery-task` vs `test-task`) when firing the webhook.
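
As a sketch only -- the daemon's actual webhook payload schema is not documented here, so everything except the `triggerId` field is hypothetical; the one point grounded in the analysis is that `triggerId` alone selects the trigger entry, and the entry carries the `workflowId`:

```json
{
  "triggerId": "discovery-task"
}
```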

#### Why coding-task-workflow-agentic cannot be trusted in discovery mode

`coding-task-workflow-agentic` has no structural stop before phase-6 (Implement Slice-by-Slice). For Small tasks, phase-5 (Small Task Fast Path) explicitly requires writing code. For Medium/Large tasks, the design pipeline (phases 0-4) produces good design work, then phase-6 writes code. The only protection against code-writing is the agent choosing to stop based on goal text -- an honor-system constraint that can fail under context window pressure.

The prior session stopped at event 74 (likely after phase-4, before phase-6) -- but we cannot confirm whether this was agent judgment or a connection drop. With `wr.discovery`, the question is irrelevant: there are no phases 6-7 to reach.

#### What phases coding-task-workflow-agentic skips for Small tasks

- Skips: phase-1a (hypothesis), phase-1b (design), phase-1c (challenge), phase-2 (design review), phase-3 (plan), phase-3b (spec), phase-4 (plan audit), phase-6 (implementation), phase-7 (verification)
- Runs: phase-0 (classify) and phase-5 (Small Task Fast Path -- **writes code**)

For Medium/Large tasks, all phases run in sequence, including phase-6 (implementation).

#### Would wr.discovery have been a better choice?

Yes, without qualification. `wr.discovery` was designed for exactly this use case. Its metaGuidance states: "should not implement production code." All paths end with a recommendation memo, prototype spec, or research plan. It uses the same `tension-driven-design` routine as `coding-task-workflow-agentic`'s phase-1b, so design quality is equivalent.

#### How to configure triggers.yml for discovery vs implementation

- **Implementation tasks**: `workflowId: coding-task-workflow-agentic` -- use the existing `test-task` trigger or rename it
- **Discovery tasks**: `workflowId: wr.discovery` -- add a new trigger entry (e.g., `id: discovery-task`)
- Route by sending the correct `triggerId` in the webhook
312
+
313
+ #### Workflow selection strategy when the daemon has ONE workflowId configured
314
+
315
+ The current `test-task` trigger always dispatches to `coding-task-workflow-agentic`. For discovery tasks, either:
316
+ 1. Add a second trigger entry (preferred -- structural routing, zero code change)
317
+ 2. Temporarily change the trigger's `workflowId` to `wr.discovery` for discovery sessions, then change it back (workable but manual and error-prone)
318
+ 3. Use console AUTO dispatch and set `workflowId: wr.discovery` explicitly in the dispatch request (for console-dispatched sessions only)
319
+
320
+ Option 1 is the right answer.
321
+
322
+ ### Strongest alternative: Direction C (add discoveryMode flag to coding-task-workflow-agentic)
323
+
324
+ If the two-trigger routing were unavailable (it is not), adding `runCondition: { var: "discoveryMode", not_equals: true }` to phases 6-7 would also provide structural enforcement. Loses: workflow cleanliness, YAGNI compliance, reversibility. Not recommended when Direction A is available.
325
+
326
+ ### Residual risks
327
+
328
+ 1. **Console dispatch scope boundary** (Yellow): console AUTO dispatch uses `workflowId` directly, not `triggerId`. For console-dispatched discovery sessions, the caller must explicitly set `workflowId: wr.discovery`. The two-trigger triggers.yml setup covers webhook-triggered sessions only.
329
+
330
+ 2. **Prior session at event 74**: stop reason unknown. If it was a connection drop, the design pipeline output may be incomplete. Review the session artifacts before using them. Direction A eliminates this risk for future sessions.
331
+
332
+ ### Next actions
333
+
334
+ 1. Add a second trigger entry to `triggers.yml` with `id: discovery-task` and `workflowId: wr.discovery`
335
+ 2. Route discovery-only goals to the `discovery-task` trigger ID when firing webhooks
336
+ 3. Review the prior session's design artifacts for completeness