@exaudeus/workrail 3.67.0 → 3.68.0
- package/dist/application/services/compiler/template-registry.js +10 -1
- package/dist/cli/commands/worktrain-init.js +1 -1
- package/dist/console-ui/assets/{index-tOl8Vowf.js → index-CyzltI6D.js} +1 -1
- package/dist/console-ui/index.html +1 -1
- package/dist/coordinators/modes/full-pipeline.js +4 -4
- package/dist/coordinators/modes/implement-shared.js +5 -5
- package/dist/coordinators/modes/implement.js +4 -4
- package/dist/coordinators/pr-review.js +4 -4
- package/dist/daemon/workflow-runner.d.ts +1 -0
- package/dist/daemon/workflow-runner.js +1 -0
- package/dist/manifest.json +25 -25
- package/dist/mcp/handlers/v2-workflow.js +1 -1
- package/dist/mcp/workflow-protocol-contracts.js +2 -2
- package/docs/authoring-v2.md +4 -4
- package/docs/changelog-recent.md +3 -3
- package/docs/configuration.md +1 -1
- package/docs/design/adaptive-coordinator-context-candidates.md +1 -1
- package/docs/design/adaptive-coordinator-context.md +1 -1
- package/docs/design/adaptive-coordinator-routing-candidates.md +18 -18
- package/docs/design/adaptive-coordinator-routing-review.md +1 -1
- package/docs/design/adaptive-coordinator-routing.md +34 -34
- package/docs/design/agent-cascade-protocol.md +2 -2
- package/docs/design/console-daemon-separation-discovery.md +323 -0
- package/docs/design/context-assembly-design-candidates.md +1 -1
- package/docs/design/context-assembly-implementation-plan.md +1 -1
- package/docs/design/context-assembly-layer.md +2 -2
- package/docs/design/context-assembly-review-findings.md +1 -1
- package/docs/design/coordinator-access-audit.md +293 -0
- package/docs/design/coordinator-architecture-audit.md +62 -0
- package/docs/design/coordinator-error-handling-audit.md +240 -0
- package/docs/design/coordinator-testability-audit.md +426 -0
- package/docs/design/daemon-architecture-discovery.md +1 -1
- package/docs/design/daemon-console-separation-discovery.md +242 -0
- package/docs/design/daemon-memory-audit.md +203 -0
- package/docs/design/design-candidates-console-daemon-separation.md +256 -0
- package/docs/design/design-candidates-discovery-loop-fix.md +141 -0
- package/docs/design/design-review-findings-console-daemon-separation.md +106 -0
- package/docs/design/design-review-findings-discovery-loop-fix.md +81 -0
- package/docs/design/discovery-loop-fix-candidates.md +161 -0
- package/docs/design/discovery-loop-fix-design-review.md +106 -0
- package/docs/design/discovery-loop-fix-validation.md +258 -0
- package/docs/design/discovery-loop-investigation-A.md +188 -0
- package/docs/design/discovery-loop-investigation-B.md +287 -0
- package/docs/design/exploration-workflow-candidates.md +205 -0
- package/docs/design/exploration-workflow-design-review.md +166 -0
- package/docs/design/exploration-workflow-discovery.md +443 -0
- package/docs/design/ide-context-files-candidates.md +231 -0
- package/docs/design/ide-context-files-design-review.md +85 -0
- package/docs/design/ide-context-files.md +615 -0
- package/docs/design/implementation-plan-discovery-loop-fix.md +199 -0
- package/docs/design/implementation-plan-queue-poll-rotation.md +102 -0
- package/docs/design/in-process-http-audit.md +190 -0
- package/docs/design/layer3b-ghost-nodes-design-candidates.md +2 -2
- package/docs/design/loadSessionNotes-candidates.md +108 -0
- package/docs/design/loadSessionNotes-test-coverage-discovery.md +297 -0
- package/docs/design/loadSessionNotes-test-coverage-session4.md +209 -0
- package/docs/design/loadSessionNotes-test-coverage-v3.md +321 -0
- package/docs/design/probe-session-design-candidates.md +261 -0
- package/docs/design/probe-session-phase0.md +490 -0
- package/docs/design/routines-guide.md +7 -7
- package/docs/design/session-metrics-attribution-candidates.md +250 -0
- package/docs/design/session-metrics-attribution-design-review.md +115 -0
- package/docs/design/session-metrics-attribution-discovery.md +319 -0
- package/docs/design/session-metrics-candidates.md +227 -0
- package/docs/design/session-metrics-design-review.md +104 -0
- package/docs/design/session-metrics-discovery.md +454 -0
- package/docs/design/spawn-session-debug.md +202 -0
- package/docs/design/trigger-validator-candidates.md +214 -0
- package/docs/design/trigger-validator-review.md +109 -0
- package/docs/design/trigger-validator-shaping-phase0.md +239 -0
- package/docs/design/trigger-validator.md +454 -0
- package/docs/design/v2-core-design-locks.md +2 -2
- package/docs/design/workflow-extension-points.md +15 -15
- package/docs/design/workflow-id-validation-at-startup.md +1 -1
- package/docs/design/workflow-id-validation-implementation-plan.md +2 -2
- package/docs/design/workflow-trigger-lifecycle-audit.md +175 -0
- package/docs/design/worktrain-task-queue-candidates.md +5 -5
- package/docs/design/worktrain-task-queue.md +4 -4
- package/docs/discovery/coordinator-script-design.md +1 -1
- package/docs/discovery/coordinator-ux-discovery.md +3 -3
- package/docs/discovery/simulation-report.md +1 -1
- package/docs/discovery/workflow-modernization-discovery.md +326 -0
- package/docs/discovery/workflow-selection-for-discovery-tasks.md +33 -33
- package/docs/discovery/worktrain-status-briefing.md +1 -1
- package/docs/discovery/wr-discovery-goal-reframing.md +1 -1
- package/docs/docker.md +1 -1
- package/docs/ideas/backlog.md +227 -0
- package/docs/ideas/third-party-workflow-setup-design-thinking.md +1 -1
- package/docs/integrations/claude-code.md +5 -5
- package/docs/integrations/firebender.md +1 -1
- package/docs/plans/agentic-orchestration-roadmap.md +2 -2
- package/docs/plans/mr-review-workflow-redesign.md +9 -9
- package/docs/plans/ui-ux-workflow-design-candidates.md +4 -4
- package/docs/plans/ui-ux-workflow-discovery.md +2 -2
- package/docs/plans/workflow-categories-candidates.md +8 -8
- package/docs/plans/workflow-categories-discovery.md +4 -4
- package/docs/plans/workflow-modernization-design.md +430 -0
- package/docs/plans/workflow-staleness-detection-candidates.md +11 -11
- package/docs/plans/workflow-staleness-detection-review.md +4 -4
- package/docs/plans/workflow-staleness-detection.md +9 -9
- package/docs/plans/workrail-platform-vision.md +3 -3
- package/docs/reference/agent-context-cleaner-snippet.md +1 -1
- package/docs/reference/agent-context-guidance.md +4 -4
- package/docs/reference/context-optimization.md +2 -2
- package/docs/roadmap/now-next-later.md +2 -2
- package/docs/roadmap/open-work-inventory.md +16 -16
- package/docs/workflows.md +31 -31
- package/package.json +1 -1
- package/spec/workflow-tags.json +47 -47
- package/workflows/adaptive-ticket-creation.json +16 -16
- package/workflows/architecture-scalability-audit.json +22 -22
- package/workflows/bug-investigation.agentic.v2.json +3 -3
- package/workflows/classify-task-workflow.json +1 -1
- package/workflows/coding-task-workflow-agentic.json +6 -6
- package/workflows/cross-platform-code-conversion.v2.json +8 -8
- package/workflows/document-creation-workflow.json +8 -8
- package/workflows/documentation-update-workflow.json +8 -8
- package/workflows/intelligent-test-case-generation.json +2 -2
- package/workflows/learner-centered-course-workflow.json +2 -2
- package/workflows/mr-review-workflow.agentic.v2.json +4 -4
- package/workflows/personal-learning-materials-creation-branched.json +8 -8
- package/workflows/presentation-creation.json +5 -5
- package/workflows/production-readiness-audit.json +1 -1
- package/workflows/relocation-workflow-us.json +31 -31
- package/workflows/routines/context-gathering.json +1 -1
- package/workflows/routines/design-review.json +1 -1
- package/workflows/routines/execution-simulation.json +1 -1
- package/workflows/routines/feature-implementation.json +3 -3
- package/workflows/routines/final-verification.json +1 -1
- package/workflows/routines/hypothesis-challenge.json +1 -1
- package/workflows/routines/ideation.json +1 -1
- package/workflows/routines/parallel-work-partitioning.json +3 -3
- package/workflows/routines/philosophy-alignment.json +2 -2
- package/workflows/routines/plan-analysis.json +1 -1
- package/workflows/routines/plan-generation.json +1 -1
- package/workflows/routines/tension-driven-design.json +6 -6
- package/workflows/scoped-documentation-workflow.json +26 -26
- package/workflows/ui-ux-design-workflow.json +14 -14
- package/workflows/workflow-diagnose-environment.json +1 -1
- package/workflows/workflow-for-workflows.json +1 -1
|
@@ -0,0 +1,175 @@
|
|
|
1
|
+
# WorkflowTrigger and Session Lifecycle Audit
|
|
2
|
+
|
|
3
|
+
**Date:** 2026-04-19
|
|
4
|
+
**Scope:** `WorkflowTrigger` interface field categorization; session lifecycle ownership map
|
|
5
|
+
**Status:** Findings only -- no implementation plan attached
|
|
6
|
+
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
## Part 1: WorkflowTrigger Field-by-Field Analysis
|
|
10
|
+
|
|
11
|
+
`WorkflowTrigger` is defined in `src/daemon/workflow-runner.ts` at line 288.
|
|
12
|
+
It currently has 14 top-level fields (plus 6 nested under `agentConfig`).
|
|
13
|
+
|
|
14
|
+
### Field inventory with category labels
|
|
15
|
+
|
|
16
|
+
| Field | Category | Notes |
|---|---|---|
| `workflowId` | Trigger config | Sourced from TriggerDefinition; known before session start |
| `goal` | Trigger config | May be dynamically interpolated from goalTemplate but is resolved before WorkflowTrigger is constructed |
| `workspacePath` | Trigger config | Doubles as the session working directory; passed to tool factories |
| `context` | Trigger config | Workflow context variables from contextMapping |
| `referenceUrls` | Trigger config | Pass-through from TriggerDefinition so workflow-runner stays decoupled |
| `soulFile` | Trigger config | Resolved path, cascade from trigger -> workspace -> global |
| `branchStrategy` | Trigger config | 'worktree' or 'none'; sourced from TriggerDefinition |
| `baseBranch` | Trigger config | Only meaningful when branchStrategy === 'worktree' |
| `branchPrefix` | Trigger config | Only meaningful when branchStrategy === 'worktree' |
| `botIdentity` | Trigger config / Delivery config | Set by polling-scheduler at trigger time; read by delivery-action.ts at commit time |
| `agentConfig.model` | Trigger config | Per-trigger model override |
| `agentConfig.maxSessionMinutes` | Trigger config | Session execution policy |
| `agentConfig.maxTurns` | Trigger config | Session execution policy |
| `agentConfig.maxSubagentDepth` | Trigger config | Session execution policy |
| `agentConfig.stuckAbortPolicy` | Session execution policy | Controls runtime behavior inside the agent loop |
| `agentConfig.noProgressAbortEnabled` | Session execution policy | Controls runtime behavior inside the agent loop |
| `_preAllocatedStartResponse` | Session runtime state | Set by dispatch HTTP handler or spawnSession; skips duplicate executeStartWorkflow() |
| `parentSessionId` | Session runtime state | Set by makeSpawnAgentTool; documented as 'not read by runWorkflow() directly' |
| `spawnDepth` | Session runtime state | Set by parent factory at child construction time; enforces maxSubagentDepth |
|
|
37
|
+
|
|
38
|
+
### Category summary
|
|
39
|
+
|
|
40
|
+
- **Trigger config (10 top-level fields, plus the 4 `agentConfig` fields the table labels as trigger config):** Known before session start, sourced from TriggerDefinition or resolved at dispatch time. These are the "input parameters" the operator writes in triggers.yml.
|
|
41
|
+
- **Session execution policy (2 fields in agentConfig):** Technically sourced from TriggerDefinition but they control runtime behavior inside the agent loop, not the trigger firing condition.
|
|
42
|
+
- **Session runtime state (3 fields):** Set during or after session creation. `_preAllocatedStartResponse` and `parentSessionId` carry infrastructure handshake data; `spawnDepth` carries tree-position state from the parent session factory.
|
|
43
|
+
|
|
44
|
+
### The mixing problem
|
|
45
|
+
|
|
46
|
+
Three categories of fields live in one flat struct with no structural boundary. The consequences:
|
|
47
|
+
|
|
48
|
+
1. **`_preAllocatedStartResponse` is an internal handshake field on a public-facing type.** The underscore prefix is the only signal. `trigger-listener.ts:467` constructs `WorkflowTrigger` with this field from outside the infrastructure layer, proving the underscore convention does not act as an access barrier.
|
|
49
|
+
|
|
50
|
+
2. **`parentSessionId` is documented as "not read by runWorkflow() directly"** (workflow-runner.ts line 382). It exists for documentation and potential future use. A field that is not read by the function it is passed to is a dead field on the input type.
|
|
51
|
+
|
|
52
|
+
3. **`botIdentity` is double-threaded.** It appears on `WorkflowTrigger` (input) and is explicitly re-surfaced on `WorkflowRunSuccess` (output, lines 506-521) so `trigger-router.ts maybeRunDelivery()` can pass it to `delivery-action.ts`. The threading comment says "follows the sessionId/sessionWorkspacePath threading pattern" -- meaning this pattern is now established and will repeat for any future delivery-config field added to the trigger.
|
|
53
|
+
|
|
54
|
+
4. **`branchStrategy`, `baseBranch`, and `branchPrefix` are pass-through forwarding fields** that exist solely because `workflow-runner.ts` must remain decoupled from `TriggerDefinition`. This is the right goal, but the mechanism (copy each field individually onto the flat struct) does not scale as the worktree feature grows.
|
|
55
|
+
|
|
56
|
+
### Proposed decomposition sketch
|
|
57
|
+
|
|
58
|
+
A future refactor could split the interface into three composed types:
|
|
59
|
+
|
|
60
|
+
```typescript
// Fields the trigger operator configures in triggers.yml
interface TriggerConfig {
  readonly workflowId: string;
  readonly goal: string;
  readonly workspacePath: string;
  readonly context?: Readonly<Record<string, unknown>>;
  readonly referenceUrls?: readonly string[];
  readonly soulFile?: string;
  readonly agentConfig?: AgentConfig;
  readonly worktreeConfig?: WorktreeConfig; // branchStrategy, baseBranch, branchPrefix
}

// Fields set by delivery systems; read at git-commit time
interface DeliveryConfig {
  readonly botIdentity?: { readonly name: string; readonly email: string };
}

// Fields set by session infrastructure at creation time
interface SessionContext {
  readonly _preAllocatedStartResponse?: ...; // or move to a separate creation parameter
  readonly parentSessionId?: string;
  readonly spawnDepth?: number;
}

// WorkflowTrigger becomes a composition
type WorkflowTrigger = TriggerConfig & DeliveryConfig & SessionContext;
```
|
|
88
|
+
|
|
89
|
+
The immediate safe win -- without a full refactor -- would be to pull `_preAllocatedStartResponse`, `parentSessionId`, and `spawnDepth` into a separate `SessionContext` parameter to `runWorkflow()` rather than embedding them in the trigger input. This makes the contract explicit: `runWorkflow(config: TriggerConfig, ctx: V2ToolContext, sessionCtx?: SessionContext, ...)`.
|
|
90
|
+
|
|
91
|
+
---
|
|
92
|
+
|
|
93
|
+
## Part 2: Session Lifecycle Ownership Map
|
|
94
|
+
|
|
95
|
+
### Who does what
|
|
96
|
+
|
|
97
|
+
| Responsibility | Owner | Location |
|---|---|---|
| Session sidecar creation (initial) | `runWorkflow()` | `workflow-runner.ts:3279` -- `persistTokens(sessionId, token, checkpoint)` |
| Session sidecar update (add worktreePath) | `runWorkflow()` | `workflow-runner.ts:3333` -- second `persistTokens()` call after worktree creation |
| Session sidecar deletion (normal end) | `runWorkflow()` | `workflow-runner.ts:3376` -- `fs.unlink(DAEMON_SESSIONS_DIR/sessionId.json)` |
| Worktree creation | `runWorkflow()` | `workflow-runner.ts:3299-3354` -- `git worktree add` |
| Worktree removal (normal path, after delivery) | `trigger-router.ts maybeRunDelivery()` | `trigger-router.ts:382-394` -- `git worktree remove --force` |
| Orphan sidecar cleanup (restart recovery) | `runStartupRecovery()` | `workflow-runner.ts:917-928` -- `fs.unlink` per orphaned sidecar |
| Orphan worktree cleanup (restart recovery, after 24h) | `runStartupRecovery()` | `workflow-runner.ts:893-915` -- `git worktree remove --force` |
| Session spawn for adaptive pipeline | `coordinatorDeps.spawnSession` | `trigger-listener.ts:411-476` -- `executeStartWorkflow` + `router.dispatch()` |
| In-process liveness registration | `runWorkflow()` | `workflow-runner.ts:3258` -- `daemonRegistry.register()` |
| In-process liveness heartbeat | `runWorkflow()` -- `onAdvance` callback | `workflow-runner.ts:3162` -- `daemonRegistry.heartbeat()` |
| In-process liveness deregistration | `runWorkflow()` | Multiple early-exit paths and normal completion |
|
|
110
|
+
|
|
111
|
+
### Who is responsible for worktree creation/cleanup?
|
|
112
|
+
|
|
113
|
+
Worktree creation is owned by `runWorkflow()`. Worktree cleanup is **split**:
|
|
114
|
+
|
|
115
|
+
- **Normal path:** `trigger-router.ts maybeRunDelivery()` removes the worktree after delivery completes (success or error). The comment explains why it cannot be in `runWorkflow()`: delivery must run inside the worktree, so `runWorkflow()` must not remove it before returning.
|
|
116
|
+
- **Crash/orphan path:** `runStartupRecovery()` in `workflow-runner.ts` removes worktrees older than 24 hours.
|
|
117
|
+
|
|
118
|
+
This split is architecturally deliberate (the worktree comment at trigger-router.ts:370-381 explains the constraint) but it means no single location owns the full worktree lifecycle. The constraint that forces the split is real: delivery happens outside `runWorkflow()`. The design pressure to resolve it would be to either push delivery into `runWorkflow()`, or introduce a `SessionOwner` abstraction that holds the worktree handle and is responsible for cleanup regardless of which path ends the session.
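The `SessionOwner` option mentioned above can be pictured roughly as follows. This is a minimal sketch, not the project's design: the class name comes from the paragraph above, but its members (`release`, the injected `removeWorktree` callback) and the example paths are assumptions made for illustration.

```typescript
// Hypothetical sketch: the members of SessionOwner are illustrative names,
// not part of the current codebase.
type RemoveWorktree = (worktreePath: string) => void;

class SessionOwner {
  private released = false;

  constructor(
    readonly sessionId: string,
    private readonly worktreePath: string | undefined,
    private readonly removeWorktree: RemoveWorktree,
  ) {}

  // Safe to call from any exit path. Returns true only when this call
  // actually removed the worktree; a throwing removal can be retried.
  release(): boolean {
    if (this.released || this.worktreePath === undefined) {
      return false;
    }
    this.removeWorktree(this.worktreePath);
    this.released = true;
    return true;
  }
}

// Usage: the delivery path and a later failure path both call release().
const removedPaths: string[] = [];
const owner = new SessionOwner(
  "sess_example",
  "/tmp/worktrees/sess_example",
  (path) => removedPaths.push(path),
);
const firstRelease = owner.release();
const secondRelease = owner.release();
```

Injecting the removal callback keeps the sketch independent of how `git worktree remove` is invoked, so whichever caller ends the session can hold the same handle.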
|
|
119
|
+
|
|
120
|
+
### Is DaemonRegistry the authoritative source of truth for "what's running"?
|
|
121
|
+
|
|
122
|
+
No. `DaemonRegistry` is explicitly documented as "ephemeral in-process liveness tracking -- cleared on process restart -- this is intentional" (`src/v2/infra/in-memory/daemon-registry/index.ts:8`). It answers the question "is this session being heartbeated by the currently running process?" It cannot answer:
|
|
123
|
+
|
|
124
|
+
- Was this session running before the process restarted?
|
|
125
|
+
- Is this sidecar file for a live or crashed session?
|
|
126
|
+
- Did delivery complete before the process died?
|
|
127
|
+
|
|
128
|
+
Session sidecar files (`DAEMON_SESSIONS_DIR/*.json`) are the only durable record. `runStartupRecovery()` reads them directly (not through DaemonRegistry) because the registry is empty after a restart.
|
|
129
|
+
|
|
130
|
+
The result is two mechanisms answering the same question, with no overlap and no single API surface that gives a correct answer in all states:
|
|
131
|
+
|
|
132
|
+
| State | DaemonRegistry | Sidecar file |
|---|---|---|
| Session running, same process | Present, heartbeating | Present |
| Session completed normally | Absent (unregistered) | Absent (deleted) |
| Process crashed mid-session | Absent (process died) | Present (not cleaned up) |
| After restart, before recovery | Absent | Present (orphan) |
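The state table can be read as a function of two booleans. The sketch below is only an illustration of that reading; `SessionLiveness` and `classifySession` are hypothetical names, and the two inputs stand in for a registry lookup and a sidecar existence check.

```typescript
// Hypothetical names: SessionLiveness and classifySession are illustrative only.
type SessionLiveness = "running" | "not-running" | "orphaned" | "unknown";

function classifySession(inRegistry: boolean, sidecarExists: boolean): SessionLiveness {
  if (inRegistry && sidecarExists) return "running";
  // Crash mid-session and "after restart, before recovery" look identical here.
  if (!inRegistry && sidecarExists) return "orphaned";
  if (!inRegistry && !sidecarExists) return "not-running";
  // Registered without a sidecar is not a row in the table.
  return "unknown";
}

const states = [
  classifySession(true, true),
  classifySession(false, false),
  classifySession(false, true),
  classifySession(true, false),
];
```

Note that the two inputs alone cannot separate "completed normally" from "never started"; both map to `not-running` in this sketch.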
|
|
138
|
+
|
|
139
|
+
### Gaps and ambiguities
|
|
140
|
+
|
|
141
|
+
**Gap 1: No single owner for the session liveness question.**
|
|
142
|
+
There is no function, class, or module that can correctly answer "is session X currently running?" across all process states. `DaemonRegistry` answers it for the in-process case; sidecar files answer it for the post-crash case; but there is no unified interface. `ConsoleService` has to handle both separately.
|
|
143
|
+
|
|
144
|
+
**Gap 2: Worktree cleanup has two owners with different trigger conditions.**
|
|
145
|
+
The 24-hour threshold in `runStartupRecovery()` and the post-delivery immediate cleanup in `maybeRunDelivery()` are not coordinated. If a session's sidecar is orphaned but its worktree is less than 24 hours old, `runStartupRecovery()` will delete the sidecar but leave the worktree. The worktree becomes a resource leak that is only cleaned on the next startup after 24 hours have elapsed.
|
|
146
|
+
|
|
147
|
+
Confirmed path to this state:
|
|
148
|
+
1. `runWorkflow()` creates worktree and sidecar.
|
|
149
|
+
2. Session completes, `runWorkflow()` deletes the sidecar.
|
|
150
|
+
3. `maybeRunDelivery()` tries to remove the worktree but fails (disk full, git lock, etc.) -- error is logged and swallowed.
|
|
151
|
+
4. Worktree is now orphaned with no sidecar pointing to it. `runStartupRecovery()` can never find it because it searches sidecars, not the worktrees directory.
|
|
152
|
+
|
|
153
|
+
This is a real orphan leak path that the current recovery mechanism cannot reach.
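One way to reach such worktrees would be to compare the worktrees directory against the paths that sidecars reference, instead of starting from sidecars. The following is a sketch under assumptions: `SidecarRecord` and `findUnreferencedWorktrees` are hypothetical names, and the directory listing is passed in as plain strings so the example stays free of file-system calls.

```typescript
// Hypothetical shape: the audit only states that sidecars gain a worktreePath.
interface SidecarRecord {
  readonly sessionId: string;
  readonly worktreePath?: string;
}

// Returns worktree directories that no sidecar references.
function findUnreferencedWorktrees(
  worktreeDirs: readonly string[],
  sidecars: readonly SidecarRecord[],
): string[] {
  const referenced = new Set<string>();
  for (const sidecar of sidecars) {
    if (sidecar.worktreePath !== undefined) {
      referenced.add(sidecar.worktreePath);
    }
  }
  return worktreeDirs.filter((dir) => !referenced.has(dir));
}

const unreferenced = findUnreferencedWorktrees(
  ["/worktrees/sess_a", "/worktrees/sess_b"],
  [{ sessionId: "sess_a", worktreePath: "/worktrees/sess_a" }],
);
```

A real scan would probably still need an age threshold, since a worktree can exist briefly before the sidecar is updated with its path.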
|
|
154
|
+
|
|
155
|
+
**Gap 3: `_preAllocatedStartResponse` leaks infrastructure state into the public trigger type.**
|
|
156
|
+
`spawnSession` in `trigger-listener.ts` constructs `WorkflowTrigger` directly with this field. Any future caller that constructs `WorkflowTrigger` without understanding the `_preAllocatedStartResponse` invariant can accidentally call `executeStartWorkflow()` twice for the same session (the MUST NOT invariant documented at line 356).
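A possible way to make this invariant harder to violate is to encode "already started" versus "needs starting" as a tagged union rather than an optional underscore field. This is a sketch only: `StartMode`, `StartResponse`, `resolveStart`, and the stub that stands in for `executeStartWorkflow()` are all hypothetical.

```typescript
// Hypothetical types: StartMode and StartResponse are illustrative names.
interface StartResponse {
  readonly sessionId: string;
}

type StartMode =
  | { readonly kind: "fresh" }
  | { readonly kind: "preAllocated"; readonly response: StartResponse };

let startCalls = 0;

// Stand-in for executeStartWorkflow(); it counts invocations so the
// example can check that a pre-allocated session is not started again.
function startWorkflowStub(): StartResponse {
  startCalls += 1;
  return { sessionId: `sess_${startCalls}` };
}

function resolveStart(mode: StartMode): StartResponse {
  switch (mode.kind) {
    case "fresh":
      return startWorkflowStub();
    case "preAllocated":
      // The session already exists; starting again would create a duplicate.
      return mode.response;
  }
}

const reused = resolveStart({ kind: "preAllocated", response: { sessionId: "sess_existing" } });
const created = resolveStart({ kind: "fresh" });
```

With this shape every caller has to choose a `kind` explicitly, so the decision to skip the start call is visible at the construction site.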
|
|
157
|
+
|
|
158
|
+
**Gap 4: `parentSessionId` is a documented dead field.**
|
|
159
|
+
The comments on lines 379-383 say "This field is not read by runWorkflow() directly" and "exists for documentation purposes and potential future use." A field on an input type that is not read by the function it is passed to is not providing value and creates confusion about which fields are operative.
|
|
160
|
+
|
|
161
|
+
**Gap 5: `spawnDepth` has no compile-time enforcement.**
|
|
162
|
+
`spawnDepth` is an optional numeric field. Callers that construct `WorkflowTrigger` (e.g. `spawnSession` in `trigger-listener.ts:467`) must correctly set `spawnDepth` or the `maxSubagentDepth` enforcement in `runWorkflow()` will silently default to depth 0 for all sessions, allowing unlimited recursion depth regardless of the configured limit.
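Compile-time enforcement could look roughly like the sketch below, where child sessions must state their depth. The type names and the comparison `depth < maxSubagentDepth` are assumptions for illustration; the audit does not specify how `runWorkflow()` compares the two values.

```typescript
// Hypothetical types: these names are illustrative, not the current interfaces.
interface RootSessionContext {
  readonly kind: "root";
}

interface ChildSessionContext {
  readonly kind: "child";
  readonly parentSessionId: string;
  readonly spawnDepth: number; // required for children, so omitting it is a compile error
}

type SpawnContext = RootSessionContext | ChildSessionContext;

function effectiveDepth(ctx: SpawnContext): number {
  return ctx.kind === "root" ? 0 : ctx.spawnDepth;
}

// Assumed semantics: a session may spawn a child only while its depth is below the limit.
function canSpawnChild(ctx: SpawnContext, maxSubagentDepth: number): boolean {
  return effectiveDepth(ctx) < maxSubagentDepth;
}

const root: SpawnContext = { kind: "root" };
const grandchild: SpawnContext = { kind: "child", parentSessionId: "sess_parent", spawnDepth: 2 };
const rootMaySpawn = canSpawnChild(root, 2);
const grandchildMaySpawn = canSpawnChild(grandchild, 2);
```

Depth 0 then has to be chosen deliberately with `kind: "root"` instead of arising from a missing field.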
|
|
163
|
+
|
|
164
|
+
---
|
|
165
|
+
|
|
166
|
+
## Summary of findings
|
|
167
|
+
|
|
168
|
+
| Finding | Severity | Concrete risk |
|---|---|---|
| WorkflowTrigger mixes three concern categories in one flat struct | Medium | Type leaks, confusion about which fields are operative, MUST NOT invariant on `_preAllocatedStartResponse` easily violated |
| Worktree cleanup has two owners; the sidecar-less orphan case is unreachable by recovery | Medium | Disk space leak when delivery cleanup fails; orphaned worktrees accumulate indefinitely |
| DaemonRegistry is not authoritative; two mechanisms answer "is session alive?" | Low-Medium | No concrete bug today, but any code that reads DaemonRegistry as a complete truth (e.g. a future health check) will be wrong after a crash |
| `_preAllocatedStartResponse` is an internal field on a public type, constructed by the composition root | Medium | Double-session-creation risk if the invariant is violated by a new caller |
| `parentSessionId` is a dead field on the input type | Low | Confusion only; no operational risk |
| `spawnDepth` has no compile-time enforcement at construction sites | Low-Medium | Silent depth=0 default for coordinator-spawned sessions if the field is omitted |
|
|
@@ -78,9 +78,9 @@ Nearby contracts that must stay consistent:
|
|
|
78
78
|
|
|
79
79
|
**Routing table (exhaustive):**
|
|
80
80
|
```
|
|
81
|
-
maturity=idea | rough => wr.discovery -> wr.shaping -> coding-task
|
|
82
|
-
maturity=specced => wr.discovery -> coding-task
|
|
83
|
-
maturity=ready => coding-task
|
|
81
|
+
maturity=idea | rough => wr.discovery -> wr.shaping -> wr.coding-task
|
|
82
|
+
maturity=specced => wr.discovery -> wr.coding-task
|
|
83
|
+
maturity=ready => wr.coding-task (Phase 0.5 searches for upstream spec at runtime)
|
|
84
84
|
```
|
|
85
85
|
|
|
86
86
|
Type refines within the coding workflow (bug => skip hypothesis, chore => skip design phases) but does not change pipeline selection.
|
|
@@ -121,7 +121,7 @@ upstream_spec: https://docs.example.com/pitch-feature-x
|
|
|
121
121
|
affected_files: src/foo.ts src/bar.ts
|
|
122
122
|
```
|
|
123
123
|
|
|
124
|
-
**Routing mechanism:** Identical to A -- coordinator uses labels only. Body section is not read at routing time. Downstream workflows (coding-task
|
|
124
|
+
**Routing mechanism:** Identical to A -- coordinator uses labels only. Body section is not read at routing time. Downstream workflows (wr.coding-task Phase 0.5, wr.shaping) call GitHub API to fetch body and parse the `## WorkTrain` section when they need enrichment.
|
|
125
125
|
|
|
126
126
|
**Tensions resolved:** T1, T2, T3 (two-layer schema: labels for routing, body for enrichment), T4
|
|
127
127
|
**Tensions accepted:** None materially -- the body section is optional and gracefully absent
|
|
@@ -192,7 +192,7 @@ affected_files:
|
|
|
192
192
|
**Summary:** Instead of a coordinator that routes, create three triggers with different `labelFilter` values -- one per pipeline path. Routing is implicit in which trigger fires.
|
|
193
193
|
|
|
194
194
|
**Proposed triggers:**
|
|
195
|
-
- `worktrain-coding` trigger: labelFilter=`worktrain:maturity:ready` => `coding-task
|
|
195
|
+
- `worktrain-coding` trigger: labelFilter=`worktrain:maturity:ready` => `wr.coding-task`
|
|
196
196
|
- `worktrain-discovery` trigger: labelFilter=`worktrain:maturity:specced` => `wr.discovery`
|
|
197
197
|
- `worktrain-full` trigger: labelFilter=`worktrain:maturity:idea` => `wr.discovery` (with full-pipeline flag)
|
|
198
198
|
|
|
@@ -54,7 +54,7 @@ The Three-Workflow Pipeline and taskMaturity spectrum are already partially defi
|
|
|
54
54
|
**Routing signals defined in backlog (not yet implemented):**
|
|
55
55
|
- `taskMaturity`: idea / rough / specced / ready / code-complete (backlog Apr 15)
|
|
56
56
|
- `existingArtifacts`: brd / designs / arch-decision / acceptance-criteria / ticket / implementation
|
|
57
|
-
- Three-Workflow Pipeline: `wr.discovery` (optional) -> `wr.shaping` (optional) -> `coding-task
|
|
57
|
+
- Three-Workflow Pipeline: `wr.discovery` (optional) -> `wr.shaping` (optional) -> `wr.coding-task` (Apr 18)
|
|
58
58
|
|
|
59
59
|
**TriggerDefinition routing fields:** `workflowId` is static per trigger. To dispatch different workflows from the same trigger, the routing must happen either in a coordinator script (reads issue, decides workflowId) or via multiple triggers with different `labelFilter` values.
|
|
60
60
|
|
|
@@ -395,9 +395,9 @@ The `## WorkTrain` section header is frozen after v1. `upstream_spec` must be a
|
|
|
395
395
|
|
|
396
396
|
| Maturity label | Pipeline |
|
|
397
397
|
|----------------|----------|
|
|
398
|
-
| `worktrain:idea` | `wr.discovery` -> `wr.shaping` -> `coding-task
|
|
399
|
-
| `worktrain:specced` | `wr.shaping` -> `coding-task
|
|
400
|
-
| `worktrain:ready` | `coding-task
|
|
398
|
+
| `worktrain:idea` | `wr.discovery` -> `wr.shaping` -> `wr.coding-task` |
|
|
399
|
+
| `worktrain:specced` | `wr.shaping` -> `wr.coding-task` |
|
|
400
|
+
| `worktrain:ready` | `wr.coding-task` (Phase 0.5 searches for upstream spec at runtime) |
|
|
401
401
|
| (no maturity label) | Add `worktrain:needs-labels`, skip dispatch, emit structured log entry |
|
|
402
402
|
|
|
403
403
|
**Multi-label conflict rule:** if multiple maturity labels are present (e.g. both `worktrain:idea` and `worktrain:ready`), the lowest maturity wins (idea > specced > ready). Coordinator logs a warning.
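The conflict rule can be sketched as a lookup over an ordered precedence list. The label strings are taken from the table above; `MATURITY_PRECEDENCE` and `resolveMaturity` are hypothetical names, and the warning log and the `worktrain:needs-labels` handling are left out.

```typescript
// Label names come from the maturity table; the constant and function names are hypothetical.
const MATURITY_PRECEDENCE = ["worktrain:idea", "worktrain:specced", "worktrain:ready"] as const;
type MaturityLabel = (typeof MATURITY_PRECEDENCE)[number];

// Returns the winning maturity label, or undefined when none is present.
function resolveMaturity(labels: readonly string[]): MaturityLabel | undefined {
  // find() walks the precedence list in order, so the lowest maturity is matched first.
  return MATURITY_PRECEDENCE.find((candidate) => labels.includes(candidate));
}

const conflicting = resolveMaturity(["bug", "worktrain:ready", "worktrain:idea"]);
const unlabeled = resolveMaturity(["bug"]);
```

An `undefined` result corresponds to the "(no maturity label)" row, where the coordinator skips dispatch.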
|
|
@@ -159,4 +159,4 @@ The scope is correct: 3 new files, clear boundaries, no speculative abstractions
|
|
|
159
159
|
|
|
160
160
|
3. Should the coordinator write a full report file (`coordinator-pr-review-YYYY-MM-DD.md`)? Yes, per UX spec. This is a simple file write via CoordinatorDeps.
|
|
161
161
|
|
|
162
|
-
4. Is `coding-task
|
|
162
|
+
4. Is `wr.coding-task` the correct fix-agent workflow? Yes -- it handles "implement/fix" tasks. The goal string `Fix review findings in PR #N: [finding summaries]` is the goal format.
|
|
@@ -59,7 +59,7 @@ Rationale: The goal was a solution-statement. The risk is designing the wrong in
|
|
|
59
59
|
**What spawn/await enable today:**
|
|
60
60
|
`worktrain spawn` prints a session handle to stdout. `worktrain await` blocks until sessions complete and prints structured JSON results. These two commands are the building blocks for a coordinator script.
|
|
61
61
|
|
|
62
|
-
**The MR review workflow (mr-review
|
|
62
|
+
**The MR review workflow (wr.mr-review):**
|
|
63
63
|
- 8+ phases: locate/bound/classify, hypothesis, freeze fact packet, reviewer family bundle (parallel), contradiction resolution, adversarial validation, final recommendation, handoff
|
|
64
64
|
- Produces: severity-graded findings (Critical/Major/Minor/Nit), recommendation (approve/request changes/needs discussion), confidence band, coverage ledger, ready-to-post MR comments
|
|
65
65
|
- Output lives in `notesMarkdown` and context variables -- no automatic post to GitHub
|
|
@@ -154,7 +154,7 @@ Not every PR review ends cleanly. The coordinator must surface decision points w
|
|
|
154
154
|
```
|
|
155
155
|
$ worktrain review 47
|
|
156
156
|
Reviewing PR #47: "feat(engine): add OAuth refresh token rotation"
|
|
157
|
-
Session started: sess_4cd0b579 (mr-review
|
|
157
|
+
Session started: sess_4cd0b579 (wr.mr-review)
|
|
158
158
|
Run 'worktrain logs --follow --session sess_4cd0b579' to watch progress.
|
|
159
159
|
Run 'worktrain inbox' when done.
|
|
160
160
|
^ Takes ~20-30 min. You can do other things.
|
|
@@ -411,7 +411,7 @@ worktrain run pr-review 47
|
|
|
411
411
|
```
|
|
412
412
|
PR #47: feat(engine): add OAuth refresh token rotation (3 files, +142/-18)
|
|
413
413
|
Base: main merge-base: abc1234
|
|
414
|
-
Session: sess_4cd0b579 (mr-review
|
|
414
|
+
Session: sess_4cd0b579 (wr.mr-review)
|
|
415
415
|
|
|
416
416
|
Watch raw events: worktrain logs --follow --session sess_4cd0b579
|
|
417
417
|
|
|
@@ -51,7 +51,7 @@ Pass 3: passCount=2 -> review: 'minor' -> passCount becomes 3 -> CHECK: 3 >= 3 -
|
|
|
51
51
|
**Trace:**
|
|
52
52
|
```
|
|
53
53
|
discoverPort() -> no lock files -> falls back to 3456
|
|
54
|
-
spawnSession('mr-review
|
|
54
|
+
spawnSession('wr.mr-review', 'Review PR #419...', '/workspace')
|
|
55
55
|
POST http://127.0.0.1:3456/api/v2/auto/dispatch
|
|
56
56
|
-> fetch throws Error: ECONNREFUSED 127.0.0.1:3456
|
|
57
57
|
-> spawnSession catches -> returns err('Could not connect to WorkTrain daemon on port 3456')
|
|
@@ -0,0 +1,326 @@
|
|
|
1
|
+
# Discovery: Workflow Modernization Backlog Audit
|
|
2
|
+
|
|
3
|
+
*Session artifact -- human-readable summary. Durable truth lives in session notes and context.*
|
|
4
|
+
|
|
5
|
+
**Artifact strategy:** This document is for human readers only. It does not need to be complete or current for the session to function correctly. If a chat rewind occurs, the session's `notesMarkdown` and context variables are the authoritative record -- this file may be stale or absent. Do not treat this doc as a checkpoint or source of truth for workflow execution.
|
|
6
|
+
|
|
7
|
+
**Capability summary:**
|
|
8
|
+
- Web browsing: AVAILABLE (curl probe confirmed reachable) -- but NOT needed; all required information is local
|
|
9
|
+
- Delegation (spawn_agent): AVAILABLE (tool present in session, spawn depth < 3) -- deferred to candidate-generation phase where parallel cognitive lenses add value; not useful for local file reading
|
|
10
|
+
- GitHub CLI: AVAILABLE (`gh` confirmed working)
|
|
11
|
+
- Local session store: AVAILABLE and queried (`~/.workrail/data/sessions` yielded real usage counts)
|
|
12
|
+
|
|
13
|
+
---
|
|
14
|
+
|
|
15
|
+
## Context / Ask
|
|
16
|
+
|
|
17
|
+
**Stated goal (original):** Modernize `workflows/exploration-workflow.json` to current v2/lean authoring patterns. This is the highest-priority candidate among the unmodernized workflows.
|
|
18
|
+
|
|
19
|
+
**Stated goal classification:** `solution_statement` -- names a specific file and solution, not an outcome.
|
|
20
|
+
|
|
21
|
+
**Reframed problem:** Some bundled workflows still in use provide lower-quality agent guidance than the current authoring standard allows, and the planning documents tracking this work reference files that no longer exist -- making it impossible to execute the stated backlog without first auditing what actually exists and what is worth keeping.
|
|
22
|
+
|
|
23
|
+
---
|
|
24
|
+
|
|
25
|
+
## Critical Finding: Planning Docs Are Stale
|
|
26
|
+
|
|
27
|
+
**`exploration-workflow.json` does not exist.** It was:
|
|
28
|
+
1. Modernized in PR #158 (March 27, 2026, commit `edbc3161`)
|
|
29
|
+
2. Deleted and consolidated into `workflows/wr.discovery.json` (commit `a582e1a1`)
|
|
30
|
+
|
|
31
|
+
All 10 filenames in the primary modernization list are missing from the repo. The planning documents (`now-next-later.md`, `open-work-inventory.md`, `tickets/next-up.md`) all reference stale filenames.
|
|
32
|
+
|
|
33
|
+
**The stated backlog cannot be executed as written.** A real audit of existing files against actual usage is the prerequisite.
|
|
34
|
+
|
|
35
|
+
---
|
|
36
|
+
|
|
37
|
+
## Path Recommendation
|
|
38
|
+
|
|
39
|
+
**Path: `landscape_first`**
|
|
40
|
+
|
|
41
|
+
The dominant need is understanding the current state -- what workflows exist, which are actively used, what modernization gaps each actually has. The goal was stated as a solution (file + technique), but the underlying problem is best served by first establishing a correct, usage-ranked picture of the real modernization backlog.
|
|
42
|
+
|
|
43
|
+
**Why not `design_first`?** The modernization technique itself is not in question -- the authoring guide is clear, the spec is versioned, the target state is known. The risk is not solving the wrong problem; it is executing against a stale and incorrect task list.
|
|
44
|
+
|
|
45
|
+
**Why not `full_spectrum`?** This is not an ambiguous problem requiring deep reframing. It is an operational audit problem: match current reality to the planning system, then execute in priority order.
|
|
46
|
+
|
|
47
|
+
---
|
|
48
|
+
|
|
49
|
+
## Constraints / Anti-goals
|
|
50
|
+
|
|
51
|
+
**Constraints:**
|
|
52
|
+
- Do not modify protected files (`src/daemon/`, `src/trigger/`, `src/v2/`, `triggers.yml`)
|
|
53
|
+
- All changes must pass `npx vitest run`
|
|
54
|
+
- Never modernize in a way that breaks existing session behavior (no step ID renames on active workflows)
|
|
55
|
+
|
|
56
|
+
**Anti-goals:**
|
|
57
|
+
- Do not mechanically rename files to `.v2` or `.lean` -- that is cargo-cult modernization
|
|
58
|
+
- Do not add `metaGuidance` fields just to clear a checklist item if the content is empty or generic
|
|
59
|
+
- Do not modernize workflows that are effectively retired (near-zero usage and no unique value)
|
|
60
|
+
- Do not make planning docs momentarily accurate only for them to go stale again -- the audit output should drive concrete, executable ticket updates
|
|
61
|
+
|
|
62
|
+
---
|
|
63
|
+
|
|
64
|
+
## Landscape Packet
|
|
65
|
+
|
|
66
|
+
### Usage-ranked workflow inventory (actual, as of this session)
|
|
67
|
+
|
|
68
|
+
| Usage | Workflow ID | File | Spec Stamp | Gaps |
|
|
69
|
+
|------:|-------------|------|-----------|------|
|
|
70
|
+
| 910 | wr.coding-task | coding-task-workflow-agentic.json | unstamped | features, stamp |
|
|
71
|
+
| 606 | wr.mr-review | mr-review-workflow.agentic.v2.json | unstamped | stamp only |
|
|
72
|
+
| 461 | wr.discovery | wr.discovery.json | v3 (current) | MODERN |
|
|
73
|
+
| 125 | wr.production-readiness-audit | production-readiness-audit.json | unstamped | stamp only |
|
|
74
|
+
| 74 | wr.bug-investigation | bug-investigation.agentic.v2.json | unstamped | features, stamp |
|
|
75
|
+
| 64 | wr.architecture-scalability-audit | architecture-scalability-audit.json | unstamped | stamp only |
|
|
76
|
+
| 51 | wr.ui-ux-design | ui-ux-design-workflow.json | v3 (current) | MODERN |
|
|
77
|
+
| 39 | wr.diagnose-environment | workflow-diagnose-environment.json | unstamped | meta, recs, features, stamp |
|
|
78
|
+
| 39 | wr.shaping | wr.shaping.json | unstamped | features, stamp |
|
|
79
|
+
| 33 | wr.workflow-for-workflows | workflow-for-workflows.json | v3 (current) | MODERN |
|
|
80
|
+
| 7 | wr.documentation-update | documentation-update-workflow.json | unstamped | recs, features, stamp |
|
|
81
|
+
| 5 | wr.document-creation | document-creation-workflow.json | unstamped | recs, features, stamp |
|
|
82
|
+
| 4 | wr.scoped-documentation | scoped-documentation-workflow.json | unstamped | recs, features, stamp |
|
|
83
|
+
| 4 | wr.cross-platform-code-conversion | cross-platform-code-conversion.v2.json | unstamped | features, stamp |
|
|
84
|
+
| 3 | wr.adaptive-ticket-creation | adaptive-ticket-creation.json | unstamped | recs, features, stamp |
|
|
85
|
+
| 2 | wr.relocation-us | relocation-workflow-us.json | unstamped | features, stamp |
|
|
86
|
+
| 0 | wr.intelligent-test-case-generation | intelligent-test-case-generation.json | unstamped | recs, features, stamp |
|
|
87
|
+
| 0 | wr.personal-learning-course-design | learner-centered-course-workflow.json | unstamped | recs, features, stamp |
|
|
88
|
+
| 0 | wr.personal-learning-materials | personal-learning-materials-creation-branched.json | unstamped | recs, features, stamp |
|
|
89
|
+
| 0 | wr.presentation-creation | presentation-creation.json | unstamped | recs, features, stamp |
|
|
90
|
+
| 0 | wr.classify-task | classify-task-workflow.json | v3 (current) | MODERN (but 0 usage) |
|
|
91
|
+
|
|
92
|
+
### Modernization depth classification
|
|
93
|
+
|
|
94
|
+
| Level | Description | Count | Workflows |
|
|
95
|
+
|-------|-------------|-------|-----------|
|
|
96
|
+
| L0 | MODERN -- no action needed | 4 | wr.discovery, wr.ui-ux-design, wr.workflow-for-workflows, wr.classify-task |
|
|
97
|
+
| L1 | Stamp only | 2 | wr.production-readiness-audit, wr.architecture-scalability-audit |
|
|
98
|
+
| L2 | Stamp + features declaration | 0 | (none -- all that need features also need raw->promptBlocks migration) |
|
|
99
|
+
| L3 | Raw prompt -> promptBlocks migration + features | 6 | wr.coding-task(910), wr.mr-review(606), wr.bug-investigation(74), wr.shaping(39), wr.cross-platform-code-conversion(4), wr.relocation-us(2) |
|
|
100
|
+
| L4 | Missing basics (no recs, raw prompts, no features) | 9 | wr.documentation-update(7), wr.document-creation(5), wr.scoped-documentation(4), wr.adaptive-ticket-creation(3), wr.intelligent-test-case-generation(0), wr.personal-learning-course-design(0), wr.personal-learning-materials(0), wr.presentation-creation(0), wr.diagnose-environment(39) |
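
The level assignment above can be read as a small decision rule. The sketch below is one possible reading, not the audit's actual procedure: the `WorkflowAudit` shape, the `classify` name, and the ordering of the checks are assumptions chosen so that the result agrees with the table (for example, `wr.mr-review` lands in L3 because it still has raw prompt steps, even though its inventory gap reads "stamp only").

```typescript
// Hypothetical shapes for illustration; not part of the WorkRail codebase.
type Gap = "meta" | "recs" | "features" | "stamp";
type Level = "L0" | "L1" | "L2" | "L3" | "L4";

interface WorkflowAudit {
  id: string;
  gaps: Gap[];            // gaps as listed in the inventory table
  rawPromptSteps: number; // number of steps still using the raw `prompt` field
}

// Assumed ordering: missing basics dominate, then raw prompts, then features.
function classify(w: WorkflowAudit): Level {
  if (w.gaps.length === 0) return "L0";                   // MODERN
  if (w.gaps.indexOf("recs") !== -1) return "L4";         // missing basics
  if (w.rawPromptSteps > 0) return "L3";                  // needs promptBlocks migration
  if (w.gaps.indexOf("features") !== -1) return "L2";     // stamp + features declaration
  return "L1";                                            // stamp only
}

const mrReview: WorkflowAudit = { id: "wr.mr-review", gaps: ["stamp"], rawPromptSteps: 3 };
const codingTask: WorkflowAudit = { id: "wr.coding-task", gaps: ["features", "stamp"], rawPromptSteps: 9 };
console.log(classify(mrReview), classify(codingTask)); // prints "L3 L3"
```

Under this reading L2 stays empty for the reason the table gives: every workflow that needs a features declaration also has raw prompt steps, so it falls into L3 first.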
|
|
101
|
+
|
|
102
|
+
### Critical architectural finding: features only work with promptBlocks
|
|
103
|
+
|
|
104
|
+
**Features injection (`wr.features.capabilities`, `wr.features.subagent_guidance`) only applies to steps using `promptBlocks`.** Raw `prompt` field steps are explicitly skipped by the compiler (`src/application/services/compiler/resolve-features.ts`, line 62: `if (!step.promptBlocks || features.length === 0) return step;`).
|
|
105
|
+
|
|
106
|
+
This means:
|
|
107
|
+
- `wr.coding-task` (910 uses, ALL 9 steps are raw prompt) -- declaring `features` in the header has **zero runtime effect** until steps are migrated to `promptBlocks`
|
|
108
|
+
- `wr.bug-investigation` (74 uses, ALL 8 steps are raw prompt) -- same
|
|
109
|
+
- `wr.shaping` (39 uses, ALL 8 steps are raw prompt) -- same
|
|
110
|
+
- `wr.mr-review` (606 uses, 3 of 6 steps are raw prompt) -- features only inject into the 3 promptBlocks steps
|
|
111
|
+
|
|
112
|
+
**Raw prompt → promptBlocks migration is a prerequisite for features to matter.** This is a deeper migration than just adding a features declaration.
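
To make the consequence of that guard concrete, here is a minimal sketch of the skip behavior. Only the guard condition is taken from the quoted line of `resolve-features.ts`; the `Step` shape, the `constraints` array, and the injected text are simplified assumptions and do not reflect the real compiler types.

```typescript
// Hypothetical, simplified step shape; the real compiler types may differ.
interface Step {
  id: string;
  prompt?: string;                          // raw prompt step
  promptBlocks?: { constraints: string[] }; // structured step (shape assumed)
}

// Mirrors the quoted guard: a step without promptBlocks is returned unchanged.
function resolveFeaturesSketch(step: Step, features: string[]): Step {
  if (!step.promptBlocks || features.length === 0) return step;
  return {
    ...step,
    promptBlocks: {
      constraints: step.promptBlocks.constraints.concat(
        features.map((f) => "[" + f + "] injected constraint")
      ),
    },
  };
}

const declared = ["wr.features.capabilities"];
const rawStep: Step = { id: "step-raw", prompt: "Implement the change." };
const blockStep: Step = { id: "step-blocks", promptBlocks: { constraints: ["Keep notes durable."] } };

console.log(resolveFeaturesSketch(rawStep, declared) === rawStep);                         // prints true
console.log(resolveFeaturesSketch(blockStep, declared).promptBlocks?.constraints.length); // prints 2
```

The raw step comes back as the same object, which is why a features declaration on an all-raw workflow changes nothing at runtime.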
|
|
113
|
+
|
|
114
|
+
### Modern workflows (no modernization needed)
|
|
115
|
+
- `wr.discovery` (v3.2.0, stamped, full features, all promptBlocks)
|
|
116
|
+
- `wr.ui-ux-design` (v3 stamped, features declared, all promptBlocks)
|
|
117
|
+
- `wr.workflow-for-workflows` (v3 stamped, references, features, all promptBlocks)
|
|
118
|
+
- `wr.classify-task` (v3 stamped -- but 0 usage, may need visibility review)
|
|
119
|
+
|
|
120
|
+
### Contradictions in prior understanding
|
|
121
|
+
|
|
122
|
+
**[C1] HIGH -- Planning docs name 10 files, all missing**
|
|
123
|
+
None of the 10 filenames in the primary modernization list (as referenced by `now-next-later.md`, `open-work-inventory.md`, and `tickets/next-up.md`) exist on disk. They were renamed, merged, or deleted in March 2026 during a workflow consolidation sprint.
|
|
124
|
+
|
|
125
|
+
**[C2] HIGH -- Adding `wr.features` to `wr.coding-task` would have zero runtime effect**
|
|
126
|
+
`wr.coding-task` has 9 raw `prompt` steps and 0 `promptBlocks` steps. The compiler explicitly skips feature injection for raw prompt steps. The "missing features" gap reported for this workflow is not a simple header field addition -- it requires a full step structure migration first.
|
|
127
|
+
|
|
128
|
+
**[C3] MEDIUM -- `wr.mr-review` has a deeper gap than "stamp only"**
|
|
129
|
+
Earlier assessment said stamp was the only gap. Actually, 3 of 6 steps use raw prompt and would not receive feature injections. The stamp gap is real, but the structural gap is also real.
|
|
130
|
+
|
|
131
|
+
**[C4] MEDIUM -- `wr.coding-task` already has substantive delegation/parallelism guidance in metaGuidance**
|
|
132
|
+
`wr.coding-task` has 11 `metaGuidance` items including explicit ownership, delegation, and parallelism rules. The `wr.features.subagent_guidance` would inject similar content via promptBlocks, potentially duplicating what already exists in metaGuidance prose. This needs de-duplication thinking, not blind injection.
|
|
133
|
+
|
|
134
|
+
### Evidence gaps
|
|
135
|
+
|
|
136
|
+
- Usage telemetry uses all-time counts including old deleted workflow IDs (`exploration-workflow: 19`, `design-thinking-workflow: 42`). Recent counts would be more actionable.
|
|
137
|
+
- No before/after session quality comparison for any modernization -- whether promptBlocks migration produces measurably better sessions is assumed but unvalidated.
|
|
138
|
+
- Unknown: why `wr.coding-task` and `wr.bug-investigation` use raw prompts (intentional choice? legacy carryover? waiting for migration tooling?).
|
|
139
|
+
- Unknown: whether any zero-usage workflows are referenced in external team configs that would make them important to keep despite 0 local sessions.
|
|
140
|
+
|
|
141
|
+
### What "unstamped" actually means
|
|
142
|
+
17 of 21 workflows lack `validatedAgainstSpecVersion`. This is advisory-only (no CI failure), but signals they haven't been reviewed against spec v3. The `stamp-workflow` script resolves this after running `workflow-for-workflows` on them.
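
A rough way to reproduce that count, assuming the workflow headers have already been parsed from JSON: the sketch below lists every workflow whose `validatedAgainstSpecVersion` does not match the current spec version. The `WorkflowHeader` shape and the `listUnstamped` helper are hypothetical and are not the `stamp-workflow` script.

```typescript
// Hypothetical header shape; only `validatedAgainstSpecVersion` comes from the audit text.
interface WorkflowHeader {
  id: string;
  validatedAgainstSpecVersion?: number;
}

// Returns the IDs of workflows that are missing the stamp or stamped against another version.
function listUnstamped(workflows: WorkflowHeader[], currentSpecVersion: number): string[] {
  return workflows
    .filter((w) => w.validatedAgainstSpecVersion !== currentSpecVersion)
    .map((w) => w.id);
}

const sample: WorkflowHeader[] = [
  { id: "wr.discovery", validatedAgainstSpecVersion: 3 },
  { id: "wr.coding-task" },
  { id: "wr.mr-review" },
];
console.log(listUnstamped(sample, 3)); // prints [ 'wr.coding-task', 'wr.mr-review' ]
```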
|
|
143
|
+
|
|
144
|
+
### Key gap: `wr.features` adoption
|
|
145
|
+
Most workflows do not declare `wr.features.capabilities` or `wr.features.subagent_guidance`. These inject concrete constraints and procedure items into every `promptBlocks` step at compile time -- they are not cosmetic. High-usage workflows missing them: `wr.coding-task` (910 uses), `wr.bug-investigation` (74), `wr.shaping` (39).
|
|
146
|
+
|
|
147
|
+
### What the planning docs say vs. reality
|
|
148
|
+
The planning docs reference all of:
|
|
149
|
+
- `exploration-workflow.json` -- DELETED, replaced by wr.discovery
|
|
150
|
+
- `wr.adaptive-ticket-creation.json` -- exists but as `adaptive-ticket-creation.json`
|
|
151
|
+
- `mr-review-workflow.json` -- DELETED, v2 is `mr-review-workflow.agentic.v2.json`
|
|
152
|
+
- `mr-review-workflow.agentic.json` -- DELETED
|
|
153
|
+
- `bug-investigation.json` -- DELETED, v2 is `bug-investigation.agentic.v2.json`
|
|
154
|
+
- `bug-investigation.agentic.json` -- DELETED
|
|
155
|
+
- `design-thinking-workflow.json` -- DELETED, merged into wr.discovery
|
|
156
|
+
- `design-thinking-workflow-autonomous.agentic.json` -- DELETED
|
|
157
|
+
- `wr.documentation-update.json` -- exists as `documentation-update-workflow.json`
|
|
158
|
+
- `wr.document-creation.json` -- exists as `document-creation-workflow.json`
|
|
159
|
+
|
|
160
|
+
---
|
|
161
|
+
|
|
162
|
+
## Problem Frame Packet
|
|
163
|
+
|
|
164
|
+
### Users / Stakeholders
|
|
165
|
+
|
|
166
|
+
1. **Daemon sessions as consumers** (primary): autonomous agents running inside `wr.coding-task`, `wr.mr-review`, `wr.bug-investigation` etc. receive the workflow step prompts. These are the actual beneficiaries of better guidance -- 910 runs of wr.coding-task alone.
|
|
167
|
+
2. **Workflow authors** (secondary): developers writing and maintaining workflow JSON. They need planning docs that reflect reality, and a modernization approach that doesn't require heroic manual effort.
|
|
168
|
+
3. **Project owner** (decision authority): decides what merges, what gets prioritized, and what gets retired. Needs a clear prioritized plan, not an open-ended backlog.
|
|
169
|
+
|
|
170
|
+
### Jobs, Goals, Outcomes
|
|
171
|
+
|
|
172
|
+
- Agents should receive step prompts that reinforce behavioral constraints (durability, delegation, synthesis ownership) without relying on the agent to have read and retained `metaGuidance` at inspection time.
|
|
173
|
+
- Workflow authors should be able to trust that planning docs reflect actual file paths and current priorities.
|
|
174
|
+
- The CI staleness advisory should shrink -- it currently fires for 17/21 workflows, creating noise.
|
|
175
|
+
|
|
176
|
+
### Pains / Tensions
|
|
177
|
+
|
|
178
|
+
- **Pain 1 (high):** `metaGuidance` is inspection-time-only -- it shows when an agent calls `inspect_workflow` but does NOT inject into per-step prompts. For high-usage workflows like `wr.coding-task`, behavioral constraints only appear at inspection time, not at each step where they are most needed. Feature injection via `promptBlocks` is the per-step mechanism, but it requires step migration first.
|
|
179
|
+
- **Pain 2 (high):** Planning docs are stale -- all 10 filenames in the primary modernization list don't exist. Backlog cannot be executed as written.
|
|
180
|
+
- **Pain 3 (medium):** The modernization approach (add features field) has zero runtime effect on workflows still using raw `prompt` steps. The real fix (raw→promptBlocks migration) is 10x the effort of adding a header field.
|
|
181
|
+
- **Tension 1:** Modernizing `wr.coding-task` (910 uses, 14 steps, all raw prompt) is the highest-impact action, but also the riskiest -- changing the step structure of a heavily used workflow could break in-flight sessions or degrade guidance quality if done carelessly.
|
|
182
|
+
- **Tension 2:** Stamp-only passes are fast and reduce advisory noise, but they falsely signal "reviewed against spec v3" when the content has not actually changed.
|
|
183
|
+
|
|
184
|
+
### Constraints That Matter in Lived Use
|
|
185
|
+
|
|
186
|
+
- `wr.features` injection is compile-time -- changing a workflow file is immediately effective for NEW sessions; in-flight sessions use the pinned version
|
|
187
|
+
- Step IDs must not be renamed on widely-used workflows (session continuity depends on stable step IDs in active sessions)
|
|
188
|
+
- Any change must pass `npx vitest run` -- the bundled workflow smoke test covers all workflows automatically
|
|
189
|
+
|
|
190
|
+
### Success Criteria (observable and falsifiable)
|
|
191
|
+
|
|
192
|
+
1. `now-next-later.md`, `open-work-inventory.md`, `tickets/next-up.md` reference only filenames that exist in `workflows/` (checkable with `ls`)
|
|
193
|
+
2. `wr.production-readiness-audit` and `wr.architecture-scalability-audit` have `validatedAgainstSpecVersion: 3` after stamp
|
|
194
|
+
3. At least one high-usage L3 workflow (e.g. `wr.shaping`, 39 uses, 8 raw steps) has been fully migrated to `promptBlocks` and declares `wr.features.capabilities` -- can be verified by checking the JSON and running the compiler
|
|
195
|
+
4. `npx vitest run` passes after all changes
|
|
196
|
+
5. The staleness advisory list shrinks by at least 2 entries (the L1 stamps)
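
Criterion 1 can also be checked mechanically instead of by eye with `ls`. The sketch below is one possible approach under stated assumptions: `findStaleReferences` is a hypothetical helper, the filename pattern is a guess at what counts as a workflow reference, and the list of existing files is passed in by the caller (in practice it could come from reading the `workflows/` directory).

```typescript
// Hypothetical helper: returns each *.json filename mentioned in a planning doc
// that is not present in the given list of existing workflow files.
function findStaleReferences(docText: string, existingFiles: string[]): string[] {
  // Assumed pattern: letters, digits, dots, underscores, hyphens, ending in ".json".
  const mentioned = docText.match(/[A-Za-z0-9._-]+\.json/g) || [];
  const stale: string[] = [];
  for (const name of mentioned) {
    const missing = existingFiles.indexOf(name) === -1;
    const alreadyReported = stale.indexOf(name) !== -1;
    if (missing && !alreadyReported) stale.push(name);
  }
  return stale;
}

const planningDoc = "Modernize `workflows/exploration-workflow.json`, then `wr.discovery.json`.";
const onDisk = ["wr.discovery.json", "wr.shaping.json"];
console.log(findStaleReferences(planningDoc, onDisk)); // prints [ 'exploration-workflow.json' ]
```

An empty result for all three planning docs would satisfy criterion 1; any returned name is a reference that still needs correcting.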
|
|
197
|
+
|
|
198
|
+
### Assumptions Still Active
|
|
199
|
+
|
|
200
|
+
- **A1:** Migrating `wr.coding-task` raw prompts to `promptBlocks` is safe without breaking the existing step logic. (Unvalidated -- needs careful review of each step's prompt content vs. promptBlocks rendering)
|
|
201
|
+
- **A2:** `metaGuidance` at inspection time is insufficient for behavioral reinforcement in long sessions. (Structurally supported but empirically unvalidated)
|
|
202
|
+
- **A3:** Zero-usage workflows (presentation-creation, personal-learning-materials etc.) are safe to deprioritize indefinitely. (May be used by external teams not captured in local session store)
|
|
203
|
+
|
|
204
|
+
### What Would Make This Framing Wrong
|
|
205
|
+
|
|
206
|
+
- If agents reliably call `inspect_workflow` before each session and retain `metaGuidance` throughout, the per-step injection via promptBlocks may be redundant overhead. (Unlikely given how daemon sessions work -- they receive step prompts directly without prior inspection)
|
|
207
|
+
- If wr.coding-task raw prompts were authored this way intentionally (as a v1-style "trust the agent" design), migrating them would work against the author's intent. (Git history shows no explicit reasoning for the choice -- likely legacy carryover from before promptBlocks existed)
|
|
208
|
+
|
|
209
|
+
### HMW Reframes
|
|
210
|
+
|
|
211
|
+
1. **HMW ensure behavioral constraints reach agents at the moment they need them (per step), not just at inspection time?** -- This reframe surfaces promptBlocks migration as the core action, not just field additions.
|
|
212
|
+
2. **HMW reduce the gap between what planning docs say and what the repo actually contains?** -- This reframe surfaces that planning doc accuracy is itself a product quality issue, separate from workflow file quality.
|
|
213
|
+
|
|
214
|
+
### Framing Risk
|
|
215
|
+
|
|
216
|
+
The biggest framing risk: treating this as "modernization work" (cosmetic checklist) when the real problem is "per-step behavioral guidance is not reaching agents reliably." A stamp pass or features header addition would satisfy the planning doc metric while doing nothing for the agents that run these workflows 910 times.
|
|
217
|
+
|
|
218
|
+
---
|
|
219
|
+
|
|
220
|
+
## Opportunity Synthesis
|
|
221
|
+
|
|
222
|
+
Two problems to solve in sequence:
|
|
223
|
+
|
|
224
|
+
**Problem 1 (prerequisite): Planning system is stale.** None of the 10 files named in the modernization backlog exist under the listed names; each was renamed, merged, or deleted. No execution can proceed against the current plan.
|
|
225
|
+
|
|
226
|
+
**Problem 2 (the real work): High-usage workflows lack per-step behavioral guidance.** `metaGuidance` is inspection-time-only and not surfaced to daemon sessions. `wr.features` injection (the per-step path) requires `promptBlocks` steps, but `wr.coding-task` (910 runs), `wr.bug-investigation` (74 runs), and `wr.shaping` (39 runs) are entirely raw-prompt. Declaring features without migrating steps has zero runtime effect.
|
|
227
|
+
|
|
228
|
+
**Self-challenge results (5 challenges conducted):**
|
|
229
|
+
- Challenge 1: metaGuidance inspection-time claim confirmed by source code -- absent from start_workflow, continue_workflow, and daemon workflow-runner
|
|
230
|
+
- Challenge 2: features-requires-promptBlocks confirmed by `resolve-features.ts` line 62 -- intentional design decision
|
|
231
|
+
- Challenge 3: **FRAMING CORRECTION** -- wr.coding-task should NOT be the first L3 migration target. wr.shaping (39 uses, 8 steps) is the risk-appropriate starting point. Validate the migration pattern there before attempting the 910-run workflow.
|
|
232
|
+
- Challenge 4: planning doc staleness is a genuine prerequisite, not a side issue
|
|
233
|
+
- Challenge 5: stamp-only is honest for L1 workflows (already modern); dishonest for L4 without prior review
|
|
234
|
+
|
|
235
|
+
**Riskiest assumption:** That migrating wr.coding-task's 9 raw prompt steps to promptBlocks will preserve the semantic quality of the existing prompts without fragmenting or diluting the guidance. Mitigation: validate on wr.shaping first.
|
|
236
|
+
|
|
237
|
+
## Decision Criteria
|
|
238
|
+
|
|
239
|
+
A valid direction must satisfy all of the following:
|
|
240
|
+
1. Step IDs not renamed on active high-usage workflows (session continuity)
|
|
241
|
+
2. `npx vitest run` passes after each change
|
|
242
|
+
3. Planning docs reference only existing filenames after update
|
|
243
|
+
4. Modernization order follows usage-ranked priority, not arbitrary list order
|
|
244
|
+
5. Stamp follows genuine review rather than preceding it -- at minimum for L1 workflows
|
|
245
|
+
|
|
246
|
+
## Candidate Directions
|
|
247
|
+
|
|
248
|
+
**Direction A (Recommended): Fix planning docs + stamp L1 + migrate wr.shaping as first L3 target**
|
|
249
|
+
- Update planning docs to correct filenames and usage-ranked priority
|
|
250
|
+
- Review + stamp `wr.production-readiness-audit` and `wr.architecture-scalability-audit` (L1 -- already modern)
|
|
251
|
+
- Migrate `wr.shaping` (39 uses, 8 raw steps) to full promptBlocks + features as the risk-appropriate first L3 migration
|
|
252
|
+
- Defer wr.coding-task until wr.shaping validates the migration pattern
|
|
253
|
+
- Ship as one PR (planning doc changes + stamp + first L3 migration)
|
|
254
|
+
|
|
255
|
+
**Direction B: Fix planning docs only, output a usage-ranked ticket**
|
|
256
|
+
- Update the three planning docs to reflect real filenames and real usage order
|
|
257
|
+
- Create a GitHub issue with the correct prioritized modernization backlog
|
|
258
|
+
- No workflow file changes in this PR -- separate concerns, minimize blast radius
|
|
259
|
+
- Pro: cleanest separation; Con: defers the actual guidance quality fix
|
|
260
|
+
|
|
261
|
+
**Direction C: Stamp-only pass on L1 + planning doc fixes, defer all L3 migration**
|
|
262
|
+
- Fix planning docs
|
|
263
|
+
- Review + stamp wr.production-readiness-audit and wr.architecture-scalability-audit
|
|
264
|
+
- Explicitly defer all raw→promptBlocks migration to a future PR
|
|
265
|
+
- Pro: very low risk; Con: doesn't address the per-step guidance gap at all
|
|
266
|
+
|
|
267
|
+
**Direction D: Fix planning docs + full L3 migration of wr.shaping + wr.coding-task in one PR**
|
|
268
|
+
- Aggressive scope: modernize two L3 workflows in a single PR
|
|
269
|
+
- High impact but also high review burden; wr.coding-task is 14 steps
|
|
270
|
+
- Pro: maximum value delivered; Con: risky, large diff, hard to review carefully
|
|
271
|
+
|
|
272
|
+
---
|
|
273
|
+
|
|
274
|
+
## Candidate Generation Requirements (landscape_first path)
|
|
275
|
+
|
|
276
|
+
The following expectations apply to the next candidate generation pass. They are recorded here so the synthesis step can judge whether the candidates met the bar.
|
|
277
|
+
|
|
278
|
+
### What the candidate set must reflect (landscape_first constraints)
|
|
279
|
+
|
|
280
|
+
1. **Grounded in actual usage data.** Candidates must reflect the usage-ranked inventory (wr.coding-task: 910, wr.mr-review: 606, etc.) -- not an alphabetical or arbitrary ordering. Any direction that does not prioritize by usage fails this criterion.
|
|
281
|
+
|
|
282
|
+
2. **Must account for the architectural constraint.** Candidates cannot treat "declare wr.features in the header" as meaningful for workflows that are entirely raw-prompt. The only valid modernization path for L3 workflows is raw→promptBlocks migration. Directions that skip this constraint are disqualified.
|
|
283
|
+
|
|
284
|
+
3. **Must address both problems (planning docs AND workflow quality).** A direction that only fixes planning docs without any workflow quality improvement is incomplete. A direction that modernizes workflows without fixing planning docs first is unorderable (can't find the files). Both problems must appear in every viable direction.
|
|
285
|
+
|
|
286
|
+
4. **Must not invent a third problem.** The scope is: fix planning system accuracy AND improve per-step guidance delivery for high-usage workflows. Candidates that reframe toward "build a migration tooling layer" or "redesign the features system" are out of scope -- those are good ideas but belong in the backlog, not in this decision.
|
|
287
|
+
|
|
288
|
+
5. **Must include at least one direction that explicitly scopes OUT wr.coding-task.** The riskiest assumption is that wr.coding-task can be migrated without quality degradation. At least one candidate must treat wr.shaping as the only L3 migration target in the first PR, deferring wr.coding-task to a follow-on.
|
|
289
|
+
|
|
290
|
+
6. **Must include at least one direction that includes wr.coding-task.** The counterpoint must also exist -- the highest-usage workflow deserves a candidate that argues for tackling it now.
|
|
291
|
+
|
|
292
|
+
### What would disqualify a candidate
|
|
293
|
+
|
|
294
|
+
- Proposes "stamp first, review later" for L4 workflows
|
|
295
|
+
- Ignores the raw→promptBlocks prerequisite
|
|
296
|
+
- Treats planning doc fixes as optional
|
|
297
|
+
- Proposes step ID renames on active workflows
|
|
298
|
+
- Requires changes to protected files (src/daemon/, src/trigger/, src/v2/)
|
|
299
|
+
|
|
300
|
+
---
|
|
301
|
+
|
|
302
|
+
## Decision Log
|
|
303
|
+
|
|
304
|
+
- **2026-04-28**: Discovered `exploration-workflow.json` does not exist -- it was deleted in March 2026. Planning docs are stale. Path selection shifted to `landscape_first`.
|
|
305
|
+
- **Capability check**: No web access needed -- all data is local (repo files, session store). Web browsing available but not used.
|
|
306
|
+
- **Delegation**: Not used for file reading/analysis. Delegation deferred to candidate generation; decision at phase 3b was to use self-generation given STANDARD rigor and well-evidenced context.
|
|
307
|
+
- **Self-challenge (Phase 2)**: 5 challenges conducted. Key correction: wr.shaping (not wr.coding-task) is the risk-appropriate first L3 migration target. Stamp is honest for L1, dishonest for L4 without prior review.
|
|
308
|
+
|
|
309
|
+
---
|
|
310
|
+
|
|
311
|
+
## Challenge Notes
|
|
312
|
+
|
|
313
|
+
*(to be populated after direction challenge in phase 3d)*
|
|
314
|
+
|
|
315
|
+
---
|
|
316
|
+
|
|
317
|
+
## Resolution Notes
|
|
318
|
+
|
|
319
|
+
*(to be populated at phase 5)*
|
|
320
|
+
|
|
321
|
+
---
|
|
322
|
+
|
|
323
|
+
## Final Summary
|
|
324
|
+
|
|
325
|
+
*(to be populated at handoff)*
|
|
326
|
+
|