@exaudeus/workrail 3.28.0 → 3.30.0

Files changed (160)
  1. package/dist/console/assets/{index-C146q2kN.js → index-Bl5-Ghuu.js} +1 -1
  2. package/dist/console/index.html +1 -1
  3. package/dist/manifest.json +3 -3
  4. package/docs/README.md +57 -0
  5. package/docs/adrs/001-hybrid-storage-backend.md +38 -0
  6. package/docs/adrs/002-four-layer-context-classification.md +38 -0
  7. package/docs/adrs/003-checkpoint-trigger-strategy.md +35 -0
  8. package/docs/adrs/004-opt-in-encryption-strategy.md +36 -0
  9. package/docs/adrs/005-agent-first-workflow-execution-tokens.md +105 -0
  10. package/docs/adrs/006-append-only-session-run-event-log.md +76 -0
  11. package/docs/adrs/007-resume-and-checkpoint-only-sessions.md +51 -0
  12. package/docs/adrs/008-blocked-nodes-architectural-upgrade.md +178 -0
  13. package/docs/adrs/009-bridge-mode-single-instance-mcp.md +195 -0
  14. package/docs/adrs/010-release-pipeline.md +89 -0
  15. package/docs/architecture/README.md +7 -0
  16. package/docs/architecture/refactor-audit.md +364 -0
  17. package/docs/authoring-v2.md +527 -0
  18. package/docs/authoring.md +873 -0
  19. package/docs/changelog-recent.md +201 -0
  20. package/docs/configuration.md +505 -0
  21. package/docs/ctc-mcp-proposal.md +518 -0
  22. package/docs/design/README.md +22 -0
  23. package/docs/design/agent-cascade-protocol.md +96 -0
  24. package/docs/design/autonomous-console-design-candidates.md +253 -0
  25. package/docs/design/autonomous-console-design-review.md +111 -0
  26. package/docs/design/autonomous-platform-mvp-discovery.md +525 -0
  27. package/docs/design/claude-code-source-deep-dive.md +713 -0
  28. package/docs/design/console-cyberpunk-ui-discovery.md +504 -0
  29. package/docs/design/console-execution-trace-candidates-final.md +160 -0
  30. package/docs/design/console-execution-trace-candidates.md +211 -0
  31. package/docs/design/console-execution-trace-design-candidates-v2.md +113 -0
  32. package/docs/design/console-execution-trace-design-review.md +74 -0
  33. package/docs/design/console-execution-trace-discovery.md +394 -0
  34. package/docs/design/console-execution-trace-final-review.md +77 -0
  35. package/docs/design/console-execution-trace-review.md +92 -0
  36. package/docs/design/console-performance-discovery.md +415 -0
  37. package/docs/design/console-ui-backlog.md +280 -0
  38. package/docs/design/daemon-architecture-discovery.md +853 -0
  39. package/docs/design/daemon-design-candidates.md +318 -0
  40. package/docs/design/daemon-design-review-findings.md +119 -0
  41. package/docs/design/daemon-engine-design-candidates.md +210 -0
  42. package/docs/design/daemon-engine-design-review.md +131 -0
  43. package/docs/design/daemon-execution-engine-discovery.md +280 -0
  44. package/docs/design/daemon-gap-analysis.md +554 -0
  45. package/docs/design/daemon-owns-console-plan.md +168 -0
  46. package/docs/design/daemon-owns-console-review.md +91 -0
  47. package/docs/design/daemon-owns-console.md +195 -0
  48. package/docs/design/data-model-erd.md +11 -0
  49. package/docs/design/design-candidates-consolidate-dev-staleness.md +98 -0
  50. package/docs/design/design-candidates-walk-cache-depth-limit.md +80 -0
  51. package/docs/design/design-review-consolidate-dev-staleness.md +54 -0
  52. package/docs/design/design-review-walk-cache-depth-limit.md +48 -0
  53. package/docs/design/implementation-plan-consolidate-dev-staleness.md +142 -0
  54. package/docs/design/implementation-plan-walk-cache-depth-limit.md +141 -0
  55. package/docs/design/layer3b-ghost-nodes-design-candidates.md +229 -0
  56. package/docs/design/layer3b-ghost-nodes-design-review.md +93 -0
  57. package/docs/design/layer3b-ghost-nodes-implementation-plan.md +219 -0
  58. package/docs/design/list-workflows-latency-fix-plan.md +128 -0
  59. package/docs/design/list-workflows-latency-fix-review.md +55 -0
  60. package/docs/design/list-workflows-latency-fix.md +109 -0
  61. package/docs/design/native-context-management-api.md +11 -0
  62. package/docs/design/performance-sweep-2026-04.md +96 -0
  63. package/docs/design/routines-guide.md +219 -0
  64. package/docs/design/sequence-diagrams.md +11 -0
  65. package/docs/design/subagent-design-principles.md +220 -0
  66. package/docs/design/temporal-patterns-design-candidates.md +312 -0
  67. package/docs/design/temporal-patterns-design-review-findings.md +163 -0
  68. package/docs/design/test-isolation-from-config-file.md +335 -0
  69. package/docs/design/v2-core-design-locks.md +2746 -0
  70. package/docs/design/v2-lock-registry.json +734 -0
  71. package/docs/design/workflow-authoring-v2.md +1044 -0
  72. package/docs/design/workflow-docs-spec.md +218 -0
  73. package/docs/design/workflow-extension-points.md +687 -0
  74. package/docs/design/workrail-auto-trigger-system.md +359 -0
  75. package/docs/design/workrail-config-file-discovery.md +513 -0
  76. package/docs/docker.md +110 -0
  77. package/docs/generated/v2-lock-closure-plan.md +26 -0
  78. package/docs/generated/v2-lock-coverage.json +797 -0
  79. package/docs/generated/v2-lock-coverage.md +177 -0
  80. package/docs/ideas/backlog.md +3927 -0
  81. package/docs/ideas/design-candidates-mcp-resilience.md +208 -0
  82. package/docs/ideas/design-review-findings-mcp-resilience.md +119 -0
  83. package/docs/ideas/implementation_plan.md +249 -0
  84. package/docs/ideas/third-party-workflow-setup-design-thinking.md +1948 -0
  85. package/docs/implementation/02-architecture.md +316 -0
  86. package/docs/implementation/04-testing-strategy.md +124 -0
  87. package/docs/implementation/09-simple-workflow-guide.md +835 -0
  88. package/docs/implementation/13-advanced-validation-guide.md +874 -0
  89. package/docs/implementation/README.md +21 -0
  90. package/docs/integrations/claude-code.md +300 -0
  91. package/docs/integrations/firebender.md +315 -0
  92. package/docs/migration/v0.1.0.md +147 -0
  93. package/docs/naming-conventions.md +45 -0
  94. package/docs/planning/README.md +104 -0
  95. package/docs/planning/github-ticketing-playbook.md +195 -0
  96. package/docs/plans/README.md +24 -0
  97. package/docs/plans/agent-managed-ticketing-design.md +605 -0
  98. package/docs/plans/agentic-orchestration-roadmap.md +112 -0
  99. package/docs/plans/assessment-gates-engine-handoff.md +536 -0
  100. package/docs/plans/content-coherence-and-references.md +151 -0
  101. package/docs/plans/library-extraction-plan.md +340 -0
  102. package/docs/plans/mr-review-workflow-redesign.md +1451 -0
  103. package/docs/plans/native-context-management-epic.md +11 -0
  104. package/docs/plans/perf-fixes-design-candidates.md +225 -0
  105. package/docs/plans/perf-fixes-design-review-findings.md +61 -0
  106. package/docs/plans/perf-fixes-new-issues-candidates.md +264 -0
  107. package/docs/plans/perf-fixes-new-issues-review.md +110 -0
  108. package/docs/plans/prompt-fragments.md +53 -0
  109. package/docs/plans/ui-ux-workflow-design-candidates.md +120 -0
  110. package/docs/plans/ui-ux-workflow-discovery.md +100 -0
  111. package/docs/plans/ui-ux-workflow-review.md +48 -0
  112. package/docs/plans/v2-followup-enhancements.md +587 -0
  113. package/docs/plans/workflow-categories-candidates.md +105 -0
  114. package/docs/plans/workflow-categories-discovery.md +110 -0
  115. package/docs/plans/workflow-categories-review.md +51 -0
  116. package/docs/plans/workflow-discovery-model-candidates.md +94 -0
  117. package/docs/plans/workflow-discovery-model-discovery.md +74 -0
  118. package/docs/plans/workflow-discovery-model-review.md +48 -0
  119. package/docs/plans/workflow-source-setup-phase-1.md +245 -0
  120. package/docs/plans/workflow-source-setup-phase-2.md +361 -0
  121. package/docs/plans/workflow-staleness-detection-candidates.md +104 -0
  122. package/docs/plans/workflow-staleness-detection-review.md +58 -0
  123. package/docs/plans/workflow-staleness-detection.md +80 -0
  124. package/docs/plans/workflow-v2-design.md +69 -0
  125. package/docs/plans/workflow-v2-roadmap.md +74 -0
  126. package/docs/plans/workflow-validation-design.md +98 -0
  127. package/docs/plans/workflow-validation-roadmap.md +108 -0
  128. package/docs/plans/workrail-platform-vision.md +420 -0
  129. package/docs/reference/agent-context-cleaner-snippet.md +94 -0
  130. package/docs/reference/agent-context-guidance.md +140 -0
  131. package/docs/reference/context-optimization.md +284 -0
  132. package/docs/reference/example-workflow-repository-template/.github/workflows/validate.yml +125 -0
  133. package/docs/reference/example-workflow-repository-template/README.md +268 -0
  134. package/docs/reference/example-workflow-repository-template/workflows/example-workflow.json +80 -0
  135. package/docs/reference/external-workflow-repositories.md +916 -0
  136. package/docs/reference/feature-flags-architecture.md +472 -0
  137. package/docs/reference/feature-flags.md +349 -0
  138. package/docs/reference/god-tier-workflow-validation.md +272 -0
  139. package/docs/reference/loop-optimization.md +209 -0
  140. package/docs/reference/loop-validation.md +176 -0
  141. package/docs/reference/loops.md +465 -0
  142. package/docs/reference/mcp-platform-constraints.md +59 -0
  143. package/docs/reference/recovery.md +88 -0
  144. package/docs/reference/releases.md +177 -0
  145. package/docs/reference/troubleshooting.md +105 -0
  146. package/docs/reference/workflow-execution-contract.md +998 -0
  147. package/docs/roadmap/README.md +22 -0
  148. package/docs/roadmap/legacy-planning-status.md +103 -0
  149. package/docs/roadmap/now-next-later.md +70 -0
  150. package/docs/roadmap/open-work-inventory.md +389 -0
  151. package/docs/tickets/README.md +39 -0
  152. package/docs/tickets/next-up.md +76 -0
  153. package/docs/workflow-management.md +317 -0
  154. package/docs/workflow-templates.md +423 -0
  155. package/docs/workflow-validation.md +184 -0
  156. package/docs/workflows.md +254 -0
  157. package/package.json +4 -1
  158. package/spec/authoring-spec.json +61 -16
  159. package/workflows/workflow-for-workflows.json +3 -3
  160. package/workflows/workflow-for-workflows.v2.json +3 -3
@@ -0,0 +1,587 @@
1
+ # WorkRail v2 Follow-up Enhancements
2
+
3
+ > **Active follow-up initiative**
4
+ >
5
+ > This file is still useful for the detailed open v2 follow-up work, but it is no longer the canonical high-level entrypoint.
6
+ >
7
+ > Prefer:
8
+ > - `docs/plans/workflow-v2-roadmap.md`
9
+ > - `docs/plans/workflow-v2-design.md`
10
+
11
+ **Status**: In Progress — detailed follow-up initiative after core v2 delivery
12
+ **Date**: 2026-02-17
13
+ **Updated**: 2026-02-18
14
+ **Context**: Post-v2 core completion. All functional slices shipped, 2628 tests passing. This doc captures enhancement opportunities discovered during manual testing and production usage.
15
+
16
+ ---
17
+
18
+ ## Priority 1: MCP Roots Protocol Integration (Critical Bug Fix)
19
+
20
+ ### Problem
21
+
22
+ `resume_session` fails to find sessions across Firebender workspaces because WorkRail detects git context from the MCP server process's CWD (`process.cwd()`), not the client's workspace.
23
+
24
+ **Scenario**:
25
+ - E1: Agent creates session in Firebender workspace A (zillow repo)
26
+ - Server: Detects git context from server CWD (workrail repo)
27
+ - Session observations: `git_branch: "main"`, `git_head_sha: "b419857..."` (workrail's main)
28
+ - E2: Agent searches for session in Firebender workspace A (zillow repo)
29
+ - `resume_session`: Filters by git context → no match (workrail main ≠ zillow branch)
30
+ - Result: Session not found despite being in the same client workspace
31
+
32
+ **Impact**: Cross-chat resumption is broken for multi-workspace users.
33
+
34
+ ---
35
+
36
+ ### Solution: Use MCP Roots Protocol
37
+
38
+ MCP provides `notifications/roots/list_changed` to tell servers when the client's workspace roots change. Firebender sends this on workspace switch.
39
+
40
+ **Architecture**:
41
+ ```
42
+ Client → notifications/roots/list_changed → Server stores latest roots
43
+ Server → start_workflow → resolves git from roots[0].uri (client workspace)
44
+ Server → resume_session → matches sessions by stored git observations
45
+ ```
46
+
47
+ **Key invariant**: Workspace anchor is resolved **per-request** from current client roots, not once at server startup.
48
+
49
+ ---
50
+
51
+ ### Implementation Plan
52
+
53
+ #### 1. Immutable roots state manager
54
+
55
+ **File**: `src/mcp/workspace-roots-manager.ts`
56
+
57
+ Split read and write capabilities at the type level so handler code can only read — no
58
+ mutation surface leaks into consumers via `V2Dependencies`.
59
+
60
+ ```typescript
61
+ /** Read-only view — passed into V2Dependencies. */
62
+ export interface RootsReader {
63
+ getCurrentRootUris(): readonly string[];
64
+ }
65
+
66
+ /** Write capability — only the MCP notification handler holds this. */
67
+ export interface RootsWriter {
68
+ updateRootUris(uris: readonly string[]): void;
69
+ }
70
+
71
+ export class WorkspaceRootsManager implements RootsReader, RootsWriter {
72
+ private rootUris: readonly string[] = Object.freeze([]);
73
+
74
+ updateRootUris(uris: readonly string[]): void {
75
+ this.rootUris = Object.freeze([...uris]);
76
+ }
77
+
78
+ getCurrentRootUris(): readonly string[] {
79
+ return this.rootUris;
80
+ }
81
+ }
82
+ ```
83
+
84
+ **Philosophy alignment**:
85
+ - Mutable cell is minimal, confined behind an explicit `RootsWriter` interface
86
+ - Handlers receive `RootsReader` — cannot call `updateRootUris`
87
+ - Single-writer (MCP notification handler on Node.js event loop)
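The capability split can be exercised in isolation. The following is a minimal sketch that repeats the types above so it runs on its own; `describeRoots` and the example URIs are hypothetical and exist only to show a consumer that is typed against `RootsReader`.

```typescript
interface RootsReader {
  getCurrentRootUris(): readonly string[];
}

interface RootsWriter {
  updateRootUris(uris: readonly string[]): void;
}

class WorkspaceRootsManager implements RootsReader, RootsWriter {
  private rootUris: readonly string[] = Object.freeze([]);

  updateRootUris(uris: readonly string[]): void {
    // Copy before freezing so later changes to the caller's array cannot leak in.
    this.rootUris = Object.freeze([...uris]);
  }

  getCurrentRootUris(): readonly string[] {
    return this.rootUris;
  }
}

// Hypothetical consumer: it only ever sees the read-only view.
function describeRoots(reader: RootsReader): string {
  const uris = reader.getCurrentRootUris();
  return uris.length > 0 ? uris.join(", ") : "(none)";
}

const manager = new WorkspaceRootsManager();
const writer: RootsWriter = manager; // held only by the notification handler
const reader: RootsReader = manager; // handed to everything else

const incoming = ["file:///repos/zillow"]; // illustrative URI
writer.updateRootUris(incoming);
incoming.push("file:///repos/other"); // mutating the input afterwards...

const summary = describeRoots(reader); // ...does not change the stored copy
const snapshotIsFrozen = Object.isFrozen(reader.getCurrentRootUris());
```

Calling `reader.updateRootUris(...)` does not type-check, which is the point of the split. This is a compile-time guarantee only: `reader` and `writer` are the same object at runtime.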
88
+
89
+ ---
90
+
91
+ #### 2. Add roots notification handler
92
+
93
+ **File**: `src/mcp/server.ts`
94
+
95
+ Two important protocol details:
96
+
97
+ 1. `notifications/roots/list_changed` is a **signal only** — it carries no roots payload. After
98
+ receiving it, the server must call `server.listRoots()` (which sends a `roots/list` request
99
+ to the client) to get the updated list.
100
+ 2. Initial roots must be fetched **after** `server.connect(transport)`. Some clients don't support
101
+ `roots/list`; wrap in try/catch and degrade gracefully to CWD fallback.
102
+
103
+ ```typescript
104
+ const rootsManager = new WorkspaceRootsManager();
105
+ // rootsWriter stays local — never passed to handlers
106
+ const rootsWriter: RootsWriter = rootsManager;
107
+
108
+ // Register before connect. Notification is signal-only; re-fetch via listRoots().
109
+ server.setNotificationHandler(RootsListChangedNotificationSchema, async () => {
110
+ try {
111
+ const result = await server.listRoots();
112
+ rootsWriter.updateRootUris(result.roots.map((r) => r.uri));
113
+ console.error(`[Roots] Updated: ${result.roots.map((r) => r.uri).join(', ') || '(none)'}`);
114
+ } catch {
115
+ console.error('[Roots] Failed to fetch updated roots after change notification');
116
+ }
117
+ });
118
+
119
+ // After server.connect(transport): fetch initial roots.
120
+ // Graceful: clients that don't support roots/list will throw; fall back to CWD.
121
+ try {
122
+ const result = await server.listRoots();
123
+ rootsWriter.updateRootUris(result.roots.map((r) => r.uri));
124
+ console.error(`[Roots] Initial: ${result.roots.map((r) => r.uri).join(', ') || '(none)'}`);
125
+ } catch {
126
+ console.error('[Roots] Client does not support roots/list; CWD fallback active');
127
+ }
128
+ ```
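Both call sites above share one shape: fetch, store, never throw. A minimal sketch of that shape as a single helper follows, assuming a hypothetical `RootsLister` interface that exposes only `listRoots()` and an injected `log` function in place of `console.error`. The test doubles are illustrative and are not part of the MCP SDK.

```typescript
// Hypothetical minimal view of the server: only the call this sketch needs.
interface RootsLister {
  listRoots(): Promise<{ roots: ReadonlyArray<{ uri: string }> }>;
}

interface RootsWriter {
  updateRootUris(uris: readonly string[]): void;
}

type RefreshOutcome = "updated" | "fallback";

// Shared by the initial fetch and the list_changed handler.
async function refreshRoots(
  lister: RootsLister,
  writer: RootsWriter,
  log: (message: string) => void,
): Promise<RefreshOutcome> {
  try {
    const { roots } = await lister.listRoots();
    const uris = roots.map((r) => r.uri);
    writer.updateRootUris(uris);
    log(`[Roots] Updated: ${uris.join(", ") || "(none)"}`);
    return "updated";
  } catch {
    // roots/list unsupported or failed: leave the stored roots untouched.
    log("[Roots] roots/list unavailable; CWD fallback active");
    return "fallback";
  }
}

// Test doubles (illustrative only).
const stored: string[][] = [];
const recordingWriter: RootsWriter = {
  updateRootUris: (uris) => {
    stored.push([...uris]);
  },
};
const supportingClient: RootsLister = {
  listRoots: async () => ({ roots: [{ uri: "file:///repos/zillow" }] }),
};
const legacyClient: RootsLister = {
  listRoots: async () => {
    throw new Error("Method not found");
  },
};
```

With `supportingClient` the helper resolves to `"updated"` and records one root list; with `legacyClient` it resolves to `"fallback"` and records nothing, so a failed refresh never clears roots that were fetched earlier.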
129
+
130
+ ---
131
+
132
+ #### 3. Make workspace anchor resolver per-request
133
+
134
+ **File**: `src/v2/infra/local/workspace-anchor/index.ts`
135
+
136
+ **Before**:
137
+ ```typescript
138
+ export class LocalWorkspaceAnchorV2 implements WorkspaceAnchorPortV2 {
139
+ constructor(private readonly cwd: string) {}
140
+ resolveAnchors(): RA<readonly WorkspaceAnchor[], WorkspaceAnchorError> {
141
+ // uses this.cwd (singleton)
142
+ }
143
+ }
144
+ ```
145
+
146
+ **After**:
147
+ ```typescript
148
+ export interface WorkspaceContextResolverPortV2 {
149
+ resolveFromUri(rootUri: string): ResultAsync<readonly WorkspaceAnchor[], WorkspaceAnchorError>;
150
+ resolveFromCwd(): ResultAsync<readonly WorkspaceAnchor[], WorkspaceAnchorError>;
151
+ }
152
+
153
+ export class LocalWorkspaceAnchorV2 implements WorkspaceContextResolverPortV2 {
154
+ resolveFromUri(rootUri: string): RA<readonly WorkspaceAnchor[], WorkspaceAnchorError> {
155
+ const fsPath = this.uriToPath(rootUri);
156
+ if (!fsPath) return okAsync([]); // Not file:// URI, graceful empty
157
+ return this.resolveFromPath(fsPath);
158
+ }
159
+
160
+ resolveFromCwd(): RA<readonly WorkspaceAnchor[], WorkspaceAnchorError> {
161
+ return this.resolveFromPath(process.cwd());
162
+ }
163
+
164
+ private resolveFromPath(cwd: string): RA<readonly WorkspaceAnchor[], WorkspaceAnchorError> {
165
+ // run git commands in specified cwd (existing logic)
166
+ }
167
+
168
+ private uriToPath(uri: string): string | null {
169
+ if (!uri.startsWith('file://')) return null;
170
+ // Use fileURLToPath (node:url) — handles Windows drive letters and percent-encoding correctly.
171
+ // decodeURIComponent(slice(7)) is wrong on Windows: file:///C:/foo → /C:/foo (leading slash).
172
+ try { return fileURLToPath(uri); } catch { return null; }
173
+ }
174
+ }
175
+ ```
176
+
177
+ **Philosophy**: Pure functions, no constructor state, explicit about file:// URIs only.
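The comment inside `uriToPath` can be checked with a worked example. The sketch below implements only the naive conversion that the source warns against (`naiveUriToPath` is a hypothetical name), so the wrong result is visible. The fix named in the source is Node's `fileURLToPath` from `node:url`, which is not re-implemented here.

```typescript
// The approach the source rejects: strip "file://" and percent-decode the rest.
function naiveUriToPath(uri: string): string | null {
  if (!uri.startsWith("file://")) return null;
  return decodeURIComponent(uri.slice("file://".length));
}

// POSIX-style URI: the naive result happens to be a usable path.
const posixPath = naiveUriToPath("file:///tmp/my%20repo"); // "/tmp/my repo"

// Windows-style URI: a stray leading slash survives, so this is not a valid drive path.
const windowsPath = naiveUriToPath("file:///C:/foo"); // "/C:/foo"

// Non-file schemes are rejected up front, matching the graceful-empty behaviour above.
const nonFileUri = naiveUriToPath("vscode-vfs://github/org/repo"); // null
```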
178
+
179
+ ---
180
+
181
+ #### 4. Update V2Dependencies
182
+
183
+ **File**: `src/mcp/types.ts`
184
+
185
+ ```typescript
186
+ export interface V2Dependencies {
187
+ readonly gate: ExecutionSessionGateV2;
188
+ readonly sessionStore: ...;
189
+ // Remove: readonly workspaceAnchor?: WorkspaceAnchorPortV2;
190
+ // Add:
191
+ readonly workspaceResolver?: WorkspaceContextResolverPortV2;
192
+ // Per-request snapshot of client root URIs, injected at the CallTool boundary.
193
+ // Optional: absent when client doesn't support roots/list (degrades to CWD).
194
+ readonly resolvedRootUris?: readonly string[];
195
+ }
196
+ ```
197
+
198
+ At the `CallToolRequestSchema` handler, snapshot roots once and spread into `V2Dependencies`:
199
+ ```typescript
200
+ const requestCtx: ToolContext = ctx.v2
201
+ ? { ...ctx, v2: { ...ctx.v2, resolvedRootUris: rootsManager.getCurrentRootUris() } }
202
+ : ctx;
203
+ return handler(args ?? {}, requestCtx);
204
+ ```
205
+
206
+ **Why `resolvedRootUris` as a value, not a `getCurrentRoots` thunk**: a function that reads
207
+ ambient state at call-time is not deterministic from the handler's perspective — the roots could
208
+ change between calls. Snapshotting at the request boundary gives handlers an immutable value for
209
+ their entire duration, consistent with the determinism-over-cleverness principle.
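The difference between a snapshot and a thunk shows up as soon as roots change while a handler is running. A minimal sketch, assuming a hypothetical `RootsCell` standing in for `WorkspaceRootsManager` and a hypothetical `RequestDeps` that carries both variants side by side for comparison:

```typescript
// Hypothetical mutable source of roots.
class RootsCell {
  private uris: readonly string[] = Object.freeze([]);

  set(uris: readonly string[]): void {
    this.uris = Object.freeze([...uris]);
  }

  get(): readonly string[] {
    return this.uris;
  }
}

interface RequestDeps {
  readonly resolvedRootUris: readonly string[]; // value: fixed for the request
  readonly getCurrentRoots: () => readonly string[]; // thunk: reads ambient state
}

const cell = new RootsCell();
cell.set(["file:///repos/a"]);

// Request boundary: snapshot once.
const deps: RequestDeps = {
  resolvedRootUris: cell.get(),
  getCurrentRoots: () => cell.get(),
};

// The client switches workspace while the handler is still running.
cell.set(["file:///repos/b"]);

const viaSnapshot = deps.resolvedRootUris[0]; // still "file:///repos/a"
const viaThunk = deps.getCurrentRoots()[0]; // now "file:///repos/b"
```

A handler that reads the thunk twice can see two different workspaces within one tool call; the snapshot cannot drift.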
210
+
211
+ ---
212
+
213
+ #### 5. Update start_workflow to use primary root
214
+
215
+ **File**: `src/mcp/handlers/v2-execution/start.ts`, around line 331
216
+
217
+ **Before**:
218
+ ```typescript
219
+ const workspaceAnchor = ctx.v2?.workspaceAnchor;
220
+ const anchorsRA = workspaceAnchor
221
+ ? workspaceAnchor.resolveAnchors()
222
+ : okAsync([]);
223
+ ```
224
+
225
+ **After**:
226
+ ```typescript
227
+ const workspaceResolver = ctx.v2.workspaceResolver;
228
+ const primaryRootUri = ctx.v2.resolvedRootUris?.[0]; // snapshotted at CallTool boundary
229
+ const anchorsRA = workspaceResolver
230
+ ? (primaryRootUri
231
+ ? workspaceResolver.resolveFromUri(primaryRootUri)
232
+ : workspaceResolver.resolveFromCwd()
233
+ ).orElse(() => okAsync([]))
234
+ : okAsync([]);
235
+ ```
236
+
237
+ **Why**: Uses client's workspace URI if available (snapshotted at request boundary — deterministic
238
+ for this call), falls back to server CWD for clients that don't support roots/list.
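The selection rule in the ternary above can be stated as a pure function. This is a sketch only: `AnchorSource` and `pickAnchorSource` are hypothetical names, and the actual git resolution and the `orElse` error handling are left out.

```typescript
// Hypothetical description of where anchors will be resolved from.
type AnchorSource =
  | { readonly kind: "client_root"; readonly uri: string }
  | { readonly kind: "server_cwd" };

// First root wins; anything falsy (no roots support, empty list) falls back to CWD.
function pickAnchorSource(resolvedRootUris: readonly string[] | undefined): AnchorSource {
  const primary = resolvedRootUris?.[0];
  return primary ? { kind: "client_root", uri: primary } : { kind: "server_cwd" };
}

const withRoots = pickAnchorSource(["file:///repos/zillow", "file:///repos/other"]);
const emptyRoots = pickAnchorSource([]);
const noRootsSupport = pickAnchorSource(undefined);
```

Only `roots[0]` is consulted, so a multi-root workspace is anchored to its first entry. Because the check is a truthiness test, an empty-string URI also falls back to CWD.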
239
+
240
+ ---
241
+
242
+ #### 6. Tests
243
+
244
+ **Unit tests** (`tests/unit/v2/workspace-roots-manager.test.ts`):
245
+ - `updateRootUris` stores an immutable copy; later mutations of the caller's array don't affect the stored roots
246
+ - `getCurrentRootUris` returns frozen array
247
+ - `RootsWriter` interface is separate from `RootsReader` — consumers cannot call `updateRootUris`
248
+
249
+ **Unit tests** (`tests/unit/v2/workspace-anchor-resolver.test.ts`):
250
+ - `resolveFromUri` with valid `file://` URI
251
+ - `resolveFromUri` with non-`file://` URI (e.g., `http://`, `vscode-vfs://`) → returns empty (graceful)
252
+ - `resolveFromUri` with malformed URI → returns empty (graceful)
253
+ - `resolveFromCwd` uses the adapter's default CWD
254
+ - Windows path handling: `file:///C:/foo` → `C:\foo` (via `fileURLToPath`)
255
+
256
+ **Integration test** (`tests/integration/v2/resume-session-workspace-filtering.test.ts`):
257
+ - Create session in workspace A (mock `resolvedRootUris` pointing at a temp git repo on branch `feat-a`)
258
+ - Create session in workspace B (mock pointing at a different temp git repo)
259
+ - `resume_session` from workspace A → finds only workspace A session via git branch/SHA match
260
+ - `resume_session` with no roots → finds both via recency fallback
261
+
262
+ ---
263
+
264
+ ### Status
265
+
266
+ **Complete** (2026-02-18, tests tightened 2026-03-26 in #147). Full implementation shipped including integration and unit tests covering all workspace resolution variants and edge cases (26+ tests).
267
+
268
+ ---
269
+
270
+ ## Priority 2: MCP Progress Notifications for Workflow Execution
271
+
272
+ ### Problem
273
+
274
+ Long workflows (10+ steps, loops, subagents) take minutes to complete. Agents have no visibility into progress — they call `continue_workflow` and wait.
275
+
276
+ ### Solution: Send `notifications/progress`
277
+
278
+ When a `continue_workflow` advance completes, send a progress notification to the client:
279
+
280
+ ```jsonc
281
+ {
282
+ "method": "notifications/progress",
283
+ "params": {
284
+ "progressToken": "...", // from original request._meta.progressToken
285
+ "progress": 3,
286
+ "total": 10,
287
+ "message": "Completed step 3/10: Hypothesis Development"
288
+ }
289
+ }
290
+ ```
291
+
292
+ **Agent UX**: The client UI shows "WorkRail: Step 3/10 (Hypothesis Development)" while the tool call is in-flight.
293
+
294
+ ---
295
+
296
+ ### Implementation
297
+
298
+ **File**: `src/mcp/handlers/v2-execution/advance.ts`
299
+
300
+ **After** successful append, before returning:
301
+ ```typescript
302
+ // Send progress notification if client requested it.
303
+ // progressToken must be threaded from CallToolRequestSchema handler
304
+ // through ToolContext (or via a server reference passed to V2Dependencies).
305
+ if (progressToken) {
306
+ const dag = projectRunDagV2(truthAfter.events);
307
+ if (dag.isOk()) {
308
+ const run = dag.value.runsById[runId];
309
+ // Count only 'step' nodes — not 'blocked_attempt' or 'checkpoint' nodes.
310
+ // Post-ADR 008, nodesById includes blocked_attempt nodes; counting all of them
311
+ // would inflate 'total' and make progress percentages wrong.
312
+ const stepNodes = Object.values(run?.nodesById ?? {}).filter(n => n.nodeKind === 'step');
313
+ const totalSteps = stepNodes.length;
314
+ const completedSteps = stepNodes.filter(n => n.isComplete).length;
315
+
316
+ // Correct SDK API is sendNotification, not notification.
317
+ await server.sendNotification({
318
+ method: 'notifications/progress',
319
+ params: {
320
+ progressToken,
321
+ progress: completedSteps,
322
+ total: totalSteps,
323
+ message: `Completed step ${completedSteps}/${totalSteps}: ${currentStep.title}`,
324
+ },
325
+ });
326
+ }
327
+ }
328
+ ```
329
+
330
+ **Three implementation details to resolve before building**:
331
+
332
+ 1. **`progressToken` plumbing**: `request._meta?.progressToken` is available in the raw
333
+ `CallToolRequestSchema` handler, not in `executeAdvance`. Thread it through `ToolContext`
334
+ (or a dedicated `RequestMeta` field) before calling the advance logic.
335
+
336
+ 2. **`server` reference**: `advance.ts` has no access to the MCP `Server` instance today.
337
+ Pass it via `V2Dependencies` or a `NotificationSender` port (interface segregation — expose
338
+ only `sendNotification`, not the full server).
339
+
340
+ 3. **`completedSteps` count**: See inline note above — filter by `nodeKind === 'step'` to
341
+ exclude `blocked_attempt` and `checkpoint` nodes, which are in the same DAG post-ADR 008.
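The third point is easy to get wrong, so here is a minimal sketch of the count as a pure projection. `RunNode`, `projectProgress`, and `buildProgressNotification` are hypothetical names with a reduced node shape; the real DAG types and the send call are not modelled.

```typescript
// Hypothetical node shape: only the two fields the count needs.
type NodeKind = "step" | "blocked_attempt" | "checkpoint";

interface RunNode {
  readonly nodeKind: NodeKind;
  readonly isComplete: boolean;
}

interface ProgressNotification {
  readonly method: "notifications/progress";
  readonly params: {
    readonly progressToken: string | number;
    readonly progress: number;
    readonly total: number;
    readonly message: string;
  };
}

// Pure projection: count 'step' nodes only.
function projectProgress(nodes: readonly RunNode[]): { progress: number; total: number } {
  const steps = nodes.filter((n) => n.nodeKind === "step");
  return { progress: steps.filter((n) => n.isComplete).length, total: steps.length };
}

// Opt-in: returns null when the client supplied no progressToken.
function buildProgressNotification(
  progressToken: string | number | undefined,
  nodes: readonly RunNode[],
  stepTitle: string,
): ProgressNotification | null {
  if (progressToken === undefined) return null;
  const { progress, total } = projectProgress(nodes);
  return {
    method: "notifications/progress",
    params: { progressToken, progress, total, message: `Completed step ${progress}/${total}: ${stepTitle}` },
  };
}

// 4 step nodes (3 complete) plus 3 non-step nodes in the same DAG.
const sampleNodes: readonly RunNode[] = [
  { nodeKind: "step", isComplete: true },
  { nodeKind: "blocked_attempt", isComplete: false },
  { nodeKind: "step", isComplete: true },
  { nodeKind: "checkpoint", isComplete: true },
  { nodeKind: "step", isComplete: true },
  { nodeKind: "blocked_attempt", isComplete: false },
  { nodeKind: "step", isComplete: false },
];

const progressNote = buildProgressNotification("tok-1", sampleNodes, "Hypothesis Development");
const silent = buildProgressNotification(undefined, sampleNodes, "Hypothesis Development");
```

Counting every node would report a total of 7 here; filtering to `step` nodes reports 3 of 4.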
342
+
343
+ **Philosophy**:
344
+ - Pure projection (DAG -> progress count)
345
+ - Side effect at edge (notification send)
346
+ - Opt-in (only if client provides progressToken)
347
+
348
+ ---
349
+
350
+ ### Open Question
351
+
352
+ Should progress be:
353
+ - **Step-granular** (1 notification per step) — simple, but may spam for 50-step workflows
354
+ - **Percentage-based** (notify on 10%, 20%, ..., 100%) — fewer notifications, but requires more logic
355
+ - **Time-based** (notify every 5 seconds) — smooth UX, but requires background timers
356
+
357
+ **Recommendation**: Start with step-granular (simplest, matches the execution model). Add throttling later if needed.
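If throttling is added later, the percentage-based option can sit on top of the step-granular model as a small pure predicate. The sketch below is one possible shape, not a committed design; `crossesBucket`, `countNotifications`, and the 10% bucket size are assumptions.

```typescript
// Emit only when progress enters a new bucket of `bucketPercent`, and always on completion.
function crossesBucket(
  prevCompleted: number,
  completed: number,
  total: number,
  bucketPercent: number,
): boolean {
  if (total <= 0 || bucketPercent <= 0) return false;
  if (completed >= total) return completed > prevCompleted;
  // Integer numerator and denominator keep exact bucket boundaries exact.
  const bucketOf = (n: number): number => Math.floor((n * 100) / (total * bucketPercent));
  return bucketOf(completed) > bucketOf(prevCompleted);
}

// Replays a run one step at a time and counts how many notifications would be sent.
function countNotifications(total: number, bucketPercent: number): number {
  let sent = 0;
  for (let completed = 1; completed <= total; completed++) {
    if (crossesBucket(completed - 1, completed, total, bucketPercent)) sent++;
  }
  return sent;
}

const longRun = countNotifications(50, 10); // 10 notifications for a 50-step workflow
const shortRun = countNotifications(3, 10); // 3 notifications: short runs stay step-granular
```

Each step of a short workflow already spans more than one bucket, so the predicate degrades to step-granular behaviour there and only thins out notifications for long runs.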
358
+
359
+ ---
360
+
361
+ ## Priority 3: Session State Change Notifications (Console/Dashboard Integration)
362
+
363
+ ### Problem
364
+
365
+ Once a Console/Dashboard UI exists, users may have multiple views open:
366
+ - Session list showing all sessions
367
+ - Session detail showing a specific session's DAG
368
+ - Workflow execution view
369
+
370
+ When an agent advances a workflow, these views become stale. Without push notifications they would need a manual refresh or polling.
371
+
372
+ ### Solution: `notifications/resources/updated`
373
+
374
+ After durable events are written, notify clients watching that session:
375
+
376
+ ```json
377
+ {
378
+ "method": "notifications/resources/updated",
379
+ "params": {
380
+ "uri": "workrail://session/sess_abc123",
381
+ "changes": {
382
+ "lastEventIndex": 42,
383
+ "preferredTipNodeId": "node_xyz",
384
+ "isComplete": false
385
+ }
386
+ }
387
+ }
388
+ ```
389
+
390
+ **Console benefit**: Auto-refresh session views when new events are written.
391
+
392
+ ---
393
+
394
+ ### Implementation
395
+
396
+ **Requires**:
397
+ 1. Resource URI schema for sessions (`workrail://session/{sessionId}`)
398
+ 2. Subscription tracking (which clients are watching which sessions)
399
+ 3. Notification dispatch after `sessionStore.append()`
400
+
401
+ **Defer until**: Console UI exists (YAGNI — no UI to refresh yet)
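For when this is picked up, the first two requirements are small. The following is a minimal in-memory sketch: `sessionUri` follows the `workrail://session/{sessionId}` scheme listed above, while `SubscriptionRegistry` and the client IDs are hypothetical. Dispatch after `sessionStore.append()` is not shown.

```typescript
// URI scheme from the requirements list: workrail://session/{sessionId}
function sessionUri(sessionId: string): string {
  return `workrail://session/${encodeURIComponent(sessionId)}`;
}

// Hypothetical registry: which clients are watching which session URI.
class SubscriptionRegistry {
  private readonly byUri = new Map<string, Set<string>>();

  subscribe(clientId: string, uri: string): void {
    const watchers = this.byUri.get(uri) ?? new Set<string>();
    watchers.add(clientId);
    this.byUri.set(uri, watchers);
  }

  unsubscribe(clientId: string, uri: string): void {
    const watchers = this.byUri.get(uri);
    if (!watchers) return;
    watchers.delete(clientId);
    // Drop empty entries so the map does not grow with dead sessions.
    if (watchers.size === 0) this.byUri.delete(uri);
  }

  // Called after a durable append: who should be notified for this URI?
  watchersOf(uri: string): readonly string[] {
    const watchers = this.byUri.get(uri);
    return watchers ? Array.from(watchers).sort() : [];
  }
}

const registry = new SubscriptionRegistry();
const watchedUri = sessionUri("sess_abc123");
registry.subscribe("console-2", watchedUri);
registry.subscribe("console-1", watchedUri);
registry.subscribe("console-1", sessionUri("sess_other"));

const toNotify = registry.watchersOf(watchedUri); // ["console-1", "console-2"]
```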
402
+
403
+ ---
404
+
405
+ ## Priority 4: Logging Notifications (Server Diagnostics)
406
+
407
+ ### Problem
408
+
409
+ When session health issues occur (lock contention, corruption detected, validation errors), agents see tool errors but operators have no server-side visibility.
410
+
411
+ ### Solution: `notifications/logging/message`
412
+
413
+ Structured server logs sent to clients:
414
+
415
+ ```json
416
+ {
417
+ "method": "notifications/logging/message",
418
+ "params": {
419
+ "level": "warning",
420
+ "logger": "workrail.session.gate",
421
+ "data": "Session lock held for >5s — another process may be stuck"
422
+ }
423
+ }
424
+ ```
425
+
426
+ **Operator benefit**: Real-time server diagnostics visible in Firebender console.
427
+
428
+ **When to send**:
429
+ - Lock timeout warnings (held >5s)
430
+ - Session corruption detected
431
+ - Keyring initialization failures
432
+ - Feature flag changes
433
+
434
+ **Philosophy**: Errors as data, observability at edges
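A sketch of how the lock-timeout trigger could produce this payload as plain data. The level names follow the MCP logging levels; `buildLogMessage`, `lockWarning`, and the idea of passing the client's minimum level as an argument are assumptions made for illustration.

```typescript
// MCP logging levels, ordered from least to most severe.
const LOG_LEVELS = ["debug", "info", "notice", "warning", "error", "critical", "alert", "emergency"] as const;
type LogLevel = (typeof LOG_LEVELS)[number];

interface LoggingMessage {
  readonly method: "notifications/logging/message";
  readonly params: { readonly level: LogLevel; readonly logger: string; readonly data: unknown };
}

// Returns null when the entry is below the minimum level the client asked for.
function buildLogMessage(
  minLevel: LogLevel,
  level: LogLevel,
  logger: string,
  data: unknown,
): LoggingMessage | null {
  if (LOG_LEVELS.indexOf(level) < LOG_LEVELS.indexOf(minLevel)) return null;
  return { method: "notifications/logging/message", params: { level, logger, data } };
}

// Hypothetical check for the "held >5s" trigger listed above.
function lockWarning(heldMs: number, minLevel: LogLevel): LoggingMessage | null {
  if (heldMs <= 5000) return null;
  return buildLogMessage(minLevel, "warning", "workrail.session.gate", `Session lock held for ${heldMs}ms`);
}

const visibleWarning = lockWarning(6200, "info"); // sent: warning >= info
const filteredWarning = lockWarning(6200, "error"); // null: warning < error
const quickLock = lockWarning(1000, "debug"); // null: under the 5s threshold
```

Keeping the builder pure leaves the actual send at the edge, in line with the philosophy note above.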
435
+
436
+ ---
437
+
438
+ ## Priority 5: Dynamic Tool List Updates
439
+
440
+ ### Problem
441
+
442
+ Feature flags control which tools are available. After a flag changes, an agent must reconnect to see the new tools. (Note: `WORKRAIL_ENABLE_V2_TOOLS` has been removed -- v2 is default-on. This priority applies to any future feature flags.)
443
+
444
+ ### Solution: `notifications/tools/list_changed`
445
+
446
+ When feature flags change:
447
+ ```json
448
+ {
449
+ "method": "notifications/tools/list_changed",
450
+ "params": {}
451
+ }
452
+ ```
453
+
454
+ Client re-fetches tool list via `tools/list`.
455
+
456
+ **Complexity**: Requires runtime feature flag mutation (currently environment variables, immutable after boot).
457
+
458
+ **Defer until**: Feature flags become mutable via Console UI.
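If flags do become mutable, the server only needs to notify when a flag change alters the visible tool set. A minimal sketch of that check follows; `ToolSpec`, `requiresFlag`, `experimental_tool`, and both flag names are hypothetical.

```typescript
// Hypothetical registry entry: a tool may be gated behind one flag.
interface ToolSpec {
  readonly toolName: string;
  readonly requiresFlag?: string;
}

const TOOL_SPECS: readonly ToolSpec[] = [
  { toolName: "start_workflow" },
  { toolName: "continue_workflow" },
  { toolName: "experimental_tool", requiresFlag: "example_flag" },
];

// Sorted list of tool names visible under the given flags.
function visibleTools(specs: readonly ToolSpec[], enabledFlags: readonly string[]): string[] {
  return specs
    .filter((s) => s.requiresFlag === undefined || enabledFlags.indexOf(s.requiresFlag) !== -1)
    .map((s) => s.toolName)
    .sort();
}

// Both inputs are sorted, so an index-wise comparison is enough.
function toolListChanged(before: readonly string[], after: readonly string[]): boolean {
  return before.length !== after.length || before.some((toolName, i) => toolName !== after[i]);
}

const beforeFlip = visibleTools(TOOL_SPECS, []);
const afterFlip = visibleTools(TOOL_SPECS, ["example_flag"]);
const unrelatedFlip = visibleTools(TOOL_SPECS, ["some_other_flag"]);

const mustNotify = toolListChanged(beforeFlip, afterFlip); // true: a tool appeared
const canStaySilent = !toolListChanged(beforeFlip, unrelatedFlip); // true: nothing visible changed
```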
459
+
460
+ ---
461
+
462
+ ## Priority 6: Async Workflow Execution via MCP Tasks
463
+
464
+ ### Problem
465
+
466
+ Long workflows block the agent's tool call. A 50-step workflow might take 10+ minutes, during which the agent is waiting on a single `continue_workflow` call.
467
+
468
+ ### Solution: MCP Tasks for async workflows
469
+
470
+ **Flow**:
471
+ 1. Agent: `start_workflow` with `task: { ttl: 600000, pollInterval: 5000 }`
472
+ 2. Server: Returns `taskId`, begins async execution
473
+ 3. Client: Polls `tasks/get` every 5s to check status
474
+ 4. Server: Sends `notifications/tasks/status` when steps complete
475
+ 5. Agent: Sees progress, continues other work
476
+ 6. Server: Workflow completes, task result available
477
+ 7. Agent: `tasks/get` returns final result
478
+
479
+ **Benefits**:
480
+ - Agent can do other work while workflow runs
481
+ - Progress via notifications instead of blocking
482
+ - Timeout-friendly (long workflows don't need infinite tool call timeout)
483
+
484
+ **Complexity**:
485
+ - Requires background workflow executor thread
486
+ - Task result storage + TTL management
487
+ - Cancellation support
488
+
489
+ **Defer until**: Workflows routinely take >60s (YAGNI for current 2-10 step workflows).
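The "task result storage + TTL management" item can be sized with a small sketch. Everything here is an assumption made for illustration: `InMemoryTaskStore`, the reduced status set, and the injected clock are hypothetical, and no background executor or MCP wiring is modelled.

```typescript
// Reduced, hypothetical status set for this sketch.
type TaskStatus = "working" | "completed" | "cancelled";

interface TaskRecord<T> {
  readonly taskId: string;
  readonly status: TaskStatus;
  readonly result?: T;
  readonly expiresAtMs: number;
}

class InMemoryTaskStore<T> {
  private readonly tasks = new Map<string, TaskRecord<T>>();
  private readonly now: () => number;

  // The clock is injected so TTL behaviour is testable without timers.
  constructor(now: () => number) {
    this.now = now;
  }

  create(taskId: string, ttlMs: number): void {
    this.tasks.set(taskId, { taskId, status: "working", expiresAtMs: this.now() + ttlMs });
  }

  complete(taskId: string, result: T): boolean {
    const task = this.get(taskId);
    if (!task || task.status !== "working") return false;
    this.tasks.set(taskId, { ...task, status: "completed", result });
    return true;
  }

  cancel(taskId: string): boolean {
    const task = this.get(taskId);
    if (!task || task.status !== "working") return false;
    this.tasks.set(taskId, { ...task, status: "cancelled" });
    return true;
  }

  // Lazy eviction: expired tasks are dropped when they are next looked up.
  get(taskId: string): TaskRecord<T> | undefined {
    const task = this.tasks.get(taskId);
    if (!task) return undefined;
    if (this.now() >= task.expiresAtMs) {
      this.tasks.delete(taskId);
      return undefined;
    }
    return task;
  }
}

let clockMs = 0;
const taskStore = new InMemoryTaskStore<string>(() => clockMs);
taskStore.create("task-1", 600000); // ttl from the flow above

clockMs = 5000; // first poll
const whileRunning = taskStore.get("task-1")?.status; // "working"

taskStore.complete("task-1", "workflow complete");
const afterCompletion = taskStore.get("task-1")?.result; // "workflow complete"

clockMs = 600000; // TTL reached
const afterTtl = taskStore.get("task-1"); // undefined
```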
490
+
491
+ ---
492
+
493
+ ## Summary Table
494
+
495
+ | Enhancement | Priority | Status | Blocks | Philosophy Aligned |
496
+ |-------------|----------|--------|--------|-------------------|
497
+ | MCP Roots Protocol | P1 (bug fix) | Complete (2026-02-18, #75/#78/#147) | Cross-workspace resume | Yes -- pure functions, immutable |
498
+ | Progress Notifications | P2 | Planned (3 open design issues) | Agent UX for long workflows | Yes -- side effects at edges |
499
+ | Resource Update Notifications | P3 | Deferred (no UI) | Console auto-refresh | Yes -- event-driven |
500
+ | Logging Notifications | P4 | Deferred | Operator visibility | Yes -- errors as data |
501
+ | Tool List Change Notifications | P5 | Deferred | Runtime flag changes | Partial -- requires mutable flags |
502
+ | Async Workflows via Tasks | P6 | Deferred (YAGNI) | 10min+ workflows | Partial -- requires background threads |
503
+
504
+ ---
505
+
506
+ ## Related Work from Earlier Session
507
+
508
+ From the "unfleshed v2 ideas" inventory:
509
+
510
+ ### Already Addressed This Session
511
+
512
+ - **MCP Roots Protocol** -- Per-request workspace anchor resolution; `RootsReader`/`RootsWriter` capability split; `fileURLToPath` URI handling; `resolvedRootUris` snapshot at CallTool boundary (2026-02-18)
513
+ - **Workflow migration** -- All while-loops migrated to `wr.contracts.loop_control` (PR #69)
514
+ - **ADR 008 completion** -- Terminal block path + projection query (this session)
515
+ - **Deprecated path removal** -- `advance_recorded.outcome.kind='blocked'` removed from builder (this session, PR #70)
516
+ - **SessionManager Result refactoring** -- All methods return `Result`, no throws (this session, PR #70)
517
+ - **V2ToolContext + requireV2 guard** -- Eliminated `ctx.v2!` assertions (this session, PR #70)
518
+ - **Branded contractRef** -- `ArtifactContractRef` type instead of `string` (this session, PR #70)
519
+ - **Compiler contract validation** -- Compile-time check for unknown contract refs (this session, PR #70)
520
+ - **Manual test plan** -- 23 scenarios for slices 4b, 4c, ADR 008, loop artifacts (this session)
521
+ - **Optimistic pre-lock dedup** -- Checkpoint replay skips gate (this session, PR #73)
522
+
523
+ ### Still Open
524
+
525
+ 1. ~~**Unflag v2 tools**~~ (done -- v2 is default-on, feature flag gate removed)
526
+ 2. **Console/Dashboard UI** — Zero UI exists, substrate complete
527
+ 3. **Agent Cascade Protocol** — Cross-IDE delegation model, design complete
528
+ 4. **Enforceable verification contracts** — `verify` block is instructional-only
529
+ 5. **Parallel forEach execution** — Concurrent loop iterations
530
+ 6. **Subagent composition** — Chained outputs (researcher → challenger → analyzer)
531
+ 7. **Evidence validation contracts** — Replace prose `validationCriteria` with structured artifacts
532
+
533
+ ---
534
+
535
+ ## Decision: What to Do Next
536
+
537
+ ### Done: MCP Roots Protocol
538
+
539
+ Implemented 2026-02-18. Per-request workspace anchor resolution, correct `listRoots()` flow,
540
+ `fileURLToPath` URI handling, `resolvedRootUris` snapshot at CallTool boundary.
541
+
542
+ ### Next: Complete manual test plan validation
543
+
544
+ Run the E1+E2 cross-workspace resume scenarios to verify the roots fix works end-to-end.
545
+ Manual test plans from the v2 phases have been archived; verification is now covered by the
546
+ 26+ automated tests added with the roots fix.
547
+
548
+ ### Done: Unflag v2 tools (Production Readiness)
549
+
550
+ V2 is default-on. Feature flag gate removed.
551
+
552
+ ### Later: Progress Notifications (UX Improvement)
553
+
554
+ - **Impact**: Better agent feedback for long workflows
555
+ - **Effort**: Moderate — three design issues must be resolved first (see P2 above):
556
+ 1. `progressToken` threading through `ToolContext`
557
+ 2. `NotificationSender` port to give advance handler access to `sendNotification`
558
+ 3. Node counting: filter `step` nodes only, exclude `blocked_attempt` + `checkpoint`
559
+ - **Risk**: Low (opt-in via progressToken)
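
The node-counting rule in design issue 3 can be sketched as below. This is a minimal illustration, not the project's implementation: `RunNode`, `NodeKind`, and `countProgress` are hypothetical names, and only the three node kinds named above are modeled. The returned `progress` and `total` fields mirror the fields of an MCP progress notification.

```typescript
// Hypothetical node shape; the real projection types may differ.
type NodeKind = "step" | "blocked_attempt" | "checkpoint";

interface RunNode {
  readonly id: string;
  readonly kind: NodeKind;
}

interface Progress {
  readonly progress: number;
  readonly total: number;
}

// Count only `step` nodes so that blocked attempts and checkpoints
// do not inflate the reported progress.
function countProgress(nodes: readonly RunNode[], totalSteps: number): Progress {
  const completed = nodes.filter((n) => n.kind === "step").length;
  // Clamp so progress never exceeds the declared total.
  return { progress: Math.min(completed, totalSteps), total: totalSteps };
}
```

Filtering by kind keeps the count stable when a blocked step is retried, since each retry adds a `blocked_attempt` node rather than a `step` node.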
560
+
561
+ ### Recommended Sequence
562
+
563
+ 1. ~~**MCP Roots**~~ (done)
564
+ 2. **Complete manual test plan validation** (run all 23 scenarios with roots fix)
565
+ 3. ~~**Unflag v2 tools**~~ (done)
566
+ 4. **Resolve P2 design issues** then implement progress notifications
567
+
568
+ ---
569
+
570
+ ## Open Questions
571
+
572
+ 1. **Should resume_session support multi-root matching?** Current plan uses only `roots[0]`. If a client has 3 workspace roots, should sessions from any of them be eligible?
573
+ - **Recommendation**: No (YAGNI). Use primary root only.
574
+
575
+ 2. **What if client sends roots but they're all non-file:// URIs?** (e.g., `vscode-vfs://github/...`)
576
+ - **Recommendation**: Graceful fallback to server CWD with a warning log.
577
+
578
+ 3. **Should workspace anchor resolution be cached per root URI?** Git commands are expensive (fork + exec).
579
+ - **Recommendation**: No for v1. Add caching later if profiling shows it's a bottleneck.
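
The recommendations for questions 1 and 2 can be combined in one small sketch: use only the primary root, and fall back to the server CWD with a warning when that root is not a usable `file://` URI. `resolvePrimaryRoot` and `WorkspaceAnchor` are hypothetical names. Treating a non-file primary root as a fallback case, even when later roots are `file://` URIs, is an assumption that follows from the primary-root-only recommendation.

```typescript
import { fileURLToPath } from "node:url";

interface WorkspaceAnchor {
  path: string;
  source: "root" | "cwd_fallback";
  warning?: string;
}

// Resolve the workspace path from the primary root only (roots[0]).
// Anything that is not a usable file:// URI falls back to the server CWD.
function resolvePrimaryRoot(rootUris: readonly string[], serverCwd: string): WorkspaceAnchor {
  const primary = rootUris[0];
  if (primary !== undefined && primary.startsWith("file://")) {
    try {
      return { path: fileURLToPath(primary), source: "root" };
    } catch {
      // Malformed file URL: fall through to the CWD fallback below.
    }
  }
  return {
    path: serverCwd,
    source: "cwd_fallback",
    warning: `No usable file:// primary root (got ${primary ?? "none"}); using server CWD`,
  };
}
```

Returning the warning as data leaves the choice of logger to the caller.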
580
+
581
+ ---
582
+
583
+ ## References
584
+
585
+ - MCP Roots Spec: `https://modelcontextprotocol.io/specification/draft/client/roots`
586
+ - MCP SDK Types: `@modelcontextprotocol/sdk/types` (v1.24.0)
587
+ - Design Locks: `docs/design/v2-core-design-locks.md` §15 (single-writer), §1.3 (rehydrate separation)
@@ -0,0 +1,105 @@
1
+ # Workflow Categories Design Candidates
2
+
3
+ ## Problem Understanding
4
+
5
+ **Core tensions:**
6
+ 1. Hash stability: category metadata cannot go in workflow JSON without breaking workflowHash when recategorized
7
+ 2. Default behavior: making summary the default changes the implicit contract of list_workflows (agents expecting full list must adapt)
8
+ 3. Overlay freshness: a separate categories file can drift from the actual workflow registry
9
+
10
+ **Likely seam**: `handleV2ListWorkflows` in `src/mcp/handlers/v2-workflow.ts` + `V2ListWorkflowsInput` in `src/mcp/v2/tools.ts` + `V2WorkflowListOutputSchema` in `src/mcp/output-schemas.ts`
11
+
12
+ **What makes it hard**: The overlay must be authoritative metadata about the registry without being part of the compilation pipeline. The two obvious alternatives both break: putting the category in workflow JSON changes `workflowHash` on recategorization, and inferring it dynamically is not deterministic.
13
+
14
+ ## Philosophy Constraints
15
+
16
+ - **Determinism**: category assignment must be explicit, not inferred
17
+ - **Make illegal states unrepresentable**: uncategorized workflows should trigger a validation warning rather than pass silently
18
+ - **YAGNI**: don't add compiler complexity for C when A solves it more simply
19
+ - **Explicit domain types**: category should be a typed enum, not a free string
20
+
21
+ ## Impact Surface
22
+
23
+ - `spec/workflow-categories.json` (new file)
24
+ - `V2ListWorkflowsInput` (new optional `category` field)
25
+ - `V2WorkflowListOutputSchema` (new optional `categorySummary` field)
26
+ - `handleV2ListWorkflows` (response branching logic)
27
+ - `validate-workflows-registry.ts` (new uncategorized workflow warning)
28
+ - `workflow-for-workflows.v2.json` Phase 7 (should stamp category when authoring)
29
+
30
+ ## Candidates
31
+
32
+ ### A: Spec overlay file + `category` filter param ✓ RECOMMENDED
33
+
34
+ **Summary**: `spec/workflow-categories.json` maps workflow IDs to domain categories. `list_workflows` without `category` returns compact `categorySummary`. With `category`, returns full filtered list.
35
+
36
+ - **Tensions resolved**: hash stability, backwards compatibility, token reduction
37
+ - **Tensions accepted**: overlay can drift (mitigated by validate:registry check)
38
+ - **Boundary**: spec/ directory + V2ListWorkflowsInput + output schema + handler
39
+ - **Failure mode**: new workflow added but not categorized — shows as uncategorized, validator warns
40
+ - **Repo pattern**: adapts `includeSources` pattern directly
41
+ - **Gains**: clean separation, CI-checkable, zero workflow file changes
42
+ - **Losses**: extra file to maintain
43
+ - **Scope**: best-fit
44
+ - **Philosophy**: honors determinism, make-illegal-states-unrepresentable
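
The drift mitigation for candidate A could look roughly like the check below. It is a sketch under assumptions: `findRegistryWarnings` is a hypothetical function, and the element shape of `categories` (`{ id, displayName }`) is assumed, since the overlay format is only outlined in this document.

```typescript
// Assumed overlay shape, following the outline of spec/workflow-categories.json.
interface CategoryOverlay {
  categories: { id: string; displayName: string }[];
  workflows: Record<string, { category: string; hidden?: boolean }>;
}

// Return one warning per inconsistency between the registry and the overlay.
function findRegistryWarnings(workflowIds: readonly string[], overlay: CategoryOverlay): string[] {
  const warnings: string[] = [];
  const knownCategories = new Set(overlay.categories.map((c) => c.id));

  for (const id of workflowIds) {
    const entry: { category: string; hidden?: boolean } | undefined = overlay.workflows[id];
    if (entry === undefined) {
      warnings.push(`uncategorized workflow: ${id}`);
    } else if (!knownCategories.has(entry.category)) {
      warnings.push(`unknown category "${entry.category}" for workflow: ${id}`);
    }
  }

  // Drift in the other direction: overlay entries for workflows that no longer exist.
  const registered = new Set(workflowIds);
  for (const id of Object.keys(overlay.workflows)) {
    if (!registered.has(id)) warnings.push(`overlay entry without workflow: ${id}`);
  }
  return warnings;
}
```

Checking both directions catches a newly added workflow as well as a stale overlay entry left behind after a rename or removal.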
45
+
46
+ ### B: Naming convention inference (no overlay)
47
+
48
+ **Summary**: Infer category from workflow ID prefix at runtime. `routine-*` → routines, `test-*` → testing, everything else guessed from description keywords.
49
+
50
+ - **Failure mode**: ~70% of workflows mis-categorized (only routine-* and test-* have reliable prefixes)
51
+ - **Repo pattern**: departs
52
+ - **Scope**: too narrow — doesn't work for most of the catalog
53
+ - **Philosophy**: conflicts with determinism
54
+
55
+ ### C: `category` field in workflow JSON with hash isolation
56
+
57
+ **Summary**: Add `category` to workflow JSON but strip it from the compiled snapshot before hashing.
58
+
59
+ - **Failure mode**: compiler regression accidentally includes `category` in hash, silently invalidating sessions
60
+ - **Repo pattern**: departs — no existing field excluded from compilation this way
61
+ - **Scope**: too broad — adds significant compiler complexity
62
+ - **Philosophy**: violates YAGNI
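
For illustration only, hash isolation as described in candidate C might be sketched as follows. `hashWithoutCategory` is a hypothetical helper, and plain `JSON.stringify` plus SHA-256 stands in for whatever canonicalization and hashing the real compiler uses.

```typescript
import { createHash } from "node:crypto";

// Strip `category` before hashing so recategorizing does not change the hash.
// NOTE: JSON.stringify is not a canonical serialization (key order matters);
// it is used here only to keep the sketch short.
function hashWithoutCategory(workflow: Readonly<Record<string, unknown>>): string {
  const hashable: Record<string, unknown> = { ...workflow };
  delete hashable["category"];
  return createHash("sha256").update(JSON.stringify(hashable)).digest("hex");
}
```

The failure mode listed above corresponds to losing the `delete` line in a refactor: hashes would start changing on recategorization with no compile-time signal.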
63
+
64
+ ## Comparison and Recommendation
65
+
66
+ **A wins on every axis**: hash stability, backwards compatibility, clean boundary, CI-checkable, follows includeSources pattern, minimal code change.
67
+
68
+ B categorizes only ~30% of workflows reliably (those with `routine-*` and `test-*` prefixes). C adds compiler complexity for a problem A already solves.
69
+
70
+ **Implementation shape for A:**
71
+ 1. `spec/workflow-categories.json` — `{ categories: [...], workflows: { workflowId: { category, hidden? } } }`
72
+ 2. `V2ListWorkflowsInput`: add `category?: string`
73
+ 3. `V2WorkflowListOutputSchema`: add `categorySummary?: { category, displayName, count, representatives }[]`
74
+ 4. `handleV2ListWorkflows`: when no `category`, return `categorySummary`; when `category` present, return filtered full list
75
+ 5. `validate:registry`: warn on uncategorized non-hidden workflows
76
+ 6. Token budget: summary ~500 tokens; per-category full list ~800 tokens for 3-5 workflows
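
Steps 1 to 4 above can be sketched as a single pure function. This is an assumption-laden illustration, not the actual `handleV2ListWorkflows`: `listWorkflows`, `WorkflowInfo`, and the `{ id, displayName }` element shape of `categories` are hypothetical, as are the choices to cap `representatives` at two IDs and to include hidden workflows when a category is requested explicitly.

```typescript
interface CategoryOverlay {
  categories: { id: string; displayName: string }[];
  workflows: Record<string, { category: string; hidden?: boolean }>;
}

interface WorkflowInfo {
  id: string;
  description: string;
}

interface CategorySummaryEntry {
  category: string;
  displayName: string;
  count: number;
  representatives: string[];
}

interface ListWorkflowsOutput {
  workflows?: WorkflowInfo[];
  categorySummary?: CategorySummaryEntry[];
}

function listWorkflows(
  all: readonly WorkflowInfo[],
  overlay: CategoryOverlay,
  category?: string,
): ListWorkflowsOutput {
  // With `category`: full list, filtered to that category.
  if (category !== undefined) {
    return { workflows: all.filter((w) => overlay.workflows[w.id]?.category === category) };
  }
  // Without `category`: compact summary that skips hidden workflows and empty categories.
  const categorySummary: CategorySummaryEntry[] = [];
  for (const def of overlay.categories) {
    const members = all.filter((w) => {
      const entry = overlay.workflows[w.id];
      return entry !== undefined && entry.category === def.id && entry.hidden !== true;
    });
    if (members.length === 0) continue;
    categorySummary.push({
      category: def.id,
      displayName: def.displayName,
      count: members.length,
      representatives: members.slice(0, 2).map((w) => w.id),
    });
  }
  return { categorySummary };
}
```

Keeping the branching in a pure function makes both response shapes easy to unit-test without starting the MCP server.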
77
+
78
+ **Natural taxonomy (10 categories):**
79
+
80
+ | Category | Count | Examples |
81
+ |---|---|---|
82
+ | coding | 3 | coding-task, cross-platform-code-conversion |
83
+ | review_audit | 3 | mr-review, production-readiness-audit, architecture-scalability-audit |
84
+ | investigation | 2 | bug-investigation, workflow-diagnose |
85
+ | design | 2 | ui-ux-design, wr.discovery |
86
+ | documentation | 3 | document-creation, scoped-documentation, documentation-update |
87
+ | tickets | 4 | adaptive-ticket-creation, ticket-grooming, intelligent-test-case-generation |
88
+ | learning | 4 | personal-learning-*, presentation-creation, relocation |
89
+ | routines | ~10 | all routine-* |
90
+ | authoring | 1-2 | workflow-for-workflows |
91
+ | testing | 3 | test-* (hidden from default summary) |
92
+
93
+ ## Self-Critique
94
+
95
+ **Strongest counter-argument**: two-file maintenance burden (workflow JSON + overlay). Mitigated by: validate:registry warning on uncategorized workflows makes omission loud; workflow-for-workflows can be updated to prompt for category at authoring time.
96
+
97
+ **Pivot condition**: if teams want per-workspace custom categories, A needs extension (workspace-level categories.json overlay). Defer to v2.
98
+
99
+ ## Open Questions for Main Agent
100
+
101
+ 1. Should `testing` workflows be `hidden: true` (excluded from summary) or shown in their own testing category?
102
+ 2. Should routines be surfaced in summary mode at all, or hidden by default (they're internal, not user-invoked)?
103
+ 3. Should the `categorySummary` include a short description per category (e.g., "Review code changes, audit systems") or just names + counts?
104
+ 4. What's the right `displayName` for `review_audit`? "Review & Audit"?
105
+ 5. Should `workflow-for-workflows` Phase 7 be updated to stamp the category, or is that a separate ticket?