planflow-ai 1.3.4 → 1.4.1

Files changed (90)
  1. package/.claude/commands/create-plan.md +11 -0
  2. package/.claude/commands/discovery-plan.md +12 -0
  3. package/.claude/commands/execute-plan.md +114 -23
  4. package/.claude/commands/flow.md +30 -5
  5. package/.claude/commands/resume-work.md +261 -0
  6. package/.claude/commands/review-code.md +11 -0
  7. package/.claude/commands/review-pr.md +11 -0
  8. package/.claude/resources/core/_index.md +45 -2
  9. package/.claude/resources/core/atomic-commits.md +380 -0
  10. package/.claude/resources/core/autopilot-mode.md +3 -2
  11. package/.claude/resources/core/compaction-guide.md +15 -1
  12. package/.claude/resources/core/heartbeat.md +129 -1
  13. package/.claude/resources/core/model-routing.md +6 -2
  14. package/.claude/resources/core/per-task-verification.md +362 -0
  15. package/.claude/resources/core/phase-isolation.md +192 -4
  16. package/.claude/resources/core/session-scratchpad.md +1 -0
  17. package/.claude/resources/core/wave-execution.md +329 -0
  18. package/.claude/resources/patterns/plans-patterns.md +56 -0
  19. package/.claude/resources/patterns/plans-templates.md +152 -0
  20. package/.claude/resources/skills/_index.md +8 -6
  21. package/.claude/resources/skills/create-plan-skill.md +71 -5
  22. package/.claude/resources/skills/execute-plan-skill.md +357 -12
  23. package/.claude/resources/skills/resume-work-skill.md +159 -0
  24. package/.claude/rules/core/forbidden-patterns.md +38 -0
  25. package/dist/cli/commands/init.js +1 -1
  26. package/dist/cli/commands/init.js.map +1 -1
  27. package/dist/cli/commands/state.d.ts +12 -0
  28. package/dist/cli/commands/state.d.ts.map +1 -0
  29. package/dist/cli/commands/state.js +47 -0
  30. package/dist/cli/commands/state.js.map +1 -0
  31. package/dist/cli/daemon/desktop-notifier.d.ts +16 -0
  32. package/dist/cli/daemon/desktop-notifier.d.ts.map +1 -0
  33. package/dist/cli/daemon/desktop-notifier.js +53 -0
  34. package/dist/cli/daemon/desktop-notifier.js.map +1 -0
  35. package/dist/cli/daemon/event-writer.d.ts +22 -0
  36. package/dist/cli/daemon/event-writer.d.ts.map +1 -0
  37. package/dist/cli/daemon/event-writer.js +76 -0
  38. package/dist/cli/daemon/event-writer.js.map +1 -0
  39. package/dist/cli/daemon/heartbeat-daemon.js +81 -1
  40. package/dist/cli/daemon/heartbeat-daemon.js.map +1 -1
  41. package/dist/cli/daemon/log-writer.d.ts +17 -0
  42. package/dist/cli/daemon/log-writer.d.ts.map +1 -0
  43. package/dist/cli/daemon/log-writer.js +62 -0
  44. package/dist/cli/daemon/log-writer.js.map +1 -0
  45. package/dist/cli/daemon/notification-router.d.ts +17 -0
  46. package/dist/cli/daemon/notification-router.d.ts.map +1 -0
  47. package/dist/cli/daemon/notification-router.js +35 -0
  48. package/dist/cli/daemon/notification-router.js.map +1 -0
  49. package/dist/cli/daemon/prompt-manager.d.ts +27 -0
  50. package/dist/cli/daemon/prompt-manager.d.ts.map +1 -0
  51. package/dist/cli/daemon/prompt-manager.js +107 -0
  52. package/dist/cli/daemon/prompt-manager.js.map +1 -0
  53. package/dist/cli/index.js +9 -0
  54. package/dist/cli/index.js.map +1 -1
  55. package/dist/cli/state/flowconfig-parser.d.ts +16 -0
  56. package/dist/cli/state/flowconfig-parser.d.ts.map +1 -0
  57. package/dist/cli/state/flowconfig-parser.js +166 -0
  58. package/dist/cli/state/flowconfig-parser.js.map +1 -0
  59. package/dist/cli/state/heartbeat-state.d.ts +16 -0
  60. package/dist/cli/state/heartbeat-state.d.ts.map +1 -0
  61. package/dist/cli/state/heartbeat-state.js +97 -0
  62. package/dist/cli/state/heartbeat-state.js.map +1 -0
  63. package/dist/cli/state/model-router.d.ts +8 -0
  64. package/dist/cli/state/model-router.d.ts.map +1 -0
  65. package/dist/cli/state/model-router.js +36 -0
  66. package/dist/cli/state/model-router.js.map +1 -0
  67. package/dist/cli/state/plan-parser.d.ts +16 -0
  68. package/dist/cli/state/plan-parser.d.ts.map +1 -0
  69. package/dist/cli/state/plan-parser.js +124 -0
  70. package/dist/cli/state/plan-parser.js.map +1 -0
  71. package/dist/cli/state/session-state.d.ts +21 -0
  72. package/dist/cli/state/session-state.d.ts.map +1 -0
  73. package/dist/cli/state/session-state.js +36 -0
  74. package/dist/cli/state/session-state.js.map +1 -0
  75. package/dist/cli/state/state-md-parser.d.ts +18 -0
  76. package/dist/cli/state/state-md-parser.d.ts.map +1 -0
  77. package/dist/cli/state/state-md-parser.js +222 -0
  78. package/dist/cli/state/state-md-parser.js.map +1 -0
  79. package/dist/cli/state/types.d.ts +106 -0
  80. package/dist/cli/state/types.d.ts.map +1 -0
  81. package/dist/cli/state/types.js +8 -0
  82. package/dist/cli/state/types.js.map +1 -0
  83. package/dist/cli/state/wave-calculator.d.ts +18 -0
  84. package/dist/cli/state/wave-calculator.d.ts.map +1 -0
  85. package/dist/cli/state/wave-calculator.js +134 -0
  86. package/dist/cli/state/wave-calculator.js.map +1 -0
  87. package/dist/cli/types.d.ts +15 -0
  88. package/dist/cli/types.d.ts.map +1 -1
  89. package/package.json +4 -2
  90. package/templates/shared/CLAUDE.md.template +4 -0
@@ -0,0 +1,362 @@
1
+
2
+ # Per-Task Verification
3
+
4
+ ## Purpose
5
+
6
+ When a plan phase includes tasks with `<verify>` tags, the phase isolation sub-agent runs **targeted verification immediately after each task completes**. If verification fails, a nested debug sub-agent diagnoses the failure and the implementation sub-agent applies repairs. This catches errors at the task level instead of waiting for the final build+test step.
7
+
8
+ **Core principle**: Verify early, diagnose fast, repair in place.
9
+
10
+ ---
11
+
12
+ ## Architecture
13
+
14
+ ```
15
+ Phase Sub-Agent (isolated)
16
+
17
+ ├─ Task 1: Implement
18
+ │ ├─ Complete task implementation
19
+ │ ├─ Parse <verify> tag → extract command
20
+ │ ├─ Run verification command
21
+ │ ├─ ✅ Pass → record result, move to Task 2
22
+ │ └─ ❌ Fail → enter verification loop:
23
+ │ │
24
+ │ ├─ Spawn debug sub-agent (haiku):
25
+ │ │ Input: error output + task context + file content
26
+ │ │ Output: JSON diagnosis (root cause, repair actions)
27
+ │ │
28
+ │ ├─ Apply repair actions
29
+ │ ├─ Re-run verification command
30
+ │ ├─ ✅ Pass → record result (with repair info), move to Task 2
31
+ │ ├─ ❌ Fail → retry (up to max_verify_retries)
32
+ │ └─ ❌ Max retries exceeded → record failure, escalate to user
33
+
34
+ ├─ Task 2: Implement (no <verify> tag → skip verification)
35
+
36
+ ├─ Task 3: Implement
37
+ │ ├─ Complete task implementation
38
+ │ ├─ Parse <verify> tag → extract command
39
+ │ └─ Run verification → ✅ Pass
40
+
41
+ └─ Return JSON (includes task_verifications array)
42
+ ```
43
+
44
+ Verification is **internal to the phase sub-agent**. The wave coordinator and main session see only the final JSON return with verification results — they never interact with the verification loop directly.
45
+
46
+ ---
47
+
48
+ ## Verify Tag Syntax
49
+
50
+ ### Declaration in Plans
51
+
52
+ Tasks in a plan phase can include an optional `<verify>` tag indented under the task:
53
+
54
+ ```markdown
55
+ ### Phase 2: API Integration
56
+
57
+ **Scope**: ...
58
+ **Complexity**: 5/10
59
+ **Dependencies**: Phase 1
60
+
61
+ - [ ] Create user authentication middleware in `src/middleware/auth.ts`
62
+ <verify>npx tsc --noEmit src/middleware/auth.ts</verify>
63
+ - [ ] Add rate limiting to API routes
64
+ <verify>npx jest src/middleware/__tests__/rate-limit.test.ts --no-coverage</verify>
65
+ - [ ] Update configuration constants
66
+ ```
67
+
68
+ ### Parsing Rules
69
+
70
+ 1. **Tag format**: `<verify>COMMAND</verify>` on a single line, indented under a task
71
+ 2. **One verify per task**: Only the first `<verify>` tag under a task is used; additional tags are ignored
72
+ 3. **Command content**: The text between tags is executed as a shell command by the sub-agent
73
+ 4. **No verify = no verification**: Tasks without `<verify>` tags skip verification entirely (backward compatible)
74
+ 5. **Whitespace**: Leading/trailing whitespace inside the tag is trimmed
75
+ 6. **Nesting**: The `<verify>` tag must be indented under its parent task (2+ spaces or 1+ tab)
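
The parsing rules above can be sketched as a small line scanner. This is a minimal illustration, not the package's parser: the `PlanTask` shape, the `parsePlanTasks` name, and both regular expressions are assumptions made for the example.

```typescript
// Hypothetical shape for a parsed plan task; not taken from the package source.
interface PlanTask {
  description: string;
  verifyCommand?: string;
}

// A checkbox task line such as "- [ ] Add rate limiting to API routes".
const TASK_LINE = /^- \[[ xX]\] (.+)$/;
// Rule 6: the tag must be indented by 2+ spaces or 1+ tab under its task.
const VERIFY_LINE = /^(?: {2,}|\t+)\s*<verify>(.*)<\/verify>\s*$/;

function parsePlanTasks(markdown: string): PlanTask[] {
  const tasks: PlanTask[] = [];
  let current: PlanTask | undefined;
  for (const line of markdown.split(/\r?\n/)) {
    const taskMatch = TASK_LINE.exec(line);
    if (taskMatch) {
      current = { description: (taskMatch[1] ?? "").trim() };
      tasks.push(current);
      continue;
    }
    const verifyMatch = VERIFY_LINE.exec(line);
    if (verifyMatch) {
      // Rule 2: only the first tag under a task is used.
      if (current && current.verifyCommand === undefined) {
        // Rule 5: trim whitespace inside the tag.
        const command = (verifyMatch[1] ?? "").trim();
        // Empty tags are ignored here; the source does not say how they are handled.
        if (command.length > 0) current.verifyCommand = command;
      }
      continue;
    }
    // Assumption: an unindented, non-blank line (heading, paragraph) ends the task.
    if (line.trim() !== "" && !/^\s/.test(line)) current = undefined;
  }
  return tasks;
}
```

Tasks that come back without a `verifyCommand` would simply skip verification, which matches rule 4.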
76
+
77
+ ### Recommended Verification Commands
78
+
79
+ | Task Type | Verify Command | Purpose |
80
+ |-----------|---------------|---------|
81
+ | File creation (TypeScript) | `npx tsc --noEmit <file>` | Type-check the new file |
82
+ | Test writing | `npx jest <test-file> --no-coverage` | Run the specific test |
83
+ | Schema/type changes | `npx tsc --noEmit <type-file>` | Verify type consistency |
84
+ | Config changes | *(no verify)* | Manual review preferred |
85
+ | Documentation | *(no verify)* | No automated check available |
86
+
87
+ **Constraint**: Verification commands must be **targeted** (single file or small scope). Never use full builds (`npm run build`) or full test suites (`npm test`) as verify commands — those run in the final Step 7.
88
+
89
+ ---
90
+
91
+ ## Debug Sub-Agent
92
+
93
+ ### When to Spawn
94
+
95
+ A debug sub-agent is spawned when a verification command returns a **non-zero exit code**. The sub-agent diagnoses the failure and suggests repair actions.
96
+
97
+ ### Prompt Template
98
+
99
+ ```markdown
100
+ # Debug Diagnosis
101
+
102
+ ## Failed Verification
103
+ **Task**: {task description}
104
+ **Command**: {verify command}
105
+ **Exit Code**: {exit code}
106
+
107
+ ## Error Output
108
+ ```
109
+ {stderr + stdout from the failed command, truncated to 200 lines}
110
+ ```
111
+
112
+ ## Task Context
113
+ **File**: {primary file being modified}
114
+ **Phase**: {phase name}
115
+ **What was implemented**: {brief description of what the task did}
116
+
117
+ ## File Content
118
+ ```
119
+ {content of the primary file, truncated to 300 lines}
120
+ ```
121
+
122
+ ## Instructions
123
+ Analyze the error output and diagnose the root cause. Return a JSON object with your diagnosis and suggested repair actions. Do NOT fix the code — only diagnose.
124
+
125
+ Return ONLY a JSON object (no markdown fences):
126
+ {see Debug Return Schema below}
127
+ ```
128
+
129
+ ### Debug Return Schema
130
+
131
+ ```json
132
+ {
133
+ "root_cause": "Missing import for AuthMiddleware type used on line 15",
134
+ "category": "import_missing",
135
+ "repair_actions": [
136
+ "Add import { AuthMiddleware } from '../types/auth' to src/middleware/auth.ts"
137
+ ],
138
+ "confidence": "high",
139
+ "file_to_fix": "src/middleware/auth.ts"
140
+ }
141
+ ```
142
+
143
+ ### Field Descriptions
144
+
145
+ | Field | Type | Required | Description |
146
+ |-------|------|----------|-------------|
147
+ | `root_cause` | string | Yes | Human-readable description of what went wrong |
148
+ | `category` | string | Yes | Error category: `import_missing`, `type_error`, `syntax_error`, `runtime_error`, `test_failure`, `config_error`, `other` |
149
+ | `repair_actions` | string[] | Yes | Ordered list of specific actions to fix the issue |
150
+ | `confidence` | `"high" \| "medium" \| "low"` | Yes | How confident the diagnosis is |
151
+ | `file_to_fix` | string | Yes | Primary file that needs modification |
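
Because the debug sub-agent returns raw JSON, the caller may want to validate it before applying repairs. The sketch below mirrors the field table above; the type names and the `parseDiagnosis` helper are illustrative assumptions, and returning `null` on invalid input is one possible convention.

```typescript
type DiagnosisCategory =
  | "import_missing" | "type_error" | "syntax_error" | "runtime_error"
  | "test_failure" | "config_error" | "other";
type DiagnosisConfidence = "high" | "medium" | "low";

interface DebugDiagnosis {
  root_cause: string;
  category: DiagnosisCategory;
  repair_actions: string[];
  confidence: DiagnosisConfidence;
  file_to_fix: string;
}

const CATEGORIES: readonly string[] = [
  "import_missing", "type_error", "syntax_error", "runtime_error",
  "test_failure", "config_error", "other",
];
const CONFIDENCE_LEVELS: readonly string[] = ["high", "medium", "low"];

// Returns the diagnosis when every required field is present and well-typed,
// otherwise null (hypothetical helper, not part of the package).
function parseDiagnosis(raw: string): DebugDiagnosis | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof value !== "object" || value === null) return null;
  const { root_cause, category, repair_actions, confidence, file_to_fix } =
    value as Record<string, unknown>;
  const valid =
    typeof root_cause === "string" &&
    typeof category === "string" && CATEGORIES.includes(category) &&
    Array.isArray(repair_actions) &&
    repair_actions.every((action) => typeof action === "string") &&
    typeof confidence === "string" && CONFIDENCE_LEVELS.includes(confidence) &&
    typeof file_to_fix === "string";
  return valid ? (value as DebugDiagnosis) : null;
}
```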
152
+
153
+ ### Sub-Agent Configuration
154
+
155
+ - **Model**: Always uses haiku (fast tier) — diagnosis is a focused, low-complexity task
156
+ - **Mode**: `"auto"`
157
+ - **Read-only**: The debug sub-agent does NOT modify files — it only returns a diagnosis. The implementation sub-agent applies the repairs.
158
+
159
+ ---
160
+
161
+ ## Verification Loop
162
+
163
+ ### Flow
164
+
165
+ ```
166
+ 1. Complete task implementation
167
+ 2. Parse <verify> tag → extract command
168
+ 3. Run command
169
+ 4. If exit code == 0 → PASS (record result, continue)
170
+ 5. If exit code != 0:
171
+ a. Increment retry counter
172
+ b. If retry counter > max_verify_retries → ESCALATE (record failure)
173
+ c. Spawn debug sub-agent with error context
174
+ d. Receive JSON diagnosis
175
+ e. Apply repair actions from diagnosis
176
+ f. Re-run verification command → go to step 4
177
+ ```
178
+
179
+ ### Retry Behavior
180
+
181
+ - **Retry counter**: Starts at 0, increments on each failed verification attempt
182
+ - **First attempt**: The initial verification run does NOT count as a retry
183
+ - **Max retries**: Controlled by `max_verify_retries` in `.flowconfig` (default: 2)
184
+ - **Example with default**: Initial attempt + 2 retries = 3 total verification runs maximum
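
The flow and retry accounting above can be sketched as a loop with injected callbacks. This is a simplified model under stated assumptions: it is synchronous for brevity, the `LoopDeps` callbacks are hypothetical stand-ins for running the command, spawning the debug sub-agent, and editing files, and an unusable diagnosis is modelled as consuming a retry without a re-run, which is one reading of the error-handling table.

```typescript
interface VerifyRun {
  exitCode: number;
  output: string;
}

// Hypothetical injection points; the real sub-agent does this work itself.
interface LoopDeps {
  runVerify: () => VerifyRun;
  diagnose: (output: string) => { repair_actions: string[] } | null;
  applyRepair: (action: string) => void;
}

interface LoopResult {
  status: "pass" | "fail";
  attempts: number;
  repairs_applied: string[];
}

function verifyWithRetries(deps: LoopDeps, maxVerifyRetries = 2): LoopResult {
  const repairs: string[] = [];
  let retries = 0;
  let attempts = 1; // the initial run is attempt 1, not a retry
  let run = deps.runVerify();
  while (run.exitCode !== 0) {
    retries += 1;
    if (retries > maxVerifyRetries) {
      return { status: "fail", attempts, repairs_applied: repairs };
    }
    const diagnosis = deps.diagnose(run.output);
    if (diagnosis === null) continue; // unusable diagnosis: retry is spent, no re-run
    for (const action of diagnosis.repair_actions) {
      deps.applyRepair(action);
      repairs.push(action);
    }
    run = deps.runVerify();
    attempts += 1;
  }
  return { status: "pass", attempts, repairs_applied: repairs };
}
```

With the default of 2, a command that never passes is run three times in total before the loop reports `fail`.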
185
+
186
+ ### Escalation on Max Retries
187
+
188
+ When max retries are exceeded, the sub-agent:
189
+
190
+ 1. Records the verification failure in the `task_verifications` array
191
+ 2. Includes the last debug diagnosis in the failure record
192
+ 3. Continues to the next task (does NOT abort the phase)
193
+ 4. Sets overall phase `status` to `"partial"` if any task verification failed
194
+
195
+ The coordinator presents the failure to the user with the accumulated diagnosis:
196
+
197
+ ```markdown
198
+ ⚠️ Task verification failed after 2 retries:
199
+
200
+ **Task**: Create user authentication middleware in `src/middleware/auth.ts`
201
+ **Command**: `npx tsc --noEmit src/middleware/auth.ts`
202
+ **Last diagnosis**: Missing type export from @auth/core — dependency may need updating
203
+ **Category**: import_missing
204
+
205
+ Options:
206
+ 1. Continue with remaining phases (issue noted)
207
+ 2. Stop and fix manually
208
+ ```
209
+
210
+ ---
211
+
212
+ ## Task Verifications Return Field
213
+
214
+ ### Schema Extension
215
+
216
+ The phase isolation JSON return format is extended with an optional `task_verifications` array:
217
+
218
+ ```json
219
+ {
220
+ "status": "success",
221
+ "phase": "Phase 2: API Integration",
222
+ "summary": "Implemented auth middleware and rate limiting. One task required type import repair.",
223
+ "files_created": ["src/middleware/auth.ts"],
224
+ "files_modified": ["src/api/routes.ts"],
225
+ "decisions": [],
226
+ "deviations": [],
227
+ "errors": [],
228
+ "patterns_captured": [],
229
+ "task_verifications": [
230
+ {
231
+ "task": "Create user authentication middleware",
232
+ "verify_command": "npx tsc --noEmit src/middleware/auth.ts",
233
+ "status": "pass",
234
+ "attempts": 2,
235
+ "repairs_applied": [
236
+ "Added missing import for AuthMiddleware type"
237
+ ]
238
+ },
239
+ {
240
+ "task": "Add rate limiting to API routes",
241
+ "verify_command": "npx jest src/middleware/__tests__/rate-limit.test.ts --no-coverage",
242
+ "status": "pass",
243
+ "attempts": 1,
244
+ "repairs_applied": []
245
+ }
246
+ ]
247
+ }
248
+ ```
249
+
250
+ ### Task Verification Entry Fields
251
+
252
+ | Field | Type | Required | Description |
253
+ |-------|------|----------|-------------|
254
+ | `task` | string | Yes | Short description of the task (from plan) |
255
+ | `verify_command` | string | Yes | The verification command that was run |
256
+ | `status` | `"pass" \| "fail"` | Yes | Final verification outcome |
257
+ | `attempts` | number | Yes | Total verification attempts (1 = passed first try) |
258
+ | `repairs_applied` | string[] | Yes | List of repairs applied during retries (empty if passed first try) |
259
+ | `last_diagnosis` | object | No | Last debug sub-agent diagnosis (only present when `status: "fail"`) |
260
+
261
+ ### When `task_verifications` is Omitted
262
+
263
+ - If a phase has **no tasks with `<verify>` tags**, the `task_verifications` field is omitted entirely from the JSON return
264
+ - This maintains backward compatibility — existing phase isolation returns are unchanged
265
+
266
+ ---
267
+
268
+ ## Configuration
269
+
270
+ ### `.flowconfig` Setting
271
+
272
+ ```yaml
273
+ max_verify_retries: 2 # Max repair attempts per task verification (default: 2, range: 1-5)
274
+ ```
275
+
276
+ - **Default**: `2` (initial attempt + 2 retries = 3 total runs)
277
+ - **Range**: `1` to `5`
278
+ - **Values below 1 or above 5**: Clamped to the valid range with a warning
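
The clamping rule can be illustrated with a small helper. The function name and warning text are assumptions; falling back to the default for non-integer input is also an assumption, since the source only describes out-of-range values.

```typescript
// Hypothetical helper: clamp max_verify_retries to the documented 1-5 range.
function clampVerifyRetries(
  value: number,
  warn: (message: string) => void = (message) => console.warn(message),
): number {
  const MIN = 1;
  const MAX = 5;
  const DEFAULT = 2;
  if (!Number.isInteger(value)) {
    // Assumption: the source does not cover non-integer values.
    warn(`max_verify_retries=${value} is not an integer; using default ${DEFAULT}`);
    return DEFAULT;
  }
  const clamped = Math.min(MAX, Math.max(MIN, value));
  if (clamped !== value) {
    warn(`max_verify_retries=${value} is outside ${MIN}-${MAX}; using ${clamped}`);
  }
  return clamped;
}
```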
279
+
280
+ ### Set via `/flow`
281
+
282
+ ```
283
+ /flow max_verify_retries=3
284
+ ```
285
+
286
+ ### No Feature Toggle
287
+
288
+ Per-task verification has no on/off toggle. It activates automatically when tasks include `<verify>` tags. Plans without `<verify>` tags behave exactly as before — fully backward compatible.
289
+
290
+ ---
291
+
292
+ ## Error Handling
293
+
294
+ ### Verification Command Errors
295
+
296
+ | Scenario | Behavior |
297
+ |----------|----------|
298
+ | Command not found | Treat as verification failure, spawn debug sub-agent |
299
+ | Command timeout (>30s) | Kill process, treat as failure, include timeout in error output |
300
+ | Command produces no output | Treat exit code as sole indicator (0 = pass, non-zero = fail) |
301
+ | Command produces large output | Truncate to 200 lines before passing to debug sub-agent |
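
Truncating large output before it reaches the debug sub-agent might look like the following. Keeping the first lines (rather than the last) and appending a marker line are assumptions; the source only gives the 200-line cap.

```typescript
// Hypothetical helper: cap text at maxLines, noting how much was dropped.
function truncateLines(text: string, maxLines: number): string {
  const lines = text.split(/\r?\n/);
  if (lines.length <= maxLines) return text;
  const omitted = lines.length - maxLines;
  // Assumption: keep the head of the output and add one marker line.
  return lines.slice(0, maxLines).join("\n") + `\n[... ${omitted} more lines truncated]`;
}
```

The same helper could be reused with a limit of 300 for the file content passed in the prompt.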
302
+
303
+ ### Debug Sub-Agent Errors
304
+
305
+ | Scenario | Behavior |
306
+ |----------|----------|
307
+ | Invalid JSON return | Skip this retry, count as failed attempt |
308
+ | Sub-agent timeout | Skip this retry, count as failed attempt |
309
+ | Empty `repair_actions` | Skip repair, re-run verification (may pass if issue was transient) |
310
+
311
+ ### Phase-Level Impact
312
+
313
+ | Verification Outcome | Phase Status |
314
+ |----------------------|-------------|
315
+ | All verifications pass | `"success"` (no change) |
316
+ | Some verifications fail (max retries exceeded) | `"partial"` |
317
+ | Task implementation itself fails | `"failure"` (existing behavior, unrelated to verification) |
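
The mapping in the table above reduces to a short function. The names are illustrative; only the three status strings come from the source.

```typescript
type PhaseStatus = "success" | "partial" | "failure";

interface VerificationOutcome {
  status: "pass" | "fail";
}

// Hypothetical helper: derive the phase status from verification outcomes.
function derivePhaseStatus(
  implementationFailed: boolean,
  verifications: VerificationOutcome[] = [],
): PhaseStatus {
  // An implementation failure is independent of verification results.
  if (implementationFailed) return "failure";
  return verifications.some((entry) => entry.status === "fail") ? "partial" : "success";
}
```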
318
+
319
+ ---
320
+
321
+ ## Interaction with Wave Mode
322
+
323
+ Per-task verification is **entirely internal** to each phase sub-agent. The wave coordinator:
324
+
325
+ - Does NOT know about individual task verifications during execution
326
+ - Receives `task_verifications` in the JSON return after the sub-agent completes
327
+ - Displays verification stats in the wave completion summary
328
+ - Does NOT retry phases based on verification failures (that is internal to the sub-agent)
329
+
330
+ ```
331
+ Wave 1: Phase 1 (2 tasks verified: 2 pass), Phase 2 (3 tasks verified: 2 pass, 1 fail after 2 retries)
332
+ ```
333
+
334
+ Wave execution treats a phase with failed verifications as `"partial"` — the same way it handles any partial result. The user decides whether to continue.
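
A coordinator could build the per-phase summary fragments shown above from `task_verifications`. In this sketch the function name is hypothetical, and the retry count is assumed to be `attempts - 1`, following the convention that 1 attempt means a first-try pass.

```typescript
interface VerificationEntry {
  status: "pass" | "fail";
  attempts: number;
}

// Hypothetical helper: format one phase's verification stats for the wave summary.
function summarizePhase(label: string, entries: VerificationEntry[]): string {
  const passed = entries.filter((entry) => entry.status === "pass").length;
  const failed = entries.filter((entry) => entry.status === "fail");
  const parts = [`${passed} pass`];
  if (failed.length > 0) {
    // Assumption: retries = attempts - 1; report the largest count among failures.
    const retries = Math.max(...failed.map((entry) => entry.attempts - 1));
    parts.push(`${failed.length} fail after ${retries} retries`);
  }
  return `${label} (${entries.length} tasks verified: ${parts.join(", ")})`;
}
```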
335
+
336
+ ---
337
+
338
+ ## Rules
339
+
340
+ 1. **Verify is optional** — tasks without `<verify>` tags skip verification entirely
341
+ 2. **Targeted commands only** — never use full builds or full test suites as verify commands
342
+ 3. **Debug sub-agent is read-only** — it diagnoses but never modifies files
343
+ 4. **Implementation sub-agent repairs** — only the phase sub-agent applies fixes
344
+ 5. **Continue on failure** — failed verification does NOT abort the phase; it records the failure and continues to the next task
345
+ 6. **Max retries are hard** — once exceeded, escalate to user; never increase retries dynamically
346
+ 7. **First attempt is not a retry** — the initial verification run is attempt 1, retries start at attempt 2
347
+ 8. **Truncate large output** — cap error output at 200 lines and file content at 300 lines for debug sub-agent
348
+ 9. **Backward compatible** — phases without any `<verify>` tags produce no `task_verifications` field
349
+ 10. **Wave-transparent** — wave coordinator sees only final results; verification loops are internal to phase sub-agents
350
+
351
+ ---
352
+
353
+ ## Related Files
354
+
355
+ | File | Purpose |
356
+ |------|---------|
357
+ | `.claude/resources/core/phase-isolation.md` | Sub-agent context template and JSON return format (extended by this feature) |
358
+ | `.claude/resources/core/wave-execution.md` | Wave coordinator behavior (verification is internal to sub-agents) |
359
+ | `.claude/resources/core/model-routing.md` | Model tier selection (debug sub-agent always uses haiku) |
360
+ | `.claude/resources/skills/execute-plan-skill.md` | Execute-plan skill with verification result display |
361
+ | `.claude/resources/skills/create-plan-skill.md` | Auto-generation of `<verify>` sections in plans |
362
+ | `.claude/resources/patterns/plans-templates.md` | Plan template with `<verify>` tag syntax |
@@ -7,10 +7,16 @@ When `phase_isolation: true` in `flow/.flowconfig` (default), each `/execute-pla
7
7
 
8
8
  **Core principle**: Clean context in, structured summary out.
9
9
 
10
+ ### Per-Task Verification
11
+
12
+ Phase sub-agents support **per-task verification** when plan tasks include `<verify>` tags. After completing each task, the sub-agent runs the verification command and, on failure, spawns a debug sub-agent (haiku) for diagnosis and repair. See `.claude/resources/core/per-task-verification.md` for the complete verification system, debug sub-agent prompt template, JSON schemas, and configuration.
13
+
10
14
  ---
11
15
 
12
16
  ## Architecture
13
17
 
18
+ ### Sequential Mode (default)
19
+
14
20
  ```
15
21
  Coordinator (main session)
16
22
 
@@ -35,6 +41,40 @@ Coordinator (main session)
35
41
  └─ Next phase...
36
42
  ```
37
43
 
44
+ ### Wave Mode (when `wave_execution: true`)
45
+
46
+ ```
47
+ Coordinator (main session)
48
+
49
+ ├─ For each Wave:
50
+ │ │
51
+ │ ├─ Approve each phase sequentially in Plan Mode
52
+ │ │
53
+ │ ├─ Prepare isolated context for EACH phase in the wave
54
+ │ │
55
+ │ ├─ Spawn MULTIPLE Agent sub-agents IN PARALLEL:
56
+ │ │ ├─► Agent: Phase A (model: [tier_A], prompt: phase_A_context)
57
+ │ │ ├─► Agent: Phase B (model: [tier_B], prompt: phase_B_context)
58
+ │ │ └─► Agent: Phase C (model: [tier_C], prompt: phase_C_context)
59
+ │ │
60
+ │ ├─ Wait for ALL sub-agents to complete
61
+ │ │
62
+ │ ├─ Collect JSON returns from all sub-agents
63
+ │ │
64
+ │ ├─ Post-wave processing (sequential, in phase order):
65
+ │ │ ├─ Detect file conflicts (files_modified overlap)
66
+ │ │ ├─ Process each phase result
67
+ │ │ ├─ Update plan file (mark tasks [x])
68
+ │ │ ├─ Accumulate files_modified list
69
+ │ │ ├─ Buffer patterns from all phases
70
+ │ │ ├─ Git commit sequentially (Phase A, then B, then C)
71
+ │ │ └─ Handle failures (present to user)
72
+ │ │
73
+ │ └─ Next Wave...
74
+
75
+ └─ Completion summary with wave execution stats
76
+ ```
77
+
38
78
  Planning and user approval always happen in the **main session** (full context). Only the **implementation step** is isolated.
39
79
 
40
80
  ---
@@ -68,11 +108,37 @@ Read these files before implementing:
68
108
  {Only if UI phase — include design tokens from discovery doc}
69
109
  {Otherwise omit this section entirely}
70
110
 
111
+ ## Commit Instructions
112
+ {Only include this section when `commit: true` in `.flowconfig`}
113
+
114
+ ### Sequential Mode (wave_execution: false)
115
+ - After each task completes and verification passes (if applicable):
116
+ 1. Stage changed files: `git add -A`
117
+ 2. Create atomic commit: `git commit -m "feat(phase-N.task-M): <truncated description> — <feature>"`
118
+ - Use format: feat(phase-{phase_number}.task-{task_number_in_phase}): <description> — <feature>
119
+ - Truncate description to 50 chars (use ellipsis if truncated)
120
+ - Task numbers are 1-indexed within each phase
121
+ 3. Continue to next task
122
+ - Return `tasks_completed` array in JSON with files_created/files_modified per task
123
+ - Do NOT create a final phase commit (coordinator will not create one either)
124
+
125
+ ### Wave Mode (wave_execution: true)
126
+ - Do NOT create any commits during task implementation
127
+ - The coordinator will commit your changes after this wave completes
128
+ - Return `tasks_completed` array with per-task file lists (see Return Format below)
129
+ - Coordinator will iterate tasks and commit: feat(phase-N.task-M): ... per task
130
+
71
131
  ## Instructions
72
132
  1. Read the plan file to understand the full feature context
73
133
  2. Implement all tasks listed above
74
134
  3. Follow the project patterns from the files listed
75
- 4. Return a JSON summary in the exact format below do NOT return markdown
135
+ 4. After completing each task, check if it has a `<verify>` tag indented beneath it:
136
+ - If yes: run the verification command inside the tag
137
+ - If the command exits 0: record a pass result and continue to the next task
138
+ - If the command exits non-zero: spawn a debug sub-agent (haiku, mode: "auto") with the error output, task context, and file content. Apply the repair actions from the diagnosis and re-run the verification command (up to `max_verify_retries` retries after the initial run, default 2). See `.claude/resources/core/per-task-verification.md` for the debug sub-agent prompt template and return schema.
139
+ - If max retries exceeded: record a fail result and continue to the next task (do NOT abort)
140
+ - If no `<verify>` tag: skip verification for that task
141
+ 5. Return a JSON summary in the exact format below — do NOT return markdown
76
142
 
77
143
  ## Return Format
78
144
  Return ONLY a JSON object (no markdown fences, no explanation):
@@ -83,6 +149,14 @@ Return ONLY a JSON object (no markdown fences, no explanation):
83
149
 
84
150
  The prompt should be **under 2K tokens** (excluding files the sub-agent reads itself via the Read tool). Keep it focused — the sub-agent will read project files as needed during implementation.
85
151
 
152
+ ### Wave Mode Context Additions
153
+
154
+ When spawning sub-agents within a wave, the context template is **identical** to sequential mode with one key difference:
155
+
156
+ - **`Files Modified in Previous Phases`**: Include files from ALL completed waves (Wave 1 through Wave N-1), not just the immediately preceding phase. This gives each sub-agent awareness of everything that changed before the current wave.
157
+
158
+ Sub-agents within the same wave do NOT receive information about each other — no cross-phase awareness. Each sub-agent operates as if it is the only phase running.
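
Building the `Files Modified in Previous Phases` list for a wave could be sketched as follows. The `PhaseFiles` shape reuses the `files_created` and `files_modified` field names from the return format; the function name and the deduplication step are assumptions.

```typescript
interface PhaseFiles {
  files_created: string[];
  files_modified: string[];
}

// Hypothetical helper: collect files from every phase in waves 1..N-1.
// Results from the current wave are deliberately not passed in, so sibling
// phases never see each other's changes.
function filesFromCompletedWaves(completedWaves: PhaseFiles[][]): string[] {
  const seen = new Set<string>();
  for (const wave of completedWaves) {
    for (const phase of wave) {
      for (const file of phase.files_created.concat(phase.files_modified)) {
        seen.add(file); // Set keeps first-seen order and drops duplicates
      }
    }
  }
  return Array.from(seen);
}
```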
159
+
86
160
  ---
87
161
 
88
162
  ## Return Format Schema
@@ -115,6 +189,38 @@ The sub-agent must return a JSON object with this structure:
115
189
  "description": "All Zod schemas live next to their type definitions",
116
190
  "confidence": "high"
117
191
  }
192
+ ],
193
+ "task_verifications": [
194
+ {
195
+ "task": "Create user authentication middleware",
196
+ "verify_command": "npx tsc --noEmit src/middleware/auth.ts",
197
+ "status": "pass",
198
+ "attempts": 2,
199
+ "repairs_applied": [
200
+ "Added missing import for AuthMiddleware type"
201
+ ]
202
+ },
203
+ {
204
+ "task": "Add rate limiting to API routes",
205
+ "verify_command": "npx jest src/middleware/__tests__/rate-limit.test.ts --no-coverage",
206
+ "status": "pass",
207
+ "attempts": 1,
208
+ "repairs_applied": []
209
+ }
210
+ ],
211
+ "tasks_completed": [
212
+ {
213
+ "task_number": 1,
214
+ "task_name": "Create user authentication middleware",
215
+ "files_created": ["src/middleware/auth.ts"],
216
+ "files_modified": []
217
+ },
218
+ {
219
+ "task_number": 2,
220
+ "task_name": "Add rate limiting to API routes",
221
+ "files_created": ["src/middleware/rate-limit.ts"],
222
+ "files_modified": ["src/api/routes.ts"]
223
+ }
118
224
  ]
119
225
  }
120
226
  ```
@@ -132,6 +238,8 @@ The sub-agent must return a JSON object with this structure:
132
238
  | `deviations` | string[] | No | Tasks skipped or changed from plan |
133
239
  | `errors` | string[] | No | Errors encountered (even if resolved) |
134
240
  | `patterns_captured` | object[] | No | Patterns observed during implementation |
241
+ | `task_verifications` | object[] | No | Array of per-task verification results. Only present when at least one task had a `<verify>` tag. Each entry contains: `task` (string), `verify_command` (string), `status` (`"pass" \| "fail"`), `attempts` (number), `repairs_applied` (string[]), and optionally `last_diagnosis` (object, only when status is `"fail"`). See `.claude/resources/core/per-task-verification.md` for full schema. |
242
+ | `tasks_completed` | object[] | No | Array of per-task file tracking for atomic commits. Each entry: `task_number` (number, 1-indexed within phase), `task_name` (string), `files_created` (string[]), `files_modified` (string[]). Present when any tasks ran. Used by coordinator for per-task commit messages. See `.claude/resources/core/atomic-commits.md` for full schema. |
135
243
 
136
244
  ### Failure Return Example
137
245
 
@@ -163,9 +271,15 @@ After receiving the sub-agent's JSON summary, the coordinator:
163
271
  1. **Update plan file**: Mark all phase tasks as `[x]`
164
272
  2. **Accumulate file list**: Merge `files_created` and `files_modified` into running list
165
273
  3. **Buffer patterns**: Append `patterns_captured` entries to `flow/resources/pending-patterns.md`
166
- 4. **Git commit**: If `commit: true`, run `git add -A && git commit -m "Phase N: {name} — {feature}"`
274
+ 4. **Git commit (per-task)**: If `commit: true` and `tasks_completed` is present:
275
+ - **Sequential mode**: Sub-agent already created per-task commits — verify they exist, do NOT create phase commit
276
+ - **Wave mode**: Coordinator iterates `tasks_completed` in task_number order and creates per-task commits:
277
+ - For each task: `git add -A && git commit -m "feat(phase-N.task-M): <truncated task_name> — <feature>"`
278
+ - Truncate `task_name` to 50 chars if needed
279
+ - **Fallback**: If `tasks_completed` is absent (legacy sub-agent), fall back to single phase commit: `git add -A && git commit -m "Phase N: {name} — {feature}"`
167
280
  5. **Log decisions**: Include `decisions` in phase completion message
168
- 6. **Proceed**: Move to next phase
281
+ 6. **Display verification results**: If `task_verifications` is present, show pass/fail counts and any repairs applied
282
+ 7. **Proceed**: Move to next phase
169
283
 
170
284
  ### On Failure (`status: "failure"`)
171
285
 
@@ -178,7 +292,49 @@ After receiving the sub-agent's JSON summary, the coordinator:
178
292
 
179
293
  1. **Present summary**: Show what was completed and what wasn't
180
294
  2. **Show deviations**: List `deviations` explaining what was skipped
181
- 3. **Ask user**: "Phase partially complete. Continue to next phase or retry remaining tasks?"
295
+ 3. **Display verification failures**: If `task_verifications` contains failed entries, show task name, last diagnosis, and repair attempts
296
+ 4. **Ask user**: "Phase partially complete. Continue to next phase or retry remaining tasks?"
297
+
298
+ ---
299
+
300
+ ## Wave Coordinator Processing
301
+
302
+ When multiple sub-agents return simultaneously from a wave, the coordinator handles them differently from sequential mode. See `.claude/resources/core/wave-execution.md` for the full wave system and `.claude/resources/skills/execute-plan-skill.md` Step 4c for the detailed processing flow.
303
+
304
+ **Per-task commits in wave mode**: After collecting all JSON returns from a wave, the coordinator commits per-task (not per-phase). For each phase in phase-number order, iterate `tasks_completed` and create atomic commits: `feat(phase-N.task-M): <desc> — <feature>`. See `.claude/resources/core/atomic-commits.md` for the complete commit format and coordinator processing rules.
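
The `feat(phase-N.task-M)` message format can be produced with a small formatter. Both function names are hypothetical, and the sketch assumes the 50-character limit includes a three-dot ellipsis, which the source does not specify.

```typescript
// Hypothetical helper: cap a description at `max` characters, ellipsis included.
function truncateDescription(description: string, max = 50): string {
  if (description.length <= max) return description;
  // Assumption: the ellipsis counts toward the limit.
  return description.slice(0, max - 3).replace(/\s+$/, "") + "...";
}

// Builds "feat(phase-N.task-M): <description> \u2014 <feature>".
function taskCommitMessage(
  phase: number,
  task: number, // 1-indexed within the phase
  description: string,
  feature: string,
): string {
  return `feat(phase-${phase}.task-${task}): ${truncateDescription(description)} \u2014 ${feature}`;
}
```

For example, `taskCommitMessage(2, 1, "Add rate limiting to API routes", "api-hardening")` yields a message scoped to phase 2, task 1, where `api-hardening` is a made-up feature name.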
305
+
306
+ ### Collecting Multiple JSON Returns
307
+
308
+ After all sub-agents in a wave complete, the coordinator collects all JSON returns before processing any of them. This allows file conflict detection before committing.
309
+
310
+ ### Processing Order
311
+
312
+ Results are always processed **sequentially in phase number order**, regardless of which sub-agent finished first. This ensures:
313
+ - Deterministic commit history (Phase A committed before Phase B)
314
+ - Predictable plan file updates
315
+ - Consistent file accumulation order
316
+
317
+ ### File Conflict Detection
318
+
319
+ After collecting all wave results, check for `files_modified` overlap between phases:
320
+
321
+ ```
322
+ For each pair of phases (A, B) in the wave:
323
+   overlap = A.files_modified ∩ B.files_modified
324
+   if overlap is not empty:
325
+     → File conflict detected
326
+ ```
327
+
328
+ **On conflict**: Present to user with options (accept as-is, re-run conflicting phases sequentially, or stop). Never silently resolve conflicts.
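
A minimal sketch of the pairwise overlap check described above. The `PhaseFiles` and `FileConflict` types and the name `detectFileConflicts` are assumptions for illustration; only `files_modified` comes from the source. The function reports conflicts and does not resolve them, in line with the rule that conflicts are presented to the user.

```typescript
interface PhaseFiles {
  phase: number;
  files_modified: string[];
}

// Hypothetical report shape: which two phases touched which shared files.
interface FileConflict {
  phaseA: number;
  phaseB: number;
  files: string[];
}

function detectFileConflicts(results: PhaseFiles[]): FileConflict[] {
  const conflicts: FileConflict[] = [];
  // Compare every unordered pair (A, B) exactly once.
  for (let i = 0; i < results.length; i++) {
    for (let j = i + 1; j < results.length; j++) {
      const a = results[i];
      const b = results[j];
      // Intersection of the two file lists.
      const overlap = a.files_modified.filter(
        (file) => b.files_modified.indexOf(file) !== -1
      );
      if (overlap.length > 0) {
        conflicts.push({ phaseA: a.phase, phaseB: b.phase, files: overlap });
      }
    }
  }
  return conflicts;
}
```

An empty result means the wave can proceed to commits; any entry would be shown to the user with the options listed above.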
329
+
330
+ ### Wave Failure Isolation
331
+
332
+ A failed phase in a wave does NOT affect other phases in the same wave:
333
+ - Successful phases are processed normally (plan updates, file accumulation, git commits)
334
+ - Failed phases are presented to the user after all successful phases are processed
335
+ - The user chooses per failed phase: retry, skip, or stop
336
+
337
+ This differs from sequential mode where a failure immediately pauses execution. In wave mode, all parallel phases complete independently before any failure handling.
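
The isolation rule can be sketched as a partition step: results that did not fail are processed first, and failures are held back for the user. The `"failure"` status value appears in the source; the result shape, the treatment of every other status as processable, and the name `partitionWave` are assumptions.

```typescript
interface WaveResult {
  phase: number;
  status: string; // "failure" is documented; other values are assumed
}

interface WavePartition {
  processFirst: WaveResult[]; // handled normally: plan updates, commits
  deferred: WaveResult[];     // presented to the user afterwards
}

function partitionWave(results: WaveResult[]): WavePartition {
  const partition: WavePartition = { processFirst: [], deferred: [] };
  // Walk results in phase order so both groups stay deterministic.
  const ordered = results.slice().sort((a, b) => a.phase - b.phase);
  for (const result of ordered) {
    if (result.status === "failure") {
      partition.deferred.push(result);
    } else {
      partition.processFirst.push(result);
    }
  }
  return partition;
}
```

For each entry in `deferred`, the coordinator would then ask the user to retry, skip, or stop.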
182
338
 
183
339
  ---
184
340
 
@@ -186,6 +342,8 @@ After receiving the sub-agent's JSON summary, the coordinator:
186
342
 
187
343
  When phases are aggregated (combined complexity ≤ 6), they run as **one sub-agent call** with all tasks from all aggregated phases. The context template lists all phases and tasks together. The return uses the highest phase number as the `phase` field.
188
344
 
345
+ In wave mode, aggregated phases within the same wave are treated as a **single unit** — they share one wave slot, one dependency set (union of all aggregated phases' dependencies), and one sub-agent call.
346
+
189
347
  ---
190
348
 
191
349
  ## Configuration
@@ -210,6 +368,19 @@ Phase isolation **enhances** model routing — it doesn't replace it:
210
368
  | `model_routing: false` + `phase_isolation: true` | Sub-agent spawned with session model, clean context |
211
369
  | `model_routing: false` + `phase_isolation: false` | Inline execution, no sub-agents (original behavior) |
212
370
 
371
+ ### Interaction with Wave Execution
372
+
373
+ Phase isolation is the **foundation** for wave execution — wave mode spawns multiple isolated sub-agents per wave instead of one at a time:
374
+
375
+ | Setting | Behavior |
376
+ |---------|----------|
377
+ | `wave_execution: true` + `phase_isolation: true` | Multiple sub-agents per wave, each with clean context (optimal) |
378
+ | `wave_execution: true` + `phase_isolation: false` | Multiple sub-agents per wave, but sharing session context (may cause interference) |
379
+ | `wave_execution: false` + `phase_isolation: true` | One sub-agent at a time, clean context (existing behavior) |
380
+ | `wave_execution: false` + `phase_isolation: false` | Inline execution, no sub-agents (original behavior) |
381
+
382
+ **Recommendation**: `wave_execution: true` works best with `phase_isolation: true`. Without phase isolation, parallel sub-agents may interfere with each other's context.
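
The four combinations in the table above can be read as a small decision function. This is only a sketch: the mode labels and the name `resolveExecutionMode` are hypothetical, and the two flag names are taken from the table.

```typescript
interface ExecutionConfig {
  wave_execution: boolean;
  phase_isolation: boolean;
}

// Hypothetical labels for the four rows of the table.
type ExecutionMode =
  | "parallel-isolated"   // multiple sub-agents per wave, clean context
  | "parallel-shared"     // multiple sub-agents per wave, shared session context
  | "sequential-isolated" // one sub-agent at a time, clean context
  | "inline";             // no sub-agents

function resolveExecutionMode(config: ExecutionConfig): ExecutionMode {
  if (config.wave_execution) {
    return config.phase_isolation ? "parallel-isolated" : "parallel-shared";
  }
  return config.phase_isolation ? "sequential-isolated" : "inline";
}
```

A caller could warn when the result is `"parallel-shared"`, since that is the combination the recommendation advises against.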
383
+
213
384
  ---
214
385
 
215
386
  ## Rules
@@ -220,3 +391,20 @@ Phase isolation **enhances** model routing — it doesn't replace it:
220
391
  4. **Coordinator validates** — check status field before proceeding
221
392
  5. **Never auto-retry** — on failure, present to user and ask
222
393
  6. **Pass paths, not content** — give file paths, sub-agent reads them
394
+ 7. **Each phase gets own sub-agent** — even in wave mode, phases are never merged into one sub-agent (except for aggregated phases per complexity rules)
395
+ 8. **No cross-wave awareness** — sub-agents in the same wave know nothing about each other
396
+ 9. **Deterministic processing** — wave results are always processed in phase number order
397
+ 10. **Collect before commit** — in wave mode, all JSON returns are collected before any commits happen
398
+ 11. **Verification is internal** — per-task verification loops run inside the phase sub-agent; the coordinator sees only the final `task_verifications` results
399
+
400
+ ---
401
+
402
+ ## Related Files
403
+
404
+ | File | Purpose |
405
+ |------|---------|
406
+ | `.claude/resources/core/wave-execution.md` | Full wave-based parallel execution system |
407
+ | `.claude/resources/core/model-routing.md` | Model tier selection per phase complexity |
408
+ | `.claude/resources/core/discovery-sub-agents.md` | Parallel spawning pattern reference |
409
+ | `.claude/resources/core/per-task-verification.md` | Per-task verification system, debug sub-agent, and repair loops |
410
+ | `.claude/resources/skills/execute-plan-skill.md` | Execute-plan skill with wave integration (Steps 2b, 3, 4) |
@@ -103,3 +103,4 @@ After compaction, the model should re-read `flow/.scratchpad.md` to restore any
103
103
  3. **Self-manage size** — promote or discard when approaching 50 lines
104
104
  4. **Promote before ending** — always scan for promotable items before session ends
105
105
  5. **Not a task list** — use `flow/tasklist.md` for tasks, scratchpad is for observations
106
+ 6. **Different from STATE.md** — `flow/STATE.md` tracks structured execution position (current skill, phase, status) for session resumability. The scratchpad tracks informal observations, insights, and open questions. They coexist and serve different purposes: STATE.md is machine-readable execution state, scratchpad is human-readable notes.