planflow-ai 1.3.4 → 1.4.1
- package/.claude/commands/create-plan.md +11 -0
- package/.claude/commands/discovery-plan.md +12 -0
- package/.claude/commands/execute-plan.md +114 -23
- package/.claude/commands/flow.md +30 -5
- package/.claude/commands/resume-work.md +261 -0
- package/.claude/commands/review-code.md +11 -0
- package/.claude/commands/review-pr.md +11 -0
- package/.claude/resources/core/_index.md +45 -2
- package/.claude/resources/core/atomic-commits.md +380 -0
- package/.claude/resources/core/autopilot-mode.md +3 -2
- package/.claude/resources/core/compaction-guide.md +15 -1
- package/.claude/resources/core/heartbeat.md +129 -1
- package/.claude/resources/core/model-routing.md +6 -2
- package/.claude/resources/core/per-task-verification.md +362 -0
- package/.claude/resources/core/phase-isolation.md +192 -4
- package/.claude/resources/core/session-scratchpad.md +1 -0
- package/.claude/resources/core/wave-execution.md +329 -0
- package/.claude/resources/patterns/plans-patterns.md +56 -0
- package/.claude/resources/patterns/plans-templates.md +152 -0
- package/.claude/resources/skills/_index.md +8 -6
- package/.claude/resources/skills/create-plan-skill.md +71 -5
- package/.claude/resources/skills/execute-plan-skill.md +357 -12
- package/.claude/resources/skills/resume-work-skill.md +159 -0
- package/.claude/rules/core/forbidden-patterns.md +38 -0
- package/dist/cli/commands/init.js +1 -1
- package/dist/cli/commands/init.js.map +1 -1
- package/dist/cli/commands/state.d.ts +12 -0
- package/dist/cli/commands/state.d.ts.map +1 -0
- package/dist/cli/commands/state.js +47 -0
- package/dist/cli/commands/state.js.map +1 -0
- package/dist/cli/daemon/desktop-notifier.d.ts +16 -0
- package/dist/cli/daemon/desktop-notifier.d.ts.map +1 -0
- package/dist/cli/daemon/desktop-notifier.js +53 -0
- package/dist/cli/daemon/desktop-notifier.js.map +1 -0
- package/dist/cli/daemon/event-writer.d.ts +22 -0
- package/dist/cli/daemon/event-writer.d.ts.map +1 -0
- package/dist/cli/daemon/event-writer.js +76 -0
- package/dist/cli/daemon/event-writer.js.map +1 -0
- package/dist/cli/daemon/heartbeat-daemon.js +81 -1
- package/dist/cli/daemon/heartbeat-daemon.js.map +1 -1
- package/dist/cli/daemon/log-writer.d.ts +17 -0
- package/dist/cli/daemon/log-writer.d.ts.map +1 -0
- package/dist/cli/daemon/log-writer.js +62 -0
- package/dist/cli/daemon/log-writer.js.map +1 -0
- package/dist/cli/daemon/notification-router.d.ts +17 -0
- package/dist/cli/daemon/notification-router.d.ts.map +1 -0
- package/dist/cli/daemon/notification-router.js +35 -0
- package/dist/cli/daemon/notification-router.js.map +1 -0
- package/dist/cli/daemon/prompt-manager.d.ts +27 -0
- package/dist/cli/daemon/prompt-manager.d.ts.map +1 -0
- package/dist/cli/daemon/prompt-manager.js +107 -0
- package/dist/cli/daemon/prompt-manager.js.map +1 -0
- package/dist/cli/index.js +9 -0
- package/dist/cli/index.js.map +1 -1
- package/dist/cli/state/flowconfig-parser.d.ts +16 -0
- package/dist/cli/state/flowconfig-parser.d.ts.map +1 -0
- package/dist/cli/state/flowconfig-parser.js +166 -0
- package/dist/cli/state/flowconfig-parser.js.map +1 -0
- package/dist/cli/state/heartbeat-state.d.ts +16 -0
- package/dist/cli/state/heartbeat-state.d.ts.map +1 -0
- package/dist/cli/state/heartbeat-state.js +97 -0
- package/dist/cli/state/heartbeat-state.js.map +1 -0
- package/dist/cli/state/model-router.d.ts +8 -0
- package/dist/cli/state/model-router.d.ts.map +1 -0
- package/dist/cli/state/model-router.js +36 -0
- package/dist/cli/state/model-router.js.map +1 -0
- package/dist/cli/state/plan-parser.d.ts +16 -0
- package/dist/cli/state/plan-parser.d.ts.map +1 -0
- package/dist/cli/state/plan-parser.js +124 -0
- package/dist/cli/state/plan-parser.js.map +1 -0
- package/dist/cli/state/session-state.d.ts +21 -0
- package/dist/cli/state/session-state.d.ts.map +1 -0
- package/dist/cli/state/session-state.js +36 -0
- package/dist/cli/state/session-state.js.map +1 -0
- package/dist/cli/state/state-md-parser.d.ts +18 -0
- package/dist/cli/state/state-md-parser.d.ts.map +1 -0
- package/dist/cli/state/state-md-parser.js +222 -0
- package/dist/cli/state/state-md-parser.js.map +1 -0
- package/dist/cli/state/types.d.ts +106 -0
- package/dist/cli/state/types.d.ts.map +1 -0
- package/dist/cli/state/types.js +8 -0
- package/dist/cli/state/types.js.map +1 -0
- package/dist/cli/state/wave-calculator.d.ts +18 -0
- package/dist/cli/state/wave-calculator.d.ts.map +1 -0
- package/dist/cli/state/wave-calculator.js +134 -0
- package/dist/cli/state/wave-calculator.js.map +1 -0
- package/dist/cli/types.d.ts +15 -0
- package/dist/cli/types.d.ts.map +1 -1
- package/package.json +4 -2
- package/templates/shared/CLAUDE.md.template +4 -0
|
package/.claude/resources/core/per-task-verification.md
@@ -0,0 +1,362 @@
|
|
|
1
|
+
|
|
2
|
+
# Per-Task Verification
|
|
3
|
+
|
|
4
|
+
## Purpose
|
|
5
|
+
|
|
6
|
+
When a plan phase includes tasks with `<verify>` tags, the phase sub-agent runs **targeted verification immediately after each task completes**. If verification fails, it spawns a nested debug sub-agent to diagnose the failure, then applies the suggested repairs itself. This catches errors at the task level instead of waiting for the final build+test step.
|
|
7
|
+
|
|
8
|
+
**Core principle**: Verify early, diagnose fast, repair in place.
|
|
9
|
+
|
|
10
|
+
---
|
|
11
|
+
|
|
12
|
+
## Architecture
|
|
13
|
+
|
|
14
|
+
```
|
|
15
|
+
Phase Sub-Agent (isolated)
|
|
16
|
+
│
|
|
17
|
+
├─ Task 1: Implement
|
|
18
|
+
│ ├─ Complete task implementation
|
|
19
|
+
│ ├─ Parse <verify> tag → extract command
|
|
20
|
+
│ ├─ Run verification command
|
|
21
|
+
│ ├─ ✅ Pass → record result, move to Task 2
|
|
22
|
+
│ └─ ❌ Fail → enter verification loop:
|
|
23
|
+
│ │
|
|
24
|
+
│ ├─ Spawn debug sub-agent (haiku):
|
|
25
|
+
│ │ Input: error output + task context + file content
|
|
26
|
+
│ │ Output: JSON diagnosis (root cause, repair actions)
|
|
27
|
+
│ │
|
|
28
|
+
│ ├─ Apply repair actions
|
|
29
|
+
│ ├─ Re-run verification command
|
|
30
|
+
│ ├─ ✅ Pass → record result (with repair info), move to Task 2
|
|
31
|
+
│ ├─ ❌ Fail → retry (up to max_verify_retries)
|
|
32
|
+
│ └─ ❌ Max retries exceeded → record failure, escalate to user
|
|
33
|
+
│
|
|
34
|
+
├─ Task 2: Implement (no <verify> tag → skip verification)
|
|
35
|
+
│
|
|
36
|
+
├─ Task 3: Implement
|
|
37
|
+
│ ├─ Complete task implementation
|
|
38
|
+
│ ├─ Parse <verify> tag → extract command
|
|
39
|
+
│ └─ Run verification → ✅ Pass
|
|
40
|
+
│
|
|
41
|
+
└─ Return JSON (includes task_verifications array)
|
|
42
|
+
```
|
|
43
|
+
|
|
44
|
+
Verification is **internal to the phase sub-agent**. The wave coordinator and main session see only the final JSON return with verification results — they never interact with the verification loop directly.
|
|
45
|
+
|
|
46
|
+
---
|
|
47
|
+
|
|
48
|
+
## Verify Tag Syntax
|
|
49
|
+
|
|
50
|
+
### Declaration in Plans
|
|
51
|
+
|
|
52
|
+
Tasks in a plan phase can include an optional `<verify>` tag indented under the task:
|
|
53
|
+
|
|
54
|
+
```markdown
|
|
55
|
+
### Phase 2: API Integration
|
|
56
|
+
|
|
57
|
+
**Scope**: ...
|
|
58
|
+
**Complexity**: 5/10
|
|
59
|
+
**Dependencies**: Phase 1
|
|
60
|
+
|
|
61
|
+
- [ ] Create user authentication middleware in `src/middleware/auth.ts`
|
|
62
|
+
<verify>npx tsc --noEmit src/middleware/auth.ts</verify>
|
|
63
|
+
- [ ] Add rate limiting to API routes
|
|
64
|
+
<verify>npx jest src/middleware/__tests__/rate-limit.test.ts --no-coverage</verify>
|
|
65
|
+
- [ ] Update configuration constants
|
|
66
|
+
```
|
|
67
|
+
|
|
68
|
+
### Parsing Rules
|
|
69
|
+
|
|
70
|
+
1. **Tag format**: `<verify>COMMAND</verify>` on a single line, indented under a task
|
|
71
|
+
2. **One verify per task**: Only the first `<verify>` tag under a task is used; additional tags are ignored
|
|
72
|
+
3. **Command content**: The text between tags is executed as a shell command by the sub-agent
|
|
73
|
+
4. **No verify = no verification**: Tasks without `<verify>` tags skip verification entirely (backward compatible)
|
|
74
|
+
5. **Whitespace**: Leading/trailing whitespace inside the tag is trimmed
|
|
75
|
+
6. **Nesting**: The `<verify>` tag must be indented under its parent task (at least two spaces or one tab)
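The parsing rules above can be sketched as a small line-based parser. This is a minimal illustration that assumes tasks are written as `- [ ]` checklist items; `PlanTask`, `parseVerifyTags`, and both regular expressions are hypothetical names chosen for this example, not the package's actual parser.

```typescript
// Hypothetical sketch of the parsing rules; not planflow-ai's actual parser.
interface PlanTask {
  description: string;
  verifyCommand?: string; // undefined means the task skips verification (rule 4)
}

// Assumption: tasks are Markdown checklist items at the start of a line.
const TASK_LINE = /^- \[[ xX]\] (.+)$/;
// Rules 1 and 6: a single-line tag indented by 2+ spaces or 1+ tab.
const VERIFY_LINE = /^(?: {2,}|\t+)\s*<verify>(.*)<\/verify>\s*$/;

function parseVerifyTags(planText: string): PlanTask[] {
  const tasks: PlanTask[] = [];
  let current: PlanTask | undefined;
  for (const line of planText.split(/\r?\n/)) {
    const taskMatch = TASK_LINE.exec(line);
    if (taskMatch) {
      current = { description: (taskMatch[1] ?? "").trim() };
      tasks.push(current);
      continue;
    }
    const verifyMatch = VERIFY_LINE.exec(line);
    // Rule 2: only the first tag under a task is used; later tags are ignored.
    if (verifyMatch && current && current.verifyCommand === undefined) {
      // Rule 5: trim whitespace inside the tag.
      current.verifyCommand = (verifyMatch[1] ?? "").trim();
    }
  }
  return tasks;
}
```

A tag that is not indented under a task is simply not recognized, so that task is treated as having no verification.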
|
|
76
|
+
|
|
77
|
+
### Recommended Verification Commands
|
|
78
|
+
|
|
79
|
+
| Task Type | Verify Command | Purpose |
|
|
80
|
+
|-----------|---------------|---------|
|
|
81
|
+
| File creation (TypeScript) | `npx tsc --noEmit <file>` | Type-check the new file |
|
|
82
|
+
| Test writing | `npx jest <test-file> --no-coverage` | Run the specific test |
|
|
83
|
+
| Schema/type changes | `npx tsc --noEmit <type-file>` | Verify type consistency |
|
|
84
|
+
| Config changes | *(no verify)* | Manual review preferred |
|
|
85
|
+
| Documentation | *(no verify)* | No automated check available |
|
|
86
|
+
|
|
87
|
+
**Constraint**: Verification commands must be **targeted** (single file or small scope). Never use full builds (`npm run build`) or full test suites (`npm test`) as verify commands; those run in the final build+test step (Step 7).
|
|
88
|
+
|
|
89
|
+
---
|
|
90
|
+
|
|
91
|
+
## Debug Sub-Agent
|
|
92
|
+
|
|
93
|
+
### When to Spawn
|
|
94
|
+
|
|
95
|
+
A debug sub-agent is spawned when a verification command returns a **non-zero exit code**. The sub-agent diagnoses the failure and suggests repair actions.
|
|
96
|
+
|
|
97
|
+
### Prompt Template
|
|
98
|
+
|
|
99
|
+
```markdown
|
|
100
|
+
# Debug Diagnosis
|
|
101
|
+
|
|
102
|
+
## Failed Verification
|
|
103
|
+
**Task**: {task description}
|
|
104
|
+
**Command**: {verify command}
|
|
105
|
+
**Exit Code**: {exit code}
|
|
106
|
+
|
|
107
|
+
## Error Output
|
|
108
|
+
```
|
|
109
|
+
{stderr + stdout from the failed command, truncated to 200 lines}
|
|
110
|
+
```
|
|
111
|
+
|
|
112
|
+
## Task Context
|
|
113
|
+
**File**: {primary file being modified}
|
|
114
|
+
**Phase**: {phase name}
|
|
115
|
+
**What was implemented**: {brief description of what the task did}
|
|
116
|
+
|
|
117
|
+
## File Content
|
|
118
|
+
```
|
|
119
|
+
{content of the primary file, truncated to 300 lines}
|
|
120
|
+
```
|
|
121
|
+
|
|
122
|
+
## Instructions
|
|
123
|
+
Analyze the error output and diagnose the root cause. Return a JSON object with your diagnosis and suggested repair actions. Do NOT fix the code — only diagnose.
|
|
124
|
+
|
|
125
|
+
Return ONLY a JSON object (no markdown fences):
|
|
126
|
+
{see Debug Return Schema below}
|
|
127
|
+
```
|
|
128
|
+
|
|
129
|
+
### Debug Return Schema
|
|
130
|
+
|
|
131
|
+
```json
|
|
132
|
+
{
|
|
133
|
+
"root_cause": "Missing import for AuthMiddleware type used on line 15",
|
|
134
|
+
"category": "import_missing",
|
|
135
|
+
"repair_actions": [
|
|
136
|
+
"Add import { AuthMiddleware } from '../types/auth' to src/middleware/auth.ts"
|
|
137
|
+
],
|
|
138
|
+
"confidence": "high",
|
|
139
|
+
"file_to_fix": "src/middleware/auth.ts"
|
|
140
|
+
}
|
|
141
|
+
```
|
|
142
|
+
|
|
143
|
+
### Field Descriptions
|
|
144
|
+
|
|
145
|
+
| Field | Type | Required | Description |
|
|
146
|
+
|-------|------|----------|-------------|
|
|
147
|
+
| `root_cause` | string | Yes | Human-readable description of what went wrong |
|
|
148
|
+
| `category` | string | Yes | Error category: `import_missing`, `type_error`, `syntax_error`, `runtime_error`, `test_failure`, `config_error`, `other` |
|
|
149
|
+
| `repair_actions` | string[] | Yes | Ordered list of specific actions to fix the issue |
|
|
150
|
+
| `confidence` | `"high" \| "medium" \| "low"` | Yes | How confident the diagnosis is |
|
|
151
|
+
| `file_to_fix` | string | Yes | Primary file that needs modification |
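The schema above could be checked before any repair is applied. The following is a minimal sketch, assuming the raw sub-agent reply is available as a string; `DebugDiagnosis`, `parseDiagnosis`, and the constant names are hypothetical and not part of the package.

```typescript
// Hypothetical validator for the debug sub-agent's JSON return.
type DiagnosisCategory =
  | "import_missing" | "type_error" | "syntax_error" | "runtime_error"
  | "test_failure" | "config_error" | "other";
type DiagnosisConfidence = "high" | "medium" | "low";

interface DebugDiagnosis {
  root_cause: string;
  category: DiagnosisCategory;
  repair_actions: string[];
  confidence: DiagnosisConfidence;
  file_to_fix: string;
}

const DIAGNOSIS_CATEGORIES: string[] = [
  "import_missing", "type_error", "syntax_error", "runtime_error",
  "test_failure", "config_error", "other",
];
const DIAGNOSIS_CONFIDENCES: string[] = ["high", "medium", "low"];

// Returns null for anything that is not a complete, well-typed diagnosis.
function parseDiagnosis(raw: string): DebugDiagnosis | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null; // invalid JSON
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const fields = parsed as Record<string, unknown>;
  const rootCause = fields["root_cause"];
  const category = fields["category"];
  const repairActions = fields["repair_actions"];
  const confidence = fields["confidence"];
  const fileToFix = fields["file_to_fix"];
  if (typeof rootCause !== "string" || typeof fileToFix !== "string") return null;
  if (typeof category !== "string" || DIAGNOSIS_CATEGORIES.indexOf(category) === -1) return null;
  if (typeof confidence !== "string" || DIAGNOSIS_CONFIDENCES.indexOf(confidence) === -1) return null;
  if (!Array.isArray(repairActions)) return null;
  const actions: string[] = [];
  for (const action of repairActions) {
    if (typeof action !== "string") return null;
    actions.push(action);
  }
  return {
    root_cause: rootCause,
    category: category as DiagnosisCategory,
    repair_actions: actions,
    confidence: confidence as DiagnosisConfidence,
    file_to_fix: fileToFix,
  };
}
```

Returning `null` rather than throwing lets the caller treat a malformed reply like any other failed retry.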
|
|
152
|
+
|
|
153
|
+
### Sub-Agent Configuration
|
|
154
|
+
|
|
155
|
+
- **Model**: Always uses haiku (fast tier) — diagnosis is a focused, low-complexity task
|
|
156
|
+
- **Mode**: `"auto"`
|
|
157
|
+
- **Read-only**: The debug sub-agent does NOT modify files — it only returns a diagnosis. The implementation sub-agent applies the repairs.
|
|
158
|
+
|
|
159
|
+
---
|
|
160
|
+
|
|
161
|
+
## Verification Loop
|
|
162
|
+
|
|
163
|
+
### Flow
|
|
164
|
+
|
|
165
|
+
```
|
|
166
|
+
1. Complete task implementation
|
|
167
|
+
2. Parse <verify> tag → extract command
|
|
168
|
+
3. Run command
|
|
169
|
+
4. If exit code == 0 → PASS (record result, continue)
|
|
170
|
+
5. If exit code != 0:
|
|
171
|
+
a. Increment retry counter
|
|
172
|
+
b. If retry counter > max_verify_retries → ESCALATE (record failure)
|
|
173
|
+
c. Spawn debug sub-agent with error context
|
|
174
|
+
d. Receive JSON diagnosis
|
|
175
|
+
e. Apply repair actions from diagnosis
|
|
176
|
+
f. Re-run verification command → go to step 4
|
|
177
|
+
```
|
|
178
|
+
|
|
179
|
+
### Retry Behavior
|
|
180
|
+
|
|
181
|
+
- **Retry counter**: Starts at 0 and increments after each failed verification run; once it exceeds `max_verify_retries`, the task is escalated
|
|
182
|
+
- **First attempt**: The initial verification run does NOT count as a retry
|
|
183
|
+
- **Max retries**: Controlled by `max_verify_retries` in `.flowconfig` (default: 2)
|
|
184
|
+
- **Example with default**: Initial attempt + 2 retries = 3 total verification runs maximum
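The flow and retry counting above can be sketched as a loop. This is an illustrative model only: the two callbacks stand in for running the shell command and for the debug-and-repair step, and `runVerificationLoop` and `LoopResult` are hypothetical names.

```typescript
// Hypothetical model of the verification loop; callbacks replace real I/O.
interface LoopResult {
  status: "pass" | "fail";
  attempts: number;         // total verification runs, including the first
  repairsApplied: string[];
}

function runVerificationLoop(
  runVerify: () => number,            // returns the command's exit code
  diagnoseAndRepair: () => string[],  // returns the repairs that were applied
  maxVerifyRetries: number,
): LoopResult {
  const repairsApplied: string[] = [];
  let retryCounter = 0;
  let attempts = 1;
  let exitCode = runVerify();               // initial run, not a retry
  while (exitCode !== 0) {
    retryCounter += 1;                      // step 5a
    if (retryCounter > maxVerifyRetries) {  // step 5b: escalate
      return { status: "fail", attempts, repairsApplied };
    }
    for (const repair of diagnoseAndRepair()) {  // steps 5c to 5e
      repairsApplied.push(repair);
    }
    exitCode = runVerify();                 // step 5f
    attempts += 1;
  }
  return { status: "pass", attempts, repairsApplied };
}
```

With `maxVerifyRetries` set to 2, a command that never passes is run three times in total before the loop reports a failure, matching the default described above.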
|
|
185
|
+
|
|
186
|
+
### Escalation on Max Retries
|
|
187
|
+
|
|
188
|
+
When max retries are exceeded, the sub-agent:
|
|
189
|
+
|
|
190
|
+
1. Records the verification failure in the `task_verifications` array
|
|
191
|
+
2. Includes the last debug diagnosis in the failure record
|
|
192
|
+
3. Continues to the next task (does NOT abort the phase)
|
|
193
|
+
4. Sets overall phase `status` to `"partial"` if any task verification failed
|
|
194
|
+
|
|
195
|
+
The coordinator presents the failure to the user with the accumulated diagnosis:
|
|
196
|
+
|
|
197
|
+
```markdown
|
|
198
|
+
⚠️ Task verification failed after 2 retries:
|
|
199
|
+
|
|
200
|
+
**Task**: Create user authentication middleware in `src/middleware/auth.ts`
|
|
201
|
+
**Command**: `npx tsc --noEmit src/middleware/auth.ts`
|
|
202
|
+
**Last diagnosis**: Missing type export from @auth/core — dependency may need updating
|
|
203
|
+
**Category**: import_missing
|
|
204
|
+
|
|
205
|
+
Options:
|
|
206
|
+
1. Continue with remaining phases (issue noted)
|
|
207
|
+
2. Stop and fix manually
|
|
208
|
+
```
|
|
209
|
+
|
|
210
|
+
---
|
|
211
|
+
|
|
212
|
+
## Task Verifications Return Field
|
|
213
|
+
|
|
214
|
+
### Schema Extension
|
|
215
|
+
|
|
216
|
+
The phase isolation JSON return format is extended with an optional `task_verifications` array:
|
|
217
|
+
|
|
218
|
+
```json
|
|
219
|
+
{
|
|
220
|
+
"status": "success",
|
|
221
|
+
"phase": "Phase 2: API Integration",
|
|
222
|
+
"summary": "Implemented auth middleware and rate limiting. One task required type import repair.",
|
|
223
|
+
"files_created": ["src/middleware/auth.ts"],
|
|
224
|
+
"files_modified": ["src/api/routes.ts"],
|
|
225
|
+
"decisions": [],
|
|
226
|
+
"deviations": [],
|
|
227
|
+
"errors": [],
|
|
228
|
+
"patterns_captured": [],
|
|
229
|
+
"task_verifications": [
|
|
230
|
+
{
|
|
231
|
+
"task": "Create user authentication middleware",
|
|
232
|
+
"verify_command": "npx tsc --noEmit src/middleware/auth.ts",
|
|
233
|
+
"status": "pass",
|
|
234
|
+
"attempts": 2,
|
|
235
|
+
"repairs_applied": [
|
|
236
|
+
"Added missing import for AuthMiddleware type"
|
|
237
|
+
]
|
|
238
|
+
},
|
|
239
|
+
{
|
|
240
|
+
"task": "Add rate limiting to API routes",
|
|
241
|
+
"verify_command": "npx jest src/middleware/__tests__/rate-limit.test.ts --no-coverage",
|
|
242
|
+
"status": "pass",
|
|
243
|
+
"attempts": 1,
|
|
244
|
+
"repairs_applied": []
|
|
245
|
+
}
|
|
246
|
+
]
|
|
247
|
+
}
|
|
248
|
+
```
|
|
249
|
+
|
|
250
|
+
### Task Verification Entry Fields
|
|
251
|
+
|
|
252
|
+
| Field | Type | Required | Description |
|
|
253
|
+
|-------|------|----------|-------------|
|
|
254
|
+
| `task` | string | Yes | Short description of the task (from plan) |
|
|
255
|
+
| `verify_command` | string | Yes | The verification command that was run |
|
|
256
|
+
| `status` | `"pass" \| "fail"` | Yes | Final verification outcome |
|
|
257
|
+
| `attempts` | number | Yes | Total verification attempts (1 = passed first try) |
|
|
258
|
+
| `repairs_applied` | string[] | Yes | List of repairs applied during retries (empty if passed first try) |
|
|
259
|
+
| `last_diagnosis` | object | No | Last debug sub-agent diagnosis (only present when `status: "fail"`) |
|
|
260
|
+
|
|
261
|
+
### When `task_verifications` is Omitted
|
|
262
|
+
|
|
263
|
+
- If a phase has **no tasks with `<verify>` tags**, the `task_verifications` field is omitted entirely from the JSON return
|
|
264
|
+
- This maintains backward compatibility — existing phase isolation returns are unchanged
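A consumer of the return value has to handle both the present and the omitted case. The sketch below shows one way to count results; the interface mirrors the field table above, while `summarizeVerifications` and `VerificationSummary` are hypothetical names.

```typescript
// Entry shape taken from the field table; the summary helper is illustrative.
interface TaskVerification {
  task: string;
  verify_command: string;
  status: "pass" | "fail";
  attempts: number;          // 1 = passed on the first run
  repairs_applied: string[];
  last_diagnosis?: object;   // only present when status is "fail"
}

interface VerificationSummary {
  verified: number;
  passed: number;
  failed: number;
  repaired: number; // tasks that passed only after at least one repair
}

function summarizeVerifications(
  entries: TaskVerification[] | undefined,
): VerificationSummary | null {
  // Field omitted: the phase had no tasks with <verify> tags.
  if (entries === undefined) return null;
  const summary: VerificationSummary = {
    verified: entries.length, passed: 0, failed: 0, repaired: 0,
  };
  for (const entry of entries) {
    if (entry.status === "pass") {
      summary.passed += 1;
      if (entry.repairs_applied.length > 0) summary.repaired += 1;
    } else {
      summary.failed += 1;
    }
  }
  return summary;
}
```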
|
|
265
|
+
|
|
266
|
+
---
|
|
267
|
+
|
|
268
|
+
## Configuration
|
|
269
|
+
|
|
270
|
+
### `.flowconfig` Setting
|
|
271
|
+
|
|
272
|
+
```yaml
|
|
273
|
+
max_verify_retries: 2 # Max repair attempts per task verification (default: 2, range: 1-5)
|
|
274
|
+
```
|
|
275
|
+
|
|
276
|
+
- **Default**: `2` (initial attempt + 2 retries = 3 total runs)
|
|
277
|
+
- **Range**: `1` to `5`
|
|
278
|
+
- **Values below 1 or above 5**: Clamped to the valid range with a warning
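The clamping rule can be sketched as follows. This is an assumption-laden illustration: `resolveMaxVerifyRetries` and `RetrySetting` are hypothetical names, the warning text is invented, and falling back to the default for a missing or non-numeric value is an assumption not stated in the source.

```typescript
// Hypothetical sketch of the clamping rule for max_verify_retries.
const DEFAULT_MAX_VERIFY_RETRIES = 2;
const LOWEST_MAX_VERIFY_RETRIES = 1;
const HIGHEST_MAX_VERIFY_RETRIES = 5;

interface RetrySetting {
  value: number;
  warning?: string; // set only when the configured value was clamped
}

function resolveMaxVerifyRetries(raw: number | undefined): RetrySetting {
  // Assumption: a missing or NaN value falls back to the default.
  if (raw === undefined || isNaN(raw)) {
    return { value: DEFAULT_MAX_VERIFY_RETRIES };
  }
  const clamped = Math.min(
    HIGHEST_MAX_VERIFY_RETRIES,
    Math.max(LOWEST_MAX_VERIFY_RETRIES, raw),
  );
  if (clamped !== raw) {
    return {
      value: clamped,
      warning: `max_verify_retries=${raw} is outside 1-5; using ${clamped}`,
    };
  }
  return { value: raw };
}
```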
|
|
279
|
+
|
|
280
|
+
### Set via `/flow`
|
|
281
|
+
|
|
282
|
+
```
|
|
283
|
+
/flow max_verify_retries=3
|
|
284
|
+
```
|
|
285
|
+
|
|
286
|
+
### No Feature Toggle
|
|
287
|
+
|
|
288
|
+
Per-task verification has no on/off toggle. It activates automatically when tasks include `<verify>` tags. Plans without `<verify>` tags behave exactly as before — fully backward compatible.
|
|
289
|
+
|
|
290
|
+
---
|
|
291
|
+
|
|
292
|
+
## Error Handling
|
|
293
|
+
|
|
294
|
+
### Verification Command Errors
|
|
295
|
+
|
|
296
|
+
| Scenario | Behavior |
|
|
297
|
+
|----------|----------|
|
|
298
|
+
| Command not found | Treat as verification failure, spawn debug sub-agent |
|
|
299
|
+
| Command timeout (>30s) | Kill process, treat as failure, include timeout in error output |
|
|
300
|
+
| Command produces no output | Treat exit code as sole indicator (0 = pass, non-zero = fail) |
|
|
301
|
+
| Command produces large output | Truncate to 200 lines before passing to debug sub-agent |
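The truncation limits can be sketched with one helper. The source does not say whether the first or last lines are kept, so keeping the first lines is an assumption here; `truncateLines` and both constant names are hypothetical.

```typescript
// Hypothetical helper; keeping the FIRST maxLines lines is an assumption.
const MAX_ERROR_OUTPUT_LINES = 200; // cap for command output
const MAX_FILE_CONTENT_LINES = 300; // cap for file content

function truncateLines(text: string, maxLines: number): string {
  const lines = text.split("\n");
  if (lines.length <= maxLines) return text;
  return lines.slice(0, maxLines).join("\n");
}
```

For example, `truncateLines(errorOutput, MAX_ERROR_OUTPUT_LINES)` would be applied to the combined stderr and stdout before it is placed in the debug prompt.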
|
|
302
|
+
|
|
303
|
+
### Debug Sub-Agent Errors
|
|
304
|
+
|
|
305
|
+
| Scenario | Behavior |
|
|
306
|
+
|----------|----------|
|
|
307
|
+
| Invalid JSON return | Skip this retry (no repair is applied) and count it as a failed attempt toward `max_verify_retries` |
|
|
308
|
+
| Sub-agent timeout | Skip this retry (no repair is applied) and count it as a failed attempt toward `max_verify_retries` |
|
|
309
|
+
| Empty repair_actions | Skip repair, re-run verification (may pass if issue was transient) |
|
|
310
|
+
|
|
311
|
+
### Phase-Level Impact
|
|
312
|
+
|
|
313
|
+
| Verification Outcome | Phase Status |
|
|
314
|
+
|----------------------|-------------|
|
|
315
|
+
| All verifications pass | `"success"` (no change) |
|
|
316
|
+
| Some verifications fail (max retries exceeded) | `"partial"` |
|
|
317
|
+
| Task implementation itself fails | `"failure"` (existing behavior, unrelated to verification) |
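The mapping in this table is small enough to express directly. A minimal sketch, where `derivePhaseStatus` and its parameter names are hypothetical:

```typescript
// Hypothetical mapping from verification outcomes to the phase status.
type PhaseStatus = "success" | "partial" | "failure";
type VerifyOutcome = "pass" | "fail";

function derivePhaseStatus(
  implementationFailed: boolean,
  outcomes: VerifyOutcome[],
): PhaseStatus {
  // An implementation failure takes precedence and is unrelated to verification.
  if (implementationFailed) return "failure";
  for (const outcome of outcomes) {
    if (outcome === "fail") return "partial";
  }
  return "success"; // also covers phases with no verified tasks
}
```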
|
|
318
|
+
|
|
319
|
+
---
|
|
320
|
+
|
|
321
|
+
## Interaction with Wave Mode
|
|
322
|
+
|
|
323
|
+
Per-task verification is **entirely internal** to each phase sub-agent. The wave coordinator:
|
|
324
|
+
|
|
325
|
+
- Does NOT know about individual task verifications during execution
|
|
326
|
+
- Receives `task_verifications` in the JSON return after the sub-agent completes
|
|
327
|
+
- Displays verification stats in the wave completion summary
|
|
328
|
+
- Does NOT retry phases based on verification failures (that is internal to the sub-agent)
|
|
329
|
+
|
|
330
|
+
```
|
|
331
|
+
Wave 1: Phase 1 (2 tasks verified: 2 pass), Phase 2 (3 tasks verified: 2 pass, 1 fail after 2 retries)
|
|
332
|
+
```
|
|
333
|
+
|
|
334
|
+
Wave execution treats a phase with failed verifications as `"partial"` — the same way it handles any partial result. The user decides whether to continue.
|
|
335
|
+
|
|
336
|
+
---
|
|
337
|
+
|
|
338
|
+
## Rules
|
|
339
|
+
|
|
340
|
+
1. **Verify is optional** — tasks without `<verify>` tags skip verification entirely
|
|
341
|
+
2. **Targeted commands only** — never use full builds or full test suites as verify commands
|
|
342
|
+
3. **Debug sub-agent is read-only** — it diagnoses but never modifies files
|
|
343
|
+
4. **Implementation sub-agent repairs** — only the phase sub-agent applies fixes
|
|
344
|
+
5. **Continue on failure** — failed verification does NOT abort the phase; it records the failure and continues to the next task
|
|
345
|
+
6. **Max retries are hard** — once exceeded, escalate to user; never increase retries dynamically
|
|
346
|
+
7. **First attempt is not a retry** — the initial verification run is attempt 1, retries start at attempt 2
|
|
347
|
+
8. **Truncate large output** — cap error output at 200 lines and file content at 300 lines for debug sub-agent
|
|
348
|
+
9. **Backward compatible** — phases without any `<verify>` tags produce no `task_verifications` field
|
|
349
|
+
10. **Wave-transparent** — wave coordinator sees only final results; verification loops are internal to phase sub-agents
|
|
350
|
+
|
|
351
|
+
---
|
|
352
|
+
|
|
353
|
+
## Related Files
|
|
354
|
+
|
|
355
|
+
| File | Purpose |
|
|
356
|
+
|------|---------|
|
|
357
|
+
| `.claude/resources/core/phase-isolation.md` | Sub-agent context template and JSON return format (extended by this feature) |
|
|
358
|
+
| `.claude/resources/core/wave-execution.md` | Wave coordinator behavior (verification is internal to sub-agents) |
|
|
359
|
+
| `.claude/resources/core/model-routing.md` | Model tier selection (debug sub-agent always uses haiku) |
|
|
360
|
+
| `.claude/resources/skills/execute-plan-skill.md` | Execute-plan skill with verification result display |
|
|
361
|
+
| `.claude/resources/skills/create-plan-skill.md` | Auto-generation of `<verify>` tags in plans |
|
|
362
|
+
| `.claude/resources/patterns/plans-templates.md` | Plan template with `<verify>` tag syntax |
|
|
package/.claude/resources/core/phase-isolation.md
@@ -7,10 +7,16 @@ When `phase_isolation: true` in `flow/.flowconfig` (default), each `/execute-pla
|
|
|
7
7
|
|
|
8
8
|
**Core principle**: Clean context in, structured summary out.
|
|
9
9
|
|
|
10
|
+
### Per-Task Verification
|
|
11
|
+
|
|
12
|
+
Phase sub-agents support **per-task verification** when plan tasks include `<verify>` tags. After completing each task, the sub-agent runs the verification command and, on failure, spawns a debug sub-agent (haiku) for diagnosis and repair. See `.claude/resources/core/per-task-verification.md` for the complete verification system, debug sub-agent prompt template, JSON schemas, and configuration.
|
|
13
|
+
|
|
10
14
|
---
|
|
11
15
|
|
|
12
16
|
## Architecture
|
|
13
17
|
|
|
18
|
+
### Sequential Mode (default)
|
|
19
|
+
|
|
14
20
|
```
|
|
15
21
|
Coordinator (main session)
|
|
16
22
|
│
|
|
@@ -35,6 +41,40 @@ Coordinator (main session)
|
|
|
35
41
|
└─ Next phase...
|
|
36
42
|
```
|
|
37
43
|
|
|
44
|
+
### Wave Mode (when `wave_execution: true`)
|
|
45
|
+
|
|
46
|
+
```
|
|
47
|
+
Coordinator (main session)
|
|
48
|
+
│
|
|
49
|
+
├─ For each Wave:
|
|
50
|
+
│ │
|
|
51
|
+
│ ├─ Approve each phase sequentially in Plan Mode
|
|
52
|
+
│ │
|
|
53
|
+
│ ├─ Prepare isolated context for EACH phase in the wave
|
|
54
|
+
│ │
|
|
55
|
+
│ ├─ Spawn MULTIPLE Agent sub-agents IN PARALLEL:
|
|
56
|
+
│ │ ├─► Agent: Phase A (model: [tier_A], prompt: phase_A_context)
|
|
57
|
+
│ │ ├─► Agent: Phase B (model: [tier_B], prompt: phase_B_context)
|
|
58
|
+
│ │ └─► Agent: Phase C (model: [tier_C], prompt: phase_C_context)
|
|
59
|
+
│ │
|
|
60
|
+
│ ├─ Wait for ALL sub-agents to complete
|
|
61
|
+
│ │
|
|
62
|
+
│ ├─ Collect JSON returns from all sub-agents
|
|
63
|
+
│ │
|
|
64
|
+
│ ├─ Post-wave processing (sequential, in phase order):
|
|
65
|
+
│ │ ├─ Detect file conflicts (files_modified overlap)
|
|
66
|
+
│ │ ├─ Process each phase result
|
|
67
|
+
│ │ ├─ Update plan file (mark tasks [x])
|
|
68
|
+
│ │ ├─ Accumulate files_modified list
|
|
69
|
+
│ │ ├─ Buffer patterns from all phases
|
|
70
|
+
│ │ ├─ Git commit sequentially (Phase A, then B, then C)
|
|
71
|
+
│ │ └─ Handle failures (present to user)
|
|
72
|
+
│ │
|
|
73
|
+
│ └─ Next Wave...
|
|
74
|
+
│
|
|
75
|
+
└─ Completion summary with wave execution stats
|
|
76
|
+
```
|
|
77
|
+
|
|
38
78
|
Planning and user approval always happen in the **main session** (full context). Only the **implementation step** is isolated.
|
|
39
79
|
|
|
40
80
|
---
|
|
@@ -68,11 +108,37 @@ Read these files before implementing:
|
|
|
68
108
|
{Only if UI phase — include design tokens from discovery doc}
|
|
69
109
|
{Otherwise omit this section entirely}
|
|
70
110
|
|
|
111
|
+
## Commit Instructions
|
|
112
|
+
{Only include this section when `commit: true` in `.flowconfig`}
|
|
113
|
+
|
|
114
|
+
### Sequential Mode (wave_execution: false)
|
|
115
|
+
- After each task completes and verification passes (if applicable):
|
|
116
|
+
1. Stage changed files: `git add -A`
|
|
117
|
+
2. Create atomic commit: `git commit -m "feat(phase-N.task-M): <truncated description> — <feature>"`
|
|
118
|
+
- Use format: feat(phase-{phase_number}.task-{task_number_in_phase}): <description> — <feature>
|
|
119
|
+
- Truncate description to 50 chars (use ellipsis if truncated)
|
|
120
|
+
- Task numbers are 1-indexed within each phase
|
|
121
|
+
3. Continue to next task
|
|
122
|
+
- Return `tasks_completed` array in JSON with files_created/files_modified per task
|
|
123
|
+
- Do NOT create a final phase commit (coordinator will not create one either)
|
|
124
|
+
|
|
125
|
+
### Wave Mode (wave_execution: true)
|
|
126
|
+
- Do NOT create any commits during task implementation
|
|
127
|
+
- The coordinator will commit your changes after this wave completes
|
|
128
|
+
- Return `tasks_completed` array with per-task file lists (see Return Format below)
|
|
129
|
+
- Coordinator will iterate tasks and commit: feat(phase-N.task-M): ... per task
|
|
130
|
+
|
|
71
131
|
## Instructions
|
|
72
132
|
1. Read the plan file to understand the full feature context
|
|
73
133
|
2. Implement all tasks listed above
|
|
74
134
|
3. Follow the project patterns from the files listed
|
|
75
|
-
4.
|
|
135
|
+
4. After completing each task, check if it has a `<verify>` tag indented beneath it:
|
|
136
|
+
- If yes: run the verification command inside the tag
|
|
137
|
+
- If the command exits 0: record a pass result and continue to the next task
|
|
138
|
+
- If the command exits non-zero: spawn a debug sub-agent (haiku, mode: "auto") with the error output, task context, and file content. Apply the repair actions from the diagnosis and re-run the verification command (up to `max_verify_retries` retries, default 2). See `.claude/resources/core/per-task-verification.md` for the debug sub-agent prompt template and return schema.
|
|
139
|
+
- If max retries exceeded: record a fail result and continue to the next task (do NOT abort)
|
|
140
|
+
- If no `<verify>` tag: skip verification for that task
|
|
141
|
+
5. Return a JSON summary in the exact format below — do NOT return markdown
|
|
76
142
|
|
|
77
143
|
## Return Format
|
|
78
144
|
Return ONLY a JSON object (no markdown fences, no explanation):
|
|
@@ -83,6 +149,14 @@ Return ONLY a JSON object (no markdown fences, no explanation):
|
|
|
83
149
|
|
|
84
150
|
The prompt should be **under 2K tokens** (excluding files the sub-agent reads itself via the Read tool). Keep it focused — the sub-agent will read project files as needed during implementation.
|
|
85
151
|
|
|
152
|
+
### Wave Mode Context Additions
|
|
153
|
+
|
|
154
|
+
When spawning sub-agents within a wave, the context template is **identical** to sequential mode with one key difference:
|
|
155
|
+
|
|
156
|
+
- **`Files Modified in Previous Phases`**: Include files from ALL completed waves (Wave 1 through Wave N-1), not just the immediately preceding phase. This gives each sub-agent awareness of everything that changed before the current wave.
|
|
157
|
+
|
|
158
|
+
Sub-agents within the same wave do NOT receive information about each other — no cross-phase awareness. Each sub-agent operates as if it is the only phase running.
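One way to build the `Files Modified in Previous Phases` list for a wave is to union the file lists of every phase in the completed waves. The sketch below assumes created files are included alongside modified ones, as in the coordinator's running file list; `PhaseFiles` and `filesFromCompletedWaves` are hypothetical names.

```typescript
// Hypothetical helper; phases of the CURRENT wave are deliberately excluded,
// since sub-agents in the same wave get no information about each other.
interface PhaseFiles {
  files_created?: string[];
  files_modified?: string[];
}

function filesFromCompletedWaves(completedWaves: PhaseFiles[][]): string[] {
  const seen: string[] = [];
  for (const wave of completedWaves) {
    for (const phase of wave) {
      const files = (phase.files_created ?? []).concat(phase.files_modified ?? []);
      for (const file of files) {
        // De-duplicate while preserving first-seen order.
        if (seen.indexOf(file) === -1) seen.push(file);
      }
    }
  }
  return seen;
}
```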
|
|
159
|
+
|
|
86
160
|
---
|
|
87
161
|
|
|
88
162
|
## Return Format Schema
|
|
@@ -115,6 +189,38 @@ The sub-agent must return a JSON object with this structure:
|
|
|
115
189
|
"description": "All Zod schemas live next to their type definitions",
|
|
116
190
|
"confidence": "high"
|
|
117
191
|
}
|
|
192
|
+
],
|
|
193
|
+
"task_verifications": [
|
|
194
|
+
{
|
|
195
|
+
"task": "Create user authentication middleware",
|
|
196
|
+
"verify_command": "npx tsc --noEmit src/middleware/auth.ts",
|
|
197
|
+
"status": "pass",
|
|
198
|
+
"attempts": 2,
|
|
199
|
+
"repairs_applied": [
|
|
200
|
+
"Added missing import for AuthMiddleware type"
|
|
201
|
+
]
|
|
202
|
+
},
|
|
203
|
+
{
|
|
204
|
+
"task": "Add rate limiting to API routes",
|
|
205
|
+
"verify_command": "npx jest src/middleware/__tests__/rate-limit.test.ts --no-coverage",
|
|
206
|
+
"status": "pass",
|
|
207
|
+
"attempts": 1,
|
|
208
|
+
"repairs_applied": []
|
|
209
|
+
}
|
|
210
|
+
],
|
|
211
|
+
"tasks_completed": [
|
|
212
|
+
{
|
|
213
|
+
"task_number": 1,
|
|
214
|
+
"task_name": "Create user authentication middleware",
|
|
215
|
+
"files_created": ["src/middleware/auth.ts"],
|
|
216
|
+
"files_modified": []
|
|
217
|
+
},
|
|
218
|
+
{
|
|
219
|
+
"task_number": 2,
|
|
220
|
+
"task_name": "Add rate limiting to API routes",
|
|
221
|
+
"files_created": ["src/middleware/rate-limit.ts"],
|
|
222
|
+
"files_modified": ["src/api/routes.ts"]
|
|
223
|
+
}
|
|
118
224
|
]
|
|
119
225
|
}
|
|
120
226
|
```
|
|
@@ -132,6 +238,8 @@ The sub-agent must return a JSON object with this structure:
|
|
|
132
238
|
| `deviations` | string[] | No | Tasks skipped or changed from plan |
|
|
133
239
|
| `errors` | string[] | No | Errors encountered (even if resolved) |
|
|
134
240
|
| `patterns_captured` | object[] | No | Patterns observed during implementation |
|
|
241
|
+
| `task_verifications` | object[] | No | Array of per-task verification results. Only present when at least one task had a `<verify>` tag. Each entry contains: `task` (string), `verify_command` (string), `status` (`"pass" \| "fail"`), `attempts` (number), `repairs_applied` (string[]), and optionally `last_diagnosis` (object, only when status is `"fail"`). See `.claude/resources/core/per-task-verification.md` for full schema. |
|
|
242
|
+
| `tasks_completed` | object[] | No | Array of per-task file tracking for atomic commits. Each entry: `task_number` (number, 1-indexed within phase), `task_name` (string), `files_created` (string[]), `files_modified` (string[]). Present when any tasks ran. Used by coordinator for per-task commit messages. See `.claude/resources/core/atomic-commits.md` for full schema. |
|
|
135
243
|
|
|
136
244
|
### Failure Return Example
|
|
137
245
|
|
|
@@ -163,9 +271,15 @@ After receiving the sub-agent's JSON summary, the coordinator:
|
|
|
163
271
|
1. **Update plan file**: Mark all phase tasks as `[x]`
|
|
164
272
|
2. **Accumulate file list**: Merge `files_created` and `files_modified` into running list
|
|
165
273
|
3. **Buffer patterns**: Append `patterns_captured` entries to `flow/resources/pending-patterns.md`
|
|
166
|
-
4. **Git commit**: If `commit: true
|
|
274
|
+
4. **Git commit (per-task)**: If `commit: true` and `tasks_completed` is present:
|
|
275
|
+
- **Sequential mode**: Sub-agent already created per-task commits — verify they exist, do NOT create phase commit
|
|
276
|
+
- **Wave mode**: Coordinator iterates `tasks_completed` in task_number order and creates per-task commits:
|
|
277
|
+
- For each task: `git add -A && git commit -m "feat(phase-N.task-M): <truncated task_name> — <feature>"`
|
|
278
|
+
- Truncate `task_name` to 50 chars if needed
|
|
279
|
+
- **Fallback**: If `tasks_completed` is absent (legacy sub-agent), fall back to single phase commit: `git add -A && git commit -m "Phase N: {name} — {feature}"`
|
|
167
280
|
5. **Log decisions**: Include `decisions` in phase completion message
|
|
168
|
-
6. **
|
|
281
|
+
6. **Display verification results**: If `task_verifications` is present, show pass/fail counts and any repairs applied
|
|
282
|
+
7. **Proceed**: Move to next phase
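The commit step above can be sketched as a function that only builds commit messages. This is a rough illustration under stated assumptions: the function name and parameters are hypothetical, truncation is done with a plain 50-character slice (the source does not say whether an ellipsis is added), and the separator is the dash shown in the format strings above.

```python
from typing import Dict, List, Optional

SEPARATOR = "\u2014"  # dash used in the commit formats quoted in the source
MAX_TASK_NAME = 50    # truncation limit stated in the source


def commit_messages(phase: int, phase_name: str, feature: str,
                    tasks_completed: Optional[List[Dict]]) -> List[str]:
    """Build per-task commit messages, or one phase message as a fallback."""
    if tasks_completed is None:
        # Legacy sub-agent: no per-task data, so use a single phase commit.
        return [f"Phase {phase}: {phase_name} {SEPARATOR} {feature}"]
    ordered = sorted(tasks_completed, key=lambda task: task["task_number"])
    return [
        f"feat(phase-{phase}.task-{task['task_number']}): "
        f"{task['task_name'][:MAX_TASK_NAME]} {SEPARATOR} {feature}"
        for task in ordered
    ]
```

Keeping message construction separate from the `git` invocation makes the ordering and truncation rules easy to check without touching a repository.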
|
|
169
283
|
|
|
170
284
|
### On Failure (`status: "failure"`)
|
|
171
285
|
|
|
@@ -178,7 +292,49 @@ After receiving the sub-agent's JSON summary, the coordinator:
|
|
|
178
292
|
|
|
179
293
|
1. **Present summary**: Show what was completed and what wasn't
|
|
180
294
|
2. **Show deviations**: List `deviations` explaining what was skipped
|
|
181
|
-
3. **
|
|
295
|
+
3. **Display verification failures**: If `task_verifications` contains failed entries, show task name, last diagnosis, and repair attempts
|
|
296
|
+
4. **Ask user**: "Phase partially complete. Continue to next phase or retry remaining tasks?"
|
|
297
|
+
|
|
298
|
+
---
|
|
299
|
+
|
|
300
|
+
## Wave Coordinator Processing
|
|
301
|
+
|
|
302
|
+
When multiple sub-agents return simultaneously from a wave, the coordinator handles them differently from sequential mode. See `.claude/resources/core/wave-execution.md` for the full wave system and `.claude/resources/skills/execute-plan-skill.md` Step 4c for the detailed processing flow.
|
|
303
|
+
|
|
304
|
+
**Per-task commits in wave mode**: After collecting all JSON returns from a wave, the coordinator commits per-task (not per-phase). For each phase in phase-number order, iterate `tasks_completed` and create atomic commits: `feat(phase-N.task-M): <desc> — <feature>`. See `.claude/resources/core/atomic-commits.md` for the complete commit format and coordinator processing rules.
|
|
305
|
+
|
|
306
|
+
### Collecting Multiple JSON Returns
|
|
307
|
+
|
|
308
|
+
After all sub-agents in a wave complete, the coordinator collects all JSON returns before processing any of them. This allows file conflict detection before committing.
|
|
309
|
+
|
|
310
|
+
### Processing Order
|
|
311
|
+
|
|
312
|
+
Results are always processed **sequentially in phase number order**, regardless of which sub-agent finished first. This ensures:
|
|
313
|
+
- Deterministic commit history (Phase A committed before Phase B)
|
|
314
|
+
- Predictable plan file updates
|
|
315
|
+
- Consistent file accumulation order
|
|
316
|
+
|
|
317
|
+
### File Conflict Detection
|
|
318
|
+
|
|
319
|
+
After collecting all wave results, check for `files_modified` overlap between phases:
|
|
320
|
+
|
|
321
|
+
```
|
|
322
|
+
For each pair of phases (A, B) in the wave:
|
|
323
|
+
overlap = A.files_modified ∩ B.files_modified
|
|
324
|
+
if overlap is not empty:
|
|
325
|
+
→ File conflict detected
|
|
326
|
+
```
|
|
327
|
+
|
|
328
|
+
**On conflict**: Present the overlapping files to the user with three options: accept as-is, re-run the conflicting phases sequentially, or stop. Never resolve conflicts silently.
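The pairwise overlap check can be written as a short runnable sketch. The function name and the tuple layout of the result are assumptions made for illustration; only the `phase` and `files_modified` fields come from the return schema described in this file.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def find_file_conflicts(wave_results: List[Dict]) -> List[Tuple[int, int, List[str]]]:
    """Return (phase_a, phase_b, overlapping_files) for each conflicting pair."""
    # Sort first so conflicts are reported in phase-number order.
    ordered = sorted(wave_results, key=lambda result: result["phase"])
    conflicts = []
    for a, b in combinations(ordered, 2):
        overlap = set(a.get("files_modified", [])) & set(b.get("files_modified", []))
        if overlap:
            conflicts.append((a["phase"], b["phase"], sorted(overlap)))
    return conflicts
```

The function only detects and reports overlaps; deciding what to do with them is left to the user, as the rule above requires.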
|
|
329
|
+
|
|
330
|
+
### Wave Failure Isolation
|
|
331
|
+
|
|
332
|
+
A failed phase in a wave does NOT affect other phases in the same wave:
|
|
333
|
+
- Successful phases are processed normally (plan updates, file accumulation, git commits)
|
|
334
|
+
- Failed phases are presented to the user after all successful phases are processed
|
|
335
|
+
- The user chooses per failed phase: retry, skip, or stop
|
|
336
|
+
|
|
337
|
+
This differs from sequential mode where a failure immediately pauses execution. In wave mode, all parallel phases complete independently before any failure handling.
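One way to picture failure isolation is to partition the collected returns before any handling starts. The sketch below assumes that a failed phase carries `status: "failure"` (as in the heading earlier in this file) and treats every other status value as processable; the function name and the `"success"` value in the usage note are hypothetical.

```python
from typing import Dict, List, Tuple


def split_wave_results(wave_results: List[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Split collected wave returns into (processable, failed), by phase order."""
    ordered = sorted(wave_results, key=lambda result: result["phase"])
    processable = [r for r in ordered if r["status"] != "failure"]
    failed = [r for r in ordered if r["status"] == "failure"]
    return processable, failed
```

For example, given returns for phases 3 (`"success"`), 1 (`"failure"`), and 2 (`"success"`), the coordinator would process phases 2 and 3 first and only then ask the user about phase 1.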
|
|
182
338
|
|
|
183
339
|
---
|
|
184
340
|
|
|
@@ -186,6 +342,8 @@ After receiving the sub-agent's JSON summary, the coordinator:
|
|
|
186
342
|
|
|
187
343
|
When phases are aggregated (combined complexity ≤ 6), they run as **one sub-agent call** with all tasks from all aggregated phases. The context template lists all phases and tasks together. The return uses the highest phase number as the `phase` field.
|
|
188
344
|
|
|
345
|
+
In wave mode, aggregated phases within the same wave are treated as a **single unit** — they share one wave slot, one dependency set (union of all aggregated phases' dependencies), and one sub-agent call.
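The "single unit" rule can be sketched as follows. The function name, the dictionary layout, and the choice to drop dependencies that point at members of the same aggregated group are assumptions; the source only states that the unit uses the union of the members' dependencies and reports the highest phase number as `phase`.

```python
from typing import Dict, List


def aggregate_unit(phases: List[int], dependencies: Dict[int, List[int]]) -> Dict:
    """Combine aggregated phases into one wave unit with a merged dependency set."""
    members = sorted(phases)
    merged = set()
    for phase in members:
        merged |= set(dependencies.get(phase, []))
    # Assumption: dependencies between members of the same unit are internal
    # to the single sub-agent call, so they are not external dependencies.
    merged -= set(members)
    return {
        "phase": max(members),  # highest phase number, per the rule above
        "phases": members,
        "depends_on": sorted(merged),
    }
```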
|
|
346
|
+
|
|
189
347
|
---
|
|
190
348
|
|
|
191
349
|
## Configuration
|
|
@@ -210,6 +368,19 @@ Phase isolation **enhances** model routing — it doesn't replace it:
|
|
|
210
368
|
| `model_routing: false` + `phase_isolation: true` | Sub-agent spawned with session model, clean context |
|
|
211
369
|
| `model_routing: false` + `phase_isolation: false` | Inline execution, no sub-agents (original behavior) |
|
|
212
370
|
|
|
371
|
+
### Interaction with Wave Execution
|
|
372
|
+
|
|
373
|
+
Phase isolation is the **foundation** for wave execution — wave mode spawns multiple isolated sub-agents per wave instead of one at a time:
|
|
374
|
+
|
|
375
|
+
| Setting | Behavior |
|
|
376
|
+
|---------|----------|
|
|
377
|
+
| `wave_execution: true` + `phase_isolation: true` | Multiple sub-agents per wave, each with clean context (optimal) |
|
|
378
|
+
| `wave_execution: true` + `phase_isolation: false` | Multiple sub-agents per wave, but sharing session context (may cause interference) |
|
|
379
|
+
| `wave_execution: false` + `phase_isolation: true` | One sub-agent at a time, clean context (existing behavior) |
|
|
380
|
+
| `wave_execution: false` + `phase_isolation: false` | Inline execution, no sub-agents (original behavior) |
|
|
381
|
+
|
|
382
|
+
**Recommendation**: `wave_execution: true` works best with `phase_isolation: true`. Without phase isolation, parallel sub-agents may interfere with each other's context.
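The settings table can be expressed as a lookup, which makes the four combinations easy to check programmatically. The names `BEHAVIOR`, `describe_mode`, and `follows_recommendation` are hypothetical; the behavior strings are copied from the table above.

```python
BEHAVIOR = {
    # (wave_execution, phase_isolation) -> behavior from the table
    (True, True): "Multiple sub-agents per wave, each with clean context (optimal)",
    (True, False): "Multiple sub-agents per wave, but sharing session context (may cause interference)",
    (False, True): "One sub-agent at a time, clean context (existing behavior)",
    (False, False): "Inline execution, no sub-agents (original behavior)",
}


def describe_mode(wave_execution: bool, phase_isolation: bool) -> str:
    """Look up the documented behavior for a pair of settings."""
    return BEHAVIOR[(wave_execution, phase_isolation)]


def follows_recommendation(wave_execution: bool, phase_isolation: bool) -> bool:
    """False only for wave execution without phase isolation."""
    return phase_isolation or not wave_execution
```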
|
|
383
|
+
|
|
213
384
|
---
|
|
214
385
|
|
|
215
386
|
## Rules
|
|
@@ -220,3 +391,20 @@ Phase isolation **enhances** model routing — it doesn't replace it:
|
|
|
220
391
|
4. **Coordinator validates** — check status field before proceeding
|
|
221
392
|
5. **Never auto-retry** — on failure, present to user and ask
|
|
222
393
|
6. **Pass paths, not content** — give file paths, sub-agent reads them
|
|
394
|
+
7. **Each phase gets its own sub-agent** — even in wave mode, phases are never merged into one sub-agent (except for aggregated phases per complexity rules)
|
|
395
|
+
8. **No intra-wave awareness** — sub-agents in the same wave know nothing about each other
|
|
396
|
+
9. **Deterministic processing** — wave results are always processed in phase number order
|
|
397
|
+
10. **Collect before commit** — in wave mode, all JSON returns are collected before any commits happen
|
|
398
|
+
11. **Verification is internal** — per-task verification loops run inside the phase sub-agent; the coordinator sees only the final `task_verifications` results
|
|
399
|
+
|
|
400
|
+
---
|
|
401
|
+
|
|
402
|
+
## Related Files
|
|
403
|
+
|
|
404
|
+
| File | Purpose |
|
|
405
|
+
|------|---------|
|
|
406
|
+
| `.claude/resources/core/wave-execution.md` | Full wave-based parallel execution system |
|
|
407
|
+
| `.claude/resources/core/model-routing.md` | Model tier selection per phase complexity |
|
|
408
|
+
| `.claude/resources/core/discovery-sub-agents.md` | Parallel spawning pattern reference |
|
|
409
|
+
| `.claude/resources/core/per-task-verification.md` | Per-task verification system, debug sub-agent, and repair loops |
|
|
410
|
+
| `.claude/resources/skills/execute-plan-skill.md` | Execute-plan skill with wave integration (Steps 2b, 3, 4) |
|
|
@@ -103,3 +103,4 @@ After compaction, the model should re-read `flow/.scratchpad.md` to restore any
|
|
|
103
103
|
3. **Self-manage size** — promote or discard when approaching 50 lines
|
|
104
104
|
4. **Promote before ending** — always scan for promotable items before session ends
|
|
105
105
|
5. **Not a task list** — use `flow/tasklist.md` for tasks, scratchpad is for observations
|
|
106
|
+
6. **Different from STATE.md** — `flow/STATE.md` tracks structured execution position (current skill, phase, status) for session resumability. The scratchpad tracks informal observations, insights, and open questions. They coexist and serve different purposes: STATE.md is machine-readable execution state, scratchpad is human-readable notes.
|