planflow-ai 1.3.5 → 1.4.3
- package/.claude/commands/brainstorm.md +6 -14
- package/.claude/commands/create-contract.md +6 -38
- package/.claude/commands/create-plan.md +18 -38
- package/.claude/commands/discovery-plan.md +19 -49
- package/.claude/commands/execute-plan.md +116 -77
- package/.claude/commands/flow.md +27 -2
- package/.claude/commands/heartbeat.md +12 -14
- package/.claude/commands/learn.md +20 -80
- package/.claude/commands/note.md +6 -23
- package/.claude/commands/resume-work.md +261 -0
- package/.claude/commands/review-code.md +19 -51
- package/.claude/commands/review-pr.md +21 -57
- package/.claude/commands/setup.md +8 -56
- package/.claude/commands/write-tests.md +7 -41
- package/.claude/resources/core/atomic-commits.md +380 -0
- package/.claude/resources/core/autopilot-mode.md +3 -2
- package/.claude/resources/core/compaction-guide.md +15 -1
- package/.claude/resources/core/heartbeat.md +129 -1
- package/.claude/resources/core/per-task-verification.md +362 -0
- package/.claude/resources/core/phase-isolation.md +237 -4
- package/.claude/resources/core/session-scratchpad.md +1 -0
- package/.claude/resources/core/shared-context.md +110 -0
- package/.claude/resources/core/wave-execution.md +407 -0
- package/.claude/resources/patterns/plans-patterns.md +56 -0
- package/.claude/resources/patterns/plans-templates.md +152 -0
- package/.claude/resources/skills/create-plan-skill.md +71 -5
- package/.claude/resources/skills/execute-plan-skill.md +420 -14
- package/.claude/resources/skills/resume-work-skill.md +159 -0
- package/README.md +154 -96
- package/dist/cli/commands/brain.d.ts +20 -0
- package/dist/cli/commands/brain.d.ts.map +1 -0
- package/dist/cli/commands/brain.js +127 -0
- package/dist/cli/commands/brain.js.map +1 -0
- package/dist/cli/commands/init.d.ts.map +1 -1
- package/dist/cli/commands/init.js +10 -1
- package/dist/cli/commands/init.js.map +1 -1
- package/dist/cli/commands/state-query.d.ts +13 -0
- package/dist/cli/commands/state-query.d.ts.map +1 -0
- package/dist/cli/commands/state-query.js +32 -0
- package/dist/cli/commands/state-query.js.map +1 -0
- package/dist/cli/commands/state.d.ts +12 -0
- package/dist/cli/commands/state.d.ts.map +1 -0
- package/dist/cli/commands/state.js +47 -0
- package/dist/cli/commands/state.js.map +1 -0
- package/dist/cli/daemon/desktop-notifier.d.ts +16 -0
- package/dist/cli/daemon/desktop-notifier.d.ts.map +1 -0
- package/dist/cli/daemon/desktop-notifier.js +53 -0
- package/dist/cli/daemon/desktop-notifier.js.map +1 -0
- package/dist/cli/daemon/event-writer.d.ts +22 -0
- package/dist/cli/daemon/event-writer.d.ts.map +1 -0
- package/dist/cli/daemon/event-writer.js +76 -0
- package/dist/cli/daemon/event-writer.js.map +1 -0
- package/dist/cli/daemon/heartbeat-daemon.js +81 -1
- package/dist/cli/daemon/heartbeat-daemon.js.map +1 -1
- package/dist/cli/daemon/log-writer.d.ts +17 -0
- package/dist/cli/daemon/log-writer.d.ts.map +1 -0
- package/dist/cli/daemon/log-writer.js +62 -0
- package/dist/cli/daemon/log-writer.js.map +1 -0
- package/dist/cli/daemon/notification-router.d.ts +17 -0
- package/dist/cli/daemon/notification-router.d.ts.map +1 -0
- package/dist/cli/daemon/notification-router.js +35 -0
- package/dist/cli/daemon/notification-router.js.map +1 -0
- package/dist/cli/daemon/prompt-manager.d.ts +27 -0
- package/dist/cli/daemon/prompt-manager.d.ts.map +1 -0
- package/dist/cli/daemon/prompt-manager.js +107 -0
- package/dist/cli/daemon/prompt-manager.js.map +1 -0
- package/dist/cli/daemon/shared-context.d.ts +38 -0
- package/dist/cli/daemon/shared-context.d.ts.map +1 -0
- package/dist/cli/daemon/shared-context.js +129 -0
- package/dist/cli/daemon/shared-context.js.map +1 -0
- package/dist/cli/handlers/claude.d.ts.map +1 -1
- package/dist/cli/handlers/claude.js +18 -0
- package/dist/cli/handlers/claude.js.map +1 -1
- package/dist/cli/handlers/shared.js +1 -1
- package/dist/cli/handlers/shared.js.map +1 -1
- package/dist/cli/index.js +30 -0
- package/dist/cli/index.js.map +1 -1
- package/dist/cli/state/brain-query.d.ts +48 -0
- package/dist/cli/state/brain-query.d.ts.map +1 -0
- package/dist/cli/state/brain-query.js +113 -0
- package/dist/cli/state/brain-query.js.map +1 -0
- package/dist/cli/state/flowconfig-parser.d.ts +16 -0
- package/dist/cli/state/flowconfig-parser.d.ts.map +1 -0
- package/dist/cli/state/flowconfig-parser.js +166 -0
- package/dist/cli/state/flowconfig-parser.js.map +1 -0
- package/dist/cli/state/heartbeat-state.d.ts +16 -0
- package/dist/cli/state/heartbeat-state.d.ts.map +1 -0
- package/dist/cli/state/heartbeat-state.js +97 -0
- package/dist/cli/state/heartbeat-state.js.map +1 -0
- package/dist/cli/state/model-router.d.ts +8 -0
- package/dist/cli/state/model-router.d.ts.map +1 -0
- package/dist/cli/state/model-router.js +36 -0
- package/dist/cli/state/model-router.js.map +1 -0
- package/dist/cli/state/plan-parser.d.ts +16 -0
- package/dist/cli/state/plan-parser.d.ts.map +1 -0
- package/dist/cli/state/plan-parser.js +124 -0
- package/dist/cli/state/plan-parser.js.map +1 -0
- package/dist/cli/state/session-state.d.ts +21 -0
- package/dist/cli/state/session-state.d.ts.map +1 -0
- package/dist/cli/state/session-state.js +36 -0
- package/dist/cli/state/session-state.js.map +1 -0
- package/dist/cli/state/state-md-parser.d.ts +18 -0
- package/dist/cli/state/state-md-parser.d.ts.map +1 -0
- package/dist/cli/state/state-md-parser.js +222 -0
- package/dist/cli/state/state-md-parser.js.map +1 -0
- package/dist/cli/state/types.d.ts +137 -0
- package/dist/cli/state/types.d.ts.map +1 -0
- package/dist/cli/state/types.js +8 -0
- package/dist/cli/state/types.js.map +1 -0
- package/dist/cli/state/wave-calculator.d.ts +18 -0
- package/dist/cli/state/wave-calculator.d.ts.map +1 -0
- package/dist/cli/state/wave-calculator.js +134 -0
- package/dist/cli/state/wave-calculator.js.map +1 -0
- package/dist/cli/types.d.ts +15 -0
- package/dist/cli/types.d.ts.map +1 -1
- package/package.json +7 -2
- package/templates/shared/CLAUDE.md.template +4 -0
- package/.claude/resources/core/_index.md +0 -304
- package/.claude/resources/languages/_index.md +0 -76
- package/.claude/resources/patterns/_index.md +0 -200
- package/.claude/resources/skills/_index.md +0 -285
- package/.claude/resources/tools/_index.md +0 -110
- package/.claude/resources/tools/reference-expansion-tool.md +0 -326
|
@@ -205,9 +205,77 @@ The project log (`flow/log.md`) is also linked into the vault at `~/plan-flow/br
|
|
|
205
205
|
|
|
206
206
|
---
|
|
207
207
|
|
|
208
|
+
## Notification System
|
|
209
|
+
|
|
210
|
+
The heartbeat daemon includes a notification system that provides visibility into background task execution.
|
|
211
|
+
|
|
212
|
+
### Notification Channels
|
|
213
|
+
|
|
214
|
+
| Channel | What Gets Written | When |
|
|
215
|
+
|---------|------------------|------|
|
|
216
|
+
| `flow/log.md` | Timestamped one-liners | All events (start, complete, fail, blocked) |
|
|
217
|
+
| `flow/.heartbeat-events.jsonl` | Machine-readable JSON events | All events |
|
|
218
|
+
| Desktop notification (node-notifier) | Pop-up alert | Failures and blocked tasks only |
|
|
219
|
+
|
|
220
|
+
### Log Format (flow/log.md)
|
|
221
|
+
|
|
222
|
+
```
|
|
223
|
+
[2026-03-17 14:00] ✅ task-name: Started — description
|
|
224
|
+
[2026-03-17 14:12] ✅ task-name: Phase 1/5 complete — description
|
|
225
|
+
[2026-03-17 14:35] ❌ task-name: FAILED at Phase 3 — build error
|
|
226
|
+
[2026-03-17 14:35] ⏸️ task-name: Paused — see .heartbeat-prompt.md
|
|
227
|
+
```
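The log lines above follow a fixed `[timestamp] icon task: message` shape. A minimal sketch of a formatter for that shape is shown here; `formatLogLine` and `pad2` are hypothetical helper names, and rendering the timestamp in UTC is an assumption, since the source does not name a time zone.

```typescript
// Hypothetical helper: left-pads a number to two digits.
function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

// Hypothetical helper: builds one flow/log.md entry in the shape shown above.
// Assumption: the timestamp is rendered in UTC.
function formatLogLine(when: Date, icon: string, task: string, message: string): string {
  const stamp =
    when.getUTCFullYear() + "-" + pad2(when.getUTCMonth() + 1) + "-" + pad2(when.getUTCDate()) +
    " " + pad2(when.getUTCHours()) + ":" + pad2(when.getUTCMinutes());
  return "[" + stamp + "] " + icon + " " + task + ": " + message;
}
```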
|
|
228
|
+
|
|
229
|
+
### Event Types
|
|
230
|
+
|
|
231
|
+
| Type | Level | Desktop Notify |
|
|
232
|
+
|------|-------|---------------|
|
|
233
|
+
| `task_started` | info | No |
|
|
234
|
+
| `phase_complete` | info | No |
|
|
235
|
+
| `task_complete` | info | No |
|
|
236
|
+
| `task_failed` | error | Yes |
|
|
237
|
+
| `task_blocked` | error | Yes |
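Read together, the channel and event-type tables say that every event goes to the log and the JSONL stream, while only error-level events also raise a desktop notification. The sketch below encodes that rule; the type and function names (`HeartbeatEventType`, `EventRoute`, `routeEvent`) are illustrative assumptions, not the package's actual API.

```typescript
type HeartbeatEventType =
  | "task_started"
  | "phase_complete"
  | "task_complete"
  | "task_failed"
  | "task_blocked";
type EventLevel = "info" | "error";
type NotificationChannel = "log" | "events" | "desktop";

interface EventRoute {
  level: EventLevel;
  channels: NotificationChannel[];
}

// Hypothetical router: maps an event type to its level and target channels.
function routeEvent(type: HeartbeatEventType): EventRoute {
  const isError = type === "task_failed" || type === "task_blocked";
  // flow/log.md and flow/.heartbeat-events.jsonl receive all events.
  const channels: NotificationChannel[] = ["log", "events"];
  // Desktop notifications are reserved for failures and blocked tasks.
  if (isError) channels.push("desktop");
  return { level: isError ? "error" : "info", channels: channels };
}
```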
|
|
238
|
+
|
|
239
|
+
### Exit Code Convention
|
|
240
|
+
|
|
241
|
+
| Exit Code | Meaning | Daemon Action |
|
|
242
|
+
|-----------|---------|---------------|
|
|
243
|
+
| 0 | Success | Log ✅, continue |
|
|
244
|
+
| 1 | Failure | Log ❌, desktop notify |
|
|
245
|
+
| 2 | Needs input | Log ⏸️, write prompt file (if autopilot OFF), desktop notify |
|
|
246
|
+
|
|
247
|
+
### Prompt File System
|
|
248
|
+
|
|
249
|
+
When a task exits with code 2 and autopilot is OFF:
|
|
250
|
+
1. Daemon writes `flow/.heartbeat-prompt.md` with task context
|
|
251
|
+
2. Desktop notification alerts the user
|
|
252
|
+
3. Task pauses until user responds
|
|
253
|
+
4. On next session start, the prompt is auto-detected and presented
|
|
254
|
+
5. After resolution, prompt is archived to `flow/archive/heartbeat-prompts/`
|
|
255
|
+
|
|
256
|
+
When autopilot is ON: exit code 2 is logged as a warning but execution continues.
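The exit code table and the autopilot rule can be combined into one small decision function. This is a sketch under stated assumptions: `DaemonAction` and `actionForExitCode` are hypothetical names, exit codes other than 0, 1, and 2 are treated like 1, and a desktop notification is still sent for exit code 2 when autopilot is ON because the table lists it unconditionally.

```typescript
interface DaemonAction {
  logIcon: string;
  desktopNotify: boolean;
  writePromptFile: boolean;
  pauseForInput: boolean;
}

// Hypothetical decision function for the exit code convention above.
function actionForExitCode(exitCode: number, autopilot: boolean): DaemonAction {
  if (exitCode === 0) {
    return { logIcon: "✅", desktopNotify: false, writePromptFile: false, pauseForInput: false };
  }
  if (exitCode === 2) {
    // With autopilot ON the event is only logged and execution continues,
    // so no prompt file is written and the task does not pause.
    return { logIcon: "⏸️", desktopNotify: true, writePromptFile: !autopilot, pauseForInput: !autopilot };
  }
  // Assumption: any other non-zero code is handled like exit code 1.
  return { logIcon: "❌", desktopNotify: true, writePromptFile: false, pauseForInput: false };
}
```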
|
|
257
|
+
|
|
258
|
+
### Session Start Integration
|
|
259
|
+
|
|
260
|
+
Two behaviors activate on session start:
|
|
261
|
+
1. **Heartbeat Log**: Reads `.heartbeat-events.jsonl`, compares against `.heartbeat-state.json` timestamp, summarizes unread events
|
|
262
|
+
2. **Heartbeat Prompt**: Detects `.heartbeat-prompt.md` and presents pending questions immediately
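The "Heartbeat Log" behavior amounts to filtering the JSONL stream by the last-read timestamp. A rough sketch follows; the event field names (`timestamp`, `type`, `task`) and the ISO 8601 timestamp format are assumptions, since the source does not show the event schema.

```typescript
// Assumed event shape; the real schema is not shown in the source.
interface HeartbeatEvent {
  timestamp: string;
  type: string;
  task: string;
}

// Parses one JSONL line, returning null for blank or malformed lines.
function parseEventLine(line: string): HeartbeatEvent | null {
  const trimmed = line.trim();
  if (trimmed === "") return null;
  try {
    const value = JSON.parse(trimmed);
    if (value === null || typeof value !== "object") return null;
    if (typeof value.timestamp !== "string") return null;
    return value as HeartbeatEvent;
  } catch (e) {
    return null;
  }
}

// Returns the events recorded strictly after the last-read timestamp.
function unreadEvents(jsonl: string, lastReadIso: string): HeartbeatEvent[] {
  const lastRead = Date.parse(lastReadIso);
  const unread: HeartbeatEvent[] = [];
  const lines = jsonl.split("\n");
  for (let i = 0; i < lines.length; i++) {
    const parsed = parseEventLine(lines[i]);
    if (parsed !== null && Date.parse(parsed.timestamp) > lastRead) {
      unread.push(parsed);
    }
  }
  return unread;
}
```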
|
|
263
|
+
|
|
264
|
+
### Runtime Files
|
|
265
|
+
|
|
266
|
+
| File | Purpose |
|
|
267
|
+
|------|---------|
|
|
268
|
+
| `flow/.heartbeat-events.jsonl` | Append-only event stream (one JSON per line) |
|
|
269
|
+
| `flow/.heartbeat-state.json` | Tracks last-read timestamp for session start summaries |
|
|
270
|
+
| `flow/.heartbeat-prompt.md` | Pending user input from blocked task (temporary) |
|
|
271
|
+
| `flow/archive/heartbeat-prompts/` | Archived resolved prompt files |
|
|
272
|
+
| `flow/.telegram-poll-state.json` | Telegram polling state (update offset, mode) |
|
|
273
|
+
|
|
274
|
+
---
|
|
275
|
+
|
|
208
276
|
## Rules
|
|
209
277
|
|
|
210
|
-
1. **
|
|
278
|
+
1. **Minimal dependencies**: Daemon uses Node.js built-in modules plus `node-notifier` for desktop notifications
|
|
211
279
|
2. **Graceful shutdown**: Always clean up timers and PID file on exit
|
|
212
280
|
3. **Stale PID detection**: Verify PID is actually running before assuming daemon is alive
|
|
213
281
|
4. **File-based config**: All configuration lives in `flow/heartbeat.md` — no CLI flags for task config
|
|
@@ -215,3 +283,63 @@ The project log (`flow/log.md`) is also linked into the vault at `~/plan-flow/br
|
|
|
215
283
|
6. **Log rotation**: Keep `flow/.heartbeat.log` under 1000 lines by truncating oldest entries
|
|
216
284
|
7. **One-shot cleanup**: After a one-shot task executes, auto-disable it and update the linked tasklist item
|
|
217
285
|
8. **Vault sync**: Every heartbeat update MUST also update `~/plan-flow/brain/heartbeat.md` — see Vault Sync section
|
|
286
|
+
|
|
287
|
+
---
|
|
288
|
+
|
|
289
|
+
## Telegram Two-Way Polling
|
|
290
|
+
|
|
291
|
+
When `telegram_bot_token` and `telegram_chat_id` are set in `flow/.flowconfig`, the heartbeat daemon enables two-way messaging with Telegram using the Bot API's `getUpdates` long-polling endpoint.
|
|
292
|
+
|
|
293
|
+
### Adaptive Polling Modes
|
|
294
|
+
|
|
295
|
+
| Mode | Interval | Trigger |
|
|
296
|
+
|------|----------|---------|
|
|
297
|
+
| Idle | 60 seconds | No pending prompts — lightweight keep-alive |
|
|
298
|
+
| Conversation | 3 seconds | A task is blocked and waiting for user input |
|
|
299
|
+
|
|
300
|
+
The daemon switches to **conversation mode** when it sends a prompt to Telegram (e.g., after a task blocks with exit code 2). Once the user replies and the prompt is resolved, it drops back to **idle mode**.

|
|
301
|
+
|
|
302
|
+
### State File
|
|
303
|
+
|
|
304
|
+
Polling state is persisted in `flow/.telegram-poll-state.json`:
|
|
305
|
+
|
|
306
|
+
```json
|
|
307
|
+
{
|
|
308
|
+
"lastUpdateId": 123456789,
|
|
309
|
+
"mode": "idle",
|
|
310
|
+
"pendingPromptMessageId": null
|
|
311
|
+
}
|
|
312
|
+
```
|
|
313
|
+
|
|
314
|
+
- `lastUpdateId`: Tracks the Telegram update offset to avoid processing duplicate messages
|
|
315
|
+
- `mode`: Current polling mode (`idle` or `conversation`)
|
|
316
|
+
- `pendingPromptMessageId`: The Telegram message ID of the last prompt sent, used to match replies
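The Telegram Bot API expects the `offset` passed to `getUpdates` to be one greater than the highest `update_id` already received, which is what `lastUpdateId` makes possible. The sketch below shows that bookkeeping; `PollState` mirrors the JSON above, while `nextOffset` and `recordUpdates` are hypothetical helper names rather than the daemon's real functions.

```typescript
interface PollState {
  lastUpdateId: number;
  mode: "idle" | "conversation";
  pendingPromptMessageId: number | null;
}

// Only the field needed here; real Telegram updates carry more data.
interface TelegramUpdate {
  update_id: number;
}

// Offset for the next getUpdates call: one past the last processed update.
function nextOffset(state: PollState): number {
  return state.lastUpdateId + 1;
}

// Returns a new state whose lastUpdateId covers every update in the batch.
function recordUpdates(state: PollState, updates: TelegramUpdate[]): PollState {
  let highest = state.lastUpdateId;
  for (let i = 0; i < updates.length; i++) {
    if (updates[i].update_id > highest) highest = updates[i].update_id;
  }
  return {
    lastUpdateId: highest,
    mode: state.mode,
    pendingPromptMessageId: state.pendingPromptMessageId
  };
}
```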
|
|
317
|
+
|
|
318
|
+
### Two-Way Conversation Flow
|
|
319
|
+
|
|
320
|
+
1. A heartbeat task blocks (exit code 2) and writes `flow/.heartbeat-prompt.md`
|
|
321
|
+
2. The daemon detects the prompt file and sends it to Telegram via `sendMessage`
|
|
322
|
+
3. Polling switches to **conversation mode** (3s interval)
|
|
323
|
+
4. The user replies in Telegram
|
|
324
|
+
5. The daemon detects the reply via `getUpdates` and writes the response to `flow/.heartbeat-prompt.md`
|
|
325
|
+
6. The blocked task resumes with the user's input
|
|
326
|
+
7. Polling switches back to **idle mode** (60s interval)
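The seven steps above reduce to two state transitions plus an interval lookup. The following sketch models them as pure functions; the names (`onPromptSent`, `onPromptResolved`, `pollIntervalMs`) are hypothetical, and the intervals come from the Adaptive Polling Modes table.

```typescript
type PollMode = "idle" | "conversation";

interface PollingState {
  lastUpdateId: number;
  mode: PollMode;
  pendingPromptMessageId: number | null;
}

// Steps 2 and 3: a prompt was sent, so remember its message ID and poll quickly.
function onPromptSent(state: PollingState, promptMessageId: number): PollingState {
  return {
    lastUpdateId: state.lastUpdateId,
    mode: "conversation",
    pendingPromptMessageId: promptMessageId
  };
}

// Steps 5 to 7: the reply was written back, so clear the prompt and return to idle.
function onPromptResolved(state: PollingState): PollingState {
  return {
    lastUpdateId: state.lastUpdateId,
    mode: "idle",
    pendingPromptMessageId: null
  };
}

// Interval per mode: 60 seconds when idle, 3 seconds in conversation.
function pollIntervalMs(state: PollingState): number {
  return state.mode === "conversation" ? 3000 : 60000;
}
```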
|
|
327
|
+
|
|
328
|
+
### Configuration
|
|
329
|
+
|
|
330
|
+
Set both values in `flow/.flowconfig`:
|
|
331
|
+
|
|
332
|
+
```yaml
|
|
333
|
+
telegram_bot_token: "123456:ABC-DEF1234ghIkl-zyx57W2v1u123ew11"
|
|
334
|
+
telegram_chat_id: "5635356808"
|
|
335
|
+
```
|
|
336
|
+
|
|
337
|
+
Configure via `/flow`:
|
|
338
|
+
|
|
339
|
+
```
|
|
340
|
+
/flow telegram_bot_token=123456:ABC-DEF telegram_chat_id=5635356808
|
|
341
|
+
```
|
|
342
|
+
|
|
343
|
+
### Auto-Migration from webhook_url
|
|
344
|
+
|
|
345
|
+
If `webhook_url` contains a Telegram `api.telegram.org` URL with a bot token and chat_id in the query string, the flowconfig parser automatically extracts and populates `telegram_bot_token` and `telegram_chat_id`. The original `webhook_url` is preserved for one-way notification compatibility.
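As an illustration of the migration, the sketch below pulls both values out of a webhook URL. It assumes the standard Bot API URL shape `https://api.telegram.org/bot<token>/<method>?chat_id=<id>`, with the token in the path; the actual parser in `flowconfig-parser.js` may accept other shapes, and `extractTelegramCredentials` is a hypothetical name.

```typescript
interface TelegramCredentials {
  telegram_bot_token: string;
  telegram_chat_id: string;
}

// Hypothetical extractor; returns null when the URL is not a Telegram URL
// or does not carry a chat_id query parameter.
function extractTelegramCredentials(webhookUrl: string): TelegramCredentials | null {
  const match = /^https:\/\/api\.telegram\.org\/bot([^\/?]+)\/[^?]*\?(.+)$/.exec(webhookUrl);
  if (match === null) return null;
  const pairs = match[2].split("&");
  for (let i = 0; i < pairs.length; i++) {
    const eq = pairs[i].indexOf("=");
    if (eq > 0 && pairs[i].substring(0, eq) === "chat_id") {
      const chatId = decodeURIComponent(pairs[i].substring(eq + 1));
      if (chatId === "") return null;
      return { telegram_bot_token: match[1], telegram_chat_id: chatId };
    }
  }
  return null;
}
```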
|
|
@@ -0,0 +1,362 @@
|
|
|
1
|
+
|
|
2
|
+
# Per-Task Verification
|
|
3
|
+
|
|
4
|
+
## Purpose
|
|
5
|
+
|
|
6
|
+
When a plan phase includes tasks with `<verify>` tags, the phase isolation sub-agent runs **targeted verification immediately after each task completes**. If verification fails, a nested debug sub-agent diagnoses the failure and the implementation sub-agent applies repairs. This catches errors at the task level instead of waiting for the final build+test step.
|
|
7
|
+
|
|
8
|
+
**Core principle**: Verify early, diagnose fast, repair in place.
|
|
9
|
+
|
|
10
|
+
---
|
|
11
|
+
|
|
12
|
+
## Architecture
|
|
13
|
+
|
|
14
|
+
```
|
|
15
|
+
Phase Sub-Agent (isolated)
|
|
16
|
+
│
|
|
17
|
+
├─ Task 1: Implement
|
|
18
|
+
│ ├─ Complete task implementation
|
|
19
|
+
│ ├─ Parse <verify> tag → extract command
|
|
20
|
+
│ ├─ Run verification command
|
|
21
|
+
│ ├─ ✅ Pass → record result, move to Task 2
|
|
22
|
+
│ └─ ❌ Fail → enter verification loop:
|
|
23
|
+
│ │
|
|
24
|
+
│ ├─ Spawn debug sub-agent (haiku):
|
|
25
|
+
│ │ Input: error output + task context + file content
|
|
26
|
+
│ │ Output: JSON diagnosis (root cause, repair actions)
|
|
27
|
+
│ │
|
|
28
|
+
│ ├─ Apply repair actions
|
|
29
|
+
│ ├─ Re-run verification command
|
|
30
|
+
│ ├─ ✅ Pass → record result (with repair info), move to Task 2
|
|
31
|
+
│ ├─ ❌ Fail → retry (up to max_verify_retries)
|
|
32
|
+
│ └─ ❌ Max retries exceeded → record failure, escalate to user
|
|
33
|
+
│
|
|
34
|
+
├─ Task 2: Implement (no <verify> tag → skip verification)
|
|
35
|
+
│
|
|
36
|
+
├─ Task 3: Implement
|
|
37
|
+
│ ├─ Complete task implementation
|
|
38
|
+
│ ├─ Parse <verify> tag → extract command
|
|
39
|
+
│ └─ Run verification → ✅ Pass
|
|
40
|
+
│
|
|
41
|
+
└─ Return JSON (includes task_verifications array)
|
|
42
|
+
```
|
|
43
|
+
|
|
44
|
+
Verification is **internal to the phase sub-agent**. The wave coordinator and main session see only the final JSON return with verification results — they never interact with the verification loop directly.
|
|
45
|
+
|
|
46
|
+
---
|
|
47
|
+
|
|
48
|
+
## Verify Tag Syntax
|
|
49
|
+
|
|
50
|
+
### Declaration in Plans
|
|
51
|
+
|
|
52
|
+
Tasks in a plan phase can include an optional `<verify>` tag indented under the task:
|
|
53
|
+
|
|
54
|
+
```markdown
|
|
55
|
+
### Phase 2: API Integration
|
|
56
|
+
|
|
57
|
+
**Scope**: ...
|
|
58
|
+
**Complexity**: 5/10
|
|
59
|
+
**Dependencies**: Phase 1
|
|
60
|
+
|
|
61
|
+
- [ ] Create user authentication middleware in `src/middleware/auth.ts`
|
|
62
|
+
<verify>npx tsc --noEmit src/middleware/auth.ts</verify>
|
|
63
|
+
- [ ] Add rate limiting to API routes
|
|
64
|
+
<verify>npx jest src/middleware/__tests__/rate-limit.test.ts --no-coverage</verify>
|
|
65
|
+
- [ ] Update configuration constants
|
|
66
|
+
```
|
|
67
|
+
|
|
68
|
+
### Parsing Rules
|
|
69
|
+
|
|
70
|
+
1. **Tag format**: `<verify>COMMAND</verify>` on a single line, indented under a task
|
|
71
|
+
2. **One verify per task**: Only the first `<verify>` tag under a task is used; additional tags are ignored
|
|
72
|
+
3. **Command content**: The text between tags is executed as a shell command by the sub-agent
|
|
73
|
+
4. **No verify = no verification**: Tasks without `<verify>` tags skip verification entirely (backward compatible)
|
|
74
|
+
5. **Whitespace**: Leading/trailing whitespace inside the tag is trimmed
|
|
75
|
+
6. **Nesting**: The `<verify>` tag must be indented under its parent task (2+ spaces or 1+ tab)
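The parsing rules above can be sketched as a line-by-line scan. This is an illustrative parser, not the package's `plan-parser.js`; `PlanTask` and `parseTasks` are hypothetical names, and the sketch attaches a `<verify>` line to the most recent task without checking for intervening non-task lines.

```typescript
interface PlanTask {
  description: string;
  verify: string | null;
}

// Hypothetical parser for the task list of a single plan phase.
function parseTasks(phaseMarkdown: string): PlanTask[] {
  const tasks: PlanTask[] = [];
  const taskPattern = /^- \[[ xX]\] (.+)$/;
  // Rules 1 and 6: single-line tag, indented by 2+ spaces or 1+ tabs.
  const verifyPattern = /^(?: {2,}|\t+)<verify>(.*)<\/verify>\s*$/;
  const lines = phaseMarkdown.split("\n");
  for (let i = 0; i < lines.length; i++) {
    const taskMatch = taskPattern.exec(lines[i]);
    if (taskMatch !== null) {
      // Rule 4: a task starts out with no verification command.
      tasks.push({ description: taskMatch[1].trim(), verify: null });
      continue;
    }
    const verifyMatch = verifyPattern.exec(lines[i]);
    if (verifyMatch !== null && tasks.length > 0) {
      const current = tasks[tasks.length - 1];
      // Rule 2: only the first tag counts. Rule 5: trim inner whitespace.
      if (current.verify === null) current.verify = verifyMatch[1].trim();
    }
  }
  return tasks;
}
```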
|
|
76
|
+
|
|
77
|
+
### Recommended Verification Commands
|
|
78
|
+
|
|
79
|
+
| Task Type | Verify Command | Purpose |
|
|
80
|
+
|-----------|---------------|---------|
|
|
81
|
+
| File creation (TypeScript) | `npx tsc --noEmit <file>` | Type-check the new file |
|
|
82
|
+
| Test writing | `npx jest <test-file> --no-coverage` | Run the specific test |
|
|
83
|
+
| Schema/type changes | `npx tsc --noEmit <type-file>` | Verify type consistency |
|
|
84
|
+
| Config changes | *(no verify)* | Manual review preferred |
|
|
85
|
+
| Documentation | *(no verify)* | No automated check available |
|
|
86
|
+
|
|
87
|
+
**Constraint**: Verification commands must be **targeted** (single file or small scope). Never use full builds (`npm run build`) or full test suites (`npm test`) as verify commands — those run in the final Step 7.
|
|
88
|
+
|
|
89
|
+
---
|
|
90
|
+
|
|
91
|
+
## Debug Sub-Agent
|
|
92
|
+
|
|
93
|
+
### When to Spawn
|
|
94
|
+
|
|
95
|
+
A debug sub-agent is spawned when a verification command returns a **non-zero exit code**. The sub-agent diagnoses the failure and suggests repair actions.
|
|
96
|
+
|
|
97
|
+
### Prompt Template
|
|
98
|
+
|
|
99
|
+
```markdown
|
|
100
|
+
# Debug Diagnosis
|
|
101
|
+
|
|
102
|
+
## Failed Verification
|
|
103
|
+
**Task**: {task description}
|
|
104
|
+
**Command**: {verify command}
|
|
105
|
+
**Exit Code**: {exit code}
|
|
106
|
+
|
|
107
|
+
## Error Output
|
|
108
|
+
```
|
|
109
|
+
{stderr + stdout from the failed command, truncated to 200 lines}
|
|
110
|
+
```
|
|
111
|
+
|
|
112
|
+
## Task Context
|
|
113
|
+
**File**: {primary file being modified}
|
|
114
|
+
**Phase**: {phase name}
|
|
115
|
+
**What was implemented**: {brief description of what the task did}
|
|
116
|
+
|
|
117
|
+
## File Content
|
|
118
|
+
```
|
|
119
|
+
{content of the primary file, truncated to 300 lines}
|
|
120
|
+
```
|
|
121
|
+
|
|
122
|
+
## Instructions
|
|
123
|
+
Analyze the error output and diagnose the root cause. Return a JSON object with your diagnosis and suggested repair actions. Do NOT fix the code — only diagnose.
|
|
124
|
+
|
|
125
|
+
Return ONLY a JSON object (no markdown fences):
|
|
126
|
+
{see Debug Return Schema below}
|
|
127
|
+
```
|
|
128
|
+
|
|
129
|
+
### Debug Return Schema
|
|
130
|
+
|
|
131
|
+
```json
|
|
132
|
+
{
|
|
133
|
+
"root_cause": "Missing import for AuthMiddleware type used on line 15",
|
|
134
|
+
"category": "import_missing",
|
|
135
|
+
"repair_actions": [
|
|
136
|
+
"Add import { AuthMiddleware } from '../types/auth' to src/middleware/auth.ts"
|
|
137
|
+
],
|
|
138
|
+
"confidence": "high",
|
|
139
|
+
"file_to_fix": "src/middleware/auth.ts"
|
|
140
|
+
}
|
|
141
|
+
```
|
|
142
|
+
|
|
143
|
+
### Field Descriptions
|
|
144
|
+
|
|
145
|
+
| Field | Type | Required | Description |
|
|
146
|
+
|-------|------|----------|-------------|
|
|
147
|
+
| `root_cause` | string | Yes | Human-readable description of what went wrong |
|
|
148
|
+
| `category` | string | Yes | Error category: `import_missing`, `type_error`, `syntax_error`, `runtime_error`, `test_failure`, `config_error`, `other` |
|
|
149
|
+
| `repair_actions` | string[] | Yes | Ordered list of specific actions to fix the issue |
|
|
150
|
+
| `confidence` | `"high" \| "medium" \| "low"` | Yes | How confident the diagnosis is |
|
|
151
|
+
| `file_to_fix` | string | Yes | Primary file that needs modification |
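Because the debug sub-agent returns free-form text that is supposed to be JSON, the caller needs to validate it before acting on it. A minimal validator for the schema above might look like this; `DebugDiagnosis` and `parseDiagnosis` are hypothetical names, and returning `null` corresponds to the "Invalid JSON return" case in the error-handling tables.

```typescript
interface DebugDiagnosis {
  root_cause: string;
  category: string;
  repair_actions: string[];
  confidence: "high" | "medium" | "low";
  file_to_fix: string;
}

const DIAGNOSIS_CATEGORIES = [
  "import_missing", "type_error", "syntax_error", "runtime_error",
  "test_failure", "config_error", "other"
];
const DIAGNOSIS_CONFIDENCES = ["high", "medium", "low"];

// Returns the diagnosis when it matches the schema, otherwise null.
function parseDiagnosis(raw: string): DebugDiagnosis | null {
  let value: any;
  try {
    value = JSON.parse(raw);
  } catch (e) {
    return null;
  }
  if (value === null || typeof value !== "object") return null;
  if (typeof value.root_cause !== "string") return null;
  if (DIAGNOSIS_CATEGORIES.indexOf(value.category) < 0) return null;
  if (!Array.isArray(value.repair_actions)) return null;
  for (let i = 0; i < value.repair_actions.length; i++) {
    if (typeof value.repair_actions[i] !== "string") return null;
  }
  if (DIAGNOSIS_CONFIDENCES.indexOf(value.confidence) < 0) return null;
  if (typeof value.file_to_fix !== "string") return null;
  return value as DebugDiagnosis;
}
```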
|
|
152
|
+
|
|
153
|
+
### Sub-Agent Configuration
|
|
154
|
+
|
|
155
|
+
- **Model**: Always uses haiku (fast tier) — diagnosis is a focused, low-complexity task
|
|
156
|
+
- **Mode**: `"auto"`
|
|
157
|
+
- **Read-only**: The debug sub-agent does NOT modify files — it only returns a diagnosis. The implementation sub-agent applies the repairs.
|
|
158
|
+
|
|
159
|
+
---
|
|
160
|
+
|
|
161
|
+
## Verification Loop
|
|
162
|
+
|
|
163
|
+
### Flow
|
|
164
|
+
|
|
165
|
+
```
|
|
166
|
+
1. Complete task implementation
|
|
167
|
+
2. Parse <verify> tag → extract command
|
|
168
|
+
3. Run command
|
|
169
|
+
4. If exit code == 0 → PASS (record result, continue)
|
|
170
|
+
5. If exit code != 0:
|
|
171
|
+
a. Increment retry counter
|
|
172
|
+
b. If retry counter > max_verify_retries → ESCALATE (record failure)
|
|
173
|
+
c. Spawn debug sub-agent with error context
|
|
174
|
+
d. Receive JSON diagnosis
|
|
175
|
+
e. Apply repair actions from diagnosis
|
|
176
|
+
f. Re-run verification command → go to step 4
|
|
177
|
+
```
|
|
178
|
+
|
|
179
|
+
### Retry Behavior
|
|
180
|
+
|
|
181
|
+
- **Retry counter**: Starts at 0, increments on each failed verification attempt
|
|
182
|
+
- **First attempt**: The initial verification run does NOT count as a retry
|
|
183
|
+
- **Max retries**: Controlled by `max_verify_retries` in `.flowconfig` (default: 2)
|
|
184
|
+
- **Example with default**: Initial attempt + 2 retries = 3 total verification runs maximum
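The flow and retry rules can be expressed as a small loop. The sketch below takes the command runner and the diagnose-and-repair step as injected functions so the counting logic stands alone; all names are hypothetical, and the real sub-agent performs these steps through prompts rather than code.

```typescript
interface RunResult {
  exitCode: number;
  output: string;
}

interface LoopResult {
  status: "pass" | "fail";
  attempts: number;
  repairs_applied: string[];
}

// runVerify executes the verify command once; diagnoseAndRepair stands in for
// the debug sub-agent plus the repair step and returns the repairs it applied.
function verifyWithRetries(
  runVerify: () => RunResult,
  diagnoseAndRepair: (errorOutput: string) => string[],
  maxVerifyRetries: number
): LoopResult {
  const repairs: string[] = [];
  let retries = 0;
  let attempts = 1; // the initial run is attempt 1 and is not a retry
  let result = runVerify();
  while (result.exitCode !== 0) {
    retries++;
    if (retries > maxVerifyRetries) {
      return { status: "fail", attempts: attempts, repairs_applied: repairs };
    }
    const applied = diagnoseAndRepair(result.output);
    for (let i = 0; i < applied.length; i++) repairs.push(applied[i]);
    result = runVerify();
    attempts++;
  }
  return { status: "pass", attempts: attempts, repairs_applied: repairs };
}
```

With `maxVerifyRetries` set to 2, a command that never passes is run three times in total before the loop reports `"fail"`, matching the default described above.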
|
|
185
|
+
|
|
186
|
+
### Escalation on Max Retries
|
|
187
|
+
|
|
188
|
+
When max retries are exceeded, the sub-agent:
|
|
189
|
+
|
|
190
|
+
1. Records the verification failure in the `task_verifications` array
|
|
191
|
+
2. Includes the last debug diagnosis in the failure record
|
|
192
|
+
3. Continues to the next task (does NOT abort the phase)
|
|
193
|
+
4. Sets overall phase `status` to `"partial"` if any task verification failed
|
|
194
|
+
|
|
195
|
+
The coordinator presents the failure to the user along with the last diagnosis:
|
|
196
|
+
|
|
197
|
+
```markdown
|
|
198
|
+
⚠️ Task verification failed after 2 retries:
|
|
199
|
+
|
|
200
|
+
**Task**: Create user authentication middleware in `src/middleware/auth.ts`
|
|
201
|
+
**Command**: `npx tsc --noEmit src/middleware/auth.ts`
|
|
202
|
+
**Last diagnosis**: Missing type export from @auth/core — dependency may need updating
|
|
203
|
+
**Category**: import_missing
|
|
204
|
+
|
|
205
|
+
Options:
|
|
206
|
+
1. Continue with remaining phases (issue noted)
|
|
207
|
+
2. Stop and fix manually
|
|
208
|
+
```
|
|
209
|
+
|
|
210
|
+
---
|
|
211
|
+
|
|
212
|
+
## Task Verifications Return Field
|
|
213
|
+
|
|
214
|
+
### Schema Extension
|
|
215
|
+
|
|
216
|
+
The phase isolation JSON return format is extended with an optional `task_verifications` array:
|
|
217
|
+
|
|
218
|
+
```json
|
|
219
|
+
{
|
|
220
|
+
"status": "success",
|
|
221
|
+
"phase": "Phase 2: API Integration",
|
|
222
|
+
"summary": "Implemented auth middleware and rate limiting. One task required type import repair.",
|
|
223
|
+
"files_created": ["src/middleware/auth.ts"],
|
|
224
|
+
"files_modified": ["src/api/routes.ts"],
|
|
225
|
+
"decisions": [],
|
|
226
|
+
"deviations": [],
|
|
227
|
+
"errors": [],
|
|
228
|
+
"patterns_captured": [],
|
|
229
|
+
"task_verifications": [
|
|
230
|
+
{
|
|
231
|
+
"task": "Create user authentication middleware",
|
|
232
|
+
"verify_command": "npx tsc --noEmit src/middleware/auth.ts",
|
|
233
|
+
"status": "pass",
|
|
234
|
+
"attempts": 2,
|
|
235
|
+
"repairs_applied": [
|
|
236
|
+
"Added missing import for AuthMiddleware type"
|
|
237
|
+
]
|
|
238
|
+
},
|
|
239
|
+
{
|
|
240
|
+
"task": "Add rate limiting to API routes",
|
|
241
|
+
"verify_command": "npx jest src/middleware/__tests__/rate-limit.test.ts --no-coverage",
|
|
242
|
+
"status": "pass",
|
|
243
|
+
"attempts": 1,
|
|
244
|
+
"repairs_applied": []
|
|
245
|
+
}
|
|
246
|
+
]
|
|
247
|
+
}
|
|
248
|
+
```
|
|
249
|
+
|
|
250
|
+
### Task Verification Entry Fields
|
|
251
|
+
|
|
252
|
+
| Field | Type | Required | Description |
|
|
253
|
+
|-------|------|----------|-------------|
|
|
254
|
+
| `task` | string | Yes | Short description of the task (from plan) |
|
|
255
|
+
| `verify_command` | string | Yes | The verification command that was run |
|
|
256
|
+
| `status` | `"pass" \| "fail"` | Yes | Final verification outcome |
|
|
257
|
+
| `attempts` | number | Yes | Total verification attempts (1 = passed first try) |
|
|
258
|
+
| `repairs_applied` | string[] | Yes | List of repairs applied during retries (empty if passed first try) |
|
|
259
|
+
| `last_diagnosis` | object | No | Last debug sub-agent diagnosis (only present when `status: "fail"`) |
|
|
260
|
+
|
|
261
|
+
### When `task_verifications` is Omitted
|
|
262
|
+
|
|
263
|
+
- If a phase has **no tasks with `<verify>` tags**, the `task_verifications` field is omitted entirely from the JSON return
|
|
264
|
+
- This maintains backward compatibility — existing phase isolation returns are unchanged
|
|
265
|
+
|
|
266
|
+
---
|
|
267
|
+
|
|
268
|
+
## Configuration
|
|
269
|
+
|
|
270
|
+
### `.flowconfig` Setting
|
|
271
|
+
|
|
272
|
+
```yaml
|
|
273
|
+
max_verify_retries: 2 # Max repair attempts per task verification (default: 2, range: 1-5)
|
|
274
|
+
```
|
|
275
|
+
|
|
276
|
+
- **Default**: `2` (initial attempt + 2 retries = 3 total runs)
|
|
277
|
+
- **Range**: `1` to `5`
|
|
278
|
+
- **Values below 1 or above 5**: Clamped to the valid range with a warning
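The clamping rule is simple enough to show directly. In this sketch, `clampMaxVerifyRetries` is a hypothetical name, the warning wording is invented for illustration, and falling back to the default for a missing or non-numeric value and rounding fractional values down are assumptions beyond what the source states.

```typescript
interface ClampResult {
  value: number;
  warning: string | null;
}

// Hypothetical helper applying the default (2) and the valid range (1 to 5).
function clampMaxVerifyRetries(configured: number | undefined): ClampResult {
  // Assumption: a missing or non-numeric setting falls back to the default.
  if (configured === undefined || isNaN(configured)) {
    return { value: 2, warning: null };
  }
  // Assumption: fractional values are rounded down before clamping.
  const whole = Math.floor(configured);
  if (whole < 1) {
    return { value: 1, warning: "max_verify_retries is below 1; using 1" };
  }
  if (whole > 5) {
    return { value: 5, warning: "max_verify_retries is above 5; using 5" };
  }
  return { value: whole, warning: null };
}
```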
|
|
279
|
+
|
|
280
|
+
### Toggle via `/flow`
|
|
281
|
+
|
|
282
|
+
```
|
|
283
|
+
/flow max_verify_retries=3
|
|
284
|
+
```
|
|
285
|
+
|
|
286
|
+
### No Feature Toggle
|
|
287
|
+
|
|
288
|
+
Per-task verification has no on/off toggle. It activates automatically when tasks include `<verify>` tags. Plans without `<verify>` tags behave exactly as before — fully backward compatible.
|
|
289
|
+
|
|
290
|
+
---
|
|
291
|
+
|
|
292
|
+
## Error Handling
|
|
293
|
+
|
|
294
|
+
### Verification Command Errors
|
|
295
|
+
|
|
296
|
+
| Scenario | Behavior |
|
|
297
|
+
|----------|----------|
|
|
298
|
+
| Command not found | Treat as verification failure, spawn debug sub-agent |
|
|
299
|
+
| Command timeout (>30s) | Kill process, treat as failure, include timeout in error output |
|
|
300
|
+
| Command produces no output | Treat exit code as sole indicator (0 = pass, non-zero = fail) |
|
|
301
|
+
| Command produces large output | Truncate to 200 lines before passing to debug sub-agent |
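The truncation in the last row can be sketched as a line-count cap. The source does not say whether the first or the last 200 lines are kept; keeping the first lines is an assumption here, and `truncateLines` is a hypothetical helper name.

```typescript
// Hypothetical helper: caps text at maxLines lines.
// Assumption: the head of the output is kept and the tail is dropped.
function truncateLines(text: string, maxLines: number): string {
  const lines = text.split("\n");
  if (lines.length <= maxLines) return text;
  return lines.slice(0, maxLines).join("\n");
}
```

The same helper would cover the 300-line cap on file content by passing a different `maxLines`.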
|
|
302
|
+
|
|
303
|
+
### Debug Sub-Agent Errors
|
|
304
|
+
|
|
305
|
+
| Scenario | Behavior |
|
|
306
|
+
|----------|----------|
|
|
307
|
+
| Invalid JSON return | Skip this retry, count as failed attempt |
|
|
308
|
+
| Sub-agent timeout | Skip this retry, count as failed attempt |
|
|
309
|
+
| Empty repair_actions | Skip repair, re-run verification (may pass if issue was transient) |
|
|
310
|
+
|
|
311
|
+
### Phase-Level Impact
|
|
312
|
+
|
|
313
|
+
| Verification Outcome | Phase Status |
|
|
314
|
+
|----------------------|-------------|
|
|
315
|
+
| All verifications pass | `"success"` (no change) |
|
|
316
|
+
| Some verifications fail (max retries exceeded) | `"partial"` |
|
|
317
|
+
| Task implementation itself fails | `"failure"` (existing behavior, unrelated to verification) |
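The table above maps directly onto a small function over the `task_verifications` entries. This is a sketch with hypothetical names (`VerificationOutcome`, `phaseStatus`); the `implementationFailed` flag stands in for the existing, verification-independent failure path.

```typescript
type PhaseStatus = "success" | "partial" | "failure";

interface VerificationOutcome {
  status: "pass" | "fail";
}

// Derives the phase status from the implementation result and the
// per-task verification outcomes.
function phaseStatus(
  implementationFailed: boolean,
  verifications: VerificationOutcome[]
): PhaseStatus {
  if (implementationFailed) return "failure";
  for (let i = 0; i < verifications.length; i++) {
    // Any verification that exhausted its retries downgrades the phase.
    if (verifications[i].status === "fail") return "partial";
  }
  return "success";
}
```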
|
|
318
|
+
|
|
319
|
+
---
|
|
320
|
+
|
|
321
|
+
## Interaction with Wave Mode
|
|
322
|
+
|
|
323
|
+
Per-task verification is **entirely internal** to each phase sub-agent. The wave coordinator:
|
|
324
|
+
|
|
325
|
+
- Does NOT know about individual task verifications during execution
|
|
326
|
+
- Receives `task_verifications` in the JSON return after the sub-agent completes
|
|
327
|
+
- Displays verification stats in the wave completion summary
|
|
328
|
+
- Does NOT retry phases based on verification failures (that is internal to the sub-agent)
|
|
329
|
+
|
|
330
|
+
```
|
|
331
|
+
Wave 1: Phase 1 (2 tasks verified: 2 pass), Phase 2 (3 tasks verified: 2 pass, 1 fail after 2 retries)
|
|
332
|
+
```
|
|
333
|
+
|
|
334
|
+
Wave execution treats a phase with failed verifications as `"partial"` — the same way it handles any partial result. The user decides whether to continue.
|
|
335
|
+
|
|
336
|
+
---
|
|
337
|
+
|
|
338
|
+
## Rules
|
|
339
|
+
|
|
340
|
+
1. **Verify is optional** — tasks without `<verify>` tags skip verification entirely
|
|
341
|
+
2. **Targeted commands only** — never use full builds or full test suites as verify commands
|
|
342
|
+
3. **Debug sub-agent is read-only** — it diagnoses but never modifies files
|
|
343
|
+
4. **Implementation sub-agent repairs** — only the phase sub-agent applies fixes
|
|
344
|
+
5. **Continue on failure** — failed verification does NOT abort the phase; it records the failure and continues to the next task
|
|
345
|
+
6. **Max retries are hard** — once exceeded, escalate to user; never increase retries dynamically
|
|
346
|
+
7. **First attempt is not a retry** — the initial verification run is attempt 1, retries start at attempt 2
|
|
347
|
+
8. **Truncate large output** — cap error output at 200 lines and file content at 300 lines for debug sub-agent
|
|
348
|
+
9. **Backward compatible** — phases without any `<verify>` tags produce no `task_verifications` field
|
|
349
|
+
10. **Wave-transparent** — wave coordinator sees only final results; verification loops are internal to phase sub-agents
|
|
350
|
+
|
|
351
|
+
---
|
|
352
|
+
|
|
353
|
+
## Related Files
|
|
354
|
+
|
|
355
|
+
| File | Purpose |
|
|
356
|
+
|------|---------|
|
|
357
|
+
| `.claude/resources/core/phase-isolation.md` | Sub-agent context template and JSON return format (extended by this feature) |
|
|
358
|
+
| `.claude/resources/core/wave-execution.md` | Wave coordinator behavior (verification is internal to sub-agents) |
|
|
359
|
+
| `.claude/resources/core/model-routing.md` | Model tier selection (debug sub-agent always uses haiku) |
|
|
360
|
+
| `.claude/resources/skills/execute-plan-skill.md` | Execute-plan skill with verification result display |
|
|
361
|
+
| `.claude/resources/skills/create-plan-skill.md` | Auto-generation of `<verify>` sections in plans |
|
|
362
|
+
| `.claude/resources/patterns/plans-templates.md` | Plan template with `<verify>` tag syntax |
|