deepflow 0.1.45 → 0.1.47

package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "deepflow",
- "version": "0.1.45",
+ "version": "0.1.47",
  "description": "Stay in flow state - lightweight spec-driven task orchestration for Claude Code",
  "keywords": [
  "claude",
@@ -99,8 +99,16 @@ task: T3
  status: success|failed
  commit: abc1234
  summary: "one line"
+ tests_ran: true|false
+ test_command: "npm test"
+ test_exit_code: 0
+ test_output_tail: |
+   PASS src/upload.test.ts
+   Tests: 12 passed, 12 total
  ```

+ New fields: `tests_ran` (bool), `test_command` (string), `test_exit_code` (int), `test_output_tail` (last 20 lines of output).
+
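Illustrating the schema above: a result file with test evidence could be rendered like this (a sketch; the field set comes from the schema, the helper name and signature are hypothetical):

```python
def render_result(task, status, commit, summary, tests_ran,
                  test_command=None, test_exit_code=None, test_output=""):
    """Render a per-task result YAML, keeping only the last 20 output lines."""
    lines = [
        f"task: {task}",
        f"status: {status}",
        f"commit: {commit}",
        f'summary: "{summary}"',
        f"tests_ran: {str(tests_ran).lower()}",
    ]
    if tests_ran:
        lines.append(f'test_command: "{test_command}"')
        lines.append(f"test_exit_code: {test_exit_code}")
        lines.append("test_output_tail: |")
        # indent tail lines so the YAML block scalar stays valid
        lines += ["  " + l for l in test_output.splitlines()[-20:]]
    return "\n".join(lines) + "\n"
```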
  **Spike result file** `.deepflow/results/{task_id}.yaml` (additional fields):
  ```yaml
  task: T1
@@ -137,8 +145,10 @@ experiment_file: ".deepflow/experiments/upload--streaming--failed.md"
  }
  ```

+ Note: `completed_tasks` is kept for backward compatibility but is now derivable from PLAN.md `[x]` entries. The native task system (TaskList) is the primary source for runtime task status.
+
  **On checkpoint:** Complete wave → update PLAN.md → save to worktree → exit.
- **Resume:** `--continue` loads checkpoint, verifies worktree, skips completed tasks.
+ **Resume:** `--continue` loads checkpoint, verifies worktree, skips completed tasks. Native tasks are re-registered for remaining `[ ]` items only.

  ## Behavior

@@ -188,6 +198,30 @@ Load: PLAN.md (required), specs/doing-*.md, .deepflow/config.yaml
  If missing: "No PLAN.md found. Run /df:plan first."
  ```

+ ### 2.5. REGISTER NATIVE TASKS
+
+ Parse PLAN.md and create native tasks for tracking, dependency management, and UI spinners.
+
+ **For each uncompleted task (`[ ]`) in PLAN.md:**
+
+ ```
+ 1. TaskCreate:
+    - subject: "{task_id}: {description}" (e.g. "T1: Create upload endpoint")
+    - description: Full task block from PLAN.md (files, blocked by, type, etc.)
+    - activeForm: "{gerund form of description}" (e.g. "Creating upload endpoint")
+
+ 2. Store mapping: PLAN.md task_id (T1) → native task ID
+ ```
+
+ **After all tasks created, set up dependencies:**
+
+ ```
+ For each task with "Blocked by: T{n}, T{m}":
+   TaskUpdate(taskId: native_id, addBlockedBy: [native_id_of_Tn, native_id_of_Tm])
+ ```
+
+ **On `--continue`:** Only create tasks for remaining `[ ]` items (skip `[x]` completed).
+
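The parsing half of this step can be sketched in Python. `TaskCreate`/`TaskUpdate` are the native task tools, so only the PLAN.md side is shown; the `- [ ] **Tn**:` / `Blocked by:` format is assumed from the examples elsewhere in this diff:

```python
import re

TASK_RE = re.compile(r"^- \[( |x)\] \*\*(T\d+)\*\*: (.+)$")
BLOCKED_RE = re.compile(r"Blocked by: (.+)")

def parse_plan(plan_text):
    """Return uncompleted tasks as {task_id: {"description": ..., "blocked_by": [...]}}."""
    tasks, current = {}, None
    for line in plan_text.splitlines():
        m = TASK_RE.match(line.strip())
        if m:
            done, task_id, desc = m.groups()
            current = None
            if done == " ":  # only register `[ ]` items, skip `[x]` completed
                current = task_id
                tasks[task_id] = {"description": desc, "blocked_by": []}
            continue
        if current:  # detail line under the current task block
            b = BLOCKED_RE.search(line)
            if b:
                tasks[current]["blocked_by"] = [t.strip() for t in b.group(1).split(",")]
    return tasks
```

Each returned entry would then drive one TaskCreate call, and `blocked_by` the follow-up TaskUpdate calls.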
  ### 3. CHECK FOR UNPLANNED SPECS

  Warn if `specs/*.md` (excluding doing-/done-) exist. Non-blocking.
@@ -244,12 +278,30 @@ Topic extraction:

  ### 5. IDENTIFY READY TASKS

- Ready = `[ ]` + all `blocked_by` complete + experiment validated (if applicable) + not in checkpoint.
+ Use TaskList to find ready tasks (replaces manual PLAN.md parsing):
+
+ ```
+ Ready = TaskList results where:
+   - status: "pending"
+   - blockedBy: empty (auto-unblocked by native dependency system)
+ ```
+
+ **Cross-check with experiment validation** (for spike-blocked tasks):
+ - If task depends on spike AND experiment not `--passed.md` → still blocked
+ - TaskUpdate to add spike as blocker if not already set
+
+ Ready = TaskList pending + empty blockedBy + experiment validated (if applicable).
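The readiness rule condenses to a small predicate. A sketch — the task dict shape (`status`, `blockedBy`, `experiment`) is an illustrative assumption, not the native tool's actual schema:

```python
def is_ready(task):
    """Pending, unblocked, and spike experiment (if any) renamed to --passed.md."""
    if task["status"] != "pending":
        return False
    if task["blockedBy"]:  # native dependency system empties this on unblock
        return False
    exp = task.get("experiment")  # e.g. "upload--streaming--passed.md"
    if exp is not None and not exp.endswith("--passed.md"):
        return False
    return True
```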

  ### 6. SPAWN AGENTS

  Context ≥50%: checkpoint and exit.

+ **Before spawning each agent**, mark its native task as in_progress:
+ ```
+ TaskUpdate(taskId: native_id, status: "in_progress")
+ ```
+ This activates the UI spinner showing the task's activeForm (e.g. "Creating upload endpoint").
+
  **CRITICAL: Spawn ALL ready tasks in a SINGLE response with MULTIPLE Task tool calls.**

  DO NOT spawn one task, wait, then spawn another. Instead, call Task tool multiple times in the SAME message block. This enables true parallelism.
@@ -319,8 +371,15 @@ Then rename experiment:

  **Gate:**
  ```
- VERIFIED_PASS → Unblock, log "✓ Spike {task_id} verified"
- VERIFIED_FAIL → Block, log "✗ Spike {task_id} failed verification"
+ VERIFIED_PASS →
+   TaskUpdate(taskId: spike_native_id, status: "completed")
+   # Native system auto-unblocks dependent tasks
+   Log "✓ Spike {task_id} verified"
+
+ VERIFIED_FAIL →
+   # Spike task stays pending, dependents remain blocked
+   # No TaskUpdate needed — native system keeps them blocked
+   Log "✗ Spike {task_id} failed verification"
  If override: log "⚠ Agent incorrectly marked as passed"
  ```

@@ -349,8 +408,18 @@ Example: To edit src/foo.ts, use:

  Do NOT write files to the main project directory.

- Implement, test, commit as feat({spec}): {description}.
- Write result to {worktree_absolute_path}/.deepflow/results/{task_id}.yaml
+ Steps:
+ 1. Implement the task
+ 2. Detect test command: check for package.json (npm test), pyproject.toml (pytest),
+    Cargo.toml (cargo test), go.mod (go test ./...), or Makefile (make test)
+ 3. Run tests if test infrastructure exists:
+    - Run the detected test command
+    - If tests fail: fix the code and re-run until passing
+    - Do NOT commit with failing tests
+ 4. If NO test infrastructure: set tests_ran: false in result file
+ 5. Commit as feat({spec}): {description}
+ 6. Write result file with ALL fields including test evidence (see schema):
+    {worktree_absolute_path}/.deepflow/results/{task_id}.yaml

  **STOP after writing the result file. Do NOT:**
  - Merge branches or cherry-pick commits
@@ -376,6 +445,7 @@ Steps:
  3. Write experiment as --active.md (verifier determines final status)
  4. Commit: spike({spec}): validate {hypothesis}
  5. Write result to .deepflow/results/{task_id}.yaml (see spike result schema)
+ 6. If test infrastructure exists, also run tests and include evidence in result file

  Rules:
  - `met: true` ONLY if actual satisfies target
@@ -390,6 +460,12 @@ Rules:

  When a task fails and cannot be auto-fixed:

+ **Native task update:**
+ ```
+ TaskUpdate(taskId: native_id, status: "pending")  # Reset to pending, not deleted
+ ```
+ This keeps the task visible for retry. Dependent tasks remain blocked.
+
  **Behavior:**
  1. Leave worktree intact at `{worktree_path}`
  2. Keep checkpoint.json for potential resume
@@ -434,9 +510,15 @@ After spawning wave agents, your turn ENDS. Completion notifications drive the l

  **Per notification:**
  1. Read result file for the completed agent
- 2. Report ONE line: "✓ Tx: status (commit)"
- 3. If NOT all wave agents done → end turn, wait
- 4. If ALL wave agents done → check context, update PLAN.md, spawn next wave or finish
+ 2. Validate test evidence:
+    - `tests_ran: true` + `test_exit_code: 0` → trust result
+    - `tests_ran: true` + `test_exit_code: non-zero` → status MUST be failed (flag mismatch if agent said success)
+    - `tests_ran: false` + `status: success` → flag: "⚠ Tx: success but no tests ran"
+ 3. TaskUpdate(taskId: native_id, status: "completed") — auto-unblocks dependent tasks
+ 4. Update PLAN.md: `[ ]` → `[x]` + commit hash (as before)
+ 5. Report: "✓ T1: success (abc123) [12 tests passed]" or "⚠ T1: success (abc123) [no tests]"
+ 6. If NOT all wave agents done → end turn, wait
+ 7. If ALL wave agents done → use TaskList to find newly unblocked tasks, check context, spawn next wave or finish
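The evidence cross-check in step 2 amounts to three rules; a sketch, with field names taken from the result schema in this diff and return strings illustrative only:

```python
def check_evidence(result):
    """Cross-check an agent's claimed status against its test evidence."""
    tid = result["task"]
    if result.get("tests_ran"):
        if result.get("test_exit_code", 1) != 0:
            # non-zero exit code overrides whatever the agent claimed
            if result["status"] == "success":
                return "failed", f"⚠ {tid}: agent said success but tests failed"
            return "failed", f"✗ {tid}: failed"
        return result["status"], f"✓ {tid}: {result['status']}"
    if result["status"] == "success":
        return "success", f"⚠ {tid}: success but no tests ran"
    return result["status"], f"✗ {tid}: {result['status']}"
```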

  **Between waves:** Check context %. If ≥50%, checkpoint and exit.

@@ -456,18 +538,41 @@ After spawning wave agents, your turn ENDS. Completion notifications drive the l

  ```
  /df:execute (context: 12%)
- Spawning Wave 1: T1, T2, T3 parallel...
+
+ Loading PLAN.md...
+ T1: Create upload endpoint (ready)
+ T2: Add S3 service (blocked by T1)
+ T3: Add auth guard (blocked by T1)
+
+ Registering native tasks...
+ TaskCreate → T1 (native: task-001)
+ TaskCreate → T2 (native: task-002)
+ TaskCreate → T3 (native: task-003)
+ TaskUpdate(task-002, addBlockedBy: [task-001])
+ TaskUpdate(task-003, addBlockedBy: [task-001])
+
+ Spawning Wave 1: T1
+ TaskUpdate(task-001, status: "in_progress") ← spinner: "Creating upload endpoint"

  [Agent "T1" completed]
- T1: success (abc1234)
+ TaskUpdate(task-001, status: "completed") ← auto-unblocks task-002, task-003
+ ✓ T1: success (abc1234)
+
+ TaskList → task-002, task-003 now ready (blockedBy empty)
+
+ Spawning Wave 2: T2, T3 parallel
+ TaskUpdate(task-002, status: "in_progress")
+ TaskUpdate(task-003, status: "in_progress")

  [Agent "T2" completed]
- T2: success (def5678)
+ TaskUpdate(task-002, status: "completed")
+ ✓ T2: success (def5678)

  [Agent "T3" completed]
- T3: success (ghi9012)
+ TaskUpdate(task-003, status: "completed")
+ ✓ T3: success (ghi9012)

- Wave 1 complete (3/3). Context: 35%
+ Wave 2 complete (2/2). Context: 35%

  ✓ doing-upload → done-upload
  ✓ Complete: 3/3 tasks
@@ -480,27 +585,43 @@ Next: Run /df:verify to verify specs and merge to main

  ```
  /df:execute (context: 10%)

+ Loading PLAN.md...
+ Registering native tasks...
+ TaskCreate → T1 [SPIKE] (native: task-001)
+ TaskCreate → T2 (native: task-002)
+ TaskCreate → T3 (native: task-003)
+ TaskUpdate(task-002, addBlockedBy: [task-001])
+ TaskUpdate(task-003, addBlockedBy: [task-001])
+
  Checking experiment status...
  T1 [SPIKE]: No experiment yet, spike executable
  T2: Blocked by T1 (spike not validated)
  T3: Blocked by T1 (spike not validated)

- Spawning Wave 1: T1 [SPIKE]...
+ Spawning Wave 1: T1 [SPIKE]
+ TaskUpdate(task-001, status: "in_progress")

  [Agent "T1 SPIKE" completed]
  ✓ T1: complete, verifying...

  Verifying T1...
  ✓ Spike T1 verified (throughput 8500 >= 7000)
+ TaskUpdate(task-001, status: "completed") ← auto-unblocks task-002, task-003
  → upload--streaming--passed.md

- Spawning Wave 2: T2, T3 parallel...
+ TaskList → task-002, task-003 now ready
+
+ Spawning Wave 2: T2, T3 parallel
+ TaskUpdate(task-002, status: "in_progress")
+ TaskUpdate(task-003, status: "in_progress")

  [Agent "T2" completed]
- T2: success (def5678)
+ TaskUpdate(task-002, status: "completed")
+ ✓ T2: success (def5678)

  [Agent "T3" completed]
- T3: success (ghi9012)
+ TaskUpdate(task-003, status: "completed")
+ ✓ T3: success (ghi9012)

  Wave 2 complete (2/2). Context: 40%

@@ -515,11 +636,16 @@ Next: Run /df:verify to verify specs and merge to main

  ```
  /df:execute (context: 10%)

+ Registering native tasks...
+ TaskCreate → T1 [SPIKE], T2, T3 (with dependencies)
+
  Wave 1: T1 [SPIKE] (context: 15%)
+ TaskUpdate(task-001, status: "in_progress")
  T1: complete, verifying...

  Verifying T1...
  ✗ Spike T1 failed verification (throughput 1500 < 7000)
+ # Spike stays pending — dependents remain blocked
  → upload--streaming--failed.md

  ⚠ Spike T1 invalidated hypothesis
@@ -533,12 +659,17 @@ Next: Run /df:plan to generate new hypothesis spike

  ```
  /df:execute (context: 10%)

+ Registering native tasks...
+ TaskCreate → T1 [SPIKE], T2, T3 (with dependencies)
+
  Wave 1: T1 [SPIKE] (context: 15%)
+ TaskUpdate(task-001, status: "in_progress")
  T1: complete (agent said: success), verifying...

  Verifying T1...
  ✗ Spike T1 failed verification (throughput 1500 < 7000)
  ⚠ Agent incorrectly marked as passed — overriding to FAILED
+ TaskUpdate(task-001, status: "pending") ← reset, dependents stay blocked
  → upload--streaming--failed.md

  ⚠ Spike T1 invalidated hypothesis
@@ -40,35 +40,116 @@ Load:

  If no done-* specs: report counts, suggest `--doing`.

+ ### 1.5. DETECT PROJECT COMMANDS
+
+ Detect build and test commands by inspecting project files in the worktree.
+
+ **Config override always wins.** If `.deepflow/config.yaml` has `quality.test_command` or `quality.build_command`, use those.
+
+ **Auto-detection (first match wins):**
+
+ | File | Build | Test |
+ |------|-------|------|
+ | `package.json` with `scripts.build` | `npm run build` | `npm test` (if scripts.test is not default placeholder) |
+ | `pyproject.toml` or `setup.py` | — | `pytest` |
+ | `Cargo.toml` | `cargo build` | `cargo test` |
+ | `go.mod` | `go build ./...` | `go test ./...` |
+ | `Makefile` with `test` target | `make build` (if target exists) | `make test` |
+
+ **Output:**
+ - Commands found: `Build: npm run build | Test: npm test`
+ - Nothing found: `⚠ No build/test commands detected. L0/L4 skipped. Set quality.test_command in .deepflow/config.yaml`
+
  ### 2. VERIFY EACH SPEC

+ **L0: Build check** (if build command detected)
+
+ Run the build command in the worktree:
+ - Exit code 0 → L0 pass, continue to L1-L3
+ - Exit code non-zero → L0 FAIL
+   - Report: "✗ L0: Build failed" with last 30 lines of output
+   - Add fix task: "Fix build errors" to PLAN.md
+   - Do NOT proceed to L1-L4 (no point checking if code doesn't build)
+
+ **L1-L3: Static analysis** (via Explore agents)
+
  Check requirements, acceptance criteria, and quality (stubs/TODOs).
  Mark each: ✓ satisfied | ✗ missing | ⚠ partial

+ **L4: Test execution** (if test command detected)
+
+ Run AFTER L0 passes and L1-L3 complete. Run even if L1-L3 found issues — test failures reveal additional problems.
+
+ - Run test command in the worktree (timeout from config, default 5 min)
+ - Exit code 0 → L4 pass
+ - Exit code non-zero → L4 FAIL
+ - Capture last 50 lines of output
+ - Report: "✗ L4: Tests failed (N of M)" with relevant output
+ - Add fix task: "Fix failing tests" with test output in description
+
+ **Flaky test handling** (if `quality.test_retry_on_fail: true` in config):
+ - If tests fail, re-run ONCE
+ - Second run passes → L4 pass with note: "⚠ L4: Passed on retry (possible flaky test)"
+ - Second run fails → genuine failure
+
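The retry-once rule can be captured with a runner callable (a sketch; in practice `run` would wrap the detected test command via Bash with the configured timeout):

```python
def run_with_retry(run, retry_on_fail=True):
    """run() returns a test exit code; re-run once on failure if configured.

    Returns (passed, note)."""
    if run() == 0:
        return True, None
    if not retry_on_fail:
        return False, None
    # one retry: a pass here is flagged as possibly flaky
    if run() == 0:
        return True, "⚠ L4: Passed on retry (possible flaky test)"
    return False, None
```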
  ### 3. GENERATE REPORT

- Report per spec: requirements count, acceptance count, quality issues.
+ Report per spec with L0/L4 status, requirements count, acceptance count, quality issues.

- **If all pass:** Proceed to Post-Verification merge.
+ **Format on success:**
+ ```
+ done-upload.md: L0 ✓ | 4/4 reqs ✓, 5/5 acceptance ✓ | L4 ✓ (12 tests) | 0 quality issues
+ ```

- **If issues found:** Add fix tasks to PLAN.md in the worktree and loop back to execute:
+ **Format on failure:**
+ ```
+ done-upload.md: L0 ✓ | 4/4 reqs ✓, 5/5 acceptance ✓ | L4 ✗ (3 failed) | 0 quality issues
+
+ Issues:
+ ✗ L4: 3 test failures
+   FAIL src/upload.test.ts > should validate file type
+   FAIL src/upload.test.ts > should reject oversized files
+
+ Fix tasks added to PLAN.md:
+ T10: Fix 3 failing tests in upload module
+ ```
+
+ **Gate conditions (ALL must pass to merge):**
+ - L0: Build passes (or no build command detected)
+ - L1-L3: All requirements satisfied, no stubs, properly wired
+ - L4: Tests pass (or no test command detected)
+
+ **If all gates pass:** Proceed to Post-Verification merge.
+
+ **If issues found:** Add fix tasks to PLAN.md in the worktree and register as native tasks, then loop back to execute:

  1. Discover worktree (same logic as Post-Verification step 1)
  2. Write new fix tasks to `{worktree_path}/PLAN.md` under the existing spec section
     - Task IDs continue from last (e.g. if T9 was last, fixes start at T10)
     - Format: `- [ ] **T10**: Fix {description}` with `Files:` and details
- 3. Output report + next step:
+ 3. Register fix tasks as native tasks for immediate tracking:
+    ```
+    For each fix task added:
+      TaskCreate(subject: "T10: Fix {description}", description: "...", activeForm: "Fixing {description}")
+      TaskUpdate(addBlockedBy: [...]) if dependencies exist
+    ```
+    This allows `/df:execute --continue` to find fix tasks via TaskList immediately.
+ 4. Output report + next step:

  ```
- done-upload.md: 4/4 reqs ✓, 3/5 acceptance ✗, 1 quality issue
+ done-upload.md: L0 ✓ | 4/4 reqs ✓, 3/5 acceptance ✗ | L4 ✗ (2 failed) | 1 quality issue

  Issues:
  ✗ AC-3: YAML parsing missing for consolation
+ ✗ L4: 2 test failures
+   FAIL src/upload.test.ts > should validate file type
+   FAIL src/upload.test.ts > should reject oversized files
  ⚠ Quality: TODO in parse_config()

  Fix tasks added to PLAN.md:
  T10: Add YAML parsing for consolation section
- T11: Remove TODO in parse_config()
+ T11: Fix 2 failing tests in upload module
+ T12: Remove TODO in parse_config()

  Run /df:execute --continue to fix in the same worktree.
  ```
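The "IDs continue from last" rule (T9 was last → fixes start at T10) amounts to a max-scan over PLAN.md; a sketch with a hypothetical helper name:

```python
import re

def next_task_id(plan_text):
    """Find the highest T{n} in PLAN.md and return the next free ID (T1 if none)."""
    ids = [int(m) for m in re.findall(r"\*\*T(\d+)\*\*", plan_text)]
    return f"T{max(ids) + 1}" if ids else "T1"
```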
@@ -98,14 +179,16 @@ Files: ...

  ## Verification Levels

- | Level | Check | Method |
- |-------|-------|--------|
- | L1: Exists | File/function exists | Glob/Grep |
- | L2: Substantive | Real code, not stub | Read + analyze |
- | L3: Wired | Integrated into system | Trace imports/calls |
- | L4: Tested | Has passing tests | Run tests |
+ | Level | Check | Method | Runner |
+ |-------|-------|--------|--------|
+ | L0: Builds | Code compiles/builds | Run build command | Orchestrator (Bash) |
+ | L1: Exists | File/function exists | Glob/Grep | Explore agents |
+ | L2: Substantive | Real code, not stub | Read + analyze | Explore agents |
+ | L3: Wired | Integrated into system | Trace imports/calls | Explore agents |
+ | L4: Tested | Tests pass | Run test command | Orchestrator (Bash) |

- Default: L1-L3 (L4 optional, can be slow)
+ **Default: L0 through L4.** L0 and L4 are skipped ONLY if no build/test command is detected (see step 1.5).
+ L0 and L4 run directly via Bash — Explore agents cannot execute commands.

  ## Rules
  - **Never use TaskOutput** — Returns full transcripts that explode context
@@ -140,10 +223,12 @@ Scale: 1-2 agents per spec, cap 10.
  ```
  /df:verify

- done-upload.md: 4/4 reqs ✓, 5/5 acceptance ✓, clean
- done-auth.md: 2/2 reqs ✓, 3/3 acceptance ✓, clean
+ Build: npm run build | Test: npm test
+
+ done-upload.md: L0 ✓ | 4/4 reqs ✓, 5/5 acceptance ✓ | L4 ✓ (12 tests) | 0 quality issues
+ done-auth.md: L0 ✓ | 2/2 reqs ✓, 3/3 acceptance ✓ | L4 ✓ (8 tests) | 0 quality issues

- ✓ All specs verified
+ ✓ All gates passed

  ✓ Merged df/upload to main
  ✓ Cleaned up worktree and branch
@@ -156,22 +241,29 @@ Learnings captured:
  ```
  /df:verify --doing

- doing-upload.md: 4/4 reqs ✓, 3/5 acceptance ✗, 1 quality issue
+ Build: npm run build | Test: npm test
+
+ doing-upload.md: L0 ✓ | 4/4 reqs ✓, 3/5 acceptance ✗ | L4 ✗ (3 failed) | 1 quality issue

  Issues:
  ✗ AC-3: YAML parsing missing for consolation
+ ✗ L4: 3 test failures
+   FAIL src/upload.test.ts > should validate file type
+   FAIL src/upload.test.ts > should reject oversized files
+   FAIL src/upload.test.ts > should handle empty input
  ⚠ Quality: TODO in parse_config()

  Fix tasks added to PLAN.md:
  T10: Add YAML parsing for consolation section
- T11: Remove TODO in parse_config()
+ T11: Fix 3 failing tests in upload module
+ T12: Remove TODO in parse_config()

  Run /df:execute --continue to fix in the same worktree.
  ```

  ## Post-Verification: Worktree Merge & Cleanup

- **Only runs when ALL specs pass verification.** If issues were found, fix tasks were added to PLAN.md instead (see step 3).
+ **Only runs when ALL gates pass** (L0 build, L1-L3 static analysis, L4 tests). If any gate fails, fix tasks were added to PLAN.md instead (see step 3).

  ### 1. DISCOVER WORKTREE

@@ -61,3 +61,17 @@ worktree:

  # Keep worktree after failed execution for debugging
  cleanup_on_fail: false
+
+ # Quality gates for /df:verify
+ quality:
+   # Override auto-detected build command (e.g., "npm run build", "cargo build")
+   build_command: ""
+
+   # Override auto-detected test command (e.g., "npm test", "pytest", "go test ./...")
+   test_command: ""
+
+   # Test timeout in seconds (default: 300 = 5 minutes)
+   test_timeout: 300
+
+   # Retry flaky tests once before failing (default: true)
+   test_retry_on_fail: true