deepflow 0.1.45 → 0.1.47
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/package.json +1 -1
- package/src/commands/df/execute.md +149 -18
- package/src/commands/df/verify.md +111 -19
- package/templates/config-template.yaml +14 -0
package/package.json
CHANGED
package/src/commands/df/execute.md
CHANGED

@@ -99,8 +99,16 @@ task: T3
 status: success|failed
 commit: abc1234
 summary: "one line"
+tests_ran: true|false
+test_command: "npm test"
+test_exit_code: 0
+test_output_tail: |
+  PASS src/upload.test.ts
+  Tests: 12 passed, 12 total
 ```
 
+New fields: `tests_ran` (bool), `test_command` (string), `test_exit_code` (int), `test_output_tail` (last 20 lines of output).
+
 **Spike result file** `.deepflow/results/{task_id}.yaml` (additional fields):
 ```yaml
 task: T1
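The new evidence fields imply a consistency check on parsed result files. A minimal sketch, assuming the result has already been parsed into a dict; the `validate_result` helper and its return shape are hypothetical, only the field names come from the schema above:

```python
# Hypothetical helper: check that a parsed result file carries the
# test-evidence fields the extended schema requires.
REQUIRED = {"task", "status", "commit", "summary", "tests_ran"}
EVIDENCE = ("test_command", "test_exit_code", "test_output_tail")

def validate_result(result: dict) -> list[str]:
    problems = [f"missing field: {f}" for f in sorted(REQUIRED - result.keys())]
    if result.get("tests_ran"):
        # When tests ran, the evidence fields must be present too.
        problems += [f"missing field: {f}" for f in EVIDENCE if f not in result]
    return problems
```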
@@ -137,8 +145,10 @@ experiment_file: ".deepflow/experiments/upload--streaming--failed.md"
 }
 ```
 
+Note: `completed_tasks` is kept for backward compatibility but is now derivable from PLAN.md `[x]` entries. The native task system (TaskList) is the primary source for runtime task status.
+
 **On checkpoint:** Complete wave → update PLAN.md → save to worktree → exit.
-**Resume:** `--continue` loads checkpoint, verifies worktree, skips completed tasks.
+**Resume:** `--continue` loads checkpoint, verifies worktree, skips completed tasks. Native tasks are re-registered for remaining `[ ]` items only.
 
 ## Behavior
 
@@ -188,6 +198,30 @@ Load: PLAN.md (required), specs/doing-*.md, .deepflow/config.yaml
 If missing: "No PLAN.md found. Run /df:plan first."
 ```
 
+### 2.5. REGISTER NATIVE TASKS
+
+Parse PLAN.md and create native tasks for tracking, dependency management, and UI spinners.
+
+**For each uncompleted task (`[ ]`) in PLAN.md:**
+
+```
+1. TaskCreate:
+   - subject: "{task_id}: {description}" (e.g. "T1: Create upload endpoint")
+   - description: Full task block from PLAN.md (files, blocked by, type, etc.)
+   - activeForm: "{gerund form of description}" (e.g. "Creating upload endpoint")
+
+2. Store mapping: PLAN.md task_id (T1) → native task ID
+```
+
+**After all tasks created, set up dependencies:**
+
+```
+For each task with "Blocked by: T{n}, T{m}":
+  TaskUpdate(taskId: native_id, addBlockedBy: [native_id_of_Tn, native_id_of_Tm])
+```
+
+**On `--continue`:** Only create tasks for remaining `[ ]` items (skip `[x]` completed).
+
 ### 3. CHECK FOR UNPLANNED SPECS
 
 Warn if `specs/*.md` (excluding doing-/done-) exist. Non-blocking.
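The two-pass registration in this hunk (create everything, then wire blockers) can be sketched as follows. TaskCreate/TaskUpdate are represented by injected callables since the native task tools are not a Python API; the `register_tasks` helper and its argument shapes are assumptions:

```python
# Hypothetical sketch of step 2.5. task_create/task_update stand in for the
# native task tools; PLAN.md parsing is reduced to a pre-parsed task list.
def register_tasks(plan_tasks, task_create, task_update):
    """plan_tasks: [{"id": "T1", "desc": ..., "done": bool, "blocked_by": [...]}]"""
    mapping = {}  # PLAN.md task id -> native task id
    for t in plan_tasks:
        if t.get("done"):
            continue  # on --continue, skip [x] entries
        mapping[t["id"]] = task_create(subject=f"{t['id']}: {t['desc']}")
    # Dependencies are wired only after every task exists.
    for t in plan_tasks:
        blockers = [mapping[b] for b in t.get("blocked_by", []) if b in mapping]
        if t["id"] in mapping and blockers:
            task_update(task_id=mapping[t["id"]], add_blocked_by=blockers)
    return mapping
```

Creating all tasks before adding blockers avoids forward references when a task is blocked by one declared later in PLAN.md.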
@@ -244,12 +278,30 @@ Topic extraction:
 
 ### 5. IDENTIFY READY TASKS
 
-
+Use TaskList to find ready tasks (replaces manual PLAN.md parsing):
+
+```
+Ready = TaskList results where:
+  - status: "pending"
+  - blockedBy: empty (auto-unblocked by native dependency system)
+```
+
+**Cross-check with experiment validation** (for spike-blocked tasks):
+- If task depends on spike AND experiment not `--passed.md` → still blocked
+- TaskUpdate to add spike as blocker if not already set
+
+Ready = TaskList pending + empty blockedBy + experiment validated (if applicable).
 
 ### 6. SPAWN AGENTS
 
 Context ≥50%: checkpoint and exit.
 
+**Before spawning each agent**, mark its native task as in_progress:
+```
+TaskUpdate(taskId: native_id, status: "in_progress")
+```
+This activates the UI spinner showing the task's activeForm (e.g. "Creating upload endpoint").
+
 **CRITICAL: Spawn ALL ready tasks in a SINGLE response with MULTIPLE Task tool calls.**
 
 DO NOT spawn one task, wait, then spawn another. Instead, call Task tool multiple times in the SAME message block. This enables true parallelism.
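The readiness rule in step 5 (pending + empty blockedBy + validated experiment) is easy to express directly. A minimal sketch under assumed data shapes; the `spike` key naming is hypothetical:

```python
# Hypothetical sketch of step 5: a task is ready when it is pending, has no
# blockers left, and (for spike-gated tasks) its experiment is --passed.md.
def ready_tasks(tasks, experiment_passed):
    """experiment_passed: {spike_task_id: bool} from experiment file names."""
    ready = []
    for t in tasks:
        if t["status"] != "pending" or t["blocked_by"]:
            continue
        spike = t.get("spike")  # id of the spike this task depends on, if any
        if spike is not None and not experiment_passed.get(spike, False):
            continue  # still blocked until the experiment is validated
        ready.append(t["id"])
    return ready
```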
@@ -319,8 +371,15 @@ Then rename experiment:
 
 **Gate:**
 ```
-VERIFIED_PASS →
-
+VERIFIED_PASS →
+  TaskUpdate(taskId: spike_native_id, status: "completed")
+  # Native system auto-unblocks dependent tasks
+  Log "✓ Spike {task_id} verified"
+
+VERIFIED_FAIL →
+  # Spike task stays as pending, dependents remain blocked
+  # No TaskUpdate needed — native system keeps them blocked
+  Log "✗ Spike {task_id} failed verification"
 If override: log "⚠ Agent incorrectly marked as passed"
 ```
 
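The gate's asymmetry (only the pass branch touches the native task) can be made concrete. A sketch with an injected `task_update` callable; the function name and log strings mirror the gate text above but the Python shape is an assumption:

```python
# Hypothetical sketch of the spike gate: VERIFIED_PASS completes the spike's
# native task (auto-unblocking dependents); VERIFIED_FAIL changes nothing.
def spike_gate(task_id, verified, agent_said_passed, task_update):
    logs = []
    if verified:
        task_update(status="completed")
        logs.append(f"✓ Spike {task_id} verified")
    else:
        # No task_update call: dependents stay blocked by the pending spike.
        logs.append(f"✗ Spike {task_id} failed verification")
        if agent_said_passed:
            logs.append("⚠ Agent incorrectly marked as passed")
    return logs
```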
@@ -349,8 +408,18 @@ Example: To edit src/foo.ts, use:
 
 Do NOT write files to the main project directory.
 
-
-
+Steps:
+1. Implement the task
+2. Detect test command: check for package.json (npm test), pyproject.toml (pytest),
+   Cargo.toml (cargo test), go.mod (go test ./...), or Makefile (make test)
+3. Run tests if test infrastructure exists:
+   - Run the detected test command
+   - If tests fail: fix the code and re-run until passing
+   - Do NOT commit with failing tests
+4. If NO test infrastructure: set tests_ran: false in result file
+5. Commit as feat({spec}): {description}
+6. Write result file with ALL fields including test evidence (see schema):
+   {worktree_absolute_path}/.deepflow/results/{task_id}.yaml
 
 **STOP after writing the result file. Do NOT:**
 - Merge branches or cherry-pick commits
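The detection order in step 2 can be sketched as a first-match-wins table. To keep the precedence visible, this sketch checks a set of marker filenames rather than the real filesystem; the helper itself is hypothetical:

```python
# Hypothetical sketch of step 2's test-command detection; first match wins,
# in the same order as the step lists the marker files.
DETECTION = [
    ("package.json", "npm test"),
    ("pyproject.toml", "pytest"),
    ("Cargo.toml", "cargo test"),
    ("go.mod", "go test ./..."),
    ("Makefile", "make test"),
]

def detect_test_command(files):
    """files: set of filenames present at the worktree root."""
    for marker, command in DETECTION:
        if marker in files:
            return command
    return None  # no test infrastructure: set tests_ran: false
```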
@@ -376,6 +445,7 @@ Steps:
 3. Write experiment as --active.md (verifier determines final status)
 4. Commit: spike({spec}): validate {hypothesis}
 5. Write result to .deepflow/results/{task_id}.yaml (see spike result schema)
+6. If test infrastructure exists, also run tests and include evidence in result file
 
 Rules:
 - `met: true` ONLY if actual satisfies target
@@ -390,6 +460,12 @@ Rules:
 
 When a task fails and cannot be auto-fixed:
 
+**Native task update:**
+```
+TaskUpdate(taskId: native_id, status: "pending")  # Reset to pending, not deleted
+```
+This keeps the task visible for retry. Dependent tasks remain blocked.
+
 **Behavior:**
 1. Leave worktree intact at `{worktree_path}`
 2. Keep checkpoint.json for potential resume
@@ -434,9 +510,15 @@ After spawning wave agents, your turn ENDS. Completion notifications drive the l
 
 **Per notification:**
 1. Read result file for the completed agent
-2.
-
-
+2. Validate test evidence:
+   - `tests_ran: true` + `test_exit_code: 0` → trust result
+   - `tests_ran: true` + `test_exit_code: non-zero` → status MUST be failed (flag mismatch if agent said success)
+   - `tests_ran: false` + `status: success` → flag: "⚠ Tx: success but no tests ran"
+3. TaskUpdate(taskId: native_id, status: "completed") — auto-unblocks dependent tasks
+4. Update PLAN.md: `[ ]` → `[x]` + commit hash (as before)
+5. Report: "✓ T1: success (abc123) [12 tests passed]" or "⚠ T1: success (abc123) [no tests]"
+6. If NOT all wave agents done → end turn, wait
+7. If ALL wave agents done → use TaskList to find newly unblocked tasks, check context, spawn next wave or finish
 
 **Between waves:** Check context %. If ≥50%, checkpoint and exit.
 
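The three evidence cases in step 2 reduce to a small decision function. A sketch over a parsed result dict; the return labels are made up for illustration, only the field combinations come from the rules above:

```python
# Hypothetical sketch of the per-notification evidence cross-check.
def check_evidence(result: dict) -> str:
    """Return "trust", "mismatch", or "no-tests-warning"."""
    if result.get("tests_ran"):
        if result.get("test_exit_code") == 0:
            return "trust"
        # Non-zero exit code: a success claim contradicts the evidence.
        return "mismatch" if result.get("status") == "success" else "trust"
    return "no-tests-warning" if result.get("status") == "success" else "trust"
```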
@@ -456,18 +538,41 @@ After spawning wave agents, your turn ENDS. Completion notifications drive the l
 
 ```
 /df:execute (context: 12%)
-
+
+Loading PLAN.md...
+T1: Create upload endpoint (ready)
+T2: Add S3 service (blocked by T1)
+T3: Add auth guard (blocked by T1)
+
+Registering native tasks...
+TaskCreate → T1 (native: task-001)
+TaskCreate → T2 (native: task-002)
+TaskCreate → T3 (native: task-003)
+TaskUpdate(task-002, addBlockedBy: [task-001])
+TaskUpdate(task-003, addBlockedBy: [task-001])
+
+Spawning Wave 1: T1
+TaskUpdate(task-001, status: "in_progress") ← spinner: "Creating upload endpoint"
 
 [Agent "T1" completed]
-
+TaskUpdate(task-001, status: "completed") ← auto-unblocks task-002, task-003
+✓ T1: success (abc1234)
+
+TaskList → task-002, task-003 now ready (blockedBy empty)
+
+Spawning Wave 2: T2, T3 parallel
+TaskUpdate(task-002, status: "in_progress")
+TaskUpdate(task-003, status: "in_progress")
 
 [Agent "T2" completed]
-
+TaskUpdate(task-002, status: "completed")
+✓ T2: success (def5678)
 
 [Agent "T3" completed]
-
+TaskUpdate(task-003, status: "completed")
+✓ T3: success (ghi9012)
 
-Wave
+Wave 2 complete (2/2). Context: 35%
 
 ✓ doing-upload → done-upload
 ✓ Complete: 3/3 tasks
@@ -480,27 +585,43 @@ Next: Run /df:verify to verify specs and merge to main
 ```
 /df:execute (context: 10%)
 
+Loading PLAN.md...
+Registering native tasks...
+TaskCreate → T1 [SPIKE] (native: task-001)
+TaskCreate → T2 (native: task-002)
+TaskCreate → T3 (native: task-003)
+TaskUpdate(task-002, addBlockedBy: [task-001])
+TaskUpdate(task-003, addBlockedBy: [task-001])
+
 Checking experiment status...
 T1 [SPIKE]: No experiment yet, spike executable
 T2: Blocked by T1 (spike not validated)
 T3: Blocked by T1 (spike not validated)
 
-Spawning Wave 1: T1 [SPIKE]
+Spawning Wave 1: T1 [SPIKE]
+TaskUpdate(task-001, status: "in_progress")
 
 [Agent "T1 SPIKE" completed]
 ✓ T1: complete, verifying...
 
 Verifying T1...
 ✓ Spike T1 verified (throughput 8500 >= 7000)
+TaskUpdate(task-001, status: "completed") ← auto-unblocks task-002, task-003
 → upload--streaming--passed.md
 
-
+TaskList → task-002, task-003 now ready
+
+Spawning Wave 2: T2, T3 parallel
+TaskUpdate(task-002, status: "in_progress")
+TaskUpdate(task-003, status: "in_progress")
 
 [Agent "T2" completed]
-
+TaskUpdate(task-002, status: "completed")
+✓ T2: success (def5678)
 
 [Agent "T3" completed]
-
+TaskUpdate(task-003, status: "completed")
+✓ T3: success (ghi9012)
 
 Wave 2 complete (2/2). Context: 40%
 
@@ -515,11 +636,16 @@ Next: Run /df:verify to verify specs and merge to main
 ```
 /df:execute (context: 10%)
 
+Registering native tasks...
+TaskCreate → T1 [SPIKE], T2, T3 (with dependencies)
+
 Wave 1: T1 [SPIKE] (context: 15%)
+TaskUpdate(task-001, status: "in_progress")
 T1: complete, verifying...
 
 Verifying T1...
 ✗ Spike T1 failed verification (throughput 1500 < 7000)
+# Spike stays pending — dependents remain blocked
 → upload--streaming--failed.md
 
 ⚠ Spike T1 invalidated hypothesis
@@ -533,12 +659,17 @@ Next: Run /df:plan to generate new hypothesis spike
 ```
 /df:execute (context: 10%)
 
+Registering native tasks...
+TaskCreate → T1 [SPIKE], T2, T3 (with dependencies)
+
 Wave 1: T1 [SPIKE] (context: 15%)
+TaskUpdate(task-001, status: "in_progress")
 T1: complete (agent said: success), verifying...
 
 Verifying T1...
 ✗ Spike T1 failed verification (throughput 1500 < 7000)
 ⚠ Agent incorrectly marked as passed — overriding to FAILED
+TaskUpdate(task-001, status: "pending") ← reset, dependents stay blocked
 → upload--streaming--failed.md
 
 ⚠ Spike T1 invalidated hypothesis
package/src/commands/df/verify.md
CHANGED

@@ -40,35 +40,116 @@ Load:
 
 If no done-* specs: report counts, suggest `--doing`.
 
+### 1.5. DETECT PROJECT COMMANDS
+
+Detect build and test commands by inspecting project files in the worktree.
+
+**Config override always wins.** If `.deepflow/config.yaml` has `quality.test_command` or `quality.build_command`, use those.
+
+**Auto-detection (first match wins):**
+
+| File | Build | Test |
+|------|-------|------|
+| `package.json` with `scripts.build` | `npm run build` | `npm test` (if scripts.test is not default placeholder) |
+| `pyproject.toml` or `setup.py` | — | `pytest` |
+| `Cargo.toml` | `cargo build` | `cargo test` |
+| `go.mod` | `go build ./...` | `go test ./...` |
+| `Makefile` with `test` target | `make build` (if target exists) | `make test` |
+
+**Output:**
+- Commands found: `Build: npm run build | Test: npm test`
+- Nothing found: `⚠ No build/test commands detected. L0/L4 skipped. Set quality.test_command in .deepflow/config.yaml`
+
 ### 2. VERIFY EACH SPEC
 
+**L0: Build check** (if build command detected)
+
+Run the build command in the worktree:
+- Exit code 0 → L0 pass, continue to L1-L3
+- Exit code non-zero → L0 FAIL
+  - Report: "✗ L0: Build failed" with last 30 lines of output
+  - Add fix task: "Fix build errors" to PLAN.md
+  - Do NOT proceed to L1-L4 (no point checking if code doesn't build)
+
+**L1-L3: Static analysis** (via Explore agents)
+
 Check requirements, acceptance criteria, and quality (stubs/TODOs).
 Mark each: ✓ satisfied | ✗ missing | ⚠ partial
 
+**L4: Test execution** (if test command detected)
+
+Run AFTER L0 passes and L1-L3 complete. Run even if L1-L3 found issues — test failures reveal additional problems.
+
+- Run test command in the worktree (timeout from config, default 5 min)
+- Exit code 0 → L4 pass
+- Exit code non-zero → L4 FAIL
+  - Capture last 50 lines of output
+  - Report: "✗ L4: Tests failed (N of M)" with relevant output
+  - Add fix task: "Fix failing tests" with test output in description
+
+**Flaky test handling** (if `quality.test_retry_on_fail: true` in config):
+- If tests fail, re-run ONCE
+- Second run passes → L4 pass with note: "⚠ L4: Passed on retry (possible flaky test)"
+- Second run fails → genuine failure
+
 ### 3. GENERATE REPORT
 
-Report per spec
+Report per spec with L0/L4 status, requirements count, acceptance count, quality issues.
 
-**
+**Format on success:**
+```
+done-upload.md: L0 ✓ | 4/4 reqs ✓, 5/5 acceptance ✓ | L4 ✓ (12 tests) | 0 quality issues
+```
 
-**
+**Format on failure:**
+```
+done-upload.md: L0 ✓ | 4/4 reqs ✓, 5/5 acceptance ✓ | L4 ✗ (3 failed) | 0 quality issues
+
+Issues:
+✗ L4: 3 test failures
+  FAIL src/upload.test.ts > should validate file type
+  FAIL src/upload.test.ts > should reject oversized files
+
+Fix tasks added to PLAN.md:
+  T10: Fix 3 failing tests in upload module
+```
+
+**Gate conditions (ALL must pass to merge):**
+- L0: Build passes (or no build command detected)
+- L1-L3: All requirements satisfied, no stubs, properly wired
+- L4: Tests pass (or no test command detected)
+
+**If all gates pass:** Proceed to Post-Verification merge.
+
+**If issues found:** Add fix tasks to PLAN.md in the worktree and register as native tasks, then loop back to execute:
 
 1. Discover worktree (same logic as Post-Verification step 1)
 2. Write new fix tasks to `{worktree_path}/PLAN.md` under the existing spec section
    - Task IDs continue from last (e.g. if T9 was last, fixes start at T10)
    - Format: `- [ ] **T10**: Fix {description}` with `Files:` and details
-3.
+3. Register fix tasks as native tasks for immediate tracking:
+   ```
+   For each fix task added:
+     TaskCreate(subject: "T10: Fix {description}", description: "...", activeForm: "Fixing {description}")
+     TaskUpdate(addBlockedBy: [...]) if dependencies exist
+   ```
+   This allows `/df:execute --continue` to find fix tasks via TaskList immediately.
+4. Output report + next step:
 
 ```
-done-upload.md: 4/4 reqs ✓, 3/5 acceptance
+done-upload.md: L0 ✓ | 4/4 reqs ✓, 3/5 acceptance ✗ | L4 ✗ (2 failed) | 1 quality issue
 
 Issues:
 ✗ AC-3: YAML parsing missing for consolation
+✗ L4: 2 test failures
+  FAIL src/upload.test.ts > should validate file type
+  FAIL src/upload.test.ts > should reject oversized files
 ⚠ Quality: TODO in parse_config()
 
 Fix tasks added to PLAN.md:
 T10: Add YAML parsing for consolation section
-T11:
+T11: Fix 2 failing tests in upload module
+T12: Remove TODO in parse_config()
 
 Run /df:execute --continue to fix in the same worktree.
 ```
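The L4 retry policy described in this hunk (re-run ONCE when `quality.test_retry_on_fail` is set) is compact enough to sketch. `run_tests` is an injected callable returning an exit code; the helper name and return tuple are assumptions:

```python
# Hypothetical sketch of the L4 gate with single-retry flaky-test handling.
def run_l4(run_tests, retry_on_fail):
    """run_tests() -> exit code; returns (passed, note)."""
    if run_tests() == 0:
        return True, ""
    # First run failed: optionally retry exactly once.
    if retry_on_fail and run_tests() == 0:
        return True, "⚠ L4: Passed on retry (possible flaky test)"
    return False, "✗ L4: Tests failed"
```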
@@ -98,14 +179,16 @@ Files: ...
 
 ## Verification Levels
 
-| Level | Check | Method |
-
-|
-|
-|
-|
+| Level | Check | Method | Runner |
+|-------|-------|--------|--------|
+| L0: Builds | Code compiles/builds | Run build command | Orchestrator (Bash) |
+| L1: Exists | File/function exists | Glob/Grep | Explore agents |
+| L2: Substantive | Real code, not stub | Read + analyze | Explore agents |
+| L3: Wired | Integrated into system | Trace imports/calls | Explore agents |
+| L4: Tested | Tests pass | Run test command | Orchestrator (Bash) |
 
-Default:
+**Default: L0 through L4.** L0 and L4 are skipped ONLY if no build/test command is detected (see step 1.5).
+L0 and L4 run directly via Bash — Explore agents cannot execute commands.
 
 ## Rules
 - **Never use TaskOutput** — Returns full transcripts that explode context
@@ -140,10 +223,12 @@ Scale: 1-2 agents per spec, cap 10.
 ```
 /df:verify
 
-
-
+Build: npm run build | Test: npm test
+
+done-upload.md: L0 ✓ | 4/4 reqs ✓, 5/5 acceptance ✓ | L4 ✓ (12 tests) | 0 quality issues
+done-auth.md: L0 ✓ | 2/2 reqs ✓, 3/3 acceptance ✓ | L4 ✓ (8 tests) | 0 quality issues
 
-✓ All
+✓ All gates passed
 
 ✓ Merged df/upload to main
 ✓ Cleaned up worktree and branch
@@ -156,22 +241,29 @@ Learnings captured:
 ```
 /df:verify --doing
 
-
+Build: npm run build | Test: npm test
+
+doing-upload.md: L0 ✓ | 4/4 reqs ✓, 3/5 acceptance ✗ | L4 ✗ (3 failed) | 1 quality issue
 
 Issues:
 ✗ AC-3: YAML parsing missing for consolation
+✗ L4: 3 test failures
+  FAIL src/upload.test.ts > should validate file type
+  FAIL src/upload.test.ts > should reject oversized files
+  FAIL src/upload.test.ts > should handle empty input
 ⚠ Quality: TODO in parse_config()
 
 Fix tasks added to PLAN.md:
 T10: Add YAML parsing for consolation section
-T11:
+T11: Fix 3 failing tests in upload module
+T12: Remove TODO in parse_config()
 
 Run /df:execute --continue to fix in the same worktree.
 ```
 
 ## Post-Verification: Worktree Merge & Cleanup
 
-**Only runs when ALL
+**Only runs when ALL gates pass** (L0 build, L1-L3 static analysis, L4 tests). If any gate fails, fix tasks were added to PLAN.md instead (see step 3).
 
 ### 1. DISCOVER WORKTREE
 
package/templates/config-template.yaml
CHANGED

@@ -61,3 +61,17 @@ worktree:
 
 # Keep worktree after failed execution for debugging
 cleanup_on_fail: false
+
+# Quality gates for /df:verify
+quality:
+  # Override auto-detected build command (e.g., "npm run build", "cargo build")
+  build_command: ""
+
+  # Override auto-detected test command (e.g., "npm test", "pytest", "go test ./...")
+  test_command: ""
+
+  # Test timeout in seconds (default: 300 = 5 minutes)
+  test_timeout: 300
+
+  # Retry flaky tests once before failing (default: true)
+  test_retry_on_fail: true
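Since empty-string commands in this template mean "not overridden", resolving them against auto-detection is a two-line fallback. A hypothetical sketch of that precedence (the `resolve_commands` helper is not part of the package):

```python
# Hypothetical sketch: the quality section overrides auto-detection; empty
# strings fall through to the detected commands.
def resolve_commands(config, detected_build, detected_test):
    quality = config.get("quality", {})
    build = quality.get("build_command") or detected_build
    test = quality.get("test_command") or detected_test
    return build, test
```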
|