deepflow 0.1.26 → 0.1.28

This diff shows the content of publicly released package versions as published to a supported registry. It is provided for informational purposes only and reflects the changes between the versions as they appear in the public registry.
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "deepflow",
-  "version": "0.1.26",
+  "version": "0.1.28",
   "description": "Stay in flow state - lightweight spec-driven task orchestration for Claude Code",
   "keywords": [
     "claude",
@@ -29,6 +29,7 @@ Implement tasks from PLAN.md with parallel agents, atomic commits, and context-e
 | Agent | subagent_type | model | Purpose |
 |-------|---------------|-------|---------|
 | Implementation | `general-purpose` | `sonnet` | Task implementation |
+| Spike Verifier | `reasoner` | `opus` | Verify spike pass/fail is correct |
 | Debugger | `reasoner` | `opus` | Debugging failures |
 
 ## Context-Aware Execution
@@ -57,24 +58,90 @@ commit: abc1234
 summary: "one line"
 ```
 
+**Spike result file** `.deepflow/results/{task_id}.yaml` (additional fields):
+```yaml
+task: T1
+type: spike
+status: success|failed
+commit: abc1234
+summary: "one line"
+criteria:
+  - name: "throughput"
+    target: ">= 7000 g/s"
+    actual: "1500 g/s"
+    met: false
+  - name: "memory usage"
+    target: "< 500 MB"
+    actual: "320 MB"
+    met: true
+all_criteria_met: false  # ALL must be true for the spike to pass
+experiment_file: ".deepflow/experiments/upload--streaming--failed.md"
+```
+
+**CRITICAL:** `status` MUST equal `success` only if `all_criteria_met: true`. The spike verifier will reject mismatches.
+
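The consistency rule above can be sketched as a check run against a parsed result file. This is an editorial illustration, not part of the package; the field names come from the schema above:

```python
def check_spike_result(result: dict) -> list[str]:
    """Return a list of consistency problems in a parsed spike result."""
    problems = []
    criteria = result.get("criteria", [])
    all_met = all(c.get("met") is True for c in criteria)

    # all_criteria_met must agree with the per-criterion flags
    if result.get("all_criteria_met") != all_met:
        problems.append("all_criteria_met does not match the criteria flags")

    # status may be success only when every criterion was met
    if result.get("status") == "success" and not all_met:
        problems.append("status is success but not all criteria were met")
    return problems

# Example: the failed-throughput result from the schema above,
# but with an agent that incorrectly claimed success
result = {
    "task": "T1", "type": "spike", "status": "success",
    "criteria": [
        {"name": "throughput", "target": ">= 7000 g/s", "actual": "1500 g/s", "met": False},
        {"name": "memory usage", "target": "< 500 MB", "actual": "320 MB", "met": True},
    ],
    "all_criteria_met": False,
}
print(check_spike_result(result))  # ['status is success but not all criteria were met']
```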
 ## Checkpoint & Resume
 
-**File:** `.deepflow/checkpoint.json` — stores completed tasks, current wave.
+**File:** `.deepflow/checkpoint.json` — stored in the WORKTREE directory, not in main.
 
-**On checkpoint:** Complete wave → update PLAN.md → save → exit.
-**Resume:** `--continue` loads checkpoint, skips completed tasks.
+**Schema:**
+```json
+{
+  "completed_tasks": ["T1", "T2"],
+  "current_wave": 2,
+  "worktree_path": ".deepflow/worktrees/df/doing-upload/20260202-1430",
+  "worktree_branch": "df/doing-upload/20260202-1430"
+}
+```
+
+**On checkpoint:** Complete wave → update PLAN.md → save to worktree → exit.
+**Resume:** `--continue` loads the checkpoint, verifies the worktree, skips completed tasks.
 
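As an editorial illustration (Python; the real command is a prompt, not a script), the resume check described above might look like this, with paths following the schema above:

```python
import json
import os
import sys

CHECKPOINT = ".deepflow/checkpoint.json"

def load_checkpoint():
    """Load the checkpoint and verify its worktree still exists on disk."""
    if not os.path.exists(CHECKPOINT):
        return None  # no checkpoint: start fresh
    with open(CHECKPOINT) as f:
        cp = json.load(f)
    worktree = cp.get("worktree_path")
    if worktree and not os.path.isdir(worktree):
        # matches the documented error path for a deleted worktree
        sys.exit("Worktree deleted. Use --fresh")
    return cp
```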
 ## Behavior
 
 ### 1. CHECK CHECKPOINT
 
 ```
---continue → Load and resume
+--continue → Load checkpoint
+  → If worktree_path exists:
+    → Verify worktree still exists on disk
+    → If missing: Error "Worktree deleted. Use --fresh"
+    → If exists: Use it, skip worktree creation
+  → Resume execution with completed tasks
 --fresh → Delete checkpoint, start fresh
 checkpoint exists → Prompt: "Resume? (y/n)"
 else → Start fresh
 ```
 
+### 1.5. CREATE WORKTREE
+
+Before spawning any agents, create an isolated worktree:
+
+```
+# Check that main is clean (ignore untracked files)
+git diff --quiet HEAD || Error: "Main has uncommitted changes. Commit or stash first."
+
+# Generate worktree path
+SPEC_NAME=$(basename spec/doing-*.md .md | sed 's/doing-//')
+TIMESTAMP=$(date +%Y%m%d-%H%M)
+BRANCH_NAME="df/${SPEC_NAME}/${TIMESTAMP}"
+WORKTREE_PATH=".deepflow/worktrees/${BRANCH_NAME}"
+
+# Create worktree
+git worktree add -b "${BRANCH_NAME}" "${WORKTREE_PATH}"
+
+# Store in checkpoint for resume
+checkpoint.worktree_path = WORKTREE_PATH
+checkpoint.worktree_branch = BRANCH_NAME
+```
+
+**Resume handling:**
+- If the checkpoint has `worktree_path` → verify it exists, use it
+- If the worktree is missing → Error: "Worktree deleted. Use --fresh"
+
+**Existing worktree handling:**
+- If a worktree exists for the same spec → Prompt: "Resume existing worktree? (y/n/delete)"
+
 ### 2. LOAD PLAN
 
 ```
@@ -158,9 +225,57 @@ Same-file conflicts: spawn sequentially instead.
 **Spike Task Execution:**
 When spawning a spike task, the agent MUST:
 1. Execute the minimal validation method
-2. Record result in experiment file (update status: `--passed.md` or `--failed.md`)
-3. If passed: implementation tasks become unblocked
-4. If failed: record conclusion with "next hypothesis" for future planning
+2. Record a structured criteria evaluation in the result file (see the spike result schema above)
+3. Write the experiment file with `--active.md` status (the verifier determines final status)
+4. Commit as `spike({spec}): validate {hypothesis}`
+
+**IMPORTANT:** The spike agent writes `--active.md`, NOT `--passed.md` or `--failed.md`. The verifier determines the final status.
+
+### 6.5. VERIFY SPIKE RESULTS
+
+After a spike completes, spawn the verifier BEFORE unblocking implementation tasks.
+
+**Trigger:** Spike result file detected (`.deepflow/results/T{n}.yaml` with `type: spike`)
+
+**Spawn:**
+```
+Task(subagent_type="reasoner", model="opus", prompt=VERIFIER_PROMPT)
+```
+
+**Verifier Prompt:**
+```
+SPIKE VERIFICATION — Be skeptical. Catch false positives.
+
+Task: {task_id}
+Result: {worktree_path}/.deepflow/results/{task_id}.yaml
+Experiment: {worktree_path}/.deepflow/experiments/{topic}--{hypothesis}--active.md
+
+For each criterion in the result file:
+1. Is `actual` a concrete number? (reject "good", "improved", "better")
+2. Does `actual` satisfy `target`? Do the math.
+3. Is `met` correct?
+
+Reject these patterns:
+- "Works but doesn't meet target" → FAILED
+- "Close enough" → FAILED
+- Actual 1500 vs Target >= 7000 → FAILED
+
+Output to {worktree_path}/.deepflow/results/{task_id}-verified.yaml:
+verified_status: VERIFIED_PASS|VERIFIED_FAIL
+override: true|false
+reason: "one line"
+
+Then rename the experiment:
+- VERIFIED_PASS → --passed.md
+- VERIFIED_FAIL → --failed.md (add "Next hypothesis:" to Conclusion)
+```
+
+**Gate:**
+```
+VERIFIED_PASS → Unblock, log "✓ Spike {task_id} verified"
+VERIFIED_FAIL → Block, log "✗ Spike {task_id} failed verification"
+If override: log "⚠ Agent incorrectly marked as passed"
+```
 
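The "do the math" step can be sketched as a small comparator for criteria in the form used by the result schema (e.g. `>= 7000 g/s`). This is an illustrative sketch, not part of the package; it assumes targets look like `<op> <number> [unit]` and that `actual` uses the same unit:

```python
import re

def parse_measure(s: str):
    """Extract an optional operator and the numeric value from '>= 7000 g/s' or '1500 g/s'."""
    m = re.match(r"\s*(>=|<=|>|<|==)?\s*([0-9]+(?:\.[0-9]+)?)", s)
    if not m:
        # rejects vague values like "good", "improved", "better"
        raise ValueError(f"no concrete number in {s!r}")
    return m.group(1), float(m.group(2))

def criterion_met(target: str, actual: str) -> bool:
    """Check whether the measured value satisfies the target expression."""
    op, limit = parse_measure(target)
    _, value = parse_measure(actual)
    checks = {">=": value >= limit, "<=": value <= limit,
              ">": value > limit, "<": value < limit, "==": value == limit}
    return checks[op or "=="]

print(criterion_met(">= 7000 g/s", "1500 g/s"))  # False — 1500 < 7000, spike failed
print(criterion_met("< 500 MB", "320 MB"))       # True
```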
 **On failure, use Task tool to spawn reasoner:**
 ```
@@ -178,42 +293,87 @@ Task tool parameters:
 Files: {target files}
 Spec: {spec_name}
 
+**IMPORTANT: Working Directory**
+All file operations MUST use this absolute path as base:
+{worktree_absolute_path}
+
+Example: To edit src/foo.ts, use:
+{worktree_absolute_path}/src/foo.ts
+
+Do NOT write files to the main project directory.
+
 Implement, test, commit as feat({spec}): {description}.
-Write result to .deepflow/results/{task_id}.yaml
+Write result to {worktree_absolute_path}/.deepflow/results/{task_id}.yaml
 ```
 
 **Spike Task:**
 ```
 {task_id} [SPIKE]: {hypothesis}
 Type: spike
-Method: {minimal steps to validate}
-Success criteria: {how to know it passed}
-Time-box: {duration}
-Experiment file: {.deepflow/experiments/{topic}--{hypothesis}--active.md}
-Spec: {spec_name}
+Method: {minimal steps}
+Success criteria: {measurable targets}
+Experiment file: {worktree_absolute_path}/.deepflow/experiments/{topic}--{hypothesis}--active.md
+
+Working directory: {worktree_absolute_path}
+
+Steps:
+1. Execute the method
+2. For EACH criterion: record target, measure actual, compare (show the math)
+3. Write the experiment as --active.md (the verifier determines final status)
+4. Commit: spike({spec}): validate {hypothesis}
+5. Write the result to .deepflow/results/{task_id}.yaml (see the spike result schema)
+
+Rules:
+- `met: true` ONLY if actual satisfies target
+- `status: success` ONLY if ALL criteria are met
+- Worse than baseline = FAILED (baseline 7k, actual 1.5k → FAILED)
+- "Close enough" = FAILED
+- The verifier will check. False positives waste resources.
+```
+
+### 8. FAILURE HANDLING
 
-Execute the minimal validation:
-1. Follow the method steps exactly
-2. Measure against success criteria
-3. Update experiment file with result:
-- If passed: rename to --passed.md, record findings
-- If failed: rename to --failed.md, record conclusion with "next hypothesis"
-4. Commit as spike({spec}): validate {hypothesis}
-5. Write result to .deepflow/results/{task_id}.yaml
+When a task fails and cannot be auto-fixed:
 
-Result status:
-- success = hypothesis validated (passed)
-- failed = hypothesis invalidated (failed experiment, NOT agent error)
+**Behavior:**
+1. Leave the worktree intact at `{worktree_path}`
+2. Keep checkpoint.json for a potential resume
+3. Output debugging instructions
+
+**Output:**
 ```
+✗ Task T3 failed after retry
+
+Worktree preserved for debugging:
+  Path: .deepflow/worktrees/df/doing-upload/20260202-1430
+  Branch: df/doing-upload/20260202-1430
+
+To investigate:
+  cd .deepflow/worktrees/df/doing-upload/20260202-1430
+  # examine files, run tests, etc.
+
+To resume after fixing:
+  /df:execute --continue
+
+To discard and start fresh:
+  git worktree remove --force .deepflow/worktrees/df/doing-upload/20260202-1430
+  git branch -D df/doing-upload/20260202-1430
+  /df:execute --fresh
+```
+
+**Key points:**
+- Never auto-delete the worktree on failure (cleanup_on_fail: false by default)
+- Always provide the exact cleanup commands
+- The checkpoint remains so --continue can work after a manual fix
 
-### 8. COMPLETE SPECS
+### 9. COMPLETE SPECS
 
 When all tasks done for a `doing-*` spec:
 1. Embed history in spec: `## Completed` section
 2. Rename: `doing-upload.md` → `done-upload.md`
 3. Remove section from PLAN.md
 
-### 9. ITERATE
+### 10. ITERATE
 
 Repeat until: all done, all blocked, or checkpoint.
 
@@ -253,14 +413,14 @@ Checking experiment status...
 T2: Blocked by T1 (spike not validated)
 T3: Blocked by T1 (spike not validated)
 
-Wave 1: T1 [SPIKE] (context: 20%)
-T1: success (abc1234) → upload--streaming--passed.md
+Wave 1: T1 [SPIKE] (context: 15%)
+T1: complete, verifying...
 
-Checking experiment status...
-T2: Experiment passed, unblocked
-T3: Experiment passed, unblocked
+Verifying T1...
+Spike T1 verified (throughput 8500 >= 7000)
+upload--streaming--passed.md
 
-Wave 2: T2, T3 parallel (context: 45%)
+Wave 2: T2, T3 parallel (context: 40%)
 T2: success (def5678)
 T3: success (ghi9012)
 
@@ -268,20 +428,38 @@ Wave 2: T2, T3 parallel (context: 45%)
 ✓ Complete: 3/3 tasks
 ```
 
-### Spike Failed
+### Spike Failed (Agent Correctly Reported)
 
 ```
 /df:execute (context: 10%)
 
-Wave 1: T1 [SPIKE] (context: 20%)
-T1: failed → upload--streaming--failed.md
+Wave 1: T1 [SPIKE] (context: 15%)
+T1: complete, verifying...
 
-Checking experiment status...
-T2: Blocked - Experiment failed
-T3: ⚠ Blocked - Experiment failed
+Verifying T1...
+Spike T1 failed verification (throughput 1500 < 7000)
+upload--streaming--failed.md
+
+⚠ Spike T1 invalidated hypothesis
+→ Run /df:plan to generate new hypothesis spike
+
+Complete: 1/3 tasks (2 blocked by failed experiment)
+```
+
+### Spike Failed (Verifier Override)
+
+```
+/df:execute (context: 10%)
+
+Wave 1: T1 [SPIKE] (context: 15%)
+T1: complete (agent said: success), verifying...
+
+Verifying T1...
+✗ Spike T1 failed verification (throughput 1500 < 7000)
+⚠ Agent incorrectly marked as passed — overriding to FAILED
+→ upload--streaming--failed.md
 
 ⚠ Spike T1 invalidated hypothesis
-Experiment: upload--streaming--failed.md
 → Run /df:plan to generate new hypothesis spike
 
 Complete: 1/3 tasks (2 blocked by failed experiment)
@@ -115,3 +115,46 @@ Learnings captured:
 → experiments/perf--streaming-upload--success.md
 → experiments/auth--jwt-refresh-rotation--success.md
 ```
+
+## Post-Verification: Worktree Merge & Cleanup
+
+After all verification passes:
+
+### 1. MERGE TO MAIN
+
+```bash
+# Get the worktree branch from the checkpoint
+WORKTREE_BRANCH=$(jq -r '.worktree_branch' .deepflow/checkpoint.json)
+
+# Switch to main and merge
+git checkout main
+git merge "${WORKTREE_BRANCH}" --no-ff -m "feat({spec}): merge verified changes"
+```
+
+**On merge conflict:**
+- Keep the worktree intact for manual resolution
+- Output: "Merge conflict detected. Resolve manually, then run /df:verify --merge-only"
+- Exit without cleanup
+
+### 2. CLEANUP WORKTREE
+
+After a successful merge:
+
+```bash
+# Get the worktree path from the checkpoint
+WORKTREE_PATH=$(jq -r '.worktree_path' .deepflow/checkpoint.json)
+
+# Remove the worktree and branch
+git worktree remove --force "${WORKTREE_PATH}"
+git branch -d "${WORKTREE_BRANCH}"
+
+# Remove the checkpoint
+rm .deepflow/checkpoint.json
+```
+
+**Output on success:**
+```
+✓ Merged df/doing-upload/20260202-1430 to main
+✓ Cleaned up worktree and branch
+✓ Spec complete: doing-upload → done-upload
+```
@@ -43,3 +43,21 @@ commits:
   format: "feat({spec}): {description}"
   atomic: true          # One task = one commit
   push_after: complete  # Or "each" for every commit
+
+# Worktree isolation for /df:execute
+# Isolates all agent work in a git worktree, keeping main clean
+worktree:
+  # Enable worktree isolation (default: true)
+  enabled: true
+
+  # Base path for worktrees, relative to the project root
+  base_path: .deepflow/worktrees
+
+  # Branch name prefix for worktree branches
+  branch_prefix: df/
+
+  # Automatically clean up the worktree after a successful verify
+  cleanup_on_success: true
+
+  # Keep the worktree after a failed execution for debugging
+  cleanup_on_fail: false
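Resolving these settings against their documented defaults might look like the following sketch (illustrative only; `resolve_worktree_config` is a hypothetical helper, not part of the package, and the YAML is assumed to be already parsed into a dict):

```python
# Defaults taken from the comments in the config fragment above
DEFAULTS = {
    "enabled": True,
    "base_path": ".deepflow/worktrees",
    "branch_prefix": "df/",
    "cleanup_on_success": True,
    "cleanup_on_fail": False,
}

def resolve_worktree_config(user_config: dict) -> dict:
    """Merge the user's `worktree:` section over the documented defaults."""
    cfg = dict(DEFAULTS)
    cfg.update(user_config.get("worktree", {}))
    return cfg

cfg = resolve_worktree_config({"worktree": {"cleanup_on_success": False}})
print(cfg["cleanup_on_success"], cfg["branch_prefix"])  # False df/
```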