deepflow 0.1.44 → 0.1.46

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/bin/install.js CHANGED
@@ -146,7 +146,7 @@ async function main() {
  console.log(`${c.green}Installation complete!${c.reset}`);
  console.log('');
  console.log(`Installed to ${c.cyan}${CLAUDE_DIR}${c.reset}:`);
- console.log(' commands/df/ — /df:spec, /df:plan, /df:execute, /df:verify');
+ console.log(' commands/df/ — /df:discover, /df:debate, /df:spec, /df:plan, /df:execute, /df:verify');
  console.log(' skills/ — gap-discovery, atomic-commits, code-completeness');
  console.log(' agents/ — reasoner');
  if (level === 'global') {
@@ -165,7 +165,7 @@ async function main() {
  console.log(' 1. claude');
  }
  console.log(' 2. Describe what you want to build');
- console.log(' 3. /df:spec feature-name');
+ console.log(' 3. /df:discover feature-name');
  console.log('');
  }

package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "deepflow",
- "version": "0.1.44",
+ "version": "0.1.46",
  "description": "Stay in flow state - lightweight spec-driven task orchestration for Claude Code",
  "keywords": [
  "claude",
@@ -0,0 +1,283 @@
+ # /df:debate — Multi-Perspective Analysis
+
+ ## Orchestrator Role
+
+ You coordinate reasoner agents to debate a problem from multiple perspectives, then synthesize their arguments into a structured document.
+
+ **NEVER:** Read source files, use Glob/Grep directly, run git, use TaskOutput, use `run_in_background`, use Explore agents
+
+ **ONLY:** Spawn reasoner agents (non-background), write debate file, respond conversationally
+
+ ---
+
+ ## Purpose
+ Generate a multi-perspective analysis of a problem before formalizing into a spec. Surfaces tensions, trade-offs, and blind spots that a single perspective would miss.
+
+ ## Usage
+ ```
+ /df:debate <name>
+ ```
+
+ ## Skills & Agents
+
+ **Use Task tool to spawn agents:**
+ | Agent | subagent_type | model | Purpose |
+ |-------|---------------|-------|---------|
+ | User Advocate | `reasoner` | `opus` | UX, simplicity, real user needs |
+ | Tech Skeptic | `reasoner` | `opus` | Technical risks, hidden complexity, feasibility |
+ | Systems Thinker | `reasoner` | `opus` | Integration, scalability, long-term effects |
+ | LLM Efficiency | `reasoner` | `opus` | Token density, minimal scaffolding, navigable structure |
+ | Synthesizer | `reasoner` | `opus` | Merge perspectives into consensus + tensions |
+
+ ---
+
+ ## Behavior
+
+ ### 1. SUMMARIZE
+
+ Summarize the conversation context (from prior discover/conversation) in ~200 words. This summary will be passed to each perspective agent.
+
+ The summary should capture:
+ - The core problem being solved
+ - Key requirements mentioned
+ - Constraints and boundaries
+ - User's stated preferences and priorities
+
+ ### 2. SPAWN PERSPECTIVES
+
+ **Spawn ALL 4 perspective agents in ONE message (non-background, parallel):**
+
+ Each agent receives the same context summary but a different role. Each must:
+ - Argue from their perspective
+ - Identify risks the other perspectives might miss
+ - Propose concrete alternatives where they disagree with the likely approach
+
+ ```python
+ # All 4 in a single message — parallel, non-background:
+ Task(subagent_type="reasoner", model="opus", prompt="""
+ You are the USER ADVOCATE in a design debate.
+
+ ## Context
+ {summary}
+
+ ## Your Role
+ Argue from the perspective of the end user. Focus on:
+ - Simplicity and ease of use
+ - Real user needs vs assumed needs
+ - Friction points and cognitive load
+ - Whether the solution matches how users actually think
+
+ Provide:
+ 1. Your key arguments (3-5 points)
+ 2. Risks you see from a user perspective
+ 3. Concrete alternatives if you disagree with the current direction
+
+ Keep response under 400 words.
+ """)
+
+ Task(subagent_type="reasoner", model="opus", prompt="""
+ You are the TECH SKEPTIC in a design debate.
+
+ ## Context
+ {summary}
+
+ ## Your Role
+ Challenge technical assumptions and surface hidden complexity. Focus on:
+ - What could go wrong technically
+ - Hidden dependencies or coupling
+ - Complexity that seems simple but isn't
+ - Maintenance burden over time
+
+ Provide:
+ 1. Your key arguments (3-5 points)
+ 2. Technical risks others might overlook
+ 3. Simpler alternatives worth considering
+
+ Keep response under 400 words.
+ """)
+
+ Task(subagent_type="reasoner", model="opus", prompt="""
+ You are the SYSTEMS THINKER in a design debate.
+
+ ## Context
+ {summary}
+
+ ## Your Role
+ Analyze how this fits into the broader system. Focus on:
+ - Integration with existing components
+ - Scalability implications
+ - Second-order effects and unintended consequences
+ - Long-term evolution and extensibility
+
+ Provide:
+ 1. Your key arguments (3-5 points)
+ 2. Systemic risks and ripple effects
+ 3. Architectural alternatives worth considering
+
+ Keep response under 400 words.
+ """)
+
+ Task(subagent_type="reasoner", model="opus", prompt="""
+ You are the LLM EFFICIENCY expert in a design debate.
+
+ ## Context
+ {summary}
+
+ ## Your Role
+ Evaluate from the perspective of LLM consumption and interaction. Focus on:
+ - Token density: can the output be consumed efficiently by LLMs?
+ - Minimal scaffolding: avoid ceremony that adds tokens without information
+ - Navigable structure: can an LLM quickly find what it needs?
+ - Attention budget: does the design respect limited context windows?
+
+ Provide:
+ 1. Your key arguments (3-5 points)
+ 2. Efficiency risks others might not consider
+ 3. Alternatives that optimize for LLM consumption
+
+ Keep response under 400 words.
+ """)
+ ```
+
+ ### 3. SYNTHESIZE
+
+ After all 4 perspectives return, spawn 1 additional reasoner to synthesize:
+
+ ```python
+ Task(subagent_type="reasoner", model="opus", prompt="""
+ You are the SYNTHESIZER. Four perspectives have debated a design problem.
+
+ ## Context
+ {summary}
+
+ ## User Advocate's Arguments
+ {user_advocate_response}
+
+ ## Tech Skeptic's Arguments
+ {tech_skeptic_response}
+
+ ## Systems Thinker's Arguments
+ {systems_thinker_response}
+
+ ## LLM Efficiency's Arguments
+ {llm_efficiency_response}
+
+ ## Your Task
+ Synthesize these perspectives into:
+
+ 1. **Consensus** — Points where all or most perspectives agree
+ 2. **Tensions** — Unresolved disagreements and genuine trade-offs
+ 3. **Open Decisions** — Questions that need human judgment to resolve
+ 4. **Recommendation** — Your balanced recommendation considering all perspectives
+
+ Be specific. Name the tensions, don't smooth them over.
+
+ Keep response under 500 words.
+ """)
+ ```
+
+ ### 4. WRITE DEBATE FILE
+
+ Create `specs/.debate-{name}.md`:
+
+ ```markdown
+ # Debate: {Name}
+
+ ## Context
+ [~200 word summary from step 1]
+
+ ## Perspectives
+
+ ### User Advocate
+ [arguments from agent]
+
+ ### Tech Skeptic
+ [arguments from agent]
+
+ ### Systems Thinker
+ [arguments from agent]
+
+ ### LLM Efficiency
+ [arguments from agent]
+
+ ## Synthesis
+
+ ### Consensus
+ [from synthesizer]
+
+ ### Tensions
+ [from synthesizer]
+
+ ### Open Decisions
+ [from synthesizer]
+
+ ### Recommendation
+ [from synthesizer]
+ ```
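The file-assembly step above is a mechanical render of the template from the five agent responses. A minimal Python sketch, assuming dict-shaped inputs; the function names (`build_debate_doc`, `debate_path`) are illustrative, not part of deepflow:

```python
from pathlib import Path

ROLES = ("User Advocate", "Tech Skeptic", "Systems Thinker", "LLM Efficiency")
SECTIONS = ("Consensus", "Tensions", "Open Decisions", "Recommendation")

def build_debate_doc(name, summary, perspectives, synthesis):
    """Render the debate template from the four perspective responses
    plus the synthesizer's four sections."""
    parts = [f"# Debate: {name}", "", "## Context", summary, "", "## Perspectives"]
    for role in ROLES:
        parts += ["", f"### {role}", perspectives[role]]
    parts += ["", "## Synthesis"]
    for section in SECTIONS:
        parts += ["", f"### {section}", synthesis[section]]
    return "\n".join(parts) + "\n"

def debate_path(name, specs_dir="specs"):
    # Dot prefix marks the file as auxiliary, e.g. specs/.debate-auth.md
    return Path(specs_dir) / f".debate-{name}.md"
```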
+
+ ### 5. CONFIRM
+
+ After writing the file, present a brief summary to the user:
+
+ ```
+ ✓ Created specs/.debate-{name}.md
+
+ Key tensions:
+ - [tension 1]
+ - [tension 2]
+
+ Open decisions:
+ - [decision 1]
+ - [decision 2]
+
+ Next: Run /df:spec {name} to formalize into a specification
+ ```
+
+ ---
+
+ ## Rules
+
+ - **All 4 perspective agents MUST be spawned in ONE message** (parallel, non-background)
+ - **NEVER use `run_in_background`** — causes late notifications that pollute output
+ - **NEVER use TaskOutput** — returns full transcripts that explode context
+ - **NEVER use Explore agents** — this command doesn't read code
+ - **NEVER read source files directly** — agents receive context via prompt only
+ - Reasoner agents receive context through their prompt, not by reading files
+ - The debate file goes in `specs/` so `/df:spec` can reference it
+ - File name MUST be `.debate-{name}.md` (dot prefix = auxiliary file)
+ - Keep each perspective under 400 words, synthesis under 500 words
+
+ ## Example
+
+ ```
+ USER: /df:debate auth
+
+ CLAUDE: Let me summarize what we've discussed and get multiple perspectives
+ on the authentication design.
+
+ [Summarizes: ~200 words about auth requirements from conversation]
+
+ [Spawns 4 reasoner agents in parallel — User Advocate, Tech Skeptic,
+ Systems Thinker, LLM Efficiency]
+
+ [All 4 return their arguments]
+
+ [Spawns synthesizer agent with all 4 perspectives]
+
+ [Synthesizer returns consensus, tensions, open decisions, recommendation]
+
+ [Writes specs/.debate-auth.md]
+
+ ✓ Created specs/.debate-auth.md
+
+ Key tensions:
+ - OAuth complexity vs simpler API key approach
+ - User convenience (social login) vs privacy concerns
+ - Centralized auth service vs per-route middleware
+
+ Open decisions:
+ - Session storage strategy (JWT vs server-side)
+ - Token expiration policy
+
+ Next: Run /df:spec auth to formalize into a specification
+ ```
@@ -0,0 +1,182 @@
+ # /df:discover — Deep Problem Exploration
+
+ ## Orchestrator Role
+
+ You are a Socratic questioner. Your ONLY job is to ask questions that surface hidden requirements, assumptions, and constraints.
+
+ **NEVER:** Read source files, use Glob/Grep, spawn agents, create files, run git, use TaskOutput, use Task tool
+
+ **ONLY:** Ask questions using `AskUserQuestion` tool, respond conversationally
+
+ ---
+
+ ## Purpose
+ Explore a problem space deeply before formalizing into specs. Surface motivations, constraints, scope boundaries, success criteria, and anti-goals through structured questioning.
+
+ ## Usage
+ ```
+ /df:discover <name>
+ ```
+
+ ## Behavior
+
+ Work through these phases organically. You don't need to announce phases — let the conversation flow naturally. Move to the next phase when the current one feels sufficiently explored.
+
+ ### Phase 1: MOTIVATION
+ Why does this need to exist? What problem does it solve? Who suffers without it?
+
+ Example questions:
+ - What triggered the need for this?
+ - Who will use this and what's their current workaround?
+ - What happens if we don't build this?
+
+ ### Phase 2: CONTEXT
+ What already exists? What has been tried? What's the current state?
+
+ Example questions:
+ - Is there existing code or infrastructure that relates to this?
+ - Have you tried solving this before? What worked/didn't?
+ - Are there external systems or APIs involved?
+
+ ### Phase 3: SCOPE
+ What's in? What's out? What's the minimum viable version?
+
+ Example questions:
+ - What's the smallest version that would be useful?
+ - What features feel essential vs nice-to-have?
+ - Are there parts you explicitly want to exclude?
+
+ ### Phase 4: CONSTRAINTS
+ Technical limits, time pressure, resource boundaries?
+
+ Example questions:
+ - Are there performance requirements or SLAs?
+ - What technologies are non-negotiable?
+ - Is there a deadline or timeline pressure?
+
+ ### Phase 5: SUCCESS
+ How do we know it worked? What does "done" look like?
+
+ Example questions:
+ - How will you verify this works correctly?
+ - What metrics would indicate success?
+ - What would make you confident enough to ship?
+
+ ### Phase 6: ANTI-GOALS
+ What should we explicitly NOT do? What traps to avoid?
+
+ Example questions:
+ - What's the most common way this kind of feature gets over-engineered?
+ - Are there approaches you've seen fail elsewhere?
+ - What should we explicitly avoid building?
+
+ ---
+
+ ## Rules
+
+ ### Questioning Rules
+ - Use `AskUserQuestion` tool for structured questions with options
+ - Maximum **4 questions per `AskUserQuestion` call** (tool limit)
+ - Headers must be **≤12 characters**
+ - Mix structured questions (AskUserQuestion) with conversational follow-ups
+ - Ask follow-up questions based on answers — don't just march through phases mechanically
+ - Go deeper on surprising or unclear answers
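The two hard limits above (at most 4 questions per call, headers at most 12 characters) are easy to get wrong when drafting payloads. A minimal sketch of a checker; the helper name is an illustrative assumption, not part of deepflow or the tool itself:

```python
def validate_ask_user_question(payload):
    """Return a list of violations of the stated AskUserQuestion limits:
    at most 4 questions per call, headers at most 12 characters."""
    problems = []
    questions = payload.get("questions", [])
    if len(questions) > 4:
        problems.append(f"{len(questions)} questions exceeds the 4-per-call limit")
    for q in questions:
        header = q.get("header", "")
        if len(header) > 12:
            problems.append(f"header {header!r} is {len(header)} chars (max 12)")
    return problems
```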
+
+ ### Behavioral Rules
+ - **NEVER assume** — if something is ambiguous, ask
+ - **NEVER suggest ending** — the user decides when they're done
+ - **NEVER take action** — no code reading, no file creation, no agents
+ - **NEVER skip phases** — but adapt depth based on the problem
+ - Keep your responses short between questions — don't lecture
+ - Acknowledge answers briefly before asking the next question
+
+ ### When the User Wants to Move On
+ When the user signals they want to advance (e.g., "I think that's enough", "let's move on", "ready for next step"):
+
+ ```
+ Great, we've covered a lot of ground. To analyze this from multiple perspectives, run:
+
+ /df:debate {name}
+
+ Or if you want to go straight to a spec:
+
+ /df:spec {name}
+ ```
+
+ ## Example
+
+ ```
+ USER: /df:discover auth
+
+ CLAUDE: Let's explore the authentication problem space.
+
+ [Uses AskUserQuestion]
+ {
+ "questions": [
+ {
+ "question": "What's driving the need for auth right now?",
+ "header": "Motivation",
+ "multiSelect": false,
+ "options": [
+ {"label": "Security gap", "description": "Current system has no auth or weak auth"},
+ {"label": "New users", "description": "Opening up to users who need access control"},
+ {"label": "Compliance", "description": "Regulatory or policy requirement"}
+ ]
+ },
+ {
+ "question": "Who are the primary users that will authenticate?",
+ "header": "Users",
+ "multiSelect": true,
+ "options": [
+ {"label": "End users", "description": "Public-facing users of the product"},
+ {"label": "Internal team", "description": "Your team or organization members"},
+ {"label": "API clients", "description": "Other services or integrations"}
+ ]
+ }
+ ]
+ }
+
+ USER: [Selects: Security gap, End users + API clients]
+
+ CLAUDE: So there's a real security gap and you need to cover both human users
+ and machine clients. That's two different auth flows to think about.
+
+ [Uses AskUserQuestion]
+ {
+ "questions": [
+ {
+ "question": "Do end users currently have any way to identify themselves?",
+ "header": "Current",
+ "multiSelect": false,
+ "options": [
+ {"label": "Nothing", "description": "Completely open, no identification"},
+ {"label": "Basic auth", "description": "Simple username/password exists"},
+ {"label": "Third-party", "description": "Using external auth provider already"}
+ ]
+ },
+ {
+ "question": "For API clients, what authentication exists today?",
+ "header": "API auth",
+ "multiSelect": false,
+ "options": [
+ {"label": "None", "description": "APIs are open"},
+ {"label": "API keys", "description": "Static keys in headers"},
+ {"label": "OAuth/JWT", "description": "Token-based auth already in place"}
+ ]
+ }
+ ]
+ }
+
+ [... continues through phases based on answers ...]
+
+ USER: I think we've covered everything, let's move on.
+
+ CLAUDE: Great, we've covered a lot of ground. To analyze this from multiple
+ perspectives, run:
+
+ /df:debate auth
+
+ Or if you want to go straight to a spec:
+
+ /df:spec auth
+ ```
@@ -137,8 +137,10 @@ experiment_file: ".deepflow/experiments/upload--streaming--failed.md"
  }
  ```
 
+ Note: `completed_tasks` is kept for backward compatibility but is now derivable from PLAN.md `[x]` entries. The native task system (TaskList) is the primary source for runtime task status.
+
  **On checkpoint:** Complete wave → update PLAN.md → save to worktree → exit.
- **Resume:** `--continue` loads checkpoint, verifies worktree, skips completed tasks.
+ **Resume:** `--continue` loads checkpoint, verifies worktree, skips completed tasks. Native tasks are re-registered for remaining `[ ]` items only.
 
  ## Behavior
 
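The "derivable from PLAN.md `[x]` entries" claim above is a one-liner in practice, assuming the task-line format quoted later in this document (`- [ ] **T1**: description`); the helper name is illustrative:

```python
import re

def completed_task_ids(plan_md):
    """Task IDs whose PLAN.md checkbox is [x] — the derivable form of
    the checkpoint's completed_tasks list."""
    return re.findall(r"-\s*\[x\]\s*\*\*(T\d+)\*\*", plan_md)
```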
@@ -188,6 +190,30 @@ Load: PLAN.md (required), specs/doing-*.md, .deepflow/config.yaml
  If missing: "No PLAN.md found. Run /df:plan first."
  ```
 
+ ### 2.5. REGISTER NATIVE TASKS
+
+ Parse PLAN.md and create native tasks for tracking, dependency management, and UI spinners.
+
+ **For each uncompleted task (`[ ]`) in PLAN.md:**
+
+ ```
+ 1. TaskCreate:
+ - subject: "{task_id}: {description}" (e.g. "T1: Create upload endpoint")
+ - description: Full task block from PLAN.md (files, blocked by, type, etc.)
+ - activeForm: "{gerund form of description}" (e.g. "Creating upload endpoint")
+
+ 2. Store mapping: PLAN.md task_id (T1) → native task ID
+ ```
+
+ **After all tasks created, set up dependencies:**
+
+ ```
+ For each task with "Blocked by: T{n}, T{m}":
+ TaskUpdate(taskId: native_id, addBlockedBy: [native_id_of_Tn, native_id_of_Tm])
+ ```
+
+ **On `--continue`:** Only create tasks for remaining `[ ]` items (skip `[x]` completed).
+
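The parsing half of the registration step above can be sketched in Python, assuming the PLAN.md line formats quoted in this document (`- [ ] **T1**: description` with an indented `Blocked by: T1, T2` detail line); the function and regex names are illustrative:

```python
import re

TASK_RE = re.compile(r"-\s*\[( |x)\]\s*\*\*(T\d+)\*\*:\s*(.+)")
BLOCKED_RE = re.compile(r"Blocked by:\s*(.+)")

def parse_plan(plan_md):
    """Map task id -> {description, done, blocked_by} from PLAN.md text.
    Detail lines are attributed to the most recent task line."""
    tasks, current = {}, None
    for raw in plan_md.splitlines():
        line = raw.strip()
        m = TASK_RE.match(line)
        if m:
            current = m.group(2)
            tasks[current] = {
                "description": m.group(3),
                "done": m.group(1) == "x",
                "blocked_by": [],
            }
            continue
        b = BLOCKED_RE.match(line)
        if b and current:
            tasks[current]["blocked_by"] = [t.strip() for t in b.group(1).split(",")]
    return tasks
```

From this mapping, the orchestrator would call TaskCreate for each not-done entry and TaskUpdate with addBlockedBy for each non-empty `blocked_by` list.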
  ### 3. CHECK FOR UNPLANNED SPECS
 
  Warn if `specs/*.md` (excluding doing-/done-) exist. Non-blocking.
@@ -244,12 +270,30 @@ Topic extraction:
 
  ### 5. IDENTIFY READY TASKS
 
- Ready = `[ ]` + all `blocked_by` complete + experiment validated (if applicable) + not in checkpoint.
+ Use TaskList to find ready tasks (replaces manual PLAN.md parsing):
+
+ ```
+ Ready = TaskList results where:
+ - status: "pending"
+ - blockedBy: empty (auto-unblocked by native dependency system)
+ ```
+
+ **Cross-check with experiment validation** (for spike-blocked tasks):
+ - If task depends on spike AND experiment not `--passed.md` → still blocked
+ - TaskUpdate to add spike as blocker if not already set
+
+ Ready = TaskList pending + empty blockedBy + experiment validated (if applicable).
 
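The readiness rule above reduces to "pending, with every blocker already done". A minimal sketch, assuming an id -> {done, blocked_by} mapping as input (an assumption of this sketch, not deepflow's own data structure; the native TaskList query does this server-side):

```python
def ready_tasks(tasks):
    """Tasks that are not done and whose blockers are all done — the manual
    equivalent of a TaskList query for pending tasks with empty blockedBy."""
    done = {tid for tid, t in tasks.items() if t["done"]}
    return sorted(
        tid for tid, t in tasks.items()
        if not t["done"] and all(b in done for b in t["blocked_by"])
    )
```

Each wave spawns the current ready set; marking those tasks done and recomputing yields the next wave.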
  ### 6. SPAWN AGENTS
 
  Context ≥50%: checkpoint and exit.
 
+ **Before spawning each agent**, mark its native task as in_progress:
+ ```
+ TaskUpdate(taskId: native_id, status: "in_progress")
+ ```
+ This activates the UI spinner showing the task's activeForm (e.g. "Creating upload endpoint").
+
  **CRITICAL: Spawn ALL ready tasks in a SINGLE response with MULTIPLE Task tool calls.**
 
  DO NOT spawn one task, wait, then spawn another. Instead, call Task tool multiple times in the SAME message block. This enables true parallelism.
@@ -319,8 +363,15 @@ Then rename experiment:
 
  **Gate:**
  ```
- VERIFIED_PASS → Unblock, log "✓ Spike {task_id} verified"
- VERIFIED_FAIL → Block, log "✗ Spike {task_id} failed verification"
+ VERIFIED_PASS →
+ TaskUpdate(taskId: spike_native_id, status: "completed")
+ # Native system auto-unblocks dependent tasks
+ Log "✓ Spike {task_id} verified"
+
+ VERIFIED_FAIL →
+ # Spike task stays as pending, dependents remain blocked
+ # No TaskUpdate needed — native system keeps them blocked
+ Log "✗ Spike {task_id} failed verification"
  If override: log "⚠ Agent incorrectly marked as passed"
  ```
 
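The gate above is a two-way status decision. A tiny sketch for reference; the function name and return shape are illustrative assumptions, not deepflow code:

```python
def spike_gate(verified_pass, task_id):
    """Status transition for a spike after orchestrator-side verification."""
    if verified_pass:
        # Completion lets the native dependency system auto-unblock dependents.
        return {"status": "completed", "log": f"✓ Spike {task_id} verified"}
    # Leave the spike pending; dependents stay blocked, no TaskUpdate call needed.
    return {"status": "pending", "log": f"✗ Spike {task_id} failed verification"}
```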
@@ -390,6 +441,12 @@ Rules:
 
  When a task fails and cannot be auto-fixed:
 
+ **Native task update:**
+ ```
+ TaskUpdate(taskId: native_id, status: "pending") # Reset to pending, not deleted
+ ```
+ This keeps the task visible for retry. Dependent tasks remain blocked.
+
  **Behavior:**
  1. Leave worktree intact at `{worktree_path}`
  2. Keep checkpoint.json for potential resume
@@ -434,9 +491,11 @@ After spawning wave agents, your turn ENDS. Completion notifications drive the l
 
  **Per notification:**
  1. Read result file for the completed agent
- 2. Report ONE line: "✓ Tx: status (commit)"
- 3. If NOT all wave agents done → end turn, wait
- 4. If ALL wave agents done → check context, update PLAN.md, spawn next wave or finish
+ 2. TaskUpdate(taskId: native_id, status: "completed") → auto-unblocks dependent tasks
+ 3. Update PLAN.md: `[ ]` → `[x]` + commit hash (as before)
+ 4. Report ONE line: "✓ Tx: status (commit)"
+ 5. If NOT all wave agents done → end turn, wait
+ 6. If ALL wave agents done → use TaskList to find newly unblocked tasks, check context, spawn next wave or finish
 
  **Between waves:** Check context %. If ≥50%, checkpoint and exit.
 
@@ -456,18 +515,41 @@ After spawning wave agents, your turn ENDS. Completion notifications drive the l
 
  ```
  /df:execute (context: 12%)
- Spawning Wave 1: T1, T2, T3 parallel...
+
+ Loading PLAN.md...
+ T1: Create upload endpoint (ready)
+ T2: Add S3 service (blocked by T1)
+ T3: Add auth guard (blocked by T1)
+
+ Registering native tasks...
+ TaskCreate → T1 (native: task-001)
+ TaskCreate → T2 (native: task-002)
+ TaskCreate → T3 (native: task-003)
+ TaskUpdate(task-002, addBlockedBy: [task-001])
+ TaskUpdate(task-003, addBlockedBy: [task-001])
+
+ Spawning Wave 1: T1
+ TaskUpdate(task-001, status: "in_progress") ← spinner: "Creating upload endpoint"
 
  [Agent "T1" completed]
- T1: success (abc1234)
+ TaskUpdate(task-001, status: "completed") ← auto-unblocks task-002, task-003
+ ✓ T1: success (abc1234)
+
+ TaskList → task-002, task-003 now ready (blockedBy empty)
+
+ Spawning Wave 2: T2, T3 parallel
+ TaskUpdate(task-002, status: "in_progress")
+ TaskUpdate(task-003, status: "in_progress")
 
  [Agent "T2" completed]
- T2: success (def5678)
+ TaskUpdate(task-002, status: "completed")
+ ✓ T2: success (def5678)
 
  [Agent "T3" completed]
- T3: success (ghi9012)
+ TaskUpdate(task-003, status: "completed")
+ ✓ T3: success (ghi9012)
 
- Wave 1 complete (3/3). Context: 35%
+ Wave 2 complete (2/2). Context: 35%
 
  ✓ doing-upload → done-upload
  ✓ Complete: 3/3 tasks
@@ -480,27 +562,43 @@ Next: Run /df:verify to verify specs and merge to main
  ```
  /df:execute (context: 10%)
 
+ Loading PLAN.md...
+ Registering native tasks...
+ TaskCreate → T1 [SPIKE] (native: task-001)
+ TaskCreate → T2 (native: task-002)
+ TaskCreate → T3 (native: task-003)
+ TaskUpdate(task-002, addBlockedBy: [task-001])
+ TaskUpdate(task-003, addBlockedBy: [task-001])
+
  Checking experiment status...
  T1 [SPIKE]: No experiment yet, spike executable
  T2: Blocked by T1 (spike not validated)
  T3: Blocked by T1 (spike not validated)
 
- Spawning Wave 1: T1 [SPIKE]...
+ Spawning Wave 1: T1 [SPIKE]
+ TaskUpdate(task-001, status: "in_progress")
 
  [Agent "T1 SPIKE" completed]
  ✓ T1: complete, verifying...
 
  Verifying T1...
  ✓ Spike T1 verified (throughput 8500 >= 7000)
+ TaskUpdate(task-001, status: "completed") ← auto-unblocks task-002, task-003
  → upload--streaming--passed.md
 
- Spawning Wave 2: T2, T3 parallel...
+ TaskList → task-002, task-003 now ready
+
+ Spawning Wave 2: T2, T3 parallel
+ TaskUpdate(task-002, status: "in_progress")
+ TaskUpdate(task-003, status: "in_progress")
 
  [Agent "T2" completed]
- T2: success (def5678)
+ TaskUpdate(task-002, status: "completed")
+ ✓ T2: success (def5678)
 
  [Agent "T3" completed]
- T3: success (ghi9012)
+ TaskUpdate(task-003, status: "completed")
+ ✓ T3: success (ghi9012)
 
  Wave 2 complete (2/2). Context: 40%
 
@@ -515,11 +613,16 @@ Next: Run /df:verify to verify specs and merge to main
  ```
  /df:execute (context: 10%)
 
+ Registering native tasks...
+ TaskCreate → T1 [SPIKE], T2, T3 (with dependencies)
+
  Wave 1: T1 [SPIKE] (context: 15%)
+ TaskUpdate(task-001, status: "in_progress")
  T1: complete, verifying...
 
  Verifying T1...
  ✗ Spike T1 failed verification (throughput 1500 < 7000)
+ # Spike stays pending — dependents remain blocked
  → upload--streaming--failed.md
 
  ⚠ Spike T1 invalidated hypothesis
@@ -533,12 +636,17 @@ Next: Run /df:plan to generate new hypothesis spike
  ```
  /df:execute (context: 10%)
 
+ Registering native tasks...
+ TaskCreate → T1 [SPIKE], T2, T3 (with dependencies)
+
  Wave 1: T1 [SPIKE] (context: 15%)
+ TaskUpdate(task-001, status: "in_progress")
  T1: complete (agent said: success), verifying...
 
  Verifying T1...
  ✗ Spike T1 failed verification (throughput 1500 < 7000)
  ⚠ Agent incorrectly marked as passed — overriding to FAILED
+ TaskUpdate(task-001, status: "pending") ← reset, dependents stay blocked
  → upload--streaming--failed.md
 
  ⚠ Spike T1 invalidated hypothesis
@@ -31,6 +31,8 @@ Transform conversation context into a structured specification file.
 
  ### 1. GATHER CODEBASE CONTEXT
 
+ **Check for debate file first:** If `specs/.debate-{name}.md` exists, read it using the Read tool. Pass its content (especially the Synthesis section) to the reasoner agent in step 3 as additional context. The debate file contains multi-perspective analysis that should inform requirements and constraints.
+
  **NEVER use `run_in_background` for Explore agents** — causes late "Agent completed" notifications that pollute output after work is done.
 
  **NEVER use TaskOutput** — returns full agent transcripts (100KB+) that explode context.
@@ -51,13 +51,20 @@ Report per spec: requirements count, acceptance count, quality issues.
 
  **If all pass:** Proceed to Post-Verification merge.
 
- **If issues found:** Add fix tasks to PLAN.md in the worktree and loop back to execute:
+ **If issues found:** Add fix tasks to PLAN.md in the worktree and register them as native tasks, then loop back to execute:
 
  1. Discover worktree (same logic as Post-Verification step 1)
  2. Write new fix tasks to `{worktree_path}/PLAN.md` under the existing spec section
  - Task IDs continue from last (e.g. if T9 was last, fixes start at T10)
  - Format: `- [ ] **T10**: Fix {description}` with `Files:` and details
- 3. Output report + next step:
+ 3. Register fix tasks as native tasks for immediate tracking:
+ ```
+ For each fix task added:
+ TaskCreate(subject: "T10: Fix {description}", description: "...", activeForm: "Fixing {description}")
+ TaskUpdate(addBlockedBy: [...]) if dependencies exist
+ ```
+ This allows `/df:execute --continue` to find fix tasks via TaskList immediately.
+ 4. Output report + next step:
 
  ```
  done-upload.md: 4/4 reqs ✓, 3/5 acceptance ✗, 1 quality issue