deepflow 0.1.23 → 0.1.26

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "deepflow",
- "version": "0.1.23",
+ "version": "0.1.26",
  "description": "Stay in flow state - lightweight spec-driven task orchestration for Claude Code",
  "keywords": [
  "claude",
@@ -24,8 +24,12 @@ Implement tasks from PLAN.md with parallel agents, atomic commits, and context-e

  ## Skills & Agents
  - Skill: `atomic-commits` — Clean commit protocol
- - Agent: `general-purpose` (Sonnet) — Task implementation
- - Agent: `reasoner` (Opus) Debugging failures
+
+ **Use Task tool to spawn agents:**
+ | Agent | subagent_type | model | Purpose |
+ |-------|---------------|-------|---------|
+ | Implementation | `general-purpose` | `sonnet` | Task implementation |
+ | Debugger | `reasoner` | `opus` | Debugging failures |

  ## Context-Aware Execution

@@ -82,20 +86,93 @@ If missing: "No PLAN.md found. Run /df:plan first."

  Warn if `specs/*.md` (excluding doing-/done-) exist. Non-blocking.

- ### 4. IDENTIFY READY TASKS
+ ### 4. CHECK EXPERIMENT STATUS (HYPOTHESIS VALIDATION)
+
+ **Before identifying ready tasks**, check experiment validation for full implementation tasks.
+
+ **Task Types:**
+ - **Spike tasks**: Have `[SPIKE]` in title OR `Type: spike` in description — always executable
+ - **Full implementation tasks**: Blocked by spike tasks — require validated experiment
+
+ **Validation Flow:**
+
+ ```
+ For each task in plan:
+ If task is spike task:
+ → Mark as executable (spikes are always allowed)
+ Else if task is blocked by a spike task (T{n}):
+ → Find related experiment file in .deepflow/experiments/
+ → Check experiment status:
+ - --passed.md exists → Unblock, proceed with implementation
+ - --failed.md exists → Keep blocked, warn user
+ - --active.md exists → Keep blocked, spike in progress
+ - No experiment → Keep blocked, spike not started
+ ```
+
+ **Experiment File Discovery:**
+
+ ```
+ Glob: .deepflow/experiments/{topic}--*--{status}.md

- Ready = `[ ]` + all `blocked_by` complete + not in checkpoint.
+ Topic extraction:
+ 1. From spike task: experiment file path in task description
+ 2. From spec name: doing-{topic} → {topic}
+ 3. Fuzzy match: normalize and match
+ ```

- ### 5. SPAWN AGENTS
+ **Status Handling:**
+
+ | Experiment Status | Task Status | Action |
+ |-------------------|-------------|--------|
+ | `--passed.md` | Ready | Execute full implementation |
+ | `--failed.md` | Blocked | Skip, warn: "Experiment failed, re-plan needed" |
+ | `--active.md` | Blocked | Skip, info: "Waiting for spike completion" |
+ | Not found | Blocked | Skip, info: "Spike task not executed yet" |
+
+ **Warning Output:**
+
+ ```
+ ⚠ T3 blocked: Experiment 'upload--streaming--failed.md' did not validate
+ → Run /df:plan to generate new hypothesis spike
+ ```
+
+ ### 5. IDENTIFY READY TASKS
+
+ Ready = `[ ]` + all `blocked_by` complete + experiment validated (if applicable) + not in checkpoint.
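[Editor's note: the discovery and readiness rules in the hunk above can be sketched in Python. This is a minimal illustration only — the function names, the task-dict fields (`type`, `spike_gated`, `topic`, `blocked_by`), and the slug normalization are assumptions for the sketch, not part of the package.]

```python
import glob
import os
import re

def experiment_status(topic: str, root: str = ".deepflow/experiments") -> str:
    """Return 'passed', 'failed', 'active', or 'none' for a topic.

    Mirrors the glob .deepflow/experiments/{topic}--*--{status}.md above;
    the lowercase/hyphen normalization is an assumed fuzzy-match rule.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    for status in ("passed", "failed", "active"):  # assumed precedence
        if glob.glob(os.path.join(root, f"{slug}--*--{status}.md")):
            return status
    return "none"

def task_ready(task: dict, done: set, root: str = ".deepflow/experiments") -> bool:
    """Ready = all blockers done + experiment validated (if spike-gated)."""
    if not all(b in done for b in task.get("blocked_by", [])):
        return False
    if task.get("type") == "spike":
        return True  # spikes are always executable
    if task.get("spike_gated"):
        return experiment_status(task["topic"], root) == "passed"
    return True
```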
+
+ ### 6. SPAWN AGENTS

  Context ≥50%: checkpoint and exit.

- Spawn all ready tasks in ONE message (parallel). Same-file conflicts: sequential.
+ **Use Task tool to spawn all ready tasks in ONE message (parallel):**
+ ```
+ Task tool parameters for each task:
+ - subagent_type: "general-purpose"
+ - model: "sonnet"
+ - run_in_background: true
+ - prompt: "{task details from PLAN.md}"
+ ```

- On failure: spawn `reasoner`.
+ Same-file conflicts: spawn sequentially instead.

- ### 6. PER-TASK (agent prompt)
+ **Spike Task Execution:**
+ When spawning a spike task, the agent MUST:
+ 1. Execute the minimal validation method
+ 2. Record result in experiment file (update status: `--passed.md` or `--failed.md`)
+ 3. If passed: implementation tasks become unblocked
+ 4. If failed: record conclusion with "next hypothesis" for future planning

+ **On failure, use Task tool to spawn reasoner:**
+ ```
+ Task tool parameters:
+ - subagent_type: "reasoner"
+ - model: "opus"
+ - prompt: "Debug failure: {error details}"
+ ```
+
+ ### 7. PER-TASK (agent prompt)
+
+ **Standard Task:**
  ```
  {task_id}: {description from PLAN.md}
  Files: {target files}
@@ -105,14 +182,38 @@ Implement, test, commit as feat({spec}): {description}.
  Write result to .deepflow/results/{task_id}.yaml
  ```

- ### 7. COMPLETE SPECS
+ **Spike Task:**
+ ```
+ {task_id} [SPIKE]: {hypothesis}
+ Type: spike
+ Method: {minimal steps to validate}
+ Success criteria: {how to know it passed}
+ Time-box: {duration}
+ Experiment file: {.deepflow/experiments/{topic}--{hypothesis}--active.md}
+ Spec: {spec_name}
+
+ Execute the minimal validation:
+ 1. Follow the method steps exactly
+ 2. Measure against success criteria
+ 3. Update experiment file with result:
+ - If passed: rename to --passed.md, record findings
+ - If failed: rename to --failed.md, record conclusion with "next hypothesis"
+ 4. Commit as spike({spec}): validate {hypothesis}
+ 5. Write result to .deepflow/results/{task_id}.yaml
+
+ Result status:
+ - success = hypothesis validated (passed)
+ - failed = hypothesis invalidated (failed experiment, NOT agent error)
+ ```
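[Editor's note: the spawn rule above — parallel waves, but same-file conflicts run sequentially — amounts to wave scheduling. A minimal sketch, assuming each task dict carries a `files` list; names are illustrative, not from the package.]

```python
def plan_waves(ready: list) -> list:
    """Group ready tasks into parallel waves; tasks touching the same
    file are pushed to later waves so they run sequentially."""
    waves = []
    for task in ready:
        files = set(task.get("files", []))
        # Place the task in the first wave with no file overlap.
        for wave in waves:
            if all(files.isdisjoint(set(t.get("files", []))) for t in wave):
                wave.append(task)
                break
        else:
            waves.append([task])
    return waves
```

Each inner list would then be dispatched as one parallel message; later lists wait for the previous wave to finish.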
+
+ ### 8. COMPLETE SPECS

  When all tasks done for a `doing-*` spec:
  1. Embed history in spec: `## Completed` section
  2. Rename: `doing-upload.md` → `done-upload.md`
  3. Remove section from PLAN.md

- ### 8. ITERATE
+ ### 9. ITERATE

  Repeat until: all done, all blocked, or checkpoint.

@@ -126,6 +227,8 @@ Repeat until: all done, all blocked, or checkpoint.

  ## Example

+ ### Standard Execution
+
  ```
  /df:execute (context: 12%)

@@ -140,7 +243,52 @@ Wave 2: T3 (context: 48%)
  ✓ Complete: 3/3 tasks
  ```

- With checkpoint:
+ ### Spike-First Execution
+
+ ```
+ /df:execute (context: 10%)
+
+ Checking experiment status...
+ T1 [SPIKE]: No experiment yet, spike executable
+ T2: Blocked by T1 (spike not validated)
+ T3: Blocked by T1 (spike not validated)
+
+ Wave 1: T1 [SPIKE] (context: 20%)
+ T1: success (abc1234) → upload--streaming--passed.md
+
+ Checking experiment status...
+ T2: Experiment passed, unblocked
+ T3: Experiment passed, unblocked
+
+ Wave 2: T2, T3 parallel (context: 45%)
+ T2: success (def5678)
+ T3: success (ghi9012)
+
+ ✓ doing-upload → done-upload
+ ✓ Complete: 3/3 tasks
+ ```
+
+ ### Spike Failed
+
+ ```
+ /df:execute (context: 10%)
+
+ Wave 1: T1 [SPIKE] (context: 20%)
+ T1: failed → upload--streaming--failed.md
+
+ Checking experiment status...
+ T2: ⚠ Blocked - Experiment failed
+ T3: ⚠ Blocked - Experiment failed
+
+ ⚠ Spike T1 invalidated hypothesis
+ Experiment: upload--streaming--failed.md
+ → Run /df:plan to generate new hypothesis spike
+
+ Complete: 1/3 tasks (2 blocked by failed experiment)
+ ```
+
+ ### With Checkpoint
+
  ```
  Wave 1 complete (context: 52%)
  Checkpoint saved. Run /df:execute --continue
@@ -42,21 +42,33 @@ Determine source_dir from config or default to src/

  If no new specs: report counts, suggest `/df:execute`.

- ### 2. CHECK PAST EXPERIMENTS
+ ### 2. CHECK PAST EXPERIMENTS (SPIKE-FIRST)

- Extract domains from spec (perf, auth, api, etc.), then:
+ **CRITICAL**: Check experiments BEFORE generating any tasks.
+
+ Extract topic from spec name (fuzzy match), then:

  ```
- Glob .deepflow/experiments/{domain}--*
+ Glob .deepflow/experiments/{topic}--*
  ```

+ **Experiment file naming:** `{topic}--{hypothesis}--{status}.md`
+ Statuses: `active`, `passed`, `failed`
+
  | Result | Action |
  |--------|--------|
- | `--failed.md` | Exclude approach, note why |
- | `--success.md` | Reference as pattern |
- | No matches | Continue (expected for new projects) |
+ | `--failed.md` exists | Extract "next hypothesis" from Conclusion section |
+ | `--passed.md` exists | Reference as validated pattern, can proceed to full implementation |
+ | `--active.md` exists | Wait for experiment completion before planning |
+ | No matches | New topic, needs initial spike |
+
+ **Spike-First Rule**:
+ - If `--failed.md` exists: Generate spike task to test the next hypothesis (from failed experiment's Conclusion)
+ - If no experiments exist: Generate spike task for the core hypothesis
+ - Full implementation tasks are BLOCKED until a spike validates the approach
+ - Only proceed to full task generation after `--passed.md` exists

- **Naming:** `{domain}--{approach}--{result}.md`
+ See: `templates/experiment-template.md` for experiment format
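[Editor's note: the spike-first rule in the hunk above reduces to a small decision function. A sketch, assuming the statuses found by the glob are collected into a set; the `passed > active > failed` precedence is this sketch's reading of the table, and the function name is hypothetical.]

```python
def planning_action(statuses: set) -> str:
    """Map experiment statuses found for a topic to a planning decision,
    per the Spike-First Rule: failed -> next-hypothesis spike,
    none -> core-hypothesis spike, passed -> full tasks."""
    if "passed" in statuses:
        return "generate full implementation tasks"
    if "active" in statuses:
        return "wait for experiment completion"
    if "failed" in statuses:
        return "generate spike for next hypothesis"
    return "generate spike for core hypothesis"
```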

  ### 3. DETECT PROJECT CONTEXT

@@ -69,7 +81,15 @@ Include patterns in task descriptions for agents to follow.

  ### 4. ANALYZE CODEBASE

- **Spawn Explore agents** (haiku, read-only) with dynamic count:
+ **Use Task tool to spawn Explore agents in parallel:**
+ ```
+ Task tool parameters:
+ - subagent_type: "Explore"
+ - model: "haiku"
+ - run_in_background: true (for parallel execution)
+ ```
+
+ Scale agent count based on codebase size:

  | File Count | Agents |
  |------------|--------|
@@ -78,6 +98,34 @@ Include patterns in task descriptions for agents to follow.
  | 100-500 | 25-40 |
  | 500+ | 50-100 (cap) |
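[Editor's note: the "+1 per 20 files" rule from the Agent Scaling table later in this diff can be sketched as below. The table ranges above and this linear formula do not coincide exactly; the formula is one plausible reading, and the function name and parameters are illustrative.]

```python
def explore_agent_count(file_count: int, base: int = 10,
                        per_files: int = 20, cap: int = 100) -> int:
    """Explore agents: base of 10, +1 per 20 files, capped at 100."""
    return min(cap, base + file_count // per_files)
```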

+ **Explore Agent Prompt Structure:**
+ ```
+ Find: [specific question]
+ Return ONLY:
+ - File paths matching criteria
+ - One-line description per file
+ - Integration points (if asked)
+
+ DO NOT:
+ - Read or summarize spec files
+ - Make recommendations
+ - Propose solutions
+ - Generate tables or lengthy explanations
+
+ Max response: 500 tokens (configurable via .deepflow/config.yaml explore.max_tokens)
+ ```
+
+ **Explore Agent Scope Restrictions:**
+ - MUST only report factual findings:
+ - Files found
+ - Patterns/conventions observed
+ - Integration points
+ - MUST NOT:
+ - Make recommendations
+ - Propose architectures
+ - Read and summarize specs (that's orchestrator's job)
+ - Draw conclusions about what should be built
+
  **Use `code-completeness` skill patterns** to search for:
  - Implementations matching spec requirements
  - TODO, FIXME, HACK comments
@@ -86,7 +134,14 @@ Include patterns in task descriptions for agents to follow.

  ### 5. COMPARE & PRIORITIZE

- **Spawn `reasoner` agent** (Opus) for analysis:
+ **Use Task tool to spawn reasoner agent:**
+ ```
+ Task tool parameters:
+ - subagent_type: "reasoner"
+ - model: "opus"
+ ```
+
+ Reasoner performs analysis:

  | Status | Action |
  |--------|--------|
@@ -102,7 +157,36 @@ Include patterns in task descriptions for agents to follow.
  2. Impact — core features before enhancements
  3. Risk — unknowns early

- ### 6. VALIDATE HYPOTHESES
+ ### 6. GENERATE SPIKE TASKS (IF NEEDED)
+
+ **When to generate spike tasks:**
+ 1. Failed experiment exists → Test the next hypothesis
+ 2. No experiments exist → Test the core hypothesis
+ 3. Passed experiment exists → Skip to full implementation
+
+ **Spike Task Format:**
+ ```markdown
+ - [ ] **T1** [SPIKE]: Validate {hypothesis}
+ - Type: spike
+ - Hypothesis: {what we're testing}
+ - Method: {minimal steps to validate}
+ - Success criteria: {how to know it passed}
+ - Time-box: 30 min
+ - Files: .deepflow/experiments/{topic}--{hypothesis}--{status}.md
+ - Blocked by: none
+ ```
+
+ **Blocking Logic:**
+ - All implementation tasks MUST have `Blocked by: T{spike}` until spike passes
+ - After spike completes:
+ - If passed: Update experiment to `--passed.md`, unblock implementation tasks
+ - If failed: Update experiment to `--failed.md`, DO NOT generate implementation tasks
+
+ **Full Implementation Only After Spike:**
+ - Only generate full task list when spike validates the approach
+ - Never generate 10-task waterfall without validated hypothesis
+
+ ### 7. VALIDATE HYPOTHESES

  Test risky assumptions before finalizing plan.
@@ -111,24 +195,27 @@ Test risky assumptions before finalizing plan.
  **Process:**
  1. Prototype in scratchpad (not committed)
  2. Test assumption
- 3. If fails → Write `.deepflow/experiments/{domain}--{approach}--failed.md`
+ 3. If fails → Write `.deepflow/experiments/{topic}--{hypothesis}--failed.md`
  4. Adjust approach, document in task

  **Skip:** Well-known patterns, simple CRUD, clear docs exist

- ### 7. OUTPUT PLAN.md
+ ### 8. OUTPUT PLAN.md

  Append tasks grouped by `### doing-{spec-name}`. Include spec gaps and validation findings.

- ### 8. RENAME SPECS
+ ### 9. RENAME SPECS

  `mv specs/feature.md specs/doing-feature.md`

- ### 9. REPORT
+ ### 10. REPORT

  `✓ Plan generated — {n} specs, {n} tasks. Run /df:execute`

  ## Rules
+ - **Spike-first** — Generate spike task before full implementation if no `--passed.md` experiment exists
+ - **Block on spike** — Full implementation tasks MUST be blocked by spike validation
+ - **Learn from failures** — Extract "next hypothesis" from failed experiments, never repeat same approach
  - **Learn from history** — Check past experiments before proposing approaches
  - **Plan only** — Do NOT implement anything (except quick validation prototypes)
  - **Validate before commit** — Test risky assumptions with minimal experiments
@@ -139,13 +226,64 @@ Append tasks grouped by `### doing-{spec-name}`. Include spec gaps and validatio

  ## Agent Scaling

- | Agent | Base | Scale |
- |-------|------|-------|
- | Explore (search) | 10 | +1 per 20 files |
- | Reasoner (analyze) | 5 | +1 per 2 specs |
+ | Agent | Model | Base | Scale |
+ |-------|-------|------|-------|
+ | Explore (search) | haiku | 10 | +1 per 20 files |
+ | Reasoner (analyze) | opus | 5 | +1 per 2 specs |
+
+ **IMPORTANT**: Always use the `Task` tool with explicit `subagent_type` and `model` parameters. Do NOT use Glob/Grep/Read directly for codebase analysis - spawn agents instead.

  ## Example

+ ### Spike-First (No Prior Experiments)
+
+ ```markdown
+ # Plan
+
+ ### doing-upload
+
+ - [ ] **T1** [SPIKE]: Validate streaming upload approach
+ - Type: spike
+ - Hypothesis: Streaming uploads will handle files >1GB without memory issues
+ - Method: Create minimal endpoint, upload 2GB file, measure memory
+ - Success criteria: Memory stays under 500MB during upload
+ - Time-box: 30 min
+ - Files: .deepflow/experiments/upload--streaming--active.md
+ - Blocked by: none
+
+ - [ ] **T2**: Create upload endpoint
+ - Files: src/api/upload.ts
+ - Blocked by: T1 (spike must pass)
+
+ - [ ] **T3**: Add S3 service with streaming
+ - Files: src/services/storage.ts
+ - Blocked by: T1 (spike must pass), T2
+ ```
+
+ ### Spike-First (After Failed Experiment)
+
+ ```markdown
+ # Plan
+
+ ### doing-upload
+
+ - [ ] **T1** [SPIKE]: Validate chunked upload with backpressure
+ - Type: spike
+ - Hypothesis: Adding backpressure control will prevent buffer overflow
+ - Method: Implement pause/resume on buffer threshold, test with 2GB file
+ - Success criteria: No memory spikes above 500MB
+ - Time-box: 30 min
+ - Files: .deepflow/experiments/upload--chunked-backpressure--active.md
+ - Blocked by: none
+ - Note: Previous approach failed (see upload--buffer-upload--failed.md)
+
+ - [ ] **T2**: Implement chunked upload endpoint
+ - Files: src/api/upload.ts
+ - Blocked by: T1 (spike must pass)
+ ```
+
+ ### After Spike Validates (Full Implementation)
+
  ```markdown
  # Plan

@@ -154,10 +292,10 @@ Append tasks grouped by `### doing-{spec-name}`. Include spec gaps and validatio
  - [ ] **T1**: Create upload endpoint
  - Files: src/api/upload.ts
  - Blocked by: none
+ - Note: Use streaming (validated in upload--streaming--passed.md)

  - [ ] **T2**: Add S3 service with streaming
  - Files: src/services/storage.ts
  - Blocked by: T1
- - Note: Use streaming (see experiments/perf--chunked-upload--success.md)
- - Avoid: Direct buffer upload failed for large files (experiments/perf--buffer-upload--failed.md)
+ - Avoid: Direct buffer upload failed (see upload--buffer-upload--failed.md)
  ```
@@ -1,5 +1,15 @@
  # /df:spec — Generate Spec from Conversation

+ ## Orchestrator Role
+
+ You coordinate agents and ask questions. You never search code directly.
+
+ **NEVER:** Read source files, use Glob/Grep directly, run git
+
+ **ONLY:** Spawn agents, poll results, ask user questions, write spec file
+
+ ---
+
  ## Purpose
  Transform conversation context into a structured specification file.

@@ -8,13 +18,69 @@ Transform conversation context into a structured specification file.
  /df:spec <name>
  ```

- ## Skills
- Uses: `gap-discovery` — Proactive requirement gap identification
+ ## Skills & Agents
+ - Skill: `gap-discovery` — Proactive requirement gap identification
+
+ **Use Task tool to spawn agents:**
+ | Agent | subagent_type | model | Purpose |
+ |-------|---------------|-------|---------|
+ | Context | `Explore` | `haiku` | Codebase context gathering |
+ | Synthesizer | `reasoner` | `opus` | Synthesize findings into requirements |

  ## Behavior

- ### 1. GAP CHECK
- Before generating, use the `gap-discovery` skill to analyze conversation.
+ ### 1. GATHER CODEBASE CONTEXT
+
+ **Use Task tool to spawn Explore agents in parallel:**
+ ```
+ Task tool parameters:
+ - subagent_type: "Explore"
+ - model: "haiku"
+ - run_in_background: true
+ ```
+
+ Find:
+ - Related existing implementations
+ - Code patterns and conventions
+ - Integration points relevant to the feature
+ - Existing TODOs or placeholders in related areas
+
+ | Codebase Size | Agents |
+ |---------------|--------|
+ | <20 files | 2-3 |
+ | 20-100 | 5-8 |
+ | 100+ | 10-15 |
+
+ **Explore Agent Prompt Structure:**
+ ```
+ Find: [specific question]
+ Return ONLY:
+ - File paths matching criteria
+ - One-line description per file
+ - Integration points (if asked)
+
+ DO NOT:
+ - Read or summarize spec files
+ - Make recommendations
+ - Propose solutions
+ - Generate tables or lengthy explanations
+
+ Max response: 500 tokens (configurable via .deepflow/config.yaml explore.max_tokens)
+ ```
+
+ **Explore Agent Scope Restrictions:**
+ - MUST only report factual findings:
+ - Files found
+ - Patterns/conventions observed
+ - Integration points
+ - MUST NOT:
+ - Make recommendations
+ - Propose architectures
+ - Read and summarize specs (that's orchestrator's job)
+ - Draw conclusions about what should be built
+
+ ### 2. GAP CHECK
+ Use the `gap-discovery` skill to analyze conversation + agent findings.

  **Required clarity:**
  - [ ] Core objective clear
@@ -42,9 +108,24 @@ Before generating, use the `gap-discovery` skill to analyze conversation.

  Max 4 questions per tool call. Wait for answers before proceeding.

- ### 2. GENERATE SPEC
+ ### 3. SYNTHESIZE FINDINGS
+
+ **Use Task tool to spawn reasoner agent:**
+ ```
+ Task tool parameters:
+ - subagent_type: "reasoner"
+ - model: "opus"
+ ```

- Once gaps covered, create `specs/{name}.md`:
+ The reasoner will:
+ - Analyze codebase context from Explore agents
+ - Identify constraints from existing architecture
+ - Suggest requirements based on patterns found
+ - Flag potential conflicts with existing code
+
+ ### 4. GENERATE SPEC
+
+ Once gaps covered and context gathered, create `specs/{name}.md`:

  ```markdown
  # {Name}
@@ -70,10 +151,10 @@ Once gaps covered, create `specs/{name}.md`:
  - [ ] [Testable criterion 3]

  ## Technical Notes
- [Any implementation hints, preferred approaches, or context]
+ [Implementation hints from codebase analysis — patterns, integration points, constraints discovered by agents]
  ```

- ### 3. CONFIRM
+ ### 5. CONFIRM

  After writing:
  ```
@@ -86,17 +167,35 @@ Next: Run /df:plan to generate tasks
  ```

  ## Rules
+ - **Orchestrator never searches** — Spawn agents for all codebase exploration
  - Do NOT generate spec if critical gaps remain
- - Ask maximum 5 questions per round (not overwhelming)
+ - Ask maximum 4 questions per tool call (not overwhelming)
  - Requirements must be testable
  - Acceptance criteria must be verifiable
+ - Include agent-discovered context in Technical Notes
  - Keep specs concise (<100 lines)

+ ## Agent Scaling
+
+ | Agent | subagent_type | model | Base | Purpose |
+ |-------|---------------|-------|------|---------|
+ | Explore | `Explore` | `haiku` | 3-5 | Find related code, patterns |
+ | Reasoner | `reasoner` | `opus` | 1 | Synthesize into requirements |
+
+ **IMPORTANT**: Always use the `Task` tool with explicit `subagent_type` and `model` parameters.
+
  ## Example

  ```
  USER: I want to add image upload

+ CLAUDE: [Spawns 3 Explore agents in parallel]
+ - "Find existing file handling patterns"
+ - "Find API endpoint conventions"
+ - "Find storage service implementations"
+
+ [Agents return: Express multer middleware, REST conventions, no cloud storage yet]
+
  CLAUDE: [Uses AskUserQuestion tool]
  {
  "questions": [
@@ -106,18 +205,7 @@ CLAUDE: [Uses AskUserQuestion tool]
  "multiSelect": true,
  "options": [
  {"label": "JPG/PNG only", "description": "Standard formats"},
- {"label": "Include WebP", "description": "Modern compression"},
- {"label": "Include GIF", "description": "Animated images"}
- ]
- },
- {
- "question": "What's the maximum upload size?",
- "header": "Size limit",
- "multiSelect": false,
- "options": [
- {"label": "5 MB", "description": "Fast uploads"},
- {"label": "10 MB (Recommended)", "description": "Balanced"},
- {"label": "25 MB", "description": "High-res photos"}
+ {"label": "Include WebP", "description": "Modern compression"}
  ]
  },
  {
@@ -126,29 +214,22 @@ CLAUDE: [Uses AskUserQuestion tool]
  "multiSelect": false,
  "options": [
  {"label": "S3 (Recommended)", "description": "Scalable cloud storage"},
- {"label": "Local filesystem", "description": "Simple, single server"},
- {"label": "Cloudflare R2", "description": "S3-compatible, no egress fees"}
- ]
- },
- {
- "question": "Who can upload files?",
- "header": "Auth",
- "multiSelect": false,
- "options": [
- {"label": "Logged-in users", "description": "Require authentication"},
- {"label": "Public", "description": "Anyone can upload"},
- {"label": "Role-based", "description": "Specific roles only"}
+ {"label": "Local filesystem", "description": "Simple, matches current setup"}
  ]
  }
  ]
  }

- USER: [Selects: JPG/PNG + WebP, 10MB, S3, Logged-in users]
+ USER: [Selects: JPG/PNG + WebP, S3]
+
+ CLAUDE: [Spawns reasoner agent]
+ - Synthesize: multer + S3 + existing API patterns

  CLAUDE: ✓ Created specs/image-upload.md

  Requirements: 4
  Acceptance criteria: 5
+ Technical notes: Express/multer pattern, REST conventions from existing API

  Next: Run /df:plan to generate tasks
  ```
@@ -12,7 +12,11 @@ Check that implemented code satisfies spec requirements and acceptance criteria.

  ## Skills & Agents
  - Skill: `code-completeness` — Find incomplete implementations
- - Agent: `Explore` (Haiku) — Fast codebase scanning
+
+ **Use Task tool to spawn agents:**
+ | Agent | subagent_type | model | Purpose |
+ |-------|---------------|-------|---------|
+ | Scanner | `Explore` | `haiku` | Fast codebase scanning |

  ## Spec File States

@@ -87,7 +91,15 @@ Default: L1-L3 (L4 optional, can be slow)

  ## Agent Usage

- Spawn `Explore` agents (Haiku), 1-2 per spec, cap 10.
+ **Use Task tool to spawn Explore agents:**
+ ```
+ Task tool parameters:
+ - subagent_type: "Explore"
+ - model: "haiku"
+ - run_in_background: true (for parallel)
+ ```
+
+ Scale: 1-2 agents per spec, cap 10.

  ## Example

@@ -36,6 +36,9 @@ models:
  reason: opus # Complex decisions
  debug: opus # Problem solving

+ explore:
+ max_tokens: 500 # Controls Explore agent response length
+
  commits:
  format: "feat({spec}): {description}"
  atomic: true # One task = one commit
@@ -0,0 +1,74 @@
+ # Experiment: {hypothesis-slug}
+
+ > **Filename convention**: `{topic}--{hypothesis-slug}--{status}.md`
+ > Status: `active` | `passed` | `failed`
+
+ ## Topic
+
+ {Spec name or feature area this experiment relates to}
+
+ <!--
+ What problem or feature does this experiment address?
+ Link to relevant spec if applicable.
+ -->
+
+ ## Hypothesis
+
+ {What we believe will work and why}
+
+ <!--
+ Be specific and testable:
+ - "Using approach X will achieve Y because Z"
+ - "The bottleneck is in component A, not B"
+ - Should be falsifiable in a single experiment
+ -->
+
+ ## Method
+
+ {Minimal steps to validate the hypothesis}
+
+ <!--
+ Keep it minimal - fastest path to prove/disprove:
+ 1. Step one (e.g., "Create test file with X")
+ 2. Step two (e.g., "Run command Y")
+ 3. Step three (e.g., "Observe output Z")
+
+ Time-box: ideally under 30 minutes
+ -->
+
+ ## Result
+
+ **Status**: {pass | fail}
+
+ {Actual outcome with evidence}
+
+ <!--
+ Include concrete evidence:
+ - Error messages, output logs
+ - Metrics or measurements
+ - Screenshots if applicable
+ - What specifically happened vs. expected
+ -->
+
+ ## Conclusion
+
+ {What we learned from this experiment}
+
+ <!--
+ Answer these:
+ - Why did it pass/fail?
+ - What assumption was validated/invalidated?
+ - If failed: What's the next hypothesis? (don't repeat same approach)
+ - If passed: What's ready for implementation?
+ -->
+
+ ---
+
+ <!--
+ Experiment Guidelines:
+ - One hypothesis per experiment
+ - Failed experiments are valuable - they inform the next hypothesis
+ - Never repeat a failed approach without a new insight
+ - Keep experiments small and fast (under 30 min)
+ - Link related experiments in conclusions
+ -->
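[Editor's note: the status transition the template describes — renaming `--active.md` to `--passed.md` or `--failed.md` — can be sketched as below. `conclude_experiment` is a hypothetical helper name, not part of the package.]

```python
import os
import re

def conclude_experiment(path: str, passed: bool) -> str:
    """Rename an --active.md experiment file to --passed.md or --failed.md,
    following the {topic}--{hypothesis-slug}--{status}.md convention."""
    new_status = "passed" if passed else "failed"
    new_path = re.sub(r"--active\.md$", f"--{new_status}.md", path)
    if new_path == path:
        raise ValueError(f"not an active experiment: {path}")
    os.rename(path, new_path)
    return new_path
```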
@@ -29,6 +29,22 @@ Generated: {timestamp}
  - Files: {files}
  - Blocked by: T1

+ ### Spike Task Example
+
+ When no experiments exist to validate an approach, start with a minimal validation spike:
+
+ - [ ] **T1** (spike): Validate [hypothesis] approach
+ - Files: [minimal files needed]
+ - Blocked by: none
+ - Blocks: T2, T3, T4 (full implementation)
+ - Description: Minimal test to verify [approach] works before full implementation
+
+ - [ ] **T2**: Implement [feature] based on spike results
+ - Files: [implementation files]
+ - Blocked by: T1 (spike)
+
+ Spike tasks are 1-2 tasks to validate an approach before committing to full implementation.
+
  ---

  <!--
@@ -38,4 +54,6 @@ Plan Guidelines:
  - Blocked by references task IDs (T1, T2, etc.)
  - Mark complete with [x] and commit hash
  - Example completed: [x] **T1**: Create API ✓ (abc1234)
+ - Spike tasks: If no experiments validate the approach, first task should be a minimal validation spike
+ - Spike tasks block full implementation tasks until the hypothesis is validated
  -->