deepflow 0.1.23 → 0.1.26
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/package.json +1 -1
- package/src/commands/df/execute.md +159 -11
- package/src/commands/df/plan.md +158 -20
- package/src/commands/df/spec.md +115 -34
- package/src/commands/df/verify.md +14 -2
- package/templates/config-template.yaml +3 -0
- package/templates/experiment-template.md +74 -0
- package/templates/plan-template.md +18 -0
package/package.json
CHANGED

package/src/commands/df/execute.md
CHANGED

@@ -24,8 +24,12 @@ Implement tasks from PLAN.md with parallel agents, atomic commits, and context-e
 
 ## Skills & Agents
 - Skill: `atomic-commits` — Clean commit protocol
-
-
+
+**Use Task tool to spawn agents:**
+| Agent | subagent_type | model | Purpose |
+|-------|---------------|-------|---------|
+| Implementation | `general-purpose` | `sonnet` | Task implementation |
+| Debugger | `reasoner` | `opus` | Debugging failures |
 
 ## Context-Aware Execution
 
@@ -82,20 +86,93 @@ If missing: "No PLAN.md found. Run /df:plan first."
 
 Warn if `specs/*.md` (excluding doing-/done-) exist. Non-blocking.
 
-### 4.
+### 4. CHECK EXPERIMENT STATUS (HYPOTHESIS VALIDATION)
+
+**Before identifying ready tasks**, check experiment validation for full implementation tasks.
+
+**Task Types:**
+- **Spike tasks**: Have `[SPIKE]` in title OR `Type: spike` in description — always executable
+- **Full implementation tasks**: Blocked by spike tasks — require validated experiment
+
+**Validation Flow:**
+
+```
+For each task in plan:
+  If task is spike task:
+    → Mark as executable (spikes are always allowed)
+  Else if task is blocked by a spike task (T{n}):
+    → Find related experiment file in .deepflow/experiments/
+    → Check experiment status:
+      - --passed.md exists → Unblock, proceed with implementation
+      - --failed.md exists → Keep blocked, warn user
+      - --active.md exists → Keep blocked, spike in progress
+      - No experiment → Keep blocked, spike not started
+```
+
+**Experiment File Discovery:**
+
+```
+Glob: .deepflow/experiments/{topic}--*--{status}.md
 
-
+Topic extraction:
+1. From spike task: experiment file path in task description
+2. From spec name: doing-{topic} → {topic}
+3. Fuzzy match: normalize and match
+```
 
-
+**Status Handling:**
+
+| Experiment Status | Task Status | Action |
+|-------------------|-------------|--------|
+| `--passed.md` | Ready | Execute full implementation |
+| `--failed.md` | Blocked | Skip, warn: "Experiment failed, re-plan needed" |
+| `--active.md` | Blocked | Skip, info: "Waiting for spike completion" |
+| Not found | Blocked | Skip, info: "Spike task not executed yet" |
+
+**Warning Output:**
+
+```
+⚠ T3 blocked: Experiment 'upload--streaming--failed.md' did not validate
+→ Run /df:plan to generate new hypothesis spike
+```
+
+### 5. IDENTIFY READY TASKS
+
+Ready = `[ ]` + all `blocked_by` complete + experiment validated (if applicable) + not in checkpoint.
+
+### 6. SPAWN AGENTS
 
 Context ≥50%: checkpoint and exit.
 
-
+**Use Task tool to spawn all ready tasks in ONE message (parallel):**
+```
+Task tool parameters for each task:
+- subagent_type: "general-purpose"
+- model: "sonnet"
+- run_in_background: true
+- prompt: "{task details from PLAN.md}"
+```
 
-
+Same-file conflicts: spawn sequentially instead.
 
-
+**Spike Task Execution:**
+When spawning a spike task, the agent MUST:
+1. Execute the minimal validation method
+2. Record result in experiment file (update status: `--passed.md` or `--failed.md`)
+3. If passed: implementation tasks become unblocked
+4. If failed: record conclusion with "next hypothesis" for future planning
 
+**On failure, use Task tool to spawn reasoner:**
+```
+Task tool parameters:
+- subagent_type: "reasoner"
+- model: "opus"
+- prompt: "Debug failure: {error details}"
+```
+
+### 7. PER-TASK (agent prompt)
+
+**Standard Task:**
 ```
 {task_id}: {description from PLAN.md}
 Files: {target files}
@@ -105,14 +182,38 @@ Implement, test, commit as feat({spec}): {description}.
 Write result to .deepflow/results/{task_id}.yaml
 ```
 
-
+**Spike Task:**
+```
+{task_id} [SPIKE]: {hypothesis}
+Type: spike
+Method: {minimal steps to validate}
+Success criteria: {how to know it passed}
+Time-box: {duration}
+Experiment file: {.deepflow/experiments/{topic}--{hypothesis}--active.md}
+Spec: {spec_name}
+
+Execute the minimal validation:
+1. Follow the method steps exactly
+2. Measure against success criteria
+3. Update experiment file with result:
+   - If passed: rename to --passed.md, record findings
+   - If failed: rename to --failed.md, record conclusion with "next hypothesis"
+4. Commit as spike({spec}): validate {hypothesis}
+5. Write result to .deepflow/results/{task_id}.yaml
+
+Result status:
+- success = hypothesis validated (passed)
+- failed = hypothesis invalidated (failed experiment, NOT agent error)
+```
+
+### 8. COMPLETE SPECS
 
 When all tasks done for a `doing-*` spec:
 1. Embed history in spec: `## Completed` section
 2. Rename: `doing-upload.md` → `done-upload.md`
 3. Remove section from PLAN.md
 
-###
+### 9. ITERATE
 
 Repeat until: all done, all blocked, or checkpoint.
 
@@ -126,6 +227,8 @@ Repeat until: all done, all blocked, or checkpoint.
 
 ## Example
 
+### Standard Execution
+
 ```
 /df:execute (context: 12%)
 
@@ -140,7 +243,52 @@ Wave 2: T3 (context: 48%)
 ✓ Complete: 3/3 tasks
 ```
 
-
+### Spike-First Execution
+
+```
+/df:execute (context: 10%)
+
+Checking experiment status...
+T1 [SPIKE]: No experiment yet, spike executable
+T2: Blocked by T1 (spike not validated)
+T3: Blocked by T1 (spike not validated)
+
+Wave 1: T1 [SPIKE] (context: 20%)
+T1: success (abc1234) → upload--streaming--passed.md
+
+Checking experiment status...
+T2: Experiment passed, unblocked
+T3: Experiment passed, unblocked
+
+Wave 2: T2, T3 parallel (context: 45%)
+T2: success (def5678)
+T3: success (ghi9012)
+
+✓ doing-upload → done-upload
+✓ Complete: 3/3 tasks
+```
+
+### Spike Failed
+
+```
+/df:execute (context: 10%)
+
+Wave 1: T1 [SPIKE] (context: 20%)
+T1: failed → upload--streaming--failed.md
+
+Checking experiment status...
+T2: ⚠ Blocked - Experiment failed
+T3: ⚠ Blocked - Experiment failed
+
+⚠ Spike T1 invalidated hypothesis
+Experiment: upload--streaming--failed.md
+→ Run /df:plan to generate new hypothesis spike
+
+Complete: 1/3 tasks (2 blocked by failed experiment)
+```
+
+### With Checkpoint
+
 ```
 Wave 1 complete (context: 52%)
 Checkpoint saved. Run /df:execute --continue
package/src/commands/df/plan.md
CHANGED

@@ -42,21 +42,33 @@ Determine source_dir from config or default to src/
 
 If no new specs: report counts, suggest `/df:execute`.
 
-### 2. CHECK PAST EXPERIMENTS
+### 2. CHECK PAST EXPERIMENTS (SPIKE-FIRST)
 
-
+**CRITICAL**: Check experiments BEFORE generating any tasks.
+
+Extract topic from spec name (fuzzy match), then:
 
 ```
-Glob .deepflow/experiments/{
+Glob .deepflow/experiments/{topic}--*
 ```
 
+**Experiment file naming:** `{topic}--{hypothesis}--{status}.md`
+Statuses: `active`, `passed`, `failed`
+
 | Result | Action |
 |--------|--------|
-| `--failed.md` |
-| `--
-
+| `--failed.md` exists | Extract "next hypothesis" from Conclusion section |
+| `--passed.md` exists | Reference as validated pattern, can proceed to full implementation |
+| `--active.md` exists | Wait for experiment completion before planning |
+| No matches | New topic, needs initial spike |
+
+**Spike-First Rule**:
+- If `--failed.md` exists: Generate spike task to test the next hypothesis (from failed experiment's Conclusion)
+- If no experiments exist: Generate spike task for the core hypothesis
+- Full implementation tasks are BLOCKED until a spike validates the approach
+- Only proceed to full task generation after `--passed.md` exists
 
-
+See: `templates/experiment-template.md` for experiment format
 
 ### 3. DETECT PROJECT CONTEXT
 
@@ -69,7 +81,15 @@ Include patterns in task descriptions for agents to follow.
 
 ### 4. ANALYZE CODEBASE
 
-**
+**Use Task tool to spawn Explore agents in parallel:**
+```
+Task tool parameters:
+- subagent_type: "Explore"
+- model: "haiku"
+- run_in_background: true (for parallel execution)
+```
+
+Scale agent count based on codebase size:
 
 | File Count | Agents |
 |------------|--------|
@@ -78,6 +98,34 @@ Include patterns in task descriptions for agents to follow.
 | 100-500 | 25-40 |
 | 500+ | 50-100 (cap) |
 
+**Explore Agent Prompt Structure:**
+```
+Find: [specific question]
+Return ONLY:
+- File paths matching criteria
+- One-line description per file
+- Integration points (if asked)
+
+DO NOT:
+- Read or summarize spec files
+- Make recommendations
+- Propose solutions
+- Generate tables or lengthy explanations
+
+Max response: 500 tokens (configurable via .deepflow/config.yaml explore.max_tokens)
+```
+
+**Explore Agent Scope Restrictions:**
+- MUST only report factual findings:
+  - Files found
+  - Patterns/conventions observed
+  - Integration points
+- MUST NOT:
+  - Make recommendations
+  - Propose architectures
+  - Read and summarize specs (that's orchestrator's job)
+  - Draw conclusions about what should be built
+
 **Use `code-completeness` skill patterns** to search for:
 - Implementations matching spec requirements
 - TODO, FIXME, HACK comments
@@ -86,7 +134,14 @@ Include patterns in task descriptions for agents to follow.
 
 ### 5. COMPARE & PRIORITIZE
 
-**
+**Use Task tool to spawn reasoner agent:**
+```
+Task tool parameters:
+- subagent_type: "reasoner"
+- model: "opus"
+```
+
+Reasoner performs analysis:
 
 | Status | Action |
 |--------|--------|
@@ -102,7 +157,36 @@ Include patterns in task descriptions for agents to follow.
 2. Impact — core features before enhancements
 3. Risk — unknowns early
 
-### 6.
+### 6. GENERATE SPIKE TASKS (IF NEEDED)
+
+**When to generate spike tasks:**
+1. Failed experiment exists → Test the next hypothesis
+2. No experiments exist → Test the core hypothesis
+3. Passed experiment exists → Skip to full implementation
+
+**Spike Task Format:**
+```markdown
+- [ ] **T1** [SPIKE]: Validate {hypothesis}
+  - Type: spike
+  - Hypothesis: {what we're testing}
+  - Method: {minimal steps to validate}
+  - Success criteria: {how to know it passed}
+  - Time-box: 30 min
+  - Files: .deepflow/experiments/{topic}--{hypothesis}--{status}.md
+  - Blocked by: none
+```
+
+**Blocking Logic:**
+- All implementation tasks MUST have `Blocked by: T{spike}` until spike passes
+- After spike completes:
+  - If passed: Update experiment to `--passed.md`, unblock implementation tasks
+  - If failed: Update experiment to `--failed.md`, DO NOT generate implementation tasks
+
+**Full Implementation Only After Spike:**
+- Only generate full task list when spike validates the approach
+- Never generate 10-task waterfall without validated hypothesis
+
+### 7. VALIDATE HYPOTHESES
 
 Test risky assumptions before finalizing plan.
 
@@ -111,24 +195,27 @@ Test risky assumptions before finalizing plan.
 **Process:**
 1. Prototype in scratchpad (not committed)
 2. Test assumption
-3. If fails → Write `.deepflow/experiments/{
+3. If fails → Write `.deepflow/experiments/{topic}--{hypothesis}--failed.md`
 4. Adjust approach, document in task
 
 **Skip:** Well-known patterns, simple CRUD, clear docs exist
 
-###
+### 8. OUTPUT PLAN.md
 
 Append tasks grouped by `### doing-{spec-name}`. Include spec gaps and validation findings.
 
-###
+### 9. RENAME SPECS
 
 `mv specs/feature.md specs/doing-feature.md`
 
-###
+### 10. REPORT
 
 `✓ Plan generated — {n} specs, {n} tasks. Run /df:execute`
 
 ## Rules
+- **Spike-first** — Generate spike task before full implementation if no `--passed.md` experiment exists
+- **Block on spike** — Full implementation tasks MUST be blocked by spike validation
+- **Learn from failures** — Extract "next hypothesis" from failed experiments, never repeat same approach
 - **Learn from history** — Check past experiments before proposing approaches
 - **Plan only** — Do NOT implement anything (except quick validation prototypes)
 - **Validate before commit** — Test risky assumptions with minimal experiments
@@ -139,13 +226,64 @@ Append tasks grouped by `### doing-{spec-name}`. Include spec gaps and validatio
 
 ## Agent Scaling
 
-| Agent | Base | Scale |
-
-| Explore (search) | 10 | +1 per 20 files |
-| Reasoner (analyze) | 5 | +1 per 2 specs |
+| Agent | Model | Base | Scale |
+|-------|-------|------|-------|
+| Explore (search) | haiku | 10 | +1 per 20 files |
+| Reasoner (analyze) | opus | 5 | +1 per 2 specs |
+
+**IMPORTANT**: Always use the `Task` tool with explicit `subagent_type` and `model` parameters. Do NOT use Glob/Grep/Read directly for codebase analysis - spawn agents instead.
 
 ## Example
 
+### Spike-First (No Prior Experiments)
+
+```markdown
+# Plan
+
+### doing-upload
+
+- [ ] **T1** [SPIKE]: Validate streaming upload approach
+  - Type: spike
+  - Hypothesis: Streaming uploads will handle files >1GB without memory issues
+  - Method: Create minimal endpoint, upload 2GB file, measure memory
+  - Success criteria: Memory stays under 500MB during upload
+  - Time-box: 30 min
+  - Files: .deepflow/experiments/upload--streaming--active.md
+  - Blocked by: none
+
+- [ ] **T2**: Create upload endpoint
+  - Files: src/api/upload.ts
+  - Blocked by: T1 (spike must pass)
+
+- [ ] **T3**: Add S3 service with streaming
+  - Files: src/services/storage.ts
+  - Blocked by: T1 (spike must pass), T2
+```
+
+### Spike-First (After Failed Experiment)
+
+```markdown
+# Plan
+
+### doing-upload
+
+- [ ] **T1** [SPIKE]: Validate chunked upload with backpressure
+  - Type: spike
+  - Hypothesis: Adding backpressure control will prevent buffer overflow
+  - Method: Implement pause/resume on buffer threshold, test with 2GB file
+  - Success criteria: No memory spikes above 500MB
+  - Time-box: 30 min
+  - Files: .deepflow/experiments/upload--chunked-backpressure--active.md
+  - Blocked by: none
+  - Note: Previous approach failed (see upload--buffer-upload--failed.md)
+
+- [ ] **T2**: Implement chunked upload endpoint
+  - Files: src/api/upload.ts
+  - Blocked by: T1 (spike must pass)
+```
+
+### After Spike Validates (Full Implementation)
+
 ```markdown
 # Plan
 
@@ -154,10 +292,10 @@ Append tasks grouped by `### doing-{spec-name}`. Include spec gaps and validatio
 - [ ] **T1**: Create upload endpoint
   - Files: src/api/upload.ts
   - Blocked by: none
+  - Note: Use streaming (validated in upload--streaming--passed.md)
 
 - [ ] **T2**: Add S3 service with streaming
   - Files: src/services/storage.ts
   - Blocked by: T1
-
-  - Avoid: Direct buffer upload failed for large files (experiments/perf--buffer-upload--failed.md)
+  - Avoid: Direct buffer upload failed (see upload--buffer-upload--failed.md)
 ```
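The spike-first decision table in plan.md reduces to a small dispatch. The sketch below is illustrative only; the function name and return labels are invented here, not part of the package:

```python
def spike_first_action(statuses: set) -> str:
    """Map a topic's experiment statuses to the plan.md spike-first rule."""
    if "passed" in statuses:
        return "generate-implementation-tasks"  # validated pattern exists
    if "active" in statuses:
        return "wait-for-spike"                 # experiment still in progress
    if "failed" in statuses:
        return "spike-next-hypothesis"          # take it from the Conclusion section
    return "spike-core-hypothesis"              # new topic, no experiments yet
```

The active-before-failed ordering is an assumption: when both exist for a topic, the doc's "wait for experiment completion" row suggests planning should pause rather than immediately spike again.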
package/src/commands/df/spec.md
CHANGED

@@ -1,5 +1,15 @@
 # /df:spec — Generate Spec from Conversation
 
+## Orchestrator Role
+
+You coordinate agents and ask questions. You never search code directly.
+
+**NEVER:** Read source files, use Glob/Grep directly, run git
+
+**ONLY:** Spawn agents, poll results, ask user questions, write spec file
+
+---
+
 ## Purpose
 Transform conversation context into a structured specification file.
 
@@ -8,13 +18,69 @@ Transform conversation context into a structured specification file.
 /df:spec <name>
 ```
 
-## Skills
-
+## Skills & Agents
+- Skill: `gap-discovery` — Proactive requirement gap identification
+
+**Use Task tool to spawn agents:**
+| Agent | subagent_type | model | Purpose |
+|-------|---------------|-------|---------|
+| Context | `Explore` | `haiku` | Codebase context gathering |
+| Synthesizer | `reasoner` | `opus` | Synthesize findings into requirements |
 
 ## Behavior
 
-### 1.
-
+### 1. GATHER CODEBASE CONTEXT
+
+**Use Task tool to spawn Explore agents in parallel:**
+```
+Task tool parameters:
+- subagent_type: "Explore"
+- model: "haiku"
+- run_in_background: true
+```
+
+Find:
+- Related existing implementations
+- Code patterns and conventions
+- Integration points relevant to the feature
+- Existing TODOs or placeholders in related areas
+
+| Codebase Size | Agents |
+|---------------|--------|
+| <20 files | 2-3 |
+| 20-100 | 5-8 |
+| 100+ | 10-15 |
+
+**Explore Agent Prompt Structure:**
+```
+Find: [specific question]
+Return ONLY:
+- File paths matching criteria
+- One-line description per file
+- Integration points (if asked)
+
+DO NOT:
+- Read or summarize spec files
+- Make recommendations
+- Propose solutions
+- Generate tables or lengthy explanations
+
+Max response: 500 tokens (configurable via .deepflow/config.yaml explore.max_tokens)
+```
+
+**Explore Agent Scope Restrictions:**
+- MUST only report factual findings:
+  - Files found
+  - Patterns/conventions observed
+  - Integration points
+- MUST NOT:
+  - Make recommendations
+  - Propose architectures
+  - Read and summarize specs (that's orchestrator's job)
+  - Draw conclusions about what should be built
+
+### 2. GAP CHECK
+Use the `gap-discovery` skill to analyze conversation + agent findings.
 
 **Required clarity:**
 - [ ] Core objective clear
@@ -42,9 +108,24 @@ Before generating, use the `gap-discovery` skill to analyze conversation.
 
 Max 4 questions per tool call. Wait for answers before proceeding.
 
-###
+### 3. SYNTHESIZE FINDINGS
+
+**Use Task tool to spawn reasoner agent:**
+```
+Task tool parameters:
+- subagent_type: "reasoner"
+- model: "opus"
+```
 
-
+The reasoner will:
+- Analyze codebase context from Explore agents
+- Identify constraints from existing architecture
+- Suggest requirements based on patterns found
+- Flag potential conflicts with existing code
+
+### 4. GENERATE SPEC
+
+Once gaps covered and context gathered, create `specs/{name}.md`:
 
 ```markdown
 # {Name}
@@ -70,10 +151,10 @@ Once gaps covered, create `specs/{name}.md`:
 - [ ] [Testable criterion 3]
 
 ## Technical Notes
-[
+[Implementation hints from codebase analysis — patterns, integration points, constraints discovered by agents]
 ```
 
-###
+### 5. CONFIRM
 
 After writing:
 ```
@@ -86,17 +167,35 @@ Next: Run /df:plan to generate tasks
 ```
 
 ## Rules
+- **Orchestrator never searches** — Spawn agents for all codebase exploration
 - Do NOT generate spec if critical gaps remain
-- Ask maximum
+- Ask maximum 4 questions per tool call (not overwhelming)
 - Requirements must be testable
 - Acceptance criteria must be verifiable
+- Include agent-discovered context in Technical Notes
 - Keep specs concise (<100 lines)
 
+## Agent Scaling
+
+| Agent | subagent_type | model | Base | Purpose |
+|-------|---------------|-------|------|---------|
+| Explore | `Explore` | `haiku` | 3-5 | Find related code, patterns |
+| Reasoner | `reasoner` | `opus` | 1 | Synthesize into requirements |
+
+**IMPORTANT**: Always use the `Task` tool with explicit `subagent_type` and `model` parameters.
+
 ## Example
 
 ```
 USER: I want to add image upload
 
+CLAUDE: [Spawns 3 Explore agents in parallel]
+- "Find existing file handling patterns"
+- "Find API endpoint conventions"
+- "Find storage service implementations"
+
+[Agents return: Express multer middleware, REST conventions, no cloud storage yet]
+
 CLAUDE: [Uses AskUserQuestion tool]
 {
   "questions": [
@@ -106,18 +205,7 @@ CLAUDE: [Uses AskUserQuestion tool]
       "multiSelect": true,
       "options": [
         {"label": "JPG/PNG only", "description": "Standard formats"},
-        {"label": "Include WebP", "description": "Modern compression"}
-        {"label": "Include GIF", "description": "Animated images"}
-      ]
-    },
-    {
-      "question": "What's the maximum upload size?",
-      "header": "Size limit",
-      "multiSelect": false,
-      "options": [
-        {"label": "5 MB", "description": "Fast uploads"},
-        {"label": "10 MB (Recommended)", "description": "Balanced"},
-        {"label": "25 MB", "description": "High-res photos"}
+        {"label": "Include WebP", "description": "Modern compression"}
       ]
     },
     {
@@ -126,29 +214,22 @@ CLAUDE: [Uses AskUserQuestion tool]
       "multiSelect": false,
      "options": [
         {"label": "S3 (Recommended)", "description": "Scalable cloud storage"},
-        {"label": "Local filesystem", "description": "Simple,
-        {"label": "Cloudflare R2", "description": "S3-compatible, no egress fees"}
-      ]
-    },
-    {
-      "question": "Who can upload files?",
-      "header": "Auth",
-      "multiSelect": false,
-      "options": [
-        {"label": "Logged-in users", "description": "Require authentication"},
-        {"label": "Public", "description": "Anyone can upload"},
-        {"label": "Role-based", "description": "Specific roles only"}
+        {"label": "Local filesystem", "description": "Simple, matches current setup"}
       ]
     }
   ]
 }
 
-USER: [Selects: JPG/PNG + WebP,
+USER: [Selects: JPG/PNG + WebP, S3]
+
+CLAUDE: [Spawns reasoner agent]
+- Synthesize: multer + S3 + existing API patterns
 
 CLAUDE: ✓ Created specs/image-upload.md
 
 Requirements: 4
 Acceptance criteria: 5
+Technical notes: Express/multer pattern, REST conventions from existing API
 
 Next: Run /df:plan to generate tasks
 ```
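The Codebase Size table in spec.md can be read as a band lookup. A minimal sketch, assuming the upper bound of each band is used (the exact in-band choice is left open by the doc, and the function name is invented here):

```python
def explore_agent_count(file_count: int) -> int:
    """Upper bound of each band in the spec.md Explore-agent scaling table."""
    if file_count < 20:
        return 3    # <20 files → 2-3 agents
    if file_count <= 100:
        return 8    # 20-100    → 5-8 agents
    return 15       # 100+      → 10-15 agents
```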
package/src/commands/df/verify.md
CHANGED

@@ -12,7 +12,11 @@ Check that implemented code satisfies spec requirements and acceptance criteria.
 
 ## Skills & Agents
 - Skill: `code-completeness` — Find incomplete implementations
-
+
+**Use Task tool to spawn agents:**
+| Agent | subagent_type | model | Purpose |
+|-------|---------------|-------|---------|
+| Scanner | `Explore` | `haiku` | Fast codebase scanning |
 
 ## Spec File States
 
@@ -87,7 +91,15 @@ Default: L1-L3 (L4 optional, can be slow)
 
 ## Agent Usage
 
-
+**Use Task tool to spawn Explore agents:**
+```
+Task tool parameters:
+- subagent_type: "Explore"
+- model: "haiku"
+- run_in_background: true (for parallel)
+```
+
+Scale: 1-2 agents per spec, cap 10.
 
 ## Example
 
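verify.md's scanner scaling rule ("1-2 agents per spec, cap 10") could be sketched as follows; `scanner_count` is a hypothetical helper name, not an API the package exposes:

```python
def scanner_count(spec_count: int, per_spec: int = 2, cap: int = 10) -> int:
    """1-2 Explore agents per spec, hard-capped at 10 (verify.md rule)."""
    return min(spec_count * per_spec, cap)
```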
package/templates/experiment-template.md
ADDED

@@ -0,0 +1,74 @@
+# Experiment: {hypothesis-slug}
+
+> **Filename convention**: `{topic}--{hypothesis-slug}--{status}.md`
+> Status: `active` | `passed` | `failed`
+
+## Topic
+
+{Spec name or feature area this experiment relates to}
+
+<!--
+What problem or feature does this experiment address?
+Link to relevant spec if applicable.
+-->
+
+## Hypothesis
+
+{What we believe will work and why}
+
+<!--
+Be specific and testable:
+- "Using approach X will achieve Y because Z"
+- "The bottleneck is in component A, not B"
+- Should be falsifiable in a single experiment
+-->
+
+## Method
+
+{Minimal steps to validate the hypothesis}
+
+<!--
+Keep it minimal - fastest path to prove/disprove:
+1. Step one (e.g., "Create test file with X")
+2. Step two (e.g., "Run command Y")
+3. Step three (e.g., "Observe output Z")
+
+Time-box: ideally under 30 minutes
+-->
+
+## Result
+
+**Status**: {pass | fail}
+
+{Actual outcome with evidence}
+
+<!--
+Include concrete evidence:
+- Error messages, output logs
+- Metrics or measurements
+- Screenshots if applicable
+- What specifically happened vs. expected
+-->
+
+## Conclusion
+
+{What we learned from this experiment}
+
+<!--
+Answer these:
+- Why did it pass/fail?
+- What assumption was validated/invalidated?
+- If failed: What's the next hypothesis? (don't repeat same approach)
+- If passed: What's ready for implementation?
+-->
+
+---
+
+<!--
+Experiment Guidelines:
+- One hypothesis per experiment
+- Failed experiments are valuable - they inform the next hypothesis
+- Never repeat a failed approach without a new insight
+- Keep experiments small and fast (under 30 min)
+- Link related experiments in conclusions
+-->
package/templates/plan-template.md
CHANGED

@@ -29,6 +29,22 @@ Generated: {timestamp}
   - Files: {files}
   - Blocked by: T1
 
+### Spike Task Example
+
+When no experiments exist to validate an approach, start with a minimal validation spike:
+
+- [ ] **T1** (spike): Validate [hypothesis] approach
+  - Files: [minimal files needed]
+  - Blocked by: none
+  - Blocks: T2, T3, T4 (full implementation)
+  - Description: Minimal test to verify [approach] works before full implementation
+
+- [ ] **T2**: Implement [feature] based on spike results
+  - Files: [implementation files]
+  - Blocked by: T1 (spike)
+
+Spike tasks are 1-2 tasks to validate an approach before committing to full implementation.
+
 ---
 
 <!--
@@ -38,4 +54,6 @@ Plan Guidelines:
 - Blocked by references task IDs (T1, T2, etc.)
 - Mark complete with [x] and commit hash
 - Example completed: [x] **T1**: Create API ✓ (abc1234)
+- Spike tasks: If no experiments validate the approach, first task should be a minimal validation spike
+- Spike tasks block full implementation tasks until the hypothesis is validated
 -->
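The `Blocked by:` lines in the plan template are machine-readable dependency declarations. A sketch of parsing them (function name invented for illustration; the `T{n}` ID shape comes from the template's guidelines):

```python
import re

def parse_blocked_by(line: str) -> list:
    """Extract task IDs from a '- Blocked by: T1 (spike must pass), T2' line."""
    m = re.search(r"Blocked by:\s*(.+)", line)
    if not m or m.group(1).strip().lower().startswith("none"):
        return []
    # Parenthetical annotations like "(spike must pass)" are ignored;
    # only the T{n} identifiers matter for the dependency graph.
    return re.findall(r"T\d+", m.group(1))
```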