deepflow 0.1.78 → 0.1.80
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +14 -3
- package/bin/install.js +3 -2
- package/package.json +4 -1
- package/src/commands/df/auto-cycle.md +33 -19
- package/src/commands/df/execute.md +166 -473
- package/src/commands/df/plan.md +113 -163
- package/src/commands/df/verify.md +433 -3
- package/src/skills/browse-fetch/SKILL.md +258 -0
- package/src/skills/browse-verify/SKILL.md +264 -0
- package/templates/config-template.yaml +14 -0
- package/src/skills/context-hub/SKILL.md +0 -87
package/src/commands/df/plan.md
CHANGED
````diff
@@ -3,7 +3,7 @@
 ## Purpose
 Compare specs against codebase and past experiments. Generate prioritized tasks.
 
-**NEVER:** use EnterPlanMode, use ExitPlanMode — this command IS the planning phase
+**NEVER:** use EnterPlanMode, use ExitPlanMode — this command IS the planning phase
 
 ## Usage
 ```
````
````diff
@@ -17,71 +17,50 @@ Compare specs against codebase and past experiments. Generate prioritized tasks.
 
 ## Spec File States
 
-| Prefix |
-|--------|
-| (none) |
-| `doing-` |
-| `done-` |
+| Prefix | Action |
+|--------|--------|
+| (none) | Plan this |
+| `doing-` | Skip |
+| `done-` | Skip |
 
 ## Behavior
 
 ### 1. LOAD CONTEXT
 
 ```
-Load:
-- specs/*.md EXCLUDING doing-* and done-* (only new specs)
-- PLAN.md (if exists, for appending)
-- .deepflow/config.yaml (if exists)
-
+Load: specs/*.md (exclude doing-*/done-*), PLAN.md (if exists), .deepflow/config.yaml
 Determine source_dir from config or default to src/
 ```
 
-Run `validateSpec` on each
-
-If no new specs: report counts, suggest `/df:execute`.
+Run `validateSpec` on each spec. Hard failures → skip + error. Advisory → include in output.
+No new specs → report counts, suggest `/df:execute`.
 
 ### 2. CHECK PAST EXPERIMENTS (SPIKE-FIRST)
 
 **CRITICAL**: Check experiments BEFORE generating any tasks.
 
-Extract topic from spec name (fuzzy match), then:
-
 ```
 Glob .deepflow/experiments/{topic}--*
 ```
 
-
-Statuses: `active`, `passed`, `failed`
+File naming: `{topic}--{hypothesis}--{status}.md` (active/passed/failed)
 
 | Result | Action |
 |--------|--------|
-| `--failed.md`
-| `--passed.md`
-| `--active.md`
-| No matches | New topic,
-
-**Spike-First Rule**:
-- If `--failed.md` exists: Generate spike task to test the next hypothesis (from failed experiment's Conclusion)
-- If no experiments exist: Generate spike task for the core hypothesis
-- Full implementation tasks are BLOCKED until a spike validates the approach
-- Only proceed to full task generation after `--passed.md` exists
+| `--failed.md` | Extract "next hypothesis" from Conclusion, generate spike |
+| `--passed.md` | Proceed to full implementation |
+| `--active.md` | Wait for completion |
+| No matches | New topic, generate initial spike |
 
-See
+Full implementation tasks BLOCKED until spike validates. See `templates/experiment-template.md`.
 
 ### 3. DETECT PROJECT CONTEXT
 
-
-- Code style/conventions
-- Existing patterns (error handling, API structure)
-- Integration points
-
-Include patterns in task descriptions for agents to follow.
+Identify code style, patterns (error handling, API structure), integration points. Include in task descriptions.
 
 ### 4. ANALYZE CODEBASE
 
-Follow `templates/explore-agent.md` for spawn rules
-
-Scale agent count based on codebase size:
+Follow `templates/explore-agent.md` for spawn rules and scope.
 
 | File Count | Agents |
 |------------|--------|
````
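The experiment lookup in the new version (glob `{topic}--*`, then act on the status suffix) can be sketched as follows. This is a minimal illustration, not deepflow's actual implementation; the function name and the precedence among mixed statuses (failed checked before active before passed) are assumptions the table does not specify.

```python
from pathlib import Path

def decide_action(experiments_dir: str, topic: str) -> str:
    """Glob {topic}--*.md and map the status suffix of
    {topic}--{hypothesis}--{status}.md to the planner's action."""
    matches = list(Path(experiments_dir).glob(f"{topic}--*.md"))
    if not matches:
        return "generate initial spike"  # new topic
    # Status is the last "--"-separated segment of the file stem.
    statuses = {m.stem.split("--")[-1] for m in matches}
    if "failed" in statuses:
        return "extract next hypothesis, generate spike"
    if "active" in statuses:
        return "wait for completion"
    if "passed" in statuses:
        return "proceed to full implementation"
    return "generate initial spike"
```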
````diff
@@ -90,125 +69,139 @@
 | 100-500 | 25-40 |
 | 500+ | 50-100 (cap) |
 
-
-- Implementations matching spec requirements
-- TODO, FIXME, HACK comments
-- Stub functions, placeholder returns
-- Skipped tests, incomplete coverage
+Use `code-completeness` skill to search for: implementations matching spec requirements, TODOs/FIXMEs/HACKs, stubs, skipped tests.
 
-### 5.
-
-**
+### 4.5. IMPACT ANALYSIS (per planned file)
+
+For each file in a task's "Files:" list, find the full blast radius.
+
+**Search for:**
+
+1. **Callers:** `grep -r "{exported_function}" --include="*.{ext}" -l` — files that import/call what's being changed
+2. **Duplicates:** Files with similar logic (same function name, same transformation). Classify:
+   - `[active]` — used in production → must consolidate
+   - `[dead]` — bypassed/unreachable → must delete
+3. **Data flow:** If file produces/transforms data, find ALL consumers of that shape across languages
+
+**Embed as `Impact:` block in each task:**
+```markdown
+- [ ] **T2**: Add new features to YAML export
+  - Files: src/utils/buildConfigData.ts
+  - Impact:
+    - Callers: src/routes/index.ts:12, src/api/handler.ts:45
+    - Duplicates:
+      - src/components/YamlViewer.tsx:19 (own generateYAML) [active — consolidate]
+      - backend/yaml_gen.go (generateYAMLFromConfig) [dead — DELETE]
+    - Data flow: buildConfigData → YamlViewer, SimControls, RoleplayPage
+  - Blocked by: T1
+```
+
+Files outside original "Files:" → add with `(impact — verify/update)`.
+Skip for spike tasks.
+
+### 4.6. CROSS-TASK FILE CONFLICT DETECTION
 
-
+After all tasks have their `Files:` lists, detect overlaps that require sequential execution.
+
+**Algorithm:**
+1. Build a map: `file → [task IDs that list it]`
+2. For each file with >1 task: add `Blocked by` edge from later task → earlier task (by task number)
+3. If a dependency already exists (direct or transitive), skip (no redundant edges)
+
+**Example:**
+```
+T1: Files: config.go, feature.go — Blocked by: none
+T3: Files: config.go — Blocked by: none
+T5: Files: config.go — Blocked by: none
+```
+After conflict detection:
+```
+T1: Blocked by: none
+T3: Blocked by: T1 (file conflict: config.go)
+T5: Blocked by: T3 (file conflict: config.go)
+```
+
+**Rules:**
+- Only add the minimum edges needed (chain, not full mesh — T5 blocks on T3, not T1+T3)
+- Append `(file conflict: {file})` to the Blocked by reason for traceability
+- If a logical dependency already covers the ordering, don't add a redundant conflict edge
+- Cross-spec conflicts: tasks from different specs sharing files get the same treatment
+
````
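The cross-task file conflict detection added in this release (map files to tasks, then chain `Blocked by` edges rather than building a full mesh) can be sketched like this. A minimal sketch under stated assumptions: task IDs are plain integers, the transitive-dependency skip is omitted, and the dict-based representation is illustrative, not deepflow's internal format.

```python
from collections import defaultdict

def add_conflict_edges(tasks):
    """tasks: {task_id: [files]}. Returns {task_id: blocked_by or None}.

    Chain, not full mesh: each later task blocks only on the nearest
    earlier task that shares a file with it."""
    file_to_tasks = defaultdict(list)
    for tid in sorted(tasks):
        for f in tasks[tid]:
            file_to_tasks[f].append(tid)

    blocked_by = {tid: None for tid in tasks}
    for tids in file_to_tasks.values():
        # Consecutive pairs give the minimal chain for this file.
        for earlier, later in zip(tids, tids[1:]):
            # Keep the nearest (largest-numbered) predecessor.
            if blocked_by[later] is None or earlier > blocked_by[later]:
                blocked_by[later] = earlier
    return blocked_by
```

On the document's own example (T1, T3, T5 all touching `config.go`), this yields T3 blocked by T1 and T5 blocked by T3, matching the "after conflict detection" output.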
````diff
+### 5. COMPARE & PRIORITIZE
+
+Spawn `Task(subagent_type="reasoner", model="opus")`. Map each requirement to DONE / PARTIAL / MISSING / CONFLICT. Check REQ-AC alignment. Flag spec gaps.
+
+Priority: Dependencies → Impact → Risk
+
+### 6. GENERATE SPIKE TASKS (IF NEEDED)
 
 **Spike Task Format:**
 ```markdown
 - [ ] **T1** [SPIKE]: Validate {hypothesis}
   - Type: spike
   - Hypothesis: {what we're testing}
-  - Method: {minimal steps
-  - Success criteria: {
+  - Method: {minimal steps}
+  - Success criteria: {measurable}
   - Time-box: 30 min
   - Files: .deepflow/experiments/{topic}--{hypothesis}--{status}.md
   - Blocked by: none
 ```
 
-
+All implementation tasks MUST `Blocked by: T{spike}`. Spike fails → `--failed.md`, no implementation tasks.
 
 #### Probe Diversity
 
-When generating multiple
+When generating multiple spikes for the same problem:
 
 | Requirement | Rule |
 |-------------|------|
-| Contradictory |
-| Naive |
-| Parallel | All
-| Scoped |
-| Safe to fail | Each probe runs in its own worktree; failure has zero impact on main |
+| Contradictory | ≥2 probes with opposing approaches |
+| Naive | ≥1 probe without prior technical justification |
+| Parallel | All run simultaneously |
+| Scoped | Minimal — just enough to validate |
 
-
-1. Are there at least 2 probes with opposing assumptions? If not, add a contradictory probe.
-2. Is there at least 1 naive probe with no prior technical justification? If not, add one.
-3. Are all probes independent (no probe depends on another probe's result)?
-
-**Example — 3 diverse probes for a caching problem:**
+Before output, verify: ≥2 opposing probes, ≥1 naive, all independent.
 
+**Example — caching problem, 3 diverse probes:**
 ```markdown
 - [ ] **T1** [SPIKE]: Validate in-memory LRU cache
-  - Type: spike
   - Role: Contradictory-A (in-process)
-  - Hypothesis: In-memory LRU
-  - Method:
-  - Success criteria: DB
-  - Blocked by: none
+  - Hypothesis: In-memory LRU reduces DB queries by ≥80%
+  - Method: LRU with 1000-item cap, load test
+  - Success criteria: DB queries drop ≥80% under 100 concurrent users
 
 - [ ] **T2** [SPIKE]: Validate Redis distributed cache
-  - Type: spike
   - Role: Contradictory-B (external, opposing T1)
-  - Hypothesis: Redis
-  - Method:
-  - Success criteria: DB queries drop ≥80%, works across 2
-  - Blocked by: none
+  - Hypothesis: Redis scales across multiple instances
+  - Method: Redis client, cache top 10 queries, same load test
+  - Success criteria: DB queries drop ≥80%, works across 2 instances
 
-- [ ] **T3** [SPIKE]: Validate query optimization without cache
-  - Type: spike
+- [ ] **T3** [SPIKE]: Validate query optimization without cache
   - Role: Naive (no prior justification — tests if caching is even necessary)
-  - Hypothesis: Indexes + query batching alone may
-  - Method: Add
+  - Hypothesis: Indexes + query batching alone may suffice
+  - Method: Add indexes, batch N+1 queries, same load test — no cache
   - Success criteria: DB queries drop ≥80% with zero cache infrastructure
-  - Blocked by: none
 ```
 
 ### 7. VALIDATE HYPOTHESES
 
-
+Unfamiliar APIs or performance-critical → prototype in scratchpad. Fails → write `--failed.md`. Skip for known patterns.
 
 ### 8. CLEANUP PLAN.md
 
-
-```
-For each ### section in PLAN.md:
-  Extract spec name from header (e.g. "doing-upload" or "done-upload")
-  If specs/done-{name}.md exists:
-    → Remove the ENTIRE section: header, tasks, execution summary, fix tasks, separators
-  If header references a spec with no matching specs/doing-*.md or specs/done-*.md:
-    → Remove it (orphaned section)
-```
-
-Also recalculate the Summary table (specs analyzed, tasks created/completed/pending) to reflect only remaining sections.
-
-If PLAN.md becomes empty after cleanup, delete the file and recreate fresh.
-
-### 9. OUTPUT PLAN.md
+Prune stale sections: remove `done-*` sections and orphaned headers. Recalculate Summary table. Empty → recreate fresh.
 
-
+### 9. OUTPUT & RENAME
 
-
+Append tasks grouped by `### doing-{spec-name}`. Rename `specs/feature.md` → `specs/doing-feature.md`.
 
-
-### 11. REPORT
-
-`✓ Plan generated — {n} specs, {n} tasks. Run /df:execute`
+Report: `✓ Plan generated — {n} specs, {n} tasks. Run /df:execute`
 
 ## Rules
-- **Spike-first** —
-- **Block on spike** —
-- **Learn from failures** — Extract
+- **Spike-first** — No `--passed.md` → spike before implementation
+- **Block on spike** — Implementation tasks blocked until spike validates
+- **Learn from failures** — Extract next hypothesis, never repeat approach
 - **Plan only** — Do NOT implement (except quick validation prototypes)
-- **Confirm before assume** — Search code before marking "missing"
 - **One task = one logical unit** — Atomic, committable
 - Prefer existing utilities over new code; flag spec gaps
 
````
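The probe-diversity checklist ("≥2 opposing probes, ≥1 naive, all independent") amounts to a small validation pass. A sketch under stated assumptions: the probe dicts, their `role` strings (matching the `Contradictory-*`/`Naive` roles in the example), and the function name are illustrative, not part of deepflow.

```python
def check_probe_diversity(probes):
    """probes: [{'role': str, 'blocked_by': list}]. Returns a list of
    violations of the diversity rules; empty means the set is valid."""
    problems = []
    # Rule: >=2 probes with opposing (Contradictory) approaches.
    if sum(p["role"].startswith("Contradictory") for p in probes) < 2:
        problems.append("add a contradictory probe (need >=2 opposing)")
    # Rule: >=1 naive probe without prior technical justification.
    if not any(p["role"].startswith("Naive") for p in probes):
        problems.append("add a naive probe")
    # Rule: all probes independent (Blocked by: none).
    if any(p["blocked_by"] for p in probes):
        problems.append("probes must not depend on each other")
    return problems
```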
````diff
@@ -216,74 +209,31 @@ Append tasks grouped by `### doing-{spec-name}`. Include spec gaps and validatio
 
 | Agent | Model | Base | Scale |
 |-------|-------|------|-------|
-| Explore
-| Reasoner
+| Explore | haiku | 10 | +1 per 20 files |
+| Reasoner | opus | 5 | +1 per 2 specs |
 
-Always use
+Always use `Task` tool with explicit `subagent_type` and `model`.
 
 ## Example
 
-### Spike-First (No Prior Experiments)
-
 ```markdown
-# Plan
-
 ### doing-upload
 
 - [ ] **T1** [SPIKE]: Validate streaming upload approach
   - Type: spike
-  - Hypothesis: Streaming uploads
-  -
-  - Success criteria: Memory stays under 500MB during upload
-  - Time-box: 30 min
+  - Hypothesis: Streaming uploads handle >1GB without memory issues
+  - Success criteria: Memory <500MB during 2GB upload
   - Files: .deepflow/experiments/upload--streaming--active.md
   - Blocked by: none
 
 - [ ] **T2**: Create upload endpoint
   - Files: src/api/upload.ts
-  -
+  - Impact:
+    - Callers: src/routes/index.ts:5
+    - Duplicates: backend/legacy-upload.go [dead — DELETE]
+  - Blocked by: T1
 
 - [ ] **T3**: Add S3 service with streaming
   - Files: src/services/storage.ts
-  - Blocked by: T1
-```
-
-### Spike-First (After Failed Experiment)
-
-```markdown
-# Plan
-
-### doing-upload
-
-- [ ] **T1** [SPIKE]: Validate chunked upload with backpressure
-  - Type: spike
-  - Hypothesis: Adding backpressure control will prevent buffer overflow
-  - Method: Implement pause/resume on buffer threshold, test with 2GB file
-  - Success criteria: No memory spikes above 500MB
-  - Time-box: 30 min
-  - Files: .deepflow/experiments/upload--chunked-backpressure--active.md
-  - Blocked by: none
-  - Note: Previous approach failed (see upload--buffer-upload--failed.md)
-
-- [ ] **T2**: Implement chunked upload endpoint
-  - Files: src/api/upload.ts
-  - Blocked by: T1 (spike must pass)
-```
-
-### After Spike Validates (Full Implementation)
-
-```markdown
-# Plan
-
-### doing-upload
-
-- [ ] **T1**: Create upload endpoint
-  - Files: src/api/upload.ts
-  - Blocked by: none
-  - Note: Use streaming (validated in upload--streaming--passed.md)
-
-- [ ] **T2**: Add S3 service with streaming
-  - Files: src/services/storage.ts
-  - Blocked by: T1
-  - Avoid: Direct buffer upload failed (see upload--buffer-upload--failed.md)
+  - Blocked by: T1, T2
 ```
````