deepflow 0.1.78 → 0.1.80

@@ -3,7 +3,7 @@
3
3
  ## Purpose
4
4
  Compare specs against codebase and past experiments. Generate prioritized tasks.
5
5
 
6
- **NEVER:** use EnterPlanMode, use ExitPlanMode — this command IS the planning phase; native plan mode conflicts with it
6
+ **NEVER:** use EnterPlanMode, use ExitPlanMode — this command IS the planning phase
7
7
 
8
8
  ## Usage
9
9
  ```
@@ -17,71 +17,50 @@ Compare specs against codebase and past experiments. Generate prioritized tasks.
17
17
 
18
18
  ## Spec File States
19
19
 
20
- | Prefix | State | Action |
21
- |--------|-------|--------|
22
- | (none) | New | Plan this |
23
- | `doing-` | In progress | Skip |
24
- | `done-` | Completed | Skip |
20
+ | Prefix | Action |
21
+ |--------|--------|
22
+ | (none) | Plan this |
23
+ | `doing-` | Skip |
24
+ | `done-` | Skip |
25
25
 
26
26
  ## Behavior
27
27
 
28
28
  ### 1. LOAD CONTEXT
29
29
 
30
30
  ```
31
- Load:
32
- - specs/*.md EXCLUDING doing-* and done-* (only new specs)
33
- - PLAN.md (if exists, for appending)
34
- - .deepflow/config.yaml (if exists)
35
-
31
+ Load: specs/*.md (exclude doing-*/done-*), PLAN.md (if exists), .deepflow/config.yaml
36
32
  Determine source_dir from config or default to src/
37
33
  ```
38
34
 
39
- Run `validateSpec` on each loaded spec. Hard failures → skip that spec entirely and emit an error line. Advisory warnings → include them in plan output.
40
-
41
- If no new specs: report counts, suggest `/df:execute`.
35
+ Run `validateSpec` on each spec. Hard failures → skip + error. Advisory → include in output.
36
+ No new specs → report counts, suggest `/df:execute`.
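The load step above can be illustrated with a small filter over spec file names. This is only a sketch: `select_new_specs` and `SKIP_PREFIXES` are hypothetical names, and it assumes the state is encoded purely by the `doing-` / `done-` file-name prefix shown in the Spec File States table.

```python
from pathlib import PurePosixPath

# Prefixes that the Spec File States table marks as "Skip".
SKIP_PREFIXES = ("doing-", "done-")


def select_new_specs(paths):
    """Return only the spec paths whose file name carries no state prefix."""
    new_specs = []
    for path in paths:
        name = PurePosixPath(path).name
        if name.endswith(".md") and not name.startswith(SKIP_PREFIXES):
            new_specs.append(path)
    return sorted(new_specs)


specs = ["specs/upload.md", "specs/doing-auth.md", "specs/done-search.md"]
print(select_new_specs(specs))  # ['specs/upload.md']
```

An empty result corresponds to the "No new specs" branch, where the command reports counts and suggests `/df:execute`.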
42
37
 
43
38
  ### 2. CHECK PAST EXPERIMENTS (SPIKE-FIRST)
44
39
 
45
40
  **CRITICAL**: Check experiments BEFORE generating any tasks.
46
41
 
47
- Extract topic from spec name (fuzzy match), then:
48
-
49
42
  ```
50
43
  Glob .deepflow/experiments/{topic}--*
51
44
  ```
52
45
 
53
- **Experiment file naming:** `{topic}--{hypothesis}--{status}.md`
54
- Statuses: `active`, `passed`, `failed`
46
+ File naming: `{topic}--{hypothesis}--{status}.md` (active/passed/failed)
55
47
 
56
48
  | Result | Action |
57
49
  |--------|--------|
58
- | `--failed.md` exists | Extract "next hypothesis" from Conclusion section |
59
- | `--passed.md` exists | Reference as validated pattern, can proceed to full implementation |
60
- | `--active.md` exists | Wait for experiment completion before planning |
61
- | No matches | New topic, needs initial spike |
62
-
63
- **Spike-First Rule**:
64
- - If `--failed.md` exists: Generate spike task to test the next hypothesis (from failed experiment's Conclusion)
65
- - If no experiments exist: Generate spike task for the core hypothesis
66
- - Full implementation tasks are BLOCKED until a spike validates the approach
67
- - Only proceed to full task generation after `--passed.md` exists
50
+ | `--failed.md` | Extract "next hypothesis" from Conclusion, generate spike |
51
+ | `--passed.md` | Proceed to full implementation |
52
+ | `--active.md` | Wait for completion |
53
+ | No matches | New topic, generate initial spike |
68
54
 
69
- See: `templates/experiment-template.md` for experiment format
55
+ Full implementation tasks BLOCKED until spike validates. See `templates/experiment-template.md`.
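One way to read the result table above is as a lookup on the status suffix of matching experiment files. The sketch below is illustrative only: `Action`, `parse_experiment`, and `decide` are hypothetical names, and the precedence when several statuses coexist (active first, then passed, then failed) is an assumption that the text does not state.

```python
from enum import Enum


class Action(Enum):
    WAIT = "wait for completion"
    IMPLEMENT = "proceed to full implementation"
    NEXT_SPIKE = "generate spike from next hypothesis"
    INITIAL_SPIKE = "generate initial spike"


def parse_experiment(filename):
    """Split '{topic}--{hypothesis}--{status}.md' into its three parts."""
    stem = filename[:-3] if filename.endswith(".md") else filename
    topic, rest = stem.split("--", 1)
    hypothesis, status = rest.rsplit("--", 1)
    return topic, hypothesis, status


def decide(topic, filenames):
    """Map the experiment files for one topic to a planning action."""
    statuses = {
        parse_experiment(name)[2]
        for name in filenames
        if name.startswith(topic + "--")
    }
    # Assumed precedence: a running experiment blocks planning,
    # a passed one unlocks implementation, a failed one triggers a new spike.
    if "active" in statuses:
        return Action.WAIT
    if "passed" in statuses:
        return Action.IMPLEMENT
    if "failed" in statuses:
        return Action.NEXT_SPIKE
    return Action.INITIAL_SPIKE


print(decide("upload", ["upload--buffer-upload--failed.md"]).value)
```

Extracting the "next hypothesis" from the Conclusion section of a failed experiment is left out, since it depends on the template's exact layout.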
70
56
 
71
57
  ### 3. DETECT PROJECT CONTEXT
72
58
 
73
- For existing codebases, identify:
74
- - Code style/conventions
75
- - Existing patterns (error handling, API structure)
76
- - Integration points
77
-
78
- Include patterns in task descriptions for agents to follow.
59
+ Identify code style, patterns (error handling, API structure), integration points. Include in task descriptions.
79
60
 
80
61
  ### 4. ANALYZE CODEBASE
81
62
 
82
- Follow `templates/explore-agent.md` for spawn rules, prompt structure, and scope restrictions.
83
-
84
- Scale agent count based on codebase size:
63
+ Follow `templates/explore-agent.md` for spawn rules and scope.
85
64
 
86
65
  | File Count | Agents |
87
66
  |------------|--------|
@@ -90,125 +69,139 @@ Scale agent count based on codebase size:
90
69
  | 100-500 | 25-40 |
91
70
  | 500+ | 50-100 (cap) |
92
71
 
93
- **Use `code-completeness` skill patterns** to search for:
94
- - Implementations matching spec requirements
95
- - TODO, FIXME, HACK comments
96
- - Stub functions, placeholder returns
97
- - Skipped tests, incomplete coverage
72
+ Use `code-completeness` skill to search for: implementations matching spec requirements, TODOs/FIXMEs/HACKs, stubs, skipped tests.
98
73
 
99
- ### 5. COMPARE & PRIORITIZE
74
+ ### 4.5. IMPACT ANALYSIS (per planned file)
100
75
 
101
- Spawn `Task(subagent_type="reasoner", model="opus")`. Reasoner maps each requirement to DONE / PARTIAL / MISSING / CONFLICT. Flag spec gaps; don't silently assume.
76
+ For each file in a task's "Files:" list, find the full blast radius.
102
77
 
103
- Check spec health: verify REQ-AC alignment, requirement clarity, and completeness. Note any issues (orphan ACs, vague requirements) in plan output.
78
+ **Search for:**
104
79
 
105
- **Priority order:** Dependencies → Impact → Risk
80
+ 1. **Callers:** `grep -r "{exported_function}" --include="*.{ext}" -l` — files that import/call what's being changed
81
+ 2. **Duplicates:** Files with similar logic (same function name, same transformation). Classify:
82
+ - `[active]` — used in production → must consolidate
83
+ - `[dead]` — bypassed/unreachable → must delete
84
+ 3. **Data flow:** If file produces/transforms data, find ALL consumers of that shape across languages
106
85
 
107
- ### 6. GENERATE SPIKE TASKS (IF NEEDED)
86
+ **Embed as `Impact:` block in each task:**
87
+ ```markdown
88
+ - [ ] **T2**: Add new features to YAML export
89
+ - Files: src/utils/buildConfigData.ts
90
+ - Impact:
91
+ - Callers: src/routes/index.ts:12, src/api/handler.ts:45
92
+ - Duplicates:
93
+ - src/components/YamlViewer.tsx:19 (own generateYAML) [active — consolidate]
94
+ - backend/yaml_gen.go (generateYAMLFromConfig) [dead — DELETE]
95
+ - Data flow: buildConfigData → YamlViewer, SimControls, RoleplayPage
96
+ - Blocked by: T1
97
+ ```
98
+
99
+ Files outside original "Files:" → add with `(impact — verify/update)`.
100
+ Skip for spike tasks.
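The caller search in step 1 of the impact analysis can be approximated without shelling out to `grep`. The following is a rough sketch, not the command's actual implementation: `find_callers` is a hypothetical helper, and it uses a plain substring match, whereas `grep` interprets the pattern as a regular expression.

```python
from pathlib import Path


def find_callers(root, symbol, ext):
    """List files under `root` with extension `ext` that mention `symbol`.

    Roughly mirrors `grep -r "{symbol}" --include="*.{ext}" -l`,
    but with a literal substring test instead of a regex.
    """
    hits = []
    for path in sorted(Path(root).rglob(f"*.{ext}")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        if symbol in text:
            hits.append(path.relative_to(root).as_posix())
    return hits
```

A call such as `find_callers("src", "buildConfigData", "ts")` would return the relative paths to list under `Callers:`. Line numbers like `:12` would need a per-line scan, which this sketch omits.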
101
+
102
+ ### 4.6. CROSS-TASK FILE CONFLICT DETECTION
108
103
 
109
- **When to generate spike tasks:**
110
- 1. Failed experiment exists → Test the next hypothesis
111
- 2. No experiments exist → Test the core hypothesis
112
- 3. Passed experiment exists → Skip to full implementation
104
+ After all tasks have their `Files:` lists, detect overlaps that require sequential execution.
105
+
106
+ **Algorithm:**
107
+ 1. Build a map: `file → [task IDs that list it]`
108
+ 2. For each file with >1 task: add `Blocked by` edge from later task → earlier task (by task number)
109
+ 3. If a dependency already exists (direct or transitive), skip (no redundant edges)
110
+
111
+ **Example:**
112
+ ```
113
+ T1: Files: config.go, feature.go — Blocked by: none
114
+ T3: Files: config.go — Blocked by: none
115
+ T5: Files: config.go — Blocked by: none
116
+ ```
117
+ After conflict detection:
118
+ ```
119
+ T1: Blocked by: none
120
+ T3: Blocked by: T1 (file conflict: config.go)
121
+ T5: Blocked by: T3 (file conflict: config.go)
122
+ ```
123
+
124
+ **Rules:**
125
+ - Only add the minimum edges needed (chain, not full mesh — T5 blocks on T3, not T1+T3)
126
+ - Append `(file conflict: {filename})` to the Blocked by reason for traceability
127
+ - If a logical dependency already covers the ordering, don't add a redundant conflict edge
128
+ - Cross-spec conflicts: tasks from different specs sharing files get the same treatment
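The algorithm and rules above can be sketched as follows. This is a minimal illustration under stated assumptions: tasks are identified by their integer number, `files_by_task` and `deps` are hypothetical input shapes, and files are processed in sorted order so the result is deterministic.

```python
from collections import defaultdict


def depends_on(deps, task, target):
    """True if `task` reaches `target` through direct or transitive edges."""
    stack, seen = list(deps.get(task, ())), set()
    while stack:
        current = stack.pop()
        if current == target:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(deps.get(current, ()))
    return False


def add_conflict_edges(files_by_task, deps):
    """Chain tasks that share a file: each later task blocks on the previous one.

    files_by_task: {task_number: [file, ...]}
    deps:          {task_number: {blocking_task_number, ...}} (logical dependencies)
    Returns (updated deps, {(later, earlier): reason}) for the edges that were added.
    """
    # Step 1: file -> task IDs that list it, in ascending task order.
    tasks_by_file = defaultdict(list)
    for task in sorted(files_by_task):
        for path in set(files_by_task[task]):
            tasks_by_file[path].append(task)

    result = {task: set(blockers) for task, blockers in deps.items()}
    for task in files_by_task:
        result.setdefault(task, set())

    reasons = {}
    for path in sorted(tasks_by_file):
        tasks = tasks_by_file[path]
        # Step 2: consecutive pairs only, giving a chain rather than a full mesh.
        for earlier, later in zip(tasks, tasks[1:]):
            # Step 3: skip edges already implied by existing dependencies.
            if depends_on(result, later, earlier):
                continue
            result[later].add(earlier)
            reasons[(later, earlier)] = f"file conflict: {path}"
    return result, reasons


files = {1: ["config.go", "feature.go"], 3: ["config.go"], 5: ["config.go"]}
blocked, why = add_conflict_edges(files, {})
print(blocked)  # {1: set(), 3: {1}, 5: {3}}
```

With the inputs from the example above, T3 ends up blocked by T1 and T5 by T3, each carrying the reason `file conflict: config.go`.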
129
+
130
+ ### 5. COMPARE & PRIORITIZE
131
+
132
+ Spawn `Task(subagent_type="reasoner", model="opus")`. Map each requirement to DONE / PARTIAL / MISSING / CONFLICT. Check REQ-AC alignment. Flag spec gaps.
133
+
134
+ Priority: Dependencies → Impact → Risk
135
+
136
+ ### 6. GENERATE SPIKE TASKS (IF NEEDED)
113
137
 
114
138
  **Spike Task Format:**
115
139
  ```markdown
116
140
  - [ ] **T1** [SPIKE]: Validate {hypothesis}
117
141
  - Type: spike
118
142
  - Hypothesis: {what we're testing}
119
- - Method: {minimal steps to validate}
120
- - Success criteria: {how to know it passed}
143
+ - Method: {minimal steps}
144
+ - Success criteria: {measurable}
121
145
  - Time-box: 30 min
122
146
  - Files: .deepflow/experiments/{topic}--{hypothesis}--{status}.md
123
147
  - Blocked by: none
124
148
  ```
125
149
 
126
- **Blocking Logic:** All implementation tasks MUST have `Blocked by: T{spike}` until spike passes. If spike fails: update to `--failed.md`, DO NOT generate implementation tasks.
150
+ All implementation tasks MUST `Blocked by: T{spike}`. Spike fails → `--failed.md`, no implementation tasks.
127
151
 
128
152
  #### Probe Diversity
129
153
 
130
- When generating multiple spike probes for the same problem, diversity is required to avoid confirmation bias and enable discovery of unexpected solutions.
154
+ When generating multiple spikes for the same problem:
131
155
 
132
156
  | Requirement | Rule |
133
157
  |-------------|------|
134
- | Contradictory | At least 2 probes must use opposing/contradictory approaches (e.g., streaming vs buffering, in-process vs external) |
135
- | Naive | At least 1 probe must be a naive/simple approach without prior technical justification — enables exaptation (discovering unexpected solutions) |
136
- | Parallel | All probes for the same problem run simultaneously, not sequentially |
137
- | Scoped | Each probe is minimal — just enough to validate the hypothesis |
138
- | Safe to fail | Each probe runs in its own worktree; failure has zero impact on main |
158
+ | Contradictory | 2 probes with opposing approaches |
159
+ | Naive | 1 probe without prior technical justification |
160
+ | Parallel | All run simultaneously |
161
+ | Scoped | Minimal — just enough to validate |
139
162
 
140
- **Diversity validation step** before outputting spike tasks, verify:
141
- 1. Are there at least 2 probes with opposing assumptions? If not, add a contradictory probe.
142
- 2. Is there at least 1 naive probe with no prior technical justification? If not, add one.
143
- 3. Are all probes independent (no probe depends on another probe's result)?
144
-
145
- **Example — 3 diverse probes for a caching problem:**
163
+ Before output, verify: ≥2 opposing probes, ≥1 naive, all independent.
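The pre-output check can be expressed as a small validator. This is a sketch under assumptions: the `Probe` record and the role labels `"contradictory"` and `"naive"` are hypothetical stand-ins for the `Role:` field used in the spike examples, and "independent" is read as "no probe is blocked by another probe".

```python
from dataclasses import dataclass, field


@dataclass
class Probe:
    task_id: str
    role: str  # hypothetical labels: "contradictory", "naive", or anything else
    blocked_by: list = field(default_factory=list)


def diversity_problems(probes):
    """Return a list of rule violations; an empty list means the set is diverse."""
    problems = []
    probe_ids = {probe.task_id for probe in probes}
    if sum(probe.role == "contradictory" for probe in probes) < 2:
        problems.append("need at least 2 probes with opposing approaches")
    if not any(probe.role == "naive" for probe in probes):
        problems.append("need at least 1 naive probe")
    for probe in probes:
        if probe_ids.intersection(probe.blocked_by):
            problems.append(f"{probe.task_id} depends on another probe")
    return problems


probes = [
    Probe("T1", "contradictory"),
    Probe("T2", "contradictory"),
    Probe("T3", "naive"),
]
print(diversity_problems(probes))  # []
```

Note that counting two `contradictory` roles does not prove the approaches actually oppose each other; that judgment stays with the planner.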
146
164
 
165
+ **Example — caching problem, 3 diverse probes:**
147
166
  ```markdown
148
167
  - [ ] **T1** [SPIKE]: Validate in-memory LRU cache
149
- - Type: spike
150
168
  - Role: Contradictory-A (in-process)
151
- - Hypothesis: In-memory LRU cache reduces DB queries by ≥80%
152
- - Method: Implement LRU with 1000-item cap, run load test
153
- - Success criteria: DB query count drops ≥80% under 100 concurrent users
154
- - Blocked by: none
169
+ - Hypothesis: In-memory LRU reduces DB queries by ≥80%
170
+ - Method: LRU with 1000-item cap, load test
171
+ - Success criteria: DB queries drop ≥80% under 100 concurrent users
155
172
 
156
173
  - [ ] **T2** [SPIKE]: Validate Redis distributed cache
157
- - Type: spike
158
174
  - Role: Contradictory-B (external, opposing T1)
159
- - Hypothesis: Redis cache scales across multiple instances
160
- - Method: Add Redis client, cache top 10 queries, same load test
161
- - Success criteria: DB queries drop ≥80%, works across 2 app instances
162
- - Blocked by: none
175
+ - Hypothesis: Redis scales across multiple instances
176
+ - Method: Redis client, cache top 10 queries, same load test
177
+ - Success criteria: DB queries drop ≥80%, works across 2 instances
163
178
 
164
- - [ ] **T3** [SPIKE]: Validate query optimization without cache (naive)
165
- - Type: spike
179
+ - [ ] **T3** [SPIKE]: Validate query optimization without cache
166
180
  - Role: Naive (no prior justification — tests if caching is even necessary)
167
- - Hypothesis: Indexes + query batching alone may be sufficient
168
- - Method: Add missing indexes, batch N+1 queries, same load test — no cache
181
+ - Hypothesis: Indexes + query batching alone may suffice
182
+ - Method: Add indexes, batch N+1 queries, same load test — no cache
169
183
  - Success criteria: DB queries drop ≥80% with zero cache infrastructure
170
- - Blocked by: none
171
184
  ```
172
185
 
173
186
  ### 7. VALIDATE HYPOTHESES
174
187
 
175
- For unfamiliar APIs, ambiguous approaches, or performance-critical work: prototype in scratchpad (not committed). If assumption fails, write `.deepflow/experiments/{topic}--{hypothesis}--failed.md`. Skip for well-known patterns/simple CRUD.
188
+ Unfamiliar APIs or performance-critical → prototype in scratchpad. Fails → write `--failed.md`. Skip for known patterns.
176
189
 
177
190
  ### 8. CLEANUP PLAN.md
178
191
 
179
- Before writing new tasks, prune stale sections:
180
-
181
- ```
182
- For each ### section in PLAN.md:
183
- Extract spec name from header (e.g. "doing-upload" or "done-upload")
184
- If specs/done-{name}.md exists:
185
- → Remove the ENTIRE section: header, tasks, execution summary, fix tasks, separators
186
- If header references a spec with no matching specs/doing-*.md or specs/done-*.md:
187
- → Remove it (orphaned section)
188
- ```
189
-
190
- Also recalculate the Summary table (specs analyzed, tasks created/completed/pending) to reflect only remaining sections.
191
-
192
- If PLAN.md becomes empty after cleanup, delete the file and recreate fresh.
193
-
194
- ### 9. OUTPUT PLAN.md
192
+ Prune stale sections: remove `done-*` sections and orphaned headers. Recalculate Summary table. Empty → recreate fresh.
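The pruning rule can be sketched as a pass over PLAN.md sections, following the pseudocode that the earlier version spelled out. This is illustrative only: `prune_plan` is a hypothetical helper, it assumes every `###` header names a spec, and it leaves the Summary table recalculation out.

```python
import re

# Matches '### doing-upload', '### done-upload', or '### upload'.
HEADER = re.compile(r"^### (?:doing-|done-)?(?P<name>.+?)\s*$")


def prune_plan(plan_text, spec_files):
    """Drop sections whose spec is done or has no matching spec file.

    spec_files: set of file names found in specs/, e.g. {"doing-upload.md"}.
    """
    kept = []
    keep_section = True  # content before the first '###' header is preserved
    for line in plan_text.splitlines():
        match = HEADER.match(line)
        if match:
            name = match.group("name")
            is_done = f"done-{name}.md" in spec_files
            is_doing = f"doing-{name}.md" in spec_files
            # Keep only sections that are still in progress.
            keep_section = is_doing and not is_done
        if keep_section:
            kept.append(line)
    return "\n".join(kept)
```

If the returned text contains no `###` sections, that corresponds to the "Empty → recreate fresh" case.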
195
193
 
196
- Append tasks grouped by `### doing-{spec-name}`. Include spec gaps and validation findings.
194
+ ### 9. OUTPUT & RENAME
197
195
 
198
- ### 10. RENAME SPECS
196
+ Append tasks grouped by `### doing-{spec-name}`. Rename `specs/feature.md` → `specs/doing-feature.md`.
199
197
 
200
- `mv specs/feature.md specs/doing-feature.md`
201
-
202
- ### 11. REPORT
203
-
204
- `✓ Plan generated — {n} specs, {n} tasks. Run /df:execute`
198
+ Report: `✓ Plan generated — {n} specs, {n} tasks. Run /df:execute`
205
199
 
206
200
  ## Rules
207
- - **Spike-first** — Generate spike task before full implementation if no `--passed.md` experiment exists
208
- - **Block on spike** — Full implementation tasks MUST be blocked by spike validation
209
- - **Learn from failures** — Extract "next hypothesis" from failed experiments, never repeat same approach
201
+ - **Spike-first** — No `--passed.md` → spike before implementation
202
+ - **Block on spike** — Implementation tasks blocked until spike validates
203
+ - **Learn from failures** — Extract next hypothesis, never repeat approach
210
204
  - **Plan only** — Do NOT implement (except quick validation prototypes)
211
- - **Confirm before assume** — Search code before marking "missing"
212
205
  - **One task = one logical unit** — Atomic, committable
213
206
  - Prefer existing utilities over new code; flag spec gaps
214
207
 
@@ -216,74 +209,31 @@ Append tasks grouped by `### doing-{spec-name}`. Include spec gaps and validatio
216
209
 
217
210
  | Agent | Model | Base | Scale |
218
211
  |-------|-------|------|-------|
219
- | Explore (search) | haiku | 10 | +1 per 20 files |
220
- | Reasoner (analyze) | opus | 5 | +1 per 2 specs |
212
+ | Explore | haiku | 10 | +1 per 20 files |
213
+ | Reasoner | opus | 5 | +1 per 2 specs |
221
214
 
222
- Always use the `Task` tool with explicit `subagent_type` and `model`. Do NOT use Glob/Grep/Read directly.
215
+ Always use `Task` tool with explicit `subagent_type` and `model`.
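The Base and Scale columns suggest a simple additive formula. The sketch below assumes integer (floor) division for "+1 per N" and borrows the cap of 100 from the File Count table; neither detail is stated here, and the results are not guaranteed to line up with the ranges in that table. `agent_count` is a hypothetical helper.

```python
def agent_count(base, units, units_per_extra_agent, cap=None):
    """Base agents plus one extra agent per `units_per_extra_agent` units."""
    count = base + units // units_per_extra_agent
    return min(count, cap) if cap is not None else count


# Explore: base 10, +1 per 20 files (cap assumed from the File Count table).
print(agent_count(base=10, units=240, units_per_extra_agent=20, cap=100))  # 22
# Reasoner: base 5, +1 per 2 specs.
print(agent_count(base=5, units=4, units_per_extra_agent=2))  # 7
```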
223
216
 
224
217
  ## Example
225
218
 
226
- ### Spike-First (No Prior Experiments)
227
-
228
219
  ```markdown
229
- # Plan
230
-
231
220
  ### doing-upload
232
221
 
233
222
  - [ ] **T1** [SPIKE]: Validate streaming upload approach
234
223
  - Type: spike
235
- - Hypothesis: Streaming uploads will handle files >1GB without memory issues
236
- - Method: Create minimal endpoint, upload 2GB file, measure memory
237
- - Success criteria: Memory stays under 500MB during upload
238
- - Time-box: 30 min
224
+ - Hypothesis: Streaming uploads handle >1GB without memory issues
225
+ - Success criteria: Memory <500MB during 2GB upload
239
226
  - Files: .deepflow/experiments/upload--streaming--active.md
240
227
  - Blocked by: none
241
228
 
242
229
  - [ ] **T2**: Create upload endpoint
243
230
  - Files: src/api/upload.ts
244
- - Blocked by: T1 (spike must pass)
231
+ - Impact:
232
+ - Callers: src/routes/index.ts:5
233
+ - Duplicates: backend/legacy-upload.go [dead — DELETE]
234
+ - Blocked by: T1
245
235
 
246
236
  - [ ] **T3**: Add S3 service with streaming
247
237
  - Files: src/services/storage.ts
248
- - Blocked by: T1 (spike must pass), T2
249
- ```
250
-
251
- ### Spike-First (After Failed Experiment)
252
-
253
- ```markdown
254
- # Plan
255
-
256
- ### doing-upload
257
-
258
- - [ ] **T1** [SPIKE]: Validate chunked upload with backpressure
259
- - Type: spike
260
- - Hypothesis: Adding backpressure control will prevent buffer overflow
261
- - Method: Implement pause/resume on buffer threshold, test with 2GB file
262
- - Success criteria: No memory spikes above 500MB
263
- - Time-box: 30 min
264
- - Files: .deepflow/experiments/upload--chunked-backpressure--active.md
265
- - Blocked by: none
266
- - Note: Previous approach failed (see upload--buffer-upload--failed.md)
267
-
268
- - [ ] **T2**: Implement chunked upload endpoint
269
- - Files: src/api/upload.ts
270
- - Blocked by: T1 (spike must pass)
271
- ```
272
-
273
- ### After Spike Validates (Full Implementation)
274
-
275
- ```markdown
276
- # Plan
277
-
278
- ### doing-upload
279
-
280
- - [ ] **T1**: Create upload endpoint
281
- - Files: src/api/upload.ts
282
- - Blocked by: none
283
- - Note: Use streaming (validated in upload--streaming--passed.md)
284
-
285
- - [ ] **T2**: Add S3 service with streaming
286
- - Files: src/services/storage.ts
287
- - Blocked by: T1
288
- - Avoid: Direct buffer upload failed (see upload--buffer-upload--failed.md)
238
+ - Blocked by: T1, T2
289
239
  ```