deepflow 0.1.77 → 0.1.79
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +3 -3
- package/package.json +1 -1
- package/src/commands/df/auto-cycle.md +16 -17
- package/src/commands/df/execute.md +159 -473
- package/src/commands/df/plan.md +85 -163
package/README.md
CHANGED
@@ -32,7 +32,7 @@ Most spec-driven frameworks start from a finished spec and execute a static plan
 - **Asking reveals what assuming hides** — Before any code, Socratic questioning surfaces the requirements you didn't know you had. Four AI perspectives collide to expose tensions in your approach. The spec isn't written from what you think you know — it's written from what the conversation uncovered.
 - **Spec as living hypothesis** — Core intent stays fixed, details refine through implementation. "The spec becomes bulletproof because you built it, not before."
 - **Parallel probes reveal the best path** — Uncertain approaches spawn parallel spikes in isolated worktrees. The machine selects the winner (fewer regressions > better coverage > fewer files changed). Failed approaches stay recorded and never repeat.
-- **Metrics decide, not opinions** — No LLM judges another LLM. Build, tests, typecheck, lint are the only judges. After an agent commits, the orchestrator runs health checks. Pass = keep. Fail = revert + new hypothesis.
+- **Metrics decide, not opinions** — No LLM judges another LLM. Build, tests, typecheck, lint, and invariant checks are the only judges. After an agent commits, the orchestrator runs health checks. Pass = keep. Fail = revert + new hypothesis.
 - **The loop is the product** — Not "execute a plan" — "evolve the codebase toward the spec's goals through iterative cycles." Each cycle reveals what the previous one couldn't see.

 ## What We Learned by Doing
@@ -111,7 +111,7 @@ $ git log --oneline
 1. Runs `/df:plan` if no PLAN.md exists
 2. Snapshots pre-existing tests (ratchet baseline)
 3. Starts a loop (`/loop 1m /df:auto-cycle`) — fresh context each cycle
-4. Each cycle: picks next task → executes in worktree → runs health checks (build/tests/typecheck/lint)
+4. Each cycle: picks next task → executes in worktree → runs health checks (build/tests/typecheck/lint/invariant-check)
 5. Pass = commit stands. Fail = revert + retry next cycle
 6. Circuit breaker: halts after N consecutive reverts on same task
 7. When all tasks done: runs `/df:verify`, merges to main
@@ -179,7 +179,7 @@ your-project/

 1. **Discover before specifying, spike before implementing** — Ask, debate, probe — then commit
 2. **You define WHAT, AI figures out HOW** — Specs are the contract
-3. **Metrics decide, not opinions** — Build/test/typecheck/lint are the only judges
+3. **Metrics decide, not opinions** — Build/test/typecheck/lint/invariant-check are the only judges
 4. **Confirm before assume** — Search the code before marking "missing"
 5. **Complete implementations** — No stubs, no placeholders
 6. **Atomic commits** — One task = one commit
package/package.json
CHANGED
@@ -169,10 +169,10 @@ _Last updated: {YYYY-MM-DDTHH:MM:SSZ}_

 ## Cycle Log

-| Cycle | Task | Status | Commit / Revert | Reason | Timestamp |
-| 1 | T1 | passed | abc1234 | — | 2025-01-15T10:00:00Z |
-| 2 | T2 | failed | reverted | tests failed: 2 of 24 | 2025-01-15T10:05:00Z |
+| Cycle | Task | Status | Commit / Revert | Delta | Reason | Timestamp |
+|-------|------|--------|-----------------|-------|--------|-----------|
+| 1 | T1 | passed | abc1234 | tests: 24→24, build: ok | — | 2025-01-15T10:00:00Z |
+| 2 | T2 | failed | reverted | tests: 24→22 (−2) | tests failed: 2 of 24 | 2025-01-15T10:05:00Z |

 ## Probe Results
@@ -202,13 +202,14 @@ _(tasks that were reverted with their failure reasons)_
 **Cycle Log — append one row:**

 ```
-| {cycle_number} | {task_id} | {status} | {commit_hash or "reverted"} | {reason or "—"} | {YYYY-MM-DDTHH:MM:SSZ} |
+| {cycle_number} | {task_id} | {status} | {commit_hash or "reverted"} | {delta} | {reason or "—"} | {YYYY-MM-DDTHH:MM:SSZ} |
 ```

 - `cycle_number`: total number of cycles executed so far (count existing data rows in the Cycle Log + 1)
 - `task_id`: task ID from PLAN.md, or `BOOTSTRAP` for bootstrap cycles
 - `status`: `passed` (ratchet passed), `failed` (ratchet failed, reverted), or `skipped` (task was already done)
 - `commit_hash`: short hash from the commit, or `reverted` if ratchet failed
+- `delta`: ratchet metric change from this cycle. Format: `tests: {before}→{after}, build: ok/fail`. Include coverage delta if available (e.g., `cov: 80%→82% (+2%)`). On revert, show the regression that triggered it (e.g., `tests: 24→22 (−2)`)
 - `reason`: failure reason from ratchet output (e.g., `"tests failed: 2 of 24"`), or `—` if passed

 **Summary table — recalculate from Cycle Log rows:**
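The `delta` cell can come out of a small helper; a minimal sketch, assuming a POSIX shell and a hypothetical `format_delta` function (not part of deepflow) fed the before/after pre-existing-test pass counts:

```shell
# Build the Cycle Log delta cell from ratchet counts.
# $1 = tests before, $2 = tests after, $3 = build status ("ok" or "fail").
format_delta() {
  before=$1; after=$2; build=$3
  if [ "$after" -lt "$before" ]; then
    # A drop in passing pre-existing tests is the regression that triggers a revert.
    printf 'tests: %s→%s (−%s), build: %s' "$before" "$after" "$((before - after))" "$build"
  else
    printf 'tests: %s→%s, build: %s' "$before" "$after" "$build"
  fi
}

format_delta 24 24 ok; echo     # → tests: 24→24, build: ok
format_delta 24 22 fail; echo   # → tests: 24→22 (−2), build: fail
```

The passing row and the reverted row in the example Cycle Log come out of exactly these two calls.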
@@ -259,10 +260,12 @@ done_count = number of [x] tasks
 pending_count = number of [ ] tasks
 ```

-**
+**Note:** Per-spec verification and merge to main happens automatically in `/df:execute` (step 8) when all tasks for a spec complete. No separate verify call is needed here.
+
+**If no `[ ]` tasks remain (pending_count == 0):**
 ```
-→
-→
+→ Report: "All specs verified and merged. Workflow complete."
+→ Exit
 ```

 **If tasks remain (pending_count > 0):**
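The `done_count`/`pending_count` recalculation is a pair of greps over PLAN.md checkboxes; a sketch, assuming tasks appear as markdown list items (`- [ ]` / `- [x]`) and using a throwaway file with hypothetical task names:

```shell
# Sample PLAN.md fragment (task names are illustrative).
cat > /tmp/PLAN.md <<'EOF'
- [x] T1: add upload endpoint
- [x] T2: wire storage client
- [ ] T3: add retry logic
EOF

# done_count = number of [x] tasks, pending_count = number of [ ] tasks.
done_count=$(grep -c '^- \[x\]' /tmp/PLAN.md)
pending_count=$(grep -c '^- \[ \]' /tmp/PLAN.md)
echo "done=${done_count} pending=${pending_count}"   # → done=2 pending=1
```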
@@ -327,17 +330,14 @@ Updated .deepflow/auto-report.md:
 Cycle complete. 1 tasks remaining.
 ```

-### All Tasks Done (
+### All Tasks Done (workflow complete)

 ```
 /df:auto-cycle

-Loading PLAN.md...
+Loading PLAN.md... 0 tasks total, 0 done, 0 pending

-All
-Running: /df:verify
-✓ L0 | ✓ L1 | ⚠ L2 (no coverage tool) | ✓ L4
-✓ Merged df/upload to main
+All specs verified and merged. Workflow complete.
 ```

 ### No Work Remaining (idempotent)
@@ -345,10 +345,9 @@ Running: /df:verify
 ```
 /df:auto-cycle

-Loading PLAN.md...
-Verification already complete (no doing-* specs found).
+Loading PLAN.md... 0 tasks total, 0 done, 0 pending

+All specs verified and merged. Workflow complete.
 ```

 ### Circuit Breaker Tripped
@@ -8,93 +8,44 @@ You are a coordinator. Spawn agents, run ratchet checks, update PLAN.md. Never i

 **ONLY:** Read PLAN.md, read specs/doing-*.md, spawn background agents, run ratchet health checks after each agent completes, update PLAN.md, write `.deepflow/decisions.md` in the main tree

+## Core Loop (Notification-Driven)
+
+Each task = one background agent. Completion notifications drive the loop.
+
+**NEVER use TaskOutput** — returns full transcripts (100KB+) that explode context.

-## Usage
 ```
+1. Spawn ALL wave agents with run_in_background=true in ONE message
+2. STOP. End your turn. Do NOT poll or monitor.
+3. On EACH notification:
+   a. Run ratchet check (section 5.5)
+   b. Passed → TaskUpdate(status: "completed"), update PLAN.md [x] + commit hash
+   c. Failed → git revert HEAD --no-edit, TaskUpdate(status: "pending")
+   d. Report ONE line: "✓ T1: ratchet passed (abc123)" or "✗ T1: ratchet failed, reverted"
+   e. NOT all done → end turn, wait | ALL done → next wave or finish
+4. Between waves: check context %. If ≥50% → checkpoint and exit.
+5. Repeat until: all done, all blocked, or context ≥50%.
 ```

-##
-- Skill: `atomic-commits` — Clean commit protocol
-- Skill: `context-hub` — Fetch external API docs before coding (when task involves external libraries)
+## Context Threshold

-| Agent | subagent_type | Purpose |
-|-------|---------------|---------|
-| Implementation | `general-purpose` | Task implementation |
-| Debugger | `reasoner` | Debugging failures |
-
-**Model routing from frontmatter:**
-The model for each agent is determined by the `model:` field in the command/agent/skill frontmatter being invoked. The orchestrator reads the relevant frontmatter to determine which model to pass to `Task()`. If no `model:` field is present in the frontmatter, default to `sonnet`.
-
-## Context-Aware Execution
-
-Statusline writes to `.deepflow/context.json`: `{"percentage": 45}`
+Statusline writes `.deepflow/context.json`: `{"percentage": 45}`

 | Context % | Action |
 |-----------|--------|
 | < 50% | Full parallelism (up to 5 agents) |
 | ≥ 50% | Wait for running agents, checkpoint, exit |

-Each task = one background agent. Use agent completion notifications as the feedback loop.
-
-**NEVER use TaskOutput** — returns full agent transcripts (100KB+) that explode context.
-
-### Notification-Driven Execution
-
-```
-1. Spawn ALL wave agents with run_in_background=true in ONE message
-2. STOP. End your turn. Do NOT run Bash monitors or poll for results.
-3. Wait for "Agent X completed" notifications (they arrive automatically)
-4. On EACH notification:
-   a. Run ratchet check (health checks on the worktree)
-   b. Report: "✓ T1: ratchet passed (abc123)" or "✗ T1: ratchet failed, reverted"
-   c. Update PLAN.md for that task
-   d. Check: all wave agents done?
-      - No → end turn, wait for next notification
-      - Yes → proceed to next wave or write final summary
-```
-
-After spawning, your turn ENDS. Per notification: run ratchet, output ONE line, update PLAN.md. Write full summary only after ALL wave agents complete.
-
-## Checkpoint & Resume
-
-**File:** `.deepflow/checkpoint.json` — stored in WORKTREE directory, not main.
-
-**Schema:**
-```json
-{
-  "completed_tasks": ["T1", "T2"],
-  "current_wave": 2,
-  "worktree_path": ".deepflow/worktrees/upload",
-  "worktree_branch": "df/upload"
-}
-```
-
-**On checkpoint:** Complete wave → update PLAN.md → save to worktree → exit.
-**Resume:** `--continue` loads checkpoint, verifies worktree, skips completed tasks.
+---

 ## Behavior

 ### 1. CHECK CHECKPOINT

 ```
---continue → Load checkpoint
-           →
-           → If missing: Error "Worktree deleted. Use --fresh"
-           → If exists: Use it, skip worktree creation
-           → Resume execution with completed tasks
+--continue → Load .deepflow/checkpoint.json from worktree
+           → Verify worktree exists on disk (else error: "Use --fresh")
+           → Skip completed tasks, resume execution
 --fresh → Delete checkpoint, start fresh
 checkpoint exists → Prompt: "Resume? (y/n)"
 else → Start fresh
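The context-threshold decision keys off the statusline file; a sketch, using a sample file in `/tmp` (real runs read `.deepflow/context.json`) and `sed` rather than `jq` to stay dependency-free:

```shell
# Sample statusline output.
printf '{"percentage": 45}' > /tmp/context.json

# Extract the integer from the one-key JSON file.
pct=$(sed -n 's/.*"percentage":[[:space:]]*\([0-9]*\).*/\1/p' /tmp/context.json)

# ≥50% → checkpoint and exit; otherwise full parallelism.
if [ "$pct" -ge 50 ]; then
  echo "checkpoint and exit"
else
  echo "full parallelism (up to 5 agents)"
fi
```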
@@ -102,88 +53,29 @@

 ### 1.5. CREATE WORKTREE

-# Check main is clean (ignore untracked)
-git diff --quiet HEAD || Error: "Main has uncommitted changes. Commit or stash first."
-
-# Generate paths
-SPEC_NAME=$(basename spec/doing-*.md .md | sed 's/doing-//')
-BRANCH_NAME="df/${SPEC_NAME}"
-WORKTREE_PATH=".deepflow/worktrees/${SPEC_NAME}"
-
-# Create worktree (or reuse existing)
-if [ -d "${WORKTREE_PATH}" ]; then
-  echo "Reusing existing worktree"
-else
-  git worktree add -b "${BRANCH_NAME}" "${WORKTREE_PATH}"
-fi
-```
-
-**Existing worktree:** Reuse it (same spec = same worktree).
-
-**--fresh flag:** Deletes existing worktree and creates new one.
+Require clean HEAD (`git diff --quiet`). Derive SPEC_NAME from `specs/doing-*.md`.
+Create worktree: `.deepflow/worktrees/{spec}` on branch `df/{spec}`.
+Reuse if exists. `--fresh` deletes first.

 ### 1.6. RATCHET SNAPSHOT

+Snapshot pre-existing test files in worktree — only these count for ratchet (agent-created tests excluded):

 ```bash
 cd ${WORKTREE_PATH}
-# Snapshot pre-existing test files (only these count for ratchet)
 git ls-files | grep -E '\.(test|spec)\.[^/]+$|^test_|_test\.[^/]+$|^tests/|__tests__/' \
   > .deepflow/auto-snapshot.txt
-echo "Ratchet snapshot: $(wc -l < .deepflow/auto-snapshot.txt) pre-existing test files"
 ```

-**Only pre-existing test files are used for ratchet evaluation.** New test files created by agents during implementation don't influence the pass/fail decision. This prevents agents from gaming the ratchet by writing tests that pass trivially.
-
 ### 1.7. NO-TESTS BOOTSTRAP

-```bash
-TEST_COUNT=$(wc -l < .deepflow/auto-snapshot.txt | tr -d ' ')
-
-if [ "${TEST_COUNT}" = "0" ]; then
-  echo "Bootstrap needed: no pre-existing test files found."
-  BOOTSTRAP_NEEDED=true
-else
-  BOOTSTRAP_NEEDED=false
-fi
-```
-
-**If `BOOTSTRAP_NEEDED=true`:**
-
-1. **Inject a bootstrap task** as the FIRST action before any regular PLAN.md task is executed:
-   - Bootstrap task description: "Write tests for files in edit_scope"
-   - Read `edit_scope` from `specs/doing-*.md` to know which files need tests
-   - Spawn ONE dedicated bootstrap agent using the Bootstrap Task prompt (section 6)
-   - The bootstrap agent's ONLY job is writing tests — no implementation changes
-   - Run ratchet health checks (build must pass; test suite must not error out)
-   - If ratchet passes: re-take the ratchet snapshot so subsequent tasks use the new tests as baseline:
-     ```bash
-     cd ${WORKTREE_PATH}
-     git ls-files | grep -E '\.(test|spec)\.[^/]+$|^test_|_test\.[^/]+$|^tests/|__tests__/' \
-       > .deepflow/auto-snapshot.txt
-     echo "Post-bootstrap snapshot: $(wc -l < .deepflow/auto-snapshot.txt) test files"
-     ```
-   - If ratchet fails: revert bootstrap commit, log error, halt and report "Bootstrap failed — manual intervention required"
-
-4. **Signal to caller:** After bootstrap completes successfully, report `"bootstrap: completed"` in the cycle summary. This cycle's sole output is the test bootstrap — no regular PLAN.md task is executed this cycle.
-
-5. **Subsequent cycles:** The updated `.deepflow/auto-snapshot.txt` now contains the bootstrapped test files. All subsequent ratchet checks use these as the baseline.
-
-**If `BOOTSTRAP_NEEDED=false`:** Proceed normally to section 2.
+If snapshot has zero test files:
+
+1. Spawn ONE bootstrap agent (section 6 Bootstrap Task) to write tests for `edit_scope` files
+2. On ratchet pass: re-snapshot, report `"bootstrap: completed"`, end cycle (no PLAN.md tasks this cycle)
+3. On ratchet fail: revert, halt with "Bootstrap failed — manual intervention required"
+
+Subsequent cycles use bootstrapped tests as ratchet baseline.

 ### 2. LOAD PLAN
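The ratchet's pass/fail gate runs the health checks in order and stops at the first failure; a sketch with a hypothetical `run_ratchet` runner, where `true`/`false` stand in for the real build/test/typecheck/lint commands:

```shell
# Run each health check in order; the first failure aborts and reports.
run_ratchet() {
  for cmd in "$@"; do
    if ! sh -c "$cmd" >/dev/null 2>&1; then
      echo "ratchet failed at: $cmd"
      return 1
    fi
  done
  echo "ratchet passed"
}

run_ratchet "true" "true"                   # → ratchet passed
run_ratchet "true" "false" "true" || true   # → ratchet failed at: false (third check never runs)
```

A non-zero return is what drives the `git revert HEAD --no-edit` path in the orchestrator.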
@@ -194,7 +86,7 @@ If missing: "No PLAN.md found. Run /df:plan first."

 ### 2.5. REGISTER NATIVE TASKS

-For each `[ ]` task in PLAN.md: `TaskCreate(subject: "{task_id}: {description}", activeForm: "{gerund}", description: full block)`. Store task_id → native ID mapping.
+For each `[ ]` task in PLAN.md: `TaskCreate(subject: "{task_id}: {description}", activeForm: "{gerund}", description: full block)`. Store task_id → native ID mapping. Set dependencies via `TaskUpdate(addBlockedBy: [...])`. On `--continue`: only register remaining `[ ]` items.

 ### 3. CHECK FOR UNPLANNED SPECS
@@ -202,237 +94,77 @@ Warn if `specs/*.md` (excluding doing-/done-) exist. Non-blocking.

 ### 4. IDENTIFY READY TASKS

-```
-Ready = TaskList results where:
-- status: "pending"
-- blockedBy: empty (auto-unblocked by native dependency system)
-```
+Ready = TaskList where status: "pending" AND blockedBy: empty.

 ### 5. SPAWN AGENTS

 Context ≥50%: checkpoint and exit.

-```
-TaskUpdate(taskId: native_id, status: "in_progress")
-```
-This activates the UI spinner showing the task's activeForm (e.g. "Creating upload endpoint").
+Before spawning: `TaskUpdate(taskId: native_id, status: "in_progress")` — activates UI spinner.

-**NEVER use `isolation: "worktree"` on Task
+**NEVER use `isolation: "worktree"` on Task calls.** Deepflow manages a shared worktree so wave 2 sees wave 1 commits.

-**Spawn ALL ready tasks in ONE message
+**Spawn ALL ready tasks in ONE message.** Same-file conflicts: spawn sequentially.

+**≥2 [SPIKE] tasks for same problem:** Follow Parallel Spike Probes (section 5.7).

 ### 5.5. RATCHET CHECK

-After each agent completes
+After each agent completes, run health checks in the worktree.

-**
+**Auto-detect commands:**

 | File | Build | Test | Typecheck | Lint |
 |------|-------|------|-----------|------|
-| `package.json` | `npm run build`
-| `pyproject.toml` | — | `pytest` | `mypy .`
-| `Cargo.toml` | `cargo build` | `cargo test` | — | `cargo clippy`
+| `package.json` | `npm run build` | `npm test` | `npx tsc --noEmit` | `npm run lint` |
+| `pyproject.toml` | — | `pytest` | `mypy .` | `ruff check .` |
+| `Cargo.toml` | `cargo build` | `cargo test` | — | `cargo clippy` |
 | `go.mod` | `go build ./...` | `go test ./...` | — | `go vet ./...` |

-```bash
-cd ${WORKTREE_PATH}
-
-# Build → Test → Typecheck → Lint (stop on first failure)
-```
+Run Build → Test → Typecheck → Lint (stop on first failure).

+**Edit scope validation** (if spec declares `edit_scope`): check `git diff HEAD~1 --name-only` against allowed globs. Violations → `git revert HEAD --no-edit`, report "Edit scope violation: {files}".

-**
-
-CHANGED=$(git diff HEAD~1 --name-only)
-
-# Load edit_scope from spec (files/globs)
-EDIT_SCOPE=$(grep 'edit_scope:' specs/doing-*.md | sed 's/edit_scope://' | tr ',' '\n' | xargs)
-
-# Check each changed file against allowed scope
-for file in ${CHANGED}; do
-  ALLOWED=false
-  for pattern in ${EDIT_SCOPE}; do
-    # Match file against glob pattern
-    [[ "${file}" == ${pattern} ]] && ALLOWED=true
-  done
-  ${ALLOWED} || VIOLATIONS+=("${file}")
-done
-```
-
-- Violations found → revert: `git revert HEAD --no-edit`, report "✗ Edit scope violation: {files}"
-- No violations → continue to health checks
-
-**Step 4: Evaluate**:
-- All checks pass AND no scope violations → task succeeds, commit stands
-- Any check fails → regression detected → revert: `git revert HEAD --no-edit`
+**Impact completeness check** (if task has Impact block in PLAN.md):
+Compare `git diff HEAD~1 --name-only` against Impact callers/duplicates list.
+File listed but not modified → **advisory warning**: "Impact gap: {file} listed as {caller|duplicate} but not modified — verify manually". Not auto-revert (callers sometimes don't need changes), but flags the risk.

-**
+**Evaluate:** All pass + no violations → commit stands. Any failure → `git revert HEAD --no-edit`.

+Ratchet uses ONLY pre-existing test files from `.deepflow/auto-snapshot.txt`.

 ### 5.7. PARALLEL SPIKE PROBES

-echo "Created probe worktree: ${PROBE_PATH} (branch: ${PROBE_BRANCH})"
-```
-
-#### Step 3: Spawn all probes in parallel
-
-Mark every spike task as `in_progress`, then spawn one agent per probe **in a single message** using the Spike Task prompt (section 6), with the probe's worktree path as its working directory.
-
-```
-TaskUpdate(taskId: native_id_SPIKE_A, status: "in_progress")
-TaskUpdate(taskId: native_id_SPIKE_B, status: "in_progress")
-[spawn agent for SPIKE_A → PROBE_PATH_A]
-[spawn agent for SPIKE_B → PROBE_PATH_B]
-... (all in ONE message)
-```
-
-End your turn. Do NOT poll or monitor. Wait for completion notifications.
-
-#### Step 4: Ratchet each probe (on completion notifications)
-
-When a probe agent's notification arrives, run the standard ratchet (section 5.5) against its dedicated probe worktree:
-
-```bash
-cd ${PROBE_PATH}
-
-# Identical health-check commands as standard tasks
-# Build → Test → Typecheck → Lint (stop on first failure)
-```
-
-Record per-probe metrics:
-
-```yaml
-probe_id: SPIKE_A
-worktree: .deepflow/worktrees/{spec}/probe-SPIKE_A
-branch: df/{spec}/probe-SPIKE_A
-ratchet_passed: true/false
-regressions: 0       # failing pre-existing tests
-coverage_delta: +3   # new lines covered (positive = better)
-files_changed: 4     # number of files touched
-commit: abc1234
-```
-
-Wait until **all** probe notifications have arrived before proceeding to selection.
-
-#### Step 5: Machine-select winner
-
-No LLM evaluates another LLM's work. Apply the following ordered criteria to all probes that **passed** the ratchet:
-
-```
-1. Fewer regressions (lower is better — hard gate: any regression disqualifies)
-2. Better coverage (higher delta is better)
-3. Fewer files changed (lower is better — smaller blast radius)
-
-Tie-break: first probe to complete (chronological)
-```
-
-If **no** probe passes the ratchet, all are failed probes. Log insights (step 7) and reset the spike tasks to `pending` for retry with debugger guidance.
-
-#### Step 6: Preserve ALL probe worktrees
-
-Do NOT delete losing probe worktrees. They are preserved for manual inspection and cross-cycle learning:
-
-```bash
-# Winning probe: leave as-is, will be used as implementation base (step 8)
-# Losing probes: leave worktrees intact, mark branches with -failed suffix for clarity
-git branch -m "df/{spec}/probe-SPIKE_B" "df/{spec}/probe-SPIKE_B-failed"
-```
-
-Record all probe paths in `.deepflow/checkpoint.json` under `"spike_probes"` so future `--continue` runs know they exist.
-
-#### Step 7: Log failed probe insights
-
-For every probe that failed the ratchet (or lost selection), write two entries to `.deepflow/auto-memory.yaml` in the **main** tree.
-
-**Entry 1 — `spike_insights` (detailed probe record):**
-
-```yaml
-spike_insights:
-  - date: "YYYY-MM-DD"
-    spec: "{spec_name}"
-    spike_id: "SPIKE_B"
-    hypothesis: "{hypothesis text from PLAN.md}"
-    outcome: "failed"  # or "passed-but-lost"
-    failure_reason: "{first failed check and error summary}"
-    ratchet_metrics:
-      regressions: 2
-      coverage_delta: -1
-      files_changed: 7
-    worktree: ".deepflow/worktrees/{spec}/probe-SPIKE_B-failed"
-    branch: "df/{spec}/probe-SPIKE_B-failed"
-    edge_cases: []  # orchestrator may populate after manual review
-```
-
-**Entry 2 — `probe_learnings` (cross-cycle memory, read by `/df:auto-cycle` on each cycle start):**
-
-```yaml
-probe_learnings:
-  - spike: "SPIKE_B"
-    probe: "{probe branch suffix, e.g. probe-SPIKE_B}"
-    insight: "{one-sentence summary of what the probe revealed, derived from failure_reason}"
-```
-
-If the file does not exist, create it. Initialize both `spike_insights:` and `probe_learnings:` as empty lists before appending. Preserve all existing keys when merging.
-
-#### Step 8: Promote winning probe
-
-Cherry-pick the winner's commit into the shared spec worktree so downstream implementation tasks see the winning approach:
-
-```bash
-cd ${WORKTREE_PATH}  # shared worktree (not the probe sub-worktree)
-git cherry-pick ${WINNER_COMMIT}
-```
-
-Then mark the winning spike task as `completed` and auto-unblock its dependents:
-
-```
-TaskUpdate(taskId: native_id_SPIKE_WINNER, status: "completed")
-TaskUpdate(taskId: native_id_SPIKE_LOSERS, status: "pending")  # keep visible for audit
-```
-
-Update PLAN.md:
-- Winning spike → `[x]` with commit hash and `[PROBE_WINNER]` tag
-- Losing spikes → `[~]` (skipped) with `[PROBE_FAILED: see auto-memory.yaml]` note
-
-Resume the standard execution loop (section 9) — implementation tasks blocked by the spike group are now unblocked.
+Trigger: ≥2 [SPIKE] tasks with same "Blocked by:" target or identical hypothesis.
+
+1. **Baseline:** Record `BASELINE=$(git rev-parse HEAD)` in shared worktree
+2. **Sub-worktrees:** Per spike: `git worktree add -b df/{spec}/probe-{SPIKE_ID} .deepflow/worktrees/{spec}/probe-{SPIKE_ID} ${BASELINE}`
+3. **Spawn:** All probes in ONE message, each targeting its probe worktree. End turn.
+4. **Ratchet:** Per notification, run standard ratchet (5.5) in probe worktree. Record: ratchet_passed, regressions, coverage_delta, files_changed, commit
+5. **Select winner** (after ALL complete, no LLM judge):
+   - Disqualify any with regressions
+   - Rank: fewer regressions > higher coverage_delta > fewer files_changed > first to complete
+   - No passes → reset all to pending for retry with debugger
+6. **Preserve all worktrees.** Losers: rename branch + `-failed` suffix. Record in checkpoint.json under `"spike_probes"`
+7. **Log failed probes** to `.deepflow/auto-memory.yaml` (main tree):
+   ```yaml
+   spike_insights:
+     - date: "YYYY-MM-DD"
+       spec: "{spec_name}"
+       spike_id: "SPIKE_B"
+       hypothesis: "{from PLAN.md}"
+       outcome: "failed"  # or "passed-but-lost"
+       failure_reason: "{first failed check + error summary}"
+       ratchet_metrics: {regressions: N, coverage_delta: N, files_changed: N}
+       worktree: ".deepflow/worktrees/{spec}/probe-SPIKE_B-failed"
+       branch: "df/{spec}/probe-SPIKE_B-failed"
+   probe_learnings:  # read by /df:auto-cycle each start
+     - spike: "SPIKE_B"
+       probe: "probe-SPIKE_B"
+       insight: "{one-sentence summary from failure_reason}"
+   ```
+   Create file if missing. Preserve existing keys when merging.
+8. **Promote winner:** Cherry-pick into shared worktree. Winner → `[x] [PROBE_WINNER]`, losers → `[~] [PROBE_FAILED]`. Resume standard loop.

 ---
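The winner selection in step 5 is a plain ordered sort over the recorded metrics; a sketch with hypothetical probe values, one `id regressions coverage_delta files_changed` record per probe (the chronological tie-break is not modeled):

```shell
# One record per completed probe: id, regressions, coverage_delta, files_changed.
probes='SPIKE_A 0 3 4
SPIKE_B 0 5 6
SPIKE_C 1 9 2'

# Hard gate: any regression disqualifies (SPIKE_C is out despite its coverage).
# Then rank by higher coverage_delta, then fewer files changed.
winner=$(printf '%s\n' "$probes" \
  | awk '$2 == 0' \
  | sort -k3,3nr -k4,4n \
  | head -n 1 | cut -d' ' -f1)
echo "winner: $winner"   # → winner: SPIKE_B
```

No model output is consulted anywhere in this decision, which is the point of "no LLM judge".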
@@ -444,143 +176,127 @@ Working directory: {worktree_absolute_path}
 All file operations MUST use this absolute path as base. Do NOT write files to the main project directory.
 Commit format: {commit_type}({spec}): {description}

-STOP after committing. Do NOT merge branches, rename spec files, remove worktrees, or run git checkout on main.
+STOP after committing. Do NOT merge branches, rename spec files, remove worktrees, or run git checkout on main.
 ```

-**Standard Task
+**Standard Task:**
 ```
 {task_id}: {description from PLAN.md}
-Files: {target files}
+Files: {target files} Spec: {spec_name}
+{Impact block from PLAN.md — include verbatim if present}
+
+{Prior failure context — include ONLY if task was previously reverted. Read from .deepflow/auto-memory.yaml revert_history for this task_id:}
+Previous attempts (DO NOT repeat these approaches):
+- Cycle {N}: reverted — "{reason from revert_history}"
+- Cycle {N}: reverted — "{reason from revert_history}"
+{Omit this entire block if task has no revert history.}
+
+CRITICAL: If Impact lists duplicates or callers, you MUST verify each one is consistent with your changes.
+- [active] duplicates → consolidate into single source of truth (e.g., local generateYAML → use shared buildConfigData)
+- [dead] duplicates → DELETE the dead code entirely. Dead code pollutes context and causes drift.

 Steps:
-1.
-3. Commit as feat({spec}): {description}
+1. External APIs/SDKs → chub search "<library>" --json → chub get <id> --lang <lang> (skip if chub unavailable or internal code only)
+2. Read ALL files in Impact before implementing — understand the full picture
+3. Implement the task, updating all impacted files
+4. Commit as feat({spec}): {description}

-Your ONLY job is to write code and commit.
+Your ONLY job is to write code and commit. Orchestrator runs health checks after.
 ```

-**Bootstrap Task
+**Bootstrap Task:**
 ```
 BOOTSTRAP: Write tests for files in edit_scope
-Files: {edit_scope files
-Spec: {spec_name}
-
-Steps:
-1. Write tests that cover the functionality of the files listed above
-2. Do NOT change implementation files — tests only
-3. Commit as test({spec}): bootstrap tests for edit_scope
+Files: {edit_scope files} Spec: {spec_name}

+Write tests covering listed files. Do NOT change implementation files.
+Commit as test({spec}): bootstrap tests for edit_scope
 ```

-**Spike Task
|
|
216
|
+
**Spike Task:**
|
|
481
217
|
```
|
|
482
218
|
{task_id} [SPIKE]: {hypothesis}
|
|
483
|
-
Files: {target files}
|
|
484
|
-
Spec: {spec_name}
|
|
219
|
+
Files: {target files} Spec: {spec_name}
|
|
485
220
|
|
|
486
|
-
|
|
487
|
-
|
|
488
|
-
|
|
221
|
+
{Prior failure context — include ONLY if this spike was previously reverted. Read from .deepflow/auto-memory.yaml revert_history + spike_insights for this task_id:}
|
|
222
|
+
Previous attempts (DO NOT repeat these approaches):
|
|
223
|
+
- Cycle {N}: reverted — "{reason}"
|
|
224
|
+
{Omit this entire block if no revert history.}
|
|
489
225
|
|
|
490
|
-
|
|
226
|
+
Implement minimal spike to validate hypothesis.
|
|
227
|
+
Commit as spike({spec}): {description}
|
|
491
228
|
```
|
|
492
229
|
|
|
493
|
-
###
|
|
230
|
+
### 8. COMPLETE SPECS
|
|
494
231
|
|
|
495
|
-
When
|
|
232
|
+
When all tasks done for a `doing-*` spec:
|
|
233
|
+
1. Run `/df:verify doing-{name}` via the Skill tool (`skill: "df:verify", args: "doing-{name}"`)
|
|
234
|
+
- Verify runs quality gates (L0-L4), merges worktree branch to main, cleans up worktree, renames spec `doing-*` → `done-*`, and extracts decisions
|
|
235
|
+
- If verify fails (adds fix tasks): stop here — `/df:execute --continue` will pick up the fix tasks
|
|
236
|
+
- If verify passes: proceed to step 2
|
|
237
|
+
2. Remove spec's ENTIRE section from PLAN.md (header, tasks, summaries, fix tasks, separators)
|
|
238
|
+
3. Recalculate Summary table at top of PLAN.md
|
|
496
239
|
|
|
497
|
-
|
|
240
|
+
---
|
|
498
241
|
|
|
499
|
-
|
|
242
|
+
## Usage
|
|
243
|
+
|
|
244
|
+
```
|
|
245
|
+
/df:execute # Execute all ready tasks
|
|
246
|
+
/df:execute T1 T2 # Specific tasks only
|
|
247
|
+
/df:execute --continue # Resume from checkpoint
|
|
248
|
+
/df:execute --fresh # Ignore checkpoint
|
|
249
|
+
/df:execute --dry-run # Show plan only
|
|
250
|
+
```
|
|
500
251
|
|
|
501
|
-
|
|
252
|
+
## Skills & Agents
|
|
502
253
|
|
|
503
|
-
|
|
254
|
+
- Skill: `atomic-commits` — Clean commit protocol
|
|
255
|
+
- Skill: `context-hub` — Fetch external API docs before coding
|
|
504
256
|
|
|
505
|
-
|
|
506
|
-
|
|
507
|
-
|
|
508
|
-
|
|
509
|
-
```
|
|
510
|
-
### {YYYY-MM-DD} — {spec-name}
|
|
511
|
-
- [TAG] decision text — rationale
|
|
512
|
-
```
|
|
513
|
-
After successful append, delete `specs/done-{name}.md`. If write fails, preserve the file.
|
|
514
|
-
4. Remove the spec's ENTIRE section from PLAN.md:
|
|
515
|
-
- The `### doing-{spec}` header
|
|
516
|
-
- All task entries (`- [x] **T{n}**: ...` and their sub-items)
|
|
517
|
-
- Any `## Execution Summary` block for that spec
|
|
518
|
-
- Any `### Fix Tasks` sub-section for that spec
|
|
519
|
-
- Separators (`---`) between removed sections
|
|
520
|
-
5. Recalculate the Summary table at the top of PLAN.md (update counts for completed/pending)
|
|
257
|
+
| Agent | subagent_type | Purpose |
|
|
258
|
+
|-------|---------------|---------|
|
|
259
|
+
| Implementation | `general-purpose` | Task implementation |
|
|
260
|
+
| Debugger | `reasoner` | Debugging failures |
|
|
521
261
|
|
|
522
|
-
|
|
262
|
+
**Model routing:** Use `model:` from command/agent/skill frontmatter. Default: `sonnet`.
|
|
523
263
|
|
|
524
|
-
|
|
264
|
+
**Checkpoint schema:** `.deepflow/checkpoint.json` in worktree:
|
|
265
|
+
```json
|
|
266
|
+
{"completed_tasks": ["T1","T2"], "current_wave": 2, "worktree_path": ".deepflow/worktrees/upload", "worktree_branch": "df/upload"}
|
|
267
|
+
```
|
|
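A sketch of how `--continue` could consume this file, assuming exactly the schema shown above; `remaining_tasks` is an illustrative helper, not part of the published commands:

```python
import json
import os
import tempfile

def remaining_tasks(checkpoint_path, all_tasks):
    """Tasks still pending per checkpoint.json; a missing file means nothing is done."""
    if not os.path.exists(checkpoint_path):
        return list(all_tasks)
    with open(checkpoint_path) as f:
        cp = json.load(f)
    done = set(cp.get("completed_tasks", []))
    return [t for t in all_tasks if t not in done]

# Example checkpoint mirroring the schema above
path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
with open(path, "w") as f:
    json.dump({"completed_tasks": ["T1", "T2"], "current_wave": 2,
               "worktree_path": ".deepflow/worktrees/upload",
               "worktree_branch": "df/upload"}, f)

print(remaining_tasks(path, ["T1", "T2", "T3"]))  # → ['T3']
```

Because `completed_tasks` is the only state consulted, `--fresh` amounts to ignoring (or deleting) this file.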
 
-
-1. Run ratchet check for the completed agent (see section 5.5)
-2. Ratchet passed → `TaskUpdate(taskId: native_id, status: "completed")` — auto-unblocks dependent tasks
-3. Ratchet failed → revert commit, `TaskUpdate(taskId: native_id, status: "pending")`
-4. Update PLAN.md: `[ ]` → `[x]` + commit hash (on pass) or note revert (on fail)
-5. Report: "✓ T1: ratchet passed (abc123)" or "✗ T1: ratchet failed, reverted"
-6. If NOT all wave agents done → end turn, wait
-7. If ALL wave agents done → use TaskList to find newly unblocked tasks, check context, spawn next wave or finish
+---
 
-
+## Failure Handling
 
-
+When task fails ratchet and is reverted:
+- `TaskUpdate(taskId: native_id, status: "pending")` — dependents remain blocked
+- Repeated failure → spawn `Task(subagent_type="reasoner", prompt="Debug failure: {ratchet output}")`
+- Leave worktree intact, keep checkpoint.json
+- Output: worktree path/branch, `cd {path}` to investigate, `--continue` to resume, `--fresh` to discard
 
 ## Rules
 
 | Rule | Detail |
 |------|--------|
-| Zero test files → bootstrap first |
+| Zero test files → bootstrap first | Bootstrap is cycle's sole task when snapshot empty |
 | 1 task = 1 agent = 1 commit | `atomic-commits` skill |
 | 1 file = 1 writer | Sequential if conflict |
 | Agent writes code, orchestrator measures | Ratchet is the judge |
 | No LLM evaluates LLM work | Health checks only |
-| ≥2 spikes
-| All probe worktrees preserved |
-| Machine-selected winner |
-
-| Winner cherry-picked to shared worktree | Downstream tasks see winning approach via shared worktree |
-| External APIs → chub first | Agents fetch curated docs before implementing external API calls; skip if chub unavailable |
+| ≥2 spikes same problem → parallel probes | Never run competing spikes sequentially |
+| All probe worktrees preserved | Losers renamed `-failed`; never deleted |
+| Machine-selected winner | Regressions > coverage > files changed; no LLM judge |
+| External APIs → chub first | Skip if unavailable |
 
 ## Example
 
-### No-Tests Bootstrap
-
-```
-/df:execute (context: 8%)
-
-Loading PLAN.md... T1 ready, T2/T3 blocked by T1
-Ratchet snapshot: 0 pre-existing test files
-Bootstrap needed: no pre-existing test files found.
-
-Spawning bootstrap agent for edit_scope...
-[Bootstrap agent completed]
-Running ratchet: build ✓ | tests ✓ (12 new tests pass)
-✓ Bootstrap: ratchet passed (boo1234)
-Re-taking ratchet snapshot: 3 test files
-
-bootstrap: completed — cycle's sole task was test bootstrap
-Next: Run /df:auto-cycle again to execute T1
-```
-
-### Standard Execution
-
 ```
 /df:execute (context: 12%)
 
 Loading PLAN.md... T1 ready, T2/T3 blocked by T1
 Ratchet snapshot: 24 pre-existing test files
-Registering native tasks: TaskCreate T1/T2/T3, TaskUpdate(T2 blockedBy T1), TaskUpdate(T3 blockedBy T1)
 
 Wave 1: TaskUpdate(T1, in_progress)
 [Agent "T1" completed]
@@ -589,43 +305,13 @@ Wave 1: TaskUpdate(T1, in_progress)
 TaskUpdate(T1, completed) → auto-unblocks T2, T3
 
 Wave 2: TaskUpdate(T2/T3, in_progress)
-[Agent "T2" completed]
-
-
-
-
-✓
-
-
-
-Next: Run /df:verify to verify specs and merge to main
-```
-
-### Ratchet Failure (Regression Detected)
-
-```
-/df:execute (context: 10%)
-
-Wave 1: TaskUpdate(T1, in_progress)
-[Agent "T1" completed]
-Running ratchet: build ✓ | tests ✗ (2 failed of 24)
-✗ T1: ratchet failed, reverted
-TaskUpdate(T1, pending)
-
-Spawning debugger for T1...
-[Debugger completed]
-Re-running T1 with fix guidance...
-
-[Agent "T1 retry" completed]
-Running ratchet: build ✓ | tests ✓ (24 passed) | typecheck ✓
-✓ T1: ratchet passed (abc1234)
-```
-
-### With Checkpoint
-
-```
-Wave 1 complete (context: 52%)
-Checkpoint saved.
-
-Next: Run /df:execute --continue to resume execution
+[Agent "T2" completed] ✓ T2: ratchet passed (def5678)
+[Agent "T3" completed] ✓ T3: ratchet passed (ghi9012)
+
+Context: 35% — All tasks done for doing-upload.
+Running /df:verify doing-upload...
+✓ L0 | ✓ L1 (3/3 files) | ⚠ L2 (no coverage tool) | ✓ L4 (24 tests)
+✓ Merged df/upload to main
+✓ Spec complete: doing-upload → done-upload
+Complete: 3/3
 ```
package/src/commands/df/plan.md
CHANGED
@@ -3,7 +3,7 @@
 ## Purpose
 Compare specs against codebase and past experiments. Generate prioritized tasks.
 
-**NEVER:** use EnterPlanMode, use ExitPlanMode — this command IS the planning phase
+**NEVER:** use EnterPlanMode, use ExitPlanMode — this command IS the planning phase
 
 ## Usage
 ```
@@ -17,71 +17,50 @@ Compare specs against codebase and past experiments. Generate prioritized tasks.
 
 ## Spec File States
 
-| Prefix |
-
-| (none) |
-| `doing-` |
-| `done-` |
+| Prefix | Action |
+|--------|--------|
+| (none) | Plan this |
+| `doing-` | Skip |
+| `done-` | Skip |
 
 ## Behavior
 
 ### 1. LOAD CONTEXT
 
 ```
-Load:
-- specs/*.md EXCLUDING doing-* and done-* (only new specs)
-- PLAN.md (if exists, for appending)
-- .deepflow/config.yaml (if exists)
-
+Load: specs/*.md (exclude doing-*/done-*), PLAN.md (if exists), .deepflow/config.yaml
 Determine source_dir from config or default to src/
 ```
 
-Run `validateSpec` on each
-
-If no new specs: report counts, suggest `/df:execute`.
+Run `validateSpec` on each spec. Hard failures → skip + error. Advisory → include in output.
+No new specs → report counts, suggest `/df:execute`.
 
 ### 2. CHECK PAST EXPERIMENTS (SPIKE-FIRST)
 
 **CRITICAL**: Check experiments BEFORE generating any tasks.
 
-Extract topic from spec name (fuzzy match), then:
-
 ```
 Glob .deepflow/experiments/{topic}--*
 ```
 
-
-Statuses: `active`, `passed`, `failed`
+File naming: `{topic}--{hypothesis}--{status}.md` (active/passed/failed)
 
 | Result | Action |
 |--------|--------|
-| `--failed.md`
-| `--passed.md`
-| `--active.md`
-| No matches | New topic,
+| `--failed.md` | Extract "next hypothesis" from Conclusion, generate spike |
+| `--passed.md` | Proceed to full implementation |
+| `--active.md` | Wait for completion |
+| No matches | New topic, generate initial spike |
 
-
-- If `--failed.md` exists: Generate spike task to test the next hypothesis (from failed experiment's Conclusion)
-- If no experiments exist: Generate spike task for the core hypothesis
-- Full implementation tasks are BLOCKED until a spike validates the approach
-- Only proceed to full task generation after `--passed.md` exists
-
-See: `templates/experiment-template.md` for experiment format
+Full implementation tasks BLOCKED until spike validates. See `templates/experiment-template.md`.
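The table above can be sketched as a dispatch over the glob results, assuming the `{topic}--{hypothesis}--{status}.md` naming; `plan_action` and the precedence passed > active > failed are illustrative choices, not documented behavior:

```python
import re

def plan_action(experiment_files):
    """Map experiment filenames ({topic}--{hypothesis}--{status}.md)
    to the planning action from the Result/Action table."""
    statuses = set()
    for name in experiment_files:
        m = re.match(r".+--.+--(active|passed|failed)\.md$", name)
        if m:
            statuses.add(m.group(1))
    if "passed" in statuses:
        return "full-implementation"   # approach already validated
    if "active" in statuses:
        return "wait"                  # a spike is still running
    if "failed" in statuses:
        return "spike-from-next-hypothesis"
    return "initial-spike"             # no matches: new topic

print(plan_action(["upload--streaming--failed.md"]))  # → spike-from-next-hypothesis
print(plan_action([]))                                # → initial-spike
```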
 
 ### 3. DETECT PROJECT CONTEXT
 
-
-- Code style/conventions
-- Existing patterns (error handling, API structure)
-- Integration points
-
-Include patterns in task descriptions for agents to follow.
+Identify code style, patterns (error handling, API structure), integration points. Include in task descriptions.
 
 ### 4. ANALYZE CODEBASE
 
-Follow `templates/explore-agent.md` for spawn rules
-
-Scale agent count based on codebase size:
+Follow `templates/explore-agent.md` for spawn rules and scope.
 
 | File Count | Agents |
 |------------|--------|
@@ -90,125 +69,111 @@ Scale agent count based on codebase size:
 | 100-500 | 25-40 |
 | 500+ | 50-100 (cap) |
 
-
-- Implementations matching spec requirements
-- TODO, FIXME, HACK comments
-- Stub functions, placeholder returns
-- Skipped tests, incomplete coverage
+Use `code-completeness` skill to search for: implementations matching spec requirements, TODOs/FIXMEs/HACKs, stubs, skipped tests.
 
-### 5.
+### 4.5. IMPACT ANALYSIS (per planned file)
 
-
+For each file in a task's "Files:" list, find the full blast radius.
 
-
+**Search for:**
 
-**
+1. **Callers:** `grep -r "{exported_function}" --include="*.{ext}" -l` — files that import/call what's being changed
+2. **Duplicates:** Files with similar logic (same function name, same transformation). Classify:
+   - `[active]` — used in production → must consolidate
+   - `[dead]` — bypassed/unreachable → must delete
+3. **Data flow:** If file produces/transforms data, find ALL consumers of that shape across languages
 
-
+**Embed as `Impact:` block in each task:**
+```markdown
+- [ ] **T2**: Add new features to YAML export
+  - Files: src/utils/buildConfigData.ts
+  - Impact:
+    - Callers: src/routes/index.ts:12, src/api/handler.ts:45
+    - Duplicates:
+      - src/components/YamlViewer.tsx:19 (own generateYAML) [active — consolidate]
+      - backend/yaml_gen.go (generateYAMLFromConfig) [dead — DELETE]
+    - Data flow: buildConfigData → YamlViewer, SimControls, RoleplayPage
+  - Blocked by: T1
+```
+
+Files outside original "Files:" → add with `(impact — verify/update)`.
+Skip for spike tasks.
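The caller search in step 1 can be sketched without shelling out, as a pure-Python stand-in for `grep -r "{symbol}" --include="*.{ext}" -l`; `find_callers` is an illustrative name, and the substring match is deliberately as coarse as grep's:

```python
import os
import tempfile

def find_callers(root, symbol, exts=(".ts", ".tsx")):
    """Files under root that mention `symbol`, like grep -r ... -l."""
    hits = []
    for dirpath, _, files in os.walk(root):
        for name in files:
            if not name.endswith(exts):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8") as f:
                    if symbol in f.read():
                        hits.append(path)
            except OSError:
                pass  # unreadable file: skip, as grep would warn and continue
    return sorted(hits)

# Hypothetical two-file project: only index.ts references the symbol
root = tempfile.mkdtemp()
with open(os.path.join(root, "index.ts"), "w") as f:
    f.write("import { buildConfigData } from './utils';")
with open(os.path.join(root, "other.ts"), "w") as f:
    f.write("export const x = 1;")
print(find_callers(root, "buildConfigData"))  # one hit: .../index.ts
```

A textual match over-approximates the true caller set (comments and strings also hit), which is the safe direction for impact analysis: extra files get verified, none get missed.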
+
+### 5. COMPARE & PRIORITIZE
+
+Spawn `Task(subagent_type="reasoner", model="opus")`. Map each requirement to DONE / PARTIAL / MISSING / CONFLICT. Check REQ-AC alignment. Flag spec gaps.
 
-
-
-
-3. Passed experiment exists → Skip to full implementation
+Priority: Dependencies → Impact → Risk
+
+### 6. GENERATE SPIKE TASKS (IF NEEDED)
 
 **Spike Task Format:**
 ```markdown
 - [ ] **T1** [SPIKE]: Validate {hypothesis}
   - Type: spike
   - Hypothesis: {what we're testing}
-  - Method: {minimal steps
-  - Success criteria: {
+  - Method: {minimal steps}
+  - Success criteria: {measurable}
   - Time-box: 30 min
   - Files: .deepflow/experiments/{topic}--{hypothesis}--{status}.md
   - Blocked by: none
 ```
 
-
+All implementation tasks MUST `Blocked by: T{spike}`. Spike fails → `--failed.md`, no implementation tasks.
 
 #### Probe Diversity
 
-When generating multiple
+When generating multiple spikes for the same problem:
 
 | Requirement | Rule |
 |-------------|------|
-| Contradictory |
-| Naive |
-| Parallel | All
-| Scoped |
-| Safe to fail | Each probe runs in its own worktree; failure has zero impact on main |
+| Contradictory | ≥2 probes with opposing approaches |
+| Naive | ≥1 probe without prior technical justification |
+| Parallel | All run simultaneously |
+| Scoped | Minimal — just enough to validate |
 
-
-1. Are there at least 2 probes with opposing assumptions? If not, add a contradictory probe.
-2. Is there at least 1 naive probe with no prior technical justification? If not, add one.
-3. Are all probes independent (no probe depends on another probe's result)?
-
-**Example — 3 diverse probes for a caching problem:**
+Before output, verify: ≥2 opposing probes, ≥1 naive, all independent.
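A sketch of that pre-output check, assuming each probe carries an `id`, a `role` string such as `Contradictory-A` or `Naive`, and a `blocked_by` list; `diversity_ok` is an illustrative helper, not part of the command:

```python
def diversity_ok(probes):
    """≥2 contradictory probes, ≥1 naive probe, and no probe
    blocked by another probe (independence)."""
    roles = [p["role"] for p in probes]
    contradictory = sum(r.startswith("Contradictory") for r in roles)
    naive = sum(r.startswith("Naive") for r in roles)
    ids = {p["id"] for p in probes}
    independent = all(not (set(p.get("blocked_by", [])) & ids) for p in probes)
    return contradictory >= 2 and naive >= 1 and independent

probes = [
    {"id": "T1", "role": "Contradictory-A", "blocked_by": []},
    {"id": "T2", "role": "Contradictory-B", "blocked_by": []},
    {"id": "T3", "role": "Naive", "blocked_by": []},
]
print(diversity_ok(probes))  # → True
```

Dropping the naive probe, or making T2 `blocked_by: ["T1"]`, would fail the check and should trigger regenerating the probe set.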
 
+**Example — caching problem, 3 diverse probes:**
 ```markdown
 - [ ] **T1** [SPIKE]: Validate in-memory LRU cache
-  - Type: spike
   - Role: Contradictory-A (in-process)
-  - Hypothesis: In-memory LRU
-  - Method:
-  - Success criteria: DB
-  - Blocked by: none
+  - Hypothesis: In-memory LRU reduces DB queries by ≥80%
+  - Method: LRU with 1000-item cap, load test
+  - Success criteria: DB queries drop ≥80% under 100 concurrent users
 
 - [ ] **T2** [SPIKE]: Validate Redis distributed cache
-  - Type: spike
   - Role: Contradictory-B (external, opposing T1)
-  - Hypothesis: Redis
-  - Method:
-  - Success criteria: DB queries drop ≥80%, works across 2
-  - Blocked by: none
+  - Hypothesis: Redis scales across multiple instances
+  - Method: Redis client, cache top 10 queries, same load test
+  - Success criteria: DB queries drop ≥80%, works across 2 instances
 
-- [ ] **T3** [SPIKE]: Validate query optimization without cache
-  - Type: spike
+- [ ] **T3** [SPIKE]: Validate query optimization without cache
   - Role: Naive (no prior justification — tests if caching is even necessary)
-  - Hypothesis: Indexes + query batching alone may
-  - Method: Add
+  - Hypothesis: Indexes + query batching alone may suffice
+  - Method: Add indexes, batch N+1 queries, same load test — no cache
   - Success criteria: DB queries drop ≥80% with zero cache infrastructure
-  - Blocked by: none
 ```
 
 ### 7. VALIDATE HYPOTHESES
 
-
+Unfamiliar APIs or performance-critical → prototype in scratchpad. Fails → write `--failed.md`. Skip for known patterns.
 
 ### 8. CLEANUP PLAN.md
 
-
-
-```
-For each ### section in PLAN.md:
-  Extract spec name from header (e.g. "doing-upload" or "done-upload")
-  If specs/done-{name}.md exists:
-    → Remove the ENTIRE section: header, tasks, execution summary, fix tasks, separators
-  If header references a spec with no matching specs/doing-*.md or specs/done-*.md:
-    → Remove it (orphaned section)
-```
-
-Also recalculate the Summary table (specs analyzed, tasks created/completed/pending) to reflect only remaining sections.
-
-If PLAN.md becomes empty after cleanup, delete the file and recreate fresh.
-
-### 9. OUTPUT PLAN.md
+Prune stale sections: remove `done-*` sections and orphaned headers. Recalculate Summary table. Empty → recreate fresh.
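The pruning rule can be sketched as a section filter over the PLAN.md text, assuming `### {spec}` headers delimit sections; `prune_plan` and `live_specs` are illustrative names:

```python
import re

def prune_plan(plan_text, live_specs):
    """Keep only '### {spec}' sections whose spec is still live;
    the preamble before the first section is always kept."""
    sections = re.split(r"(?m)^(?=### )", plan_text)  # split before each header
    kept = []
    for s in sections:
        if s.startswith("### "):
            name = s.split("\n", 1)[0][4:].strip()
            if name not in live_specs:
                continue  # done or orphaned section: drop entirely
        kept.append(s)
    return "".join(kept)

plan = "# Plan\n\n### doing-upload\n- [ ] T1\n\n### done-old\n- [x] T9\n"
print(prune_plan(plan, {"doing-upload"}))  # keeps doing-upload, drops done-old
```

Recomputing the Summary table would then run over the pruned text, so counts only reflect surviving sections.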
 
-
+### 9. OUTPUT & RENAME
 
-
+Append tasks grouped by `### doing-{spec-name}`. Rename `specs/feature.md` → `specs/doing-feature.md`.
 
-
-
-### 11. REPORT
-
-`✓ Plan generated — {n} specs, {n} tasks. Run /df:execute`
+Report: `✓ Plan generated — {n} specs, {n} tasks. Run /df:execute`
 
 ## Rules
-- **Spike-first** —
-- **Block on spike** —
-- **Learn from failures** — Extract
+- **Spike-first** — No `--passed.md` → spike before implementation
+- **Block on spike** — Implementation tasks blocked until spike validates
+- **Learn from failures** — Extract next hypothesis, never repeat approach
 - **Plan only** — Do NOT implement (except quick validation prototypes)
-- **Confirm before assume** — Search code before marking "missing"
 - **One task = one logical unit** — Atomic, committable
 - Prefer existing utilities over new code; flag spec gaps
 
@@ -216,74 +181,31 @@ Append tasks grouped by `### doing-{spec-name}`. Include spec gaps and validatio
 
 | Agent | Model | Base | Scale |
 |-------|-------|------|-------|
-| Explore
-| Reasoner
+| Explore | haiku | 10 | +1 per 20 files |
+| Reasoner | opus | 5 | +1 per 2 specs |
 
-Always use
+Always use `Task` tool with explicit `subagent_type` and `model`.
 
 ## Example
 
-### Spike-First (No Prior Experiments)
-
 ```markdown
-# Plan
-
 ### doing-upload
 
 - [ ] **T1** [SPIKE]: Validate streaming upload approach
   - Type: spike
-  - Hypothesis: Streaming uploads
-  -
-  - Success criteria: Memory stays under 500MB during upload
-  - Time-box: 30 min
+  - Hypothesis: Streaming uploads handle >1GB without memory issues
+  - Success criteria: Memory <500MB during 2GB upload
  - Files: .deepflow/experiments/upload--streaming--active.md
  - Blocked by: none
 
 - [ ] **T2**: Create upload endpoint
   - Files: src/api/upload.ts
-  -
+  - Impact:
+    - Callers: src/routes/index.ts:5
+    - Duplicates: backend/legacy-upload.go [dead — DELETE]
+  - Blocked by: T1
 
 - [ ] **T3**: Add S3 service with streaming
   - Files: src/services/storage.ts
-  - Blocked by: T1
-```
-
-### Spike-First (After Failed Experiment)
-
-```markdown
-# Plan
-
-### doing-upload
-
-- [ ] **T1** [SPIKE]: Validate chunked upload with backpressure
-  - Type: spike
-  - Hypothesis: Adding backpressure control will prevent buffer overflow
-  - Method: Implement pause/resume on buffer threshold, test with 2GB file
-  - Success criteria: No memory spikes above 500MB
-  - Time-box: 30 min
-  - Files: .deepflow/experiments/upload--chunked-backpressure--active.md
-  - Blocked by: none
-  - Note: Previous approach failed (see upload--buffer-upload--failed.md)
-
-- [ ] **T2**: Implement chunked upload endpoint
-  - Files: src/api/upload.ts
-  - Blocked by: T1 (spike must pass)
-```
-
-### After Spike Validates (Full Implementation)
-
-```markdown
-# Plan
-
-### doing-upload
-
-- [ ] **T1**: Create upload endpoint
-  - Files: src/api/upload.ts
-  - Blocked by: none
-  - Note: Use streaming (validated in upload--streaming--passed.md)
-
-- [ ] **T2**: Add S3 service with streaming
-  - Files: src/services/storage.ts
-  - Blocked by: T1
-  - Avoid: Direct buffer upload failed (see upload--buffer-upload--failed.md)
+  - Blocked by: T1, T2
 ```